Commun. Math. Phys. 256, 1–42 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1171-y
Communications in
Mathematical Physics
Crossed Products of the Cantor Set by Free Minimal Actions of $\mathbb{Z}^d$
N. Christopher Phillips
Department of Mathematics, University of Oregon, Eugene, OR 97403-1222, USA. E-mail: [email protected]
Received: 26 June 2003 / Accepted: 26 August 2003
Published online: 8 March 2005 – © Springer-Verlag 2005
Abstract: Let $d$ be a positive integer, let $X$ be the Cantor set, and let $\mathbb{Z}^d$ act freely and minimally on $X$. We prove that the crossed product $C^*(\mathbb{Z}^d, X)$ has stable rank one, real rank zero, and cancellation of projections, and that the order on $K_0(C^*(\mathbb{Z}^d, X))$ is determined by traces. We obtain the same conclusion for the C*-algebras of various kinds of aperiodic tilings.

In [40], Putnam considered the C*-algebra $A$ associated with a substitution tiling system satisfying certain additional conditions, and proved that the order on $K_0(A)$ is determined by the unique tracial state $\tau$ on $A$. That is, if $\eta \in K_0(A)$ satisfies $\tau_*(\eta) > 0$, then there is a projection $p \in M_\infty(A) = \bigcup_{n=1}^{\infty} M_n(A)$ such that $\eta = [p]$. In this paper, we strengthen Putnam's theorem, obtaining Blackadar's Second Fundamental Comparability Question ([7], 1.3.1) for $A$, namely that if $p, q \in M_\infty(A)$ are projections such that $\tau(p) < \tau(q)$ for every tracial state $\tau$ on $A$, then $p \precsim q$, that is, $p$ is Murray-von Neumann equivalent to a subprojection of $q$. We further prove that the C*-algebra $A$ has real rank zero [10] and stable rank one [42]. We also extend the theorem: the same conclusions hold for the C*-algebras of some other kinds of aperiodic tilings, and when $A$ is the transformation group C*-algebra $C^*(\mathbb{Z}^d, X)$ of an arbitrary free and minimal action of $\mathbb{Z}^d$ on the Cantor set $X$.

We should also mention the recent proof of the gap labelling conjecture for the Cantor set ([3, 5, 25]), which states that the image of $K_0(C^*(\mathbb{Z}^d, X))$ under the map to $\mathbb{R}$ induced by a trace is the subgroup generated by the values of the corresponding invariant measure on compact open subsets of $X$.

Our results are loosely related to the Bethe-Sommerfeld Conjecture for quasicrystals in the tight binding approximation. The tight binding Hamiltonian for a quasicrystal coming from an aperiodic tiling is a selfadjoint element of the C*-algebra of the tiling.
When this C*-algebra has real rank zero, any selfadjoint element has arbitrarily small perturbations which have finite spectrum, and moreover selfadjoint elements with totally
Research partially supported by NSF grant DMS 0070776.
disconnected spectrum are generic (form a dense $G_\delta$-set) in the set of all selfadjoint elements.

Our proofs are based on the methods of Sect. 3 of [40]. These methods require the presence of a "large" AF subalgebra. For the substitution tilings of [40], a suitable subalgebra is constructed there. For transformation group C*-algebras of free minimal actions of $\mathbb{Z}^d$ on the Cantor set, we obtain this subalgebra by reinterpreting the main result of Forrest's paper [16] in terms of groupoids.

We actually prove our main results for the reduced C*-algebras of what we call almost AF Cantor groupoids. These form a class of groupoids to which the methods of Sect. 3 of [40] are applicable. Forrest in effect shows that the transformation group groupoid of a free minimal action of $\mathbb{Z}^d$ on the Cantor set is almost AF, and Putnam in effect shows in Sect. 2 of [40] that the groupoids of the substitution tiling systems considered there are almost AF.

There are three reasons for presenting our main results in terms of almost AF Cantor groupoids. First, the abstraction enables us to focus on just a few key properties. In particular, as will become clear, for actions of $\mathbb{Z}^d$ we do not need the full strength of the results obtained in [16]. Second, it seems plausible that groupoids arising in other contexts might turn out to be almost AF, so that our work would apply elsewhere. Third, we believe that the methods will work for actions of much more general discrete groups, and for actions that are merely essentially free. Proving this is, we hope, primarily a matter of generalizing Forrest's construction of Kakutani-Rokhlin decompositions [16], and we want to separate the details of the generalization from the methods used to obtain from it results for the crossed product C*-algebras.

This paper is organized as follows.
The first section presents background material on principal r-discrete groupoids and their C*-algebras, especially in the case that the unit space is the Cantor set, in a form convenient for later use. In the second section, we define almost AF Cantor groupoids and present some basic results. The key technical result, Lemma 2.7, appears here. In Sects. 3–5, we prove results for the reduced C*-algebra of an almost AF Cantor groupoid when the C*-algebra is simple, proving (most of) Blackadar's Second Fundamental Comparability Question in Sect. 3, real rank zero in Sect. 4, and stable rank one in Sect. 5. The full statement of Blackadar's Second Fundamental Comparability Question is obtained by combining the result of Sect. 3 with stable rank one. In Sect. 6 we use [16] to show that free minimal actions of $\mathbb{Z}^d$ on the Cantor set yield almost AF Cantor groupoids. The structural results above therefore hold for their C*-algebras. In Sect. 7 we do the same for the groupoids associated with several kinds of aperiodic tilings, and discuss the relation to the Bethe-Sommerfeld Conjecture. The last section contains some open problems and an example related to the nonsimple case.

1. Cantor Groupoids

In this section, we fix notation, recall some important definitions, and establish a few elementary facts. For groupoid notation and terminology, we will generally follow Renault's book [41], with two exceptions. If $G$ is a groupoid with unit space $G^{(0)}$, we will refer to its range and source maps $r, s : G \to G^{(0)}$, given by
\[ r(g) = g g^{-1} \quad \text{and} \quad s(g) = g^{-1} g \]
for $g \in G$. Also, transformation groups will normally act on the left; see Example 1.3 below for notation. We will recall many of the relevant definitions from [41], since they are scattered through the book.
Actions of Zd on the Cantor Set
It is convenient to have a term to describe the basic assumptions we will be imposing on our groupoids.

Definition 1.1. A topological groupoid $G$ equipped with a Haar system (which will be suppressed in the notation) is called a Cantor groupoid if the following conditions are satisfied:
(1) $G$ is Hausdorff, locally compact, and second countable.
(2) The unit space $G^{(0)}$ is compact, totally disconnected, and has no isolated points (so is homeomorphic to the Cantor set).
(3) $G$ is r-discrete in the sense of Definition 2.6 in Chapter 1 of [41], that is, $G^{(0)}$ is open in $G$.
(4) The Haar system consists of counting measures.

Using Lemma 1.2 below, we can rephrase this in terminology that has recently become common (see for example [23] and [33]) by saying that a Cantor groupoid is a second countable locally compact Hausdorff étale groupoid whose unit space is the Cantor set, equipped with the Haar system of counting measures.

The following lemma describes some of the immediate properties of Cantor groupoids.

Lemma 1.2. Let $G$ be a Cantor groupoid. Then:
(1) For any $x \in G^{(0)}$, the sets $r^{-1}(x), s^{-1}(x) \subset G$ are discrete.
(2) The range and source maps $r, s : G \to G^{(0)}$ are local homeomorphisms.
(3) $G$ is totally disconnected.

Proof. (1) This is Lemma 2.7(i) in Chapter 1 of [41].
(2) This is Lemma 2.7(iii) in Chapter 1 of [41].
(3) It follows from Part (2) that every point $g \in G$ has an open neighborhood which is homeomorphic to an open subset of $G^{(0)}$. Now use the fact that $G^{(0)}$ is assumed to be totally disconnected.

We now give the motivating example.

Example 1.3. Let $X$ be the Cantor set, and let $\Gamma$ be a countable discrete group which acts on $X$. Then the transformation group groupoid $\Gamma \times X$, equipped with the Haar system consisting of counting measures, is a Cantor groupoid. For reference, and to establish conventions, here are the groupoid operations. The pairs $(\gamma_1, x_1)$ and $(\gamma_2, x_2)$ are composable exactly when $x_1 = \gamma_2 x_2$, and then $(\gamma_1, x_1)(\gamma_2, x_2) = (\gamma_1 \gamma_2, x_2)$. The range, source, and inverse are given by
\[ r(\gamma, x) = (1, \gamma x), \qquad s(\gamma, x) = (1, x), \qquad \text{and} \qquad (\gamma, x)^{-1} = (\gamma^{-1}, \gamma x). \]
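The conventions of Example 1.3 can be checked mechanically on a finite toy model. The sketch below is only an illustration under stated assumptions: it replaces the Cantor set by the finite set {0, ..., 4} and the group by Z/5 acting by rotation (neither choice comes from the paper), and the names `act`, `composable`, `multiply`, `rng`, `src`, and `inverse` are hypothetical. The group identity is written 0 because the toy group is additive.

```python
from itertools import product

N = 5  # hypothetical toy model: the cyclic group Z/5 acting on {0, ..., 4}


def act(gamma, x):
    """Left action gamma . x of Z/N on {0, ..., N-1} by rotation."""
    return (gamma + x) % N


def composable(g1, g2):
    """(gamma1, x1) and (gamma2, x2) are composable exactly when x1 = gamma2 . x2."""
    (_, x1), (gamma2, x2) = g1, g2
    return x1 == act(gamma2, x2)


def multiply(g1, g2):
    """(gamma1, x1)(gamma2, x2) = (gamma1 gamma2, x2); the group law is addition mod N."""
    if not composable(g1, g2):
        raise ValueError("pair is not composable")
    (gamma1, _), (gamma2, x2) = g1, g2
    return ((gamma1 + gamma2) % N, x2)


def rng(g):
    """Range map r(gamma, x) = (1, gamma x); the identity of Z/N is 0."""
    gamma, x = g
    return (0, act(gamma, x))


def src(g):
    """Source map s(gamma, x) = (1, x)."""
    _, x = g
    return (0, x)


def inverse(g):
    """Inverse (gamma, x)^{-1} = (gamma^{-1}, gamma x)."""
    gamma, x = g
    return ((-gamma) % N, act(gamma, x))


# All elements of the toy transformation group groupoid.
G = list(product(range(N), range(N)))
```

With these definitions, `multiply(g, inverse(g))` agrees with `rng(g)` and `multiply(inverse(g), g)` agrees with `src(g)` for every `g` in `G`, matching the formulas $r(g) = gg^{-1}$ and $s(g) = g^{-1}g$.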
Example 1.4. Let $G$ be a Cantor groupoid, and let $H$ be an open subgroupoid of $G$ which contains the entire unit space $G^{(0)}$ of $G$ (or, more generally, whose unit space is a compact open subset of $G^{(0)}$ with no isolated points). Equip $H$ with the Haar system of counting measures. Then $H$ is a Cantor groupoid.

To verify this, the only nonobvious condition from Definition 1.1 is that the counting measures form a Haar system, and for this the only issue is the second condition (continuity) in Definition 2.2 of Chapter 1 of [41]. For $x \in H^{(0)}$, let $\mu^x$ and $\nu^x$ be the counting measures on $\{g \in G : r(g) = x\}$ and $\{g \in H : r(g) = x\}$. Let $f \in C_c(H)$. We have to show that $x \mapsto \int_H f \, d\nu^x$ is continuous. Since $H$ is open in $G$, the function $f$ extends to a function $\tilde{f} \in C_c(G)$ by setting $\tilde{f}(g) = 0$ for $g \in G \setminus H$. Further,
\[ \int_H f \, d\nu^x = \int_G \tilde{f} \, d\mu^x, \]
which is known to be continuous in $x$. So $H$ is a Cantor groupoid.

As a specific example, we mention the subgroupoids of the transformation group groupoid of a minimal homeomorphism of the Cantor set implicit in [37]. For the groupoid interpretation, see Example 2.6 of [39].

For convenience, we will reformulate several standard definitions and constructions in our restricted context.

Remark 1.5. Let $G$ be a Cantor groupoid, or, more generally, a locally compact r-discrete groupoid with counting measures as the Haar system.
(1) (See the beginning of Sect. 1 in Chapter 2 of [41].) The product and adjoint in the convolution algebra $C_c(G)$ (the space of continuous functions on $G$ with compact support) are given by:
\[ (f_1 f_2)(g) = \sum_{h \in G \,:\, r(h) = s(g)} f_1(gh) f_2(h^{-1}) \quad \text{and} \quad f^*(g) = \overline{f(g^{-1})}. \]
(Note that we write $f_1 f_2$ rather than $f_1 * f_2$.) The C*-algebra $C^*(G)$ is the completion of $C_c(G)$ in a suitable C* norm; see Definition 1.12 in Chapter 2 of [41].
(2) (See Definitions 3.2, 3.4, and 3.12 in Chapter 1 of [41].) A Borel measure $\mu$ on $G^{(0)}$ is invariant if and only if for every $f \in C_c(G)$, the numbers
\[ \int_{G^{(0)}} \sum_{g \in G \,:\, r(g) = x} f(g) \, d\mu(x) \quad \text{and} \quad \int_{G^{(0)}} \sum_{g \in G \,:\, s(g) = x} f(g) \, d\mu(x) \]
are equal. (The difference in the expressions is that one sum is over $r(g) = x$ and the other is over $s(g) = x$.)
(3) (See the discussion preceding Definition 2.8 in Chapter 2 of [41], but note that Renault seems to reserve the term "regular representation" for the case that the measure $\mu$ is quasiinvariant.) Let $\mu$ be a Borel measure on $G^{(0)}$. Then the regular representation $\pi$ of $C^*(G)$ associated with $\mu$ is constructed as follows. Define a measure $\nu$ on $G$ by
\[ \int_G f \, d\nu = \int_{G^{(0)}} \sum_{g \in G \,:\, s(g) = x} f(g) \, d\mu(x) \]
for $f \in C_c(G)$. (This measure is called $\nu^{-1}$ in [41].) Then $\pi$ is the representation on $L^2(G, \nu)$ determined by the formula
\[ \langle \pi(f)\xi, \eta \rangle = \int_{G^{(0)}} \sum_{g, h \in G \,:\, s(g) = r(h) = x} f(gh) \xi(h^{-1}) \overline{\eta(g)} \, d\mu(x). \]
(4) By comparing formulas, we see that the regular representation as defined here is the same as the representation called $\mathrm{Ind}\,\mu$ in [29] before Corollary 2.4. (See Sect. 1A of [29].)
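To see the convolution formula of Remark 1.5(1) at work, one can evaluate it on the simplest principal groupoid, the pair groupoid on a finite set, where it reduces to matrix multiplication. This is a minimal sketch under our own assumptions: the finite set {0, 1, 2} stands in for the unit space (so this is not a Cantor groupoid), units $(i, i)$ are identified with the index $i$, and the helper names `convolve`, `adjoint`, and `from_matrix` are hypothetical.

```python
n = 3  # hypothetical toy model: the pair groupoid on {0, 1, 2}

# Elements are pairs (i, j); (i, j)(j, k) = (i, k) and (i, j)^{-1} = (j, i).
G = [(i, j) for i in range(n) for j in range(n)]


def r(g):
    """Range of (i, j), with the unit (i, i) identified with i."""
    return g[0]


def s(g):
    """Source of (i, j), with the unit (j, j) identified with j."""
    return g[1]


def inv(g):
    return (g[1], g[0])


def mul(g, h):
    if s(g) != r(h):
        raise ValueError("pair is not composable")
    return (g[0], h[1])


def convolve(f1, f2):
    """(f1 f2)(g) = sum over h with r(h) = s(g) of f1(gh) f2(h^{-1})."""
    return {
        g: sum(f1[mul(g, h)] * f2[inv(h)] for h in G if r(h) == s(g))
        for g in G
    }


def adjoint(f):
    """f*(g) = complex conjugate of f(g^{-1})."""
    return {g: f[inv(g)].conjugate() for g in G}


def from_matrix(a):
    """View an n x n matrix (list of rows) as a function on the pair groupoid."""
    return {(i, j): a[i][j] for (i, j) in G}
```

For `g = (i, j)` the sum runs over `h = (j, k)`, so `convolve` returns the entries $\sum_k a_{ik} b_{kj}$ of the matrix product, and `adjoint` returns the conjugate transpose.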
Because of the way the relevant material is presented in the literature, and to call attention to the neat formulation of [29], we take some care with the definition of the reduced C*-algebra.

Definition 1.6. (Sect. 1A of [29].) Let $G$ be a locally compact r-discrete groupoid with counting measures as the Haar system. We define the reduced C*-algebra $C_r^*(G)$ to be the completion of $C_c(G)$ in the supremum of the seminorms coming from the representations $\lambda_x$, for $x \in G^{(0)}$, defined as follows: let $G_x = \{g \in G : s(g) = x\}$, and let $C_c(G)$ act on the Hilbert space $l^2(G_x)$ by
\[ \lambda_x(f)\xi(g) = \sum_{h \in G \,:\, s(h) = x} f(gh^{-1}) \xi(h). \]

As shown in Theorem 2.3 of [29], this norm comes from a single canonical regular representation of $C_c(G)$ on a Hilbert module over $C_0(G^{(0)})$. This might appropriately be used as the definition of $C_r^*(G)$.

Lemma 1.7. The reduced C*-algebra $C_r^*(G)$ as defined above is the same as the reduced C*-algebra as in Definition 2.8 in Chapter 2 of [41].

Proof. The representation $\lambda_x$ used in Definition 1.6 is easily checked to be the representation of Remark 1.5(3) coming from the point mass at $x$. Given this, and Remark 1.5(4), the result follows from Corollary 2.4(b) of [29].

The following two results are well known (in fact, in greater generality), but we have been unable to locate references. The version of the first for full groupoid C*-algebras and full crossed products is in [41] (after Definition 1.12 in Chapter 2), but we have been unable to find a statement of the reduced case.

Proposition 1.8. Let $X$ be a locally compact Hausdorff space, let the discrete group $\Gamma$ act on $X$, and let $G = \Gamma \times X$ be the transformation group groupoid (as in Example 1.3). Then the reduced groupoid C*-algebra $C_r^*(G)$ is isomorphic to the reduced crossed product C*-algebra $C_r^*(\Gamma, X)$.

Proof. Recall that $C_r^*(\Gamma, X)$ is the completion of $C_c(\Gamma, C(X))$ in a suitable norm (see 7.6.5 and 7.7.4 of [36]), with multiplication and adjoint on $C_c(\Gamma, C(X))$ given by twisted versions of those in the ordinary group algebra (see 7.6.1 of [36]). Define $\varphi : C_c(G) \to C_c(\Gamma, C(X))$ by $\varphi(f)(\gamma)(x) = f(\gamma, \gamma^{-1} x)$. One checks that $\varphi$ is a bijective *-homomorphism. Moreover, when one identifies $G_{(1,x)}$ with $\Gamma$ in the obvious way, one finds that $\varphi$ transforms the representation $\lambda_{(1,x)}$ of Definition 1.6 into the regular representation $\pi_x$ of $C^*(\Gamma, C(X))$ determined as in 7.7.1 of [36] by the point evaluation $\mathrm{ev}_x$, regarded as a one dimensional representation of $C(X)$. By Definition 1.6, the representation $\bigoplus_{x \in X} \lambda_{(1,x)}$ is injective on $C_r^*(G)$, and by Theorem 7.7.5 of [36] the representation $\bigoplus_{x \in X} \pi_x$ is injective on $C_r^*(\Gamma, X)$. Therefore $\varphi$ determines an isomorphism $C_r^*(G) \to C_r^*(\Gamma, X)$.

Proposition 1.9. Let $G$ be a locally compact r-discrete groupoid with counting measures as the Haar system. Let $H$ be an open subgroupoid, and give $H$ also the Haar system consisting of counting measures. (See Example 1.4.) Then the inclusion $C_c(H) \subset C_c(G)$ defines an injective homomorphism $C_r^*(H) \to C_r^*(G)$.
Proof. Let $\lambda_x^G$ and $\lambda_x^H$ denote the representations of $C_c(G)$ and $C_c(H)$ used in Definition 1.6. It suffices to show that, for every $x \in G^{(0)}$, the representation $\lambda_x^G|_{C_c(H)}$ is a direct sum of representations $\lambda_y^H$ for various $y \in H^{(0)}$, and perhaps a copy of the zero representation.

Define a relation on $G_x$ by $g \sim h$ exactly when $gh^{-1} \in H$. Let $Y = \{g \in G_x : g \sim g\}$, which is equal to $\{g \in G_x : r(g) \in H\}$. Restricted to $Y$, the relation $\sim$ is an equivalence relation. Let $E$ be an equivalence class. One easily checks that the subspace $l^2(E)$ is invariant for $\lambda_x^G|_{C_c(H)}$. Choose $g_0 \in E$ and let $y = r(g_0)$. Then the formula $h \mapsto h g_0$ defines a bijection $H_y \to E$. (To check surjectivity: if $g \sim g_0$, then $g g_0^{-1}$ is in $H$ and has source $y$, and $g = (g g_0^{-1}) g_0$.) Further, if we define a unitary $u : l^2(E) \to l^2(H_y)$ by the formula $(u\xi)(h) = \xi(h g_0)$, then we get $u \, \lambda_x^G(f)|_{l^2(E)} \, u^* = \lambda_y^H(f)$ for all $f \in C_c(H)$.

Finally, one easily checks that $l^2(G_x \setminus Y)$ is an invariant subspace for $\lambda_x^G|_{C_c(H)}$, and that the restriction of the representation to this subspace is zero.
Definition 1.10. Let $G$ be a groupoid. A graph in $G$ is a subset $T \subset G$ such that the restrictions of both the range and source maps to $T$ are injective.

Such sets are called G-sets in Definition 1.10 in Chapter 1 of [41]. Our terminology was suggested by Putnam, and is based on the idea that, when $G$ is principal in the sense of Definition 1.14(2) below, $T$ is the graph of a map from $s(T)$ to $r(T)$.

Lemma 1.11. Let $G$ be a Cantor groupoid. Let $K \subset G$ be a compact set. Then $K$ is a finite disjoint union of compact graphs, which are open if $K$ is open.

Proof. Since $r$ and $s$ are local homeomorphisms (Lemma 1.2(2)), for each $g \in K$ there is a compact open subset $E(g)$ such that the restrictions of both $r$ and $s$ to $E(g)$ are injective. Since $K$ is compact, there are $g_1, g_2, \ldots, g_n \in K$ such that $E(g_1), E(g_2), \ldots, E(g_n)$ cover $K$. Then set $K_1 = E(g_1) \cap K$ and, inductively,
\[ K_l = \bigl( E(g_l) \setminus [E(g_1) \cup \cdots \cup E(g_{l-1})] \bigr) \cap K. \]
The $K_l$ are compact, and open if $K$ is open, because the $E(g_l)$ are compact and open, they are graphs because the $E(g_l)$ are graphs, and clearly $K$ is the disjoint union of $K_1, K_2, \ldots, K_n$.

Lemma 1.12. Let $G$ be a Cantor groupoid, and let $S \subset G$ be a compact graph. Then there exists a compact open graph $T$ which contains $S$.

Proof. Write $S = \bigcap_{n=1}^{\infty} V_n$ for compact open subsets $V_n \subset G$ with $V_1 \supset V_2 \supset \cdots \supset S$. We claim that some $V_n$ is a graph in $G$. Suppose not. Then there are infinitely many $n$ such that $s|_{V_n}$ is not injective, or there are infinitely many $n$ such that $r|_{V_n}$ is not injective. We assume the first case. (The proof is the same for the second case.) Then for all $n$, the restriction $s|_{V_n}$ is not injective. Choose $g_n, h_n \in V_n$ such that $s(g_n) = s(h_n)$ and $g_n \neq h_n$. By compactness, we may pass to a subsequence and assume that $g_n \to g$ and $h_n \to h$. Then $g, h \in S$ and $s(g) = s(h)$. If $g \neq h$, we have contradicted the assumption that $S$ is a graph. If $g = h$, then every neighborhood of $g$ contains two distinct elements, namely $g_n$ and $h_n$ for sufficiently large $n$, whose images under $s$ are equal; this contradicts the fact (Lemma 1.2(2)) that $s$ is a local homeomorphism. Thus in either case we obtain a contradiction, so some $V_n$ is a graph.
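The disjointification step in the proof of Lemma 1.11, which passes from the cover $E(g_1), \ldots, E(g_n)$ to the pieces $K_1, \ldots, K_n$, is purely set-theoretic and can be sketched on finite sets. This is an illustration only: finite Python sets stand in for compact open subsets of $G$, and the function name `disjointify` is hypothetical.

```python
def disjointify(cover, K):
    """Return pieces K_l = (E_l minus (E_1 u ... u E_{l-1})) intersected with K.

    `cover` is a list of sets E_1, ..., E_n whose union contains K.
    The pieces are pairwise disjoint, each K_l lies in E_l, and their union is K.
    """
    K = set(K)
    pieces = []
    earlier = set()  # union of the sets E_1, ..., E_{l-1} seen so far
    for E in cover:
        E = set(E)
        pieces.append((E - earlier) & K)
        earlier |= E
    return pieces


# Example: three overlapping sets covering K = {2, 3, 4, 5}.
pieces = disjointify([{1, 2, 3}, {3, 4}, {4, 5, 6}], {2, 3, 4, 5})
# pieces == [{2, 3}, {4}, {5}]
```

Since each piece is a subset of the corresponding $E(g_l)$, any property inherited by subsets (such as injectivity of $r$ and $s$) passes from the $E(g_l)$ to the $K_l$, which is why the $K_l$ are graphs.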
Lemma 1.13. Let $G$ be a Cantor groupoid. Let $\mu$ be a Borel measure on $G^{(0)}$, and let $\nu$ be the measure on $G$ of Remark 1.5(3). Let $L \subset G$ be a compact graph. Then:
(1) $\nu(L) = \mu(s(L))$.
(2) If $\mu$ is $G$-invariant, then $\nu(L) = \mu(r(L))$.

Proof. Use Lemma 1.12 to choose a compact open graph $V$ which contains $L$. Then $s|_V : V \to s(V)$ and $r|_V : V \to r(V)$ are homeomorphisms. It suffices to prove that if $f : G \to [0, 1]$ is any continuous function with $\mathrm{supp}(f) \subset V$ and $f = 1$ on $L$, then
\[ \int_G f \, d\nu = \int_{G^{(0)}} f \circ (s|_V)^{-1} \, d\mu \]
and, when $\mu$ is $G$-invariant,
\[ \int_G f \, d\nu = \int_{G^{(0)}} f \circ (r|_V)^{-1} \, d\mu. \]
(We take $f \circ (s|_V)^{-1} = 0$ off $s(V)$ and $f \circ (r|_V)^{-1} = 0$ off $r(V)$.) Because $V$ is a graph, the first equation is just the definition of $\nu$. For the second, we use invariance of $\mu$ to rewrite
\[ \int_G f \, d\nu = \int_{G^{(0)}} \sum_{g \in G \,:\, r(g) = x} f(g) \, d\mu(x) \]
(changing the condition $s(g) = x$ in the original sum to the condition $r(g) = x$). Now the second equation follows in the same way as the first.

At this point, we recall some further definitions from [41].

Definition 1.14.
(1) (Definition 1.1 in Chapter 1 of [41].) Let $G$ be a groupoid, and let $x \in G^{(0)}$. The isotropy subgroup of $x$ is the set $\{g \in G : r(g) = s(g) = x\}$. (It is a group with identity element $x$.)
(2) (Definition 1.1 in Chapter 1 of [41].) A groupoid $G$ is principal if every isotropy subgroup is trivial (has only one element). Equivalently, whenever $g_1, g_2 \in G$ satisfy $r(g_1) = r(g_2)$ and $s(g_1) = s(g_2)$, then $g_1 = g_2$.
(3) (See Page 35 of [41].) Let $G$ be a groupoid. A subset $E \subset G^{(0)}$ is invariant if whenever $g \in G$ with $s(g) \in E$, then also $r(g) \in E$.
(4) (Definition 4.3 in Chapter 2 of [41].) A locally compact groupoid $G$ is essentially principal if for every closed invariant subset $E \subset G^{(0)}$, the set of $x \in E$ with trivial isotropy subgroup is dense in $E$.

The groupoids appearing in the following definition will play a crucial role in what follows.

Definition 1.15. A Cantor groupoid $G$ is called approximately finite (AF for short), if it is the increasing union of a sequence of compact open principal Cantor subgroupoids, each of which contains the unit space $G^{(0)}$.

In Definition 3.7 of [19], and with a weaker condition (designed to allow unit spaces which are only locally compact), such a groupoid is called an AF equivalence relation.

The next proposition is included primarily to make the connection with earlier work. The corollary will be essential, but it is easily proved directly.
Proposition 1.16. An AF Cantor groupoid is an AF groupoid in the sense of Definition 1.1 in Chapter 3 of [41]. An AF groupoid as defined there is an AF Cantor groupoid if and only if its unit space is compact and has no isolated points.

Proof. The first statement follows easily from Lemma 3.4 of [19]. The "only if" part of the second statement is clear. To prove the rest, let $G$ be an AF groupoid in the sense of [41], and assume its unit space is compact and has no isolated points. By definition, we can write $G$ as the increasing union of a sequence of open subgroupoids $H_n$, each of which has the same unit space $G^{(0)}$, and each of which is a disjoint union of a sequence of elementary groupoids (Definition 1.1 in Chapter 3 of [41]) $H_{n,k}$ of types $N_{n,k} \in \{1, 2, \ldots, \infty\}$. For each $n$, the unit spaces $H_{n,k}^{(0)}$ are open in $G^{(0)}$, and $G^{(0)}$ is compact, so there are only finitely many $H_{n,k}$, that is, $H_n = \coprod_{k=1}^{t(n)} H_{n,k}$ with $t(n) < \infty$. Moreover, each $H_{n,k}^{(0)}$ is compact. Comparing with Definition 1.1 in Chapter 3 of [41], we see that this can happen only when the type $N_{n,k}$ is finite. So each $H_n$ is compact, and we are done.

Corollary 1.17. Let $G$ be an AF Cantor groupoid. Then $C_r^*(G)$ is an AF algebra.

Proof. By Proposition 1.15 in Chapter 3 of [41], the full C*-algebra $C^*(G)$ is AF. The reduced C*-algebra $C_r^*(G)$ is a quotient (actually, in this case equal to the full C*-algebra).

2. Almost AF Groupoids

In this section we introduce almost AF Cantor groupoids, and prove some basic properties. An almost AF Cantor groupoid contains a "large" AF Cantor subgroupoid, and its reduced C*-algebra contains a corresponding "large" AF subalgebra. We establish one to one correspondences between the sets of normalized traces on the two C*-algebras, and between them and the sets of invariant Borel probability measures on the unit spaces of the two groupoids. In addition, if the reduced C*-algebra of the groupoid is simple, so is the AF subalgebra. Moreover, an almost AF Cantor groupoid is essentially principal, and has an invariant measure whose associated regular representation is injective on the reduced C*-algebra.

Although our main results involve only almost AF Cantor groupoids whose reduced C*-algebras are simple, and we don't know how to generalize them, it seems worthwhile to attempt to give a definition which is also appropriate for the nonsimple case. For more details, see the discussion after Definition 2.2.

Definition 2.1. Let $G$ be a Cantor groupoid, and let $K \subset G^{(0)}$ be a compact subset. We say that $K$ is thin if for every $n$, there exist compact graphs $S_1, S_2, \ldots, S_n \subset G$ such that $s(S_k) = K$ and the sets $r(S_1), r(S_2), \ldots, r(S_n)$ are pairwise disjoint.

Definition 2.2. Let $G$ be a Cantor groupoid. We say that $G$ is almost AF if the following conditions hold:
(1) There is an open AF subgroupoid $G_0 \subset G$ which contains the unit space of $G$ and such that whenever $K \subset G \setminus G_0$ is a compact set, then $s(K) \subset G^{(0)}$ is thin in $G_0$ in the sense of Definition 2.1.
(2) For every closed invariant subset $E \subset G^{(0)}$, and every nonempty relatively open subset $U \subset E$, there is a $G$-invariant Borel probability measure $\mu$ on $G^{(0)}$ such that $\mu(U) > 0$.
This definition is an abstraction of the key ideas in the argument of Sect. 3 of [40]. Note that $G_0$ is not uniquely determined by $G$.

We will see in Proposition 2.13 that condition (2) is redundant when $C_r^*(G)$ is simple (or when $C_r^*(G_0)$ is simple). In the nonsimple case, we would still like the definition to imply that $C_r^*(G)$ has stable rank one and real rank zero. We have three motivations for condition (2). First, it seems to be exactly what is needed to guarantee that the groupoid is essentially principal. Second, it allows products with a totally disconnected compact metric space, regarded as a groupoid in which every element is a unit. (For a transformation group groupoid, this corresponds to forming the product with the trivial action on such a space.) Third, there is a free nonminimal action of $\mathbb{Z}$ on the Cantor set whose transformation group groupoid $G$ satisfies condition (1) but has an open subset in its unit space which is null for all $G$-invariant Borel probability measures, and for which $C_r^*(G)$ does not have stable rank one. See Example 8.8.

Lemma 2.3. Let $G$ be a second countable locally compact Hausdorff r-discrete groupoid with Haar system consisting of counting measures. Suppose that for every nonempty open subset $U \subset G^{(0)}$, there is a $G$-invariant Borel probability measure $\mu$ on $G^{(0)}$ such that $\mu(U) > 0$. Then there exists a $G$-invariant Borel probability measure on $G^{(0)}$ such that the regular representation it determines (Remark 1.5(3)) is injective on $C_r^*(G)$.

Proof. Let $U_1, U_2, \ldots$ form a countable base for the topology of $G^{(0)}$ consisting of nonempty open sets. Choose a $G$-invariant Borel probability measure $\mu_n$ on $G^{(0)}$ such that $\mu_n(U_n) > 0$. Set $\mu = \sum_{n=1}^{\infty} 2^{-n} \mu_n$, which is a $G$-invariant Borel probability measure on $G^{(0)}$ such that $\mu(U_n) > 0$ for all $n$. Then $\mathrm{supp}(\mu)$ is a closed subset of $G^{(0)}$ such that $\mathrm{supp}(\mu) \cap U_n \neq \emptyset$ for all $n$. Therefore $\mathrm{supp}(\mu) = G^{(0)}$. Now apply Corollary 2.4 of [29] and Remark 1.5(4).

Corollary 2.4. Let $G$ be an almost AF Cantor groupoid. Then there exists a $G$-invariant Borel probability measure on $G^{(0)}$ such that the regular representation it determines, as in Remark 1.5(3), is injective on $C_r^*(G)$.

We now need a lemma on thin sets.

Lemma 2.5. Let $G$ be a Cantor groupoid, and let $K \subset G^{(0)}$ be a compact subset which is thin in the sense of Definition 2.1. Then:
(1) For every $n$, there exist a compact open set $W$ containing $K$ and compact open graphs $W_1, W_2, \ldots, W_n \subset G$, such that $s(W_k) = W$ for all $k$, and such that the sets $r(W_1), r(W_2), \ldots, r(W_n)$ are pairwise disjoint compact open subsets of $G^{(0)}$.
(2) For every $\varepsilon > 0$, there is a compact open subset $V$ of $G^{(0)}$ such that $K \subset V$ and $\mu(V) < \varepsilon$ for every invariant Borel probability measure $\mu$ on $G^{(0)}$.
(3) For every invariant Borel probability measure $\mu$ on $G^{(0)}$, we have $\mu(K) = 0$.

Proof. (1) Using Definition 2.1(2), choose compact graphs $S_1, S_2, \ldots, S_n \subset G$ such that $s(S_k) = K$ and such that the sets $r(S_1), r(S_2), \ldots, r(S_n)$ are pairwise disjoint. Choose disjoint compact open sets $U_1, U_2, \ldots, U_n \subset G^{(0)}$ such that $r(S_k) \subset U_k$. Use Lemma 1.12 to choose compact open graphs $V_1, V_2, \ldots, V_n \subset G$ such that $S_k \subset V_k$. Replacing $V_k$ by $V_k \cap r^{-1}(U_k)$, we may assume that $r(V_1), r(V_2), \ldots, r(V_n)$ are pairwise disjoint. Since $s$ is a local homeomorphism (Lemma 1.2(2)), the sets $s(V_1), s(V_2), \ldots, s(V_n)$ are all compact open sets containing $K$. Define
\[ W = s(V_1) \cap s(V_2) \cap \cdots \cap s(V_n) \quad \text{and} \quad W_k = V_k \cap s^{-1}(W). \]
Then the $W_k$ are compact open graphs such that $S_k \subset W_k$ for all $k$, such that $r(W_1), r(W_2), \ldots, r(W_n)$ are pairwise disjoint, and such that $s(W_k) = W$.
(2) Let $\varepsilon > 0$. Choose $n \in \mathbb{N}$ with $\frac{1}{n} < \varepsilon$. Let $W \subset G^{(0)}$ and $W_1, W_2, \ldots, W_n \subset G$ be as in Part (1). Let $\mu$ be any invariant Borel probability measure on $G^{(0)}$. Let $\nu$ be the measure in Remark 1.5(3). By Lemma 1.13,
\[ \mu(r(W_k)) = \nu(W_k) = \mu(s(W_k)) = \mu(W) \]
for all $k$. Since the $r(W_k)$ are disjoint and $\mu\bigl(G^{(0)}\bigr) = 1$, it follows that $\mu(W) \le \frac{1}{n} < \varepsilon$.
(3) This is immediate from Part (2).
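The counting step behind the bound in Part (2) can be written out as a single chain. This is a worked restatement that uses only Lemma 1.13 and the disjointness of the sets $r(W_k)$; reading Part (2) as being satisfied by the choice $V = W$ is our interpretation of how the proof concludes.

```latex
\[
  1 = \mu\bigl(G^{(0)}\bigr)
    \;\ge\; \mu\Bigl(\bigsqcup_{k=1}^{n} r(W_k)\Bigr)
    = \sum_{k=1}^{n} \mu\bigl(r(W_k)\bigr)
    = \sum_{k=1}^{n} \mu\bigl(s(W_k)\bigr)
    = n\,\mu(W),
\]
so $\mu(W) \le \tfrac{1}{n} < \varepsilon$, and $K \subset W$ with $W$ compact and open.
```

The bound is uniform in $\mu$ because $n$ and $W$ were chosen before $\mu$.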
Lemma 2.6. Let $G$ be an almost AF Cantor groupoid. Then $G$ is essentially principal (Definition 1.14(4)).

Proof. Let $E \subset G^{(0)}$ be a closed $G$-invariant subset. Let $G_0$ be as in Definition 2.2(1). Note that $G_0$ is principal. If $x \in G^{(0)}$ has nontrivial isotropy, then there is $g \in G$ with $g \neq x$ such that $r(g) = s(g) = x$. So $g \notin G_0$, whence $x \in s(G \setminus G_0)$. Now $G \setminus G_0$ is a closed subset of a locally compact second countable Hausdorff space, and therefore is a countable union of compact subsets: $G \setminus G_0 = \bigcup_{n=1}^{\infty} K_n$. Each $s(K_n)$ is thin relative to $G_0$. Let $U_n$ be the interior of $s(K_n) \cap E$ relative to $E$. Then Lemma 2.5(3) implies that $\mu(U_n) = 0$ for every $G_0$-invariant Borel probability measure $\mu$ on $G^{(0)}$, and hence for every $G$-invariant Borel probability measure $\mu$ on $G^{(0)}$. So $U_n = \emptyset$ by Definition 2.2(2). Thus $s(K_n) \cap E$ is nowhere dense in $E$. It follows that $s(G \setminus G_0) \cap E$ is meager in $E$, and in particular that its complement is dense in $E$. So the points in $E$ with trivial isotropy are dense in $E$.

Now we start work toward the correspondences between the sets of invariant measures. The following lemma is the key technical result for taking advantage of the structure of an almost AF Cantor groupoid, not only here but in later sections as well. The main part is (3), in which the products are in the C*-algebra of the AF subgroupoid. The other parts are given for easy reference.

Lemma 2.7. Let $G$ be a Cantor groupoid, and let $G_0$ be an AF subgroupoid satisfying Part (1) of the definition of an almost AF Cantor groupoid (Definition 2.2). Let $f \in C_c(G)$. Let $K$ and $L$ be compact open subsets of $G^{(0)}$ such that
\[ K \cap s(\mathrm{supp}(f) \cap [G \setminus G_0]) = \emptyset \quad \text{and} \quad L \cap r(\mathrm{supp}(f) \cap [G \setminus G_0]) = \emptyset. \]
Let $p = \chi_K$ and $q = \chi_L$. Then (with convolution products evaluated in $C_c(G)$), we have:
(1) $p, q \in C_c(G)$ are projections.
(2)
\[ (fp)(g) = \begin{cases} f(g) & s(g) \in K \\ 0 & s(g) \notin K \end{cases} \quad \text{and} \quad (qf)(g) = \begin{cases} f(g) & r(g) \in L \\ 0 & r(g) \notin L. \end{cases} \]
(3) $fp, qf \in C_c(G_0)$.
Proof. Part (1) is obvious. To prove Parts (2) and (3) for $fp$, we evaluate $(f \chi_K)(g)$ following Remark 1.5(1). We have $\chi_K(h) = 0$ for $h \notin G^{(0)}$, so the formula reduces to $(f \chi_K)(g) = f(g) \chi_K(s(g))$ for $g \in G$. This is the formula for $fp$ in Part (2). Now suppose $g \in G \setminus G_0$. If $f(g) \neq 0$, then $s(g) \in s(\mathrm{supp}(f) \cap [G \setminus G_0])$, so $\chi_K(s(g)) = 0$. Thus $g \in G \setminus G_0$ implies $(f \chi_K)(g) = 0$. Certainly $\mathrm{supp}(f \chi_K)$ is compact, so $f \chi_K \in C_c(G_0)$, which is Part (3). The proof of Parts (2) and (3) for $qf$ is similar, or can be obtained from the case already done by applying it to $f^*$ and taking adjoints.

Lemma 2.8. Let $G$ be a Cantor groupoid, and let $G_0$ be an AF subgroupoid satisfying Part (1) of the definition of an almost AF Cantor groupoid (Definition 2.2). Then every $G_0$-invariant Borel probability measure on $G^{(0)}$ is $G$-invariant.

Proof. Let $\mu$ be a $G_0$-invariant probability measure on $G^{(0)}$. By assumption, we have
\[ \int_{G^{(0)}} \sum_{g \in G \,:\, r(g) = x} f(g) \, d\mu(x) = \int_{G^{(0)}} \sum_{g \in G \,:\, s(g) = x} f(g) \, d\mu(x) \]
for all $f \in C_c(G_0)$. We need to verify this equation for all $f \in C_c(G)$. It suffices to do this for nonnegative functions $f$.

Let $f \in C_c(G)$ be nonnegative, and let $\varepsilon > 0$. By Lemma 1.11, we can write $\mathrm{supp}(f)$ as the disjoint union of finitely many compact graphs in $G$, say $N$ of them. It follows that for any $x \in G^{(0)}$, we have
\[ \mathrm{card}(\{g \in \mathrm{supp}(f) : r(g) = x\}) \le N \quad \text{and} \quad \mathrm{card}(\{g \in \mathrm{supp}(f) : s(g) = x\}) \le N. \]
Set
\[ K_1 = r(\mathrm{supp}(f) \cap [G \setminus G_0]) \quad \text{and} \quad K_2 = s(\mathrm{supp}(f) \cap [G \setminus G_0]). \]
Then $K_1$ and $K_2$ are thin subsets of $G^{(0)}$, so Lemma 2.5(2) provides compact open subsets $V_1, V_2 \subset G^{(0)}$ such that
\[ K_j \subset V_j \quad \text{and} \quad \mu(V_j) < \frac{\varepsilon}{N \|f\|_\infty + 1} \]
for $j = 1, 2$. Define $f_1 = \chi_{G^{(0)} \setminus V_1} f$. By Lemma 2.7(2),
\[ f_1(g) = \begin{cases} 0 & r(g) \in V_1 \\ f(g) & \text{otherwise.} \end{cases} \]
Therefore
\[
\begin{aligned}
\left| \int_{G^{(0)}} \sum_{g \in G \,:\, r(g) = x} f(g) \, d\mu(x) - \int_{G^{(0)}} \sum_{g \in G \,:\, r(g) = x} f_1(g) \, d\mu(x) \right|
&= \left| \int_{V_1} \sum_{g \in G \,:\, r(g) = x} f(g) \, d\mu(x) \right| \\
&\le \mu(V_1) \|f\|_\infty \sup_{x \in G^{(0)}} \mathrm{card}(\{g \in \mathrm{supp}(f) : r(g) = x\}) \\
&\le \mu(V_1) \|f\|_\infty N < \varepsilon.
\end{aligned}
\]
In the following calculation, for the first step we use the previous estimate, for the second we use $f_1 \in C_c(G_0)$ (which holds by Lemma 2.7(3)) and the $G_0$-invariance of $\mu$, and for the third we use $f_1 \le f$ (which holds because $f$ is nonnegative):
\[
\begin{aligned}
\int_{G^{(0)}} \sum_{g \in G \,:\, r(g) = x} f(g) \, d\mu(x)
&< \varepsilon + \int_{G^{(0)}} \sum_{g \in G \,:\, r(g) = x} f_1(g) \, d\mu(x) \\
&= \varepsilon + \int_{G^{(0)}} \sum_{g \in G \,:\, s(g) = x} f_1(g) \, d\mu(x) \\
&\le \varepsilon + \int_{G^{(0)}} \sum_{g \in G \,:\, s(g) = x} f(g) \, d\mu(x).
\end{aligned}
\]
A similar argument, using $f_2 = f \chi_{G^{(0)} \setminus V_2}$ in place of $f_1$, gives the same inequality with the range and source maps exchanged. Since $\varepsilon > 0$ is arbitrary, we get
\[ \int_{G^{(0)}} \sum_{g \in G \,:\, s(g) = x} f(g) \, d\mu(x) = \int_{G^{(0)}} \sum_{g \in G \,:\, r(g) = x} f(g) \, d\mu(x), \]
as desired.

Lemma 2.9. Let $G$ be a locally compact r-discrete groupoid with counting measures as the Haar system.
(1) Let $\mu$ be an invariant Borel probability measure on $G^{(0)}$. Then the formula
\[ \tau(f) = \int_{G^{(0)}} f|_{G^{(0)}} \, d\mu, \]
for $f \in C_c(G)$, defines a normalized trace on the C*-algebra $C_r^*(G)$. Moreover, the assignment $\mu \mapsto \tau$ is injective.
(2) Suppose in addition that $G$ is principal. Then every normalized trace on $C_r^*(G)$ is obtained from an invariant Borel probability measure $\mu$ on $G^{(0)}$ as in (1).

Proof. This is a special case of Proposition 5.4 in Chapter 2 of [41].
Lemma 2.10. Let $G$ be a Cantor groupoid, and let $G_0$ be an AF subgroupoid satisfying Part (1) of the definition of an almost AF Cantor groupoid (Definition 2.2). Let $\tau_1$ and $\tau_2$ be two normalized traces on $C^*_r(G)$ whose restrictions to $C^*_r(G_0)$ are equal. Then $\tau_1 = \tau_2$.
Proof. Let $f \in C_c(G)$, and let $\varepsilon > 0$. We prove that $|\tau_1(f^* f) - \tau_2(f^* f)| < \varepsilon$. Such elements are dense in the positive elements of $C^*_r(G)$, so their linear span is dense in $C^*_r(G)$, and the result will follow. Without loss of generality $\|f\| \le 1$. Let $K = r(\mathrm{supp}(f^* f) \cap [G \setminus G_0])$. Since $f^* f \in C_c(G)$, the set $K$ is thin in $G_0$ (Definition 2.1). Also, since $f^* f$ is selfadjoint, we have $s(\mathrm{supp}(f^* f) \cap [G \setminus G_0]) = K$. Choose $n \in \mathbb{N}$ with $n > 2 \varepsilon^{-1}$. Use Lemma 2.5(1) to choose a compact open set $W$ containing $K$, and compact open graphs $W_1, W_2, \ldots, W_n \subset G$ such that $s(W_k) = W$ and the sets $r(W_1), r(W_2), \ldots, r(W_n)$ are pairwise disjoint compact open subsets of $G^{(0)}$. Let $p = \chi_W$, which is a projection in $C_c(G)$. The function $v_k = \chi_{W_k}$ defines an element of $C_c(G)$ such that $v_k^* v_k = p$ and
$v_k v_k^* = \chi_{r(W_k)}$. Since the projections $\chi_{r(W_1)}, \chi_{r(W_2)}, \ldots, \chi_{r(W_n)}$ are pairwise orthogonal, it follows that
\[ \tau_1(p), \, \tau_2(p) \le \tfrac{1}{n} < \tfrac{1}{2} \varepsilon. \]
By Lemma 2.7(3), the products $(1 - p) f^* f$ and $f^* f (1 - p)$ are in $C_c(G_0)$. Since $p \in C_c(G_0)$, it follows that
\[ f^* f - p f^* f p = (1 - p) f^* f + p f^* f (1 - p) \in C_c(G_0). \]
Therefore $\tau_1(f^* f - p f^* f p) = \tau_2(f^* f - p f^* f p)$. On the other hand, $p f^* f p \le \|f\|^2 p \le p$, so
\[ 0 \le \tau_1(p f^* f p) \le \tau_1(p) < \tfrac{1}{2} \varepsilon \quad \text{and} \quad 0 \le \tau_2(p f^* f p) \le \tau_2(p) < \tfrac{1}{2} \varepsilon. \]
It follows that
\[ |\tau_1(f^* f) - \tau_2(f^* f)| = |\tau_1(p f^* f p) - \tau_2(p f^* f p)| < \varepsilon, \]
as desired.
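The bound $\tau_j(p) \le 1/n$ used in the proof above is stated without computation. One way to see it is sketched below; here $\tau$ stands for either of $\tau_1, \tau_2$, and the only facts assumed are the trace property and that $C^*_r(G)$ is unital with $\tau(1) = 1$.

```latex
% Sketch: v_k, W_k, p, n are as in the proof of Lemma 2.10; \tau is \tau_1 or \tau_2.
\begin{align*}
n \, \tau(p) &= \sum_{k=1}^{n} \tau(v_k^* v_k) && \text{since } v_k^* v_k = p \text{ for each } k \\
             &= \sum_{k=1}^{n} \tau(v_k v_k^*) && \text{trace property} \\
             &= \tau\Big( \sum_{k=1}^{n} \chi_{r(W_k)} \Big) \le \tau(1) = 1
             && \text{the } \chi_{r(W_k)} \text{ are orthogonal, so their sum is a projection.}
\end{align*}
% Hence \tau(p) \le 1/n, and the choice n > 2 \varepsilon^{-1} gives 1/n < \varepsilon / 2.
```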
Proposition 2.11. Let $G$ be a Cantor groupoid, and let $G_0$ be an AF subgroupoid satisfying Part (1) of the definition of an almost AF Cantor groupoid (Definition 2.2). Then the following sets can all be canonically identified:
• The space $M$ of $G$-invariant Borel probability measures on $G^{(0)}$.
• The space $M_0$ of $G_0$-invariant Borel probability measures on $G^{(0)}$.
• $T(C^*_r(G))$, the space of normalized traces on $C^*_r(G)$.
• $T(C^*_r(G_0))$, the space of normalized traces on $C^*_r(G_0)$.
The map from $M$ to $M_0$ is the identity. (Both are sets of measures on $G^{(0)}$.) The map from $T(C^*_r(G))$ to $T(C^*_r(G_0))$ is restriction of traces (using Lemma 1.9). The maps from $M$ to $T(C^*_r(G))$ and from $M_0$ to $T(C^*_r(G_0))$ are as in Lemma 2.9.
Proof. The map from $M_0$ to $T(C^*_r(G_0))$ is bijective by Lemma 2.9, because $G_0$ is principal. The map from $M$ to $M_0$ is well defined because $G$-invariant measures are obviously $G_0$-invariant, and is then trivially injective. It is surjective by Lemma 2.8. The map from $T(C^*_r(G))$ to $T(C^*_r(G_0))$ is injective by Lemma 2.10, and the map from $M$ to $T(C^*_r(G))$ is injective by Lemma 2.9. The composite $M \to T(C^*_r(G_0))$ is bijective by what we have already done, so both these maps must be bijective.
It is apparently not known whether Lemma 2.9(2) can be generalized to essentially principal groupoids, but this proposition shows that its conclusion is valid for almost AF Cantor groupoids.
We next show how to simplify the verification that a groupoid is almost AF when its reduced C*-algebra is simple. We need the following well known lemma. We have been unable to find a suitable reference in the literature, so we include a proof for completeness.
Lemma 2.12. Every unital AF algebra $B$ has a normalized trace.
Proof. Write $B = \varinjlim B_n$ for finite dimensional C*-algebras $B_n$ and injective unital homomorphisms $B_n \to B_{n+1}$. Let $\tau_n$ be a normalized trace on $B_n$, and use the Hahn-Banach Theorem to extend $\tau_n$ to a state $\omega_n$ on $B$. Use Alaoglu’s Theorem to find a weak* limit point $\tau$ of the sequence $(\omega_n)$. It is easily checked that $\tau$ is a trace on $B$.
Proposition 2.13. Let $G$ be a Cantor groupoid, and let $G_0$ be an AF subgroupoid satisfying Part (1) of the definition of an almost AF Cantor groupoid (Definition 2.2). Assume that $C^*_r(G)$ is simple, or that $C^*_r(G_0)$ is simple. Then $G$ is an almost AF Cantor groupoid.
Proof. We must verify Part (2) of Definition 2.2. First, if $C^*_r(G)$ is simple, Proposition 4.5(i) in Chapter 2 of [41] implies that there are no nontrivial closed $G$-invariant subsets in $G^{(0)}$. If $C^*_r(G_0)$ is simple, then for the same reason there are no nontrivial closed $G_0$-invariant subsets in $G^{(0)}$, and so certainly no nontrivial closed $G$-invariant subsets in $G^{(0)}$. Therefore, in either case, it suffices to find a $G$-invariant Borel probability measure $\mu$ on $G^{(0)}$ such that $\mu(U) > 0$ for every nonempty open subset $U \subset G^{(0)}$.
The C*-algebra $C^*_r(G_0)$ is AF by Corollary 1.17. It is unital, so by Lemma 2.12 it has a normalized trace $\tau$. Proposition 2.11 therefore implies the existence of a $G$-invariant Borel probability measure $\mu$ on $G^{(0)}$.
Let $U \subset G^{(0)}$ be open and nonempty, and suppose that $\mu(U) = 0$. Let $V = r(s^{-1}(U))$, using the range and source maps of $G$. Then $V$ is a $G$-invariant subset of $G^{(0)}$ (Definition 1.14(2)) which contains $U$, and it is open because $r$ is a local homeomorphism (Lemma 1.2(2)). Therefore $V = G^{(0)}$. Since every element of a Cantor groupoid $G$ is contained in a compact open graph in $G$, and since $G$ is second countable, there exists a countable base for the topology of $G$ consisting of compact open graphs. In particular, there is a countable collection of compact open graphs, say $W_1, W_2, \ldots \subset G$, such that $s(W_n) \subset U$ for all $n$ and $V = \bigcup_{n=1}^{\infty} r(W_n)$. Using Lemma 1.13, we get $\mu(r(W_n)) = \mu(s(W_n)) \le \mu(U) = 0$, whence
\[ \mu(G^{(0)}) = \mu(V) \le \sum_{n=1}^{\infty} \mu(r(W_n)) = 0. \]
This contradicts the assumption that $\mu$ is a probability measure.
We now show that simplicity of Cr∗ (G) implies that of Cr∗ (G0 ). We will not actually need this result, because the order on projections in even a nonsimple unital AF algebra is almost determined by traces, but we think it clarifies the structure of almost AF Cantor groupoids. We need a lemma. Lemma 2.14. Let A be a unital AF algebra. Suppose τ (p) > 0 for every normalized trace τ on A and every nonzero projection p ∈ A. Then A is simple. Proof. Suppose A is not simple. Let I be a nontrivial ideal in A. Then A/I is a unital AF algebra. By Lemma 2.12, there is a normalized trace τ on A/I . Let π : A → A/I be the quotient map. Since I is AF, there is a nonzero projection p ∈ I . Then τ ◦ π is a normalized trace on A and p is a nonzero projection in A such that τ ◦ π(p) = 0.
Proposition 2.15. Let $G$ be an almost AF Cantor groupoid, with AF subgroupoid $G_0$ as in Definition 2.2(1). Suppose $C^*_r(G)$ is simple. Then $C^*_r(G_0)$ is a simple AF algebra.
Proof. The algebra $C^*_r(G_0)$ is AF because $G_0$ is an AF groupoid. (See Proposition 1.15 in Chapter 3 of [41].) It is unital because the unit space of $G_0$ is compact. By Proposition 2.11, every trace on $C^*_r(G_0)$ is the restriction of a trace on $C^*_r(G)$. Since $C^*_r(G)$ is simple, every normalized trace on it is strictly positive on every nonzero projection in $C^*_r(G)$, and in particular on every nonzero projection in $C^*_r(G_0)$. So $C^*_r(G_0)$ is simple by Lemma 2.14.
We close this section with one significant unanswered question.
Question 2.16. Is an almost AF Cantor groupoid necessarily amenable?
Remark 2.17. If the almost AF Cantor groupoid $G$ is a transformation group groupoid $\Gamma \times X$ (as in Example 1.3), then the answer is yes; in fact, the group $\Gamma$ is necessarily amenable. See Example 2.7(3) of [1].

3. Traces and Order on K-Theory

In this section, we prove that if $G$ is an almost AF Cantor groupoid, then the traces on $C^*_r(G)$ determine the order on $K_0(C^*_r(G))$, that is, if $\eta \in K_0(C^*_r(G))$ and $\tau_*(\eta) > 0$ for all normalized traces $\tau$, then $\eta > 0$. The description “traces determine order” is strictly correct only for simple algebras. When $G$ is the groupoid of a substitution tiling system as in [40], this is the main result of that paper. Theorem 7.1 below implies that such a groupoid is in fact an almost AF Cantor groupoid. For the case that $C^*_r(G)$ is simple, the result of this section will be strengthened in Corollary 5.4 below. Although the proofs are a bit different (and, we hope, conceptually simpler), the basic idea of this section is entirely contained in Sect. 3 of [40].
Lemma 3.1. Let $G$ be an almost AF Cantor groupoid, with open AF subgroupoid $G_0 \subset G$ as in Definition 2.2(1). Let $F \subset C_c(G)$ be a finite set, and let $\varepsilon > 0$. Then there exists a compact open subset $V$ of $G^{(0)}$ such that, with $p = \chi_V \in C(G^{(0)}) \subset C^*_r(G_0)$, we have:
(1) $r(\mathrm{supp}(f) \cap [G \setminus G_0]) \cup s(\mathrm{supp}(f) \cap [G \setminus G_0]) \subset V$ for all $f \in F$.
(2) $\|(1 - p) f (1 - p)\| > \|f\| - \varepsilon$ for all $f \in F$.
(3) $\tau(p) < \varepsilon$ for every normalized trace $\tau$ on $C^*_r(G)$.
Proof. We start by choosing $V$ so that (1) and (2) are satisfied. Let $F = \{f_1, f_2, \ldots, f_n\}$. By Corollary 2.4, there is a $G$-invariant probability measure $\mu$ on $G^{(0)}$ whose associated regular representation $\pi$ (see Remark 1.5(3)) is faithful on $C^*_r(G)$. Let $\nu$ be as in Remark 1.5(3). Choose $\xi_1, \xi_2, \ldots, \xi_n, \eta_1, \eta_2, \ldots, \eta_n \in C_c(G) \subset L^2(G, \nu)$ such that
\[ \|\xi_k\| = \|\eta_k\| = 1 \quad \text{and} \quad |\langle \pi(f_k) \xi_k, \eta_k \rangle| > \|f_k\| - \tfrac{1}{2} \varepsilon \]
for $1 \le k \le n$.
Let
\[ K = \bigcup_{k=1}^{n} \big( \mathrm{supp}(f_k) \cup \mathrm{supp}(\xi_k) \cup \mathrm{supp}(\eta_k) \big). \]
Then $K \cap (G \setminus G_0)$ is a compact subset of $G \setminus G_0$, so $s(K \cap [G \setminus G_0])$ is a thin set in $G^{(0)}$ by Definition 2.2(1) and has measure zero by Lemma 2.5(3). Considering $\{g^{-1} : g \in K\}$ in place of $K$, we also get $\mu(r(K \cap [G \setminus G_0])) = 0$. Therefore
\[ L = s(K \cap [G \setminus G_0]) \cup r(K \cap [G \setminus G_0]) \]
is a compact subset of $G^{(0)}$ with $\mu(L) = 0$. Choose a decreasing sequence of compact open sets $V_l \subset G^{(0)}$ such that $\bigcap_{l=1}^{\infty} V_l = L$. Set $p_l = \chi_{G^{(0)} \setminus V_l} \in C(G^{(0)})$. Then one checks, for example by using the formula for $\langle \pi(p_l) \xi, \eta \rangle$ in Remark 1.5(3), that if $\xi \in C_c(G) \subset L^2(G, \nu)$ then
\[ (\pi(p_l) \xi)(g) = \begin{cases} 0 & r(g) \in V_l \\ \xi(g) & \text{otherwise.} \end{cases} \]
Use Lemma 1.11 to write $K$ as the disjoint union of compact graphs $K_1, K_2, \ldots, K_N \subset G$. From Lemma 1.13, and because $\mu$ is $G$-invariant, we get
\[ \nu(r^{-1}(V_l) \cap K) = \sum_{j=1}^{N} \nu(r^{-1}(V_l) \cap K_j) = \sum_{j=1}^{N} \mu(r(r^{-1}(V_l) \cap K_j)) \le N \mu(V_l). \]
Set $W_l = r^{-1}(V_l) \cap K$. Then $W_1 \supset W_2 \supset \cdots$, we have $\nu(W_l) \le N \mu(V_l) \to 0$ (because $\mu(L) = 0$), and for each $k$ we have
\[ \pi(p_l) \xi_k = \chi_{G \setminus W_l} \xi_k \quad \text{and} \quad \pi(p_l) \eta_k = \chi_{G \setminus W_l} \eta_k \]
(pointwise product on the right). So $\pi(p_l) \xi_k \to \xi_k$ and $\pi(p_l) \eta_k \to \eta_k$ almost everywhere $[\nu]$ as $l \to \infty$. Applying the Dominated Convergence Theorem, we get
\[ \lim_{l \to \infty} \|\pi(p_l) \xi_k - \xi_k\| = 0 \quad \text{and} \quad \lim_{l \to \infty} \|\pi(p_l) \eta_k - \eta_k\| = 0 \]
for $1 \le k \le n$. Therefore there is $l$ such that
\[ |\langle \pi(f_k) \pi(p_l) \xi_k, \pi(p_l) \eta_k \rangle| > \|f_k\| - \varepsilon \]
for $1 \le k \le n$. So, using $\|\xi_k\| = \|\eta_k\| = 1$, we get
\[ \|p_l f_k p_l\| \ge |\langle \pi(p_l f_k p_l) \xi_k, \eta_k \rangle| = |\langle \pi(f_k) \pi(p_l) \xi_k, \pi(p_l) \eta_k \rangle| > \|f_k\| - \varepsilon. \]
Take $V = V_l$. With this choice, parts (1) and (2) of the conclusion hold.
To obtain part (3), apply Lemma 2.5(2) with $\varepsilon$ as given, and taking for $K$ the set
\[ [G \setminus G_0] \cap \bigcup_{k=1}^{n} [\mathrm{supp}(f_k) \cup \mathrm{supp}(f_k^*)] \]
(which is thin in $G_0$). Call the resulting set $W$. Then replace $V$ by $V \cap W$. This clearly does not affect the validity of parts (1) and (2), and we get part (3) by Proposition 2.11.
Lemma 3.2. Let $A$ be a C*-algebra with real rank zero. Let $a, b \in A$ be positive elements with $\|a\|, \|b\| \le 1$ and $ab = b$. Let $\varepsilon > 0$. Then there is a projection $p \in \overline{bAb}$ such that
\[ ap = p \quad \text{and} \quad \|pb - b\| < \varepsilon. \]
Proof. Let $B = \overline{bAb}$. Then $ax = x$ for all $x \in B$. Since $A$ has real rank zero, the hereditary subalgebra $B$ has an approximate identity consisting of projections. Since $b \in B$, there is a projection $p \in B$ with $\|pb - b\| < \varepsilon$.
Lemma 3.3. Let $G$ be an almost AF Cantor groupoid, with open AF subgroupoid $G_0 \subset G$ as in Definition 2.2(1). Let $e \in C^*_r(G)$ be a projection, and let $\varepsilon > 0$. Then there is a projection $q \in C^*_r(G_0)$ which is Murray-von Neumann equivalent to a subprojection of $e$ and such that $\tau(e) - \tau(q) < \varepsilon$ for every normalized trace $\tau$ on $C^*_r(G)$.
Proof. Without loss of generality $\varepsilon < 6$. Choose $\delta_0 > 0$ such that whenever $A$ is a C*-algebra and $p_1, p_2 \in A$ are projections such that $\|p_1 p_2 - p_2\| < \delta_0$, then $p_2$ is Murray-von Neumann equivalent to a subprojection of $p_1$. Define a continuous function $f \colon [0, \infty) \to [0, 1]$ by
\[ f(t) = \begin{cases} 6 \varepsilon^{-1} t & 0 \le t \le \tfrac{1}{6} \varepsilon \\ 1 & \tfrac{1}{6} \varepsilon \le t. \end{cases} \]
Choose $\delta > 0$ such that whenever $A$ is a C*-algebra and $a_1, a_2 \in A$ are positive elements with $\|a_1\|, \|a_2\| \le 1$ and $\|a_1 - a_2\| < \delta$, then $\|f(a_1) - f(a_2)\| < \tfrac{1}{2} \delta_0$.
Since $C_c(G)$ is a dense *-subalgebra of $C^*_r(G)$, there is a selfadjoint element $d \in C_c(G)$ with $\|e - d\| < \min\big( \tfrac{1}{2} \delta, \tfrac{1}{6} \varepsilon \big)$ and $\|d\| \le 1$. Apply Lemma 3.1 with $F = \{d\}$ and with $\tfrac{1}{6} \varepsilon$ in place of $\varepsilon$, obtaining a projection $p = \chi_V \in C(G^{(0)}) \subset C^*_r(G_0)$ for a suitable compact open set $V \subset G^{(0)}$, such that
\[ r(\mathrm{supp}(d) \cap [G \setminus G_0]) \cup s(\mathrm{supp}(d) \cap [G \setminus G_0]) \subset V \]
and $\tau(p) < \tfrac{1}{6} \varepsilon$ for every $\tau \in T(C^*_r(G))$, the space of normalized traces on $C^*_r(G)$. Lemma 2.7(3) gives $(1 - p) d, \, d(1 - p) \in C^*_r(G_0)$.
For every $\tau \in T(C^*_r(G))$, we have $\tau(p e (1 - p)) = \tau((1 - p) e p) = 0$, so
\[ \tau((1 - p) e (1 - p)) = \tau(e) - \tau(p e p) \ge \tau(e) - \tau(p) > \tau(e) - \tfrac{1}{6} \varepsilon, \]
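and (using $\|d^2 - e\| < \tfrac{1}{3} \varepsilon$)
\[ \tau(d(1 - p)d) = \tau((1 - p) d^2 (1 - p)) > \tau(e) - \tfrac{1}{2} \varepsilon. \]
Also, $d(1 - p)d$ is a positive element in $C^*_r(G_0)$.
Let $f \colon [0, \infty) \to [0, 1]$ be as above, and define continuous functions $g, h \colon [0, \infty) \to [0, 1]$ by
\[ g(t) = \begin{cases} 0 & 0 \le t \le \tfrac{1}{6} \varepsilon \\ 6 \varepsilon^{-1} t - 1 & \tfrac{1}{6} \varepsilon \le t \le \tfrac{1}{3} \varepsilon \\ 1 & \tfrac{1}{3} \varepsilon \le t \end{cases} \quad \text{and} \quad h(t) = \begin{cases} t & 0 \le t \le \tfrac{1}{6} \varepsilon \\ \tfrac{1}{6} \varepsilon & \tfrac{1}{6} \varepsilon \le t. \end{cases} \]
Define
\[ a = f(d(1 - p)d), \quad b = g(d(1 - p)d), \quad \text{and} \quad c = h(d(1 - p)d). \]
Then $a, b, c \in C^*_r(G_0)$ are positive, and
\[ ab = b, \quad b + c \ge d(1 - p)d, \quad \|a\| \le 1, \quad \|b\| \le 1, \quad \text{and} \quad \|c\| \le \tfrac{1}{6} \varepsilon. \]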
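The relations $ab = b$ and $b + c \ge d(1 - p)d$ reduce, by functional calculus, to pointwise statements about $f$, $g$, $h$ on the spectrum of $d(1 - p)d$. The following sketch checks them interval by interval; it assumes the piecewise-linear definitions of $f$, $g$, $h$ given in the proof and the normalization $\varepsilon < 6$, and it is the place where that normalization appears to be needed.

```latex
% Pointwise check on the spectrum of x = d(1-p)d, which lies in [0, 1] because \|d\| \le 1.
% Assumption: 0 < \varepsilon < 6, as arranged at the start of the proof of Lemma 3.3.
\begin{align*}
0 \le t \le \tfrac16\varepsilon: &\quad g(t) = 0, \quad g(t) + h(t) = t; \\
\tfrac16\varepsilon \le t \le \tfrac13\varepsilon: &\quad f(t) = 1, \quad
  g(t) + h(t) - t = \big(6\varepsilon^{-1} - 1\big)\big(t - \tfrac16\varepsilon\big) \ge 0; \\
\tfrac13\varepsilon \le t \le 1: &\quad f(t) = g(t) = 1, \quad g(t) + h(t) = 1 + \tfrac16\varepsilon \ge t.
\end{align*}
% So f g = g and g + h \ge \mathrm{id} on [0, 1], which give ab = b and b + c \ge d(1-p)d.
```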
In particular, every $\tau \in T(C^*_r(G))$ satisfies $\tau(c) \le \tfrac{1}{6} \varepsilon$, so
\[ \tau(b) = \tau(b + c) - \tau(c) \ge \tau(d(1 - p)d) - \tfrac{1}{6} \varepsilon > \tau(e) - \tfrac{2}{3} \varepsilon. \]
Since $C^*_r(G_0)$ is an AF algebra, we can apply Lemma 3.2 to find a projection $q \in \overline{b C^*_r(G_0) b}$ such that $aq = q$ and $\|qb - b\| < \tfrac{1}{6} \varepsilon$. Then $\|qbq - b\| < \tfrac{1}{3} \varepsilon$, so that for every $\tau \in T(C^*_r(G))$,
\[ \tau(q) \ge \tau(qbq) > \tau(b) - \tfrac{1}{3} \varepsilon > \tau(e) - \varepsilon. \]
This is one half of the desired conclusion.
From $\|d - e\| < \tfrac{1}{2} \delta$ and $\|d\| \le 1$ we get $\|d(1 - p)d - e(1 - p)e\| < \delta$, so the choice of $\delta$ at the beginning of the proof gives $\|a - f(e(1 - p)e)\| < \tfrac{1}{2} \delta_0$. Since $e f(e(1 - p)e) = f(e(1 - p)e)$, we get $\|ea - a\| < \delta_0$. Also $aq = q$, so $\|eq - q\| < \delta_0$. The choice of $\delta_0$ at the beginning of the proof implies that $q$ is Murray-von Neumann equivalent to a subprojection of $e$. This is the other half of the desired conclusion.
Proposition 3.4. Let $A$ be a unital AF algebra (not necessarily simple). Let $p, q \in A$ be projections. Suppose that $\tau(p) < \tau(q)$ for all normalized traces $\tau$ on $A$. Then $p \precsim q$.
Proof. This is a special case of Theorem 5.1(b) of [35].
One can also give a direct proof. Write $A = \overline{\bigcup_{n=0}^{\infty} A_n}$ with $A_n$ finite dimensional, and assume $p, q \in A_0$ and that $p \precsim q$ fails. Then apply the proof of Lemma 2.12 to traces $\tau_n$ on $A_n$ such that $\tau_n(p) \ge \tau_n(q)$.
The following result is a K-theoretic version of Blackadar’s Second Fundamental Comparability Question ([7], 1.3.1). We will get the full result for the simple case in Corollary 5.4, after we have proved stable rank one.
Theorem 3.5. Let $G$ be an almost AF Cantor groupoid. If $\eta \in K_0(C^*_r(G))$ satisfies $\tau_*(\eta) > 0$ for all normalized traces $\tau$ on $C^*_r(G)$, then there is a projection
\[ e \in M_\infty(C^*_r(G)) = \bigcup_{n=1}^{\infty} M_n(C^*_r(G)) \]
such that $\eta = [e]$.
Proof. Let $G_0 \subset G$ be an open AF subgroupoid as in Definition 2.2(1). Write $\eta = [q] - [p]$ for projections $p, q \in M_\infty(C^*_r(G))$. Choose $n$ large enough that both $p$ and $q$ are in $M_n(C^*_r(G))$. Then $G \times \{1, 2, \ldots, n\}^2$, with the groupoid structure being given by $(g, j, k)(h, k, l) = (gh, j, l)$ when $g$ and $h$ are composable in $G$, and all other pairs not composable, is an almost AF Cantor groupoid whose reduced C*-algebra is $M_n(C^*_r(G))$. Replacing $G$ by $G \times \{1, 2, \ldots, n\}^2$, we may therefore assume $p, q \in C^*_r(G)$.
Because $T(C^*_r(G))$ is weak* compact, there is $\varepsilon > 0$ such that $\tau(q) - \tau(p) > \varepsilon$ for all $\tau \in T(C^*_r(G))$. Apply Lemma 3.3 twice, once to find a projection $q_0 \in C^*_r(G_0)$ which is Murray-von Neumann equivalent to a subprojection $q_1$ of $q$ and such that $\tau(q) - \tau(q_0) < \tfrac{1}{3} \varepsilon$ for all $\tau \in T(C^*_r(G))$, and once to find a projection $f_0 \in C^*_r(G_0)$ which is Murray-von Neumann equivalent to a subprojection $f_1$ of $1 - p$ and such that $\tau(1 - p) - \tau(f_0) < \tfrac{1}{3} \varepsilon$ for all $\tau \in T(C^*_r(G))$. Then (using Proposition 2.11) we get $\tau(q_0) - \tau(1 - f_0) > \tfrac{1}{3} \varepsilon > 0$ for all $\tau \in T(C^*_r(G_0))$. Since $C^*_r(G_0)$ is an AF algebra (by Corollary 1.17), we can apply Proposition 3.4 to find a projection $p_0 \le q_0$ which is Murray-von Neumann equivalent to $1 - f_0$. Now we write, in $K_0(C^*_r(G))$,
\[ \eta = [q] - [p] = ([q] - [q_0]) + ([q_0] - [p_0]) + ([p_0] - [p]) = [q - q_1] + [q_0 - p_0] + [1 - p - f_1] > 0, \]
as desired.
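Two steps in the proof of Theorem 3.5 are compressed: the strict trace inequality that makes Proposition 3.4 applicable, and the identification of the three differences in $K_0$ as classes of projections. A sketch of both follows, in the notation of the proof, with $\sim$ denoting Murray-von Neumann equivalence and $\tau$ an arbitrary normalized trace.

```latex
% Notation as in the proof: q_0 \sim q_1 \le q, \quad f_0 \sim f_1 \le 1 - p, \quad p_0 \sim 1 - f_0, \quad p_0 \le q_0.
% Trace estimate, using \tau(q) - \tau(p) > \varepsilon:
\begin{align*}
\tau(q_0) &> \tau(q) - \tfrac13\varepsilon, \\
\tau(1 - f_0) &= 1 - \tau(f_0) < 1 - \tau(1 - p) + \tfrac13\varepsilon = \tau(p) + \tfrac13\varepsilon, \\
\tau(q_0) - \tau(1 - f_0) &> \tau(q) - \tau(p) - \tfrac23\varepsilon > 0.
\end{align*}
% K_0 bookkeeping for the three summands:
\begin{align*}
[q] - [q_0] &= [q] - [q_1] = [q - q_1], \\
[q_0] - [p_0] &= [q_0 - p_0], \\
[p_0] - [p] &= [1 - f_0] - [p] = [1] - [f_1] - [p] = [1 - p - f_1].
\end{align*}
```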
4. Real Rank of the C*-Algebra of an Almost AF Groupoid

In this section, we prove that if $G$ is an almost AF Cantor groupoid such that $C^*_r(G)$ is simple, then $C^*_r(G)$ has real rank zero.
Lemma 4.1. Let $G$ be an almost AF Cantor groupoid, with open AF subgroupoid $G_0 \subset G$ as in Definition 2.2(1). Let $F \subset C_c(G)$ be a finite set, and let $n \in \mathbb{N}$. Then there exists a compact open subset $W$ of $G^{(0)}$ such that the projection $p = \chi_W \in C(G^{(0)})$ satisfies:
(1) $(1 - p) f, \, f (1 - p) \in C_c(G_0)$ for all $f \in F$.
(2) There are $n$ mutually orthogonal projections in $C(G^{(0)})$, each of which is Murray-von Neumann equivalent to $p$ in $C^*_r(G_0)$.
Proof. Let
\[ K = \bigcup_{f \in F} [\mathrm{supp}(f) \cup \mathrm{supp}(f^*)]. \]
Then K ∩ (G \ G0 ) is a compact subset of G \ G0 , so by Definition 2.2, the set L = s(K ∩ [G \ G0 ]) ⊂ G(0) is thin in G0 . By Lemma 2.5(1), there exist a compact open set W containing L and compact open graphs in G0 , say W1 , W2 , . . . , Wn ⊂ G0 , such that
s(Wk ) = W and the sets r(W1 ), r(W2 ), . . . , r(Wn ) are pairwise disjoint compact open subsets of G(0) . Then the functions χWk ∈ Cc (G0 ) are partial isometries which implement Murray-von Neumann equivalences between p and the n mutually orthogonal projections χr(W1 ) , χr(W2 ) , . . . , χr(Wn ) . We clearly have s(supp(f ) ∩ [G \ G0 ]) ⊂ L for all f ∈ F . Also, supp(f ∗ ) ∩ [G \ G0 ] = {g −1 : g ∈ supp(f ) ∩ [G \ G0 ]}, so r(supp(f ) ∩ [G \ G0 ]) = s(supp(f ∗ ) ∩ [G \ G0 ]) ⊂ L for all f ∈ F . Therefore (1 − p)f, f (1 − p) ∈ Cc (G0 ) by Lemma 2.7(3).
Lemma 4.2. Let $A$ be a finite dimensional C*-algebra. Let $a \in A_{\mathrm{sa}}$, let $p \in A$ be a projection, and let $n \in \mathbb{N}$. Then there is a projection $q \in A$ such that
\[ p \le q, \quad [q] \le 2n[p] \in K_0(A), \quad \text{and} \quad \|qa - aq\| \le \tfrac{1}{n} \|a\|. \]
Proof. The result is trivial if $a = 0$, so, by scaling, we may assume $\|a\| = 1$. Without loss of generality $A = M_m$, which we think of as operators on $\mathbb{C}^m$, and $a$ is diagonal. Making suitable replacements of the diagonal entries of $a$, we find $b \in (M_m)_{\mathrm{sa}}$ such that
\[ \|a - b\| \le \tfrac{1}{2n} \quad \text{and} \quad \mathrm{sp}(b) \subset \Big\{ -\tfrac{2n - 1}{2n}, \, -\tfrac{2n - 3}{2n}, \, \ldots, \, \tfrac{2n - 3}{2n}, \, \tfrac{2n - 1}{2n} \Big\}. \]
Define $u = \exp(\pi i b)$. Then $u$ is a unitary in $M_m$ with $u^{2n} = -1$. Moreover, with $\log$ being the continuous branch with values such that $\mathrm{Im}(\log(\zeta)) \in (-\pi, \pi)$, we have $b = \tfrac{1}{\pi i} \log(u)$.
Let $H \subset \mathbb{C}^m$ be the linear span of the spaces $u^k p \mathbb{C}^m$ for $0 \le k \le 2n - 1$. Then $uH = H$ since $u^{2n} = -1$, and $\dim(H) \le 2n \cdot \mathrm{rank}(p)$. Let $q \in M_m$ be the orthogonal projection onto $H$. Then $p \le q$ and $[q] \le 2n[p] \in K_0(M_m)$. Since $uq = qu$, functional calculus gives $bq = qb$. Since $\|a - b\| \le \tfrac{1}{2n}$, this gives $\|qa - aq\| \le \tfrac{1}{n}$.
Lemma 4.3. Let $A$ be a unital AF algebra. Let $a \in A_{\mathrm{sa}}$ be nonzero, and let $p \in A$ be a projection. Let $\varepsilon > 0$, and let $n \in \mathbb{N}$ satisfy $n > \tfrac{1}{\varepsilon}$. Then there is a projection $q \in A$ such that
\[ p \le q, \quad [q] \le 2n[p] \in K_0(A), \quad \text{and} \quad \|qa - aq\| < \varepsilon \|a\|. \]
Proof. There are finite dimensional subalgebras of $A$ which contain projections arbitrarily close to $p$ and selfadjoint elements arbitrarily close to $a$ and with the same norm. Conjugating by suitable unitaries, we find that there are finite dimensional subalgebras of $A$ which exactly contain $p$ and which contain selfadjoint elements arbitrarily close to $a$ and with the same norm. Choose such a subalgebra $B$ which contains a selfadjoint element $b$ with $\|b\| = \|a\|$ and $\|a - b\| < \tfrac{1}{2} \big( \varepsilon - \tfrac{1}{n} \big) \|a\|$. Apply Lemma 4.2 to $B$, $b$, $p$, and $n$. The resulting projection $q \in B \subset A$ has the required properties.
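The final claim of the proof of Lemma 4.3, that $q$ approximately commutes with $a$, follows from a three-term estimate. The sketch below assumes the bound on $\|a - b\|$ is read as $\tfrac12 (\varepsilon - \tfrac1n) \|a\|$, which is positive because $n > 1/\varepsilon$, and uses the conclusion $\|qb - bq\| \le \tfrac1n \|b\|$ of Lemma 4.2.

```latex
% q, b, n are as in the proof of Lemma 4.3; \|q\| \le 1 and \|b\| = \|a\|.
\begin{align*}
\|qa - aq\| &\le \|q(a - b)\| + \|qb - bq\| + \|(b - a)q\| \\
            &\le 2\|a - b\| + \tfrac1n \|b\| \\
            &< \big(\varepsilon - \tfrac1n\big)\|a\| + \tfrac1n \|a\| = \varepsilon \|a\|.
\end{align*}
```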
Lemma 4.4. Let $r > 0$, let $f \colon [-r, r] \to [0, 1]$ be a continuous function, and let $\varepsilon > 0$. Then there is $\delta > 0$ such that, whenever $A$ is a unital C*-algebra, and whenever a projection $p \in A$ and a selfadjoint element $a \in A$ satisfy $\|a\| \le r$ and $\|pa - ap\| < \delta$, and $\tau(f(a)) - \tau(1 - p) > \varepsilon$ for all normalized traces $\tau$ on $A$, then (using functional calculus in $pAp$)
\[ \tau(f(pap)) > \tau(f(a)) - \tau(1 - p) - \varepsilon \]
for all normalized traces $\tau$ on $A$.
Proof. Approximating the function $f$ uniformly on $[-r, r]$ by a polynomial, we can choose $\delta > 0$ so small that whenever $A$ is a unital C*-algebra, and whenever a projection $p \in A$ and a selfadjoint element $a \in A$ satisfy $\|a\| \le r$ and $\|pa - ap\| < \delta$, then $\|f(pap) - p f(a) p\| < \varepsilon$. (The expression $f(pap)$ is evaluated using functional calculus in $pAp$.) Fix a normalized trace $\tau$. We estimate:
\[ \tau((1 - p) f(a) (1 - p)) \le \tau(1 - p) \|f(a)\| \le \tau(1 - p) \]
and
\[ |\tau(f(pap)) - \tau(p f(a) p)| \le \|f(pap) - p f(a) p\| < \varepsilon. \]
Also, $\tau((1 - p) f(a) p) = \tau(p f(a) (1 - p)) = 0$, because $\tau$ is a trace. So
\[ \tau(f(pap)) > \tau(p f(a) p) - \varepsilon = \tau(f(a)) - \varepsilon - \big[ \tau((1 - p) f(a) (1 - p)) + \tau((1 - p) f(a) p) + \tau(p f(a) (1 - p)) \big] \ge \tau(f(a)) - \tau(1 - p) - \varepsilon, \]
as desired.
The following lemma gives a suitable version of a now standard technique, which goes back at least to the proof of Lemma 1.7 of [12].
Lemma 4.5. Let $A$ be a C*-algebra, let $a \in A$ be normal, let $\lambda_0 \in \mathbb{C}$, let $\varepsilon > 0$, and let $f \colon \mathbb{C} \to [0, 1]$ be a continuous function such that $\mathrm{supp}(f)$ is contained in the open $\varepsilon$-ball with center $\lambda_0$. Let $p$ be a projection in the hereditary subalgebra generated by $f(a)$. Then
\[ \|pa - ap\| < 2 \varepsilon \quad \text{and} \quad \|pap - \lambda_0 p\| < \varepsilon. \]
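Proof. Choose a continuous function $g \colon \mathbb{C} \to \mathbb{C}$ such that $|g(\lambda) - \lambda| < \varepsilon$ for all $\lambda \in \mathbb{C}$ and $g(\lambda) = \lambda_0$ for all $\lambda \in \mathrm{supp}(f)$. Then
\[ \|g(a) - a\| < \varepsilon, \quad p g(a) = g(a) p, \quad \text{and} \quad p g(a) p = \lambda_0 p. \]
This completes the proof.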
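The proof of Lemma 4.5 leaves the passage from the three displayed relations to the conclusion to the reader. A short sketch of that passage, in the notation of the lemma and its proof:

```latex
% Since (g - \lambda_0) f = 0, functional calculus gives (g(a) - \lambda_0) f(a) = 0, hence
% (g(a) - \lambda_0) x = 0 = x (g(a) - \lambda_0) for every x in the hereditary subalgebra
% generated by f(a); in particular p g(a) = \lambda_0 p = g(a) p.
\begin{align*}
\|pa - ap\| &= \|p(a - g(a)) - (a - g(a))p\| \le 2\|a - g(a)\| < 2\varepsilon, \\
\|pap - \lambda_0 p\| &= \|p(a - g(a))p\| \le \|a - g(a)\| < \varepsilon.
\end{align*}
```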
Theorem 4.6. Let $G$ be an almost AF Cantor groupoid. Suppose that $C^*_r(G)$ is simple. Then $C^*_r(G)$ has real rank zero in the sense of [10].
Proof. Let $G_0 \subset G$ be an open AF subgroupoid as in Definition 2.2(1). Let $a \in C^*_r(G)$ be selfadjoint with $\|a\| \le 1$. We approximate $a$ by an invertible selfadjoint element. Since $C_c(G)$ is a dense *-subalgebra of $C^*_r(G)$, we can approximate $a$ arbitrarily well by an element $a_0 \in C_c(G)$; replacing $a_0$ by $\tfrac{1}{2}(a_0 + a_0^*)$ and suitably scaling, we can approximate $a$ arbitrarily well by a selfadjoint element $a_0 \in C_c(G)$ with $\|a_0\| \le 1$. Thus, without loss of generality $a \in C_c(G)$. Also, if $a$ is invertible then there is nothing to prove, so we assume $0 \in \mathrm{sp}(a)$.
Let $\varepsilon > 0$. Choose continuous functions $f, g \colon [-1, 1] \to [0, 1]$ such that $g(0) = 1$, $fg = g$, and $\mathrm{supp}(f) \subset \big( -\tfrac{1}{9} \varepsilon, \tfrac{1}{9} \varepsilon \big)$. We are going to find projections $e, p, q$ which add up to 1, which approximately commute with $a$, such that $qaq$ is close to zero, such that $p \precsim q$, and such that $eae$ is close to an element of the AF algebra $C^*_r(G_0)$. The approximation of $a$ by an invertible selfadjoint element will then follow from Lemma 8 of [21]. The projection $p$ will be chosen to dominate a projection $p_0$ such that $b = a - p_0 a p_0 \in C^*_r(G_0)$, and $q$ will be a projection in the hereditary subalgebra in $C^*_r(G_0)$ generated by $f((1 - p) a (1 - p))$. We need to choose $p_0$ small enough that, when $p$ is constructed following Lemma 4.3, the hereditary subalgebra generated by $f((1 - p) a (1 - p))$ is large enough to contain a projection $q$ such that $p \precsim q$. Our first task is to lay the groundwork for this outcome.
Let $T(C^*_r(G))$ be the set of normalized traces on $C^*_r(G)$, and define
\[ \alpha = \inf_{\tau \in T(C^*_r(G))} \tau(g(a)). \]
All traces are faithful because $C^*_r(G)$ is simple, $g(a)$ is a nonzero positive element, and $T(C^*_r(G))$ is weak* compact. Therefore $\alpha > 0$. Choose $\delta > 0$ as in Lemma 4.4, with $\tfrac{1}{4} \alpha$ in place of $\varepsilon$, with $r = 2$, and with $g$ in place of $f$. We also require $\delta < \tfrac{1}{9} \varepsilon$. Choose $m \in \mathbb{N}$ with $\tfrac{2}{m} < \delta$, and use Lemma 4.1 to find a projection $p_0 \in C(G^{(0)})$ which is Murray-von Neumann equivalent in $C^*_r(G_0)$ to more than $8 m \alpha^{-1}$ mutually orthogonal projections in $C(G^{(0)})$, and such that $(1 - p_0) a, \, a (1 - p_0) \in C_c(G_0)$. In particular, $\tau(p_0) < \tfrac{1}{8} \alpha m^{-1}$ for all $\tau \in T(C^*_r(G_0))$.
Define $b = a - p_0 a p_0$, which is a selfadjoint element of $C_c(G_0)$ with $\|b\| \le 2$. Because $\tfrac{2}{m} < \delta$, we can apply Lemma 4.3 with $\tfrac{1}{2} \delta$ in place of $\varepsilon$ to obtain a projection $p \in C^*_r(G_0)$ such that $\|pb - bp\| < \delta$, $p_0 \le p$, and $[p] \le 2m[p_0]$ in $K_0(C^*_r(G))$. Now $p$ commutes with $a - b = p_0 a p_0$, so also $\|pa - ap\| < \delta$. Furthermore, because $p \in C^*_r(G_0)$ and $p \ge p_0$, we get $(1 - p) a, \, a (1 - p) \in C^*_r(G_0)$. Define $a_0 = (1 - p) a (1 - p)$. For every $\tau \in T(C^*_r(G_0))$, we have
\[ \tau(p) \le 2m \tau(p_0) < 2m \cdot \frac{\alpha}{8m} = \tfrac{1}{4} \alpha. \]
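By the choice of $\delta$ and using Lemma 4.4 (with $1 - p$ in place of $p$ and $\tfrac{1}{4} \alpha$ in place of $\varepsilon$), we get
\[ \tau(g(a_0)) > \tau(g(a)) - \tau(p) - \tfrac{1}{4} \alpha \ge \alpha - \tfrac{1}{4} \alpha - \tfrac{1}{4} \alpha = \tfrac{1}{2} \alpha \]
for all $\tau \in T(C^*_r(G_0))$. Also $f(a_0) g(a_0) = g(a_0)$, and $C^*_r(G_0)$ is an AF algebra, so Lemma 3.2 provides a projection $q \in C^*_r(G_0)$ such that
\[ q \in \overline{g(a_0) C^*_r(G_0) g(a_0)}, \quad f(a_0) q = q, \quad \text{and} \quad \|q g(a_0) - g(a_0)\| < \tfrac{1}{8} \alpha. \]
Therefore $\|q g(a_0) q - g(a_0)\| < \tfrac{1}{4} \alpha$. For all $\tau \in T(C^*_r(G_0))$, we have $\tau(q g(a_0) q) \le \tau(q)$ because $\|g(a_0)\| \le 1$. Combining this with the previous estimates, it follows that
\[ \tau(q) > \tau(g(a_0)) - \tfrac{1}{4} \alpha > \tfrac{1}{2} \alpha - \tfrac{1}{4} \alpha = \tfrac{1}{4} \alpha. \]
Since $C^*_r(G_0)$ is an AF algebra (by Corollary 1.17), since traces determine order on the $K_0$ group of an AF algebra (Proposition 3.4), and since $\tau(p) < \tfrac{1}{4} \alpha$ for all $\tau \in T(C^*_r(G_0))$, it follows that $p \precsim q$. Since $a_0$ is orthogonal to $p$, so is $q$. Moreover, Lemma 4.5 implies that
\[ \|q a_0 - a_0 q\| < \tfrac{2}{9} \varepsilon \quad \text{and} \quad \|q a_0 q\| < \tfrac{1}{9} \varepsilon. \]
Define $e = 1 - p - q$. We estimate $\|a - (eae + pap)\|$. Recall that $a_0 = (1 - p) a (1 - p)$. Therefore
\[ a - (eae + pap) = pa(1 - p) + (1 - p)ap + q a_0 e + e a_0 q + q a_0 q. \]
Thus, using $qe = 0$,
\[ \|a - (eae + pap)\| \le 2 \|pa - ap\| + 2 \|q a_0 - a_0 q\| + \|q a_0 q\| < 2 \delta + \tfrac{4}{9} \varepsilon + \tfrac{1}{9} \varepsilon < \tfrac{2}{9} \varepsilon + \tfrac{4}{9} \varepsilon + \tfrac{1}{9} \varepsilon = \tfrac{7}{9} \varepsilon. \]
Since
\[ p \precsim q \quad \text{and} \quad 1 - e = p + q, \]
Lemma 8 of [21] provides an invertible selfadjoint element $b \in (1 - e) C^*_r(G) (1 - e)$ such that $\|b - pap\| < \tfrac{1}{9} \varepsilon$. Also, $eae \in e C^*_r(G_0) e$, which is an AF algebra, so there is an invertible selfadjoint element $c \in e C^*_r(G_0) e$ such that $\|c - eae\| < \tfrac{1}{9} \varepsilon$. It follows that $b + c$ is an invertible selfadjoint element in $C^*_r(G)$ such that
\[ \|a - (b + c)\| \le \|a - (eae + pap)\| + \|b - pap\| + \|c - eae\| < \tfrac{7}{9} \varepsilon + \tfrac{1}{9} \varepsilon + \tfrac{1}{9} \varepsilon = \varepsilon. \]
This completes the proof.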
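The five-term expression for $a - (eae + pap)$ in the proof above, and the way each off-diagonal term is controlled by a commutator, can be checked directly. A sketch, in the notation of the proof of Theorem 4.6:

```latex
% e + p + q = 1, \quad a_0 = (1-p)a(1-p), \quad 1 - p = e + q.
\begin{align*}
a   &= pap + pa(1-p) + (1-p)ap + a_0, \\
a_0 &= (e + q)\, a_0\, (e + q) = eae + e a_0 q + q a_0 e + q a_0 q,
\end{align*}
% using e a_0 e = eae because e \le 1 - p. Since p(1-p) = 0 and qe = 0,
\begin{align*}
\|pa(1-p)\| &= \|(pa - ap)(1-p)\| \le \|pa - ap\|, &
\|q a_0 e\| &= \|(q a_0 - a_0 q)e\| \le \|q a_0 - a_0 q\|,
\end{align*}
% and the same bounds hold for the adjoint terms (1-p)ap and e a_0 q.
```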
5. Stable Rank of the C*-Algebra of an Almost AF Groupoid

In this section, we prove that if $G$ is an almost AF Cantor groupoid such that $C^*_r(G)$ is simple, then $C^*_r(G)$ has stable rank one. This implies that the projections in $M_\infty(C^*_r(G))$ satisfy cancellation, and allows us to strengthen the conclusion of Theorem 3.5 to the full version of Blackadar’s Second Fundamental Comparability Question.
Lemma 5.1. Let $G$ be an almost AF Cantor groupoid, with open AF subgroupoid $G_0 \subset G$ as in Definition 2.2(1). For every $\varepsilon > 0$ there is $\delta > 0$ such that if $a \in C_c(G)$ satisfies $\|a\| \le 1$ and if $e_0 \in C^*_r(G)$ is a nonzero projection such that $\|a e_0\| < \delta$, then there exists a nonzero projection $e \in C^*_r(G_0)$ and a compact open subset $V \subset G^{(0)}$ such that, with $p = \chi_V \in C^*_r(G_0)$, we have:
(1) $r(\mathrm{supp}(a) \cap [G \setminus G_0]) \cup s(\mathrm{supp}(a) \cap [G \setminus G_0]) \subset V$.
(2) $\|ae\| < \varepsilon$.
(3) $e$ and $p$ are orthogonal.
Proof. Choose $\delta = \min\big( \tfrac{1}{3} \varepsilon, \tfrac{1}{2} \big) > 0$. Let $a \in C_c(G)$ satisfy $\|a\| \le 1$ and let $e_0 \in C^*_r(G)$ be a nonzero projection such that $\|a e_0\| < \delta$. Choose $c \in C_c(G)$ with $\|c\| = 1$ and $\|c - e_0\| < \delta^2$. Apply Lemma 3.1 with $F = \{a, c, c^*, c^* c\}$, and with $\delta^2$ in place of $\varepsilon$. We obtain a compact open subset $V \subset G^{(0)}$ such that, with $p = \chi_V$, Condition (1) is satisfied, and also
\[ \|(1 - p) c^* c (1 - p)\| > 1 - \delta^2. \]
Moreover, Lemma 2.7(3) implies that $(1 - p) c^*, \, c (1 - p) \in C^*_r(G_0)$. It follows that $(1 - p) c^* c (1 - p) \in C^*_r(G_0)$, which is an AF algebra. Choose a continuous function $f$ such that
\[ [1 - \delta^2, 1] \subset \mathrm{supp}(f) \subset (1 - 2 \delta^2, 1 + \delta^2). \]
Choose a nonzero projection $e$ in the hereditary subalgebra generated by $f((1 - p) c^* c (1 - p))$ in the AF algebra $C^*_r(G_0)$. Apply Lemma 4.5 in $C^*_r(G_0)$ with this $f$ and with $\lambda_0 = 1$, getting
\[ \|e - e (1 - p) c^* c (1 - p) e\| < 2 \delta^2. \]
Since $0 \notin \mathrm{supp}(f)$, we have $e \in (1 - p) C^*_r(G_0) (1 - p)$. So $p$ and $e$ are orthogonal, which is Part (3). We combine this with the estimate above and the estimate
\[ \|c^* c - e_0\| \le \|c^*\| \, \|c - e_0\| + \|c^* - e_0\| \, \|e_0\| < 2 \delta^2 \]
to obtain
\[ \|e - e_0 e\|^2 = \|e (1 - e_0) e\| < \|e (1 - c^* c) e\| + 2 \delta^2 = \|e - e c^* c e\| + 2 \delta^2 < 4 \delta^2. \]
Therefore, since $\|a\| \le 1$,
\[ \|ae\| \le \|a\| \, \|e - e_0 e\| + \|a e_0\| \, \|e\| < 2 \delta + \delta \le \varepsilon. \]
This is Part (2).
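The last two displays of the proof of Lemma 5.1 rest on the C*-identity and on the choice $\delta \le \tfrac13 \varepsilon$. A sketch of the intermediate steps, in the notation of the proof:

```latex
% e and e_0 are projections, so e^* = e and (1 - e_0)^* (1 - e_0) = 1 - e_0.
\begin{align*}
\|e - e_0 e\|^2 &= \|(1 - e_0) e\|^2 = \|e (1 - e_0)^* (1 - e_0) e\| = \|e (1 - e_0) e\|, \\
\|e - e_0 e\| &< \sqrt{4\delta^2} = 2\delta, \\
\|ae\| &\le \|a (e - e_0 e)\| + \|a e_0 e\| \le \|a\| \, \|e - e_0 e\| + \|a e_0\| \, \|e\|
        < 2\delta + \delta = 3\delta \le \varepsilon.
\end{align*}
```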
Theorem 5.2. Let $G$ be an almost AF Cantor groupoid. Suppose that $C^*_r(G)$ is simple. Then $C^*_r(G)$ has (topological) stable rank one in the sense of [42].
Proof. Let $G_0 \subset G$ be an open AF subgroupoid as in Definition 2.2(1). We are going to show that every two sided zero divisor in $C^*_r(G)$ is a limit of invertible elements. That is, if $a \in C^*_r(G)$ and there are nonzero $x, y \in C^*_r(G)$ such that $xa = ay = 0$, then we show that for every $\varepsilon > 0$ there is an invertible element $c \in C^*_r(G)$ such that $\|a - c\| < \varepsilon$. It will follow from Theorem 3.3(a) of [43] (see Definition 3.1 of [43]) that any $a \in C^*_r(G)$ which is not a limit of invertible elements is left or right invertible but not both. Since $C^*_r(G)$ is simple, the definition of an almost AF Cantor groupoid implies it has a faithful trace, and there are no such elements.
So let $a \in C^*_r(G)$, let $x, y \in C^*_r(G)$ be nonzero elements such that $xa = ay = 0$, and let $\varepsilon > 0$. Our strategy is to perturb $a$ and multiply it by a unitary, giving an element $u b_0$, in such a way that we can replace $x$ and $y$ with orthogonal projections (which will be called $p$ and $f_2$), which furthermore have the property that $(1 - p - f_2) u b_0 (1 - p - f_2) \in C^*_r(G_0)$. With respect to the decomposition of the identity $1 = p + [1 - p - f_2] + f_2$, the element $u b_0$ will be block lower triangular, with one diagonal entry in an AF algebra and the other two equal to zero. Such an element is clearly a limit of invertible elements.
Without loss of generality $\|a\| \le 1$. Since $C^*_r(G)$ has real rank zero (Theorem 4.6), there are nonzero projections
\[ e_0 \in \overline{x^* C^*_r(G) x} \quad \text{and} \quad f_0 \in \overline{y C^*_r(G) y^*}, \]
and we have $e_0 a = a f_0 = 0$. Choose $\delta > 0$ in Lemma 5.1 for $\tfrac{1}{4} \varepsilon$ in place of $\varepsilon$. Choose $b \in C_c(G)$ such that $\|b\| \le 1$ and $\|a - b\| < \min\big( \delta, \tfrac{1}{4} \varepsilon \big)$. Then $\|e_0 b\|, \|b f_0\| < \delta$. Let
\[ K = r(\mathrm{supp}(b) \cap [G \setminus G_0]) \cup s(\mathrm{supp}(b) \cap [G \setminus G_0]). \]
Apply Lemma 5.1 to $b^*$ and $e_0$, obtaining a nonzero projection $e_1 \in C^*_r(G_0)$ and a compact open subset $V \subset G^{(0)}$ such that, with $g = \chi_V \in C^*_r(G)$, we have:
\[ K \subset V, \quad \|e_1 b\| < \tfrac{1}{4} \varepsilon, \quad \text{and} \quad e_1 g = 0. \]
Apply Lemma 5.1 to $b$ and $f_0$, obtaining a nonzero projection $f_1 \in C^*_r(G_0)$ and a compact open subset $W \subset G^{(0)}$ such that, with $h = \chi_W \in C^*_r(G)$, we have:
\[ K \subset W, \quad \|b f_1\| < \tfrac{1}{4} \varepsilon, \quad \text{and} \quad f_1 h = 0. \]
Choose $\rho > 0$ with
\[ \rho < \min\Big( \inf_{\tau \in T(C^*_r(G))} \tau(e_1), \; \inf_{\tau \in T(C^*_r(G))} \tau(f_1) \Big). \]
(As usual, $T(C^*_r(G))$ is the space of normalized traces.) The set $K$ is thin in $G_0$, so Lemma 2.5(2) provides a compact open set $Z \subset G^{(0)}$ such that $K \subset Z$ and $\mu(Z) < \rho$ for
every $G_0$-invariant Borel probability measure $\mu$ on $G^{(0)}$. Let $p = \chi_{V \cap W \cap Z} \in C(G^{(0)})$. Then $p$, $e_1$, and $f_1$ are all in $C^*_r(G_0)$, and $e_1 p = f_1 p = 0$. Proposition 2.11 implies that
\[ \tau(p) < \tau(e_1) \quad \text{and} \quad \tau(p) < \tau(f_1) \]
for all $\tau \in T(C^*_r(G_0))$. Since $C^*_r(G_0)$ is an AF algebra (by Corollary 1.17), it follows from Proposition 3.4 that there are projections $e_2, f_2 \in C^*_r(G_0)$ which are unitarily equivalent to $p$ and with $e_2 \le e_1$ and $f_2 \le f_1$. Furthermore, since $e_2$ and $f_2$ are both orthogonal to $p$, and $(1 - p) C^*_r(G_0) (1 - p)$ is a simple AF algebra, there is a unitary $w \in C^*_r(G_0)$ such that
\[ w e_2 w^* = f_2 \quad \text{and} \quad w p w^* = p; \]
also, it is easy to find a unitary $v \in C^*_r(G_0)$ such that
\[ v f_2 v^* = p \quad \text{and} \quad v p v^* = f_2. \]
Set $u = vw$, which is a unitary in $C^*_r(G_0)$ such that
\[ u e_2 u^* = p \quad \text{and} \quad u p u^* = f_2. \]
Define $b_0 = (1 - e_2) b (1 - f_2)$. Since $b_0 = b - e_2 b (1 - f_2) - b f_2$, and since
\[ \|e_2 b (1 - f_2)\| \le \|e_1 b\| < \tfrac{1}{4} \varepsilon \quad \text{and} \quad \|b f_2\| \le \|b f_1\| < \tfrac{1}{4} \varepsilon, \]
we get $\|b - b_0\| < \tfrac{1}{2} \varepsilon$, so $\|a - b_0\| < \tfrac{3}{4} \varepsilon$. We have $(1 - p) b (1 - p) \in C^*_r(G_0)$ by Lemma 2.7(3). With respect to the decomposition of the identity $1 = p + [1 - p - f_2] + f_2$, we claim that $u b_0$ has the block matrix form
\[ u b_0 = \begin{pmatrix} 0 & 0 & 0 \\ x & d_0 & 0 \\ y & z & 0 \end{pmatrix} \]
with $d_0 \in (1 - p - f_2) C^*_r(G_0) (1 - p - f_2)$. To see this, we observe that $u b_0 f_2 = 0$ because $b_0 f_2 = 0$, that
\[ p u b_0 = u (u^* p u) b_0 = u e_2 b_0 = 0 \]
because $e_2 b_0 = 0$, and that
\[ d_0 = (1 - p - f_2) u b_0 (1 - p - f_2) = u (1 - e_2 - p) b_0 (1 - p - f_2) = u [1 - e_2] [(1 - p) b_0 (1 - p)] [1 - f_2], \]
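which is in $C_r^*(G_0)$ because $u$ and each of the three terms in brackets are in $C_r^*(G_0)$. Since $C_r^*(G_0)$ is an AF algebra, there exists an invertible element $d \in C_r^*(G_0)$ such that $\|d - d_0\| < \tfrac{1}{4}\varepsilon$. Then
\[ c = u^* \cdot \begin{pmatrix} \tfrac{1}{4}\varepsilon & 0 & 0 \\ x & d & 0 \\ y & z & \tfrac{1}{4}\varepsilon \end{pmatrix} \]
is an invertible element of $C_r^*(G)$ such that $\|b_0 - c\| \le \tfrac{1}{4}\varepsilon$, so $\|a - c\| < \varepsilon$.
Recall that $M_\infty(A) = \bigcup_{n=1}^{\infty} M_n(A)$.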
Corollary 5.3. Let $G$ be an almost AF Cantor groupoid. Suppose that $C_r^*(G)$ is simple. Then the projections in $M_\infty(C_r^*(G))$ satisfy cancellation: if $e, f, p \in M_\infty(C_r^*(G))$ are projections such that $e \oplus p$ is Murray-von Neumann equivalent to $f \oplus p$, then $e$ is Murray-von Neumann equivalent to $f$.
Proof. This follows from the fact that $C_r^*(G)$ has stable rank one (Theorem 5.2), using Proposition 6.5.1 of [6].
Corollary 5.4. Let $G$ be an almost AF Cantor groupoid. Suppose that $C_r^*(G)$ is simple. Let $p, q \in M_\infty(C_r^*(G))$ be projections such that $\tau(p) < \tau(q)$ for all normalized traces $\tau$ on $C_r^*(G)$. Then $p$ is Murray-von Neumann equivalent to a subprojection of $q$.
Proof. This follows from Theorem 3.5 and Corollary 5.3.
We follow 1.1 of [22] and say that a Riesz group is a directed partially ordered abelian group satisfying Riesz decomposition (equivalently, interpolation); for definitions see pp. 1, 4, and 23 of [20]. (This terminology is not universal; it differs, for example, from Sect. IV.6 of [14], where Riesz groups are required to be unperforated and hence torsion free.)
Corollary 5.5. Let $G$ be an almost AF Cantor groupoid. Suppose that $C_r^*(G)$ is simple. Then $K_0(C_r^*(G))$ is a Riesz group in the sense above.
Proof. The only nonobvious part is Riesz decomposition. Let $\eta, \mu_1, \mu_2 \in K_0(C_r^*(G))_+$ satisfy $\eta \le \mu_1 + \mu_2$. By Theorem 3.5 there exist projections $p, q_1, q_2 \in M_\infty(C_r^*(G))$ such that $\eta = [p]$, $\mu_1 = [q_1]$, and $\mu_2 = [q_2]$. Because $\eta \le \mu_1 + \mu_2$, there is a projection $e \in M_\infty(C_r^*(G))$ such that $p \oplus e \precsim q_1 \oplus q_2 \oplus e$. Corollary 5.3 implies $p \precsim q_1 \oplus q_2$. Since $C_r^*(G)$ has real rank zero (Theorem 4.6), Theorem 1.1 of [48], applied in $M_n(C_r^*(G))$ for suitable $n$, gives projections $p_1 \le q_1$ and $p_2 \le q_2$ such that $p$ is Murray-von Neumann equivalent to $p_1 \oplus p_2$. Then $\eta = [p_1] + [p_2]$ with $[p_1] \le \mu_1$ and $[p_2] \le \mu_2$.
6. Kakutani-Rokhlin Decompositions
In this section, we show that the transformation group groupoid coming from a free minimal action of $\mathbb{Z}^d$ on the Cantor set is an almost AF Cantor groupoid. The argument is essentially a reinterpretation of some of the results of Forrest’s paper [16] in terms of groupoids. As can be seen by combining the results of this section and Sect. 2, Forrest does not need all the conditions he imposes to get his main theorem. We use only the Følner condition, not the inradius conditions.
We presume that the construction of [16] can be generalized to cover actions of more general groups. Accordingly, we state the definitions in greater generality, in particular using the word length metric rather than the Euclidean distance on $\mathbb{Z}^d$. This change has no effect on the results. It should also be possible to generalize to actions that are merely essentially free. This generalization would require more substantial modification of Forrest’s definitions, and we do not carry it out here. Doing so would gain no generality for the groups we actually handle, namely $\mathbb{Z}^d$. An essentially free minimal action of an abelian group is necessarily free, because the fixed points for any one group element form a closed invariant set.
We begin by establishing notation for this section.
Convention 6.1. Throughout this section:
(1) $\Gamma$ is a countable discrete group with a fixed finite generating set $\Sigma$ which is symmetric in the sense that $\gamma \in \Sigma$ implies $\gamma^{-1} \in \Sigma$.
(2) $X$ is the Cantor set, and $\Gamma$ acts freely and continuously on $X$ on the left. The action is denoted $(\gamma, x) \mapsto \gamma x$.
(3) $G = \Gamma \times X$ is the transformation group groupoid, and is equipped with the Haar system consisting of counting measures. Thus $G$ is a Cantor groupoid. (See Definition 1.1 and Example 1.3.)
For the convenience of the reader, we reproduce here the relevant definitions from [16], but stated for the more general situation of Convention 6.1. (Forrest considers the case in which $\Gamma = \mathbb{Z}^d$ and $\Sigma$ consists of the standard basis vectors and their inverses.)
Definition 6.2. (Definition 2.1 of [16].) Let the notation be as in Convention 6.1.
(1) A tower is a pair $(E, S)$, in which $E \subset X$ is a compact open subset and $S \subset \Gamma$ is a finite subset, such that $1 \in S$ and the sets $\gamma E$, for $\gamma \in S$, are pairwise disjoint.
(2) The levels of a tower $(E, S)$ are the sets $\gamma E \subset X$ for $\gamma \in S$.
(3) A traverse of a tower $(E, S)$ is a set of the form $\{\gamma x : \gamma \in S\}$ with $x \in E$.
(4) A Kakutani-Rokhlin decomposition $Q$ is a finite collection of towers whose levels form a partition $P_Q$ of $X$ (called the partition determined by $Q$).
In the following definition, we use the word length metric (as opposed to the Euclidean norm on $\mathbb{Z}^d$ used in Definition 3.2 of [16]). However, in that case the two metrics are equivalent in the usual sense for norms on Banach spaces, so the difference is not significant.
Definition 6.3. Let $l(\gamma)$ (or $l_\Sigma(\gamma)$) denote the word length of $\gamma \in \Gamma$ relative to the generating set $\Sigma$. Let $S \subset \Gamma$ be a nonempty finite subset. The Følner constant $c(S)$ (or $c_\Sigma(S)$) is the least number $c > 0$ such that
\[ \operatorname{card}(S \,\triangle\, \gamma S) \le c \cdot l(\gamma) \operatorname{card}(S) \]
for all $\gamma \in \Gamma$. Here $S \,\triangle\, \gamma S$ is the symmetric difference $S \,\triangle\, \gamma S = (S \setminus \gamma S) \cup (\gamma S \setminus S)$. The Følner constant $c(Q)$ of a Kakutani-Rokhlin decomposition $Q$ is
\[ c(Q) = \sup_{(E,S) \in Q} c(S). \]
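As an illustration of Definition 6.3 (not taken from [16] or from the argument here), the following Python sketch computes the Følner constant of a square box in $\mathbb{Z}^2$ with respect to the standard symmetric generating set. The names `word_lengths`, `folner_constant`, and `box` are hypothetical, and the supremum over all $\gamma$ is truncated to a ball of a chosen radius in the word length metric. Since $\operatorname{card}(S \,\triangle\, \gamma S) \le 2 \operatorname{card}(S)$, elements with $l(\gamma)$ larger than the radius contribute at most $2$ divided by the radius, so a radius of twice the side length is assumed to be enough for a box.

```python
from collections import deque
from fractions import Fraction
from itertools import product


def word_lengths(generators, radius):
    """Breadth-first search: l(gamma) for all gamma in Z^d with l(gamma) <= radius."""
    origin = (0,) * len(generators[0])
    length = {origin: 0}
    queue = deque([origin])
    while queue:
        g = queue.popleft()
        if length[g] == radius:
            continue
        for s in generators:
            h = tuple(a + b for a, b in zip(g, s))
            if h not in length:
                length[h] = length[g] + 1
                queue.append(h)
    return length


def folner_constant(S, generators, radius):
    """Largest value of card(S symdiff gamma+S) / (l(gamma) card(S)) over 0 < l(gamma) <= radius."""
    best = Fraction(0)
    for gamma, l in word_lengths(generators, radius).items():
        if l == 0:
            continue  # the identity gives 0 <= 0 and imposes no constraint
        shifted = {tuple(a + b for a, b in zip(gamma, s)) for s in S}
        best = max(best, Fraction(len(S ^ shifted), l * len(S)))
    return best


def box(n, d=2):
    """The box {0, ..., n-1}^d, used here as a sample finite subset of Z^d."""
    return set(product(range(n), repeat=d))


# Standard symmetric generating set of Z^2 (written additively).
SIGMA = [(1, 0), (-1, 0), (0, 1), (0, -1)]
for n in (2, 4, 8):
    print(n, folner_constant(box(n), SIGMA, radius=2 * n))
```

For these boxes the computed constant is $2/n$, so it tends to $0$ as the side length $n$ grows, which is the behaviour required of the sets $S$ in the towers of Theorem 6.8.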
Definition 6.4. (Definition 2.2 of [16].) Let $Q_1$ and $Q_2$ be Kakutani-Rokhlin decompositions. We say that $Q_2$ refines $Q_1$ if:
(1) The partition $P_{Q_2}$ (Definition 6.2(4)) refines the partition $P_{Q_1}$.
(2) Every traverse (Definition 6.2(3)) of a tower in $Q_2$ is a union of traverses of towers in $Q_1$.
Definition 6.5. Let $(E, S)$ be a tower as in Definition 6.2(1), and let $G = \Gamma \times X$ as in Convention 6.1(3). We define the subset $G_{(E,S)} \subset G$ by
\[ G_{(E,S)} = \{(\gamma_1 \gamma_2^{-1}, \gamma_2 x) : x \in E \text{ and } \gamma_1, \gamma_2 \in S\}. \]
We show below that it is a subgroupoid of $G$. We equip it with the Haar system consisting of counting measures; we show below that this really is a Haar system. Let $Q$ be a Kakutani-Rokhlin decomposition as in Definition 6.2(4). We define the subgroupoid associated with $Q$ to be
\[ G_Q = \bigcup_{(E,S) \in Q} G_{(E,S)} \subset G. \]
Again, we show below that it is a subgroupoid of $G$, and that we may equip it with the Haar system consisting of counting measures.
Lemma 6.6. Assume the hypotheses of Convention 6.1. Then the subset $G_{(E,S)} \subset G$ of Definition 6.5 is a compact open Cantor subgroupoid of $G$. Further, the subset $G_Q \subset G$ of Definition 6.5 is a compact open Cantor subgroupoid of $G$ which contains the unit space $G^{(0)}$ of $G$, and is the disjoint union of the sets $G_{(E,S)}$ as $(E, S)$ runs through the towers of $Q$.
Proof. The subset $G_{(E,S)}$ is a finite union of subsets of $G$ of the form $\{(\gamma_1 \gamma_2^{-1}, \gamma_2 x) : x \in E\}$ for fixed $\gamma_1, \gamma_2 \in \Gamma$. Since $E$ is compact and open in $X$, these sets are compact and open in $G$. Therefore $G_{(E,S)}$ is compact and open in $G$. So $G_Q$, being a finite union of sets of the form $G_{(E,S)}$, is also compact and open in $G$. It is immediate to check that the subset $G_{(E,S)}$ is closed under product and inverse in the groupoid $G$, and so is a subgroupoid. It is a Cantor groupoid by Example 1.4. The subset $G_Q$ is the disjoint union of finitely many sets of the form $G_{(E,S)}$, and is therefore also a Cantor groupoid. Its unit space, thought of as a subset of $X$, is
\[ \bigcup_{(E,S) \in Q} G_{(E,S)}^{(0)} = \bigcup_{(E,S) \in Q} \ \bigcup_{\gamma \in S} \gamma E, \]
which is the union of all the levels of towers in $Q$, and is therefore equal to $X$ (Definition 6.2(4)). Thus it is equal to $G^{(0)}$.
Lemma 6.7. Assume the hypotheses of Convention 6.1. Let $Q_1$ and $Q_2$ be Kakutani-Rokhlin decompositions such that $Q_2$ refines $Q_1$ (Definition 6.4). Then $G_{Q_1} \subset G_{Q_2}$.
Proof. Let $(E, S)$ be a tower of $Q_1$ and let $g \in G_{(E,S)}$. Then $g = (\gamma_1 \gamma_2^{-1}, \gamma_2 x)$ for some $x \in E$ and $\gamma_1, \gamma_2 \in S$. Since $P_{Q_2}$ is a partition of $X$, there exist a tower $(F, T) \in Q_2$ and elements $\eta_2 \in T$ and $y \in F$ such that $\gamma_2 x = \eta_2 y$. The traverse $\{\eta y : \eta \in T\}$ (Definition 6.2(3)) of $(F, T)$ is, by Definition 6.4(2), a union of traverses of towers in $Q_1$. That is, there exist towers $(E_1, S_1), (E_2, S_2), \ldots, (E_n, S_n) \in Q_1$ (not necessarily distinct) and points $x_k \in E_k$ for $1 \le k \le n$, such that
\[ \{\eta y : \eta \in T\} = \bigcup_{k=1}^{n} \{\gamma x_k : \gamma \in S_k\}. \]
Therefore $\eta_2 y = \gamma x_k$ for some $k$ and some $\gamma \in S_k$. Now $\gamma x_k$ is in the level $\gamma E_k \in P_{Q_1}$. Since $\gamma x_k = \gamma_2 x$, the action is free, and the levels of all the towers of $Q_1$ form a partition of $X$, it follows that
\[ (E_k, S_k) = (E, S), \qquad x_k = x, \qquad \text{and} \qquad \gamma = \gamma_2. \]
In particular, $\{\gamma x : \gamma \in S\} \subset \{\eta y : \eta \in T\}$, so there is $\eta_1 \in T$ such that $\gamma_1 x = \eta_1 y$. Substituting $y = \eta_2^{-1} \gamma_2 x$, we get $\gamma_1 x = \eta_1 \eta_2^{-1} \gamma_2 x$. Because the action is free, it follows that $\eta_1 \eta_2^{-1} = \gamma_1 \gamma_2^{-1}$. We now have
\[ g = (\gamma_1 \gamma_2^{-1}, \gamma_2 x) = (\eta_1 \eta_2^{-1}, \eta_2 y) \in G_{(F,T)} \subset G_{Q_2}. \]
Theorem 6.8 (Forrest [16]). In the situation of Convention 6.1, assume that $\Gamma = \mathbb{Z}^d$, that $\Sigma$ consists of the standard basis vectors and their inverses, and that the action of $\mathbb{Z}^d$ on $X$ is minimal (as well as being free). Then there exists a sequence $Q_1, Q_2, Q_3, \ldots$ of Kakutani-Rokhlin decompositions such that $Q_{n+1}$ refines $Q_n$ for all $n$, and such that the Følner constants obey $\lim_{n \to \infty} c(Q_n) = 0$.
Proof. Combine Proposition 5.1 of [16] with Lemma 3.1 of [16]. This immediately gives everything except for the estimate on the Følner constants. From [16] we get an estimate on constants slightly different from those given in Definition 6.3. Specifically, define $c_0(S)$ and $c_0(Q)$ as in Definition 6.3, but considering $\mathbb{Z}^d$ as a subset of $\mathbb{R}^d$ and using $\|\gamma\|_2$ in place of $l(\gamma)$. That is, $\operatorname{card}(S \,\triangle\, \gamma S) \le c_0(S) \cdot \|\gamma\|_2 \operatorname{card}(S)$, etc. That $\lim_{n \to \infty} c_0(Q_n) = 0$ follows from the fact that the atoms of the partitions $T_n(x)$ in Proposition 5.1 of [16], for fixed $n$ and as $x$ runs through $X$, are exactly the translates in $\Gamma$ of the sets $S$ appearing in the towers $(E, S)$ which make up $Q_n$. Now for any $\gamma \in \mathbb{Z}^d$ we have $l(\gamma) = \|\gamma\|_\infty \ge d^{-1} \|\gamma\|_2$. Therefore $c(Q_n) \le d \cdot c_0(Q_n)$ for all $n$, whence $\lim_{n \to \infty} c(Q_n) = 0$, as desired.
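The norm comparison used at the end of the proof of Theorem 6.8 can be spot-checked by brute force. The sketch below is only a sanity check on a finite grid, not a proof; the function name `norm_comparison_failures` and the grid bound are assumptions made for the illustration. It tests $\|\gamma\|_\infty \ge d^{-1} \|\gamma\|_2$ in exact integer arithmetic by comparing squares.

```python
from itertools import product


def norm_comparison_failures(d, bound):
    """List every gamma in {-bound, ..., bound}^d violating ||gamma||_inf >= ||gamma||_2 / d."""
    failures = []
    for gamma in product(range(-bound, bound + 1), repeat=d):
        sup_norm = max(abs(t) for t in gamma)
        euclid_sq = sum(t * t for t in gamma)
        # Compare squares so that no floating point square root is needed.
        if (d * sup_norm) ** 2 < euclid_sq:
            failures.append(gamma)
    return failures


for d in (1, 2, 3):
    print(d, len(norm_comparison_failures(d, bound=6)))
```

An empty list of failures is what one expects, since $\|\gamma\|_2^2 \le d \, \|\gamma\|_\infty^2 \le d^2 \|\gamma\|_\infty^2$. Any inequality of the form $l(\gamma) \ge d^{-1} \|\gamma\|_2$ turns a bound $\operatorname{card}(S \,\triangle\, \gamma S) \le c_0(S) \|\gamma\|_2 \operatorname{card}(S)$ into $\operatorname{card}(S \,\triangle\, \gamma S) \le d \, c_0(S) \, l(\gamma) \operatorname{card}(S)$, which is the step giving $c(Q_n) \le d \cdot c_0(Q_n)$.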
Theorem 6.9. Assume the hypotheses of Convention 6.1, and assume moreover that the action of $\Gamma$ on $X$ is minimal (as well as being free). Assume that there exists a sequence $Q_1, Q_2, Q_3, \ldots$ of Kakutani-Rokhlin decompositions such that $Q_{n+1}$ refines $Q_n$ for all $n$, and such that the Følner constants obey $\lim_{n \to \infty} c(Q_n) = 0$. Then $G = \Gamma \times X$ is almost AF in the sense of Definition 2.2, and $C_r^*(G)$ is simple.
Proof. The algebra $C_r^*(G)$ is the same as the reduced transformation group C*-algebra $C_r^*(\Gamma, X)$ by Proposition 1.8. So it is simple by the corollary at the end of [2]. (See the preceding discussion and Definition 1 there.)
To prove that $G$ is almost AF, by Proposition 2.13 it is now enough to verify Definition 2.2(1). With $G_{Q_n}$ as in Definition 6.5, define
\[ G_0 = \bigcup_{n=1}^{\infty} G_{Q_n}. \]
Using Lemma 6.6 and Lemma 6.7, we easily verify the conditions of Definition 1.15. Thus $G_0$ is an AF Cantor groupoid, which is open in $G$ because each $G_{Q_n}$ is open in $G$.
It remains to verify the condition that $s(K)$ be thin in $G_0$ (Definition 2.1) for every compact set $K \subset G \setminus G_0$. For this argument, we will identify the unit space $G^{(0)} = \{1\} \times X$ with $X$ in the obvious way. So let $K \subset G \setminus G_0$ be compact and let $n \in \mathbb{N}$. Let
\[ T = \{\gamma \in \Gamma : (\{\gamma\} \times X) \cap K \ne \varnothing\} \subset \Gamma. \]
Then $T$ is finite because $K$ is compact. Let $l$ be the word length metric on $\Gamma$, as used in Definition 6.3. Let $\rho = \sup_{\gamma \in T} l(\gamma)$. Choose $m$ so large that
\[ c(Q_m) < \frac{1}{\rho n \cdot \operatorname{card}(T)}. \]
We now claim that if $(E, S) \in Q_m$ is a tower and
\[ x \in s(K) \cap \bigcup_{\eta \in S} \eta E, \]
then there exist $\gamma \in T$ and $\eta \in S \setminus \gamma^{-1} S$ such that $(\gamma, x) \in \{\gamma\} \times \eta E$. To prove this, recall that $s(\gamma, x) = x$ (really $(1, x)$), and find $\gamma \in T$ such that $(\gamma, x) \in K$. Also choose $\eta \in S$ such that $\eta^{-1} x \in E$. Then write $(\gamma, x) = ([\gamma \eta] \eta^{-1}, \eta [\eta^{-1} x])$. Since $(\gamma, x) \notin G_{(E,S)}$, we have $\gamma \eta \notin S$. So $\eta \in S \setminus \gamma^{-1} S$. This proves the claim.
It follows that the sets $\eta E$, for $(E, S) \in Q_m$, $\gamma \in T$, and $\eta \in S \setminus \gamma^{-1} S$, form a cover of $s(K)$ by finitely many disjoint compact open subsets of $X$. For
\[ (E, S) \in Q_m \qquad \text{and} \qquad \eta \in \bigcup_{\gamma \in T} (S \setminus \gamma^{-1} S), \]
define $L_{(E,S),\eta} = s(K) \cap \eta E$. The sets $L_{(E,S),\eta}$ are a cover of $s(K)$ by finitely many disjoint compact subsets of $X$.
For fixed $(E, S) \in Q_m$, we now claim that there are compact graphs in $G_{(E,S)}$, say $M_{(E,S),1}, M_{(E,S),2}, \ldots, M_{(E,S),n} \subset G_{(E,S)}$, such that
\[ s(M_{(E,S),k}) = s(K) \cap \bigcup_{\eta \in S} \eta E \]
and the sets $r(M_{(E,S),1}), r(M_{(E,S),2}), \ldots, r(M_{(E,S),n})$ are pairwise disjoint. Since the subgroupoids $G_{(E,S)}$ are disjoint (Lemma 6.6) and their union is contained in $G_0$, we can then take the required graphs in $G_0$ to be
\[ M_k = \bigcup_{(E,S) \in Q_m} M_{(E,S),k}. \]
To find the $M_{(E,S),k}$, first note that for $\gamma \in T$, we have $l(\gamma^{-1}) = l(\gamma) \le \rho$, so, using Definition 6.3,
\[ \operatorname{card}(S \setminus \gamma^{-1} S) \le \operatorname{card}(S \,\triangle\, \gamma^{-1} S) \le c(Q_m) \rho \cdot \operatorname{card}(S) < \frac{\operatorname{card}(S)}{n \cdot \operatorname{card}(T)}. \]
Therefore, with
\[ R_{(E,S)} = \bigcup_{\gamma \in T} (S \setminus \gamma^{-1} S), \]
we have $\operatorname{card}(R_{(E,S)}) < \tfrac{1}{n} \operatorname{card}(S)$. So there exist $n$ injective functions $\sigma_1, \sigma_2, \ldots, \sigma_n : R_{(E,S)} \to S$ with disjoint ranges. Define
\[ M_{(E,S),k} = \bigcup_{\eta \in R_{(E,S)}} \{\sigma_k(\eta) \eta^{-1}\} \times L_{(E,S),\eta}. \]
Then $M_{(E,S),k} \subset G_{(E,S)}$ because $L_{(E,S),\eta} \subset \eta E$. Clearly $M_{(E,S),k}$ is compact. The restriction of the source map to $M_{(E,S),k}$ is injective because the $L_{(E,S),\eta}$ are disjoint. Further, $\sigma_k(\eta) \eta^{-1} L_{(E,S),\eta} \subset \sigma_k(\eta) E$, so the sets $\sigma_k(\eta) \eta^{-1} L_{(E,S),\eta}$, for $\eta \in R_{(E,S)}$ and $1 \le k \le n$, are pairwise disjoint. It follows both that the restriction of the range map to $M_{(E,S),k}$ is injective and that the sets $r(M_{(E,S),1}), r(M_{(E,S),2}), \ldots, r(M_{(E,S),n})$ are pairwise disjoint. Thus the $M_{(E,S),k}$ are graphs in $G_{(E,S)}$ with the required properties. We have verified Definition 2.1 for $s(K)$.
Corollary 6.10. Let $d$ be a positive integer, let $X$ be the Cantor set, and let $\mathbb{Z}^d$ act freely and minimally on $X$. Then the transformation group groupoid $G = \mathbb{Z}^d \times X$ is almost AF in the sense of Definition 2.2, and $C_r^*(G)$ is simple.
Proof. This is immediate from Theorems 6.8 and 6.9.
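The counting step in the proof of Theorem 6.9, which passes from $\operatorname{card}(R_{(E,S)}) < \tfrac{1}{n} \operatorname{card}(S)$ to $n$ injective maps $R_{(E,S)} \to S$ with disjoint ranges, can be made concrete as follows. This is a minimal sketch of one possible choice of the maps $\sigma_k$, assuming only that $n \cdot \operatorname{card}(R) \le \operatorname{card}(S)$; the function name `disjoint_injections` and the sample data are hypothetical and play no role in the proof.

```python
def disjoint_injections(R, S, n):
    """Return n injective maps R -> S (as dicts) whose ranges are pairwise disjoint.

    Requires n * card(R) <= card(S), which holds when card(R) < card(S) / n.
    """
    R, S = list(R), list(S)
    if n * len(R) > len(S):
        raise ValueError("need n * card(R) <= card(S)")
    # The k-th map sends the i-th element of R into the k-th block of S of length card(R).
    return [
        {eta: S[k * len(R) + i] for i, eta in enumerate(R)}
        for k in range(n)
    ]


# Toy data: card(R) = 2, card(S) = 7, n = 3, so 3 * 2 <= 7.
sigmas = disjoint_injections(["eta1", "eta2"], range(7), 3)
for k, sigma in enumerate(sigmas, start=1):
    print(k, sigma)
```

The maps use consecutive disjoint blocks of $S$, so each is injective and no element of $S$ is hit twice. Any other choice with the same two properties would serve equally well in the definition of $M_{(E,S),k}$.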
Theorem 6.11. Let $d$ be a positive integer, let $X$ be the Cantor set, and let $\mathbb{Z}^d$ act freely and minimally on $X$. Then:
(1) $C^*(\mathbb{Z}^d, X)$ has (topological) stable rank one in the sense of [42].
(2) $C^*(\mathbb{Z}^d, X)$ has real rank zero in the sense of [10].
(3) Let $p, q \in M_\infty(C^*(\mathbb{Z}^d, X))$ be projections such that $\tau(p) < \tau(q)$ for all normalized traces $\tau$ on $C^*(\mathbb{Z}^d, X)$. Then $p$ is Murray-von Neumann equivalent to a subprojection of $q$.
(4) If $K_0(C^*(\mathbb{Z}^d, X))$ is torsion free, then $K_0(C^*(\mathbb{Z}^d, X))$ is a dimension group in the sense of Chapter 3 of [20].
Proof. Using Proposition 1.8 and Corollary 6.10, we obtain Part (1) from Theorem 5.2, Part (2) from Theorem 4.6, and Part (3) from Corollary 5.4. To prove (4), in view of Corollary 5.5, we need only prove that $K_0(C^*(\mathbb{Z}^d, X))$ is unperforated. That $K_0(C^*(\mathbb{Z}^d, X))$ is weakly unperforated is immediate from Proposition 1.8 and Theorem 3.5 (or from Part (3)). If, in addition, $K_0(C^*(\mathbb{Z}^d, X))$ is torsion free, then $K_0(C^*(\mathbb{Z}^d, X))$ is unperforated.
Added in proof: Theorem 3.1 of [17] states that $K_0(C^*(\mathbb{Z}^d, X))$ is always torsion free, so that $K_0(C^*(\mathbb{Z}^d, X))$ should always be a dimension group. This result is apparently incorrect. Gähler [50] has computed the top (two dimensional) Čech cohomology of the hull $\Omega$ of the Tübingen Triangle Tiling, and found that it has torsion. Using Proposition 3.4 and Theorems 6.3 and 7.1 of [49] (also see the beginning of Sect. 7 there), it follows that $K_0(C^*(\mathbb{R}^2, \Omega))$ has torsion. By the remark before Corollary 7.2, there is a free minimal action of $\mathbb{Z}^2$ on the Cantor set $X$ such that $C^*(\mathbb{R}^2, \Omega)$ is stably isomorphic to $C^*(\mathbb{Z}^2, X)$. For this action, then, $K_0(C^*(\mathbb{Z}^d, X))$ is not torsion free, and is therefore not a dimension group.
7. Tilings and Quasicrystals
In this section, we show that the C*-algebras associated with three different broad classes of aperiodic tilings have real rank zero and stable rank one, and satisfy Blackadar’s Second Fundamental Comparability Question. In particular, we strengthen the conclusion of the main result of [40]. Then we discuss the relationship with the Bethe-Sommerfeld Conjecture for quasicrystals.
We begin by showing that the groupoids of [40] are almost AF. The proof consists of assembling, in the right order, various results proved in [40].
Theorem 7.1. Consider a substitution tiling system in $\mathbb{R}^d$ as in Sect. 1 of [40], under the conditions imposed there:
• The substitution is primitive.
• The finite pattern condition holds.
• The system is aperiodic.
• The substitution forces its border.
• The capacity of the boundary of every prototile is strictly less than $d$.
Then the groupoid $R_{\mathrm{punc}}$ defined in Sect. 1 of [40] is an almost AF Cantor groupoid.
Proof. That the groupoid $R_{\mathrm{punc}}$ is a Cantor groupoid in the sense of Definition 1.1 follows from the construction as described on pp. 594–595 of [40] and on p. 187 of [28]. Note that the base for the topology described in [40] is countable, so that $R_{\mathrm{punc}}$ really is second countable.
We take the open AF subgroupoid required in Definition 2.2 to be the open subgroupoid $R_{\mathrm{AF}}$ from p. 596 of [40]. It is easily seen from the construction there to be an AF Cantor groupoid which contains the unit space of $R_{\mathrm{punc}}$. (Also see pp. 198–200 of [28].) We must show that if $K \subset R_{\mathrm{punc}} \setminus R_{\mathrm{AF}}$ is compact, then $s(K)$ is thin in the sense of Definition 2.1.
Referring to the definition of $R_{\mathrm{punc}}$ and its topology (pp. 594–595 of [40]), we see that $R_{\mathrm{punc}}$ consists of certain pairs $(T, T + x)$ in which $T$ is a tiling of $\mathbb{R}^d$ and $x \in \mathbb{R}^d$, and that there is a base for its topology consisting of sets of the form $\{(T, T + x) : T \in U\}$ for suitable $x \in \mathbb{R}^d$ and suitable sets $U$ of tilings. It is immediate from this that for any $r > 0$ the set
\[ \{(T, T - x) \in R_{\mathrm{punc}} : \|x\| < r\} \]
is open in $R_{\mathrm{punc}}$. As $r$ varies, these sets cover $R_{\mathrm{punc}}$, and $K^{-1}$ is also a compact subset of $R_{\mathrm{punc}} \setminus R_{\mathrm{AF}}$, so there is $r > 0$ such that $K^{-1}$ is contained in the set
\[ L = \{(T, T - x) \in R_{\mathrm{punc}} \setminus R_{\mathrm{AF}} : \|x\| \le r\}. \]
Since $s(K) \subset r(L)$, it suffices to show that $r(L)$ is thin. The proof of Theorem 2.1 of [40], at the end of Sect. 2.1 there, consists of showing that there are compact open sets $U_n \subset \Omega_{\mathrm{punc}}$, the unit space of $R_{\mathrm{punc}}$, for $n \in \mathbb{N}$, each containing $r(L)$, and homeomorphisms $\gamma_n$ from $U_n$ to disjoint subsets of $\Omega_{\mathrm{punc}}$, each having graph contained in $R_{\mathrm{AF}}$. But it is immediate from this that $r(L)$ is thin. This completes the verification of Part (1) of Definition 2.2.
From the description of $C^*(R_{\mathrm{AF}})$ and its K-theory at the end of Sect. 1 of [40], we see that this algebra is a direct limit of a system of finite dimensional C*-algebras in which the matrix $B$ of partial embedding multiplicities is the same at each stage. Moreover, there is $n$ such that $B^n$ has no zero entries. Therefore the direct limit algebra is simple, whence $C_r^*(R_{\mathrm{AF}})$ is simple. It now follows from Proposition 2.13 that $R_{\mathrm{punc}}$ is almost AF.
The following corollary contains Theorem 1.1 of [40]. Using more recent results (see Theorem 5.1 of [25]), it is now also possible to obtain this result from Theorem 6.11 in the same way that the next two theorems are proved.
Corollary 7.2. For a substitution tiling system in $\mathbb{R}^d$ as in Theorem 7.1, the C*-algebra of the associated groupoid $R_{\mathrm{punc}}$ as in [40] has stable rank one and real rank zero, and satisfies Blackadar’s Second Fundamental Comparability Question ([7], 1.3.1).
Proof. Using Theorem 7.1, we obtain stable rank one from Theorem 5.2, real rank zero from Theorem 4.6, and Blackadar’s Second Fundamental Comparability Question from Corollary 5.4.
We can also obtain the same result for several other kinds of aperiodic tilings. As discussed on p. 198 of [28], we reduce to the case of crossed products by free minimal actions of $\mathbb{Z}^d$ on the Cantor set.
Theorem 7.3. Consider a projection method pattern $T$ as in [18], with data $(E, K, u)$ (see Definitions I.2.1 and I.4.4 of [18]) such that $E \cap \mathbb{Z}^N = \{0\}$. Let $G_T$ be the associated groupoid (Definition II.2.7 of [18]). Then $C^*(G_T)$ has stable rank one and real rank zero, and satisfies Blackadar’s Second Fundamental Comparability Question.
Proof. It follows from Theorem II.2.9 and Corollary I.10.10 of [18] that $C^*(G_T)$ is strongly Morita equivalent to the transformation group C*-algebra of a minimal action of $\mathbb{Z}^d$ on a Cantor set $X_T$, for a suitable $d$. The action is free since the action of $\mathbb{R}^d$ in the pattern dynamical system in that corollary is free by construction. The required properties hold for $C^*(\mathbb{Z}^d, X_T)$ by Theorem 6.11. Since $C^*(G_T)$ is strongly Morita equivalent to $C^*(\mathbb{Z}^d, X_T)$, and both algebras are separable, these two algebras are stably isomorphic by Theorem 1.2 of [9]. So $C^*(G_T)$ has stable rank one by Theorem 3.6 of [42] and has real rank zero by Theorem 3.8 of [10]. It is clear that Blackadar’s Second Fundamental Comparability Question is preserved by stable isomorphism.
Theorem II.2.9 of [18] shows that several other groupoids associated to $T$ give C*-algebras strongly Morita equivalent to $C^*(G_T)$. These C*-algebras therefore also have stable rank one and real rank zero, and satisfy Blackadar’s Second Fundamental Comparability Question.
Theorem 7.4. Consider a minimal aperiodic generalized dual (or grid) method tiling [47], satisfying the condition (D2) before Lemma 10 of [27]. Then the C*-algebra associated with the tiling has stable rank one and real rank zero, and satisfies Blackadar’s Second Fundamental Comparability Question.
Proof. Theorem 1 and Lemma 11 of [27] show that the C*-algebra associated with such a tiling is stably isomorphic to the crossed product C*-algebra for an action of $\mathbb{Z}^d$ on the Cantor set. (See Definition 12 of [27] and the following discussion, the discussion following Theorem 1 of [27], and also Definition 13 of [27].) The action is minimal (see the discussion following Definition 12 of [27]) and free (see Definition 8 and Lemma 3 of [27]). Therefore the crossed product has stable rank one and real rank zero, and satisfies Blackadar’s Second Fundamental Comparability Question, by Theorem 6.11. It follows as in the proof of Theorem 7.3 that the C*-algebra of the tiling has these properties as well.
Although we have not checked, we presume one can obtain Theorems 7.3 and 7.4 from Theorem 5.1 of [25] as well. Indeed, that result and the same Morita equivalence argument give:
Theorem 7.5. Let $T$ be a tiling in $\mathbb{R}^d$ as in [44] (in particular, up to translation there are only finitely many tiles) which satisfies the finite pattern condition (described, for example, in [40] and [3]) and is aperiodic. Let $\Omega_T$ be the continuous hull of $T$ (Definition 2.1 of [28]). Then the tiling C*-algebra $C^*(\mathbb{R}^d, \Omega_T)$ has stable rank one and real rank zero, and satisfies Blackadar’s Second Fundamental Comparability Question.
Except for the tilings of [40], these proofs do not show that the tiling groupoids themselves are almost AF Cantor groupoids. It is quite possible that the structural results of [3] can be used to prove such a theorem. We do not, however, pursue that question here.
We now turn to the Bethe-Sommerfeld conjecture for quasicrystals.
Consider the Schrödinger operator $H$ for an electron moving in a solid in $\mathbb{R}^d$, which at this point may be crystalline, amorphous, or quasicrystalline. (The physical case is, of course, $d = 3$.) Associated with this situation there is a C*-algebra which contains all bounded continuous functions of $H$ and all its translates by elements of $\mathbb{R}^d$. See [4]. For crystals, in which the locations of the atomic nuclei form a lattice in $\mathbb{R}^d$, the Bethe-Sommerfeld conjecture asserts that if $d \ge 2$ then there is an energy above which the spectrum of $H$ has no gaps. See [13, 46], Corollary 2.3 of [24], and [26] for some of the mathematical results on this conjecture. According to the introduction to [4], it is expected that this property also holds for quasicrystals. We point out that quasicrystals actually occur in nature; see [45] for a survey.
For the Schrödinger operator in the so-called tight binding representation for a quasicrystal whose atomic sites are given by an aperiodic tiling, the relevant C*-algebra is the C*-algebra of the tiling. See Sect. 4 of [4]. We have just seen that the C*-algebras associated with three different broad classes of tilings have real rank zero, that is, every selfadjoint element, including bounded continuous real functions of $H$, can be approximated arbitrarily closely in norm by selfadjoint elements with finite spectrum. In Proposition 7.6 below, we further show that among the selfadjoint elements of such an algebra, those with totally disconnected spectrum are generic, in the sense that they form a dense $G_\delta$-set. That is, the selfadjoint elements of the relevant C*-algebra whose spectrum is not totally disconnected form a meager set in the sense of Baire category.
These results say nothing about the spectrum of $H$ itself. It is also not clear whether arbitrary norm small selfadjoint perturbations are physically relevant. Nevertheless, gap
labelling theory (see [4] for a recent survey) exists in arbitrary dimensions, and the introduction of [4] raises the question of its physical interpretation when there are no gaps. Our results do suggest the possibility that this theory has physical significance in the general situation.
Proposition 7.6. Let $A$ be a C*-algebra with real rank zero. Then there is a dense $G_\delta$-set $S$ in the selfadjoint part $A_{\mathrm{sa}}$ of $A$ such that every element of $S$ has totally disconnected spectrum.
Proof. For a finite subset $F \subset \mathbb{R}$, let
\[ S_F = \{a \in A_{\mathrm{sa}} : \operatorname{sp}(a) \cap F = \varnothing\}. \]
Then $S_F$ is open in $A_{\mathrm{sa}}$, since if $a, b \in A_{\mathrm{sa}}$ satisfy $\|a - b\| < \varepsilon$, then $\operatorname{sp}(b)$ is contained in the $\varepsilon$-neighborhood of $\operatorname{sp}(a)$. We show that $S_F$ is dense in $A_{\mathrm{sa}}$. Given any selfadjoint element $a \in A$ and $\varepsilon > 0$, the real rank zero condition provides a selfadjoint element $b \in A$ with finite spectrum and $\|a - b\| < \tfrac{1}{2}\varepsilon$. By perturbing eigenvalues, it is easy to find a selfadjoint element $c \in A$ with $\|b - c\| < \tfrac{1}{2}\varepsilon$ and such that $\operatorname{sp}(c) \cap F = \varnothing$. Then $c \in S_F$ and $\|a - c\| < \varepsilon$. So $S_F$ is dense.
The set of selfadjoint elements $a \in A$ with $\operatorname{sp}(a) \cap \mathbb{Q} = \varnothing$ is a countable intersection of sets of the form $S_F$, and is therefore a dense $G_\delta$-set. Clearly all its elements have totally disconnected spectrum.
8. Examples and Open Problems
In this section, we discuss some open problems, beginning with the various possibilities for improving our results. One of the most obvious questions is the following:
Question 8.1. Let $G$ be an almost AF Cantor groupoid. Does it follow that $C_r^*(G)$ is tracially AF in the sense of Definition 2.1 of [30]?
Lemma 2.7 and the thinness condition in Definition 2.2 come close to giving the tracially AF property; the main part that is missing is the requirement that the projections one chooses approximately commute with the elements of the finite set. By recent work of H. Lin [31], a positive answer to this question would imply a positive solution to the following conjecture:
Conjecture 8.2. Let $d$ be a positive integer, let $X$ be the Cantor set, and let $\mathbb{Z}^d$ act freely and minimally on $X$. Then $C^*(\mathbb{Z}^d, X)$ is an AH algebra, that is, isomorphic to a direct limit $\varinjlim A_n$ of C*-algebras $A_n$ each of which is a finite direct sum of C*-algebras of the form $C(Y, M_k)$. (The compact spaces $Y$ and the matrix sizes $k$ may vary among the summands, even for the same value of $n$.)
This conjecture is also predicted by the Elliott classification conjecture. It is true for $d = 1$ (essentially [38]). We point out here that Matui has shown in [32] (see Proposition 2 and Theorem 3, both in Sect. 4) that when $d = 2$, the algebra $C^*(\mathbb{Z}^d, X)$ is often AF embeddable.
We next address the possibility of generalizing the group. For not necessarily abelian groups, we should presumably require that the action be minimal and merely essentially free, which for minimal actions means that the set of $x \in X$ with trivial isotropy subgroup is dense in $X$. This is the condition needed to ensure simplicity of the transformation
group C*-algebra. See the final corollary of [2], noting that all actions of amenable groups are regular (as described before this corollary) and that essential freeness of an action of a countable discrete group on a compact metric space $X$ is equivalent to topological freeness, Definition 1 of [2], of the action on $C(X)$ (as is easy to prove).
Thus, let $\Gamma$ act minimally and essentially freely on the Cantor set $X$. If $\Gamma$ is close to $\mathbb{Z}^d$, and especially if the action is actually free, then there seems to be a good chance of adapting the methods of Forrest [16] to show that the transformation group groupoid is still almost AF. This might work, for example, if $\Gamma$ has a finite index subgroup isomorphic to $\mathbb{Z}^d$. In this context, we mention that crossed products of free minimal actions of the free product $\mathbb{Z}/2\mathbb{Z} * \mathbb{Z}/2\mathbb{Z}$ on the Cantor set, satisfying an additional technical condition, are known to be AF [8]. The group $\mathbb{Z}/2\mathbb{Z} * \mathbb{Z}/2\mathbb{Z}$ has an index two subgroup isomorphic to $\mathbb{Z}$. However, we believe the correct generality is to allow $\Gamma$ to be an arbitrary countable amenable group.
Question 8.3. Let $X$ be the Cantor set, and let the countable amenable group $\Gamma$ act minimally and essentially freely on $X$. Is the transformation group groupoid $\Gamma \times X$ almost AF?
Even if not, do the conclusions of Theorem 6.11 still hold? That is:
Question 8.4. Let $X$ be the Cantor set, and let the countable amenable group $\Gamma$ act minimally and essentially freely on $X$. Does it follow that $C^*(\Gamma, X)$ has stable rank one and real rank zero? If $p, q \in M_\infty(C^*(\Gamma, X))$ are projections such that $\tau(p) < \tau(q)$ for all normalized traces $\tau$ on $C^*(\Gamma, X)$, does it follow that $p$ is Murray-von Neumann equivalent to a subprojection of $q$?
The other obvious change is to relax the condition on the space. If $X$ is not totally disconnected, then even a crossed product by a free minimal action of $\mathbb{Z}$ need not have real rank zero. Examples 4 and 5 in Sect. 5 of [11] have no nontrivial projections. However, it seems reasonable to hope that the other parts of the conclusion of Theorem 6.11 still hold, at least under mild restrictions.
Question 8.5. Let $X$ be a compact metric space with finite covering dimension [15], and let the countable amenable group $\Gamma$ act minimally and essentially freely on $X$. Does it follow that $C^*(\Gamma, X)$ has stable rank one? If $p, q \in M_\infty(C^*(\Gamma, X))$ are projections such that $\tau(p) < \tau(q)$ for all normalized traces $\tau$ on $C^*(\Gamma, X)$, does it follow that $p$ is Murray-von Neumann equivalent to a subprojection of $q$?
We think we know how to prove stable rank one when $\Gamma = \mathbb{Z}^d$, but a number of technical details need to be worked out.
The definition given for an almost AF Cantor groupoid whose reduced C*-algebra is not simple is merely a guess, and it is also not clear what properties the reduced C*-algebra of such a groupoid should have.
Question 8.6. What is the “right” definition of an almost AF Cantor groupoid in the case that the reduced C*-algebra is not simple?
Question 8.7. Let $G$ be an almost AF Cantor groupoid such that $C_r^*(G)$ is not simple. What structural consequences does this have for $C_r^*(G)$?
The definition should surely exclude the transformation group groupoid coming from the action of $\mathbb{Z}$ on $\mathbb{Z} \cup \{\pm\infty\}$ by translation, even though this action is free on a dense set. It is not essentially free, which is equivalent to the transformation group groupoid $G$ not being essentially principal (Definition 1.14(4)).
Suppose, however, that $h_1, h_2 : X \to X$ are two commuting homeomorphisms of the Cantor set $X$, such that the map $(n_1, n_2) \mapsto h_1^{n_1} \circ h_2^{n_2}$ determines a free minimal action of $\mathbb{Z}^2$. Then the transformation group groupoid for the action of $\mathbb{Z}$ on $X$ generated by $h_1$ should be almost AF. It is possible for this action to be already minimal, or for $X$ to be the disjoint union of finitely many minimal sets, in which case the situation is clear. It is possible that neither of these happens: consider the product of two minimal actions. It is not clear what additional structure the homeomorphism $h_1$ must have (although it can’t be completely arbitrary; see below), and it is also not clear whether the crossed product must necessarily have real rank zero, stable rank one, or order on its $K_0$-group determined by traces.
Let $h$ be an arbitrary aperiodic homeomorphism of the Cantor set $X$. (That is, the action of $\mathbb{Z}$ it generates is free, but need not be minimal.) Theorem 3.1 of [38] shows that if $h$ has no nontrivial invariant subsets which are both closed and open, and if $h$ has more than one minimal set, then $C^*(\mathbb{Z}, X, h)$ does not have stable rank one. If $h_1$ and $h_2$ are as above, and $h = h_1$, then the existence of a unique minimal set $K$ for $h$ implies that $h$ is minimal. (The set $K$ is necessarily invariant under $h_2$.) On the other hand, we know of no examples of $h_1$ and $h_2$ as above in which $h_1$ is not minimal yet has no nontrivial invariant subsets which are both closed and open.
It follows from Corollary 2.3 of [34] that if $h$ is an arbitrary aperiodic homeomorphism of the Cantor set $X$, then $C^*(\mathbb{Z}, X, h)$ has the ideal property, that is, every ideal is generated as an ideal in the algebra by its projections. This certainly suggests that the reduced C*-algebra of an almost AF Cantor groupoid should have the ideal property. This condition is, however, rather weak in this context, since every simple unital C*-algebra, regardless of its real or stable rank, has the ideal property.
We now give an example of an aperiodic homeomorphism of the Cantor set whose transformation group groupoid is not almost AF, but only because of the failure of condition (2) in Definition 2.2. Its C*-algebra does not have stable rank one.

Example 8.8. Let $X_1 = \mathbb{Z} \cup \{\pm\infty\}$ be the two point compactification of $\mathbb{Z}$, and let $h_1 \colon X_1 \to X_1$ be translation to the right (fixing $\pm\infty$). Let $X_2$ be the Cantor set, and let $h_2 \colon X_2 \to X_2$ be any minimal homeomorphism. Let $X = X_1 \times X_2$, and let $h = h_1 \times h_2 \colon X \to X$. Let $G = \mathbb{Z} \times X$ be the transformation group groupoid, as in Example 1.3, for the corresponding action of $\mathbb{Z}$. Then $G$ has the following properties:
• It is a Cantor groupoid.
• It satisfies condition (1) of Definition 2.2, the main part of the definition of an almost AF Cantor groupoid.
• There is a nonempty open subset $U \subset G^{(0)}$ which is null for all $G$-invariant Borel probability measures.
• The reduced C*-algebra $C_r^*(G)$ does not have stable rank one.

For the first statement, observe that $X$ is a totally disconnected compact metric space with no isolated points, and hence homeomorphic to the Cantor set. So $G$ is a Cantor groupoid, as in Example 1.3.

The third statement is also easy to prove, with $U = \mathbb{Z} \times X_2$. The $G$-invariant Borel probability measures on $G^{(0)}$ are exactly the $h$-invariant Borel probability measures on $X$. (We have not found this statement in the literature. However, our case follows immediately from Remark 1.5(2) and the formulas in Example 1.3 by considering functions supported on sets of the form $\{\gamma\} \times X$ and letting $\gamma$ run through the group.) So let $\mu$ be any $h$-invariant Borel probability measure on $X$. Then $\mu(\{n\} \times X_2)$ is independent of $n$,
and
$$\sum_{n \in \mathbb{Z}} \mu(\{n\} \times X_2) = \mu(\mathbb{Z} \times X_2) \leq \mu(X) = 1.$$
It follows that $\mu(\{n\} \times X_2) = 0$ for all $n \in \mathbb{Z}$, whence $\mu(\mathbb{Z} \times X_2) = 0$.

We verify condition (1) of Definition 2.2. Fix any $z \in X_2$. Set $Y = X_1 \times \{z\} \subset X$. Let $G_0$ be the subgroupoid of $G$ as in Example 2.6 of [39] corresponding to this situation. (Also see the statement of Theorem 2.4 of [39].) Note that, in our notation, $G_0 = G \setminus (L \cup L^{-1})$, with
$$L = \{(k, h^l(y)) : y \in Y,\ k, l \in \mathbb{Z},\ l > 0,\ \text{and}\ k + l \leq 0\} \subset G.$$
To see that $G_0$ is AF (compare with [37]), we choose a decreasing sequence of compact open subsets $Z_n \subset X_2$ with $\bigcap_{n=1}^{\infty} Z_n = \{z\}$. Set $Y_n = X_1 \times Z_n$, define $L_n \subset G$ by using $Y_n$ in place of $Y$ in the definition of $L$, and set $H_n = G \setminus (L_n \cup L_n^{-1})$. It is easy to check that $H_n$ is a closed and open subgroupoid of $G$ which is contained in $G_0$, and that $G_0 = \bigcup_{n=1}^{\infty} H_n$. Since $h_2$ is minimal, there is $l(n)$ such that the sets $h_2(Z_n), h_2^2(Z_n), \ldots, h_2^{l(n)}(Z_n)$ cover $X_2$, and it follows that $H_n \subset \{-l(n), -l(n) + 1, \ldots, l(n)\} \times X$. So $H_n$ is compact. We have now shown that $G_0$ is in fact an AF Cantor groupoid.

We now show that if $K \subset G \setminus G_0$ is compact, then $s(K)$ is thin in $G_0$. We will identify the unit space $G^{(0)} = \{0\} \times X$ with $X$ in the obvious way. Compact subsets of thin sets are easily seen to be thin, so it suffices to consider sets of the form $K = (G \setminus G_0) \cap T$ with $T = \{-m, -m + 1, \ldots, m - 1, m\} \times X$ for some $m \in \mathbb{N}$. Now $(G \setminus G_0) \cap T = (L \cup L^{-1}) \cap T$, so, following Example 1.3, $s(K) = s(L \cap T) \cup r(L \cap T)$. Next, we calculate
$$L \cap T = \{(k, h^l(y)) : y \in Y,\ k, l \in \mathbb{Z},\ l > 0,\ k + l \leq 0,\ \text{and}\ -m \leq k \leq m\} = \{(-k, h^l(y)) : y \in Y,\ k, l \in \mathbb{Z},\ \text{and}\ 1 \leq l \leq k \leq m\}.$$
It follows that
$$s(L \cap T) = h(Y) \cup h^2(Y) \cup \cdots \cup h^m(Y)$$
and
$$r(L \cap T) = \bigcup_{1 \leq l \leq k \leq m} h^{-k+l}(Y) = Y \cup h^{-1}(Y) \cup \cdots \cup h^{-m+1}(Y).$$
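The index bookkeeping in this computation is easy to check mechanically. The following Python sketch enumerates the pairs $(k, l)$ that parametrize $L \cap T$ for small $m$ and confirms the two unions of exponents. The function names and the convention that an element $(k, x)$ has source $x$ and range $h^k(x)$ are assumptions read off from the displayed formulas, not notation taken from the paper.

```python
from itertools import product


def l_cap_t_exponents(m):
    """Return (source_exponents, range_exponents) for L ∩ T.

    An element (k, h^l(y)) of L ∩ T has l > 0, k + l <= 0 and -m <= k <= m.
    Assumed convention (inferred from the displayed formulas): the source of
    (k, x) is x and the range is h^k(x). So the source of (k, h^l(y)) lies in
    h^l(Y) and its range lies in h^(k+l)(Y).
    """
    sources, ranges = set(), set()
    # l > 0, k + l <= 0 and k >= -m together force 1 <= l <= m.
    for k, l in product(range(-m, m + 1), range(1, m + 1)):
        if k + l <= 0:
            sources.add(l)
            ranges.add(k + l)
    return sources, ranges


def check(m):
    """Compare the enumeration with the closed forms stated in the text."""
    sources, ranges = l_cap_t_exponents(m)
    expected_sources = set(range(1, m + 1))      # h(Y), ..., h^m(Y)
    expected_ranges = set(range(-(m - 1), 1))    # Y, h^-1(Y), ..., h^(-m+1)(Y)
    return sources == expected_sources and ranges == expected_ranges


if __name__ == "__main__":
    for m in range(1, 8):
        print(m, check(m))
```

This only verifies which powers of $h$ occur; it says nothing about the topology of the sets involved.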
Now let $n \in \mathbb{N}$. We find compact graphs $S_1, S_2, \ldots, S_n \subset G$ such that $s(S_k) = s(K)$ for each $k$ and the sets $r(S_1), r(S_2), \ldots, r(S_n)$ are pairwise disjoint. For $1 \leq j \leq n$, define
$$S_j = \Bigl( \{m(j-1)\} \times \bigl[ h(Y) \cup \cdots \cup h^m(Y) \bigr] \Bigr) \cup \Bigl( \{-m(j-1)\} \times \bigl[ Y \cup \cdots \cup h^{-m+1}(Y) \bigr] \Bigr) \subset \mathbb{Z} \times X.$$
Then the $S_j$ are compact graphs in $\mathbb{Z} \times X$, $s(S_j) = s(K)$, and
$$\begin{aligned} r(S_j) &= h^{m(j-1)+1}(Y) \cup h^{m(j-1)+2}(Y) \cup \cdots \cup h^{mj}(Y) \cup h^{-m(j-1)}(Y) \cup h^{-[m(j-1)+1]}(Y) \cup \cdots \cup h^{-(mj-1)}(Y) \\ &= X_1 \times \Bigl( \{h_2^{m(j-1)+1}(z), \ldots, h_2^{mj}(z)\} \cup \{h_2^{-m(j-1)}(z), \ldots, h_2^{-(mj-1)}(z)\} \Bigr). \end{aligned}$$
Since the homeomorphism $h_2$ has no periodic points, these sets are pairwise disjoint. This completes the proof that if $K \subset G \setminus G_0$ is compact, then $s(K)$ is thin in $G_0$, and hence also the proof of condition (1) of Definition 2.2.

The reduced C*-algebra $C_r^*(\mathbb{Z} \times X)$ is isomorphic to the transformation group C*-algebra $C_r^*(\mathbb{Z}, X, h) = C^*(\mathbb{Z}, X, h)$, by Proposition 1.8. We can use Theorem 3.1 of [38] to show that $C^*(\mathbb{Z}, X, h)$ does not have stable rank one, but it is just as easy to do this directly. Let $u \in C^*(\mathbb{Z}, X, h)$ be the canonical unitary, satisfying $u f u^* = f \circ h^{-1}$ for $f \in C(X)$. Set
$$R = \{0, 1, \ldots, \infty\} \times X_2 \subset X \quad \text{and} \quad p = \chi_R \in C(X) \subset C^*(\mathbb{Z}, X, h).$$
Then one checks that $s = up + (1 - p)$ satisfies
$$s^* s = 1 \quad \text{and} \quad s s^* = 1 - \chi_{\{0\} \times X_2} \neq 1.$$
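The check is short enough to write out. The sketch below uses only the covariance relation $u f u^* = f \circ h^{-1}$ and the inclusion $h(R) \subset R$; the abbreviation $q = \chi_{h(R)}$ is introduced here for convenience and does not appear in the original.

```latex
% Verification that s = up + (1 - p) is a nonunitary isometry.
% Notation introduced for this sketch only: q := u p u^*.
\begin{align*}
q &:= u p u^* = \chi_R \circ h^{-1} = \chi_{h(R)},
   \qquad h(R) = \{1, 2, \ldots, \infty\} \times X_2 \subset R, \\
  &\text{so } q \leq p, \text{ hence } (1 - p) q = 0 \text{ and }
   (1 - p) u p = (1 - p) q u = 0 = \bigl( p u^* (1 - p) \bigr)^*. \\[4pt]
s^* s &= \bigl( p u^* + (1 - p) \bigr) \bigl( u p + (1 - p) \bigr)
       = p u^* u p + p u^* (1 - p) + (1 - p) u p + (1 - p) \\
      &= p + 0 + 0 + (1 - p) = 1, \\[4pt]
s s^* &= \bigl( u p + (1 - p) \bigr) \bigl( p u^* + (1 - p) \bigr)
       = u p u^* + u p (1 - p) + (1 - p) p u^* + (1 - p) \\
      &= q + 0 + 0 + (1 - p) = 1 - (p - q)
       = 1 - \chi_{R \setminus h(R)} = 1 - \chi_{\{0\} \times X_2}.
\end{align*}
% Since \{0\} \times X_2 is nonempty, s s^* \neq 1.
```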
So $s$ is a nonunitary isometry in $C^*(\mathbb{Z}, X, h)$. It follows that $s$ is not a norm limit of invertible elements of $C^*(\mathbb{Z}, X, h)$, so this algebra does not have stable rank one. (Compare with Examples 4.13 of [42].)

Acknowledgements. I am grateful to the following people for helpful conversations and email correspondence: Claire Anantharaman-Delaroche, Jean Bellissard, Tomasz Downarowicz, Alan Forrest, Johannes Kellendonk, David Pask, Cornel Pasnicu, Ian Putnam, Jean Renault, and Christian Skau. Some of these conversations occurred at the conference on Operator Algebras and Mathematical Physics at Constanța in July 2001, and I am grateful to the organizers of that conference for support. Part of this work was done during a visit to the Institute of Mathematics of the Romanian Academy, and I am grateful to that institution for its hospitality. I also thank the referee for pointing out that simplicity is not needed in Theorem 3.5 and for suggesting a simplification of the proof of Theorem 4.6.
References
1. Anantharaman-Delaroche, C.: Amenability and exactness for dynamical systems and their C*-algebras. Trans. Am. Math. Soc. 354, 4153–4178 (2002)
2. Archbold, R.J., Spielberg, J.S.: Topologically free actions and ideals in discrete C*-dynamical systems. Proc. Edinburgh Math. Soc. (2) 37, 119–124 (1994)
3. Bellissard, J., Benedetti, R., Gambaudo, J.-M.: Spaces of tilings, finite telescopic approximation and gap labelings. http://arxiv.org/abs/math.DS/0109062, 2001
4. Bellissard, J., Herrmann, D., Zarrouati, M.: Hull of aperiodic solids and gap labelling theorems. In: M.B. Baake, R.V. Moody (eds.), Directions in Mathematical Quasicrystals, CRM Monograph Series Vol. 13, Providence, RI: Amer. Math. Soc., 2000, pp. 207–259
5. Benameur, M., Oyono-Oyono, H.: Calcul du label des gaps pour les quasi-cristaux. C.R. Math. Acad. Sci. Paris 334, 667–670 (2002)
6. Blackadar, B.: K-Theory for Operator Algebras. MSRI Publication Series 5, New York, Heidelberg, Berlin, Tokyo: Springer-Verlag, 1986
7. Blackadar, B.: Comparison theory for simple C*-algebras. In: D.E. Evans, M. Takesaki (eds.), Operator Algebras and Applications, London Math. Soc. Lecture Notes Series no. 135, Cambridge, New York: Cambridge University Press, 1988, pp. 21–54
8. Bratteli, O., Evans, D.E., Kishimoto, A.: Crossed products of totally disconnected spaces by Z2 ∗ Z2. Ergod. Th. Dynam. Sys. 13, 445–484 (1993)
9. Brown, L.G., Green, P., Rieffel, M.A.: Stable isomorphism and strong Morita equivalence of C*-algebras. Pacific J. Math. 71, 349–363 (1977)
10. Brown, L.G., Pedersen, G.K.: C*-algebras of real rank zero. J. Funct. Anal. 99, 131–149 (1991)
11. Connes, A.: An analogue of the Thom isomorphism for crossed products of a C*-algebra by an action of R. Adv. in Math. 39, 31–55 (1981)
12. Cuntz, J.: K-theory for certain C*-algebras. Ann. Math. 113, 181–197 (1981)
13. Dahlberg, B.E.J., Trubowitz, E.: A remark on two-dimensional periodic potentials. Comment. Math. Helv. 57, 130–134 (1982)
14. Davidson, K.R.: C*-Algebras by Example. Fields Institute Monographs no. 6, Providence, RI: Amer. Math. Soc., 1996
15. Engelking, R.: Dimension Theory. Oxford, Amsterdam, New York: North-Holland, 1978
16. Forrest, A.: A Bratteli diagram for commuting homeomorphisms of the Cantor set. Int. J. Math. 11, 177–200 (2000)
17. Forrest, A., Hunton, J.: The cohomology and K-theory of commuting homeomorphisms of the Cantor set. Ergod. Th. Dynam. Sys. 19, 611–625 (1999)
18. Forrest, A., Hunton, J., Kellendonk, J.: Topological invariants for projection method patterns. Mem. Amer. Math. Soc. 758, Providence, RI: Amer. Math. Soc. 2002
19. Giordano, T., Putnam, I.F., Skau, C.F.: Affable equivalence relations and orbit structure of Cantor dynamical systems. Ergod. Th. Dynam. Sys. 24, 441–475 (2004)
20. Goodearl, K.R.: Partially Ordered Abelian Groups with Interpolation. Math. Surveys and Monographs no. 20, Providence RI: Amer. Math. Soc., 1986
21. Goodearl, K.R.: Notes on a class of simple C*-algebras with real rank zero. Publ. Mat. (Barcelona) 36, 637–654 (1992)
22. Goodearl, K.: Riesz decomposition in inductive limit C*-algebras. Rocky Mtn. J. Math. 24, 1405–1430 (1994)
23. Haefliger, A.: Groupoids and foliations. In: A. Ramsay, J. Renault (eds.), Groupoids in Analysis, Geometry, and Physics (Boulder, CO, 1999), Contemp. Math. Vol. 282, Providence RI: Amer. Math. Soc. 2001, pp. 83–100
24. Helffer, B., Mohamed, A.: Asymptotic of the density of states for the Schrödinger operator with periodic electric potential. Duke Math. J. 92, 1–60 (1998)
25. Kaminker, J., Putnam, I.: A proof of the gap labeling conjecture. Mich. Math. J. 51, 537–546 (2003)
26. Karpeshina, Y.E.: Perturbation theory for the Schrödinger operator with a periodic potential. Lecture Notes in Math. no. 1663, Berlin: Springer-Verlag, 1997
27. Kellendonk, J.: The local structure of tilings and their integer group of coinvariants. Commun. Math. Phys. 187, 115–157 (1997)
28. Kellendonk, J., Putnam, I.F.: Tilings, C*-algebras, and K-theory. In: M.B. Baake, R.V. Moody (eds.), Directions in Mathematical Quasicrystals, CRM Monograph Series vol. 13, Providence RI: Amer. Math. Soc. 2000, pp. 177–206
29. Khoshkam, M., Skandalis, G.: Regular representations of groupoids and applications to inverse semigroups. J. Reine Angew. Math. 546, 47–72 (2002)
30. Lin, H.: Tracially AF C*-algebras. Trans. Am. Math. Soc. 353, 693–722 (2001)
31. Lin, H.: Classification of simple C*-algebras of tracial topological rank zero. Duke Math. J. 125, 91–119 (2004)
32. Matui, H.: AF embeddability of crossed products of AT algebras by the integers and its application. J. Funct. Anal. 192, 562–580 (2002)
33. Moerdijk, I.: Étale groupoids, derived categories, and operations. In: A. Ramsay, J. Renault (eds.), Groupoids in Analysis, Geometry, and Physics (Boulder, CO, 1999), Contemp. Math. vol. 282, Providence RI: Amer. Math. Soc. 2001, pp. 101–114
34. Pasnicu, C.: The ideal property in crossed products. Proc. Am. Math. Soc. 131, 2103–2108 (2003)
35. Pasnicu, C.: Shape equivalence, nonstable K-theory and AH algebras. Pacific J. Math. 192, 159–182 (2000)
36. Pedersen, G.K.: C*-Algebras and their Automorphism Groups. London-New York-San Francisco, Academic Press, 1979
37. Putnam, I.F.: The C*-algebras associated with minimal homeomorphisms of the Cantor set. Pacific J. Math. 136, 329–353 (1989)
38. Putnam, I.F.: On the topological stable rank of certain transformation group C*-algebras. Ergod. Th. Dynam. Sys. 10, 197–207 (1990)
39. Putnam, I.F.: On the K-theory of C*-algebras of principal groupoids. Rocky Mtn. J. Math. 28, 1483–1518 (1998)
40. Putnam, I.F.: The ordered K-theory of C*-algebras associated with substitution tilings. Commun. Math. Phys. 214, 593–605 (2000)
41. Renault, J.: A Groupoid Approach to C*-Algebras. Springer Lecture Notes in Math. no. 793, Berlin, Heidelberg, New York: Springer-Verlag, 1980
42. Rieffel, M.A.: Dimension and stable rank in the K-theory of C*-algebras. Proc. London Math. Soc. Ser. 3 46, 301–333 (1983)
43. Rørdam, M.: On the structure of simple C*-algebras tensored with a UHF-algebra. J. Funct. Anal. 100, 1–17 (1991)
44. Sadun, L., Williams, R.F.: Tiling spaces are Cantor set fiber bundles. Ergod. Th. Dynam. Sys. 23, 307–316 (2003)
45. Sire, C., Gratias, D.: Introduction to the physics of quasicrystals. In: P.E.A. Turchi, A. Gonis (eds.), Statics and Dynamics of Alloy Phase Transformations, NATO ASI Series B: Physics vol. 319, New York: Plenum Press 1994, pp. 127–154
46. Skriganov, M.M.: The spectrum band structure of the three-dimensional Schrödinger operator with periodic potential. Invent. Math. 80, 107–121 (1985)
47. Socolar, J.E.S., Steinhardt, P.J., Levine, D.: Quasicrystals with arbitrary orientational symmetry. Phys. Rev. B 32, 5547–5550 (1985)
48. Zhang, S.: A Riesz decomposition property and ideal structure of multiplier algebras. J. Operator Theory 24, 204–225 (1990)
49. Anderson, J.E., Putnam, I.F.: Topological invariants for substitution tilings and their associated C*-algebras. Ergod. Th. Dynam. Sys. 18, 509–537 (1998)
50. Gähler, F.: Talk at the conference “Aperiodic Order: Dynamical Systems, Combinatorics, and Operators”, Banff International Research Station, 29 May–3 June 2004

Communicated by Y. Kawahigashi
Commun. Math. Phys. 256, 43–110 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1281-6
Communications in
Mathematical Physics
Global Existence for the Einstein Vacuum Equations in Wave Coordinates
Hans Lindblad¹, Igor Rodnianski²
¹ Mathematics Department, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0112, USA
² Department of Mathematics, Fine Hall, Princeton University, Princeton, NJ 08544-1000, USA
Received: 11 December 2003 / Accepted: 28 September 2004
Published online: 8 March 2005 – © Springer-Verlag 2005
Abstract: We prove global stability of Minkowski space for the Einstein vacuum equations in harmonic (wave) coordinate gauge for the set of restricted data coinciding with the Schwarzschild solution in the neighborhood of space-like infinity. The result contradicts previous beliefs that wave coordinates are “unstable in the large” and provides an alternative approach to the stability problem originally solved (for unrestricted data, in a different gauge and with a precise description of the asymptotic behavior at null infinity) by D. Christodoulou and S. Klainerman. Using the wave coordinate gauge we recast the Einstein equations as a system of quasilinear wave equations and, in the absence of the classical null condition, establish a small data global existence result. In our previous work we introduced the notion of a weak null condition and showed that the Einstein equations in harmonic coordinates satisfy this condition. The result of this paper relies on this observation and combines it with the vector field method based on the symmetries of the standard Minkowski space. In a forthcoming paper we will address the question of stability of Minkowski space for the Einstein vacuum equations in wave coordinates for all “small” asymptotically flat data and the case of the Einstein equations coupled to a scalar field.

1. Introduction

The focus of this paper is the question of global existence and stability for the Einstein vacuum equations in “harmonic” (wave coordinate) gauge. The Einstein equations determine a 4-d manifold $M$ with a Lorentzian metric $g$ with vanishing Ricci curvature $R_{\mu\nu} = 0$.
Part of this work was done while H.L. was a Member of the Institute for Advanced Study, Princeton, supported by the NSF grant DMS-0111298 to the Institute. H.L. was also partially supported by the NSF Grant DMS-0200226. Part of this work was done while I.R. was a Clay Mathematics Institute Long-Term Prize Fellow. His work was also partially supported by the NSF grant DMS–01007791.
We consider the initial value problem: for a given 3-d manifold $\Sigma$, with a Riemannian metric $g_0$ and a symmetric two-tensor $k_0$, we want to find a 4-d manifold $M$, with a Lorentzian metric $g$ satisfying the Einstein equations, and an imbedding $\Sigma \subset M$ such that $g_0$ is the restriction of $g$ to $\Sigma$ and $k_0$ is the second fundamental form of $\Sigma$ in $M$. The initial value problem is overdetermined, which imposes compatibility conditions on the initial data: the constraint equations
$$R_0 - k_0{}^{i}{}_{j}\, k_0{}^{j}{}_{i} + k_0{}^{i}{}_{i}\, k_0{}^{j}{}_{j} = 0, \qquad \nabla^j k_{0\,ij} - \nabla_i k_0{}^{j}{}_{j} = 0, \quad \forall i = 1, \ldots, 3.$$
Here $R_0$ is the scalar curvature of $g_0$ and $\nabla$ is covariant differentiation with respect to $g_0$.

The Einstein equations are invariant under diffeomorphisms. To have a working formulation one needs to eliminate this freedom by fixing a gauge condition or a system of coordinates. While the Einstein equations are independent of the choice of a coordinate system, the existence of a special or preferred system of coordinates has been a subject of debate [Fo]. Historically, the first special coordinates were the harmonic coordinates (also referred to as wave coordinates in current terminology). These obey the equation $\Box_g x^\mu = 0$, $\mu = 0, 1, 2, 3$, where $\Box_g = \nabla_\alpha \nabla^\alpha$ is the geometric wave operator. Relative to the wave coordinates a Lorentzian metric $g$ satisfies the wave coordinate condition if:¹
$$g^{\alpha\beta} \partial_\beta g_{\alpha\mu} = \frac{1}{2}\, g^{\alpha\beta} \partial_\mu g_{\alpha\beta}, \quad \forall \mu = 0, \ldots, 3. \tag{1.1}$$
In this system of coordinates, the vacuum Einstein equations take the form of a system of quasilinear wave equations
$$g^{\alpha\beta} \partial_\alpha \partial_\beta g_{\mu\nu} = N_{\mu\nu}(g, \partial g), \quad \forall \mu, \nu = 0, \ldots, 3 \tag{1.2}$$
with a nonlinearity $N(u, v)$ depending quadratically on $v$. In this particular gauge Choquet-Bruhat [CB1] was able to establish the existence of a globally hyperbolic development² of the Einstein vacuum equations starting with an arbitrary set of initial data prescribed on a 3-d space-like hypersurface and satisfying the constraint equations. While the result of Choquet-Bruhat and a later result of Choquet-Bruhat and Geroch [CB-G], establishing the existence of a maximal Cauchy development, constructs solutions for any given initial data set, it does not provide any information about the geodesic completeness of the obtained solution. In the language of the evolution equations these results only show the existence of “local in time” solutions.

The global results have proved to be by far more resistant. The outstanding global problem, which for a long time remained open, and was finally ingeniously solved by Christodoulou and Klainerman [C-K], was that of the stability of Minkowski space. In simplified language, it is the problem of constructing a global solution to the Einstein vacuum equations from the initial data, which is close to the Minkowski metric $m_{\mu\nu}$, and asymptotically approaching the Minkowski space. The initial data $(\Sigma, g_0, k_0)$ for the problem of stability of Minkowski space is asymptotically flat, i.e., the complement of a compact set in $\Sigma$ is diffeomorphic to the complement of a ball in $\mathbb{R}^3$, and there exists a system of coordinates $(x_1, x_2, x_3)$ with $r = \sqrt{x_1^2 + x_2^2 + x_3^2}$ such that for all sufficiently large $r$ the metric³ $g_{0\,ij} = (1 + 2M/r)\delta_{ij} + o(r^{-1-\sigma})$, and the second fundamental form $k_0 = o(r^{-2-\sigma})$ for some $\sigma > 0$. Here $M$ is the mass, which by the positive mass theorem is positive unless the data is flat, see Schoen and Yau [S-Y] and Witten [Wi]. In addition, the data is required to satisfy a global smallness assumption, which makes sure that it is sufficiently close to the data $(\mathbb{R}^3, \delta, 0)$ for the Minkowski space.

¹ We shall use below the standard convention of summing over repeated indices and the notation $\partial_\alpha = \partial/\partial x^\alpha$.
² For the definitions of global hyperbolicity and maximal Cauchy development see [H-E, Wa].
³ The stability result of [C-K] was proved for strongly asymptotically flat data $g_{0\,ij} = (1 + 2M/r)\delta_{ij} + o(r^{-3/2})$, $k_0 = o(r^{-5/2})$.

To understand some of the difficulties of the problem we recall that a generic system of quasilinear equations
$$\Box \phi_I = \sum_{|\alpha| \leq |\beta| \leq 2} A^{JK}_{I,\alpha\beta}\, \partial^\alpha \phi_J\, \partial^\beta \phi_K + \text{cubic terms} \tag{1.3}$$
allows solutions with smooth arbitrarily small initial data which blow up in finite time.⁴ The key to global existence for such equations was the null condition found by Klainerman [K2]. The small data global existence result for the equations satisfying the null condition was established in [C1, K2]. The null condition manifests itself in the special algebraic cancellations in the coefficients $A^{JK}_{I,\alpha\beta}$ of the quadratic terms of the equation.⁵ It can be shown, however, that the Einstein vacuum equations in wave coordinates do not satisfy the null condition. Moreover, Choquet-Bruhat [CB3] showed that even without imposing a specific gauge the Einstein equations violate the null condition. These considerations led to the suggestion that the wave coordinates are not suitable for proving stability of Minkowski space. In fact, considering a second iterate of Eq. (1.2), Choquet-Bruhat [CB2] argued that the Einstein vacuum equations are not stable in wave coordinates near the Minkowski solution. All these resulted in the belief that the wave coordinates are unstable in the large in the sense that a possible finite time blow up of solutions of Eq. (1.2) is due to a coordinate singularity.

The global stability of Minkowski space had been proved by Christodoulou and Klainerman [C-K] who avoided the use of a preferred system of coordinates and instead relied on the invariant formulation of the Einstein equations with the choice of maximal time foliation (or the double null foliation in the new proof of Klainerman and Nicolo [K-N1]) and utilizing Bianchi identities for the curvature. The special structure of the quadratic terms plays a crucial part in the generalized energy estimates which form the backbone of the proof but the null condition cannot be pointed out precisely. A semiglobal stability result was also obtained in the work of Friedrich [Fr]. He used the conformal method to reduce the global problem to a local one.
The approach is invariant and the special structure is again exploited implicitly.

In this paper we revisit the problem of global stability of Minkowski space in wave coordinates. More precisely, we consider the data⁶ $(\mathbb{R}^3, g_0, k_0)$ with the metric $g_0$ coinciding with the spatial part of the Schwarzschild metric $g_S = (1 + M/r)^4\, dx^2$ in the region $r > 1 \gg M$, vanishing second fundamental form $k_0$ for $r > 1$, and satisfying a global smallness assumption on $\mathbb{R}^3$. We prove that for this initial data the wave coordinate gauge is stable in the large: the reduced Einstein equations (1.2) have a global solution $g$ defining a future causally geodesically complete space-time [H-E]. The metric $g$ in wave coordinates $x^\alpha$, $\alpha = 0, \ldots, 3$, approaches the Minkowski metric $m$: $\sup_{x \in \mathbb{R}^3} |g(t, x) - m| \to 0$ as $t \to \infty$.

⁴ This is in particular true for a semilinear equation $\Box \phi = (\partial_t \phi)^2$, [J1].
⁵ E.g. $\Box \phi = (\partial_t \phi)^2 - |\nabla_x \phi|^2$ satisfies the null condition.
⁶ The existence of such data is guaranteed by the results of Corvino and Chrusciel-Delay, [Co, C-D].

The intuition behind this result is based on the observation that the Einstein vacuum equations in wave coordinates (1.2) satisfy the weak null condition. This notion was introduced in [L-R] for general quasilinear systems (1.3) and requires that the corresponding effective asymptotic system
$$(\partial_t + \partial_r)(\partial_t - \partial_r) \Phi_I = r^{-1} \sum_{n \leq m \leq 2} A^{JK}_{I,nm}\, (\partial_t - \partial_r)^n \Phi_J\, (\partial_t - \partial_r)^m \Phi_K, \qquad \Phi_I \sim r \phi_I \tag{1.4}$$
has global solutions for all small initial data.⁷ Here,
$$A^{JK}_{I,nm}(\omega) = \sum_{|\alpha| = n,\ |\beta| = m} A^{JK}_{I,\alpha\beta}\, \hat{\omega}^\alpha \hat{\omega}^\beta, \qquad \hat{\omega} = (-1, \omega), \quad \omega \in S^2.$$
The classical null condition states that $A^{JK}_{I,nm}(\omega) \equiv 0$ and thus implies the weak null condition. The asymptotic system (1.4) arises as an approximation of (1.3) when one neglects the derivatives tangential to the outgoing Minkowski light cones, known to have faster decay. The asymptotic equation was introduced in [H1] to predict the time of a blow-up for scalar wave equations known to blow up in finite time, and was used in [L2] to find some other scalar wave equations for which the known blow-up mechanism was not present. Asymptotic systems played an important role in the analysis of the blow-up mechanisms in [A1].

In [L-R] we have shown that the asymptotic system generated by the Einstein equations in wave coordinates (1.2) has global solutions for all data. In this paper we consider the full nonlinear system (1.2). We should note that although the asymptotic system provides useful heuristics about the behavior of solutions, in particular the $L^\infty$ decay of the first derivatives of various components of the metric $g$, it is barely used in our proof of the small data global existence result for the full nonlinear equation (1.2).

While it is tempting to put forward a conjecture that, parallel to the result for the classical null condition [C1, K2], the weak null condition guarantees the global existence result for small initial data, we can only argue that all known examples seem to confirm it. A simple example of an equation satisfying the weak null condition, violating the standard null condition and yet possessing global solutions for all data is given by the system
$$\Box \phi = w \cdot \partial^2 \phi + \partial\psi \cdot \partial\psi, \qquad \Box \psi = 0, \qquad \Box w = 0. \tag{1.5}$$
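For contrast with (1.5), it may help to see why the classical null condition is so favorable in the simplest case. The model equation $\Box\phi = (\partial_t\phi)^2 - |\nabla_x\phi|^2$, quoted in the footnotes as satisfying the null condition, can be linearized by a change of unknown. The sketch below assumes the sign convention $\Box = -\partial_t^2 + \Delta_x$, which is an assumption about the paper's conventions; with the opposite sign one uses $e^{-\phi}$ in place of $e^{\phi}$.

```latex
% Linearizing the model null-form equation (a sketch, assuming
% \Box = m^{\mu\nu}\partial_\mu\partial_\nu with m = diag(-1, 1, 1, 1)).
\begin{align*}
\partial_\mu \partial_\nu \bigl( e^{\phi} \bigr)
  &= e^{\phi} \bigl( \partial_\mu \partial_\nu \phi
     + \partial_\mu \phi\, \partial_\nu \phi \bigr), \\
\Box \bigl( e^{\phi} \bigr)
  &= e^{\phi} \Bigl( \Box \phi - (\partial_t \phi)^2 + |\nabla_x \phi|^2 \Bigr).
\end{align*}
% Hence, if \Box\phi = (\partial_t\phi)^2 - |\nabla_x\phi|^2, then v := e^{\phi}
% solves the linear wave equation \Box v = 0, and \phi = \log v is defined
% for as long as v stays positive, which holds while v - 1 remains small.
% No such cancellation is available for \Box\phi = (\partial_t\phi)^2.
```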
Another example is provided by the equation $\Box \phi = \phi\, \Delta \phi$. The proof of a small data global existence result for this equation is quite involved, [L2] (radial case), [A3]. As we show in this paper, the Einstein equations (1.2) are yet another example. Interestingly enough, at the level of an effective asymptotic system the Einstein equations can be modelled by the system (1.5).

The asymptotic behavior of null components of the Riemann curvature tensor $R_{\alpha\beta\gamma\delta}$ of the metric $g$ (the so-called “peeling estimates”) was discussed in the works of Bondi, Sachs and Penrose and becomes important in the framework of asymptotically simple space-times (roughly speaking, space-times which can be conformally compactified), see also the paper of Christodoulou [C2] for further discussion of such space-times. Global solutions obtained in the work [C-K] were accompanied by very precise analysis of their asymptotic behavior, although not entirely consistent with peeling estimates. However, global solutions obtained by Klainerman-Nicolo [K-N1] in the exterior⁸ stability of Minkowski space were shown to possess peeling estimates for special initial data, [K-N2].

⁷ For the precise definition see Sect. 6.
⁸ Outside of the domain of dependence of a compact set.
Our work is less precise about the asymptotic behavior and is focused more on developing a technically relatively simple approach allowing us to prove stability of Minkowski space in a physically interesting wave coordinate gauge. In particular, we rely only on the standard Killing and conformal Killing vector fields of Minkowski space and do not construct almost Killing and conformal Killing vector fields adapted to the geometry of null cones of the solution $g$.

Our proof is based on generalized energy estimates combined with decay estimates. The generalized energy estimates are used with Minkowski vector fields $\{\partial_\alpha,\ \Omega_{\alpha\beta} = x_\alpha \partial_\beta - x_\beta \partial_\alpha,\ S = x^\alpha \partial_\alpha\}$. For the equations satisfying the standard null condition, uniform in time bounds on the generalized energies, combined with global Sobolev (Klainerman-Sobolev) inequalities, are sufficient to infer small data global existence. In our case, however, the generalized energies slowly grow in time (at the rate of $t^{\varepsilon}$) and need to be complemented by independent decay estimates that do not follow from the global Sobolev inequalities. We derive the latter by direct integration of the equation along the characteristics. It is at this point that the intuition from the effective asymptotic system is most useful. We show that all components of the metric with the exception of one decay at the rate of $t^{-1}$. The remaining component, however, decays only as $t^{-1+\varepsilon}$. Somewhat surprisingly, the glue that holds together such weak decay estimates and the generalized energy estimates is the wave coordinate condition (1.1).

In this paper we only prove the result for a restricted set of data coinciding with the Schwarzschild data outside of the ball of radius one.⁹ This allows us to somewhat sidestep the problem of a long range effect of a gravitational field.
Due to the inward bending of the light rays, a solution arising from initial data coinciding with the Schwarzschild data outside of the ball of radius one will be equal to the Schwarzschild solution in the exterior of the Minkowski cone $r = t + 1$. In our subsequent work we hope to be able to prove the stability of Minkowski space in wave coordinates for general data. In addition we hope to show that our method can be also used to treat the problem of small data global existence for the Einstein equations coupled to a scalar field.

2. The Main Results and the Strategy of the Proof

We now formulate the main results of our paper. Our first result is global existence for the Einstein vacuum equations in wave coordinates.

Theorem 2.1. Consider the reduced Einstein vacuum equations¹⁰
$$\widetilde{\Box}_g h_{\mu\nu} = g^{\alpha\beta} \partial^2_{\alpha\beta} h_{\mu\nu} = F_{\mu\nu}(h)(\partial h, \partial h), \quad \forall \mu, \nu = 0, \ldots, 3, \tag{2.1}$$
where $g_{\mu\nu} = m_{\mu\nu} + h_{\mu\nu}$ and the nonlinear term $F$ is as in Lemma 3.2. We assume that the initial data $(g, \partial_t g)|_{t=0} = (g_0, g_1)$ are smooth, the Lorentzian metric is of the form $g_0 = -a^2\, dt^2 + g_{0\,ij}\, dx^i dx^j$ and

1) obey the wave coordinate condition
$$g^{\alpha\alpha'} \partial_\alpha g_{\alpha'\mu} = \frac{1}{2}\, g^{\alpha\alpha'} \partial_\mu g_{\alpha\alpha'}, \quad \forall \mu = 0, \ldots, 3, \tag{2.2}$$

2) satisfy the constraint equations
$$R_0 - |k_0|^2 + (\mathrm{tr}\, k_0)^2 = 0, \qquad \nabla^j k_{0\,ij} - \nabla_i\, \mathrm{tr}\, k_0 = 0, \quad \forall i = 1, \ldots, 3,$$
where $R_0$ is the scalar curvature of the metric $g_{0\,ij}$, and the second fundamental form $(k_0)_{ij} = -\tfrac{1}{2} a^{-1} g_{1\,ij}$.

3) We assume that the metric $(g_0)_{ij}$ coincides with the spatial part of the Schwarzschild metric $g_S$ (in wave coordinates):
$$(g_0)_{ij} = \frac{r + 2M}{r - 2M}\, dr^2 + (r + 2M)^2 (d\theta^2 + \sin^2\theta\, d\phi^2), \quad r > 1$$
and $g_1 = 0$ for $r > 1$. Moreover, we assume that the lapse function $a^2(r) = (r - 2M)/(r + 2M)$ for $r > 1$ and $a(r) = 1$ for $r \leq 1/2$.

4) The data $(h_0, h_1) = (g_0 - m, g_1)$ verify the smallness condition
$$\varepsilon = \sqrt{E_N(0)} + M < \varepsilon_0, \tag{2.3}$$
where $N \geq 10$ and
$$E_N(t) = \sup_{0 \leq \tau \leq t} \sum_{|I| \leq N} \|\partial Z^I h(\tau, \cdot)\|^2_{L^2}. \tag{2.4}$$
Here $Z^I$ is a product of $|I|$ vector fields of the form $\partial_i$, $x_i \partial_j - x_j \partial_i$, $t \partial_i + x_i \partial_t$ and $t \partial_t + x^i \partial_i$.

Then there exists a unique global smooth solution $g$ with the property that for some constant $C_N$,
$$E_N(t) \leq 16 \varepsilon^2 (1 + t)^{2 C_N \varepsilon}, \qquad \|g_{\mu\nu}(t) - m_{\mu\nu}\|_{L^\infty_x} \leq C_N \varepsilon (1 + t)^{-1 + C_N \varepsilon}. \tag{2.5}$$

⁹ Since the initial metric is always of the form $g_{ij} = (1 + 4M/r)\delta_{ij} + o(r^{-1})$ with $M > 0$, data coinciding with the Schwarzschild outside of a compact set is the closest analogue of compactly supported or rapidly decaying data usually considered in small data global existence results for nonlinear wave equations.
¹⁰ In what follows we shall introduce the reduced wave operator $\widetilde{\Box}_g = g^{\alpha\beta} \partial^2_{\alpha\beta}$ and note that in wave coordinates $\widetilde{\Box}_g = \Box_g$, where $\Box_g \phi = |g|^{-1/2} \partial_\alpha \bigl( g^{\alpha\beta} |g|^{1/2} \partial_\beta \phi \bigr)$ is the geometric wave operator.
Remark 2.2. The existence of data satisfying the assumptions of the theorem follows from the work of [Co, C-D], as argued in Sect. 4.

A corollary of the above result is the global stability of Minkowski space for a restricted set of initial data.

Theorem 2.3. Let $(\mathbb{R}^3, g_0, k_0)$ be the initial data set for the Einstein vacuum equations $R_{\mu\nu} = 0$. Assume that relative to some system of coordinates $(x_1, x_2, x_3)$ the metric $g_0$ coincides with the spatial part of the Schwarzschild metric $g_S$ outside the ball of radius one,
$$g_0 = \Bigl( 1 + \frac{M}{r} \Bigr)^4 dx^2, \quad r > 1,$$
while the second fundamental form $k_0$ vanishes for $r > 1$. In addition, we assume that relative to that system of coordinates $g_0$, $M$ and $k_0$ satisfy the smallness condition
$$\sum_{0 \leq |I| \leq N} \|\partial_x^I (g_0 - \delta)\|_{L^2(B_1)} + \sum_{0 \leq |I| \leq N-1} \|\partial_x^I k_0\|_{L^2(B_1)} + M < \varepsilon.$$
Then there exists a future causally geodesically complete¹¹ solution $g$ together with a global system of wave coordinates with the property that the curvature tensor of $g$ relative to these coordinates decays to zero along any future directed causal geodesic.

We now outline the strategy of the proof.

Remark 2.4. Throughout the paper we shall use the notation $A \lesssim B$ for the inequality $A \leq CB$ with some large universal constant $C$. In our estimates we will make no distinction between the tensors $h_{\alpha\beta} = g_{\alpha\beta} - m_{\alpha\beta}$ and $H_{\alpha\beta} = m_{\alpha\alpha'} m_{\beta\beta'} (g^{\alpha'\beta'} - m^{\alpha'\beta'})$, since $H = -h + O(h^2)$ and the terms quadratic in $h$ are lower order.

The continuity argument. For the proof we let $\delta$ be any fixed number $0 < \delta < 1/2$. Let $g$ be a local smooth solution of the reduced Einstein equations (2.1). We start with the weak estimate
$$E_N(t) \leq 64 \varepsilon^2 (1 + t)^{2\delta}. \tag{2.6}$$
By the assumptions of the theorem the estimate (2.6) holds for $t = 0$. Let $[0, T]$ be the largest time interval on which (2.6) still holds. We shall show that if $\varepsilon > 0$ is sufficiently small then on the interval $[0, T]$ the inequality (2.6) implies the same inequality with the constant 64 replaced by 16. It will then follow that the solution and the energy estimate (2.6) can be extended to a larger time interval $[0, T']$ (such an extension is standard for quasilinear wave equations), thus contradicting the maximality of $T$. This will imply that $T = \infty$ and the solution is global. We will in fact prove that for a sufficiently small $\varepsilon$ the stronger estimate (2.5) holds true on the interval $[0, T]$. The global Sobolev inequality of Proposition 9.2 and the weak energy estimate (2.6) imply the pointwise decay estimates:
$$\sum_{|I|\le N-2}|\partial Z^I h(t,x)| \le \frac{C\varepsilon(1+t)^{\delta}}{(1+t+r)(1+|t-r|)^{1/2}}, \qquad r = |x|. \tag{2.7}$$
From the assumption that the constant $\delta < 1/2$ we derive the following weak decay estimates
$$|\partial Z^I h(t,x)| \le C\varepsilon(1+t+r)^{-1/2-\gamma}(1+|t-r|)^{-1/2-\gamma}, \qquad |I| \le N-2, \tag{2.8}$$
with some fixed constant $\gamma > 0$. The weak decay estimates (2.8) will lead to much stronger decay estimates in Theorem 14.1. In turn, using the stronger decay estimates in Theorem 14.1 we will be able to obtain stronger energy estimates in Theorem 15.1, i.e. (2.5). These in particular will enable us to show that the estimate (2.6) holds globally in time and conclude the proof. We remark that in the course of the proof all constants will be independent of $\varepsilon > 0$ but they will depend on a lower bound for $\gamma > 0$ (and hence on an upper bound for $\delta < 1/2$).

As described above, the proof is a direct consequence of three results. The first is the global Sobolev inequality of Proposition 9.2, introduced by S. Klainerman [K1], giving decay estimates in terms of energy estimates for the generators of the Lorentz group. The second ingredient is the improved decay estimates in Theorem 14.1. The final component is the energy estimates in Theorem 15.1 which rely on the improved decay estimates.

11 For the definition see [H-E] and Sect. 16 of this paper.
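The passage from (2.7) to (2.8) can be checked numerically. The sketch below is not taken from the paper: it assumes the particular choice $\gamma = (1/2-\delta)/2$, which is one admissible value because $(1+t)^{\delta} \le (1+t+r)^{\delta}$ and $1+|t-r| \le 1+t+r$, and it drops the common factor $C\varepsilon$. The function names and the sample grid are illustrative only.

```python
def rhs_27(t, r, delta):
    """Right-hand side of (2.7) with the factor C*eps dropped."""
    return (1 + t) ** delta / ((1 + t + r) * (1 + abs(t - r)) ** 0.5)


def rhs_28(t, r, gamma):
    """Right-hand side of (2.8) with the factor C*eps dropped."""
    return (1 + t + r) ** (-0.5 - gamma) * (1 + abs(t - r)) ** (-0.5 - gamma)


def worst_ratio(delta, gamma, grid):
    """Largest value of rhs_27 / rhs_28 over all (t, r) pairs in the grid."""
    return max(
        rhs_27(t, r, delta) / rhs_28(t, r, gamma) for t in grid for r in grid
    )


GRID = [0.0, 0.5, 1.0, 10.0, 1e2, 1e4, 1e6]
DELTA = 0.4
GAMMA = (0.5 - DELTA) / 2  # assumed choice, not stated in the text

# (2.7) implies (2.8) with the same constant when this ratio is at most 1.
print(worst_ratio(DELTA, GAMMA, GRID))
# A larger gamma fails: the ratio grows like a power of t along r = 0.
print(worst_ratio(DELTA, 0.2, GRID))
```

With this choice the ratio never exceeds one on the grid, while a value of $\gamma$ above $(1/2-\delta)/2$ makes it grow in $t$, which is consistent with the remark that the constants depend on a lower bound for $\gamma$.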
50
H. Lindblad, I. Rodnianski
Weak decay estimates. As pointed out above we may start by assuming the weak decay estimate (2.8). Furthermore, since the solution $g = m + h$ coincides with the Schwarzschild solution of mass $M \le \varepsilon$ in the region $r \ge t+1$, we have
$$|Z^I h(t,x)| \lesssim \varepsilon(1+r+t)^{-1}, \qquad \text{when } |x| = t+1. \tag{2.9}$$
Hence integrating (2.8) from the light cone, where (2.9) holds, we get
$$|Z^I h(t,x)| \lesssim \varepsilon(1+r+t)^{-1/2-\gamma}(1+|t-r|)^{1/2-\gamma}. \tag{2.10}$$
Since the vector fields span the tangent space of the outgoing light cones $r - t = q$ we infer, with $\bar\partial$ denoting the derivatives tangential to the cones, that
$$|\bar\partial Z^I h| \lesssim \varepsilon(1+r+t)^{-3/2-\gamma}(1+|t-r|)^{1/2-\gamma}. \tag{2.11}$$
This means that, close to the light cone $t = r$, derivatives tangential to the forward light cones decay quite a bit better than the expected decay rate from (2.8) for a generic derivative.

Wave coordinate condition. As we shall see below certain components of the tensor $h$ decay faster than others. This can be seen upon introduction of a null frame of vector fields $L = \partial_t + \partial_r$, $\underline{L} = \partial_t - \partial_r$ and $S_1, S_2$: two orthonormal vectors tangential to the sphere of radius $r$ in $\mathbb{R}^3$. The first improved estimates come from the wave coordinate condition (2.2). Writing $g_{\alpha\beta} = m_{\alpha\beta} + h_{\alpha\beta}$, we obtain from (2.2) that
$$m^{\alpha\beta}\partial_\alpha h_{\beta\mu} = \tfrac12\,\partial_\mu\big(m^{\alpha\beta}h_{\alpha\beta}\big) + O(h\,\partial h).$$
In particular, contracting with a vector field $T \in \mathcal{T} = \{L, S_1, S_2\}$ and using that for any symmetric 2-tensor $k$, $m^{\alpha\beta}k_{\alpha\beta} = -k_{L\underline{L}} + \delta^{AB}k_{AB}$, implies that we can express the transversal derivative $\partial_{\underline{L}}$ of certain components of $h$ in terms of the tangential derivatives that decay better and a quadratic term:
$$|(\partial h)_{LT}| \le |\bar\partial h| + |h||\partial h| \lesssim \varepsilon(1+t+r)^{-1-2\gamma}, \qquad |h_{LT}| \lesssim \varepsilon(1+|t-r|)(1+t+r)^{-1}.$$
Even though the estimate above does not give a better decay rate for all components of $h$, it gives the decay exactly for those components which, as it turns out, control the geometry, i.e., they lead to stronger energy and decay estimates. The above estimates will be sufficient to obtain improved estimates for the lowest order energy of $h$. However, in order to get estimates for the energy of $Z^I h$ we commute the vector fields $Z$ through the equation for $h$. This generates additional commutator terms. The main commutator terms are controlled with the help of the following additional estimate from the wave coordinate condition:
$$|(\partial h)_{LT}| + |(\partial Zh)_{LL}| \le \varepsilon(1+t+r)^{-1-2\gamma}, \qquad |h_{LT}| + |(Zh)_{LL}| \le \varepsilon(1+|t-r|)(1+t+r)^{-1}. \tag{2.12}$$
We now describe the derivation of the stronger decay and energy estimates.

Stronger decay estimates. We rely on the following decay estimate for the wave equation on a curved background:$^{12}$
$$\|(1+t+r)\,\partial\phi(t,\cdot)\|_{L^\infty} \le C\int_0^t (1+\tau)\,\|\widetilde{\Box}_g\phi(\tau,\cdot)\|_{L^\infty}\,d\tau + C\sup_{0\le\tau\le t}\sum_{|I|\le 1}\|Z^I\phi(\tau,\cdot)\|_{L^\infty} + C\int_0^t\sum_{|I|\le 2}(1+\tau)^{-1}\|Z^I\phi(\tau,\cdot)\|_{L^\infty}\,d\tau. \tag{2.13}$$

12 Recall that the reduced wave operator $\widetilde{\Box}_g = g^{\alpha\beta}\partial^2_{\alpha\beta}$.

The estimate (2.13) will be applied to the components of the tensor $h$. The term $Z^I h$ on the right-hand side of the estimate will be controlled with the help of the weak decay estimates, and thus the decay rate of $h$ will be determined in terms of decay of $\widetilde{\Box}_g h$. The $L^\infty$-$L^\infty$ estimate (2.13) does not rely on the fundamental solution as does the more standard $L^1$-$L^\infty$ type estimate. This estimate was used in [L1] in the constant coefficient case and here we establish it in the variable coefficient case only under the assumption of the weak decay of all of the components of the metric $g$ and the stronger decay of the components of $g$ controlled by the wave coordinate condition. This analysis is by itself very interesting but we will not go into it here and just refer the reader to the following sections.

We now analyze the inhomogeneous term in the equation for $h_{\mu\nu}$. The tensor $h_{\mu\nu} = g_{\mu\nu} - m_{\mu\nu}$ verifies the reduced Einstein equations of the form:
$$\widetilde{\Box}_g h_{\mu\nu} = F_{\mu\nu}(h)(\partial h,\partial h), \tag{2.14}$$
$$F_{\mu\nu}(h)(\partial h,\partial h) = P(\partial_\mu h,\partial_\nu h) + Q_{\mu\nu}(\partial h,\partial h) + G_{\mu\nu}(h)(\partial h,\partial h),$$
$$P(\partial_\mu h,\partial_\nu h) = \frac14\,\partial_\mu \mathrm{tr}\,h\;\partial_\nu \mathrm{tr}\,h - \frac12\,\partial_\mu h^{\alpha\beta}\,\partial_\nu h_{\alpha\beta}. \tag{2.15}$$
Here $Q_{\mu\nu}$ are linear combinations of the standard null-forms and $G_{\mu\nu}(h)(\partial h,\partial h)$ is a quadratic form in $\partial h$ with coefficients that are smooth functions of $h$ vanishing at $h = 0$. The weak decay estimates imply that the last two terms decay fast:
$$|Q_{\mu\nu}(\partial h,\partial h)| + |G_{\mu\nu}(h)(\partial h,\partial h)| \lesssim |\bar\partial h|\,|\partial h| + |h||\partial h|^2 \lesssim \varepsilon^2(1+r+t)^{-2-2\gamma}(1+|t-r|)^{-2\gamma}. \tag{2.16}$$
The problematic term is $P(\partial_\mu h,\partial_\nu h)$ since a priori the weak decay estimates only give the decay rate of $\varepsilon^2(1+r+t)^{-1-2\gamma}(1+|t-r|)^{-1-2\gamma}$, which is not sufficient in the wave zone $t \approx r$. The crucial improvement comes as a result of a decomposition of the tensor $P(\partial_\mu h,\partial_\nu h)$ with respect to a null frame $\{L,\underline{L},S_1,S_2\}$. Let $T \in \mathcal{T} = \{L,S_1,S_2\}$ be any of the vectors generating the tangent space to the forward Minkowski light cones and $U \in \mathcal{U} = \{L,\underline{L},S_1,S_2\}$ denote any of the null frame vectors. Define, for an arbitrary symmetric two tensor $k$, $|k|_{\mathcal{TU}} = \sum_{T\in\mathcal{T},\,U\in\mathcal{U}}|T^\mu U^\nu k_{\mu\nu}|$. It then follows that
$$|P(\partial h,\partial h)|_{\mathcal{TU}} \lesssim |\bar\partial h|\,|\partial h| \lesssim \varepsilon^2(1+r+t)^{-2-2\gamma}(1+|t-r|)^{-2\gamma}. \tag{2.17}$$
On the other hand, the absolute value of the tensor $P(\partial h,\partial h)$ obeys the estimate
$$|P(\partial h,\partial h)| \lesssim |\partial h|^2_{\mathcal{TU}} + |\partial h|_{LL}\,|\partial h|. \tag{2.18}$$
We now decompose the system of equations for $h$ with respect to the null-frame:
$$|\widetilde{\Box}_g h|_{\mathcal{TU}} \lesssim \varepsilon^2(1+r+t)^{-2-2\gamma}(1+|t-r|)^{-2\gamma}, \tag{2.19}$$
$$|\widetilde{\Box}_g h|_{\mathcal{UU}} \lesssim |\partial h|^2_{\mathcal{TU}} + \varepsilon^2(1+r+t)^{-2-2\gamma}(1+|t-r|)^{-2\gamma}, \tag{2.20}$$
where in the last inequality we also used the improved decay estimate for $\partial h_{LL}$ obtained from the wave coordinate condition. The result is a system of equations where $\widetilde{\Box}_g h_{TU}$ have very good decay properties, while $\widetilde{\Box}_g h_{UU}$ for the remaining non-tangential component depends, to the highest order, only on the components $h_{TU}$ satisfying a better equation. An additional subtlety in the above analysis is the fact that contraction with the null frame does not commute with $\widetilde{\Box}_g$ (or even with $\Box$). However, the decay estimate (2.13) for the wave equation only uses the principal radial part of $\Box$: $\partial_t^2 - \partial_r^2 - 2r^{-1}\partial_r$, which respects the null frame. This analysis results in the improved decay estimates
$$|\partial h|_{\mathcal{TU}} \le C\varepsilon(1+t)^{-1}, \qquad |\partial h| \le C\varepsilon(1+t)^{-1}\ln(2+t). \tag{2.21}$$
The energy estimates. We rely on the following energy estimate for the wave equation, which holds under the assumption that the above decay estimates hold for the background metric $g$: for any $\gamma > 0$,
$$\int_{\Sigma_T}|\partial\phi|^2 + \gamma\int_0^T\!\!\int_{\Sigma_\tau}\frac{|\bar\partial\phi|^2}{(1+|t-r|)^{1+2\gamma}} \le 8\int_{\Sigma_0}|\partial\phi|^2 + C\varepsilon\int_0^T\!\!\int_{\Sigma_t}\frac{|\partial\phi|^2}{1+t} + 16\int_0^T\!\!\int_{\Sigma_t}|\widetilde{\Box}_g\phi|\,|\partial_t\phi|. \tag{2.22}$$
This implies that the energy of a solution of the homogeneous wave equation $\widetilde{\Box}_g\phi = 0$ grows, but at the rate of at most $(1+t)^{C\varepsilon}$. The presence of an additional space-time integral containing tangential derivatives on the left-hand side of (2.22) is crucial for our analysis. This type of estimate in the constant coefficient case basically follows by averaging of the energy estimates on light cones used e.g. in [S1]. We also note that energy estimates with space-time quantities involving special derivatives of a solution were also considered and used in the work of Alinhac, see e.g. [A2, A3]. In our work we use the space-time integral with derivatives spanning the tangent space to outgoing light cones and weights dependent on the distance to the Minkowski light cone $r = t+1$. We emphasize that the energy estimate (2.22) is proved only under the assumption of the weak decay of all components of the background metric $g$ together with the strong decay of the components controlled from the wave coordinate condition. It is worth noting that a combination of the energy estimates of the type (2.22) and Klainerman-Sobolev inequalities would also yield a very simple proof of the small data global existence result for semilinear equations $\Box\phi = Q(\partial\phi,\partial\phi)$ obeying the standard null condition. This fact appears to be previously unknown.

The energy estimate (2.22) will be applied simultaneously to all components of the tensor $h$. As in Eqs. (2.19), (2.20) the inhomogeneous term obeys the following estimate:
$$|\widetilde{\Box}_g h| \lesssim \varepsilon(1+r+t)^{-3/2-\gamma}(1+|t-r|)^{1/2-\gamma}|\partial h| + \varepsilon(1+t)^{-1}|\partial h|,$$
where in the last inequality we used the improved decay estimate for the $|\partial h|_{\mathcal{TU}}$ components. The energy estimate (11.3) will then imply that $E_0(t) \le 16\varepsilon^2(1+t)^{C\varepsilon}$.
Higher order energy estimates. In addition to the energy estimates for the components of the tensor $h$ we need estimates for the higher vector field derivatives of $h$: $Z^I h$ with Minkowski vector fields $Z = \{\partial_\alpha, \Omega_{\alpha\beta}, S\}$. To obtain these estimates we apply $Z^I$ to the equation $\widetilde{\Box}_g h_{\mu\nu} = F_{\mu\nu}$ for $h$. Applying vector fields to the nonlinear terms $F_{\mu\nu}$ yields similar nonlinear terms for higher derivatives and these can be dealt with using the estimates already described above. We must note however that this is where the additional space-time integral involving the tangential derivatives on the left-hand side of the energy estimate (11.3) becomes crucial. Consider for example the term $\partial h\cdot\bar\partial Z^I h$ generated by one of the null forms in $F_{\mu\nu}$. We estimate its contribution, with the help of the weak decay estimates, to the energy estimate as follows:
$$|\partial h|\,|\bar\partial Z^I h|\,|\partial_t Z^I h| \le \frac{C\varepsilon\,|\partial_t Z^I h|\,|\bar\partial Z^I h|}{(1+t)^{1/2+\gamma}(1+|t-r|)^{1/2+\gamma}} \le \frac{C\varepsilon|\partial_t Z^I h|^2}{(1+t)^{1+2\gamma}} + \frac{C\varepsilon|\bar\partial Z^I h|^2}{(1+|t-r|)^{1+2\gamma}}.$$
The integral of the first term is easily controlled by the energy on time slices times an integrable factor in time. The space-time integral of the second term is in fact part of the energy (2.22), and if we choose $\varepsilon$ sufficiently small this term can be absorbed by the space-time integral on the left. The idea with the space-time integral is that one can use the extra decay in $|t-r|$ when one does not have full decay in $t$. The more serious problem in higher order energy estimates lies however in the commutators between $Z^I$ and the principal part $\widetilde{\Box}_g = g^{\alpha\beta}\partial_\alpha\partial_\beta$.
The commutators. Writing $g^{\alpha\beta} = m^{\alpha\beta} + H^{\alpha\beta}$ with $H^{\alpha\beta} = -m^{\alpha\alpha'}m^{\beta\beta'}h_{\alpha'\beta'} + O(h^2)$, we show the following commutator estimate$^{13}$:
$$\big|[Z,\widetilde{\Box}_g]\phi\big| \le C\Big(\frac{|H| + |ZH|}{1+t+r} + \frac{|ZH|_{LL} + |H|_{L\mathcal{T}}}{1+|t-r|}\Big)\sum_{|I|\le 1}|\partial Z^I\phi| \le \frac{C\varepsilon}{1+t+r}\sum_{|I|\le 1}|\partial Z^I\phi| \tag{2.23}$$
by the weak decay assumptions (2.10) and the improved decay from the wave coordinate condition (2.12). We should note that for a generic quasilinear wave equation commutators with Minkowski vector fields $Z$ give rise to uncontrollable error terms. In the special case of the equation $\Box\phi = \phi\,\Delta\phi$ this problem can be overcome by modifying the vector fields $Z$, [A3]. In our case it is the wave coordinate gauge that provides additional cancellations. This commutator estimate applied to $\phi = h_{\alpha\beta}$ together with the analysis in the previous section now gives estimates for the energy $E_1$ as well as for the stronger decay estimates for the second derivatives of $h$, (2.26) with $|J| = 1$. This commutator will also show up as a top order term $[Z,\widetilde{\Box}_g]\cdot Z^{I-1}h_{\alpha\beta}$ in the energy estimate for $Z^I h$ and the resulting term can be dealt with in the same way.

13 This commutator estimate applies to the vector fields $Z = \{\partial_\alpha, \Omega_{\alpha\beta}\}$. For the scaling vector field $Z = S = x^\alpha\partial_\alpha$ the commutator expression should have the form $\widetilde{\Box}_g S - (S+2)\widetilde{\Box}_g$.
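The other top order term generated by the commutators $[Z^I,\widetilde{\Box}_g]\phi$ is of the form $(Z^I H^{\alpha\beta})\partial_\alpha\partial_\beta\phi$. We first apply the pointwise estimate
$$|(Z^I H^{\alpha\beta})\partial_\alpha\partial_\beta\phi| \le C\Big(\frac{|Z^I H|}{1+t+r} + \frac{|Z^I H|_{LL}}{1+|t-r|}\Big)\sum_{|K|\le 1}|\partial Z^K\phi|.$$
To deal with its contribution to the energy estimate we use the Poincaré estimate with a boundary term
$$\int_{\mathbb{R}^3}\frac{|Z^I H|^2_{LL}\,dx}{(1+|t-r|)^{2+2\sigma}} \le C\int_{S(t+1)}|Z^I H|^2_{LL}\,dS + C\int_{\mathbb{R}^3}\frac{|\partial_r Z^I H|^2_{LL}\,dx}{(1+|t-r|)^{2\sigma}}, \qquad \sigma > -1/2,\quad \sigma \ne 1/2, \tag{2.24}$$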
together with the fact that $h$ is Schwarzschild outside the cone $r = t+1$, because of the inward bending of the Schwarzschild light cones, and hence there $|Z^I h| \le C\varepsilon/(1+t)$. The wave coordinate condition implies that $|\partial Z^I H|_{LL}$ can be estimated by $|\bar\partial Z^I H|$ and lower order terms. The term involving $|\bar\partial Z^I H|$ is then controlled by the space-time integral on the left-hand side. One can use a similar but more trivial argument for decay estimates, i.e. $|Z^I H|_{LL} \le |Z^I H_{LL}|_{r=t+1} + (1+|t-r|)\,|\partial_r Z^I H_{LL}|_{L^\infty}$.

The lower order terms. So far we have only discussed the top order terms, but there will also be several lower order terms (relative to $|I| = k+1$) to deal with. These are typically of the form
$$|\partial Z^J h|\,|\partial Z^K h| \qquad\text{or}\qquad |Z^J h|\,|\partial^2 Z^{K-1}h| \le C\,\frac{|Z^J h|\,|\partial Z^K h|}{1+|t-r|} \tag{2.25}$$
with $|J|, |K| < |I| = k+1$. The lower order terms are dealt with using induction. We describe the induction argument for the decay estimates. From this it will be clear how it also proceeds for the energy estimates. We will inductively assume that we have the bounds:
$$|\partial Z^J h| + |Z^J h|(1+|t-r|)^{-1} \le C_k\varepsilon\, t^{-1+C_k\varepsilon}, \qquad |J| \le k. \tag{2.26}$$
The terms in (2.25) can then be estimated by $C_k^2\varepsilon^2 t^{-2+2C_k\varepsilon}$. Including the top order terms using (2.23) applied to $\phi = Z^{I-1}h$, and using (2.13) applied to $\widetilde{\Box}_g Z^I h$ we get an inequality of the form
$$M(t) \le \int_0^t\Big(\frac{C\varepsilon}{1+s}\,M(s) + \frac{C\varepsilon^2}{(1+s)^{1-C\varepsilon}}\Big)\,ds, \tag{2.27}$$
where $M(t) = (1+t)\,\|\partial Z^I h(t,\cdot)\|_{L^\infty}$. Gronwall's inequality then gives the bound $M(t) \le C(1+t)^{2C\varepsilon}$.
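The Gronwall step can be illustrated on the equality case of (2.27). The sketch below is an illustration under assumptions, not the paper's argument: it takes the two constants $C$ in (2.27) to be the same number, writes $a = C\varepsilon$ and $b = C\varepsilon^2$, and integrates $M'(s) = aM/(1+s) + b(1+s)^{a-1}$, $M(0) = 0$, in the variable $u = \ln(1+s)$ with a hand-written Runge-Kutta step. The closed form $M = b\,u\,e^{au}$ together with $u \le e^{au}/(ae)$ gives $M(t) \le \frac{b}{ae}(1+t)^{2a}$, which has the shape $C(1+t)^{2C\varepsilon}$. The numerical values of `C` and `eps` are arbitrary.

```python
import math


def rk4_equality_case(u_end, a, b, steps=2000):
    """Solve dM/du = a*M + b*exp(a*u), M(0) = 0, on [0, u_end] with RK4.

    This is the equality case of (2.27) after the substitution u = ln(1+s).
    """
    def rhs(u, m):
        return a * m + b * math.exp(a * u)

    h = u_end / steps
    u, m = 0.0, 0.0
    for _ in range(steps):
        k1 = rhs(u, m)
        k2 = rhs(u + h / 2, m + h * k1 / 2)
        k3 = rhs(u + h / 2, m + h * k2 / 2)
        k4 = rhs(u + h, m + h * k3)
        m += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        u += h
    return m


C, eps = 5.0, 0.01          # arbitrary illustrative values
a, b = C * eps, C * eps ** 2
t = 1.0e6
u_end = math.log(1 + t)

numeric = rk4_equality_case(u_end, a, b)
closed = b * u_end * math.exp(a * u_end)          # exact solution b*u*e^{a u}
bound = b / (a * math.e) * (1 + t) ** (2 * a)     # (b/(a e)) (1+t)^{2a}

print(numeric, closed, bound)
```

The numerical solution agrees with the closed form and stays below the bound, so the loss in the exponent from $C\varepsilon$ to $2C\varepsilon$ absorbs the logarithm produced by the forcing term.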
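3. The Einstein Equations in Wave Coordinates

For a Lorentzian metric $g_{\mu\nu}$, where $\mu,\nu = 0,\dots,3$, we denote
$$\Gamma_{\mu\ \nu}^{\ \lambda} = \frac12\,g^{\lambda\delta}\big(\partial_\mu g_{\delta\nu} + \partial_\nu g_{\delta\mu} - \partial_\delta g_{\mu\nu}\big), \tag{3.1}$$
the Christoffel symbols of $g$ and
$$R_{\mu\ \nu\delta}^{\ \lambda} = \partial_\delta\Gamma_{\mu\ \nu}^{\ \lambda} - \partial_\nu\Gamma_{\mu\ \delta}^{\ \lambda} + \Gamma_{\rho\ \delta}^{\ \lambda}\Gamma_{\mu\ \nu}^{\ \rho} - \Gamma_{\rho\ \nu}^{\ \lambda}\Gamma_{\mu\ \delta}^{\ \rho}, \tag{3.2}$$
its Riemann curvature tensor with $R_{\mu\nu} = R_{\mu\ \nu\alpha}^{\ \alpha}$, the Ricci tensor. We consider the metric $g$ satisfying the Einstein vacuum equations
$$R_{\mu\nu} = 0. \tag{3.3}$$
We impose the wave coordinate condition:
$$\Gamma^\lambda := g^{\alpha\beta}\Gamma_{\alpha\ \beta}^{\ \lambda} = 0. \tag{3.4}$$
It follows that assuming (3.4) we have that the reduced wave operator $\widetilde{\Box}_g = g^{\alpha\beta}\partial_\alpha\partial_\beta$ satisfies
$$\widetilde{\Box}_g = \Box_g = \frac{1}{\sqrt{|g|}}\,\partial_\alpha\,g^{\alpha\beta}\sqrt{|g|}\,\partial_\beta. \tag{3.5}$$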
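The following lemma provides the description of the Einstein vacuum equations in wave coordinates as a system of quasilinear wave equations for $g_{\mu\nu}$.

Lemma 3.1. Let the metric $g$ satisfy the Einstein vacuum equations (3.3) together with the wave coordinate condition (3.4). Then $g_{\mu\nu}$ solves the following system of reduced Einstein equations:
$$\widetilde{\Box}_g g_{\mu\nu} = \widetilde{P}(\partial_\mu g,\partial_\nu g) + \widetilde{Q}_{\mu\nu}(\partial g,\partial g), \tag{3.6}$$
where
$$\widetilde{P}(\partial_\mu g,\partial_\nu g) = \frac14\,g^{\alpha\alpha'}\partial_\mu g_{\alpha\alpha'}\,g^{\beta\beta'}\partial_\nu g_{\beta\beta'} - \frac12\,g^{\alpha\alpha'}g^{\beta\beta'}\partial_\mu g_{\alpha\beta}\,\partial_\nu g_{\alpha'\beta'}, \tag{3.7}$$
$$\begin{aligned}
\widetilde{Q}_{\mu\nu}(\partial g,\partial g) ={}& \partial_\alpha g_{\beta\mu}\,g^{\alpha\alpha'}g^{\beta\beta'}\partial_{\alpha'}g_{\beta'\nu} - g^{\alpha\alpha'}g^{\beta\beta'}\big(\partial_\alpha g_{\beta\mu}\,\partial_{\beta'}g_{\alpha'\nu} - \partial_{\beta'}g_{\beta\mu}\,\partial_\alpha g_{\alpha'\nu}\big)\\
&+ g^{\alpha\alpha'}g^{\beta\beta'}\big(\partial_\mu g_{\alpha'\beta'}\,\partial_\alpha g_{\beta\nu} - \partial_\alpha g_{\alpha'\beta'}\,\partial_\mu g_{\beta\nu}\big)\\
&+ g^{\alpha\alpha'}g^{\beta\beta'}\big(\partial_\nu g_{\alpha'\beta'}\,\partial_\alpha g_{\beta\mu} - \partial_\alpha g_{\alpha'\beta'}\,\partial_\nu g_{\beta\mu}\big)\\
&+ \frac12\,g^{\alpha\alpha'}g^{\beta\beta'}\big(\partial_{\beta'}g_{\alpha\alpha'}\,\partial_\mu g_{\beta\nu} - \partial_\mu g_{\alpha\alpha'}\,\partial_{\beta'}g_{\beta\nu}\big)\\
&+ \frac12\,g^{\alpha\alpha'}g^{\beta\beta'}\big(\partial_{\beta'}g_{\alpha\alpha'}\,\partial_\nu g_{\beta\mu} - \partial_\nu g_{\alpha\alpha'}\,\partial_{\beta'}g_{\beta\mu}\big).
\end{aligned} \tag{3.8}$$
Furthermore, the wave coordinate condition (3.4) reads
$$g^{\alpha\beta}\partial_\alpha g_{\beta\mu} = \frac12\,g^{\alpha\beta}\partial_\mu g_{\alpha\beta}, \qquad\text{or}\qquad \partial_\alpha g^{\alpha\nu} = \frac12\,g_{\alpha\beta}\,g^{\nu\mu}\partial_\mu g^{\alpha\beta}. \tag{3.9}$$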
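Proof. The proof of (3.9) is immediate. We now observe that
$$\partial_\alpha g_{\beta\mu} = \Gamma_{\alpha\beta\mu} + \Gamma_{\alpha\mu\beta}, \qquad\text{where}\qquad \Gamma_{\mu\alpha\nu} = g_{\alpha\lambda}\Gamma_{\mu\ \nu}^{\ \lambda}.$$
It follows that $g_{\alpha\lambda}\,\partial_\beta\Gamma_{\mu\ \nu}^{\ \lambda} = \partial_\beta\Gamma_{\mu\alpha\nu} - (\Gamma_{\beta\alpha\lambda} + \Gamma_{\beta\lambda\alpha})\Gamma_{\mu\ \nu}^{\ \lambda}$, so also using that $\Gamma_{\alpha\lambda\beta} = \Gamma_{\beta\lambda\alpha}$ we obtain
$$R_{\mu\alpha\nu\beta} = g_{\alpha\lambda}R_{\mu\ \nu\beta}^{\ \lambda} = \partial_\beta\Gamma_{\mu\alpha\nu} - \partial_\nu\Gamma_{\mu\alpha\beta} + \Gamma_{\nu\lambda\alpha}\Gamma_{\mu\ \beta}^{\ \lambda} - \Gamma_{\alpha\lambda\beta}\Gamma_{\mu\ \nu}^{\ \lambda}. \tag{3.10}$$
It follows from (3.9) that
$$g^{\alpha\beta}\Big(\partial_\mu\partial_\alpha g_{\beta\nu} - \frac12\,\partial_\mu\partial_\nu g_{\alpha\beta}\Big) = -\partial_\mu g^{\alpha\beta}\Big(\partial_\alpha g_{\beta\nu} - \frac12\,\partial_\nu g_{\alpha\beta}\Big) = g^{\alpha\alpha'}g^{\beta\beta'}\partial_\mu g_{\alpha'\beta'}\Big(\partial_\alpha g_{\beta\nu} - \frac12\,\partial_\nu g_{\alpha\beta}\Big), \tag{3.11}$$
and hence
$$\begin{aligned}
g^{\alpha\beta}\big(\partial_\alpha\Gamma_{\mu\beta\nu} - \partial_\nu\Gamma_{\mu\beta\alpha}\big) ={}& \frac{g^{\alpha\beta}}{2}\big(\partial_\alpha\partial_\mu g_{\beta\nu} + \partial_\alpha\partial_\nu g_{\beta\mu} - \partial_\alpha\partial_\beta g_{\mu\nu}\big) - \frac{g^{\alpha\beta}}{2}\big(\partial_\nu\partial_\mu g_{\beta\alpha} + \partial_\nu\partial_\alpha g_{\beta\mu} - \partial_\nu\partial_\beta g_{\mu\alpha}\big)\\
={}& -\frac{g^{\alpha\beta}}{2}\,\partial_\alpha\partial_\beta g_{\mu\nu} + \frac{g^{\alpha\beta}}{2}\big(\partial_\alpha\partial_\mu g_{\beta\nu} + \partial_\nu\partial_\beta g_{\mu\alpha} - \partial_\nu\partial_\mu g_{\beta\alpha}\big)\\
={}& -\frac12\,g^{\alpha\beta}\partial_\alpha\partial_\beta g_{\mu\nu} + \frac12\,g^{\alpha\alpha'}g^{\beta\beta'}\big(\partial_\mu g_{\alpha'\beta'}\,\partial_\alpha g_{\beta\nu} + \partial_\nu g_{\alpha'\beta'}\,\partial_\alpha g_{\beta\mu} - \partial_\nu g_{\alpha'\beta'}\,\partial_\mu g_{\alpha\beta}\big).
\end{aligned} \tag{3.12}$$
Here by (3.9) we can write
$$\begin{aligned}
g^{\alpha\alpha'}g^{\beta\beta'}\partial_\mu g_{\alpha'\beta'}\,\partial_\alpha g_{\beta\nu} ={}& g^{\alpha\alpha'}g^{\beta\beta'}\partial_\alpha g_{\alpha'\beta'}\,\partial_\mu g_{\beta\nu} + g^{\alpha\alpha'}g^{\beta\beta'}\big(\partial_\mu g_{\alpha'\beta'}\,\partial_\alpha g_{\beta\nu} - \partial_\alpha g_{\alpha'\beta'}\,\partial_\mu g_{\beta\nu}\big)\\
={}& \frac12\,g^{\alpha\alpha'}g^{\beta\beta'}\partial_{\beta'}g_{\alpha'\alpha}\,\partial_\mu g_{\beta\nu} + g^{\alpha\alpha'}g^{\beta\beta'}\big(\partial_\mu g_{\alpha'\beta'}\,\partial_\alpha g_{\beta\nu} - \partial_\alpha g_{\alpha'\beta'}\,\partial_\mu g_{\beta\nu}\big)\\
={}& \frac12\,g^{\alpha\alpha'}g^{\beta\beta'}\partial_\mu g_{\alpha'\alpha}\,\partial_{\beta'}g_{\beta\nu}\\
&+ g^{\alpha\alpha'}g^{\beta\beta'}\Big(\frac12\big(\partial_{\beta'}g_{\alpha'\alpha}\,\partial_\mu g_{\beta\nu} - \partial_\mu g_{\alpha'\alpha}\,\partial_{\beta'}g_{\beta\nu}\big) + \partial_\mu g_{\alpha'\beta'}\,\partial_\alpha g_{\beta\nu} - \partial_\alpha g_{\alpha'\beta'}\,\partial_\mu g_{\beta\nu}\Big)\\
={}& \frac14\,g^{\alpha\alpha'}g^{\beta\beta'}\partial_\mu g_{\alpha'\alpha}\,\partial_\nu g_{\beta\beta'}\\
&+ g^{\alpha\alpha'}g^{\beta\beta'}\Big(\frac12\big(\partial_{\beta'}g_{\alpha'\alpha}\,\partial_\mu g_{\beta\nu} - \partial_\mu g_{\alpha'\alpha}\,\partial_{\beta'}g_{\beta\nu}\big) + \partial_\mu g_{\alpha'\beta'}\,\partial_\alpha g_{\beta\nu} - \partial_\alpha g_{\alpha'\beta'}\,\partial_\mu g_{\beta\nu}\Big).
\end{aligned} \tag{3.13}$$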
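Hence by (3.13) and (3.13) with $\mu$ and $\nu$ interchanged we get
$$\begin{aligned}
\frac12\,g^{\alpha\alpha'}g^{\beta\beta'}&\big(\partial_\mu g_{\alpha'\beta'}\,\partial_\alpha g_{\beta\nu} + \partial_\nu g_{\alpha'\beta'}\,\partial_\alpha g_{\beta\mu} - \partial_\nu g_{\alpha'\beta'}\,\partial_\mu g_{\alpha\beta}\big)\\
={}& g^{\alpha\alpha'}g^{\beta\beta'}\Big(\frac14\,\partial_\mu g_{\alpha'\alpha}\,\partial_\nu g_{\beta\beta'} - \frac12\,\partial_\nu g_{\alpha'\beta'}\,\partial_\mu g_{\alpha\beta}\Big)\\
&+ \frac12\,g^{\alpha\alpha'}g^{\beta\beta'}\big(\partial_\mu g_{\alpha'\beta'}\,\partial_\alpha g_{\beta\nu} - \partial_\alpha g_{\alpha'\beta'}\,\partial_\mu g_{\beta\nu} + \partial_\nu g_{\alpha'\beta'}\,\partial_\alpha g_{\beta\mu} - \partial_\alpha g_{\alpha'\beta'}\,\partial_\nu g_{\beta\mu}\big)\\
&+ \frac14\,g^{\alpha\alpha'}g^{\beta\beta'}\big(\partial_{\beta'}g_{\alpha'\alpha}\,\partial_\mu g_{\beta\nu} - \partial_\mu g_{\alpha'\alpha}\,\partial_{\beta'}g_{\beta\nu} + \partial_{\beta'}g_{\alpha'\alpha}\,\partial_\nu g_{\beta\mu} - \partial_\nu g_{\alpha'\alpha}\,\partial_{\beta'}g_{\beta\mu}\big).
\end{aligned} \tag{3.14}$$
On the other hand
$$\begin{aligned}
\Gamma_{\nu}^{\ \alpha\beta}\,\Gamma_{\mu\alpha\beta} ={}& \frac14\big(\partial_\nu g_{\beta\alpha} + \partial_\beta g_{\alpha\nu} - \partial_\alpha g_{\beta\nu}\big)\,g^{\alpha\alpha'}g^{\beta\beta'}\big(\partial_\mu g_{\beta'\alpha'} + \partial_{\beta'}g_{\alpha'\mu} - \partial_{\alpha'}g_{\beta'\mu}\big)\\
={}& \frac14\,\partial_\nu g_{\alpha\beta}\,g^{\alpha\alpha'}g^{\beta\beta'}\partial_\mu g_{\alpha'\beta'} + \frac12\,\partial_\alpha g_{\beta\mu}\,g^{\alpha\alpha'}g^{\beta\beta'}\partial_{\alpha'}g_{\beta'\nu} - \frac12\,\partial_\alpha g_{\beta\mu}\,g^{\alpha\alpha'}g^{\beta\beta'}\partial_{\beta'}g_{\alpha'\nu}\\
={}& g^{\alpha\alpha'}g^{\beta\beta'}\Big(\frac14\,\partial_\nu g_{\alpha\beta}\,\partial_\mu g_{\alpha'\beta'} + \frac12\,\partial_\alpha g_{\beta\mu}\,\partial_{\alpha'}g_{\beta'\nu} - \frac12\,\partial_{\beta'}g_{\beta\mu}\,\partial_\alpha g_{\alpha'\nu}\Big)\\
&- \frac12\,g^{\alpha\alpha'}g^{\beta\beta'}\big(\partial_\alpha g_{\beta\mu}\,\partial_{\beta'}g_{\alpha'\nu} - \partial_{\beta'}g_{\beta\mu}\,\partial_\alpha g_{\alpha'\nu}\big)\\
={}& g^{\alpha\alpha'}g^{\beta\beta'}\Big(\frac14\,\partial_\nu g_{\alpha\beta}\,\partial_\mu g_{\alpha'\beta'} - \frac18\,\partial_\mu g_{\beta\beta'}\,\partial_\nu g_{\alpha\alpha'} + \frac12\,\partial_\alpha g_{\beta\mu}\,\partial_{\alpha'}g_{\beta'\nu}\Big)\\
&- \frac12\,g^{\alpha\alpha'}g^{\beta\beta'}\big(\partial_\alpha g_{\beta\mu}\,\partial_{\beta'}g_{\alpha'\nu} - \partial_{\beta'}g_{\beta\mu}\,\partial_\alpha g_{\alpha'\nu}\big),
\end{aligned} \tag{3.15}$$
where the last equality follows from (3.9). Taking the trace of (3.10) and using (3.12), (3.4) we obtain
$$R_{\mu\nu} = -\frac12\,g^{\alpha\beta}\partial_\alpha\partial_\beta g_{\mu\nu} + \Gamma_{\nu}^{\ \alpha\beta}\,\Gamma_{\mu\alpha\beta} + \frac12\,g^{\alpha\alpha'}g^{\beta\beta'}\big(\partial_\mu g_{\alpha'\beta'}\,\partial_\alpha g_{\beta\nu} + \partial_\nu g_{\alpha'\beta'}\,\partial_\alpha g_{\beta\mu} - \partial_\nu g_{\alpha'\beta'}\,\partial_\mu g_{\alpha\beta}\big). \tag{3.16}$$
Using (3.15) and (3.14) we get
$$\begin{aligned}
R_{\mu\nu} ={}& -\frac12\,g^{\alpha\beta}\partial_\alpha\partial_\beta g_{\mu\nu} + g^{\alpha\alpha'}g^{\beta\beta'}\Big(-\frac14\,\partial_\nu g_{\alpha\beta}\,\partial_\mu g_{\alpha'\beta'} + \frac18\,\partial_\mu g_{\beta\beta'}\,\partial_\nu g_{\alpha\alpha'}\Big)\\
&+ \frac12\,g^{\alpha\alpha'}g^{\beta\beta'}\partial_\alpha g_{\beta\mu}\,\partial_{\alpha'}g_{\beta'\nu} - \frac12\,g^{\alpha\alpha'}g^{\beta\beta'}\big(\partial_\alpha g_{\beta\mu}\,\partial_{\beta'}g_{\alpha'\nu} - \partial_{\beta'}g_{\beta\mu}\,\partial_\alpha g_{\alpha'\nu}\big)\\
&+ \frac12\,g^{\alpha\alpha'}g^{\beta\beta'}\big(\partial_\mu g_{\alpha'\beta'}\,\partial_\alpha g_{\beta\nu} - \partial_\alpha g_{\alpha'\beta'}\,\partial_\mu g_{\beta\nu} + \partial_\nu g_{\alpha'\beta'}\,\partial_\alpha g_{\beta\mu} - \partial_\alpha g_{\alpha'\beta'}\,\partial_\nu g_{\beta\mu}\big)\\
&+ \frac14\,g^{\alpha\alpha'}g^{\beta\beta'}\big(\partial_{\beta'}g_{\alpha'\alpha}\,\partial_\mu g_{\beta\nu} - \partial_\mu g_{\alpha'\alpha}\,\partial_{\beta'}g_{\beta\nu} + \partial_{\beta'}g_{\alpha'\alpha}\,\partial_\nu g_{\beta\mu} - \partial_\nu g_{\alpha'\alpha}\,\partial_{\beta'}g_{\beta\mu}\big).
\end{aligned} \tag{3.17}$$
The result now follows.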
Let $m$ denote the standard Minkowski metric
$$m_{00} = -1, \qquad m_{ii} = 1 \ \text{ if } i = 1,\dots,3, \qquad\text{and}\qquad m_{\mu\nu} = 0 \ \text{ if } \mu \ne \nu.$$
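Define a 2-tensor $h$ from the decomposition $g_{\mu\nu} = m_{\mu\nu} + h_{\mu\nu}$. Let $m^{\mu\nu}$ be the inverse of $m_{\mu\nu}$. Then for small $h$
$$H^{\mu\nu} = g^{\mu\nu} - m^{\mu\nu} = -h^{\mu\nu} + O^{\mu\nu}(h^2), \qquad\text{where}\qquad h^{\mu\nu} = m^{\mu\mu'}m^{\nu\nu'}h_{\mu'\nu'}$$
and $O^{\mu\nu}(h^2)$ vanishes to second order at $h = 0$. As a consequence of Lemma 3.1 we get:

Lemma 3.2. If the Einstein equations (3.3) and the wave coordinate condition (3.4) hold then
$$\widetilde{\Box}_g h_{\mu\nu} = F_{\mu\nu}(h)(\partial h,\partial h), \tag{3.18}$$
where $F_{\mu\nu}(h)(\partial h,\partial h)$ is a quadratic form in $\partial h$ with coefficients that are smooth functions of $h$. More precisely,
$$F_{\mu\nu}(h)(\partial h,\partial h) = P(\partial_\mu h,\partial_\nu h) + Q_{\mu\nu}(\partial h,\partial h) + G_{\mu\nu}(h)(\partial h,\partial h), \tag{3.19}$$
where
$$P(\partial_\mu h,\partial_\nu h) = \frac14\,m^{\alpha\alpha'}\partial_\mu h_{\alpha\alpha'}\,m^{\beta\beta'}\partial_\nu h_{\beta\beta'} - \frac12\,m^{\alpha\alpha'}m^{\beta\beta'}\partial_\mu h_{\alpha\beta}\,\partial_\nu h_{\alpha'\beta'} \tag{3.20}$$
and
$$\begin{aligned}
Q_{\mu\nu}(\partial h,\partial h) ={}& \partial_\alpha h_{\beta\mu}\,m^{\alpha\alpha'}m^{\beta\beta'}\partial_{\alpha'}h_{\beta'\nu} - m^{\alpha\alpha'}m^{\beta\beta'}\big(\partial_\alpha h_{\beta\mu}\,\partial_{\beta'}h_{\alpha'\nu} - \partial_{\beta'}h_{\beta\mu}\,\partial_\alpha h_{\alpha'\nu}\big)\\
&+ m^{\alpha\alpha'}m^{\beta\beta'}\big(\partial_\mu h_{\alpha'\beta'}\,\partial_\alpha h_{\beta\nu} - \partial_\alpha h_{\alpha'\beta'}\,\partial_\mu h_{\beta\nu}\big)\\
&+ m^{\alpha\alpha'}m^{\beta\beta'}\big(\partial_\nu h_{\alpha'\beta'}\,\partial_\alpha h_{\beta\mu} - \partial_\alpha h_{\alpha'\beta'}\,\partial_\nu h_{\beta\mu}\big)\\
&+ \frac12\,m^{\alpha\alpha'}m^{\beta\beta'}\big(\partial_{\beta'}h_{\alpha\alpha'}\,\partial_\mu h_{\beta\nu} - \partial_\mu h_{\alpha\alpha'}\,\partial_{\beta'}h_{\beta\nu}\big)\\
&+ \frac12\,m^{\alpha\alpha'}m^{\beta\beta'}\big(\partial_{\beta'}h_{\alpha\alpha'}\,\partial_\nu h_{\beta\mu} - \partial_\nu h_{\alpha\alpha'}\,\partial_{\beta'}h_{\beta\mu}\big)
\end{aligned}$$
is a null form and $G_{\mu\nu}(h)(\partial h,\partial h)$ is a quadratic form in $\partial h$ with coefficients smoothly dependent on $h$ and vanishing when $h$ vanishes: $G_{\mu\nu}(0)(\partial h,\partial h) = 0$. Furthermore
$$m^{\alpha\beta}\partial_\alpha h_{\beta\mu} = \frac12\,m^{\alpha\beta}\partial_\mu h_{\alpha\beta} + G_\mu(h)(\partial h), \qquad\text{or}\qquad \partial_\alpha H^{\alpha\nu} = \frac12\,g_{\alpha\beta}\big(m^{\nu\mu} + H^{\nu\mu}\big)\partial_\mu H^{\alpha\beta}, \tag{3.21}$$
where $G_\mu(h)(\partial h)$ is a linear function of $\partial h$ with coefficients that are smooth functions of $h$ and that vanishes when $h$ vanishes: $G_\mu(0)(\partial h) = 0$.

Observe that the terms in (3.20) do not satisfy the classical null condition. However the trace $m^{\mu\nu}h_{\mu\nu}$ satisfies a nonlinear wave equation with semilinear terms obeying the null condition:
$$g^{\alpha\beta}\partial_\alpha\partial_\beta\big(m^{\mu\nu}h_{\mu\nu}\big) = Q(\partial h,\partial h) + G(h)(\partial h,\partial h).$$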
4. The Initial Data

In this section we discuss the initial data for which the results of our paper apply. We shall consider the asymptotically flat data, satisfying a global smallness condition, with the property that it coincides with the Schwarzschild data outside the ball of radius one. We start by showing the existence of such data. Let $(g_0, k_0)$ be asymptotically flat initial data for the Einstein equations consisting of the Riemannian metric $g_0$ and a second fundamental form $k_0$. The initial data for the vacuum Einstein equations satisfy the constraint equations
$$R_0 - (\mathrm{tr}\,k_0)^2 + |k_0|^2 = 0, \tag{4.1}$$
$$\nabla^j k_{0ij} - \nabla_i\,\mathrm{tr}\,k_0 = 0. \tag{4.2}$$
We restrict our attention to the time-symmetric case $R_0 = k_0 = 0$. Then, if $(g_0, k_0)$ is sufficiently close to the Minkowski data and $g_0$ satisfies the parity condition $g_0(x) = g_0(-x)$, by the results of Corvino [Co] and Chrusciel-Delay [C-D] one can construct a new set of initial data $(g, k)$ with the properties that
• The initial data $(g, k)$ coincides with $(g_0, k_0)$ on the ball of radius 1/2.
• $(g, k)$ is exactly the Schwarzschild data $(g_{Sx}, 0)$ of mass $M$ outside $B_1$, the ball of radius one.
At this point we specify the smallness conditions:
$$M \le \varepsilon, \qquad \sum_{0\le|I|\le N}\|\partial_x^I(g - \delta)\|_{L^2(B_1)} + \sum_{0\le|J|\le N-1}\|\partial_x^J k\|_{L^2(B_1)} \le \varepsilon \tag{4.3}$$
for some sufficiently large integer $N$. Here $\partial_x^I$ denotes the derivative $\partial_{x_1}^{I_1}\cdots\partial_{x_n}^{I_n}$, where $(I_1,\dots,I_n)$ is an arbitrary multi-index with the property that $I_1 + \dots + I_n = |I|$. We have two expressions for the Schwarzschild metric in isotropic and wave coordinates:
$$g_S = -\frac{(1 - M/r)^2}{(1 + M/r)^2}\,dt^2 + \Big(1 + \frac{M}{r}\Big)^4 dx^2, \tag{4.4}$$
$$g_s = -\frac{r - 2M}{r + 2M}\,dt^2 + \frac{r + 2M}{r - 2M}\,dr^2 + (r + 2M)^2\big(d\theta^2 + \sin^2\theta\,d\phi^2\big). \tag{4.5}$$
The expressions $g_{Sx}$ and $g_{sx}$ will denote the spatial parts of the Schwarzschild metric in the respective coordinates. Observe that
$$g_s = m + \frac{4M}{r}\,(dt^2 + dx^2) + O(r^{-2}). \tag{4.6}$$
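The expansion (4.6) can be checked numerically against (4.5). The sketch below is illustrative and rests on one stated assumption about normalisation: the angular part of (4.5) is divided by $r^2$ so that all three coefficients are compared with $1 + 4M/r$ (or $-1 + 4M/r$ for $dt^2$), which is what (4.6) asserts in Cartesian components. The function names and sample radii are hypothetical.

```python
def gs_coefficients(r, M):
    """Coefficients of dt^2, dr^2 and r^2 dOmega^2 in the metric (4.5)."""
    g_tt = -(r - 2 * M) / (r + 2 * M)
    g_rr = (r + 2 * M) / (r - 2 * M)
    g_ang = (r + 2 * M) ** 2 / r ** 2
    return g_tt, g_rr, g_ang


def leading_order(r, M):
    """Same three coefficients for m + (4M/r)(dt^2 + dx^2)."""
    return -1 + 4 * M / r, 1 + 4 * M / r, 1 + 4 * M / r


def scaled_remainders(r, M):
    """r^2 times the difference between (4.5) and its leading-order part."""
    exact = gs_coefficients(r, M)
    approx = leading_order(r, M)
    return [r ** 2 * abs(e - a) for e, a in zip(exact, approx)]


M = 0.01
RADII = [1.0, 10.0, 1e2, 1e3, 1e4]
worst = max(max(scaled_remainders(r, M)) for r in RADII)
# If the remainder is O(r^-2), r^2 * remainder stays bounded (here by 16 M^2).
print(worst, 16 * M ** 2)
```

The scaled remainders stay of size $M^2$ over four decades of $r$, consistent with an $O(r^{-2})$ error in (4.6).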
We now find the coordinate change transforming the metric $g_S$ into $g_s$. Set
$$t = \tau, \qquad r = \rho + \frac{M^2}{\rho}. \tag{4.7}$$
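The change of variables (4.7) can be sanity-checked numerically. This is a minimal sketch that assumes only the two expressions (4.4) and (4.5); the helper names and sample values are hypothetical. It compares the $dt^2$ coefficients, the radial coefficients (using $dr/d\rho = 1 - M^2/\rho^2$) and the areal radii of the two forms at a few values of $\rho > M$, and checks that $r(\rho)$ is increasing there.

```python
def wave_radius(rho, M):
    """Map (4.7): wave-coordinate radius r in terms of the isotropic radius rho."""
    return rho + M * M / rho


def mismatch(rho, M):
    """Largest discrepancy between (4.4) at rho and (4.5) at r = wave_radius(rho)."""
    r = wave_radius(rho, M)
    dr_drho = 1 - M * M / rho ** 2

    gtt_iso = -((1 - M / rho) / (1 + M / rho)) ** 2
    gtt_wave = -(r - 2 * M) / (r + 2 * M)

    grr_iso = (1 + M / rho) ** 4                       # coefficient of d rho^2
    grr_wave = (r + 2 * M) / (r - 2 * M) * dr_drho ** 2

    areal_iso = rho * (1 + M / rho) ** 2               # sqrt of the sphere area factor
    areal_wave = r + 2 * M

    return max(
        abs(gtt_iso - gtt_wave),
        abs(grr_iso - grr_wave),
        abs(areal_iso - areal_wave),
    )


M = 0.05
SAMPLES = [0.06, 0.1, 0.5, 1.0, 10.0]   # all larger than M
worst = max(mismatch(rho, M) for rho in SAMPLES)
radii = [wave_radius(rho, M) for rho in SAMPLES]
print(worst, radii)
```

The three quantities agree to rounding error, and the sampled radii increase with $\rho$, in line with the map being one-to-one for $\rho > M$.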
In the coordinates $\tau, \rho$ the metric $g_s$ takes the form $g_S$. This change of coordinates is one-to-one for the values $\rho > M$. Since the mass $M \ll 1$ we can define a change of coordinates $r = r(\rho)$ that coincides with the map (4.7) for $\rho > 1$ and with the identity transformation for $\rho \le 1/2$. Thus we have constructed the initial data $(g, k)$ such that
• The initial data $(g, k)$ coincides (in new coordinates) with $(g_0, k_0)$ on the ball of radius 1/2.
• $(g, k)$ is exactly the Schwarzschild data $(g_{sx}, 0)$ outside the ball of radius one.
• Moreover, the new data still obeys the smallness condition (4.3).
The constructed metric is already in wave coordinates on its Schwarzschild part. We now describe the procedure which produces the initial data $(g, \partial_t g)$ associated with $(g, k)$ and satisfying the wave coordinate condition. Recall that a priori we are only given the spatial part of the metric $g_{ij}$ together with a second fundamental form $k_{ij}$. We now define the full space-time metric $g_{\alpha\beta}$ on the Cauchy hypersurface $\Sigma_0$ as follows:
$$g_{0i} = 0, \qquad g_{00} = -a(r), \tag{4.8}$$
where the function
$$a(r) = \frac{r - 2M}{r + 2M} \ \text{ for } r > 1, \qquad a(r) = 1 \ \text{ for } r \le \frac12.$$
The metric thus defined coincides with the full Schwarzschild metric $g_s$ for $r > 1$. We further define
$$\partial_t g_{ij} = -2a\,k_{ij}. \tag{4.9}$$
It remains to determine $\partial_t g_{0\alpha}$. We find it by satisfying the wave coordinate condition
$$g^{\beta\mu}\partial_\mu g_{\alpha\beta} = \frac12\,g^{\mu\nu}\partial_\alpha g_{\mu\nu}.$$
Setting $\alpha = 0$ we obtain
$$\frac12\,g^{00}\partial_t g_{00} = -g^{\beta i}\partial_i g_{0\beta} + \frac12\,g^{ij}\partial_t g_{ij}.$$
This defines $\partial_t g_{00}$. On the other hand setting $\alpha = i$ we obtain
$$g^{00}\partial_t g_{0i} = -g^{\beta j}\partial_j g_{i\beta} + \frac12\,g^{\mu\nu}\partial_i g_{\mu\nu}.$$
This determines $\partial_t g_{0i}$. Observe that since the metric $g$ coincides with the Schwarzschild metric $g_s$, already satisfying the wave coordinate condition, outside the ball of radius one, we have that on that set the initial data takes the form $(g_s, 0)$. Hence we constructed the initial data $(g, \partial_t g)$ with the properties that
• The initial data $(g, \partial_t g)$ corresponds to the initial data $(g, k)$ prescribed originally.
• $(g, \partial_t g)$ is exactly the Schwarzschild data $(g_s, 0)$ outside the ball of radius one.
• The initial data verifies the wave coordinate condition.
• The initial data satisfies the smallness condition
$$\sum_{0\le|I|\le N}\|\partial_x^I(g - m)\|_{L^2(B_1)} + \sum_{0\le|J|\le N-1}\|\partial_x^J\partial_t g\|_{L^2(B_1)} \le \varepsilon. \tag{4.10}$$
Now with the initial data $(g, \partial_t g)$ we solve the reduced Einstein equations (3.6). It follows from the proof of Lemma 3.1 that, in the notation $\Gamma^\lambda = g^{\alpha\beta}\Gamma^{\lambda}_{\alpha\beta}$, the reduced Einstein equations can be written in the form:
$$R_{\alpha\beta} - \frac12\big(D_\alpha\Gamma_\beta + D_\beta\Gamma_\alpha\big) - \Gamma_\sigma N^{\sigma}_{\alpha\beta}(g,\partial g) = 0. \tag{4.11}$$
Here $D$ denotes a covariant derivative with respect to the space-time metric $g$ and $N^{\sigma}_{\alpha\beta}$ are some given functions depending on $g$ and $\partial g$. Observe that the initial data $(g, \partial_t g)$ were chosen in such a way that the wave coordinate condition $\Gamma^\lambda = 0$ is satisfied on the initial hypersurface $\Sigma_0$. We now argue that this condition is propagated, i.e., the solution of the reduced Einstein equations (4.11) obeys $\Gamma^\lambda = 0$ on any hypersurface $\Sigma_t$. We would have thus shown that a solution of the reduced Einstein equations is, in fact, a solution of the vacuum Einstein equations. To prove that $\Gamma^\lambda = 0$ we differentiate (4.11) and use the contracted Bianchi identity $D^\beta R_{\alpha\beta} = \frac12 D_\alpha R$,
$$\begin{aligned}
0 = 2\Big(D^\beta R_{\alpha\beta} - \frac12 D_\alpha R\Big) ={}& D^\beta D_\alpha\Gamma_\beta + D^\beta D_\beta\Gamma_\alpha - D_\alpha D^\beta\Gamma_\beta - 2D^\beta\big(\Gamma_\sigma N^{\sigma}_{\alpha\beta}\big) - D_\alpha\big(\Gamma_\sigma N^{\sigma\beta}{}_{\beta}\big)\\
={}& D^\beta D_\beta\Gamma_\alpha + R_{\alpha\gamma}\Gamma^\gamma - 2\big(D^\beta\Gamma_\sigma\big)N^{\sigma}_{\alpha\beta} - \big(D_\alpha\Gamma_\sigma\big)N^{\sigma\beta}{}_{\beta} - 2\Gamma_\sigma\big(D^\beta N^{\sigma}_{\alpha\beta}\big) - \Gamma_\sigma\big(D_\alpha N^{\sigma\beta}{}_{\beta}\big).
\end{aligned}$$
Therefore, $\Gamma^\lambda$ satisfies a covariant wave equation, on the background determined by the constructed metric $g$, with the initial condition $\Gamma^\lambda = 0$. It remains to show that $D_t\Gamma^\lambda = 0$ on $\Sigma_0$, and the conclusion that $\Gamma^\lambda \equiv 0$ will follow by the uniqueness result for the wave equation. We recall that the initial data $(g, k)$ verifies the constraint equations (4.1), (4.2), which imply that on $\Sigma_0$,
$$R_{TT} + \frac12 R = 0, \qquad R_{Ti} = 0,$$
where $T = -(g_{00})^{-1}\partial_t$ is the unit future oriented normal to $\Sigma_0$. Therefore returning to (4.11) we obtain that
$$0 = R_{00} + \frac12 R = -(g_{00})^{-1}D_t\Gamma_0 + D^i\Gamma_i, \qquad 0 = R_{0i} = \frac12 D_t\Gamma_i + \frac12 D_i\Gamma_0.$$
This finishes the proof that $\Gamma^\lambda \equiv 0$.

We also know that the time-independent Schwarzschild metric $g_s$ is a solution of the Einstein vacuum equation $R_{\alpha\beta} = 0$. Moreover, since $g_s$ satisfies the wave coordinate condition it also verifies the reduced Einstein equations (4.11). Since the initial data $(g, \partial_t g) = (g_s, 0)$ outside the ball of radius one, the constructed solution will coincide with the Schwarzschild solution in the exterior of the null cone developed from the sphere of radius one in $\Sigma_0$. We end the discussion of the initial data by comparing the light cones of Minkowski and Schwarzschild spaces in the wave coordinates of the Schwarzschild space.
Lemma 4.1. For an arbitrary $R > 2M$ the forward null cone of the metric $g_s$, intersecting the time slice $t = 0$ along the sphere of radius $R$, is contained in the interior of the Minkowski cone $t = r - R$.

Proof. The null cone intersecting the time slice $t = 0$ along the sphere of radius $R$ can be realized as the level hypersurface $u = 0$ of the optical function $u$ solving the eikonal equation $g_s^{\alpha\beta}\partial_\alpha u\,\partial_\beta u = 0$ with the initial condition that $u = 0$ on the sphere of radius $R$ at time $t = 0$. Because of the spherical symmetry of the Schwarzschild metric $g_s$ and the initial condition we look for a spherically symmetric solution $u = u(t, r)$. The eikonal equation then reads
$$\frac{r + 2M}{r - 2M}\,(\partial_t u)^2 = \frac{r - 2M}{r + 2M}\,(\partial_r u)^2.$$
Let $t = \gamma(r)$ be a null geodesic, originating from some point on the sphere of radius $R$ at $t = 0$, such that $u(\gamma(r), r) = 0$. Then $\partial_t u\,\dot\gamma(r) + \partial_r u = 0$. Substituting this into the eikonal equation we obtain that
$$\Big(\frac{r + 2M}{r - 2M}\Big)^2 = |\dot\gamma(r)|^2.$$
Taking the square root and integrating we obtain that
$$\gamma(r) = \gamma(R) \pm \Big(r - R + 4M\ln\frac{r - 2M}{R - 2M}\Big).$$
Thus the null geodesics are described by the curves
$$t = \pm\Big(r - R + 4M\ln\frac{r - 2M}{R - 2M}\Big).$$
In particular, the forward null cone is contained in the interior of the set $t \ge r - R$.
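The conclusion of Lemma 4.1 can be illustrated numerically. The following sketch evaluates the outgoing curve $t = r - R + 4M\ln\frac{r-2M}{R-2M}$ from the proof, checks by a central difference that its slope is $\frac{r+2M}{r-2M}$, and confirms that it lies above the Minkowski cone $t = r - R$ for $r \ge R$. The values of `M` and `R` and the function names are arbitrary choices for illustration.

```python
import math


def cone_time(r, R, M):
    """Outgoing Schwarzschild null cone through the sphere r = R at t = 0."""
    return r - R + 4 * M * math.log((r - 2 * M) / (R - 2 * M))


def null_slope(r, M):
    """dt/dr along outgoing radial null curves of the metric (4.5)."""
    return (r + 2 * M) / (r - 2 * M)


def slope_error(r, R, M, h=1e-5):
    """Difference between a central-difference slope of cone_time and null_slope."""
    numeric = (cone_time(r + h, R, M) - cone_time(r - h, R, M)) / (2 * h)
    return abs(numeric - null_slope(r, M))


M, R = 0.01, 1.0
SAMPLES = [1.0, 1.5, 2.0, 10.0, 100.0]

# Gap between the Schwarzschild cone and the Minkowski cone t = r - R.
gaps = [cone_time(r, R, M) - (r - R) for r in SAMPLES]
errors = [slope_error(r, R, M) for r in SAMPLES]
print(gaps, errors)
```

The gap vanishes at $r = R$ and grows like $4M\ln r$, which is the inward bending of the Schwarzschild light cones referred to in Sect. 2.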
5. The Null-Frame and Null-Forms

Below we introduce a standard Minkowski null-frame used throughout the paper. At each point $(t, x)$ we introduce a pair of null vectors $(L, \underline{L})$,
$$L^0 = 1, \quad L^i = x^i/|x|, \quad i = 1,2,3, \qquad\text{and}\qquad \underline{L}^0 = 1, \quad \underline{L}^i = -x^i/|x|, \quad i = 1,2,3.$$
Adding two orthonormal vectors $S_1, S_2$ tangent to the sphere $S^2$, which are orthogonal to $\omega = x/|x|$, defines a null frame $(L, \underline{L}, S_1, S_2)$.
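Remark 5.1. Since $S^2$ does not admit a global orthonormal frame $S_1, S_2$ we could alternatively introduce a global frame induced by the projections of the coordinate vector fields $e_i$. Let $P$ be the orthogonal projection of a vector field in $\mathbb{R}^3$ along $\omega = x/|x|$ onto the tangent space of the sphere; $PV = V - \langle V, \omega\rangle\,\omega$. For $i = 1,2,3$ denote the projection of $\partial_i$ by
$$\not\partial_i = A_i^{\,j}\partial_j = \partial_i - \omega_i\omega^j\partial_j, \qquad\text{where}\qquad A_i^{\,j} = (Pe_i)^j = \delta_i^{\,j} - \omega_i\omega^j, \quad i = 1,2,3, \tag{5.1}$$
where $e_i$ is the usual orthonormal basis in $\mathbb{R}^3$, and the sums are over $j = 1,2,3$ only. Let $\bar\partial_0 = L^\alpha\partial_\alpha$ and $\bar\partial_i = \not\partial_i$, for $i = 1,2,3$. Then a linear combination of the derivatives $\{\bar\partial_0,\dots,\bar\partial_3\}$ spans the tangent space of the forward light cone.

In what follows $A, B$ will denote any of the vector fields $S_1, S_2$. We will use the summation conventions
$$X_A A^\alpha = X^\beta S_{1\beta}S_1^\alpha + X^\beta S_{2\beta}S_2^\alpha, \qquad X_A Y_A = X^\alpha Y^\beta S_{1\alpha}S_{1\beta} + X^\alpha Y^\beta S_{2\alpha}S_{2\beta}.$$
Obvious generalizations of the above conventions will be used for higher order tensors. We record the following null frame decomposition of a vector field $X = X^\alpha\partial_\alpha$:
$$X^\alpha = X^L L^\alpha + X^{\underline{L}}\,\underline{L}^\alpha + X^A A^\alpha.$$
Relative to a null frame the Minkowski metric $m$ has the following form:
$$m_{LL} = m_{\underline{L}\,\underline{L}} = m_{LA} = m_{\underline{L}A} = 0, \qquad m_{L\underline{L}} = m_{\underline{L}L} = -2, \qquad m_{AB} = \delta_{AB},$$
i.e. $m_{\alpha\beta}X^\alpha Y^\beta = -2\big(X^L Y^{\underline{L}} + X^{\underline{L}}Y^L\big) + X^A Y^A$. Recall that we raise and lower indices of any tensor relative to the Minkowski metric $m$, i.e., $X_\alpha = m_{\alpha\beta}X^\beta$. We define $X\cdot Y = m_{\alpha\beta}X^\alpha Y^\beta = X_\alpha Y^\alpha$. Then $X\cdot Y = X^L Y_L + X^{\underline{L}}Y_{\underline{L}} + X^A Y_A$. It is useful to remember the following rule:
$$X^L = -\frac12\,X_{\underline{L}}, \qquad X^{\underline{L}} = -\frac12\,X_L, \qquad X^A = X_A.$$
Then
$$m^{L\underline{L}} = m^{\underline{L}L} = -1/2, \qquad m^{LL} = m^{\underline{L}\,\underline{L}} = m^{LA} = m^{\underline{L}A} = 0, \qquad m^{AB} = \delta^{AB},$$
i.e. $m^{\alpha\beta}X_\alpha Y_\beta = -\frac12\big(X_L Y_{\underline{L}} + X_{\underline{L}}Y_L\big) + X_A Y_A$.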
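The frame relations above are easy to verify at a sample point. The sketch below builds $L$, $\underline{L}$, $S_1$, $S_2$ at a chosen $x$, evaluates the Minkowski products, and reconstructs an arbitrary vector from the rule $X^L = -\frac12 X_{\underline{L}}$, $X^{\underline{L}} = -\frac12 X_L$, $X^A = X_A$. The particular construction of $S_1, S_2$ by cross products is an assumption made for the example (any orthonormal pair tangent to the sphere would do), and all names are illustrative.

```python
import math


def minkowski(X, Y):
    """m(X, Y) with signature (-, +, +, +)."""
    return -X[0] * Y[0] + X[1] * Y[1] + X[2] * Y[2] + X[3] * Y[3]


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def null_frame(x):
    """Return (L, Lbar, S1, S2) at the spatial point x (x must be nonzero)."""
    r = math.sqrt(sum(c * c for c in x))
    w = [c / r for c in x]
    # Pick a coordinate axis that is not parallel to w (illustrative choice).
    axis = [1.0, 0.0, 0.0] if abs(w[0]) < 0.9 else [0.0, 1.0, 0.0]
    s1 = cross(w, axis)
    n = math.sqrt(sum(c * c for c in s1))
    s1 = [c / n for c in s1]
    s2 = cross(w, s1)
    L = [1.0] + w
    Lbar = [1.0] + [-c for c in w]
    return L, Lbar, [0.0] + s1, [0.0] + s2


def reconstruct(X, frame):
    """Rebuild X from its null-frame components using the raising rule."""
    L, Lbar, S1, S2 = frame
    XL = -0.5 * minkowski(X, Lbar)      # X^L = -1/2 X_Lbar
    XLbar = -0.5 * minkowski(X, L)      # X^Lbar = -1/2 X_L
    X1, X2 = minkowski(X, S1), minkowski(X, S2)
    return [XL * L[a] + XLbar * Lbar[a] + X1 * S1[a] + X2 * S2[a]
            for a in range(4)]


frame = null_frame([1.0, 2.0, -2.0])
L, Lbar, S1, S2 = frame
X = [0.7, -1.3, 0.4, 2.1]
X_rebuilt = reconstruct(X, frame)
print(minkowski(L, L), minkowski(L, Lbar), minkowski(S1, S2), X_rebuilt)
```

The products reproduce $m_{LL} = 0$, $m_{L\underline{L}} = -2$, $m_{AB} = \delta_{AB}$, and the reconstructed vector matches the original, which confirms the factor $-\frac12$ in the raising rule.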
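Definition 5.2. Denote $q = r - t$ and $s = t + r$ the null coordinates of the Minkowski metric $m$ and $\partial_q = \frac12(\partial_r - \partial_t)$ and $\partial_s = \frac12(\partial_t + \partial_r)$, the corresponding null vector fields. Let $k_{XY} = k_{\alpha\beta}X^\alpha Y^\beta$. Then
$$\mathrm{tr}\,k = m^{\alpha\beta}k_{\alpha\beta} = -\frac12\big(k_{L\underline{L}} + k_{\underline{L}L}\big) + \overline{\mathrm{tr}}\,k, \tag{5.2}$$
where
$$\overline{\mathrm{tr}}\,k = \delta^{AB}k_{AB} = \bar\delta^{ij}k_{ij}, \qquad\text{and}\qquad \bar\delta^{ij} = \delta^{ij} - \omega^i\omega^j, \tag{5.3}$$
where the sum is over $i, j = 1,2,3$ only.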
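If $k$ and $p$ are symmetric it follows that
$$\begin{aligned}
p_{\alpha\beta}k^{\alpha\beta} = m^{\alpha\alpha'}m^{\beta\beta'}p_{\alpha\beta}k_{\alpha'\beta'} ={}& \frac14\big(p_{LL}k_{\underline{L}\,\underline{L}} + p_{\underline{L}\,\underline{L}}k_{LL} + 2p_{L\underline{L}}k_{L\underline{L}}\big) - \delta^{AB}\big(p_{AL}k_{B\underline{L}} + p_{A\underline{L}}k_{BL}\big) + \delta^{AB}\delta^{A'B'}p_{AA'}k_{BB'}\\
={}& \frac14\big(p_{LL}k_{\underline{L}\,\underline{L}} + p_{\underline{L}\,\underline{L}}k_{LL} + 2p_{L\underline{L}}k_{L\underline{L}}\big) - \bar\delta^{ij}\big(p_{iL}k_{j\underline{L}} + p_{i\underline{L}}k_{jL}\big) + \bar\delta^{ij}\bar\delta^{i'j'}p_{ii'}k_{jj'}.
\end{aligned} \tag{5.4}$$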
Lemma 5.3. With P (p, k) given by (3.20) we have for symmetric 2-tensors p and k: 1 1 αβ
m pαβ mαβ kαβ − mαα mββ pαβ kα β
4 2
1 1
= − pLL kLL + pLL kLL − δ AB δ A B 2pAA kBB − pAB kA B
8 4 1 AB + δ 2pAL kBL + 2pAL kBL − pAB kLL − pLL kAB , (5.5) 4 i.e. at least one of the factors contains only tangential components. P (p, k) =
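Furthermore
\[ p^{\alpha\beta}\partial_\alpha = p^{L\beta}\partial_L + p^{\underline L\beta}\partial_{\underline L} + p^{A\beta}\partial_A = -\tfrac12\, p_{\underline L}{}^{\beta}\partial_L - \tfrac12\, p_{L}{}^{\beta}\partial_{\underline L} + p^{A\beta}\partial_A. \]
We introduce the following notation. Let $\mathcal T = \{L, S_1, S_2\}$, $\mathcal U = \{L, \underline L, S_1, S_2\}$, $\mathcal L = \{L\}$ and $\mathcal S = \{S_1, S_2\}$. For any two of these families $\mathcal V$ and $\mathcal W$ and an arbitrary two-tensor $p$ we denote
\begin{align*}
|p|_{\mathcal V\mathcal W} &= \sum_{V\in\mathcal V,\, W\in\mathcal W} |p_{\beta\gamma}V^\beta W^\gamma|, \tag{5.6}\\
|\partial p|_{\mathcal V\mathcal W} &= \sum_{U\in\mathcal U,\, V\in\mathcal V,\, W\in\mathcal W} |(\partial p)_{\alpha\beta\gamma}U^\alpha V^\beta W^\gamma|, \tag{5.7}\\
|\bar\partial p|_{\mathcal V\mathcal W} &= \sum_{T\in\mathcal T,\, V\in\mathcal V,\, W\in\mathcal W} |(\partial p)_{\alpha\beta\gamma}T^\alpha V^\beta W^\gamma|. \tag{5.8}
\end{align*}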
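Let $Q$ denote a null form, i.e. $Q_{\alpha\beta}(\partial\phi, \partial\psi) = \partial_\alpha\phi\,\partial_\beta\psi - \partial_\beta\phi\,\partial_\alpha\psi$ if $\alpha \neq \beta$ and $Q_0(\partial\phi, \partial\psi) = m^{\alpha\beta}\partial_\alpha\phi\,\partial_\beta\psi$.

Lemma 5.4. If $P$ is as in Lemma 5.3 then
\[ |P(p,k)| \lesssim |p|_{\mathcal T\mathcal U}|k|_{\mathcal T\mathcal U} + |p|_{\mathcal L\mathcal L}|k| + |p|\,|k|_{\mathcal L\mathcal L}. \tag{5.9} \]
If $Q(\partial\phi, \partial\psi)$ is a null form then
\[ |Q(\partial\phi,\partial\psi)| \lesssim |\bar\partial\phi||\partial\psi| + |\partial\phi||\bar\partial\psi|. \tag{5.10} \]
Furthermore
\begin{align*}
|k^{\alpha\beta}\partial_\alpha\phi\,\partial_\beta\phi| &\lesssim |k|_{\mathcal L\mathcal L}|\partial\phi|^2 + |k|\,|\bar\partial\phi||\partial\phi|, \tag{5.11}\\
|L_\alpha k^{\alpha\beta}\partial_\beta\phi| &\lesssim |k|_{\mathcal L\mathcal L}|\partial\phi| + |k|\,|\bar\partial\phi|, \tag{5.12}\\
|(\partial_\alpha k^{\alpha\beta})\partial_\beta\phi| &\lesssim \big(|\partial k|_{\mathcal L\mathcal L} + |\bar\partial k|\big)|\partial\phi| + |\partial k|\,|\bar\partial\phi|. \tag{5.13}
\end{align*}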
Global Existence for the Einstein Vacuum Equations in Wave Coordinates
65
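Proof. The proof of (5.10) for the null form $Q_0$ follows directly from (5.2). To prove the claim for the null forms $Q_{\alpha\beta}$ use that
\[ \partial_i = L_i(\partial_s + \partial_q) + \bar\partial_i, \quad i = 1, 2, 3, \qquad \partial_0 = L_0(\partial_s - \partial_q). \tag{5.14} \]
Therefore,
\[ |Q_{\alpha\beta}(\partial\phi,\partial\psi)| = |\partial_\alpha\phi\,\partial_\beta\psi - \partial_\beta\phi\,\partial_\alpha\psi| \le C|\bar\partial\phi||\partial\psi| + C|\partial\phi||\bar\partial\psi|. \]
The estimates (5.11)–(5.13) follow from (5.4).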
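Lemma 5.5. If $k^{\alpha\beta}$ is a symmetric tensor and $\phi$ a function then
\[ |k^{\alpha\beta}\partial_\alpha\partial_\beta\phi| \lesssim |k|_{\mathcal L\mathcal L}|\partial^2\phi| + |k|\,|\bar\partial\partial\phi|. \tag{5.15} \]
Also, with $\overline{\mathrm{tr}}\,k = \delta^{AB}k_{AB} = (\delta^{ij} - \omega^i\omega^j)k_{ij}$ we have
\[ \big|k^{\alpha\beta}\partial_\alpha\partial_\beta\phi - k_{LL}\partial_q^2\phi - 2k_{L\underline L}\partial_s\partial_q\phi - r^{-1}\,\overline{\mathrm{tr}}\,k\,\partial_q\phi\big| \lesssim |k|_{\mathcal L\mathcal T}|\bar\partial\partial\phi| + |k|\big(|\bar\partial^2\phi| + r^{-1}|\bar\partial\phi|\big). \tag{5.16} \]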
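Proof. The estimate (5.15) follows from (5.4). We have
\[ \partial_i\omega_j = r^{-1}(\delta_{ij} - \omega_i\omega_j) = r^{-1}\bar\delta_{ij}. \tag{5.17} \]
Furthermore $\partial_i = \bar\partial_i + \omega_i\partial_r$, where $\partial_r = \omega^j\partial_j$, so $[\bar\partial_i, \partial_r] = (\bar\partial_i\omega^k)\partial_k$ and
\begin{align*}
\partial_i\partial_j &= (\bar\partial_i + \omega_i\partial_r)(\bar\partial_j + \omega_j\partial_r)\\
&= \bar\partial_i\bar\partial_j + \omega_i\omega^k\bar\partial_j\partial_k + \omega_j\omega^k\bar\partial_i\partial_k + \omega_i\omega_j\partial_r^2 + (\bar\partial_i\omega_j)\partial_r + \omega_j(\bar\partial_i\omega^k)\partial_k\\
&= \bar\partial_i\bar\partial_j + \omega_i\bar\partial_j\partial_r + \omega_j\bar\partial_i\partial_r + \omega_i\omega_j\partial_r^2 + r^{-1}\bar\delta_{ij}\partial_r - r^{-1}\omega_i\bar\partial_j. \tag{5.18}
\end{align*}
Furthermore
\[ \partial_0\partial_i = \partial_t(\bar\partial_i + \omega_i\partial_r) = \bar\partial_i\partial_t + \omega_i\partial_t\partial_r. \tag{5.19} \]
Hence
\begin{align*}
k^{\alpha\beta}\partial_\alpha\partial_\beta &= k^{00}\partial_t^2 + 2k^{0i}\omega_i\partial_t\partial_r + k^{ij}\omega_i\omega_j\partial_r^2 + r^{-1}\,\overline{\mathrm{tr}}\,k\,\partial_r\\
&\quad + k^{ij}\bar\partial_i\bar\partial_j - r^{-1}k^{ij}\omega_i\bar\partial_j + 2k^{0j}\bar\partial_j\partial_t + 2k^{ij}\omega_i\bar\partial_j\partial_r. \tag{5.20}
\end{align*}
If we substitute $\partial_t = \partial_s - \partial_q$, $\partial_r = \partial_s + \partial_q$ and identify
\[ k_{LL} = k^{00} - 2k^{0i}\omega_i + k^{ij}\omega_i\omega_j, \qquad k_{L\underline L} = -k^{00} + k^{ij}\omega_i\omega_j, \qquad k_{\underline L\underline L} = k^{00} + 2k^{0i}\omega_i + k^{ij}\omega_i\omega_j \tag{5.21} \]
and
\[ -k^{0j} + k^{ij}\omega_i = k_0{}^{j} + k_i{}^{j}\omega^i = k_L{}^{j}, \qquad k^{0j} + k^{ij}\omega_i = -k_0{}^{j} + k_i{}^{j}\omega^i = k_{\underline L}{}^{j} \tag{5.22} \]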
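we get
\begin{align*}
k^{\alpha\beta}\partial_\alpha\partial_\beta &= k_{LL}\partial_q^2 + 2k_{L\underline L}\partial_s\partial_q + k_{\underline L\underline L}\partial_s^2 + r^{-1}\,\overline{\mathrm{tr}}\,k\,\partial_q + k^{ij}\bar\partial_i\bar\partial_j + r^{-1}\,\overline{\mathrm{tr}}\,k\,\partial_s\\
&\quad - r^{-1}k^{ij}\omega_i\bar\partial_j + 2k_L{}^{j}\bar\partial_j\partial_q + 2k_{\underline L}{}^{j}\bar\partial_j\partial_s. \tag{5.23}
\end{align*}
Finally, we can also write
\[ 2k_L{}^{j}\bar\partial_j\partial_q = k_L{}^{j}\bar\partial_j(\omega^k\partial_k - \partial_t) = k_L{}^{j}\omega^k\bar\partial_j\partial_k - k_L{}^{j}\bar\partial_j\partial_t + r^{-1}k_L{}^{j}\bar\partial_j, \tag{5.24} \]
since $(\bar\partial_j\omega^k)\partial_k = r^{-1}\bar\partial_j$. The inequality (5.16) now follows.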
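Corollary 5.6. Let $\phi$ be a solution of the reduced wave equation $\widetilde\Box_g\phi = F$ with a metric $g^{\alpha\beta}$ such that $H^{\alpha\beta} = g^{\alpha\beta} - m^{\alpha\beta}$ satisfies the condition that $|H^{L\underline L}| < \tfrac14$. Then
\begin{align*}
\Big|\Big(4\partial_s - \frac{H_{LL}}{2g^{L\underline L}}\partial_q - \frac{\overline{\mathrm{tr}}\,H + H_{L\underline L}}{2g^{L\underline L}\,r}\Big)\partial_q(r\phi) + \frac{rF}{2g^{L\underline L}}\Big| \lesssim r|\Delta_\omega\phi| + |H|_{\mathcal L\mathcal T}\, r\,|\bar\partial\partial\phi| + |H|\big(r|\bar\partial^2\phi| + |\bar\partial\phi| + r^{-1}|\phi|\big), \tag{5.25}
\end{align*}
where $\Delta_\omega = \bar\Delta = \bar\delta^{ij}\bar\partial_i\bar\partial_j$.

Proof. Define the new metric
\[ \tilde g^{\alpha\beta} = \frac{g^{\alpha\beta}}{-2g^{L\underline L}}. \]
The equation $g^{\alpha\beta}\partial_\alpha\partial_\beta\phi = F$ then takes the form
\[ \tilde g^{\alpha\beta}\partial_\alpha\partial_\beta\phi = \frac{F}{-2g^{L\underline L}}, \]
which can also be written as
\[ \Box\phi + (\tilde g^{\alpha\beta} - m^{\alpha\beta})\partial_\alpha\partial_\beta\phi = \frac{F}{-2g^{L\underline L}}. \]
Let $k^{\alpha\beta}$ be the tensor $k^{\alpha\beta} = \tilde g^{\alpha\beta} - m^{\alpha\beta}$. Observe that
\[ k^{\alpha\beta} = (-2g^{L\underline L})^{-1}\big(g^{\alpha\beta} + 2m^{\alpha\beta}g^{L\underline L}\big) = (-2g^{L\underline L})^{-1}\big(H^{\alpha\beta} + m^{\alpha\beta}(2g^{L\underline L} + 1)\big) = (-2g^{L\underline L})^{-1}\big(H^{\alpha\beta} + 2m^{\alpha\beta}H^{L\underline L}\big). \]
Thus,
\[ k_{L\underline L} = 0, \qquad k_{LT} = (-2g^{L\underline L})^{-1}H_{LT}, \qquad \overline{\mathrm{tr}}\,k = (-2g^{L\underline L})^{-1}\big(\overline{\mathrm{tr}}\,H + H_{L\underline L}\big). \tag{5.26} \]
Moreover, $|k| \lesssim |H|$, since $g^{L\underline L} = H^{L\underline L} - \tfrac12$ and by the assumptions of the corollary $|H^{L\underline L}| < \tfrac14$. Now using (5.16) of Lemma 5.5, with the condition that $k_{L\underline L} = 0$, together with the decomposition
\[ \Box\phi = -\partial_t^2\phi + \Delta\phi = \frac1r(\partial_t + \partial_r)(\partial_r - \partial_t)(r\phi) + \Delta_\omega\phi = \frac4r\,\partial_s\partial_q(r\phi) + \Delta_\omega\phi, \]
we find that the identity $\Box\phi + k^{\alpha\beta}\partial_\alpha\partial_\beta\phi = (-2g^{L\underline L})^{-1}F$ leads to the inequality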
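\[ \big|4\partial_s\partial_q(r\phi) + r\,k_{LL}\partial_q^2\phi + \overline{\mathrm{tr}}\,k\,\partial_q\phi + (2g^{L\underline L})^{-1}rF\big| \lesssim r|\Delta_\omega\phi| + r|k|_{\mathcal L\mathcal T}|\bar\partial\partial\phi| + |k|\big(r|\bar\partial^2\phi| + |\bar\partial\phi|\big). \]
Finally, identity (5.26) and a crude estimate $|k| \lesssim |H|$ yield the desired result.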
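The identities in (5.26) can be verified in a few lines. The following sketch uses only the frame values $m^{L\underline L} = -\tfrac12$ and $\delta^{AB}m_{AB} = 2$, together with the rule $H^{L\underline L} = \tfrac14 H_{L\underline L}$ obtained by lowering both indices:

```latex
\begin{align*}
g^{L\underline L} &= m^{L\underline L} + H^{L\underline L} = -\tfrac12 + H^{L\underline L}
  \quad\Longrightarrow\quad 2g^{L\underline L} + 1 = 2H^{L\underline L},\\
-2g^{L\underline L}\,k^{L\underline L} &= H^{L\underline L} + 2m^{L\underline L}H^{L\underline L}
  = H^{L\underline L} - H^{L\underline L} = 0,\\
-2g^{L\underline L}\,\overline{\mathrm{tr}}\,k &= \delta^{AB}\big(H_{AB} + 2m_{AB}H^{L\underline L}\big)
  = \overline{\mathrm{tr}}\,H + 4H^{L\underline L}
  = \overline{\mathrm{tr}}\,H + H_{L\underline L}.
\end{align*}
```

Since $m_{LT} = 0$ for every $T \in \mathcal T$, the conformal correction $2m^{\alpha\beta}H^{L\underline L}$ does not contribute to $k_{LT}$, which gives the middle identity in (5.26).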
6. The Weak Null Condition and Asymptotic Expansion of Einstein's Equations in Wave Coordinates

Let us now first describe the weak null condition. The results of this section appear in [L-R]. Consider the Cauchy problem for a system of nonlinear wave equations in three space dimensions:
\[ -\Box u_i = F_i(u, u', u''), \quad i = 1, \dots, N, \quad u = (u_1, \dots, u_N), \tag{6.1} \]
where $-\Box = -\partial_t^2 + \sum_{j=1}^3\partial_{x_j}^2$. We assume that $F$ is a function of $u$ and its derivatives of the form
\[ F_i(u, u', u'') = a_{i\,\alpha\beta}^{jk}\,\partial^\alpha u_j\,\partial^\beta u_k + G_i(u, u', u''), \tag{6.2} \]
where $G_i(u, u', u'')$ vanishes to third order as $(u, u', u'') \to 0$ and $a_{i\,\alpha\beta}^{jk} = 0$ unless $|\alpha| \le |\beta| \le 2$ and $|\beta| \ge 1$. Here we used the summation convention over repeated indices. We assume that the initial data
\[ u(0, x) = \varepsilon u_0(x) \in C^\infty, \qquad u_t(0, x) = \varepsilon u_1(x) \in C^\infty \tag{6.3} \]
is small and decays fast as $|x| \to \infty$. We are going to determine conditions on the nonlinearity such that Eq. (6.1) is compatible with the asymptotic expansion as $|x| \to \infty$ and $|x| \sim t$,
\[ u(t, x) \sim \varepsilon U(q, s, \omega)/|x|, \quad\text{where } q = |x| - t,\ s = \varepsilon\ln|x|,\ \omega = x/|x|, \tag{6.4} \]
for all sufficiently small $\varepsilon > 0$. The linear and some nonlinear wave equations allow for such an expansion with $U$ independent of $s$ and the next term decaying like $\varepsilon/|x|^2$, see [H1, H2]. Substituting (6.4) into (6.1) and equating powers of order $\varepsilon^2/|x|^2$ we see that
\[ 2\partial_s\partial_q U_i = A_{i\,mn}^{jk}(\omega)(\partial_q^m U_j)(\partial_q^n U_k), \qquad U\big|_{s=0} = F_0, \tag{6.5} \]
where
\[ A_{i\,mn}^{jk}(\omega) = \sum_{|\alpha| = m,\ |\beta| = n} a_{i\,\alpha\beta}^{jk}\,\hat\omega^\alpha\hat\omega^\beta, \quad\text{where } \hat\omega = (-1, \omega) \text{ and } \hat\omega^\alpha = \hat\omega_{\alpha_1}\cdots\hat\omega_{\alpha_k}. \tag{6.6} \]
In fact, $\Box u = -\varepsilon^{-1}\partial_s\partial_q(ru) + {}$angular derivatives and $\partial_\mu = \hat\omega_\mu\partial_q + {}$tangential derivatives. One can show that (6.1)–(6.3) has a solution as long as $\varepsilon\log t$ is bounded, provided that $\varepsilon > 0$ is sufficiently small and the solution of (6.5) exists up to that time, [J-K, H1, H2, L1, L2]. The only exception is the case $A_{i\,00}^{jk} \neq 0$, which has a shorter life span. In cases where the solution of (6.5) blows up it has been shown that solutions of (6.1)–(6.3) also break down in some finite time $T_\varepsilon \le e^{C/\varepsilon}$, [J1, H1, A1]. John's example was
\[ \Box u = u_t\,\Delta u \tag{6.7} \]
for which (6.5) is the Burgers equation $(2\partial_s - U_q\partial_q)U_q = 0$, which is known to blow up. The equation
\[ \Box u = u_t^2 \tag{6.8} \]
is another example where solutions blow up, for which (6.5) is $\partial_s U_q = U_q^2$, which also blows up. The null condition of [K2] is equivalent to
\[ A_{i\,mn}^{jk}(\omega) = 0 \quad\text{for all } (i, j, k, m, n),\ \omega \in S^2. \tag{6.9} \]
The results of [C1, K2] assert that (6.1)–(6.3) has global solutions for all sufficiently small initial data, provided that the null condition is satisfied. In this case the asymptotic equation (6.5) can trivially be solved globally. Moreover, similar to the linear case, its solutions approach a limit as $s \to \infty$ and the solutions of (6.1)–(6.3) decay like solutions of linear equations. A typical example of an equation satisfying the null condition is
\[ \Box u = u_t^2 - |\nabla_x u|^2. \tag{6.10} \]
There is however a more general class of nonlinearities for which solutions of (6.5) do not blow up: We say that a system (6.1) satisfies the weak null condition if the solutions of the corresponding asymptotic system (6.5) exist for all $s$ and if the solutions together with their derivatives grow at most exponentially in $s$ for all initial data decaying sufficiently fast in $q$. Under the weak null condition assumption solutions of (6.5) satisfy Eq. (6.1) up to terms of order $\varepsilon^2/|x|^{3 - C\varepsilon}$, but need only decay like $\varepsilon/|x|^{1 - C\varepsilon}$. An example of an equation satisfying the weak null condition is given by
\[ \Box u = u\,\Delta u. \tag{6.11} \]
In [L2] it was proven that (6.11) has small global solutions in the spherically symmetric case and recently [A3] established this result without the symmetry assumption. Equation (6.11) appears to be similar to (6.7) but a closer look shows that the corresponding asymptotic equation
\[ (2\partial_s - U\partial_q)U_q = 0 \tag{6.12} \]
has global solutions growing exponentially in $s$, see [L2]. The system
\[ \Box u = v_t^2, \qquad \Box v = 0 \tag{6.13} \]
is another example that satisfies the weak null condition. Equation (6.13) appears to resemble (6.8). The system however decouples: $v$ satisfies a linear homogeneous equation and given $v$ we have a linear inhomogeneous equation for $u$, and global existence follows. The corresponding asymptotic system is
\[ \partial_s\partial_q U = (\partial_q V)^2, \qquad \partial_s\partial_q V = 0. \tag{6.14} \]
The solution of the second equation in (6.14) is independent of $s$: $V_q = V_q(q, \omega)$, and substituting this into the first equation we see that $U_q(s, q, \omega) = sV_q(q, \omega)^2$, so $\partial u$ only decays like $|x|^{-1}\ln|x|$. We show below that the Einstein vacuum equations in wave coordinates satisfy the weak null condition, i.e. that the corresponding asymptotic system (6.5) admits global solutions. In fact, each of the quadratic nonlinear terms in the Einstein equations is either of the type appearing in (6.10), (6.11) or (6.13).
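The asymptotic system (6.14) can be integrated explicitly, which makes the logarithmic loss visible. The following sketch keeps a general initial value $\partial_q U(0, q, \omega)$; that term is an assumption added here for completeness and vanishes in the situation described above:

```latex
\begin{align*}
\partial_s(\partial_q V) = 0
 &\;\Longrightarrow\; \partial_q V(s, q, \omega) = \partial_q V(0, q, \omega),\\
\partial_s(\partial_q U) = (\partial_q V)^2
 &\;\Longrightarrow\; \partial_q U(s, q, \omega) = \partial_q U(0, q, \omega) + s\,\big(\partial_q V(0, q, \omega)\big)^2 .
\end{align*}
```

Inserting $s = \varepsilon\ln|x|$ into the ansatz (6.4) shows that $\partial u$ behaves like $\varepsilon\,\partial_q U/|x|$ with $\partial_q U$ growing linearly in $\ln|x|$, which is the $|x|^{-1}\ln|x|$ decay rate stated in the text. The growth is linear in $s$, well within the exponential growth allowed by the weak null condition.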
Theorem 6.1. Let $h$ be a symmetric 2-tensor and let
\[ h_{\mu\nu}(t, x) \sim \varepsilon U_{\mu\nu}(s, q, \omega)/|x|, \quad\text{where } q = |x| - t,\ s = \varepsilon\ln|x|,\ \omega = x/|x| \tag{6.15} \]
be an asymptotic ansatz. Then the asymptotic system for the Einstein equations in wave coordinates (3.18), obtained by formally equating the terms with the coefficients $\varepsilon^2|x|^{-2}$, takes the following form:
\[ \big(2\partial_s - U_{LL}\partial_q\big)\partial_q U_{\mu\nu} = L_\mu L_\nu P(\partial_q U, \partial_q U), \qquad \forall\,\mu, \nu = 0, \dots, 3, \tag{6.16} \]
where $U_{LL} = m^{\alpha\alpha'}m^{\beta\beta'}U_{\alpha'\beta'}L_\alpha L_\beta$ and $P(\partial_q U, \partial_q U) = \tfrac14\,\partial_q\mathrm{tr}\,U\,\partial_q\mathrm{tr}\,U - \tfrac12\,\partial_q U_{\alpha\beta}\,\partial_q U^{\alpha\beta}$. The asymptotic form of the wave coordinate condition (3.21) is
\[ 2\partial_q U_{L\mu} = L_\mu\,\partial_q\mathrm{tr}\,U, \qquad \forall\,\mu = 0, \dots, 3, \tag{6.17} \]
where $U_{L\mu} = m^{\alpha\alpha'}U_{\alpha'\mu}L_\alpha$ and $\mathrm{tr}\,U = m^{\alpha\beta}U_{\alpha\beta}$. The solution of the system (6.16)–(6.17) exists globally and, thus, the Einstein vacuum equations (3.18) in wave coordinates satisfy the weak null condition. Moreover, the component $\partial_q U_{\underline L\underline L}$ grows at most as $s$ while the remaining components are uniformly bounded.

The asymptotic form (6.16) follows by a direct calculation from (3.18). Observe that the null form $Q_{\mu\nu}(\partial h, \partial h)$ disappears after passage to the asymptotic system. Next we note that (6.17) is preserved under the flow of (6.16). Contracting (6.16) with $L^\mu L^\nu$ we obtain
\[ (2\partial_s - U_{LL}\partial_q)\partial_q U_{LL} = 0, \]
which can be solved globally. More generally, contracting (6.16) with the vector fields $\{L, S_1, S_2\}$ we obtain
\[ (2\partial_s - U_{LL}\partial_q)\partial_q U_{TU} = 0, \quad\text{if } T \in \{L, S_1, S_2\} \text{ and } U \in \{L, \underline L, S_1, S_2\}, \tag{6.18} \]
which can be solved globally now that $U_{LL}$ has been determined. Note that the components $\partial_q U_{TU}$ are constant along the integral curves of the vector field $2\partial_s - U_{LL}\partial_q$. The remaining unknown component $U_{\underline L\underline L}$ can be determined by contracting Eq. (6.16) with the vector field $\underline L$,
\[ (2\partial_s - U_{LL}\partial_q)\partial_q U_{\underline L\underline L} = 4P(\partial_q U, \partial_q U). \tag{6.19} \]
By Lemma 5.3 the quantity $P(\partial_q U, \partial_q U)$ does not contain the term $(\partial_q U_{\underline L\underline L})^2$. Thus, Eq. (6.19) can be solved globally and produces solutions growing exponentially in $s$. More precise information can be obtained from the asymptotic form of the wave coordinate condition (6.17). Indeed, contracting it with the null frame $\{L, S_1, S_2\}$ we obtain $\partial_q U_{LT} = 0$, if $T \in \{L, S_1, S_2\}$. Therefore,
\[ P(\partial_q U, \partial_q U) = -\tfrac14\,\delta^{AB}\delta^{A'B'}\big(2\partial_q U_{AA'}\,\partial_q U_{BB'} - \partial_q U_{AB}\,\partial_q U_{A'B'}\big) - \tfrac12\,\delta^{AB}\,\partial_q U_{AB}\,\partial_q U_{L\underline L}. \tag{6.20} \]
It follows from (6.18) that $P$ is already determined and is, in fact, constant along the characteristics of the field $2\partial_s - U_{LL}\partial_q$. Therefore, integrating (6.19) we infer that $\partial_q U_{\underline L\underline L}$ grows at most like $s$.
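The last step can be made explicit by integrating along characteristics. In the sketch below, the curve parameter $\sigma$ and the normalization $s(0) = 0$ are notational assumptions introduced for illustration:

```latex
\begin{align*}
&\frac{ds}{d\sigma} = 2, \qquad \frac{dq}{d\sigma} = -U_{LL}\big(s(\sigma), q(\sigma), \omega\big), \qquad s(0) = 0,\\
&\frac{d}{d\sigma}\,\partial_q U_{TU} = 0 \quad\text{by (6.18)}
 \;\Longrightarrow\; P(\partial_q U, \partial_q U) = P_0 \ \text{is constant along the curve, by (6.20)},\\
&\frac{d}{d\sigma}\,\partial_q U_{\underline L\underline L} = 4P_0
 \;\Longrightarrow\; \partial_q U_{\underline L\underline L}(\sigma) = \partial_q U_{\underline L\underline L}(0) + 4P_0\,\sigma
 = \partial_q U_{\underline L\underline L}(0) + 2P_0\,s .
\end{align*}
```

Here (6.20) is used in the form that $P$ only involves the components $\partial_q U_{AB}$ and $\partial_q U_{L\underline L}$, all of which are of type $\partial_q U_{TU}$ with $T \in \mathcal T$. The resulting growth is linear in $s$, as claimed in Theorem 6.1.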
7. Vector Fields and Commutators

Let $Z \in \mathcal Z$ be any of the vector fields
\[ \Omega_{\alpha\beta} = -x_\alpha\partial_\beta + x_\beta\partial_\alpha, \qquad S = t\partial_t + r\partial_r, \qquad \partial_\alpha, \]
where $x_0 = -t$ and $x_i = x^i$, for $i \ge 1$. Let $I = (\iota_1, \dots, \iota_k)$, where $|\iota_i| = 1$, be an ordered multiindex of length $|I| = k$ and let $Z^I = Z^{\iota_1}\cdots Z^{\iota_k}$ denote a product of $|I|$ such derivatives. With a slight abuse of notation we will also identify the index set with vector fields, so $I = Z$ means the index $I$ corresponding to the vector field $Z$. Furthermore, by a sum over $I_1 + I_2 = I$ we mean a sum over all possible order preserving partitions of the ordered multiindex $I$ into two ordered multiindices $I_1$ and $I_2$, i.e. if $I = (\iota_1, \dots, \iota_k)$, then $I_1 = (\iota_{i_1}, \dots, \iota_{i_n})$ and $I_2 = (\iota_{i_{n+1}}, \dots, \iota_{i_k})$, where $i_1, \dots, i_k$ is any reordering of the integers $1, \dots, k$ such that $i_1 < \dots < i_n$ and $i_{n+1} < \dots < i_k$. With this convention the Leibniz rule becomes $Z^I(fg) = \sum_{I_1 + I_2 = I}(Z^{I_1}f)(Z^{I_2}g)$. We denote by $\bar\partial$ the tangential derivatives, i.e., $\bar\partial = \{\bar\partial_0, \bar\partial_1, \bar\partial_2, \bar\partial_3\}$, and note that the span of the tangential derivatives $\{\bar\partial_0, \bar\partial_1, \bar\partial_2, \bar\partial_3\}$ coincides with the linear span of the vector fields $\{\partial_s, \partial_{S_1}, \partial_{S_2}\}$.

Lemma 7.1. We have the following expressions for the coordinate vector fields:
\begin{align*}
\partial_t &= \frac{tS - x^i\Omega_{0i}}{t^2 - r^2}, \tag{7.1}\\
\partial_r &= \omega^i\partial_i = \frac{t\,\omega^i\Omega_{0i} - rS}{t^2 - r^2}, \tag{7.2}\\
\partial_i &= \frac{-x^j\Omega_{ij} + t\,\Omega_{0i} - x_i S}{t^2 - r^2} = -\frac{x_i S}{t^2 - r^2} + \frac{x_i x^j\Omega_{0j}}{t(t^2 - r^2)} + \frac{\Omega_{0i}}{t}. \tag{7.3}
\end{align*}
In particular,
\[ \partial_s = \tfrac12(\partial_t + \partial_r) = \frac{S + \omega^i\Omega_{0i}}{2(t + r)}, \qquad \bar\partial_i = \partial_i - \omega_i\partial_r = \frac{\omega^j\Omega_{ij}}{r} = \frac{-\omega_i\omega^j\Omega_{0j} + \Omega_{0i}}{t}. \tag{7.4} \]
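The identities of Lemma 7.1 follow by direct substitution. As an illustration, here is a sketch of the computation for (7.1) and for the first expression of $\bar\partial_i$ in (7.4), using only $x_0 = -t$, $r\partial_r = x^i\partial_i$ and the definitions of $\Omega_{\alpha\beta}$ and $S$:

```latex
\begin{align*}
\Omega_{0i} &= -x_0\,\partial_i + x_i\,\partial_0 = t\,\partial_i + x_i\,\partial_t,\\
tS - x^i\Omega_{0i} &= t^2\partial_t + t\,x^i\partial_i - t\,x^i\partial_i - r^2\partial_t = (t^2 - r^2)\,\partial_t,\\
\omega^j\Omega_{ij} &= \omega^j\big(-x_i\,\partial_j + x_j\,\partial_i\big) = -x_i\,\partial_r + r\,\partial_i
   = r\,\big(\partial_i - \omega_i\partial_r\big) = r\,\bar\partial_i,\\
\Omega_{0i} - \omega_i\omega^j\Omega_{0j} &= t\,\partial_i + x_i\,\partial_t - \omega_i\big(t\,\partial_r + r\,\partial_t\big)
   = t\,\big(\partial_i - \omega_i\partial_r\big) = t\,\bar\partial_i .
\end{align*}
```

The two expressions for $\bar\partial_i$ are what allows the proof of Lemma 7.2 to divide by $r$ when $r \ge t$ and by $t$ when $r \le t$.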
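Lemma 7.2. For any function $f$ we have the estimate
\[ (1 + t + |q|)|\bar\partial f| + (1 + |q|)|\partial f| \le C\sum_{|I| = 1}|Z^I f|, \qquad |\partial f| \lesssim |\bar\partial f| + |\partial_q f|, \tag{7.5} \]
where $|\bar\partial f|^2 = |\bar\partial_0 f|^2 + |\bar\partial_1 f|^2 + |\bar\partial_2 f|^2 + |\bar\partial_3 f|^2$ and $\bar\partial_0 = \partial_s$. Furthermore
\[ |\bar\partial^2 f| \le \frac{C}{r}\sum_{|I| \le 2}\frac{|Z^I f|}{1 + t + |q|}, \tag{7.6} \]
where $|\bar\partial^2 f|^2 = \sum_{\alpha,\beta = 0,1,2,3}|\bar\partial_\alpha\bar\partial_\beta f|^2$. Moreover, if $k^{\alpha\beta}$ is a symmetric tensor then
\[ |k^{\alpha\beta}\partial_\alpha\partial_\beta\phi| \le C\Big(\frac{|k|}{1 + t + |q|} + \frac{|k|_{\mathcal L\mathcal L}}{1 + |q|}\Big)\sum_{|I| \le 1}|\partial Z^I\phi|. \tag{7.7} \]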
Proof. First we note that if $r + t \le 1$ then (7.5) holds since the usual derivatives $\partial_\alpha$ are included in the sum on the right. The inequality for $|\bar\partial f|$ in (7.5) follows directly from (7.4); one has to divide into two cases $r \le t$ and $r \ge t$ and use two different expressions depending on the relative size of $r$ and $t$. The inequality for $|\partial f|$ in (7.5) follows from (7.1) and the first identity in (7.3). If $t + r < 1$ then (7.6) follows from (7.4), since $|\partial_i\omega_j| \le Cr^{-1}$ and the sum on the right of (7.6) contains the usual derivatives. Since $|\Omega_{ij}\omega_k| \le C$ and $\Omega_{ij}r = \Omega_{ij}t = 0$, for $1 \le i, j \le 3$ it follows, by applying $\bar\partial_i = r^{-1}\omega^j\Omega_{ij}$ to the expressions in (7.4), that
\[ |\bar\partial_i\bar\partial_\alpha f| \le Cr^{-1}(t + r)^{-1}\sum_{|I| \le 2}|Z^I f|. \tag{7.8} \]
Once again we distinguish the cases $r < t$ and $r > t$ and use different expressions for $\bar\partial_i$. With the notation $\bar\partial_0 = 2\partial_s$, (7.8) holds also for $\alpha = 0$. Since $[\partial_s, \bar\partial_i] = 0$ it only remains to prove (7.6) for $\partial_s^2$. Since $S\omega_j = 0$, $|\Omega_{0i}\omega_j| \le Ctr^{-1}$, $S(t + r) = 2(t + r)$ and $|\Omega_{0i}(t + r)| \le C(t + r)$, (7.6) follows also for $\partial_s^2$. The inequality (7.7) follows from Lemma 5.5, (7.5) and the commutator identity $[Z, \partial_i] = a_i^j\partial_j$.

Lemma 7.3. Suppose $\widetilde\Box_g\phi = F$. Then
\[ \Big|\Big(4\partial_s - \frac{H_{LL}}{2g^{L\underline L}}\partial_q - \frac{\overline{\mathrm{tr}}\,H + H_{L\underline L}}{2g^{L\underline L}\,r}\Big)\partial_q(r\phi) + \frac{rF}{2g^{L\underline L}}\Big| \lesssim \Big(1 + \frac{r\,|H|_{\mathcal L\mathcal T}}{1 + |q|} + |H|\Big)\,r^{-1}\sum_{|I| \le 2}|Z^I\phi|. \tag{7.9} \]
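Proof. By Corollary 5.6,
\[ \Big|\Big(4\partial_s - \frac{H_{LL}}{2g^{L\underline L}}\partial_q - \frac{\overline{\mathrm{tr}}\,H + H_{L\underline L}}{2g^{L\underline L}\,r}\Big)\partial_q(r\phi) + \frac{rF}{2g^{L\underline L}}\Big| \lesssim r|\Delta_\omega\phi| + r|H|_{\mathcal L\mathcal T}|\bar\partial\partial\phi| + |H|\big(r|\bar\partial^2\phi| + |\bar\partial\phi| + r^{-1}|\phi|\big), \]
where $\Delta_\omega = \bar\delta^{ij}\bar\partial_i\bar\partial_j$. Here all the derivatives can be reexpressed in terms of the vector fields $Z$ and $\partial_q$ using Lemma 7.2, yielding the expression (7.9). Note that
\[ |\bar\partial\partial\phi| \lesssim \frac{\sum_{|I| = 1}|Z^I\partial\phi|}{1 + t + |q|} \lesssim \frac{\sum_{|I| \le 1}|\partial Z^I\phi|}{1 + t + |q|} \lesssim \frac{\sum_{|I| \le 2}|Z^I\phi|}{(1 + |q|)(1 + t + |q|)}. \]

Lemma 7.4. Let $Z = Z^\mu\partial_\mu$ be any of the vector fields above and let $c_\alpha^\mu$ be defined by
\[ [\partial_\alpha, Z] = c_\alpha^\mu\partial_\mu, \qquad c_\alpha^\mu = \partial_\alpha Z^\mu. \]
Then $c_\alpha^\mu$ are constants and $c_{LL} = c_{\underline L\underline L} = 0$. Furthermore $[Z, \Box] = -c_Z\Box$, where $c_Z$ is either 0 or 2.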
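In addition, if $Q$ is a null form, then
\[ ZQ(\partial\phi, \partial\psi) = Q(\partial\phi, \partial Z\psi) + Q(\partial Z\phi, \partial\psi) + \tilde Q(\partial\phi, \partial\psi) \tag{7.10} \]
for some null form $\tilde Q$ on the right-hand side.

Proof. Since $Z = Z^\alpha\partial_\alpha$ is a Killing or conformally Killing vector field we have
\[ \partial_\alpha Z_\beta + \partial_\beta Z_\alpha = f\,m_{\alpha\beta}, \tag{7.11} \]
where $Z_\alpha = m_{\alpha\beta}Z^\beta$. In fact, for the vector fields above, $f = 0$ unless $Z = S$ in which case $f = 2$. In particular, $L^\alpha L^\beta\partial_\alpha Z_\beta = 0$. If $c_\alpha^\mu$ is as defined above and $c_{\alpha\beta} = c_\alpha^\mu m_{\mu\beta} = \partial_\alpha Z_\beta$ the above simply means that $c_{LL} = c_{\underline L\underline L} = 0$, which proves the first part of the lemma. To verify (7.10) we first consider the null form $Q = Q_{\alpha\beta}$. We have
\begin{align*}
ZQ_{\alpha\beta}(\partial\phi, \partial\psi) &= Q_{\alpha\beta}(\partial Z\phi, \partial\psi) + Q_{\alpha\beta}(\partial\phi, \partial Z\psi)\\
&\quad + [Z, \partial_\alpha]\phi\,\partial_\beta\psi - \partial_\beta\phi\,[Z, \partial_\alpha]\psi + [Z, \partial_\beta]\phi\,\partial_\alpha\psi - \partial_\alpha\phi\,[Z, \partial_\beta]\psi\\
&= Q_{\alpha\beta}(\partial Z\phi, \partial\psi) + Q_{\alpha\beta}(\partial\phi, \partial Z\psi) - c_\alpha^\mu(\partial_\mu\phi\,\partial_\beta\psi - \partial_\beta\phi\,\partial_\mu\psi) - c_\beta^\mu(\partial_\mu\phi\,\partial_\alpha\psi - \partial_\alpha\phi\,\partial_\mu\psi)\\
&= Q_{\alpha\beta}(\partial Z\phi, \partial\psi) + Q_{\alpha\beta}(\partial\phi, \partial Z\psi) - c_\alpha^\mu Q_{\mu\beta}(\partial\phi, \partial\psi) - c_\beta^\mu Q_{\mu\alpha}(\partial\phi, \partial\psi).
\end{align*}
The calculation for the null form $Q_0(\partial\phi, \partial\psi) = m^{\alpha\beta}\partial_\alpha\phi\,\partial_\beta\psi$ proceeds as follows:
\begin{align*}
ZQ_0(\partial\phi, \partial\psi) &= Q_0(\partial Z\phi, \partial\psi) + Q_0(\partial\phi, \partial Z\psi) + m^{\alpha\beta}[Z, \partial_\alpha]\phi\,\partial_\beta\psi + m^{\alpha\beta}\partial_\alpha\phi\,[Z, \partial_\beta]\psi\\
&= Q_0(\partial Z\phi, \partial\psi) + Q_0(\partial\phi, \partial Z\psi) + m^{\alpha\beta}c_\alpha^\mu\,\partial_\mu\phi\,\partial_\beta\psi + m^{\alpha\beta}c_\beta^\mu\,\partial_\alpha\phi\,\partial_\mu\psi\\
&= Q_0(\partial Z\phi, \partial\psi) + Q_0(\partial\phi, \partial Z\psi) + f\,m^{\alpha\beta}\partial_\alpha\phi\,\partial_\beta\psi\\
&= Q_0(\partial Z\phi, \partial\psi) + Q_0(\partial\phi, \partial Z\psi) + f\,Q_0(\partial\phi, \partial\psi),
\end{align*}
where $f$ is a constant associated with a Killing (conformally Killing) vector field $Z$ via the relation $c_{\alpha\beta} + c_{\beta\alpha} = f\,m_{\alpha\beta}$.

Lemma 7.5. If $k^{\alpha\beta}$ is a symmetric tensor then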
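\[ k^{\alpha\beta}[\partial_\alpha\partial_\beta, Z] = k_Z^{\alpha\beta}\partial_\alpha\partial_\beta, \quad\text{where } k_Z^{\alpha\beta} = k^{\alpha\gamma}c_\gamma^\beta + k^{\gamma\beta}c_\gamma^\alpha, \qquad c_\alpha^\mu = \partial_\alpha Z^\mu. \tag{7.12} \]
In particular
\[ k_S^{\alpha\beta} = 2k^{\alpha\beta} \quad\text{and}\quad |k_Z|_{\mathcal L\mathcal L} \le 2|k|_{\mathcal L\mathcal T}. \tag{7.13} \]
In general
\[ [k^{\alpha\beta}\partial_\alpha\partial_\beta, Z^I] = \sum_{I_1 + I_2 = I,\ |I_2| < |I|} k^{I_1\alpha\beta}\,\partial_\alpha\partial_\beta Z^{I_2}, \tag{7.14} \]
where
\[ k^{J\alpha\beta} = \sum_{|K| \le |J|} c^{J\alpha\beta}_{K\mu\nu}\,Z^K k^{\mu\nu} = -Z^J k^{\alpha\beta} - \sum_{K + Z = J} Z^K k_Z^{\alpha\beta} + \sum_{|K| \le |J| - 2} d^{J\alpha\beta}_{K\mu\nu}\,Z^K k^{\mu\nu} \tag{7.15} \]
for some constants $c^{J\alpha\beta}_{M\mu\nu}$ and $d^{J\alpha\beta}_{M\mu\nu}$. Here the sum (7.14) means the sum over all possible order preserving partitions of the ordered multiindex $I$ into two ordered multiindices $I_1$ and $I_2$.

Proof. First observe that since the vector fields $Z$ are linear in $t$ and $x$ we have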
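\[ [\partial^2_{\alpha\beta}, Z] = [\partial_\beta, Z]\partial_\alpha + [\partial_\alpha, Z]\partial_\beta = c_\beta^\gamma\,\partial_\gamma\partial_\alpha + c_\alpha^\gamma\,\partial_\gamma\partial_\beta, \]
which proves the first statement, and the second follows since $c_L^{\underline L} = 0$. To prove (7.14) we first write
\[ Z^I\big(k^{\alpha\beta}\partial_\alpha\partial_\beta\phi\big) = \sum_{K + J = I}(Z^K k^{\alpha\beta})\,Z^J\partial_\alpha\partial_\beta\phi. \]
Then we observe that
\[ Z^J\partial_\alpha\partial_\beta\phi = \sum_{J_1 + J_2 = J,\ J_1 = (\iota_{11}, \dots, \iota_{1n})} \Big[Z^{\iota_{11}}, \Big[Z^{\iota_{12}}, \dots, \Big[Z^{\iota_{1,n-1}}, [Z^{\iota_{1n}}, \partial^2_{\alpha\beta}]\Big]\dots\Big]\Big]\,Z^{J_2}\phi, \tag{7.16} \]
where the sum is over all order preserving partitions of the ordered multiindex $J = (\iota_1, \dots, \iota_k)$ into two ordered multiindices $J_1 = (\iota_{11}, \dots, \iota_{1n})$ and $J_2 = (\iota_{21}, \dots, \iota_{2k})$. It therefore follows that
\[ k^{J\alpha\beta} = -\sum_{K + L = J,\ L = (\iota_1, \dots, \iota_l)} (Z^K k^{\alpha\beta})\,\Big[Z^{\iota_1}, \Big[Z^{\iota_2}, \dots, \Big[Z^{\iota_{l-1}}, [Z^{\iota_l}, \partial^2_{\alpha\beta}]\Big]\dots\Big]\Big]. \]
The desired representation follows after taking into account that
\[ (Z^K k^{\alpha\beta})[Z, \partial^2_{\alpha\beta}] = -(Z^K k_Z^{\alpha\beta})\,\partial_\alpha\partial_\beta. \]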
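Corollary 7.6. Let $\widetilde\Box_g = \Box + H^{\alpha\beta}\partial_\alpha\partial_\beta$. Then with $\hat Z = Z + c_Z$,
\[ \widetilde\Box_g Z\phi - \hat Z\,\widetilde\Box_g\phi = -\big(\hat Z H^{\alpha\beta} + H_Z^{\alpha\beta}\big)\partial_\alpha\partial_\beta\phi. \tag{7.17} \]
As a consequence, we have
\[ \big|\widetilde\Box_g Z\phi - \hat Z\,\widetilde\Box_g\phi\big| \lesssim \Big(\frac{|ZH| + |H|}{1 + t + |q|} + \frac{|ZH|_{\mathcal L\mathcal L} + |H|_{\mathcal L\mathcal T}}{1 + |q|}\Big)\sum_{|I| \le 1}|\partial Z^I\phi|. \tag{7.18} \]
In general
\[ \widetilde\Box_g Z^I\phi - \hat Z^I\,\widetilde\Box_g\phi = -\sum_{I_1 + I_2 = I,\ |I_2| < |I|}\hat H^{I_1\alpha\beta}\,\partial_\alpha\partial_\beta Z^{I_2}\phi, \tag{7.19} \]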
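where
\[ \hat H^{J\alpha\beta} = \sum_{|M| \le |J|} c^{J\alpha\beta}_{M\mu\nu}\,\hat Z^M H^{\mu\nu} = -\hat Z^J H^{\alpha\beta} - \sum_{M + Z = J}\hat Z^M H_Z^{\alpha\beta} + \sum_{|M| \le |J| - 2} d^{J\alpha\beta}_{M\mu\nu}\,\hat Z^M H^{\mu\nu}. \tag{7.20} \]
We have
\begin{align*}
\big|\widetilde\Box_g Z^I\phi\big| \lesssim{}& \big|\hat Z^I\,\widetilde\Box_g\phi\big| + \frac{1}{1 + t + |q|}\sum_{|K| \le |I|}\ \sum_{|J| + (|K| - 1)_+ \le |I|}|Z^J H|\,|\partial Z^K\phi|\\
&+ \frac{1}{1 + |q|}\sum_{|K| \le |I|}\Big(\sum_{|J| + (|K| - 1)_+ \le |I|}|Z^J H|_{\mathcal L\mathcal L} + \sum_{|J'| + (|K| - 1)_+ \le |I| - 1}|Z^{J'} H|_{\mathcal L\mathcal T} + \sum_{|J''| + (|K| - 1)_+ \le |I| - 2}|Z^{J''} H|\Big)|\partial Z^K\phi|, \tag{7.21}
\end{align*}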
where $(|K| - 1)_+ = |K| - 1$ if $|K| \ge 1$ and $(|K| - 1)_+ = 0$ if $|K| = 0$.

Proof. First observe that
\begin{align*}
\hat Z\,\widetilde\Box_g\phi &= (Z + c_Z)\Box\phi + (Z + c_Z)\big(H^{\alpha\beta}\partial^2_{\alpha\beta}\phi\big)\\
&= \Box Z\phi + H^{\alpha\beta}\partial^2_{\alpha\beta}Z\phi + (ZH^{\alpha\beta})\partial^2_{\alpha\beta}\phi + \big(H_Z^{\alpha\beta} + c_Z H^{\alpha\beta}\big)\partial^2_{\alpha\beta}\phi\\
&= \widetilde\Box_g Z\phi + (ZH^{\alpha\beta})\partial^2_{\alpha\beta}\phi + \big(H_Z^{\alpha\beta} + c_Z H^{\alpha\beta}\big)\partial^2_{\alpha\beta}\phi.
\end{align*}
Recall now that the constant $c_Z$ is different from 0 only in the case of the scaling vector field $S$. Moreover, in that case
\[ H_S^{\alpha\beta} + c_S H^{\alpha\beta} = 0. \]
The inequality (7.18) now follows from (7.17), (7.13) and the estimate (7.7). The general commutation formula (7.19) follows from the following calculation, similar to the one in Lemma 7.5. We have
\[ \hat Z^I\,\widetilde\Box_g\phi = \hat Z^I\Box\phi + \hat Z^I\big(H^{\alpha\beta}\partial^2_{\alpha\beta}\phi\big) = \Box Z^I\phi + \sum_{J + K = I}\big(\hat Z^J H^{\alpha\beta}\big)Z^K\partial^2_{\alpha\beta}\phi. \]
If we now use (7.16) we get (7.19) as in the proof of Lemma 7.5. The inequality (7.21) now follows from (7.19), (7.13) and the estimate (7.7).

8. Basic Energy Identities

We now establish basic energy identities for solutions of the equation
\[ \widetilde\Box_g\phi = F. \tag{8.1} \]
We denote by $\Sigma_t$ the hypersurfaces $t = \text{const}$, by $C_{t_1}^{t_2}(q)$ the forward light cones with a vertex at $(q, 0)$ and truncated at times $t_1, t_2$. We also denote by $K_{t_1}^{t_2}(q)$ the interior of the light cone $C_{t_1}^{t_2}(q)$ and by $B_{t,r}$ the ball of radius $r$ centered at $(t, 0)$.
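Lemma 8.1. Let $\phi$ be a solution of (8.1). Then for any $t_1 \le t_2$ and an arbitrary $q \le t_2$,
\begin{align*}
\int_{\Sigma_{t_2}}\big(-g^{00}|\partial_t\phi|^2 + g^{ij}\partial_i\phi\,\partial_j\phi\big) ={}& \int_{\Sigma_{t_1}}\big(-g^{00}|\partial_t\phi|^2 + g^{ij}\partial_i\phi\,\partial_j\phi\big)\\
&- 2\int_{t_1}^{t_2}\!\!\int_{\Sigma_\tau}\Big(\partial_\alpha g^{\alpha\beta}\,\partial_\beta\phi\,\partial_t\phi - \tfrac12\,\partial_t g^{\alpha\beta}\,\partial_\alpha\phi\,\partial_\beta\phi + F\partial_t\phi\Big), \tag{8.2}
\end{align*}
and
\begin{align*}
\int_{B_{t_1 - q}}\big(-g^{00}|\partial_t\phi|^2 + g^{ij}\partial_i\phi\,\partial_j\phi\big) + \int_{C_{t_1}^{t_2}(q)}|\bar\partial\phi|^2 ={}& \int_{B_{t_2 - q}}\big(-g^{00}|\partial_t\phi|^2 + g^{ij}\partial_i\phi\,\partial_j\phi\big)\\
&+ 2\int_{K_{t_1}^{t_2}(q)}\Big(\partial_\alpha g^{\alpha\beta}\,\partial_\beta\phi\,\partial_t\phi - \tfrac12\,\partial_t g^{\alpha\beta}\,\partial_\alpha\phi\,\partial_\beta\phi + F\partial_t\phi\Big)\\
&+ 2\int_{C_{t_1}^{t_2}(q)}\Big(2(g^{\alpha\beta} - m^{\alpha\beta})L_\alpha\,\partial_\beta\phi\,\partial_t\phi + (g^{\alpha\beta} - m^{\alpha\beta})\partial_\alpha\phi\,\partial_\beta\phi\Big). \tag{8.3}
\end{align*}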
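Proof. We multiply Eq. (8.1) by $\partial_t\phi$ and integrate over the space-time slab between the hypersurfaces $\Sigma_{t_1}$ and $\Sigma_{t_2}$. We have
\begin{align*}
-\int_{t_1}^{t_2}\!\!\int_{\Sigma_\tau} g^{\alpha\beta}\,\partial^2_{\alpha\beta}\phi\,\partial_t\phi ={}& \int_{t_1}^{t_2}\!\!\int_{\Sigma_\tau}\big(g^{\alpha\beta}\,\partial_\beta\phi\,\partial_t\partial_\alpha\phi + \partial_\alpha g^{\alpha\beta}\,\partial_\beta\phi\,\partial_t\phi\big) - \int_{\Sigma_{t_2}} g^{0\beta}\,\partial_\beta\phi\,\partial_t\phi + \int_{\Sigma_{t_1}} g^{0\beta}\,\partial_\beta\phi\,\partial_t\phi\\
={}& \frac12\int_{\Sigma_{t_2}}\big(-g^{00}|\partial_t\phi|^2 + g^{ij}\partial_i\phi\,\partial_j\phi\big) - \frac12\int_{\Sigma_{t_1}}\big(-g^{00}|\partial_t\phi|^2 + g^{ij}\partial_i\phi\,\partial_j\phi\big)\\
&+ \int_{t_1}^{t_2}\!\!\int_{\Sigma_\tau}\Big(\partial_\alpha g^{\alpha\beta}\,\partial_\beta\phi\,\partial_t\phi - \tfrac12\,\partial_t g^{\alpha\beta}\,\partial_\alpha\phi\,\partial_\beta\phi\Big),
\end{align*}
and the desired identity (8.2) follows. Similarly, integrating over the region $K_{t_1}^{t_2}(q)$ we obtain
\begin{align*}
\int_{B_{t_1 - q}}\big(-g^{00}|\partial_t\phi|^2 + g^{ij}\partial_i\phi\,\partial_j\phi\big) - \int_{C_{t_1}^{t_2}(q)}\big(2g^{\alpha\beta}L_\alpha\,\partial_\beta\phi\,\partial_t\phi + g^{\alpha\beta}\,\partial_\alpha\phi\,\partial_\beta\phi\big) ={}& \int_{B_{t_2 - q}}\big(-g^{00}|\partial_t\phi|^2 + g^{ij}\partial_i\phi\,\partial_j\phi\big)\\
&+ 2\int_{K_{t_1}^{t_2}(q)}\Big(\partial_\alpha g^{\alpha\beta}\,\partial_\beta\phi\,\partial_t\phi - \tfrac12\,\partial_t g^{\alpha\beta}\,\partial_\alpha\phi\,\partial_\beta\phi + F\partial_t\phi\Big).
\end{align*}
Subtracting the Minkowski part from the metric $g$ in the $C_{t_1}^{t_2}(q)$ integral leads to the identity (8.3).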
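The integrations by parts in this proof rest on a pointwise divergence identity. The following sketch spells it out for a symmetric $g^{\alpha\beta}$; it is a reformulation of the computation above rather than an additional ingredient:

```latex
\begin{align*}
g^{\alpha\beta}\,\partial_\alpha\partial_\beta\phi\;\partial_t\phi
 &= \partial_\alpha\big(g^{\alpha\beta}\,\partial_\beta\phi\,\partial_t\phi\big)
  - (\partial_\alpha g^{\alpha\beta})\,\partial_\beta\phi\,\partial_t\phi
  - g^{\alpha\beta}\,\partial_\beta\phi\,\partial_t\partial_\alpha\phi,\\
g^{\alpha\beta}\,\partial_\beta\phi\,\partial_t\partial_\alpha\phi
 &= \tfrac12\,\partial_t\big(g^{\alpha\beta}\,\partial_\alpha\phi\,\partial_\beta\phi\big)
  - \tfrac12\,(\partial_t g^{\alpha\beta})\,\partial_\alpha\phi\,\partial_\beta\phi,\\
g^{0\beta}\,\partial_\beta\phi\,\partial_t\phi - \tfrac12\,g^{\alpha\beta}\,\partial_\alpha\phi\,\partial_\beta\phi
 &= -\tfrac12\big(-g^{00}|\partial_t\phi|^2 + g^{ij}\,\partial_i\phi\,\partial_j\phi\big).
\end{align*}
```

The third line identifies the time component of the resulting current with $-\tfrac12$ times the energy density, which is why the boundary terms on $\Sigma_{t_1}$ and $\Sigma_{t_2}$ appear with the factor $\tfrac12$ before the identity is multiplied by 2.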
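Corollary 8.2. Let $\phi$ be a solution of Eq. (8.1) with a metric $g$ satisfying the condition that
\[ |H| \le \frac14, \qquad H^{\alpha\beta} = g^{\alpha\beta} - m^{\alpha\beta}. \tag{8.4} \]
Then for any $0 < \gamma \le 1$,
\begin{align*}
\int_{\Sigma_{t_2}}\big(|\partial_t\phi|^2 + |\nabla_x\phi|^2\big) + \int_{t_1}^{t_2}\!\!\int_{\Sigma_\tau}\frac{\gamma\,|\bar\partial\phi|^2}{(1 + |q|)^{1 + 2\gamma}} \le{}& 4\int_{\Sigma_{t_1}}\big(|\partial_t\phi|^2 + |\nabla_x\phi|^2\big)\\
&+ 8\int_{t_1}^{t_2}\!\!\int_{\Sigma_\tau}\Big|\partial_\alpha g^{\alpha\beta}\,\partial_\beta\phi\,\partial_t\phi - \tfrac12\,\partial_t g^{\alpha\beta}\,\partial_\alpha\phi\,\partial_\beta\phi + F\partial_t\phi\Big|\\
&+ 2\int_{t_1}^{t_2}\!\!\int_{\Sigma_\tau}\frac{\gamma}{(1 + |q|)^{1 + 2\gamma}}\Big|(g^{\alpha\beta} - m^{\alpha\beta})\partial_\alpha\phi\,\partial_\beta\phi + 2(g^{L\beta} - m^{L\beta})\partial_\beta\phi\,\partial_t\phi\Big|. \tag{8.5}
\end{align*}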
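Proof. First we note that (8.4) implies that
\[ \tfrac34\big(|\partial_t\phi|^2 + |\nabla_x\phi|^2\big) \le -g^{00}|\partial_t\phi|^2 + g^{ij}\partial_i\phi\,\partial_j\phi \le \tfrac54\big(|\partial_t\phi|^2 + |\nabla_x\phi|^2\big). \tag{8.6} \]
The inequalities (8.3) and (8.2) imply that
\begin{align*}
\int_{C_{t_1}^{t_2}(q)}|\bar\partial\phi|^2 \le{}& \int_{\Sigma_{t_2}}\big(-g^{00}|\partial_t\phi|^2 + g^{ij}\partial_i\phi\,\partial_j\phi\big) + 2\int_{K_{t_1}^{t_2}(q)}\Big|\partial_\alpha g^{\alpha\beta}\,\partial_\beta\phi\,\partial_t\phi - \tfrac12\,\partial_t g^{\alpha\beta}\,\partial_\alpha\phi\,\partial_\beta\phi + F\partial_t\phi\Big| \tag{8.7}\\
&+ 2\int_{C_{t_1}^{t_2}(q)}\Big|2(g^{\alpha\beta} - m^{\alpha\beta})L_\alpha\,\partial_\beta\phi\,\partial_t\phi + (g^{\alpha\beta} - m^{\alpha\beta})\partial_\alpha\phi\,\partial_\beta\phi\Big| \tag{8.8}\\
\le{}& \int_{\Sigma_{t_1}}\big(-g^{00}|\partial_t\phi|^2 + g^{ij}\partial_i\phi\,\partial_j\phi\big) + 2\int_{t_1}^{t_2}\!\!\int_{\Sigma_t}\Big|\partial_\alpha g^{\alpha\beta}\,\partial_\beta\phi\,\partial_t\phi - \tfrac12\,\partial_t g^{\alpha\beta}\,\partial_\alpha\phi\,\partial_\beta\phi + F\partial_t\phi\Big|\\
&+ 2\int_{C_{t_1}^{t_2}(q)}\Big|2(g^{\alpha\beta} - m^{\alpha\beta})L_\alpha\,\partial_\beta\phi\,\partial_t\phi + (g^{\alpha\beta} - m^{\alpha\beta})\partial_\alpha\phi\,\partial_\beta\phi\Big|. \tag{8.9}
\end{align*}
We multiply the above inequality by an integrable factor $\gamma(1 + |q|)^{-1 - 2\gamma}$ and integrate with respect to $q$ in the interval $(-\infty, t_2]$ to obtain:
\begin{align*}
\int_{t_1}^{t_2}\!\!\int_{\Sigma_t}\frac{\gamma\,|\bar\partial\phi|^2}{(1 + |q|)^{1 + 2\gamma}} \le{}& \frac54\int_{\Sigma_{t_1}}\big(|\partial_t\phi|^2 + |\nabla_x\phi|^2\big) + 2\int_{t_1}^{t_2}\!\!\int_{\Sigma_\tau}\Big|\partial_\alpha g^{\alpha\beta}\,\partial_\beta\phi\,\partial_t\phi - \tfrac12\,\partial_t g^{\alpha\beta}\,\partial_\alpha\phi\,\partial_\beta\phi + F\partial_t\phi\Big|\\
&+ 2\int_{t_1}^{t_2}\!\!\int_{\Sigma_\tau}\frac{\gamma}{(1 + |q|)^{1 + 2\gamma}}\Big|(g^{\alpha\beta} - m^{\alpha\beta})\partial_\alpha\phi\,\partial_\beta\phi + 2(g^{L\beta} - m^{L\beta})\partial_\beta\phi\,\partial_t\phi\Big|, \tag{8.10}
\end{align*}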
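where we also used (8.6). On the other hand using (8.6) and (8.2) yields
\begin{align*}
\int_{\Sigma_{t_2}}\big(|\partial_t\phi|^2 + |\nabla_x\phi|^2\big) \le{}& \frac53\int_{\Sigma_{t_1}}\big(|\partial_t\phi|^2 + |\nabla_x\phi|^2\big) \tag{8.11}\\
&+ \frac83\int_{t_1}^{t_2}\!\!\int_{\Sigma_\tau}\Big|\partial_\alpha g^{\alpha\beta}\,\partial_\beta\phi\,\partial_t\phi - \tfrac12\,\partial_t g^{\alpha\beta}\,\partial_\alpha\phi\,\partial_\beta\phi + F\partial_t\phi\Big|, \tag{8.12}
\end{align*}
and the corollary follows.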
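Two elementary computations sit behind the constants in this proof. The sketch below records them; the shorthand $E_g$ for the $g$-weighted energy density and $N$ for the integrand of the space-time term is notation introduced here for illustration only:

```latex
\begin{align*}
&\int_{-\infty}^{\infty}\frac{\gamma\,dq}{(1 + |q|)^{1 + 2\gamma}}
  = 2\gamma\int_0^\infty\frac{dq}{(1 + q)^{1 + 2\gamma}} = 2\gamma\cdot\frac{1}{2\gamma} = 1,\\
&\tfrac34\int_{\Sigma_{t_2}}\big(|\partial_t\phi|^2 + |\nabla_x\phi|^2\big)
  \le \int_{\Sigma_{t_2}} E_g
  \le \int_{\Sigma_{t_1}} E_g + 2\int_{t_1}^{t_2}\!\!\int_{\Sigma_\tau}|N|
  \le \tfrac54\int_{\Sigma_{t_1}}\big(|\partial_t\phi|^2 + |\nabla_x\phi|^2\big) + 2\int_{t_1}^{t_2}\!\!\int_{\Sigma_\tau}|N| .
\end{align*}
```

The first line shows that integrating a $q$-independent quantity against the weight $\gamma(1 + |q|)^{-1 - 2\gamma}$ over $(-\infty, t_2]$ costs at most a factor 1. Dividing the second line by $\tfrac34$ gives the constants $\tfrac53$ and $\tfrac83$, and since $\tfrac54 + \tfrac53 = \tfrac{35}{12} \le 4$, adding the two estimates is consistent with the constants in (8.5).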
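9. Poincaré and Klainerman-Sobolev Inequalities

We now state the following useful version of the Poincaré inequality.

Lemma 9.1. Let $f$ be a smooth function. Then for any $\gamma > -1/2$, $\gamma \neq 1/2$ and any positive $t$,
\[ \int_{\mathbb R^3}\frac{|f(x)|^2\,dx}{(1 + |t - r|)^{2 + 2\gamma}} \le C\int_{S(t+1)}|f|^2\,dS + C\int_{\mathbb R^3}\frac{|\partial_r f(x)|^2\,dx}{(1 + |t - r|)^{2\gamma}}, \tag{9.1} \]
provided that the left-hand side is bounded. Here $S(t+1)$ is the sphere of radius $t + 1$ and $r = |x|$.

Proof. Using polar coordinates $x = r\omega$ we write
\[ |f(r, \omega)|^2 - |f(t + 1, \omega)|^2 = -2\int_r^{t+1}\partial_r f(\rho, \omega)\cdot f(\rho, \omega)\,d\rho. \]
Hence
\[ |f(r, \omega)|^2 r^2 \le |f(t + 1, \omega)|^2(t + 1)^2 + 2\int_r^{t+1}|\partial_r f(\rho, \omega)|\,|f(\rho, \omega)|\,\rho^2\,d\rho, \quad\text{if } r \le t + 1. \]
Therefore multiplying by $(1 + |t - r|)^{-2 - 2\gamma}$ and integrating with respect to $r$ from 0 to $t + 1$:
\begin{align*}
\int_0^{t+1}\frac{|f(r, \omega)|^2 r^2\,dr}{(1 + |t - r|)^{2 + 2\gamma}} &\lesssim \int_0^{t+1}\frac{|f(t + 1, \omega)|^2(t + 1)^2\,dr}{(1 + |t - r|)^{2 + 2\gamma}} + \int_0^{t+1}\!\!\int_r^{t+1}\frac{|\partial_r f(\rho, \omega)||f(\rho, \omega)|}{(1 + |t - r|)^{2 + 2\gamma}}\,\rho^2\,d\rho\,dr\\
&\lesssim |f(t + 1, \omega)|^2(t + 1)^2 + \int_0^{t+1}\Big(\int_0^\rho\frac{dr}{(1 + |t - r|)^{2 + 2\gamma}}\Big)|\partial_r f(\rho, \omega)||f(\rho, \omega)|\,\rho^2\,d\rho\\
&\lesssim |f(t + 1, \omega)|^2(t + 1)^2 + \int_0^{t+1}\frac{|\partial_r f(\rho, \omega)||f(\rho, \omega)|}{(1 + |t - \rho|)^{1 + 2\gamma}}\,\rho^2\,d\rho\\
&\lesssim |f(t + 1, \omega)|^2(t + 1)^2 + \Big(\int_0^{t+1}\frac{|\partial_r f(\rho, \omega)|^2\rho^2\,d\rho}{(1 + |t - \rho|)^{2\gamma}}\Big)^{1/2}\Big(\int_0^{t+1}\frac{|f(\rho, \omega)|^2\rho^2\,d\rho}{(1 + |t - \rho|)^{2 + 2\gamma}}\Big)^{1/2},
\end{align*}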
where we first changed the order of integration and then used the Cauchy-Schwarz inequality. It therefore follows that
\[ \int_0^{t+1}\frac{|f(r, \omega)|^2 r^2\,dr}{(1 + |t - r|)^{2 + 2\gamma}} \lesssim |f(t + 1, \omega)|^2(t + 1)^2 + \int_0^{t+1}\frac{|\partial_r f(\rho, \omega)|^2\rho^2\,d\rho}{(1 + |t - \rho|)^{2\gamma}}, \]
and if we also integrate over the angular variables we get
\[ \int_{|x| \le (t+1)}\frac{|f(x)|^2\,dx}{(1 + |t - r|)^{2 + 2\gamma}} \lesssim \int_{S(t+1)}|f|^2\,dS + \int_{|x| \le (t+1)}\frac{|\partial_r f(x)|^2\,dx}{(1 + |t - r|)^{2\gamma}}. \]
On the other hand, if we instead integrate from $t + 1$ to $2(t + 1)$ we similarly obtain
\begin{align*}
\int_{t+1}^{2(t+1)}\frac{|f(r, \omega)|^2 r^2\,dr}{(1 + |t - r|)^{2 + 2\gamma}} &\lesssim \int_{t+1}^{2(t+1)}\frac{|f(t + 1, \omega)|^2(t + 1)^2\,dr}{(1 + |t - r|)^{2 + 2\gamma}} + \int_{t+1}^{2(t+1)}\!\!\int_{t+1}^{r}\frac{|\partial_r f(\rho, \omega)||f(\rho, \omega)|}{(1 + |t - r|)^{2 + 2\gamma}}\,\rho^2\,d\rho\,dr\\
&\lesssim |f(t + 1, \omega)|^2(t + 1)^2 + \int_{t+1}^{2(t+1)}\Big(\int_\rho^{2(t+1)}\frac{dr}{(1 + |t - r|)^{2 + 2\gamma}}\Big)|\partial_r f(\rho, \omega)||f(\rho, \omega)|\,\rho^2\,d\rho\\
&\lesssim |f(t + 1, \omega)|^2(t + 1)^2 + \int_{t+1}^{2(t+1)}\frac{|\partial_r f(\rho, \omega)||f(\rho, \omega)|}{(1 + |t - \rho|)^{1 + 2\gamma}}\,\rho^2\,d\rho,
\end{align*}
and as before it follows that
\[ \int_{(t+1) \le |x| \le 2(t+1)}\frac{|f(x)|^2\,dx}{(1 + |t - r|)^{2 + 2\gamma}} \lesssim \int_{S(t+1)}|f|^2\,dS + \int_{(t+1) \le |x| \le 2(t+1)}\frac{|\partial_r f(x)|^2\,dx}{(1 + |t - r|)^{2\gamma}}. \]
Finally, in the region $r \ge 2(t + 1)$ the estimate (9.1) would follow from the Hardy type inequality:
\[ \int_{|x| \ge (t+1)}\frac{|f(x)|^2\,dx}{|x|^{2 + 2\gamma}} \lesssim \int_{|x| \ge (t+1)}\frac{|\partial_r f(x)|^2\,dx}{|x|^{2\gamma}} + (t + 1)^{-1 - 2\gamma}\int_{S(t+1)}|f|^2\,dS, \tag{9.2} \]
which holds provided the left-hand side is bounded. For the proof one can assume that $f$ has compact support, since we can choose a sequence of compactly supported functions converging to a given function $f$ in the norm defined by the right-hand side as long as the norm on the left of $f$ is bounded. Inequality (9.2) for compactly supported smooth functions can be easily seen by integrating the identity
\[ \partial_r\Big(\frac{r^2 f^2}{r^{1 + 2\gamma}}\Big) = \frac{2r^2}{r^{1 + 2\gamma}}\,f\cdot\partial_r f + (1 - 2\gamma)\frac{r^2}{r^{2 + 2\gamma}}\,f^2, \qquad \gamma \neq 1/2, \]
from $r = t + 1$ to $r = \infty$ and using Cauchy-Schwarz as above.

We now state the global Sobolev inequality, which is due to S. Klainerman [K1].

Proposition 9.2. The following inequality holds for an arbitrary smooth function $\phi$,
\[ |\phi(t, x)|\,(1 + t + |t - r|)(1 + |t - r|)^{1/2} \le C\sum_{|I| \le 3}\|Z^I\phi(t, \cdot)\|_{L^2}. \]
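Two elementary steps in the proof of Lemma 9.1 are worth recording: the bound on the inner $r$-integral after changing the order of integration, and the absorption step after Cauchy-Schwarz. The sketch below uses the abbreviations $a = 1 + 2\gamma > 0$, $X$ for the weighted integral of $|f|^2$, $A$ for the boundary term and $B$ for the weighted integral of $|\partial_r f|^2$; these letters are introduced here for illustration only.

```latex
\begin{align*}
&\text{For } 0 \le \rho \le t:\qquad
 \int_0^\rho\frac{dr}{(1 + t - r)^{1 + a}}
  = \frac1a\Big[(1 + t - \rho)^{-a} - (1 + t)^{-a}\Big]
  \le \frac1a\,(1 + |t - \rho|)^{-a},\\
&\text{For } t \le \rho \le t + 1:\qquad
 \int_0^\rho\frac{dr}{(1 + |t - r|)^{1 + a}} \le \frac1a + 1
  \le \Big(\frac1a + 1\Big)2^{a}\,(1 + |t - \rho|)^{-a},\\
&X \le C\big(A + B^{1/2}X^{1/2}\big),\ X < \infty
 \;\Longrightarrow\; X \le CA + \tfrac12 X + \tfrac12 C^2 B
 \;\Longrightarrow\; X \le 2CA + C^2 B .
\end{align*}
```

The first two lines are where the assumption $\gamma > -1/2$ enters, and the last line is the reason the lemma requires the left-hand side of (9.1) to be bounded: the term $\tfrac12 X$ can only be absorbed if $X$ is finite.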
10. Decay Estimates for the Wave Equation on a Curved Space Time

In this section we will derive some basic estimates for the scalar wave equation on a curved background. The results will require some weak assumptions on the metric $g$, which will be easily verified in the case of a metric satisfying the reduced Einstein equations. We consider the reduced scalar wave equation:
\[ \widetilde\Box_g\phi = F. \tag{10.1} \]
The following result is a generalization of the lemma in [L1] to the variable coefficient case:

Lemma 10.1. Suppose that $\phi$ satisfies the reduced scalar wave equation (10.1) on a curved background with a metric $g$. Suppose that $H^{\alpha\beta} = g^{\alpha\beta} - m^{\alpha\beta}$ satisfies
\[ |H| \le \frac14, \quad\text{and}\quad |H|_{\mathcal L\mathcal T} \le \frac14\,\frac{|q| + 1}{1 + t + |x|} \tag{10.2} \]
when $t/2 \le |x| \le 2t$ and
\[ \int_0^\infty\|H(t, \cdot)\|_{L^\infty(D_t)}\,\frac{dt}{1 + t} \le \frac14, \quad\text{where } D_t = \{x \in \mathbb R^3;\ t/2 \le |x| \le 2t\}. \tag{10.3} \]
Then for any $t \ge 0$ and $x \in \mathbb R^3$,
\begin{align*}
(1 + t + |x|)\,|\partial\phi(t, x)| \le{}& C\sup_{0 \le \tau \le t}\sum_{|I| \le 1}\|Z^I\phi(\tau, \cdot)\|_{L^\infty}\\
&+ C\int_0^t\Big((1 + \tau)\|F(\tau, \cdot)\|_{L^\infty(D_\tau)} + \sum_{|I| \le 2}(1 + \tau)^{-1}\|Z^I\phi(\tau, \cdot)\|_{L^\infty(D_\tau)}\Big)\,d\tau. \tag{10.4}
\end{align*}

Proof. Since by Lemma 7.2
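\[ (1 + |t - r|)|\partial\phi| + (1 + t + r)|\bar\partial\phi| \le C\sum_{|I| = 1}|Z^I\phi|, \qquad r = |x|, \tag{10.5} \]
the inequality (10.4) holds when $r < t/2 + 1/2$ or $r > 2t - 1$. Furthermore, since
\[ (1 + r)|\partial_q\phi| \le C|\partial_q(r\phi)| + C|\phi|, \qquad r \ge 1, \tag{10.6} \]
it follows that
\[ (1 + t + r)|\partial\phi| \le C\sum_{|I| \le 1}|Z^I\phi| + C|\partial_q(r\phi)|. \tag{10.7} \]
Hence it suffices to prove that $|\partial_q(r\phi)|$ is bounded by the right-hand side of (10.4) when $t/2 + 1/2 < r < 2t - 1$. By Lemma 7.3,
\[ \Big|\Big(4\partial_s - \frac{H_{LL}}{2g^{L\underline L}}\partial_q\Big)\partial_q(r\phi)\Big| \lesssim \Big(1 + \frac{r\,|H|_{\mathcal L\mathcal T}}{1 + |q|} + |H|\Big)\,r^{-1}\sum_{|I| \le 2}|Z^I\phi| + |H|\,r^{-1}|\partial_q(r\phi)| + r|F| \tag{10.8} \]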
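and using the decay assumptions (10.2) and (10.7) we get
\[ \Big|\Big(4\partial_s - \frac{H_{LL}}{2g^{L\underline L}}\partial_q\Big)\partial_q(r\phi)\Big| \lesssim \frac{|H|}{1 + t}\,|\partial_q(r\phi)| + \sum_{|I| \le 2}\frac{|Z^I\phi|}{1 + t} + C(t + 1)|F|, \quad\text{when } t/2 + 1/2 \le r \le 2t - 1. \tag{10.9} \]
Along an integral curve $(t, x(t))$ of the vector field $\partial_s + H_{LL}(2g^{L\underline L})^{-1}\partial_q$, contained in the region $t/2 + 1/2 \le |x| \le 2t - 1$, we have the following equation for $\psi = \partial_q(r\phi)$:
\[ \Big|\frac{d}{dt}\psi\Big| \le \hat h|\psi| + f, \tag{10.10} \]
where $\hat h = C|H|/(1 + t)$ and $f = Ct|F| + C\sum_{|I| \le 2}|Z^I\phi|/(1 + t)$. Hence multiplying (10.10) with the integrating factor $e^{-\hat H}$, where $\hat H = \int\hat h(s)\,ds$, we get
\[ \frac{d}{dt}\big(|\psi|\,e^{-\hat H}\big) \le f\,e^{-\hat H}. \tag{10.11} \]
If we integrate backwards along an integral curve from any point $(t, x)$ in the set $t/2 + 1/2 \le |x| \le 2t - 1$ until the first time the curve intersects the boundary of the set at $(\tau, y)$, $|y| = \tau/2 + 1/2$ or $|y| = 2\tau - 1$, we obtain
\[ |\psi(t, x)| \le \exp\Big(\int_\tau^t\|\hat h(\sigma, \cdot)\|_{L^\infty}\,d\sigma\Big)|\psi(\tau, y)| + \int_\tau^t\exp\Big(\int_{\tau'}^t\|\hat h(\sigma, \cdot)\|_{L^\infty}\,d\sigma\Big)\|f(\tau', \cdot)\|_{L^\infty}\,d\tau', \]
where the $L^\infty$ norms are taken only over the set $t/2 + 1/2 \le |x| \le 2t - 1$. (Note that any integral curve has to intersect either of the two boundaries $r = t/2 + 1/2$ or $r = 2t - 1$ since the slope of the curve $x(t)$ has to be close to 1 when $H_{LL}$ is small.) The lemma now follows from taking the supremum over $x$ in the set $t/2 + 1/2 \le |x| \le 2t - 1$, using that on the cones $|y| = \tau/2 + 1/2$ or $|y| = 2\tau - 1$ we have that $|\psi| \le Cr|\partial_q\phi| + C|\phi| \le C\sum_{|I| \le 1}|Z^I\phi|$, by (10.5), and using that by (10.3) $\int_0^t\|\hat h(\sigma, \cdot)\|_{L^\infty}\,d\sigma \le \tfrac14$.

For second order derivatives we have an estimate which gives a slightly worse decay:

Lemma 10.2. Let $\phi$ be a solution of the reduced scalar wave equation on a curved background with a metric $g$. Assume that $H^{\alpha\beta} = g^{\alpha\beta} - m^{\alpha\beta}$ satisfies
\[ \sum_{|I| \le 1}|Z^I H| \le \frac{\tilde\varepsilon}{4}, \quad\text{and}\quad \sum_{|I| \le 1}|Z^I H|_{\mathcal L\mathcal L} + |H|_{\mathcal L\mathcal T} \le \frac{\tilde\varepsilon}{4}\,\frac{|q| + 1}{1 + t + |x|} \tag{10.12} \]
when $t/2 \le |x| \le 2t$ for some $\tilde\varepsilon \le 1$. Then, for $t \ge 0$, $x \in \mathbb R^3$, we have
\begin{align*}
(1 + t + |x|)\sum_{|I| \le 1}|\partial Z^I\phi(t, x)| \le{}& C\sup_{0 \le \tau \le t}\sum_{|I| \le 2}\Big(\frac{1 + t}{1 + \tau}\Big)^{C\tilde\varepsilon}\|Z^I\phi(\tau, \cdot)\|_{L^\infty}\\
&+ C\int_0^t\Big(\frac{1 + t}{1 + \tau}\Big)^{C\tilde\varepsilon}\Big((1 + \tau)\sum_{|I| \le 1}\|Z^I F(\tau, \cdot)\|_{L^\infty(D_\tau)} + \sum_{|I| \le 3}(1 + \tau)^{-1}\|Z^I\phi(\tau, \cdot)\|_{L^\infty(D_\tau)}\Big)\,d\tau, \tag{10.13}
\end{align*}
where $D_t = \{x \in \mathbb R^3;\ t/2 \le |x| \le 2t\}$.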
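Proof. First, when $r < t/2$ or $r > 2t$ the lemma trivially follows from (10.5) with $\phi$ replaced by $Z\phi$, so it only remains to prove the lemma when $t/2 \le r \le 2t$. We have
\[ \widetilde\Box_g Z\phi = F_Z = \hat Z F + \big(\widetilde\Box_g Z\phi - \hat Z\,\widetilde\Box_g\phi\big), \tag{10.14} \]
where by (7.18) the additional commutator term can be estimated by
\[ \big|\widetilde\Box_g Z\phi - \hat Z\,\widetilde\Box_g\phi\big| \lesssim \Big(\frac{|ZH| + |H|}{1 + t + |q|} + \frac{|ZH|_{\mathcal L\mathcal L} + |H|_{\mathcal L\mathcal T}}{1 + |q|}\Big)\sum_{|I| \le 1}|\partial Z^I\phi| \lesssim \frac{\tilde\varepsilon}{1 + t + |q|}\sum_{|I| \le 1}|\partial Z^I\phi|, \tag{10.15} \]
where we used the decay assumption (10.12). Furthermore with the help of (10.7), applied to $Z^I\phi$ in place of $\phi$, we obtain
\[ \big|\widetilde\Box_g Z\phi - \hat Z\,\widetilde\Box_g\phi\big| \lesssim \frac{\tilde\varepsilon}{(1 + t + |q|)^2}\Big(\sum_{|I| \le 1}\big|\partial_q(rZ^I\phi)\big| + \sum_{|I| \le 2}|Z^I\phi|\Big). \tag{10.16} \]
Hence by (10.9) applied to (10.14) in place of (10.1) we get
\[ \Big|\Big(4\partial_s - \frac{H_{LL}}{2g^{L\underline L}}\partial_q\Big)\partial_q(rZ\phi)\Big| \lesssim \frac{\tilde\varepsilon}{1 + t}\sum_{|I| \le 1}|\partial_q(rZ^I\phi)| + t\,(|ZF| + |F|) + \sum_{|I| \le 3}\frac{|Z^I\phi|}{1 + t} \tag{10.17} \]
when $t/2 + 1/2 \le r \le 2t - 1$. Therefore
\[ \sum_{|I| \le 1}\Big|\Big(4\partial_s - \frac{H_{LL}}{2g^{L\underline L}}\partial_q\Big)\partial_q(rZ^I\phi)\Big| \le \frac{C\tilde\varepsilon}{1 + t}\sum_{|I| \le 1}|\partial_q(rZ^I\phi)| + C\sum_{|I| \le 3}\frac{|Z^I\phi|}{1 + t} + Ct\,(|ZF| + |F|). \tag{10.18} \]
The desired result follows by multiplying (10.18) by the factor $(1 + t)^{-C\tilde\varepsilon}$ and integrating as in the proof of the previous lemma. Along an integral curve we have the equation
\[ \frac{d}{dt}\big(\psi\,(1 + t)^{-C\tilde\varepsilon}\big) \le (1 + t)^{-C\tilde\varepsilon}f, \tag{10.19} \]
where
\[ \psi = \sum_{|I| \le 1}|\partial_q(rZ^I\phi)|, \qquad f = C(1 + t)(|ZF| + |F|) + C\sum_{|I| \le 3}\frac{|Z^I\phi|}{1 + t}. \tag{10.20} \]
The lemma now follows as in the proof of Lemma 10.1.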
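Both Lemma 10.1 and Lemma 10.2 use the same integrating factor argument along an integral curve. The following is a sketch of that step for a generic nonnegative coefficient $\hat h$, with $\hat H$ denoting an antiderivative of $\hat h$ along the curve:

```latex
\begin{align*}
\frac{d}{dt}\big(|\psi|\,e^{-\hat H}\big) \le f\,e^{-\hat H}
 &\;\Longrightarrow\;
 |\psi(t)| \le e^{\hat H(t) - \hat H(\tau)}\,|\psi(\tau)|
   + \int_\tau^t e^{\hat H(t) - \hat H(\tau')}\,f(\tau')\,d\tau',\\
\hat h(t) = \frac{C\tilde\varepsilon}{1 + t}
 &\;\Longrightarrow\;
 \hat H(t) - \hat H(\tau) = C\tilde\varepsilon\,\ln\frac{1 + t}{1 + \tau},
 \qquad
 e^{\hat H(t) - \hat H(\tau)} = \Big(\frac{1 + t}{1 + \tau}\Big)^{C\tilde\varepsilon}.
\end{align*}
```

In Lemma 10.1 the integrability assumption (10.3) keeps $\hat H(t) - \hat H(\tau)$ uniformly bounded, so the exponential is absorbed into the constant. In Lemma 10.2 only the pointwise bound (10.12) is available, which produces the factor $\big((1 + t)/(1 + \tau)\big)^{C\tilde\varepsilon}$ in (10.13).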
We observe that similar estimates hold for a system
\[ \widetilde\Box_g\phi_{\mu\nu} = F_{\mu\nu}. \tag{10.21} \]
In particular, in our case, certain components of Fµν expressed in the null-frame will decay better than others and for these components we will also get better estimates for
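$\phi_{\mu\nu}$. Since the vector fields $L$ and $\underline L$ commute with contractions with any of the vector fields $\{L, \underline L, S_1, S_2\}$, the proofs of the preceding lemmas imply the following result:

Corollary 10.3. Let $\phi_{\mu\nu}$ be a solution of the reduced wave equation system (10.21) on a curved background with a metric $g$. Assume that $H^{\alpha\beta} = g^{\alpha\beta} - m^{\alpha\beta}$ satisfies
\[ \sum_{|I| \le 1}|Z^I H| \le \frac{\tilde\varepsilon}{4}, \quad\text{and}\quad \sum_{|I| \le 1}|Z^I H|_{\mathcal L\mathcal L} + |H|_{\mathcal L\mathcal T} \le \frac{\tilde\varepsilon}{4}\,\frac{|q| + 1}{1 + t + |x|} \tag{10.22} \]
when $t/2 \le |x| \le 2t$, for some $\tilde\varepsilon \le 1$, and
\[ \int_0^\infty\|H(t, \cdot)\|_{L^\infty(D_t)}\,\frac{dt}{1 + t} \le \frac{\tilde\varepsilon}{4}, \tag{10.23} \]
where $D_t = \{x \in \mathbb R^3;\ t/2 \le |x| \le 2t\}$. Then for any $U, V \in \{L, \underline L, S_1, S_2\}$ and any $t \ge 0$, $x \in \mathbb R^3$:
\begin{align*}
(1 + t + |x|)\,|\partial\phi(t, x)|_{UV} \le{}& C\sup_{0 \le \tau \le t}\sum_{|I| \le 1}\|Z^I\phi(\tau, \cdot)\|_{L^\infty}\\
&+ C\int_0^t\Big((1 + \tau)\big\||F|_{UV}(\tau, \cdot)\big\|_{L^\infty(D_\tau)} + \sum_{|I| \le 2}(1 + \tau)^{-1}\|Z^I\phi(\tau, \cdot)\|_{L^\infty(D_\tau)}\Big)\,d\tau, \tag{10.24}\\
(1 + t + |x|)\sum_{|I| \le 1}|\partial Z^I\phi(t, x)| \le{}& C\sup_{0 \le \tau \le t}\sum_{|I| \le 2}\Big(\frac{1 + t}{1 + \tau}\Big)^{C\tilde\varepsilon}\|Z^I\phi(\tau, \cdot)\|_{L^\infty}\\
&+ C\int_0^t\Big(\frac{1 + t}{1 + \tau}\Big)^{C\tilde\varepsilon}\Big(\sum_{|I| \le 1}\big\|r(\cdot)|Z^I F|(\tau, \cdot)\big\|_{L^\infty(D_\tau)} + \sum_{|I| \le 3}(1 + \tau)^{-1}\|Z^I\phi(\tau, \cdot)\|_{L^\infty(D_\tau)}\Big)\,d\tau. \tag{10.25}
\end{align*}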
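Proof. By Lemma 7.3 for each component we have the estimate
$$\Big|\Big(4\partial_s - \frac{H_{LL}}{2g^{L\underline{L}}}\partial_q - \frac{\operatorname{tr}H + H_{L\underline{L}}}{2g^{L\underline{L}}\,r}\Big)\partial_q(r\phi_{\mu\nu}) + \frac{rF_{\mu\nu}}{2g^{L\underline{L}}}\Big| \lesssim \Big(1 + \frac{r\,|H|_{LT}}{1+|q|} + |H|\Big)\,r^{-1}\sum_{|I|\le 2}|Z^I\phi_{\mu\nu}|, \tag{10.26}$$
and since $\partial_s$ and $\partial_q$ commute with contractions with the frame vectors $L$, $\underline{L}$ we get
$$\Big|\Big(4\partial_s - \frac{H_{LL}}{2g^{L\underline{L}}}\partial_q - \frac{\operatorname{tr}H + H_{L\underline{L}}}{2g^{L\underline{L}}\,r}\Big)\partial_q(r\phi_{UV}) + \frac{rF_{UV}}{2g^{L\underline{L}}}\Big| \lesssim \Big(1 + \frac{r\,|H|_{LT}}{1+|q|} + |H|\Big)\,r^{-1}\sum_{|I|\le 2}|Z^I\phi|. \tag{10.27}$$
As before it also follows that
$$(1+t+r)\,|\partial\phi|_{UV} \lesssim \sum_{|I|\le 1}|Z^I\phi| + |\partial_q(r\phi)|_{UV}. \tag{10.28}$$
The lemma now follows as before.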
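11. Energy Estimates for the Wave Equation on a Curved Space Time

In this section we derive the energy estimate for a solution $\phi$ of the inhomogeneous wave equation
$$\widetilde{\Box}_g\phi = F \tag{11.1}$$
under the following assumptions on the metric $g^{\alpha\beta} = m^{\alpha\beta} + H^{\alpha\beta}$:
$$(1+|q|)^{-1}|H|_{LL} + |\partial H|_{LL} + |\bar\partial H| \le C\varepsilon(1+t)^{-1}, \qquad (1+|q|)^{-1}|H| + |\partial H| \le C\varepsilon(1+t)^{-1/2}(1+|q|)^{-1/2-\gamma}. \tag{11.2}$$

Proposition 11.1. Let $\phi$ be a solution of the wave equation (11.1) with the metric $g$ verifying the assumptions (11.2). Then for any $0 < \gamma \le 1/2$, there is an $\varepsilon_0$ such that for $\varepsilon < \varepsilon_0$,
$$\int_{\Sigma_t}|\partial\phi|^2 + \int_0^t\!\!\int_{\Sigma_\tau}\frac{\gamma\,|\bar\partial\phi|^2}{(1+|q|)^{1+2\gamma}} \le 8\int_{\Sigma_0}|\partial\phi|^2 + C\varepsilon\int_0^t\!\!\int_{\Sigma_\tau}\frac{|\partial\phi|^2}{1+t} + 16\int_0^t\!\!\int_{\Sigma_\tau}|F|\,|\partial_t\phi|. \tag{11.3}$$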
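Remark 11.2. Observe that by the Gronwall inequality the energy estimate of the above proposition implies $t^{\varepsilon}$ growth of the energy. For similar estimates, proved under different assumptions, see also [S1, A2, A3].

Proof. The proof of the proposition relies on the energy estimate obtained in Corollary 8.2,
$$\int_{\Sigma_t}\big(|\partial_t\phi|^2 + |\nabla\phi|^2\big) + \int_0^t\!\!\int_{\Sigma_\tau}\frac{\gamma\,|\bar\partial\phi|^2}{(1+|q|)^{1+2\gamma}} \le 4\int_{\Sigma_0}\big(|\partial_t\phi|^2 + |\nabla\phi|^2\big) + 8\int_0^t\!\!\int_{\Sigma_\tau}\Big|\partial_\alpha g^{\alpha\beta}\partial_\beta\phi\,\partial_t\phi - \tfrac12\partial_t g^{\alpha\beta}\partial_\alpha\phi\,\partial_\beta\phi + F\,\partial_t\phi\Big| + 2\int_0^t\!\!\int_{\Sigma_\tau}\frac{\gamma}{(1+|q|)^{1+2\gamma}}\Big|(g^{\alpha\beta} - m^{\alpha\beta})\partial_\alpha\phi\,\partial_\beta\phi + 2(g^{L\beta} - m^{L\beta})\partial_\beta\phi\,\partial_t\phi\Big|.$$
We start by decomposing the terms on the right hand side with respect to the null frame:
$$|\partial_\alpha g^{\alpha\beta}\partial_\beta\phi\,\partial_t\phi| \le \big(|H|\,|\partial H| + |(\partial H)_{LL}| + |\bar\partial H|\big)|\partial\phi|^2 + |\partial H|\,|\bar\partial\phi|\,|\partial\phi|.$$
Similarly,
$$|\partial_t g^{\alpha\beta}\partial_\alpha\phi\,\partial_\beta\phi| \le \big(|g - m|\,|\partial g| + |(\partial g)_{LL}| + |\bar\partial g|\big)|\partial\phi|^2 + |\partial g|\,|\bar\partial\phi|\,|\partial\phi|.$$
Therefore, using the assumptions (11.2) on the metric $g$, we obtain that
$$\Big|\partial_\alpha g^{\alpha\beta}\partial_\beta\phi\,\partial_t\phi - \tfrac12\partial_t g^{\alpha\beta}\partial_\alpha\phi\,\partial_\beta\phi\Big| \lesssim \frac{\varepsilon}{1+t}|\partial\phi|^2 + \frac{\varepsilon}{(1+|q|)^{1+2\gamma}}|\bar\partial\phi|^2. \tag{11.4}$$
Decomposing the remaining terms we infer that
$$|(g^{\alpha\beta} - m^{\alpha\beta})\partial_\alpha\phi\,\partial_\beta\phi| \le |H_{LL}|\,|\partial\phi|^2 + |H|\,|\bar\partial\phi|\,|\partial\phi|.$$
Similarly,
$$|(g^{\alpha\beta} - m^{\alpha\beta})L_\alpha\partial_\beta\phi\,\partial_t\phi| \le |H_{LL}|\,|\partial\phi|^2 + |H|\,|\bar\partial\phi|\,|\partial\phi|.$$
Once again, using the assumptions (11.2), we have
$$\big|2(g^{\alpha\beta} - m^{\alpha\beta})L_\alpha\partial_\beta\phi\,\partial_t\phi + (g^{\alpha\beta} - m^{\alpha\beta})\partial_\alpha\phi\,\partial_\beta\phi\big| \lesssim \varepsilon\,\frac{1+|q|}{1+t}|\partial\phi|^2 + \frac{\varepsilon}{(1+|q|)^{2\gamma}}|\bar\partial\phi|^2. \tag{11.5}$$
Thus
$$\int_{\Sigma_t}|\partial\phi|^2 + \int_0^t\!\!\int_{\Sigma_\tau}\frac{\gamma\,|\bar\partial\phi|^2}{(1+|q|)^{1+2\gamma}} \le 4\int_{\Sigma_0}|\partial\phi|^2 + C\varepsilon\int_0^t\!\!\int_{\Sigma_\tau}\Big(\frac{|\partial\phi|^2}{1+t} + \frac{|\bar\partial\phi|^2}{(1+|q|)^{1+2\gamma}}\Big) + 8\int_0^t\!\!\int_{\Sigma_\tau}|F|\,|\partial_t\phi|,$$
and the desired estimate follows if we take $\varepsilon$ so small that $C\varepsilon < \gamma/2$.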
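The Gronwall step mentioned in Remark 11.2 can be spelled out as follows. This is only a sketch under simplifying assumptions that are not made in the text: $F = 0$, the nonnegative space-time integral on the left of (11.3) is dropped, and $E(t)$ and $a$ are shorthand introduced here for $\int_{\Sigma_t}|\partial\phi|^2$ and $C\varepsilon$.

```latex
% Assumptions (ours, for illustration): F = 0,
% E(t) := \int_{\Sigma_t} |\partial\phi|^2,  a := C\varepsilon.
% Dropping the nonnegative second term on the left of (11.3):
\begin{align*}
E(t) &\le 8E(0) + a\int_0^t \frac{E(\tau)}{1+\tau}\,d\tau =: G(t), \\
G'(t) &= \frac{a\,E(t)}{1+t} \le \frac{a\,G(t)}{1+t}
 \quad\Longrightarrow\quad
 \frac{d}{dt}\Big[(1+t)^{-a}\,G(t)\Big] \le 0, \\
E(t) &\le G(t) \le G(0)\,(1+t)^{a} = 8E(0)\,(1+t)^{C\varepsilon}.
\end{align*}
```

Under these assumptions the energy can grow at most like a small power of $1+t$, with exponent proportional to $\varepsilon$.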
12. Estimates from the Wave Coordinate Condition

In previous sections we have shown that one only needs to control certain components of the metric in order to establish decay estimates for solutions of the reduced wave equation. In this section we will see that the wave coordinate condition allows one to estimate precisely those components in terms of tangential derivatives or higher order terms with better decay. Recall that the wave coordinate condition can be written in the form
$$\partial_\mu\Big(g^{\mu\nu}\sqrt{|\det g|}\Big) = 0. \tag{12.1}$$
We have the following decomposition:
$$g^{\mu\nu}\sqrt{|\det g|} = \big(m^{\mu\nu} + H^{\mu\nu}\big)\Big(1 - \tfrac12\operatorname{tr}H + O(H^2)\Big),$$
where $H^{\alpha\beta} = g^{\alpha\beta} - m^{\alpha\beta}$, $h_{\alpha\beta} = g_{\alpha\beta} - m_{\alpha\beta}$. Recall also that $g^{\alpha\beta}$ is the inverse of $g_{\alpha\beta}$ and $H^{\alpha\beta} = -m^{\mu\alpha}m^{\nu\beta}h_{\mu\nu} + O(h^2)$. Therefore we obtain the following expression for the wave coordinate condition:
$$\partial_\mu\Big(H^{\mu\nu} - \tfrac12 m^{\mu\nu}\operatorname{tr}H + O^{\mu\nu}(H^2)\Big) = 0. \tag{12.2}$$
Using that we can express the divergence in terms of the null frame,
$$\partial_\mu F^\mu = L_\mu\partial_q F^\mu - \underline{L}_\mu\partial_s F^\mu + A_\mu\partial_A F^\mu, \tag{12.3}$$
we obtain:

Lemma 12.1. Assume that $|H| \le 1/4$. Then
$$|\partial H|_{LT} \lesssim |\bar\partial H| + |H|\,|\partial H|, \qquad |\partial\,\overline{\operatorname{tr}}\,H| \lesssim |\bar\partial H| + |H|\,|\partial H|. \tag{12.4}$$
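Proof. It follows from (12.2) and (12.3) that
$$\Big|L_\mu\,\partial\Big(H^{\mu\nu} - \tfrac12 m^{\mu\nu}\operatorname{tr}H\Big)\Big| \le |\bar\partial H| + |H|\,|\partial H|. \tag{12.5}$$
Contracting with $T_\nu$ and using that $m_{TL} = 0$ gives the first inequality, and contracting with $\underline{L}_\nu$ and using that $m_{L\underline{L}} = -2$ gives the second since
$$H_{L\underline{L}} + \operatorname{tr}H = \overline{\operatorname{tr}}\,H. \tag{12.6}$$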
We now compute the commutators of the wave coordinate condition with the vector fields Z. Lemma 12.2. Let Z be one of the Minkowski Killing or conformally Killing vector fields and let tensor H satisfy the wave coordinate condition. Then the estimate J ∂H ¯ JH| + |∂Z |Z Ik H | · · · |Z I2 H | |∂Z I1 H | LT |J |≤|I |
I1 +...+Ik =I, k≥2
holds true for the expression J µν + Hµν = ZJ H
γ ν , cJ µ Z J H I γ
|J |<|I | I γµ
with some constant tensors cJ
µν = Hµν − 1 mµν trH where H 2
(12.7)
I L
such that cJ L = 0 if |J | = |I | − 1.
Proof. The wave coordinate condition (12.1) can be written in the form µν µν = (mµν + H µν ) | det g|. = 0, where G ∂µ G Let Z be one of the Minkowski Killing or conformally Killing vector fields. Then for any vector field F we have that
cJI γα Z J F γ = ∂α cJI γα Z J F γ , Z I ∂α F α = ∂α Z I F α + |J |<|I |
|J ≤|I |
where cJαγ are constants such that cJI γα = δ αγ ,
if |J | = |I |
and
I L
cJ L = 0,
if
|J | = |I | − 1.
I γ
The last identity is a consequence of the relation between cJ α and the commutator constants cαβ = [∂α , Z]β for which we have established that cLL = 0. It therefore follows that I µγ J γ ν = 0. cJ Z G ∂µ |J |≤|I |
Decomposing relative to the null frame (L, L, S1 , S2 ) we obtain
I Lγ J I I Lγ J γ ν = ∂s cJ Z G cJ Z Gγ ν − Aµ ∂¯A cJ ∂q |J |≤|I |
|J |≤|I |
|J |≤|I |
µγ
γν . ZJ G
We now contract the above identity with one of the tangential vector fields T ν , T ∈ {L, S1 , S2 } to obtain I Lγ γ ν ∂Z ¯ IG γ ν + γ ν . cJ T ν ∂q Z J G L T ∂q Z I G |J |<|I |
|J |≤|I |
We examine the expression
γ ν = Lγ T ν ∂q Z J (mγ ν + Hγ ν ) | det g| Lγ T ν Z J ∂q G
= Lγ T ν ∂q (Z J1 Hγ ν )Z J2 | det g| , J1 +J2 =J
since $m_{LT} = L^\gamma T^\nu m_{\gamma\nu} = 0$. The desired estimate now follows from the identity $\sqrt{|\det g|} = 1 + f(H)$, which holds with a smooth function $f(H)$ such that $f(H) = -\operatorname{tr}H/2 + O(H^2)$.

We now summarize the results of this section in the following

Lemma 12.3. For a tensor $H$ obeying the wave coordinate condition
$$|\partial H|_{LT} \lesssim |\bar\partial H| + |H|\,|\partial H|, \tag{12.8}$$
and
$$|\partial ZH|_{LL} \lesssim |\partial H|_{LT} + \sum_{|I|\le 1}|\bar\partial Z^I H| + \sum_{|I|+|J|\le 1}|Z^I H|\,|\partial Z^J H|. \tag{12.9}$$
In general,
$$|\partial Z^I H|_{LT} \lesssim \sum_{|J|\le|I|}|\bar\partial Z^J H| + \sum_{|J|\le|I|-1}|\partial Z^J H| + \sum_{|I_1|+\dots+|I_m|\le|I|,\ m\ge 2}|Z^{I_m}H|\cdots|Z^{I_2}H|\,|\partial Z^{I_1}H|, \tag{12.10}$$
and
$$|\partial Z^I H|_{LL} \lesssim \sum_{|J|\le|I|}|\bar\partial Z^J H| + \sum_{|J|\le|I|-1}|\partial Z^J H|_{LT} + \sum_{|K|\le|I|-2}|\partial Z^K H| + \sum_{|I_1|+\dots+|I_m|\le|I|,\ m\ge 2}|Z^{I_m}H|\cdots|Z^{I_2}H|\,|\partial Z^{I_1}H|. \tag{12.11}$$
The same estimates also hold for $H$ replaced by $h$.

Proof. This follows directly from the previous lemma with the help of the identities $m_{LT} = 0$ and $c^{I\,L}_{J\,L} = 0$.
13. Estimates for the Inhomogeneous Terms In this section we will show that the inhomogeneous terms of the reduced Einstein equations can be estimated in terms of tangential derivatives, for which we have better decay estimates, or tangential components which in turn can be expressed, using the wave coordinate condition, in terms of tangential derivatives and lower order terms. Recall that according to Lemma 3.2 the symmetric two tensor hµν = gµν − mµν verifies the reduced Einstein equations of the form: g hµν = Fµν (h)(∂h, ∂h), Fµν (h)(∂h, ∂h) = P (∂µ h, ∂ν h) + Qµν (∂h, ∂h) + Gµν (h)(∂h, ∂h), (13.1) 1 1 P (∂µ k, ∂ν p) = ∂µ trk ∂ν trp − ∂µ k αβ ∂ν pαβ . (13.2) 4 2 Here Qµν are linear combinations of the null-forms and Gµν (h)(∂h, ∂h) is a quadratic form in ∂h with coefficients that are smooth functions of h and vanishing at h = 0. Lemma 13.1. The quadratic form P satisfies the following pointwise estimate: ¯ | |∂k| + |∂p | |∂k|, ¯ |P (∂p, ∂k)|LT |∂p
(13.3)
|P (∂p, ∂k)| |∂p |LT |∂k|LT + |∂p |LL |∂k| + |∂p | |∂k|LL .
(13.4)
Proof. The first part of the statement follows trivially from (13.2). To prove (13.4) we use (5.9) applied to R µ ∂µ p in place of p and S ν ∂ν k in place of k, for any vector fields T and S, to obtain |T µ S ν P (∂µ p, ∂ν k)| |T µ ∂µ p |T U |S ν ∂ν k|T U +|T µ ∂µ p |LL |S ν ∂ν k| + |T µ ∂µ p | |S ν ∂ν k|LL , which proves the lemma.
(13.5)
Using the additional estimates on the hLL component, derived in Lemma 12.3 under the assumption that the wave coordinate condition holds, we obtain the following: Corollary 13.2. Under the additional assumption that h satisfies the wave coordinate condition (3.4), the quadratic form P obeys the estimate ¯ |∂h|, |P (∂h, ∂h)|T U |∂h|
(13.6)
¯ |∂h| + |h| |∂h|2 . |P (∂h, ∂h)| |∂h|2T U + |∂h|
(13.7)
Moreover,
|Z I P (∂h, ∂h)|
¯ J h| |∂Z K h| |∂Z J h|T U |∂Z K h|T U + |∂Z
|J |+|K|≤|I |
+
|∂Z J h|LT |∂Z K h|
|J |+|K|≤|I |−1
+
|∂Z J h| |∂Z K h|
|J |+|K|≤|I |−2
+
|J1 |+...+|Jm |≤|I |, m≥3
|Z Jm h|· · · |Z J3 h| |∂Z J2 h||∂Z J1 h|.
Proof. The inequality (13.6) follows directly from (13.3). To prove (13.7) we use (13.4) ¯ + |h| |∂h|. and that by the wave coordinate condition |∂h|LL |∂h| I We now note that Z P (∂µ h, ∂ν h) is a sum of terms of the form P (∂α Z J h, ∂β Z K h) for some α, β and |J | + |K| ≤ I : |P (∂Z J h, ∂Z K h)|. |Z I P (∂h, ∂h)| ≤ C |J |+|K|≤|I |
It follows from Lemma 5.4 and Lemma 12.3 that |P (∂Z J h, ∂Z K h)| |∂Z J h|T U |∂Z K h|T U + |∂Z J h|LL |∂Z K h| |J |+|K|≤|I |
|J |+|K|≤|I |
¯ J h| |Z K h| + |∂Z J h|T U |∂Z K h|T U |∂Z
|J |+|K|≤|I |
+
|∂Z J h|LT +
|J |+|K|≤|I | |J |≤|J |−1
+
|∂Z J h|
|J
|≤|J |−2
|Z Jm h| · · · |Z J2 h| |∂Z J1 h| |∂Z K h|
(13.8)
|J1 |+...+|Jm |≤|J |, m≥2
which proves the lemma.
Proposition 13.3. Let Fµν = Fµν (h)(∂h, ∂h) be as in Lemma 3.2 and assume that the wave coordinate condition holds. Then |F |T U |∂h| |∂h| + |h| |∂h|2
(13.9)
and ¯ |∂h| + |h| |∂h|2 , (13.10) |F | |∂h|2T U + |∂h| |ZF | |∂h|T U + |∂h| + |h| |∂h| (|∂Zh| + |∂h|) + |∂Zh| + |Zh| |∂h| |∂h|, (13.11) I J K J K ¯ h| |∂Z h| |Z F | |∂Z h|T U |∂Z h|T U + |∂Z |J |+|K|≤|I |
+
|∂Z J h|LT |∂Z K h| +
|J |+|K|≤|I |−1
+
|∂Z J h| |∂Z K h|
|J |+|K|≤|I |−2
|Z Jm h|· · · |Z J3 h| |∂Z J2 h||∂Z J1 h|.
(13.12)
|J1 |+...+|Jm |≤|I |, m≥3
Proof. First
|Z I Gµν (h)(∂h, ∂h)| ≤ C
|Z Ik h| · · · |Z I3 h| |∂Z I2 h| |∂Z I1 h|.
|I1 |+...+|Ik |≤|I |, k≥3
Since ZQ(∂u, ∂v) = Q(∂u, ∂Zv) + Q(∂Zu, ∂v) + a ij Qij (∂u, ∂v), and |Qµν (∂h, ∂k)| ≤ |∂h| |∂k| + |∂k| |∂h| it follows that |Qµν (∂Z J h, ∂Z K h)| ≤ C |∂Z J h| |∂Z K h|. |Z I Qµν (∂h, ∂h)| ≤ C |J |+|k|≤|I |
|J |+|k|≤|I |
14. The Decay Estimates for Einstein's Equations

In this section we will establish the improved decay estimates for Einstein's equations. Our strategy is to use the weak decay estimates, obtained from the assumed energy bounds, to prove sharper decay estimates and then to recover the energy bounds in the next section.

Theorem 14.1. Suppose that for some $0 < \gamma \le 1/2$,
$$|\partial Z^I h| \le C\varepsilon(1+t+|q|)^{-1/2-\gamma}(1+|q|)^{-1/2-\gamma}, \qquad |I| \le N/2 + 4, \tag{14.1}$$
$$|Z^I h| \le C\varepsilon(1+t)^{-1}, \qquad q = 1,\quad |I| \le N/2 + 4 \tag{14.2}$$
hold for $0 \le t \le T$. Then for $0 \le t \le T$ we have
$$|Z^I h| \le C\varepsilon(1+t+|q|)^{-1/2-\gamma}(1+|q|)^{1/2-\gamma}, \qquad |I| \le N/2 + 4, \tag{14.3}$$
$$|\bar\partial Z^I h| \le C\varepsilon(1+t+|q|)^{-3/2-\gamma}(1+|q|)^{1/2-\gamma}, \qquad |I| \le N/2 + 3. \tag{14.4}$$
Assume also that $h$ satisfies the wave coordinate condition. Then for $0 \le t \le T$ we have
$$|\partial h|_{LT} + |\partial Zh|_{LL} \le C\varepsilon(1+t)^{-1-2\gamma}, \quad\text{and}\quad |h|_{LT} + |Zh|_{LL} \le C\varepsilon(1+t)^{-1}(1+|q|). \tag{14.5}$$
Furthermore, if in addition $h$ satisfies Einstein's equations, then for $\varepsilon$ sufficiently small and $0 \le t \le T$ we also have
$$|\partial h|_{TU} \le C\varepsilon(1+t)^{-1}, \qquad |h|_{TU} \le C\varepsilon(1+t)^{-1}(1+|q|), \tag{14.6}$$
$$|\partial h| \le C\varepsilon(1+t)^{-1}\ln(2+t). \tag{14.7}$$
In general, there are constants $M_k$, $C_k$ and $\varepsilon_k > 0$ such that if $\varepsilon \le \varepsilon_k$, then for $|I| = k \le N/2 + 2$,
$$|\partial Z^I h| \le C_k\varepsilon(1+t)^{-1+M_k\varepsilon}, \quad\text{and}\quad |Z^I h| \le C_k\varepsilon(1+t)^{-1+M_k\varepsilon}(1+|q|). \tag{14.8}$$
Remark 14.2. We remind the reader that, as stated in Remark 2.4, our estimates make no distinction between the tensors h and H = −h + O(h2 ). In particular, one can directly verify that the conclusions of the theorem also hold for the tensor H . First we note that all the estimates (14.3)–(14.8) trivially follow from the assumptions (14.1)–(14.2) away from the light cone, thus the theorem is only useful in the region t/2 ≤ |x| ≤ 2t. The estimate (14.3) follows from integrating (14.1) from q = 1, where (14.2) hold. Similarly the second parts of (14.5), (14.6) and (14.8) follow from integrating the first and using (14.2). It follows from (14.3) and Lemma 7.2 that we have the better estimate (14.4) for the derivatives tangential to the outgoing Minkowski cones. The inequalities (14.5)–(14.8) for tangential derivatives certainly follow from (14.4), so it only remains to prove these estimates for a derivative transversal to the light cone. The missing improved estimates for a (∂t − ∂r ) derivative transversal to the light cones will be obtained, in the case of (14.5), from the wave coordinate condition, see Sect. 12, and for (14.6)-(14.8), from integrating the reduced Einstein wave equations, see Sect. 10. The estimates from the wave coordinate condition are easily obtained. In fact the first estimate in (14.5) follows directly from Lemma 12.1 using the estimates
(14.1), (14.3) and (14.4) and the second estimate in (14.5) follows integrating the first from q = 1, where (14.2) holds. However, the wave coordinate condition does not give estimates for a transversal derivative of all components of the metric and the remaining components have to be controlled by integrating the wave equation expressed in polar coordinates. The estimates for the transversal derivative obtained from the wave coordinate condition rely on a decomposition of the metric with respect to the null frame. On the other hand, the estimates obtained from integrating the wave equation are based on a decomposition of the wave operator in terms of tangential derivatives and a transversal derivative.
14.1. Proof of (14.3) and (14.4). For a fixed angular variable $\omega$ we integrate in the radial direction and use (14.1) and (14.2),
$$|Z^I h(t,r,\omega)| \le |Z^I h(t,t+1,\omega)| + \int_r^{t+1}|\partial_r Z^I h(t,\rho,\omega)|\,d\rho \lesssim \frac{C\varepsilon}{1+t} + \int_r^{t+1}\frac{C\varepsilon\,d\rho}{(1+t+|t-\rho|)^{1/2+\gamma}(1+|t-\rho|)^{1/2+\gamma}} \lesssim \frac{C\varepsilon}{1+t} + \frac{C\varepsilon(1+|t-r|)^{1/2-\gamma}}{(1+t+r)^{1/2+\gamma}}. \tag{14.9}$$
The estimate (14.3) now follows. By Lemma 7.2 and (14.3),
$$|\bar\partial Z^I h| \lesssim \frac{1}{1+t+|q|}\sum_{|J|\le|I|+1}|Z^J h| \lesssim \frac{\varepsilon(1+|q|)^{1/2-\gamma}}{(1+t+|q|)^{3/2+\gamma}},$$
which proves (14.4).
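The radial integration behind (14.9) rests on an elementary computation, sketched here under assumptions that are ours and narrower than the text: $0 < \gamma < 1/2$ (so that the antiderivative is a power rather than a logarithm) and $0 \le r \le t$, so that $|q| = t - r \le t$.

```latex
% Assumptions (ours): 0 < \gamma < 1/2 and 0 \le r \le t, hence |q| = t - r \le t.
\begin{align*}
\int_r^{t}\frac{d\rho}{(1+t-\rho)^{1/2+\gamma}}
 &= \frac{(1+t-r)^{1/2-\gamma}-1}{1/2-\gamma}
 \le \frac{(1+|q|)^{1/2-\gamma}}{1/2-\gamma}, \\
\int_t^{t+1}\frac{d\rho}{(1+\rho-t)^{1/2+\gamma}} &\le 1, \\
(1+t+|t-\rho|)^{-1/2-\gamma} &\le (1+t)^{-1/2-\gamma}
 \le 2^{1/2+\gamma}\,(1+t+|q|)^{-1/2-\gamma}.
\end{align*}
```

Multiplying the last bound by the sum of the first two gives a bound of the form $C_\gamma(1+|q|)^{1/2-\gamma}(1+t+|q|)^{-1/2-\gamma}$ for the $\rho$-integral, which is the source of the exponent $1/2-\gamma$ in (14.3); in this region $1+t+r$ and $1+t+|q|$ are comparable.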
14.2. Proof of (14.5). We now show that the wave coordinate condition allows one to control certain components by lower order terms and terms with fast decay. Lemma 14.3. Suppose that the estimates (14.1)– (14.4) hold and that h satisfies the wave coordinate condition. Then |∂Z I h|LL + |∂Z J h|LT |∂Z K h| + ε(1 + t + |q|)−1−2γ . |I |≤ k
|J |≤ k−1
|K|≤ k−2
(14.10) Here the sum over k − 2 is absent if k ≤ 1 and the sum over k − 1 is absent if k = 0. Furthermore 1 I |Z h|LL + |Z J h|LT + |Z K h| (t, x) 1 + |q| |I |≤ k |J |≤ k−1 |K|≤ k−2 ε . (14.11) |∂Z K h(t, y)| + sup 1+t t/2≤|y|≤3t/2 |K|≤ k−2
Proof. We first prove (14.10). Using the estimates of Lemma 12.2 derived from the wave coordinate condition followed by (14.1)– (14.4) we obtain ¯ I h| + |∂Z I H |LL + |∂Z J H |LT |∂Z |∂Z K h| |I |≤k
+
|J |≤k−1
|I |≤k
|K|≤ k−2
|Z h| · · · |Z h| |∂Z h| Im
I2
I1
|I1 |+...+|Im |≤k, m≥2
≤
|∂Z K h| + ε(1 + t + |q|)−1−2γ + ε(1 + t + |q|)−1−2γ . (14.12)
|K|≤ k−2
The proof of estimate (14.11) for |q| ≥ t/2 follows directly from (14.3). Thus we may assume that |q| ≤ t/2. We now use the inequality (14.13) |H (t, rω)| ≤ |H t, (t + 1)ω | + (1 + |q|) sup |∂ρ H (t, (t + ρ)ω)|, |ρ|≤|q|+1
and the boundary condition (14.2) to conclude that |Z I H |LL + |Z J H |LT + |Z K H | 1 + |q|
sup t/2≤|y|≤2t
|∂r Z I H |LL + |∂r Z J H |LT
+|∂r Z K H | (t, y) +
The desired result now follows from (14.10).
ε . 1+t
(14.14)
The first part of (14.5) now follows directly from the lemma with $k = 0, 1$ and the second part follows from integrating the first and using the boundary assumption (14.2) as in the proof of (14.3).

14.3. Proof of (14.6)–(14.7). We will appeal to the $L^\infty$ estimates of Sect. 10 for the reduced wave equation $\widetilde{\Box}_g h_{\mu\nu} = F_{\mu\nu}$, where $F_{\mu\nu}$ is as in Lemma 3.2. We will now prove (14.6) and (14.7) assuming (14.1)–(14.5).

Lemma 14.4. Suppose that the assumptions of Theorem 14.1 hold and let $F_{\mu\nu} = F_{\mu\nu}(h)(\partial h, \partial h)$ be as in Lemma 3.2. Then
$$|F|_{TU} \le C\varepsilon\,t^{-1-2\gamma}|\partial h| \tag{14.15}$$
and
$$|F| \le C\varepsilon\,t^{-1-2\gamma}|\partial h| + C|\partial h|_{TU}^2. \tag{14.16}$$

Proof. This follows from Proposition 13.3 using (14.1)–(14.5).

Using the first part of Corollary 10.3, (10.24), together with (14.1)–(14.5) and the previous lemma, we get
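Lemma 14.5. With a constant depending on $\gamma > 0$ we have
$$(1+t)\,\||\partial h|_{TU}(t,\cdot)\|_{L^\infty} \le C\varepsilon + C\varepsilon\int_0^t(1+\tau)^{-2\gamma}\|\partial h(\tau,\cdot)\|_{L^\infty}\,d\tau, \tag{14.17}$$
and
$$(1+t)\,\|\partial h(t,\cdot)\|_{L^\infty} \le C\varepsilon + C\int_0^t\Big(\varepsilon(1+\tau)^{-2\gamma}\|\partial h(\tau,\cdot)\|_{L^\infty} + (1+\tau)\,\||\partial h|_{TU}(\tau,\cdot)\|_{L^\infty}^2\Big)\,d\tau. \tag{14.18}$$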
The estimates (14.6) and (14.7) now follow from the above lemma and the following technical result applied to $n_{00}(t) = (1+t)\,\||\partial h|_{TU}(t,\cdot)\|_{L^\infty}$ and $n_{01}(t) = (1+t)\,\|\partial h(t,\cdot)\|_{L^\infty}$:

Lemma 14.6. Suppose that $n_{00} \ge 0$ and $n_{01} \ge 0$ satisfy
$$n_{00}(t) \le C\varepsilon\Big(\int_0^t(1+s)^{-1-\gamma}n_{01}(s)\,ds + 1\Big), \tag{14.19}$$
$$n_{01}(t) \le C\varepsilon\Big(\int_0^t(1+s)^{-1-\gamma}n_{01}(s)\,ds + 1\Big) + C\int_0^t(1+s)^{-1}n_{00}(s)^2\,ds \tag{14.20}$$
for some positive constants such that $0 < 16(C^2 + C)\varepsilon < \gamma \le 1$. Then
$$n_{00}(t) \le 2C\varepsilon, \quad\text{and}\quad n_{01}(t) \le 2C\varepsilon\big(1 + \gamma\ln(1+t)\big). \tag{14.21}$$

Proof. Let $T$ be the largest time such that
$$N_{01}(t) = \int_0^t(1+s)^{-1-\gamma}n_{01}(s)\,ds + 1 \le 2, \qquad\text{for } t \le T. \tag{14.22}$$
Then for $t \le T$ (14.21) holds and since
$$\int_0^\infty(1+s)^{-1-\gamma}\big(1 + \gamma\ln(1+s)\big)\,ds = \gamma^{-1}\int_0^\infty(1+\tau)e^{-\tau}\,d\tau = 2\gamma^{-1} \le 2\gamma^{-1} + 1,$$
it follows that
$$N_{01}(t) \le 2C\varepsilon\big(2\gamma^{-1} + 1\big) + 1 \le 3/2, \qquad\text{for } t \le T.$$
Since $N_{01}(t)$ is continuous this contradicts that $T$ is the maximal number such that (14.22) holds. Thus $T = \infty$ and (14.21) holds for all $t < \infty$.

This proves the first part of (14.6) and (14.7). The second part of (14.6) follows from integrating the first using the boundary assumption (14.2) as in the proof of (14.3).
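The integral used in the proof of Lemma 14.6 can be checked by the change of variables $\tau = \gamma\ln(1+s)$; the short computation below is a worked verification added here, not part of the original argument.

```latex
% Substitution: \tau = \gamma \ln(1+s), i.e. 1+s = e^{\tau/\gamma},
% ds = \gamma^{-1} e^{\tau/\gamma} d\tau, and (1+s)^{-1-\gamma} = e^{-\tau/\gamma - \tau}.
\begin{align*}
\int_0^\infty (1+s)^{-1-\gamma}\big(1+\gamma\ln(1+s)\big)\,ds
 &= \int_0^\infty e^{-\tau/\gamma-\tau}\,(1+\tau)\,\gamma^{-1}e^{\tau/\gamma}\,d\tau \\
 &= \gamma^{-1}\int_0^\infty (1+\tau)\,e^{-\tau}\,d\tau \\
 &= \gamma^{-1}\Big(\int_0^\infty e^{-\tau}\,d\tau + \int_0^\infty \tau e^{-\tau}\,d\tau\Big)
 = 2\gamma^{-1}.
\end{align*}
```

The value $2\gamma^{-1}$ is in particular no larger than the quantity $2\gamma^{-1}+1$ that enters the bound for $N_{01}(t)$, so the conclusion $N_{01}(t) \le 3/2$ is unaffected.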
14.4. Proof of (14.8) in case $k = 1$. We will now prove the first part of (14.8) for $|I| = 1$, assuming (14.1)–(14.7).

Lemma 14.7. Suppose that the assumptions of Theorem 14.1 hold and let $F_{\mu\nu} = F_{\mu\nu}(h)(\partial h, \partial h)$ be as in Lemma 3.2. Then
$$|ZF| \le C\varepsilon\,t^{-1}\big(|\partial Zh| + |\partial h|\big). \tag{14.23}$$

Proof. This follows from Proposition 13.3.

Using the second part of Corollary 10.3, (10.25), together with (14.1)–(14.5) and the previous lemma, we get

Lemma 14.8. If $\varepsilon > 0$ is sufficiently small then
$$(1+t)\sum_{|I|\le 1}\|\partial Z^I h(t,\cdot)\|_{L^\infty} \le C\varepsilon(1+t)^{C\varepsilon}\Big(1 + \int_0^t(1+\tau)^{-C\varepsilon}\sum_{|I|\le 1}\|\partial Z^I h(\tau,\cdot)\|_{L^\infty}\,d\tau\Big). \tag{14.24}$$

The estimate (14.8) for $|I| = 1$ is now a consequence of the above lemma and the following technical result applied to $n_1(t) = (1+t)^{1-C\varepsilon}\sum_{|I|\le 1}\|\partial Z^I h(t,\cdot)\|_{L^\infty}$:

Lemma 14.9. Suppose that $n_1(t) \ge 0$ satisfies
$$n_1(t) \le C\varepsilon\Big(1 + \int_0^t(1+\tau)^{-1}n_1(\tau)\,d\tau\Big). \tag{14.25}$$
Then
$$n_1(t) \le C\varepsilon(1+t)^{C\varepsilon}. \tag{14.26}$$

Proof. The function
$$N_1(t) = 1 + \int_0^t(1+\tau)^{-1}n_1(\tau)\,d\tau \tag{14.27}$$
satisfies $\dot N_1(t) \le C\varepsilon(1+t)^{-1}N_1(t)$. Multiplying by the integrating factor $(1+t)^{-C\varepsilon}$ and integrating we get $N_1(t) \le N_1(0)(1+t)^{C\varepsilon} = (1+t)^{C\varepsilon}$ and the lemma follows.

14.5. Proof of (14.8) in case $k \ge 1$. We will now use induction to prove the first part of (14.8) for $|I| = k + 1$, assuming that (14.1)–(14.5), the first part of (14.6), (14.7) and the first part of (14.8) for $|I| \le k$ hold.

Lemma 14.10. Suppose that the assumptions of Theorem 14.1 hold and let $F_{\mu\nu} = F_{\mu\nu}(h)(\partial h, \partial h)$ be as in Lemma 3.2. Then
$$|Z^I F| \le C\varepsilon\,t^{-1}\sum_{|K|\le|I|}|\partial Z^K h| + C\sum_{|J|+|K|\le|I|,\ |J|\le|K|<|I|}|\partial Z^J h|\,|\partial Z^K h|. \tag{14.28}$$
Proof. This follows from Lemma 13.3 using (14.1)–(14.7).
By Corollary 7.6
g Z I h| |Zˆ I F | + (1 + t)−1 |
|Z J H ||∂Z K h|
|K|≤|I |, |J |+(|K|−1)+ ≤|I |
+C(1 + q)
−1
|K|≤|I |
+
|Z J H |LL
|J |+(|K|−1)+ ≤|I |
|Z H |LT +
|J |+(|K|−1)+ ≤|I |−1
|Z J H | |∂Z K h|,
J
|J
|+(|K|−1)+ ≤|I |−2
(14.29) where (|K| − 1)+ = |K| − 1, if |K| ≥ 1, and 0, if |K| = 0. Using Lemma 14.3 we get
(1 + q)−1 |Z J H |LL + |Z J H |LT |J |≤k, |J |≤k−1, |J
|≤k−2
+|Z J H | ≤ We hence obtain
g Z I h| ≤ Cε(1 + t)−1 |
Cε + 1+t
|∂Z J H (t, y)|.
sup
(14.30)
|J
|≤k−2 t/2≤|y|≤2t
|∂Z K h|+
sup
|∂Z J H (t, y)||∂Z K h|.
|J |+|K|≤|I |−1 t/2≤|y|≤2t
|K|≤|I |
(14.31) Then we have proven that Lemma 14.11. Let nk (t) = (1 + t)
∂Z I h(t, ·) L∞ .
(14.32)
|I |≤k
Then for |I | = k:
g Z I h| ≤ C(1 + t)−2 εnk (t) + nk−1 (t)2 . |
(14.33)
By the first part of Corollary 10.3; (10.24), it therefore follows that: Lemma 14.12.
t
nk (t) ≤ Cε + C
(1 + τ )−1 εnk (τ ) + nk−1 (τ )2 dτ.
(14.34)
0
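Our inductive hypothesis is $n_{k-1}(t)^2 \le C\varepsilon^2(1+t)^{C\varepsilon}$, so the bound $n_k(t) \le C\varepsilon(1+t)^{2C\varepsilon}$ follows from:

Lemma 14.13. Suppose that
$$n_k(t) \le C\varepsilon(1+t)^{C\varepsilon} + C\varepsilon\int_0^t(1+\tau)^{-1}n_k(\tau)\,d\tau, \tag{14.35}$$
then
$$n_k(t) \le C\varepsilon(1+t)^{2C\varepsilon}. \tag{14.36}$$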
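As a consistency check on Lemma 14.13, the extremal case in which (14.35) holds with equality can be solved explicitly. The computation below is an illustration added here; the symbols $a$ and $N$ are shorthand introduced for it, and it treats only the equality case, not the general inequality.

```latex
% Equality case of (14.35); notation (ours): a := C\varepsilon,
% N(t) := \int_0^t (1+\tau)^{-1} n(\tau)\,d\tau, so that n = a(1+t)^a + aN.
\begin{align*}
N'(t) &= \frac{n(t)}{1+t} = a(1+t)^{a-1} + \frac{a\,N(t)}{1+t}, \qquad N(0) = 0, \\
\frac{d}{dt}\Big[(1+t)^{-a}N(t)\Big] &= \frac{a}{1+t}
 \quad\Longrightarrow\quad N(t) = a\,(1+t)^{a}\ln(1+t), \\
n(t) &= a(1+t)^{a}\big(1 + a\ln(1+t)\big)
 \le a(1+t)^{a}\,e^{a\ln(1+t)} = a\,(1+t)^{2a}.
\end{align*}
```

The last step uses $1 + x \le e^{x}$ with $x = a\ln(1+t)$, so in this case the bound $C\varepsilon(1+t)^{2C\varepsilon}$ of (14.36) holds without an extra factor.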
Proof. Let $N_k(t) = \int_0^t(1+\tau)^{-1}n_k(\tau)\,d\tau$. Then $|\dot N_k(t)| \le C\varepsilon(1+t)^{-1}\big((1+t)^{C\varepsilon} + N_k(t)\big)$. Multiplying by an integrating factor gives $\frac{d}{dt}\big(N_k(t)(1+t)^{-2C\varepsilon}\big) \le C\varepsilon(1+t)^{-1-C\varepsilon}$, so $N_k(t)(1+t)^{-2C\varepsilon} \le C$, and hence $N_k(t) \le C(1+t)^{2C\varepsilon}$ and $n_k(t) \le 2C\varepsilon(1+t)^{2C\varepsilon}$.

This proves the first part of (14.8). The second part of (14.8) follows from integrating the first and using the boundary assumption (14.2) as in the proof of (14.3).

15. Energy Estimates for Einstein's Equations

Recall the definitions
$$E_N(t) = \sup_{0\le\tau\le t}\sum_{|I|\le N}\int_{\Sigma_\tau}|\partial Z^I h|^2, \tag{15.1}$$
$$S_N(t) = \sum_{|I|\le N}\int_0^t\!\!\int_{\Sigma_\tau}\frac{\gamma\,|\bar\partial Z^I h|^2}{(1+|q|)^{1+2\gamma}}. \tag{15.2}$$
In this section we prove the following theorem.

Theorem 15.1. Assume that $g = h + m$ satisfies both Einstein's equations and the wave coordinate condition for $0 \le t \le T$. Suppose also that for some $0 < \gamma \le 1/2$ we have the following estimates for $0 \le t \le T$:

1. For all multi-indices $I$, $|I| \le N/2 + 4$,
$$|\partial Z^I h| + (1+|q|)^{-1}|Z^I h| + (1+t)(1+|q|)^{-1}|\bar\partial Z^I h| \le C\varepsilon(1+t)^{-1/2-\gamma}(1+|q|)^{-1/2-\gamma}. \tag{15.3}$$
2. For all multi-indices $I$, $|I| \le N$,
$$|Z^I H(s,q,\omega)| \le C\varepsilon(1+t)^{-1}, \qquad\text{for } q = 1. \tag{15.4}$$
3.
$$|\partial H|_{TU} + (1+|q|)^{-1}|H|_{TU} + (1+|q|)^{-1}|ZH|_{LL} \le C\varepsilon(1+t)^{-1}. \tag{15.5}$$
4. For all multi-indices $I$, $|I| \le N/2 + 2$,
$$|\partial Z^I h| + (1+|q|)^{-1}|Z^I h| \le C\varepsilon(1+t)^{-1+C\varepsilon}. \tag{15.6}$$
5.
$$E_N(0) \le \varepsilon^2. \tag{15.7}$$

Then there are positive constants $C_k$ independent of $T$ such that if $\varepsilon \le C_k^{-2}$ we have the energy estimate
$$E_k(t) + S_k(t) \le 16\varepsilon^2(1+t)^{C_k\varepsilon}, \tag{15.8}$$
for $0 \le t \le T$ and for all $k \le N$.
(15.8)
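Remark 15.2. Once again we recall that our estimates hold simultaneously for the tensors $h$ and $H = -h + O(h^2)$. We shall freely interchange $h$ and $H$ in the proof below.

Proof. Recall that the components of the tensor $h_{\mu\nu} = g_{\mu\nu} - m_{\mu\nu}$ satisfy the following wave equations:
$$g^{\alpha\beta}\partial_\alpha\partial_\beta h_{\mu\nu} = F_{\mu\nu}, \qquad F_{\mu\nu} = P(\partial_\mu h, \partial_\nu h) + Q_{\mu\nu}(\partial h, \partial h) + G_{\mu\nu}(h)(\partial h, \partial h), \tag{15.9}$$
where
$$P(\partial_\mu h, \partial_\nu h) = \tfrac14\,m^{\alpha\alpha'}\partial_\mu h_{\alpha\alpha'}\,m^{\beta\beta'}\partial_\nu h_{\beta\beta'} - \tfrac12\,m^{\alpha\alpha'}m^{\beta\beta'}\partial_\mu h_{\alpha\beta}\,\partial_\nu h_{\alpha'\beta'}. \tag{15.10}$$
We prove the desired estimate by induction on $k$. We first establish the estimate
$$E_0(t) + S_0(t) \le 8\varepsilon^2(1+t)^{C_0\varepsilon} \tag{15.11}$$
for some constant $C_0$. After that we shall assume the statement (15.8) for $k \le N - 1$ and prove the corresponding statement for $k \le N$ with some constant $C_N$. We shall base our argument on the energy estimate (11.3) for the solution of the wave equation $\widetilde{\Box}_g\phi = F$ proved in Proposition 11.1. Observe that the conditions of our proposition on the tensor $h = g - m$ imply the assumptions of Proposition 11.1 for the metric $g$,
$$\int_{\Sigma_t}|\partial\phi|^2 + \int_0^t\!\!\int_{\Sigma_\tau}\frac{\gamma\,|\bar\partial\phi|^2}{(1+|q|)^{1+2\gamma}} \le 8\int_{\Sigma_0}|\partial\phi|^2 + C\varepsilon\int_0^t\!\!\int_{\Sigma_\tau}\frac{|\partial\phi|^2}{1+t} + 16\int_0^t\!\!\int_{\Sigma_\tau}|F|\,|\partial\phi|. \tag{15.12}$$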
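15.1. The case of $N = 0$. In this section we prove the basic energy estimate for a solution of Eq. (15.9),
$$\widetilde{\Box}_g h_{\mu\nu} = F_{\mu\nu} := P(\partial_\mu h, \partial_\nu h) + Q_{\mu\nu}(\partial h, \partial h) + G_{\mu\nu}(h)(\partial h, \partial h).$$
Recall that according to (13.10) of Lemma 13.3 we have a pointwise bound
$$|F| \lesssim |\partial h|_{TU}^2 + |\bar\partial h|\,|\partial h| + |h|\,|\partial h|^2.$$
Using the assumptions of the proposition we infer that
$$|F| \lesssim \varepsilon\,\frac{|\partial h|}{1+t}. \tag{15.13}$$
Therefore, the energy estimate (15.12) with $\phi = h_{\mu\nu}$ implies that
$$\int_{\Sigma_t}|\partial h|^2 + \int_0^t\!\!\int_{\Sigma_\tau}\frac{\gamma\,|\bar\partial h|^2}{(1+|q|)^{1+2\gamma}} \le 8\int_{\Sigma_0}|\partial h|^2 + C_0\varepsilon\int_0^t\!\!\int_{\Sigma_\tau}\frac{|\partial h|^2}{1+t}. \tag{15.14}$$
Using the smallness assumption on the initial data and the Gronwall inequality this, in turn, leads to the desired estimate (15.11), $E_0(t) + S_0(t) \le 8\varepsilon^2(1+t)^{C_0\varepsilon}$.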
15.2. The case of N = 1. To facilitate the exposition we first consider the case N = 1. We start by noting that according to (7.18) of Corollary 7.6 we have that g Zhµν = ZF ˆ µν + Dµν , g Zhµν − Zˆ g hµν satisfies the estimate where the term Dµν =
|ZH | + |H | |ZH | + |H | LL LT |∂Z I h|. |D| + 1+t 1 + |q| |I |≤1
Recall that the tensor H αβ = −hαβ + O(h2 ). Thus using the assumptions on h of the proposition we derive that |D| ε
|∂Z I h| . 1+t
|I |≤1
On the other hand, inequality (13.11) gives the estimate |ZF | ≤ |∂h|T U + |∂h| + |h| |∂h| (|∂Zh| + |∂h|) + C|∂h| |∂Zh| + C|∂h|2 |Zh|. Using the assumptions of the proposition we conclude that ˆ | = |(Z + cZ )F | ε |ZF
|∂Z I h| ¯ |∂Zh| . +ε 1 1 1+t (1 + t) 2 (1 + |q|) 2 +γ |I |≤1
ˆ µν + Dµν we obtain Now using the energy estimate (15.12) with φ = Zhµν and F = ZF t 2 ¯ γ |∂Zh| |∂Zh|2 + 1+2γ t 0 τ (1 + |q|) t t |∂Z I h|2 ¯ |∂Zh| |∂Zh| ≤8 + Cε |∂Zh|2 + Cε 1 1 1 + t 0 0 t t 2 (1 + |q|) 2 +γ |I |≤1 0 t t t |∂Z I h|2 ¯ I h|2 |∂Z ≤8 |∂Zh|2 + Cε , + Cε 1+2γ 0 0 t 1 + t 0 t (1 + |q|) |I |≤1
where we used the Cauchy-Schwarz inequality to pass to the last line. Combining this with the energy inequality (15.14) we infer that if Cε ≤ γ /2 then t ¯ I h|2 γ |∂Z |∂Z I h|2 + (1 + |q|)1+2γ |I |≤1 t |I |≤1 0 τ t |∂Z I h|2 I 2 . (15.15) ≤ 16 |∂Z h| +C1 ε 0 0 t 1 + t |I |≤1
|I |≤1
The desired estimate E1 (t) + S1 (t) ≤ 16ε2 (1 + t)C1 ε now follows from the Gronwall inequality and the smallness assumption on the initial data.
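15.3. The case of $N > 1$. In what follows we assume that we have already shown that
$$E_{N-1}(t) + S_{N-1}(t) \le 16\varepsilon^2(1+t)^{C_{N-1}\varepsilon}, \tag{15.16}$$
and prove that there exists a constant $C_N$ such that
$$E_N(t) + S_N(t) \le 16\varepsilon^2(1+t)^{C_N\varepsilon}. \tag{15.17}$$
We start this section by writing the wave equation for the quantity $Z^I h_{\mu\nu}$ with $|I| = N$,
$$\widetilde{\Box}_g Z^I h_{\mu\nu} = \hat{Z}^I F_{\mu\nu} + D^I_{\mu\nu}, \qquad\text{where}\qquad D^I_{\mu\nu} = \widetilde{\Box}_g Z^I h_{\mu\nu} - \hat{Z}^I\widetilde{\Box}_g h_{\mu\nu}.$$
We apply the energy estimate (15.12) with the functions $\phi = Z^I h_{\mu\nu}$ and $F = \hat{Z}^I F_{\mu\nu} + D^I_{\mu\nu}$,
$$\int_{\Sigma_t}|\partial Z^I h|^2 + \int_0^t\!\!\int_{\Sigma_\tau}\frac{\gamma\,|\bar\partial Z^I h|^2}{(1+|q|)^{1+2\gamma}} \le 8\int_{\Sigma_0}|\partial Z^I h|^2 + C\varepsilon\int_0^t\!\!\int_{\Sigma_\tau}\frac{|\partial Z^I h|^2}{1+t} + 16\int_0^t\!\!\int_{\Sigma_\tau}\big(|\hat{Z}^I F| + |D^I|\big)|\partial Z^I h|. \tag{15.18}$$
Note that we can estimate
$$\int_0^t\!\!\int_{\Sigma_\tau}\big(|\hat{Z}^I F| + |D^I|\big)|\partial Z^I h|\,dx\,dt \lesssim \int_0^t\!\!\int_{\Sigma_\tau}\frac{\varepsilon}{1+t}|\partial Z^I h|^2\,dx\,dt + \int_0^t\!\!\int_{\Sigma_\tau}\varepsilon^{-1}(1+t)\big(|\hat{Z}^I F|^2 + |D^I|^2\big)\,dx\,dt. \tag{15.19}$$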
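The splitting in (15.19) is a pointwise weighted Young (Cauchy-Schwarz) inequality, integrated over the slab. A minimal sketch of the pointwise step, with the shorthand $a$, $b$ and the weight $\lambda$ introduced here for illustration only:

```latex
% Notation (ours): a := |\hat{Z}^I F| + |D^I|,  b := |\partial Z^I h|,
% weight \lambda := \varepsilon/(1+t) > 0.
\begin{align*}
a\,b &= \big(\lambda^{-1/2}a\big)\big(\lambda^{1/2}b\big)
 \le \tfrac12\,\lambda^{-1}a^2 + \tfrac12\,\lambda\,b^2 \\
 &= \frac{1+t}{2\varepsilon}\big(|\hat{Z}^I F| + |D^I|\big)^2
 + \frac{\varepsilon}{2(1+t)}\,|\partial Z^I h|^2 \\
 &\le \varepsilon^{-1}(1+t)\big(|\hat{Z}^I F|^2 + |D^I|^2\big)
 + \frac{\varepsilon}{1+t}\,|\partial Z^I h|^2 .
\end{align*}
```

The last line uses $(x+y)^2 \le 2x^2 + 2y^2$. The choice of weight $\varepsilon/(1+t)$ is what makes the first resulting term match the $C\varepsilon$ term already present on the right of (15.18).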
Here the first term is of the type that appears already in the energy estimate (15.18). Thus it remains to handle the second term. According to (7.21) of Corollary 7.6 we have that DI = DkI = |DkI 1 | |DkI 2 | |DkI 3 | |DkI 4 |
|I |
DkI ,
(15.20)
k=0 DkI 1
+ DkI 2 + DkI 3 + DkI 4 , |Z J H |
|K|=k
|J |+(|K|−1)+ ≤|I |
|K|=k
|J |+(|K|−1)+ ≤|I |
1 + t + |q|
(15.21) |∂Z K h|,
|Z J H |LL |∂Z K h|, 1 + |q|
|K|=k
|J |+(|K|−1)+ ≤|I |−1
|K|=k
|J |+(|K|−1)+ ≤|I |−2
(15.22)
(15.23)
|Z J H |LT |∂Z K h|, 1 + |q|
(15.24)
|Z J H | |∂Z K h|. 1 + |q|
(15.25)
The estimates for DkI with k ≤ N/2. We must now estimate t ε−1 (1 + t)|DkI |2 dx dt.
(15.26)
0
Since k = |K| ≤ N/2 in (15.22)–(15.25) it follows from the assumptions in the theorem that we can estimate
ε ε ε−1 (1 + t)|∂Z K h|2 min , (15.27) , 1−Cε 1+2γ (1 + t) (1 + |q|) and it thus suffices to estimate t ε−1 (1 + t)|DkI 1 |2 dx dt 0
t
t 0
|J |≤|I | 0
0
(15.28)
ε−1 (1 + t) |DkI 3 |2 + |DkI 4 |2 ) dx dt
t
ε |Z J H |2 dx dt, (1 + |q|)1+2γ (1 + t + |q|)2
t
|J |≤|I |−1 0
ε |Z J H |2 dx dt, (1 + t)1−Cε (1 + |q|)2
(15.29)
ε−1 (1 + t)|DkI 2 |2 dx dt t
min
|J |≤|I | 0
|Z J H |2 ε ε LL , dx dt. (15.30) (1 + t)1−Cε (1 + |q|)1+2γ (1 + |q|)2
Lemma 15.3. Let f be a smooth function satisfying the condition |f | ε(1 + t)−1 , Then t 0
ε |f |2 dx dt 1+2γ (1 + |q|) (1 + t + |q|)2
for q = 1. 0
t
ε (1 + t)1+2γ
(15.31) |∂f |2 dx dt + ε 3 (15.32)
and t 0
|f |2 ε dx dt (1 + t)1−Cε (1 + |q|)2
0
t
ε ε2 + (1 + t)1−Cε
|∂f |2 dx dt. (15.33)
Furthermore, t
|f |2 ε ε , dx dt (1 + t)1−Cε (1 + |q|)1+2γ (1 + |q|)2 0 t t ε ε|∂r f |2 2 dxdt + ε dt. 1+2γ 1−Cε (1 + |q|) 0 0 (1 + t) min
(15.34)
Proof. We shall repeatedly use the Poincar´e inequality (9.1) of Lemma 9.1, |f (x)|2 dx |∂r f (x)|2 dx 2 |f | dS + , (1 + |q|)2+2σ (1 + |q|)2σ t
(15.35)
t
S(t+1)
which holds for any value of σ > −1/2, σ = 1/2. In particular, using (15.31), we obtain that |f |2 dx |f |2 dx 2 ε + . (15.36) (1 + |q|)2+2σ (1 + |q|)2σ t
t
The estimates (15.32) and (15.33) now follow from (15.36) with σ = 0.
We now note the following generalization of (15.35): |f (x)|2 dx
ε ε min , (1 + t)1−Cε (1 + |q|)1+2γ (1 + |q|)2 t
ε (1 + t)1−Cε
|f |2 dS + ε
S(t+1)
t
|∂r f (x)|2 dx . (1 + |q|)1+2γ
(15.37)
The proof of (15.37) can be reduced to (15.35) by subtracting a term which picks up the boundary value. We define f = f − f , where f (r, ω) = f (t + 1), ω χ (r/t). (15.38) and χ (s) = 1, when 3/4 ≤ s ≤ 3/2 and χ (s) = 0 when s ≤ 1/2 or s ≥ 2. Then |f (x)|2 dx
ε ε min , (1 + t)1−Cε (1 + |q|)1+2γ (1 + |q|)2 t
ε t
|f˜(x)|2 dx ε + 3+2γ (1 + |q|) (1 + t)1−Cε
t
|f¯(x)|2 dx . (1 + |q|)2
(15.39)
We now apply (15.35) to the function f˜, which vanishes at r = t + 1, and observe that 2 |∂r f (x)|2 dx 1 f (t + 1), ω dω |f |2 dS. (15.40) (1 + |q|)1+2γ (1 + t)2 St+1 t
On the other hand, ε (1 + t)1−Cε
t
|f (x)|2 dx ε 2 (1 + |q|) (1 + t)1−Cε
|f |2 dS,
(15.41)
St+1
which proves (15.37). Using the lemma above with f = Z J H , together with (15.28), (15.29) and the assumption that EN −1 ≤ 16(1 + t)CN −1 ε we see that we can estimate
t
ε−1 (1 + t) |DkI 1 |2 + |DkI 3 |2 + |DkI 4 |2 ) dx dt 0 t t ε ε 2
E (t) dt + ε dt 1+2γ N 1−Cε (1 + t) (1 + t) 0 0 t ε ε EN (t) + ε 2 dt 1−Cε 0 (1 + t)
for all ≤ N/2. Thus the term (15.30) remains containing DkI 2 . We shall use the version of the Poincar´e inequality (15.34) to create the term ∂q (Z J H )LL , which can be then converted to a tangential derivative of Z J H via the wave coordinate condition. However, in order to implement this strategy we modify the term Z J HLL according to Lemma 12.2. We recall the notation J γ J = Z J Hµν + cJ µ Z J Hγ ν . (15.42) Hµν |J |<|J |
If |J | ≤ N then the lower order terms in the right-hand side of (15.42) may be estimated using (15.29) and (15.33) as before. According to Lemma 12.2 and the pointwise estimates in (15.6) and (15.4), J ¯ J H| + | |∂Z |Z Jm H | · · · |Z J2 H | |∂Z J1 H | |∂r HLL |J |≤|J |
|J1 |+..+|Jm |≤|J |, m≥2 J
¯ |∂Z H| +
|J |≤|J |
|J |≤|J |
|Z J1 H | |∂Z J2 H |
|J1 |+|J2 |≤|J |
¯ J H |+ |∂Z
ε(1 + |q|)1/2−γ ε|Z J H | J
|∂Z H | + . (1 + t)1/2+γ (1 + t)1/2+γ (1 + |q|)1/2+γ (15.43)
Hence t
J |2 ε|∂r HLL dxdt (1 + |q|)1+2γ 0
t ε|∂Z J H |2 ε|∂Z J H |2 ε|Z J H |2 + + dxdt. (1 + |q|)1+2γ (1 + t)1+2γ (1 + t)1+2γ (1 + |q|)2 0
|J |≤|J |
(15.44) If we use (15.33) with Cε in the exponent replaced by 2γ we see that the last term can be estimated by the second term from the right plus a term from the boundary: t
J |2 ε|∂r HLL dxdt (1 + |q|)1+2γ 0
t ε|∂Z J H |2 ε|∂Z J H |2 ε 2 dxdt. + + ε (1 + |q|)1+2γ (1 + t)1+2γ (1 + t)1+2γ 0
|J |≤|J |
(15.45)
As we argued, when estimating (15.30) we can replace |Z J H |LL by the left-hand side of (15.42). After that we use the version of the Poincar´e inequality (15.34) applied to J and this together with (15.45) gives HLL t
ε−1 (1 + t)|DkI 2 |2 dxdt εSN (t) + εEN (t) + ε 2
0
t 0
ε dt. (1 + t)1−Cε (15.46)
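Summarizing, we have proven that
$$\int_0^t\!\!\int \varepsilon^{-1}(1+t)\,|D^I_k|^2\,dx\,dt \lesssim \varepsilon S_N(t) + \varepsilon E_N(t) + \varepsilon^2\int_0^t \frac{\varepsilon\,dt}{(1+t)^{1-C\varepsilon}}, \qquad k \le N/2. \tag{15.47}$$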
This concludes the estimates in the case k ≤ N/2.

The commutator in case k ≥ N/2. We isolate the case when |K| = N = |I|. We can estimate its contribution to $D^I_N$ by the following expression:
I |DN
|
|H | + |ZH | |ZH |LL + |H |LT |∂Z K h| + |∂Z K h| ε , 1 + t + |q| 1 + |q| 1+t
|K|=|I |
|K|=|I |
where to pass to the last line we used pointwise estimates from (15.5), (15.3), and (15.4). In the case when N/2 ≤ k < |I | we estimate the contribution of the corresponding term in DkI , with the help of (15.6) as follows: |DkI |
|K|<|I | |J |≤N/2
|ZH | |∂Z K h| |∂Z K h| ε . 1 + |q| (1 + τ )1−Cε |K|<|I |
Therefore, t ε 0
τ
−1
(1 + t)|DkI |2 dxdt
ε
t 0
|K|<|I |
|∂Z K h|2 |∂Z K h|2 + dxdt. (1 + τ )1−2Cε 1+τ |K|=|I |
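(15.48)
Using the inductive assumption (15.16) we can therefore estimate
$$\int_0^t\!\!\int_{\Sigma_\tau} \varepsilon^{-1}(1+t)\,|D^I_k|^2\,dx\,dt \lesssim \varepsilon\int_0^t \frac{E_N(\tau)\,dt}{1+\tau} + \varepsilon^2\int_0^t \frac{\varepsilon\,dt}{(1+\tau)^{1-2C\varepsilon}}, \qquad N/2 \le k \le N. \tag{15.49}$$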
The inhomogeneous term. By (13.12) ¯ J h| |∂Z K h| |Zˆ I F | |∂Z J h|T U |∂Z K h|T U + |∂Z |J |+|K|≤|I |
+
|J |+|K|≤|I |−1
|∂Z J h| |∂Z K h|
|J |+|K|≤|I |−2
+
|∂Z J h|LT |∂Z K h| +
|Z Jm h|· · · |Z J3 h| |∂Z J2 h||∂Z J1 h|.
(15.50)
|J1 |+...+|Jm |≤|I |, m≥3
The highest order terms with one of |J |, |K| or |Ii | equal to N = |I | are bounded by ¯ + |h||∂h| ¯ I h| |∂h|T U + |∂h| |∂Z I h| + |∂h|2 |Z I h| + |∂h| |∂Z |I |=N
|I |=N
|I |=N
ε ε2 ≤ |∂Z I h| + |Z I h| 1+2γ 1+2γ 1+t (1 + t) (1 + q) |I |=N |I |=N ε ¯ I h|. |∂Z + (1 + t)1/2+γ (1 + q)1/2+γ
(15.51)
|I |=N
The remaining lower order terms are of the form |∂Z J h| |∂Z K h| + |K|
|∂Z J h| |∂Z L h| |Z K h|
|K|
ε ε2 K ≤ |∂Z h|+ |Z I h|. 1−Cε 1+2γ 1+2γ (1 + t) (1 + t) (1 + q) |K|
|I |
(15.52) It therefore follows that t t |∂Z K h|2 −1 I 2 ˆ ε (1 + t)|Z F | dxdt ε 1+τ 0 τ 0 |K|≤|I |
|∂Z K h|2
|Z K h|2
+ dxdt (1 + |q|)1+2γ (1 + t)1+2γ (1 + |q|)2 t ε dt + |∂Z I h|2 dxdt 1−2Cε (1 + τ ) 0 |I |
|I |
Here, to estimate the last term in the first row we used (15.33) with −Cε in the exponent replaced by 2γ , which produced a term similar to the first term of the first line plus a boundary term. Using the inductive assumption (15.16) we thus obtain
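$$\int_0^t\!\!\int_{\Sigma_\tau} \varepsilon^{-1}(1+t)\,|\hat Z^I F|^2\,dx\,dt \lesssim \varepsilon\int_0^t \frac{E_N(\tau)\,d\tau}{1+\tau} + \varepsilon S_N(t) + \varepsilon^2\int_0^t \frac{\varepsilon\,d\tau}{(1+\tau)^{1-C\varepsilon}}. \tag{15.54}$$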
The conclusion of the proof in case N > 1. The inequalities (15.18)–(15.19) and (15.47), (15.49) and (15.54) imply that for some constant C:
$$E_N(t) + S_N(t) \le 8E_N(0) + C\varepsilon\big(E_N(t) + S_N(t)\big) + C\varepsilon\int_0^t \frac{E_N(\tau)\,d\tau}{1+\tau} + C\varepsilon^2\int_0^t \frac{\varepsilon\,d\tau}{(1+\tau)^{1-C\varepsilon}}. \tag{15.55}$$
If we now choose ε so small that Cε ≤ 1/9 we can move the second term on the right to the left and multiply by 9/8 to obtain for some new constants
$$E_N(t) + S_N(t) \le 9E_N(0) + C\varepsilon\int_0^t \frac{E_N(\tau)\,d\tau}{1+\tau} + C\varepsilon^2\int_0^t \frac{\varepsilon\,d\tau}{(1+\tau)^{1-C\varepsilon}}. \tag{15.56}$$
This can now be integrated using a Grönwall type of argument. If G(t) denotes the right-hand side then we have
$$G'(t) \le \frac{C\varepsilon}{1+t}\,G(t) + \frac{C\varepsilon^3}{(1+t)^{1-C\varepsilon}}.$$
Multiplying with the integrating factor we get
$$\frac{d}{dt}\Big(G(t)(1+t)^{-C\varepsilon}\Big) \le \frac{C\varepsilon^3}{1+t},$$
and hence if we integrate and use that $C\varepsilon\ln(1+t) \le (1+t)^{C\varepsilon}$, for t ≥ 0 (as is seen by differentiating both sides), and use that by assumption (15.7) $G(0) \le 9\varepsilon^2$, we obtain
$$G(t) \le G(0)(1+t)^{C\varepsilon} + C\varepsilon^3\ln(1+t)\,(1+t)^{C\varepsilon} \le 9\varepsilon^2(1+t)^{C\varepsilon} + \varepsilon^2(1+t)^{2C\varepsilon} \le 10\varepsilon^2(1+t)^{2C\varepsilon}.$$
Hence we have proven that $E_N(t) + S_N(t) \le 10\varepsilon^2(1+t)^{2C\varepsilon}$. This concludes the induction and the proof of the theorem.
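The Grönwall step can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: it treats the differential inequality for G as an equality with G(0) = 9ε², uses the closed-form solution obtained from the integrating factor, and compares it with the claimed bound 10ε²(1+t)^{2Cε}. The function names and the sample values of ε and C are hypothetical choices, not taken from the paper.

```python
import math

def gronwall_solution(t, eps, C, G0=None):
    """Closed-form solution of G' = C*eps/(1+t)*G + C*eps**3/(1+t)**(1-C*eps).

    Obtained from d/dt [G (1+t)^(-C eps)] = C eps^3 / (1+t), i.e.
    G(t) = (1+t)^(C eps) * (G(0) + C eps^3 ln(1+t)).
    """
    if G0 is None:
        G0 = 9.0 * eps**2  # extremal case of the assumption G(0) <= 9 eps^2
    a = C * eps
    return (1.0 + t)**a * (G0 + C * eps**3 * math.log1p(t))

def claimed_bound(t, eps, C):
    """Right-hand side 10 eps^2 (1+t)^(2 C eps) of the final estimate."""
    return 10.0 * eps**2 * (1.0 + t)**(2.0 * C * eps)

if __name__ == "__main__":
    # Illustrative parameters only (C*eps = 0.05).
    for t in (0.0, 1.0, 1e3, 1e6, 1e12):
        print(t, gronwall_solution(t, 0.01, 5.0), claimed_bound(t, 0.01, 5.0))
```

The comparison relies on exactly the two elementary facts used in the text: (1+t)^{Cε} ≤ (1+t)^{2Cε} for t ≥ 0, and Cε ln(1+t) ≤ (1+t)^{Cε}.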
16. Geodesic Completeness

Having constructed a solution metric $g = m + h$ of the Einstein equations we need to verify that the resulting space-time $(\mathbb{R}^4, g)$ is causally geodesically complete. Let $X(\tau) = (x^0(\tau), x(\tau)) = (t(\tau), x(\tau)) = (t(\tau), r\omega(\tau))$ be a causal geodesic parameterized by the affine parameter τ. Such geodesics satisfy the equations
$$\ddot X^\alpha(\tau) + \Gamma^\alpha_{\beta\gamma}(X(\tau))\,\dot X^\beta \dot X^\gamma = 0, \qquad X(0) = Y, \qquad \dot X(0) = \xi, \tag{16.1}$$
where $Y$ is the point of origin of the geodesic $X(\tau)$ and $\xi$ is the initial velocity satisfying the condition
$$g_{\alpha\beta}(Y)\,\xi^\alpha \xi^\beta = -A^2 \le 0 \tag{16.2}$$
for some constant $A$. Condition (16.2) is preserved in time, i.e.,
$$g_{\alpha\beta}(X(\tau))\,\dot X^\alpha \dot X^\beta = -A^2. \tag{16.3}$$
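The conservation law (16.3) is a standard consequence of the geodesic equation. As a brief reminder, and with the assumption that $\nabla$ denotes the Levi-Civita connection of $g$ (a notation not used in the text above), one can write the geodesic equation (16.1) as $\nabla_{\dot X}\dot X = 0$ and compute:

```latex
\frac{d}{d\tau}\Big( g_{\alpha\beta}(X(\tau))\,\dot X^\alpha \dot X^\beta \Big)
  = (\nabla_{\dot X}\, g)(\dot X, \dot X) + 2\, g\big(\nabla_{\dot X}\dot X,\ \dot X\big)
  = 0 + 0 = 0 .
```

The first term vanishes by metric compatibility and the second by the geodesic equation, so the value $-A^2$ fixed at $\tau = 0$ by (16.2) persists for all $\tau$.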
In the following lemma we show that a vector η causal with respect to the metric g is “almost” causal with respect to the Minkowski metric m.

Lemma 16.1. Let η be a causal 4-vector, i.e.,
$$g_{\alpha\beta}\,\eta^\alpha \eta^\beta \le -A^2 \le 0 \tag{16.4}$$
for some non-negative constant A. Then
$$A + |\eta^i| \le 2|\eta^0|, \qquad \forall i = 1, \dots, 3. \tag{16.5}$$

Proof. Expanding g = m + h we obtain from (16.4) that
$$-|\eta^0|^2 + \sum_{i=1}^3 |\eta^i|^2 \le |h| \cdot \Big(|\eta^0|^2 + \sum_{i=1}^3 |\eta^i|^2\Big),$$
and the desired estimate follows provided that $|h| \le 1/4$.
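A quick numerical sanity check of (16.5) may help fix ideas. The sketch below is an illustration, not part of the proof: it assumes that |h| ≤ 1/4 is understood as a bound on the quadratic form of h, which is enforced here by drawing a symmetric 4×4 matrix whose Frobenius norm is at most 1/4. The function names, the sampling ranges and the choice $A = \sqrt{-g(\eta,\eta)}$ for each sampled causal vector are assumptions made for the example.

```python
import math
import random

def random_perturbation(rng, bound=0.25):
    """Symmetric 4x4 matrix h with Frobenius norm at most `bound`.

    Each of the 16 entries satisfies |h_ab| <= bound/4, so the sum of squares
    is at most 16 * (bound/4)^2 = bound^2.
    """
    entry = bound / 4.0
    h = [[0.0] * 4 for _ in range(4)]
    for a in range(4):
        for b in range(a, 4):
            h[a][b] = h[b][a] = rng.uniform(-entry, entry)
    return h

def g_quadratic(h, eta):
    """g(eta, eta) for g = m + h, with m = diag(-1, 1, 1, 1)."""
    minkowski = -eta[0] ** 2 + sum(x * x for x in eta[1:])
    perturbation = sum(h[a][b] * eta[a] * eta[b] for a in range(4) for b in range(4))
    return minkowski + perturbation

def check_lemma(trials=2000, seed=0):
    """Return True if every sampled causal vector satisfies A + |eta^i| <= 2|eta^0|."""
    rng = random.Random(seed)
    checked = 0
    for _ in range(trials):
        h = random_perturbation(rng)
        eta = [rng.uniform(-1.0, 1.0) for _ in range(4)]
        q = g_quadratic(h, eta)
        if q > 0:
            continue  # eta is not causal with respect to g; skip it
        A = math.sqrt(-q)  # largest A for which (16.4) holds
        if any(A + abs(eta[i]) > 2 * abs(eta[0]) + 1e-12 for i in (1, 2, 3)):
            return False
        checked += 1
    return checked > 0  # require that at least one causal vector was tested
```

Calling `check_lemma()` samples perturbations and vectors, discards the non-causal ones, and reports whether (16.5) held on all the rest.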
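We choose a future oriented initial velocity ξ, i.e., $\dot x^0(0) > 0$.

Proposition 16.2. Assume that h = g − m satisfies the estimates¹⁴
$$|h||\partial h| + |\partial h|_{TU} + |\bar\partial h|_{LL} \lesssim \varepsilon t^{-1}, \qquad |\partial h(t,x)| \lesssim \varepsilon t^{-1} \quad \text{for } |x| \le t/2.$$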
Let X(τ) be a future inextendible causal geodesic. Then the values of the affine parameter τ span the interval [0, ∞).

Proof. We start by considering a time-like geodesic X(τ). Reparameterizing, if necessary, we can assume that the constant A = 1 in (16.3). Then Eq. (16.3) and inequality (16.5) with A = 1 imply that for all τ ≥ 0,
$$\dot x^0(\tau) \ge \frac{1 + |\dot x(\tau)|}{2}. \tag{16.6}$$
We removed the absolute value from $\dot x^0(\tau)$, since $\dot x^0(0) > 0$. This is the only part of the argument which uses the fact that X(τ) is a time-like geodesic. The case of a null geodesic will require an additional argument. Assume that X(τ) is a time-like geodesic of finite length $\tau_*$. We first observe that
$$\lim_{\tau \to \tau_*} |X(\tau)| = \infty,$$

¹⁴ These assumptions are consistent with the decay estimates for h proved in Theorem 14.1.
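which means that $X(\tau)$ escapes to infinity¹⁵ in finite proper time $\tau_*$. This easily follows from the standard ODE theory. The inequality (16.6) implies that $\dot x^0(\tau)$ controls $\dot X(\tau)$. Thus to obtain a contradiction it suffices to show that
$$\lim_{\tau \to \tau_*} x^0(\tau) < \infty.$$
Throughout this section we will consistently use the notation $x^0 = t$. We recall that
$$\Gamma^\alpha_{\beta\gamma} = g^{\alpha\sigma}\big(\partial_\beta h_{\gamma\sigma} + \partial_\gamma h_{\beta\sigma} - \partial_\sigma h_{\beta\gamma}\big).$$
Thus, expanding the metric g = m + h,
$$\ddot x^0 - \big(2\partial_\beta h_{0\gamma} - \partial_0 h_{\beta\gamma}\big)\dot x^\beta \dot x^\gamma = h \cdot \partial h \cdot |\dot X|^2.$$
We further observe that
$$\partial_\beta h_{0\gamma}\,\dot x^\beta \dot x^\gamma = \frac{d}{d\tau}\big(h_{0\gamma}\dot x^\gamma\big) - h_{0\gamma}\ddot x^\gamma. \tag{16.7}$$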
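We now additionally recall that $\partial_q h_{LL}$ is the only derivative of h that does not have the decay rate of at least $(x^0)^{-1}$. Thus
$$\partial_0 h_{\beta\gamma}\dot X^\beta \dot X^\gamma = \partial_q h_{\beta\gamma}\dot X^\beta \dot X^\gamma + \varepsilon O(t^{-1})|\dot X|^2 = \partial_q h_{LL}|\dot X^L|^2 + \varepsilon O((x^0)^{-1})|\dot X|^2.$$
The expression
$$\dot X^L = \dot X^\alpha L_\alpha = -\dot x^0 + \frac{x_i}{|x|}\dot x^i = -\frac{d}{d\tau}(t - r) = -\dot q.$$
Moreover, $\partial_q h_{LL} = 4\partial_q h_{00} + \varepsilon O((x^0)^{-1})$. Furthermore, introduce $\zeta(x^0/r)$, a cut-off function of the set $r \ge x^0/2$. Then
$$\partial_q h_{00} = (1 - \zeta)\partial_q h_{00} + \zeta\,\partial_q h_{00} = \varepsilon O(t^{-1}) + \partial_q(\zeta h_{00}) - \big(\partial_q \zeta(x^0/r)\big)h_{00}.$$
We compute
$$\partial_q \zeta(x^0/r) = \big(r^{-1} + x^0 r^{-2}\big)\zeta'(x^0/r) \lesssim (x^0)^{-1},$$
since $r \ge x^0/2$ on the support of $\zeta(x^0/r)$. Thus $\partial_q h_{00}$ can be replaced by $\partial_q(\zeta h_{00})$ at the expense of a term of order $\varepsilon O((x^0)^{-1})$. Therefore,
$$\partial_q h_{LL}|\dot X^L|^2 = 4\partial_q h_{00}|\dot q|^2 = 4\partial_q(\zeta h_{00})|\dot q|^2 + \varepsilon O((x^0)^{-1})|\dot q|^2$$
$$= 4\frac{d}{d\tau}\big(\zeta h_{00}\dot q\big) - 4\zeta h_{00}\ddot q - 4\partial_L(\zeta h_{00})\dot X^L \dot X^L - 4\,\not\partial(\zeta h_{00})\dot X^\omega \dot X^L + \varepsilon O((x^0)^{-1}).$$
Here, $h(X(\tau)) = h(q(\tau), v(\tau), \omega(\tau))$,

¹⁵ Viewed from the point of view of the global system of wave coordinates on $\mathbb{R}^4$.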
where $q = x^0 - r$, $v = x^0 + r$, and $\omega = x^i/r$. The advantage is that $\not\partial h_{00}$, $\partial_L h_{00}$ already decay faster than $(x^0)^{-1}$ and $\not\partial \zeta(x^0/r) = 0$, while $|\partial_L \zeta(x^0/r)| \lesssim (x^0)^{-1}$. Thus
$$\partial_q h_{LL}|\dot X^L|^2 = \frac{d}{d\tau}\big(\zeta h_{00}\dot q\big) - \zeta h_{00}\ddot q + \varepsilon O((x^0)^{-1}).$$
It remains to analyze the term
$$\ddot q = \frac{d}{d\tau}\Big(\dot x^0 - \frac{x_i}{r}\dot x^i\Big) = \ddot x^0 - \frac{x_i}{r}\ddot x^i + r^{-1}\big(|\dot x|^2 - r^{-2}|x \cdot \dot x|^2\big). \tag{16.8}$$
From the geodesic equation (16.1) we can estimate
$$|\ddot x^\alpha| \le |\partial h||\dot X|^2.$$
Additionally, since on the support of $\zeta(x^0/r)$, $r \ge x^0/2$, we have that the last term in (16.8) multiplied by $\zeta h_{00}$ contributes at most¹⁶ $\varepsilon (x^0)^{-1}$. Thus combining everything together we have
$$\frac{d}{d\tau}\big(\dot x^0 - 2h_{0\gamma}\dot x^\gamma + \zeta h_{00}\dot q\big) = O(\varepsilon (x^0)^{-1})|\dot X|^2.$$
We integrate this identity between proper times 0 < τ. Observe that $|\dot X| \le |\dot x^0|$ and that
$$(x^0)^{-1}|\dot x^0| = \frac{d}{d\tau}\ln x^0.$$
Thus
$$\dot x^0(\tau) \lesssim \dot x^0(0) + \big(2h_{0\gamma}\dot x^\gamma - \zeta h_{00}\dot q\big)\Big|_0^\tau + \varepsilon\int_0^\tau \frac{d\ln x^0}{d\tau}\,\dot x^0\,d\tau.$$
It follows that
$$\dot x^0(\tau) \lesssim \Big(\frac{|x^0(\tau)|}{|x^0(0)|}\Big)^{\varepsilon}\dot x^0(0).$$
Integrating one more time and assuming that $x^0(0) = t(0) = 1$ we obtain that
$$(x^0(\tau))^{1-\varepsilon} \lesssim 1 + \dot x^0(0)\tau.$$
From this we conclude that the time $x^0 = t$ remains finite with τ. This concludes the proof for time-like geodesics.

We now address the issue of null geodesics X(τ), $g_{\alpha\beta}\dot X^\alpha \dot X^\beta = 0$. Examining the proof above leads to the conclusion that it suffices to establish that the condition $\dot x^0(\tau) > 0$ is preserved in time.

Lemma 16.3. For a future oriented inextendible null geodesic X(τ) defined on the interval $[0, \tau_*)$ we have $\dot x^0(\tau) > 0$ for all $\tau \in [0, \tau_*)$.

¹⁶ This is the reason for introducing the cut-off function ζ.
Proof. Let $\tau_0 < \tau_*$ be the first time when $\dot x^0(\tau_0) = 0$. Fix a sufficiently small constant c. Then there exists a small interval of size δ such that
$$0 \le \dot x^0(\tau) \le c, \quad \forall \tau \in [\tau_0 - \delta, \tau_0] \qquad \text{and} \qquad \dot x^0(\tau_0 - \delta) = c. \tag{16.9}$$
We observe that (16.5) with A = 0 implies that $|\dot X(\tau)| \le 2|\dot x^0(\tau)|$ and therefore,
$$|\dot X(\tau)| \le 2c, \qquad \forall \tau \in [\tau_0 - \delta, \tau_0].$$
Integrating the geodesic equation (16.1) we obtain
$$|\dot x^0(\tau_0) - \dot x^0(\tau_0 - \delta)| \le \int_{\tau_0 - \delta}^{\tau_0} |\Gamma||\dot X|^2 \le \varepsilon c^2 \delta.$$
Thus, using (16.9), $\dot x^0(\tau_0) \ge c - \varepsilon c^2\delta > 0$. Contradiction.
This completes the proof of Proposition 16.2. We have shown that all future inextendible causal geodesics X(τ) exist for all values of the affine parameter τ ∈ [0, ∞). This means that the constructed space-time is future causally geodesically complete. Next we establish that all future oriented causal geodesics escape to infinity.

Proposition 16.4. Let X(τ) be a future oriented causal geodesic. Then
$$\lim_{\tau \to \infty} |X(\tau)| = \infty. \tag{16.10}$$

Proof. The inequality (16.6) immediately gives the desired result for time-like geodesics. Recall that by Lemma 16.3 we have that $\dot x^0(\tau) > 0$ and thus $x^0(\tau)$ is monotonically increasing in τ. We now argue by contradiction. Assume that for all τ ≥ 0, |X(τ)| ≤ C for some potentially large constant C. Then there exists a time $t_0$ such that $t_0 = \lim_{\tau \to \infty} x^0(\tau)$. Let $\tau_0$ be the value of the proper time τ for which $t(\tau_0) = t_0 - \delta$ for some small constant δ. Integrating the geodesic equation we obtain that for $\tau \ge \tau_0$,
$$\dot x^0(\tau) = \dot x^0(\tau_0) + \int_{\tau_0}^{\tau} |\Gamma||\dot x^0|^2\,d\tau \le \dot x^0(\tau_0) + \varepsilon\int_{t_0 - \delta}^{t} \dot x^0\,dt \le \dot x^0(\tau_0) + \varepsilon\delta \sup_{\tau_0 \le \tau' \le \tau} \dot x^0(\tau'). \tag{16.11}$$
Thus for any $\tau \ge \tau_0$,
$$\dot x^0(\tau) \le 2\dot x^0(\tau_0). \tag{16.12}$$
Choosing a sequence of times $\tau_0 \to \infty$ such that $\dot x^0(\tau_0) \to 0$ (such a sequence must exist, otherwise $x^0(\tau) \to \infty$) we infer from (16.12) that $\dot x^0(\tau) \to 0$ as τ → ∞. We can then choose small constants c, δ such that $t(\tau_0) = t_0 - \delta$ and
$$\dot x^0(\tau_0) = c, \qquad \dot x^0(\tau) \le c$$
for all $\tau \ge \tau_0$. Returning to (16.11) we see that $|\dot x^0(\tau) - c| \le \varepsilon\delta c$. Thus
$$\dot x^0(\tau) \ge \frac{c}{2}$$
for all $\tau \ge \tau_0$ and we obtain a contradiction.
Acknowledgements. The authors would like to thank Demetrios Christodoulou and Sergiu Klainerman for their inspiration and encouragement. We particularly benefited from Sergiu Klainerman’s suggestion to pursue first the problem with restricted data. We would also like to thank Mihalis Dafermos and Vince Moncrief for stimulating discussions and useful suggestions.
References

[A1] Alinhac, S.: Rank 2 singular solutions for quasilinear wave equations. Int. Math. Res. Notices (18), 955–984 (2000)
[A2] Alinhac, S.: The null condition for quasilinear wave equations in two dimensions I. Invent. Math. 145, 597–618 (2001)
[A3] Alinhac, S.: An example of blowup at infinity for a quasilinear wave equation. Asterisque 284, 1–91 (2003)
[CB1] Choquet-Bruhat, Y.: Theoreme d’existence pour certains systemes d’equations aux derivees partielles nonlineaires. Acta Math. 88, 141–225 (1952)
[CB2] Choquet-Bruhat, Y.: Un theoreme d’instabilite pour certaines equations hyperboliques non lineaires. C. R. Acad. Sci. Paris Sr. A-B 276, A281–A284 (1973)
[CB3] Choquet-Bruhat, Y.: The null condition and asymptotic expansions for the Einstein’s equations. Ann. Phys. (Leipzig) 9, 258–266 (2000)
[CB-G] Choquet-Bruhat, Y., Geroch, R.P.: Global aspects of the Cauchy problem in General Relativity. Commun. Math. Phys. 14, 329–335 (1969)
[C1] Christodoulou, D.: Global solutions of nonlinear hyperbolic equations for small initial data. Commun. Pure Appl. Math. 39, 267–282 (1986)
[C2] Christodoulou, D.: The Global Initial Value Problem in General Relativity. In: The Ninth Marcel Grossmann Meeting (Rome 2000), V.G. Gurzadyan, R.T. Jansen, (eds.), R. Ruffini, editor and series editor, River Edge, NJ: World Scientific, 2002, pp. 44–54
[C-K] Christodoulou, D., Klainerman, S.: The Global Nonlinear Stability of the Minkowski Space. Princeton Mathematical Series 41. Princeton, NJ: Princeton University Press, 1993
[C-D] Chruściel, P. T., Delay, E.: Existence of non-trivial, vacuum, asymptotically simple spacetimes. Class. Quantum Grav. 19(9), L71–L79 (2002)
[Co] Corvino, J.: Scalar curvature deformation and a gluing construction for the Einstein constraint equations. Commun. Math. Phys. 214(1), 137–189 (2000)
[Fo] Fock, V.: The theory of space, time and gravitation. New York: The Macmillan Co., 1964
[Fr] Friedrich, H.: On the existence of n-geodesically complete or future complete solutions of Einstein’s field equations with smooth asymptotic structure. Commun. Math. Phys. 107(4), 587–609 (1986)
[H-E] Hawking, S., Ellis, G.: The large scale structure of space-time. Cambridge: Cambridge University Press, 1973
[H1] Hörmander, L.: The lifespan of classical solutions of nonlinear hyperbolic equations. In: Pseudodifferential operators (Oberwolfach, 1986), Lecture Notes in Math. 1256, Berlin: Springer, 1987, pp. 214–280
[H2] Hörmander, L.: Lectures on Nonlinear hyperbolic differential equations. Berlin-Heidelberg-New York: Springer Verlag, 1997
[H3] Hörmander, L.: On the fully nonlinear Cauchy problem with small initial data II. In: Microlocal analysis and nonlinear waves (Minneapolis, MN, 1988–1989), IMA Vol. Math. Appl. 30, New York: Springer, 1991, pp. 51–81
[J1] John, F.: Blow-up for quasilinear wave equations in three space dimensions. Commun. Pure Appl. Math. 34(1), 29–51 (1981)
[J2] John, F.: Blow-up of radial solutions of $u_{tt} = c^2(u_t)\Delta u$ in three space dimensions. Mat. Appl. Comput. 4(1), 3–18 (1985)
[J-K] John, F., Klainerman, S.: Almost global existence to nonlinear wave equations in three space dimensions. Commun. Pure Appl. Math. 37, 443–455 (1984)
[K1] Klainerman, S.: Uniform decay estimates and the Lorentz invariance of the wave equation. Commun. Pure Appl. Math. 38, 321–332 (1985)
[K2] Klainerman, S.: The null condition and global existence to nonlinear wave equations. Lect. Appl. Math. 23, 293–326 (1986)
[K-N1] Klainerman, S., Nicolo, F.: The evolution problem in general relativity. Basel-Boston: Birkhäuser, 2003
[K-N2] Klainerman, S., Nicolo, F.: Peeling properties of asymptotically flat solutions to the Einstein vacuum equations. Class. Quant. Grav. 20, 3215–3257 (2003)
[L1] Lindblad, H.: On the lifespan of solutions of nonlinear wave equations with small initial data. Commun. Pure Appl. Math. 43, 445–472 (1990)
[L2] Lindblad, H.: Global solutions of nonlinear wave equations. Commun. Pure Appl. Math. 45(9), 1063–1096 (1992)
[L-R] Lindblad, H., Rodnianski, I.: The weak null condition for Einstein’s equations. C. R. Math. Acad. Sci. Paris 336(11), 901–906 (2003)
[S1] Shu, W-T.: Asymptotic properties of the solutions of linear and nonlinear spin field equations in Minkowski space. Commun. Math. Phys. 140(3), 449–480 (1991)
[S-Y] Schoen, R., Yau, S.: On the proof of the positive mass conjecture in general relativity. Commun. Math. Phys. 65, 45–76 (1979)
[Wa] Wald, R.: General Relativity. Chicago, IL: Chicago Univ. Press, 1984
[Wi] Witten, E.: A new proof of the positive mass theorem. Commun. Math. Phys. 80, 381–402 (1981)

Communicated by P. Constantin
Commun. Math. Phys. 256, 111–157 (2005) Digital Object Identifier (DOI) 10.1007/s00220-005-1314-9
Communications in
Mathematical Physics
Perturbation of Singular Equilibria of Hyperbolic Two-Component Systems: A Universal Hydrodynamic Limit

Bálint Tóth1, Benedek Valkó1,2

1 Institute of Mathematics, Budapest University of Technology and Economics, Egry J. u. 1., 1111 Budapest, Hungary. E-mail: {balint,valko}@math.bme.hu
2 Alfréd Rényi Institute of Mathematics, Hungarian Academy of Sciences, Reáltanoda u. 13-15., 1053 Budapest, Hungary. E-mail: [email protected]

Received: 15 December 2003 / Accepted: 14 October 2004
Published online: 8 March 2005 – © Springer-Verlag 2005
Abstract: We consider one-dimensional, locally finite interacting particle systems with two conservation laws which under the Eulerian hydrodynamic limit lead to two-by-two systems of conservation laws:
$$\begin{cases} \partial_t \rho + \partial_x \Psi(\rho, u) = 0 \\ \partial_t u + \partial_x \Phi(\rho, u) = 0, \end{cases}$$
with $(\rho, u) \in D \subset \mathbb{R}^2$, where D is a convex compact polygon in $\mathbb{R}^2$. The system is typically strictly hyperbolic in the interior of D with possible non-hyperbolic degeneracies on the boundary ∂D. We consider the case of an isolated singular (i.e. non-hyperbolic) point on the interior of one of the edges of D, call it $(\rho_0, u_0)$. We investigate the propagation of small nonequilibrium perturbations of the steady state of the microscopic interacting particle system, corresponding to the densities $(\rho_0, u_0)$ of the conserved quantities. We prove that for a very rich class of systems, under a proper hydrodynamic limit the propagation of these small perturbations is universally driven by the two-by-two system
$$\begin{cases} \partial_t \rho + \partial_x (\rho u) = 0 \\ \partial_t u + \partial_x (\rho + \gamma u^2) = 0, \end{cases}$$
where the parameter γ is the only trace of the microscopic structure. The proof relies on the relative entropy method and thus, it is valid only in the regime of smooth solutions of the pde. But there are essential new elements: in order to control the fluctuations of the terms with Poissonian (rather than Gaussian) decay coming from the low density approximations we have to apply refined pde estimates. In particular Lax entropies of these pde systems play a not merely technical key role in the main part of the proof.
1. Introduction

1.1. The PDE to be derived and some facts about it. We consider the pde
$$\begin{cases} \partial_t \rho + \partial_x (\rho u) = 0 \\ \partial_t u + \partial_x (\rho + \gamma u^2) = 0 \end{cases} \tag{1.1}$$
for $(t, x) \in [0, \infty) \times (-\infty, \infty)$, where $\rho = \rho(t, x) \in \mathbb{R}_+$, $u = u(t, x) \in \mathbb{R}$ are density, respectively, velocity field and $\gamma \in \mathbb{R}$ is a fixed parameter. For any fixed γ this is a hyperbolic system of conservation laws in the domain $(\rho, u) \in \mathbb{R}_+ \times \mathbb{R}$.

Phenomenologically, the pde describes a deposition/domain growth – or, in biological terms: chemotaxis – mechanism: ρ(t, x) is the density of population performing the deposition and h(t, x) is the height of the deposition. Let $u(t, x) := -\partial_x h(t, x)$. The physics of the phenomenon is contained in the following two rules:

(a) The velocity field of the population is proportional to the negative gradient of the height of the deposition. That is, the population is pushed towards the local decrease of the deposition height. This rule, together with the conservation of total mass of the population leads to the continuity equation (the first equation in our system).
(b) The deposition rate is
$$\partial_t h = \rho + \gamma (\partial_x h)^2.$$
The first term on the right-hand side is just saying that deposition is done additively by the population. The second term is a self-generating deposition, introduced and phenomenologically motivated by Kardar-Parisi-Zhang [9] and commonly accepted in the literature. Differentiating this last equation with respect to the space variable x results in the second equation of our system.

The pde (1.1) is invariant under the following scaling: if ρ(t, x), u(t, x) is a solution then
$$\tilde\rho(t, x) := A^{2\beta}\rho(A^{1+\beta}t, Ax), \qquad \tilde u(t, x) := A^{\beta}u(A^{1+\beta}t, Ax),$$
is also a solution, where A > 0 and β ∈ R are arbitrarily fixed. The choice β = 0 gives the straightforward hyperbolic scale invariance, valid for any system of conservation laws. More interesting is the β = 1/2 case. This is the natural scale invariance of the system, since the physical variables (density and velocity fields) change covariantly under this scaling. This is the (presumed, but never rigorously proved) asymptotic scale invariance of the Kardar-Parisi-Zhang deposition phenomena. The nontrivial scale invariance of the pde (1.1) suggests its universality in some sense. Our main result indeed states its validity in a very wide context.

It is also clear that the pde is invariant under the left-right reflection symmetry $x \to -x$:
$$\tilde\rho(t, x) := \rho(t, -x), \qquad \tilde u(t, x) := -u(t, -x)$$
also satisfies (1.1).

The parameter γ of the pde (1.1) is of crucial importance: different values of γ lead to completely different behavior. Some particular cases which arose in the past in various contexts are listed:
– The pde (1.1) with γ = 0 arose in the context of the ‘true self-repelling motion’ constructed by Tóth and Werner in [23]. For a survey of this case see also [24]. The same equation, with viscosity terms added, appears in mathematical biology under the name of (negative) chemotaxis equations (see e.g. [17, 15, 14]).
– Taking γ = 1/2 we get the ‘shallow water equation’. See [3, 13]. This is the only value of the parameter γ when m = ρu is conserved and as a consequence the pde (1.1) can be interpreted as a gas dynamics equation.
– With γ = 1 the pde is called ‘Leroux’s equation’, which is of Temple class and for this reason much investigated. For many details about this equation see [19]. In the recent paper [6] Leroux’s system has been derived as a hydrodynamic limit under Eulerian scaling for a two-component lattice gas, going even beyond the appearance of shocks.

The main facts about the pde (1.1) are presented in Subsect. 10.1 in the Appendix. Here we only mention that

1. For any γ ∈ R the system (1.1) is strictly hyperbolic in $(\rho, u) \in (0, \infty) \times \mathbb{R}$, with hyperbolicity marginally lost at (ρ, u) = (0, 0) for γ ≠ 1/2 and at ρ = 0 for γ = 1/2.
2. The Riemann invariants (or characteristic coordinates) are explicitly computed in Sect. 10.1; for a first impression see Fig. 1 of the Appendix where the level lines of the Riemann invariants are shown. It turns out that the picture changes qualitatively at the critical values γ = 1/2, γ = 3/4 and γ = 1. It is of crucial importance for our later problem that the level curves, expressed as $u \mapsto \rho(u)$, are convex for γ < 1, linear for γ = 1 and concave for γ > 1.
3. For any γ ≥ 0 the system (1.1) is genuinely nonlinear in $(\rho, u) \in (0, \infty) \times \mathbb{R}$, with genuine nonlinearity marginally lost at (ρ, u) = (0, 0) for γ ≠ 0, 1/2 and at ρ = 0 for γ = 0, 1/2.
4. The system is sufficiently rich in Lax entropies.
5. For γ ≥ 0 the system (1.1) satisfies the conditions of the Lax-Chuey-Conley-Smoller Maximum Principle (see [11, 12, 19]). However, this maximum principle yields a priori bounds for entropy solutions with bounded initial data only for γ ≥ 1.

The goal of the present paper is to derive the two-by-two hyperbolic system of conservation laws (1.1) as a decent hydrodynamic limit of some systems of interacting particles with two conserved quantities. We consider one-dimensional, locally finite interacting particle systems with two conservation laws with periodic boundary conditions which under the Eulerian hydrodynamic limit lead to two-by-two systems of conservation laws
$$\begin{cases} \partial_t \rho + \partial_x \Psi(\rho, u) = 0 \\ \partial_t u + \partial_x \Phi(\rho, u) = 0, \end{cases}$$
with $(t, x) \in [0, \infty) \times \mathbb{T}$, $(\rho, u) \in D \subset \mathbb{R}^2$. Here $\mathbb{T} = \mathbb{R}/\mathbb{Z}$ is the unit torus and D is a convex compact polygon in $\mathbb{R}^2$. The system is typically strictly hyperbolic in the interior of D with possible non-hyperbolic degeneracies on the boundary ∂D. We consider the case of an isolated singular (i.e. non-hyperbolic) point on the interior of one of the edges of D, call it $(\rho_0, u_0) = (0, 0)$ and assume $D \subset \{\rho \ge 0\}$ (otherwise we apply an appropriate linear transformation on the conserved quantities). We investigate the propagation of small nonequilibrium perturbations of the steady state of the microscopic interacting
particle system, corresponding to the densities $(\rho_0, u_0)$ of the conserved quantities. We prove that for a very rich class of systems, under a proper hydrodynamic limit the propagation of these small perturbations is universally driven by the system (1.1) on the unit torus, where the parameter $\gamma := \frac{1}{2}\Phi_{uu}(\rho_0, u_0)$ (with a proper choice of space and time scale) is the only trace of the microscopic structure. The proof is valid for the cases with γ > 1.

The proof essentially relies on H-T. Yau’s relative entropy method and thus, it is valid only in the regime of smooth solutions of the pde (1.1). We should emphasize here the essential new ideas of the proof. Since we consider a low density limit, the distribution of particle numbers in blocks of mesoscopic size will have a Poissonian rather than Gaussian tail. The fluctuations of the other conserved quantity will be Gaussian, as usual. It follows that when controlling the fluctuations of the empirical block averages the usual large deviation approach would lead us to the disastrous estimate $\mathbf{E}\exp\{\varepsilon\,\mathrm{GAU}\cdot\mathrm{POI}\} = \infty$. It turns out that some very special cutoff must be applied. Since the large fluctuations which are cut off cannot be estimated by robust methods (i.e. by applying entropy inequality), only some cancellation due to martingales can help. This is the reason why the cutoff function must be chosen in a very special way, in terms of a particular Lax entropy of the Euler equation. In this way the proof becomes an interesting mixture of probabilistic and pde arguments. The fine properties of the limiting pde, in particular the global behavior of Riemann invariants and some particular Lax entropies, play an essential role in the proof. The radical difference between the γ ≥ 1 vs. γ < 1 cases, in particular applicability vs. non-applicability of the Lax-Chuey-Conley-Smoller maximum principle, manifests itself on the microscopic, probabilistic level.

1.2. The structure of the paper. In Sect.
2 we define the class of models to which our main theorem applies: we formulate the conditions to be satisfied by the interacting particle systems to be considered, we compute the steady state measures and the fluxes corresponding to the conserved quantities. At the end of this section we formulate the Eulerian hydrodynamic limit, for later reference.

In Sect. 3 first we perform asymptotic analysis of the Euler equations close to the singular point considered, then we formulate our main result, Theorem 1, and its immediate consequences.

In Sect. 4 we perform the necessary preliminary computations for the proof. After introducing the minimum necessary notation we apply some standard procedures in the context of the relative entropy method. Empirical block averages are introduced, numerical error terms are separated and estimated. In these first estimates only straightforward numerical approximations (Taylor expansion bounds) and the most direct entropy inequality are applied.

Section 5 is of crucial importance: here it is shown why the traditional approach of the relative entropy method fails to apply. Here it becomes apparent that in the fluctuation bound (usually referred to as large deviation estimate) instead of the tame $\mathbf{E}\exp\{\varepsilon\,\mathrm{GAU}^2\}$ we would run into the wild $\mathbf{E}\exp\{\varepsilon\,\mathrm{GAU}\cdot\mathrm{POI}\}$, which is, of course, infinite. Here we describe our special cutoff function and we state its main properties in Lemma 2. The construction of the cutoff is outlined in the Appendix. The proof, that the constructed functions indeed possess the properties described in Lemma 2, is pure classical pde theory. It is a straightforward, although quite lengthy (and not entirely trivial) calculation. Since the detailed proof would lengthen our paper considerably and also because it would stick out a bit from the framework of the paper, it is omitted
completely. The interested reader may look up the detailed proof in [25]. At the end of the section the outline of the further steps is presented.

In Sect. 6 all the necessary probabilistic ingredients of the forthcoming steps are gathered. These are: fixed time large deviation bounds and fixed time fluctuation bounds, the time averaged block replacement bounds (one block estimates) and the time averaged gradient bounds (closely related to the so-called two block estimates). The proofs of these last two rely on Varadhan’s large deviation bound cited in that section and on some probability lemmas stated and proved in Sect. 9. We should mention here that these proofs, in particular the probability lemmas involved, also contain some new and instructive elements.

Sections 7 and 8 conclude the proof: the various terms arising in Sect. 5 are estimated using all the tools (probabilistic and pde) described in earlier sections. One can see that these estimates rely heavily on the fine properties of the Lax entropy used in the cutoff procedure.

As we already mentioned Sect. 9 is devoted to proofs of various lemmas stated in earlier parts.

In the first subsection of the Appendix we give some details about the pde (1.1). This is included for the sake of completeness and in order to let the reader have some more information about these, certainly interesting, pde-s. Strictly technically speaking this is not used in the proof. In the second subsection we outline the construction of the cutoff function.

2. Microscopic Models

Our interacting particle systems to be defined in the present section model on a microscopic level the same deposition phenomena as the pde (1.1). There will be two conserved physical quantities: the particle number $\eta_j \in \mathbb{N}$ and the (discrete) negative gradient of the deposition height $\zeta_j \in \mathbb{Z}$. The dynamical driving mechanism is of such nature that

(i) The deposition height growth is influenced by the local particle density. Typically: growth is enhanced by higher particle densities.
(ii) The particle motion is itself influenced by the deposition profile. Typically: particles are pushed in the direction of the negative gradient of the deposition height.

2.1. State space, conserved quantities. Throughout this paper we denote by $\mathbb{T}^n$ the discrete tori $\mathbb{Z}/n\mathbb{Z}$, $n \in \mathbb{N}$, and by $\mathbb{T}$ the continuous torus $\mathbb{R}/\mathbb{Z}$. We will denote the local spin state by Ω; we only consider the case when Ω is finite. The state space of the interacting particle system of size n is
$$\Omega^n := \Omega^{\mathbb{T}^n}.$$
Configurations will be denoted $\omega := (\omega_j)_{j \in \mathbb{T}^n} \in \Omega^n$. For sake of simplicity we consider discrete (integer valued) conserved quantities only. The two conserved quantities are
$$\eta : \Omega \to \mathbb{N}, \qquad \zeta : \Omega \to v_0\mathbb{Z} \ \text{ or } \ v_0(\mathbb{Z} + 1/2). \tag{2.1}$$
The trivial scaling factor $v_0$ will be conveniently chosen later (see (2.4)). We also use the notations $\eta_j = \eta(\omega_j)$, $\zeta_j = \zeta(\omega_j)$. This means that the sums $\sum_j \eta_j$ and $\sum_j \zeta_j$ are conserved by the dynamics. We assume that the conserved quantities are different and non-trivial, i.e. the functions ζ, η and the constant function 1 on Ω are linearly independent. The left-right reflection symmetry of the model is implemented by an involution
$$R : \Omega \to \Omega, \qquad R \circ R = \mathrm{Id}$$
which acts on the conserved quantities as follows:
$$\eta(R\omega) = \eta(\omega), \qquad \zeta(R\omega) = -\zeta(\omega). \tag{2.2}$$
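To make the setup concrete, the following sketch encodes a hypothetical three-state local spin space together with η, ζ and the involution R, and checks the requirements stated so far: R is an involution, it acts on the conserved quantities as in (2.2), and η, ζ and the constant function 1 are linearly independent. The state space, all names, and the choice $v_0 = 1$ are assumptions made purely for illustration; this is not one of the models treated in the paper.

```python
# Hypothetical toy local spin space: each local state is a pair (eta, zeta).
# Two "empty" states carrying zeta = -1 or +1, and one "particle" state.
STATES = [(0, -1), (0, 1), (1, 0)]

def eta(w):
    """Particle number of the local state w."""
    return w[0]

def zeta(w):
    """Discrete negative height gradient of the local state w (v0 = 1 assumed)."""
    return w[1]

def reflect(w):
    """Involution R: keeps eta and flips the sign of zeta."""
    return (w[0], -w[1])

def conserved_totals(config):
    """Return (sum of eta_j, sum of zeta_j) for a configuration given as a list of states."""
    return sum(eta(w) for w in config), sum(zeta(w) for w in config)

def independence_determinant():
    """Determinant of the 3x3 matrix with rows eta, zeta, 1 evaluated on STATES.

    A nonzero value means eta, zeta and the constant function 1 are
    linearly independent on this toy state space.
    """
    rows = [[eta(w) for w in STATES], [zeta(w) for w in STATES], [1, 1, 1]]
    (a, b, c), (d, e, f), (g, h, i) = rows
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
```

For instance, `conserved_totals([(0, 1), (1, 0), (0, -1)])` returns the pair of quantities that any admissible dynamics on this three-site configuration would have to preserve.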
2.2. Rate functions, infinitesimal generators, stationary measures. Consider a (fixed) probability measure $\pi$ on $S$, which is invariant under the action of the involution $R$, i.e. $\pi(R\omega) = \pi(\omega)$, and puts positive measure on every $\omega \in S$. Since eventually we consider low densities of $\eta$, in order to exclude trivial cases we assume that
\[ \pi(\zeta = 0 \mid \eta = 0) < 1. \tag{2.3} \]
The scaling factor $v_0$ in (2.1) is chosen so that
\[ \mathrm{Var}(\zeta \mid \eta = 0) = 1. \tag{2.4} \]
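This choice simplifies some formulas (fixing a recurring constant to be equal to 1, see (3.4)) but does not restrict generality. For later use we introduce the notations
\[ \rho^* := \max_{\omega\in S} \eta(\omega), \qquad u^* := \max_{\omega\in S} \zeta(\omega), \qquad u_* := \max_{\omega\in S,\ \eta(\omega)=0} \zeta(\omega). \]
For $\tau, \theta \in \mathbb{R}$ let $G(\tau,\theta)$ be the moment generating function defined below:
\[ G(\tau,\theta) := \log \sum_{\omega\in S} e^{\tau\eta(\omega)+\theta\zeta(\omega)}\, \pi(\omega). \]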
In thermodynamic terms $G(\tau,\theta)$ corresponds to the Gibbs free energy. We define the probability measures
\[ \pi_{\tau,\theta}(\omega) := \pi(\omega) \exp\big(\tau\eta(\omega) + \theta\zeta(\omega) - G(\tau,\theta)\big) \tag{2.5} \]
on $S$. We are going to define dynamics which conserve the quantities $\sum_j \eta_j$ and $\sum_j \zeta_j$, possess no other (hidden) conserved quantities and for which the product measures
\[ \pi^n_{\tau,\theta} := \prod_{j\in\mathbb{T}^n} \pi_{\tau,\theta} \]
are stationary. We need to separate a symmetric (reversible) part of the dynamics, which will be sped up sufficiently in order to enhance convergence to local equilibrium and thus help to estimate some error terms in the hydrodynamic limiting procedure. So we consider two rate functions $r : S\times S\times S\times S \to \mathbb{R}_+$ and $s : S\times S\times S\times S \to \mathbb{R}_+$; $r$ will define the asymmetric component of the dynamics, while $s$ will define the reversible
Perturbation of Singular Equilibria
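component. The dynamics of the system consists of elementary jumps affecting nearest neighbor spins,
\[ (\omega_j, \omega_{j+1}) \longrightarrow (\omega_j', \omega_{j+1}'), \]
performed with rate $\lambda\, r(\omega_j,\omega_{j+1};\omega_j',\omega_{j+1}') + \kappa\, s(\omega_j,\omega_{j+1};\omega_j',\omega_{j+1}')$, where $\lambda, \kappa > 0$ are speed-up factors, depending on the size of the system in the limiting procedure. We require that the rate functions $r$ and $s$ satisfy the following conditions:

(A) Conservation laws: If $r(\omega_1,\omega_2;\omega_1',\omega_2') > 0$ or $s(\omega_1,\omega_2;\omega_1',\omega_2') > 0$ then
\[ \eta(\omega_1) + \eta(\omega_2) = \eta(\omega_1') + \eta(\omega_2'), \qquad \zeta(\omega_1) + \zeta(\omega_2) = \zeta(\omega_1') + \zeta(\omega_2'). \]

(B) Irreducibility: For every $N \in [0, n\rho^*]$, $Z \in [-nu^*, nu^*]$ the set
\[ \Omega^n_{N,Z} := \Big\{ \omega \in \Omega^n : \sum_{j\in\mathbb{T}^n} \eta_j = N,\ \sum_{j\in\mathbb{T}^n} \zeta_j = Z \Big\} \]
is an irreducible component of $\Omega^n$, i.e. if $\omega, \omega' \in \Omega^n_{N,Z}$ then there exists a series of elementary jumps with positive rates transforming $\omega$ into $\omega'$.

(C) Left-right symmetry: The jump rates are invariant under left-right reflection and the action of the involution $R$ (jointly):
\[ r(R\omega_2, R\omega_1; R\omega_2', R\omega_1') = r(\omega_1,\omega_2;\omega_1',\omega_2'), \qquad s(R\omega_2, R\omega_1; R\omega_2', R\omega_1') = s(\omega_1,\omega_2;\omega_1',\omega_2'). \]

(D) Stationarity of the asymmetric part: For any $\omega_1, \omega_2, \omega_3 \in S$,
\[ Q(\omega_1,\omega_2) + Q(\omega_2,\omega_3) + Q(\omega_3,\omega_1) = 0, \]
where
\[ Q(\omega_1,\omega_2) := \sum_{\omega_1',\omega_2'\in S} \Big\{ \frac{\pi(\omega_1')\pi(\omega_2')}{\pi(\omega_1)\pi(\omega_2)}\, r(\omega_1',\omega_2';\omega_1,\omega_2) - r(\omega_1,\omega_2;\omega_1',\omega_2') \Big\}. \]

(E) Reversibility of the symmetric part: For any $\omega_1, \omega_2, \omega_1', \omega_2' \in S$,
\[ \pi(\omega_1)\pi(\omega_2)\, s(\omega_1,\omega_2;\omega_1',\omega_2') = \pi(\omega_1')\pi(\omega_2')\, s(\omega_1',\omega_2';\omega_1,\omega_2). \]

For a precise formulation of the infinitesimal generator on $\Omega^n$ we first define the map $\Theta_j^{\omega'\omega''} : \Omega^n \to \Omega^n$ for every $\omega', \omega'' \in S$, $j \in \mathbb{T}^n$:
\[ \big(\Theta_j^{\omega'\omega''}\omega\big)_i = \begin{cases} \omega' & \text{if } i = j, \\ \omega'' & \text{if } i = j+1, \\ \omega_i & \text{if } i \neq j, j+1. \end{cases} \]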
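The infinitesimal generators defined by these rates will be denoted:
\[ L^n f(\omega) = \sum_{j\in\mathbb{T}^n} \sum_{\omega',\omega''\in S} r(\omega_j,\omega_{j+1};\omega',\omega'') \big( f(\Theta_j^{\omega'\omega''}\omega) - f(\omega) \big), \]
\[ K^n f(\omega) = \sum_{j\in\mathbb{T}^n} \sum_{\omega',\omega''\in S} s(\omega_j,\omega_{j+1};\omega',\omega'') \big( f(\Theta_j^{\omega'\omega''}\omega) - f(\omega) \big). \]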
We denote by $X^n_t$ the Markov process on the state space $\Omega^n$ with infinitesimal generator
\[ G^n := \lambda(n) L^n + \kappa(n) K^n \tag{2.6} \]
with speed-up factors $\lambda(n)$ and $\kappa(n)$ to be specified later. Let $\mu^n_0$ be a probability distribution on $\Omega^n$ which is the initial distribution of the microscopic system of size $n$, and
\[ \mu^n_t := \mu^n_0\, e^{tG^n} \tag{2.7} \]
the distribution of the system at (macroscopic) time $t$.

Remarks.
(1) Conditions (A) and (B) together imply that $\sum_j \eta_j$ and $\sum_j \zeta_j$ are indeed the only conserved quantities of the dynamics.
(2) Condition (C) together with (2.2) is the implementation of the left-right symmetry of the pde (1.1) on a microscopic level. Actually, our main result, Theorem 1, is valid without this assumption but some of the arguments would be more technical.
(3) Condition (D) implies that the product measures $\pi^n_{\tau,\theta}$ are indeed stationary for the dynamics defined by the asymmetric rates $r$. This is seen by applying similar computations to those of [1, 2, 18] or [22]. Mind that this is not a detailed balance condition for the rates $r$.
(4) Condition (E) is a straightforward detailed balance condition. It implies that the product measures $\pi^n_{\tau,\theta}$ are reversible for the dynamics defined by the symmetric rates $s$.

We will refer to the measures $\pi^n_{\tau,\theta}$ as the canonical measures. Since $\sum_j \zeta_j$ and $\sum_j \eta_j$ are conserved, the canonical measures on $\Omega^n$ are not ergodic. The conditioned measures defined on $\Omega^n_{N,Z}$ by
\[ \pi^n_{N,Z}(\omega) := \pi^n_{\tau,\theta}\Big( \omega \,\Big|\, \sum_{j\in\mathbb{T}^n} \eta_j = N,\ \sum_{j\in\mathbb{T}^n} \zeta_j = Z \Big) = \frac{\pi^n_{\tau,\theta}(\omega)\, \mathbf{1}\{\omega \in \Omega^n_{N,Z}\}}{\pi^n_{\tau,\theta}(\Omega^n_{N,Z})} \]
are also stationary and, due to condition (B) satisfied by the rate functions, they are ergodic. We shall call these measures the microcanonical measures of our system. (It is easy to see that the measure $\pi^n_{N,Z}$ does not depend on the choice of the values of $\tau$ and $\theta$ in the previous definition.)

The assumptions are by no means excessively restrictive. Here follow some concrete examples of interacting particle systems which belong to the class specified by conditions (A)–(E) and also satisfy the further conditions (F), (G), (H), (I) to be formulated later.
{−1, 0, +1}-model. The model is described and analyzed in full detail in [22] and [6]. The one spin state space is $S = \{-1, 0, +1\}$. The left-right reflection symmetry is implemented by $R : S \to S$, $R\omega = -\omega$. The dynamics consists of nearest neighbor spin exchanges and the two conserved quantities are $\eta(\omega) = 1 - |\omega|$ and $\zeta(\omega) = \omega$. The jump rates are
\[ \begin{aligned} r(1,-1;-1,1) &= 0, & r(-1,1;1,-1) &= 2, \\ r(0,-1;-1,0) &= 0, & r(-1,0;0,-1) &= 1, \\ r(1,0;0,1) &= 0, & r(0,1;1,0) &= 1, \end{aligned} \]
and
\[ s(\omega_1,\omega_2;\omega_1',\omega_2') = \begin{cases} 1 & \text{if } (\omega_1',\omega_2') = (\omega_2,\omega_1) \text{ and } \omega_1 \neq \omega_2, \\ 0 & \text{otherwise.} \end{cases} \]
The one dimensional marginals of the stationary measures are
\[ \pi_{\rho,u}(0) = \rho, \qquad \pi_{\rho,u}(\pm 1) = \frac{1 - \rho \pm u}{2} \]
with the domain of variables $D = \{(\rho,u) \in \mathbb{R}_+ \times \mathbb{R} : \rho + |u| \le 1\}$.

Two-lane models. The following family of examples are finite state space versions of the bricklayers models introduced in [24]. Let
\[ S = \{0, 1, \dots, \bar n\} \times \{-\bar z, -\bar z + 1, \dots, \bar z - 1, \bar z\}, \]
where $\bar n \in \mathbb{N}$ and $\bar z \in \{\tfrac12, 1, \tfrac32, 2, \dots\}$. The elements of $S$ will be denoted $\omega := \begin{pmatrix}\eta\\ \zeta\end{pmatrix}$. Naturally enough, $\sum_j \eta_j$ and $\sum_j \zeta_j$ will be the conserved quantities of the dynamics. Left-right reflection symmetry is implemented as $R : S \to S$, $R\begin{pmatrix}\eta\\ \zeta\end{pmatrix} = \begin{pmatrix}\eta\\ -\zeta\end{pmatrix}$. We allow only the following elementary changes to occur at neighboring sites $j, j+1$:
\[ \begin{pmatrix}\eta_j\\ \zeta_j\end{pmatrix}, \begin{pmatrix}\eta_{j+1}\\ \zeta_{j+1}\end{pmatrix} \to \begin{pmatrix}\eta_j\\ \zeta_j \mp 1\end{pmatrix}, \begin{pmatrix}\eta_{j+1}\\ \zeta_{j+1} \pm 1\end{pmatrix}, \qquad \begin{pmatrix}\eta_j\\ \zeta_j\end{pmatrix}, \begin{pmatrix}\eta_{j+1}\\ \zeta_{j+1}\end{pmatrix} \to \begin{pmatrix}\eta_j \mp 1\\ \zeta_j\end{pmatrix}, \begin{pmatrix}\eta_{j+1} \pm 1\\ \zeta_{j+1}\end{pmatrix} \]
with appropriate rates. Besides the conditions already imposed we also assume that the one dimensional marginals of the steady state measures factorize as follows:
\[ \pi(\omega) = \pi\begin{pmatrix}\eta\\ \zeta\end{pmatrix} = p(\eta)\, q(\zeta). \]
The simplest case, with $\bar n = 1$ and $\bar z = 1/2$, that is with $S = \{0, 1\} \times \{-1/2, +1/2\}$, was introduced and fully analyzed in [22] and [16]. For a full description (i.e. identification of the rates which satisfy the imposed conditions, Eulerian hydrodynamic limit, etc.), see those papers. It turns out that conditions (A)–(E) impose some nontrivial combinatorial constraints on the rates which are satisfied by a finite parameter family of models. The number of free parameters increases with $\bar n$ and $\bar z$. Since the concrete expressions of the rates are not relevant for our further presentation we omit the lengthy computations.
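2.3. Expectations. Expectation, variance, covariance with respect to the measures $\pi^n_{\tau,\theta}$ will be denoted by $E_{\tau,\theta}(\cdot)$, $\mathrm{Var}_{\tau,\theta}(\cdot)$, $\mathrm{Cov}_{\tau,\theta}(\cdot)$. We compute the expectations of the conserved quantities with respect to the canonical measures, as functions of the parameters $\tau$ and $\theta$:
\[ \rho(\tau,\theta) := E_{\tau,\theta}(\eta) = \sum_{\omega\in S} \eta(\omega)\pi_{\tau,\theta}(\omega) = G_\tau(\tau,\theta), \]
\[ u(\tau,\theta) := E_{\tau,\theta}(\zeta) = \sum_{\omega\in S} \zeta(\omega)\pi_{\tau,\theta}(\omega) = G_\theta(\tau,\theta). \]
Elementary calculations show that the matrix-valued function
\[ \begin{pmatrix} \rho_\tau & \rho_\theta \\ u_\tau & u_\theta \end{pmatrix} = \begin{pmatrix} G_{\tau\tau} & G_{\tau\theta} \\ G_{\theta\tau} & G_{\theta\theta} \end{pmatrix} =: G''(\tau,\theta) \]
is equal to the covariance matrix $\mathrm{Cov}_{\tau,\theta}(\eta,\zeta)$, and therefore it is strictly positive definite. It follows that the function $(\tau,\theta) \mapsto (\rho(\tau,\theta), u(\tau,\theta))$ is invertible. We denote the inverse function by $(\rho,u) \mapsto (\tau(\rho,u), \theta(\rho,u))$. Denote by $(\rho,u) \mapsto S(\rho,u)$ the convex conjugate (Legendre transform) of the strictly convex function $(\tau,\theta) \mapsto G(\tau,\theta)$:
\[ S(\rho,u) := \sup_{\tau,\theta} \big( \rho\tau + u\theta - G(\tau,\theta) \big), \tag{2.8} \]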
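and
\[ D := \{(\rho,u) \in \mathbb{R}_+ \times \mathbb{R} : S(\rho,u) < \infty\} = \overline{\mathrm{co}}\{(\eta(\omega), \zeta(\omega)) : \pi(\omega) > 0\}, \tag{2.9} \]
where co stands for convex hull and $\overline{A}$ is the closure of $A$. The nondegeneracy condition (2.3) implies that $\partial D \cap \{\rho = 0\} = \{(0,u) : |u| \le u_*\}$. For $(\rho,u) \in D$ we have
\[ \tau(\rho,u) = S_\rho(\rho,u), \qquad \theta(\rho,u) = S_u(\rho,u). \]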
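In probabilistic terms: $S(\rho,u)$ is the rate function of joint large deviations of $\big(\sum_j \eta_j, \sum_j \zeta_j\big)$. In thermodynamic terms: $S(\rho,u)$ corresponds to the equilibrium thermodynamic entropy. Let
\[ \begin{pmatrix} \tau_\rho & \tau_u \\ \theta_\rho & \theta_u \end{pmatrix} = \begin{pmatrix} S_{\rho\rho} & S_{\rho u} \\ S_{u\rho} & S_{uu} \end{pmatrix} =: S''(\rho,u). \]
It is obvious that the matrices $G''(\tau,\theta)$ and $S''(\rho,u)$ are strictly positive definite and are inverse of each other:
\[ G''(\tau,\theta)\, S''(\rho,u) = I = S''(\rho,u)\, G''(\tau,\theta), \tag{2.10} \]
where either $(\tau,\theta) = (\tau(\rho,u), \theta(\rho,u))$ or $(\rho,u) = (\rho(\tau,\theta), u(\tau,\theta))$. With slight abuse of notation we shall denote:
\[ \pi_{\tau(\rho,u),\theta(\rho,u)} =: \pi_{\rho,u}, \qquad \pi^n_{\tau(\rho,u),\theta(\rho,u)} =: \pi^n_{\rho,u}, \qquad E_{\tau(\rho,u),\theta(\rho,u)} =: E_{\rho,u}, \quad \text{etc.} \]
As a general convention, if $\xi : S^m \to \mathbb{R}$ is a local function then its expectation with respect to the canonical measure $\pi^m_{\rho,u}$ is denoted by
\[ \Xi(\rho,u) := E_{\rho,u}(\xi) = \sum_{\omega_1,\dots,\omega_m \in S} \xi(\omega_1,\dots,\omega_m)\, \pi_{\rho,u}(\omega_1) \cdots \pi_{\rho,u}(\omega_m). \]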
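The identification of the Hessian of $G$ with the covariance matrix of $(\eta,\zeta)$ under the tilted one-site measure can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes the one-spin space of the {−1, 0, +1}-model and an arbitrarily chosen R-symmetric reference measure `PI`; all function names are hypothetical, and the Hessian is approximated by central finite differences.

```python
import math

# Assumption: one-spin space of the {-1, 0, +1} example with an arbitrary
# R-symmetric reference measure pi (pi(-1) == pi(+1)), chosen for illustration.
S = (-1, 0, 1)
PI = {-1: 0.3, 0: 0.4, 1: 0.3}


def eta(w):
    return 1 - abs(w)


def zeta(w):
    return w


def G(tau, theta):
    """Log moment generating function G(tau, theta)."""
    return math.log(sum(math.exp(tau * eta(w) + theta * zeta(w)) * PI[w] for w in S))


def tilted(tau, theta):
    """The one-site measure pi_{tau,theta} of (2.5)."""
    g = G(tau, theta)
    return {w: PI[w] * math.exp(tau * eta(w) + theta * zeta(w) - g) for w in S}


def covariance(tau, theta):
    """Return (Var eta, Cov(eta, zeta), Var zeta) under pi_{tau,theta}."""
    p = tilted(tau, theta)
    rho = sum(eta(w) * p[w] for w in S)
    u = sum(zeta(w) * p[w] for w in S)
    var_eta = sum((eta(w) - rho) ** 2 * p[w] for w in S)
    var_zeta = sum((zeta(w) - u) ** 2 * p[w] for w in S)
    cov = sum((eta(w) - rho) * (zeta(w) - u) * p[w] for w in S)
    return var_eta, cov, var_zeta


def hessian_G(tau, theta, h=1e-4):
    """Central finite-difference approximation of (G_tt, G_t_theta, G_theta_theta)."""
    g0 = G(tau, theta)
    g_tt = (G(tau + h, theta) - 2 * g0 + G(tau - h, theta)) / h**2
    g_qq = (G(tau, theta + h) - 2 * g0 + G(tau, theta - h)) / h**2
    g_tq = (G(tau + h, theta + h) - G(tau + h, theta - h)
            - G(tau - h, theta + h) + G(tau - h, theta - h)) / (4 * h**2)
    return g_tt, g_tq, g_qq
```

Comparing `hessian_G(tau, theta)` with `covariance(tau, theta)` at a few parameter values illustrates the identity; positivity of the determinant reflects the linear independence of $\eta$, $\zeta$ and 1.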
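2.4. Fluxes. We introduce the fluxes of the conserved quantities. The infinitesimal generators $L^n$ and $K^n$ act on the conserved quantities as follows (recall condition (A) on the rates):
\[ \begin{aligned} L^n \eta_i &= -\psi(\omega_i,\omega_{i+1}) + \psi(\omega_{i-1},\omega_i) =: -\psi_i + \psi_{i-1}, \\ L^n \zeta_i &= -\phi(\omega_i,\omega_{i+1}) + \phi(\omega_{i-1},\omega_i) =: -\phi_i + \phi_{i-1}, \\ K^n \eta_i &= -\psi^s(\omega_i,\omega_{i+1}) + \psi^s(\omega_{i-1},\omega_i) =: -\psi^s_i + \psi^s_{i-1}, \\ K^n \zeta_i &= -\phi^s(\omega_i,\omega_{i+1}) + \phi^s(\omega_{i-1},\omega_i) =: -\phi^s_i + \phi^s_{i-1}, \end{aligned} \]
where
\[ \begin{aligned} \psi(\omega_1,\omega_2) &:= \sum_{\omega_1',\omega_2'\in S} r(\omega_1,\omega_2;\omega_1',\omega_2') \big( \eta(\omega_2') - \eta(\omega_2) \big), \\ \phi(\omega_1,\omega_2) &:= \sum_{\omega_1',\omega_2'\in S} r(\omega_1,\omega_2;\omega_1',\omega_2') \big( \zeta(\omega_2') - \zeta(\omega_2) \big), \end{aligned} \tag{2.11} \]
\[ \begin{aligned} \psi^s(\omega_1,\omega_2) &:= \sum_{\omega_1',\omega_2'\in S} s(\omega_1,\omega_2;\omega_1',\omega_2') \big( \eta(\omega_2') - \eta(\omega_2) \big), \\ \phi^s(\omega_1,\omega_2) &:= \sum_{\omega_1',\omega_2'\in S} s(\omega_1,\omega_2;\omega_1',\omega_2') \big( \zeta(\omega_2') - \zeta(\omega_2) \big). \end{aligned} \tag{2.12} \]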
Note that due to the left-right symmetry and conservations, i.e. (2.2) and conditions (A) and (C), the microscopic fluxes have the following symmetries:
\[ \phi(\omega_1,\omega_2) = \phi(R\omega_2, R\omega_1), \qquad \psi(\omega_1,\omega_2) = -\psi(R\omega_2, R\omega_1). \]
In order to simplify some of our further arguments we impose one more microscopic condition:

(F) Gradient condition on symmetric fluxes: The microscopic fluxes of the symmetric part, defined in (2.12), satisfy the following gradient conditions:
\[ \psi^s(\omega_1,\omega_2) = \kappa(\omega_1) - \kappa(\omega_2) =: \kappa_1 - \kappa_2, \qquad \phi^s(\omega_1,\omega_2) = \chi(\omega_1) - \chi(\omega_2) =: \chi_1 - \chi_2. \tag{2.13} \]

Remarks.
(1) This is a technical assumption (referring actually to the measure $\pi$) which simplifies considerably the arguments of Sect. 7. The symmetric part $K^n$ has the role of enhancing convergence to local equilibrium. Its effect is not seen in the limit, so in principle we can choose it conveniently. Without this assumption we would be forced to use all the non-gradient technology developed in [26] (see also [10]), which would make the paper even longer.
(2) It is easy to see that $\eta(\omega_1) = \eta(\omega_2) = 0$ implies $\psi^s(\omega_1,\omega_2) = 0$ and thus (by choosing a suitable additive constant) $\omega \mapsto \kappa(\omega)$ can be chosen so that
\[ \eta(\omega) = 0 \ \Rightarrow\ \kappa(\omega) = 0. \tag{2.14} \]
The macroscopic fluxes are:
\[ \begin{aligned} \Psi(\rho,u) &:= E_{\rho,u}(\psi) = \sum_{\omega_1,\omega_2} \psi(\omega_1,\omega_2)\, \pi_{\rho,u}(\omega_1)\pi_{\rho,u}(\omega_2), \\ \Phi(\rho,u) &:= E_{\rho,u}(\phi) = \sum_{\omega_1,\omega_2} \phi(\omega_1,\omega_2)\, \pi_{\rho,u}(\omega_1)\pi_{\rho,u}(\omega_2). \end{aligned} \tag{2.15} \]
These are smooth regular functions of the variables $(\rho,u) \in D$. Note that due to reversibility of $K^n$ (condition (E)), for any value of $\rho$ and $u$, $E_{\rho,u}(\psi^s) = 0 = E_{\rho,u}(\phi^s)$. The following lemma is proved in [22].

Lemma 1 (Onsager reciprocity relation). Suppose we have a particle system with two conserved quantities and rates satisfying Conditions (A) and (D). Then the following relation holds:
\[ \partial_\theta \Psi(\rho(\tau,\theta), u(\tau,\theta)) = \partial_\tau \Phi(\rho(\tau,\theta), u(\tau,\theta)). \]

We will use the lemma in the following equivalent form:
\[ \Psi_u(\rho,u)\mathrm{Var}_{\rho,u}(\zeta) - \Phi_u(\rho,u)\mathrm{Cov}_{\rho,u}(\eta,\zeta) = \Phi_\rho(\rho,u)\mathrm{Var}_{\rho,u}(\eta) - \Psi_\rho(\rho,u)\mathrm{Cov}_{\rho,u}(\eta,\zeta). \tag{2.16} \]
For the concrete examples presented at the end of Subsect. 2.2 the following domains $D$ and macroscopic fluxes are obtained:

{−1, 0, +1}-model: $D = \{(\rho,u) \in \mathbb{R}_+ \times \mathbb{R} : \rho + |u| \le 1\}$,
\[ \Psi(\rho,u) = \rho u, \qquad \Phi(\rho,u) = \rho + u^2. \]

Two-lane models with $\bar n = 1$: $D = \{(\rho,u) \in \mathbb{R}_+ \times \mathbb{R} : \rho \le 1, |u| \le \bar z\}$,
\[ \Psi(\rho,u) = \rho(1-\rho)\psi(u), \qquad \Phi(\rho,u) = \varphi_0(u) + \rho\varphi_1(u), \]
where $\psi(u)$ is odd, while $\varphi_0(u)$ and $\varphi_1(u)$ are even functions of $u$, determined by the jump rates of the model. In the simplest particular case with $\bar z = 1/2$,
\[ \Psi(\rho,u) = \rho(1-\rho)u, \qquad \Phi(\rho,u) = (\rho - \gamma)(1 - u^2), \]
where $\gamma \in \mathbb{R}$ is the only model dependent parameter which appears in the macroscopic fluxes. For details see [22].
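For the {−1, 0, +1}-model the microscopic and macroscopic fluxes can be computed by direct enumeration. The sketch below is an illustration only: it uses the asymmetric rates as listed in Subsect. 2.2, reads (2.11) with the increment taken at the second site (new value minus old value), and uses hypothetical helper names. Under this reading the computed $E_{\rho,u}(\phi)$ equals $\rho + u^2$ up to the additive constant $-1$, which is immaterial for the pde since only $\partial_x$ of the flux enters. The function `Q` evaluates the quantity appearing in condition (D).

```python
from itertools import product

S = (-1, 0, 1)

# Nonzero asymmetric rates r(w1, w2; w1', w2') of the {-1, 0, +1}-model.
RATES = {
    ((-1, 1), (1, -1)): 2.0,
    ((-1, 0), (0, -1)): 1.0,
    ((0, 1), (1, 0)): 1.0,
}


def eta(w):
    return 1 - abs(w)


def zeta(w):
    return w


def r(w1, w2, v1, v2):
    """Rate of the jump (w1, w2) -> (v1, v2); zero unless listed above."""
    return RATES.get(((w1, w2), (v1, v2)), 0.0)


def psi(w1, w2):
    """Microscopic flux of eta, as in (2.11)."""
    return sum(r(w1, w2, v1, v2) * (eta(v2) - eta(w2)) for v1, v2 in product(S, S))


def phi(w1, w2):
    """Microscopic flux of zeta, as in (2.11)."""
    return sum(r(w1, w2, v1, v2) * (zeta(v2) - zeta(w2)) for v1, v2 in product(S, S))


def marginal(rho, u):
    """One-site marginal pi_{rho,u}; requires rho + |u| < 1 for positivity."""
    return {0: rho, 1: (1 - rho + u) / 2, -1: (1 - rho - u) / 2}


def macro(flux, rho, u):
    """Expectation of a two-site flux under the product measure, as in (2.15)."""
    p = marginal(rho, u)
    return sum(flux(a, b) * p[a] * p[b] for a, b in product(S, S))


def Q(w1, w2, p):
    """The quantity Q(w1, w2) of condition (D) for a one-site measure p."""
    return sum(
        p[v1] * p[v2] / (p[w1] * p[w2]) * r(v1, v2, w1, w2) - r(w1, w2, v1, v2)
        for v1, v2 in product(S, S)
    )
```

Evaluating `macro(psi, rho, u)` and `macro(phi, rho, u)` at a few points of $D$ reproduces $\rho u$ and $\rho + u^2 - 1$ respectively, and the cyclic sum of `Q` vanishes for every triple of spins.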
2.5. The hydrodynamic limit under Eulerian scaling. Given a system of interacting particles as defined in the previous subsections, under Eulerian scaling the local densities of the conserved quantities $\rho(t,x)$, $u(t,x)$ evolve according to the system of partial differential equations:
\[ \begin{cases} \partial_t \rho + \partial_x \Psi(\rho,u) = 0, \\ \partial_t u + \partial_x \Phi(\rho,u) = 0, \end{cases} \tag{2.17} \]
where $\Psi(\rho,u)$ and $\Phi(\rho,u)$ are the macroscopic fluxes defined in (2.15). The setting is as follows. Consider a microscopic system which satisfies Conditions (A)–(E) of Subsect. 2.2. Let $\Psi(\rho,u)$ and $\Phi(\rho,u)$ be the macroscopic fluxes computed for this system and let $\rho(t,x)$, $u(t,x)$, $x \in \mathbb{T}$, $t \in [0,T]$, be the smooth solution of the pde (2.17). Let the microscopic system of size $n$ be driven by the infinitesimal generator $G^n$ defined in (2.6) with $\lambda(n) = n$ and $\kappa(n) = n^{1+\delta}$, where $\delta \in [0,1)$ is fixed. This means that the main, asymmetric part of the generator is sped up by $n$ and the additional symmetric part by $n^{1+\delta}$. Let $\mu^n_t$ be the distribution of the system on $\Omega^n$ at (macroscopic) time $t$ defined by (2.7). The local equilibrium measure $\nu^n_t$ (itself a probability measure on $\Omega^n$) is defined by
\[ \nu^n_t := \prod_{j\in\mathbb{T}^n} \pi_{\rho(t,\frac{j}{n}),\, u(t,\frac{j}{n})}. \]
This measure mimics on a microscopic scale the macroscopic evolution driven by the pde (2.17). We denote by $H(\mu^n_t \mid \pi^n)$, respectively by $H(\mu^n_t \mid \nu^n_t)$, the relative entropy of the measure $\mu^n_t$ with respect to the absolute reference measure $\pi^n$, respectively with respect to the local equilibrium measure $\nu^n_t$. The precise statement of the Eulerian hydrodynamic limit is the following

Theorem. Assume Conditions (A)–(E) and let $\delta \in [0,1)$ be fixed. If
\[ H(\mu^n_0 \mid \nu^n_0) = o(n) \]
then
\[ H(\mu^n_t \mid \nu^n_t) = o(n) \]
uniformly for $t \in [0,T]$.

The Theorem follows from direct application of Yau's relative entropy method. For the proof and its direct consequences see [10, 22] or [27]. For the main consequences of this Theorem, see e.g. Corollary 1 of [22].

3. Low Density Asymptotics and the Main Result: Hydrodynamic Limit Under Intermediate Scaling

3.1. General properties and low density asymptotics of the macroscopic fluxes. The fluxes in the Euler equation (2.17) are regular smooth functions on $D$. From the left-right symmetry of the microscopic models it follows that
\[ \Phi(\rho,-u) = \Phi(\rho,u), \qquad \Psi(\rho,-u) = -\Psi(\rho,u). \tag{3.1} \]
It is also obvious that for $u \in [-u_*, u_*]$,
\[ \Psi(0,u) = 0. \tag{3.2} \]
We make two assumptions about the low density asymptotics of the macroscopic fluxes. Here is the first one:

(G) We assume that $\Psi_{\rho u}(0,0) \neq 0$. Actually, by possibly redefining the time scale and orientation of space, without loss of generality we assume
\[ \Psi_{\rho u}(0,0) = 1. \tag{3.3} \]
Considering the Onsager relation (2.16) with $u = 0$ and taking the Taylor expansion around $\rho = 0$ it follows that
\[ \Phi_\rho(0,0) = \Psi_{\rho u}(0,0)\, \mathrm{Var}_{0,0}(\zeta) = 1, \tag{3.4} \]
where in the second equality we used the choice (2.4) of the scaling factor $v_0$ in (2.1). We denote
\[ \gamma := \tfrac12 \Phi_{uu}(0,0). \tag{3.5} \]
Our results will hold for $\gamma > 1$ only. From (3.1) and (3.3) it follows that
\[ \Phi_u(0,u) - \Psi_\rho(0,u) = (2\gamma - 1)u + O(|u|^3). \tag{3.6} \]
The second condition imposed on the low density asymptotics of the macroscopic fluxes is:

(H) For $u \in [-u_*, u_*]$, $u \neq 0$,
\[ \Phi_u(0,u) - \Psi_\rho(0,u) \neq 0, \tag{3.7} \]
\[ \Phi_\rho(0,u) \neq 0, \qquad \Psi_{\rho u}(0,u) \neq 0. \tag{3.8} \]

Remarks.
(1) (G) is a very natural nondegeneracy condition: if $\Psi_{\rho u}(0,0)$ vanished then in the perturbation calculus to be performed, higher order terms would be dominant and a different scaling limit should be taken.
(2) Due to (3.1), (3.3) and (3.6), conditions (3.7), (3.8) hold anyway in a neighborhood of $u = 0$, and this would suffice; we assume Condition (H) for technical convenience only. Condition (3.7) amounts to forbidding other non-hyperbolic points on $\partial D \cap \{\rho = 0\}$, beside the point $(\rho,u) = (0,0)$. Condition (3.8) reflects the natural monotonicity requirements (i) and (ii) formulated about the microscopic models at the beginning of Sect. 2. These conditions are used in the proof of Lemma 2; for the details see [25].

We are interested in the behavior of the pde near the isolated non-hyperbolic point $(\rho,u) = (0,0)$. The asymptotic expansion for $\rho + u^2 \ll 1$ of the macroscopic fluxes and their first partial derivatives is
\[ \begin{aligned} \Psi(\rho,u) &= \rho u \big(1 + O(\rho + u^2)\big), & \Phi(\rho,u) &= (\rho + \gamma u^2)\big(1 + O(\rho + u^2)\big), \\ \Psi_\rho(\rho,u) &= u \big(1 + O(\rho + u^2)\big), & \Phi_\rho(\rho,u) &= 1 + O(\rho + u^2), \\ \Psi_u(\rho,u) &= \rho \big(1 + O(\rho + u^2)\big), & \Phi_u(\rho,u) &= 2\gamma u \big(1 + O(\rho + u^2)\big). \end{aligned} \tag{3.9} \]
We are looking for "small solutions" of the pde (2.17): Let $\rho_0(x)$ and $u_0(x)$ be given profiles and assume that $\rho^\varepsilon(t,x)$, $u^\varepsilon(t,x)$ is the solution of the pde (2.17) with initial condition
\[ \rho^\varepsilon(0,x) = \varepsilon^2 \rho_0(x), \qquad u^\varepsilon(0,x) = \varepsilon u_0(x). \]
Then, at least formally,
\[ \varepsilon^{-2}\rho^\varepsilon(\varepsilon^{-1}t, x) \to \rho(t,x), \qquad \varepsilon^{-1}u^\varepsilon(\varepsilon^{-1}t, x) \to u(t,x), \]
where $\rho(t,x)$, $u(t,x)$ is the solution of the pde (1.1) with initial condition
\[ \rho(0,x) = \rho_0(x), \qquad u(0,x) = u_0(x). \]
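The formal limit can be checked by a short computation. The following sketch assumes smoothness of the solution, uses only the expansions (3.9), and takes the pde (1.1) in the form in which it is used in Subsect. 4.3; the tilde notation is introduced here for illustration only.

```latex
% Ansatz (tilde quantities are the rescaled profiles):
%   \rho^\varepsilon(t,x) = \varepsilon^2 \tilde\rho(\varepsilon t, x),
%   u^\varepsilon(t,x)    = \varepsilon   \tilde u(\varepsilon t, x).
\begin{align*}
\partial_t \rho^\varepsilon &= \varepsilon^3\, \partial_t\tilde\rho,
 & \Psi(\rho^\varepsilon,u^\varepsilon)
   &= \varepsilon^3\, \tilde\rho\,\tilde u\,\bigl(1+O(\varepsilon^2)\bigr),\\
\partial_t u^\varepsilon &= \varepsilon^2\, \partial_t\tilde u,
 & \Phi(\rho^\varepsilon,u^\varepsilon)
   &= \varepsilon^2\,(\tilde\rho+\gamma\tilde u^2)\,\bigl(1+O(\varepsilon^2)\bigr).
\end{align*}
% Dividing the first equation of (2.17) by \varepsilon^3 and the second by \varepsilon^2:
\begin{align*}
\partial_t\tilde\rho + \partial_x(\tilde\rho\,\tilde u) &= O(\varepsilon^2),\\
\partial_t\tilde u + \partial_x(\tilde\rho+\gamma\tilde u^2) &= O(\varepsilon^2).
\end{align*}
```

Here $\tilde\rho(t,x) = \varepsilon^{-2}\rho^\varepsilon(\varepsilon^{-1}t,x)$ and $\tilde u(t,x) = \varepsilon^{-1}u^\varepsilon(\varepsilon^{-1}t,x)$, so letting $\varepsilon \to 0$ formally gives the limiting system with fluxes $\rho u$ and $\rho + \gamma u^2$.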
3.2. The main result. The asymptotic computations of Subsect. 3.1 suggest the scaling under which we should derive the pde (1.1) as a hydrodynamic limit: fix a (small) positive $\beta$ and choose the scaling:

         space    time            particle density    'slope of the wall'
MICRO    nx       n^{1+β} t       n^{−2β} ρ           n^{−β} u
MACRO    x        t               ρ                   u

Ideally the result should be valid for $0 < \beta < 1/2$ but we are able to prove much less than that. Choose a model satisfying Conditions (A)–(F) of Sect. 2 and Conditions (G), (H) of Subsect. 3.1, and let $\gamma$ be given by (3.5), corresponding to the microscopic system chosen. Let the microscopic system of size $n$ (defined on the discrete torus $\mathbb{T}^n$) evolve on macroscopic time scale according to the infinitesimal generator $G^n$ (see (2.6)) with speed-up factors
\[ \lambda(n) = n^{1+\beta}, \qquad \kappa(n) = n^{1+\beta+\delta}, \]
with $\beta > 0$ and some further conditions to be imposed on $\beta$ and $\delta$ (see (3.12)). Denote by $\mu^n_t$ the true distribution of the microscopic system at macroscopic time $t$ with $\mu^n_0$ the initial distribution (see (2.7)). We use the translation invariant product measure
\[ \pi^n := \pi^n_{n^{-2\beta},0} \tag{3.10} \]
as absolute reference measure. Global entropy will be considered relative to this measure, and the Radon-Nikodym derivatives of $\mu^n_t$ and of the local equilibrium measure $\nu^n_t$ (to be defined below) with respect to $\pi^n$ will be used. Given a smooth solution $\rho(t,x)$, $u(t,x)$, $(t,x) \in [0,T] \times \mathbb{T}$, of the pde (1.1), define the local equilibrium measure $\nu^n_t$ on $\Omega^n$ as follows:
\[ \nu^n_t := \prod_{j\in\mathbb{T}^n} \pi_{n^{-2\beta}\rho(t,\frac{j}{n}),\, n^{-\beta}u(t,\frac{j}{n})}. \tag{3.11} \]
This time-dependent measure mimics on a microscopic level the macroscopic evolution governed by the pde (1.1). Our main result is the following:
Theorem 1. Assume that the microscopic system of interacting particles satisfies conditions (A)–(F) of Subsects. 2.2, 2.4 and the uniform log-Sobolev condition (I) of Subsect. 6.2. Additionally, assume that the macroscopic fluxes satisfy conditions (G), (H) of Subsect. 3.1 and $\gamma > 1$. Choose $\beta \in (0, 1/2)$ and $\delta \in (1/2, 1)$ so that
\[ 2\delta - 8\beta > 1 \quad \text{and} \quad \delta + 3\beta < 1. \tag{3.12} \]
Let $\rho(t,x)$, $u(t,x)$, $(t,x) \in [0,T] \times \mathbb{T}$, be a smooth solution of the pde (1.1), such that $\inf_{x\in\mathbb{T}} \rho(0,x) > 0$, and let $\nu^n_t$, $t \in [0,T]$, be the corresponding local equilibrium measure defined in (3.11). Under these conditions, if
\[ H(\mu^n_0 \mid \nu^n_0) = o(n^{1-2\beta}) \tag{3.13} \]
then
\[ H(\mu^n_t \mid \nu^n_t) = o(n^{1-2\beta}) \tag{3.14} \]
uniformly for $t \in [0,T]$.

Remarks.
(i) From (3.13) via the identity (4.5) and the entropy inequality it also follows that
\[ H(\mu^n_0 \mid \pi^n) = O(n^{1-2\beta}). \tag{3.15} \]
See the beginning of Subsect. 4.2.
(ii) If $\gamma > 3/4$, in smooth solutions vacuum does not appear. That is,
\[ \inf_{x\in\mathbb{T}} \rho(0,x) > 0 \quad \text{implies} \quad \inf_{(t,x)\in[0,T]\times\mathbb{T}} \rho(t,x) > 0. \]
(iii) Although for the {−1, 0, +1}-model we have $\gamma = 1$, our proof can also be extended to cover this model. Actually, in that case the proof is much simpler, since the Eulerian pde is equal to the limit pde (1.1) and thus the cutoff function (see Sect. 5) can be determined explicitly.

Corollary 1. Assume the conditions of Theorem 1. Let $g, h : \mathbb{T} \to \mathbb{R}$ be smooth test functions. Then for any $t \in [0,T]$,
(i)
\[ \frac1n \sum_{j\in\mathbb{T}^n} \Big\{ g(\tfrac{j}{n})\, n^{2\beta}\eta_j(t) + h(\tfrac{j}{n})\, n^{\beta}\zeta_j(t) \Big\} \xrightarrow{\ P\ } \int_{\mathbb{T}} \big\{ g(x)\rho(t,x) + h(x)u(t,x) \big\}\, dx. \]
(ii)
\[ H(\mu^n_0 \mid \pi^n) - H(\mu^n_t \mid \pi^n) = o(n^{1-2\beta}). \]
Corollary 1 can be easily proved by the standard use of the entropy inequality.

4. Notations and General Preparatory Computations

This section is completely standard in the context of the relative entropy method, so we shall be sketchy.
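Before the computations, it may help to make the parameter window (3.12) of Theorem 1 concrete. The sketch below (function names are hypothetical) checks (3.12) for a given pair $(\beta,\delta)$ and returns the exponents bounding the mesoscopic block size in (4.10). Two elementary consequences of the inequalities are worth noting: combining $2\delta > 1 + 8\beta$ with $\delta < 1 - 3\beta$ forces $\beta < 1/14$, and the interval of block-size exponents $\big((1+\delta+5\beta)/3,\ \delta-\beta\big)$ is nonempty exactly when $2\delta - 8\beta > 1$.

```python
def admissible(beta, delta):
    """Check the constraints (3.12) on (beta, delta)."""
    return (
        0.0 < beta < 0.5
        and 0.5 < delta < 1.0
        and 2.0 * delta - 8.0 * beta > 1.0
        and delta + 3.0 * beta < 1.0
    )


def block_exponent_window(beta, delta):
    """Exponents (lo, hi) with n**lo << l(n) << n**hi, read off from (4.10)."""
    lo = (1.0 + delta + 5.0 * beta) / 3.0
    hi = delta - beta
    return lo, hi


def admissible_deltas(beta, grid=1000):
    """All grid values of delta in (1/2, 1) that are admissible for this beta."""
    candidates = (0.5 + 0.5 * k / grid for k in range(1, grid))
    return [d for d in candidates if admissible(beta, d)]
```

For instance, `admissible(0.05, 0.8)` holds, while `admissible_deltas(0.08)` is empty because $0.08 > 1/14$.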
4.1. Notation. We denote
\[ h^n(t) := n^{-1+2\beta} H(\mu^n_t \mid \nu^n_t), \qquad s^n(t) := n^{-1+2\beta} \big( H(\mu^n_0 \mid \pi^n) - H(\mu^n_t \mid \pi^n) \big). \]
We know a priori that $t \mapsto s^n(t)$ is monotone increasing and due to (3.15),
\[ s^n(t) = O(1), \quad \text{uniformly for } t \in [0,\infty). \tag{4.1} \]
In fact, from Theorem 1 it follows (see Corollary 1) that as long as the solution $\rho(t,x)$, $u(t,x)$ of the pde (1.1) is smooth,
\[ s^n(t) = o(1), \quad \text{uniformly for } t \in [0,T]. \]
For $(\rho,u) \in (0,\infty) \times (-\infty,\infty)$ denote
\[ \tau^n(\rho,u) := \tau(n^{-2\beta}\rho, n^{-\beta}u) - \tau(n^{-2\beta}, 0), \qquad \theta^n(\rho,u) := n^{\beta}\theta(n^{-2\beta}\rho, n^{-\beta}u). \]
Note that for symmetry reasons $\theta(n^{-2\beta}, 0) = 0$. Recall that $\tau$ is chemical potential rather than fugacity and for small densities the fugacity $\lambda := e^{\tau}$ scales like $\rho$, i.e. $\tau(n^{-2\beta}, 0) \sim -2\beta \log n$. If $\rho > 0$ and $u \in \mathbb{R}$ are fixed then $\tau^n(\rho,u)$ and $\theta^n(\rho,u)$ stay of order 1, as $n \to \infty$. Given the smooth solution $\rho(t,x)$, $u(t,x)$, with $\rho(t,x) > 0$, we shall use the notations
\[ \tau^n(t,x) := \tau^n(\rho(t,x), u(t,x)), \qquad \theta^n(t,x) := \theta^n(\rho(t,x), u(t,x)), \qquad v(t,x) := \log\rho(t,x). \]
The following asymptotics hold uniformly in $(t,x) \in [0,T] \times \mathbb{T}$:
\[ \begin{aligned} \tau^n(t,x) &= v(t,x) + O(n^{-2\beta}), & \theta^n(t,x) &= u(t,x) + O(n^{-2\beta}), \\ \partial_x\tau^n(t,x) &= \partial_x v(t,x) + O(n^{-2\beta}), & \partial_x\theta^n(t,x) &= \partial_x u(t,x) + O(n^{-2\beta}), \\ \partial_t\tau^n(t,x) &= \partial_t v(t,x) + O(n^{-2\beta}), & \partial_t\theta^n(t,x) &= \partial_t u(t,x) + O(n^{-2\beta}). \end{aligned} \tag{4.2} \]
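The logarithm of the Radon-Nikodym derivative of the time dependent reference measure $\nu^n_t$ with respect to the absolute reference measure $\pi^n$ is denoted by $f^n_t$:
\[ \begin{aligned} f^n_t(\omega) := \log\frac{d\nu^n_t}{d\pi^n}(\omega) = \sum_{j\in\mathbb{T}^n} \Big\{ & \tau^n(t,\tfrac{j}{n})\,\eta_j + n^{-\beta}\theta^n(t,\tfrac{j}{n})\,\zeta_j \\ & - G\big(\tau^n(t,\tfrac{j}{n}) + \tau(n^{-2\beta},0),\ n^{-\beta}\theta^n(t,\tfrac{j}{n})\big) + G\big(\tau(n^{-2\beta},0), 0\big) \Big\}. \end{aligned} \tag{4.3} \]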
4.2. Preparatory computations. In order to obtain the main estimate (3.14) our aim is to get a Grönwall type inequality: we will prove that for every $t \in [0,T]$,
\[ h^n(t) - h^n(0) = \int_0^t \partial_t h^n(s)\,ds \le C\int_0^t h^n(s)\,ds + o(1), \tag{4.4} \]
where the error term is uniform in $t \in [0,T]$. Since $h^n(0) = o(1)$ is assumed, Theorem 1 follows. We start with the identity
\[ H(\mu^n_t \mid \nu^n_t) - H(\mu^n_t \mid \pi^n) = -\int_{\Omega^n} f^n_t\, d\mu^n_t. \tag{4.5} \]
From this identity, the explicit form of the Radon-Nikodym derivative (4.3), the asymptotics (4.2), via the entropy inequality and (3.13), the a priori entropy bound (3.15) follows indeed, as remarked after the formulation of Theorem 1. Next we differentiate (4.5) to obtain
\[ \partial_t h^n(t) = -\int_{\Omega^n} \Big( n^{3\beta}L^n f^n_t + n^{3\beta+\delta}K^n f^n_t + n^{-1+2\beta}\partial_t f^n_t \Big)\, d\mu^n_t - \partial_t s^n(t). \tag{4.6} \]
Usually, an adjoint version of (4.6) is used in the form of an inequality. In our case this form is needed. We emphasize that the term $-\partial_t s^n(t)$ on the right-hand side will be of crucial importance. We compute the three terms under the integral using (4.3):
\[ \begin{aligned} n^{3\beta}L^n f^n_t(\omega) = {} & \frac1n\sum_{j\in\mathbb{T}^n} \partial_x v(t,\tfrac{j}{n})\, n^{3\beta}\psi_j + \frac1n\sum_{j\in\mathbb{T}^n} \partial_x u(t,\tfrac{j}{n})\, n^{2\beta}\phi_j \\ & + A^n_1(t,\omega) + A^n_2(t,\omega) + A^n_3(t,\omega) + A^n_4(t,\omega), \end{aligned} \tag{4.7} \]
where $A^n_i(t,\omega)$, $i = 1,\dots,4$, are error terms which will be easy to estimate:
\[ \begin{aligned} A^n_1(t,\omega) &:= \frac1n\sum_{j\in\mathbb{T}^n} \big( \partial_x\tau^n(t,\tfrac{j}{n}) - \partial_x v(t,\tfrac{j}{n}) \big)\, n^{3\beta}\psi_j, \\ A^n_2(t,\omega) &:= \frac1n\sum_{j\in\mathbb{T}^n} \big( \partial_x\theta^n(t,\tfrac{j}{n}) - \partial_x u(t,\tfrac{j}{n}) \big)\, n^{2\beta}\phi_j, \\ A^n_3(t,\omega) &:= \frac1n\sum_{j\in\mathbb{T}^n} \big( \nabla^n\tau^n(t,\tfrac{j}{n}) - \partial_x\tau^n(t,\tfrac{j}{n}) \big)\, n^{3\beta}\psi_j, \\ A^n_4(t,\omega) &:= \frac1n\sum_{j\in\mathbb{T}^n} \big( \nabla^n\theta^n(t,\tfrac{j}{n}) - \partial_x\theta^n(t,\tfrac{j}{n}) \big)\, n^{2\beta}\phi_j. \end{aligned} \]
Here and in the sequel $\nabla^n$ denotes the discrete gradient: $\nabla^n f(x) := n\big(f(x + 1/n) - f(x)\big)$. See Subsect. 4.4 for the estimate of the error terms $A^n_j(t,\omega)$, $j = 1,\dots,12$.
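Next, using the gradient condition (F) of the symmetric fluxes,
\[ n^{3\beta+\delta}K^n f^n_t(\omega) = n^{-1+3\beta+\delta}\, \frac1n\sum_{j\in\mathbb{T}^n} \Big\{ (\nabla^n)^2\tau^n(t,\tfrac{j}{n})\,\kappa_j + (\nabla^n)^2\theta^n(t,\tfrac{j}{n})\,\chi_j \Big\} =: A^n_5(t,\omega) \tag{4.8} \]
is itself a numerical error term. Finally
\[ \begin{aligned} n^{-1+2\beta}\partial_t f^n_t(\omega) = {} & \frac1n\sum_{j\in\mathbb{T}^n} \Big\{ \partial_t v(t,\tfrac{j}{n}) \big( n^{2\beta}\eta_j - \rho(t,\tfrac{j}{n}) \big) + \partial_t u(t,\tfrac{j}{n}) \big( n^{\beta}\zeta_j - u(t,\tfrac{j}{n}) \big) \Big\} \\ & + A^n_6(t,\omega) + A^n_7(t,\omega), \end{aligned} \tag{4.9} \]
where
\[ \begin{aligned} A^n_6(t,\omega) &:= \frac1n\sum_{j\in\mathbb{T}^n} \big( \partial_t\tau^n(t,\tfrac{j}{n}) - \partial_t v(t,\tfrac{j}{n}) \big) \big( n^{2\beta}\eta_j - \rho(t,\tfrac{j}{n}) \big), \\ A^n_7(t,\omega) &:= \frac1n\sum_{j\in\mathbb{T}^n} \big( \partial_t\theta^n(t,\tfrac{j}{n}) - \partial_t u(t,\tfrac{j}{n}) \big) \big( n^{\beta}\zeta_j - u(t,\tfrac{j}{n}) \big) \end{aligned} \]
are again easy-to-estimate error terms.

4.3. Blocks. We fix once and for all a weight function $a : \mathbb{R} \to \mathbb{R}$. It is assumed that: (1) $a(x) > 0$ for $x \in (-1,1)$ and $a(x) = 0$ otherwise, (2) it has total weight $\int a(x)\,dx = 1$, (3) it is even: $a(-x) = a(x)$, and (4) it is smooth. We choose a mesoscopic block size $l = l(n)$ such that
\[ 1 \ll n^{(1+\delta+5\beta)/3} \ll l(n) \ll n^{\delta-\beta} \ll n. \tag{4.10} \]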
This can be done due to condition (3.12) imposed on $\beta$ and $\delta$. Given a local variable (depending on $m$ consecutive spins) $\xi_i = \xi_i(\omega) = \xi(\omega_i,\dots,\omega_{i+m-1})$, its block average at macroscopic space coordinate $x$ is defined as
\[ \hat\xi^n(x) = \hat\xi^n(\omega,x) := \frac1l \sum_j a\Big(\frac{nx - j}{l}\Big)\, \xi_j. \tag{4.11} \]
Since $l = l(n)$, we do not denote explicitly the dependence of the block average on the mesoscopic block size $l$. Note that $x \mapsto \hat\xi^n(x)$ is smooth and
\[ \partial_x\hat\xi^n(x) = \partial_x\hat\xi^n(\omega,x) = \frac{n}{l}\, \frac1l \sum_j a'\Big(\frac{nx - j}{l}\Big)\, \xi_j, \]
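and it is straightforward that
\[ \sup_{\omega\in\Omega^n}\, \sup_{x\in\mathbb{T}} \big| \partial_x\hat\xi^n(\omega,x) \big| \le C \max_{\omega_1,\dots,\omega_m} |\xi(\omega_1,\dots,\omega_m)|\ \frac{n}{l}. \tag{4.12} \]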
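The block average (4.11) is straightforward to compute. The sketch below is an illustration under stated assumptions: the weight is taken to be the standard bump $\exp(-1/(1-x^2))$ on $(-1,1)$, normalised numerically (the text only requires properties (1)–(4), so this particular choice is ours), indices are wrapped periodically on the discrete torus, and the function names are hypothetical.

```python
import math


def bump(x):
    """Unnormalised smooth even bump, positive on (-1, 1) and zero elsewhere."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0


# Numerical normalisation (midpoint rule) so that the integral of a is close to 1.
_GRID = 20000
_Z = sum(bump(-1.0 + 2.0 * (k + 0.5) / _GRID) for k in range(_GRID)) * 2.0 / _GRID


def a(x):
    """Weight function: smooth, even, supported on [-1, 1], total weight ~ 1."""
    return bump(x) / _Z


def block_average(xi, x, l):
    """(1/l) * sum_j a((n*x - j)/l) * xi_j for a sequence xi on the torus Z/nZ."""
    n = len(xi)
    centre = n * x
    lo = math.floor(centre - l)
    hi = math.ceil(centre + l)
    # Only sites with |n*x - j| < l contribute; j is wrapped modulo n.
    return sum(a((centre - j) / l) * xi[j % n] for j in range(lo, hi + 1)) / l
```

Since $\frac1l\sum_j a\big((nx-j)/l\big)$ is a Riemann sum for $\int a = 1$, the block average of a constant field returns (approximately) that constant.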
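We shall use the handy (but slightly abused) notation $\hat\xi^n(t,x) := \hat\xi^n(X^n_t, x)$. This is the empirical block average process of the local observable $\xi_i$. For the scaled block average of the two conserved quantities we shall also use the notation
\[ \hat\rho^n(t,x) := n^{2\beta}\hat\eta^n(t,x), \qquad \hat u^n(t,x) := n^{\beta}\hat\zeta^n(t,x). \tag{4.13} \]
Note that these block averages are expected to be of order 1 in the limit. Introducing block averages, the main terms on the right-hand side of (4.7) and (4.9) become:
\[ \begin{aligned} & \frac1n\sum_{j\in\mathbb{T}^n} \partial_x v(t,\tfrac{j}{n})\, n^{3\beta}\psi_j + \frac1n\sum_{j\in\mathbb{T}^n} \partial_x u(t,\tfrac{j}{n})\, n^{2\beta}\phi_j \\ & \quad = \frac1n\sum_{j\in\mathbb{T}^n} \partial_x v(t,\tfrac{j}{n})\, n^{3\beta}\hat\psi^n(\tfrac{j}{n}) + \frac1n\sum_{j\in\mathbb{T}^n} \partial_x u(t,\tfrac{j}{n})\, n^{2\beta}\hat\phi^n(\tfrac{j}{n}) + A^n_8(t,\omega) + A^n_9(t,\omega), \end{aligned} \tag{4.14} \]
respectively
\[ \begin{aligned} & \frac1n\sum_{j\in\mathbb{T}^n} \partial_t v(t,\tfrac{j}{n}) \big( n^{2\beta}\eta_j - \rho(t,\tfrac{j}{n}) \big) + \frac1n\sum_{j\in\mathbb{T}^n} \partial_t u(t,\tfrac{j}{n}) \big( n^{\beta}\zeta_j - u(t,\tfrac{j}{n}) \big) \\ & \quad = \frac1n\sum_{j\in\mathbb{T}^n} \partial_t v(t,\tfrac{j}{n}) \big( n^{2\beta}\hat\eta^n(\tfrac{j}{n}) - \rho(t,\tfrac{j}{n}) \big) + \frac1n\sum_{j\in\mathbb{T}^n} \partial_t u(t,\tfrac{j}{n}) \big( n^{\beta}\hat\zeta^n(\tfrac{j}{n}) - u(t,\tfrac{j}{n}) \big) \\ & \qquad + A^n_{10}(t,\omega) + A^n_{11}(t,\omega). \end{aligned} \tag{4.15} \]
The error terms $A^n_i(t,\omega)$ ($i = 8, 9, 10, 11$) are of the form
\[ A^n_i(t,\omega) := \frac1n\sum_{j\in\mathbb{T}^n} \Big\{ w(t,\tfrac{j}{n}) - \frac1l\sum_k a\Big(\frac{j-k}{l}\Big)\, w(t,\tfrac{k}{n}) \Big\}\, \upsilon_j, \]
with $w = \partial_x v, \partial_x u, \partial_t v, \partial_t u$ and $\upsilon = n^{3\beta}\psi, n^{2\beta}\phi, n^{2\beta}\eta, n^{\beta}\zeta$ for $i = 8, 9, 10, 11$, respectively. These error terms will be estimated in Subsect. 4.4. Since $[0,T] \times \mathbb{T} \ni (t,x) \mapsto (\rho(t,x), u(t,x))$ is a smooth solution of the pde (1.1), we have
\[ \partial_t v = -u\,\partial_x v - \partial_x u, \qquad \partial_t u = -\rho\,\partial_x v - 2\gamma u\,\partial_x u. \]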
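Inserting these expressions into the main terms of (4.15) eventually we obtain for the integrand in (4.6),
\[ \begin{aligned} & n^{3\beta}L^n f^n_t(\omega) + n^{3\beta+\delta}K^n f^n_t(\omega) + n^{-1+2\beta}\partial_t f^n_t(\omega) \\ & \quad = \frac1n\sum_{j\in\mathbb{T}^n} \partial_x v(t,\tfrac{j}{n}) \Big\{ n^{3\beta}\hat\psi^n(\tfrac{j}{n}) - \rho(t,\tfrac{j}{n})u(t,\tfrac{j}{n}) \\ & \qquad\qquad - u(t,\tfrac{j}{n}) \big( n^{2\beta}\hat\eta^n(\tfrac{j}{n}) - \rho(t,\tfrac{j}{n}) \big) - \rho(t,\tfrac{j}{n}) \big( n^{\beta}\hat\zeta^n(\tfrac{j}{n}) - u(t,\tfrac{j}{n}) \big) \Big\} \\ & \qquad + \frac1n\sum_{j\in\mathbb{T}^n} \partial_x u(t,\tfrac{j}{n}) \Big\{ n^{2\beta}\hat\phi^n(\tfrac{j}{n}) - \big( \rho(t,\tfrac{j}{n}) + \gamma u(t,\tfrac{j}{n})^2 \big) \\ & \qquad\qquad - \big( n^{2\beta}\hat\eta^n(\tfrac{j}{n}) - \rho(t,\tfrac{j}{n}) \big) - 2\gamma u(t,\tfrac{j}{n}) \big( n^{\beta}\hat\zeta^n(\tfrac{j}{n}) - u(t,\tfrac{j}{n}) \big) \Big\} \\ & \qquad + \sum_{k=1}^{12} A^n_k(t,\omega), \end{aligned} \tag{4.16} \]
where
\[ A^n_{12}(t) := \frac1n\sum_{j\in\mathbb{T}^n} \Big\{ \partial_x v\, \rho u + \partial_x u\, (\rho + \gamma u^2) \Big\}(t,\tfrac{j}{n}) = \frac1n\sum_{j\in\mathbb{T}^n} \partial_x\Big( \rho u + \frac{\gamma}{3}u^3 \Big)(t,\tfrac{j}{n}). \]

4.4. The error terms $A^n_k$, $k = 1,\dots,12$. We estimate these error terms with the help of the entropy inequality with respect to the measure $\pi^n$. Note that the variables $\eta_j$, $\zeta_j$, $\psi_j$ and $\phi_j$ are bounded and by (3.9), (3.10) we also have
\[ \begin{aligned} E_{\pi^n}\eta_j &\le Cn^{-2\beta}, & \mathrm{Var}_{\pi^n}\eta_j &\le Cn^{-2\beta}, & E_{\pi^n}\zeta_j &= 0, & \mathrm{Var}_{\pi^n}\zeta_j &\le C, \\ E_{\pi^n}\psi_j &= 0, & \mathrm{Var}_{\pi^n}\psi_j &\le Cn^{-2\beta}, & |E_{\pi^n}\phi_j| &\le C, & \mathrm{Var}_{\pi^n}\phi_j &\le C. \end{aligned} \]
Applying the entropy inequality in a straightforward way and using the previous bounds with the asymptotics (4.2) and the uniform approximation of $\partial_x$ of smooth functions by their discrete derivative $\nabla^n$ we get that
\[ E_{\mu^n_t}\big| A^n_k(t) \big| \le C\big( n^{-\beta} \vee n^{-1+2\beta+\delta} \vee n^{\beta}l^{-1} \vee n^{-1+\beta}l \big) = o(1) \]
for $k = 1,\dots,11$. The computational details are obvious. Finally, $A^n_{12}(t)$ is a simple numerical error term (no probability involved):
\[ \big| A^n_{12}(t) \big| \le Cn^{-1} = o(1). \]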
4.5. Sumup. Thus, integrating (4.6), using (4.16) and the bounds of Subsect. 4.4 we obtain
\[ h^n(t) = \int_0^t A^n(s)\,ds + \int_0^t B^n(s)\,ds - s^n(t) + o(1), \tag{4.17} \]
where
\[ \begin{aligned} A^n(t) &:= E_{\mu^n_t}\Big( \frac1n\sum_{j\in\mathbb{T}^n} \partial_x v\, \big\{ n^{3\beta}\hat\psi^n - \rho u - u(\hat\rho^n - \rho) - \rho(\hat u^n - u) \big\}(t,\tfrac{j}{n}) \Big), \\ B^n(t) &:= E_{\mu^n_t}\Big( \frac1n\sum_{j\in\mathbb{T}^n} \partial_x u\, \big\{ n^{2\beta}\hat\phi^n - (\rho + \gamma u^2) - (\hat\rho^n - \rho) - 2\gamma u(\hat u^n - u) \big\}(t,\tfrac{j}{n}) \Big). \end{aligned} \tag{4.18} \]
The main difficulty is caused by $A^n(t)$. The term $B^n(t)$ is estimated exactly as it is done in [21] for the one-component systems: since $\Phi(\rho,u) = \rho + \gamma u^2$ is linear in $\rho$ and quadratic in $u$, no problem is caused by the low particle density. By repeating the arguments of [21] we obtain
\[ \int_0^t B^n(s)\,ds \le C\int_0^t h^n(s)\,ds + o(1). \tag{4.19} \]
In the rest of the proof we concentrate on the essentially difficult term $A^n(t)$.

5. Cutoff

We define the rescaled macroscopic fluxes
\[ \Psi^n(\rho,u) := n^{3\beta}\Psi(n^{-2\beta}\rho, n^{-\beta}u), \qquad \Phi^n(\rho,u) := n^{2\beta}\Phi(n^{-2\beta}\rho, n^{-\beta}u), \tag{5.1} \]
defined on the scaled domain
\[ D^n := \{(\rho,u) : (n^{-2\beta}\rho, n^{-\beta}u) \in D\}. \tag{5.2} \]
The first partial derivatives of the scaled fluxes are
\[ \begin{aligned} \Psi^n_\rho(\rho,u) &= n^{\beta}\Psi_\rho(n^{-2\beta}\rho, n^{-\beta}u), & \Phi^n_\rho(\rho,u) &= \Phi_\rho(n^{-2\beta}\rho, n^{-\beta}u), \\ \Psi^n_u(\rho,u) &= n^{2\beta}\Psi_u(n^{-2\beta}\rho, n^{-\beta}u), & \Phi^n_u(\rho,u) &= n^{\beta}\Phi_u(n^{-2\beta}\rho, n^{-\beta}u). \end{aligned} \tag{5.3} \]
n→∞
lim n (ρ, u) = ρ + γ u2 ,
n→∞
lim ρn (ρ, u) = u,
n→∞
lim nρ (ρ, u) = 1,
n→∞
n→∞ n→∞
lim un (ρ, u) = ρ, lim nu (ρ, u) = 2γ u.
The convergence is uniform in compact subsets of R+ × R. Note that n ( ρ n (t, x), u n (t, x)) = n3β ( η n (t, x), ζ n (t, x)), n ( ρ n (t, x), u n (t, x)) = n2β ( η n (t, x), ζ n (t, x)).
(5.4)
Perturbation of Singular Equilibria
133
5.1. The direct approach — why it fails. The most natural thing is to write the summand in An (t) as n − ρu − u( ρ n − ρ) − ρ( u n − u) n3β ψ
(5.5)
n − ( = n3β (ψ η n, ζ n )) + ( n ( ρ n, u n) − ρ n u n ) + ( ρ n − ρ)( u n − u). By applying Varadhan’s one block estimate and controlling the error terms in the Taylor expansion of , the first two terms on the right-hand side can be dealt with. However, the last term causes serious problems: with proper normalization, it is asymptotically distributed with respect to the local equilibrium measure νtn , like a product of independent Poisson and Gaussian random variables, and thus it does not have a finite exponential moment. Since the robust estimates heavily rely on the entropy inequality where the finite exponential moment is needed, we have to choose another approach for estimating An (t). Instead of writing plainly (5.5), we introduce a cutoff. We let M > sup{ρ(t, x) ∨ |u(t, x)| : (t, x) ∈ [0, T ] × T}. Let I n , J n : R+ × R → R be bounded functions so that I n + J n = 1 and I n (ρ, u) = 1
for
ρ ∨ |u| ≤ M,
I n (ρ, u) = 0
for ‘large’ (ρ, u).
The last property will be specified later. We split the right-hand side of (5.5) in a most natural way, according to this cutoff: n − ρu − u( n J n ( ρ n − ρ) − ρ( u n − u) = n3β ψ ρ n, u n) n3β ψ n − ( −( ρ n u + ρ u n − ρu)J n ( ρ n, u n ) + n3β (ψ η n, ζ n ))I n ( ρ n, u n)
(5.6)
+( n ( ρ n, u n) − ρ n u n )I n ( ρ n, u n ) + ( ρ n − ρ)( u n − u)I n ( ρ n, u n ). The second term on the right-hand side is linear in the block averages, so it does not cause any problem. The third term is estimated by use of Varadhan’s one block estimate. The fourth term is Taylor approximation. Finally, the last term can be handled with the entropy inequality if the cutoff I n (ρ, u) is strong enough to tame the tail of the Gaussian×Poisson random variable. The main difficulty is caused by the first term on the right-hand side. This term certainly can not be estimated with the robust method, i.e. with entropy inequality: we would run into the same problem we wanted to overcome by introducing the cutoff. The only way this term may be small is by some cancellation. It turns out that the desired cancellations indeed occur (in the form of a martingale appearing in the space-time average) if and only if J n (ρ, u) = Sρn (ρ, u), where S n (ρ, u) is a particular Lax entropy of the scaled Euler equation ∂t ρ + ∂x n (ρ, u) = 0 ∂t u + ∂x n (ρ, u) = 0,
(5.7)
(5.8)
B. Tóth, B. Valkó
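with $\Psi^n(\rho,u)$ and $\Phi^n(\rho,u)$ defined in (5.1). This means that there exists a flux function $F^n(\rho,u)$ with
$$F^n_\rho = \Psi^n_\rho S^n_\rho + \Phi^n_\rho S^n_u, \qquad F^n_u = \Psi^n_u S^n_\rho + \Phi^n_u S^n_u, \tag{5.9}$$
or equivalently, the following pde holds:
$$\Psi^n_u S^n_{\rho\rho} + \big(\Phi^n_u - \Psi^n_\rho\big)S^n_{\rho u} - \Phi^n_\rho S^n_{uu} = 0. \tag{5.10}$$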
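The equivalence of (5.9) and (5.10) is the equality of the mixed partial derivatives of the flux. A short derivation, as a sketch: here $\Psi^n$ and $\Phi^n$ denote the two flux functions of the scaled Euler equation (5.8), and the closing specialization to $\Psi=\rho u$, $\Phi=\rho+\gamma u^2$ uses the fluxes of the limiting pde discussed in the Appendix.

```latex
% Compatibility condition for the existence of the flux F^n in (5.9):
% differentiate the first relation in u, the second in rho, and equate.
\begin{align*}
\partial_u F^n_\rho &= \Psi^n_{\rho u} S^n_\rho + \Psi^n_\rho S^n_{\rho u}
                     + \Phi^n_{\rho u} S^n_u + \Phi^n_\rho S^n_{uu},\\
\partial_\rho F^n_u &= \Psi^n_{\rho u} S^n_\rho + \Psi^n_u S^n_{\rho\rho}
                     + \Phi^n_{\rho u} S^n_u + \Phi^n_u S^n_{\rho u}.
\end{align*}
% The terms containing first derivatives of S^n cancel, leaving (5.10):
\[
\Psi^n_u S^n_{\rho\rho} + \bigl(\Phi^n_u - \Psi^n_\rho\bigr) S^n_{\rho u}
  - \Phi^n_\rho S^n_{uu} = 0 .
\]
% Specialization to \Psi = \rho u, \Phi = \rho + \gamma u^2:
% \Psi_u = \rho, \Psi_\rho = u, \Phi_u = 2\gamma u, \Phi_\rho = 1, hence
\[
\rho\, S_{\rho\rho} + (2\gamma - 1)\, u\, S_{\rho u} - S_{uu} = 0 .
\]
```

The last line is the linear hyperbolic equation for Lax entropies quoted in the Appendix.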
5.2. The cutoff function. In the present subsection we describe the cutoff function (5.7), or rather the respective Lax entropies. In Lemma 2 we state some related estimates which will be of paramount importance in the rest of the proof. The construction of the needed Lax entropies is outlined in Subsect. 10.2 of the Appendix. The proof that the Lax entropies described there indeed satisfy the conditions of Lemma 2 is pure classical pde theory. It is a straightforward, although quite lengthy (and not entirely trivial) calculation. Since the full proof would lengthen our paper considerably, we omit these computations. The interested reader can find the detailed proof in [25]. Lemma 2. Let M > 0 and ε > 0 be fixed arbitrary numbers. There exist twice differentiable Lax entropy/flux pairs S n (ρ, u), F n (ρ, u) defined on Dn for every (large enough) n such that the following inequalities hold. The positive constants A, B, C depend on M and ε, but not on n, |Sρn (ρ, u) − 11{ρ≥A+B|u|} | ≤ C 11{M≤ρ
(5.11)
|Sun (ρ, u)| ≤ C 11{M≤ρ
(5.12)
n (ρ, u)| ≤ |Sρρ
ε 11{M≤ρ
(5.13)
n |Sρu (ρ, u)| ≤
ε 11{M≤ρ
(5.14)
n (ρ, u)| ≤ ε 11{M≤ρ
(5.15)
|F n (ρ, u) − n (ρ, u)Sρn (ρ, u)| ≤ C(1 + u2 )11{M≤ρ
(5.16)
It is easy to see that the function $I^n = 1 - S^n_\rho$ is indeed a cutoff: $I^n = 1$ if $\rho\vee|u|\le M$ and $I^n = 0$ for 'large' values of $(\rho,u)$, namely for $\rho\ge A+B|u|$. The choice of $M$ will be specified by the large deviation bounds given in Proposition 1 (via Lemma 6); the choice of $\varepsilon$ will be determined in Subsect. 7.4 (see (7.16)).
5.3. Outline of the further steps of proof. In Sect. 7 we give an estimate for the terms with 'large' values of $(\rho,u)$: we prove that
$$\mathbf{E}_{\mu^n}\Big(\int_0^t\frac1n\sum_{j\in\mathbb{T}^n}\big(\partial_x v\,n^{3\beta}\hat\psi^n J^n(\hat\rho^n,\hat u^n)\big)(s,\tfrac jn)\,ds\Big) \le \frac12 h^n(t) + \frac12 s^n(t) + C\int_0^t h^n(s)\,ds + o(1). \tag{5.17}$$
Perturbation of Singular Equilibria
In Sect. 8 we estimate the terms with 'small' values of $(\rho,u)$; the section is divided into four subsections. In Subsect. 8.1 we prove
$$\mathbf{E}_{\mu^n_s}\Big(\frac1n\sum_{j\in\mathbb{T}^n}\big|\partial_x v\,(\hat\rho^n u + \rho\hat u^n - \rho u)\,J^n(\hat\rho^n,\hat u^n)\big|(s,\tfrac jn)\Big) \le C h^n(s) + o(1). \tag{5.18}$$
In Subsect. 8.2 we prove the one block estimate
$$\mathbf{E}_{\mu^n}\Big(\int_0^t\frac1n\sum_{j\in\mathbb{T}^n}\big|\partial_x v\,n^{3\beta}\big(\hat\psi^n - \Psi(\hat\eta^n,\hat\zeta^n)\big)I^n(\hat\rho^n,\hat u^n)\big|(s,\tfrac jn)\,ds\Big) = o(1). \tag{5.19}$$
In Subsect. 8.3 we control the Taylor approximation
$$\mathbf{E}_{\mu^n_s}\Big(\frac1n\sum_{j\in\mathbb{T}^n}\big|\partial_x v\,\big(\Psi^n(\hat\rho^n,\hat u^n) - \hat\rho^n\hat u^n\big)I^n(\hat\rho^n,\hat u^n)\big|(s,\tfrac jn)\Big) \le C h^n(s) + o(1). \tag{5.20}$$
Finally, in Subsect. 8.4 we control the fluctuations
$$\mathbf{E}_{\mu^n_s}\Big(\frac1n\sum_{j\in\mathbb{T}^n}\big|\partial_x v\,(\hat\rho^n-\rho)(\hat u^n-u)\,I^n(\hat\rho^n,\hat u^n)\big|(s,\tfrac jn)\Big) \le C h^n(s) + o(1). \tag{5.21}$$
Having all these done, from (4.18), (5.6) and the bounds (5.17), (5.18), (5.19), (5.20), (5.21) it follows that
$$\int_0^t A^n(s)\,ds \le \frac12 h^n(t) + \frac12 s^n(t) + C\int_0^t h^n(s)\,ds + o(1). \tag{5.22}$$
Finally, from (4.17), (4.19), (5.22) and noting that $s^n(t)\ge 0$ we get the desired Grönwall inequality (4.4) and the Theorem follows. Note the importance of the term $-\partial_t s^n(t)$ on the right-hand side of (4.6).
6. Tools
6.1. Fixed time estimates. In the estimates with fixed time $s\in[0,T]$ we shall use the notation
$$L = L(n) := n^{-2\beta}\,l. \tag{6.1}$$
Note that $L\gg 1$ as $n\to\infty$. The following general entropy estimate will be exploited all over:
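Lemma 3 (Fixed time entropy inequality). Let $l\le n$, $\mathcal{V}:\Omega^l\to\mathbb{R}$ and denote $\mathcal{V}_j(\omega) := \mathcal{V}(\omega_j,\dots,\omega_{j+l-1})$. Then for any $\gamma>0$,
$$\mathbf{E}_{\mu^n_s}\Big(\frac1n\sum_{j\in\mathbb{T}^n}\mathcal{V}_j(X^n_s)\Big) \le \frac1\gamma\,h^n(s) + \frac{1}{\gamma L}\,\frac1n\sum_{j\in\mathbb{T}^n}\log\mathbf{E}_{\nu^n_s}\exp\{\gamma L\,\mathcal{V}_j\}. \tag{6.2}$$
This lemma is a standard tool in the context of the relative entropy method. For its proof we refer the reader to the original paper [27] or the monograph [10].
Proposition 1 (Fixed time large deviation bounds). (i) For any $c>0$ there exists $M<\infty$ such that for any $s\in[0,T]$,
$$\mathbf{E}_{\mu^n_s}\Big(\frac1n\sum_{j\in\mathbb{T}^n}\big((1+\hat\rho^n+|\hat u^n|)\,\mathbb{1}_{\{\hat\rho^n\vee|\hat u^n|>M\}}\big)(s,\tfrac jn)\Big) \le c\,h^n(s) + o(1). \tag{6.3}$$
(ii) There exist $C<\infty$ and $M<\infty$ such that for any $s\in[0,T]$,
$$\mathbf{E}_{\mu^n_s}\Big(\frac1n\sum_{j\in\mathbb{T}^n}\big(|\hat u^n|^2\,\mathbb{1}_{\{\hat\rho^n\vee|\hat u^n|>M\}}\big)(s,\tfrac jn)\Big) \le C\,h^n(s) + o(1). \tag{6.4}$$
The proof of Proposition 1 is postponed to Subsect. 9.1. It relies on the entropy inequality (6.2) of Lemma 3, the stochastic dominations formulated in Lemma 5 (see Subsect. 9.1) and standard large deviation bounds.
Proposition 2 (Fixed time fluctuation bounds). For any $M<\infty$ there exists a $C<\infty$ such that the following bounds hold:
$$\mathbf{E}_{\mu^n_s}\Big(\frac1n\sum_{j\in\mathbb{T}^n}\big|\hat u^n-u\big|^2(s,\tfrac jn)\Big) \le C\,h^n(s) + o(1), \tag{6.5}$$
$$\mathbf{E}_{\mu^n_s}\Big(\frac1n\sum_{j\in\mathbb{T}^n}\big(\big|\hat\rho^n-\rho\big|^2\,\mathbb{1}_{\{\hat\rho^n\le M\}}\big)(s,\tfrac jn)\Big) \le C\,h^n(s) + o(1). \tag{6.6}$$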
The proof of Proposition 2 is postponed to Subsect. 9.2. It relies on the entropy inequality (6.2) of Lemma 3 and Gaussian fluctuation estimates.
6.2. Convergence to local equilibrium and a priori bounds. The hydrodynamic limit relies on macroscopically fast convergence to (local) equilibrium in blocks of mesoscopic size $l$. Fix the block size $l$, $N\in[0,l\max\eta]$, $Z\in[l\min\zeta,l\max\zeta]$ and denote
$$\Omega^l_{N,Z} := \Big\{\omega\in\Omega^l : \sum_{j=1}^l\eta_j = N,\ \sum_{j=1}^l\zeta_j = Z\Big\}, \qquad \pi^l_{N,Z}(\omega) := \pi^l_{\lambda,\theta}\Big(\omega \,\Big|\, \sum_{j=1}^l\eta_j = N,\ \sum_{j=1}^l\zeta_j = Z\Big).$$
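Naturally, we are only interested in the pairs $(N,Z)$ for which $\Omega^l_{N,Z}$ is not empty. Expectation with respect to the measure $\pi^l_{N,Z}$ is denoted by $\mathbf{E}^l_{N,Z}(\cdot)$. For $f:\Omega^l_{N,Z}\to\mathbb{R}$ let
$$K^l_{N,Z}f(\omega) := \sum_{j=1}^{l-1}\sum_{\omega',\omega''} s(\omega_j,\omega_{j+1};\omega',\omega'')\big(f(\Theta^{\omega',\omega''}_{j,j+1}\omega) - f(\omega)\big),$$
$$D^l_{N,Z}(f) := \frac12\,\mathbf{E}^l_{N,Z}\Big(\sum_{j=1}^{l-1}\sum_{\omega',\omega''} s(\omega_j,\omega_{j+1};\omega',\omega'')\big(f(\Theta^{\omega',\omega''}_{j,j+1}\omega) - f(\omega)\big)^2\Big).$$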
In plain words: $\Omega^l_{N,Z}$ is the hyperplane of configurations $\omega\in\Omega^l$ with fixed values of the conserved quantities, $\pi^l_{N,Z}$ is the microcanonical distribution on this hyperplane, $K^l_{N,Z}$ is the symmetric infinitesimal generator restricted to the hyperplane $\Omega^l_{N,Z}$, and finally $D^l_{N,Z}$ is the Dirichlet form associated to $K^l_{N,Z}$. Note that $K^l_{N,Z}$ is defined with free boundary conditions. The convergence to local equilibrium is quantitatively controlled by the following uniform logarithmic Sobolev estimate, assumed to hold:
(I) Logarithmic Sobolev inequality: There exists a finite constant $\aleph$ such that for any $l\in\mathbb{N}$, $N\in[0,l\max\eta]$, $Z\in[l\min\zeta,l\max\zeta]$, and any $h:\Omega^l_{N,Z}\to\mathbb{R}_+$ with $\mathbf{E}^l_{N,Z}(h) = 1$ the following bound holds:
$$\mathbf{E}^l_{N,Z}\big(h\log h\big) \le \aleph\,l^2\,D^l_{N,Z}\big(\sqrt h\big). \tag{6.7}$$
Remark. The uniform logarithmic Sobolev inequality (6.7) is expected to hold for a very wide range of locally finite interacting particle systems, though we do not know of a fully general proof. In [28] the logarithmic Sobolev inequality is proved for symmetric K-exclusion processes. This implies that (6.7) holds for the two lane models defined in Sect. 2. In [6] Yau's method of proving the logarithmic Sobolev inequality is applied and the logarithmic Sobolev inequality is stated for random stirring models with an arbitrary number of colors. In particular, (6.7) follows for the $\{-1,0,+1\}$-model defined in Sect. 2.
The following large deviation bound goes back to Varadhan [26]. See also the monographs [10] and [4].
Lemma 4 (Time-averaged entropy inequality, local equilibrium). Let $l\le n$, $\mathcal{V}:\Omega^l\to\mathbb{R}_+$ and denote $\mathcal{V}_j(\omega) := \mathcal{V}(\omega_j,\dots,\omega_{j+l-1})$. Then for any $q>0$,
$$\int_0^t\mathbf{E}_{\mu^n_s}\Big(\frac1n\sum_{j\in\mathbb{T}^n}\mathcal{V}_j(X^n_s)\Big)ds \le \frac{\aleph\,l^3}{2q\,n^{1+3\beta+\delta}}\Big(s^n(t) + \frac{2\,n^{1+3\beta+\delta}\,t}{\aleph\,l^3}\max_{N,Z}\log\mathbf{E}^l_{N,Z}\exp\{q\mathcal{V}\}\Big). \tag{6.8}$$
Remarks. (1) Since $n^{1+3\beta+\delta}/l^3 = o(1)$,
in order to efficiently apply Lemma 4 one has to choose $q = q(n)$ so that $\mathbf{E}^l_{N,Z}\exp\{q\mathcal{V}\} = O(1)$, uniformly in the block size $l = l(n)\in\mathbb{N}$, and in $N\in[0,l\max\eta]$ and $Z\in[l\min\zeta,l\max\zeta]$.
(2) The proof of the bound (6.8) explicitly relies on the logarithmic Sobolev inequality (6.7). It appears in [29] and it is reproduced in several places, see e.g. [4, 5]. We do not repeat it here.
The main probabilistic ingredients of our proof are summarized in Proposition 3, which is a consequence of Lemma 4. These are variants of the celebrated one block estimate, respectively, two blocks estimate of Varadhan and co-authors.
Proposition 3 (Time-averaged block replacement and gradient bounds). Given a local variable $\xi:\Omega^m\to\mathbb{R}$ there exists a constant $C$ such that the following bounds hold:
(i)
$$\int_0^t\mathbf{E}_{\mu^n_s}\Big(\int_{\mathbb{T}}\big|\hat\xi^n - \Xi(\hat\eta^n,\hat\zeta^n)\big|^2(s,x)\,dx\Big)ds \le C\,\frac{l^2}{n^{1+3\beta+\delta}}\big(s^n(t) + o(1)\big). \tag{6.9}$$
(ii)
$$\int_0^t\mathbf{E}_{\mu^n_s}\Big(\int_{\mathbb{T}}\big|\partial_x\hat\xi^n(s,x)\big|^2dx\Big)ds \le C\,n^{1-3\beta-\delta}\big(s^n(t) + o(1)\big). \tag{6.10}$$
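(iii) Further on, if $\xi:\Omega\to\mathbb{R}$ (that is, it depends on a single spin) and $\xi(\omega) = 0$ whenever $\eta(\omega) = 0$, then the following stronger version of the gradient bound holds:
$$\int_0^t\mathbf{E}_{\mu^n_s}\Big(\int_{\mathbb{T}}\frac{\big|\partial_x\hat\xi^n(s,x)\big|^2}{\hat\eta^n(s,x)}\,dx\Big)ds \le C\,n^{1-3\beta-\delta}\big(s^n(t) + o(1)\big). \tag{6.11}$$
The proof of Proposition 3 is postponed to Subsect. 9.3. It relies on the large deviation bound (6.8) and some elementary probability estimates stated in Lemma 9 (see Subsect. 9.3). We shall apply (6.9) to $\xi = \phi$ and $\xi = \psi$. From (6.10) it follows that
$$\int_0^t\mathbf{E}_{\mu^n_s}\Big(\int_{\mathbb{T}}\big|\partial_x\hat u^n(s,x)\big|^2dx\Big)ds \le C\,n^{1-\beta-\delta}\big(s^n(t) + o(1)\big), \tag{6.12}$$
$$\int_0^t\mathbf{E}_{\mu^n_s}\Big(\int_{\mathbb{T}}\big|\partial_x\hat\rho^n(s,x)\big|^2dx\Big)ds \le C\,n^{1+\beta-\delta}\big(s^n(t) + o(1)\big). \tag{6.13}$$
Using (6.11) the last bound is improved to
$$\int_0^t\mathbf{E}_{\mu^n_s}\Big(\int_{\mathbb{T}}\frac{\big|\partial_x\hat\rho^n(s,x)\big|^2}{\hat\rho^n(s,x)}\,dx\Big)ds \le C\,n^{1-\beta-\delta}\big(s^n(t) + o(1)\big). \tag{6.14}$$
The bound (6.11) will also be applied to $\xi = \kappa$ (see (2.13) and (2.14)) to get
$$\int_0^t\mathbf{E}_{\mu^n_s}\Big(\int_{\mathbb{T}}\frac{\big|n^{2\beta}\partial_x\hat\kappa^n(s,x)\big|^2}{\hat\rho^n(s,x)}\,dx\Big)ds \le C\,n^{1-\beta-\delta}\big(s^n(t) + o(1)\big). \tag{6.15}$$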
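The powers of $n$ in (6.12) to (6.15) follow from (6.10) and (6.11) by bookkeeping alone. The sketch below assumes the scaling $\hat\rho^n = n^{2\beta}\hat\eta^n$, $\hat u^n = n^{\beta}\hat\zeta^n$ of the block averages, which is the normalization visible in Subsect. 9.2; it is meant as a consistency check, not as part of the proof.

```latex
% Scaling assumed: \hat\rho^n = n^{2\beta}\hat\eta^n, \hat u^n = n^{\beta}\hat\zeta^n.
\begin{align*}
|\partial_x \hat u^n|^2 &= n^{2\beta}\,|\partial_x\hat\zeta^n|^2
  &&\Rightarrow\; n^{2\beta}\cdot n^{1-3\beta-\delta} = n^{1-\beta-\delta}
  &&\text{(6.12), from (6.10) with } \xi=\zeta,\\
|\partial_x \hat\rho^n|^2 &= n^{4\beta}\,|\partial_x\hat\eta^n|^2
  &&\Rightarrow\; n^{4\beta}\cdot n^{1-3\beta-\delta} = n^{1+\beta-\delta}
  &&\text{(6.13), from (6.10) with } \xi=\eta,\\
\frac{|\partial_x \hat\rho^n|^2}{\hat\rho^n}
  &= n^{2\beta}\,\frac{|\partial_x\hat\eta^n|^2}{\hat\eta^n}
  &&\Rightarrow\; n^{2\beta}\cdot n^{1-3\beta-\delta} = n^{1-\beta-\delta}
  &&\text{(6.14), from (6.11) with } \xi=\eta,\\
\frac{|n^{2\beta}\partial_x \hat\kappa^n|^2}{\hat\rho^n}
  &= n^{2\beta}\,\frac{|\partial_x\hat\kappa^n|^2}{\hat\eta^n}
  &&\Rightarrow\; n^{2\beta}\cdot n^{1-3\beta-\delta} = n^{1-\beta-\delta}
  &&\text{(6.15), from (6.11) with } \xi=\kappa.
\end{align*}
```

The gain of $n^{2\beta}$ between (6.13) and (6.14) is exactly the factor $1/\hat\rho^n = n^{-2\beta}/\hat\eta^n$.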
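7. Control of the Large Values of (ρ, u): Proof of (5.17)
7.1. Preparations. In the present section we prove (5.17). First we replace the sum $\frac1n\sum_{j\in\mathbb{T}^n}\cdots$ by the integral $\int_{\mathbb{T}}\cdots\,dx$. Note that, given a smooth function $F:\mathbb{T}\to\mathbb{R}$,
$$\Big|\frac1n\sum_{j\in\mathbb{T}^n}F(\tfrac jn) - \int_{\mathbb{T}}F(x)\,dx\Big| \le \frac1n\Big(\int_{\mathbb{T}}|\partial_xF(x)|^2\,dx\Big)^{1/2}. \tag{7.1}$$
Hence it follows that
$$\mathbf{E}_{\mu^n}\Big(\int_0^t\frac1n\sum_{j\in\mathbb{T}^n}\big(\partial_x v\,n^{3\beta}\hat\psi^n J^n(\hat\rho^n,\hat u^n)\big)(s,\tfrac jn)\,ds\Big) = \mathbf{E}_{\mu^n}\Big(\int_0^t\int_{\mathbb{T}}\big(\partial_x v\,n^{3\beta}\hat\psi^n J^n(\hat\rho^n,\hat u^n)\big)(s,x)\,dx\,ds\Big) + A^n_{13}, \tag{7.2}$$
where $A^n_{13}$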
is again a simple numerical error term: n n n n A ≤ Cn3β 1 + sup ∂x ψ (s, x) + ∂x ρ (s, x) + ∂x u (s, x) 13 0≤s≤t x∈T
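$= O(n^{5\beta}l^{-1}) = o(1)$. In the last step we use the boundedness of the function $\partial_x v(t,x)$ and the most straightforward gradient bound (4.12). We have to prove that the main term on the right-hand side of (7.2) is negligible. Recall that $J^n = S^n_\rho$. We start with the application of the martingale identity:
$$\begin{aligned}
&\mathbf{E}_{\mu^n}\Big(\int_{\mathbb{T}}\Big(\big(vS^n(\hat\rho^n,\hat u^n)\big)(t,x) - \big(vS^n(\hat\rho^n,\hat u^n)\big)(0,x) - \int_0^t\big((\partial_t v)S^n(\hat\rho^n,\hat u^n)\big)(s,x)\,ds\Big)dx\Big)\\
&\qquad= \mathbf{E}_{\mu^n}\Big(\int_0^t\int_{\mathbb{T}}v(s,x)\,n^{1+\beta}L^nS^n\big(\hat\rho^n(x),\hat u^n(x)\big)(X^n_s)\,dx\,ds\Big)\\
&\qquad\quad+ \mathbf{E}_{\mu^n}\Big(\int_0^t\int_{\mathbb{T}}v(s,x)\,n^{1+\beta+\delta}K^nS^n\big(\hat\rho^n(x),\hat u^n(x)\big)(X^n_s)\,dx\,ds\Big).
\end{aligned}\tag{7.3}$$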
7.2. The left-hand side of (7.3). From (5.11), (5.12) we conclude that
$$|S^n(\rho,u)| \le C\big(\rho+|u|\big)\,\mathbb{1}_{\{\rho\vee|u|>M\}}.$$
Hence, using the large deviation bound (6.3) it follows that, for any fixed small $c>0$, by choosing $M$ sufficiently large, we obtain
$$\mathbf{E}_{\mu^n_s}\Big(\frac1n\sum_{j\in\mathbb{T}^n}\big|S^n(\hat\rho^n,\hat u^n)\big|(s,\tfrac jn)\Big) \le c\,h^n(s) + o(1).$$
Applying again (7.1) and choosing an appropriately small $c$ in the previous bound, we get
$$|\,\text{l.h.s. of (7.3)}\,| \le \frac12 h^n(t) + C\int_0^t h^n(s)\,ds + o(1). \tag{7.4}$$
Remark. Note that this is the point where $M$, and thus the lower edge of the cutoff, is fixed. Also note the importance of the factor 1/2 in front of $h^n(t)$ on the right-hand side.
7.3. The right-hand side of (7.3): first computations. First we compute how the infinitesimal generators $n^{1+\beta}L^n$ and $n^{1+\beta+\delta}K^n$ act on the function $\omega\mapsto S^n(\hat\rho^n(x),\hat u^n(x))$:
$$n^{1+\beta}L^nS^n\big(\hat\rho^n(x),\hat u^n(x)\big) = \big(S^n_\rho(\hat\rho^n,\hat u^n)\,n^{3\beta}\partial_x\hat\psi^n + S^n_u(\hat\rho^n,\hat u^n)\,n^{2\beta}\partial_x\hat\phi^n\big)(x) + A^n_{14}(\omega,x), \tag{7.5}$$
$$n^{1+\beta+\delta}K^nS^n\big(\hat\rho^n(x),\hat u^n(x)\big) = n^{-1+\beta+\delta}\big(S^n_\rho(\hat\rho^n,\hat u^n)\,n^{2\beta}\partial_x^2\hat\kappa^n + S^n_u(\hat\rho^n,\hat u^n)\,n^{\beta}\partial_x^2\hat\chi^n\big)(x) + A^n_{15}(\omega,x), \tag{7.6}$$
where $A^n_{14}(\omega,x)$ and $A^n_{15}(\omega,x)$ are numerical error terms. These error terms are easily estimated: using the fact that the second partial derivatives of $S^n$ are uniformly bounded and that $\zeta$ and $\eta$ are bounded, by simple Taylor expansion, after tedious but otherwise straightforward computations, we find
$$\sup_{\omega\in\Omega^n}\sup_{x\in\mathbb{T}}\big(|A^n_{14}(\omega,x)| + |A^n_{15}(\omega,x)|\big) \le C\big(n^{1+3\beta}l^{-2} + n^{1+5\beta+\delta}l^{-3}\big) = o(1). \tag{7.7}$$
For similar computational details see [6] or [25].
Next we do some further transformations on the main terms coming from the right-hand sides of (7.5) and (7.6). Performing integrations by parts, introducing the macroscopic fluxes and using (5.9) we obtain:
$$\begin{aligned}
&-\int_{\mathbb{T}}v(x)\big(S^n_\rho(\hat\rho^n,\hat u^n)\,\partial_xn^{3\beta}\hat\psi^n + S^n_u(\hat\rho^n,\hat u^n)\,\partial_xn^{2\beta}\hat\phi^n\big)(x)\,dx\\
&\quad= \int_{\mathbb{T}}\partial_xv(x)\big(n^{3\beta}\hat\psi^n\,S^n_\rho(\hat\rho^n,\hat u^n)\big)(x)\,dx\\
&\qquad+ \int_{\mathbb{T}}\partial_xv(x)\big(F^n(\hat\rho^n,\hat u^n) - \Psi^n(\hat\rho^n,\hat u^n)S^n_\rho(\hat\rho^n,\hat u^n)\big)(x)\,dx\\
&\qquad+ \int_{\mathbb{T}}\partial_xv(x)\big(n^{2\beta}S^n_u(\hat\rho^n,\hat u^n)\big(\hat\phi^n - \Phi(\hat\eta^n,\hat\zeta^n)\big)\big)(x)\,dx\\
&\qquad+ \int_{\mathbb{T}}v(x)\Big(n^{3\beta}\big(S^n_{\rho\rho}(\hat\rho^n,\hat u^n)\,\partial_x\hat\rho^n + S^n_{\rho u}(\hat\rho^n,\hat u^n)\,\partial_x\hat u^n\big)\big(\hat\psi^n - \Psi(\hat\eta^n,\hat\zeta^n)\big)\\
&\qquad\qquad\qquad+ n^{2\beta}\big(S^n_{u\rho}(\hat\rho^n,\hat u^n)\,\partial_x\hat\rho^n + S^n_{uu}(\hat\rho^n,\hat u^n)\,\partial_x\hat u^n\big)\big(\hat\phi^n - \Phi(\hat\eta^n,\hat\zeta^n)\big)\Big)(x)\,dx.
\end{aligned}\tag{7.8}$$
Note that, since $J^n = S^n_\rho$, the first term on the right-hand side is exactly the expression in the main term on the right-hand side of (7.2). Estimating the other terms on the right-hand side of (7.8) is the object of the next subsection. Also note that here we rely heavily on the fact that $S^n$ is a Lax entropy of the pde (5.8); without this we would not be able to carry out the needed calculations.
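Now we turn to the main term on the right-hand side of (7.6). Straightforward integration by parts yields
$$\begin{aligned}
&-\int_{\mathbb{T}}v(x)\big(S^n_\rho(\hat\rho^n,\hat u^n)\,n^{2\beta}\partial_x^2\hat\kappa^n + S^n_u(\hat\rho^n,\hat u^n)\,n^{\beta}\partial_x^2\hat\chi^n\big)(x)\,dx\\
&\quad= \int_{\mathbb{T}}\partial_xv(x)\big(S^n_\rho(\hat\rho^n,\hat u^n)\,n^{2\beta}\partial_x\hat\kappa^n + S^n_u(\hat\rho^n,\hat u^n)\,n^{\beta}\partial_x\hat\chi^n\big)(x)\,dx\\
&\qquad+ \int_{\mathbb{T}}v(x)\Big(\big(S^n_{\rho\rho}(\hat\rho^n,\hat u^n)\,\partial_x\hat\rho^n + S^n_{\rho u}(\hat\rho^n,\hat u^n)\,\partial_x\hat u^n\big)\,n^{2\beta}\partial_x\hat\kappa^n\\
&\qquad\qquad\qquad+ \big(S^n_{\rho u}(\hat\rho^n,\hat u^n)\,\partial_x\hat\rho^n + S^n_{uu}(\hat\rho^n,\hat u^n)\,\partial_x\hat u^n\big)\,n^{\beta}\partial_x\hat\chi^n\Big)(x)\,dx.
\end{aligned}\tag{7.9}$$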
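We will estimate the terms emerging from the right-hand side in the next subsection.
7.4. The right-hand side of (7.3): bounds. By (5.16),
$$\big|F^n(\hat\rho^n,\hat u^n) - \Psi^n(\hat\rho^n,\hat u^n)S^n_\rho(\hat\rho^n,\hat u^n)\big| \le C\big(1+|\hat u^n|^2\big)\,\mathbb{1}_{\{\hat\rho^n\vee|\hat u^n|>M\}}.$$
Hence, applying the large deviation bounds (6.3) and (6.4) we obtain
$$\mathbf{E}_{\mu^n_s}\Big(\int_{\mathbb{T}}\big|F^n(\hat\rho^n,\hat u^n) - \Psi^n(\hat\rho^n,\hat u^n)S^n_\rho(\hat\rho^n,\hat u^n)\big|(s,x)\,dx\Big) \le C\,h^n(s) + o(1). \tag{7.10}$$
Next we use the bound (5.12) on $S^n_u$ and the first block replacement bound (6.9) to obtain:
$$\mathbf{E}_{\mu^n}\Big(\int_0^t\int_{\mathbb{T}}\big|n^{2\beta}S^n_u(\hat\rho^n,\hat u^n)\big(\hat\phi^n - \Phi(\hat\eta^n,\hat\zeta^n)\big)\big|(s,x)\,dx\,ds\Big) \le C\,l\,n^{(-1-\delta+\beta)/2} = o(1). \tag{7.11}$$
For the next terms we use the bounds on the second derivatives of $S^n$, see (5.13), (5.14), (5.15), and note that here we do not exploit the fact that the constant factor $\varepsilon$ on the right-hand side is actually small. Together with the block replacement bounds (6.9), the gradient bounds (6.12), (6.14) and the bound (4.1) on the relative entropy $s^n(t)$ we get the following four estimates:
$$\begin{aligned}
\mathbf{E}_{\mu^n}\Big(\int_0^t\int_{\mathbb{T}}\big|n^{3\beta}S^n_{\rho\rho}(\hat\rho^n,\hat u^n)\,\partial_x\hat\rho^n\,\big(\hat\psi^n - \Psi(\hat\eta^n,\hat\zeta^n)\big)\big|\,dx\,ds\Big) &\le C\,l\,n^{\beta-\delta},\\
\mathbf{E}_{\mu^n}\Big(\int_0^t\int_{\mathbb{T}}\big|n^{3\beta}S^n_{\rho u}(\hat\rho^n,\hat u^n)\,\partial_x\hat u^n\,\big(\hat\psi^n - \Psi(\hat\eta^n,\hat\zeta^n)\big)\big|\,dx\,ds\Big) &\le C\,l\,n^{\beta-\delta},\\
\mathbf{E}_{\mu^n}\Big(\int_0^t\int_{\mathbb{T}}\big|n^{2\beta}S^n_{u\rho}(\hat\rho^n,\hat u^n)\,\partial_x\hat\rho^n\,\big(\hat\phi^n - \Phi(\hat\eta^n,\hat\zeta^n)\big)\big|\,dx\,ds\Big) &\le C\,l\,n^{-\delta},\\
\mathbf{E}_{\mu^n}\Big(\int_0^t\int_{\mathbb{T}}\big|n^{2\beta}S^n_{uu}(\hat\rho^n,\hat u^n)\,\partial_x\hat u^n\,\big(\hat\phi^n - \Phi(\hat\eta^n,\hat\zeta^n)\big)\big|\,dx\,ds\Big) &\le C\,l\,n^{-\delta},
\end{aligned}\tag{7.12}$$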
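where the upper bounds on the right are all $o(1)$. Using the bounds (5.11) and (5.12) on the first partial derivatives of $S^n$, and the gradient bounds (6.12), (6.13) we obtain the following two bounds:
$$n^{-1+\beta+\delta}\,\mathbf{E}_{\mu^n}\Big(\int_0^t\int_{\mathbb{T}}\big|S^n_\rho(\hat\rho^n,\hat u^n)\,\partial_x\hat\rho^n\big|(s,x)\,dx\,ds\Big) \le C\,n^{(-1+\delta+3\beta)/2} = o(1), \tag{7.13}$$
$$n^{-1+\beta+\delta}\,\mathbf{E}_{\mu^n}\Big(\int_0^t\int_{\mathbb{T}}\big|S^n_u(\hat\rho^n,\hat u^n)\,\partial_x\hat u^n\big|(s,x)\,dx\,ds\Big) \le C\,n^{(-1+\delta+\beta)/2} = o(1). \tag{7.14}$$
The following bounds are of crucial importance and they are sharp. We use (5.13), (5.14) and (5.15) again and note that here we exploit them in their full power: the constant factor $\varepsilon$ on the right-hand side is small. These and the gradient bounds (6.12) and (6.14) yield the following bounds:
$$\begin{aligned}
n^{-1+\beta+\delta}\,\mathbf{E}_{\mu^n}\Big(\int_0^t\int_{\mathbb{T}}\big|S^n_{\rho\rho}(\hat\rho^n,\hat u^n)\,\partial_x\hat\rho^n\,n^{2\beta}\partial_x\hat\kappa^n\big|(s,x)\,dx\,ds\Big) &\le c\,s^n(t) + o(1),\\
n^{-1+\beta+\delta}\,\mathbf{E}_{\mu^n}\Big(\int_0^t\int_{\mathbb{T}}\big|S^n_{\rho u}(\hat\rho^n,\hat u^n)\,\partial_x\hat u^n\,n^{2\beta}\partial_x\hat\kappa^n\big|(s,x)\,dx\,ds\Big) &\le c\,s^n(t) + o(1),\\
n^{-1+\beta+\delta}\,\mathbf{E}_{\mu^n}\Big(\int_0^t\int_{\mathbb{T}}\big|S^n_{\rho u}(\hat\rho^n,\hat u^n)\,\partial_x\hat\rho^n\,n^{\beta}\partial_x\hat\chi^n\big|(s,x)\,dx\,ds\Big) &\le c\,s^n(t) + o(1),\\
n^{-1+\beta+\delta}\,\mathbf{E}_{\mu^n}\Big(\int_0^t\int_{\mathbb{T}}\big|S^n_{uu}(\hat\rho^n,\hat u^n)\,\partial_x\hat u^n\,n^{\beta}\partial_x\hat\chi^n\big|(s,x)\,dx\,ds\Big) &\le c\,s^n(t) + o(1).
\end{aligned}\tag{7.15}$$
We choose $\varepsilon$ so small in Lemma 2 that
$$c\,\sup_{(t,x)\in[0,T]\times\mathbb{T}}|v(t,x)| < \frac12. \tag{7.16}$$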
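The powers of $n$ in the $o(1)$ bounds (7.11), (7.13) and (7.14) come from the Cauchy-Schwarz inequality combined with the time-averaged bounds of Proposition 3. The following bookkeeping is a sketch only: it treats $s^n(t)+o(1)$ and the bounded factors (the first derivatives of $S^n$ and the length of the time interval) as constants absorbed into $C$.

```latex
% (7.11): n^{2\beta} S^n_u times the block replacement error, via (6.9) with \xi = \phi
\[
n^{2\beta}\Bigl(\frac{l^2}{n^{1+3\beta+\delta}}\Bigr)^{1/2}
  = l\, n^{2\beta - (1+3\beta+\delta)/2} = l\, n^{(-1-\delta+\beta)/2}.
\]
% (7.13): n^{-1+\beta+\delta} times the \rho-gradient, via (6.13)
\[
n^{-1+\beta+\delta}\bigl(n^{1+\beta-\delta}\bigr)^{1/2} = n^{(-1+\delta+3\beta)/2}.
\]
% (7.14): n^{-1+\beta+\delta} times the u-gradient, via (6.12)
\[
n^{-1+\beta+\delta}\bigl(n^{1-\beta-\delta}\bigr)^{1/2} = n^{(-1+\delta+\beta)/2}.
\]
```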
7.5. Summing up. The identities (7.5), (7.6), (7.8), (7.9) and the bounds (7.7), (7.10), (7.11), (7.12), (7.14), (7.15) yield
$$\mathbf{E}_{\mu^n}\Big(\int_0^t\int_{\mathbb{T}}\big(\partial_x v\,n^{3\beta}\hat\psi^n S^n_\rho(\hat\rho^n,\hat u^n)\big)(s,x)\,dx\,ds\Big) - \text{r.h.s. of (7.3)} \le \frac12 s^n(t) + C\int_0^t h^n(s)\,ds + o(1). \tag{7.17}$$
Finally, from (7.2), (7.3), (7.4) and (7.17) we obtain (5.17).
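Spelled out, the last step is a three-line chain. The sketch below writes $\mathcal{M}^n(t)$ for the main term on the right-hand side of (7.2); this name is introduced here only for readability, and $C$ may change from line to line.

```latex
% \mathcal{M}^n(t) := E_{\mu^n} \int_0^t \int_T
%    (\partial_x v\, n^{3\beta} \hat\psi^n S^n_\rho(\hat\rho^n,\hat u^n))(s,x)\,dx\,ds
\begin{align*}
\mathcal{M}^n(t)
 &\le \text{r.h.s. of (7.3)} + \tfrac12 s^n(t) + C\int_0^t h^n(s)\,ds + o(1)
   && \text{by (7.17)}\\
 &= \text{l.h.s. of (7.3)} + \tfrac12 s^n(t) + C\int_0^t h^n(s)\,ds + o(1)
   && \text{by (7.3)}\\
 &\le \tfrac12 h^n(t) + \tfrac12 s^n(t) + C\int_0^t h^n(s)\,ds + o(1)
   && \text{by (7.4)}.
\end{align*}
% By (7.2) the left-hand side of (5.17) equals \mathcal{M}^n(t) + A^n_{13}
% with A^n_{13} = o(1), which gives (5.17).
```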
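8. Control of the Small Values of (ρ, u): Proof of the Bounds (5.18) to (5.21)
8.1. Proof of (5.18). We exploit the inequality
$$\big|J^n(\hat\rho^n,\hat u^n)\big| = \big|S^n_\rho(\hat\rho^n,\hat u^n)\big| \le C\,\mathbb{1}_{\{\hat\rho^n\vee|\hat u^n|>M\}},$$
see (5.11), and the boundedness of the functions $\rho(t,x)$, $u(t,x)$, $\partial_xv(t,x)$. Applying the large deviation bound (6.3) we readily obtain (5.18).
8.2. Proof of (5.19). This is very similar to what has been done in various parts of Subsect. 7.4. We use the block replacement bound (6.9) and the bound
$$\big|I^n(\hat\rho^n,\hat u^n)\big| = \big|1 - S^n_\rho(\hat\rho^n,\hat u^n)\big| \le C, \tag{8.1}$$
which follows from (5.11). We get
$$\mathbf{E}_{\mu^n}\Big(\int_0^t\int_{\mathbb{T}}\big|n^{3\beta}\big(\hat\psi^n - \Psi(\hat\eta^n,\hat\zeta^n)\big)I^n(\hat\rho^n,\hat u^n)\big|(s,x)\,ds\,dx\Big) \le C\,l\,n^{(-1-\delta+3\beta)/2} = o(1),$$
which proves (5.19).
8.3. Proof of (5.20). We write
$$I^n(\hat\rho^n,\hat u^n) = \mathbb{1}_{\{\hat\rho^n\vee|\hat u^n|\le M\}} + \mathbb{1}_{\{\hat\rho^n\vee|\hat u^n|>M\}}\,I^n(\hat\rho^n,\hat u^n) \tag{8.2}$$
and note that, by Taylor expansion of the function $(\rho,u)\mapsto\Psi(\rho,u)$,
$$\big|\Psi^n(\hat\rho^n,\hat u^n) - \hat\rho^n\hat u^n\big|\,\mathbb{1}_{\{\hat\rho^n\vee|\hat u^n|\le M\}} \le C\,n^{-2\beta}.$$
On the other hand
$$\big|\Psi^n(\hat\rho^n,\hat u^n)\big| \le C\,\hat\rho^n|\hat u^n| \qquad\text{and}\qquad \hat\rho^n\,\big|I^n(\hat\rho^n,\hat u^n)\big| \le C\big(1+|\hat u^n|\big), \tag{8.3}$$
see (5.11). Thus
$$\big|\Psi^n(\hat\rho^n,\hat u^n) - \hat\rho^n\hat u^n\big|\,\big|I^n(\hat\rho^n,\hat u^n)\big| \le C\Big(n^{-2\beta} + \big(|\hat u^n| + |\hat u^n|^2\big)\mathbb{1}_{\{\hat\rho^n\vee|\hat u^n|>M\}}\Big).$$
From this, using the large deviation bounds (6.3) and (6.4) we obtain (5.20).
8.4. Proof of (5.21). We use again (8.2) and (8.3) and get
$$\big|\hat\rho^n-\rho\big|\,\big|\hat u^n-u\big|\,\big|I^n(\hat\rho^n,\hat u^n)\big| \le \big|\hat\rho^n-\rho\big|\,\big|\hat u^n-u\big|\,\mathbb{1}_{\{\hat\rho^n\vee|\hat u^n|\le M\}} + C\big(1+|\hat u^n|+|\hat u^n|^2\big)\mathbb{1}_{\{\hat\rho^n\vee|\hat u^n|>M\}}.$$
Now the fluctuation bounds (6.5), (6.6), and the large deviation bounds (6.3), (6.4) together yield (5.21).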
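On the event $\{\hat\rho^n\vee|\hat u^n|>M\}$ the pointwise bounds used in Subsects. 8.3 and 8.4 follow from (8.1) and (8.3) by elementary algebra. A sketch, with $C$ changing from line to line and using only that $\hat\rho^n\ge0$ and that $\rho$ and $u$ are bounded:

```latex
% Subsect. 8.3, on {\hat\rho^n \vee |\hat u^n| > M}: use (8.3) twice.
\begin{align*}
|\Psi^n(\hat\rho^n,\hat u^n)|\,|I^n(\hat\rho^n,\hat u^n)|
  &\le C\,|\hat u^n|\;\hat\rho^n |I^n(\hat\rho^n,\hat u^n)|
   \le C\,|\hat u^n|\,(1+|\hat u^n|),\\
|\hat\rho^n \hat u^n|\,|I^n(\hat\rho^n,\hat u^n)|
  &\le C\,|\hat u^n|\,(1+|\hat u^n|).
\end{align*}
% Subsect. 8.4, on the same event: expand the product and use (8.1), (8.3).
\begin{align*}
|\hat\rho^n-\rho|\,|\hat u^n-u|\,|I^n(\hat\rho^n,\hat u^n)|
  &\le \bigl(\hat\rho^n |I^n| + \rho\,|I^n|\bigr)\bigl(|\hat u^n| + |u|\bigr)\\
  &\le C\,(1+|\hat u^n|)\,(1+|\hat u^n|)
   \le C\bigl(1+|\hat u^n|+|\hat u^n|^2\bigr).
\end{align*}
```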
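9. Proof of the “Tools”
9.1. Proof of the large deviation bounds (Proposition 1). Recall the definition (6.1) of $L$. The following lemma follows from simple coupling arguments.
Lemma 5 (Stochastic dominations). There exists a constant $C$ depending only on $\max_{(s,x)\in[0,T]\times\mathbb{T}}\rho(s,x)\vee|u(s,x)|$ such that for any fixed $(s,x)\in[0,T]\times\mathbb{T}$, the following stochastic dominations hold:
$$\mathbf{P}_{\nu^n_s}\big(\hat\rho^n(x) > z\big) \le \mathbf{P}\big(\mathrm{POI}(L) > (z/C)L\big), \tag{9.1}$$
$$\mathbf{P}_{\nu^n_s}\big(|\hat u^n(x)| > z\big) \le \mathbf{P}\big(|\mathrm{GAU}| > \big((z/C)-1\big)\sqrt L\big), \tag{9.2}$$
where $\mathrm{POI}(L)$ is a Poissonian random variable with expectation $L$, and $\mathrm{GAU}$ is a standard Gaussian random variable.
Lemma 6 (Large deviation bounds). (i) For any $q<\infty$ there exists $M<\infty$, such that for any $n\in\mathbb{N}$, $j\in\mathbb{T}^n$ and $s\in[0,T]$,
$$\log\mathbf{E}_{\nu^n_s}\exp\Big\{qL\,\hat\rho^n(\tfrac jn)\,\mathbb{1}_{\{\hat\rho^n(\frac jn)\vee|\hat u^n(\frac jn)|>M\}}\Big\} \le 1, \qquad \log\mathbf{E}_{\nu^n_s}\exp\Big\{qL\,|\hat u^n(\tfrac jn)|\,\mathbb{1}_{\{\hat\rho^n(\frac jn)\vee|\hat u^n(\frac jn)|>M\}}\Big\} \le 1. \tag{9.3}$$
(ii) Let $C$ be the same as in Lemma 5. For any $q\in(0,1/(8C^2))$ there exists $M<\infty$, such that for any $n\in\mathbb{N}$, $j\in\mathbb{T}^n$ and $s\in[0,T]$,
$$\log\mathbf{E}_{\nu^n_s}\exp\Big\{qL\,|\hat u^n(\tfrac jn)|^2\,\mathbb{1}_{\{\hat\rho^n(\frac jn)\vee|\hat u^n(\frac jn)|>M\}}\Big\} \le 1. \tag{9.4}$$
Proof. The bounds of (9.3) follow from standard large deviation arguments using the stochastic dominations (9.1), (9.2). For the bound (9.4) we spell out the proof with $\{\hat\rho^n(\frac jn)>M\}$ instead of $\{\hat\rho^n(\frac jn)\vee|\hat u^n(\frac jn)|>M\}$; the latter follows similarly. Let $Z_L$ be a $\mathrm{POI}(L)$-distributed and $X$ be a standard Gaussian random variable. Using the stochastic dominations (9.1) and (9.2) we obtain
$$\begin{aligned}
\log\mathbf{E}_{\nu^n_s}\exp\Big\{qL\,|\hat u^n(\tfrac jn)|^2\,\mathbb{1}_{\{\hat\rho^n(\frac jn)>M\}}\Big\}
&\le \log\Big(1 + \mathbf{E}_{\nu^n_s}\Big(\exp\big\{qL\,|\hat u^n(\tfrac jn)|^2\big\}\,\mathbb{1}_{\{\hat\rho^n(\frac jn)>M\}}\Big)\Big)\\
&\le \sqrt{\mathbf{E}_{\nu^n_s}\exp\big\{2qL\,|\hat u^n(\tfrac jn)|^2\big\}}\;\sqrt{\mathbf{P}_{\nu^n_s}\big(\hat\rho^n(\tfrac jn)>M\big)}\\
&\le \sqrt{\mathbf{E}\exp\big\{4qC^2(X^2+L)\big\}}\;\sqrt{\mathbf{P}\big(Z_L>(M/C)L\big)}\\
&\le \big(1-8qC^2\big)^{-1/4}\exp\Big\{\frac L2\Big(4qC^2 + \big(e^{(\alpha C)/M}-1\big) - \alpha\Big)\Big\},
\end{aligned}$$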
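where $\alpha$ is an arbitrary positive number and in the last step we used the Markov inequality. Given $q<1/(8C^2)$, we choose $\alpha$ sufficiently large and $M>(C\alpha)/(\ln 2)$ to obtain (9.4).
Now we turn to the proof of Proposition 1:
Proof. The bounds (6.3), respectively, (6.4) follow directly from the entropy inequality (6.2) of Lemma 3 and the bounds (9.3), respectively, (9.4) of Lemma 6. Recall that $L\gg1$ as $n\to\infty$.
9.2. Proof of the fluctuation bounds (Proposition 2). Within this proof we need the notations
$$\tilde u^n(s,x) := \frac{n^\beta}{l}\sum_k a\Big(\frac{nx-k}{l}\Big)\Big(\zeta_k - n^{-\beta}u\big(s,\tfrac kn\big)\Big) = \hat u^n(x) - \mathbf{E}_{\nu^n_s}\big(\hat u^n(x)\big),$$
$$\tilde\rho^n(s,x) := \frac{n^{2\beta}}{l}\sum_k a\Big(\frac{nx-k}{l}\Big)\Big(\eta_k - n^{-2\beta}\rho\big(s,\tfrac kn\big)\Big) = \hat\rho^n(x) - \mathbf{E}_{\nu^n_s}\big(\hat\rho^n(x)\big).$$
Since we have
$$\Big|\hat u^n\big(s,\tfrac jn\big) - u\big(s,\tfrac jn\big) - \tilde u^n\big(s,\tfrac jn\big)\Big| \le C\Big(\frac1l+\frac ln\Big) = o(1), \qquad \Big|\hat\rho^n\big(s,\tfrac jn\big) - \rho\big(s,\tfrac jn\big) - \tilde\rho^n\big(s,\tfrac jn\big)\Big| \le C\Big(\frac1l+\frac ln\Big) = o(1),$$
it is enough to prove
$$\mathbf{E}_{\mu^n_s}\Big(\frac1n\sum_{j\in\mathbb{T}^n}\big|\tilde u^n(s,\tfrac jn)\big|^2\Big) \le C\,h^n(s) + o(1), \tag{9.5}$$
respectively,
$$\mathbf{E}_{\mu^n_s}\Big(\frac1n\sum_{j\in\mathbb{T}^n}\big|\tilde\rho^n(s,\tfrac jn)\big|^2\,\mathbb{1}_{\{|\tilde\rho^n(s,\frac jn)|\le M\}}\Big) \le C\,h^n(s) + o(1). \tag{9.6}$$
Lemma 7. (i) There exists $q_0>0$ (sufficiently small, but fixed) such that for all $n\in\mathbb{N}$, $j\in\mathbb{T}^n$ and $s\in[0,T]$,
$$\log\mathbf{E}_{\nu^n_s}\exp\Big\{q_0L\,\big|\tilde u^n\big(s,\tfrac jn\big)\big|^2\Big\} \le 1. \tag{9.7}$$
(ii) For any $M<\infty$ there exists $q_0>0$ (sufficiently small, but fixed) such that for all $n\in\mathbb{N}$, $j\in\mathbb{T}^n$ and $s\in[0,T]$,
$$\log\mathbf{E}_{\nu^n_s}\exp\Big\{q_0L\,\big|\tilde\rho^n\big(s,\tfrac jn\big)\big|^2\,\mathbb{1}_{\{|\tilde\rho^n(s,\frac jn)|\le M\}}\Big\} \le 1. \tag{9.8}$$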
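Proof. (i) Let $X$ be a standard Gaussian random variable, which is independent of the other random variables in question, and denote by $\langle\dots\rangle$ expectation with respect to $X$. Then
$$\begin{aligned}
\log\mathbf{E}_{\nu^n_s}\exp\Big\{q_0L\,\big|\tilde u^n\big(s,\tfrac jn\big)\big|^2\Big\}
&= \log\mathbf{E}_{\nu^n_s}\exp\Big\{\frac{q_0}{l}\Big(\sum_k a\Big(\frac{j-k}{l}\Big)\big(\zeta_k - \mathbf{E}_{\nu^n_s}(\zeta_k)\big)\Big)^2\Big\}\\
&= \log\mathbf{E}_{\nu^n_s}\Big\langle\exp\Big\{X\sqrt{\frac{2q_0}{l}}\sum_k a\Big(\frac{j-k}{l}\Big)\big(\zeta_k - \mathbf{E}_{\nu^n_s}(\zeta_k)\big)\Big\}\Big\rangle.
\end{aligned}\tag{9.9}$$
Now, note that the random variables $\zeta_k - \mathbf{E}_{\nu^n_s}(\zeta_k)$, $k\in\mathbb{T}^n$, are uniformly bounded and under the distribution $\mathbf{P}_{\nu^n_s}$ they are independent and have zero mean. Hence there exists a finite constant $C$ such that for any collection of real numbers $\lambda_k$, $k\in\mathbb{T}^n$,
$$\mathbf{E}_{\nu^n_s}\exp\Big\{\sum_k\lambda_k\big(\zeta_k - \mathbf{E}_{\nu^n_s}(\zeta_k)\big)\Big\} \le \exp\Big\{C\sum_k\lambda_k^2\Big\}.$$
Further on, there exists a finite constant $C$ such that for any $l$,
$$\frac1l\sum_k a\Big(\frac kl\Big)^2 \le C. \tag{9.10}$$
From these it follows that for some finite constant $C$,
$$\text{r.h.s. of (9.9)} \le \log\big\langle\exp\{q_0CX^2\}\big\rangle.$$
Choosing $q_0$ sufficiently small in this last inequality we obtain (9.7).
(ii) Note first that, given $M<\infty$ fixed, there exists a zero mean bounded random variable $Y$ such that for any $r\in\mathbb{R}$,
$$r^2\,\mathbb{1}_{\{|r|\le M\}} \le \log\mathbf{E}\exp\{rY\}.$$
Let $Y_1,Y_2,\dots$ be i.i.d. copies of $Y$ which are also independent of the other random variables in question, and denote by $\langle\dots\rangle$ expectation with respect to these. Then we have
$$\log\mathbf{E}_{\nu^n_s}\exp\Big\{q_0L\,\big|\tilde\rho^n\big(s,\tfrac jn\big)\big|^2\,\mathbb{1}_{\{|\tilde\rho^n(s,\frac jn)|\le M\}}\Big\} \le \log\mathbf{E}_{\nu^n_s}\Big\langle\exp\Big\{\frac{\sum_{p=1}^{q_0L}Y_p}{L}\sum_k a\Big(\frac{j-k}{l}\Big)\big(\eta_k - \mathbf{E}_{\nu^n_s}(\eta_k)\big)\Big\}\Big\rangle. \tag{9.11}$$
Next note that for any $\lambda<\infty$ there exists a constant $C<\infty$ such that for any $n\in\mathbb{N}$, any $s\in[0,T]$ and any collection of real numbers $\lambda_k\in[-\lambda,\lambda]$, $k\in\mathbb{T}^n$,
$$\mathbf{E}_{\nu^n_s}\exp\Big\{\sum_k\lambda_k\big(\eta_k - \mathbf{E}_{\nu^n_s}(\eta_k)\big)\Big\} \le \exp\Big\{Cn^{-2\beta}\sum_k\lambda_k^2\Big\}.$$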
Hence, using again (9.10),
$$\text{r.h.s. of (9.11)} \le \log\Big\langle\exp\Big\{q_0C\Big(\frac{Y_1+\dots+Y_{q_0L}}{\sqrt{q_0L}}\Big)^2\Big\}\Big\rangle.$$
Now, since the i.i.d. random variables $Y_1,Y_2,\dots$ are bounded and have zero mean, choosing $q_0$ sufficiently small this last expression can be made arbitrarily small, uniformly in $L$. Hence (9.8).
Now back to the proof of Proposition 2.
Proof. From (6.2) and (9.7), respectively, from (6.2) and (9.8) we deduce (9.5), respectively, (9.6). Finally, these two bounds and the arguments at the beginning of the present subsection imply (6.5), respectively, (6.6).
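The exponential Gaussian averaging used in (9.9) rests on the identity $e^{qa^2} = \langle e^{\sqrt{2q}\,aX}\rangle$ for a standard Gaussian $X$, and the closing step uses $\langle e^{cX^2}\rangle = (1-2c)^{-1/2}$ for $c<1/2$. The following is a minimal numerical sanity check of both identities. The function names, the quadrature parameters and the sample values of `q`, `a`, `c` are illustrative choices, not taken from the paper.

```python
import math


def gaussian_expectation(f, half_width=12.0, steps=20_000):
    """Approximate E f(X) for a standard Gaussian X by the trapezoidal rule.

    The integral is truncated to [-half_width, half_width]; this is adequate
    only when f(x) * exp(-x**2 / 2) is negligible outside that window.
    """
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * f(x) * math.exp(-0.5 * x * x)
    return total * h / math.sqrt(2.0 * math.pi)


def linearized_square(q, a):
    """Gaussian average E exp(sqrt(2 q) a X), which should equal exp(q a^2)."""
    return gaussian_expectation(lambda x: math.exp(math.sqrt(2.0 * q) * a * x))


def quadratic_moment(c):
    """Gaussian average E exp(c X^2); finite only for c < 1/2."""
    return gaussian_expectation(lambda x: math.exp(c * x * x))


if __name__ == "__main__":
    q, a, c = 0.3, 1.7, 0.2  # illustrative values only
    print(linearized_square(q, a), math.exp(q * a * a))
    print(quadratic_moment(c), (1.0 - 2.0 * c) ** -0.5)
```

The second identity also shows why $q_0$ must be small: the quadratic moment blows up as $c\uparrow1/2$.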
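9.3. Proof of the block replacement and gradient bounds.
9.3.1. An elementary probability lemma. Let $(\Omega,\pi)$ be a finite probability space and $\omega_i$, $i\in\mathbb{Z}$, i.i.d. $\Omega$-valued random variables with distribution $\pi$. Further on let
$$\zeta:\Omega\to\mathbb{R}^d,\quad \zeta_i := \zeta(\omega_i), \qquad \xi:\Omega^m\to\mathbb{R},\quad \xi_i := \xi(\omega_i,\dots,\omega_{i+m-1}).$$
For $x\in\mathrm{co}(\mathrm{Ran}(\zeta))$ denote
$$\Xi(x) := \frac{\mathbf{E}_\pi\big(\xi_1\exp\{\sum_{i=1}^m\lambda\cdot\zeta_i\}\big)}{\big(\mathbf{E}_\pi\exp\{\lambda\cdot\zeta_1\}\big)^m},$$
where $\mathrm{co}(\cdot)$ stands for 'convex hull' and $\lambda\in\mathbb{R}^d$ is chosen so that
$$\frac{\mathbf{E}_\pi\big(\zeta_1\exp\{\lambda\cdot\zeta_1\}\big)}{\mathbf{E}_\pi\exp\{\lambda\cdot\zeta_1\}} = x.$$
For $l\in\mathbb{N}$ we denote plain block averages by
$$\bar\zeta_l := \frac1l\sum_{j=1}^l\zeta_j.$$
Finally, let $b:[0,1]\to\mathbb{R}$ be a fixed piecewise continuous function. We define the block averages weighted by $b$,
$$\langle b,\zeta\rangle_l := \frac1l\sum_{j=0}^l b(j/l)\,\zeta_j, \qquad \langle b,\xi\rangle_l := \frac1l\sum_{j=0}^l b(j/l)\,\xi_j.$$
The following lemma relies on elementary probability arguments: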
Lemma 8 (Microcanonical exponential moments of block averages). There exists a constant $C<\infty$, depending only on $m$, on the joint distribution of $(\xi_i,\zeta_i)$ and on the function $b$, such that the following bounds hold uniformly in $l\in\mathbb{N}$ and $x\in(\mathrm{Ran}(\zeta)+\dots+\mathrm{Ran}(\zeta))/l$:
(i) If $\int_0^1b(s)\,ds = 0$, then
$$\mathbf{E}_\pi\Big(\exp\big\{q\sqrt l\,\langle b,\xi\rangle_l\big\}\,\Big|\,\bar\zeta_l = x\Big) \le \exp\{C(q^2+q/\sqrt l)\}. \tag{9.12}$$
(ii) If $\int_0^1b(s)\,ds = 1$, then
$$\mathbf{E}_\pi\Big(\exp\big\{q\sqrt l\,\big(\langle b,\xi\rangle_l - \Xi(\langle b,\zeta\rangle_l)\big)\big\}\,\Big|\,\bar\zeta_l = x\Big) \le \exp\{C(q^2+q/\sqrt l)\}. \tag{9.13}$$
Proof. We prove the lemma with $m = 1$, that is with $(\xi_i)_{i=1}^l$ independent rather than $m$-dependent. The $m$-dependent case follows by applying Jensen's inequality in a rather straightforward way.
(i) In order to simplify the argument we make the assumption that the function $s\mapsto b(s)$ is odd:
$$b(1-s) = -b(s). \tag{9.14}$$
The same argument works if the function $s\mapsto b(s)$ can be rearranged (by permutation of finitely many subintervals of $[0,1]$) into a piecewise continuous odd function. This case is sufficient for our purposes. The proof of the fully general case, which goes through induction on $l$, is more tedious and it is left as an exercise for the reader. Assuming (9.14) we have
$$\sqrt l\,\langle b,\xi\rangle_l = l^{-1/2}\sum_{j=0}^{[l/2]}b(j/l)\,(\xi_j-\xi_{l-j}),$$
and hence
$$\begin{aligned}
\mathbf{E}_\pi\Big(\exp\big\{q\sqrt l\,\langle b,\xi\rangle_l\big\}\,\Big|\,\bar\zeta_l = x\Big)
&= \mathbf{E}_\pi\Big(\mathbf{E}_\pi\Big(\exp\big\{q\sqrt l\,\langle b,\xi\rangle_l\big\}\,\Big|\,\zeta_j+\zeta_{l-j}: j = 0,\dots,[l/2]\Big)\,\Big|\,\bar\zeta_l = x\Big)\\
&= \mathbf{E}_\pi\Big(\prod_{j=0}^{[l/2]}\mathbf{E}_\pi\Big(\exp\big\{q\,l^{-1/2}b(j/l)(\xi_j-\xi_{l-j})\big\}\,\Big|\,\zeta_j+\zeta_{l-j}\Big)\,\Big|\,\bar\zeta_l = x\Big)\\
&\le \exp\Big\{Cq^2\sum_{j=1}^{[l/2]}l^{-1}b(j/l)^2\Big\} \le \exp\{Cq^2\}.
\end{aligned}$$
In the second step we use the fact that the pairs $(\xi_j,\xi_{l-j})$, $j = 0,\dots,[l/2]$, are independent, given $\zeta_j+\zeta_{l-j}$, $j = 0,\dots,[l/2]$. In the third step we note that the variables $\xi_j$ are bounded and $\mathbf{E}\big(\xi_j-\xi_{l-j}\,\big|\,\zeta_j+\zeta_{l-j}\big) = 0$.
(ii) Beside $\Xi(x)$ we also introduce the functions
$$\Xi_l:\big(\mathrm{Ran}(\zeta)+\dots+\mathrm{Ran}(\zeta)\big)/l\to\mathbb{R}, \qquad \Xi_l(x) := \mathbf{E}\big(\xi_1\,\big|\,\bar\zeta_l = x\big).$$
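We shall exploit the following facts:
(1) The functions $\Xi(x)$ and $\Xi_l(x)$ are uniformly bounded. This follows from the boundedness of $\xi_j$.
(2) The function $x\mapsto\Xi(x)$ is smooth with bounded first two derivatives. This follows from direct computations.
(3) There exists a finite constant $C$ such that $|\Xi_l(x)-\Xi(x)|\le Cl^{-1}$. This follows from the so-called equivalence of ensembles (see e.g. Appendix 2 of [10]).
We write
$$\langle b,\xi\rangle_l - \Xi(\langle b,\zeta\rangle_l) = \big(\langle b,\xi\rangle_l - \bar\xi_l\big) + \big(\bar\xi_l - \Xi_l(\bar\zeta_l)\big) + \big(\Xi_l(\bar\zeta_l) - \Xi(\bar\zeta_l)\big) + \big(\Xi(\bar\zeta_l) - \Xi(\langle b,\zeta\rangle_l)\big). \tag{9.15}$$
By applying Jensen's inequality we conclude that it is enough to bound the exponential moments of type (9.13) separately for the four terms. Bounding the first and last terms reduces directly to (9.12), and the third term is uniformly $O(l^{-1})$, so we only have to bound the exponential moments of the second term in (9.15). This is done by induction on $l$. Let $C(l)$ be the best constant such that for any $q\in\mathbb{R}$,
$$\mathbf{E}_\pi\Big(\exp\big\{q\sqrt l\,\big(\bar\xi_l - \Xi_l(\bar\zeta_l)\big)\big\}\,\Big|\,\bar\zeta_l = x\Big) \le \exp\{C(l)\,q^2\}.$$
Clearly, $C(1)<\infty$. We prove that $C(l)$ stays bounded as $l\to\infty$. The following identity holds:
$$\begin{aligned}
&\mathbf{E}_\pi\Big(\exp\big\{q\sqrt{l+1}\,\big(\bar\xi_{l+1} - \Xi_{l+1}(\bar\zeta_{l+1})\big)\big\}\,\Big|\,\bar\zeta_{l+1} = x\Big)\\
&\quad= \mathbf{E}_\pi\Big(\mathbf{E}_\pi\Big(\exp\big\{q\sqrt{l+1}\,\big(\bar\xi_{l+1} - \Xi_{l+1}(\bar\zeta_{l+1})\big)\big\}\,\Big|\,\bar\zeta_l,\zeta_{l+1}\Big)\,\Big|\,\bar\zeta_{l+1} = x\Big)\\
&\quad= \mathbf{E}_\pi\Big(\mathbf{E}_\pi\Big(\exp\Big\{\frac{ql}{\sqrt{l+1}}\big(\bar\xi_l - \Xi_l(\bar\zeta_l)\big)\Big\}\,\Big|\,\bar\zeta_l\Big)\times\mathbf{E}_\pi\Big(\exp\Big\{\frac{q}{\sqrt{l+1}}\big(\xi_{l+1} - \Xi_1(\zeta_{l+1})\big)\Big\}\,\Big|\,\zeta_{l+1}\Big)\\
&\qquad\qquad\times\exp\Big\{\frac{ql}{\sqrt{l+1}}\big(\Xi_l(\bar\zeta_l) - \Xi_{l+1}(\bar\zeta_{l+1})\big)\Big\}\times\exp\Big\{\frac{q}{\sqrt{l+1}}\big(\Xi_1(\zeta_{l+1}) - \Xi_{l+1}(\bar\zeta_{l+1})\big)\Big\}\,\Big|\,\bar\zeta_{l+1} = x\Big).
\end{aligned}$$
The terms
$$\xi_{l+1} - \Xi_1(\zeta_{l+1}), \qquad l\,\big(\Xi_l(\bar\zeta_l) - \Xi_{l+1}(\bar\zeta_{l+1})\big), \qquad \Xi_1(\zeta_{l+1}) - \Xi_{l+1}(\bar\zeta_{l+1})$$
are uniformly bounded and
$$\mathbf{E}_\pi\big(\xi_{l+1} - \Xi_1(\zeta_{l+1})\,\big|\,\zeta_{l+1}\big) = 0, \quad \mathbf{E}_\pi\big(\Xi_l(\bar\zeta_l) - \Xi_{l+1}(\bar\zeta_{l+1})\,\big|\,\bar\zeta_{l+1}\big) = 0, \quad \mathbf{E}_\pi\big(\Xi_1(\zeta_{l+1}) - \Xi_{l+1}(\bar\zeta_{l+1})\,\big|\,\bar\zeta_{l+1}\big) = 0.$$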
Using the induction hypothesis and the previous arguments, it follows that there exists a finite constant $B$ such that
$$C(l+1) \le \frac{l}{l+1}\,C(l) + \frac{1}{l+1}\,B$$
for every $l\ge1$. Hence, $\limsup_{l\to\infty}C(l)\le B$ and the lemma follows.
Lemma 9 (Microcanonical Gaussian bounds). There exists a $q_0>0$, depending only on $m$, on the joint distribution of $(\xi_i,\zeta_i)$ and on the function $b$, such that the following bounds hold uniformly in $l\in\mathbb{N}$ and $x\in(\mathrm{Ran}(\zeta)+\dots+\mathrm{Ran}(\zeta))/l$:
(i) If $\int_0^1b(s)\,ds = 0$, then
$$\log\mathbf{E}_\pi\Big(\exp\big\{q_0\,l\,\langle b,\xi\rangle_l^2\big\}\,\Big|\,\bar\zeta_l = x\Big) \le 1. \tag{9.16}$$
(ii) If $\int_0^1b(s)\,ds = 1$, then
$$\log\mathbf{E}_\pi\Big(\exp\big\{q_0\,l\,\big(\langle b,\xi\rangle_l - \Xi(\langle b,\zeta\rangle_l)\big)^2\big\}\,\Big|\,\bar\zeta_l = x\Big) \le 1. \tag{9.17}$$
Proof. The bounds (9.16) and (9.17) follow from (9.12), respectively, (9.13) by exponential Gaussian averaging (as in the proof of Lemma 7).
9.3.2. Proof of Proposition 3. (i) In order to prove (6.9) first note that by simple numerical approximation (no probability bounds involved)
$$\Big|\int_{\mathbb{T}}\big|\hat\xi^n - \Xi(\hat\eta^n,\hat\zeta^n)\big|^2(x)\,dx - \frac1n\sum_{j\in\mathbb{T}^n}\big|\hat\xi^n - \Xi(\hat\eta^n,\hat\zeta^n)\big|^2(\tfrac jn)\Big| \le \frac Cl$$
and also that $l^{-1} = o(l^2n^{-1-3\beta-\delta})$. We apply Lemma 4 with
$$\mathcal{V} = \big|\hat\xi^n - \Xi(\hat\eta^n,\hat\zeta^n)\big|^2(0) = \big(\langle a,\xi\rangle_l - \Xi(\langle a,\eta\rangle_l,\langle a,\zeta\rangle_l)\big)^2.$$
We use the bound (9.17) of Lemma 9 with the function $b = a$. Note that $q = q_0l$ can be chosen in (6.8) with a small, but fixed $q_0$. This yields the bound (6.9).
(ii) In order to prove (6.10) we start again with numerical approximation:
$$\Big|\int_{\mathbb{T}}\big|\partial_x\hat\xi^n(x)\big|^2dx - \frac1n\sum_{j\in\mathbb{T}^n}\big|\partial_x\hat\xi^n(\tfrac jn)\big|^2\Big| \le C\,\frac{n^2}{l^3} = o(n^{1-3\beta-\delta}).$$
We apply Lemma 4 with
$$\mathcal{V} = \big|\partial_x\hat\xi^n(0)\big|^2 = \frac{n^2}{l^2}\,\langle a',\xi\rangle_l^2.$$
We use now the bound (9.16) of Lemma 9 with the function $b = a'$. We can choose $q = q_0l^3/n^2$ with a small, but fixed $q_0$ and this yields the bound (6.10).
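The two choices of $q$ can be checked by direct substitution. The sketch below reads the right-hand side of (6.8) in the form $\frac{\aleph l^3}{2q\,n^{1+3\beta+\delta}}\,s^n(t) + \frac tq\max_{N,Z}\log\mathbf{E}^l_{N,Z}\exp\{q\mathcal{V}\}$ and uses that, by Lemma 9, the logarithm is at most 1 for the stated $q$; constants are not tracked.

```latex
% (i): q = q_0 l
\[
\frac{\aleph l^3}{2 q_0 l\, n^{1+3\beta+\delta}}\, s^n(t) + \frac{t}{q_0 l}
 = \frac{l^2}{n^{1+3\beta+\delta}}\Bigl(\frac{\aleph}{2q_0}\, s^n(t)
   + \frac{t}{q_0}\cdot\frac{n^{1+3\beta+\delta}}{l^3}\Bigr)
 \le C\,\frac{l^2}{n^{1+3\beta+\delta}}\bigl(s^n(t) + o(1)\bigr),
\]
% since n^{1+3\beta+\delta}/l^3 = o(1).
% (ii): q = q_0 l^3 / n^2
\[
\frac{\aleph\, n^2}{2 q_0\, n^{1+3\beta+\delta}}\, s^n(t) + \frac{t\, n^2}{q_0 l^3}
 = n^{1-3\beta-\delta}\Bigl(\frac{\aleph}{2q_0}\, s^n(t)
   + \frac{t}{q_0}\cdot\frac{n^{1+3\beta+\delta}}{l^3}\Bigr)
 \le C\, n^{1-3\beta-\delta}\bigl(s^n(t) + o(1)\bigr).
\]
```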
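(iii) Next we prove (6.11). We apply Lemma 4 with
$$\mathcal{V} = \frac{|\partial_x\hat\xi^n(0)|^2}{\hat\eta^n(0)} = \frac{n^2}{l^3}\,\frac{\big(\sum_ka'(k/l)\,\xi_k\big)^2}{\sum_ka(k/l)\,\eta_k} = \frac{n^2}{2l^3}\,\frac{\big(\sum_ka'(k/l)(\xi_k-\xi_{-k})\big)^2}{\sum_ka(k/l)(\eta_k+\eta_{-k})},$$
where in the last equality we use the fact that the weighting function $x\mapsto a(x)$ is even. We will carry out similar computations as in the proofs of Lemmas 8 and 9. We compute the exponential moment $\mathbf{E}^{2l+1}_{N,Z}\exp\{q\mathcal{V}\}$. Let $X$ be a standard Gaussian random variable, which is independent of the other random variables in question, and denote by $\langle\dots\rangle$ averaging with respect to it. We have
$$\begin{aligned}
\mathbf{E}^{2l+1}_{N,Z}\exp\{q\mathcal{V}\}
&= \mathbf{E}^{2l+1}_{N,Z}\exp\Big\{q\,\frac{n^2}{2l^3}\,\frac{\big(\sum_ka'(k/l)(\xi_k-\xi_{-k})\big)^2}{\sum_ka(k/l)(\eta_k+\eta_{-k})}\Big\}\\
&= \mathbf{E}^{2l+1}_{N,Z}\Big\langle\exp\Big\{X\sqrt q\,\frac{n}{l^{3/2}}\,\frac{\sum_ka'(k/l)(\xi_k-\xi_{-k})}{\sqrt{\sum_ka(k/l)(\eta_k+\eta_{-k})}}\Big\}\Big\rangle\\
&= \mathbf{E}^{2l+1}_{N,Z}\Big(\mathbf{E}^{2l+1}_{N,Z}\Big(\Big\langle\exp\Big\{X\sqrt q\,\frac{n}{l^{3/2}}\,\frac{\sum_ka'(k/l)(\xi_k-\xi_{-k})}{\sqrt{\sum_ka(k/l)(\eta_k+\eta_{-k})}}\Big\}\Big\rangle\,\Big|\,(\eta_k+\eta_{-k})_{k=0}^l\Big)\Big)\\
&\le \mathbf{E}^{2l+1}_{N,Z}\Big\langle\exp\Big\{CX^2q\,\frac{n^2}{l^3}\,\frac{\sum_ka'(k/l)^2(\eta_k+\eta_{-k})}{\sum_ka(k/l)(\eta_k+\eta_{-k})}\Big\}\Big\rangle \le \Big\langle\exp\Big\{CX^2q\,\frac{n^2}{l^3}\Big\}\Big\rangle,
\end{aligned}$$
where we used the facts that the random variables $\eta_k$ are non-negative, $\Omega$ is finite and $\eta(\omega) = 0$ implies $\xi(\omega) = 0$. In the last step we used the inequality $a'(x)^2\le Ca(x)$, which follows from the conditions on $a(x)$, see Subsect. 4.3. From this bound it follows that in Lemma 4 we can choose $q = q_0l^3/n^2$, with a small but fixed $q_0$, and hence the bound (6.11) follows.
10. Appendix
10.1. Some details about the PDE (1.1). Hyperbolicity: One has to analyze the Jacobian matrix
$$D := \begin{pmatrix}(\rho u)_\rho & (\rho u)_u\\ (\rho+\gamma u^2)_\rho & (\rho+\gamma u^2)_u\end{pmatrix} = \begin{pmatrix}u & \rho\\ 1 & 2\gamma u\end{pmatrix}.$$
The eigenvalues with the corresponding right and left eigenvectors are:
$$Dr = \lambda r, \qquad Ds = \mu s, \qquad l^\dagger D = \lambda l^\dagger, \qquad m^\dagger D = \mu m^\dagger$$
($v^\dagger$ stands for the transpose of the column 2-vector $v$). The eigenvalues and eigenvectors are
$$\lambda = \frac12\Big\{\sqrt{(2\gamma-1)^2u^2+4\rho} + (2\gamma+1)u\Big\}, \qquad \mu = -\frac12\Big\{\sqrt{(2\gamma-1)^2u^2+4\rho} - (2\gamma+1)u\Big\}
$$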
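and
$$r^\dagger = \Big(\tfrac12\Big\{\sqrt{(2\gamma-1)^2u^2+4\rho} - (2\gamma-1)u\Big\},\ 1\Big), \qquad s^\dagger = \Big(-\tfrac12\Big\{\sqrt{(2\gamma-1)^2u^2+4\rho} + (2\gamma-1)u\Big\},\ 1\Big),$$
$$l^\dagger = \Big(1,\ \tfrac12\Big\{\sqrt{(2\gamma-1)^2u^2+4\rho} + (2\gamma-1)u\Big\}\Big), \qquad m^\dagger = \Big(1,\ -\tfrac12\Big\{\sqrt{(2\gamma-1)^2u^2+4\rho} - (2\gamma-1)u\Big\}\Big).$$
We can conclude that the pde (1.1) is (strictly) hyperbolic in the domain
$$\gamma\ne1/2:\ \ \{(\rho,u)\in\mathbb{R}_+\times\mathbb{R} : (\rho,u)\ne(0,0)\}, \qquad \gamma = 1/2:\ \ \{(\rho,u)\in\mathbb{R}_+\times\mathbb{R} : \rho\ne0\}.$$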
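For completeness, the eigenvalues and the hyperbolicity domain can be read off from the characteristic polynomial of $D$. A short derivation, assuming only the Jacobian $D$ with rows $(u,\rho)$ and $(1,2\gamma u)$:

```latex
% D = ( u  rho ; 1  2 gamma u ):  tr D = (2\gamma+1)u,  det D = 2\gamma u^2 - \rho.
\[
\det(D - \sigma\,\mathrm{Id}) = \sigma^2 - (2\gamma+1)u\,\sigma + 2\gamma u^2 - \rho ,
\]
\[
(\operatorname{tr} D)^2 - 4\det D = (2\gamma+1)^2u^2 - 8\gamma u^2 + 4\rho
  = (2\gamma-1)^2u^2 + 4\rho ,
\]
\[
\sigma_\pm = \tfrac12\Bigl\{(2\gamma+1)u \pm \sqrt{(2\gamma-1)^2u^2 + 4\rho}\Bigr\},
\qquad \lambda = \sigma_+,\quad \mu = \sigma_- .
\]
% On \rho \ge 0 the two eigenvalues coincide exactly when the discriminant
% vanishes, i.e. when \rho = 0 and (2\gamma-1)u = 0: the single point (0,0)
% if \gamma \ne 1/2, the whole boundary line \rho = 0 if \gamma = 1/2.
```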
Riemann invariants: The Riemann invariants w = w(ρ, u), z = z(ρ, u) of the pde are given by the relations (wρ , wu ) · s = 0 = (zρ , zu ) · r. That is, the level lines w = const., respectively z = const. are determined by the ordinary differential equations ' 1 dρ (2γ − 1)2 u2 + 4ρ ± (2γ − 1)u . =∓ du 2 In our case the Riemann invariants can be found explicitly. For γ = 3/4 we get w(ρ, u) = 2γ −1' ' 2γ −2 2 2 (2γ − 1) u + 4ρ + (2γ − 1)u (2γ − 1)2 u2 + 4ρ − (2γ − 2)u , F z(ρ, u) = 2γ −1' ' 2γ −2 2 2 2 2 (2γ − 1) u + 4ρ − (2γ − 1)u (2γ − 1) u + 4ρ + (2γ − 2)u , F where F : R → R is an appropriately chosen bijection (recall that only the level sets of the Riemann invariants are determined). Note that due to the changes of sign of 2γ − 1 and 2γ − 2, the above expression gives rise to qualitatively different behavior of the Riemann invariants. The topology of the picture changes at the critical values γ = 1/2, γ = 3/4 and γ = 1. In Fig. 1 we present the qualitative picture of the level lines of w(ρ, u) and z(ρ, u) for 3/4 < γ < 1, and γ > 1, respectively. In all cases the Riemann invariants satisfy the convexity conditions wρρ wu2 − 2wρu wρ wu + wuu wρ2 ≥ 0, zρρ zu2 − 2zρu zρ zu + zuu zρ2 ≥ 0,
(10.1)
in R+ × R for all γ. (We have to choose the sign of the function F(·) appropriately.) The inequalities are strict in the interior of R+ × R, except for the γ = 1 case, when these expressions identically vanish. These conditions are equivalent to saying that the level sets {(ρ, u) ∈ [0, ∞) × (−∞, ∞) : w(ρ, u) < c} and {(ρ, u) ∈ [0, ∞) × (−∞, ∞) : z(ρ, u) < c} are convex. See [11, 12] or [19] for the importance of these convexity conditions. It is of crucial importance for our problem that the level curves w(ρ, u) = w = const. expressed as u → ρ(u, w) are convex for γ < 1, linear for γ = 1 and concave for γ > 1.

Fig. 1. Level lines of Riemann-invariants in the (u, ρ) plane: (a) 3/4 < γ < 1, (b) 1 < γ

Genuine nonlinearity: Genuine nonlinearity holds if and only if (λρ, λu) · r ≠ 0 ≠ (µρ, µu) · s in the interior of the domain R+ × R. Elementary computations show that
\[
\left.\begin{matrix}(\lambda_\rho,\lambda_u)\cdot r = 0\\ (\mu_\rho,\mu_u)\cdot s = 0\end{matrix}\right\}
\iff
\rho = -\frac{4\gamma(2\gamma-1)^2}{(\gamma+1)^2}\,u^2
\ \text{ and }\
\left\{\begin{matrix}u\le 0\\ u\ge 0\end{matrix}\right. .
\tag{10.2}
\]
Thus, for γ ≥ 0, γ ≠ 0, 1/2 the system is genuinely nonlinear on the closed domain R+ × R; for γ = 0, 1/2 it is genuinely nonlinear in the interior of R+ × R (with genuine nonlinearity marginally lost on the boundary, ρ = 0). For γ < 0 genuine nonlinearity is lost in the interior of R+ × R.

Lax entropies and entropy solutions: Lax entropies of the pde (1.1) are solutions of the linear hyperbolic partial differential equation
\[
\rho S_{\rho\rho} + (2\gamma-1)uS_{\rho u} - S_{uu} = 0.
\]
It turns out that the system is sufficiently rich in Lax entropies. In particular a globally convex Lax entropy in R+ × R is
\[
S(\rho,u) = \rho\log\rho + \frac{u^2}{2}.
\tag{10.3}
\]
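Two of the statements above lend themselves to a quick finite-difference sanity check: that S(ρ, u) = ρ log ρ + u²/2 solves the Lax entropy equation, and that w is constant along the characteristic direction s. The following is a minimal sketch under stated assumptions, not the authors' computation: F is taken to be the identity, s = (−½{√((2γ−1)²u² + 4ρ) + (2γ−1)u}, 1) is the right eigenvector belonging to µ, the sample values of γ are chosen so that both bases in w are positive, and all function names are hypothetical.

```python
import math


def lax_entropy(rho, u):
    """S(rho, u) = rho*log(rho) + u**2/2, cf. (10.3)."""
    return rho * math.log(rho) + 0.5 * u * u


def riemann_w(rho, u, gamma):
    """Riemann invariant w with F = identity (an assumption made for this check)."""
    a = 2.0 * gamma - 1.0
    root = math.sqrt(a * a * u * u + 4.0 * rho)
    return (root + a * u) ** a * (root - (a - 1.0) * u) ** (a - 1.0)


def entropy_residual(rho, u, gamma, h=1e-4):
    """Finite-difference value of rho*S_rr + (2*gamma - 1)*u*S_ru - S_uu."""
    S = lax_entropy
    s_rr = (S(rho + h, u) - 2.0 * S(rho, u) + S(rho - h, u)) / h ** 2
    s_uu = (S(rho, u + h) - 2.0 * S(rho, u) + S(rho, u - h)) / h ** 2
    s_ru = (S(rho + h, u + h) - S(rho + h, u - h)
            - S(rho - h, u + h) + S(rho - h, u - h)) / (4.0 * h ** 2)
    return rho * s_rr + (2.0 * gamma - 1.0) * u * s_ru - s_uu


def w_derivative_along_s(rho, u, gamma, h=1e-6):
    """Finite-difference value of (w_rho, w_u) . s, which should vanish."""
    a = 2.0 * gamma - 1.0
    root = math.sqrt(a * a * u * u + 4.0 * rho)
    s1 = -0.5 * (root + a * u)  # first component of s; the second one is 1
    w_r = (riemann_w(rho + h, u, gamma) - riemann_w(rho - h, u, gamma)) / (2.0 * h)
    w_u = (riemann_w(rho, u + h, gamma) - riemann_w(rho, u - h, gamma)) / (2.0 * h)
    return w_r * s1 + w_u


if __name__ == "__main__":
    print(entropy_residual(1.3, -0.7, 0.9))
    for gamma in (0.9, 1.0, 1.5):
        print(gamma, w_derivative_along_s(1.3, -0.7, gamma))
```

Both quantities are zero up to discretisation and rounding error; the check says nothing about the choice of F or about the convexity conditions (10.1).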
The Maximum Principle and positively invariant domains: For γ ≥ 0 our systems satisfy the conditions of Lax’s Maximum Principle proved in [11], namely: (i) they do possess a globally strictly convex Lax entropy bounded from below, see (10.3); (ii) the Riemann invariants w(ρ, u) and z(ρ, u) satisfy the convexity condition (10.1); (iii) they are genuinely nonlinear in the interior of D, see (10.2).
Therefore, convex domains bounded by level curves of w(ρ, u) and z(ρ, u) are positively invariant for entropy solutions. First we conclude that D itself is a positively invariant domain, as it should be. Second, a very essential difference between the cases γ < 1, γ = 1 and γ > 1 may be observed, which is of crucial importance for the main result of the present paper. In the case γ < 1 all convex domains bounded by level curves of the Riemann invariants are unbounded (non-compact) and thus there is no a priori bound on the solutions. Even starting with smooth initial data with compact support, nothing prevents the entropy solutions from blowing up indefinitely after the appearance of the shocks. On the other hand, if γ ≥ 1 any compact subset of D is contained in a compact convex domain bounded by level sets of the Riemann invariants, a fact which yields a priori bounds on the entropy solutions, given bounded initial data.
10.2. Construction of the cutoff. We start with the construction of some entropy/flux pairs S(ρ, u), F(ρ, u) for the unscaled Euler equation (2.17). These are the solutions of the system of pde-s
\[
F_\rho = \Phi_\rho S_\rho + \Psi_\rho S_u, \qquad F_u = \Phi_u S_\rho + \Psi_u S_u,
\tag{10.4}
\]
defined on D. In particular the Lax entropy S(ρ, u) solves the pde
\[
\Phi_u S_{\rho\rho} + (\Psi_u - \Phi_\rho)S_{\rho u} - \Psi_\rho S_{uu} = 0.
\tag{10.5}
\]
The linear pde (10.5) is hyperbolic in D. One family of its characteristic curves consists of the solutions of the following ODE, meant in the domain D:
\[
\frac{d\rho}{du} = \frac{\sqrt{(\Psi_u-\Phi_\rho)^2 + 4\Psi_\rho\Phi_u} - (\Psi_u-\Phi_\rho)}{2\Psi_\rho},
\tag{10.6}
\]
the other family is obtained by reflecting u to −u. The characteristic curves are the same as the level lines of Riemann invariants for the pde (2.17). First we conclude that the line segment D ∩ {u = 0} is not characteristic for the hyperbolic pde (10.5). That is: it intersects transversally the characteristic lines defined by the differential equation (10.6). Indeed, from the Onsager relation (2.16) and obvious parity considerations it follows that the right-hand side of (10.6) restricted to {u = 0} becomes (Var_{r,0}(η)/Var_{r,0}(ζ))^{1/2}, and this expression is obviously finite for r ∈ (0, ρ*). It follows that the Cauchy problem (10.5), with the following initial condition:
\[
S(r,0) = s(r), \qquad S_u(r,0) = 0, \qquad r\in[0,\rho^*)
\tag{10.7}
\]
is well posed. In our concrete problem the function s(r) will be chosen as follows: we fix 0 < r̲ < r̄ < ρ*, and define
\[
s(r) = \begin{cases}
0 & \text{if } r\in[0,\underline{r}),\\[4pt]
\dfrac{r\log(r/\underline{r}) - (r-\underline{r})}{\log(\overline{r}/\underline{r})} & \text{if } r\in[\underline{r},\overline{r}),\\[8pt]
r - \dfrac{\overline{r}-\underline{r}}{\log(\overline{r}/\underline{r})} & \text{if } r\in[\overline{r},\infty).
\end{cases}
\tag{10.8}
\]
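The piecewise profile (10.8) is easy to mistype, so a small numerical sketch may help. It assumes the reading of (10.8) in which the middle branch is [r log(r/r̲) − (r − r̲)]/log(r̄/r̲) and the last branch is r − (r̄ − r̲)/log(r̄/r̲); the names `cutoff_profile`, `r_low` and `r_high` are hypothetical stand-ins for s, r̲ and r̄.

```python
import math


def cutoff_profile(r, r_low, r_high):
    """Initial profile s(r) of (10.8) with thresholds 0 < r_low < r_high."""
    if not 0.0 < r_low < r_high:
        raise ValueError("need 0 < r_low < r_high")
    log_ratio = math.log(r_high / r_low)
    if r < r_low:
        return 0.0
    if r < r_high:
        return (r * math.log(r / r_low) - (r - r_low)) / log_ratio
    return r - (r_high - r_low) / log_ratio


if __name__ == "__main__":
    r_low, r_high, eps = 0.5, 2.0, 1e-7
    # The two branches should match at r_high (continuity of s).
    print(cutoff_profile(r_high - eps, r_low, r_high),
          cutoff_profile(r_high + eps, r_low, r_high))
    # One-sided slopes at r_high should both be close to 1 (continuity of s').
    left = (cutoff_profile(r_high, r_low, r_high)
            - cutoff_profile(r_high - eps, r_low, r_high)) / eps
    right = (cutoff_profile(r_high + eps, r_low, r_high)
             - cutoff_profile(r_high, r_low, r_high)) / eps
    print(left, right)
```

With this reading the derivative of the middle branch is log(r/r̲)/log(r̄/r̲), which rises from 0 at r̲ to 1 at r̄, so both s and s′ match across the two break points.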
Fig. 2. D1, D2, D3
Note that s(r) and s′(r) are continuous. Due to the assumption (H) imposed, and regularity of the flux functions Φ and Ψ, there exists some ρ0 > 0 such that the ODE (10.6) is regular in {(ρ, u) ∈ D : ρ < ρ0 and (ρ, u) ≠ (0, 0)}. We shall not be concerned about what happens outside this strip. Denote by σ(u; r) the solution of the ODE (10.6) with initial condition σ(0; r) = r. For small enough r0 > 0 we can partition the domain D in three parts for any 0 < r < r0 as follows:

D1(r) := {(ρ, u) ∈ D : ρ < σ(−|u|; r)},
D2(r) := {(ρ, u) ∈ D : ρ > σ(|u|; r)},
D3(r) := D \ (D1(r) ∪ D2(r)) = {(ρ, u) ∈ D : σ(−|u|; r) ≤ ρ ≤ σ(|u|; r)}.

See Fig. 2 for a sketch of the domains D1(r), D2(r), D3(r). From now on r0 is fixed forever and we denote D̃ := D1(r0). This domain is a rectangle in characteristic coordinates with diagonal D̃ ∩ {u = 0}, as opposed to D which may not be a full characteristic rectangle. (Actually, choosing the characteristic coordinates in a natural symmetric way, z(ρ, u) = w(ρ, −u), the domain D̃ is a square in characteristic coordinates.)

Next we turn to the construction of a particular family of Lax entropies which will serve for obtaining the cutoff functions needed. We fix 0 < r̲ < r̄ < r0 and define S : D → R as follows:

(i) In D̃: S(ρ, u) is a solution of the Cauchy problem (10.5)+(10.7) with s(r) given in (10.8). Note that
\[
S(\rho,u) = \begin{cases}
0 & \text{if } (\rho,u)\in D_1(\underline{r})\subset\widetilde{D},\\[4pt]
\rho - \dfrac{\overline{r}-\underline{r}}{\log(\overline{r}/\underline{r})} & \text{if } (\rho,u)\in D_2(\overline{r})\cap\widetilde{D}.
\end{cases}
\tag{10.9}
\]
(ii) In D2(r̄):
\[
S(\rho,u) := \rho - \frac{\overline{r}-\underline{r}}{\log(\overline{r}/\underline{r})}, \qquad \text{if } (\rho,u)\in D_2(\overline{r}).
\tag{10.10}
\]
Note that there is no contradiction: in D̃ ∩ D2(r̄), (i) yields the same expression.

(iii) In D3(r̄) \ D̃: S(ρ, u) is defined as a solution of the Goursat problem for (10.5) with boundary conditions on the characteristic lines ∂D̃ ∩ D3(r̄), respectively ∂D2(r̄) \ D̃, provided by (i), respectively (ii).

Note that S(ρ, u) is a solution of the pde (10.5), globally in D, and thus there exists a flux function F(ρ, u) which together with S(ρ, u) satisfies (10.4). Now we are ready to define the scaled functions S^n(ρ, u), F^n(ρ, u) on the scaled domain D^n given in (5.2), as follows: fix 0 < r̲ < r̄ < ∞ and define the unscaled Lax entropy/flux pair as done in (i)–(iii), but with downscaled initial conditions
\[
S(r,0) = n^{-2\beta}s(n^{2\beta}r), \qquad S_u(r,0) = 0, \qquad r\in[0,\rho^*),
\tag{10.11}
\]
with the function r → s(r) given in (10.8). Now, define the pair of scaled functions S^n, F^n : D^n → R as
\[
S^n(\rho,u) := n^{2\beta}S(n^{-2\beta}\rho, n^{-\beta}u), \qquad
F^n(\rho,u) := n^{3\beta}F(n^{-2\beta}\rho, n^{-\beta}u).
\tag{10.12}
\]
It is straightforward to check that S^n, F^n form a Lax entropy/flux pair of the pde (5.8):
\[
F^n_\rho = \Phi^n_\rho S^n_\rho + \Psi^n_\rho S^n_u, \qquad
F^n_u = \Phi^n_u S^n_\rho + \Psi^n_u S^n_u,
\]
in particular S^n solves the pde (5.10). If γ > 1 then for any fixed M > 0, ε > 0, choosing r̲ and r̄/r̲ large enough, this choice of S^n, F^n will satisfy the bounds (5.11)–(5.16) of Lemma 2. The spelled out proof can be found in [25].

Remark. As we mentioned in the remarks after Theorem 1, our result also holds for the {−1, 0, +1}-model, even though we have γ = 1 in that case. For this model we have Φ(ρ, u) = ρu, Ψ(ρ, u) = ρ + u², which yields Φ^n(ρ, u) = Φ(ρ, u), Ψ^n(ρ, u) = Ψ(ρ, u). Thus our cutoff function does not depend on n and it has to satisfy ρS_ρρ + uS_ρu − S_uu = 0. This pde is explicitly solvable with initial conditions (10.7), (10.8), and it is not hard to check that the bounds (5.11)–(5.16) are indeed satisfied.

Acknowledgement. It is our pleasure to thank József Fritz for the many discussions on the content of this paper, his permanent interest and encouragement. We also thank Peter Lax for a very inspiring consultation on hyperbolic conservation laws. The kind hospitality of Institut Henri Poincaré (Paris) and that of the Isaac Newton Institute (Cambridge), where parts of this work were completed, is gratefully acknowledged. The research work of the authors was partially supported by the Hungarian Scientific Research Fund (OTKA) grant no. T037685. The second author was also partially supported by OTKA grant no. TS40719.
References
1. Balázs, M.: Growth fluctuations in interface models. Ann. de l'Inst. H. Poincaré, Prob. et Stat. 39, 639–685 (2003)
2. Cocozza, C.: Processus des misanthropes. Zeits. für Wahr. und verwandte Geb. 70, 509–523 (1985)
3. Evans, L.C.: Partial Differential Equations. Graduate Studies in Mathematics 19, Providence, RI: AMS, 1998
4. Fritz, J.: An Introduction to the Theory of Hydrodynamic Limits. Lectures in Mathematical Sciences 18. Graduate School of Mathematics, Univ. Tokyo, 2001
5. Fritz, J.: Entropy pairs and compensated compactness for weakly asymmetric systems. Adv. Stud. Pure Math. 39, 143–171 (2004)
6. Fritz, J., Tóth, B.: Derivation of the Leroux system as the hydrodynamic limit of a two-component lattice gas. Commun. Math. Phys. 249, 1–27 (2004)
7. Garabedian, P.R.: Partial Differential Equations. Chelsea, Providence, RI: AMS, 1998
8. John, F.: Partial Differential Equations. Applied Mathematical Sciences, Vol. 1, New York-Heidelberg-Berlin: Springer, 1971
9. Kardar, M., Parisi, G., Zhang, Y.-C.: Dynamic scaling of growing interfaces. Phys. Rev. Lett. 56, 889–892 (1986)
10. Kipnis, C., Landim, C.: Scaling Limits of Interacting Particle Systems. Berlin-Heidelberg-New York: Springer, 1999
11. Lax, P.: Shock waves and entropy. In: Contributions to Nonlinear Functional Analysis, Zarantonello, E.A. (ed.), New York: Academic Press, 1971, pp. 603–634
12. Lax, P.: Systems of Conservation Laws and the Mathematical Theory of Shock Waves. SIAM, CBMS-NSF 11, 1973
13. Leveque, R.J.: Numerical Methods in Conservation Laws. Lectures in Mathematics, ETH Zürich, Basel: Birkhäuser Verlag, 1990
14. Levine, H.A., Sleeman, B.D.: A system of reaction diffusion equations arising in the theory of reinforced random walks. SIAM J. App. Math. 57, 683–730 (1997)
15. Othmer, H.G., Stevens, A.: Aggregation, blowup, and collapse: the abc's of taxis in reinforced random walks. SIAM J. App. Math. 57, 1044–1081 (1997)
16. Popkov, V., Schütz, G.M.: Shocks and excitation dynamics in driven diffusive two channel systems. J. Stat. Phys. 112, 523–540 (2003)
17. Rascle, M.: On some "viscous" perturbations of quasi-linear first order hyperbolic systems arising in biology. Contemp. Math. 17, 133–142 (1983)
18. Rezakhanlou, F.: Microscopic structure of shocks in one conservation laws. Ann. de l'Inst. H. Poincaré, Anal. Non Linéaire 12, 119–153 (1995)
19. Serre, D.: Systems of Conservation Laws. Vol 1–2. Cambridge: Cambridge University Press, 2000
20. Smoller, J.: Shock Waves and Reaction Diffusion Equations. Second Edition, Berlin-Heidelberg-New York: Springer, 1994
21. Tóth, B., Valkó, B.: Between equilibrium fluctuations and Eulerian scaling. Perturbation of equilibrium for a class of deposition models. J. Stat. Phys. 109, 177–205 (2002)
22. Tóth, B., Valkó, B.: Onsager relations and Eulerian hydrodynamic limit for systems with several conservation laws. J. Stat. Phys. 112, 497–521 (2003)
23. Tóth, B., Werner, W.: The true self-repelling motion. Prob. Theory and Rel. Fields 111, 375–452 (1998)
24. Tóth, B., Werner, W.: Hydrodynamic equation for a deposition model. In: In and out of equilibrium. Probability with a physics flavor. Sidoravicius, V. (ed.), Progress in Probability 51, Basel-Boston: Birkhäuser, 2002, pp. 227–248
25. Valkó, B.: Hydrodynamic behavior of hyperbolic two-component systems. PhD Thesis, Institute of Mathematics, Budapest University of Technology and Economics, 2004. Available at: http://www.renyi.hu/~valko
26. Varadhan, S.R.S.: Nonlinear diffusion limit for a system with nearest neighbor interactions II. In: Asymptotic Problems in Probability Theory, Sanda/Kyoto 1990, Harlow: Longman, 1993, pp. 75–128
27. Yau, H.T.: Relative entropy and hydrodynamics of Ginzburg-Landau models. Lett. Math. Phys. 22, 63–80 (1991)
28. Yau, H.T.: Logarithmic Sobolev inequality for generalized simple exclusion processes. Prob. Theory and Rel. Fields 109, 507–538 (1997)
29. Yau, H.T.: Scaling limit of particle systems, incompressible Navier-Stokes equations and Boltzmann equation. In: Proceedings of the International Congress of Mathematics, Berlin 1998, Vol 3, Basel-Boston: Birkhäuser, 1999, pp. 193–205

Communicated by H.-T. Yau
Commun. Math. Phys. 256, 159–180 (2005) Digital Object Identifier (DOI) 10.1007/s00220-005-1315-8
Communications in
Mathematical Physics
Edge and Impurity Effects on Quantization of Hall Currents
Jean-Michel Combes1,2, François Germinet3
1 Département de Mathématiques, Université de Toulon-Var, BP 132, 83957 La Garde Cédex, France. E-mail: [email protected]
2 Centre de Physique Théorique, Luminy, Case 907, 13288 Marseille Cédex 9, France. E-mail: [email protected]
3 Laboratoire AGM, Département de Mathématiques, Université de Cergy-Pontoise, BP 222, 95302 Cergy-Pontoise Cédex, France. E-mail: [email protected]
Received: 9 January 2004 / Accepted: 15 October 2004 Published online: 15 March 2005 – © Springer-Verlag 2005
Abstract: We consider the edge Hall conductance and show it is invariant under perturbations located in a strip along the edge (decaying perturbations far from the edge are also allowed). This enables us to prove for the edge conductances a general sum rule relating currents due to the presence of two different media located respectively on the left and on the right half plane. As a particular interesting case we put forward a general quantization formula for the difference of edge Hall conductances in semi-infinite samples with and without a confining wall. It implies in particular that the edge Hall conductance takes its ideal quantized value under a gap condition for the bulk Hamiltonian, or under some localization properties for a random bulk Hamiltonian (provided one first regularizes the conductance; we shall discuss this regularization issue). Our quantization formula also shows that deviations from the ideal value occurs if a semi-infinite distribution of impurity potentials is repulsive enough to produce current-carrying surface states on its boundary. 1. Introduction There has been recently some renewed interest in detailed analysis of edge states occurring in semi-infinite quantum Hall systems, which play a basic role in the analysis of the quantum Hall effect (for a general reference to the QHE, see e.g. [PG]). Such edge states have been proved to carry currents at least in weak disorder regimes [DBP, FGW1, FGW2, FM1, FM2, CHS]. These discussions need to be completed by an analysis of the quantization properties of these currents and of the effect of various types of perturbations, like edge imperfections or random impurities, on these quantized values. The role of edge states in quantization of Hall conductance has been widely discussed since the pioneering work of B.I. Halperin [H] (see e.g. [HT, MDS, B, Th,
UPR 7061 au CNRS UMR 8088 au CNRS
CFGP] and references therein). It has been shown recently in [SBKR, KRSB, EG, Ma] that for discrete Hamiltonians with a magnetic field and under a gap condition of the bulk Hamiltonian the edge theory and the bulk theory can be reconciled and the edge conductance as defined in Definition 1 equals the bulk conductance as given by Kubo’s formula provided the Fermi energy lies in such a gap1 . Let us recall that the bulk conductance has received an interpretation both as a Chern number [BESB] and as a topological invariant [Ku, AS2], thus providing an explanation for both quantization and robustness of Hall conductance. In the ergodic case and under a gap condition the edge conductance can also be expressed as a Fredholm index [SBKR, KRSB, KSB]. However, as compared to the bulk theory (e.g. [Be, Ku, AS2, BESB, AG, ES, BGKS]) some of the main arguments of the edge theory for the quantum Hall effect have not been given yet a rigourous mathematical status, efficient enough quantitatively to deal with the questions mentioned above. One goal of this paper is to compute the edge conductance in a simple way, independently of a gap assumption, and to study its stability under perturbations. We note that the exact quantization is obtained here without any covariant structure of the Hamiltonians. One of our main results is a general sum rule linking the conductances of the same system with and without the confining edge (Corollary 3). It is obtained as a particular case of Theorem 2 which deals with general left and right media. We shall provide two models with random impurities for which the edge conductance either vanishes or keeps its ideal quantized value N, when the Fermi energy lies between the N th and (N + 1)th Landau levels. The first model is the one of Nakamura and Bellissard [NB] that we adapt to the edge geometry. We recover in a simple way their result but from the “edge” point of view, i.e. we prove the vanishing of the edge conductance. 
As a result this implies the existence of persistent current-carrying states due to the impurity potential alone and living near the boundary of the disordered region; these currents are shown to be quantized as well. The second model is of Anderson type, and we investigate the edge conductance in the regime of localized states, in which case a regularization of the edge conductance is required². We shall discuss this regularization issue, and show that under a suitable condition of localization the regularized edge conductance keeps its ideal quantized value N.
1 While writing the revised version of this paper, we heard of the recent work of A. Elgart, G.M. Graf, J. Schenker [EGS] concerning the equality of the bulk and edge conductances in a mobility gap, namely in a region where one has localized states.
2 The regularization issue is also treated in [EGS] (see Footnote 1).

2. Statements of the General Results

Throughout this paper 1X = 1(x,y) will denote the characteristic function of a unit cube centered at X = (x, y) ∈ Z². If A is a subset of R², then 1A will denote the characteristic function of this set. Moreover 1− and 1+ will stand, respectively, for 1_{x≤0} and 1_{x≥0}. We consider an electron confined to the 2-dimensional plane composed of two complementary semi-infinite regions supporting potentials V1 and V2 respectively, and under the influence of a constant magnetic field B orthogonal to the sample. If V1, V2 are two potentials in the Kato class [CFKS] the Hamiltonian of the system is given, in suitable units and Landau gauge, by
\[
H(V_1,V_2) := H_L + V_1 1_- + V_2 1_+,
\tag{2.1}
\]
a self-adjoint operator acting on L²(R², dxdy), where H_L = p_x² + (p_y − Bx)² is the free Landau Hamiltonian. The spectrum of H_L = H(0, 0) consists of the well-known Landau levels B_N = (2N − 1)B, N ≥ 1 (with the convention B_0 = −∞). For technical reasons it is convenient to assume the following control on the growth at infinity of V1, V2: for some uniform constants C, p > 0,
\[
\|1_{(x,y)}V_1\|_\infty \le C\langle x\rangle^p \ \text{ if } x\le 0,
\quad\text{and}\quad
\|1_{(x,y)}V_2\|_\infty \le C\langle x\rangle^p \ \text{ if } x\ge 0.
\tag{2.2}
\]
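Since the results below are stated for energy intervals I = [a, b] lying strictly between two consecutive Landau levels, it can help to see how the index N is read off from the convention B_N = (2N − 1)B, B_0 = −∞. The sketch below is purely illustrative bookkeeping; `landau_level` and `gap_index` are hypothetical helper names, not notation from the paper.

```python
import math


def landau_level(n, B):
    """Landau level B_n = (2n - 1) B for n >= 1, with the convention B_0 = -infinity."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return -math.inf if n == 0 else (2 * n - 1) * B


def gap_index(a, b, B):
    """Return N >= 0 such that B_N < a <= b < B_{N+1}, or None if [a, b] meets a level."""
    if B <= 0 or a > b:
        raise ValueError("need B > 0 and a <= b")
    n = 0
    # Count the Landau levels lying strictly below the left end point a.
    while landau_level(n + 1, B) < a:
        n += 1
    return n if b < landau_level(n + 1, B) else None


if __name__ == "__main__":
    # With B = 1 the levels are 1, 3, 5, ...
    print(gap_index(0.2, 0.8, 1.0))  # interval below the first level
    print(gap_index(1.5, 2.5, 1.0))  # interval between the first and second level
    print(gap_index(2.5, 3.5, 1.0))  # interval containing the level 3
```

In this bookkeeping N simply counts the Landau levels below the interval, which is the integer appearing in the quantization statements of this section.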
For simplicity we further assume that the potentials V1 , V2 are bounded from below, so that H (V1 , V2 ) is a bounded from below self-adjoint operator. We shall say that V1 , resp. V2 , is a (left), resp. (right), confining potential with respect to the interval I = [a, b] ⊂ R if in addition to the previous conditions the following holds: there exists R > 0, s.t. ∀x ≤ −R, ∀y ∈ R, V1 (x, y) > b,
resp. ∀x ≥ R, ∀y ∈ R, V2 (x, y) > b. (2.3)
The “hard wall” case where V1 is infinite and H = HL + V2 acts on L²(R+ × R, dxdy) with Dirichlet boundary condition at x = 0 can also be considered, and our results still hold. As typical examples for H(V1, V2) one may think of the right potential V2 as an impurity potential and of the left potential V1 as either a wall, confining the electron to the right half plane and generating an edge current, or an empty region (V1 = 0), in which case the issue is to determine whether or not V2 is strong enough to create edge currents by itself. Another example is the strip geometry, where both V1 and V2 are confining. Following [SBKR, KRSB, EG, Ma] we adopt the following definition of an edge conductance. Define a “switch” function as a smooth real valued increasing function equal to 1 (resp. 0) at the right (resp. left) of some bounded interval; then

Definition 1. Let X ∈ C∞(R²) be an x-translation invariant switch function with supp X′ ⊂ R × [−1/4, 1/4], and let −g ∈ C∞(R) be a switch function with supp g′ ⊂ I = [a, b] a compact interval. The edge conductance³ of H = H(V1, V2) in the interval I is defined as
\[
\sigma_e(g,H) \equiv \sigma_e(g,V_1,V_2) := -\mathrm{tr}\big(g'(H(V_1,V_2))\,i[H(V_1,V_2),X]\big)
\tag{2.4}
\]
\[
\phantom{\sigma_e(g,H) \equiv \sigma_e(g,V_1,V_2) :}= -\mathrm{tr}\big(g'(H(V_1,V_2))\,i[H_L,X]\big)
\tag{2.5}
\]
whenever the trace is finite (we shall use both expressions σe(g, V1, V2) and σe(g, H(V1, V2))).

Remark 1. Since [HL, X] is relatively H(V1, V2) bounded with relative bound 0, the operator g′(H(V1, V2))i[HL, X] readily extends to a bounded operator on L²(R², dxdy). The only issue is thus the finiteness of the trace. In the strip geometry the trace is always well defined, and is actually zero (Corollary 2). In the one wall case, say V1 is left confining, the situation is very different: if I is in a gap of H(0, V2) then g′(H(V1, V2))i[HL, X] will be shown to be trace class (Corollary 4); but without the gap condition the situation is more delicate, and a regularized version of (2.4) is needed; we shall discuss this point in Sect. 7.

3 As suggested to us by one referee, one could also call σe(g, V1, V2) the interface conductance between potentials V1 and V2. Concerning the physical interpretation of this quantity, see comments below.
Remark 2. In the situations of interest σe(g, V1, V2) will turn out to be independent of the particular shape of the switch function X and also of the switch function g, provided supp g′ does not contain any Landau level. In practice, in this paper, we shall mainly focus on the following two simple situations: (i) the potential V1 plays the role of a potential barrier (soft wall), (ii) V1 = 0 in which case we investigate the influence of the sole impurities potential V2. So in both situations we are interested in the possible existence of edge currents. In cases (i) and (ii) σe(g, H) can be understood in physical terms as follows. Take g to be piecewise linear so that g′ = 0 outside [a, b] and −g′(H) = EH(I)/|b − a|, with EH(I) the spectral projection of H on I = [a, b]. The edge conductance σe(g, H) is then seen as the ratio J(I)/|b − a|, where J(I) = tr(EH(I)i[H, X]) is the total current through the surface y = 0 induced by states with energy support contained in I. We note that in case (i), i.e. the one wall case, J(I) can be interpreted as the total current flowing in a strip whose edges are at different chemical potentials E− = a and E+ = b, as discussed in [SBKR]; this assumes that edges are well-separated to prevent effective tunneling between both edges, so that such a strip can in turn be represented by two copies of one edge (half-plane) Hamiltonian with edge currents flowing in opposite directions (for other discussions about this picture, see e.g. [H, HT, MDS, Th]). Our first result is

Theorem 1. Let H = H(V1, V2) be as in (2.1), and let W be a bounded potential supported in a strip [L1, L2] × R, with −∞ < L1 < L2 < +∞. Then the operator (g′(H + W) − g′(H))i[HL, X] is trace class, and
\[
\mathrm{tr}\big((g'(H+W) - g'(H))\,i[H_L,X]\big) = 0.
\tag{2.6}
\]
As a consequence: (i) σe(g, HL + W) = 0. (ii) Assume V1 is a y-invariant potential, i.e. V1(x, y) = V1(x), that is left confining with respect to I ⊃ supp g′. If I ⊂ ]BN, BN+1[, for some N ≥ 0, then
\[
\sigma_e(g, H_L + V_1 + W) = N.
\tag{2.7}
\]
Remark 3. The hypotheses on the strip geometry of W in Theorem 1 can be relaxed to some extent. It follows from the proof (see bound (4.16)) that a fast enough decaying potential W in the x-direction works as well; for instance sup_{x1} ⟨x1⟩^{k1} ‖W 1(x1,y1)‖ < C⟨y1⟩^{k2} is fine provided k1 is large enough (but k2 can be anything).

That σe(g, V1, 0) = N, with V1 a y-invariant left confining potential, is an easy consequence of the spectral properties of H(V1, 0) (Proposition 1). In this case the current is carried by edge states which are localized within a few cyclotron radii from the edge [DBP, FGW1, FM1, FM2, CHS]. Property (2.6) implies that a bounded perturbation localized in a strip will not affect the total current, but only, possibly, the geometry of its flow. One can imagine in particular that a strongly repulsive W will move all the current carrying states at the right of the strip supporting W. On the other hand, if the potential is small, edge states will survive near x = 0 and will still propagate along the wall V1. As a first corollary of Theorem 1, we note that to a large extent edge conductances do not depend on the confining potential V1 so that irregular confining boundaries are allowed.
Corollary 1. Let V1^(i), i = 1, 2, be two left confining potentials with respect to [a, b] ⊃ supp g′, and Hi := H(V1^(i), V2). If V1^(1) − V1^(2) is supported in a strip, then (g′(H1) − g′(H2))[HL, X] is trace class with trace zero. In particular if one conductance is finite, so is the second one, and σe(g, V1^(1), V2) = σe(g, V1^(2), V2).

Remark 4. Notice that we do not assume that these confining potentials are y-invariant. So if V1^(1) is a y-invariant left confining potential, then any distortion V1^(2) of the boundary that is supported in a strip or, according to Remark 3, that decays fast enough as x → −∞, will leave the edge conductance invariant, i.e. σe(g, V1^(2), 0) = N. However the nature of the spectrum of H1 may change. For instance the proof in [FGW2] of the absolute continuity of the spectrum of H(V1, 0) requires some smoothness of the boundary of the support of V1.

Our second corollary of Theorem 1 investigates the case of the strip geometry.

Corollary 2. Let Ṽ0(x, y) be a left and right confining potential, s.t. Ṽ0(x, y) ≥ v0 > BN+1 if |x| > R, and Ṽ0(x, y) = 0 if |x| ≤ R. Then for any electrostatic bounded potential U(x, y) contained in |x| ≤ R, and any g, with supp g′ ⊂ ]−∞, BN+1[, one has
\[
\sigma_e(g, H_L + \tilde V_0 + U) = 0.
\tag{2.8}
\]
Remark 5. Equation (2.8) states that there is no total current flowing in a strip at equilibrium, even in the presence of an electrostatic field. When U is zero, this result also follows from the spectral analysis of H0 (see e.g. [CHS]) showing that both edges carry opposite currents (if any). Impurities and electrostatic potential just have the effect of modifying the geometry of the flow of edge currents, but in such a way that they always compensate and sum up to zero.

So far we only considered perturbations located in a strip of the type [L1, L2] × R. But what happens when the right boundary L2 of the strip potential is taken to infinity? It is easy to check that Theorem 1 does not extend as it stands. Consider Wℓ = v0 1[0,ℓ](x) and W∞ = v0 1[0,∞[(x), with the constant v0 ≥ BN+1; then Theorem 1 yields σe(g, 0, Wℓ) = 0 for all ℓ > 0 while σe(g, 0, W∞) = −N. In other terms, adding a potential W that does not decay at infinity may dramatically perturb the existence of edge currents. However from Theorem 1 we get that for any bounded potential W supported on a strip [L1, L2] × R, one has
\[
\sigma_e(g,V_1,W) - \sigma_e(g,0,W) = \sigma_e(g,V_1,0) - \sigma_e(g,0,0) = N.
\tag{2.9}
\]
Although Theorem 1 does not extend in the limit L2 → ∞, it turns out that the difference rule (2.9) does. We shall give a rigorous content of this fact in Corollary 3, which is a particular case of our second theorem.

Theorem 2. Let g be s.t. supp g′ ⊂ ]BN, BN+1[ for some N ≥ 0. Then the operator {g′(H(V1, V2)) − g′(H(V1, 0)) − g′(H(0, V2))}i[HL, X] is trace class, and
\[
\mathrm{tr}\big(\{g'(H(V_1,V_2)) - g'(H(V_1,0)) - g'(H(0,V_2))\}\,i[H_L,X]\big) = 0.
\tag{2.10}
\]
In a similar way, let V0 be as in (2.3) a confining potential with respect to supp g′ (left or right depending on where V0 is supported⁴), then the operator {g′(H(V1, V2)) − g′(H(V1, V0)) − g′(H(V0, V2))}i[HL, X] is trace class, and
\[
\mathrm{tr}\big(\{g'(H(V_1,V_2)) - g'(H(V_1,V_0)) - g'(H(V_0,V_2))\}\,i[H_L,X]\big) = 0.
\tag{2.11}
\]
In particular, if traces are separately finite then
\[
\sigma_e(g,V_1,V_2) = \sigma_e(g,V_1,0) + \sigma_e(g,0,V_2)
\tag{2.12}
\]
\[
\phantom{\sigma_e(g,V_1,V_2)} = \sigma_e(g,V_1,V_0) + \sigma_e(g,V_0,V_2).
\tag{2.13}
\]
Remark 6. (i) If supp g′ contains one (or more) Landau levels, then the trace in (2.10) is no longer zero, but is equal to tr(g′(HL)i[HL, X]) = −σe(g, HL) ≠ 0. (ii) If V0 is not confining, then the operator in (2.11) should be replaced by {g′(H(V1, V2)) − g′(H(V1, V0)) − g′(H(V0, V2)) + g′(H(V0, V0))}i[HL, X]. (iii) If V1 is confining or if supp g′ lies in a gap of H(V1, V1) (so that in both cases σe(g, V1, V1) = 0), then it follows from (2.12) that σe(g, V1, 0) = −σe(g, 0, V1).

As an immediate consequence of Theorem 2 we get a quantization rule for the difference of the edge conductances with and without a confining potential V1, that shows that they are simultaneously quantized.

Corollary 3. Let g be s.t. supp g′ ⊂ ]BN, BN+1[, for some N ≥ 0. Let V1 be a y-invariant left confining potential with respect to supp g′ or a perturbation of such a V1 as in Corollary 1. Then the operator {g′(H(V1, V2)) − g′(H(0, V2))}i[HL, X] is trace class and
\[
-\mathrm{tr}\big(\{g'(H(V_1,V_2)) - g'(H(0,V_2))\}\,i[H_L,X]\big) = N.
\tag{2.14}
\]
In particular, if either σe(g, V1, V2) or σe(g, 0, V2) is finite, then both are finite, and
\[
\sigma_e(g,V_1,V_2) - \sigma_e(g,0,V_2) = N.
\tag{2.15}
\]
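For the reader's convenience, here is a sketch of how (2.14) can be read off from Theorem 2, in the case of a y-invariant left confining V1. It only rearranges (2.10) and uses two facts stated elsewhere in the paper: that g′(H(V1, 0))i[HL, X] is trace class, and that σe(g, V1, 0) = N (Proposition 1). It is meant as a reading aid, not as a replacement for the proof.

```latex
\begin{aligned}
-\mathrm{tr}\big(\{g'(H(V_1,V_2)) - g'(H(0,V_2))\}\, i[H_L,X]\big)
  &= -\mathrm{tr}\big(\{g'(H(V_1,V_2)) - g'(H(V_1,0)) - g'(H(0,V_2))\}\, i[H_L,X]\big) \\
  &\quad\; - \mathrm{tr}\big(g'(H(V_1,0))\, i[H_L,X]\big) \\
  &= 0 + \sigma_e(g,V_1,0) \\
  &= N .
\end{aligned}
```

The first equality adds and subtracts the trace class operator g′(H(V1, 0))i[HL, X]; the second uses (2.10) and Definition 1; the third uses the value of the edge conductance in the free case.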
Note that σe(g, 0, V2) ≠ 0 would imply the existence of current carrying states due to the sole impurity potential. Since Corollary 3 would yield σe(g, V1, V2) ≠ N, we see that such “edge currents without edges” are responsible for the deviation of the Hall conductance from its ideal value N. An example of this phenomenon is provided by the model of S. Nakamura and J. Bellissard in [NB] that we shall revisit in Sect. 6. On the other hand, if the potential V2 is not strong enough to close the Landau gaps and if the Fermi level falls into a gap of H(0, V2), then obviously σe(g, 0, V2) = 0, and Corollary 3 immediately gives the exact quantized value of the edge conductance. In particular we recover the fact that the conductance remains constant if one increases the coupling constant while keeping the Fermi level in a gap [AS2, BESB, ES]. We thus have the

Corollary 4. Let g and V1 be as in Corollary 3, N ≥ 0. If supp g′ belongs to a gap of H(0, V2), then σe(g, V1, V2) = N. As a consequence, let λ* > 0 s.t. λ*‖V2‖ < B and g s.t. supp g′ ⊂ ]BN + λ*‖V2‖, BN+1 − λ*‖V2‖[, then
\[
\forall\lambda\in[0,\lambda^*], \qquad \sigma_e(g,V_1,\lambda V_2) = N.
\tag{2.16}
\]
4 Strictly speaking if V0 is left confining, then V0*(x, y) = V0(−x, y) is right confining. With some abuse of notations we still write V0 instead of V0* if we consider the right confining potential.
If now suppg is no longer included in a gap of H (0, V2 ), but in a region of localization, then one expects a regularized version of σe (g, 0, V2 ) to be still zero (the aim of the regularization is to restore the trace class property of g (H (0, V2 ))i[HL , X ] that fails in a region of localization). In this case the analog of Corollary 4 holds for the regularized reg conductances, i.e. σe (g, V1 , λV2 ) = N, thus recovering from the “edge point of view” the bulk picture [BESB, AG]. This regularization issue is the content of Sect. 7. Remark 7. As a by-product we recover a posteriori the equality “bulk-edge” of the conductances for in the context of Corollary 4 the bulk conductance is also known to be equal to N [BESB, AS2]. The plan of the paper is as follows. In Sect. 3 we recall by direct computation that the results stated in (2.7) hold in absence of impurities (free case). In Sect. 4 we prove Theorem 1; we first show a simple invariance property for σe (g, H ) under a perturbation by a compactly supported potential; this invariance property is extended to potentials supported in a strip (or more generally to a decaying potential in the x direction) by Combes-Thomas arguments together with Helffer-Sj¨ostrand functional calculus. In Sect. 5 we prove Theorem 2 on account of Theorem 1. In Sect. 6 we revisit the model of Nakamura and Bellissard [NB] and get an example of a zero edge conductance due to a strongly repulsive potential. Section 7 is devoted to the case where suppg does not lie anymore in a gap, but in a region of localized states. We introduce a regularization and recover the sum rule of Corollary 3 for the regularized edge conductances together with the analog of Corollary 4 in mobility gaps. Appendices A and B contain tools and estimates we shall make use of throughout the paper. 3. Edge Conductance of the Unperturbed Operator The following result is well-known. For the sake of completeness we shall provide a short proof of it. Proposition 1. 
Let $I=[a,b]\subset\,]B_N,B_{N+1}[$ be such that $I\supset\operatorname{supp} g'$. We have
$$\sigma_e(g,0,0)=0. \tag{3.1}$$
Assume that $V_1$ is a left confining potential with respect to $I$. Then the operator $g'(H(V_1,0))\,i[H_L,\mathcal{X}]$ is trace class. If in addition $V_1$ is $y$-invariant, then one has
$$\sigma_e(g,V_1,0)=N. \tag{3.2}$$
Remark 8. In the next section, we will show that (3.2) also holds if the confining potential $V_1$ has imperfections (i.e. $V_1$ may depend on $y$ as well). See Remark 4. Moreover, it actually follows from the proof that one can add to the confining potential $V_1$ any bulk mean electrostatic field $V_2$ depending only on $x$ and vanishing at $+\infty$: one still has $\sigma_e(g,V_1,V_2)=N$.

Remark 9. The same proof with $V_1^*(x):=V_1(-x)$ gives $\sigma_e(g,0,V_1^*)=-N$.

Proof. That $\sigma_e(g,0,0)=0$ is immediate since $\sigma(H_L)\cap I=\emptyset$. We turn to the free edge Hamiltonian $H_0:=H(V_1,0)=H_L+V_1\mathbf{1}_-$. That $g'(H_0)\,i[H_L,\mathcal{X}]$ is trace class follows from the arguments developed in this paper (more precisely those of Sects. 4 and 5), and the proof is sketched in Appendix B, Lemma 5.
We now compute the trace itself. Due to the invariance by translation in the $y$ direction, we perform a partial Fourier transform in the $y$ variable and write
$$H_0 \simeq \int_{\mathbb{R}}^{\oplus} H_0(k)\,dk, \qquad H_0(k) = p_x^2 + (k-Bx)^2 + V_1(x)\mathbf{1}_-. \tag{3.3}$$
We refer to [DBP, FGW1, CHS] for details on this operator. Eigenfunctions of the one-dimensional Hamiltonian $H_0(k)$, $k\in\mathbb{R}$, will be denoted $\xi_{n,k}(x)$, $n=1,2,\cdots$, with eigenvalues $\omega_n(k)$ ordered increasingly. The assumption on $V_1$ at $\pm\infty$ implies that $\omega_n(+\infty)=\lim_{k\to+\infty}\omega_n(k)=(2n+1)B$ and $\omega_n(-\infty)=\lim_{k\to-\infty}\omega_n(k)>b$. It follows that $g(\omega_n(+\infty))=1$ if $n\le N$ and zero if $n>N$, while $g(\omega_n(-\infty))$ is always zero. Generalized eigenfunctions of $H_0$ then read $\varphi_{n,k}(x,y)=e^{iky}\xi_{n,k}(x)$, $n=1,2,\cdots$ and $k\in\mathbb{R}$. Note that from the Feynman-Hellmann formula,
$$\omega_n'(k) = 2\langle \xi_{n,k},(k-Bx)\xi_{n,k}\rangle. \tag{3.4}$$
It follows that (with some abuse of notation we denote again by $\mathcal{X}(y)$ the one-dimensional function equal to $\mathcal{X}(x,y)$ for all $x\in\mathbb{R}$)
\begin{align}
\sigma_e(g,H_0) &= -2\sum_{n\ge 1}\int_{\mathbb{R}} g'(\omega_n(k))\,\langle\varphi_{n,k},(k-Bx)\mathcal{X}'(y)\varphi_{n,k}\rangle\,dk \tag{3.5}\\
&= \sum_{n\ge 1}\bigl(g(\omega_n(+\infty))-g(\omega_n(-\infty))\bigr) = \sum_{n\ge 1} g(\omega_n(+\infty)) = N, \tag{3.6}
\end{align}
where we used in (3.5) that $\int_{\mathbb{R}}\mathcal{X}'(y)\,dy=\mathcal{X}(1)-\mathcal{X}(0)=1$.
4. Perturbation by a Strip Potential

The aim of this section is to prove Theorem 1. But, given Theorem 1, we first show how to get Corollary 2: by Theorem 1, $\sigma_e(g,H_L+\tilde V_0)=\sigma_e(g,H_L+\tilde V_0+v_0\mathbf{1}_{[-R,R]})=0$ (since $g'(H_L+\tilde V_0+v_0\mathbf{1}_{[-R,R]})=0$); applying Theorem 1 a second time gives $\sigma_e(g,H_L+\tilde V_0+U)=\sigma_e(g,H_L+\tilde V_0)=0$.

To prove Theorem 1, we proceed in two steps. First we show that edge conductances are invariant under a perturbation by a bounded and compactly supported potential (Lemma 1); then we extend the result to strip potentials (or decaying potentials in the $x$-direction, as pointed out in Remark 3).

Lemma 1. Let $\Lambda\subset\mathbb{R}^2$ be compact and $W$ a bounded potential supported on $\Lambda$. Let $H$ be as in (2.1). Then $(g'(H+W)-g'(H))\,i[H_L,\mathcal{X}]\in\mathcal{T}_1$ and
$$\operatorname{tr}\bigl((g'(H+W)-g'(H))\,i[H_L,\mathcal{X}]\bigr)=0. \tag{4.1}$$

Proof. To compare the operators $g'(H+W)$ and $g'(H)$, we shall make use of the Helffer-Sjöstrand formula [HeSj, HuSi]. Let $\tilde g_n$ be a quasi-analytic extension of $g$ of order $n\ge 3$ (see Appendix A). Then, writing $R_W^2(z)=(H+W-z)^{-2}$ and $R^2(z)=(H-z)^{-2}$, (A.2) reads
$$g'(H+W)-g'(H) = -\frac{1}{\pi}\int \bar\partial\tilde g_n(u+iv)\,\bigl(R_W^2(z)-R^2(z)\bigr)\,du\,dv, \qquad z=u+iv. \tag{4.2}$$
Note that $\operatorname{Im} z\neq 0$. For further reference recall the second order resolvent equation: if $H_1$ and $H_2=H_1+W$ are two self-adjoint operators, $R_i=(H_i-z)^{-1}$, then
$$R_2^2-R_1^2 = -R_2R_1WR_2 - R_1WR_2R_1. \tag{4.3}$$
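Since (4.3) is a purely algebraic identity, it can be sanity-checked on finite matrices. The following is a minimal sketch and not part of the original argument: the 3x3 matrices `H1` and `W`, the spectral parameter `Z`, and all helper names are illustrative assumptions standing in for the operators of the text.

```python
from functools import reduce

def mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def prod(*factors):
    """Ordered product of several square matrices."""
    return reduce(mul, factors)

def comb(a, b, s=1.0):
    """Entrywise a + s*b."""
    return [[x + s * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def inverse(m):
    """Gauss-Jordan inversion with partial pivoting (complex entries)."""
    n = len(m)
    aug = [[complex(x) for x in row] + [complex(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def resolvent(h, z):
    """(h - z)^{-1} for a square matrix h and a non-real number z."""
    n = len(h)
    ident = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    return inverse(comb(h, ident, -z))

# Hypothetical data: H1 is real symmetric, W is a diagonal "potential".
H1 = [[0.0, 1.0, 0.0], [1.0, 2.0, -0.5], [0.0, -0.5, 1.0]]
W = [[0.7, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, -0.3]]
Z = 0.4 + 0.9j  # Im z != 0, so both resolvents exist

R1 = resolvent(H1, Z)
R2 = resolvent(comb(H1, W), Z)  # resolvent of H2 = H1 + W

LHS = comb(mul(R2, R2), mul(R1, R1), -1.0)            # R2^2 - R1^2
RHS = comb(prod(R2, R1, W, R2), prod(R1, W, R2, R1))  # sum of the two terms
# (4.3) says LHS = -RHS, so LHS + RHS should vanish up to rounding.
DEFECT = max(abs(x) for row in comb(LHS, RHS) for x in row)
```

The check only illustrates the algebra; the trace class statements that follow rely on the infinite-dimensional estimates of Lemma 4.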
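Here and below $R=(H-z)^{-1}$ and $R_W=(H+W-z)^{-1}$. Since $W$ has a compact support, both operators $R_WRW$ and $RWR_W$ are in $\mathcal{T}_1$ according to Lemma 4. Moreover both $R_W[H,\mathcal{X}]$ and $R[H,\mathcal{X}]$ extend to bounded operators. As a consequence, using (4.3),
$$\operatorname{tr}\bigl((R_W^2-R^2)[H,\mathcal{X}]\bigr) = -\operatorname{tr}\bigl(R_WRWR_W[H,\mathcal{X}]\bigr) - \operatorname{tr}\bigl(RWR_WR[H,\mathcal{X}]\bigr), \tag{4.4}$$
each trace being finite for operators which are actually trace class, and the first statement of the lemma follows. Suppose now we have shown that
$$\operatorname{tr}\bigl(R_WRWR_W[H,\mathcal{X}]\bigr) = \operatorname{tr}\bigl(RWR_W[H,\mathcal{X}]R_W\bigr). \tag{4.5}$$
Since $RWR_W\in\mathcal{T}_1$ and $R[H,\mathcal{X}]$ is bounded, we also have
$$\operatorname{tr}\bigl(RWR_WR[H,\mathcal{X}]\bigr) = \operatorname{tr}\bigl(R[H,\mathcal{X}]RWR_W\bigr). \tag{4.6}$$
Thus, taking advantage of $R[H,\mathcal{X}]R=-[R,\mathcal{X}]$ (and of the same identity for $R_W$, since $W$ commutes with $\mathcal{X}$), (4.4) reduces to
$$\operatorname{tr}\bigl((R_W^2-R^2)[H,\mathcal{X}]\bigr) = \operatorname{tr}\bigl(RWR_W\mathcal{X}\bigr) - \operatorname{tr}\bigl(\mathcal{X}RWR_W\bigr) = 0. \tag{4.7}$$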
Since by Lemma 4 the integral in (4.2) is absolutely convergent in $\mathcal{T}_1$, we can pass the trace inside the integral and get (4.1).

We come back to (4.5), with $R_W(z)=(H+W-z)^{-1}$ and $R(z)=(H-z)^{-1}$. If $M<\inf\sigma(H+W)$, then $R_W(M)^{1/2}R(z)W$ can be shown to be trace class. Indeed, by the resolvent identity
$$R_W(M)^{\frac12}R(z)W = R_W(M)^{\frac32}W + R_W(M)^{\frac32}(z-M-W)R(z)W; \tag{4.8}$$
now, since $W$ is compactly supported, $R_W(M)^{3/2}W\in\mathcal{T}_1$ (e.g. [Si] or [GK2, Lemma A.4]) and the operators $R_W(M)^{\frac32}WR(z)$ and $(z-M)R_W(M)^{3/2}R(z)W$ belong to $\mathcal{T}_1$ by Lemma 4. Thus
\begin{align*}
\operatorname{tr}\bigl(R_WRWR_W[H,\mathcal{X}]\bigr)
&= \operatorname{tr}\bigl(R_W(z)(H+W-M)R_W(M)^{\frac12}\,R_W(M)^{\frac12}R(z)WR_W(z)[H,\mathcal{X}]\bigr)\\
&= \operatorname{tr}\bigl(R_W(M)^{\frac12}R(z)WR_W(z)[H,\mathcal{X}]R_W(z)(H+W-M)R_W(M)^{\frac12}\bigr)\\
&= \operatorname{tr}\bigl(R(z)WR_W(z)[H,\mathcal{X}]R_W(z)(H+W-M)R_W(M)\bigr)\\
&= \operatorname{tr}\bigl(RWR_W[H,\mathcal{X}]R_W\bigr).
\end{align*}
We applied the cyclicity property of the trace twice: the first time thanks to $R_W(M)^{1/2}R(z)W\in\mathcal{T}_1$, and the second time because $RWR_W\in\mathcal{T}_1$ according to Lemma 4.

Proof of Theorem 1. The potential $W$ is now supported on a strip $[L_1,L_2]\times\mathbb{R}$. We decompose $W$ in the $y$ direction and write, with obvious notations, $W=W_{>R}+W_{\le R}$, for $R>0$. It follows from Lemma 1 that $(g'(H+W)-g'(H+W_{>R}))\,i[H,\mathcal{X}]\in\mathcal{T}_1$ and its trace is zero, for the difference between $H+W$ and $H+W_{>R}$ is the compactly supported potential $W_{\le R}$. It thus remains to show that $\|(g'(H+W_{>R})-g'(H))\,i[H,\mathcal{X}]\|_1$ goes to zero as $R$ tends to infinity.
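As in Lemma 1, we use the Helffer-Sjöstrand formula (A.2) together with the second order resolvent equation (4.3). We denote respectively by $R$ and $R_{>R}$ the resolvents of $H$ and $H+W_{>R}$. One has
\begin{align}
&\|(g'(H+W_{>R})-g'(H))\,i[H,\mathcal{X}]\|_1 \tag{4.9}\\
&\quad\le \frac{1}{\pi}\int |\bar\partial\tilde g(u+iv)|\,\bigl\|\bigl(R_{>R}(u+iv)^2-R(u+iv)^2\bigr)\,i[H,\mathcal{X}]\bigr\|_1\,du\,dv. \tag{4.10}
\end{align}
Write, with $z=u+iv$,
$$-\bigl(R^2(z)-R_{>R}^2(z)\bigr) = R(z)R_{>R}(z)W_{>R}R(z) + R_{>R}(z)W_{>R}R(z)R_{>R}(z). \tag{4.11}$$
Let $\tilde{\mathcal{X}}$ be a smooth function such that $\tilde{\mathcal{X}}=1$ on $\mathbb{R}\times[-\frac14,\frac14]$ and $\tilde{\mathcal{X}}=0$ outside $\mathbb{R}\times[-\frac12,\frac12]$ (in particular $\tilde{\mathcal{X}}=1$ on the support of $\mathcal{X}'$). So $[H,\mathcal{X}]=[H,\mathcal{X}]\tilde{\mathcal{X}}$. We divide $\tilde{\mathcal{X}}$ into cubes by writing $\tilde{\mathcal{X}}=\sum_{x_2\in\mathbb{Z}}\mathbf{1}_{(x_2,0)}$, with $\mathbf{1}_{(x_2,0)}$ being smooth functions. Let us also write
$$\mathbf{1}_{[L_1,L_2]\times[-R,R]^c} = \sum_{x_1\in\mathbb{Z}\cap[L_1,L_2]}\ \sum_{y_1\in\mathbb{Z},\,|y_1|>R}\mathbf{1}_{(x_1,y_1)}. \tag{4.12}$$
For any $(x_1,y_1)\in\mathbb{Z}^2\cap([L_1,L_2]\times[-R,R]^c)$, we have
\begin{align}
&\|RR_{>R}\mathbf{1}_{(x_1,y_1)}WR[H,\mathcal{X}]\tilde{\mathcal{X}}\|_1 \tag{4.13}\\
&\quad\le \sum_{x_2\in\mathbb{Z}}\|RR_{>R}\mathbf{1}_{(x_1,y_1)}\|_1\,\|W\mathbf{1}_{(x_1,y_1)}\|\,\|\mathbf{1}_{(x_1,y_1)}R[H,\mathcal{X}]\mathbf{1}_{(x_2,0)}\| \tag{4.14}\\
&\quad\le \sum_{x_2\in\mathbb{Z}}\frac{C}{\eta}\,\langle x_2\rangle\,e^{-c\eta(|x_1-x_2|+|y_1|)}\,\|RR_{>R}\mathbf{1}_{(x_1,y_1)}\|_1\,\|W\mathbf{1}_{(x_1,y_1)}\|, \tag{4.15}
\end{align}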
where to get the last inequality we used Lemma 3, Eq. (A.8), together with the Combes-Thomas estimate (A.4) and $\eta=\operatorname{dist}(z,\sigma(H))$. Summing over $x_2$, we get from (4.15) and Lemma 4,
$$\|RR_{>R}\mathbf{1}_{(x_1,y_1)}WR[H,\mathcal{X}]\|_1 \le \frac{C}{\eta^{\kappa}}\,\|W\mathbf{1}_{(x_1,y_1)}\|\,\langle x_1\rangle\,e^{-c\eta|y_1|}, \tag{4.16}$$
where $\kappa$ stands for a positive integer (its value will vary, like that of the constant $C$). It remains to sum over $x_1\in[L_1,L_2]$ and $|y_1|\ge R$. This yields
$$\|RR_{>R}W_{>R}R[H,\mathcal{X}]\|_1 \le \frac{C(L_2-L_1)\|W\|_\infty}{\eta^{\kappa}}\,e^{-c\eta R}. \tag{4.17}$$
We turn to the second term coming from the decomposition of $R_{>R}^2(z)-R^2(z)$ in (4.11). As above we have to control
$$\|R_{>R}\mathbf{1}_{(x_1,y_1)}WRR_{>R}[H,\mathcal{X}]\mathbf{1}_{(x_2,0)}\|_1. \tag{4.18}$$
The trace class property will follow from the part $R_{>R}\mathbf{1}_{(x_1,y_1)}WR$, but we also need the term $\mathbf{1}_{(x_1,y_1)}$ to extract the required decay in $y_1$. We thus first pass a smooth version of $\mathbf{1}_{(x_1,y_1)}$ through the resolvent $R$. Let $\tilde\chi_{(x_1,y_1)}$ be a smooth characteristic function of the unit cube centered at $(x_1,y_1)$, so that $\tilde\chi_{(x_1,y_1)}\mathbf{1}_{(x_1,y_1)}=\mathbf{1}_{(x_1,y_1)}$. We get
\begin{align}
&\|R_{>R}\mathbf{1}_{(x_1,y_1)}WRR_{>R}[H,\mathcal{X}]\mathbf{1}_{(x_2,0)}\|_1 \tag{4.19}\\
&\quad\le \|R_{>R}\mathbf{1}_{(x_1,y_1)}WR\,\tilde\chi_{(x_1,y_1)}R_{>R}[H,\mathcal{X}]\mathbf{1}_{(x_2,0)}\|_1 \tag{4.20}\\
&\qquad + \|R_{>R}\mathbf{1}_{(x_1,y_1)}WR[H,\tilde\chi_{(x_1,y_1)}]RR_{>R}[H,\mathcal{X}]\mathbf{1}_{(x_2,0)}\|_1. \tag{4.21}
\end{align}
The term in (4.20) is estimated as previously; as for the one in (4.21) note that it follows from Lemma 3, Eq. (A.8), and the Combes-Thomas estimate (A.4) that
\begin{align}
&\|[H,\tilde\chi_{(x_1,y_1)}]RR_{>R}[H,\mathcal{X}]\mathbf{1}_{(x_2,0)}\| \tag{4.22}\\
&\quad\le \sum_{(x_3,y_3)\in\mathbb{Z}^2}\|[H,\tilde\chi_{(x_1,y_1)}]R\mathbf{1}_{(x_3,y_3)}\|\,\|\mathbf{1}_{(x_3,y_3)}R_{>R}[H,\mathcal{X}]\mathbf{1}_{(x_2,0)}\| \tag{4.23}\\
&\quad\le \frac{C}{\eta^{3}}\,(\langle x_1\rangle+\langle y_1\rangle)\,\langle x_2\rangle\,e^{-c\eta(|x_2-x_1|+|y_1|)}. \tag{4.24}
\end{align}
The rest of the argument follows as above. It allows us to conclude that
$$\bigl\|\bigl(R_{>R}(u+iv)^2-R(u+iv)^2\bigr)\,i[H,\mathcal{X}]\bigr\|_1 \le \frac{C(L_2-L_1)\|W\|_\infty}{\eta^{\kappa}}\,e^{-c\eta R}, \tag{4.25}$$
for some integer $\kappa$. Following (A.2), it remains to integrate the latter estimate multiplied by $|\bar\partial\tilde g_n(z)|$, $z=u+iv$. By Lemma 2, it follows that for any integer $m\ge 1$ there exists $C_m$ such that for any $R\ge 1$,
$$\|(g'(H+W_{>R})-g'(H))\,i[H,\mathcal{X}]\|_1 \le C_mR^{-m}. \tag{4.26}$$
So (2.6) holds, and (2.7) is a direct consequence of (2.6) and Proposition 1.
5. Estimating Differences of a Priori Non Finite Edge Conductances

This section is devoted to the proof of Theorem 2.

Proof. The main task is to prove (2.10), and that the operator coming in is trace class. Assuming this, let us sketch how to derive the second part of the statement, and in particular (2.11). If $V_0$ is a confining potential, then, with the abuse of notations of Footnote 4, it follows from (2.10) that (in addition to the trace class property)
\begin{align}
\operatorname{tr}\bigl((g'(H(V_1,V_0))-g'(H(V_1,0))-g'(H(0,V_0)))\,i[H_L,\mathcal{X}]\bigr) &= 0, \tag{5.1}\\
\operatorname{tr}\bigl((g'(H(V_0,V_2))-g'(H(V_0,0))-g'(H(0,V_2)))\,i[H_L,\mathcal{X}]\bigr) &= 0, \tag{5.2}\\
\operatorname{tr}\bigl((g'(H(V_0,0))+g'(H(0,V_0))-g'(H(V_0,V_0)))\,i[H_L,\mathcal{X}]\bigr) &= 0. \tag{5.3}
\end{align}
Subtract these equations from (2.10) and note that, $V_0$ being confining, Corollary 2 implies that $g'(H(V_0,V_0))\,i[H_L,\mathcal{X}]$ is trace class with trace zero. This yields the announced (2.11).

We now prove the first part of the statement. For $R\ge 0$, set
$$D(R) = \bigl\{g'(H(V_1,V_2))-g'(H(0,V_2))-g'(H(V_1,V_2\mathbf{1}_{x\le R}))+g'(H(0,V_2\mathbf{1}_{x\le R}))\bigr\}\,i[H_L,\mathcal{X}]. \tag{5.4}$$
Since $g'(H(0,0))=0$ ($\operatorname{supp} g'$ is included in a gap of $H(0,0)=H_L$), (2.10) of the theorem is proved if we show that $D(0)$ is trace class with trace zero. Now, that $D(R)-D(0)$ is trace class with trace zero is an immediate consequence of Theorem 1. It is thus enough to show that $D(R)$ is trace class and that $\lim_{R\to+\infty}|\operatorname{tr}D(R)|=0$. As previously we use the Helffer-Sjöstrand functional calculus to write operators of the type $g'(H)$ in terms of second powers of resolvents, and then make use of the second order resolvent equation (4.3). We shall use the following notations: $H=H(V_1,V_2)$, $H_2=H(0,V_2)$; as for the operators with a truncated $V_2$, we set $H_{\le R}=H(V_1,V_2\mathbf{1}_{x\le R})$, $H_{2,\le R}=H(0,V_2\mathbf{1}_{x\le R})$, with respective resolvents $R$, $R_2$, $R_{\le R}$, $R_{2,\le R}$. We get
$$(R^2-R_2^2)-(R_{\le R}^2-R_{2,\le R}^2) = -RR_2V_1R - R_2V_1RR_2 + R_{\le R}R_{2,\le R}V_1R_{\le R} + R_{2,\le R}V_1R_{\le R}R_{2,\le R}. \tag{5.5}$$
We first treat the term $RR_2V_1R-R_{\le R}R_{2,\le R}V_1R_{\le R}$. Bounding the remaining one will be done in a similar way, and it is discussed below. Since $H-H_{\le R}=H_2-H_{2,\le R}=V_2\mathbf{1}_{x>R}\equiv V_{2,>R}$, one has
\begin{align}
RR_2V_1R-R_{\le R}R_{2,\le R}V_1R_{\le R} &= -RR_2V_1RV_{2,>R}R_{\le R} \tag{5.6}\\
&\quad - RR_2V_{2,>R}R_{2,\le R}V_1R_{\le R} - RV_{2,>R}R_{\le R}R_{2,\le R}V_1R_{\le R}. \tag{5.7}
\end{align}
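The three-term expansion (5.6)-(5.7) is a telescoping consequence of the first order resolvent identity, applied once to each of the three resolvent factors. As a hedged illustration (not taken from the paper), the sketch below checks it on 4x4 matrices; `HL`, `V1`, `V2_CUT`, `V2_FAR`, `Z` and the helper functions are hypothetical stand-ins for $H_L$, $V_1\mathbf{1}_-$, $V_2\mathbf{1}_{x\le R}$, $V_{2,>R}$ and $z$.

```python
from functools import reduce

def mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def prod(*factors):
    """Ordered product of several square matrices."""
    return reduce(mul, factors)

def comb(a, b, s=1.0):
    """Entrywise a + s*b."""
    return [[x + s * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def diag(values):
    """Diagonal matrix, the finite analogue of a multiplication operator."""
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def inverse(m):
    """Gauss-Jordan inversion with partial pivoting (complex entries)."""
    n = len(m)
    aug = [[complex(x) for x in row] + [complex(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def resolvent(h, z):
    """(h - z)^{-1} for a square matrix h and a non-real number z."""
    return inverse(comb(h, diag([1.0] * len(h)), -z))

# Hypothetical finite model: a discrete Laplacian plus diagonal potentials.
HL = [[2.0, -1.0, 0.0, 0.0], [-1.0, 2.0, -1.0, 0.0],
      [0.0, -1.0, 2.0, -1.0], [0.0, 0.0, -1.0, 2.0]]
V1 = diag([1.5, 0.8, 0.0, 0.0])       # "left" potential
V2_CUT = diag([0.0, 0.0, 0.6, 0.0])   # truncated "right" potential
V2_FAR = diag([0.0, 0.0, 0.0, -0.9])  # remainder, plays the role of V_{2,>R}
Z = 0.3 + 0.7j

H2_CUT = comb(HL, V2_CUT)             # H_{2,<=R}
H2 = comb(H2_CUT, V2_FAR)             # H_2
R = resolvent(comb(H2, V1), Z)        # resolvent of H
R2 = resolvent(H2, Z)
R_CUT = resolvent(comb(H2_CUT, V1), Z)
R2_CUT = resolvent(H2_CUT, Z)

LHS = comb(prod(R, R2, V1, R), prod(R_CUT, R2_CUT, V1, R_CUT), -1.0)
TERMS = [prod(R, R2, V1, R, V2_FAR, R_CUT),
         prod(R, R2, V2_FAR, R2_CUT, V1, R_CUT),
         prod(R, V2_FAR, R_CUT, R2_CUT, V1, R_CUT)]
RHS = reduce(comb, TERMS)
# (5.6)-(5.7) state LHS = -RHS, so LHS + RHS should vanish up to rounding.
DEFECT = max(abs(x) for row in comb(LHS, RHS) for x in row)
```

Each of the three terms carries exactly one factor `V2_FAR`, which is what produces the decay in $R$ in the estimates that follow.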
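Let us first prove that $\|RR_2V_1RV_{2,>R}R_{\le R}[H,\mathcal{X}]\|_1$ decays faster than any polynomial in $R$. With $X_i=(x_i,y_i)$, $i=1,2$, write $V_1=\sum_{X_1\in S_1}V_1\mathbf{1}_{X_1}$ with $S_1=\mathbb{Z}^-\times\mathbb{Z}$, $V_{2,>R}=\sum_{X_2\in S_2}V_2\mathbf{1}_{X_2}$ with $S_2=(\mathbb{Z}\cap\,]R,+\infty[)\times\mathbb{Z}$, and $[H,\mathcal{X}]=\sum_{x_3\in\mathbb{Z}}[H,\mathcal{X}]\mathbf{1}_{(x_3,0)}$ as in Sect. 4, Proof of Theorem 1. Then, with $\mathbf{1}_i=\mathbf{1}_{X_i}$, $i=1,2$, and $\kappa$ some integer that will vary from one line to another:
\begin{align}
\|RR_2V_1RV_{2,>R}R_{\le R}[H,\mathcal{X}]\|_1
&\le \sum_{(x_1,y_1)\in S_1}\ \sum_{(x_2,y_2)\in S_2}\ \sum_{x_3\in\mathbb{Z}}\|RR_2V_1\mathbf{1}_1\|_1\,\|\mathbf{1}_1R\mathbf{1}_2\|\,\|\mathbf{1}_2V_2\|\,\|\mathbf{1}_2R_{\le R}[H,\mathcal{X}]\mathbf{1}_{(x_3,0)}\| \notag\\
&\le \sum_{(x_1,y_1)\in S_1}\ \sum_{(x_2,y_2)\in S_2}\ \sum_{x_3\in\mathbb{Z}}\frac{C}{\eta^{\kappa}}\,\|\mathbf{1}_1V_1\|\,\|\mathbf{1}_2V_2\|\,e^{-\eta(|x_1-x_2|+|y_1-y_2|+|x_2-x_3|+|y_2|)} \notag\\
&\le \frac{C\|V_2\|_\infty}{\eta^{\kappa}}\sum_{x_1\in\mathbb{Z}^-}\ \sum_{x_2\in\mathbb{Z}\cap]R,+\infty[}\|\mathbf{1}_{(x_1,0)}V_1\|\,e^{-\eta|x_1-x_2|} \notag\\
&\le \frac{C\|V_2\|_\infty}{\eta^{\kappa}}\sum_{x_1\in\mathbb{Z}^-}\|\mathbf{1}_{(x_1,0)}V_1\|\,e^{-\eta(|x_1|+R)}. \tag{5.8}
\end{align}
We used Lemma 4, the Combes-Thomas estimate (A.4), as well as Lemma 3. We also used the invariance of $V_1$ in the $y$-direction. Since by Assumption 2.2 we have the bound $\|\mathbf{1}_{(x_1,0)}V_1\|\le C\langle x_1\rangle^{p}$, for some $p<\infty$, it follows from (5.8) that for some constant $C$ and integer $\kappa>0$ (depending on $p$),
$$\|RR_2V_1RV_{2,>R}R_{\le R}[H,\mathcal{X}]\|_1 \le \frac{C\|V_2\|_\infty}{\eta^{\kappa}}\,e^{-\eta R}. \tag{5.9}$$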
The second term coming from (5.7) is estimated exactly as the first one. The third contribution from (5.7) requires an extra argument. If one is only interested in the decay (in $R$) of its trace, and not of its trace norm, then the above argument applies again if one notices that by cyclicity $\operatorname{tr}(RV_{2,>R}R_{\le R}R_{2,\le R}V_1R_{\le R}[H,\mathcal{X}])=\operatorname{tr}(R_{\le R}R_{2,\le R}V_1R_{\le R}[H,\mathcal{X}]RV_{2,>R})$.

Let us now briefly comment on how to control the remaining contribution from (5.5), that is the one coming from the difference $R_2V_1RR_2-R_{2,\le R}V_1R_{\le R}R_{2,\le R}$. One first decomposes it into three terms as in (5.7). To get the decay of the trace of each of the three contributions one can use cyclicity of the trace and apply the argument above. These estimates lead to $|\operatorname{tr}D(R)|\le C_mR^{-m}$ for any $m>0$.

We note that actually the stronger bound $\|D(R)\|_1\le C_mR^{-m}$ for any $m>0$ can be proven. It is indeed sufficient to use an argument similar to the one given in (4.19) and the subsequent estimates.

6. The Nakamura-Bellissard Model Revisited

In [NB] Nakamura and Bellissard showed that the bulk Hall conductance $\sigma_b$ vanishes in any Landau band for sufficiently large coupling constant in a positive potential exhibiting non-degenerate wells that are locally identical (e.g. a periodic potential). Their proof is based on semi-classical analysis at large coupling and non-commutative geometry methods. It turns out that the vanishing of $\sigma_e$ can be obtained in a simple way from Theorem 1. Assume that the bulk potential $V_b$ satisfies the assumptions of [NB]. Namely and with irrelevant simplifications (we set $X=(x,y)$):
(i) $\inf V_b(X)=0$ and $\sup V_b(X)<\infty$;
(ii) there is a countable set $\{X_n,\ n=1,2,\ldots\}$ such that one has $|X_n-X_m|\ge 1$ if $n\neq m$;
(iii) $V_b$ has identical potential wells located at the $X_n$'s, i.e., there exist $\varepsilon\in\,]0,\frac12[$ and $V\in C^2(\mathbb{R}^2)$ such that for all $n=1,2,\ldots$, $V_b(X+X_n)=V(X)$ if $|X|\le\varepsilon$;
(iv) $0$ is the unique minimum of $V$ and it is non-degenerate;
(v) if $|X-X_n|>\varepsilon$ for all $n$, then $V_b(X)>\delta$, for some $\delta>0$.
Then by a semi-classical analysis patterned according to the method developed in [BCD], it is shown that for large $\mu$ the spectrum of $H_b(\mu)=H_L+\mu V_b$ consists, in the range $]-\infty,\mu^{\frac12}[$, of bands $B_{n,m}$ centered around the eigenvalues $E_{n,m}(\mu)$ of the one well Hamiltonian
$$h(\mu)=H_L+\mu V, \tag{6.1}$$
which, in the large $\mu$ regime, satisfy the harmonic approximation:
$$E_{n,m}(\mu)=\mu^{\frac12}\bigl((n+1)W_1+(m+1)W_2\bigr)+O(1), \tag{6.2}$$
where $W_{1,2}$ are the eigenvalues of the Hessian of $V$ at $x=0$. The bands $B_{n,m}$ have width
$$\Delta_{n,m}(\mu)<e^{-a\mu^{\frac12}}, \tag{6.3}$$
where a is a lower bound on Agmon’s distance between different wells (see Theorem 6.1 in [NB]). So everything only depends on ε and δ. This implies that this spectral structure is not changed under the following modifications of Vb : a) fill the well at Xn up to δ if Xn ∈ S1 = {(x, y), |x| < 1}; b) replace Vb in the half plane {x < 0} by some constant potential v0 > δ.
Accordingly if $I\subset\,]B_N,B_{N+1}[$, $N\ge 0$, satisfies $\operatorname{dist}(I,\sigma(h(\mu)))>e^{-a\mu^{\frac12}}$ and $\sup I<\mu\delta$, then for $\mu$ large enough, $I$ is in a gap of
$$H_e(\mu):=H_L+\mu\,(v_0\mathbf{1}_-+V_b\mathbf{1}_++W), \tag{6.4}$$
where
$$W(X)=\sum_{X_n\in S_1\setminus S_0}\bigl(\delta-V(X-X_n)\bigr)\mathbf{1}_{|X-X_n|\le\varepsilon}(X). \tag{6.5}$$
So, as long as $\operatorname{supp} g'\subset I$, one has $\sigma_e(g,H_e(\mu))=0$, and according to Theorem 1 one also obtains $\sigma_e(g,\mu v_0,\mu V_b)=0$. This is the “edge picture” of [NB]'s result. Indeed equality of bulk and edge conductances then yields that the bulk conductance is zero if the Fermi energy belongs to $I$, which is [NB]'s result. Moreover by virtue of Theorem 2 this in turn implies that $\sigma_e(g,0,\mu V_b)=-N$, and thus that $H_L+\mu V_b\mathbf{1}_+$ has current carrying edge states for large $\mu$.

7. Regularizing the Edge Conductance in Presence of Impurities

Let $V$ be a potential located in the region $x\ge 0$. If the operator $H(0,V)$ has a gap and if the interval $I$ falls into this gap, then the edge conductance is quantized by Corollary 4. A more challenging issue is to show quantization if $I$ falls into a region of localized states of $H(0,V)$. In the latter case, conductances may not be well-defined, and a regularization is needed. This is the content of this section. We propose some basic conditions that a “good” regularization should fulfill and discuss some candidates.$^5$

Let $V_0$ be a $y$-invariant left confining potential with respect to $I=[a,b]\subset\,]B_N,B_{N+1}[$, and assume $\operatorname{supp} g'\subset I$. Let $(J_R)_{R>0}$ be a family of operators s.t.

C1. $\|J_R\|=1$ and $\lim_{R\to\infty}J_R\psi=\psi$ for all $\psi\in E_{H(0,V)}(I)L^2(\mathbb{R}^2)$.

C2. $J_R$ regularizes $H(0,V)$ in the sense that $g'(H(0,V))\,i[H_L,\mathcal{X}]J_R$ is trace class for all $R>0$, and $\lim_{R\to\infty}\operatorname{tr}(g'(H(0,V))\,i[H_L,\mathcal{X}]J_R)$ exists and is finite.

Then it follows from Corollary 3 that
$$\lim_{R\to\infty}-\operatorname{tr}\bigl(\{g'(H(V_0,V))-g'(H(0,V))\}\,i[H_L,\mathcal{X}]J_R\bigr)=N.$$
In other terms, if C1 and C2 hold, then $J_R$ also regularizes $H(V_0,V)$. Defining the regularized edge conductance by
$$\sigma_e^{\mathrm{reg}}(g,V_1,V_2):=-\lim_{R\to\infty}\operatorname{tr}\bigl(g'(H(V_1,V_2))\,i[H_L,\mathcal{X}]J_R\bigr), \tag{7.1}$$
whenever the limit exists, we get the analog of Corollary 3:
$$\sigma_e^{\mathrm{reg}}(g,V_0,V)=N+\sigma_e^{\mathrm{reg}}(g,0,V). \tag{7.2}$$

$^5$ In [EGS], related questions are addressed. We thus also refer the reader to their preprint.
In particular, if we can show that $\sigma_e^{\mathrm{reg}}(g,0,V)=0$, for instance under some localization property, then the edge quantization for $H(V_0,V)$ follows:
$$\sigma_e^{\mathrm{reg}}(g,V_0,V)=-\lim_{R\to\infty}\operatorname{tr}\bigl(g'(H(V_0,V))\,i[H_L,\mathcal{X}]J_R\bigr)=N. \tag{7.3}$$
To start the discussion, consider as the simplest candidate for $J_R$ the multiplication by the characteristic function of the half plane $x<R$ (or a smooth version of it). One checks that C1 holds and that the trace class condition in C2 is fulfilled (to see this consider the difference $\{g'(H(0,V))-g'(H(0,0))\}\,i[H,\mathcal{X}]J_R$ and proceed as in the proof of Theorem 1). As for the limit $R\to\infty$ of the trace in C2, we do not expect it to exist in full generality. However, if $H_\omega=H(0,V_{\omega,+})$ is a random operator with i.i.d. variables, then it follows from our previous results that the limit exists. Indeed, consider
$$H_\omega=H(0,V_{\omega,+})=H_L+V_{\omega,+}, \qquad V_{\omega,+}=\sum_{i\in\mathbb{Z}^{+*}\times\mathbb{Z}}\omega_i\,u(x-i), \tag{7.4}$$
a random operator modeling impurities located on the positive half plane (the $(\omega_i)_i$ are i.i.d. random variables, and $u$ is a bump function). The following proposition shows that the current flowing far from the edge $x=0$ is negligible (in the expectation sense).

Proposition 7.1. Let $H_\omega=H(0,V_{\omega,+})$ be as in (7.4), and $J_R=\mathbf{1}_{x\le R}$. For all $p\in\mathbb{N}^*$, there exists $C_p>0$ finite, such that, for all $R>0$,
$$\bigl|\mathbb{E}\,\operatorname{tr}\bigl(g'(H_\omega)\,i[H_L,\mathcal{X}](J_{R+1}-J_R)\bigr)\bigr|\le C_pR^{-p}. \tag{7.5}$$
As a consequence, for $\mathbb{P}$-a.e. $\omega$, $\lim_{R\to\infty}\operatorname{tr}(g'(H_\omega)\,i[H_L,\mathcal{X}]J_R)$ exists and is finite. In other terms, for $\mathbb{P}$-a.e. $\omega$, $J_R$ satisfies C1 and C2 and the rule (7.2) holds. Moreover, if $H_\omega$ has pure point spectrum in $I$ for $\mathbb{P}$-a.e. $\omega$, then denoting by $(\varphi_{\omega,n})_{n\ge 1}$ a basis of orthonormalized eigenfunctions of $H_\omega$ with energies $E_{\omega,n}\in\operatorname{supp} g'\subset I$, one has
$$\sigma_e^{\mathrm{reg}}(g,0,V_{\omega,+})=-\lim_{R\to\infty}\sum_n g'(E_{\omega,n})\,\langle\varphi_{\omega,n},\,i[H_\omega,\mathcal{X}]J_R\varphi_{\omega,n}\rangle. \tag{7.6}$$
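Proof. Let $H_\omega^1$ be obtained from $H_\omega$ by setting $\omega_i=0$ for all $i\in\{1\}\times\mathbb{Z}$. The random variables $\omega_i$ being i.i.d., one has
$$\mathbb{E}\,\operatorname{tr}\bigl(g'(H_\omega)\,i[H_L,\mathcal{X}]J_R\bigr)=\mathbb{E}\,\operatorname{tr}\bigl(g'(H_\omega^1)\,i[H_L,\mathcal{X}]J_{R+1}\bigr). \tag{7.7}$$
Moreover since the operator $H_\omega^1-H_\omega$ lives in a vertical strip of finite width, it follows by Theorem 1 that
$$\mathbb{E}\,\operatorname{tr}\bigl((g'(H_\omega)-g'(H_\omega^1))\,i[H_L,\mathcal{X}]\bigr)=0. \tag{7.8}$$
On the other hand, using arguments as in the proof of Theorem 1, one has that for any $p>0$ there exists $C_p<\infty$ s.t.
$$\bigl|\mathbb{E}\,\operatorname{tr}\bigl((g'(H_\omega)-g'(H_\omega^1))\,i[H_L,\mathcal{X}](1-J_{R+1})\bigr)\bigr|\le C_pR^{-p}. \tag{7.9}$$
By (7.8) and (7.9),
$$\bigl|\mathbb{E}\,\operatorname{tr}\bigl(g'(H_\omega)\,i[H_L,\mathcal{X}]J_{R+1}\bigr)-\mathbb{E}\,\operatorname{tr}\bigl(g'(H_\omega^1)\,i[H_L,\mathcal{X}]J_{R+1}\bigr)\bigr|\le C_pR^{-p}. \tag{7.10}$$
Plugging (7.7) into the latter yields (7.5). The expression in (7.6) follows by expanding the trace on the basis of eigenfunctions.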
However although the limit exists it is very likely that the quantity in (7.6) will not be zero, even under strong localization properties of the eigenfunctions such as (SULE) (see [DRJLS]) or (WULE) (see Definition 2 below)6 . This can be understood from the fact that the frontier of JR = 1x≤R intersects classical orbits, creating thereby spurious contributions to the total current. The quantum counterpart of this picture is that although the expectation of i[H (0, V ), X ] in an eigenstate of H (0, V ) is zero by the Virial Theorem this is not true anymore if this commutator is multiplied by JR = 1
d/4. It is well known for Schrödinger operators that $\operatorname{tr}(T^{-1}E_{H(0,V)}(I)T^{-1})<\infty$ if $I$ is compact (e.g. [KKS, GK3]). We set
$$\mu(J):=\operatorname{tr}\bigl(T^{-1}E_{H(0,V)}(J\cap I)T^{-1}\bigr)<\infty. \tag{7.11}$$

Definition 2 (WULE). Assume $H(0,V)$ has pure point spectrum in $I$ with eigenvalues $E_n$. Let $\mu$ be the measure defined in (7.11). We say that $H(0,V)$ has (WULE) in $I$, if there exist a mass $\gamma>0$ and a constant $C$ such that for any $E_n\in I$ and $X_1,X_2\in\mathbb{Z}^2$,
$$\|\mathbf{1}_{X_1}E_{H(0,V)}(\{E_n\})\mathbf{1}_{X_2}\|\le C\mu(\{E_n\})\,\|T\mathbf{1}_{X_1}\|\,\|T\mathbf{1}_{X_2}\|\,e^{-\gamma|X_1-X_2|}. \tag{7.12}$$
Remark 10. The measure $\mu$ in (7.11) is the one that appears in the Generalized Eigenfunctions Expansion (GEE) as in [Si, KKS], its kernel being given by $P_\lambda:=E_{H(0,V)}(\{\lambda\})/\mu(\{\lambda\})$. So (7.12) asserts that $\mathbf{1}_{X_1}P_{E_n}\mathbf{1}_{X_2}$ decays exponentially. We further note that alternatively to (7.12), one could assume that $\|\mathbf{1}_{X_1}\varphi_{n,m}\|_{L^2}\|\mathbf{1}_{X_2}\varphi_{n,m}\|_{L^2}\le C\|T^{-1}\varphi_{n,m}\|^2\,\|T\mathbf{1}_{X_1}\|\,\|T\mathbf{1}_{X_2}\|\,e^{-\gamma|X_1-X_2|}$, with the $(\varphi_{n,m})$'s being an orthonormalized basis of eigenfunctions of eigenvalue $E_n\in I$.

Theorem 3. Assume that $H(0,V)$ has (WULE) in $I$. Then
$$J_R=\sum_{E_n\in I}E_{H(0,V)}(\{E_n\})\,\mathbf{1}_{x\le R}\,E_{H(0,V)}(\{E_n\}) \tag{7.13}$$
regularizes $H(0,V)$, and thus also $H(V_0,V)$, in the sense that C1 and C2 hold. Moreover the edge conductances are quantized, and one has: $\sigma_e^{\mathrm{reg}}(g,0,V)=0$ and $\sigma_e^{\mathrm{reg}}(g,V_0,V)=N$ if $I\subset\,]B_N,B_{N+1}[$ for some $N\ge 0$.

Remark 11. Another possible regularization is to use the stronger localization signature called (SULE) introduced in [DRJLS] (see also [GDB, GK1]). It requires an exponential decay of the eigenfunctions of the form $\|\mathbf{1}_X\varphi_n\|_{L^2}\le Ce^{(\log|X_n|)^2}e^{-\gamma|X-X_n|}$ with centers of localization $X_n=(x_n,y_n)\in\mathbb{Z}^2$. Then one can show that $J_R=\sum_{x_n\le R}|\varphi_n\rangle\langle\varphi_n|$ satisfies C1 and C2, with in addition $\sigma_e^{\mathrm{reg}}(g,0,V)=0$ and $\sigma_e^{\mathrm{reg}}(g,V_0,V)=N$ if $I\subset\,]B_N,B_{N+1}[$, $N\ge 0$.

$^6$ We note that a similar quantity appears in [EGS].
Remark 12. Let $H(0,V_{\omega,+})=H_L+V_{\omega,+}$ be a random operator as in (7.4), with hypotheses on $u$ and the $\omega_i$'s as in [CH, Wa, GK3] (also [DMP]). It can be noted that the percolation estimates due to [CH, Wa] are still effective in the region where the potential is zero. The Wegner estimate given in [CH] is insensitive to this modification as well. Since for energies away from the Landau levels no eigenfunction can live in the left region, it is natural to expect a modified version of the multiscale analysis performed in [CH, Wa, GK3] to hold (or equivalently a version of the fractional moment method developed in [AENSS] if the support of the single bump $u$ covers a unit cube). This is done in [CGH] where localization is proved away from the Landau levels. In particular the following result holds true: for $N\in\mathbb{N}$, there exist constants $K_N$ (depending on the parameters of the model, except $B$), so that for $B$ large enough, and if $g$ is s.t. $\operatorname{dist}(\operatorname{supp} g',\{B_N,B_{N+1}\})\ge K_N\frac{\log B}{B}$ for some $N\ge 0$, then $H(0,V_{\omega,+})$ has (WULE) in $I$ for $\mathbb{P}$-a.e. $\omega$ and Theorem 3 applies.

Proof. To show C1, note that for all $\phi\in\mathcal{H}$ and $A\subset\mathbb{R}^2$:
\begin{align}
&\Bigl\|\sum_{E_n\in I}E_{H(0,V)}(\{E_n\})\,\mathbf{1}_A\,E_{H(0,V)}(\{E_n\})\phi\Bigr\|^2 \tag{7.14}\\
&\quad\le \sum_{E_n\in I,\,m\ge 1}\|\mathbf{1}_A\varphi_{n,m}\|^2\,|\langle\varphi_{n,m},\phi\rangle|^2\le\|\phi\|^2, \tag{7.15}
\end{align}
where $(\varphi_{n,m})_{m\ge 1}$ denotes an orthonormalized basis of eigenfunctions of energy $E_n\in I$. With $A=\{x\le R\}$ the last bound yields $\|J_R\|\le 1$. Next, use the first bound in (7.15) with $A=\{x>R\}$ together with the Lebesgue Dominated Convergence Theorem to get that $J_R\to E_{H(0,V)}(I)$.

We turn to C2. Write $[H_L,\mathcal{X}]=\sum_{x_2\in\mathbb{Z}}[H_L,\mathcal{X}]\mathbf{1}_{(x_2,0)}$ as in Sect. 4. We get
\begin{align}
\|g'(H(0,V))\,i[H_L,\mathcal{X}]J_R\|_1
&\le \sum_{E_n\in I}\ \sum_{X_1,x_2}\|g'(H(0,V))\,i[H_L,\mathcal{X}]\mathbf{1}_{(x_2,0)}E_{H(0,V)}(\{E_n\})\mathbf{1}_{X_1}E_{H(0,V)}(\{E_n\})\|_1 \notag\\
&\le C\sum_{E_n\in I}\ \sum_{X_1,x_2}\|\mathbf{1}_{(x_2,0)}E_{H(0,V)}(\{E_n\})\mathbf{1}_{X_1}\|_2\,\|\mathbf{1}_{X_1}E_{H(0,V)}(\{E_n\})\|_2, \tag{7.16}
\end{align}
where the summation is over $X_1$'s s.t. $x_1\le R$; in the last bound we used that $\|g'(H(0,V))\,i[H_L,\mathcal{X}]\|\le C$. Next, the exponential decay due to (7.12) carries over to Hilbert-Schmidt operator kernels, since $\|\mathbf{1}_{X_1}E_{H(0,V)}(\{E_n\})\mathbf{1}_{X_2}\|_2^2\le\mu(I)\,\|T\mathbf{1}_{X_1}\|\,\|T\mathbf{1}_{X_2}\|\,\|\mathbf{1}_{X_1}E_{H(0,V)}(\{E_n\})\mathbf{1}_{X_2}\|$. It ensures that for any given $x_1\le R$, the sum over $y_1,x_2\in\mathbb{Z}$ converges, while $\sum_n\mu(\{E_n\})=\mu(I)<\infty$ takes care of the summation over $n$. To complete the argument it is thus enough to show summability in $x_1\le -2$. This will come from the fact that eigenfunctions cannot live far inside the region $\{x\le 0\}$. More precisely, for $x_1\le -2$, let $\Lambda$ be a box centered at $X_1=(x_1,y_1)$ and of radius $|x_1|-1$, and $\tilde{\mathbf{1}}_\Lambda$ be a smooth version of $\mathbf{1}_\Lambda$, s.t. $\tilde{\mathbf{1}}_\Lambda V\mathbf{1}_+=0$. Pass $\tilde{\mathbf{1}}_\Lambda$ through $H(0,V)=H_L+V\mathbf{1}_+$ in $\tilde{\mathbf{1}}_\Lambda(H(0,V)-E_n)E_{H(0,V)}(\{E_n\})=0$; multiply on the left by $\mathbf{1}_{X_1}(H_L-E_n)^{-1}$; use Combes-Thomas to control the resolvent of $H_L$. It follows that $\mathbf{1}_{X_1}E_{H(0,V)}(\{E_n\})$ decays exponentially in $|x_1|$.$^7$

Remark 13. Notice that the $J_R$ of (7.13) considered in Theorem 3 also reads
$$J_R=\operatorname*{s-lim}_{T\to\infty}\frac{1}{T}\int_0^T e^{itH(0,V)}E_{H(0,V)}(I)\,\mathbf{1}_{x\le R}\,E_{H(0,V)}(I)\,e^{-itH(0,V)}\,dt. \tag{7.17}$$
We expect that one can construct a regularization in the spirit of (7.17), assuming only that $H(0,V)$ exhibits dynamical localization [A, GDB, GK1] in $I$.$^8$

A. Appendix A: Some Decay Estimates

For $g\in C_c^\infty(\mathbb{R})$, let $\tilde g_n$ be a quasi-analytic extension of $g$ of order $n\ge 1$ of the form
$$\tilde g_n(u+iv)=\rho(u,v)\,S_n\tilde g(u+iv), \qquad S_n\tilde g(u+iv)=\sum_{k=0}^{n}\frac{1}{k!}\,g^{(k)}(u)(iv)^k, \tag{A.1}$$
where $\rho(u,v)=\tau(v/\langle u\rangle)$; the function $\tau$ is smooth such that $\tau(t)=1$ for $|t|\le 1$ and $\tau(t)=0$ for $|t|\ge 2$. For $H$ as in (2.1), the Helffer-Sjöstrand formula [HeSj, HuSi] reads
$$g'(H)=-\frac{1}{\pi}\int\bar\partial\tilde g_n(u+iv)\,(H-u-iv)^{-2}\,du\,dv, \qquad \bar\partial=\frac12(\partial_u+i\partial_v). \tag{A.2}$$
One has $\bar\partial\tilde g_n(u+iv)=(\bar\partial\rho(u,v))\,S_n\tilde g(u+iv)+\rho(u,v)\,\bar\partial S_n\tilde g(u+iv)$. But a simple computation yields: $\bar\partial(S_n\tilde g)(u+iv)=\frac{1}{2\,n!}\,g^{(n+1)}(u)(iv)^n$. As a consequence,
$$\bar\partial\tilde g_n(u+iv)=\bar\partial\rho(u,v)\sum_{k=0}^{n}\frac{1}{k!}\,g^{(k)}(u)(iv)^k+\frac{\rho(u,v)}{2}\,\frac{1}{n!}\,g^{(n+1)}(u)(iv)^n. \tag{A.3}$$
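The “simple computation” behind (A.3) is that the sum defining $S_n\tilde g$ telescopes under $\bar\partial$, leaving only the term of order $n$. The following is a minimal numerical sketch of this pointwise identity, under stated assumptions: the test function is $\sin$ (chosen only because its derivatives are explicit, not because it is compactly supported), $\bar\partial$ is approximated by central finite differences, and all names are hypothetical.

```python
import math

def taylor_extension(derivative, u, v, n):
    """S_n g(u+iv) = sum_{k=0}^{n} g^(k)(u) (iv)^k / k!.

    derivative(k, u) must return the k-th derivative of g at u.
    """
    return sum(derivative(k, u) * (1j * v) ** k / math.factorial(k)
               for k in range(n + 1))

def dbar(f, u, v, h=1e-5):
    """Central-difference approximation of (1/2)(d/du + i d/dv) f(u, v)."""
    du = (f(u + h, v) - f(u - h, v)) / (2 * h)
    dv = (f(u, v + h) - f(u, v - h)) / (2 * h)
    return 0.5 * (du + 1j * dv)

def sin_derivative(k, u):
    """k-th derivative of sin at u, via sin^(k)(u) = sin(u + k*pi/2)."""
    return math.sin(u + k * math.pi / 2)

N = 3            # order of the extension
U, V = 0.3, 0.2  # evaluation point u + iv

NUMERIC = dbar(lambda u, v: taylor_extension(sin_derivative, u, v, N), U, V)
# Closed form from the text: g^(n+1)(u) (iv)^n / (2 n!)
CLOSED_FORM = sin_derivative(N + 1, U) * (1j * V) ** N / (2 * math.factorial(N))
ERROR = abs(NUMERIC - CLOSED_FORM)
```

The factor $(iv)^n$ is what makes $\bar\partial\tilde g_n$ vanish to order $n$ near the real axis, which is the property exploited in Lemma 2.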
Since $u$ takes values in the compact set $\operatorname{supp} g'$, the usual Combes-Thomas estimate is sufficient for our purpose [CT], namely,
$$\|\mathbf{1}_x(H-z)^{-1}\mathbf{1}_y\|\le\frac{C}{\eta}\exp(-c\eta|x-y|), \qquad \eta=\operatorname{dist}(u+iv,\sigma(H)), \tag{A.4}$$
with constants $C,c>0$ depending on $g$. In practice, (A.4) will be used in combination with Lemma 4 and Lemma 3. To conclude we shall use the following lemma.

Lemma 2. Let $H$ and $g$ be as above, $\tilde g$ be the quasi-analytic extension of $g$ to the order $n$ given by (A.1), and $\eta=\operatorname{dist}(u+iv,\sigma(H))$. Let $f_{L,\kappa}(\eta)=\eta^{-\kappa}e^{-c\eta L}$ for some $\kappa\ge 0$ and $L>0$. For any $m\ge 1$, if $n\ge m+\kappa$, there exists a constant $c$ depending only on $n$, $m$, $\kappa$ and on $g$ (through its support and $\|g^{(k)}\|_\infty$, $k=0,1,\cdots,n+1$), such that
$$\int|\bar\partial\tilde g_n(u+iv)|\,f_{L,\kappa}(\eta)\,du\,dv\le\frac{c}{L^m}. \tag{A.5}$$

$^7$ An alternative to this last step is to exploit the decay in the region $\{x\le 0\}$ coming from $g'(H(0,V))=g'(H(0,V))-g'(H(0,0))$.

$^8$ Note that the form (7.17) is close to the regularization considered in [EGS].
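Remark 14. If $g$ is chosen to be Gevrey of class $a>1$, then following [BGK] the integral in (A.5) decays sub-exponentially like $\exp(-cL^{1/a'})$ with any $a'>a$.

Lemma 3. Let $\chi_1$ and $\chi_2$ be two smooth functions localized on compact regions of $\mathbb{R}^2$. Let $\tilde\chi_2$ be a smooth function s.t. $\tilde\chi_2=1$ on the support of $\chi_2$, and denote by $R(z)$ the resolvent of $H_L+V=\Pi_x^2+\Pi_y^2+V$. Then, with $\alpha$ standing for either $x$ or $y$,
\begin{align}
\|\chi_1R(z)\Pi_\alpha\chi_2\|^2
&\le 2\bigl(|z|+\|V\|_\infty+2\|p_x\chi_2\|_\infty^2+4\|p_y\chi_2\|_\infty^2+\|Bx\chi_2\|_\infty^2\bigr)\|\chi_1R(z)\tilde\chi_2\|^2 \notag\\
&\quad+2\|\chi_1\tilde\chi_2\|_\infty\|\chi_1R(z)\tilde\chi_2\|. \tag{A.6}
\end{align}
As a consequence, let $\tilde{\mathcal{X}}$ be a smooth function equal to 1 on the support of $\mathcal{X}'$ (typically, $\tilde{\mathcal{X}}=1$ on $\mathbb{R}\times[-\frac14,\frac14]$, and $\tilde{\mathcal{X}}=0$ outside $\mathbb{R}\times[-\frac12,\frac12]$), then
\begin{align}
\|\chi_1R(z)[H,\mathcal{X}]\chi_2\|^2
&\le\bigl(C+2|z|+2\|Bx\chi_2\mathcal{X}'\|_\infty^2\bigr)\|\chi_1R(z)\tilde{\mathcal{X}}\tilde\chi_2\|^2 \notag\\
&\quad+2\|\chi_1\tilde{\mathcal{X}}\chi_2\|_\infty\|\chi_1R(z)\tilde{\mathcal{X}}\chi_2\|, \tag{A.7}
\end{align}
where $C$ depends on $V$, $\mathcal{X}'$, $\mathcal{X}''$, $\tilde{\mathcal{X}}$ and $\chi_2$ as in (A.6), i.e. through their sup norm. In particular, if the supports of $\chi_1$ and $\tilde{\mathcal{X}}\chi_2$ are disjoint, one has
$$\|\chi_1R(z)[H,\mathcal{X}]\chi_2\|\le\bigl(C+2|z|+2\|Bx\chi_2\mathcal{X}'\|_\infty^2\bigr)^{\frac12}\|\chi_1R(z)\tilde{\mathcal{X}}\tilde\chi_2\|. \tag{A.8}$$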
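Proof. We have to bound $\|\chi_2\Pi_\alpha R(z)\chi_1\varphi\|^2$, with $\varphi\in C_c^\infty$. We get
\begin{align}
\|\chi_2\Pi_\alpha R(z)\chi_1\varphi\|^2
&=\langle R(z)\chi_1\varphi,\ \Pi_\alpha\chi_2^2\Pi_\alpha R(z)\chi_1\varphi\rangle \notag\\
&=\langle R(z)\chi_1\varphi,\ (\Pi_\alpha\chi_2^2)\Pi_\alpha R(z)\chi_1\varphi\rangle+\langle R(z)\chi_1\varphi,\ \chi_2^2\Pi_\alpha^2R(z)\chi_1\varphi\rangle. \tag{A.9}
\end{align}
Using that $(\Pi_\alpha\chi_2^2)=(2(p_y\chi_2)-Bx\chi_2)\chi_2=\tilde\chi_2(2(p_y\chi_2)-Bx\chi_2)\chi_2$, and that $ab\le\frac12a^2+\frac12b^2$, we have
\begin{align}
|\langle R(z)\chi_1\varphi,\ (\Pi_\alpha\chi_2^2)\Pi_\alpha R(z)\chi_1\varphi\rangle|
&\le\|2(p_y\chi_2)-Bx\chi_2\|_\infty\,\|\tilde\chi_2R(z)\chi_1\varphi\|\,\|\chi_2\Pi_\alpha R(z)\chi_1\varphi\| \notag\\
&\le\frac12\|2(p_y\chi_2)-Bx\chi_2\|_\infty^2\,\|\tilde\chi_2R(z)\chi_1\varphi\|^2+\frac12\|\chi_2\Pi_\alpha R(z)\chi_1\varphi\|^2. \tag{A.10}
\end{align}
Combining (A.9) and (A.10) with $\alpha=x,y$ yields
\begin{align*}
&\frac12\|\chi_2\Pi_xR(z)\chi_1\varphi\|^2+\frac12\|\chi_2\Pi_yR(z)\chi_1\varphi\|^2\\
&\quad\le\frac12\bigl(4\|p_x\chi_2\|_\infty^2+(2\|p_y\chi_2\|_\infty+\|Bx\chi_2\|_\infty)^2\bigr)\|\tilde\chi_2R(z)\chi_1\varphi\|^2+\langle R(z)\chi_1\varphi,\ \chi_2^2(\Pi_x^2+\Pi_y^2)R(z)\chi_1\varphi\rangle\\
&\quad\le\frac12\bigl(4\|p_x\chi_2\|_\infty^2+8\|p_y\chi_2\|_\infty^2+2\|Bx\chi_2\|_\infty^2\bigr)\|\tilde\chi_2R(z)\chi_1\varphi\|^2+(|z|+\|V\|_\infty)\|\chi_2R(z)\chi_1\varphi\|^2+\|\chi_2\chi_1\varphi\|\,\|\chi_2R(z)\chi_1\varphi\|.
\end{align*}
Inequality (A.6) follows. As for (A.7), notice that $[H,\mathcal{X}]=-2i\Pi_y\mathcal{X}'-\mathcal{X}''=(-2i\Pi_y\mathcal{X}'-\mathcal{X}'')\tilde{\mathcal{X}}$. We thus apply (A.6) with $(\mathcal{X}'\chi_2)$ in place of $\chi_2$. The lemma follows.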
B. Appendix B: Some Trace Estimates

Lemma 4. Let $V$ be a bounded potential, and denote by $R_1$ and $R_2$ the resolvents of operators $H_1$ and $H_2$ as in (2.1). Set $\eta_i=\operatorname{dist}(z,\sigma(H_i))$, $i=1,2$. There exist $C_1,C_2>0$ such that for any $(x,y)\in\mathbb{R}^2$, and $H_1$ and $H_2$ s.t. $\|V_1-V_2\|_\infty<\infty$,
$$\|R_1(z)R_2(z)V\mathbf{1}_{(x,y)}\|_1\le\frac{C_1\|V\mathbf{1}_{(x,y)}\|_\infty}{\eta_1\eta_2}\bigl(1+C_2\|V_1-V_2\|_\infty\bigr), \tag{B.1}$$
and
$$\|R_1(z)V\mathbf{1}_{(x,y)}R_2(z)\|_1\le\frac{C_1\|V\mathbf{1}_{(x,y)}\|_\infty}{\eta_1\eta_2}\bigl(1+C_2\|V_1-V_2\|_\infty\bigr). \tag{B.2}$$

Proof. First note that setting $\chi_{(x,y)}=V\mathbf{1}_{(x,y)}/\|V\mathbf{1}_{(x,y)}\|_\infty$, it is enough to bound $\|R_1(z)R_2(z)\chi_{(x,y)}\|_1$ and $\|R_1(z)\chi_{(x,y)}R_2(z)\|_1$ with $|\chi_{(x,y)}|\le 1$ and supported on the unit cube centered at $(x,y)$. Now choose $M\in\mathbb{R}$ below the spectrum of $H_1$ and $H_2$. We first prove (B.2). By the resolvent identity,
\begin{align*}
\|R_1(z)\chi_{(x,y)}R_2(z)\|_1
&\le\frac{C(M)}{\eta_1\eta_2}\|R_1(M)\chi_{(x,y)}R_2(M)\|_1\\
&\le\frac{C(M)}{\eta_1\eta_2}\|R_1(M)\chi_{(x,y)}R_1(M)\|_1\bigl(1+\|(V_2-V_1)R_2(M)\|\bigr)\\
&\le\frac{C(M)}{\eta_1\eta_2}\|R_1(M)|\chi_{(x,y)}|R_1(M)\|_1\bigl(1+\|(V_2-V_1)R_2(M)\|\bigr).
\end{align*}
And (B.2) follows since $\|R_1(M)|\chi_{(x,y)}|R_1(M)\|_1=\|R_1(M)|\chi_{(x,y)}|^{1/2}\|_2^2<C$ uniformly in $(x,y)$, e.g. [Si], [GK2, Lemma A.4]. We turn to (B.1). By the resolvent identity,
\begin{align*}
\|R_1(z)R_2(z)\chi_{(x,y)}\|_1
&\le\frac{C(M)}{\eta_1}\|R_1(M)R_2(z)\chi_{(x,y)}\|_1\\
&\le\frac{C(M)}{\eta_1}\|R_2(M)R_2(z)\chi_{(x,y)}\|_1\bigl(1+\|R_1(M)(V_2-V_1)\|\bigr)\\
&\le\frac{C(M)}{\eta_1\eta_2}\|R_2(M)^2\chi_{(x,y)}\|_1\bigl(1+\|R_1(M)(V_2-V_1)\|\bigr).
\end{align*}
And (B.1) follows since $\|R_2(M)^2\chi_{(x,y)}\|_1<C$ uniformly in $(x,y)$, e.g. [Si], [GK2, Lemma A.4].

Lemma 5. Suppose $I=[a,b]\subset\,]B_N,B_{N+1}[$, and pick a switch function $g$ s.t. $\operatorname{supp} g'\subset I$. Suppose that $V_1(x,y)>b$ if $x<-R$ for some $R>0$ (i.e. $V_1$ is a left confining potential). Then $g'(H(V_1,0))\,i[H_L,\mathcal{X}]$ is trace class.

Proof. Technical details are similar to the ones used to prove Theorem 1 and Theorem 2. We thus only sketch the main ideas. We split $g'(H(V_1,0))\,i[H_L,\mathcal{X}]$ in two terms: $g'(H(V_1,0))\mathbf{1}_{x<-R}\,i[H_L,\mathcal{X}]$ and $g'(H(V_1,0))\mathbf{1}_{x\ge -R}\,i[H_L,\mathcal{X}]$. Let $\beta=B_{N+1}+\inf(V_1\mathbf{1}_-)$. Note that $g'(H(V_1,0)+\beta\mathbf{1}_{x>-R})=0$, for $I$ does not intersect the spectrum of $H(V_1,0)+\beta\mathbf{1}_{x>-R}$ (which starts above $b$). The first term can thus be seen to be trace class by decomposing $\{g'(H(V_1,0))-g'(H(V_1,0)+\beta\mathbf{1}_{x>-R})\}\mathbf{1}_{x<-R}\,i[H_L,\mathcal{X}]$ with the Helffer-Sjöstrand formula and using the resolvent identity in the spirit of the proof of Theorem 1 and Theorem 2. The second term is seen to be trace class by noting that $g'(H_L)=0$ and by considering $\{g'(H(V_1,0))-g'(H_L)\}\mathbf{1}_{x>-R}\,i[H_L,\mathcal{X}]$ in the same way (taking advantage of $H(V_1,0)-H_L=V_1\mathbf{1}_-$).
Note added in proof. The property WULE defined in Sect. 7 is revisited in a work in preparation by F. G. and A. Klein. In particular it is renamed SUDEC, for Summable Uniform Decay of Eigenfunction Correlations.

Acknowledgements. The authors are grateful to S. De Bièvre, A. Elgart, A. Klein, P. Hislop, J. Schenker, H. Schulz-Baldes and P. Streda for enjoyable and useful discussions. We would also like to thank the anonymous referee for suggesting to us the generalization (given in Theorem 2) of our original sum rule stated now in Corollary 3, the proof of which turns out to be an immediate rewriting of our original proof.
References

[A] Aizenman, M.: Localization at weak disorder: some elementary bounds. Rev. Math. Phys. 6, 1163–1182 (1994)
[AG] Aizenman, M., Graf, G.M.: Localization bounds for an electron gas. J. Phys. A 31, 6783–6806 (1998)
[AENSS] Aizenman, M., Elgart, A., Naboko, S., Schenker, J.H., Stolz, G.: Moment Analysis for Localization in Random Schrödinger Operators. http://arxiv.org/list/math-ph/0308023, 2003
[AS2] Avron, J., Seiler, R., Simon, B.: Charge deficiency, charge transport and comparison of dimensions. Commun. Math. Phys. 159, 399–422 (1994)
[Be] Bellissard, J.: Ordinary quantum Hall effect and noncommutative cohomology. In: Localization in disordered systems (Bad Schandau, 1986), Teubner-Texte Phys. 16, Leipzig: Teubner, 1988, pp. 61–74
[BESB] Bellissard, J., van Elst, A., Schulz-Baldes, H.: The non commutative geometry of the quantum Hall effect. J. Math. Phys. 35, 5373–5451 (1994)
[BGK] Bouclet, J.M., Germinet, F., Klein, A.: Sub-exponential decay of Operator kernel for functions of generalized Schrödinger operators. Proc. Amer. Math. Soc. 132, 2703–2712 (2004)
[BGKS] Bouclet, J.M., Germinet, F., Klein, A., Schenker, J.H.: Linear response theory for magnetic Schrödinger operators in disordered media. J. Funct. Anal., to appear
[BCD] Briet, P., Combes, J.M., Duclos, P.: Spectral Stability under tunneling. Commun. Math. Phys. 126, 133–156 (1989)
[B] Büttiker, M.: Absence of backscattering in the quantum Hall effect in multiprobe conductors. Phys. Rev. B 38, 9375–9389 (1988)
[CGH] Combes, J.M., Germinet, F., Hislop, P.D.: On the quantization of Hall currents in presence of disorder. In preparation
[CH] Combes, J.M., Hislop, P.D.: Landau Hamiltonians with random potentials: localization and the density of states. Commun. Math. Phys. 177, 603–629 (1996)
[CHS] Combes, J.-M., Hislop, P.D., Soccorsi, E.: Edge states for quantum Hall Hamiltonians. In: Mathematical results in quantum mechanics (Taxco, 2001), Contemp. Math. 307, Providence, RI: Amer. Math. Soc., 2002, pp. 68–81
[CT] Combes, J.M., Thomas, L.: Asymptotic behavior of eigenfunctions for multi-particle Schrödinger operators. Commun. Math. Phys. 34, 251–270 (1973)
[CFGP] Cresti, A., Fardrioni, R., Grosso, G., Parravicini, G.P.: Current distribution and conductance quantization in the integer quantum Hall regime. J. Phys. Conds. Matter 15, L377–L383 (2003)
[CFKS] Cycon, H.L., Froese, R.G., Kirsch, W., Simon, B.: Schrödinger operators. Heidelberg: Springer-Verlag, 1987
[DBP] De Bièvre, S., Pulé, J.: Propagating Edge States for a Magnetic Hamiltonian. Math. Phys. Elec. J. Vol. 5, paper 3
[DRJLS] Del Rio, R., Jitomirskaya, S., Last, Y., Simon, B.: Operators with singular continuous spectrum. IV. Hausdorff dimensions, rank one perturbations, and localization. J. Anal. Math. 69, 153–200 (1996)
[DMP] Dorlas, T.C., Macris, N., Pulé, J.V.: Characterization of the Spectrum of the Landau Hamiltonian with delta impurities. Commun. Math. Phys. 204, 367–396 (1999)
[EG] Elbau, P., Graf, G.M.: Equality of Bulk and Edge Hall Conductance Revisited. Commun. Math. Phys. 229, 415–432 (2002)
[EGS] Elgart, A., Graf, G.M., Schenker, J.: Equality of the bulk and edge Hall conductances in a mobility gap. http://arxiv.org/list/math-ph/040917, 2004
[ES] Elgart, A., Schlein, B.: Adiabatic charge transport and the Kubo formula for Landau-type Hamiltonians. Comm. Pure Appl. Math. 57, 590–615 (2004)
[FM1] Ferrari, C., Macris, N.: Intermixture of extended edge and localized bulk levels in macroscopic Hall systems. J. Phys. A: Math. Gen. 35, 6339–6358 (2002)
[FM2] Ferrari, C., Macris, N.: Extended edge states in finite Hall systems. J. Math. Phys. 44, 3734–3751 (2003)
[FGW1] Fröhlich, J., Graf, G.M., Walcher, J.: On the extended nature of edge states of quantum Hall hamiltonians. Ann. H. Poincaré 1, 405–444 (2000)
[FGW2] Fröhlich, J., Graf, G.M., Walcher, J.: Extended quantum Hall edge states. Preprint
[Ge] Germinet, F.: Dynamical localization II with an Application to the Almost Mathieu Operator. J. Stat. Phys. 95, 273–286 (1999)
[GDB] Germinet, F., De Bièvre, S.: Dynamical Localization for Discrete and Continuous Random Schrödinger Operators. Commun. Math. Phys. 194, 323–341 (1998)
[GK1] Germinet, F., Klein, A.: Bootstrap Multiscale Analysis and Localization in Random Media. Commun. Math. Phys. 222, 415–448 (2001)
[GK2] Germinet, F., Klein, A.: A characterization of the Anderson metal-insulator transport transition. Duke Math. J. 124, 309–350 (2004)
[GK3] Germinet, F., Klein, A.: Explicit finite volume criteria for localization in continuous random media and applications. Geom. Funct. Anal. 13, 1201–1238 (2003)
[H] Halperin, B.I.: Quantized Hall conductance, current carrying edge states and the existence of extended states in a two-dimensional disordered potential. Phys. Rev. B 25, 2185–2190 (1982)
[HT] Heinonen, P.L. Taylor: Conductance plateaux in the quantized Hall effect. Phys. Rev. B 28, 6119–6122 (1983)
[HeSj] Helffer, B., Sjöstrand, J.: Équation de Schrödinger avec champ magnétique et équation de Harper. In: Schrödinger operators, H. Holden, A. Jensen, eds., LNP 345, Berlin-Heidelberg-New York: Springer, 1989, pp. 118–197
[HuSi] Hunziker, W., Sigal, I.M.: Time-dependent scattering theory for N-body quantum systems. Rev. Math. Phys. 12, 1033–1084 (2000)
[KRSB] Kellendonk, J., Richter, T., Schulz-Baldes, H.: Edge Current channels and Chern numbers in the integer quantum Hall effect. Rev. Math. Phys. 14, 87–119 (2002)
[KSB] Kellendonk, T., Schulz-Baldes, H.: Quantization of Edge Currents for continuous magnetic operators. J. Funct. Anal. 209, 388–413 (2004)
[KK] Klein, A., Koines, A.: A general framework for localization of classical waves. I. Inhomogeneous media and defect eigenmodes. Math. Phys. Anal. Geom. 4, 97–130 (2001)
[KKS] Klein, A., Koines, A., Seifert, M.: Generalized eigenfunctions for waves in inhomogeneous media. J. Funct. Anal. 190, 255–291 (2002)
[Ku] Kunz, H.: The Quantum Hall Effect for Electrons in a Random Potential. Commun. Math. Phys. 112, 121–145 (1987)
[MDS] Mac Donald, A.H., Streda, P.: Quantized Hall effect and edge currents. Phys. Rev. B 29, 1616–1619 (1984)
[Ma] Macris, N.: Private communication, 2003
[NB] Nakamura, S., Bellissard, J.: Low Energy Bands do not Contribute to Quantum Hall Effect. Commun. Math. Phys. 131, 283–305 (1990)
[PG] Prange, R.E., Girvin, S.M.: The Quantum Hall Effect, Graduate texts in contemporary Physics. Springer-Verlag, N.Y., 1987
[SBKR] Schulz-Baldes, H., Kellendonk, J., Richter, T.: Simultaneous quantization of edge and bulk Hall conductivity. J. Phys. A 33, L27–L32 (2000)
[Si] Simon, B.: Schrödinger semi-groups. Bull. Amer. Math. Soc. 7, 447–526 (1982)
[Th] Thouless, D.J.: Edge voltages and distributed currents in the quantum Hall effect. Phys. Rev. Lett. 71, 1879–1882 (1993)
[Wa] Wang, W.-M.: Microlocalization, percolation, and Anderson localization for the magnetic Schrödinger operator with a random potential. J. Funct. Anal. 146, 1–26 (1997)
Communicated by M. Aizenman
Commun. Math. Phys. 256, 181–194 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1284-3
The Painlevé Property for Quasihomogenous Systems and a Many-Body Problem in the Plane

Adolfo Guillot
Instituto de Matemáticas UNAM, Unidad Cuernavaca. A.P. 273, Admón. de correos #3, C.P. 62251, Cuernavaca, Morelos, Mexico. E-mail: [email protected]

Received: 12 January 2004 / Accepted: 7 October 2004
Published online: 12 February 2005 – © Springer-Verlag 2005
Abstract: We investigate a many-body problem in the plane introduced by Calogero and intensively studied by Calogero, Françoise and Sommacal. An ad hoc complexification transforms the many-body problem into a system of second-order autonomous complex equations depending on some complex constants that describe the two-body interactions. We investigate the sets of two-body interaction constants for which the complexified equation has the Painlevé Property, that is, its solutions are given by single-valued meromorphic functions. In this case the original system has only periodic isochronous solutions. We exhibit a family of settings where the system displays this property and show that it is not present in the three- and four-body problems that do not fall within our class. For this, we introduce a necessary condition for the presence of the Painlevé Property in some quasihomogenous systems.

1. Introduction

The aim of this article is to investigate the existence of some special settings of Calogero's "goldfish" many-body problem, the one given by the time evolution of the position of $N$ point particles moving in the plane. This evolution is given by the Newtonian equations
\[
\ddot r_j = \omega\,\kappa\wedge\dot r_j + 2\sum_{k\neq j} r_{jk}^{-2}\,(\alpha_{jk} + \tilde\alpha_{jk}\,\kappa\wedge)\bigl[\dot r_j(\dot r_k\cdot r_{jk}) + \dot r_k(\dot r_j\cdot r_{jk}) - r_{jk}(\dot r_j\cdot\dot r_k)\bigr]. \tag{1}
\]
In these, $r_j = (x_j, y_j, 0)$ gives the position of each particle, $\kappa = (0, 0, 1)$ is the direction of a magnetic vector field (orthogonal to the plane) acting upon the particles, $\omega \in \mathbb{R}$ is a natural frequency of the system, $\alpha_{jk}, \tilde\alpha_{jk} \in \mathbb{R}$ are the symmetric interaction constants ($\alpha_{jk} = \alpha_{kj}$, $\tilde\alpha_{jk} = \tilde\alpha_{kj}$) and, finally, $r_{jk} = r_j - r_k$. This many-body problem was originally introduced by Calogero [2]. The particular case $\alpha_{jk} = 1$ for every couple $(j, k)$ (the original "goldfish") was studied by Calogero [3], who gave the remarkable explicit form of its solutions. Other cases have been treated by Calogero-Françoise [7] and Calogero-Françoise-Sommacal [9]. The reader will find in these works both physical and mathematical motivations for the study of this system.

We will address the problem, posed in [4], of determining the values of the interaction constants for which, for every initial condition, the solution is periodic with period $1/\omega$, as in the case where the interaction constants vanish. For the study of this problem we will use an ad hoc complexification of the system introduced by Calogero: if we take $z_j = x_j + \sqrt{-1}\,y_j$ and^1
\[
\beta_{jk} = -2\bigl(\alpha_{jk} + \sqrt{-1}\,\tilde\alpha_{jk}\bigr),
\]
the time evolution can be rewritten as
\[
\ddot z_j - \sqrt{-1}\,\omega\,\dot z_j = -\sum_{k=1,\,k\neq j}^{N} \beta_{jk}\,\frac{\dot z_j\,\dot z_k}{z_j - z_k}.
\]
Moreover, the change of coordinates $z_j(t) = \zeta_j(\tau)$ with
\[
\tau(t) = \frac{\exp(\sqrt{-1}\,\omega t) - 1}{\sqrt{-1}\,\omega} \tag{2}
\]
gives the system the very simple form
\[
\zeta_j'' = -\sum_{k=1,\,k\neq j}^{N} \beta_{jk}\,\frac{\zeta_j'\,\zeta_k'}{\zeta_j - \zeta_k}, \tag{3}
\]
where the last derivatives are taken with respect to $\tau$. This is a system of second-order complex differential equations (both time and space coordinates are complex). If we wish to recover the motion of the original system (1) we need only take the solution of (3) with suitable initial conditions and follow, within its definition domain, the curve $\tau(t)$ given by (2), which amounts to following, in the complex $\tau$-plane, a circle passing through the origin. We are thus inevitably confronted, while trying to solve the system (1), with the accidents that befall the solutions of complex differential equations: singularities, multivaluedness, essential boundaries, etc. For the system (1) to have solutions of period $1/\omega$ for every initial condition we need every circle in the definition domain of the solution of (3) to be mapped, via the solution, to a closed curve in phase space, that is, we need the solution of (3) to be single-valued. Secondly, we need the solutions to be free of essential boundaries and have at most poles as singularities. In this way, the system (1) will have (possibly unbounded) periodic solutions of period $1/\omega$ for every initial condition and every $\omega \in \mathbb{R}$ if and only if every solution to the system (3) is a meromorphic function defined in the whole plane, a property usually called the Painlevé Property.

^1 Warning: Calogero takes the constants $a_{jk} = \alpha_{jk} + \sqrt{-1}\,\tilde\alpha_{jk}$. This normalisation is better suited for what follows.

From the Two-body Problem (see [4] and Subsect. 4.1) we learn that, if the system has the Painlevé Property, then $\beta_{ij} \in \{0, 1, 2, 3, 4\}$ for every $i, j$. We will construct a graph associated to any many-body problem having such interaction constants: each vertex represents a body and $\beta_{ij}$ edges are to be drawn between the vertices corresponding to the $i$-th and $j$-th bodies. Our results are the following:

Theorem 1. If the associated graph has at most one cycle per connected component, then the system (3) has the Painlevé Property and every solution is given by entire functions. If the graph is a tree then every solution is polynomial.

Theorem 2. If the graph associated to the system (3) for $N = 3$ and $N = 4$ is connected and has more than one cycle then the corresponding system does not have the Painlevé Property.

The proof of the first theorem is given in Sect. 2. That of the second one depends crucially on the symmetries of the system. Equations (3) enjoy the following invariance property: from a solution $\zeta(t) = (\zeta_1, \dots, \zeta_n)$ and three constants $\alpha, \beta, \gamma \in \mathbb{C}$ with $\alpha\gamma \neq 0$ we can construct another solution $\tilde\zeta = (\tilde\zeta_1, \dots, \tilde\zeta_n)$ by setting $\tilde\zeta_j(t) = \alpha\,\zeta_j(\gamma t) + \beta$. This property can be readily expressed in a Lie-algebraic fashion. Let $\xi_j = \zeta_j'$. Give $\mathbb{C}^{2N}$ coordinates $(\zeta_1, \dots, \zeta_N, \xi_1, \dots, \xi_N)$. Consider the following meromorphic vector fields:
\[
Y = \sum_{i=1}^{N}\Bigl(\xi_i\,\frac{\partial}{\partial\zeta_i} - \sum_{j\neq i}\beta_{ij}\,\frac{\xi_i\,\xi_j}{\zeta_i - \zeta_j}\,\frac{\partial}{\partial\xi_i}\Bigr), \tag{4}
\]
\[
X = -\sum_{i=1}^{N}\xi_i\,\frac{\partial}{\partial\xi_i}, \qquad
T = \sum_{i=1}^{N}\frac{\partial}{\partial\zeta_i}, \qquad
R = \sum_{i=1}^{N}\Bigl(\zeta_i\,\frac{\partial}{\partial\zeta_i} + \xi_i\,\frac{\partial}{\partial\xi_i}\Bigr). \tag{5}
\]
The first one gives the evolution of the system under consideration; the other three generate its symmetries. The following Lie algebra relations hold:
\[
[R, T] = -T, \quad [R, Y] = 0, \quad [R, X] = 0, \quad [X, Y] = -Y, \quad [X, T] = 0, \quad [T, Y] = 0.
\]
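As a brief worked check of the key relation $[X, Y] = -Y$, offered as a sketch rather than as part of the original argument, one can use a homogeneity count. The auxiliary field $E$ and the abbreviation $g_i$ below are notation introduced only for this computation. Write $E = \sum_i \xi_i\,\partial/\partial\xi_i = -X$ and $g_i = -\sum_{j\neq i}\beta_{ij}\,\xi_i\xi_j/(\zeta_i - \zeta_j)$, so that $Y = \sum_i (\xi_i\,\partial/\partial\zeta_i + g_i\,\partial/\partial\xi_i)$.

```latex
\begin{align*}
[E,\, f\,\partial_{\zeta_i}] &= E(f)\,\partial_{\zeta_i}, &
[E,\, g\,\partial_{\xi_i}] &= \bigl(E(g) - g\bigr)\,\partial_{\xi_i}, \\
E(\xi_i) &= \xi_i \quad (\text{degree } 1 \text{ in } \xi), &
E(g_i) &= 2\,g_i \quad (\text{degree } 2 \text{ in } \xi), \\
[E, Y] &= \sum_i \bigl(\xi_i\,\partial_{\zeta_i} + g_i\,\partial_{\xi_i}\bigr) = Y, &
[X, Y] &= -[E, Y] = -Y .
\end{align*}
```

The same kind of weight count, with both $\zeta$ and $\xi$ given weight one, accounts for $[R, Y] = 0$.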
The couples of vector fields $(R, T)$ and $(X, Y)$ generate Lie subalgebras isomorphic to the Lie algebra of the affine group $\mathrm{Aff}(\mathbb{C})$, the group of matrices of the form $\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}$ with $a \neq 0$. Vector fields in different couples commute, so the whole Lie algebra is isomorphic to $\mathfrak{aff}(\mathbb{C}) \oplus \mathfrak{aff}(\mathbb{C})$. Our criterion, stating a necessary condition for the presence of the Painlevé Property, is based on the Lie-algebraic relation between $X$ and $Y$ and on the periodic nature of the solutions of $X$. It will be given in Sect. 3 and applied to our many-body problem in Sect. 4.

2. An Isochronous Setting

The aim of this section is to give a proof of Theorem 1. The bodies corresponding to each connected component of the associated graph evolve independently of the others and we will, accordingly, suppose that the graph is connected. The proof is based on the following lemmas, which assume that the $j$-th body interacts exclusively with the $k$-th one as well as with the bodies whose indexes belong to the (possibly empty) set $I_j \subset \{1, \dots, N\}$, $k \notin I_j$. We also assume that $\beta_{jl} = 1$ for every $l \in I_j \cup \{k\}$.
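Lemma 1. If for all $i \in I_j$,
\[
\frac{\zeta_i'}{\zeta_i - \zeta_j} = \frac{d}{d\tau}\log G_i,
\]
for an entire function $G_i$, not identically zero, then for some $c \in \mathbb{C}$,
\[
\frac{\zeta_j'}{\zeta_j - \zeta_k} = \frac{d}{d\tau}\log\Bigl(\int\prod_{i\in I_j} G_i(x)\,dx + c\Bigr) = \frac{d}{d\tau}\log G.
\]

Proof. A straightforward calculation shows that
\[
\frac{d}{d\tau}\Bigl(\frac{\zeta_j - \zeta_k}{\zeta_j'}\Bigr) = 1 - \frac{\zeta_j - \zeta_k}{\zeta_j'}\,\frac{d}{d\tau}\log\prod_{i\in I_j} G_i. \tag{6}
\]
The assertion is obtained by solving this linear differential equation.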
Lemma 2. If $\zeta_k$ is an entire function,
\[
\frac{\zeta_j'}{\zeta_j - \zeta_k} = \frac{d}{d\tau}\log G,
\]
and $\zeta_k' = G H'$ for entire functions $G$ and $H$, then $\zeta_j$ is an entire function and $G'$ divides $\zeta_j'$ (within the ring of entire functions).

Proof. The general solution to the linear differential equation $\zeta_j' = (\zeta_j - \zeta_k)\,G'/G$ is given by
\[
\zeta_j(\tau) = -G\int\frac{\zeta_k\,G'}{G^2}\,d\tau = G\Bigl(\frac{\zeta_k}{G} - \int\frac{\zeta_k'}{G}\,d\tau\Bigr) = G\Bigl(\frac{\zeta_k}{G} - \int H'\,d\tau\Bigr) = \zeta_k - G H.
\]

We can now give the proof of Theorem 1. In what follows, the unassigned initial conditions of the differential equations involved can be adjusted in order to yield solutions of (3) for every non-singular initial condition. Three cases arise:

Trees. Suppose that $\beta_{ij}$ is either vanishing or equal to 1 and that the graph has no cycles. Equip the vertices of the graph with the following partial order: the $j$-th vertex is smaller than the $k$-th one if the unique path that joins the $j$-th vertex to the first one passes through the $k$-th one. The maximal element is the first body; the minimal ones are those linked to the graph by a single edge. For each $j \neq 1$, let $k(j)$ be the only neighbour of $j$ that is greater than $j$ and let $I_j$ denote the set of neighbours of $j$ that are smaller than $j$. If the $j$-th body is minimal it interacts exclusively with the $k(j)$-th one and then, according to Eq. (6),
\[
\frac{\zeta_j'}{\zeta_j - \zeta_{k(j)}} = \frac{1}{\tau - c} = \frac{d}{d\tau}\log(\tau - c),
\]
for some $c \in \mathbb{C}$ depending upon the initial conditions. The hypothesis of Lemma 1 is thus satisfied for the next vertex in ascending order. We can continue in this way up to the first vertex, whose time evolution will be given by
\[
\frac{\zeta_1''}{\zeta_1'} = \frac{d}{d\tau}\log\prod_i P_i,
\]
for polynomials $P_i(\tau)$ coming from the vertices neighbouring the first one. Hence, $\zeta_1' = \prod_i P_i$ and $P_i$ divides $\zeta_1'$ for all $i$. The hypothesis of Lemma 2 is satisfied and we can thus spread this solution back along the tree. The solutions are given by polynomials.

Graphs with a single cycle of order two. Suppose that $\alpha_{12} = -1$. We have trees (possibly not connected, possibly empty) sprouting from the first and second vertices. Following our previous discussion, we can reduce our system to the non-autonomous one given by
\[
\zeta_1'' = \zeta_1'\Bigl(-2\,\frac{\zeta_2'}{\zeta_1 - \zeta_2} + \frac{P_1'}{P_1}\Bigr), \qquad
\zeta_2'' = \zeta_2'\Bigl(-2\,\frac{\zeta_1'}{\zeta_2 - \zeta_1} + \frac{P_2'}{P_2}\Bigr),
\]
for two polynomials $P_1(\tau)$, $P_2(\tau)$ that do not vanish at $\tau = 0$. The function
\[
K = \frac{\zeta_1'\,\zeta_2'}{P_1 P_2\,(\zeta_1 - \zeta_2)^2}
\]
is constant (see [4]), as can be easily verified by taking its logarithmic derivative. By setting $\zeta_i' = P_i\theta_i$, we obtain the system of linear differential equations with polynomial coefficients given by
\[
\zeta_1' = P_1\theta_1, \quad \zeta_2' = P_2\theta_2, \quad \theta_1' = -2KP_2(\zeta_1 - \zeta_2), \quad \theta_2' = -2KP_1(\zeta_2 - \zeta_1).
\]
Hence, the functions $\zeta_i$ are entire and $P_i$ divides $\zeta_i'$. We can now, following Lemma 2, go back through each tree sprouting from the first two vertices.

Graphs with a single cycle. We will suppose that the cycle is given by the vertices labelled 1 through $n$ ($n \ge 3$) and that their adjacencies in the graph are given by cyclic ordering. By applying our previous results we transform the system into the non-autonomous one given by
\[
\zeta_j'' = \zeta_j'\Bigl(-\frac{\zeta_{j-1}'}{\zeta_j - \zeta_{j-1}} - \frac{\zeta_{j+1}'}{\zeta_j - \zeta_{j+1}} + \frac{P_j'}{P_j}\Bigr), \qquad j = 1, \dots, n \bmod n, \tag{7}
\]
for $n$ polynomials $P_i(\tau)$. Consider the $2n$ entire functions $F_i$ and $G_i$, given by the system of linear differential equations
\[
F_j' = P_j F_{j-1}, \qquad G_j' = -P_j G_{j+1}, \qquad j = 1, \dots, n \bmod n,
\]
with the initial conditions $G_{j+1}(0) = P_j(0)^{-1}$. We claim that the functions
\[
\zeta_j' = F_{j-1} P_j G_{j+1}, \qquad j = 1, \dots, n \bmod n, \tag{8}
\]
give the most general solution to the system (7). In fact, the identities
\[
\int F_{j-1} P_j G_{j+1} = F_j G_{j+1} + \int F_j P_{j+1} G_{j+2}, \qquad
\int F_{j-1} P_j G_{j+1} = -F_{j-1} G_j + \int F_{j-2} P_{j-1} G_j,
\]
obtained by formulae (8) and integration by parts, give rise to the equalities:
\[
\frac{\zeta_{j+1}'}{\zeta_{j+1} - \zeta_j} = \frac{F_j P_{j+1} G_{j+2}}{\int F_j P_{j+1} G_{j+2} - \int F_{j-1} P_j G_{j+1}} = \frac{G_{j+1}'}{G_{j+1}},
\]
\[
\frac{\zeta_{j-1}'}{\zeta_{j-1} - \zeta_j} = \frac{F_{j-2} P_{j-1} G_j}{\int F_{j-2} P_{j-1} G_j - \int F_{j-1} P_j G_{j+1}} = \frac{F_{j-1}'}{F_{j-1}}.
\]
Thus, every solution of the system (7) is given by entire functions. This finishes the proof of Theorem 1.

Fig. 1. Connected graphs in the three-body problem having at most one cycle
Earlier instances of this theorem are given in [4, 5, 8], where it is shown that the three-body problems given by the graphs in Fig. 1 are solved by entire functions. If the graph of Theorem 1 has no branching vertices (if the edges are simple and every body interacts with at most two others), our result reduces to theorems proven in [6, 1].

3. The Criterion

Let $X$ be a complete vector field on $\mathbb{C}^n$ such that all its solutions are $2i\pi$-periodic. Let $Y$ be a meromorphic vector field on $\mathbb{C}^n$ that is quasihomogenous for $X$, that is, the Lie bracket relation $[X, Y] = -Y$ is satisfied. Let $M \subset \mathbb{C}^n$ be the open set where $Y$ is holomorphic. Define a similarity solution of $Y$ relative to $X$ as a common one-dimensional orbit of $X$ and $Y$ within $M$ that is invariant by $Y$ (neither $X$ nor $Y$ vanish along the orbit). Our criterion will bring forth an obstruction for $Y$ to have the Painlevé Property that localizes around its similarity solutions. Let $p \in M$ be a point that lies in a similarity solution. There exists a unique $\alpha_p \in \mathbb{C}$ such that the vector field $Z = X + \alpha_p Y$ vanishes at $p$. The Taylor development of $Z$ around $p$ starts with a linear part $\Lambda(p)$. The eigenvalues of the latter are intrinsically attached to the structure of the couple of vector fields around the similarity solution. Our criterion is the following:

Theorem 3. If the vector field $Y$ has the Painlevé Property then for every $p$ belonging to a similarity solution (where $Y$ has neither zeroes nor poles),
1. every eigenvalue of $\Lambda(p)$ is an integer and
2. if $\Lambda(p)$ has a vanishing eigenvalue, then the similarity solution through $p$ is not isolated.

The proof of the criterion is based on the following lemmas:

Lemma 3. Under the above hypothesis, if $Y$ has the Painlevé Property then for every $\alpha \in \mathbb{C}$ the vector field $X + \alpha Y$ has the Painlevé Property and, moreover, every one of its solutions is $2i\pi$-periodic.

Lemma 4. Let $W$ be a vector field defined in a neighborhood of the origin of $\mathbb{C}^n$ such that $W(0) = 0$. If every solution is single-valued and $2i\pi$-periodic then there exists a holomorphic change of coordinates fixing the origin that takes $W$ to the vector field $\sum_{i=1}^{n}\lambda_i z_i\,\partial/\partial z_i$, with $\lambda_i \in \mathbb{Z}$.

Proof of Lemma 3. Denote by $\varphi_W^t(q)$ the evaluation at time $t$ of the solution of the vector field $W$ with initial condition $q$ (assuming it is, when defined, well defined). The infinitesimal relation $[X, Y] = -Y$ can be restated as the fact that, for every $q \in M$ and for any $t$ and $s$ in a sufficiently small neighborhood of $0 \in \mathbb{C}$, we have
\[
\varphi_X^t \circ \varphi_Y^s(q) = \varphi_Y^{e^t s} \circ \varphi_X^t(q). \tag{9}
\]
Because $X$ is actually a complete vector field, this relation holds for every $t \in \mathbb{C}$ and for every $s$ where $\varphi_Y^s(q)$ is defined. Denote by $(a, b)$ the element $\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix} \in \mathrm{Aff}(\mathbb{C})$. Let $\Phi : \mathrm{Aff}(\mathbb{C}) \times M \to M$ be the meromorphic action to the left^2 of the affine group on $M$ defined by
\[
\Phi[(a, b), q] = \varphi_Y^{b} \circ \varphi_X^{\log a}(q).
\]
Notice that this is well defined, because the solutions of $X$ are all $2i\pi$-periodic. It is indeed an action, for we have
\[
\begin{aligned}
\Phi[(a_1, b_1), \Phi[(a_0, b_0), q]] &= \varphi_Y^{b_1} \circ \varphi_X^{\log a_1} \circ \varphi_Y^{b_0} \circ \varphi_X^{\log a_0}(q) \\
&= \varphi_Y^{b_1} \circ \varphi_Y^{a_1 b_0} \circ \varphi_X^{\log a_1} \circ \varphi_X^{\log a_0}(q) \\
&= \varphi_Y^{a_1 b_0 + b_1} \circ \varphi_X^{\log a_1 a_0}(q) \\
&= \Phi[(a_1, b_1)\cdot(a_0, b_0), q],
\end{aligned}
\]
where defined. The restriction of this meromorphic action to the subgroup
\[
\exp\Bigl[t\begin{pmatrix} 1 & \alpha \\ 0 & 0 \end{pmatrix}\Bigr] = \begin{pmatrix} e^t & \alpha e^t - \alpha \\ 0 & 1 \end{pmatrix}
\]
gives the flow of $X + \alpha Y$. Because the kernel of this exponential (as a group morphism) is $2i\pi\mathbb{Z}$, the solutions of $X + \alpha Y$ all have $2i\pi$ among their periods.

^2 The arguments that follow need only the solutions of $X$ to be single-valued in their maximal definition domain. The meromorphic action is then to be replaced by a maximal local holomorphic action in the sense of Palais [10].
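The closed form of the one-parameter subgroup used at the end of the proof of Lemma 3 can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names (`mat_mul`, `mat_exp`, `subgroup_element`, `max_diff`) and the sample values of $t$ and $\alpha$ are ours and purely illustrative. It compares a truncated power series for the exponential of $t\begin{pmatrix}1 & \alpha\\ 0 & 0\end{pmatrix}$ with the matrix quoted in the text, and evaluates the latter at $t = 2i\pi$.

```python
import cmath

def mat_mul(a, b):
    """Multiply two 2x2 complex matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def mat_exp(a, terms=60):
    """Approximate exp(a) for a 2x2 matrix by truncating its power series."""
    result = ((1 + 0j, 0j), (0j, 1 + 0j))
    term = result
    for n in range(1, terms):
        # term becomes a^n / n! at step n
        term = tuple(tuple(x / n for x in row) for row in mat_mul(term, a))
        result = tuple(
            tuple(r + s for r, s in zip(res_row, term_row))
            for res_row, term_row in zip(result, term)
        )
    return result

def subgroup_element(t, alpha):
    """Closed form of exp(t * [[1, alpha], [0, 0]]) as quoted in the proof."""
    return ((cmath.exp(t), alpha * cmath.exp(t) - alpha), (0j, 1 + 0j))

def max_diff(a, b):
    """Largest entrywise distance between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

# Arbitrary sample values, chosen only for illustration.
t, alpha = 0.3 + 0.7j, 2 - 1j
generator = ((t, t * alpha), (0j, 0j))
identity = ((1 + 0j, 0j), (0j, 1 + 0j))

print(max_diff(mat_exp(generator), subgroup_element(t, alpha)))
print(max_diff(subgroup_element(2j * cmath.pi, alpha), identity))
```

Both printed distances should be at the level of floating-point rounding, which is consistent with $2i\pi$ lying in the kernel of the exponential.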
Proof of Lemma 4. Consider the real vector field $\Im(W)$, given by the imaginary part of the vector field $W$. This vector field vanishes at the origin and all its solutions are $2\pi$-periodic. The flow of $\Im(W)$ gives an action of the compact (real) Lie group $\mathbb{R}/2\pi\mathbb{Z}$ by biholomorphisms and this action fixes the origin. Now, according to Cartan's Linearization Theorem [11, p. 154], this action is holomorphically linearizable in a neighborhood of the origin and thus, in suitable coordinates, $\Im(W)$ (and hence $W$) is given by a linear vector field. The hypothesis on the periodicity of the solutions implies that the linear vector field is diagonalizable and that its eigenvalues are integers.

Proof of Theorem 3. Assume that $Y$ has the Painlevé Property and let $p$ be a point that belongs to a similarity solution. Let $Z = X + \alpha_p Y$. Consider coordinates $(w_1, \dots, w_n)$ centered around $p$, where $Y = \partial/\partial w_1$. In order for the relation $[Z, Y] = -Y$ to take place, $Z = w_1\,\partial/\partial w_1 + \sum_{i=1}^{n} f_i(w_2, \dots, w_n)\,\partial/\partial w_i$ for $n$ functions $f_i$ vanishing at the origin. Consider the projection $P$ onto the hyperplane $\{w_1 = 0\}$. The vector field $P_*Z$ is $\sum_{i=2}^{n} f_i(w_2, \dots, w_n)\,\partial/\partial w_i$. The solutions of this vector field are single-valued and $2i\pi$-periodic, because, according to Lemma 3, those of $Z$ are. Apply now Lemma 4 to the vector field $P_*Z$. By applying the change of coordinates guaranteed by this lemma we have coordinates where $Y = \partial/\partial w_1$ and $Z = [w_1 + f_1(w_2, \dots, w_n)]\,\partial/\partial w_1 + \sum_{i=2}^{n}\lambda_i w_i\,\partial/\partial w_i$ with $\lambda_i \in \mathbb{Z}$. The locus of points belonging to similarity solutions is locally given by $\bigcap_{i\ge 2}\{\lambda_i w_i = 0\}$. Both items follow in a straightforward manner.

Remark 1. Because, by relation (9), the orbits of $Y$ are invariant under the flow of $X$, they induce a foliation on the orbit space of $X$. In an $(n-1)$-dimensional disc around $p$ in the hyperplane $\{w_1 = 0\}$, the vector field $Z - f_1 Y = \sum_{i=2}^{n}\lambda_i w_i\,\partial/\partial w_i$, tangent to this hyperplane, is in the linear span of $X$ and $Y$ (in restriction to the disc). Thus, the foliation that $Y$ induces in the orbit space of $X$ has singularities at the similarity solutions and, in a neighborhood of such a point, it is given by the trajectories of a linear vector field with commensurable eigenvalues.

Remark 2. This criterion resembles, in its form, a test introduced by Yoshida (that of Kowalevski's exponents) exhibiting obstructions to the existence of first integrals for quasihomogenous (polynomial) systems [12]. Notice that, if the system has the Painlevé Property, the proof of Theorem 3 actually constructs local first integrals common to $X$ and $Y$ in a neighborhood of a similarity solution, those given by $w_i^{\lambda_j} w_j^{-\lambda_i}$ for $i, j > 1$.

4. The Many-Body Problems

We will now apply this criterion to some many-body problems in order to prove Theorem 2. In order to do this, we will not work directly with the vector fields $Y$ and $X$ of Eqs. (4–5), but with some dynamical factors of them where our criterion can still be applied. The vector fields $T$ and $R$ generate a two-dimensional foliation whose leaves are the fibers of the rational map $\Pi : \mathbb{C}^{2N} \dashrightarrow \mathbb{CP}^{2(N-1)}$ given by
\[
(\zeta_1, \dots, \zeta_N, \xi_1, \dots, \xi_N) \mapsto [\zeta_1 - \zeta_N : \cdots : \zeta_{N-1} - \zeta_N : \xi_1 : \cdots : \xi_N].
\]
Because both $X$ and $Y$ commute with the first two vector fields, their images are well-defined in the target space. If we give to the latter the affine coordinates
\[
[\eta_1 : \cdots : \eta_{N-1} : \rho_1 : \cdots : \rho_{N-1} : 1],
\]
then these images are given respectively by $\widetilde{X} = \sum_{i=1}^{N-1}\eta_i\,\partial/\partial\eta_i$ and
\[
\widetilde{Y} = \sum_{i=1}^{N-1}\Bigl[\Bigl(\rho_i - 1 - \eta_i\sum_{j=1}^{N-1}\frac{\beta_{jN}\,\rho_j}{\eta_j}\Bigr)\frac{\partial}{\partial\eta_i}
- \rho_i\Bigl(\frac{\beta_{iN}}{\eta_i} + \sum_{j\neq i}\frac{\beta_{ij}\,\rho_j}{\eta_i - \eta_j} + \sum_{j=1}^{N-1}\frac{\beta_{jN}\,\rho_j}{\eta_j}\Bigr)\frac{\partial}{\partial\rho_i}\Bigr]. \tag{10}
\]
The relation $[\widetilde{X}, \widetilde{Y}] = -\widetilde{Y}$ is preserved. Because the chosen affine coordinates are rational functions of the original ones, this reduced system will have the Painlevé Property whenever the original one does.
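4.1. The two-body problem. With the aim of making this article self-contained, we will show that if the system (4) for $N = 2$ has the Painlevé Property then $\beta_{12} \in \{0, 1, 2, 3, 4\}$. This was originally proved by Calogero, who showed that these values give indeed systems having the Painlevé Property [4]. In this case, the expression (10) reduces to
\[
\widetilde{Y} = (\rho_1 - 1 - \beta_{12}\rho_1)\,\frac{\partial}{\partial\eta_1} - \frac{\beta_{12}}{\eta_1}\,\rho_1(1 + \rho_1)\,\frac{\partial}{\partial\rho_1},
\]
while $\widetilde{X} = \eta_1\,\partial/\partial\eta_1$. The similarity solutions are given by $\{\rho_1 = 0\}$ and, if $\beta_{12} \neq 2$, $\{\rho_1 = -1\}$. For $\{\rho_1 = 0\}$, the vector field $\widetilde{X} - \widetilde{Y}$ vanishes at the origin of the coordinates $(\tilde\eta, \tilde\rho) = (\eta_1 + 1, \rho_1)$. The linear part of its Taylor development at this point is given by $(\tilde\eta + [\beta_{12} - 1]\tilde\rho)\,\partial/\partial\tilde\eta - \beta_{12}\,\tilde\rho\,\partial/\partial\tilde\rho$. Likewise, for $\{\rho_1 = -1\}$, the vector field $\widetilde{X} - \widetilde{Y}$ vanishes at the origin of the coordinates $(\tilde\eta, \tilde\rho) = (\eta_1 + 2 - \beta_{12}, \rho_1 + 1)$ and the corresponding linear vector field is $(\tilde\eta + [\beta_{12} - 1]\tilde\rho)\,\partial/\partial\tilde\eta - \beta_{12}/(\beta_{12} - 2)\,\tilde\rho\,\partial/\partial\tilde\rho$. According to our criterion, the two-body system cannot have the Painlevé Property unless $\beta_{12} \in \mathbb{Z}$ and, if $\beta_{12} \neq 2$, $\beta_{12}/(\beta_{12} - 2) \in \mathbb{Z}$, that is, unless $\beta_{12} \in \{0, 1, 2, 3, 4\}$.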
4.2. The three-body problem. We will now apply this criterion to the three-body problem (10). Notice that, in restriction to the invariant hypersurface $\{\xi_i = 0\}$, the $n$-body problem (3) follows the evolution of an $(n-1)$-body problem. The results of the previous paragraph allow us to consider only interaction constants within $\{0, 1, 2, 3, 4\}$. We will consider the restriction of $\widetilde{Y}$ to the hyperplane $H = \{\rho_1 + \rho_2 + 1 = 0\}$ (invariant by $\widetilde{Y}$ and $\widetilde{X}$), by setting $\rho_2 = -(\rho_1 + 1)$. Lack of the Painlevé Property for the restricted system implies lack of it for the general one. The restrictions to $H$ of the vector fields read $\widetilde{X} = \eta_1\,\partial/\partial\eta_1 + \eta_2\,\partial/\partial\eta_2$ and
\[
\begin{aligned}
\widetilde{Y} ={}& \Bigl[(1 - \beta_{13})\rho_1 - 1 + \beta_{23}\,\frac{(\rho_1 + 1)\eta_1}{\eta_2}\Bigr]\frac{\partial}{\partial\eta_1}
+ \Bigl[\beta_{23} - 2 + (\beta_{23} - 1)\rho_1 - \beta_{13}\,\frac{\rho_1\eta_2}{\eta_1}\Bigr]\frac{\partial}{\partial\eta_2} \\
&+ \rho_1(\rho_1 + 1)\Bigl[-\frac{\beta_{13}}{\eta_1} + \frac{\beta_{23}}{\eta_2} + \frac{\beta_{12}}{\eta_1 - \eta_2}\Bigr]\frac{\partial}{\partial\rho_1}.
\end{aligned} \tag{11}
\]
Once again, the relation $[\widetilde{X}, \widetilde{Y}] = -\widetilde{Y}$ is preserved and $\widetilde{X}$ is a complete vector field whose solutions are $2i\pi$-periodic. The locus of collinearity of $\widetilde{X}$ and $\widetilde{Y}$, where all the similarity solutions live, is the set of points, outside $\{(\eta_1 - \eta_2)\eta_1\eta_2 = 0\}$, where their cross-product vanishes. This cross-product $\widetilde{X} \times \widetilde{Y}$ is given by
\[
\frac{\rho_1(\rho_1 + 1)\,Q(\eta_1, \eta_2)}{\eta_1\eta_2(\eta_1 - \eta_2)}\Bigl(\eta_2\,\frac{\partial}{\partial\eta_1} - \eta_1\,\frac{\partial}{\partial\eta_2}\Bigr) - \bigl[\rho_1(\eta_1 + \eta_2) + (2\eta_1 - \eta_2)\bigr]\frac{\partial}{\partial\rho_1},
\]
for the polynomial $Q(\eta_1, \eta_2) = \beta_{23}\eta_1^2 + (\beta_{12} - \beta_{23} - \beta_{13})\eta_1\eta_2 + \beta_{13}\eta_2^2$. This vector field is zero in the cases
1. $\rho_1 = 0$, $\eta_2 = 2\eta_1$,
2. $\rho_1 = -1$, $\eta_1 = 2\eta_2$,
3. $\rho_1 = (\eta_2 - 2\eta_1)/(\eta_1 + \eta_2)$, $Q(\eta_1, \eta_2) = 0$.
In particular, the similarity solutions are always isolated. We will focus on the similarity solutions given by the third item, which give true similarity solutions (in the sense that $\widetilde{Y}$ does not vanish along them) if $\beta_{12} + \beta_{23} + \beta_{13} \neq 3$. Notice that this third condition gives two similarity solutions (counted with multiplicity) unless the polynomial $Q(\eta_1, \eta_2)$ is divisible by $(\eta_1 - \eta_2)$, $\eta_1$ or $\eta_2$. This will happen if and only if, respectively, $\beta_{12} = 0$, $\beta_{13} = 0$ or $\beta_{23} = 0$. Because at least two of the coupling constants are different from zero, we will always have a similarity solution arising from this item. Instead of considering the local situation for each similarity solution, we will consider, following Remark 1, the plane $\{\eta_2 = 1\}$. In this plane we have the tangent vector field
\[
F = \bigl[\eta_1(\eta_1 - 1)(2\eta_1 + \rho_1\eta_1 + \rho_1 - 1)\bigr]\frac{\partial}{\partial\eta_1} + \bigl[\rho_1(\rho_1 + 1)\,Q(\eta_1, 1)\bigr]\frac{\partial}{\partial\rho_1}, \tag{12}
\]
that is in the linear span of the restrictions of $\widetilde{X}$ and $\widetilde{Y}$ to the plane. Because each orbit of $\widetilde{X}$ cuts the plane $\{\eta_2 = 1\}$ at most once, we can think of this plane as a chart of the quotient of $\mathbb{C}^3$ by the action of $\widetilde{X}$.

We will now apply our test to the vector field (11) and deal first with the cases where all the coupling constants are different from zero. Set $\beta_{13} = \beta_{23}\kappa_1\kappa_2$ and $\beta_{12} = \beta_{23}(\kappa_1 - 1)(\kappa_2 - 1)$ for a couple of numbers $\kappa_1, \kappa_2 \in \mathbb{C}\setminus\{0, 1\}$. We will not calculate the ratio of the eigenvalues of the linear part $\Lambda$ of the Taylor development for each singular point of $F$, but calculate instead the number $B = \mathrm{tr}^2(\Lambda)/\det(\Lambda)$, which is commonly called the Baum-Bott index of the singular point. If, in Eq. (12), we set $(\eta_1, \rho_1) = \bigl(u + \kappa_1,\ v + \frac{1 - 2\kappa_1}{\kappa_1 + 1}\bigr)$, then the vector field $F$ transforms into a vector field vanishing at the origin of the $(u, v)$ coordinates whose linear part is given by
\[
\Bigl[3\,\frac{\kappa_1(\kappa_1 - 1)}{\kappa_1 + 1}\,u + \kappa_1(\kappa_1^2 - 1)\,v\Bigr]\frac{\partial}{\partial u} + \beta_{23}\,\frac{(2\kappa_1 - 1)(\kappa_1 - 2)(\kappa_1 - \kappa_2)}{(\kappa_1 + 1)^2}\,u\,\frac{\partial}{\partial v},
\]
as an explicit calculation shows. The corresponding Baum-Bott number is thus
\[
B_1 = -\frac{9}{\beta_{23}}\,\frac{\kappa_1(\kappa_1 - 1)}{(\kappa_1 - \kappa_2)(\kappa_1 + 1)(\kappa_1 - 2)(2\kappa_1 - 1)},
\]
and the symmetries of the vector field $F$ imply that, for the other singular point,
\[
B_2 = -\frac{9}{\beta_{23}}\,\frac{\kappa_2(\kappa_2 - 1)}{(\kappa_2 - \kappa_1)(\kappa_2 + 1)(\kappa_2 - 2)(2\kappa_2 - 1)}.
\]
These quantities are the roots of the quadratic polynomial $(s - B_1)(s - B_2) = s^2 + c_1 s + c_0$. The quantities $c_0$ and $c_1$ are symmetric in $\kappa_1$ and $\kappa_2$ and are thus expressible in terms of $\beta_{23}, \beta_{13}, \beta_{12}$. They are in fact given by:
\[
c_1 = -18\,\frac{\beta_{13}\beta_{12} + \beta_{23}\beta_{12} + \beta_{23}\beta_{13}}{\Gamma}, \qquad
c_0 = -81\,\frac{\beta_{23}\beta_{13}\beta_{12}}{\Gamma\Delta},
\]
for
\[
\Gamma = (2\beta_{23} + 2\beta_{13} - \beta_{12})(2\beta_{13} + 2\beta_{12} - \beta_{23})(2\beta_{12} + 2\beta_{23} - \beta_{13}),
\]
\[
\Delta = \beta_{23}^2 + \beta_{13}^2 + \beta_{12}^2 - 2\beta_{23}\beta_{13} - 2\beta_{13}\beta_{12} - 2\beta_{12}\beta_{23}.
\]
Notice that $(2\beta_{13} + 2\beta_{12} - \beta_{23}) = \beta_{23}(2\kappa_1 - 1)(2\kappa_2 - 1)$, so, up to a symmetry of the system, we can suppose $(2\kappa_1 - 1)(2\kappa_2 - 1) \neq 0$, for the three factors of $\Gamma$ cannot vanish simultaneously. Notice also that
\[
c_1^2 - 4c_0 = 324\,\frac{(\beta_{23} - \beta_{13})^2(\beta_{13} - \beta_{12})^2(\beta_{12} - \beta_{23})^2}{\Gamma^2\Delta}, \tag{13}
\]
so the sign of this expression is, when the coupling constants are all different, given by the sign of $\Delta$, whose value is given in Table 1. In these cases the Baum-Bott numbers cannot be rational for they are not even real, so the corresponding systems do not have the Painlevé Property. In Table 1 and the subsequent ones we have marked with a star the cases that were already ruled out by Calogero by studying the restriction of the system to the locus of similarity solutions [4].

Formula (13) shows that when two coupling constants coincide (say $\beta_{23} = \gamma$, $\beta_{13} = \gamma$, $\gamma \neq 0$), the Baum-Bott numbers for the two singularities coincide. They are in fact given by
\[
B_i = \frac{9\gamma}{(\gamma + 2\beta_{12})(4\gamma - \beta_{12})}
\]
for $i = 1, 2$. Notice that besides the case $\gamma = 1$, $\beta_{12} = 4$, the values taken by $B$ are, when $\gamma, \beta_{12} \in \{1, 2, 3, 4\}$, positive and smaller than 4, as shown in Table 2. Thus, the discriminant $B^2 - 4B$ is always negative. None of these cases correspond to a system having the Painlevé Property. The second part of our criterion rules out the case $\gamma = 1$, $\beta_{12} = 4$, for in this case, where there is one vanishing eigenvalue, the corresponding similarity solution is isolated.

Table 1. The values of $\Delta$ are negative

(β23, β13, β12)   (1, 2, 3)   (1, 2, 4)∗   (1, 3, 4)∗   (2, 3, 4)∗
Δ                 −8          −7           −12          −23
192
A. Guillot Table 2. The values of B lie between 0 and 4 (β23 = β13 ) (β13 , β12 )
B
(β13 , β12 )
B
(β13 , β12 )
B
(β13 , β12 )
B
(1, 1) (1, 2) (1, 3)∗ (1, 4)
— 9/10 9/7 ∞
(2, 1)∗ (2, 2) (2, 3)∗ (2, 4)∗
9/14 1/2 9/20 9/20
(3, 1)∗ (3, 2)∗ (3, 3)∗ (3, 4)∗
27/55 27/70 1/3 27/88
(4, 1)∗ (4, 2)∗ (4, 3)∗ (4, 4)∗
2/5 9/28 18/65 1/4
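The entries of Tables 1 and 2 can be recomputed with exact rational arithmetic. The following is a minimal sketch, not part of the original argument: the function names `quadratic_form` and `baum_bott_equal` are hypothetical, and the formulas are the quadratic expression of Table 1 and the expression B = 9γ/[(γ + 2β12)(4γ − β12)] for the case β23 = β13 = γ. Note that the formula evaluates to 1 at (1, 1), an entry that the table leaves blank.

```python
from fractions import Fraction
from itertools import product


def quadratic_form(b23, b13, b12):
    """Quadratic expression in the coupling constants tabulated in Table 1."""
    return (b23**2 + b13**2 + b12**2
            - 2 * b23 * b13 - 2 * b13 * b12 - 2 * b12 * b23)


def baum_bott_equal(gamma, b12):
    """B = 9*gamma / ((gamma + 2*b12)(4*gamma - b12)) for b23 = b13 = gamma.

    Returns None when the denominator vanishes (the entry marked as infinite).
    """
    den = (gamma + 2 * b12) * (4 * gamma - b12)
    return None if den == 0 else Fraction(9 * gamma, den)


TABLE1_CASES = [(1, 2, 3), (1, 2, 4), (1, 3, 4), (2, 3, 4)]
table1 = {case: quadratic_form(*case) for case in TABLE1_CASES}

# Keys are (gamma, b12) with both constants ranging over {1, 2, 3, 4}.
table2 = {(g, b12): baum_bott_equal(g, b12)
          for g, b12 in product(range(1, 5), repeat=2)}

finite = {k: v for k, v in table2.items() if v is not None}
all_in_range = all(0 < v < 4 for v in finite.values())
all_negative_discriminant = all(v * v - 4 * v < 0 for v in finite.values())

if __name__ == "__main__":
    for case, value in table1.items():
        print(case, value)
    for key in sorted(table2):
        print(key, table2[key])
```

Exact fractions are used instead of floating point so that the comparison with the tabulated values is an equality test rather than a tolerance check.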
We will deal now with the cases where one of the coupling constants is zero. Assume, from now on, that β12 = 0. In this case Y loses its pole along the hyperplane {η0 = η1} and the vector field F reduces to

F = [η1(2η1 + ρ1η1 + ρ1 − 1)] ∂/∂η1 + β23[ρ1(1 + ρ1)(η1 − 1)] ∂/∂ρ1.

The point (η1, ρ1) = (β13/β23, (β23 − β13)/(β13 + β23)) gives a singularity of this vector field. The corresponding Baum-Bott number is given by

B = 9β23β13 / [(β23 + β13)(2β23 − β13)(2β13 − β23)].
The ratios of the two eigenvalues are the roots of the quadratic polynomial

(s − λ1/λ2)(s − λ2/λ1) = s² + s(2 − B) + 1,

that has discriminant B² − 4B. If the ratio of the eigenvalues is rational then this discriminant is the square of the rational number λ1/λ2 − λ2/λ1. It is the value of this discriminant that we will actually calculate. Our criterion establishes that, if the vector field has the Painlevé Property and β23 + β13 − 3 ≠ 0, then B² − 4B is either infinity or the square of a rational number. The values of the square root of this expression for all sets of coupling constants are given in Table 3. The case (β23, β13, β12) = (2, 4, 0) is ruled out by the second part of our criterion, for the similarity solutions in the system under consideration are isolated. The first case in this list accounts for yet another graph in Fig. 1. In the other cases, the value of √(B² − 4B) is not a rational number, so the corresponding systems do not have the Painlevé Property. This finishes the proof of Theorem 2 for the three-body case.

Table 3. The values of √(B² − 4B) are not rational

(β23, β13, β12)   √(B² − 4B)      (β23, β13, β12)   √(B² − 4B)
(1, 1, 0)         3/2             (2, 3, 0)∗        (3/10)√−39
(1, 2, 0)         —               (2, 4, 0)         ∞
(1, 3, 0)         (3/20)√321      (3, 3, 0)         (1/2)√−15
(1, 4, 0)∗        (6/35)√79       (3, 4, 0)∗        (6/35)√−129
(2, 2, 0)         (3/4)√−7        (4, 4, 0)∗        (3/8)√−23
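The discriminants behind Table 3 can be checked in the same way. The sketch below is an illustration under stated assumptions: the helper names `baum_bott` and `is_rational_square` are hypothetical, and the formula used is B = 9β23β13/[(β23 + β13)(2β23 − β13)(2β13 − β23)] for β12 = 0. It computes B² − 4B exactly and tests whether it is the square of a rational number.

```python
from fractions import Fraction
from math import isqrt


def baum_bott(b23, b13):
    """Baum-Bott number for b12 = 0; None when the denominator vanishes."""
    den = (b23 + b13) * (2 * b23 - b13) * (2 * b13 - b23)
    return None if den == 0 else Fraction(9 * b23 * b13, den)


def is_rational_square(x):
    """True if the Fraction x is the square of a rational number.

    A Fraction is stored in lowest terms, so it is a rational square exactly
    when it is non-negative and numerator and denominator are perfect squares.
    """
    if x < 0:
        return False
    n, d = x.numerator, x.denominator
    return isqrt(n) ** 2 == n and isqrt(d) ** 2 == d


CASES = [(1, 1), (1, 2), (1, 3), (1, 4), (2, 2),
         (2, 3), (2, 4), (3, 3), (3, 4), (4, 4)]

discriminants = {}
for b23, b13 in CASES:
    B = baum_bott(b23, b13)
    discriminants[(b23, b13)] = None if B is None else B * B - 4 * B

rational_cases = [k for k, d in discriminants.items()
                  if d is not None and is_rational_square(d)]

if __name__ == "__main__":
    for key, d in discriminants.items():
        print(key, d)
    print("rational square root:", rational_cases)
```

Among the cases with finite B, only (1, 1, 0) yields a rational square root, in agreement with the table.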
Fig. 2. Admissible graphs in the four-body problem (labelled graphs: a, b, c, d, e, f)
4.3. Four bodies. We will now prove the second part of Theorem 2. Start by considering the connected graphs with four vertices such that the graph generated by any three vertices is, if connected, one of the three graphs of Fig. 1. This amounts to saying that
– each edge is either single or double,
– no two double edges are adjacent, and
– no double edge is in a cycle of order three.
The set of admissible graphs (up to automorphisms) is portrayed in Fig. 2. The unlabelled graphs have at most one cycle and the corresponding four-body problems have the Painlevé Property. Our criterion, applied in the setting of expression (10), will rule out the labelled ones. Table 4 shows, for each one of these graphs, the coordinates of a point q where X − Y vanishes (the vertices follow the order upper-left, upper-right, lower-left and lower-right). The linear term of this vector field at q has six eigenvalues.

Table 4. Eigenvalues for some similarity solutions in the four-body problem

graph   q = (η1, η2, η3, ρ1, ρ2, ρ3)                                          characteristic polynomial
(a)     (0, 1, 1, 1, −1, −1)                                                  (x + 3)(x² + x + 2)(x² + 3x + 4)
(b)     (1/2)(−1, 1, 2, 6, −2, −6)                                            (x + 5)(x² + 5x + 10)(x² + 3x + 6)
(c)     (1/6)(−1, 4, 3, 10, −10, 6)                                           (x + 5)(x² + x + 3)(x² + 7x + 15)
(d)     (1/3)(κ, 2, 1 − κ, 3 − 4κ, −5, 4κ − 1), 2κ² − 2κ + 1 = 0              (x + 5)(x⁴ + 8x³ + 34x² + 72x + 90)
(e)     (0, κ, 1 − κ, 1, 1 − 4κ, 4κ − 3), 3κ² − 3κ + 1 = 0                    (x + 5)(x² + 4x + 9)(x² + 4x + 21)
(f)     (1, κ, 1 − κ, −1, 1 − 2κ, 2κ − 1), 2κ² − 2κ + 1 = 0                   (x + 3)(x² + x + 6)(x² + 3x + 8)
The third column shows the polynomial satisfied by these eigenvalues, once divided by the factor (ξ − 1). All of these polynomials have non-integral roots, so the corresponding systems cannot, according to our criterion, have the Painlevé Property. This finishes the proof of Theorem 2.
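The claim about non-integral roots can be checked mechanically from the factorizations listed in Table 4. The sketch below is only an illustration: the helper names are hypothetical and the factor lists are transcribed from the table. It expands each product, finds the integer roots by trial over the divisors of the constant term, and confirms that each polynomial of degree five has a single integer root, so the remaining four roots are not integers.

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, highest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_eval(p, x):
    """Evaluate a polynomial at x by Horner's rule."""
    acc = 0
    for c in p:
        acc = acc * x + c
    return acc


def integer_roots(p):
    """Integer roots with multiplicity, found among divisors of the constant term."""
    roots = []
    p = list(p)
    while len(p) > 1:
        const = p[-1]
        if const == 0:
            candidates = [0]
        else:
            candidates = [s * d for d in range(1, abs(const) + 1)
                          if const % d == 0 for s in (1, -1)]
        root = next((c for c in candidates if poly_eval(p, c) == 0), None)
        if root is None:
            break
        roots.append(root)
        # Synthetic division by (x - root); the remainder is zero by construction.
        quotient = [p[0]]
        for c in p[1:-1]:
            quotient.append(c + root * quotient[-1])
        p = quotient
    return roots


# Factors transcribed from the third column of Table 4.
FACTORS = {
    "a": [[1, 3], [1, 1, 2], [1, 3, 4]],
    "b": [[1, 5], [1, 5, 10], [1, 3, 6]],
    "c": [[1, 5], [1, 1, 3], [1, 7, 15]],
    "d": [[1, 5], [1, 8, 34, 72, 90]],
    "e": [[1, 5], [1, 4, 9], [1, 4, 21]],
    "f": [[1, 3], [1, 1, 6], [1, 3, 8]],
}

polys = {}
for name, factors in FACTORS.items():
    prod = [1]
    for f in factors:
        prod = poly_mul(prod, f)
    polys[name] = prod

roots = {name: integer_roots(p) for name, p in polys.items()}

if __name__ == "__main__":
    for name in sorted(polys):
        print(name, polys[name], "integer roots:", roots[name])
```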
Acknowledgement. The author thanks Francesco Calogero for introducing him to these beautiful differential equations at the meeting on Integrable Systems at CIC, Cuernavaca, Mexico, in December 2002.
References
1. Bruschi, M., Ragnisco, O.: On a new integrable Hamiltonian system with nearest-neighbour interaction. Inverse Prob. 5(6), 983–998 (1989)
2. Calogero, F.: Motion of poles and zeros of special solutions of nonlinear and linear partial differential equations and related “solvable” many-body problems. Nuovo Cimento B (11) 43(2), 177–241 (1978)
3. Calogero, F.: The neatest many-body problem amenable to exact treatments (a “goldfish”?). Phys. D 152/153, 78–84 (2001)
4. Calogero, F.: Solvable three-body problem and Painlevé conjectures. Theor. Math. Phys. 133(2), 1445–1454 (2002). Erratum op. cit. 134(1), 139 (2003)
5. Calogero, F.: General solution of a three-body problem in the plane. J. Phys. A: Math. Gen. 36, 7291–7299 (2003)
6. Calogero, F.: Solution of the goldfish N-body problem in the plane with (only) nearest-neighbor coupling constants all equal to minus one half. J. Nonlinear Math. Phys. 11(1), 102–112 (2004)
7. Calogero, F., Françoise, J.P.: Periodic solutions of a many-rotator problem in the plane. Inverse Prob. 17(4), 871–878 (2001) (Special issue to celebrate Pierre Sabatier’s 65th birthday (Montpellier, 2000))
8. Calogero, F., Françoise, J.P., Guillot, A.: A further solvable three-body problem in the plane. J. Math. Phys. 44(11), 5159–5165 (2003)
9. Calogero, F., Françoise, J.P., Sommacal, M.: Periodic solutions of a many-rotator problem in the plane II. Analysis of various motions. J. Nonlinear Math. Phys. 10(2), 157–214 (2003)
10. Palais, R.S.: A global formulation of the Lie theory of transformation groups. Mem. Amer. Math. Soc. 22, iii+123 (1957)
11. Huckleberry, A.: Actions of groups of holomorphic transformations. In: Several complex variables, VI, Volume 69 of Encyclopaedia Math. Sci. Berlin: Springer, 1990, pp. 143–196
12. Yoshida, H.: Necessary condition for the existence of algebraic first integrals I. Kowalevski’s exponents. Celestial Mech. 31(4), 363–379 (1983)

Communicated by L. Takhtajan
Commun. Math. Phys. 256, 195–212 (2005) Digital Object Identifier (DOI) 10.1007/s00220-005-1316-7
Communications in
Mathematical Physics
Zero Debye Length Asymptotic of the Quantum Hydrodynamic Model for Semiconductors
Hai-Liang Li1,2, Chi-Kun Lin3
1 Department of Mathematics, Capital Normal University, Beijing, 100037, P.R. China. E-mail: hailiang [email protected]
2 Institute of Mathematics, University of Vienna, Vienna, 1090, Austria
3 Department of Mathematics, National Cheng Kung University, Tainan, 701, Taiwan, R.O.C. E-mail: [email protected]
Received: 12 January 2004 / Accepted: 13 October 2004
Published online: 8 March 2005 – © Springer-Verlag 2005
Dedicated to Professor Tai-Ping Liu on his sixtieth birthday

Abstract: In the present paper we consider the zero Debye length asymptotic of solutions of the isentropic quantum hydrodynamic equations for semiconductors at nano-size, and show that the current density consists of the divergence-free vector field involved in the incompressible Euler equation and a highly oscillating gradient vector field caused by the highly oscillating electric field for small Debye length. This means that quantum effects may not dominate the charge transport within the channel of nano-size semiconductor devices (for instance MOSFET) for isentropic quantum fluids.

1. Introduction

In the modelling of semiconductor devices at nano-size, for instance HEMTs, MOSFETs and RTDs, where quantum effects take place (like particle tunnelling through potential barriers and built-up in quantum wells [9, 10, 17], which cannot be simulated by classical hydrodynamic models), the quantum hydrodynamical equations are important and dominant in the description of the transport of electrons or holes under the self-consistent electric field. The basic observation concerning quantum hydrodynamics is that the energy density contains an additional quantum correction term of order $O(\hbar)$, introduced first by Wigner [28] in 1932, and that the stress tensor also contains an additional quantum correction part [1, 2] related to the quantum Bohm potential (or internal self-potential) [6]
\[
Q(\rho) = -\frac{\hbar^2}{2m}\,\frac{\Delta\sqrt{\rho}}{\sqrt{\rho}}, \tag{1.1}
\]
with the observable $\rho > 0$ the density, $m$ the mass, and $\hbar$ the Planck constant. The quantum potential $Q$ was introduced by de Broglie and explored by Bohm to make a hidden variable theory, and it is responsible for producing the quantum behavior, so that all quantum features are related to its special properties. This is no wonder, however, since the original idea initiated by Madelung [21] in 1927 to derive quantum fluid-type equations, in terms of Madelung’s transformation applied to the wave function of the Schrödinger equation of pure state, already described such a possible relation. Based on such an idea, one is able to derive quantum fluid type equations from the (nonlinear) Schrödinger equation of pure state [11, 14]. The same idea is also applicable to other derivative type Schrödinger equations [7, 8].

Based on the Wigner-Boltzmann (or quantum Liouville) equation [22], the moment method has recently been used to derive quantum hydrodynamic equations for a semiconductor device at nano-size. Start with the Wigner-Boltzmann equation
\[
W_t + \xi\cdot\nabla_x W + \frac{q}{m}\,P[\Phi]W = [W_t]_c, \tag{1.2}
\]
where $W = W(x,\xi,t)$, $(x,\xi,t) \in \mathbb{R}^N\times\mathbb{R}^N\times\mathbb{R}_+$, $N \ge 1$, and $P[\Phi]$ denotes the pseudo-differential operator defined by
\[
P[\Phi]W = \frac{im}{(2\pi)^N\hbar}\iint \Big[\Phi\Big(x + \frac{\hbar}{2m}\eta\Big) - \Phi\Big(x - \frac{\hbar}{2m}\eta\Big)\Big]\, e^{i\eta\cdot(\xi-\xi')}\, W(x,\xi',t)\,d\eta\,d\xi'.
\]
The electrostatic potential $\Phi = \Phi(x,t)$ is self-consistent through the Poisson equation $\lambda_0\Delta\Phi = q\big(\int W\,d\xi - C\big)$, with $\lambda_0 > 0$ the permittivity characteristic of the device, $q$ the elementary charge, and $C = C(x) > 0$ the given doping profile, and $[W_t]_c$ refers to the collisional term. Applying the moment method to the Wigner-Boltzmann equation (1.2) near the “momentum-shifted quantum Maxwellian” [28], together with an appropriate closure assumption [10, 12], we can obtain the full quantum hydrodynamic equations for the first three moments [10, 14], i.e., density, momentum, and energy, as
\[
\partial_t\rho + \nabla\cdot(\rho u) = 0, \tag{1.3}
\]
\[
\partial_t(\rho u) + \nabla\cdot(\rho u\otimes u) + \nabla\cdot T + \frac{1}{m}\rho\nabla\Phi = -\frac{\rho u}{\tau_0}, \tag{1.4}
\]
m 5κB 2 ∂t E + ∇ · ρu|u| + ρuT + + ρu∇ = CE , 2 2
(1.5)
−λ0 = ρ − C,
(1.6)
where ρ > 0, u, J = ρu, T > 0 denote the density, velocity, momentum, and temperature respectively, τ0 > 0 is the momentum relaxation time, and the heat fluctuation. The stress tensor T = (Tij ) and the energy density E are of the form, up to order O(2 ), given by Tij =
κB 2 ρ ∂ 2 log ρ T ρδij − , m 12m2 ∂xi ∂xj
E=
m 3κB 2 T ρ + ρ|uλ |2 − ρ log ρ, 2 2 24m
with $\delta_{ij}$ the Kronecker symbol. The energy relaxation term $C_E$ is given by
\[
C_E = -\frac{1}{\tau_e}\Big(\frac{m}{2}\rho|u|^2 + \frac{3\kappa_B}{2}\rho(T - T_L)\Big)
\]
with $\tau_e > 0$ the energy relaxation time. For more on the derivation and for references on the modelling of quantum models, we refer to [22, 11, 10] and the references therein.

In the present paper we are interested in the zero Debye length (or long channel effect) asymptotic of the quantum hydrodynamic model for semiconductors in the spatially periodic domain $\mathbb{T}^N$, $N \ge 2$. As a simple start we first consider the isentropic quantum hydrodynamic model for semiconductors, i.e., the particle temperature in (1.3)–(1.6) is taken as a function of the density. The scaled multi-dimensional isentropic quantum hydrodynamic system is of the form
\[
\begin{cases}
\partial_t\rho_\lambda + \nabla\cdot(\rho_\lambda u_\lambda) = 0,\\[2pt]
\partial_t(\rho_\lambda u_\lambda) + \nabla\cdot(\rho_\lambda u_\lambda\otimes u_\lambda) + \nabla P_\lambda + \rho_\lambda\nabla\Phi_\lambda = \dfrac14\varepsilon^2\nabla\cdot\big(\rho_\lambda\nabla^2\log\rho_\lambda\big) - \dfrac1\beta\rho_\lambda u_\lambda,\\[4pt]
-\lambda^2\Delta\Phi_\lambda = \rho_\lambda - C,
\end{cases} \tag{1.7}
\]
where $P_\lambda = P(\rho_\lambda)$ satisfies
\[
P(\rho) = \frac{a_0}{\gamma}\rho^\gamma, \qquad \gamma \ge 1. \tag{1.8}
\]
The coefficient $\varepsilon > 0$ denotes the scaled Planck constant, $\lambda > 0$ the scaled Debye length, and $\beta > 0$ the scaled momentum relaxation time, satisfying
\[
\varepsilon^2 = \frac{\hbar^2}{2m\kappa_B T_0 L^2}, \qquad \lambda^2 = \frac{\lambda_0\kappa_B T_0}{N q^2 L^2}, \qquad \beta^2 = \frac{\kappa_B T_0\tau_0^2}{m L^2},
\]
where we recall that the physical parameters are the elementary charge $q$, the Boltzmann constant $\kappa_B$, the effective electron mass $m$, the reduced Planck constant $\hbar$, the permittivity $\lambda_0$, the ambient temperature $T_0$, and the characteristic device length $L$ and density $N$. For the values of these constants we refer to [22].

In real semiconductor device modelling, as for MOSFET, the scaled Debye length can be sufficiently small in order to simulate the long channel along which there is a significant current of charge (electron or hole) transport from source to drain under voltage bias [27, 22]. As observed in previous works [9, 10, 17], resonant tunnelling through potential barriers, built-up in quantum wells, and inversion layer energy quantization near the oxide are found. Thus, it is quite natural and interesting to find out whether quantum mechanical phenomena happen within these semiconductor channel devices.

In the present paper, we prove the zero Debye length asymptotic from the quantum hydrodynamic equation (2.1) to the incompressible Euler equations for general initial data. Our result shows that, at leading order, the charge current density consists of two components, a divergence-free vector field and a gradient part. The divergence-free vector field is governed by the damped Euler equations; the gradient vector field is a highly oscillating field caused by the highly oscillating electric field. Moreover, for any fixed time, both of them decay exponentially in the $L^2$-norm with respect to small relaxation dissipation $\beta \to 0$. But we should note that
such relaxation friction, appearing both in the damped Euler equations and in the coupled system for the highly oscillating fields, disappears when $\beta = \infty$. Therefore, one may expect that for small momentum relaxation time $\beta > 0$ there is stronger relaxation dissipation on the oscillating fields and on the divergence-free field. Since it is well known that in the small relaxation time asymptotic the quantum hydrodynamic equation converges to the quantum drift-diffusion equation [16], one may guess, as a corollary, that no quantum effects appear along the channel for the quantum drift-diffusion equations.

The local existence of solutions $(\rho_\lambda, u_\lambda, \Phi_\lambda) \in C([0,T]; H^s\times H^{s-1}\times H^s(\mathbb{T}^N))$, $s > 4 + N/2$ (for $a_0 \in \mathbb{R}$), was proven in [19] for any fixed $\lambda > 0$. It is difficult, however, to deal with the zero Debye length limit for Eq. (1.7) mathematically, because of the lack of a priori estimates of solutions uniform in $\lambda > 0$ and the lack of compactness to pass to the limit in the convection term due to the high frequency oscillations in time. Even if we deal with (local or global in time) classical solutions, the time derivatives are not bounded uniformly in $\lambda$; they are in fact of order $O(\lambda^{-1})$. We need to use the “filtering wave” idea [25, 13, 23, 18] to deal with the high frequency part and to obtain the expected convergence result in terms of a modulation energy estimate [3, 18, 23].

The general doping function determines the limit behavior of the particle density. Formally letting $\lambda \to 0$ in the Poisson equation $(1.7)_3$, we obtain $\rho_\lambda(x,t) = C(x)$ at $\lambda = 0$. It also affects the limiting behavior of $\rho_\lambda u_\lambda$ (for instance, replace $\rho_\lambda$ with $C$ in (2.1) and let $\lambda \to 0$ formally). In this paper, we first consider the simple case
\[
C(x) = 1 + \lambda g(x). \tag{1.9}
\]
We prove that even an $O(\lambda)$-order perturbation causes the coupling of the high frequency oscillating parts, namely the coupling of the gradient field of the current density and the electric field. It seems that a general doping function depending on the space position at order $O(1)$ may also cause oscillation in space through changing the semigroup of oscillating fields. It is much more difficult to pass to the limit in this case, which will be considered in an upcoming paper. In the rest of the paper, we list the main results in Sect. 2 and prove them in Sect. 3.

2. Main Results

Define $\nabla p_\lambda =: \lambda\nabla\Phi_\lambda$. The QHD (1.7), whose variables depend on $\lambda$, can be rewritten as
\[
\begin{cases}
\partial_t\rho_\lambda + \nabla\cdot(\rho_\lambda u_\lambda) = 0,\\[2pt]
\partial_t(\rho_\lambda u_\lambda) + \nabla\cdot(\rho_\lambda u_\lambda\otimes u_\lambda) + \nabla P(\rho_\lambda) + \dfrac1\lambda\rho_\lambda\nabla p_\lambda + \dfrac1\beta\rho_\lambda u_\lambda = \dfrac14\varepsilon^2\nabla\cdot\big(\rho_\lambda\nabla^2\log\rho_\lambda\big),\\[4pt]
-\lambda\Delta p_\lambda = \rho_\lambda - 1 - \lambda g.
\end{cases} \tag{2.1}
\]
We only consider the following interesting case for the QHD (2.1) in this paper:
\[
\beta,\ \varepsilon > 0, \qquad \lambda \to 0.
\]
Consider the multi-dimensional incompressible Euler equation
\[
\partial_t v + (v\cdot\nabla)v + \nabla\pi + \frac1\beta v = 0, \quad \nabla\cdot v = 0, \qquad v(x,0) = v_0(x) =: P(J_0)(x), \tag{2.2}
\]
and the two coupled systems for $\nabla q$ and $\nabla p$,
\[
\partial_t\nabla q + \frac12 Q\nabla\cdot(v\otimes\nabla q + \nabla q\otimes v) + \frac12 Q(g(x)\nabla p) + \frac{1}{2\beta}\nabla q = 0, \qquad \nabla q(x,t=0) = \nabla q_0(x) =: Q(J_0)(x), \tag{2.3}
\]
\[
\partial_t\nabla p + \frac12 Q\nabla\cdot(v\otimes\nabla p + \nabla p\otimes v) - \frac12 Q(g(x)\nabla q) + \frac{1}{2\beta}\nabla p = 0, \qquad \nabla p(x,t=0) = \nabla p_0(x), \tag{2.4}
\]
where the operators $P$ and $Q$ are defined for $v \in (L^2(\mathbb{T}^N))^N$ as
\[
Qv = \nabla\Delta^{-1}\nabla\cdot v, \qquad P = I - Q, \qquad \nabla\cdot P = 0. \tag{2.5}
\]
The two coupled systems (2.3)–(2.4) will be used to describe the homogenization behavior of the highly oscillating components of Eq. (2.1).

Remark 2.1. According to the Hodge decomposition, every vector field $v \in C^\infty(\mathbb{T}^N)$ on the torus $\mathbb{T}^N$ has a unique orthogonal decomposition
\[
v = \tilde v + w + \nabla q, \qquad \tilde v = \int_{\mathbb{T}^N} v\,dx, \qquad \nabla\cdot w = 0,
\]
where
\[
\nabla q = \sum_{|k|\ne 0} e^{2\pi i x\cdot k}\,\frac{k\otimes k}{|k|^2}\,\hat v(k),
\]
and $k\otimes k = (k_ik_j)$. This decomposition has the following properties:
(i) $w, \nabla q \in C^\infty(\mathbb{T}^N)$;
(ii) $w \perp \nabla q$ in $L^2(\mathbb{T}^N)$, i.e., $\int_{\mathbb{T}^N} w\cdot\nabla q\,dx = 0$;
(iii) $\|v - \tilde v\|^2_{L^2(\mathbb{T}^N)} = \|w\|^2_{L^2(\mathbb{T}^N)} + \|\nabla q\|^2_{L^2(\mathbb{T}^N)}$.
By (2.5) we know that if we have $v = Qv + Pv =: \nabla q + w$, $q \in H^1(\mathbb{T}^N)$, for $v \in (L^2(\mathbb{T}^N))^N$, $\nabla\cdot w = 0$, then $q$ is determined by the Poisson equation $\Delta q = \nabla\cdot v$. The Poisson equation on the torus $\mathbb{T}^N$ does not necessarily have a solution unless we impose the compatibility condition
\[
\int_{\mathbb{T}^N}\nabla\cdot v\,dx = 0.
\]
It may be necessary to add another term $\nabla q_H$, with $q_H \in H^1(\mathbb{T}^N)$ a harmonic function (i.e., $\Delta q_H = 0$), to the gradient vector field $\nabla q$. In our case the compatibility condition is automatically satisfied, and furthermore this term can be omitted since one obtains, from $\Delta q_H = 0$, $q_H \in H^1(\mathbb{T}^N)$, that $q_H = \text{constant}$, which leads to $\nabla q_H = 0$.
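The action of the projections $Q$ and $P = I - Q$ on a single Fourier mode can be made concrete. The following sketch is an illustration only: the function names and the sample mode are hypothetical, and it works on one wave vector $k \ne 0$ with Fourier coefficient $\hat v(k)$, using the symbol $k\otimes k/|k|^2$ from the decomposition in Remark 2.1. It checks that the divergence-free part is orthogonal to $k$ and that the squared norms add up.

```python
def dot(a, b):
    """Bilinear (unconjugated) dot product of two tuples."""
    return sum(x * y for x, y in zip(a, b))


def hermitian(a, b):
    """Hermitian inner product, conjugate-linear in the second argument."""
    return sum(x * y.conjugate() for x, y in zip(a, b))


def project_gradient(k, vhat):
    """Symbol of Q on one mode: k (k . vhat) / |k|^2, for a real wave vector k != 0."""
    scale = dot(k, vhat) / dot(k, k)
    return tuple(scale * kj for kj in k)


def project_divfree(k, vhat):
    """Symbol of P = I - Q on one mode."""
    grad_part = project_gradient(k, vhat)
    return tuple(a - b for a, b in zip(vhat, grad_part))


# Hypothetical sample mode in two space dimensions.
k = (1, 2)
vhat = (1 + 2j, 3 - 1j)

grad_part = project_gradient(k, vhat)
div_free_part = project_divfree(k, vhat)

divergence_symbol = dot(k, div_free_part)            # should vanish
cross_term = hermitian(div_free_part, grad_part)     # should vanish
norm_defect = (hermitian(vhat, vhat)
               - hermitian(div_free_part, div_free_part)
               - hermitian(grad_part, grad_part))    # should vanish

if __name__ == "__main__":
    print("gradient part:", grad_part)
    print("divergence-free part:", div_free_part)
    print("k . w =", divergence_symbol)
```

Working mode by mode is enough here because both projections are Fourier multipliers, so they act independently on each wave vector.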
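Set $G(x,t) = \nabla q + i\nabla p$ and define the modulation energy $H_\lambda$ by ([18, 23])
\[
\begin{aligned}
H_\lambda(t) ={}& \frac12\int_{\mathbb{T}^N}\rho_\lambda\Big|u_\lambda - v - \mathrm{Re}\big(G(t)e^{it/\lambda}\big)\Big|^2dx + \frac12\int_{\mathbb{T}^N}\big|\varepsilon\nabla\sqrt{\rho_\lambda}\big|^2dx\\
&+ \frac12\int_{\mathbb{T}^N}\Big|\nabla p_\lambda - \mathrm{Im}\big(G(t)e^{it/\lambda}\big)\Big|^2dx + \frac{a_0}{\gamma-1}\int_{\mathbb{T}^N}\big(\rho_\lambda^\gamma + \gamma - 1 - \gamma\rho_\lambda\big)dx,
\end{aligned} \tag{2.6}
\]
where $\mathrm{Re}(\psi)$ ($\mathrm{Im}(\psi)$) denotes the real (imaginary) part of $\psi$. The main result in this paper is

Theorem 2.2. Let $\varepsilon, \beta > 0$ be fixed, $J_0 \in H^s(\mathbb{T}^N)$, $s > N/2 + 4$, $g \in H^s(\mathbb{T}^N)$. Let $v$ be the local strong solution of the damped Euler equation (2.2) in $[0,T]$, and $(\nabla q, \nabla p)$ the strong solutions of (2.3) and (2.4). Assume that $(\rho_\lambda, u_\lambda, \nabla p_\lambda)$ is the solution of Eq. (2.1) such that the initial data $(\rho_0^\lambda, u_0^\lambda)$ satisfy
\[
\rho_0^\lambda = 1 + \lambda(g - \Delta p_0^\lambda) \tag{2.7}
\]
with $(\nabla p_0^\lambda, \rho_0^\lambda u_0^\lambda)$ converging strongly to $(\nabla p_0, J_0)$ in $H^s(\mathbb{T}^N)$, and
\[
\int_{\mathbb{T}^N}\big|\varepsilon\nabla\sqrt{\rho_0^\lambda}\big|^2dx + \frac{2a_0}{\gamma-1}\int_{\mathbb{T}^N}\big(|\rho_0^\lambda|^\gamma + \gamma - 1 - \gamma\rho_0^\lambda\big)dx \to 0 \tag{2.8}
\]
as $\lambda \to 0$. Then the current density sequence $J_\lambda = \rho_\lambda u_\lambda$ converges weakly to the solution $v$ of the incompressible Euler equation (2.2), and the projection on divergence-free vector fields $P(J_\lambda)$ converges to $v$ in $L^\infty([0,T]; L^1(\mathbb{T}^N)^N)$. Moreover, it holds that
\[
H_\lambda(t) \to 0, \qquad \lambda \to 0, \qquad \forall\, t \in [0,T].
\]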
Remark 2.3. (1) Note here even the O(λ) order perturbation of the constant doping gives the coupling in oscillation fields, and only if g(x) ≡ 0, the limiting equations for gradient field of current density (2.3) and electric filed (2.4) will be decoupled. (2) The condition on the first term in (2.8) can be easily satisfied. In fact, for given ∇p0λ ∈ H s (TN ), s > 2 + N/2, choose ρ0λ by (2.7). Then, we can verify that TN
ε∇ ρ λ 0
2 dx ≤ Cλ2 ε 2 ∇(g, ψ λ )2 2 N . 0 L (T )
Analogously, we can check it for the second term in (2.8), which equals zero when a0 = 0.

Remark 2.4. By (3.2) and (3.3) in the next section, one can conclude that for any fixed time the L2-norms of v, ∇p, ∇q decay exponentially with respect to small relaxation dissipation β → 0. Therefore, one may expect that for sufficiently small β > 0 there is stronger relaxation dissipation on the oscillating fields and on the divergence-free field. Since it is well known that in the small relaxation time asymptotic the quantum hydrodynamic equation converges to the quantum drift-diffusion equation [16], one may guess, as a corollary, that no quantum effects appear along the channel for the quantum drift-diffusion equations. But we should note that such relaxation friction, which appears both in the damped Euler equations and in the coupled system for the highly oscillating fields, disappears when β = ∞.
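The exponential decay mentioned in Remark 2.4 can be illustrated numerically. The sketch below assumes the energy identity d/dt ‖v‖² + (2/β)‖v‖² = 0 of (3.1), whose solution is ‖v(t)‖² = ‖v0‖² exp(−2t/β). The function names, the unit initial norm, and the use of Simpson's rule are choices made only for this illustration. It checks the integrated balance (3.2) and shows that a smaller β gives a smaller norm at a fixed time.

```python
import math


def v_norm_sq(t, beta, y0=1.0):
    """Solution of y' + (2/beta) y = 0 with y(0) = y0, where y(t) = ||v(t)||^2."""
    return y0 * math.exp(-2.0 * t / beta)


def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def energy_balance(t, beta, y0=1.0):
    """Left-hand side of (3.2): ||v(t)||^2 + (2/beta) * int_0^t ||v(s)||^2 ds."""
    integral = simpson(lambda s: v_norm_sq(s, beta, y0), 0.0, t)
    return v_norm_sq(t, beta, y0) + (2.0 / beta) * integral


if __name__ == "__main__":
    for beta in (0.1, 1.0, 10.0):
        print(beta, v_norm_sq(1.0, beta), energy_balance(1.0, beta))
```

The balance stays equal to the initial value for every β, while the norm at t = 1 shrinks as β decreases, which is the sense in which small β means strong dissipation.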
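Note that
\[
\begin{aligned}
H_\lambda(0) ={}& \frac12\int_{\mathbb{T}^N}\rho_0^\lambda\big|u_0^\lambda - J_0\big|^2dx + \frac12\int_{\mathbb{T}^N}\big|\nabla p_0^\lambda - \nabla p_0\big|^2dx\\
&+ \frac12\int_{\mathbb{T}^N}\big|\varepsilon\nabla\sqrt{\rho_0^\lambda}\big|^2dx + \frac{a_0}{\gamma-1}\int_{\mathbb{T}^N}\big(|\rho_0^\lambda|^\gamma + \gamma - 1 - \gamma\rho_0^\lambda\big)dx.
\end{aligned}
\]
We can prove the convergence rate of the modulation energy.

Theorem 2.5. Under the assumptions of Theorem 2.2, assume further that
\[
H_\lambda(0) \le C\lambda. \tag{2.9}
\]
Then it holds that
\[
H_\lambda(t) \le C_T\lambda, \qquad \forall\, t \in [0,T], \tag{2.10}
\]
with $C_T > 0$ a constant independent of $\lambda > 0$.

3. Proof of Theorems

First, we consider the initial value problems (2.2) and (2.3)–(2.4). The local existence of the strong solution $v \in L^\infty([0,T]; H^s(\mathbb{T}^N))$, $s \ge 1 + N/2$, of the incompressible Euler equation (2.2) with $\beta = \infty$ is well known [20]. It is quite easy to show that there exists a unique local strong solution $v \in L^\infty([0,T]; H^s(\mathbb{T}^N))$, $s \ge 1 + N/2$, of the damped incompressible Euler equation (2.2). In fact, due to the friction dissipation one can also prove that it is a global solution, at least for bounded initial vorticity, based on the Beale-Kato-Majda criterion [4]. It is dissipative in the $L^2$-norm since a simple energy estimate shows
\[
\frac{d}{dt}\|v(t)\|^2 + \frac2\beta\|v(t)\|^2 = 0, \tag{3.1}
\]
which leads to
\[
\|v(t)\|^2 + \frac2\beta\int_0^t\|v(s)\|^2ds = \|v_0(x)\|^2 = \|P(J_0)(x)\|^2. \tag{3.2}
\]
For a given divergence-free vector field $v \in L^\infty([0,T]; H^s(\mathbb{T}^N))$, $s \ge 1 + N/2$, one can prove the existence of the strong solution $(\nabla q, \nabla p) \in L^\infty([0,T]; H^s(\mathbb{T}^N))$ of the coupled system (2.3)–(2.4). This solution $(\nabla q, \nabla p)$ is also dissipative in the $L^2$-norm,
\[
\|\nabla q(t)\|^2_{L^2} + \|\nabla p(t)\|^2_{L^2} + \frac1\beta\int_0^t\big(\|\nabla q(s)\|^2_{L^2} + \|\nabla p(s)\|^2_{L^2}\big)ds = \|\nabla q_0\|^2_{L^2} + \|\nabla p_0\|^2_{L^2}, \tag{3.3}
\]
due to the damping friction and
\[
\begin{aligned}
&\int_{\mathbb{T}^N}\nabla p\cdot\big[Q\nabla\cdot(v\otimes\nabla p + \nabla p\otimes v)\big]dx = 0, \qquad \int_{\mathbb{T}^N}\nabla q\cdot\big[Q\nabla\cdot(v\otimes\nabla q + \nabla q\otimes v)\big]dx = 0,\\
&\int_{\mathbb{T}^N}\nabla p\cdot Q(g(x)\nabla q)\,dx - \int_{\mathbb{T}^N}\nabla q\cdot Q(g(x)\nabla p)\,dx = 0,
\end{aligned} \tag{3.4}
\]
and uniformly bounded in the $H^s$-norm by Gronwall’s inequality, i.e.,
\[
\|(\nabla q, \nabla p)(t)\|^2_{L^\infty([0,T];H^s(\mathbb{T}^N))} \le C\,\|(\nabla q, \nabla p)(0)\|^2_{H^s(\mathbb{T}^N)}\,e^{CT\|g\|_{H^s}} \tag{3.5}
\]
with $C > 0$ a constant.

Now we turn to the QHD (2.1). The local and global existence of classical solutions of (2.1) was proven recently for any fixed $\lambda > 0$ in [19]. The conservation laws of Eq. (2.1) give us the conservation of mass
\[
\int_{\mathbb{T}^N}\rho_\lambda(x,t)\,dx = \int_{\mathbb{T}^N}\rho_0^\lambda(x)\,dx \tag{3.6}
\]
and the energy balance
\[
\frac12\frac{d}{dt}\int_{\mathbb{T}^N}\Big(\rho_\lambda|u_\lambda|^2 + \frac{2a_0}{\gamma-1}|\rho_\lambda|^\gamma + |\nabla p_\lambda|^2 + \big|\varepsilon\nabla\sqrt{\rho_\lambda}\big|^2\Big)dx + \frac1\beta\int_{\mathbb{T}^N}\rho_\lambda|u_\lambda|^2dx = 0,
\]
which implies the uniform bound of the total energy
\[
\begin{aligned}
E(t) &=: \int_{\mathbb{T}^N}\Big(\frac12\rho_\lambda|u_\lambda|^2 + \frac{a_0}{\gamma-1}|\rho_\lambda|^\gamma + \frac12|\nabla p_\lambda|^2 + \frac12\big|\varepsilon\nabla\sqrt{\rho_\lambda}\big|^2\Big)dx\\
&\le \int_{\mathbb{T}^N}\Big(\frac{1}{2\rho_0^\lambda}|J_0^\lambda|^2 + \frac{a_0}{\gamma-1}|\rho_0^\lambda|^\gamma + \frac12|\nabla p_0^\lambda|^2 + \frac12\big|\varepsilon\nabla\sqrt{\rho_0^\lambda}\big|^2\Big)dx =: E(0).
\end{aligned} \tag{3.7}
\]
Then, from the bound (3.7) of the total energy, one has

Lemma 3.1. It holds for all $\lambda > 0$ that
\[
\int_{\mathbb{T}^N}\frac12\big(\rho_\lambda|u_\lambda|^2 + |\nabla p_\lambda|^2\big)(x,t)\,dx \le E(0), \qquad t \in [0,T]. \tag{3.8}
\]
By Lemma 3.1 one concludes that the current density $J_\lambda = \rho_\lambda u_\lambda$ satisfies
\[
\int_{\mathbb{T}^N}|J_\lambda(x,t)|\,dx \le \|\rho_\lambda\|^{1/2}_{L^1}\,\|\sqrt{\rho_\lambda}\,u_\lambda\|_{L^2} \le C\sqrt{E(0)}, \qquad t \in [0,T]. \tag{3.9}
\]
Thus, we may expect that the current density $J_\lambda$ converges (weakly) to some vector function $J$ as $\lambda \to 0$ (which we will prove below). Moreover, we deduce from (3.9) that $J_\lambda \in L^\infty(0,T; H^{-2}(\mathbb{T}^N))$ since for all test functions $f \in C_0^\infty(\mathbb{T}^N)$,
\[
\Big|\int_{\mathbb{T}^N}J_\lambda(x,t)f(x)\,dx\Big| \le C\Big(\int_{\mathbb{T}^N}\big(1 + \lambda(g(x) - \Delta p_\lambda)\big)f^2\,dx\Big)^{1/2} \le C\|f\|_{H^2(\mathbb{T}^N)}.
\]
We consider the eigenvalue problem for the isometry operator $L$ on $H =: L^2(\mathbb{T}^N)\times\{\nabla\psi;\ \psi \in H^1(\mathbb{T}^N)\}$,
\[
LU = \mu U, \qquad U \in H,
\]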
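such that
\[
L\begin{pmatrix} w\\ 0\end{pmatrix} = 0 \ \text{ if } \nabla\cdot w = 0, \qquad\text{and}\qquad L\begin{pmatrix}\nabla q\\ \nabla p\end{pmatrix} = \begin{pmatrix}-\nabla p\\ \nabla q\end{pmatrix}. \tag{3.10}
\]
It has three eigenvalues $\mu = \pm i, 0$ with the corresponding eigenspaces
\[
E_{\pm i} = \Big\{\begin{pmatrix}\nabla q\\ \mp i\nabla q\end{pmatrix},\ q \in H^1(\mathbb{T}^N)\Big\}, \qquad E_0 = \Big\{\begin{pmatrix} v\\ 0\end{pmatrix},\ \nabla\cdot v = 0,\ v \in L^2(\mathbb{T}^N)\Big\}.
\]
We next introduce the group $L(\tau) = e^{\tau L}$, $\tau \in \mathbb{R}$, where $L$ is the operator defined by (3.10). It is easy to check that $L(\tau)$ is an isometry on the space $H^s(\mathbb{T}^N)\times H^s(\mathbb{T}^N)$. In fact, let
\[
\begin{pmatrix}\nabla\bar q(\tau)\\ \nabla\bar p(\tau)\end{pmatrix} = e^{\tau L}\begin{pmatrix}\nabla q\\ \nabla p\end{pmatrix},
\]
then by (3.10) we have
\[
\frac{\partial}{\partial\tau}\begin{pmatrix}\nabla\bar q(\tau)\\ \nabla\bar p(\tau)\end{pmatrix} = \begin{pmatrix}-\nabla\bar p\\ \nabla\bar q\end{pmatrix} \quad\Longrightarrow\quad \partial_\tau^2\nabla\bar q(\tau) + \nabla\bar q = 0, \qquad \partial_\tau^2\nabla\bar p(\tau) + \nabla\bar p = 0, \tag{3.11}
\]
which gives, after a computation, that
\[
\begin{aligned}
\|\nabla\bar q(\tau)\|^2_{H^s(\mathbb{T}^N)} + \|\partial_\tau\nabla\bar q(\tau)\|^2_{H^s(\mathbb{T}^N)} &= \|\nabla\bar q(0)\|^2_{H^s(\mathbb{T}^N)} + \|\nabla\bar q_\tau(0)\|^2_{H^s(\mathbb{T}^N)},\\
\|\nabla\bar p(\tau)\|^2_{H^s(\mathbb{T}^N)} + \|\partial_\tau\nabla\bar p(\tau)\|^2_{H^s(\mathbb{T}^N)} &= \|\nabla\bar p(0)\|^2_{H^s(\mathbb{T}^N)} + \|\nabla\bar p_\tau(0)\|^2_{H^s(\mathbb{T}^N)},
\end{aligned} \tag{3.12}
\]
whenever $\nabla\bar q(0), \nabla\bar q_\tau(0) \in H^s(\mathbb{T}^N)$ and $\nabla\bar p(0), \nabla\bar p_\tau(0) \in H^s(\mathbb{T}^N)$. And since the space $H = \{\nabla p,\ p \in H^{s+1}(\mathbb{T}^N)\} = Q(H^s(\mathbb{T}^N))^N$, $s \ge 0$, is an invariant space of the operator $Q$, i.e., $Q(\nabla p) = \nabla\Delta^{-1}\nabla\cdot(\nabla p) = \nabla p$ for $\nabla p \in H$, it holds for the pair $(\nabla q(x,t), \nabla p(x,t))^t$ that
\[
L(\tau)\begin{pmatrix}\nabla q\\ \nabla p\end{pmatrix} = \frac12 e^{i\tau}\begin{pmatrix}\nabla q + i\nabla p\\ -i\nabla q + \nabla p\end{pmatrix} + \frac12 e^{-i\tau}\begin{pmatrix}\nabla q - i\nabla p\\ i\nabla q + \nabla p\end{pmatrix}. \tag{3.13}
\]
For $(2.1)_3$ it obviously holds that
\[
\nabla p_\lambda = -\frac1\lambda\nabla\Delta^{-1}(\rho_\lambda - 1 - \lambda g), \qquad \partial_t\nabla p_\lambda = \frac1\lambda\nabla\Delta^{-1}\nabla\cdot(\rho_\lambda u_\lambda).
\]
We have by $\partial_t(2.1)_3$ and $(2.1)_1$ that
\[
\begin{cases}
\partial_t\rho_\lambda + \nabla\cdot(\rho_\lambda u_\lambda) = 0,\\[2pt]
\partial_t(\rho_\lambda u_\lambda) + F^\lambda + \dfrac1\lambda\nabla p_\lambda = -G^\lambda,\\[4pt]
\partial_t\nabla p_\lambda - \dfrac1\lambda Q(\rho_\lambda u_\lambda) = 0,
\end{cases} \tag{3.14}
\]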
where F λ = F λ (ρλ , uλ , ∇pλ ) = ∇ ·(ρλ uλ ⊗ uλ ) + g∇pλ − pλ ∇pλ + ∇P (ρλ ) +
1 ρλ uλ β
1 = ∇ ·(ρλ uλ ⊗ uλ ) + g∇pλ − ∇ ·(∇pλ ⊗ ∇pλ ) + ∇(|∇pλ |2 ) 2 1 +∇P (ρλ ) + ρλ uλ , β
√ 1 √ Gλ = Gλ (ρλ ) = − ε 2 ∇ρλ + ∇ · ε∇ ρλ ⊗ ε∇ ρλ . 4 Let
Q(Jλ ) Uλ = , ∇pλ
(3.16)
λ = L − t U λ , V λ
Jλ P(Jλ ) U = = Uλ + , 0 ∇pλ
λ
(3.15)
V λ = L − λt U λ .
(3.17)
From (3.14), we have
t F λ + Gλ ∂t V = −L − λ , 0 λ
(3.18)
which, together with (3.8), implies that ∂t V λ is bounded uniformly in L1 ([0, T ]; H −m (TN )) with m ≥ N2 + 2. Since by Lemma 3.1 and (3.9), one concludes that V λ is uniformly bounded in L∞ ([0, T ]; L1 (TN )). Therefore, by Lions–Aubin’s Compactness Lemma we have v˜ strongly in L1 ([0, T ]; H −m ) V λ −→ V = +V (3.19) 0 strongly in L1 ([0, T ]; H −m (TN )). Moreover, this implies the λ converges to V and V weak limit of the right hand side term of (3.18). Lemma 3.2. Let (v, ∇p,∇q) ∈ L∞ ([0, T ]; L2 (TN )) be the unique solution of (2.2), λ (2.3)–(2.4). Let Bλ (t) = B10(t) and
) ⊗ (v + L1 t V ) B1λ (t) =∇ · (v + L1 λt V λ
1 |2 , ⊗ L1 t V + ∇ |L1( t )V − ∇ · L1 λt V λ λ 2
(3.20)
is the i th -component of L(τ )V . Then, as λ → 0 it = (∇q, ∇p)t and Li (τ )V where V holds that
L − λt Bλ (t) B(V , V ) weakly in L∞ ([0, T ]; W −1,1 (TN )),
where the bilinear form B is defined for V = (¯v + ∇ q, ˜ ∇ p) ˜ t and V = (v + ∇q, ∇p)t as 1 v ⊗ ∇q + ∇q ⊗ v¯ ) P∇ ·(¯v ⊗ v) 2 Q∇ ·(¯ B(V , V ) = + . (3.21) 1 0 v ⊗ ∇p + ∇p ⊗ v¯ ) 2 Q∇ ·(¯ λ
+ 1 (v + L1( t )V Furthermore, let Aλ (t) = A10(t) with Aλ1 (t) = gL2 λt V β λ ), then it holds
¯ ) weakly in L − λt Aλ (t) A(V where
¯ ) = A¯ 1 (V ) + A¯ 2 (V ) =: A(V
1 βv
+
L∞ ([0, T ]; W −1,2 (TN )),
1 2β ∇q
1 2β ∇p
+
1 2 Q(g∇p) − 21 Q(g∇q)
,
(3.22)
= (∇q, ∇p)t . where V = (v + ∇q, ∇p)t , V Proof. By (3.10), (3.13), one can obtain, after a complicated computation, that L (−τ )Bλ (t) = B(V , V ) +
3
eiκτ Tκ (V , V ) +
κ=1
¯ V ) + L (−τ )Aλ (t) = A(
2 κ=1
) + eiκτ Fκ (g V
3
e−iκτ Tκ∗ (V , V ),
(3.23)
κ=1 2
), e−iκτ Fκ∗ (g V
(3.24)
κ=1
1, 2, is a linear vector function where Tκ , κ = 1, 2, 3, is a bilinear form and F κ , κ =
t 1 ∇(|L t V|2 ) 1 λ 2 below. In fact, by (3.13) one can of V . We only compute the term L − λ 0 directly obtain
2 | = | 1 eiτ (∇q + i∇p) + 1 e−iτ (∇q − i∇p)|2 |L1 λt V 2 2 = 41 e2iτ (∇q + i∇p)2 + 41 e−2iτ (∇q − i∇p)2 + 21 (|∇q|2 + |∇p|2 ), and then
1 1 ∓iτ t 2 ∇(e 2iτ (∇q + i∇p)2 + e−2iτ (∇q − i∇p)2 ) L − λt 2 ∇(|L1 0 λ V | ) = e ±i∇ e2iτ (∇q + i∇p)2 + e−2iτ (∇q − i∇p)2 16 1 ∇(|∇q|2 + |∇p|2 ) + e∓iτ ±i∇(|∇q|2 + |∇p|2 ) 8 1 ±3iτ ∇((∇q ± i∇p)2 ) 1 ±iτ ∇((∇q ± i∇p)2 ) e e = + ∓i∇((∇q ± i∇p)2 ) ±i∇((∇q ± i∇p)2 ) 16 16 1 ∇(|∇q|2 + |∇p|2 ) + e±iτ . ±i∇(|∇q|2 + |∇p|2 ) 8 Notice that L (−τ )Bλ (t) and L (−τ )Aλ (t) are periodic functions in τ , Lemma 3.2 follows immediately from the weak convergence theorem for the (almost-periodic) function (see Lemma 2.3 in [24] for instance).
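Proof of Theorem 2.2. First, we observe from (3.13) that
\[
\mathrm{Re}\big(G(t)e^{it/\lambda}\big) = L_1\big(\tfrac{t}{\lambda}\big)\begin{pmatrix}\nabla q\\ \nabla p\end{pmatrix}, \qquad \mathrm{Im}\big(G(t)e^{it/\lambda}\big) = L_2\big(\tfrac{t}{\lambda}\big)\begin{pmatrix}\nabla q\\ \nabla p\end{pmatrix}.
\]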
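This identification can be checked pointwise. In the sketch below, which is an illustration only, real scalars a and b stand in for the values of ∇q and ∇p at one point, and the function names are hypothetical. It evaluates formula (3.13), compares the two components with the real and imaginary parts of G e^{iτ} for G = a + ib, and confirms that the action is a plane rotation, hence norm preserving.

```python
import cmath
import math


def group_action(tau, a, b):
    """Right-hand side of (3.13), with scalars a, b standing in for grad q, grad p."""
    e_plus = cmath.exp(1j * tau)
    e_minus = cmath.exp(-1j * tau)
    first = 0.5 * e_plus * (a + 1j * b) + 0.5 * e_minus * (a - 1j * b)
    second = 0.5 * e_plus * (-1j * a + b) + 0.5 * e_minus * (1j * a + b)
    return first, second


def rotation(tau, a, b):
    """Plane rotation by the angle tau applied to (a, b)."""
    return (a * math.cos(tau) - b * math.sin(tau),
            a * math.sin(tau) + b * math.cos(tau))


# Hypothetical sample values.
tau, a, b = 0.7, 1.5, -2.0
first, second = group_action(tau, a, b)
z = complex(a, b) * cmath.exp(1j * tau)   # G e^{i tau} with G = a + i b
rot = rotation(tau, a, b)

if __name__ == "__main__":
    print("formula (3.13):", first, second)
    print("Re, Im of G e^{i tau}:", z.real, z.imag)
    print("rotation:", rot)
```

Since the two components are the real and imaginary parts of the same complex number of modulus |G|, the sum of their squares is independent of τ, which is the pointwise version of the isometry property.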
For convenience we deal only with the simplified case β = ∞ below; the argument can be applied to the general case of any finite β > 0. Thus, by (3.3) and (3.12) we have
(t)2 = V (t)2 = V (0)2 . L λt V
(3.25)
= (∇q, ∇p)t . Using Lemma 3.1, (2.6), (3.2), (3.3), (3.6), (3.25), Recall here again V we obtain that 1 √ Hλ (t) = ρλ uλ2 + |∇pλ |2 + | ε∇ ρλ |2 (x, t) dx 2 TN
γ
2 a0 1 dx + ρλ + γ − 1 − γρλ (x, t) dx + (ρλ −1) v + L1 λt V γ −1 TN 2 TN
2
2 1 | + |L2 t V | )(x, t) dx (|v|2 + |L1 λt V + λ N 2 T
dx − dx − Jλ · v + L1 λt V ∇pλ · L2 λt V TN TN 1 √ ≤ ρλ uλ2 + |∇pλ |2 + | ε∇ ρλ |2 (x, 0) dx 2 TN
γ a0 + ρ + γ − 1 − γρλ (x, 0) dx γ − 1 TN λ
2 1 dx + 1 |v|2 + |∇q|2 + |∇p|2 (x, 0) dx + (ρλ −1) v + L1 λt V 2 TN 2 TN
t
dx − dx − Jλ · v + L1 λ V ∇pλ · L2 λt V TN TN
2 1 1 λ 2 dx ≤ Hλ (0) + |ρ0 − 1||J0 | dx + (ρλ − 1) v + L1 λt V 2 TN 2 TN t t
s
dx ds, − ∂s Jλ · v + L1 λ V dx ds − ∂s ∇pλ · L2 λs V 0
TN
0
TN
which gives, in terms of (2.7), that
2 1 dx Hλ (t) ≤ Hλ (0) + Cλ + (ρλ − 1) v + L1 λt V 2 TN t t
dx ds − dx ds − ∂s Jλ · v + L1 λs V Jλ · ∂s v + L1 λs V 0 TN 0 TN t t
dx ds − dx ds. − ∂s ∇pλ · L2 λs V ∇pλ · ∂s L2 λs V 0
TN
0
TN
(3.26)
Using the fact t
t
dx ds + dx ds ∇pλ · v + L1 λs V Jλ · L2 λs V 0 TN 0 TN t t
dx ds − dx ds = 0, − Q(Jλ ) · L2 λs V ∇pλ · L1 λs V 0
TN
0
TN
(3.27) and Eqs. (3.14)2,3 and (3.11) we again have
2 1 dx (ρλ − 1) v + L1 λt V Hλ (t) ≤ Hλ (0) + Cλ + 2 TN t t
s
+ ∇P (ρλ ) v + L1 λ V dx ds − Jλ ∂s v dx ds 0 TN 0 TN t
dx ds + (F λ + Gλ ) v + L1 λs V N 0 T t
+ ∇pλ L2 s ∂s V dx ds. − Jλ L1 λs ∂s V (3.28) λ 0
TN
After the recombination of right-hand side terms of (3.28) and the use of (2.3)–(2.4), we deduce from (3.28) that t
Hλ (t) ≤ Hλ (0) + Cλ + Rλ1 (t) + Rλ2 (t) − U λ L λs ∂s V dx ds 0 TN t
√
s √
+ B 1 ρλ uλ − L1 λ V , ρλ uλ − L1 λs V L1 λs V dx ds 0 TN t
s , ∇pλ − L2 s V L1 + B 2 ∇pλ − L2 λs V λ λ V dx ds 0 TN t
U λ , L s V L s V dx ds B +2 λ λ 0 TN t
L s V , L s V L s V dx ds B − λ λ λ 0 TN t
+ (3.29) B 3 gL2 λs V λ L λs V dx ds, 0
TN
where B 3 (u) =
u , 0
V ) = B 1 (U, V ) + B 2 (U, V ) B(U, 0
with 1 B 1 (U, V ) = ∇ · U 1 ⊗ V 1 + V 1 ⊗ U 1 , 2 1 1 B 2 (U, V ) = − ∇ · U 2 ⊗ V 2 + V 2 ⊗ U 2 + ∇ U 2 · V 2 . 2 2
Here U i (resp. V i ), i = 1, 2, denotes the i-component of U (resp. V ). Moreover, we have
2 1 dx (ρλ − 1) v + L1 λt V 2 TN t
− ∇ · (ρλ − 1)L1 λs V ⊗ L1 λs V L1 λs V dx ds, t 0 TN
dx ds Rλ2 (t)| = ∇P (ρλ ) v + L1 λs V N 0 T
1 2 t − λε ∇(g − pλ )L1 λs V dx N 4 0 T t
√
√
dx, + ε∇ ρλ ⊗ ε∇ ρλ : ∇ v + L1 s V Rλ1 (t) =
0
TN
λ
which, together with (2.7) and integration by parts, satisfy
|Rλ1 (t)|
t 2 1 ≤ λ (g + ∇ ·∇p) v + L1 λ V dx 2 TN t
s
s s +λ ∇ · (g + ∇ ·∇pλ )L1 λ V ⊗ L1 λ V L1 λ V dx ds N 0 T
λ→0 ≤ λC(1 + V 2H s ) gL2 + ∇pλ L2 −→ 0, (3.30)
and
|Rλ2 (t)|
t
s γ ≤ ∇(ρλ + γ − 1 − γρλ ) v + L1 λ V dx ds 0 TN t
s +λγ (g − pλ )∇ ·L1 λ V dx 0 TN
s 1 2 t + λε ∇(g − pλ )L1 λ V dx N 4 t 0 T
√
s √
+ ε∇ ρλ ⊗ ε∇ ρλ : ∇ v + L1 λ V dx 0 TN t √ ≤ C∇V L∞ ε∇ ρλ (s)2L2 ds + CT λ(1 + ε 2 ) (∇g, ∇pλ )2L2 + V 2H 3 0 t t a0 γ + ∇ V L∞ (ρλ + γ − 1 − γρλ ) ≤ Cλ + C Hλ (s)ds. (3.31) γ 0 TN 0
= (∇q, ∇p)t , and V λ is defined by (3.17). Recall here again V = (v + ∇q, ∇p)t , V
We reduce (3.29) to t t
Hλ (s)ds − U λ L λs ∂s V dx ds Hλ (t) ≤ Hλ (0) + Cλ + C 0 0 TN t
s λ s + B 3 gL2 λ V L λ V dx ds 0 TN t
L s V , L s V L s V dx ds B − λ λ λ 0 TN t
U λ , L s V L s V dx ds. B +2 λ λ 0
TN
(3.32)
Noting (3.19) and applying the similar argument, we can prove
t λ t λ→0 1
L V , L V −→ B V , V + 1 B V , V L − λt B λ λ 2 2
(3.33)
λ→0 L − λt B 3 gL2 λt V λ −→ A¯ 2 ( V )
(3.34)
and
in the sense of distribution, where we recall that V = (v + ∇q, ∇p)t and V = (˜v + ∇ q, ˜ ∇ p) ˜ t . In fact, using the technique of Friedrich’s mollifier we get δ→0
V λ − V L1 (0,T ;H −m ) ≤ CV λ − Vδλ L1 (0,T ;H −m )
−→ 0
+ Vδλ − V δ L1 (0,T ;H −m )
−→ 0
+ V δ − V L1 (0,T ;H −m )
δ→0
λ→0
(3.35)
−→ 0,
where $f_\delta = M_\delta * f$ with $M_\delta$ the Friedrichs mollifier. Repeating an argument similar to that in the proof of Lemma 3.2, we can prove (3.33) and (3.34) when $V^\lambda$ is replaced by $V_\delta$, which together with (3.35) implies (3.33) and (3.34). Let
\[ \eta(t) = \limsup_{\lambda\to 0} H_\lambda(t). \]
Due to (3.33), (3.34), and the equality t t t A¯ 2 ( V )V dx + V ∂s V dx ds + − 0
TN
0
TN
0
TN
B V , V V dx ds = 0, (3.36)
and the fact $\int_{\mathbb{T}^N} B(V,V)\,V\,dx = 0$, we finally get from (3.32) that
\[ \eta(t) \le \eta(0) + C\int_0^t \eta(s)\,ds, \]
which together with the fact $\eta(0) = 0$ implies $\eta(t) \equiv 0$. This completes the proof of Theorem 2.2.
Proof of Theorem 2.5. We only need to estimate the decay rate on the right-hand side terms of (3.32). In terms of (3.23) we obtain, after a tedious computation, that
t λ t
L V , L V + L − t B 3 gL2 t V λ 2L − λt B λ λ λ λ ¯ λ )+ = B(V λ , V ) + B(V , V λ ) + A(V +
2
3
κ=1 2
t
eiκ λ (Fκ (gV λ )+ βt Gκ (V λ ))+
κ=1
t
eiκ λ Tκ (V λ , V )+
3
t
e−iκ λ Tκ∗ (V λ , V )
κ=1 t
e−iκ λ (Fκ∗ (gV λ )+ βt G∗κ (V λ )),
(3.37)
κ=1
3 3 t t
t
iκ λ L V , L t V = B(V , V ) + L − λt B e T (V , V ) + e−iκ λ Tκ∗ (V , V ), κ λ λ κ=1
κ=1
(3.38) and it holds t t V λ ∂s V dx ds + − TN
0
0
TN
¯ λ )V dx + A(V
t 0
TN
B(V , V λ )V dx ds = 0,
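\[ \int_{\mathbb{T}^N} B(V^\lambda, V)\,V\,dx = 0. \]
Note that, integrating by parts in time,
\[
\begin{aligned}
\int_0^t\!\!\int_{\mathbb{T}^N} e^{i\kappa s/\lambda}\, T_\kappa(V^\lambda, V)\,V\,dx\,ds
={}& i\,\frac{\lambda}{\kappa}\int_0^t\!\!\int_{\mathbb{T}^N} e^{i\kappa s/\lambda}\,\partial_s\big(T_\kappa(V^\lambda, V)\,V\big)\,dx\,ds \\
& - i\,\frac{\lambda}{\kappa}\int_{\mathbb{T}^N} e^{i\kappa t/\lambda}\, T_\kappa(V^\lambda, V)\,V(x,t)\,dx
 + i\,\frac{\lambda}{\kappa}\int_{\mathbb{T}^N} T_\kappa(V^\lambda, V)\,V(x,0)\,dx.
\end{aligned}
\]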
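The gain of a factor λ from one integration by parts in time can be checked numerically on a scalar model. The following is only an illustrative sketch, not part of the proof: the profile `g` is a hypothetical stand-in for the map $s \mapsto \int_{\mathbb{T}^N} T_\kappa(V^\lambda,V)V\,dx$, and all function names are ours. It compares a trapezoidal approximation of $\int_0^t e^{i\kappa s/\lambda} g(s)\,ds$ with the bound $(\lambda/\kappa)\big(|g(t)| + |g(0)| + \int_0^t |g'(s)|\,ds\big)$.

```python
import cmath
import math

def oscillatory_integral(g, kappa, lam, t, steps=20000):
    """Trapezoidal approximation of int_0^t exp(i*kappa*s/lam) * g(s) ds."""
    h = t / steps
    total = 0.5 * (g(0.0) + cmath.exp(1j * kappa * t / lam) * g(t))
    for n in range(1, steps):
        s = n * h
        total += cmath.exp(1j * kappa * s / lam) * g(s)
    return total * h

def integration_by_parts_bound(g, dg, kappa, lam, t, steps=20000):
    """(lam/kappa) * (|g(t)| + |g(0)| + int_0^t |g'(s)| ds), midpoint rule for the last term."""
    h = t / steps
    l1_norm_dg = sum(abs(dg((n + 0.5) * h)) for n in range(steps)) * h
    return (lam / kappa) * (abs(g(t)) + abs(g(0.0)) + l1_norm_dg)

# Hypothetical smooth profile and its derivative (assumptions made for illustration).
def g(s):
    return math.cos(s) + s * s

def dg(s):
    return -math.sin(s) + 2.0 * s

results = []
for lam in (0.1, 0.05, 0.01):
    value = abs(oscillatory_integral(g, kappa=1, lam=lam, t=1.0))
    bound = integration_by_parts_bound(g, dg, kappa=1, lam=lam, t=1.0)
    results.append((lam, value, bound))
    print(f"lambda={lam}: |integral|={value:.5f}, bound={bound:.5f}")
```

In this model the oscillatory integral is of order λ as soon as the profile and its time derivative are bounded uniformly in λ, which is the only feature of the integrand used in the argument.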
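Furthermore, since
\[ \partial_s\big(T_\kappa(V^\lambda, V)V\big) = T_\kappa(\partial_s V^\lambda, V)V + T_\kappa(V^\lambda, \partial_s V)V + T_\kappa(V^\lambda, V)\,\partial_s V \]
and
\[ V \in L^\infty([0,T]; H^m(\mathbb{T}^N)),\quad \partial_t V \in L^\infty([0,T]; H^{m-1}(\mathbb{T}^N)),\quad V^\lambda \in L^\infty([0,T]; L^1\times L^2),\quad \partial_t V^\lambda \in L^\infty([0,T]; W^{-m,1}),\quad m > N/2+1, \]
we conclude that the integral
\[ \int_{\mathbb{T}^N} T_\kappa(V^\lambda, V)\,V\,dx, \qquad \kappa = 1, 2, 3, \]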
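is well defined and uniformly bounded in $L^\infty(0,T)$ independently of λ. Similarly, due to
\[
\begin{aligned}
\int_0^t\!\!\int_{\mathbb{T}^N} e^{i\kappa s/\lambda}\, F_\kappa(gV^\lambda)\,V\,dx\,ds
={}& i\,\frac{\lambda}{\kappa}\int_0^t\!\!\int_{\mathbb{T}^N} e^{i\kappa s/\lambda}\,\partial_s\big(F_\kappa(gV^\lambda)\,V\big)\,dx\,ds \\
& - i\,\frac{\lambda}{\kappa}\int_{\mathbb{T}^N} e^{i\kappa t/\lambda}\, F_\kappa(gV^\lambda)\,V(x,t)\,dx
 + i\,\frac{\lambda}{\kappa}\int_{\mathbb{T}^N} F_\kappa(gV^\lambda)\,V(x,0)\,dx
\end{aligned}
\]
and
\[ \partial_s\big(F_\kappa(gV^\lambda)V\big) = F_\kappa(g\,\partial_s V^\lambda)V + F_\kappa(gV^\lambda)\,\partial_s V, \]
we know that the integral
\[ \int_{\mathbb{T}^N} F_\kappa(gV^\lambda)\,V\,dx, \qquad \kappa = 1, 2, \]
is also well defined and uniformly bounded in $L^\infty(0,T)$ independently of λ. The same arguments are also applicable to the integrals related to $T_\kappa^*$, $F_\kappa^*$, $G_\kappa$, and $G_\kappa^*$ in (3.37) and (3.38). Thus, we obtain from (3.32) that
\[ H_\lambda(t) \le H_\lambda(0) + C\lambda + C\int_0^t H_\lambda(s)\,ds, \]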
which implies (2.10). This completes the proof of Theorem 2.5.
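The final step is an application of Gronwall's inequality: an estimate of the form $H(t) \le a + C\int_0^t H(s)\,ds$ gives $H(t) \le a\,e^{Ct}$, here with $a = H_\lambda(0) + C\lambda$. The sketch below is a hedged numerical illustration only. It discretizes the extremal case in which the inequality is an equality and checks that it stays below the closed-form bound; the constants are arbitrary values chosen for the example, not those of the paper, and the function names are ours.

```python
import math

def gronwall_bound(a, C, t):
    """Closed-form Gronwall bound a * exp(C * t)."""
    return a * math.exp(C * t)

def saturated_solution(a, C, t, steps=1000):
    """Forward-Euler solution of h(t) = a + C * int_0^t h(s) ds (the equality case)."""
    h = a
    dt = t / steps
    integral = 0.0
    for _ in range(steps):
        integral += h * dt        # left-endpoint rule for int_0^t h(s) ds
        h = a + C * integral      # enforce the integral relation at the next node
    return h

C, T = 2.0, 1.0                    # illustrative constants
for lam in (1e-1, 1e-2, 1e-3):
    a = 0.0 + C * lam              # assumes H_lambda(0) = 0 purely for illustration
    print(lam, saturated_solution(a, C, T), gronwall_bound(a, C, T))
```

In this illustration the bound scales linearly in λ when the initial value vanishes, and it is identically zero when $a = 0$, which is the situation used at the end of the proof of Theorem 2.2.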
Acknowledgements. We would like to take this opportunity to express our gratitude to the editor and referees for a number of valuable criticisms and suggestions which have helped to improve the manuscript. The authors thank Professors N. Masmoudi and A. Matsumura for helpful discussions. Part of this research was conducted while H.L. was staying in Vienna; he thanks Professor Peter Markowich for his hospitality and continuous encouragement. H.L. acknowledges partial support from CTS of Taiwan and from the Wittgenstein Award of Peter A. Markowich, funded by the Austrian FWF, the Grants-in-Aid of Japan Society for the Promotion of Science for JSPS fellows No.14-02036, and the NSFC No.10431060. C.L. acknowledges support from CTS of Taiwan. This work is also partially supported by the National Science Council of Taiwan under the grant NSC92-2115-M-006-004.
Communicated by H.-T. Yau
Commun. Math. Phys. 256, 213–238 (2005) Digital Object Identifier (DOI) 10.1007/s00220-005-1318-5
Communications in
Mathematical Physics
θ-Deformations as Compact Quantum Metric Spaces
Hanfeng Li
Department of Mathematics, University of Toronto, Toronto ON M5S 3G3, Canada. E-mail: [email protected]
Received: 16 January 2004 / Accepted: 13 November 2004
Published online: 15 March 2005 – © Springer-Verlag 2005
Abstract: Let M be a compact spin manifold with a smooth action of the n-torus. Connes and Landi constructed θ-deformations $M_\theta$ of M, parameterized by n × n real skew-symmetric matrices θ. The $M_\theta$'s together with the canonical Dirac operator $(D, \mathcal{H})$ on M are an isospectral deformation of M. The Dirac operator D defines a Lipschitz seminorm on $C(M_\theta)$, which defines a metric on the state space of $C(M_\theta)$. We show that when M is connected, this metric induces the weak-∗ topology. This means that $M_\theta$ is a compact quantum metric space in the sense of Rieffel.

1. Introduction

In noncommutative geometry there are many examples of noncommutative spaces deformed from commutative spaces. However, for many of them the Hochschild dimension, which corresponds to the commutative notion of dimension, is different from that of the original commutative space. For instance, the $C^*$-algebras of the standard Podleś quantum 2-spheres and of the quantum 4-spheres of [1] are isomorphic to each other, and their Hochschild dimension is zero [17]. In [8] Connes and Landi introduced a one-parameter deformation $S^4_\theta$ of the 4-sphere with the property that the Hochschild dimension of $S^4_\theta$ equals that of $S^4$. They also considered general θ-deformations, which were studied further by Connes and Dubois-Violette in [7] (see also [28]). In general, the θ-deformation $M_\theta$ of a manifold M equipped with a smooth action of the n-torus $\mathbb{T}^n$ is determined by defining the algebra of smooth functions $C^\infty(M_\theta)$ as the invariant subalgebra (under the diagonal action of $\mathbb{T}^n$) of the algebra $C^\infty(M \times \mathbb{T}_\theta) := C^\infty(M)\,\hat\otimes\, C^\infty(\mathbb{T}_\theta)$ of smooth functions on $M \times \mathbb{T}_\theta$; here θ is a real skew-symmetric n × n matrix and $\mathbb{T}_\theta$ is the corresponding noncommutative n-torus. This construction is a special case of the strict deformation quantization constructed in [21].
When M is a compact spin manifold, Connes and Landi showed that the canonical Dirac operator $(D, \mathcal{H})$ on M and a deformed anti-unitary operator $J_\theta$ together give a spectral triple for $C^\infty(M_\theta)$, fitting it into Connes’ noncommutative Riemannian geometry
framework [5, 6]. In [7] Connes and Dubois-Violette also showed how θ-deformations lead to compact quantum groups which are deformations of various classical groups (see also [30, Sect. 4]).

In this paper we investigate the metric aspect of θ-deformations. The study of metric spaces in a noncommutative setting was initiated by Connes in [4] in the framework of his spectral triples. The main ingredient of a spectral triple is a Dirac operator D. On the one hand, it captures the differential structure by setting $df = [D, f]$. On the other hand, it enables us to recover the Lipschitz seminorm L, which is usually defined as
\[ L(f) := \sup\left\{ \frac{|f(x) - f(y)|}{\rho(x, y)} : x \neq y \right\}, \tag{1} \]
where ρ is the geodesic metric on the Riemannian manifold, instead by means of $L(f) = \|[D, f]\|$, and then one recovers the metric ρ by
\[ \rho(x, y) = \sup_{L(f) \le 1} |f(x) - f(y)|. \tag{2} \]
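Formulas (1) and (2) can be tested on a finite metric space, where both suprema are maxima. The sketch below is only an illustration under our own assumptions: the four points in the plane with the Euclidean distance and the function names are hypothetical choices, not taken from the paper. It uses the test functions $f_z = \rho(z, \cdot)$, which have Lipschitz seminorm at most 1 by the triangle inequality, and $f_x$ attains the supremum in (2) for the pair (x, y).

```python
import math
from itertools import combinations

def lipschitz_seminorm(f, points, rho):
    """L(f) = max |f(x) - f(y)| / rho(x, y) over distinct points, as in (1)."""
    return max(abs(f[x] - f[y]) / rho(x, y) for x, y in combinations(points, 2))

def recovered_distance(x, y, points, rho):
    """Evaluate the supremum in (2) over the test functions f_z = rho(z, .)."""
    best = 0.0
    for z in points:
        f = {w: rho(z, w) for w in points}
        L = lipschitz_seminorm(f, points, rho)
        if L > 0:
            # Rescale so that the test function satisfies L(f) <= 1.
            best = max(best, abs(f[x] - f[y]) / L)
    return best

# Hypothetical finite metric space: four points in the plane, Euclidean distance.
points = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0), (1.0, 1.0)]
rho = math.dist

for x, y in combinations(points, 2):
    print(x, y, rho(x, y), recovered_distance(x, y, points, rho))
```

The inequality $|f(x) - f(y)| \le L(f)\,\rho(x, y)$ shows that the supremum in (2) never exceeds ρ(x, y), so the two printed columns agree up to rounding.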
In [4, Sect. 2] Connes went further by considering the (possibly +∞-valued) metric on the state space of the algebra defined by (2). Motivated by what happens to ordinary compact metric spaces, in [22–24] Rieffel introduced “compact quantum metric spaces” (see Definition 2.9 below), which require the metric on the state space to induce the weak-∗ topology. Many examples of compact quantum metric spaces have been constructed, mostly from ergodic actions of compact groups [22] or group algebras [26, 18]. Usually it is quite difficult to find out whether a specific seminorm L on a unital $C^*$-algebra gives a quantum metric, i.e., whether the metric defined by (2) on the state space induces the weak-∗ topology.

Denote by $L_\theta$ the seminorm on $C(M_\theta)$ determined by the Dirac operator D (see Definition 3.11 below for details). Notice that when M is connected the geodesic distance makes M into a metric space. Then our main theorem in this paper is:

Theorem 1.1. Let M be a connected compact spin manifold with a smooth action of $\mathbb{T}^n$. For every real skew-symmetric n × n matrix θ the pair $(C(M_\theta), L_\theta)$ is a $C^*$-algebraic compact quantum metric space.

Motivated by questions in string theory, Rieffel also introduced a notion of quantum Gromov-Hausdorff distance for compact quantum metric spaces [24, 25]. It has many nice properties. Using the quantum Gromov-Hausdorff distance one can discuss the continuity of θ-deformations (with respect to the parameter θ) in a concrete way. This will be done in [16].

This paper is organized as follows. We shall use heavily the theory of locally convex topological vector spaces (LCTVS). In Sect. 2 we review some facts about LCTVS, Clifford algebras, and Rieffel’s theory of compact quantum metric spaces. Connes and Dubois-Violette’s formulation of θ-deformations is reviewed in Sect. 3. In Sect.
4 we prove a general theorem showing that in the presence of a compact group action, sometimes we can reduce the study of a given seminorm to its behavior on the isotypic components of this group action. Section 5 contains the main part of our proof of Theorem 1.1, where we study various differential operators to derive certain formulas. Finally, Theorem 1.1 is proved in Sect. 6.

Throughout this paper G will be a nontrivial compact group with identity $e_G$, endowed with the normalized Haar measure. Denote by $\hat G$ the dual of G, and by $\gamma_0$ the trivial representation. For any $\gamma \in \hat G$ let $\chi_\gamma$ be the corresponding character on G, and let $\bar\gamma$ be the contragredient representation. For any $\gamma \in \hat G$ and any representation of G on some complex vector space V, we denote by $V_\gamma$ the γ-isotypic component of V. If J is a finite subset of $\hat G$, we also let $V_J = \sum_{\gamma \in J} V_\gamma$, and let $\bar J = \{\bar\gamma : \gamma \in J\}$.

2. Preliminaries

In this section we review some facts about locally convex topological vector spaces (LCTVS), Clifford algebras, and Rieffel’s theory of compact quantum metric spaces.

2.1. Locally convex topological vector spaces. We recall first some facts about LCTVS. The reader is referred to [29, Chaps. 5 and 43] for detailed information about completion and tensor products of LCTVS. Throughout this paper, our LCTVS will all be Hausdorff.

For any LCTVS V and W, one can define the projective tensor product of V and W, denoted by $V \otimes_\pi W$, as the vector space $V \otimes W$ equipped with the so-called projective topology. $V \otimes_\pi W$ is also a LCTVS, and one can form the completion $V \hat\otimes_\pi W$. For continuous linear maps $\psi_j : V_j \to W_j$ (j = 1, 2) between LCTVS, the tensor product linear map $\psi_1 \otimes_\pi \psi_2 : V_1 \otimes_\pi V_2 \to W_1 \otimes_\pi W_2$ is also continuous and extends to a continuous linear map $\psi_1 \hat\otimes_\pi \psi_2 : V_1 \hat\otimes_\pi V_2 \to W_1 \hat\otimes_\pi W_2$.

Let V be a LCTVS, and let α be an action of a topological group G on V by automorphisms. We say that the action α is continuous if the map $G \times V \to V$ given by $(x, v) \mapsto \alpha_x(v)$ is (jointly) continuous. Let V (resp. W) be a LCTVS and α (resp. β) be a continuous action of G on V (resp. W). Then the tensor product action $\alpha \hat\otimes_\pi \beta$ of G on $V \hat\otimes_\pi W$ is easily seen to be continuous.

A locally convex algebra (LCA) [3, p. 341] is a LCTVS V with an algebra structure such that the multiplication $V \times V \to V$ is (jointly) continuous. If furthermore V is a ∗-algebra and the ∗-operation $* : V \to V$ is continuous, let us say that V is a locally convex ∗-algebra (LC∗A).
A locally convex left V-module is a left V-module W such that the action $V \times W \to W$ is (jointly) continuous. For a smooth manifold M, the space of (possibly unbounded) smooth functions $C^\infty(M)$ equipped with the usual Fréchet space topology is a LC∗A. For a smooth vector bundle E over M, the space of smooth sections $C^\infty(M, E)$ is a locally convex $C^\infty(M)$-bimodule. If furthermore E is an algebra bundle with fibre algebras being finite-dimensional, then $C^\infty(M, E)$ is also a LCA. We shall need Proposition 2.3 below.

Lemma 2.1. Let V and W be two LCTVS. Denote by $\hat V$ and $\hat W$ the completions of V and W respectively. Then
\[ \hat V \hat\otimes_\pi \hat W = V \hat\otimes_\pi W. \]

Proof. The natural linear maps $\iota_V : V \to \hat V$ and $\iota_W : W \to \hat W$ are continuous, so we have the continuous linear map $\iota_V \hat\otimes_\pi \iota_W : V \hat\otimes_\pi W \to \hat V \hat\otimes_\pi \hat W$, which is the unique continuous extension of $\iota_V \otimes \iota_W : V \otimes W \to \hat V \otimes \hat W$.

Let $v_0 \in \hat V$ (resp. $w_0 \in \hat W$) and let $\{v_j\}_{j \in I}$ (resp. $\{w_j\}_{j \in I}$) be a net in V (resp. W) converging to $v_0$ (resp. $w_0$). Let p (resp. q) be a continuous seminorm on V (resp. W). Consider the continuous tensor product seminorm $p \hat\otimes_\pi q$ on $V \hat\otimes_\pi W$ defined by
\[ (p \hat\otimes_\pi q)(\eta) = \inf \sum_k p(v_k)\, q(w_k) \]
for all $\eta \in V \otimes_\pi W$, where the infimum is taken over all finite sets of pairs $(v_k, w_k)$ such that
\[ \eta = \sum_k v_k \otimes w_k. \]
It satisfies
\[ (p \hat\otimes_\pi q)(v \otimes w) = p(v)\, q(w) \]
for all $v \in V$ and $w \in W$ [29, Prop. 43.1]. In particular, we have
\[
\begin{aligned}
(p \hat\otimes_\pi q)(v_j \otimes w_j - v_{j'} \otimes w_{j'})
&= (p \hat\otimes_\pi q)\big((v_j - v_{j'}) \otimes w_j + v_{j'} \otimes (w_j - w_{j'})\big) \\
&\le p(v_j - v_{j'})\, q(w_j) + p(v_{j'})\, q(w_j - w_{j'}) \to 0
\end{aligned}
\]
as $j, j' \to \infty$. Since such $p \hat\otimes_\pi q$ form a basis of continuous seminorms on $V \hat\otimes_\pi W$ [29, p. 438], the net $\{v_j \otimes w_j\}_{j \in I}$ is a Cauchy net in $V \hat\otimes_\pi W$. Then it converges to some element in $V \hat\otimes_\pi W$. Let $\varphi(v_0, w_0) = \lim_{j \to \infty} (v_j \otimes w_j)$. Clearly $\varphi(v_0, w_0)$ doesn’t depend on the choice of the nets $\{v_j\}_{j \in I}$ and $\{w_j\}_{j \in I}$. So the map $\varphi : \hat V \times \hat W \to V \hat\otimes_\pi W$ is well defined. It is easy to see that φ is bilinear and is an extension of the natural map $V \times W \to V \hat\otimes_\pi W$. Denote the extension of p (resp. q) on $\hat V$ (resp. $\hat W$) still by p (resp. q). Notice that
\[
(p \hat\otimes_\pi q)(\varphi(v_0, w_0)) = (p \hat\otimes_\pi q)\Big(\lim_{j \to \infty} (v_j \otimes w_j)\Big) = \lim_{j \to \infty} (p \hat\otimes_\pi q)(v_j \otimes w_j) = \lim_{j \to \infty} p(v_j)\, q(w_j) = p(v_0)\, q(w_0).
\]
So φ is continuous, and hence the associated linear map $\hat V \otimes_\pi \hat W \to V \hat\otimes_\pi W$ is continuous [29, Prop. 43.4]. Consequently, we have the continuous extension $\psi : \hat V \hat\otimes_\pi \hat W \to V \hat\otimes_\pi W$ [29, Theorem 5.2].

Notice that $V \otimes W$ is dense in both $\hat V \hat\otimes_\pi \hat W$ and $V \hat\otimes_\pi W$. Clearly ψ and $\iota_V \hat\otimes_\pi \iota_W$ are inverse to each other when restricted to $V \otimes W$. It follows immediately that ψ and $\iota_V \hat\otimes_\pi \iota_W$ are isomorphisms inverse to each other between $\hat V \hat\otimes_\pi \hat W$ and $V \hat\otimes_\pi W$.

Lemma 2.2. Let $V_j, W_j, H_j$ (j = 1, 2) be LCTVS, and let $\psi_j : V_j \times W_j \to H_j$ be continuous bilinear maps; then the bilinear map $\psi_1 \otimes \psi_2 : (V_1 \otimes V_2) \times (W_1 \otimes W_2) \to H_1 \otimes H_2$ extends to a continuous bilinear map
\[ \psi_1 \hat\otimes_\pi \psi_2 : (V_1 \hat\otimes_\pi V_2) \times (W_1 \hat\otimes_\pi W_2) \to H_1 \hat\otimes_\pi H_2. \]

Proof. We have the associated continuous linear map $\varphi_j : V_j \otimes_\pi W_j \to H_j$, j = 1, 2 [29, Prop. 43.4] and hence the continuous linear map
\[ \varphi_1 \hat\otimes_\pi \varphi_2 : (V_1 \otimes_\pi W_1) \hat\otimes_\pi (V_2 \otimes_\pi W_2) \to H_1 \hat\otimes_\pi H_2. \]
By the associativity of the projective tensor product and Lemma 2.1 we have
\[
\begin{aligned}
(V_1 \otimes_\pi W_1) \hat\otimes_\pi (V_2 \otimes_\pi W_2)
&= ((V_1 \otimes_\pi W_1) \otimes_\pi V_2) \hat\otimes_\pi W_2 \\
&= ((V_1 \otimes_\pi V_2) \otimes_\pi W_1) \hat\otimes_\pi W_2 \\
&= (V_1 \otimes_\pi V_2) \hat\otimes_\pi (W_1 \otimes_\pi W_2) \\
&= (V_1 \hat\otimes_\pi V_2) \hat\otimes_\pi (W_1 \hat\otimes_\pi W_2).
\end{aligned}
\]
So we get a continuous linear map $(V_1 \hat\otimes_\pi V_2) \hat\otimes_\pi (W_1 \hat\otimes_\pi W_2) \to H_1 \hat\otimes_\pi H_2$, which is equivalent to a continuous bilinear map $(V_1 \hat\otimes_\pi V_2) \times (W_1 \hat\otimes_\pi W_2) \to H_1 \hat\otimes_\pi H_2$. Clearly this extends the bilinear map $\psi_1 \otimes \psi_2 : (V_1 \otimes V_2) \times (W_1 \otimes W_2) \to H_1 \otimes H_2$.

Proposition 2.3. Let V and W be LCA. Then $V \hat\otimes_\pi W$ is also a LCA extending the natural algebra structure on $V \otimes W$. If both V and W are LC∗A, so is $V \hat\otimes_\pi W$. If H is a locally convex left V-module, then $H \hat\otimes_\pi W$ is a locally convex left $V \hat\otimes_\pi W$-module.

Proof. By Lemma 2.2 we have the continuous bilinear map
\[ (V \hat\otimes_\pi W) \times (V \hat\otimes_\pi W) \to V \hat\otimes_\pi W \]
extending the multiplication of $V \otimes W$. Since $V \otimes W$ is dense in $V \hat\otimes_\pi W$, clearly the above bilinear map is associative. In other words, $V \hat\otimes_\pi W$ is a LCA. The assertion about modules can be proved in the same way. If both V and W are LC∗A, then we have the tensor product of the ∗-operations $V \hat\otimes_\pi W \to V \hat\otimes_\pi W$. Since it extends the natural ∗-operation on $V \otimes W$, it is easy to check that it is compatible with the algebra structure. So $V \hat\otimes_\pi W$ is a LC∗A.

For any LCTVS V and W, one can also define the injective tensor product $V \otimes_\varepsilon W$ of V and W, and form the completion $V \hat\otimes_\varepsilon W$. Let us say that a continuous linear map $\psi : V \to W$ is an isomorphism of V into W if ψ is injective and $\psi : V \to \psi(V)$ is a homeomorphism of topological spaces. The only property about the injective tensor product we shall need is that if $\psi_j$ is an isomorphism of $V_j$ into $W_j$ for j = 1, 2, then the corresponding tensor product linear map $\psi_1 \hat\otimes_\varepsilon \psi_2$ is an isomorphism of $V_1 \hat\otimes_\varepsilon V_2$ into $W_1 \hat\otimes_\varepsilon W_2$ [29, Prop. 43.7].

Let n ≥ 2, and let θ be a real skew-symmetric n × n matrix. Denote by $A_\theta$ the corresponding quantum torus [19, 20]. It can be described as follows. Let $\omega_\theta$ denote the skew-symmetric bicharacter on $\mathbb{Z}^n$ defined by
\[ \omega_\theta(p, q) = e^{i\pi\, p \cdot \theta q}. \]
For each $p \in \mathbb{Z}^n$ there is a unitary $u_p$ in $A_\theta$, and $A_\theta$ is generated by these unitaries with the relation
\[ u_p u_q = \omega_\theta(p, q)\, u_{p+q}. \]
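The defining relation of the quantum torus can be explored on finitely supported elements $\sum_p a_p u_p$, stored as dictionaries indexed by $p \in \mathbb{Z}^n$. The sketch below is a minimal illustration under our own assumptions: the matrix `theta` and the helper names are hypothetical and not part of the paper. It checks that the product defined by the relation is associative, and that $u_p u_q = e^{2\pi i\, p\cdot\theta q}\, u_q u_p$, which follows from $\omega_\theta(q, p) = \overline{\omega_\theta(p, q)}$.

```python
import cmath

def omega(theta, p, q):
    """Bicharacter omega_theta(p, q) = exp(i * pi * p . (theta q))."""
    n = len(p)
    pairing = sum(p[i] * theta[i][j] * q[j] for i in range(n) for j in range(n))
    return cmath.exp(1j * cmath.pi * pairing)

def multiply(theta, a, b):
    """Product of finitely supported elements, using u_p u_q = omega(p, q) u_{p+q}."""
    out = {}
    for p, ap in a.items():
        for q, bq in b.items():
            r = tuple(pi + qi for pi, qi in zip(p, q))
            out[r] = out.get(r, 0) + ap * bq * omega(theta, p, q)
    return out

def close(a, b, tol=1e-12):
    """Compare two finitely supported elements coefficient by coefficient."""
    keys = set(a) | set(b)
    return all(abs(a.get(k, 0) - b.get(k, 0)) < tol for k in keys)

theta = [[0.0, 0.3], [-0.3, 0.0]]      # hypothetical skew-symmetric 2 x 2 parameter
u = {(1, 0): 1.0}                      # the unitary u_p with p = (1, 0)
v = {(0, 1): 1.0}                      # the unitary u_q with q = (0, 1)
w = {(2, -1): 0.5, (0, 0): 1.0}        # an arbitrary finitely supported element

uv = multiply(theta, u, v)
vu = multiply(theta, v, u)
phase = cmath.exp(2j * cmath.pi * theta[0][1])   # exp(2*pi*i * p . theta q)

print(close(uv, {k: phase * c for k, c in vu.items()}))
print(close(multiply(theta, uv, w), multiply(theta, u, multiply(theta, v, w))))
```

Associativity holds because the exponent $p \cdot \theta q$ is bilinear in (p, q), so $\omega_\theta(p, q)\,\omega_\theta(p+q, r) = \omega_\theta(p, q+r)\,\omega_\theta(q, r)$.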
So one may think of vectors in $A_\theta$ as some kind of functions on $\mathbb{Z}^n$. The n-torus $\mathbb{T}^n$ has a canonical ergodic action τ on $A_\theta$. Notice that $\mathbb{Z}^n$ is the dual group of $\mathbb{T}^n$. We denote the duality by $\langle p, x \rangle$ for $x \in \mathbb{T}^n$ and $p \in \mathbb{Z}^n$. Then τ is determined by $\tau_x(u_p) = \langle p, x \rangle u_p$. The set $A_\theta^\infty$ of smooth vectors for the action τ is exactly the Schwartz space $S(\mathbb{Z}^n)$ [2]. Let $X_1, \cdots, X_n$ be a basis for the Lie algebra of $\mathbb{T}^n$. Then we have the differential $\partial_{X_j}(f) \in A_\theta^\infty$ for each $f \in A_\theta^\infty$ and 1 ≤ j ≤ n. For each $k \in \mathbb{N}$ define a seminorm $q_k$ on $A_\theta^\infty$ by
\[ q_k(f) := \max_{|\vec m| \le k} \big\| \partial_{X_1}^{m_1} \cdots \partial_{X_n}^{m_n}(f) \big\|, \]
where $m_1, \cdots, m_n$ are nonnegative integers and $|\vec m| = \sum_{j=1}^n m_j$. Clearly $A_\theta^\infty$ is a complete LC∗A equipped with the topology defined by these $q_k$’s. On the other hand, it is easy to see that this topology is the same as the usual topology on $S(\mathbb{Z}^n)$. Thus $A_\theta^\infty$ is a nuclear space [29, Theorem 51.5], which means that for every LCTVS V the injective and projective topologies on $V \otimes A_\theta^\infty$ coincide [29, Theorem 50.1]. So we shall simply use $V \otimes A_\theta^\infty$ to denote the (projective or injective) topological tensor product. The algebraic tensor product will be denoted by $V \otimes_{alg} A_\theta^\infty$.

We shall need to integrate continuous functions with values in a LCTVS. For our purpose, it suffices to use the Riemann integral. Though this should be well known, we have not been able to find any reference in the literature. So we include a definition here.

Lemma 2.4. Let X be a compact space with a probability measure µ. Let
\[ I := \big\{ \{X_1, \cdots, X_k\} : X_1, \cdots, X_k \text{ are disjoint measurable subsets of } X,\ k \in \mathbb{N},\ \cup_{j=1}^k X_j = X \big\} \]
be the set of all finite partitions of X into measurable subsets with the fine order, i.e. $\{X'_1, \cdots, X'_{k'}\} \ge \{X_1, \cdots, X_k\}$ if and only if every $X'_i$ is contained in some $X_j$. Let V be a complete LCTVS, and let $f : X \to V$ be a continuous map. For each $\{X_1, \cdots, X_k\}$ in I pick an $x_j \in X_j$ for each j, and let
\[ v_{\{X_1, \cdots, X_k\}} = \sum_{j=1}^k \mu(X_j)\, f(x_j). \]
Then $\{v_{\{X_1, \cdots, X_k\}}\}_{\{X_1, \cdots, X_k\} \in I}$ is a Cauchy net in V, and its limit doesn’t depend on the choice of the representatives $x_1, \cdots, x_k$.

Proof. Let a continuous seminorm p on V and an ε > 0 be given. For each $x \in X$ there is an open neighborhood $U_x$ of x such that $p(f(x) - f(y)) \le \varepsilon$ for all $y \in U_x$. Since X is compact, we can cover X with finitely many such $U_x$, say $U_{x_1}, \cdots, U_{x_k}$. Let $X_1 = U_{x_1}$ and $X_j = U_{x_j} \setminus \cup_{s=1}^{j-1} X_s$ inductively for all 2 ≤ j ≤ k. Then $\{X_1, \cdots, X_k\}$ is a finite partition of X. For any $\{X'_1, \cdots, X'_{k'}\} \ge \{X_1, \cdots, X_k\}$, clearly
\[ p\big(v_{\{X'_1, \cdots, X'_{k'}\}} - v_{\{X_1, \cdots, X_k\}}\big) \le 2\varepsilon \]
no matter how we choose the representatives for $\{X'_1, \cdots, X'_{k'}\}$ and $\{X_1, \cdots, X_k\}$. This gives the desired result.

Definition 2.5. Let X be a compact space with a probability measure µ, and let f be a continuous function from X into a complete LCTVS V. The integration of f over X, denoted by $\int_X f\, d\mu$, is defined as the limit in Lemma 2.4.

The next proposition is obvious:

Proposition 2.6. Let X be a compact space with a probability measure µ, and let $f_1, f_2$ be continuous functions from X into a complete LCTVS V. Then
\[ \int_X (f_1 + f_2)\, d\mu = \int_X f_1\, d\mu + \int_X f_2\, d\mu, \qquad \int_X \lambda f_1\, d\mu = \lambda \int_X f_1\, d\mu \]
for any scalar λ. If $\psi : V \to W$ is a continuous linear map from V into another complete LCTVS W, then
\[ \int_X \psi \circ f_1\, d\mu = \psi\Big( \int_X f_1\, d\mu \Big). \]
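The net of Riemann sums in Lemma 2.4 can be watched converging in a simple finite-dimensional case. The sketch below is only an illustration under our own assumptions: X = [0, 1] with Lebesgue measure, V = R², the map f(x) = (x, x²), uniform partitions, and all names are hypothetical choices. It forms the sums $\sum_j \mu(X_j) f(x_j)$ for two different choices of representatives and compares them with the exact integral (1/2, 1/3).

```python
import random

def f(x):
    """Continuous map from X = [0, 1] into V = R^2 (illustrative choice)."""
    return (x, x * x)

def uniform_partition(k):
    """Partition of [0, 1) into k intervals [a, b) of Lebesgue measure 1/k."""
    return [(j / k, (j + 1) / k) for j in range(k)]

def riemann_sum(func, cells, pick):
    """Compute sum_j mu(X_j) * func(x_j), with x_j = pick(a, b) chosen in X_j = [a, b)."""
    total = [0.0, 0.0]
    for a, b in cells:
        value = func(pick(a, b))
        total = [t + (b - a) * v for t, v in zip(total, value)]
    return total

rng = random.Random(0)   # fixed seed so the random representatives are reproducible

def pick_left(a, b):
    return a

def pick_random(a, b):
    return a + (b - a) * rng.random()

exact = (0.5, 1.0 / 3.0)   # componentwise integral of f over [0, 1]
for k in (10, 100, 1000):
    cells = uniform_partition(k)
    for pick in (pick_left, pick_random):
        s = riemann_sum(f, cells, pick)
        err = max(abs(s[0] - exact[0]), abs(s[1] - exact[1]))
        print(k, pick.__name__, err)
```

Since both components of f are Lipschitz with constant at most 2 on [0, 1], the error is at most 2/k whichever representatives are picked, in line with the 2ε estimate in the proof of Lemma 2.4.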
It is also easy to verify the analogue of the fundamental theorem of calculus:

Proposition 2.7. Let f be a continuous map from [0, 1] to a complete LCTVS V. Then
\[ f(0) = \lim_{t \to 0} \frac{\int_0^t f(s)\, ds}{t}. \]

2.2. Clifford algebras. Next we recall some facts about Clifford algebras [11, Chap. 1, 12] [12, Sect. 1.8]. Let V be a real vector space of dimension m equipped with a positive-definite inner product. The corresponding Clifford algebra, denoted by $Cl(V)$, is the quotient of the tensor algebra $\oplus_{k \ge 0} V \otimes \cdots \otimes V$ generated by V by the two-sided ideal generated by all elements of the form $v \otimes v + \|v\|^2$ for $v \in V$. The complexified Clifford algebra, denoted by $Cl^{\mathbb{C}}(V)$, is defined as $Cl^{\mathbb{C}}(V) := Cl(V) \otimes_{\mathbb{R}} \mathbb{C}$. $Cl^{\mathbb{C}}(V)$ has a natural finite-dimensional $C^*$-algebra structure [11, Theorem 1.7.35]. Denote by SO(V) the group of isometries of V preserving the orientation. For each $g \in SO(V)$ the isometry $g : V \to V$ induces an algebra isomorphism $Cl(V) \to Cl(V)$ and a $C^*$-algebra isomorphism $Cl^{\mathbb{C}}(V) \to Cl^{\mathbb{C}}(V)$. In this way SO(V) acts on $Cl(V)$ and $Cl^{\mathbb{C}}(V)$.

Recall that a state φ on a $C^*$-algebra A is said to be tracial if $\varphi(ab) = \varphi(ba)$ for all $a, b \in A$.

Lemma 2.8. When m is even, there is a unique tracial state tr on $Cl^{\mathbb{C}}(V)$. When m is odd, let $\gamma := i^{\frac{m+1}{2}} e_1 \cdots e_m$ be the chirality operator, where $e_1, \cdots, e_m$ is an orthonormal basis of V. Then γ is fixed under the action of SO(V) (equivalently, γ doesn’t depend on the choice of the ordered orthonormal basis $e_1, \cdots, e_m$), and there is a unique tracial state tr on $Cl^{\mathbb{C}}(V)$ such that tr(γ) = 0. In both cases, tr is SO(V)-invariant.

Proof. In both cases, the SO(V)-invariance of tr follows from the uniqueness. So we just need to show the uniqueness of tr.

When m is even, $Cl^{\mathbb{C}}(V)$ is isomorphic to the $C^*$-algebra of $2^{m/2}$ by $2^{m/2}$ matrices [11, Theorem 1.3.2]. The uniqueness of tr follows from the fact that for any $n \in \mathbb{N}$ the $C^*$-algebra of n by n matrices has a unique tracial state [13, Example 8.1.2].

Assume that m is odd now. Then $Cl^{\mathbb{C}}(V)$ is isomorphic to the direct sum of two copies of the $C^*$-algebra of $2^{(m-1)/2}$ by $2^{(m-1)/2}$ matrices [11, Theorem 1.3.2]. Say $Cl^{\mathbb{C}}(V) = A_1 \oplus A_2$, where both $A_1$ and $A_2$ are isomorphic to the $C^*$-algebra of $2^{(m-1)/2}$ by $2^{(m-1)/2}$ matrices. Let $p_j$ be the projection of $Cl^{\mathbb{C}}(V)$ to $A_j$, and let $\varphi_j$ be the unique tracial state of $A_j$. Then the tracial states of $Cl^{\mathbb{C}}(V)$ are exactly $\lambda \varphi_1 \circ p_1 + (1 - \lambda) \varphi_2 \circ p_2$ for 0 ≤ λ ≤ 1. It is easily verified that γ belongs to the center of $Cl^{\mathbb{C}}(V)$. So γ must be in $\mathbb{C} \cdot 1_{A_1} + \mathbb{C} \cdot 1_{A_2}$. It’s also clear that $\gamma^2 = 1$ and $\gamma \notin \mathbb{C}$. So γ must be $\pm(1_{A_1} - 1_{A_2})$. It follows immediately that $Cl^{\mathbb{C}}(V)$ has a unique tracial state tr satisfying tr(γ) = 0, namely, $\mathrm{tr} = \frac{1}{2}(\varphi_1 \circ p_1 + \varphi_2 \circ p_2)$. It is easy to check that γ is fixed under the action of SO(V).

There is a natural injective map $V \to Cl(V)$. So one may think of V as a subspace of $Cl(V)$. The $C^*$-algebra norm on $Cl^{\mathbb{C}}(V)$ extends the norm on V induced from the inner product (see [11, Theorem 1.7.22(iv)] for the corresponding statement for the real $C^*$-algebra norm; the proofs are similar). Let M be an oriented Riemannian manifold of dimension m. Then we have the smooth algebra bundles $ClM$ and $Cl^{\mathbb{C}}M$ over M
with fibre algebras $Cl(TM_x)$ and $Cl^{\mathbb{C}}(TM_x)$ respectively, where $TM_x$ is the tangent space at $x \in M$. These are called the Clifford algebra bundle and the complexified Clifford algebra bundle. Since $TM_x \subseteq Cl(TM_x)$, the complexified tangent bundle $TM^{\mathbb{C}}$ is a subbundle of $Cl^{\mathbb{C}}M$. Since $Cl^{\mathbb{C}}(TM_x)$ is unital, $C^\infty(M)$ is a subalgebra of $C^\infty(M, Cl^{\mathbb{C}}M)$.

2.3. Compact quantum metric spaces. Finally, we review Rieffel’s theory of compact quantum metric spaces [22–24, 27]. Though Rieffel has set up his theory in the general framework of order-unit spaces, we shall need it only for $C^*$-algebras. See the discussion preceding Definition 2.1 in [24] for the reason of requiring the reality condition (3) below.

Definition 2.9 [24, Definition 2.1]. By a $C^*$-algebraic compact quantum metric space we mean a pair (A, L) consisting of a unital $C^*$-algebra A and a (possibly +∞-valued) seminorm L on A satisfying the reality condition
\[ L(a) = L(a^*) \tag{3} \]
for all $a \in A$, such that L vanishes exactly on $\mathbb{C}$ and the metric $\rho_L$ on the state space S(A) defined by (2) induces the weak-∗ topology. The radius of (A, L) is defined to be the radius of $(S(A), \rho_L)$. We say that L is a Lip-norm.

Let A be a unital $C^*$-algebra and let L be a (possibly +∞-valued) seminorm on A vanishing on $\mathbb{C}$. Then L and $\|\cdot\|$ induce (semi)norms $\tilde L$ and $\|\cdot\|^\sim$ respectively on the quotient space $\tilde A = A/\mathbb{C}$.

Notation 2.10. For any r ≥ 0, let
\[ D_r(A) := \{a \in A : L(a) \le 1,\ \|a\| \le r\}. \]

The main criterion for when a seminorm L is a Lip-norm is the following:

Proposition 2.11 [22, Prop. 1.6, Theorem 1.9]. Let A be a unital $C^*$-algebra and let L be a (possibly +∞-valued) seminorm on A satisfying the reality condition (3). Assume that L takes finite values on a dense subspace of A, and that L vanishes exactly on $\mathbb{C}$. Then L is a Lip-norm if and only if
(1) there is a constant K ≥ 0 such that $\|\cdot\|^\sim \le K \tilde L$ on $\tilde A$; and
(2) for any r ≥ 0, the ball $D_r(A)$ is totally bounded in A for $\|\cdot\|$; or
(2′) for some r > 0, the ball $D_r(A)$ is totally bounded in A for $\|\cdot\|$.
In this event, the radius $r_A$ of (A, L) is exactly the minimal K such that $\|\cdot\|^\sim \le K \tilde L$ on $(\tilde A)_{sa}$.

3. Connes and Dubois-Violette’s Formulation of θ-Deformations

Though the Dirac operator does not depend on θ in Connes and Landi’s formulation of θ-deformations in [8, Sect. 5], it does in Connes and Dubois-Violette’s formulation in [7]. In this section we review the formulation of θ-deformations by Connes and Dubois-Violette [7, Sect. 11 and 13], including the deformation of both the algebra and the Dirac operator.
Let M be a smooth manifold with a smooth action $\sigma_M$ of $\mathbb{T}^n$. We denote by σ the induced action of $\mathbb{T}^n$ on the LC∗A $C^\infty(M)$. Then σ is continuous. By Proposition 2.3 the tensor product completion $C^\infty(M) \hat\otimes A_\theta^\infty$ is a LC∗A. The tensor product action $\sigma \hat\otimes \tau^{-1}$ of $\mathbb{T}^n$ on $C^\infty(M) \hat\otimes A_\theta^\infty$ is also continuous. The deformed smooth algebra [7, Sect. 11], denoted by $C^\infty(M_\theta)$, is then defined as the fixed-point space of this action, i.e. $C^\infty(M_\theta) = (C^\infty(M) \hat\otimes A_\theta^\infty)^{\sigma \hat\otimes \tau^{-1}}$. Clearly, this is a LC∗A.

Suppose M is equipped with a $\sigma_M$-invariant Riemannian metric. (For any Riemannian metric on M, we can always integrate it over $\mathbb{T}^n$ to make it $\sigma_M$-invariant.) Also assume that M is a spin manifold and that $\sigma_M$ lifts to a smooth action $\sigma_S$ of $\mathbb{T}^n$ on the spin bundle S, i.e. the following diagram
\[
\begin{array}{ccc}
S & \xrightarrow{\ \sigma_{S,x}\ } & S \\
\downarrow & & \downarrow \\
M & \xrightarrow{\ \sigma_{M,x}\ } & M
\end{array}
\]
is commutative for every $x \in \mathbb{T}^n$. (Usually $\sigma_M$ doesn’t lift directly to S, but lifts only modulo ±I, i.e. there is a twofold covering $\mathbb{T}^n \to \mathbb{T}^n$ such that $\sigma_M$ lifts to an action of the twofold covering on S. Correspondingly, Connes and Dubois-Violette defined the various deformed structures using the tensor product with $A_{\frac{1}{2}\theta}$ instead of $A_\theta$. But for the deformed algebras and Dirac operators, the difference is just a matter of parameterization.) We denote the induced continuous action of $\mathbb{T}^n$ on $C^\infty(M, S)$ also by σ. Then $C^\infty(M, S)$ is a locally convex left $C^\infty(M)$-module and $\sigma_x(f\psi) = \sigma_x(f)\sigma_x(\psi)$ for all $f \in C^\infty(M)$, $\psi \in C^\infty(M, S)$ and $x \in \mathbb{T}^n$. We also have the tensor product completion $C^\infty(M, S) \hat\otimes A_\theta^\infty$, which is a locally convex left module over $C^\infty(M) \hat\otimes A_\theta^\infty$ by Proposition 2.3. The tensor product action $\sigma \hat\otimes \tau^{-1}$ of $\mathbb{T}^n$ on $C^\infty(M, S) \hat\otimes A_\theta^\infty$ is still continuous. The deformed spin bundle, denoted by $C^\infty(M_\theta, S)$, is then defined as the fixed-point space of this action, i.e. $C^\infty(M_\theta, S) = (C^\infty(M, S) \hat\otimes A_\theta^\infty)^{\sigma \hat\otimes \tau^{-1}}$. This is a locally convex left $C^\infty(M_\theta)$-module.

Let D be the Dirac operator on $C^\infty(M, S)$. This is a first-order linear differential operator. So it is easy to see that D is continuous with respect to the locally convex topology on $C^\infty(M, S)$. Then we have the tensor product linear map $D \hat\otimes I$ from $C^\infty(M, S) \hat\otimes A_\theta^\infty$ to itself. Notice that D commutes with the action σ, so $D \hat\otimes I$ commutes with the action $\sigma \hat\otimes \tau^{-1}$. Therefore $C^\infty(M_\theta, S)$ is stable under $D \hat\otimes I$. Denote by $D_\theta$ the restriction of $D \hat\otimes I$ to $C^\infty(M_\theta, S)$.

Assume further that M is compact. As usual, one defines a positive-definite scalar product on $C^\infty(M, S)$ by
\[ \langle \psi, \psi' \rangle = \int_M (\psi, \psi')\, \mathrm{vol}, \]
where vol is the Riemannian volume form. Denote by $\mathcal{H} = L^2(M,S)$ the Hilbert space obtained by completion. Then $C(M)$ has a natural faithful representation on $\mathcal{H}$ by multiplication, and we shall think of $C(M)$ as a subalgebra of $B(\mathcal{H})$, the $C^*$-algebra of all bounded operators on $\mathcal{H}$. The action $\sigma$ uniquely extends to a continuous unitary representation of $\mathbb{T}^n$ in $\mathcal{H}$, which will be still denoted by $\sigma$. On the other hand, $A_\theta$ has an inner product induced by the unique $\tau$-invariant tracial state. Denote by $L^2(A_\theta)$ the Hilbert space obtained by completion. Then $A_\theta$ acts on $L^2(A_\theta)$ faithfully by the GNS construction, and we shall also think of $A_\theta$ as a subalgebra of $B(L^2(A_\theta))$. The action $\tau$ also extends to a continuous unitary representation of $\mathbb{T}^n$ in $L^2(A_\theta)$. Let $\mathcal{H}\bar{\otimes}L^2(A_\theta)$ be the Hilbert space tensor product. Then we have the continuous tensor product action $\sigma\bar{\otimes}\tau^{-1}$ on $\mathcal{H}\bar{\otimes}L^2(A_\theta)$. The deformed Hilbert space, denoted by $\mathcal{H}_\theta$, is defined as the fixed-point space of $\mathcal{H}\bar{\otimes}L^2(A_\theta)$ under the action $\sigma\bar{\otimes}\tau^{-1}$. Clearly the maps $C^\infty(M,S) \to \mathcal{H}$ and $A^\infty_\theta \to L^2(A_\theta)$ are continuous with respect to the locally convex topologies on $C^\infty(M,S)$, $A^\infty_\theta$ and the norm topologies on $\mathcal{H}$, $L^2(A_\theta)$. Then we have the sequence of continuous linear maps

$$C^\infty(M,S)\hat{\otimes}A^\infty_\theta \xrightarrow{\ \phi\ } \mathcal{H}\hat{\otimes}_\pi L^2(A_\theta) \xrightarrow{\ \psi\ } \mathcal{H}\bar{\otimes}L^2(A_\theta),$$

where $\mathcal{H}\hat{\otimes}_\pi L^2(A_\theta)$ is the completion of the projective tensor product of $\mathcal{H}$ and $L^2(A_\theta)$. Let $\Psi : C^\infty(M,S)\hat{\otimes}A^\infty_\theta \to \mathcal{H}\bar{\otimes}L^2(A_\theta)$ be the composition. Then $\Psi$ is $\mathbb{T}^n$-equivariant. So $\Psi$ maps $C^\infty(M_\theta,S)$ into $\mathcal{H}_\theta$. Let $\Psi_\theta$ be the restriction of $\Psi$ to $C^\infty(M_\theta,S)$.

Lemma 3.1. Both maps $\phi : C^\infty(M,S)\hat{\otimes}A^\infty_\theta \to \mathcal{H}\hat{\otimes}_\pi L^2(A_\theta)$ and $\psi : \mathcal{H}\hat{\otimes}_\pi L^2(A_\theta) \to \mathcal{H}\bar{\otimes}L^2(A_\theta)$ are injective. Consequently, $\Psi$ and $\Psi_\theta$ are injective.
Proof. We'll prove the injectivity of $\phi$. The proof for $\psi$ is similar. Recall the notation at the end of Sect. 1. We shall need the following well-known fact several times. We omit the proof.

Lemma 3.2. Let $G$ be a compact group. Let $\alpha$ be a continuous action of $G$ on a complex complete LCTVS $V$. For a continuous $\mathbb{C}$-valued function $\varphi$ on $G$ let

$$\alpha_\varphi(v) = \int_G \varphi(x)\,\alpha_x(v)\,dx$$

for $v \in V$. Then $\alpha_\varphi : V \to V$ is a continuous linear map. If $J$ is a finite subset of $\hat{G}$ and if $\varphi$ is a linear combination of the characters of $\gamma \in \bar{J}$, then $\alpha_\varphi(V) \subseteq V_J$. Let $\alpha_J = \alpha_{\sum_{\gamma \in J} \dim(\gamma)\overline{\chi_\gamma}}$. (When $J$ is a one-element set $\{\gamma\}$, we'll simply write $\alpha_\gamma$ for $\alpha_{\{\gamma\}}$.) Then $\alpha_J(v) = v$ for all $v \in V_J$, and $\alpha_J(v) = 0$ for all $v \in V_\gamma$ with $\gamma \in \hat{G} \setminus J$.

From Proposition 2.6 we also have:

Lemma 3.3. Let $G$ be a compact group with continuous actions $\alpha$ and $\beta$ on complex complete LCTVS $V$ and $W$. Let $\phi : V \to W$ be a continuous $G$-equivariant linear map, and let $\varphi : G \to \mathbb{C}$ be a continuous function. Then

$$\phi \circ \alpha_\varphi = \beta_\varphi \circ \phi. \tag{4}$$

In particular, let $J$ be a finite subset of $\hat{G}$. Then $\phi \circ \alpha_J = \beta_J \circ \phi$.

We shall need the following lemma a few times:

Lemma 3.4. Let $G$ be a compact group, and let $h$ be a continuous $\mathbb{C}$-valued function on $G$ with $h(e_G) = 0$. Then for any $\varepsilon > 0$ there is a nonnegative function $\varphi$ on $G$ such that $\varphi$ is a linear combination of finitely many characters, $\|\varphi\|_1 = 1$, and $\|\varphi \cdot h\|_1 < \varepsilon$.
Proof. Notice that the left regular representation of $G$ on $L^2(G)$ is faithful. Since the left regular representation is a Hilbert space direct sum of irreducible representations, we see that any $x \neq e_G$ acts nontrivially in some $\gamma \in \hat{G}$. Let $U$ be an open neighborhood of $e_G$ such that $|h(x)| < \varepsilon/2$ for all $x \in U$. For any $x \in G \setminus U$, suppose that $x$ acts nontrivially in $\gamma_x \in \hat{G}$. Then there is some open neighborhood $U_x$ of $x$ such that $x'$ acts nontrivially in $\gamma_x$ for all $x' \in U_x$. Since $G \setminus U$ is compact, we can find $x_1, \dots, x_m \in G \setminus U$ so that $U_{x_1}, \dots, U_{x_m}$ cover $G \setminus U$. Let $J_U = \{\gamma_{x_1}, \dots, \gamma_{x_m}\}$. Then no element in $G \setminus U$ acts trivially in all $\gamma \in J_U$. Let $\pi_1$ be the direct sum of one copy for each $\gamma$ in $J_U \cup \{\gamma_0\}$, and let $\chi_{\pi_1}$ be the character of $\pi_1$. Let $\pi = \pi_1 \otimes \overline{\pi_1}$. Also let $\chi$ be the character of $\pi$. Note that $\chi(x) = |\chi_{\pi_1}(x)|^2 \geq 0$ for all $x \in G$. Let $\varphi_n = \chi^n / \|\chi^n\|_1$. Then each $\varphi_n$ is a linear combination of finitely many characters. Since every element in $G \setminus U$ acts nontrivially in $\pi$, $\chi(x) < \chi(e_G)$ on $G \setminus U$. Therefore it's easy to see (cf. the proof of Theorem 8.2 in [24]) that $\int_{G \setminus U} \varphi_n(x)\,dx \to 0$ as $n \to \infty$, and hence

$$\limsup_{n \to +\infty} \int_G |\varphi_n(x) h(x)|\,dx \leq \sup_{x \in U} |h(x)| < \varepsilon.$$

So when $n$ is big enough, we have that $\|\varphi_n \cdot h\|_1 < \varepsilon$.
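As a numerical illustration of Lemma 3.4 (not part of the original argument), the sketch below takes $G$ to be the circle with normalized Haar measure, $\pi_1$ the direct sum of the trivial character and the character $x \mapsto e^{ix}$, so that $\chi(x) = |1 + e^{ix}|^2 = 2 + 2\cos x$, and the test function $h(x) = |e^{ix} - 1|$. The choice of group, the function `h`, the helper name `kernel_mass` and the Riemann-sum discretization are all assumptions made for the example.

```python
import math

def kernel_mass(n, h, samples=4096):
    """Return (||phi_n||_1, ||phi_n * h||_1) for phi_n = chi^n / ||chi^n||_1.

    The circle is sampled uniformly; averages over the samples stand in for
    integration against normalized Haar measure (an assumption of this sketch).
    """
    xs = [2 * math.pi * k / samples for k in range(samples)]
    chi_n = [(2 + 2 * math.cos(x)) ** n for x in xs]   # chi(x)^n with chi >= 0
    norm = sum(chi_n) / samples                        # ||chi^n||_1
    phi = [c / norm for c in chi_n]
    l1 = sum(phi) / samples
    weighted = sum(p * abs(h(x)) for p, x in zip(phi, xs)) / samples
    return l1, weighted

def h(x):
    # Continuous function vanishing at the identity of the circle.
    return abs(complex(math.cos(x), math.sin(x)) - 1)

# ||phi_n * h||_1 for increasing n; the values shrink as phi_n concentrates at 0.
values = [kernel_mass(n, h)[1] for n in (1, 4, 16, 64)]
```

Each $\varphi_n$ here is a trigonometric polynomial of degree $n$, so it is a finite linear combination of characters, as the lemma requires.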
As a corollary of Lemma 3.4 we have:

Lemma 3.5. Let $G$ be a compact group. Let $\alpha$ be a continuous action of $G$ on a complex complete LCTVS $V$. Let $v \in V$. If $\alpha_\gamma(v) = 0$ for all $\gamma \in \hat{G}$, then $v = 0$.

Proof. Let $p$ be a continuous seminorm on $V$, and let $\varepsilon > 0$. Define a function $h$ on $G$ by $h(x) = p(v - \alpha_x(v))$. Then $h$ is continuous on $G$, and $h(e_G) = 0$. Pick $\varphi$ for $h$ and $\varepsilon$ in Lemma 3.4. According to the assumption we have $\alpha_\varphi(v) = 0$. Then

$$p(v) = p(v - \alpha_\varphi(v)) = p\Big(\int_G \varphi(x)(v - \alpha_x(v))\,dx\Big) \leq \int_G \varphi(x) h(x)\,dx < \varepsilon.$$

Since the topology on $V$ is defined by all the continuous seminorms, we see that $v = 0$.

We are ready to prove Lemma 3.1. Let $\alpha = I\hat{\otimes}\tau$ acting on $V = C^\infty(M,S)\hat{\otimes}A^\infty_\theta$, and let $\beta = I\hat{\otimes}\tau$ acting on $\mathcal{H}\hat{\otimes}_\pi L^2(A_\theta)$. Let $\phi$ be as in Lemma 3.1. Then $\phi \circ \alpha = \beta \circ \phi$. Recall the notation about $A_\theta$ in Subsect. 2.1. For any $q \in \mathbb{Z}^n = \widehat{\mathbb{T}^n}$ clearly $\alpha_q$ maps $C^\infty(M,S) \otimes_{\mathrm{alg}} A^\infty_\theta$ onto $C^\infty(M,S) \otimes u_q$. Since $\alpha_q$ is continuous, by Lemma 3.2 it follows immediately that $V_q = \alpha_q(V) = C^\infty(M,S) \otimes u_q$. Let $f \in \ker(\phi)$. For any $q \in \mathbb{Z}^n$ by Lemma 3.3 $\phi(\alpha_q(f)) = \beta_q(\phi(f)) = 0$. Now $\alpha_q(f) \in C^\infty(M,S) \otimes u_q$, and clearly $\phi$ restricted to $C^\infty(M,S) \otimes u_q$ is injective. So $\alpha_q(f) = 0$. From Lemma 3.5 we see that $f = 0$.

Lemma 3.6. The image $\Psi_\theta(C^\infty(M_\theta,S))$ is dense in $\mathcal{H}_\theta$.

Clearly $\Psi(C^\infty(M,S)\hat{\otimes}A^\infty_\theta)$ is dense in $\mathcal{H}\bar{\otimes}L^2(A_\theta)$, so this is an immediate consequence of the following:
Lemma 3.7. Let $G$ be a compact group. Let $\alpha$ and $\beta$ be continuous actions of $G$ on complex complete LCTVS $V$ and $W$ respectively. Let $\phi : V \to W$ be a continuous $G$-equivariant linear map such that $\phi(V)$ is dense in $W$. Then $\phi(V^\alpha)$ is dense in $W^\beta$.

Proof. Recall that $\gamma_0$ is the trivial representation of $G$. By Lemma 3.2 $\beta_{\gamma_0}$ is continuous. So $\beta_{\gamma_0}(\phi(V))$ is dense in $\beta_{\gamma_0}(W) = W^\beta$. But $\beta_{\gamma_0}(\phi(V)) = \phi(\alpha_{\gamma_0}(V)) = \phi(V^\alpha)$ according to Lemma 3.3. The conclusion follows.

The Dirac operator $D$ is essentially self-adjoint on $\mathcal{H}$ [15, Theorem 5.7]. Then $D \otimes I$ is also essentially self-adjoint on $\mathcal{H}\bar{\otimes}L^2(A_\theta)$ [13, Prop. 11.2.37]. Denote its closure by $D^{L^2}$.

Lemma 3.8. $\Psi(C^\infty(M,S)\hat{\otimes}A^\infty_\theta)$ is contained in the domain of $D^{L^2}$, and

$$D^{L^2} \circ \Psi = \Psi \circ (D\hat{\otimes}I). \tag{5}$$

Proof. For any $y \in C^\infty(M,S)\hat{\otimes}A^\infty_\theta$, take a net $y_j$ in $C^\infty(M,S) \otimes_{\mathrm{alg}} A^\infty_\theta$ converging to $y$. Then $\Psi(y_j) \to \Psi(y)$, $(D\hat{\otimes}I)(y_j) \to (D\hat{\otimes}I)(y)$ and $D^{L^2}(\Psi(y_j)) = \Psi((D\hat{\otimes}I)(y_j)) \to \Psi((D\hat{\otimes}I)(y))$. So $\Psi(y)$ is contained in the domain of $D^{L^2}$, and $D^{L^2}(\Psi(y)) = \Psi((D\hat{\otimes}I)(y))$.

So the intersection of $\mathcal{H}_\theta$ and the domain of $D^{L^2}$ contains $\Psi_\theta(C^\infty(M_\theta,S))$, which is dense in $\mathcal{H}_\theta$ by Lemma 3.6. Clearly $D \otimes I$ commutes with the action $\sigma\bar{\otimes}\tau^{-1}$, and thus so does $D^{L^2}$. Hence $D^{L^2}$ maps the intersection of $\mathcal{H}_\theta$ and the domain of $D^{L^2}$ into $\mathcal{H}_\theta$. Therefore the restriction of $D^{L^2}$ to $\mathcal{H}_\theta$ is also self-adjoint. The deformed Dirac operator, denoted by $D^{L^2}_\theta$, is then defined to be this restriction.

Similarly, the maps $C^\infty(M) \to C(M)$ and $A^\infty_\theta \to A_\theta$ are continuous with respect to the locally convex topologies on $C^\infty(M)$, $A^\infty_\theta$ and the norm topologies on $C(M)$, $A_\theta$. So we have the $\mathbb{T}^n$-equivariant continuous linear map $\Phi : C^\infty(M)\hat{\otimes}A^\infty_\theta \to C(M) \otimes A_\theta$, where $C(M) \otimes A_\theta$ is the spatial $C^*$-algebraic tensor product of $C(M)$ and $A_\theta$ [31, Appendix T.5].
Definition 3.9. We define the deformed continuous algebra, $C(M_\theta)$, to be the fixed-point algebra $(C(M) \otimes A_\theta)^{\sigma \otimes \tau^{-1}}$.

Then $\Phi$ maps $C^\infty(M_\theta)$ into $C(M_\theta)$. By similar arguments as in Lemma 3.1 and 3.7 we have

Lemma 3.10. The map $\Phi$ is injective, and $\Phi(C^\infty(M_\theta))$ is dense in $C(M_\theta)$.

Clearly $\mathcal{H}_\theta$ is stable under the action of elements in $C(M_\theta)$. So we can define $\Phi_\theta : C^\infty(M_\theta) \to B(\mathcal{H}_\theta)$ as the composition of $\Phi : C^\infty(M_\theta) \to C(M_\theta)$ and the restriction map of $C(M_\theta)$ to $B(\mathcal{H}_\theta)$. We shall see later in Proposition 5.6 that the restriction map of $C(M_\theta)$ to $B(\mathcal{H}_\theta)$ is isometric. So we may also think of $C(M_\theta)$ as a subalgebra of $B(\mathcal{H}_\theta)$. Then the closure of $\Phi_\theta(C^\infty(M_\theta))$ is just $C(M_\theta)$.

We shall see later in Proposition 5.2 that the domain of $D^{L^2}_\theta$ is stable under $\Phi_\theta(f)$, and that the commutator $[D^{L^2}_\theta, \Phi_\theta(f)]$ is bounded for every $f \in C^\infty(M_\theta)$.

Definition 3.11. We define the deformed Lip-norm, denoted by $L_\theta$, on $C(M_\theta)$ by

$$L_\theta(f) := \begin{cases} \|[D^{L^2}_\theta, f]\|, & \text{if } f \in \Phi_\theta(C^\infty(M_\theta)), \\ +\infty, & \text{otherwise}. \end{cases}$$
4. Lip-Norms and Compact Group Actions

In this section we consider a general situation in which there are a seminorm and a compact group action. We show that under certain compatibility hypotheses we can use this group action to prove that the seminorm is a Lip-norm. The strategy is a generalization of the one Rieffel used to deal with Lip-norms associated to ergodic compact (Lie) group actions [22, 24]. We'll see that θ-deformations fit into this general picture.

Throughout this section we assume that $G$ is an arbitrary compact group which has a fixed length function $l$, i.e. a continuous real-valued function, $l$, on $G$ such that $l(xy) \leq l(x) + l(y)$ for all $x, y \in G$, $l(x^{-1}) = l(x)$ for all $x \in G$, and $l(x) = 0$ if and only if $x = e_G$, where $e_G$ is the identity of $G$.

Theorem 4.1. Let $A$ be a unital $C^*$-algebra, let $L$ be a (possibly $+\infty$-valued) seminorm on $A$ satisfying the reality condition (3), and let $\alpha$ be a strongly continuous action of $G$ on $A$. Assume that $L$ takes finite values on a dense subspace of $A$, and that $L$ vanishes on $\mathbb{C}$. Let $L^l$ be the (possibly $+\infty$-valued) seminorm on $A$ defined by

$$L^l(a) = \sup\Big\{ \frac{\|\alpha_x(a) - a\|}{l(x)} : x \in G,\ x \neq e_G \Big\}. \tag{6}$$

Suppose that the following conditions are satisfied:
(1) there is some constant $C > 0$ such that $L^l \leq C \cdot L$ on $A$;
(2) for any linear combination $\varphi$ of finitely many characters on $G$ we have $L \circ \alpha_\varphi \leq \|\varphi\|_1 \cdot L$ on $A$, where $\alpha_\varphi$ is the linear map on $A$ defined in Lemma 3.2;
(3) for each $\gamma \in \hat{G}$ with $\gamma \neq \gamma_0$ the ball $D_r(A_\gamma) := \{a \in A_\gamma : L(a) \leq 1,\ \|a\| \leq r\}$ is totally bounded for some $r > 0$, and the only element in $A_\gamma$ vanishing under $L$ is $0$;
(4) there is a unital $C^*$-algebra $B$ containing $A_{\gamma_0} = A^\alpha$, with a Lip-norm $L_B$, such that $L_B$ extends the restriction of $L$ to $A_{\gamma_0}$.
Then $(A, L)$ is a $C^*$-algebraic compact quantum metric space with $r_A \leq r_B + C \int_G l(x)\,dx$.
Remark 4.2.
(1) We assume the existence of $(B, L_B)$ in the condition (4) only for the convenience of application. In fact, conditions (2) and (4) imply that $L$ restricted to $A_{\gamma_0}$ is a Lip-norm on $A_{\gamma_0}$: for any $a \in A_{\gamma_0}$ and $\varepsilon > 0$ pick $a' \in A$ with $L(a') < \infty$ and $\|a - a'\| < \varepsilon$. Then by Lemma 3.2 $\alpha_{\gamma_0}(a') \in A_{\gamma_0}$ and $\|a - \alpha_{\gamma_0}(a')\| = \|\alpha_{\gamma_0}(a - a')\| < \varepsilon$. By the condition (2) $L(\alpha_{\gamma_0}(a')) < \infty$. Therefore $L$ takes finite values on a dense subspace of $A_{\gamma_0}$. Then from Proposition 2.11 it is easy to see that $L$ restricted to $A_{\gamma_0}$ is a Lip-norm on $A_{\gamma_0}$. Consequently, we may take $B$ to be $A_{\gamma_0}$ itself.
(2) Conditions (1) and (2) in Theorem 4.1 enable us to reduce the study of $L$ to that of the restriction of $L$ to each $A_\gamma$. Conditions (3) and (4) say roughly that $L$ restricted to each $A_\gamma$ is a Lip-norm.
(3) Usually it is not hard to verify condition (2). In particular, by Lemma 4.3 it holds when $L$ is $\alpha$-invariant and lower semicontinuous on $\{a \in A : L(a) < +\infty\}$, and $\{a \in A : L(a) < +\infty\}$ is stable under $\alpha_\gamma$ for every $\gamma \in \hat{G}$.
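Lemma 4.3. Let $\alpha$ be a strongly continuous action of $G$ on a $C^*$-algebra $A$, and let $L$ be a (possibly $+\infty$-valued) seminorm on $A$. Suppose that $L$ is $\alpha$-invariant and lower semicontinuous on $\{a \in A : L(a) < +\infty\}$. For any continuous function $\varphi : G \to \mathbb{C}$, if $\{a \in A : L(a) < +\infty\}$ is stable under the map $\alpha_\varphi : A \to A$ defined in Lemma 3.2, then $L \circ \alpha_\varphi \leq \|\varphi\|_1 \cdot L$ on $A$.

Proof. We only need to show $L(\alpha_\varphi(a)) \leq \|\varphi\|_1 \cdot L(a)$ for each $a \in A$ with $L(a) < +\infty$. But

$$\alpha_\varphi(a) = \lim_{\Delta \to 0} \sum_{j=1}^{k} \alpha_{g_j}(a)\,\mu(E_j)\,\varphi(g_j),$$

where $\mu$ is the normalized Haar measure on $G$, $(E_1, \dots, E_k)$ is a partition of $G$, $g_j \in E_j$, $\Delta(E_j) := \sup\{\max(|\varphi(x) - \varphi(y)|, \|\alpha_x(a) - \alpha_y(a)\|) : x, y \in E_j\}$ and $\Delta = \max_{1 \leq j \leq k} \Delta(E_j)$. By the assumptions we have

$$L(\alpha_\varphi(a)) \leq \liminf_{\Delta \to 0} L\Big(\sum_{j=1}^{k} \alpha_{g_j}(a)\,\mu(E_j)\,\varphi(g_j)\Big) \leq L(a) \liminf_{\Delta \to 0} \sum_{j=1}^{k} \mu(E_j)\,|\varphi(g_j)| = L(a)\,\|\varphi\|_1.$$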
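The inequality of Lemma 4.3 can be checked by hand in a finite model. The sketch below is an illustration under assumptions that are not in the text: $G$ is the cyclic group $\mathbb{Z}/N$ (a compact group with normalized counting measure), it acts on functions on $\mathbb{Z}/N$ by translation, and $L$ is the translation-invariant seminorm $L(v) = \max_i |v_{i+1} - v_i|$. The function names are hypothetical.

```python
import random

def shift(v, x):
    """Translation action of x in Z/N on a function v on Z/N."""
    n = len(v)
    return [v[(i - x) % n] for i in range(n)]

def alpha_phi(phi, v):
    """alpha_phi(v) = (1/N) * sum_x phi(x) * shift_x(v), with normalized Haar measure."""
    n = len(v)
    out = [0j] * n
    for x in range(n):
        sv = shift(v, x)
        for i in range(n):
            out[i] += phi[x] * sv[i] / n
    return out

def lip(v):
    """A translation-invariant seminorm: the largest jump between neighbours."""
    n = len(v)
    return max(abs(v[(i + 1) % n] - v[i]) for i in range(n))

def l1_norm(phi):
    """L^1 norm of phi with respect to normalized counting measure."""
    return sum(abs(c) for c in phi) / len(phi)

rng = random.Random(0)
v = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(12)]
phi = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(12)]
lhs = lip(alpha_phi(phi, v))        # L(alpha_phi(v))
rhs = l1_norm(phi) * lip(v)         # ||phi||_1 * L(v)
```

In this finite setting the Riemann sums in the proof are exact, so the bound `lhs <= rhs` is just the triangle inequality combined with the invariance of `lip` under `shift`.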
For θ-deformations of course $A$ is $C(M_\theta)$. Notice that $\mathbb{T}^n$ has a natural action $I \otimes \tau$ on $C(M_\theta)$. They will be our $G$ and $\alpha$. The following lemma is a generalization of Lemmas 8.3 and 8.4 in [24].

Lemma 4.4. For any $\varepsilon > 0$ there is a finite subset $J = \bar{J}$ in $\hat{G}$, containing $\gamma_0$, depending only on $l$ and $\varepsilon/C$, such that for any strongly continuous isometric action $\alpha$ on a complex Banach space $V$ with a (possibly $+\infty$-valued) seminorm $L$ on $V$ satisfying conditions (1) and (2) (with $A$ replaced by $V$) in Theorem 4.1, and for any $v \in V$, there is some $v' \in V_J$ with

$$\|v'\| \leq \|v\|, \qquad L(v') \leq L(v), \qquad \text{and} \qquad \|v - v'\| \leq \varepsilon L(v).$$

If $V$ has an isometric involution $*$ invariant under $\alpha$, then when $v$ is self-adjoint we can choose $v'$ also to be self-adjoint.

Proof. Pick $\varphi$ for $l$ and $\varepsilon/C$ as in Lemma 3.4. Then there is a finite subset $J \subseteq \hat{G}$ such that $\varphi$ is a linear combination of characters $\chi_\gamma$ for $\gamma \in J$. Replacing $J$ by $J \cup \bar{J}$, we may assume that $J = \bar{J}$. For any $v \in V$ clearly $\|\alpha_\varphi(v)\| \leq \|\varphi\|_1 \cdot \|v\| = \|v\|$. A simple calculation as in the proof of [24, Lemma 8.3] tells us that

$$\|v - \alpha_\varphi(v)\| \leq L^l(v) \int_G \varphi(x) l(x)\,dx \leq \frac{\varepsilon}{C} L^l(v).$$

Then it follows from condition (1) in Theorem 4.1 that $\|v - \alpha_\varphi(v)\| \leq \varepsilon L(v)$. Also from condition (2) we see that $L(\alpha_\varphi(v)) \leq L(v)$. So for any $v \in V$, the element $v' = \alpha_\varphi(v)$ satisfies the requirement. Notice that $\varphi$ is real-valued, so when $v$ is self-adjoint, so is $\alpha_\varphi(v)$.

Proof of Theorem 4.1. We verify the conditions in Proposition 2.11 for $(A, L)$ to be a compact quantum metric space one by one.

Lemma 4.5. For any $a \in A$ if $L(a) = 0$ then $a$ is a scalar.

Proof. For any $\gamma \in \hat{G}$ by condition (2) we have $L(\alpha_\gamma(a)) \leq \|\dim(\gamma)\chi_\gamma\|_1 \cdot L(a) = 0$. By conditions (3) and (4) we see that $\alpha_\gamma(a) = 0$ for $\gamma \neq \gamma_0$ and that $\alpha_{\gamma_0}(a) \in \mathbb{C}$. Hence $\alpha_\gamma(a - \alpha_{\gamma_0}(a)) = 0$ for all $\gamma \in \hat{G}$. Then Lemma 3.5 tells us that $a = \alpha_{\gamma_0}(a) \in \mathbb{C}$.

Lemma 4.6. For any $R \geq 0$ the ball $D_R(A) = \{a \in A : L(a) \leq 1,\ \|a\| \leq R\}$ is totally bounded.

Proof. For any $\varepsilon > 0$ by Lemma 4.4 there is some finite subset $J \subseteq \hat{G}$ such that
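for every $v \in D_R(A)$ there exists $v' \in D_R(A_J)$ with $\|v - v'\| < \varepsilon$. Let $M = \max\{\|\dim(\gamma)\chi_\gamma\|_1 : \gamma \in J\}$. For any $a = \sum_{\gamma \in J} a_\gamma \in D_R(A_J)$ and $\gamma \in J$ we have $\|a_\gamma\| = \|\alpha_{\dim(\gamma)\overline{\chi_\gamma}}(a)\| \leq \|\dim(\gamma)\chi_\gamma\|_1 \cdot \|a\| \leq M \cdot R$, and by condition (2) $L(a_\gamma) = L(\alpha_{\dim(\gamma)\overline{\chi_\gamma}}(a)) \leq \|\dim(\gamma)\chi_\gamma\|_1 \cdot L(a) \leq M$. Therefore

$$D_R(A_J) \subseteq \Big\{ \sum_{\gamma \in J} a_\gamma \in A_J : a_\gamma \in A_\gamma,\ L(a_\gamma) \leq M,\ \|a_\gamma\| \leq M \cdot R \Big\}.$$

By conditions (3), (4) and Proposition 2.11 the latter set is totally bounded. Then $D_R(A_J)$ is totally bounded. Since $\varepsilon$ is arbitrary, $D_R(A)$ is also totally bounded.

Lemma 4.7. We have

$$\|\cdot\|^{\sim} \leq \Big(r_B + C \int_G l(x)\,dx\Big) L^{\sim}$$

on $A_{sa}/\mathbb{R}e$.

Proof. Let $a \in A_{sa}$ with $L(a) = 1$. Let $\varphi$ be the constant function $\chi_{\gamma_0} = 1$ on $G$. Then $\alpha_\varphi = \alpha_{\gamma_0}$ and $\|\varphi\|_1 = 1$. As in the proof of Lemma 4.4 we have $\alpha_\varphi(a) \in (A^\alpha)_{sa}$ and

$$\|a - \alpha_\varphi(a)\| \leq L^l(a) \int_G \varphi(x) l(x)\,dx \leq C \cdot L(a) \int_G l(x)\,dx = C \int_G l(x)\,dx,$$

where the second inequality comes from condition (1). Let $b = \alpha_\varphi(a)$. By condition (2) we have $L(b) \leq \|\varphi\|_1 \cdot L(a) = 1$. Then by Proposition 2.11,

$$r_B \geq \|\tilde{b}\|^{\sim} \geq \|\tilde{a}\|^{\sim} - \|\tilde{a} - \tilde{b}\|^{\sim} \geq \|\tilde{a}\|^{\sim} - \|a - \alpha_\varphi(a)\| \geq \|\tilde{a}\|^{\sim} - C \int_G l(x)\,dx.$$

Therefore we have $\|\cdot\|^{\sim} \leq (r_B + C \int_G l(x)\,dx) L^{\sim}$.

Now Theorem 4.1 follows from Lemmas 4.5–4.7 and Proposition 2.11 immediately.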
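The averaging estimate used in the proof of Lemma 4.7, $\|a - \alpha_{\gamma_0}(a)\| \leq L^l(a) \int_G l(x)\,dx$, can be seen numerically in the simplest commutative case. The sketch below assumes $G$ is the circle acting on $C(\mathbb{T})$ by translation, $l$ is the arc-length distance to the identity, and $a(p) = \cos p + 1/2$; the uniform grid replacing Haar measure and all names are choices made for the example only.

```python
import math

N = 360  # number of grid points (assumption: a uniform grid approximates the circle)
grid = [2 * math.pi * k / N for k in range(N)]

def a(p):
    # Sample self-adjoint element of C(T): a real-valued function on the circle.
    return math.cos(p) + 0.5

def length(x):
    # Length function l on the circle: arc distance from x to the identity.
    x = x % (2 * math.pi)
    return min(x, 2 * math.pi - x)

def sup_norm(f):
    return max(abs(f(p)) for p in grid)

mean_a = sum(a(p) for p in grid) / N                     # alpha_{gamma_0}(a)
lhs = sup_norm(lambda p: a(p) - mean_a)                  # ||a - alpha_{gamma_0}(a)||
L_l = max(sup_norm(lambda p, x=x: a(p - x) - a(p)) / length(x) for x in grid[1:])
integral_l = sum(length(x) for x in grid) / N            # integral of l over G
rhs = L_l * integral_l
```

Here the left-hand side is 1, the seminorm $L^l(a)$ is close to 1 (the Lipschitz constant of cosine), and the integral of $l$ is $\pi/2$, so the bound holds with room to spare.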
5. Differential Operators and Seminorms

In this section we make preparation for our proof of Theorem 1.1. In Sect. 6 we shall verify the conditions in Theorem 4.1 for $(C(M_\theta), L_\theta, \mathbb{T}^n, I \otimes \tau)$. The seminorm $L^l_\theta$ on $C(M_\theta)$ associated to $I \otimes \tau$ is defined in Definition 5.4. The main difficulty is to verify condition (1). We shall see that it is much more convenient to work on the whole Hilbert space $\mathcal{H}\bar{\otimes}L^2(A_\theta)$ instead of $\mathcal{H}_\theta$. So we have to study the corresponding seminorms $L^D$ and $L^l$ on $C(M) \otimes A_\theta$ (see Definitions 5.3 and 5.4). We prove the comparison formula for $L^D$ and $L^l$ first, in (20). Then we relate them to $L_\theta$ and $L^l_\theta$ by proving (22). The information about these various seminorms is all hidden in differential operators, which involve mainly the theory of LCTVS. Subsections 5.1 and 5.2 are devoted to analyzing these operators.

5.1. Differential operators. In this subsection we assume that $M$ is an oriented Riemannian manifold with an isometric smooth action $\sigma_M$ of $\mathbb{T}^n$. Our aim is to derive the formulas (8), (11) and (12) below.

Let $\mathrm{Cl}^{\mathbb{C}}M$ be the complexified Clifford algebra bundle on $M$. Then its space of smooth sections, $C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M)$, is a LCA containing $C^\infty(M)$ as a central subalgebra, and containing $C^\infty(M, TM^{\mathbb{C}})$ as a subspace, where $TM^{\mathbb{C}}$ is the complexified tangent bundle. Using the Riemannian metric, we can identify $TM$ and $T^*M$ canonically. Then $C^\infty(M, T^*M^{\mathbb{C}}) = C^\infty(M, TM^{\mathbb{C}})$ is also a subspace of $C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M)$. Notice that $C^\infty(M,S)$ is a locally convex left module over $C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M)$. Since $A^\infty_\theta$ is nuclear, the complete tensor products $C^\infty(M)\hat{\otimes}A^\infty_\theta$, $C^\infty(M, TM^{\mathbb{C}})\hat{\otimes}A^\infty_\theta$ and $C^\infty(M, T^*M^{\mathbb{C}})\hat{\otimes}A^\infty_\theta$ can be thought of as complete injective tensor products, and hence are all subspaces of $C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M)\hat{\otimes}A^\infty_\theta$ (see the discussion after Proposition 2.3). In the same way we think of $C(M, T^*M^{\mathbb{C}}) = C(M, TM^{\mathbb{C}})$ as a subspace of $C(M, \mathrm{Cl}^{\mathbb{C}}M)$.
Since the $C^*$-algebraic norm on $\mathrm{Cl}^{\mathbb{C}}(TM_p)$ extends the inner-product norm on the tangent space $TM_p$ for each $p \in M$ (see the discussion after Lemma 2.8), clearly the supremum (possibly $+\infty$-valued) norm on $C(M, \mathrm{Cl}^{\mathbb{C}}M)$ extends that on $C(M, TM)$, which is pointwise the inner-product norm. Clearly the action of $\mathbb{T}^n$ on the bundle $TM$ extends to an action on the bundle $\mathrm{Cl}^{\mathbb{C}}M$. We denote the induced continuous action on $C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M)$ also by $\sigma$. Much as in Sect. 3, we can define

$$C^\infty(M_\theta, \mathrm{Cl}^{\mathbb{C}}M) := (C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M)\hat{\otimes}A^\infty_\theta)^{\sigma\hat{\otimes}\tau^{-1}},$$
$$C^\infty(M_\theta, TM^{\mathbb{C}}) := (C^\infty(M, TM^{\mathbb{C}})\hat{\otimes}A^\infty_\theta)^{\sigma\hat{\otimes}\tau^{-1}},$$
$$C^\infty(M_\theta, T^*M^{\mathbb{C}}) := (C^\infty(M, T^*M^{\mathbb{C}})\hat{\otimes}A^\infty_\theta)^{\sigma\hat{\otimes}\tau^{-1}}.$$
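The differential operator $d : C^\infty(M) \to C^\infty(M, T^*M^{\mathbb{C}})$ is a first-order linear operator, and hence easily seen to be continuous. Then we have the tensor product linear map $d\hat{\otimes}I : C^\infty(M)\hat{\otimes}A^\infty_\theta \to C^\infty(M, T^*M^{\mathbb{C}})\hat{\otimes}A^\infty_\theta$. Notice that $d$ commutes with the action $\sigma$. So $d\hat{\otimes}I$ commutes with $\sigma\hat{\otimes}\tau^{-1}$, and hence maps $C^\infty(M_\theta)$ into $C^\infty(M_\theta, T^*M^{\mathbb{C}})$. The deformed differential $d_\theta$ is then defined to be the restriction of $d\hat{\otimes}I$ to $C^\infty(M_\theta)$.

For any $f \in C^\infty(M)$ we have [15, Lemma II.5.5]

$$[D, f] = df \quad \text{as linear maps on } C^\infty(M,S), \tag{7}$$

where $df \in C^\infty(M, T^*M^{\mathbb{C}}) \subseteq C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M)$ acts on $C^\infty(M,S)$ via the left $C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M)$-module structure of $C^\infty(M,S)$. Then it is easy to see that for any $f \in C^\infty(M) \otimes_{\mathrm{alg}} A^\infty_\theta$ we have $[D \otimes I, f] = (d \otimes I)f$ as linear maps on $C^\infty(M,S) \otimes_{\mathrm{alg}} A^\infty_\theta$. This means that the bilinear maps $(f, \psi) \mapsto [D\hat{\otimes}I, f](\psi)$ and $(f, \psi) \mapsto ((d\hat{\otimes}I)f)(\psi)$ from $W := (C^\infty(M)\hat{\otimes}A^\infty_\theta) \times (C^\infty(M,S)\hat{\otimes}A^\infty_\theta)$ to $C^\infty(M,S)\hat{\otimes}A^\infty_\theta$ coincide on the dense subspace $(C^\infty(M) \otimes_{\mathrm{alg}} A^\infty_\theta) \times (C^\infty(M,S) \otimes_{\mathrm{alg}} A^\infty_\theta)$. Since both of them are (jointly) continuous, they coincide on the whole of $W$. In other words, for any $f \in C^\infty(M)\hat{\otimes}A^\infty_\theta$ we have

$$[D\hat{\otimes}I, f] = (d\hat{\otimes}I)f \quad \text{as linear maps on } C^\infty(M,S)\hat{\otimes}A^\infty_\theta. \tag{8}$$

The canonical $\mathbb{R}$-bilinear pairing $C^\infty(M, TM) \times C^\infty(M, T^*M) \to C^\infty(M)$ extends to a $\mathbb{C}$-bilinear pairing $C^\infty(M, TM^{\mathbb{C}}) \times C^\infty(M, T^*M^{\mathbb{C}}) \to C^\infty(M)$, which is clearly continuous. For any $Y \in C^\infty(M, TM^{\mathbb{C}})$ let $i_Y$ be the corresponding contraction $C^\infty(M, T^*M^{\mathbb{C}}) \to C^\infty(M)$. Then we have the tensor-product map $i_Y\hat{\otimes}I : C^\infty(M, T^*M^{\mathbb{C}})\hat{\otimes}A^\infty_\theta \to C^\infty(M)\hat{\otimes}A^\infty_\theta$. Let $\partial_Y : C^\infty(M) \to C^\infty(M)$ be the derivation with respect to $Y$. Since $\partial_Y$ is a first-order linear operator, it is continuous. Then we also have the tensor-product map $\partial_Y\hat{\otimes}I : C^\infty(M)\hat{\otimes}A^\infty_\theta \to C^\infty(M)\hat{\otimes}A^\infty_\theta$. For any $f \in C^\infty(M)$ it is trivial to see that $\partial_Y(f) = i_Y(df)$. Then for any $f \in C^\infty(M) \otimes_{\mathrm{alg}} A^\infty_\theta$ clearly $(\partial_Y \otimes I)(f) = ((i_Y \otimes I) \circ (d \otimes I))(f)$. By the same argument as for (8), for any $f \in C^\infty(M)\hat{\otimes}A^\infty_\theta$ we then have

$$(\partial_Y\hat{\otimes}I)(f) = ((i_Y\hat{\otimes}I) \circ (d\hat{\otimes}I))(f). \tag{9}$$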
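Since the tracial state $\mathrm{tr} : \mathrm{Cl}^{\mathbb{C}}(TM_p) \to \mathbb{C}$ in Lemma 2.8 is invariant under the action of $SO(TM_p)$ for each $p \in M$, we can use them pointwise to define a linear map $C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M) \to C^\infty(M)$, which is clearly continuous. We denote this map also by $\mathrm{tr}$. Then $\mathrm{tr}$ is still tracial in the sense that $\mathrm{tr}(f \cdot g) = \mathrm{tr}(g \cdot f)$ for any $f, g \in C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M)$. We have the tensor-product linear map $\mathrm{tr}\hat{\otimes}I : C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M)\hat{\otimes}A^\infty_\theta \to C^\infty(M)\hat{\otimes}A^\infty_\theta$. For any $Y \in C^\infty(M, TM^{\mathbb{C}}) \subseteq C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M)$ and $Z \in C^\infty(M, T^*M^{\mathbb{C}}) \subseteq C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M)$, recalling that we have a canonical identification of $C^\infty(M, TM^{\mathbb{C}})$ and $C^\infty(M, T^*M^{\mathbb{C}})$, we get

$$\mathrm{tr}(Y \cdot Z) = \tfrac{1}{2}\,\mathrm{tr}(Y \cdot Z + Z \cdot Y) = \tfrac{1}{2}\,\mathrm{tr}(-2\langle Y, Z \rangle) = -\langle Y, Z \rangle = -i_Y(Z),$$

where $Y \cdot Z$ is the multiplication in $C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M)$, and $\langle \cdot, \cdot \rangle$ is the $C^\infty(M)$-valued $C^\infty(M)$-bilinear pairing on $C^\infty(M, TM^{\mathbb{C}})$. So $i_Y = \mathrm{tr} \circ (-Y)$ on $C^\infty(M, T^*M^{\mathbb{C}})$. Then $i_Y \otimes I = (\mathrm{tr} \otimes I) \circ ((-Y) \otimes 1)$ on $C^\infty(M, T^*M^{\mathbb{C}}) \otimes_{\mathrm{alg}} A^\infty_\theta$. Since both $i_Y\hat{\otimes}I$ and $(\mathrm{tr}\hat{\otimes}I) \circ ((-Y) \otimes 1)$ are continuous maps from $C^\infty(M, T^*M^{\mathbb{C}})\hat{\otimes}A^\infty_\theta$ to $C^\infty(M)\hat{\otimes}A^\infty_\theta$, we get

$$i_Y\hat{\otimes}I = (\mathrm{tr}\hat{\otimes}I) \circ ((-Y) \otimes 1) \tag{10}$$

as maps $C^\infty(M, T^*M^{\mathbb{C}})\hat{\otimes}A^\infty_\theta \to C^\infty(M)\hat{\otimes}A^\infty_\theta$. Combining (9) and (10) together, for any $f \in C^\infty(M)\hat{\otimes}A^\infty_\theta$ we get

$$(\partial_Y\hat{\otimes}I)(f) = ((\mathrm{tr}\hat{\otimes}I) \circ ((-Y) \otimes 1) \circ (d\hat{\otimes}I))(f). \tag{11}$$

Let $\mathrm{Lie}(\mathbb{T}^n)$ be the Lie algebra of $\mathbb{T}^n$. For any $X \in \mathrm{Lie}(\mathbb{T}^n)$ we denote by $X^\#$ the vector field on $M$ generated by $X$.

Lemma 5.1. For any $X \in \mathrm{Lie}(\mathbb{T}^n)$ and any $f \in C^\infty(M)\hat{\otimes}A^\infty_\theta$ we have

$$\lim_{t \to 0} \frac{(\sigma_{e^{tX}}\hat{\otimes}I)(f) - f}{t} = (\partial_{-X^\#}\hat{\otimes}I)(f). \tag{12}$$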
Proof. For any $f \in C^\infty(M)$ and $x \in \mathbb{T}^n$ clearly

$$\partial_{-X^\#}(f) = \lim_{t \to 0} \frac{\sigma_{e^{tX}}(f) - f}{t},$$
$$\partial_{-X^\#}(\sigma_x(f)) = \lim_{t \to 0} \frac{\sigma_{e^{tX}}(\sigma_x(f)) - \sigma_x(f)}{t} = \lim_{t \to 0} \sigma_x\Big(\frac{\sigma_{e^{tX}}(f) - f}{t}\Big) = \sigma_x(\partial_{-X^\#}(f)),$$

where the limits are taken with respect to the locally convex topology in $C^\infty(M)$. (Here we have $-X^\#$ instead of $X^\#$ in the first equation because $(\sigma_{e^{tX}}(f))(p) = f(\sigma_{e^{-tX}}(p))$ for any $p \in M$.) So we see that the map $t \mapsto \partial_{-X^\#}(\sigma_{e^{tX}}(f))$ is continuous. When $M$ is compact, we know that

$$\sigma_{e^{tX}}(f) - f = \int_0^t \partial_{-X^\#}(\sigma_{e^{sX}}(f))\,ds = \int_0^t \sigma_{e^{sX}}(\partial_{-X^\#}(f))\,ds, \tag{13}$$

where the integral is taken with respect to the supremum norm topology in $C(M)$. Notice that the inclusion $C^\infty(M) \to C(M)$ is continuous when $C^\infty(M)$ is endowed with the locally convex topology and $C(M)$ is endowed with the norm topology. By Proposition 2.6 the integral $\int_0^t \sigma_{e^{sX}}(\partial_{-X^\#}(f))\,ds$ is also defined in $C^\infty(M)$, and is mapped to the corresponding integral in $C(M)$ under the inclusion $C^\infty(M) \to C(M)$. Therefore we see that (13) also holds with respect to the locally convex topology in $C^\infty(M)$. For noncompact $M$, since the locally convex topology on $C^\infty(M)$ is defined using seminorms from compact subsets of local trivializations, it is easy to see that (13) still holds. Now for any $f \in C^\infty(M) \otimes_{\mathrm{alg}} A^\infty_\theta$ clearly we have

$$(\sigma_{e^{tX}} \otimes I)(f) - f = \int_0^t (\sigma_{e^{sX}} \otimes I)((\partial_{-X^\#} \otimes I)(f))\,ds$$

in $C^\infty(M)\hat{\otimes}A^\infty_\theta$. For fixed $X$ notice that $f \mapsto (\sigma_{e^{tX}}\hat{\otimes}I)(f) - f$ is a continuous map from $C^\infty(M)\hat{\otimes}A^\infty_\theta$ to itself. It is also easy to see that both $f \mapsto (\partial_{-X^\#}\hat{\otimes}I)(f)$ and $f \mapsto \int_0^t (\sigma_{e^{sX}}\hat{\otimes}I)(f)\,ds$ are continuous maps from $C^\infty(M)\hat{\otimes}A^\infty_\theta$ to itself. So the map $f \mapsto \int_0^t (\sigma_{e^{sX}}\hat{\otimes}I)((\partial_{-X^\#}\hat{\otimes}I)(f))\,ds$ from $C^\infty(M)\hat{\otimes}A^\infty_\theta$ to itself is continuous. Therefore, for any $f \in C^\infty(M)\hat{\otimes}A^\infty_\theta$ we have

$$(\sigma_{e^{tX}}\hat{\otimes}I)(f) - f = \int_0^t (\sigma_{e^{sX}}\hat{\otimes}I)((\partial_{-X^\#}\hat{\otimes}I)(f))\,ds.$$

Now (12) follows from Proposition 2.7.
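The sign in (12) is easy to lose track of, so here is a small numerical check in the simplest case. It assumes $M$ is the circle with $\mathbb{T}$ acting on itself by rotation, so that $X^\# = d/dp$ and the induced action on functions is $(\sigma_{e^{tX}} f)(p) = f(p - t)$; the names `sigma` and `difference_quotient` are hypothetical and the finite-difference step is an arbitrary choice.

```python
import math

def sigma(t, f):
    """Induced action on functions: (sigma_t f)(p) = f(p - t)."""
    return lambda p: f(p - t)

def difference_quotient(f, t, p):
    """(sigma_t(f) - f) / t evaluated at the point p."""
    return (sigma(t, f)(p) - f(p)) / t

p0 = 0.7
approx = difference_quotient(math.sin, 1e-6, p0)   # finite-difference derivative of the action
exact = -math.cos(p0)                              # (partial_{-X#} f)(p0) for f = sin
```

The difference quotient approaches $-f'(p_0)$ rather than $f'(p_0)$, which is the reason $-X^\#$ appears on the right-hand side of (12).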
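5.2. Seminorms. In this subsection we assume that $M$ is an $m$-dimensional compact Spin manifold, and that the action $\sigma_M$ lifts to an action on $S$. Notice that the fibres of $\mathrm{Cl}^{\mathbb{C}}M$ are all isomorphic to the $C^*$-algebra $\mathrm{Cl}^{\mathbb{C}}(\mathbb{R}^m)$, where $\mathbb{R}^m$ is the standard $m$-dimensional Euclidean space. Clearly $C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M)$ generates a continuous field of $C^*$-algebras [9, Sect. 10.3] over $M$ with continuous sections $\Gamma = C(M, \mathrm{Cl}^{\mathbb{C}}M)$. Recall that $\mathcal{H}$ is the Hilbert space completion of $C^\infty(M,S)$. So the algebra $C(M, \mathrm{Cl}^{\mathbb{C}}M)$ has a natural faithful representation on $\mathcal{H}$. It is easy to see that the inclusion $C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M) \to C(M, \mathrm{Cl}^{\mathbb{C}}M)$ is continuous with respect to the locally convex topology on $C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M)$ and the norm topology on $C(M, \mathrm{Cl}^{\mathbb{C}}M)$. Just as in the case of $\Phi : C^\infty(M)\hat{\otimes}A^\infty_\theta \to C(M) \otimes A_\theta$ in Sect. 3, we have a $\mathbb{T}^n$-equivariant continuous linear map $C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M)\hat{\otimes}A^\infty_\theta \to C(M, \mathrm{Cl}^{\mathbb{C}}M) \otimes A_\theta$ extending this former one. We still denote it by $\Phi$. As in Lemmas 3.1 and 3.10, $\Phi$ is in fact injective. Clearly $\Phi$ is a $*$-algebra homomorphism. Let $C(M_\theta, \mathrm{Cl}^{\mathbb{C}}M_\theta)$ be $(C(M, \mathrm{Cl}^{\mathbb{C}}M) \otimes A_\theta)^{\sigma \otimes \tau^{-1}}$. We also have the homomorphism $C^\infty(M_\theta, \mathrm{Cl}^{\mathbb{C}}M_\theta) \to B(\mathcal{H}_\theta)$, which we still denote by $\Phi_\theta$.

Proposition 5.2. For any $f \in C^\infty(M)\hat{\otimes}A^\infty_\theta$ the domain of $D^{L^2}$ is stable under $\Phi(f)$, and

$$[D^{L^2}, \Phi(f)] = \Phi((d\hat{\otimes}I)f). \tag{14}$$

When $f$ is in $C^\infty(M_\theta)$, the domain of $D^{L^2}_\theta$ is stable under $\Phi_\theta(f)$, and

$$[D^{L^2}_\theta, \Phi_\theta(f)] = \Phi_\theta(d_\theta f). \tag{15}$$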
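Proof. By Lemma 2.3 $C^\infty(M,S)\hat{\otimes}A^\infty_\theta$ is a locally convex left module over the algebra $C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M)\hat{\otimes}A^\infty_\theta$. So we have the continuous maps:

$$(C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M)\hat{\otimes}A^\infty_\theta) \times (C^\infty(M,S)\hat{\otimes}A^\infty_\theta) \to C^\infty(M,S)\hat{\otimes}A^\infty_\theta \xrightarrow{\ \Psi\ } \mathcal{H}\bar{\otimes}L^2(A_\theta).$$

On the other hand, we have continuous maps:

$$(C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M)\hat{\otimes}A^\infty_\theta) \times (C^\infty(M,S)\hat{\otimes}A^\infty_\theta) \xrightarrow{\ \Phi \times \Psi\ } B(\mathcal{H}\bar{\otimes}L^2(A_\theta)) \times \mathcal{H}\bar{\otimes}L^2(A_\theta) \to \mathcal{H}\bar{\otimes}L^2(A_\theta).$$

The two compositions coincide on $(C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M) \otimes_{\mathrm{alg}} A^\infty_\theta) \times (C^\infty(M,S) \otimes_{\mathrm{alg}} A^\infty_\theta)$. So they coincide on the whole of $(C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M)\hat{\otimes}A^\infty_\theta) \times (C^\infty(M,S)\hat{\otimes}A^\infty_\theta)$. In other words, for any $f \in C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M)\hat{\otimes}A^\infty_\theta$ and any $\psi \in C^\infty(M,S)\hat{\otimes}A^\infty_\theta$ we have

$$\Phi(f) \cdot \Psi(\psi) = \Psi(f\psi). \tag{16}$$

Then for any $f \in C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M)\hat{\otimes}A^\infty_\theta$ and $\psi \in C^\infty(M,S)\hat{\otimes}A^\infty_\theta$ we have

$$\begin{aligned}
\Psi([D\hat{\otimes}I, f](\psi)) &= \Psi((D\hat{\otimes}I)(f\psi) - f((D\hat{\otimes}I)\psi)) \\
&\overset{(5)}{=} D^{L^2}(\Psi(f\psi)) - \Psi(f((D\hat{\otimes}I)\psi)) \\
&\overset{(16)}{=} D^{L^2}((\Phi(f))(\Psi(\psi))) - \Phi(f) \cdot \Psi((D\hat{\otimes}I)\psi) \\
&\overset{(5)}{=} D^{L^2}((\Phi(f))(\Psi(\psi))) - \Phi(f)(D^{L^2}(\Psi(\psi))) = [D^{L^2}, \Phi(f)](\Psi(\psi)).
\end{aligned}$$

So for any $f \in C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M)\hat{\otimes}A^\infty_\theta$ we have

$$\Psi \circ [D\hat{\otimes}I, f] = [D^{L^2}, \Phi(f)] \circ \Psi \tag{17}$$

as linear maps from $C^\infty(M,S)\hat{\otimes}A^\infty_\theta$ to $\mathcal{H}\bar{\otimes}L^2(A_\theta)$. When $f$ is in $C^\infty(M)\hat{\otimes}A^\infty_\theta$ we also have

$$\Psi \circ [D\hat{\otimes}I, f] \overset{(8)}{=} \Psi \circ ((d\hat{\otimes}I)f) \overset{(16)}{=} \Phi((d\hat{\otimes}I)f) \circ \Psi.$$

Therefore, for any $f \in C^\infty(M)\hat{\otimes}A^\infty_\theta$ we have

$$[D^{L^2}, \Phi(f)] \circ \Psi = \Phi((d\hat{\otimes}I)f) \circ \Psi. \tag{18}$$

For any $z$ in the domain of $D^{L^2}$ take a net $\psi_j$ in $C^\infty(M,S)\hat{\otimes}A^\infty_\theta$ with $\Psi(\psi_j) \to z$ and $D^{L^2}(\Psi(\psi_j)) \to D^{L^2}(z)$. Then

$$\begin{aligned}
D^{L^2}((\Phi(f))(\Psi(\psi_j))) &\overset{(18)}{=} (\Phi(f))(D^{L^2}(\Psi(\psi_j))) + (\Phi((d\hat{\otimes}I)(f)))(\Psi(\psi_j)) \\
&\to (\Phi(f))(D^{L^2}(z)) + (\Phi((d\hat{\otimes}I)(f)))(z),
\end{aligned}$$

and $(\Phi(f))(\Psi(\psi_j)) \to (\Phi(f))(z)$. So $(\Phi(f))(z)$ is in the domain of $D^{L^2}$, and

$$D^{L^2}((\Phi(f))(z)) = (\Phi(f))(D^{L^2}(z)) + (\Phi((d\hat{\otimes}I)(f)))(z).$$

Therefore the domain of $D^{L^2}$ is stable under $\Phi(f)$, and $[D^{L^2}, \Phi(f)] = \Phi((d\hat{\otimes}I)f)$. The assertions about $C^\infty(M_\theta)$ follow from those about $C^\infty(M)\hat{\otimes}A^\infty_\theta$.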
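By Proposition 5.2 we see that the commutator $[D^{L^2}, f]$ is bounded for any $f \in \Phi(C^\infty(M)\hat{\otimes}A^\infty_\theta)$. Corresponding to $L_\theta$ defined in Definition 3.11 we have:

Definition 5.3. We define a seminorm, denoted by $L^D$, on $C(M) \otimes A_\theta$ by

$$L^D(f) := \begin{cases} \|[D^{L^2}, f]\|, & \text{if } f \in \Phi(C^\infty(M)\hat{\otimes}A^\infty_\theta); \\ +\infty, & \text{otherwise}. \end{cases}$$

Fix an inner product on $\mathrm{Lie}(\mathbb{T}^n)$, and use it to get a translation-invariant Riemannian metric on $\mathbb{T}^n$ in the usual way. We get a length function $l$ on $\mathbb{T}^n$ by setting $l(x)$ to be the geodesic distance from $x$ to $e_{\mathbb{T}^n}$ for $x \in \mathbb{T}^n$. Notice that $I \otimes \tau = \sigma \otimes I$ is a nontrivial action of $\mathbb{T}^n$ on $C(M_\theta)$. To make use of Theorem 4.1 we define two seminorms:

Definition 5.4. We define a (possibly $+\infty$-valued) seminorm $L^l$ on $C(M) \otimes A_\theta$ for the action $\sigma \otimes I$ via (6):

$$L^l(f) := \sup\Big\{ \frac{\|(\sigma \otimes I)_x(f) - f\|}{l(x)} : x \in \mathbb{T}^n,\ x \neq e_{\mathbb{T}^n} \Big\}.$$

We also define a (possibly $+\infty$-valued) seminorm $L^l_\theta$ on $C(M_\theta)$ for the action $I \otimes \tau$:

$$L^l_\theta(f) := \sup\Big\{ \frac{\|(I \otimes \tau)_x(f) - f\|}{l(x)} : x \in \mathbb{T}^n,\ x \neq e_{\mathbb{T}^n} \Big\}.$$

Then

$$L^l_\theta = L^l \tag{19}$$

on $C(M_\theta)$, because there $I \otimes \tau = \sigma \otimes I$. Our first key technical fact is the following comparison between $L^l$ and $L^D$:

Proposition 5.5. Let $C$ be the norm of the linear map $X \mapsto X^\#$ from $\mathrm{Lie}(\mathbb{T}^n)$ to $C^\infty(M, TM) \subseteq C(M, \mathrm{Cl}^{\mathbb{C}}M)$. Then on $C(M) \otimes A_\theta$ we have

$$L^l \leq C \cdot L^D. \tag{20}$$

Proof. Let $X \in \mathrm{Lie}(\mathbb{T}^n)$. For any $f \in C^\infty(M)\hat{\otimes}A^\infty_\theta$ we have

$$\begin{aligned}
(\Phi \circ (\partial_{-X^\#}\hat{\otimes}I))(f) &\overset{(12)}{=} \Phi\Big(\lim_{t \to 0} \frac{(\sigma_{e^{tX}}\hat{\otimes}I)(f) - f}{t}\Big) \\
&= \lim_{t \to 0} \Phi\Big(\frac{(\sigma_{e^{tX}}\hat{\otimes}I)(f) - f}{t}\Big) \\
&= \lim_{t \to 0} \frac{(\sigma_{e^{tX}} \otimes I)(\Phi(f)) - \Phi(f)}{t}.
\end{aligned}$$

It follows immediately that $\Phi(f)$ is once-differentiable with respect to the action $\sigma \otimes I$. In fact, $\Phi(f)$ is easily seen to be smooth for the action $\sigma \otimes I$, though we don't need this fact here. By [24, Prop. 8.6]

$$L^l(\Phi(f)) = \sup_{\|X\| = 1} \Big\| \lim_{t \to 0} \frac{(\sigma_{e^{tX}} \otimes I)(\Phi(f)) - \Phi(f)}{t} \Big\|.$$

Then we get

$$L^l(\Phi(f)) = \sup_{\|X\| = 1} \|(\Phi \circ (\partial_{-X^\#}\hat{\otimes}I))(f)\| \overset{(11)}{=} \sup_{\|X\| = 1} \|(\Phi \circ (\mathrm{tr}\hat{\otimes}I) \circ ((-X^\#) \otimes 1) \circ (d\hat{\otimes}I))(f)\|.$$

Notice that the linear map $\mathrm{tr} : C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M) \to C^\infty(M)$ extends to $C(M, \mathrm{Cl}^{\mathbb{C}}M) \to C(M)$, which we still denote by $\mathrm{tr}$. By Lemma 2.8 the map $\mathrm{tr} : \mathrm{Cl}^{\mathbb{C}}(\mathbb{R}^m) \to \mathbb{C}$ is positive. Then so is $\mathrm{tr} : C(M, \mathrm{Cl}^{\mathbb{C}}M) \to C(M)$. Since $C(M)$ is commutative, $\mathrm{tr} : C(M, \mathrm{Cl}^{\mathbb{C}}M) \to C(M)$ is completely positive [10, Lemma 5.1.4]. Then we have the tensor-product completely positive map [14, Prop. 8.2] $\mathrm{tr} \otimes I : C(M, \mathrm{Cl}^{\mathbb{C}}M) \otimes A_\theta \to C(M) \otimes A_\theta$. Consequently, we have $\|\mathrm{tr} \otimes I\| = \|(\mathrm{tr} \otimes I)(1 \otimes 1)\| = 1$ [10, Lemma 5.1.1]. In fact, $\mathrm{tr} \otimes I$ is easily seen to be a conditional expectation in the sense of [13, Exercise 8.7.23], though we don't need this fact here. Clearly

$$(\mathrm{tr} \otimes I) \circ \Phi = \Phi \circ (\mathrm{tr}\hat{\otimes}I) \tag{21}$$

holds on $C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M) \otimes_{\mathrm{alg}} A^\infty_\theta$. Since both maps here are continuous, (21) holds on the whole of $C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M)\hat{\otimes}A^\infty_\theta$. For any $Y \in C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M) \subseteq C(M, \mathrm{Cl}^{\mathbb{C}}M)$, we have

$$\begin{aligned}
\|(\Phi \circ (\mathrm{tr}\hat{\otimes}I) \circ ((-Y) \otimes 1) \circ (d\hat{\otimes}I))(f)\| &\overset{(21)}{=} \|((\mathrm{tr} \otimes I) \circ \Phi \circ ((-Y) \otimes 1) \circ (d\hat{\otimes}I))(f)\| \\
&\leq \|(\Phi \circ ((-Y) \otimes 1) \circ (d\hat{\otimes}I))(f)\| = \|((-Y) \otimes 1) \cdot \Phi((d\hat{\otimes}I)(f))\| \\
&\leq \|(-Y) \otimes 1\| \cdot \|\Phi((d\hat{\otimes}I)(f))\| = \|Y\| \cdot \|\Phi((d\hat{\otimes}I)(f))\|.
\end{aligned}$$

Recall that $X^\# \in C^\infty(M, TM) \subseteq C^\infty(M, \mathrm{Cl}^{\mathbb{C}}M)$. Therefore

$$\begin{aligned}
L^l(\Phi(f)) &= \sup_{\|X\| = 1} \|(\Phi \circ (\mathrm{tr}\hat{\otimes}I) \circ ((-X^\#) \otimes 1) \circ (d\hat{\otimes}I))(f)\| \\
&\leq \sup_{\|X\| = 1} \|X^\#\| \cdot \|\Phi((d\hat{\otimes}I)(f))\| = C\,\|\Phi((d\hat{\otimes}I)(f))\| \\
&\overset{(14)}{=} C\,\|[D^{L^2}, \Phi(f)]\| = C \cdot L^D(\Phi(f))
\end{aligned}$$

as desired.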
5.3. Restriction map. Our goal in this subsection is to prove the second key technical fact:

Proposition 5.6. The restriction map from $C(M_\theta, \mathrm{Cl}^{\mathbb{C}}M)$ to $B(\mathcal{H}_\theta)$ is isometric. In particular, for any $f \in C^\infty(M_\theta, \mathrm{Cl}^{\mathbb{C}}M)$ we have

$$\|\Phi(f)\| = \|\Phi_\theta(f)\|. \tag{22}$$

First of all, Proposition 5.6 justifies our way of taking $C(M_\theta)$ as a subalgebra of $B(\mathcal{H}_\theta)$ via restriction to $\mathcal{H}_\theta$. Secondly, it enables us to compute $L_\theta$ using our seminorm $L^D$ in Subsect. 5.2, and hence to compare it with $L^l_\theta$:

Corollary 5.7. On $C(M_\theta)$ we have

$$L_\theta = L^D, \tag{23}$$

and

$$L^l_\theta \leq C \cdot L_\theta. \tag{24}$$

Proof. We prove (23) first. Since $\Phi$ is injective it suffices to show (23) on $\Phi(C^\infty(M_\theta))$. For any $f \in C^\infty(M_\theta)$ we have

$$L^D(\Phi(f)) = \|[D^{L^2}, \Phi(f)]\| \overset{(14)}{=} \|\Phi((d\hat{\otimes}I)f)\| = \|\Phi(d_\theta f)\| \overset{(22)}{=} \|\Phi_\theta(d_\theta f)\| \overset{(15)}{=} \|[D^{L^2}_\theta, \Phi_\theta(f)]\| = L_\theta(\Phi_\theta(f)),$$

which yields (23). Then on $C(M_\theta)$ we have

$$L^l_\theta \overset{(19)}{=} L^l \overset{(20)}{\leq} C \cdot L^D \overset{(23)}{=} C \cdot L_\theta.$$
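Instead of proving Proposition 5.6 directly, we shall prove a slightly more general form. Let $A$ be a unital $C^*$-algebra with a strongly continuous action $\sigma$ of $\mathbb{T}^n$, which we shall set to be $C(M, \mathrm{Cl}^{\mathbb{C}}M)$ later. Assume that $A \subseteq B(\mathcal{H})$ and that $\mathbb{T}^n$ has a strongly continuous unitary representation on $\mathcal{H}$, which we still denote by $\sigma$, such that the action $\sigma$ on $A$ is induced by conjugation. Then the $C^*$-algebraic spatial tensor product $A \otimes A_\theta$ [31, Appendix T.5] acts on $\mathcal{H}\bar{\otimes}L^2(A_\theta)$ faithfully. For any $q \in \mathbb{Z}^n = \widehat{\mathbb{T}^n}$ let $(\mathcal{H}\bar{\otimes}L^2(A_\theta))_q$ be the $q$-isotypic subspace of $\mathcal{H}\bar{\otimes}L^2(A_\theta)$ for the action $\sigma\bar{\otimes}\tau^{-1}$. Notice that $(\mathcal{H}\bar{\otimes}L^2(A_\theta))_q$ is stable under the action of $(A \otimes A_\theta)^{\sigma \otimes \tau^{-1}}$ for each $q \in \mathbb{Z}^n$.

Proposition 5.8. For any $f \in (A \otimes A_\theta)^{\sigma \otimes \tau^{-1}}$ and $q \in \mathbb{Z}^n$ we have

$$\|f\| = \|f|_{(\mathcal{H}\bar{\otimes}L^2(A_\theta))_q}\|, \tag{25}$$

where $(\mathcal{H}\bar{\otimes}L^2(A_\theta))_q$ is the $q$-isotypic component of $\mathcal{H}\bar{\otimes}L^2(A_\theta)$ under $\sigma\bar{\otimes}\tau^{-1}$.

Proof. Think of $-\theta q$ as an element of $\mathbb{T}^n$ via the natural projection $\mathbb{R}^n \to \mathbb{R}^n/\mathbb{Z}^n = \mathbb{T}^n$. For any $p \in \mathbb{Z}^n$, recalling the skew-symmetric bicharacter $\omega_\theta$ in Sect. 2, we have

$$u_q u_p u_{-q} = \omega_\theta(q, p)\,\omega_\theta(q + p, -q)\,u_p = \omega_\theta(q, 2p)\,u_p = \langle p, -\theta q \rangle u_p = \tau_{-\theta q}(u_p).$$

It follows immediately that for any $b \in A_\theta$ we have $u_q b u_{-q} = \tau_{-\theta q}(b)$. Consequently, for any $f \in A \otimes A_\theta$ we have $(1 \otimes u_q) f (1 \otimes u_{-q}) = (I \otimes \tau)_{-\theta q}(f)$. Therefore

$$(1 \otimes u_q) \circ f \circ (1 \otimes u_{-q}) = (I\bar{\otimes}\tau)_{-\theta q} \circ f \circ (I\bar{\otimes}\tau)_{\theta q} \tag{26}$$

on $\mathcal{H}\bar{\otimes}L^2(A_\theta)$. Clearly $1 \otimes u_{-q}$ is in the $q$-isotypic component of $A \otimes A_\theta$ under $\sigma \otimes \tau^{-1}$. So $1 \otimes u_{-q}$ restricted to $(\mathcal{H}\bar{\otimes}L^2(A_\theta))_0$ is a unitary map onto $(\mathcal{H}\bar{\otimes}L^2(A_\theta))_q$. Since $I\bar{\otimes}\tau$ and $\sigma\bar{\otimes}\tau^{-1}$ commute with each other, $I\bar{\otimes}\tau$ preserves $(\mathcal{H}\bar{\otimes}L^2(A_\theta))_q$. Thus (26) tells us that for any $f \in (A \otimes A_\theta)^{\sigma \otimes \tau^{-1}}$ the two restrictions $f|_{(\mathcal{H}\bar{\otimes}L^2(A_\theta))_0}$ and $f|_{(\mathcal{H}\bar{\otimes}L^2(A_\theta))_q}$ are unitarily conjugate to each other. Hence $\|f|_{(\mathcal{H}\bar{\otimes}L^2(A_\theta))_0}\| = \|f|_{(\mathcal{H}\bar{\otimes}L^2(A_\theta))_q}\|$ for all $q \in \mathbb{Z}^n$. Then (25) follows immediately.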
Now Proposition 5.6 is just a consequence of Proposition 5.8 applied to A = C(M, Cl C M). 6. Proof of Theorem 1.1 In this section we prove Theorem 1.1 by verifying the conditions in Theorem 4.1 for the quadruple (C(Mθ ), Lθ , Tn , I ⊗ τ ). Clearly Lθ satisfies the reality condition (3). Condition (1) is already verified in (24). ˆ . Notice that α is in fact an action of Tn on Let α = I ⊗ τ , and let αˆ = I ⊗τ C(M) ⊗ Aθ , under which C(Mθ ) is stable. For any f ∈ C(Mθ ) and any continuous function ϕ : Tn → C clearly αϕ (f ) doesn’t depend on whether we think of f as being in C(Mθ ) or C(M) ⊗ Aθ , where αϕ is the linear map on C(M) ⊗ Aθ or C(Mθ ) defined in Lemma 3.2. Now we verify condition (2): ∞ ˆ ∞ Proposition 6.1. Let ϕ ∈ C(Tn ). Then (C ∞ (M)⊗A θ ) and θ (C (Mθ )) are both stable under αϕ . We have
LD ◦ αϕ ≤ ϕ 1 ·LD
(27)
Lθ ◦ αϕ ≤ ϕ 1 ·Lθ
(28)
on C(M) ⊗ Aθ , and on C(Mθ ). ˆ ∞ ˆ ϕ (f )) ∈ Proof. For any f ∈ C ∞ (M)⊗A θ by Lemma 3.3 we have αϕ ((f )) = (α ∞ ) is stable under α . For any g ∈ C ∞ (M ) by ∞ (M)⊗A ˆ ∞ ˆ ). So (C (C ∞ (M, S)⊗A ϕ θ θ θ Lemma 3.3 we have αˆ ϕ (g) ∈ C ∞ (Mθ ). Then αϕ (θ (g)) = αϕ ((g)) = (αˆ ϕ (g)) ∈ (C ∞ (Mθ )) = θ (C ∞ (Mθ )). So θ (C ∞ (Mθ )) is also stable under αϕ . 2 2 ¯ , and hence DθL is invariNotice that D L is invariant under the conjugation of σ ⊗I ¯ to Hθ . Then clearly LD and Lθ ant under the conjugation of the restriction of σ ⊗I are invariant under α. Also notice that seminorms defined by commutators are lower semicontinuous [23, Prop. 3.7]. Then (27) and (28) follow from Remark 4.2(3). n let (C(Mθ ))q be We proceed to verify conditions (3) and (4). For each q ∈ Zn = T the q-isotypic component of C(Mθ ) under α throughout the rest of this section. Also let (C(M))q and (C ∞ (M))q be the q-isotypic components of C(M) and C ∞ (M) under σ . We need: Lemma 6.2. For each q ∈ Zn we have (C(Mθ ))q = (C(M))q ⊗ uq ,
(29)
and
(C(Mθ ))q ∩ θ (C ∞ (Mθ )) = (C ∞ (M))q ⊗ uq . (30)
Proof. Let $V = C^\infty(M) \hat\otimes A_\theta^\infty$, and let $W = C(M) \otimes A_\theta$. Let $V_q$ and $W_q$ be the q-isotypic components of V and W under $\hat\alpha$ and $\alpha$ respectively. By arguments similar to those in Lemma 3.1, we have $V_q = C^\infty(M) \otimes u_q$ and $W_q = C(M) \otimes u_q$. Then
$$(C(M_\theta))_q = W_q \cap W^{\sigma \otimes \tau^{-1}} = (C(M) \otimes u_q)^{\sigma \otimes \tau^{-1}} = (C(M))_q \otimes u_q .$$
Since is injective, we also have ˆ −1 ˆ −1 (C(Mθ ))q ∩ θ (C ∞ (Mθ )) = Vq ∩ V σ ⊗τ = (C ∞ (M) ⊗ uq )σ ⊗τ = ((C ∞ (M))q ⊗ uq ) = (C ∞ (M))q ⊗ uq .
The geodesic distance on M defines a seminorm Lρ on C(M) via (1). This makes C(M) into a compact quantum metric space (see the discussion after Lemma 4.6 in [24]). Let rM be the radius. Define a new seminorm L on C(M) by L = Lρ on C ∞ (M), and L = +∞ on C(M)\C ∞ (M). Since L ≥ Lρ , by Proposition 2.11 clearly L is also a Lip-norm and has radius no bigger than rM . It is well known (cf. the proof of [4, Prop. 1]) that, by (7),
L(f ) = ‖df ‖ = ‖[D, f ]‖ (31)
for all f ∈ C ∞ (M), where we denote the closure of D on H also by D. Notice that for any f = fq ⊗ uq ∈ (C ∞ (M))q ⊗ uq we have, using (31),
LD (f ) = ‖[D^{L^2}, f ]‖ = ‖[D, fq ] ⊗ uq ‖ = ‖[D, fq ]‖ = L(fq ).
Combining this with (23), we get
Lθ (fq ⊗ uq ) = L(fq ) (32)
for fq ⊗ uq ∈ (C ∞ (M))q ⊗ uq . From (32), (29) and (30) we see that Lθ restricted to (C(Mθ ))q can be identified with L restricted to (C(M))q . Then conditions (3) and (4) of Theorem 4.1 follow immediately. Then Theorem 1.1 is just a consequence of Theorem 4.1 applied to (C(Mθ ), Lθ , Tn , α). We also see that (C(Mθ ), Lθ ) has radius no bigger than rM + C ∫_{Tn} l(x) dx.
Acknowledgements. This is part of my Ph.D. dissertation submitted to UC Berkeley in 2002. I am indebted to my advisor, Professor Marc Rieffel, for many helpful discussions, suggestions, and for his support throughout my time at Berkeley. I also thank Thomas Hadfield and Frédéric Latrémolière for valuable conversations.
References 1. Bonechi, F., Ciccoli, N., Tarlini, M.: Noncommutative instantons on the 4-Sphere from quantum groups. Commun. Math. Phys. 226, no. 2, 419–432 (2002) 2. Connes, A.: C ∗ -alg`ebres et g´eom´etrie diff´erentielle. C. R. Acad. Sci. Paris S´er. A-B 290, no. 13, A599–A604 (1980) ´ 3. Connes, A.: Noncommutative differential geometry. Inst. Hautes Etudes Sci. Publ. Math. No. 62, 257–360 (1985) 4. Connes, A.: Compact metric spaces, Fredholm modules, and hyperfiniteness. Ergodic Theory Dynam. Systems 9, no. 2, 207–220 (1989) 5. Connes, A.: Non-commutative Geometry. San Diego, CA: Academic Press, Inc., 1994 6. Connes, A.: Gravity coupled with matter and the foundation of noncommutative geometry. Commun. Math. Phys. 182, no. 1, 155–176 (1996)
7. Connes, A., Dubois-Violette, M.: Noncommutative finite-dimensional manifolds. I. Spherical manifolds and related examples. Commun. Math. Phys. 230, no. 3, 539–579 (2002) 8. Connes, A., Landi, G.: Noncommutative manifolds, the instanton algebra and isospectral deformations. Commun. Math. Phys. 221, no. 1, 141–159 (2001) 9. Dixmier, J.: C ∗ -algebras. Translated from the French by Francis Jellett. North-Holland Mathematical Library, Vol. 15. Amsterdam-New York-Oxford: North-Holland Publishing Co., 1977 10. Effros, E.G., Ruan, Z.-J.: Operator Spaces. London Mathematical Society Monographs. New Series, 23. New York: The Clarendon Press, Oxford University Press, 2000 11. Gilbert, J.E., Murray, M.A.M.: Clifford Algebras and Dirac Operators in Harmonic Analysis. Cambridge Studies in Advanced Mathematics, 26. Cambridge: Cambridge University Press, 1991 12. Jost, J.: Riemannian Geometry and Geometric Analysis. Second edition. Universitext. Berlin: Springer-Verlag, 1998 13. Kadison, R.V., Ringrose, J.R.: Fundamentals of the Theory of Operator Algebras. Vol. II. Advanced Theory. Corrected reprint of the 1986 original. Graduate Studies in Mathematics, 16. Providence, RI: American Mathematical Society, 1997 14. Lance, E.C.: Hilbert C ∗ -modules. A Toolkit for Operator Algebraists. London Mathematical Society Lecture Note Series, 210. Cambridge: Cambridge University Press, 1995 15. Lawson, H.B. Jr., Michelsohn, M.-L.: Spin Geometry. Princeton Mathematical Series, 38. Princeton, NJ: Princeton University Press, 1989 16. Li, H.: Order-unit quantum Gromov-Hausdorff distance. http://arxiv.org/list/math.OA/0312001, 2003 17. Masuda, T., Nakagami, Y., Watanabe, J.: Noncommutative differential geometry on the quantum two sphere of Podle´s. I. An algebraic viewpoint. K-Theory 5, no. 2, 151–175 (1991) 18. Ozawa, N., Rieffel, M.A.: Hyperbolic group C ∗ -algebras and free-product C ∗ -algebras as compact quantum metric spaces. Canad. J. Math. to appear. 
http://arxiv.org/list/OA/0302310, 2003 19. Rieffel, M.A.: Projective modules over higher dimensional noncommutative tori. Canad. J. Math. 40, no. 2, 257–338 (1988) 20. Rieffel, M.A.: Non-commutative tori—a case study of non-commutative differentiable manifolds. In: Geometric and Topological Invariants of Elliptic Operators (Brunswick, ME, 1988), Contemp. Math., 105, Providence, RI: Amer. Math. Soc., 1990, pp. 191–211 21. Rieffel, M.A.: Deformation quantization for actions of Rd . Mem. Amer. Math. Soc. 106, no. 506 (1993) 22. Rieffel, M.A.: Metrics on states from actions of compact groups. Doc. Math. 3, 215–229 (1998), (electronic) 23. Rieffel, M.A.: Metrics on state spaces. Doc. Math. 4, 559–600 (1999) (electronic) 24. Rieffel, M.A.: Gromov-Hausdorff distance for quantum metric spaces. In: Mem. Amer. Math. Soc. 168, no. 796, Providence, RI: Amer. Math. Soc., 2004 25. Rieffel, M.A.: Matrix algebras converge to the sphere for quantum Gromov-Hausdorff distance. In: Mem. Amer. Math. Soc. 168, no. 796, Providence, RI: Amer. Math. Soc., 2004 26. Rieffel, M.A.: Group C ∗ -algebras as compact quantum metric spaces. Doc. Math. 7, 605–651 (2002) (electronic) 27. Rieffel, M.A.: Compact quantum metric spaces. http://arxiv.org/list/math.OA/0308207, 2003 28. Sitarz, A.: Dynamical noncommutative spheres. Commun. Math. Phys. 241, no.1, 161–175 (2003) 29. Tr`eves, F.: Topological Vector Spaces, Distributions and Kernels. New York-London: Academic Press, 1967 30. V´arilly, J.C.: Hopf algebras in noncommutative geometry. In: Geometric and Topological Methods for Quantum Field Theory (Villa de Leyva, 2001), River Edge, NJ: World Sci. Publishing, 2003, pp. 1–85 31. Wegge-Olsen, N.E.: K-theory and C ∗ -algebras. A Friendly Approach. Oxford Science Publications. New York: The Clarendon Press, Oxford University Press, 1993 Communicated by A. Connes
Commun. Math. Phys. 256, 239–254 (2005) Digital Object Identifier (DOI) 10.1007/s00220-005-1319-4
Semiclassical Behaviour of Expectation Values in Time Evolved Lagrangian States for Large Times Roman Schubert Department of Mathematics, University of Bristol, Bristol, United Kingdom. E-mail: [email protected] Received: 2 February 2004 / Accepted: 19 October 2004 Published online: 15 March 2005 – © Springer-Verlag 2005
Abstract: We study the behaviour of time evolved quantum mechanical expectation values in Lagrangian states in the limit ℏ → 0 and t → ∞. We show that it depends strongly on the dynamical properties of the corresponding classical system. If the classical system is strongly chaotic, i.e. Anosov, then the expectation values tend to a universal limit. This can be viewed as an analogue of mixing in the classical system. If the classical system is integrable, then the expectation values need not converge, and if they converge their limit depends on the initial state. An additional difference occurs in the timescales for which we can prove this behaviour; in the chaotic case we get up to Ehrenfest time, t ∼ ln(1/ℏ), whereas for integrable systems we have a much larger time range.

1. Introduction and Results

A striking property of chaotic dynamical systems is the universality which these systems show in the time evolution for large times. Let $(\Sigma, \Phi^t, d\mu)$ be a dynamical system, i.e., $\Sigma$ is the compact phase space, $\Phi^t : \Sigma \to \Sigma$ the flow and dµ a normalised invariant measure on $\Sigma$. If the system is mixing then for any $\rho, a \in L^2(\Sigma, \mu)$ with $\int_\Sigma \rho\, d\mu = 1$ one has
$$\int_\Sigma a \circ \Phi^t \, \rho \, d\mu \to \int_\Sigma a \, d\mu , \quad \text{for } t \to \infty . \tag{1}$$
If we think of ρ as describing a probability distribution of initial states and of a as an observable, then mixing means that the system forgets its initial conditions for large times and so one needs only to know the “equilibrium state” dµ in order to predict the behaviour of time evolved observables for large times. If the rate of mixing is fast enough this then often implies other universal statistical features, e.g., a central limit theorem for time means of observables. We want to explore to what extent this universality shows up in quantum mechanics, too. The analogue of the expectation value in (1) is a quantum mechanical expectation value for a time evolved state. So let U(t) denote the time evolution operator of our quantum system, A an observable, i.e. a bounded operator, and ψ a state; we want to know if
$$\langle U(t)\psi, A\, U(t)\psi \rangle \tag{2}$$
converges to some limit if ℏ → 0 and t → ∞, at least for certain classes of observables and states. We will consider here Lagrangian states as initial states and bounded pseudo-differential operators as observables. The main difficulty in this problem comes from the fact that we have to perform two limits, ℏ → 0 and t → ∞, and these two limits do not commute. So we have to specify precisely how we take the joint limit and we have to use semiclassical constructions which are to some extent uniform in t. For systems which have some positive Liapunov exponents, it was found in the late 70’s in the physics literature [BZ78, Zas81, BBTV79, BB79] that the usual semiclassical constructions apparently can only work up to a timescale which grows logarithmically in ℏ, $T_E \sim \ln(1/\hbar)$, the so-called Ehrenfest or log-breaking time. That semiclassical constructions actually do work up to that time was rigorously proved in [CR97] for the time evolution of coherent states and in [BGP99] for the time evolution of observables. We will use for our work the results in [BR02], which extended the results by Bambusi, Graffi and Paul. The time range beyond the Ehrenfest time is not well understood yet. But results by Tomsovic, Heller and coworkers, [TH91, TH93, OTH92], suggest that semiclassical methods might be extended beyond Ehrenfest time. They studied, for autocorrelation functions of coherent states, the question whether one can extend the semiclassical propagator to timescales which are algebraic in 1/ℏ, and demonstrated numerically that this is possible for the stadium billiard and some quantised maps. One motivation for this work comes from the results of Bonechi and De Bièvre for the time evolution of coherent states in cat-maps, [BDB00]. They showed that a coherent state evolved with the quantised cat-map becomes equidistributed just after the Ehrenfest time, but they could control the time evolution only up to a slightly larger time range which is still logarithmic in 1/ℏ.
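The loss of memory expressed by (1) can be made explicit for the classical cat map, where Fourier modes are simply transported to higher and higher frequencies. The following Python sketch is only an illustration under stated assumptions: it uses the standard matrix A = [[2, 1], [1, 1]] (a choice not fixed in the text), discrete time steps instead of a flow, and a hypothetical density ρ given by finitely many Fourier coefficients. The function names are introduced here for illustration only.

```python
# Classical cat map x -> A x mod 1 on the 2-torus with A = [[2, 1], [1, 1]].
# Assumption: this matrix and the density below are illustrative choices.


def step(m):
    """Apply A^T to a Fourier index m = (m0, m1); A is symmetric here."""
    return (2 * m[0] + m[1], m[0] + m[1])


def evolved_mode(m, t):
    """Fourier index of a∘Φ^t for a(x) = exp(2πi<m, x>).

    Since <m, A^t x> = <(A^T)^t m, x>, the index is mapped t times by A^T.
    """
    for _ in range(t):
        m = step(m)
    return m


def correlation(m, rho_coeffs, t):
    """Exact value of ∫ a∘Φ^t ρ dμ for a = e_m and ρ = Σ_k rho_coeffs[k] e_k.

    By orthogonality of the Fourier modes only the coefficient of e_{-m_t}
    contributes, where m_t is the evolved index.
    """
    mt = evolved_mode(m, t)
    return rho_coeffs.get((-mt[0], -mt[1]), 0.0)


# Hypothetical density rho = 1 + 0.5*cos(2π(3x + 2y)), normalised to ∫ρ dμ = 1.
rho = {(0, 0): 1.0, (3, 2): 0.25, (-3, -2): 0.25}

values = [correlation((1, 1), rho, t) for t in range(6)]
print(values)
```

For this choice the correlation of a = e^{2πi(x+y)} with ρ is nonzero only at t = 1 and vanishes exactly for all t ≥ 2, which is the limiting value ∫a dµ ∫ρ dµ = 0. For less regular ρ the coefficients decay only gradually, so the convergence is asymptotic rather than exact.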
More precisely, equidistribution holds for times between $\frac{1+\varepsilon}{2\lambda}\ln(1/\hbar)$ and $\frac{1-\varepsilon}{\lambda}\ln(1/\hbar)$, for any ε > 0, where λ is the positive Liapunov exponent of the classical map. Since one expects a coherent state to become stretched along the unstable manifold of the orbit on which it is centred, it might be effectively modelled by a Lagrangian state associated with this unstable manifold. This is one motivation for studying Lagrangian states. Furthermore some particular examples of Lagrangian states have been already considered in [BDB00], namely position eigenstates and their time evolution under the quantised Bakers map, and they are shown to become equidistributed for large times up to the Ehrenfest time. More recently estimates on the time evolution around Ehrenfest time have been used in [FNDB03] to construct scarred eigenstates for the quantised cat map, and in [DBR03] the time evolution of coherent states along the separatrix in one-dimensional systems was investigated. A typical Lagrangian state on a manifold M is of the form
$$\psi(x) = \rho(\hbar, x)\, e^{\frac{i}{\hbar}\varphi(x)} , \tag{3}$$
where ϕ is a smooth real valued function and $\rho(\hbar, x)$ is a smooth function with compact support with an asymptotic expansion $\rho(\hbar, x) \sim \rho_0(x) + \hbar\rho_1(x) + \cdots$ for ℏ → 0. The important geometrical object associated with ψ is the Lagrangian manifold generated by the phase function ϕ,
$$\Lambda_\varphi := \{(\varphi'(x), x) \,;\, x \in U\} \subset T^*M , \tag{4}$$
where U ⊂ M is an open set containing the support of the amplitude ρ. We will denote the set of these states with compact support by $I_0(\Lambda)$. The definition can be extended to arbitrary Lagrangian manifolds, i.e., they need not be representable in the form (4). Any Lagrangian submanifold $\Lambda \subset T^*M$ can be represented locally as $\Lambda = \{(\varphi'_x(x, \theta), x) \,;\, \varphi'_\theta(x, \theta) = 0,\ (x, \theta) \in U \times \mathbb{R}^\kappa\}$, where ϕ(x, θ) is non-degenerate, i.e., the rank of the d × (d + κ) matrix $(\varphi''_{x,x}(x, \theta), \varphi''_{x,\theta}(x, \theta))$ is equal to d at the points (x, θ) with $\varphi'_\theta(x, \theta) = 0$. The corresponding Lagrangian states are given by
$$\psi(x) = \frac{1}{(2\pi\hbar)^{\kappa/2}} \int_{\mathbb{R}^\kappa} \rho(\hbar, x, \theta)\, e^{\frac{i}{\hbar}\varphi(x,\theta)}\, d\theta , \tag{5}$$
see [Dui74, BW97] and [Ivr98, Sect. 1.2.1] for more details. Lagrangian states appear quite often in applications, e.g., if $\varphi(x) = \langle p, x\rangle$ we have a localised plane wave with momentum p, or if ϕ depends only on |x| we get circular waves. Since the simultaneous eigenstates of d commuting pseudo-differential operators are typically Lagrangian, this class of states appears quite frequently as the result of the preparation of an experiment, e.g., the above mentioned examples occur if one selects initial states with certain momentum, or certain angular momentum, respectively. The leading order behaviour of a Lagrangian state ψ for ℏ → 0 is determined by its principal symbol σ(ψ) which, modulo phase factors, is a half-density on $\Lambda$. In the case that ψ is of the form (3), σ(ψ) is the pullback of the half-density $\rho_0(x)|dx|^{1/2}$ on $\mathbb{R}^d$ by the projection $\pi : \Lambda_\varphi \to \mathbb{R}^d$. We will only encounter its modulus squared, the density $|\sigma(\psi)|^2$, which can be defined more directly by the relation
$$\int_\Lambda a\, |\sigma(\psi)|^2 := \int_{\mathbb{R}^d} a(\varphi'(x), x)\, |\rho_0(x)|^2\, dx \tag{6}$$
for any $a \in C^\infty(T^*M)$. The observables we consider are given by pseudo-differential operators. We will say that $A \in \Psi^m(M)$ if locally A = Op[a] where
$$\mathrm{Op}[a]\psi(x) = \frac{1}{(2\pi\hbar)^d} \iint_{\mathbb{R}^d \times \mathbb{R}^d} e^{\frac{i}{\hbar}\langle x-y, \xi\rangle}\, a\Big(\hbar, \frac{x+y}{2}, \xi\Big)\, \psi(y)\, dy\, d\xi , \tag{7}$$
and the symbol $a(\hbar, x, \xi)$ has an asymptotic expansion $a(\hbar, x, \xi) \sim a_0(x, \xi) + \hbar a_1(x, \xi) + \hbar^2 a_2(x, \xi) + \cdots$ and satisfies
$$|\partial_x^\alpha \partial_\xi^\beta a(\hbar, x, \xi)| \le C_{\alpha,\beta}\, (1 + |x|^2 + |\xi|^2)^{m/2} , \tag{8}$$
for ℏ ∈ (0, 1] and all $\alpha, \beta \in \mathbb{N}^d$. One calls $\sigma(a) := a_0$ the principal symbol of a, or of A, and although the full symbol a is only defined locally, the principal symbol defines a function on $T^*M$, i.e., on phase space. The operators in $\Psi^0(M)$ are bounded, and they will form the set of observables for which we study time evolution. See, e.g., [DS99] for more details. Our first assumption on the system is that the Hamiltonian fits into the above framework, i.e., is a pseudo-differential operator.

Condition (H). Let M be a $C^\infty$ manifold and let $H \in \Psi^m(M)$, for some m ∈ R, be essentially selfadjoint.
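As a sanity check of the density defined in (6), one can compare a quantum expectation value in a state of the form (3) with the corresponding phase space integral. The Python sketch below does this in one dimension for the symbol a(x, ξ) = ξ, whose quantisation is −iℏ d/dx. All concrete choices (the value of ℏ, the Gaussian amplitude, the quadratic phase, the grid) are assumptions made for illustration, and the Gaussian is only numerically, not strictly, compactly supported.

```python
import numpy as np

hbar = 0.05                        # semiclassical parameter (illustrative value)
x = np.linspace(-5.0, 5.0, 20001)  # grid fine enough to resolve exp(i*phi/hbar)
dx = x[1] - x[0]

# Hypothetical amplitude and phase: a normalised Gaussian rho_0 centred at x0
# and phi(x) = p0*x + x**2/2, so that phi'(x) = p0 + x.
x0, sigma, p0 = 0.5, 0.3, 1.0
rho0 = (np.pi * sigma**2) ** (-0.25) * np.exp(-((x - x0) ** 2) / (2 * sigma**2))
phi = p0 * x + 0.5 * x**2
psi = rho0 * np.exp(1j * phi / hbar)

# Quantum side: <psi, Op[a] psi> for a(x, xi) = xi, i.e. Op[a] = -i*hbar*d/dx,
# with the derivative approximated by central finite differences.
p_psi = -1j * hbar * np.gradient(psi, dx)
quantum = np.real(np.sum(np.conj(psi) * p_psi) * dx)

# Classical side: integral of a(phi'(x), x) * |rho_0(x)|^2 dx, as in (6).
classical = np.sum((p0 + x) * rho0**2) * dx

print(quantum, classical)
```

For a real amplitude and this particular symbol the two sides coincide exactly in the continuum, so the small difference seen numerically is discretisation error. For general symbols the agreement is only expected up to terms of order ℏ.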
A typical example is $H = -\hbar^2 \Delta_g + V$, where $\Delta_g$ is the Laplace-Beltrami operator associated with a metric g on M, and V is a smooth real valued function (with $|\partial^\alpha V(x)| \le C_\alpha (1 + |x|)^m$ if M is not compact). For conditions on general operators from $\Psi^m(M)$ to be (essentially) selfadjoint see [DS99]. The Hamiltonian flow on $T^*M$ generated by the principal symbol $H_0$ of H will be denoted by $\Phi^t$.

Condition (O). There exists an open connected set $\Omega \subset T^*M$ which has compact closure and which is invariant under the flow $\Phi^t$.

Let $\Sigma_E := \{z \in T^*M \,;\, H_0(z) = E\}$ be the energy shell of energy E and denote by $d\mu_E$ the Liouville measure on $\Sigma_E$. $\Sigma_E$ and $d\mu_E$ are invariant under the flow. Let us recall the definition of an Anosov flow:

Condition (A). A flow $\Phi^t$ on a compact manifold $\Sigma$ is called Anosov, if for every $x \in \Sigma$ there exists a splitting $T_x\Sigma = E^s(x) \oplus E^u(x) \oplus E^0(x)$ which is invariant under $\Phi^t$ and where $E^0(x)$ is one-dimensional and spanned by the generating vector field of $\Phi^t$. Furthermore there exist constants C, λ > 0 such that
$$\|d\Phi^t v\| \le C e^{-\lambda t} \|v\| \quad \text{for each } v \in E^s \text{ and } t \ge 0 , \tag{9}$$
$$\|d\Phi^t v\| \le C e^{\lambda t} \|v\| \quad \text{for each } v \in E^u \text{ and } t \le 0 . \tag{10}$$
The two distributions $E^s$ and $E^u$ can be integrated to give the stable and unstable foliations, respectively. We will denote the leaves through x by $W^s(x)$ and $W^u(x)$. If the flow is smooth then the leaves are smooth submanifolds but the dependence of the leaves on x is usually only Hölder continuous, and we will denote the Hölder exponent by α. The corresponding weakly stable and unstable manifolds are defined by $W^{ws/wu}(x) := \bigcup_{t \in \mathbb{R}} \Phi^t(W^{s/u}(x))$. If $\Sigma$ is an energy shell of a Hamiltonian system, and $\Phi^t$ the Hamiltonian flow, then $W^s(x)$ and $W^u(x)$ have the same dimension, and $W^{ws}(x)$ and $W^{wu}(x)$ are Lagrangian submanifolds. An example of an Anosov flow is given by the geodesic flow on a compact manifold of negative curvature, see e.g. [Ebe01]. If the Hamilton operator is the Laplace-Beltrami operator associated with such a metric, then the flow generated by the principal symbol of this operator is conjugate to the geodesic flow, and its restriction to any equi-energy shell $\Sigma_E$ is Anosov. For the time evolution of Lagrangian states the position of $\Lambda$ relative to the stable foliation will be important. Namely we have to require that $T_x\Lambda$ contains no stable directions for most x; this leads to the following transversality conditions.

Condition (T). (i) If $\Lambda \subset \Sigma_E$ then assume that $T_x\Lambda \cap E^s(x) = \{0\}$ for all $x \in \Lambda \setminus \Lambda_{\mathrm{sing}}$, where $\Lambda_{\mathrm{sing}} \subset \Lambda$ has at least codimension 1. (ii) If $\Lambda \subset \Omega$ and the flow is Anosov on all $\Sigma_E \subset \Omega$ then for all such E assume that $T_x(\Lambda \cap \Sigma_E) \cap (E^s(x) \oplus E^0(x)) = \{0\}$ for all $x \in (\Lambda \cap \Sigma_E) \setminus \Lambda_{E,\mathrm{sing}}$, where $\Lambda_{E,\mathrm{sing}} \subset (\Lambda \cap \Sigma_E)$ has at least codimension 1.

These conditions are typically fulfilled, in the sense that if a Lagrangian manifold $\Lambda$ does not satisfy them one can find an arbitrarily small perturbation of $\Lambda$ which does. This would not be true if we required transversality to the stable foliation everywhere, and this is why we choose this more complicated condition. We can now state the main result of this paper about expectation values of time evolved Lagrangian states.
Theorem 1. Let M be a $C^\infty$ manifold, and $H \in \Psi^m(M)$ be a selfadjoint pseudo-differential operator on M, with principal symbol $H_0$. Let $\Phi^t$ be the Hamiltonian flow on $T^*M$ generated by $H_0$, and assume Condition (O) is fulfilled. Let $\Lambda \subset \Omega$ be a Lagrangian submanifold. Then
(i) if $\Lambda \subset \Sigma_E \subset \Omega$, the flow on $\Sigma_E$ is Anosov, and $\Lambda$ satisfies condition (T)(i), then there exist for every $\psi \in I_0(\Lambda)$ and $\mathrm{Op}[a] \in \Psi^0(M)$ constants C, c, Γ, γ > 0 such that
$$\Big| \langle U(t)\psi, \mathrm{Op}[a]U(t)\psi\rangle - \int_{\Sigma_E} \sigma(a)\, d\mu_E \int_\Lambda |\sigma(\psi)|^2 \Big| \le C\hbar\, e^{\Gamma|t|} + c\, e^{-\gamma t} . \tag{11}$$
(ii) If the flow is Anosov on all $\Sigma_E \subset \Omega$, and $\Lambda \cap \Sigma_E$ satisfies condition (T)(ii), then there exist for every $\psi \in I_0(\Lambda)$ and $\mathrm{Op}[a] \in \Psi^0(M)$ constants C, c, Γ, γ such that
$$\Big| \langle U(t)\psi, \mathrm{Op}[a]U(t)\psi\rangle - \int \Big( \int_{\Sigma_E} \sigma(a)\, d\mu_E \int_{\Lambda \cap \Sigma_E} |\sigma(\psi)|_E^2 \Big)\, dE \Big| \le C\hbar\, e^{\Gamma|t|} + c\, e^{-\gamma t} , \tag{12}$$
where the density $|\sigma(\psi)|_E^2$ on $\Lambda \cap \Sigma_E$ is defined by $|\sigma(\psi)|^2 = |\sigma(\psi)|_E^2 \otimes |dE|$.

In order that the right hand sides of the inequalities (11) and (12) tend to zero for ℏ → 0 and t → ∞, we have to have
$$t \le \frac{1-\varepsilon}{\Gamma} \ln(1/\hbar) , \tag{13}$$
for some ε > 0, so up to Ehrenfest time we get convergence. The constant Γ does in fact only depend on the principal symbol of H; it is larger than the largest Liapunov exponent of the classical flow. It seems likely that with some additional effort Γ can be chosen to be the supremum of all Liapunov exponents. Let us compare this result with mixing for the classical system. To this end assume that ‖ψ‖ = 1; this implies that $\int_\Lambda |\sigma(\psi)|^2 = 1$ and then (11) gives
$$\langle U(t)\psi, \mathrm{Op}[a]U(t)\psi\rangle \to \int_{\Sigma_E} \sigma(a)\, d\mu_E \tag{14}$$
for t → ∞ and ℏ → 0 such that $\hbar e^{\Gamma|t|} \to 0$. So we have the same behaviour as in the classical system, see (1), in particular we obtain the same kind of universality. The limit does not depend any longer on the initial state as long as it satisfies the conditions of part (i) of Theorem 1. The transversality condition on the Lagrangian manifold is necessary. If $\Lambda$ is for instance the stable manifold of a periodic orbit γ, then one has for $\psi \in I_0(\Lambda)$,
$$\langle U(t)\psi, \mathrm{Op}[a]U(t)\psi\rangle = \sum_{k \in \mathbb{Z}} b_k\, e^{\frac{2\pi i}{T_\gamma} k t} + O(\hbar e^{\Gamma|t|}) + O(e^{-\gamma t}) , \tag{15}$$
where $T_\gamma$ is the period of the orbit, and the coefficients $b_k$ are related to σ(ψ) and σ(a). We will discuss this in more detail in Sect. 3. The result in Theorem 1 can be viewed as an analogue for time evolution of the quantum ergodicity results for eigenfunctions [Šni74, Zel87, CdV85, HMR87]. If the classical system is ergodic then almost all eigenfunctions become equi-distributed. Here
we obtain equidistribution under time evolution, but we need stronger conditions on the classical system, namely mixing for densities concentrated on certain Lagrangian submanifolds. There seems to be no direct relation to the notion of quantum (weak) mixing introduced by Zelditch [Zel96], since our conditions are much stronger. One of the most interesting open problems now is to try to extend the time range in Theorem 1. This could then in turn be used to improve the quantum ergodicity results for eigenfunctions. We now want to compare the behaviour found in classically chaotic systems with integrable systems. Following [BR02] we introduce the following integrability condition.

Condition (I). M is analytic, and there exists a symplectic map χ from $\Omega$ into $U \times \mathbb{T}^d$, where U is an open set in $\mathbb{R}^d$ and $\mathbb{T}^d$ is a d-dimensional torus, such that
$$\chi(\Phi^t(z)) = (I(z), \varphi(z) + t\,\omega(I(z))) , \quad \forall z \in \Omega ,\ \forall t \in \mathbb{R} , \tag{16}$$
where χ(z) = (I(z), ϕ(z)). Moreover there exist complex open neighbourhoods $\tilde\Omega$, $\tilde U$, $\tilde{\mathbb{T}}^d$ of $\Omega$, U, $\mathbb{T}^d$ such that χ is an analytic diffeomorphism from $\tilde\Omega$ onto $\tilde U \times \tilde{\mathbb{T}}^d$.

According to the Liouville-Arnold Theorem this situation occurs if one has d analytic integrals of motion which are in involution and which are independent on $\Omega$. In the case of integrable systems one can explore larger time scales, and we obtain the following results.

Theorem 2. Assume Conditions (H), (O) and (I) are fulfilled. Assume furthermore that $\Lambda \subset \Omega$ is an invariant torus with frequency $\omega \in \mathbb{R}^d$, i.e., in action angle coordinates $(I, x) \in U \times \mathbb{T}^d$ from Condition (I) we have $\Lambda = \{I\} \times \mathbb{T}^d$ for a fixed I ∈ U. Let $\psi \in I_0(\Lambda)$ and $\mathrm{Op}[a] \in \Psi^0$, and consider the Fourier expansion of the principal symbols, $\sigma(a)|_\Lambda(x) = \sum_{m \in \mathbb{Z}^d} \alpha_m e^{i\langle m, x\rangle}$, $|\sigma(\psi)|^2(x) = \sum_{m \in \mathbb{Z}^d} \beta_m e^{i\langle m, x\rangle} |dx|$. Then there are constants C > 0, β > 0 such that
$$\Big| \langle U(t)\psi, \mathrm{Op}[a]U(t)\psi\rangle - \sum_{m \in \mathbb{Z}^d} \alpha_m \beta_{-m}\, e^{i\langle m, \omega(I)\rangle t} \Big| \le C\hbar\, (1 + |t|)^\beta . \tag{17}$$
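The leading term in (17) is a quasi-periodic function of t. A minimal Python sketch of how it can be evaluated for finitely many Fourier coefficients is given below; the function name, the dictionary representation of the coefficients and the one-dimensional example data are assumptions introduced for illustration, and the formula is evaluated exactly as written in (17) without any claim about normalisation conventions.

```python
import cmath


def leading_term(alpha, beta, omega, t):
    """Evaluate sum_m alpha[m] * beta[-m] * exp(i <m, omega> t).

    alpha, beta: dicts mapping integer tuples m to Fourier coefficients.
    omega: tuple of frequencies with the same length as the indices m.
    """
    total = 0j
    for m, a_m in alpha.items():
        minus_m = tuple(-c for c in m)
        b = beta.get(minus_m, 0.0)          # missing coefficients count as zero
        phase = sum(mi * wi for mi, wi in zip(m, omega)) * t
        total += a_m * b * cmath.exp(1j * phase)
    return total


# Hypothetical example in d = 1: sigma(a) = 1 + cos(x), |sigma(psi)|^2 = 1 + 0.5*cos(x).
alpha = {(0,): 1.0, (1,): 0.5, (-1,): 0.5}
beta = {(0,): 1.0, (1,): 0.25, (-1,): 0.25}
omega = (1.0,)

for t in (0.0, cmath.pi, 2 * cmath.pi):
    print(t, leading_term(alpha, beta, omega, t).real)
```

In this example the sum equals 1 + 0.25 cos t, so it keeps oscillating between 0.75 and 1.25 and has no limit as t → ∞.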
So in this case expectation values do not converge at all, but keep on oscillating. If $\Lambda$ is transversal to the foliation into invariant tori T, then the situation changes. The tori carry a natural invariant density |dx|, which can be combined with a density $|\sigma(\psi)|^2$ on $\Lambda$ to give a density on $\Omega$. By the transversality assumption there exist local symplectic coordinates $(I, x) \in U \times V$ such that $\Lambda = \{(I, 0) ,\ I \in U\}$ and the sets $\{(I_0, x) ,\ x \in V\}$ belong to invariant tori. In these coordinates the modulus square of the principal symbol can be written as $|\sigma(\psi)|^2 = |\hat\rho(I)|^2 |dI|$ and we define
$$\mu_{\psi,T} := |\hat\rho(I)|^2\, |dI \wedge dx| . \tag{18}$$
Theorem 3. Assume Conditions (H), (O) and (I) are fulfilled, and that the system is nondegenerate, i.e., $\det \omega'(I) \ne 0$ on U. If $\Lambda$ is transversal to the foliation into invariant tori, then for $\psi \in I_0(\Lambda)$ and $\mathrm{Op}[a] \in \Psi^0$ there exist constants C, c > 0, β > 0 such that
$$\Big| \langle U(t)\psi, \mathrm{Op}[a]U(t)\psi\rangle - \int \sigma(a)\, \mu_{\psi,T} \Big| \le C\hbar\, (1 + |t|)^\beta + c\, \frac{1}{1 + |t|} , \tag{19}$$
where $\mu_{\psi,T}$ is the density defined in (18).
So in this case we get convergence of the expectation value, but the limit depends strongly on the initial state. Integrating against the density $\mu_{\psi,T}$ means that we take the mean over each invariant torus, and then integrate these contributions weighted with the principal symbol of the state. This means that the knowledge of the limit density $\mu_{\psi,T}$ allows one to determine the foliation into invariant tori, and the distribution of the mass of the initial state across the tori. In case of a chaotic system the situation is different. The only information on the initial state which survives is the information on how its mass is distributed among the energy shells. All other information is lost, and so we have the same degree of universality as in the classical system. Another difference with the Anosov case is that we can control the time evolution for larger time scales, $t \le 1/\hbar^{\beta-\varepsilon}$, for ε > 0. Since we mainly wanted to give a contrast to the main result in Theorem 1 we have not tried to obtain the optimal bounds on the time scales, for which one probably would need other methods. The organisation of the paper is as follows. In Sect. 2 we reduce the quantum mechanical problem to one in classical mechanics; here the limitations on the time range occur. In Sect. 3 we extend previous results on mixing in Anosov systems and use them to prove Theorem 1. In Sect. 4 we discuss the integrable case and give proofs of Theorems 2 and 3.

2. Reduction to Classical Dynamics

Our aim in this section is to reduce the quantum mechanical problem to a problem in classical mechanics. This is obtained in

Proposition 1. Assume Conditions (H) and (O), and let $\Lambda \subset \Omega$ be a Lagrangian manifold, $\psi \in I_0(\Lambda)$ and $\mathrm{Op}[a] \in \Psi^0(M)$. Then there exists a constant Γ > 0, independent of $\Lambda$ and a, and C > 0 such that
$$\Big| \langle U(t)\psi, \mathrm{Op}[a]U(t)\psi\rangle - \int_\Lambda \sigma(a) \circ \Phi^t\, |\sigma(\psi)|^2 \Big| \le C\hbar\, e^{\Gamma|t|} . \tag{20}$$
When Condition (I) is fulfilled in addition then there exists a constant β > 0 and C > 0 such that
$$\Big| \langle U(t)\psi, \mathrm{Op}[a]U(t)\psi\rangle - \int_\Lambda \sigma(a) \circ \Phi^t\, |\sigma(\psi)|^2 \Big| \le C\hbar\, (1 + |t|)^\beta . \tag{21}$$
The first step in the proof of this proposition is the following simple lemma. Here and in the following $|\cdot|_\infty$ will denote the sup-norm.

Lemma 1. Let $\psi \in I_0(\Lambda)$ be a Lagrangian state with compact support on M, then there exists C > 0 and an integer k > 0 such that for all $\mathrm{Op}[a] \in \Psi^0(M)$,
$$\Big| \langle \psi, \mathrm{Op}[a]\psi\rangle - \int_\Lambda \sigma(a)\, |\sigma(\psi)|^2 \Big| \le C\hbar \sum_{|\beta| \le k} |\partial^\beta a|_\infty . \tag{22}$$
This is a standard result which follows from the results about application of pseudodifferential operators on Lagrangian states, see e.g. [Hör94, BW97], we have only made
the dependence on a of the right-hand side more explicit. Since this lemma is an application of the method of stationary phase, the remainder follows from the remainder estimates in this method, see [Hör90]. The second ingredient in the proof of Proposition 1 is an Egorov theorem which is valid up to Ehrenfest time. The problem of time evolution of observables with remainder estimates uniform in time has been studied by Ivrii and Kachalkina in [Ivr98, Chap. 2.3]. Independently [BGP99] obtained a proof of the validity of Egorov up to Ehrenfest time for analytic observables and Hamiltonians. These results were then extended in the work of Bouzouina and Robert, [BR02]. In the formulation of the result we need the notion of essential support of an operator $\mathrm{Op}[a] \in \Psi^0(M)$. Recall that $z \in T^*M$ is not in the essential support of Op[a] if there is a neighbourhood U of z such that $|a(z)| \le C_N \hbar^N$ for all N ∈ N and z ∈ U. So Op[a] is semiclassically negligible outside of its essential support.

Theorem 4 ([BR02]). Assume Conditions (H) and (O). Then there exists a constant $\Gamma_1 > 0$ such that for any $\mathrm{Op}[a] \in \Psi^0(M)$ with essential support in $\Omega$ there is a C > 0 such that
$$\| U(t)^* \mathrm{Op}[a] U(t) - \mathrm{Op}[a \circ \Phi^t] \| \le C\hbar\, e^{\Gamma_1 t} . \tag{23}$$
A much stronger version of this theorem was proved for $M = \mathbb{R}^n$ in [BR02], but the generalisation of their result to manifolds is complicated since the higher order terms of the symbol are not invariantly defined on $T^*M$. But we only need the leading order term, i.e. the principal symbol, and since this is a function on $T^*M$ the result generalises to the case of manifolds. In case of integrable systems we will use instead the stronger Theorem 1.13 from [BR02].

Theorem 5 ([BR02]). Assume Conditions (H), (O) and (I), then for every $\mathrm{Op}[a] \in \Psi^0(M)$ with essential support in $\Omega$ there exist constants C > 0 and $\beta_d \le 5d + 4$ such that
$$\| U(t)^* \mathrm{Op}[a] U(t) - \mathrm{Op}[a \circ \Phi^t] \| \le C\hbar\, (1 + |t|)^{\beta_d} . \tag{24}$$
We can now conclude the proof of Proposition 1.

Proof (Proposition 1). We will first assume that the essential support of Op[a] is contained in $\Omega$. Then by Theorem 4 we have that
$$|\langle U(t)\psi, \mathrm{Op}[a]U(t)\psi\rangle - \langle \psi, \mathrm{Op}[a \circ \Phi^t]\psi\rangle| \le C\hbar\, e^{\Gamma_1 |t|} , \tag{25}$$
and Lemma 1 gives
$$\Big| \langle \psi, \mathrm{Op}[a \circ \Phi^t]\psi\rangle - \int_\Lambda \sigma(a) \circ \Phi^t\, |\sigma(\psi)|^2 \Big| \le C\hbar \sum_{|\beta| \le k} |\partial^\beta (a \circ \Phi^t)|_\infty . \tag{26}$$
But as is well known, $\sum_{|\alpha| \le k} |\partial^\alpha (a \circ \Phi^t)|_\infty \le C e^{\Gamma_2 |t|} \sum_{|\alpha| \le k} |\partial^\alpha a|_\infty$ for some $\Gamma_2 > 0$, see e.g. [BR02, Lemma 2.4], and combining these estimates gives (20) with $\Gamma = \max\{\Gamma_1, \Gamma_2\}$. For the proof of Eq. (21) we use Theorem 5 together with Lemma 1 to get
$$\Big| \langle U(t)\psi, \mathrm{Op}[a]U(t)\psi\rangle - \int_\Lambda \sigma(a) \circ \Phi^t\, |\sigma(\psi)|^2 \Big| \le C\hbar\, (1 + |t|)^{\beta_d} + C\hbar \sum_{|\alpha| \le k} |\partial^\alpha (a \circ \Phi^t)|_\infty \tag{27}$$
and with the estimate $\sum_{|\alpha| \le k} |\partial^\alpha (a \circ \Phi^t)|_\infty \le C \sum_{|\alpha| \le k} |\partial^\alpha a|_\infty (1 + |t|)^{\beta'_d}$, see [BR02, Lemma 4.2], the proof is complete if we take $\beta = \max\{\beta_d, \beta'_d\}$. We finally show that we can reduce the case of an arbitrary observable $\mathrm{Op}[a] \in \Psi^0(M)$ to the case of observables with essential support in $\Omega$. Let $I_0 := H_0^{-1}(\Omega)$ and $I_1 := H_0^{-1}(\mathrm{supp}(\sigma(\psi)))$, where $H_0$ is the principal symbol of H, be the energy-ranges of $\Omega$ and the support of σ(ψ) on $\Lambda$, respectively. Then $I_0$ is an open interval, $I_1$ is a closed interval with $I_1 \subset I_0$, and so there exists a function $f \in C_0^\infty(I_0)$ with $f|_{I_1} \equiv 1$. Then by the functional calculus, see [DS99], the operator f(H) is in $\Psi^0(M)$, has essential support in $\Omega$, commutes with U(t), and satisfies $\|f(H)\psi - \psi\| \le C\hbar$. Therefore
$$|\langle U(t)\psi, \mathrm{Op}[a]U(t)\psi\rangle - \langle U(t)\psi, f(H)\, \mathrm{Op}[a]U(t)\psi\rangle| \le C\hbar , \tag{28}$$
and since the essential support of f(H) Op[a] is contained in $\Omega$ we are done.
3. Chaotic Systems

By Proposition 1 the proof of Theorem 1 is now reduced to the study of
\[ \int_\Lambda \sigma(a)\circ\Phi^t\,|\sigma(\psi)|^2 , \tag{29} \]
and this expression is very similar to a correlation function like the one in (1). The only difference is that the density $\rho$ is replaced by a density concentrated on the submanifold $\Lambda$. Our aim in this section is to extend existing results on mixing of Anosov flows to these modified correlation functions. It is clear that we need a condition on the manifold $\Lambda$, as the example of a weakly stable manifold shows: if $\Lambda$ is the weakly stable manifold of a periodic trajectory, then the mass of the density will become more and more concentrated on that trajectory and will not become equidistributed. This example will be discussed in more detail at the end of this section.

Recall that a function $a$ on a set $X$ with metric $d(x,y)$ is Hölder continuous with Hölder exponent $\alpha\in(0,1)$ if $|a(x)-a(y)|\le C\,d(x,y)^\alpha$, and the smallest such constant $C$ is called the Hölder constant $|a|_\alpha$. The set of Hölder continuous functions on $X$ will be denoted by $C^\alpha(X)$. Following the usual conventions we fix a metric on the energy shell $\Sigma_E$, which in turn induces metrics on submanifolds of $\Sigma_E$.

We will rely mainly on Liverani's recent result on mixing for contact Anosov flows [Liv04]. He shows that for any $\alpha\in(0,1)$ there exist constants $C,\gamma>0$ such that for $a,b\in C^\alpha(\Sigma_E)$ one has
\[ \Big|\int a\circ\Phi^t\, b\,d\mu-\int a\,d\mu\int b\,d\mu\Big|\le C\,|a|_\alpha|b|_\alpha\,e^{-\gamma t}. \tag{30} \]
Quantitative results on the decay of correlations for Anosov flows are rather recent. The main results prior to [Liv04] were obtained by Chernov [Che98] and Dolgopyat [Dol98]; see the introduction of [Liv04] for more details on the history of this problem. Since the restriction of a Hamiltonian flow to an energy shell is a contact flow, the result of Liverani applies to the systems we are interested in.

We want to extend the result of Liverani to the case that one of the functions in the correlation integral is a density concentrated on a smooth submanifold. Such results have been obtained previously for geodesic flows on manifolds of negative curvature, with certain measures concentrated on the unstable manifolds, by Sinai and Chernov.
Sinai showed in [Sin95] that mixing holds, and Chernov [Che97] showed that the correlations decay at least like $e^{-\gamma\sqrt{t}}$. On manifolds of constant negative curvature Eskin and McMullen [EM93] derived mixing if one of the functions is concentrated on certain submanifolds. They reduced this to the classical mixing results for functions by using the hyperbolicity of the flow. We will follow their approach; the only additional difficulty is that the stable foliation is no longer smooth but only Hölder continuous if the curvature is no longer constant. To overcome this we use the absolute continuity property of the stable foliation.

In the following we will assume that non-vanishing smooth densities $\sigma_\Lambda$ and $\sigma_\Sigma$ have been fixed on the submanifolds $\Lambda$ and $\Sigma$, so that every density can be written as $\sigma=\hat\sigma\,\sigma_\Lambda$ or $\sigma=\hat\sigma\,\sigma_\Sigma$. We then say that $\sigma\in C^\alpha(\Lambda)$ if $\hat\sigma\in C^\alpha(\Lambda)$, and analogously $\sigma\in C^\alpha(\Sigma)$ if $\hat\sigma\in C^\alpha(\Sigma)$.

Theorem 6. Let $S$ be a symplectic manifold of dimension $2d$, and let $\Phi^t:S\to S$ be a Hamiltonian flow on $S$ with Hamilton function $H\in C^\infty(S)$. Denote by $\Sigma_E:=\{z\in S\,;\,H(z)=E\}$ the energy shell with energy $E$ and by $d\mu_E$ the Liouville measure on $\Sigma_E$. Assume that $\Sigma_E$ is compact and connected, that $\Phi^t$ is Anosov on $\Sigma_E$, and that the stable foliation has Hölder exponent $\alpha$.

(i) Let $\Lambda\subset\Sigma_E$ be a $d$-dimensional submanifold which is transversal to the stable foliation of $\Sigma_E$ except on a subset of codimension at least 1. Then there exist $\gamma_1>0$ and for every density $\sigma\in C_0^\alpha(\Lambda)$ a constant $C_1$ such that for every function $a\in C^\alpha(\Sigma_E)$ we have
\[ \Big|\int_\Lambda a\circ\Phi^t\,\sigma-\int_{\Sigma_E}a\,d\mu_E\int_\Lambda\sigma\Big|\le C_1\,|a|_\alpha\,e^{-\gamma_1 t}. \tag{31} \]

(ii) Let $\Sigma\subset\Sigma_E$ be a $(d-1)$-dimensional submanifold which is transversal to the weakly-stable foliation of $\Sigma_E$, except on a subset of codimension at least 1. Then there exist $\gamma_2>0$ and for every density $\sigma\in C_0^\alpha(\Sigma)$ a constant $C_2$ such that for every function $a\in C^\alpha(\Sigma_E)$ we have
\[ \Big|\int_\Sigma a\circ\Phi^t\,\sigma-\int_{\Sigma_E}a\,d\mu_E\int_\Sigma\sigma\Big|\le C_2\,|a|_\alpha\,e^{-\gamma_2 t}. \tag{32} \]

(iii) Let $\Lambda\subset S$ be a $d$-dimensional submanifold and assume that the flow is Anosov on all $\Sigma_E$ with $\Sigma_E\cap\Lambda\neq\emptyset$. Assume furthermore that for all these $E$, $\Lambda\cap\Sigma_E$ is transversal to the weakly stable foliation of $\Sigma_E$, except on a subset of codimension at least one in $\Lambda\cap\Sigma_E$. Then there exist $\gamma_3>0$ and for every density $\sigma\in C_0^\alpha(\Lambda)$ a constant $C_3$ such that for every function $a\in C_0^\alpha(S)$ we have
\[ \Big|\int_\Lambda a\circ\Phi^t\,\sigma-\int\Big(\int_{\Sigma_E}a\,d\mu_E\int_{\Lambda\cap\Sigma_E}\sigma_E\Big)dE\Big|\le C_3\,|a|_\alpha\,e^{-\gamma_3 t}, \tag{33} \]
where $\sigma_E$ is a density on $\Lambda\cap\Sigma_E$ defined by $\sigma=\sigma_E\otimes|dE|$.

Proof. In order to prove (i), we will relate the behaviour of
\[ \int_\Lambda a\circ\Phi^t\,\sigma \tag{34} \]
to the behaviour of the standard correlation function
\[ \int_{\Sigma_E}a\circ\Phi^t\,\rho\,d\mu_E, \tag{35} \]
where $\rho\in C^\alpha(\Sigma_E)$ is supported in a neighbourhood of $\Lambda$. The heuristic idea is that, since a neighbourhood of $\Lambda$ converges exponentially fast along the stable manifolds to $\Lambda$, the integral (35) will become close to the integral (34) for appropriately chosen $\rho$. But to (35) we can then apply the result (30) by Liverani.

We will formalise this idea now and treat first the case that $\Lambda$ is transversal to the stable foliation. By using a partition of unity we can assume that the support of $\sigma$ is in a small compact set $\Lambda_0\subset\Lambda$, such that there is a neighbourhood $\hat\Lambda_0\subset\Sigma_E$ of $\Lambda_0$ in $\Sigma_E$ in which we can choose coordinates $(x,y)\in U\times W\subset\mathbb{R}^d\times\mathbb{R}^{d-1}$ with the property that $\Lambda=\{(x,0),\ x\in U\}$ and $W^s(x)=\{(x,y);\ y\in W\}$. This is where we use the transversality assumption. Since $\Lambda_0$ is compact, $W$ can be chosen to be bounded. Notice that since the stable foliation is usually only Hölder continuous, the transformation to this coordinate system is only Hölder continuous, too. But Anosov showed that the stable foliation is absolutely continuous, which means that there is a measurable function $\delta_x(y)$ (basically the Jacobian of the holonomy associated with the stable foliation) which depends measurably on $x$ and satisfies $1/C<\delta_x(y)<C$ for some $C>0$ and all $(x,y)\in U\times W$, such that
\[ \int_{\Sigma_E}a\circ\Phi^t\,\rho\,d\mu_E=\int_U\int_W\rho(x,y)\,a\circ\Phi^t(x,y)\,\delta_x(y)\,dy\,dx, \tag{36} \]
where we have assumed that $\rho$ is supported in $U\times W$, see [BS02, Chap. 6.2]. Furthermore, the dependence of $\delta_x(y)$ on $x$ is Hölder continuous, see Eq. (A.3) in [Liv04]. We will now show that $\rho$ can be chosen to be in $C^\alpha(\Sigma_E)$ and such that
\[ \int_W\rho(x,y)\,\delta_x(y)\,dy=\hat\sigma(x), \tag{37} \]
where $\sigma(x)=\hat\sigma(x)\,dx$. To this end, set $\rho(x,y)=\rho_1(x)\rho_2(x,y)\hat\sigma(x)$ with $\rho_2(x,y)>0$ on $\Lambda_0$, Hölder and supported in $\hat\Lambda_0$, and with
\[ \rho_1(x)=\Big(\int_W\rho_2(x,y)\,\delta_x(y)\,dy\Big)^{-1} \tag{38} \]
on $\Lambda$. Then $\rho_1$ is Hölder, since $\delta_x(y)$ is bounded and depends Hölder continuously on $x$, and so $\rho(x,y)$ is Hölder and satisfies (37) by construction. By Hölder continuity, and since $W$ is bounded, we now get
\[ |a\circ\Phi^t(x,y)-a\circ\Phi^t(x,0)|\le C|a|_\alpha\,d(\Phi^t(x,y),\Phi^t(x,0))^\alpha\le C'|a|_\alpha\,e^{-\alpha\gamma t}, \tag{39} \]
since the flow is contracting along the stable leaves, i.e., $d(\Phi^t(x,y),\Phi^t(x,0))\le Ce^{-\gamma t}$ for some constants $C,\gamma>0$. Therefore we obtain with (37),
\[ \Big|\int_U\int_W\rho(x,y)\,a\circ\Phi^t(x,y)\,\delta_x(y)\,dy\,dx-\int_U\int_W\rho(x,y)\,a\circ\Phi^t(x,0)\,\delta_x(y)\,dy\,dx\Big|\le C'|a|_\alpha\int_U|\hat\sigma(x)|\,dx\;e^{-\alpha\gamma t} \tag{40} \]
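and
\[ \int_U\int_W\rho(x,y)\,a\circ\Phi^t(x,0)\,\delta_x(y)\,dy\,dx=\int_U a\circ\Phi^t(x,0)\,\hat\sigma(x)\,dx=\int_\Lambda a\circ\Phi^t\,\sigma. \tag{41} \]
On the other hand we have by (30),
\[ \Big|\int_{\Sigma_E}a\circ\Phi^t\,\rho\,d\mu_E-\int_{\Sigma_E}\rho\,d\mu_E\int_{\Sigma_E}a\,d\mu_E\Big|\le C|a|_\alpha|\rho|_\alpha\,e^{-\gamma t}, \tag{42} \]
and by (37),
\[ \int_{\Sigma_E}\rho\,d\mu_E=\int_\Lambda\sigma,\qquad|\rho|_\alpha\le C|\hat\sigma|_\alpha, \tag{43} \]
so finally we obtain
\[ \Big|\int_\Lambda a\circ\Phi^t\,\sigma-\int_\Lambda\sigma\int_{\Sigma_E}a\,d\mu_E\Big|\le C\big(|\sigma|_\alpha+\|\sigma\|_{L^1(\Lambda)}\big)|a|_\alpha\,e^{-\gamma t}. \tag{44} \]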
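As a reading aid, the way the estimates (36) to (43) combine into (44) can be spelled out as a triangle-inequality sketch. The symbols $\Lambda$ (the submanifold), $\Phi^t$ (the flow) and $\Sigma_E$ (the energy shell) are notation assumed here, and the contraction rate in (40) and the mixing rate in (42) need not coincide, so the $\gamma$ in (44) should be read as a generic positive rate.

```latex
% Sketch only: C denotes generic constants.
\begin{align*}
\Big|\int_\Lambda a\circ\Phi^t\,\sigma-\int_\Lambda\sigma\int_{\Sigma_E}a\,d\mu_E\Big|
&\le\Big|\int_\Lambda a\circ\Phi^t\,\sigma-\int_{\Sigma_E}a\circ\Phi^t\,\rho\,d\mu_E\Big|
  &&\text{(use (36), (41), then (40))}\\
&\quad+\Big|\int_{\Sigma_E}a\circ\Phi^t\,\rho\,d\mu_E-\int_{\Sigma_E}\rho\,d\mu_E\int_{\Sigma_E}a\,d\mu_E\Big|
  &&\text{(use (42))}\\
&\le C\,|a|_\alpha\,\|\sigma\|_{L^1(\Lambda)}\,e^{-\alpha\gamma t}
  +C\,|a|_\alpha\,|\rho|_\alpha\,e^{-\gamma t}.
\end{align*}
% The first identity in (43) was used to replace \int\rho\,d\mu_E by \int_\Lambda\sigma
% on the left-hand side; the second one bounds |\rho|_\alpha by C|\hat\sigma|_\alpha.
```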
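This completes the proof of (i) in the case that the manifolds are transversal. We will now extend this result to the non-transversal case. Let $\Lambda_{\rm sing}=\{x\in\Lambda\,;\,\dim T_x\Lambda\cap T_xW^s(x)\ge1\}$ be the set of points on $\Lambda$ where the intersection is not transversal, and define $\Lambda_{{\rm sing},\varepsilon}:=\{x\in\Lambda\,;\,d(x,\Lambda_{\rm sing})\le\varepsilon\}$. Choose $\varphi_\varepsilon\in C^\alpha(\Lambda)$ with $\operatorname{supp}\varphi_\varepsilon\subset\Lambda_{{\rm sing},\varepsilon}$ and $\varphi_\varepsilon\equiv1$ on $\Lambda_{{\rm sing},\varepsilon/2}$. Then
\[ \Big|\int_\Lambda\varphi_\varepsilon\,a\circ\Phi^t\,|\sigma(\psi)|^2\Big|\le C|a|\,\varepsilon^{d-d_{\rm sing}}, \tag{45} \]
where $d_{\rm sing}$ is the dimension of $\Lambda_{\rm sing}$. To the integral $\int_\Lambda(1-\varphi_\varepsilon)\,a\circ\Phi^t\,|\sigma(\psi)|^2$ we can apply the previous results; we only have to pay attention to the $\varepsilon$-dependence of the constants. The second estimate in (43) has to be refined. By the definition of $\rho$ we have $|\rho(1-\varphi_\varepsilon)|_\alpha\le|\rho_1(1-\varphi_\varepsilon)|_\alpha|\rho_2|_\alpha|\hat\sigma|_\alpha$, and since the Jacobian $\delta_x(y)$ becomes degenerate when $x$ approaches $\Lambda_{\rm sing}$ we get
\[ |\rho_1(1-\varphi_\varepsilon)|_\alpha\le C\varepsilon^{-\gamma'}, \tag{46} \]
where $\gamma'>0$ depends on $\alpha$ and $d_{\rm sing}$. Collecting the estimates yields
\[ \Big|\int_\Lambda a\circ\Phi^t\,\sigma-\int_\Lambda\sigma\int_{\Sigma_E}a\,d\mu_E\Big|\le C\varepsilon^{-\gamma'}\big(|\sigma|_\alpha+\|\sigma\|_{L^1(\Lambda)}\big)|a|_\alpha\,e^{-\gamma t}+C|a|\,\varepsilon^{d-d_{\rm sing}}, \tag{47} \]
and choosing $\varepsilon=e^{-\gamma''t}$ with $\gamma''=\gamma/(\gamma'+(d-d_{\rm sing}))$ gives
\[ \Big|\int_\Lambda a\circ\Phi^t\,\sigma-\int_\Lambda\sigma\int_{\Sigma_E}a\,d\mu_E\Big|\le C\big(|a|_\alpha+|a|\big)\,e^{-\gamma_1 t} \tag{48} \]
with $\gamma_1=\gamma(d-d_{\rm sing})/(\gamma'+(d-d_{\rm sing}))$.

The proof of (ii) is based on (i). Define for some $\delta>0$ the set $\Lambda:=\bigcup_{|t|<\delta}\Phi^t(\Sigma)\subset\Sigma_E$; then $\Lambda$ is transversal to the stable foliation except on a subset of codimension at least one. If $s\in U\subset\mathbb{R}^{d-1}$ are local coordinates on $\Sigma$, then $(r,s)$, $|r|<\delta$, are local coordinates on $\Lambda$. Let $\rho$ be a smooth function with compact support in $|r|<\delta$, $\int\rho(r)\,dr=1$, and define $\rho_\varepsilon(r):=\frac{1}{\varepsilon}\rho(r/\varepsilon)$. If we write $\sigma=\hat\sigma(s)\,ds$ and $\sigma_\varepsilon:=\hat\sigma(s)\rho_\varepsilon(r)\,ds\,dr$, we have
\[ \Big|\int_\Sigma a\circ\Phi^t\,\sigma-\int_\Lambda a\circ\Phi^t\,\sigma_\varepsilon\Big|=\Big|\int_U a(t,s)\,\hat\sigma\,ds-\int_U\int_{\mathbb{R}}a(r+t,s)\,\rho_\varepsilon(r)\,\hat\sigma(s)\,dr\,ds\Big|\le\int_U\int_{\mathbb{R}}\rho_\varepsilon(r)\,|a(t,s)-a(r+t,s)|\,dr\,|\hat\sigma(s)|\,ds, \tag{49} \]
but
\[ \int_{\mathbb{R}}\rho_\varepsilon(r)\,|a(t,s)-a(r+t,s)|\,dr=\int_{\mathbb{R}}\rho(r)\,|a(t,s)-a(\varepsilon r+t,s)|\,dr\le C|a|_\alpha\,\varepsilon^\alpha, \tag{50} \]
and therefore
\[ \Big|\int_\Sigma a\circ\Phi^t\,\sigma-\int_\Lambda a\circ\Phi^t\,\sigma_\varepsilon\Big|\le C\,\|\sigma\|_{L^1(\Sigma)}\,|a|_\alpha\,\varepsilon^\alpha. \tag{51} \]
On the other hand, with $|\sigma_\varepsilon|_\alpha\le C|\sigma|_\alpha\,\varepsilon^{\alpha-1}$ and $\|\sigma_\varepsilon\|_{L^1(\Lambda)}=\|\sigma\|_{L^1(\Sigma)}$ we obtain from (i) that
\[ \Big|\int_\Lambda a\circ\Phi^t\,\sigma_\varepsilon-\int_{\Sigma_E}a\,d\mu_E\int_\Sigma\sigma\Big|\le C|a|_\alpha\big(|\sigma|_\alpha\,\varepsilon^{\alpha-1}+\|\sigma\|_{L^1(\Sigma)}\big)\,e^{-\gamma_1 t}. \tag{52} \]
If we now choose $\varepsilon=e^{-\gamma't}$ with $\gamma'>0$ and $(1-\alpha)\gamma'<\gamma_1$, so that both (51) and (52) decay exponentially, the proof of (ii) is complete. Part (iii) then follows immediately by writing
\[ \int_\Lambda a\circ\Phi^t\,\sigma=\int\int_{\Lambda\cap\Sigma_E}a\circ\Phi^t\,\sigma_E\,dE, \tag{53} \]
and applying (ii) to the integral over $\Lambda\cap\Sigma_E$ on the right-hand side.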
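The two choices of $\varepsilon$ in the proof above are exponent-balancing arguments, and it may help to see them written out. The following is only a sketch: the primes on the rates ($\gamma'$ for the exponent in (46), $\gamma''$ for the rate of $\varepsilon$ in part (i)) are notation assumed here to keep the different constants apart, and the value of $\gamma_2$ at the end is one admissible choice rather than one stated in the text.

```latex
% Part (i): with \varepsilon = e^{-\gamma'' t} the two error terms in (47) become
\begin{align*}
\varepsilon^{-\gamma'}e^{-\gamma t}&=e^{-(\gamma-\gamma'\gamma'')t}, &
\varepsilon^{\,d-d_{\rm sing}}&=e^{-\gamma''(d-d_{\rm sing})t}.
\end{align*}
% Equating the two exponents gives
\begin{align*}
\gamma-\gamma'\gamma''=\gamma''(d-d_{\rm sing})
\;\Longleftrightarrow\;
\gamma''=\frac{\gamma}{\gamma'+(d-d_{\rm sing})},
\qquad
\gamma_1=\gamma''(d-d_{\rm sing})=\frac{\gamma\,(d-d_{\rm sing})}{\gamma'+(d-d_{\rm sing})}.
\end{align*}
% Part (ii): with \varepsilon = e^{-\gamma' t} (a new rate \gamma' > 0) the terms in (51), (52) are
\begin{align*}
\varepsilon^{\alpha}=e^{-\alpha\gamma' t},
\qquad
\varepsilon^{\alpha-1}e^{-\gamma_1 t}=e^{-(\gamma_1-(1-\alpha)\gamma')t}.
\end{align*}
% Both decay exactly when 0 < (1-\alpha)\gamma' < \gamma_1, and then
% \gamma_2 = \min\{\alpha\gamma',\ \gamma_1-(1-\alpha)\gamma'\} is an admissible rate in (32).
```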
Theorem 1 is now a straightforward consequence of Proposition 1 and Theorem 6.

Let us end this section by discussing the meaning of the transversality condition. Let us first look at the example in which $\Lambda$ is the stable manifold of a periodic orbit $\gamma$ with period $T_\gamma$. Let $(r,x)\in S^1\times\mathbb{R}^{d-1}$ be coordinates on $\Lambda$ such that $\gamma$ is given by $x=0$ and $\Phi^t(r,x)=(r+t \bmod T_\gamma,\,x(t))$; then
\[ \int_\Lambda a\circ\Phi^t\,\sigma=\int_0^{T_\gamma}\int_{\mathbb{R}^{d-1}}a(r+t,x(t))\,\hat\sigma(r,x)\,dx\,dr. \tag{54} \]
With $|a(r+t,x(t))-a(r+t,0)|\le Ce^{-\gamma t}$ and by inserting the Fourier series $a(r,0)=\sum_{k\in\mathbb{Z}}a_k\,e^{\frac{2\pi}{T_\gamma}ikr}$ we obtain
\[ \int_\Lambda a\circ\Phi^t\,\sigma=\sum_{k\in\mathbb{Z}}a_k\,\tilde\sigma_k\,e^{\frac{2\pi}{T_\gamma}ikt}+O(e^{-\gamma t}) \tag{55} \]
with $\tilde\sigma_k=\int_0^{T_\gamma}\int_{\mathbb{R}^{d-1}}\hat\sigma(r,x)\,dx\;e^{\frac{2\pi}{T_\gamma}ikr}\,dr$. So in this case we do not get convergence for large times, and together with Proposition 1 this gives (15). This example shows that some condition on the position of $\Lambda$ with respect to the stable foliation is necessary.
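The step from (54) to (55) is a termwise integration of the Fourier series, which can be sketched as follows. The interchange of sum and integral is an assumption here (it holds, for instance, when the Fourier series of $a(\cdot,0)$ converges absolutely), and the single-mode example at the end is purely illustrative.

```latex
% Leading term of (54) after replacing a(r+t, x(t)) by a(r+t, 0):
\begin{align*}
\int_0^{T_\gamma}\!\!\int_{\mathbb{R}^{d-1}}a(r+t,0)\,\hat\sigma(r,x)\,dx\,dr
&=\sum_{k\in\mathbb{Z}}a_k\,e^{\frac{2\pi i}{T_\gamma}kt}
  \int_0^{T_\gamma}\!\!\int_{\mathbb{R}^{d-1}}\hat\sigma(r,x)\,dx\;e^{\frac{2\pi i}{T_\gamma}kr}\,dr
 =\sum_{k\in\mathbb{Z}}a_k\,\tilde\sigma_k\,e^{\frac{2\pi i}{T_\gamma}kt}.
\end{align*}
% Illustration: a(r,0) = \cos(2\pi r/T_\gamma), i.e. a_{\pm1} = 1/2 and all other a_k = 0.
% For real \hat\sigma one has \tilde\sigma_{-1} = \overline{\tilde\sigma_1}, hence
\begin{align*}
\sum_{k=\pm1}a_k\,\tilde\sigma_k\,e^{\frac{2\pi i}{T_\gamma}kt}
 =\operatorname{Re}\!\big(\tilde\sigma_1\,e^{\frac{2\pi i}{T_\gamma}t}\big),
\end{align*}
% which is T_\gamma-periodic in t and has no limit as t \to \infty unless \tilde\sigma_1 = 0.
```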
4. Integrable Systems

In this section we give the proofs of Theorem 2 and Theorem 3 and discuss the situation for integrable systems.

Proof (Theorem 2). By Proposition 1 we have to study the behaviour of
\[ \int_\Lambda\sigma(a)\circ\Phi^t\,|\sigma(\psi)|^2 \tag{56} \]
for large $t$. In action angle coordinates $(I,x)\in U\times\mathbb{T}^d$ we have $\Lambda=\{(I,x),\ x\in V\subset\mathbb{T}^d\}$ for a fixed $I\in U$, and so with $|\sigma(\psi)|^2=|\rho(x)|^2|dx|$ we get
\[ \int_\Lambda\sigma(a)\circ\Phi^t\,|\sigma(\psi)|^2=\int_{\mathbb{T}^d}\sigma(a)(I,x+t\omega(I))\,|\rho(x)|^2\,dx. \tag{57} \]
If we now insert the Fourier expansion in $x$, $\sigma(a)(I,x)=\sum_{m\in\mathbb{Z}^d}\alpha_m(I)\,e^{i\langle m,x\rangle}$, we obtain
\[ \int_\Lambda\sigma(a)\circ\Phi^t\,|\sigma(\psi)|^2=\sum_{m\in\mathbb{Z}^d}\alpha_m(I)\int_{\mathbb{T}^d}e^{i\langle x,m\rangle}|\rho(x)|^2\,dx\;e^{it\langle\omega(I),m\rangle}, \tag{58} \]
which is Eq. (17) in Theorem 2.
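To see what (58) says in the simplest situation, here is a one-mode illustration. The choices $d=1$ and $\sigma(a)(I,x)=\cos x$ are assumptions made only for this example, and $c_1$ is a name introduced here for the first Fourier coefficient of $|\rho|^2$.

```latex
% d = 1, \sigma(a)(I,x) = \cos x, so \alpha_{\pm1} = 1/2 and all other \alpha_m vanish.
% Write c_1 := \int_{\mathbb{T}} e^{ix}\,|\rho(x)|^2\,dx.
\begin{align*}
\int_{\mathbb{T}}\cos\big(x+t\,\omega(I)\big)\,|\rho(x)|^2\,dx
 &=\tfrac12\,c_1\,e^{it\omega(I)}+\tfrac12\,\overline{c_1}\,e^{-it\omega(I)}
 =\operatorname{Re}\!\big(c_1\,e^{it\omega(I)}\big).
\end{align*}
% The expectation value oscillates with frequency \omega(I) and amplitude |c_1|;
% it converges as t \to \infty only if c_1 = 0 or \omega(I) = 0.
```

This is the mechanism behind the absence of a limit in the case where the state is concentrated on a single invariant torus.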
Proof (Theorem 3). In order to prove Eq. (19) we notice that the transversality assumption on $\Lambda$ with respect to the foliation into invariant tori implies that action angle coordinates $(I,x)\in U\times V$ can be chosen such that $\Lambda$ can be locally represented by a generating function $\varphi:U\to\mathbb{R}$,
\[ \Lambda=\{(I,\varphi'(I)),\ I\in U\}. \tag{59} \]
Therefore we have
\[ \int_\Lambda\sigma(a)\circ\Phi^t\,|\sigma(\psi)|^2=\int_U\sigma(a)(I,\varphi'(I)+t\omega(I))\,|\hat\rho(I)|^2\,dI, \tag{60} \]
where we have written $|\sigma(\psi)|^2=|\hat\rho(I)|^2|dI|$. Inserting for $\sigma(a)$ again the Fourier expansion in $x$ leads to
\[ \int_\Lambda\sigma(a)\circ\Phi^t\,|\sigma(\psi)|^2=\sum_{m\in\mathbb{Z}^d}\int_U\alpha_m(I)\,e^{i\langle m,\varphi'(I)\rangle}\,e^{it\langle m,\omega(I)\rangle}\,|\hat\rho(I)|^2\,dI. \tag{61} \]
The non-degeneracy condition $\det\omega'(I)\neq0$ implies that there exists a constant $C>0$ such that
\[ |\nabla_I\langle\omega(I),m\rangle|\ge C|m| \tag{62} \]
for all $I\in\operatorname{supp}\hat\rho$. Now by the non-stationary phase estimates, see, e.g., [Hör90, Theorem 7.7.1], one gets
\[ \Big|\int_U\alpha_m(I)\,e^{i\langle m,\varphi'(I)\rangle}\,e^{it\langle m,\omega(I)\rangle}\,|\hat\rho(I)|^2\,dI\Big|\le C|m|\,|\alpha_m|_1\,|\hat\rho|_1^2\,\frac{1}{1+|t|} \tag{63} \]
for $m\neq0$. Since $\sigma(a)$ is $C^\infty$ the Fourier coefficients satisfy $|\alpha_m|_1=O_N(|m|^{-N})$ for all $N\in\mathbb{N}$, and therefore we finally obtain
\[ \int_\Lambda\sigma(a)\circ\Phi^t\,|\sigma(\psi)|^2=\int_U\alpha_0(I)\,|\hat\rho(I)|^2\,dI+O(1/t). \tag{64} \]
But $\alpha_0(I)=\int_{\mathbb{T}^d}\sigma(a)(I,x)\,dx$ and so the proof of Theorem 3 is complete.
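The $1/t$ decay of the terms with $m\neq0$ comes from a single integration by parts, which is the standard mechanism behind non-stationary phase estimates. The following sketch uses names introduced here for illustration only: $\phi_m(I):=\langle m,\omega(I)\rangle$ for the phase and $g_m$ for the amplitude. It assumes that $\hat\rho$ has compact support in $U$, so that no boundary terms appear.

```latex
% Notation (assumed): \phi_m(I) = \langle m,\omega(I)\rangle,
%                     g_m(I) = \alpha_m(I)\,e^{i\langle m,\varphi'(I)\rangle}\,|\hat\rho(I)|^2 .
% Since \nabla\phi_m \neq 0 on supp \hat\rho by (62), the operator
\begin{align*}
L:=\frac{1}{it}\,\frac{\nabla\phi_m\cdot\nabla_I}{|\nabla\phi_m|^2}
\qquad\text{satisfies}\qquad L\,e^{it\phi_m}=e^{it\phi_m}.
\end{align*}
% Integrating by parts once (no boundary terms because g_m has compact support in U):
\begin{align*}
\int_U g_m\,e^{it\phi_m}\,dI
 =\int_U g_m\,L\,e^{it\phi_m}\,dI
 =-\frac{1}{it}\int_U e^{it\phi_m}\;\nabla_I\!\cdot\!\Big(g_m\,\frac{\nabla\phi_m}{|\nabla\phi_m|^2}\Big)\,dI ,
\end{align*}
% so the integral is bounded by 1/|t| times an integral of first derivatives of g_m
% and second derivatives of \phi_m, controlled from below through (62).
```

The precise constant in (63) is the one provided by the cited theorem; the sketch only indicates where the factor $1/|t|$ originates.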
There are a couple of directions in which one can probably extend and improve Theorems 2 and 3. We have only studied the two extreme cases of the position of $\Lambda$ relative to the foliation into invariant tori. Certainly the transversal case is (locally) generic, but the case that the intersections are clean can be studied without much additional effort; one would expect an oscillatory behaviour in this case. It also appears to be very interesting to investigate the behaviour of the time evolution close to singularities of the foliation into invariant tori. Another direction in which one can extend some of the results is to more general classes of systems. Namely, by using normal forms around invariant tori in a general system one can extend Theorem 2 to that case. Such invariant tori occur typically in a situation described by KAM theory, e.g., for perturbed integrable systems, and close to elliptic orbits.

Acknowledgements. This work is a result of research during my stays at the SPhT in Saclay/Paris and the MSRI in Berkeley. I would like to thank Stéphane Nonnenmacher, Boris Gutkin and Andre Voros for interest and support. Furthermore I thank Stephan De Bièvre and the referee for helpful comments. This work has been fully supported by the European Commission under the Research Training Network (Mathematical Aspects of Quantum Chaos) no HPRN-CT-2000-00103 of the IHP Programme.
References

[BB79] Berry, M.V., Balazs, N.L.: Evolution of semiclassical quantum states in phase space. J. Phys. A 12(5), 625–642 (1979)
[BBTV79] Berry, M.V., Balazs, N.L., Tabor, M., Voros, A.: Quantum maps. Ann. Physics 122(1), 26–63 (1979)
[BDB00] Bonechi, F., De Bièvre, S.: Exponential mixing and |ln ħ| time scales in quantized hyperbolic maps on the torus. Commun. Math. Phys. 211(3), 659–686 (2000)
[BGP99] Bambusi, D., Graffi, S., Paul, T.: Long time semiclassical approximation of quantum flows: a proof of the Ehrenfest time. Asymptot. Anal. 21(2), 149–160 (1999)
[BR02] Bouzouina, A., Robert, D.: Uniform semiclassical estimates for the propagation of quantum observables. Duke Math. J. 111(2), 223–252 (2002)
[BS02] Brin, M., Stuck, G.: Introduction to dynamical systems. Cambridge: Cambridge University Press, 2002
[BW97] Bates, S., Weinstein, A.: Lectures on the geometry of quantization. Berkeley Mathematics Lecture Notes, Vol. 8, Providence, RI: Amer. Math. Soc. 1997
[BZ78] Berman, G. P., Zaslavsky, G. M.: Condition of stochasticity in quantum nonlinear systems. Phys. A 91(3–4), 450–460 (1978)
[CdV85] Colin de Verdière, Y.: Ergodicité et fonctions propres du laplacien. Commun. Math. Phys. 102(3), 497–502 (1985)
[Che97] Chernov, N. I.: On Sinai-Bowen-Ruelle measures on horocycles or 3-D Anosov flows. Geom. Dedicata 68(3), 359–369 (1997)
[Che98] Chernov, N. I.: Markov approximations and decay of correlations for Anosov flows. Ann. of Math. (2) 147(2), 269–324 (1998)
[CR97] Combescure, M., Robert, D.: Semiclassical spreading of quantum wave packets and applications near unstable fixed points of the classical flow. Asymptot. Anal. 14(4), 377–404 (1997)
[DBR03] De Bièvre, S., Robert, D.: Semiclassical propagation on |log ħ| time scales. Int. Math. Res. Not. Vol. 12, 667–696 (2003)
[Dol98] Dolgopyat, D.: On decay of correlations in Anosov flows. Ann. of Math. (2) 147(2), 357–390 (1998)
[DS99] Dimassi, M., Sjöstrand, J.: Spectral asymptotics in the semi-classical limit. London Mathematical Society Lecture Note Series, Vol. 268, Cambridge: Cambridge University Press, 1999
[Dui74] Duistermaat, J. J.: Oscillatory integrals, Lagrange immersions and unfolding of singularities. Comm. Pure Appl. Math. 27, 207–281 (1974)
[Ebe01] Eberlein, P.: Geodesic flows in manifolds of nonpositive curvature. In: Smooth ergodic theory and its applications (Seattle, WA, 1999), Proc. Sympos. Pure Math., Vol. 69, Providence, RI: Amer. Math. Soc. 2001, pp. 525–571
[EM93] Eskin, A., McMullen, C.: Mixing, counting, and equidistribution in Lie groups. Duke Math. J. 71(1), 181–209 (1993)
[FNDB03] Faure, F., Nonnenmacher, S., De Bièvre, S.: Scarred eigenstates for quantum cat maps of minimal periods. Commun. Math. Phys. 239(3), 449–492 (2003)
[HMR87] Helffer, B., Martinez, A., Robert, D.: Ergodicité et limite semi-classique. Commun. Math. Phys. 109(2), 313–326 (1987)
[Hör90] Hörmander, L.: The analysis of linear partial differential operators. I. Second ed., Grundlehren der Mathematischen Wissenschaften, Vol. 256, Berlin: Springer-Verlag, 1990
[Hör94] Hörmander, L.: The analysis of linear partial differential operators. IV. Grundlehren der Mathematischen Wissenschaften, Vol. 275, Berlin: Springer-Verlag, 1985
[Ivr98] Ivrii, V.: Microlocal analysis and precise spectral asymptotics. Springer Monographs in Mathematics, Berlin: Springer-Verlag, 1998
[Liv04] Liverani, C.: On contact Anosov flows. Ann. of Math. 159, 1275–1312 (2004)
[OTH92] O'Connor, P. W., Tomsovic, S., Heller, E. J.: Semiclassical dynamics in the strongly chaotic regime: breaking the log time barrier. Phys. D 55(3–4), 340–357 (1992)
[Sin95] Sinai, Ya. G.: Geodesic flows on manifolds of negative curvature. In: Algorithms, fractals, and dynamics (Okayama/Kyoto, 1992), New York: Plenum, 1995, pp. 201–215
[Šni74] Šnirel'man, A. I.: Ergodic properties of eigenfunctions. Uspehi Mat. Nauk 29, 6(180), 181–182 (1974)
[TH91] Tomsovic, S., Heller, E. J.: Semiclassical dynamics of chaotic motion: unexpected long-time accuracy. Phys. Rev. Lett. 67(6), 664–667 (1991)
[TH93] Tomsovic, S., Heller, E. J.: Long-time semiclassical dynamics of chaos: the stadium billiard. Phys. Rev. E (3) 47(1), 282–299 (1993)
[Zas81] Zaslavsky, G. M.: Stochasticity in quantum systems. Phys. Rep. 80(3), 157–250 (1981)
[Zel87] Zelditch, S.: Uniform distribution of eigenfunctions on compact hyperbolic surfaces. Duke Math. J. 55(4), 919–941 (1987)
[Zel96] Zelditch, S.: Quantum mixing. J. Funct. Anal. 140, 68–86 (1996)

Communicated by P. Sarnak
Commun. Math. Phys. 256, 255–285 (2005) Digital Object Identifier (DOI) 10.1007/s00220-005-1295-8
Communications in
Mathematical Physics
Noncommutative Riemannian and Spin Geometry of the Standard q-Sphere S. Majid School of Mathematical Sciences, Queen Mary, University of London, 327 Mile End Rd, London E1 4NS, UK Received: 4 August 2003 / Accepted: 17 February 2004 Published online: 30 March 2005 – © Springer-Verlag 2005
Abstract: We study the quantum sphere $C_q[S^2]$ as a quantum Riemannian manifold in the quantum frame bundle approach. We exhibit its 2-dimensional cotangent bundle as a direct sum $\Omega^{0,1}\oplus\Omega^{1,0}$ in a double complex. We find the natural metric, volume form, Hodge * operator, Laplace and Maxwell operators and projective module structure. We show that the q-monopole as spin connection induces a natural Levi-Civita type connection and find its Ricci curvature and q-Dirac operator $\nabla\!\!\!\!/$. We find the possibility of an antisymmetric volume form quantum correction to the Ricci curvature and Lichnerowicz-type formulae for $\nabla\!\!\!\!/^{\,2}$. We also remark on the geometric q-Borel-Weil-Bott construction.

1. Introduction

The standard quantum sphere is nothing other than the invariant subalgebra of the standard quantum group coordinate ring $C_q[SL_2]$ under a coaction of $\mathbb{C}[t,t^{-1}]$. In the ∗-algebra setting it means $SU_2/U(1)$ in a coordinate form and of course q-deformed. Other nonstandard quantum spheres were constructed and classified in [Po1], while a unique left-covariant 2-dimensional differential calculus on the standard $C_q[S^2]$ was found in [Po2, Po3]. Meanwhile, the q-monopole principal bundle with total space $C_q[SL_2]$ and base $C_q[S^2]$ (i.e. the Hopf fibration) was constructed as an example of the theory of quantum principal bundles in [BM1] and has been somewhat studied since, so that many of the ingredients of geometry for $C_q[S^2]$ are already known. In this paper we extend this geometry of $C_q[S^2]$ to include Riemannian structures and a geometrically natural Dirac operator, using a systematic frame bundle approach to noncommutative geometry in [M1, M2, M3].
In fact the point is not just to obtain good geometrically justified proposals for these structures on the algebra Cq [S 2 ] in isolation, but rather to demonstrate that the frame bundle formulation, which is formulated in principle at the level of any unital algebra, indeed includes such an important example and gives reasonable answers for it. This is important because without the straight-jacket
of a general theory that applies across diverse examples (including ones not related to q-deformations) one could not have confidence that a given definition was not ad-hoc, without which one could not attach weight to physical or other predictions. We find that $C_q[S^2]$ indeed fits perfectly into this quantum frame bundle approach to noncommutative geometry as a 'quantum framed Riemannian manifold'. Another motivation comes from the operator-algebras and K-theory approach to noncommutative geometry of Connes and others [Co]. How this can be reconciled with quantum groups is an active and important area of research at the moment. While we proceed only from the quantum groups side, by going up to the point of the q-geometrically natural Dirac operator and q-spin bundle, one can begin to compare with the 'top down' Connes approach where an axiomatically defined 'Dirac' operator implicitly defines the geometry. We find for example that our $\nabla\!\!\!\!/$ indeed generates the exterior derivative by commutator, as it should (Eq. (30)).

The principal features of the geometry on $C_q[S^2]$ that we find are as follows. The most important feature is that, unlike previous examples based on quantum groups, the sphere is not parallelizable. Hence there is no global 'vielbein' and one must work 'upstairs' on the total space of the frame bundle for global formulae. Unlike usual formulae in physics, in the noncommutative case we do not consider coordinate charts or patching transformations; instead we use only global constructions. Here $C_q[S^2]$ is the simplest example with this difficulty and hence a good setting in which to demonstrate that the frame bundle theory works. In fact we show, Theorem 2.1, that any quantum homogeneous space induced by a Hopf algebra surjection is a framed quantum manifold (we construct the soldering form). The construction at the level of universal calculi was in [M1] but we extend this to general differential calculi as needed for our example.

To be self-contained, Sect. 2 starts by recalling the required quantum homogeneous bundle construction itself (before using it as a frame bundle). The rest of the paper computes what the general frame bundle approach implies for the particular example of $C_q[S^2]$. Here for the quantum frame bundle we take the quantum Hopf fibration or 'q-monopole principal bundle' from [BM1, BM2], in an appropriate form. The role of the fiber SO(2) frame rotations is played by the commutative Hopf algebra $\mathbb{C}[t,t^{-1}]$ equipped with a noncommutative q-deformation of its usual calculus. We then find the cotangent bundle $\Omega^1(C_q[S^2])$ as an associated bundle to this. This means that, like all associated bundles in this context (when there is a connection), the cotangent bundle is necessarily projective, a point of view in keeping with other approaches such as [Co]. We later (in Sect. 5) exhibit its nontrivial projector explicitly. This is in spite of the fact that we find (Theorem 3.1) that the cotangent bundle is the sum of a charge −2 and charge 2 monopole, which means that it is zero in the (noncommutative) K-theory of $C_q[S^2]$ (this is actually in keeping with the classical geometry). More importantly for us, Theorem 3.1 implies a natural direct sum decomposition of the cotangent bundle into much simpler "holomorphic" and "antiholomorphic" parts $\Omega^1(C_q[S^2])=\Omega^{0,1}\oplus\Omega^{1,0}$ according to the monopole charge, which we then use extensively in the sequel. Section 4 covers the next 'layer' of geometry in the form of the exterior algebra, metric, Hodge * operator, Laplace operator and Maxwell theory. In particular, the natural $C_q[SL_2]$-covariant metric $g$, Hodge ∗ and volume 2-form (or symplectic structure) $\Upsilon$ lifted to an element $i(\Upsilon)\in\Omega^1\otimes\Omega^1$ are naturally related as $(\mathrm{id}\otimes *)(g)\propto i(\Upsilon)$ (Proposition 4.3).
Section 5 then comes to the Riemannian geometry and contains our main result (Theorem 5.1) that the q-monopole as spin connection on the frame bundle indeed induces the correct generalized Levi-Civita connection ∇ on the cotangent bundle for its natural
metric. This is torsion free and ‘cotorsion free’. The latter is a natural formulation of metric compatibility in the skew form (∇ ∧ id − id ∧ ∇)g = 0. This condition was proposed in the axioms of [M1, M2] for ‘quantum Riemannian manifolds’ as a weakening of usual metric compatibility suggested by noncommutative geometry. We see that Cq [S 2 ] indeed bears this out. We also compute the Riemann curvature of ∇ as a 2-form-valued operator on 1-forms, and from this the physically all-important Ricci curvature. There is some freedom in the definition of this involved in choosing the lifting map i but we find that with the natural i(ϒ) modified by the addition of a q-symmetric metric term, one has Ricci proportional to the metric. Thus Cq [S 2 ] can be made into an ‘Einstein space’ (Proposition 5.2). Alternatively, we could not modify i(ϒ), in which case we find Ricci =
[2]q (1 − q 4 ) q −1 (1 + q 4 ) g+ i(ϒ), 2 2
showing a quantum correction involving the q-antisymmetric volume form or symplectic structure. This effect could also be formulated as a q-antisymmetric addition to the metric itself, which would be in keeping with ideas from string theory, for example. While the correct physical point of view and consequent predictions of q-modifications to gravity would need an understanding of the noncommutative stress energy tensor (which will be attempted elsewhere), we see the possibility of a new physical effect that vanishes as q → 1.

Finally, for the spin bundle we take $S=S_-\oplus S_+$, the direct sum of the charge −1 and charge 1 q-monopole bundles. This is then the correct 'double cover' of the cotangent bundle in terms of the corepresentation of $\mathbb{C}[t,t^{-1}]$. The same q-monopole connection on the frame bundle as used for $\nabla$ now induces a covariant derivative $D$ on $S$. This combines with a natural $C_q[SL_2]$-covariant γ-matrix which we provide, to give our gravitational Dirac operator $\nabla\!\!\!\!/$. It has the correct $\mathbb{Z}_2$-graded form and we show (Proposition 5.5 and Eq. (29)) that its square is linked to the scalar Laplacian. It is also relatively computable. For example,
\[ \begin{pmatrix}\pm q^{\frac12}a\\ b\end{pmatrix},\qquad\begin{pmatrix}\pm q^{\frac12}c\\ d\end{pmatrix} \]
are eigenspinors of mass $\pm q^{\frac12}$, where $a,b,c,d$ are the usual quantum group $C_q[SL_2]$ generators, viewed now as spinor components. We work algebraically and do not provide Hilbert space or other analytic structures; this would need further study. We do, however, show that unlike the cotangent bundle, our spinor bundle is trivial and we exhibit its trivialisation $S\cong C_q[S^2]\oplus C_q[S^2]$. Our $\nabla\!\!\!\!/$ in the trivialisation appears to be more complicated than previous attempts at the Dirac operator on the q-sphere such as [BK, DS, Ow, PS], but comes with the full geometrical picture above. The appendix applies the q-monopole connection to formulate the q-Borel-Weil-Bott construction as a byproduct of the q-geometry in the paper. The generalisation of this to other quantum groups and of the Riemannian and spin q-geometry to other q-symmetric spaces are two directions for further work. Notably for physics, a suitable $C_q[S^4]$ and q-instanton are known [BCT] but require the more general coalgebra bundle theory for which nonuniversal differential calculi are not yet formulated. We cover only the case of generic q in this paper.
Preliminaries. We take $C_q[SL_2]$ in the conventions of, for example, the text [Ma]. Namely, it has a matrix of generators $\begin{pmatrix}a&b\\ c&d\end{pmatrix}$ with $ba=qab$, etc. These are the 'lexicographical conventions' whereby $q$ is needed to put things in lexicographical order. We will frequently use the q-determinant relations $ad=1+q^{-1}bc$ and $da=1+qbc$. The Hopf algebra structure has the usual matrix coproduct and counit on the generators, and the antipode or 'linearised inverse' is $Sa=d$, $Sd=a$, $Sb=-qb$, $Sc=-q^{-1}c$. For the axioms of a Hopf algebra and basic notions such as actions and coactions, we refer to [Ma]. We use the Sweedler notation [Sw] whereby $\Delta a=a_{(1)}\otimes a_{(2)}$ and $(\mathrm{id}\otimes\Delta)\Delta a=a_{(1)}\otimes a_{(2)}\otimes a_{(3)}$, etc. We will frequently need the right adjoint coaction $\mathrm{Ad}_R(a)=a_{(2)}\otimes(Sa_{(1)})a_{(3)}$. We will write $A^+\subset A$ to denote the augmentation ideal (the kernel of $\epsilon$). General constructions work over any field but our motivating point of view is over $\mathbb{C}$, which we retain for convenience. We then take $q\in\mathbb{C}$ as a generic parameter, in particular invertible and not a root of unity. The root of unity case can also be handled as in [M3], but we note that the q-monopole bundle in this case is trivial and hence less interesting.

We recall that a differential calculus of an algebra $A$ means an $A$-$A$-bimodule $\Omega^1$ and a map $\mathrm{d}:A\to\Omega^1$ obeying the Leibniz rule and such that $\Omega^1$ is spanned by 1-forms of the form $a\,\mathrm{d}b$. A calculus on a Hopf algebra is left-covariant if the coproduct $\Delta:A\to A\otimes A$ viewed as a left coaction extends to a left coaction $\Delta_L$ on $\Omega^1$ such that $\mathrm{d}$ is an intertwiner and $\Delta_L$ is a bimodule map. In this case, having a Hopf-module, one knows that $\Omega^1\cong A\otimes\Lambda^1$, where $\Lambda^1$ are the left-invariant 1-forms, and that $\Lambda^1=A^+/I_A$ where $I_A$ is some right ideal contained in $A^+$ [Wo]. On the Hopf algebra $P=C_q[SL_2]$ we take the 3-d calculus of [Wo]. In our conventions this has a basis
\[ e^-=d\,\mathrm{d}b-qb\,\mathrm{d}d,\qquad e^+=q^{-1}a\,\mathrm{d}c-q^{-2}c\,\mathrm{d}a,\qquad e^0=d\,\mathrm{d}a-qb\,\mathrm{d}c \]
of left-invariant 1-forms, is spanned by these as a left module (according to the above), while the right module relations and exterior derivative are given in these terms by:
\[ e^\pm\begin{pmatrix}a&b\\ c&d\end{pmatrix}=\begin{pmatrix}qa&q^{-1}b\\ qc&q^{-1}d\end{pmatrix}e^\pm,\qquad e^0\begin{pmatrix}a&b\\ c&d\end{pmatrix}=\begin{pmatrix}q^2a&q^{-2}b\\ q^2c&q^{-2}d\end{pmatrix}e^0, \tag{1} \]
\[ \mathrm{d}a=ae^0+qbe^+,\qquad\mathrm{d}b=ae^--q^{-2}be^0,\qquad\mathrm{d}c=ce^0+qde^+,\qquad\mathrm{d}d=ce^--q^{-2}de^0. \]
Our conventions for $e^\pm$ have been chosen with hindsight to fit the frame bundle geometry, see Theorem 3.1. The corresponding ideal is
\[ I_P=\langle a+q^2d-(1+q^2),\ b^2,\ c^2,\ bc,\ (a-1)b,\ (d-1)c\rangle. \tag{2} \]
Next, we let $A=\mathbb{C}[t,t^{-1}]$ be a Hopf algebra with $\Delta t=t\otimes t$ and $St=t^{-1}$. This coacts on $C_q[SL_2]$, making it a comodule-algebra. Actually, a coaction here is the same thing as a $\mathbb{Z}$-grading and in our case the degrees are
\[ \deg(a)=\deg(c)=1,\qquad\deg(b)=\deg(d)=-1. \]
By definition the standard q-sphere $C_q[S^2]$ is the degree zero (i.e. invariant) subalgebra of $C_q[SL_2]$. It is a polynomial algebra $\mathbb{C}\langle b_0,b_\pm\rangle$ with inherited relations
Riemannian Geometry of the Standard q-Sphere
b± b0 = q ±2 b0 b± ,
259
q 2 b− b+ = q −2 b+ b− + (1 − q −2 )b0 , b0 (1 + qb0 ) = b+ b− .
(3)
This last can also be written as $b_0(1 + q^{-1}b_0) = q^2 b_- b_+$. Here $b_0 = bc$, $b_+ = cd$ and $b_- = ab$. The first line of relations becomes, as $q \to 1$, the statement that the algebra is commutative, while (3) becomes the sphere relation in terms of $b_\pm$ complex and $b_0 + 1/2$. Moreover, the coproduct of $C_q[SL_2]$ restricts to $C_q[S^2]$ as a left coaction $\Delta_L : C_q[S^2] \to C_q[SL_2] \otimes C_q[S^2]$. General (2-parameter) ‘quantum spheres’ from the point of view of left comodule algebras were obtained in [Po1]. On this q-sphere we inherit a differential calculus from the one above. It is not free over $C_q[S^2]$ so we do not have a basis. But it is spanned by
\[ \mathrm{d}b_+ = \mathrm{d}(cd) = d^2 e^+ + c^2 e^-, \quad \mathrm{d}b_- = \mathrm{d}(ab) = b^2 e^+ + a^2 e^-, \quad \mathrm{d}b_0 = \mathrm{d}(bc) = q\,bd\,e^+ + q\,ac\,e^-, \tag{4} \]
using the Leibniz rule and the relations above. The inherited bimodule structure is far from trivial and will be recovered below by our own means. It is equivalent to formulae in [Po2]. The calculus inherits a left coaction of $C_q[SL_2]$ extending its coaction on $C_q[S^2]$. Finally, the calculi in both cases extend to entire exterior algebras. For $C_q[SL_2]$ the natural extension compatible with the super-Leibniz rule on higher forms and $\mathrm{d}^2 = 0$ is:
\[ \mathrm{d}e^0 = q^3 e^+ \wedge e^-, \qquad \mathrm{d}e^\pm = \mp q^{\pm 2}[2; q^{-2}]\, e^\pm \wedge e^0, \qquad q^2 e^+ \wedge e^- + e^- \wedge e^+ = 0, \]
\[ (e^\pm)^2 = (e^0)^2 = 0, \qquad e^0 \wedge e^\pm + q^{\pm 4} e^\pm \wedge e^0 = 0, \]
where $[n; q] = (1 - q^n)/(1 - q)$ denotes a q-integer. This means that there are the same dimensions as classically, including a unique top form $e^- \wedge e^+ \wedge e^0$. Again, these facts are well-known, but given here in our required conventions. For $C_q[S^2]$ the exterior calculus is not so well-known and we obtain it below.

2. Framings on Nonuniversal Quantum Homogeneous Spaces

The general formulation of a quantum principal bundle with nonuniversal calculi is as follows [BM1, BM2]. As a ‘total space coordinate ring’ we suppose an algebra $P$ and for the fiber a Hopf algebra $A$. We suppose that $P$ is a right $A$-comodule algebra by a coaction $\Delta_R$ and define the fixed subalgebra $M = P^A = \{p \in P \mid \Delta_R p = p \otimes 1\}$ for the ‘functions’ on the base. For a bundle at the topological level we require that
\[ 0 \to P(\Omega^1 M)P \to \Omega^1 P \xrightarrow{\ \mathrm{ver}\ } P \otimes A^+ \to 0 \tag{5} \]
is exact where $\Omega^1 P \subset P \otimes P$ is the universal calculus associated to any unital algebra (given by the kernel of the product map). The map on the right is $\mathrm{ver}(p \otimes p') = p\,\Delta_R p'$, the generator of vertical vector fields. This exactness is equivalent to the similar map $P \otimes_M P \to P \otimes A$ being an isomorphism (a ‘Hopf-Galois’ extension [Sch]).
For a bundle with general nonuniversal calculi on our algebras, we require in addition that the calculi are compatible in the sense
\[ N_M = N_P \cap \Omega^1 M, \tag{6} \]
\[ \Delta_R N_P \subseteq N_P \otimes A, \tag{7} \]
\[ \mathrm{ver}(N_P) = P \otimes I_A, \tag{8} \]
where $\Omega^1(P) = (\Omega^1 P)/N_P$ defines the calculus on $P$ as the quotient of the universal one by a subbimodule, and similarly for $\Omega^1(M)$. It is assumed that $\Omega^1(A)$ is left covariant, with ideal $I_A$ as explained in the Preliminaries. Here (6) ensures that $\Omega^1(M) = \mathrm{span}\{m\,\mathrm{d}_P n \mid n, m \in M\} \subseteq \Omega^1(P)$ while (7) ensures that $\Omega^1(P)$ is left covariant. The coaction on $\Omega^1 P$ here is the tensor product of the coaction on each $P$. Finally, (8) ensures that
\[ \mathrm{ver} : \Omega^1(P) \to P \otimes \Lambda^1, \qquad \Lambda^1 = A^+/I_A, \]
is well-defined and ensures exactness of
\[ 0 \to P\,\Omega^1(M)P \to \Omega^1(P) \xrightarrow{\ \mathrm{ver}\ } P \otimes \Lambda^1 \to 0. \tag{9} \]
This is equivalent to the original formulation in [BM1] based on such an exact sequence, as explained in [BM2]. In effect, we put differential structures and ensure that all relevant maps are compatible. Finally, in this theory, a connection is defined [BM1] as an equivariant splitting of $\Omega^1(P)$ providing a complement to the ‘horizontal forms’ $P\,\Omega^1(M)P$. If we assume that the calculus on $A$ is bicovariant then a connection is equivalent to an intertwiner $\omega : \Lambda^1 \to \Omega^1(P)$ such that $\mathrm{ver} \circ \omega = 1 \otimes \mathrm{id}$. Here $\Lambda^1$ has the right adjoint coaction inherited from that on $A^+$.

For the purposes of this paper, the main example is a ‘quantum homogeneous bundle’ [BM1, BM2] based on a surjection $\pi : P \to A$ of Hopf algebras. This corresponds geometrically to an inclusion of groups, and just as the subgroup then acts by right multiplication, here $A$ can be shown to coact on $P$ by $\Delta_R = (\mathrm{id} \otimes \pi)\Delta : P \to P \otimes A$. As above, we first construct the bundle with universal calculus and ‘quantum homogeneous space base’ $M = P^A$, i.e. we suppose (5). We then impose differential structures, for which we assume that $\Omega^1(P)$ is left-covariant and $\Omega^1(A)$ is bicovariant. We ensure (6) by taking it as a definition of $\Omega^1(M)$, while the remaining conditions (7)–(8) for a bundle with these nonuniversal calculi reduce to
\[ (\mathrm{id} \otimes \pi)\mathrm{Ad}_R(I_P) \subseteq I_P \otimes A, \qquad \pi(I_P) = I_A. \tag{10} \]
This follows immediately using $N_P = \{p\,Sq_{(1)} \otimes q_{(2)} \mid p \in P,\ q \in I_P\}$ and computing $\Delta_R$, $\mathrm{ver}$ on such elements.
If one wants a connection, one can do this at the universal level via a bicovariant splitting map $i : A \to P$. Thus,
\[ \Delta_R \circ i = (i \otimes \mathrm{id}) \circ \Delta, \quad (\pi \otimes \mathrm{id}) \circ \Delta \circ i = (\mathrm{id} \otimes i) \circ \Delta \quad \Rightarrow \quad \omega(a) = S\,i(a)_{(1)}\, \mathrm{d}\, i(a)_{(2)} \tag{11} \]
is a connection. One in fact needs only a weaker $\mathrm{Ad}_R$-covariance condition [BM1] but the stronger bicovariance implies this [HM] and is the condition that is relevant below. In either case the map $i$ then descends and defines a connection on the general bundle with nonuniversal calculus if
\[ i(I_A) \subseteq I_P. \tag{12} \]
A further refinement of these constructions for quantum principal bundles can be found in [BM2]. Up till now we have recalled the known quantum bundle construction itself. We are now ready to give the nonuniversal version of the frame bundle construction. An algebra $M$ is framed if it is the base of a quantum principal bundle as above to which $\Omega^1(M)$ is an associated bundle. The frame quantum group fiber need not be unique but its choice determines what kind of connections $\nabla$ on $\Omega^1(M)$ may be induced from connections on the frame bundle [M1, M2]. Thus, apart from a bundle over $M$ as above, we require for a framing an $A$-comodule $V$. Then $E = (P \otimes V)^A$ (the fixed submodule) plays the role of sections of the associated bundle. Finally, we require a ‘soldering form’ $\theta : V \to P\,\Omega^1(M)$ such that the induced left $M$-module map
\[ s_\theta : E \to \Omega^1(M), \qquad p \otimes v \mapsto p\,\theta(v) \]
is an isomorphism.

Theorem 2.1. Let $\pi : P \to A$ be a quantum homogeneous bundle with general differential calculi obeying (10) as above. Then $M = P^A$ is framed by the bundle and
\[ V = (P^+ \cap M)/(I_P \cap M), \qquad \Delta_R v = \tilde v_{(2)} \otimes S\pi(\tilde v_{(1)}), \qquad \theta(v) = S\tilde v_{(1)}\, \mathrm{d}\tilde v_{(2)}, \]
where $\tilde v$ is a representative of $v$ in $P^+ \cap M$. Hence every quantum homogeneous space of this type is a ‘quantum manifold’ in the framed sense.

Proof. This construction for universal calculi is in [M1, Prop. 4.3] so we have mainly to check that various maps descend to the quotients needed for the nonuniversal calculi. First observe that $v \in M$ means by definition $v_{(1)} \otimes \pi(v_{(2)}) = v \otimes 1$. Moreover, if $v \in M$ then $v_{(1)} \otimes v_{(2)} \in P \otimes M$ because $v_{(1)} \otimes v_{(2)(1)} \otimes \pi(v_{(2)(2)}) = v_{(1)(1)} \otimes v_{(1)(2)} \otimes \pi(v_{(2)}) = v_{(1)} \otimes v_{(2)} \otimes 1$. Similarly, $\Delta_R v = v_{(2)} \otimes \pi(Sv_{(1)}) = v_{(1)(2)} \otimes \pi(Sv_{(1)(1)})\pi(v_{(2)}) = v_{(2)} \otimes \pi((Sv_{(1)})v_{(3)})$, which is the projected adjoint action. Hence if $v \in I_P \cap M$ we see from (10) and from the above that $\Delta_R v \in (I_P \cap M) \otimes A$. Hence $\Delta_R$ descends to $V$. Incidentally, if $v \in P^+ \cap M$ then $\varepsilon(v_{(2)})\pi(Sv_{(1)}) = \pi(Sv) = S\pi(v_{(2)})\varepsilon(v_{(1)}) = 1\varepsilon(v) = 0$ so $\Delta_R$ is defined on $P^+ \cap M$ in the first place (this is the same as for the universal calculus case). Meanwhile, if $v \in I_P$ then $S\tilde v_{(1)} \otimes \tilde v_{(2)} \in N_P$ and hence $\theta(v) = 0$ in $\Omega^1(P)$,
so this is well-defined. Moreover, if $\tilde v \in M$ is a representative of $v \in V$ then by the above remark, $\theta(v) = S\tilde v_{(1)}\, \mathrm{d}\tilde v_{(2)} \in P\,\Omega^1(M)$ as required. That $\theta$ is equivariant follows from this property proven for the universal calculi in [M1] to which we refer for the proof. Hence all maps are defined as required and we have $s_\theta : (P \otimes V)^A \to \Omega^1(M)$. It remains to give its inverse, which we do by quotienting the inverse in the universal calculus case, namely
\[ s_\theta^{-1}(m\,\mathrm{d}n) = [m n_{(1)} \otimes n_{(2)} - mn \otimes 1], \qquad \forall m, n \in M, \]
where the expression in square brackets lies in $P \otimes (P^+ \cap M)$ (again using the observation above) and $[\ ]$ denotes the equivalence class modulo $I_P \cap M$. That the result actually lies in $(P \otimes V)^A$ and gives the inverse of $s_\theta$ follows in the same way as in the universal case in [M1].

Using $\omega$ such as from (11), one may define the covariant derivative
\[ D : E \to \Omega^1(M) \otimes_M E, \qquad D = (\mathrm{id} - \Pi_\omega)\mathrm{d}, \]
where we apply $\mathrm{d} \otimes \mathrm{id}$ to $E$ and $\Pi_\omega = \cdot(\mathrm{id} \otimes \omega) \circ \mathrm{ver}$ is the vertical projection. When one takes the universal calculus this implies that $E$ is a projective module and $D$ is the Grassmann connection associated to the projector [HM]. In our case we get other connections which we will study in the next section. Also, using the framing, we obtain
\[ \nabla = (\mathrm{id} \otimes s_\theta) \circ D \circ s_\theta^{-1} : \Omega^1(M) \to \Omega^1(M) \otimes_M \Omega^1(M). \]
Both $D$ and $\nabla$ behave as covariant derivatives (so $\nabla(m\tau) = \mathrm{d}m \otimes_M \tau + m\nabla\tau$ for any function $m \in M$ and 1-form $\tau$). Hence we need only give $\nabla$ on exact forms.

Proposition 2.2. For the canonical connection induced by $i : A \to P$ on a quantum homogeneous space,
\[ \nabla(\mathrm{d}m) = \mathrm{d}\big(m_{(1)}\, S\,i \circ \pi(m_{(2)})_{(1)}\big) \otimes_M i \circ \pi(m_{(2)})_{(2)}\, Sm_{(3)}\, \mathrm{d}m_{(4)}. \]
This is the nonuniversal version of a similar formula with the universal differential calculus in [M1]. The proof is similar.

3. Framing and Holomorphic Calculus on Standard q-Sphere $C_q[S^2]$

We start by recalling the known q-monopole bundle itself since we will need it in full detail when we use it as frame bundle. We fix the calculus on $P = C_q[SL_2]$ to be the 3-d one as explained in the Preliminaries. We take $A = \mathbb{C}[t, t^{-1}]$ and
\[ \pi(a) = t, \qquad \pi(b) = \pi(c) = 0, \qquad \pi(d) = t^{-1}. \]
The right coaction $\Delta_R = (\mathrm{id} \otimes \pi)\Delta$ works out as
\[ \Delta_R \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \otimes \begin{pmatrix} t & 0 \\ 0 & t^{-1} \end{pmatrix} \]
so that $\Delta_R a = a \otimes t$, etc. corresponding to $\deg(a) = 1$, etc. Then $M = P^A = C_q[S^2]$ as explained in the Preliminaries. It is known that we have a quantum homogeneous bundle with universal calculi. As in [BM1] we then take $I_A = \pi(I_P) = \langle t + q^2 t^{-1} - (1 + q^2) \rangle = \langle (t - 1)(t - q^2) \rangle$ (we factored $t^{-1}$ out of the generator obtained from projecting those of $I_P$). Now this $I_A$ defines the 1-dimensional calculus on $\mathbb{C}[t, t^{-1}]$ with basis $\mathrm{d}t = t \otimes [t - 1]$ (where $[\ ]$ denotes modulo $I_A$) and relations
\[ \mathrm{d}t.t = t^2 \otimes [t - 1]t = t^2 \otimes [(t - 1)(t - q^2)] + t^2 \otimes q^2[t - 1] = q^2 t^2 \otimes [t - 1] = q^2 t\,\mathrm{d}t. \]
This is a q-differential calculus whereby $\mathrm{d}(t^m) = [m; q^2]\, t^{m-1}\mathrm{d}t$ so that the relevant partial derivative is the usual q-derivative. We also verify that $I_P$ obeys the $\mathrm{Ad}_R$ condition in (10). Indeed, the element $a + q^2 d$ (the q-trace) is $\mathrm{Ad}_R$-invariant. Meanwhile
\[ (\mathrm{id} \otimes \pi)\mathrm{Ad}_R(b^2) = b^2 \otimes t^{-4}, \qquad (\mathrm{id} \otimes \pi)\mathrm{Ad}_R(c^2) = c^2 \otimes t^4 \]
and so forth. Hence we have the quantum sphere as a quantum homogeneous space where the calculus on it is obtained by restriction of that on $C_q[SL_2]$ as required in (6). Finally,
\[ i(t^n) = a^n, \qquad i(t^{-n}) = d^n, \qquad \forall n \ge 0 \]
defines a natural connection in the bundle via (11) as follows. We first verify that
\[ i(I_A) = \mathrm{span}\{a^m(a - 1)(a - q^2),\ a + q^2 d - (1 + q^2),\ d^m(d - 1)(d - q^2)\} \subseteq I_P. \]
Here the middle term is already in $I_P$. Hence also, multiplying it by $(a - q^2)$ we have $(a - 1)(a - q^2) + q^2(d - 1)(a - q^2) \in I_P$. The second term is $q^2(da - 1 - q^2 d - a + q^2 + 1)$ which lies in $I_P$ since $da - 1 = qbc \in I_P$. Hence $(a - 1)(a - q^2) \in I_P$. Similarly for $(d - 1)(d - q^2)$. It follows that the canonical connection defined by this $i$ descends to the chosen nonuniversal calculi as explained in Sect. 2. The resulting q-monopole connection is
\[ \omega(t^n) = [n; q^2]\, e^0 \tag{13} \]
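for all integers $n$. This is easily proven by induction as follows. First of all, $\omega(t) = (Sa_{(1)})\mathrm{d}a_{(2)} = d\,\mathrm{d}a - qb\,\mathrm{d}c = e^0$ from the definition of the 3-d calculus in the Preliminaries, so the statement is true for $n = 1$. It is trivially true for $n = 0$. When $n \ge 2$,
\[ \begin{aligned} \omega(t^n) &= S(a^n)_{(1)}\, \mathrm{d}(a^n)_{(2)} = S(a_{(1)} a'_{(1)} a''_{(1)} \cdots)\, \mathrm{d}(a_{(2)} a'_{(2)} a''_{(2)} \cdots) \\ &= S(a'_{(1)} a''_{(1)} \cdots)\, Sa_{(1)} \big( (\mathrm{d}a_{(2)})\, a'_{(2)} a''_{(2)} \cdots + a_{(2)}\, \mathrm{d}(a'_{(2)} a''_{(2)} \cdots) \big) \\ &= \omega(t^{n-1}) + S(a'_{(1)} a''_{(1)} \cdots)\, \omega(t)\, a'_{(2)} a''_{(2)} \cdots \\ &= \omega(t^{n-1}) + S(a'_{(1)} a''_{(1)} \cdots)\, e^0\, a'_{(2)} a''_{(2)} \cdots = \omega(t^{n-1}) + q^{2(n-1)} e^0, \end{aligned} \]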
where $a^n = a a' a'' \cdots$ is the product of $n$ copies of the generator $a \in C_q[SL_2]$ (the primes are to keep the instances apart). We used the antimultiplicativity of the antipode $S$ and the Leibniz rule for the third equality. For the last equality we used that $a'_{(2)} a''_{(2)} \cdots$ has degree $n - 1$ and hence its commutation relations with $e^0$ give a factor $q^{2(n-1)}$ after which we cancel using the antipode axioms and $\varepsilon(a) = 1$. Then we do the same with $n = -n'$, $n' \ge 0$ giving $\omega(t^n) = S(d^{n'})_{(1)}\, \mathrm{d}(d^{n'})_{(2)}$ and a factor $q^{-2(n'-1)}$ at the corresponding point. We also have $\omega(t^{-1}) = -q^{-2} e^0$. The two halves of the computation combine to the uniform answer (13). The curvature [BM1] of the q-monopole connection is
\[ F_\omega(t^n) = \mathrm{d}\omega(t^n) + \omega(t^n) \wedge \omega(t^n) = [n; q^2]\, \mathrm{d}e^0 = q^3 [n; q^2]\, e^+ \wedge e^-. \tag{14} \]
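The induction behind (13) can be checked numerically. The sketch below is an illustration only: it assumes a sample rational value of $q$ and compares the closed form $[n; q^2]$ with the sum built from the inductive steps described above, namely a factor $q^{2(k-1)}$ at step $k$ for positive powers, and $\omega(t^{-1}) = -q^{-2}e^0$ times $q^{-2(k-1)}$ for negative powers. The function names are hypothetical.

```python
from fractions import Fraction

def q_int(n, q):
    """q-integer [n; q] = (1 - q**n) / (1 - q), for any integer n and q != 1."""
    return (1 - q**n) / (1 - q)

def omega_coeff(n, q):
    """Coefficient of e^0 in omega(t^n), accumulated step by step as in the induction."""
    q2 = q * q
    if n >= 0:
        # omega(t^k) = omega(t^(k-1)) + q^(2(k-1)) e^0
        return sum(q2 ** (k - 1) for k in range(1, n + 1))
    # negative powers: each step contributes omega(t^-1) = -q^-2 e^0 times q^(-2(k-1))
    return sum(-(q2 ** -1) * q2 ** (-(k - 1)) for k in range(1, -n + 1))

q = Fraction(3, 2)  # arbitrary generic sample value, assumed only for this check
for n in range(-5, 6):
    assert omega_coeff(n, q) == q_int(n, q * q)
```

Exact rational arithmetic is used so that the comparison is an equality test rather than a floating-point approximation.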
These constructions so far are not essentially new. They are a version of the q-monopole construction in [BM1, BM2]. The choices and conventions are slightly closer to those in [HM] where, however, only universal calculi were considered.

Next, we compute the comodule $V$ introduced in Theorem 2.1. Clearly $P^+ \cap M = M^+ = \ker \varepsilon|_M$. In our case $M = C_q[S^2]$ is generated by $1, b_\pm, b_0$ so that $M^+ = \langle b_0, b_\pm \rangle$ as an ideal. Meanwhile, because $(a - 1)b$, $(d - 1)c$, $a + q^2 d - (1 + q^2)$ are not of homogeneous degree, the ideal which each one generates has no intersection with $M$. We assume here that $C_q[SL_2]$ has no zero-divisors. We therefore focus on $\langle b^2, c^2, bc \rangle$. The elements of degree zero in $\langle b^2 \rangle$ include $b^2\{a^2, ac, c^2\}$. Hence, $b_-^2$, $b_- b_0$, $b_0^2$ lie in $I_P \cap M$. Similarly from $\langle c^2 \rangle$ we have $b_+^2$, $b_+ b_0$ also in this ideal. The element $bc = b_0$ is already in the ideal. From these considerations we arrive at
\[ V = \langle b_\pm \rangle / \langle b_\pm^2, b_0 \rangle. \]
Hence $V$ is 2-dimensional with representatives $b_\pm$. We then compute the coaction $\Delta_R$ on $V$ from Theorem 2.1 as
\[ \Delta_R b_+ = cd \otimes S\pi(d^2) = b_+ \otimes t^2, \qquad \Delta_R b_- = ab \otimes S\pi(a^2) = b_- \otimes t^{-2}. \tag{15} \]
Hence $V = \mathbb{C} \oplus \mathbb{C}$ and the associated bundle $E = E_{-2} \oplus E_{+2} = C_q[SL_2]_2 \oplus C_q[SL_2]_{-2}$ is the direct sum of the q-monopole bundles of charge $-2$ and charge $2$. We identify their sections with the $\pm 2$ degree components in $C_q[SL_2]$. Thus Theorem 2.1 yields:

Theorem 3.1. $C_q[S^2] = C_q[SL_2]_0$ is a framed quantum manifold with cotangent bundle $\Omega^1(C_q[S^2]) \cong E_{-2} \oplus E_2$ isomorphic to the charge $-2$ and charge $2$ monopole bundles. This isomorphism is given by the soldering form
\[ \theta(b_-) = d^2\, \mathrm{d}b_- + q^2 b^2\, \mathrm{d}b_+ - [2; q^2]\, bd\, \mathrm{d}b_0 = e^-, \qquad \theta(b_+) = a^2\, \mathrm{d}b_+ + q^{-2} c^2\, \mathrm{d}b_- - [2; q^{-2}]\, ac\, \mathrm{d}b_0 = e^+, \]
and makes $\Omega^1(C_q[S^2])$ projective.
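Proof. The only remaining part is to compute $\theta(b_\pm)$. We first find the coaction on $C_q[S^2]$ inherited from the coproduct of $C_q[SL_2]$ as
\[ \begin{aligned} \Delta_L(b_-) &= \Delta(ab) = ab \otimes (1 + [2]_q b_0) + a^2 \otimes b_- + b^2 \otimes b_+, \\ \Delta_L(b_+) &= \Delta(cd) = cd \otimes (1 + [2]_q b_0) + c^2 \otimes b_- + d^2 \otimes b_+, \\ \Delta_L(b_0) &= \Delta(bc) = 1 \otimes b_0 + bc \otimes (1 + [2]_q b_0) + q\,ac \otimes b_- + q\,bd \otimes b_+, \end{aligned} \tag{16} \]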
where $[2]_q = q + q^{-1}$. These coproducts were already used in computing $\Delta_R b_\pm$ above; this time we apply $S$ to the first factor and compute
\[ \theta(b_+) = Sb_{+(1)}\, \mathrm{d}b_{+(2)} = -q^{-1}[2]_q\, ac\, \mathrm{d}b_0 + a^2\, \mathrm{d}b_+ + q^{-2} c^2\, \mathrm{d}b_-, \]
where $q^{-1}[2]_q = [2; q^{-2}]$. Similarly for $\theta(b_-)$. This gives the middle expressions. We then insert (4) and find $e^\pm$ for the values of the map $\theta : V \to \Omega^1(C_q[SL_2])$. By Theorem 2.1 this is well-defined on $V$ and actually has its values in $C_q[SL_2]\,\Omega^1(C_q[S^2])$. Also according to Theorem 2.1, one must multiply $\theta(b_-)$ by an element of degree $2$, and $\theta(b_+)$ by an element of degree $-2$ to get 1-forms on $C_q[S^2]$ and every 1-form is obtained in this way. $E_{\pm 2}$ are both projective as shown in [HM], as given via the Cuntz-Quillen theorem and the q-monopole connection with the universal calculus.

These two 1-forms $\theta(b_\pm) = e^\pm$ play the role of ‘vielbein’ but do not themselves live on the base. Note also that as regards the bimodule structure, the elements of $C_q[S^2]$ commute with the $\theta(v)$ as we see from (1).

Corollary 3.2. The three 1-forms $\mathrm{d}b_\pm$ and $\mathrm{d}b_0$ enjoy the relation
\[ q^2 b_-\, \mathrm{d}b_+ + b_+\, \mathrm{d}b_- - (1 + [2]_q b_0)\, \mathrm{d}b_0 = 0, \]
where $[2]_q = q + q^{-1}$ denotes a symmetrized q-integer.

Proof. Using (16) we have
\[ \theta(b_0) = Sb_{0(1)}\, \mathrm{d}b_{0(2)} = b_0\, \mathrm{d}(1 + qb_0) + (-q)q\, b_-\, \mathrm{d}b_+ + (-q^{-1})q\, b_+\, \mathrm{d}b_- + (1 + q^{-1}b_0)\, \mathrm{d}b_0 = 0 \]
since $b_0$ represents zero in $V$. This identity can also be obtained from requiring $\mathrm{d}b_\pm$ and $\mathrm{d}b_0$ to be recovered from (4) composed with Theorem 3.1 and extensive use of the commutation relations.

This identity corresponds in the classical case to the differential of (3). However, for $q \ne 1$ this is not so immediate because the bimodule relations for $\Omega^1(C_q[S^2])$ are complicated to find explicitly. On the other hand, Theorem 3.1 implies a direct sum structure to the cotangent bundle with each piece more reasonable to work with.

Corollary 3.3.
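$\Omega^1(C_q[S^2]) = \Omega^{0,1} \oplus \Omega^{1,0}$, where $\Omega^{1,0}$, $\Omega^{0,1}$ are each first order left-covariant differential calculi over $C_q[S^2]$ with differentials $\partial$, $\bar\partial$ obeying $\mathrm{d} = \partial + \bar\partial$ and
\[ \partial b_- \begin{pmatrix} b_+ \\ b_- \\ b_0 \end{pmatrix} = \begin{pmatrix} q^{-2} b_+\, \partial b_- \\ q^2 b_-\, \partial b_- \\ b_0\, \partial b_- \end{pmatrix}, \qquad \partial b_+ \begin{pmatrix} b_+ \\ b_- \\ b_0 \end{pmatrix} = \begin{pmatrix} q^2 b_+\, \partial b_+ \\ q^2 b_-\, \partial b_+ + (q^2 - q^{-2}) b_+\, \partial b_- \\ q^4 b_0\, \partial b_+ \end{pmatrix}, \]
\[ \bar\partial b_- \begin{pmatrix} b_+ \\ b_- \\ b_0 \end{pmatrix} = \begin{pmatrix} q^{-2} b_+\, \bar\partial b_- + (q^{-2} - q^2) b_-\, \bar\partial b_+ \\ q^{-2} b_-\, \bar\partial b_- \\ q^{-4} b_0\, \bar\partial b_- \end{pmatrix}, \qquad \bar\partial b_+ \begin{pmatrix} b_+ \\ b_- \\ b_0 \end{pmatrix} = \begin{pmatrix} q^{-2} b_+\, \bar\partial b_+ \\ q^2 b_-\, \bar\partial b_+ \\ b_0\, \bar\partial b_+ \end{pmatrix}. \]
One has the relations
\[ \partial b_0 = q^2 b_-\, \partial b_+ - q^{-2} b_+\, \partial b_-, \qquad b_0 b_-\, \partial b_+ = q^{-3}(1 + q^{-1}b_0) b_+\, \partial b_-, \]
\[ \bar\partial b_0 = b_+\, \bar\partial b_- - q^4 b_-\, \bar\partial b_+, \qquad b_0 b_+\, \bar\partial b_- = q^3(1 + qb_0) b_-\, \bar\partial b_+. \]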
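Proof. From Theorem 3.1 we know that $\Omega^1$ is a direct sum as a left module over $C_q[S^2]$ spanned respectively by
\[ \{b^2, db, d^2\} e^+ = \{\partial b_-, \partial b_0, \partial b_+\}, \qquad \{a^2, ca, c^2\} e^- = \{\bar\partial b_-, \bar\partial b_0, \bar\partial b_+\}, \]
where we use the expressions (4) to identify the two components of $\mathrm{d}$. Next we observe that $e^\pm$ commute with elements of $C_q[S^2]$ so that the commutation relations of functions with $\partial$ and $\bar\partial$ are easily determined from the relations of $C_q[SL_2]$. We find the ones stated and
\[ \partial b_0 \begin{pmatrix} b_+ \\ b_- \\ b_0 \end{pmatrix} = \begin{pmatrix} b_+\, \partial b_0 \\ q^4 b_-\, \partial b_0 + q(1 - q^2)\, \partial b_- \\ q^2 b_0\, \partial b_0 \end{pmatrix}, \qquad \bar\partial b_0 \begin{pmatrix} b_+ \\ b_- \\ b_0 \end{pmatrix} = \begin{pmatrix} q^{-4} b_+\, \bar\partial b_0 + q^{-1}(1 - q^{-2})\, \bar\partial b_+ \\ b_-\, \bar\partial b_0 \\ q^{-2} b_0\, \bar\partial b_0 \end{pmatrix}. \]
These close so that each $\partial$, $\bar\partial$ generate a bimodule differential calculus. Their Leibniz rules follow from that for $\mathrm{d}$ and the direct sum decomposition. Next, we observe the relations
\[ b_+\, \partial b_- = q b_0\, \partial b_0, \qquad b_-\, \partial b_+ = q^{-2}(1 + q^{-1}b_0)\, \partial b_0, \]
\[ b_-\, \bar\partial b_+ = q^{-3} b_0\, \bar\partial b_0, \qquad b_+\, \bar\partial b_- = (1 + qb_0)\, \bar\partial b_0, \]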
following likewise from the relations of $C_q[SL_2]$ acting on $e^\pm$. We use them as a definition of $\bar\partial b_0$, $\partial b_0$ and a relation among the $\bar\partial b_\pm$ (respectively, $\partial b_\pm$) as stated. The latter also imply the relation in (3.2) and follow from differentiating (3), which is their geometrical content (related to the rank 1 projective module structure of each bundle). We note that one also has other relations, such as
\[ b_+\, \partial b_0 = q^2 b_0\, \partial b_+, \qquad b_-\, \partial b_0 = q^{-1}(1 + q^{-1}b_0)\, \partial b_-, \]
\[ b_-\, \bar\partial b_0 = q^{-2} b_0\, \bar\partial b_-, \qquad b_+\, \bar\partial b_0 = q(1 + qb_0)\, \bar\partial b_+, \]
which are not independent of the ones already found. For example the $\bar\partial$ relations here can be written as
\[ b_-^2\, \bar\partial b_+ = q^{-7} b_0^2\, \bar\partial b_-, \qquad b_+^2\, \bar\partial b_- = q(1 + q^3 b_0)(1 + qb_0)\, \bar\partial b_+, \]
which can be deduced from the one stated if one assumes that the left action of $b_0$ and $1 + q^{-1}b_0$ respectively can be cancelled. Finally, each of the spaces $\Omega^{1,0}$ and $\Omega^{0,1}$ is stable under the left coaction of $C_q[SL_2]$. This is because ‘upstairs’ on $\Omega^1(SL_2)$ the coaction on an element $f e^\pm$ is just $f_{(1)} \otimes f_{(2)} e^\pm$ and the coproduct defines a left coaction in each degree (because left-comultiplication commutes with the right-comultiplication used in defining the grading according to $(\mathrm{id} \otimes \pi)\Delta$). This coaction is intertwined by $\partial$ and $\bar\partial$ with the left coaction (16) on $C_q[S^2]$ since this is true for $\mathrm{d}$.
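Clearly these ‘holomorphic’ and ‘antiholomorphic’ cotangent bundles are relatively simple to work with. On the other hand, we can use Theorem 3.1 to compute $\partial$, $\bar\partial$ in terms of $\mathrm{d}$. Using the relations in $C_q[SL_2]$ and Corollary 3.2 we find
\[ \begin{aligned} \partial b_- &= q b_-\, \mathrm{d}b_0 - q^{-1} b_0\, \mathrm{d}b_-, & \partial b_+ &= (1 + qb_0)\, \mathrm{d}b_+ - q^{-1} b_+\, \mathrm{d}b_0, \\ \partial b_0 &= q^2 b_-\, \mathrm{d}b_+ - q^{-1} b_0\, \mathrm{d}b_0, & \bar\partial b_0 &= b_+\, \mathrm{d}b_- - q b_0\, \mathrm{d}b_0, \\ \bar\partial b_- &= (1 + q^{-1}b_0)\, \mathrm{d}b_- - q b_-\, \mathrm{d}b_0, & \bar\partial b_+ &= q^{-1} b_+\, \mathrm{d}b_0 - q b_0\, \mathrm{d}b_+. \end{aligned} \tag{17} \]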
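Therefore, as an application, we can recover the bimodule relations in $\Omega^1(C_q[S^2])$ from the much simpler ones for the two parts.

Proposition 3.4. Let $\mu = q^2 - q^{-2}$. The bimodule relations for the 2-dimensional calculus on $C_q[S^2]$ are
\[ \mathrm{d}b_0 \begin{pmatrix} b_0 \\ b_\pm \end{pmatrix} = \begin{pmatrix} (q^2 + q\mu b_0) b_0\, \mathrm{d}b_0 - \mu b_0 b_+\, \mathrm{d}b_- \\ q^{\mp 2}(1 \mp q^{\pm 1}\mu b_0) b_\pm\, \mathrm{d}b_0 - (1 - q^{\pm 2} \mp q^{\pm 1}\mu b_0) b_0\, \mathrm{d}b_\pm \end{pmatrix}, \]
\[ \mathrm{d}b_\pm \begin{pmatrix} b_0 \\ b_\mp \\ b_\pm \end{pmatrix} = \begin{pmatrix} (q^{\pm 4} \pm q^{\pm 3}\mu b_0) b_0\, \mathrm{d}b_\pm \mp q^{\mp 1}\mu b_\pm b_0\, \mathrm{d}b_0 \\ q^{\pm 2}(1 \pm q^{\pm 1}\mu b_0) b_\mp\, \mathrm{d}b_\pm \mp q^{\pm 1}\mu q^{-1} b_0^2\, \mathrm{d}b_0 \\ q^{\pm 2}(1 \pm q^{\pm 1}\mu b_0) b_\pm\, \mathrm{d}b_\pm \mp q^{\mp 1}\mu b_\pm^2\, \mathrm{d}b_0 \end{pmatrix}. \]
Proof. These are all computed along the following lines:
\[ \mathrm{d}b_0.b_0 = q^{-2} b_0\, \bar\partial b_0 + q^2 b_0\, \partial b_0 = q^2 b_0\, \mathrm{d}b_0 + (q^{-2} - q^2) b_0\, \bar\partial b_0 \]
using $\mathrm{d} = \partial + \bar\partial$ and the commutation relations from Corollary 3.3. We then express $\bar\partial b_0$ in terms of $\mathrm{d}$ using (17) to obtain the result. Similarly for all the other commutation relations.

As a cross-check, one may now verify that Corollary 3.2 corresponds to the differentials of the q-sphere relations. Put another way, using Corollary 3.2, the differentials of the four relations (3) of the q-sphere reduce to
\[ \begin{aligned} q^{-1}\, \mathrm{d}b_+.b_- - \mathrm{d}b_0.b_0 &= -q^{-2} b_0\, \mathrm{d}b_0 + q b_-\, \mathrm{d}b_+, \\ q\, \mathrm{d}b_-.b_+ - q^{-2}\, \mathrm{d}b_0.b_0 &= -b_0\, \mathrm{d}b_0 + q^{-1} b_+\, \mathrm{d}b_-, \\ \mathrm{d}b_\pm.b_0 - q^{\pm 2}\, \mathrm{d}b_0.b_\pm &= q^{\pm 2} b_0\, \mathrm{d}b_\pm - b_\pm\, \mathrm{d}b_0, \end{aligned} \]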
which all hold using Proposition 3.4. In fact Podles in [Po3] has shown that there is a unique left-covariant calculus on $C_q[S^2]$ of the correct classical dimension, hence the calculus in Proposition 3.4 is necessarily isomorphic to this, but derived differently.

4. Exterior Algebra, Hodge-∗ and Maxwell Theory on the q-Sphere

In the last section we have expressed the cotangent bundle of $C_q[S^2]$ as associated to a frame bundle by a $1 \oplus 1$-dimensional representation of the ‘frame group’ $\mathbb{C}[t, t^{-1}]$ equipped with a (bicovariant) q-differential structure. We deduced that $\Omega^1(C_q[S^2])$ is the sum of two 1-dimensional left-covariant calculi $\Omega^{0,1}$ and $\Omega^{1,0}$. We now extend this to the entire exterior algebra.
First of all, working ‘upstairs’ in $\Omega(C_q[SL_2])$ we define $\Omega(C_q[S^2])$ as the differential algebra obtained by restriction. It is generated by $\partial b_\pm$ and $\bar\partial b_\pm$ which means generated by $e^\pm$ and certain elements of $C_q[SL_2]$. Because of the commutation relations with $e^\pm$, these elements can all be collected to the left. Because of the relations between the $e^\pm$, there is only a functional multiple of a top form $\Upsilon = e^+ \wedge e^-$ in degree 2 and nothing in higher degree. Moreover, this top form is a linear combination of products of $\partial b_\pm$ with $\bar\partial b_\mp$ (we will see explicit formulae for it in the next proposition below). Similarly there are no 2-forms built from $\partial$ or $\bar\partial$ alone. As a result, we write
\[ \Omega^2(C_q[S^2]) = \Omega^{1,1}, \qquad \Omega^{2,0} = 0 = \Omega^{0,2}, \]
where the numbers refer to the form-degrees with respect to $\partial$, $\bar\partial$ and $\Omega^{1,1}$ is 1-dimensional over the algebra. We write $\Omega^0(C_q[S^2]) = \Omega^{0,0}$ as the algebra itself, and we have $\Omega^1(C_q[S^2]) = \Omega^{0,1} \oplus \Omega^{1,0}$ from Sect. 3. Next, we extend $\partial$, $\bar\partial$ acting by zero on $\Omega^{1,0}$, $\Omega^{0,1}$ respectively as each of these generate exterior algebras with top degree 1 by the above arguments or from the relations in Corollary 3.3. Finally, we define
\[ \partial = \mathrm{d}|_{\Omega^{0,1}}, \qquad \bar\partial = \mathrm{d}|_{\Omega^{1,0}} \]
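on $\Omega^{0,1}$, $\Omega^{1,0}$ respectively. By construction, we obtain a double complex
\[ \begin{array}{ccccc} 0 & & 0 & & \\ \uparrow{\scriptstyle \bar\partial} & & \uparrow{\scriptstyle \bar\partial} & & \\ \Omega^{0,1} & \xrightarrow{\ \partial\ } & \Omega^{1,1} & \xrightarrow{\ \partial\ } & 0 \\ \uparrow{\scriptstyle \bar\partial} & & \uparrow{\scriptstyle \bar\partial} & & \\ \Omega^{0,0} & \xrightarrow{\ \partial\ } & \Omega^{1,0} & \xrightarrow{\ \partial\ } & 0. \end{array} \]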
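Here the graded-derivation property of $\mathrm{d}$ implies that $\bar\partial\partial + \partial\bar\partial = 0$. Also, forms which are left-invariant under the $C_q[SL_2]$ coaction are precisely the ones generated by $e^\pm$ alone. Moreover, $\Upsilon$ is the unique left-invariant 2-form up to scale and hence our natural choice of basis for $\Omega^2(C_q[S^2])$ over $C_q[S^2]$. We let $\mu = q^2 - q^{-2}$.

Proposition 4.1. The relations between the $\Omega^{0,1}$ and $\Omega^{1,0}$ calculi are
\[ \begin{aligned} \partial b_+ \wedge \bar\partial b_- + q^6\, \bar\partial b_- \wedge \partial b_+ &= q^4 \mu (b_0^2 - 1)\Upsilon, \\ \partial b_- \wedge \bar\partial b_+ = -q^2\, \bar\partial b_+ \wedge \partial b_- &= q^2 b_0^2\, \Upsilon, \\ \partial b_- \wedge \bar\partial b_- = -q^6\, \bar\partial b_- \wedge \partial b_- &= q^5 b_-^2\, \Upsilon, \\ \partial b_+ \wedge \bar\partial b_+ = -q^6\, \bar\partial b_+ \wedge \partial b_+ &= q^5 b_+^2\, \Upsilon. \end{aligned} \]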
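Proof. We compute all expressions in terms of $e^\pm$ using the definitions from the proof of Corollary 3.3. For example
\[ \bar\partial b_- \wedge \partial b_+ = a^2 e^- \wedge d^2 e^+ = q^{-2} a^2 d^2\, e^- \wedge e^+ = -(1 + q^{-3}b_0)(1 + q^{-1}b_0)\Upsilon \]
using the relations between functions and $e^\pm$ in $\Omega^1(C_q[SL_2])$ and the relations in the quantum group. Computing all expressions in this way and comparing gives the relations stated.

One may similarly compute
\[ \begin{aligned} \partial b_0 \wedge \bar\partial b_0 &= q^4 (1 + qb_0) b_0\, \Upsilon, & \bar\partial b_0 \wedge \partial b_0 &= -(1 + q^{-1}b_0) b_0\, \Upsilon, \\ \partial b_0 \wedge \bar\partial b_- &= q^4 (1 + qb_0) b_-\, \Upsilon, & \bar\partial b_- \wedge \partial b_0 &= -(1 + q^{-3}b_0) b_-\, \Upsilon, \\ \partial b_+ \wedge \bar\partial b_0 &= q^4 b_+ (1 + qb_0)\Upsilon, & \bar\partial b_0 \wedge \partial b_+ &= -b_+ (1 + q^{-3}b_0)\Upsilon, \end{aligned} \]
\[ \partial b_0 \wedge \bar\partial b_+ = -q^4\, \bar\partial b_+ \wedge \partial b_0 = q^3 b_+ b_0\, \Upsilon, \qquad \partial b_- \wedge \bar\partial b_0 = -q^4\, \bar\partial b_0 \wedge \partial b_- = q^5 b_- b_0\, \Upsilon, \]
which will be useful later on, giving the further relations
\[ \begin{aligned} q^{-4}\, \partial b_0 \wedge \bar\partial b_0 + \bar\partial b_0 \wedge \partial b_0 &= (q - q^{-1}) b_0^2\, \Upsilon, \\ \partial b_0 \wedge \bar\partial b_- + q^8\, \bar\partial b_- \wedge \partial b_0 &= -q^6 \mu b_-\, \Upsilon, \\ \partial b_+ \wedge \bar\partial b_0 + q^8\, \bar\partial b_0 \wedge \partial b_+ &= -q^6 \mu b_+\, \Upsilon. \end{aligned} \]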
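Several of the relations above reduce to polynomial identities in $b_0$ alone, which can be checked by exact arithmetic. The following is a minimal sketch under a stated simplification: $b_0$ is replaced by a commuting rational number `x`, which is legitimate here only because each side is a polynomial in $b_0$ with a common factor ($b_\pm$ or nothing) in a fixed position. The function name and sample values are illustrative.

```python
from fractions import Fraction

def check_wedge_identities(q, x):
    """Check three 2-form relations as scalar identities, with b_0 replaced by x."""
    mu = q**2 - q**-2
    # Coefficients of Upsilon obtained from d^2 a^2 and a^2 d^2 as in the proof.
    del_bp_delbar_bm = q**2 * (1 + q**3 * x) * (1 + q * x)
    delbar_bm_del_bp = -(1 + q**-3 * x) * (1 + q**-1 * x)
    first = del_bp_delbar_bm + q**6 * delbar_bm_del_bp == q**4 * mu * (x**2 - 1)

    # q^-4 (del b0 ^ delbar b0) + (delbar b0 ^ del b0) = (q - q^-1) b0^2 Upsilon
    second = q**-4 * (q**4 * (1 + q * x) * x) - (1 + q**-1 * x) * x == (q - q**-1) * x**2

    # (del b0 ^ delbar b-) + q^8 (delbar b- ^ del b0) = -q^6 mu b- Upsilon, factor b- dropped
    third = q**4 * (1 + q * x) - q**8 * (1 + q**-3 * x) == -q**6 * mu
    return first and second and third

samples_q = [Fraction(3, 2), Fraction(2), Fraction(-5, 7)]
samples_x = [Fraction(0), Fraction(1), Fraction(-2, 3), Fraction(4)]
all_ok = all(check_wedge_identities(q, x) for q in samples_q for x in samples_x)
print(all_ok)
```

Since each identity has degree at most two in `x`, agreement at four values of `x` for a given `q` confirms it as a polynomial identity for that `q`.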
Note that the first two lines taken together exhibit $\Upsilon$ as an element of $\Omega^{1,1}(C_q[S^2])$ as promised. (We will give another more geometrical expression later.) One may further write it in terms of $\mathrm{d}$ using results from the last section. For a simpler expression, the second line in Proposition 4.1 gives
\[ q b_-\, \mathrm{d}b_0 \wedge \mathrm{d}b_+ - q^{-1} b_0\, \mathrm{d}b_- \wedge \mathrm{d}b_+ = q^2 b_0^2\, \Upsilon, \]
which gives the volume form if $b_0$ is invertible. An alternative is to use (4) and compute
\[ \mathrm{d}b_+ \wedge \mathrm{d}b_- - q^4\, \mathrm{d}b_- \wedge \mathrm{d}b_+ = q^3 [2]_q (1 + [2]_q b_0)\Upsilon, \]
which gives the volume form if one assumes $1 + [2]_q b_0$ invertible. These are classically the two trivialisations of the sphere given by deleting the north or south poles.

Next we look at the metric, motivated from Corollary 3.2. We let $\bar\otimes$ denote the tensor product over $C_q[S^2]$.

Proposition 4.2. There is a natural metric
\[ g = q^2\, \mathrm{d}b_- \bar\otimes \mathrm{d}b_+ + \mathrm{d}b_+ \bar\otimes \mathrm{d}b_- - [2]_q\, \mathrm{d}b_0 \bar\otimes \mathrm{d}b_0 \]
such that $g$ is invariant under the left coaction of $C_q[SL_2]$ and q-symmetric in the sense $\wedge(g) = 0$. Moreover, $g \in (\Omega^{1,0} \bar\otimes \Omega^{0,1}) \oplus (\Omega^{0,1} \bar\otimes \Omega^{1,0})$.
Proof. The coaction (16) defines also the coaction on the $\mathrm{d}b_\pm$, $\mathrm{d}b_0$ in such a way that $\mathrm{d}$ is a comodule map. We write this coaction on the basis $\mathrm{d}b_-, \mathrm{d}b_0, \mathrm{d}b_+$ as a transformation matrix
\[ \Delta_L \begin{pmatrix} \mathrm{d}b_- \\ \mathrm{d}b_0 \\ \mathrm{d}b_+ \end{pmatrix} = \begin{pmatrix} a^2 & [2]_q ab & b^2 \\ ca & 1 + [2]_q bc & db \\ c^2 & [2]_q cd & d^2 \end{pmatrix} \otimes \begin{pmatrix} \mathrm{d}b_- \\ \mathrm{d}b_0 \\ \mathrm{d}b_+ \end{pmatrix}. \]
Writing the metric also as a matrix in this basis, its left-invariance follows from the identity
\[ \begin{pmatrix} a^2 & [2]_q ab & b^2 \\ ca & 1 + [2]_q bc & db \\ c^2 & [2]_q cd & d^2 \end{pmatrix}^{t} \begin{pmatrix} 0 & 0 & q^2 \\ 0 & -[2]_q & 0 \\ 1 & 0 & 0 \end{pmatrix} \begin{pmatrix} a^2 & [2]_q ab & b^2 \\ ca & 1 + [2]_q bc & db \\ c^2 & [2]_q cd & d^2 \end{pmatrix} = \begin{pmatrix} 0 & 0 & q^2 \\ 0 & -[2]_q & 0 \\ 1 & 0 & 0 \end{pmatrix}, \]
where $t$ denotes transpose. This in turn follows from the relations of $C_q[SL_2]$. Actually, the coaction $\Delta_L$ corresponds to the vector corepresentation of the even subalgebra $C_q[SO_3]$ of $C_q[SL_2]$ and for generic values of $q$ there is a unique invariant such matrix for the metric coefficients up to a scale. Hence the metric is uniquely determined if we suppose it has numerical coefficients, when our basis of exact differentials is viewed as spanning a 3-dimensional vector space over $\mathbb{C}$ (invariance at this level then implies invariance when viewed over $\bar\otimes$). Such numerical coefficients in turn are a natural assumption in view of (3.2). Differentiating that, we see that $\wedge(g) = 0$. Also, writing $g = g_{++} \oplus g_{+-} \oplus g_{-+} \oplus g_{--}$ for the decomposition according to Corollary 3.3, we use the q-commutation relations there and expressions for $\partial b_i$ in (17) to compute
\[ \begin{aligned} g_{++} &= \partial b_+ \bar\otimes \partial b_- - [2]_q\, \partial b_0 \bar\otimes \partial b_0 + q^2\, \partial b_- \bar\otimes \partial b_+ \\ &= q^3 b_-\, \partial b_+ \bar\otimes \mathrm{d}b_0 + (q^3 - q^{-1}) b_+\, \partial b_- \bar\otimes \mathrm{d}b_0 - q^3 b_0\, \partial b_+ \bar\otimes \mathrm{d}b_- + [2]_q b_+\, \partial b_0 \bar\otimes \mathrm{d}b_- \\ &\quad - [2]_q (1 + q^3 b_0)\, \partial b_0 \bar\otimes \mathrm{d}b_0 + q^2 (1 + qb_0)\, \partial b_- \bar\otimes \mathrm{d}b_+ - q^{-1} b_+\, \partial b_- \bar\otimes \mathrm{d}b_0 \\ &= -q^{-1}(1 + q^2 [2]_q b_0)\, \partial b_0 \bar\otimes \mathrm{d}b_0 + (q^2 [2]_q - q^3) b_0\, \partial b_+ \bar\otimes \mathrm{d}b_- + q^2 (1 + qb_0)\, \partial b_- \bar\otimes \mathrm{d}b_+ \\ &= 0 \end{aligned} \]
on using $(1 + q^2 [2]_q b_0)\, \partial b_0 = (\partial b_0)(1 + [2]_q b_0)$ and then using Corollary 3.2 to replace $\mathrm{d}b_0$ by $\mathrm{d}b_\pm$. Then, as well as for the third equality, we used the relations in $\Omega^{1,0}$ in Corollary 3.3 to collect terms. There is a similar proof for $g_{--} = 0$.

Next, we look at the Hodge $*$ operator $* : \Omega^1(C_q[S^2]) \to \Omega^1(C_q[S^2])$ which we require to obey $*^2 = \mathrm{id}$ and to be at least a left-module map and to be frame-invariant (the metric can also be analysed in such terms but frame invariance alone does not fix a particular one, we used rotational left-covariance). In the frame bundle approach for $*$ we require $* : V \to V$ where $V$ is the 2-dimensional local tangent space.
In order to be frame invariant (which means covariant under (15)) and square to the identity, this must be given by $*(e^\pm) = \pm e^\pm$ up to an overall sign.
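Proposition 4.3. The natural Hodge $*$ operator is a left-covariant bimodule map obeying
\[ *(\partial f) = \partial f, \qquad *(\bar\partial f) = -\bar\partial f, \qquad \forall f \in C_q[S^2]. \]
It defines a left-covariant lifting $i : \Omega^2(C_q[S^2]) \to \Omega^1(C_q[S^2]) \bar\otimes \Omega^1(C_q[S^2])$,
\[ i(\Upsilon) = \frac{q^{-1}}{[2]_q} (* \bar\otimes \mathrm{id})(g) = -\frac{q^{-1}}{[2]_q} (\mathrm{id} \bar\otimes *)(g). \]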
Proof. Let us first verify directly that $*$ is well-defined as a left module map, in which case it is given as stated since $\bar\partial b_\pm$, $\bar\partial b_0$ are generated from $e^-$, etc. Indeed $\Omega^1(C_q[S^2])$ is a rank 2 bundle in which we can take $\mathrm{d}b_\pm$, $\mathrm{d}b_0$ as generators with the relation in Corollary 3.2. Writing $\mathrm{d} = \partial + \bar\partial$, we have each part holding separately,
\[ q^2 b_-\, \bar\partial b_+ + b_+\, \bar\partial b_- - (1 + [2]_q b_0)\, \bar\partial b_0 = 0 \tag{18} \]
and similarly for $\partial$ (this is also clear from relations in the proof of Corollary 3.3), so $*$ is compatible with this relation. That $*$ is a right $C_q[S^2]$-bimodule map can easily be proven using the Leibniz rule for $\mathrm{d}$, $\bar\partial$, $\partial$. Moreover, $*$ is left-covariant under the coaction of $C_q[SL_2]$ since the coaction acts on each $\Omega^{0,1}$, $\Omega^{1,0}$ separately.

Next, we recall the usual formulae in which the Hodge $*$ operator is given in terms of the ‘totally antisymmetric tensor’ and the metric. The role of that tensor is played by the lifting of the volume form to an element of $\Omega^1 \bar\otimes \Omega^1$ which is something that in classical geometry one takes for granted (the wedge product is given classically by skew-symmetrization). In noncommutative geometry, as explained in [M2], this lifting map is an additional datum required to split the wedge map $\Omega^1 \bar\otimes \Omega^1 \to \Omega^2$ and we use the Hodge $*$ operator and the metric to define it. We check that
\[ \begin{aligned} \wedge (* \bar\otimes \mathrm{id})(g) &= q^2 (-a^2 e^- + b^2 e^+) \wedge (c^2 e^- + d^2 e^+) + (-c^2 e^- + d^2 e^+) \wedge (a^2 e^- + b^2 e^+) \\ &\quad - [2]_q (-ca\, e^- + db\, e^+) \wedge (ca\, e^- + db\, e^+) \\ &= (q^2 a^2 d^2 + q^4 b^2 c^2 + c^2 b^2 + q^2 d^2 a^2 - [2]_q\, cadb - q^2 [2]_q\, dbca)\Upsilon = (q^2 + 1)\Upsilon, \end{aligned} \]
where we work ‘upstairs’ in the frame bundle and where the even terms give $q^2$ and the odd terms give $1$ using the relations of $C_q[SL_2]$. Hence $\wedge \circ i(\Upsilon) = \Upsilon$ as required. One may also obtain this result using the computations in the proof of Proposition 4.1. We define $i$ to be a left module map (it is not a bimodule map). The other stated expression for $i$ is the same in view of the form of $g$ in Proposition 4.2.

Note also that since $\wedge(g) = 0$ we have in fact a general family of lifts of this type,
\[ i(\Upsilon) = \frac{q^{-1}}{[2]_q} (* \bar\otimes \mathrm{id})g + \mu g \tag{19} \]
for any $\mu$. Or equivalently, $i(\Upsilon) = \alpha g_{+-} + \beta g_{-+}$, provided $\alpha - \beta = 2q^{-1}/[2]_q$. These lifts are all left covariant under the coaction of $C_q[SL_2]$ since $g_{+-}$, $g_{-+}$ separately are.
We do not explicitly discuss complex structures in this paper; we work over $\mathbb{C}$, but with care this could be any field as in algebraic geometry. Nevertheless, our $b_\pm$ coordinates have their interpretation as complex linear combinations (of the ambient $\mathbb{R}^3$ coordinates) in real geometry; in real coordinates the Hodge $*$ is equivalent to an almost complex structure $J$ since this is defined in two dimensions exactly by the same relation between the volume form (viewed as a symplectic structure) and the metric as for $i$ in Proposition 4.3. In our case since we are deforming the standard metric on the sphere, this gives implicitly a q-deformation of its actual complex structure via the Hodge $*$. This justifies our notations $\partial$, $\bar\partial$.

We now have all the basic structures at least for the first ‘layer’ of geometry, namely cohomology and electromagnetism. For the Maxwell theory we define of course $*1 = \Upsilon$, $*\Upsilon = 1$ and the Laplacian on degree zero by $\square = -\frac{1}{2} * \mathrm{d} * \mathrm{d}$. Then
\[ (\square f)\Upsilon = -\frac{1}{2}\, \mathrm{d} * \mathrm{d}f = \frac{1}{2}\, \mathrm{d}(\bar\partial f - \partial f) = \partial \bar\partial f. \tag{20} \]
Proposition 4.4. The functions $b_-$, $1 + [2]_q b_0$, $b_+$ are eigenfunctions of $\square$ with eigenvalue $q^2 [2]_q$.

Proof. We compute
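\[ \mathrm{d}\partial \begin{pmatrix} b_+ \\ b_- \\ b_0 \end{pmatrix} = \begin{pmatrix} q\, \mathrm{d}b_0 \wedge \mathrm{d}b_+ - q^{-1}\, \mathrm{d}b_+ \wedge \mathrm{d}b_0 \\ q\, \mathrm{d}b_- \wedge \mathrm{d}b_0 - q^{-1}\, \mathrm{d}b_0 \wedge \mathrm{d}b_- \\ q^2\, \mathrm{d}b_- \wedge \mathrm{d}b_+ - q^{-1}\, \mathrm{d}b_0 \wedge \mathrm{d}b_0 \end{pmatrix} \]
since $\mathrm{d} = \bar\partial$ on $\Omega^{1,0}$. We compute these using (4) in the same manner as in the proof of Proposition 4.3. Here we need their values explicitly:
\[ \begin{aligned} \mathrm{d}b_- \wedge \mathrm{d}b_+ &= -(1 + q^{-2}[2]_q b_0 - q^{-1}(q - q^{-1})[3]_q b_0^2)\Upsilon, & \mathrm{d}b_0 \wedge \mathrm{d}b_0 &= (q - q^{-1}) q^2 ([2]_q + [3]_q b_0) b_0\, \Upsilon, \\ \mathrm{d}b_0 \wedge \mathrm{d}b_+ &= ((q^5 - q^{-1}) b_0 - 1) b_+\, \Upsilon, & \mathrm{d}b_+ \wedge \mathrm{d}b_0 &= ((q^7 - q) b_0 + q^4) b_+\, \Upsilon, \\ \mathrm{d}b_- \wedge \mathrm{d}b_0 &= b_- ((q^5 - q^{-1}) b_0 - 1)\Upsilon, & \mathrm{d}b_0 \wedge \mathrm{d}b_- &= b_- ((q^7 - q) b_0 + q^4)\Upsilon, \end{aligned} \]
which we use to find $\mathrm{d}\partial = \bar\partial \partial = -\partial \bar\partial$.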
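The eigenvalue can be cross-checked from the quoted 2-form values. The sketch below is only an illustration: it covers the $b_+$ and $1 + [2]_q b_0$ cases, replaces $b_0$ by a commuting rational number `x` (valid because only polynomials in $b_0$ with a fixed position of $b_+$ appear), and uses the sign convention $(\square f)\Upsilon = \partial\bar\partial f = -\mathrm{d}\partial f$ from (20). The helper names are hypothetical.

```python
from fractions import Fraction

def qsym(n, q):
    """Symmetrized q-integer [n]_q = (q^n - q^-n) / (q - q^-1)."""
    return (q**n - q**-n) / (q - q**-1)

def laplacian_checks(q, x):
    """Return True if b_+ and 1 + [2]_q b_0 have eigenvalue q^2 [2]_q, with b_0 -> x."""
    two, three = qsym(2, q), qsym(3, q)
    eig = q**2 * two
    # Coefficients of Upsilon quoted in the proof (common factor b_+ dropped in the first two).
    db0_dbp = (q**5 - q**-1) * x - 1
    dbp_db0 = (q**7 - q) * x + q**4
    dbm_dbp = -(1 + q**-2 * two * x - q**-1 * (q - q**-1) * three * x**2)
    db0_db0 = (q - q**-1) * q**2 * (two + three * x) * x
    # d(del b_+) = q db0^db+ - q^-1 db+^db0 should be -eig times b_+ Upsilon.
    plus_ok = q * db0_dbp - dbp_db0 / q == -eig
    # d(del b_0) = q^2 db-^db+ - q^-1 db0^db0; the Laplacian of 1 + [2]_q b_0 is [2]_q times minus this.
    zero_ok = -two * (q**2 * dbm_dbp - db0_db0 / q) == eig * (1 + two * x)
    return plus_ok and zero_ok

results = [laplacian_checks(q, x)
           for q in (Fraction(3, 2), Fraction(2), Fraction(-4, 5))
           for x in (Fraction(0), Fraction(1), Fraction(-3, 7), Fraction(5))]
print(all(results))
```

The values of `q` must avoid $q = \pm 1$, where the symmetrized q-integer as written is undefined.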
Note that these functions form the vector corepresentation of $C_q[SO_3]$ under the left coaction (16) (this appears in the above basis as the transformation matrix in the proof of Proposition 4.2). In the same way, the matrix elements of each integer spin corepresentation $V_n$ of $C_q[SL_2]$ define a square-dimension subspace $V_n\otimes V_n^* \subset C_q[SL_2]$. Fixing the unique zero weight vector $v$ under the right coaction, the subspace $V_n\otimes v$ (in other words, the matrix entries in the middle row of the transformation matrix) spans an eigenspace of the Laplacian. For generic $q$ the Peter-Weyl decomposition of $C_q[SL_2]$ implies, as classically, that this is a complete diagonalisation of $\Delta$ on $C_q[S^2]$ with one eigenspace for each integer spin corepresentation. The matrix elements of the 1/2-integer spin corepresentations contain an odd number of the $C_q[SL_2]$ generators and hence can never have the zero degree needed to lie in $C_q[S^2]$.
Riemannian Geometry of the Standard q-Sphere
In particular, the zero eigenspace of $\Delta$ is spanned by the constant function 1. This implies that if $\bar\partial f = 0$ then $f$ is a multiple of 1. Similarly for $\partial$, so for the noncommutative de Rham and Dolbeault cohomology of $C_q[S^2]$ for generic $q$ we have $H^0 = H^0_\partial = H^0_{\bar\partial} = \mathbb{C}.1$. It is clear that for generic $q$ we also have the usual values for the rest of the cohomology as well as Poincaré duality, since all constructions for this calculus are a smooth deformation of the classical ones. We omit explicit proofs of these facts since we need them for discussion only.

Similarly, we have a "massive" Maxwell equation defined on 1-forms by

$$\Delta_1 A \equiv -\frac{1}{4}\,{*}d{*}dA = m^2 A, \qquad {*}d{*}A = 0.$$

The second equation is Coulomb gauge in physics and is automatic when $m \neq 0$ (in this case $m^2A$ should be interpreted as the source). We recall that in Maxwell theory the field is considered modulo exact forms but this freedom can be partially fixed by a gauge choice. We write the curvature as $F = dA = f\Upsilon$, where $f = *F$ is in $C_q[S^2]$. Then $\Delta_1 A = 0$ translates to

$$\partial f = \bar\partial f = 0,$$

which implies that $f \propto 1$. In that case, $dA \propto \Upsilon$, which implies $f = 0$ since $\Upsilon$ is not exact by Poincaré duality, so the only 'photons' are pure gauge. This is to be expected for a sphere. On the other hand, if $A$ is a "massive" mode, then

$$A = -\frac{1}{4m^2}\,{*}d\alpha = \frac{1}{4m^2}(\bar\partial\alpha - \partial\alpha), \qquad \Delta\alpha = 2m^2\alpha.$$

Conversely, given an eigenfunction of $\Delta$ as in the second equation, we use the first to define $A$ and obtain a massive mode. For example, the eigenfunctions of $\Delta$ in Proposition 4.4 give solutions

$$A = \frac{q^2}{2[2]_q}(\bar\partial b_i - \partial b_i),$$

where $i = \pm, 0$ and these are given as 1-forms via (17).

5. Levi-Civita Connection, Curvature and Dirac Operator on the q-Sphere

Next, we use the frame bundle approach to develop the Riemannian geometry of the $q$-sphere. Here the $q$-monopole bundle [BM1] is viewed as the frame bundle and the $q$-monopole connection (13) on it as a spin connection. We find that it correctly induces the Levi-Civita connection on the cotangent bundle.

Theorem 5.1. The $q$-monopole connection (13) viewed as spin connection in the frame bundle of $C_q[S^2]$ induces the covariant derivative

$$\nabla\begin{pmatrix} db_\pm\\ db_0\end{pmatrix} = \begin{pmatrix} [2]_q b_\pm\\ 1 + [2]_q b_0\end{pmatrix} g,$$

which is torsion-free and skew-metric compatible in the sense of zero cotorsion (a generalised Levi-Civita connection).
Proof. We recall that the bundle $E_{-2} = (C_q[SL_2]\otimes[b_-])^{\mathbb{C}[t,t^{-1}]}$ can be identified with $C_q[SL_2]_2\otimes[b_-]$, where we now write the representative $[b_-] \in V$ explicitly. This space in turn was identified with $C_q[SL_2]_2.e^- = \Omega^{0,1}$ as explained in the proof of Corollary 3.3. The action of the covariant derivative on $E_{-2}$ is by

$$D(f\otimes[b_-]) = (\mathrm{id} - \Pi_\omega)(df)\otimes[b_-] = (df - f\omega(t^2))\otimes[b_-] = (df - (1+q^2)f e^0)\otimes[b_-]$$

for all $f \in C_q[SL_2]_2$. This is the usual covariant derivative on a $q$-monopole section, here of charge 2. Working 'upstairs' on $C_q[SL_2]$ and using the Leibniz rule and the 3-d calculus, we have

$d(a^2) = (1+q^2)(a^2e^0 + q\,ab\,e^+)$,
$d(ca) = (1+q^2)ca\,e^0 + q^2(1 + [2]_q bc)e^+$,
$d(c^2) = (1+q^2)(c^2e^0 + q\,cd\,e^+)$.

We see that the horizontal projection simply kills the $e^0$ term in each expression. Next, from the identity $d^2a^2 + q^2b^2c^2 - q[2]_q\,dbac = 1$ we write

$$e^+ = 1.e^+ = q^{-2}\partial b_+.a^2 + \partial b_-.c^2 - q^{-2}[2]_q\,\partial b_0.ca, \tag{21}$$

where we move the $a, c$ generators to the right using the relations of the 3-d calculus. These degree 2 products combine with $[b_-]$ to give a section of $E_{-2}$. We are working 'upstairs' but we can now identify the product as the $C_q[S^2]$-module structure. Thus

$$D(a^2\otimes[b_-]) = [2]_q b_-\left(\partial b_+.(a^2\otimes[b_-]) + q^2\partial b_-.(c^2\otimes[b_-]) - [2]_q\,\partial b_0.(ca\otimes[b_-])\right).$$

Similarly for $ca\otimes[b_-]$ and $c^2\otimes[b_-]$. Finally, we replace $[b_-]$ by $e^-$ (the framing isomorphism Theorem 3.1) and identify the resulting elements of $\Omega^{0,1}$ on the right. This gives $\nabla\bar\partial b_-$. Similarly for all the other cases. For the $\nabla\partial$ we use

$$e^- = q^2\,\bar\partial b_+.b^2 - [2]_q\,\bar\partial b_-.d^2 + \bar\partial b_0.db.$$

As a result, we find

$$\nabla(\partial b_\pm) = [2]_q b_\pm\, g_{-+}, \qquad \nabla(\bar\partial b_\pm) = [2]_q b_\pm\, g_{+-}, \tag{22}$$

where

$$g_{+-} = q^2\,\partial b_-\bar\otimes\bar\partial b_+ + \partial b_+\bar\otimes\bar\partial b_- - [2]_q\,\partial b_0\bar\otimes\bar\partial b_0,$$

etc. in the decomposition of $g$ as in the proof of Proposition 4.2. Combining these gives $\nabla d$ as stated.

Next, having found $\nabla$, we look at the torsion equation. As explained in [M1] the noncommutative meaning of this is $\mathrm{Tor} = \nabla\wedge - d : \Omega^1 \to \Omega^2$, which is the first degree measure of the failure of $\nabla\wedge$ to form a complex (the second degree measure is the curvature). We have

$$\mathrm{Tor}(db_\pm) = \nabla\wedge(db_\pm) = [2]_q b_\pm\wedge(g) = 0$$
by the $q$-symmetry in Proposition 4.2. Since the torsion is a left module map, it follows that the torsion vanishes entirely.

Finally, we look at the 'skew-metric compatibility' in the sense of zero cotorsion. This has been proposed as the correct notion of compatibility in [M1] and can be written in terms of

$$\mathrm{CoTor} = (\nabla\wedge\mathrm{id} - \mathrm{id}\wedge\nabla)g \in \Omega^2\bar\otimes\Omega^1$$

(there is an additional term if the torsion is not zero). Since the metric consists of exact differentials and since the torsion vanishes, the first term $\nabla\wedge\mathrm{id}$ vanishes. Looking at the second term, we compute

$$\begin{aligned}
\frac{1}{[2]_q}(\mathrm{id}\wedge\nabla)g &= \frac{1}{[2]_q}(\mathrm{id}\wedge\nabla)(q^2db_-\bar\otimes db_+ + db_+\bar\otimes db_-) - (\mathrm{id}\wedge\nabla)(db_0\bar\otimes db_0) \\
&= q^2db_-\wedge b_+g + db_+\wedge b_-g - db_0\wedge(1 + [2]_q b_0)g = 0
\end{aligned}$$

by a right-module version $q^2(db_-)b_+ + (db_+)b_- - db_0(1 + [2]_q b_0) = 0$ of the relation in Corollary 3.2. Hence the cotorsion vanishes as well.

Note that the cotorsion or 'skew-metric compatibility' condition appropriate in noncommutative geometry [M1] is weaker than the usual notion. In our case we have the more usual $\nabla g = O(q-1)$ (if $\nabla$ is taken to act on the tensor product as a derivation while keeping its left output to the far left), so that we recover the usual full metric compatibility only when $q = 1$. It is also worth noting that viewed as sections of an associated bundle (see the Appendix), it is the $\bar\partial b_\pm$, $\bar\partial b_0$ which are actually holomorphic in the sense $\nabla\bar\partial b_i \in \Omega^{1,0}\bar\otimes\Omega^1$, rather than the image of $\partial$.

Let us also use the connection to relate to the projective module point of view on quantum bundles.

Corollary 5.2. The projector

$$E = \begin{pmatrix} [2]_q b_-\\ 1 + [2]_q b_0\\ [2]_q b_+\end{pmatrix}\begin{pmatrix} -b_+, & 1 + [2]_q b_0, & -q^2b_-\end{pmatrix}$$

yields $\Omega^1(C_q[S^2]) = C_q[S^2]^{\oplus 3}.(1 - E)$ and $\nabla = -E\,dE$ acting on $db_-, db_0, db_+$.

Proof. Note that proceeding from Theorem 3.1 would give a projector from 6 copies of $C_q[S^2]$, whereas we provide a projector more in keeping with the classical geometrical picture from 3 copies. Moreover, we are not using the universal calculus as in [HM]. Nevertheless, the form of $\nabla$ similarly suggests the projection shown, which we then verify directly. Thus, we have the dot products

$$\begin{pmatrix} -b_+, & 1 + [2]_q b_0, & -q^2b_-\end{pmatrix}\begin{pmatrix} [2]_q b_-\\ 1 + [2]_q b_0\\ [2]_q b_+\end{pmatrix} = 1, \qquad \begin{pmatrix} -b_+, & 1 + [2]_q b_0, & -q^2b_-\end{pmatrix}\begin{pmatrix} db_-\\ db_0\\ db_+\end{pmatrix} = 0,$$

using respectively the relations (3) of the $q$-sphere and the relation in Corollary 3.2. The second dot product with $d$ of the row vector similarly gives $-g$. These observations imply
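that $E^2 = E$ and (using the Leibniz rule to compute $dE$) that $\nabla = -dE = -E\,dE$ when acting on the column vector $(db_i)$. Here $E.(db_i) = 0$ (acting on the column vector). The map $C_q[S^2]^{\oplus 3} \to \Omega^1(C_q[S^2])$ is given by

$$(f, g, h) \mapsto f\,db_- + g\,db_0 + h\,db_+ = (f, g, h)(1 - E)(db_i)$$

and has kernel generated as a left module by the row vector in $E$. This corresponds to the relation in Corollary 3.2.

Let us note that the same $E$ also gives the $\partial$ and $\bar\partial$ parts in a similar way. Thus $\nabla = -\partial E = -E\partial E$ when acting on the column vector $(\bar\partial b_i)$ and $\nabla = -\bar\partial E = -E\bar\partial E$ on $(\partial b_i)$. One may check that $\bar\partial E.(\bar\partial b_i) = \partial E.(\partial b_i) = 0$, so that $\nabla = -dE = -E\,dE$ when acting on either $(\partial b_i)$ or $(\bar\partial b_i)$ separately.

Proposition 5.3. The Riemann and Ricci tensors of the above generalised Levi-Civita connection are as follows.

$$\mathrm{Riemann}|_{\Omega^{0,1}} = [2]_q\Upsilon\bar\otimes\mathrm{id}, \qquad \mathrm{Riemann}|_{\Omega^{1,0}} = -q^4[2]_q\Upsilon\bar\otimes\mathrm{id}.$$

The lift

$$i(\Upsilon) = \frac{q^{-1}}{[2]_q}\left(-(\mathrm{id}\bar\otimes *)g + \frac{1 - q^{-4}}{1 + q^{-4}}\,g\right)$$

and trace in the middle position gives

$$\mathrm{Ricci} = \frac{2q^{-1}}{1 + q^{-4}}\,g,$$

making the $q$-sphere an 'Einstein space'.

Proof. The Riemann tensor is defined abstractly [M1, M2] by

$$\mathrm{Riemann} = (\mathrm{id}\wedge\nabla - d\bar\otimes\mathrm{id})\nabla : \Omega^1 \to \Omega^2\bar\otimes\Omega^1$$

as the form-version of the usual definition, as explained in [M1]. One may compute it from the formulae for $\nabla$ above. Since $\nabla(\bar\partial b_\pm) \in \Omega^{1,0}\bar\otimes\bar\partial\{b_i\}$, when we apply $\mathrm{id}\wedge\nabla$ we will get zero since $\Omega^{2,0} = 0$. So only the $-(d\bar\otimes\mathrm{id})\nabla$ contributes. We have

$$\begin{aligned}
-\frac{1}{[2]_q}(d\bar\otimes\mathrm{id})\nabla(\bar\partial b_+) &= -b_+(q^2\,\bar\partial\partial b_-\bar\otimes\bar\partial b_+ + \bar\partial\partial b_+\bar\otimes\bar\partial b_- - [2]_q\,\bar\partial\partial b_0\bar\otimes\bar\partial b_0) \\
&\quad - \bar\partial b_+\wedge(q^2\,\partial b_-\bar\otimes\bar\partial b_+ + \partial b_+\bar\otimes\bar\partial b_- - [2]_q\,\partial b_0\bar\otimes\bar\partial b_0).
\end{aligned}$$

We use Proposition 4.4 for the Laplacian $\partial\bar\partial$ and that $\Upsilon$ is central in the first line to collect $q^2[2]_q b_+\Upsilon$ to the left times (18), so that the first line vanishes. For the second line we use our computations for such wedge products in the proof of Proposition 4.1 as multiples of $\Upsilon$, to obtain

$$-\frac{1}{[2]_q}(d\bar\otimes\mathrm{id})\nabla(\bar\partial b_+) = \Upsilon(q^2b_0^2\bar\otimes\bar\partial b_+ + q^{-1}b_+^2\bar\otimes\bar\partial b_- - [2]_q q^{-1}b_+b_-\bar\otimes\bar\partial b_0) = \Upsilon\bar\otimes\bar\partial b_+$$

on using the relations of the $q$-sphere and the relations between the $\bar\partial b_\pm$, $\bar\partial b_0$ in Corollary 3.3. Similarly for the Riemann tensor on $\bar\partial b_-$. The computation for $\mathrm{Riemann}(\partial b_\pm)$ is similar but yields an extra factor $-q^4$ (the symmetry was broken in our choice of $\Upsilon$).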
We note that Riemann is a left module map so it is enough to find it on such exact differentials. It is also possible to compute the curvature 'upstairs' in the principal bundle, using (14). By the same conventions and explanations as in the proof of Theorem 5.1, we have, for example

$$\mathrm{Riemann}(\partial b_+) = b^2F_\omega(t^{-2})e^+ = q^3[-2; q^2]\,b^2\,\Upsilon\bar\otimes e^+ = -q^4[2]_q\Upsilon\bar\otimes\partial b_+.$$

The curvature can be computed either way, as explained for the classical case in [M1]. Using the Hodge $*$ operator we can write the Riemann tensor as

$$\mathrm{Riemann} = [2]_q\Upsilon\bar\otimes\left(\frac{1 - q^4}{2} - \frac{1 + q^4}{2}\,*\right).$$

For the Ricci tensor we need to lift the Riemann tensor to a map $\Omega^1 \to \Omega^1\bar\otimes\Omega^1\bar\otimes\Omega^1$,

$$\begin{aligned}
i(\mathrm{Riemann}) &= [2]_q\, i(\Upsilon)\bar\otimes\left(\frac{1 - q^4}{2} - \frac{1 + q^4}{2}\,*\right) \\
&= \frac{2q^{-1}}{1 + q^{-4}}\left(\mathrm{id}\bar\otimes\left(\frac{1 - q^{-4}}{2} - \frac{1 + q^{-4}}{2}\,*\right)\right)g\bar\otimes\left(\frac{1 - q^4}{2} - \frac{1 + q^4}{2}\,*\right),
\end{aligned}$$

where we use the lifting from the same family as in Proposition 4.3 but of the form stated. We can then take a trace by 'feeding' the right-hand factor of $i(\Upsilon)$ into the input of Riemann. This gives

$$\mathrm{Ricci} = \frac{2q^{-1}}{1 + q^{-4}}\left(\mathrm{id}\bar\otimes\left(\frac{1 - q^4}{2} - \frac{1 + q^4}{2}\,*\right)\left(\frac{1 - q^{-4}}{2} - \frac{1 + q^{-4}}{2}\,*\right)\right)g,$$

which gives the result stated using $*^2 = \mathrm{id}$.
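The last step, where the two bracketed operators multiply to the identity once $*^2 = \mathrm{id}$ is used, is easy to confirm with exact arithmetic. The sketch below is an illustration only; it represents an operator $p_0 + p_1{*}$ by the pair $(p_0, p_1)$, and the function names are hypothetical.

```python
from fractions import Fraction


def compose(p, r):
    """Multiply (p0 + p1*star)(r0 + r1*star) using star*star = id."""
    return (p[0] * r[0] + p[1] * r[1], p[0] * r[1] + p[1] * r[0])


def ricci_coefficients(q):
    """Return (coefficient of g, coefficient of (id tensor star) g) in Ricci."""
    # operator appearing in Riemann = [2]_q Upsilon tensor (...)
    riemann_op = ((1 - q**4) / 2, -(1 + q**4) / 2)
    # operator acting on the second leg of g inside [2]_q i(Upsilon)
    lift_op = ((1 - q**-4) / 2, -(1 + q**-4) / 2)
    prefactor = 2 * (1 / q) / (1 + q**-4)
    c_id, c_star = compose(riemann_op, lift_op)
    return prefactor * c_id, prefactor * c_star


for q in (Fraction(2), Fraction(3, 5), Fraction(-7, 2)):
    c_g, c_star_g = ricci_coefficients(q)
    assert c_g == 2 * (1 / q) / (1 + q**-4)   # Ricci is a multiple of g
    assert c_star_g == 0                       # no (id tensor *) g term survives
```

The cross terms cancel exactly, which is why this particular member of the family of lifts makes Ricci proportional to the metric.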
The lift $i : \Omega^2 \to \Omega^1\bar\otimes\Omega^1$ is an additional datum in the approach of [M2] needed to define the Ricci tensor as well as interior products, etc. Our point of view in the above Proposition 5.3 is that for the standard metric to be Einstein, the natural lift $i$ in Proposition 4.3 gets deformed by an additional $g$ component which vanishes as $q \to 1$. Equivalently, if we keep the choice of $i$ coming from the geometry in Proposition 4.3 then we find

$$i(\Upsilon) = -\frac{q^{-1}}{[2]_q}(\mathrm{id}\bar\otimes *)g \quad\Rightarrow\quad \mathrm{Ricci} = \frac{q^{-1}(1 + q^4)}{2}\,g + \frac{[2]_q(1 - q^4)}{2}\,i(\Upsilon), \tag{23}$$

which is a novel prediction of an 'antisymmetric' volume form correction to the Ricci tensor that vanishes as $q \to 1$. Let us also note that the definition of Ricci used above is via the trace as in [M2], but between the second factor of the lift of Riemann (rather than the first factor as there) and its input. Also, we do not need a braided trace as was needed to keep covariance in the bicovariant calculus model in [M3], and do not have offsets $\theta\otimes\theta$ in Ricci as appeared there. The completely general definition of Ricci at the level of arbitrary framed algebras is not fully understood, but we see once again that in examples, as here, it is clear which trace to take.

The $\gamma$ matrices needed for the Dirac operator are likewise not yet formulated in the most general form for any framed algebra, but in examples there seems to be a clear choice. We propose the following. For the spin bundle we take

$$S \equiv S_- \oplus S_+ = E_{-1}\oplus E_{+1} = C_q[SL_2]_1\oplus C_q[SL_2]_{-1}$$
as given by the monopole bundles of charges $-1$ and $1$. We identify the sections of the bundles with the degree $\pm 1$ subspaces of $C_q[SL_2]$ as we did already for the charges $\pm 2$. This corresponds to the double cover of the bundle for the cotangent space in Theorem 3.1. The next ingredient is a map $\gamma : \Omega^1(C_q[S^2]) \to \mathrm{End}(S)$, which we construct as follows. We use our description in Corollary 3.3, as we have throughout the paper, to define

$$\gamma : \Omega^{1,0}\bar\otimes S_- \to S_+, \qquad \gamma(f e^+\bar\otimes\sigma) = f\sigma, \qquad \forall f \in C_q[SL_2]_{-2},\ \sigma \in S_-, \tag{24}$$

$$\gamma : \Omega^{0,1}\bar\otimes S_+ \to S_-, \qquad \gamma(f e^-\bar\otimes\tau) = f\tau, \qquad \forall f \in C_q[SL_2]_2,\ \tau \in S_+. \tag{25}$$

Here $\sigma, \tau$ denote appropriate sections and $\gamma$ under our identifications is nothing other than the product of $C_q[SL_2]$ restricted to the appropriate degrees. We also let

$$\gamma|_{\Omega^{0,1}\bar\otimes S_-} = 0, \qquad \gamma|_{\Omega^{1,0}\bar\otimes S_+} = 0. \tag{26}$$
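The definitions (24)-(26) amount to degree bookkeeping in $C_q[SL_2]$. The following sketch is only an illustration of that bookkeeping, not of the algebra itself: it assumes the grading used in the text, in which $a, c$ have degree $+1$ (so they span $S_-$) and $b, d$ have degree $-1$ (so they span $S_+$), and it represents elements by words in the generators. The function names are hypothetical.

```python
# Degrees of the generators of C_q[SL_2]: a, c have degree +1, b, d degree -1.
DEGREE = {"a": 1, "c": 1, "b": -1, "d": -1}


def degree(word):
    """Total degree of a monomial written as a string of generators."""
    return sum(DEGREE[x] for x in word)


def gamma_target(form_type, f, section):
    """Where gamma(f e^{+/-} tensor section) = f * section lands.

    form_type is "1,0" for f e^+ and "0,1" for f e^-. Returns "S+" or "S-",
    or None in the cases (26) where gamma is defined to be zero.
    """
    deg_f, deg_s = degree(f), degree(section)
    if form_type == "1,0" and deg_f == -2 and deg_s == 1:      # case (24)
        total = deg_f + deg_s
    elif form_type == "0,1" and deg_f == 2 and deg_s == -1:    # case (25)
        total = deg_f + deg_s
    else:                                                       # case (26)
        return None
    return "S+" if total == -1 else "S-"


# f = bd has degree -2 and a lies in S-, so the product lands in S+.
assert gamma_target("1,0", "bd", "a") == "S+"
# f = ca has degree +2 and b lies in S+, so the product lands in S-.
assert gamma_target("0,1", "ca", "b") == "S-"
# Mismatched degrees: gamma acts by zero.
assert gamma_target("0,1", "ca", "a") is None
```

In each nonzero case the degree of the product $f\sigma$ is forced, which is why $\gamma$ is off-diagonal with respect to $S = S_-\oplus S_+$.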
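The classical motivation for $\gamma$ is as follows. Since $\partial$ is like a holomorphic differential one may think of it as a complex linear combination of the usual differentials. Likewise, if $\sigma^{1,2}$ are the usual Pauli matrices, then

$$\frac{1}{2}(\sigma^1 + \imath\sigma^2) = \begin{pmatrix} 0 & 1\\ 0 & 0\end{pmatrix}, \qquad \frac{1}{2}(\sigma^1 - \imath\sigma^2) = \begin{pmatrix} 0 & 0\\ 1 & 0\end{pmatrix},$$

which is the structure we have used for (24)–(26).

Lemma 5.4. The operator $\gamma : \Omega^1(C_q[S^2])\bar\otimes S \to S$ defined above is covariant under the left coaction of $C_q[SL_2]$ and obeys

$$\gamma_{db_\pm}\circ\gamma_{db_\pm} = \begin{pmatrix} q^{-1} & 0\\ 0 & q^3\end{pmatrix} b_\pm^2 = -\gamma_{*db_\pm}\circ\gamma_{*db_\pm}, \qquad \{\gamma_{db_\pm}, \gamma_{*db_\pm}\} = 0,$$

where $\gamma_{db_\pm} = \gamma(db_\pm\bar\otimes(\ ))$, etc. Moreover,

$$\gamma\circ\gamma(g\bar\otimes(\ )) = \begin{pmatrix} q^2 & 0\\ 0 & 1\end{pmatrix}\mathrm{id}.$$

Proof. The left coaction on our various spaces is simply the coproduct of $C_q[SL_2]$ restricted to the appropriate degree. Since this is an algebra homomorphism, the $\gamma$ as given by the product is covariant. Likewise, $\partial b_i$, $\bar\partial b_i$ correspond as in Corollary 3.3 to $b^2, db, d^2, a^2, ca, c^2$ and the relations among the corresponding $\gamma_{\partial b_i}$, $\gamma_{\bar\partial b_i}$ are just the relations among these generators of $C_q[SO_3]$ defined as the even part of $C_q[SL_2]$, since $\gamma$ acts by left multiplication. Thus

$\gamma_{\partial b_-}\circ\gamma_{\bar\partial b_-} = b^2a^2 = q^3b_-^2$, $\qquad \gamma_{\bar\partial b_-}\circ\gamma_{\partial b_-} = q^{-1}b_-^2$,
$\gamma_{\partial b_+}\circ\gamma_{\bar\partial b_+} = d^2c^2 = q^3b_+^2$, $\qquad \gamma_{\bar\partial b_+}\circ\gamma_{\partial b_+} = q^{-1}b_+^2$,
$\gamma_{\partial b_0}\circ\gamma_{\bar\partial b_0} = dbca = q^2(1 + qb_0)b_0$, $\qquad \gamma_{\bar\partial b_0}\circ\gamma_{\partial b_0} = (1 + q^{-1}b_0)b_0$,
$\gamma_{\partial b_+}\circ\gamma_{\bar\partial b_-} = d^2a^2 = (1 + q^3b_0)(1 + qb_0)$, $\qquad \gamma_{\bar\partial b_-}\circ\gamma_{\partial b_+} = (1 + q^{-3}b_0)(1 + q^{-1}b_0)$,
$\gamma_{\partial b_-}\circ\gamma_{\bar\partial b_+} = b^2c^2 = b_0^2$, $\qquad \gamma_{\bar\partial b_+}\circ\gamma_{\partial b_-} = b_0^2$.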
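The products just listed that are polynomials in $b_0$ can be combined with the coefficients of the metric components to check the values of $\gamma\circ\gamma(g\bar\otimes(\ ))$ quoted in Lemma 5.4. The sketch below is a hedged illustration: it treats $b_0$ as a commuting number (only polynomials in $b_0$ occur), and it assumes that $g_{-+}$ has the same coefficients $(q^2, 1, -[2]_q)$ as $g_{+-}$ with the roles of $\partial$ and $\bar\partial$ exchanged, which is an inference from the form of $g$ rather than a formula displayed in the text.

```python
from fractions import Fraction


def two_q(q):
    """q-integer [2]_q = q + 1/q."""
    return q + 1 / q


def gamma_gamma_g_pm(q, b0):
    """gamma o gamma applied to g_{+-} (acts on S_+), from the left column.

    g_{+-} = q^2 db_- (x) dbar b_+  +  db_+ (x) dbar b_-  -  [2]_q db_0 (x) dbar b_0
    """
    minus_plus = b0**2                             # gamma_{d b-} o gamma_{dbar b+}
    plus_minus = (1 + q**3 * b0) * (1 + q * b0)    # gamma_{d b+} o gamma_{dbar b-}
    zero_zero = q**2 * (1 + q * b0) * b0           # gamma_{d b0} o gamma_{dbar b0}
    return q**2 * minus_plus + plus_minus - two_q(q) * zero_zero


def gamma_gamma_g_mp(q, b0):
    """gamma o gamma applied to g_{-+} (acts on S_-), from the right column.

    Assumed form: q^2 dbar b_- (x) db_+ + dbar b_+ (x) db_- - [2]_q dbar b_0 (x) db_0.
    """
    minus_plus = (1 + q**-3 * b0) * (1 + b0 / q)   # gamma_{dbar b-} o gamma_{d b+}
    plus_minus = b0**2                             # gamma_{dbar b+} o gamma_{d b-}
    zero_zero = (1 + b0 / q) * b0                  # gamma_{dbar b0} o gamma_{d b0}
    return q**2 * minus_plus + plus_minus - two_q(q) * zero_zero


for q in (Fraction(2), Fraction(3, 5)):
    for b0 in (Fraction(0), Fraction(1, 7), Fraction(-4, 3)):
        assert gamma_gamma_g_pm(q, b0) == 1
        assert gamma_gamma_g_mp(q, b0) == q**2
```

All dependence on $b_0$ cancels, leaving the diagonal entries $q^2$ (on $S_-$) and $1$ (on $S_+$) of the matrix in the lemma.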
The left column acts on $S_+$ and the right column on $S_-$ by multiplication. Remembering that $\gamma$ acts by zero when the degrees do not match, we find $\gamma_{db_\pm} = \gamma_{\partial b_\pm} + \gamma_{\bar\partial b_\pm}$ with square as stated. Since the Hodge $*$ changes the sign of $\bar\partial b_i$ we find similarly that $\gamma_{db_\pm}$ and $\gamma_{*db_\pm}$ anticommute. Finally, the other expressions allow us to compute

$$\gamma\circ\gamma(g_{+-}\bar\otimes(\ )) = q^2b_0^2 + (1 + q^3b_0)(1 + qb_0) - [2]_q q^2(1 + qb_0)b_0 = 1, \qquad \gamma\circ\gamma(g_{-+}\bar\otimes(\ )) = q^2,$$

where for example $g_{+-}$ is the $\Omega^{1,0}\bar\otimes\Omega^{0,1}$ component of the metric. These combine to the result stated.

The classical meaning of the $\gamma$ relations stated is that in a local coordinate chart the 2-dimensional cotangent space is spanned by $db_+$, $*db_+$, say, or (another chart) with $b_-$. We see that our $\gamma$ operators in these directions mutually anticommute and each square to a multiple of a $q$-deformation of the identity. The relation involving the metric is a weak form of the Clifford relations involving the metric, as proposed in [M2].

Proposition 5.5. Let $D$ be the covariant derivative on $S$ given by the $q$-monopole as spin connection. We define the Dirac operator on $C_q[S^2]$ by

$$\slashed{\nabla} = \gamma\circ D = \begin{pmatrix} 0 & \gamma\circ D^{0,1}\\ \gamma\circ D^{1,0} & 0\end{pmatrix} : S \to S,$$

where $D = D^{1,0} + D^{0,1}$ according to the parts in $\Omega^{1,0}$ and $\Omega^{0,1}$. Then $\slashed{\nabla}$ is covariant under the left coaction of $C_q[SL_2]$ and under local frame rotations $\mathbb{C}[t, t^{-1}]$. Moreover, for $f = b_-,\ 1 + [2]_q b_0,\ b_+$ (the three cases listed in this order), we have

$$\slashed{\nabla}^2(fa) = q^{-1}[2]_q\Delta(f)a + \begin{cases} 0\\ -q^{-1}a\\ -q^{-1}c,\end{cases} \qquad \slashed{\nabla}^2(fb) = q^{-1}[2]_q\Delta(f)b + \begin{cases} 0\\ qb\\ qd,\end{cases}$$

$$\slashed{\nabla}^2(fc) = q^{-1}[2]_q\Delta(f)c + \begin{cases} a\\ qc\\ 0,\end{cases} \qquad \slashed{\nabla}^2(fd) = q^{-1}[2]_q\Delta(f)d + \begin{cases} -q^2b\\ -q^3d\\ 0.\end{cases}$$
Proof. If $\sigma \in S_-$ then $\gamma(D\sigma) = \gamma(D^{1,0}\sigma) \in S_+$, where $D^{1,0}$ denotes the $\Omega^{1,0}$ part of $D$, etc., giving the stated form of $\slashed{\nabla}$ on $S_-\oplus S_+$. The space $S_-$ viewed as the degree 1 subspace of $C_q[SL_2]$ is spanned over $C_q[S^2]$ by $a, c$. These are not linearly independent but obey the relations

$$b_+a - (1 + qb_0)c = 0, \qquad b_0a - q^2b_-c = 0$$

as used in the projector [HM]. The covariant derivative on such sections is known already from [BM1] and takes the form

$$Da = da - ae^0 = qbe^+ = q^{-1}\partial b_0.a - q\,\partial b_-.c, \qquad Dc = dc - ce^0 = qde^+ = \partial b_+.a - q\,\partial b_0.c$$

by similar computations as in Theorem 5.1. We omit writing the basis of the degree $-1$ left comodule $V = \mathbb{C}$ in view of our identifications. Then

$$\slashed{\nabla}a = \gamma(bd\,e^+\bar\otimes a) - q\gamma(b^2e^+\bar\otimes c) = bda - qb^2c = b.$$
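By such calculations, one has

$$\slashed{\nabla}a = b, \qquad \slashed{\nabla}c = d, \qquad \slashed{\nabla}b = qa, \qquad \slashed{\nabla}d = qc, \tag{27}$$

where $b, d \in S_+$ and $a, c \in S_-$. For $\slashed{\nabla}^2$ we note the Leibniz rule for any $\sigma \in S_-$ (say) and $f \in C_q[S^2]$,

$$\slashed{\nabla}(f\sigma) = f\slashed{\nabla}\sigma + \gamma(\partial f\bar\otimes\sigma) = f\slashed{\nabla}\sigma + f_i\gamma(\partial b_i\bar\otimes\sigma), \tag{28}$$

where $\partial f = f_i\partial b_i$ (sum over $i = -, 0, +$) say. We similarly write $\bar\partial f = f_i\bar\partial b_i$ and have a similar expression for the Leibniz property on $S_+$. We choose the coefficients $f_i$ from a fixed expansion $df = f_i\,db_i$ (they are not unique). Then

$$\slashed{\nabla}^2(f\sigma) = f\slashed{\nabla}^2\sigma + f_i\gamma(\bar\partial b_i\bar\otimes\slashed{\nabla}\sigma) + \gamma(\bar\partial f_i\bar\otimes\gamma(\partial b_i\bar\otimes\sigma)) + f_i\slashed{\nabla}\circ\gamma(\partial b_i\bar\otimes\sigma).$$

From Theorem 5.1 we have

$$\nabla\partial f = \bar\partial f_i\bar\otimes\partial b_i + f_i\nabla\partial b_i = \bar\partial f_i\bar\otimes\partial b_i + f_i[2]_q b_i\, g_{-+} + f_0\, g_{-+},$$

which combined with Lemma 5.4 gives the third term of $\slashed{\nabla}^2$ as

$$\gamma\circ\gamma(\nabla\partial f\bar\otimes\sigma) - (q^2f_i[2]_q b_i + q^2f_0)\sigma.$$

Meanwhile direct computation gives

$$\slashed{\nabla}\circ\gamma(\partial b_i\bar\otimes\sigma) = \slashed{\nabla}\begin{pmatrix} b^2\\ db\\ d^2\end{pmatrix}.\sigma = (q^2[2]_q b_i + q^2\delta_{i,0})\sigma$$

at least when $\sigma = a, c$. Hence

$$\slashed{\nabla}^2(fa) = qfa + f_i\gamma(\bar\partial b_i\bar\otimes b) + \gamma\circ\gamma(\nabla\partial f\bar\otimes a)$$

and similarly for $\slashed{\nabla}^2(fc)$. Computing the middle terms, we find

$$\begin{aligned}
\slashed{\nabla}^2(fa) &= qfa + q^{-1}f_ib_ia - q^{-1}f_+c + \gamma\circ\gamma(\nabla\partial f\bar\otimes a),\\
\slashed{\nabla}^2(fc) &= qfc + q^{-1}f_ib_ic + f_-a + f_0c + \gamma\circ\gamma(\nabla\partial f\bar\otimes c)
\end{aligned} \tag{29}$$

for all $f \in C_q[S^2]$. There are similar formulae for $\slashed{\nabla}^2(fb)$, $\slashed{\nabla}^2(fd)$. The particular cases of $f$ stated then follow using Theorem 5.1 or (22) and $\gamma\circ\gamma(g\bar\otimes(\ ))$ from Lemma 5.4. We recognise the action of the Laplacian from Proposition 4.4. One may also obtain these particular cases by rather tedious direct computation, first finding

$$\slashed{\nabla}(b_ia) = q[2]_q b_ib + \begin{cases} 0\\ qb\\ d,\end{cases} \qquad \slashed{\nabla}(b_ib) = [2]_q b_ia + \begin{cases} 0\\ 0\\ -q^{-1}c,\end{cases}$$

$$\slashed{\nabla}(b_ic) = q[2]_q b_id + \begin{cases} -qb\\ 0\\ 0,\end{cases} \qquad \slashed{\nabla}(b_id) = [2]_q b_ic + \begin{cases} a\\ c\\ 0,\end{cases}$$

where the cases are as $b_i = b_-, b_0, b_+$.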
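The 'direct computation' route can be mechanised: applying the first-order table for $\slashed{\nabla}(b_ix)$ twice should reproduce the $\slashed{\nabla}^2$ formulae of Proposition 5.5. The sketch below is a hedged illustration. It treats the elements $x$ and $b_ix$ (for $x = a, b, c, d$) as formal basis vectors over the rationals, takes the four case tables exactly as read off above together with (27), and uses $\Delta(f) = q^2[2]_qf$ from Proposition 4.4. The data layout and function names are hypothetical.

```python
from fractions import Fraction


def add(u, v, scale=1):
    """Return u + scale*v for vectors stored as {basis_key: coefficient}."""
    out = dict(u)
    for k, c in v.items():
        out[k] = out.get(k, 0) + scale * c
    return {k: c for k, c in out.items() if c != 0}


def dirac_table(q):
    """Action of the Dirac operator on the formal basis ("1", x) and (i, x)."""
    two = q + 1 / q
    plain = {"a": {("1", "b"): 1}, "b": {("1", "a"): q},
             "c": {("1", "d"): 1}, "d": {("1", "c"): q}}          # equation (27)
    lead = {"a": ("b", q * two), "b": ("a", two),
            "c": ("d", q * two), "d": ("c", two)}                 # leading terms
    extra = {                                                      # case terms
        "a": {"-": {}, "0": {("1", "b"): q}, "+": {("1", "d"): 1}},
        "b": {"-": {}, "0": {}, "+": {("1", "c"): -1 / q}},
        "c": {"-": {("1", "b"): -q}, "0": {}, "+": {}},
        "d": {"-": {("1", "a"): 1}, "0": {("1", "c"): 1}, "+": {}},
    }
    table = {}
    for x in "abcd":
        table[("1", x)] = plain[x]
        y, coeff = lead[x]
        for i in "-0+":
            table[(i, x)] = add({(i, y): coeff}, extra[x][i])
    return table


def dirac(vec, table):
    """Extend the table linearly to a formal vector."""
    out = {}
    for key, c in vec.items():
        out = add(out, table[key], c)
    return out


def check_proposition(q):
    """Compare Dirac^2(f x) with q^-1 [2]_q Delta(f) x + (case term)."""
    two = q + 1 / q
    table = dirac_table(q)
    # the three eigenfunctions b_-, 1 + [2]_q b_0, b_+ as formal combinations
    eigen = {"-": {"-": 1}, "0": {"1": 1, "0": two}, "+": {"+": 1}}
    cases = {
        "a": {"-": {}, "0": {("1", "a"): -1 / q}, "+": {("1", "c"): -1 / q}},
        "b": {"-": {}, "0": {("1", "b"): q}, "+": {("1", "d"): q}},
        "c": {"-": {("1", "a"): 1}, "0": {("1", "c"): q}, "+": {}},
        "d": {"-": {("1", "b"): -q**2}, "0": {("1", "d"): -q**3}, "+": {}},
    }
    for x in "abcd":
        for i in "-0+":
            fx = {(k, x): c for k, c in eigen[i].items()}
            lhs = dirac(dirac(fx, table), table)
            # q^-1 [2]_q * Delta(f) = q [2]_q^2 f, since Delta(f) = q^2 [2]_q f
            rhs = add(cases[x][i], fx, q * two**2)
            if lhs != rhs:
                return False
    return True


assert check_proposition(Fraction(2))
assert check_proposition(Fraction(3, 5))
```

Only linearity over the scalars is used, so the check is about the consistency of the two tables and does not re-derive either of them from the algebra relations.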
The formulae for $\slashed{\nabla}^2$ are examples of 'Lichnerowicz type' formulae, where relative to a chosen (holomorphic or antiholomorphic) section, it is given by the scalar Laplacian from Proposition 4.4 plus an additional term, which we think of as some form of 'scalar curvature'. Let us also comment that our Dirac operator comes with a $\mathbb{Z}_2$ grading in the splitting $S = S_-\oplus S_+$ and $\slashed{\nabla}$ anticommutes with the grading. Also, from (28) and its cousin on $S_+$, we have

$$[\slashed{\nabla}, \hat f] = \gamma_{df}, \qquad \forall f \in C_q[S^2], \tag{30}$$

where $\hat f$ denotes $f$ acting on $S$ by left multiplication, so that $\slashed{\nabla}$ does allow one to recover the $d$ of the calculus at the algebraic level. Thus we have some of the features of a spectral triple [Co] although not fitting precisely into that setting. On the other hand, our $\slashed{\nabla}$ naturally deforms the geometrical Dirac operator on the classical sphere and is motivated in that way rather than by such axioms. Also, it is not hard to exhibit some eigenfunctions of $\slashed{\nabla}$. From the proof of Proposition 5.5, we have

$$\slashed{\nabla}\begin{pmatrix} \pm q^{\frac12}a\\ b\end{pmatrix} = \pm q^{\frac12}\begin{pmatrix} \pm q^{\frac12}a\\ b\end{pmatrix}, \qquad \slashed{\nabla}\begin{pmatrix} \pm q^{\frac12}c\\ d\end{pmatrix} = \pm q^{\frac12}\begin{pmatrix} \pm q^{\frac12}c\\ d\end{pmatrix} \tag{31}$$

as solutions of the massive Dirac equation.

Finally, let us note that unlike the cotangent bundle, the spinor bundle is trivial. Both are trivial in K-theory having zero total monopole charge (see [HM]).

Proposition 5.6. Let $e = \begin{pmatrix} -q^{-1}b_0 & qb_-\\ -b_+ & 1 + qb_0\end{pmatrix}$. Then

$$S_- \cong C_q[S^2]^{\oplus 2}(1 - e), \qquad S_+ \cong C_q[S^2]^{\oplus 2}e, \qquad S \cong C_q[S^2]^{\oplus 2}.$$

The covariant derivative and Dirac operators in the trivialisation are

$$D = d + (de)e - e\,de,$$

$$\slashed{\nabla} = \lambda q + \lambda q^{-1}(\overleftarrow{\partial}_i b_i)(1 + (q^4 - 1)e) + \lambda q^2\begin{pmatrix} \overleftarrow{\partial}_0 & q^{-1}\overleftarrow{\partial}_+\\ -\overleftarrow{\partial}_- & 0\end{pmatrix}e - \lambda\begin{pmatrix} 0 & q^{-1}\overleftarrow{\partial}_+\\ -\overleftarrow{\partial}_- & -\overleftarrow{\partial}_0\end{pmatrix}(1 - e),$$

acting on row 2-vectors, where $df = (f\overleftarrow{\partial}_i)db_i$ (sum $i = -, 0, +$) and $\lambda = q^{-\frac12}$.

Proof. The projection for one half, the bundle $S_+$, say, was obtained in [HM] with the universal calculus and this is our starting point. Our observation is that the projector for the other half of the spinor space is just given by the complementary projection. We then verify the desired properties directly. As spanning set for $S_-$, $S_+$ we take the column vectors $\begin{pmatrix} a\\ c\end{pmatrix}$ and $\begin{pmatrix} b\\ d\end{pmatrix}$ respectively (these should not be confused with the $S_-\oplus S_+$ column vectors above). We verify that

$$e\begin{pmatrix} a\\ c\end{pmatrix} = 0, \qquad D\begin{pmatrix} a\\ c\end{pmatrix} = -\partial e\begin{pmatrix} a\\ c\end{pmatrix} = -e\,de\begin{pmatrix} a\\ c\end{pmatrix},$$
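$$(1 - e)\begin{pmatrix} b\\ d\end{pmatrix} = 0, \qquad D\begin{pmatrix} b\\ d\end{pmatrix} = \bar\partial e\begin{pmatrix} b\\ d\end{pmatrix} = -(1 - e)d(1 - e)\begin{pmatrix} b\\ d\end{pmatrix},$$

using the computations above for $D$ and the relations (17) to find $\partial e = e\,de$ and $\bar\partial e = -(1 - e)d(1 - e) = (de)e$.

The map $C_q[S^2]^{\oplus 2} \to S_+$ is given by $(f, g) \mapsto fb + gd = (f, g)e\begin{pmatrix} b\\ d\end{pmatrix}$. Similarly for $S_-$ with $1 - e$. Since they are given by complementary projectors, we see that together these maps trivialise $S$. Let us write this explicitly as the combined map

$$(f, g) \mapsto (f, g)\begin{pmatrix} a + \lambda b\\ c + \lambda d\end{pmatrix}$$

viewed in $S \subset C_q[SL_2]$. The $a, c$ have different degrees from $b, d$ so any nonzero value of $\lambda$ will do and yield an isomorphism. Noting that

$$\partial e\begin{pmatrix} b\\ d\end{pmatrix} = \bar\partial e\begin{pmatrix} a\\ c\end{pmatrix} = 0$$

allows us to combine the expressions for $D$ according to

$$D\left((f, g)\begin{pmatrix} a + \lambda b\\ c + \lambda d\end{pmatrix}\right) = (df, dg)\bar\otimes\begin{pmatrix} a + \lambda b\\ c + \lambda d\end{pmatrix} - (f, g)(e\,de - (de)e)\begin{pmatrix} a + \lambda b\\ c + \lambda d\end{pmatrix}.$$

In view of the isomorphism, one can view this as an operator on $(f, g)$ which is as stated with $d$ and matrix multiplication acting from the right (this is inevitable in our conventions).

For the Dirac operator we use the Leibniz formula (28) and set $\lambda = q^{-\frac12}$ so that $a + \lambda b$, $c + \lambda d$ are eigenvectors of $\slashed{\nabla}$. Then

$$\slashed{\nabla}\left((f, g)\begin{pmatrix} a + \lambda b\\ c + \lambda d\end{pmatrix}\right) = \gamma_{(df, dg)}\begin{pmatrix} a + \lambda b\\ c + \lambda d\end{pmatrix} + \lambda^{-1}(f, g)\begin{pmatrix} a + \lambda b\\ c + \lambda d\end{pmatrix}.$$

By similar methods as in the proof of Proposition 5.5, we write $df = f_i\,db_i$ and take these coefficients also for expansions of $\partial f$, $\bar\partial f$. Then

$$\gamma_{(df, dg)}\begin{pmatrix} a + \lambda b\\ c + \lambda d\end{pmatrix} = (f_i, g_i)b_i\begin{pmatrix} q^2b + \lambda q^{-1}a\\ q^2d + \lambda q^{-1}c\end{pmatrix} + (qf_0 - qg_-,\ f_+)\begin{pmatrix} b\\ d\end{pmatrix} + \lambda(g_-,\ -q^{-1}f_+ + g_0)\begin{pmatrix} a\\ c\end{pmatrix}.$$

By inserting an $e$ in front of $\begin{pmatrix} b\\ d\end{pmatrix}$ and $(1 - e)$ in front of $\begin{pmatrix} a\\ c\end{pmatrix}$, we can write this as a certain operation on $(f, g)$ followed by dot product with $\begin{pmatrix} a + \lambda b\\ c + \lambda d\end{pmatrix}$, which gives the $\slashed{\nabla}$ stated. Here $\overleftarrow{\partial}_i$ denotes the right-acting operator which assigns our chosen coefficients $f_i$ to $f$ and $g_i$ to $g$. The last two terms of $\slashed{\nabla}$ (the displayed matrix terms) can be combined as

$$\lambda\begin{pmatrix} -q\overleftarrow{\partial}_0 b_0 - [2]_q\overleftarrow{\partial}_+ b_+ & q^3\overleftarrow{\partial}_0 b_- + q\overleftarrow{\partial}_+(1 + [2]_q b_0)\\ \overleftarrow{\partial}_0 b_+ + \overleftarrow{\partial}_-(1 + [2]_q b_0) & -q^2[2]_q\overleftarrow{\partial}_- b_- - q\overleftarrow{\partial}_0 b_0\end{pmatrix}.$$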
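The combination of the two matrix terms can be checked by formal bookkeeping. The sketch below is an illustration under stated assumptions: it reads the matrix multiplying $e$ as having rows $(\overleftarrow{\partial}_0,\ q^{-1}\overleftarrow{\partial}_+)$ and $(-\overleftarrow{\partial}_-,\ 0)$ and the matrix multiplying $1 - e$ as having rows $(0,\ q^{-1}\overleftarrow{\partial}_+)$ and $(-\overleftarrow{\partial}_-,\ -\overleftarrow{\partial}_0)$, it treats each product $\overleftarrow{\partial}_i b_j$ as an independent formal symbol, and it drops the overall factor $\lambda$. The data layout and function names are hypothetical.

```python
from fractions import Fraction


def times(M, P):
    """Formal product of a 2x2 matrix of derivative symbols with one of functions.

    Entries of M are {derivative_index: coeff}; entries of P are {function: coeff}
    with function in {"1", "-", "0", "+"} standing for 1, b_-, b_0, b_+.
    """
    out = [[{}, {}], [{}, {}]]
    for r in range(2):
        for c in range(2):
            for k in range(2):
                for i, x in M[r][k].items():
                    for b, y in P[k][c].items():
                        out[r][c][(i, b)] = out[r][c].get((i, b), 0) + x * y
    return out


def combine(A, B, sa, sb):
    """Entrywise sa*A + sb*B with vanishing coefficients removed."""
    out = [[{}, {}], [{}, {}]]
    for r in range(2):
        for c in range(2):
            acc = {}
            for src, s in ((A, sa), (B, sb)):
                for key, val in src[r][c].items():
                    acc[key] = acc.get(key, 0) + s * val
            out[r][c] = {k: v for k, v in acc.items() if v != 0}
    return out


def last_two_terms(q):
    """q^2 * M1 * e - M2 * (1 - e), i.e. the matrix terms without the factor lambda."""
    one = Fraction(1)
    e = [[{"0": -1 / q}, {"-": q}],
         [{"+": -one}, {"1": one, "0": q}]]
    one_minus_e = [[{"1": one, "0": 1 / q}, {"-": -q}],
                   [{"+": one}, {"0": -q}]]
    M1 = [[{"0": one}, {"+": 1 / q}], [{"-": -one}, {}]]
    M2 = [[{}, {"+": 1 / q}], [{"-": -one}, {"0": -one}]]
    return combine(times(M1, e), times(M2, one_minus_e), q**2, -1)


def claimed(q):
    """The combined matrix as displayed, again without the factor lambda."""
    two = q + 1 / q
    return [[{("0", "0"): -q, ("+", "+"): -two},
             {("0", "-"): q**3, ("+", "1"): q, ("+", "0"): q * two}],
            [{("0", "+"): 1, ("-", "1"): 1, ("-", "0"): two},
             {("-", "-"): -q**2 * two, ("0", "0"): -q}]]


for q in (Fraction(2), Fraction(3, 5)):
    assert last_two_terms(q) == claimed(q)
```

Because the $\overleftarrow{\partial}_ib_j$ are kept as independent symbols, the check concerns only the matrix algebra and says nothing about the non-uniqueness of the coefficients $f_i$.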
This part alone is not well-defined (recall that the chosen coefficients of $df$, $dg$ are not unique) but the entire $\slashed{\nabla}$ is well-defined and independent of the choice, as it must be since $\slashed{\nabla}$ exists geometrically on the bundle $S$.

It remains to make some remarks about the classical limits. Setting $q = 1$ we have

$$\slashed{\nabla}_{q=1} = 1 + \begin{pmatrix} \overleftarrow{\partial}_- b_- - \overleftarrow{\partial}_+ b_+ & \overleftarrow{\partial}_0 b_- + \overleftarrow{\partial}_+(1 + 2b_0)\\ \overleftarrow{\partial}_0 b_+ + \overleftarrow{\partial}_-(1 + 2b_0) & \overleftarrow{\partial}_+ b_+ - \overleftarrow{\partial}_- b_-\end{pmatrix} = 1 - \imath\sigma\cdot\overleftarrow{\partial}\times\overleftarrow{x}$$

when we shift to the usual $x, y, z$ coordinates related (say) as $b_\pm = \pm(x \pm \imath y)$ and $b_0 = z - \frac12$, corresponding to a sphere of radius 1/2 embedded in $\mathbb{R}^3$. Changing variables to $\overleftarrow{\partial} = (\overleftarrow{\partial}_x, \overleftarrow{\partial}_y, \overleftarrow{\partial}_z)$ (so that $\overleftarrow{\partial}_\pm = \pm\frac12(\overleftarrow{\partial}_x \mp \imath\overleftarrow{\partial}_y)$ and $\overleftarrow{\partial}_0 = \overleftarrow{\partial}_z$) gives $\slashed{\nabla}$ as stated in terms of the usual Pauli matrices and the vector $\overleftarrow{x} = (x, y, z)$ acting by right multiplication. This expression makes sense on functions on $\mathbb{R}^3$ but vanishes on functions that depend only on the radius, and hence descends to functions on $S^2$. Incidentally, as is well-known, $\slashed{\nabla}^2 = \slashed{\nabla} + \Delta$ where

$$\Delta = (\overleftarrow{\partial}\cdot\overleftarrow{x})^2 + \overleftarrow{\partial}\cdot\overleftarrow{x} - (\overleftarrow{\partial}\cdot\overleftarrow{\partial})(\overleftarrow{x}\cdot\overleftarrow{x})$$

(an easy computation) and one may check that $\Delta$ defined on the ambient $\mathbb{R}^3$ again descends to $S^2$ and is indeed the classical limit of the Laplace operator in Proposition 4.4 under the same change of classical coordinates and $\overleftarrow{x}\cdot\overleftarrow{x} = 1/4$. Obviously all operations here may be reformulated acting from the left as more usual.

Appendix A. Geometrical q-Borel-Weil-Bott Construction

The Borel-Weil-Bott construction in classical representation theory constructs irreducible representations of a compact Lie group $G$ as follows. From the bundle $G \to G/T$ (where $T$ is the maximal torus) and an irreducible representation $V$ of $T$ (given by a character) we construct the associated bundle $E = G\times_T V$ over $G/T$. Its space of sections $E$ still carries a representation of $G$ acting on $G/T$ from the left and lifted to an action on $E$ by a certain connection. This space is, however, much too big to be an irreducible representation. From the point of view of 'geometric quantization' one must choose a polarization. A natural approach here is to use the complex structure on $G/T$ and take the holomorphic sections $E^{\mathrm{hol}}$. These are a much smaller space and form an irreducible representation of $G$. We refer to [GZ] for an excellent account of the classical situation from this point of view and of the quantum case from a representation theoretic (but not really geometric) point of view.

On the other hand, in the course of understanding the geometry of the $q$-sphere, we have now obtained all the ingredients for the quantum group geometrical version of this construction. We outline this application in this appendix. The full details and generalisation from $C_q[SL_2]$ to other quantum groups will be addressed elsewhere. Indeed, this remark is about the monopole associated bundles and their covariant derivatives as we have already used for charges $\pm 2$, $\pm 1$. Now we consider general

$$E_{-n} = (C_q[SL_2]\otimes V)^{\mathbb{C}[t,t^{-1}]},$$

where $V = \mathbb{C}.v$ is the right comodule defined by $\Delta_R v = v\otimes t^{-n}$. This is 1-dimensional so that $E_{-n} = C_q[SL_2]_n.v$, i.e. isomorphic to the degree $n$ component. As such $E_{-n}$ also carries the left coaction of $C_q[SL_2]$ given by restricting the coproduct since this respects the degree (the coaction on degree 0 was already used
in (16)). Let $D$ be the usual covariant derivative for the monopole connection. We say that a section $\sigma \in E_{-n}$ is holomorphic if

$$D\sigma \in \Omega^{1,0}\bar\otimes E_{-n}. \tag{32}$$

In other words, if we write $D = D^{1,0} + D^{0,1}$ for the $\Omega^{1,0}$ and $\Omega^{0,1}$ parts then we require $D^{0,1}\sigma = 0$.

Proposition A.1. The space of holomorphic sections $E^{\mathrm{hol}}_{-n}$ of the charge $-n$ $q$-monopole bundle contains the standard $n+1$-dimensional corepresentation of $C_q[SL_2]$.

Proof. Here $n \ge 0$. The standard $n+1$-dimensional corepresentation of $C_q[SL_2]$ corresponds to the $q$-deformation of the standard $n+1$-dimensional irreducible representation of $SL_2$ and is given by

$$\{c^sa^t \mid s + t = n,\ s, t \ge 0\}. \tag{33}$$

These are all in the degree $n$ component of $C_q[SL_2]$ and form a left corepresentation via the restriction of the coproduct. We verify that as such, they are holomorphic. Indeed, as in the computation for Theorem 5.1, we compute

$$dc^s = [s; q^2]\,c^{s-1}(ce^0 + q\,de^+), \qquad da^t = [t; q^2]\,a^{t-1}(ae^0 + q\,be^+)$$

as easily proven by induction. Then

$$d(c^sa^t) = [n; q^2]\,c^sa^te^0 + q^tc^{s-1}a^{t-1}(q[s; q^2] + [n; q^2]bc)e^+$$

by the Leibniz rule. Hence using (13) we have

$$D(c^sa^t) = d(c^sa^t) - c^sa^t\omega(t^n) = q^tc^{s-1}a^{t-1}(q[s; q^2] + [n; q^2]bc)e^+.$$

The expressions are slightly simpler when $s = 0$ or $t = 0$. Next, we move $c^{s-1}a^{t-1}$ to the far right and use (21) to see that $D(c^sa^t) \in \Omega^{1,0}\bar\otimes E_{-n}$ by the same argument as in the proof of Theorem 5.1. Hence these elements are holomorphic as claimed.

Conversely, if $f \in C_q[S^2]$ and $\sigma \in E^{\mathrm{hol}}_{-n}$ then $D(f\sigma) = df\bar\otimes\sigma + fD(\sigma)$ means that $f\sigma$ cannot be holomorphic unless $\bar\partial f = 0$ which, as explained in Sect. 4 (at least for generic $q$), means $f$ is a multiple of 1. This reminds us that $E^{\mathrm{hol}}_{-n}$ is indeed a complex vector space but not a $C_q[S^2]$-module. As such we have seen that it contains $\mathrm{span}\{c^sa^t\}$, which are linearly independent over $\mathbb{C}$ and give the usual $n+1$-dimensional corepresentation. On the other hand, since the dimension of $E^{\mathrm{hol}}_{-n}$ classically is $n+1$, this should also be true for generic $q$ (since all our structures deform with classical dimensions). In this case, by dimensions, $E^{\mathrm{hol}}_{-n} = \mathrm{span}\{c^sa^t\}$, i.e. it not only contains but coincides with the $q$-deformed $n+1$-dimensional corepresentation of $C_q[SL_2]$. This outlines a geometric proof of the $q$-Borel-Weil-Bott construction. Other approaches to this topic are in [APW, PW].

Acknowledgements. I would like to thank Ruibin Zhang for stimulating discussions on the Borel-Weil-Bott construction (see Appendix) during a visit to the Dept of Mathematics, University of Sydney in December 2002. The work was completed while the author was a Royal Society University Research Fellow.
References

[APW] Andersen, H.H., Polo, P., Wen, K.X.: Representations of quantum algebras. Invent. Math. 104, 1–59 (1991)
[BCT] Bonechi, F., Ciccoli, N., Tarlini, M.: Noncommutative instantons in the 4-sphere from quantum groups. Commun. Math. Phys. 226, 419–432 (2002)
[BK] Bibikov, P.N., Kulish, P.P.: Dirac operators on the quantum group SUq(2) and the quantum sphere. J. Math. Sci. 100, 239–250 (2000)
[BM1] Brzeziński, T., Majid, S.: Quantum group gauge theory on quantum spaces. Commun. Math. Phys. 157, 591–638 (1993); Erratum 167, 235 (1995)
[BM2] Brzeziński, T., Majid, S.: Quantum differentials and the q-monopole revisited. Acta Appl. Math. 54, 185–232 (1998)
[Co] Connes, A.: Noncommutative Geometry. London: Academic Press, 1994
[DS] Dabrowski, L., Sitarz, A.: Dirac operator on the standard Podleś sphere. In: Noncommutative geometry and quantum groups, P. Hajac, W. Pusz (eds.), Banach Center Publications 61, 49–58 (2003)
[HM] Hajac, P., Majid, S.: Projective module description of the q-monopole. Commun. Math. Phys. 206, 246–464 (1999)
[Ma] Majid, S.: Foundations of Quantum Group Theory. Cambridge: Cambridge University Press, 1995
[M1] Majid, S.: Quantum and braided group Riemannian geometry. J. Geom. Phys. 30, 113–146 (1999)
[M2] Majid, S.: Riemannian Geometry of Quantum Groups and Finite Groups with Nonuniversal Differentials. Commun. Math. Phys. 225, 131–170 (2002)
[M3] Majid, S.: Ricci tensor and Dirac operator on Cq[SL2] at roots of unity. Lett. Math. Phys. 63, 39–54 (2003)
[Ow] Owczarek, R.: Dirac operators on the Podleś sphere. Int. J. Theor. Phys. 40, 163–170 (2001)
[PW] Parshall, B., Wang, S.: Quantum linear groups. Mem. Amer. Math. Soc. 89 (1991)
[PS] Pinzul, A., Stern, A.: Dirac operator on the quantum sphere. Phys. Lett. B 512, 217–224 (2001)
[Po1] Podleś, P.: Quantum spheres. Lett. Math. Phys. 14, 193–202 (1987)
[Po2] Podleś, P.: Differential calculus on quantum spheres. Lett. Math. Phys. 18, 107–119 (1989)
[Po3] Podleś, P.: The classification of differential structures on quantum 2-spheres. Commun. Math. Phys. 150, 167–179 (1992)
[Sw] Sweedler, M.: Hopf Algebras. New York: Benjamin Press (1969)
[Sch] Schneider, H-J.: Principal homogeneous spaces for arbitrary Hopf algebras. Isr. J. Math. 72, 167–195 (1990)
[Wo] Woronowicz, S.L.: Differential calculus on compact matrix pseudogroups (quantum groups). Commun. Math. Phys. 122, 125–170 (1989)
[GZ] Gover, A.R., Zhang, R.B.: Geometry of quantum homogeneous vector bundles and representation theory of quantum groups I (appendix of). Rev. Math. Phys. 11, 533–552 (1999)

Communicated by L. Takhtajan
Commun. Math. Phys. 256, 287–303 (2005) Digital Object Identifier (DOI) 10.1007/s00220-005-1317-6
The Capacity of a Quantum Channel for Simultaneous Transmission of Classical and Quantum Information
I. Devetak (1), P. W. Shor (2)
(1) IBM T.J. Watson Research Center, Yorktown Heights, NY 10598, USA
(2) Department of Mathematics, MIT, Cambridge, MA 02139, USA
Received: 11 December 2003 / Accepted: 23 October 2004 Published online: 22 March 2005 – © Springer-Verlag 2005
Abstract: An expression is derived characterizing the set of admissible rate pairs for simultaneous transmission of classical and quantum information over a given quantum channel, generalizing both the classical and quantum capacities of the channel. Although our formula involves regularization, i.e. taking a limit over many copies of the channel, it reduces to a single-letter expression in the case of generalized dephasing channels. Analogous formulas are conjectured for the simultaneous public-private capacity of a quantum channel and for the simultaneously 1-way distillable common randomness and entanglement of a bipartite quantum state.

1. Introduction

In the paper that marked the beginning of information theory [21], C. E. Shannon introduced the notion of a (classical) channel W, a stochastic map modeling the effect of noise experienced by a classical message on its way from sender to remote receiver. There he defined and computed the key property of the channel W: its capacity C(W) to convey classical information, expressed in bits per channel use. Many decades later, in the context of quantum information theory, the notion of a quantum channel N, a cptp (completely positive trace preserving) map, was introduced as the most general bipartite dynamic resource consistent with quantum mechanics. There are now two basic capacities one may define for N: classical C(N) and quantum Q(N). Intuitively, these correspond to the maximum number of bits (respectively qubits) per use of N that can be faithfully transmitted over the channel.

The classical capacity theorem was independently proved by Holevo [16], and Schumacher and Westmoreland [23]. The quantum capacity theorem was originally stated by Lloyd [19], although it was only recently generally realized that his proof could be made rigorous [17]. It has also been proved by Shor [25] and subsequently, via the private classical capacity, by Devetak [8].
In the present paper we unify the two capacities by investigating the capacity of N for simultaneously transmitting classical and quantum information, given in the form of a trade-off curve.
Let the sender Alice and receiver Bob be connected via a quantum channel $\mathcal{N} : \mathcal{H}_{A'} \to \mathcal{H}_B$, where $\mathcal{H}_{A'}$ denotes the Hilbert space of Alice’s input system $A'$ and $\mathcal{H}_B$ that of Bob’s output system $B$. We shall define three distinct information processing scenarios which will turn out to be equivalent.

Scenario Ia (Subspace Transmission). Alice’s task is to convey to Bob, in some large number $n$ of uses of the channel, one of $\mu$ equiprobable classical messages with low error probability and simultaneously an arbitrarily chosen quantum state from some Hilbert space $\mathcal{H}$ of dimension $\kappa$ with high fidelity. More precisely, we define a (classical, quantum) channel code to consist of:
• An ordered set $(\mathcal{E}_m)_{m \in [\mu]}$, $[\mu] = \{1, 2, \dots, \mu\}$, of cptp maps $\mathcal{E}_m : \mathcal{H}_{\mathbf{A}'} \to \mathcal{H}_{A'}^{\otimes n}$. Such an ordered set is the most general function with two inputs, classical and quantum, and one quantum output.
• A decoding quantum instrument [7] $\mathbf{D} = (\mathcal{D}_m)_{m \in [\mu]}$, an ordered set of cp (completely positive) maps $\mathcal{D}_m : \mathcal{H}_B^{\otimes n} \to \mathcal{H}_{\mathbf{B}}$, the sum of which $\mathcal{D} = \sum_{m \in [\mu]} \mathcal{D}_m$ is trace preserving. The probability of outcome $m$ for input $\rho$ is $\mathrm{Tr}\, \mathcal{D}_m(\rho)$, while the effective quantum map is $\mathcal{D}$. The instrument has one quantum input and two outputs, classical and quantum. It is a natural generalization of a POVM (positive operator valued measure), which cares only about the classical output, and of a quantum cptp map, which has only a quantum output.
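The definition of a quantum instrument can be made concrete with a very small case. The following is a minimal sketch, not taken from the paper: it assumes a single qubit measured in the computational basis, with branches $\mathcal{D}_m(\rho) = |m\rangle\langle m|\rho|m\rangle\langle m|$, and the helper names (`apply_branch`, `instrument`) are hypothetical. Matrices are plain nested lists so that no third-party package is needed.

```python
# Computational-basis measurement of a qubit, written as a quantum instrument.
# Assumption of this sketch: branches D_m(rho) = |m><m| rho |m><m|, m in {0, 1}.

def apply_branch(m, rho):
    """Return D_m(rho): keep only the (m, m) entry of the 2x2 matrix rho."""
    out = [[0j, 0j], [0j, 0j]]
    out[m][m] = complex(rho[m][m])
    return out

def trace(a):
    """Trace of a 2x2 matrix given as nested lists."""
    return a[0][0] + a[1][1]

def instrument(rho):
    """Return (outcome probabilities, unnormalised branch outputs, D(rho))."""
    branches = [apply_branch(m, rho) for m in (0, 1)]
    probs = [trace(b).real for b in branches]  # Pr(m) = Tr D_m(rho)
    total = [[branches[0][i][j] + branches[1][i][j] for j in (0, 1)]
             for i in (0, 1)]                  # D = D_0 + D_1
    return probs, branches, total

# The state |+><+| has all four entries equal to 1/2.
plus = [[0.5, 0.5], [0.5, 0.5]]
probs, branches, total = instrument(plus)
print(probs)          # classical output statistics
print(trace(total))   # the summed map preserves the trace
```

Each branch alone lowers the trace, while the sum of the branches is trace preserving; in this particular example the summed map is the completely dephasing qubit channel.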
Alice’s classical message is represented by a random variable $M$ uniformly distributed on the set $[\mu]$. Conditional on $M$ taking on a particular value $m$, Alice encodes the quantum state of $\mathbf{A}'$ with $\mathcal{E}_m$ and sends it through $n$ copies of the channel $\mathcal{N}$. Bob performs the instrument $\mathbf{D}$ on the channel output, resulting in the classical outcome random variable $M'$ and a quantum output system $\mathbf{B}$. Note that $\mathcal{H}_{\mathbf{A}'} = \mathcal{H}_{\mathbf{B}} = \mathcal{H}$. We call the ordered pair $((\mathcal{E}_m)_m, \mathbf{D})$ an $(n, \epsilon)$ code if

1. $\Pr\{M' \neq m \mid M = m\} \le \epsilon$, $\forall m$,
2. $\min_{|\varphi\rangle \in \mathcal{H}} F(\varphi, (\mathcal{D} \circ \mathcal{N}^{\otimes n} \circ \mathcal{E}_m)(\varphi)) \ge 1 - \epsilon$, $\forall m$,

where the fidelity is defined by $F(\rho, \sigma) = \|\sqrt{\rho}\sqrt{\sigma}\|_1^2$. Condition 1 above means that each message should be correctly decoded by Bob with high probability. Condition 2 corresponds to the subspace transmission criterion of [1]: each pure input state $|\varphi\rangle$ supported on $\mathcal{H}$ should be almost perfectly transmitted to Bob. The (classical, quantum) rate pair of the code is $(r, R)$, with $r = \frac{1}{n} \log \mu$ and $R = \frac{1}{n} \log \kappa$. They represent the number of bits and qubits, respectively, per use of the channel that can be faithfully transmitted simultaneously. A rate pair $(r, R)$ is called achievable if for all $\epsilon, \delta > 0$ and all sufficiently large $n$ there exists an $(n, \epsilon)$ code with rate pair $(r - \delta, R - \delta)$. The simultaneous (classical, quantum) Scenario Ia capacity region of the channel $S_{Ia}(\mathcal{N})$ is the set of all achievable positive rate pairs.

Scenario Ib (Entanglement Transmission). This scenario is very similar to the first one, but instead of transmitting an arbitrary pure state of $\mathbf{A}'$, Alice is required to preserve entanglement [1] between $\mathbf{A}'$ and some reference system $\mathbf{A}$ she has no access to. Here Condition 2 is replaced by

2′. $F(\Phi, \Omega_m) \ge 1 - \epsilon$, $\forall m$,
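where
$$\Omega_m^{\mathbf{A}\mathbf{B}} = [\mathbb{1}^{\mathbf{A}} \otimes (\mathcal{D} \circ \mathcal{N}^{\otimes n} \circ \mathcal{E}_m)](\Phi^{\mathbf{A}\mathbf{A}'}), \tag{1}$$
and
$$|\Phi\rangle = \frac{1}{\sqrt{\kappa}} \sum_{k=1}^{\kappa} |k\rangle \otimes |k\rangle \tag{2}$$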
is the standard maximally entangled state on $\mathcal{H} \otimes \mathcal{H}$. We denote the corresponding capacity region by $S_{Ib}(\mathcal{N})$.

Scenario II (Entanglement Generation). In this scenario, simultaneously with transmitting classical information, Alice wishes to generate entanglement [8] shared with Bob rather than preserving it as in Scenario Ib. Alice prepares, without loss of generality, a pure bipartite state $|\Upsilon_m\rangle^{\mathbf{A}A'^n}$ in her lab ($\mathcal{H}_{A'^n} := \mathcal{H}_{A'}^{\otimes n}$), depending on the classical information $m$, and sends it through the channel. Bob decodes as above, yielding the output state
$$\Omega_m^{\mathbf{A}\mathbf{B}} = [\mathbb{1}^{\mathbf{A}} \otimes (\mathcal{D} \circ \mathcal{N}^{\otimes n})](\Upsilon_m^{\mathbf{A}A'^n}), \tag{3}$$
shared by Alice and Bob. Everything else is defined as in Scenario Ib. The corresponding capacity region is denoted by $S_{II}(\mathcal{N})$.

In the next section we state our main result, a unique expression for the capacity regions defined above, investigate its properties and relate it to previous work. The proof of our main theorem is relegated to Sect. 3. Some remarks on related problems are collected in Sect. 4. We conclude in Sect. 5 with suggestions for future research.

2. Main Result
Recall the notion of an ensemble of quantum states $E = \{p_x, |\phi_x\rangle^{AA'}\}$: the quantum system $AA'$ is in the state $|\phi_x\rangle^{AA'}$ with probability $p_x$. The ensemble $E$ is equivalently represented by a classical-quantum system [10] $XAA'$ in the state
$$\sum_x p_x |x\rangle\langle x|^X \otimes |\phi_x\rangle\langle\phi_x|^{AA'}.$$
$X$ plays the dual role of an auxiliary quantum system in the state $\sum_x p_x |x\rangle\langle x|$ and of a random variable with distribution $p$. Sending the $A'$ system through the channel $\mathcal{N}$ gives rise to a classical-quantum system $XAB$ in some state $\sigma^{XAB}$:
$$\sigma^{XAB} = \sum_x p_x |x\rangle\langle x|^X \otimes [(\mathbb{1} \otimes \mathcal{N})(|\phi_x\rangle\langle\phi_x|)]^{AB}. \tag{4}$$
For such a state we say that it “arises from” the channel $\mathcal{N}$. For a multi-party state such as $\sigma^{XAB}$ the reduced density operator $\sigma^A$ is defined by $\mathrm{Tr}_{XB}\, \sigma^{XAB}$. Conversely, we call $\sigma^{XAB}$ an extension of $\sigma^A$. A pure extension is conventionally called a purification. Define the von Neumann entropy of a quantum state $\rho$ by $H(\rho) = -\mathrm{Tr}(\rho \log \rho)$. We write $H(A)_\sigma = H(\sigma^A)$, omitting the subscript when the reference state is clear from the context. The Shannon entropy $-\sum_x p_x \log p_x$ of the random variable $X$ is equal to the von Neumann entropy $H(X)$ of the system $X$. Define the conditional entropy
$H(A|B) = H(AB) - H(B)$, (quantum) mutual information $I(A; B) = H(A) + H(B) - H(AB)$, and conditional mutual information $I(A; B|X) = I(A; BX) - I(A; X)$. The coherent information $I(A\rangle B)$ is defined as $-H(A|B)$. Whenever the state $\rho^{AB}$ comes about by sending some pure state $|\phi\rangle^{AA'}$ through the channel $\mathcal{N}$, we may use the alternative notation [22] $I_c(\phi^{A'}, \mathcal{N}) := I(A\rangle B)_\rho$, since this quantity is independent of the particular purification $|\phi\rangle^{AA'}$ of $\phi^{A'}$. In what follows all information theoretical quantities will refer to the state $\sigma^{XAB}$, unless stated otherwise. Our main result is the following theorem.

Theorem 1. The simultaneous capacity regions of $\mathcal{N}$ for the various scenarios Ia, Ib and II are all equal and given by
$$S(\mathcal{N}) = \bigcup_{l=1}^{\infty} \frac{1}{l} S^{(1)}(\mathcal{N}^{\otimes l}), \tag{5}$$
where $S^{(1)}(\mathcal{N})$ is the union, over all $\sigma^{XAB}$ arising from the channel $\mathcal{N}$, of the $(r, R)$ pairs obeying
$$0 \le r \le I(X; B), \qquad 0 \le R \le I(A\rangle BX). \tag{6}$$
Furthermore, in computing $S^{(1)}(\mathcal{N})$ one only needs to consider random variables $X$ defined on some set $\mathcal{X}$ of cardinality $|\mathcal{X}| \le (\dim \mathcal{H}_{A'})^2 + 2$.

Since the three scenarios are equivalent we shall speak of a single capacity region. The generic shape of the capacity region is shown in Fig. 1. We shall informally refer to the outer boundary of the capacity region in the $(r > 0, R > 0)$ quadrant as the “trade-off curve”. In Scenarios Ib and II, for any $0 < \lambda < 1$, combining a $(\lambda n, \epsilon)$ code of rate pair $(r_1, R_1)$ with a $((1 - \lambda)n, \epsilon)$ code of rate pair $(r_2, R_2)$ one obtains an $(n, 2\epsilon)$ code of rate pair $(\lambda r_1 + (1 - \lambda) r_2, \lambda R_1 + (1 - \lambda) R_2)$. This construction is known as time-sharing and implies the concavity of the capacity region. In fact, the “single-letter” region $S^{(1)}(\mathcal{N})$ is already concave for all channels $\mathcal{N}$ (see Appendix A). The points $(C(\mathcal{N}), 0)$ and $(0, Q(\mathcal{N}))$ represent the classical and quantum capacities, respectively. By time-sharing one may achieve the line segment interpolating between the two, giving an inner bound on the capacity region. An outer bound given by the line segment connecting $(C(\mathcal{N}), 0)$ and $(0, C(\mathcal{N}))$ is obtained by observing that, in Scenario Ia, the transmitted quantum subspace may always be used to encode classical information at 1 bit/qubit.

Our theorem is, alas, difficult to use in practice due to the $l \to \infty$ limit. Two simple examples in which this limit is not required are the noiseless qubit channel and the erasure channel, for which both $C(\mathcal{N})$ and $Q(\mathcal{N})$ were previously known [3]. In both cases the boring time-sharing strategy turns out to be optimal. This is particularly trivial to see for the noiseless channel: since $Q(\mathcal{N}) = C(\mathcal{N})$, the inner and outer bound coincide.
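The time-sharing construction and the two linear bounds described above amount to simple arithmetic on rate pairs. The sketch below is illustrative only: the function names are hypothetical, and the example uses the noiseless qubit channel, for which the text gives $Q(\mathcal{N}) = C(\mathcal{N})$ (both taken to be 1 here).

```python
def time_share(point1, point2, lam):
    """Rate pair obtained by using code 1 for a fraction lam of the channel
    uses and code 2 for the remaining fraction 1 - lam."""
    (r1, R1), (r2, R2) = point1, point2
    return (lam * r1 + (1 - lam) * r2, lam * R1 + (1 - lam) * R2)

def within_outer_bound(point, C, tol=1e-12):
    """Check the linear outer bound r + R <= C (1 bit per transmitted qubit)."""
    r, R = point
    return r >= 0 and R >= 0 and r + R <= C + tol

# Noiseless qubit channel: C = Q = 1, so the inner-bound segment joining
# (C, 0) and (0, Q) coincides with the outer-bound segment.
C, Q = 1.0, 1.0
mixed = time_share((C, 0.0), (0.0, Q), 0.25)
print(mixed)                         # a point on the time-sharing segment
print(within_outer_bound(mixed, C))  # it also satisfies the outer bound
```

For a channel with $Q(\mathcal{N}) < C(\mathcal{N})$ the same two functions give a strict gap between the inner and outer segments, which is where a non-trivial trade-off curve can live.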
Fig. 1. A generic trade-off curve for the simultaneous (classical, quantum) capacity region (solid line). The dashed-dotted line represents the time-sharing inner bound. The dashed line is the outer bound which follows from the observation that the transmitted quantum subspace may always be used to encode classical information at 1 bit/qubit. The continuation to the negative R axis (see text) is shown for Scenario II (solid) and Scenario I (dashed)
A more interesting case is that of a dephasing channel, for which the large $l$ limit is also not required (we prove this in Appendix B), yet the resulting trade-off curve is strictly concave. The $S(\mathcal{N})$ region for the dephasing qubit channel with dephasing parameter 0.2 is shown in Fig. 2.

For the depolarizing channel, another popular example, the $l \to \infty$ limit is known to be needed when the depolarizing parameter is close to $p = 0.189$, the value making $Q(\mathcal{N}) = 0$ [13]. One can, however, make an interesting observation about the behavior of the trade-off curve near $R = Q(\mathcal{N})$. Although the channel itself is invariant under unitary transformations, the $\rho$ that maximizes the coherent information $I_c(\rho, \mathcal{N})$ breaks this symmetry; indeed there is a whole family of density operators attaining $Q(\mathcal{N})$. One can thus construct an ensemble with $R = Q(\mathcal{N})$ and $r > 0$, so the trade-off curve is parallel to the $r$ axis in a finite region around $r = 0$. For the depolarizing channel with different $p$, we have calculated the trade-off curve assuming $l = 1$ and found some interesting behavior. For $p$ small ($p < 0.04$ or so) it is possible to do better than the time-sharing strategy, whereas for larger $p$ ($p > 0.05$), the time-sharing strategy is optimal, assuming $l = 1$. For these values of $p$, it is not known whether taking large $l$ is advantageous for $Q(\mathcal{N})$.

There is an intriguing connection between our capacity region and the findings of Shor [26] concerning the classical capacity of a quantum channel with limited entanglement assistance. The latter may be thought of as extending Scenario II to the negative $R$ axis, since entanglement is consumed rather than generated.¹ The result for the $R \le 0$ region parallels that from Theorem 1, replacing (6) by
$$0 \le r \le I(X; B) + I(A; B|X), \qquad R \le -H(A|X) = I(A\rangle BX) - I(A; B|X). \tag{7}$$

¹ This idea arose from conversations with Aram Harrow.

Fig. 2. The trade-off curve for the dephasing qubit channel with dephasing parameter 0.2 (i.e., the channel obtained by applying the identity operator with probability 0.9 and σz with probability 0.1). In the left-hand plot (r against R), the trade-off curve is plotted with a solid line and the time-sharing bound with a dashed line. The right-hand plot gives the difference ∆ between the optimal strategy and time-sharing, as a function of R
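As a rough numerical companion to Fig. 2, the sketch below evaluates the two bounds of Eq. (6) for one hand-picked family of ensembles on the dephasing qubit channel, taken here to be $\rho \mapsto (1-p)\rho + p\,\sigma_z \rho \sigma_z$ with $p = 0.1$ as in the caption. The family is an assumption of this sketch, not a construction quoted from the paper: $X$ is a uniform bit, Alice prepares $\sqrt{\mu}|00\rangle + \sqrt{1-\mu}|11\rangle$ on $AA'$ for $x = 0$, and the same state with $\mu$ and $1-\mu$ exchanged for $x = 1$. A short calculation under these assumptions gives $I(X;B) = 1 - h_2(\mu)$ and $I(A\rangle BX) = h_2(\mu) - h_2(\gamma)$ with $\gamma = \tfrac12 + \tfrac12\sqrt{1 - 16p(1-p)\mu(1-\mu)}$, where $h_2$ is the binary entropy. The function names are hypothetical, and no claim is made that this family traces out the optimal curve.

```python
import math

def h2(x):
    """Binary entropy in bits, with h2(0) = h2(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def dephasing_rate_pair(mu, p):
    """(r, R) = (I(X;B), I(A>BX)) for the two-state ensemble of the lead-in."""
    gamma = 0.5 + 0.5 * math.sqrt(max(0.0, 1 - 16 * p * (1 - p) * mu * (1 - mu)))
    r = 1 - h2(mu)            # H(B) = 1 and H(B|X) = h2(mu)
    R = h2(mu) - h2(gamma)    # H(B|X) - H(AB|X)
    return r, R

def time_sharing_R(r, p):
    """Quantum rate on the segment joining (C, 0) = (1, 0) and (0, 1 - h2(p))."""
    return (1 - r) * (1 - h2(p))

p = 0.1
for mu in (0.0, 0.1, 0.2, 0.3, 0.4, 0.5):
    r, R = dephasing_rate_pair(mu, p)
    print(f"mu={mu:.1f}  r={r:.4f}  R={R:.4f}  gap={R - time_sharing_R(r, p):+.4f}")
```

At $\mu = 0$ the pair reduces to $(1, 0)$ and at $\mu = 1/2$ to $(0, 1 - h_2(p))$, the two end points of the time-sharing segment; for intermediate $\mu$ the printed gap shows how far this particular family sits above that segment.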
The two expressions on the right-hand side have the same sum as in Eq. (6). There is a simple bijection between the two regions: If (r, R) is a point in R ≥ 0 corresponding to the state σ XAB , then (r + I (A; B|X), R − I (A; B|X)) is a point in the R ≤ 0 region, and vice versa. Imagine that 1 ebit of entanglement were a stronger resource than 1 bit of communication, in the sense that the latter could be produced from the former. Then the R ≤ 0 region would be trivially achievable by the achievability of the R ≥ 0 region. The opposite would hold were 1 bit stronger than 1 ebit. However, it is well known that bits and ebits are incomparable resources. The correspondence between the two regions may be interpreted as providing a limited sense in which bits and ebits may be thought of as equally strong.

One may play the same game in the context of Scenario I (a or b), with a somewhat less interesting outcome. Here a negative rate R is interpreted as assistance by a noiseless quantum channel. It is known [24] that the classical capacity of a noiseless channel combined with a noisy one is just the sum of the individual capacities. Hence the Scenario I continuation of our trade-off curve simply follows the linear outer bound into the R < 0 region (see Fig. 1).
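The correspondence between the $R \ge 0$ and $R \le 0$ regions described above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it takes the relevant entropies of some state $\sigma^{XAB}$ as plain numbers (the values used are made up, not computed from any channel) and the function name is hypothetical.

```python
def rate_points(H_B, H_A_given_X, H_B_given_X, H_AB_given_X):
    """Corner points of Eq. (6) and Eq. (7) built from the same entropies."""
    I_XB = H_B - H_B_given_X                            # I(X;B)
    coherent = H_B_given_X - H_AB_given_X               # I(A>BX) = -H(A|BX)
    I_AB_X = H_A_given_X + H_B_given_X - H_AB_given_X   # I(A;B|X)
    unassisted = (I_XB, coherent)                       # corner of Eq. (6)
    assisted = (I_XB + I_AB_X, coherent - I_AB_X)       # corner of Eq. (7)
    return unassisted, assisted

# Made-up entropies (in bits), chosen only to exercise the formulas.
unassisted, assisted = rate_points(H_B=1.0, H_A_given_X=0.7,
                                   H_B_given_X=0.7, H_AB_given_X=0.3)
print(unassisted, assisted)
```

The shifted quantum rate equals $-H(A|X)$, as in Eq. (7), and the sum $r + R$ is the same for both points.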
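3. Proof of Theorem 1

The following lemma from [1] is needed to relate Scenarios Ia and Ib.

Lemma 2.
1. If
$$\min_{|\varphi\rangle \in \mathcal{H}} F(\varphi, (\mathcal{D} \circ \mathcal{N}^{\otimes n} \circ \mathcal{E})(\varphi)) \ge 1 - \tfrac{2}{3}\epsilon,$$
then
$$F(\Phi, [\mathbb{1} \otimes (\mathcal{D} \circ \mathcal{N}^{\otimes n} \circ \mathcal{E})](\Phi)) \ge 1 - \epsilon. \tag{8}$$
2. Conversely, if (8) holds then
$$\min_{|\varphi\rangle \in \mathcal{H}'} F(\varphi, (\mathcal{D} \circ \mathcal{N}^{\otimes n} \circ \mathcal{E})(\varphi)) \ge 1 - 2\epsilon,$$
where $\mathcal{H}'$ is a subspace of $\mathcal{H}$ satisfying
$$\dim \mathcal{H}' \ge \tfrac{1}{2} \dim \mathcal{H}. \tag{9}$$

Observe that $S_{Ia}(\mathcal{N}) = S_{Ib}(\mathcal{N}) \subseteq S_{II}(\mathcal{N})$. The equality follows from both parts of Lemma 2. The inclusion is obvious since one can always generate entanglement by transmitting half of the maximally entangled state $|\Phi\rangle$. Therefore, to prove Theorem 1 it suffices to show that the region (5) is contained in $S_{Ib}(\mathcal{N})$ (the “direct coding theorem”) and contains $S_{II}(\mathcal{N})$ (the “converse”). To prove the converse we need the following simple lemma [8].

Lemma 3. For two bipartite states $\rho^{AB}$ and $\sigma^{AB}$ of a quantum system $AB$ of dimension $d$ with fidelity $f = F(\rho^{AB}, \sigma^{AB})$,
$$|I(A\rangle B)_\rho - I(A\rangle B)_\sigma| \le \frac{2}{e} + 4 \log d\, \sqrt{1 - f}.$$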
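To get a feeling for the size of the bound in Lemma 3, the sketch below evaluates it numerically. Two assumptions are made here and are not taken from the text: logarithms are base 2, and, following one reading of how the lemma is used in the converse, the system has dimension $d = \kappa^2 = 2^{2nR}$ with infidelity at most $\epsilon$, so that the bound becomes $2/e + 8nR\sqrt{\epsilon}$. The function names are hypothetical.

```python
import math

def lemma3_bound(log2_d, infidelity):
    """Right-hand side of Lemma 3: 2/e + 4 log(d) sqrt(1 - f), logs base 2."""
    return 2 / math.e + 4 * log2_d * math.sqrt(infidelity)

def converse_gap_per_use(n, R, eps):
    """Lemma 3 bound divided by n, for log2(d) = 2 n R and 1 - f <= eps."""
    return lemma3_bound(2 * n * R, eps) / n

# The additive 2/e term is washed out by the division by n, while the
# second term stays of order R * sqrt(eps) per channel use.
for n in (10, 100, 1000):
    print(n, converse_gap_per_use(n, R=0.5, eps=1e-4))
```

The per-use gap therefore tends to $8R\sqrt{\epsilon}$ as $n$ grows, which is small whenever $\epsilon$ is.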
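Proof of Theorem 1 (Converse for Scenario II). Define the classical-quantum state $\omega^{M\mathbf{A}B^n}$ to be the result of sending the $A'^n$ part of
$$\frac{1}{\mu} \sum_m |m\rangle\langle m|^M \otimes \Upsilon_m^{\mathbf{A}A'^n}$$
through the channel $\mathcal{N}^{\otimes n}$. We shall prove that, for any $\delta, \epsilon > 0$ and all sufficiently large $n$, if an $(n, \epsilon)$ code has a rate pair $(r, R)$ then
$$r - \delta \le \frac{1}{n} I(M; B^n)_\omega, \tag{10}$$
$$R - \delta \le \frac{1}{n} I(\mathbf{A}\rangle B^n M)_\omega. \tag{11}$$
Evidently, it suffices to prove this for $\delta \le 1$, $\epsilon \le [\delta / (16 \log \dim \mathcal{H}_{A'})]^2$ and $n \ge 2/\delta$. Fano’s inequality [5] says
$$H(M|M') \le 1 + \Pr\{M' \neq M\}\, nr.$$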
Equation (10) is a consequence of the following string of inequalities:
$$\begin{aligned} nr &= H(M) \\ &= I(M; M') + H(M|M') \\ &\le I(M; M') + 1 + n\epsilon \log \dim \mathcal{H}_{A'} \\ &\le I(M; B^n) + 1 + n\epsilon \log \dim \mathcal{H}_{A'}, \end{aligned}$$
the last line by the Holevo bound [15]. On the other hand, defining $\omega'^{M\mathbf{A}\mathbf{B}}$ to be the state $\omega^{M\mathbf{A}B^n}$ after Bob’s decoding $\mathcal{D}$,
$$\begin{aligned} I(\mathbf{A}\rangle B^n M)_\omega &\ge I(\mathbf{A}\rangle \mathbf{B} M)_{\omega'} \\ &\ge I(\mathbf{A}\rangle \mathbf{B})_{\omega'} \\ &\ge I(\mathbf{A}\rangle \mathbf{B})_{\Phi} - \frac{2}{e} - 8nR\sqrt{\epsilon} \\ &\ge nR - \frac{2}{e} - 8n \log \dim \mathcal{H}_{A'} \sqrt{\epsilon}, \end{aligned}$$
from which the claim (11) follows. The first inequality is the data processing inequality [2], the second follows from the fact that conditioning cannot increase quantum entropy [20] and the third is an application of Lemma 3. It should be noted that we only used a weaker “average” version of Conditions 1 and 2′, namely

1. $\Pr\{M' \neq M\} \le \epsilon$,
2′. $\frac{1}{\mu} \sum_m F(\Phi, \Omega_m) \ge 1 - \epsilon$.

The bound on the cardinality of $\mathcal{X}$ is proven in Appendix C.

We henceforth restrict attention to Scenario Ib. In proving the direct coding theorem, we shall combine purely quantum and purely classical codes. A quantum code is a special case of a (classical, quantum) code defined earlier, for which $\mu = 1$ ($r = 0$). Quantum codes are characterized by a pair of encoding and decoding maps $(\mathcal{E}, \mathcal{D})$. Define the quantum code density operator [8] as $\mathcal{E}(\pi)$, where $\pi = \frac{1}{\kappa} \mathbb{1}^{\mathbf{A}'}$. Often in coding theory it is useful to consider random codes. Alice and Bob have access to an auxiliary resource: a common source of randomness described by some probability distribution $(P_\alpha)$. A random quantum code is an ordered set of encoding-decoding pairs $((\mathcal{E}^\alpha, \mathcal{D}^\alpha))_\alpha$, indexed by $\alpha$. With probability $P_\alpha$, Alice and Bob choose to employ the deterministic code $(\mathcal{E}^\alpha, \mathcal{D}^\alpha)$. The average code density operator for the random quantum code is given by $\sum_\alpha P_\alpha \mathcal{E}^\alpha(\pi)$. Given a density operator $\rho$ on $\mathcal{H}_{A'}$, we say that an $(n, \epsilon)$ random quantum code is “ρ-type” if the average code density operator $\omega$ satisfies
$$\|\omega - \rho^{\otimes n}\|_1 \le \epsilon. \tag{12}$$
For an ensemble of density operators $E = \{p_x, \rho_x\}$ defined on $\mathcal{H}_{A'}$ and a sequence $x^n = x_1 x_2 \dots x_n$, denote $\rho_{x^n} = \bigotimes_{i=1}^{n} \rho_{x_i}$. We say that an $(n, \epsilon)$ random quantum code is “$(E, x^n)$-type” if the average code density operator $\omega_{x^n}$ satisfies $\|\omega_{x^n} - \rho_{x^n}\|_1 \le \epsilon$. The following proposition is a refinement of the quantum channel coding theorem, and was proved in Appendix D of [8]. A perhaps more accessible outline of the proof may be found in [12].
Proposition 4. For any $\epsilon, \delta > 0$ and all sufficiently large $n$, there exists a random ρ-type $(n, \epsilon)$ quantum code for the channel $\mathcal{N}$ of rate $R = I_c(\rho, \mathcal{N}) - \delta$.

Recall the notion of δ-typical sequences $T^n_{p,\delta}$,
$$T^n_{p,\delta} = \{x^n : \forall x\ |N(x|x^n) - n p_x| \le \delta n\},$$
where $N(x|x^n)$ counts the number of occurrences of $x$ in $x^n$. When the distribution $p$ is associated with some random variable $X$ the alternative notation $T^n_{X,\delta}$ may be used. Proposition 4 extends to:

Proposition 5. For any $\epsilon, \delta > 0$ and all sufficiently large $n$, for any typical sequence $x^n \in T^n_{p,\delta}$ there exists a random $(E, x^n)$-type $(n, \epsilon)$ quantum code for the channel $\mathcal{N}$ of rate $R = \sum_x p_x I_c(\rho_x, \mathcal{N}) - c\delta$, for some constant $c$.

Proof. By Proposition 4, for sufficiently large $n$, for all $x$ there exists an $(n[p_x - \delta], \epsilon)$ code of rate $R_x = I_c(\rho_x, \mathcal{N}) - \delta$, with average density operator $\omega_x$ satisfying
$$\|\omega_x - \rho_x^{\otimes n[p_x - \delta]}\|_1 \le \epsilon.$$
By “pasting” $|\mathcal{X}|$ such codes together (one for each $x$) an $(n - n|\mathcal{X}|\delta, |\mathcal{X}|\epsilon)$ code is produced with average code density operator $\omega = \bigotimes_x \omega_x$. Applying the triangle inequality multiple times,
$$\Big\|\omega - \bigotimes_x \rho_x^{\otimes n[p_x - \delta]}\Big\|_1 \le |\mathcal{X}|\epsilon. \tag{13}$$
Given $x^n \in T^n_{X,\delta}$, abbreviate $n_x = N(x|x^n)$ and $n'_x = n_x - n[p_x - \delta]$. Now transform the above code into the “padded” $(n, |\mathcal{X}|\epsilon)$ quantum code obtained by inserting $\rho_x^{\otimes n'_x}$ after each $\omega_x$; its average density operator $\omega'$ obeys
$$\Big\|\omega' - \bigotimes_x \rho_x^{\otimes n_x}\Big\|_1 \le |\mathcal{X}|\epsilon. \tag{14}$$
The new rate $R$ is bounded by
$$R = \sum_x R_x [p_x - \delta] \ge \sum_x p_x I_c(\rho_x, \mathcal{N}) - \delta(1 + |\mathcal{X}| \log \dim \mathcal{H}_{A'}).$$
Finally, as $\bigotimes_x \rho_x^{\otimes n_x}$ and $\rho_{x^n}$ are related by a permutation of the channel input Hilbert spaces and the channel $\mathcal{N}^{\otimes n}$ is invariant under such permutations, there exists an $(n, |\mathcal{X}|\epsilon)$ code of the same rate $R$ and average code density operator $\omega_{x^n}$ such that $\|\omega_{x^n} - \rho_{x^n}\|_1 \le |\mathcal{X}|\epsilon$.

On the classical side, we shall need the Holevo-Schumacher-Westmoreland (HSW) theorem [16, 23], or rather its “typical codeword” version [8]. Consider the restriction of $\sigma^{XAB}$ (4) to $XB$:
$$\sigma^{XB} = \sum_{x \in \mathcal{X}} p_x |x\rangle\langle x|^X \otimes \mathcal{N}(\phi_x^{A'})^B.$$
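Proposition 6 (HSW Theorem). For any $\epsilon, \delta > 0$, define $r = I(X; B) - c'\delta$, for some constant $c'$, and $\mu = 2^{nr}$. For all sufficiently large $n$, there exists a classical encoding map $f : [\mu] \to T^n_{X,\delta}$ and a decoding POVM $\Lambda = (\Lambda_m)_{m \in [\mu]}$, such that
$$\mathrm{Tr}\, \tau_m \Lambda_m \ge 1 - \epsilon, \quad \forall m \in [\mu],$$
where
$$\tau_m = \mathcal{N}^{\otimes n}(\phi^{A'^n}_{f(m)}) \quad \text{and} \quad \phi^{A'^n}_{x^n} = \bigotimes_{i=1}^{n} \phi^{A'}_{x_i}.$$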
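The δ-typicality condition that defines $T^n_{p,\delta}$, the set in which the codewords $f(m)$ of Proposition 6 live, is easy to test directly. The following is a minimal sketch of that membership test; the function name and the toy alphabet are illustrative choices, not part of the paper.

```python
from collections import Counter

def is_typical(seq, p, delta):
    """Check |N(x|x^n) - n p_x| <= delta * n for every symbol x of the
    alphabet, where p maps each symbol to its probability."""
    n = len(seq)
    counts = Counter(seq)
    if any(symbol not in p for symbol in counts):
        return False  # the sequence uses a symbol outside the alphabet
    return all(abs(counts.get(x, 0) - n * p[x]) <= delta * n for x in p)

p = {"a": 0.5, "b": 0.5}
print(is_typical("aabb", p, 0.1))  # counts match n * p_x exactly
print(is_typical("aaab", p, 0.1))  # 'a' occurs too often for delta = 0.1
```

For long sequences drawn independently from $p$, the law of large numbers makes this test pass with high probability, which is why restricting codewords to the typical set costs nothing in rate.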
Proposition 6 says that Bob may reliably distinguish among $\mu$ states of the form $\mathcal{N}^{\otimes n}(\phi^{A'^n}_{x^n})$, with $x^n \in T^n_{X,\delta}$. The idea behind the proof of the direct coding theorem is for Alice to use a different quantum code depending on the classical message to be sent. Bob first decodes the classical message (while causing almost no disturbance to the quantum system) by taking advantage of the distinguishability of the channel outputs for the different codes. Furthermore, the same information tells him which quantum decoding to perform! Thus, the classical information has been “piggy-backed” on top of the quantum information.

Proof of Theorem 1 (Coding for Scenario Ib). Recall that in Scenario Ib Alice is transmitting half of the maximally entangled state $|\Phi\rangle$ through the channel. Define $\mu$, $f$, $\tau_m$ and $\Lambda$ as in Proposition 6. For now we shall assume Alice and Bob have access to a common source of randomness with distribution $(P_\alpha)$. For each $m$ define a $(\{p_x, \phi_x^{A'}\}, f(m))$-type $(n, \epsilon)$ random quantum code of rate $R = I(A\rangle BX) - c\delta$ by the encoding and decoding operators $((\mathcal{E}^\alpha_m, \mathcal{D}'^{\alpha}_m))_\alpha$. By Proposition 5 and monotonicity of trace distance [20] we have, for all $m$ and sufficiently large $n$,
$$\Big\|\sum_\alpha P_\alpha \tau'^{\alpha}_m - \tau_m\Big\|_1 \le \epsilon,$$
where $\tau'^{\alpha}_m = (\mathcal{N}^{\otimes n} \circ \mathcal{E}^\alpha_m)(\pi)$. By Proposition 6,
$$\sum_\alpha P_\alpha \mathrm{Tr}\, \tau'^{\alpha}_m \Lambda_m \ge 1 - 2\epsilon. \tag{15}$$
For a specific value of $\alpha$, the encoding map for our (classical, quantum) code is given by $(\mathcal{E}^\alpha_m)_{m \in [\mu]}$. The decoding instrument $\mathbf{D}^\alpha$ is given by
$$\mathcal{D}^\alpha_m : \rho \mapsto \mathcal{D}'^{\alpha}_m\big(\sqrt{\Lambda_m}\, \rho\, \sqrt{\Lambda_m}\big).$$
As usual, $\mathcal{D}^\alpha = \sum_m \mathcal{D}^\alpha_m$ denotes the induced quantum decoding operation. By (15), for all $m$,
$$\sum_\alpha P_\alpha\, p^{m,\alpha}_{\mathrm{err}} \le 2\epsilon, \tag{16}$$
where $p^{m,\alpha}_{\mathrm{err}} = 1 - \mathrm{Tr}\, \mathcal{D}^\alpha_m(\tau'^{\alpha}_m)$. Defining an extension of $\tau'^{\alpha}_m$,
$$\xi^\alpha_m = [\mathbb{1} \otimes (\mathcal{N}^{\otimes n} \circ \mathcal{E}^\alpha_m)](|\Phi\rangle\langle\Phi|),$$
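it follows from (15) that
$$\sum_\alpha P_\alpha \mathrm{Tr}\, \xi^\alpha_m (\mathbb{1} \otimes \Lambda_m) \ge 1 - 2\epsilon.$$
Invoking the gentle measurement lemma [29] and the concavity of the square root function,
$$\sum_\alpha P_\alpha \big\|(\mathbb{1} \otimes \sqrt{\Lambda_m})\, \xi^\alpha_m\, (\mathbb{1} \otimes \sqrt{\Lambda_m}) - \xi^\alpha_m\big\|_1 \le 4\sqrt{\epsilon},$$
which by the monotonicity of trace distance [20] gives
$$\sum_\alpha P_\alpha \big\|(\mathbb{1} \otimes \mathcal{D}^\alpha_m)(\xi^\alpha_m) - (\mathbb{1} \otimes \mathcal{D}'^{\alpha}_m)(\xi^\alpha_m)\big\|_1 \le 4\sqrt{\epsilon}.$$
On the other hand,
$$\big\|(\mathbb{1} \otimes \mathcal{D}^\alpha)(\xi^\alpha_m) - (\mathbb{1} \otimes \mathcal{D}^\alpha_m)(\xi^\alpha_m)\big\|_1 \le \sum_{m' \neq m} \big\|(\mathbb{1} \otimes \mathcal{D}^\alpha_{m'})(\xi^\alpha_m)\big\|_1 \le 2\epsilon.$$
Since, for all $m, \alpha$,
$$F((\mathbb{1} \otimes \mathcal{D}'^{\alpha}_m)(\xi^\alpha_m), \Phi) \ge 1 - \epsilon,$$
putting everything together gives, for all $m$,
$$\sum_\alpha P_\alpha\, P^{m,\alpha}_{\mathrm{err}} \le 3\epsilon + 4\sqrt{\epsilon}, \tag{17}$$
where $P^{m,\alpha}_{\mathrm{err}} = 1 - F((\mathbb{1} \otimes \mathcal{D}^\alpha)(\xi^\alpha_m), \Phi)$.

At this point our code relies on Alice and Bob having access to the common random index $\alpha$. To prove the theorem it remains to “derandomize” the code, i.e. show that $p^{m,\alpha}_{\mathrm{err}}$ and $P^{m,\alpha}_{\mathrm{err}}$ are small for a particular value of $\alpha$, and for $m$ in a sufficiently large subset of $[\mu]$. By (16) and (17),
$$\sum_\alpha P_\alpha \frac{1}{\mu} \sum_m \big(p^{m,\alpha}_{\mathrm{err}} + P^{m,\alpha}_{\mathrm{err}}\big) \le 5\epsilon + 4\sqrt{\epsilon}.$$
There exists a particular $\alpha$ for which
$$\frac{1}{\mu} \sum_m \big(p^{m,\alpha}_{\mathrm{err}} + P^{m,\alpha}_{\mathrm{err}}\big) \le 5\epsilon + 4\sqrt{\epsilon}.$$
Fixing $\alpha$, expurgate the worst half of the codewords, i.e. those $m$ with the highest value of $p^{m,\alpha}_{\mathrm{err}} + P^{m,\alpha}_{\mathrm{err}}$. Now we have a code with both $p^{m,\alpha}_{\mathrm{err}}$ and $P^{m,\alpha}_{\mathrm{err}}$ bounded from above by $10\epsilon + 8\sqrt{\epsilon}$ for all remaining $m$, while the classical rate has only decreased by $\frac{1}{n}$. This concludes the proof.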
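The expurgation step at the end of the proof is an instance of Markov's inequality: if the average error over all messages is at most $a$, then after discarding the worse half every surviving message has error at most $2a$. The sketch below illustrates this with made-up error values; the function name is hypothetical.

```python
def expurgate_worst_half(errors):
    """Keep the half of the messages with the smallest error.

    errors maps each message m to a nonnegative number, for example the
    sum of its classical and quantum error quantities."""
    ranked = sorted(errors, key=errors.get)
    keep = ranked[: (len(ranked) + 1) // 2]
    return {m: errors[m] for m in keep}

# Made-up per-message errors for a code with four messages.
errors = {1: 0.0, 2: 0.1, 3: 0.4, 4: 0.02}
average = sum(errors.values()) / len(errors)
kept = expurgate_worst_half(errors)
print(sorted(kept))                          # surviving messages
print(max(kept.values()) <= 2 * average)     # worst survivor vs. twice the mean
```

Halving the number of messages replaces $\mu$ by $\mu/2$, so $r = \frac{1}{n}\log\mu$ drops by exactly $\frac{1}{n}$, as stated in the proof.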
4. Remarks on Related Problems

The first remark we make concerns replacing the classical–quantum dichotomy with the cryptographically relevant public–private one. In [8] quantum codes were built based on private information transmission ones. The purpose of the latter is to send classical information about which the potential eavesdropper (to whom the “environment” of the channel is granted) cannot learn anything. This should be contrasted with HSW codes, which may be viewed as transmitting public information. One may now consider the problem of finding the simultaneous (public, private) capacity of $\mathcal{N}$. The answer follows in a straightforward manner from the methods of [8] and those used in proving Theorem 1. Viewing the channel $\mathcal{N}$ as being embedded in an isometry $U_{\mathcal{N}}$ with an enlarged target Hilbert space, $U_{\mathcal{N}} : \mathcal{H}_{A'} \to \mathcal{H}_B \otimes \mathcal{H}_E$ ($\mathcal{H}_E$ is now given to the eavesdropper), the simultaneous (public, private) capacity region is given by the following modification of Theorem 1:
• replace the state $\sigma^{XAB}$ by $\sigma^{XYB}$, obtained by sending the $A'$ part of
$$\sum_{xy} p_{xy} |x\rangle\langle x|^X \otimes |y\rangle\langle y|^Y \otimes \rho_{xy}^{A'}$$
through the channel,
• replace $I(A\rangle BX)$ by $I(Y; B|X) - I(Y; E|X)$.

The corresponding theorem for classical “wire-tap” channels was proven in [6].

Secondly, one may conceive of a “static” analogue of the problem considered here, where Alice and Bob share many copies of some (mixed) state $\rho^{AB}$ instead of being connected by a quantum channel. In [10] the problem of generating common randomness (perfectly correlated bits) from such a resource using limited forward (Alice to Bob) classical communication was considered. There the “distillable common randomness” (DCR) was defined to be the maximum common randomness obtainable in excess of the classical communication invested, and was advertised as an (asymmetric) measure of the classical correlations in $\rho^{AB}$. In [11] the problem of one-way entanglement distillation was solved, yielding a similarly asymmetric measure of quantum correlations in $\rho^{AB}$. The next step is to unify the two results in a trade-off between DCR and distillable entanglement, which could now be argued to quantify the total correlations in the state. Based on the results of [10, 11] and the present paper we put forth the following conjecture: The simultaneously distillable (classical, quantum) resources are given precisely by Theorem 1, where now the test states $\sigma^{XAB}$ are obtained by applying general instruments $\mathbf{D} = (\mathcal{D}_x)_{x \in \mathcal{X}}$ to the $A$ part of $\rho^{AB}$, rather than arising from a channel.

A sketch of the proof is as follows. The coding strategy involves double blocking. First use the protocol of [10] on a block of length $n$ to establish a good approximation to $X^n$ on Bob’s side using $\approx nH(X|B)$ bits of forward communication. This already gives us the desired DCR rate of $I(X; B)$. Now that Bob’s system includes $X^n$ they may use further blocking to distill entanglement at a rate of $I(A\rangle BX)$ [11]. The classical communication involved in this distillation has now turned into common randomness, effecting no net change in the DCR.
The converse theorem is left as an exercise. A somewhat more ambitious goal would be to include the classical communication cost in the trade-off, giving a 3-dimensional region! The final remark we make is that the “piggy-backing” idea used in the proof of Theorem 1 provides an alternative coding strategy to the one in [26] for the classical capacity of N with limited entanglement assistance, thus establishing an additional connection between the two problems. The original paper on the entanglement assisted capacity [4]
describes how to achieve the pair $(r, R) = (I(A; B)_\rho, -H(A)_\rho)$, for some $\rho^{AB} = (\mathbb{1}^A \otimes \mathcal{N})(\phi^{AA'})$ arising from the channel. Using a mixture of codes corresponding to different channel inputs $|\phi_x\rangle^{AA'}$, one trivially achieves $(r, R) = (I(A; B|X), -H(A|X))$ (with respect to $\omega^{XAB}$). As it turns out, Alice may use the distinguishability of the channel outputs of different such code mixtures to send extra classical information at a rate of $I(X; B)$. This gives the region (7). A detailed version of this argument will appear in [9].

5. Discussion

In conclusion, an information theoretical characterization of the simultaneous (classical, quantum) capacity region has been derived. The key idea was to use a different quantum code depending on the classical information to be sent, thus “piggy-backing” the classical information on top of the quantum one. The formula derived requires optimization over potentially arbitrarily many copies of the channel. We have shown that for a generalized dephasing channel a single copy suffices. We have also presented some ideas on cryptographic as well as static analogues of this problem.

We have already mentioned the open problem of including the classical communication cost in the trade-off for the static analogue. Another interesting extension of our work, which in fact served as our original motivation, is the following joint source-channel coding problem. In [14] the task of quantum compression with classical side information was considered (see also [18]). This is a “visible” source coding problem of a pure-state ensemble $E$. By storing partial information about the identity of the states (classically) at a rate $C$ it is possible to reduce the quantum storage rate to some value $Q(C)$. The joint source-channel coding variant of this problem is: Given $E$ and a channel $\mathcal{N}$, what is the rate at which Alice can send the quantum part of the ensemble over the channel?
One approach is to first separate the source into a classical and quantum part using the trade-off of [14] and then send them simultaneously through the channel using the trade-off of Theorem 1. This procedure is optimized over the ratio λ of the classical and quantum rates, which should coincide for the source and channel coding parts. There are, however, “well matched” source-channel pairs for which such a strategy is known to be suboptimal. The following example is due to Smolin [27]. The source is the equiprobable “trine” ensemble $(|0\rangle, |+\rangle, |-\rangle)$, where $|\pm\rangle = \frac{1}{2}|0\rangle \pm \frac{\sqrt{3}}{2}|1\rangle$, and the channel $\mathcal{N} : \mathcal{H}_3 \to \mathcal{H}_2$ has operation elements $\{|0\rangle\langle 0|, |+\rangle\langle 1|, |-\rangle\langle 2|\}$. The channel has no quantum capacity and a classical capacity of 1. Our strategy of separating the source and channel coding gives a source-channel capacity of $1/\log 3$ transmitted copies of the ensemble per channel use. On the other hand, by simply feeding the identity of the state to the channel one achieves a source-channel capacity of 1. Finding a solution for an arbitrary $(E, \mathcal{N})$ pair remains an open question.

Acknowledgements. We thank Charles Bennett, Aram Harrow and John Smolin for useful discussions. ID is partially supported by the NSA under the ARO grant numbers DAAG55-98-C-0041 and DAAD1901-1-06.
A. Proof of Concavity of $S^{(1)}(\mathcal{N})$

Here we provide a proof that the region $S^{(1)}(\mathcal{N})$ defined by (6) is concave. Let $\sigma_0^{XAB}$ and $\sigma_1^{XAB}$ be two different states arising from the channel. For some λ between 0 and 1, consider the state
$$\sigma^{UXAB} = \lambda\, |0\rangle\langle 0|^U \otimes \sigma_0^{XAB} + (1 - \lambda)\, |1\rangle\langle 1|^U \otimes \sigma_1^{XAB},$$
which also arises from the channel. Then
$$\lambda\, I(X; B)_{\sigma_0} + (1 - \lambda)\, I(X; B)_{\sigma_1} \le I(UX; B)_\sigma,$$
$$\lambda\, I(A\rangle BX)_{\sigma_0} + (1 - \lambda)\, I(A\rangle BX)_{\sigma_1} = I(A\rangle BUX)_\sigma,$$
from which the claim follows.

B. The Capacity Region for Dephasing Channels

In this section we define the notion of degradable channels and show that for such channels the quantum capacity is given by the single-letter formula
$$Q(\mathcal{N}) = Q^{(1)}(\mathcal{N}) := \max_{\rho^{AB}} I(A\rangle B), \tag{18}$$
where the maximization is over all states $\rho^{AB}$ arising from the channel $\mathcal{N}$. For the special case of dephasing channels we shall prove that the entire trade-off curve can be single-letterized.

Recall that a channel $\mathcal{N} : \mathcal{H}_{A'} \to \mathcal{H}_B$ can be defined by an isometric embedding $U_{\mathcal{N}} : \mathcal{H}_{A'} \to \mathcal{H}_B \otimes \mathcal{H}_E$, followed by a partial trace over the “environment” system $E$, so $\mathcal{N}(\rho) = \mathrm{Tr}_E\, U_{\mathcal{N}}(\rho)$ [28]. This further induces the complementary channel $\mathcal{N}^c : \mathcal{H}_{A'} \to \mathcal{H}_E$ defined by $\mathcal{N}^c(\rho) = \mathrm{Tr}_B\, U_{\mathcal{N}}(\rho)$. We call a channel $\mathcal{N}$ degradable when it may be degraded to its complementary channel $\mathcal{N}^c$, i.e. when there exists a map $\mathcal{T} : \mathcal{H}_B \to \mathcal{H}_E$ so that $\mathcal{N}^c = \mathcal{T} \circ \mathcal{N}$.

To see that $Q(\mathcal{N}) = Q^{(1)}(\mathcal{N})$ for degradable channels, note that Bob’s output system $B$ may be mapped by a fixed isometry onto a composite system $B'E'$ such that the channels from $A'$ to $E$ and to $E'$ are the same. Thus, for any state arising from the channel,
$$I(A\rangle B) = H(B) - H(E) = H(B'E') - H(E) = H(B'E') - H(E') = H(B'|E').$$
We can then use the inequality [20]
$$H(B'_1 B'_2 | E'_1 E'_2) \le H(B'_1 | E'_1) + H(B'_2 | E'_2)$$
to prove that single-letter maximization already achieves $Q(\mathcal{N})$.

A subclass of degradable channels of particular interest are generalized dephasing channels. The latter are defined on some $d$-dimensional Hilbert space with a preferred orthonormal basis $\{|i\rangle\}$, such that all states belonging to this basis are transmitted without error, but pure superpositions of these basis states may become mixed. This implies that if $\mathcal{N}$ is a dephasing channel then its isometric embedding $U_{\mathcal{N}}$ obeys
$$U_{\mathcal{N}} |i\rangle^{A'} = |i\rangle^B |\phi_i\rangle^E,$$
where the $|\phi_i\rangle$ are generally not mutually orthogonal. When the $|\phi_i\rangle$ are mutually orthogonal, $\mathcal N$ is the completely dephasing channel $\Delta_d$:
$$\Delta_d(\rho) = \sum_{i=1}^{d} |i\rangle\langle i|\, \rho\, |i\rangle\langle i|.$$
It is clear from the above that any dephasing channel $\mathcal N$ obeys
• $\Delta_d \circ \mathcal N = \mathcal N \circ \Delta_d = \Delta_d$
• $\mathcal N^c \circ \Delta_d = \mathcal N^c$.
Every dephasing channel is degradable, since $\mathcal N$ may be degraded to $\Delta_d$ which may be further degraded to $\mathcal N^c$. In fact, the map $T$ can be taken to be $\mathcal N^c$. Therefore, $Q(\mathcal N) = Q^{(1)}(\mathcal N)$.

In what follows, the special properties of dephasing channels will allow us to prove an even stronger statement: that the outer boundary of $S(\mathcal N)$ may be expressed as a single-letter formula. Consider some state $\sigma^{XABE}$ arising from the channel. Bob may degrade his channel further by replacing his system B by B Y , where $Y$ now contains the completely dephased version of $B$ (this is why we label it as a classical system). Set $\lambda \ge 1$ and define
$$f_\lambda(\mathcal N) = \max_{\sigma^{XYE}} \bigl[ H(Y) + (\lambda - 1)\, H(Y|X) - \lambda\, H(E|X) \bigr],$$
where the maximization is over all $\sigma^{XYE}$ arising from the channel ($\sigma^{XYE}$ is implicit in the entropies). We shall make use of the following lemma.

Lemma 7. For two general dephasing channels $\mathcal N_1$ and $\mathcal N_2$,
$$f_\lambda(\mathcal N_1 \otimes \mathcal N_2) = f_\lambda(\mathcal N_1) + f_\lambda(\mathcal N_2).$$

Proof. The “≥” direction follows from the fact that the input ensemble for $\mathcal N_1 \otimes \mathcal N_2$ may be chosen to be a tensor product of the ones maximizing $f_\lambda(\mathcal N_1)$ and $f_\lambda(\mathcal N_2)$. To show the opposite inequality, in what follows let us refer to the state $\sigma^{XY_1Y_2E_1E_2}$ that maximizes $f_\lambda(\mathcal N_1 \otimes \mathcal N_2)$. Observe that
$$H(Y_1Y_2) = H(Y_1) + H(Y_2|Y_1),$$
$$H(Y_1Y_2|X) = H(Y_1|X) + H(Y_2|Y_1X),$$
and
$$H(E_1E_2|X) = H(E_1|X) + H(E_2|E_1X) \ge H(E_1|X) + H(E_2|Y_1X),$$
the latter since $E_1$ contains a degraded version of $Y_1$ for all values of $X$, so that conditioning on $Y_1$ instead of $E_1$ cannot increase the entropy. Hence
$$\begin{aligned} f_\lambda(\mathcal N_1 \otimes \mathcal N_2) &= H(Y_1Y_2) + (\lambda-1)\, H(Y_1Y_2|X) - \lambda\, H(E_1E_2|X) \\ &\le H(Y_1) + (\lambda-1)\, H(Y_1|X) - \lambda\, H(E_1|X) \\ &\quad + H(Y_2|Y_1) + (\lambda-1)\, H(Y_2|XY_1) - \lambda\, H(E_2|XY_1) \\ &\le f_\lambda(\mathcal N_1) + f_\lambda(\mathcal N_2). \end{aligned}$$
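The two identities that every dephasing channel obeys can be checked numerically. The following is a minimal sketch, assuming the channel is specified only by the environment states $|\phi_i\rangle$ of its isometric embedding, so that $\mathcal N(\rho)_{ij} = \rho_{ij}\langle\phi_j|\phi_i\rangle$ and $\mathcal N^c(\rho) = \sum_i \rho_{ii}|\phi_i\rangle\langle\phi_i|$. All function names are hypothetical and chosen for illustration; matrices are plain nested lists.

```python
import random


def inner(u, v):
    """Inner product <u|v> of two vectors given as lists of complex numbers."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def random_unit_vector(dim, rng):
    """Hypothetical helper: a random normalised environment state |phi_i>."""
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)]
    norm = abs(inner(v, v)) ** 0.5
    return [x / norm for x in v]


def dephasing_channel(rho, env):
    """N(rho)_{ij} = rho_{ij} <phi_j|phi_i>: embed with U|i> = |i>|phi_i>, trace out E."""
    d = len(rho)
    return [[rho[i][j] * inner(env[j], env[i]) for j in range(d)] for i in range(d)]


def complementary_channel(rho, env):
    """N^c(rho) = sum_i rho_{ii} |phi_i><phi_i|: same embedding, trace out B."""
    d, e = len(rho), len(env[0])
    return [[sum(rho[i][i] * env[i][a] * env[i][b].conjugate() for i in range(d))
             for b in range(e)] for a in range(e)]


def completely_dephase(rho):
    """Delta_d(rho): keep the diagonal in the preferred basis, drop the rest."""
    d = len(rho)
    return [[rho[i][j] if i == j else 0j for j in range(d)] for i in range(d)]


def max_abs_diff(a, b):
    """Largest entrywise distance between two matrices of equal shape."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
```

Because all three maps are linear, the identities can be tested on an arbitrary complex matrix; it does not have to be a normalised density operator.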
We shall use Lagrange multipliers to calculate $S(\mathcal N)$. By Theorem 1, the quantity to be maximized is $I(X;B) + \lambda\, I(A\rangle BX)$, over all states $\sigma$ that arise from $\mathcal N^{\otimes n}$. Operationally it is clear that we should restrict attention to $\lambda \ge 1$, since $-\lambda$ is the slope of the boundary of $S(\mathcal N)$ and a qubit channel may always be used to send classical bits at a unit rate. For any such state we have
$$\begin{aligned} I(X;B) + \lambda\, I(A\rangle BX) &= H(B) + (\lambda-1)\, H(B|X) - \lambda\, H(E|X) \\ &\le H(Y) + (\lambda-1)\, H(Y|X) - \lambda\, H(E|X) \\ &\le f_\lambda(\mathcal N^{\otimes n}) \\ &\le n\, f_\lambda(\mathcal N). \end{aligned}$$
The first inequality follows from the fact that complete dephasing increases entropy, and is saturated by completely dephasing the input to $\mathcal N^{\otimes n}$ (recall that $\mathcal N$ commutes with $\Delta_d$). The third inequality is by Lemma 7. Thus, for dephasing channels, $S(\mathcal N) = S^{(1)}(\mathcal N)$, which makes the optimization problem tractable.

We now turn to the particular case of the qubit $p$-dephasing channel $\mathcal N = (1-p)\, \mathbb 1_2 + p\, \Delta_2$. It is easily checked that the outer boundary of $S(\mathcal N)$ is achieved by the $\mu$-parametrized family of ensembles, $\mu \in [0, 1/2]$, consisting of $\mathrm{diag}(\mu, 1-\mu)$ and $\mathrm{diag}(1-\mu, \mu)$ chosen with equal probabilities. The trade-off curve is given by
$$(r, R) = \Bigl( 1 - h_2(\mu),\; h_2(\mu) - h_2\bigl( \tfrac12 + \tfrac12 \sqrt{1 - 16\, p(1-p)\, \mu(1-\mu)} \bigr) \Bigr),$$
where $h_2(\mu) = -\mu \log_2 \mu - (1-\mu) \log_2 (1-\mu)$ is the binary entropy function. Figure 2 shows this curve for $p = 0.2$.

C. Proof of the Cardinality Bound

Here we justify the condition on the cardinality of $\mathcal X$ in the statement of Theorem 1. Carathéodory’s theorem states that in a $t$-dimensional Euclidean space, each point of a connected compact set $K$ can be represented as a convex combination of at most $t + 1$ points in $K$. Let $F(\mathcal H)$ be the family of all density operators on the Hilbert space $\mathcal H_A$ of dimension $d$. Let $K$ be the image of $F(\mathcal H)$ under some continuous mapping $f$ defined by $f(\rho) = (f_1(\rho), \dots, f_t(\rho))$. As $F(\mathcal H)$ is connected and compact, so is $K$. Then for any probability measure $\mu$ on the algebra of density operators of $\mathcal H_A$, Carathéodory’s theorem implies the existence of some finite ensemble $\{p_x, \rho_x : x \in \mathcal X\}$, $|\mathcal X| = t + 1$, such that
$$\int_{F(\mathcal H)} \mu(d\rho)\, f_j(\rho) = \sum_{x \in \mathcal X} p_x\, f_j(\rho_x), \qquad \forall j \in [t].$$

Turning to our problem, the quantities $I(X;B)$ and $I(A\rangle BX)$ depend solely on the ensemble $\mathcal E = \{p_x, \rho_x\}$, where $\rho_x := \phi_x^A$, and the channel $\mathcal N$. Moreover, they only depend on the vector $\sum_x p_x f(\rho_x)$, where the vector-valued function $f$ is defined so that $f_1, \dots, f_{d^2-1}$ are the $d^2 - 1$ degrees of freedom of $\rho$ (linear in $\rho$), $f_{d^2}(\rho) = H(\mathcal N(\rho))$ and $f_{d^2+1}(\rho) = I_c(\rho, \mathcal N)$. Suppose that a particular point in $S^{(1)}(\mathcal N)$ corresponds to some ensemble $\mathcal E = \{\mu(d\rho), \rho\}$. The above implies that the same point is achievable by a finite ensemble with at most $d^2 + 2$ elements.
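The trade-off curve $(r, R)$ of the qubit $p$-dephasing channel stated at the end of Appendix B is simple to evaluate numerically. The sketch below only transcribes that closed formula; the function names are hypothetical, and the clamp inside the square root is an added guard against rounding, not part of the formula.

```python
import math


def h2(x):
    """Binary entropy in bits, with the convention h2(0) = h2(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def tradeoff_point(mu, p):
    """Point (r, R) on the trade-off curve for mu in [0, 1/2] and dephasing parameter p."""
    r = 1.0 - h2(mu)
    # Clamp at zero so rounding errors never produce a negative radicand.
    root = math.sqrt(max(0.0, 1.0 - 16.0 * p * (1.0 - p) * mu * (1.0 - mu)))
    big_r = h2(mu) - h2(0.5 + 0.5 * root)
    return r, big_r


def tradeoff_curve(p, steps=50):
    """Sample the curve on an evenly spaced grid of mu values in [0, 1/2]."""
    return [tradeoff_point(0.5 * i / steps, p) for i in range(steps + 1)]
```

At $\mu = 0$ the formula gives the corner $(1, 0)$, and at $\mu = 1/2$ it gives $(0, 1 - h_2(p))$ for $p \le 1/2$; for example, `tradeoff_curve(0.2)` samples the curve for the value of $p$ used in Figure 2.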
References
1. Barnum, H., Knill, E., Nielsen, M.A.: On Quantum Fidelities and Channel Capacities. IEEE Trans. Inf. Theory 46, 1317–1329 (2000)
2. Barnum, H., Nielsen, M.A., Schumacher, B.: Information transmission through a noisy quantum channel. Phys. Rev. A 57, 4153 (1998)
3. Bennett, C.H., DiVincenzo, D.P., Smolin, J.A.: Capacities of quantum erasure channels. Phys. Rev. Lett. 78, 3217–3220 (1997)
4. Bennett, C.H., Shor, P.W., Smolin, J.A., Thapliyal, A.V.: Entanglement-assisted capacity of a quantum channel and the reverse Shannon theorem. IEEE Trans. Inf. Theory 48, 2637–2655 (2002)
5. Cover, T.M., Thomas, J.A.: Elements of Information Theory. New York: Wiley and Sons, 1991
6. Csiszár, I., Körner, J.: Broadcast channels with confidential messages. IEEE Trans. Inf. Theory 24, 339–348 (1978)
7. Davies, E.B., Lewis, J.T.: An operational approach to quantum probability. Commun. Math. Phys. 17, 239–260 (1970)
8. Devetak, I.: The private classical capacity and quantum capacity of a quantum channel. IEEE Trans. Inf. Theory 51, 44–55 (2005)
9. Devetak, I., Harrow, A.W., Winter, A.: Optimal trade-offs for a family of quantum protocols. In preparation
10. Devetak, I., Winter, A.: Distilling common randomness from bipartite pure states. http://arxiv.org/abs/quant-ph/0304196, 2003
11. Devetak, I., Winter, A.: Distillation of secret key and entanglement from quantum states. Proc. Roy. Soc. A 461, 207–235 (2005)
12. Devetak, I., Winter, A.: Relating quantum privacy and quantum coherence: an operational approach. Phys. Rev. Lett. 93, 080501–080504 (2004)
13. DiVincenzo, D.P., Shor, P.W., Smolin, J.A.: Quantum-channel capacity of very noisy channels. Phys. Rev. A 57, 830–839 (1998)
14. Hayden, P., Jozsa, R., Winter, A.: Trading quantum for classical resources in quantum data compression. J. Math. Phys. 43, 4404–4444 (2002)
15. Holevo, A.S.: Bounds for the quantity of information transmitted by a quantum channel. Probl. Inf. Transm. 9, 177–183 (1973)
16. Holevo, A.S.: The Capacity of the Quantum Channel with General Signal States. IEEE Trans. Inf. Theory 44, 269–273 (1998)
17. Horodecki, M., Lloyd, S.: Manuscript in progress
18. Kuperberg, G.: The capacity of hybrid quantum memory. IEEE Trans. Inf. Theory 49, 1465–1473 (2003)
19. Lloyd, S.: The capacity of a noisy quantum channel. Phys. Rev. A 55, 1613–1622 (1997)
20. Nielsen, M.A., Chuang, I.L.: Quantum Information and Quantum Computation. Cambridge: Cambridge University Press, 2001
21. Shannon, C.E.: A mathematical theory of communication. Bell System Tech. J. 27, 379–623 (1948)
22. Schumacher, B., Nielsen, M.A.: Quantum data processing and error correction. Phys. Rev. A 54, 2629 (1996)
23. Schumacher, B., Westmoreland, M.D.: Sending classical information via noisy quantum channels. Phys. Rev. A 56, 131–138 (1997)
24. Schumacher, B., Westmoreland, M.D.: Relative entropy in quantum information theory. In: Quantum Information and Quantum Computation: A Millennium Volume, AMS Contemporary Mathematics Vol. 305 (S.J. Lomonaco, Jr. and H.E. Brandt, eds., AMS Press, Providence, Rhode Island, 2002)
25. Shor, P.W.: The quantum channel capacity and coherent information. Lecture notes, MSRI Workshop on Quantum Computation, 2002. Available at http://www.msri.org/publications/ln/msri/2002/quantumcrypto/shor/1/, 2002
26. Shor, P.W.: The classical capacity achievable by a quantum channel assisted by limited entanglement. http://arxiv.org/abs/quant-ph/0402129, 2004
27. Smolin, J.A.: Private communication, 2003
28. Stinespring, W.F.: Proc. Amer. Math. Soc. 6, 211 (1955)
29. Winter, A.: Coding theorem and strong converse for quantum channels. IEEE Trans. Inf. Theory 45, 2481–2485 (1999)

Communicated by M.B. Ruskai
Commun. Math. Phys. 256, 305–374 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1285-2
Communications in
Mathematical Physics
Renormalisation of $\phi^4$-Theory on Noncommutative $\mathbb R^4$ in the Matrix Base
Harald Grosse¹, Raimar Wulkenhaar²
¹ Institut für Theoretische Physik, Universität Wien, Boltzmanngasse 5, 1090 Wien, Austria. E-mail: [email protected]
² Max-Planck-Institut für Mathematik in den Naturwissenschaften, Inselstraße 22–26, 04103 Leipzig, Germany. E-mail: [email protected]
Received: 22 January 2004 / Accepted: 25 May 2004 Published online: 18 February 2005 – © Springer-Verlag 2005
Abstract: We prove that the real four-dimensional Euclidean noncommutative $\phi^4$-model is renormalisable to all orders in perturbation theory. Compared with the commutative case, the bare action of relevant and marginal couplings necessarily contains an additional term: a harmonic oscillator potential for the free scalar field action. This entails a modified dispersion relation for the free theory, which becomes important at large distances (UV/IR-entanglement). The renormalisation proof relies on flow equations for the expansion coefficients of the effective action with respect to scalar fields written in the matrix base of the noncommutative $\mathbb R^4$. The renormalisation flow depends on the topology of ribbon graphs and on the asymptotic and local behaviour of the propagator governed by orthogonal Meixner polynomials.
1. Introduction

Noncommutative $\phi^4$-theory is widely believed to be not renormalisable in four dimensions. To underline the belief one usually draws the non-planar one-loop two-point function resulting from the noncommutative $\phi^4$-action. The corresponding integral is finite, but behaves $\sim (\theta p)^{-2}$ for small momenta $p$ of the two-point function. The finiteness is important, because the $p$-dependence of the non-planar graph has no counterpart in the original $\phi^4$-action, and thus (if divergent) cannot be absorbed by multiplicative renormalisation. However, if we insert the non-planar graph declared as finite as a subgraph into a bigger graph, one easily builds examples (with an arbitrary number of external legs) where the $\sim p^{-2}$ behaviour leads to non-integrable integrals at small inner momenta. This is the so-called UV/IR-mixing problem [1].

The heuristic argumentation can be made exact: Using a sophisticated mathematical machinery, Chepelev and Roiban have proven a power-counting theorem [2, 3] which relates the power-counting degree of divergence to the topology of ribbon graphs. The rough summary of the power-counting theorem is that noncommutative field theories
with quadratic divergences become meaningless beyond a certain loop order¹. For example, in the real noncommutative $\phi^4$-model there exist three-loop graphs which cannot be integrated.

In this paper we prove that the real noncommutative $\phi^4$-model is renormalisable to all orders. At first sight, this seems to be a grave contradiction. However, we do not say that the famous papers [1–3] are wrong. In fact, we reconfirm their results; it is only that we take their message seriously. The message of the UV/IR-entanglement is that: Noncommutativity relevant at very short distances modifies the physics of the model at very large distances. At large distances we have approximately a free theory. Thus, we have to alter the free theory, whereas the quasi-local $\phi^4$-interaction could hopefully be left unchanged.

But how to modify the free action? We found the distinguished modification in the course of a long refinement process of our method. But knowing the result, it can be made plausible. It was pointed out by Langmann and Szabo [4] that the $\star$-product interaction is invariant under a duality transformation between positions and momenta,
$$\hat\phi(p) \leftrightarrow \pi^2 \sqrt{|\det\theta|}\, \phi(x), \qquad p_\mu \leftrightarrow \tilde x_\mu := 2(\theta^{-1})_{\mu\nu} x^\nu, \tag{1.1}$$
where $\hat\phi(p_a) = \int d^4x_a\, e^{(-1)^a i p_{a,\mu} x_a^\mu}\, \phi(x_a)$. Using the definition of the $\star$-product given in (2.1) and the reality $\phi(x) = \overline{\phi(x)}$, one obtains
$$\begin{aligned} S_{\rm int} &= \int d^4x\, \frac{\lambda}{4!}\, (\phi\star\phi\star\phi\star\phi)(x) \\ &= \int \Bigl( \prod_{a=1}^{4} d^4x_a \Bigr)\, \phi(x_1)\phi(x_2)\phi(x_3)\phi(x_4)\, V(x_1, x_2, x_3, x_4) \\ &= \int \Bigl( \prod_{a=1}^{4} \frac{d^4p_a}{(2\pi)^4} \Bigr)\, \hat\phi(p_1)\hat\phi(p_2)\hat\phi(p_3)\hat\phi(p_4)\, \hat V(p_1, p_2, p_3, p_4), \end{aligned} \tag{1.2}$$
with
$$\begin{aligned} \hat V(p_1, p_2, p_3, p_4) &= \frac{\lambda}{4!}\, (2\pi)^4\, \delta^4(p_1 - p_2 + p_3 - p_4)\, \cos\Bigl( \tfrac12 \theta^{\mu\nu} (p_{1,\mu} p_{2,\nu} + p_{3,\mu} p_{4,\nu}) \Bigr), \\ V(x_1, x_2, x_3, x_4) &= \frac{\lambda}{4!}\, \frac{1}{\pi^4 \det\theta}\, \delta^4(x_1 - x_2 + x_3 - x_4)\, \cos\Bigl( 2(\theta^{-1})_{\mu\nu} (x_1^\mu x_2^\nu + x_3^\mu x_4^\nu) \Bigr). \end{aligned} \tag{1.3}$$

Passing to quantum field theory, $V(x_1, x_2, x_3, x_4)$ and $\hat V(p_1, p_2, p_3, p_4)$ become the Feynman rules for the vertices in position space and momentum space, respectively. Multiplicative renormalisability of the four-point function requires that its divergent part has to be self-dual, too. This requires an appropriate Feynman rule for the propagator. Building now two-point functions with these Feynman rules, it is very plausible that if the two-point function is divergent in momentum space, also the duality-transformed two-point function will be divergent. That divergence has to be absorbed in a multiplicative renormalisation of the initial action.

¹ There exist proposals to resum the perturbation series, see [3], but there is no complete proof that this is consistent to all orders.
However, the usual free scalar field action is not invariant under that duality transformation and therefore cannot absorb the expected divergence in the two-point function. In order to cure this problem we have to extend the free scalar field action by a harmonic oscillator potential:
$$S_{\rm free} = \int d^4x \Bigl( \frac12 (\partial_\mu\phi) \star (\partial^\mu\phi) + \frac{\Omega^2}{2} (\tilde x_\mu\phi) \star (\tilde x^\mu\phi) + \frac{\mu_0^2}{2}\, \phi\star\phi \Bigr)(x). \tag{1.4}$$
The action $S_{\rm free} + S_{\rm int}$ according to (1.4) and (1.2) is now preserved by duality transformation, up to rescaling. From the previous considerations we can expect that also the renormalisation flow preserves that action $S_{\rm free} + S_{\rm int}$. We prove in this paper that this is indeed the case. Thus, the duality-covariance of the action $S_{\rm free} + S_{\rm int}$ implements precisely the UV/IR-entanglement.

Of course, we cannot treat the quantum field theory associated with the action (1.4) in momentum space. Fortunately, there is a matrix representation of the noncommutative $\mathbb R^D$, where the $\star$-product becomes a simple product of infinite matrices and where the duality between positions and momenta is manifest. The matrix representation plays an important rôle in the proof that the noncommutative $\mathbb R^D$ is a spectral triple [5]. It is also crucial for the exact solution of quantum field theories on noncommutative phase space [6–8].

Coincidentally, the matrix base is also required for another reason. In the traditional Feynman graph approach, the value of the integral associated to non-planar graphs is not unique, because one exchanges the order of integrations in integrals which are not absolutely convergent. To avoid this problem one should use a renormalisation scheme where the various limiting processes are better controlled. The preferred method is the use of flow equations. The idea goes back to Wilson [9]. It was then used by Polchinski [10] to give a very efficient renormalisability proof for commutative $\phi^4$-theory. Several improvements have been suggested in [11].
Applying Polchinski’s method to the noncommutative $\phi^4$-model there is, however, a serious problem in momentum space. We have to guarantee that planar graphs only appear in the distinguished interaction coefficients for which we fix the boundary condition at the renormalisation scale $\Lambda_R$. Non-planar graphs have phase factors which involve inner momenta. Polchinski’s method consists in taking norms of the interaction coefficients, and these norms ignore possible phase factors. Thus, we would find that boundary conditions for non-planar graphs at $\Lambda_R$ are required. Since there is an infinite number of different non-planar structures, the model is not renormalisable in this way. A more careful examination of the phase factors is also not possible, because the cut-off integrals prevent the Gaußian integration required for the parametric integral representation [2, 3].

As we show in this paper, the bare action $S_{\rm free} + S_{\rm int}$ according to (1.4) and (1.2) corresponds to a quantum field theory which is renormalisable to all orders. Together with the duality argument, this is now the conclusive indication that the usual noncommutative $\phi^4$-action (with $\Omega = 0$) has to be dismissed in favour of the duality-covariant action of (1.4) and (1.2).

Our proof is very technical. We do not claim that it is the most efficient one*. However, it was for us, for the time being, the only possible way. We encountered several “miracles” without which the proof would have failed. The first is that the propagator is complicated but numerically accessible. We had thus convinced ourselves that the propagator has such an asymptotic behaviour that all non-planar graphs and all graphs with $N > 4$ external legs are irrelevant according to our general power-counting theorem for dynamical matrix models [12]. However, this still leaves an infinite number of planar two- or four-point functions which would be relevant or marginal according to [12]. In the first versions of [12] we had, therefore, to propose some consistency relations in order to get a meaningful theory. Miraculously, all this is not necessary. We have further found numerically that the propagator has some universal locality properties suggesting that the infinite number of relevant / marginal planar two- or four-point functions can be decomposed into four relevant / marginal base interactions and an irrelevant remainder.

Of course, there must exist a reason for such a coincidence, and the reason is orthogonal polynomials. In our case, it means that the kinetic matrix corresponding to the free action (1.4) written in the matrix base of the noncommutative $\mathbb R^D$ is diagonalised by orthogonal Meixner polynomials [13].² Now, having a closed solution for the free theory in the preferred base of the interaction, the desired local and asymptotic behaviour of the propagator can be derived.

We stress, however, that some of the corresponding estimations of Sect. 3.4 are, so far, verified numerically only. There is no doubt that the estimations are correct, but for the purists we have to formulate our result as follows: The quantum field theory corresponding to the action (1.4) and (1.2) is renormalisable to all orders provided that the estimations given in Sect. 3.4 hold. Already this weaker result is considerable progress, because the elimination of the last possible doubt amounts to estimating a single integral. Meanwhile, this estimation will be performed by [17]. The method, further motivation and an outlook to constructive applications are also presented in [18].

Finally, let us recall that the noncommutative $\phi^4$-theory in two dimensions is different.

* Note added in proof. Meanwhile, a more efficient renormalisation proof based on a multi-scale analysis was given in [17].
We also need the harmonic oscillator potential of (1.4) in all intermediate steps of the renormalisation proof, but at the end it can be switched off with the removal of the cut-off [14]. This is in agreement with the common belief that the UV/IR-mixing problem can be cured in models with only logarithmic divergences.
1.1. Strategy of the proof. As the renormalisation proof is long and technical, we list here the most important steps and the main ideas. A flow chart of these steps is presented in Fig. 1. A more detailed introduction is given in [19].

The first step is to rewrite the $\phi^4$-action (1.4)+(1.2) in the harmonic oscillator base of the Moyal plane, see (2.5) and (2.6). The free theory is solved by the propagator (2.7), which we compute in Appendix A using Meixner polynomials in an essential way. The propagator is represented by a finite sum which enables a fast numerical evaluation. Unfortunately, we can offer analytic estimations only in a few special cases.

The propagator is so complicated that a direct calculation of Feynman graphs is not practicable. Therefore, we employ the renormalisation method based on flow equations [10, 11] which we have previously adapted to non-local (dynamical) matrix models [12]. The modification $K[\Lambda]$ of the weights of the matrix indices in the kinetic term is undone in the partition function by a careful adaptation of the effective action $L[\phi, \Lambda]$, which is described by the matrix Polchinski equation (3.1). For a modification given by a cut-off function $K[\Lambda]$, renormalisation of the model amounts to proving that the matrix Polchinski equation (3.5) admits a regular solution which depends on a finite number of initial data.

In a perturbative expansion, the matrix Polchinski equation is solved by ribbon graphs drawn on Riemann surfaces. The existence of a regular solution follows from the general power-counting theorem proven in [12] together with the numerical determination of the propagator asymptotics in Appendix C. However, the general proof involved an infinite number of initial conditions, which is physically not acceptable. Therefore, the challenge is to prove the reduction to a finite number of initial data for the renormalisation flow.

The answer is the integration procedure given in Definition 1, Sect. 3.2, which entails mixed boundary conditions for certain planar two- and four-point functions. The idea is to introduce four types of reference graphs with vanishing external indices and to split the integration of the Polchinski equation for the distinguished two- and four-point graphs

² In our renormalisation proof [14] of the two-dimensional noncommutative $\phi^4$-model we had originally termed these polynomials “deformed Laguerre polynomials”, which we had only constructed via their recursion relation. The closed formula was not known to us. Thus, we are especially grateful to Stefan Schraml who provided us first with [15], from which we got the information that we were using Meixner polynomials, and then with the encyclopaedia [16] of orthogonal polynomials, which was the key to complete the renormalisation proof.

[Figure 1: flow chart. Nodes: initial interaction (3.3); Polchinski equation (3.1), derived in [12]; integration procedure, Def. 1; numerical bounds, App. C; composite propagators, Sec. 3.3, App. B.1; general power-counting theorem for non-local matrix models, proven in [12]; power-counting behaviour of interaction coefficients, Prop. 2; $\Lambda_0$-dependence of interaction coefficients (4.4); differential equations for $\Lambda_0$-varied functions (4.17) and for auxiliary functions (4.16); power-counting behaviour of auxiliary functions, Prop. 3, and of $\Lambda_0$-varied functions, Prop. 4; propagator (2.7), derived in App. A.2; Convergence Theorem, Thm. 5.]
Fig. 1. Relations between the main steps of the proof. The central results are the power-counting behaviour of Proposition 2 and the convergence theorem (Theorem 5). Note that the numerical estimations for the propagator influence the entire chain of the proof
into an integration of the difference to the reference graphs and a different integration of the reference graphs themselves. The difference between original graph and reference graph is further reduced to differences of propagators, which we call “composite propagators”. See Sect. 3.3.

The proof of the power-counting estimations for the interaction coefficients (Proposition 2 in Sect. 3.5) requires the following extensions of the general case treated in [12]:
– We have to prove that graphs where the index jumps along the trajectory between incoming and outgoing indices are suppressed. This leaves 1PI planar four-point functions with constant index along the trajectory and 1PI planar two-point functions with in total at most two index jumps along the trajectories as the only graphs which are marginal or relevant.
– For these types of graphs we have to prove that the leading relevant/marginal contribution is captured by reference graphs with vanishing external indices, whereas the difference to the reference graphs is irrelevant. This is the discrete analogue of the BPHZ Taylor subtraction of the expansion coefficients to lowest order in the external momenta.

Thus, Proposition 2 provides bounds for the interaction coefficients for the effective action at a scale $\Lambda \in [\Lambda_R, \Lambda_0]$. Here, $\Lambda_R$ is the renormalisation scale where the four reference graphs are normalised, and $\Lambda_0$ is the initial scale for the integration which has to be sent to $\infty$ in order to scale away possible initial conditions for the irrelevant functions. The estimations of Proposition 2 are actually independent of $\Lambda_0$ so that the limit $\Lambda_0 \to \infty$ can be taken. This already ensures the renormalisation of the model. However, one would also like to know whether the interaction coefficients converge in the limit $\Lambda_0 \to \infty$ and if so, with which rate. That analysis is performed in Sect. 4 which culminates in Theorem 5, confirming convergence with a rate $\Lambda_0^{-2}$.

2. The Duality-Covariant Noncommutative $\phi^4$-Action in the Matrix Base

The noncommutative $\mathbb R^4$ is defined as the algebra $\mathbb R^4_\theta$ which as a vector space is given by the space $\mathcal S(\mathbb R^4)$ of (complex-valued) Schwartz class functions of rapid decay, equipped with the multiplication rule [20]
$$(a \star b)(x) = \int \frac{d^4k}{(2\pi)^4} \int d^4y\; a(x + \tfrac12 \theta{\cdot}k)\, b(x + y)\, e^{i k\cdot y}, \qquad (\theta{\cdot}k)^\mu = \theta^{\mu\nu} k_\nu, \quad k{\cdot}y = k_\mu y^\mu, \quad \theta^{\mu\nu} = -\theta^{\nu\mu}. \tag{2.1}$$
The entries $\theta^{\mu\nu}$ in (2.1) have the dimension of an area. We place ourselves into a coordinate system in which $\theta$ has the form
$$\theta_{\mu\nu} = \begin{pmatrix} 0 & \theta_1 & 0 & 0 \\ -\theta_1 & 0 & 0 & 0 \\ 0 & 0 & 0 & \theta_2 \\ 0 & 0 & -\theta_2 & 0 \end{pmatrix}. \tag{2.2}$$
We use an adapted base
$$b_{mn}(x) = f_{m^1 n^1}(x^1, x^2)\, f_{m^2 n^2}(x^3, x^4), \qquad m = \binom{m^1}{m^2} \in \mathbb N^2, \quad n = \binom{n^1}{n^2} \in \mathbb N^2, \tag{2.3}$$
where the base $f_{m^1 n^1}(x^1, x^2) \in \mathbb R^2_\theta$ is given in [14]. This base satisfies
$$(b_{mn} \star b_{kl})(x) = \delta_{nk}\, b_{ml}(x), \qquad \int d^4x\; b_{mn} = 4\pi^2 \theta_1 \theta_2\, \delta_{mn}. \tag{2.4}$$
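The relations (2.4) are exactly the multiplication and trace rules of matrix units, which is why the $\star$-product becomes a matrix product in this base. The following sketch illustrates this under two simplifying assumptions that are not in the text: each index is a single non-negative integer instead of a pair in $\mathbb N^2$, and the infinite matrices are truncated to a finite size. All names are hypothetical.

```python
import math


def matrix_unit(m, n, size):
    """Stand-in for b_mn: the size x size matrix with a single 1 at position (m, n)."""
    return [[1 if (i == m and j == n) else 0 for j in range(size)] for i in range(size)]


def star(a, c):
    """Star product of two expansions sum_mn a_mn b_mn, acting on coefficient matrices.

    By (2.4), b_mn * b_kl = delta_nk b_ml, so coefficients multiply as matrices.
    """
    size = len(a)
    return [[sum(a[i][k] * c[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def integral(a, theta1, theta2):
    """Integral of sum_mn a_mn b_mn over R^4: 4 pi^2 theta1 theta2 times the trace."""
    return 4.0 * math.pi ** 2 * theta1 * theta2 * sum(a[i][i] for i in range(len(a)))
```

In this picture the quartic interaction $\int d^4x\, (\phi\star\phi\star\phi\star\phi)(x)$ is, up to the factor $4\pi^2\theta_1\theta_2$, the trace of the fourth power of the coefficient matrix, i.e. `integral(star(star(phi, phi), star(phi, phi)), theta1, theta2)`.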
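More information about the noncommutative $\mathbb R^D$ can be found in [5, 20].

We are going to study a duality-covariant $\phi^4$-theory on $\mathbb R^4_\theta$. This means that we add a harmonic oscillator potential to the standard $\phi^4$-action, which breaks translation invariance but is required for renormalisation. Expanding the scalar field in the matrix base, $\phi(x) = \sum_{m,n\in\mathbb N^2} \phi_{mn}\, b_{mn}(x)$, we have
$$\begin{aligned} S[\phi] &= \int d^4x \Bigl( \frac12 g^{\mu\nu} \bigl( \partial_\mu\phi \star \partial_\nu\phi + 4\Omega^2 ((\theta^{-1})_{\mu\rho} x^\rho \phi) \star ((\theta^{-1})_{\nu\sigma} x^\sigma \phi) \bigr) + \frac12 \mu_0^2\, \phi\star\phi + \frac{\lambda}{4!}\, \phi\star\phi\star\phi\star\phi \Bigr)(x) \\ &= 4\pi^2 \theta_1 \theta_2 \sum_{m,n,k,l\in\mathbb N^2} \Bigl( \frac12 G_{mn;kl}\, \phi_{mn} \phi_{kl} + \frac{\lambda}{4!}\, \phi_{mn} \phi_{nk} \phi_{kl} \phi_{lm} \Bigr), \end{aligned} \tag{2.5}$$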
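where according to [14],
$$\begin{aligned} G_{mn;kl} &= \Bigl( \mu_0^2 + \frac{2}{\theta_1}(1+\Omega^2)(n^1+m^1+1) + \frac{2}{\theta_2}(1+\Omega^2)(n^2+m^2+1) \Bigr) \delta_{n^1k^1} \delta_{m^1l^1} \delta_{n^2k^2} \delta_{m^2l^2} \\ &\quad - \frac{2}{\theta_1}(1-\Omega^2) \Bigl( \sqrt{k^1 l^1}\, \delta_{n^1+1,k^1} \delta_{m^1+1,l^1} + \sqrt{m^1 n^1}\, \delta_{n^1-1,k^1} \delta_{m^1-1,l^1} \Bigr) \delta_{n^2k^2} \delta_{m^2l^2} \\ &\quad - \frac{2}{\theta_2}(1-\Omega^2) \Bigl( \sqrt{k^2 l^2}\, \delta_{n^2+1,k^2} \delta_{m^2+1,l^2} + \sqrt{m^2 n^2}\, \delta_{n^2-1,k^2} \delta_{m^2-1,l^2} \Bigr) \delta_{n^1k^1} \delta_{m^1l^1}. \end{aligned} \tag{2.6}$$
We need, in particular, the inverse of the kinetic matrix $G_{mn;kl}$, the propagator $\Delta_{nm;kl}$, which solves the partition function of the free theory ($\lambda = 0$) with respect to the preferred base of the interaction. We present the computation of the propagator in Appendix A. The result is
$$\begin{aligned} \Delta_{\substack{m^1 n^1 \\ m^2 n^2};\substack{k^1 l^1 \\ k^2 l^2}} &= \frac{\theta}{2(1+\Omega)^2}\, \delta_{m^1+k^1,\,n^1+l^1}\, \delta_{m^2+k^2,\,n^2+l^2} \sum_{v^1 = \frac{|m^1-l^1|}{2}}^{\frac{\min(m^1+l^1,\,n^1+k^1)}{2}}\; \sum_{v^2 = \frac{|m^2-l^2|}{2}}^{\frac{\min(m^2+l^2,\,n^2+k^2)}{2}} \\ &\quad \times B\Bigl( 1 + \tfrac{\mu_0^2\theta}{8\Omega} + \tfrac12 (m^1+m^2+k^1+k^2) - v^1 - v^2,\; 1 + 2v^1 + 2v^2 \Bigr) \\ &\quad \times {}_2F_1\biggl( \genfrac{}{}{0pt}{}{1 + 2v^1 + 2v^2,\;\; \frac{\mu_0^2\theta}{8\Omega} - \frac12 (m^1+m^2+k^1+k^2) + v^1 + v^2}{2 + \frac{\mu_0^2\theta}{8\Omega} + \frac12 (m^1+m^2+k^1+k^2) + v^1 + v^2} \biggm| \frac{(1-\Omega)^2}{(1+\Omega)^2} \biggr) \\ &\quad \times \prod_{i=1}^{2} \sqrt{ \binom{n^i}{v^i + \frac{n^i-k^i}{2}} \binom{k^i}{v^i + \frac{k^i-n^i}{2}} \binom{m^i}{v^i + \frac{m^i-l^i}{2}} \binom{l^i}{v^i + \frac{l^i-m^i}{2}} }\; \Bigl( \frac{1-\Omega}{1+\Omega} \Bigr)^{2v^i}. \end{aligned} \tag{2.7}$$
Here, $B(a, b)$ is the Beta-function and ${}_2F_1\bigl( \genfrac{}{}{0pt}{}{a,\,b}{c} \bigm| z \bigr)$ the hypergeometric function.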
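The kinetic matrix (2.6), whose inverse is the propagator (2.7), is straightforward to tabulate on a truncated index set. The sketch below transcribes (2.6) term by term; the function and parameter names are hypothetical, and restricting the indices to a finite range is an assumption made only so that the matrix can be listed.

```python
import math
from itertools import product


def delta(a, b):
    """Kronecker delta."""
    return 1.0 if a == b else 0.0


def kinetic_matrix_element(m, n, k, l, omega, theta1, theta2, mu0_sq):
    """G_{mn;kl} of (2.6); m, n, k, l are pairs (i^1, i^2) of non-negative integers."""
    (m1, m2), (n1, n2), (k1, k2), (l1, l2) = m, n, k, l
    # Term proportional to delta_{nk} delta_{ml}.
    diagonal = (mu0_sq
                + 2.0 / theta1 * (1.0 + omega ** 2) * (n1 + m1 + 1)
                + 2.0 / theta2 * (1.0 + omega ** 2) * (n2 + m2 + 1))
    diagonal *= delta(n1, k1) * delta(m1, l1) * delta(n2, k2) * delta(m2, l2)
    # Index jumps by one unit in the first component.
    jump1 = 2.0 / theta1 * (1.0 - omega ** 2) * (
        math.sqrt(k1 * l1) * delta(n1 + 1, k1) * delta(m1 + 1, l1)
        + math.sqrt(m1 * n1) * delta(n1 - 1, k1) * delta(m1 - 1, l1)
    ) * delta(n2, k2) * delta(m2, l2)
    # Index jumps by one unit in the second component.
    jump2 = 2.0 / theta2 * (1.0 - omega ** 2) * (
        math.sqrt(k2 * l2) * delta(n2 + 1, k2) * delta(m2 + 1, l2)
        + math.sqrt(m2 * n2) * delta(n2 - 1, k2) * delta(m2 - 1, l2)
    ) * delta(n1, k1) * delta(m1, l1)
    return diagonal - jump1 - jump2


def index_pairs(size):
    """All index pairs (i^1, i^2) with both components below the truncation size."""
    return list(product(range(size), repeat=2))
```

Two features of (2.6) can be read off from such a table: the matrix is symmetric under $(mn) \leftrightarrow (kl)$, and at $\Omega = 1$ the index jumps drop out because of the factor $(1 - \Omega^2)$, so that only the term proportional to $\delta_{nk}\delta_{ml}$ survives.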
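3. Estimation of the Interaction Coefficients

3.1. The Polchinski equation. We have developed in [12] the Wilson-Polchinski renormalisation programme [9–11] for non-local matrix models where the kinetic term (Taylor coefficient matrix of the part of the action which is bilinear in the fields) is neither constant nor diagonal. Introducing a cut-off in the measure $\prod_{m,n} d\phi_{mn}$ of the partition function $Z$, the resulting effect is undone by adjusting the effective action $L[\phi]$ (and other terms which are easy to evaluate). If the cut-off function is a smooth function of the cut-off scale $\Lambda$, the adjustment of $L[\phi, \Lambda]$ is described by a differential equation,
$$\Lambda \frac{\partial L[\phi, \Lambda]}{\partial\Lambda} = \sum_{m,n,k,l} \frac12\, \Lambda \frac{\partial \Delta^K_{nm;lk}(\Lambda)}{\partial\Lambda} \biggl( \frac{\partial L[\phi, \Lambda]}{\partial\phi_{mn}} \frac{\partial L[\phi, \Lambda]}{\partial\phi_{kl}} - \frac{1}{4\pi^2 \theta_1 \theta_2} \Bigl[ \frac{\partial^2 L[\phi, \Lambda]}{\partial\phi_{mn}\, \partial\phi_{kl}} \Bigr]_\phi \biggr), \tag{3.1}$$
where $\bigl[ F[\phi] \bigr]_\phi := F[\phi] - F[0]$ and
$$\Delta^K_{nm;lk}(\Lambda) = \prod_{r=1}^{2}\; \prod_{i^r \in \{m^r, n^r, k^r, l^r\}} K\Bigl( \frac{i^r}{\theta_r \Lambda^2} \Bigr)\, \Delta_{nm;lk}. \tag{3.2}$$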
Here, $K(x)$ is a smooth monotonic cut-off function with $K(x) = 1$ for $x \le 1$ and $K(x) = 0$ for $x \ge 2$. The differential equation (3.1) is referred to as the Polchinski equation.

In [12] we have derived a power-counting theorem for $L[\phi, \Lambda]$ by integrating (3.1) perturbatively between the initial scale $\Lambda_0$ and the renormalisation scale $\Lambda_R \ll \Lambda_0$. The power-counting degree is given by topological data of ribbon graphs and two scaling exponents of the (summed and differentiated) cut-off propagator. The power-counting theorem in [12] is model independent, but it relied on boundary conditions for the integrations which do not correspond to a physically meaningful model. In this paper we will show that the four-dimensional duality-covariant noncommutative $\phi^4$-theory given by the action (2.5) admits an improved power-counting behaviour which only relies on a finite number of physical boundary conditions for the integration.

The first step is to extract from the power-counting theorem [12] the set of relevant and marginal interactions, which on the other hand is used as an input to derive the power-counting theorem. To say it differently: One has to be lucky to make the right ansatz for the initial interaction which is then reconfirmed by the power-counting theorem as the set of relevant and marginal interactions. We are going to prove that the following ansatz for the initial interaction is such a lucky choice:
$$\begin{aligned} L[\phi, \Lambda_0, \Lambda_0, \rho^0] &= \sum_{m^1, m^2, n^1, n^2 \in \mathbb N} \frac{1}{2\pi\theta} \biggl( \frac12 \Bigl( \rho_1^0 + (n^1+m^1+n^2+m^2)\, \rho_2^0 \Bigr) \phi_{\substack{m^1 n^1 \\ m^2 n^2}}\, \phi_{\substack{n^1 m^1 \\ n^2 m^2}} \\ &\qquad - \rho_3^0 \Bigl( \sqrt{n^1 m^1}\; \phi_{\substack{m^1 n^1 \\ m^2 n^2}}\, \phi_{\substack{n^1-1\; m^1-1 \\ n^2\; m^2}} + \sqrt{n^2 m^2}\; \phi_{\substack{m^1 n^1 \\ m^2 n^2}}\, \phi_{\substack{n^1\; m^1 \\ n^2-1\; m^2-1}} \Bigr) \biggr) \\ &\quad + \sum_{m^1, m^2, n^1, n^2, k^1, k^2, l^1, l^2 \in \mathbb N} \frac{1}{4!}\, \rho_4^0\; \phi_{\substack{m^1 n^1 \\ m^2 n^2}}\, \phi_{\substack{n^1 k^1 \\ n^2 k^2}}\, \phi_{\substack{k^1 l^1 \\ k^2 l^2}}\, \phi_{\substack{l^1 m^1 \\ l^2 m^2}}. \end{aligned} \tag{3.3}$$
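The only properties required of the cut-off function $K$ entering (3.2) are smoothness, monotonicity, $K(x) = 1$ for $x \le 1$ and $K(x) = 0$ for $x \ge 2$. One standard construction with these properties is sketched below; it is an illustrative choice, not necessarily the function used in the proof. The weight multiplying $\Delta_{nm;lk}$ in (3.2) is computed under the assumption that the product runs over the four indices of each component counted separately. All names are hypothetical.

```python
import math


def _bump(t):
    """Smooth function that vanishes for t <= 0 and is positive for t > 0."""
    return math.exp(-1.0 / t) if t > 0.0 else 0.0


def cutoff(x):
    """Smooth, monotonically decreasing K with K(x) = 1 for x <= 1 and K(x) = 0 for x >= 2."""
    a = _bump(2.0 - x)   # positive exactly for x < 2
    b = _bump(x - 1.0)   # positive exactly for x > 1
    return a / (a + b)   # a + b > 0 for every real x


def cutoff_weight(m, n, k, l, theta, lam):
    """Factor multiplying Delta_{nm;lk} in (3.2).

    m, n, k, l are index pairs (i^1, i^2); theta is the pair (theta_1, theta_2);
    lam is the cut-off scale Lambda.
    """
    weight = 1.0
    for r in range(2):
        for index in (m, n, k, l):
            weight *= cutoff(index[r] / (theta[r] * lam ** 2))
    return weight
```

With this weight, a propagator entry is left untouched as long as all its indices satisfy $i^r \le \theta_r \Lambda^2$, and it is switched off completely as soon as one index reaches $2\theta_r \Lambda^2$.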
For simplicity we impose a symmetry between the two components $m^i$ of matrix indices $m = \binom{m^1}{m^2} \in \mathbb N^2$, which could be relaxed by taking different $\rho$-coefficients in front of $m^i + n^i$ and $\sqrt{m^i n^i}$. Accordingly, we choose the same weights in the noncommutativity matrix, $\theta_1 = \theta_2 \equiv \theta$.

The differential equation (3.1) is non-perturbatively defined. However, we shall solve it perturbatively as a formal power series in a coupling constant $\lambda$ which later on will be related to a normalisation condition at $\Lambda = \Lambda_R$, see (3.16). We thus consider the following expansion:
$$L[\phi, \Lambda, \Lambda_0, \rho^0] = \sum_{V=1}^{\infty} \lambda^V \sum_{N=2}^{2V+2} \frac{(2\pi\theta)^{\frac N2 - 2}}{N!} \sum_{m_i, n_i} A^{(V)}_{m_1n_1;\dots;m_Nn_N}[\Lambda, \Lambda_0, \rho^0]\; \phi_{m_1n_1} \cdots \phi_{m_Nn_N}. \tag{3.4}$$
Inserting (3.4) into (3.1) we obtain
$$\begin{aligned} \Lambda \frac{\partial}{\partial\Lambda} A^{(V)}_{m_1n_1;\dots;m_Nn_N}[\Lambda, \Lambda_0, \rho^0] &= \sum_{N_1=2}^{N} \sum_{V_1=1}^{V-1} \sum_{m,n,k,l\in\mathbb N^2} \frac12\, Q_{nm;lk}(\Lambda)\, A^{(V_1)}_{m_1n_1;\dots;m_{N_1-1}n_{N_1-1};mn}[\Lambda]\, A^{(V-V_1)}_{m_{N_1}n_{N_1};\dots;m_Nn_N;kl}[\Lambda] \\ &\qquad + \Bigl( \binom{N}{N_1-1} - 1 \Bigr) \text{ permutations} \\ &\quad - \sum_{m,n,k,l\in\mathbb N^2} \frac12\, Q_{nm;lk}(\Lambda)\, A^{(V)}_{m_1n_1;\dots;m_Nn_N;mn;kl}[\Lambda], \end{aligned} \tag{3.5}$$
with
$$Q_{nm;lk}(\Lambda) := \frac{1}{2\pi\theta}\, \Lambda \frac{\partial \Delta^K_{nm;lk}(\Lambda)}{\partial\Lambda}. \tag{3.6}$$
One has $Q_{mn;kl}(\Lambda) = Q_{nm;lk}(\Lambda) = Q_{kl;mn}(\Lambda) = Q_{lk;nm}(\Lambda)$, and similarly for $\Delta_{mn;kl}$ and $\Delta^K_{mn;kl}(\Lambda)$.
3.2. Integration of the Polchinski equation. We are going to compute the functions $A^{(V)}_{m_1n_1;\dots;m_Nn_N}$ by iteratively integrating the Polchinski equation (3.5) starting from boundary conditions either at $\Lambda_R$ or at $\Lambda_0$. The right choice of the integration direction is an art: the boundary condition influences crucially the estimation, which in turn justifies or discards the original choice of the boundary condition. At the end of numerous trial-and-error experiments with the boundary condition, one convinces oneself that the procedure described in Definition 1 below is (up to finite re-normalisations discussed later) the unique possibility$^3$ to renormalise the model.

First we have to recall [12] the graphology resulting from the Polchinski equation (3.5). The Polchinski equation is solved by ribbon graphs drawn on a Riemann surface of uniquely determined genus $g$ and uniquely determined number $B$ of boundary components (holes). The ribbons are made of double-line propagators

\[
[\text{double-line propagator with index lines } n \to m \text{ and } k \to l] \;=\; Q_{mn;kl}(\Lambda)
\tag{3.7}
\]

$^3$ We “only” prove that the method works, not its uniqueness. The reader who doubts uniqueness of the integration procedure is invited to attempt a different way.
H. Grosse, R. Wulkenhaar
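attached to vertices

\[
[\text{four-valent ribbon vertex with legs carrying the index pairs } m_1n_1,\ m_2n_2,\ m_3n_3,\ m_4n_4] \;=\; \delta_{n_1m_2}\delta_{n_2m_3}\delta_{n_3m_4}\delta_{n_4m_1} .
\tag{3.8}
\]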
Under certain conditions verified by our model, the rough power-counting behaviour of the ribbon graph is determined by the topology $g, B$ of the Riemann surface and the number of vertices $V$ and external legs $N$. However, in order to prove this behaviour we need some auxiliary notation: the number $V^e$ of external vertices (vertices to which at least one external leg is attached), a certain segmentation index $\iota$ and a certain summation of graphs with appropriately varying indices. We recall in detail the index summation, because we need it for a refinement of the general proof given in [12].

Viewing the ribbon graph as a set of single-lines, we can distinguish closed and open lines. The open lines are called trajectories starting at an incoming index $n$, running through a chain of inner indices $k_j$ and ending at an outgoing index $m$. Each index belongs to $\mathbb{N}^2$, its components are labelled by superscripts, e.g. $m_j = {m_j^1 \atop m_j^2}$. We define $n = i[m] = i[k_j]$ and $m = o[n] = o[k_j]$. There is a conservation of the total amount of indices, $\sum_{j=1}^N n_j = \sum_{j=1}^N m_j$ (as vectors in $\mathbb{N}^2$). An index summation $\sum_{E^s}$ is a summation over the graphs with outgoing indices $E^s = \{m_1,\dots,m_s\}$ where $i[m_1],\dots,i[m_s]$ are kept fixed. The number of these summations is restricted by $s \le V^e + \iota - 1$. Due to the symmetry properties of the propagator one could equivalently sum over $n_j$ where $o[n_j]$ are fixed.

A graph $\gamma$ is produced via a certain history of contractions of (in each step) either two smaller subgraphs (with fewer vertices) or a self-contraction of a subgraph with two additional external legs. At a given order $V$ of vertices there are finitely many graphs (distinguished by their topology and the permutation of external indices) contributing the part $A^{(V)\gamma}_{m_1n_1;\dots;m_Nn_N}$ to a function $A^{(V)}_{m_1n_1;\dots;m_Nn_N}$. It is therefore sufficient to prove estimations for each $A^{(V)\gamma}_{m_1n_1;\dots;m_Nn_N}$ separately.

A ribbon graph is called one-particle irreducible (1PI) if it remains connected when removing a single propagator. The first term on the rhs of the Polchinski equation (3.5) leads always to one-particle reducible graphs, because it is left disconnected when removing the propagator $Q_{nm;lk}$ in (3.5).

According to the detailed properties of a graph $\gamma$, which is possibly a generalisation of the original ribbon graphs as explained in Sect. 3.3 below, we define the following recursive procedure (starting with the vertex (3.8) which does not have any subgraphs) to integrate the Polchinski equation (3.5):

Definition 1. We consider generalised$^4$ ribbon graphs $\gamma$ which result via a history of contractions of subgraphs which at each contraction step have already been integrated according to the rules given below.

1. Let $\gamma$ be a planar ($B = 1$, $g = 0$) one-particle irreducible graph with $N = 4$ external legs, where the index along each of its trajectories is constant (this includes the two external indices of a trajectory and the chain of indices at contracting inner vertices in between them). Then, the contribution $A^{(V)\gamma}_{mn;nk;kl;lm}[\Lambda]$ (using the natural cyclic order of legs of a planar graph, with $m = {m^1\atop m^2}$ etc. and $0 \equiv {0\atop 0}$) of $\gamma$ to the effective action is integrated as follows:

\[
\begin{aligned}
& A^{(V)\gamma}_{mn;nk;kl;lm}[\Lambda] \\
&:= -\int_{\Lambda}^{\Lambda_0} \frac{d\Lambda'}{\Lambda'} \bigg\{ \widehat{\Lambda'\frac{\partial}{\partial\Lambda'} A^{(V)\gamma}_{mn;nk;kl;lm}[\Lambda']} \\
&\qquad\qquad - [\text{four-leg graph with holes, external indices } m,n,k,l] \cdot \widehat{\Lambda'\frac{\partial}{\partial\Lambda'} A^{(V)\gamma}_{00;00;00;00}[\Lambda']} \bigg\} \\
&\quad + [\text{four-leg vertex, external indices } m,n,k,l] \bigg[ \int_{\Lambda_R}^{\Lambda} \frac{d\Lambda'}{\Lambda'}\, \widehat{\Lambda'\frac{\partial}{\partial\Lambda'} A^{(V)\gamma}_{00;00;00;00}[\Lambda']} + A^{(V)\gamma}_{00;00;00;00}[\Lambda_R] \bigg] .
\end{aligned}
\tag{3.9}
\]

$^4$ This refers to graphs with composite propagators as defined in Sect. 3.3.
Here (and in the sequel), the wide hat over the $\Lambda'$-derivative of an $A^\gamma$-function indicates that the rhs of the Polchinski equation (3.5) has to be inserted. The two vertices in the third and fourth lines of (3.9) are identical (both are equal to 1). The four-leg graph in the third line of (3.9) indicates that the graph corresponding to the function in brackets to the right of it has to be inserted into the holes. The result is a graph with the same topology as the function in the second line, but different indices on inner trajectories. The graph in the fourth line of (3.9) is identical to the original vertex (3.8). The different symbol shall remind us that in the analytic expression for subgraphs containing the vertex of the last line in (3.9) we have to insert the value in brackets to the right of it.

Remark. We use here (and in all other cases discussed below) the convention (its consistency will be shown later) that at $\Lambda = \Lambda_0$ the contribution to the initial four-point function is independent of the external matrix indices, $A^{(V)\gamma}_{00;00;00;00}[\Lambda_0] = A^{(V)\gamma}_{mn;nk;kl;lm}[\Lambda_0]$ (with $m = {m^1\atop m^2}$ etc.). This is not really necessary, we could admit $A^{(V)\gamma}_{mn;nk;kl;lm}[\Lambda_0] - A^{(V)\gamma}_{00;00;00;00}[\Lambda_0] = C_{mn;nk;kl;lm}[\Lambda_0]$ with $\big|C_{mn;nk;kl;lm}[\Lambda_0]\big| \le \frac{\mathrm{const}}{\theta\Lambda_0^2}$.
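2. Let $\gamma$ be a planar ($B = 1$, $g = 0$) 1PI graph with $N = 2$ external legs, where either the index is constant along each trajectory, or one component of the index jumps$^5$ once by $\pm 1$ and back on one of the trajectories, whereas the index along the possible other trajectory remains constant. Then, the contribution $A^{(V)\gamma}_{mn;nm}[\Lambda]$ of $\gamma$ to the effective action (with $m = {m^1\atop m^2}$, $n = {n^1\atop n^2}$ and $0 \equiv {0\atop 0}$) is integrated as follows:

\[
\begin{aligned}
& A^{(V)\gamma}_{mn;nm}[\Lambda] \\
&:= -\int_{\Lambda}^{\Lambda_0}\frac{d\Lambda'}{\Lambda'} \bigg\{ \widehat{\Lambda'\frac{\partial}{\partial\Lambda'} A^{(V)\gamma}_{mn;nm}[\Lambda']}
 - [\text{two-leg graph with holes, external indices } m,n] \cdot \bigg[ \widehat{\Lambda'\frac{\partial}{\partial\Lambda'} A^{(V)\gamma}_{00;00}[\Lambda']} \\
&\qquad\quad + m^1 \Big( \widehat{\Lambda'\frac{\partial}{\partial\Lambda'} A^{(V)\gamma}_{{1\atop0}{0\atop0};{0\atop0}{1\atop0}}[\Lambda']} - \widehat{\Lambda'\frac{\partial}{\partial\Lambda'} A^{(V)\gamma}_{00;00}[\Lambda']} \Big)
 + m^2 \Big( \widehat{\Lambda'\frac{\partial}{\partial\Lambda'} A^{(V)\gamma}_{{0\atop1}{0\atop0};{0\atop0}{0\atop1}}[\Lambda']} - \widehat{\Lambda'\frac{\partial}{\partial\Lambda'} A^{(V)\gamma}_{00;00}[\Lambda']} \Big) \\
&\qquad\quad + n^1 \Big( \widehat{\Lambda'\frac{\partial}{\partial\Lambda'} A^{(V)\gamma}_{{0\atop0}{1\atop0};{1\atop0}{0\atop0}}[\Lambda']} - \widehat{\Lambda'\frac{\partial}{\partial\Lambda'} A^{(V)\gamma}_{00;00}[\Lambda']} \Big)
 + n^2 \Big( \widehat{\Lambda'\frac{\partial}{\partial\Lambda'} A^{(V)\gamma}_{{0\atop0}{0\atop1};{0\atop1}{0\atop0}}[\Lambda']} - \widehat{\Lambda'\frac{\partial}{\partial\Lambda'} A^{(V)\gamma}_{00;00}[\Lambda']} \Big) \bigg] \bigg\} \\
&\quad + [\text{two-leg vertex, external indices } m,n] \bigg[ \int_{\Lambda_R}^{\Lambda}\frac{d\Lambda'}{\Lambda'}\, \widehat{\Lambda'\frac{\partial}{\partial\Lambda'} A^{(V)\gamma}_{00;00}[\Lambda']} + A^{(V)\gamma}_{00;00}[\Lambda_R] \\
&\qquad\quad + m^1 \bigg( \int_{\Lambda_R}^{\Lambda}\frac{d\Lambda'}{\Lambda'} \Big( \widehat{\Lambda'\frac{\partial}{\partial\Lambda'} A^{(V)\gamma}_{{1\atop0}{0\atop0};{0\atop0}{1\atop0}}[\Lambda']} - \widehat{\Lambda'\frac{\partial}{\partial\Lambda'} A^{(V)\gamma}_{00;00}[\Lambda']} \Big) + A^{(V)\gamma}_{{1\atop0}{0\atop0};{0\atop0}{1\atop0}}[\Lambda_R] - A^{(V)\gamma}_{00;00}[\Lambda_R] \bigg) \\
&\qquad\quad + m^2 \bigg( \int_{\Lambda_R}^{\Lambda}\frac{d\Lambda'}{\Lambda'} \Big( \widehat{\Lambda'\frac{\partial}{\partial\Lambda'} A^{(V)\gamma}_{{0\atop1}{0\atop0};{0\atop0}{0\atop1}}[\Lambda']} - \widehat{\Lambda'\frac{\partial}{\partial\Lambda'} A^{(V)\gamma}_{00;00}[\Lambda']} \Big) + A^{(V)\gamma}_{{0\atop1}{0\atop0};{0\atop0}{0\atop1}}[\Lambda_R] - A^{(V)\gamma}_{00;00}[\Lambda_R] \bigg) \\
&\qquad\quad + n^1 \bigg( \int_{\Lambda_R}^{\Lambda}\frac{d\Lambda'}{\Lambda'} \Big( \widehat{\Lambda'\frac{\partial}{\partial\Lambda'} A^{(V)\gamma}_{{0\atop0}{1\atop0};{1\atop0}{0\atop0}}[\Lambda']} - \widehat{\Lambda'\frac{\partial}{\partial\Lambda'} A^{(V)\gamma}_{00;00}[\Lambda']} \Big) + A^{(V)\gamma}_{{0\atop0}{1\atop0};{1\atop0}{0\atop0}}[\Lambda_R] - A^{(V)\gamma}_{00;00}[\Lambda_R] \bigg) \\
&\qquad\quad + n^2 \bigg( \int_{\Lambda_R}^{\Lambda}\frac{d\Lambda'}{\Lambda'} \Big( \widehat{\Lambda'\frac{\partial}{\partial\Lambda'} A^{(V)\gamma}_{{0\atop0}{0\atop1};{0\atop1}{0\atop0}}[\Lambda']} - \widehat{\Lambda'\frac{\partial}{\partial\Lambda'} A^{(V)\gamma}_{00;00}[\Lambda']} \Big) + A^{(V)\gamma}_{{0\atop0}{0\atop1};{0\atop1}{0\atop0}}[\Lambda_R] - A^{(V)\gamma}_{00;00}[\Lambda_R] \bigg) \bigg] .
\end{aligned}
\tag{3.10}
\]

$^5$ A jump forward and backward means the following: Let $k_1,\dots,k_{a-1}$ be the sequence of indices at inner vertices on the considered trajectory $\overrightarrow{nm}$, in correct order between $n$ and $m$. Then, for either $r = 1$ or $r = 2$ we require $n^r = k_i^r = m^r$ for all $i \in [1,p-1]\cup[q,a-1]$ and $k_i^r = n^r \pm 1$ (fixed sign) for all $i \in [p,q-1]$. The cases $p = 1$, $q = p+1$ and $q = a$ are admitted. The other index component is constant along the trajectory.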
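3. Let $\gamma$ be a planar ($B = 1$, $g = 0$) 1PI graph having $N = 2$ external legs with external indices $m_1n_1;m_2n_2 = {m^1\pm1\atop m^2}{n^1\pm1\atop n^2};{n^1\atop n^2}{m^1\atop m^2}$ (equal sign) or $m_1n_1;m_2n_2 = {m^1\atop m^2\pm1}{n^1\atop n^2\pm1};{n^1\atop n^2}{m^1\atop m^2}$, with a single$^6$ jump in the index component of each trajectory. Under these conditions the contribution of $\gamma$ to the effective action is integrated as follows:

\[
\begin{aligned}
& A^{(V)\gamma}_{{m^1+1\atop m^2}{n^1+1\atop n^2};{n^1\atop n^2}{m^1\atop m^2}}[\Lambda] \\
&:= -\int_{\Lambda}^{\Lambda_0}\frac{d\Lambda'}{\Lambda'} \bigg\{ \widehat{\Lambda'\frac{\partial}{\partial\Lambda'} A^{(V)\gamma}_{{m^1+1\atop m^2}{n^1+1\atop n^2};{n^1\atop n^2}{m^1\atop m^2}}[\Lambda']}
 - \sqrt{(m^1{+}1)(n^1{+}1)}\; [\text{two-leg graph with holes}] \cdot \widehat{\Lambda'\frac{\partial}{\partial\Lambda'} A^{(V)\gamma}_{{1\atop0}{1\atop0};{0\atop0}{0\atop0}}[\Lambda']} \bigg\} \\
&\quad + \sqrt{(m^1{+}1)(n^1{+}1)}\; [\text{two-leg vertex}] \bigg[ \int_{\Lambda_R}^{\Lambda}\frac{d\Lambda'}{\Lambda'}\, \widehat{\Lambda'\frac{\partial}{\partial\Lambda'} A^{(V)\gamma}_{{1\atop0}{1\atop0};{0\atop0}{0\atop0}}[\Lambda']} + A^{(V)\gamma}_{{1\atop0}{1\atop0};{0\atop0}{0\atop0}}[\Lambda_R] \bigg] ,
\end{aligned}
\tag{3.11}
\]

\[
\begin{aligned}
& A^{(V)\gamma}_{{m^1\atop m^2+1}{n^1\atop n^2+1};{n^1\atop n^2}{m^1\atop m^2}}[\Lambda] \\
&:= -\int_{\Lambda}^{\Lambda_0}\frac{d\Lambda'}{\Lambda'} \bigg\{ \widehat{\Lambda'\frac{\partial}{\partial\Lambda'} A^{(V)\gamma}_{{m^1\atop m^2+1}{n^1\atop n^2+1};{n^1\atop n^2}{m^1\atop m^2}}[\Lambda']}
 - \sqrt{(m^2{+}1)(n^2{+}1)}\; [\text{two-leg graph with holes}] \cdot \widehat{\Lambda'\frac{\partial}{\partial\Lambda'} A^{(V)\gamma}_{{0\atop1}{0\atop1};{0\atop0}{0\atop0}}[\Lambda']} \bigg\} \\
&\quad + \sqrt{(m^2{+}1)(n^2{+}1)}\; [\text{two-leg vertex}] \bigg[ \int_{\Lambda_R}^{\Lambda}\frac{d\Lambda'}{\Lambda'}\, \widehat{\Lambda'\frac{\partial}{\partial\Lambda'} A^{(V)\gamma}_{{0\atop1}{0\atop1};{0\atop0}{0\atop0}}[\Lambda']} + A^{(V)\gamma}_{{0\atop1}{0\atop1};{0\atop0}{0\atop0}}[\Lambda_R] \bigg] .
\end{aligned}
\tag{3.12}
\]

4. Let $\gamma$ be any other type of graph. This includes non-planar graphs ($B > 1$ and/or $g > 0$), graphs with $N \ge 6$ external legs, one-particle reducible graphs, four-point graphs with non-constant index along at least one trajectory and two-point graphs where the integrated absolute value of the jump along the trajectories is bigger than 2. Then the contribution of $\gamma$ to the effective action is integrated as follows:

\[
A^{(V)\gamma}_{m_1n_1;\dots;m_Nn_N}[\Lambda] := -\int_{\Lambda}^{\Lambda_0}\frac{d\Lambda'}{\Lambda'}\, \widehat{\Lambda'\frac{\partial}{\partial\Lambda'} A^{(V)\gamma}_{m_1n_1;\dots;m_Nn_N}[\Lambda']} .
\tag{3.13}
\]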
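The previous integration procedure identifies the following distinguished functions $\rho_a[\Lambda,\Lambda_0,\rho^0]$:

\[
\rho_1[\Lambda,\Lambda_0,\rho^0] := \sum_{\gamma \text{ as in Def. 1.2}} A^{\gamma}_{{0\atop0}{0\atop0};{0\atop0}{0\atop0}}[\Lambda,\Lambda_0,\rho^0] ,
\tag{3.14a}
\]
\[
\rho_2[\Lambda,\Lambda_0,\rho^0] := \sum_{\gamma \text{ as in Def. 1.2}} \Big( A^{\gamma}_{{1\atop0}{0\atop0};{0\atop0}{1\atop0}}[\Lambda,\Lambda_0,\rho^0] - A^{\gamma}_{{0\atop0}{0\atop0};{0\atop0}{0\atop0}}[\Lambda,\Lambda_0,\rho^0] \Big) ,
\tag{3.14b}
\]
\[
\rho_3[\Lambda,\Lambda_0,\rho^0] := \sum_{\gamma \text{ as in Def. 1.3}} \Big( - A^{\gamma}_{{1\atop0}{1\atop0};{0\atop0}{0\atop0}}[\Lambda,\Lambda_0,\rho^0] \Big) ,
\tag{3.14c}
\]
\[
\rho_4[\Lambda,\Lambda_0,\rho^0] := \sum_{\gamma \text{ as in Def. 1.1}} A^{\gamma}_{{0\atop0}{0\atop0};{0\atop0}{0\atop0};{0\atop0}{0\atop0};{0\atop0}{0\atop0}}[\Lambda,\Lambda_0,\rho^0] .
\tag{3.14d}
\]

$^6$ For an index arrangement $m_1n_1;m_2n_2 = {m^1+1\atop m^2}{n^1+1\atop n^2};{n^1\atop n^2}{m^1\atop m^2}$ and sequences $k_1,\dots,k_{a-1}$ ($l_1,\dots,l_{b-1}$) of indices at inner vertices on the trajectory $\overrightarrow{n_1m_2}$ ($\overrightarrow{n_2m_1}$) this means that there exist labels $p, q$ with $n^1+1 = k_i^1$ for all $i \in [1,p-1]$, $n^1 = k_i^1$ for all $i \in [p,a-1]$ and $n^2 = k_i^2$ for all $i \in [1,a-1]$ on one trajectory and $m^1+1 = l_j^1$ for all $j \in [q,b-1]$, $m^1 = l_j^1$ for all $j \in [1,q-1]$ and $m^2 = l_j^2$ for all $j \in [1,b-1]$ on the other trajectory. The cases $p \in \{1,a\}$ and $q \in \{1,b\}$ are admitted.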
This identification uses the symmetry properties of the $A$-functions when summed over all contributing graphs. It follows from Definition 1 and (3.3) that

\[
\rho_a[\Lambda_0,\Lambda_0,\rho^0] \equiv \rho_a^0 , \qquad a = 1,\dots,4 .
\tag{3.15}
\]

As part of the renormalisation strategy encoded in Definition 1, the coefficients (3.14) are kept constant at $\Lambda = \Lambda_R$. We define

\[
\rho_a[\Lambda_R,\Lambda_0,\rho^0] = 0 \quad\text{for } a = 1,2,3 , \qquad \rho_4[\Lambda_R,\Lambda_0,\rho^0] = \lambda .
\tag{3.16}
\]
The normalisation (3.16) for $\rho_1, \rho_2, \rho_3$ identifies $\Delta^K_{nm;lk}(\Lambda_R)$ as the cut-off propagator related to the normalised two-point function at $\Lambda_R$. This entails a normalisation of the mass $\mu_0$, the oscillator frequency $\Omega$ and the amplitude of the fields $\phi_{mn}$. The normalisation condition for $\rho_4[\Lambda_R,\Lambda_0,\rho^0]$ defines the coupling constant used in the expansion (3.4).

3.3. Ribbon graphs with composite propagators. It is convenient to write the linear combination of the functions in braces $\{\ \}$ in (3.9)–(3.12) as a (non-unique) linear combination of graphs in which we find at least one of the following composite propagators (each is also represented by its own double-line graph symbol, not reproduced here):

\[
Q^{(0)}_{{m^1\atop m^2}{n^1\atop n^2};{n^1\atop n^2}{m^1\atop m^2}} := Q_{{m^1\atop m^2}{n^1\atop n^2};{n^1\atop n^2}{m^1\atop m^2}} - Q_{{0\atop0}{n^1\atop n^2};{n^1\atop n^2}{0\atop0}} ,
\tag{3.17a}
\]
\[
Q^{(1)}_{{m^1\atop m^2}{n^1\atop n^2};{n^1\atop n^2}{m^1\atop m^2}} := Q^{(0)}_{{m^1\atop m^2}{n^1\atop n^2};{n^1\atop n^2}{m^1\atop m^2}} - m^1\, Q^{(0)}_{{1\atop0}{n^1\atop n^2};{n^1\atop n^2}{1\atop0}} - m^2\, Q^{(0)}_{{0\atop1}{n^1\atop n^2};{n^1\atop n^2}{0\atop1}} ,
\tag{3.17b}
\]
\[
Q^{(+\frac{1}{2})}_{{m^1+1\atop m^2}{n^1+1\atop n^2};{n^1\atop n^2}{m^1\atop m^2}} := Q_{{m^1+1\atop m^2}{n^1+1\atop n^2};{n^1\atop n^2}{m^1\atop m^2}} - \sqrt{m^1+1}\; Q_{{1\atop0}{n^1+1\atop n^2};{n^1\atop n^2}{0\atop0}} ,
\tag{3.17c}
\]
\[
Q^{(-\frac{1}{2})}_{{m^1\atop m^2+1}{n^1\atop n^2+1};{n^1\atop n^2}{m^1\atop m^2}} := Q_{{m^1\atop m^2+1}{n^1\atop n^2+1};{n^1\atop n^2}{m^1\atop m^2}} - \sqrt{m^2+1}\; Q_{{0\atop1}{n^1\atop n^2+1};{n^1\atop n^2}{0\atop0}} .
\tag{3.17d}
\]
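As a quick consistency check, the following sketch uses only the definitions (3.17a)–(3.17d) to verify that each composite propagator vanishes at the reference indices where the subtraction is made. The shorthand $e_1 = {1\atop0}$, $e_2 = {0\atop1}$, $0 = {0\atop0}$ is introduced here for brevity only and is not part of the original notation.

```latex
\begin{align*}
Q^{(0)}_{0n;n0} &= Q_{0n;n0} - Q_{0n;n0} = 0 , \\
Q^{(1)}_{0n;n0} &= Q^{(0)}_{0n;n0} - 0\cdot Q^{(0)}_{e_1n;ne_1} - 0\cdot Q^{(0)}_{e_2n;ne_2} = 0 , \\
Q^{(1)}_{e_1n;ne_1} &= Q^{(0)}_{e_1n;ne_1} - 1\cdot Q^{(0)}_{e_1n;ne_1} - 0\cdot Q^{(0)}_{e_2n;ne_2} = 0 , \\
Q^{(1)}_{e_2n;ne_2} &= Q^{(0)}_{e_2n;ne_2} - 0\cdot Q^{(0)}_{e_1n;ne_1} - 1\cdot Q^{(0)}_{e_2n;ne_2} = 0 , \\
Q^{(+\frac12)}_{e_1(n+e_1);n0} &= Q_{e_1(n+e_1);n0} - \sqrt{0+1}\; Q_{e_1(n+e_1);n0} = 0 , \\
Q^{(-\frac12)}_{e_2(n+e_2);n0} &= Q_{e_2(n+e_2);n0} - \sqrt{0+1}\; Q_{e_2(n+e_2);n0} = 0 .
\end{align*}
```

Read this way, $Q^{(0)}$ and $Q^{(1)}$ behave like zeroth- and first-order discrete Taylor remainders in the index $m$, which makes it plausible that they obey bounds carrying extra powers of $m/(\theta\Lambda^2)$, as in the estimations (3.28)–(3.30) quoted in Sect. 3.4.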
To obtain the linear combination we recall how the graph $\gamma$ under consideration is produced via a history of contractions and integrations of subgraphs. For a history $a$–$b$–$\dots$–$n$ ($a$ first) we have

\[
\begin{aligned}
\widehat{\Lambda\frac{\partial}{\partial\Lambda} A^{(V)\gamma}_{m_1n_1;\dots;m_Nn_N}[\Lambda]}
={}& \sum_{m_a,n_a,k_a,l_a,\dots,m_n,n_n,k_n,l_n}
\int_{(\Lambda)_A}^{(\Lambda)_B}\frac{d\Lambda_n}{\Lambda_n} \int_{(\Lambda_n)_A}^{(\Lambda_n)_B}\frac{d\Lambda_{n-1}}{\Lambda_{n-1}} \cdots \int_{(\Lambda_b)_A}^{(\Lambda_b)_B}\frac{d\Lambda_a}{\Lambda_a} \\
&\times Q_{m_nn_n;k_nl_n}(\Lambda_n) \cdots Q_{m_bn_b;k_bl_b}(\Lambda_b)\, Q_{m_an_a;k_al_a}(\Lambda_a)\; V^{m_an_ak_al_a\dots m_nn_nk_nl_n}_{m_1n_1\dots m_Nn_N} ,
\end{aligned}
\tag{3.18}
\]

where $V^{m_an_ak_al_a\dots m_nn_nk_nl_n}_{m_1n_1\dots m_Nn_N}$ is the vertex operator and either $(\Lambda_i)_A = \Lambda_i$, $(\Lambda_i)_B = \Lambda_0$ or $(\Lambda_i)_A = \Lambda_R$, $(\Lambda_i)_B = \Lambda_i$. The graph $\widehat{\Lambda\frac{\partial}{\partial\Lambda} A^{(V)\gamma}_{00;\dots;00}[\Lambda]}$ is obtained via the same procedure (including the choice of the integration direction), except that we use the vertex operator $V^{m_an_ak_al_a\dots m_nn_nk_nl_n}_{00\dots00}$. This means that all propagator indices which are not determined by the external indices are the same. Therefore, we can factor out in the difference of graphs all completely inner propagators and the integration operations.

We first consider the difference in (3.9). Since $\gamma$ is one-particle irreducible with constant index on each trajectory, we get for a certain permutation $\pi$ ensuring the history of integrations

\[
\begin{aligned}
&\widehat{\Lambda\frac{\partial}{\partial\Lambda} A^{(V)\gamma}_{mn;nk;kl;lm}[\Lambda]} - \widehat{\Lambda\frac{\partial}{\partial\Lambda} A^{(V)\gamma}_{00;00;00;00}[\Lambda]} \\
&= \dots \bigg( \prod_{i=1}^{a} Q_{m_{\pi_i}k_{\pi_i};k_{\pi_i}m_{\pi_i}}(\Lambda_{\pi_i}) - \prod_{i=1}^{a} Q_{0k_{\pi_i};k_{\pi_i}0}(\Lambda_{\pi_i}) \bigg) \\
&= \dots \sum_{b=1}^{a} \bigg( \prod_{i=1}^{b-1} Q_{0k_{\pi_i};k_{\pi_i}0}(\Lambda_{\pi_i}) \bigg)\, Q^{(0)}_{m_{\pi_b}k_{\pi_b};k_{\pi_b}m_{\pi_b}}(\Lambda_{\pi_b}) \bigg( \prod_{j=b+1}^{a} Q_{m_{\pi_j}k_{\pi_j};k_{\pi_j}m_{\pi_j}}(\Lambda_{\pi_j}) \bigg) ,
\end{aligned}
\tag{3.19}
\]

where $\gamma$ contains $a$ propagators with external indices and $m_{\pi_i} \in \{m,n,k,l\}$. The parts of the analytic expression common to both $\widehat{\Lambda\frac{\partial}{\partial\Lambda} A^{(V)\gamma}_{mn;nk;kl;lm}[\Lambda]}$ and $\widehat{\Lambda\frac{\partial}{\partial\Lambda} A^{(V)\gamma}_{00;00;00;00}[\Lambda]}$ are symbolised by the dots. The $k_{\pi_i}$ are inner indices. We thus learn that the difference of graphs appearing in the braces in (3.9) can be written as a sum of graphs each one having a composite propagator (3.17a). Of course, the identity (3.19) is nothing but a generalisation of $a^n - b^n = \sum_{k=0}^{n-1} b^k (a-b)\, a^{n-k-1}$. There are similar identities for the differences appearing in (3.10)–(3.12). We delegate their derivation to Appendix B.1. In Appendix B.2 we show how the difference operation works for a concrete example of a two-leg graph.

3.4. Bounds for the cut-off propagator. Differentiating the cut-off propagator (3.2) with respect to $\Lambda$ and recalling that the cut-off function $K(x)$ is constant unless $x \in [1,2]$, we notice that for our choice $\theta_1 = \theta_2 \equiv \theta$ the indices are restricted as follows:

\[
\frac{\partial \Delta^K_{{m^1\atop m^2}{n^1\atop n^2};{k^1\atop k^2}{l^1\atop l^2}}(\Lambda)}{\partial\Lambda} = 0
\quad\text{unless}\quad \theta\Lambda^2 \le \max(m^1,m^2,n^1,n^2,k^1,k^2,l^1,l^2) \le 2\theta\Lambda^2 .
\tag{3.20}
\]
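The origin of the restriction (3.20) can be sketched under an explicit assumption. Suppose each index component $m^r$ enters the cut-off propagator (3.2) only through a factor $K\big(m^r/(\theta\Lambda^2)\big)$; the precise form of (3.2) is not repeated in this section, so this functional form is an assumption made for illustration, not a statement of the authors' definition.

```latex
\begin{align*}
\Lambda\frac{\partial}{\partial\Lambda}\, K\Big(\frac{m^r}{\theta\Lambda^2}\Big)
  &= -\frac{2\,m^r}{\theta\Lambda^2}\; K'\Big(\frac{m^r}{\theta\Lambda^2}\Big) , \\
K'(x) = 0 \ \text{for}\ x \notin [1,2]
  &\;\Longrightarrow\; \text{this factor contributes only if}\ \ \theta\Lambda^2 \le m^r \le 2\theta\Lambda^2 , \\
K(x) = 0 \ \text{for}\ x \ge 2
  &\;\Longrightarrow\; \text{all remaining index components satisfy}\ \ m^s \le 2\theta\Lambda^2 .
\end{align*}
```

Taken together, at least one index component lies in $[\theta\Lambda^2, 2\theta\Lambda^2]$ and none exceeds $2\theta\Lambda^2$, which is the condition on the maximum in (3.20). For $\theta\Lambda^2 \gg 1$, the number of indices $m \in \mathbb{N}^2$ with $\max(m^1,m^2) \le 2\theta\Lambda^2$ is to leading order $(2\theta\Lambda^2)^2 = 4\theta^2\Lambda^4$.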
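In particular, the volume of the support of the differentiated cut-off propagator (3.20) with respect to a single index $m, n, k, l \in \mathbb{N}^2$ equals $4\theta^2\Lambda^4$, which is the correct normalisation of a four-dimensional model [12]. We compute in Appendix C the $\Lambda$-dependence of the maximised propagator $\Delta^{\mathcal{C}}_{mn;kl}$, which is the application of the sharp cut-off realising the condition (3.20) to the propagator, for selected values of $\mathcal{C} = \theta\Lambda^2$ and $\Omega$, which is extremely well reproduced by (C.2). We thus obtain for the maximum of (3.6),

\[
\max_{m,n,k,l} \big|Q_{mn;kl}(\Lambda)\big| \le \frac{1}{2\pi\theta}\Big(32 \max_x |K'(x)|\Big) \max_{m,n,k,l} \big|\Delta^{\mathcal{C}}_{mn;kl}\big|_{\mathcal{C}=\Lambda^2\theta}
\le
\begin{cases}
\dfrac{C_0}{\Omega\,\theta\Lambda^2}\, \delta_{m+k,n+l} & \text{for } \Omega > 0 , \\[2ex]
\dfrac{C_0}{\sqrt{\theta\Lambda^2}}\, \delta_{m+k,n+l} & \text{for } \Omega = 0 ,
\end{cases}
\tag{3.21}
\]

where $C_0 = C_0'\,\frac{40}{3\pi} \max_x |K'(x)|$. The constant $C_0' \gtrsim 1$ corrects the fact that (C.2) holds asymptotically only. Next, from (C.3) we obtain

\[
\max_m \sum_l \max_{n,k} \big|Q_{mn;kl}(\Lambda)\big| \le \frac{1}{2\pi\theta}\Big(32 \max_x |K'(x)|\Big) \max_m \sum_l \max_{n,k} \big|\Delta^{\mathcal{C}}_{mn;kl}\big|_{\mathcal{C}=\Lambda^2\theta}
\le \frac{C_1}{\Omega^2\,\theta\Lambda^2} ,
\tag{3.22}
\]

where $C_1 = 48\,C_1'/(7\pi)\, \max_x |K'(x)|$. The product of (3.21) by the volume $4\theta^2\Lambda^4$ of the support of the cut-off propagator with respect to a single index leads to the following bound:

\[
\sum_m \max_{n,k,l} \big|Q_{mn;kl}(\Lambda)\big| \le
\begin{cases}
4C_0\, \dfrac{\theta\Lambda^2}{\Omega} & \text{for } \Omega > 0 , \\[2ex]
4C_0\, \big(\sqrt{\theta\Lambda^2}\big)^3 & \text{for } \Omega = 0 .
\end{cases}
\tag{3.23}
\]

According to (A.29) there is the following refinement of the estimation (3.21):

\[
\Big| Q_{{m^1\atop m^2}{n^1\atop n^2};{n^1-a^1\atop n^2-a^2}{m^1-a^1\atop m^2-a^2}}(\Lambda) \Big|_{a^r \ge 0,\ m^r \le n^r}
\le C_{a^1,a^2} \Big(\frac{m^1}{\theta\Lambda^2}\Big)^{\frac{a^1}{2}} \Big(\frac{m^2}{\theta\Lambda^2}\Big)^{\frac{a^2}{2}} \frac{1}{\Omega\,\theta\Lambda^2} .
\tag{3.24}
\]

This property will imply that graphs with big total jump along the trajectories are suppressed, provided that the indices on the trajectory are “small”. However, there is a potential danger from the presence of completely inner vertices, where the index summation runs over “large” indices as well. Fortunately, according to (C.4) this case can be controlled by the following property of the propagator:

\[
\sum_{l\in\mathbb{N}^2,\ \|m-l\|_1 \ge 5}\ \max_{k,n\in\mathbb{N}^2} \Big| Q_{{m^1\atop m^2}{n^1\atop n^2};{k^1\atop k^2}{l^1\atop l^2}}(\Lambda) \Big|
\le C_4 \Big(\frac{\|m\|_\infty + 1}{\theta\Lambda^2}\Big)^2 \frac{1}{\Omega^2\,\theta\Lambda^2} ,
\tag{3.25}
\]

where we have defined the following norms:

\[
\|m-l\|_1 := \sum_{r=1}^{2} |m^r - l^r| , \qquad \|m\|_\infty := \max(m^1,m^2) \qquad\text{if } m = {m^1\atop m^2} ,\ l = {l^1\atop l^2} .
\tag{3.26}
\]

Moreover, we define

\[
\|m_1n_1;\dots;m_Nn_N\|_\infty := \max_{i=1,\dots,N} \big( \|m_i\|_\infty , \|n_i\|_\infty \big) .
\tag{3.27}
\]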
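To make the norms (3.26) and (3.27) concrete, here is a small worked example. The particular index values are chosen arbitrarily for illustration.

```latex
\begin{align*}
m = {3\atop 1} ,\quad l = {1\atop 4} :\qquad
\|m-l\|_1 &= |3-1| + |1-4| = 2 + 3 = 5 , \\
\|m\|_\infty &= \max(3,1) = 3 , \qquad \|l\|_\infty = \max(1,4) = 4 , \\
\|m\,l\|_\infty &= \max\big(\|m\|_\infty, \|l\|_\infty\big) = 4 .
\end{align*}
```

With these values the pair $(m,l)$ just satisfies the restriction $\|m-l\|_1 \ge 5$ under the sum in (3.25), whereas $l = {1\atop 3}$ would give $\|m-l\|_1 = 4$ and would be excluded from that sum.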
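Finally, we need estimations for the composite propagators (3.17) and (B.7):

\[
\Big| Q^{(0)}_{{m^1\atop m^2}{n^1\atop n^2};{n^1\atop n^2}{m^1\atop m^2}}(\Lambda) \Big| \le C_5\, \frac{\|m\|_\infty}{\theta\Lambda^2}\, \frac{1}{\Omega\,\theta\Lambda^2} ,
\tag{3.28}
\]
\[
\Big| Q^{(1)}_{{m^1\atop m^2}{n^1\atop n^2};{n^1\atop n^2}{m^1\atop m^2}}(\Lambda) \Big| \le C_6 \Big(\frac{\|m\|_\infty}{\theta\Lambda^2}\Big)^2 \frac{1}{\Omega\,\theta\Lambda^2} ,
\tag{3.29}
\]
\[
\Big| Q^{(+\frac12)}_{{m^1+1\atop m^2}{n^1+1\atop n^2};{n^1\atop n^2}{m^1\atop m^2}}(\Lambda) \Big| \le C_7 \bigg(\frac{\big\|{m^1+1\atop m^2}\big\|_\infty}{\theta\Lambda^2}\bigg)^{\frac32} \frac{1}{\Omega\,\theta\Lambda^2} .
\tag{3.30}
\]

These estimations follow from (A.31) and (A.33).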
3.5. The power-counting estimation. Now we are going to prove a power-counting theorem for the $\phi^4$-model in the matrix base, generalising the theorem proven in [12]. The generalisation concerns 1PI planar graphs and their subgraphs. A subgraph of a planar graph has necessarily genus $g = 0$ and an even number of legs on each boundary component. We distinguish one boundary component of the subgraph which after a sequence of contractions will be part of the unique boundary component of an 1PI planar graph. For a trajectory $\overrightarrow{nm}$ on the distinguished boundary component, which passes through the indices $k_1,\dots,k_a$ when going from $n$ to $m = o[n]$, we define the total jump as

\[
\langle \overrightarrow{nm} \rangle := \|n - k_1\|_1 + \sum_{c=1}^{a-1} \|k_c - k_{c+1}\|_1 + \|k_a - m\|_1 .
\tag{3.31}
\]
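As an illustration of the total jump (3.31), consider the following worked example. The index values are hypothetical and serve only to show how the definition is evaluated and how two trajectories combine.

```latex
\begin{align*}
&\text{Trajectory } \overrightarrow{nm}:\quad n = {2\atop 1} \to k_1 = {3\atop 1} \to m = {3\atop 3} , \\
&\qquad \langle \overrightarrow{nm} \rangle = \|n-k_1\|_1 + \|k_1-m\|_1 = 1 + 2 = 3 , \\
&\text{Trajectory } \overrightarrow{mm'}:\quad m = {3\atop 3} \to k_1' = {4\atop 3} \to m' = {4\atop 4} , \\
&\qquad \langle \overrightarrow{mm'} \rangle = \|m-k_1'\|_1 + \|k_1'-m'\|_1 = 1 + 1 = 2 , \\
&\text{Joined trajectory } \overrightarrow{nm'} \text{ with inner indices } k_1, m, k_1': \\
&\qquad \langle \overrightarrow{nm'} \rangle = 1 + (2 + 1) + 1 = 5 = \langle \overrightarrow{nm} \rangle + \langle \overrightarrow{mm'} \rangle .
\end{align*}
```

By the triangle inequality the total jump is never smaller than $\|n - m'\|_1$, which here equals $|2-4| + |1-4| = 5$, so this particular trajectory attains the minimum.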
Clearly, the jump is additive: if we connect two trajectories $\overrightarrow{nm}$ and $\overrightarrow{mm'}$ to a new trajectory $\overrightarrow{nm'}$, then $\langle \overrightarrow{nm'} \rangle = \langle \overrightarrow{nm} \rangle + \langle \overrightarrow{mm'} \rangle$. We let $T$ be a set of trajectories $\overrightarrow{n_j o[n_j]}$ on the distinguished boundary component for which we measure the total jump. By definition, the end points of a trajectory in $T$ cannot belong to $E^s$. Moreover, we consider a second set $T'$ of $t'$ trajectories $\overrightarrow{n_j o[n_j]}$ of the distinguished boundary component where one of the end points $m_j$ or $n_j$ is kept fixed and the other end point is summed over. However, we require the summation to run over $\langle \overrightarrow{n_j o[n_j]} \rangle \ge 5$ only, see (3.25). We let $\sum_{E^{t'}}$ be the corresponding summation operator. Additionally, we have to introduce a new notation in order to control
– the behaviour for large indices and given $\Lambda$,
– the behaviour for given indices and large $\Lambda$.
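For this purpose we let $P^a_b\big[\frac{m_1n_1;\dots;m_Nn_N}{\theta\Lambda^2}\big]$ denote a function of the indices $m_1, n_1, \dots, m_N, n_N$ and the scale $\Lambda$ which is bounded as follows:

\[
0 \le P^a_b\Big[\frac{m_1n_1;\dots;m_Nn_N}{\theta\Lambda^2}\Big] \le
\begin{cases}
C_a M^a & \text{for } M \ge 1 , \\
C_b M^b & \text{for } M \le 1 ,
\end{cases}
\qquad
M := \max_{m_i,n_i \notin E^s, E^{t'}} \Big( \frac{m_1^r+1}{2\theta\Lambda^2} , \frac{n_1^r+1}{2\theta\Lambda^2} , \dots , \frac{n_N^r+1}{2\theta\Lambda^2} \Big) ,
\tag{3.32}
\]

for some constants $C_a, C_b$. The maximisation over the indices $m_i^r, n_i^r$ excludes the summation indices $E^{t'}$. Fixing the indices and varying $\Lambda$ we have

\[
P^{a-a'}_{b+b'}\Big[\frac{m_1n_1;\dots;m_Nn_N}{\theta\Lambda^2}\Big] \le P^{a}_{b}\Big[\frac{m_1n_1;\dots;m_Nn_N}{\theta\Lambda^2}\Big] ,
\tag{3.33}
\]

for $0 \le a' \le a$ and $b' \ge 0$, assuming appropriate $C_a, C_b$. Moreover,

\[
P^{a_1}_{b_1}\Big[\frac{m_1n_1;\dots;m_{N_1}n_{N_1}}{\theta\Lambda^2}\Big]\, P^{a_2}_{b_2}\Big[\frac{m_{N_1+1}n_{N_1+1};\dots;m_Nn_N}{\theta\Lambda^2}\Big] \le P^{a_1+a_2}_{b_1+b_2}\Big[\frac{m_1n_1;\dots;m_Nn_N}{\theta\Lambda^2}\Big] .
\tag{3.34}
\]

We are going to prove:

Proposition 2. Let $\gamma$ be a ribbon graph having $N$ external legs, $V$ vertices, $V^e$ external vertices and segmentation index $\iota$, which is drawn on a genus-$g$ Riemann surface with $B$ boundary components. We require the graph $\gamma$ to be constructed via a history of subgraphs and an integration procedure according to Definition 1. Then the contribution $A^{(V,V^e,B,g,\iota)\gamma}_{m_1n_1;\dots;m_Nn_N}$ of $\gamma$ to the expansion coefficient of the effective action describing a duality-covariant $\phi^4$-theory on $\mathbb{R}^4_\theta$ in the matrix base is bounded as follows (with $m = {m^1\atop m^2}$ etc. and $0 \equiv {0\atop0}$):

1. If $\gamma$ is as in Definition 1.1, we have

\[
\Big| A^{(V,V^e,1,0,0)\gamma}_{mn;nk;kl;lm}[\Lambda,\Lambda_0,\rho_0] - A^{(V,V^e,1,0,0)\gamma}_{00;00;00;00}[\Lambda,\Lambda_0,\rho_0] \Big|
\le P^{4V-N}_{1}\Big[\frac{mn;nk;kl;lm}{\theta\Lambda^2}\Big] \Big(\frac{1}{\Omega}\Big)^{3V-2-V^e} P^{2V-2}\Big[\ln\frac{\Lambda}{\Lambda_R}\Big] ,
\tag{3.35a}
\]
\[
\Big| A^{(V,V^e,1,0,0)\gamma}_{00;00;00;00}[\Lambda,\Lambda_0,\rho_0] \Big| \le \Big(\frac{1}{\Omega}\Big)^{3V-2-V^e} P^{2V-2}\Big[\ln\frac{\Lambda}{\Lambda_R}\Big] .
\tag{3.35b}
\]

2. If $\gamma$ is as in Definition 1.2, we have

\[
\begin{aligned}
\Big|\, & A^{(V,V^e,1,0,0)\gamma}_{mn;nm}[\Lambda,\Lambda_0,\rho_0] - A^{(V,V^e,1,0,0)\gamma}_{00;00}[\Lambda,\Lambda_0,\rho_0] \\
& - m^1 \Big( A^{(V,V^e,1,0,0)\gamma}_{{1\atop0}{0\atop0};{0\atop0}{1\atop0}}[\Lambda,\Lambda_0,\rho_0] - A^{(V,V^e,1,0,0)\gamma}_{00;00}[\Lambda,\Lambda_0,\rho_0] \Big)
 - n^1 \Big( A^{(V,V^e,1,0,0)\gamma}_{{0\atop0}{1\atop0};{1\atop0}{0\atop0}}[\Lambda,\Lambda_0,\rho_0] - A^{(V,V^e,1,0,0)\gamma}_{00;00}[\Lambda,\Lambda_0,\rho_0] \Big) \\
& - m^2 \Big( A^{(V,V^e,1,0,0)\gamma}_{{0\atop1}{0\atop0};{0\atop0}{0\atop1}}[\Lambda,\Lambda_0,\rho_0] - A^{(V,V^e,1,0,0)\gamma}_{00;00}[\Lambda,\Lambda_0,\rho_0] \Big)
 - n^2 \Big( A^{(V,V^e,1,0,0)\gamma}_{{0\atop0}{0\atop1};{0\atop1}{0\atop0}}[\Lambda,\Lambda_0,\rho_0] - A^{(V,V^e,1,0,0)\gamma}_{00;00}[\Lambda,\Lambda_0,\rho_0] \Big) \Big| \\
& \le (\theta\Lambda^2)\, P^{4V-N}_{2}\Big[\frac{mn;nm}{\theta\Lambda^2}\Big] \Big(\frac{1}{\Omega}\Big)^{3V-1-V^e} P^{2V-1}\Big[\ln\frac{\Lambda}{\Lambda_R}\Big] ,
\end{aligned}
\tag{3.36a}
\]
\[
\Big| A^{(V,V^e,1,0,0)\gamma}_{00;00}[\Lambda,\Lambda_0,\rho_0] \Big| \le (\theta\Lambda^2) \Big(\frac{1}{\Omega}\Big)^{3V-1-V^e} P^{2V-1}\Big[\ln\frac{\Lambda}{\Lambda_R}\Big] ,
\tag{3.36b}
\]
\[
\Big| A^{(V,V^e,1,0,0)\gamma}_{{1\atop0}{0\atop0};{0\atop0}{1\atop0}}[\Lambda,\Lambda_0,\rho_0] - A^{(V,V^e,1,0,0)\gamma}_{00;00}[\Lambda,\Lambda_0,\rho_0] \Big| \le \Big(\frac{1}{\Omega}\Big)^{3V-1-V^e} P^{2V-1}\Big[\ln\frac{\Lambda}{\Lambda_R}\Big] .
\tag{3.36c}
\]

3. If $\gamma$ is as in Definition 1.3, we have

\[
\begin{aligned}
& \Big| A^{(V,V^e,1,0,0)\gamma}_{{m^1+1\atop m^2}{n^1+1\atop n^2};{n^1\atop n^2}{m^1\atop m^2}}[\Lambda,\Lambda_0,\rho_0] - \sqrt{(m^1{+}1)(n^1{+}1)}\; A^{(V,V^e,1,0,0)\gamma}_{{1\atop0}{1\atop0};{0\atop0}{0\atop0}}[\Lambda,\Lambda_0,\rho_0] \Big| \\
& \le (\theta\Lambda^2)\, P^{4V-N}_{2}\bigg[\frac{{m^1+1\atop m^2}{n^1+1\atop n^2};{n^1\atop n^2}{m^1\atop m^2}}{\theta\Lambda^2}\bigg] \Big(\frac{1}{\Omega}\Big)^{3V-1-V^e} P^{2V-1}\Big[\ln\frac{\Lambda}{\Lambda_R}\Big] ,
\end{aligned}
\tag{3.37a}
\]
\[
\Big| A^{(V,V^e,1,0,0)\gamma}_{{1\atop0}{1\atop0};{0\atop0}{0\atop0}}[\Lambda,\Lambda_0,\rho_0] \Big| \le \Big(\frac{1}{\Omega}\Big)^{3V-1-V^e} P^{2V-1}\Big[\ln\frac{\Lambda}{\Lambda_R}\Big] .
\tag{3.37b}
\]

4. If $\gamma$ is a subgraph of an 1PI planar graph with a selected set $T$ of trajectories on one distinguished boundary component and a second set $T'$ of summed trajectories on that boundary component, we have

\[
\begin{aligned}
\sum_{E^s}\sum_{E^{t'}} \Big| A^{(V,V^e,B,0,\iota)\gamma}_{m_1n_1;\dots;m_Nn_N}[\Lambda,\Lambda_0,\rho_0] \Big|
\le{}& (\theta\Lambda^2)^{(2-\frac{N}{2})+2(1-B)}\;
P^{4V-N}_{2t' + \sum_{\overrightarrow{n_j o[n_j]}\in T} \min(2,\frac12\langle \overrightarrow{n_j o[n_j]}\rangle)}\Big[\frac{m_1n_1;\dots;m_Nn_N}{\theta\Lambda^2}\Big] \\
&\times \Big(\frac{1}{\Omega}\Big)^{3V-\frac{N}{2}-1+B-V^e-\iota+s+t'} P^{2V-\frac{N}{2}}\Big[\ln\frac{\Lambda}{\Lambda_R}\Big] .
\end{aligned}
\tag{3.38}
\]

5. If $\gamma$ is a non-planar graph, we have

\[
\begin{aligned}
\sum_{E^s} \Big| A^{(V,V^e,B,g,\iota)\gamma}_{m_1n_1;\dots;m_Nn_N}[\Lambda,\Lambda_0,\rho_0] \Big|
\le{}& (\theta\Lambda^2)^{(2-\frac{N}{2})+2(1-B-2g)}\; P^{4V-N}_{0}\Big[\frac{m_1n_1;\dots;m_Nn_N}{\theta\Lambda^2}\Big] \\
&\times \Big(\frac{1}{\Omega}\Big)^{3V-\frac{N}{2}-1+B+2g-V^e-\iota+s} P^{2V-\frac{N}{2}}\Big[\ln\frac{\Lambda}{\Lambda_R}\Big] .
\end{aligned}
\tag{3.39}
\]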
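Since the proof below uses the rules (3.33) and (3.34) repeatedly, it may help to see why they are compatible with the defining bound (3.32). The following is a sketch at the level of the bounding monomials only, with the constants absorbed as in the phrase "assuming appropriate $C_a, C_b$"; the symbols $M_1$, $M_2$ for the maxima over the two index groups in (3.34) are introduced here for illustration.

```latex
\begin{align*}
\text{(3.33):}\quad & M \ge 1 :\ M^{a-a'} \le M^{a} \ \ (a' \ge 0) , \qquad
                      M \le 1 :\ M^{b+b'} \le M^{b} \ \ (b' \ge 0) . \\[1ex]
\text{(3.34):}\quad & M = \max(M_1, M_2) , \\
& M_1, M_2 \ge 1 :\quad M_1^{a_1} M_2^{a_2} \le M^{a_1+a_2} , \\
& M_1 \ge 1 \ge M_2 :\quad M_1^{a_1} M_2^{b_2} \le M_1^{a_1} = M^{a_1} \le M^{a_1+a_2} , \\
& M_1, M_2 \le 1 :\quad M_1^{b_1} M_2^{b_2} \le M^{b_1+b_2} .
\end{align*}
```

The mixed case uses $M_2^{b_2} \le 1$ and $M = M_1 \ge 1$; the case $M_2 \ge 1 \ge M_1$ follows by exchanging the roles of the two factors.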
Proof. We prove the proposition by induction upward in the vertex order $V$ and for given $V$ downward in the number $N$ of external legs.

5. We start with the proof for non-planar graphs, noticing that due to (3.33) the estimations (3.35), (3.36), (3.37) and (3.38) can be further bounded by (3.39). The proof of (3.39) reduces to the proof of the general power-counting theorem given in [12], where we have to take for $\big(\frac{\mu}{\Lambda}\big)^{\delta_0}$, $\big(\frac{\mu}{\Lambda}\big)^{\delta_1}$ and $\big(\frac{\Lambda}{\mu}\big)^{\delta_2}$ the estimations (3.21), (3.22) and (3.23) with both their $\Lambda$- and $\Omega$-dependence. Independent of the factor (3.32), the non-planarity of the graph guarantees the irrelevance of the corresponding function so that the integration according to Definition 1 agrees with the procedure used in [12].

The dependence on $\frac{m_i^r}{\theta\Lambda^2}, \frac{n_i^r}{\theta\Lambda^2}$ through (3.32) is preserved in its structure, because for $\omega > 0$ we have

\[
\int_{\Lambda}^{\Lambda_0} \frac{d\Lambda'}{\Lambda'}\, \frac{C}{\Lambda'^{\omega}}\, P^a_b\Big[\frac{m}{\theta\Lambda'^2}\Big]
\le \int_{\Lambda}^{\Lambda_0} \frac{d\Lambda'}{\Lambda'}\, \frac{C\,C_b}{\Lambda'^{\omega}} \Big(\frac{m+1}{2\theta\Lambda'^2}\Big)^{b}
\le \frac{1}{\Lambda^{\omega}}\, \frac{C\,C_b}{\omega+2b} \Big(\frac{m+1}{2\theta\Lambda^2}\Big)^{b}
\tag{3.40a}
\]

for $m+1 \le 2\theta\Lambda^2$ and

\[
\begin{aligned}
\int_{\Lambda}^{\Lambda_0} \frac{d\Lambda'}{\Lambda'}\, \frac{C}{\Lambda'^{\omega}}\, P^a_b\Big[\frac{m}{\theta\Lambda'^2}\Big]
&\le \int_{\Lambda}^{\sqrt{\frac{m+1}{2\theta}}} \frac{d\Lambda'}{\Lambda'}\, \frac{C\,C_a}{\Lambda'^{\omega}} \Big(\frac{m+1}{2\theta\Lambda'^2}\Big)^{a}
 + \int_{\sqrt{\frac{m+1}{2\theta}}}^{\Lambda_0} \frac{d\Lambda'}{\Lambda'}\, \frac{C\,C_b}{\Lambda'^{\omega}} \Big(\frac{m+1}{2\theta\Lambda'^2}\Big)^{b} \\
&\le \frac{1}{\Lambda^{\omega}}\, \frac{C\,C_a}{\omega+2a} \Big(\frac{m+1}{2\theta\Lambda^2}\Big)^{a}
 + \frac{\big((\omega+2a)C_b - (\omega+2b)C_a\big)\,C}{(\omega+2a)(\omega+2b)} \Big(\frac{m+1}{2\theta}\Big)^{-\frac{\omega}{2}}
\end{aligned}
\tag{3.40b}
\]

for $m+1 \ge 2\theta\Lambda^2$. For $(\omega+2b)C_a > (\omega+2a)C_b$ we can omit the last term in the second line of (3.40b), and for $(\omega+2b)C_a < (\omega+2a)C_b$ we estimate it by $\frac{(\omega+2a)C_b - (\omega+2b)C_a}{C_a(\omega+2b)}$ times the first term. Taking a polynomial in $\ln\frac{\Lambda}{\Lambda_R}$ into account, the spirit of (3.40) is unchanged according to [12].

The general power-counting theorem in [12] uses analogues of the bounds (3.21) and (3.22) of the propagator, which do not add factors $\frac{m}{\theta\Lambda^2}$. Since two legs of the subgraph(s) are contracted, the total $a$-degree of (3.32) becomes $4V - N - 2$, which due to (3.33) can be regarded as degree $4V - N$, too.

4. The proof of (3.38) is essentially a repetition of the proof of (3.39), with particular care when contracting trajectories on the distinguished boundary component. The verification of the exponents of $(\theta\Lambda^2)$, $\frac{1}{\Omega}$ and $\ln\frac{\Lambda}{\Lambda_R}$ in (3.38) is identical to the proof of (3.39). It remains to verify the $a, b$-degrees of the factor (3.32). We first consider the contraction of two smaller graphs $\gamma_1$ (left subgraph) and $\gamma_2$ (right subgraph) to the total graph $\gamma$.

(a) We first assume additionally that all indices of the contracting propagator are determined (this is the case for $V_1^e + V_2^e = V^e$ and $\iota_1 + \iota_2 = \iota$), e.g.

\[
[\text{two ribbon-graph diagrams: subgraphs } \gamma_1, \gamma_2 \text{ joined by the contracting propagators } Q_{m_1m;ll_1} \text{ (left) and } Q_{m_1m;k_1k} \text{ (right)}]
\tag{3.41}
\]
As a subgraph of an 1PI planar graph, at most one side $\overrightarrow{ml}$ or $\overrightarrow{m_1l_1}$ ($\overrightarrow{mk_1}$ or $\overrightarrow{m_1k}$) of the contracting propagator $Q_{m_1m;ll_1}$ ($Q_{m_1m;k_1k}$) can belong to a trajectory in $T$. In the left graph of (3.41) let us assume that the side $\overrightarrow{ml}$ connects two trajectories $\overrightarrow{i[m]m} \in T_1$ and $\overrightarrow{lo[l]} \in T_2$ to a new trajectory $\overrightarrow{i[m]o[l]} \in T$. The proof for the small-$\Lambda$ degree $a = 4V - N$ in (3.38) is immediate, because the contraction reduces the number of external legs by 2 and we are free to estimate the contracting propagator by its global maximisation (3.21). Concerning the large-$\Lambda$ degree $b$, there is nothing to prove if already $\langle\overrightarrow{i[m]m}\rangle + \langle\overrightarrow{lo[l]}\rangle \ge 4$. For $\langle\overrightarrow{i[m]m}\rangle + \langle\overrightarrow{lo[l]}\rangle < 4$ we use the refined estimation (3.24) for the contracting propagator, which gives a relative factor $M^{\frac12\langle\overrightarrow{lm}\rangle}$ compared with (3.21), where $M = \max\big(\frac{m^r+1}{2\theta\Lambda^2}, \frac{l^r+1}{2\theta\Lambda^2}\big)$. Now, the result follows from (3.31). Because of $\langle\overrightarrow{i[m]m}\rangle + \langle\overrightarrow{lo[l]}\rangle < 4$, the indices $m^r, l^r$ from the propagator can be estimated by $i[m]^r$ and $o[l]^r$. If the resulting jump leads to $\frac12\langle\overrightarrow{i[m]o[l]}\rangle > 2$, we use (3.33) to reduce it to 2. In this way we can guarantee that the $b$-degree does not exceed the $a$-degree. Alternatively, if $\langle\overrightarrow{lm}\rangle \ge 5$ we can avoid a huge $\langle\overrightarrow{i[m]o[l]}\rangle$ by estimating$^7$ the contracting propagator via (3.25) instead of (3.24), because a single propagator $|Q_{mm_1;l_1l}|$ is clearly smaller than the entire sum over $\|m-l\|_1 \ge 5$.

If $\overrightarrow{i[m]o[l]} \in T'$, then the sum over $o[l]$ with $\langle\overrightarrow{i[m]o[l]}\rangle \ge 5$ can be estimated by the combined sum
– over finitely many combinations of $m, l$ with $\max(\langle\overrightarrow{i[m]m}\rangle, \langle\overrightarrow{ml}\rangle, \langle\overrightarrow{lo[l]}\rangle) \le 4$, which via (3.24) and the induction hypotheses relative to $T_1, T_2$ contributes a factor $M^{\frac12\langle\overrightarrow{i[m]o[l]}\rangle}$ to the rhs of (3.38), where $M = \max\big(\frac{o[l]^r+1}{2\theta\Lambda^2}, \frac{i[m]^r+1}{2\theta\Lambda^2}, \frac{l^r+1}{2\theta\Lambda^2}, \frac{m^r+1}{2\theta\Lambda^2}\big)$. We use (3.33) to reduce the $b$-degree from $\frac12\langle\overrightarrow{i[m]o[l]}\rangle$ to 2.
– over $m$ via the induction hypothesis relative to $\overrightarrow{i[m]m} \in T_1'$, combined with the usual maximisation (3.21) of the contracting propagator and an estimation of $\gamma_2$ where $\overrightarrow{lo[l]} \notin T_2, T_2'$.
– over $l$ for fixed $m \approx i[m]$, via (3.25), taking $\overrightarrow{i[m]m} \notin T_1, T_1'$ and $\overrightarrow{lo[l]} \notin T_2, T_2'$.
– over $o[l]$ for fixed $m$ and $l$, with $i[m] \approx m \approx l$, via the induction hypothesis relative to $\overrightarrow{lo[l]} \in T_2'$, the bound (3.21) of the propagator and $\overrightarrow{i[m]m} \notin T_1, T_1'$.
A summation over $i[m]$ with given $o[l]$ is analogous. In conclusion, we have proven that the integrand for the graph $\gamma$ is bounded by (3.38).

Since we are dealing with a $N \ge 6$-point function, the total $\Lambda$-exponent is negative. Using (3.40) we thus obtain the same bound (3.38) after integration from $\Lambda_0$ down to $\Lambda$. If $\overrightarrow{m_1l_1} \in T$ or $\overrightarrow{m_1l_1} \in T'$ we get (3.38) directly from (3.24) or (3.25).

The discussion of the right graph in (3.41) is similar, showing that the integrand is bounded by (3.38). As long as the integrand is irrelevant (i.e. the total $\Lambda$-exponent is negative), we get (3.38) after $\Lambda$-integration, too. However, $\gamma$ might have two legs only with $\langle\overrightarrow{i(m)m}\rangle + \langle\overrightarrow{lo(l)}\rangle \le 2$. In this case the integrand is marginal or relevant, but according to Definition 1.4 we nonetheless integrate from $\Lambda_0$ down to $\Lambda$. We have to take into account that the cut-off propagator at the scale $\Lambda'$ vanishes for $\Lambda'^2 \ge \|m_1m;k_1k\|_\infty/\theta$. Assuming two relevant two-leg subgraphs $\gamma_1, \gamma_2$ bounded by $\theta\Lambda'^2$ times a polynomial in $\ln\frac{\Lambda'}{\Lambda_R}$ each, we have

\[
\begin{aligned}
\int_{\Lambda}^{\Lambda_0} \frac{d\Lambda'}{\Lambda'}\, \Big| Q_{m_1m;k_1k}(\Lambda')\, A^{\gamma_1}(\Lambda')\, A^{\gamma_2}(\Lambda') \Big|
&\le \int_{\Lambda}^{\sqrt{\|m_1m;k_1k\|_\infty/\theta}} \frac{d\Lambda'}{\Lambda'}\, \frac{C_0}{\Omega\,\theta\Lambda'^2}\, (\theta\Lambda'^2)^2\, P^{2V-2}\Big[\ln\frac{\Lambda'}{\Lambda_R}\Big] \\
&\le \frac{C_0}{2\Omega}\, \|m_1m;k_1k\|_\infty\, P^{2V-2}\bigg[\ln\sqrt{\frac{\|m_1m;k_1k\|_\infty}{\theta\Lambda_R^2}}\bigg] \\
&\le \frac{C_0}{\Omega}\, (\theta\Lambda^2)\, P^{2}_{1}\Big[\frac{m_1m;k_1k}{2\theta\Lambda^2}\Big]\, P^{2V-2}\Big[\ln\frac{\Lambda}{\Lambda_R}\Big] .
\end{aligned}
\tag{3.42}
\]

$^7$ In this case there is an additional factor $\frac{1}{\Omega}$ in (3.22) compared with (3.21). It is plausible that this is due to the summation, which we do not need here. However, we do not prove a corresponding formula without summation. In order to be on a safe side, one could replace $\Omega$ in the final estimation (4.55) by $\Omega^2$. Since $\Omega$ is finite anyway, there is no change of the final result. We therefore ignore the discrepancy in $\frac{1}{\Omega}$.
Here, we have inserted the estimation (3.21) for the propagator, restricted to its support. In the logarithm we expanded $\ln\sqrt{\frac{m}{\theta\Lambda_R^2}} = \ln\sqrt{\frac{m}{\theta\Lambda^2}} + \ln\frac{\Lambda}{\Lambda_R}$ and estimated $\big(\ln\sqrt{\frac{m}{\theta\Lambda^2}}\big)^q < c\,\frac{m}{2\theta\Lambda^2}$. Thus, the small-$\Lambda$ degree $a$ of the total graph is increased by 2 over the sum of the small-$\Lambda$ degrees of the subgraphs (taken $= 0$ here), in agreement with (3.38). The estimation for the logarithm is not necessary for the large-$\Lambda$ degree $b$ in (3.32). Using (3.33) we could reduce that degree to $b = 0$. We would like to underline that the integration of 1PR graphs is one of the sources for the factor (3.32) in the power-counting theorem. Taking the factors (3.32) in the bounds for the subgraphs $\gamma_i$ into account, the formula modifies accordingly. We confirm (3.38) in any case. It is clear that all other possibilities with determined propagator indices as discussed in [12] are treated similarly.

(b) Next, let one index of the contracting propagator be an undetermined summation index, e.g.

\[
[\text{ribbon-graph diagram: subgraphs } \gamma_1, \gamma_2 \text{ joined by a propagator with one summation index } n]
\tag{3.43}
\]

Let $\overrightarrow{i[k]o[n]} \in T$. Then $k$ is determined by the external indices of $\gamma_2$. There is nothing to prove for $\langle\overrightarrow{i[k]k}\rangle \ge 4$. For $\langle\overrightarrow{i[k]k}\rangle < 4$ we partition the sum over $n$ into $\langle\overrightarrow{nk}\rangle \le 4$, where each term yields the integrand (3.38) as before in the case of determined indices (3.41), and the sum over $\langle\overrightarrow{nk}\rangle \ge 5$, which yields the desired factor in (3.38) via (3.25) and the similarity $k \approx i[k]$ of indices. As a subgraph of a planar graph, $m = o[n]$ in $\gamma_1$, so that a possible $k_1$-summation can be transferred to $m$. If $\overrightarrow{i[k]o[n]} \in T'$ then in the same way as for (3.41) the summation splits into the four possibilities related to the pieces $\overrightarrow{no[n]}$, $\overrightarrow{kn}$ and $\overrightarrow{i[k]k}$, which yield the integrand (3.38) via the induction hypotheses for the subgraphs and via (3.24) or (3.25). The $\Lambda$-integration yields (3.38) via (3.40) if the integrand is irrelevant, whereas we have to perform similar considerations as in (3.42) if the integrand is relevant or marginal.
(c) The discussion of graphs with two summation indices on the contracting propagator, such as in
[Figure: ribbon graph with two summation indices on the contracting propagator] (3.44)
is similar. Note that the planarity requirement implies m = o[n] and l = i[k].
(d) Next, we look at self-contractions of the same vertex of a graph. Among the examples discussed in [12] there are only two possibilities which can appear in subgraphs of planar graphs:
[Figure: the two possible self-contractions of a single vertex, left and right graph] (3.45)
There is nothing to prove for the left graph in (3.45). To verify the large-Λ degree b relative to the right graph, we partition the sum over n into
– $\langle\overrightarrow{i[n]n}\rangle \le 4$ and $\langle\overrightarrow{no[n]}\rangle \le 4$, where each term yields (3.38) via the induction hypothesis for the trajectories $\overrightarrow{i[n]n} \in T_1$ and $\overrightarrow{no[n]} \in T_1$ of the subgraph (in the same way as for the examples with determined propagator indices),
– $\langle\overrightarrow{i[n]n}\rangle \le 4$ and $\langle\overrightarrow{no[n]}\rangle \ge 5$, for which the induction hypothesis for $\overrightarrow{i[n]n} \notin T_1, T_1'$ and $\overrightarrow{no[n]} \in T_1$, together with i[n] ≈ n, gives a contribution of 2 to the b-degree in (3.32), and
– $\langle\overrightarrow{i[n]n}\rangle \ge 5$, which via the induction hypothesis for $\overrightarrow{i[n]n} \in T_1$ and $\overrightarrow{no[n]} \notin T_1, T_1'$ gives a contribution of 2 to the b-degree in (3.32).
The case $\overrightarrow{i[n]o[n]} \in T$ is similar to discuss. At the end we always arrive at the integrand (3.38). If it is irrelevant, the integration from Λ_0 down to Λ yields (3.38) according to (3.40). If the integrand is marginal/relevant and γ is one-particle reducible, then the indices of the propagator contracting 1PI subgraphs are of the same order as the incoming and outgoing indices of the trajectories through the propagator (otherwise the 1PI subgraphs are irrelevant). Now a procedure similar to (3.42) yields (3.38) after integration from Λ_0 down to Λ, too. If γ is 1PI and marginal or relevant, it is actually of the type 1–3 of Definition 1 and will be discussed below.
(e) Finally, there will be self-contractions of different vertices of a subgraph, such as in
[Figure: ribbon graph with a self-contraction between two different vertices] (3.46)
The vertices to contract have to be situated on the same (distinguished) boundary component, because the contraction of different boundary components increases the genus and for contractions of other boundary components the proof is immediate. Only the large-Λ degree b is questionable. Let $\overrightarrow{i[m]o[l]} \in T$, with i[m] = o[l] due to planarity. Here, m is regarded as a summation index. As before we split that sum over m into a piece with $\langle\overrightarrow{i[m]m}\rangle \le 4$, which yields the b-degree of the integrand (3.38) term by term via the induction hypothesis relative to $\overrightarrow{i[m]m}, \overrightarrow{lo[l]} \in T_1$ and (3.24) for the contracting propagator, and a piece with $\langle\overrightarrow{i[m]m}\rangle \ge 5$, which gives (3.38) via the induction hypothesis relative to $\overrightarrow{i[m]m} \in T_1$ and $\overrightarrow{lo[l]} \notin T_1, T_1'$. If $\overrightarrow{i[m]o[l]} \in T'$ the sum over o[l] with $\langle\overrightarrow{i[m]o[l]}\rangle \ge 5$ is estimated by a finite number of combinations of m, l with $\max(\langle\overrightarrow{i[m]m}\rangle, \langle\overrightarrow{ml}\rangle, \langle\overrightarrow{lo[l]}\rangle) \le 4$, which yields the integrand (3.38) via the induction hypothesis for T1 and (3.24), and the sum over index combinations
– $\langle\overrightarrow{i[m]m}\rangle \le 4$, $\langle\overrightarrow{ml}\rangle \le 4$, $\langle\overrightarrow{lo[l]}\rangle \ge 5$
– $\langle\overrightarrow{i[m]m}\rangle \le 4$, $\langle\overrightarrow{ml}\rangle \ge 5$
– $\langle\overrightarrow{i[m]m}\rangle \ge 5$
which is controlled by the induction hypothesis relative to T1 or (3.25), together with the similarity of trajectory indices at those parts where the jumps are bounded by 4. The case where $\overrightarrow{i[k]m_1} \in T$ or $\overrightarrow{i[k]m_1} \in T'$ is easier to treat. We thus arrive in any case at the estimation (3.38) for the integrand of γ, which leads to the same estimation (3.38) for γ itself according to the considerations at the end of 4.d. If γ is of type 1–3 of Definition 1 we will treat it below. The discussion of all other possible self-contractions as listed in [12] is similar. This finishes part 4 of the proof of Proposition 2.
1. Now we consider 1PI planar 4-leg graphs γ with constant index on each trajectory.
If the external indices are zero, we get (3.35b) directly from the general power-counting theorem [12], because the integration direction used there agrees with Definition 1.1. For non-zero external indices we decompose the difference (3.35a) according to (3.19) into graphs with composite propagators (3.17a) bounded by (3.28). The composite propagators appear on one of the trajectories of γ, and as such already on the trajectory of a sequence of subgraphs of γ, starting with some minimal subgraph γ0. The composite propagator is the contracting propagator for γ0. Now, the integrand of the minimal subgraph γ0 with composite propagator is bounded by a factor $C_5\frac{\|m\|}{\Lambda^2\theta}$ times the integrand of the would-be graph γ0 with ordinary propagator, where m is the index at the trajectory under consideration.
If γ0 is irrelevant, the factor $C_5\frac{\|m\|}{\Lambda^2\theta}$ of the integrand survives according to (3.40) to the subgraph γ0 itself. Otherwise, if γ0 is relevant or marginal, it is decomposed according to 1–3 of Definition 1. Here, the last lines of (3.9)–(3.12) are independent of the external index m so that in the difference relative to the composite propagator these last lines of (3.9)–(3.12) cancel identically. There remains the first part of (3.9)–(3.12), which is integrated from Λ_0 downward and which is irrelevant by induction. Thus, (3.40) applies in this case, too, saving the factor $C_5\frac{\|m\|}{\Lambda^2\theta}$ to γ0 in any case. This factor thus appears in the integrand of the subgraph of γ next larger than γ0. By iteration of the procedure we obtain the additional factor $C_5\frac{\|m\|}{\Lambda^2\theta}$ in the integrand of the total graph γ with composite propagators, whose Λ-degree is thus reduced by 2 compared with the original graph γ. Since γ itself is a marginal graph according to the general power-counting behaviour (3.39), the graph with composite
propagator is irrelevant and, according to Definition 1.1, is to be integrated from Λ_0 down to Λ. This explains (3.35a).
3. Similarly, we conclude from the proof of (3.38) that the integrands of graphs γ according to Definition 1.3 are marginal. In particular, we immediately confirm (3.37b). For non-zero external indices we decompose the difference (3.37a) according to (B.1) into graphs either with composite propagators (3.17a) bounded by (3.28) or with composite propagators (3.17c)/(3.17d) bounded by (3.30). In such a graph there are (apart from the usual propagators with bound (3.21)/(3.22)) two propagators with a1 + a2 = 1 in (3.24) and a composite propagator with bound (3.28), or one propagator with a1 + a2 = 1 in (3.24) and one composite propagator with bound (3.30). In both cases we get a total factor $\big\|{}^{m^1+1}_{m^2}{}^{n^1+1}_{n^2}\big\|_\infty^2\,(\theta\Lambda^2)^{-2}$ compared with a general planar two-point graph (3.39). The detailed discussion of the subgraphs is similar as under 1.
2. Finally, we have to discuss graphs γ according to Definition 1.2. We first consider the case that γ has constant index on each trajectory. It is then clear from the proof of (3.39) that (in particular) at vanishing indices the graph γ is relevant, which is expressed by (3.36b). Next, the difference (3.36c) of graphs can as in (3.19) be written as a sum of graphs with one composite propagator (3.17a), the bound of which is given by (3.28). After the treatment of subgraphs as described under 1, the integrand of each term in the linear combination is marginal. According to Definition 1.2 we have to integrate these terms from Λ_R up to Λ, which agrees with the procedure in [12] and leads to (3.36c). Finally, according to (B.3) and (B.4), the linear combination constituting the lhs of (3.36a) results in a linear combination of graphs with either one propagator (3.17b) with bound (3.29), or with two propagators (3.17a) with bound (3.28). A similar discussion as under 1 then leads to (3.36a). The second case is when one index component jumps once on a trajectory and back. According to the proof of (3.38) the integrand of γ at vanishing external indices is marginal. We regard it nevertheless as relevant using the inequality $1 \le (\theta\Lambda^2)(\theta\Lambda_R^2)^{-1}$, where $(\theta\Lambda_R^2)^{-1}$ is some number kept constant in our renormalisation procedure. We now obtain (3.36b). Similarly, the integrand relative to the difference (3.36c) would be irrelevant, but is considered as marginal via the same trick.
Finally, the linear combination constituting the lhs of (3.36a) is according to (B.3) and (B.6)–(B.9) a linear combination of graphs having either two propagators with a1 + a2 = 1 in (3.24) and a composite propagator with bound (3.28), or one propagator with a1 + a2 = 1 in (3.24) and one composite propagator (3.17c)/(3.17d) with bound (3.30). The discussion as before would lead to an increased large-Λ degree $P^{4V-N}_3$ instead of $P^{4V-N}_2$ in (3.36a), which can be reduced to $P^{4V-N}_2$ according to (3.33). This finishes the proof of Proposition 2.
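As a brief illustration of the trick $1 \le (\theta\Lambda^2)(\theta\Lambda_R^2)^{-1}$ used in part 2 of the proof, consider a marginal bound of the generic form $C\,P[\ln\frac{\Lambda}{\Lambda_R}]$. The constant C and the non-negative polynomial P are placeholders introduced only for this sketch; they are not the specific bounds of Proposition 2.

```latex
% Sketch under the assumptions C > 0, P[...] >= 0 and \Lambda >= \Lambda_R.
\[
  C\,P\Big[\ln\frac{\Lambda}{\Lambda_R}\Big]
  \;\le\;
  C\,\frac{\theta\Lambda^2}{\theta\Lambda_R^2}\,
  P\Big[\ln\frac{\Lambda}{\Lambda_R}\Big]
  \;=\;
  \frac{C}{\theta\Lambda_R^2}\,(\theta\Lambda^2)\,
  P\Big[\ln\frac{\Lambda}{\Lambda_R}\Big].
\]
```

Since $(\theta\Lambda_R^2)^{-1}$ is kept constant, it can be absorbed into the constant, and the bound takes the form of a relevant one, growing like $\theta\Lambda^2$.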
It is now important to realise [11] that the estimations (3.35)–(3.39) of Proposition 2 do not make any reference to the initial scale Λ_0. Therefore, the estimations (3.35)–(3.39), which give finite bounds for the interaction coefficients with finite external indices, also hold in the limit Λ_0 → ∞. This is the renormalisation of the duality-covariant noncommutative φ^4-model. In numerical computations the limit Λ_0 → ∞ is difficult to realise. Taking instead a large but finite Λ_0, it is then important to estimate the error and the rate of convergence as Λ_0 approaches ∞. This type of estimation is the subject of the next section.
We finish this section with a remark on the freedom of normalisation conditions. One of the most important steps in the proof is the integration procedure for the Polchinski equation given in Definition 1. For presentational reasons we have chosen the smallest possible set of graphs to be integrated from Λ_R upward. This can easily be generalised. We could admit in (3.9) any planar 1PI four-point graphs for which the incoming index of each trajectory is equal to the outgoing index on that trajectory, but with arbitrary jump along the trajectory. There is no change of the estimation (3.35a), because (according to Proposition 2.4) $\Lambda\frac{\partial}{\partial\Lambda} A^{(V,V^e,1,0,0)\gamma}_{{}^{m^1}_{m^2}{}^{n^1}_{n^2};{}^{n^1}_{n^2}{}^{k^1}_{k^2};{}^{k^1}_{k^2}{}^{l^1}_{l^2};{}^{l^1}_{l^2}{}^{m^1}_{m^2}}[\Lambda]$ is already irrelevant for these graphs, and so is the difference in braces in (3.9). Moreover, integrating such an irrelevant graph according to the last line of (3.9) from Λ_R upward we obtain a bound $\frac{1}{(\Lambda_R^2\theta)}P^{2V-2}[\ln\frac{\Lambda}{\Lambda_R}]$, which agrees with (3.35b), because $\Lambda_R^2\theta$ is finite. Similarly, we can relax the conditions on the jump along the trajectory in (3.10)–(3.12). We would then define the $\rho_a[\Lambda,\Lambda_0,\rho^0]$-functions in (3.14) for that enlarged set of graphs γ. In a second generalisation we could admit one-particle reducible graphs in 1–3 of Definition 1 and even non-planar graphs with the same condition on the external indices as in 1–3 of Definition 1. Since there is no difference in the power-counting behaviour between non-planar graphs and planar graphs with large jump, the discussion is as before. However, the convergence theorem developed in the next section cannot be adapted in an easy way to normalisation conditions involving non-planar graphs. In summary, the proposed generalisations constitute different normalisation conditions of the same duality-covariant φ^4-model. Passing from one normalisation to another one is a finite re-normalisation. The invariant characterisation of our model is its definition via four independent normalisation conditions for the ρ-functions so that at large scales the effective action approaches (3.3).
4. The Convergence Theorem
In this section we prove the convergence of the coefficients of the effective action in the limit Λ_0 → ∞, relative to the integration procedure given in Definition 1. This is a stronger result than the power-counting estimation of Proposition 2, which e.g. would be compatible with bounded oscillations. Additionally, we identify the rate of convergence of the interaction coefficients.
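4.1. The Λ_0-dependence of the effective action. We have to control the Λ_0-dependence which enters the effective action via the integration procedure of Definition 1. There is an explicit dependence via the integration domain of irrelevant graphs and an implicit dependence through the normalisation (3.16), which requires a carefully adapted Λ_0-dependence of $\rho_a^0$. For fixed Λ = Λ_R but variable Λ_0 we consider the identity
\[
\begin{aligned}
&L[\phi,\Lambda_R,\Lambda_0',\rho^0[\Lambda_0']] - L[\phi,\Lambda_R,\Lambda_0'',\rho^0[\Lambda_0'']] \\
&\equiv \int_{\Lambda_0''}^{\Lambda_0'} \frac{d\Lambda_0}{\Lambda_0}\,\Big(\Lambda_0\frac{d}{d\Lambda_0} L[\phi,\Lambda_R,\Lambda_0,\rho^0[\Lambda_0]]\Big) \\
&= \int_{\Lambda_0''}^{\Lambda_0'} \frac{d\Lambda_0}{\Lambda_0}\,\Big(\Lambda_0\frac{\partial L[\phi,\Lambda_R,\Lambda_0,\rho^0]}{\partial\Lambda_0} + \sum_{a=1}^{4}\Lambda_0\frac{d\rho_a^0}{d\Lambda_0}\,\frac{\partial L[\phi,\Lambda_R,\Lambda_0,\rho^0]}{\partial\rho_a^0}\Big).
\end{aligned}
\tag{4.1}
\]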
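The model is defined by fixing the boundary condition for $\rho_b$ at Λ_R, i.e. by keeping $\rho_b[\Lambda_R,\Lambda_0,\rho^0]$ = constant:
\[
0 = d\rho_b[\Lambda_R,\Lambda_0,\rho^0] = \frac{\partial\rho_b[\Lambda_R,\Lambda_0,\rho^0]}{\partial\Lambda_0}\,d\Lambda_0 + \sum_{a=1}^{4}\frac{\partial\rho_b[\Lambda_R,\Lambda_0,\rho^0]}{\partial\rho_a^0}\,\frac{d\rho_a^0}{d\Lambda_0}\,d\Lambda_0 .
\tag{4.2}
\]
Assuming that we can invert the matrix $\frac{\partial\rho_b[\Lambda_R,\Lambda_0,\rho^0]}{\partial\rho_a^0}$, which is possible in perturbation theory, we get
\[
\frac{d\rho_a^0}{d\Lambda_0} = -\sum_{b=1}^{4}\frac{\partial\rho_a^0}{\partial\rho_b[\Lambda_R,\Lambda_0,\rho^0]}\,\frac{\partial\rho_b[\Lambda_R,\Lambda_0,\rho^0]}{\partial\Lambda_0} .
\tag{4.3}
\]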
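The step from (4.2) to (4.3) is a linear solve for the four unknowns $d\rho_a^0/d\Lambda_0$. A minimal sketch, with arguments suppressed and with the shorthand $J_{ba} := \partial\rho_b/\partial\rho_a^0$ (a label introduced here for illustration only):

```latex
% J is a hypothetical shorthand for the 4x4 matrix of (4.2); it is assumed invertible.
\begin{align*}
0 &= \frac{\partial\rho_b}{\partial\Lambda_0}
   + \sum_{a=1}^{4} J_{ba}\,\frac{d\rho_a^0}{d\Lambda_0}
   && \text{(coefficient of } d\Lambda_0 \text{ in (4.2))} \\
\frac{d\rho_a^0}{d\Lambda_0}
  &= -\sum_{b=1}^{4} (J^{-1})_{ab}\,\frac{\partial\rho_b}{\partial\Lambda_0},
  \qquad (J^{-1})_{ab} = \frac{\partial\rho_a^0}{\partial\rho_b}.
\end{align*}
```

The identification of $J^{-1}$ with $\partial\rho_a^0/\partial\rho_b$ is the inverse function theorem applied to the map $\rho^0 \mapsto \rho$ at fixed scales.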
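Inserting (4.3) into (4.1) we obtain
\[
L[\phi,\Lambda_R,\Lambda_0',\rho^0[\Lambda_0']] - L[\phi,\Lambda_R,\Lambda_0'',\rho^0[\Lambda_0'']] = \int_{\Lambda_0''}^{\Lambda_0'} \frac{d\Lambda_0}{\Lambda_0}\, R[\phi,\Lambda_R,\Lambda_0,\rho^0[\Lambda_0]] ,
\tag{4.4}
\]
with
\[
R[\phi,\Lambda,\Lambda_0,\rho^0] := \Lambda_0\frac{\partial L[\phi,\Lambda,\Lambda_0,\rho^0]}{\partial\Lambda_0} - \sum_{a,b=1}^{4}\frac{\partial L[\phi,\Lambda,\Lambda_0,\rho^0]}{\partial\rho_a^0}\,\frac{\partial\rho_a^0}{\partial\rho_b[\Lambda,\Lambda_0,\rho^0]}\,\Lambda_0\frac{\partial\rho_b[\Lambda,\Lambda_0,\rho^0]}{\partial\Lambda_0} .
\tag{4.5}
\]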
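Following [10] we differentiate (4.5) with respect to Λ:
\[
\begin{aligned}
\Lambda\frac{\partial R}{\partial\Lambda}
&= \Lambda_0\frac{\partial}{\partial\Lambda_0}\Big(\Lambda\frac{\partial L}{\partial\Lambda}\Big)
 - \sum_{a,b=1}^{4}\frac{\partial}{\partial\rho_a^0}\Big(\Lambda\frac{\partial L}{\partial\Lambda}\Big)\frac{\partial\rho_a^0}{\partial\rho_b}\,\Lambda_0\frac{\partial\rho_b}{\partial\Lambda_0} \\
&\quad + \sum_{a,b,c,d=1}^{4}\frac{\partial L}{\partial\rho_a^0}\,\frac{\partial\rho_a^0}{\partial\rho_b}\,\frac{\partial}{\partial\rho_c^0}\Big(\Lambda\frac{\partial\rho_b}{\partial\Lambda}\Big)\frac{\partial\rho_c^0}{\partial\rho_d}\,\Lambda_0\frac{\partial\rho_d}{\partial\Lambda_0} \\
&\quad - \sum_{a,b=1}^{4}\frac{\partial L}{\partial\rho_a^0}\,\frac{\partial\rho_a^0}{\partial\rho_b}\,\Lambda_0\frac{\partial}{\partial\Lambda_0}\Big(\Lambda\frac{\partial\rho_b}{\partial\Lambda}\Big) .
\end{aligned}
\tag{4.6}
\]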
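We have omitted the dependencies for simplicity and made use of the fact that the derivatives with respect to Λ, Λ_0, ρ^0 commute. Using (3.1), with θ1 = θ2 ≡ θ, we compute the terms on the rhs of (4.6):
\[
\begin{aligned}
&\Lambda_0\frac{\partial}{\partial\Lambda_0}\Big(\Lambda\frac{\partial L[\phi,\Lambda,\Lambda_0,\rho^0]}{\partial\Lambda}\Big) \\
&= \sum_{m,n,k,l}\frac{1}{2}\,\Lambda\frac{\partial\Delta^K_{nm;lk}(\Lambda)}{\partial\Lambda}\bigg(2\,\frac{\partial L[\phi,\Lambda,\Lambda_0,\rho^0]}{\partial\phi_{mn}}\,\frac{\partial}{\partial\phi_{kl}}\Big(\Lambda_0\frac{\partial}{\partial\Lambda_0}L[\phi,\Lambda,\Lambda_0,\rho^0]\Big) \\
&\qquad\qquad - \frac{1}{(2\pi\theta)^2}\Big[\frac{\partial^2}{\partial\phi_{mn}\,\partial\phi_{kl}}\Big(\Lambda_0\frac{\partial}{\partial\Lambda_0}L[\phi,\Lambda,\Lambda_0,\rho^0]\Big)\Big]_\phi\bigg)
\;\equiv\; M\Big[L,\Lambda_0\frac{\partial L}{\partial\Lambda_0}\Big] .
\end{aligned}
\tag{4.7}
\]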
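The fourfold sum in (4.6) arises from differentiating the inverse matrix $\partial\rho_a^0/\partial\rho_b$ with respect to Λ. A sketch of this standard identity, again with the illustrative shorthand $J_{bc} := \partial\rho_b/\partial\rho_c^0$ (not notation of the paper) and using that the derivatives commute:

```latex
% J is a hypothetical shorthand; the identity is the usual derivative of an inverse matrix.
\begin{align*}
\sum_{b}(J^{-1})_{ab}\,J_{bc} = \delta_{ac}
\;&\Longrightarrow\;
\Lambda\frac{\partial}{\partial\Lambda}(J^{-1})_{ad}
 = -\sum_{b,c}(J^{-1})_{ab}\Big(\Lambda\frac{\partial J_{bc}}{\partial\Lambda}\Big)(J^{-1})_{cd}, \\
\Lambda\frac{\partial J_{bc}}{\partial\Lambda}
 &= \frac{\partial}{\partial\rho_c^0}\Big(\Lambda\frac{\partial\rho_b}{\partial\Lambda}\Big).
\end{align*}
```

Combined with the overall minus sign of the second term in (4.5), this yields the plus sign in front of the sum over a, b, c, d in (4.6).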
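Similarly, we have
\[
\Lambda\frac{\partial}{\partial\Lambda}\Big(\frac{\partial L[\phi,\Lambda,\Lambda_0,\rho^0]}{\partial\rho_a^0}\Big) = M\Big[L,\frac{\partial L}{\partial\rho_a^0}\Big] .
\tag{4.8}
\]
For (4.6) we also need the function $\Lambda_0\frac{\partial}{\partial\Lambda_0}\big(\Lambda\frac{\partial\rho_b}{\partial\Lambda}\big)$, which is obtained from (4.7) by first expanding L on the lhs according to (3.4) and by further choosing the indices at the A-coefficients according to (3.14). Applying these operations to the rhs of (4.7), we obtain for $U \mapsto \Lambda_0\frac{\partial L}{\partial\Lambda_0}$ or $U \mapsto \frac{\partial L}{\partial\rho_a^0}$ the expansions
\[
M[L,U] = \sum_{N=2}^{\infty}\;\sum_{m_i,n_i\in\mathbb{N}^2}\frac{1}{N!}\,M_{m_1n_1;\dots;m_Nn_N}[L,U]\,\phi_{m_1n_1}\cdots\phi_{m_Nn_N}
\tag{4.9}
\]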
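and the projections
\begin{align}
M_1[L,U] &:= \sum_{\gamma \text{ as in Def. 1.2}} M^{\gamma}_{{}^0_0{}^0_0;{}^0_0{}^0_0}[L,U] , \tag{4.10a}\\
M_2[L,U] &:= \sum_{\gamma \text{ as in Def. 1.2}} \Big(M^{\gamma}_{{}^1_0{}^0_0;{}^0_0{}^1_0}[L,U] - M^{\gamma}_{{}^0_0{}^0_0;{}^0_0{}^0_0}[L,U]\Big) , \tag{4.10b}\\
M_3[L,U] &:= \sum_{\gamma \text{ as in Def. 1.3}} \Big(-M^{\gamma}_{{}^1_0{}^1_0;{}^0_0{}^0_0}[L,U]\Big) , \tag{4.10c}\\
M_4[L,U] &:= \sum_{\gamma \text{ as in Def. 1.1}} M^{\gamma}_{{}^0_0{}^0_0;{}^0_0{}^0_0;{}^0_0{}^0_0;{}^0_0{}^0_0}[L,U] . \tag{4.10d}
\end{align}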
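Since the graphs γ in (4.10) are one-particle irreducible, only the third line of (4.7) can contribute^8 to $M_a$. Using (4.7), (4.8) and (4.10) as well as the linearity of M[L, U] in the second argument we can rewrite (4.6) as
\[
\Lambda\frac{\partial R}{\partial\Lambda} = M[L,R] - \sum_{a=1}^{4}\frac{\partial L}{\partial\rho_a}\,M_a[L,R] ,
\tag{4.11}
\]
where the function
\[
\frac{\partial L}{\partial\rho_a}[\Lambda,\Lambda_0,\rho^0] := \sum_{b=1}^{4}\frac{\partial\rho_b^0}{\partial\rho_a[\Lambda,\Lambda_0,\rho^0]}\,\frac{\partial L[\Lambda,\Lambda_0,\rho^0]}{\partial\rho_b^0}
\tag{4.12}
\]
scales according to
\[
\Lambda\frac{\partial}{\partial\Lambda}\Big(\frac{\partial L}{\partial\rho_a}\Big) = M\Big[L,\frac{\partial L}{\partial\rho_a}\Big] - \sum_{b=1}^{4}\frac{\partial L}{\partial\rho_b}\,M_b\Big[L,\frac{\partial L}{\partial\rho_a}\Big] ,
\tag{4.13}
\]
as a similar calculation shows.
^8 If one-particle reducible graphs are included in the normalisation conditions as discussed at the end of Sect. 3, also the second line of (4.7) must be taken into account.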
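With the definition (4.12), the function R of (4.5) can be written compactly, which makes the structure of (4.11) easier to follow. A sketch with all arguments suppressed; it only relabels the summation indices of (4.5) and (4.12):

```latex
% Rewriting of (4.5) using (4.12); no new input beyond those two equations.
\[
R \;=\; \Lambda_0\frac{\partial L}{\partial\Lambda_0}
  \;-\; \sum_{b=1}^{4}\frac{\partial L}{\partial\rho_b}\,
        \Lambda_0\frac{\partial\rho_b}{\partial\Lambda_0},
\qquad
\frac{\partial L}{\partial\rho_b}
  = \sum_{a=1}^{4}\frac{\partial\rho_a^0}{\partial\rho_b}\,
    \frac{\partial L}{\partial\rho_a^0}.
\]
```

In this form R is the $\Lambda_0$-derivative of L at fixed renormalised parameters $\rho_b$, which is the quantity whose decay controls the convergence for $\Lambda_0 \to \infty$.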
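Next, we also expand (4.5) and (4.12) as power series in the coupling constant:
\begin{align}
\frac{\partial L}{\partial\rho_a}[\phi,\Lambda,\Lambda_0,\rho^0]
&= \sum_{V=0}^{\infty}\lambda^V \sum_{N=2}^{2V+4}\frac{(2\pi\theta)^{\frac{N}{2}-2}}{N!}\sum_{m_i,n_i} H^{a(V)}_{m_1n_1;\dots;m_Nn_N}[\Lambda,\Lambda_0,\rho^0]\,\phi_{m_1n_1}\cdots\phi_{m_Nn_N} , \tag{4.14}\\
R[\phi,\Lambda,\Lambda_0,\rho^0]
&= \sum_{V=1}^{\infty}\lambda^V \sum_{N=2}^{2V+2}\frac{(2\pi\theta)^{\frac{N}{2}-2}}{N!}\sum_{m_i,n_i} R^{(V)}_{m_1n_1;\dots;m_Nn_N}[\Lambda,\Lambda_0,\rho^0]\,\phi_{m_1n_1}\cdots\phi_{m_Nn_N} . \tag{4.15}
\end{align}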
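The differential equations (4.13) and (4.11) can now with (4.9) and (4.10) be written as
\[
\begin{aligned}
&\Lambda\frac{\partial}{\partial\Lambda} H^{a(V)}_{m_1n_1;\dots;m_Nn_N}[\Lambda,\Lambda_0,\rho^0] \\
&= \bigg\{\sum_{N_1=2}^{N}\sum_{V_1=1}^{V}\sum_{m,n,k,l} Q_{nm;lk}(\Lambda)\,A^{(V_1)}_{m_1n_1;\dots;m_{N_1-1}n_{N_1-1};mn}[\Lambda]\,H^{a(V-V_1)}_{m_{N_1}n_{N_1};\dots;m_Nn_N;kl}[\Lambda] \\
&\qquad + \Big(\tbinom{N}{N_1-1}-1\Big)\text{ permutations}\bigg\} - \sum_{m,n,k,l}\frac{1}{2}\,Q_{nm;lk}(\Lambda)\,H^{a(V)}_{m_1n_1;\dots;m_Nn_N;mn;kl}[\Lambda] \\
&- \sum_{V_1=0}^{V} H^{1(V-V_1)}_{m_1n_1;\dots;m_Nn_N}[\Lambda]\,\bigg\{-\sum_{m,n,k,l}\frac{1}{2}\,Q_{nm;lk}(\Lambda)\,H^{a(V_1)}_{{}^0_0{}^0_0;{}^0_0{}^0_0;mn;kl}[\Lambda]\bigg\}_{[\text{Def. 1.2}]} \\
&- \sum_{V_1=0}^{V} H^{2(V-V_1)}_{m_1n_1;\dots;m_Nn_N}[\Lambda] \\
&\qquad \times \bigg\{-\sum_{m,n,k,l}\frac{1}{2}\,Q_{nm;lk}(\Lambda)\Big(H^{a(V_1)}_{{}^1_0{}^0_0;{}^0_0{}^1_0;mn;kl}[\Lambda] - H^{a(V_1)}_{{}^0_0{}^0_0;{}^0_0{}^0_0;mn;kl}[\Lambda]\Big)\bigg\}_{[\text{Def. 1.2}]} \\
&+ \sum_{V_1=0}^{V} H^{3(V-V_1)}_{m_1n_1;\dots;m_Nn_N}[\Lambda]\,\bigg\{-\sum_{m,n,k,l}\frac{1}{2}\,Q_{nm;lk}(\Lambda)\,H^{a(V_1)}_{{}^1_0{}^1_0;{}^0_0{}^0_0;mn;kl}[\Lambda]\bigg\}_{[\text{Def. 1.3}]} \\
&- \sum_{V_1=1}^{V} H^{4(V-V_1)}_{m_1n_1;\dots;m_Nn_N}[\Lambda] \\
&\qquad \times \bigg\{-\sum_{m,n,k,l}\frac{1}{2}\,Q_{nm;lk}(\Lambda)\,H^{a(V_1)}_{{}^0_0{}^0_0;{}^0_0{}^0_0;{}^0_0{}^0_0;{}^0_0{}^0_0;mn;kl}[\Lambda]\bigg\}_{[\text{Def. 1.1}]} ,
\end{aligned}
\tag{4.16}
\]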
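\[
\begin{aligned}
&\Lambda\frac{\partial}{\partial\Lambda} R^{(V)}_{m_1n_1;\dots;m_Nn_N}[\Lambda,\Lambda_0,\rho^0] \\
&= \bigg\{\sum_{N_1=2}^{N}\sum_{V_1=1}^{V-1}\sum_{m,n,k,l} Q_{nm;lk}(\Lambda)\,A^{(V_1)}_{m_1n_1;\dots;m_{N_1-1}n_{N_1-1};mn}[\Lambda]\,R^{(V-V_1)}_{m_{N_1}n_{N_1};\dots;m_Nn_N;kl}[\Lambda] \\
&\qquad + \Big(\tbinom{N}{N_1-1}-1\Big)\text{ permutations}\bigg\} - \sum_{m,n,k,l}\frac{1}{2}\,Q_{nm;lk}(\Lambda)\,R^{(V)}_{m_1n_1;\dots;m_Nn_N;mn;kl}[\Lambda] \\
&- \sum_{V_1=1}^{V} H^{1(V-V_1)}_{m_1n_1;\dots;m_Nn_N}[\Lambda]\,\bigg\{-\sum_{m,n,k,l}\frac{1}{2}\,Q_{nm;lk}(\Lambda)\,R^{(V_1)}_{{}^0_0{}^0_0;{}^0_0{}^0_0;mn;kl}[\Lambda]\bigg\}_{[\text{Def. 1.2}]} \\
&- \sum_{V_1=1}^{V} H^{2(V-V_1)}_{m_1n_1;\dots;m_Nn_N}[\Lambda] \\
&\qquad \times \bigg\{-\sum_{m,n,k,l}\frac{1}{2}\,Q_{nm;lk}(\Lambda)\Big(R^{(V_1)}_{{}^1_0{}^0_0;{}^0_0{}^1_0;mn;kl}[\Lambda] - R^{(V_1)}_{{}^0_0{}^0_0;{}^0_0{}^0_0;mn;kl}[\Lambda]\Big)\bigg\}_{[\text{Def. 1.2}]} \\
&+ \sum_{V_1=1}^{V} H^{3(V-V_1)}_{m_1n_1;\dots;m_Nn_N}[\Lambda]\,\bigg\{-\sum_{m,n,k,l}\frac{1}{2}\,Q_{nm;lk}(\Lambda)\,R^{(V_1)}_{{}^1_0{}^1_0;{}^0_0{}^0_0;mn;kl}[\Lambda]\bigg\}_{[\text{Def. 1.3}]} \\
&- \sum_{V_1=1}^{V} H^{4(V-V_1)}_{m_1n_1;\dots;m_Nn_N}[\Lambda] \\
&\qquad \times \bigg\{-\sum_{m,n,k,l}\frac{1}{2}\,Q_{nm;lk}(\Lambda)\,R^{(V_1)}_{{}^0_0{}^0_0;{}^0_0{}^0_0;{}^0_0{}^0_0;{}^0_0{}^0_0;mn;kl}[\Lambda]\bigg\}_{[\text{Def. 1.1}]} .
\end{aligned}
\tag{4.17}
\]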
We have used several times symmetry properties of the expansion coefficients and of the propagator and the fact that to the 1PI projections (4.10) only the last line of (4.9) can contribute. By {. . . }[Def. 1.?] we understand the restriction to H -graphs and R-graphs, respectively, which satisfy the index criteria on the trajectories as given in Definition 1. The H -graphs will be constructed later in Sect. 4.2. The R-graphs are in their structure identical to the previously considered graphs for the A-functions, but have a different meaning. See Sect. 4.4.
4.2. Initial data and graphs for the auxiliary functions. Next, we derive the bounds for the H-functions. Inserting (3.3) into the definition (4.12) we obtain immediately the initial condition at Λ = Λ_0:
1(V )
Hm1 n1 ;...;mN nN [ 0 , 0 , ρ 0 ] = δN2 δ V 0 δn1 m2 δn2 m1 , 2(V ) Hm1 n1 ;...;mN nN [ 0 , 0 , ρ 0 ] 3(V ) Hm1 n1 ;...;mN nN [ 0 , 0 , ρ 0 ]
'
= δN2 δ
V0
(m11 +n11 +m21 +n21 )δn1 m2 δn2 m1
(4.18) ,
(4.19)
'
= −δN2 δ + n12 m12 δn1 ,m1 −1 δn1 −1,m1 δn2 ,m2 δn2 ,m2 1 2 2 1 1 2 2 1 ' ' +( n21 m21 δn2 ,m2 +1 δn2 +1,m2 + n22 m22 δn2 ,m2 −1 δn2 −1,m2 )δn1 ,m1 δn1 ,m1 , (4.20) 2 1 2 2 1 2 1 1 1 2 2 1 4(V ) 0 V0 1 Hm1 n1 ;...;mN nN [ 0 , 0 , ρ ] = δN4 δ δn1 m2 δn2 m3 δn3 m4 δn4 m1 + 5 permutations . 6 (4.21) V0
n11 m11 δn1 ,m1 +1 δn1 +1,m1 1 2 2 1
We first compute $H^{a(0)}_{m_1n_1;\dots;m_4n_4}[\Lambda]$ for a ∈ {1, 2, 3}. Since there is no 6-point function at order 0 in V, the differential equation (4.16) reduces to
\[
\Lambda\frac{\partial H^{a(0)}_{m_1n_1;\dots;m_4n_4}[\Lambda]}{\partial\Lambda}
= \sum_{b=1}^{3}\;\sum_{m,n,k,l,m',n',k',l'} C^{b}_{m'n';k'l'}\,H^{b(0)}_{m_1n_1;\dots;m_4n_4}[\Lambda]\;Q_{nm;lk}(\Lambda)\,H^{a(0)}_{m'n';k'l';mn;kl}[\Lambda] ,
\tag{4.22}
\]
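for certain coefficients $C^{b}_{m'n';k'l'}$. The solution is due to (3.6) given by
\[
\begin{aligned}
H^{a(0)}_{m_1n_1;\dots;m_4n_4}[\Lambda]
&= H^{a(0)}_{m_1n_1;\dots;m_4n_4}[\Lambda_0] \\
&+ \sum_{b=1}^{3}\;\sum_{m,n,k,l,m',n',k',l'} C^{b}_{m'n';k'l'}\,H^{b(0)}_{m_1n_1;\dots;m_4n_4}[\Lambda_0] \\
&\qquad\times \big(\Delta^K_{nm;lk}(\Lambda) - \Delta^K_{nm;lk}(\Lambda_0)\big)\,H^{a(0)}_{m'n';k'l';mn;kl}[\Lambda_0] \\
&+ \sum_{b=1}^{3}\;\sum_{m,n,k,l,m',n',k',l'} C^{b}_{m'n';k'l'}\,H^{b(0)}_{m_1n_1;\dots;m_4n_4}[\Lambda_0] \\
&\qquad\times \big(\Delta^K_{nm;lk}(\Lambda) - \Delta^K_{nm;lk}(\Lambda_0)\big) \\
&\qquad\times \sum_{b'=1}^{3}\;\sum_{m'',n'',k'',l'',m''',n''',k''',l'''} C^{b'}_{m'''n''';k'''l'''}\,H^{b'(0)}_{m'n';k'l';mn;kl}[\Lambda_0] \\
&\qquad\times \big(\Delta^K_{n''m'';l''k''}(\Lambda) - \Delta^K_{n''m'';l''k''}(\Lambda_0)\big)\,H^{a(0)}_{m'''n''';k'''l''';m''n'';k''l''}[\Lambda_0] \\
&+ \dots .
\end{aligned}
\tag{4.23}
\]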
With the initial conditions (4.18)–(4.20) we get
\[
H^{a(0)}_{m_1n_1;\dots;m_4n_4}[\Lambda,\Lambda_0,\rho^0] \equiv 0 \qquad \text{for } a \in \{1,2,3\} .
\tag{4.24}
\]
Inserting (4.24) into the rhs of (4.16) we see that $H^{a(0)}_{m_1n_1;m_2n_2}$ for a ∈ {1, 2, 3} and $H^{4(0)}_{m_1n_1;\dots;m_4n_4}$ are constant, which means that the relations (4.18)–(4.21) hold actually at any value Λ and not only at Λ = Λ_0. We need a graphical notation for the H-functions. We represent the base functions (4.18)–(4.21), valid for any Λ, as follows:
$H^{1(0)}_{{}^{m^1}_{m^2}{}^{n^1}_{n^2};{}^{n^1}_{n^2}{}^{m^1}_{m^2}}[\Lambda]$ = [Figure: two-leg graph with a hole labelled 1] , (4.25)
$H^{2(0)}_{{}^{m^1}_{m^2}{}^{n^1}_{n^2};{}^{n^1}_{n^2}{}^{m^1}_{m^2}}[\Lambda]$ = [Figure: two-leg graph with a hole labelled 2] , (4.26)
$H^{3(0)}_{{}^{m^1+1}_{m^2}{}^{n^1+1}_{n^2};{}^{n^1}_{n^2}{}^{m^1}_{m^2}}[\Lambda]$ = [Figure: two-leg graph with a hole labelled 3 and shifted indices] , (4.27)
$H^{4(0)}_{{}^{m^1}_{m^2}{}^{n^1}_{n^2};{}^{n^1}_{n^2}{}^{k^1}_{k^2};{}^{k^1}_{k^2}{}^{l^1}_{l^2};{}^{l^1}_{l^2}{}^{m^1}_{m^2}}[\Lambda] = \frac{1}{6}$ [Figure: four-leg graph with a hole] + 5 permutations . (4.28)
The special vertices stand for some sort of hole into which we can insert planar two- or four-point functions at vanishing external indices. However, the graph remains connected at these holes; in particular, there is index conservation at the hole ◦◦ and a jump by ${}^1_0$ or ${}^0_1$ at the hole of (4.27). By repeated contraction with A-graphs and self-contractions we build out of (4.25)–(4.28) more complicated graphs with holes. We use this method to compute $H^{4(0)}_{m_1n_1;m_2n_2}[\Lambda]$. In this case, we need the planar and non-planar self-contractions of (4.28):
$\sum_{m,n,k,l} Q_{mn;kl}\,H^{4(0)}_{m_1n_1;m_2n_2;mn;kl} = \frac{1}{3}$ [Figure: first planar self-contraction] $+ \frac{1}{3}$ [Figure: second planar self-contraction] $+ \frac{1}{3}$ [Figure: non-planar self-contraction] (4.29)
These contractions correspond (with a factor $-\frac{1}{2}$) to the last term in the third line of (4.16), for a = 4 and N = 2. We also have to subtract (again up to the factor $-\frac{1}{2}$) the fourth to last lines of (4.16). For instance, the fourth line amounts to inserting the planar graphs of (4.29) with $m_1 = n_1 = m_2 = n_2 = {}^0_0$ into (4.25). The total contribution to the rhs of (4.16) corresponding to the first graph in (4.29), with $m_1 = n_2 = {}^{m^1}_{m^2}$ and $n_1 = m_2 = {}^{n^1}_{n^2}$, reads
[Figure: the first planar graph of (4.29) minus its subtraction graphs, arranged in two lines] (4.30)
The second graph in the first line of (4.30) corresponds to the fourth line of (4.16). The second line of (4.30) represents the fifth/sixth lines of (4.16), undoing the symmetry properties of the upper and lower component used in (4.16). The difference of graphs corresponding to the ${}^{n^1}_{n^2}$ component vanishes, because the value of the graph is independent of ${}^{n^1}_{n^2}$. There is no planar contribution from the last three lines of (4.16). In total, we get the projection (3.29) to the irrelevant part of the graph. The same procedure leads to the irrelevant part of the second graph in (4.29). With these considerations, the differential equation (4.16) takes for N = 2 and a = 4 the form
\[
\begin{aligned}
\Lambda\frac{\partial}{\partial\Lambda} H^{4(0)}_{m_1n_1;m_2n_2}[\Lambda]
&= -\frac{1}{6}\,\delta_{n_1m_2}\delta_{n_2m_1}\Big(\sum_{l\in\mathbb{N}^2} Q^{(1)}_{n_1l;ln_1}(\Lambda) + \sum_{l\in\mathbb{N}^2} Q^{(1)}_{n_2l;ln_2}(\Lambda)\Big) \\
&\quad - \frac{1}{6}\,Q_{m_1n_1;m_2n_2}(\Lambda) .
\end{aligned}
\tag{4.31}
\]
The first line comes from the planar graphs in (4.29) and the subtraction terms according to (4.30), whereas the second line of (4.31) is obtained from the last (non-planar) graph in (4.29). Using the initial condition (4.21) at = 0 , the bounds (3.21) and (3.29) combined with the volume factor (C2 θ 2 )2 for the l-summation we get C m1 2∞ + n1 2∞ C0 4(0) δn1 m2 δn2 m1 + δn m δn m .(4.32) |Hm1 n1 ;m2 n2 [ ]| ≤ θ 2 θ 2 1 2 2 1 (1)
It is extremely important here that the irrelevant projection $Q^{(1)}_{nl;ln}$ and not the propagator $Q_{nl;ln}$ itself appears in the first line of (4.31).
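To illustrate why it matters that only an irrelevant integrand appears, consider a simplified model in which the rhs of the flow equation is bounded by $c/(\theta\Lambda'^2)$ with a constant c. This is a toy integrand chosen for illustration, not the precise bound (3.29). Integrating from Λ_0 down to Λ gives:

```latex
% Toy estimate: c is a hypothetical constant, \Lambda' is the integration variable.
\[
\int_{\Lambda}^{\Lambda_0}\frac{d\Lambda'}{\Lambda'}\,\frac{c}{\theta\Lambda'^2}
 = \frac{c}{2\theta}\Big(\frac{1}{\Lambda^2}-\frac{1}{\Lambda_0^2}\Big)
 \;\le\; \frac{c}{2\,\theta\Lambda^2}.
\]
```

The result is bounded uniformly in Λ_0. A Λ'-independent (marginal) integrand would instead produce $c\,\ln(\Lambda_0/\Lambda)$, which diverges as Λ_0 → ∞.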
4.3. The power-counting behaviour of the auxiliary functions. The example suggests that similar cancellations of relevant and marginal parts appear in general, too. Thus, we expect all H -functions to be irrelevant. This is indeed the case: Proposition 3. Let γ be a ribbon graph with holes having N external legs, V vertices, V e external vertices and segmentation index ι, which is drawn on a genus-g Riemann a(V ,V e ,B,g,ι)γ surface with B boundary components. Then, the contribution Hm1 n1 ;...;mN nN of γ to the expansion coefficient of the auxiliary function of a duality-covariant φ 4 -theory on R4θ in the matrix base is bounded as follows: 1. For γ according to Definition 1.1 we have a(V ,V e ,1,0,0)γ H m1 n1 n1 k1 k1 l 1 l 1 m1 [ , 0 , ρ0 ] γ as in Def. 1.1
; ; ; m2 n2 n2 k 2 k 2 l 2 l 2 m2
−δ 1a 4V −2+2δ a4 ≤ θ 2 P1−δ V 0 ×P 2V −1+δ
a4
! m1 n1
ln ,
R
n1 k 1 k 1 l 1 l 1 m1 m2 n2 ; n2 k 2 ; k 2 l 2 ; l 2 m2 θ 2
where all vertices on the trajectories contribute to V e .
1 3V −1+δ a4 −V e (4.33)
2. For γ according to Definition 1.2 we have a(V ,V e ,1,0,0)γ H m1 n1 n1 m1 [ , 0 , ρ0 ] γ as in Def. 1.2
;
m2 n2 n2 m2
1a 2 1−δ
≤ θ
4V +2δ a4 P2−δ a1 a2 V 0 (2δ +δ )
1 3V +δ a4 −V e
×
P 2V +δ
a4
! m1 n1
n1 m1 m2 n2 ; n2 m2 θ 2
ln
,
R
(4.34)
where all vertices on the trajectories contribute to V e . 3. For γ according to Definition 1.3 we have a(V ,V e ,1,0,0) H m1 +1 n1 +1 n1 m1 [ , 0 , ρ0 ] m2
γ as in Def. 1.3
≤ θ
×
1a 2 1−δ
n2
; n2
4V +2δ a4 P2−δ V0
1 3V +δ a4 −V e
m2
! m1 +1 n1 +1 m2
P 2V +δ
a4
n2
1
1
m ; nn2 m 2
θ 2
ln ,
R
(4.35)
where all vertices on the trajectories contribute to V e . 4. If γ is a subgraph of an 1PI planar graph with a selected set T of trajectories on one distinguished boundary component and a second set T of summed trajectories on that boundary component, we have a(V ,V e ,B,0,ι)γ H m1 n1 ;...;mN nN [ , 0 , ρ0 ] Es Et
(2−δ 1a − N )+2(1−B) 4V +2+2δ a4 −N 2 ≤ θ 2 P 2t +
−−−−→ −−− −−→ min(2, 21 nj o[nj ]) n j o[nj ]∈T
!
m1 n1 ; . . . ; mN nN θ 2
1 3V − N +δ a4 +B−V e −ι+s+t
a4 N 2 × . P 2V +1+δ − 2 ln
R
(4.36)
The number of summations is now restricted by s + t ≤ V e +ι. 5. If γ is a non-planar graph or a graph with N > 4 external legs, we have a(V ,V e ,B,g,ι) H [ , 0 , ρ0 ] Es
m1 n1 ;...;mN nN
! (2−δ a1 − N )+2(1−B−2g) 4V +2+2δ a4 −N m1 n1 ; . . . ; mN nN 2 ≤ θ 2 P0 θ 2 1 3V − N +δ a4 +B+2g−V e −ι+s
a4 N 2 . (4.37) P 2V +1+δ − 2 ln ×
R
The number of summations is now restricted by s ≤ V e +ι. Proof. The proposition will be proven by induction upward in the number V of vertices and for given V downward in the number N of external legs.
5. Taking (3.33) into account, the estimations (4.33)–(4.36) are further bound by (4.37). In particular, the inequality (4.37) correctly reproduces the bounds for V = 0 derived in Sect. 4.2. By comparison with (3.39), the estimation (4.37) follows immediately for the H -linear parts on the rhs of (4.16) which contribute to the integrand of a(V ,V e ,B,g,ι) Hm1 n1 ;...;mN nN [ ]. Since planar two- and four-point functions are preliminarily excluded, the -integration (from 0 down to ) confirms (4.37) for those contributions which arise from H -linear terms on the rhs of (4.16) that are non-planar or have N > 4 external legs. We now consider in the H -bilinear part on the rhs of (4.16) the contributions of non-planar graphs or graphs with N > 4 external legs. We start with the fourth line in (4.16), with the first term being a non-planar H -function which (apart from the number of vertices and the hole label a) has the same topological data as the total H -graph to estimate. From the induction hypothesis it is clear that the term in braces { } is bounded by the planar unsummed version (B1 = 1, g1 = 0, ι1 = 0, s1 = 0) of (4.36), with N1 = 2 and T = T = ∅, and with a reduction of the degree of the polynomial in ln
R by 1: + , a(V1 ) −1 Q ( )H [ ] nm;lk 0 0 0 0 Def. 1.2 2 0 0 ; 0 0 ;mn;kl m,n,k,l
(1−δ a1 ) 1 3V1 +δ a4 −V1e 2V −1+δ a4
≤ θ 2 P 1 ln ,
R 1(V −V ,V e ,B,g,ι) 1 H [ ] Es
(4.38a)
m1 n1 ;...;mN nN
(1− N )+2(1−B−2g) 4(V −V1 )+2−N m1 n1 ; . . . ; mN nN 2 ≤ θ 2 P0 θ 2 1 3(V −V1 )− N +B+2g−V e −ι+s N
2 P 2(V −V1 )+1− 2 ln . (4.38b) ×
R
We can ignore the term $P^a_b[\,\cdot\,]$, see (3.32), in (4.38a) because the external indices of that part are zero. In the first step we exclude a = 4 so that the sum over V1 in (4.16) starts due to (4.18)–(4.20) at V1 = 1. For V1 = V there is a contribution to (4.38b) with N = 2 only, where (4.38a) can be regarded as known by induction. Since the factor $(\frac{1}{\Omega})^{-V_1^e}$ can safely be absorbed in the polynomial $P[\ln\frac{\Lambda}{\Lambda_R}]$, the product of (4.38a) and (4.38b) confirms the bound (4.37) for the integrand under consideration, preliminarily for a ≠ 4. In the next step we repeat the argumentation for a = 4, where (4.38b), with V1 = 0, is known from the first step. Second, we consider the fifth/sixth lines in (4.16). The difference of functions in braces { } involves graphs with constant index along the trajectories. We have seen in Sect. 3.3 that such a difference can be written as a sum of graphs each having a composite propagator (3.28) at a trajectory. As such the (θΛ²)-degree of the part in braces { } is reduced^9 by 1 compared with planar analogues of (4.37) for N = 2.
^9 The origin of the reduction is the term $P^a_b[\,\cdot\,]$ introduced in (3.32), with b = 1 in the presence of a composite propagator (3.28). The argument in the brackets of $P^a_b[\,\cdot\,]$ is the ratio of the maximal external index to the reference scale θΛ². Since the maximal index along the trajectory is 1, we can globally estimate in this case $P^a_1[\,\cdot\,]$ by a constant times $(\theta\Lambda^2)^{-1}$.
The difference of functions in braces { } involves also graphs where the index along
one of the trajectories jumps once by 01 or 01 and back. For these graphs we conclude from (3.24) (and the fact that the maximal index along the trajectory is 2) that the (θ 2 )-degree of the part in braces { } is also reduced by 1: + , a(V1 ) a(V1 ) −1 Qnm;lk ( ) H 1 0 0 1 [ ] − H 0 0 0 0 [ ] [Def. 1.2] 2 0 0 ; 0 0 ;mn;kl 0 0 ; 0 0 ;mn;kl m,n,k,l
(−δ a1 ) 1 3V1 +δ a4 (1−V1e ) 2V −1+δ a4
≤ θ 2 P 1 ln ,
R 2(V −V ,V e ,B,g,ι) 1 H [ ]
(4.39a)
m1 n1 ;...;mN nN
Es
(2− N )+2(1−B−2g) 4(V −V1 )+2−N m1 n1 ; . . . ; mN nN 2 ≤ θ 2 P0 θ 2 N 1 3(V −V1 )− +1+B+2g−V e −ι+s N
2 P 2(V −V1 )+1− 2 ln . ×
R
(4.39b)
Again we have to exclude a = 4 in the first step, which then confirms the bound (4.37) for the integrand under consideration. In the second step we repeat the argumentation for a = 4. Third, the discussion of the seventh line of (4.16) is completely similar, because there the index on each trajectory jumps once by 01 or 01 . This leads again to a reduction by 1 of the (θ 2 )-degree of the part in braces { } compared with planar analogues of (4.37) for N = 2. Finally, the part in braces in the last line of (4.16) can be estimated by a planar N = 4 version of (4.37), again with a reduction by 1 of the degree of P [ln
R ]: + , a(V1 ) −1 Qnm;lk ( )H 0 0 0 0 0 0 0 0 [ ] ; ; ; ;mn;kl [Def. 1.1] 2 0 0 0 0 0 0 0 0 m,n,k,l
−δ a1 1 3V1 −1+δ a4 −V1e 2V −2+δ a4
≤ θ 2 ln , P 1
R 4(V −V ,B,g,V e ,ι) 1 H [ ] m1 n1 ;...;mN nN
(4.40a)
Es
(2− N )+2(1−B−2g) 4(V −V1 )+4−N m1 n1 ; . . . ; mN nN 2 ≤ θ 2 P0 θ 2 1 3(V −V1 )− N +1+B+2g−V e −ι+s N
2 P 2(V −V1 )+2− 2 ln . (4.40b) ×
R
We confirm again the bound (4.37) for the integrand under consideration. Since we have assumed that the total H -graph is non-planar or has N > 4 external legs, the integrand (4.37) is irrelevant so that we obtain after integration from 0 down to (and use of the initial conditions (4.18)–(4.21)) the same bound (4.37) for the graph, too. 4. According to Sect. 4.2, the inequality (4.36) is correct for V = 0. By comparison with (3.38), the estimation (4.36) follows immediately for the H -linear parts on the rhs a(V ,V e ,B,g,ι) of (4.16) which contribute to the integrand of Hm1 n1 ;...;mN nN [ ]. Excluding planar two- and four-point functions with constant index on the trajectory or with limited
jump according to 1–3 of Definition 1, the -integration confirms (4.36) for those contributions which arise from H -linear terms on the rhs of (4.16) that correspond to subgraphs of planar graphs (subject to the above restrictions). The proof of (4.36) for the H -bilinear terms in (4.16) is completely analogous to the non-planar case. We only have to replace (4.38b), (4.39b) and (4.40b) by the adapted version of (4.36). In particular, the distinguished trajectory with its subsets T , T of indices comes exclusively from the (4.36)-analogues of (4.38b), (4.39b) and (4.40b) and not from the terms in braces in (4.16). 1. We first consider a = 4. Then, according to (4.18)–(4.20) we need V ≥ 1 in order to have a non-vanishing contribution to (4.33). Since according to Definition 1.1 the index along each trajectory of the (planar) graph γ is constant, we have 1 a(V ,V e ,1,0,0) a(V ,V e ,1,0,0)γ Hm1 n1 ;m2 n2 ;m3 n3 ;m4 n4 [ ] = Hm1 m2 ;m2 m3 ;m3 m4 ;m4 n1 [ ] + 5 permutations. (4.41) 6 Then, using (4.18)–(4.21) and the fact that γ is 1PI, the differential equation (4.16) reduces to ∂ a(V ,V e ,1,0,0)γ
Hm1 n1 ;m2 n2 ;m3 n3 ;m4 n4 [ , 0 , ρ 0 ] ∂
a=4 γ as in Def. 1.1 1 a(V ,V e ,B,0,ι) = − Qnm;lk ( ) Hm1 m2 ;m2 m3 ;m3 m4 ;m4 n1 ;mn;kl [ ] 12 m,n,k,l a(V ,V e ,B,0,ι) −H00;00;00;00;mn;kl [ ] + 5 permutations [Def. 1.1]
+the 4th to last lines of (4.16) with
V V1 =0
→
V −1
.
(4.42)
V1 =1
a(V ,V e ,B,0,ι) H00;00;00;00;mn;kl [ ]
in the third line of (4.42) comes from the (V1 = Here, the term V )-contribution of the last two lines in (4.16), together with (4.21). In the same way as in Sect. 3.3 we conclude that the second and third lines of (4.42) can be written as a linear combination of graphs having a composite propagator (3.17a) on one of the trajectories. As such we have to replace the bound (3.22) relative to the contribution of an ordinary propagator by (3.28). For the total graph this amounts to multiply the cori responding estimation (4.36) of ordinary H -graphs with N = 4 by a factor maxθ
m , 2
which yields the subscript 1 of the part P14V −2+2δ [ ] of the integrand (4.33), for the time being restricted to the second and third lines of (4.42). Since the resulting integrand is irrelevant, we also obtain (4.33) after -integration from 0 down to . Clearly, this is the only contribution for V = 1 so that (4.33) is proven for V = 1 and a = 4. In the second step we use this result to extend the proof to V = 1 and a = 4. Now the differential equation (4.16) reduces to the second and third lines of (4.42), with a = 4, and the fourth to sixth lines of (4.16) with V = 1 and V1 = 0. There is no contribution from the seventh line of (4.16) for V1 = 0, because the part in braces would be non-planar, which is excluded in Definition 1.3. Inserting (4.21) we obtain the composite propagator (3.17b) in the part in braces { } of the fifth and sixth lines of (4.16). Together with (4.33) for V = 1 and a = 4 already proven we verify the integrand (4.33) for V = 1 and a = 4. After -integration we thus obtain (4.33) for V = 1 and any a. a4
H. Grosse, R. Wulkenhaar
This allows us to use (4.33) as the induction hypothesis for the remaining contributions in the last line of (4.42). This is similar to the procedure in 5, we only have to replace (4.38b), (4.39b) and (4.40b) by the according parametrisation of (4.33). We thus prove (4.33) to all orders. 2. We first consider a = 4. Then, according to (4.18)–(4.20) only terms with V1 ≥ 1 contribute to (4.16). Using (4.18)–(4.21) and the fact that γ is 1PI, the differential equation (4.16) reduces to ∂ a(V ,V e ,1,0,0)γ
H m1 n1 n1 m1 [ , 0 , ρ 0 ] a=4 ∂
; γ as in Def. 1.2 m2 n2 n2 m2 1 a(V ,V e ,B,0,ι) a(V ,V e ,B,0,ι) =− Qnm;lk ( ) H m1 n1 n1 m1 [ ] − H 0 0 0 0 [ ] 2 ; ;mn;kl 0 0 ; 0 0 ;mn;kl m2 n2 n2 m2 m,n,k,l a(V ,V e ,B,0,ι) a(V ,V e ,B,0,ι) −m1 H 1 0 0 1 [ ] − H 0 0 0 0 [ ] 0 0 ; 0 0 ;mn;kl 0 0 ; 0 0 ;mn;kl a(V ,V e ,B,0,ι) a(V ,V e ,B,0,ι) 1 [ ] − H 0 0 0 0 [ ] −n H 0 1 1 0 ; ;mn;kl ; ;mn;kl 0 0 0 0 0 0 0 0 a(V ,V e ,B,0,ι) a(V ,V e ,B,0,ι) 2 [ ] − H 0 0 0 0 [ ] −m H 0 0 0 0 ; ;mn;kl ; ;mn;kl 1 0 0 1 0 0 0 0 a(V ,V e ,B,0,ι) a(V ,V e ,B,0,ι) 2 −n H 0 0 0 0 [ ] − H 0 0 0 0 [ ] (4.43a) 0 1 ; 1 0 ;mn;kl
0 0 ; 0 0 ;mn;kl
[Def. 1.2]
1 4(0) a(V ) Qnm;lk ( )H 0 0 0 0 0 0 0 0 [ ] −H m1 n1 n1 m1 [ ] − 2 ; 0 0 ; 0 0 ; 0 0 ; 0 0 ;mn;kl [Def. 1.1] m2 n2 n2 m2 m,n,k,l
+the 4th to last lines of (4.16) with
V V1 =0
→
V −1
(4.43b) .
(4.43c)
V1 =1
If the graphs have constant indices along the trajectories, we conclude in the same way as in Appendix B.1 that the part (4.43a) can be written as a linear combination of graphs having either a composite propagator (3.17b) or two composite propagators (3.17a) on the trajectories. As such we have to replace the bound (3.22) relative to the contribution of an ordinary propagator by (3.29) or twice (3.22) by (3.28). For the total graph this amounts to multiply the corresponding estimation (4.36) of r ,nr ) 2 ordinary H -graphs with N = 2 by a factor max(m , which yields the subscript θ 2
2 of the part P24V +2δ [ ] of the integrand (4.34), for the time being restricted to the part (4.43a). For graphs with index jump in Definition 1.2 we obtain according r ,nr ) 2 . Next, the product of to Appendix B.1 the same improvement by max(m θ 2 (4.32) with (4.40a) gives for (4.43b) the same bound (4.34) for the integrand. Since the resulting integrand is irrelevant, we also obtain (4.34) after -integration. Clearly, this is the only contribution for V = 1 so that (4.34) is proven for V = 1 and a = 4. In the second step we use this result to extend the proof to V = 1 and a = 4. Now the differential equation (4.16) reduces to the sum of (4.43a) and (4.43b), with a = 4, and the fourth to sixth lines of (4.16) with V = 1 and V1 = 0. There is again no contribution of the seventh line of (4.16) for V1 = 0. Inserting (4.21) we obtain the composite propagators (3.17b) in the fifth/sixth lines of (4.16), which together with (4.34) for V = 1 and a = 4 already proven verifies the integrand (4.34) for V = 1 and a = 4. After -integration we thus obtain (4.34) for V = 1 and any a. a4
This allows us to use (4.34) as the induction hypothesis for the remaining contributions (4.43c) for V > 1. This is similar to the procedure in 5; we only have to replace (4.38b), (4.39b) and (4.40b) by the corresponding parametrisation of (4.34). We thus prove (4.34) to all orders.
3. The proof of (4.35) is performed along the same lines as the proof of (4.33) and (4.34). There is one factor $\frac{\max(m^r,n^r)}{\theta\Lambda^2}$ from $\langle\overrightarrow{n_1 o[n_1]}\rangle + \langle\overrightarrow{n_2 o[n_2]}\rangle = 2$ in (4.36) and a second factor from the composite propagator (3.28) or (3.30) appearing according to Appendix B.1 in the $(V_1 = V)$-contribution to (4.16). This finishes the proof of Proposition 3.
4.4. The power-counting behaviour of the $\Lambda_0$-varied functions. The estimations in Propositions 2 and 3 allow us to estimate the R-functions by integrating the differential equation (4.17). Again, the R-functions are expanded in terms of ribbon graphs. Let us look at R-ribbon graphs of the type described in Definition 1.1. Since
\[
A^{(V)\gamma}_{\substack{0\\0}\substack{0\\0};\substack{0\\0}\substack{0\\0};\substack{0\\0}\substack{0\\0};\substack{0\\0}\substack{0\\0}}[\Lambda,\Lambda_0,\rho^0]\Big|_{\gamma\text{ as in Def. 1.1}} \equiv \rho_4[\Lambda,\Lambda_0,\rho^0] ,
\]
we can rewrite the expansion coefficients of (4.5) as follows:
(V )γ
R m1 n1
γ as in Def. 1.1
= 0
∂ ∂ 0
1 1 1 1 1 1 ;n k ;k l ;l m m2 n2 n2 k 2 k 2 l 2 l 2 m2
−
4
×
0
∂ ∂ρa0
(V )γ
A m1 n1 n1 k1 k1 l 1 l 1 m1 [ , 0 , ρ 0 ] 2 2; 2 2; 2 2; 2 2 γ as in Def. 1.1 m n n k k l l m 0 0 0 0 0 [ , 0 , ρ ]
a,b=1 (V )γ 0 0 0 0;0 0;0 0;0
−A 0 0
(V )γ
A m1 n1 n1 k1 k1 l 1 l 1 m1 [ , 0 , ρ 0 ] ; ; ; m2 n2 n2 k 2 k 2 l 2 l 2 m2 as in Def. 1.1 0 0 0 0 0 [ , 0 , ρ ]
γ (V )γ 0 0 0 0;0 0;0 0;0
−A 0 0
[ , 0 , ρ 0 ]
0
∂ρa0 ∂ρb [ , 0 , ρ 0 ]
0
∂ ρb [ , 0 , ρ 0 ] . ∂ 0
(4.44)
This means that (by construction) only the ( 0 , ρ 0 )-derivatives of the projection to the irrelevant part (3.35a) of the planar four-point function contributes to R. Similarly, only the ( 0 , ρ 0 )-derivatives of the irrelevant parts (3.36a) and (3.37a) of the planar twopoint function contribute to R. According to the initial condition (3.3), these projections and the other functions given in Definition 1.4 vanish at = 0 independently of 0 or ρa0 : ∂ (V ,V e ,1,0,0)γ 0 = 0 A m1 n1 n1 k1 k1 l 1 l 1 m1 [ 0 , 0 , ρ 0 ] ∂ 0 ; ; ; m2 n2 n2 k 2 k 2 l 2 l 2 m2 e (V ,V ,1,0,0)γ −A 0 0 0 0 0 0 0 0 [ 0 , 0 , ρ 0 ] 0 0;0 0;0 0;0 0
γ as in Def. 1.1
0 0;0 0;0 0;0 0
γ as in Def. 1.1
∂ (V ,V e ,1,0,0)γ = 0 A m1 n1 n1 k1 k1 l 1 l 1 m1 [ 0 , 0 , ρ 0 ] ∂ρa ; ; ; m2 n2 n2 k 2 k 2 l 2 l 2 m2 (V ,V e ,1,0,0)γ −A 0 0 0 0 0 0 0 0 [ 0 , 0 , ρ 0 ]
,
(4.45a)
∂ (V ,V e ,1,0,0)γ (V ,V e ,1,0,0)γ A m1 n1 n1 m1 [ 0 , 0 , ρ 0 ] − A 0 0 0 0 [ 0 , 0 , ρ 0 ] ;0 0 ∂ 0 ; 0 0 2 2 2 2 m n n m (V ,V e ,1,0,0)γ (V ,V e ,1,0,0)γ −m1 A 1 0 0 1 [ 0 , 0 , ρ 0 ] − A 0 0 0 0 [ 0 , 0 , ρ 0 ]
0 = 0
0 0;0 0
0 0;0 0
(V ,V e ,1,0,0)γ (V ,V e ,1,0,0)γ [ 0 , 0 , ρ 0 ] − A 0 0 0 0 [ 0 , 0 , ρ 0 ] −n1 A 0 1 1 0 0 0;0 0
0 0;0 0
0 1;1 0
0 0;0 0
(V ,V e ,1,0,0)γ (V ,V e ,1,0,0)γ [ 0 , 0 , ρ 0 ] − A 0 0 0 0 [ 0 , 0 , ρ 0 ] −m2 A 0 0 0 0 ; ; 1 0 0 1 0 0 0 0 (V ,V e ,1,0,0)γ (V ,V e ,1,0,0)γ 2 0 −n A 0 0 0 0 [ 0 , 0 , ρ ] − A 0 0 0 0 [ 0 , 0 , ρ 0 ] ∂ (V ,V e ,1,0,0)γ (V ,V e ,1,0,0)γ = 0 A m1 n1 n1 m1 [ 0 , 0 , ρ 0 ] − A 0 0 0 0 [ 0 , 0 , ρ 0 ] ; ∂ρa ; 0 0 0 0 m2 n2 n2 m2 (V ,V e ,1,0,0)γ (V ,V e ,1,0,0)γ 1 −m A 1 0 0 1 [ 0 , 0 , ρ 0 ] − A 0 0 0 0 [ 0 , 0 , ρ 0 ] −n
1
0 0;0 0
γ as in Def. 1.2
0 0;0 0
(V ,V e ,1,0,0)γ (V ,V e ,1,0,0)γ [ 0 , 0 , ρ 0 ] − A 0 0 0 0 [ 0 , 0 , ρ 0 ] A0 1 1 0 0 0;0 0 0 0;0 0
(V ,V e ,1,0,0)γ (V ,V e ,1,0,0)γ [ 0 , 0 , ρ 0 ] − A 0 0 0 0 [ 0 , 0 , ρ 0 ] −m2 A 0 0 0 0 1 0;0 1 0 0;0 0 e (V ,V e ,1,0,0)γ 2 (V ,V ,1,0,0)γ 0 −n A 0 0 0 0 [ 0 , 0 , ρ ]−A 0 0 0 0 [ 0 , 0 , ρ 0 ] 0 1;1 0
0 0;0 0
∂ (V ,V e ,1,0,0)γ A m1 +1 n1 +1 n1 m1 [ 0 , 0 , ρ 0 ] 0 = 0 ∂ 0 ; n2 m2 m2 n2 (V ,V e ,1,0,0)γ [ 0 , 0 , ρ 0 ] − (m1 +1)(n1 +1)A 1 1 0 0 0 0;0 0
γ as in Def. 1.2
,
(4.45b)
γ as in Def. 1.3
∂ (V ,V e ,1,0,0)γ = 0 A m1 +1 n1 +1 n1 m1 [ 0 , 0 , ρ 0 ] ∂ρa ; n2 m2 m2 n2 (V ,V e ,1,0,0)γ [ 0 , 0 , ρ 0 ] − (m1 +1)(n1 +1)A 1 1 0 0 ; γ 0 0 0 0 e ∂ (V ,V ,B,g,ι)γ A [ 0 , 0 , ρ 0 ] 0 = 0 γ as in Def. 1.4 ∂ 0 m1 n1 ;...;mN nN ∂ (V ,V e ,B,g,ι)γ = 0 Am1 n1 ;...;mN nN [ 0 , 0 , ρ 0 ] . γ as in Def. 1.4 ∂ρa
as in Def. 1.3
,
(4.45c)
(4.45d)
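The $\Lambda_0$-derivative at $\Lambda = \Lambda_0$ has to be considered with care:
\[
\Lambda_0\frac{\partial}{\partial\Lambda_0} A^{(V,V^e,B,g,\iota)\gamma}_{m_1n_1;\dots;m_Nn_N}[\Lambda_0,\Lambda_0,\rho^0]
= \Lambda\frac{\partial}{\partial\Lambda} A^{(V,V^e,B,g,\iota)\gamma}_{m_1n_1;\dots;m_Nn_N}[\Lambda,\Lambda_0,\rho^0]\Big|_{\Lambda=\Lambda_0}
+ \Lambda_0\frac{\partial}{\partial\Lambda_0} A^{(V,V^e,B,g,\iota)\gamma}_{m_1n_1;\dots;m_Nn_N}[\Lambda,\Lambda_0,\rho^0]\Big|_{\Lambda=\Lambda_0} ,
\tag{4.46}
\]
and similarly for (4.45a)–(4.45c). Inserting (4.44), (4.45), (4.46) and the corresponding formulae into the Taylor expansion of (4.5) we thus have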
Renormalisation of φ 4 -Theory on Noncommutative R4 in the Matrix Base
(V ,V e ,1,0,0)γ
R m1 n1
γ as in Def. 1.1
m2
1 k 1 k 1 l 1 l 1 m1 ; ; k 2 k 2 l 2 l 2 m2
;n n2 n2
345
[ 0 , 0 , ρ 0 ]
∂ (V ,V e ,1,0,0)γ A m1 n1 n1 k1 k1 l 1 l 1 m1 [ , 0 , ρ 0 ] ∂
; ; ; m2 n2 n2 k 2 k 2 l 2 l 2 m2 γ as in Def. 1.1 (V ,V e ,1,0,0)γ , −A 0 0 0 0 0 0 0 0 [ , 0 , ρ 0 ]
=−
0 0;0 0;0 0;0 0
(V ,V e ,1,0,0)γ
R m1 n1
γ as in Def. 1.2
m2
(4.47a)
= 0 1 m1 m2
;n n2 n2
[ 0 , 0 , ρ 0 ]
∂ (V ,V e ,1,0,0)γ (V ,V e ,1,0,0)γ A m1 n1 n1 m1 [ , 0 , ρ 0 ] − A 0 0 0 0 [ , 0 , ρ 0 ] ∂
0 0;0 0 2 n2 ; n2 m2 m γ as in Def. 1.2 (V ,V e ,1,0,0)γ (V ,V e ,1,0,0)γ −m1 A 1 0 0 1 [ , 0 , ρ 0 ] − A 0 0 0 0 [ , 0 , ρ 0 ] 0 0;0 0 0 0;0 0 (V ,V e ,1,0,0)γ (V ,V e ,1,0,0)γ 1 0 [ , 0 , ρ ] − A 0 0 0 0 [ , 0 , ρ 0 ] −n A 0 1 1 0 ; ; 0 0 0 0 0 0 0 0 (V ,V e ,1,0,0)γ (V ,V e ,1,0,0)γ 2 0 [ , 0 , ρ ] − A 0 0 0 0 [ , 0 , ρ 0 ] −m A 0 0 0 0 ; ; 1 0 0 1 0 0 0 0 (V ,V e ,1,0,0)γ (V ,V e ,1,0,0)γ 2 0 −n A 0 0 0 0 [ , 0 , ρ ] − A 0 0 0 0 [ , 0 , ρ 0 ] , (4.47b)
=−
0 1;1 0
0 0;0 0
(V ,2,1,0,0)γ
R m1 +1 n1 +1
γ as in Def. 1.3
m2
n2
= 0
0
1 m1 m2
; nn2
[ 0 , 0 , ρ ]
∂ (V ,V e ,1,0,0)γ A m1 +1 n1 +1 n1 m1 [ , 0 , ρ 0 ] ∂
; n2 m2 m2 n2 γ as in Def. 1.3 (V ,V e ,1,0,0)γ [ , 0 , ρ 0 ] , − (m1 +1)(n1 +1)A 1 1 0 0
=−
0 0;0 0
(V ,V e ,B,g,ι)γ Rm1 n1 ;...;mN nN [ 0 , 0 , ρ 0 ] γ as in Def. 1.4 ∂ (V ,V e ,B,g,ι)γ Am1 n1 ;...;mN nN [ , 0 , ρ 0 ] =−
γ ∂
(4.47c)
= 0
as in Def. 1.4 =
0
.
(4.47d)
In particular,
\[
R^{(1,1,1,0,0)}_{m_1n_1;\dots;m_4n_4}[\Lambda,\Lambda_0,\rho^0] \equiv 0 . \tag{4.48}
\]
We first get (4.48) at $\Lambda = \Lambda_0$ from (4.47a). Since the rhs of (4.17) vanishes for V = 1 and N = 4, we conclude (4.48) for any $\Lambda$.

Proposition 4. Let γ be an R-ribbon graph having N external legs, V vertices, $V^e$ external vertices and segmentation index ι, which is drawn on a genus-g Riemann surface with B boundary components. Then the contribution $R^{(V,V^e,B,g,\iota)\gamma}_{m_1n_1;\dots;m_Nn_N}$ of γ to the expansion coefficient of the $\Lambda_0$-varied effective action describing a duality-covariant $\phi^4$-theory on $\mathbb{R}^4_\theta$ in the matrix base is bounded as follows:
1. If γ is of the type described under 1–3 of Definition 1, we have (V ,V e ,1,0,0)γ R m1 n1 n1 k1 k1 l 1 l 1 m1 [ , 0 , ρ0 ] γ as in Def. 1.1
≤
;
! m1 n1 n1 k 1 k 1 l 1 l 1 m1 4V −4 m2 n2 ; n2 k 2 ; k 2 l 2 ; l 2 m2 P θ 2
20 1 1 3V −2−V e 0 2V −2
γ as in Def. 1.2
2
P ln ,
R (V ,V e ,1,0,0)γ R m1 n1 n1 m1 [ , 0 , ρ0 ] ;
2
)P24V −2
! m1 n1
n1 m1 m2 n2 ; n2 m2 θ 2
(θ
20 1 3V −1−V e 0 × P 2V −1 ln ,
R (V ,V e ,1,0,0) R m1 +1 n1 +1 n1 m1 [ , 0 , ρ0 ] 2
(4.49)
m2 n2 n2 m2
m2
γ as in Def. 1.3
≤
;
2 ×
≤
;
m2 n2 n2 k 2 k 2 l 2 l 2 m2
2
; n2
n2
)P24V −2
(4.50)
m2
! m1 +1 n1 +1 m2
n2
1
1
m ; nn2 m 2
(θ
θ 2
20 1 3V −1−V e 0 × . P 2V −1 ln
R
(4.51)
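2. If γ is a subgraph of a 1PI planar graph with a selected set $T$ of trajectories on one distinguished boundary component and a second set $T'$ of summed trajectories on that boundary component, we have
\[
\sum_{\mathcal{E}^s}\sum_{\mathcal{E}^t} \Big| R^{(V,V^e,B,0,\iota)\gamma}_{m_1n_1;\dots;m_Nn_N}[\Lambda,\Lambda_0,\rho^0] \Big|
\le \frac{\Lambda^2}{\Lambda_0^2}\, \big(\theta\Lambda^2\big)^{(2-\frac{N}{2})+2(1-B)}\,
P^{4V-N}_{2t+\sum_{\overrightarrow{n_j o[n_j]}\in T}\min(2,\frac{1}{2}\langle\overrightarrow{n_j o[n_j]}\rangle)}
\Big[\frac{m_1n_1;\dots;m_Nn_N}{\theta\Lambda^2}\Big]
\Big(\frac{1}{\Omega}\Big)^{3V-\frac{N}{2}-1+B+2g-V^e-\iota+s+t}
P^{2V-\frac{N}{2}}\Big[\ln\frac{\Lambda_0}{\Lambda_R}\Big] . \tag{4.52}
\]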
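3. If γ is a non-planar graph, we have
\[
\sum_{\mathcal{E}^s} \Big| R^{(V,V^e,B,g,\iota)}_{m_1n_1;\dots;m_Nn_N}[\Lambda,\Lambda_0,\rho^0] \Big|
\le \frac{\Lambda^2}{\Lambda_0^2}\, \big(\theta\Lambda^2\big)^{(2-\frac{N}{2})+2(1-B-2g)}\,
P^{4V-N}_{0}\Big[\frac{m_1n_1;\dots;m_Nn_N}{\theta\Lambda^2}\Big]
\Big(\frac{1}{\Omega}\Big)^{3V-\frac{N}{2}-1+B+2g-V^e-\iota+s}
P^{2V-\frac{N}{2}}\Big[\ln\frac{\Lambda_0}{\Lambda_R}\Big] . \tag{4.53}
\]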
We have $R^{(V,V^e,B,g,\iota)}_{m_1n_1;\dots;m_Nn_N} \equiv 0$ for $N > 2V+2$ or $\sum_{i=1}^{N}(m_i-n_i) \neq 0$.
Proof. Inserting the estimations of Proposition 2 into (4.47) we confirm Proposition 4 for $\Lambda = \Lambda_0$, which serves as the initial condition for the $\Lambda$-integration of (4.17). This entails the polynomial in $\ln\frac{\Lambda_0}{\Lambda_R}$ instead of $\ln\frac{\Lambda}{\Lambda_R}$ appearing in Propositions 2 and 3. Accordingly, when using Propositions 2 and 3 as the input for (4.17), we will further bound these estimations by replacing $\ln\frac{\Lambda}{\Lambda_R}$ by $\ln\frac{\Lambda_0}{\Lambda_R}$.

Due to (4.48) the rhs of (4.17) vanishes for N = 2, V = 1 and for N = 6, V = 2. This means that the corresponding R-functions are constant in $\Lambda$ so that the Proposition holds for $R^{(1,1,1,0,0)}_{m_1n_1;m_2n_2}[\Lambda]$, $R^{(1,1,2,0,1)}_{m_1n_1;m_2n_2}[\Lambda]$ and $R^{(2,2,1,0,0)}_{m_1n_1;\dots;m_6n_6}[\Lambda]$. Since (4.17) is a linear differential equation, the factor $\frac{\Lambda^2}{\Lambda_0^2}$ relative to the estimation of the A-functions of Proposition 2, first appearing in $R^{(1,1,1,0,0)}_{m_1n_1;m_2n_2}[\Lambda]$, $R^{(1,1,2,0,1)}_{m_1n_1;m_2n_2}[\Lambda]$ and $R^{(2,2,1,0,0)}_{m_1n_1;\dots;m_6n_6}[\Lambda]$, survives to more complicated graphs, provided that none of the R-functions is relevant in $\Lambda$.

For graphs according to Definition 1.4, the first two lines on the rhs of (4.17) yield in the same way as in the proof of (3.39) the integrand (4.53), with the degree of the polynomial in $\ln\frac{\Lambda_0}{\Lambda_R}$ lowered by 1. Since under the given conditions an A-graph would be irrelevant, an R-graph with the additional factor $\frac{\Lambda^2}{\Lambda_0^2}$ is relevant or marginal. Thus, the $\Lambda$-integration of the first two lines on the rhs of (4.17) can be estimated by the integrand and a factor $P^1[\ln\frac{\Lambda_0}{\Lambda_R}]$, in agreement with (4.53). In the same way we verify (4.52) for the first two lines on the rhs of (4.17). In the remaining lines of (4.17) we get by induction the following estimation: (V1 ) Q ( )R [ ] nm;lk 0 0 0 0 0 0 ; 0 0 ;mn;kl [Def. 1.2]
m,n,k,l
2 1 3V −1−V e 2V −2 0 2 ≤ θ
ln , P
R
2 0 (V1 ) (V1 ) Q ( ) R [ ] − R [ ] nm;lk 1 0 0 1 0 0 0 0 0 0 ; 0 0 ;mn;kl 0 0 ; 0 0 ;mn;kl m,n,k,l
2 1 3V −1−V e
0 2V −2 ≤ P ln ,
R
2 0 (V ) Qnm;lk ( )R 1 11 0 0 [ ] ≤
2 1 3V −1−V e
20
≤
2 1 3V −2−V e
20
(4.54b)
[Def. 1.3]
0 P 2V −2 ln ,
R
(V1 ) Qnm;lk ( )R 0 0 0 0 0 0 0 0 [ ] 0 0 ; 0 0 ; 0 0 ; 0 0 ;mn;kl m,n,k,l
[Def. 1.2]
0 0 ; 0 0 ;mn;kl
m,n,k,l
(4.54a)
0 . P 2V −3 ln
R
(4.54c)
[Def. 1.1]
(4.54d)
These estimations are obtained in a similar way as (4.38a), (4.39a) and (4.40a). In particular, the improvement by $(\theta\Lambda^2)^{-1}$ in (4.54b) is due to the difference of graphs which according to Sect. 3.3 yield a composite propagator (3.17a). To obtain (4.54c) we have to use (4.52) with $\langle\overrightarrow{n_1 o[n_1]}\rangle + \langle\overrightarrow{n_2 o[n_2]}\rangle = 2$, which for the graphs under consideration is known by induction.

Multiplying (4.54) by versions of Proposition 3 according to (4.17), for $V_1 < V$, we obtain again (4.53) or (4.52), with the degree of the polynomial in $\ln\frac{\Lambda_0}{\Lambda_R}$ lowered by 1, for the integrand. Then the $\Lambda$-integration proves (4.53) and (4.52).

For graphs as in 1–3 of Definition 1 one shows in the same way as in the proof of 1–3 of Proposition 3 that the last term in the third line of (4.17) and the $(V_1 = V)$-terms in the remaining lines project to the irrelevant part of these R-functions, i.e. lead to (4.49)–(4.51). This was already clear from (4.44). For the remaining $(V_1 < V)$-terms in the fourth to last lines of (4.17) we obtain (4.49)–(4.51) from (4.54) and (4.33)–(4.35). This finishes the proof.

4.5. Finishing the convergence and renormalisation theorem. We return now to the starting point of the entire estimation procedure, the identity (4.4). We put $\Lambda = \Lambda_R$ in Proposition 4 and perform the $\Lambda_0$-integration in (4.4):

Theorem 5. The $\phi^4$-model on $\mathbb{R}^4_\theta$ is (order by order in the coupling constant) renormalisable in the matrix base by adjusting the coefficients $\rho^0_a[\Lambda_0]$ defined in (3.15) and (3.14) of the initial interaction (3.3) to give (3.16) and by integrating the Polchinski equation according to Definition 1. The limit
\[
A^{(V,V^e,B,g,\iota)}_{m_1n_1;\dots;m_Nn_N}[\Lambda_R,\infty] := \lim_{\Lambda_0\to\infty} A^{(V,V^e,B,g,\iota)}_{m_1n_1;\dots;m_Nn_N}[\Lambda_R,\Lambda_0,\rho^0[\Lambda_0]]
\]
of the expansion coefficients of the effective action $L[\phi,\Lambda_R,\Lambda_0,\rho^0[\Lambda_0]]$, see (3.4), exists and satisfies
\[
\Big| (2\pi\theta)^{\frac{N}{2}-2} A^{(V,V^e,B,g,\iota)}_{m_1n_1;\dots;m_Nn_N}[\Lambda_R,\infty] - (2\pi\theta)^{\frac{N}{2}-2} A^{(V,V^e,B,g,\iota)}_{m_1n_1;\dots;m_Nn_N}[\Lambda_R,\Lambda_0,\rho^0] \Big|
\le \frac{\Lambda_R^{6-N}}{\Lambda_0^2} \Big(\frac{1}{\theta\Lambda_R^2}\Big)^{2(B+2g-1)}
P^{4V-N}_{0}\Big[\frac{m_1n_1;\dots;m_Nn_N}{\theta\Lambda_R^2}\Big]
\Big(\frac{1}{\Omega}\Big)^{3V-\frac{N}{2}-V^e-\iota}
P^{2V-\frac{N}{2}}\Big[\ln\frac{\Lambda_0}{\Lambda_R}\Big] . \tag{4.55}
\]
Proof. We insert Proposition 4, taken at $\Lambda = \Lambda_R$, into (4.4). We also use (3.33) in Proposition 4.1. Now, the existence of the limit and its property (4.55) are a consequence of Cauchy's criterion. Note that $\int \frac{dx}{x^3}\, P^q[\ln x] = \frac{1}{x^2}\, P'^{\,q}[\ln x]$.

5. Conclusion

In this paper we have proven that the real $\phi^4$-model on (Euclidean) noncommutative $\mathbb{R}^4$ is renormalisable to all orders in perturbation theory. The bare action of relevant and marginal couplings of the model is parametrised by four (divergent) quantities which require normalisation to the experimental data at a physical renormalisation scale. The corresponding physical parameters which determine the model are the mass, the field amplitude (to be normalised to 1), the coupling constant and (in addition to the commutative version) the frequency of a harmonic oscillator potential. The appearance of the oscillator potential is not a bad trick but a true physical effect. It is the self-consistent solution of the UV/IR-mixing problem found in the traditional noncommutative $\phi^4$-model
in momentum space. It implements the duality (see also [4]) that noncommutativity relevant at short distances goes hand in hand with a modified structure of space relevant at large distances.

Such a modified structure of space at very large distances seems to be in contradiction with experimental data. But this is not true. Neither position space nor momentum space is the adapted frame to interpret the model. An invariant characterisation of the model is the spectrum of the Laplace-like operator which defines the free theory. Due to the link to Meixner polynomials, the spectrum is discrete. Comparing (A.1) with (A.11) and (A.12) we see that the spectrum of the squared momentum variable has an equidistant spacing of $\frac{4\Omega}{\theta}$. Thus, $\sqrt{\frac{4\Omega}{\theta}}$ is the minimal (non-vanishing) momentum of the scalar field which is allowed in the noncommutative universe. We can thus identify the parameter $\sqrt{\Omega}$ with the ratio of the Planck length to the size of the (finite!) universe. Thus, for typical momenta on earth, the discretisation is not visible. However, there should be an observable effect at extremely large scales. Indeed, there is some evidence of discrete momenta in the spectrum of the cosmic microwave background [21].¹⁰

Of course, when we pass to a frame where the propagator becomes $\frac{1}{\mu_0^2+p^2}$, with $p$ now being discrete, we also have to transform the interactions. We thus have to shift the unitary matrices $U^{(\alpha)}_m$ appearing in (A.1) from the kinetic matrix or the propagator into the vertex. The properties of that dressed (physical) vertex will be studied elsewhere.

Another interesting exercise is the evaluation of the β-function of the duality-covariant $\phi^4$-model [22]. It turns out that the one-loop β-function for the coupling constant remains non-negative and vanishes for the self-dual case $\Omega = 1$. Moreover, the limit $\Omega \to 0$ exists at the one-loop level. This is related to the fact that the UV/IR-mixing in momentum space becomes problematic only at higher loop order.

Of particular interest would be the limit $\theta \to 0$. In the developed approach, θ defines the reference size of an elementary cell in the Moyal plane. All dimensionful quantities, in particular the energy scale $\Lambda$, are measured in units of (appropriate powers of) θ. In the final result of Theorem 5, these mass dimensions are restored. Then, we learn from (4.55) that a finite θ regularises the non-planar graphs. This means that for given $\Lambda_0$ and $\Lambda_R$ the limit $\theta \to 0$ cannot be taken. On the other hand, there could be a chance to let θ depend on $\Lambda_0$ in the same way as in the two-dimensional case [14], where the oscillator frequency was switched off with the limit $\Lambda_0 \to \infty$. However, this does not work. The point is that taking in (4.7), instead of the $\Lambda_0$-derivative, the θ-derivative, there is now a contribution from the θ-dependence of the propagator. This leads in the analogue of the differential equation (4.11) to a term bilinear in L. Looking at the proof of Proposition 4, we see that this L-bilinear term will remove the factor $\Lambda_0^{-2}$. Thus, the limit $\theta \to 0$ is singular.

This is not surprising. In the limit $\theta \to 0$ the distinction between planar and non-planar graphs disappears (which is immediately clear in momentum space). Then, non-planar two- and four-point functions should yield the same divergent values as their planar analogues. Whereas the bare divergences in the planar sector are avoided by the mixed boundary conditions in 1–3 of Definition 1, the naïve initial condition in Definition 1.4 for non-planar graphs leaves the bare divergences in the limit $\theta \to 0$.

The next goal must be to generalise the renormalisation proof to gauge theories. This probably requires a gauge-invariant extension of the harmonic oscillator potential. The result should be compared with string theory, because gauge theory on the Moyal plane arises in the zero-slope limit of string theory in the presence of a Neveu-Schwarz B-field [23]. As renormalisation requires an appropriate structure of the space at very large distances, the question arises whether the oscillator potential has a counterpart in string theory. In this respect, it is tempting¹¹ to relate the oscillator potential to the maximally supersymmetric pp-wave background metric of type IIB string theory found in [24],
\[
ds^2 = 2\,dx^+ dx^- - 4\lambda^2 \sum_{i=1}^{8} (x^i)^2 (dx^-)^2 + \sum_{i=1}^{8} (dx^i)^2 , \tag{5.1}
\]

¹⁰ According to the main purpose of [21] one should also discuss other topologies than the noncommutative $\mathbb{R}^D$.
for $dx^\pm = \frac{1}{\sqrt{2}}(dx^9 \pm dx^{10})$, which solves Einstein's equations for an energy-momentum tensor relative to the 5-form field strength
\[
F_5 = \lambda\, dx^- \wedge \big( dx^1 \wedge dx^2 \wedge dx^3 \wedge dx^4 + dx^5 \wedge dx^6 \wedge dx^7 \wedge dx^8 \big) . \tag{5.2}
\]

A. Evaluation of the Propagator

A.1. Diagonalisation of the kinetic matrix via Meixner polynomials. Our goal is to diagonalise the (four-dimensional) kinetic matrix $G_{\substack{m^1\\m^2}\substack{n^1\\n^2};\substack{k^1\\k^2}\substack{l^1\\l^2}}$ given in (2.6), making use of the angular momentum conservation $\alpha^r = n^r - m^r = k^r - l^r$ (which is due to the SO(2) × SO(2)-symmetry of the action). For $\alpha^r \ge 0$ we thus look for a representation
\[
G_{\substack{m^1\\m^2}\substack{m^1+\alpha^1\\m^2+\alpha^2};\substack{l^1+\alpha^1\\l^2+\alpha^2}\substack{l^1\\l^2}}
= \sum_{i^1,i^2} U^{(\alpha^1)}_{m^1 i^1} U^{(\alpha^2)}_{m^2 i^2} \Big( \frac{2}{\theta_1} v_{i^1} + \frac{2}{\theta_2} v_{i^2} + \mu_0^2 \Big) U^{(\alpha^1)}_{i^1 l^1} U^{(\alpha^2)}_{i^2 l^2} , \tag{A.1}
\]
\[
\delta_{ml} = \sum_{i} U^{(\alpha)}_{mi} U^{(\alpha)}_{il} . \tag{A.2}
\]
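The sum over $i^1, i^2$ would be an integration for continuous eigenvalues $v_{i^r}$. Comparing this ansatz with (2.6) we obtain, eliminating $i$ in favour of $v$, the recurrence relation
\[
(1-\Omega^2)\sqrt{m(\alpha+m)}\, U^{(\alpha)}_{m-1}(v) + \big( v - (1+\Omega^2)(\alpha+1+2m) \big) U^{(\alpha)}_{m}(v) + (1-\Omega^2)\sqrt{(m+1)(\alpha+m+1)}\, U^{(\alpha)}_{m+1}(v) = 0 \tag{A.3}
\]
to determine $U^{(\alpha)}_m(v)$ and $v$. We are interested in the case $\Omega > 0$. In order to make contact with standard formulae we put
\[
U^{(\alpha)}_m(v) = f^{(\alpha)}(v)\, \frac{1}{\tau^m} \sqrt{\frac{(\alpha+m)!}{m!}}\, V^{(\alpha)}_m(v) , \qquad v = \nu x + \rho . \tag{A.4}
\]
We obtain after division by $f^{(\alpha)}(v)$
\[
\begin{aligned}
0 ={}& \frac{(1-\Omega^2)}{\tau^{m-1}} \sqrt{\frac{m^2(\alpha+m)!}{m!}}\, V^{(\alpha)}_{m-1}(\nu x+\rho)
- \frac{1}{\tau^m} \big( (1+\Omega^2)(\alpha+1+2m) - \rho - \nu x \big) \sqrt{\frac{(\alpha+m)!}{m!}}\, V^{(\alpha)}_{m}(\nu x+\rho) \\
&+ \frac{(1-\Omega^2)}{\tau^{m+1}} \sqrt{\frac{(\alpha+m+1)^2(\alpha+m)!}{m!}}\, V^{(\alpha)}_{m+1}(\nu x+\rho) ,
\end{aligned} \tag{A.5}
\]

¹¹ We would like to thank G. Bonelli for this interesting remark.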
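i.e.
\[
-\frac{\nu}{\tau(1-\Omega^2)}\, x V^{(\alpha)}_m(\nu x+\rho) = m V^{(\alpha)}_{m-1}(\nu x+\rho) - \frac{(1+\Omega^2)(\alpha+1+2m) - \rho}{\tau(1-\Omega^2)}\, V^{(\alpha)}_m(\nu x+\rho) + \frac{1}{\tau^2}(\alpha+m+1) V^{(\alpha)}_{m+1}(\nu x+\rho) . \tag{A.6}
\]
Now we put
\[
1+\alpha = \beta , \quad \frac{1}{\tau^2} = c , \quad \frac{2(1+\Omega^2)}{\tau(1-\Omega^2)} = 1+c , \quad \frac{(1+\Omega^2)\beta - \rho}{\tau(1-\Omega^2)} = \beta c , \quad \frac{\nu}{(1-\Omega^2)\tau} = 1-c \tag{A.7}
\]
and
\[
V^{(\alpha)}_n(\nu x+\rho) = M_n(x;\beta,c) , \tag{A.8}
\]
which yields the recursion relation for the Meixner polynomials [16]:
\[
(c-1)\,x\,M_m(x;\beta,c) = c(m+\beta)M_{m+1}(x;\beta,c) - \big(m+(m+\beta)c\big)M_m(x;\beta,c) + m M_{m-1}(x;\beta,c) . \tag{A.9}
\]
The solution of (A.7) is
\[
\tau = \frac{(1\pm\Omega)^2}{1-\Omega^2} \equiv \frac{1\pm\Omega}{1\mp\Omega} , \qquad c = \frac{(1\mp\Omega)^2}{(1\pm\Omega)^2} , \qquad \nu = \pm 4\Omega , \qquad \rho = \pm 2\Omega(1+\alpha) . \tag{A.10}
\]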
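We have to choose the upper sign, because the eigenvalues $v$ are positive. We thus obtain
\[
U^{(\alpha)}_m(v_x) = f^{(\alpha)}(x) \sqrt{\frac{(\alpha+m)!}{m!}} \Big(\frac{1-\Omega}{1+\Omega}\Big)^{m} M_m\Big(x; 1+\alpha, \frac{(1-\Omega)^2}{(1+\Omega)^2}\Big) , \qquad v_x = 2\Omega(2x+\alpha+1) . \tag{A.11}
\]
The function $f^{(\alpha)}(x)$ is identified by comparison of (A.2) with the orthogonality relation of Meixner polynomials [16],
\[
\sum_{x=0}^{\infty} \frac{\Gamma(\beta+x)\,c^x}{\Gamma(\beta)\,x!}\, M_m(x;\beta,c) M_n(x;\beta,c) = \frac{c^{-n}\, n!\, \Gamma(\beta)}{\Gamma(\beta+n)(1-c)^{\beta}}\, \delta_{mn} . \tag{A.12}
\]
The result is
\[
U^{(\alpha)}_m(v_x) = \sqrt{\binom{\alpha+m}{m}\binom{\alpha+x}{x}} \Big(\frac{2\sqrt{\Omega}}{1+\Omega}\Big)^{\alpha+1} \Big(\frac{1-\Omega}{1+\Omega}\Big)^{m+x} M_m\Big(x; 1+\alpha, \frac{(1-\Omega)^2}{(1+\Omega)^2}\Big) . \tag{A.13}
\]
The Meixner polynomials can be represented by hypergeometric functions [16]
\[
M_m\Big(x; 1+\alpha, \frac{(1-\Omega)^2}{(1+\Omega)^2}\Big) = {}_2F_1\Big( {-m,\,-x \atop 1+\alpha} \Big| -\frac{4\Omega}{(1-\Omega)^2} \Big) . \tag{A.14}
\]
This shows that the matrices $U^{(\alpha)}_{ml}$ in (A.1) and (A.2) are symmetric in the lower indices.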
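The statements of this subsection can be spot-checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes the closed form (A.13) with the hypergeometric representation (A.14), truncates the sum over $x$ at a hypothetical cutoff, and tests the orthonormality (A.2), the recurrence (A.3) at $v = v_x$, and the symmetry in the lower indices. The function names are illustrative only.

```python
import math

def hyp2f1_neg(m, x, c, z):
    """Terminating series 2F1(-m, -x; c; z) for non-negative integers m, x."""
    total, term = 0.0, 1.0
    for k in range(min(m, x) + 1):
        total += term
        # ratio of consecutive terms: (-m+k)(-x+k) z / ((c+k)(k+1))
        term *= (k - m) * (k - x) * z / ((c + k) * (k + 1))
    return total

def U(alpha, m, x, omega):
    """U^{(alpha)}_m(v_x) as written in (A.13); requires 0 < omega < 1."""
    q = (1.0 - omega) / (1.0 + omega)
    norm = math.sqrt(math.comb(alpha + m, m) * math.comb(alpha + x, x))
    norm *= (2.0 * math.sqrt(omega) / (1.0 + omega)) ** (alpha + 1) * q ** (m + x)
    return norm * hyp2f1_neg(m, x, alpha + 1, -4.0 * omega / (1.0 - omega) ** 2)

def gram(alpha, m, n, omega, cutoff=200):
    """Truncated sum over x of U_m(v_x) U_n(v_x); should approximate delta_mn, cf. (A.2)."""
    return sum(U(alpha, m, x, omega) * U(alpha, n, x, omega) for x in range(cutoff))

def recurrence_residual(alpha, m, x, omega):
    """Left-hand side of (A.3) at v = v_x = 2*omega*(2x + alpha + 1); should vanish."""
    v = 2.0 * omega * (2 * x + alpha + 1)
    lower = math.sqrt(m * (alpha + m)) * U(alpha, m - 1, x, omega) if m > 0 else 0.0
    upper = math.sqrt((m + 1) * (alpha + m + 1)) * U(alpha, m + 1, x, omega)
    diag = (v - (1.0 + omega ** 2) * (alpha + 1 + 2 * m)) * U(alpha, m, x, omega)
    return (1.0 - omega ** 2) * (lower + upper) + diag

if __name__ == "__main__":
    print(gram(0, 1, 1, 0.5), gram(0, 1, 2, 0.5), recurrence_residual(2, 3, 4, 0.5))
```

The truncation is harmless for $0 < \Omega < 1$ because the summand decays like $\big(\frac{1-\Omega}{1+\Omega}\big)^{2x}$ times a polynomial in $x$.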
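A.2. Evaluation of the propagator. Now we return to the computation of the propagator, which is obtained by sandwiching the inverse eigenvalues $\big(\frac{2}{\theta_1} v_{i^1} + \frac{2}{\theta_2} v_{i^2} + \mu_0^2\big)^{-1}$ between the unitary matrices $U^{(\alpha)}$. With (A.11) and the use of Schwinger's trick $\frac{1}{A} = \int_0^\infty dt\, e^{-tA}$ we have for $\theta_1 = \theta_2 = \theta$
\[
\begin{aligned}
\Delta_{\substack{m^1\\m^2}\substack{m^1+\alpha^1\\m^2+\alpha^2};\substack{l^1+\alpha^1\\l^2+\alpha^2}\substack{l^1\\l^2}}
={}& \frac{\theta}{8\Omega} \int_0^\infty dt \sum_{x^1,x^2=0}^{\infty} e^{-\frac{t}{4\Omega}(v_{x^1}+v_{x^2}+\theta\mu_0^2/2)}\, U^{(\alpha^1)}_{m^1}(v_{x^1}) U^{(\alpha^2)}_{m^2}(v_{x^2}) U^{(\alpha^1)}_{l^1}(v_{x^1}) U^{(\alpha^2)}_{l^2}(v_{x^2}) \\
={}& \frac{\theta}{8\Omega} \int_0^\infty dt\, e^{-t\left(1+\frac{\mu_0^2\theta}{8\Omega}+\frac{1}{2}(\alpha^1+\alpha^2)\right)} \prod_{i=1}^{2} \sqrt{\binom{\alpha^i+m^i}{m^i}\binom{\alpha^i+l^i}{l^i}} \Big(\frac{4\Omega}{(1+\Omega)^2}\Big)^{\alpha^i+1} \Big(\frac{1-\Omega}{1+\Omega}\Big)^{m^i+l^i} \\
&\times \sum_{x^i=0}^{\infty} \frac{(\alpha^i+x^i)!}{x^i!\,\alpha^i!} \Big(\frac{e^{-t}(1-\Omega)^2}{(1+\Omega)^2}\Big)^{x^i}\,
{}_2F_1\Big( {-m^i,\,-x^i \atop 1+\alpha^i} \Big| -\frac{4\Omega}{(1-\Omega)^2} \Big)\,
{}_2F_1\Big( {-l^i,\,-x^i \atop 1+\alpha^i} \Big| -\frac{4\Omega}{(1-\Omega)^2} \Big) .
\end{aligned} \tag{A.15}
\]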
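We use the following identity for hypergeometric functions,
\[
\sum_{x=0}^{\infty} \frac{(\alpha+x)!}{x!\,\alpha!}\, a^x\, {}_2F_1\Big( {-m,\,-x \atop 1+\alpha} \Big| b \Big)\, {}_2F_1\Big( {-l,\,-x \atop 1+\alpha} \Big| b \Big)
= \frac{(1-(1-b)a)^{m+l}}{(1-a)^{\alpha+m+l+1}}\, {}_2F_1\Big( {-m,\,-l \atop 1+\alpha} \Big| \frac{ab^2}{(1-(1-b)a)^2} \Big) , \qquad |a| < 1 . \tag{A.16}
\]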
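Before the identity (A.16) is used, it can be tested numerically for a few parameter values. The sketch below is only an illustration under stated assumptions: the infinite sum on the left is truncated at a hypothetical cutoff, the terminating hypergeometric series are summed directly, and the function names are not taken from the paper.

```python
def hyp2f1_neg(m, x, c, z):
    """Terminating series 2F1(-m, -x; c; z) for non-negative integers m, x."""
    total, term = 0.0, 1.0
    for k in range(min(m, x) + 1):
        total += term
        term *= (k - m) * (k - x) * z / ((c + k) * (k + 1))
    return total

def lhs_A16(alpha, m, l, a, b, cutoff=400):
    """Truncated left-hand side of (A.16); needs |a| < 1 for convergence."""
    total = 0.0
    weight = 1.0  # (alpha + x)! / (x! alpha!) * a**x, starting at x = 0
    for x in range(cutoff):
        total += weight * hyp2f1_neg(m, x, alpha + 1, b) * hyp2f1_neg(l, x, alpha + 1, b)
        weight *= a * (alpha + x + 1) / (x + 1)
    return total

def rhs_A16(alpha, m, l, a, b):
    """Right-hand side of (A.16); assumes 1 - (1 - b) a is non-zero."""
    d = 1.0 - (1.0 - b) * a
    series = hyp2f1_neg(m, l, alpha + 1, a * b * b / d ** 2)
    return d ** (m + l) / (1.0 - a) ** (alpha + m + l + 1) * series

if __name__ == "__main__":
    for params in [(0, 1, 1, 0.3, -1.7), (1, 2, 3, 0.3, -1.7), (2, 0, 4, -0.4, 0.6)]:
        print(params, lhs_A16(*params), rhs_A16(*params))
```

For $m = l = 0$ both sides reduce to the binomial series $(1-a)^{-\alpha-1}$, which is a convenient sanity check of the normalisation.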
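The identity (A.16) is probably known, but because it is crucial for the solution of the free theory, we provide the proof in Sect. A.4. We insert the rhs of (A.16), expanded as a finite sum, into (A.15), where we also put $z = e^{-t}$:
\[
\begin{aligned}
\Delta_{\substack{m^1\\m^2}\substack{m^1+\alpha^1\\m^2+\alpha^2};\substack{l^1+\alpha^1\\l^2+\alpha^2}\substack{l^1\\l^2}}
={}& \frac{\theta}{8\Omega} \int_0^1 dz \sum_{u^1=0}^{\min(m^1,l^1)} \sum_{u^2=0}^{\min(m^2,l^2)}
\frac{z^{\frac{\mu_0^2\theta}{8\Omega}+\frac{1}{2}(\alpha^1+\alpha^2)+u^1+u^2}\,(1-z)^{m^1+m^2+l^1+l^2-2u^1-2u^2}}{\Big(1-\frac{(1-\Omega)^2}{(1+\Omega)^2}\,z\Big)^{\alpha^1+\alpha^2+m^1+m^2+l^1+l^2+2}} \\
&\times \prod_{i=1}^{2} \Big(\frac{4\Omega}{(1+\Omega)^2}\Big)^{\alpha^i+2u^i+1} \Big(\frac{1-\Omega}{1+\Omega}\Big)^{m^i+l^i-2u^i}
\frac{\sqrt{m^i!\,(\alpha^i+m^i)!\,l^i!\,(\alpha^i+l^i)!}}{(m^i-u^i)!\,(l^i-u^i)!\,(\alpha^i+u^i)!\,u^i!} .
\end{aligned} \tag{A.17}
\]
This formula tells us the important property
\[
0 \le \Delta_{\substack{m^1\\m^2}\substack{m^1+\alpha^1\\m^2+\alpha^2};\substack{l^1+\alpha^1\\l^2+\alpha^2}\substack{l^1\\l^2}}
\le \Delta_{\substack{m^1\\m^2}\substack{m^1+\alpha^1\\m^2+\alpha^2};\substack{l^1+\alpha^1\\l^2+\alpha^2}\substack{l^1\\l^2}}\Big|_{\mu_0^2=0} , \tag{A.18}
\]
i.e. all matrix elements of the propagator are positive and majorised by the massless matrix elements. The representation (A.17) seems to be the most convenient one for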
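analytical estimations of the propagator. The strategy¹² would be to divide the integration domain into slices and to maximise the individual z-dependent terms of the integrand over the slice, followed by resummation [18, 17]. The z-integration in (A.17) leads according to [25, §9.111] again to a hypergeometric function:
\[
\begin{aligned}
\Delta_{\substack{m^1\\m^2}\substack{m^1+\alpha^1\\m^2+\alpha^2};\substack{l^1+\alpha^1\\l^2+\alpha^2}\substack{l^1\\l^2}}
={}& \frac{\theta}{8\Omega} \sum_{u^1=0}^{\min(m^1,l^1)} \sum_{u^2=0}^{\min(m^2,l^2)}
\frac{\Gamma\big(1+\frac{\mu_0^2\theta}{8\Omega}+\frac{1}{2}(\alpha^1+\alpha^2)+u^1+u^2\big)\,(m^1+m^2+l^1+l^2-2u^1-2u^2)!}{\Gamma\big(2+\frac{\mu_0^2\theta}{8\Omega}+\frac{1}{2}(\alpha^1+\alpha^2)+m^1+m^2+l^1+l^2-u^1-u^2\big)} \\
&\times {}_2F_1\Big( {1+\frac{\mu_0^2\theta}{8\Omega}+\frac{1}{2}(\alpha^1+\alpha^2)+u^1+u^2\,,\ 2+m^1+m^2+l^1+l^2+\alpha^1+\alpha^2 \atop 2+\frac{\mu_0^2\theta}{8\Omega}+\frac{1}{2}(\alpha^1+\alpha^2)+m^1+m^2+l^1+l^2-u^1-u^2} \Big| \frac{(1-\Omega)^2}{(1+\Omega)^2} \Big) \\
&\times \prod_{i=1}^{2} \Big(\frac{4\Omega}{(1+\Omega)^2}\Big)^{\alpha^i+2u^i+1} \Big(\frac{1-\Omega}{1+\Omega}\Big)^{m^i+l^i-2u^i}
\frac{\sqrt{m^i!\,(\alpha^i+m^i)!\,l^i!\,(\alpha^i+l^i)!}}{(m^i-u^i)!\,(l^i-u^i)!\,(\alpha^i+u^i)!\,u^i!} \\
={}& \frac{\theta}{2(1+\Omega)^2} \sum_{u^1=0}^{\min(m^1,l^1)} \sum_{u^2=0}^{\min(m^2,l^2)} \prod_{i=1}^{2}
\sqrt{\binom{\alpha^i+m^i}{\alpha^i+u^i}\binom{\alpha^i+l^i}{\alpha^i+u^i}\binom{m^i}{u^i}\binom{l^i}{u^i}}\,
\Big(\frac{1-\Omega}{1+\Omega}\Big)^{m^i+l^i-2u^i} \\
&\times B\Big(1+\tfrac{\mu_0^2\theta}{8\Omega}+\tfrac{1}{2}(\alpha^1+\alpha^2)+u^1+u^2\,,\ 1+m^1+m^2+l^1+l^2-2u^1-2u^2\Big) \\
&\times {}_2F_1\Big( {1+m^1+m^2+l^1+l^2-2u^1-2u^2\,,\ \frac{\mu_0^2\theta}{8\Omega}-\frac{1}{2}(\alpha^1+\alpha^2)-u^1-u^2 \atop 2+\frac{\mu_0^2\theta}{8\Omega}+\frac{1}{2}(\alpha^1+\alpha^2)+m^1+m^2+l^1+l^2-u^1-u^2} \Big| \frac{(1-\Omega)^2}{(1+\Omega)^2} \Big) .
\end{aligned} \tag{A.19}
\]
We have used [25, §9.131.1] to obtain the last line. The form (A.19) will be useful for the evaluation of special cases and of the asymptotic behaviour. In the main part, for presentational purposes, $\alpha^i$ is eliminated in favour of $k^i$, $n^i$ and the summation variable $v^i := m^i + l^i - 2u^i$ is used. The final result is given in (2.7).

For $\mu_0 = 0$ we can in a few cases evaluate the sum over $u^i$ exactly. First, for $l^i = 0$ we also have $u^i = 0$. If additionally $\alpha^i = 0$ we get
\[
\Delta_{\substack{m^1\\m^2}\substack{m^1\\m^2};\substack{0\\0}\substack{0\\0}}\Big|_{\mu_0=0}
= \frac{\theta}{2(1+\Omega)^2\,(1+m^1+m^2)} \Big(\frac{1-\Omega}{1+\Omega}\Big)^{m^1+m^2} . \tag{A.20}
\]
One should notice here the exponential decay for $\Omega > 0$. It can be seen numerically that this is a general feature of the propagator: Given $m^i$ and $\alpha^i$, the maximum of the propagator is attained at $l^i = m^i$. Moreover, the decay with $|l^i - m^i|$ is exponential, so that the sum
\[
\sum_{l^1,l^2} \Delta_{\substack{m^1\\m^2}\substack{m^1+\alpha^1\\m^2+\alpha^2};\substack{l^1+\alpha^1\\l^2+\alpha^2}\substack{l^1\\l^2}} \tag{A.21}
\]
converges. We confirm this argumentation numerically in (C.3).

¹² We are grateful to Vincent Rivasseau for this idea.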
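The integral representation (A.17) is also easy to evaluate by direct quadrature, which gives an independent check of the special case (A.20) and of the majorisation property (A.18). The following is a rough sketch under stated assumptions, not the authors' numerics: it uses a plain composite Simpson rule with a hypothetical step count, the parameter `mu` stands for the combination $\mu_0^2\theta/(8\Omega)$, and index pairs such as `m = (m1, m2)` are passed as tuples.

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson rule with n (made even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3.0

def propagator_A17(m, l, alpha, omega, mu=0.0, theta=1.0, steps=2000):
    """Quadrature of (A.17); m, l, alpha are pairs, mu = mu_0^2 * theta / (8 * omega)."""
    c = ((1.0 - omega) / (1.0 + omega)) ** 2
    big_m = sum(m) + sum(l)
    big_a = sum(alpha)
    total = 0.0
    for u1 in range(min(m[0], l[0]) + 1):
        for u2 in range(min(m[1], l[1]) + 1):
            u = (u1, u2)
            weight = 1.0
            for i in range(2):
                weight *= (4.0 * omega / (1.0 + omega) ** 2) ** (alpha[i] + 2 * u[i] + 1)
                weight *= ((1.0 - omega) / (1.0 + omega)) ** (m[i] + l[i] - 2 * u[i])
                weight *= math.sqrt(math.factorial(m[i]) * math.factorial(alpha[i] + m[i])
                                    * math.factorial(l[i]) * math.factorial(alpha[i] + l[i]))
                weight /= (math.factorial(m[i] - u[i]) * math.factorial(l[i] - u[i])
                           * math.factorial(alpha[i] + u[i]) * math.factorial(u[i]))
            p = mu + 0.5 * big_a + u1 + u2   # power of z
            q = big_m - 2 * (u1 + u2)        # power of (1 - z)

            def integrand(z, p=p, q=q):
                return z ** p * (1.0 - z) ** q / (1.0 - c * z) ** (big_a + big_m + 2)

            total += weight * simpson(integrand, 0.0, 1.0, steps)
    return theta / (8.0 * omega) * total

if __name__ == "__main__":
    print(propagator_A17((3, 2), (0, 0), (0, 0), 0.5))
```

For odd $\alpha^1+\alpha^2$ and `mu = 0` the integrand behaves like $\sqrt{z}$ near $z = 0$, so the Simpson rule converges more slowly there; a dedicated quadrature would be preferable for precise values in that case.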
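It turns out numerically that the maximum of the propagator for indices restricted by $C \le \max(m^1,m^2,n^1,n^2,k^1,k^2,l^1,l^2) \le 2C$ is found in the subclass $\Delta_{\substack{m^1\\0}\substack{n^1\\0};\substack{n^1\\0}\substack{m^1\\0}}$ of propagators. Coincidentally, the computation in case of $m^2 = l^2 = \alpha^2 = 0$ simplifies considerably. If additionally $m^1 = n^1$ we obtain a closed result:
\[
\begin{aligned}
\Delta_{\substack{m\\0}\substack{m\\0};\substack{m\\0}\substack{m\\0}}
={}& \frac{\theta}{2(1+\Omega)^2} \sum_{u=0}^{m} \frac{(m!)^2\,(2u)!}{(m-u)!\,(u!)^2\,(1+m+u)!} \Big(\frac{1-\Omega}{1+\Omega}\Big)^{2u} \\
&\qquad \times {}_2F_1\Big( {1+2u,\ u-m \atop 2+m+u} \Big| \frac{(1-\Omega)^2}{(1+\Omega)^2} \Big) \\
={}& \frac{\theta}{2(1+\Omega)^2} \sum_{u=0}^{m} \sum_{s=0}^{m-u} (-1)^s \frac{(m!)^2\,(2u+s)!}{(m-u-s)!\,(u!)^2\,(1+m+u+s)!\,s!} \Big(\frac{(1-\Omega)^2}{(1+\Omega)^2}\Big)^{u+s} \\
={}& \frac{\theta}{2(1+\Omega)^2} \sum_{u=0}^{m} \sum_{r=u}^{m} (-1)^{u+r} \frac{(m!)^2\,(r+u)!}{(m-r)!\,(u!)^2\,(1+m+r)!\,(r-u)!} \Big(\frac{(1-\Omega)^2}{(1+\Omega)^2}\Big)^{r} \\
={}& \frac{\theta}{2(1+\Omega)^2} \sum_{r=0}^{m} (-1)^r \frac{(m!)^2}{(m-r)!\,(1+m+r)!} \Big(\frac{(1-\Omega)^2}{(1+\Omega)^2}\Big)^{r}\, {}_2F_1\Big( {r+1,\ -r \atop 1} \Big| 1 \Big) \\
={}& \frac{\theta}{2(1+\Omega)^2} \sum_{r=0}^{m} \frac{(m!)^2}{(m-r)!\,(1+m+r)!} \Big(\frac{(1-\Omega)^2}{(1+\Omega)^2}\Big)^{r} \\
={}& \frac{\theta}{2(1+\Omega)^2\,(m+1)}\, {}_2F_1\Big( {1,\ -m \atop m+2} \Big| -\frac{(1-\Omega)^2}{(1+\Omega)^2} \Big) \\
\sim{}& \begin{cases} \dfrac{\theta}{8\Omega\,(m+1)} & \text{for } \Omega > 0 ,\ m \gg 1 , \\[2ex] \dfrac{\sqrt{\pi}\,\theta}{4\sqrt{m+\frac{3}{4}}} & \text{for } \Omega = 0 ,\ m \gg 1 . \end{cases}
\end{aligned} \tag{A.22}
\]
We see a crucial difference in the asymptotic behaviour for $\Omega > 0$ versus $\Omega = 0$. The slow decay with $m^{-\frac{1}{2}}$ of the propagator is responsible for the non-renormalisability of the $\phi^4$-model in case of $\Omega = 0$. The numerical result (C.2) shows that the maximum of the propagator for indices restricted by $C \le \max(m^1,m^2,n^1,n^2,k^1,k^2,l^1,l^2) \le 2C$ is very close to the result (A.22), for $m = C$. For $\Omega = 0$ the maximum is exactly given by the 7th line of (A.22).

A.3. Asymptotic behaviour of the propagator for large $\alpha^i$. We consider various limiting cases of the propagator, making use of the asymptotic expansion (Stirling's formula) of the Γ-function,
\[
\Gamma(n+1) \sim \Big(\frac{n}{e}\Big)^{n} \sqrt{2\pi\big(n+\tfrac{1}{6}\big)} + O(n^{-2}) . \tag{A.23}
\]
This implies
\[
\frac{\Gamma(n+1+a)}{\Gamma(n+1+b)} \sim n^{a-b} \Big( 1 + \frac{(a-b)(a+b+1)}{2n} + O(n^{-2}) \Big) . \tag{A.24}
\]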
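The asymptotic statements in (A.22) and (A.24) can be illustrated numerically. The sketch below is an assumption-laden illustration rather than part of the paper: it evaluates the finite sum over $r$ in (A.22) (so $\mu_0 = 0$ is assumed, with θ set to 1 by default), compares it with the two asymptotic forms, and compares the Γ-function ratio computed via `math.lgamma` with the right-hand side of (A.24). All function names are hypothetical.

```python
import math

def delta_mm(m, omega, theta=1.0):
    """Finite sum over r in (A.22): theta/(2(1+omega)^2) * sum_r (m!)^2 c^r / ((m-r)! (1+m+r)!)."""
    c = ((1.0 - omega) / (1.0 + omega)) ** 2
    total = 0.0
    term = 1.0 / (m + 1)  # r = 0 term: (m!)^2 / (m! (m+1)!)
    for r in range(m + 1):
        total += term
        term *= c * (m - r) / (m + 2 + r)  # ratio of the (r+1)-th to the r-th term
    return theta / (2.0 * (1.0 + omega) ** 2) * total

def delta_mm_asymptotic(m, omega, theta=1.0):
    """Large-m behaviour quoted in the last line of (A.22)."""
    if omega > 0:
        return theta / (8.0 * omega * (m + 1))
    return math.sqrt(math.pi) * theta / (4.0 * math.sqrt(m + 0.75))

def gamma_ratio(n, a, b):
    """Gamma(n+1+a) / Gamma(n+1+b), computed through log-gamma to avoid overflow."""
    return math.exp(math.lgamma(n + 1 + a) - math.lgamma(n + 1 + b))

def gamma_ratio_asymptotic(n, a, b):
    """Right-hand side of (A.24) without the O(n^-2) remainder."""
    return n ** (a - b) * (1.0 + (a - b) * (a + b + 1) / (2.0 * n))

if __name__ == "__main__":
    for omega in (0.0, 0.5):
        for m in (10, 100, 1000):
            print(omega, m, delta_mm(m, omega) / delta_mm_asymptotic(m, omega))
    print(gamma_ratio(1000, 1.5, -0.5) / gamma_ratio_asymptotic(1000, 1.5, -0.5))
```

The printed ratios should approach 1 as $m$ grows, with the $1/m$ decay for $\Omega > 0$ clearly separated from the $m^{-1/2}$ decay at $\Omega = 0$.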
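We rewrite the propagator (A.19) in a manner where the large-$\alpha^i$ behaviour is easier to discuss:
\[
\begin{aligned}
\Delta_{\substack{m^1\\m^2}\substack{m^1+\alpha^1\\m^2+\alpha^2};\substack{l^1+\alpha^1\\l^2+\alpha^2}\substack{l^1\\l^2}}
&= \sum_{u^1=0}^{\min(m^1,l^1)}\ \sum_{u^2=0}^{\min(m^2,l^2)} \frac{\theta}{2(1+\Omega)^2\big(1+\frac{\mu_0^2\theta}{8}+\frac{1}{2}(\alpha^1+\alpha^2)+m^1+m^2+l^1+l^2-u^1-u^2\big)} \\
&\quad\times \frac{(m^1+m^2+l^1+l^2-2u^1-2u^2)!\,\sqrt{m^1!\,l^1!\,m^2!\,l^2!}}{(m^1-u^1)!\,(l^1-u^1)!\,u^1!\,(m^2-u^2)!\,(l^2-u^2)!\,u^2!}\Big(\frac{1-\Omega}{1+\Omega}\Big)^{m^1+m^2+l^1+l^2-2u^1-2u^2} \\
&\quad\times \Bigg\{ \frac{\Gamma\big(1+\frac{\mu_0^2\theta}{8}+\frac{1}{2}(\alpha^1+\alpha^2)+u^1+u^2\big)}{\Gamma\big(1+\frac{\mu_0^2\theta}{8}+\frac{1}{2}(\alpha^1+\alpha^2)+m^1+m^2+l^1+l^2-u^1-u^2\big)} \sqrt{\frac{(\alpha^1+m^1)!\,(\alpha^1+l^1)!}{(\alpha^1+u^1)!\,(\alpha^1+u^1)!}\,\frac{(\alpha^2+m^2)!\,(\alpha^2+l^2)!}{(\alpha^2+u^2)!\,(\alpha^2+u^2)!}} \Bigg\} \\
&\quad\times {}_2F_1\Big(\genfrac{}{}{0pt}{}{1+m^1+m^2+l^1+l^2-2u^1-2u^2\,,\;\frac{\mu_0^2\theta}{8}-\frac{1}{2}(\alpha^1+\alpha^2)-u^1-u^2}{2+\frac{\mu_0^2\theta}{8}+\frac{1}{2}(\alpha^1+\alpha^2)+m^1+m^2+l^1+l^2-u^1-u^2}\Big|\frac{(1-\Omega)^2}{(1+\Omega)^2}\Big)\,.
\end{aligned}
\tag{A.25}
\]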
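We assume $\frac{1}{2}(\alpha^1+\alpha^2) \ge \max\big(\frac{\mu_0^2\theta}{8}, m, l\big)$. The term in braces $\{\ \}$ in (A.25) behaves like
\[
\begin{aligned}
\{\dots\} &\sim \big(\tfrac{1}{2}(\alpha^1+\alpha^2)\big)^{2u^1+2u^2-m^1-m^2-l^1-l^2}\,(\alpha^1)^{\frac{1}{2}(m^1+l^1-2u^1)}\,(\alpha^2)^{\frac{1}{2}(m^2+l^2-2u^2)} \\
&\quad\times \Big(1 + \frac{(2u^1+2u^2-m^1-m^2-l^1-l^2)\big(m^1+m^2+l^1+l^2+\frac{\mu_0^2\theta}{4}+1\big)}{\alpha^1+\alpha^2} + O\big((\alpha^1+\alpha^2)^{-2}\big)\Big) \\
&\quad\times \Big(1 + \frac{(m^1-u^1)(m^1+u^1+1)}{4\alpha^1} + \frac{(l^1-u^1)(l^1+u^1+1)}{4\alpha^1} + O\big((\alpha^1)^{-2}\big)\Big) \\
&\quad\times \Big(1 + \frac{(m^2-u^2)(m^2+u^2+1)}{4\alpha^2} + \frac{(l^2-u^2)(l^2+u^2+1)}{4\alpha^2} + O\big((\alpha^2)^{-2}\big)\Big)\,.
\end{aligned}
\tag{A.26}
\]
We look for the maximum of the propagator under the condition $C \le \max(\alpha^1, \alpha^2) \le 2C$. Defining $s^i = m^i + l^i - 2u^i$ and $s = s^1 + s^2$, the dominating term in (A.26) is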
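\[
\frac{(\alpha^1)^{\frac{s^1}{2}}\,(\alpha^2)^{\frac{s^2}{2}}}{\big(\frac{1}{2}(\alpha^1+\alpha^2)\big)^{s}}\Bigg|_{C \le \max(\alpha^1,\alpha^2) \le 2C}
\le \frac{1}{C^{\frac{s}{2}}}\,\max\Bigg( \frac{\big(\frac{s^1}{s^1+2s^2}\big)^{\frac{s^1}{2}}}{\big(\frac{s^1+s^2}{s^1+2s^2}\big)^{s^1+s^2}}\,,\ \frac{\big(\frac{s^2}{s^2+2s^1}\big)^{\frac{s^2}{2}}}{\big(\frac{s^1+s^2}{s^2+2s^1}\big)^{s^1+s^2}} \Bigg)\,.
\tag{A.27}
\]
The maximum is attained at $(\alpha^1, \alpha^2) = \big(\frac{s^1 C}{s^1+2s^2}, C\big)$ for $s^1 \le s^2$ and at $(\alpha^1, \alpha^2) = \big(C, \frac{s^2 C}{s^2+2s^1}\big)$ for $s^1 \ge s^2$. Thus, the leading contribution to the propagator will come from the summation index $u^i = \min(m^i, l^i)$.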
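The location of the maximum quoted for (A.27) can be illustrated by a direct search. The sketch below is only an illustration under stated assumptions: it treats $\alpha^1$ as a continuous variable, restricts attention to the edge $\alpha^2 = C$, $0 < \alpha^1 \le C$, and compares a grid search with the stationary point $\alpha^1 = s^1 C/(s^1+2s^2)$. All function names are hypothetical.

```python
def dominating_term(a1, a2, s1, s2):
    # (alpha1)^(s1/2) (alpha2)^(s2/2) / ((alpha1 + alpha2)/2)^s with s = s1 + s2.
    s = s1 + s2
    return a1 ** (s1 / 2) * a2 ** (s2 / 2) / (0.5 * (a1 + a2)) ** s

def edge_maximum(C, s1, s2):
    # Stationary point on the edge alpha2 = C, obtained from d/d(alpha1) log f = 0.
    a1 = s1 * C / (s1 + 2 * s2)
    return a1, dominating_term(a1, C, s1, s2)

def grid_maximum(C, s1, s2, steps=20000):
    # Brute-force search over alpha1 in (0, C] on the same edge.
    best = max(range(1, steps + 1),
               key=lambda i: dominating_term(C * i / steps, C, s1, s2))
    a1 = C * best / steps
    return a1, dominating_term(a1, C, s1, s2)

if __name__ == "__main__":
    print(edge_maximum(100.0, 1, 3))
    print(grid_maximum(100.0, 1, 3))
```

Since the term is homogeneous of degree $-s/2$ in $(\alpha^1, \alpha^2)$, the smallest admissible scale $\max(\alpha^1,\alpha^2) = C$ gives the largest value, which is why only this edge is searched.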
H. Grosse, R. Wulkenhaar
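Next we evaluate the leading contribution of the hypergeometric function:
\[
\begin{aligned}
&{}_2F_1\Big(\genfrac{}{}{0pt}{}{1+m^1+m^2+l^1+l^2-2u^1-2u^2\,,\;\frac{\mu_0^2\theta}{8}-\frac{1}{2}(\alpha^1+\alpha^2)-u^1-u^2}{2+\frac{\mu_0^2\theta}{8}+\frac{1}{2}(\alpha^1+\alpha^2)+m^1+m^2+l^1+l^2-u^1-u^2}\Big|\frac{(1-\Omega)^2}{(1+\Omega)^2}\Big) \\
&\sim \sum_{k=0}^{\infty} \frac{(m^1+m^2+l^1+l^2-2u^1-2u^2+k)!}{(m^1+m^2+l^1+l^2-2u^1-2u^2)!\,k!}\Big(-\frac{(1-\Omega)^2}{(1+\Omega)^2}\Big)^{k} \Big(1 + \frac{k\big(2u^1+2u^2-\frac{\mu_0^2\theta}{4}+1-k\big)}{\alpha^1+\alpha^2} \\
&\qquad\qquad - \frac{k\big(3+2m^1+2m^2+2l^1+2l^2-2u^1-2u^2+\frac{\mu_0^2\theta}{4}+k\big)}{\alpha^1+\alpha^2} + O\big((\alpha^1+\alpha^2)^{-2}\big)\Big) \\
&= \sum_{k=0}^{\infty} \frac{(s+k)!}{s!\,k!}\Big(-\frac{(1-\Omega)^2}{(1+\Omega)^2}\Big)^{k} \Big(1 - \frac{2k\big(1+s+\frac{\mu_0^2\theta}{4}+k\big)}{\alpha^1+\alpha^2} + O\big((\alpha^1+\alpha^2)^{-2}\big)\Big) \\
&= \Big(\frac{(1+\Omega)^2}{2(1+\Omega^2)}\Big)^{1+s} \Big(1 + \frac{(1-\Omega)^2(1+s)}{(1+\Omega^2)(\alpha^1+\alpha^2)}\Big(1+\frac{s}{2}+\frac{\mu_0^2\theta}{4}+(s+2)\frac{\Omega}{1+\Omega^2}\Big) + O\big((\alpha^1+\alpha^2)^{-2}\big)\Big)\,.
\end{aligned}
\tag{A.28}
\]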
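The expansion (A.28) can be compared with a direct evaluation of the hypergeometric series. The sketch below is an illustration only: it sets $\mu_0 = 0$, chooses $\frac{1}{2}(\alpha^1+\alpha^2)$ to be an integer so that the series terminates, and uses the last line of (A.28) in the form $\big(\frac{(1+\Omega)^2}{2(1+\Omega^2)}\big)^{1+s}\big(1 + \frac{(1-\Omega)^2(1+s)}{(1+\Omega^2)(\alpha^1+\alpha^2)}(1+\frac{s}{2}+\frac{\mu_0^2\theta}{4}+(s+2)\frac{\Omega}{1+\Omega^2})\big)$. Function and parameter names are hypothetical.

```python
def hyp2f1_series(a, b, c, z, max_terms=5000):
    # Plain power series for 2F1(a, b; c | z); it terminates exactly
    # when b is a non-positive integer (a term becomes zero).
    term, total = 1.0, 1.0
    for k in range(max_terms):
        term *= (a + k) * (b + k) / ((c + k) * (k + 1)) * z
        if term == 0.0:
            break
        total += term
    return total

def leading_a28(s, omega, alpha_sum, mu2theta=0.0):
    # Leading term and first correction of (A.28); alpha_sum = alpha1 + alpha2,
    # mu2theta = mu_0^2 * theta.
    lead = ((1 + omega) ** 2 / (2 * (1 + omega ** 2))) ** (1 + s)
    corr = ((1 - omega) ** 2 * (1 + s) / ((1 + omega ** 2) * alpha_sum)
            * (1 + s / 2 + mu2theta / 4 + (s + 2) * omega / (1 + omega ** 2)))
    return lead, lead * (1 + corr)

if __name__ == "__main__":
    s, u, omega, alpha_sum = 2, 1, 0.5, 400.0   # u = u1 + u2, s = m + l - 2u
    n = alpha_sum / 2
    z = ((1 - omega) / (1 + omega)) ** 2
    exact = hyp2f1_series(1 + s, -n - u, 2 + n + s + u, z)
    print(exact, leading_a28(s, omega, alpha_sum))
```

With these sample values the corrected expression should agree with the series noticeably better than the leading term alone, consistent with a remainder of order $(\alpha^1+\alpha^2)^{-2}$.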
Assuming s 1 ≤ s 2 , we obtain from (A.23), (A.27) and (A.28) the following leading contribution to the propagator (A.25):
m1 m1 +α1 l 1 +α1 l 1 ;
m2 m2 +α 2 l 2 +α 2 l 2
=
max(m1 ,m2 ,l 1 ,l 2 )C≤max(α 1 ,α 2 )≤2C
s1
θ max(m1 , l 1 )
2
s2 2 1− s 1 +s 2 (1+ )2 1+s 1 +s 2 1+ 2(1+ 2 )
max(m2 , l 2 ) s 1 +s 2
(1+ )2 C 1+ 2 s1 1 2 s1 (s 1 +s 2 )s +s 2π(s 1 +s 2 ) ( s 1 +2s 2 ) 2 −1 1 + O(C ) × .(A.29) √ 1 +s 2 1 2 1+s 1 +s 2 (s 1 )s (s 2 )s 2π s 1 s 2 ( ss1 +2s s i :=|mi −l i | 2)
The numerator comes from $\sqrt{\frac{m!}{l!}} \le m^{\frac{m-l}{2}}$ for $m \ge l$. The estimation (A.29) is the explanation of (3.24). Let us now look at propagators with $m^i = l^i$ and $m^i \ll C \le \max(\alpha^1, \alpha^2) \le 2C$:
m1 m1 +α1
m1 +α 1 m1 ; m2 m2 +α 2 m2 +α 2 m2
=
θ
µ20 θ 1 1 2(1+ )2 1+ 8 + 2 (α +α 2 )+m1 +m2 1− 2 2 (1+ )2 µ20 θ 1+ 2 2 × + 1+ + 4 (1+ 2 ) 2(1+ 2 ) 2(α1 +α2 ) +
θ µ20 θ
×
2(1+ )2 1+ 8 + 21 (α 1 +α 2 )+m1 +m2 +1 1− 2 2 (1+ )2 m1 α 1 + m2 α 2
1+ 2 1+ 2 1 2 −3 +O (α +α ) .
(α 1 +α 2 )2 (A.30)
This means
m1 m1 +α1 ; m1 +α1 m1 − 0 m1 +α1
m1 +α1 0 0 m2 +α2 ; m2 +α2 0
m2 m2 +α2 m2 +α2 m2
θ µ20 θ 1 2 1+ 8 + 2 (α 1 +α 2 +m1 +m2 ) θ + µ20 θ 1 2(1+ 2 ) 1+ 8 + 2 (α 1 +α 2 +m1 +m2 ) 1− 2 2 m1 (α 1 +m1 ) + m2 (α 2 +m2 ) × 1+ 2 (α 1 +α 2 +m1 +m2 )2 1 (m1 +m2 )2 +O (α 1 +α 2 +m1 +m2 ) (α 1 +α 2 +m1 +m2 )2 = m1 1 m1 +α1 ; m1 +α1 1 − 0 m1 +α1 ; m1 +α1 0 0 m +α m +α 0 0 m2 +α2 m2 +α2 0 2 2 2 2 +m2 0 m1 +α1 ; m1 +α1 0 − 0 m1 +α1 ; m1 +α1 0
= −(m1 +m2 )
8(1+ 2 )
1 m2 +α2 m2 +α2 1
0 m2 +α2 m2 +α2 0
(m1 +m2 )2 1 . +O (α 1 +α 2 +m1 +m2 ) (α 1 +α 2 +m1 +m2 )2
(A.31)
The second and third lines of (A.31) explain the estimation (3.28). Clearly, the next term in the expansion is of the order $\frac{(m^1+m^2)^2}{(\alpha^1+\alpha^2+m^1+m^2)^3}$, which explains the estimation (3.29). For $m^1 = l^1 + 1$ and $m^2 = l^2$ we have
l1 +1 l1 +1+α1 ; l1 +α1 l1 = l2
l2 +α2
l2 +α2 l2
µ20 θ
θ
2(1+ )2 8 + 21 (α 1 +α 2 )+l 1 +l 2 +2 1− 2 (l 1 +1)(l 1 +α 1 +1) −1 × 1 + O (α . (A.32) +α ) 1 2 α 1 +α 2 1+ 2
This yields
l1 +1 l1 +1+α1 ; l1 +α1 l1 −
l1 +1 1 l1 +1+α1 ; l1 +α1 0 0 l2 +α2 l2 +α2 0 l 1 +1 (l 1 +1) 1 =O , (α 1 +α 2 +l 1 +l 2 ) α 1 +α 2 +l 1 +l 2 (α 1 +α 2 +l 1 +l 2 ) l2
l2 +α2
l2 +α2 l2
(A.33)
which explains the estimation (3.30). Similarly, we have √ θ α 1 +1 , =O (α 1 +α 2 +1)3
1 1+α1 ;
α1 0 1 1+α2 1+α2 1
− 1 1+α1 ;
α1 0 0 1+α2 1+α2 0
which shows that the norm of (B.7) is of the same order as (3.30).
(A.34)
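A.4. An identity for hypergeometric functions. For terminating hypergeometric series ($m, l \in \mathbb{N}$) we compute the sum in the last line of (A.15):
\[
\begin{aligned}
&\sum_{x=0}^{\infty} \frac{(\alpha+x)!}{x!\,\alpha!}\,a^{x}\; {}_2F_1\Big(\genfrac{}{}{0pt}{}{-m\,,\;-x}{1+\alpha}\Big|b\Big)\; {}_2F_1\Big(\genfrac{}{}{0pt}{}{-l\,,\;-x}{1+\alpha}\Big|b\Big) \\
&= \sum_{x=0}^{\infty}\ \sum_{r=0}^{\min(x,m)}\ \sum_{s=0}^{\min(x,l)} \frac{(\alpha+x)!}{x!\,\alpha!}\,\frac{m!\,x!\,\alpha!}{(m-r)!\,(x-r)!\,(\alpha+r)!\,r!}\,\frac{l!\,x!\,\alpha!}{(l-s)!\,(x-s)!\,(\alpha+s)!\,s!}\;a^{x}\,b^{r}\,b^{s} \\
&= \sum_{r=0}^{m}\ \sum_{s=0}^{l}\ \sum_{x=\max(r,s)}^{\infty} \frac{(\alpha+x)!\,x!\,\alpha!\,m!\,l!\;a^{x}\,b^{r+s}}{(m-r)!\,(x-r)!\,(\alpha+r)!\,r!\,(l-s)!\,(x-s)!\,(\alpha+s)!\,s!} \\
&= \sum_{r=0}^{m}\ \sum_{s=0}^{l} \frac{\alpha!\,m!\,l!\;a^{\max(r,s)}\,b^{r+s}}{(m-r)!\,(\alpha+r)!\,r!\,(l-s)!\,(\alpha+s)!\,s!} \sum_{y=0}^{\infty} \frac{(\alpha+y+\max(r,s))!\,(y+\max(r,s))!}{(y+|r-s|)!\,y!}\,a^{y} \\
&= \sum_{r=0}^{m}\ \sum_{s=0}^{l} \frac{\alpha!\,m!\,l!\;a^{\max(r,s)}\,b^{r+s}}{(m-r)!\,(\alpha+r)!\,r!\,(l-s)!\,(\alpha+s)!\,s!}\,\frac{(\alpha+\max(r,s))!\,(\max(r,s))!}{(|r-s|)!}\; {}_2F_1\Big(\genfrac{}{}{0pt}{}{\alpha+\max(r,s)+1\,,\;\max(r,s)+1}{|r-s|+1}\Big|a\Big) \\
&=^{*} \sum_{r=0}^{m}\ \sum_{s=0}^{l} \frac{\alpha!\,m!\,l!\;a^{\max(r,s)}\,b^{r+s}}{(m-r)!\,(\alpha+r)!\,r!\,(l-s)!\,(\alpha+s)!\,s!\,(1-a)^{\alpha+r+s+1}}\,\frac{(\alpha+\max(r,s))!\,(\max(r,s))!}{(|r-s|)!}\; {}_2F_1\Big(\genfrac{}{}{0pt}{}{-\min(\alpha+r,\alpha+s)\,,\;-\min(r,s)}{|r-s|+1}\Big|a\Big) \\
&= \sum_{r=0}^{m}\ \sum_{s=0}^{l} \frac{\alpha!\,m!\,l!\;a^{\max(r,s)}\,b^{r+s}}{(m-r)!\,(\alpha+r)!\,r!\,(l-s)!\,(\alpha+s)!\,s!\,(1-a)^{\alpha+r+s+1}} \\
&\qquad\times \sum_{u'=0}^{\min(r,s)} \frac{(\alpha+\max(r,s))!\,(\max(r,s))!\,(\alpha+\min(r,s))!\,(\min(r,s))!}{(|r-s|+u')!\,(\min(r,s)-u')!\,(\alpha+\min(r,s)-u')!\,u'!}\,a^{u'} \\
&= \sum_{r=0}^{m}\ \sum_{s=0}^{l}\ \sum_{u=0}^{\min(r,s)} \frac{\alpha!\,m!\,l!\;a^{r+s-u}\,b^{r+s}}{(m-r)!\,(r-u)!\,(l-s)!\,(s-u)!\,(\alpha+u)!\,u!\,(1-a)^{\alpha+r+s+1}} \\
&= \sum_{u=0}^{\min(m,l)}\ \sum_{r=u}^{m}\ \sum_{s=u}^{l} \frac{\alpha!\,m!\,l!\;a^{r+s-u}\,b^{r+s}}{(m-r)!\,(r-u)!\,(l-s)!\,(s-u)!\,(\alpha+u)!\,u!\,(1-a)^{\alpha+r+s+1}} \\
&= \sum_{u=0}^{\min(m,l)} \frac{\alpha!\,m!\,l!}{(m-u)!\,(l-u)!\,(\alpha+u)!\,u!}\,\frac{1}{a^{u}\,(1-a)^{\alpha+1}}\Big(\frac{ab}{1-a}\Big)^{2u}\Big(1+\frac{ab}{1-a}\Big)^{m+l-2u} \\
&= \frac{(1-a+ab)^{m+l}}{(1-a)^{m+l+\alpha+1}}\; {}_2F_1\Big(\genfrac{}{}{0pt}{}{-m\,,\;-l}{\alpha+1}\Big|\frac{ab^2}{(1-a+ab)^2}\Big)\,.
\end{aligned}
\tag{A.35}
\]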
In the step denoted by =∗ we have used [25, §9.131.1]. All other transformations should be self-explaining.
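The identity (A.35) lends itself to a quick numerical test. The following is a minimal sketch, assuming $|a| < 1$ so that the sum over $x$ converges and truncating it at a finite `x_max`; the helper names are illustrative and not taken from the paper.

```python
import math

def hyp2f1_terminating(p, q, c, z):
    # 2F1(-p, -q; c | z) for non-negative integers p, q (a finite sum).
    total, term = 1.0, 1.0
    for k in range(min(p, q)):
        term *= (k - p) * (k - q) / ((c + k) * (k + 1)) * z
        total += term
    return total

def lhs_a35(m, l, alpha, a, b, x_max=600):
    # Truncated left-hand side of (A.35): sum over x of
    # binom(alpha+x, x) a^x 2F1(-m,-x;1+alpha|b) 2F1(-l,-x;1+alpha|b).
    return sum(math.comb(alpha + x, x) * a ** x
               * hyp2f1_terminating(m, x, 1 + alpha, b)
               * hyp2f1_terminating(l, x, 1 + alpha, b)
               for x in range(x_max))

def rhs_a35(m, l, alpha, a, b):
    # Closed form on the right-hand side of (A.35).
    w = 1 - a + a * b
    return (w ** (m + l) / (1 - a) ** (m + l + alpha + 1)
            * hyp2f1_terminating(m, l, alpha + 1, a * b ** 2 / w ** 2))

if __name__ == "__main__":
    print(lhs_a35(3, 2, 2, 0.3, 0.7), rhs_a35(3, 2, 2, 0.3, 0.7))
```

For $m = l = 1$ both sides reduce to $\big((1-a+ab)^2 + \frac{ab^2}{\alpha+1}\big)/(1-a)^{\alpha+3}$, which gives an independent hand check of the formula.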
B. On Composite Propagators

B.1. Identities for differences of ribbon graphs. We continue here the discussion of Sect. 3.3 on composite propagators generated by differences of interaction coefficients. After having derived (3.19), we now have a look at (3.11). Since γ is one-particle irreducible, we get for a certain permutation π ensuring the history of integrations the following linear combination:
∂ (V ∂ )γ (V )γ ] − (m1 +1)(n1 +1) A [
A [ ] 1 1 1 1 m +1 n +1 n m 1 1;0 0 ∂
∂
; 0 0 0 0 2 2 2 2 m n n m p−1 Q n1 +1 = ... 1 ( πp ) n1 +1 ( πi )Q n1 +1 kπi ;kπi n2 (kπp +1+ );kπp nn2 n2 n2
i=1
a
×
Q n1
1 k ;k n n2 πi πi n2
i=p+1
( πi )
×Q m1
m1 +1 l ;(lπq +1+ ) m2 m2 πq
q−1
b
( πq )
Q 0 kπ 0
i=p+1
×
q j =1
= ...
Q 0 lπ 0
p−1
0 i ;kπi 0
0 j ;lπj 0
0
n2
0
kπi ;kπi
×Q n1 +1
1 (kπp +1+ );kπp nn2 n2
p−1 − n1 +1 Q 1 kπ i=1
×
q−1 j =1
Q m1
m2
1 i ;kπi 0
( πj )
m1 +1 m2
lπj ;lπj
( πi )Q 1 (kπ 0
( πj )
0 + p +1 );kπp 0
( πp )
( πi )
( πj )Q 0 lπ
Q n1 +1
i=1
Q m1 +1
j =q+1
i=1 a
1 l ;l m m2 πj πj m2
j =1
p−1 − (m1 +1)(n1 +1) Q 1 kπ
×
Q m1
0
1 l ;l m m2 πj πj m2
n1 +1 n2
+ 1 q ;(lπq +1 ) 0
j =q+1
Q 1 lπ 0
1 j ;lπj 0
( πj )
( πi ) a
( πp )
1 i ;kπi 0
( πq )
b
Q n1
i=p+1
( πi )Q 1 (kπ 0
( πj )Q m1
1 k ;k n n2 πi πi n2
0 + p +1 );kπp 0
m1 +1 l ;(lπq +1+ ) m2 m2 πq
( πi )
( πp )
( πq )
a
Q 0 kπ 0
i=p+1 b
Q m1 +1
j =q+1
m2
0 i ;kπi 0
lπj ;lπj
( πi )
m1 +1 m2
( πj ) (B.1a)
p−1 + n1 +1 Q 1 kπ i=1
×
q−1
0
i ;kπi 0
Q m1
1 l ;l m m2 πj πj m2
j =1
1 ( πi )Q 1 (kπ
+ p +1 );kπp 0
( πj )Q m1
m1 +1 l ;(lπq +1+ ) m2 m2 πq
q − (m1 +1) Q 0 lπ 0
j =1
0
0 j ;lπj 0
a
0 ( πp )
( πj )Q 0 lπ 0
Q 0 kπ 0
i=p+1
( πq )
b
Q m1 +1 m2
i=q+1
( πq )
+ 1 q ;(lπq +1 ) 0
0 i ;kπi 0
lπj ;lπj
( πi )
m1 +1 m2
( πj )
Q 1 lπ ;lπ 1 ( πj ) , 0 j j 0
b
j =q+1
(B.1b) with $1^+ := \genfrac{}{}{0pt}{}{1}{0}$. We further analyse the first three lines of (B.1a): p−1
Q n1 +1
i=1
n1 +1 kπi ;kπi n2 n2
( πi )Q n1 +1
1 (kπp +1+ );kπp nn2 n2
p−1 − n1 +1 Q 1 kπ
=
p−1
0
i=1
1 i ;kπi 0
Q n1 +1
n1 +1 kπi ;kπi n2 n2
i=1
×Q n1 +1
1 (kπp +1+ );kπp nn2 n2
+
p−1
Q 1 kπ
i=1
0
1 i ;kπi 0
p−1
+ n1 +1
i=1 a
×
i=p+1
( πi )Q 1 (kπ 0
( πi ) −
p−1
Q n1
1 k ;k n n2 πi πi n2
i=p+1
( πp )
Q 1 kπ
1 i ;kπi 0
( πi )
1 k ;k n n2 πi πi n2
( πi )
( πi )
a
0 + p +1 );kπp 0
Q 0 kπ
i=p+1
0
0 i ;kπi 0
( πi )
0
i=1 a
( πp )
a
( πp )
Q n1
i=p+1
(+ 1 ) 2 ( πi ) Q n1 +1
(B.2a)
a
1 (kπp +1+ );kπp nn2 n2
( πp )
Q n1
1 k ;k n n2 πi πi n2
i=p+1
( πi ) (B.2b)
Q 1 kπ 0
Q n1
1 i ;kπi 0
1 k ;k n n2 πi πi n2
( πi )Q 1 (kπ
( πi ) −
0
a i=p+1
0 + p +1 );kπp 0
( πp )
Q 0 kπ 0
0 i ;kπi 0
( πi )
.
(B.2c)
According to (3.19), the two lines (B.2a) and (B.2c) yield graphs having one composite propagator (3.17a), whereas the line (B.2b) yields a graph having one composite propagator13 (3.17c). In total, we get from (B.1) a + b graphs with composite propagator (3.17a) or (3.17c). The treatment of (3.12) is similar. Second, we treat that contribution to (3.10) which consists of graphs with constant index along the trajectories:
∂ (V )γ A m1 n1 n1 m1 [ ] ∂ m2 n2 ; n2 m2
1 1 2 . ThereNote that the estimation (3.24) yields n1 +1|Q 1 (k +1+ );k 0 ( πp )| ≤ C 2 n +1 θ
θ 2 πp 0 0 πp 1 1 2 which is required for the fore, the prefactor n1 +1 in (B.2c) combines actually to the ratio n +1 θ 2 (3.32)-term in Proposition 2.3. 13
∂ ∂ ∂ (V )γ (V )γ (V )γ ] − m 1 ] − ] A [
A [
A [
∂ 00 00 ; 00 00 ∂ 01 00 ; 00 01 ∂ 00 00 ; 00 00 ∂ ∂ ∂ (V )γ (V )γ (V )γ A 0 0 0 0 [ ] − A 0 0 0 0 [ ] − n1 A [ ] −m2 ∂ 1 0 ; 0 1 ∂ 0 0 ; 0 0 ∂ 00 01 ; 01 00 ∂ ∂ ∂ (V )γ (V )γ (V )γ ] − n 2 ] − ] A [
A [
A [
− 0 0 0 0 0 0 0 0 0 0 0 0 ∂ 0 0 ; 0 0 ∂ 0 1 ; 1 0 ∂ 0 0 ; 0 0 a b Q n1 Q m1 = ... n1 ( πi ) m1 ( πj ) −
k ;k n2 πi πi n2
i=1 a
−
Q 0 kπ 0
0 i ;kπi 0
Q 0 kπ
0 i ;kπi 0
i=1 a
−
0
i=1
+m
2
b
− n1
Q 0 lπ
j =1 a
1
0
i=1
+n
2
a
Q 0 kπ 1
i=1
= ...
a
×
a i=1
−m1
0
b j =1
−m2 +
b
j =1 a
−n1
0 i ;kπi 1
0
( πi ) −
1
( πi )
i=1
Q 1 kπ 0
1 i ;kπi 0
( πj ) −
( πi ) −
( πi ) −
0 i ;kπi 0
0 i ;kπi 0
( πj )
b
b j =1 a
( πi )
( πi )
j =1
0
0
0 i ;kπi 0
Q 0 lπ 0
0 j ;lπj 0
( πj )
( πi )
0 j ;lπj 0
1 l ;l m m2 πj πj m2
(B.3a)
( πj )
( πi ) −
b j =1
Q 0 lπ 0
j ;lπj
0 0
( πj )
Q 0 lπ 0
0 j ;lπj 0
( πj )
Q 0 lπ 0
0 j ;lπj 0
( πj )
Q 0 kπ 0
i=1 a
0 i ;kπi 0
( πi )
Q 0 kπ
i=1
0 0
Q 0 kπ
Q 0 lπ
j =1
( πj ) −
0
j ;lπj
( πj )
b
Q m1
j =1
1 k ;k n n2 πi πi n2
0
i=1
j =1
j =1
Q 0 lπ
a
b
b
( πj ) −
0 j ;lπj 0
Q 0 kπ
0
( πi ) −
( πj ) −
0 j ;lπj 1
0
Q 0 kπ
i=1
1 j ;lπj 0
Q 0 lπ
Q 0 lπ
i=1 a
1 j ;lπj 0
( πi ) −
b
Q 1 lπ
0
j =1 a
1 k ;k n n2 πi πi n2
0 i ;kπi 0
( πj )
0 j ;lπj 0
b
( πj ) −
1 i ;kπi 0
Q n1
i=1 a
0
j =1
1 l ;l m m2 πj πj m2
Q 0 kπ
Q 0 lπ
b ( πi ) m1 Q 1 lπ
Q m1
j =1
+
j =1
Q n1
i=1 b
b
( πi )
0 j ;lπj 1
Q 1 kπ
l ;l m2 πj πj m2
j =1
0
0 i ;kπi 0
( πi )
(B.3b)
−n2
a
Q 0 kπ 1
i=1
0 i ;kπi 1
( πi ) −
a
Q 0 kπ 0
i=1
b
0 i ;kπi 0
( πi )
j =1
Q 0 lπ 0
0 j ;lπj 0
( πj ) . (B.3c)
It is clear from (3.19) that the part corresponding to (B.3a) can be written as a sum of graphs containing (at different trajectories) two composite propagators (0) (0) ( πi ) and Q m1 of type (3.17a). We further analyse (B.3b): Q n1 n1 m1 k ;k n2 πi πi n2
b j =1
l ;l m2 πj πj m2
Q m1
1 l ;l m m2 πj πj m2
−
b j =1
( πj ) −
b j =1
Q 0 lπ 0
0 j ;lπj 0
(1) 1 l ;l m m2 π1 π1 m2
( πj ) − m
( π1 )
b j =2
(0) + Q m1
0
j =2 b
b j =2
−m
2
0
j =2 b
(0) Q0 0 ( π1 ) 1 lπ1 ;lπ1 1
0 1 ;lπ1 0
b j =2
( π1 )
0
j ;lπj
Q 0 lπ 1
b
0 j ;lπj 0
( πj ) − m1 Q 0 lπ 1
1 0
0 j ;lπj 1
0 j ;lπj 1
1 l ;l m m2 πj πj m2
0
1 j ;lπj 0
Q 0 lπ 1
0 j ;lπj 1
( πj ) −
j =2
( πj ) −
b j =2
0
1 j ;lπj 0
b
( πj ) −
j =1
( πj )
Q 0 lπ 0
0 j ;lπj 0
( πj ) (B.4a)
( πj ) −
b j =2
( πj ) − ( πj ) −
1 l ;l m m2 πj πj m2
b
Q 1 lπ
( πj )
Q m1
Q 1 lπ
b j =1
Q m1
j =2
Q 1 lπ
2
j =2 b
(0) 1 ( π1 ) 0 lπ1 ;lπ1 0
−m1
Q 0 lπ
b
−m1 Q 1
+Q 0 lπ
0 j ;lπj 0
j =1
( π1 ) m1
l ;l m2 π1 π1 m2
−m
0
= Q m1
2
Q 0 lπ
b j =2 b
Q 0 lπ 0
0 j ;lπj 0
( πj )
Q 0 lπ 0
0 j ;lπj 0
( πj )
Q 0 lπ
j =2 b
( πj ) −
j =2
0
0 j ;lπj 0
(B.4b)
( πj )
Q 0 lπ 0
0 j ;lπj 0
( πj )
Q 0 lπ 0
j ;lπj
0 0
( πj )
Q 0 lπ 0
0 j ;lπj 0
( πj )
.
(B.4c)
The part (B.4a) gives rise to graphs with one propagator (3.17b). Due to (3.19) the part (B.4b) yields graphs with two propagators14 (3.17a) appearing on the same trajectory. Finally, the part (B.4c) has the same structure as the lhs of the equation, now starting with j = 2. After iteration we obtain further graphs of the type (B.4a) and (B.4b). 14
(0) C m1 1 ( π1 ) is according to (3.28) bounded by θ 2 θ 2 . This l ;l 0 π1 π1 0 r m1 , m2 in (B.4b) combine actually to the ratio m 2 which is required for the θ
Note that the product m1 Q 1
means that the prefactors (3.32)-term in Proposition 2.2.
Finally, we look at that contribution to (3.10) which consists of graphs where one index component jumps forward and backward in the n1 -component. We can directly use the decomposition derived in (B.3) regarding, if the n1 -index jumps up, a
Q n1
1 k ;k n n2 πi πi n2
i=1
→
p−1
( πi )
Q n1
n1 ( πi )Q n1
k ;k n2 πi πi n2
i=1
×Q n1 +1 n2
q−1
n1 +1 ( πp )
k ;(kπp +1) n2 n2 πp
(kπq +1+ );kπq
n1 n2
a
( πq )
i=p+1
Q n1
1 k ;k n n2 πi πi n2
i=q+1
Q n1 +1 n2
kπi ;kπi
n1 +1 n2
( πi ) (B.5)
( πi ) .
This requires processing (B.3) slightly differently. The two parts (B.3a) and (B.3b) need no further discussion, as they lead to graphs having a composite propagator (3.17a) on the m-trajectory. We write (B.3c) as follows:
(B.3c) =
a
Q n1
1 k ;k n n2 πi πi n2
i=1
−n
1
a i=1
−n
2
a i=1
Q 1 kπ 0
i ;kπi
Q 0 kπ 1
( πi ) − (n1 +1)
a
Q 0 kπ 0
i=1 1 0
0 i ;kπi 1
( πi ) − 2
a
( πi ) −
(B.6a)
( πi )
Q 0 kπ 0
i=1 a
0 i ;kπi 0
Q 0 kπ 0
i=1
i ;kπi
0 0
(B.6b)
( πi ) b
0 i ;kπi 0
( πi )
j =1
Q 0 lπ 0
0 j ;lπj 0
( πj ). (B.6c)
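The part (B.6c) leads according to (3.19) and (B.5) to graphs either with composite propagators (3.17a) or with propagators
\[
Q_{\substack{1\\1}\substack{l^1+1\\l^2};\substack{l^1\\l^2}\substack{0\\1}} - Q_{\substack{1\\0}\substack{l^1+1\\l^2};\substack{l^1\\l^2}\substack{0\\0}}\,.
\tag{B.7}
\]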
Inserting (B.5) into (B.6a) we have a
Q n1
1 k ;k n n2 πi πi n2
i=1 (B.5)
−→
p−1
Q n1
q−1
Q n1 +1
i=p+1
n2
a
Q 0 kπ 0
i=1
1 k ;k n n2 πi πi n2
i=1
×
( πi ) − (n1 +1)
( πi ) −
kπi ;kπi
p−1 i=1
n1 +1 n2
Q 0 kπ 0
( πi )Q n1 +1 n2
0 i ;kπi 0
0 i ;kπi 0
( πi )
( πi ) Q n1
(kπq +1+ );kπq
n1 +1 k ;(kπp +1) n2 n2 πp
n1 n2
( πq )
a
Q n1
i=q+1
( πp )
1 k ;k n n2 πi πi n2
( πi ) (B.8a)
p−1
+
Q 0 kπ 0
i=1
×Q n1 +1 n2
(+ 1 )
i ;kπi 0
p−1
q−1 q−1
Q 1 kπ 0
(kπq +1+ );kπq
p−1 + n1 +1 Q 0 kπ i=1
0
(+ 1 )
×Q n1 2 n2
(kπq +1+ );kπq
+(n1 +1)
p−1 i=1
q−1
×
Q 1 kπ
i=p+1 a
×
i=q+1
0
0
n2
0 i ;kπi 0
1 k ;k n n2 πi πi n2
( πi )Q 0 kπ
( πi ) Q 1 (kπ
0
0
( πi ) −
1 p ;(kπp +1) 0
Q n1
1 i ;kπi 0
1 k ;k n n2 πi πi n2
0
( πi )
(B.8b)
( πi )
( πi )
1 k ;k n n2 πi πi n2
0
0 i ;kπi 0
Q n1
0
( πi )Q 0 kπ
Q 0 kπ
n1 +1 n2
( πi )
0 i ;kπi 0
Q n1
i=q+1
kπi ;kπi
Q 0 kπ
i=p+1 a
a
n2
q−1
i=q+1
0 i ;kπi 0
Q n1 +1
( πp )
Q 0 kπ
i=p+1
( πq )
( πq ) n1
0
1 p ;(kπp +1) 0
q−1
( πi ) −
( πi ) −
n1 n2
1 k ;k n n2 πi πi n2
( πi )Q 0 kπ
q−1 i=p+1
Q n1
i=q+1
0 i ;kπi 0
1 i ;kπi 0
a
( πq )
n1 +1 kπi ;kπi n2 n2
i=p+1
n2
0
n1 n2
Q n1 +1
i=p+1
×Q n1 +1
Q 0 kπ
i=1
( πp ) n1 +1
k ;(kπp +1) n2 n2 πp
(kπq +1+ );kπq
+ n1 +1
−
2 0 ( πi )Q 1 n
1 p ;(kπp +1) 0
0 + q +1 );kπq 0
a
q−1
( πp )
i=p+1
Q 1 kπ 0
1 i ;kπi 0
( πi )
(B.8d)
( πi )
( πp )
( πq )
Q 0 kπ
i=q+1
(B.8c)
( πi )
0
0 i ;kπi 0
( πi ) .
(B.8e)
Thus, we obtain (recall also (3.19)) a linear combination of graphs either with composite propagator (3.17a) or with composite propagator (3.17c). In power-counting estimations, the prefactors $\sqrt{n^1+1}$ combine according to footnote 13 to the required ratio with the scale $\theta\Lambda^2$. The part (B.6b) is nothing but (B.6a) with $n^1 = 1$ and $n^2 = 0$. If the index jumps down from $n^1$ to $n^1 - 1$, then the graph with $n^1 = 0$ does not exist. There is no change of the discussion of (B.3a) and (B.3b), but now (B.3c) becomes
(B.3c) =
a i=1
Q n1
1 k ;k n n2 πi πi n2
( πi )−n1
a i=1
Q 1 kπ 0
b
1 i ;kπi 0
( πj )
j =1
Q 0 lπ 0
0 j ;lπj 0
( πj ). (B.9)
Using the same steps as in (B.8) we obtain the desired representation through graphs either with composite propagator (3.17a) or with composite propagator (3.17c). We show in Appendix B.2 how the decomposition works in a concrete example.
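The decompositions in this appendix rewrite a difference of two products of propagators as a sum of terms, each containing a single difference factor (a composite propagator). As a reading aid, here is a minimal numerical sketch of that telescoping mechanism, with plain numbers standing in for the propagators $Q$. The lists `a` and `b` are hypothetical stand-ins for the propagators of the original and of the reference graph along one trajectory; this is an illustration, not the authors' construction.

```python
from math import prod

def telescoped_difference(a, b):
    # prod(a) - prod(b) written as a sum over p of
    #   a[0] ... a[p-1] * (a[p] - b[p]) * b[p+1] ... b[-1],
    # so every summand contains exactly one difference factor.
    return sum(prod(a[:p]) * (a[p] - b[p]) * prod(b[p + 1:])
               for p in range(len(a)))

if __name__ == "__main__":
    a = [2.0, 3.0, 5.0]   # stand-ins for propagators of the original graph
    b = [1.0, 4.0, 7.0]   # stand-ins for propagators of the reference graph
    print(prod(a) - prod(b), telescoped_difference(a, b))
```

A trajectory with $a$ propagators thus produces $a$ summands, which matches the count of $a + b$ graphs quoted after (B.2).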
B.2. Example of a difference operation for ribbon graphs. To make the considerations in Sect. 3.3 and Appendix B.1 about differences of graphs and composite propagators understandable, we look at the following example of a planar two-leg graph:

[Graph (B.10): planar two-leg ribbon graph with propagators labelled a, b, c, d, e and index labels $m_1$, $m_2$, $n_1$, $n_2$, $k$.]
According to Proposition 2 it depends on the indices $m_1, n_1, m_2, n_2, k$ whether this graph is irrelevant, marginal, or relevant. It depends on the history of contraction of subgraphs whether there are marginal subgraphs or not. Let us consider $m_1 = k = \genfrac{}{}{0pt}{}{m^1+1}{m^2}$, $n_2 = \genfrac{}{}{0pt}{}{m^1}{m^2}$, $n_1 = \genfrac{}{}{0pt}{}{n^1+1}{n^2}$ and $m_2 = \genfrac{}{}{0pt}{}{n^1}{n^2}$ and the history a-c-d-e-b of contraction. Then, all resulting subgraphs are irrelevant and the total graph is marginal, which leads us to consider the following difference of graphs:

[Graphical equation (B.11): the difference between the graph (B.10) with these indices and its reference graph with zero indices, written as a sum of graphs with composite propagators.]
It is important to understand that according to (3.11) the indices at the external lines of the reference graph (with zero-indices) are adjusted to the external indices of the original (leftmost) graph:

[Graphical identity (B.12): the reference graph carrying the external indices of the original graph, written as external lines with those indices attached to the graph with zero indices.]
Thus, all graphs with composite propagators have the same index structure at the external legs. When further contracting these graphs, the contracting propagator matches the external indices of the original graph. The argumentation in the proof of Proposition 2.3 should be transparent now. In particular, it becomes understandable why the difference (B.11) is irrelevant and can be integrated from $\Lambda_0$ down to $\Lambda$. On the other hand, the reference graph to be integrated from $\Lambda_R$ up to $\Lambda$ becomes

[Graphical expression (B.13): a prefactor depending on $(m^1+1)(n^1+1)$ times the reference graph of (B.12).]
We cannot use the same procedure for the history a-b-c-d-e of contractions in (B.10), because we end up with a marginal subgraph after the a-b contractions. According to Definition 1.1 we have to decompose the a-b subgraph into an irrelevant (according to Proposition 2.1) difference and a marginal reference graph:
[Graphical equation (B.14): decomposition of the a-b subgraph into two irrelevant differences, enclosed in braces { }, and a marginal reference graph with vanishing external indices.]

The two graphs in braces { } are irrelevant and integrated from $\Lambda_0$ down to $\Lambda_c$. The remaining piece can be written as the original $\phi^4$-vertex times a graph with vanishing external indices, which is integrated from $\Lambda_R$ up to $\Lambda_c$ and can be bounded by $C \ln\frac{\Lambda_c}{\Lambda_R}$.
Inserting the decomposition (B.14) into (B.10) we obtain the following decomposition valid for the history a-b-c-d-e:
[Graphical equation (B.15): the difference of graphs for the history a-b-c-d-e, decomposed into the three lines (B.15a), (B.15b) and (B.15c).]
The line (B.15a) corresponds to the first graph in the braces { } of (B.14) for both graphs on the lhs of (B.15). These graphs are already irrelevant15 so that no further decomposition is necessary. The second graph in the braces { } of (B.14), inserted into the lhs of (B.15), yields the line (B.15b). Finally, the last part of (B.14) leads to the line (B.15c). In the right graph (B.15a) the composite propagator is according to (3.28) bounded by C 2 1 2 so θ θ
' 1 m1 +1 ) by which )( that the combination with the prefactor (m1 +1)(n1 +1) leads to the ratio ( m +1 θ 2 θ 2 (B.15a) is suppressed over the first graph on the lhs of (B.15). 15
Let us also look at the relevant contribution $m_1 = k = n_2 = \genfrac{}{}{0pt}{}{m^1}{m^2}$, $n_1 = m_2 = \genfrac{}{}{0pt}{}{n^1}{n^2}$ of the graph (B.10). The history a-c-d-e-b contains irrelevant subgraphs only:

[Graphical equation (B.16): decomposition of the relevant graph and its reference graphs with zero indices into the three lines (B.16a), (B.16b) and (B.16c).]
The line (B.16a) corresponds to (B.4a), the line (B.16b) to (B.3a) and the line (B.16c) to (B.4b). If the history of contractions contains relevant or marginal subgraphs, we first have to decompose the subgraphs into the reference function with vanishing external indices and
an irrelevant remainder. For instance, the decomposition relative to the history a-b-c-d-e would be

[Graphical equation (B.17): decomposition of the graph (B.10) for the history a-b-c-d-e.]
C. Asymptotic Behaviour of the Propagator

For the power-counting theorem we need asymptotic formulae about the scaling behaviour of the cut-off propagator $\Delta^K_{nm;lk}$ and certain index summations. We shall restrict ourselves to the case $\theta_1 = \theta_2 = \theta$ and deduce these formulae from the numerical evaluation of the propagator for a representative class of parameters and special choices of the parameters where we can compute the propagator exactly. These formulae involve the cut-off propagator
\[
\Delta^{C}_{\substack{m^1\\m^2}\substack{n^1\\n^2};\substack{k^1\\k^2}\substack{l^1\\l^2}} :=
\begin{cases}
\Delta_{\substack{m^1\\m^2}\substack{n^1\\n^2};\substack{k^1\\k^2}\substack{l^1\\l^2}} & \text{for } C \le \max(m^1, m^2, n^1, n^2, k^1, k^2, l^1, l^2) \le 2C\,, \\
0 & \text{otherwise}\,,
\end{cases}
\tag{C.1}
\]
which is the restriction of $\Delta_{\substack{m^1\\m^2}\substack{n^1\\n^2};\substack{k^1\\k^2}\substack{l^1\\l^2}}$ to the support of the cut-off propagator $\Lambda\frac{\partial}{\partial\Lambda}\Delta^K_{\substack{m^1\\m^2}\substack{n^1\\n^2};\substack{k^1\\k^2}\substack{l^1\\l^2}}(\Lambda)$ appearing in the Polchinski equation, with $C = \theta\Lambda^2$.
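The restriction (C.1) is simple to express in code. The sketch below is a hypothetical helper, not the numerical code used for the figures: `delta` is any callable returning the full propagator for four two-component indices, and the function returns that value only when the largest index component lies in $[C, 2C]$.

```python
def cutoff_propagator(delta, m, n, k, l, C):
    # m, n, k, l are pairs (upper component, lower component) of non-negative integers.
    # Returns delta(m, n, k, l) if C <= max(all index components) <= 2C, else 0.
    largest = max(*m, *n, *k, *l)
    return delta(m, n, k, l) if C <= largest <= 2 * C else 0.0

if __name__ == "__main__":
    # Placeholder propagator used only to exercise the index window.
    constant = lambda m, n, k, l: 1.0
    print(cutoff_propagator(constant, (3, 0), (3, 0), (3, 0), (3, 0), C=2))
```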
[Fig. 2: two plots. Left panel: curves for Ω = 0, 0.05, 0.1, 0.3, plotted over C. Right panel: curves for C = 20, 40, 50, plotted over Ω.]

Fig. 2. Comparison of $\max \Delta^C_{mn;kl}/\theta$ at $\mu_0 = 0$ (dots) with $\Big(\sqrt{\frac{1}{\pi}(16\,C+12)} + \frac{6\Omega}{1+2\Omega^3+2\Omega^4}\,C\Big)^{-1}$ (solid line). The left plot shows the inverses of both the propagator and its approximation over C for various values of Ω. The right plot shows the propagator and its approximation over Ω for various values of C
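Formula 1:
\[
\max_{m^r,n^r,k^r,l^r} \Delta^{C}_{\substack{m^1\\m^2}\substack{n^1\\n^2};\substack{k^1\\k^2}\substack{l^1\\l^2}}\Big|_{\mu_0=0} \approx \frac{\theta\,\delta_{m+k,n+l}}{\sqrt{\frac{1}{\pi}(16\,C+12)} + \frac{6\Omega}{1+2\Omega^3+2\Omega^4}\,C}\,.
\tag{C.2}
\]
We demonstrate in Fig. 2 for selected values of the parameters Ω, C that $\theta/(\max \Delta^C_{mn;kl})$ at $\mu_0 = 0$ is asymptotically reproduced by $\sqrt{\frac{1}{\pi}(16\,C+12)} + \frac{6\Omega}{1+2\Omega^3+2\Omega^4}\,C$.

Formula 2:
\[
\max_{m^r} \sum_{l^1,l^2\in\mathbb{N}} \max_{k^r,n^r} \Big|\Delta^{C}_{\substack{m^1\\m^2}\substack{n^1\\n^2};\substack{k^1\\k^2}\substack{l^1\\l^2}}\Big|_{\mu_0=0} \approx \frac{\theta\,(1+2\Omega^3)}{7\Omega^2(C+1)}\,.
\tag{C.3}
\]
We demonstrate in Fig. 3 that $\theta/\big(\max_m \sum_l \max_{n,k} |\Delta^C_{mn;kl}|\big)$ is for $\mu_0 = 0$ asymptotically given by $7\Omega^2(C+1)/(1+2\Omega^3)$.

Formula 3:
\[
\sum_{\substack{l^1,l^2\in\mathbb{N}\\ \|m-l\|_1\ge 5}} \max_{k^r,n^r} \Big|\Delta^{C}_{\substack{m^1\\m^2}\substack{n^1\\n^2};\substack{k^1\\k^2}\substack{l^1\\l^2}}\Big|_{\mu_0=0} \le \frac{\theta\,(1-\Omega)^4\big(15+\frac{4}{5}\|m\|_\infty+\frac{1}{25}\|m\|_\infty^2\big)}{\Omega^2(C+1)^3}\,.
\tag{C.4}
\]
We verify (C.4) for several choices of the parameters in Figs. 4, 5 and 6.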
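The statement in Appendix A that, for $\Omega = 0$, the maximum in Formula 1 is given by the last line of (A.22) can be illustrated with the finite sum in the fifth line of (A.22). The sketch below is an illustration under stated assumptions: it takes that sum in the form $\frac{\theta}{2(1+\Omega)^2}\sum_{r=0}^{m}\frac{(m!)^2}{(m-r)!\,(1+m+r)!}\big(\frac{(1-\Omega)^2}{(1+\Omega)^2}\big)^r$, sets $m = C$, and uses illustrative function names.

```python
import math

def propagator_closed_sum(m, omega, theta=1.0):
    # Finite sum from (A.22), accumulated through the term ratio
    # t_{r+1} / t_r = (m - r) / (m + 2 + r) * x to avoid large factorials.
    x = ((1 - omega) / (1 + omega)) ** 2
    term = 1.0 / (m + 1)           # r = 0 term: (m!)^2 / (m! (m+1)!)
    total = term
    for r in range(m):
        term *= (m - r) / (m + 2 + r) * x
        total += term
    return theta * total / (2 * (1 + omega) ** 2)

def formula1(C, omega, theta=1.0):
    # Right-hand side of (C.2) without the Kronecker delta.
    return theta / (math.sqrt((16 * C + 12) / math.pi)
                    + 6 * omega / (1 + 2 * omega ** 3 + 2 * omega ** 4) * C)

if __name__ == "__main__":
    for m in (5, 20, 50):
        print(m, propagator_closed_sum(m, 0.0), formula1(m, 0.0))
```

At $\Omega = 0$ the expression in (C.2) equals $\sqrt{\pi}\,\theta/\big(4\sqrt{C+\frac{3}{4}}\big)$ identically, so the comparison tests only the asymptotic form of the finite sum. For $\Omega > 0$ the same routine can be used to watch the approach to $\theta/(8\Omega(m+1))$ at large $m$.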
[Fig. 3: two plots. Left panel: curves for Ω = 0.05, 0.1, 0.3, plotted over C. Right panel: curves for C = 6, 10, 15, plotted over Ω.]

Fig. 3. Comparison of $\theta/\big(\max_m \sum_l \max_{n,k} |\Delta^C_{mn;kl}|\big)$ at $\mu_0 = 0$ (dots) with $7\Omega^2(C+1)/(1+2\Omega^3)$ (solid line). The left plot shows the inverse propagator and its approximation over C for three values of Ω, whereas the right plot shows the inverse propagator and its approximation over Ω for three values of C

[Fig. 4: three plots for Ω = 0.05 with C = 10, C = 15 and C = 20, each plotted over $\|m\|_\infty$.]

Fig. 4. The index summation $\frac{1}{\theta}\sum_{l,\ \|m-l\|_1\ge 5} \max_{k,n} |\Delta^C_{mn;kl}|$ of the cut-off propagator at $\mu_0 = 0$ (dots) compared with $\frac{(1-\Omega)^4\big(15+\frac{4}{5}\|m\|_\infty+\frac{1}{25}\|m\|_\infty^2\big)}{\Omega^2(C+1)^3}$ (solid line), both plotted over $\|m\|_\infty$
[Fig. 5: three plots for Ω = 0.3, Ω = 0.1 and Ω = 0.05, each with $\|m\|_\infty = 5$, plotted over C.]

Fig. 5. The inverse $\theta\,\Big(\sum_{l,\ \|m-l\|_1\ge 5} \max_{k,n} |\Delta^C_{mn;kl}|\Big)^{-1}$ of the summed propagator at $\mu_0 = 0$ (dots) compared with $\frac{\Omega^2(C+1)^3}{(1-\Omega)^4\big(15+\frac{4}{5}\|m\|_\infty+\frac{1}{25}\|m\|_\infty^2\big)}$ (solid line), both plotted over C
Acknowledgement. We are indebted to Stefan Schraml for providing us with the references to orthogonal polynomials, without which the completion of the proof would have been impossible. We had stimulating discussions with Edwin Langmann, Vincent Rivasseau and Harold Steinacker. We are grateful to Christoph Kopper for indicating to us a way to reduce in our original power-counting estimation the polynomial in $\ln\frac{\Lambda_0}{\Lambda_R}$ to a polynomial in $\ln\frac{\Lambda}{\Lambda_R}$, thus permitting immediately the limit $\Lambda_0 \to \infty$. We would like to thank the Max-Planck-Institute for Mathematics in the Sciences (especially Eberhard Zeidler), the Erwin-Schrödinger-Institute and the Institute for Theoretical Physics of the University of Vienna for the generous support of our collaboration.
References
1. Minwalla, S., Van Raamsdonk, M., Seiberg, N.: Noncommutative perturbative dynamics. JHEP 0002, 020 (2000)
2. Chepelev, I., Roiban, R.: Renormalization of quantum field theories on noncommutative $\mathbb{R}^d$. I: Scalars. JHEP 0005, 037 (2000)
3. Chepelev, I., Roiban, R.: Convergence theorem for non-commutative Feynman graphs and renormalization. JHEP 0103, 001 (2001)
4. Langmann, E., Szabo, R.J.: Duality in scalar field theory on noncommutative phase spaces. Phys. Lett. B 533, 168 (2002)
5. Gayral, V., Gracia-Bondía, J.M., Iochum, B., Schücker, T., Várilly, J.C.: Moyal planes are spectral triples. Commun. Math. Phys. 246, 569 (2004)
6. Langmann, E.: Interacting fermions on noncommutative spaces: Exactly solvable quantum field theories in 2n+1 dimensions. Nucl. Phys. B 654, 404 (2003)
7. Langmann, E., Szabo, R.J., Zarembo, K.: Exact solution of noncommutative field theory in background magnetic fields. Phys. Lett. B 569, 95 (2003)
8. Langmann, E., Szabo, R.J., Zarembo, K.: Exact solution of quantum field theory on noncommutative phase spaces. JHEP 0401, 017 (2004)
9. Wilson, K.G., Kogut, J.B.: The Renormalization Group And The Epsilon Expansion. Phys. Rept. 12, 75 (1974)
10. Polchinski, J.: Renormalization And Effective Lagrangians. Nucl. Phys. B 231, 269 (1984)
[Fig. 6: six plots for C = 10, C = 15 and C = 20, each with $\|m\|_\infty = 5$, plotted over Ω on the ranges up to 0.1 and up to 0.5.]

Fig. 6. The inverse $\theta\,\Big(\sum_{l,\ \|m-l\|_1\ge 5} \max_{k,n} |\Delta^C_{mn;kl}|\Big)^{-1}$ of the summed propagator at $\mu_0 = 0$ (dots) compared with $\frac{\Omega^2(C+1)^3}{(1-\Omega)^4\big(15+\frac{4}{5}\|m\|_\infty+\frac{1}{25}\|m\|_\infty^2\big)}$ (solid line), both plotted over Ω
11. Keller, G., Kopper, C., Salmhofer, M.: Perturbative renormalization and effective Lagrangians in $\phi^4_4$. Helv. Phys. Acta 65, 32 (1992)
12. Grosse, H., Wulkenhaar, R.: Power-counting theorem for non-local matrix models and renormalisation. Commun. Math. Phys. 254, 91–127 (2005)
13. Meixner, J.: Orthogonale Polynomsysteme mit einer besonderen Gestalt der erzeugenden Funktion. J. London Math. Soc. 9, 6 (1934)
14. Grosse, H., Wulkenhaar, R.: Renormalisation of $\phi^4$ theory on noncommutative $\mathbb{R}^2$ in the matrix base. JHEP 0312, 019 (2003)
15. Masson, D.R., Repka, J.: Spectral theory of Jacobi matrices in $\ell^2(\mathbb{Z})$ and the su(1, 1) Lie algebra. SIAM J. Math. Anal. 22, 1131 (1991)
16. Koekoek, R., Swarttouw, R.F.: The Askey-scheme of hypergeometric orthogonal polynomials and its q-analogue. http://arXiv.org/abs/math.CA/9602214, 1996
17. Rivasseau, V., Vignes-Tourneret, F., Wulkenhaar, R.: Renormalization of noncommutative $\phi^4$-theory by multi-scale analysis. http://arxiv.org/abs/hep-th/0501036, 2005
18. Rivasseau, V., Vignes-Tourneret, F.: Non-Commutative Renormalization. In: Proceedings of Conference, “Rigorous Quantum Field Theory” in honor of J. Bros, http://arxiv.org/abs/hep-th/0409312, 2004
19. Grosse, H., Wulkenhaar, R.: Renormalisation of $\phi^4$ theory on noncommutative $\mathbb{R}^4$ to all orders. To appear in Lett. Math. Phys., http://arxiv.org/abs/hep-th/0403232, 2004
20. Gracia-Bondía, J.M., Várilly, J.C.: Algebras Of Distributions Suitable For Phase Space Quantum Mechanics. 1. J. Math. Phys. 29, 869 (1988)
21. Luminet, J.P.M., Weeks, J., Riazuelo, A., Lehoucq, R., Uzan, J.P.: Dodecahedral space topology as an explanation for weak wide-angle temperature correlations in the cosmic microwave background. Nature 425, 593 (2003)
22. Grosse, H., Wulkenhaar, R.: The β-function in duality-covariant noncommutative $\phi^4$-theory. Eur. Phys. J. C 35, 277–282 (2004)
23. Seiberg, N., Witten, E.: String theory and noncommutative geometry. JHEP 9909, 032 (1999)
24. Blau, M., Figueroa-O’Farrill, J., Hull, C., Papadopoulos, G.: A new maximally supersymmetric background of IIB superstring theory. JHEP 0201, 047 (2002)
25. Gradshteyn, I.S., Ryzhik, I.M.: Table of Integrals, Series, and Products. Sixth Edition. San Diego: Academic Press, 2000

Communicated by M.R. Douglas
Commun. Math. Phys. 256, 375–410 (2005) Digital Object Identifier (DOI) 10.1007/s00220-005-1287-8
Communications in
Mathematical Physics
Covariant Poisson Brackets in Geometric Field Theory
Michael Forger¹, Sandro Vieira Romero²
¹ Departamento de Matemática Aplicada, Instituto de Matemática e Estatística, Universidade de São Paulo, Caixa Postal 66281, 05311-970 São Paulo SP, Brazil. E-mail: [email protected]
² Departamento de Matemática, Universidade Federal de Viçosa, 36571-000 Viçosa MG, Brazil. E-mail: [email protected]
Received: 1 February 2004 / Accepted: 23 August 2004 Published online: 8 March 2005 – © Springer-Verlag 2005
Abstract: We establish a link between the multisymplectic and the covariant phase space approach to geometric field theory by showing how to derive the symplectic form on the latter, as introduced by Crnković-Witten and Zuckerman, from the multisymplectic form. The main result is that the Poisson bracket associated with this symplectic structure, according to the standard rules, is precisely the covariant bracket due to Peierls and DeWitt.
1. Introduction One of the most annoying flaws of the usual canonical formalism in field theory is its lack of manifest covariance, that is, its lack of explicit Lorentz invariance (in the context of special relativity) and more generally its lack of explicit invariance under space-time coordinate transformations (in the context of general relativity). Of course, this defect is built into the theory from the very beginning, since the usual canonical formalism represents the dynamical variables of classical field theory by functions on some spacelike hypersurface (Cauchy data) and provides differential equations for their time evolution off this hypersurface: thus it presupposes a splitting of space-time into space and time, in the form of a foliation of space-time into Cauchy surfaces. As a result, canonical quantization leads to models of quantum field theory whose covariance is far from obvious and in fact constitutes a formidable problem: as a well known example, we may quote the efforts necessary to check Lorentz invariance in (perturbative) quantum electrodynamics in the Coulomb gauge. These and similar observations have over many decades nourished attempts to develop a fully covariant formulation of the canonical formalism in classical field theory, which would hopefully serve as a starting point for alternative methods of quantization. Among the many ideas that have been proposed in this direction, two have come to occupy a special role. One of these is the “covariant functional formalism”, based on the concept
of “covariant phase space” which is defined as the (infinite-dimensional) space of solutions of the equations of motion. This approach was strongly advocated in the 1980’s by Crnković, Witten and Zuckerman [1–3] (see also [4]) who showed how to construct a symplectic structure on the covariant phase space of many important models of field theory (including gauge theories and general relativity), but the idea as such has a much longer history. The other has become known as the “multisymplectic formalism”, based on the concept of “multiphase space” which is a (finite-dimensional) space that can be defined locally by associating to each coordinate $q^i$ not just one conjugate momentum $p_i$ but $n$ conjugate momenta $p_i^\mu$ ($\mu = 1, \ldots, n$), where $n$ is the dimension of the underlying space-time manifold. In coordinate form, this construction goes back to the classical work of De Donder and Weyl in the 1930’s [5, 6], whereas a global formulation was initiated in the 1970’s by a group of mathematical physicists, mainly in Poland [7–9] but also elsewhere [10–12], and definitely established in the 1990’s [13, 14]; a detailed exposition, with lots of examples, can be found in the GIMmsy paper [15]. The two formalisms, although both fully covariant and directed towards the same ultimate goal, are of different nature; each of them has its own merits and drawbacks.
• The multisymplectic formalism is manifestly consistent with the basic principles of field theory, preserving full covariance, and it is mathematically rigorous because it uses well established methods from calculus on finite-dimensional manifolds. On the other hand, it does not seem to permit any obvious definition of the Poisson bracket between observables. Even the question of what mathematical objects should represent physical observables is not totally clear and has in fact been the subject of much debate in the literature. Moreover, the introduction of $n$ conjugate momenta for each coordinate obscures the usual duality between canonically conjugate variables (such as momenta and positions), which plays a fundamental role in all known methods of quantization. A definite solution to these problems has yet to be found.
• The covariant functional formalism fits neatly into the philosophy underlying the symplectic formalism in general; in particular, it admits a natural definition of the Poisson bracket (due to Peierls [16] and further elaborated by DeWitt [17–19]) that preserves the duality between canonically conjugate variables. Its main drawback is the lack of mathematical rigor, since it is often restricted to the formal extrapolation of techniques from ordinary calculus on manifolds to the infinite-dimensional setting: transforming such formal results into mathematical theorems is a separate problem, often highly complex and difficult.
Of course, the two approaches are closely related, and this relation has been an important source of motivation in the early days of the theory [8]. Unfortunately, however, the tradition of developing them in parallel seems to have partly fallen into oblivion in recent years, during which important progress was made in other directions. The present paper, based on the PhD thesis of the second author [21], is intended to revitalize this tradition by systematizing and further developing the link between the two approaches, thus contributing to integrate them into one common picture. It is organized into two main sections. In Sect. 2, we briefly review some salient features of the multisymplectic approach to geometric field theory, focussing on the concepts needed to make contact with the covariant functional approach.
In particular, this requires a digression on jet bundles of first and second order as well as on the definition of both extended and ordinary multiphase space as the twisted affine dual of the first order jet bundle and the twisted linear dual of the linear first order jet bundle, respectively: this will enable us to give a global definition of the space of solutions of the equations of motion, both in
the Lagrangian and Hamiltonian formulation, in terms of a globally defined Euler - Lagrange operator $D_{\mathcal L}$ and a globally defined De Donder - Weyl operator $D_{\mathcal H}$, respectively. To describe the formal tangent space to this space of solutions at a given point, we also write down the linearization of each of these operators around a given solution. In Sect. 3, we apply these constructions to derive a general expression for the symplectic form $\Omega$ on covariant phase space, à la Crnković-Witten-Zuckerman, in terms of the multisymplectic form $\omega$ on extended multiphase space. Then we prove, as the main result of this paper, that the Poisson bracket associated with the form $\Omega$, according to the standard rules of symplectic geometry, suitably extended to this infinite-dimensional setting, is precisely the Peierls-DeWitt bracket of classical field theory [16–19]. Finally, in Sect. 4, we comment on the relation of our results to previous work and on perspectives for future research in this area.
2. Multisymplectic Approach
2.1. Overview. The multisymplectic approach to geometric field theory, whose origins can be traced back to the early work of Hermann Weyl on the calculus of variations [6], is based on the idea of modifying the transition from the Lagrangian to the Hamiltonian framework by treating spatial derivatives and time derivatives of fields on an equal footing. Thus one associates to each field component $\varphi^i$ not just its standard canonically conjugate momentum $\pi_i$ but rather $n$ conjugate momenta $\pi_i^\mu$, where $n$ is the dimension of space-time. In a first order Lagrangian formalism, where one starts out from a Lagrangian $L$ depending on the field and its first partial derivatives, these are obtained by a covariant analogue of the Legendre transformation
$$\pi_i^\mu = \frac{\partial L}{\partial\,\partial_\mu\varphi^i} . \qquad (1)$$
This allows us to rewrite the standard Euler-Lagrange equations of field theory
$$\partial_\mu\,\frac{\partial L}{\partial\,\partial_\mu\varphi^i} - \frac{\partial L}{\partial\varphi^i} = 0 \qquad (2)$$
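as a covariant first order system, the covariant Hamiltonian equations or De Donder - Weyl equations
$$\frac{\partial H}{\partial\pi_i^\mu} = \partial_\mu\varphi^i , \qquad \frac{\partial H}{\partial\varphi^i} = -\,\partial_\mu\pi_i^\mu , \qquad (3)$$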
where
$$H = \pi_i^\mu\,\partial_\mu\varphi^i - L \qquad (4)$$
is the covariant Hamiltonian density or De Donder-Weyl Hamiltonian. Multiphase space (ordinary as well as extended) is the geometric environment built by appropriately patching together local coordinate systems of the form $(q^i, p_i^\mu)$ – instead of the canonically conjugate variables $(q^i, p_i)$ of mechanics – together with space-time coordinates $x^\mu$ and, in the extended version, a further energy type variable that we shall denote by $p$ (without any index). The global construction of these multiphase spaces, however, has only gradually come to light; it is based on the following mathematical concepts:
• The collection of all fields in a given theory, defined over a fixed ($n$-dimensional orientable) space-time manifold $M$, is represented by the sections $\varphi$ of a given fiber bundle $E$ over $M$, with bundle projection $\pi : E \to M$ and typical fiber $Q$. This bundle will be referred to as the configuration bundle of the theory since $Q$ corresponds to the configuration space of possible field values.
• The collection of all fields together with their partial derivatives up to a certain order, say order $r$, is represented by the $r$-jets $j^r\varphi \equiv (\varphi, \partial\varphi, \ldots, \partial^r\varphi)$ of sections of $E$, which are themselves sections of the $r$-th order jet bundle $J^rE$ of $E$, regarded as a fiber bundle over $M$. In this paper, we shall only need first order jet bundles, with one notable exception: the global formulation of the Euler - Lagrange equations requires introducing the second order jet bundle.
• Dualization – the concept needed to pass from the Lagrangian to the Hamiltonian framework via the Legendre transformation – comes in two variants, based on the fundamental observation that the first order jet bundle $J^1E$ of $E$ is an affine bundle over $E$ whose difference vector bundle $\vec J^1E$ will be referred to as the linear jet bundle. Ordinary multiphase space is obtained as the twisted linear dual $\vec J^{1\circledast}E$ of $\vec J^1E$ while extended multiphase space is obtained as the twisted affine dual $J^{1\circledstar}E$ of $J^1E$, where the prefix “twisted” refers to the necessity of taking an additional tensor product with the bundle of $n$-forms on $M$.¹
• The Lagrangian $\mathcal L$ is a function on $J^1E$ with values in the bundle of $n$-forms on $M$ so that it may be integrated to provide an action functional which enters the variational principle. The De Donder - Weyl Hamiltonian $\mathcal H$ is a section of $J^{1\circledstar}E$, considered as an affine line bundle over $\vec J^{1\circledast}E$.
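As a concrete illustration of Eqs. (1)–(4), consider a single real scalar field on flat space-time. The Lagrangian below, the potential $U$ and the constant metric $\eta_{\mu\nu}$ used to raise and lower indices are assumptions chosen for this sketch; they do not appear in the text above.

```latex
% Assumed example: one real scalar field \varphi with potential U(\varphi)
% on flat space-time with constant metric \eta_{\mu\nu}.
\begin{align*}
  L &= \tfrac12\,\eta^{\mu\nu}\,\partial_\mu\varphi\,\partial_\nu\varphi - U(\varphi), \\
  % Eq. (1): covariant Legendre transformation
  \pi^\mu &= \frac{\partial L}{\partial\,\partial_\mu\varphi}
           = \eta^{\mu\nu}\,\partial_\nu\varphi, \\
  % Eq. (4): De Donder-Weyl Hamiltonian, expressed through \pi^\mu
  H &= \pi^\mu\,\partial_\mu\varphi - L
     = \tfrac12\,\eta_{\mu\nu}\,\pi^\mu\pi^\nu + U(\varphi), \\
  % Eq. (3): De Donder-Weyl equations
  \frac{\partial H}{\partial\pi^\mu} &= \eta_{\mu\nu}\,\pi^\nu = \partial_\mu\varphi,
  \qquad
  \frac{\partial H}{\partial\varphi} = U'(\varphi) = -\,\partial_\mu\pi^\mu .
\end{align*}
```

Eliminating $\pi^\mu$ from the last line gives $\eta^{\mu\nu}\partial_\mu\partial_\nu\varphi + U'(\varphi) = 0$, which is Eq. (2) for this Lagrangian.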
Note that the formalism is set up so as to require no additional structure on the configuration bundle or on any other bundle constructed from it: all are merely fiber bundles over the space-time manifold $M$. Of course, additional structures do arise when one is dealing with special classes of fields (matter fields and the metric tensor in general relativity are sections of vector bundles, connections are sections of affine bundles, nonlinear fields such as those arising in the sigma model are sections of trivial fiber bundles with a fixed Riemannian metric on the fibers, etc.), but such additional structures depend on the kind of theory considered and thus are not universal. Finally, the restriction imposed on the order of the jet bundles considered reflects the fact that almost all known examples of field theories are governed by second order partial differential equations which can be derived from a Lagrangian that depends only on the fields and their partial derivatives of first order, which is why it is reasonable to develop the general theory on the basis of a first order formalism, as is done in mechanics [22, 23].
2.2. The First Order Jet Bundle. The field theoretical analogue of the tangent bundle of mechanics is the first order jet bundle $J^1E$ associated with the configuration bundle $E$ over $M$. Given a point $e$ in $E$ with base point $x = \pi(e)$ in $M$, the fiber $J^1_eE$ of $J^1E$ at $e$ consists of all linear maps from the tangent space $T_xM$ of the base space $M$ at $x$ to the tangent space $T_eE$ of the total space $E$ at $e$ whose composition with the tangent map $T_e\pi : T_eE \to T_xM$ to the projection $\pi : E \to M$ gives the identity on $T_xM$:
$$J^1_eE = \{\, \gamma \in L(T_xM, T_eE) \mid T_e\pi \circ \gamma = \mathrm{id}_{T_xM} \,\} . \qquad (5)$$
1 We use an asterisk ∗ to denote linear duals of vector spaces or bundles and a star ⋆ to denote affine duals of affine spaces or bundles. These symbols are appropriately encircled to characterize twisted duals, as opposed to the ordinary duals defined in terms of linear or affine maps with values in R.
Thus the elements of $J^1_eE$ are precisely the candidates for the tangent maps at $x$ to (local) sections $\varphi$ of the bundle $E$ satisfying $\varphi(x) = e$. Obviously, $J^1_eE$ is an affine subspace of the vector space $L(T_xM, T_eE)$ of all linear maps from $T_xM$ to the tangent space $T_eE$, the corresponding difference vector space being the vector space of all linear maps from $T_xM$ to the vertical subspace $V_eE$:
$$\vec J^1_eE = \{\, \vec\gamma \in L(T_xM, T_eE) \mid T_e\pi \circ \vec\gamma = 0 \,\} = L(T_xM, V_eE) \cong T_x^*M \otimes V_eE . \qquad (6)$$
The jet bundle $J^1E$ thus defined admits two different projections, namely the target projection $\tau_E : J^1E \to E$ and the source projection $\sigma_E : J^1E \to M$ which is simply its composition with the original bundle projection, that is, $\sigma_E = \pi \circ \tau_E$. The same goes for $\vec J^1E$, which we shall call the linearized first order jet bundle or simply linear jet bundle associated with the configuration bundle $E$ over $M$. The structure of $J^1E$ and of $\vec J^1E$ as fiber bundles over $M$ with respect to the source projection (in general without any additional structure), as well as that of $J^1E$ as an affine bundle and of $\vec J^1E$ as a vector bundle over $E$ with respect to the target projection, can most easily be seen in terms of local coordinates. Namely, local coordinates $x^\mu$ for $M$ and $q^i$ for $Q$, together with a local trivialization of $E$, induce local coordinates $(x^\mu, q^i)$ for $E$ as well as local coordinates $(x^\mu, q^i, q^i_\mu)$ for $J^1E \subset L(\pi^*(TM), TE)$ and $(x^\mu, q^i, \vec q^{\,i}_\mu)$ for $\vec J^1E \subset L(\pi^*(TM), TE)$. Moreover, local coordinate transformations $x^\mu \to x^{\prime\nu}$ for $M$ and $q^i \to q^{\prime j}$ for $Q$, together with a change of local trivialization of $E$, correspond to a local coordinate transformation $(x^\mu, q^i) \to (x^{\prime\nu}, q^{\prime j})$ for $E$ where
$$x^{\prime\nu} = x^{\prime\nu}(x^\mu) , \qquad q^{\prime j} = q^{\prime j}(x^\mu, q^i) . \qquad (7)$$
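The induced local coordinate transformations $(x^\mu, q^i, q^i_\mu) \to (x^{\prime\nu}, q^{\prime j}, q^{\prime j}_\nu)$ for $J^1E$ and $(x^\mu, q^i, \vec q^{\,i}_\mu) \to (x^{\prime\nu}, q^{\prime j}, \vec q^{\,\prime j}_\nu)$ for $\vec J^1E$ are then easily seen to be given by
$$q^{\prime j}_\nu = \frac{\partial x^\mu}{\partial x^{\prime\nu}} \frac{\partial q^{\prime j}}{\partial q^i}\, q^i_\mu + \frac{\partial x^\mu}{\partial x^{\prime\nu}} \frac{\partial q^{\prime j}}{\partial x^\mu} , \qquad (8)$$
and
$$\vec q^{\,\prime j}_\nu = \frac{\partial x^\mu}{\partial x^{\prime\nu}} \frac{\partial q^{\prime j}}{\partial q^i}\, \vec q^{\,i}_\mu . \qquad (9)$$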
This makes it clear that $J^1E$ is an affine bundle over $E$ with difference vector bundle
$$\vec J^1E = T^*M \otimes VE , \qquad (10)$$
in accordance with Eq. (6).² That the (first order) jet bundle of a fiber bundle is the adequate arena to incorporate (first order) derivatives of fields becomes apparent by noting that a global section $\varphi$ of $E$ over $M$ naturally induces a global section $j^1\varphi$ of $J^1E$ over $M$ given by
$$j^1\varphi(x) = T_x\varphi \in J^1_{\varphi(x)}E \qquad \text{for } x \in M .$$
2 Given any vector bundle V over M, such as T M, T ∗ M or any of their exterior powers, one can consider it as a vector bundle over E by forming its pull-back π ∗ V . In order not to overload the notation, we shall here and in what follows suppress the symbol π ∗ .
In the mathematical literature, $j^1\varphi$ is called the (first) prolongation of $\varphi$, but it would be more intuitive to simply call it the derivative of $\varphi$ since in the local coordinates used above,
$$j^1\varphi(x) = (x^\mu, \varphi^i(x), \partial_\mu\varphi^i(x)) ,$$
where $\partial_\mu = \partial/\partial x^\mu$; this is symbolically summarized by writing $j^1\varphi \equiv (\varphi, \partial\varphi)$. Similarly, it can be shown that the linear jet bundle of a fiber bundle is the adequate arena to incorporate covariant derivatives of sections, with respect to an arbitrarily chosen connection.
2.3. Duality. The next problem to be addressed is how to define an adequate notion of dual for $J^1E$. The necessary background information from the theory of affine spaces and of affine bundles (including the definition of the affine dual of an affine space and of the transpose of an affine map between affine spaces) is summarized in the Appendix. Briefly, the rules state that if $A$ is an affine space of dimension $k$ over $\mathbb R$, its dual $A^\star$ is the space $A(A, \mathbb R)$ of affine maps from $A$ to $\mathbb R$, which is a vector space of dimension $k + 1$. Thus the affine dual $J^{1\star}E$ of $J^1E$ and the linear dual $\vec J^{1*}E$ of $\vec J^1E$ are obtained by taking their fiber over any point $e$ in $E$ to be the vector space
$$J^{1\star}_eE = \{\, z_e : J^1_eE \longrightarrow \mathbb R \mid z_e \text{ is affine} \,\} \qquad (11)$$
and
$$\vec J^{1*}_eE = \{\, \vec z_e : \vec J^1_eE \longrightarrow \mathbb R \mid \vec z_e \text{ is linear} \,\} \qquad (12)$$
respectively. However, as mentioned before, the multiphase spaces of field theory are defined with an additional twist, replacing the real line by the one-dimensional space of volume forms on the base manifold $M$ at the appropriate point. In other words, the twisted affine dual
$$J^{1\circledstar}E = J^{1\star}E \otimes \bigwedge^n T^*M \qquad (13)$$
of $J^1E$ and the twisted linear dual
$$\vec J^{1\circledast}E = \vec J^{1*}E \otimes \bigwedge^n T^*M \qquad (14)$$
of $\vec J^1E$ are defined² by taking their fiber over any point $e$ in $E$ with base point $x = \pi(e)$ in $M$ to be the vector space
$$J^{1\circledstar}_eE = \{\, z_e : J^1_eE \longrightarrow \bigwedge^n T_x^*M \mid z_e \text{ is affine} \,\} \qquad (15)$$
and
$$\vec J^{1\circledast}_eE = \{\, \vec z_e : \vec J^1_eE \longrightarrow \bigwedge^n T_x^*M \mid \vec z_e \text{ is linear} \,\} \qquad (16)$$
respectively. As in the case of the jet bundle and the linear jet bundle, all these duals admit two different projections, namely a target projection onto E and a source projection onto M which is simply its composition with the original projection π . Using local coordinates as before, it is easily shown that all these duals are fiber bundles over M with respect to the source projection (in general without any additional structure) and are vector bundles over E with respect to the target projection. Namely,
introducing local coordinates $(x^\mu, q^i)$ for $E$ together with the induced local coordinates $(x^\mu, q^i, q^i_\mu)$ for $J^1E$ and $(x^\mu, q^i, \vec q^{\,i}_\mu)$ for $\vec J^1E$ as before, we obtain local coordinates $(x^\mu, q^i, p_i^\mu, p)$ both for $J^{1\star}E$ and for $J^{1\circledstar}E$ as well as local coordinates $(x^\mu, q^i, p_i^\mu)$ both for $\vec J^{1*}E$ and for $\vec J^{1\circledast}E$, respectively. These are defined by requiring the dual pairing between a point in $J^{1\star}E$ or $J^{1\circledstar}E$ with coordinates $(x^\mu, q^i, p_i^\mu, p)$ and a point in $J^1E$ with coordinates $(x^\mu, q^i, q^i_\mu)$ to be given by
$$p_i^\mu\, q^i_\mu + p \qquad (17)$$
in the ordinary (untwisted) case and by
$$\bigl( p_i^\mu\, q^i_\mu + p \bigr)\, d^n x \qquad (18)$$
in the twisted case, whereas the dual pairing between a point in $\vec J^{1*}E$ or in $\vec J^{1\circledast}E$ with coordinates $(x^\mu, q^i, p_i^\mu)$ and a point in $\vec J^1E$ with coordinates $(x^\mu, q^i, \vec q^{\,i}_\mu)$ is given by
$$p_i^\mu\, \vec q^{\,i}_\mu \qquad (19)$$
in the ordinary (untwisted) case and by
$$p_i^\mu\, \vec q^{\,i}_\mu\, d^n x \qquad (20)$$
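in the twisted case. Moreover, a local coordinate transformation $(x^\mu, q^i) \to (x^{\prime\nu}, q^{\prime j})$ for $E$ as in Eq. (7) induces local coordinate transformations for $J^1E$ and for $\vec J^1E$ as in Eqs. (8) and (9) which in turn induce local coordinate transformations $(x^\mu, q^i, p_i^\mu, p) \to (x^{\prime\nu}, q^{\prime j}, p^{\prime\nu}_j, p')$ both for $J^{1\star}E$ and for $J^{1\circledstar}E$ as well as local coordinate transformations $(x^\mu, q^i, p_i^\mu) \to (x^{\prime\nu}, q^{\prime j}, p^{\prime\nu}_j)$ both for $\vec J^{1*}E$ and for $\vec J^{1\circledast}E$: these are given by
$$p^{\prime\nu}_j = \frac{\partial x^{\prime\nu}}{\partial x^\mu} \frac{\partial q^i}{\partial q^{\prime j}}\, p_i^\mu , \qquad p' = p - \frac{\partial q^{\prime j}}{\partial x^\mu} \frac{\partial q^i}{\partial q^{\prime j}}\, p_i^\mu \qquad (21)$$
in the ordinary (untwisted) case and
$$p^{\prime\nu}_j = \det\Bigl(\frac{\partial x}{\partial x'}\Bigr) \frac{\partial x^{\prime\nu}}{\partial x^\mu} \frac{\partial q^i}{\partial q^{\prime j}}\, p_i^\mu , \qquad p' = \det\Bigl(\frac{\partial x}{\partial x'}\Bigr) \Bigl( p - \frac{\partial q^{\prime j}}{\partial x^\mu} \frac{\partial q^i}{\partial q^{\prime j}}\, p_i^\mu \Bigr) \qquad (22)$$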
in the twisted case.
Finally, it is worth noting that the affine duals $J^{1\star}E$ and $J^{1\circledstar}E$ of $J^1E$ contain line subbundles $J^{1\star}_cE$ and $J^{1\circledstar}_cE$ whose fiber over any point $e$ in $E$ with base point $x = \pi(e)$ in $M$ consists of the constant (rather than affine) maps from $J^1_eE$ to $\mathbb R$ and to $\bigwedge^n T_x^*M$, respectively, and the corresponding quotient vector bundles over $E$ can be naturally identified with the respective linear duals $\vec J^{1*}E$ and $\vec J^{1\circledast}E$ of $\vec J^1E$, i.e., we have
$$J^{1\star}E / J^{1\star}_cE \cong \vec J^{1*}E \qquad (23)$$
and
$$J^{1\circledstar}E / J^{1\circledstar}_cE \cong \vec J^{1\circledast}E \qquad (24)$$
respectively. This shows that, in both cases, the corresponding projection onto the quotient amounts to “forgetting the additional energy variable” since it takes a point with coordinates $(x^\mu, q^i, p_i^\mu, p)$ to the point with coordinates $(x^\mu, q^i, p_i^\mu)$; it will be denoted by $\eta$ and is easily seen to turn $J^{1\star}E$ and $J^{1\circledstar}E$ into affine line bundles over $\vec J^{1*}E$ and over $\vec J^{1\circledast}E$, respectively.
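The transformation law (21) can be checked directly: it is the one that leaves the pairing (17) unchanged when the jet coordinates transform according to Eq. (8). The following short computation is a sketch of that check; primes denote the transformed coordinates, and the dummy indices $k$ and $\lambda$ are introduced here only for bookkeeping.

```latex
% Uses (\partial x'^\nu/\partial x^\lambda)(\partial x^\mu/\partial x'^\nu) = \delta^\mu_\lambda
% and  (\partial q^k/\partial q'^j)(\partial q'^j/\partial q^i) = \delta^k_i .
\begin{align*}
p^{\prime\nu}_j\,q^{\prime j}_\nu
 &= \frac{\partial x^{\prime\nu}}{\partial x^\lambda}
    \frac{\partial q^k}{\partial q^{\prime j}}\,p^\lambda_k
    \left(
      \frac{\partial x^\mu}{\partial x^{\prime\nu}}
      \frac{\partial q^{\prime j}}{\partial q^i}\,q^i_\mu
      + \frac{\partial x^\mu}{\partial x^{\prime\nu}}
        \frac{\partial q^{\prime j}}{\partial x^\mu}
    \right) \\
 &= p^\mu_i\,q^i_\mu
    + \frac{\partial q^{\prime j}}{\partial x^\mu}
      \frac{\partial q^i}{\partial q^{\prime j}}\,p^\mu_i , \\
p^{\prime\nu}_j\,q^{\prime j}_\nu + p'
 &= p^\mu_i\,q^i_\mu + p
 \qquad\text{with } p' \text{ as in Eq. (21).}
\end{align*}
```

In the twisted case the same computation goes through, with the factor $\det(\partial x/\partial x')$ in Eq. (22) compensating the transformation $d^nx' = \det(\partial x'/\partial x)\,d^nx$ of the volume form in the pairing (18).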
2.4. The Second Order Jet Bundle. For an appropriate global formulation of the standard Euler - Lagrange equations of field theory, which are second order partial differential equations, it is useful to introduce the second order jet bundle $J^2E$ associated with the configuration bundle $E$ over $M$. It can be defined either directly, as is usually done, or by invoking an iterative procedure, which is the method we shall follow here. Starting out from the first order jet bundle $J^1E$ of $E$, regarded as a fiber bundle over $M$, we consider its first order jet bundle $J^1J^1E$ and define, in a first step, the semiholonomic second order jet bundle $\bar J^2E$ of $E$ to be the subbundle of $J^1J^1E$ given by
$$\bar J^2E = \{\, \gamma \in J^1J^1E \mid \tau_{J^1E}(\gamma) = J^1\tau_E(\gamma) \,\} , \qquad (25)$$
where $\tau_{J^1E} : J^1J^1E \to J^1E$ is the target projection of $J^1J^1E$ while $J^1\tau_E : J^1J^1E \to J^1E$ is the prolongation of the target projection $\tau_E : J^1E \to E$ of $J^1E$, considered as a map of fiber bundles over $M$. As it turns out, $\bar J^2E$ is an affine bundle over $J^1E$, with difference vector bundle $\vec{\bar J}{}^2E = (T^*M \otimes T^*M) \otimes VE$. Moreover, it admits a natural decomposition, as a fiber product over $J^1E$, into a symmetric and an antisymmetric part: the symmetric part is the second order jet bundle $J^2E$ and is an affine subbundle of $\bar J^2E$ with difference vector bundle $\bigvee^2 T^*M \otimes VE$, while the antisymmetric part is the vector subbundle $\bigwedge^2 T^*M \otimes VE$ of $\vec{\bar J}{}^2E$:
$$\bar J^2E = J^2E \times_{J^1E} \bigl( \bigwedge^2 T^*M \otimes VE \bigr) . \qquad (26)$$
These assertions can be proved by introducing local coordinates $(x^\mu, q^i)$ for $E$ together with the induced local coordinates $(x^\mu, q^i, q^i_\mu)$ for $J^1E$ as before to first define induced local coordinates $(x^\mu, q^i, q^i_\mu, r^i_\mu, q^i_{\mu\rho})$ for $J^1J^1E$. Simple calculations then show that the points of $\bar J^2E$ are characterized by the condition $q^i_\mu = r^i_\mu$ and the points of $J^2E$ by the additional condition $q^i_{\mu\rho} = q^i_{\rho\mu}$. Moreover, a local coordinate transformation $(x^\mu, q^i) \to (x^{\prime\nu}, q^{\prime j})$ for $E$ as in Eq. (7) induces a local coordinate transformation for $J^1E$ as in Eq. (8) which in turn induces a local coordinate transformation $(x^\mu, q^i, q^i_\mu, r^i_\mu, q^i_{\mu\rho}) \to (x^{\prime\nu}, q^{\prime j}, q^{\prime j}_\nu, r^{\prime j}_\nu, q^{\prime j}_{\nu\sigma})$ for $J^1J^1E$, given by Eq. (8) together with
$$r^{\prime j}_\nu = \frac{\partial x^\mu}{\partial x^{\prime\nu}} \frac{\partial q^{\prime j}}{\partial q^i}\, r^i_\mu + \frac{\partial x^\mu}{\partial x^{\prime\nu}} \frac{\partial q^{\prime j}}{\partial x^\mu} , \qquad (27)$$
$$q^{\prime j}_{\nu\sigma} = \frac{\partial x^\rho}{\partial x^{\prime\sigma}} \frac{\partial q^{\prime j}_\nu}{\partial q^i_\mu}\, q^i_{\mu\rho} + \frac{\partial x^\rho}{\partial x^{\prime\sigma}} \frac{\partial q^{\prime j}_\nu}{\partial q^i}\, r^i_\rho + \frac{\partial x^\rho}{\partial x^{\prime\sigma}} \frac{\partial q^{\prime j}_\nu}{\partial x^\rho} . \qquad (28)$$
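Differentiating Eq. (8) with respect to $q^i_\mu$, $q^i$ and $x^\rho$ and using the relation
$$\frac{\partial^2 x^\mu}{\partial x^{\prime\sigma}\, \partial x^{\prime\nu}} = -\, \frac{\partial x^\rho}{\partial x^{\prime\sigma}} \frac{\partial x^\kappa}{\partial x^{\prime\nu}} \frac{\partial x^\mu}{\partial x^{\prime\lambda}} \frac{\partial^2 x^{\prime\lambda}}{\partial x^\rho\, \partial x^\kappa} ,$$
we can rewrite Eq. (28) explicitly in the form
$$q^{\prime j}_{\nu\sigma} = \frac{\partial x^\rho}{\partial x^{\prime\sigma}} \frac{\partial x^\mu}{\partial x^{\prime\nu}} \frac{\partial q^{\prime j}}{\partial q^i}\, q^i_{\mu\rho} + \frac{\partial x^\rho}{\partial x^{\prime\sigma}} \frac{\partial x^\mu}{\partial x^{\prime\nu}} \frac{\partial^2 q^{\prime j}}{\partial q^i\, \partial q^k}\, q^k_\mu\, r^i_\rho + \frac{\partial x^\rho}{\partial x^{\prime\sigma}} \frac{\partial x^\mu}{\partial x^{\prime\nu}} \frac{\partial^2 q^{\prime j}}{\partial q^i\, \partial x^\mu}\, r^i_\rho \qquad (29)$$
$$\qquad -\, \frac{\partial x^\rho}{\partial x^{\prime\sigma}} \frac{\partial x^\kappa}{\partial x^{\prime\nu}} \frac{\partial x^\mu}{\partial x^{\prime\lambda}} \frac{\partial^2 x^{\prime\lambda}}{\partial x^\rho\, \partial x^\kappa} \frac{\partial q^{\prime j}}{\partial q^i}\, q^i_\mu - \frac{\partial x^\rho}{\partial x^{\prime\sigma}} \frac{\partial x^\kappa}{\partial x^{\prime\nu}} \frac{\partial x^\mu}{\partial x^{\prime\lambda}} \frac{\partial^2 x^{\prime\lambda}}{\partial x^\rho\, \partial x^\kappa} \frac{\partial q^{\prime j}}{\partial x^\mu} + \frac{\partial x^\rho}{\partial x^{\prime\sigma}} \frac{\partial x^\mu}{\partial x^{\prime\nu}} \frac{\partial^2 q^{\prime j}}{\partial x^\rho\, \partial q^i}\, q^i_\mu + \frac{\partial x^\rho}{\partial x^{\prime\sigma}} \frac{\partial x^\mu}{\partial x^{\prime\nu}} \frac{\partial^2 q^{\prime j}}{\partial x^\rho\, \partial x^\mu} . \qquad (30)$$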
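In particular, Eqs. (8) and (27) show that $q^i_\mu = r^i_\mu$ implies $q^{\prime j}_\nu = r^{\prime j}_\nu$ and similarly, Eq. (30) shows that if $q^i_\mu = r^i_\mu$, then $q^i_{\mu\rho} = q^i_{\rho\mu}$ implies $q^{\prime j}_{\nu\sigma} = q^{\prime j}_{\sigma\nu}$, as required by the global, coordinate independent nature of the definition of $\bar J^2E$ as a subbundle of $J^1J^1E$ and of $J^2E$ as a subbundle of $\bar J^2E$. Moreover, Eq. (30) also shows that if $q^i_\mu = r^i_\mu$, then the transformation law $q^i_{\mu\rho} \to q^{\prime j}_{\nu\sigma}$ decomposes naturally into separate transformation laws $q^i_{(\mu\rho)} \to q^{\prime j}_{(\nu\sigma)}$ for the symmetric part and $q^i_{[\mu\rho]} \to q^{\prime j}_{[\nu\sigma]}$ for the antisymmetric part: the former reads
$$\begin{aligned} q^{\prime j}_{(\nu\sigma)} ={}& \frac{\partial x^\rho}{\partial x^{\prime\sigma}} \frac{\partial x^\mu}{\partial x^{\prime\nu}} \frac{\partial q^{\prime j}}{\partial q^i}\, q^i_{(\mu\rho)} + \frac{\partial x^\rho}{\partial x^{\prime\sigma}} \frac{\partial x^\mu}{\partial x^{\prime\nu}} \frac{\partial^2 q^{\prime j}}{\partial q^i\, \partial q^k}\, q^k_\mu\, q^i_\rho \\ &+ \frac{\partial x^\rho}{\partial x^{\prime\sigma}} \frac{\partial x^\mu}{\partial x^{\prime\nu}} \left( \frac{\partial^2 q^{\prime j}}{\partial x^\mu\, \partial q^i}\, q^i_\rho + \frac{\partial^2 q^{\prime j}}{\partial x^\rho\, \partial q^i}\, q^i_\mu - \frac{\partial x^\kappa}{\partial x^{\prime\lambda}} \frac{\partial^2 x^{\prime\lambda}}{\partial x^\rho\, \partial x^\mu} \frac{\partial q^{\prime j}}{\partial q^i}\, q^i_\kappa \right) \\ &- \frac{\partial x^\rho}{\partial x^{\prime\sigma}} \frac{\partial x^\kappa}{\partial x^{\prime\nu}} \frac{\partial x^\mu}{\partial x^{\prime\lambda}} \frac{\partial^2 x^{\prime\lambda}}{\partial x^\rho\, \partial x^\kappa} \frac{\partial q^{\prime j}}{\partial x^\mu} + \frac{\partial x^\rho}{\partial x^{\prime\sigma}} \frac{\partial x^\mu}{\partial x^{\prime\nu}} \frac{\partial^2 q^{\prime j}}{\partial x^\rho\, \partial x^\mu} , \end{aligned} \qquad (31)$$
and is the transformation law for $J^2E$ as an affine bundle over $J^1E$, whereas the latter reads simply
$$q^{\prime j}_{[\nu\sigma]} = \frac{\partial x^\rho}{\partial x^{\prime\sigma}} \frac{\partial x^\mu}{\partial x^{\prime\nu}} \frac{\partial q^{\prime j}}{\partial q^i}\, q^i_{[\mu\rho]} , \qquad (32)$$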
and is the transformation law for $\bigwedge^2 T^*M \otimes VE$ as a vector bundle over $J^1E$. For more details, see [24, Chapter 5]. The equivalence between the definition of the second order jet bundle given here and the traditional one is obtained by observing that the iterated jet $j^1j^1\varphi$ of a (local) section $\varphi$ of $E$ assumes values not only in $\bar J^2E$ but even in $J^2E$, due to the Schwarz rule. Therefore, second order jets in the traditional sense, that is, classes of (local) sections where the equivalence relation is the equality between the Taylor expansion up to second order, are in one-to-one correspondence with these iterated jets of (local) sections. Moreover, a global section $\varphi$ of $E$ over $M$ naturally induces a global section $j^2\varphi$ of $J^2E$ over $M$ such that in the local coordinates used above
$$j^2\varphi(x) = (x^\mu, \varphi^i(x), \partial_\mu\varphi^i(x), \partial_\mu\partial_\nu\varphi^i(x)) ,$$
where $\partial_\mu = \partial/\partial x^\mu$; this is symbolically summarized by writing $j^2\varphi = (\varphi, \partial\varphi, \partial^2\varphi)$.
2.5. The Legendre Transformation. A Lagrangian field theory is defined by its configuration bundle $E$ over $M$ and its Lagrangian density or simply Lagrangian, which in the present first order formalism is a map of fiber bundles over $E$:
$$\mathcal L : J^1E \longrightarrow \bigwedge^n T^*M . \qquad (33)$$
The requirement that $\mathcal L$ should take values in the volume forms rather than the functions on space-time is imposed to guarantee that the action functional $S : \Gamma(E) \to \mathbb R$ given by
$$S[\varphi] = \int_M \mathcal L(\varphi, \partial\varphi) \qquad \text{for } \varphi \in \Gamma(E) \qquad (34)$$
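be well-defined and independent of the choice of additional structures, such as a space-time metric.³ Such a Lagrangian gives rise to a Legendre transformation, which comes in two variants: as a map
$$\vec{\mathbb F}\mathcal L : J^1E \longrightarrow \vec J^{1\circledast}E \qquad (35)$$
or as a map
$$\mathbb F\mathcal L : J^1E \longrightarrow J^{1\circledstar}E \qquad (36)$$
of fiber bundles over $E$. For any point $\gamma$ in $J^1_eE$, the linear variant $\vec{\mathbb F}\mathcal L(\gamma)$ is defined as the usual fiber derivative of $\mathcal L$ at $\gamma$, which is the linear map from $\vec J^1_eE$ to $\bigwedge^n T_x^*M$ given by
$$\vec{\mathbb F}\mathcal L(\gamma) \cdot \vec\kappa = \frac{d}{d\lambda}\, \mathcal L(\gamma + \lambda\vec\kappa) \Big|_{\lambda=0} \qquad \text{for } \vec\kappa \in \vec J^1_eE , \qquad (37)$$
whereas the affine variant $\mathbb F\mathcal L(\gamma)$ encodes the entire Taylor expansion, up to first order, of $\mathcal L$ around $\gamma$ along the fibers, which is the affine map from $J^1_eE$ to $\bigwedge^n T_x^*M$ given by
$$\mathbb F\mathcal L(\gamma) \cdot \kappa = \mathcal L(\gamma) + \frac{d}{d\lambda}\, \mathcal L(\gamma + \lambda(\kappa - \gamma)) \Big|_{\lambda=0} \qquad \text{for } \kappa \in J^1_eE . \qquad (38)$$
Of course, $\vec{\mathbb F}\mathcal L$ is just the linear part of $\mathbb F\mathcal L$, that is, its composition with the bundle projection $\eta$ from extended to ordinary multiphase space: $\vec{\mathbb F}\mathcal L = \eta \circ \mathbb F\mathcal L$. In local coordinates as before, $\mathbb F\mathcal L$ is given by
$$p_i^\mu = \frac{\partial L}{\partial q^i_\mu} , \qquad p = L - \frac{\partial L}{\partial q^i_\mu}\, q^i_\mu , \qquad (39)$$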
where $\mathcal L = L\, d^n x$. Finally, if $\mathcal L$ is supposed to be hyperregular, which by definition means that $\vec{\mathbb F}\mathcal L$ should be a global diffeomorphism, then one can define the De Donder - Weyl Hamiltonian $\mathcal H$ to be the section of $J^{1\circledstar}E$ over $\vec J^{1\circledast}E$ given by
$$\mathcal H = \mathbb F\mathcal L \circ (\vec{\mathbb F}\mathcal L)^{-1} . \qquad (40)$$
In local coordinates as before, this leads to
$$H = p_i^\mu\, q^i_\mu - L , \qquad (41)$$
where $\mathcal L = L\, d^n x$ and $\mathcal H = -\, H\, d^n x$, as stipulated in Eq. (4). Conversely, the covariant Hamiltonian formulation of a field theory that can be described in terms of a configuration bundle $E$ over $M$ is defined by its Hamiltonian density or simply Hamiltonian, in the spirit of De Donder and Weyl, which in global terms is a section of extended multiphase space $J^{1\circledstar}E$ as an affine line bundle over ordinary multiphase space $\vec J^{1\circledast}E$:
$$\mathcal H : \vec J^{1\circledast}E \longrightarrow J^{1\circledstar}E . \qquad (42)$$
3 Strictly speaking, the integration should be restricted to compact subsets of space-time, which leads to an entire family of action functionals.
Such a Hamiltonian gives rise to an inverse Legendre transformation, which is a map
$$\mathbb F\mathcal H : \vec J^{1\circledast}E \longrightarrow J^1E \qquad (43)$$
of fiber bundles over $E$ defined as follows. For any point $\vec z$ in $\vec J^{1\circledast}_eE$, the usual fiber derivative of $\mathcal H$ at $\vec z$ is a linear map from $\vec J^{1\circledast}_eE$ to $J^{1\circledstar}_eE$ which when composed with the projection $\eta$ from $J^{1\circledstar}_eE$ to $\vec J^{1\circledast}_eE$ gives the identity on $\vec J^{1\circledast}_eE$ (since $\mathcal H$ is a section): such linear maps form an affine subspace of the vector space of all linear maps from $\vec J^{1\circledast}_eE$ to $J^{1\circledstar}_eE$ that can be naturally identified with the original affine space $J^1_eE$, as explained in the Appendix. In local coordinates as before, $\mathbb F\mathcal H$ is given by
$$q^i_\mu = \frac{\partial H}{\partial p_i^\mu} , \qquad (44)$$
where $\mathcal H = -\, H\, d^n x$. Finally, if $\mathcal H$ is supposed to be hyperregular, which by definition means that $\mathbb F\mathcal H$ should be a global diffeomorphism, then one can define the Lagrangian $\mathcal L$ to be given by
$$\mathcal L(\gamma) = \bigl( \mathcal H \circ (\mathbb F\mathcal H)^{-1} \bigr)(\gamma) \cdot \gamma . \qquad (45)$$
In local coordinates as before, this leads to
$$L = p_i^\mu\, q^i_\mu - H , \qquad (46)$$
where $\mathcal L = L\, d^n x$ and $\mathcal H = -\, H\, d^n x$. Thus in the hyperregular case, the two processes are inverse to each other and allow one to pass freely between the Lagrangian and the Hamiltonian formulation. Of course, this is no longer true for field theories with local symmetries, in particular gauge theories, which require additional conceptual input. At any rate, it has become apparent that even in the regular case, the full power of the multiphase space approach to geometric field theory can only be explored if one uses the ordinary and extended multiphase spaces in conjunction.
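That the two transformations are inverse to each other in the hyperregular case can be seen in local coordinates by a one-line computation. The sketch below assumes that the relation $p_j^\nu = \partial L/\partial q^j_\nu$ of Eq. (39) has been solved for $q^j_\nu$ as a function of $(x^\mu, q^i, p_i^\mu)$, which is what hyperregularity provides, and then differentiates Eq. (41).

```latex
% H(x,q,p) = p_j^\nu q^j_\nu - L(x,q,q_\nu), with q^j_\nu = q^j_\nu(x,q,p)
% obtained by inverting p_j^\nu = \partial L / \partial q^j_\nu (assumption: hyperregular L).
\begin{align*}
\frac{\partial H}{\partial p_i^\mu}
 &= q^i_\mu
    + p_j^\nu\,\frac{\partial q^j_\nu}{\partial p_i^\mu}
    - \frac{\partial L}{\partial q^j_\nu}\,\frac{\partial q^j_\nu}{\partial p_i^\mu} \\
 &= q^i_\mu
    + \Bigl( p_j^\nu - \frac{\partial L}{\partial q^j_\nu} \Bigr)
      \frac{\partial q^j_\nu}{\partial p_i^\mu}
  \;=\; q^i_\mu .
\end{align*}
```

This reproduces Eq. (44), and substituting back into Eq. (46) returns the original function $L$.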
2.6. Canonical Forms. The distinguished role played by the extended multiphase space is due to the fact that it carries a naturally defined multisymplectic form $\omega$, derived from an equally naturally defined multicanonical form $\theta$ by exterior differentiation: it is this property that turns it into the field theoretical analogue of the cotangent bundle of mechanics.⁴ Global constructions are given in the literature [13–15], so we shall content ourselves with stating that in local coordinates $(x^\mu, q^i, p_i^\mu, p)$ as before, $\theta$ takes the form
$$\theta = p_i^\mu\, dq^i \wedge d^n x_\mu + p\, d^n x , \qquad (47)$$
so $\omega = -\, d\theta$ becomes
$$\omega = dq^i \wedge dp_i^\mu \wedge d^n x_\mu - dp \wedge d^n x . \qquad (48)$$
4 Note that this statement fails if one uses the ordinary duals instead of the twisted ones.
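Given a Lagrangian $\mathcal L$, we can use the associated Legendre transformation $\mathbb F\mathcal L$ to pull back $\theta$ and $\omega$ and thus define the Poincaré-Cartan forms $\theta_{\mathcal L}$ and $\omega_{\mathcal L}$ on $J^1E$ associated with the Lagrangian $\mathcal L$:
$$\theta_{\mathcal L} = (\mathbb F\mathcal L)^*\theta , \qquad \omega_{\mathcal L} = (\mathbb F\mathcal L)^*\omega . \qquad (49)$$
Similarly, given a Hamiltonian $\mathcal H$, we can use it to pull back $\theta$ and $\omega$ and thus define the De Donder-Weyl forms $\theta_{\mathcal H}$ and $\omega_{\mathcal H}$ on $\vec J^{1\circledast}E$ associated with the Hamiltonian $\mathcal H$:
$$\theta_{\mathcal H} = \mathcal H^*\theta , \qquad \omega_{\mathcal H} = \mathcal H^*\omega . \qquad (50)$$
Of course, $\omega_{\mathcal L} = -\, d\theta_{\mathcal L}$ and $\omega_{\mathcal H} = -\, d\theta_{\mathcal H}$; moreover, supposing that $\mathcal H \circ \vec{\mathbb F}\mathcal L = \mathbb F\mathcal L$, we have
$$\theta_{\mathcal L} = (\vec{\mathbb F}\mathcal L)^*\theta_{\mathcal H} , \qquad \omega_{\mathcal L} = (\vec{\mathbb F}\mathcal L)^*\omega_{\mathcal H} . \qquad (51)$$
In local coordinates as before, Eqs. (47) and (48) imply that
$$\theta_{\mathcal L} = \frac{\partial L}{\partial q^i_\mu}\, dq^i \wedge d^n x_\mu + \Bigl( L - \frac{\partial L}{\partial q^i_\mu}\, q^i_\mu \Bigr)\, d^n x , \qquad (52)$$
$$\theta_{\mathcal H} = p_i^\mu\, dq^i \wedge d^n x_\mu - H\, d^n x , \qquad (53)$$
and
$$\omega_{\mathcal L} = \frac{\partial^2 L}{\partial q^j\, \partial q^i_\mu}\, dq^i \wedge dq^j \wedge d^n x_\mu + \frac{\partial^2 L}{\partial q^j_\nu\, \partial q^i_\mu}\, dq^i \wedge dq^j_\nu \wedge d^n x_\mu + \frac{\partial^2 L}{\partial x^\mu\, \partial q^i_\mu}\, dq^i \wedge d^n x - d\Bigl( L - \frac{\partial L}{\partial q^i_\mu}\, q^i_\mu \Bigr) \wedge d^n x , \qquad (54)$$
$$\omega_{\mathcal H} = dq^i \wedge dp_i^\mu \wedge d^n x_\mu + dH \wedge d^n x . \qquad (55)$$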
It is useful to note that the forms $\theta_L$ and $\theta_H$ allow us to give a very simple definition of the action functional: it is given by pull-back and integration over space-time. Thus in the Lagrangian framework, the action associated with a section $\varphi$ of $E$ over $M$ is obtained by taking the pull-back of $\theta_L$ with its derivative, which is a section $(\varphi, \partial\varphi)$ of $J^1E$ over $M$,
\[ S[\varphi] = \int_M (\varphi, \partial\varphi)^* \theta_L \qquad \text{for } \varphi \in \Gamma(E) , \tag{56} \]
whereas in the Hamiltonian framework, the action associated with a section $(\varphi, \pi)$ of $\vec{J}^{1\circledast}E$ over $M$ is simply
\[ S[\varphi, \pi] = \int_M (\varphi, \pi)^* \theta_H \qquad \text{for } (\varphi, \pi) \in \Gamma(\vec{J}^{1\circledast}E) . \tag{57} \]
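As a consistency check that is not spelled out in the text, one can verify from Eqs. (52) and (53) that the two pull-backs reduce to the familiar densities. The sketch below assumes the standard rules $(\varphi,\partial\varphi)^* dq^i = \partial_\nu\varphi^i\, dx^\nu$ and $(\varphi,\partial\varphi)^* q^i_\mu = \partial_\mu\varphi^i$, together with the convention $dx^\nu \wedge d^n x_\mu = \delta^\nu_\mu\, d^n x$; the last identity is an assumption about the normalization of $d^n x_\mu$, which is fixed earlier in the paper.

```latex
\begin{align*}
(\varphi,\partial\varphi)^*\theta_L
 &= \frac{\partial L}{\partial q^i_\mu}(\varphi,\partial\varphi)\,
    \partial_\nu\varphi^i \; dx^\nu \wedge d^n x_\mu
  + \Big( L(\varphi,\partial\varphi)
  - \frac{\partial L}{\partial q^i_\mu}(\varphi,\partial\varphi)\,
    \partial_\mu\varphi^i \Big)\, d^n x \\
 &= L(\varphi,\partial\varphi)\, d^n x , \\[4pt]
(\varphi,\pi)^*\theta_H
 &= \big( \pi_i^\mu \, \partial_\mu\varphi^i - H(\varphi,\pi) \big)\, d^n x .
\end{align*}
```

Under these assumptions, Eq. (56) is the usual integral of the Lagrangian density, and Eq. (57) is the first-order action whose integrand is $\pi_i^\mu\,\partial_\mu\varphi^i - H$.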
Covariant Poisson Brackets in Geometric Field Theory
2.7. Euler-Lagrange and De Donder-Weyl Operator. The canonical forms introduced in the previous section are useful for giving a global formulation not only of the variational principle but also of the corresponding equations of motion: these can be expressed through the vanishing of certain (generally nonlinear) differential operators which in turn are derived from (generally nonlinear) fiber bundle maps acting on jets of sections and defined in terms of the forms $\omega_L$ and $\omega_H$, respectively. Here, we present a simple and explicit construction of these operators, in the spirit of global analysis [25, 26], which does not seem to be readily available in the literature, although there do exist various attempts that go a long way in the right direction; see, e.g., [27] for the Lagrangian case and [13] for the Hamiltonian case.

Theorem 1. Given a Lagrangian density as in Eq. (33) above, define the corresponding Euler-Lagrange map to be the map
\[ D_L : J^2E \longrightarrow V^\circledast E \tag{58} \]
of fiber bundles over $J^1E$ ${}^5$ that associates to each 2-jet $(\varphi, \partial\varphi, \partial^2\varphi)$ of (local) sections $\varphi$ of $E$ over $M$ and each vertical vector field $V$ on $E$ the $n$-form on $M$ given by
\[ D_L(\varphi, \partial\varphi, \partial^2\varphi) \cdot V = (\varphi, \partial\varphi)^* (i_V \omega_L) , \tag{59} \]
where $V$ on the rhs is any vertical vector field on $J^1E$ that projects to the vertical vector field on $E$ denoted by $V$ on the lhs. Then for any section $\varphi$ of $E$, $D_L(\varphi, \partial\varphi, \partial^2\varphi)$ is the zero section if and only if $\varphi$ satisfies the Euler-Lagrange equations associated to $L$.

Proof. Let $V$ be a vertical vector field on $E$, with local coordinate expression
\[ V = V^i \, \frac{\partial}{\partial q^i} , \]
and choose any vertical vector field on $J^1E$ that projects to $V$, which for the sake of simplicity will be denoted by the same letter $V$, with local coordinate expression
\[ V = V^i \, \frac{\partial}{\partial q^i} + V^i_\mu \, \frac{\partial}{\partial q^i_\mu} . \]
Contracting Eq. (54) with $V$ and then pulling back with $(\varphi, \partial\varphi)$ gives, after some calculation,
\begin{align*}
(\varphi, \partial\varphi)^* (i_V \omega_L)
&= \frac{\partial^2 L}{\partial x^\mu \, \partial q^i_\mu}(\varphi, \partial\varphi) \, V^i(\varphi) \, d^n x
 + \frac{\partial^2 L}{\partial q^j \, \partial q^i_\mu}(\varphi, \partial\varphi) \, V^i(\varphi) \, \partial_\mu \varphi^j \, d^n x \\
&\quad + \frac{\partial^2 L}{\partial q^j_\nu \, \partial q^i_\mu}(\varphi, \partial\varphi) \, V^i(\varphi) \, \partial_\mu \partial_\nu \varphi^j \, d^n x
 - \frac{\partial L}{\partial q^i}(\varphi, \partial\varphi) \, V^i(\varphi) \, d^n x \\
&= \Big( \partial_\mu \Big( \frac{\partial L}{\partial q^i_\mu}(\varphi, \partial\varphi) \Big) - \frac{\partial L}{\partial q^i}(\varphi, \partial\varphi) \Big) \, V^i(\varphi) \, d^n x ,
\end{align*}
where it is to be noted that the terms involving the coefficients $V^i_\mu$ have dropped out. This leads to the following explicit formula for $D_L$:
\[ D_L(\varphi, \partial\varphi, \partial^2\varphi) = \Big( \partial_\mu \Big( \frac{\partial L}{\partial q^i_\mu}(\varphi, \partial\varphi) \Big) - \frac{\partial L}{\partial q^i}(\varphi, \partial\varphi) \Big) \, dq^i \otimes d^n x . \tag{60} \]
${}^5$ Again, we suppress the symbols indicating the pull-back of bundles from $E$ or $M$ to $J^1E$, and $V^\circledast E$ denotes the twisted dual of $VE$.
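In particular, it is clear that $D_L$ depends on $\varphi$ only through the point values of $\varphi$ and its partial derivatives up to second order, which concludes the proof.

Theorem 2. Given a Hamiltonian density as in Eq. (42) above, define the corresponding De Donder-Weyl map to be the map
\[ D_H : J^1(\vec{J}^{1\circledast}E) \longrightarrow V^\circledast(\vec{J}^{1\circledast}E) \tag{61} \]
of fiber bundles over $\vec{J}^{1\circledast}E$ ${}^6$ that associates to each 1-jet $(\varphi, \pi, \partial\varphi, \partial\pi)$ of (local) sections $(\varphi, \pi)$ of $\vec{J}^{1\circledast}E$ over $M$ and each vertical vector field $V$ on $\vec{J}^{1\circledast}E$ the $n$-form on $M$ given by
\[ D_H(\varphi, \pi, \partial\varphi, \partial\pi) \cdot V = (\varphi, \pi)^* (i_V \omega_H) . \tag{62} \]
Then for any section $(\varphi, \pi)$ of $\vec{J}^{1\circledast}E$, $D_H(\varphi, \pi, \partial\varphi, \partial\pi)$ is the zero section if and only if $(\varphi, \pi)$ satisfies the De Donder-Weyl equations associated to $H$.

Proof. Let $V$ be a vertical vector field on $\vec{J}^{1\circledast}E$, with local coordinate expression
\[ V = V^i \, \frac{\partial}{\partial q^i} + V_i^\mu \, \frac{\partial}{\partial p_i^\mu} . \]
Contracting Eq. (55) with $V$ and then pulling back with $(\varphi, \pi)$ gives
\begin{align*}
(\varphi, \pi)^* (i_V \omega_H)
&= \partial_\mu \pi_i^\mu \; V^i(\varphi, \pi) \, d^n x + \frac{\partial H}{\partial q^i}(\varphi, \pi) \, V^i(\varphi, \pi) \, d^n x \\
&\quad - \partial_\mu \varphi^i \; V_i^\mu(\varphi, \pi) \, d^n x + \frac{\partial H}{\partial p_i^\mu}(\varphi, \pi) \, V_i^\mu(\varphi, \pi) \, d^n x .
\end{align*}
This leads to the following explicit formula for $D_H$:
\[ D_H(\varphi, \pi, \partial\varphi, \partial\pi) = \Big( \frac{\partial H}{\partial q^i}(\varphi, \pi) + \partial_\mu \pi_i^\mu \Big) \, dq^i \otimes d^n x + \Big( \frac{\partial H}{\partial p_i^\mu}(\varphi, \pi) - \partial_\mu \varphi^i \Big) \, dp_i^\mu \otimes d^n x . \tag{63} \]
In particular, it is clear that $D_H$ depends on $(\varphi, \pi)$ only through the point values of $\varphi$ and $\pi$ and their partial derivatives up to first order, which concludes the proof.

${}^6$ Again, we suppress the symbols indicating the pull-back of bundles from $E$ or $M$ to $\vec{J}^{1\circledast}E$, and $V^\circledast(\vec{J}^{1\circledast}E)$ denotes the twisted dual of $V(\vec{J}^{1\circledast}E)$.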
Remark. A slight extension of the above proofs shows that the expressions on the rhs of Eq. (59) and of Eq. (62) vanish on solutions of the equations of motion even when $V$ is replaced by an arbitrary (not necessarily vertical) vector field, so that one may also consider the Euler-Lagrange map as a fiber bundle map
\[ D_L : J^2E \longrightarrow T^\circledast E , \]
and the De Donder-Weyl map as a fiber bundle map
\[ D_H : J^1(\vec{J}^{1\circledast}E) \longrightarrow T^\circledast(\vec{J}^{1\circledast}E) . \]
However, we refrain from writing down the explicit local coordinate expressions for this case (which generalize Eq. (60) and Eq. (63), respectively, by including a term proportional to dx µ ⊗ d n x), since we shall not need this fact here.
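To make Eqs. (60) and (63) concrete, the following worked example evaluates both maps for a single real scalar field. Everything specific in it is an assumption made for illustration and does not come from the text: a one-component field (so the index $i$ is dropped), a constant metric $g_{\mu\nu}$ with unit density, a potential $U$, and the Legendre relation $H = p^\mu q_\mu - L$, which is consistent with comparing Eq. (52) to Eq. (53).

```latex
\begin{align*}
L &= \tfrac12\, g^{\mu\nu} q_\mu q_\nu - U(q) ,
 & H &= \tfrac12\, g_{\mu\nu}\, p^\mu p^\nu + U(q) , \\[4pt]
D_L(\varphi,\partial\varphi,\partial^2\varphi)
 &= \big( g^{\mu\nu}\partial_\mu\partial_\nu\varphi + U'(\varphi) \big)\, dq \otimes d^n x , \\[4pt]
D_H(\varphi,\pi,\partial\varphi,\partial\pi)
 &= \big( U'(\varphi) + \partial_\mu\pi^\mu \big)\, dq \otimes d^n x
  + \big( g_{\mu\nu}\pi^\nu - \partial_\mu\varphi \big)\, dp^\mu \otimes d^n x .
\end{align*}
```

In this sketch, the vanishing of $D_H$ gives $\pi^\mu = g^{\mu\nu}\partial_\nu\varphi$ and $\partial_\mu\pi^\mu = -U'(\varphi)$. Eliminating $\pi$ reproduces the vanishing of $D_L$, so the two theorems describe the same solutions for this model.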
2.8. Jacobi Operators. In order to make contact with the functional formalism to be discussed in the next section, we must also derive explicit expressions for the linearization of the Euler-Lagrange operator and the De Donder-Weyl operator around a given solution of the equations of motion. This leads to linear differential operators between vector bundles over $M$ that we shall refer to as Jacobi operators, generalizing the familiar derivation of the Jacobi equation by linearizing the geodesic equation. In its Lagrangian version, the Jacobi operator is a second order differential operator
\[ J_L[\varphi] : \Gamma(\varphi^* VE) \longrightarrow \Gamma(\varphi^* V^\circledast E) , \tag{64} \]
obtained by linearizing the Euler-Lagrange operator $D_L$ around a given solution $\varphi$ of the equations of motion. Similarly, in its Hamiltonian version, the Jacobi operator is a first order differential operator
\[ J_H[\varphi, \pi] : \Gamma\big((\varphi, \pi)^* V(\vec{J}^{1\circledast}E)\big) \longrightarrow \Gamma\big((\varphi, \pi)^* V^\circledast(\vec{J}^{1\circledast}E)\big) , \tag{65} \]
obtained by linearizing the De Donder-Weyl operator $D_H$ around a given solution $(\varphi, \pi)$ of the equations of motion. (Thus in both cases, the vector bundles involved are obtained by pulling back the appropriate vertical bundle and its twisted dual with the solution of the nonlinear equation around which the linearization is performed.) To obtain explicit expressions, consider an arbitrary variation $\varphi_\lambda$ around $\varphi$ and evaluate $D_L(\varphi_\lambda, \partial\varphi_\lambda, \partial^2\varphi_\lambda)$ which, for each $\lambda$, is a section of $\varphi_\lambda^* V^\circledast E$, observing that since $\varphi = \varphi_\lambda\big|_{\lambda=0}$ is a solution, $D_L(\varphi_\lambda, \partial\varphi_\lambda, \partial^2\varphi_\lambda)\big|_{\lambda=0}$ is the zero section of $\varphi^* V^\circledast E$, and setting
\[ \delta\varphi = \frac{\partial}{\partial\lambda}\, \varphi_\lambda \Big|_{\lambda=0} . \tag{66} \]
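Noting that in local coordinates, the value of $D_L(\varphi_\lambda, \partial\varphi_\lambda, \partial^2\varphi_\lambda)$ at a point $x$ in $M$ with coordinates $x^\mu$ has coordinates $(x^\mu, \varphi_\lambda^i(x), D_L(\varphi_\lambda, \partial\varphi_\lambda, \partial^2\varphi_\lambda)_i(x))$, where the last piece is the coefficient of $dq^i \otimes d^n x$ in Eq. (60), we get by differentiation with respect to $\lambda$,
\begin{align*}
\frac{\partial}{\partial\lambda}\, D_L(\varphi_\lambda, \partial\varphi_\lambda, \partial^2\varphi_\lambda) \Big|_{\lambda=0}
&= \delta\varphi^i \, \frac{\partial}{\partial q^i}
 + \Big[ \partial_\mu \Big( \frac{\partial^2 L}{\partial q^j \, \partial q^i_\mu}(\varphi, \partial\varphi) \, \delta\varphi^j + \frac{\partial^2 L}{\partial q^j_\nu \, \partial q^i_\mu}(\varphi, \partial\varphi) \, \partial_\nu \delta\varphi^j \Big) \\
&\qquad - \frac{\partial^2 L}{\partial q^j \, \partial q^i}(\varphi, \partial\varphi) \, \delta\varphi^j - \frac{\partial^2 L}{\partial q^j_\nu \, \partial q^i}(\varphi, \partial\varphi) \, \partial_\nu \delta\varphi^j \Big] \, dq^i \otimes d^n x .
\end{align*}
Similarly, consider an arbitrary variation $(\varphi_\lambda, \pi_\lambda)$ around $(\varphi, \pi)$ and evaluate $D_H(\varphi_\lambda, \pi_\lambda, \partial\varphi_\lambda, \partial\pi_\lambda)$ which, for each $\lambda$, is a section of $(\varphi_\lambda, \pi_\lambda)^* V^\circledast(\vec{J}^{1\circledast}E)$, observing that since $(\varphi, \pi) = (\varphi_\lambda, \pi_\lambda)\big|_{\lambda=0}$ is a solution, $D_H(\varphi_\lambda, \pi_\lambda, \partial\varphi_\lambda, \partial\pi_\lambda)\big|_{\lambda=0}$ is the zero section of $(\varphi, \pi)^* V^\circledast(\vec{J}^{1\circledast}E)$, and setting
\[ (\delta\varphi, \delta\pi) = \frac{\partial}{\partial\lambda}\, (\varphi_\lambda, \pi_\lambda) \Big|_{\lambda=0} . \tag{67} \]
Again, noting that in local coordinates, the value of $D_H(\varphi_\lambda, \pi_\lambda, \partial\varphi_\lambda, \partial\pi_\lambda)$ at a point $x$ in $M$ with coordinates $x^\mu$ has coordinates $(x^\mu, \varphi_\lambda^i(x), (\pi_\lambda)_i^\mu(x), D_H(\varphi_\lambda, \pi_\lambda, \partial\varphi_\lambda, \partial\pi_\lambda)_i(x), D_H(\varphi_\lambda, \pi_\lambda, \partial\varphi_\lambda, \partial\pi_\lambda)^i_\mu(x))$, where the last two pieces are the coefficients of $dq^i \otimes d^n x$ and of $dp_i^\mu \otimes d^n x$ in Eq. (63), we get by differentiation with respect to $\lambda$,
\begin{align*}
\frac{\partial}{\partial\lambda}\, D_H(\varphi_\lambda, \pi_\lambda, \partial\varphi_\lambda, \partial\pi_\lambda) \Big|_{\lambda=0}
&= \delta\varphi^i \, \frac{\partial}{\partial q^i} + \delta\pi_i^\mu \, \frac{\partial}{\partial p_i^\mu} \\
&\quad + \Big( \frac{\partial^2 H}{\partial q^j \, \partial q^i}(\varphi, \pi) \, \delta\varphi^j + \frac{\partial^2 H}{\partial p_j^\nu \, \partial q^i}(\varphi, \pi) \, \delta\pi_j^\nu + \partial_\mu \delta\pi_i^\mu \Big) \, dq^i \otimes d^n x \\
&\quad + \Big( \frac{\partial^2 H}{\partial q^j \, \partial p_i^\mu}(\varphi, \pi) \, \delta\varphi^j + \frac{\partial^2 H}{\partial p_j^\nu \, \partial p_i^\mu}(\varphi, \pi) \, \delta\pi_j^\nu - \partial_\mu \delta\varphi^i \Big) \, dp_i^\mu \otimes d^n x .
\end{align*}
In order to show how to extract the Jacobi operators from these expressions, by means of a globally defined prescription, we apply the following construction [10]. Let $F$ be a fiber bundle over $M$, with bundle projection $\pi_{F,M} : F \to M$, and $W$ be a vector bundle over $F$ with bundle projection $\pi_{W,F} : W \to F$, which is then also a fiber bundle (but not necessarily a vector bundle) over $M$ with respect to the composite bundle projection $\pi_{W,M} = \pi_{F,M} \circ \pi_{W,F} : W \to M$. Thus $W$ admits two different kinds of vertical bundles, $V_F W$ and $V_M W$, with fibers defined by $(V_F)_w W = \ker T_w \pi_{W,F}$ and $(V_M)_w W = \ker T_w \pi_{W,M}$ for $w \in W$; obviously, the former is contained in the latter as a vector subbundle. Moreover, since $W$ is supposed to be a vector bundle over $F$, there is a canonical isomorphism $V_F W \cong \pi_{W,F}^* W$. On the other hand, consider the vertical bundle $VF$ of $F$ which can be pulled back to $W$ to obtain a vector bundle $\pi_{W,F}^*(VF)$ over $W$, with fibers defined by $(\pi_{W,F}^*(VF))_w = V_f F = \ker T_f \pi_{F,M}$ for $w \in W$ with $f = \pi_{W,F}(w)$. Note also that the tangent map to the bundle projection $\pi_{W,F}$, which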
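by definition has kernel $V_F W$, maps $V_M W$ onto $VF$, so we have the following exact sequence of vector bundles over $W$:
\[ 0 \longrightarrow V_F W \cong \pi_{W,F}^* W \longrightarrow V_M W \longrightarrow \pi_{W,F}^*(VF) \longrightarrow 0 . \]
The crucial observation is now that this exact sequence admits a canonical splitting over the zero section $0 : F \to W$, given simply by its tangent map. Indeed, its tangent map $T_f 0 : T_f F \to T_{0(f)} W$ at any point $f \in F$ takes the vertical subspace $V_f F$ to the $M$-vertical subspace $(V_M)_{0(f)} W$ and so restricts to a vertical tangent map $V_f 0 : V_f F \to (V_M)_{0(f)} W$ whose composition with the restriction of the tangent map $T_{0(f)} \pi_{W,F} : T_{0(f)} W \to T_f F$ to $(V_M)_{0(f)} W$ gives the identity on $V_f F$. Thus the image of $V_f 0$ is a subspace of $(V_M)_{0(f)} W$ that is complementary to the subspace $(V_F)_{0(f)} W = W_f$ and provides a surjective linear map $\sigma_f : (V_M)_{0(f)} W \to W_f$ of which it is the kernel. At the level of bundles, this corresponds to a surjective vector bundle homomorphism $\sigma : V_M W|_0 \to W$.

Applying this construction to the situation at hand, take $F = E$ in the Lagrangian case and $F = \vec{J}^{1\circledast}E$ in the Hamiltonian case, setting $W = V^\circledast F$ in both cases. The fact that the Euler-Lagrange or De Donder-Weyl operator is being linearized around a solution $\varphi$ or $(\varphi, \pi)$ of the equations of motion then means that we are evaluating its derivative, which a priori takes the variation $\delta\varphi$ or $(\delta\varphi, \delta\pi)$ to a vector field on $W$ along $M$ which is vertical with respect to the projection of $W$ onto $M$, precisely over the zero section, so we can apply the operator $\sigma$ just introduced to project it down to a section of $W$ over $M$ itself. This operation completes the definition of the Jacobi operators, namely
\[ J_L[\varphi] \cdot \delta\varphi = \sigma \Big( \frac{\partial}{\partial\lambda}\, D_L(\varphi_\lambda, \partial\varphi_\lambda, \partial^2\varphi_\lambda) \Big|_{\lambda=0} \Big) , \tag{68} \]
and
\[ J_H[\varphi, \pi] \cdot (\delta\varphi, \delta\pi) = \sigma \Big( \frac{\partial}{\partial\lambda}\, D_H(\varphi_\lambda, \pi_\lambda, \partial\varphi_\lambda, \partial\pi_\lambda) \Big|_{\lambda=0} \Big) . \tag{69} \]
The local coordinate expressions are the ones derived above, that is,
\begin{align*}
J_L[\varphi] \cdot \delta\varphi
&= \Big[ \frac{\partial^2 L}{\partial q^j_\nu \, \partial q^i_\mu}(\varphi, \partial\varphi) \, \partial_\mu \partial_\nu \delta\varphi^j \\
&\qquad + \Big( \frac{\partial^2 L}{\partial q^j \, \partial q^i_\nu}(\varphi, \partial\varphi) + \partial_\mu \Big( \frac{\partial^2 L}{\partial q^j_\nu \, \partial q^i_\mu}(\varphi, \partial\varphi) \Big) - \frac{\partial^2 L}{\partial q^i \, \partial q^j_\nu}(\varphi, \partial\varphi) \Big) \, \partial_\nu \delta\varphi^j \\
&\qquad + \Big( \partial_\mu \Big( \frac{\partial^2 L}{\partial q^j \, \partial q^i_\mu}(\varphi, \partial\varphi) \Big) - \frac{\partial^2 L}{\partial q^j \, \partial q^i}(\varphi, \partial\varphi) \Big) \, \delta\varphi^j \Big] \, dq^i \otimes d^n x , \tag{70}
\end{align*}
and
\begin{align*}
J_H[\varphi, \pi] \cdot (\delta\varphi, \delta\pi)
&= \Big( \frac{\partial^2 H}{\partial q^j \, \partial q^i}(\varphi, \pi) \, \delta\varphi^j + \frac{\partial^2 H}{\partial p_j^\nu \, \partial q^i}(\varphi, \pi) \, \delta\pi_j^\nu + \partial_\mu \delta\pi_i^\mu \Big) \, dq^i \otimes d^n x \\
&\quad + \Big( \frac{\partial^2 H}{\partial q^j \, \partial p_i^\mu}(\varphi, \pi) \, \delta\varphi^j + \frac{\partial^2 H}{\partial p_j^\nu \, \partial p_i^\mu}(\varphi, \pi) \, \delta\pi_j^\nu - \partial_\mu \delta\varphi^i \Big) \, dp_i^\mu \otimes d^n x . \tag{71}
\end{align*}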
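As an illustration of Eqs. (70) and (71), consider again a single real scalar field. The model is an assumption introduced only for this sketch: $L = \tfrac12 g^{\mu\nu} q_\mu q_\nu - U(q)$ and $H = \tfrac12 g_{\mu\nu} p^\mu p^\nu + U(q)$ with a constant metric $g_{\mu\nu}$ and a potential $U$. All mixed derivatives $\partial^2 L / \partial q\, \partial q_\mu$ and $\partial^2 H / \partial q\, \partial p^\mu$ then vanish, and the general formulas collapse to:

```latex
\begin{align*}
J_L[\varphi]\cdot\delta\varphi
 &= \big( g^{\mu\nu}\,\partial_\mu\partial_\nu\,\delta\varphi
        + U''(\varphi)\,\delta\varphi \big)\, dq \otimes d^n x , \\[4pt]
J_H[\varphi,\pi]\cdot(\delta\varphi,\delta\pi)
 &= \big( U''(\varphi)\,\delta\varphi + \partial_\mu\,\delta\pi^\mu \big)\, dq \otimes d^n x
  + \big( g_{\mu\nu}\,\delta\pi^\nu - \partial_\mu\,\delta\varphi \big)\, dp^\mu \otimes d^n x .
\end{align*}
```

In this hypothetical model the Lagrangian Jacobi operator is the wave operator with the position-dependent mass term $U''(\varphi)$, evaluated on the background solution $\varphi$. For a Lorentzian $g$ its second-order part is hyperbolic.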
3. Functional Approach

Let us begin by recalling the definition of the Poisson bracket between functions on a symplectic manifold with symplectic form $\omega$. First, one associates to each (smooth) function $f$ a (smooth) Hamiltonian vector field $X_f$, uniquely determined by the condition
\[ i_{X_f}\, \omega = df . \tag{72} \]
Then the Poisson bracket of two functions $f$ and $g$ is defined to be the function $\{f, g\}$ given by
\[ \{f, g\} = -\, i_{X_f} i_{X_g}\, \omega = df(X_g) = -\, dg(X_f) . \tag{73} \]
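The finite-dimensional definition in Eqs. (72) and (73) can be tried out numerically. The following is a minimal sketch under stated assumptions: the phase space is $\mathbb{R}^2$ with coordinates $(q, p)$ and symplectic form $\omega = dq \wedge dp$, and derivatives are approximated by central differences. The function names `partial`, `hamiltonian_vector_field` and `poisson_bracket` are hypothetical helpers and do not come from the paper.

```python
# Sketch: Poisson bracket on R^2 with omega = dq ^ dp (an assumed example).

def partial(f, q, p, var, h=1e-5):
    """Central-difference partial derivative of f(q, p) with respect to 'q' or 'p'."""
    if var == "q":
        return (f(q + h, p) - f(q - h, p)) / (2 * h)
    return (f(q, p + h) - f(q, p - h)) / (2 * h)


def hamiltonian_vector_field(f, q, p):
    """Components (X^q, X^p) of X_f, solving i_{X_f} omega = df as in Eq. (72)."""
    # For omega = dq ^ dp one has i_X omega = X^q dp - X^p dq,
    # hence X^q = df/dp and X^p = -df/dq.
    return partial(f, q, p, "p"), -partial(f, q, p, "q")


def poisson_bracket(f, g, q, p):
    """{f, g} = df(X_g), the middle expression in Eq. (73)."""
    xq, xp = hamiltonian_vector_field(g, q, p)
    return partial(f, q, p, "q") * xq + partial(f, q, p, "p") * xp


if __name__ == "__main__":
    coord_q = lambda q, p: q
    coord_p = lambda q, p: p
    print(poisson_bracket(coord_q, coord_p, 0.3, -1.2))
```

With these sign conventions the coordinate functions satisfy $\{q, p\} = 1$ up to rounding, and the bracket is antisymmetric, in line with the last equality in Eq. (73).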
The goal of this section is to show that formally, the same construction applied to covariant phase space links the Witten symplectic form to the Peierls bracket.

3.1. Covariant Phase Space. In contrast to the traditional non-covariant Hamiltonian formalism of field theory, where phase space is a “space” of Cauchy data, covariant phase space, denoted here by $\mathcal{S}$, is the “space” of solutions of the equations of motion, or field equations. Of course, one cannot expect these two interpretations of phase space to be equivalent in complete generality, since it is well known that, for nonlinear equations, time evolution of regular Cauchy data may lead to solutions that, within finite time, develop some kind of singularity. An even more elementary prerequisite is that the underlying space-time manifold $M$ must admit at least some Cauchy surface $\Sigma$: this means that $M$ should be globally hyperbolic. Thus our basic assumption for the remainder of this paper will be that the underlying space-time manifold $M$ should be globally hyperbolic. Globally hyperbolic space-times are the natural arena for the mathematical theory of hyperbolic (systems of) partial differential equations, in which the Cauchy problem is well posed. There are by now various and apparently quite different definitions of the concept of a globally hyperbolic space-time, but they have ultimately turned out to be all equivalent; see Chapter 8 of [28] for an extensive discussion. For our purposes, the most convenient one is that $M$ admits a global time function whose level surfaces provide a foliation of $M$ into Cauchy surfaces, providing a global diffeomorphism $M \cong \mathbb{R} \times \Sigma$. As an immediate corollary, we can define the concept of a (closed/open) time slice in $M$: it is a (closed/open) subset of $M$ which under such a global diffeomorphism corresponds to a subset of the form $I \times \Sigma$, where $I$ is a (closed/open) interval in $\mathbb{R}$.
In the Lagrangian as well as in the Hamiltonian approach to field theory, the equations of motion are derived from a variational principle, that is, their solutions are the stationary points of a certain functional $S$ called the action and defined on a space of sections of an appropriate fiber bundle over space-time which is usually referred to as the space of field configurations of the theory and will in what follows be denoted by $\mathcal{C}$. More concretely, $\mathcal{C}$ is the space $\Gamma(F)$ of smooth sections $\phi$ of a fiber bundle $F$ over $M$: in the Lagrangian approach, $F$ is the configuration bundle $E$, whereas in the Hamiltonian approach, $F$ is the multiphase space $\vec{J}^{1\circledast}E$, regarded as a fiber bundle over $M$.

Formally, we shall as usual think of $\mathcal{C}$ as being a manifold (which is of course infinite-dimensional). As such, it has at each of its points $\phi$ a tangent space $T_\phi \mathcal{C}$ that can be defined formally as a space of smooth sections, with appropriate support properties, of the vector bundle $\phi^* VF$ over $M$, i.e., $T_\phi \mathcal{C} \subset \Gamma^\infty(\phi^* VF)$. The cotangent space $T_\phi^* \mathcal{C}$ will then be the space of distributional sections, with dual support properties, of the twisted dual vector bundle $\phi^* V^\circledast F$ over $M$, i.e., $T_\phi^* \mathcal{C} \subset \Gamma^{-\infty}(\phi^* V^\circledast F)$. It contains as a subspace the corresponding space of smooth sections, where the pairing between a smooth section of $\phi^* VF$ and a smooth section of $\phi^* V^\circledast F$ (with appropriate support conditions) is given by contraction and integration of the resulting form over $M$. Similarly, the second tensor power $T_\phi^* \mathcal{C} \otimes T_\phi^* \mathcal{C}$ of $T_\phi^* \mathcal{C}$ can be thought of as the space of distributional sections, again with dual support properties, of the second exterior tensor power${}^7$ $\phi^* V^\circledast F \boxtimes \phi^* V^\circledast F$ of $\phi^* V^\circledast F$; it contains as a subspace the corresponding space of smooth sections, where the pairing between a pair of smooth sections of $\phi^* VF$ and a smooth section of $\phi^* V^\circledast F \boxtimes \phi^* V^\circledast F$ (with appropriate support conditions) is given by contraction and integration of the resulting form over $M \times M$.

Regarding the support conditions to be imposed, the first two options that come to mind would be to require that either the elements of $T_\phi \mathcal{C}$ or the elements of $T_\phi^* \mathcal{C}$ should have compact support, which would imply that the support of the elements of the corresponding dual, $T_\phi^* \mathcal{C}$ or $T_\phi \mathcal{C}$, could be left completely arbitrary:

Option 1.
\[ T_\phi \mathcal{C} = \Gamma^\infty(\phi^* VF) , \qquad T_\phi^* \mathcal{C} = \Gamma_c^{-\infty}(\phi^* V^\circledast F) . \tag{74} \]
Option 2.
\[ T_\phi \mathcal{C} = \Gamma_c^\infty(\phi^* VF) , \qquad T_\phi^* \mathcal{C} = \Gamma^{-\infty}(\phi^* V^\circledast F) . \tag{75} \]
There is a third option that makes use of the assumption that $M$ is globally hyperbolic. To formulate it, we introduce the following terminology. A section of a vector bundle over $M$ is said to have spatially compact support if the intersection between its support and any (closed) time slice in $M$ is compact, and it is said to have temporally compact support if its support is contained in some time slice. Then, as in Ref. [8], we require the elements of $T_\phi \mathcal{C}$ to have spatially compact support and the elements of $T_\phi^* \mathcal{C}$ to have temporally compact support:

Option 3.
\[ T_\phi \mathcal{C} = \Gamma_{sc}^\infty(\phi^* VF) , \qquad T_\phi^* \mathcal{C} = \Gamma_{tc}^{-\infty}(\phi^* V^\circledast F) . \tag{76} \]
Obviously, for each of these three options, the two spaces listed above are naturally dual to each other.${}^8$

These constructions can be applied to elucidate the nature of functional derivatives of functionals on $\mathcal{C}$, such as the action. Namely, given a (formally smooth) functional $\mathcal{F} : \mathcal{C} \to \mathbb{R}$, its functional derivative at a point $\phi$ is the linear functional $\mathcal{F}'[\phi]$ on $T_\phi \mathcal{C}$ which, when applied to $\delta\phi$, yields the directional derivative of $\mathcal{F}$ at $\phi$ along $\delta\phi$, defined by the requirement that for any one-parameter family of sections $\phi_\lambda$ of $F$ such that $\phi_\lambda\big|_{\lambda=0} = \phi$,
\[ \mathcal{F}'[\phi] \cdot \delta\phi = \frac{d}{d\lambda}\, \mathcal{F}[\phi_\lambda] \Big|_{\lambda=0} \qquad \text{if} \qquad \delta\phi = \frac{\partial}{\partial\lambda}\, \phi_\lambda \Big|_{\lambda=0} . \]
Then $\mathcal{F}'[\phi]$ is a distributional section of $\phi^* V^\circledast F$ with appropriate support properties (dual to those required for $T_\phi \mathcal{C}$). In local coordinates, its action on $\delta\phi$ can (formally and at least when the intersection of the two supports is contained in the coordinate system domain) be written in the form
\[ \mathcal{F}'[\phi] \cdot \delta\phi = \int_M d^n x \; \frac{\delta\mathcal{F}}{\delta\phi}[\phi](x) \cdot \delta\phi(x) . \tag{77} \]
${}^7$ If $V$ and $W$ are vector bundles over $M$, $V \boxtimes W$ is defined to be the vector bundle over $M \times M$ with fibers given by $(V \boxtimes W)_{(x,y)} = V_x \otimes W_y$, for all $x, y \in M$.

${}^8$ Here and in what follows, the symbols $\Gamma_c$, $\Gamma_{sc}$ and $\Gamma_{tc}$ indicate spaces of sections of compact, spatially compact and temporally compact support, respectively.
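The expression $(\delta\mathcal{F}/\delta\phi)[\phi]$, sometimes called the variational derivative of $\mathcal{F}$ at $\phi$, is then a distributional section of $\phi^* V^* F$ (over the coordinate system domain). In the Lagrangian framework,
\[ \frac{\delta\mathcal{F}}{\delta\phi}[\phi](x) = \frac{\delta\mathcal{F}}{\delta\varphi^i}[\varphi](x) \; dq^i , \]
whereas in the Hamiltonian framework,
\[ \frac{\delta\mathcal{F}}{\delta\phi}[\phi](x) = \frac{\delta\mathcal{F}}{\delta\varphi^i}[\varphi, \pi](x) \; dq^i + \frac{\delta\mathcal{F}}{\delta\pi_i^\mu}[\varphi, \pi](x) \; dp_i^\mu . \]
Similarly, the second functional derivative of $\mathcal{F}$ at $\phi$ is the symmetric bilinear functional $\mathcal{F}''[\phi]$ on $T_\phi \mathcal{C}$ which, when applied to $\delta\phi_1$ and $\delta\phi_2$, can be defined by the requirement that for any two-parameter family of sections $\phi_{\lambda_1,\lambda_2}$ of $F$ such that $\phi_{\lambda_1,\lambda_2}\big|_{\lambda_1,\lambda_2=0} = \phi$,
\[ \mathcal{F}''[\phi] \cdot (\delta\phi_1, \delta\phi_2) = \frac{\partial^2}{\partial\lambda_1 \, \partial\lambda_2}\, \mathcal{F}[\phi_{\lambda_1,\lambda_2}] \Big|_{\lambda_1,\lambda_2=0} \]
if
\[ \delta\phi_1 = \frac{\partial}{\partial\lambda_1}\, \phi_{\lambda_1,\lambda_2} \Big|_{\lambda_1,\lambda_2=0} , \qquad \delta\phi_2 = \frac{\partial}{\partial\lambda_2}\, \phi_{\lambda_1,\lambda_2} \Big|_{\lambda_1,\lambda_2=0} . \]
Then $\mathcal{F}''[\phi]$ is a distributional section of $\phi^* V^\circledast F \boxtimes \phi^* V^\circledast F$ with appropriate support properties (dual to those required for $T_\phi \mathcal{C} \otimes T_\phi \mathcal{C}$). In local coordinates for $M \times M$ induced from local coordinates for $M$, its action on $(\delta\phi_1, \delta\phi_2)$ can (formally and at least when the intersection of the supports is contained in the coordinate system domain) be written in the form
\[ \mathcal{F}''[\phi] \cdot (\delta\phi_1, \delta\phi_2) = \int_M d^n x \int_M d^n y \; \frac{\delta^2 \mathcal{F}}{\delta\phi^2}[\phi](x, y) \cdot (\delta\phi_1(x), \delta\phi_2(y)) . \tag{78} \]
The expression $(\delta^2 \mathcal{F}/\delta\phi^2)[\phi]$, sometimes called the variational Hessian of $\mathcal{F}$ at $\phi$, is then a distributional section of $\phi^* V^* F \boxtimes \phi^* V^* F$ (over the coordinate system domain). In the Lagrangian framework,
\[ \frac{\delta^2 \mathcal{F}}{\delta\phi^2}[\phi](x, y) = \frac{\delta^2 \mathcal{F}}{\delta\varphi^i \, \delta\varphi^j}[\varphi](x, y) \; dq^i \otimes dq^j , \]
whereas in the Hamiltonian framework,
\begin{align*}
\frac{\delta^2 \mathcal{F}}{\delta\phi^2}[\phi](x, y)
&= \frac{\delta^2 \mathcal{F}}{\delta\varphi^i \, \delta\varphi^j}[\varphi, \pi](x, y) \; dq^i \otimes dq^j + \frac{\delta^2 \mathcal{F}}{\delta\varphi^i \, \delta\pi_j^\nu}[\varphi, \pi](x, y) \; dq^i \otimes dp_j^\nu \\
&\quad + \frac{\delta^2 \mathcal{F}}{\delta\pi_i^\mu \, \delta\varphi^j}[\varphi, \pi](x, y) \; dp_i^\mu \otimes dq^j + \frac{\delta^2 \mathcal{F}}{\delta\pi_i^\mu \, \delta\pi_j^\nu}[\varphi, \pi](x, y) \; dp_i^\mu \otimes dp_j^\nu .
\end{align*}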
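A simple worked example may help fix the meaning of Eqs. (77) and (78). It is an illustration only and rests on assumptions not made in the text: a single scalar field $\varphi$ in the Lagrangian framework, and a fixed smooth weight function $f$ on $M$ with compact support. The functional $\mathcal{F}_f$ below is hypothetical.

```latex
\begin{align*}
\mathcal{F}_f[\varphi] &= \frac12 \int_M d^n x \; f(x)\, \varphi(x)^2 , \\[4pt]
\mathcal{F}_f'[\varphi]\cdot\delta\varphi
 &= \int_M d^n x \; f(x)\,\varphi(x)\,\delta\varphi(x)
 && \Longrightarrow &
 \frac{\delta\mathcal{F}_f}{\delta\varphi}[\varphi](x) &= f(x)\,\varphi(x) , \\[4pt]
\mathcal{F}_f''[\varphi]\cdot(\delta\varphi_1,\delta\varphi_2)
 &= \int_M d^n x \; f(x)\,\delta\varphi_1(x)\,\delta\varphi_2(x)
 && \Longrightarrow &
 \frac{\delta^2\mathcal{F}_f}{\delta\varphi^2}[\varphi](x,y) &= f(x)\,\delta(x,y) .
\end{align*}
```

Here the variational derivative is supported inside the support of $f$, hence compactly supported, so the pairing in Eq. (77) is defined for variations $\delta\varphi$ of arbitrary support. The variational Hessian is a genuine distribution concentrated on the diagonal of $M \times M$, which is why distributional sections are needed.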
Of course, for the integrals in Eqs. (77) and (78) to make sense, even when interpreted in the sense of pairing distributions with test functions, we must make some assumption about support properties, which leads us back to the options stated in Eqs. (74)–(76).

Option 1: when $\mathcal{F}$ is arbitrary, we have to restrict the sections $\delta\phi, \delta\phi_1, \delta\phi_2$ of $\phi^* VF$ considered above to have compact support (which can be achieved if the sections $\phi_\lambda$, $\phi_{\lambda_1,\lambda_2}$ of $F$ are supposed to be independent of the parameters outside a compact subset).

Option 2: when $\mathcal{F}$ is local, which we understand to mean that its functional dependence on the fields is non-trivial only within a compact region, or equivalently, that its functional derivative $\mathcal{F}'[\phi]$ at each $\phi$ has compact support, the sections $\delta\phi, \delta\phi_1, \delta\phi_2$ of $\phi^* VF$ considered above may be allowed to have arbitrary support; this is the case for local observables defined as integrals of local densities over compact regions of space-time and, in particular, over compact regions within a Cauchy surface (energy, momentum, angular momentum, charges, etc. within a finite volume).

Option 3: when $\mathcal{F}$ is local in time, which we understand to mean that its functional dependence on the fields is non-trivial only within a time slice, or equivalently, that its functional derivative $\mathcal{F}'[\phi]$ at each $\phi$ has temporally compact support, we have to restrict the sections $\delta\phi, \delta\phi_1, \delta\phi_2$ of $\phi^* VF$ considered above to have spatially compact support (which can be achieved if the sections $\phi_\lambda$, $\phi_{\lambda_1,\lambda_2}$ are supposed to be independent of the parameters outside a spatially compact subset); this is the case for global observables defined as integrals of local densities over time slices and, in particular, over a Cauchy surface (total energy, total momentum, total angular momentum, total charges, etc.).

Finally, covariant phase space $\mathcal{S}$ is defined to be the subset of $\mathcal{C}$ consisting of the critical points of the action:
\[ \mathcal{S} = \{ \phi \in \mathcal{C} \mid S'[\phi] = 0 \} . \tag{79} \]
Formally, we can think of $\mathcal{S}$ as being a submanifold of $\mathcal{C}$ whose tangent space at any point $\phi$ of $\mathcal{S}$ will be the subspace $T_\phi \mathcal{S}$ of the tangent space $T_\phi \mathcal{C}$ consisting of the solutions of the linearized equations of motion (where the linearization is to be performed around the given solution $\phi$ of the full equations of motion), which are precisely the sections of $\phi^* VF$ belonging to the kernel of the corresponding Jacobi operator $J[\phi] : \Gamma(\phi^* VF) \longrightarrow \Gamma(\phi^* V^\circledast F)$:
\[ T_\phi \mathcal{S} = \ker J[\phi] . \tag{80} \]
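3.2. Symplectic structure. Our next goal is to justify the term “covariant phase space” attributed to $\mathcal{S}$ by showing that, formally, $\mathcal{S}$ carries a naturally defined symplectic form $\Omega$, derived from an equally naturally defined canonical form $\Theta$ by formal exterior differentiation. According to Crnković, Witten and Zuckerman [1–3] (see also [4]), the symplectic form $\Omega$ can be obtained by integration of a “symplectic current”, which is a closed $(n-1)$-form on space-time, over an arbitrary spacelike hypersurface $\Sigma$. Here, we show that this “symplectic current” can be derived directly from the multisymplectic form $\omega$ or, more explicitly, from the Poincaré-Cartan form $\omega_L$ in the Lagrangian approach and the De Donder-Weyl form $\omega_H$ in the Hamiltonian approach. We begin with the definition of $\Theta$ and $\Omega$ in terms of $\theta$ and $\omega$, which is achieved by a mixture of contraction and pull-back: given a point $\phi$ in $\mathcal{C}$ (a smooth section $\phi$ of $F$) and smooth sections $\delta\phi, \delta\phi_1, \delta\phi_2$ of $\phi^* VF$, insert $\delta\phi$ into the first of the $n$ arguments of $\theta$ or $\delta\phi_1$ and $\delta\phi_2$ into the first two of the $n+1$ arguments of $\omega$ and apply the definition of the pull-back with $\phi$ (which amounts to composition with the derivatives $\partial\phi$ of $\phi$) to the remaining $n-1$ arguments to obtain $(n-1)$-forms on space-time which are integrated over $\Sigma$. Note that these integrals exist if we assume that $\delta\phi$ and either $\delta\phi_1$ or $\delta\phi_2$ have spatially compact support, since this will intersect $\Sigma$ in a compact subset.

Explicitly, in the Lagrangian framework, we have
\[ \Theta_\phi(\delta\phi) = \int_\Sigma (\varphi, \partial\varphi)^* \theta_L (\delta\varphi, \partial\, \delta\varphi) \tag{81} \]
and
\[ \Omega_\phi(\delta\phi_1, \delta\phi_2) = \int_\Sigma (\varphi, \partial\varphi)^* \omega_L (\delta\varphi_1, \partial\, \delta\varphi_1, \delta\varphi_2, \partial\, \delta\varphi_2) , \tag{82} \]
where the notation is the same as that employed in Eq. (56): $\phi = \varphi$ is a section of $E$ over $M$ and $j^1\varphi = (\varphi, \partial\varphi)$ is its (first) prolongation or derivative, a section of $J^1E$ over $M$, while $\delta\phi = \delta\varphi$, $\delta\phi_1 = \delta\varphi_1$, $\delta\phi_2 = \delta\varphi_2$ are variations of $\phi = \varphi$, all sections of $VE$ over $M$, and $\delta j^1\varphi = (\delta\varphi, \partial\, \delta\varphi)$, $\delta j^1\varphi_1 = (\delta\varphi_1, \partial\, \delta\varphi_1)$, $\delta j^1\varphi_2 = (\delta\varphi_2, \partial\, \delta\varphi_2)$ are the induced variations of $j^1\varphi = (\varphi, \partial\varphi)$, all sections of $V(J^1E) \cong J^1(VE)$ over $M$. In local coordinates,
\[ \delta\varphi = \frac{\partial}{\partial\lambda}\, \varphi_\lambda \Big|_{\lambda=0} = \delta\varphi^i \, \frac{\partial}{\partial q^i} \]
and
\[ \delta j^1\varphi = \frac{\partial}{\partial\lambda}\, j^1\varphi_\lambda \Big|_{\lambda=0} = \delta\varphi^i \, \frac{\partial}{\partial q^i} + \partial_\mu \delta\varphi^i \, \frac{\partial}{\partial q^i_\mu} , \]
whereas $\theta_L$ and $\omega_L$ are given by Eqs. (52) and (54). Then
\[ \Theta_\phi(\delta\phi) = \int_\Sigma d\sigma_\mu \; \frac{\partial L}{\partial q^i_\mu}(\varphi, \partial\varphi) \, \delta\varphi^i \tag{83} \]
and
\[ \Omega_\phi(\delta\phi_1, \delta\phi_2) = \int_\Sigma d\sigma_\mu \; J_\phi^\mu(\delta\phi_1, \delta\phi_2) \tag{84} \]
with the “symplectic current” $J$ given by
\begin{align*}
J_\phi^\mu(\delta\phi_1, \delta\phi_2)
&= \frac{\partial^2 L}{\partial q^j \, \partial q^i_\mu}(\varphi, \partial\varphi) \, \big( \delta\varphi_1^i \, \delta\varphi_2^j - \delta\varphi_2^i \, \delta\varphi_1^j \big) \\
&\quad + \frac{\partial^2 L}{\partial q^j_\nu \, \partial q^i_\mu}(\varphi, \partial\varphi) \, \big( \delta\varphi_1^i \, \partial_\nu \delta\varphi_2^j - \delta\varphi_2^i \, \partial_\nu \delta\varphi_1^j \big) , \tag{85}
\end{align*}
or equivalently
\begin{align*}
J_\phi^\mu(\delta\phi_1, \delta\phi_2)
&= - \Big( \frac{\partial^2 L}{\partial q^j \, \partial q^i_\mu}(\varphi, \partial\varphi) \, \delta\varphi_1^j + \frac{\partial^2 L}{\partial q^j_\nu \, \partial q^i_\mu}(\varphi, \partial\varphi) \, \partial_\nu \delta\varphi_1^j \Big) \, \delta\varphi_2^i \\
&\quad + \Big( \frac{\partial^2 L}{\partial q^j \, \partial q^i_\mu}(\varphi, \partial\varphi) \, \delta\varphi_2^j + \frac{\partial^2 L}{\partial q^j_\nu \, \partial q^i_\mu}(\varphi, \partial\varphi) \, \partial_\nu \delta\varphi_2^j \Big) \, \delta\varphi_1^i . \tag{86}
\end{align*}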
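The same results can be obtained even more directly in the Hamiltonian framework, in which we have
\[ \Theta_\phi(\delta\phi) = \int_\Sigma (\varphi, \pi)^* \theta_H (\delta\varphi, \delta\pi) \tag{87} \]
and
\[ \Omega_\phi(\delta\phi_1, \delta\phi_2) = \int_\Sigma (\varphi, \pi)^* \omega_H (\delta\varphi_1, \delta\pi_1, \delta\varphi_2, \delta\pi_2) , \tag{88} \]
where the notation is the same as that employed in Eq. (57): $\phi = (\varphi, \pi)$ is a section of $\vec{J}^{1\circledast}E$ over $M$ while $\delta\phi = (\delta\varphi, \delta\pi)$, $\delta\phi_1 = (\delta\varphi_1, \delta\pi_1)$, $\delta\phi_2 = (\delta\varphi_2, \delta\pi_2)$ are variations of $\phi = (\varphi, \pi)$, all sections of $V(\vec{J}^{1\circledast}E)$ over $M$. In local coordinates,
\[ \delta\varphi = \frac{\partial}{\partial\lambda}\, \varphi_\lambda \Big|_{\lambda=0} = \delta\varphi^i \, \frac{\partial}{\partial q^i} , \qquad \delta\pi = \frac{\partial}{\partial\lambda}\, \pi_\lambda \Big|_{\lambda=0} = \delta\pi_i^\mu \, \frac{\partial}{\partial p_i^\mu} , \]
whereas $\theta_H$ and $\omega_H$ are given by Eqs. (53) and (55). Then
\[ \Theta_\phi(\delta\phi) = \int_\Sigma d\sigma_\mu \; \pi_i^\mu \, \delta\varphi^i \tag{89} \]
and
\[ \Omega_\phi(\delta\phi_1, \delta\phi_2) = \int_\Sigma d\sigma_\mu \; J_\phi^\mu(\delta\phi_1, \delta\phi_2) \tag{90} \]
with the “symplectic current” $J$ given by
\[ J_\phi^\mu(\delta\phi_1, \delta\phi_2) = \delta\varphi_1^i \, \delta\pi_{2,i}^\mu - \delta\varphi_2^i \, \delta\pi_{1,i}^\mu . \tag{91} \]
Incidentally, these formulas show that, just like in mechanics, the canonical form $\Theta$ and the symplectic form $\Omega$ do not depend on the choice of the Hamiltonian $H$.

Another important result, duly emphasized in the literature [1–4], is the fact that on covariant phase space $\mathcal{S}$, the symplectic form $\Omega$ does not depend on the choice of the hypersurface $\Sigma$ used in its definition, since for any solution $\phi$ of the equations of motion and any two solutions $\delta\phi_1, \delta\phi_2$ of the linearized equations of motion, the “symplectic current” $J_\phi(\delta\phi_1, \delta\phi_2)$ is a closed form on space-time. To prove this, assume that $\phi$ is a point in $\mathcal{S}$ and observe that a tangent vector $\delta\phi$ in $T_\phi \mathcal{C}$ belongs to the subspace $T_\phi \mathcal{S}$ if and only if $\delta\phi$, as a section of $\phi^* VF$, satisfies the pertinent Jacobi equation, which reads
\begin{align*}
& \partial_\mu \Big( \frac{\partial^2 L}{\partial q^j \, \partial q^i_\mu}(\varphi, \partial\varphi) \, \delta\varphi^j + \frac{\partial^2 L}{\partial q^j_\nu \, \partial q^i_\mu}(\varphi, \partial\varphi) \, \partial_\nu \delta\varphi^j \Big) \\
&\qquad = \frac{\partial^2 L}{\partial q^j \, \partial q^i}(\varphi, \partial\varphi) \, \delta\varphi^j + \frac{\partial^2 L}{\partial q^i \, \partial q^j_\nu}(\varphi, \partial\varphi) \, \partial_\nu \delta\varphi^j \tag{92}
\end{align*}
in the Lagrangian framework and
\begin{align*}
\partial_\mu \delta\pi_i^\mu &= - \frac{\partial^2 H}{\partial q^j \, \partial q^i}(\varphi, \pi) \, \delta\varphi^j - \frac{\partial^2 H}{\partial p_j^\nu \, \partial q^i}(\varphi, \pi) \, \delta\pi_j^\nu , \\
\partial_\mu \delta\varphi^i &= \frac{\partial^2 H}{\partial q^j \, \partial p_i^\mu}(\varphi, \pi) \, \delta\varphi^j + \frac{\partial^2 H}{\partial p_j^\nu \, \partial p_i^\mu}(\varphi, \pi) \, \delta\pi_j^\nu \tag{93}
\end{align*}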
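in the Hamiltonian framework. Thus if $\delta\phi_1$ and $\delta\phi_2$ both satisfy the Jacobi equation, we have
\begin{align*}
\partial_\mu J_\phi^\mu(\delta\phi_1, \delta\phi_2)
&= - \partial_\mu \Big( \frac{\partial^2 L}{\partial q^j \, \partial q^i_\mu}(\varphi, \partial\varphi) \, \delta\varphi_1^j + \frac{\partial^2 L}{\partial q^j_\nu \, \partial q^i_\mu}(\varphi, \partial\varphi) \, \partial_\nu \delta\varphi_1^j \Big) \, \delta\varphi_2^i \\
&\quad - \Big( \frac{\partial^2 L}{\partial q^j \, \partial q^i_\mu}(\varphi, \partial\varphi) \, \delta\varphi_1^j + \frac{\partial^2 L}{\partial q^j_\nu \, \partial q^i_\mu}(\varphi, \partial\varphi) \, \partial_\nu \delta\varphi_1^j \Big) \, \partial_\mu \delta\varphi_2^i \\
&\quad + \partial_\mu \Big( \frac{\partial^2 L}{\partial q^j \, \partial q^i_\mu}(\varphi, \partial\varphi) \, \delta\varphi_2^j + \frac{\partial^2 L}{\partial q^j_\nu \, \partial q^i_\mu}(\varphi, \partial\varphi) \, \partial_\nu \delta\varphi_2^j \Big) \, \delta\varphi_1^i \\
&\quad + \Big( \frac{\partial^2 L}{\partial q^j \, \partial q^i_\mu}(\varphi, \partial\varphi) \, \delta\varphi_2^j + \frac{\partial^2 L}{\partial q^j_\nu \, \partial q^i_\mu}(\varphi, \partial\varphi) \, \partial_\nu \delta\varphi_2^j \Big) \, \partial_\mu \delta\varphi_1^i \\
&= - \Big( \frac{\partial^2 L}{\partial q^j \, \partial q^i}(\varphi, \partial\varphi) \, \delta\varphi_1^j + \frac{\partial^2 L}{\partial q^i \, \partial q^j_\nu}(\varphi, \partial\varphi) \, \partial_\nu \delta\varphi_1^j \Big) \, \delta\varphi_2^i \\
&\quad - \Big( \frac{\partial^2 L}{\partial q^j \, \partial q^i_\mu}(\varphi, \partial\varphi) \, \delta\varphi_1^j + \frac{\partial^2 L}{\partial q^j_\nu \, \partial q^i_\mu}(\varphi, \partial\varphi) \, \partial_\nu \delta\varphi_1^j \Big) \, \partial_\mu \delta\varphi_2^i \\
&\quad + \Big( \frac{\partial^2 L}{\partial q^j \, \partial q^i}(\varphi, \partial\varphi) \, \delta\varphi_2^j + \frac{\partial^2 L}{\partial q^i \, \partial q^j_\nu}(\varphi, \partial\varphi) \, \partial_\nu \delta\varphi_2^j \Big) \, \delta\varphi_1^i \\
&\quad + \Big( \frac{\partial^2 L}{\partial q^j \, \partial q^i_\mu}(\varphi, \partial\varphi) \, \delta\varphi_2^j + \frac{\partial^2 L}{\partial q^j_\nu \, \partial q^i_\mu}(\varphi, \partial\varphi) \, \partial_\nu \delta\varphi_2^j \Big) \, \partial_\mu \delta\varphi_1^i
\end{align*}
in the Lagrangian framework and
\begin{align*}
\partial_\mu J_\phi^\mu(\delta\phi_1, \delta\phi_2)
&= \partial_\mu \delta\varphi_1^i \; \delta\pi_{2,i}^\mu + \delta\varphi_1^i \; \partial_\mu \delta\pi_{2,i}^\mu - \partial_\mu \delta\varphi_2^i \; \delta\pi_{1,i}^\mu - \delta\varphi_2^i \; \partial_\mu \delta\pi_{1,i}^\mu \\
&= \delta\pi_{2,i}^\mu \Big( \frac{\partial^2 H}{\partial q^j \, \partial p_i^\mu}(\varphi, \pi) \, \delta\varphi_1^j + \frac{\partial^2 H}{\partial p_j^\nu \, \partial p_i^\mu}(\varphi, \pi) \, \delta\pi_{1,j}^\nu \Big) \\
&\quad - \delta\varphi_1^i \Big( \frac{\partial^2 H}{\partial q^j \, \partial q^i}(\varphi, \pi) \, \delta\varphi_2^j + \frac{\partial^2 H}{\partial p_j^\nu \, \partial q^i}(\varphi, \pi) \, \delta\pi_{2,j}^\nu \Big) \\
&\quad - \delta\pi_{1,i}^\mu \Big( \frac{\partial^2 H}{\partial q^j \, \partial p_i^\mu}(\varphi, \pi) \, \delta\varphi_2^j + \frac{\partial^2 H}{\partial p_j^\nu \, \partial p_i^\mu}(\varphi, \pi) \, \delta\pi_{2,j}^\nu \Big) \\
&\quad + \delta\varphi_2^i \Big( \frac{\partial^2 H}{\partial q^j \, \partial q^i}(\varphi, \pi) \, \delta\varphi_1^j + \frac{\partial^2 H}{\partial p_j^\nu \, \partial q^i}(\varphi, \pi) \, \delta\pi_{1,j}^\nu \Big)
\end{align*}
in the Hamiltonian framework: obviously, both of these expressions vanish.

Of course, independence of the choice of hypersurface holds only for $\Omega$ but not for $\Theta$. In fact, if $M_{1,2}$ is a region of space-time whose boundary is the disjoint union of two hypersurfaces $\Sigma_1$ and $\Sigma_2$, then $\Omega_{\Sigma_2} = \Omega_{\Sigma_1}$, but
\[ \Theta_{\Sigma_2} - \Theta_{\Sigma_1} = \delta S_{M_{1,2}} , \tag{94} \]
where $S_{M_{1,2}}$ is the action calculated by integration over $M_{1,2}$ and $\delta$ is the functional exterior derivative, or variational derivative, on $\mathcal{S}$.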
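The closedness of the symplectic current on solutions can also be checked numerically in a toy case. The sketch below is only an illustration under stated assumptions: a free real scalar field in 1+1 dimensions with metric diag(+1, -1) and potential $U(q) = \tfrac12 m^2 q^2$, so that the Jacobi equation is the Klein-Gordon equation and the current of Eq. (85) reduces to $J^\mu = g^{\mu\nu}(\delta\varphi_1\, \partial_\nu \delta\varphi_2 - \delta\varphi_2\, \partial_\nu \delta\varphi_1)$. The names `MASS`, `plane_wave`, `symplectic_current` and `divergence` are hypothetical.

```python
import math

MASS = 1.0  # assumed mass parameter in U(q) = MASS**2 * q**2 / 2


def plane_wave(k, phase):
    """Return (u, du/dt, du/dx) for u = cos(k x - w t + phase), w^2 = k^2 + MASS^2.

    Such u solves the linearized field equation u_tt - u_xx + MASS^2 u = 0.
    """
    w = math.sqrt(k * k + MASS * MASS)
    u = lambda t, x: math.cos(k * x - w * t + phase)
    u_t = lambda t, x: w * math.sin(k * x - w * t + phase)
    u_x = lambda t, x: -k * math.sin(k * x - w * t + phase)
    return u, u_t, u_x


def symplectic_current(sol1, sol2, t, x):
    """Components (J^t, J^x) of J^mu = g^{mu nu} (u1 d_nu u2 - u2 d_nu u1)."""
    u1, u1_t, u1_x = sol1
    u2, u2_t, u2_x = sol2
    j_t = u1(t, x) * u2_t(t, x) - u2(t, x) * u1_t(t, x)
    # The spatial component picks up a minus sign from g^{xx} = -1.
    j_x = -(u1(t, x) * u2_x(t, x) - u2(t, x) * u1_x(t, x))
    return j_t, j_x


def divergence(sol1, sol2, t, x, h=1e-4):
    """Central-difference estimate of d_t J^t + d_x J^x at the point (t, x)."""
    d_t = (symplectic_current(sol1, sol2, t + h, x)[0]
           - symplectic_current(sol1, sol2, t - h, x)[0]) / (2 * h)
    d_x = (symplectic_current(sol1, sol2, t, x + h)[1]
           - symplectic_current(sol1, sol2, t, x - h)[1]) / (2 * h)
    return d_t + d_x


if __name__ == "__main__":
    first = plane_wave(1.0, 0.0)
    second = plane_wave(2.0, 0.5)
    print(divergence(first, second, 0.3, 0.7))
```

For two solutions of the linearized equation the estimated divergence is zero up to discretization error, which is the local statement behind the independence of $\Omega$ from the choice of hypersurface.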
3.3. Poisson bracket. Given a relativistic field theory with a regular first-order Lagrangian, one expects each of the corresponding Jacobi operators $J[\phi]$ ($\phi \in \mathcal{S}$) to form a hyperbolic system of second-order partial differential operators. A typical example is provided by the sigma model, where $E$ is a trivial product bundle $M \times Q$, with a given Lorentzian metric $g$ on the base manifold $M$, as usual, and a given Riemannian metric $h$ on the typical fiber $Q$. Its Lagrangian is
\[ L = \tfrac12 \sqrt{|g|} \; g^{\mu\nu} \, h_{ij} \, q^i_\mu \, q^j_\nu , \]
so that the coefficients of the highest degree terms of the Jacobi operator $J[\varphi]$ are
\[ \frac{\partial^2 L}{\partial q^j_\nu \, \partial q^i_\mu}(\varphi, \partial\varphi) = \sqrt{|g|} \; g^{\mu\nu} \, h_{ij}(\varphi) , \]
which clearly exhibits the hyperbolic nature of the resulting linearized field equations. A general feature of hyperbolic systems of linear partial differential equations is the possibility to guarantee existence and uniqueness of various types of Green functions. In the present context, what we need is existence and uniqueness of the retarded Green function $G_\phi^-$, the advanced Green function $G_\phi^+$ and the causal Green function $G_\phi$ for the Jacobi operator $J[\phi]$, for each $\phi \in \mathcal{S}$. By definition, the first two are solutions of the inhomogeneous Jacobi equations
\[ J_x[\phi] \cdot G_\phi^\pm(x, y) = \delta(x, y) , \qquad J_y[\phi] \cdot G_\phi^\pm(x, y) = \delta(x, y) , \tag{95} \]
or more explicitly,
\[ J_x[\phi]_{km} \, G_\phi^{\pm\, ml}(x, y) = \delta_k^l \, \delta(x, y) , \qquad J_y[\phi]_{km} \, G_\phi^{\pm\, lm}(x, y) = \delta_k^l \, \delta(x, y) , \tag{96} \]
where $J_z[\phi]$ denotes the Jacobi operator with respect to the variable $z$, characterized by the following support condition: for any $x, y \in M$, $G_\phi^-(x, y) = 0$ when $x \notin J^+(y)$ and $G_\phi^+(x, y) = 0$ when $x \notin J^-(y)$, where $J^+(y)$ and $J^-(y)$ are the future cone and the past cone of $y$, respectively. The causal Green function, also called the propagator, is then simply their difference:
\[ G_\phi = G_\phi^- - G_\phi^+ . \tag{97} \]
Obviously, it satisfies the homogeneous Jacobi equations
\[ J_x[\phi] \cdot G_\phi(x, y) = 0 , \qquad J_y[\phi] \cdot G_\phi(x, y) = 0 . \tag{98} \]
Note that the symmetry of the Jacobi operator $J[\phi]$, stemming from the fact that it represents the second variational derivative of the action, forces these Green functions to satisfy the following exchange and symmetry properties:
\[ G_\phi^{\pm\, lk}(y, x) = G_\phi^{\mp\, kl}(x, y) , \qquad G_\phi^{lk}(y, x) = -\, G_\phi^{kl}(x, y) . \tag{99} \]
It should be pointed out that existence and uniqueness of these Green functions cannot be guaranteed in complete generality: this requires not only that M be globally hyperbolic but also that the linearized field equations should form a hyperbolic system. Here, we shall simply assume this to be the case and proceed from there; further comments on the question will be deferred to the end of the section.
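The defining properties in Eqs. (95)–(99) can be illustrated in the simplest possible setting, which is an assumption made here and not an example from the text: mechanics viewed as a field theory over $M = \mathbb{R}$ ($n = 1$), with the harmonic oscillator Lagrangian $L = \tfrac12 \dot q^2 - \tfrac12 \omega_0^2 q^2$. The Jacobi operator of Eq. (70) then reduces to $J = d^2/dt^2 + \omega_0^2$, the future "cone" of $s$ is $J^+(s) = [s, \infty)$, and, writing $\theta$ for the Heaviside step function:

```latex
\begin{align*}
G^-(t,s) &= \theta(t-s)\, \frac{\sin \omega_0 (t-s)}{\omega_0} ,
 & G^+(t,s) &= -\,\theta(s-t)\, \frac{\sin \omega_0 (t-s)}{\omega_0} , \\[4pt]
G(t,s) &= G^-(t,s) - G^+(t,s) = \frac{\sin \omega_0 (t-s)}{\omega_0} ,
 & \Big( \frac{d^2}{dt^2} + \omega_0^2 \Big) G^\pm(t,s) &= \delta(t-s) , \\[4pt]
G^\pm(s,t) &= G^\mp(t,s) ,
 & G(s,t) &= -\,G(t,s) .
\end{align*}
```

In this sketch $G^-$ vanishes for $t < s$ and $G^+$ vanishes for $t > s$, matching the support conditions stated above. Their difference is a smooth solution of the homogeneous equation that is antisymmetric under exchange of its arguments, as required by Eqs. (98) and (99).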
Our next step will be to study certain specific (distributional) solutions XF [φ] of the general inhomogeneous Jacobi equation J [φ] · XF [φ] = F [φ]
(100)
for smooth functionals $F$ on covariant phase space which are (at least) local in time. To eliminate the ambiguity in this equation stemming from the fact that the functional derivative $F'[\phi]$ on its rhs belongs to the space $T_\phi^* S$ which is only a quotient space of the image space $T_\phi^* C$ of the Jacobi operator $J[\phi]$ (an inclusion of the form $T_\phi S \subset T_\phi C$ induces a natural projection from $T_\phi^* C$ to $T_\phi^* S$), it is necessary to first of all extend the given functional $F$ on $S$ to a functional $\tilde F$ on $C$ of the same type (smooth and local in time), whose functional derivative $\tilde F'[\phi]$ does belong to the space $T_\phi^* C$ which, as we recall from Eq. (76), consists of the distributional sections of φ ∗ V F of temporally compact support. Next, convolution with the retarded and advanced Green function introduced above produces formal vector fields over $S$ which to each solution $\phi \in S$ of the field equations associate (distributional) sections $X^-_{\tilde F}[\phi]$ and $X^+_{\tilde F}[\phi]$ of φ ∗ VF, respectively. In local coordinates, their definition can (formally and at least when the intersection of the two supports is contained in the coordinate domain) be written in the form
$$X^\pm_{\tilde F}[\phi]^k(x) = \int_M d^n y \; G_\phi^{\pm\,kl}(x,y) \, \frac{\delta \tilde F}{\delta \phi^l}[\phi](y) . \tag{101}$$
Both of them satisfy the inhomogeneous Jacobi equation
$$J[\phi] \cdot X^\pm_{\tilde F}[\phi] = \tilde F'[\phi] . \tag{102}$$
Similarly, convolution with the causal Green function leads to another formal vector field over $S$ which to each solution $\phi \in S$ of the field equations associates a (distributional) section $X_{\tilde F}[\phi]$ of φ ∗ VF. Again, in local coordinates, its definition can (formally and at least when the intersection of the two supports is contained in the coordinate domain) be written in the form
$$X_{\tilde F}[\phi]^k(x) = \int_M d^n y \; G_\phi^{kl}(x,y) \, \frac{\delta \tilde F}{\delta \phi^l}[\phi](y) . \tag{103}$$
It satisfies the homogeneous Jacobi equation
$$J[\phi] \cdot X_{\tilde F}[\phi] = 0 , \tag{104}$$
since according to Eq. (97)
$$X_{\tilde F}[\phi] = X^-_{\tilde F}[\phi] - X^+_{\tilde F}[\phi] . \tag{105}$$
Note that the convolutions in Eqs. (101) and (103) exist due to our support assumptions on $\tilde F$ (requiring $\tilde F'[\phi]$ to have temporally compact support) and due to the support properties of the Green functions $G_\phi^\pm$ and $G_\phi$.
According to Eq. (104), the prescription of associating to each solution $\phi \in S$ of the field equations the section $X_{\tilde F}[\phi]$ of φ ∗ VF defines a formal vector field on $S$ which is tangent to $S$. (It becomes more than just a formal vector field if $\tilde F$ is such that $X_{\tilde F}[\phi]$
belongs to $T_\phi S$, which requires it to be not just a distributional section but a smooth section of φ ∗ VF and to satisfy appropriate support properties; we shall come back to this point later on.) The main statement about this formal vector field, to be proved below, is that (a) it does not depend on the choice of the extension $\tilde F$ of $F$, so we may simply denote it by $X_F[\phi]$, and (b) it is formally the Hamiltonian vector field associated to $F$ with respect to the symplectic form $\Omega$ discussed in the previous subsection. More explicitly, we claim that for any solution $\phi \in S$ of the field equations and any smooth section $\delta\phi$ of φ ∗ VF with spatially compact support which is a solution of the linearized field equations, we have
$$\Omega_\phi(X_F[\phi], \delta\phi) = F'[\phi] \cdot \delta\phi . \tag{106}$$
Note that under the assumptions stated, both sides of this equation make sense although we have originally defined $\Omega_\phi(\delta\phi_1, \delta\phi_2)$ only in the case where both $\delta\phi_1$ and $\delta\phi_2$ are smooth; the extension of this definition, given in the previous subsection, to the case where one of them is a distribution is straightforward.
To prove this key statement, let us begin by recalling that the symplectic form $\Omega$ and the symplectic current $J$ of the previous subsection are really defined on $C$ and not only on $S$; the only difference is that on $C$, $\Omega$ is only a presymplectic form so that $J$ should be more appropriately called the presymplectic current and that $J$ on $C$ is no longer conserved so that $\Omega$ on $C$ will depend on the choice of the hypersurface $\Sigma$. At any rate, we can almost literally repeat the calculation performed at the end of the previous subsection, either in the Lagrangian or in the Hamiltonian formulation, to show that for any solution $\phi \in S$ of the field equations and any smooth section $\delta\phi$ of φ ∗ VF with spatially compact support, we have
$$\partial_\mu J^\mu_\phi(X^\pm_{\tilde F}[\phi], \delta\phi) = \bigl( J[\phi]_{kl} \, X^\pm_{\tilde F}[\phi]^l \bigr) \, \delta\phi^k - \bigl( J[\phi]_{kl} \, \delta\phi^l \bigr) \, X^\pm_{\tilde F}[\phi]^k , \tag{107}$$
so that if $\delta\phi$ is a solution of the linearized field equations,
$$\partial_\mu J^\mu_\phi(X^\pm_{\tilde F}[\phi], \delta\phi) = \frac{\delta \tilde F}{\delta\phi}[\phi] \cdot \delta\phi . \tag{108}$$
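Now since, by assumption, the support of $(\delta\tilde F/\delta\phi)[\phi]$ is contained in some time slice, we can choose two Cauchy surfaces $\Sigma_-$ to the past and $\Sigma_+$ to the future of this time slice and, using that $\delta\phi$ has spatially compact support, integrate Eq. (108) over the time slice $S_-$ between $\Sigma_-$ and $\Sigma$ and similarly over the time slice $S_+$ between $\Sigma$ and $\Sigma_+$. Applying Stokes' theorem, this gives
$$\int_{\Sigma} d\sigma_\mu(x) \, J^\mu_\phi(X^-_{\tilde F}[\phi], \delta\phi)(x) = \int_{\Sigma_-} d\sigma_\mu(x) \, J^\mu_\phi(X^-_{\tilde F}[\phi], \delta\phi)(x) + \int_{S_-} d^n x \; \frac{\delta \tilde F}{\delta\phi}[\phi](x) \cdot \delta\phi(x) ,$$
$$\int_{\Sigma} d\sigma_\mu(x) \, J^\mu_\phi(X^+_{\tilde F}[\phi], \delta\phi)(x) = \int_{\Sigma_+} d\sigma_\mu(x) \, J^\mu_\phi(X^+_{\tilde F}[\phi], \delta\phi)(x) - \int_{S_+} d^n x \; \frac{\delta \tilde F}{\delta\phi}[\phi](x) \cdot \delta\phi(x) .$$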
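But the support conditions on $G_\phi^\pm$, together with the fact that the support of $(\delta\tilde F/\delta\phi)[\phi]$ lies to the future of $\Sigma_-$ and to the past of $\Sigma_+$, imply that $X^-_{\tilde F}[\phi]$ vanishes on $\Sigma_-$ and similarly that $X^+_{\tilde F}[\phi]$ vanishes on $\Sigma_+$, so the first term on the rhs of each of these equations is zero. Thus taking their difference and inserting Eq. (105), we get
$$\int_{\Sigma} d\sigma_\mu(x) \, J^\mu_\phi(X_{\tilde F}[\phi], \delta\phi)(x) = \int_{S_-} d^n x \; \frac{\delta \tilde F}{\delta\phi}[\phi](x) \cdot \delta\phi(x) + \int_{S_+} d^n x \; \frac{\delta \tilde F}{\delta\phi}[\phi](x) \cdot \delta\phi(x) ,$$
and since $(\delta\tilde F/\delta\phi)[\phi]$ vanishes outside $S_- \cup S_+$,
$$\Omega_\phi(X_{\tilde F}[\phi], \delta\phi) = \int_M d^n x \; \frac{\delta \tilde F}{\delta\phi}[\phi](x) \cdot \delta\phi(x) . \tag{109}$$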
Finally, observe that since $\delta\phi$ is supposed to be a solution of the linearized field equations (and hence tangent to $S$), the rhs of this equation does not depend on the choice of the extension $\tilde F$ of $F$. Therefore, $X_{\tilde F}[\phi]$ will not depend on this choice either provided the symplectic form $\Omega_\phi$ is weakly non-degenerate. Now using the space-time split of $M$ over $\Sigma$ provided by the tangent vector field $\partial_t$ of some global time function $t$ on $M$ or its dual $dt$, and identifying solutions $\delta\phi$ of the linearized field equations with their Cauchy data on $\Sigma$ (see footnote 9), it can be seen by direct inspection, either of Eqs. (84) and (85) in the Lagrangian formalism or of Eqs. (90) and (91) in the Hamiltonian formalism, that the expression $\Omega_\phi(\delta\phi_1, \delta\phi_2)$ can only be zero for all $\delta\phi_2$ if $\delta\phi_1$ vanishes, as soon as we require the Lagrangian $L$ to be regular in time derivatives, that is, to satisfy
$$\det \left( \frac{\partial^2 L}{\partial q^i_0 \, \partial q^j_0} \right) \neq 0 , \tag{110}$$
or equivalently, the Hamiltonian to be regular in timelike conjugate momenta, that is, to satisfy
$$\det \left( \frac{\partial^2 H}{\partial p_i^0 \, \partial p_j^0} \right) \neq 0 . \tag{111}$$
Moreover, it can be shown that this statement will remain true if $\delta\phi_1$ is allowed to be a distributional solution of the linearized field equations with arbitrary support, as long as $\delta\phi_2$ runs through the space of smooth solutions of the linearized field equations with spatially compact support. Let us summarize this fundamental result in the form of a theorem.

Theorem 3. With respect to the symplectic form $\Omega$ on covariant phase space as defined by Crnković, Witten and Zuckerman, the Hamiltonian vector field $X_F$ associated with a functional $F$ which is local in time is given by convolution of the functional derivative of $F$ with the causal Green function of the corresponding Jacobi operator.

Footnote 9: Explicitly, in the Lagrangian formalism, the Cauchy data for $\delta\varphi$ on $M$ are $\delta\varphi$ and $\delta\dot\varphi$ on $\Sigma$, whereas in the Hamiltonian formalism, the Cauchy data for $(\delta\varphi, \delta\pi)$ on $M$ are $\delta\varphi$ and $\delta\pi^0$ on $\Sigma$.
Note that in view of the regularity conditions employed to arrive at this conclusion, the previous construction does not apply directly to degenerate systems such as gauge theories: these require a separate treatment. See, for example, Ref. [29], which addresses the question of equivalence between various definitions of Poisson brackets in this context, though not in a completely covariant manner (all brackets considered there are equal-time Poisson brackets).
Having established Eq. (106), it is now easy to write down the Poisson bracket of two functionals $F$ and $G$ on $S$: it is, in complete analogy with Eq. (73), given by
$$\{F, G\}[\phi] = F'[\phi] \cdot X_G[\phi] = -\, G'[\phi] \cdot X_F[\phi] , \tag{112}$$
or
$$\{F, G\}[\phi] = \int_M d^n x \; \frac{\delta F}{\delta\phi^k}[\phi](x) \, X_G[\phi]^k(x) = - \int_M d^n x \; \frac{\delta G}{\delta\phi^k}[\phi](x) \, X_F[\phi]^k(x) . \tag{113}$$
Inserting Eq. (103), we arrive at the second main conclusion of this paper, which is an immediate consequence of the first.

Theorem 4. The Poisson bracket associated with the symplectic form $\Omega$ on covariant phase space as defined by Crnković, Witten and Zuckerman, according to the standard prescription of symplectic geometry, suitably adapted to the infinite-dimensional setting encountered in this context, is precisely the field theoretical bracket first proposed by Peierls and brought into a more geometric form by DeWitt:
$$\{F, G\}[\phi] = \int_M d^n x \int_M d^n y \; \frac{\delta F}{\delta\phi^k}[\phi](x) \; G_\phi^{kl}(x,y) \; \frac{\delta G}{\delta\phi^l}[\phi](y) . \tag{114}$$

Of course, for the expressions in Eqs. (112)–(114) to exist, it is not sufficient to require $F$ and/or $G$ to be local in time. In fact, if we want to use conditions that (a) are sufficient to guarantee existence of this Poisson bracket without making use of specific regularity and support properties of the propagator, (b) are the same for $F$ and $G$ and (c) are reproduced under the Poisson bracket, we are forced to impose quite rigid assumptions: the functionals under consideration must be assumed to be both regular and local, in the sense that their functional derivative at any point $\phi$ of $S$ must be a smooth section of φ ∗ V F of compact support (this will force the corresponding Hamiltonian vector field to be a smooth section of φ ∗ VF of spatially compact support).
On the other hand, it must be pointed out that this Poisson bracket, which we might call the Peierls-DeWitt bracket, has all the structural properties expected from a good Poisson bracket: bilinearity, antisymmetry, validity of the Jacobi identity and validity of the Leibniz rule with respect to plain and ordinary multiplication of functionals. This can be seen directly by noting that the first two properties and the Leibniz rule are obvious, while the Jacobi identity expresses the propagator identity for the causal Green function.
But it is of course much simpler to argue that all these properties follow immediately from the above theorem, in combination with standard results of symplectic geometry. Moreover, the Peierls-DeWitt bracket trivially satisfies the fundamental axiom of field theoretic locality: functionals localized in spacelike separated regions commute. All this suggests that the Peierls-DeWitt bracket is the correct classical limit of the commutator of quantum field theory. Therefore, it ought to play an outstanding role in any attempt at quantizing classical field theories through algebraic methods, a popular example of which is deformation quantization.
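The double integral in Eq. (114) can be made tangible in the simplest possible setting. The sketch below is not part of the paper: it assumes the harmonic oscillator as a 0+1-dimensional free field theory with Jacobi operator d²/dt² + ω² (a sign convention chosen here), whose causal Green function is sin(ω(t − s))/ω, and it evaluates the bracket of two linear functionals F[q] = ∫ f(t) q(t) dt and G[q] = ∫ g(t) q(t) dt by a Riemann sum. The names `bump`, `causal_green` and `peierls_bracket` are hypothetical.

```python
import math


def causal_green(t, s, omega):
    """Causal Green function of d^2/dt^2 + omega^2 (assumed convention)."""
    return math.sin(omega * (t - s)) / omega


def bump(center, width):
    """Triangular smearing function of compact support and unit integral."""
    def profile(t):
        x = abs(t - center) / width
        return (1.0 - x) / width if x < 1.0 else 0.0
    return profile


def peierls_bracket(f, g, grid, omega):
    """Riemann-sum approximation of the double integral
    of f(t) G(t, s) g(s) over a uniform time grid."""
    dt = grid[1] - grid[0]
    # Keep only grid points where the smearing functions are non-zero.
    f_vals = [(t, f(t)) for t in grid if f(t) != 0.0]
    g_vals = [(s, g(s)) for s in grid if g(s) != 0.0]
    total = 0.0
    for t, ft in f_vals:
        for s, gs in g_vals:
            total += ft * causal_green(t, s, omega) * gs
    return total * dt * dt
```

For narrow smearing functions centred at times t1 and t2, the result approaches sin(ω(t1 − t2))/ω, and exchanging the two functionals flips the sign. Since the theory is free, the Green function, and hence the bracket of linear functionals, does not depend on the background solution.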
The basic complication inherent in the algebraic structure provided by the Peierls-DeWitt bracket is that it is inherently dynamical: the bracket between two functionals depends on the underlying dynamics. This could not be otherwise. In fact, it is the price to be paid for being able to extend the canonical commutation relations of classical field theory, representing a non-dynamical equal-time Poisson bracket, to a covariant Poisson bracket. The dynamical nature of covariant Poisson brackets is simplified (but still not trivial) for free field theories, where the equations of motion are linear, implying that the Jacobi operator $J[\phi]$ and its causal Green function $G_\phi$ do not depend on the background solution $\phi$.
Finally, we would like to remark that the main mathematical condition to be imposed in order for the constructions presented here to work is that linearization of the field equations around any solution $\phi$ should provide a hyperbolic system of partial differential equations on $M$, for which existence and uniqueness of the Green functions $G_\phi^\pm$ and $G_\phi$ can be guaranteed. There are various definitions of the concept of a hyperbolic system that can be found in the literature, but the most appropriate one seems to be that of regular hyperbolicity, proposed by Christodoulou [30–32] in the context of Lagrangian systems, according to which the matrix
$$u_\mu u_\nu \, \frac{\partial^2 L}{\partial q^i_\mu \, \partial q^j_\nu}$$
should (in our sign convention for the metric tensor) be positive definite for timelike vectors $u$ and negative definite for spacelike vectors $u$: a typical example is provided by the sigma model as discussed at the beginning of this subsection. What is missing is to translate this condition into the Hamiltonian formalism and to compare it with other definitions of hyperbolicity for first order systems, such as the traditional one of Friedrichs.
4. Conclusions and Outlook

The approach to the formulation of geometric field theory adopted in this paper closely follows the spirit of Ref. [8], in the sense of emphasizing the importance of combining techniques from multisymplectic geometry with a functional approach. The main novelties are (a) the systematic extension from a Lagrangian to a Hamiltonian point of view, preparing the ground for the treatment of field theories which have a phase space but no configuration space (or better, a phase bundle but no configuration bundle), (b) a clear-cut distinction between ordinary and extended multiphase space, which is necessary for a correct definition of the concept of the covariant Hamiltonian and (c) the use of the causal Green function for the linearized operator as the main tool for finding an explicit formula for the Hamiltonian vector field associated with a given functional on covariant phase space. This explicit formula and the resulting identification of the canonical Poisson bracket derived from the standard symplectic form on covariant phase space with the Peierls-DeWitt bracket of classical field theory are the central results of this paper.
An interesting question that arises naturally concerns the relation between the Peierls-DeWitt bracket as constructed here and other proposals for Poisson brackets in multisymplectic geometry. In general the latter just apply to certain special classes of functionals. One such class is obtained by using fields to pull differential forms f back
to space-time and then integrate over submanifolds $\Sigma$ of the corresponding dimension. Explicitly, in the Lagrangian framework, $f$ should be a differential form on $J^1 E$, and
$$F[\phi] = \int_\Sigma (\varphi, \partial\varphi)^* f , \tag{115}$$
whereas in the Hamiltonian framework, $f$ should be a differential form on $J^{1\ast} E$, and
$$F[\phi] = \int_\Sigma (\varphi, \pi)^* f . \tag{116}$$
For the particular case of differential forms $f$ of degree $n - 1$ and Cauchy hypersurfaces as integration domains $\Sigma$, this kind of functional was already considered in the 1970s under the name “local observable” [7], but it was soon noticed that due to certain restrictions imposed on the forms $f$ allowed in that construction, the class of functionals so defined is far too small to be of much use for purposes such as quantization. As it turns out, these restrictions amount to requiring that $f$ should be a Hamiltonian form, but in a slightly different sense than that adopted in Refs. [33–36]. Namely, in the Lagrangian framework, we define an $(n-1)$-form $f$ on $J^1 E$ to be a Hamiltonian form if there exists a (necessarily unique) vector field $X_f$ on $J^1 E$, called the Hamiltonian vector field associated with $f$, such that
$$i_{X_f} \omega_L = df , \tag{117}$$
whereas in the Hamiltonian framework, we define an $(n-1)$-form $f$ on $J^{1\ast} E$ to be a Hamiltonian form if there exists a (necessarily unique) vector field $X_f$ on $J^{1\ast} E$, called the Hamiltonian vector field associated with $f$, such that
$$i_{X_f} \omega_H = df . \tag{118}$$
What motivates this concept is the possibility to use the multisymplectic analogue of the standard definition (73) of Poisson brackets in mechanics for defining the Poisson bracket between the corresponding functionals [8]. However, as in all other variants of the same definition [33–36], it turns out that in contrast to mechanics where $f$ is simply a function, the validity of Eq. (117) or Eq. (118) imposes strong constraints not only on the vector field $X_f$ but also on the form $f$; for example, Eq. (118) restricts the coefficients both of $X_f$ and of $f$ in adapted local coordinates to be affine functions of the multimomentum variables $p_i^\mu$. This implies that the class of functionals $F$ derived from Hamiltonian $(n-1)$-forms $f$ according to Eq. (115) or Eq. (116) does not close under ordinary multiplication of functionals.
Fortunately, using the Peierls-DeWitt bracket between functionals, we may dispense with the restriction to Hamiltonian forms. In fact, this line of reasoning was already followed by the authors of Ref. [8], where both the symplectic form on the solution space and the corresponding Poisson bracket between functionals on the solution space, with all its structurally desirable properties, are introduced explicitly. What remained unnoticed at the time was that this bracket is just the Peierls-DeWitt bracket of physics and that incorporating the theory of “local observables” into this general framework results in the transformation of a definition, as given in Ref. [7], into a theorem which, in modern language, states that the Peierls-DeWitt bracket $\{F, G\}$ between two functionals $F$ and $G$ derived from Hamiltonian $(n-1)$-forms $f$ and $g$, respectively, is the functional derived from the Hamiltonian $(n-1)$-form $\{f, g\}$. An explicit proof, based on
the classification of Hamiltonian vector fields and Hamiltonian (n − 1)-forms similar to the results of Refs. [34–36], has been given recently [37]; details will be published elsewhere.
Of course, there is a priori no reason for restricting this kind of investigation to forms of degree n − 1, since physics is full of functionals that are localized on submanifolds of space-time of other dimensions, such as: values of observable fields at space-time points (dimension 0), Wilson loops (traces of parallel transport operators around loops) in gauge theories (dimension 1), etc. This problem is presently under investigation.

Acknowledgements. The authors would like to thank the referee for useful suggestions. This work has been financially supported by CNPq (“Conselho Nacional de Desenvolvimento Científico e Tecnológico”) and by FAPESP (“Fundação de Amparo à Pesquisa do Estado de São Paulo”), Brazil.
Appendix: Affine Spaces and Duality

In this appendix, we collect some basic facts of linear algebra for affine spaces which are needed in this paper but which do not seem to be readily available in the literature. A (nonempty) set $A$ is said to be an affine space modelled on a vector space $V$ if there is given a map
$$+ : A \times V \longrightarrow A , \qquad (a, v) \longmapsto a + v \tag{119}$$
satisfying the following two conditions:
• $a + (u + v) = (a + u) + v$ for all $a \in A$ and all $u, v \in V$.
• Given $a, b \in A$, there exists a unique $v \in V$ such that $a = b + v$.
Elements of $A$ are called points and elements of $V$ are called vectors, so the map (119) can be viewed as a transitive and fixed point free action of $V$ (as an Abelian group) on $A$, associating to any point and any vector a new point called their sum. Correspondingly, the vector $v$ whose uniqueness and existence is postulated in the second condition is often denoted by $a - b$ and called the difference of the points $a$ and $b$. For every affine space $A$, the vector space on which it is modelled is determined uniquely up to isomorphism and will usually be denoted by $\vec A$. A map $f : A \to B$ between affine spaces $A$ and $B$ is said to be affine if there exists a point $a \in A$ such that the map $\vec f_a : \vec A \to \vec B$ defined by
$$\vec f_a(v) = f(a + v) - f(a) \tag{120}$$
is linear, that is, $\vec f_a \in L(\vec A, \vec B)$. It is easily seen that this condition does not depend on the choice of the reference point: in fact, if the map $\vec f_a$ is linear for some choice of $a$, then the maps $\vec f_a$ are all equal as $a$ varies through $A$, so it makes sense to speak of the linear part $\vec f$ of an affine map $f$. Denoting the set of all affine maps from $A$ to $B$ by $A(A, B)$, we thus have a projection
$$l : A(A, B) \longrightarrow L(\vec A, \vec B) , \qquad f \longmapsto \vec f . \tag{121}$$
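The independence of the linear part from the reference point, as stated around Eq. (120), can be checked on a small example. The following sketch assumes the affine spaces are the coordinate spaces R^2 and R^3 and that the affine map has the form x ↦ Mx + b; the helper names `affine_map` and `linear_part` and the sample numbers are hypothetical and serve only as illustration.

```python
def affine_map(matrix, offset):
    """Return the affine map x -> matrix @ x + offset on coordinate tuples."""
    def f(point):
        return tuple(
            sum(m * x for m, x in zip(row, point)) + b
            for row, b in zip(matrix, offset)
        )
    return f


def linear_part(f, a):
    """Return the map v -> f(a + v) - f(a) of Eq. (120) for reference point a."""
    def f_a(v):
        shifted = tuple(x + y for x, y in zip(a, v))
        return tuple(p - q for p, q in zip(f(shifted), f(a)))
    return f_a


# Sample affine map from R^2 to R^3 with integer data, so arithmetic is exact.
f = affine_map([[2, -1], [0, 3], [1, 1]], (5, -2, 7))
from_origin = linear_part(f, (0, 0))
from_other_point = linear_part(f, (4, -9))
```

Both `from_origin` and `from_other_point` act on a vector v as multiplication by the matrix alone: the offset b drops out of the difference f(a + v) − f(a), which is why the linear part does not depend on the reference point a.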
(A useful property of this correspondence is that $f$ is injective/surjective/bijective if and only if $\vec f$ is.) This construction is particularly important in the special case where
$B$ is itself a vector space, rather than just an affine space. Given an affine space $A$ and a vector space $W$, the set $A(A, W)$ of affine maps from $A$ to $W$ is easily seen to be a vector space: in fact it is simply a linear subspace of the vector space $\mathrm{Map}(A, W)$ of all maps from $A$ to $W$. Moreover, the projection
$$l : A(A, W) \longrightarrow L(\vec A, W) , \qquad f \longmapsto \vec f \tag{122}$$
is a linear map whose kernel consists of the constant maps from $A$ to $W$. Identifying these with the elements of $W$ itself, we obtain a natural isomorphism
$$A(A, W)/W \cong L(\vec A, W) , \tag{123}$$
or equivalently, an exact sequence of vector spaces, as follows:
$$0 \longrightarrow W \longrightarrow A(A, W) \xrightarrow{\;\; l \;\;} L(\vec A, W) \longrightarrow 0 . \tag{124}$$
In the general case, one shows that given two affine spaces $A$ and $B$, the set $A(A, B)$ of affine maps from $A$ to $B$ is again an affine space, such that $\overrightarrow{A(A, B)} = A(A, \vec B)$, and that the projection (121) is an affine map. Concerning dimensions, we may choose a reference point $o$ in $A$ which provides not only an isomorphism between $A$ and $\vec A$ but also a splitting
$$s : L(\vec A, W) \longrightarrow A(A, W) \tag{125}$$
of the exact sequence (124), explicitly given by
$$s(t)\, a = t(a - o) , \tag{126}$$
which induces an isomorphism between $A(A, W)$ and $W \oplus L(\vec A, W)$, showing that
$$\dim A(A, W) = \dim W + \dim L(\vec A, W) . \tag{127}$$
Choosing $W$ to be the real line $\mathbb{R}$, we obtain the affine dual $A^\star$ of an affine space $A$:
$$A^\star = A(A, \mathbb{R}) . \tag{128}$$
Observe that this is not only an affine space but even a vector space which, according to Eq. (124), is a one-dimensional extension of the linear dual $\vec A^{*}$ of the model space $\vec A$ by $\mathbb{R}$, that is, we have the following exact sequence of vector spaces:
$$0 \longrightarrow \mathbb{R} \longrightarrow A^\star \xrightarrow{\;\; l \;\;} \vec A^{*} \longrightarrow 0 . \tag{129}$$
In particular, according to Eq. (127), its dimension equals 1 plus the dimension of the original affine space:
$$\dim A^\star = \dim A + 1 . \tag{130}$$
More generally, we may replace the real line $\mathbb{R}$ by a (fixed but arbitrary) one-dimensional real vector space $\hat{\mathbb{R}}$ (which is of course isomorphic but in general not canonically isomorphic to $\mathbb{R}$) to define the twisted affine dual $A^\circledast$ of an affine space $A$:
$$A^\circledast = A(A, \hat{\mathbb{R}}) . \tag{131}$$
Again, this is not only an affine space but even a vector space which, according to Eq. (124), is a one-dimensional extension of the linear dual $\vec A^{*}$ of the model space $\vec A$ by $\hat{\mathbb{R}}$, that is, we have the following exact sequence of vector spaces:
$$0 \longrightarrow \hat{\mathbb{R}} \longrightarrow A^\circledast \xrightarrow{\;\; l \;\;} \vec A^{*} \longrightarrow 0 . \tag{132}$$
Obviously, the dimension is unchanged:
$$\dim A^\circledast = \dim A + 1 . \tag{133}$$
Moreover, we have the following canonical isomorphism of vector spaces:
$$A^\circledast \cong A^\star \otimes \hat{\mathbb{R}} , \tag{134}$$
and more generally, for any vector space $W$,
$$A(A, W) \cong A^\star \otimes W . \tag{135}$$
Finally, as already noted in the preceding paragraph, each point of $A$ defines a splitting of the exact sequence (132), so we obtain a map from $A$ to the set of such splittings, which is itself an affine space modelled on the bidual $\vec A^{**}$ of $\vec A$. This map is affine and its linear part is the negative of the canonical isomorphism between $\vec A$ and $\vec A^{**}$, implying that the space of splittings of the exact sequence (132) can be naturally identified with $A$ itself, a fact which is used in the construction of the inverse Legendre transformation.
The concept of duality applies not only to spaces but also to maps between spaces: given an affine map $f : A \to B$ between affine spaces $A$ and $B$, the formula
$$(f^\star(b^\star))(a) = b^\star(f(a)) \qquad \text{for } b^\star \in B^\star , \; a \in A \tag{136}$$
yields a linear map $f^\star : B^\star \to A^\star$ between their affine duals $B^\star$ and $A^\star$. As a result, the operation of taking the affine dual can be regarded as a (contravariant) functor from the category of affine spaces to the category of vector spaces. This functor is compatible with the usual (contravariant) functor of taking linear duals within the category of vector spaces in the sense that the following diagram commutes:
$$\begin{array}{ccc} B^\star & \xrightarrow{\;\; f^\star \;\;} & A^\star \\ \downarrow & & \downarrow \\ \vec B^{*} & \xrightarrow{\;\; \vec f^{\,*} \;\;} & \vec A^{*} \end{array} \tag{137}$$
Concluding this appendix, we would like to point out that all the concepts introduced above can be extended naturally from the purely algebraic setting to that of fiber bundles. For example, affine bundles are fiber bundles modelled on an affine space whose transition functions (with respect to a suitably chosen atlas) are affine maps. Moreover, functors such as the affine dual are smooth (see Ref. [38] for a definition of the concept of smooth functors in a similar context) and therefore extend naturally to bundles (over a fixed base manifold $M$). In particular, this means that any affine bundle $A$ over $M$ has a naturally defined affine dual, which is a vector bundle $A^\star$ over $M$.
References

1. Crnković, C., Witten, E.: Covariant Description of Canonical Formalism in Geometrical Theories. In: W. Israel, S. Hawking (eds.), Three Hundred Years of Gravitation, Cambridge: Cambridge University Press, 1987, pp. 676–684
2. Crnković, C.: Symplectic Geometry of Covariant Phase Space. Class. Quantum Grav. 5, 1557–1575 (1988)
3. Zuckerman, G.: Action Principles and Global Geometry. In: S.-T. Yau (ed.), Mathematical Aspects of String Theory, Singapore: World Scientific, 1987, pp. 259–288
4. Woodhouse, N.M.J.: Geometric Quantization. 2nd edition. Oxford: Oxford University Press, 1992
5. De Donder, Th.: Théorie Invariante du Calcul des Variations. Paris: Gauthier-Villars, 1935
6. Weyl, H.: Geodesic Fields in the Calculus of Variations for Multiple Integrals. Ann. Math. 36, 607–629 (1935)
7. Kijowski, J.: A Finite-Dimensional Canonical Formalism in the Classical Field Theory. Commun. Math. Phys. 30, 99–128 (1973); Multiphase Spaces and Gauge in the Calculus of Variations. Bull. Acad. Sc. Polon. 22, 1219–1225 (1974)
8. Kijowski, J., Szczyrba, W.: Multisymplectic Manifolds and the Geometrical Construction of the Poisson Brackets in the Classical Field Theory. In: J.-M. Souriau (ed.), Géometrie Symplectique et Physique Mathématique, Paris: C.N.R.S., 1975, pp. 347–379
9. Kijowski, J., Szczyrba, W.: A Canonical Structure for Classical Field Theories. Commun. Math. Phys. 46, 183–206 (1976)
10. Goldschmidt, H., Sternberg, S.: The Hamilton-Cartan Formalism in the Calculus of Variations. Ann. Inst. Fourier 23, 203–267 (1973)
11. Guillemin, V., Sternberg, S.: Geometric Asymptotics. Providence, RI: AMS, 1977
12. Garcia, P.L.: The Poincaré-Cartan Invariant in the Calculus of Variations. Symp. Math. 14, 219–246 (1974)
13. Cariñena, J.F., Crampin, M., Ibort, L.A.: On the Multisymplectic Formalism for First Order Field Theories. Diff. Geom. App. 1, 345–374 (1991)
14. Gotay, M.J.: A Multisymplectic Framework for Classical Field Theory and the Calculus of Variations I. Covariant Hamiltonian Formalism. In: M. Francaviglia (ed.), Mechanics, Analysis and Geometry: 200 Years After Lagrange, Amsterdam: North Holland, 1991, pp. 203–235
15. Gotay, M.J., Isenberg, J., Marsden, J.E., Montgomery, R.: Momentum Maps and Classical Relativistic Fields. Part I: Covariant Field Theory. http://arxiv.org/abs/physics/9801019, 1998
16. Peierls, R.E.: The Commutation Laws of Relativistic Field Theory. Proc. Roy. Soc. (London) A 214, 143–157 (1952)
17. DeWitt, B.: Invariant Commutators for the Quantized Gravitational Field. Phys. Rev. Lett. 4, 317–320 (1960)
18. DeWitt, B.: Dynamical Theory of Groups and Fields. In: B. DeWitt, C. DeWitt (eds.), Relativity, Groups and Topology, 1963 Les Houches Lectures, New York: Gordon and Breach, 1964, pp. 585–820
19. DeWitt, B.: The Spacetime Approach to Quantum Field Theory. In: B. DeWitt, R. Stora (eds.), Relativity, Groups and Topology II, 1983 Les Houches Lectures, Amsterdam: Elsevier, 1984, pp. 382–738
20. Kijowski, J., Tulczyjew, W.M.: A Symplectic Framework for Field Theories. Lecture Notes in Physics 107, Berlin: Springer-Verlag, 1979
21. Romero, S.V.: Colchete de Poisson Covariante na Teoria Geométrica dos Campos. PhD thesis, IME-USP, June 2001
22. Abraham, R., Marsden, J.E.: Foundations of Mechanics. 2nd edition, Reading, MA: Benjamin-Cummings, 1978
23. Arnold, V.: Mathematical Foundations of Classical Mechanics. 2nd edition, Berlin: Springer-Verlag, 1987
24. Saunders, D.J.: The Geometry of Jet Bundles. Cambridge: Cambridge University Press, 1989
25. Palais, R.: Foundations of Non-Linear Global Analysis. Reading, MA: Benjamin-Cummings, 1968
26. Kolář, I., Michor, P.W., Slovák, J.: Natural Operations in Differential Geometry. Berlin: Springer-Verlag, 1993
27. Marsden, J.E., Patrick, G.W., Shkoller, S.: Multisymplectic Geometry, Variational Integrators and Nonlinear PDEs. Commun. Math. Phys. 199, 351–395 (1998)
28. Wald, R.M.: General Relativity. Chicago, IL: Chicago University Press, 1984
29. Barnich, G., Henneaux, M., Schomblond, C.: Covariant Description of the Canonical Formalism. Phys. Rev. D 44, R939–R941 (1991)
30. Christodoulou, D.: The Notion of Hyperbolicity for Systems of Euler-Lagrange Equations. In: B. Fiedler, K. Gröger, J. Sprekels (eds.), Equadiff99 - Proceedings of the International Conference on Differential Equations, Vol. 1, Singapore: World Scientific, 2000, pp. 327–338
31. Christodoulou, D.: On Hyperbolicity. Contemp. Math. 263, 17–28 (2000)
32. Christodoulou, D.: The Action Principle and Partial Differential Equations. Princeton, NJ: Princeton University Press, 2000
33. Kanatchikov, I.: On Field Theoretic Generalizations of a Poisson Algebra. Rep. Math. Phys. 40, 225–234 (1997)
34. Forger, M., Römer, H.: A Poisson Bracket on Multisymplectic Phase Space. Rep. Math. Phys. 48, 211–218 (2001)
35. Forger, M., Paufler, C., Römer, H.: The Poisson Bracket for Poisson Forms in Multisymplectic Field Theory. Rev. Math. Phys. 15, 705–744 (2003)
36. Forger, M., Paufler, C., Römer, H.: Hamiltonian Multivector Fields and Poisson Forms in Multisymplectic Field Theory. Preprint IME-USP RT-MAP-0402, July 2004, http://arxiv.org/abs/math-ph/0407057, 2004
37. Salles, M.O.: Campos Hamiltonianos e Colchete de Poisson na Teoria Geométrica dos Campos. PhD thesis, IME-USP, June 2004
38. Lang, S.: Differential Manifolds. 2nd edition, Berlin: Springer-Verlag, 1985

Communicated by A. Kupiainen
Commun. Math. Phys. 256, 411–435 (2005) Digital Object Identifier (DOI) 10.1007/s00220-005-1320-y
Communications in
Mathematical Physics
Semi-Classical Limit for Radial Non-Linear Schrödinger Equation
Rodrigo Castro, Patricio L. Felmer
Departamento de Ingeniería Matemática and Centro de Modelamiento Matemático, UMR2071 CNRS-UChile, Universidad de Chile, Casilla 170 Correo 3, Santiago, Chile. E-mail: [email protected]
Received: 4 February 2004 / Accepted: 22 November 2004
Published online: 15 March 2005 – © Springer-Verlag 2005
Abstract: We consider the nonlinear Schrödinger equation
$$\varepsilon^2 \Delta u - V(x) u + |u|^{p-1} u = 0, \qquad x \in \mathbb{R}^N,$$
with superlinear and subcritical nonlinearity. Assuming that the potential is radially symmetric we find radial sign-changing solutions of the equation that concentrate in a ball, as the parameter ε goes to zero. We study the asymptotic profile of these highly oscillatory solutions, completely characterizing their behavior by means of an envelope function.

1. Introduction

In this article we are interested in the study of highly oscillatory standing waves for the nonlinear Schrödinger equation
$$i\hbar\,\psi_t = -(\hbar^2/2m)\,\Delta\psi + W(x)\psi - \gamma\,|\psi|^{p-1}\psi, \tag{1.1}$$
for a radial potential $W$ and constants $m, \gamma > 0$, as the parameter $\hbar$ approaches zero. This celebrated equation has been used to describe numerous physical phenomena. Among them we mention fluid dynamics, plasma physics and dispersive phenomena in waves, in particular optical waves. In all these cases the complex function $\psi$ represents a density, through $|\psi|^2$. Standing waves are obtained by considering in (1.1) the Ansatz $\psi(x, t) = \exp(-iEt/\hbar)\,u(x)$. After proper scaling, we find that the amplitude $u$ satisfies
$$\varepsilon^2 \Delta u - V(x) u + |u|^{p-1} u = 0, \qquad x \in \mathbb{R}^N, \tag{1.2}$$
Partially supported by FONDECYT Lineas Complementarias Grant 8000010.
Partially supported by FONDAP Matemáticas Aplicadas and FONDECYT Lineas Complementarias Grant 8000010.
for ε > 0 and V(x) = W(x) − E. It is the purpose of this article to analyze the asymptotic behavior of highly oscillatory sign-changing solutions of (1.2) in $H^1(\mathbb{R}^N)$, concentrating in a ball of finite radius around the origin, as the parameter ε → 0. These solutions represent excited bound states of the system that keep the overall mass, that is the integral of $u^2$, bounded away from zero along the limiting process.
The semi-linear elliptic problem (1.2) was first studied, in a pioneering work, by Floer and Weinstein [13], in the one dimensional case, for V positive and p = 3. They show that as ε → 0, positive single peaked solutions exist near any non-degenerate critical point of V. Since then, numerous authors have extended this result in many directions. We mention the works by Oh [19], Rabinowitz [21], Wang [24], Ambrosetti, Badiale and Cingolani [1], del Pino and Felmer [7, 8], among many others. In all cases the potential is considered positive and concentration occurs at isolated points in $\mathbb{R}^N$. Concerning multiple concentration or clusters we have the contribution by Kang and Wei [14] in dimension N and by del Pino, Felmer and Tanaka [9] in dimension one. More related to our work, the N-dimensional radial case, we find articles by Benci and D’Aprile [5], D’Aprile [6] and Ambrosetti, Malchiodi and Ni [3], where positive solutions are constructed, concentrating around a sphere centered at the origin. More recently, Ambrosetti, Malchiodi and Ni [4], and Malchiodi, Ni and Wei [18] have obtained clusters of positive solutions for (1.2), concentrating on a sphere whose radius is located at a positive maximum point of the effective potential (1.4).
In this paper we depart from previous works in two directions. On one hand we allow the potential V to take negative values near the origin. We observe that this situation may occur when we consider standing waves for (1.1) with high values of E, that is, highly excited states.
On the other hand, we consider oscillatory sign-changing solutions that keep their $L^2$ norm away from zero as $\varepsilon \to 0$. The asymptotic behavior of our solutions is such that their frequencies increase as $\varepsilon^{-1}$, their amplitudes stay away from zero and the oscillations take place in a ball of finite radius, while away from that ball the solutions decay as $e^{-r/\varepsilon}$. In this way our solutions concentrate not on spheres but in a whole ball of finite radius. Our analysis goes further, by identifying an envelope function that completely describes the asymptotic amplitude of the solutions. By means of this envelope we can also determine the asymptotic frequency at any given radius, and the mass and energy distribution in the concentration ball; see the comments after Theorem 1.2.

Let us describe our results more precisely. Our first goal is to find solutions for (1.2) having high energies. We achieve this by using the variational formulation of the problem, taking advantage of the even character of the associated functional
$$J_\varepsilon(u) = \int_{\mathbb{R}^N} \left( \frac{\varepsilon^2}{2}|\nabla u|^2 + \frac{1}{2}V(x)u^2 - \frac{1}{p+1}|u|^{p+1} \right) dx. \tag{1.3}$$
For our existence theory we assume that the potential $V$ satisfies the following hypothesis:

(V1) $V : [0, \infty) \to \mathbb{R}$ is of class $C^1$ and $\liminf_{r\to\infty} V(r) > 0$.

In the appendix we prove the following existence result.

Theorem 1.1. Assume that the potential $V$ satisfies (V1) and that $1 < p < (N+2)/(N-2)$ if $N \ge 3$ and $p > 1$ if $N = 2$. Then, for every $c > 0$ there is a sequence $(\varepsilon_n, u_n)$ of radial solutions of (1.2), with $\varepsilon_n$ converging to zero and such that $J_{\varepsilon_n}(u_n) = c$ for all $n \in \mathbb{N}$.
Semi-Classical Limit for Radial Non-Linear Schrödinger Equation
In order to analyze the asymptotic behavior of the solutions of (1.2) we require extra hypotheses on the potential. First we need

(V2) $V$ is uniformly continuous.

Second, we need a hypothesis that is better presented in terms of the effective potential, defined as
$$U(r) = r^{\alpha(p-1)} V(r), \tag{1.4}$$
where $\alpha = 2(N-1)/(p+3)$. We assume

(U) There are $d > 0$ and $\eta > 0$ such that $U(r)(r - d) > 0$ if $r > 0$, $r \neq d$, and $U'(r) \ge \eta$ if $r \ge d$.

For positive potentials we slightly change the hypothesis on $U$:

(U+) $U(r) > 0$ and $U'(r) > 0$ if $r > 0$, and there exists $\eta > 0$ such that $U'(r) \ge \eta$ if $r \ge 1$.

Our goal is to study the asymptotic behavior of the solutions $\{u_n\}$ found in Theorem 1.1. The first result we get is the oscillatory character of the functions $u_n$. Regarding these functions as functions of the radius $r$, this means that the zeroes of $u_n$ become dense in an interval of the form $(0, R)$ as $\varepsilon_n \to 0$. In order to describe the asymptotic behavior of the sequence, we associate to each $u_n$ an approximate envelope function $e_n$, obtained simply by joining its maxima with straight lines. This piecewise linear function carries the information on the amplitude of the oscillatory solution $u_n$. See the precise definition in Sect. 5.

Our main theorem is the identification of the limit of the sequence $\{e_n\}$. We consider the equation
$$w''(y) - V(r)w(y) + |w|^{p-1}w(y) = 0, \quad y \in \mathbb{R}, \qquad w(0) = s, \quad w'(0) = 0, \tag{1.5}$$
where $r, s > 0$ are parameters and $w = w(r, s; y)$. We denote by $T = T(r, s)$ a quarter of a period of $w$, if $w$ is periodic with zeroes. When $w$ is positive with exponential decay, we set $T = \infty$. Then we introduce the functions
$$Q(r, s) = \frac{1}{T}\int_0^T w^2\, dy \quad\text{and}\quad R(r, s) = \frac{1}{T}\int_0^T |w|^{p+1}\, dy, \tag{1.6}$$
if $T < \infty$, and $Q(r, s) = R(r, s) = 0$ if $T = \infty$. We also define
$$H(r, s) = \frac{\left(V'(r) + \alpha(p-1)V(r)/r\right)\left(s^2 - Q(r, s)\right)}{2\left(s^p - V(r)s\right)} - \alpha\frac{s}{r}, \tag{1.7}$$
and the asymptotic energy functional
$$\bar J(e) = \frac{p-1}{2(p+1)}\int_0^\infty R(r, e(r))\, r^{N-1}\, dr, \tag{1.8}$$
for a function $e(r)$. Here is our main result.
Theorem 1.2. Assume that $V$ satisfies the hypotheses (V1)–(V2) and (U) or (U+), and that $p$ satisfies $1 < p < \min\{5, (N+2)/(N-2)\}$. Let $(\varepsilon_n, u_n)$ be a sequence of radial solutions of (1.2) such that $J_{\varepsilon_n}(u_n) = c > 0$. Then the sequence of approximate envelopes $e_n$ converges locally uniformly in $\mathbb{R}^+ = \{r > 0\}$ to a function $e$, which is the unique solution of the differential equation
$$e' = H(r, e), \quad r > 0, \tag{1.9}$$
subject to the condition
$$\bar J(e) = c. \tag{1.10}$$
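The quantities $T$, $Q$ and $R$ entering (1.7) and (1.8) are defined through the auxiliary equation (1.5), so a small numerical sketch may help to make them concrete. The code below is only an illustration, not part of the paper: it uses the first integral $(w')^2/2 = F(s) - F(w)$ with $F(w) = w^{p+1}/(p+1) - V w^2/2$ and a midpoint rule after the substitution $w = s\sin\theta$. The function name `quarter_period_data`, the parameter `n`, and the convention of returning $T = \infty$ whenever $F(s) \le 0$ are assumptions made for this sketch.

```python
import math


def quarter_period_data(V, s, p, n=4000):
    """Approximate (T, Q, R) of (1.5)-(1.6) for a frozen value V = V(r) and s > 0.

    Energy conservation for w'' - V w + |w|^(p-1) w = 0, w(0) = s, w'(0) = 0
    gives (w')^2 / 2 = F(s) - F(w), with F(w) = w^(p+1)/(p+1) - V w^2 / 2.
    """
    def F(w):
        return w ** (p + 1) / (p + 1) - 0.5 * V * w * w

    # Assumption of this sketch: w reaches zero exactly when F(s) > 0.
    # Otherwise we use the convention T = infinity, Q = R = 0.
    if F(s) <= 0.0:
        return math.inf, 0.0, 0.0

    h = 0.5 * math.pi / n
    T = Q = R = 0.0
    for k in range(n):
        theta = (k + 0.5) * h  # midpoint rule, never evaluates theta = pi/2
        w = s * math.sin(theta)
        # dy = dw / |w'| with dw = s cos(theta) dtheta
        dy = s * math.cos(theta) * h / math.sqrt(2.0 * (F(s) - F(w)))
        T += dy
        Q += w * w * dy
        R += w ** (p + 1) * dy
    return T, Q / T, R / T
```

For instance, `quarter_period_data(0.0, 1.0, 3)` treats the purely cubic oscillator $w'' + w^3 = 0$, and for $V < 0$ with small $s$ the values approach those of the linear oscillator, $T \approx \pi/(2\sqrt{-V})$ and $Q \approx s^2/2$.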
We point out that the function $H$ fails to be Lipschitz continuous over the graph of the function $e_0$ defined in (5.1). Thus, condition (1.10) replaces the initial condition in order to obtain uniqueness of the solution.

The envelope function carries asymptotic information on the sequence $\{u_n\}$. In particular, the functions
$$E(r) = \frac{p-1}{2(p+1)} R(r, e(r))\, r^{N-1} \quad\text{and}\quad \rho(r) = Q(r, e(r))\, r^{N-1} \tag{1.11}$$
correspond to the asymptotic energy and mass densities, respectively. The function $e(r)$ itself represents the asymptotic amplitude and $T^{-1}(r, e(r))$ the asymptotic frequency. In particular, the number of zeroes of $u_n$ in an interval $(r_0, r_1)$ is approximately $\varepsilon_n^{-1}\int_{r_0}^{r_1} T^{-1}(r, e(r))\, dr$.

Our results can also be described using the effective potential $U$. If we define $v_n(r) = r^\alpha u_n(r)$ and the corresponding sequence of approximate envelopes for $v_n$, say $\tilde e_n$, we can prove that $\tilde e_n$ converges locally uniformly in $\mathbb{R}^+$ to the function $\tilde e(r) = r^\alpha e(r)$, which is a solution of
$$\tilde e' = \frac{U'(r)\left(\tilde e^2 - \tilde Q(r, \tilde e(r))\right)}{2\left(\tilde e^p - U(r)\tilde e\right)}, \tag{1.12}$$
where $\tilde Q(r, s) = r^{2\alpha} Q(r, r^{-\alpha}s)$.

As a consequence of Theorem 1.2 we can prove the following surprising result on the behavior of $u_n$ near the origin.

Corollary 1.1. There is a constant $C > 0$ such that
$$|u_n(r)| \le \frac{C}{r^\alpha} \quad\text{for all } r > 0$$
and
$$\lim_{n\to\infty} \|u_n\|_\infty = \infty.$$
At this point we mention the earlier work by Felmer and Torres [12], where the one-dimensional case of (1.2) is studied. In [12] the existence of an envelope equation like (1.12) is proved, but with $U$ replaced by $V$. The fact that it is the effective potential that governs the concentration phenomena has already been observed in [5, 6, 3, 4] and [18], in the case when concentration of positive solutions occurs at spheres away from the origin. For recent results in related one-dimensional problems see Felmer and Martínez [10] and Felmer, Martínez and Tanaka [11].
Remark 1.1. For the nonlinear Schrödinger equation in the radial case we have shown that a concentration phenomenon of sign-changing solutions occurs in a set with nonempty interior. We conjecture that, if the effective potential has a local maximum at the origin, then there exist positive highly oscillatory solutions concentrating in a ball, with a singularity at the origin. For concentration phenomena in a lower-dimensional set, other than points, we should mention the recent results by Malchiodi and Montenegro [16] and [17] and Malchiodi [15] in the case of a related Neumann problem in a bounded domain.

Remark 1.2. In this article we have assumed that the effective potential $U$ does not have critical points in $(d, \infty)$; in this way concentration occurs only in a ball near the origin. If there are critical points in $(0, d)$, then we expect concentration of oscillating solutions in fat spheres around these points. We do not pursue this line of research, but we mention the work by Felmer, Martínez and Tanaka [11], where an analogous situation is considered in the unbalanced Allen-Cahn equation.

Remark 1.3. Our hypotheses on the potential imply control of the growth of $V$ at infinity, which can be interpreted as a confinement condition. The strength of these hypotheses is used in obtaining a uniform estimate of the $L^\infty$ norm of the sequence $\{u_n\}$, a fact that is proved in Sect. 3. This is perhaps the hardest part of the paper. There is a wide class of potentials satisfying our hypotheses. They are satisfied, for example, by a potential behaving like $mr$ for large $r$, with $m > 0$. Another particularly interesting case is the constant potential $V \equiv 1$. Here, our Theorem 1.2 holds if
$$\frac{2N+1}{2N-3} \le p < \frac{N+2}{N-2}.$$
This certainly excludes the case $N = 2$, where we require the extra assumption $p < 5$, see (3.5). We do not know if the constraint $p < 5$ can be removed.

Our work is organized in the following way. In Sect. 2 we prove some preliminary results. In Sect. 3 we prove that $u_n$ is locally bounded in $\mathbb{R}^+$ and that $v_n$ is uniformly bounded. In Sect. 4 we prove that the zeroes of $u_n$ and $v_n$ are densely distributed in a bounded interval. This allows us to define the approximate envelopes $e_n$ and $\tilde e_n$. In Sects. 5 and 6 we study the asymptotic behavior of $e_n$ and $\tilde e_n$, and we characterize completely their limits through the solutions of the corresponding envelope equations.

2. Preliminary Properties of Solutions

In this section we introduce some elements in order to study the asymptotic behavior of the solutions $(\varepsilon_n, u_n)$ given by Theorem 1.1. Let us first observe that, as a function of $r$, a solution $u$ of (1.2) satisfies the ordinary differential equation
$$\varepsilon^2\left(u'' + \frac{N-1}{r}u'\right) - V(r)u + |u|^{p-1}u = 0, \quad r > 0, \qquad u'(0) = 0, \quad \lim_{r\to\infty} u(r) = \lim_{r\to\infty} u'(r) = 0. \tag{2.1}$$
We notice that the function $v = r^\alpha u$ satisfies the equation
$$\varepsilon^2 r^{(p-1)\alpha}\left(v'' + \frac{(p-1)\alpha}{2r}v'\right) - U_\varepsilon(r)v + |v|^{p-1}v = 0, \tag{2.2}$$
where
$$U_\varepsilon(r) = U(r) + \alpha\left(\frac{(p+1)\alpha}{2} - 1\right)\varepsilon^2 r^{(p-1)\alpha - 2},$$
with $U(r)$ and $\alpha$ as defined in the Introduction. We observe that the exponent $(p-1)\alpha - 2$ is negative, so that the function $U_\varepsilon$ has a singularity at the origin. If $N \ge 3$ then the coefficient $(p+1)\alpha/2 - 1$ is positive, while if $N = 2$ it is negative. In any case, $U_\varepsilon$ converges to $U$ in a $C^1$ uniform sense in any interval of the form $(r_0, \infty)$, with $r_0 > 0$. In the next two lemmas we prove preliminary properties of $u_n$ and $v_n$.

Lemma 2.1. Given $\bar r > d$ there exists $\varepsilon_0 > 0$ such that if $(\varepsilon, u)$ is a solution of (2.1) with $\varepsilon \in (0, \varepsilon_0)$, then $u$, and also $v(r) = r^\alpha u(r)$, possess neither positive minima nor negative maxima in $[\bar r, \infty)$.

Proof. Multiplying (2.2) by $v'$ we see that
$$\frac{d}{dr}\left\{\varepsilon^2 r^{(p-1)\alpha}\frac{|v'|^2}{2} - U_\varepsilon(r)\frac{v^2}{2} + \frac{|v|^{p+1}}{p+1}\right\} + U_\varepsilon'(r)\frac{v^2}{2} = 0. \tag{2.3}$$
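Since the identity (2.3) is used repeatedly in what follows, it may be worth recording the short computation behind it. The lines below are only a sketch of that check, using nothing beyond (2.2) and the product rule.

```latex
\begin{align*}
\frac{d}{dr}\Big[\varepsilon^2 r^{(p-1)\alpha}\frac{|v'|^2}{2}\Big]
  &= \varepsilon^2 r^{(p-1)\alpha}\Big(v'' + \frac{(p-1)\alpha}{2r}\,v'\Big)v', \\
\frac{d}{dr}\Big[-U_\varepsilon(r)\frac{v^2}{2} + \frac{|v|^{p+1}}{p+1}\Big]
  &= \big(-U_\varepsilon(r)\,v + |v|^{p-1}v\big)v' - U_\varepsilon'(r)\frac{v^2}{2}.
\end{align*}
% Adding the two lines, the terms multiplying v' on the right-hand side
% sum to the left-hand side of (2.2) times v', which vanishes. Hence
\[
\frac{d}{dr}\Big\{\varepsilon^2 r^{(p-1)\alpha}\frac{|v'|^2}{2}
  - U_\varepsilon(r)\frac{v^2}{2} + \frac{|v|^{p+1}}{p+1}\Big\}
  = -\,U_\varepsilon'(r)\frac{v^2}{2}.
\]
```

In particular, the bracketed quantity is monotone in $r$ on any interval where $U_\varepsilon'$ has a fixed sign.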
By the positivity of the potential $V$ at infinity we see that both $u$ and $v$ decay exponentially. This together with the uniform continuity of $V$ implies that
$$\lim_{r\to\infty}\left[\varepsilon^2 r^{(p-1)\alpha}\frac{v'(r)^2}{2} - U_\varepsilon(r)\frac{v(r)^2}{2} + \frac{|v(r)|^{p+1}}{p+1}\right] = 0. \tag{2.4}$$
Consider $r_1 \ge \bar r$, a critical point of $u$ with $m = u(r_1) > 0$. Integrating (2.3) between $r_1$ and infinity and using that $U_\varepsilon'(r) > 0$ in $[\bar r, \infty)$ for all $\varepsilon > 0$ small, we find that
$$\varepsilon^2 r_1^{(p-1)\alpha}\frac{v'(r_1)^2}{2} - U_\varepsilon(r_1)\frac{v(r_1)^2}{2} + \frac{|v(r_1)|^{p+1}}{p+1} \ge 0,$$
and since $v(r_1) = r_1^\alpha m$ and $v'(r_1) = \alpha r_1^{\alpha-1} m$ we obtain
$$c\,\frac{\varepsilon^2}{r_1^2} + \frac{2}{p+1}m^{p-1} \ge V(r_1), \tag{2.5}$$
for a certain constant $c$. If $r_1$ is a positive minimum point of $u$, from (2.1) we see that $V(r_1) \ge m^{p-1}$, and combining with (2.5) we get
$$c\,\frac{\varepsilon^2}{r_1^2} \ge \frac{p-1}{p+1}V(r_1),$$
which is impossible if $\varepsilon > 0$ is small enough. Here we used that $V(r)$ is bounded away from zero in $[\bar r, \infty)$, as can be seen from (V1) and (U) or (U+). This completes the proof in the case of $u$.

Now we consider $v$ in the case when $U$ changes sign (the case $U$ positive is similar). Let $d_\varepsilon$ be the point near $d$ where $U_\varepsilon$ changes sign. Let $r_1 \ge d_\varepsilon$ be a critical point of $v(r) = r^\alpha u(r)$. Since $U_\varepsilon'(r) > 0$ in $[d_\varepsilon, \infty)$, integrating (2.3) between $r_1$ and infinity we obtain
$$\frac{2}{p+1}m^{p-1} \ge U_\varepsilon(r_1), \tag{2.6}$$
where $m = v(r_1)$. Thus, if $r_1$ is a positive minimum point of $v$, from Eq. (2.2) we see that $m^{p-1} \le U_\varepsilon(r_1)$, providing a contradiction.
Lemma 2.2. Let $(\varepsilon, u)$ be a solution of (2.1). If $0 < r_1 < r_2$ are two consecutive critical points of $v$ then

i) $|v(r_1)| < |v(r_2)|$ if $U_\varepsilon' > 0$ in $[r_1, r_2]$, and
ii) $|v(r_1)| > |v(r_2)|$ if $U_\varepsilon' < 0$ in $[r_1, r_2]$.

Here we can replace $<, >$ by $\le, \ge$.

Proof. It is enough to prove the lemma in the case $U_\varepsilon'(r) < 0$ in $[r_1, r_2]$. Defining $h_i = |v(r_i)|$, $i = 1, 2$, and considering the functions
$$F_i(s) = \frac{s^{p+1}}{p+1} - U_\varepsilon(r_i)\frac{s^2}{2}, \quad s > 0, \quad i = 1, 2,$$
after integrating (2.3) between $r_1$ and $r_2$ we find
$$F_2(h_2) - F_1(h_1) = -\int_{r_1}^{r_2} U_\varepsilon'(r)\frac{v^2}{2}\, dr.$$
Noticing that $F_1(h_2) - F_2(h_2) = (U_\varepsilon(r_2) - U_\varepsilon(r_1))h_2^2/2$, we find
$$F_1(h_2) - F_1(h_1) = \int_{r_1}^{r_2} \frac{U_\varepsilon'(r)}{2}\left(h_2^2 - v^2\right) dr. \tag{2.7}$$
Now we assume for contradiction that $h_1 \le h_2$. If $U_\varepsilon > 0$ in $[r_1, r_2]$, from the equation for $v$ we see that $h_1 \ge (U_\varepsilon(r_1))^{1/(p-1)}$, and then $F_1$ is increasing in $[h_1, h_2]$, since $F_1'(s) = s^p - U_\varepsilon(r_1)s > 0$ for $s > (U_\varepsilon(r_1))^{1/(p-1)}$. Thus we obtain that the left-hand side in (2.7) is positive, while the right-hand side is negative. If $U_\varepsilon < 0$ in $[r_1, r_2]$, the function $F_1$ is also increasing and we get the same contradiction. The remaining cases are treated similarly.
3. Uniform Bounds for the Solutions

In this section we consider the sequence $(\varepsilon_n, u_n)$ of solutions of (2.1) with $J_{\varepsilon_n}(u_n) = c$ and $\varepsilon_n \to 0$. The goal is to obtain uniform estimates for $u_n$ and $v_n = r^\alpha u_n$. This task is perhaps the hardest part of our analysis. It is not hard to check that the sequence $u_n$ has an increasing number of zeroes and critical points as $n \to \infty$. The contrary would lead to $J_{\varepsilon_n}(u_n) \to 0$. We can see this either by analyzing the min-max procedure or by an asymptotic study of $u_n$. Our first lemma says that critical points of $u_n$ are not isolated.

Lemma 3.1. Let $(\varepsilon_n, u_n)$ be a sequence of solutions of (2.1) such that $\varepsilon_n \to 0$ and $J_{\varepsilon_n}(u_n) = c$ for all $n \in \mathbb{N}$. If $\bar r > d$ and $x_n < y_n$ are sequences of consecutive critical points of $u_n$ such that $y_n \ge \bar r$ for all $n \in \mathbb{N}$, then $y_n - x_n \to 0$ as $n \to \infty$.
Proof. Before starting our proof, let us consider a generic situation that we encounter several times later. Let $\zeta_n$ be a maximum point of $u_n$ and let $m_n = u_n(\zeta_n)$. It will be convenient to consider the re-scaled function
$$w_n(z) = u_n\left(\zeta_n + \varepsilon_n m_n^{(1-p)/2} z\right)/m_n, \tag{3.1}$$
that satisfies the equation
$$P(\zeta_n):\quad w''(z) + \frac{N-1}{\varepsilon_n^{-1} m_n^{(p-1)/2}\zeta_n + z}\, w'(z) - V_n(z)w(z) + |w|^{p-1}w(z) = 0, \qquad w(0) = 1, \quad w'(0) = 0,$$
with
$$V_n(z) = V\left(\zeta_n + \varepsilon_n m_n^{(1-p)/2} z\right)/m_n^{p-1}.$$
Now we start our proof. Assume, without loss of generality, that our points $y_n$ are maximum points of $u_n$. Then we re-scale around $y_n$, obtaining $w_n$ that satisfies $P(y_n)$, and we can follow the proof of Lemma 2.1 to get, as in (2.5),
$$c\,\frac{\varepsilon_n^2}{y_n^2} + \frac{2}{p+1}m_n^{p-1} \ge V(y_n) \ge \bar V \quad\text{and}\quad \liminf_{n\to\infty} m_n^{p-1} \ge \frac{p+1}{2}\bar V,$$
where $\bar V = \inf_{r\in[\bar r,\infty)} V(r) > 0$. By the uniform continuity of $V$ we find that $V_n(z)$ converges, up to a sub-sequence, locally uniformly to some constant $\gamma \in [0, 2/(p+1)]$. On the other hand, $w_n$ and also $V_n$ are locally bounded in $\mathbb{R}$, so that from equation $P(y_n)$ we see that $w_n$ converges, up to a sub-sequence, to the solution of
$$E(\gamma):\quad w'' - \gamma w + |w|^{p-1}w = 0, \qquad w(0) = 1, \quad w'(0) = 0.$$
Now we consider a constant $C > 0$ such that $\bar r - 2C > d$ and we assume that $u_n'(r) > 0$ in $[y_n - 2C, y_n]$, up to a sub-sequence. This implies that $\gamma = 2/(p+1)$ and $w$ is the positive homoclinic solution. Thus $u_n(y_n - C) \to 0$, and consequently $u_n(r) \to 0$ for all $r \in [y_n - 2C, y_n - C]$. From here we can easily prove that there is $\bar r_n \in [y_n - 2C, y_n - C]$ such that
$$0 < u_n(\bar r_n),\ u_n'(\bar r_n) \le c_0\exp(-c_1/\varepsilon_n), \tag{3.2}$$
for certain positive constants $c_0, c_1$. We just need a comparison argument for the function $\bar w_n(z) = u_n(y_n - 2C + \varepsilon_n z)$ with the solution of
$$u'' - \rho^2 u = 0, \qquad u(0) = u(C/\varepsilon_n) = 1, \tag{3.3}$$
for an appropriate $\rho > 0$. Now we use (3.2) to obtain
$$\varepsilon_n^2\bar r_n^{(p-1)\alpha}\frac{v_n'(\bar r_n)^2}{2} + \frac{|v_n(\bar r_n)|^{p+1}}{p+1} \le c_2\,\bar r_n^{(p+1)\alpha} e^{-c_1/\varepsilon_n}, \tag{3.4}$$
for a certain $c_2 > 0$. On the other hand, using (U) or (U+) and the convergence of $w_n$ to $w$, by a direct estimate we get
$$\varepsilon_n\bar r_n^{2\alpha} m_n^{(5-p)/2}\|w\|_2^2 \le c_2\int_{\bar r_n}^\infty U_{\varepsilon_n}'(r)v_n(r)^2\, dr. \tag{3.5}$$
Next we integrate (2.3) for $(\varepsilon_n, v_n)$ between $\bar r_n$ and infinity and we use that $\liminf_{n\to\infty} m_n > 0$, $p < 5$, (3.4) and (3.5) to obtain
$$\varepsilon_n \le c_2\,\bar r_n^{(p-1)\alpha} e^{-c_1/\varepsilon_n}, \tag{3.6}$$
enlarging $c_2$ if necessary. But $\varepsilon_n\bar r_n^{N-1}$ is bounded, as the inequality
$$\frac{1}{2}\varepsilon_n\bar r_n^{N-1} m_n^{(p+3)/2}\int_{\mathbb{R}} |w|^{p+1}\, dz \le \int_{\bar r_n}^\infty |u_n|^{p+1} r^{N-1}\, dr$$
shows, for $n$ large. Thus, from (3.6) it follows that $\varepsilon_n^\lambda \le \bar c_2 e^{-c_1/\varepsilon_n}$, with $\lambda = 1 + \frac{(p-1)\alpha}{N-1}$ and a proper $\bar c_2$. This is impossible for $n$ large. Thus, we have proved that there is a sequence $b_n < y_n$ such that $u_n(b_n) = 0$ and $y_n - b_n$ converges to zero.

To complete the proof of the lemma it is enough to show that $b_n - x_n \to 0$. In order to accomplish this we use again the argument just given. For that purpose it will be sufficient to assume that $u_n < 0$ and $u_n' > 0$ in $(b_n - 2C, b_n)$, and prove that $u_n(b_n - C) \to 0$. Then we go step by step as before to reach a contradiction. Let us assume that $u_n(b_n - C) \to -\infty$; then $u_n(r) \to -\infty$ in $[b_n - 2C, b_n - C]$, which contradicts the boundedness of the integral
$$\int_0^\infty |u_n|^{p+1} r^{N-1}\, dr. \tag{3.7}$$
Let us assume now that $\liminf_{n\to\infty} u_n(b_n - C)$ is negative and finite. Then there exists $\bar x_n \in [b_n - 2C, b_n - 3C/2]$ such that $u_n'(\bar x_n)$ is bounded, since the contrary would imply again that (3.7) is unbounded. We let $m_n = u_n(\bar x_n)$ and we re-scale $u_n$ around $\bar x_n$ to obtain $w_n$ as in (3.1), satisfying equation $P(\bar x_n)$. We claim that $V_n$ converges locally uniformly to a constant $\gamma \in [0, 2/(p+1)]$. In fact, integrating (2.3) between $\bar x_n$ and infinity we find
$$\varepsilon_n^2\bar x_n^{(p-1)\alpha}\frac{v_n'(\bar x_n)^2}{2} + \frac{|v_n(\bar x_n)|^{p+1}}{p+1} \ge U_{\varepsilon_n}(\bar x_n)\frac{v_n(\bar x_n)^2}{2},$$
and replacing $v_n(\bar x_n) = \bar x_n^\alpha m_n$ and $v_n'(\bar x_n) = \alpha\bar x_n^{\alpha-1} m_n + \bar x_n^\alpha u_n'(\bar x_n)$ we obtain
$$\varepsilon_n^2\left(\frac{\alpha}{\bar x_n} + \frac{u_n'(\bar x_n)}{m_n}\right)^2 + |m_n|^{p-1} \ge \frac{p+1}{2}V(\bar x_n) + C_1\frac{\varepsilon_n^2}{\bar x_n^2},$$
from where the claim follows, as $m_n \le \liminf_{n\to\infty} u_n(b_n - C) < 0$ and $u_n'(\bar x_n)$ is bounded. Since $V_n$ and $w_n$ are locally bounded to the right of 0, and since $w_n'(0) = \varepsilon_n m_n^{-(p+1)/2} u_n'(\bar x_n)$ converges to zero, the sequence $w_n$ converges, up to a sub-sequence, to the solution of equation $E\left(\frac{2}{p+1}\right)$. This implies, in particular, that $u_n(b_n - C)$ converges to zero, obtaining a contradiction.

The next proposition is crucial, allowing us to obtain an upper bound for $u_n$ away from the origin.

Proposition 3.1. Let $r_0 > 0$ and let $(\varepsilon_n, u_n)$ be a sequence of solutions of (2.1) such that $\varepsilon_n \to 0$ and $J_{\varepsilon_n}(u_n) = c$ for all $n \in \mathbb{N}$. Then $\|u_n\|_{L^\infty[r_0,\infty)}$ is bounded.
Proof. Let us denote by $y_{n,1} > y_{n,2} > \cdots > y_{n,s(n)}$ the zeroes of $u_n$ and by $x_{n,k}$ a maximum point of $|u_n|$ in $[y_{n,k+1}, y_{n,k}]$, for $k = 1, \ldots, s(n) - 1$. Let $x_{n,0}$ be a maximum point of $|u_n|$ in $[y_{n,1}, \infty)$ and $x_{n,s(n)}$ be a maximum point of $|u_n|$ in $[0, y_{n,s(n)}]$. We also define $m_{n,k} = |u_n(x_{n,k})|$, $k = 0, \ldots, s(n)$, for all $n \in \mathbb{N}$.

Our first goal is to prove that the sequence $x_{n,0}$ is bounded. To do so we assume the contrary and we prove that $J_{\varepsilon_n}(u_n)$ is unbounded. We can assume that $u_n(x_{n,0}) > 0$. From the proof of Lemma 3.1 we know that the sequence of functions
$$w_n(z) = u_n\left(x_{n,0} + \varepsilon_n m_{n,0}^{(1-p)/2} z\right)/m_{n,0} \tag{3.8}$$
converges locally uniformly to the solution of $E\left(\frac{2}{p+1}\right)$, since $w_n > 0$ to the right of $x_{n,0}$. From Lemma 3.1 we also see that $y_{n,1} - y_{n,k} \to 0$ for all $k \ge 2$. Let us assume, for the moment, that $l_n \in \mathbb{N}$ is a sequence such that $y_{n,1} - y_{n,l_n+1} \to 0$ as $n \to \infty$. From the uniform continuity of $V$ and Lemma 2.2 we obtain that
$$\lim_{n\to\infty}\frac{m_{n,k_n}}{m_{n,1}} = 1 \quad\text{and}\quad \lim_{n\to\infty}\frac{V(x_{n,k_n})}{m_{n,k_n}^{p-1}} = \frac{2}{p+1},$$
uniformly on the sequences $k_n \in \{1, 2, \ldots, l_n\}$. This implies that the sequences of functions
$$w_{n,k_n}(z) = \left|u_n\left(x_{n,k_n} + \varepsilon_n m_{n,k_n}^{(1-p)/2} z\right)\right|/m_{n,k_n}$$
converge to a solution $w$ of equation $E\left(\frac{2}{p+1}\right)$ and
$$\lim_{n\to\infty}\frac{1}{\varepsilon_n m_{n,k_n}^{(5-p)/2} x_{n,k_n}^{2\alpha}}\int_{y_{n,k_n+1}}^{y_{n,k_n}} v_n^2\, dx = \int_{-\infty}^{\infty} w^2\, dz,$$
uniformly in the sequence $k_n$. Integrating (2.3) between two consecutive zeroes of $u_n$ we find
$$y_{n,k+1}^{(p-1)\alpha} v_n'(y_{n,k+1})^2 - y_{n,k}^{(p-1)\alpha} v_n'(y_{n,k})^2 = \frac{1}{\varepsilon_n^2}\int_{y_{n,k+1}}^{y_{n,k}} U_{\varepsilon_n}'(r)v_n(r)^2\, dr \ge \frac{\eta\, m_{n,k}^{(5-p)/2} x_{n,k}^{2\alpha}}{\varepsilon_n}\,\frac{\|w\|_2^2}{2},$$
and integrating between $y_{n,1}$ and infinity,
$$y_{n,1}^{(p-1)\alpha} v_n'(y_{n,1})^2 = \frac{1}{\varepsilon_n^2}\int_{y_{n,1}}^\infty U_{\varepsilon_n}'(r)v_n(r)^2\, dr \ge \frac{\eta\, m_{n,0}^{(5-p)/2} x_{n,0}^{2\alpha}}{\varepsilon_n}\,\frac{\|w\|_2^2}{2},$$
from where
$$y_{n,k}^{(p-1)\alpha} v_n'(y_{n,k})^2 \ge c_0\, k\,\varepsilon_n^{-1} m_{n,0}^{(5-p)/2} x_{n,0}^{2\alpha},$$
for some $c_0 > 0$, for all $k \in \{1, 2, \ldots, l_n\}$. Since $v_n'(y_{n,k}) = y_{n,k}^\alpha u_n'(y_{n,k})$ and $y_{n,k}/x_{n,0} \to 1$, we find
$$u_n'(y_{n,k})^2 \ge \frac{c_0\, k\, m_{n,0}^{(5-p)/2}}{\varepsilon_n x_{n,0}^{(p-1)\alpha}}. \tag{3.9}$$
Next we obtain an estimate for the distance between two zeroes of $u_n$. Let us assume that $r_n \to \infty$ is a sequence of maximum points of $u_n$ and let $a_n < b_n$ be the consecutive zeroes of $u_n$ so that $r_n \in (a_n, b_n)$. Let $m_n = u_n(r_n)$ and let us further assume that $w_n(z) = u_n(r_n + \varepsilon_n m_n^{(1-p)/2} z)/m_n$ converges to the solution $w$ of $E\left(\frac{2}{p+1}\right)$. We claim that
$$b_n - a_n \le -\gamma_1\frac{\varepsilon_n}{m_n^{(p-1)/2}}\log\left(\frac{\varepsilon_n^2 u_n'(b_n)^2}{m_n^{p+1}}\right). \tag{3.10}$$
Let us prove this claim. From (2.3) and for $r \in [a_n, b_n]$ we have
$$\varepsilon_n^2 b_n^{(p-1)\alpha}\frac{v_n'(r)^2}{2} - U_{\varepsilon_n}(a_n)\frac{v_n(r)^2}{2} + \frac{|v_n(r)|^{p+1}}{p+1} \ge \varepsilon_n^2 b_n^{(p-1)\alpha}\frac{v_n'(b_n)^2}{2}, \tag{3.11}$$
where we used that $U' > 0$. Let us consider
$$\mu_n = \left(\frac{p+1}{2}U_{\varepsilon_n}(a_n)\right)^{1/(p-1)},$$
so that $U_{\varepsilon_n}(a_n)s^2 - \frac{2}{p+1}s^{p+1} \ge 0$ for all $s \in [0, \mu_n]$. Evaluating (3.11) at the maximum point of $v_n$ in $[a_n, b_n]$ we see that $\mu_n \le \max_{r\in[a_n,b_n]} v_n(r)$, and then there are two points $r_n^-, r_n^+ \in (a_n, b_n)$ with $r_n^- < r_n^+$ so that $v_n(r_n^-) = v_n(r_n^+) = \mu_n$. From (3.11) we also have that
$$(r_n^- - a_n) + (b_n - r_n^+) \le 2\int_0^{\mu_n}\frac{\varepsilon_n b_n^{(p-1)\alpha/2}\, ds}{\sqrt{\varepsilon_n^2 b_n^{(p-1)\alpha} v_n'(b_n)^2 + U_{\varepsilon_n}(a_n)s^2 - \frac{2}{p+1}s^{p+1}}},$$
and then, after changing the variable and taking into account that $v_n'(b_n) = b_n^\alpha u_n'(b_n)$, we find
$$(r_n^- - a_n) + (b_n - r_n^+) \le \frac{2\varepsilon_n b_n^{(p-1)\alpha/2}}{\sqrt{U_{\varepsilon_n}(a_n)}}\int_0^1\frac{dt}{\sqrt{\lambda_n\varepsilon_n^2 u_n'(b_n)^2 + t^2 - t^{p+1}}}, \tag{3.12}$$
where
$$\lambda_n = \left(\frac{2}{p+1}\right)^{2/(p-1)}\frac{b_n^{(p+1)\alpha}}{U_{\varepsilon_n}(a_n)^{(p+1)/(p-1)}}.$$
From the definition of $U_{\varepsilon_n}$, the uniform continuity of $V$ and, since $V(r_n)/m_n^{p-1}$ approaches $2/(p+1)$, we obtain
$$\lim_{n\to\infty} m_n^{p+1}\lambda_n = \frac{p+1}{2} \quad\text{and}\quad \lim_{n\to\infty}\frac{m_n^{(p-1)/2} b_n^{(p-1)\alpha/2}}{\sqrt{U_{\varepsilon_n}(a_n)}} = \sqrt{\frac{p+1}{2}}. \tag{3.13}$$
On the other hand, it can be proved that there is a positive constant $\gamma$ so that for all $\xi > 0$,
$$\int_0^1\frac{ds}{\sqrt{\xi + s^2 - s^{p+1}}} \le \gamma\left(1 - \log_-(\xi)\right),$$
where $\log_-(\xi) = \min\{0, \log(\xi)\}$. Then, combining (3.12) and (3.13) we find $\gamma_1 > 0$ such that
$$(r_n^- - a_n) + (b_n - r_n^+) \le \gamma_1\frac{\varepsilon_n}{m_n^{(p-1)/2}}\left(1 - \log_-\left(\frac{\varepsilon_n^2 u_n'(b_n)^2}{m_n^{p+1}}\right)\right). \tag{3.14}$$
But, since $w_n$ converges to the solution of $E\left(\frac{2}{p+1}\right)$, and since $V(r_n)/m_n^{p-1}$ approaches $2/(p+1)$, we see that
$$\lim_{n\to\infty}\frac{u_n(r_n^-)}{m_n} = \lim_{n\to\infty}\frac{u_n(r_n^+)}{m_n} = 1,$$
and then $r_n^+ - r_n^- \le C\varepsilon_n m_n^{(1-p)/2}$ for some $C > 0$. From here we finally conclude (3.10), proving our claim. We notice that the argument of $\log_-$ in (3.14) converges to zero, since the distance between the corresponding zeroes of $w_n$ is $\varepsilon_n^{-1} m_n^{(p-1)/2}(b_n - a_n)$, which diverges to infinity.

Next we apply (3.9) and (3.10) to obtain $\gamma_2 > 0$ so that for all $1 \le k \le l_n$,
$$y_{n,k} - y_{n,k+1} \le -\gamma_2\frac{\varepsilon_n}{m_{n,0}^{(p-1)/2}}\log\left(\frac{k\,\varepsilon_n}{x_{n,0}^{(p-1)\alpha} m_{n,0}^{3(p-1)/2}}\right).$$
Adding this inequality from $k = 1$ to $l_n$, and using that $M! \ge (\theta M)^M$ for some constant $\theta > 0$ and for all $M \in \mathbb{N}$, we obtain
$$y_{n,1} - y_{n,l_n+1} \le -\gamma_2\frac{\varepsilon_n l_n}{m_{n,0}^{(p-1)/2}}\log\left(\frac{\varepsilon_n l_n}{x_{n,0}^{(p-1)\alpha} m_{n,0}^{3(p-1)/2}}\right),$$
and then
$$T_n := y_{n,1} - y_{n,l_n+1} \le \frac{\varepsilon_n l_n}{m_{n,0}^{(p-1)/2}}\left(\frac{\varepsilon_n l_n}{x_{n,0}^{(p-1)\alpha} m_{n,0}^{3(p-1)/2}}\right)^{-\rho}, \tag{3.15}$$
for a fixed $\rho \in (0, 1)$ and $n$ sufficiently large. We recall that $w_{n,k_n}$ converge to $w$ uniformly in the sequences $k_n$, with $k_n \in \{0, \ldots, l_n\}$. Then, for large $n$,
$$\int_{y_{n,k+1}}^{y_{n,k}} |u_n|^{p+1} r^{N-1}\, dr \ge \varepsilon_n x_{n,k}^{N-1} m_{n,k}^{(p+3)/2}\int_{\mathbb{R}} w^{p+1}\, dz,$$
for all $k \in \{0, \ldots, l_n\}$. This and (1.10) imply that $\varepsilon_n l_n x_{n,0}^{N-1} m_{n,0}^{(p+3)/2}$ is bounded, which together with (3.15) leads us to a constant $c_1 > 0$ such that
$$T_n \le \frac{c_1}{x_{n,0}^{N-1} m_{n,0}^{p+1}}\left(x_{n,0}^{(p-1)\alpha+N-1} m_{n,0}^{2p}\right)^{\rho}. \tag{3.16}$$
By choosing an appropriate $\rho > 0$, we see that the right-hand side in (3.16) converges to zero. But, on the other hand, we may choose $l_n$ large enough so that $T_n$ converges to zero at a lower rate, providing a contradiction.
Thus $x_{n,0}$ is bounded and we are ready to show that $u_n$ is uniformly bounded in $[r_0, \infty)$. We first notice that if $\bar r > \limsup_{n\to\infty} x_{n,0}$ then $u_n(\bar r)$ is bounded, since otherwise the integral (3.7) would be unbounded. Now we see that the functions $|u_n|$ and $|v_n|$ decay exponentially, in a uniform way, in the interval $[\bar r, \infty)$. Next, let $r_n$ be the maximum point of $|u_n|$ in $[r_0, \infty)$. Integrating (2.3) for $(\varepsilon_n, v_n)$ we obtain
$$\varepsilon_n^2 r_n^{(p-1)\alpha}\frac{v_n'(r_n)^2}{2} - U_{\varepsilon_n}(r_n)\frac{v_n(r_n)^2}{2} + \frac{|v_n(r_n)|^{p+1}}{p+1} = \int_{r_n}^\infty U_{\varepsilon_n}'(r)\frac{v_n(r)^2}{2}\, dr. \tag{3.17}$$
Since the functions $U_{\varepsilon_n}$ have polynomial growth and the $v_n$ decay exponentially, the right-hand side in (3.17) is bounded. From here we see that $u_n(r_n)$ and also $v_n(r_n)$ are bounded.

Proposition 3.2. Let $(\varepsilon_n, u_n)$ be a sequence of solutions of (2.1) such that $\varepsilon_n \to 0$ and $J_{\varepsilon_n}(u_n) = c$ for all $n \in \mathbb{N}$. Then the functions $v_n(r) = r^\alpha u_n(r)$ are uniformly bounded in $\mathbb{R}^+$.

Proof. First we consider the case $N \ge 3$ and $V$ negative near the origin. Let $r_n$ be the maximum point of $|v_n(r)|$ and assume, for contradiction, that $|v_n(r_n)| \to \infty$ as $n \to \infty$. From Proposition 3.1 we see that $r_n \to 0$, and from Lemma 2.2 we see that $v_n'(r) \neq 0$ in $(0, r_n)$, since the existence of a critical point to the left of $r_n$ would imply that $|v_n(r_n)|$ is not the maximum value of $|v_n|$. Let us assume $m_n = u_n(0) > 0$. Then, since $V$ is negative near the origin, $u_n$ has a local maximum point at zero and is decreasing in $(0, r_n)$, and since $v_n(r_n) \le r_n^\alpha m_n$, we see that $m_n \to \infty$. Let us re-scale $u_n$, defining
$$w_n(z) = u_n\left(\varepsilon_n m_n^{(1-p)/2} z\right)/m_n.$$
Then $w_n$ satisfies equation $P(0)$, see (3.1) and the equations following it, and $w_n$ converges to the solution of
$$w''(z) + \frac{N-1}{z}w'(z) + |w|^{p-1}w(z) = 0, \qquad w(0) = 1, \quad w'(0) = 0. \tag{3.18}$$
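The oscillatory behavior of the solution of (3.18) can be explored numerically. The following is a minimal sketch, not taken from the paper: it integrates (3.18) with a classical fourth-order Runge-Kutta scheme, starting slightly away from the singular point $z = 0$ with the expansion $w(z) \approx 1 - z^2/(2N)$, and records sign changes of $w$. The function name `emden_fowler_zeros` and the parameters `z_max` and `h` are hypothetical choices made for this illustration.

```python
def emden_fowler_zeros(N, p, z_max=20.0, h=1e-3):
    """Zeros in (0, z_max] of w'' + (N-1)/z w' + |w|^(p-1) w = 0, w(0)=1, w'(0)=0."""
    def rhs(z, w, dw):
        # First-order system for (w, w').
        return dw, -(N - 1) / z * dw - abs(w) ** (p - 1) * w

    # Start at z = h using the local expansion w ~ 1 - z^2 / (2N).
    z, w, dw = h, 1.0 - h * h / (2.0 * N), -h / N
    zeros = []
    while z < z_max:
        k1 = rhs(z, w, dw)
        k2 = rhs(z + h / 2, w + h / 2 * k1[0], dw + h / 2 * k1[1])
        k3 = rhs(z + h / 2, w + h / 2 * k2[0], dw + h / 2 * k2[1])
        k4 = rhs(z + h, w + h * k3[0], dw + h * k3[1])
        w_new = w + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        dw_new = dw + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        if w * w_new < 0.0:
            # Locate the sign change by linear interpolation.
            zeros.append(z + h * w / (w - w_new))
        z, w, dw = z + h, w_new, dw_new
    return zeros
```

Calling `emden_fowler_zeros(3, 3)` returns the zeros found up to `z_max` for $N = 3$, $p = 3$; only the first zero $z_0$ and the slope $w'(z_0)$ enter the argument of the proof.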
It is well known (using the Emden-Fowler transformation, for example) that the solution of this equation has infinitely many zeroes. Let $z_0$ be the first zero of $w$ and $\bar y_n = y_{n,s(n)}$ be the first zero of $u_n$; then
$$\lim_{n\to\infty}\frac{\varepsilon_n^2\bar y_n^{(p-1)\alpha} v_n'(\bar y_n)^2}{(\bar y_n^\alpha m_n)^{p+1}} = w'(z_0)^2.$$
Since $v_n(r_n) \le \bar y_n^\alpha m_n$, we obtain that $\varepsilon_n^2\bar y_n^{(p-1)\alpha} v_n'(\bar y_n)^2$ converges to infinity. Let $a_0 > 0$ be such that $U_\varepsilon'(r) < 0$ in $(0, a_0)$. Then, integrating (2.3) between $\bar y_n$ and $a_0$ we find that
$$\varepsilon_n^2 a_0^{(p-1)\alpha}\frac{v_n'(a_0)^2}{2} - U_{\varepsilon_n}(a_0)\frac{v_n(a_0)^2}{2} + \frac{|v_n(a_0)|^{p+1}}{p+1} \ge \varepsilon_n^2\bar y_n^{(p-1)\alpha}\frac{v_n'(\bar y_n)^2}{2}, \tag{3.19}$$
which is impossible in view of Proposition 3.1.

When $V$ is positive, we consider $a_n$ as the point where $U_{\varepsilon_n}$ has its global minimum. Following the last part of the proof of Proposition 3.1, we see that $v_n(a_n)$ is bounded, since $U_{\varepsilon_n}$ is bounded in $[a_n, r_0]$, for any given $r_0$. Let $r_n$ be the maximum point of $|v_n(r)|$ and assume, for contradiction, that $|v_n(r_n)| \to \infty$ as $n \to \infty$. As before we see that
$r_n \to 0$ and that $v_n'(r) \neq 0$ in $(0, r_n)$. If $t_n$ is the maximum point of $u_n$ in $[0, r_n]$, which exists since $u_n'(r_n) < 0$, then $m_n = u_n(t_n)$ satisfies $v_n(r_n) \le r_n^\alpha m_n$ and then $m_n \to \infty$. Now we re-scale $u_n$ around $t_n$, defining
$$w_n(z) = u_n\left(t_n + \varepsilon_n m_n^{(1-p)/2} z\right)/m_n.$$
Then $w_n$ satisfies equation $P(t_n)$ and it converges to the solution of
$$w''(z) + \frac{N-1}{z + \bar t}\,w'(z) + |w|^{p-1}w(z) = 0, \qquad w(0) = 1, \quad w'(0) = 0, \tag{3.20}$$
where $\bar t = \lim_{n\to\infty}\varepsilon_n^{-1} m_n^{(p-1)/2} t_n$; here we allow $\bar t = \infty$. In any case, the solution of this equation also has infinitely many zeroes, and then we can repeat the argument given above, just replacing $a_0$ by $a_n$ in (3.19). This finishes the proof in the case $N \ge 3$.

Now we consider the case $N = 2$ and we assume first that $V$ is negative near the origin. Let $s_n > 0$ be such that $U_{\varepsilon_n}'(s_n) = 0$ and $s_n \to 0$ as $n \to \infty$. We have in this case that $U_{\varepsilon_n}'(r) > 0$ for all $r \in (0, s_n)$, if $n$ is large. We start our argument assuming that $v_n$ is bounded in $[s_n, \infty)$ and unbounded in $(0, s_n]$. Noticing that $v_n(0) = 0$, if $v_n$ does not have critical points then $v_n$ is bounded in $[0, s_n]$. Thus we can assume that $v_n$ has critical points in $(0, s_n]$. Let $b_n \in (0, s_n)$ be such that $v_n' \neq 0$ in $(b_n, s_n)$; then using that $U_{\varepsilon_n}' > 0$ in $(0, s_n)$ and Lemma 2.2 we have that $|v_n(b_n)| \to \infty$ as $n \to \infty$. Let us assume that $v_n(s_n) < 0$ and $v_n'(s_n) > 0$, and denote by $z_n$ the first critical point of $v_n$ to the right of $s_n$. Integrating (2.3) from $b_n$ to $z_n$ we get
$$-U_{\varepsilon_n}(z_n)\frac{v_n(z_n)^2}{2} + \frac{|v_n(z_n)|^{p+1}}{p+1} + U_{\varepsilon_n}(b_n)\frac{v_n(b_n)^2}{2} - \frac{|v_n(b_n)|^{p+1}}{p+1} = -\int_{b_n}^{s_n} U_{\varepsilon_n}'(r)\frac{v_n^2(r)}{2}\, dr - \int_{s_n}^{z_n} U_{\varepsilon_n}'(r)\frac{v_n^2(r)}{2}\, dr.$$
Since the right-hand side here is bounded below, we see that our assumption implies that $|v_n(z_n)| \to \infty$, which is a contradiction. If we have $v_n(s_n) > 0$, we repeat the same argument. Our conclusion is that $v_n$ is unbounded in $[s_n, \infty)$. Let $\bar r > 0$ be such that $U'(r) < 0$ for all $r \in (0, \bar r)$; then $U_{\varepsilon_n}'(r) < 0$ in $(s_n, \bar r)$ if $n$ is large enough. Let $z_n$ be the first critical point of $v_n$ to the right of $s_n$; then integrating (2.3) between $z_n$ and $\bar r$ we get
$$-U_{\varepsilon_n}(\bar r)\frac{v_n(\bar r)^2}{2} + \frac{|v_n(\bar r)|^{p+1}}{p+1} + U_{\varepsilon_n}(z_n)\frac{v_n(z_n)^2}{2} - \frac{|v_n(z_n)|^{p+1}}{p+1} = -\int_{z_n}^{\bar r} U_{\varepsilon_n}'(r)\frac{v_n^2(r)}{2}\, dr.$$
By Proposition 3.1, $v_n(\bar r)$ is bounded and we see that the right-hand side is bounded below. We conclude that $v_n(z_n)$ is bounded. But then $v_n$ is bounded in $(s_n, \bar r)$, using Lemma 2.2, completing the proof. We are left with the case $V$ positive, which follows directly from Lemma 2.2 since $U_{\varepsilon_n}$ is increasing.
Remark 3.1. From this proposition there exists $C > 0$ such that
$$|u_n(r)| \le \frac{C}{r^\alpha} \quad\text{for all } r > 0,$$
proving the first part of Corollary 1.1.

4. Zeroes and Critical Points are Dense

In this section we study the behavior of zeroes and critical points of the sequence $u_n$ as $n$ goes to infinity. Let us consider now the number $\bar d = \liminf_{n\to\infty} y_{n,1}$, where $y_{n,1}$ is the rightmost zero of $u_n$.

Proposition 4.1. For every interval $(a, b) \subset (0, \bar d)$, with $a < b$, there exists $n_0 \in \mathbb{N}$ such that $(a, b)$ contains at least one zero of $u_n$, for all $n \ge n_0$.

Proof. We first prove the proposition in the case $(a, b) \subset (0, d)$. Let us assume the result is not true. We can assume that $u_n(r) > 0$ in $(a, b)$. We first analyze the case when, up to a sub-sequence, $u_n$ does not have a critical point in $[a, b]$. Let us consider the case $u_n'(a) < 0$ for all $n \in \mathbb{N}$. Then, since
$$\varepsilon_n^2\frac{d}{dr}\left(r^{N-1}u_n'\right) = r^{N-1}\left(V(r) - |u_n|^{p-1}\right)u_n$$
and $V$ is negative in $(a, b)$, we see that $r^{N-1}u_n'(r) < a^{N-1}u_n'(a)$ for all $r \in (a, b)$. And then
$$u_n(a) = u_n(b) + \int_a^b -u_n'\, dr \ge a^{N-1}|u_n'(a)|\int_a^b r^{1-N}\, dr,$$
which implies that $u_n'(a)/u_n(a)$ is bounded. Let us define $m_n = u_n(a)$ and $w_n(z) = u_n(a + \varepsilon_n z)/m_n$, where we assume that $m_n$ converges, up to a sub-sequence, to $m \ge 0$. The functions $w_n$ satisfy
$$\frac{d}{dz}\left((a + \varepsilon_n z)^{N-1}w_n'\right) = (a + \varepsilon_n z)^{N-1}\left(V(a + \varepsilon_n z) - m_n^{p-1}|w_n|^{p-1}\right)w_n. \tag{4.1}$$
Since $w_n$ is uniformly bounded to the right of 0 and $w_n'(0) = \varepsilon_n u_n'(a)/u_n(a)$ converges to zero, integrating (4.1) between zero and $z > 0$ we see that the functions $(\varepsilon_n^{-1}a + z)^{-1}w_n'(z)$ are locally uniformly bounded. Then we can prove that $w_n$ converges, up to a sub-sequence, to the solution of
$$w'' - V(a)w + m^{p-1}|w|^{p-1}w = 0 \quad\text{and}\quad w(0) = 1, \quad w'(0) = 0, \tag{4.2}$$
which is periodic with zeroes. This is impossible.

On the other hand, if for some sub-sequence we have $u_n'(a) > 0$, then from the equation we see that $u_n'' < 0$ in $(a, b)$ and then $u_n'(r) > u_n'(b)$ for all $r \in (a, b)$. Thus
$$u_n(b) = u_n(a) + \int_a^b u_n'\, dr \ge (b - a)u_n'(b),$$
and then $u_n'(b)/u_n(b)$ is bounded. Re-scaling $u_n$ as before, but around $b$, we reach again a contradiction.
Finally, if there is a sequence $x_n \in [a, b]$ with $u_n'(x_n) = 0$, then $x_n$ is a maximum of $u_n$ and if we define $m_n = u_n(x_n)$ and $w_n(z) = u_n(x_n + \varepsilon_n z)/m_n$, we can prove, using the argument as before, that there exist $\bar x \in [a, b]$ and $m \ge 0$ such that $w_n$ converges, up to a sub-sequence, to the solution of (4.2), but with $a$ replaced by $\bar x$. This is impossible again.

To end we consider the case $(a, b) \subset (d, \bar d)$. We observe that Lemma 3.1 implies that for any sequence of two consecutive zeroes $a_n < b_n$ of $u_n$ with $\liminf_{n\to\infty} b_n \ge b$, we have $b_n - a_n \to 0$. We may assume that $y_{n,1} > b$ and, taking $b_n$ as the first zero of $u_n$ to the right of $b$, we see then that $a_n \in (a, b)$ for $n$ large enough.

5. The Envelope Function

In this section we construct the envelope function associated to the sequence of solutions $(\varepsilon_n, u_n)$ under study. We obtain this function as the limit of piece-wise linear functions joining the peaks of the functions $u_n$. We start with some qualitative results that we need next. It will be convenient to consider the trivial envelope, which is given by
\[ e_0(r) = \left( \frac{p+1}{2}\, V(r) \right)^{\frac{1}{p-1}}, \tag{5.1} \]
for $r \ge d$ and $e_0(r) = 0$ for $r < d$. We can easily check that this function satisfies (1.9) for $r > d$. In the next two lemmas we analyze the behavior of $u_n$ in relation to $e_0$.

Lemma 5.1. Let $x_n$ be a point of maximum for $|u_n|$ for $n \in \mathbb{N}$, and assume that $x_n \to \bar x$; then $\liminf_{n\to\infty} |u_n(x_n)| \ge e_0(\bar x)$.

Proof. If $\bar x > d$ then the result is a consequence of (2.5), which implies
\[ \frac{C_2\,\varepsilon_n^2}{x_n^2} + \frac{2}{p+1}\,|u_n(x_n)|^{p-1} \ge V(x_n). \]

In what follows we assume, taking a sub-sequence if necessary, that $x_{n,1}$ converges to $\bar d$. We have

Lemma 5.2. If $\bar d > 0$ then
\[ \lim_{n\to\infty} |u_n(x_{n,1})| = e_0(\bar d). \]
Proof. Without loss of generality, we may assume that $u_n(x_{n,1}) > 0$. From the proof of Proposition 4.1 we know that $\bar d \ge d$. If $\bar d > d$, from the proof of Lemma 3.1 we have that the sequence
\[ w_n(z) = \frac{u_n\bigl(x_{n,1} + \varepsilon_n m_n^{(1-p)/2} z\bigr)}{m_n}, \]
with $m_n = u_n(x_{n,1})$, converges to the solution of $E\bigl(\tfrac{2}{p+1}\bigr)$. This implies that $V_n(0) = V(x_{n,1})/m_n^{p-1}$ converges to $\tfrac{2}{p+1}$, and then the result follows.
Semi-Classical Limit for Radial Non-Linear Schr¨odinger Equation
427
If $\bar d = d$ and if, up to a sub-sequence, we have that $\lim_{n\to\infty} u_n(x_{n,1}) > 0$, then the sequence $w_n(z) = u_n(x_{n,1} + \varepsilon_n z)$ converges to the solution of
\[ w'' + |w|^{p-1} w = 0, \quad\text{and}\quad w(0) = \lim_{n\to\infty} u_n(x_{n,1}), \quad w'(0) = 0. \]
Since this solution is periodic with zeroes, we reach a contradiction. Thus, we conclude that $\lim_{n\to\infty} u_n(x_{n,1}) = e_0(d) = 0$.

Next we study the behavior of the critical points of $u_n$ in $(0, d)$. It will be useful to consider the functions $v_n$.

Lemma 5.3. Assume that $V$ is negative near the origin. Given $r_0 \in (0, d)$, let $x_n \ge r_0$ be a critical point of $v_n$, $n \in \mathbb{N}$. If $v_n(x_n) \to 0$ then $v_n(z_n) \to 0$ for any sequence $z_n$ of critical points of $v_n$ such that $z_n \ge r_0$ and $\limsup_{n\to\infty} z_n \le d$.

Proof. Let $b_0 < d$ be such that $U_{\varepsilon_n}' > 0$ in $[b_0, d]$ for $n$ sufficiently large. Let $z_n$ be as in the lemma and assume that $x_n \in [b_0, d]$ and $v_n(x_n) \to 0$. We claim that $v_n(z_n) \to 0$. If $z_n \le x_n$, from Lemma 2.2 we have $|v_n(z_n)| \le |v_n(x_n)|$, and if $x_n \le z_n$ then from (2.3) and, since $U_{\varepsilon_n}' > 0$ in $(x_n, z_n)$, we find
\[ -U_{\varepsilon_n}(z_n)\frac{v_n(z_n)^2}{2} + \frac{|v_n(z_n)|^{p+1}}{p+1} \le -U_{\varepsilon_n}(x_n)\frac{v_n(x_n)^2}{2} + \frac{|v_n(x_n)|^{p+1}}{p+1}. \]
In both cases it follows that $v_n(z_n) \to 0$, proving the claim.

Next we show the result when $x_n, z_n \in [r_0, b_0]$. We observe that there exist constants $m, M > 0$ so that $-U_{\varepsilon_n}(r)/2 \ge m$ and $|U_{\varepsilon_n}'(r)/2| \le M$ in $[r_0, b_0]$, for all $n$ large. For $s \in [r_0, b_0]$, from (2.3) we have
\[ \varepsilon_n^2\frac{v_n'(s)^2}{2} - U_{\varepsilon_n}(s)\frac{v_n(s)^2}{2} + \frac{|v_n(s)|^{p+1}}{p+1} = -U_{\varepsilon_n}(x_n)\frac{v_n(x_n)^2}{2} + \frac{|v_n(x_n)|^{p+1}}{p+1} - \int_{x_n}^{s} U_{\varepsilon_n}'\frac{v_n^2}{2}\,dr, \]
from where we obtain that
\[ m\, v_n(s)^2 \le -U_{\varepsilon_n}(x_n)\frac{v_n(x_n)^2}{2} + \frac{|v_n(x_n)|^{p+1}}{p+1} + M\left| \int_{x_n}^{s} v_n^2\,dr \right|. \]
Using Gronwall's inequality we find a constant $C > 0$ such that
\[ v_n(s)^2 \le C\left( -U_{\varepsilon_n}(x_n)\frac{v_n(x_n)^2}{2} + \frac{|v_n(x_n)|^{p+1}}{p+1} \right), \]
for all $s \in [r_0, b_0]$. From here it follows that $v_n(r) \to 0$ uniformly in $[r_0, b_0]$. The conclusion in the general case follows from the fact that the critical points of $v_n$ are densely distributed in $[0, d]$.

Corollary 5.1. In case $V$ is negative near the origin, assume that $x_n$ is a sequence of critical points of $u_n$ such that $x_n \to \bar x \in (r_0, d)$ and
\[ \liminf_{n\to\infty} |u_n(x_n)| > 0. \]
Then there exists a constant $C > 0$ such that $|u_n(z_n)| > C$ for any sequence $z_n$ of critical points of $v_n$ such that $z_n \ge r_0$ and $\limsup_{n\to\infty} z_n \le d$. Moreover, $u_n$ possesses a zero between any pair of consecutive critical points of $u_n$, for all $n \in \mathbb{N}$ sufficiently large.
Proof. In view of Lemma 5.3, we only need to prove that $u_n$ does not have positive minima nor negative maxima. Since $V \le 0$ in $[0, d]$ and in view of Lemma 2.1, we just need to rule out the possibility of a sequence $y_n \to d$ of positive minima of $|u_n|$. Let $a_n < b_n$ be consecutive zeroes of $u_n$ such that $y_n \in [a_n, b_n]$ and let $x_n$ be the point where $|u_n|$ reaches its maximum in $(a_n, b_n)$. Considering the sequence $w_n(z) = u_n(x_n + \varepsilon_n z)$, which converges, up to a sub-sequence, to the solution of
\[ w'' + |w|^{p-1} w = 0, \quad\text{and}\quad w(0) = \lim_{n\to\infty} u_n(x_n) \neq 0, \quad w'(0) = 0, \]
which is periodic with zeroes and does not have positive minima, nor negative maxima, we conclude the proof.

Now we are prepared to define the approximate envelope in a precise way. Let us assume for the moment that the hypotheses of Corollary 5.1 hold and let us define the function $e_n$ as
\[ e_n(r) = |u_n(x_{n,k+1})| + \frac{|u_n(x_{n,k})| - |u_n(x_{n,k+1})|}{x_{n,k} - x_{n,k+1}}\,(r - x_{n,k+1}), \qquad r \in [x_{n,k+1}, x_{n,k}], \tag{5.2} \]
where $x_{n,1} > \dots > x_{n,s(n)}$ are the critical points of $u_n$. To extend $e_n$ to $[0, \infty)$, we notice that $e_0$ is of class $C^1$ in $[d, \infty)$, $x_{n,1} \to \bar d$ and $|u_n(x_{n,1})| \to e_0(\bar d)$; thus we can find a sequence $x_{n,0}$ such that $x_{n,0} > x_{n,1}$, $x_{n,0} - x_{n,1} \to 0$ and
\[ \frac{e_0(x_{n,0}) - |u_n(x_{n,1})|}{x_{n,0} - x_{n,1}} \tag{5.3} \]
is bounded. We extend $e_n$ to the right of $x_{n,1}$ as
\[ e_n(r) = |u_n(x_{n,1})| + \frac{e_0(x_{n,0}) - |u_n(x_{n,1})|}{x_{n,0} - x_{n,1}}\,(r - x_{n,1}), \]
in $[x_{n,1}, x_{n,0}]$ and as $e_0$ in $[x_{n,0}, \infty)$. Now an important conclusion:

Theorem 5.1. Under the hypotheses of Theorem 1.2, the sequence $e_n$ converges, up to a sub-sequence, locally uniformly in $\mathbb{R}_+$ to a function $e$ which is a solution to the envelope equation (1.9).

Proof. Let us assume first that there is a constant $C > 0$ such that $|u_n(x_{n,k})| \ge C$ for all $n, k$ and let $r_0 > 0$. Multiplying (2.1) by $u'$ we find
\[ \frac{d}{dr}\left( \varepsilon^2\frac{|u'|^2}{2} - V(r)\frac{u^2}{2} + \frac{|u|^{p+1}}{p+1} \right) = -\varepsilon^2\frac{N-1}{r}|u'|^2 - V'(r)\frac{u^2}{2}. \tag{5.4} \]
Let $x_{n,k}$ and $x_{n,k+1}$ be two consecutive critical points of $u_n$. Integrating (5.4) for $(\varepsilon_n, u_n)$ between $x_{n,k+1}$ and $x_{n,k}$ we obtain
\[ \frac{h_2^{p+1}}{p+1} - \frac{h_1^{p+1}}{p+1} - V(x_{n,k})\frac{h_2^2}{2} + V(x_{n,k+1})\frac{h_1^2}{2} = -\int_{x_{n,k+1}}^{x_{n,k}} \left( \varepsilon_n^2\frac{N-1}{r}|u_n'|^2 + V'(r)\frac{u_n^2}{2} \right) dr, \]
where $h_1 = |u_n(x_{n,k+1})|$ and $h_2 = |u_n(x_{n,k})|$. By the Mean Value Theorem we find $\xi_{n,k} \in (h_1, h_2)$ such that $h_2^{p+1} - h_1^{p+1} = (p+1)\,\xi_{n,k}^p\,(h_2 - h_1)$, and then
\[ \frac{h_2 - h_1}{x_{n,k} - x_{n,k+1}} = \frac{N_n}{D_n}, \tag{5.5} \]
where
\[ N_n = \frac{h_1^2\,\bigl(V(x_{n,k}) - V(x_{n,k+1})\bigr)}{2\,(x_{n,k} - x_{n,k+1})} - \frac{1}{x_{n,k} - x_{n,k+1}} \int_{x_{n,k+1}}^{x_{n,k}} \left( \varepsilon_n^2\frac{N-1}{r}|u_n'|^2 + V'(r)\frac{u_n^2}{2} \right) dr \tag{5.6} \]
and
\[ D_n = \xi_{n,k}^p - V(x_{n,k})\,\frac{h_1 + h_2}{2}. \]
It is clear that for all $x_{n,k+1} \ge r_0$, both $N_n$ and $D_n$ are bounded. On the other hand, from Lemma 5.1 and under our assumption on the local maximum values of $u_n$, the denominator $D_n$ is bounded away from zero uniformly for $0 \le k \le s(n)$. By our choice of $x_{n,0}$, it is also clear that the right-hand side of (5.5) is bounded for $k = 1$. Thus, the sequence $e_n$ is uniformly bounded and equicontinuous over $[r_0, \infty)$. The application of the Arzelà-Ascoli Theorem gives that $e_n$ converges, up to a sub-sequence. Since $r_0$ is arbitrary, $e_n$ converges locally uniformly in $\mathbb{R}_+$ to a function $e$.

We define the functions $f_n : \mathbb{R}_+ \to \mathbb{R}$ as the right-hand side of (5.5) for $r \in [x_{n,k+1}, x_{n,k})$, $k = 0, \dots, s(n) - 1$, as (5.3) if $r \in [x_{n,1}, x_{n,0})$ and simply as $H(r, e_0(r))$ if $r \in [x_{n,0}, \infty)$. In what follows we prove that $f_n$ converges point-wise to $H(r, e(r))$ in $\mathbb{R}_+$.

Given $r \in (0, \bar d)$, we let $x_n^- = x_{n,k(n)+1} \le r$ and $x_n^+ = x_{n,k(n)} \ge r$ be the extreme points of $u_n$ closest to $r$. By Proposition 4.1 we see that $x_n^-, x_n^+ \to r$ and $e_n(x_n^-), e_n(x_n^+) \to e(r)$. Then we have
\[ \lim_{n\to\infty} h_{n,1}^2\,\frac{V(x_n^+) - V(x_n^-)}{x_n^+ - x_n^-} = e(r)^2\, V'(r) \]
and
\[ \lim_{n\to\infty} \left( \xi_n^p - V(x_n^+)\,\frac{h_{n,1} + h_{n,2}}{2} \right) = e(r)^p - V(r)\,e(r), \]
where $h_{n,1} = |u_n(x_n^-)|$, $h_{n,2} = |u_n(x_n^+)|$ and $\xi_n = \xi_{n,k(n)}$.

Next we consider the integral term in (5.6). We let $w_n(y) = u_n(x_n^- + \varepsilon_n y)$ and we assume that $x_n^-$ is a maximum point of $u_n$. Then $w_n$ converges to $w(y) = w(r, e(r); y)$, defined as the solution of (1.5). Now we have to distinguish two cases. First, if $r \in (0, d]$, then $V(r) \le 0$, $w$ is periodic with zeroes and $(x_n^+ - x_n^-)/\varepsilon_n$ converges to $2T(r, e(r))$. Then, re-scaling we get
\[ \lim_{n\to\infty} \frac{1}{x_n^+ - x_n^-} \int_{x_n^-}^{x_n^+} \frac{\varepsilon_n^2}{r}\,|u_n'|^2\,dr = \frac{1}{T(r, e(r))} \int_0^{T(r,e(r))} \frac{|w'|^2}{r}\,dy \]
and
\[ \lim_{n\to\infty} \frac{1}{x_n^+ - x_n^-} \int_{x_n^-}^{x_n^+} V'(r)\,u_n^2\,dr = \frac{1}{T(r, e(r))} \int_0^{T(r,e(r))} V'(r)\,w^2\,dy. \]
Second, if $r \in (d, \bar d]$, by Lemma 5.1 we have that $e(r) \ge e_0(r)$. If $e(r) > e_0(r)$ then the situation is as before. If $e(r) = e_0(r)$ then $w$ is positive and decays exponentially. This implies that
\[ \lim_{n\to\infty} \frac{1}{x_n^+ - x_n^-} \int_{x_n^-}^{x_n^+} V'(r)\,u_n^2\,dr = \lim_{n\to\infty} \frac{1}{x_n^+ - x_n^-} \int_{x_n^-}^{x_n^+} \frac{\varepsilon_n^2}{r}\,|u_n'|^2\,dr = 0. \]
Thus we have that for $r \in (0, \bar d]$,
\[ \lim_{n\to\infty} f_n(r) = \frac{\dfrac{V'(r)}{2}\bigl( e(r)^2 - Q(r, e(r)) \bigr) - \dfrac{N-1}{T(r, e(r))} \displaystyle\int_0^{T(r,e(r))} \frac{|w'|^2}{r}\,dy}{e(r)^p - V(r)\,e(r)}, \tag{5.7} \]
where $w(\cdot) = w(r, e(r); \cdot)$. We see that the right-hand side corresponds exactly to $H(r, e(r))$. In fact, multiplying (1.5) by $w'$ and by $w$, after some computations we obtain
\[ \frac{1}{T(r, s)} \int_0^{T(r,s)} |w'|^2\,dy = V(r)\bigl( Q(r, s) - s^2 \bigr) - \frac{2}{p+1}\bigl( R(r, s) - s^{p+1} \bigr) \]
and
\[ \frac{1}{T(r, s)} \int_0^{T(r,s)} |w'|^2\,dy = -V(r)\,Q(r, s) + R(r, s), \]
respectively, from where
\[ \frac{1}{T(r, s)} \int_0^{T(r,s)} |w'|^2\,dy = \frac{1}{p+3}\Bigl( (p-1)\,V(r)\,Q(r, s) - (p+1)\,V(r)\,s^2 + 2 s^{p+1} \Bigr). \]
Replacing this in (5.7) we conclude. For $r > \bar d$ it is direct from the definition of $e_0$. Next, testing against a compactly supported smooth function, we can show that $e$ is a weak solution of (1.9), which is $C^1$ since $H$ is a continuous function in $\{(r, s) \,/\, r, s \in \mathbb{R}_+,\ s \ge e_0(r)\}$, as can be easily checked. We have concluded the proof in case $|u_n(x_{n,k})| \ge C > 0$ for all $n, k$. If this is not the case, we know by Corollary 5.1 that $u_n$ converges locally uniformly to zero in $(0, d)$, which implies that $e_n$ converges to the trivial envelope $e_0$. Here we remark that in the definition of $e_n$, we may take as $x_{n,k}$ a maximum point of $u_n$ in $[y_{n,k+1}, y_{n,k}]$, which may not be unique. In any case $e_n$ converges to $e_0$.
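The construction of the approximate envelope in (5.2) can be illustrated numerically. The following is only a sketch under stated assumptions: the oscillating profile cos(r/eps)/r is a hypothetical stand-in for a solution u_n (it is not a solution of the equation), and the names peak_indices, envelope and sample_envelope are illustrative. It locates the grid maxima of |u| and joins the peaks by straight segments, as (5.2) does.

```python
import math
from bisect import bisect_right


def peak_indices(values):
    """Indices of interior grid points where |values| has a local maximum."""
    a = [abs(y) for y in values]
    return [i for i in range(1, len(a) - 1)
            if a[i] > a[i - 1] and a[i] >= a[i + 1]]


def envelope(peaks_x, peaks_h, r):
    """Piece-wise linear interpolation through the peaks, in the spirit of (5.2).

    peaks_x must be increasing and r should lie in [peaks_x[0], peaks_x[-1]].
    """
    k = max(0, min(bisect_right(peaks_x, r) - 1, len(peaks_x) - 2))
    x0, x1 = peaks_x[k], peaks_x[k + 1]
    h0, h1 = peaks_h[k], peaks_h[k + 1]
    return h0 + (h1 - h0) / (x1 - x0) * (r - x0)


def sample_envelope(eps=0.05, n=2000):
    """Peaks of the hypothetical profile u(r) = cos(r/eps)/r on [1, 3]."""
    xs = [1.0 + 2.0 * i / n for i in range(n + 1)]
    us = [math.cos(x / eps) / x for x in xs]
    idx = peak_indices(us)
    return [xs[i] for i in idx], [abs(us[i]) for i in idx]
```

For this toy profile the peak heights stay close to 1/r, so the interpolated function approximates the curve 1/r that plays the role of the envelope.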
6. Characterizing the Envelope

In this section we complete the proof of Theorem 1.2. We already have a limiting envelope, but we do not know its uniqueness. We show in what follows that $e$ can be characterized by means of an asymptotic energy involving the function $R(r, s)$.

Proposition 6.1. Let $(\varepsilon_n, u_n)$ be a sequence of solutions of (2.1) with $\varepsilon_n \to 0$ and $J_{\varepsilon_n}(u_n) = c$. If $e$ is the limiting envelope found in Sect. 5 then
\[ \lim_{n\to\infty} \int_a^b |u_n|^{p+1} r^{N-1}\,dr = \int_a^b R(r, e(r))\, r^{N-1}\,dr, \]
for all $a, b \in \mathbb{R}_+$.

Proof. We first observe that $\sigma(r) := R(r, e(r))\, r^{N-1}$ is uniformly continuous in $[a, b]$, that is, given $\varepsilon > 0$ there exists $\delta > 0$ such that $x, y \in [a, b]$ and $|x - y| < \delta$ implies $|\sigma(x) - \sigma(y)| < \varepsilon$. Let $x_n^-, x_n^+$ be two consecutive extreme points of $u_n$ converging to $\bar r$; then we have
\[ \lim_{n\to\infty} \frac{1}{x_n^+ - x_n^-} \int_{x_n^-}^{x_n^+} |u_n|^{p+1} r^{N-1}\,dr = \sigma(\bar r). \tag{6.1} \]
Consider a partition $I_1, \dots, I_k$ of $[a, b]$ such that $|I_i| < \delta$ and let $r_i$ be the mid-point of $I_i$ for all $i = 1, \dots, k$. Then, by uniform continuity of $\sigma$ we have
\[ \left| \frac{1}{x_n^+ - x_n^-} \int_{x_n^-}^{x_n^+} |u_n|^{p+1} r^{N-1}\,dr - \sigma(r_i) \right| < \varepsilon, \tag{6.2} \]
for every pair of extreme points $x_n^-, x_n^+$ of $u_n$ in $I_i$, $i = 1, \dots, k$, and $n$ large enough. This implies that
\[ \left| \sum_{i=1}^{k} \sigma(r_i)\,|I_i| - \int_a^b |u_n|^{p+1} r^{N-1}\,dr \right| \le \varepsilon\,(b - a) + o(1), \]
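where $o(1) \to 0$ when $n \to \infty$. Since $\varepsilon$ is arbitrary and $\sigma$ is continuous, we conclude the proof.

To complete our arguments we need the monotonicity of $R(r, s)$. We have

Proposition 6.2. $R(r, s)$ is strictly increasing as a function of $s$.

Proof. By conservation of energy in Eq. (1.5) we have
\[ \int_0^{T(r,s)} |w(y)|^{p+1}\,dy = \sqrt{\frac{p+1}{s^{p-1}}}\; s^{p+1} \int_0^1 G(t, \lambda)\, t^{p+1}\,dt, \]
and then
\[ R(r, s) = s^{p+1}\, \frac{\int_0^1 G(t, \lambda)\, t^{p+1}\,dt}{\int_0^1 G(t, \lambda)\,dt}, \]
where
\[ G(t, \lambda) = \frac{1}{\sqrt{1 - t^{p+1} - \lambda\,(1 - t^2)}} \quad\text{and}\quad \lambda = \frac{(p+1)\,V}{2 s^{p-1}}. \]
If $V(r) = 0$ then $R$ is increasing in $s$ since
\[ \frac{\partial}{\partial s} R(r, s) = \frac{p+1}{s}\, R(r, s) > 0. \]
In case $V(r) \neq 0$, differentiating we get
\[ \frac{\partial}{\partial s} R(r, s) = \frac{p+1}{s}\, R(r, s) + \left( s^{p+1}\, \frac{\int_0^1 G'(t, \lambda)\, t^{p+1}\,dt}{\int_0^1 G(t, \lambda)\,dt} - R(r, s)\, \frac{\int_0^1 G'(t, \lambda)\,dt}{\int_0^1 G(t, \lambda)\,dt} \right) \frac{d\lambda}{ds}, \]
where $G'$ is the partial derivative of $G$ with respect to $\lambda$. If $V(r) < 0$, then $\lambda < 0$ and $d\lambda/ds = -(p-1)\lambda/s > 0$. Thus, since $G' > 0$, we just need to prove that
\[ D(\lambda) = \frac{p+1}{s} - \frac{d\lambda}{ds}\, \frac{\int_0^1 G'(t, \lambda)\,dt}{\int_0^1 G(t, \lambda)\,dt} > 0. \]
To do so, we notice that
\[ \frac{G'(t, \lambda)}{G(t, \lambda)} = \frac{1}{2\bigl( (1 - t^{p+1})/(1 - t^2) - \lambda \bigr)} < -\frac{1}{2\lambda}, \]
and then
\[ D(\lambda) > \frac{p+1}{s} - \frac{(p-1)\lambda}{s}\cdot\frac{1}{2\lambda} = \frac{p+3}{2s} > 0. \]
If $V(r) > 0$, then we have $\lambda \in (0, 1)$ and $d\lambda/ds < 0$, and then we just need to prove that
\[ E(\lambda) = \int_0^1 G'(t, \lambda)\, t^{p+1}\,dt \int_0^1 G(t, \lambda)\,dt - \int_0^1 G(t, \lambda)\, t^{p+1}\,dt \int_0^1 G'(t, \lambda)\,dt \]
is negative. To show this we define
\[ g(t, \lambda) = \frac{G'(t, \lambda)}{G(t, \lambda)} = \frac{1}{2\bigl( (1 - t^{p+1})/(1 - t^2) - \lambda \bigr)}, \]
and we rewrite $E(\lambda)$ as
\[ E(\lambda) = \frac{1}{2} \int_0^1\!\!\int_0^1 G(t, \lambda)\, G(\tau, \lambda)\, \bigl( g(t, \lambda) - g(\tau, \lambda) \bigr)\bigl( t^{p+1} - \tau^{p+1} \bigr)\,dt\,d\tau. \]
Since $g(t, \lambda)$ is decreasing with respect to $t$, we conclude.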
With the following corollary, whose proof is a direct consequence of Proposition 6.1 and Proposition 6.2, we conclude the proof of Theorem 1.2. Corollary 6.1. The sequence en converges to the unique solution e of Eq. (1.9) satisfying the energy condition (1.10).
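The monotonicity stated in Proposition 6.2 can be checked numerically from the quadrature formula for R(r, s). The sketch below assumes G(t, λ) = (1 − t^{p+1} − λ(1 − t²))^{-1/2} with λ = (p+1)V/(2s^{p−1}), and treats the potential value V = V(r) as a plain number; the function names and the midpoint rule are illustrative choices, not part of the paper. The substitution t = 1 − x² removes the integrable singularity of G at t = 1.

```python
import math


def g_integrals(p, lam, n=20000):
    """Midpoint-rule values of (int_0^1 G t^(p+1) dt, int_0^1 G dt).

    Uses t = 1 - x**2, so that G(t, lam) * |dt/dx| stays bounded near t = 1.
    Assumes lam < 1, which keeps the radicand positive on (0, 1).
    """
    h = 1.0 / n
    num = 0.0
    den = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        t = 1.0 - x * x
        radicand = 1.0 - t ** (p + 1) - lam * (1.0 - t * t)
        weight = 2.0 * x / math.sqrt(radicand)  # G(t, lam) * |dt/dx|
        num += weight * t ** (p + 1)
        den += weight
    return num * h, den * h


def R(V, s, p):
    """R(r, s) for a fixed potential value V = V(r), as in the proof above."""
    lam = (p + 1) * V / (2.0 * s ** (p - 1))
    num, den = g_integrals(p, lam)
    return s ** (p + 1) * num / den
```

Evaluating R(V, s, p) on an increasing list of s for V negative, zero and positive (with s large enough that λ < 1) gives increasing values, in line with the proposition; for V = 0 the ratio R(0, 2s, p)/R(0, s, p) equals 2^{p+1}.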
Remark 6.1. If $\tilde e = r^{\alpha} e$, it is not hard to see that $\tilde e$ is positive at the origin. In fact, since $\bar J(e) = c > 0$, $e$ and $\tilde e$ are not trivial near the origin. This fact implies that $e$ is not bounded at zero. Actually, for a certain constant $C$ we have $e(r) \ge C r^{-\alpha}$. This in turn implies that $u_n$ is not bounded, since its critical points approach the origin. This proves the second part of Corollary 1.1.

Remark 6.2. Once we have identified the envelope $e$ we can define the asymptotic energy and mass densities $E$ and $\rho$, as in (1.11). Then, from Proposition 6.1, we see that for every $0 \le a < b \le \infty$,
\[ \lim_{n\to\infty} \int_a^b \left( \frac{\varepsilon_n^2}{2}\,|u_n'|^2 + \frac{1}{2}\,V(r)\,u_n^2 - \frac{1}{p+1}\,|u_n|^{p+1} \right) r^{N-1}\,dr = \int_a^b E(r)\,dr, \]
and similarly
\[ \lim_{n\to\infty} \int_a^b u_n^2\, r^{N-1}\,dr = \int_a^b \rho(r)\,dr. \]
7. Appendix

In this Appendix we prove the existence of solutions for (1.2) using the variational method, taking advantage of the fact that the corresponding functional is even. Our proof, written in the radial case, can be directly extended to the general $N$-dimensional case, considering some extra growth assumption for the potential at infinity. We consider the Sobolev space
\[ H = \left\{ u \in H^1(\mathbb{R}^N) \,\Big/\, \int_{\mathbb{R}^N} V_+(x)\,u^2\,dx < \infty \text{ and } u \text{ is radial} \right\}, \]
where $V_+(x) = \max\{0, V(x)\}$, endowed with the inner product
\[ \langle u, v \rangle = \int_{\mathbb{R}^N} \nabla u \cdot \nabla v + (1 + V_+(x))\,u v\,dx. \]
We denote by $\|\cdot\|$ the norm in $H$ associated with $\langle\cdot,\cdot\rangle$ and by $\|\cdot\|_q$ the usual norm of $L^q(\mathbb{R}^N)$. For functions $u$ in $H$ we define the quadratic functional $Q_\varepsilon$ as
\[ Q_\varepsilon(u) = \frac{1}{2} \int_0^\infty \left( \varepsilon^2 |u'|^2 + V(r)\,u^2 \right) r^{N-1}\,dr. \tag{7.1} \]
We will find critical points of $Q_\varepsilon$ on the sphere $S = \{u \in H \,/\, \|u\|_{p+1} = 1\}$ using standard min-max theory for even functionals. Denoting by $\gamma(A)$ the Krasnoselski genus of the closed symmetric set $A \subset S$, we define
\[ \mathcal{A}_k = \{A \subset S \,/\, A \text{ is closed and symmetric},\ \gamma(A) \ge k\} \]
and, given $k \in \mathbb{N}$, we consider the min-max value
\[ b_k(\varepsilon) = \inf_{A \in \mathcal{A}_k}\ \sup_{u \in A} Q_\varepsilon(u). \]
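Since $N \ge 2$, the Strauss Lemma guarantees the compact embedding of $H$ in $L^q(\mathbb{R}^N)$, for $1 \le q < 2N/(N-2)$ if $N \ge 3$, and for $q \ge 1$ if $N = 2$, see [22]. Thus we can apply Theorem 8.17 in [20] to obtain that each $b_k(\varepsilon)$ is a critical value of $Q_\varepsilon$ on $S$ and that
\[ \lim_{k\to\infty} b_k(\varepsilon) = \infty. \tag{7.2} \]
If $v_k^\varepsilon \in H$ is a critical point associated to $b_k(\varepsilon)$, then $u_k^\varepsilon = (2 b_k(\varepsilon))^{1/(p+1)} v_k^\varepsilon$ is a solution of (1.2) with $c_k(\varepsilon) \equiv J_\varepsilon(u_k^\varepsilon) = \frac{p-1}{p+1}\, b_k(\varepsilon)$. These values satisfy the following properties: 1) $c_k(\varepsilon)$ is a continuous function of $\varepsilon$, 2) if $k \le \ell$ then $c_k(\varepsilon) \le c_\ell(\varepsilon)$, and 3) if $\varepsilon \le \varepsilon'$ then $c_k(\varepsilon) \le c_k(\varepsilon')$. These properties and the following lemma complete the proof of Theorem 1.1.

Lemma 7.1. The critical values $c_k(\varepsilon)$ satisfy: 1) $\lim_{k\to\infty} c_k(\varepsilon) = \infty$ and 2) given $\alpha > 0$ and $k \in \mathbb{N}$, there exists $\varepsilon_k$ such that $c_k(\varepsilon_k) < \alpha$.

Proof. The proof of 1) is direct from (7.2). To prove 2) we consider a family of $k$ functions $v_1, v_2, \dots, v_k \in H$ having compact supports, disjoint from each other. We define
\[ A_k = \Bigl\{ v = \sum_{i=1}^{k} \alpha_i v_i \,\Big/\, \|v\|_{p+1} = 1,\ \alpha_1, \dots, \alpha_k \in \mathbb{R} \Bigr\}, \]
and we see that there is a constant $C_k$ so that
\[ \int_0^\infty \left( |v'|^2 + V(0)\,v^2 \right) r^{N-1}\,dr \le C_k, \quad\text{for all } v \in A_k. \]
Next we consider the set $A_k^\varepsilon = \{v_\varepsilon \,/\, v_\varepsilon(x) = \varepsilon^{-N/(p+1)} v(x/\varepsilon),\ v \in A_k\}$, which belongs to $\mathcal{A}_k$ and whose elements $v_\varepsilon \in A_k^\varepsilon$ satisfy
\[ Q_\varepsilon(v_\varepsilon) = \frac{\varepsilon^{N(p-1)/(p+1)}}{2} \int_0^\infty \left( |v'|^2 + V(\varepsilon r)\,v^2 \right) r^{N-1}\,dr \le \varepsilon^{N(p-1)/(p+1)}\, C_k, \]
for small $\varepsilon$. From here 2) follows.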
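The scaling used in the proof of Lemma 7.1 can be verified numerically. The sketch below is written under explicit assumptions: the bump v and the potential V are hypothetical examples, and the helper names are illustrative. It checks that the rescaling v_ε(r) = ε^{-N/(p+1)} v(r/ε) preserves the radial L^{p+1} norm, so that v_ε stays on the sphere S, and that the potential term picks up the factor ε^{N(p-1)/(p+1)} after the change of variables r → εr.

```python
def radial_integral(f, r_max, N, n=4000):
    """Midpoint rule for int_0^{r_max} f(r) r^(N-1) dr."""
    h = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += f(r) * r ** (N - 1)
    return total * h


def v(r):
    """Hypothetical compactly supported profile, supported in [1, 3]."""
    return max(0.0, 1.0 - (r - 2.0) ** 2) ** 2


def V(r):
    """Hypothetical smooth radial potential, used only for illustration."""
    return r * r - 1.0


def rescaled(eps, N, p):
    """The function v_eps(r) = eps^(-N/(p+1)) v(r/eps)."""
    return lambda r: eps ** (-N / (p + 1)) * v(r / eps)


def lp_mass(f, r_max, N, p):
    """int_0^{r_max} |f|^(p+1) r^(N-1) dr."""
    return radial_integral(lambda r: abs(f(r)) ** (p + 1), r_max, N)


def potential_term(f, pot, r_max, N):
    """int_0^{r_max} pot(r) f(r)^2 r^(N-1) dr."""
    return radial_integral(lambda r: pot(r) * f(r) ** 2, r_max, N)
```

With N = 3, p = 3 and ε = 0.1, lp_mass(rescaled(0.1, 3, 3), 0.3, 3, 3) agrees with lp_mass(v, 3.0, 3, 3) up to rounding, because the midpoint grids on [0, 3ε] and [0, 3] correspond under r → εr.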
Acknowledgements. The authors thank the anonymous referee for comments and criticism that led to an improved version of our original paper. The second author wants to thank Salomé Martínez and Kazunaga Tanaka for useful comments about this work.
References
1. Ambrosetti, A., Badiale, M., Cingolani, S.: Semiclassical states of nonlinear Schrödinger equations. Arch. Rat. Mech. Anal. 140, 285–300 (1997)
2. Ambrosetti, A., Malchiodi, A., Ni, W. M.: Solutions, concentrating on spheres, to symmetric singularly perturbed problems. C. R. Math. Acad. Sci. Paris 335(2), 145–150 (2002)
3. Ambrosetti, A., Malchiodi, A., Ni, W. M.: Singularly perturbed elliptic equations with symmetry: existence of solutions concentrating on spheres, Part I. Commun. Math. Phys. 235(3), 427–466 (2003)
4. Ambrosetti, A., Malchiodi, A., Ni, W. M.: Singularly perturbed elliptic equations with symmetry: existence of solutions concentrating on spheres, Part II. Ind. Univ. Math. J 53, 297–330 (2004)
5. Benci, V., D’Aprile, T.: The semiclassical limit of the nonlinear Schrödinger equation in a radial potential. J. Diff. Eqs. 184(1), 109–138 (2002)
6. D’Aprile, T.: Behaviour of symmetric solutions of a nonlinear elliptic field equation in the semiclassical limit: concentration around a circle. Electron. J. Diff. Eqs. 69, 40 (2000)
7. del Pino, M., Felmer, P.: Local mountain passes for semi-linear elliptic problems in unbounded domains. Calc. of Variations and PDE’s 4, 121–137 (1996)
8. del Pino, M., Felmer, P.: Semi-Classical States for Nonlinear Schrödinger Equations. J. Funct. Anal. 149(01), 245–265 (1997)
9. del Pino, M., Felmer, P., Tanaka, K.: An elementary construction of complex patterns in nonlinear Schrödinger equations. Nonlinearity 15(5), 1653–1671 (2002)
10. Felmer, P., Martínez, S.: High energy solutions for a phase transition problem. J. Diff. Eqs. 1, 198–220 (2003)
11. Felmer, P., Martínez, S., Tanaka, K.: High frequency chaotic solutions for a slowly varying dynamical system. Preprint.
12. Felmer, P., Torres, J.: Semi-classical limit for the one dimensional nonlinear Schrödinger equation. Commun. Contemp. Math. 4(3), 481–512 (2002)
13. Floer, A., Weinstein, A.: Non-spreading wave packets for the cubic Schrödinger equation with a bounded potential. J. Funct. Anal. 69, 397–408 (1986)
14. Kang, X., Wei, J.: On interacting bumps of semi-classical states of Nonlinear Schrödinger Equations. Adv. Diff. Eqs. 5(7–9), 899–928 (2000)
15. Malchiodi, A.: Concentration at curves for a singularly perturbed Neumann problem in three-dimensional domains. http://www.sissa.it/∼makhiod/n3k1rev.pdf, 2004
16. Malchiodi, A., Montenegro, M.: Boundary concentration phenomena for a singularly perturbed elliptic problem. Comm. Pure Appl. Math. 55(12), 1507–1568 (2002)
17. Malchiodi, A., Montenegro, M.: Multidimensional boundary layers for a singularly perturbed Neumann problem. Duke Math. J. 124(1), 105–143 (2004)
18. Malchiodi, A., Ni, W.-M., Wei, J.: Multiple clustered layer solutions for semilinear Neumann problems on a ball. Ann. de l’Inst. Henri Poincaré (C) Nonlinear Analysis, 22(2), 143–163 (2005)
19. Oh, Y.J.: Existence of semi-classical bound states of non linear Schrödinger equations with potential on the class (V)a. Comm. Partial Diff. Eq. 13, 1499–1519 (1988)
20. Rabinowitz, P.: Minimax methods in critical point theory with applications to differential equations. CBMS 65. Providence, RI: AMS, 1986
21. Rabinowitz, P.: On a class of nonlinear Schrödinger equations. Z. Angew. Math. Phys. 43, 270–291 (1992)
22. Strauss, W.: Existence of solitary waves in higher dimensions. Commun. Math. Phys. 55, 149–162 (1977)
23. Struwe, M.: Variational Methods. Berlin-Heidelberg-New York: Springer Verlag, 1980
24. Wang, X.: On concentration of positive bound states of nonlinear Schrödinger equations. Comm. Math. Phys. 153(2), 229–244 (1993)

Communicated by P. Constantin
Commun. Math. Phys. 256, 437–490 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1255-8
Communications in
Mathematical Physics
Periodic Solutions for Completely Resonant Nonlinear Wave Equations with Dirichlet Boundary Conditions
Guido Gentile¹, Vieri Mastropietro², Michela Procesi³
¹ Dipartimento di Matematica, Università di Roma Tre, 00146 Roma, Italy
² Dipartimento di Matematica, Università di Roma “Tor Vergata”, 00133 Roma, Italy
³ SISSA, 34014 Trieste, Italy
Received: 24 February 2004 / Accepted: 26 May 2004 Published online: 4 February 2005 – © Springer-Verlag 2005
Abstract: We consider the nonlinear string equation with Dirichlet boundary conditions $u_{tt} - u_{xx} = \varphi(u)$, with $\varphi(u) = \Phi u^3 + O(u^5)$ odd and analytic, $\Phi \neq 0$, and we construct small amplitude periodic solutions with frequency $\omega$ for a large Lebesgue measure set of $\omega$ close to 1. This extends previous results where only a zero-measure set of frequencies could be treated (the ones for which no small divisors appear). The proof is based on combining the Lyapunov-Schmidt decomposition, which leads to two separate sets of equations dealing with the resonant and non-resonant Fourier components, respectively the Q and the P equations, with resummation techniques of divergent power series, allowing us to control the small divisors problem. The main difficulty with respect to the nonlinear wave equations $u_{tt} - u_{xx} + Mu = \varphi(u)$, $M \neq 0$, is that not only the P equation but also the Q equation is infinite-dimensional.

1. Introduction

We consider the nonlinear wave equation in $d = 1$ given by
\[ \begin{cases} u_{tt} - u_{xx} = \varphi(u), \\ u(0, t) = u(\pi, t) = 0, \end{cases} \tag{1.1} \]
where Dirichlet boundary conditions allow us to use as a basis in $L^2([0, \pi])$ the set of functions $\{\sin mx,\ m \in \mathbb{N}\}$, and $\varphi(u)$ is any odd analytic function $\varphi(u) = \Phi u^3 + O(u^5)$ with $\Phi \neq 0$. We shall consider the problem of existence of periodic solutions for (1.1), which represents a completely resonant case for the nonlinear wave equation, as in the absence of nonlinearities all the frequencies are resonant. In the finite dimensional case the problem has its analogue in the study of periodic orbits close to elliptic equilibrium points: results of existence have been obtained in such a case by Lyapunov [31] in the non-resonant case, by Birkhoff and Lewis [6] in the case of resonances of order greater than four, and by Weinstein [37] in the case of any kind
of resonances. Systems with infinitely many degrees of freedom (such as the nonlinear wave equation, the nonlinear Schrödinger equation and other PDE systems) have been studied much more recently; the problem is much more difficult because of the presence of a small divisors problem, which is absent in the finite dimensional case. For the nonlinear wave equations $u_{tt} - u_{xx} + Mu = \varphi(u)$, with mass $M$ strictly positive, existence of periodic solutions has been proved by Craig and Wayne [14], by Pöschel [33] (by adapting the analogous result found by Kuksin and Pöschel [29] for the nonlinear Schrödinger equation) and by Bourgain [8] (see also the review [13]). In order to solve the small divisors problem one has to require that the amplitude and frequency of the solution belong to a Cantor set, and the main difficulty is to prove that such a set can be chosen with non-zero Lebesgue measure. We recall that for such systems also quasi-periodic solutions have been proved to exist in [29, 33, 9] (in many other papers the case in which the coefficient $M$ of the linear term is replaced by a function depending on parameters is considered; see for instance [36, 7] and the reviews [27, 28]). In all the quoted papers only non-resonant cases are considered. Some cases with some low-order resonances between the frequencies have been studied by Craig and Wayne [15]. The completely resonant case (1.1) was originally studied with variational methods starting from Rabinowitz [34, 35, 12, 11, 17], where periodic solutions with a period which is a rational multiple of $\pi$ have been obtained; such solutions correspond to a zero-measure set of values of the amplitudes. The case of irrational periods, which in principle could provide a large measure of values, has been mostly studied only under strong Diophantine conditions (as the ones introduced in [2]) which essentially remove the small divisors problem, leaving in fact again a zero-measure set of values [30, 3, 4].
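To make the small divisors problem concrete, recall that the linear operator $\partial_{tt} - \partial_{xx}$ acts on a Fourier mode of the form $e^{in\omega t}\sin mx$ as multiplication by $-\omega^2 n^2 + m^2$. The following sketch scans these divisors for a frequency $\omega$ with $\omega^2 = 1 - \varepsilon$; the restriction to positive labels with $n \neq m$, the cutoff n_max and the function name are illustrative assumptions, not taken from the paper.

```python
import math


def smallest_divisor(eps, n_max):
    """Smallest |(1 - eps) n^2 - m^2| over 1 <= n, m <= n_max with n != m.

    Returns the triple (value, n, m). Pairs with n == m are the resonant
    modes and are excluded, since they are treated separately.
    """
    best = (math.inf, None, None)
    omega_sq = 1.0 - eps
    for n in range(1, n_max + 1):
        for m in range(1, n_max + 1):
            if n == m:
                continue
            d = abs(omega_sq * n * n - m * m)
            if d < best[0]:
                best = (d, n, m)
    return best
```

For eps = 0 the non-resonant divisors are non-zero integers, hence bounded away from zero; for small eps > 0 they can become smaller as n_max grows, which is the reason why a Diophantine-type condition on the frequency is needed.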
It is however conjectured that also for $M = 0$ periodic solutions of (1.1) should exist for a large measure set of values of the amplitudes, see for instance [28], and indeed we prove in this paper that this is actually the case: the unperturbed periodic solutions with periods $T_j = 2\pi/j$ can be continued into periodic solutions with periods $T_{\varepsilon,j} = 2\pi/(j\sqrt{1 - \varepsilon})$, where $\varepsilon$ is a small parameter of the order of the squared amplitude of the periodic solution. In [10] existence of periodic solutions is proved for the equation $u_{tt} - u_{xx} = u^3 + F(x, u)$, with periodic boundary conditions, and with $F(x, u)$ a polynomial in $u$ with coefficients which are trigonometric polynomials in $x$. Such a problem becomes trivial when $F$ does not depend explicitly on $x$ (in [10] Wayne is credited with such an observation), for instance if $F(x, u) \equiv 0$. On the other hand, when a function $F(x, u)$ depending on $x$ is considered, the perturbation of the exactly solvable problem appears to order higher than 1 (in $\varepsilon$), and this produces a small divisor problem which is solved by imposing a Diophantine condition with an $\varepsilon$-dependent constant (see (5.35) in [10]). On the contrary, in the case of Dirichlet boundary conditions, to find a periodic solution just for the cubic equation, $u_{tt} - u_{xx} = u^3$, is non-trivial, and, as will be apparent later on, it is essentially the core of the problem. It already requires the solution of a small divisor problem: one considers the term $u^3$ as a perturbation and the problem is complicated by the fact that $u_{tt} - u_{xx}$ can be of the same order as $u^3$; in particular we must impose a Diophantine condition with an $\varepsilon$-independent constant, and this requires careful control of the small divisors. Of course the techniques used in our and Bourgain’s papers are quite different.
Bourgain uses the Craig-Wayne approach based on the method of Fröhlich and Spencer [18], while we rely on the Renormalization Group approach proposed in [23], which consists of a Lyapunov-Schmidt decomposition followed by a tree expansion of the solution (with a graphic formalism originally introduced by Gallavotti [19], inspired by Eliasson’s work
[16], for investigating the persistence of maximal KAM tori), which allows us to control the small divisors problem. As in [3] and [5] we also consider the problem of finding how many solutions can be obtained with a given period, and we study their minimal period. As a further minor difference between the present paper and [10], we mention that our solutions are analytic in space and time, while the ones found by Bourgain are $C^\infty$.

If $\varphi = 0$ every real solution of (1.1) can be written as
\[ u(x, t) = \sum_{n=1}^{\infty} U_n \sin nx\, \cos(\omega_n t + \theta_n), \tag{1.2} \]
where $\omega_n = n$ and $U_n \in \mathbb{R}$ for all $n \in \mathbb{N}$. For $\varepsilon > 0$ we set $\Phi = \sigma F$, with $\sigma = \operatorname{sgn}\Phi$ and $F > 0$, and rescale $u \to \sqrt{\varepsilon/F}\,u$ in (1.1), thus obtaining
\[ \begin{cases} u_{tt} - u_{xx} = \sigma\varepsilon u^3 + O(\varepsilon^2), \\ u(0, t) = u(\pi, t) = 0, \end{cases} \tag{1.3} \]
where $O(\varepsilon^2)$ denotes an analytic function of $u$ and $\varepsilon$ of order at least 2 in $\varepsilon$, and we define $\omega_\varepsilon = \sqrt{1 - \lambda\varepsilon}$, with $\lambda \in \mathbb{R}$, so that $\omega_\varepsilon = 1$ for $\varepsilon = 0$. As the nonlinearity $\varphi$ is odd, the solution of (1.3) can be extended in the $x$ variable to an odd $2\pi$-periodic function (even in the variable $t$). We shall consider $\varepsilon$ small and we shall show that there exists a solution of (1.3), which is $2\pi/\omega_\varepsilon$-periodic in $t$ and $\varepsilon$-close to the function
\[ u_0(x, \omega_\varepsilon t) = a_0(\omega_\varepsilon t + x) - a_0(\omega_\varepsilon t - x), \tag{1.4} \]
provided that $\varepsilon$ is in an appropriate Cantor set and $a_0(\xi)$ is the odd $2\pi$-periodic solution of the integro-differential equation
\[ \sigma\lambda\,\ddot a_0 = -3\langle a_0^2\rangle a_0 - a_0^3, \tag{1.5} \]
where the dot denotes the derivative with respect to $\xi$, and, given any periodic function $F(\xi)$ with period $T$, we denote by
\[ \langle F\rangle = \frac{1}{T} \int_0^T d\xi\, F(\xi) \tag{1.6} \]
its average. Then a $2\pi/\omega_\varepsilon$-periodic solution of (1.1) is simply obtained by scaling back the solution of (1.3). Equation (1.5) has odd $2\pi$-periodic solutions, provided that one sets $\sigma\lambda > 0$; we shall choose $\sigma\lambda = 1$ in the following. An explicit computation gives [3]
\[ a_0(\xi) = V_m\, \operatorname{sn}(\Omega_m \xi, m) \tag{1.7} \]
for $m$ a suitable negative constant ($m \approx -0.2554$), with $\Omega_m = 2K(m)/\pi$ and $V_m = \sqrt{-2m}\,\Omega_m$, where $\operatorname{sn}(\Omega_m\xi, m)$ is the sine-amplitude function and $K(m)$ is the elliptic integral of the first kind, with modulus $\sqrt{m}$ [25]; see Appendix A1 for further details. Call $2\kappa$ the width of the analyticity strip of the function $a_0(\xi)$ and $\alpha$ the maximum value it can assume in such a strip; then one has
\[ |a_{0,n}| \le \alpha\, e^{-2\kappa|n|}. \tag{1.8} \]
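Our result (including also the cases of frequencies which are multiples of $\omega_\varepsilon$) can be more precisely stated as follows.

Theorem. Consider Eq. (1.1), where $\varphi(u) = \Phi u^3 + O(u^5)$ is an odd analytic function, with $F = |\Phi| \neq 0$. Define $u_0(x, t) = a_0(t + x) - a_0(t - x)$, with $a_0(\xi)$ the odd $2\pi$-periodic solution of (1.5). There is a positive constant $\varepsilon_0$ and for all $j \in \mathbb{N}$ a set $E_j \subset [0, \varepsilon_0/j^2]$ satisfying
\[ \lim_{\varepsilon\to 0} \frac{\operatorname{meas}(E_j \cap [0, \varepsilon])}{\varepsilon} = 1, \tag{1.9} \]
such that for all $\varepsilon \in E_j$, by setting $\omega_\varepsilon = \sqrt{1 - \varepsilon}$ and
\[ \|f(x, t)\|_r = \sum_{(n,m)\in\mathbb{Z}^2} |f_{n,m}|\, e^{r(|n| + |m|)}, \tag{1.10} \]
for analytic $2\pi$-periodic functions, there exist $2\pi/(j\omega_\varepsilon)$-periodic solutions $u_{\varepsilon,j}(x, t)$ of (1.1), analytic in $(t, x)$, with
\[ \left\| u_{\varepsilon,j}(x, t) - j\sqrt{\varepsilon/F}\, u_0(jx, j\omega_\varepsilon t) \right\|_{\kappa'} \le C\, j\, \varepsilon\sqrt{\varepsilon}, \tag{1.11} \]
for some constants $C > 0$ and $0 < \kappa' < \kappa$.

Note that such a result provides a solution of the open problem 7.4 in [28], as far as periodic solutions are concerned. As we shall see, for $\varphi(u) = F u^3$ for all $j \in \mathbb{N}$ one can take the set $E = [0, \varepsilon_0]$, independently of $j$, so that for fixed $\varepsilon \in E$ no restriction on $j$ has to be imposed.

We look for a solution of (1.3) of the form
\[ \begin{aligned} u(x, t) &= \sum_{(n,m)\in\mathbb{Z}^2} e^{inj\omega t + ijmx}\, u_{n,m} = v(x, t) + w(x, t), \\ v(x, t) &= a(\xi) - a(\xi'), \qquad \xi = \omega t + x, \quad \xi' = \omega t - x, \\ a(\xi) &= \sum_{n\in\mathbb{Z}} e^{in\xi} a_n, \\ w(x, t) &= \sum_{\substack{(n,m)\in\mathbb{Z}^2 \\ |n| \neq |m|}} e^{inj\omega t + ijmx}\, w_{n,m}, \end{aligned} \tag{1.12} \]
with $\omega = \omega_\varepsilon$, such that one has $w(x, t) = 0$ and $a(\xi) = a_0(\xi)$ for $\varepsilon = 0$. Of course by the symmetry of (1.1), hence of (1.4), we can look for solutions (if any) which verify
\[ u_{n,m} = -u_{n,-m} = u_{-n,m} \tag{1.13} \]
for all $n, m \in \mathbb{Z}$. Inserting (1.12) into (1.3) gives two sets of equations, called the Q and P equations [14], which are given, respectively, by
\[ \begin{aligned} Q:\quad & n^2 a_n = [\varphi(v + w)]_{n,n}, \qquad -n^2 a_n = [\varphi(v + w)]_{n,-n}, \\ P:\quad & \left( -\omega^2 n^2 + m^2 \right) w_{n,m} = \varepsilon\, [\varphi(v + w)]_{n,m}, \qquad |m| \neq |n|, \end{aligned} \tag{1.14} \]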
where we denote by $[F]_{n,m}$ the Fourier component of the function $F(x, t)$ with labels $(n, m)$, so that
\[ F(x, t) = \sum_{(n,m)\in\mathbb{Z}^2} e^{in\omega t + imx}\, [F]_{n,m}. \tag{1.15} \]
In the same way we shall call $[F]_n$ the Fourier component of the function $F(\xi)$ with label $n$; in particular one has $[F]_0 = \langle F\rangle$. Note also that the two equations Q are in fact the same, by the symmetry property $[\varphi(v + w)]_{n,m} = -[\varphi(v + w)]_{n,-m}$, which follows from (1.13). We start by considering the case $\varphi(u) = u^3$ and $j = 1$, for simplicity. We shall discuss at the end how the other cases can be dealt with, see Sect. 8.

2. Lindstedt Series Expansion

One could try to write a power series expansion in $\varepsilon$ for $u(x, t)$, using (1.14) to get recursive equations for the coefficients. However by proceeding in this way one finds that the coefficient of order $k$ is given by a sum of terms some of which are of order $O(k!^{\alpha})$, for some constant $\alpha$. This is the same phenomenon occurring in the Lindstedt series for invariant KAM tori in the case of quasi-integrable Hamiltonian systems; in such a case however one can show that there are cancellations between the terms contributing to the coefficient of order $k$, which at the end admits a bound $C^k$, for a suitable constant $C$. On the contrary such cancellations are absent in the present case and we have to proceed in a different way, equivalent to a resummation (see [23] where such a procedure was applied to the same nonlinear wave equation with a mass term, $u_{tt} - u_{xx} + Mu = \varphi(u)$).

Definition 1. Given a sequence $\{\nu_m(\varepsilon)\}_{|m|\ge 1}$, such that $\nu_m = \nu_{-m}$, we define the renormalized frequencies as
\[ \tilde\omega_m^2 \equiv \omega_m^2 + \nu_m, \qquad \omega_m = |m|, \tag{2.1} \]
and the quantities $\nu_m$ will be called the counterterms.

By the above definition and the parity properties (1.13) the P equation in (1.14) can be rewritten as
\[ \begin{aligned} w_{n,m}\left( -\omega^2 n^2 + \tilde\omega_m^2 \right) &= \nu_m w_{n,m} + \varepsilon\,[\varphi(v + w)]_{n,m} \\ &= \nu_m^{(a)} w_{n,m} + \nu_m^{(b)} w_{n,-m} + \varepsilon\,[\varphi(v + w)]_{n,m}, \end{aligned} \tag{2.2} \]
where
\[ \nu_m^{(a)} - \nu_m^{(b)} = \nu_m. \tag{2.3} \]
With the notations of (1.15), and recalling that we are considering $\varphi(u) = u^3$, we can write
\[ \begin{aligned} \left[ (v + w)^3 \right]_{n,n} &= [v^3]_{n,n} + [w^3]_{n,n} + 3[v^2 w]_{n,n} + 3[w^2 v]_{n,n} \\ &\equiv [v^3]_{n,n} + [g(v, w)]_{n,n}, \end{aligned} \tag{2.4} \]
where, again by using the parity properties (1.13),
\[ [v^3]_{n,n} = [a^3]_n + 3\langle a^2\rangle a_n. \tag{2.5} \]
Then the first Q equation in (1.13) can be rewritten as
\[ n^2 a_n = [a^3]_n + 3\langle a^2\rangle a_n + [g(v,w)]_{n,n}, \tag{2.6} \]
so that $a_n$ is the Fourier coefficient of the $2\pi$-periodic solution of the equation
\[ \ddot a = -\left( a^3 + 3\langle a^2\rangle a + G(v,w) \right), \tag{2.7} \]
where we have introduced the function
\[ G(v,w) = \sum_{n\in\mathbb{Z}} e^{in\xi}\,[g(v,w)]_{n,n}. \tag{2.8} \]
To study Eqs. (2.2) and (2.6) we introduce an auxiliary parameter $\mu$, which at the end will be set equal to 1, by writing (2.2) as
\[ w_{n,m}\left(-\omega^2 n^2 + \tilde\omega_m^2\right) = \mu\nu_m^{(a)} w_{n,m} + \mu\nu_m^{(b)} w_{n,-m} + \mu\varepsilon[\varphi(v+w)]_{n,m}, \tag{2.9} \]
and we shall look for $u_{n,m}$ in the form of a power series expansion in $\mu$,
\[ u_{n,m} = \sum_{k=0}^{\infty} \mu^k u^{(k)}_{n,m}, \tag{2.10} \]
with $u^{(k)}_{n,m}$ depending on $\varepsilon$ and on the parameters $\nu^{(c)}_{m'}$, with $c = a, b$ and $|m'| \ge 1$. In (2.10) $k = 0$ requires $u^{(0)}_{n,\pm n} = \pm a_{0,n}$ and $u^{(0)}_{n,m} = 0$ for $|n| \neq |m|$, while for $k \ge 1$, as we shall see later on, the dependence on the parameters $\nu^{(c')}_{m'}$ will be polynomial, of the form
\[ \prod_{m'=2}^{\infty}\ \prod_{c'=a,b} \left(\nu^{(c')}_{m'}\right)^{k^{(c')}_{m'}}, \tag{2.11} \]
with $|k| = k_1^{(a)} + k_1^{(b)} + k_2^{(a)} + k_2^{(b)} + \cdots \le k - 1$. Of course we are using the symmetry property to restrict the dependence only on the positive labels $m'$.

We derive recursive equations for the coefficients $u^{(k)}_{n,m}$ of the expansion. We start from the coefficients with $|n| = |m|$. By (1.12) and (2.10) we can write
\[ a = a_0 + \sum_{k=1}^{\infty} \mu^k A^{(k)}, \tag{2.12} \]
and inserting this expression into (2.7) we obtain for $A^{(k)}$ the equation
\[ \ddot A^{(k)} = -3\left( a_0^2 A^{(k)} + \langle a_0^2\rangle A^{(k)} + 2\langle a_0 A^{(k)}\rangle a_0 \right) + f^{(k)}, \tag{2.13} \]
Nonlinear Wave Equations with Dirichlet Boundary Conditions
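with
\[ f^{(k)} = -\sum_{\substack{k_1+k_2+k_3=k \\ k_i = k \,\to\, |n_i| \neq |m_i|}}\ \sum_{\substack{n_1+n_2+n_3=n \\ m_1+m_2+m_3=m}} u^{(k_1)}_{n_1,m_1} u^{(k_2)}_{n_2,m_2} u^{(k_3)}_{n_3,m_3}, \tag{2.14} \]
where we have used the notations
\[ u^{(k)}_{n,m} = \begin{cases} v^{(k)}_{n,m}, & \text{if } |n| = |m|, \\ w^{(k)}_{n,m}, & \text{if } |n| \neq |m|, \end{cases} \tag{2.15} \]
with
\[ v^{(k)}_{n,n} = \begin{cases} A^{(k)}_n, & \text{if } k \neq 0, \\ a_{0,n}, & \text{if } k = 0, \end{cases} \qquad v^{(k)}_{n,-n} = \begin{cases} -A^{(k)}_n, & \text{if } k \neq 0, \\ -a_{0,n}, & \text{if } k = 0. \end{cases} \tag{2.16} \]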
Before studying how to find the solution of this equation we introduce some preliminary definitions. To shorten notations we write
\[ c(\xi) \equiv \mathrm{cn}(\Omega_m \xi, m), \qquad s(\xi) \equiv \mathrm{sn}(\Omega_m \xi, m), \qquad d(\xi) \equiv \mathrm{dn}(\Omega_m \xi, m), \tag{2.17} \]
and set $cd(\xi) = \mathrm{cn}(\Omega_m \xi, m)\,\mathrm{dn}(\Omega_m \xi, m)$. Moreover, given an analytic periodic function $F(\xi)$ we define
\[ P[F](\xi) = F(\xi) - \langle F\rangle, \tag{2.18} \]
and we introduce a linear operator $I$ acting on $2\pi$-periodic zero-mean functions and defined by its action on the basis $e_n(\xi) = e^{in\xi}$, $n \in \mathbb{Z}\setminus\{0\}$,
\[ I[e_n](\xi) = \frac{e_n(\xi)}{in}. \tag{2.19} \]
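The action of $P$ and $I$ in (2.18) and (2.19) is easiest to see on Fourier coefficients. The following is a minimal sketch, not part of the original construction: it assumes that a trigonometric polynomial is stored as a dictionary mapping the label $n$ to its coefficient, and the function names are hypothetical.

```python
def project_zero_mean(coeffs):
    """P[F]: remove the n = 0 Fourier mode, i.e. subtract the average <F>."""
    return {n: c for n, c in coeffs.items() if n != 0}


def integrate_zero_mean(coeffs):
    """I[F]: zero-mean primitive, acting as e_n -> e_n / (i n) for n != 0."""
    if coeffs.get(0, 0) != 0:
        raise ValueError("I is only defined on zero-mean functions")
    return {n: c / (1j * n) for n, c in coeffs.items() if n != 0}


def differentiate(coeffs):
    """d/dxi acting on Fourier coefficients: e_n -> i n e_n."""
    return {n: 1j * n * c for n, c in coeffs.items()}


# F(xi) = 2 + sin(xi) = 2 + (e^{i xi} - e^{-i xi}) / (2i)
F = {0: 2.0, 1: 1 / 2j, -1: -1 / 2j}
PF = project_zero_mean(F)        # coefficients of sin(xi)
IF = integrate_zero_mean(PF)     # coefficients of -cos(xi)
```

In this example the odd function $\sin\xi$ is mapped to the even function $-\cos\xi$, which illustrates that $I$ switches parities and returns a zero-mean primitive.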
Note that if $P[F] = F$ then $P[I[F]] = I[F]$ ($I[F]$ is simply the zero-mean primitive of $F$); moreover $I$ switches parities. In order to find an odd solution of (2.13) we first replace $\langle a_0 A^{(k)}\rangle$ with a parameter $C^{(k)}$, and we study the modified equation
\[ \ddot A^{(k)} = -3\left( a_0^2 A^{(k)} + \langle a_0^2\rangle A^{(k)} + 2C^{(k)} a_0 \right) + f^{(k)}. \tag{2.20} \]
Then we have the following result (proved in Appendix A2).

Lemma 1. Given an odd analytic $2\pi$-periodic function $h(\xi)$, the equation
\[ \ddot y = -3\left( a_0^2 + \langle a_0^2\rangle \right) y + h \tag{2.21} \]
admits one and only one odd analytic $2\pi$-periodic solution $y(\xi)$, given by
\[ y = L[h] \equiv B_m \left( \Omega_m^{-2} D_m^2\, s\,\langle s\,h\rangle + \Omega_m^{-1} D_m \left( s\,I[cd\,h] - cd\,I[P[s\,h]] \right) + cd\,I[I[cd\,h]] \right), \tag{2.22} \]
with $B_m = -m/(1-m)$ and $D_m = -1/m$.
As $a_0$ is analytic and odd, we find immediately, by induction on $k$ and using Lemma 1, that $f^{(k)}$ is analytic and odd, and that the solution of Eq. (2.20) is odd and given by
\[ \tilde A^{(k)} = L[-6C^{(k)} a_0 + f^{(k)}]. \tag{2.23} \]
The function $\tilde A^{(k)}$ thus found depends of course on the parameter $C^{(k)}$; in order to obtain $\tilde A^{(k)} = A^{(k)}$, we have to impose the constraint
\[ C^{(k)} = \langle a_0 A^{(k)}\rangle, \tag{2.24} \]
and by (2.23) this gives
\[ C^{(k)} = -6C^{(k)} \langle a_0 L[a_0]\rangle + \langle a_0 L[f^{(k)}]\rangle, \tag{2.25} \]
which can be rewritten as
\[ \left(1 + 6\langle a_0 L[a_0]\rangle\right) C^{(k)} = \langle a_0 L[f^{(k)}]\rangle. \tag{2.26} \]
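An explicit computation (see Appendix A3) gives
\[ \langle a_0 L[a_0]\rangle = \frac{1}{2} V_m^2\, \Omega_m^{-2} B_m \left[ \left( 2D_m - \frac{1}{2} \right) \langle s^4\rangle + \left( 2D_m(D_m - 1) + \frac{1}{2} \right) \langle s^2\rangle^2 \right], \tag{2.27} \]
which yields $r_0 = (1 + 6\langle a_0 L[a_0]\rangle) \neq 0$. At the end we obtain the recursive definition
\[ \begin{cases} A^{(k)} = L[f^{(k)} - 6C^{(k)} a_0], \\ C^{(k)} = r_0^{-1} \langle a_0 L[f^{(k)}]\rangle. \end{cases} \tag{2.28} \]
In Fourier space the first of (2.28) becomes
\[ \begin{aligned} A^{(k)}_n ={} & B_m \Omega_m^{-2} D_m^2\, s_n \sum_{n_1+n_2=0} s_{n_1} \left( f^{(k)}_{n_2} - 6C^{(k)} a_{0,n_2} \right) \\ & + B_m \sum^{*}_{n_1+n_2+n_3=n} \frac{1}{i^2 (n_2+n_3)^2}\, cd_{n_1} cd_{n_2} \left( f^{(k)}_{n_3} - 6C^{(k)} a_{0,n_3} \right) \\ & + B_m \Omega_m^{-1} D_m \sum^{*}_{n_1+n_2+n_3=n} \frac{1}{i(n_2+n_3)}\, s_{n_1} cd_{n_2} \left( f^{(k)}_{n_3} - 6C^{(k)} a_{0,n_3} \right) \\ & - B_m \Omega_m^{-1} D_m \sum^{*}_{n_1+n_2+n_3=n} \frac{1}{i(n_2+n_3)}\, cd_{n_1} s_{n_2} \left( f^{(k)}_{n_3} - 6C^{(k)} a_{0,n_3} \right) \\ \equiv{} & \sum_{n'} L_{n n'} \left( f^{(k)}_{n'} - 6C^{(k)} a_{0,n'} \right), \end{aligned} \tag{2.29} \]
where the constants $B_m$ and $D_m$ are defined after (2.22), and the $*$ in the sums means that one has the constraint $n_2 + n_3 \neq 0$, while the second of (2.28) can be written as
\[ C^{(k)} = r_0^{-1} \sum_{n,n'\in\mathbb{Z}} a_{0,-n} L_{n,n'} f^{(k)}_{n'}. \tag{2.30} \]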
Now we consider the coefficients $u^{(k)}_{n,m}$ with $|n| \neq |m|$. The coefficients $w^{(k)}_{n,m}$ verify the recursive equations
\[ w^{(k)}_{n,m}\left(-\omega^2 n^2 + \tilde\omega_m^2\right) = \nu_m^{(a)} w^{(k-1)}_{n,m} + \nu_m^{(b)} w^{(k-1)}_{n,-m} + \varepsilon[(v+w)^3]^{(k-1)}_{n,m}, \tag{2.31} \]
which follow from (2.9) by equating the coefficients of $\mu^k$, where
\[ [(v+w)^3]^{(k)}_{n,m} = \sum_{k_1+k_2+k_3=k}\ \sum_{\substack{n_1+n_2+n_3=n \\ m_1+m_2+m_3=m}} u^{(k_1)}_{n_1,m_1} u^{(k_2)}_{n_2,m_2} u^{(k_3)}_{n_3,m_3}, \tag{2.32} \]
if we use the same notations (2.15) and (2.16) as in (2.14).

Equations (2.29) and (2.31), together with (2.14), (2.30) and (2.32), define recursively the coefficients $u^{(k)}_{n,m}$. To prove the theorem we shall proceed in two steps. The first step consists in looking for the solution of Eqs. (2.29) and (2.31) by considering $\tilde\omega = \{\tilde\omega_m\}_{|m|\ge 1}$ as a given set of parameters satisfying the Diophantine conditions (called respectively the first and the second Mel'nikov conditions)
\[ \begin{aligned} & |\omega n \pm \tilde\omega_m| \ge C_0 |n|^{-\tau} && \forall n \in \mathbb{Z}\setminus\{0\} \text{ and } \forall m \in \mathbb{Z}\setminus\{0\} \text{ such that } |m| \neq |n|, \\ & |\omega n \pm (\tilde\omega_m \pm \tilde\omega_{m'})| \ge C_0 |n|^{-\tau} && \forall n \in \mathbb{Z}\setminus\{0\} \text{ and } \forall m, m' \in \mathbb{Z}\setminus\{0\} \text{ such that } |n| \neq |m \pm m'|, \end{aligned} \tag{2.33} \]
with positive constants $C_0$, $\tau$. We shall prove in Sects. 3 to 5 the following result.

Proposition 1. Consider a sequence $\tilde\omega = \{\tilde\omega_m\}_{|m|\ge 1}$ verifying (2.33), with $\omega = \omega_\varepsilon = \sqrt{1-\varepsilon}$ and such that $|\tilde\omega_m - |m|| \le C\varepsilon/|m|$ for some constant $C$. For all $\mu_0 > 0$ there exists $\varepsilon_0 > 0$ such that for $|\mu| \le \mu_0$ and $0 < \varepsilon < \varepsilon_0$ there is a sequence $\nu(\tilde\omega, \varepsilon; \mu) = \{\nu_m(\tilde\omega, \varepsilon; \mu)\}_{|m|\ge 1}$, where each $\nu_m(\tilde\omega, \varepsilon; \mu)$ is analytic in $\mu$, such that the coefficients $u^{(k)}_{n,m}$ which solve (2.29) and (2.31) define via (2.10) a function $u(x, t; \tilde\omega, \varepsilon; \mu)$ which is analytic in $\mu$, analytic in $(x,t)$ and $2\pi$-periodic in $t$, and solves
\[ \begin{cases} n^2 a_n = [a^3]_{n,n} + 3\langle a^2\rangle a_n + [g(v,w)]_{n,n}, \\ -n^2 a_n = [a^3]_{n,-n} + 3\langle a^2\rangle a_{-n} + [g(v,w)]_{n,-n}, \\ \left(-\omega^2 n^2 + \tilde\omega_m^2\right) w_{n,m} = \mu\nu_m(\tilde\omega, \varepsilon; \mu)\, w_{n,m} + \mu\varepsilon[\varphi(v+w)]_{n,m}, \quad |m| \neq |n|, \end{cases} \tag{2.34} \]
with the same notations as in (1.14).

If $\tau \le 2$ then one can require only the first Mel'nikov conditions in (2.33), as we shall show in Sect. 7. Then in Proposition 1 one can fix $\mu_0 = 1$, so that one can choose $\mu = 1$ and set $u(x, t; \tilde\omega, \varepsilon) = u(x, t; \tilde\omega, \varepsilon; 1)$ and $\nu_m(\tilde\omega, \varepsilon) = \nu_m(\tilde\omega, \varepsilon; 1)$.

The second step, to be proved in Sect. 6, consists in inverting (2.1), with $\nu_m = \nu_m(\tilde\omega, \varepsilon)$ and $\tilde\omega$ verifying (2.33). This requires some preliminary conditions on $\varepsilon$, given by the Diophantine conditions
\[ |\omega n \pm m| \ge C_1 |n|^{-\tau_0} \qquad \forall n \in \mathbb{Z}\setminus\{0\} \text{ and } \forall m \in \mathbb{Z}\setminus\{0\} \text{ such that } |m| \neq |n|, \tag{2.35} \]
with positive constants $C_1$ and $\tau_0 > 1$.
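A condition such as (2.35) can only be tested numerically on a finite range of $n$, so the following sketch is an illustration rather than a verification. It assumes $\omega = \sqrt{1-\varepsilon}$ as in Proposition 1 and $C_1 \le 1/2$, so that only the integers nearest to $\omega n$ can violate the bound; by the symmetries $n \to -n$, $m \to -m$ it is enough to scan positive $n$ and $m$. The function name and the sample constants are hypothetical.

```python
import math


def first_violation(eps, c1, tau0, n_max):
    """Return the first (n, m) with 1 <= n <= n_max, m >= 1, m != n and
    |omega*n - m| < c1 * n**(-tau0), or None if there is none in range.

    Assumes c1 <= 1/2, so only integers adjacent to omega*n can violate.
    """
    omega = math.sqrt(1.0 - eps)
    for n in range(1, n_max + 1):
        bound = c1 * n ** (-tau0)
        nearest = round(omega * n)
        for m in (nearest - 1, nearest, nearest + 1):
            if m >= 1 and m != n and abs(omega * n - m) < bound:
                return (n, m)
    return None


# eps = 0.75 gives omega = 1/2, which is resonant: omega * 2 = 1.
resonant = first_violation(0.75, 1e-3, 1.5, 10)
# eps = 0.01 gives a quadratic irrational omega; no violation is found here.
generic = first_violation(0.01, 1e-3, 1.5, 200)
```

A value of $\varepsilon$ for which the scan returns a pair $(n, m)$ would have to be excluded; the set of remaining values is the kind of set from which the Cantor set of the second step is built.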
This allows one to solve (2.1) iteratively, by imposing further non-resonance conditions besides (2.35), provided that one takes $C_1 = 2C_0$ and $\tau_0 < \tau - 1$, which requires $\tau > 2$. At each iterative step one has to exclude some further values of $\varepsilon$, and at the end the remaining values fill a Cantor set $E$ with large relative measure in $[0, \varepsilon_0]$ and $\tilde\omega$ verifies (2.35). If $1 < \tau \le 2$ the first Mel'nikov conditions, which, as we said above, become sufficient to prove Proposition 1, can be obtained by requiring (2.35) with $\tau_0 = \tau$; again this leaves a large measure set of allowed values of $\varepsilon$. This is discussed in Sect. 7. The result of this second step can be summarized as follows.

Proposition 2. There are $\delta > 0$ and a set $E \subset [0, \varepsilon_0]$ with a complement of relative Lebesgue measure of order $\varepsilon_0^{\delta}$ such that for all $\varepsilon \in E$ there exists $\tilde\omega = \tilde\omega(\varepsilon)$ which solves (2.1) and satisfies the Diophantine conditions (2.33) with $|\tilde\omega_m - |m|| \le C\varepsilon/|m|$ for some constant $C$.

As we said, our approach is based on constructing the periodic solution of the string equation by a perturbative expansion which is the analogue of the Lindstedt series for (maximal) KAM invariant tori in finite-dimensional Hamiltonian systems. Such an approach immediately encounters a difficulty: while the invariant KAM tori are analytic in the perturbative parameter $\varepsilon$, the periodic solutions we are looking for are not analytic; hence a power series construction seems at first sight hopeless. Nevertheless it turns out that the Fourier coefficients of the periodic solution have the form $u_{n,m}(\tilde\omega(\omega, \varepsilon), \varepsilon; \mu)$; while such functions are not analytic in $\varepsilon$, they turn out to be analytic in $\mu$, provided that $\tilde\omega$ satisfies the condition (2.33) and $\varepsilon$ is small enough; this is the content of Proposition 1. The smoothness in $\varepsilon$ at fixed $\tilde\omega$ is what allows us to write $u_{n,m}(\tilde\omega, \varepsilon; \mu)$ as a series expansion; this strategy was already applied in [23] in the massive case.

3. Tree Expansion: The Diagrammatic Rules

A (connected) graph $G$ is a collection of points (vertices) and lines connecting all of them. The points of a graph are most commonly known as graph vertices, but may also be called nodes or points. Similarly, the lines connecting the vertices of a graph are most commonly known as graph edges, but may also be called branches or simply lines, as we shall do. We denote with $P(G)$ and $L(G)$ the set of vertices and the set of lines, respectively. A path between two vertices is a subset of $L(G)$ connecting the two vertices. A graph is planar if it can be drawn in a plane without graph lines crossing (i.e. it has graph crossing number 0).

Definition 2. A tree is a planar graph $G$ containing no closed loops (cycles); in other words, it is a connected acyclic graph. One can consider a tree $G$ with a single special vertex $V_0$: this introduces a natural partial ordering on the set of lines and vertices, and one can imagine that each line carries an arrow pointing toward the vertex $V_0$. We can add an extra (oriented) line $\ell_0$ connecting the special vertex $V_0$ to another point which will be called the root of the tree; the added line will be called the root line. In this way we obtain a rooted tree $\theta$ defined by $P(\theta) = P(G)$ and $L(\theta) = L(G) \cup \ell_0$. A labeled tree is a rooted tree $\theta$ together with a label function defined on the sets $L(\theta)$ and $P(\theta)$.

Note that the definition of rooted tree given above is slightly different from the one which is usually adopted in the literature [24, 26], according to which a rooted tree is just a tree with a privileged vertex, without any extra line. However, the modified definition that we gave will be more convenient for our purposes. In the following we shall denote with the symbol $\theta$ both rooted trees and labeled rooted trees, when no confusion arises.
We shall call equivalent two rooted trees which can be transformed into each other by continuously deforming the lines in the plane in such a way that the latter do not cross each other (i.e. without destroying the graph structure). We can extend the notion of equivalence also to labeled trees, simply by considering equivalent two labeled trees if they can be transformed into each other in such a way that also the labels match.

Given two points $V, W \in P(\theta)$, we say that $W \prec V$ if $V$ is on the path connecting $W$ to the root line. We can identify a line with the points it connects; given a line $\ell = (V, W)$ we say that $\ell$ enters $V$ and comes out of $W$.

In the following we shall deal mostly with labeled trees: for simplicity, where no confusion can arise, we shall call them just trees. We consider the following diagrammatic rules to construct the trees we have to deal with; this will implicitly define also the label function.

(1) We call nodes the vertices such that there is at least one line entering them. We call end-points the vertices which have no entering line. We denote with $L(\theta)$, $V(\theta)$ and $E(\theta)$ the set of lines, nodes and end-points, respectively. Of course $P(\theta) = V(\theta) \cup E(\theta)$.

(2) There can be two types of lines, w-lines and v-lines, so we can associate with each line $\ell \in L(\theta)$ a badge label $\gamma_\ell \in \{v, w\}$ and a momentum $(n_\ell, m_\ell) \in \mathbb{Z}^2$, to be defined in item (8) below. If $\gamma_\ell = v$ one has $|n_\ell| = |m_\ell|$, while if $\gamma_\ell = w$ one has $|n_\ell| \neq |m_\ell|$. One can have $(n_\ell, m_\ell) = (0, 0)$ only if $\ell$ is a v-line. With the v-lines with $n_\ell \neq 0$ we also associate a label $\delta_\ell \in \{1, 2\}$. All the lines coming out from the end-points are v-lines with $n_\ell = 0$.

(3) With each line $\ell$ coming out from a node we associate a propagator
\[ g_\ell = g(\omega n_\ell, m_\ell) = \begin{cases} \dfrac{1}{-\omega^2 n_\ell^2 + \tilde\omega_{m_\ell}^2}, & \text{if } \gamma_\ell = w, \\[2mm] \dfrac{1}{(i n_\ell)^{\delta_\ell}}, & \text{if } \gamma_\ell = v,\ n_\ell \neq 0, \\[2mm] 1, & \text{if } \gamma_\ell = v,\ n_\ell = 0, \end{cases} \tag{3.1} \]
with momentum $(n_\ell, m_\ell)$. We can associate also a propagator with the lines $\ell$ coming out from end-points, simply by setting $g_\ell = 1$.
(4) Given any node $V \in V(\theta)$ denote with $s_V$ the number of entering lines (branching number): one can have only either $s_V = 1$ or $s_V = 3$. Also the nodes $V$ can be of w-type and v-type: we say that a node is of v-type if the line $\ell$ coming out from it has label $\gamma_\ell = v$; analogously the nodes of w-type are defined. We can write $V(\theta) = V_v(\theta) \cup V_w(\theta)$, with obvious meaning of the symbols; we also call $V_w^s(\theta)$, $s = 1, 3$, the set of nodes in $V_w(\theta)$ with $s$ entering lines, and analogously we define $V_v^s(\theta)$, $s = 1, 3$. If $V \in V_v^3(\theta)$ and two entering lines come out of end-points then the remaining line entering $V$ has to be a w-line. If $V \in V_w^1(\theta)$ then the line entering $V$ has to be a w-line. If $V \in V_v^1(\theta)$ then its entering line comes out of an end-node.

(5) With the nodes $V$ of v-type we associate a label $j_V \in \{1, 2, 3, 4\}$ and, if $s_V = 1$, an order label $k_V$, with $k_V \ge 1$. Moreover we associate with each node $V$ of v-type two mode labels $(n'_V, m'_V)$, with $m'_V = \pm n'_V$, and $(n_V, m_V)$, with $m_V = \pm n_V$, and such that one has
\[ \frac{m'_V}{n'_V} = \frac{m_V}{n_V} = \frac{\sum_{i=1}^{s_V} m_{\ell_i}}{\sum_{i=1}^{s_V} n_{\ell_i}}, \tag{3.2} \]
where $\ell_i$ are the lines entering $V$. We shall refer to them as the first mode label and the second mode label, respectively. With a node $V$ of v-type we associate also a node factor $\eta_V$ defined as
\[ \eta_V = \begin{cases} -B_m \Omega_m^{-2} D_m^2\, \mathrm{sn}_{n'_V} \mathrm{sn}_{n_V}, & \text{if } j_V = 1 \text{ and } s_V = 3, \\ -6B_m \Omega_m^{-2} D_m^2\, \mathrm{sn}_{n'_V} \mathrm{sn}_{n_V}\, C^{(k_V)}, & \text{if } j_V = 1 \text{ and } s_V = 1, \\ -B_m\, cd_{n'_V} cd_{n_V}, & \text{if } j_V = 2 \text{ and } s_V = 3, \\ -6B_m\, cd_{n'_V} cd_{n_V}\, C^{(k_V)}, & \text{if } j_V = 2 \text{ and } s_V = 1, \\ -B_m \Omega_m^{-1} D_m\, \mathrm{sn}_{n'_V} cd_{n_V}, & \text{if } j_V = 3 \text{ and } s_V = 3, \\ -6B_m \Omega_m^{-1} D_m\, \mathrm{sn}_{n'_V} cd_{n_V}\, C^{(k_V)}, & \text{if } j_V = 3 \text{ and } s_V = 1, \\ B_m \Omega_m^{-1} D_m\, cd_{n'_V} \mathrm{sn}_{n_V}, & \text{if } j_V = 4 \text{ and } s_V = 3, \\ 6B_m \Omega_m^{-1} D_m\, cd_{n'_V} \mathrm{sn}_{n_V}\, C^{(k_V)}, & \text{if } j_V = 4 \text{ and } s_V = 1. \end{cases} \tag{3.3} \]
Note that the factors $C^{(k_V)} = r_0^{-1} \langle a_0 L[f^{(k_V)}]\rangle$ depend on the coefficients $u^{(k')}_{n,m}$, with $k' < k$, so that they have to be defined iteratively. The label $\delta_\ell$ of the line $\ell$ coming out from a node $V$ of v-type is related to the label $j_V$ of $V$: if $j_V = 1$ then $n_\ell = 0$, while if $j_V > 1$ then $n_\ell \neq 0$ and $\delta_\ell = 1 + \delta_{j_V,2}$, where $\delta_{i,j}$ denotes the Kronecker delta (so that $\delta_\ell = 2$ if $j_V = 2$ and $\delta_\ell = 1$ otherwise).

(6) With the nodes $V \in V_w^1(\theta)$, called ν-vertices, we associate a label $c_V \in \{a, b\}$. With the nodes $V$ of w-type we simply associate a node factor $\eta_V$ given by
\[ \eta_V = \begin{cases} \varepsilon, & \text{if } s_V = 3, \\ \nu^{(c_V)}_{m_\ell}, & \text{if } s_V = 1. \end{cases} \tag{3.4} \]
In the latter case $(n_\ell, m_\ell)$ is the momentum of the line coming out from $V$, and if one has $c_V = a$ the momentum of the entering line is $(n_\ell, m_\ell)$, while if $c_V = b$ the momentum of the entering line is $(n_\ell, -m_\ell)$. In order to unify notations we can associate also with the nodes $V$ of w-type two mode labels, by setting $(n'_V, m'_V) = (0, 0)$ and $(n_V, m_V) = (0, 0)$.

(7) With the end-points $V$ we associate only a first mode label $(n'_V, m'_V)$, with $|m'_V| = |n'_V|$, and an end-point factor
\[ V_V = (-1)^{1+\delta_{n'_V, m'_V}}\, a_{0,n'_V} = a_{0,m'_V}. \tag{3.5} \]
The line coming out from an end-point has to be a v-line.

(8) The momentum $(n_\ell, m_\ell)$ of a line $\ell$ is related to the mode labels of the nodes preceding $\ell$; if a line $\ell$ comes out from a node $V$ one writes $\ell = \ell_V$ and sets
\[ \begin{aligned} n_\ell &= n_V + \sum_{\substack{W \in V(\theta) \\ W \prec V}} \left( n'_W + n_W \right) + \sum_{\substack{W \in E(\theta) \\ W \prec V}} n'_W, \\ m_\ell &= m_V + \sum_{\substack{W \in V(\theta) \\ W \prec V}} \left( m'_W + m_W \right) + \sum_{\substack{W \in E(\theta) \\ W \prec V}} m'_W + \sum_{\substack{W \in V_w^1(\theta) \\ c_W = b}} \left( -2 m_{\ell_W} \right), \end{aligned} \tag{3.6} \]
where the sign in $m_\ell$ is plus if $c_V = a$ and minus if $c_V = b$, and some of the mode labels can be vanishing according to the notations introduced above. If $\ell$ comes out from an end-point we set $(n_\ell, m_\ell) = (0, 0)$.
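The propagator rule (3.1) of item (3) can be written as a small function. This is only an illustrative sketch: the `Line` record and the function name are hypothetical, and the renormalized frequency $\tilde\omega_m$ is passed in as a callable, since the text treats it as a given parameter. In the usage lines we assume vanishing counterterms, so that $\tilde\omega_m = |m|$.

```python
from dataclasses import dataclass


@dataclass
class Line:
    badge: str       # "v" or "w", the badge label gamma of the line
    n: int           # first component of the momentum
    m: int           # second component of the momentum
    delta: int = 1   # label delta in {1, 2}; used only for v-lines with n != 0


def propagator(line, omega, omega_tilde):
    """Propagator of a line coming out from a node, following (3.1)."""
    if line.badge == "w":
        if abs(line.n) == abs(line.m):
            raise ValueError("a w-line requires |n| != |m|")
        return 1.0 / (-(omega * line.n) ** 2 + omega_tilde(line.m) ** 2)
    if abs(line.n) != abs(line.m):
        raise ValueError("a v-line requires |n| = |m|")
    if line.n != 0:
        return 1.0 / (1j * line.n) ** line.delta
    return 1.0


# Usage with omega = 1/2 and unrenormalized frequencies (an assumption).
g_w = propagator(Line("w", 1, 2), omega=0.5, omega_tilde=abs)
g_v = propagator(Line("v", 2, -2, delta=2), omega=0.5, omega_tilde=abs)
g_0 = propagator(Line("v", 0, 0), omega=0.5, omega_tilde=abs)
```

Only the w-line propagator can become large, when $\omega^2 n^2$ is close to $\tilde\omega_m^2$; this is the small divisor controlled by the Diophantine conditions (2.33).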
We define $\Theta^{*(k)}_{n,m}$ as the set of inequivalent labeled trees, formed by following the rules (1) to (8) given above, and with the further following constraints:

(i) if $(n_{\ell_0}, m_{\ell_0})$ denotes the momentum flowing through the root line $\ell_0$ and $(n'_{V_0}, m'_{V_0})$ is the first mode label associated with the node $V_0$ which $\ell_0$ comes out from (special vertex), then one has $n = n_{\ell_0} + n'_{V_0}$ and $m = m_{\ell_0} + m'_{V_0}$;

(ii) one has
\[ k = |V_w(\theta)| + \sum_{V \in V_v^1(\theta)} k_V, \tag{3.7} \]
with $k$ called the order of the tree.

An example of tree is given in Fig. 3.1, where only the labels v, w of the nodes have been explicitly written.

Definition 3. For all $\theta \in \Theta^{*(k)}_{n,m}$, we call
\[ \mathrm{Val}(\theta) = \Big( \prod_{\ell \in L(\theta)} g_\ell \Big) \Big( \prod_{V \in V(\theta)} \eta_V \Big) \Big( \prod_{V \in E(\theta)} V_V \Big) \tag{3.8} \]
the value of the tree $\theta$.

Then the main result about the formal expansion of the solution is provided by the following result.

Lemma 2. We can write
\[ u^{(k)}_{n,m} = \sum_{\theta \in \Theta^{*(k)}_{n,m}} \mathrm{Val}(\theta), \tag{3.9} \]
and if the root line $\ell_0$ is a v-line the tree value is a contribution to $v^{(k)}_{n,\pm n}$, while if $\ell_0$ is a w-line the tree value is a contribution to $w^{(k)}_{n,m}$. The factors $C^{(k)}$ are defined as
\[ C^{(k)} = r_0^{-1} \sum^{*}_{\theta \in \Theta^{*(k)}_{n,n}} a_{0,-n}\, \mathrm{Val}(\theta), \tag{3.10} \]
Fig. 3.1.
Fig. 3.2.
Fig. 3.3.
where the $*$ in the sum means the extra constraint $s_{V_0} = 3$ for the node immediately preceding the root (which is the special vertex of the rooted tree).

Proof. The proof is done by induction on $k$. Imagine representing graphically $a_{0,n}$ as a (small) white bullet with a line coming out from it, as in Fig. 3.2a, and $u^{(k)}_{n,m}$, $k \ge 1$, as a (big) black bullet with a line coming out from it, as in Fig. 3.2b. One should imagine that labels $k, n, m$ are associated with the black bullet representing $u^{(k)}_{n,m}$, while a white bullet representing $a_{0,n}$ carries the labels $n$, $m = \pm n$.

For $k = 1$ the proof of (3.9) and (3.10) is just a check from the diagrammatic rules and the recursive definitions (2.29) and (2.31), and it can be performed as follows.

Consider first the case $|n| \neq |m|$, so that $u^{(1)}_{n,m} = w^{(1)}_{n,m}$. By taking into account only the badge labels of the lines, by item (4) there is only one tree whose root line is a w-line, and it has one node $V_0$ (the special vertex of the tree) with $s_{V_0} = 3$, hence three end-points $V_1$, $V_2$ and $V_3$. By applying the rules listed above one obtains, for $|n| \neq |m|$,
\[ w^{(1)}_{n,m} = \frac{\varepsilon}{-\omega^2 n^2 + \tilde\omega_m^2} \sum_{\substack{n_1+n_2+n_3=n \\ m_1+m_2+m_3=m}} v^{(0)}_{n_1,m_1} v^{(0)}_{n_2,m_2} v^{(0)}_{n_3,m_3} = \sum_{\theta \in \Theta^{*(1)}_{n,m}} \mathrm{Val}(\theta), \tag{3.11} \]
where the sum is over all trees $\theta$ which can be obtained from the tree appearing in Fig. 3.3 by summing over all labels which are not explicitly written. It is easy to realize that (3.11) corresponds to (2.31) for $k = 1$. Each end-point $V_i$ is graphically a white bullet with first mode labels $(n_i, m_i)$ and second mode labels $(0, 0)$, and has associated an end-point factor $(-1)^{1+\delta_{n_i,m_i}} a_{0,n_i}$ (see (3.5) in item (7)). The node $V_0$ is represented as a (small) gray bullet, with mode labels $(0, 0)$ and $(0, 0)$, and the factor associated with it is $\eta_{V_0} = \varepsilon$ (see (3.4) in item (6)). We associate with the line $\ell$ coming out from the node $V_0$ a momentum $(n_\ell, m_\ell)$, with $n_\ell = n$, and a propagator $g_\ell = 1/(-\omega^2 n_\ell^2 + \tilde\omega_{m_\ell}^2)$ (see (3.1) in item (3)).

Now we consider the case $|n| = |m|$, so that $u^{(1)}_{n,m} = \pm A^{(1)}_n$ (see (2.16)). By taking into account only the badge labels of the lines, there are four trees contributing to $A^{(1)}_n$: they are represented by the four trees in Fig. 3.4 (the trees b and c are simply obtained from the tree a by a different choice of the w-line entering the last node).
Fig. 3.4.
In the trees of Figs. 3.4a, 3.4b and 3.4c the root line comes out from a node $V_0$ (the special vertex of the tree) with $s_{V_0} = 3$, and two of the entering lines come out from end-points: then the other line has to be a w-line (by item (4)), and (3.7) requires that the subtree which has such a line as root line is exactly the tree represented in Fig. 3.3. In the tree of Fig. 3.4d the root line comes out from a node $V_0$ with $s_{V_0} = 1$, hence the line entering $V_0$ is a v-line coming out from an end-point (again see item (4)).

By defining $\Theta^{*(1)}_{n,n}$ as the set of all labeled trees which can be obtained by assigning to the trees in Fig. 3.4 the labels which are not explicitly written, one finds
\[ A^{(1)}_n = \sum_{\theta \in \Theta^{*(1)}_{n,n}} \mathrm{Val}(\theta), \tag{3.12} \]
which corresponds to the sum of two contributions. The first one arises from the trees of Figs. 3.4a, 3.4b and 3.4c, and it is given by
\[ 3 \sum_{n' \in \mathbb{Z}} L_{n,n'} \sum_{\substack{n'_1+n'_2+n'_3=n' \\ m'_1+m'_2+m'_3=n'}} v^{(0)}_{n'_1,m'_1} v^{(0)}_{n'_2,m'_2} w^{(1)}_{n'_3,m'_3}, \tag{3.13} \]
where one has
\[ \begin{aligned} L_{n,n'} ={} & B_m \Omega_m^{-2} D_m^2 \sum_{\substack{n_1+n_2=n-n' \\ n_2=-n'}} s_{n_1} s_{n_2} + B_m \sum^{*}_{n_1+n_2=n-n'} \frac{1}{i^2 (n_2+n')^2}\, cd_{n_1} cd_{n_2} \\ & + B_m \Omega_m^{-1} D_m \sum^{*}_{n_1+n_2=n-n'} \frac{1}{i(n_2+n')}\, s_{n_1} cd_{n_2} - B_m \Omega_m^{-1} D_m \sum^{*}_{n_1+n_2=n-n'} \frac{1}{i(n_2+n')}\, cd_{n_1} s_{n_2}, \end{aligned} \tag{3.14} \]
with the $*$ denoting the constraint $n_2 + n' \neq 0$. The first and second mode labels associated with the node $V_0$ are, respectively, $(m'_{V_0}, n'_{V_0}) = (n_1, n_1)$ and $(m_{V_0}, n_{V_0}) = (n_2, n_2)$, while the momentum flowing through the root line $\ell$ is given by $(n_\ell, m_\ell)$, with $|m_\ell| = |n_\ell|$ expressed according to the definition (3.6) in item (8): the corresponding propagator is $1/(i n_\ell)^{\delta_\ell}$ for $n_\ell \neq 0$ and 1 for $n_\ell = 0$, as in (3.1) in item (3).
Fig. 3.5.
The second contribution corresponds to the tree of Fig. 3.4d, and it is given by
\[ \sum_{n' \in \mathbb{Z}} L_{n,n'}\, C^{(1)} a_{0,n'}, \tag{3.15} \]
with the same expression (3.14) for $L_{n,n'}$ and $C^{(1)}$ still undetermined. The mode labels of the node $V_0$ and the momentum of the root line are as before. Then one immediately realizes that the sum of (3.13) and (3.15) corresponds to (2.29) for $k = 1$. Finally, that $C^{(1)}$ is given by (3.10) follows from (2.30). This completes the check of the case $k = 1$.

In general from (2.31) one gets, for $\theta \in \Theta^{*(k)}_{n,m}$ contributing to $w^{(k)}_{n,m}$, that the tree value $\mathrm{Val}(\theta)$ is obtained by summing all contributions either of the form
\[ \frac{1}{-\omega^2 n^2 + \tilde\omega_m^2}\ \varepsilon \sum_{\substack{n_1+n_2+n_3=n \\ m_1+m_2+m_3=m}}\ \sum_{k_1+k_2+k_3=k-1}\ \sum_{\theta_1 \in \Theta^{*(k_1)}_{n_1,m_1}} \sum_{\theta_2 \in \Theta^{*(k_2)}_{n_2,m_2}} \sum_{\theta_3 \in \Theta^{*(k_3)}_{n_3,m_3}} \mathrm{Val}(\theta_1)\, \mathrm{Val}(\theta_2)\, \mathrm{Val}(\theta_3), \tag{3.16} \]
or of the form
\[ \frac{1}{-\omega^2 n^2 + \tilde\omega_m^2} \sum_{c=a,b} \nu^{(c)}_m \sum_{\theta_1 \in \Theta^{*(k-1)}_{n,m^{(c)}}} \mathrm{Val}(\theta_1), \tag{3.17} \]
with $m^{(a)} = m$ and $m^{(b)} = -m$; the corresponding graphical representations are as in Fig. 3.5. Therefore, by simply applying the diagrammatic rules given above, we see that by summing together the contributions (3.16) and (3.17) we obtain (3.9) for $|n| \neq |m|$.
Fig. 3.6.

A similar discussion applies to $A^{(k)}_n$, and one finds that $A^{(k)}_n$ can be written as a sum of contributions either of the form
\[ \sum_{n' \in \mathbb{Z}} L_{n,n'} \sum_{\substack{n'_1+n'_2+n'_3=n' \\ m'_1+m'_2+m'_3=n'}}\ \sum_{k_1+k_2+k_3=k}\ \sum_{\theta_1 \in \Theta^{*(k_1)}_{n'_1,m'_1}} \sum_{\theta_2 \in \Theta^{*(k_2)}_{n'_2,m'_2}} \sum_{\theta_3 \in \Theta^{*(k_3)}_{n'_3,m'_3}} \mathrm{Val}(\theta_1)\, \mathrm{Val}(\theta_2)\, \mathrm{Val}(\theta_3), \tag{3.18} \]
or of the form
\[ \sum_{n' \in \mathbb{Z}} L_{n,n'}\, C^{(k)} a_{0,n'}, \tag{3.19} \]
with $C^{(k)}$ still undetermined. Both (3.18) and (3.19) are of the form $\mathrm{Val}(\theta)$, for $\theta \in \Theta^{*(k)}_{n,m}$. A graphical representation is in Fig. 3.6. Analogously to the case $k = 1$ the coefficients $C^{(k)}$ are found to be expressed by (3.10). Then the lemma is proved.

Lemma 3. For any rooted tree $\theta$ one has $|V_v^3(\theta)| \le 2|V_w^3(\theta)| + 2|V_v^1(\theta)|$ and $|E(\theta)| \le 2(|V_v^3(\theta)| + |V_w^3(\theta)|) + 1$.

Proof. First of all note that $|V_w^3(\theta)| = 0$ requires $|V_v^1(\theta)| \ge 1$, so that one has $|V_w^3(\theta)| + |V_v^1(\theta)| \ge 1$ for all trees $\theta$. We prove by induction on the number $N$ of nodes the bound
\[ |V_v^3(\theta)| \le \begin{cases} 2|V_w^3(\theta)| + 2|V_v^1(\theta)| - 1, & \text{if the root line is a v-line}, \\ 2|V_w^3(\theta)| + 2|V_v^1(\theta)| - 2, & \text{if the root line is a w-line}, \end{cases} \tag{3.20} \]
which will immediately imply the first assertion. For $N = 1$ the bound is trivially satisfied, as Figs. 3.3 and 3.4 show. Then assume that (3.20) holds for the trees with $N'$ nodes, for all $N' < N$, and consider a tree $\theta$ with $|V(\theta)| = N$. If the special vertex $V_0$ of $\theta$ is not in $V_v^3(\theta)$ (hence it is in $V_w(\theta)$) the bound (3.20) follows trivially by the inductive hypothesis. If $V_0 \in V_v^3(\theta)$ then we can write
\[ |V_v^3(\theta)| = 1 + \sum_{i=1}^{s} |V_v^3(\theta_i)|, \tag{3.21} \]
where $\theta_1, \ldots, \theta_s$ are the subtrees (not end-points) whose root line is one of the lines entering $V_0$. One must have $s \ge 1$, as $s = 0$ would correspond to having all the entering lines of $V_0$ coming out from end-points, hence to having $N = 1$. If $s \ge 2$ one has from (3.21) and from the inductive hypothesis
\[ |V_v^3(\theta)| \le 1 + \sum_{i=1}^{s} \left( 2|V_w^3(\theta_i)| + 2|V_v^1(\theta_i)| - 1 \right) \le 1 + 2|V_w^3(\theta)| + 2|V_v^1(\theta)| - 2, \tag{3.22} \]
and the bound (3.20) follows. If $s = 1$ then the root line of $\theta_1$ has to be a w-line by item (4), so that one has
\[ |V_v^3(\theta)| \le 1 + 2|V_w^3(\theta_1)| + 2|V_v^1(\theta)| - 2, \tag{3.23} \]
which again yields (3.20). Finally the second assertion follows from the standard (trivial) property of trees
\[ \sum_{V \in V(\theta)} (s_V - 1) = |E(\theta)| - 1, \tag{3.24} \]
and the observation that in our case one has $s_V \le 3$.
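The counting identity (3.24) is purely combinatorial and can be checked on randomly generated trees. In the sketch below, which is only an illustration, a tree is a nested tuple: the empty tuple stands for an end-point and a tuple of 1 or 3 subtrees stands for a node. The further rules on badges and labels are ignored, and all names are hypothetical.

```python
import random


def random_tree(rng, depth):
    """Nested tuples: () is an end-point, a tuple of 1 or 3 subtrees is a node."""
    if depth == 0 or rng.random() < 0.3:
        return ()
    branching = rng.choice((1, 3))
    return tuple(random_tree(rng, depth - 1) for _ in range(branching))


def count(tree):
    """Return (sum over nodes of (s_V - 1), number of nodes with s_V = 3,
    number of end-points)."""
    if tree == ():
        return 0, 0, 1
    excess, triple, ends = len(tree) - 1, int(len(tree) == 3), 0
    for sub in tree:
        sub_excess, sub_triple, sub_ends = count(sub)
        excess += sub_excess
        triple += sub_triple
        ends += sub_ends
    return excess, triple, ends


rng = random.Random(0)
results = [count(random_tree(rng, 6)) for _ in range(200)]
```

Since $s_V - 1$ equals 2 for nodes with three entering lines and 0 otherwise, the identity gives $|E(\theta)| = 2(|V_v^3(\theta)| + |V_w^3(\theta)|) + 1$ when only the branching numbers 1 and 3 occur, which is consistent with the second bound of Lemma 3.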
4. Tree Expansion: The Multiscale Decomposition

We assume the Diophantine conditions (2.33). We introduce a multiscale decomposition of the propagators of the w-lines. Let $\chi(x)$ be a $C^\infty$ non-increasing function such that $\chi(x) = 0$ if $|x| \ge 2C_0$ and $\chi(x) = 1$ if $|x| \le C_0$ ($C_0$ is the same constant appearing in (2.33)), and let $\chi_h(x) = \chi(2^h x) - \chi(2^{h+1} x)$ for $h \ge 0$, and $\chi_{-1}(x) = 1 - \chi(x)$; such functions realize a smooth partition of the unity as
\[ 1 = \chi_{-1}(x) + \sum_{h=0}^{\infty} \chi_h(x) = \sum_{h=-1}^{\infty} \chi_h(x). \tag{4.1} \]
If $\chi_h(x) \neq 0$ for $h \ge 0$ one has $2^{-h-1} C_0 \le |x| \le 2^{-h+1} C_0$, while if $\chi_{-1}(x) \neq 0$ one has $|x| \ge C_0$. We write the propagator of any w-line as the sum of propagators on single scales in the following way:
\[ g(\omega n, m) = \sum_{h=-1}^{\infty} \frac{\chi_h(|\omega n| - \tilde\omega_m)}{-\omega^2 n^2 + \tilde\omega_m^2} = \sum_{h=-1}^{\infty} g^{(h)}(\omega n, m). \tag{4.2} \]
Note that we can bound $|g^{(h)}(\omega n, m)| \le 2^{h+1} C_0^{-1}$ (notice that given $n, m$ there are at most two non-zero values of $g^{(h)}(\omega n, m)$).

This means that we can attach to each w-line $\ell$ in $L(\theta)$ a scale label $h_\ell \ge -1$, which is the scale of the propagator which is associated with $\ell$. We can denote with $\Theta^{(k)}_{n,m}$ the set of trees which differ from the previous ones simply because the lines carry also the scale labels. The set $\Theta^{(k)}_{n,m}$ is defined according to the rules (1) to (8) of Sect. 3, by changing item (3) into the following one.
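(3′) With each line $\ell$ coming out from nodes of w-type we associate a scale label $h_\ell \ge -1$. For notational convenience we associate a scale label $h = -1$ with the lines coming out from the nodes of v-type and with the lines coming out from the end-points. With each line $\ell$ we associate a propagator
\[ g^{(h_\ell)}_\ell \equiv g^{(h_\ell)}(\omega n_\ell, m_\ell) = \begin{cases} \dfrac{\chi_{h_\ell}(|\omega n_\ell| - \tilde\omega_{m_\ell})}{-\omega^2 n_\ell^2 + \tilde\omega_{m_\ell}^2}, & \text{if } \gamma_\ell = w, \\[2mm] \dfrac{1}{(i n_\ell)^{\delta_\ell}}, & \text{if } \gamma_\ell = v,\ n_\ell \neq 0, \\[2mm] 1, & \text{if } \gamma_\ell = v,\ n_\ell = 0, \end{cases} \tag{4.3} \]
with momentum $(n_\ell, m_\ell)$.

Definition 4. For all $\theta \in \Theta^{(k)}_{n,m}$, we define
\[ \mathrm{Val}(\theta) = \Big( \prod_{\ell \in L(\theta)} g^{(h_\ell)}_\ell \Big) \Big( \prod_{V \in V(\theta)} \eta_V \Big) \Big( \prod_{V \in E(\theta)} V_V \Big), \tag{4.4} \]
the value of the tree $\theta$.

Then (3.9) and (3.10) are replaced, respectively, with
\[ u^{(k)}_{n,m} = \sum_{\theta \in \Theta^{(k)}_{n,m}} \mathrm{Val}(\theta), \tag{4.5} \]
and
\[ C^{(k)} = r_0^{-1} \sum^{*}_{\theta \in \Theta^{(k)}_{n,n}} a_{0,-n}\, \mathrm{Val}(\theta), \tag{4.6} \]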
with the new definition for the tree value $\mathrm{Val}(\theta)$ and with $*$ meaning the same constraint as in (3.10).

Definition 5. A cluster $T$ is a connected set of nodes which are linked by a continuous path of lines with the same scale label $h_T$ or a lower one and which are maximal; we shall say that the cluster has scale $h_T$. We shall denote with $V(T)$ and $E(T)$ the set of nodes and the set of end-points, respectively, which are contained inside the cluster $T$, and with $L(T)$ the set of lines connecting them. As for trees we call $V_v(T)$ and $V_w(T)$ the sets of nodes $V \in V(T)$ which are of v-type and of w-type respectively. Analogously one defines the sets $V_v^s(T)$ and $V_w^s(T)$. We define the order $k_T$ of a cluster $T$ as the order of a tree (see item (ii) before Definition 3), with the sums restricted to the nodes internal to the cluster.

An inclusion relation is established between clusters, in such a way that the innermost clusters are the clusters with lowest scale, and so on. Each cluster $T$ can have an arbitrary number of lines entering it (incoming lines), but only one or zero lines coming from it (outcoming line); we shall denote the latter (when it exists) with $\ell_T^1$. We shall call external lines of the cluster $T$ the lines which either enter or come out from $T$, and we shall denote by $h_T^{(e)}$ the minimum among the scales of the external lines of $T$. Define also
\[ \begin{aligned} K(\theta) &= \sum_{V \in V(\theta)} \left( |n'_V| + |n_V| \right) + \sum_{V \in E(\theta)} |n'_V|, \\ K(T) &= \sum_{V \in V(T)} \left( |n'_V| + |n_V| \right) + \sum_{V \in E(T)} |n'_V|, \end{aligned} \tag{4.7} \]
Fig. 4.1.
where we recall that one has $(n'_V, m'_V) = (n_V, m_V) = (0, 0)$ if $V \in V(\theta)$ is of w-type.

If a cluster has only one entering line $\ell_T^2$ and $(n, m)$ is the momentum of such a line, for any line $\ell \in L(T)$ one can write $(n_\ell, m_\ell) = (n_\ell^0, m_\ell^0) + \eta_\ell (n, m)$, where $\eta_\ell = 1$ if the line $\ell$ is along the path connecting the external lines of $T$ and $\eta_\ell = 0$ otherwise.

Definition 6. A cluster $T$ with only one incoming line $\ell_T^2$ such that one has
\[ n_{\ell_T^1} = n_{\ell_T^2} \qquad \text{and} \qquad m_{\ell_T^1} = \pm m_{\ell_T^2} \tag{4.8} \]
will be called a self-energy graph or a resonance. In such a case we shall call a resonant line the line $\ell_T^1$, and we shall refer to its momentum as the momentum of the self-energy graph.

Examples of self-energy graphs $T$ with $k_T = 1$ are represented in Fig. 4.1. The lines crossing the encircling bubbles are the external lines, and they are on scales higher than the lines internal to the bubbles. There are 9 self-energy graphs with $k_T = 1$: they are all obtained from the two which are drawn in Fig. 4.1, simply by considering all possible inequivalent trees.

Definition 7. The value of the self-energy graph $T$ with momentum $(n, m)$ associated with the line $\ell_T^2$ is defined as
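\[ V_T^h(\omega n, m) = \Big( \prod_{\ell \in T} g^{(h_\ell)}_\ell \Big) \Big( \prod_{V \in V(T)} \eta_V \Big) \Big( \prod_{V \in E(T)} V_V \Big), \tag{4.9} \]
where $h = h_T^{(e)}$ is the minimum between the scales of the two external lines of $T$ (they can differ at most by a unit), and one has
\[ \begin{aligned} n(T) &\equiv \sum_{V \in V(T)} \left( n'_V + n_V \right) + \sum_{V \in E(T)} n'_V = 0, \\ m(T) &\equiv \sum_{V \in V(T)} \left( m'_V + m_V \right) + \sum_{V \in E(T)} m'_V + \sum_{\substack{W \in V_w^1(T) \\ c_W = b}} \left( -2 m_{\ell_W} \right) \in \{0, 2m\}, \end{aligned} \tag{4.10} \]
by definition of self-energy graph; one says that $T$ is a resonance of type $c = a$ when $m(T) = 0$ and a resonance of type $c = b$ when $m(T) = 2m$.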
Definition 8. Given a tree $\theta$, we shall denote by $N_h(\theta)$ the number of lines with scale $h$, and by $C_h(\theta)$ the number of clusters with scale $h$.

Then the product of propagators appearing in (4.4) can be bounded as
\[ \prod_{\ell \in L(\theta)} \left| g^{(h_\ell)}_\ell \right| \le \Big( \prod_{h=0}^{\infty} 2^{h N_h(\theta)} \Big) \Big( \prod_{\substack{\ell \in L(\theta) \\ \gamma_\ell = w}} \frac{\chi_{h_\ell}(|\omega n_\ell| - \tilde\omega_{m_\ell})}{|\omega n_\ell| + \tilde\omega_{m_\ell}} \Big) \Big( \prod_{\substack{\ell \in L(\theta) \\ \gamma_\ell = v,\ n_\ell \neq 0}} \frac{1}{|\omega n_\ell|} \Big), \tag{4.11} \]
and this will be used later.

Lemma 4. Assume $0 < C_0 < 1/2$ and that there is a constant $C_1$ such that one has $|\tilde\omega_m - |m|| \le C_1 \varepsilon/|m|$. If $\varepsilon$ is small enough, for any tree $\theta \in \Theta^{(k)}_{n,m}$ and for any line $\ell$ on a scale $h_\ell \ge 0$ one has $\min\{m_\ell, n_\ell\} \ge 1/(2\varepsilon)$.

Proof. If a line $\ell$ with momentum $(n, m)$ is on scale $h \ge 0$ then one has
\[ \frac{1}{2} > C_0 \ge \big| |\omega n| - \tilde\omega_m \big| \ge \Big| \left( \sqrt{1-\varepsilon} - 1 \right) |n| + (|n| - |m|) \Big| - C_1 \varepsilon/|m| \ge \left| \frac{\varepsilon |n|}{1 + \sqrt{1-\varepsilon}} - (|n| - |m|) \right| - C_1 \varepsilon/|m|, \tag{4.12} \]
with $|n| \neq |m|$, hence $|n - m| \ge 1$, so that $|n| \ge 1/(2\varepsilon)$. Moreover one has $\big| |\omega n| - \tilde\omega_m \big| \le 1/2$ and $\tilde\omega_m - |m| = O(\varepsilon)$, and one obtains also $|m| > 1/(2\varepsilon)$.

Lemma 5. Define $h_0$ such that $2^{h_0} \le 16 C_0/\varepsilon < 2^{h_0+1}$, and assume that there is a constant $C_1$ such that one has $|\tilde\omega_m - |m|| \le C_1 \varepsilon/|m|$. If $\varepsilon$ is small enough, for any tree $\theta \in \Theta^{(k)}_{n,m}$ and for all $h \ge h_0$ one has
\[ N_h(\theta) \le 4K(\theta)\, 2^{(2-h)/\tau} - C_h(\theta) + S_h(\theta) + M_h^{\nu}(\theta), \tag{4.13} \]
where $K(\theta)$ is defined in (4.7), while $S_h(\theta)$ is the number of self-energy graphs $T$ in $\theta$ with $h_T^{(e)} = h$ and $M_h^{\nu}(\theta)$ is the number of ν-vertices in $\theta$ such that the maximum scale of the two external lines is $h$.

Proof. We prove inductively the bound
\[ N_h^{*}(\theta) \le \max\{0,\ 2K(\theta)\, 2^{(2-h)/\tau} - 1\}, \tag{4.14} \]
where $N_h^{*}(\theta)$ is the number of non-resonant lines in $L(\theta)$ on scale $h' \ge h$.

First of all note that for a tree $\theta$ to have a line on scale $h$ the condition $K(\theta) > 2^{(h-1)/\tau}$ is necessary, by the first Diophantine conditions in (2.33). This means that one can have $N_h^{*}(\theta) \ge 1$ only if $K = K(\theta)$ is such that $K > k_0 \equiv 2^{(h-1)/\tau}$: therefore for values $K \le k_0$ the bound (4.14) is satisfied.

If $K = K(\theta) > k_0$, we assume that the bound holds for all trees $\theta'$ with $K(\theta') < K$. Define $E_h = 2^{-1} (2^{(2-h)/\tau})^{-1}$: so we have to prove that $N_h^{*}(\theta) \le \max\{0, K(\theta) E_h^{-1} - 1\}$.

Call $\ell$ the root line of $\theta$ and $\ell_1, \ldots, \ell_m$ the $m \ge 0$ lines on scale $\ge h$ which are the closest to $\ell$ (i.e. such that no other line along the paths connecting the lines $\ell_1, \ldots, \ell_m$ to the root line is on scale $\ge h$).
If the root line $\ell$ of $\theta$ is either on scale $< h$ or on scale $\ge h$ and resonant, then
$$N_h^*(\theta) = \sum_{i=1}^{m} N_h^*(\theta_i), \tag{4.15}$$
where $\theta_i$ is the subtree with $\ell_i$ as root line, hence the bound follows by the inductive hypothesis. If the root line $\ell$ has scale $\ge h$ and is non-resonant, then $\ell_1, \ldots, \ell_m$ are the entering lines of a cluster $T$. By denoting again with $\theta_i$ the subtree having $\ell_i$ as root line, one has
$$N_h^*(\theta) = 1 + \sum_{i=1}^{m} N_h^*(\theta_i), \tag{4.16}$$
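so that the bound becomes trivial if either $m = 0$ or $m \ge 2$. If $m = 1$ then one has a cluster $T$ with two external lines $\ell$ and $\ell_1$, which are both with scales $\ge h$; then
$$\big||\omega n_\ell| - \tilde\omega_{m_\ell}\big| \le 2^{-h+1}C_0, \qquad \big||\omega n_{\ell_1}| - \tilde\omega_{m_{\ell_1}}\big| \le 2^{-h+1}C_0, \tag{4.17}$$
and recall that $T$ is not a self-energy graph. Note that the validity of both inequalities in (4.17) for $h \ge h_0$ implies that one has $|n_\ell - n_{\ell_1}| \neq |m_\ell \pm m_{\ell_1}|$, as we are going to show. By Lemma 4 we know that one has $\min\{m_\ell, n_\ell\} \ge 1/(2\varepsilon)$. Then from (4.17) we have, for some $\eta_\ell, \eta_{\ell_1} \in \{\pm 1\}$,
$$2^{-h+2}C_0 \ge \big|\omega(n_\ell - n_{\ell_1}) + \eta_\ell\tilde\omega_{m_\ell} + \eta_{\ell_1}\tilde\omega_{m_{\ell_1}}\big|, \tag{4.18}$$
so that if one had $|n_\ell - n_{\ell_1}| = |m_\ell \pm m_{\ell_1}|$ we would obtain for $\varepsilon$ small enough
$$2^{-h+2}C_0 \ge \frac{\varepsilon\,|n_\ell - n_{\ell_1}|}{1+\sqrt{1-\varepsilon}} - \frac{C_1\varepsilon}{|m_\ell|} - \frac{C_1\varepsilon}{|m_{\ell_1}|} \ge \frac{\varepsilon}{2} - 4C_1\varepsilon^2 > \frac{\varepsilon}{4}, \tag{4.19}$$
which is contradictory with $h \ge h_0$; hence one has $|n_\ell - n_{\ell_1}| \neq |m_\ell \pm m_{\ell_1}|$. Then, by (4.17) and for $|n_\ell - n_{\ell_1}| \neq |m_\ell \pm m_{\ell_1}|$, one has, for suitable $\eta_\ell, \eta_{\ell_1} \in \{+, -\}$,
$$2^{-h+2}C_0 \ge \big|\omega(n_\ell - n_{\ell_1}) + \eta_\ell\tilde\omega_{m_\ell} + \eta_{\ell_1}\tilde\omega_{m_{\ell_1}}\big| \ge C_0|n_\ell - n_{\ell_1}|^{-\tau}, \tag{4.20}$$
where the second Diophantine conditions in (2.33) have been used. Hence $K(\theta) - K(\theta_1) > E_h$, which, inserted into (4.16) with $m = 1$, gives, by using the inductive hypothesis,
$$N_h^*(\theta) = 1 + N_h^*(\theta_1) \le 1 + K(\theta_1)E_h^{-1} - 1 \le 1 + \big(K(\theta) - E_h\big)E_h^{-1} - 1 \le K(\theta)E_h^{-1} - 1, \tag{4.21}$$
hence the bound is proved also if the root line is on scale $\ge h$. In the same way one proves that, if we denote with $C_h(\theta)$ the number of clusters on scale $h$, one has
$$C_h(\theta) \le \max\{0, 2K(\theta)2^{(2-h)/\tau} - 1\}; \tag{4.22}$$
see [23] for details.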
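As a quick numerical sanity check of the role played by the threshold scale $h_0$ of Lemma 5, the following Python sketch computes $h_0$ from $2^{h_0} \le 16C_0/\varepsilon < 2^{h_0+1}$ and evaluates the quantity $2^{h+2}\varepsilon/C_0$, which bounds the propagators on the scales $h < h_0$ and stays below 32 there. The function names and the sample values of $C_0$ and $\varepsilon$ are illustrative assumptions, not taken from the paper.

```python
import math


def threshold_scale(c0: float, eps: float) -> int:
    """Integer h0 with 2**h0 <= 16*c0/eps < 2**(h0 + 1)."""
    ratio = 16.0 * c0 / eps
    h0 = math.floor(math.log2(ratio))
    # Guard against floating-point rounding when ratio is close to a power of two.
    while 2.0 ** h0 > ratio:
        h0 -= 1
    while 2.0 ** (h0 + 1) <= ratio:
        h0 += 1
    return h0


def low_scale_bound(h: int, c0: float, eps: float) -> float:
    """The quantity 2**(h + 2) * eps / c0 bounding a propagator on scale h."""
    return 2.0 ** (h + 2) * eps / c0


if __name__ == "__main__":
    c0, eps = 0.25, 1e-3  # sample values, chosen only for illustration
    h0 = threshold_scale(c0, eps)
    worst = max(low_scale_bound(h, c0, eps) for h in range(h0))
    print(h0, worst)
```

Since $h \le h_0 - 1$ gives $2^{h+2} \le 2^{h_0+1} \le 32C_0/\varepsilon$, the printed worst case can never exceed 32, whatever sample values are used.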
Note that the argument above is very close to [23]: this is due to the fact that the external lines of any self-energy graph $T$ are both $w$-lines, so that the only effect of the presence of the $v$-lines and of the nodes of $v$-type is in the contribution to $K(T)$. The following lemma deals with the lines on scale $h < h_0$.

Lemma 6. Let $h_0$ be defined as in Lemma 2 and $C_0 < 1/2$, and assume that there is a constant $C_1$ such that one has $|\tilde\omega_m - |m|| \le C_1\varepsilon$. If $\varepsilon$ is small enough, for $h < h_0$ one has $|g_\ell^{(h)}| \le 32$.

Proof. Either if $h \neq h_\ell$ or $h = h_\ell = -1$ the bound is trivial. If $h = h_\ell \ge 0$ one has
$$g_\ell^{(h)} = \frac{\chi_h(|\omega n_\ell| - \tilde\omega_{m_\ell})}{-|\omega n_\ell| + \tilde\omega_{m_\ell}}\,\frac{1}{|\omega n_\ell| + \tilde\omega_{m_\ell}}, \tag{4.23}$$
where $|\omega n_\ell| + \tilde\omega_{m_\ell} \ge 1/(2\varepsilon)$ by Lemma 4. Then one has
$$\frac{1}{|\omega n_\ell| + \tilde\omega_{m_\ell}} \le 2\varepsilon, \tag{4.24}$$
which, inserted in (4.23), gives $|g_\ell^{(h)}| \le 2^{h+2}\varepsilon/C_0 \le 32$, so that the lemma is proved.

5. The Renormalized Expansion

It is an immediate consequence of Lemma 5 and Lemma 6 that all the trees $\theta$ with no self-energy graphs or $\nu$-vertices admit a bound $O(C^k\varepsilon^k)$, where $C$ is a constant. However the generic tree $\theta$ with $S_h(\theta) \neq 0$ admits a much worse bound, namely $O(C^k\varepsilon^k k!^{\alpha})$, for some constant $\alpha$, and the presence of factorials prevents us from proving the convergence of the series; in KAM theory this is called accumulation of small divisors. It is convenient then to consider another expansion for $u_{n,m}$, which is essentially a resummation of the one introduced in Sects. 3 and 4.

We define the set $\Theta^{(k)R}_{n,m}$ of renormalized trees, which are defined as $\Theta^{(k)}_{n,m}$ except that the following rules are added.

(9) To each self-energy graph (with $|m| \ge 1$) the $\mathcal{R} = \mathbb{1} - \mathcal{L}$ operation is applied, where $\mathcal{L}$ acts on the self-energy graphs in the following way, for $h \ge 0$ and $|m| \ge 1$,
$$\mathcal{L}V_T^h(\omega n, m) = V_T^h(\mathrm{sgn}(n)\,\tilde\omega_m, m), \tag{5.1}$$
$\mathcal{R}$ is called a regularization operator; its action simply means that each self-energy graph $V_T^h(\omega n, m)$ must be replaced by $\mathcal{R}V_T^h(\omega n, m)$.

(10) With the nodes $V$ of $w$-type with $s_V = 1$ (which we still call $\nu$-vertices) and with $h \ge 0$ the minimal scale among the lines entering or exiting $V$, we associate a factor $2^{-h}\nu^{(c)}_{h,m}$, $c = a, b$, where $(n, m)$ and $(n, \pm m)$, with $|m| \ge 1$, are the momenta of the lines and $a$ corresponds to the sign $+$ and $b$ to the sign $-$ in $\pm m$.

(11) The set $\{h_\ell\}$ of the scales associated with the lines $\ell \in L(\theta)$ must satisfy the following constraint (which we call compatibility): fixed $(n_\ell, m_\ell)$ for any $\ell \in L(\theta)$ and replaced $\mathcal{R}$ with $\mathbb{1}$ at each self-energy graph, one must have $\chi_{h_\ell}(|\omega n_\ell| - \tilde\omega_{m_\ell}) \neq 0$.

(12) The factors $C^{(k_V)}$ in (3.3) are replaced with $\Gamma$, to be considered a parameter.
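The set $\Theta^{(k)R}_{n,m}$ is defined as $\Theta^{(k)}_{n,m}$ with the new rules and with the constraint that the order $k$ is given by $k = |V_w(\theta)| + |V_v^{1}(\theta)|$. We consider the following expansion
$$\tilde u_{n,m} = \sum_{k=1}^{\infty} \mu^k \sum_{\theta\in\Theta^{(k)R}_{n,m}} \mathrm{Val}(\theta), \tag{5.2}$$
where, for $|m| \ge 1$ and $h \ge 0$, $\nu^{(c)}_{h,m}$ is given by
$$2^{-h}\mu\nu^{(c)}_{h,m} = \mu\nu^{(c)}_m + \sum_{T\in\mathcal{T}^{(c)}} \mu^{k_T}\,\frac{1}{2}\sum_{\sigma=\pm} V_T^h(\sigma\tilde\omega_m, m), \tag{5.3}$$
with $c = a, b$, and
$$\Gamma = \sum_{k=1}^{\infty} \mu^{k-1} \sum^{*}_{\theta\in\Theta^{(k)R}_{n,n}} a_{0,-n}\,\mathrm{Val}(\theta), \tag{5.4}$$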
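with $*$ denoting the same constraint as in (3.10). We shall set also $\nu^{(c)}_m = \nu^{(c)}_{-1,m}$. Note that $V_T^h(\sigma\tilde\omega_m, m)$ is independent of $\sigma$. Calling $L_0(\theta)$, $V_0(\theta)$, $E_0(\theta)$ the set of lines, nodes and end-points not contained in any self-energy graph, and $S_0(\theta)$ the maximal self-energy graphs, i.e. the self-energy graphs which are not contained in any self-energy graphs, we can write $\mathrm{Val}(\theta)$ in (5.2) as
$$\mathrm{Val}(\theta) = \Big(\prod_{\ell\in L_0(\theta)} g_\ell^{(h_\ell)}\Big)\Big(\prod_{V\in V_0(\theta)} \eta_V\Big)\Big(\prod_{V\in E_0(\theta)} V_V\Big)\Big(\prod_{T\in S_0(\theta)} \mathcal{R}V_T^{h_T^{(e)}}(\omega n_T, m_T)\Big), \tag{5.5}$$
and by definition
$$\mathcal{R}V_T^{h_T^{(e)}}(\omega n_T, m_T) = V_T^{h_T^{(e)}}(\omega n_T, m_T) - V_T^{h_T^{(e)}}(\mathrm{sgn}(n_T)\,\tilde\omega_{m_T}, m_T), \tag{5.6}$$
and $V_T^{h_T^{(e)}}(\omega n_T, m_T)$ is given by
$$V_T^{h_T^{(e)}}(\omega n_T, m_T) = \Big(\prod_{\ell\in L_0(T)} g_\ell^{(h_\ell)}\Big)\Big(\prod_{V\in V_0(T)} \eta_V\Big)\Big(\prod_{V\in E_0(T)} V_V\Big)\Big(\prod_{T'\in S_0(T)} \mathcal{R}V_{T'}^{h_{T'}^{(e)}}(\omega n_{T'}, m_{T'})\Big). \tag{5.7}$$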
First (Lemma 7) we will show that the expansion (5.2) is well defined, for $\nu^{(c)}_{h,m}, \Gamma = O(\varepsilon)$; then (Lemmas 8 and 9) we show that under the same conditions also the r.h.s. of (5.3) is well defined; moreover (Lemma 10) we prove by using (5.3) that it is indeed possible to choose $\nu^{(c)}_m$ such that $\nu^{(c)}_{h,m} = O(\varepsilon)$ for any $h$; then (Lemmas 11 and 12) we show that (5.4) admits a solution $\Gamma = O(\varepsilon)$; finally (Lemma 13) we show that indeed (5.2) solves the last of (1.14) and (2.2); this completes the proof of Proposition 1. In the next section we will solve the implicit function problem of (2.1), thus completing the proof of Theorem 1.

We start from the following lemma stating that, if the $\nu^{(c)}_{h,m}$ and $\Gamma$ functions are bounded, then the expansion (5.2) is well defined.

Lemma 7. Assume that there exists a constant $C$ such that one has $|\Gamma| \le C\varepsilon$ and $|\nu^{(c)}_{h,m}| \le C\varepsilon$, with $c = a, b$, for all $|m| \ge 1$ and all $h \ge 0$. Then for all $\mu_0 > 0$ there exists $\varepsilon_0 > 0$ such that for all $|\mu| \le \mu_0$ and for all $0 < \varepsilon < \varepsilon_0$ and for all $(n, m) \in \mathbb{Z}^2$ one has
$$|\tilde u_{n,m}| \le D_0\,\varepsilon\mu\, e^{-\kappa(|n|+|m|)/4}, \tag{5.8}$$
where $D_0$ is a positive constant. Moreover $u_{n,m}$ is analytic in $\mu$ and in the parameters $\nu^{(c)}_{m',h'}$, with $c = a, b$ and $|m'| \ge 1$.

Proof. In order to take into account the $\mathcal{R}$ operation we write (5.6) as
$$\mathcal{R}V_T^{h_T^{(e)}}(\omega n_T, m_T) = \big(\omega n_T - \tilde\omega_{m_T}\big)\int_0^1 dt\, \partial V_T^{h_T^{(e)}}\big(\omega n_T + t(\omega n_T - \tilde\omega_{m_T}), m_T\big), \tag{5.9}$$
where $\partial$ denotes the derivative with respect to the argument $\omega n_T + t(\omega n_T - \tilde\omega_{m_T})$. By (5.7) we see that the derivatives can be applied either on the propagators in $L_0(T)$, or on the $\mathcal{R}V_{T'}^{h_{T'}^{(e)}}$. In the first case there is an extra factor $2^{-h_T^{(e)}+h_T}$ with respect to the bound (4.11): $2^{-h_T^{(e)}}$ is obtained from $\omega n_T - \tilde\omega_{m_T}$ while $\partial g^{(h_T)}$ is bounded proportionally to $2^{2h_T}$; in the second case note that $\partial_t \mathcal{R}V_{T'}^{h_{T'}^{(e)}} = \partial_t V_{T'}^{h_{T'}^{(e)}}$ as $\mathcal{L}V_{T'}^{h_{T'}^{(e)}}$ is independent of $t$; if the derivative acts on the propagator of a line $\ell \in L(T')$, we get a gain factor
$$2^{-h_T^{(e)}+h_{T'}} \le 2^{-h_T^{(e)}+h_T}\, 2^{-h_{T'}^{(e)}+h_{T'}}, \tag{5.10}$$
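as $h_{T'}^{(e)} \le h_T$. We can iterate this procedure until all the $\mathcal{R}$ operations are applied on propagators; at the end (i) the propagators are derived at most one time; (ii) the number of terms so generated is $\le k$; (iii) with each self-energy graph $T$ a factor $2^{-h_T^{(e)}+h_T}$ is associated.

Assuming that $|\nu^{(c)}_{h,m}| \le C\varepsilon$ and $|\Gamma| \le C\varepsilon$, for any $\theta$ one obtains, for a suitable constant $D$,
$$\begin{aligned}
|\mathrm{Val}(\theta)| \le{}& \varepsilon^{|V_w(\theta)|+|V_v^{(1)}(\theta)|}\, D^{|V(\theta)|} \\
&\times \Big(\prod_{h=h_0}^{\infty} \exp\Big[h\log 2\,\Big(4K(\theta)2^{-(h-2)/\tau} - C_h(\theta) + S_h(\theta) + M_h^{\nu}(\theta)\Big)\Big]\Big) \\
&\times \Big(\prod_{\substack{T\in S(\theta)\\ h_T^{(e)}\ge h_0}} 2^{-h_T^{(e)}+h_T}\Big)\Big(\prod_{h=h_0}^{\infty} 2^{-hM_h^{\nu}(\theta)}\Big) \\
&\times \Big(\prod_{V\in V(\theta)\cup E(\theta)} e^{-\kappa(|n_V|+|n'_V|)}\Big)\Big(\prod_{V\in V(\theta)\cup E(\theta)} e^{-\kappa(|m_V|+|m'_V|)}\Big),
\end{aligned} \tag{5.11}$$
where the second line is a bound for $\prod_{h\ge h_0} 2^{hN_h(\theta)}$ and we have used that by item (12) $N_h(\theta)$ can be bounded through Lemma 5, and Lemma 4 has been used for the lines on scales $h < h_0$; moreover $\prod_{h=h_0}^{\infty} 2^{-hM_h^{\nu}(\theta)}$ takes into account the factors $2^{-h}$ arising from the running coupling constants $\nu^{(c)}_{h,m}$ and the action of $\mathcal{R}$ produces, as discussed above, the factor $\prod_{T\in S(\theta)} 2^{-h_T^{(e)}+h_T}$. Then one has
$$\Big(\prod_{h=h_0}^{\infty} 2^{hS_h(\theta)}\Big)\Big(\prod_{T\in S(\theta)} 2^{-h_T^{(e)}}\Big) = 1, \qquad \Big(\prod_{h=h_0}^{\infty} 2^{-hC_h(\theta)}\Big)\Big(\prod_{T\in S(\theta)} 2^{h_T}\Big) \le 1. \tag{5.12}$$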
We have to sum the values of all trees, so we have to worry about the sum of the labels. Recall that a labeled tree is obtained from an unlabeled tree by assigning all the labels to the points and the lines: so the sum over all possible labeled trees can be written as sum over all unlabeled trees and of labels. For a fixed unlabeled tree $\theta$ with a given number of nodes, say $N$, we can assign first the mode labels $\{(n'_V, m'_V), (n_V, m_V)\}_{V\in V(\theta)\cup E(\theta)}$, and we sum over all the other labels, which gives $4^{|V_v(\theta)|}$ (for the labels $j_V$) times $2^{|L(\theta)|}$ (for the scale labels): then all the other labels are uniquely fixed. Then we can perform the sum over the mode labels by using the exponential decay arising from the node factors (3.3) and end-point factors (3.4). Finally we have to sum over the unlabeled trees, and this gives a factor $4^N$ [26]. By Lemma 3, one has $|V(\theta)| = |V_w(\theta)| + |V_v^{(1)}(\theta)| + |V_v^{(3)}(\theta)| \le 3(|V_w(\theta)| + |V_v^{(1)}(\theta)|)$, hence $N \le 3k$, so that $\sum_{\theta\in\Theta^{(k)R}_{n,m}} |\mathrm{Val}(\theta)| \le D^k\varepsilon^k$, for some positive constant $D$. Therefore, for fixed $(n, m)$ one has
$$\sum_{k=1}^{\infty} \sum_{\theta\in\Theta^{(k)R}_{n,m}} \mu^k\,|\mathrm{Val}(\theta)| \le D_0\,\mu\varepsilon\, e^{-\kappa(|n|+|m|)/4}, \tag{5.13}$$
for some positive constant $D_0$, so that (5.8) is proved.
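From (5.3) we know that the quantities $\nu^{(c)}_{h,m}$, for $h \ge 0$ and $|m| \ge 1$, verify the recursive relations
$$\mu\nu^{(c)}_{h+1,m} = 2\mu\nu^{(c)}_{h,m} + \beta^{(c)}_{h,m}\big(\tilde\omega, \varepsilon, \{\nu^{(c')}_{h',m'}\}\big), \tag{5.14}$$
where, by defining $\mathcal{T}^{(c)}_h$ as the set of self-energy graphs in $\mathcal{T}^{(c)}$ on scale $h$, the beta function
$$\beta^{(c)}_{h,m} \equiv \beta^{(c)}_{h,m}\big(\tilde\omega, \varepsilon, \{\nu^{(c')}_{h',m'}\}\big) = 2^{h+1}\sum_{T\in\mathcal{T}^{(c)}_h} \mu^{k_T}\,\frac{1}{2}\sum_{\sigma=\pm} V_T^{h+1}(\sigma\tilde\omega_m, m) \tag{5.15}$$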
depends only on the scales $\le h$.

In order to obtain a bound on the beta function, hence on the running coupling constants, we need to bound $V_T^{h+1}(\pm\tilde\omega_m, m)$ for $T \in \mathcal{T}^{(c)}_h$. We define $\tilde\Theta^{(k)R}_{n,m}$ as the set $\Theta^{(k)R}_{n,m}$ introduced before, but by changing item (7) into the following one:

(7′) We divide the set $E(\theta)$ of end-points into two sets $\tilde E(\theta)$ and $E_0(\theta)$. With each end-point $V \in \tilde E(\theta)$ we associate a first mode label $(n'_V, m'_V)$, with $|m'_V| = |n'_V|$, a second mode label $(0, 0)$ and an end-point factor $V_V = (-1)^{1+\delta_{n'_V, m'_V}}\, a_{0,n'_V}$, while $E_0(\theta)$ is either the empty set or a single end-point $V_0$, and, in the latter case, with the end-point $V \in E_0(\theta)$ we associate a first mode label $(\bar\omega_{m_V}, n_V)$, where $\bar\omega_{m_V} = \tilde\omega_{m_V}/\omega$, a second mode label $(0, 0)$ and an end-point factor $V_V = 1$.

Then we have the following generalization of Lemma 4.
Lemma 8. If $\varepsilon$ is small enough, for any tree $\theta \in \tilde\Theta^{(k)R}_{n,m}$ one has
$$N_h(\theta) \le 4K(\theta)2^{(2-h)/\tau} - C_h(\theta) + S_h(\theta) + M_h^{\nu}(\theta), \tag{5.16}$$
where the notations are as in Lemma 4.

Proof. Lemma 4 holds for $E_0(\theta) = \emptyset$; we mimic the proof of Lemma 4 proving that
$$N_h^*(\theta) \le \max\{0, 2K(\theta)2^{(2-h)/\tau}\}, \tag{5.17}$$
for all trees $\theta$ with $E_0(\theta) \neq \emptyset$, again by induction on $K(\theta)$. For any line $\ell \in L(\theta)$ set $\eta_\ell = 1$ if the line is along the path connecting $V_0$ to the root and $\eta_\ell = 0$ otherwise, and write
$$n_\ell = n_\ell^0 + \eta_\ell\,\bar\omega_m, \qquad m_\ell = m_\ell^0 + \eta_\ell\, m, \tag{5.18}$$
which implicitly defines $n_\ell^0$ and $m_\ell^0$. Define $k_0 = 2^{(h-1)/\tau}$. One has $N_h^*(\theta) = 0$ for $K(\theta) < k_0$, because if a line $\bar\ell \in L(\theta)$ is indeed on scale $h$ then $\big||\omega n_{\bar\ell}| - \tilde\omega_{m_{\bar\ell}}\big| < C_0 2^{1-h}$, so that (5.18) and the Diophantine conditions imply
$$K(\theta) \ge |n^0_{\bar\ell}| > 2^{(h-1)/\tau} \equiv k_0. \tag{5.19}$$
Then, for $K \ge k_0$, we assume that the bound (5.17) holds for all $K(\theta) = K' < K$, and we show that it follows also for $K(\theta) = K$.

If the root line $\ell$ of $\theta$ is either on scale $< h$ or on scale $\ge h$ and resonant, the bound (5.17) follows immediately from the bound (4.13) and from the inductive hypothesis. The same occurs if the root line is on scale $\ge h$ and non-resonant, and, by calling $\ell_1, \ldots, \ell_m$ the lines on scale $\ge h$ which are the closest to $\ell$, one has $m \ge 2$: in fact in such a case at least $m - 1$ among the subtrees $\theta_1, \ldots, \theta_m$ having $\ell_1, \ldots, \ell_m$, respectively, as root lines have $E_0(\theta_i) = \emptyset$, so that we can write, by (4.13) and by the inductive hypothesis,
$$N_h^*(\theta) = 1 + \sum_{i=1}^{m} N_h^*(\theta_i) \le 1 + E_h^{-1}\sum_{i=1}^{m} K(\theta_i) - (m-1) \le E_h^{-1}K(\theta), \tag{5.20}$$
so that (5.17) follows. If $m = 0$ then $N_h^*(\theta) = 1$ and $K(\theta)2^{(2-h)/\tau} \ge 1$ because one must have $K(\theta) \ge k_0$. So the only non-trivial case is when one has $m = 1$. If this happens $\ell_1$ is, by construction, the root line of a tree $\theta_1$ such that $K(\theta) = K(T) + K(\theta_1)$, where $T$ is the cluster which has $\ell$ and $\ell_1$ as external lines and $K(T)$, defined in (4.7), satisfies the bound $K(T) \ge |n_{\ell_1} - n_\ell|$. Moreover, if $E_0(\theta_1) \neq \emptyset$, one has
$$\big||\omega n^0_\ell + \tilde\omega_m| - \tilde\omega_{m_\ell}\big| \le 2^{-h+1}C_0, \qquad \big||\omega n^0_{\ell_1} + \tilde\omega_m| - \tilde\omega_{m_{\ell_1}}\big| \le 2^{-h+1}C_0, \tag{5.21}$$
so that, for suitable $\eta_\ell, \eta_{\ell_1} \in \{-, +\}$, we obtain
$$2^{-h+2}C_0 \ge \big|\omega(n^0_\ell - n^0_{\ell_1}) + \eta_\ell\tilde\omega_{m_\ell} + \eta_{\ell_1}\tilde\omega_{m_{\ell_1}}\big| \ge C_0|n^0_\ell - n^0_{\ell_1}|^{-\tau} \equiv C_0|n_\ell - n_{\ell_1}|^{-\tau}, \tag{5.22}$$
by the second Diophantine conditions in (2.33), as the quantities $\tilde\omega_m$ appearing in (5.21) cancel out. Therefore one obtains by the inductive hypothesis
$$N_h^*(\theta) \le 1 + K(\theta_1)E_h^{-1} \le 1 + K(\theta)E_h^{-1} - K(T)E_h^{-1} \le K(\theta)E_h^{-1}, \tag{5.23}$$
hence the first bound in (5.17) is proved. If $E_0(\theta_1) = \emptyset$, one has
$$N_h^*(\theta) \le 1 + K(\theta_1)E_h^{-1} - 1 \le 1 + K(\theta)E_h^{-1} - 1 \le K(\theta)E_h^{-1}, \tag{5.24}$$
and (5.17) follows also in such a case.
The following bound for $V_T^{h+1}(\pm\tilde\omega_m, m)$, $h \ge h_0$, can then be obtained.

Lemma 9. Assume that there exists a constant $C$ such that one has $|\Gamma| \le C\varepsilon$ and $|\nu^{(c)}_{h,m}| \le C\varepsilon$, with $c = a, b$, for all $|m| \ge 1$ and all $h \ge 0$. Then if $\varepsilon$ is small enough, for all $h \ge 0$ and for all $T \in \mathcal{T}^{(c)}_h$ one has
$$|V_T^{h+1}(\pm\tilde\omega_m, m)| \le B^{|V(T)|}\, e^{-\kappa 2^{(h-1)/\tau}/4}\, e^{-\kappa K(T)/4}\, \varepsilon^{|V_v^{(1)}(T)|+|V_w(T)|}, \tag{5.25}$$
where $B$ is a constant and $K(T)$ is defined in (4.7).

Proof. By using Lemma 7 we obtain for all $T \in \mathcal{T}^{(c)}_h$ and assuming $h \ge h_0$ we get the bound
$$\begin{aligned}
\big|V_T^{h+1}(\pm\tilde\omega_m, m)\big| \le{}& B^{|V(T)|}\,\varepsilon^{|V_v^{(1)}(T)|+|V_w(T)|} \\
&\times \Big(\prod_{h'=h_0}^{h} \exp\Big[4K(T)\log 2\; h'\,2^{(2-h')/\tau} - C_{h'}(T) + S_{h'}(T) + M_{h'}^{\nu}(T)\Big]\Big) \\
&\times \Big(\prod_{\substack{T'\subset T\\ h_{T'}^{(e)}\ge h_0}} 2^{-h_{T'}^{(e)}+h_{T'}}\Big)\Big(\prod_{h'=h_0}^{h} 2^{-h'M_{h'}^{\nu}(T)}\Big)\, e^{-\kappa|K(T)|/2},
\end{aligned} \tag{5.26}$$
where $B$ is a suitable constant. If $h < h_0$ the bound trivializes as the r.h.s. reduces simply to $C^{|V(T)|}\,|\varepsilon|^{|V_v^{(1)}(T)|+|V_w(T)|}\, e^{-\frac{\kappa}{2}|K(T)|}$. The main difference with respect to Lemma 6 is that, given a self-energy graph $T \in \mathcal{T}^{(c)}_h$, there is at least a line $\ell \in L(T)$ on scale $h_\ell = h$ and with propagator
$$\frac{1}{-\omega^2\big(n^0_\ell + \eta_\ell\,\bar\omega_m\big)^2 + \tilde\omega^2_{m^0_\ell+\eta_\ell m}}, \tag{5.27}$$
where $\eta_\ell = 1$ if the line belongs to the path of lines connecting the entering line (carrying a momentum $(n, m)$) of $T$ with the line coming out of $T$, and $\eta_\ell = 0$ otherwise. Then one has by the Mel'nikov conditions
$$C_0|n^0_\ell|^{-\tau} \le \big|\omega n^0_\ell + \eta_\ell\tilde\omega_m \pm \tilde\omega_{m^0_\ell+\eta_\ell m}\big| \le C_0 2^{-h+1}, \tag{5.28}$$
so that $|n^0_\ell| \ge 2^{(h-1)/\tau}$. On the other hand one has $|n^0_\ell| \le K(T)$, hence $K(T) \ge 2^{(h-1)/\tau}$; so we get the bound (5.25).

It is an immediate consequence of the above lemma that for all $\mu_0 > 0$ there exists $\varepsilon_0 > 0$ such that for all $|\mu| \le \mu_0$ and $0 < \varepsilon < \varepsilon_0$ one has $|\beta^{(c)}_{h,m}| \le B_1\varepsilon|\mu|$, with $B_1$ a suitable constant.

We have then proved convergence assuming that the parameters $\nu_{h,m}$ and $\Gamma$ are bounded; we have to show that this is actually the case, if the $\nu^{(c)}_m$ in (2.34) are chosen in a proper way.

We start proving that it is possible to choose $\nu^{(c)} = \{\nu^{(c)}_m\}_{|m|\ge1}$ such that, for a suitable positive constant $C$, one has $|\nu^{(c)}_{h,m}| \le C\varepsilon$ for all $h \ge 0$ and for all $|m| \ge 1$. For any sequence $a \equiv \{a_m\}_{|m|\ge1}$ we introduce the norm
$$\|a\|_\infty = \sup_{|m|\ge1} |a_m|. \tag{5.29}$$
Then we have the following result.

Lemma 10. Assume that there exists a constant $C$ such that $|\Gamma| \le C\varepsilon$. Then for all $\mu_0 > 0$ there exists $\varepsilon_0 > 0$ such that for all $|\mu| \le \mu_0$ and for all $0 < \varepsilon < \varepsilon_0$ there is a family of intervals $I^{(\bar h)}_{c,m}$, $\bar h \ge 0$, $|m| \ge 1$, $c = a, b$, such that $I^{(\bar h+1)}_{c,m} \subset I^{(\bar h)}_{c,m}$, $|I^{(\bar h)}_{c,m}| \le 2\varepsilon(\sqrt{2})^{-(\bar h+1)}$ and, if $\nu^{(c)}_m \in I^{(\bar h)}_{c,m}$, then
$$\|\nu^{(c)}_h\|_\infty \le D\varepsilon, \qquad \bar h \ge h \ge 0, \tag{5.30}$$
for some positive constant $D$. Finally one has $\nu^{(c)}_{h,-m} = \nu^{(c)}_{h,m}$, $c = a, b$, for all $\bar h \ge h \ge 0$ and for all $|m| \ge 1$. Therefore one has $\|\nu_h\|_\infty \le D\varepsilon$ for all $h \ge 0$, for some positive constant $D$; in particular $|\nu_m| \le D\varepsilon$ for all $m \ge 1$.

Proof. The proof is done by induction on $\bar h$. Let us define $J^{(h)}_{c,m} = [-\varepsilon, \varepsilon]$ and call $J^{(h)} = \times_{|m|\ge1,\,c=a,b}\, J^{(h)}_{c,m}$ and $I^{(h)} = \times_{|m|\ge1,\,c=a,b}\, I^{(h)}_{c,m}$.

We suppose that there exists $I^{(\bar h)}$ such that, if $\nu$ spans $I^{(\bar h)}$ then $\nu_{\bar h}$ spans $J^{(\bar h)}$ and $|\nu^{(c)}_{h,m}| \le D\varepsilon$ for $\bar h \ge h \ge 0$; we want to show that the same holds for $\bar h + 1$. Let us call $\tilde J^{(\bar h+1)}$ the interval spanned by $\{\nu^{(c)}_{\bar h+1,m}\}_{|m|\ge1,\,c=a,b}$ when $\{\nu^{(c)}_m\}_{|m|\ge1,\,c=a,b}$ span $I^{(\bar h)}$. For any $\{\nu^{(c)}_m\}_{|m|\ge1,\,c=a,b} \in I^{(\bar h)}$ one has $\{\nu^{(c)}_{\bar h+1,m}\}_{|m|\ge1,\,c=a,b} \in [-2\varepsilon - D\varepsilon^2, 2\varepsilon + D\varepsilon^2]$, where the bound (5.25) has been used. This means that $J^{(\bar h+1)}$ is strictly contained in $\tilde J^{(\bar h+1)}$. On the other hand it is obvious that there is a one-to-one correspondence between $\{\nu^{(c)}_m\}_{|m|>1,\,c=a,b}$ and the sequence $\{\nu^{(c)}_{h,m}\}_{|m|\ge1,\,c=a,b}$, $\bar h + 1 \ge h \ge 0$. Hence there is a set $I^{(\bar h+1)} \subset I^{(\bar h)}$ such that, if $\{\nu^{(c)}_m\}_{|m|\ge1,\,c=a,b}$ spans $I^{(\bar h+1)}$, then $\{\nu^{(c)}_{\bar h+1,m}\}_{|m|\ge1,\,c=a,b}$ spans the interval $J^{(\bar h+1)}$ and, for $\varepsilon$ small enough, $|\nu_h|_\infty \le C\varepsilon$ for $\bar h + 1 \ge h \ge 0$.
The previous computations also show that the inductive hypothesis is verified also for $\bar h = 0$ so that we have proved that there exists a decreasing set of intervals $I^{(\bar h)}$ such that if $\{\nu^{(c)}_m\}_{|m|>1,\,c=a,b} \in I^{(\bar h)}$ then the sequence $\{\nu^{(c)}_{h,m}\}_{|m|\ge1,\,c=a,b}$ is well defined for $h \le \bar h$ and it verifies $|\nu^{(c)}_{h,m}| \le C|\varepsilon|$. In order to prove the bound on the size of $I^{(\bar h)}_{c,m}$ let us denote by $\{\nu^{(c)}_{h,m}\}_{|m|\ge1,\,c=a,b}$ and $\{\nu'^{(c)}_{h,m}\}_{|m|\ge1,\,c=a,b}$, $0 \le h \le \bar h$, the sequences corresponding to $\{\nu^{(c)}_m\}_{|m|\ge1,\,c=a,b}$ and $\{\nu'^{(c)}_m\}_{|m|\ge1,\,c=a,b}$ in $I^{(\bar h)}$. We have
$$\mu\nu^{(c)}_{h+1,m} - \mu\nu'^{(c)}_{h+1,m} = 2\big(\mu\nu^{(c)}_{h,m} - \mu\nu'^{(c)}_{h,m}\big) + \beta^{(c)}_{h,m} - \beta'^{(c)}_{h,m}, \tag{5.31}$$
where $\beta^{(c)}_{h,m}$ and $\beta'^{(c)}_{h,m}$ are shorthands for the beta functions. Then, as $|\nu_k - \nu'_k|_\infty \le |\nu_h - \nu'_h|_\infty$ for all $k \le h$, we have
$$|\nu_h - \nu'_h|_\infty \le \frac{1}{2}|\nu_{h+1} - \nu'_{h+1}|_\infty + D\varepsilon^2|\nu_h - \nu'_h|_\infty. \tag{5.32}$$
Hence if $\varepsilon$ is small enough then one has
$$\|\nu - \nu'\|_\infty \le (\sqrt{2})^{-(\bar h+1)}\,\|\nu_{\bar h} - \nu'_{\bar h}\|_\infty. \tag{5.33}$$
Since, by definition, if $\nu$ spans $I^{(\bar h)}$, then $\nu_{\bar h}$ spans the interval $J^{(\bar h)}$, of size $2|\varepsilon|$, the size of $I^{(\bar h)}$ is bounded by $2|\varepsilon|(\sqrt{2})^{(-\bar h-1)}$.

Finally note that one can choose $\nu^{(c)}_m = \nu^{(c)}_{-m}$ and then $\nu^{(c)}_{h,m} = \nu^{(c)}_{h,-m}$ for any $|m| \ge 1$ and any $\bar h \ge h \ge 0$; this follows from the fact that the function $\beta^{(c)}_{k,m}$ in (5.15) is even under the exchange $m \to -m$; it depends on $m$ through $\tilde\omega_m$ (which is an even function of $m$), through the end-points $v \in E(\theta)$ (which are odd under the exchange $m \to -m$; but their number must be even) and finally through $\nu^{(c)}_{k,m}$ which are assumed inductively to be even.
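It will be useful to explicitly construct the $\nu^{(c)}_{h,m}$ by a contraction method. By iterating (5.14) we find, for $|m| \ge 1$,
$$\mu\nu^{(c)}_{h,m} = 2^{h+1}\Big(\mu\nu^{(c)}_m + \sum_{k=-1}^{h-1} 2^{-k-2}\,\beta^{(c)}_{k,m}\big(\tilde\omega, \varepsilon, \{\nu^{(c')}_{k',m'}\}\big)\Big), \tag{5.34}$$
where $\beta^{(c)}_{k,m}(\tilde\omega, \varepsilon, \{\nu^{(c')}_{k',m'}\})$ depends on $\nu^{(c')}_{k',m'}$ with $k' \le k - 1$. If we put $h = \bar h$ in (5.34) we get
$$\mu\nu^{(c)}_m = -\sum_{k=-1}^{\bar h-1} 2^{-k-2}\,\beta^{(c)}_{k,m}\big(\tilde\omega, \varepsilon, \{\nu^{(c')}_{k',m'}\}\big) + 2^{-\bar h-1}\mu\nu^{(c)}_{\bar h,m} \tag{5.35}$$
and, combining (5.34) with (5.35), we find, for $\bar h > h \ge 0$,
$$\mu\nu^{(c)}_{h,m} = -2^{h+1}\Big(\sum_{k=h}^{\bar h-1} 2^{-k-2}\,\beta^{(c)}_{k,m}\big(\tilde\omega, \varepsilon, \{\nu^{(c')}_{k',m'}\}\big)\Big) + 2^{h-\bar h}\mu\nu^{(c)}_{\bar h,m}. \tag{5.36}$$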
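A backward relation of the form (5.36) lends itself to a fixed-point (contraction) iteration: start from the zero sequence and repeatedly evaluate the right-hand side on the previous iterate. The Python sketch below does this for a scalar toy model with a single mode and $\mu = 1$. The function `toy_beta` is a hypothetical placeholder that is merely $O(\varepsilon)$ and Lipschitz in the couplings; it is not the beta function (5.15), and all names and sample values are assumptions made for illustration.

```python
def toy_beta(k, nu, eps):
    """Hypothetical stand-in for the beta function: O(eps) and Lipschitz in nu."""
    coupling = sum(2.0 ** (-abs(j - k)) * nu[j] for j in range(len(nu)))
    return eps * (1.0 + coupling)


def backward_step(nu_prev, nu_top, eps):
    """Evaluate the right-hand side of the backward relation on the previous iterate.

    nu_prev[h] holds the coupling on scale h for 0 <= h < h_bar;
    nu_top is the prescribed value on the last scale h_bar.
    """
    h_bar = len(nu_prev)
    nu_next = []
    for h in range(h_bar):
        tail = sum(2.0 ** (-k - 2) * toy_beta(k, nu_prev, eps) for k in range(h, h_bar))
        nu_next.append(-(2.0 ** (h + 1)) * tail + 2.0 ** (h - h_bar) * nu_top)
    return nu_next


def solve_running_couplings(h_bar, nu_top, eps, iterations=40):
    """Iterate from the zero sequence; return the last iterate and the successive gaps."""
    nu = [0.0] * h_bar
    gaps = []
    for _ in range(iterations):
        nu_new = backward_step(nu, nu_top, eps)
        gaps.append(max(abs(a - b) for a, b in zip(nu_new, nu)))
        nu = nu_new
    return nu, gaps


if __name__ == "__main__":
    couplings, gaps = solve_running_couplings(h_bar=10, nu_top=0.0, eps=0.01)
    print(couplings[:3], gaps[:4])
```

In this toy model the map is a contraction with constant at most $3\varepsilon$, so the gaps between successive iterates shrink geometrically, and the limit satisfies the forward recursion $\nu_{h+1} = 2\nu_h + \beta_h$ with the prescribed value on the last scale.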
The sequences $\{\nu^{(c)}_{h,m}\}_{|m|>1}$, $\bar h > h \ge h_0$, parameterized by $\{\nu^{(c)}_{\bar h,m}\}_{|m|\ge2}$ such that $\|\nu^{(c)}_{\bar h}\|_\infty \le C\varepsilon$, can be obtained as the limit as $q \to \infty$ of the sequences $\{\nu^{(c)(q)}_{h,m}\}$, $q \ge 0$, defined recursively as
$$\begin{aligned}
\mu\nu^{(c)(0)}_{h,m} &= 0, \\
\mu\nu^{(c)(q)}_{h,m} &= -2^{h+1}\Big(\sum_{k=h}^{\bar h-1} 2^{-k-2}\,\beta^{(c)}_{k,m}\big(\tilde\omega, \varepsilon, \{\nu^{(c')(q-1)}_{k',m'}\}\big)\Big) + 2^{h-\bar h}\mu\nu^{(c)}_{\bar h,m}.
\end{aligned} \tag{5.37}$$
In fact, it is easy to show inductively that, if $\varepsilon$ is small enough, $\|\nu^{(q)}_h\|_\infty \le C\varepsilon$, so that (5.25) is meaningful, and
$$\max_{0\le h\le\bar h} \big\|\nu^{(q)}_h - \nu^{(q-1)}_h\big\|_\infty \le (C\varepsilon)^q. \tag{5.38}$$
For $q = 1$ this is true as $\nu^{(c)(0)}_h = 0$; for $q > 1$ it follows by the fact that $\beta^{(c)}_k(\tilde\omega, \varepsilon, \{\nu^{(c')(q-1)}_{k',m'}\}) - \beta^{(c)}_k(\tilde\omega, \varepsilon, \{\nu^{(c')(q-2)}_{k',m'}\})$ can be written as a sum of terms in which there are at least one $\nu$-vertex, with a difference $\nu^{(c')(q-1)}_{h'} - \nu^{(c')(q-2)}_{h'}$, with $h' \ge k$, in place of the corresponding $\nu^{(c')}_{h'}$, and one node carrying an $\varepsilon$. Then $\nu^{(q)}_h$ converges as $q \to \infty$, for $\bar h > h \ge 1$, to a limit $\nu_h$, satisfying the bound $\|\nu_h\|_\infty \le C\varepsilon$. Since the solution is unique, it must coincide with one in Lemma 10.

We have then constructed a sequence of $\nu^{(c)}_{h,m}$ solving (5.36) for any $\bar h > 1$ and any $\nu^{(c)}_{\bar h,m}$; we shall call $\nu^{(c)}_{h,m}(\Gamma)$ the solution of (5.36) with $\bar h = \infty$ and $\nu^{(c)}_{\infty,m} = 0$, to stress the dependence on $\Gamma$. We will prove the following lemma.

Lemma 11. Under the same conditions of Lemma 10 it holds that for any $h \ge 0$,
$$\|\nu_h(\Gamma_1) - \nu_h(\Gamma_2)\|_\infty \le D\varepsilon|\Gamma_1 - \Gamma_2|, \tag{5.39}$$
for a suitable constant $D$.

Proof. Calling $\nu^{(q)}_h(\Gamma)$ the l.h.s. of (5.37) with $\bar h = \infty$ and $\nu_{\infty,m} = 0$, we can show by induction on $q$ that
$$\big\|\nu^{(q)}_h(\Gamma_1) - \nu^{(q)}_h(\Gamma_2)\big\|_\infty \le D\varepsilon|\Gamma_1 - \Gamma_2|. \tag{5.40}$$
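We find it convenient to write explicitly the dependence of the function $\beta^{(c)}_{h,m}$ on the parameter $\Gamma$, so that we rewrite $\beta^{(c)}_{k,m}(\tilde\omega, \varepsilon, \{\nu^{(c')(q-1)}_{k',m'}\})$ in the r.h.s. of (5.37) as $\beta^{(c)}_{k,m}(\tilde\omega, \varepsilon, \Gamma, \{\nu^{(c')(q-1)}_{k',m'}(\Gamma)\})$. Then from (5.37) we get
$$\mu\nu^{(c)(q)}_{h,m}(\Gamma_1) - \mu\nu^{(c)(q)}_{h,m}(\Gamma_2) = -\sum_{k=h}^{\infty} 2^{h-k-1}\Big[\beta^{(c)}_{k,m}\big(\tilde\omega, \varepsilon, \Gamma_1, \{\nu^{(c')(q-1)}_{k',m'}(\Gamma_1)\}\big) - \beta^{(c)}_{k,m}\big(\tilde\omega, \varepsilon, \Gamma_2, \{\nu^{(c')(q-1)}_{k',m'}(\Gamma_2)\}\big)\Big]. \tag{5.41}$$
When $q = 1$ we have that $\beta^{(c)}_{k,m}(\tilde\omega, \varepsilon, \Gamma_1, \{\nu^{(c')(q-1)}_{k',m'}(\Gamma_1)\}) - \beta^{(c)}_{k,m}(\tilde\omega, \varepsilon, \Gamma_2, \{\nu^{(c')(q-1)}_{k',m'}(\Gamma_2)\})$ is given by a sum of self-energy graphs with one node $V$ with a factor $\eta_V$ with $\Gamma$ replaced by $\Gamma_1 - \Gamma_2$; as there is at least a vertex $V$ of $w$-type by the definition of the self-energy graphs we obtain
$$\big\|\nu^{(1)}_h(\Gamma_1) - \nu^{(1)}_h(\Gamma_2)\big\|_\infty \le \big(D_1\varepsilon + \tilde D_1\varepsilon^2\big)|\Gamma_1 - \Gamma_2|, \tag{5.42}$$
for positive constants $D_1 < 2D$ and $\tilde D_1$, where $D_1\varepsilon|\Gamma_1 - \Gamma_2|$ is a bound for the self-energy first order contribution. For $q > 1$ we can write the difference in (5.41) as
$$\begin{aligned}
&\Big(\beta^{(c)}_{k,m}\big(\tilde\omega, \varepsilon, \Gamma_1, \{\nu^{(c')(q-1)}_{k',m'}(\Gamma_1)\}\big) - \beta^{(c)}_{k,m}\big(\tilde\omega, \varepsilon, \Gamma_2, \{\nu^{(c')(q-1)}_{k',m'}(\Gamma_1)\}\big)\Big) \\
&\quad + \Big(\beta^{(c)}_{k,m}\big(\tilde\omega, \varepsilon, \Gamma_2, \{\nu^{(c')(q-1)}_{k',m'}(\Gamma_1)\}\big) - \beta^{(c)}_{k,m}\big(\tilde\omega, \varepsilon, \Gamma_2, \{\nu^{(c')(q-1)}_{k',m'}(\Gamma_2)\}\big)\Big).
\end{aligned} \tag{5.43}$$
The first difference is given by a sum over self-energy graphs with one node $V$ with a factor $\eta_V$ with $\Gamma$ replaced by $\Gamma_1 - \Gamma_2$; the other difference is given by a sum over self-energy graphs with a $\nu$-vertex with which is associated a factor $\nu^{(c')(q-1)}_{k',m'}(\Gamma_1) - \nu^{(c')(q-1)}_{k',m'}(\Gamma_2)$; hence we find
$$\big\|\nu^{(q)}_h(\Gamma_1) - \nu^{(q)}_h(\Gamma_2)\big\|_\infty \le \big(D_1\varepsilon + D_3\varepsilon^2\big)|\Gamma_1 - \Gamma_2| + \varepsilon D_2 \sup_{h\ge0}\big\|\nu^{(q-1)}_h(\Gamma_1) - \nu^{(q-1)}_h(\Gamma_2)\big\|_\infty, \tag{5.44}$$
where $D_1\varepsilon|\Gamma_1 - \Gamma_2|$ is a bound for the first order contribution coming from the first line in (5.43), while the last summand in (5.44) is a bound from the terms from the last line of (5.43). Then (5.40) follows with $D = 2D_1$, for $\varepsilon$ small enough.

By using Lemma 11 we can show that the self-consistence equation for $\Gamma$ (5.4) has a unique solution $\Gamma = O(\varepsilon)$.

Lemma 12. For all $\mu_0 > 0$ there exists $\varepsilon_0 > 0$ such that, for all $|\mu| \le \mu_0$ and for all $0 < \varepsilon < \varepsilon_0$, given $\nu^{(c)}_m(\Gamma)$ chosen as in Lemma 9 (with $\bar h = \infty$ and $\nu^{(c)}_{\infty,m} = 0$) it holds that (5.4) has a solution $|\Gamma| \le C\varepsilon$ where $C$ is a suitable constant.

Proof. The solution of (5.4) can be obtained as the limit as $q \to \infty$ of the sequence $\Gamma^{(q)}$, $q \ge 0$, defined recursively as
$$\Gamma^{(0)} = 0, \qquad \Gamma^{(q)} = r_0^{-1}\sum_{k=1}^{\infty} \sum^{*}_{\theta\in\Theta^{(k)(q-1)R}_{n,n}} a_{0,-n}\,\mathrm{Val}(\theta), \tag{5.45}$$
where we define $\Theta^{(k)(q)R}_{n,m}$ as the set of trees identical to $\Theta^{(k)R}_{n,m}$ except that $\Gamma$ in $\eta_V$ is replaced by $\Gamma^{(q)}$ and each $\nu_{h,m}$ is replaced by $\nu_{h,m}(\Gamma^{(q)})$, for all $h \ge 0$, $|m| \ge 1$. Equation (5.45) is a contraction defined on the set $|\Gamma| \le C\varepsilon$, for $\varepsilon$ small. In fact if $|\Gamma^{(q-1)}| \le C\varepsilon$, then by (5.45) $|\Gamma^{(q)}| \le C_1\varepsilon + C_2C\varepsilon^2$, where we have used that the first order contribution to the r.h.s. of (5.45) is $\Gamma$-independent (see Sect. 3), and $C_1\varepsilon < \varepsilon C/$ is a bound for it; hence for $\varepsilon$ small enough (5.45) sends the interval $|\Gamma| \le C\varepsilon$ to itself.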
Moreover we can show inductively that
$$\big\|\Gamma^{(q)} - \Gamma^{(q-1)}\big\|_\infty \le (C\varepsilon)^q. \tag{5.46}$$
For $q = 1$ this is true; for $q > 1$, $\Gamma^{(q)} - \Gamma^{(q-1)}$ can be written as sum of trees in which a) either with a node $V$ is associated a factor proportional to $\Gamma^{(q-1)} - \Gamma^{(q-2)}$; or b) with a $\nu$-vertex is associated $\nu_{h',m'}(\Gamma^{(q-1)}) - \nu_{h',m'}(\Gamma^{(q-2)})$ for some $h', m'$. In the first case we note that the constraint in the sum in the r.h.s. of (5.45) implies that $s_{v_0} = 3$ for the special vertex of $\theta$; hence, item (4) in Sect. 3 says that $V \neq v_0$, so that such terms are bounded by $D_1\varepsilon|\Gamma^{(1)} - \Gamma^{(2)}|$ (a term of order $O(\Gamma_1 - \Gamma_2)$ should have three $v$-lines entering $v_0$ and two of them coming from end-points, which is impossible). In the second case we use (5.39), and we bound such terms by $D_2\varepsilon|\Gamma^{(1)} - \Gamma^{(2)}|$. Hence by induction (5.46) is found, if $C \ge D_1/4, D_2/4, C_1/4$.

We have finally to prove that $\tilde u_{n,m}$ solves the last of (1.14) and (2.2).

Lemma 13. For all $\mu_0 > 0$ there exists $\varepsilon_0 > 0$ such that, for all $|\mu| \le \mu_0$ and for all $0 < \varepsilon < \varepsilon_0$, given $\nu^{(c)}_m(\Gamma)$ chosen as in Lemma 9 and $\Gamma$ chosen as in Lemma 12, then $\tilde u_{n,m}$ solves the last of (1.14) and (2.2).

Proof. Let us consider first the case in which $|n| \neq |m|$ and we call $\Theta^{R}_{n,m} = \cup_k \Theta^{(k)R}_{n,m}$; assume also (what of course is not restrictive) that $n, m$ is such that $\chi_{h_0}(|\omega n| - \tilde\omega_m) + \chi_{h_0+1}(|\omega n| - \tilde\omega_m) = 1$. We call $\Theta^{R}_{n,m,h_0}$ the set of trees $\theta \in \Theta^{R}_{n,m}$ with root line at scale $h_0$, so that
$$\tilde u_{n,m} = \sum_{\theta\in\Theta^{R}_{n,m}} \mathrm{Val}(\theta) = \sum_{\theta\in\Theta^{R}_{n,m,h_0}} \mathrm{Val}(\theta) + \sum_{\theta\in\Theta^{R}_{n,m,h_0+1}} \mathrm{Val}(\theta), \tag{5.47}$$
and we write $\Theta^{R}_{n,m,h_0} = \Theta^{\alpha,R}_{n,m,h_0} \cup \Theta^{\beta,R}_{n,m,h_0}$, where $\Theta^{\alpha,R}_{n,m,h_0}$ are the trees with $s_{V_0} = 1$, while $\Theta^{\beta,R}_{n,m,h_0}$ are the trees with $s_{V_0} = 3$ and $V_0$ is the special vertex (see Definition 2). Then
while n,m,h¯ are the trees with sV0 = 3 and V0 is the special vertex (see Definition 2). Then ¯ (c) Val(θ ) = Val(θ ) g (h) (n, m)2−h0 νh0 ,m θ∈α,R n,m,h
θ∈R n,mc ,h
c=a,b
0
+g +g
¯ (h)
¯ (c) (n, m)2−h νh0 ,m
(h0 +1)
θ∈R n,mc ,h
(c) (n, m)2−h0 νh0 ,m
Val(θ ) 0 +1
θ∈R n,m,h0
+g (h0 +1) (n, m)2−h0 −1 νh0 +1,m (c)
0
Val(θ ) Val(θ ) ,
θ∈R n,mc ,h
(5.48)
0 +1
where $m_c$ is such that $m_a = m$ and $m_b = -m$. On the other hand we can write $\Theta^{\beta,R}_{n,m,h_0} = \Theta^{\beta1,R}_{n,m,h_0} \cup \Theta^{\beta2,R}_{n,m,h_0}$, where $\Theta^{\beta1,R}_{n,m,h_0}$ are the trees such that the root line $\ell_0$ is
β2,R
the external line of a self-energy graph, and n,m,h0 is the complement. Then we can write ¯ Val(θ ) = g (h0 ) (n, m) RVTh (ωn, m) Val(θ ) 0
+g
θ∈R n,mc ,h
T ∈T˜
c=a,b
β1,R
θ∈n,m,h
(h0 )
(c) 0
(n, m)
Val(θ )
RVTh0 (ωn, m)
Val(θ )
(5.49)
θ∈R n,mc ,h0
(c) T ∈T˜
+g (h0 +1) (n, m)
0
θ∈R n,mc ,h0 +1
(c) T ∈T˜
+g (h0 +1) (n, m)
RVTh0 (ωn, m)
RVTh0 +1 (ωn, m)
Val(θ ),
θ∈R n,mc ,h
(c) T ∈T˜
0 +1
0 +1
where T˜
β1,R θ ∈n,m,h 0
Val(θ ) =
g (h0 ) (n, m)
c=a,b
h
VT 0 (ωn, m)
h +1
VT 0
(c)
g (h0 ) (n, m) + g (h0 +1) (n, m)
(ωn, m)
(c)
T ∈T
Val(θ )
(5.50)
θ∈R n,mc ,h0
(c) 0
+νm
Val(θ )
θ∈R n,m,h0 +1
T ∈T
+g (h0 +1) (n, m)
h
VT 0 (ωn, m)
(c) T ∈T
+g (h0 +1) (n, m)
Val(θ )
θ∈R n,mc ,h0
(c) T ∈T
+g (h0 ) (n, m)
h
VT 0 (ωn, m)
Val(θ )
θ∈R n,mc ,h0 +1
θ∈R n,mc ,h0
Val(θ ) +
Val(θ ) .
θ∈R n,mc ,h0 +1
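The last line is equal to
$$\frac{1}{-\omega^2n^2 + \tilde\omega_m^2}\Big[\nu^{(a)}_m\tilde u_{n,m} + \nu^{(b)}_m\tilde u_{n,-m}\Big], \tag{5.51}$$
while adding the first three lines in (5.49) to $\sum_{\theta\in\Theta^{\beta1,R}_{n,m,h_0}} \mathrm{Val}(\theta)$ we get
$$\varepsilon\sum_{\substack{n_1+n_2+n_3=n\\ m_1+m_2+m_3=m}} \tilde u_{n_1,m_1}\,\tilde u_{n_2,m_2}\,\tilde u_{n_3,m_3}, \tag{5.52}$$
from which we get that $\sum_{\theta\in\Theta^{R}_{n,m}} \mathrm{Val}(\theta)$, for $\mu = 1$, is a formal solution of (2.2). A similar result holds for $|n| = |m|$.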
6. Construction of the Perturbed Frequencies

In the following it will be convenient to set $\tilde\omega = \{\tilde\omega_m\}_{|m|\ge2}$. By the analysis of the previous sections we have found the counterterms $\{\nu_m(\tilde\omega, \varepsilon)\}_{|m|\ge2}$ as functions of $\varepsilon$ and $\tilde\omega$. We have now to invert the relations
$$\tilde\omega_m^2 - \nu_m(\tilde\omega, \varepsilon) = m^2, \tag{6.1}$$
in order to prove Proposition 2.

We shall show that there exists a sequence of sets $\{E^{(p)}\}_{p=0}^{\infty}$ in $[0, \varepsilon_0]$, such that $E^{(p+1)} \subset E^{(p)}$, and a sequence of functions $\{\tilde\omega^{(p)}(\varepsilon)\}_{p=0}^{\infty}$, with each $\tilde\omega^{(p)} \equiv \tilde\omega^{(p)}(\varepsilon)$ defined for $\varepsilon \in E^{(p)}$, such that for all $\varepsilon \in E$, with
$$E = \bigcap_{p=0}^{\infty} E^{(p)} = \lim_{p\to\infty} E^{(p)}, \tag{6.2}$$
there exists the limit
$$\tilde\omega^{(\infty)}(\varepsilon) = \lim_{p\to\infty} \tilde\omega^{(p)}(\varepsilon), \tag{6.3}$$
and it solves (6.1).

To fulfill the program above we shall define since the beginning $\omega = \omega_\varepsilon = \sqrt{1-\varepsilon}$, and we shall follow an iterative scheme by setting, for $|m| \ge 1$,
$$\tilde\omega^{(0)2}_m = m^2, \qquad \tilde\omega^{(p)2}_m \equiv \tilde\omega^{(p)2}_m(\varepsilon) = m^2 + \nu_m(\tilde\omega^{(p-1)}, \varepsilon), \quad p \ge 1, \tag{6.4}$$
and reducing recursively the set of admissible values of $\varepsilon$. We start by imposing on $\varepsilon$ the Diophantine conditions
$$|\omega n \pm m| \ge 2C_0|n|^{-\tau_0} \qquad \forall n \in \mathbb{Z}\setminus\{0\} \text{ and } \forall m \in \mathbb{Z}\setminus\{0\} \text{ such that } |m| \neq |n|, \tag{6.5}$$
where $C_0$ and $\tau_0$ are two positive constants. This will imply some restrictions on the admissible values of $\varepsilon$, as the following result shows.

Lemma 14. For all $0 < C_0 \le 1/2$ there exist $\varepsilon_0 > 0$ and $\gamma_0, \delta_0 > 0$ such that the set $E^{(0)}$ of values $\varepsilon \in [0, \varepsilon_0]$ for which (6.5) are satisfied has Lebesgue measure $\mathrm{meas}(E^{(0)}) \ge \varepsilon_0(1 - \gamma_0\varepsilon_0^{\delta_0})$ provided that one has $\tau_0 > 1$.

Proof. For $(n, m)$ such that $|\omega n \pm m| \ge 2C_0$ the Diophantine conditions in (6.5) are trivially satisfied. We consider then $(n, m)$ such that $|\omega n \pm m| < 2C_0$ and we write, if $0 < C_0 \le 1/2$,
$$1 > 2C_0 > |\omega n \pm m| \ge \Big|\frac{\varepsilon n}{1+\sqrt{1-\varepsilon}} - (n \pm m)\Big|, \tag{6.6}$$
and as $|n \pm m| \ge 1$ one gets $|n| \ge 1/(2\varepsilon) \ge 1/(2\varepsilon_0)$. Moreover, for fixed $n$, the set $M$ of $m$'s such that $|\omega n \pm m| < 1$ contains at most $2 + \varepsilon_0|n|$ values. By writing
$$f(\varepsilon(t)) = n\sqrt{1-\varepsilon(t)} \pm m = t\,\frac{2C_0}{|n|^{\tau_0}}, \qquad t \in [0, 1], \tag{6.7}$$
and, calling $I^{(0)}$ the set of $\varepsilon$ such that $|\omega n \pm m| < 2C_0|n|^{-\tau_0}$ is verified for some $(n, m)$, one finds for the Lebesgue measure of $I^{(0)}$,
$$\mathrm{meas}(I^{(0)}) = \int_{I^{(0)}} d\varepsilon = \sum_{|n|\ge 1/2\varepsilon_0}\ \sum_{|m|\in M} \int_{-1}^{1} dt\,\Big|\frac{d\varepsilon(t)}{dt}\Big|. \tag{6.8}$$
We have from (6.7)
$$\frac{df}{dt} = \frac{df}{d\varepsilon}\,\frac{d\varepsilon}{dt} = \frac{2C_0}{|n|^{\tau_0}}, \tag{6.9}$$
so that, noting that one has $|\partial f/\partial\varepsilon| \ge |n|/4$, we have to exclude a set of measure
$$\sum_{|n|\ge N} \frac{8C_0}{|n|^{\tau_0+1}}\,(2 + \varepsilon_0|n|) \le \mathrm{const.}\ \varepsilon_0^{\tau_0}, \tag{6.10}$$
and one has to impose $\tau_0 = 1 + \delta_0$, with $\delta_0 > 0$.
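The Diophantine conditions (6.5) can be probed numerically for a given $\varepsilon$. The Python sketch below searches, up to a cutoff, for the first pair $(n, m)$ violating them with $\omega = \sqrt{1-\varepsilon}$. It is only an illustration under stated assumptions: it restricts to $n, m > 0$ and to the combination $\omega n - m$ (for $n, m > 0$ one has $|\omega n + m| \ge 1 \ge 2C_0|n|^{-\tau_0}$ when $C_0 \le 1/2$, and the other sign choices follow by symmetry), and the function name, cutoff and sample parameters are hypothetical choices.

```python
import math


def first_violation(eps, c0, tau0, n_max):
    """Return the first (n, m) with 1 <= n <= n_max, m >= 1, m != n and
    |omega*n - m| < 2*c0*n**(-tau0), where omega = sqrt(1 - eps); None if none exists."""
    omega = math.sqrt(1.0 - eps)
    for n in range(1, n_max + 1):
        x = omega * n
        threshold = 2.0 * c0 * n ** (-tau0)
        # Only the two integers adjacent to omega*n can be closer than 1 to it.
        for m in sorted({math.floor(x), math.ceil(x)}):
            if m >= 1 and m != n and abs(x - m) < threshold:
                return n, m
    return None


if __name__ == "__main__":
    eps, c0, tau0 = 0.01, 0.25, 1.5  # sample values, chosen only for illustration
    print(first_violation(eps, c0, tau0, n_max=1000), 1.0 / (2.0 * eps))
```

In line with the proof of Lemma 14, any violation found this way occurs at $|n| \ge 1/(2\varepsilon)$: for smaller $n$ the only integer close to $\omega n$ is $n$ itself, which is excluded by the condition $|m| \neq |n|$.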
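For $p \ge 1$ the sets $E^{(p)}$ will be defined recursively as
$$E^{(p)} = \Big\{\varepsilon \in E^{(p-1)} : |\omega n \pm \tilde\omega^{(p)}_m| > C_0|n|^{-\tau}\ \ \forall\,|m| \neq |n|,\quad |\omega n \pm (\tilde\omega^{(p)}_m \pm \tilde\omega^{(p)}_{m'})| > C_0|n|^{-\tau}\ \ \forall\,|n| \neq |m \pm m'|\Big\}, \qquad p \ge 1, \tag{6.11}$$
for $\tau > \tau_0$ to be fixed. In Appendix A4 we prove the following result.

Lemma 15. For all $p \ge 1$ one has
$$\big\|\tilde\omega^{(p)}(\varepsilon) - \tilde\omega^{(p-1)}(\varepsilon)\big\|_\infty \le C\varepsilon_0^{p} \qquad \forall\varepsilon \in E^{(p)}, \tag{6.12}$$
for some constant $C$.

Therefore we can conclude that there exists a sequence $\{\tilde\omega^{(p)}(\varepsilon)\}_{p=0}^{\infty}$ converging to $\tilde\omega^{(\infty)}(\varepsilon)$ for $\varepsilon \in E$. We now have to show that the set $E$ has positive (large) measure.

It is convenient to introduce a set of variables $\mu(\tilde\omega, \varepsilon)$ such that
$$\tilde\omega_m + \mu_m(\tilde\omega, \varepsilon) = \omega_m \equiv |m|; \tag{6.13}$$
the variables $\mu(\tilde\omega, \varepsilon)$ and the counterterms are trivially related by
$$-\nu_m(\tilde\omega, \varepsilon) = \mu_m^2(\tilde\omega, \varepsilon) + 2\tilde\omega_m\,\mu_m(\tilde\omega, \varepsilon). \tag{6.14}$$
One can write $\tilde\omega^{(p)}_m = \omega_m - \mu_m(\tilde\omega^{(p-1)}, \varepsilon)$, according to (6.4). We shall impose the Diophantine conditions
$$\begin{aligned}
&\big|\omega n \pm \big(\omega_m - \mu_m(\tilde\omega^{(p-1)}, \varepsilon)\big)\big| > C_0|n|^{-\tau}, \\
&\big|\omega n \pm \big((\omega_m \pm \omega_{m'}) - (\mu_m(\tilde\omega^{(p-1)}, \varepsilon) \pm \mu_{m'}(\tilde\omega^{(p-1)}, \varepsilon))\big)\big| > C_0|n|^{-\tau}.
\end{aligned} \tag{6.15}$$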
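Suppose that for $\varepsilon \in E^{(p-1)}$ the functions $\mu_m(\tilde\omega^{(p-1)}, \varepsilon)$ are well defined; then define $I^{(p)} = I^{(p)}_1 \cup I^{(p)}_2 \cup I^{(p)}_3$, where $I^{(p)}_1$ is the set of values $\varepsilon \in E^{(p-1)}$ verifying the conditions
$$\big|\omega n \pm \big(\omega_m - \mu_m(\tilde\omega^{(p-1)}, \varepsilon)\big)\big| \le C_0|n|^{-\tau}, \tag{6.16}$$
$I^{(p)}_2$ is the set of values $\varepsilon$ verifying the conditions
$$\big|\omega n \pm \big((\omega_m - \omega_{m'}) - (\mu_m(\tilde\omega^{(p-1)}, \varepsilon) - \mu_{m'}(\tilde\omega^{(p-1)}, \varepsilon))\big)\big| \le C_0|n|^{-\tau}, \tag{6.17}$$
and $I^{(p)}_3$ is the set of values $\varepsilon$ verifying the conditions
$$\big|\omega n \pm \big((\omega_m + \omega_{m'}) - (\mu_m(\tilde\omega^{(p-1)}, \varepsilon) + \mu_{m'}(\tilde\omega^{(p-1)}, \varepsilon))\big)\big| \le C_0|n|^{-\tau}. \tag{6.18}$$
For future convenience we shall call, for $i = 1, 2, 3$, $I^{(p)}_i(n)$ the subsets of $I^{(p)}_i$ which verify the Diophantine conditions (6.16), (6.17) and (6.18), respectively, for fixed $n$.

We want to bound the measure of the set $I^{(p)}$. First we need to know a little better the dependence on $\varepsilon$ and $\tilde\omega$ of the counterterms: this is provided by the following result.

Lemma 16. For all $p \ge 1$ and for all $\varepsilon \in E^{(p)}$ there exists a positive constant $C$ such that
$$\begin{aligned}
&\big|\nu_m(\tilde\omega^{(p)}, \varepsilon)\big| \le C\varepsilon, \qquad \big|\partial_{\tilde\omega^{(p)}_{m'}}\nu_m(\tilde\omega^{(p)}, \varepsilon)\big| \le C\varepsilon, \qquad m, m' \ge 1, \\
&\big|\mu_m(\tilde\omega^{(p)}, \varepsilon)\big| \le \frac{C\varepsilon}{m}, \qquad \big|\partial_{\tilde\omega^{(p)}_{m'}}\mu_m(\tilde\omega^{(p)}, \varepsilon)\big| \le \frac{C\varepsilon}{m}, \qquad m, m' \ge 1, \\
&\big|\partial_\varepsilon\mu_m(\tilde\omega, \varepsilon)\big| \le C, \qquad m \ge 1,
\end{aligned} \tag{6.19}$$
where the derivatives are in the sense of Whitney [38].

Proof. The bound for $\nu_m(\tilde\omega^{(p)}, \varepsilon)$ is obvious by construction. In order to prove the bound for $\partial_{\tilde\omega^{(p)}_{m'}}\nu_m(\tilde\omega^{(p)}, \varepsilon)$ note that one has
$$\Big|g^{(h_j)}_{\ell_j}(\tilde\omega') - g^{(h_j)}_{\ell_j}(\tilde\omega) - (\tilde\omega' - \tilde\omega)\,\partial_{\tilde\omega}g^{(h_j)}_{\ell_j}(\tilde\omega)\Big| \le C\,n_j^2\,2^{-3h_j}\,\|\tilde\omega' - \tilde\omega\|_\infty^2, \tag{6.20}$$
from the compact support properties of the propagator. Let us consider the quantity $\nu(\tilde\omega', \varepsilon) - \nu(\tilde\omega, \varepsilon) - (\tilde\omega' - \tilde\omega)\,\partial_{\tilde\omega}\nu(\tilde\omega, \varepsilon)$, where $\partial_{\tilde\omega}\nu(\tilde\omega, \varepsilon)$ denotes the derivative in the sense of Whitney, and note that it can be expressed as a sum over trees each one containing a line with propagator $g^{(h_j)}_{\ell_j}(\tilde\omega') - g^{(h_j)}_{\ell_j}(\tilde\omega) - (\tilde\omega' - \tilde\omega)\,\partial_{\tilde\omega}g^{(h_j)}_{\ell_j}(\tilde\omega)$, by proceeding as in the proof of Lemma 9 of [23]. Then we find
$$\big\|\nu(\tilde\omega', \varepsilon) - \nu(\tilde\omega, \varepsilon) - (\tilde\omega' - \tilde\omega)\,\partial_{\tilde\omega}\nu(\tilde\omega, \varepsilon)\big\|_\infty \le C\varepsilon\,\|\tilde\omega' - \tilde\omega\|_\infty^2, \tag{6.21}$$
and the second bound in (6.19) follows.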
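The bounds for $\mu_m(\tilde\omega^{(p)},\varepsilon)$ and $\partial_{\tilde\omega^{(p)}_{m'}}\mu_m(\tilde\omega^{(p)},\varepsilon)$ simply follow from (6.14), which gives
\[ \mu_m(\tilde\omega^{(p)},\varepsilon) = \frac{\nu_m(\tilde\omega^{(p)},\varepsilon)}{|m|} \left( 1 + \sqrt{1 - \nu_m(\tilde\omega^{(p)},\varepsilon)/m^2} \right)^{-1}. \tag{6.22} \]
In order to prove the last bound in (6.19) we prove by induction the bound
\[ \bigl|\partial_\varepsilon \tilde\omega_m^{(p)}(\omega,\varepsilon)\bigr| \le C, \qquad m \ge 1, \tag{6.23} \]
by assuming that it holds for $\tilde\omega^{(p-1)}$; then from (6.4) we have
\[ 2\tilde\omega_m^{(p)}\, \partial_\varepsilon \tilde\omega_m^{(p)}(\varepsilon) = -\partial_\varepsilon \nu_m(\tilde\omega^{(p-1)}(\varepsilon),\varepsilon) = -\sum_{m' \in \mathbb{Z}} \partial_{\tilde\omega^{(p-1)}_{m'}} \nu_m(\tilde\omega^{(p-1)}(\varepsilon),\varepsilon)\, \partial_\varepsilon \tilde\omega^{(p-1)}_{m'}(\varepsilon) - \partial_\eta \nu_m(\tilde\omega^{(p-1)}(\varepsilon),\eta)\Big|_{\eta=\varepsilon}, \tag{6.24} \]
as it is easy to realize by noting that $\mu_m$ can depend on $\tilde\omega_{m'}$ when $|m'| > |m|$ only if the sum of the absolute values of the mode labels $m_v$ is greater than $|m' - m|$, while we can bound the sum of the contributions with $|m'| < |m|$ by a constant times $\varepsilon$, simply by using the second line in (6.19). Hence from the inductive hypothesis and the proved bounds in (6.19), we obtain
\[ \bigl\| \tilde\omega^{(p)}(\varepsilon') - \tilde\omega^{(p)}(\varepsilon) - (\varepsilon - \varepsilon')\,\partial_\varepsilon \tilde\omega^{(p)}(\varepsilon) \bigr\|_\infty \le C\, |\varepsilon - \varepsilon'|, \tag{6.25} \]
so that also the bound (6.23) and hence the last bound in (6.19) follows.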
Now we can bound the measure of the set we have to exclude: this will conclude the proof of Proposition 2. We start with the estimate of the measure of the set $I_1^{(p)}$. When (6.16) is satisfied one must have
\[ C_2' |m| \le |\omega_m - \mu_m(\tilde\omega^{(p-1)},\varepsilon)| \le \omega|n| + C_0|n|^{-\tau} \le C_2 |n|, \]
\[ C_1' |m| \ge |\omega_m - \mu_m(\tilde\omega^{(p-1)},\varepsilon)| \ge \omega|n| - C_0|n|^{-\tau} \ge C_1 |n|, \tag{6.26} \]
which implies
\[ M_1 |n| \le |m| \le M_2 |n|, \qquad M_1 = \frac{C_1}{C_1'}, \qquad M_2 = \frac{C_2}{C_2'}, \tag{6.27} \]
and it is easy to see that, for fixed $n$, the set $M_0(n)$ of $m$'s such that (6.16) is satisfied contains at most $2 + \varepsilon_0 |n|$ values. Furthermore (by using also (6.5)) from (6.16) one obtains also
\[ 2C_0 |n|^{-\tau_0} \le |\omega|n| - \omega_m| \le |\omega|n| - \omega_m + \mu_m(\tilde\omega^{(p-1)},\varepsilon)| + |\mu_m(\tilde\omega^{(p-1)},\varepsilon)| \le C_0 |n|^{-\tau} + \frac{C\varepsilon_0}{|m|}, \tag{6.28} \]
which implies, together with (6.27), for $\tau \ge \tau_0$,
\[ |n| \ge N_0 \equiv \left( \frac{C_0 M_1}{C \varepsilon_0} \right)^{1/(\tau_0 - 1)}. \tag{6.29} \]
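Let us write $\omega(\varepsilon) = \omega_\varepsilon = \sqrt{1 - \varepsilon}$ and consider the function $\mu(\tilde\omega^{(p-1)},\varepsilon)$: we can define a map $t \to \varepsilon(t)$ such that
\[ f(\varepsilon(t)) \equiv \omega(\varepsilon(t))|n| - \omega_m + \mu_m(\tilde\omega^{(p-1)},\varepsilon(t)) = t\,\frac{C_0}{|n|^\tau}, \qquad t \in [-1, 1], \tag{6.30} \]
describes the interval defined by (6.16); then the Lebesgue measure of $I_1^{(p)}$ is
\[ \operatorname{meas}(I_1^{(p)}) = \int_{I_1^{(p)}} d\varepsilon = \sum_{|n| \ge N_0} \sum_{m \in M_0(n)} \int_{-1}^{1} dt \left| \frac{d\varepsilon(t)}{dt} \right|. \tag{6.31} \]
We have from (6.30),
\[ \frac{df}{dt} = \frac{df}{d\varepsilon}\,\frac{d\varepsilon}{dt} = \frac{C_0}{|n|^\tau}, \tag{6.32} \]
hence
\[ \operatorname{meas}(I_1^{(p)}) = \sum_{|n| \ge N_0} \sum_{m \in M_0(n)} \frac{C_0}{|n|^\tau} \int_{-1}^{1} dt \left| \frac{df(\varepsilon(t))}{d\varepsilon(t)} \right|^{-1}. \tag{6.33} \]
In order to perform the derivative in (6.30) we write
\[ \frac{d\mu_m}{d\varepsilon} = \partial_\varepsilon \mu_m + \sum_{|m'| \ge 1} \partial_{\tilde\omega^{(p-1)}_{m'}} \mu_m\, \partial_\varepsilon \tilde\omega^{(p-1)}_{m'}, \tag{6.34} \]
where $\partial_\varepsilon \mu_m$ and $\partial_\varepsilon \tilde\omega^{(p-1)}_{m'}$ are bounded through Lemma 11. Moreover one has
\[ \bigl| \partial_{\tilde\omega^{(p-1)}_{m'}} \mu_m \bigr| \le C\varepsilon\, e^{-\kappa |m' - m|/2}, \qquad |m'| > |m|, \tag{6.35} \]
and the sum over $m'$ can be dealt with as in (6.24). At the end we get that the sum in (6.34) is $O(\varepsilon)$, and from (6.29) and (6.30) we obtain, if $\varepsilon_0$ is small enough,
\[ \left| \frac{\partial f(\varepsilon(t))}{\partial \varepsilon(t)} \right| \ge \frac{|n|}{4}, \tag{6.36} \]
so that one has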
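\[ \int_{I_1^{(p)}} d\varepsilon \le \text{const.} \sum_{|n| \ge N_0} \frac{C_0 (2 + \varepsilon_0 |n|)}{|n|^{\tau+1}} \le \text{const.}\ \varepsilon_0 \left( \varepsilon_0^{(\tau-1)/(\tau_0-1)} + \varepsilon_0^{(\tau-\tau_0+1)/(\tau_0-1)} \right). \tag{6.37} \]
Therefore for $\tau > \max\{1, \tau_0 - 1\}$ the Lebesgue measure of the set $I_1^{(p)}$ is bounded by $\varepsilon_0^{1+\delta_1}$, with $\delta_1 > 0$.
Now we discuss how to bound the measure of the set $I_2^{(p)}$. We start by noting that from (6.30) we obtain, if $m, \ell > 0$,
\[ \bigl| \tilde\omega^{(p)}_{m+\ell} - \tilde\omega^{(p)}_m - \ell \bigr| \le \frac{C\varepsilon}{m} + \frac{C\varepsilon}{m+\ell}, \tag{6.38} \]
for all $p \ge 1$.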
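As a rough numerical illustration of the bound (6.37) on the measure of $I_1^{(p)}$, the following sketch evaluates the sum over $|n| \ge N_0$ and compares it with $\varepsilon_0^{1+\delta_1}$. All constants ($C_0$, $C$, $M_1$) are set to 1 and the values of $\tau$, $\tau_0$ are chosen arbitrarily; these are assumptions made only for illustration, and the function names are hypothetical.

```python
import math


def tail_sum(power, start, extra=20000):
    """Approximate sum_{n >= start} n**(-power) for power > 1.

    The first `extra` terms are summed explicitly; the remainder is
    replaced by the midpoint-rule integral of x**(-power).
    """
    stop = start + extra
    head = sum(n ** (-power) for n in range(start, stop))
    tail = (stop - 0.5) ** (1.0 - power) / (power - 1.0)
    return head + tail


def excluded_measure_I1(eps0, tau, tau0, C0=1.0, C=1.0, M1=1.0):
    """Sum over |n| >= N0 of C0*(2 + eps0*|n|)/|n|**(tau+1), as in (6.37).

    N0 is taken from (6.29); all constants default to 1 (an assumption).
    """
    N0 = math.ceil((C0 * M1 / (C * eps0)) ** (1.0 / (tau0 - 1.0)))
    return C0 * (2.0 * tail_sum(tau + 1.0, N0) + eps0 * tail_sum(tau, N0))


def delta1(tau, tau0):
    """Smaller of the two exponents appearing on the right of (6.37)."""
    return min(tau - 1.0, tau - tau0 + 1.0) / (tau0 - 1.0)


if __name__ == "__main__":
    tau, tau0 = 3.0, 2.5  # illustrative values with tau > max(1, tau0 - 1)
    d1 = delta1(tau, tau0)
    for eps0 in (1e-2, 1e-3, 1e-4):
        ratio = excluded_measure_I1(eps0, tau, tau0) / eps0 ** (1.0 + d1)
        print(f"eps0={eps0:.0e}  sum / eps0^(1+delta1) = {ratio:.3f}")
```

With these choices the ratio stays of order one as $\varepsilon_0$ decreases, which is what the scaling in (6.37) predicts.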
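By the parity properties of $\tilde\omega_m$ without loss of generality we can confine ourselves to the case $n > 0$, $m' > m \ge 2$, and $|\omega n - (\tilde\omega^{(p)}_{m'} - \tilde\omega^{(p)}_m)| < 1$. Then the discussion proceeds as follows. When the conditions (6.17) are satisfied, one has
\[ 2C_0|n|^{-\tau_0} \le |\omega n - (\omega_{m'} - \omega_m)| \le \bigl|\omega n - (\omega_{m'} - \mu_{m'}(\tilde\omega^{(p-1)},\varepsilon)) + (\omega_m - \mu_m(\tilde\omega^{(p-1)},\varepsilon))\bigr| + \bigl|\mu_{m'}(\tilde\omega^{(p-1)},\varepsilon) - \mu_m(\tilde\omega^{(p-1)},\varepsilon)\bigr| \]
\[ \le C_0|n|^{-\tau} + \frac{C\varepsilon}{m} + \frac{C\varepsilon}{m'} \le C_0|n|^{-\tau} + \frac{2C\varepsilon_0}{m}, \tag{6.39} \]
which implies for $\tau \ge \tau_0$,
\[ |n| \ge N_1 \equiv \left( \frac{C_0}{2C\varepsilon_0} \right)^{1/\tau_0}. \tag{6.40} \]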
We can bound the Lebesgue measure of the set $I_2^{(p)}$ by distinguishing, for fixed $(n, \ell)$, with $\ell = m' - m > 0$, the values $m \le m_0$ and $m > m_0$, where $m_0$ is determined by the request that one has for $m > m_0$,
\[ \frac{2C\varepsilon_0}{m} \le \frac{C_0}{|n|^{\tau_0}}, \tag{6.41} \]
which gives
\[ m_0 = m_0(n) = \frac{2C|n|^{\tau_0}\varepsilon_0}{C_0}. \tag{6.42} \]
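The threshold $m_0(n)$ of (6.42) is simply the value of $m$ at which (6.41) becomes an equality. A minimal sketch of this relation follows; the numerical values of $C$, $C_0$, $\tau_0$ and $\varepsilon_0$ are placeholders chosen for illustration and the function names are hypothetical.

```python
def m0_threshold(n, tau0, eps0, C=1.0, C0=0.5):
    """m0(n) = 2*C*|n|**tau0*eps0/C0, as in (6.42)."""
    return 2.0 * C * abs(n) ** tau0 * eps0 / C0


def condition_6_41(m, n, tau0, eps0, C=1.0, C0=0.5):
    """True when 2*C*eps0/m <= C0/|n|**tau0, i.e. inequality (6.41)."""
    return 2.0 * C * eps0 / m <= C0 / abs(n) ** tau0


if __name__ == "__main__":
    n, tau0, eps0 = 100, 2.0, 1e-3  # illustrative values only
    m0 = m0_threshold(n, tau0, eps0)
    print("m0(n) =", m0)
    for m in (39, 41):
        print(m, condition_6_41(m, n, tau0, eps0))
```

For $m$ just above $m_0(n)$ the inequality holds and for $m$ just below it fails, so only the values $m < m_0(n)$ require a further exclusion of values of $\varepsilon$.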
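Therefore for $m > m_0$ and $\tau \ge \tau_0$ one has, from (6.5), (6.38) and (6.41),
\[ \bigl|\omega n - (\tilde\omega^{(p)}_{m'} - \tilde\omega^{(p)}_m)\bigr| \ge |\omega n - \ell| - \frac{2C\varepsilon}{m} \ge \frac{2C_0}{|n|^{\tau_0}} - \frac{C_0}{|n|^{\tau_0}} \ge \frac{C_0}{|n|^{\tau_0}} \ge \frac{C_0}{|n|^{\tau}}, \tag{6.43} \]
so that one has to exclude no further value from $E^{(p-1)}$, provided one takes $\tau \ge \tau_0$. For $m < m_0$ define $L_0$ such that
\[ C_3' \ell \le \tilde\omega^{(p)}_{m+\ell} - \tilde\omega^{(p)}_m < \omega n + 1 < C_3 |n|, \qquad L_0 = \frac{C_3}{C_3'}, \tag{6.44} \]
where (6.26) has been used. Again, for fixed $n$, the set $L_0(n)$ of $\ell$'s such that (6.26) is satisfied with $m' - m = \ell$ contains at most $2 + \varepsilon_0|n|$ values. Therefore, by reasoning as in obtaining (6.37), one finds that for $m < m_0$ one has to exclude from $E^{(p-1)}$ a set of measure bounded by a constant times
\[ \sum_{|n| \ge N_1} \sum_{\ell \in L_0(n)} \sum_{m < m_0(n)} \frac{C_0}{|n|^{\tau+1}} \le \text{const.}\ \varepsilon_0 \left( \varepsilon_0^{(\tau-\tau_0)/\tau_0} + \varepsilon_0^{1+(\tau-\tau_0-1)/\tau_0} \right), \tag{6.45} \]
so that, provided that one has $\tau > \tau_0 + 1 > 2$, the Lebesgue measure of the set $I_2^{(p)}$ is bounded by $\varepsilon_0^{1+\delta_2}$, with $\delta_2 > 0$.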
Finally we study the measure of the set $I_3^{(p)}$. If $n > 0$, $|m'| > |m|$ and $|(\tilde\omega^{(p)}_m + \tilde\omega^{(p)}_{m'}) - \omega n| < 1$ (which again is the only case we can confine ourselves to study), then one has to sum over $|n| \ge N_1$, with $N_1$ given by (6.40). For such values of $n$ one has
\[ \bigl|\omega n - (\tilde\omega^{(p)}_m + \tilde\omega^{(p)}_{m'})\bigr| \ge \bigl|\omega n - (|m| + |m'|)\bigr| - \frac{2C\varepsilon}{|m|} \ge \frac{2C_0}{|n|^{\tau_0}} - \frac{C_0}{|n|^{\tau_0}} \ge \frac{C_0}{|n|^{\tau_0}} \ge \frac{C_0}{|n|^{\tau}}, \tag{6.46} \]
as soon as $|m| > m_0$, with $m_0$ given by (6.42). Therefore we have to take into account only the values of $m$ such that $|m| < m_0$, and we can also note that $|m'|$ is uniquely determined by the values of $n$ and $m$. Then one can proceed as in the previous case and in the end one excludes a further subset of $E^{(p-1)}$ whose Lebesgue measure is bounded by a constant times
\[ \sum_{|n| \ge N_1} \sum_{m < m_0(n)} \frac{C_0}{|n|^{\tau+1}} \le \text{const.}\ \varepsilon_0^{1+(\tau-\tau_0)/\tau_0}, \tag{6.47} \]
so that the Lebesgue measure of the set $I_3^{(p)}$ is bounded by $\varepsilon_0^{1+\delta_3}$, with $\delta_3 > 0$, provided that one takes $\tau > \tau_0$.
By summing together the bounds for $I_1^{(p)}$, for $I_2^{(p)}$ and for $I_3^{(p)}$, the bound
\[ \operatorname{meas}(I^{(p)}) \le b\, \varepsilon_0^{\delta+1} \tag{6.48} \]
with $\delta > 0$, follows for all $p \ge 1$, if $\tau$ is chosen to be $\tau > \tau_0 + 1 > 2$. We can conclude the proof of Proposition 2 through the following result, which shows that the bound (6.48) essentially extends to the union of all $I^{(p)}$ (at the cost of taking a larger constant $B$ instead of $b$).
Lemma 17. Define $I^{(p)}$ as the set of values in $E^{(p)}$ verifying (6.16) to (6.18) for $\tau > 1$. Then one has, for two suitable positive constants $B$ and $\delta$,
\[ \operatorname{meas}\Bigl( \bigcup_{p=0}^{\infty} I^{(p)} \Bigr) \le B \varepsilon_0^{\delta+1}, \tag{6.49} \]
where meas denotes the Lebesgue measure.
Proof. First of all we check that, if we call $\varepsilon_j^{(p)}(n)$ the centers of the intervals $I_j^{(p)}(n)$, with $j = 1, 2, 3$, then one has
\[ \bigl| \varepsilon_j^{(p+1)}(n) - \varepsilon_j^{(p)}(n) \bigr| \le D \varepsilon_0^{p}, \tag{6.50} \]
for a suitable constant $D$.
The center $\varepsilon_1^{(p)}(n)$ is defined by the condition
\[ \omega n \pm \bigl( \omega_m - \mu_m(\tilde\omega^{(p-1)}(\varepsilon_1^{(p)}(n)), \varepsilon_1^{(p)}(n)) \bigr) = 0, \tag{6.51} \]
where Whitney extensions are considered outside $E^{(p-1)}$; then, by subtracting (6.51) from the analogous expression for $p + 1$, we have
\[ \mu_m(\tilde\omega^{(p)}(\varepsilon_1^{(p+1)}(n)), \varepsilon_1^{(p+1)}(n)) - \mu_m(\tilde\omega^{(p-1)}(\varepsilon_1^{(p)}(n)), \varepsilon_1^{(p)}(n)) = 0. \tag{6.52} \]
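In (6.52) one has
\[ \mu_m(\tilde\omega'(\varepsilon'),\varepsilon') - \mu_m(\tilde\omega(\varepsilon),\varepsilon) = \bigl( \mu_m(\tilde\omega'(\varepsilon'),\varepsilon') - \mu_m(\tilde\omega(\varepsilon'),\varepsilon') \bigr) + \bigl( \mu_m(\tilde\omega(\varepsilon'),\varepsilon') - \mu_m(\tilde\omega(\varepsilon),\varepsilon') \bigr) + \bigl( \mu_m(\tilde\omega(\varepsilon),\varepsilon') - \mu_m(\tilde\omega(\varepsilon),\varepsilon) \bigr), \tag{6.53} \]
and, from Lemma 13,
\[ \bigl| \mu_m(\tilde\omega'(\varepsilon'),\varepsilon') - \mu_m(\tilde\omega(\varepsilon'),\varepsilon') \bigr| \le C\varepsilon\, \|\tilde\omega' - \tilde\omega\|_\infty, \qquad \bigl| \mu_m(\tilde\omega(\varepsilon'),\varepsilon') - \mu_m(\tilde\omega(\varepsilon),\varepsilon') \bigr| \le C\varepsilon\, |\varepsilon' - \varepsilon|, \tag{6.54} \]
so that we get, by Lemma 10,
\[ \bigl| \varepsilon_1^{(p+1)}(n) - \varepsilon_1^{(p)}(n) \bigr| \le C \varepsilon_0^{p}, \tag{6.55} \]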
for a suitable positive constant $C$. This proves the bound (6.50) for $j = 1$. Analogously one can consider the cases $j = 2$ and $j = 3$, and a similar result is found.
By (6.10), (6.51), (6.54) and (6.12) it follows for $p > p_0$,
\[ \bigl| \varepsilon_j^{(p)}(n) - \varepsilon_j^{(p_0)}(n) \bigr| \le C \sum_{k=p_0}^{\infty} \varepsilon_0^{k} = C\, \frac{\varepsilon_0^{p_0}}{1 - \varepsilon_0}, \tag{6.56} \]
so that one can ensure that $|\varepsilon_j^{(p)}(n) - \varepsilon_j^{(p_0)}(n)| \le C_0 |n|^{-\tau}$ for $p \ge p_0$ by choosing
\[ p_0 = p_0(n, j) \le \text{const.}\ \log |n|. \tag{6.57} \]
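The logarithmic growth of $p_0$ in (6.57) can be made concrete with a small sketch: given the geometric tail of (6.56), it searches for the smallest $p_0$ for which the tail drops below $C_0|n|^{-\tau}$. The constants $C$ and $C_0$ are set to 1 and the values of $\tau$ and $\varepsilon_0$ are illustrative assumptions; the function names are hypothetical.

```python
def geometric_tail(eps0, p0):
    """Closed form of sum_{k >= p0} eps0**k for 0 < eps0 < 1, as in (6.56)."""
    return eps0 ** p0 / (1.0 - eps0)


def minimal_p0(n, tau, eps0, C=1.0, C0=1.0):
    """Smallest p0 >= 0 with C*eps0**p0/(1 - eps0) <= C0*|n|**(-tau), for |n| >= 1."""
    target = C0 * abs(n) ** (-tau)
    p0 = 0
    while C * geometric_tail(eps0, p0) > target:
        p0 += 1
    return p0


if __name__ == "__main__":
    tau, eps0 = 3, 0.1  # illustrative values only
    for n in (10, 100, 1000, 10000):
        print(n, minimal_p0(n, tau, eps0))
```

Each tenfold increase of $|n|$ raises the returned value by the same fixed amount, consistent with a bound of the form const. $\log|n|$.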
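For all $p \le p_0$ define $J_j^{(p)}(n)$ as the set of values $\varepsilon$ such that (6.16), (6.17) and (6.18) are satisfied with $C_0$ replaced with $2C_0$. By the definition of $p_0$ all the intervals $I_j^{(p)}(n)$ fall inside the union of the intervals $J_j^{(0)}(n), \ldots, J_j^{(p_0)}(n)$ as soon as $p > p_0$. This means that, by (6.31) to (6.37),
\[ \operatorname{meas}\Bigl( \bigcup_{p=0}^{\infty} I_1^{(p)} \Bigr) \le \operatorname{meas}\Bigl( \bigcup_{p=0}^{p_0} J_1^{(p)} \Bigr) \le \sum_{|n| \ge N_0} \operatorname{meas}\Bigl( \bigcup_{p=0}^{p_0(n,1)} J_1^{(p)}(n) \Bigr) \le \text{const.} \sum_{|n| \ge N_0} \sum_{p=0}^{p_0(n,1)} \frac{2C_0}{|n|^{\tau+1}} \le \text{const.} \sum_{n=N_0}^{\infty} n^{-(\tau+1)} \log n \le B \varepsilon_0^{1+\delta}, \tag{6.58} \]
with suitable $B$ and $\delta$, in order to take into account the logarithmic corrections due to (6.57). Analogously one obtains the bounds $\operatorname{meas}(I_2) \le B\varepsilon_0^{1+\delta}$ and $\operatorname{meas}(I_3) \le B\varepsilon_0^{1+\delta}$ for the Lebesgue measures of the sets $I_2$ and $I_3$ (possibly redefining the constant $B$). This completes the proof of the bound (6.48).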
7. The case τ ≤ 2
Proposition 1 was proved assuming that $\tilde\omega$ verifies the first and the second Diophantine conditions (2.33) with $\tau > 2$. Here we want to prove that it is possible to obtain a result similar to Proposition 1 assuming only the first Diophantine condition and $1 < \tau \le 2$, and that also in such a case the set of allowed values of $\varepsilon$ has large Lebesgue measure, so that a result analogous to Proposition 2 holds.
The proof of the analogue of Proposition 1 is an immediate adaptation of the analysis in Sect. 4 and 5. First we consider a slightly different multiscale decomposition of the propagator; instead of (4.2) we write
\[ \frac{1}{-\omega^2 n^2 + \tilde\omega_m^2} = \frac{\chi(6\,|\omega^2 n^2 - \tilde\omega_m^2|)}{-\omega^2 n^2 + \tilde\omega_m^2} + \frac{\chi_{-1}(6\,|\omega^2 n^2 - \tilde\omega_m^2|)}{-\omega^2 n^2 + \tilde\omega_m^2}, \tag{7.1} \]
and the denominator in the first addend of the r.h.s. of (7.1) is smaller than $C_0/6$; as in Sect. 4 we assume $C_0 \le 1/2$ without loss of generality. If $|n| \ne |m|$ and $|\omega^2 n^2 - \tilde\omega_m^2| < C_0/6$, then, by reasoning as in (4.12), we obtain
\[ |n| > |m| > \frac{3}{4}|n|, \qquad \min\{|m|, |n|\} > \frac{1}{2\varepsilon}, \tag{7.2} \]
and
\[ \bigl| \omega|n| - \tilde\omega_m \bigr| < \frac{C_0}{6(\omega|n| + \tilde\omega_m)} < \frac{C_0}{6|n|}. \tag{7.3} \]
We can decompose the first summand in (7.1), obtaining
\[ \frac{\chi(6\,|\omega^2 n^2 - \tilde\omega_m^2|)}{-\omega^2 n^2 + \tilde\omega_m^2} \sum_{h=-1}^{\infty} \chi_h(|\omega n| - \tilde\omega_m) \equiv \sum_{h=-1}^{\infty} g_a^{(h)}(\omega n, m), \tag{7.4} \]
and the scales from $-1$ to $h_0$, with $h_0$ given in the statement of Lemma 5, can be bounded as in Lemma 6. We shall call the line of type a a line with which a propagator $g_a^{(h)}(\omega n, m)$ is associated and the line of type b a line with which the second summand in (7.1) is associated; the scale of the b lines is set equal to $-1$. If $N_h(\theta)$ is the number of lines on scale $h$, the following result holds.
Lemma 18. Assume that there is a constant $C_1$ such that one has $|\tilde\omega_m - |m|| \le C_1\varepsilon/|m|$ and that $\tilde\omega$ verifies the first Melnikov condition in (2.33) with $\tau > 1$. If $\varepsilon$ is small enough, for any tree $\theta \in \Theta^{(k)}_{n,m}$ and for all $h \ge h_0$ one has
\[ N_h(\theta) \le 4K(\theta)\, 2^{(2-h)/\tau^2} - C_h(\theta) + M_h^{\nu}(\theta) + S_h(\theta), \tag{7.5} \]
where $M_h^{\nu}(\theta)$ and $S_h(\theta)$ are defined as the number of ν-vertices in $\theta$ such that the maximum scale of the two external lines is $h$ and, respectively, the number of self-energy graphs in $\theta$ with $h_T^{e} = h$.
Proof. We prove inductively the bound
\[ N_h^*(\theta) \le \max\{0,\ 2K(\theta)\, 2^{(1-h)/\tau^2} - 1\}, \tag{7.6} \]
where $N_h^*(\theta)$ is the number of non-resonant lines in $L(\theta)$ on scale $h' \ge h$. We proceed exactly as in the proof of Lemma 5. First of all note that for a tree $\theta$ to have a line on scale $h$ the condition $K(\theta) > 2^{(h-1)/\tau} > 2^{(h-1)/\tau^2}$ is necessary, by the first Diophantine conditions in (2.33). This means that one can have $N_h^*(\theta) \ge 1$ only if $K = K(\theta)$ is such that $K > k_0 \equiv 2^{(h-1)/\tau}$: therefore for values $K \le k_0$ the bound (7.6) is satisfied. If $K = K(\theta) > k_0$, we assume that the bound holds for all trees $\theta'$ with $K(\theta') < K$. Define $E_h = 2^{-1}(2^{(1-h)/\tau^2})^{-1}$; we have to prove that $N_h^*(\theta) \le \max\{0, K(\theta)E_h^{-1} - 1\}$. The dangerous case is if $m = 1$: then one has a cluster $T$ with two external lines $\ell$ and $\ell_1$, which are both with scales $\ge h$; then
\[ \bigl| |\omega n_\ell| - \tilde\omega_{m_\ell} \bigr| \le 2^{-h+1} C_0, \qquad \bigl| |\omega n_{\ell_1}| - \tilde\omega_{m_{\ell_1}} \bigr| \le 2^{-h+1} C_0, \tag{7.7} \]
and recall that $T$ is not a self-energy graph, so that $n_\ell \ne n_{\ell_1}$. Note that the validity of both inequalities in (7.7) for $h \ge h_0$ implies, by Lemma 5, that one has
\[ |n_\ell - n_{\ell_1}| \ne |m_\ell \pm m_{\ell_1}|. \tag{7.8} \]
Moreover from (7.3), (7.8) and (7.2) one obtains
\[ \frac{C_0}{|n_\ell - n_{\ell_1}|^{\tau}} \le \bigl| \omega|n_\ell - n_{\ell_1}| - |m_\ell \pm m_{\ell_1}| \bigr| < \frac{C_0}{3 \min\{|n_\ell|, |n_{\ell_1}|\}}, \tag{7.9} \]
which implies
\[ |n_\ell - n_{\ell_1}| \ge 3^{1/\tau} \min\{|n_\ell|, |n_{\ell_1}|\}^{1/\tau}. \tag{7.10} \]
Finally if $C_0|n|^{-\tau} \le ||\omega n| - \tilde\omega_m| < C_0 2^{-h+1}$ we have $|n| \ge 2^{(h-1)/\tau}$, so that from (7.8) we obtain $|n_\ell - n_{\ell_1}| \ge 3^{1/\tau} 2^{(h-1)/\tau^2}$. Then $K(\theta) - K(\theta_1) > E_h$, which gives (7.6), by using the inductive hypothesis.
The analysis in Sect. 5 can be repeated with the following modifications. The renormalized trees are defined as in Sect. 5, but the rule (9) is replaced with
(9') To each self-energy graph $T$ the $\mathcal{L}'$ operation is applied, where
\[ \mathcal{L}' V_T^{h}(\omega n, m) = V_T^{h}(\omega n, m), \tag{7.11} \]
if $T$ is such that its two external lines are attached to the same node $V_0$ of $T$ (see as an example the first graph of Fig. 4.1), and $\mathcal{L}' V_T^{h}(\omega n, m) = 0$ otherwise.
With this definition we have the expansion (5.2) to (5.4), with $\nu_{h,m}$ replaced with $\nu_h$ (as $\mathcal{L}' V_T^{h}(\omega n, m)$ is independent of $n, m$). We have that the analogue of Lemma 7 still holds also with $\mathcal{L}', \mathcal{R}'$ replacing $\mathcal{L}, \mathcal{R}$; indeed by construction the dependence on $(n, m)$ in $\mathcal{R}' V_T^{h_T}(\omega n, m)$ is due to the propagators of lines $\ell$ along the path connecting the external lines of the self-energy graph. If a line $\ell$ in the path is a line of type a, the propagator has the form
\[ -\frac{\chi(6\,|\omega^2 n_\ell^2 - \tilde\omega_{m_\ell}^2|)\ \chi_h(|\omega n_\ell| - \tilde\omega_{m_\ell})}{\bigl( |\omega n_\ell| + \tilde\omega_{m_\ell} \bigr)\bigl( |\omega n_\ell| - \tilde\omega_{m_\ell} \bigr)}, \tag{7.12} \]
and the second factor in the denominator is bounded proportionally to $2^{-h}$, while the first is bounded proportionally to $|m_\ell^0 + m_{e_T}|$. Then we get for $h_{e_T} \le h_0$,
\[ \bigl| \mathcal{R}' V_T^{h_T}(\omega n, m)\, g^{(h_{e_T})}(\omega n, m) \bigr| \le C\varepsilon^2\, \frac{e^{-\kappa |m_\ell^0|}}{|m_\ell^0 + m_{e_T}|}\, |n_{e_T}|^{\tau-1}, \tag{7.13} \]
which means that the propagator of the external line of the resonance $T$ is compensated, if $1 < \tau \le 2$, by the extra factor $(m_{e_T})^{-1}$. Here we used that there are at least two nodes, carrying a node factor proportional to $\varepsilon$, not contained in any inner self-energy graphs, as otherwise the points to which the external lines are attached would coincide. The same happens if $\ell$ is of type b, as we are going to show. In the following with $C$ we denote any constant. We can assume that $|m_{e_T}|/2 < |m_\ell| < 2|m_{e_T}|$, otherwise one has $|m_\ell^0| > |m_{e_T}|/2$ and we can use the factor $e^{-\kappa|m_\ell^0|/2} < C|m_{e_T}|^{-1}$ to compensate the propagator of one of the external lines of $T$. We can decompose the propagator of the line $\ell$ as
\[ \frac{\chi_{-1}(6\,|\omega^2 n^2 - \tilde\omega_m^2|)}{-\omega^2 n^2 + \tilde\omega_m^2} \sum_{h=-1}^{\infty} \chi_h(|\omega n| - \tilde\omega_m). \tag{7.14} \]
If the line $\ell$ has scale $h \le h_0$ then we can bound the propagator with $O(|\varepsilon m_{e_T}|^{-1})$, and we get
\[ \bigl| \mathcal{R}' V_T^{h_T}(\omega n, m)\, g^{(h_{e_T})}(\omega n, m) \bigr| \le C\varepsilon\, \frac{1}{|m_{e_T}|}\, |n_{e_T}|^{\tau-1}. \tag{7.15} \]
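Moreover if $|m_{e_T}| < 6\varepsilon^{-1}$ still (7.15) holds. We have then to consider the case in which the line $\ell$ is a line of type b with scale $h > h_0$ and $|m_{e_T}| \ge 6\varepsilon^{-1}$. We have still two cases: either $\ell$ is such that $|\omega n_\ell - \tilde\omega_{m_\ell}| \le 2C_0/|m_{e_T}|$ or not. In the first case, remembering that $|\omega n_{e_T} - \tilde\omega_{m_{e_T}}| \le 2C_0/|m_{e_T}|$ because $e_T$ is a line of type a, we find
\[ \frac{C_0}{|n_1 - n_{e_T}|^{\tau}} \le |\omega(n_1 - n_{e_T}) \pm m_1 \pm m_{e_T}| < 5C_0 |m_{e_T}|^{-1}, \tag{7.16} \]
where we have used that $\tilde\omega_m = |m| + O(\varepsilon m^{-1})$. This implies $|n_1 - n_{e_T}| = |n^0_1| > 5^{-1/\tau} |m_{e_T}|^{1/\tau}$, so that one has
\[ \bigl| \mathcal{R}' V_T^{h_T}(\omega n, m)\, g^{(h_{e_T})}(\omega n, m) \bigr| \le C\varepsilon^2\, e^{-\kappa C |m_{e_T}|^{1/\tau}}\, |n_{e_T}|^{\tau-1}. \tag{7.17} \]
In the second case one has $|\omega n_\ell - \tilde\omega_{m_\ell}| > 2C_0/|m_{e_T}|$. If $h_1$ is the scale of $\ell$ we can distinguish two subcases: either $h_1 < h_{e_T}$ or $h_1 \ge h_{e_T}$. If $h_1 < h_{e_T}$ we find
\[ \frac{C_0}{|n_1 - n_{e_T}|^{\tau}} \le |\omega(n_1 - n_{e_T}) \pm m_1 \pm m_{e_T}| < 5C_0\, 2^{-h_1}, \tag{7.18} \]
as $\tilde\omega_{m_{e_T}} = |m_{e_T}| + O(\varepsilon|m_{e_T}|^{-1}) = |m_{e_T}| + O(\varepsilon 2^{-h_1})$ and $\tilde\omega_{m_\ell} = |m_\ell| + O(\varepsilon 2^{-h_1})$.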
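Hence we find $|n^0_\ell| > 2^{(h_1-2)/\tau}$ and by bounding the propagator corresponding to $\ell$ by $O(|m_\ell|^{-1} 2^{h_1})$ and using the factor $e^{-\kappa 2^{(h_1-2)/\tau}}$ we get also in this case that $\mathcal{R}' V_T^{h_T}(\omega n, m)$ is $O(|m_{e_T}|^{-1})$. Finally if $h_1 \ge h_{e_T}$ we sum again the denominators of $\ell$ and $e_T$ and we find
\[ \frac{C_0}{|n_1 - n_{e_T}|^{\tau}} \le |\omega(n_1 - n_{e_T}) \pm m_1 \pm m_{e_T}| < 5C_0\, 2^{-h_{e_T}}, \tag{7.19} \]
as $\tilde\omega_{m_{e_T}} = |m_{e_T}| + O(\varepsilon|m_{e_T}|^{-1}) = |m_{e_T}| + O(\varepsilon 2^{-h_1})$ and $\tilde\omega_{m_1} = |m_1| + O(\varepsilon 2^{-h_1})$. Hence we find $|n^0_\ell| > 2^{(h_{e_T}-2)/\tau}$ and this factor compensates the extra factor $2^{h_{e_T}}$ arising from the entering line of $T$. Then the bound (5.11) is replaced, using also Lemma 18, by
\[ |\mathrm{Val}(\theta)| \le \varepsilon^{k/2} D^{k} \Biggl[ \prod_{h=h_0}^{\infty} \exp\Bigl( h \log 2\, \bigl( 4K(\theta)\, 2^{-(h-2)/\tau^2} - C_h(\theta) + M_h^{\nu}(\theta) \bigr) \Bigr) \Biggr] \Biggl[ \prod_{h=h_0}^{\infty} 2^{-h M_h^{\nu}(\theta)} \Biggr] \Biggl[ \prod_{V \in V(\theta) \cup E(\theta)} e^{-\kappa(|n_v| + |n_v'|)} \Biggr] \Biggl[ \prod_{V \in V(\theta) \cup E(\theta)} e^{-\kappa(|m_v| + |m_v'|)} \Biggr]. \tag{7.20} \]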
There is no need of Lemma 8, while the analogues of Lemma 9 and Lemma 10 can be proved with some obvious changes. For instance in (5.28) one has $\eta = 0$, which shows that the first Mel'nikov condition is required, and $\nu_m$ is replaced with $\nu$, i.e. one needs only one counterterm. Finally also the analogue of Proposition 2 can be proved by reasoning as in Sect. 6, simply by observing that in order to impose the first Mel'nikov conditions (6.16) one can take $\tau_0 = \tau$; note indeed that the condition $\tau > 2$ was made necessary to obtain the second Mel'nikov conditions (6.17) and (6.18).
8. Generalizations of the Results
So far we have considered only the case $\varphi(u) = u^3$ in (1.1). Now we consider the case in which the function $\varphi(u)$ in (1.1) is replaced with any odd analytic function
\[ \varphi(u) = \Phi u^3 + O(u^5), \qquad \Phi \ne 0. \tag{8.1} \]
Define $\omega = \sqrt{1 - \lambda\varepsilon}$ as in Sect. 1. Then by choosing $\lambda = \sigma$ we obtain, instead of (1.14),
\[ Q:\quad n^2 a_n = \bigl[ (v + w)^3 + O(\varepsilon) \bigr]_{n,n}, \qquad -n^2 a_n = \bigl[ (v + w)^3 + O(\varepsilon) \bigr]_{n,-n}, \]
\[ P:\quad \bigl( -\omega^2 n^2 + m^2 \bigr) w_{n,m} = \varepsilon \bigl[ (v + w)^3 + O(\varepsilon) \bigr]_{n,m}, \qquad |m| \ne |n|, \tag{8.2} \]
where $O(\varepsilon)$ denotes analytic functions in $u$ and $\varepsilon$ of order at least one in $\varepsilon$. Then we can introduce an auxiliary parameter $\mu$, by replacing $\varepsilon$ with $\varepsilon\mu$ in (8.2) (recall that at the end one has to set $\mu = 1$). Then we can proceed as in the previous sections. The equation for $a_0$ is the same as before, and only the diagrammatic rules for the coefficients $u^{(k)}_{n,m}$ have to be slightly modified.
The nodes can have any odd number of entering lines, that is $s_V = 1, 3, 5, \ldots$. For nodes $V$ of w-type with $s_V \ge 3$ the node factor is given by $\eta_V = \varepsilon^{(s_V-1)/2}$, which has to be added to the list of node factors (3.3), while for nodes $V$ of v-type one has to add to the list in (3.2) other (obvious) contributions, arising from the fact that the function $f_n^{(k)}$ in (2.4) has to be replaced with
\[ f_n^{(k)} = \sum_{k'=1}^{k} f_n^{(k,k')}, \tag{8.3} \]
where the function $f_n^{(k,1)}$ is given by (2.12), while for $k \ge 2$ and $1 < k' \le k$ one has
\[ f_n^{(k,k')} = -\sum_{\substack{k_1 + \ldots + k_{2k'+1} = k - k' \\ n_1 + \ldots + n_{2k'+1} = n \\ m_1 + \ldots + m_{2k'+1} = m}} u^{(k_1)}_{n_1,m_1} \cdots u^{(k_{2k'+1})}_{n_{2k'+1},m_{2k'+1}}, \tag{8.4} \]
where the symbols have to be interpreted according to (2.13) and (2.14); note that $s = 2k' + 1$ is the number of lines entering the node in the corresponding graphical representation. In the same way one has to modify (2.31) by replacing the last term in the r.h.s. with
\[ [\varphi(v + w)]^{(k)}_{n,m} = \sum_{k'=1}^{k} [\varphi(v + w)]^{(k,k')}_{n,m}, \tag{8.5} \]
where $[\varphi(v + w)]^{(k,1)}_{n,m}$ is given by the old (2.32), while all the other terms are given by expressions analogous to (8.4).
The discussion then proceeds exactly as in the previous cases. Of course we have to use that, by writing
\[ \varphi(u) = \sum_{k=1}^{\infty} \Phi_k u^{2k+1}, \tag{8.6} \]
with $\Phi_1 = \Phi$, the constants $\Phi_k$ can be bounded for all $k \ge 1$ by a constant to the power $k$ (which follows from the analyticity assumption).
By taking into account the new diagrammatic rules, one changes in the proper way the definition of the tree value, and a result analogous to Lemma 2 is easily obtained. The second statement of Lemma 3 has to be changed into $|E(\theta)| \le \sum_{V \in V(\theta)} (s_V - 1) + 1$ (see (3.24)), while the bound on $|V_v^3(\theta)|$ still holds. For $V_v^s(\theta)$, with $s \ge 5$, one has to use the factors $\varepsilon^{(s-1)/2}$. Nothing changes in the following sections, except that the bound (5.12) has to be suitably modified in order to take into account the presence of the new kinds of nodes (that is the nodes with branching number more than three). At the end we obtain the proof of Theorem 1 for general odd nonlinearities starting from the third order.
Until now we are still confining ourselves to the case $j = 1$. If we choose $j > 1$ we have to perform a preliminary rescaling $u(t, x) \to j\,u(jt, jx)$, and we write down the equation for $U(t, x) = u(jt, jx)$. If $\varphi(u) = \Phi u^3$ we see immediately that the function $U$ solves the same equation as before, so that the same conditions on $\varepsilon$ have to be imposed in order to find a solution. In the general case (8.2) holds with $j^2\varepsilon$ replacing $\varepsilon$. This completes the proof of Theorem 1 in all cases.
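Appendix A1. The Solution of (1.5)
The odd 2π-periodic solutions of (1.5) can be found in the following way [30, 32]. First we consider (1.5), with $\langle a_0^2 \rangle$ replaced by a parameter $c_0$, and we see that
\[ a_0(\xi) = V\, \mathrm{sn}(\Omega\xi, m), \tag{A1.1} \]
where $\mathrm{sn}(\Omega\xi, m)$ is the sine-amplitude function with modulus $\sqrt{m}$ [25, 1], is an odd solution of
\[ \ddot a_0 = -3c_0 a_0 - a_0^3, \tag{A1.2} \]
if the following relations are verified by $V$, $\Omega$, $c_0$, $m$:
\[ V = \sqrt{-2m}\, \Omega, \qquad \frac{V^2}{6c_0 + V^2} = -m. \tag{A1.3} \]
Of course both $V$ and $m$ can be written as a function of $c_0$ and $\Omega$ as
\[ m = \frac{3c_0}{\Omega^2} - 1, \qquad V = \sqrt{2\Omega^2 - 6c_0}. \tag{A1.4} \]
In particular one finds
\[ \partial_\Omega V = \sqrt{-2m} + \frac{1}{\sqrt{-2m}}\, \frac{6c_0}{\Omega^2} = \frac{\sqrt{-2m}}{m} \left( m - \frac{3c_0}{\Omega^2} \right) = \sqrt{\frac{2}{-m}}, \tag{A1.5} \]
so that one has
\[ \frac{\Omega\, \partial_\Omega V}{V} = \frac{1}{-m}. \tag{A1.6} \]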
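The algebraic relations (A1.3) to (A1.6) can be checked numerically. The sketch below takes arbitrary illustrative values of $\Omega$ and $c_0$ (any pair with $3c_0 < \Omega^2$, so that $m < 0$), computes $m$ and $V$ from (A1.4), and compares a finite-difference derivative of $V$ with the closed form in (A1.5). The function names are hypothetical.

```python
import math


def modulus_and_amplitude(Omega, c0):
    """Return (m, V) from (A1.4); requires 0 < 3*c0 < Omega**2 so that m < 0."""
    m = 3.0 * c0 / Omega ** 2 - 1.0
    V = math.sqrt(2.0 * Omega ** 2 - 6.0 * c0)
    return m, V


def dV_dOmega(Omega, c0, step=1e-6):
    """Central finite difference of V with respect to Omega at fixed c0."""
    _, V_plus = modulus_and_amplitude(Omega + step, c0)
    _, V_minus = modulus_and_amplitude(Omega - step, c0)
    return (V_plus - V_minus) / (2.0 * step)


if __name__ == "__main__":
    Omega, c0 = 1.3, 0.2  # illustrative values only
    m, V = modulus_and_amplitude(Omega, c0)
    print("m =", m, " V =", V)
    print("(A1.3a):", V, "vs", math.sqrt(-2.0 * m) * Omega)
    print("(A1.3b):", V ** 2 / (6.0 * c0 + V ** 2), "vs", -m)
    print("(A1.5): ", dV_dOmega(Omega, c0), "vs", math.sqrt(2.0 / (-m)))
    print("(A1.6): ", Omega * dV_dOmega(Omega, c0) / V, "vs", 1.0 / (-m))
```

Each printed pair should agree up to the finite-difference error.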
If we impose also that the solution (A1.1) is 2π-periodic (and we recall that $4K(m)$ is the natural period of the sine-amplitude $\mathrm{sn}(\xi, m)$, [1]), we obtain $\Omega = \Omega_m = 2K(m)/\pi$, and $V$ is fixed to the value $V_m = \sqrt{-2m}\, \Omega_m$. Finally imposing that $c_0$ equals the average of $a_0^2$ fixes $m$ to be the solution of [32]
\[ E(m) = K(m)\, \frac{7 + m}{6}, \tag{A1.7} \]
where
\[ K(m) = \int_0^{\pi/2} d\theta\, \frac{1}{\sqrt{1 - m \sin^2\theta}}, \qquad E(m) = \int_0^{\pi/2} d\theta\, \sqrt{1 - m \sin^2\theta} \tag{A1.8} \]
are, respectively, the complete elliptic integral of the first kind and the complete elliptic integral of the second kind [1], and we have used that the average of $\mathrm{sn}^2(\Omega\xi, m)$ is $(K(m) - E(m))/(m K(m))$. This gives $m \approx -0.2554$. One can find $2\pi/j$-periodic solutions by noticing that Eqs. (1.5) are invariant by the symmetry $a_0(\xi) \to \alpha\, a_0(\alpha\xi)$, so that a complete set of odd 2π-periodic solutions is provided by $a_0(\xi, j) = j\, a_0(j\xi)$.
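The value $m \approx -0.2554$ can be reproduced by solving (A1.7) numerically. The sketch below is one possible way to do so and is not taken from the paper: it evaluates $K(m)$ and $E(m)$ from the integrals (A1.8) with the trapezoid rule (which converges very quickly here because the integrands extend to smooth even periodic functions) and then applies bisection on a bracket chosen by hand. The function names and the bracket are assumptions.

```python
import math


def _trapezoid_quarter_period(g, panels=200):
    """Trapezoid rule for the integral of g over [0, pi/2]."""
    h = (math.pi / 2.0) / panels
    total = 0.5 * (g(0.0) + g(math.pi / 2.0))
    total += sum(g(i * h) for i in range(1, panels))
    return total * h


def K(m):
    """Complete elliptic integral of the first kind, as in (A1.8), for m < 1."""
    return _trapezoid_quarter_period(
        lambda t: 1.0 / math.sqrt(1.0 - m * math.sin(t) ** 2))


def E(m):
    """Complete elliptic integral of the second kind, as in (A1.8), for m < 1."""
    return _trapezoid_quarter_period(
        lambda t: math.sqrt(1.0 - m * math.sin(t) ** 2))


def residual(m):
    """Left side minus right side of (A1.7)."""
    return E(m) - K(m) * (7.0 + m) / 6.0


def solve_modulus(lo=-0.9, hi=-0.01, tol=1e-12):
    """Bisection for the root of (A1.7) inside the hand-picked bracket [lo, hi]."""
    f_lo = residual(lo)
    if f_lo * residual(hi) > 0.0:
        raise ValueError("bracket does not contain a sign change")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = residual(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    m = solve_modulus()
    print("m =", m, " Omega_m =", 2.0 * K(m) / math.pi)
```

The root found in this way agrees with the value of $m$ quoted in the text to the digits given there.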
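Appendix A2. Proof of Lemma 1
The solution of (2.21) is found by variation of constants. We write $\langle a_0^2 \rangle = c_0$, and consider $c_0$ as a parameter. Let $a_0(\xi)$ be the solution (A1.1) of Eq. (A1.2), and define the Wronskian matrix
\[ W(\xi) = \begin{pmatrix} w_{11}(\xi) & w_{12}(\xi) \\ w_{21}(\xi) & w_{22}(\xi) \end{pmatrix}, \tag{A2.1} \]
which solves the linearized equation
\[ \dot W = M(\xi) W, \qquad M(\xi) = \begin{pmatrix} 0 & 1 \\ -3a_0^2(\xi) - 3c_0 & 0 \end{pmatrix}, \tag{A2.2} \]
and is such that $W(0) = \mathbb{1}$ and $\det W(\xi) = 1$ $\forall \xi$. We need two independent solutions of the linearized equation. We can take one as $\dot a_0$, the other as $\partial_\Omega a_0$ (we recall that $m$ and $V$ are functions of $\Omega$ and $c_0$ through (A1.3)); then one has
\[ w_{11}(\xi) = \frac{1}{\Omega V}\, \dot a_0(\xi) = \mathrm{cn}(\Omega\xi, m)\, \mathrm{dn}(\Omega\xi, m), \]
\[ w_{21}(\xi) = \dot w_{11}(\xi) = -\Omega\, \mathrm{sn}(\Omega\xi, m) \bigl( \mathrm{dn}^2(\Omega\xi, m) + m\, \mathrm{cn}^2(\Omega\xi, m) \bigr), \]
\[ w_{12}(\xi) = \frac{1}{V(1 + D_m)}\, \partial_\Omega a_0(\xi) = B_m \bigl( \xi\, \mathrm{cn}(\Omega\xi, m)\, \mathrm{dn}(\Omega\xi, m) + \Omega^{-1} D_m\, \mathrm{sn}(\Omega\xi, m) \bigr), \tag{A2.3} \]
\[ w_{22}(\xi) = \dot w_{12}(\xi) = \mathrm{cn}(\Omega\xi, m)\, \mathrm{dn}(\Omega\xi, m) - B_m\, \Omega\, \xi\, \mathrm{sn}(\Omega\xi, m) \bigl( \mathrm{dn}^2(\Omega\xi, m) + m\, \mathrm{cn}^2(\Omega\xi, m) \bigr), \]
where we have used (A1.3) to define the dimensionless constants
\[ D_m = \frac{\Omega\, \partial_\Omega V}{V} = \frac{1}{-m}, \qquad B_m = \frac{-m}{1 - m} = \frac{1}{1 + D_m}. \tag{A2.4} \]
As we are interested in the case $c_0 = \langle a_0^2 \rangle$ we set $\Omega = \Omega_m = 2K(m)/\pi$ (in order to have the period equal to 2π) and we fix $m$ as in Appendix A1. Then, by defining $X = (y, \dot y)$ and $F = (0, h)$, we can write the solution of (2.21) as the first component of
\[ X(\xi) = W(\xi)\, \bar X + W(\xi) \int_0^{\xi} d\xi'\, W^{-1}(\xi')\, F(\xi'), \tag{A2.5} \]
where $\bar X = (0, \dot y(0))$ denote the corrections to the initial conditions $(a_0(0), \dot a_0(0))$, and $y(0) = 0$ as we are looking for an odd solution. Shorten $c(\xi) \equiv \mathrm{cn}(\Omega_m\xi, m)$, $s(\xi) \equiv \mathrm{sn}(\Omega_m\xi, m)$, and $d(\xi) \equiv \mathrm{dn}(\Omega_m\xi, m)$, and define $cd(\xi) = \mathrm{cn}(\Omega_m\xi, m)\, \mathrm{dn}(\Omega_m\xi, m)$. One can write the first component of (A2.5)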
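as
\[ y(\xi) = w_{12}(\xi)\, \dot y(0) + \int_0^{\xi} d\xi' \bigl( w_{12}(\xi)\, w_{11}(\xi') - w_{11}(\xi)\, w_{12}(\xi') \bigr) h(\xi') \]
\[ = B_m \bigl( \xi\, cd(\xi) + \Omega_m^{-1} D_m\, s(\xi) \bigr) \dot y(0) + B_m \Biggl[ cd(\xi) \int_0^{\xi} d\xi' \int_0^{\xi'} d\xi''\, cd(\xi'')\, h(\xi'') + \Omega_m^{-1} D_m \Bigl( s(\xi) \int_0^{\xi} d\xi'\, cd(\xi')\, h(\xi') - cd(\xi) \int_0^{\xi} d\xi'\, s(\xi')\, h(\xi') \Bigr) \Biggr], \tag{A2.6} \]
as we have explicitly written
\[ w_{12}(\xi)\, w_{11}(\xi') - w_{11}(\xi)\, w_{12}(\xi') = B_m \Bigl( \xi\, cd(\xi)\, cd(\xi') + \Omega_m^{-1} D_m\, s(\xi)\, cd(\xi') - cd(\xi)\, \xi'\, cd(\xi') - \Omega_m^{-1} D_m\, cd(\xi)\, s(\xi') \Bigr), \tag{A2.7} \]
and integrated by parts
\[ \int_0^{\xi} d\xi' \bigl( \xi\, cd(\xi)\, cd(\xi') - cd(\xi)\, \xi'\, cd(\xi') \bigr) h(\xi') = cd(\xi) \int_0^{\xi} d\xi' \int_0^{\xi'} d\xi''\, cd(\xi'')\, h(\xi''). \tag{A2.8} \]
By using that if $P[F] = F$ then $P[I[F]] = I[F]$ and that $I$ switches parity we can rewrite (A2.6) as
\[ y(\xi) = B_m \Bigl[ \xi\, cd(\xi) \bigl( \dot y(0) - I[cd\, h](0) - \Omega_m^{-1} D_m \langle s\, h \rangle \bigr) + \Omega_m^{-1} D_m\, s(\xi) \bigl( \dot y(0) - I[cd\, h](0) \bigr) + \Omega_m^{-1} D_m \bigl( s(\xi)\, I[cd\, h](\xi) - cd(\xi)\, I[P[s\, h]](\xi) \bigr) + cd(\xi)\, I[I[cd\, h]](\xi) \Bigr]. \tag{A2.9} \]
This is an odd 2π-periodic analytic function provided that we choose
\[ \dot y(0) - I[cd\, h](0) - \Omega_m^{-1} D_m \langle s\, h \rangle = 0, \tag{A2.10} \]
which fixes the parameter $\dot y(0)$; hence (2.22) is found.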
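Appendix A3. Proof of (2.27)
We have to compute $\langle a_0 L[a_0] \rangle$, which is given by (recall that one has $a_0(\xi) = V_m s(\xi)$ and $cd(\xi) = \Omega_m^{-1} \dot s(\xi)$)
\[ \langle a_0 L[a_0] \rangle = B_m \Omega_m^{-2} V_m^2 \Bigl( D_m^2 \langle s^2 \rangle^2 + D_m \langle s^2 I[s \dot s] \rangle - D_m \langle s \dot s\, I[P[s^2]] \rangle + \langle s \dot s\, I[I[s \dot s]] \rangle \Bigr). \tag{A3.1} \]
Using that one has $I[s \dot s] = (s^2 - \langle s^2 \rangle)/2$, we obtain
\[ \langle s^2 I[s \dot s] \rangle = \frac{1}{2} \bigl( \langle s^4 \rangle - \langle s^2 \rangle^2 \bigr); \tag{A3.2} \]
moreover, integrating by parts, we find
\[ \langle s \dot s\, I[P[s^2]] \rangle = -\frac{1}{2} \langle s^4 \rangle + \frac{1}{2} \langle s^2 \rangle^2, \qquad \langle s \dot s\, I[I[s \dot s]] \rangle = -\frac{1}{4} \langle s^4 \rangle + \frac{1}{4} \langle s^2 \rangle^2, \tag{A3.3} \]
so that we finally get
\[ \langle a_0 L[a_0] \rangle = \frac{1}{2} V_m^2\, \Omega_m^{-2} B_m \Bigl[ \Bigl( 2D_m - \frac{1}{2} \Bigr) \langle s^4 \rangle + \Bigl( 2D_m (D_m - 1) + \frac{1}{2} \Bigr) \langle s^2 \rangle^2 \Bigr], \tag{A3.4} \]
which is strictly positive by (A2.4) and by the choice of $m$ according to Appendix A1; then (2.27) follows.
Appendix A4. Proof of Lemma 15
We shall prove inductively on $p$ the bounds (6.12). From (6.4) we have
\[ |\tilde\omega_m^{(p)}(\varepsilon) - \tilde\omega_m^{(p-1)}(\varepsilon)| \le C\, |\nu_m(\tilde\omega^{(p-1)}(\varepsilon), \varepsilon) - \nu_m(\tilde\omega^{(p-2)}(\varepsilon), \varepsilon)|, \tag{A4.1} \]
as we can bound $|\tilde\omega_m^{(p)}(\varepsilon) + \tilde\omega_m^{(p-1)}(\varepsilon)| \ge 1$ for $\varepsilon \in E^{(p)}$.
We set, for $|m| \ge 1$,
\[ \Delta\nu_{h,m} \equiv \nu_{h,m}(\tilde\omega^{(p-1)}(\varepsilon), \varepsilon) - \nu_{h,m}(\tilde\omega^{(p-2)}(\varepsilon), \varepsilon) = \lim_{q \to \infty} \Delta\nu^{(q)}_{h,m}, \tag{A4.2} \]
where we have used the notations (5.37) to define
\[ \Delta\nu^{(q)}_{h,m} = \nu^{(q)}_{h,m}(\tilde\omega^{(p-1)}(\varepsilon), \varepsilon) - \nu^{(q)}_{h,m}(\tilde\omega^{(p-2)}(\varepsilon), \varepsilon). \tag{A4.3} \]
We want to prove inductively on $q$ the bound
\[ \bigl| \Delta\nu^{(q)}_{h,m} \bigr| \le C\varepsilon\, \bigl\| \tilde\omega^{(p-1)}(\varepsilon) - \tilde\omega^{(p-2)}(\varepsilon) \bigr\|_\infty, \tag{A4.4} \]
for some constant $C$, uniformly in $q$, $h$ and $m$.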
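For $q = 0$ the bound (A4.4) is trivially satisfied. Then assume that (A4.4) holds for all $q' < q$. For simplicity we set $\tilde\omega = \tilde\omega^{(p-1)}(\varepsilon)$ and $\tilde\omega' = \tilde\omega^{(p-2)}(\varepsilon)$. We can write, from (6.25), for $|m| \ge 1$ and for $h \ge 0$,
\[ \Delta\nu^{(q)}_{h,m} = -\sum_{k=h}^{\bar h - 1} 2^{-k-2} \Bigl( \beta^{(q)}_{k,m}(\tilde\omega, \varepsilon, \{\nu^{(q-1)}_{k'}(\tilde\omega, \varepsilon)\}) - \beta^{(q)}_{k,m}(\tilde\omega', \varepsilon, \{\nu^{(q-1)}_{k'}(\tilde\omega', \varepsilon)\}) \Bigr), \tag{A4.5} \]
where we recall that $\beta^{(q)}_{k,m}(\tilde\omega, \varepsilon, \{\nu^{(q-1)}_{k'}(\tilde\omega, \varepsilon)\})$ depend only on $\nu^{(q-1)}_{k'}(\tilde\omega, \varepsilon)$ with $k' \le k - 1$, and we can set
\[ \beta^{(q)}_{k,m}(\tilde\omega, \varepsilon, \{\nu^{(q-1)}_{k'}(\tilde\omega, \varepsilon)\}) = \beta^{(a)(q)}_{k,m}(\tilde\omega, \varepsilon, \{\nu^{(q-1)}_{k'}(\tilde\omega, \varepsilon)\}) - \beta^{(b)(q)}_{k,m}(\tilde\omega, \varepsilon, \{\nu^{(q-1)}_{k'}(\tilde\omega, \varepsilon)\}), \tag{A4.6} \]
according to the settings in (2.3). Then we can split the differences in (A4.5) into
\[ \beta^{(q)}_{k,m}(\tilde\omega, \varepsilon, \{\nu^{(q-1)}_{k'}(\tilde\omega, \varepsilon)\}) - \beta^{(q)}_{k,m}(\tilde\omega', \varepsilon, \{\nu^{(q-1)}_{k'}(\tilde\omega', \varepsilon)\}) = \Bigl( \beta^{(q)}_{k,m}(\tilde\omega, \varepsilon, \{\nu^{(q-1)}_{k'}(\tilde\omega, \varepsilon)\}) - \beta^{(q)}_{k,m}(\tilde\omega', \varepsilon, \{\nu^{(q-1)}_{k'}(\tilde\omega, \varepsilon)\}) \Bigr) + \Bigl( \beta^{(q)}_{k,m}(\tilde\omega', \varepsilon, \{\nu^{(q-1)}_{k'}(\tilde\omega, \varepsilon)\}) - \beta^{(q)}_{k,m}(\tilde\omega', \varepsilon, \{\nu^{(q-1)}_{k'}(\tilde\omega', \varepsilon)\}) \Bigr), \tag{A4.7} \]
and we bound separately the two terms.
The second term can be expressed as the sum of trees $\theta$ which differ from the previously considered ones as, among the nodes $v$ of w-type with only one entering line, there are some with $\nu^{(c_V)(q-1)}_{k_V}(\tilde\omega, \varepsilon)$, some with $\nu^{(c_V)(q-1)}_{k_V}(\tilde\omega', \varepsilon)$ and one with $\nu^{(c_V)(q-1)}_{k_V}(\tilde\omega, \varepsilon) - \nu^{(c_V)(q-1)}_{k_V}(\tilde\omega', \varepsilon)$. Then we can bound
\[ \Bigl| \beta^{(q)}_{k,m}(\tilde\omega', \{\nu^{(q-1)}_{k'}(\tilde\omega, \varepsilon)\}) - \beta^{(q)}_{k,m}(\tilde\omega', \{\nu^{(q-1)}_{k'}(\tilde\omega', \varepsilon)\}) \Bigr| \le D_1 \varepsilon \sup_{h' \ge 0}\, \sup_{|m'| \ge 2} \bigl| \Delta\nu^{(q-1)}_{h',m'} \bigr| \le D_1 C \varepsilon^2\, \|\tilde\omega - \tilde\omega'\|_\infty, \tag{A4.8} \]
by the inductive hypothesis.
We are left with the first term in (A4.7). We can reason as in [23] (which we refer to for details), and at the end, instead of (A9.14) of [23], we obtain
\[ \Bigl| \mathrm{Val}(\theta_0') \bigl( g_j^{(h_j)}(\tilde\omega') - g_j^{(h_j)}(\tilde\omega) \bigr) \mathrm{Val}(\theta_1') \prod_{i=2}^{s} \mathrm{Val}(\theta_i) \Bigr| \le C^{|V(T)|}\, e^{-\kappa K(T)/2}\, \varepsilon^{|V_w(T)| + |V_v^1(T)|}\, \|\tilde\omega' - \tilde\omega\|_\infty \le C^{|V(T)|}\, \varepsilon^{|V_w(T)| + |V_v^1(T)|}\, e^{-\kappa K(T)/4}\, e^{-\kappa 2^{(k-1)/\tau}/4}\, \|\tilde\omega' - \tilde\omega\|_\infty, \tag{A4.9} \]
where $K(T)$ is defined in (4.7). Therefore we can bound $\|\tilde\omega^{(p)}(\varepsilon) - \tilde\omega^{(p-1)}(\varepsilon)\|_\infty$ with a constant times $\varepsilon$ times the same expression with $p$ replaced with $p - 1$, i.e. $\|\tilde\omega^{(p-1)}(\varepsilon) - \tilde\omega^{(p-2)}(\varepsilon)\|_\infty$, so that, by the inductive hypothesis, the bound (6.12) follows.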
References
1. Abramowitz, M., Stegun, I.A.: Handbook of mathematical functions. New York: Dover, 1965
2. Bambusi, D.: Lyapunov center theorem for some nonlinear PDE's: a simple proof. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 29(4), 823–837 (2000)
3. Bambusi, D., Paleari, S.: Families of periodic solutions of resonant PDEs. J. Nonlinear Sci. 11(1), 69–87 (2001)
4. Berti, M., Bolle, Ph.: Periodic solutions of nonlinear wave equations with general nonlinearities. Commun. Math. Phys. 243(2), 315–328 (2003)
5. Berti, M., Bolle, Ph.: Multiplicity of periodic solutions of nonlinear wave equations. Nonlinear Anal. 56, 1011–1046 (2004)
6. Birkhoff, G.D., Lewis, D.C.: On the periodic motions near a given periodic motion of a dynamical system. Ann. Math. 12, 117–133 (1933)
7. Bourgain, J.: Construction of quasi-periodic solutions for Hamiltonian perturbations of linear equations and applications to nonlinear PDE. Internat. Math. Res. Notices 1994(11), 475–497 (1994)
8. Bourgain, J.: Construction of periodic solutions of nonlinear wave equations in higher dimension. Geom. Funct. Anal. 5, 629–639 (1995)
9. Bourgain, J.: Quasi-periodic solutions of Hamiltonian perturbations of 2D linear Schrödinger equations. Ann. of Math. (2) 148(2), 363–439 (1998)
10. Bourgain, J.: Periodic solutions of nonlinear wave equations. In: Harmonic analysis and partial differential equations (Chicago, IL, 1996), Chicago Lectures in Math. Chicago, IL: Univ. Chicago Press, 1999, pp. 69–97
11. Brézis, H., Coron, J.-M., Nirenberg, L.: Free vibrations for a nonlinear wave equation and a theorem of P. Rabinowitz. Commun. Pure Appl. Math. 33(5), 667–684 (1980)
12. Brézis, H., Nirenberg, L.: Forced vibrations for a nonlinear wave equation. Commun. Pure Appl. Math. 31(1), 1–30 (1978)
13. Craig, W.: Problèmes de petits diviseurs dans les équations aux dérivées partielles. Panoramas and Syntheses 9. Paris: Société Mathématique de France, 2000
14. Craig, W., Wayne, C.E.: Newton's method and periodic solutions of nonlinear wave equations. Commun. Pure Appl. Math. 46, 1409–1498 (1993)
15. Craig, W., Wayne, C.E.: Nonlinear waves and the 1 : 1 : 2 resonance. In: Singular limits of dispersive waves (Lyon, 1991), NATO Adv. Sci. Inst. Ser. B Phys. 320, New York: Plenum, 1994, pp. 297–313
16. Eliasson, L.H.: Absolutely convergent series expansions for quasi periodic motions. Math. Phys. Electron. J. 2, Paper 4, 33 pp. (electronic) (1996)
17. Fadell, E.R., Rabinowitz, P.H.: Generalized cohomological index theories for Lie group actions with an application to bifurcation questions for Hamiltonian systems. Invent. Math. 45(2), 139–174 (1978)
18. Fröhlich, J., Spencer, T.: A rigorous approach to Anderson localization. In: Common trends in particle and condensed matter physics (Les Houches, 1983). Phys. Rep. 103(1-4), 9–25 (1984)
19. Gallavotti, G.: Twistless KAM tori. Commun. Math. Phys. 164(1), 145–156 (1994)
20. Gallavotti, G., Gentile, G.: Hyperbolic low-dimensional invariant tori and summations of divergent series. Commun. Math. Phys. 227(3), 421–460 (2002)
21. Gallavotti, G., Gentile, G., Mastropietro, V.: A field theory approach to Lindstedt series for hyperbolic tori in three time scales problems. J. Math. Phys. 40(12), 6430–6472 (1999)
22. Gentile, G.: Whiskered tori with prefixed frequencies and Lyapunov spectrum. Dynam. Stability Systems 10(3), 269–308 (1995)
23. Gentile, G., Mastropietro, V.: Construction of periodic solutions of the nonlinear wave equation with Dirichlet boundary conditions by the Lindstedt series method. J. Math. Pures Appl. 83(8), 1019–1065 (2004)
24. Godsil, C., Royle, G.: Algebraic graph theory. Graduate Texts in Mathematics 207. New York: Springer, 2001
25. Gradshteyn, I.S., Ryzhik, I.M.: Table of integrals, series, and products. Sixth edition, San Diego: Academic Press, Inc., 2000
26. Harary, F., Palmer, E.M.: Graphical enumeration. New York-London: Academic Press, 1973
27. Kuksin, S.B.: Nearly integrable infinite-dimensional Hamiltonian systems. Lecture Notes in Mathematics 1556, Berlin: Springer, 1994
28. Kuksin, S.B.: Fifteen years of KAM for PDE. In: Geometry, Topology, and Mathematical Physics, Amer. Math. Soc. Transl. Ser. 2, 212, pp. 237–258, Amer. Math. Soc. Providence, RI, 2004
29. Kuksin, S.B., Pöschel, J.: Invariant Cantor manifolds of quasi-periodic oscillations for a nonlinear Schrödinger equation. Ann. of Math. (2) 143(1), 149–179 (1996)
30. Lidskiĭ, B.V., Shul'man, E.I.: Periodic solutions of the equation $u_{tt} - u_{xx} + u^3 = 0$. Funct. Anal. Appl. 22(4), 332–333 (1988)
490
G. Gentile, V. Mastropietro, M. Procesi
31. Lyapunov, A.M.: Probl`eme g´en´eral de la stabilit´e du mouvement. Ann. Sc. Fac. Toulouse 2, 203–474 (1907) 32. Paleari, S., Bambusi, D., Cacciatori, S.: Normal form and exponential stability for some nonlinear string equations. Z. Angew. Math. Phys. 52(6), 1033–1052 (2001) 33. P¨oschel, J.: Quasi-periodic solutions for a nonlinear wave equation. Comment. Math. Helv. 71(2), 269–296 (1996) 34. Rabinowitz, H.P.: Periodic solutions of nonlinear hyperbolic partial differential equations. Commun. Pure Appl. Math. 20, 145–205 (1967) 35. Rabinowitz, P.H.: Free vibrations for a semilinear wave equation. Commun. Pure Appl. Math. 31(1), 31–68 (1978) 36. Wayne, C.E.: Periodic and quasi-periodic solutions of nonlinear wave equations via KAM theory. Commun. Math. Phys. 127(3), 479–528 (1990) 37. Weinstein, A.: Normal modes for nonlinear Hamiltonian systems. Invent. Math. 20, 47–57 (1973) 38. Whitney, H.: Analytic extensions of differential functions defined in closed sets. Trans. Amer. Math. Soc. 36(1), 63–89 (1934) Communicated by G. Gallavotti
Commun. Math. Phys. 256, 491–511 (2005) Digital Object Identifier (DOI) 10.1007/s00220-005-1306-9
Communications in
Mathematical Physics
Melvin Models and Diophantine Approximation

David Kutasov¹, Jens Marklof², Gregory W. Moore³

¹ EFI and Department of Physics, University of Chicago, Chicago, IL 60637, USA
² School of Mathematics, University of Bristol, Bristol, BS8 1TW, U.K.
³ Department of Physics, Rutgers University, Piscataway, NJ 08854-8019, USA
Received: 19 July 2004 / Accepted: 1 September 2004 Published online: 22 March 2005 – © Springer-Verlag 2005
Abstract: Melvin models with irrational twist parameter provide an interesting example of conformal field theories with non-compact target space, and localized states which are arbitrarily close to being delocalized. We study the torus partition sum of these models, focusing on the properties of the regularized dimension of the space of localized states. We show that its behavior is related to interesting arithmetic properties of the twist parameter γ , such as the Lyapunov exponent. Moreover, for γ in a set of measure one the regularized dimension is in fact not a well-defined number but must be considered as a random variable in a probability distribution.
1. Introduction

Two-dimensional conformal field theories (CFTs) corresponding to defects embedded in non-compact target spaces have many applications in string theory and are interesting in their own right [1, 2]. As in scattering problems in quantum mechanics, the eigenstates of the Hamiltonian in such theories split into two classes. One consists of delta-function normalizable scattering states, which can propagate in the whole non-compact space. The other corresponds to normalizable states localized near the defect. In order to study the defect, one is particularly interested in the localized states and their interactions with the scattering states.

An example that has received some attention in recent years is provided by orbifolds of flat non-compact space. In this case, the delocalized (scattering) states belong to the untwisted sector of the orbifold, while the localized ones are twisted sector states. For orbifolds by a finite group, the spectrum of localized states is discrete, with finite gaps between states. By contrast, orbifolds by infinite groups can have discrete but dense spectra of states. The latter case is particularly interesting since there is then no sharp distinction between localized and delocalized states.
More generally, while orbifolds by finite groups are well studied, orbifolds by infinite groups introduce many new features and have not been well studied (some work has been done on time-dependent orbifolds; see [3] for a review). A better understanding of general CFT orbifolds by infinite groups might provide insights into string cosmology, the AdS/CFT correspondence, and noncommutative geometry [4].

In this note we will study an orbifold of $\mathbb{R}\times\mathbb{C}$ by the group $\mathbb{Z}$, known as the Melvin model, or the twisted circle. These models were introduced and studied in [5–10]. For further background on the Melvin model see [11, 2] and references therein. We will see that as we vary the orbifold twist parameter, the model exhibits some unusual behavior, including divergences associated with a sum over almost delocalized twisted sector states. These divergences can be quantified using some results from the theory of Diophantine approximation. For background on Diophantine approximation see, e.g., [12–16].

The Melvin CFT is the orbifold
\[ (\mathbb{R}\times\mathbb{C})/\mathbb{Z}, \tag{1.1} \]
where the generator $g$ of the group $\mathbb{Z}$ acts as
\[ y \to y + 2\pi R, \qquad z \to e^{2\pi i\gamma} z \tag{1.2} \]
for $(y, z)\in\mathbb{R}\times\mathbb{C}$. When $\gamma$ is rational, e.g. $\gamma = 1/n$, one can think of the orbifold (1.2) as a $\mathbb{Z}_n$ orbifold of $S^1\times\mathbb{C}$. For irrational $\gamma$, it is not clear a priori whether (1.1) makes sense as a CFT (and string theory) background. One of our motivations below will be to explore this issue, by studying the torus partition sum of the theory. We will see that for irrational $\gamma$ the partition sum is very sensitive to the number theoretic properties of $\gamma$ (physical effects related to the arithmetic of irrational angles have appeared in some other recent investigations in string theory; see e.g. [17–19]). We will mostly focus on the CFT $(\mathbb{R}\times\mathbb{C})/\mathbb{Z}$. In string theory on $\mathbb{R}^{1,6}\times(\mathbb{R}\times\mathbb{C})/\mathbb{Z}$ the consistency requirements for the existence of the theory are more stringent, and it is possible that the theory does not exist for irrational $\gamma$.

Although the orbifold (1.2) does not have fixed points, one can think of the origin of the $z$-plane as the location of a defect, near which the twisted states of the orbifold are localized. Indeed, consider a low lying state in the $w$-twisted sector. It winds $w$ times around the circle $\mathbb{R}/2\pi R\mathbb{Z}$ labelled by $y$. Its endpoints in the $z$-plane are separated by the angle $2\pi\|w\gamma\|$, where, for a real number $x$, $\|x\|$ denotes the distance to the nearest integer.¹ A classical string placed a distance $r$ from the origin has energy
\[ \alpha'^2 M^2(r) = (Rw)^2 + (r\|w\gamma\|)^2. \tag{1.3} \]
When $\|w\gamma\| \neq 0$, such winding strings are localized near the origin: their wavefunctions fall off exponentially as $r\to\infty$. The radial size of such $w$-twisted strings goes like $1/\|w\gamma\|$. In fact, we see from (1.3) that strings stretched in the angular direction of the $z$-plane behave as if their effective tension is proportional to $\|w\gamma\|$; this will be important for our later discussion. After quantization, the reduced string tension is reflected in the presence of twisted oscillators for the worldsheet superfield $z$ with moding $\|w\gamma\|$.

¹ Thus, defining the fractional part of $x$, $\{x\} = x - [x]$, one has $\|x\| = \min(\{x\}, 1 - \{x\})$.
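The distance-to-nearest-integer function and the classical energy (1.3) are simple to evaluate numerically. The following is a minimal sketch, not part of the original text; the function names and the default $\alpha' = 1$ are assumptions made for illustration.

```python
import math

def dist_to_nearest_int(x):
    """Return ||x||, the distance from x to the nearest integer."""
    frac = x - math.floor(x)        # fractional part {x} = x - [x]
    return min(frac, 1.0 - frac)

def classical_mass_squared(r, w, gamma, R, alpha_prime=1.0):
    """M^2(r) from (1.3): alpha'^2 M^2 = (R w)^2 + (r ||w gamma||)^2.

    alpha_prime defaults to 1 purely for illustration.
    """
    stretch = r * dist_to_nearest_int(w * gamma)
    return ((R * w) ** 2 + stretch ** 2) / alpha_prime ** 2
```

For a rational twist such as $\gamma = 1/4$, the sector $w = 4$ has $\|w\gamma\| = 0$, so the energy no longer grows with $r$ and the string is not confined to the origin.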
For rational $\gamma$, the radial size $1/\|w\gamma\|$ is bounded from above in the twisted sectors. Thus, there is a clear distinction between localized and delocalized sectors. When $\gamma$ is irrational there are twisted sectors that are arbitrarily close to being delocalized, since $\|w\gamma\|$ is not bounded away from zero.

To study the theory for irrational $\gamma$, we would like to analyze the partition sum of the CFT (1.1) on a torus with modulus $q = e^{2\pi i\tau}$; this corresponds to the trace of $q^{L_0-c/24}\,\bar q^{\bar L_0-c/24}$ over the eigenmodes of $(L_0, \bar L_0)$. For non-compact orbifolds, the trace over the untwisted sector is divergent: it is proportional to the volume of the target space. Sometimes, it is possible to regulate similar volume divergences by compactifying the space, but here this is not possible without breaking conformal invariance. In [20] it was proposed, in a related context, to restrict the trace in the torus partition sum to the localized states, i.e. to the twisted sectors of the orbifold. This eliminates the usual volume divergence from the untwisted sector but, as we will see, leaves in some cases analogous divergences from “almost untwisted” sectors. The partition sum of the localized states is given by
\[ Z_{\rm loc}(\tau;\gamma) := \mathrm{Tr}_{\mathcal{H}_{\rm loc}}\, q^{L_0-c/24}\,\bar q^{\bar L_0-c/24}, \tag{1.4} \]
where $\mathcal{H}_{\rm loc}$ is a sum over twisted sectors
\[ \mathcal{H}_{\rm loc}(\gamma) := \bigoplus_{\|w\gamma\|\neq 0} \mathcal{H}_w. \tag{1.5} \]
The partition sum $Z_{\rm loc}(\tau;\gamma)$ is not modular invariant. It transforms under $\tau\to -1/\tau$ to the trace over the untwisted Hilbert space with a certain projection operator inserted. This is analogous to what happens for D-branes: the annulus amplitude, which can be thought of as a trace over open string states whose ends lie on the D-brane, is related by a modular transformation to a sum over closed strings that can be emitted by the D-brane.

By analogy to the D-brane case, it was proposed in [20] to study the regularized dimension of the space of localized states, which is given by (1.4) in the limit² $q\to 1$. For non-compact orbifolds by finite groups one finds in this limit
\[ Z_{\rm loc}(\tau\to 0;\gamma) \sim g_{cl}(\gamma)\, e^{\frac{\pi c}{6\tau_2}}. \tag{1.6} \]
The leading exponential term in (1.6) is universal: it only depends on the central charge (or dimension of space). Thus, one can think of the quantity $g_{cl}$ as a measure of the density of localized states. Some properties of $g_{cl}$ for finite orbifold groups were described in [20].

As we will see, in the irrational Melvin case the coefficient of the exponential in (1.6) behaves in an unusual way and does not have a good limit as $\tau_2\to 0$. First, it diverges like $\tau_2^{-b(\gamma)}$, with some constant $b(\gamma)\ge 1/2$. Moreover, and somewhat surprisingly, the coefficient of this divergence, while it is of order 1, does not have a well-defined limit as $\tau_2\to 0$ but varies as a random variable in a probability distribution. We explain this point, which is somewhat novel in conformal field theory, in Sects. 3.2, 3.3 below. A rigorous account is given in the appendix. One nice aspect of the discussion is that the behavior of the regularized dimension is related to the behavior of geodesics on a certain modular curve.

The regularized dimension of the space of localized states (1.6) is analogous to a similar regularized dimension which proved useful in RCFT [21]. The D-brane analog of $g_{cl}$ is the product of the tensions of the D-branes on which the open strings end (see, e.g., [22]). Note also that in models with spacetime fermions, the trace in (1.4) is usually taken to include a factor of $(-)^F$, such that spacetime fermions contribute with a minus sign, and there are usually large cancellations between bosons and fermions. For the purpose of estimating the high energy density of states, we should only sum over spacetime bosons (or over bosons plus fermions); see e.g. [23, 24] for a discussion of the relevant issues.

² We set $\tau_1 = 0$, such that $q = e^{-2\pi\tau_2}$, and take $\tau_2\to 0$.

In the remainder of this note we will study the behavior of the torus partition sum, and in particular of $g_{cl}$ (1.6), for irrational Melvin models. The main results are:

(1) When the twist $\gamma$ of the Melvin model is a Liouville number of a special kind, the one-loop partition function for bosons and fermions separately diverges for fixed $\tau$, although the string theory partition sum $Z_B - Z_F$ is finite. For such twists it is not clear that the Melvin conformal field theory makes sense. For $\gamma$ of Diophantine type this pathology is absent.

(2) The standard definition (1.6) of the regularized dimension determines not a number, but a random variable in a probability distribution. This is explained heuristically in Sects. 3.2 and 3.3. A rigorous discussion is given in the appendix.

(3) We can use the continued fraction approximations to $\gamma$ to define a modular invariant regulator in the case of irrational twists. We define a degree of delocalization and show that it is related to the Lyapunov exponent of $\gamma$.
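2. Torus Partition Sum

Using the definition of the Melvin CFT (1.1), (1.2), one can write the torus partition sum of the model. In the sector twisted by $g^s$, $s\in\mathbb{Z}$ ($s\neq 0$), one has:³
\[
\mathrm{Tr}_{\mathcal{H}_{g^s}}\, g^t\, q^{L_0-c/24}\,\bar q^{\bar L_0-c/24}
= \mathrm{vol}(\mathbb{R}) \int_{-\infty}^{+\infty}\frac{dp}{2\pi}\;
q^{\frac{\alpha'}{4}(p+sR/\alpha')^2}\,\bar q^{\frac{\alpha'}{4}(p-sR/\alpha')^2}\, e^{2\pi i(pR)t}\,
\left|\frac{\vartheta\begin{bmatrix}\epsilon_1\\ \epsilon_2\end{bmatrix}(0|\tau)}{\eta^3}\right|\,
\left|\frac{\vartheta\begin{bmatrix}\epsilon_1+s\gamma\\ \epsilon_2+t\gamma\end{bmatrix}(0|\tau)}{\vartheta\begin{bmatrix}\frac12+s\gamma\\ \frac12+t\gamma\end{bmatrix}(0|\tau)}\right|^2 .
\tag{2.1}
\]
The $|\vartheta/\eta^3|$ prefactor is the contribution of the (bosonic and fermionic) oscillators on $\mathbb{R}$. $\epsilon_1, \epsilon_2 = 0, \frac12$ label the spin structure of the fermions. The final ratio of theta functions is the partition function of the $N = 2$ superfield twisted by $g^s$ and projected by $g^t$. Note that it only depends on the fractional parts $\{s\gamma\}$, $\{t\gamma\}$.

In the orbifold theory we must sum over $g^t$, $t\in\mathbb{Z}$, to project onto invariant states, and divide by the order of the group $\mathbb{Z}$. We interpret $\mathrm{vol}(\mathbb{R})/|\mathbb{Z}| = 2\pi R$. There is no factor of the volume of $\mathbb{C}$ because we are in a twisted sector. The net result is that the trace in the $g^s$ twisted sector in the orbifold theory is
\[
\mathrm{Tr}_{\mathcal{H}_{g^s}}\, q^{L_0-c/24}\,\bar q^{\bar L_0-c/24}
= 2\pi R \sum_{t\in\mathbb{Z}} \int_{-\infty}^{+\infty}\frac{dp}{2\pi}\;
q^{\frac{\alpha'}{4}(p+sR/\alpha')^2}\,\bar q^{\frac{\alpha'}{4}(p-sR/\alpha')^2}\, e^{2\pi i(pR)t}\,
\left|\frac{\vartheta\begin{bmatrix}\epsilon_1\\ \epsilon_2\end{bmatrix}(0|\tau)}{\eta^3}\right|\,
\left|\frac{\vartheta\begin{bmatrix}\epsilon_1+s\gamma\\ \epsilon_2+t\gamma\end{bmatrix}(0|\tau)}{\vartheta\begin{bmatrix}\frac12+s\gamma\\ \frac12+t\gamma\end{bmatrix}(0|\tau)}\right|^2 .
\tag{2.2}
\]
In order to evaluate the $\tau\to 0$ asymptotics it is convenient to do the Gaussian integral over $p$ to get
\[
\sqrt{\frac{R^2}{\alpha'\tau_2}}\; \sum_{t\in\mathbb{Z}} e^{-\frac{\pi R^2}{\alpha'\tau_2}|t+s\tau|^2}\,
\left|\frac{\vartheta\begin{bmatrix}\epsilon_1\\ \epsilon_2\end{bmatrix}(0|\tau)}{\eta^3}\right|\,
\left|\frac{\vartheta\begin{bmatrix}\epsilon_1+s\gamma\\ \epsilon_2+t\gamma\end{bmatrix}(0|\tau)}{\vartheta\begin{bmatrix}\frac12+s\gamma\\ \frac12+t\gamma\end{bmatrix}(0|\tau)}\right|^2 .
\tag{2.3}
\]

³ Our convention for theta functions is
\[
\frac{\vartheta\begin{bmatrix}\theta\\ \phi\end{bmatrix}(0|\tau)}{\eta}
= e^{2\pi i\theta\phi}\, q^{\frac{\theta^2}{2}-\frac{1}{24}} \prod_{n=1}^{\infty}\left(1+e^{2\pi i\phi}q^{n-\frac12+\theta}\right)\left(1+e^{-2\pi i\phi}q^{n-\frac12-\theta}\right),
\]
where $\eta$ is the Dedekind eta function.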
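The Gaussian integral taking (2.2) to (2.3) can be checked directly. The following is a sketch of that step, writing $\tau = \tau_1 + i\tau_2$ and using only the definitions above. The $q$-dependent factors combine as
\[
q^{\frac{\alpha'}{4}(p+sR/\alpha')^2}\,\bar q^{\frac{\alpha'}{4}(p-sR/\alpha')^2}
= \exp\!\left(2\pi i\tau_1\, psR - \pi\alpha'\tau_2\, p^2 - \frac{\pi\tau_2 s^2R^2}{\alpha'}\right),
\]
so that, using $\int_{-\infty}^{\infty} e^{-ap^2+ibp}\,dp = \sqrt{\pi/a}\; e^{-b^2/4a}$ with $a = \pi\alpha'\tau_2$ and $b = 2\pi R(t+s\tau_1)$,
\[
2\pi R\int_{-\infty}^{+\infty}\frac{dp}{2\pi}\, e^{-\pi\alpha'\tau_2 p^2 + 2\pi i pR(t+s\tau_1)}\, e^{-\frac{\pi\tau_2 s^2R^2}{\alpha'}}
= \frac{R}{\sqrt{\alpha'\tau_2}}\,\exp\!\left(-\frac{\pi R^2\left[(t+s\tau_1)^2+s^2\tau_2^2\right]}{\alpha'\tau_2}\right)
= \sqrt{\frac{R^2}{\alpha'\tau_2}}\; e^{-\frac{\pi R^2}{\alpha'\tau_2}|t+s\tau|^2}.
\]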
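Next we have to sum over the different twisted sectors and spin structures. The precise details of the sum depend on the particular theory (CFT on the orbifold, type 0 or type II string theory on $\mathbb{R}^{1,6}$ times the orbifold, etc.); see e.g. [25] for a discussion. The different theories behave in a similar way as far as our analysis is concerned. To be concrete, consider type IIB string theory on $\mathbb{R}^{1,1}\times T^5$ times the orbifold. Here, $T^5$ is a five-torus of volume $V_5$; the compactification is convenient for studying the $\tau_2\to 0$ limit of the partition sum. The partition function for the twisted (NS,NS) sectors is
\[
Z_{\rm loc} = \frac{V_5\, Z_\Gamma}{(2\pi\sqrt{\alpha'\tau_2})^5}
\sum_{\|s\gamma\|\neq 0,\; t\in\mathbb{Z}} \sqrt{\frac{R^2}{\alpha'\tau_2}}\; e^{-\frac{\pi R^2}{\alpha'\tau_2}|t+s\tau|^2}
\left|\,
\frac12\left(\frac{\vartheta\begin{bmatrix}0\\ 0\end{bmatrix}(0|\tau)}{\eta^3}\right)^{3}
\frac{\vartheta\begin{bmatrix}s\gamma\\ t\gamma\end{bmatrix}(0|\tau)}{\vartheta\begin{bmatrix}\frac12+s\gamma\\ \frac12+t\gamma\end{bmatrix}(0|\tau)}
-\frac12\, e^{-\pi i t s\gamma}
\left(\frac{\vartheta\begin{bmatrix}0\\ 1/2\end{bmatrix}(0|\tau)}{\eta^3}\right)^{3}
\frac{\vartheta\begin{bmatrix}s\gamma\\ 1/2+t\gamma\end{bmatrix}(0|\tau)}{\vartheta\begin{bmatrix}\frac12+s\gamma\\ \frac12+t\gamma\end{bmatrix}(0|\tau)}
\,\right|^2 .
\tag{2.4}
\]
$Z_\Gamma$ is a Siegel-Narain theta function of signature $(5, 5)$ corresponding to $T^5$. The behavior of the partition sum in the limit (1.6) does not depend on the details of the compactification.

To analyze the $\tau\to 0$ asymptotics of (2.3) we need the following asymptotics for $\tau = i\beta\to 0$, with $\beta$ real:
\[ |\eta(\tau)| \to \frac{1}{\sqrt{\beta}}\; e^{-\frac{2\pi}{24\beta}}. \tag{2.5} \]
Similarly,
\[
\vartheta\begin{bmatrix}\theta\\ \phi\end{bmatrix}(0|\tau) \to
\begin{cases}
\beta^{-1/2}\, e^{2\pi i\theta}\, \tilde q^{\frac12\|\phi\|^2}, & \text{if } \frac12 < \phi < 1,\\[2pt]
\beta^{-1/2}\, (1 + e^{2\pi i\theta})\, \tilde q^{\frac12\|\phi\|^2}, & \text{if } \phi = \frac12,\\[2pt]
\beta^{-1/2}\, \tilde q^{\frac12\phi^2}, & \text{if } 0\le\phi<\frac12,
\end{cases}
\tag{2.6}
\]
where $\tilde q = \exp(-2\pi/\beta)$. Using these asymptotic formulae one can check that the leading behavior arises from $\epsilon_1 = t = 0$ in (2.3), (2.4). One finds that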
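\[
Z_{\rm loc} \sim \frac{1}{16}\sqrt{\frac{R^2\tau_2}{\alpha'}}\; e^{\frac{2\pi}{\tau_2}} \sum_{\|s\gamma\|\neq 0} e^{-\frac{\pi R^2}{\alpha'}s^2\tau_2}\; \frac{1}{(\sin\pi s\gamma)^2}.
\tag{2.7}
\]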
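The small-$\beta$ behavior of the Dedekind eta function quoted in (2.5) is easy to test numerically. The sketch below is an illustration only: it evaluates $|\eta(i\beta)|$ from the standard product $q^{1/24}\prod_{n\ge 1}(1-q^n)$ with $q = e^{-2\pi\beta}$, truncated at an assumed number of factors, and compares it with the right-hand side of (2.5).

```python
import math

def abs_eta(beta, terms=5000):
    """|eta(i*beta)| from q^(1/24) * prod_{n>=1} (1 - q^n), q = exp(-2*pi*beta).

    The truncation `terms` is an assumption; it must be large compared
    with 1/beta for the product to have converged.
    """
    q = math.exp(-2.0 * math.pi * beta)
    product = 1.0
    for n in range(1, terms + 1):
        product *= 1.0 - q ** n
    return q ** (1.0 / 24.0) * product

def abs_eta_asymptotic(beta):
    """Right-hand side of (2.5): beta^(-1/2) * exp(-2*pi/(24*beta))."""
    return math.exp(-2.0 * math.pi / (24.0 * beta)) / math.sqrt(beta)
```

The two functions agree up to corrections of relative size $e^{-2\pi/\beta}$, which is why (2.5) is already very accurate for moderately small $\beta$.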
Before discussing the mathematical properties of (2.7) let us interpret the crucial factor $1/(\sin\pi s\gamma)^2$ in (2.7). As mentioned in the discussion following (1.3), twisted sectors with $\|s\gamma\| \ll 1$ give rise to nearly delocalized states whose radial size scales like $1/\|s\gamma\|$. This is reflected in the spectrum of $L_0$ in the following way. In the $s$-twisted sector, all states wind $s$ times around the $y$ circle, and thus have a large (for large $s$) ground state energy, of order $Rs$ (or $L_0\sim(Rs)^2$, see (1.3)). This gives the exponential prefactor in the sum (2.7). On top of this ground state energy, when $\|s\gamma\|$ is small, one finds a narrowly-spaced spectrum of states, associated with the twisted oscillators of the superfield $z$. This gives the inverse sine factor in (2.7). Thus, we see that this factor is directly related to the spatial extent of the twisted states.

If $\gamma$ is rational, $\gamma = p/q$ in lowest terms,⁴ (2.7) has a smooth $\tau_2\to 0$ limit. The limit is easily evaluated by setting $s = \ell q + j$, $0\le j\le q-1$, $\ell\in\mathbb{Z}$ (the terms with $j = 0$ have $\|s\gamma\| = 0$ and are excluded) to get
\[
Z_{\rm loc} \sim e^{2\pi/\tau_2}\; \frac{1}{16}\sqrt{\frac{R^2\tau_2}{\alpha'}}\; \sum_{j=1}^{q-1}\sum_{\ell\in\mathbb{Z}} e^{-\frac{\pi R^2}{\alpha'}q^2\tau_2(\ell+j/q)^2}\; \frac{1}{(\sin\pi pj/q)^2}.
\tag{2.8}
\]
Taking the $\tau_2\to 0$ limit we reproduce the familiar expression for the $\mathbb{C}/\mathbb{Z}_q$ orbifold [20]:
\[
\dim\mathcal{H}_{\rm loc}(\gamma = p/q) = \frac{1}{16q}\sum_{j=1}^{q-1}\frac{1}{(\sin\pi pj/q)^2}.
\tag{2.9}
\]
The trigonometric sum is easily evaluated [20],
\[
\dim\mathcal{H}_{\rm loc}(\gamma = p/q) = \frac{1}{48}\left(q - \frac{1}{q}\right).
\tag{2.10}
\]
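The agreement between the trigonometric sum (2.9) and the closed form (2.10) can be confirmed numerically for small denominators. This is a minimal sketch with hypothetical function names; it relies on the identity $\sum_{j=1}^{q-1} 1/\sin^2(\pi j/q) = (q^2-1)/3$ and on the fact that multiplication by $p$ permutes the nonzero residues mod $q$ when $p$ and $q$ are coprime.

```python
import math
from fractions import Fraction

def dim_loc_sum(p, q):
    """Right-hand side of (2.9): (1/(16 q)) * sum_{j=1}^{q-1} 1/sin^2(pi p j / q)."""
    total = sum(1.0 / math.sin(math.pi * p * j / q) ** 2 for j in range(1, q))
    return total / (16.0 * q)

def dim_loc_closed(q):
    """Right-hand side of (2.10): (q - 1/q)/48, as an exact rational."""
    return Fraction(q * q - 1, 48 * q)
```

For example, `dim_loc_sum(2, 5)` and `float(dim_loc_closed(5))` should agree to floating-point accuracy, and neither depends on the numerator $p$.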
We see that for rational $\gamma$, the Melvin model is closely related to the corresponding $\mathbb{C}/\mathbb{Z}_q$ orbifold. Note that the result only depends on $q$ and hence is a highly erratic function of $\gamma\in\mathbb{Q}$. This is the first indication that we are dealing with delicate functions of $\gamma$.

⁴ Here $q$ is an integer, not to be confused with the modular parameter $q = e^{-2\pi\tau_2}$.

3. Comments on the Sum in the Case of Irrational γ

Now we turn to the case of $\gamma$ irrational. Stripping the universal exponential in (1.6) from (2.7), we see that to compute $g_{cl}$ we need to evaluate
\[
g(y;\gamma) := \sqrt{y}\,\sum_{s\neq 0} e^{-\pi y s^2}\; \frac{1}{\sin^2\pi s\gamma}
\tag{3.1}
\]
in the limit $y\to 0$ (here $y = \tau_2R^2/\alpha'$).

First, note that it is not obvious that the sum converges for finite $y$ (or $\tau_2$). Indeed, we will see in Sect. 4 that for certain transcendental numbers it diverges. However, for a “large” class of irrational numbers, including all algebraic numbers, it does converge. Recall the standard

Definition. An irrational number $\gamma$ is of Diophantine type $(K,\sigma)$ if for all $q\ge 1$,
\[
\sigma_q(\gamma) := \inf_{1\le s\le q}\|s\gamma\| \ge \frac{K}{q^{1+\sigma}}.
\tag{3.2}
\]
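We denote the set of numbers of Diophantine type $(K,\sigma)$ by $D(K,\sigma)$, and we also denote
\[ D(\sigma) := \bigcup_{K>0} D(K,\sigma). \tag{3.3} \]
If $\gamma$ is of Diophantine type $(K,\sigma)$ then $g(y;\gamma)$ exists for all positive $y$. To show this we use
\[ 2\|z\| \le |\sin\pi z| < \pi\|z\|, \tag{3.4} \]
(the best estimate valid for all real, non-integer $z$) to put upper and lower bounds on $g(y;\gamma)$:
\[
\sum_{s\neq 0} e^{-\pi y s^2}\,\frac{1}{\pi^2\|s\gamma\|^2}
< \sum_{s\neq 0} e^{-\pi y s^2}\,\frac{1}{\sin^2\pi s\gamma}
< \sum_{s\neq 0} e^{-\pi y s^2}\,\frac{1}{4\|s\gamma\|^2}.
\tag{3.5}
\]
If $\gamma$ is of type $(K,\sigma)$ then
\[ \frac{1}{\|s\gamma\|} \le s^{1+\sigma}/K, \tag{3.6} \]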
and hence by (3.5) the series is bounded above by a convergent sum.

Some interesting facts, which can be found in [12–16], are, first, that the set $D(\sigma)$ is invariant under $SL(2,\mathbb{Z})$ (acting via fractional linear transformations on the elements of $D(\sigma)$). Second, a theorem of Roth says that if $\gamma$ is algebraic of degree $\ge 2$ then it is of type $(K,\sigma)$ for all $\sigma > 0$ and some $K$.⁵

Diophantine approximation can give us some idea of what the asymptotics of $g(y;\gamma)$ might be like. If there are many very good rational approximants to $\gamma$ then $\sin\pi s\gamma$ is “often” close to zero, and we expect a divergence as $y\to 0$. If good rational approximants to $\gamma$ are “rare” then the lower limit in (3.5) is more accurate and $g(y;\gamma)$ will grow more slowly. What we can say rigorously is that if $\gamma$ is of Diophantine type $(K,\sigma)$ then, from (3.6),
\[ C_1 \le g(y;\gamma) \le C_2\, y^{-\sigma-1} \tag{3.7} \]
for some constants $C_i$. Therefore, we can define a non-negative number $b(\gamma)$ by:
\[ b(\gamma) := \inf\{b : \lim_{y\to 0} y^b g(y;\gamma) = 0\}. \tag{3.8} \]
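To get a feeling for the size of $g(y;\gamma)$, the sum (3.1) can be evaluated by brute force for a badly approximable twist. The following sketch is illustrative only: the cutoff rule controlled by `tol` and the choice of the golden mean as test value are assumptions, and the truncation is only sensible for $\gamma$ of Diophantine type, where the summand decays like a Gaussian times a power of $s$.

```python
import math

def g(y, gamma, tol=1e-16):
    """Truncated evaluation of (3.1):
    sqrt(y) * sum_{s != 0} exp(-pi y s^2) / sin^2(pi s gamma).

    The summand is even in s, so we sum over s >= 1 and double.
    Terms with exp(-pi y s^2) < tol are dropped (an assumed cutoff).
    """
    s_max = int(math.sqrt(-math.log(tol) / (math.pi * y))) + 1
    total = 0.0
    for s in range(1, s_max + 1):
        total += math.exp(-math.pi * y * s * s) / math.sin(math.pi * s * gamma) ** 2
    return 2.0 * math.sqrt(y) * total

GOLDEN = (math.sqrt(5.0) - 1.0) / 2.0   # [0; 1, 1, 1, ...], all partial quotients 1
```

Evaluating `math.sqrt(y) * g(y, GOLDEN)` for a decreasing sequence of $y$ gives values that stay of order one, in line with a divergence of $g$ like $y^{-1/2}$ for this particular $\gamma$.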
⁵ A much easier theorem of Liouville, which is all we need to establish convergence for algebraic numbers, says that a degree $n\ge 2$ algebraic number is of Diophantine type $(K, n-2)$.

We next show that $b(\gamma)\ge 1/2$.

3.1. A lower bound for b(γ). To show that $g(y;\gamma)$ always diverges for $y\to 0$ at least as strongly as $1/\sqrt{y}$, we use the continued fraction expansion in positive integers $a_n$:
\[
\gamma = [a_0, a_1, a_2, \dots] = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots}}.
\tag{3.9}
\]
The integers $a_n$ are known as partial quotients. The best rational approximants to $\gamma$ are always provided by the convergents
\[ \frac{p_n}{q_n} := [a_0, \dots, a_n] \tag{3.10} \]
in the continued fraction expansion:
\[ \left|\gamma - \frac{p_n}{q_n}\right| < \frac{1}{q_n^2}. \tag{3.11} \]
The $q_n$ grow exponentially as a function of $n$. Roughly speaking,
\[ q_n \sim c\, e^{\frac12\lambda(\gamma)n}, \tag{3.12} \]
and more rigorously:⁶
\[ \lambda(\gamma) := 2\lim_{n\to\infty}\frac{1}{n}\log q_n. \tag{3.13} \]
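The convergents in (3.10) and a finite-$n$ estimate of the limit (3.13) can be generated with the standard recurrence $p_n = a_np_{n-1}+p_{n-2}$, $q_n = a_nq_{n-1}+q_{n-2}$. The sketch below uses exact integer arithmetic; the function names are hypothetical, and the estimate is only meaningful for $\gamma$ where the limit in (3.13) exists.

```python
import math
from fractions import Fraction

def convergents(partial_quotients):
    """Yield (p_n, q_n) for [a_0; a_1, a_2, ...] via the standard recurrence."""
    p_prev, q_prev = 1, 0                 # (p_{-1}, q_{-1})
    p, q = partial_quotients[0], 1        # (p_0, q_0)
    yield p, q
    for a in partial_quotients[1:]:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        yield p, q

def lyapunov_estimate(partial_quotients):
    """Finite-n estimate (2/n) * log(q_n) of the Lyapunov exponent in (3.13)."""
    n = len(partial_quotients) - 1
    *_, (p_n, q_n) = convergents(partial_quotients)
    return 2.0 * math.log(q_n) / n
```

For the golden mean, `lyapunov_estimate([0] + [1] * 200)` is close to $2\log\frac{1+\sqrt5}{2}\approx 0.962$, since the $q_n$ are then Fibonacci numbers.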
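The quantity $\lambda(\gamma)$ is known as the Lyapunov exponent of $\gamma$. Taking a lower bound on $g(y;\gamma)$ by summing only over $s = \pm q_n$ and using (3.5), (3.11), one can show that
\[
g(y;\gamma) > \frac{2}{\pi^2}\sqrt{y}\,\sum_{n=1}^{\infty}\frac{e^{-\pi y q_n^2}}{\|q_n\gamma\|^2}
> \frac{2}{\pi^2}\sqrt{y}\,\sum_{n=1}^{\infty} q_n^2\, e^{-\pi y q_n^2}.
\tag{3.14}
\]
Now, using (3.12) we see that the divergence as $y\to 0$ is at least as strong as
\[
g(y;\gamma) \ge \frac{1}{y^{1/2}}\,\frac{2}{\pi^3\lambda(\gamma)}.
\tag{3.15}
\]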
⁶ There are $\gamma$’s for which the limit does not exist.

3.2. g(y; γ) and the three gap theorem. Some further insight can be gained on the behavior of $g(y;\gamma)$ as $y\to 0$ using the three-gap theorem of [26]. The asymptotics of $g(y;\gamma)$ as $y\to 0$ are the same as the $N\to\infty$ asymptotics of the sum
\[ g_N(\gamma) = N^{-1}\sum_{n=1}^{N}\|n\gamma\|^{-b} \tag{3.16} \]
in the case $b = 2$, where we identify $y\sim 1/N^2$. It is useful in the discussion to keep $b$ general. In the case $b\le 0$, Kronecker’s theorem, which tells us that the $n\gamma$ are uniformly distributed, implies
\[ g_N(\gamma) \to \int_0^1 \|x\|^{-b}\, dx. \tag{3.17} \]
The same holds when $0 < b < 1$; however, one needs to assume $\gamma$ is Diophantine of type $\sigma$, where $\sigma$ depends on $b$. This is because values close to zero might cause some divergence since $\|x\|^{-b}$ is unbounded there. Estimates for the case $b = 1$ are also classic [27]. We are here interested in $b > 1$. In this case the sum is dominated by a finite number of terms. In order to see this, order the points $\|n\gamma\|$ ($n = 1, \dots, N$) in the interval $[0, 1/2]$ and label them by
\[ 0 < \xi_1 < \dots < \xi_N < 1/2. \tag{3.18} \]
So
\[ g_N(\gamma) = N^{-1}\sum_{n=1}^{N}\xi_n^{-b}. \tag{3.19} \]
We now summarize the results of [26]. Label the fractional parts of $n\gamma$ by
\[ 0 < \eta_1 < \dots < \eta_N < 1. \tag{3.20} \]
The “three gap theorem” states that every spacing $\eta_{n+1} - \eta_n$ is equal to either $\alpha$, $\beta$ or $\alpha+\beta$, where $\alpha = \eta_1$ and $\beta = 1 - \eta_N$. In [26] one finds formulae for $\alpha$, $\beta$ in terms of the continued fraction approximation of $\gamma$. In particular $\alpha$ and $\beta$ in general have no asymptotics as $N\to\infty$. Now if $\gamma$ is of bounded type (i.e. if the partial quotients $a_n$ are bounded by some constant), one finds immediately from the three gap theorem that there are constants $c, C > 0$ such that
\[ c/N \le \alpha \le C/N, \qquad c/N \le \beta \le C/N. \tag{3.21} \]
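The three gap statement above can be tested with exact rational arithmetic. The sketch below is an illustration, not the construction of [26]: it uses a rational stand-in for $\gamma$ (an assumption made so that all gaps are computed exactly), which is harmless as long as $N$ is smaller than the denominator, so that the points $\{n\gamma\}$ are distinct.

```python
from fractions import Fraction

def three_gap_data(gamma, N):
    """Return (alpha, beta, gaps) for the sorted fractional parts of n*gamma, n = 1..N.

    alpha = eta_1, beta = 1 - eta_N, and `gaps` is the set of distinct
    spacings eta_{n+1} - eta_n, all computed exactly.
    """
    points = sorted((n * gamma) % 1 for n in range(1, N + 1))
    alpha = points[0]
    beta = 1 - points[-1]
    gaps = {b - a for a, b in zip(points, points[1:])}
    return alpha, beta, gaps
```

With `gamma = Fraction(987, 1597)`, a convergent of the golden mean, every spacing returned for $N < 1597$ lies in the set $\{\alpha, \beta, \alpha+\beta\}$.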
It is thus natural to write
\[ g_N(\gamma) = N^{b-1}\sum_{n=1}^{N}(N\xi_n)^{-b}. \tag{3.22} \]
Since the gaps between the $N\eta_n$ are bounded from below by a constant, we have $c'n \le N\xi_n \le C'n$ for suitable constants $c', C' > 0$. Therefore (and provided $b > 1$), given any error threshold $\epsilon > 0$ we find an $M_\epsilon$ so that
\[ \limsup_{N\to\infty}\sum_{n=M_\epsilon}^{N}(N\xi_n)^{-b} < \epsilon. \tag{3.23} \]
Hence
\[ g_N(\gamma) = N^{b-1}\sum_{n=1}^{M_\epsilon-1}(N\xi_n)^{-b} + O(\epsilon N^{b-1}). \tag{3.24} \]
This means $g_N(\gamma)$ is of order $N^{b-1}$, and furthermore arbitrarily well approximable by a finite number of terms. Recall the $N\xi_n$, $n = 1, \dots, M_\epsilon$, are bounded from above and below and have an explicit expression in terms of the continued fraction approximants of $\gamma$.

One important consequence of these considerations is that $\sqrt{y}\,g(y;\gamma)$ has no good asymptotics. The value remains bounded but fluctuates as $y\to 0$. Nevertheless, as we will see in the next section, this value is governed by a definite probability law.
3.3. The regularized dimension is a random variable in a probability distribution. We have seen in the previous subsection that the asymptotic $y\to 0$ behaviour of the function
\[
g(y;\gamma) = \sqrt{y}\sum_{m\in\mathbb{Z}-\{0\}}\frac{e^{-\pi m^2 y}}{\sin^2(\pi m\gamma)}
\tag{3.25}
\]
is determined by the continued fraction expansion of $\gamma$. We will here refine our analysis by exploiting the dynamical properties of the geodesic flow on the modular surface. The connection between continued fraction dynamics and geodesic flow is non-trivial but well understood, cf. [28, 29]. To explain the strategy, note that
\[
\tilde g(y;\gamma) = \sqrt{y}\sum_{(m,n)\in\mathbb{Z}^2-\{0\}}\frac{e^{-\pi y m^2}}{\pi^2(m\gamma+n)^2}
\tag{3.26}
\]
has the same asymptotic behaviour as $g(y;\gamma)$, up to an error of order $O(1)$, i.e.,
\[ g(y;\gamma) = \tilde g(y;\gamma) + O(1), \tag{3.27} \]
uniformly for all $\gamma$. (To prove this use the identity
\[ \frac{1}{\sin^2(\pi m\gamma)} = \frac{1}{\pi^2}\sum_{n=-\infty}^{+\infty}\frac{1}{(m\gamma+n)^2} \]
and add and subtract the $(m = 0, n\neq 0)$ terms by hand.) The main idea is now to construct a certain modular function $F(M)$ on $SL(2,\mathbb{Z})\backslash SL(2,\mathbb{R})$, such that
\[ \sqrt{y}\,\tilde g(y;\gamma) = F(M(t)), \qquad t = -\log y\to\infty, \tag{3.28} \]
where $M(t)\in SL(2,\mathbb{R})$ is evaluated along the geodesic
\[
M(t) = \begin{pmatrix} 1 & \gamma\\ 0 & 1\end{pmatrix}\begin{pmatrix} e^{-t/2} & 0\\ 0 & e^{t/2}\end{pmatrix}, \qquad t\ge 0.
\tag{3.29}
\]
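The partial-fraction identity used to pass from (3.25) to (3.26) can be verified numerically for non-integer arguments. The sketch below is illustrative: it truncates the lattice sum symmetrically at an assumed cutoff `n_max`, so the two sides agree only up to a tail of size roughly $2/(\pi^2 n_{\max})$.

```python
import math

def inverse_sin_squared(x):
    """Left-hand side of the identity: 1 / sin^2(pi x), for non-integer x."""
    return 1.0 / math.sin(math.pi * x) ** 2

def lattice_sum(x, n_max=100000):
    """(1/pi^2) * sum_{n=-n_max}^{n_max} 1/(x+n)^2.

    The neglected tail is about 2 / (pi^2 * n_max).
    """
    total = sum(1.0 / (x + n) ** 2 for n in range(-n_max, n_max + 1))
    return total / math.pi ** 2
```

Here `x` plays the role of $m\gamma$; the identity fails only at integers, which is exactly where the terms with $\|m\gamma\| = 0$ are excluded from the sums.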
The asymptotics of $\sqrt{y}\,\tilde g(y;\gamma)$ is now entirely determined by the geometric distribution of the geodesic associated with a particular value of $\gamma$. For example:

(a) If $\gamma$ is a quadratic irrational, then the geodesic $M(t)$ is asymptotic to a closed geodesic with period $T_\gamma$. Hence
\[ e^{-t/2}\,\tilde g(e^{-t};\gamma) \sim \phi(t), \tag{3.30} \]
where $\phi(t)$ is a bounded periodic function with period $T_\gamma$.

(b) If $\gamma$ is badly approximable by rationals (i.e., Diophantine of bounded type), then the geodesic $M(t)$ is asymptotic to a geodesic which never leaves a bounded set in $SL(2,\mathbb{Z})\backslash SL(2,\mathbb{R})$. Hence $e^{-t/2}\,\tilde g(e^{-t};\gamma)$ is bounded for all $t$. Our analysis will show that in general $F(M)$ is a non-constant function, hence $e^{-t/2}\,\tilde g(e^{-t};\gamma)$ does not converge to a constant.
(c) For almost all $\gamma$ (with respect to Lebesgue measure) the corresponding geodesic $M(t)$ becomes equidistributed in $SL(2,\mathbb{Z})\backslash SL(2,\mathbb{R})$, a consequence of the ergodicity of the geodesic flow. Hence the fluctuations of $e^{-t/2}\,\tilde g(e^{-t};\gamma)$ on some long stretch $[0, T]$ ($T\to\infty$) have the same probability distribution as the function $F(M)$, where $M$ varies over $SL(2,\mathbb{Z})\backslash SL(2,\mathbb{R})$. That is,
\[
\frac{1}{T}\int_0^T \delta\!\left(X - e^{-t/2}g(e^{-t};\gamma)\right)dt \;\longrightarrow\; P(X) = \int_{SL(2,\mathbb{Z})\backslash SL(2,\mathbb{R})}\delta(X - F(M))\,dM.
\tag{3.31}
\]
Interestingly, the limit distribution has an algebraic tail, $P(X)\sim AX^{-3/2}$, and hence no first moment. See Theorem 1 in the Appendix for details.

4. Divergence for Finite τ

In the previous section we discussed the behavior of the torus partition sum (2.4) in the limit $\tau\to 0$. In this section we will see that for some $\gamma$, the sum over twisted sectors (and thus (3.1)) diverges for finite $\tau$. This point has been mentioned briefly in [30]. The dangerous factor in the partition sum (2.4) is the function
\[ \vartheta\begin{bmatrix}\frac12+s\gamma\\ \frac12\end{bmatrix}(0|\tau) \tag{4.1} \]
which appears in the denominator; it becomes very small when $\|s\gamma\| \ll 1$. As we have seen, this is due to the fact that the corresponding states are nearly delocalized. Consider
\[ F(x) := \vartheta\begin{bmatrix}\frac12+x\\ \frac12\end{bmatrix}(0|\tau). \tag{4.2} \]
It is easy to check that $F(x+1) = F(x)$ and $e^{-i\pi x}F(x)$ is an odd function of $x$ given at small $x$ by
\[
e^{-i\pi x}F(x) = -(2\pi\tau\eta^3)\,x + (2\pi^3\tau^3E_2 - 12\pi^2 i\tau^2)\,\eta^3\,\frac{x^3}{3!} + \cdots.
\tag{4.3}
\]
The convergence of the sum over twisted sectors of (2.4) for fixed $\tau$ is controlled by
\[
\sum_{s=1}^{\infty} e^{-\frac{\pi R^2}{\alpha'}s^2\frac{|\tau|^2}{\tau_2}}\;\frac{1}{\|s\gamma\|^2}.
\tag{4.4}
\]
As discussed in the previous section, for $\gamma$ of Diophantine type $(K,\sigma)$ the sum (4.4) converges. On the other hand, for certain Liouville numbers the sum actually diverges. To show this, consider the subsum given by $s = q_n$, where $q_n$ is the denominator of the convergents of $\gamma$, (3.10). Then [15]:
\[ \frac{1}{q_n+q_{n+1}} < \|q_n\gamma\| < \frac{1}{q_{n+1}}. \tag{4.5} \]
Thus, (4.4) is bounded from below by
\[ \sum_{n} q_{n+1}^2\, e^{-\kappa q_n^2}, \tag{4.6} \]
where $\kappa$ is some constant. Now if
\[ 2\log q_{n+1} - \kappa q_n^2 = O(1) \tag{4.7} \]
or is even bounded below by $-\log n$ then the series (4.6) diverges. We can thus construct numbers for which the series (4.6) diverges by considering $\gamma$ of the form
\[ \gamma = \sum_{n=1}^{\infty}\frac{1}{10^{f(n)}} \tag{4.8} \]
for certain rapidly increasing functions $f(n)$. Indeed we may take the subsum with $s = 10^{f(n)}$. Then
\[ \left|\gamma - \sum_{j=1}^{n}\frac{1}{10^{f(j)}}\right| < \frac{2}{10^{f(n+1)}}. \tag{4.9} \]
Now consider any function $f(n)$ that satisfies an equation of the form
\[ f(n+1) = f(n) + \kappa\,10^{2f(n)} + g(n), \tag{4.10} \]
where $g(n)$ is, say, any positive function of $n$. Then, using $q_n = 10^{f(n)}$ and $q_{n+1}^2 < \|q_n\gamma\|^{-2}$, we see that for such functions $f(n)$ the series (4.4) diverges. Thus, there are continuum many transcendental numbers for which the sum diverges.
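The mechanism behind (4.8) and (4.9) can be illustrated with exact rational arithmetic. A function $f$ obeying (4.10) grows far too quickly to tabulate, so the sketch below uses $f(n) = n!$ (Liouville's constant) as a stand-in; this choice, the truncation of the series after six terms, and the function names are all assumptions made for illustration. It shows how small $\|10^{f(n)}\gamma\|$ becomes, which is what makes the factor $1/\|s\gamma\|^2$ in (4.4) large.

```python
from fractions import Fraction
from math import factorial

def liouville_partial_sum(f, n_terms):
    """Exact partial sum sum_{j=1}^{n_terms} 10^(-f(j)) of the series (4.8)."""
    return sum(Fraction(1, 10 ** f(j)) for j in range(1, n_terms + 1))

def dist_to_nearest_int(x):
    """||x|| for an exact non-negative rational x."""
    frac = x - (x.numerator // x.denominator)
    return min(frac, 1 - frac)
```

For `gamma = liouville_partial_sum(factorial, 6)` and `s = 10 ** factorial(n)` with `n` below 6, the quantity `dist_to_nearest_int(s * gamma)` is smaller than $2\cdot 10^{\,n! - (n+1)!}$, the bound that follows from (4.9) after multiplying by $s$.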
5. An Alternative Regularization Using Continued Fractions

In the previous sections we discussed the regularized number of localized states given by the partition sum (1.4) in the limit $\tau\to 0$. We saw that there are some irrational numbers for which the sum over localized states diverges even for finite $\tau$. This divergence is due to the effect of “nearly untwisted strings” with $\|w\gamma\|$ small. It is natural to ask whether one can regularize this divergence in some other way, consistent with conformal symmetry and modular invariance. Replacing $\mathbb{C}$ by, say, a sphere of finite radius breaks conformal symmetry, and introduces subtle questions of orders of limits. Similarly, putting a cutoff on the sum over twist sectors breaks modular invariance.

One simple way to regulate the volume divergence is to use the continued fraction expansion of $\gamma$,
\[ \gamma = [0, a_1, a_2, \dots]. \tag{5.1} \]
Cutting off the continued fraction at a finite place leads to the rational convergents:
\[ \gamma^{(n)} := [0, a_1, a_2, \dots, a_n] := \frac{p_n}{q_n}. \tag{5.2} \]
For the rational twists $\gamma^{(n)}$ we have a clear separation of localized from delocalized states and the regularized dimension of the space of localized states is (2.10)
\[ \dim\mathcal{H}_{\rm loc}(\gamma^{(n)}) = \frac{1}{48}\left(q_n - \frac{1}{q_n}\right). \tag{5.3} \]
Similarly, other correlation functions in the orbifold CFT are well-defined for finite $n$. One can formally think of the original orbifold with twist parameter $\gamma$ as the limit $n\to\infty$ of (5.2). Of course, $q_n\to\infty$ as $n\to\infty$, but it does so at different rates for different $\gamma$’s; the rate depends sensitively on $\gamma$ through the Lyapunov exponent (3.13). The exponential growth of $q_n$ suggests that we should define an “entropy of delocalization” by considering the limiting behavior of $S_n(\gamma) = \log\dim\mathcal{H}_{\rm loc}(\gamma^{(n)})$. With this measure of delocalization we have
\[
\frac{\log\dim\mathcal{H}_{\rm loc}(\gamma_1)}{\log\dim\mathcal{H}_{\rm loc}(\gamma_2)} = \lim_{n\to\infty}\frac{S_n(\gamma_1)}{S_n(\gamma_2)} = \frac{\lambda(\gamma_1)}{\lambda(\gamma_2)}.
\tag{5.4}
\]
Some interesting facts about $\lambda(\gamma)$, which can be found in [31], are the following. First, for almost every $\gamma$, $\lambda(\gamma)$ is given by Khinchin’s constant
\[ \lambda_0 = \frac{\pi^2}{6\log 2}. \tag{5.5} \]
Moreover, the range of $\lambda(\gamma)$ as $\gamma$ runs over irrational numbers in $(0, 1)$ is
\[ \left[\,2\log\frac{1+\sqrt5}{2},\ \infty\right). \tag{5.6} \]
Thus, the entropy of delocalization is a nontrivial function of the twist parameter $\gamma$ of the Melvin model.

Remarks.

1. One very interesting property of the Lyapunov exponent $\lambda(\gamma)$ is that it is invariant under $SL(2,\mathbb{Z})$ acting on $\gamma$ via fractional linear transformations. This is easily seen since $\gamma\to\gamma+1$ obviously does not change the exponent while, for $\gamma = [0, a_1, a_2, \dots]$ we have $1/\gamma = [a_1, a_2, a_3, \dots]$, so $\{1/\gamma\} = [0, a_2, a_3, \dots]$.

2. The Lyapunov exponent of $\gamma$ is indeed a Lyapunov exponent for a dynamical system, namely that defined by the Gauss map $T(x) = \{1/x\}$, which shifts the entries of the continued fraction expansion. In this context it is quite amusing to note that a naive analysis of the GLSM description for the Melvin model discussed in [11] appears to lead to a connection between 2D RG flow and the Gauss map. If we choose a GLSM with gauge group $\mathbb{R}$ with gauge group action $(X_1, X_2, P)\to(e^{i\gamma\theta}X_1, e^{-i\theta}X_2, P + i\theta)$, then the standard analysis of the D-term equation
\[ \gamma|X_1|^2 - |X_2|^2 = p_1 \tag{5.7} \]
suggests that the Melvin geometry with twist parameter $\gamma$ and radius $R$ flows to that with twist parameter $\{1/\gamma\}$ and radius $R/\gamma$. Thus, at least as long as $R$ remains small, the flow from the UV to the IR acts as a Gauss map on $\gamma$.

3. Some other interesting relations of Lyapunov exponents to areas of physics are explored in [32].

4. The approach of this section has the advantage that it can be easily extended to other twisted tori geometries $\mathbb{C}^d\times\mathbb{R}^d/\Gamma$, where $\Gamma$ acts by linear transformations in $\mathbb{C}^d$ and by translations in $\mathbb{R}^d$.
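The statement in Remarks 1 and 2 that the Gauss map $T(x) = \{1/x\}$ shifts the continued fraction entries can be checked exactly on finite continued fractions. The following is a small sketch with hypothetical helper names, using rational numbers so that no rounding enters.

```python
from fractions import Fraction

def from_continued_fraction(quotients):
    """Exact value of [0; a_1, ..., a_n] for positive integers a_i."""
    value = Fraction(0)
    for a in reversed(quotients):
        value = 1 / (a + value)
    return value

def gauss_map(x):
    """T(x) = {1/x}, the fractional part of 1/x, for 0 < x < 1."""
    inverse = 1 / x
    return inverse - (inverse.numerator // inverse.denominator)
```

Applying `gauss_map` to `from_continued_fraction([3, 1, 4, 1, 5])` returns `from_continued_fraction([1, 4, 1, 5])`: the leading partial quotient is dropped and the rest are shifted by one place.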
6. Melvin Models in String Theory

So far we have been focusing on the conformal field theory of the Melvin orbifold, and the divergences associated with the sums over twisted sectors in defining partition functions in this CFT. In string theory we have the further complication that we must integrate amplitudes over moduli space. We have seen that for certain irrational numbers $\gamma$, the partition function of the twisted (NS,NS) sectors, which contain spacetime bosons, is divergent for fixed $\tau$. In type II string theory, what enters into the torus amplitude (the one loop contribution to the cosmological constant) is the difference of spacetime bosons and fermions,
\[
\sum_{(s,t)\neq(0,0)} \sqrt{\frac{R^2}{\alpha'\tau_2}}\; e^{-\frac{\pi R^2|t+s\tau|^2}{\alpha'\tau_2}}
\left|\,\sum_{\epsilon_1,\epsilon_2}\eta_{\epsilon_1,\epsilon_2}
\left(\frac{\vartheta\begin{bmatrix}\epsilon_1\\ \epsilon_2\end{bmatrix}(0|\tau)}{\eta}\right)^{3}
\frac{\vartheta\begin{bmatrix}\epsilon_1+s\gamma\\ \epsilon_2+t\gamma\end{bmatrix}(0|\tau)}{\vartheta\begin{bmatrix}\frac12+s\gamma\\ \frac12+t\gamma\end{bmatrix}(0|\tau)}\,\right|^2,
\tag{6.1}
\]
where $\eta_{\epsilon_1,\epsilon_2} = \pm 1$ are described in [25]. It is still the case that the denominator of (6.1) is small in sectors with $t = 0$ and small $\|s\gamma\|$ (4.1). However, the numerator goes to zero as well, due to the standard Riemann identity, or more physically because when $\|s\gamma\|\to 0$, one is approaching the untwisted contribution, which vanishes due to the standard supersymmetric cancellations. However, when $t\gamma$ is a good approximation to an odd integer the numerator does not cancel the denominator.⁷ In this case there are potential divergences such as those discussed in Sect. 4.

In addition to this, for other one loop amplitudes we expect that the sum over twisted sectors $s$ will again be problematic. This is compounded by the fact that we must integrate over moduli space, since the integral over $\tau_2$ has the form
\[ \int^{\infty} d\tau_2\,\tau_2^{\nu}\, e^{-s^2\tau_2} \sim \frac{1}{s^{2(\nu+1)}}. \tag{6.2} \]
Thus the exponential suppression in $s$ is replaced by power law suppression, which can be easily overwhelmed by $1/\|s\gamma\|^2$ even for algebraic irrationals $\gamma$. For this reason, it is far from clear to us that Melvin spacetimes with irrational values of $\gamma$ are well-defined string backgrounds. This issue merits further investigation.

7. Discussion

The discussion of the previous sections suggests that physics is not continuous as a function of $\gamma$. This may seem physically unreasonable. How can continuous changes of a magnetic field lead to discontinuous conformal field theory or string theory amplitudes? There is a well-known precedent for this kind of behavior, namely the Azbel-Hofstadter model of an electron in a magnetic field in the presence of a periodic potential [33]. The spectrum of the Schrödinger operator is a sensitive function of the magnetic field, and depends on its arithmetic nature.

It has been claimed that Melvin models provide a smooth interpolation between IIB and 0A string theory and this has been used to argue that the endpoint of 0A tachyon condensation is the IIB theory [34, 35, 11].
It is possible that the interpolating string theories with irrational γ do not exist, thus calling these claims into question.

7 We thank the referee for pointing out this subtlety.
A natural class of questions which arise in the context of our considerations have to do with asking “how many” γ ’s or “how often” a γ leads to a divergent model, or to a model with fixed Lyapunov exponent, and so forth. These are subtle and difficult questions. They have been the subject of much research. To quote one nice result [31], the set of γ such that λ(γ ) takes a value larger than the Khinchin constant λ0 (5.5) has positive Hausdorff dimension. Further discussion of such matters would take us into the subject of measure and category [36], so this seems a good place to stop. Acknowledgements. We would like to thank J. Harvey and E. Martinec for collaboration on related matters and for useful discussions. We would like to thank J. Lagarias, J. Maldacena, and B. Pioline for useful discussions and correspondence. We also thank the LPTHE at Jussieu, Paris, for hospitality while this paper was written. The work of GM is supported in part by DOE grant DE-FG02-96ER40949. That of DK is supported in part by DOE grant DE-FG02-90ER40560. JM was supported by an EPSRC Advanced Research Fellowship and the EC Research Training Network (Mathematical Aspects of Quantum Chaos) HPRN-CT-2000-00103.
Appendix A. Detailed Proof of (3.31)

A.1. Sums over lattice points. For any M ∈ SL(2, R) consider the sum

\[ F(M) = \sum_{m \in \mathbb{Z}^2 - \{0\}} f(mM), \tag{A.1} \]

where m runs over all non-zero integer row vectors, and f is a positive function on $\mathbb{R}^2$. Since the modular group SL(2, Z) leaves the lattice $\mathbb{Z}^2$ and the origin 0 invariant, we have immediately

\[ F(KM) = F(M) \tag{A.2} \]

for every K ∈ SL(2, Z). F may thus be viewed as a function on the homogeneous space SL(2, Z)\SL(2, R). There is a simple formula for the average of F with respect to Haar measure dM, normalized as a probability measure so that

\[ \int_{\mathrm{SL}(2,\mathbb{Z})\backslash\mathrm{SL}(2,\mathbb{R})} dM = 1. \tag{A.3} \]

We then have

\[ \int_{\mathrm{SL}(2,\mathbb{Z})\backslash\mathrm{SL}(2,\mathbb{R})} F(M)\, dM = \int_{\mathbb{R}^2} f(x)\, dx. \tag{A.4} \]
This is a special case of Siegel’s weight formula for SL(d, R); for a proof see e.g. Theorem 3.15 of [37]. The function $\tilde g(y; \gamma)$ is connected to an automorphic function F of the above form: choose

\[ f(x_1, x_2) = \frac{e^{-\pi x_1^2}}{\pi^2 x_2^2}. \tag{A.5} \]

Then, at the point

\[ M = \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix} \begin{pmatrix} y^{1/2} & 0 \\ 0 & y^{-1/2} \end{pmatrix}, \tag{A.6} \]

we have

\[ F(M) = \sqrt{y}\, \tilde g(y; x). \tag{A.7} \]
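The space SL(2, Z)\SL(2, R) can be identified with the unit tangent bundle of the modular surface by means of the Iwasawa decomposition

\[ M = \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix} \begin{pmatrix} y^{1/2} & 0 \\ 0 & y^{-1/2} \end{pmatrix} \begin{pmatrix} \cos(\theta/2) & \sin(\theta/2) \\ -\sin(\theta/2) & \cos(\theta/2) \end{pmatrix}; \tag{A.8} \]

z = x + iy are the standard upper half plane coordinates and the angle θ ∈ [0, 2π) describes the direction of the unit tangent vector at z. This identification induces the following action of a matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2,\mathbb{R})$ on a point M = (z, θ):

\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} (z, \theta) = \left( \frac{az + b}{cz + d},\ \theta - 2\arg(cz + d) \right). \tag{A.9} \]

A fundamental domain of SL(2, Z) in these coordinates is

\[ \mathcal{F} = \{ (z, \theta) : |z| > 1,\ |x| < 1/2,\ \theta \in [0, 2\pi) \}. \tag{A.10} \]

The normalized Haar measure reads

\[ dM = \frac{3}{2\pi^2}\, \frac{dx\, dy\, d\theta}{y^2}. \tag{A.11} \]

The geodesic flow on SL(2, Z)\SL(2, R) is represented by the right translation

\[ M(0) \mapsto M(t) = M(0)\, \Phi^t, \qquad \Phi^t = \begin{pmatrix} e^{-t/2} & 0 \\ 0 & e^{t/2} \end{pmatrix}. \tag{A.12} \]

The values of $\tilde g(y; x)$ are thus those of F(M) evaluated along a geodesic M = M(t) with initial condition $M(0) = \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}$.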
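The normalization of the Haar measure (A.11) over the fundamental domain (A.10) can be checked numerically. The sketch below is an illustration rather than part of the proof: the function name and the grid size are arbitrary choices, the y-integral is carried out in closed form, and the remaining x-integral is approximated with a midpoint rule.

```python
from math import pi, sqrt

def haar_total_mass(n_x=20000):
    """Midpoint-rule value of (3 / (2 pi^2)) * int dx dy dtheta / y^2 over the domain (A.10)."""
    dx = 1.0 / n_x
    total = 0.0
    for i in range(n_x):
        x = -0.5 + (i + 0.5) * dx
        # For fixed x the domain is y > sqrt(1 - x^2), and
        # int_{sqrt(1 - x^2)}^{infinity} dy / y^2 = 1 / sqrt(1 - x^2).
        total += dx / sqrt(1.0 - x * x)
    # The theta integral over [0, 2 pi) contributes a factor 2 pi.
    return 3.0 / (2.0 * pi ** 2) * 2.0 * pi * total

print(haar_total_mass())   # should be very close to 1
```

The x-integral equals 2 arcsin(1/2) = π/3, the hyperbolic area of the modular surface, which is why the prefactor 3/(2π²) yields a probability measure.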
A.2. Singularities. To analyze the singularities of F(M) we split

\[ F(M) = F_0(M) + F_1(M), \tag{A.13} \]

where $F_0$, $F_1$ are defined in the same way as F above with f replaced with

\[ f_0(x_1, x_2) = \frac{e^{-\pi x_1^2}}{\pi^2 x_2^2}\, \chi_0(x_2) \tag{A.14} \]

and

\[ f_1(x_1, x_2) = \frac{e^{-\pi x_1^2}}{\pi^2 x_2^2}\, \chi_1(x_2), \tag{A.15} \]

respectively. $\chi_0$ and $\chi_1$ are continuous functions with values in [0, 1] such that $\chi_0(x) + \chi_1(x) = 1$, and

\[ \chi_1(x) = \begin{cases} 1, & x \in [-\epsilon, \epsilon] \\ 0, & x \notin [-(1+\epsilon)\epsilon,\ (1+\epsilon)\epsilon] \end{cases} \tag{A.16} \]

for some fixed ε > 0. (The extra (1 + ε) factor is used to accommodate the continuity of $\chi_1$; we think of $\chi_0$ and $\chi_1$ as smoothed characteristic functions.) By construction, $F_0$ is a continuous function on all of SL(2, Z)\SL(2, R). This manifold has one cusp at y → ∞. The asymptotic behaviour is here

\[ F_0(M) \sim \sum_{n \neq 0} f_0\!\left( (0, n y^{-1/2}) \begin{pmatrix} \cos(\theta/2) & \sin(\theta/2) \\ -\sin(\theta/2) & \cos(\theta/2) \end{pmatrix} \right) \sim C_0(\theta)\, y^{1/2} \tag{A.17} \]

as y → ∞, where

\[ C_0(\theta) = \int_{-\infty}^{\infty} f_0\!\left( (0, r) \begin{pmatrix} \cos(\theta/2) & \sin(\theta/2) \\ -\sin(\theta/2) & \cos(\theta/2) \end{pmatrix} \right) dr = \frac{1}{\pi^2} \int_{-\infty}^{\infty} \frac{e^{-\pi [r \sin(\theta/2)]^2}}{[r \cos(\theta/2)]^2}\, \chi_0(r \cos(\theta/2))\, dr. \tag{A.18} \]
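Note that $C_0(\theta) = O(1)$ for all θ. We re-write $F_1$ as a sum over primitive lattice points p,

\[ F_1(M) = \sum_{l=1}^{\infty} \sum_{p} f_1(l\, p M). \tag{A.19} \]

For every primitive lattice point p there is a K ∈ SL(2, Z) such that p = (0, 1)K. The subgroup $\Gamma_\infty \subset \mathrm{SL}(2,\mathbb{Z})$ of elements K such that (0, 1)K = (0, 1) is

\[ \Gamma_\infty = \left\{ \begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix} : n \in \mathbb{Z} \right\}, \tag{A.20} \]

and hence there is a one-to-one correspondence between primitive lattice points and the coset $\Gamma_\infty \backslash \mathrm{SL}(2,\mathbb{Z})$. We have therefore

\[ F_1(M) = \sum_{l=1}^{\infty} \sum_{K \in \Gamma_\infty \backslash \mathrm{SL}(2,\mathbb{Z})} f_1((0, l) K M). \tag{A.21} \]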
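The correspondence between primitive lattice points and cosets used in (A.19) to (A.21) can be made concrete: given a primitive point (c, d), the extended Euclidean algorithm produces a matrix K in SL(2, Z) whose bottom row is (c, d), so that (0, 1)K = (c, d). The sketch below uses hypothetical helper names; the top row it returns is only one representative, since left multiplication by an element of the subgroup (A.20) changes the top row but not the bottom one.

```python
from math import gcd

def ext_gcd(a, b):
    """Return (g, u, v) with u*a + v*b == g, where |g| == gcd(a, b)."""
    if b == 0:
        return (a, 1, 0)
    g, u, v = ext_gcd(b, a % b)
    return (g, v, u - (a // b) * v)

def matrix_for_primitive(c, d):
    """Return K = ((a, b), (c, d)) in SL(2, Z), so that (0, 1) K = (c, d)."""
    if gcd(c, d) != 1:
        raise ValueError("(c, d) is not a primitive lattice point")
    g, u, v = ext_gcd(c, d)        # u*c + v*d == g with g == +1 or -1
    u, v = u * g, v * g            # now u*c + v*d == 1
    a, b = v, -u                   # det K = a*d - b*c = v*d + u*c = 1
    return ((a, b), (c, d))

for point in [(0, 1), (1, 0), (3, 5), (-7, 4), (12, -35)]:
    (a, b), (c, d) = matrix_for_primitive(*point)
    print(point, ((a, b), (c, d)), "det =", a * d - b * c)
```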
Due to the rapid decay of $f_1$ this is essentially a finite sum. To understand the singularities of $F_1$ consider the term corresponding to K = 1,

\[ \sum_{l=1}^{\infty} f_1((0, l) M) = \sum_{l=1}^{\infty} \frac{y\, e^{-\pi l^2 y^{-1} \sin(\theta/2)^2}}{\pi^2 l^2 \cos(\theta/2)^2}\, \chi_1(l y^{-1/2} \cos(\theta/2)). \tag{A.22} \]

The main singularity of this function is at θ = π, and we note that for θ → π,

\[ \sum_{l=1}^{\infty} f_1((0, l) M) \sim \frac{4y}{\pi^2 (\theta - \pi)^2} \sum_{l=1}^{\infty} \frac{e^{-\pi l^2 y^{-1}}}{l^2}. \tag{A.23} \]
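The singularities of $F_1(M)$ are the images of the two-dimensional subspace {(z, θ) : θ = π} under the action of $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma_\infty \backslash \mathrm{SL}(2,\mathbb{Z})$,

\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \{ (z, \theta) : \theta = \pi \} = \{ (z, \theta) : \theta = \pi - 2\arg(cz + d) \}, \tag{A.24} \]

where (c, d) runs over all primitive lattice points in $\mathbb{Z}^2 - \{0\}$.

A.3. Limit theorems. Our main application of the above construction is the following.

Theorem 1. There is a probability density P(X) on $\mathbb{R}_+$ with the following properties:
1. There is a set of x of full measure such that, for any bounded continuous function $\phi : \mathbb{R}_+ \to \mathbb{R}$,

\[ \lim_{T \to \infty} \frac{1}{T} \int_0^T \phi\big( e^{-t/2} g(e^{-t}; x) \big)\, dt = \int_0^\infty \phi(X) P(X)\, dX; \tag{A.25} \]

2. For any bounded continuous function $\phi : \mathbb{R}_+ \to \mathbb{R}$,

\[ \lim_{y \to 0} \int_0^1 \phi\big( \sqrt{y}\, g(y; x) \big)\, dx = \int_0^\infty \phi(X) P(X)\, dX; \tag{A.26} \]

3. As X → ∞,

\[ P(X) \sim A X^{-3/2}, \tag{A.27} \]

with

\[ A = \frac{3}{2\pi^3} \int_0^\infty \left( \sum_{l=1}^{\infty} \frac{e^{-\pi l^2 y^{-1}}}{l^2} \right)^{1/2} \frac{dy}{y^{3/2}}. \tag{A.28} \]

Note that the limiting distribution does not possess a first moment,

\[ \int_0^\infty X P(X)\, dX = \infty. \tag{A.29} \]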
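Thus, there is no “average” value of $g_{cl}$ for Melvin models. To prove the above limit theorem, we note that the ergodicity of the geodesic flow and the equidistribution of long closed horocycles on SL(2, Z)\SL(2, R) imply the following statements, cf. [37].

Theorem 2.
1. There is a set of x of full measure such that, for any bounded continuous function $G : \mathrm{SL}(2,\mathbb{Z})\backslash\mathrm{SL}(2,\mathbb{R}) \to \mathbb{C}$,

\[ \lim_{T \to \infty} \frac{1}{T} \int_0^T G\!\left( \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix} \Phi^t \right) dt = \int_{\mathrm{SL}(2,\mathbb{Z})\backslash\mathrm{SL}(2,\mathbb{R})} G(M)\, dM; \tag{A.30} \]

2. For any bounded continuous function $G : \mathrm{SL}(2,\mathbb{Z})\backslash\mathrm{SL}(2,\mathbb{R}) \to \mathbb{C}$,

\[ \lim_{y \to 0} \int_0^1 G\!\left( \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix} \begin{pmatrix} y^{1/2} & 0 \\ 0 & y^{-1/2} \end{pmatrix} \right) dx = \int_{\mathrm{SL}(2,\mathbb{Z})\backslash\mathrm{SL}(2,\mathbb{R})} G(M)\, dM. \tag{A.31} \]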
Now take any compactly supported continuous function $\phi : \mathbb{R}_+ \to \mathbb{R}$ and set G(M) = φ(F(M)). Then for ε > 0 small enough

\[ G(M) = \phi(F_0(M)), \tag{A.32} \]

and hence G(M) is bounded continuous, in view of the above singularity analysis. Theorem 2 therefore implies the first two statements of Theorem 1, for compactly supported continuous test functions φ, with

\[ P(X) = \int_{\mathrm{SL}(2,\mathbb{Z})\backslash\mathrm{SL}(2,\mathbb{R})} \delta(X - F(M))\, dM. \tag{A.33} \]

The extension to bounded continuous φ follows from a standard probabilistic argument.
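A.4. Tail estimates. Consider first the large X asymptotics of

\[ \begin{aligned} P_1(X) &= \int_{\mathrm{SL}(2,\mathbb{Z})\backslash\mathrm{SL}(2,\mathbb{R})} \delta(X - F_1(M))\, dM \\ &= \int_{\mathrm{SL}(2,\mathbb{Z})\backslash\mathrm{SL}(2,\mathbb{R})} \delta\Big( X - \sum_{K \in \Gamma_\infty \backslash \mathrm{SL}(2,\mathbb{Z})} \sum_{l=1}^{\infty} f_1((0, l) K M) \Big)\, dM \\ &\sim \int_{\mathrm{SL}(2,\mathbb{Z})\backslash U_\sigma} \delta\Big( X - \sum_{K \in \Gamma_\infty \backslash \mathrm{SL}(2,\mathbb{Z})} \sum_{l=1}^{\infty} f_1((0, l) K M) \Big)\, dM, \end{aligned} \tag{A.34} \]

where $U_\sigma = \mathrm{SL}(2,\mathbb{Z}) \{ (z, \theta) : \theta \in \pi + [-\sigma, \sigma] \}$ is a small neighbourhood of the singular set (A.24) on which $F_1(M)$ is large. Since the K sum is essentially finite, we may choose σ > 0 small enough so that the overlap of the neighbourhoods $K \{ (z, \theta) : \theta \in \pi + [-\sigma, \sigma] \}$ for different K is negligible. Hence for X → ∞,

\[ \begin{aligned} P_1(X) &\sim \sum_{K \in \Gamma_\infty \backslash \mathrm{SL}(2,\mathbb{Z})} \int_{\mathrm{SL}(2,\mathbb{Z})\backslash U_\sigma} \delta\Big( X - \sum_{l=1}^{\infty} f_1((0, l) K M) \Big)\, dM \\ &= \int_{\Gamma_\infty \backslash U_\sigma} \delta\Big( X - \sum_{l=1}^{\infty} f_1((0, l) M) \Big)\, dM \\ &= \frac{3}{2\pi^2} \int_0^1 \int_0^\infty \int_{\pi - \sigma}^{\pi + \sigma} \delta\Big( X - \sum_{l=1}^{\infty} f_1((0, l) M) \Big)\, \frac{dx\, dy\, d\theta}{y^2} \\ &\sim \frac{3}{2\pi^2} \int_0^\infty \int_{-\sigma}^{\sigma} \delta\Big( X - \frac{4 h(y)}{\theta^2} \Big)\, \frac{dy\, d\theta}{y^2} \\ &= \frac{3}{2\pi^2 X^{3/2}} \int_0^\infty h(y)^{1/2}\, \frac{dy}{y^2}, \end{aligned} \tag{A.35} \]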
where

\[ h(y) = \frac{y}{\pi^2} \sum_{l=1}^{\infty} \frac{e^{-\pi l^2 y^{-1}}}{l^2}. \tag{A.36} \]
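The constant A of (A.28) is a convergent one-dimensional integral and can be estimated numerically. The sketch below is a rough illustration, not a computation from the paper: the substitution y = 1/s², the truncation of the l-sum, the cutoff in s and the midpoint rule are all choices made here. With that substitution dy/y^{3/2} becomes 2 ds, so that A = (3/π³) ∫_0^∞ (Σ_l e^{-π l² s²}/l²)^{1/2} ds, and the integrand decays like a Gaussian.

```python
from math import exp, pi, sqrt

def theta_sum(s, l_max=200):
    """sum_{l=1}^{l_max} exp(-pi l^2 s^2) / l^2 (truncation of the series in (A.28))."""
    return sum(exp(-pi * l * l * s * s) / (l * l) for l in range(1, l_max + 1))

def tail_constant(s_max=8.0, n=4000):
    """Midpoint-rule estimate of A from (A.28) after substituting y = 1 / s^2."""
    ds = s_max / n
    integral = sum(sqrt(theta_sum((i + 0.5) * ds)) for i in range(n)) * ds
    # dy / y^{3/2} = 2 ds, hence the extra factor 2
    return 3.0 / (2.0 * pi ** 3) * 2.0 * integral

print(tail_constant())
```

Keeping only the l = 1 term gives the lower bound 3√2/(2π³), and bounding the square root of the sum by the sum of square roots gives the upper bound √2/(4π), so the estimate should land between roughly 0.068 and 0.113.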
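Since $F_0(M)$ has its only singularity in the cusp y → ∞,

\[ \begin{aligned} P_0(X) &= \int_{\mathrm{SL}(2,\mathbb{Z})\backslash\mathrm{SL}(2,\mathbb{R})} \delta(X - F_0(M))\, dM \\ &\sim \frac{3}{2\pi^2} \int_0^{2\pi} \int_0^\infty \delta\big( X - C_0(\theta)\, y^{1/2} \big)\, \frac{dy\, d\theta}{y^2} \\ &= \frac{3}{\pi^2 X^3} \int_0^{2\pi} C_0(\theta)^2\, d\theta. \end{aligned} \tag{A.37} \]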
So P(X) ∼ $P_1(X)$ for large X, and the proof of Theorem 1 is complete.

References
1. Martinec, E. J.: Defects, decay, and dissipated states. http://arxiv.org/list/hep-th/0210231, 2002
2. Headrick, M., Minwalla, S., Takayanagi, T.: Closed string tachyon condensation: An overview. Class. Quant. Grav. 21, S1539–S1565 (2004)
3. Cornalba, L., Costa, M. S.: Time-dependent orbifolds and string cosmology. Fortsch. Phys. 52, 145 (2004)
4. Moore, G. W.: Les Houches lectures on strings and arithmetic. http://arxiv.org/list/hep-th/0401049, 2004
5. Dowker, F., Gauntlett, J. P., Kastor, D. A., Traschen, J. H.: Pair creation of dilaton black holes. Phys. Rev. D 49, 2909 (1994)
6. Dowker, F., Gauntlett, J. P., Giddings, S. B., Horowitz, G. T.: On pair creation of extremal black holes and Kaluza-Klein monopoles. Phys. Rev. D 50, 2662 (1994)
7. Dowker, F., Gauntlett, J. P., Gibbons, G. W., Horowitz, G. T.: The Decay of magnetic fields in Kaluza-Klein theory. Phys. Rev. D 52, 6929 (1995)
8. Dowker, F., Gauntlett, J. P., Gibbons, G. W., Horowitz, G. T.: Nucleation of P-Branes and Fundamental Strings. Phys. Rev. D 53, 7115 (1996)
9. Russo, J. G., Tseytlin, A. A.: Magnetic flux tube models in superstring theory. Nucl. Phys. B 461, 131 (1996)
10. Russo, J. G., Tseytlin, A. A.: Magnetic backgrounds and tachyonic instabilities in closed superstring theory and M-theory. Nucl. Phys. B 611, 93 (2001)
11. David, J. R., Gutperle, M., Headrick, M., Minwalla, S.: Closed string tachyon condensation on twisted circles. JHEP 0202, 041 (2002)
12. Cassels, J.: An Introduction to Diophantine Approximation. Cambridge: Cam. Univ. Press, 1957
13. Hardy, G., Wright, E.: An Introduction to the Theory of Numbers. Oxford: Oxford Univ. Press, 1979
14. Khinchin, A.: Continued Fractions. Chicago, IL: Univ. of Chicago Press, 1964
15. Schmidt, W. M.: Diophantine Approximation. LNM 785, Berlin-Heidelberg-New York: Springer-Verlag, 1980; Diophantine approximations and diophantine equations. LNM 1467, Berlin: Springer, 1991
16. See the article by Yoccoz in Itzykson, C., Luck, J.-M., Moussa, P., Waldschmidt, M. eds.: From Number Theory to Physics. Berlin-Heidelberg-New York: Springer Verlag, 1995
17. Kol, B.: On 6d *gauge* theories with irrational theta angle. JHEP 9911, 017 (1999)
18. Elitzur, S., Pioline, B., Rabinovici, E.: On the short-distance structure of irrational non-commutative gauge. JHEP 0010, 011 (2000)
19. Chan, C. S., Hashimoto, A., Verlinde, H.: Duality cascade and oblique phases in non-commutative open string theory. JHEP 0109, 034 (2001)
20. Harvey, J. A., Kutasov, D., Martinec, E. J., Moore, G.: Localized tachyons and RG flows. http://arxiv.org/list/hep-th/0111154, 2001
21. Dijkgraaf, R., Verlinde, E.: Modular Invariance And The Fusion Algebra. Nucl. Phys. Proc. Suppl. 5B, 87 (1988)
22. Harvey, J. A., Kachru, S., Moore, G. W., Silverstein, E.: Tension is dimension. JHEP 0003, 001 (2000)
23. Kutasov, D., Seiberg, N.: Number Of Degrees Of Freedom, Density Of States And Tachyons In String Theory. Nucl. Phys. B 358, 600 (1991)
24. Kutasov, D.: Some properties of (non)critical strings. http://arxiv.org/list/hep-th/9110041, 1991
25. Angelantonj, C., Dudas, E., Mourad, J.: Orientifolds of String Theory Melvin backgrounds. Nucl. Phys. B 637, 59–91 (2002)
26. Slater, N. B.: Gaps and steps for the sequence nθ mod 1. Proc. Cambridge Philos. Soc. 63, 1115–1123 (1967)
27. Davenport, H.: Analytic Methods for Diophantine Equations and Diophantine Inequalities. Ann Arbor, MI: Ann Arbor Publ., 1962, pp. 13ff
28. Artin, E.: Ein mechanisches System mit quasiergodischen Bahnen. In: Collected works, Lang, S., Tate, J.T. (eds.), Reading, MA-London: Addison-Wesley, 1965, p. 499
29. Series, C.: The modular surface and continued fractions. J. London. Math. Soc. 31, 69–80 (1985)
30. Liu, H., Moore, G., Seiberg, N.: Strings in time-dependent orbifolds. JHEP 0210, 031 (2002)
31. Pollicott, M., Weiss, H.: Multifractal analysis of Lyapunov exponent for continued fraction and Manneville-Pomeau transformations and applications to Diophantine approximation. Commun. Math. Phys. 207, 145 (1999)
32. Marcolli, M.: Modular curves, C* algebras, and chaotic cosmology. http://arxiv.org/list/math-ph/0312035, 2003; Marcolli, M.: Limiting modular symbols and the Lyapunov spectrum. http://arxiv.org/list/math.NT/0111093, 2001; Manin, Y., Marcolli, M.: Continued fractions, modular symbols, and non-commutative geometry. http://arxiv.org/list/math.NT/0102006, 2001
33. Hofstadter, D. R.: Energy levels and wave functions of Bloch electrons in rational and irrational magnetic fields. Phys. Rev. B 14, 2239 (1976)
34. Costa, M. S., Gutperle, M.: The Kaluza-Klein Melvin solution in M-theory. JHEP 0103, 027 (2001)
35. Gutperle, M., Strominger, A.: Fluxbranes in string theory. JHEP 0106, 035 (2001)
36. Oxtoby, J.: Measure and Category. GTM Vol. 2, Berlin-Heidelberg-New York: Springer-Verlag, 1980
37. Marklof, J.: The n-point correlations between values of a linear form, with an appendix by Z. Rudnick. Ergod. Th. Dyn. Sys. 20, 1127–1172 (2000)

Communicated by M.R. Douglas
Commun. Math. Phys. 256, 513–537 (2005) Digital Object Identifier (DOI) 10.1007/s00220-005-1339-0
Communications in
Mathematical Physics
Wilson Surfaces and Higher Dimensional Knot Invariants
Alberto S. Cattaneo1, Carlo A. Rossi2
1 Institut für Mathematik, Universität Zürich–Irchel, Winterthurerstrasse 190, 8057 Zürich, Switzerland. E-mail: [email protected]
2 D-MATH, ETH-Zentrum, 8092 Zürich, Switzerland. E-mail: [email protected]
Received: 11 November 2002 / Accepted: 28 January 2005
Published online: 12 April 2005 – © Springer-Verlag 2005
Abstract: An observable for nonabelian, higher-dimensional forms is introduced, its properties are discussed and its expectation value in BF theory is described. This is shown to produce potential and genuine invariants of higher-dimensional knots.

1. Introduction

Wilson loops play a very important role in gauge theories. They appear as natural observables, e.g., in Yang–Mills and in Chern–Simons theory; in the latter, their expectation values lead to invariants for (framed) knots [19]. A generalization of Wilson loops in the case where the connection is replaced by a form B of higher degree and the loop by a higher-dimensional submanifold is then natural and might have applications to the theories of D-branes, gerbes and, as we discuss in this paper, invariants of imbeddings.

In the abelian case, one assumes B to be an ordinary n-form on an m-dimensional manifold M. The generalization of abelian gauge symmetries is in this case given by transformations of the form B → B + dσ, $\sigma \in \Omega^{n-1}(M)$. The obvious generalization of a Wilson loop has then the form

\[ O(B, f, \lambda) = e^{\frac{i}{\hbar} \lambda \int_N f^*B}, \tag{1.1} \]

where λ is a coupling constant, N is an n-dimensional manifold and f is a map N → M. As an example of a theory where this observable is interesting, one has the so-called abelian BF theory [16] which is defined by the action functional

\[ S(A, B) = \int_M B\, dA, \qquad B \in \Omega^n(M),\ A \in \Omega^{m-n-1}(M). \]

The expectation value of the product $O(B, f, \lambda)\, O(A, g, \tilde\lambda)$ (with f : N → M, $g : \tilde N \to M$, dim N = n, dim $\tilde N$ = m − n − 1) is then an interesting topological invariant which
A.S.C. acknowledges partial support of SNF Grant No. 20-63821.00
in the case M = Rn turns out to be a function of the linking number of the images of f and g (assuming that they do not intersect). For another generalization of abelian Chern-Simons theory and Wilson loops in higher dimensions, see also [12]. A nonabelian generalization seems to require necessarily that along with B one has an ordinary connection A on some principal bundle P → M. The field B is then assumed to be a tensorial n-form on P . If the map f (N) describes an (n − 1)-family of imbedded loops (viz., N = S 1 × X and f (•, x) is an imbedding S 1 → M ∀x ∈ X), then a generalization of (1.1) has been introduced in [6] in the case n = 2 and, more generally, in [7, 9]. If such an observable is then considered in the context of nonabelian BF theories (which implies that one has to take n = m − 2), one gets cohomology classes of the Vassiliev type on the space of imbeddings of a circle into M [7, 9, 5]. In the present paper we are however interested in the case where f is an imbedding1 of N into M. We assume throughout n = m − 2 and we choose B to be of the coadjoint type. In particular, this will make our generalization of (1.1), the Wilson surface, suitable for the so-called canonical BF theories, see Sect. 2. Since these theories are topological, expectation values of Wilson surfaces should yield potential invariants of imbeddings of codimension two, i.e., of higher-dimensional knots. As an example, we discuss explicitly the case when M = Rm and N = Rm−2 and the imbeddings are assumed to have a fixed linear behavior at infinity (long knots). In this case, by studying the first orders in perturbation theory, we recover an invariant proposed by Bott in [2] for m odd and introduce a new invariant for m = 4. More general invariants may be obtained at higher orders. (These results have appeared in [15] to which we will recurringly refer for more technical details.) We believe that our Wilson surfaces may have broader applications in gauge theories. Plan of the paper. 
In Sect. 2, we recall nonabelian canonical BF theories and give a very formal, but intuitively clear, definition of Wilson surfaces, see (2.4) and (2.5). We discuss their formal properties and, in particular, we clarify why we expect their expectation values to yield invariants of higher-dimensional knots. (In this section by invariant we mean a Diff 0 (N ) × Diff 0 (M)-invariant function on the space of imbeddings N → M.) In Sect. 3, we give a more precise and at the same time more general definition of Wilson surfaces under the simplifying assumption that we work on trivial principal bundles. The properties of Wilson surfaces are here summarized in terms of descent equations (3.3), the crucial point of the whole discussion being the modified quantum master equation (3.6). Though we briefly recall here the fundamental facts about the Batalin–Vilkovisky (BV) formalism [1], some previous exposure to it will certainly be helpful. In Sect. 4, we carefully describe the perturbative definition of Wilson surfaces—see (4.5), (4.6) and (4.8)—in the case M = Rm and N = Rm−2 to which we will stick to the end of the paper. This perturbative definition of Wilson surfaces is finally rigorous, and in Sect. 5 we are able to prove some of its properties, viz., the “semiclassical” version of the descent equation, see (5.1) and Prop. 5.1. The “quantum” descent equation, on the other hand, still relies on some formal arguments. 1 The necessity of considering imbeddings in the nonabelian theory, instead of more general smooth maps, arises at the quantum level (just like in the nonabelian Chern–Simons theory) in order to avoid singularities which make the observables ill-defined.
In Sect. 6, we discuss the perturbative expansion of the expectation value of a Wilson surface in BF theory. The main results we obtain by considering the first three orders in perturbation theory are a generalization of the self-linking number (6.2), the Bott invariant (6.3), and a new invariant for long 2-knots (6.4), see Prop. 6.3. (In this section an invariant is understood as a locally constant function on the space of imbeddings.) We also discuss the general behavior of higher orders as well as the expectation value (6.5) of the product of a Wilson loop and a Wilson surface. The discussions in this section require some knowledge on the compactification of configuration spaces relative to imbeddings described in [3]. We refer for more details on this part to [15]. Finally, in Sect. 7, we discuss some possible extensions of our work.

2. Canonical BF Theories and Wilson Surfaces

We begin by fixing some notations that we will use throughout. Let G be a Lie group, $\mathfrak{g}$ its Lie algebra and P a G-principal bundle over an m-dimensional manifold M. We will denote by $\mathcal{A}$ and $\mathcal{G}$ the affine space of connection 1-forms and the group of gauge transformations, respectively. Given a connection A and a gauge transformation g, we will denote by $A^g$ the transformed connection. The next ingredients are the spaces $\Omega^k(M, \mathrm{ad}P)$ and $\Omega^k(M, \mathrm{ad}^*P)$ of tensorial k-forms of the adjoint and coadjoint type respectively. Given a connection A, we will denote by $d_A$ the corresponding covariant derivatives on $\Omega^\bullet(M, \mathrm{ad}P)$ and on $\Omega^\bullet(M, \mathrm{ad}^*P)$.

2.1. Canonical BF theories. Given $A \in \mathcal{A}$ and $B \in \Omega^{m-2}(M, \mathrm{ad}^*P)$, one defines the canonical BF action functional by

\[ S(A, B) := \int_M \langle B, F_A \rangle, \tag{2.1} \]

where $F_A$ is the curvature 2-form of A and $\langle\ ,\ \rangle$ denotes the extension to forms of the adjoint and coadjoint type of the canonical pairing between $\mathfrak{g}$ and $\mathfrak{g}^*$. The critical points of S are pairs $(A, B) \in \mathcal{A} \times \Omega^{m-2}(M, \mathrm{ad}^*P)$, where A is flat and B is covariantly closed, i.e., solutions to $F_A = 0 = d_A B$. The BF action functional is invariant under the action of an extension of the group of gauge transformations, viz., the semidirect product $\tilde{\mathcal{G}} := \mathcal{G} \ltimes \Omega^{m-3}(M, \mathrm{ad}^*P)$, where $\mathcal{G}$ acts on the abelian group $\Omega^{m-3}(M, \mathrm{ad}^*P)$ via the coadjoint action. A pair $(g, \sigma) \in \tilde{\mathcal{G}}$ acts on a pair $(A, B) \in \mathcal{A} \times \Omega^{m-2}(M, \mathrm{ad}^*P)$ by

\[ A \to A^g, \tag{2.2a} \]
\[ B \to B^{(g,\sigma)} = \mathrm{Ad}^*_{g^{-1}} B + d_{A^g} \sigma, \tag{2.2b} \]

and it is not difficult to prove that $S(A^g, B^{(g,\sigma)}) = S(A, B)$. By definition an observable is a $\tilde{\mathcal{G}}$-invariant function on $\mathcal{A} \times \Omega^{m-2}(M, \mathrm{ad}^*P)$. In the quantum theory, one defines the expectation value2 of an observable by

\[ \langle O \rangle = \int DA\, DB\; e^{\frac{i}{\hbar} S(A,B)}\, O(A, B), \tag{2.3} \]

where the formal measure DADB is assumed to be $\tilde{\mathcal{G}}$-invariant.

2 For notational simplicity, throughout the paper we assume the functional measures to be normalized.
2.2. Wilson surfaces. We are now going to define an observable for BF theories associated to an imbedding f : N → M, where N is a fixed (m − 2)-dimensional manifold. The first observation is that, using f , one can pull back the principal bundle P to N ; let us denote by f ∗ P the principal bundle over N obtained this way. Given a connection one-form A on P , we denote by f ∗ A the induced connection one-form on f ∗ P ; moreover, given B ∈ m−2 (M, ad∗ P ) we denote by f ∗ B the induced element of m−2 (N, ad∗ f ∗ P ). We then define (ξ, β, A, B, f ) :=
N
ξ , df ∗ A β + f ∗ B ,
(2.4)
for ξ ∈ 0 (N, adf ∗ P ) and β ∈ m−3 (N, ad∗ f ∗ P ). Our observable, which we will call the Wilson surface, is then defined as the following functional integral: O(A, B, f ) :=
i
Dξ Dβ e (ξ,β,A,B,f ) .
(2.5)
There are two important observations at this point: (1) At first sight we have a Gaussian integral where the quadratic part pairs ξ with β but there is no linear term in β; so it seems that one could omit the linear term in ξ as well. As a consequence O would not depend on B and would then have a rather trivial expectation value in BF theory. The point however is that (2.4) has in general zero modes. One has then to expand around each zero mode and then integrate over them (with some measure “hidden” in the notation Dξ Dβ). This makes things more interesting as we will see in the rest of the paper; in particular, the dependency of O on B will be nontrivial. (2) The action functional (2.4) may have symmetries (depending on A and B) which make the quadratic part around critical points degenerate. So in the computation of O the choice of some adapted gauge fixing is understood. We defer a more precise discussion to the following sections. We want now to show that (formally) O is an observable. First observe that an element of canonical BF theories, induces a pair (g, (g, σ ) of the symmetry group G ˜ σ˜ ), where g˜ is a gauge transformation for f ∗ P and σ˜ = f ∗ σ ∈ m−3 (N, ad∗ f ∗ P ). It is not difficult to show that (ξ, β, Ag , B (g,σ ) , f ) = (Adg˜ ξ, Ad∗g˜ (β + σ˜ ), A, B, f ). Thus, by making a change of variables in (2.5), we see that O is G-invariant if we make the following Assumption 1. We assume that the measure Dξ Dβ is invariant under i) the action of gauge transformation on 0 (N, adf ∗ P ) × m−3 (N, ad∗ f ∗ P ) and ii) translations of β. In the following we will see examples where these conditions are met; observe that this will in particular imply conditions on the measure on zero modes.
2.3. Invariance properties. Next we want to discuss invariance of O under the group Diff 0 (N ) of diffeomorphisms of N connected to the identity. For ψ ∈ Diff 0 (N ), one can now prove that3 (ξ, β, A, B, f ◦ ψ −1 ) = (ψ ∗ ξ, ψ ∗ β, A, B, f ). If we now further assume that the measure Dξ Dβ is invariant4 under ψ ∗ , we obtain that O(A, B, f ◦ ψ −1 ) = O(A, B, f ). Finally, we want to prove that O is also Diff 0 (M)-invariant. For φ ∈ Diff 0 (M), the relevant identity is now5 (ξ, β, A, B, φ ◦ f ) = (ξ, β, φ ∗ A, φ ∗ B, f ). After integrating out ξ and β, we get then O(A, B, φ ◦ f ) = O(φ ∗ A, φ ∗ B, f ). Observe now that the BF action (2.1) is Diff 0 (M)-invariant, viz., S(A, B) = S(φ ∗ A, φ ∗ B). Thus, if we assume the measure DADB to be Diff 0 (M)-invariant as well, we deduce that O (f ) = O (φ ◦ f ) ∀φ ∈ Diff 0 (M). In conclusion, whenever we can make sense of the observable O and the expectation value (2.3) together with Assumption 1, we may expect to obtain invariants of higher-dimensional knots N → M. A caveat is that in the perturbative evaluation of the functional integrals some regularizations have to be included (e.g., point splitting) and this may spoil part of the result (analogously to what happens in Chern–Simons theory where expectation values of Wilson loops do not actually yield knot invariants but invariants of framed knots6 ). 2.4. The abelian case. As a simple example we discuss now the case g = R. The action simplifies to (ξ, β, A, B, f ) := ξ(dβ + f ∗ B). N
The critical points are solutions to dξ0 = dβ0 + f ∗ B = 0. Since we want to treat B perturbatively, we expand instead around a solution to dξ0 = dβ0 = 0. For simplicity To be more precise, observe that the l.h.s. is now defined on tensorial forms on (f ◦ ψ −1 )∗ P instead := 0 (N, adf ∗ P )×m−3 (N, ad∗ f ∗ P ) and N (f ◦ ψ −1 ) := 0 (N, ad(f ◦ ψ −1 )∗ P ) × m−3 (N, ad∗ (f ◦ ψ −1 )∗ P ). 4 More precisely, we assume that the measure D ξ˜ D β˜ on N (f ◦ ψ −1 ) is equal to the pullback of the measure Dξ Dβ by ψ ∗ whenever ξ˜ = ψ ∗ ξ and β˜ = ψ ∗ β. 5 Observe that now we are moving from P to φ ∗ P , and in the r.h.s. φ ∗ denotes the induced isomorphism between A(P ) × m−2 (M, ad∗ P ) and A(φ ∗ P ) × m−2 (M, ad∗ φ ∗ P ). 6 Genuine knot invariants may also be obtained by subtracting suitable multiples of the self-linking number [3]. We will see in Subsect. 6.4 that a similar strategy—viz., taking the linear combination of potential invariants coming from expectation values in order to obtain genuine invariant—may be used in the case of long higher-dimensional knots. 3
of f ∗ P . By ψ ∗ we mean then the isomorphism between N (f )
we consider only the case β0 = 0.7 On the other hand ξ0 has to be a constant function; we will denote by its value. We get then i ∗ O(A, B, f ) = Z µ() e N f B , ∈R
where µ is a measure on the moduli space R of solutions to dξ0 = 0, and i i α(dβ+f ∗ B) N = DαDβ e N αdβ , Z = DαDβ e where we have denoted by α the perturbation of ξ around . Observe that Z is independent of f , of A and of B.8 If we take the measure µ to be a delta function peaked at some value λ, we recover, apart from the constant Z, the observable displayed in (1.1). 3. BV Formalism BF theories present symmetries that are reducible on shell.9 To deal with it, one resorts to the Batalin–Vilkovisky (BV) formalism. We summarize here the results on BV for canonical BF theories [9]. First we introduce the following spaces of superfields: A := A ⊕
m
i (M, adP )[1 − i],
i=0 i =1
B :=
m
i (M, ad∗ P )[m − 2 − i],
i=0
where the number in square brackets denotes the ghost number to be given to each component. If we introduce the total degree as the sum of ghost number and form degree, we see that elements of A have total degree equal to one and elements of B have total degree equal to m − 2. Remark 3.1. In the following, whenever we refer to some super algebraic structure (Lie brackets, derivations, . . . ), it will always be understood that the grading is the total degree. Observe then that the space A of superconnections is modeled on the super vector space A0 :=
m
i (M, adP )[1 − i].
i=0
Observe that the action is invariant under the transformation β → β + dτ . So, if H m−3 (N) = {0}, there is no loss of generality in taking β0 = 0. 8 The explicit computation of Z, taking into account the symmetries with the BRST formalism, yields the Ray–Singer torsion of N, see [16]. 9 The infinitesimal form of the symmetries (2.2) consists of the usual infinitesimal gauge symmetries and of the addition to B of the covariant derivative of an (m − 3)-form σ of the coadjoint type. On shell, i.e. at the critical points of the action, the connection has to be flat. Thus, there is a huge kernel of infinitesimal symmetries containing in particular all dA -exact forms. Off shell the kernel is in general much smaller. Having completely different kernels on and off shell makes the BRST formalism, even with ghosts for ghosts, not applicable to this case. 7
The Lie algebra structure on g induces a super Lie algebra structure on A0 whose Lie bracket will be denoted by [[ ; ]]. (We refer to [9] for more details and sign conventions.)10 Given A ∈ A, we define its curvature
\[ F_A = F_{A_0} + d_{A_0} a + \tfrac{1}{2}\,[[a ; a]], \]
where A0 is any reference connection and a := A − A0 ∈ A0. Then we define the BV action for the canonical BF theory by
\[ S(A, B) = \int_M \langle\langle B ; F_A \rangle\rangle, \]
where $\langle\langle\ ;\ \rangle\rangle$ denotes the extension to forms of the adjoint and coadjoint type of the canonical pairing $\langle\ ,\ \rangle$ between g and g∗ with shifted degree: $\langle\langle \alpha ; \beta \rangle\rangle := (-1)^{\mathrm{gh}\,\alpha\,\deg\beta}\,\langle \alpha , \beta \rangle$. Integration over M is assumed here to select the form component of degree m. Observe that, when the superfields are restricted to their components of ghost number zero, the BV action reduces to the classical BF action S(A, B) of (2.1). The space A × B of superfields is isomorphic to T∗[−1]A and as such it has a canonical odd symplectic structure whose corresponding BV bracket we will denote by (( ; )). It can then be shown that S satisfies the classical master equation (( S ; S )) = 0. This implies that the derivation (of total degree one) δ := (( S ; )) is a differential (the BRST differential). It can be easily checked that
\[ \delta A = (-1)^m F_A, \qquad \delta B = (-1)^m d_A B. \tag{3.1} \]
As usual in the BV formalism one also introduces the BV Laplacian Δ. For this, one assumes a measure which induces a divergence operator and defines ΔF by $\tfrac{1}{2}\,\mathrm{div}\,X_F$, with $X_F$ = (( F ; )) the Hamiltonian vector field of F. In the functional integral, the measure is defined only formally. For us, the Laplace operator will have the property that
\[ \Delta\bigl((A_k)^a(x)\,(B_l)_b(y)\bigr) = \delta_{k+l,-1}\,\delta^a_b\,\delta(x, y), \tag{3.2} \]
where $A_k$ ($B_k$) denotes the component of ghost number k of A (B), and we have chosen a local trivialization of ad P (ad∗ P) to expand $A_k$ ($B_k$) on a basis of g (g∗). One can then show that ΔS = 0. As a consequence S satisfies the quantum master equation (( S ; S )) − 2iħΔS = 0, and the operator Ω := δ − iħΔ is a coboundary operator (i.e., Ω² = 0) of total degree one. Given a function O on T∗[−1]A, one defines its expectation value by
\[ \langle O \rangle := \int_{L} DA\,DB\; e^{\frac{i}{\hbar} S(A,B)}\, O(A, B), \]
where L is a Lagrangian submanifold (determined by a gauge fixing). The general properties of the BV formalism ensure that
10 It suffices here to say that (locally) the Lie bracket of g-valued forms α and β is defined by $[[\alpha ; \beta]] = (-1)^{\mathrm{gh}\,\alpha\,\deg\beta}\,\alpha^a \beta^b f_{ab}^{c} R_c$, where $\{R_c\}$ is a basis of g, $f_{ab}^{c}$ are the corresponding structure constants, gh denotes the ghost number and deg the form degree.
A.S. Cattaneo, C.A. Rossi
(1) the expectation value of an Ω-closed function (called a BV observable) is invariant under deformations of L (“independence of the gauge fixing”); and (2) the expectation value of an Ω-exact function vanishes (“Ward identities”). 3.1. Wilson surfaces in the BV formalism. We now want to extend the observable O to a function $\mathsf{O}$ (of total degree zero) on T∗[−1]A × Ω•(Imb(N, M)) (where Imb(N, M) denotes the space of imbeddings N → M) that satisfies the “descent equations”
\[ \Omega\,\mathsf{O} = (-1)^m\, d\,\mathsf{O}, \tag{3.3} \]
where d is the de Rham differential on Ω•(Imb(N, M)). Observe that, denoting by $\mathsf{O}_i$ the i-form component, the descent equation implies in particular $\Omega\mathsf{O}_0 = 0$ and $\Omega\mathsf{O}_1 = (-1)^m d\mathsf{O}_0$. Thus, $\mathsf{O}_0$ will be a BV observable satisfying $d\langle\mathsf{O}_0\rangle = 0$. We expect then that (apart from regularization problems) $\langle\mathsf{O}_0\rangle$ should yield a higher-dimensional knot invariant. Observe that, since $\mathsf{O}$ will be defined in terms of a gauge-fixed functional integral, we will have to take care of the dependence of $\mathsf{O}$ on the gauge fixing. We will show that the variation of $\mathsf{O}$ w.r.t. the gauge fixing is $(d - (-1)^m\Omega)$-exact, the same combination of d and Ω that annihilates $\mathsf{O}$ by (3.3). As a consequence, the variation of $\langle\mathsf{O}\rangle$ w.r.t. the gauge fixing will be d-exact and hence well defined in cohomology. In particular, we should expect $\langle\mathsf{O}_0\rangle$ to be gauge-fixing independent. In order to define $\mathsf{O}$ properly and to show its properties we make from now on the following simplifying Assumption 2. We assume that the principal bundle P is trivial. As a consequence, from now on, elements of A (B) will be regarded as forms on M taking values in g (g∗). Our definition of $\mathsf{O}$ requires first the introduction of superfields on N. We set
\[ \widetilde{A} := \bigoplus_{i=0}^{m-2} \Omega^{i}(N; \mathfrak{g})[-i], \qquad \widetilde{B} := \bigoplus_{i=0}^{m-2} \Omega^{i}(N; \mathfrak{g}^{*})[m-3-i]. \]
Elements of $\widetilde{A}$ have then total degree zero, while elements of $\widetilde{B}$ have total degree m − 3. Again we may regard $\widetilde{A} \times \widetilde{B}$ as $T^*[-1]\widetilde{A}$, which we endow with its canonical odd symplectic structure. We will denote by $((\ ;\ ))^{\sim}$ the corresponding BV bracket. We are now in a position to give a first BV generalization of (2.4); viz., for $\xi \in \widetilde{A}$ and $\beta \in \widetilde{B}$, we define
\[ \Sigma_0(\xi, \beta, A, B)(f) := \int_N \langle\langle \xi ; d_{f^*A}\beta + f^*B \rangle\rangle. \]
One can immediately verify that, on fields of ghost number zero, $\Sigma_0(\xi, \beta, A, B)(f)$ reduces to the functional Σ(ξ, β, A, B, f) of (2.4).
The notation used suggests that we want to consider $\Sigma_0(\xi, \beta, A, B)$ as a function on Imb(N, M). More generally, we want to define a functional Σ taking values in forms on Imb(N, M). To do so, we first introduce the evaluation map
\[ \mathrm{ev} : N \times \mathrm{Imb}(N, M) \to M, \qquad (x, f) \mapsto f(x), \]
and the projection π : N × Imb(N, M) → Imb(N, M). Denoting by π∗ the corresponding integration along the fiber N, we define
\[ \Sigma(\xi, \beta, A, B) := \pi_* \langle\langle \xi ; d_{\mathrm{ev}^*A}\beta + \mathrm{ev}^*B \rangle\rangle \in \Omega^{\bullet}(\mathrm{Imb}(N, M)). \]
Observe that Σ is a sum of forms on Imb(N, M) of different ghost numbers with total degree equal to zero and that $\Sigma_0$ is the component of Σ of form degree zero (or, equivalently, of ghost number zero). Now, by using (3.1) and the property $d\pi_* = (-1)^m \pi_* d$, one can prove the identity11
\[ d\Sigma = (-1)^m\,\delta\Sigma + \tfrac{1}{2}\,((\Sigma ; \Sigma))^{\sim}. \tag{3.4} \]
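We may also define the derivation $\widetilde{\delta} := ((\Sigma ;\ ))^{\sim}$ which, by (3.4), is not a differential; on generators it gives
\[ \widetilde{\delta}\xi = (-1)^m d_{\mathrm{ev}^*A}\,\xi, \qquad \widetilde{\delta}\beta = (-1)^m \bigl(d_{\mathrm{ev}^*A}\,\beta + \mathrm{ev}^*B\bigr). \tag{3.5} \]
Observe that for any given family of imbeddings, one gets a vector field on $T^*[-1]\widetilde{A}$. We now introduce a formal measure Dξ Dβ on this space. In terms of this measure, we define the BV Laplacian $\widetilde{\Delta}$. We assume the formal measure to satisfy the following generalization of Assumption 1 in Sect. 2: Assumption 3. We assume the measure to be invariant under the vector fields defined by (3.5); viz., we assume $\widetilde{\Delta}\Sigma = 0$. Formally we can now improve (3.4) to the fundamental identity of this theory, which we will call the modified quantum master equation; viz.,
\[ d\Sigma = (-1)^m \Bigl( \Omega\Sigma + \tfrac{1}{2}\,((\Sigma ; \Sigma)) \Bigr) + \tfrac{1}{2}\,\widetilde{\mathrm{QME}}(\Sigma) \tag{3.6} \]
with $\widetilde{\mathrm{QME}}(\Sigma) := ((\Sigma ; \Sigma))^{\sim} - 2i\hbar\,\widetilde{\Delta}\Sigma$. This identity is a consequence of the following formal facts: (1) ΔΣ vanishes since Σ is at most linear in A and B. (2) $\widetilde{\Delta}\Sigma$ vanishes by Assumption 3.
11 Observe that, in order to compute $((\Sigma ; \Sigma))^{\sim}$, one has to “integrate by parts.” This is allowed since $\langle\langle \xi ; \beta \rangle\rangle$ does not depend on the given imbedding. As a consequence, $\pi_*\langle\langle \xi ; \beta \rangle\rangle$ is a constant zero-form on Imb(N, M), which implies the useful identity
\[ 0 = (-1)^m d\,\pi_*\langle\langle \xi ; \beta \rangle\rangle = \pi_*\langle\langle d_A\xi ; \beta \rangle\rangle + \pi_*\langle\langle \xi ; d_A\beta \rangle\rangle. \]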
(3) (( Σ ; Σ )) is proportional to a delta function at coinciding points, but the coefficient is proportional to $\langle\langle [[\xi ; \xi]] ; \beta \rangle\rangle$, which vanishes since ξ has total degree zero. Observe finally that the modified quantum master equation can also be rewritten in the form
\[ \bigl( d - (-1)^m\,\Omega + i\hbar\,\widetilde{\Delta} \bigr)\, e^{\frac{i}{\hbar}\Sigma} = 0. \tag{3.7} \]
We are now in a position to define the observable $\mathsf{O}$ and to prove its formal properties. We set
\[ \mathsf{O}_\Psi := \int_{L_\Psi} D\xi\,D\beta\; e^{\frac{i}{\hbar}\Sigma}, \]
where $L_\Psi$ is the Lagrangian section determined by the gauge-fixing fermion Ψ. Recall that, as in general in the BV formalism, Ψ is required to depend only on the fields.12 In this modified situation, we call good a gauge-fixing fermion that in addition satisfies the equation ΩΨ + (( Σ ; Ψ )) = $(-1)^m$ dΨ. In particular, gauge-fixing fermions independent of A, B and the imbedding are good. Now let $\Psi_t$ be a path of good gauge-fixing fermions. By the usual manipulations in the BV formalism, the modified quantum master equation (3.6) implies that
\[ \frac{d}{dt}\,\mathsf{O}_{\Psi_t} = \frac{i}{\hbar}\,\bigl((-1)^m\,\Omega - d\bigr)\,\widetilde{\mathsf{O}}_{\Psi_t} \qquad\text{with}\qquad \widetilde{\mathsf{O}}_{\Psi_t} = \int_{L_{\Psi_t}} D\xi\,D\beta\; e^{\frac{i}{\hbar}\Sigma}\, \frac{d\Psi_t}{dt}. \]
As a consequence, the expectation value of $\mathsf{O}_\Psi$ will be gauge-fixing independent modulo exact forms on Imb(N, M) as long as we stay in the class of good gauge fixings. This understood, from now on we will drop the label Ψ. Another consequence of the modified quantum master equation is the descent equations (3.3), which are immediately obtained by integrating (3.7) over the Lagrangian section $L_\Psi$ determined by a good gauge-fixing fermion Ψ. 4. The Case of Long Higher-Dimensional Knots We will concentrate from now on on the case $M = \mathbb{R}^m$ and $N = \mathbb{R}^{m-2}$, m > 3. We also choose once and for all a reference linear imbedding $\sigma : \mathbb{R}^{m-2} \to \mathbb{R}^m$ and we consider only those imbeddings that coincide with σ outside a compact set; we denote by Imbσ the corresponding space, whose elements are usually called long (m − 2)-knots.
12 As usual one has first to enlarge the space of fields and antifields by adding enough antighosts $\bar\sigma_i$ and Lagrange multipliers $\lambda_i$ together with their antifields $\bar\sigma_i^+$ and $\lambda_i^+$. One then extends the action functional by adding the term $\sum_i \int_N \bar\sigma_i^+ \lambda_i$. The extended action still satisfies the modified quantum master equation. The gauge-fixing fermion is assumed to depend on the fields only, i.e., on the $\bar\sigma_i$s, the $\lambda_i$s, and the components of nonnegative ghost number in ξ and β. See, e.g., Subsect. 4.3.
On the trivial bundle $P \cong \mathbb{R}^m \times G$, we pick the trivial connection as a reference point. Thus, we may identify A with the space of g-valued 1-forms. More generally, we think of A and B as spaces of g- resp. g∗-valued forms. Observe that the pair (A, B) = (0, 0) is now a critical point of BF theory. We will denote by a and B the perturbations around the trivial critical point, but, in order to keep track that they are “small”, we will scale them by $\sqrt{\hbar}$. Observe that we assume the fields a and B to vanish at infinity. To simplify the following computations, we also rescale ξ → ħξ. As a consequence, the super BF action functional and the super functional Σ will now read as follows:
\[ \frac{1}{\hbar}\, S(a, B) = \int_M \Bigl\langle\Bigl\langle B ; da + \frac{\sqrt{\hbar}}{2}\,[[a ; a]] \Bigr\rangle\Bigr\rangle, \tag{4.1a} \]
\[ \frac{1}{\hbar}\, \Sigma(\xi, \beta, a, B) = \pi_*\langle\langle \xi ; d\beta \rangle\rangle + \sqrt{\hbar}\;\pi_*\langle\langle \xi ; [[\mathrm{ev}^*a ; \beta]] + \mathrm{ev}^*B \rangle\rangle. \tag{4.1b} \]
4.1. Zero modes. We now consider the critical points of Σ for ħ = 0. The equations of motion are simply dξ = dβ = 0. Using translations by exact forms (which are the symmetries for Σ at ħ = 0), a critical point can always be put in the form β = 0 and ξ a constant function, whose value we will denote by Ξ ∈ g. We have now to choose a measure µ on the space g of zero modes. Then we write ξ = Ξ + α with α assumed to vanish at infinity. We also assume β to vanish at infinity and write
\[ \mathsf{O}(A, B) = \int_{\Xi \in \mathfrak{g}} \mu(\Xi)\; U(A, B, \Xi), \]
with
\[ U(A, B, \Xi) := \int_{L} D\alpha\,D\beta\; e^{\frac{i}{\hbar}\Sigma(\Xi + \alpha,\, \beta,\, A,\, B)}. \tag{4.2} \]
In the following, we will concentrate on U(A, B, •), which we will regard as an element of the completion of the symmetric algebra of g∗. Before starting the perturbative expansion of U, we comment briefly on the validity of Assumption 3 at the end of Sect. 3. We assume the formal measure DαDβ to be induced from a given constant measure on g. This means that $\widetilde{\Delta}$ will have the following property (cf. (3.2) for notations):
\[ \widetilde{\Delta}\bigl((\xi_k)^a(x)\,(\beta_l)_b(y)\bigr) = \delta_{k+l,-1}\,\delta^a_b\,\delta(x - y). \tag{4.3} \]
Then, by a computation analogous to that for canonical BF theories, one obtains in $\widetilde{\Delta}\Sigma$ a combination of delta functions and their derivatives at coinciding points (!) but with a vanishing coefficient. So, formally, Assumption 3 is satisfied.13
13 If we think in terms of the vector fields defined by (3.5), we should take care only of the terms containing the covariant derivatives as the formal measure is, as usual, assumed to be translation invariant. If the Lie algebra g were unimodular, then we would immediately conclude that, formally, the measure is invariant under this generalized gauge transformation. However, even more formally, things work in general as the contributions of different field components cancel each other.
Fig. 1. The four vertices coming from the second equation in (4.4)
4.2. The Feynman diagrams. We split the action Σ(Ξ + α, β, A, B) into the sum of $\Sigma^{(0)}(\alpha, \beta)$ and the perturbation $\Sigma^{(1)}(\alpha, \beta, A, B)$:
\[ \frac{1}{\hbar}\,\Sigma^{(0)}(\alpha, \beta) = \pi_*\langle\langle \alpha ; d\beta \rangle\rangle = \int_{\mathbb{R}^{m-2}} \langle\langle \alpha ; d\beta \rangle\rangle, \]
\[ \frac{1}{\hbar}\,\Sigma^{(1)}(\alpha, \beta, a, B) = \sqrt{\hbar}\;\pi_*\bigl( \langle\langle \alpha ; [[\mathrm{ev}^*a ; \beta]] \rangle\rangle + \langle\langle \alpha ; \mathrm{ev}^*B \rangle\rangle + \langle\langle \Xi ; [[\mathrm{ev}^*a ; \beta]] \rangle\rangle + \langle\langle \Xi ; \mathrm{ev}^*B \rangle\rangle \bigr). \tag{4.4} \]
As a consequence, in the perturbative expansion of U, we will have a propagator of order 1 in ħ (the inverse of d with some gauge fixing) and four vertices of order $\sqrt{\hbar}$. Graphically, we will denote the propagator by a dashed line oriented from β to α. The four vertices are then represented as in Fig. 1, where the black and white strip represents the zero mode Ξ. Observe that with these vertices one can construct two types of connected diagrams:
(1) Polygons consisting only of vertices of the first type, see Fig. 2 (observe that the 1-gon is a tadpole, so in general it will be removed by renormalization);
(2) “Snakes” with a B-field at the head and a zero mode Ξ at the tail; there is a very short snake consisting of a vertex of the fourth type only; a longer snake consisting of a vertex of the second type followed by a vertex of the third type; and a sequence of longer snakes consisting of a vertex of the second type followed by vertices of the first type and ending with a vertex of the third type. See Fig. 3.
We will denote by $\tau_n$ the n-gon and by $\sigma_n$ the snake with n vertices besides the head. Then, the combinatorial structure of U is given by
\[ U = e^{\sigma + \tau}, \tag{4.5} \]
with
\[ \tau = \sum_{n=2}^{\infty} \frac{\hbar^{n/2}}{n}\,\tau_n, \qquad \sigma = \sum_{n=0}^{\infty} \hbar^{(n+1)/2}\,\sigma_n. \tag{4.6} \]
(The factor n dividing $\tau_n$ is the order of the group of automorphisms of the polygon.) Remark 4.1. Observe that setting B = 0 kills σ. On the other hand, the partition function of $\Sigma|_{B=0}$ is just the torsion of the connection $d_{\mathrm{ev}^*A}$ [16]. As a consequence, exp τ(a) is the perturbative expression of the torsion for $A = A_0 + \sqrt{\hbar}\,a$, where $A_0$ is the trivial connection.
Fig. 2. The polygon τn with n vertices
4.3. The gauge fixing. To compute σ and τ explicitly, one has to choose a gauge fixing. Our choice is the so-called covariant gauge fixing $d^{\star}\beta = 0$, where $d^{\star}$ is defined in terms of a Riemannian metric on $\mathbb{R}^{m-2}$, e.g., the Euclidean metric. In the BV formalism, one needs a gauge fixing also for some of the ghosts, and everything has to be encoded into a gauge-fixing fermion. The first step consists in introducing antighosts and Lagrange multipliers and in extending the BV action. We will denote by $\bar\sigma_{i,l}$ the antighosts and by $\lambda_{i,l}$ the Lagrange multipliers (i = 1, . . . , m − 3, l = 1, . . . , i), with the following properties:
• $\bar\sigma_{i,l}$ is a g-valued form of degree m − 3 − i and ghost number −i + 2l − 2;
• $\lambda_{i,l}$ is a g-valued form of degree m − 3 − i and ghost number −i + 2l − 1.
We then introduce the corresponding antifields $\bar\sigma^{+}_{i,l}$ and $\lambda^{+}_{i,l}$. To the BV action we add then the piece
\[ \sum_{i=1}^{m-3} \sum_{l=1}^{i} (-1)^i \int_{\mathbb{R}^{m-2}} \langle\langle \bar\sigma^{+}_{i,l} ; \lambda_{i,l} \rangle\rangle. \]
Fig. 3. The “snake” σn with n + 1 terms
By means of the Euclidean metric on $\mathbb{R}^{m-2}$, we can construct the corresponding Hodge operator ⋆, which maps linearly forms on $\mathbb{R}^{m-2}$ of degree k to forms of degree m − 2 − k; moreover, we define the $L^2$-duality between forms on $\mathbb{R}^{m-2}$ with values in g and g∗ as follows:
\[ \langle\langle \eta , \omega \rangle\rangle_{L^2} := \int_{\mathbb{R}^{m-2}} \langle\langle \omega ; \star\eta \rangle\rangle, \tag{4.7} \]
where the operator ⋆ acts on the form part of η. Finally, we choose the gauge-fixing fermion Ψ to be
= σ1 ; d β +
i m−4
L2
+
m−4
σ i+1,1 ; d σi
i=1
σ i+1,k+2−l ; d σ i,l
i=1 l=2
L2
L2
.
Observe that this gauge fixing is independent of A, of B and of the imbedding; as a consequence it is a good gauge fixing (according to the terminology introduced at the end of Subsect. 3.1). With this choice of gauge fixing, the superpropagator is readily computed. To avoid the singularity on the diagonal of $\mathbb{R}^{m-2} \times \mathbb{R}^{m-2}$, we prefer to work on the (open) configuration space $C_2(\mathbb{R}^{m-2}) := \{(x, y) \in \mathbb{R}^{m-2} \times \mathbb{R}^{m-2} \mid x \neq y\}$. If we denote by $\pi_i$, i = 1, 2, the projection from $C_2(\mathbb{R}^{m-2})$ onto the i-th component, we get
\[ \langle \pi_1^*\alpha^a\; \pi_2^*\beta_b \rangle_{\mathrm{g.f.}} := \eta\,\delta^a_b, \]
where η is the pullback of the normalized, SO(m − 2)-invariant volume form $w_{m-3}$ on $S^{m-3}$ via the map
\[ \phi : C_2(\mathbb{R}^{m-2}) \to S^{m-3}, \qquad (x, y) \mapsto \frac{y - x}{\|y - x\|}, \]
where ‖ ‖ denotes the Euclidean norm. 4.4. Explicit expressions. We are now in a position to write down σ and τ in an explicit way. We only need a few more pieces of notation. First, we introduce the (open) configuration space $C_n(\mathbb{R}^{m-2})$ as the space of n distinct points on $\mathbb{R}^{m-2}$: $C_n(\mathbb{R}^{m-2}) := \{(x_1, \dots, x_n) \in (\mathbb{R}^{m-2})^n \mid i \neq j \Rightarrow x_i \neq x_j\}$. For a given $C_n$, we introduce the projections
\[ \pi_i : C_n(\mathbb{R}^{m-2}) \to \mathbb{R}^{m-2}, \qquad (x_1, \dots, x_n) \mapsto x_i, \]
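and, for i ≠ j,
\[ \pi_{ij} : C_n(\mathbb{R}^{m-2}) \to C_2(\mathbb{R}^{m-2}), \qquad (x_1, \dots, x_n) \mapsto (x_i, x_j). \]
Then we set $a_i := (\mathrm{ev} \circ (\mathrm{id} \times \pi_i))^* a$, $B_i := (\mathrm{ev} \circ (\mathrm{id} \times \pi_i))^* B$, and $\eta_{ij} := \pi_{ij}^*\eta$. Finally, we may write
\[ \tau_n(a) = \pi^n_*\, \mathrm{Tr}\bigl[ \mathrm{ad}(a_1)\,\eta_{12}\, \mathrm{ad}(a_2)\,\eta_{23} \cdots \eta_{n-1,n}\, \mathrm{ad}(a_n)\,\eta_{n1} \bigr], \tag{4.8a} \]
\[ \sigma_n(a, B; \Xi) = i\,\pi^{n+1}_* \bigl\langle \mathrm{ad}^*(a_1)\,\eta_{12}\, \mathrm{ad}^*(a_2)\,\eta_{23} \cdots \mathrm{ad}^*(a_n)\,\eta_{n,n+1}\, B_{n+1}\,,\ \Xi \bigr\rangle, \tag{4.8b} \]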
where $\pi^n_*$ denotes the integration along the fiber corresponding to the projection $\pi^n : C_n(\mathbb{R}^{m-2}) \times \mathrm{Imb}_\sigma \to \mathrm{Imb}_\sigma$, and Tr is the trace in the adjoint representation. 5. Properties of the Wilson Surface for Long Knots In this section we discuss the properties of the functions τ and σ introduced in (4.8). Proposition 5.1. The functions τ and σ are well-defined and satisfy
\[ \delta\tau = (-1)^m d\tau, \qquad \delta\sigma = (-1)^m d\sigma. \]
Proof. We have first to prove that the integrals defining σ and τ converge. This is easily done by introducing the compactifications $C_n[\mathbb{R}^{m-2}]$ of the (open) configuration spaces $C_n(\mathbb{R}^{m-2})$ defined in [3]. These compactified configuration spaces are manifolds with corners, with the property that all projections to configuration spaces with fewer points may be lifted to smooth maps. Moreover, the form η defined in the previous subsection extends to a smooth, closed (m − 3)-form on $C_2[\mathbb{R}^{m-2}]$. As a consequence, σ and τ may be expressed by integrating along the compactification. In other words, we take the same expressions but we interpret $\pi^n_*$ as the integration along the fiber corresponding to the projection $\pi^n : C_n[\mathbb{R}^{m-2}] \times \mathrm{Imb}_\sigma \to \mathrm{Imb}_\sigma$. To prove the properties, we use the generalized Stokes Theorem
\[ d\pi^n_* = (-1)^{mn}\,\bigl(\pi^n_* d - \pi^{n,\partial}_*\bigr), \]
where $\pi^{n,\partial}_*$ denotes integration along the (codimension-one) boundary of $C_n[\mathbb{R}^{m-2}]$. Since the forms $\eta_{ij}$ are closed, the first term produces a sum of integrals where d is applied, one at a time, to a form a or B. The boundary terms may be divided into principal and hidden faces, the former corresponding to the collapse of exactly two points. If the two points are not consecutive, they are not joined by an η and the integral along the fiber vanishes for dimensional reasons. If on the other hand they are consecutive, the integral along the fiber of η is normalized; we get then contributions of the form [[a ; a]] or ad∗(a)B. Collecting all the terms and using (3.1), we get the formulae displayed in the proposition, up to hidden faces.
Fig. 4. The typical term in (( σ ; σ ))
The vanishing of the hidden faces (corresponding to more points collapsing together and/or escaping to infinity) is due partly to dimensional reasons, partly to slight modifications of the Kontsevich Lemma (see [14]). We refer the reader to [15] for the detailed proof.14 An immediate consequence of the proposition is that the Wilson surface U, defined in (4.2), satisfies the “semiclassical” descent equation
\[ \delta U = (-1)^m\, dU. \tag{5.1} \]
In order to prove the “quantum” descent equation $\Omega U = (-1)^m dU$, we must now show that, formally, U is Δ-closed. To do so, we first observe that, by the formal properties of the BV Laplacian,
\[ \Delta U = U \Bigl( \Delta\sigma + \Delta\tau + \tfrac{1}{2}\,((\sigma ; \sigma)) + ((\sigma ; \tau)) + \tfrac{1}{2}\,((\tau ; \tau)) \Bigr). \]
The second and last terms in parentheses vanish since τ depends only on a (and not on B). In [15], it is proved that also the third term vanishes and that Δσ + (( σ ; τ )) = 0. Graphically, these terms are represented in Figs. 4 and 5.15 We observe that the proof in [15] is rather formal in the sense that, in the computation of Δσ, it ignores the term coming from B and the adjacent a, as this term produces a tadpole. However, if g is unimodular, the Lie algebraic coefficient of this term vanishes. In the general case, one has to introduce a suitable counterterm $\tau_1$ in the torsion to compensate for it in (( σ ; τ )). Our final comment is that it does not make much sense to spend efforts in making the proof of the quantum descent equation more rigorous. In any case, the descent equation
14 It should be remarked that in the proof we never make use of the fact that the form $w_{m-3}$ appearing in the definition of η (see the end of Subsect. 4.3) is SO(m − 2)-invariant; what is needed is just that $w_{m-3}$ has the same parity as m under the action of the antipodal map x → −x. Hence the proposition is still valid if, in the definition of η, we choose $w_{m-3}$ to be any normalized top form with the required parity under the antipodal map.
15 The Y-shaped vertex with no labels in the figures is the result of the contraction of an a with a B determined by the BV bracket or the BV Laplacian.
Fig. 5. The typical term in Δσ or in (( σ ; τ ))
implies only formally that U0 should be an invariant, where U0 denotes the piece of U of degree 0 (hence, of ghost number 0). What one has to do instead is to take the perturbative expression of U0 and directly either prove that it produces invariants of long knots or compute its failure (“anomaly” in the language of [3]) and understand how to correct it. We will see examples of this in the next section.
6. Perturbative Invariants of Long Higher-Dimensional Knots In this section we compute the first terms of the perturbative expansion of U0 and briefly discuss the expectation value of the product of U0 with a Wilson loop. First we have, however, to describe the Feynman rules for BF theory. According to the action as written in (4.1a), there is a superpropagator between a and B, which we will denote by a solid line oriented from B to a, and a trivalent vertex as in Fig. 6 of weight $\sqrt{\hbar}$. In the covariant gauge, the superpropagator can easily be described as follows (see [9] for details). Let us denote by $\pi_i$, i = 1, 2, the projection from $C_2(\mathbb{R}^m)$ onto the i-th component. Then
\[ \langle \pi_1^* A^a\; \pi_2^* B_b \rangle_{\mathrm{g.f.}} := \theta\,\delta^a_b, \]
Fig. 6. The propagator and the interaction vertex
where θ is the pullback of the normalized, SO(m)-invariant volume form $w_{m-1}$ on $S^{m-1}$ via the map
\[ \psi : C_2(\mathbb{R}^m) \to S^{m-1}, \qquad (x, y) \mapsto \frac{y - x}{\|y - x\|}, \]
with ‖ ‖ denoting the Euclidean norm. In order to proceed with the discussion of the perturbative expansion, we have to introduce some pieces of notation. Given f ∈ Imbσ, we denote by $C_{s,t}(f)$ the configuration space of s + t points on $\mathbb{R}^m$, the first s of which are constrained to lie on the image of f; in other words,
\[ C_{s,t}(f) = \left\{ \begin{array}{l} (x_1, \dots, x_s) \in (\mathbb{R}^{m-2})^s \\ (y_{s+1}, \dots, y_{s+t}) \in (\mathbb{R}^m)^t \end{array} \ \middle|\ \begin{array}{ll} x_i \neq x_j, & 1 \le i < j \le s \\ y_i \neq y_j, & s < i < j \le s+t \\ f(x_i) \neq y_j, & 1 \le i \le s < j \le s+t \end{array} \right\}. \]
Observe that $C_{s,0}(f) = C_s(\mathbb{R}^{m-2})$ and $C_{0,t}(f) = C_t(\mathbb{R}^m)$. For i, j = 1, . . . , s, i ≠ j, we have projections
\[ \pi_{ij} : C_{s,t}(f) \to C_2(\mathbb{R}^{m-2}), \qquad (x_1, \dots, x_s; y_{s+1}, \dots, y_{s+t}) \mapsto (x_i, x_j). \]
We will denote by $\eta_{ij}$ the pullback of η by $\pi_{ij}$. Moreover, for i, j = 1, . . . , s + t, i ≠ j, we have projections
\[ \Pi_{ij} : C_{s,t}(f) \to C_2(\mathbb{R}^m), \qquad (x_1, \dots, x_s; y_{s+1}, \dots, y_{s+t}) \mapsto \begin{cases} (f(x_i), f(x_j)) & i, j \le s, \\ (f(x_i), y_j) & i \le s < j, \\ (y_i, f(x_j)) & j \le s < i, \\ (y_i, y_j) & s < i, j. \end{cases} \tag{6.1} \]
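The maps φ and ψ and the projections of (6.1) only involve the unit vector joining two distinct points. The following minimal Python sketch evaluates that direction for a sample configuration; the function names and the sample long 2-knot in R^4 (the graph of a compactly supported bump, hence unknotted) are illustrative assumptions and are not taken from the paper.

```python
import math

def direction(p, q):
    """Unit vector (q - p)/||q - p||, the formula shared by phi and psi."""
    diff = [b - a for a, b in zip(p, q)]
    norm = math.sqrt(sum(c * c for c in diff))
    if norm == 0.0:
        raise ValueError("the two points must be distinct")
    return tuple(c / norm for c in diff)

def sample_long_knot(x):
    """Hypothetical long 2-knot in R^4: equals the linear imbedding
    (x1, x2) -> (x1, x2, 0, 0) outside the unit disc."""
    x1, x2 = x
    r2 = x1 * x1 + x2 * x2
    bump = math.exp(-1.0 / (1.0 - r2)) if r2 < 1.0 else 0.0
    return (x1, x2, bump, 0.0)

def big_pi(f, xs, ys, i, j):
    """Pair of points of R^m selected by the projection (6.1).
    Indices are 1-based: k <= s refers to f(x_k), k > s to the free point y_k."""
    s = len(xs)

    def point(k):
        return f(xs[k - 1]) if k <= s else ys[k - s - 1]

    return point(i), point(j)

# Sample configuration in C_{2,1}(f): two points on the knot, one free point.
xs = [(0.0, 0.0), (0.5, 0.0)]
ys = [(0.0, 0.0, 1.0, 0.0)]
u_13 = direction(*big_pi(sample_long_knot, xs, ys, 1, 3))
u_31 = direction(*big_pi(sample_long_knot, xs, ys, 3, 1))
```

Exchanging the two indices composes the direction with the antipodal map of the sphere, which is the elementary fact behind the involution arguments used later in this section.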
We will then denote by $\theta_{ij}$ the pullback of θ by $\Pi_{ij}$. As for the convergence of the integrals appearing in the perturbative expansion, we make the two following observations: (1) There are certainly divergences when a superfield a is paired to a superfield B in the same interaction term (“tadpoles”). The Lie algebra coefficient of tadpoles vanishes if g is unimodular. In general tadpoles are removed by finite renormalization. (2) The remaining terms are integrals over configuration spaces $C_{s,t}(f)$. There exists a compactification $C_{s,t}[f]$ of these spaces [3] such that the above projections are still smooth maps. The integrals over the compactification then automatically converge (but do not differ from the original ones as one has simply added a measure-zero set). For notational convenience in the following we will simply write $C_{s,t}$ instead of $C_{s,t}[f]$. In the organization of the perturbative expansion, it is quite convenient to make use of the following combinatorial Lemma 6.1. The order in ħ equals the degree in Ξ. Proof. Let us consider a Feynman diagram produced by $s_n$ snakes $\sigma_n$, $t_n$ n-gons $\tau_n$ and v interaction vertices. We recall that $\sigma_n$ is of degree n in a and of degree one in B and in Ξ; $\tau_n$ is of degree n in a and contains no Bs or Ξs; each interaction vertex is of degree two in a and of degree one in B and contains no Ξ. Thus, the degree in Ξ of the diagram is $\sum_n s_n$. Moreover,
\[ \text{degree in } a = \sum_n n\,s_n + \sum_n n\,t_n + 2v, \qquad \text{degree in } B = \sum_n s_n + v. \]
By Wick’s theorem these degrees must be equal, so we get the identity $\sum_n (n-1)\,s_n + \sum_n n\,t_n + v = 0$. Recall now that the order in ħ of $\sigma_n$ is (n + 1)/2, whereas the order of $\tau_n$ is n/2. As the order of each interaction vertex is 1/2, the total order of the diagram is
\[ \frac{1}{2}\Bigl( \sum_n (n+1)\,s_n + \sum_n n\,t_n + v \Bigr), \]
which by the previous identity is equal to $\sum_n s_n$. But this is also the degree in Ξ.
Fig. 7. Order 1
6.1. Order 1. The only possible term at order 1 has the form $\Theta_1\,\mathrm{Tr}(\mathrm{ad}\,\Xi)$ with
\[ \Theta_1 = \int_{C_{2,0}} \theta_{12}\,\eta_{12}. \tag{6.2} \]
Observe that this term does not appear if g is unimodular. It is also possible to prove (considering the involution $(x_1, x_2) \to (x_2, x_1)$ of $C_{2,0}$) that $\Theta_1$ vanishes if m is odd. The graphical representation of $\Theta_1$ is displayed in Fig. 7. (From now on we omit in diagrams the black and white strip representing Ξ. In Fig. 7 it would be attached to vertex 1.) In even dimensions, $\Theta_1$ furnishes a function on Imbσ which is a generalization of the self-linking number for ordinary knots. This function is not an invariant. It can be easily proved that, in computing the differential of $\Theta_1$, the only boundary contribution corresponds to the collapse of the two points. One obtains then
\[ d\Theta_1 = -\,p_{1*}\bigl( \Phi^* w_{m-1}\; p_3^* w_{m-3} \bigr), \]
where
\[ \Phi : \mathrm{Imb}_\sigma \times \mathbb{R}^{m-2} \times S^{m-3} \to S^{m-1}, \qquad (f, x, v) \mapsto \frac{df(x)v}{\|df(x)v\|}, \]
and $p_i$ denotes the projection to the i-th factor.16
16 It may be observed that the expression for $d\Theta_1$ is well-defined also when f is just an immersion (and not an imbedding). As a consequence, $d\Theta_1$ may be regarded as a 1-form on the space of immersions of $\mathbb{R}^{m-2}$ into $\mathbb{R}^m$ (that coincide with σ outside a compact set).
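The counting in the proof of Lemma 6.1 can be checked mechanically. The sketch below is an illustration only (the function names and the size of the enumeration are assumptions): it lists small collections of snakes, polygons and interaction vertices, keeps those for which the degrees in a and in B agree, as required by Wick's theorem, and verifies that the order in ħ then equals the degree in Ξ.

```python
from fractions import Fraction
from itertools import product

def degrees(snakes, polygons, v):
    """snakes[n]: number of snakes sigma_n (n >= 0);
    polygons[n]: number of n-gons tau_n (n >= 2);
    v: number of interaction vertices.
    Returns (degree in a, degree in B, degree in Xi, order in hbar)."""
    deg_a = (sum(n * c for n, c in snakes.items())
             + sum(n * c for n, c in polygons.items()) + 2 * v)
    deg_b = sum(snakes.values()) + v
    deg_xi = sum(snakes.values())
    order = Fraction(
        sum((n + 1) * c for n, c in snakes.items())
        + sum(n * c for n, c in polygons.items()) + v, 2)
    return deg_a, deg_b, deg_xi, order

def check_lemma(max_count=2, max_n=3):
    """Enumerate small diagrams; return how many satisfy the Wick constraint."""
    checked = 0
    for s in product(range(max_count + 1), repeat=max_n + 1):      # s_0 .. s_max_n
        for t in product(range(max_count + 1), repeat=max_n - 1):  # t_2 .. t_max_n
            for v in range(max_count + 1):
                snakes = dict(enumerate(s))
                polygons = dict(zip(range(2, max_n + 1), t))
                deg_a, deg_b, deg_xi, order = degrees(snakes, polygons, v)
                if deg_a == deg_b:  # every a must be paired with a B
                    assert order == deg_xi
                    checked += 1
    return checked

# The order-1 term of (6.2) is a single snake sigma_1: one a, one B, one Xi.
single_snake = degrees({1: 1}, {}, 0)
```

For the single snake the four numbers are all equal to one, in agreement with the term of order 1 being linear in Ξ.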
Fig. 8. Order 2
6.2. Order 2. The contributions corresponding to connected diagrams may be written as $\Theta_2\,\mathrm{Tr}((\mathrm{ad}\,\Xi)^2)$, where $\Theta_2$ is graphically represented in Fig. 8 (where white circles denote vertices in $\mathbb{R}^m$ not constrained to lie on the image of the imbedding) and has the following analytical expression:
\[ \Theta_2 = \int_{C_{4,0}} \theta_{13}\theta_{24}\,\eta_{12}\eta_{23} + \frac{1}{2}\int_{C_{4,0}} \theta_{13}\theta_{24}\,\eta_{12}\eta_{34} - \int_{C_{3,1}} \theta_{14}\theta_{24}\theta_{34}\,\eta_{12}. \tag{6.3} \]
It is not difficult to prove that $\Theta_2$ vanishes if m is even (consider the involutions that exchange point 1 with point 3 in the first term, point 1 with point 2 in the last term, and the pair of points (1, 3) with the pair (2, 4) in the second term). In odd dimensions, $\Theta_2$ may be rewritten as
\[ \Theta_2 = \frac{1}{8}\int_{C_{4,0}} \theta_{13}\theta_{24}\,\eta_{1234}^{2} - \frac{1}{3}\int_{C_{3,1}} \theta_{14}\theta_{24}\theta_{34}\,\eta_{123}, \]
where $\eta_{1234}$ and $\eta_{123}$ are the cyclic sums
\[ \eta_{1234} = \eta_{12} + \eta_{23} + \eta_{34} + \eta_{41}, \qquad \eta_{123} = \eta_{12} + \eta_{23} + \eta_{31}. \]
In this form it is clear that $\Theta_2$ is the long-knot version of the invariant of knots introduced by Bott in [2]. Proposition 6.2. $\Theta_2$ is an invariant. Proof. In the computation of $d\Theta_2$, the contributions of the principal faces of the three terms cancel each other as can be easily verified. The vanishing of hidden faces may be easily proved, see [15].17 6.3. Order 3. Connected diagrams sum up to yield a term of the form $\Theta_3\,\mathrm{Tr}((\mathrm{ad}\,\Xi)^3)$ (which clearly vanishes if the Lie algebra is unimodular), where $\Theta_3$ corresponds to the sum of the eight Feynman diagrams displayed in Fig. 9. Its analytical expression is the following:
\[ \begin{aligned} \Theta_3 ={}& \frac{1}{3}\int_{C_{6,0}} \theta_{14}\theta_{26}\theta_{35}\,\eta_{12}\eta_{34}\eta_{56} + \int_{C_{6,0}} \theta_{14}\theta_{26}\theta_{35}\,\eta_{12}\eta_{23}\eta_{45} \\ &- \int_{C_{6,0}} \theta_{14}\theta_{26}\theta_{35}\,\eta_{12}\eta_{23}\eta_{34} + \frac{1}{3}\int_{C_{6,0}} \theta_{14}\theta_{25}\theta_{36}\,\eta_{12}\eta_{23}\eta_{31} \\ &+ \int_{C_{5,1}} \theta_{16}\theta_{36}\theta_{56}\theta_{24}\,\eta_{12}\eta_{34} - \int_{C_{5,1}} \theta_{16}\theta_{36}\theta_{56}\theta_{24}\,\eta_{12}\eta_{23} \\ &- \int_{C_{4,2}} \theta_{16}\theta_{36}\theta_{56}\theta_{25}\theta_{45}\,\eta_{12} + \frac{1}{3}\int_{C_{3,3}} \theta_{14}\theta_{25}\theta_{36}\theta_{45}\theta_{46}\theta_{56}. \end{aligned} \tag{6.4} \]
17 Observe that also in the Chern–Simons knot invariants it is easy to prove that hidden faces do not contribute to diagrams of even order.
Fig. 9. Order 3
In [15] it is proved (by considering suitable involutions) that $\Theta_3$ vanishes if m is odd. In even dimensions,18 the differential of $\Theta_3$ is explicitly computed in [15] and it is proved that the only boundary contribution that may survive in each term is the most degenerate face, i.e., the one corresponding to the collapse of all vertices (and with some more effort it is moreover proved that only the seventh term may yield a nonvanishing contribution). To describe $d\Theta_3$, we first introduce the space $I_{m,m-2}$ of linear injective maps from $\mathbb{R}^{m-2}$ into $\mathbb{R}^m$. Next we consider the map
\[ T : \mathrm{Imb}_\sigma \times \mathbb{R}^{m-2} \to I_{m,m-2}, \qquad (f, x) \mapsto df(x). \]
Then we may write
\[ d\Theta_3 = p_{1*}\,T^*\,\widehat{\Theta}_3, \]
18 In this case $\Theta_3$ may also be written as
\[ \begin{aligned} \Theta_3 ={}& -\frac{1}{24}\int_{C_{6,0}} \theta_{14}\theta_{25}\theta_{36}\,\eta_{1245}\,\eta_{1346}\,\eta_{2356} - \frac{1}{6}\int_{C_{5,1}} \theta_{16}\theta_{36}\theta_{56}\theta_{24}\,\eta_{1234}\,\eta_{2345} \\ &- \frac{1}{4}\int_{C_{4,2}} \theta_{16}\theta_{36}\theta_{56}\theta_{25}\theta_{45}\,\eta_{1234} + \frac{1}{3}\int_{C_{3,3}} \theta_{14}\theta_{25}\theta_{36}\theta_{45}\theta_{46}\theta_{56}, \end{aligned} \]
with $\eta_{ijkl} := \eta_{ij} - \eta_{jk} + \eta_{kl} - \eta_{li}$ for any 4-tuple of distinct indices. In this form, $\Theta_3$ may easily be reinterpreted as a function on the space of imbeddings of $S^{m-2}$ into $\mathbb{R}^m$.
where $p_1$ is the projection onto the first factor and the “anomaly” $\widehat{\Theta}_3$ is an (m − 1)-form that can be explicitly described as follows. Given $\alpha \in I_{m,m-2}$, one defines the following action on $C_{s,t}(\alpha)$ of the group $I = \mathbb{R}^+_* \ltimes \mathbb{R}^{m-2}$ of dilations and translations of $\mathbb{R}^{m-2}$:
\[ x_i \mapsto \lambda x_i \ (i = 1, \dots, s), \qquad y_i \mapsto \lambda y_i \ (i = 1, \dots, t), \qquad \lambda \in \mathbb{R}^+_*, \]
\[ x_i \mapsto x_i + a \ (i = 1, \dots, s), \qquad y_i \mapsto y_i + \alpha(a) \ (i = 1, \dots, t), \qquad a \in \mathbb{R}^{m-2}. \]
One then defines $\widehat{C}_{s,t}(\alpha)$ as the quotient of $C_{s,t}(\alpha)$ by I. Denoting by $\widehat{C}^{\,m,m-2}_{s,t} \to I_{m,m-2}$ the fiber bundle with fiber $\widehat{C}_{s,t}(\alpha)$ over α, one may write $\widehat{\Theta}_3$ as a sum of integrals along the fibers of $\widehat{C}^{\,m,m-2}_{s,t}$, where the integrand form is given by the same products of propagators $\eta_{ij}$ and $\theta_{ij}$ as before, with the only modification that $\Pi_{ij}$, see (6.1), is now defined in terms of the linear map α (instead of f), over which the fiber lies. In general, we do not know if $\Theta_3$ is an invariant. We briefly describe however a possible strategy to correct it. Let $V_{m,m-2}$ be the Stiefel manifold (regarded as the space of linear isometries from $\mathbb{R}^{m-2}$ into $\mathbb{R}^m$ w.r.t. the Euclidean metrics). Observe that $V_{m,m-2}$ is equipped with a left action of SO(m) and a free, right action of SO(m − 2). Let us denote by ι the inclusion of $V_{m,m-2}$ into $I_{m,m-2}$ and by r some deformation retract; viz., r is a map from $I_{m,m-2}$ to $V_{m,m-2}$ such that ι ◦ r is homotopic to the identity (the existence of such a retract may be proved, e.g., by the Gram–Schmidt orthogonalization procedure). Let h be a given homotopy, i.e., a map $[0, 1] \times I_{m,m-2} \to I_{m,m-2}$ such that h(0, α) = α and h(1, α) = ι(r(α)). Define
\[ \check{\Theta}_3 = \mathrm{pr}_{2*}\,h^*\,\widehat{\Theta}_3, \]
where $\mathrm{pr}_2$ denotes the projection onto the second factor. Given the explicit form of $\widehat{\Theta}_3$, one can prove that it is closed. Thus, we obtain
\[ \widehat{\Theta}_3 = -\,d\check{\Theta}_3 + r^*\iota^*\,\widehat{\Theta}_3. \]
It is now possible to show that $\bar{\Theta}_3 := \iota^*\widehat{\Theta}_3$ is SO(m − 2) × SO(m)-invariant.19 If m = 4, we can moreover prove that $\bar{\Theta}_3$ is also SO(m − 2)-horizontal; hence it is the pullback of an SO(4)-invariant 3-form on the Grassmannian $\mathrm{Gr}_{4,2}$. Since the only such form is zero, it follows that in four dimensions $\bar{\Theta}_3 = 0$ and we get the following Proposition 6.3. $\Theta_3' := \Theta_3 + p_{1*}T^*\check{\Theta}_3$ is an invariant of long 2-knots. As far as we know, this invariant is new. Observe that also for m > 4 one may define $\Theta_3'$. It turns out from the previous considerations that $d\Theta_3' = p_{1*}T^*r^*\bar{\Theta}_3$. Thus, though in general we cannot claim that $\Theta_3'$ is an invariant, we can compute its differential in terms of an invariant form on the Stiefel manifold. This implies that, when $d\Theta_3'$ does not vanish, we may use it to correct the potential invariants coming from higher orders in perturbation theory, as explained in the next subsection.
19 Briefly, this is true since it is possible to extend the actions of SO(m) and SO(m − 2) on $V_{m,m-2}$ to the whole (restricted) bundle $\iota^*\widehat{C}^{\,m,m-2}_{s,t} \to V_{m,m-2}$ in such a way that the projection as well as the maps to $S^{m-1}$ and $S^{m-3}$ used in the definitions of the propagators are all equivariant. Recall finally that the volume forms $w_{m-3}$ and $w_{m-1}$ are invariant.
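The retraction r and the homotopy h of Subsect. 6.3 can be made concrete with the Gram–Schmidt procedure mentioned there. The following sketch shows one possible choice, not necessarily the one used in [15]: an injective linear map α is stored as a list of its k = m − 2 column vectors in R^m and factored as α = QR with Q having orthonormal columns and R upper triangular with positive diagonal; the path Q((1 − t)R + tI) then stays injective for all t in [0, 1]. All function names are assumptions made for this illustration.

```python
import math

def gram_schmidt(alpha):
    """alpha: list of k linearly independent vectors of R^m (the columns of the map).
    Returns (q, r): q is a list of k orthonormal vectors, r a k x k upper-triangular
    matrix with positive diagonal such that alpha[j] = sum_i r[i][j] * q[i]."""
    k = len(alpha)
    q = []
    r = [[0.0] * k for _ in range(k)]
    for j, col in enumerate(alpha):
        w = list(col)
        for i, qi in enumerate(q):
            r[i][j] = sum(a * b for a, b in zip(qi, col))
            w = [wc - r[i][j] * qc for wc, qc in zip(w, qi)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm < 1e-12:
            raise ValueError("columns are not linearly independent")
        r[j][j] = norm
        q.append([c / norm for c in w])
    return q, r

def retract(alpha):
    """One choice of r: send alpha to the isometry Q of its QR factorization."""
    return gram_schmidt(alpha)[0]

def homotopy(t, alpha):
    """h(t, alpha) = Q((1 - t) R + t I), so h(0, alpha) = alpha and h(1, alpha) = Q.
    The middle factor is upper triangular with positive diagonal, hence invertible."""
    q, r = gram_schmidt(alpha)
    k, m = len(alpha), len(alpha[0])
    cols = []
    for j in range(k):
        coeffs = [(1.0 - t) * r[i][j] + (t if i == j else 0.0) for i in range(k)]
        cols.append([sum(coeffs[i] * q[i][c] for i in range(k)) for c in range(m)])
    return cols

# Example with m = 4: a non-isometric injective map R^2 -> R^4.
alpha = [[1.0, 1.0, 0.0, 0.0], [1.0, 0.0, 1.0, 0.0]]
h0 = homotopy(0.0, alpha)
h1 = homotopy(1.0, alpha)
```

Here `h1` coincides with `retract(alpha)`, and `h0` reproduces `alpha` up to rounding.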
6.4. Higher orders. Higher-order terms may be explicitly computed. In [15] some vanishing lemmata are proved which imply that only the most degenerate faces (i.e., when all points collapse) contribute to the differential of the corresponding functions on Imbσ . One can then prove that in odd dimensions also these contributions vanish. One then obtains genuine invariants of long (m − 2)-knots with m odd. In even dimensions, one may repeat the considerations of the previous subsection. In particular, in four dimensions one may construct genuine invariants of long 2-knots. For m > 4, this construction yields an infinite set of functions on Imbσ whose differentials are pullbacks of SO(m − 2) × SO(m)-invariant (m − 1)-forms on Vm,m−2 . Since the space of such forms is finite dimensional [13], one may produce an infinite set of invariants by taking suitable linear combinations. This is the higher-dimensional analogue of the procedure used in [3] to kill the (possible) anomalies in the perturbative expansion of Chern–Simons theory with covariant gauge fixing.20
6.5. Other observables. The new observable we have introduced in this paper is not the only known observable for BF theories. For example, the usual Wilson loop Wρ (γ )(A) = Tr(ρ(Hol(A, γ ))), where ρ is a representation of the Lie group G and γ an imbedding of S 1 , is an observable; more generally, one also has the generalized Wilson loops introduced in [7, 9], whose expectation values yield cohomology classes on the space of imbeddings of S 1 . The expectation value of the usual Wilson loop is rather trivial (the dimension of the representation space) since the degree in a cannot be matched by the degree in B. The mixed expectation value of U0 and Wρ is more interesting. If γ does not intersect f , the product defines an observable, and one can show that
U0 (f ) Wρ (γ ) = U0 (f ) Tr e− lk(f,γ ) ρ∗ () ,
(6.5)
where $\rho_*$ is the induced representation of $\mathfrak{g}$ and $\mathrm{lk}(f,\gamma)$ is the linking number between (the images of) $f$ and $\gamma$. It can be written as
\[ \mathrm{lk}(f,\gamma) = \int_{\mathbb{R}^{m-2}\times S^1} \varphi^*\theta_{12}, \]
where
\[ \varphi : \mathbb{R}^{m-2}\times S^1 \to C_2(\mathbb{R}^m), \qquad (x,t) \mapsto (f(x),\gamma(t)). \]
The result in (6.5) is tantamount to saying that the only connected diagram arising from the $n$th order in $W_\rho$ expanded in powers of $a$ is the one obtained by joining each of these $n$ $a$'s to a short snake $\sigma_0$. This result is purely combinatorial after observing that either joining the last $a$ of a snake to the $B$ of a $\sigma_0$ or joining the two $a$'s of an interaction vertex to the $B$'s of two $\sigma_0$'s yields a factor [ , ] which clearly vanishes.

20 In this three-dimensional case, the anomaly is an SO(3)-invariant 2-form on the Stiefel manifold $V_{3,1}$, which may be identified with the 2-sphere. Since the space of such forms is 1-dimensional, a single potential invariant (e.g., the self-linking number) is enough to correct all others.
7. Final Comments In this paper we have introduced a new observable for BF theories that is associated to imbeddings of codimension two. We list here some possible follow-ups of our work.
7.1. Yang–Mills theory. In [4], Yang–Mills theory is regarded as a deformation, called BFYM theory, of BF theory with deformation parameter the coupling constant gYM . In this setting O(A, B, f ) becomes an observable for the BFYM theory in the topological limit gYM → 0. Moreover, in this limit the expectation value of this observable times a Wilson loop is still given by (6.5). Thus, O might constitute the topological limit of a dual ’t Hooft variable [18]. 7.2. Nonabelian gerbes. Assume B to be a two form (in the context of BF theories, we assume then that we are working in four dimensions). In the abelian case, the observable (1.1) defines a connection for the gerbe defined by B [17]; in this case, it is interesting to consider also the case when N has boundary. A suitable extension of our observable O to this case would then be a candidate for a connection on a nonabelian gerbe. 7.3. Classical Hamiltonian viewpoint. For M of the form M0 × R, the reduced phase space of BF theory is the space of pairs (A, B), with A a flat connection on M0 and B a covariantly closed (m − 2)-form of the coadjoint type, modulo symmetries. The Poisson algebra generated by generalized Wilson loops is considered in [8] and, in the case G = GLn it is proved to be related to the Chas–Sullivan string topology [11]. It would be interesting to see which new structure one may obtain by considering the Poisson algebra generated by generalized Wilson loops and, in addition, our new observables.
7.4. Cohomology classes of imbeddings of even codimension. In Sect. 6 we have described how the perturbative expansion produces (potential) invariants of long knots. The same formulae may be used to define forms on the space of imbeddings Imbsσ of Rm−2s into Rm (with fixed linear behavior σ at infinity) with s > 1; up to hidden faces, these forms are closed (they certainly are so for m odd). This way, we produce cohomology classes on Imbsσ . 7.5. Graph cohomology. Generalizing [5], one can define a graph cohomology for graphs with two types of vertices (corresponding to points on the imbedding and in the ambient space) and two types of edges (corresponding to the two types of propagators) such that the “integration map” that associates to a graph the corresponding integral over configuration spaces is a chain map up to hidden faces. The Feynman diagrams discussed in this paper produce then nontrivial classes in this graph cohomology. We plan to discuss all this in detail in [10]. Acknowledgements. We thank J. Stasheff for his very useful comments and for revising a first version of the manuscript. We also thank R. Longoni and D. Indelicato for discussions on the material presented here.
References
1. Batalin, I.A., Vilkovisky, G.A.: Relativistic S-matrix of dynamical systems with boson and fermion constraints. Phys. Lett. 69 B, 309–312 (1977); Fradkin, E.S., Fradkina, T.E.: Quantization of relativistic systems with boson and fermion first- and second-class constraints. Phys. Lett. 72 B, 343–348 (1978)
2. Bott, R.: Configuration spaces and imbedding invariants. In: Proceedings of the 4th Gökova Geometry–Topology Conference. Tr. J. Math. 20, 1–17 (1996)
3. Bott, R., Taubes, C.: On the self-linking of knots. J. Math. Phys. 35, 5247–5287 (1994)
4. Cattaneo, A.S., Cotta-Ramusino, P., Fucito, F., Martellini, M., Rinaldi, M., Tanzini, A., Zeni, M.: Four-dimensional Yang–Mills theory as a deformation of topological BF theory. Commun. Math. Phys. 197, 571–621 (1998)
5. Cattaneo, A.S., Cotta-Ramusino, P., Longoni, R.: Configuration spaces and Vassiliev classes in any dimension. Algebra. Geom. Topol. 2, 949–1000 (2002)
6. Cattaneo, A.S., Cotta-Ramusino, P., Rinaldi, M.: Loop and path spaces and four-dimensional BF theories: connections, holonomies and observables. Commun. Math. Phys. 204, 493–524 (1999)
7. Cattaneo, A.S., Cotta-Ramusino, P., Rossi, C.A.: Loop observables for BF theories in any dimension and the cohomology of knots. Lett. Math. Phys. 51, 301–316 (2000)
8. Cattaneo, A.S., Fröhlich, J., Pedrini, B.: Topological field theory interpretation of string topology. Commun. Math. Phys. 240, 397–421 (2003)
9. Cattaneo, A.S., Rossi, C.A.: Higher-dimensional BF theories in the Batalin–Vilkovisky formalism: the BV action and generalized Wilson loops. Commun. Math. Phys. 221, 591–657 (2001)
10. Cattaneo, A.S., Rossi, C.A.: Configuration space invariants of higher dimensional knots. In preparation
11. Chas, M., Sullivan, D.: String topology. http://arxiv.org/list/math/9911159, 1999
12. Fock, V.V., Nekrasov, N.A., Rosly, A.A., Selivanov, K.G.: What we think about the higher dimensional Chern–Simons theories, ITEP-91-70, Jul 1991; Sakharov Conf. 465–472 (1991)
13. Greub, W., Halperin, S., Vanstone, R.: Connections, Curvature and Cohomology. Vol. II: Lie Groups, Principal Bundles, Characteristic Classes, Pure and Applied Mathematics 47 II, New York–London: Academic Press, 1973
14. Kontsevich, M.: Feynman diagrams and low-dimensional topology. First European Congress of Mathematics, Paris 1992, Volume II, Progress in Mathematics 120, Basel: Birkhäuser, 1994, pp. 97–121
15. Rossi, C.: Invariants of Higher-Dimensional Knots and Topological Quantum Field Theories, Ph. D. thesis, Zurich University 2002, http://www.math.unizh.ch/∼asc/RTH.ps
16. Schwarz, A.S.: The partition function of degenerate quadratic functionals and Ray–Singer invariants. Lett. Math. Phys. 2, 247–252 (1978)
17. Segal, G.: Topological structures in string theory. Phil. Trans. R. Soc. Lond. A 359, 1389–1398 (2001)
18. ’t Hooft, G.: On the phase transition towards permanent quark confinement. Nucl. Phys. B 138, 1 (1978); A property of electric and magnetic flux in nonabelian gauge theories. Nucl. Phys. B 153, 141 (1979)
19. Witten, E.: Quantum field theory and the Jones polynomial. Commun. Math. Phys. 121, 351–399 (1989)

Communicated by M.R. Douglas
Commun. Math. Phys. 256, 539–564 (2005) Digital Object Identifier (DOI) 10.1007/s00220-005-1344-3
Communications in
Mathematical Physics
The Volume of the Moduli Space of Flat Connections on a Nonorientable 2-Manifold
Nan-Kuo Ho (1), Lisa C. Jeffrey (2)
(1) Department of Mathematics, National Cheng-Kung University, Taiwan, R.O.C. E-mail: [email protected]; [email protected]
(2) Department of Mathematics, University of Toronto, Toronto, Ontario M5S 3G3, Canada. E-mail: [email protected]
Received: 29 August 2003 / Accepted: 13 January 2005 Published online: 15 April 2005 – © Springer-Verlag 2005
Abstract: We compute the Riemannian volume of the moduli space of flat connections on a nonorientable 2-manifold, for a natural class of metrics. We also show that Witten’s volume formula for these moduli spaces may be derived using Haar measure, and we give a new proof of Witten’s volume formula for the moduli space of flat connections on a 2-manifold using Haar measure. 1. Introduction In [W] Witten defined and computed a volume on the moduli space of gauge equivalence classes of flat connections on a 2-manifold, using Reidemeister-Ray-Singer torsion (see e.g. [Fr]). When the 2-manifold is orientable, Witten proved that this volume is equal to the symplectic volume on the moduli space. However, when the 2-manifold is not orientable the moduli space does not have a symplectic form, so the interpretation of this volume has been unclear. The moduli space of gauge equivalence classes of flat connections on a nonorientable 2-manifold turns out to be a Lagrangian submanifold of the moduli space of gauge equivalence classes of flat connections on the orientable double cover. In this article we compute its Riemannian volume for a natural class of metrics on the 2-manifold. The dependence on the choice of metric is discussed in Remarks 1 and 2 below. The layout of this article is as follows. In Sect. 2 we compute the Riemannian volume of the moduli space of flat connections on a nonorientable 2-manifold (exhibited as the connected sum of a Riemann surface with either one or two copies of RP 2 ) using a metric derived from a particular class of metrics on RP 2 − {disc}. In Sect. 3 we give a new proof that Witten’s formula for the volume of the moduli space of flat connections on 2-manifolds (for nonorientable surfaces with no boundary and orientable surfaces with one boundary component) arises from Haar measure.
The first author was partially supported by grants from OGS and OGSST. The second author was partially supported by a grant from NSERC.
2. Metrics on the Moduli Space

2.1. Preliminaries. By the classification of 2-manifolds [Ma], all nonorientable 2-manifolds are obtained as the connected sum of a Riemann surface with either one copy of the real projective plane $\mathbb{RP}^2$ (denoted $P$) or two copies of $P$ (which is equivalent to the connected sum with a Klein bottle $K$). Given a Riemannian metric on the Möbius strip $P \setminus D$, we obtain a Riemannian metric on the connected sum $\Sigma \# P$ or $\Sigma \# P \# P$, where $\Sigma$ is an oriented 2-manifold equipped with a Riemannian metric. We assume that a Riemannian metric has been chosen on $\Sigma \setminus D$. We view $P \setminus D$ as formed by gluing together two sides of equal length of a triangle $\Delta$. We define a metric on $P \setminus D$ by endowing $\Delta$ with a Riemannian metric in which two sides are of equal length and using geodesic polar coordinates about one vertex $p$ of the triangle. In such coordinates $(\rho,\sigma)$ (where $\rho$ is the arc length from $p$ and $\sigma$ is the angular variable) the metric takes the form ([dC] Sect. 4.6)
\[ ds^2 = d\rho^2 + g_{22}(\rho,\sigma)\,d\sigma^2. \tag{1} \]
We now assume that the triangle $\Delta$ is a geodesic triangle. In geodesic polar coordinates, the Hodge star operator is then
\[ *d\rho = \sqrt{g_{22}}\,d\sigma, \qquad *d\sigma = -\frac{1}{\sqrt{g_{22}}}\,d\rho. \tag{2} \]
We make the assumption that $g_{22} = g_{22}(\rho)$ depends only on $\rho$ and is independent of $\sigma$. This assumption is satisfied by the three important examples of metrics with constant scalar curvature:
(1) Spherical metric with scalar curvature $+1$: $g_{22}(\rho) = \sin^2(\rho)$,
(2) Euclidean metric (with scalar curvature 0): $g_{22}(\rho) = \rho^2$,
(3) Hyperbolic metric with scalar curvature $-1$: $g_{22}(\rho) = \sinh^2(\rho)$.
We consider the two triangles shown in Fig. 1. We use geodesic polar coordinates at the vertex $x_0$. The curves $\gamma_1$ and $\gamma_2$ are radial geodesics with constant $\sigma$. The curve $\gamma_3$ is the geodesic joining the two vertices $x_1$ and $x_2$. The curve $C_1$ is the curve $\rho = \text{constant}$ between $x_1$ and $x_2$; it is not a geodesic. The triangle $\Delta$ is bounded by $\gamma_1, \gamma_2, \gamma_3$; it is a geodesic triangle. The triangle $\Delta_0$ is bounded by $\gamma_1, \gamma_2, C_1$. We can find the integral over the triangle $\Delta_0$ bounded by $\gamma_1, \gamma_2, C_1$, where $C_1$ is the curve specified by the equation $\rho = L = \text{constant}$.

Let $G$ be a compact connected Lie group; the moduli space of flat connections on $\Delta$ modulo based gauge transformations (i.e. gauge transformations which take the value identity on the three vertices) can be identified with $G \times G$. The moduli space of flat connections on $P \setminus D$ modulo gauge transformations which are trivial at one point on the boundary (corresponding to the vertices of the geodesic triangle) is equivalent to the subspace of the moduli space of flat connections on the geodesic triangle (modulo based gauge transformations) which correspond under an orientation reversing map which identifies two sides of the triangle $\Delta$. This is $\operatorname{Hom}(\pi, G)$, where $\pi = \pi_1(P \setminus D)$, in other words
\[ \operatorname{Hom}(\pi, G) = \{(u,v) \in G \times G \mid uv^2 = 1\} \cong G. \]
Here $v$ is the holonomy along $\gamma_1$ and $-\gamma_2$, while $u$ is the holonomy along $\gamma_3$.
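Fig. 1. Triangles $\Delta$ and $\Delta_0$ (vertices $x_0$, $x_1$, $x_2$; angle $\varphi$ at $x_0$; edges $\gamma_1$, $\gamma_2$, $\gamma_3$; arc $C_1$; region $D$ between $\gamma_3$ and $C_1$)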
On the space of flat connections on $\Delta$, there is a metric
\[ \langle a, b\rangle = \int_\Delta \operatorname{Tr}(a \wedge *b), \tag{3} \]
where $*$ is the Hodge star operator on differential forms over $\Delta$. We will study the corresponding metric on $G \times G$. Define the metric $\langle\,,\rangle$ at a flat connection $A$ corresponding to a point $p$ in $G \times G$ to be
\[ \langle d_A\xi, d_A\eta\rangle := \int_\Delta \operatorname{Tr}(d_A\xi \wedge * d_A\eta), \]
where $\xi(\cdot)$ and $\eta(\cdot)$ are $\mathfrak{g}$-valued functions on $\Delta$, so $d_A\xi$ and $d_A\eta$ represent elements of the tangent space to $G \times G$ at $p$. Here, $\operatorname{Tr}(\cdot)$ represents the ad-invariant inner product on the Lie algebra. In the case when $G$ is abelian, the holonomy of $d\xi$ along $\gamma_1$ is
\[ \operatorname{Hol}_{\gamma_1} d\xi = \exp\int_{\gamma_1} d\xi = \exp\bigl(\xi(\rho = L_1, \sigma = 0) - \xi(\rho = 0, \sigma = 0)\bigr), \tag{4} \]
and its holonomy along $\gamma_2$ is
\[ \operatorname{Hol}_{\gamma_2} d\xi = \exp\int_{\gamma_2} d\xi = \exp\bigl(\xi(\rho = L_2, \sigma = \varphi) - \xi(\rho = 0, \sigma = \varphi)\bigr). \tag{5} \]
We will study these holonomies once we have determined $\xi$. From now on, we only consider the inner product at $A = 0$. The reason is that the metric $\langle a, b\rangle = \int_\Delta \operatorname{Tr}(a \wedge *b)$ is ad-invariant (where $*$ denotes the Hodge star operator), and since the gauge action at the infinitesimal level is the adjoint action, this metric is invariant under the action of the gauge group and in particular under the action of the based gauge group. Since the space $\Delta$ is contractible, any flat connection is gauge equivalent to the trivial connection on a triangle. In other words all infinitesimal flat connections (elements of the tangent space of the space of flat connections modulo based gauge transformations) can be expressed as $d\xi$ for some infinitesimal gauge transformation $\xi : \Delta \to \mathfrak{g}$. So we need only consider the inner product at $A = 0$.
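We want to associate a norm $\langle d\xi, d\xi\rangle$, where $\xi \in \Omega^0(P \setminus D) \otimes \mathfrak{g}$. This will be done by defining
\[ \langle d\xi, d\xi\rangle := \int_\Delta \operatorname{Tr}(d\xi \wedge *d\xi). \tag{6} \]
Using Stokes' theorem, this can be converted to a line integral around the boundary of the triangle if we have $d * d\xi = 0$: this follows because
\[ \int_\Delta \operatorname{Tr}(d\xi \wedge *d\xi) = \int_\Delta \operatorname{Tr}\bigl[d(\xi \wedge *d\xi) - \xi\, d * d\xi\bigr] = \int_{\partial\Delta} \operatorname{Tr}\,\xi * d\xi - \int_\Delta \operatorname{Tr}\,\xi\, d * d\xi. \]
In fact, we may assume $\xi$ satisfies
\[ d * d\xi = 0. \tag{7} \]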
This condition represents a transversal to the orbit of the based gauge group; our procedure is analogous to identifying de Rham cohomology classes with harmonic forms (forms $\alpha$ satisfying $d\alpha = 0$ and $d * \alpha = 0$). Note that the space $G \times G$ is isomorphic to the space of flat connections on $\Delta$ modulo based gauge transformations (i.e. gauge transformations which take the value 1 at the vertices). Each equivalence class may be written as $d\xi$ for some $\xi : \Delta \to \mathfrak{g}$ (which does not take value 1 at all the vertices, unless one wishes to represent the trivial flat connection). To solve Eq. (7), first, let us recall some geometry. Recall that our Riemannian metric in geodesic polar coordinates is
\[ ds^2 = d\rho^2 + g_{22}(\rho)\,d\sigma^2, \]
where $\rho$ is the distance from a chosen point $x_0$ and $\sigma$ is the polar angle with respect to this point. Let $*$ denote the Hodge star operator on the 2-manifold; the star operator with respect to these polar coordinates is
\[ *d\rho = \sqrt{g_{22}(\rho)}\,d\sigma, \qquad *d\sigma = \frac{-1}{\sqrt{g_{22}(\rho)}}\,d\rho. \]
We use the substitution
\[ u(\rho) = \int^{\rho} \frac{1}{\sqrt{g_{22}(t)}}\,dt \tag{8} \]
and define $\tau(\rho) = \exp u(\rho)$. The equations for the Hodge star operator become
\[ *du = d\sigma, \qquad *d\sigma = -du. \]
We now have to solve the equation
\[ d(*d\xi) = 0. \tag{9} \]
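We have
\[ d\xi = \frac{\partial\xi}{\partial u}\,du + \frac{\partial\xi}{\partial\sigma}\,d\sigma \]
so
\[ d(*d\xi) = d\left(\frac{\partial\xi}{\partial u}\,d\sigma - \frac{\partial\xi}{\partial\sigma}\,du\right) = \left(\frac{\partial^2\xi}{\partial\sigma^2} + \frac{\partial^2\xi}{\partial u^2}\right) du \wedge d\sigma = (\xi_{\sigma\sigma} + \xi_{uu})\,du \wedge d\sigma. \]
So the equation we need to solve is
\[ \xi_{\sigma\sigma} + \xi_{uu} = 0. \tag{10} \]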
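Let us take a Fourier decomposition of $\xi$ in the $\sigma$ variable:
\[ \xi(u,\sigma) = \int_{\mathbb{R}} \hat\xi(u,k)\,e^{ik\sigma}\,dk. \]
So (10) becomes
\[ \frac{\partial^2}{\partial u^2}\hat\xi(u,k) - k^2\,\hat\xi(u,k) = 0. \tag{11} \]
Equation (11) has the following solutions:
\[ \hat\xi(u,k) = C_+(k)\,\tau^{k} + C_-(k)\,\tau^{-k}. \]
Recall that we had defined $\tau = \exp(u)$. Imposing the condition that $\hat\xi(u,k)$ is finite at $\rho = 0$ (where $\tau = 0$), we get $\hat\xi(u,k) = C_+(k)\,\tau^{k}$ when $k > 0$, and $\hat\xi(u,k) = C_-(k)\,\tau^{|k|}$ when $k < 0$. Recall that $\varphi$ is the polar angle of the geodesic triangle. Let us impose the additional constraint that $\xi(u,\sigma)$ attains its maximum on two edges of the triangle ($\gamma_1$, which is $\sigma = 0$, and $\gamma_2$, which is $\sigma = \varphi$), in other words $\frac{\partial}{\partial\sigma}\xi(\rho,\sigma) = 0$ when $\sigma = 0$ and $\sigma = \varphi$. This leads to
\[ \xi(u,\sigma) = Y\,\tau^{k}\cos k\sigma, \tag{12} \]
where $Y \in \mathfrak{g}$ is a constant and $k = \pi/\varphi$. We compute that
\[ \int_{C_1} \operatorname{Tr}(\xi \wedge *d\xi) = \int_0^{\varphi} \langle \xi, \partial\xi/\partial u\rangle\,d\sigma = \langle Y, Y\rangle\,k\,\exp\bigl(2k\,u(L)\bigr)\int_0^{\varphi}\cos^2 k\sigma\,d\sigma = \langle Y, Y\rangle\,\frac{\pi}{2}\,\exp\bigl(2k\,u(L)\bigr). \tag{13} \]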
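The two facts used in this step can be checked numerically. The following is only a sketch under stated assumptions: it takes $\langle Y, Y\rangle = 1$, picks an arbitrary angle `phi = 1.1`, and uses hypothetical helper names (`xi`, `laplacian_residual`, `angular_integral`) that do not come from the paper. It tests that $\tau^k\cos k\sigma$ with $\tau = e^u$ satisfies (10), and that $k\int_0^{\varphi}\cos^2 k\sigma\,d\sigma = \pi/2$ when $k = \pi/\varphi$.

```python
import math

def xi(u, sigma, k):
    """Scalar profile of xi = Y * tau^k * cos(k sigma), with tau = exp(u) and <Y, Y> = 1."""
    return math.exp(k * u) * math.cos(k * sigma)

def laplacian_residual(u, sigma, k, h=1e-3):
    """Central-difference estimate of xi_uu + xi_sigma_sigma at the point (u, sigma)."""
    d2u = (xi(u + h, sigma, k) - 2 * xi(u, sigma, k) + xi(u - h, sigma, k)) / h**2
    d2s = (xi(u, sigma + h, k) - 2 * xi(u, sigma, k) + xi(u, sigma - h, k)) / h**2
    return d2u + d2s

def angular_integral(phi, n=20000):
    """Midpoint rule for k * int_0^phi cos^2(k sigma) d sigma, with k = pi / phi."""
    k = math.pi / phi
    step = phi / n
    return k * step * sum(math.cos(k * (i + 0.5) * step) ** 2 for i in range(n))

phi = 1.1                      # assumed sample angle, not a value from the paper
k = math.pi / phi
residual = laplacian_residual(-0.4, 0.3, k)
integral = angular_integral(phi)
print("Laplace residual:", residual)
print("k * integral of cos^2:", integral, "vs pi/2 =", math.pi / 2)
```

The residual is small but not zero because it is a finite-difference estimate; the angular integral does not depend on the choice of `phi`.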
Note that writing $\xi$ in terms of its Fourier decomposition accomplishes the same thing as solving the equation $d * d\xi = 0$ by the method of separation of variables. We solved this equation in the previous subsection. Thus we obtain
\[ \xi(\rho,\sigma) = Y\,\tau(\rho)^{k}\cos k\sigma, \qquad k = \pi/\varphi, \tag{14} \]
so that the maximum of $|\xi|$ is achieved on the edges $\sigma = 0$ and $\sigma = \varphi$, and $Y \in \mathfrak{g}$ is a variable we choose so that we can get the desired holonomy along the boundary. The holonomies were specified by Eqs. (4) and (5). Then
\[ \int_{C_1} \operatorname{Tr}(\xi \wedge *d\xi) = \langle Y, Y\rangle\,k\,\tau(L)^{2k}\int_0^{\varphi} d\sigma\,\cos^2 k\sigma = \langle Y, Y\rangle\,\frac{\pi}{2}\,\tau(L)^{2k}, \qquad \int_{\gamma_1} \operatorname{Tr}(\xi \wedge *d\xi) = \int_{\gamma_2} \operatorname{Tr}(\xi \wedge *d\xi) = 0. \tag{15} \]
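Thus (referring to Fig. 1) the integral over the geodesic triangle $\Delta$ bounded by $\gamma_1, \gamma_2, \gamma_3$ is the integral over the triangle $\Delta_0$ bounded by $\gamma_1, \gamma_2, C_1$ minus the integral over the region $D$:
\[ \int_{\Delta} \operatorname{Tr}(d\xi \wedge *d\xi) = \int_{\Delta_0} \operatorname{Tr}(d\xi \wedge *d\xi) - \int_{D} \operatorname{Tr}(d\xi \wedge *d\xi). \tag{16} \]
This leads to
\[ \int_{\Delta} \operatorname{Tr}(d\xi \wedge *d\xi) = \int_{C_1} \operatorname{Tr}(\xi \wedge *d\xi) - \int_{D} \operatorname{Tr}(d\xi \wedge *d\xi). \tag{17} \]
The integral over $D$ is given as follows:
\begin{align*}
\int_{D} \operatorname{Tr}(d\xi \wedge *d\xi) &= \langle Y, Y\rangle\,k^2 \int_{D} \tau^{2k}\left(\frac{\cos^2 k\sigma}{\tau} + \frac{\sin^2 k\sigma}{\tau}\right) d\tau \wedge d\sigma \\
&= \langle Y, Y\rangle\,k^2 \int_{D} \tau^{2k}\,\frac{1}{\tau}\,d\tau \wedge d\sigma \\
&= \langle Y, Y\rangle\,k^2 \int_0^{\varphi} d\sigma \int_{\tau(\gamma_3(\sigma))}^{\tau(C_1(\sigma))} \frac{d\tau}{\tau}\,\tau^{2k} \\
&= \langle Y, Y\rangle\,k^2 \int_0^{\varphi} \left(\frac{\tau^{2k}(C_1(\sigma))}{2k} - \frac{\tau^{2k}(\gamma_3(\sigma))}{2k}\right) d\sigma, \tag{18}
\end{align*}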
where $\tau(C_1(\sigma))$ is a constant independent of $\sigma$.

Remark 1. If $g$ is a Riemannian metric on $P \setminus D$ then the quantity $\langle d\xi, d\xi\rangle_g$ (associated to the metric $g$) is equal to the quantity $\langle d\xi, d\xi\rangle_{g e^{w}}$, where $w : P \setminus D \to \mathbb{R}$ is a $C^\infty$ function. In other words our definition of the metric $\langle d\xi, d\xi\rangle_g$ is invariant under conformal transformations on $P \setminus D$. To see this, we write the metric as
\[ g = \begin{pmatrix} g_{11} & g_{12} \\ g_{21} & g_{22} \end{pmatrix} \]
with inverse
\[ g^{-1} = \begin{pmatrix} g^{11} & g^{12} \\ g^{21} & g^{22} \end{pmatrix}. \]
We observe that the Hodge star operator is unchanged under $g \to g e^{w}$ (where $w : P \setminus D \to \mathbb{R}$), since $\det g$ transforms to $(\det g)\,e^{2w}$ while $g^{11}$ transforms to $g^{11} e^{-w}$. The Hodge star operator for coordinates $x_1, x_2$ is
\[ *dx_1 = g^{11}\sqrt{\det g}\;dx_2. \]
Hence $g^{11}\sqrt{\det g}$ is unchanged. Thus the norm on $d\xi$ computed using metrics of the form
\[ ds^2 = d\rho^2 + g_{22}(\rho)\,d\sigma^2 \tag{19} \]
also gives the norm for metrics g conformally equivalent to those of the form (19).

Remark 2. In fact in dimension 2 every Riemannian metric is locally diffeomorphic to one which is conformally equivalent to a metric of constant curvature (see [d’H] Sect. 3.3). This shows that all metrics on P \ D are conformally equivalent to a metric of constant curvature, one of the three listed in (2.1), for which the norm (6) may be computed as in (16). The values of the norms are different for the three different choices of constant curvature metrics.

Remark 3. The volume of the moduli space defined using Reidemeister-Ray-Singer torsion is independent of the choice of metric on the 2-manifold, whether orientable or nonorientable. The Riemannian volume, however, is a different story. For an orientable 2-manifold, the moduli space of flat connections is a Kähler manifold, so the Riemannian volume equals the symplectic volume, and is thus independent of the choice of metric. For a nonorientable surface, Witten remarks in [W] (p. 163) that the Riemannian volume does depend on the metric. According to [W] (2.38) the torsion volume and the Riemannian volume differ (in this case) by a ratio of determinants of elliptic operators (generalized Laplacians on Lie algebra-valued differential forms). Our results are consistent with this observation, and at the same time provide an evaluation of this ratio of determinants.
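The conformal invariance asserted in Remark 1 comes down to the coefficient $g^{11}\sqrt{\det g}$ being unchanged when $g$ is multiplied by $e^{w}$. A minimal numerical sketch follows; the sample metric entries, the value of `w`, and the function name `star_coefficient` are assumptions made for illustration only.

```python
import math

def star_coefficient(g11, g12, g22):
    """Return g^{11} * sqrt(det g) for the symmetric 2x2 metric [[g11, g12], [g12, g22]]."""
    det = g11 * g22 - g12 * g12
    inverse_11 = g22 / det          # (1,1) entry of the inverse matrix
    return inverse_11 * math.sqrt(det)

# Assumed sample values: a positive definite metric at one point and a conformal factor e^w.
g11, g12, g22 = 2.0, 0.3, 1.5
w = 0.7
scale = math.exp(w)

before = star_coefficient(g11, g12, g22)
after = star_coefficient(scale * g11, scale * g12, scale * g22)
print(before, after)
```

Rescaling multiplies $\det g$ by $e^{2w}$ and $g^{11}$ by $e^{-w}$, so the two printed numbers agree up to rounding.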
2.2. Hyperbolic metric on the geodesic triangle. We explore the hyperbolic case first. Note that if $\Sigma$ has genus ≥ 2 then by the uniformization theorem it is equipped with a unique metric of constant scalar curvature −1 (in other words a hyperbolic metric). This is one motivation for choosing a hyperbolic metric on P \ D, although such a metric will be singular at one point. A Riemann surface with one boundary component and genus ≥ 1 always has a hyperbolic metric with constant curvature −1 [B]. For this reason we construct a metric on P \ D for which the boundary is a geodesic and which is hyperbolic with constant curvature −1 at all but one point. We do this by gluing the edges of an isosceles geodesic hyperbolic triangle using an orientation reversing map: this process identifies all three vertices of the triangle, and the vertex becomes the point where the metric does not have constant curvature −1 (indeed, the metric is singular at this point).
Remark 4. The reason why this procedure gives a natural metric on the 2-manifold formed by taking the connected sum of a Riemann surface of genus > 0 with $P$ is that a Riemann surface of genus > 0 with one or two boundary components always has a hyperbolic metric with constant curvature for which the boundary components are geodesics. This metric can be obtained from a pants decomposition of the Riemann surface, where each pair of pants is equipped with a hyperbolic metric for which the boundary components are geodesics. See [B]. Hence the connected sum of a Riemann surface with $P$ has a metric which is hyperbolic at all but one point, and the connected sum of a Riemann surface with two copies of $P$ has a metric which is hyperbolic at all but 2 points.

To form $P$ with one disc removed, we recall that the two edges of the triangle are identified using an orientation reversing map. In Fig. 1 this corresponds to identifying $\gamma_1$ with $\gamma_2$ using an orientation reversing map which maps $x_1 \in \gamma_1$ to $x_0 \in \gamma_2$ and maps $x_0 \in \gamma_1$ to $x_2 \in \gamma_2$. It turns out that the metric on the based moduli space on $P$ with one disc removed is singular at the vertex. We have introduced a geodesic triangle in the upper half plane (see Fig. 2). As shown in Fig. 2, the geodesic triangle has one distinguished vertex $x_0$ (which serves as the origin for geodesic polar coordinates). The sides $\gamma_1, \gamma_2, \gamma_3$ are geodesics. We introduce geodesic polar coordinates $(\rho,\sigma)$ (using the hyperbolic metric on the upper half plane, which is assumed to contain the triangle) with $x_0$ as the origin: here $\rho$ is the distance from a chosen point $x_0$ and $\sigma$ is the polar angle. It is assumed that the geodesic $\gamma_1$ is defined by the equation $\sigma = 0$, and $\gamma_2$ is defined by $\sigma = \varphi$ (where $\varphi$ is the angle at $x_0$). The curves $\gamma_1$ and $\gamma_2$ have lengths $L_1$ and $L_2$, which are equal here. The geodesic $\gamma_3$ is specified by Coxeter's equation
\[ \coth\rho = \frac{\cos(\sigma - \sigma_0)}{l}, \quad \text{see [Cox] p. 376,} \tag{20} \]
where $l$, $\sigma_0$ are constants determined by the geodesic $\gamma_3$. Also,
\[ \coth\rho = \frac{\cosh\rho}{\sinh\rho} = \frac{\cosh^2(\rho/2) + \sinh^2(\rho/2)}{2\sinh(\rho/2)\cosh(\rho/2)} = \frac{1}{2}\left(\tau + \frac{1}{\tau}\right) \]
for $\tau = \tanh(\rho/2)$. Thus
\[ \left(1 - \frac{\cos(\sigma - \sigma_0)}{l\,\tau}\right) d\tau = \frac{-\sin(\sigma - \sigma_0)}{l}\,d\sigma. \]
By (20), it is clear that since $\rho(\sigma = 0)$ must be equal to $\rho(\sigma = \varphi)$ for a hyperbolic geodesic triangle two sides of which will be identified to form $P \setminus D$, we need $\sigma_0 = \varphi/2$ in (20). Now the equation for $\gamma_3$ is (see Eq. (20), and recall that $L_1 = L_2 = L$)
\[ \tanh\rho = \frac{\cos(\varphi/2)\tanh L}{\cos(\sigma - \varphi/2)} \quad\text{and}\quad \tanh\rho = \frac{2\tau}{1 + \tau^2}. \]
So we have
\[ \tau(\gamma_3(\sigma)) = \frac{\cos(\sigma - \varphi/2)}{l} - \sqrt{\left(\frac{\cos(\sigma - \varphi/2)}{l}\right)^2 - 1}, \qquad l = \cos(\varphi/2)\tanh L. \tag{21} \]

Fig. 2. Hyperbolic geodesic triangle (vertices $x_0$, $x_1$, $x_2$; sides $\gamma_1$, $\gamma_2$, $\gamma_3$; angle $\varphi$ at $x_0$)
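Formula (21) can be sanity-checked numerically: with $\tau = \tanh(\rho/2)$, the curve it describes should pass through $\rho = L$ at $\sigma = 0$ and $\sigma = \varphi$, and should satisfy $\tanh\rho\,\cos(\sigma - \varphi/2) = l$ along the way. The sketch below assumes sample values `phi = 1.0` and `L = 0.8`; the helper names are hypothetical and not taken from the paper.

```python
import math

def tau_on_gamma3(sigma, phi, L):
    """tau = tanh(rho/2) along gamma_3, read off from (21) with l = cos(phi/2) * tanh(L)."""
    l = math.cos(phi / 2) * math.tanh(L)
    c = math.cos(sigma - phi / 2) / l
    return c - math.sqrt(c * c - 1.0)

def rho_from_tau(tau):
    """Invert tau = tanh(rho / 2)."""
    return 2.0 * math.atanh(tau)

phi, L = 1.0, 0.8                      # assumed sample apex angle and side length
l = math.cos(phi / 2) * math.tanh(L)

# Endpoints of gamma_3 should sit at distance L from the vertex x0.
rho_start = rho_from_tau(tau_on_gamma3(0.0, phi, L))
rho_end = rho_from_tau(tau_on_gamma3(phi, phi, L))

# Residual of tanh(rho) * cos(sigma - phi/2) = l at a few interior angles.
samples = [phi * j / 10 for j in range(11)]
residuals = [
    abs(math.tanh(rho_from_tau(tau_on_gamma3(s, phi, L))) * math.cos(s - phi / 2) - l)
    for s in samples
]
print(rho_start, rho_end, max(residuals))
```

The minus sign in front of the square root selects the root with $\tau < 1$, which is the one compatible with $\tau = \tanh(\rho/2)$.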
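Thus by (18)
\[ \int_{D} \operatorname{Tr}(d\xi \wedge *d\xi) = \frac{k\varphi}{2}\,\langle Y, Y\rangle \tanh\!\left(\frac{L}{2}\right)^{2k} - \frac{k}{2}\,\langle Y, Y\rangle \int_0^{\varphi} \left(\frac{\cos(\sigma - \varphi/2)}{l} - \sqrt{\left(\frac{\cos(\sigma - \varphi/2)}{l}\right)^2 - 1}\right)^{2k} d\sigma. \]
Let $x = \cos(\sigma - \varphi/2)$;
\begin{align*}
\int_{D} \operatorname{Tr}(d\xi \wedge *d\xi) &= \frac{k\varphi}{2}\,\langle Y, Y\rangle \tanh(L/2)^{2k} - \frac{k}{2}\,\langle Y, Y\rangle \Biggl[\int_{\cos(\varphi/2)}^{1} \left(\frac{x}{l} - \sqrt{\left(\frac{x}{l}\right)^2 - 1}\right)^{2k} \frac{dx}{\sqrt{1 - x^2}} \\
&\qquad\qquad - \int_{1}^{\cos(\varphi/2)} \left(\frac{x}{l} - \sqrt{\left(\frac{x}{l}\right)^2 - 1}\right)^{2k} \frac{dx}{\sqrt{1 - x^2}}\Biggr] \\
&= \frac{k\varphi}{2}\,\langle Y, Y\rangle \tanh(L/2)^{2k} - k\,\langle Y, Y\rangle \int_{\cos(\varphi/2)}^{1} \left(\frac{x}{l} - \sqrt{\left(\frac{x}{l}\right)^2 - 1}\right)^{2k} \frac{dx}{\sqrt{1 - x^2}}. \tag{22}
\end{align*}
Together with (15) and (17), we get
\[ \int_{\Delta} \operatorname{Tr}(d\xi \wedge *d\xi) = \int_{C_1} \operatorname{Tr}(\xi \wedge *d\xi) - \int_{D} \operatorname{Tr}(d\xi \wedge *d\xi) = k\,\langle Y, Y\rangle \int_{\cos(\varphi/2)}^{1} \left(\frac{x}{l} - \sqrt{\left(\frac{x}{l}\right)^2 - 1}\right)^{2k} \frac{dx}{\sqrt{1 - x^2}}. \tag{23} \]
We will compute this integral in Appendix A.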
We denote the integral (23) by $k\,\langle Y, Y\rangle\,h(\varphi)$ because although it depends on two parameters $\varphi$ and $L$, there is a relation between these two. We can see this as follows. Assume the length of $\gamma_3$ is fixed (since $\gamma_3$ will be glued to the boundary of a Riemann surface); denote this length by $b$. We have
\begin{align*}
\cosh^2 L &= \frac{\sinh^2 b}{2(\cosh b - 1)} + \frac{(\cosh b - 1)(1 + \cos\varphi)^2}{2(1 - \cos^2\varphi)} \\
&= \frac{2(\cosh b - \cos\varphi)(\cosh b - 1)(\cos\varphi + 1)}{2(\cosh b - 1)(1 - \cos^2\varphi)} \\
&= \frac{\cosh b - \cos\varphi}{1 - \cos\varphi}. \tag{24}
\end{align*}
This is the relation between $\varphi$ and $L$ once $b$ is fixed. If $\tau(L) = \tanh(L/2)$, then the relation between $\tau(L)$ and $\varphi$ is
\[ \tau^2(\varphi) = \tau^2(L(\varphi)) = \frac{\sqrt{\dfrac{\cosh b - \cos\varphi}{1 - \cos\varphi}} - 1}{\sqrt{\dfrac{\cosh b - \cos\varphi}{1 - \cos\varphi}} + 1}; \tag{25} \]
thus $h$ is a function of the top angle $\varphi$ only. Thus from (23) we have
\[ \langle d\xi, d\xi\rangle = k\,\langle Y, Y\rangle\,h(\varphi). \tag{26} \]
If we choose an orthonormal basis $e_1, e_2, \cdots, e_n$ for $\operatorname{Lie}(G)$, then the volume element of this moduli space is the square root of the determinant of the metric in this basis, and by (26) this determinant equals $(k\,h(\varphi))^n$, where $n$ is the dimension of $\operatorname{Lie}(G)$, i.e. of $\mathfrak{g}$.
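Relations (24) and (25) can be cross-checked against the hyperbolic law of cosines for an isosceles triangle with legs $L$, base $b$ and apex angle $\varphi$, namely $\cosh b = \cosh^2 L - \sinh^2 L\cos\varphi$. The sketch below is illustrative only: the sample values of `b` and `phi` and the function names are assumptions, and (25) is read with the square root of the ratio appearing in (24).

```python
import math

def leg_length(b, phi):
    """L from (24): cosh^2 L = (cosh b - cos phi) / (1 - cos phi)."""
    ratio = (math.cosh(b) - math.cos(phi)) / (1.0 - math.cos(phi))
    return math.acosh(math.sqrt(ratio))

def tau_squared(b, phi):
    """Right-hand side of (25), using cosh L = sqrt of the ratio in (24)."""
    root = math.sqrt((math.cosh(b) - math.cos(phi)) / (1.0 - math.cos(phi)))
    return (root - 1.0) / (root + 1.0)

b, phi = 1.3, 0.9              # assumed sample base length and apex angle
L = leg_length(b, phi)

# Hyperbolic law of cosines for the isosceles triangle with legs L and apex angle phi.
law_of_cosines_gap = math.cosh(b) - (math.cosh(L) ** 2 - math.sinh(L) ** 2 * math.cos(phi))

# tau(L) = tanh(L/2), so tau^2 should match (25).
tau_gap = math.tanh(L / 2) ** 2 - tau_squared(b, phi)
print(L, law_of_cosines_gap, tau_gap)
```

Both gaps should vanish up to rounding, which is consistent with $h$ depending on $\varphi$ alone once $b$ is fixed.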
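2.3. Euclidean metric on the geodesic triangle. The only smooth constant curvature metric on $P \setminus D$ for which the boundary is a geodesic is the Euclidean metric. For the geodesic triangle equipped with the Euclidean metric, we get
\[ u = \int^{\rho} \frac{d\rho}{\rho} = \ln\rho \]
so $\tau = \rho$ and $\xi = Y\rho^{k}\cos k\sigma$. The geodesic $\gamma_3$ is given by $\rho\cos(\sigma - \varphi/2) = L$ so we have
\[ \int_{C_1} \operatorname{Tr}(\xi \wedge *d\xi) = \frac{\pi}{2}\,\langle Y, Y\rangle\,L^{2k}, \tag{27} \]
and from (16) and (18),
\begin{align*}
\int_{D} \operatorname{Tr}(d\xi \wedge *d\xi) &= \frac{\pi}{2}\,\langle Y, Y\rangle\,L^{2k} - \frac{k}{2}\,\langle Y, Y\rangle \int_{\rho\cos(\sigma - \varphi/2) = L} \rho^{2k}\,d\sigma \\
&= \frac{\pi}{2}\,\langle Y, Y\rangle\,L^{2k} - k\,\langle Y, Y\rangle\,L^{2k} \int_0^{\varphi/2} \sec^{2k}(\sigma)\,d\sigma. \tag{28}
\end{align*}
Thus the norm $\int_{D} \operatorname{Tr}(d\xi \wedge *d\xi)$ is given by $k\,\langle Y, Y\rangle\,h(\varphi)$ for an appropriate function $h(\varphi)$ which depends on the angle $\varphi$ and on the choice of Euclidean metric.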
2.4. Spherical metric on the geodesic triangle. For a spherical metric on the geodesic triangle, we get
\[ u = \int^{\rho} \frac{dx}{\sin x} = \ln\tan(\rho/2) \tag{29} \]
so $\tau = \tan(\rho/2)$. We then get that
\[ \int_{C_1} \operatorname{Tr}(\xi \wedge *d\xi) = \frac{\pi}{2}\,\langle Y, Y\rangle\,(\tan(L/2))^{2k}. \tag{30} \]
Hence we get that
\[ \int_{D} \operatorname{Tr}(d\xi \wedge *d\xi) = \langle Y, Y\rangle \left(\frac{\pi}{2}\,(\tan L/2)^{2k} - \frac{k}{2}\int_{\gamma_3} (\tan\rho/2)^{2k}\,d\sigma\right). \tag{31} \]
Here, $\gamma_3$ is the segment of the great circle connecting the points $x_1 = (\rho = L, \sigma = 0)$ and $x_2 = (\rho = L, \sigma = \varphi)$. The length of this geodesic and the angles at the vertices $x_1, x_2$ can be determined using spherical trigonometry [Weis]. Thus when $\Delta$ is equipped with a spherical metric of constant curvature, the norm $\langle d\xi, d\xi\rangle$ is given by $k\,\langle Y, Y\rangle\,h(\varphi)$ for an appropriate function $h(\varphi)$ which depends on $\varphi$ and on the choice of spherical metric.
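The three choices of $\tau$ used in Sects. 2.2 to 2.4 all come from the substitution (8), $du/d\rho = 1/\sqrt{g_{22}(\rho)}$ with $\tau = e^{u}$. The following sketch checks this by finite differences at one assumed sample radius; the dictionary `METRICS` and the helper `log_derivative` are illustrative names, not part of the paper.

```python
import math

# For each metric: (sqrt(g22) as a function of rho, tau as a function of rho).
METRICS = {
    "hyperbolic": (math.sinh, lambda r: math.tanh(r / 2)),   # g22 = sinh^2(rho)
    "euclidean": (lambda r: r, lambda r: r),                 # g22 = rho^2
    "spherical": (math.sin, lambda r: math.tan(r / 2)),      # g22 = sin^2(rho)
}

def log_derivative(tau, rho, h=1e-5):
    """Central-difference estimate of d(ln tau)/d rho, which is du/d rho since u = ln tau."""
    return (math.log(tau(rho + h)) - math.log(tau(rho - h))) / (2 * h)

rho = 0.9                       # assumed sample radius, inside (0, pi) for the spherical case
errors = {
    name: abs(log_derivative(tau, rho) - 1.0 / sqrt_g22(rho))
    for name, (sqrt_g22, tau) in METRICS.items()
}
for name, err in errors.items():
    print(name, err)
```

Each printed error is the gap between the numerical derivative of $\ln\tau$ and $1/\sqrt{g_{22}}$, so all three should be close to zero.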
2.5. $\Sigma$ is the connected sum of a Riemann surface with $P$. To consider the volume of the moduli space of the connected sum of a Riemann surface with $P$, first, let us examine the following lemmas:

Lemma 5. Let $E$ be a Riemannian manifold with a given metric. Let $f : E \to \mathbb{R}$ be a smooth function for which $df_m \neq 0$ for any $m \in E$. Then
\[ \operatorname{Vol}(E) = \int_{\mathbb{R}} \frac{\operatorname{Vol} f^{-1}(t)}{df(v(t))}\,dt, \]
where $v(t)$ is the unit normal vector to $f^{-1}(t)$.

Proof. The volume form on $T_m E$ is $e_1^* \wedge \cdots \wedge e_n^*$, where $\{e_j\}$ is an orthonormal basis of tangent vectors, i.e. $(e_j, e_k) = \delta_{jk}$, and $e_i^*$ are the dual basis vectors for $T_m^* E$ for any $m \in E$. Choose $e_1, \cdots, e_{n-1} \in T_m(f^{-1}(t))$ so $e_1^* \wedge \cdots \wedge e_{n-1}^*$ is the volume form on $T_m(f^{-1}(t))$. Then $e_n^* = \dfrac{df}{df(v(t))}$ since $v(t)$ is the unit vector normal to $f^{-1}(t)$. Thus
\begin{align*}
\operatorname{Vol}(E) &= \int_E (e_1^* \wedge \cdots \wedge e_{n-1}^*) \wedge e_n^* \\
&= \int_E (e_1^* \wedge \cdots \wedge e_{n-1}^*) \wedge \frac{df}{df(v(t))} \\
&= \int_{t \in \mathbb{R}} dt \int_{f^{-1}(t)} \frac{(d\mathrm{vol})}{df(v(t))} \\
&= \int_{\mathbb{R}} \frac{\operatorname{Vol}(f^{-1}(t))}{df(v(t))}\,dt.
\end{align*}
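Lemma 5 can be illustrated on a case where every quantity is known in closed form. The sketch below is an assumed example, not one from the paper: $E$ is the punctured unit disc in $\mathbb{R}^2$, first with $f(x,y) = x^2 + y^2$ and then with $f(x,y) = \sqrt{x^2 + y^2}$. In both cases the level sets are circles and the formula should return the area $\pi$. The function name `coarea_volume` is hypothetical.

```python
import math

def coarea_volume(level_volume, normal_derivative, t_min, t_max, n=10000):
    """Midpoint rule for the integral of Vol(f^{-1}(t)) / df(v(t)) over [t_min, t_max]."""
    step = (t_max - t_min) / n
    total = 0.0
    for i in range(n):
        t = t_min + (i + 0.5) * step
        total += level_volume(t) / normal_derivative(t)
    return total * step

# Case 1: f = x^2 + y^2. The level set f = t is a circle of radius sqrt(t),
# and df(v) = |grad f| = 2 sqrt(t) on that circle.
area_quadratic = coarea_volume(
    lambda t: 2.0 * math.pi * math.sqrt(t),
    lambda t: 2.0 * math.sqrt(t),
    0.0, 1.0,
)

# Case 2: f = sqrt(x^2 + y^2). The level set f = t is a circle of radius t,
# and df(v) = 1 because f is the distance from the origin.
area_radial = coarea_volume(
    lambda t: 2.0 * math.pi * t,
    lambda t: 1.0,
    0.0, 1.0,
)
print(area_quadratic, area_radial, math.pi)
```

The origin is excluded because $df$ vanishes there for the first choice of $f$; removing a single point does not change the area.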
Similarly, we have the following lemma.

Lemma 6. Let $E$ be a Riemannian manifold with a given metric, and suppose $f : E \to G$ is a smooth map for which $df$ has maximal rank at generic points in $E$. Then
\[ \operatorname{Vol}(E) = \int_{g \in G} \frac{\operatorname{Vol}(f^{-1}(g))}{dg(\wedge_{j=1}^{N} df(v_j))}\,dg, \]
where $dg$ is the volume element given by a Riemannian metric on $G$ and $v_1, \cdots, v_N$ are unit normal vectors to $f^{-1}(g)$ with $N = \dim G$.

Now let us return to our particular situation.1 We have a nonorientable surface which is the connected sum of a Riemann surface $\Sigma_1$ with a nonorientable surface $\Sigma_2$ (either one or two copies of the projective plane $P$). Denote $E_1 = \operatorname{Hom}(\pi_1(\Sigma_1), G)$, where $\Sigma_1$ is a Riemann surface of genus $\ell$ with one boundary component, and $E_2 = \operatorname{Hom}(\pi_1(\Sigma_2), G)$, where in this section $\Sigma_2 = P$ with one disc removed. Here $\pi_1(\Sigma_2) = \{x, y \mid x = y^2\} \cong \mathbb{Z}$ so $\operatorname{Hom}(\pi_1(\Sigma_2), G) \cong G$. Define maps $f_i : E_i \to G$ for $i = 1, 2$ by sending a representation to its value on the loop around the boundary. Let $E = \{(m_1, m_2) \in E_1 \times E_2 \mid f_1(m_1) = f_2(m_2)\}$. Now we use Lemma 6 with $E_2 \cong G$ and let $f : E \to G = E_2$ be the map $(m_1, m_2) \mapsto m_2$. Notice that in this case the hypothesis of $f$ having maximal rank is valid generically because this reduces to $df_1 : TE_1 \to \mathfrak{g}$ being surjective, and it was proved by Goldman [Go1] (Prop. 3.7) that the image of $df_1$ at a point $A = (a_1, \ldots, a_\ell, b_1, \ldots, b_\ell) \in E_1 = G^{2\ell}$ is $z(A)^{\perp}$ (the orthocomplement of the Lie algebra of the stabilizer of $A$ under the adjoint action), and $z(A) = 0$ for generic $A$. We have
\[ \operatorname{Vol}(E) = \int_{G} \frac{\operatorname{Vol}(f^{-1}(g))}{dg(\wedge_j df(v_j))}\,dg. \]

1 For other approaches to measures on moduli spaces, see [Fine,Fo,Liu and BeSe,Se1,Se2,Se3].
Note that the metric on $E$ is $\pi_1^* h_1 + \pi_2^* h_2$, where $h_1$ is the metric on $E_1$ and $h_2$ the metric on $E_2$ which are given by Eq. (26), i.e. (Haar measure) $\times\,(k\,h(\varphi))^{n/2}$. Thus we can choose an orthonormal basis $\{v_j\}$ in $\mathfrak{g}$ and $dg(\wedge_j df(v_j)) = 1$. Thus we have
\[ \operatorname{Vol}(f^{-1}(g)) = \operatorname{Vol}(f_1^{-1}(g^2))\,\operatorname{Vol}(f_2^{-1}(g^2)) \]
because $f^{-1}(g) = f_1^{-1}(g^2) \times f_2^{-1}(g^2)$. It follows that
\[ \operatorname{Vol}(E) = \int_{g \in G} \operatorname{Vol}(f_1^{-1}(g^2))\,\operatorname{Vol}(f_2^{-1}(g^2))\,dg, \]
where our moduli space is $M = E/G$. Thus we have $\operatorname{Vol}(E) = \operatorname{Vol}(M)\operatorname{Vol}(G)$. Notice that the moduli space of gauge equivalence classes of flat connections on a Riemann surface with one boundary component about which the holonomy is constrained to take a fixed value (in other words the moduli space of parabolic bundles) is a Kähler manifold (see for instance [AB]) so its symplectic volume (as specified by Witten's formula) is equal to its Riemannian volume (for a metric derived from any metric on the Riemann surface). Hence our procedure will give the Riemannian volume on the moduli space of gauge equivalence classes of flat connections on the connected sum. We know from [W] the volume formula [W](4.114) for a moduli space of flat connections on a compact orientable surface of genus $\ell$ with one boundary component.2

2 In Sect. 3.1 we give a new proof of this formula.

For $s \in G$, we define
\begin{align*}
M(\ell, s) &= \Bigl\{(a_1, \ldots, a_{2\ell}) \in G^{2\ell} \Bigm| \prod_{j=1}^{\ell} a_{2j-1}\,a_{2j}\,a_{2j-1}^{-1}\,a_{2j}^{-1} \in C(s)\Bigr\}/G \\
&= \{\rho \in \operatorname{Hom}(\pi_1(\Sigma - D), G) \mid \rho([\partial(\Sigma - D)]) \in C(s)\}/G, \tag{32}
\end{align*}
where $C(s)$ is the conjugacy class of $s$. We also define
\begin{align*}
R(\ell, s) &= \Bigl\{(a_1, \ldots, a_{2\ell}) \in G^{2\ell} \Bigm| \prod_{j=1}^{\ell} a_{2j-1}\,a_{2j}\,a_{2j-1}^{-1}\,a_{2j}^{-1} = s\Bigr\} \\
&= \{\rho \in \operatorname{Hom}(\pi_1(\Sigma - D), G) \mid \rho([\partial(\Sigma - D)]) = s\} \tag{33}
\end{align*}
so that $M(\ell, s) = G\,R(\ell, s)/G$. It follows that
\[ \operatorname{Vol} M(\ell, s) = \operatorname{Vol} R(\ell, s)\,\frac{\operatorname{Vol} C(s)}{\operatorname{Vol} G}. \tag{34} \]
If $s \in G$, then Witten's formula reads
\[ \operatorname{Vol}(M(\ell, s)) = C_1 \sum_{\alpha} \frac{1}{(\dim\alpha)^{2\ell - 1}}\,\chi_\alpha(s)\,F(s), \tag{35} \]
where $\alpha$ runs over all isomorphism classes of irreducible representations of $G$, the representation $\alpha$ has character $\chi_\alpha$, and the constant $C_1$ is
\[ C_1 = \frac{Z(G)\,\operatorname{Vol}(G)^{2\ell - 2 + 1}}{(2\pi)^{\dim M(\ell, s)}\,\operatorname{Vol}(T)} \tag{36} \]
552
N.-K. Ho, L.C. Jeffrey
and $\dim M(\Sigma_\ell, s) = (2\ell - 2)|G| + |G| - |T|$, where |G| denotes the dimension of G and |T| denotes the dimension of the maximal torus T. Note also that F(s) is the Riemannian volume of the conjugacy class C(s) through s as defined in [W](4.53):
$$v = \frac{\mathrm{Vol}(G)}{\mathrm{Vol}(T)}\, v_0\, F(s).$$
(See Chapter 7, [BGV].) The Liouville volume of C(s) is
$$\sqrt{F(s)}\;\mathrm{vol}\,G/\mathrm{vol}\,T \qquad (37)$$
(see [AMW], Prop. 3.6). Here v represents the measure on T/W obtained by pushing down the Haar measure on G (under the natural map from a group element to its conjugacy class) and $v_0$ represents the measure on T/W obtained by restricting the metric on $\mathfrak{g}$ to Lie(T) and then identifying Lie(T) with the tangent space to T/W. A more detailed explanation of F(s) is given in the next section. So (34) is equivalent to
$$\mathrm{Vol}\,M(\Sigma_\ell, s) = \mathrm{Vol}\,R(\Sigma_\ell, s)\,\frac{\sqrt{F(s)}}{\mathrm{Vol}\,T}, \qquad (38)$$
and (35) is equivalent to
$$\mathrm{Vol}\,R(\Sigma_\ell, s) = C_1\,\mathrm{Vol}\,T \sum_{\alpha} \frac{1}{(\dim\alpha)^{2\ell-1}}\,\chi_\alpha(s). \qquad (39)$$
Thus the volume of the moduli space of flat connections on the connected sum of a Riemann surface with a projective plane is
$$\mathrm{Vol}(R(\Sigma)) = \mathrm{Vol}(R(\Sigma_\ell \# P)) = \int_{s\in G} \mathrm{Vol}(R(\Sigma_\ell, s))\,\mathrm{Vol}(R(P, s))\, ds. \qquad (40)$$
Denote by R the map G → G,
$$R : g \mapsto g^2. \qquad (41)$$
Weyl's integral formula
$$|W| \int_G f(g)\,dg = \int_T \Big[F(t) \int_G f(g t g^{-1})\,dg\Big]\,dt$$
gives us that
$$\int_G f(g)\,dg = \int_{T/W} F(s) f(s)\, \frac{\mathrm{Vol}(G)}{\mathrm{Vol}(T)}\, ds,$$
when f is conjugation invariant.
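Thus
$$\mathrm{Vol}(R(\Sigma)) = C_1\,\mathrm{Vol}\,T \sum_{\alpha} \frac{1}{(\dim(\alpha))^{2\ell-1}} \int_G \chi_\alpha(g)\, R_*(\sqrt{\det}\,dg)
= C_1\,\mathrm{Vol}(T) \sum_{\alpha} \frac{1}{(\dim(\alpha))^{2\ell-1}} \int_G \chi_\alpha(g^2)\, \sqrt{\det}\,dg.$$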
The above is true because $S^2 - D - D$ is the double cover of P \ D; the space of flat connections on $S^2 - D - D$ (modulo based gauge transformations) is isomorphic to G, so we pull back the integral to G under the covering map. We get
$$\mathrm{Vol}(R(\Sigma)) = C_1\,\mathrm{Vol}\,T \sum_{\alpha} \frac{1}{(\dim(\alpha))^{2\ell-1}}\, f_\alpha\, \mathrm{Vol}(G)\,(k h(\phi))^{|G|/2}$$
by using
$$\int_G \chi_\alpha(g^2)\,dg = f_\alpha\,\mathrm{Vol}(G) \qquad (42)$$
(cf. [W] (2.70)), where $f_\alpha = 1, -1, 0$ depending on whether the representation α admits a symmetric invariant bilinear form, an antisymmetric bilinear form, or no invariant bilinear form at all. Here dim(α) is the dimension of the representation α, and its character is denoted by $\chi_\alpha$. We get
$$\mathrm{Vol}(R(\Sigma)) = \frac{\#Z(G)\,\mathrm{Vol}(G)^{2\ell}\, H(\phi)}{(2\pi)^{(2\ell-1)|G|-|T|}} \sum_{\alpha} \frac{f_\alpha}{(\dim(\alpha))^{2\ell-1}},$$
where $H(\phi) = (k h(\phi))^{|G|/2}$ and $k = \pi/\phi$. Here h is a real-valued function of the angle φ which depends on the choice of metric on the Möbius strip. We compare our result to Witten's formula [W](4.93):
$$\mathrm{Vol}(R(\Sigma)) = \frac{\#Z(G)\,\mathrm{Vol}(G)^{2\ell}}{(2\pi)^{(2\ell-1)|G|}} \sum_{\alpha} \frac{f_\alpha}{(\dim(\alpha))^{2\ell-1}}.$$
Our formula differs from Witten's by a multiplicative factor of $H(\phi)\,(2\pi)^{|T|}$. The factor $(2\pi)^{|T|}$ is due to Witten's choice of a different normalization. The factor H(φ) results from our choice of a metric on the nonorientable part of the surface (the projective plane). Note Witten did not choose a metric on P. On the other hand our formula for Vol(M) is a function of the angle φ for a fixed surface.

2.6. $\Sigma$ is a connected sum of a Riemann surface with two projective planes. We know from [W] the volume formula for a moduli space of flat connections on a compact orientable surface of genus $\ell$ with two boundary components [W] (4.114) (where $s_1, s_2 \in G$ and the holonomies around the two boundary components are fixed at $s_1$ and $s_2$). In this case Witten's formula [W] (4.114) reads as follows:
$$\mathrm{Vol}(M(\Sigma_\ell, s_1, s_2)) = C_2 \sum_{\alpha} \frac{1}{(\dim(\alpha))^{2\ell}}\, \chi_\alpha(s_1)\sqrt{F(s_1)}\,\chi_\alpha(s_2)\sqrt{F(s_2)}, \qquad (43)$$
where the constant $C_2$ is
$$C_2 = \frac{\#Z(G)\,\mathrm{Vol}(G)^{2\ell}}{(2\pi)^{\dim M(\Sigma_\ell, s_1, s_2)}\,\mathrm{Vol}(T)^2},$$
and $\dim M(\Sigma_\ell, s_1, s_2) = (2\ell - 2)|G| + 2|G| - 2|T|$. Again we define
$$R(\Sigma_\ell, s_1, s_2) = \{\rho \in \mathrm{Hom}(\pi_1(\Sigma_\ell - D_1 - D_2), G) \mid \rho([\partial D_1]) = s_1,\ \rho([\partial D_2]) = s_2\}.$$
Here we have chosen representatives $[\partial D_j]$ for the elements of the fundamental group represented by the loops around the j-th boundary component, by connecting the basepoint to some point on the boundary. Thus
$$M(\Sigma_\ell, s_1, s_2) = (G \times G)\,R(\Sigma_\ell, s_1, s_2)/(G \times G),$$
where $(g_1, g_2) \in G \times G$ acts on $\mathrm{Hom}(\pi_1(\Sigma_\ell - D_1 - D_2), G)$ so that if the value of the representation on the loop around the j-th boundary component is $s_j$, then this value becomes $g_j s_j g_j^{-1}$. Following (34) and (37) above, we have
$$\mathrm{Vol}\,M(\Sigma_\ell, s_1, s_2) = \mathrm{Vol}\,R(\Sigma_\ell, s_1, s_2)\,\frac{\mathrm{Vol}\,C(s_1)\,\mathrm{Vol}\,C(s_2)}{(\mathrm{Vol}\,G)^2}
= \mathrm{Vol}\,R(\Sigma_\ell, s_1, s_2)\,\frac{\sqrt{F(s_1)}\sqrt{F(s_2)}}{(\mathrm{Vol}\,T)^2}. \qquad (44)$$
Equivalently, Witten's formula is
$$\mathrm{Vol}(R(\Sigma_\ell, s_1, s_2)) = C_2\,(\mathrm{Vol}(T))^2 \sum_{\alpha} \frac{1}{(\dim(\alpha))^{2\ell}}\,\chi_\alpha(s_1)\chi_\alpha(s_2).$$
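Thus the volume of the moduli space of flat connections on the connected sum of a Riemann surface with two projective planes (equivalently, the connected sum with one Klein bottle) is
$$\mathrm{Vol}(R(\Sigma)) = \mathrm{Vol}(R(\Sigma_\ell \# P \# P)) = \int_{G\times G} ds_1\, ds_2\, \mathrm{Vol}(R(\Sigma_\ell, s_1, s_2))\,\mathrm{Vol}(R(P, s_1))\,\mathrm{Vol}(R(P, s_2)).$$
Let $\sqrt{\det_1}$ and $\sqrt{\det_2}$ denote the volume elements on the two copies of P \ D respectively. We have
$$\mathrm{Vol}(R(\Sigma)) = \int_{G\times G} \mathrm{Vol}(R(\Sigma_\ell, g_1, g_2))\, R_*(\sqrt{\det\nolimits_1}\,dg_1)\, R_*(\sqrt{\det\nolimits_2}\,dg_2)$$
$$= C_2\,(\mathrm{Vol}\,T)^2 \sum_{\alpha} \frac{1}{(\dim(\alpha))^{2\ell}} \int_{G\times G} \chi_\alpha(g_1)\chi_\alpha(g_2)\, R_*(\sqrt{\det\nolimits_1}\,dg_1)\, R_*(\sqrt{\det\nolimits_2}\,dg_2)$$
$$= C_2\,(\mathrm{Vol}\,T)^2 \sum_{\alpha} \frac{1}{(\dim(\alpha))^{2\ell}}\, \sqrt{\det\nolimits_1}\sqrt{\det\nolimits_2}\, \Big(\int_G \chi_\alpha(g^2)\,dg\Big)^2, \qquad (45)$$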
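where R was defined by (41). We use Eq. (42) and thus the volume becomes
$$\mathrm{Vol}(R(\Sigma)) = \frac{\#Z(G)\,\mathrm{Vol}(G)^{2\ell-1}}{(2\pi)^{(2\ell)|G|-2|T|}} \sum_{\alpha} \frac{1}{(\dim(\alpha))^{2\ell}}\, f_\alpha^2\, H(\phi_1) H(\phi_2)\,\mathrm{Vol}(G)^2,$$
where $H(\phi_i) = (k_i h(\phi_i))^{|G|/2}$, the angles $\phi_1$ and $\phi_2$ are the top angles of the two triangles respectively, $k_i = \pi/\phi_i$, and $f_\alpha$ is defined as in Sect. 2.5. Note that $f_\alpha^2 = 1$ if $\alpha = \bar\alpha$ and $f_\alpha^2 = 0$ otherwise. We get the final formula
$$\mathrm{Vol}(R(\Sigma)) = \frac{\#Z(G)\,\mathrm{Vol}(G)^{2\ell+1}\, H(\phi_1) H(\phi_2)}{(2\pi)^{(2\ell)|G|-2|T|}} \sum_{\alpha=\bar\alpha} \frac{1}{(\dim(\alpha))^{2\ell}}.$$
We compare this with Witten's formula [W](4.77):
$$\mathrm{Vol}(R(\Sigma)) = \frac{\#Z(G)\,\mathrm{Vol}(G)^{2\ell+1}}{(2\pi)^{(2\ell)|G|}} \sum_{\alpha=\bar\alpha} \frac{1}{(\dim(\alpha))^{2\ell}}.$$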
Our formula differs from Witten's formula by a multiplicative factor of $H(\phi_1)\,H(\phi_2)\,(2\pi)^{2|T|}$. The factor $(2\pi)^{2|T|}$ is due to Witten's choice of a different normalization. The factors $H(\phi_i)$ result from our choice of a metric on the two projective planes. Notice that there are two factors $H(\phi_i)$ (in contrast to the case when the manifold is a connected sum of a Riemann surface with one copy of P, when there is only one such factor). Thus Vol(M) is a function of the angles $\phi_1, \phi_2$ for a fixed surface.

We may summarize our conclusions as follows. When we take the connected sum of a Riemann surface with a real projective plane formed by identifying two sides of length L of a geodesic triangle with polar angle φ, the volume we have obtained is a function of the angle φ since the length L of the side of the triangle is determined once the length of the third edge of the triangle is given. A pair (L, φ) gives a conformal structure on the triangle. Once we fix a conformal structure, we obtain a volume for the moduli space determined by this structure. The situation is similar for the connected sum of a Riemann surface with two projective planes.

3. Witten's Volume Formula and Haar Measure

In this section, we will consider Witten's volume formula for the moduli space of flat connections on a 2-manifold, and give a new proof of these formulas using Haar measure. Notice that Witten did not give a proof exclusively using Haar measure to compute the volume of the moduli space. Witten gave three arguments for the formula in [W]:

• In Sect. 2, Witten used results of Migdal [Mi] involving lattice gauge theories to compute the Yang-Mills partition function in two dimensions. Migdal's result only makes sense when the Yang-Mills partition function has been regularized, giving a result which reduces to the volume of the moduli space when the regularization parameter tends to 0.
• In Sect. 3, Witten deduced the result from the Verlinde formula, which is a formula for the Riemann-Roch number of the prequantum line bundle of the moduli space M of flat connections on a 2-manifold, $RR(M, L^k)$. Since the cohomology class of the symplectic form ω is the first Chern class of L, we may expect that the leading order term in the Riemann-Roch number (as a polynomial in k) is $k^{\dim M/2}\,\mathrm{Vol}(M)$.
• In Sect. 4, Witten characterized the volume of the moduli space using Reidemeister-Ray-Singer torsion. He used the fact that Reidemeister torsion is multiplicative: if M is the union of two submanifolds $M_1$ and $M_2$ with boundary N, glued along the boundary, then $\tau(M) = \tau(M_1)\tau(M_2)/\tau(N)$, where the torsion is viewed as a ratio of determinants of elliptic operators and the above quotient makes sense in terms of the Mayer-Vietoris sequence which computes the cohomology groups of M from those of $M_1$, $M_2$ and N.

Since the moduli space can be identified with the space of representations of a surface group (in other words $\mathrm{Hom}(\pi_1(\Sigma), G)/G$) and a compact Lie group has a natural Riemannian measure, the Haar measure, we can try to understand Witten's formula for the moduli space by pushing forward the Haar measure on G.3 We will use the following two facts in our argument.

• Weyl's integral formula (cf. [BD](4.1.11)):
$$|W| \int_G f(g)\,dg = \int_T \Big[F(t) \int_G f(g t g^{-1})\,dg\Big]\,dt.$$
If f is conjugation invariant, the formula becomes
$$|W| \int_G f(g)\,dg = \frac{\mathrm{Vol}(G)}{\mathrm{Vol}(T)} \int_T F(t) f(t)\,dt
= |W|\,\frac{\mathrm{Vol}(G)}{\mathrm{Vol}(T)} \int_{T/W} i_*(F(t))\, f(i^{-1}(t))\; i_* dt
= |W|\,\frac{\mathrm{Vol}(G)}{\mathrm{Vol}(T)} \int_{s\in T/W} F(i^{-1}(s))\, f(i^{-1}(s))\, ds.$$
Here i : T → T/W is the local isomorphism with $i_* dt =: ds$. Thus we have
$$\int_G f(g)\,dg = \frac{\mathrm{Vol}(G)}{\mathrm{Vol}(T)} \int_{s\in T/W} F(s) f(s)\, ds.$$
Recall that $F(s) = F(i^{-1}(s))$ is the volume of the conjugacy class containing s as defined for example in [W](4.53):
$$v = \frac{\mathrm{Vol}(G)}{\mathrm{Vol}(T)}\, v_0\, F(s),$$
where v represents the measure on T/W obtained by pushing down the Haar measure on G (under the natural map from a group element to its conjugacy class) and $v_0$ represents the measure on T/W obtained by restricting the metric on $\mathfrak{g}$ to Lie(T) (which determines a Haar measure on T) and then identifying Lie(T) with the tangent space to T/W. This is as in the picture shown below:
$$T \xrightarrow{\;i\;} T/W \xleftarrow{\;u\;} G$$

3 For a related treatment of the role of Haar measure in determining volumes, see [AMM] and [AMW].
with $v = u_* dg$ and $v_0 = i_* dt$, where dg and dt are the Haar measures on G and T respectively. Here i : T → T/W is the local isomorphism and u : G → T/W is the map which maps g to its conjugacy class. Explicitly we have
$$F(s) = \det\nolimits_{\mathbb{R}}(1 - \mathrm{Ad}(s)) \qquad (46)$$
or, for s = exp λ with λ ∈ Lie(T) such that α(λ) ≠ 0 for any root α, we have
$$F(\exp\lambda) = \prod_{\alpha>0} 4\sin^2\alpha(\lambda),$$
where we assume s ∈ T and view Ad(s) as an endomorphism of the orthocomplement of the Lie algebra of T.

• Pushforward of volume under a surjective map: Suppose p : M → N is a surjective map, f : N → R is a map, and $V_M$ and $V_N$ are volume forms for M and N respectively. Then a function h characterizing the pushforward is defined by
$$\int_M f(p(x))\,V_M = \int_N f(y)\,h(y)\,V_N, \quad\text{where } p_*(V_M)(y) = h(y)\,V_N.$$
Roughly speaking, $h(y) = \mathrm{Vol}(p^{-1}(y))$.

3.1. $\Sigma_{\ell+1}$ is a one punctured Riemann surface of genus $\ell + 1$. The following picture shows the relations between the pushforwards.
$$[G \times G]^{\ell} \xrightarrow{\;Q\;} G \xleftarrow{\;R\;} G \times G, \qquad \pi : G \to T/W,$$
where
$$Q(A_1, \dots, A_\ell, B_1, \dots, B_\ell) = \prod_{i=1}^{\ell} A_i B_i A_i^{-1} B_i^{-1},$$
and
$$R(g_1, g_2) = h\, g_1 g_2 g_1^{-1} g_2^{-1}, \qquad (47)$$
where h ∈ G is the holonomy around the boundary. The volume formula for a moduli space of flat connections on a compact orientable surface with one boundary component [W](4.114) is as follows. For an element s ∈ G,
$$\mathrm{Vol}(R(\Sigma_\ell, s)) = \frac{\#Z(G)\,\mathrm{Vol}(G)^{2\ell-1}}{(2\pi)^{(2\ell-1)|G|-|T|}} \sum_{\alpha} \frac{1}{(\dim(\alpha))^{2\ell-1}}\,\chi_\alpha(s). \qquad (48)$$
We want to show that Witten's formula [W] (4.114) is the pushforward of the Haar measure on G, i.e. prove Eq. (48) by induction on $\ell$ (assuming that Eq. (48) is true for genus $\ell$, we prove it for genus $\ell + 1$). The induction hypothesis is valid for $\ell = 0$ because by [W] (2.43) we have
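Fig. 3. Connected sum of $\Sigma_\ell$ with $\Sigma_1 - D$

$$\sum_{\alpha} (\dim\alpha)\,\chi_\alpha(s) = \delta(s - 1) \qquad (49)$$
and the space $R(\Sigma_0, s)$ is one point if s = 1 and empty otherwise, so its volume is δ(s − 1). Consider a torus $\Sigma_1$ with one boundary component. If $\Sigma_1 - D$ is glued to $\Sigma_\ell$ along the boundary of D, we obtain $\Sigma_{\ell+1}$. (See Fig. 3.) Let h ∈ G be the holonomy around the boundary component of $\Sigma_1$ (which becomes the boundary component of $\Sigma_{\ell+1}$ after gluing to $\Sigma_\ell$). We have
$$\mathrm{Vol}(R(\Sigma_{\ell+1}, h)) = \mathrm{Vol}(R(\Sigma_\ell \# \Sigma_1, h)) = \int_{s\in G} \mathrm{Vol}(R(\Sigma_\ell, s))\,\mathrm{Vol}(R(\Sigma_1, s, h))\,ds.$$
Also, by definition of the pushforward we have $R_*(dg_1 \wedge dg_2) = \mathrm{Vol}(R(\Sigma_1, s, h))\,ds$. Thus we have
$$\mathrm{Vol}(R(\Sigma_{\ell+1}, h)) = \int_G \mathrm{Vol}(R(\Sigma_\ell, s))\,\mathrm{Vol}(R(\Sigma_1, s, h))\,ds$$
$$= \frac{\#Z(G)\,\mathrm{Vol}(G)^{2\ell}}{(2\pi)^{(2\ell-1)|G|-|T|}} \sum_{\alpha} \frac{1}{(\dim(\alpha))^{2\ell-1}} \int_G \chi_\alpha(g)\, R_*(dg_1 \wedge dg_2)$$
$$= \frac{\#Z(G)\,\mathrm{Vol}(G)^{2\ell}}{(2\pi)^{(2\ell-1)|G|-|T|}} \sum_{\alpha} \frac{1}{(\dim(\alpha))^{2\ell-1}} \int_{G\times G} \chi_\alpha(h g_1 g_2 g_1^{-1} g_2^{-1})\,dg_1\,dg_2$$
$$= \frac{\#Z(G)\,\mathrm{Vol}(G)^{2\ell}}{(2\pi)^{(2\ell-1)|G|-|T|}} \sum_{\alpha} \frac{1}{(\dim(\alpha))^{2\ell-1}}\, \frac{\mathrm{Vol}(G)}{\dim(\alpha)} \int_G \chi_\alpha(h g_1)\,\chi_\alpha(g_1^{-1})\,dg_1, \qquad (50)$$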
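where we have used4 (cf. [W](2.50))
$$\int_G \chi_\alpha(A u B u^{-1})\,du = \frac{\mathrm{Vol}(G)}{\dim(\alpha)}\,\chi_\alpha(A)\,\chi_\alpha(B) \qquad (51)$$
and R was defined by (47). Now we use (cf. [W](2.48))
$$\int_G \chi_\alpha(h g)\,\chi_\alpha(g^{-1})\,dg = \frac{\mathrm{Vol}(G)}{\dim(\alpha)}\,\chi_\alpha(h) \qquad (52)$$
to get our final formula for h ∈ G,
$$\mathrm{Vol}(R(\Sigma_{\ell+1}, h)) = \frac{\#Z(G)\,\mathrm{Vol}(G)^{2\ell+1}}{(2\pi)^{(2\ell+1)|G|-|T|}} \sum_{\alpha} \frac{\chi_\alpha(h)}{(\dim(\alpha))^{2\ell+1}}. \qquad (53)$$
We note that the final formula (53) is invariant under conjugation: $\mathrm{Vol}(M(\Sigma_{\ell+1}, h)) = \mathrm{Vol}(M(\Sigma_{\ell+1}, k h k^{-1}))$ for any k ∈ G. So this formula only depends on the conjugacy class of h. Note that Witten [W] (4.114) gives the following formula for the volume for genus $\ell + 1$ with h ∈ G:
$$\mathrm{Vol}(R(\Sigma_{\ell+1}, h)) = \frac{\#Z(G)\,\mathrm{Vol}(G)^{2\ell+1}}{(2\pi)^{(2\ell+1)|G|-|T|}} \sum_{\alpha} \frac{\chi_\alpha(h)}{(\dim(\alpha))^{2\ell+1}}.$$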
This agrees with our result. Our calculation shows us that Witten’s formula (4.114) can be understood in terms of the pushforward of Haar measure on G.
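
As a sanity check on the shape of formula (48), one can look at its classical finite-group analog, in which Haar measure is replaced by counting measure, Vol(G) by |G|, and the factors #Z(G) and 2π are absent. The sketch below is not the computation of this paper: it takes the symmetric group S3 as a stand-in for G (an assumption made purely for illustration) and compares a brute-force count of pairs (a, b) with a b a^-1 b^-1 = s against |G| times the sum of χ_α(s)/dim(α), which is the genus-one case of the Frobenius counting formula. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product

# Elements of S3 as tuples p, where p[i] is the image of i.
G = list(permutations(range(3)))

def mul(p, q):
    """Composition: apply q first, then p."""
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    """Inverse permutation."""
    r = [0] * 3
    for i, image in enumerate(p):
        r[image] = i
    return tuple(r)

def sign(p):
    """Sign of p, computed from the number of inversions."""
    inversions = sum(1 for i in range(3) for j in range(i + 1, 3) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def fixed_points(p):
    return sum(1 for i in range(3) if p[i] == i)

# Irreducible characters of S3 as (dimension, character) pairs.
IRREPS = [
    (1, lambda p: 1),                    # trivial
    (1, sign),                           # sign
    (2, lambda p: fixed_points(p) - 1),  # standard two-dimensional
]

def count_commutators(s):
    """Number of pairs (a, b) in G x G with a b a^-1 b^-1 = s."""
    return sum(1 for a, b in product(G, G)
               if mul(mul(a, b), mul(inv(a), inv(b))) == s)

def character_side(s, genus=1):
    """|G|^(2*genus-1) * sum over irreps of chi(s) / dim^(2*genus-1)."""
    total = sum(Fraction(chi(s), dim ** (2 * genus - 1)) for dim, chi in IRREPS)
    return len(G) ** (2 * genus - 1) * total

for s in G:
    print(s, count_commutators(s), character_side(s))
```

The two columns printed for each s agree, which mirrors the way the character sum in (48) records the "volume" of the fibre of the commutator map over s.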
3.2. $\Sigma$ is the connected sum of a Riemann surface of genus $\ell$ with P. We know from [W] the volume formula for a moduli space of flat connections on a compact orientable surface with one boundary component [W](4.114), where s ∈ G and the holonomy around the boundary is constrained to be s:
$$\mathrm{Vol}(R(\Sigma_\ell, s)) = C_1\,\mathrm{Vol}\,T \sum_{\alpha} \frac{1}{(\dim(\alpha))^{2\ell-1}}\,\chi_\alpha(s),$$
where the constant $C_1$ is given by Eq. (36).

4 The equations [W] (2.43), (2.48), (2.50), (2.70) and (4.50) are standard facts from the representation theory of compact Lie groups, and follow from the orthogonality relations. See also [BD].
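The following picture shows the relations between the pushforwards.
$$[G \times G]^{\ell} \xrightarrow{\;Q\;} G \xleftarrow{\;R\;} G, \qquad \pi : G \to T/W,$$
where
$$Q(A_1, \dots, A_\ell, B_1, \dots, B_\ell) = \prod_{i=1}^{\ell} A_i B_i A_i^{-1} B_i^{-1} \qquad (54)$$
and
$$R(g) = g^2. \qquad (55)$$
By definition of the pushforward,
$$\mathrm{Vol}(R(P, g))\,dg = R_*(dg). \qquad (56)$$
Thus
$$\mathrm{Vol}(R(\Sigma)) = \mathrm{Vol}(R(\Sigma_\ell \# P)) = \int_{s\in G} \mathrm{Vol}(R(\Sigma_\ell, s))\,\mathrm{Vol}(R(P, s))\,ds$$
$$= C_1\,\mathrm{Vol}\,T \sum_{\alpha} \frac{1}{(\dim(\alpha))^{2\ell-1}} \int_G \chi_\alpha(g)\,R_*(dg)$$
$$= C_1\,\mathrm{Vol}\,T \sum_{\alpha} \frac{1}{(\dim(\alpha))^{2\ell-1}} \int_G \chi_\alpha(g^2)\,dg$$
$$= \frac{\#Z(G)\,\mathrm{Vol}(G)^{2\ell}}{(2\pi)^{(2\ell-1)|G|-|T|}} \sum_{\alpha} \frac{f_\alpha}{(\dim(\alpha))^{2\ell-1}},$$
where we have used Eq. (42) and $f_\alpha$ is defined in Sect. 2.5. We get our final formula
$$\mathrm{Vol}(R(\Sigma)) = \frac{\#Z(G)\,\mathrm{Vol}(G)^{2\ell}}{(2\pi)^{(2\ell-1)|G|-|T|}} \sum_{\alpha} \frac{f_\alpha}{(\dim(\alpha))^{2\ell-1}}.$$
Note that Witten's formula [W](4.93) is
$$\mathrm{Vol}(R(\Sigma)) = \frac{\#Z(G)\,\mathrm{Vol}(G)^{2\ell}}{(2\pi)^{(2\ell-1)|G|}} \sum_{\alpha} \frac{f_\alpha}{(\dim(\alpha))^{2\ell-1}}.$$
Our formula differs from Witten's by a multiplicative factor of $(2\pi)^{|T|}$ because Witten chose a different normalization.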
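
The key input in this subsection is Eq. (42). A minimal numerical sketch of that identity, under the assumption G = SU(2) with Haar measure normalized so that Vol(G) = 1 (a choice made here for illustration, not taken from the text), is given below. For SU(2), Weyl's integral formula reduces the Haar integral of a class function to an integral over the angle θ of the torus element diag(e^{iθ}, e^{-iθ}) with weight (2/π) sin²θ on [0, π], and the character of the (n+1)-dimensional irreducible representation is the sum of cos((n − 2k)θ) for k = 0, ..., n. The function names are hypothetical.

```python
import math

def chi(n, theta):
    """Character of the (n+1)-dimensional irrep of SU(2) at diag(e^{i theta}, e^{-i theta})."""
    return sum(math.cos((n - 2 * k) * theta) for k in range(n + 1))

def haar_integral(f, samples=2000):
    """Integral of a class function over SU(2) with total volume 1.

    Uses Weyl's formula for SU(2): (2/pi) * int_0^pi sin^2(theta) f(theta) dtheta,
    evaluated with the midpoint rule.
    """
    step = math.pi / samples
    total = 0.0
    for j in range(samples):
        theta = (j + 0.5) * step
        total += math.sin(theta) ** 2 * f(theta)
    return (2.0 / math.pi) * total * step

def frobenius_schur(n):
    """Numerical value of int_G chi_n(g^2) dg; g^2 has torus angle 2*theta."""
    return haar_integral(lambda theta: chi(n, 2 * theta))

for n in range(6):
    print(n, round(frobenius_schur(n), 9))
```

The values alternate between +1 (odd-dimensional representations, which carry a symmetric invariant form) and −1 (even-dimensional ones, which carry an antisymmetric form), matching the description of f_α after Eq. (42); the value 0 does not occur for SU(2) because every representation is self-dual.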
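3.3. $\Sigma$ is the connected sum of a Riemann surface of genus $\ell$ with the Klein bottle. Again, the following picture shows the relations between the pushforwards. (Note that we will have a similar calculation if we use [W] (4.114) with two boundary components instead. It will be like the calculation in Sect. 2.6.)
$$[G \times G]^{\ell} \xrightarrow{\;Q\;} G \xleftarrow{\;R\;} G \times G, \qquad \pi : G \to T/W,$$
where
$$Q(A_1, \dots, A_\ell, B_1, \dots, B_\ell) = \prod_{i=1}^{\ell} A_i B_i A_i^{-1} B_i^{-1}$$
and $R(g_1, g_2) = g_1 g_2 g_1^{-1} g_2$. We have that the volume of the moduli space of flat connections on a once punctured Klein bottle with holonomy around the puncture given by s is (by definition of the pushforward)
$$\mathrm{Vol}(R(K, g))\,dg = R_*(dg_1 \wedge dg_2). \qquad (57)$$
Moreover
$$\mathrm{Vol}(R(\Sigma)) = \mathrm{Vol}(R(\Sigma_\ell \# K)) = \int_{s\in G} ds\, \mathrm{Vol}(R(\Sigma_\ell, s))\,\mathrm{Vol}(R(K, s)).$$
Here we have used Lemma 6, which is again justified by [Go1] Prop. 3.7. Thus we have
$$\mathrm{Vol}(R(\Sigma)) = C_1\,\mathrm{Vol}\,T \sum_{\alpha} \frac{1}{(\dim(\alpha))^{2\ell-1}} \int_G \chi_\alpha(g)\,R_*(dg_1 \wedge dg_2)$$
$$= C_1\,\mathrm{Vol}\,T \sum_{\alpha} \frac{1}{(\dim(\alpha))^{2\ell-1}} \int_{G\times G} \chi_\alpha(g_1 g_2 g_1 g_2^{-1})\,dg_1\,dg_2$$
$$= \frac{\#Z(G)\,\mathrm{Vol}(G)^{2\ell}}{(2\pi)^{(2\ell-1)|G|-|T|}} \sum_{\alpha} \frac{1}{(\dim(\alpha))^{2\ell-1}}\,\frac{1}{\dim(\alpha)} \int_G \chi_\alpha(g_1)\,\chi_\alpha(g_1)\,dg_1,$$
where we have used Eq. (51). Thus
$$\mathrm{Vol}(R(\Sigma)) = \frac{\#Z(G)\,\mathrm{Vol}(G)^{2\ell+1}}{(2\pi)^{(2\ell-1)|G|-|T|}} \sum_{\alpha=\bar\alpha} \frac{1}{(\dim(\alpha))^{2\ell}},$$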
where we have used (cf. [W](4.50))
$$\int_G \chi_\alpha(g)\,\chi_\alpha(g)\,dg = \begin{cases} 0 & \text{if } \alpha \neq \bar\alpha, \\ \mathrm{Vol}(G) & \text{if } \alpha = \bar\alpha. \end{cases} \qquad (58)$$
Thus our final formula is
$$\mathrm{Vol}(R(\Sigma)) = \frac{\#Z(G)\,\mathrm{Vol}(G)^{2\ell+1}}{(2\pi)^{(2\ell-1)|G|-|T|}} \sum_{\alpha=\bar\alpha} \frac{1}{(\dim(\alpha))^{2\ell}}.$$
We compare our result with Witten's formula [W](4.77):
$$\mathrm{Vol}(R(\Sigma)) = \frac{\#Z(G)\,\mathrm{Vol}(G)^{2\ell+1}}{(2\pi)^{(2\ell)|G|}} \sum_{\alpha=\bar\alpha} \frac{1}{(\dim(\alpha))^{2\ell}}.$$
Our formula differs from Witten's by a multiplicative factor of $(2\pi)^{|G|+|T|}$ because Witten chose a different normalization. This gives us a better idea of the geometric meaning of Witten's volume [W]. It is the integral of the measure derived from the symplectic volume on the moduli space of flat connections on an orientable surface and the pushforward volume of the Haar measure on products of G. Since the symplectic volume of the orientable part can also be explained as the Haar measure of the Lie group model (as we did in Sect. 3.1), this explains why the Haar measure on products of Lie groups gives Witten's formula for the volume on the moduli space of flat connections on a nonorientable 2-manifold.
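
The restriction of the sum to self-dual representations in the Klein bottle formula has a simple finite-group counterpart, sketched below as an illustration only (finite groups stand in for the compact group G, and counting replaces Haar measure). In the genus-zero case the analog reads: the number of pairs (g1, g2) with g1 g2 g1^-1 g2 = 1 equals |G| times the number of self-dual irreducible characters. The sketch uses the standard fact that this number equals the number of conjugacy classes closed under inversion, so no character table is needed. The alternating group A4 is chosen as an assumed example because two of its four irreducible characters are not self-dual. All names are hypothetical.

```python
from itertools import permutations, product

def mul(p, q):
    """Composition of permutations: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inv(p):
    """Inverse permutation."""
    r = [0] * len(p)
    for i, image in enumerate(p):
        r[image] = i
    return tuple(r)

def is_even(p):
    """True if p has an even number of inversions."""
    n = len(p)
    return sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j]) % 2 == 0

def klein_bottle_count(group):
    """Number of pairs (g1, g2) with g1 g2 g1^-1 g2 equal to the identity."""
    identity = tuple(range(len(group[0])))
    return sum(1 for g1, g2 in product(group, group)
               if mul(mul(g1, g2), mul(inv(g1), g2)) == identity)

def real_class_count(group):
    """Number of conjugacy classes that contain the inverses of their elements."""
    seen, real = set(), 0
    for g in group:
        if g in seen:
            continue
        cls = {mul(mul(h, g), inv(h)) for h in group}
        seen |= cls
        if inv(g) in cls:
            real += 1
    return real

S3 = list(permutations(range(3)))
A4 = [p for p in permutations(range(4)) if is_even(p)]

for name, group in (("S3", S3), ("A4", A4)):
    print(name, klein_bottle_count(group), len(group) * real_class_count(group))
```

For S3 every class is closed under inversion and the count is 6 × 3; for A4 the two classes of 3-cycles are exchanged by inversion, so only two classes contribute and the count drops to 12 × 2, in the same way that representations with α ≠ ᾱ drop out of the sum above.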
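Appendix A. Evaluation of an Integral

In this appendix we compute the integral (23). We assume k is a positive integer, so
$$\int_{\cos(\phi/2)}^{1} \Big(\frac{x}{l} - \sqrt{\Big(\frac{x}{l}\Big)^2 - 1}\,\Big)^{2k} \frac{dx}{\sqrt{1 - x^2}} = S_1 + S_2, \qquad (59)$$
where
$$S_1 = \sum_{s=1}^{k} \frac{(2k)!}{(2s)!\,(2k-2s)!} \int_{\cos(\phi/2)}^{1} \Big(\frac{x}{l}\Big)^{2s} \Big(\Big(\frac{x}{l}\Big)^2 - 1\Big)^{k-s} \frac{dx}{\sqrt{1 - x^2}} \qquad (60)$$
and
$$S_2 = -\sum_{s=0}^{k-1} \frac{(2k)!}{(2s+1)!\,(2k-2s-1)!} \int_{\cos(\phi/2)}^{1} \Big(\frac{x}{l}\Big)^{2s+1} \Big(\Big(\frac{x}{l}\Big)^2 - 1\Big)^{k-s-1} \sqrt{\Big(\frac{x}{l}\Big)^2 - 1}\; \frac{dx}{\sqrt{1 - x^2}}. \qquad (61)$$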
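We can use the trigonometric substitution x = cos θ for the integral (60). This yields
$$\sum_{s=1}^{k} \int_{\cos(\phi/2)}^{1} \Big(\frac{x}{l}\Big)^{2s} \Big(\Big(\frac{x}{l}\Big)^2 - 1\Big)^{k-s} \frac{dx}{\sqrt{1 - x^2}}
= \sum_{s=1}^{k} \int_{0}^{\phi/2} \frac{(\cos^2\theta)^{s}}{l^{2s}} \Big(\frac{\cos^2\theta}{l^2} - 1\Big)^{k-s} d\theta$$
$$= \sum_{s=1}^{k} \sum_{r=0}^{k-s} (-1)^{k-s-r}\, \frac{1}{l^{2(s+r)}}\, \frac{(k-s)!}{r!\,(k-s-r)!} \int_{0}^{\phi/2} (\cos^2\theta)^{s+r}\, d\theta$$
$$= \sum_{s=1}^{k} \sum_{r=0}^{k-s} \frac{(k-s)!\,(-1)^{k-s-r}}{l^{2(s+r)}\, r!\,(k-s-r)!} \Big[ \frac{1}{2^{2(s+r)}}\, \frac{(2s+2r)!}{(s+r)!\,(s+r)!}\, \frac{\phi}{2}
+ \frac{1}{2^{2(s+r)-1}} \sum_{p=0}^{s+r-1} \frac{(2s+2r)!}{p!\,(2s+2r-p)!}\, \frac{\sin\,(s+r-p)\phi}{2(s+r-p)} \Big].$$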
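
The last step above rests on the standard power-reduction formula for the integral of cos^{2n} θ over [0, a], here with n = s + r and a = φ/2. A minimal numerical sketch of that single ingredient follows; it checks the closed form against a composite Simpson rule. The function names and the sample values of n and a are illustrative assumptions, not quantities from the paper.

```python
import math

def cos_power_integral_closed(n, a):
    """Closed form of int_0^a cos^(2n)(theta) dtheta via power reduction."""
    lead = math.comb(2 * n, n) * a / 4 ** n
    oscillating = sum(
        math.comb(2 * n, p) * math.sin(2 * (n - p) * a) / (2 * (n - p))
        for p in range(n)
    )
    return lead + oscillating / 2 ** (2 * n - 1)

def cos_power_integral_numeric(n, a, panels=2000):
    """Same integral by the composite Simpson rule (panels must be even)."""
    h = a / panels
    total = math.cos(0.0) ** (2 * n) + math.cos(a) ** (2 * n)
    for j in range(1, panels):
        weight = 4 if j % 2 else 2
        total += weight * math.cos(j * h) ** (2 * n)
    return total * h / 3

for n in range(1, 6):
    a = 0.7
    print(n, cos_power_integral_closed(n, a), cos_power_integral_numeric(n, a))
```

With a = φ/2 the term sin(2(n − p)a) becomes sin((n − p)φ), which is the form in which it enters the expression for the integral (60).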
The integral (61) can be obtained by the substitution $y = \sqrt{1 - x^2}$. Introducing $m^2 = 1 - l^2$ and y = mz, this gives
$$\sum_{s=0}^{k-1} \int_{\cos(\phi/2)}^{1} \frac{(2k)!}{(2s+1)!\,(2k-2s-1)!} \Big(\frac{x}{l}\Big)^{2s+1} \Big(\Big(\frac{x}{l}\Big)^2 - 1\Big)^{k-s-1} \sqrt{\Big(\frac{x}{l}\Big)^2 - 1}\; \frac{dx}{\sqrt{1 - x^2}}$$
$$= \sum_{s=0}^{k-1} \sum_{t=0}^{s} \int_{0}^{(1/m)\sin(\phi/2)} \frac{1}{l^{2k}}\, \frac{s!}{t!\,(s-t)!}\, (-m^2 z^2)^{t}\, m^{2(k-s-1)} (1 - z^2)^{k-s-1}\, m^2 \sqrt{1 - z^2}\; dz.$$
Now let z = sin θ, and we obtain
$$\sum_{s=0}^{k-1} \sum_{t=0}^{s} \sum_{r=0}^{t} \frac{m^{2(k-s+t)}}{(1 - m^2)^{k}}\, (-1)^{t+r}\, \frac{s!}{r!\,(t-r)!\,(s-t)!} \int_{0}^{\sin^{-1}[(1/m)\sin(\phi/2)]} (\cos^2\theta)^{k-s+r}\, d\theta$$
$$= \sum_{s=0}^{k-1} \sum_{t=0}^{s} \sum_{r=0}^{t} \frac{m^{2(k-(s-t))}}{(1 - m^2)^{k}}\, (-1)^{t+r}\, \frac{s!}{r!\,(t-r)!\,(s-t)!}
\times \Big[ \frac{1}{2^{2(k-(s-r))}}\, \frac{[2(k-s+r)]!}{(k-s+r)!\,(k-s+r)!}\, \theta_m
+ \frac{1}{2^{2(k-s+r)-1}} \sum_{p=0}^{k-s+r-1} \frac{[2(k-s+r)]!}{p!\,(2[k-s+r]-p)!}\, \frac{\sin\{2\theta_m (k-s+r-p)\}}{2(k-s+r-p)} \Big],$$
where $\theta_m = \arcsin\big((1/m)\sin(\phi/2)\big)$.

Acknowledgement. This article comprises part of the Ph.D. thesis of the first author, under the supervision of the second author. Both authors would like to thank Eckhard Meinrenken and Chris Woodward for useful conversations and insightful advice.
References

[AMM] Alekseev, A., Malkin, A., Meinrenken, E.: Lie group valued moment maps. J. Differ. Geom. 48, 445–495 (1998)
[AMW] Alekseev, A., Meinrenken, E., Woodward, C.: Duistermaat-Heckman measures and moduli spaces of flat bundles over surfaces. Geom. Funct. Anal. 12(1), 1–31 (2002)
[AB] Atiyah, M.F., Bott, R.: The Yang-Mills equations over Riemann surfaces. Philos. Trans. Roy. Soc. London Ser. A 308(1505), 523–615 (1983)
[AB2] Atiyah, M.F., Bott, R.: The moment map and equivariant cohomology. Topology 23(1), 1–28 (1984)
[BeSe] Becker, C., Sengupta, A.: Sewing Yang-Mills measures and moduli spaces over compact surfaces. J. Funct. Anal. 52, 74–99 (1998)
[BGV] Berline, N., Getzler, E., Vergne, M.: Heat Kernels and Dirac Operators. New York: Springer-Verlag, 1992
[BD] Bröcker, T., tom Dieck, T.: Representations of Compact Lie Groups. Graduate Texts in Mathematics 98. New York: Springer-Verlag, 1985
[B] Buser, P.: Geometry and Spectra of Compact Riemann Surfaces. Progress in Math, Vol. 106, Boston-Basel: Birkhäuser, 1992
[Cox] Coxeter, H.S.M.: Introduction to Geometry. New York: Wiley, 1989
[dC] do Carmo, M.: Differential Geometry of Curves and Surfaces. New York: Prentice-Hall, 1976
[d'H] d'Hoker, E.: String Theory. In: Quantum Fields and Strings: A Course for Mathematicians, P. Deligne et al. (ed.), Vol. 2, Providence, RI: AMS, 1999, pp. 807–1012
[Fine] Fine, D.: Quantum Yang-Mills on the two-sphere. Commun. Math. Phys. 134, 273–292 (1990)
[Fo] Forman, R.: Small volume limits of 2-d Yang-Mills. Commun. Math. Phys. 151, 39–52 (1993)
[Fr] Freed, D.: Reidemeister torsion, spectral sequences, and Brieskorn spheres. J. Reine Angew. Math. 429, 75–89 (1992)
[Go1] Goldman, W.M.: The symplectic nature of fundamental groups of surfaces. Adv. Math. 54, 200–225 (1984)
[GM] Goldman, W.M., Millson, J.J.: The deformation theory of representations of fundamental groups of compact Kähler manifolds. Bull. Am. Math. Soc. (N.S.) 18(2), 153–158 (1988)
[Liu] Liu, K.: Heat kernel and moduli space. Math. Res. Lett. 3, 743–762 (1996)
[Ma] Massey, W.S.: Algebraic Topology: An Introduction. Graduate Texts in Mathematics 56. New York: Springer-Verlag, 1967
[Mi] Migdal, A.A.: Recursion equations in gauge field theories. Sov. Phys. JETP 42, 413–418 (1976)
[Se1] Sengupta, A.: Yang-Mills on surfaces with boundary: quantum theory and symplectic limit. Commun. Math. Phys. 183, 661–705 (1997)
[Se2] Sengupta, A.: Sewing symplectic volumes for flat connections over compact surfaces. J. Geom. Phys. 32, 269–292 (2000)
[Se3] Sengupta, A.: The Yang-Mills measure and symplectic structure over spaces of connections. In: Quantization of singular symplectic quotients, Progress in Mathematics Vol. 198, Basel-Boston: Birkhäuser, 2001, pp. 329–355
[Weis] Weisstein, E.W.: Spherical trigonometry. From MathWorld, A Wolfram Web Resource, http://mathworld.wolfram.com/SphericalTrigonometry.html
[W] Witten, E.: On quantum gauge theories in two dimensions. Commun. Math. Phys. 141, 153–209 (1991)

Communicated by N.A. Nekrasov
Commun. Math. Phys. 256, 565–588 (2005) Digital Object Identifier (DOI) 10.1007/s00220-005-1288-7
Communications in
Mathematical Physics
Discrete Miura Opers and Solutions of the Bethe Ansatz Equations

Evgeny Mukhin1, Alexander Varchenko2

1 Department of Mathematical Sciences, Indiana University Purdue University Indianapolis, 402 North Blackford St., Indianapolis, IN 46202-3216, USA
2 Department of Mathematics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-3250, USA

Received: 20 February 2004 / Accepted: 5 October 2004
Published online: 18 February 2005 – © Springer-Verlag 2005

Abstract: Solutions of the Bethe ansatz equations associated to the XXX model of a simple Lie algebra $\mathfrak{g}$ come in families called the populations. We prove that a population is isomorphic to the flag variety of the Langlands dual Lie algebra ${}^t\mathfrak{g}$. The proof is based on the correspondence between the solutions of the Bethe ansatz equations and special difference operators which we call the discrete Miura opers. The notion of a discrete Miura oper is one of the main results of the paper. For a discrete Miura oper D, associated to a point of a population, we show that all solutions of the difference equation DY = 0 are rational functions, and the solutions can be written explicitly in terms of points composing the population.

1. Introduction

The Bethe ansatz is a large collection of methods in the theory of quantum integrable models to calculate the spectrum and eigenvectors for a certain commutative algebra of observables of an integrable model. The elements of the algebra are called integrals of motion or conservation laws of the model, see for instance [BIK, Fa, FT]. In the theory of the Bethe ansatz one assigns the Bethe ansatz equations to an integrable model. Then a solution of the Bethe ansatz equations gives an eigenvector of commuting Hamiltonians of the model. The general conjecture is that the constructed vectors form a basis in the space of states of the model.

The first step to that conjecture is to count the number of solutions of the Bethe ansatz equations. One can expect that the number of solutions is equal to the dimension of the space of states of the model.

The Bethe ansatz equations of the XXX model is a system of algebraic equations associated to a Kac-Moody algebra $\mathfrak{g}$, a non-zero step h ∈ C, complex numbers $z_1, \dots, z_n$, integral dominant $\mathfrak{g}$-weights $\Lambda_1, \dots, \Lambda_n$ and an integral $\mathfrak{g}$-weight $\Lambda_\infty$, see [OW] and Sect. 2.2 below.

Supported in part by NSF grant DMS-0140460
Supported in part by NSF grant DMS-0244579
To approach the corresponding counting problem, to every solution of the Bethe ansatz equations we assign an object called the population of solutions. We expect that it would be easier to count populations than individual solutions. For instance, if the Kac-Moody algebra is of type Ar , then each population corresponds to a point of the intersection of suitable Schubert varieties in a suitable Grassmannian variety, as was shown in [MV3]. Then the Schubert calculus allows us to count the number of intersection points of the Schubert varieties and to give an upper bound on the number of populations, see [MV1] and [MV3]. A population of solutions is an interesting object. It is an algebraic variety. It is finitedimensional, if the Weyl group of the Kac-Moody algebra is finite. In this paper we prove that a population is isomorphic to the flag variety of the Langlands dual Lie algebra t g , if g is a simple Lie algebra. The proof is based on the correspondence between solutions of the Bethe ansatz equations and special difference operators which we call the discrete Miura opers. Let G be the complex simply connected group with Lie algebra t g . To every solution t of the Bethe ansatz equations we assign a linear difference operator Dt = τh − Vt , where τh : f (x) → f (x + h) is the shift operator and Vt (x) is a suitable rational G-valued function. We call that difference operator a discrete Miura oper. Our discrete Miura opers are analogs of the differential operators considered by V. Drinfeld and V. Sokolov in their study of the KdV type equations [DS]. Different solutions of the Bethe ansatz equations correspond to different discrete Miura opers. The discrete Miura opers, corresponding to points of a given population, form an equivalence class with respect to a suitable gauge equivalence. Thus a population is isomorphic to an equivalence class of discrete Miura opers. 
We show that an equivalence class of discrete Miura opers is isomorphic to the flag variety of t g . If Dt is the discrete Miura oper corresponding to a solution t of the Bethe ansatz equations, then the set of solutions of the difference equation Dt Y = 0 with values in a suitable space is an important characteristic of t. It turns out that, for any simple Lie algebra and any solution t of the Bethe ansatz equations, the difference equation Dt Y = 0 has a rational fundamental matrix of solutions. Moreover, all solutions of the difference equation Dt Y = 0 can be written explicitly in terms of points composing the population, originated at t. Thus, the population of solutions of the Bethe ansatz equations “solves” the Miura difference equation Dt Y = 0 in rational functions. This is the second main result of the paper. The results of this paper in the cases of Ar and Br were first obtained in [MV3]. The populations related to the Gaudin model of a Kac-Moody algebra g were introduced in [MV1]. In [MV1] we conjectured that every g -population is isomorphic to the flag variety of the Langlands dual Kac-Moody algebra g L . That conjecture was proved for g of type Ar , Br , Cr in [MV1], for g of type G2 in [BM], for all simple Lie algebras in [F1] and [MV4]. The ideas of [F1] and [MV4] motivated the present paper. It turned out that all technical steps of [MV4] can be carried out in the difference setting of this paper. There are different versions of the XXX Bethe ansatz equations associated to a simple Lie algebra, see [OW, MV2, MV3]. Ogievetsky and Wiegman introduced in [OW] a set of Bethe ansatz equations for any simple Lie algebra g . For g of type Ar , Dr , E6 , E7 , E8 the Ogievetsky-Wiegman equations are the Bethe ansatz equations considered in this paper. For other simple Lie algebras the Ogievetsky-Wiegman equations are different from the Bethe ansatz equations considered in this paper, see Sect. 2.6. 
The discrete Miura opers, considered in this paper, are discrete versions of the special differential operators called the Miura opers and introduced in [DS]. Miura opers play an
essential role in the Drinfeld-Sokolov reduction, geometric Langlands correspondence, and are connected with the Bethe ansatz of the Gaudin model, see [DS, FFR, F2]. The quantization of the Drinfeld-Sokolov reduction was described in [FRS, SS, S]. The authors of those papers use some difference operators suitable for their purposes and analogous to Miura opers of [DS, FFR, F2]. It seems that our discrete Miura opers, associated with solutions of the Bethe ansatz equations, are not of the Miura type considered in [FRS, SS, S]. It would be interesting to see if our discrete Miura opers also play a role in discrete versions of the Drinfeld-Sokolov reduction and geometric Langlands correspondence. Considerations of the present paper are in the spirit of the geometric Langlands correspondence. Namely, we start from a solution t of the Bethe ansatz equations associated to a simple Lie algebra g , that is we start from a Bethe eigenvector of the commuting Hamiltonians of the XXX g model. Having a solution t we construct the associated discrete Miura G-oper Dt , whose fundamental matrix is a rational function. Having Dt we may recover the g population of solutions of the Bethe ansatz equations originated at t. In particular we may recover t and the associated Bethe eigenvector of the commuting Hamiltonians. The fact that the discrete Miura oper has a rational fundamental matrix of solutions is a discrete analog of the fact that a differential Miura oper has the trivial monodromy group, see [F1, MV4]. The paper is organized as follows. In Sect. 2 we introduce populations of solutions of the Bethe ansatz equations. In Sect. 3 we discuss elementary properties of discrete Miura opers corresponding to solutions of the Bethe ansatz equations. In Sect. 4 we give explicit formulas for solutions of the difference equation Dt Y = 0, see Theorems 4.1, 4.2. In Sect. 
5 we prove that the variety of gauge equivalent marked discrete Miura $\mathfrak{g}$ opers is isomorphic to the flag variety of ${}^t\mathfrak{g}$, see Theorem 5.1. We discuss the relations between the Bruhat cell decomposition of the flag variety and the populations of solutions of the Bethe ansatz equations in Sect. 6. The main results of the paper are Corollaries 6.5 and 6.6.

2. Populations of Solutions of the Bethe Equations, [MV2]

2.1. Kac-Moody algebras. Let $A = (a_{i,j})_{i,j=1}^r$ be a generalized Cartan matrix, $a_{i,i} = 2$, $a_{i,j} = 0$ if and only if $a_{j,i} = 0$, $a_{i,j} \in \mathbb{Z}_{\le 0}$ if $i \ne j$. We assume that $A$ is symmetrizable, i.e. there exists a diagonal matrix $D = \mathrm{diag}\{d_1, \dots, d_r\}$ with positive integers $d_i$ such that $B = DA$ is symmetric. Let $\mathfrak{g} = \mathfrak{g}(A)$ be the corresponding complex Kac-Moody Lie algebra (see [K], §1.2), $\mathfrak{h} \subset \mathfrak{g}$ the Cartan subalgebra. The associated scalar product is non-degenerate on $\mathfrak{h}^*$ and $\dim \mathfrak{h} = r + 2d$, where $d$ is the dimension of the kernel of the Cartan matrix $A$. Let $\alpha_i \in \mathfrak{h}^*$, $\alpha_i^\vee \in \mathfrak{h}$, $i = 1, \dots, r$, be the sets of simple roots, coroots, respectively. We have
\[ (\alpha_i, \alpha_j) = d_i\, a_{i,j}, \qquad \langle \lambda, \alpha_i^\vee \rangle = 2(\lambda, \alpha_i)/(\alpha_i, \alpha_i), \qquad \lambda \in \mathfrak{h}^*. \]
In particular, $\langle \alpha_j, \alpha_i^\vee \rangle = a_{i,j}$. Let $P = \{\lambda \in \mathfrak{h}^* \mid \langle \lambda, \alpha_i^\vee \rangle \in \mathbb{Z}\}$ and $P^+ = \{\lambda \in \mathfrak{h}^* \mid \langle \lambda, \alpha_i^\vee \rangle \in \mathbb{Z}_{\ge 0}\}$ be the sets of integral and dominant integral weights. Fix $\rho \in \mathfrak{h}^*$ such that $\langle \rho, \alpha_i^\vee \rangle = 1$, $i = 1, \dots, r$. We have $(\rho, \alpha_i) = (\alpha_i, \alpha_i)/2$.
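As a small illustration of the symmetrizability condition, the following Python sketch computes positive integers $d_i$ with $d_i a_{i,j} = d_j a_{j,i}$ and checks the identity $\langle \alpha_j, \alpha_i^\vee \rangle = a_{i,j}$. The helper name `symmetrizer` and the assumption that the Dynkin diagram is connected are ours and not part of the text.

```python
from fractions import Fraction
from math import gcd

def symmetrizer(A):
    """Return coprime positive integers d with d[i]*A[i][j] == d[j]*A[j][i],
    or None if no such d exists. Hypothetical helper; assumes the Dynkin
    diagram of A is connected."""
    r = len(A)
    d = [None] * r
    d[0] = Fraction(1)
    stack = [0]
    while stack:
        i = stack.pop()
        for j in range(r):
            if i != j and A[i][j] != 0 and d[j] is None:
                # propagate the condition d_i a_ij = d_j a_ji along an edge
                d[j] = d[i] * A[i][j] / A[j][i]
                stack.append(j)
    if any(x is None for x in d):
        return None
    if any(d[i] * A[i][j] != d[j] * A[j][i] for i in range(r) for j in range(r)):
        return None
    # clear denominators, then divide by the common factor
    m = 1
    for x in d:
        m = m * x.denominator // gcd(m, x.denominator)
    ints = [(x * m).numerator for x in d]
    g = 0
    for x in ints:
        g = gcd(g, x)
    return [x // g for x in ints]

A = [[2, -1], [-2, 2]]          # Cartan matrix of type B_2 in one common convention
d = symmetrizer(A)
B = [[d[i] * A[i][j] for j in range(2)] for i in range(2)]   # (alpha_i, alpha_j)
# <alpha_j, alpha_i^vee> = 2 (alpha_j, alpha_i) / (alpha_i, alpha_i) = a_{i,j}
pairing_ok = all(Fraction(2 * B[j][i], B[i][i]) == A[i][j]
                 for i in range(2) for j in range(2))
```

Here `d` plays the role of the diagonal of $D$, and `B` is the symmetric matrix $DA$.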
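The Weyl group $W \subset \mathrm{End}(\mathfrak{h}^*)$ is generated by reflections $s_i$, $i = 1, \dots, r$,
\[ s_i(\lambda) = \lambda - \langle \lambda, \alpha_i^\vee \rangle\, \alpha_i, \qquad \lambda \in \mathfrak{h}^*. \]
We use the notation
\[ w \cdot \lambda = w(\lambda + \rho) - \rho, \qquad w \in W,\ \lambda \in \mathfrak{h}^*, \]
for the shifted action of the Weyl group.

The Kac-Moody algebra $\mathfrak{g}(A)$ is generated by $\mathfrak{h}, e_1, \dots, e_r, f_1, \dots, f_r$ with defining relations
\[ \begin{aligned}
{}[e_i, f_j] &= \delta_{i,j}\, \alpha_i^\vee, & & i, j = 1, \dots, r, \\
[h, h'] &= 0, & & h, h' \in \mathfrak{h}, \\
[h, e_i] &= \langle \alpha_i, h \rangle\, e_i, & & h \in \mathfrak{h},\ i = 1, \dots, r, \\
[h, f_i] &= -\langle \alpha_i, h \rangle\, f_i, & & h \in \mathfrak{h},\ i = 1, \dots, r,
\end{aligned} \]
and the Serre relations
\[ (\mathrm{ad}\, e_i)^{1 - a_{i,j}} e_j = 0, \qquad (\mathrm{ad}\, f_i)^{1 - a_{i,j}} f_j = 0, \]
for all $i \ne j$. The generators $\mathfrak{h}, e_1, \dots, e_r, f_1, \dots, f_r$ are called the Chevalley generators. Denote by $\mathfrak{n}_+$ (resp. $\mathfrak{n}_-$) the subalgebra generated by $e_1, \dots, e_r$ (resp. $f_1, \dots, f_r$). Then $\mathfrak{g} = \mathfrak{n}_+ \oplus \mathfrak{h} \oplus \mathfrak{n}_-$. Set $\mathfrak{b}_\pm = \mathfrak{h} \oplus \mathfrak{n}_\pm$.

The Kac-Moody algebra ${}^t\mathfrak{g} = \mathfrak{g}({}^tA)$ corresponding to the transposed Cartan matrix ${}^tA$ is called the Langlands dual to $\mathfrak{g}$. Let ${}^t\alpha_i \in {}^t\mathfrak{h}^*$, ${}^t\alpha_i^\vee \in {}^t\mathfrak{h}$, $i = 1, \dots, r$, be the sets of simple roots, coroots of ${}^t\mathfrak{g}$, respectively. Then $\langle {}^t\alpha_i, {}^t\alpha_j^\vee \rangle = \langle \alpha_j, \alpha_i^\vee \rangle = a_{i,j}$ for all $i, j$.

2.2. The Bethe ansatz equations with parameters b. Fix a Kac-Moody algebra $\mathfrak{g} = \mathfrak{g}(A)$, a non-negative integer $n$, a collection of dominant integral weights $\Lambda = (\Lambda_1, \dots, \Lambda_n)$, $\Lambda_i \in P^+$, and complex numbers $z = (z_1, \dots, z_n)$. Fix a non-zero complex number $h$. Fix a collection $b = (b_{i,m})_{i,m=1,\ i \ne m}^{r}$ of complex numbers. We say that the parameters $b$ are symmetric if they satisfy the condition
\[ b_{i,m} + b_{m,i} = h \tag{1} \]
for all $i, m \in \{1, \dots, r\}$, $i \ne m$.

Choose a collection of non-negative integers $l = (l_1, \dots, l_r) \in \mathbb{Z}^r_{\ge 0}$. The choice of $l$ is equivalent to the choice of the weight
\[ \Lambda_\infty = \sum_{i=1}^{n} \Lambda_i - \sum_{j=1}^{r} l_j \alpha_j \in P. \]
The weight $\Lambda_\infty$ will be called the weight at infinity. Set $\bar\Lambda = (\Lambda_1, \dots, \Lambda_n, \Lambda_\infty)$. Let
\[ t = \{\, t_j^{(i)} \in \mathbb{C} \mid i = 1, \dots, r,\ j = 1, \dots, l_i \,\} \]
be a collection of complex numbers.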
The XXX Bethe ansatz equations associated to $\bar\Lambda$, $z$, $b$ is the following system of algebraic equations with respect to the variables $t$:
\[ \prod_{s=1}^{n} \frac{t_j^{(i)} - z_s + (\Lambda_s, \alpha_i^\vee)\, h/2}{t_j^{(i)} - z_s - (\Lambda_s, \alpha_i^\vee)\, h/2}
\ \times \prod_{m=1,\ m \ne i}^{r}\ \prod_{k=1}^{l_m} \left( \frac{t_j^{(i)} - t_k^{(m)} + b_{i,m}}{t_j^{(i)} - t_k^{(m)} + b_{i,m} - h} \right)^{-a_{i,m}}
\prod_{k=1,\ k \ne j}^{l_i} \frac{t_j^{(i)} - t_k^{(i)} - h}{t_j^{(i)} - t_k^{(i)} + h} = 1, \tag{2} \]
where $i = 1, \dots, r$, $j = 1, \dots, l_i$. The product of symmetric groups $S_l = S_{l_1} \times \dots \times S_{l_r}$ acts on the set of solutions of (2) permuting the coordinates with the same upper index. For $i = 1, \dots, r$, consider the $l_i$ equations (2) with fixed upper index $i$. We call that system of equations the Bethe ansatz equations with fixed upper index $i$.

2.3. Polynomials representing solutions of the Bethe ansatz equations. For a given $t = (t_j^{(i)})$ introduce an $r$-tuple of polynomials $y = (y_1(x), \dots, y_r(x))$, where
\[ y_i(x) = \prod_{j=1}^{l_i} \big( x - t_j^{(i)} \big). \tag{3} \]
Each polynomial is considered up to multiplication by a non-zero number. The tuple defines a point in the direct product $P(\mathbb{C}[x])^r$ of $r$ copies of the projective space associated to the vector space of polynomials in $x$. We say that the tuple $y$ represents the collection of numbers $t$. It is convenient to think that if a polynomial $y_k$ of a tuple $y = (y_1, \dots, y_r) \in P(\mathbb{C}[x])^r$ has degree zero, then the collection $t = (t_j^{(i)})$ contains no $t_j^{(i)}$ with $i = k$.

For $i = 1, \dots, r$ introduce polynomials
\[ T_i(x) = \prod_{s=1}^{n}\ \prod_{p=0}^{(\Lambda_s, \alpha_i^\vee) - 1} \big( x - z_s + (\Lambda_s, \alpha_i^\vee)\, h/2 - p h \big) \tag{4} \]
and
\[ Q_i(x) = T_i(x) \prod_{m=1,\ m \ne i}^{r} y_m(x + b_{i,m})^{-a_{i,m}}. \tag{5} \]
We say that the tuple $y$ is generic with respect to weights $\Lambda$, numbers $z$, and parameters $b$ if for every $i = 1, \dots, r$ the polynomial $y_i(x)$ has no multiple roots and no common roots with the polynomials $y_i(x + h)$ and $Q_i(x)$. If $y$ represents a solution $t$ of the Bethe ansatz equations (2) and $y$ is generic, then the $S_l$-orbit of $t$ is called a Bethe solution of (2). The Bethe ansatz equations can be written as
\[ \frac{Q_i\big(t_j^{(i)}\big)}{Q_i\big(t_j^{(i)} - h\big)} \prod_{k=1,\ k \ne j}^{l_i} \frac{t_j^{(i)} - t_k^{(i)} - h}{t_j^{(i)} - t_k^{(i)} + h} = 1, \tag{6} \]
where $i = 1, \dots, r$, $j = 1, \dots, l_i$.
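To make the two forms (2) and (6) of the Bethe ansatz equations concrete, here is a minimal Python sketch for $\mathfrak{g} = sl_2$, where there are no factors with $m \ne i$ and $Q_1 = T_1$. The data $h = 1$, $z = (0, 2)$, $(\Lambda_s, \alpha^\vee) = 1$ and the function names are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction as F

h = F(1)
z = [F(0), F(2)]     # assumed points z_1, z_2
lam = [1, 1]         # assumed values (Lambda_s, alpha^vee)

def T(x):
    """The polynomial (4) for sl_2, evaluated at x."""
    out = F(1)
    for zs, n in zip(z, lam):
        for p in range(n):
            out *= x - zs + n * h / 2 - p * h
    return out

def bethe_lhs(t, j):
    """Left-hand side of (2) for sl_2 at the root t[j]."""
    out = F(1)
    for zs, n in zip(z, lam):
        out *= (t[j] - zs + n * h / 2) / (t[j] - zs - n * h / 2)
    for k in range(len(t)):
        if k != j:
            out *= (t[j] - t[k] - h) / (t[j] - t[k] + h)
    return out

def bethe_lhs_Q(t, j):
    """Left-hand side of (6) for sl_2, where Q = T."""
    out = T(t[j]) / T(t[j] - h)
    for k in range(len(t)):
        if k != j:
            out *= (t[j] - t[k] - h) / (t[j] - t[k] + h)
    return out

t = [F(1)]           # l_1 = 1, the single root t = (z_1 + z_2)/2
```

With these data $t = (z_1 + z_2)/2$ solves the single Bethe equation, while for instance $t = 3$ does not; both forms of the left-hand side agree because the product over $p$ in (4) telescopes.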
For $i = 1, \dots, r$, a tuple $y$ is called fertile in the $i$th direction with respect to $\Lambda$, $z$, $b$, if there exists a polynomial $\tilde y_i$ satisfying the equation
\[ y_i(x + h)\, \tilde y_i(x) - y_i(x)\, \tilde y_i(x + h) = Q_i(x). \tag{7} \]
A tuple $y$ is called fertile with respect to $\Lambda$, $z$, $b$, if it is fertile in all directions $i = 1, \dots, r$.

Example. The tuple $(1, \dots, 1)$ is fertile with respect to any given $\Lambda$, $z$, $b$.

Instead of saying that $y$ is generic or fertile with respect to $\Lambda$, $z$, $b$ we will also say that $y$ is generic or fertile with respect to polynomials $T_1(x), \dots, T_r(x)$ and parameters $b$.

If $y$ is fertile in the $i$th direction and $\tilde y_i$ is a polynomial solution of (7), then the tuple
\[ y^{(i)} = (y_1, \dots, \tilde y_i, \dots, y_r) \in P(\mathbb{C}[x])^r \tag{8} \]
is called an immediate descendant of $y$ in the $i$th direction. If $\tilde y_i$ is a solution of (7), then $\tilde y_i + c\, y_i$ is a solution too for any $c \in \mathbb{C}$.

Lemma 2.1 ([MV1, MV2]). Assume that $y$ is generic. Let $\tilde y_i$ be a solution of (7). Then the tuple $(y_1, \dots, \tilde y_i + c\, y_i, \dots, y_r)$ is generic for almost all $c \in \mathbb{C}$. The exceptions form a finite subset in $\mathbb{C}$.

Lemma 2.2 ([MV1, MV2]). Let $y^j$, $j = 1, 2, \dots$, be a sequence of tuples in $P(\mathbb{C}[x])^r$ which has a limit $y^\infty$. Assume that all tuples $y^j$ are fertile. Then $y^\infty$ is fertile.

Lemma 2.3 (see [MV1, MV2]). Denote $\tilde l_i = \deg \tilde y_i$ and
\[ \Lambda_\infty^{(i)} = \sum_{s=1}^{n} \Lambda_s - \tilde l_i\, \alpha_i - \sum_{j=1,\ j \ne i}^{r} l_j \alpha_j. \]
If $\tilde l_i \ne l_i$, then $\Lambda_\infty^{(i)} = s_i \cdot \Lambda_\infty$, where $s_i \cdot$ is the shifted action of the $i$th reflection of the Weyl group.
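The fertility equation (7) and Lemma 2.3 can be checked on a small example. The sketch below assumes $\mathfrak{g} = sl_2$, $h = 1$, $T_1(x) = (x + 1/2)(x - 3/2)$, $y_1 = x - 1$, and the candidate $\tilde y_1 = -x^2 + 1/4$; these data and the helper names are our illustrative choices. Polynomials are stored as coefficient lists, lowest degree first.

```python
from fractions import Fraction as F
from math import comb

def shift(p, h):
    """Coefficients of p(x + h)."""
    out = [F(0)] * len(p)
    for n, c in enumerate(p):
        for k in range(n + 1):
            out[k] += c * comb(n, k) * h ** (n - k)
    return out

def mul(p, q):
    """Product of two polynomials."""
    out = [F(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def sub(p, q):
    """Difference p - q with trailing zero coefficients removed."""
    n = max(len(p), len(q))
    out = [(p[i] if i < len(p) else F(0)) - (q[i] if i < len(q) else F(0))
           for i in range(n)]
    while len(out) > 1 and out[-1] == 0:
        out.pop()
    return out

def lhs7(y, yt, h):
    """Left-hand side of (7): y(x+h) yt(x) - y(x) yt(x+h)."""
    return sub(mul(shift(y, h), yt), mul(y, shift(yt, h)))

h = F(1)
y = [F(-1), F(1)]                  # y_1 = x - 1
T = [F(-3, 4), F(-1), F(1)]        # T_1 = x^2 - x - 3/4 (equals Q_1 for sl_2)
yt = [F(1, 4), F(0), F(-1)]        # candidate y~_1 = -x^2 + 1/4
yt_plus = [F(1, 4) - 5, F(5), F(-1)]   # y~_1 + 5 y_1

# Lemma 2.3 for sl_2, with weights encoded by their pairing with alpha^vee:
def weight_at_infinity(lam, l):
    return sum(lam) - 2 * l

def shifted_reflection(w):
    # s . w = s(w + rho) - rho, with <rho, alpha^vee> = 1
    return -(w + 1) - 1
```

In this example $l_1 = 1$ and $\tilde l_1 = 2$, so the weight at infinity changes from $0$ to $-2$ (in terms of the pairing with $\alpha^\vee$), in agreement with the shifted reflection.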
Theorem 2.4 (cf. [MV2]). (i) Let a tuple $y = (y_1, \dots, y_r)$ be generic. Let $i \in \{1, \dots, r\}$. Then $y$ is fertile in the $i$th direction if and only if $t$ satisfies the Bethe ansatz equations (2) with fixed upper index $i$. (ii) Let parameters $b$ be symmetric. Let $y$ be generic and fertile. Let $i \in \{1, \dots, r\}$. Let $y^{(i)}$ be an immediate descendant of $y$ in the $i$th direction. Assume that $y^{(i)}$ is generic. Then $y^{(i)}$ is fertile.

Proof. To prove (i) introduce $g(x) = \tilde y_i(x)/y_i(x)$ and write (7) as
\[ g(x) - g(x + h) = \frac{Q_i(x)}{y_i(x)\, y_i(x + h)}. \tag{9} \]
The tuple $y$ is fertile in the $i$th direction if and only if there exists a rational function $g(x)$ satisfying (9). A function $g(x)$ exists if and only if
\[ \mathrm{Res}_{x = t_j^{(i)}}\ \frac{Q_i(x)}{y_i(x)\, y_i(x + h)} = -\,\mathrm{Res}_{x = t_j^{(i)} - h}\ \frac{Q_i(x)}{y_i(x)\, y_i(x + h)} \tag{10} \]
for $j = 1, \dots, l_i$. The system of equations (10) is equivalent to the system of the Bethe ansatz equations with fixed upper index $i$. Part (i) is proved.

To prove (ii) we check that the Bethe ansatz equations (2) are satisfied for roots of polynomials composing the tuple $y^{(i)}$. The Bethe ansatz equations with upper index $i$ are satisfied for roots of $y^{(i)}$ according to part (i). If $m$ is such that $a_{i,m} = 0$, then the Bethe ansatz equations with upper index $m$ for roots of $y^{(i)}$ are the same as for $y$, since the roots of $\tilde y_i$ do not enter those equations. Let $m$ be such that $a_{i,m} \ne 0$. Write (7) as
\[ \frac{y_i(x + h)}{y_i(x)} - \frac{\tilde y_i(x + h)}{\tilde y_i(x)} = \frac{Q_i(x)}{\tilde y_i(x)\, y_i(x)}. \]
Substitute into this equation the zeros of the polynomial $y_m(x + b_{i,m})$ and get
\[ \frac{\tilde y_i\big(t_k^{(m)} - b_{i,m} + h\big)}{\tilde y_i\big(t_k^{(m)} - b_{i,m}\big)} = \frac{y_i\big(t_k^{(m)} - b_{i,m} + h\big)}{y_i\big(t_k^{(m)} - b_{i,m}\big)} \tag{11} \]
for $k = 1, \dots, l_m$. The Bethe ansatz equations with upper index $m$ for roots of $y^{(i)}$ contain the factor $\tilde y_i(t_k^{(m)} + b_{m,i}) / \tilde y_i(t_k^{(m)} + b_{m,i} - h)$ while the Bethe ansatz equations with upper index $m$ for roots of $y$ contain the factor $y_i(t_k^{(m)} + b_{m,i}) / y_i(t_k^{(m)} + b_{m,i} - h)$. By (11) the two ratios are equal if parameters $b$ are symmetric, since then $b_{m,i} = h - b_{i,m}$. Hence the Bethe ansatz equations with upper index $m$ are satisfied for roots of $y^{(i)}$ if they are satisfied for roots of $y$. Part (ii) is proved.

2.4. Simple reproduction procedure. Assume that parameters $b$ are symmetric. Let $y$ represent a Bethe solution of (2). Let $i \in \{1, \dots, r\}$, and let $\tilde y_i$ be a polynomial solution of Eq. (7). For complex numbers $c_1$ and $c_2$, not both equal to zero, consider the tuple
\[ y^{(i)}_{(c_1 : c_2)} = (y_1, \dots, c_1 \tilde y_i + c_2 y_i, \dots, y_r) \in P(\mathbb{C}[x])^r. \]
The tuples form a one-parameter family. The parameter space of the family is the projective line $P^1$ with projective coordinates $(c_1 : c_2)$. We have a map
\[ Y_{y,i} : P^1 \to P(\mathbb{C}[x])^r, \qquad (c_1 : c_2) \mapsto y^{(i)}_{(c_1 : c_2)}. \]
Almost all tuples $y^{(i)}_{(c_1 : c_2)}$ are generic. The exceptions form a finite set in $P^1$. Thus, starting with a tuple $y$, representing a Bethe solution of Eqs. (2) associated to numbers $z_1, \dots, z_n$, integral dominant weights $\Lambda_1, \dots, \Lambda_n$, a weight $\Lambda_\infty$ at infinity, parameters $b$, and an index $i \in \{1, \dots, r\}$, we construct a family $Y_{y,i} : P^1 \to P(\mathbb{C}[x])^r$ of fertile tuples. For almost all $c \in P^1$ (with finitely many exceptions only), the tuple $Y_{y,i}(c)$ represents a Bethe solution of the Bethe ansatz equations associated to points $z_1, \dots, z_n$, integral dominant weights $\Lambda_1, \dots, \Lambda_n$, parameters $b$, and a suitable weight at infinity. We call this construction the simple reproduction procedure in the $i$th direction.
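The simple reproduction procedure can be illustrated numerically. The sketch below assumes $\mathfrak{g} = sl_2$, $h = 1$, $T_1(x) = (x + 1/2)(x - 3/2)$, $y_1 = x - 1$ and $\tilde y_1 = -x^2 + 1/4$ (illustrative data chosen by us, for which $y_1(x+1)\tilde y_1(x) - y_1(x)\tilde y_1(x+1) = T_1(x)$), and checks that the roots of $c_1 \tilde y_1 + c_2 y_1$ satisfy the Bethe ansatz equations for several points $(c_1 : c_2)$.

```python
import cmath

h = 1.0

def T(x):
    # assumed T_1 for z = (0, 2), weights (1, 1), h = 1
    return (x + 0.5) * (x - 1.5)

def roots_of_family(c1, c2):
    """Roots of c1*(-x^2 + 1/4) + c2*(x - 1)."""
    a, b, c = -c1, c2, 0.25 * c1 - c2
    if a == 0:
        return [-c / b]
    disc = cmath.sqrt(b * b - 4 * a * c)
    return [(-b + disc) / (2 * a), (-b - disc) / (2 * a)]

def bethe_lhs(t, j):
    """Left-hand side of the sl_2 Bethe ansatz equation at the root t[j]."""
    out = T(t[j]) / T(t[j] - h)
    for k in range(len(t)):
        if k != j:
            out *= (t[j] - t[k] - h) / (t[j] - t[k] + h)
    return out

def is_bethe_solution(c1, c2, tol=1e-9):
    t = roots_of_family(c1, c2)
    return all(abs(bethe_lhs(t, j) - 1) < tol for j in range(len(t)))

checks = [is_bethe_solution(0, 1), is_bethe_solution(1, 1), is_bethe_solution(1, 2)]
```

The point $(1 : 0)$ is one of the finitely many exceptions in this example: the roots $\pm 1/2$ of $\tilde y_1$ differ by $h$, so the tuple is not generic and the left-hand side of the Bethe equation is not defined there.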
2.5. General reproduction procedure. Assume that parameters $b$ are symmetric. Assume that a tuple $y \in P(\mathbb{C}[x])^r$ represents a Bethe solution of the Bethe ansatz equations associated to $z$, $\bar\Lambda$, $b$. Let $i = (i_1, \dots, i_k)$, $1 \le i_j \le r$, be a sequence of natural numbers. We define a $k$-parameter family of fertile tuples, $Y_{y,i} : (P^1)^k \to P(\mathbb{C}[x])^r$, by induction on $k$, starting at $y$ and successively applying the simple reproduction procedure in directions $i_1, \dots, i_k$. The image of this map is denoted by $P_{y,i}$.

For a given $i = (i_1, \dots, i_k)$, almost all tuples $Y_{y,i}(c)$ represent Bethe solutions of the Bethe ansatz equations associated to points $z_1, \dots, z_n$, dominant integral weights $\Lambda_1, \dots, \Lambda_n$, symmetric parameters $b$, and suitable weights at infinity. Exceptional values of $c \in (P^1)^k$ are contained in a proper algebraic subset.

It is easy to see that if $i' = (i'_1, \dots, i'_{k'})$, $1 \le i'_j \le r$, is a sequence of natural numbers, and the sequence $i'$ is contained in the sequence $i$ as an ordered subset, then $P_{y,i'}$ is a subset of $P_{y,i}$.

The union $P_y = \cup_i P_{y,i} \subset P(\mathbb{C}[x])^r$, where the union is over all sequences $i$, is called the population of solutions of the Bethe ansatz equations associated to the Kac-Moody algebra $\mathfrak{g}$, integral dominant weights $\Lambda_1, \dots, \Lambda_n$, numbers $z_1, \dots, z_n$, symmetric parameters $b$, and originated at $y$.

If two populations with the same $\Lambda$, $z$, $b$ intersect, then they coincide. If the Weyl group is finite, then all tuples of a population consist of polynomials of bounded degree. Thus, if the Weyl group of $\mathfrak{g}$ is finite, then a population is an irreducible projective variety. Every population $P$ has a tuple $y = (y_1, \dots, y_r)$, $\deg y_i = l_i$, such that the weight $\Lambda_\infty = \sum_{s=1}^{n} \Lambda_s - \sum_{i=1}^{r} l_i \alpha_i$ is dominant integral, see [MV1, MV2].

Conjecture 2.1 ([MV2]). Every population, associated to a Kac-Moody algebra $\mathfrak{g}$, dominant integral weights $\Lambda_1, \dots, \Lambda_n$, points $z_1, \dots, z_n$, symmetric parameters $b$, is an algebraic variety isomorphic to the flag variety associated to the Kac-Moody algebra ${}^t\mathfrak{g}$ which is Langlands dual to $\mathfrak{g}$. Moreover, the parts of the family corresponding to tuples of polynomials with fixed degrees are isomorphic to Bruhat cells of the flag variety.

The conjecture is proved for the Lie algebra of type $A_r$ in [MV3]. In this paper we prove the conjecture for every simple Lie algebra.

The populations, related to the Gaudin model of a Kac-Moody algebra $\mathfrak{g}$, were introduced in [MV1]. It was conjectured in [MV1] that every $\mathfrak{g}$-population of the Gaudin model is isomorphic to the flag variety of the Langlands dual Kac-Moody algebra $\mathfrak{g}^L$. That conjecture was proved for $\mathfrak{g}$ of type $A_r$, $B_r$, $C_r$ in [MV1], for $\mathfrak{g}$ of type $G_2$ in [BM], for all simple Lie algebras in [F1] and [MV4].

2.6. Special symmetric parameters b. Here are two examples of symmetric parameters $b$. The parameters $b$ are symmetric if
\[ b_{i,m} = \frac{h}{2} \quad \text{for all } i \ne m. \tag{12} \]
The parameters $b$ satisfying (12) will be called the Ogievetsky-Wiegmann parameters, cf. [OW, MV3]. The parameters $b$ are symmetric if
\[ b_{i,m} = 0 \ \text{ for } i > m \qquad \text{and} \qquad b_{i,m} = h \ \text{ for } i < m. \tag{13} \]
The parameters $b$ satisfying (13) will be called the special symmetric parameters, cf. [OW, MV2].

Let $b^1 = (b^1_{i,m})_{i,m=1,\ i \ne m}^{r}$ and $b^2 = (b^2_{i,m})_{i,m=1,\ i \ne m}^{r}$ be two collections of parameters. We say that $b^1$ and $b^2$ are gauge equivalent if there exist complex numbers $d^{(1)}, \dots, d^{(r)}$ with the following property. We require that for any tuple $(y_1(x), \dots, y_r(x))$, fertile with respect to some polynomials $T_1(x), \dots, T_r(x)$ and parameters $b^1$, the tuple $(y_1(x + d^{(1)}), \dots, y_r(x + d^{(r)}))$ is fertile with respect to polynomials $T_1(x + d^{(1)}), \dots, T_r(x + d^{(r)})$ and parameters $b^2$.

If $b^1$ and $b^2$ are gauge equivalent and $P$ is a population associated to some polynomials $T_1(x), \dots, T_r(x)$ and parameters $b^1$, then the set $\{(y_1(x + d^{(1)}), \dots, y_r(x + d^{(r)})) \mid (y_1(x), \dots, y_r(x)) \in P\}$ is a population associated to polynomials $T_1(x + d^{(1)}), \dots, T_r(x + d^{(r)})$ and parameters $b^2$.

Theorem 2.5. Let $b^1$ and $b^2$ be symmetric parameters. Assume that the Dynkin diagram of the Cartan matrix $A$ of the Lie algebra $\mathfrak{g}$ is a tree. Then $b^1$ and $b^2$ are gauge equivalent.

Proof. We will introduce parameters $b^3 = (b^3_{i,j})$ in terms of the Dynkin diagram and will prove that $b^1$ and $b^3$ are gauge equivalent. That will prove the theorem. Let $v_1, \dots, v_r$ be vertices of the Dynkin diagram corresponding to the roots $\alpha_1, \dots, \alpha_r$, respectively. For $i = 2, \dots, r$, let $v_{i_1}, \dots, v_{i_k}$ be the unique sequence of distinct vertices of the Dynkin diagram such that for $j = 1, \dots, k - 1$ the vertices $v_{i_j}$ and $v_{i_{j+1}}$ are connected by an edge, and $v_{i_1} = v_1$, $v_{i_k} = v_i$. The number $k$ will be called the distance between $v_1$ and $v_i$ and denoted by $\delta_i$. Let $b^3$ be defined by the rule:
\[ \begin{aligned}
b^3_{i,j} &= 0 \ \text{ if } \delta_i > \delta_j, & b^3_{i,j} &= h \ \text{ if } \delta_i < \delta_j, \\
b^3_{i,j} &= 0 \ \text{ if } \delta_i = \delta_j \text{ and } i > j, & b^3_{i,j} &= h \ \text{ if } \delta_i = \delta_j \text{ and } i < j.
\end{aligned} \]
Clearly $b^3$ is symmetric. Define $(d^{(1)}, \dots, d^{(r)})$ as follows. Set $d^{(i)} = b^1_{i_1,i_2} + b^1_{i_2,i_3} + \dots + b^1_{i_{k-1},i_k} - (k - 1)h$ for $i > 1$ and set $d^{(1)} = 0$. It is easy to see that the sequence $(d^{(1)}, \dots, d^{(r)})$ establishes the equivalence of $b^1$ and $b^3$.

Corollary 2.6. If the Dynkin diagram is a tree, then any set of symmetric parameters is gauge equivalent to the set of special symmetric parameters given by (13).

Corollary 2.7. If $\mathfrak{g}$ is simple, then any set of symmetric parameters is gauge equivalent to the set of special symmetric parameters given by (13).
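The parameter sets of this section are easy to tabulate. The following Python sketch builds the parameters (12) and (13) and the parameters $b^3$ from the proof of Theorem 2.5 for a tree given by its edges, and checks the symmetry condition (1). The function names, the dictionary encoding, and the convention $\delta_1 = 1$ (a path consisting of the single vertex $v_1$) are our assumptions.

```python
from fractions import Fraction as F
from collections import deque

def ogievetsky_wiegmann(r, h):
    """Parameters (12): b_{i,m} = h/2 for all i != m."""
    return {(i, m): h / 2 for i in range(1, r + 1)
            for m in range(1, r + 1) if i != m}

def special_symmetric(r, h):
    """Parameters (13): b_{i,m} = 0 for i > m and h for i < m."""
    return {(i, m): (F(0) if i > m else h) for i in range(1, r + 1)
            for m in range(1, r + 1) if i != m}

def is_symmetric(b, h):
    """Condition (1): b_{i,m} + b_{m,i} = h for all i != m."""
    return all(b[i, m] + b[m, i] == h for (i, m) in b)

def distances(edges, r):
    """delta_i = number of vertices on the path from v_1 to v_i in the tree."""
    nbrs = {i: [] for i in range(1, r + 1)}
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
    delta = {1: 1}      # assumed convention for the root vertex
    queue = deque([1])
    while queue:
        u = queue.popleft()
        for v in nbrs[u]:
            if v not in delta:
                delta[v] = delta[u] + 1
                queue.append(v)
    return delta

def tree_parameters(edges, r, h):
    """The parameters b^3 defined in the proof of Theorem 2.5."""
    delta = distances(edges, r)
    b = {}
    for i in range(1, r + 1):
        for j in range(1, r + 1):
            if i != j:
                zero = delta[i] > delta[j] or (delta[i] == delta[j] and i > j)
                b[i, j] = F(0) if zero else h
    return b

h = F(1)
d4_edges = [(1, 2), (2, 3), (2, 4)]     # a tree on four vertices, in one labelling
b3 = tree_parameters(d4_edges, 4, h)
```

For a chain $1 - 2 - \dots - r$ labelled in order, the distances increase with the index, so `tree_parameters` returns exactly the special symmetric parameters (13).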
Ogievetsky and Wiegmann considered in [OW] a set of Bethe ansatz equations for any simple Lie algebra $\mathfrak{g}$. For $\mathfrak{g}$ of type $A_r$, $D_r$, $E_6$, $E_7$, $E_8$ the Ogievetsky-Wiegmann equations are the Bethe ansatz equations associated to parameters given by (12). For other simple Lie algebras the Ogievetsky-Wiegmann equations are different from the Bethe ansatz equations considered in this paper. For $\mathfrak{g}$ of type $A_r$ we considered in [MV3] the Bethe ansatz equations associated to the special symmetric parameters given by (13).
2.7. Diagonal sequences of polynomials associated to a Bethe solution and a sequence of indices. In this section we assume that parameters $b$ are symmetric. We introduce notions which will be used in Sect. 4 to construct solutions of difference equations.

Lemma 2.8. Assume that a tuple $y \in P(\mathbb{C}[x])^r$ represents a Bethe solution of the Bethe ansatz equations associated to $z$, $\bar\Lambda$, symmetric parameters $b$. Let $i = (i_1, \dots, i_k)$, $1 \le i_j \le r$, be a sequence of natural numbers. Then there exist tuples $y^{(i_1)} = (y_1^{(i_1)}, \dots, y_r^{(i_1)})$, $y^{(i_1,i_2)} = (y_1^{(i_1,i_2)}, \dots, y_r^{(i_1,i_2)})$, ..., $y^{(i_1,\dots,i_k)} = (y_1^{(i_1,\dots,i_k)}, \dots, y_r^{(i_1,\dots,i_k)})$ in $P(\mathbb{C}[x])^r$ such that

(i)
\[ y_{i_1}^{(i_1)}(x)\, y_{i_1}(x + h) - y_{i_1}^{(i_1)}(x + h)\, y_{i_1}(x) = T_{i_1}(x) \prod_{m=1,\ m \ne i_1}^{r} \big( y_m(x + b_{i_1,m}) \big)^{-a_{i_1,m}} \]
and $y_j^{(i_1)} = y_j$ for $j \ne i_1$;

(ii) for $l = 2, \dots, k$,
\[ y_{i_l}^{(i_1,\dots,i_l)}(x)\, y_{i_l}^{(i_1,\dots,i_{l-1})}(x + h) - y_{i_l}^{(i_1,\dots,i_l)}(x + h)\, y_{i_l}^{(i_1,\dots,i_{l-1})}(x) = T_{i_l}(x) \prod_{m=1,\ m \ne i_l}^{r} \big( y_m^{(i_1,\dots,i_{l-1})}(x + b_{i_l,m}) \big)^{-a_{i_l,m}} \]
and $y_j^{(i_1,\dots,i_l)} = y_j^{(i_1,\dots,i_{l-1})}$ for $j \ne i_l$.

The tuples $y^{(i_1)}, y^{(i_1,i_2)}, \dots, y^{(i_1,\dots,i_k)}$ belong to the population $P_y$. The tuple $y^{(i_1)}$ is obtained from $y$ by the simple reproduction procedure in the $i_1$th direction, and for $l = 2, \dots, k$ the tuple $y^{(i_1,\dots,i_l)}$ is obtained from $y^{(i_1,\dots,i_{l-1})}$ by the simple reproduction procedure in the $i_l$th direction.

The sequence of tuples $y^{(i_1)}, y^{(i_1,i_2)}, \dots, y^{(i_1,\dots,i_k)}$ satisfying Lemma 2.8 will be called associated to the Bethe solution $y$ and the sequence of indices $i$. The sequence of polynomials $y_{i_1}^{(i_1)}, y_{i_2}^{(i_1,i_2)}, \dots, y_{i_k}^{(i_1,\dots,i_k)}$ will be called a diagonal sequence of polynomials associated to the Bethe solution $y$ and the sequence of indices $i$. For a given $y$ the diagonal sequence of polynomials determines the sequence of tuples $y^{(i_1)}, y^{(i_1,i_2)}, \dots, y^{(i_1,\dots,i_k)}$ uniquely. There are many diagonal sequences of polynomials associated to a given Bethe solution and a given sequence of indices.
3. Discrete Opers

In the remaining part of the paper, $\mathfrak{g} = \mathfrak{g}(A)$ is a simple Lie algebra of rank $r$. Denote the coroots ${}^t\alpha_1^\vee, \dots, {}^t\alpha_r^\vee \in {}^t\mathfrak{h}$ of ${}^t\mathfrak{g}$ by $H_1, \dots, H_r$, respectively. Let $H_1, \dots, H_r, E_1, \dots, E_r, F_1, \dots, F_r$ be the Chevalley generators of ${}^t\mathfrak{g}$. We have $[H_j, E_i] = a_{i,j} E_i$ and $[H_j, F_i] = -a_{i,j} F_i$, where $A = (a_{i,j})$ is the Cartan matrix of $\mathfrak{g}$. Let $G$ be the complex simply connected Lie group with Lie algebra ${}^t\mathfrak{g}$. Let $B_\pm$, $N_\pm$ be the subgroups of $G$ with Lie algebras ${}^t\mathfrak{b}_\pm$, ${}^t\mathfrak{n}_\pm$, respectively.

3.1. Relations in G. For a non-zero complex number $u$ and $i \in \{1, \dots, r\}$, consider the elements $u^{H_i}$, $\exp(u E_i)$, $\exp(u F_i)$ in $G$. We will use the following relations:

Lemma 3.1. Let $u$, $v$ be non-zero complex numbers. Then
\[ \begin{aligned}
u^{H_j} \exp(v E_i) &= \exp(u^{a_{i,j}} v E_i)\, u^{H_j}, \\
u^{H_j} \exp(v F_i) &= \exp(u^{-a_{i,j}} v F_i)\, u^{H_j}, \\
\exp(u F_i) \exp(v E_j) &= \exp(v E_j) \exp(u F_i) \quad \text{if } i \ne j, \\
\exp(u F_i) \exp(v E_i) &= \exp\Big( \frac{v}{1 + uv} E_i \Big) (1 + uv)^{-H_i} \exp\Big( \frac{u}{1 + uv} F_i \Big) \quad \text{if } 1 + uv \ne 0.
\end{aligned} \]

3.2. D-opers. Define the shift operator $\tau_h$ acting on functions of $x$ by the formula $\tau_h : g(x) \mapsto g(x + h)$. A discrete oper (a d-oper) is a difference operator of the form $D = \tau_h - V$, where $V : \mathbb{C} \to G$ is a rational function. For a rational function $s : \mathbb{C} \to N_+$, define the action of $s$ on the d-oper by the formula
\[ s \cdot D = s(x + h)\, D\, s(x)^{-1} = \tau_h - s(x + h)\, V(x)\, s(x)^{-1}. \]
The operator $s \cdot D$ is a d-oper. The d-opers $D$ and $s \cdot D$ are called gauge equivalent.

3.3. Miura d-opers associated to tuples of polynomials. In the remaining part of the paper we assume that $b = (b_{i,m})_{i,m=1,\ i \ne m}^{r}$ are special symmetric parameters given by (13). Fix dominant integral weights $\Lambda = (\Lambda_1, \dots, \Lambda_n)$ of $\mathfrak{g}$, complex numbers $z = (z_1, \dots, z_n)$. Introduce polynomials $T_1(x), \dots, T_r(x)$ by formula (4). For a tuple $y = (y_1, \dots, y_r)$ of non-zero polynomials, and $i \in \{1, \dots, r\}$, we define the rational function $R_{y,i}$ by the formula
\[ R_{y,i}(x) = \frac{T_i(x)}{y_i(x + h)\, y_i(x)} \prod_{m,\ m \ne i} \big( y_m(x + b_{i,m}) \big)^{-a_{i,m}}. \]
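The relations of Lemma 3.1 with $i = j$ can be checked directly in the simplest case. The sketch below assumes $G = SL(2, \mathbb{C})$ in its two-dimensional representation, with $E = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, $F = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}$, $H = \mathrm{diag}(1, -1)$ and $a_{1,1} = 2$; the matrix helpers are our own and use exact rational arithmetic.

```python
from fractions import Fraction as Fr

def mm(X, Y):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def prod(*Ms):
    out = ((Fr(1), Fr(0)), (Fr(0), Fr(1)))
    for M in Ms:
        out = mm(out, M)
    return out

def expE(v):          # exp(v E)
    return ((Fr(1), v), (Fr(0), Fr(1)))

def expF(u):          # exp(u F)
    return ((Fr(1), Fr(0)), (u, Fr(1)))

def powH(u):          # u^H = diag(u, 1/u)
    return ((u, Fr(0)), (Fr(0), 1 / u))

u, v = Fr(2, 3), Fr(5, 7)
w = 1 + u * v

# u^H exp(v E) = exp(u^2 v E) u^H   and   u^H exp(v F) = exp(u^{-2} v F) u^H
rel1 = prod(powH(u), expE(v)) == prod(expE(u * u * v), powH(u))
rel2 = prod(powH(u), expF(v)) == prod(expF(v / (u * u)), powH(u))
# exp(u F) exp(v E) = exp(v/(1+uv) E) (1+uv)^{-H} exp(u/(1+uv) F)
rel4 = prod(expF(u), expE(v)) == prod(expE(v / w), powH(1 / w), expF(u / w))
```

The last relation is a Gauss decomposition of the matrix $\begin{pmatrix} 1 & v \\ u & 1 + uv \end{pmatrix}$, which exists precisely when $1 + uv \ne 0$.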
We say that a d-oper $D = \tau_h - V$ is the Miura d-oper associated to weights $\Lambda$, numbers $z$, and the tuple $y = (y_1, \dots, y_r)$ if
\[ V(x) = \prod_{j=1}^{r} y_j(x + h)^{-H_j} \times \exp(R_{y,1}(x) F_1) \exp(R_{y,2}(x) F_2) \cdots \exp(R_{y,r}(x) F_r) \prod_{j=1}^{r} y_j(x)^{H_j}. \]
We denote by $D_y = \tau_h - V_y$ the Miura d-oper associated to the tuple $y$. It is easy to see that if a Miura d-oper $D$ is associated to weights $\Lambda$, numbers $z$, and a tuple $y = (y_1, \dots, y_r)$ of non-zero polynomials, then the tuple $y$ is determined uniquely by $D$. It follows easily from Lemma 3.1 that the d-oper $D_y$ does not change if the polynomials of the tuple $y$ are multiplied by non-zero numbers.

For $i \in \{1, \dots, r\}$, we say that the Miura d-oper $D_y$ is deformable in the $i$th direction if there exist a non-zero rational function $g_i : \mathbb{C} \to \mathbb{C}$ and a non-zero polynomial $\tilde y_i$ such that
\[ \exp(g_i(x + h) E_i)\, D_y\, \exp(-g_i(x) E_i) = D_{y^{(i)}}, \]
where $y^{(i)} = (y_1, \dots, \tilde y_i, \dots, y_r)$. We say that $D_y$ is deformed to $D_{y^{(i)}}$ with the help of $g_i$.

Theorem 3.1. Let $b$ be the special symmetric parameters. Let $y$ be a tuple of non-zero polynomials.

(i) Assume that Eq. (7) has a polynomial solution $\tilde y_i$. Set $y^{(i)} = (y_1, \dots, \tilde y_i, \dots, y_r)$. Then the Miura d-oper $D_y$ is deformable in the $i$th direction to the Miura d-oper $D_{y^{(i)}}$ with the help of
\[ g_i(x) = \frac{1}{y_i(x)\, \tilde y_i(x)} \prod_{m,\ m \ne i} y_m(x)^{-a_{i,m}}. \tag{14} \]

(ii) If the tuple $y$ is generic in the sense of Sect. 2.3 and the Miura d-oper $D_y$ is deformable in the $i$th direction to the Miura d-oper $D_{y^{(i)}}$ with the help of $g_i$, then $\tilde y_i$ is a polynomial solution of Eq. (7), and $g_i$, $\tilde y_i$ satisfy (14).

Proof. For a scalar rational function $g_i$ we have
\[ \begin{aligned}
\exp(g_i(x + h) E_i)\, D_y\, \exp(-g_i(x) E_i) = \tau_h - {} & \prod_{j=1}^{r} y_j(x + h)^{-H_j} \exp(R_{y,1}(x) F_1) \exp(R_{y,2}(x) F_2) \\
& \cdots \exp(\tilde g_i(x + h) E_i) \exp(R_{y,i}(x) F_i) \exp(-\tilde g_i(x) E_i) \\
& \cdots \exp(R_{y,r}(x) F_r) \prod_{j=1}^{r} y_j(x)^{H_j},
\end{aligned} \]
where
\[ \tilde g_i(x) = g_i(x) \prod_{m=1}^{r} y_m(x)^{a_{i,m}}. \]
By Lemma 3.1 the Miura d-oper $D_y$ is deformable in the $i$th direction only if
\[ \tilde g_i(x + h) = \frac{\tilde g_i(x)}{1 - \tilde g_i(x)\, R_{y,i}(x)}. \]
This equation is called the $i$th discrete Riccati equation, see [MV4] where the classical Riccati equation appears in an analogous situation. The Riccati equation can be written as
\[ y_i(x + h)\, \frac{y_i(x)}{\tilde g_i(x)} - y_i(x)\, \frac{y_i(x + h)}{\tilde g_i(x + h)} = T_i(x) \prod_{m,\ m \ne i} y_m(x + b_{i,m})^{-a_{i,m}}. \tag{15} \]
If Eq. (7) has a polynomial solution $\tilde y_i$, then (15) has a rational solution
\[ \tilde g_i(x) = \frac{y_i(x)}{\tilde y_i(x)}. \]
Then $g_i$ is given by (14), and
\[ 1 - \tilde g_i(x)\, R_{y,i}(x) = \frac{\tilde y_i(x + h)\, y_i(x)}{y_i(x + h)\, \tilde y_i(x)}, \]
\[ \begin{aligned}
\exp(\tilde g_i(x + h) E_i) & \exp(R_{y,i}(x) F_i) \exp(-\tilde g_i(x) E_i) \\
&= \Big( \frac{y_i(x + h)}{\tilde y_i(x + h)} \Big)^{H_i} \Big( \frac{\tilde y_i(x)}{y_i(x)} \Big)^{H_i} \exp\Big( \frac{\tilde y_i(x)\, T_i(x)}{\tilde y_i(x + h)\, (y_i(x))^2} \prod_{m,\ m \ne i} y_m(x + b_{i,m})^{-a_{i,m}}\, F_i \Big) && (16) \\
&= \Big( \frac{y_i(x + h)}{\tilde y_i(x + h)} \Big)^{H_i} \exp\big( R_{y^{(i)},i}(x)\, F_i \big) \Big( \frac{\tilde y_i(x)}{y_i(x)} \Big)^{H_i}. && (17)
\end{aligned} \]
Using the last formula and Lemma 3.1 we easily conclude that the Miura d-oper $D_y$ is deformed in the $i$th direction to the Miura d-oper $D_{y^{(i)}}$ with the help of $g_i$ given by (14) if $\tilde y_i$ is a polynomial solution of (7). This proves part (i) of the theorem.

To prove part (ii) write (15) as
\[ \frac{1}{\tilde g_i(x)} - \frac{1}{\tilde g_i(x + h)} = \frac{T_i(x)}{y_i(x + h)\, y_i(x)} \prod_{m,\ m \ne i} y_m(x + b_{i,m})^{-a_{i,m}}. \]
Let $\tilde g_i(x)$ be a rational solution of this equation. Since $y$ is generic, the poles of $1/\tilde g_i(x)$ are located at zeros of $y_i(x)$ and all poles are simple. Hence $\tilde y_i(x) = y_i(x)/\tilde g_i(x)$ is a polynomial solution of (7). Then formulas (16), (17) hold and part (ii) is proved.
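The passage from Eq. (7) to the discrete Riccati equation can be checked pointwise. The sketch below assumes $\mathfrak{g} = sl_2$ (so the products over $m \ne i$ are empty), $h = 1$, $T_1(x) = (x + 1/2)(x - 3/2)$, $y_1 = x - 1$ and $\tilde y_1 = -x^2 + 1/4$; these data are illustrative choices of ours, for which (7) holds.

```python
from fractions import Fraction as F

h = F(1)

def y(x):
    return x - 1

def yt(x):                       # y~_1, a polynomial solution of (7)
    return -x * x + F(1, 4)

def T(x):
    return (x + F(1, 2)) * (x - F(3, 2))

def R(x):                        # R_{y,1}(x) for sl_2
    return T(x) / (y(x + h) * y(x))

def gt(x):                       # g~_1 = y_1 / y~_1
    return y(x) / yt(x)

# sample points away from the zeros of y, y(x+h), y~ and y~(x+h)
points = [F(2), F(7, 3), F(10), F(-5, 2)]

eq7 = all(y(x + h) * yt(x) - y(x) * yt(x + h) == T(x) for x in points)
riccati = all(gt(x + h) == gt(x) / (1 - gt(x) * R(x)) for x in points)
first_factor = all(1 - gt(x) * R(x) == yt(x + h) * y(x) / (y(x + h) * yt(x))
                   for x in points)
```

At $x = 2$, for instance, one finds $R = 5/8$, $\tilde g_1(2) = -4/15$ and $\tilde g_1(3) = -8/35$, in agreement with the Riccati equation.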
Corollary 3.2. Let the Miura d-oper $D_y$ be associated to weights $\Lambda$, numbers $z$, and the tuple $y = (y_1, \dots, y_r)$. Assume that the tuple $y = (y_1, \dots, y_r)$ is generic in the sense of Sect. 2.3. Then $D_y$ is deformable in all directions from 1 to $r$ if and only if the tuple $y$ represents a Bethe solution of the Bethe ansatz equations associated to $z$, $\Lambda$, $\Lambda_\infty = \sum_{i=1}^{n} \Lambda_i - \sum_{i=1}^{r} l_i \alpha_i$, $l_i = \deg y_i$, and special symmetric parameters $b$.

Let the Miura d-oper $D_y$ be associated to weights $\Lambda$, numbers $z$, and the tuple $y = (y_1, \dots, y_r)$. Let the tuple $y = (y_1, \dots, y_r)$ represent a Bethe solution of the Bethe ansatz equations associated to $z$, $\Lambda$, $\Lambda_\infty$, and special symmetric parameters $b$. Let $Om^0_{D_y}$ be the variety of all Miura d-opers each of which can be obtained from $D_y$ by a sequence of deformations in directions $i_1, \dots, i_k$, where $k$ is a positive integer and all $i_j$ lie in $\{1, \dots, r\}$.

Corollary 3.3. The variety $Om^0_{D_y}$ is isomorphic to the population $P_y$ of solutions of Bethe ansatz equations, where $P_y$ is the population originated at $y$.

4. Solutions of Difference Equations

Let $D_y = \tau_h - V_y$ be the Miura d-oper associated to a Bethe solution $y$ of the Bethe ansatz equations associated to special symmetric parameters $b$. Let $P_y$ be the population of solutions originated at $y$. In this section we prove that the difference equation
\[ Y(x + h) = V_y(x)\, Y(x) \tag{18} \]
has a $G$-valued rational solution. We will write that solution explicitly in terms of coordinates of tuples composing the population. Note that if $Y(x)$ is a solution and $g \in G$, then $Y(x) g$ is a solution too. First we give a formula for a solution of Eq. (18) for d-opers associated to $\mathfrak{g}$ of type $A_r$, and then we consider more general formulas for solutions which do not use the structure of the Lie algebra.

Let $Y$ be a solution of Eq. (18). Define
\[ \bar Y(x) = \prod_{j=1}^{r} y_j(x)^{H_j}\ Y(x). \]
Then $\bar Y$ is a solution of the equation
\[ \bar Y(x + h) = \bar V_y(x)\, \bar Y(x), \tag{19} \]
where $\bar V_y(x) = \exp(R_{y,1}(x) F_1) \exp(R_{y,2}(x) F_2) \cdots \exp(R_{y,r}(x) F_r)$.

4.1. The $A_r$ d-opers and solutions of Bethe ansatz equations. In this section let $\mathfrak{g} = sl_{r+1}$ be the Lie algebra of type $A_r$. Then ${}^t\mathfrak{g} = sl_{r+1}$. We have $(\alpha_i, \alpha_i) = 2$ for all $i$. We fix the order of simple roots of $sl_{r+1}$ such that $(\alpha_1, \alpha_2) = (\alpha_2, \alpha_3) = \dots = (\alpha_{r-1}, \alpha_r) = -1$.
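We start with two examples. Let $\mathfrak{g} = sl_2$. Let $y = (y_1)$ represent a Bethe solution of the $sl_2$ Bethe ansatz equations associated to $z$, $\Lambda$, $\Lambda_\infty$. Let $y_1^{(1)}$ be the diagonal sequence of polynomials associated to $y$ and the sequence of indices $(1)$, in other words,
\[ y_1(x + h)\, y_1^{(1)}(x) - y_1(x)\, y_1^{(1)}(x + h) = T_1(x). \]
Then
\[ \bar Y = \exp\Big( \frac{y_1^{(1)}}{y_1} F_1 \Big) \]
is a solution of the difference equation (19) with values in $SL(2, \mathbb{C})$. Indeed,
\[ \begin{aligned}
\Big( \tau_h - \exp\Big( \frac{T_1(x)}{y_1(x)\, y_1(x + h)} F_1 \Big) \Big) & \exp\Big( \frac{y_1^{(1)}(x)}{y_1(x)} F_1 \Big) \\
&= \exp\Big( \frac{y_1^{(1)}(x + h)}{y_1(x + h)} F_1 \Big) \\
&\quad \times \Big( \tau_h - \exp\Big( \Big( -\frac{y_1^{(1)}(x + h)}{y_1(x + h)} + \frac{y_1^{(1)}(x)}{y_1(x)} + \frac{T_1(x)}{y_1(x)\, y_1(x + h)} \Big) F_1 \Big) \Big) \\
&= \exp\Big( \frac{y_1^{(1)}(x + h)}{y_1(x + h)} F_1 \Big) (\tau_h - \mathrm{id}).
\end{aligned} \]

Let $\mathfrak{g} = sl_3$. Let $y = (y_1, y_2)$ represent a Bethe solution of the Bethe ansatz equations associated to $z$, $\Lambda$, $\Lambda_\infty$, and special symmetric parameters $b$. Let $y_1^{(1)}, y_2^{(1,2)}$ be the diagonal sequence of polynomials associated to $y$ and the sequence of indices $(1, 2)$, in other words,
\[ \begin{aligned}
y_1(x + h)\, y_1^{(1)}(x) - y_1(x)\, y_1^{(1)}(x + h) &= T_1(x)\, y_2(x + h), \\
y_2(x + h)\, y_2^{(1,2)}(x) - y_2(x)\, y_2^{(1,2)}(x + h) &= T_2(x)\, y_1^{(1)}(x).
\end{aligned} \]
Let $y_2^{(2)}$ be the diagonal sequence of polynomials associated to $y$ and the sequence of indices $(2)$, in other words,
\[ y_2(x + h)\, y_2^{(2)}(x) - y_2(x)\, y_2^{(2)}(x + h) = T_2(x)\, y_1(x). \]
Then
\[ \bar Y(x) = \exp\Big( \frac{y_1^{(1)}(x)}{y_1(x)} F_1 \Big) \exp\Big( \frac{y_2^{(1,2)}(x)}{y_2(x)} [F_2, F_1] \Big) \exp\Big( \frac{y_2^{(2)}(x)}{y_2(x)} F_2 \Big) \]
is a solution of the difference equation (19) with values in $SL(3, \mathbb{C})$. Indeed, we have
\[ \begin{aligned}
(\tau_h - \bar V_y(x)) & \exp\Big( \frac{y_1^{(1)}(x)}{y_1(x)} F_1 \Big) \\
&= \exp\Big( \frac{y_1^{(1)}(x + h)}{y_1(x + h)} F_1 \Big) \Big( \tau_h - \exp\Big( \Big( -\frac{y_1^{(1)}(x + h)}{y_1(x + h)} + \frac{y_1^{(1)}(x)}{y_1(x)} + \frac{T_1(x)\, y_2(x + h)}{y_1(x)\, y_1(x + h)} \Big) F_1 \Big) \\
&\qquad \times \exp\Big( \frac{T_2(x)\, y_1^{(1)}(x)}{y_2(x)\, y_2(x + h)} [F_2, F_1] \Big) \exp\Big( \frac{T_2(x)\, y_1(x)}{y_2(x)\, y_2(x + h)} F_2 \Big) \Big) \\
&= \exp\Big( \frac{y_1^{(1)}(x + h)}{y_1(x + h)} F_1 \Big) \\
&\qquad \times \Big( \tau_h - \exp\Big( \frac{T_2(x)\, y_1^{(1)}(x)}{y_2(x)\, y_2(x + h)} [F_2, F_1] \Big) \exp\Big( \frac{T_2(x)\, y_1(x)}{y_2(x)\, y_2(x + h)} F_2 \Big) \Big),
\end{aligned} \]
\[ \begin{aligned}
\Big( \tau_h - \exp\Big( \frac{T_2(x)\, y_1^{(1)}(x)}{y_2(x)\, y_2(x + h)} [F_2, F_1] \Big) & \exp\Big( \frac{T_2(x)\, y_1(x)}{y_2(x)\, y_2(x + h)} F_2 \Big) \Big) \exp\Big( \frac{y_2^{(1,2)}(x)}{y_2(x)} [F_2, F_1] \Big) \\
&= \exp\Big( \frac{y_2^{(1,2)}(x + h)}{y_2(x + h)} [F_2, F_1] \Big) \Big( \tau_h - \exp\Big( \frac{T_2(x)\, y_1(x)}{y_2(x)\, y_2(x + h)} F_2 \Big) \Big),
\end{aligned} \]
and
\[ \Big( \tau_h - \exp\Big( \frac{T_2(x)\, y_1(x)}{y_2(x)\, y_2(x + h)} F_2 \Big) \Big) \exp\Big( \frac{y_2^{(2)}(x)}{y_2(x)} F_2 \Big) = \exp\Big( \frac{y_2^{(2)}(x + h)}{y_2(x + h)} F_2 \Big) (\tau_h - \mathrm{id}). \]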
Consider the general case. Let $\mathfrak{g} = sl_{r+1}$. Let $y = (y_1, \dots, y_r)$ represent a Bethe solution of the Bethe ansatz equations associated to $z$, $\Lambda$, $\Lambda_\infty$, and special symmetric parameters $b$. For $i = 1, \dots, r$, let $y_i^{(i)}, y_{i+1}^{(i,i+1)}, \dots, y_r^{(i,\dots,r)}$ be the diagonal sequence of polynomials associated to $y$ and the sequence of indices $(i, i + 1, \dots, r)$, in other words,
\[ \begin{aligned}
y_i(x + h)\, y_i^{(i)}(x) - y_i(x)\, y_i^{(i)}(x + h) &= T_i(x)\, y_{i-1}(x)\, y_{i+1}(x + h), \\
y_{i+1}(x + h)\, y_{i+1}^{(i,i+1)}(x) - y_{i+1}(x)\, y_{i+1}^{(i,i+1)}(x + h) &= T_{i+1}(x)\, y_i^{(i)}(x)\, y_{i+2}(x + h), \\
&\ \ \vdots \\
y_{r-1}(x + h)\, y_{r-1}^{(i,\dots,r-1)}(x) - y_{r-1}(x)\, y_{r-1}^{(i,\dots,r-1)}(x + h) &= T_{r-1}(x)\, y_{r-2}^{(i,\dots,r-2)}(x)\, y_r(x + h), \\
y_r(x + h)\, y_r^{(i,\dots,r)}(x) - y_r(x)\, y_r^{(i,\dots,r)}(x + h) &= T_r(x)\, y_{r-1}^{(i,\dots,r-1)}(x).
\end{aligned} \]
Define $r$ functions $Y_1, \dots, Y_r$ of $x$ with values in $SL(r + 1, \mathbb{C})$ by the formulas
\[ Y_i(x) = \prod_{j=i}^{r} \exp\Big( \frac{y_j^{(i,\dots,j)}(x)}{y_j(x)}\, [F_j, [F_{j-1}, [\dots, [F_{i+1}, F_i] \dots ]]] \Big). \]
Note that inside each product the factors commute.

Theorem 4.1. The product $Y_1 \cdots Y_r$ is a solution of the difference equation (19) with values in $SL(r + 1, \mathbb{C})$.

The proof is straightforward. One shows that
\[ \big( \tau_h - \exp(R_{y,i}(x) F_i) \cdots \exp(R_{y,r}(x) F_r) \big)\, Y_i(x) = Y_i(x + h)\, \big( \tau_h - \exp(R_{y,i+1}(x) F_{i+1}) \cdots \exp(R_{y,r}(x) F_r) \big). \]
4.2. General formulas for solutions. Let $U$ be a complex finite dimensional representation of $G$. Let $u_{low}$ be a lowest weight vector of $U$, ${}^t\mathfrak{n}_-\, u_{low} = 0$. Let $y = (y_1, \dots, y_r)$ represent a Bethe solution of the Bethe ansatz equations associated to $z$, $\Lambda$, $\Lambda_\infty$, and special symmetric parameters $b$. We solve the difference equation (18) with values in $U$.

Let $i = (i_1, i_2, \dots, i_k)$, $1 \le i_j \le r$, be a sequence of natural numbers. Let $y^{(i_1)} = (y_1^{(i_1)}, \dots, y_r^{(i_1)})$, $y^{(i_1,i_2)} = (y_1^{(i_1,i_2)}, \dots, y_r^{(i_1,i_2)})$, ..., $y^{(i_1,\dots,i_k)} = (y_1^{(i_1,\dots,i_k)}, \dots, y_r^{(i_1,\dots,i_k)})$ be a sequence of tuples associated to the Bethe solution $y$ and the sequence of indices $i$.

Theorem 4.2. The $U$-valued function
\[ \begin{aligned}
Y(x) = {} & \exp\Big( -\frac{1}{y_{i_1}^{(i_1)}(x)\, y_{i_1}(x)} \prod_{m,\ m \ne i_1} (y_m(x))^{-a_{i_1,m}}\, E_{i_1} \Big) \\
& \times \exp\Big( -\frac{1}{y_{i_2}^{(i_1,i_2)}(x)\, y_{i_2}^{(i_1)}(x)} \prod_{m,\ m \ne i_2} (y_m^{(i_1)}(x))^{-a_{i_2,m}}\, E_{i_2} \Big) \cdots \\
& \times \exp\Big( -\frac{1}{y_{i_k}^{(i_1,\dots,i_k)}(x)\, y_{i_k}^{(i_1,\dots,i_{k-1})}(x)} \prod_{m,\ m \ne i_k} (y_m^{(i_1,\dots,i_{k-1})}(x))^{-a_{i_k,m}}\, E_{i_k} \Big) \\
& \times \prod_{j=1}^{r} (y_j^{(i_1,\dots,i_k)}(x))^{-H_j}\ u_{low}
\end{aligned} \]
is a solution of the difference equation (18).

The proof is straightforward and follows from Theorem 3.1.

Corollary 4.1. Every coordinate of every solution of the difference equation (18) with values in a finite dimensional representation of $G$ can be written as a rational function $R(f_1, \dots, f_N)$ of suitable polynomials $f_1, \dots, f_N$ which appear as coordinates of tuples in the $\mathfrak{g}$ population $P_y$ generated at $y$.

Since $G$ has a faithful finite dimensional representation, the solutions of the difference equation (18) with values in $G$ also can be written as rational functions of coordinates of tuples of $P_y$, cf. Sect. 4.1.

Corollary 4.2. Let $y = (y_1, \dots, y_r)$ represent a Bethe solution of the Bethe ansatz equations associated to $z$, $\Lambda$, $\Lambda_\infty$, and special symmetric parameters $b$. Then there exists a $G$-valued rational function $Y : \mathbb{C} \to G$ satisfying Eq. (18).

5. Miura D-Opers and Flag Varieties

5.1. Theorem on isomorphism. Let $y = (y_1, \dots, y_r)$ represent a Bethe solution of the Bethe ansatz equations associated to $z$, $\Lambda$, $\Lambda_\infty$, and special symmetric parameters $b$. Let $D_y = \tau_h - V_y$ be the Miura d-oper associated to $y$. Consider the variety $Om_{D_y}$ of all Miura d-opers gauge equivalent to $D_y$. If $D \in Om_{D_y}$, then there exists a rational function $v : \mathbb{C} \to N_+$ such that $D = v(x + h)\, D_y\, v(x)^{-1}$. In that case we denote $D$ by $D^v$.
The variety of pairs $\widehat{Om}_{D_y} = \{(D^v, v) \mid D^v \in Om_{D_y}\}$ will be called the variety of marked Miura d-opers gauge equivalent to $D_y$. We have the natural projection $\pi : \widehat{Om}_{D_y} \to Om_{D_y}$, $(D^v, v) \mapsto D^v$. We will show below that $\pi$ is an isomorphism.

Let $Om^0_{D_y} \subseteq Om_{D_y}$ be the subvariety of all Miura d-opers each of which can be obtained from $D_y$ by a sequence of deformations in directions $i_1, \dots, i_k$, where $k$ is a non-negative integer and all $i_j$ lie in $\{1, \dots, r\}$. By Corollary 3.3 the subvariety $Om^0_{D_y}$ is isomorphic to the population of solutions of Bethe ansatz equations associated to special symmetric parameters $b$ and originated at $y$. We will show below that $Om^0_{D_y} = Om_{D_y}$.

Assume that $D \in Om^0_{D_y}$ and $D$ is obtained from $D_y$ by a sequence of deformations in directions $i_1, \dots, i_k$, where $k$ is a non-negative integer and all $i_j$ lie in $\{1, \dots, r\}$. Then there exist scalar rational functions $g_1, \dots, g_k$ with the following properties. For $j = 1, \dots, k$, define a rational $N_+$-valued function $v_j : \mathbb{C} \to N_+$,
\[ v_j(x) = \exp(g_j(x) E_{i_j}) \cdots \exp(g_2(x) E_{i_2}) \exp(g_1(x) E_{i_1}) . \tag{20} \]
Then $D^{v_j} \in Om^0_{D_y}$ and $D = D^{v_k}$. The set of all pairs $(D^{v_k}, v_k)$ such that $k$ is a non-negative integer, $v_k$ is given by the above construction, and $D^{v_k} \in Om^0_{D_y}$, will be called the variety of specially marked Miura d-opers gauge equivalent to $D_y$ and denoted by $\widehat{Om}^0_{D_y}$. Clearly we have $\widehat{Om}^0_{D_y} \subseteq \widehat{Om}_{D_y}$.

Let $\mathbb{P}^1$ be the complex projective line. Consider $D_y$ as a discrete connection $\nabla_y$ on the trivial principal $G$-bundle $p : G \times \mathbb{P}^1 \to \mathbb{P}^1$. Namely, by definition a section
\[ U \to G \times U, \qquad x \mapsto Y(x) \times x, \]
of $p$ over a subset $U \subset \mathbb{C} \subset \mathbb{P}^1$ is called horizontal if the $G$-valued function $Y(x)$ is a solution of the difference equation (18), $Y(x + h) = V_y(x)\, Y(x)$. By Corollary 4.2, Eq. (18) has a rational solution $Y(x)$. For any $g \in G$ the rational $G$-valued function $Y(x)g$ is a solution of the same equation too. A point $x_0 \in \mathbb{C}$ will be called regular if $x_0$ is a regular point of the rational functions $Y(x)$ and $V_y(x)$. Let $x_0 \in \mathbb{C}$ be a regular point. Let $g$ be an element of $G$. Then $\nabla_y$ has a rational horizontal section $s$ such that $s(x_0) = g$. It is easy to see that if the values of two rational horizontal sections are equal at one point, then the sections are equal.

Consider the trivial bundle $p' : (G/B_-) \times \mathbb{P}^1 \to \mathbb{P}^1$ associated to the bundle $p$. The fiber of $p'$ is the flag variety $G/B_-$. The discrete connection $\nabla_y$ induces a discrete connection $\nabla'_y$ on $p'$. The variety of rational horizontal sections of the discrete connection $\nabla'_y$ is identified with the fiber $(p')^{-1}(x_0)$ over any regular point $x_0$. Thus this variety is isomorphic to $G/B_-$. Any $G$-valued rational function $v$ defines a section
\[ S_v : x \mapsto v(x)^{-1} B_- \times x \tag{21} \]
of $p'$ over the set of regular points of $v$. The section $S_v$ is also well defined over the poles of $v$ since $G/B_-$ is a projective variety.
Discrete Miura Opers and Solutions of the Bethe Ansatz Equations
If $D^v \in Om_{D_y}$, then the section $S_v$ is horizontal with respect to $\nabla'_y$. This follows from the fact that the function $V_y$ takes values in $B_-$. Thus we have a map $S$ from $\widehat{Om}_{D_y}$ to the variety of rational horizontal sections of $\nabla'_y$, $(D^v, v) \mapsto S_v$.

Theorem 5.1. The map $S$ is an isomorphism and $\widehat{Om}^0_{D_y} = \widehat{Om}_{D_y}$.

Proof. Let $(D^{v_1}, v_1), (D^{v_2}, v_2) \in \widehat{Om}_{D_y}$. Assume that the images of $(D^{v_1}, v_1)$ and $(D^{v_2}, v_2)$ under the map $S$ coincide. Assume that $v_1, v_2, V_y$ are regular at $x_0 \in \mathbb{C}$. The equality $S_{v_1}(x_0) = S_{v_2}(x_0)$ means that $v_1(x_0)^{-1} B_- = v_2(x_0)^{-1} B_-$. Then $v_1(x_0) = v_2(x_0)$. Hence $v_1 = v_2$ and $D^{v_1} = D^{v_2}$. That proves the injectivity of $S$.

Let $x_0$ be a regular point of $V_y$ in $\mathbb{C}$. For any $u \in N_+$ there exists a rational $N_+$-valued function $v$ such that $v(x_0) = u$, $D^v \in Om^0_{D_y}$. Indeed, every $u \in N_+$ is a product of elements of the form $e^{c_i E_i}$ for $i \in \{1, \dots, r\}$ and $c_i \in \mathbb{C}$. Every $c_i$ can be taken as the initial condition for a solution of the suitable $i$-th discrete Riccati equation. Thus the set
\[ Im(x_0) = \{ S_v(x_0) \in (G/B_-) \times x_0 \mid (D^v, v) \in \widehat{Om}^0_{D_y} \} \]
contains the set $((N_+ B_-)/B_-) \times x_0 \subset (G/B_-) \times x_0$. It is easy to see that the set $Im(x_0)$ is closed in $(G/B_-) \times x_0$ as the image of $\widehat{Om}^0_{D_y}$ with respect to $S$. On the other hand the set $((N_+ B_-)/B_-) \times x_0$ is dense in $(G/B_-) \times x_0$. Hence $Im(x_0) = (G/B_-) \times x_0$, and $\widehat{Om}^0_{D_y} = \widehat{Om}_{D_y}$ since the map $S$ is injective.
5.2. Remarks on the isomorphism. Let $\mathfrak{g}$ be a simple Lie algebra. Let $y^0$ be a Bethe solution of the Bethe ansatz equations associated to special symmetric parameters $b$. Theorem 5.1 says that the variety $\widehat{Om}_{D_{y^0}}$ is isomorphic to the flag variety $G/B_-$. Here are some comments on that isomorphism.

The isomorphism is constructed in two steps. If $(D^v, v) \in \widehat{Om}_{D_{y^0}}$ is a marked Miura d-oper, then we assign to it the rational horizontal section $S_v$ by formula (21). We choose a regular point $x_0 \in \mathbb{C}$, and assign to the section $S_v$ its value $S_v(x_0) \in (G/B_-) \times x_0$ at $x_0$. The resulting composition $\phi_{y^0,x_0} : \widehat{Om}_{D_{y^0}} \to G/B_-$ is an isomorphism according to Theorem 5.1.

Lemma 5.1. If $x_0, x_1 \in \mathbb{C}$ are regular points, then there exists an element $g \in B_-$ such that $\phi_{y^0,x_1} = g\, \phi_{y^0,x_0}$.

Proof. Let $Y$ be the $G$-valued rational solution of Eq. (18) such that $Y(x_0) = \mathrm{id}$. Then $Y(x) \in B_-$ for all $x$. If $(D^v, v) \in \widehat{Om}_{D_{y^0}}$, then $S_v$ is a horizontal section of $\nabla'_{y^0}$. Thus it has the form $x \mapsto (Y(x) u B_-) \times x$ for a suitable element $u \in G$. Hence $\phi_{y^0,x_0}(y) = Y(x_0) u B_-$ and $\phi_{y^0,x_1}(y) = Y(x_1) u B_-$. We conclude that $\phi_{y^0,x_1} = Y(x_1) Y(x_0)^{-1} \phi_{y^0,x_0}$.
6. Bruhat Cells

6.1. Properties of Bruhat cells. Let $\mathfrak{g}$ be a simple Lie algebra. For an element $w$ of the Weyl group $W$, the set $B_w = B_- w B_- \subset G/B_-$ is called the Bruhat cell associated to $w$. The Bruhat cells form a cell decomposition of the flag variety $G/B_-$. For $w \in W$ denote by $l(w)$ the length of $w$. We have $\dim B_w = l(w)$. Let $s_1, \dots, s_r \in W$ be the generating reflections of the Weyl group. For $v \in G/B_-$ and $i \in \{1, \dots, r\}$ consider the rational curve
\[ \mathbb{C} \to G/B_-, \qquad c \mapsto e^{c E_i} v . \]
The limit of $e^{c E_i} v$ is well defined as $c \to \infty$, since $G/B_-$ is a projective variety. We need the following standard property of Bruhat cells.

Lemma 6.1. Let $s_i, w \in W$ be such that $l(s_i w) = l(w) + 1$. Then
\[ B_{s_i w} = \{ e^{c E_i} v \mid v \in B_w ,\ c \in \{\mathbb{P}^1 - 0\} \} . \]

Corollary 6.2. Let $w = s_{i_1} \cdots s_{i_k}$ be a reduced decomposition of $w \in W$. Then
\[ B_w = \Big\{ \lim_{c_1 \to c_1^0} \dots \lim_{c_k \to c_k^0} e^{c_1 E_{i_1}} \cdots e^{c_k E_{i_k}} B_- \in G/B_- \ \Big|\ c_1^0, \dots, c_k^0 \in \{\mathbb{P}^1 - 0\} \Big\} . \]

Corollary 6.3. Let $s_{i_1} \cdots s_{i_k}$ be an element in $W$. Let $c_1^0, \dots, c_k^0 \in \mathbb{P}^1$. Then the element
\[ \lim_{c_1 \to c_1^0} \dots \lim_{c_k \to c_k^0} e^{c_1 E_{i_1}} \cdots e^{c_k E_{i_k}} B_- \in G/B_- \]
belongs to the union of the Bruhat cells $B_w$ with $l(w) \le k$.

6.2. Populations and Bruhat cells. Let $P$ be a population of solutions of the Bethe ansatz equations associated to integral dominant weights $\Lambda$, numbers $z$, special symmetric parameters $b$. Let $y^0 = (y_1^0, \dots, y_r^0) \in P$ be a point of the population with $l_i = \deg y_i^0$ for $i = 1, \dots, r$. Assume that the weight at infinity of $y^0$,
\[ \Lambda_\infty = \sum_{i=1}^{n} \Lambda_i - \sum_{i=1}^{r} l_i \alpha_i , \]
is integral dominant, see Sect. 2. Such $y^0$ exists according to [MV1, MV2]. For $w \in W$ consider the weight $w \cdot \Lambda_\infty$, where $w\cdot$ is the shifted action of $w$ on $\mathfrak{h}^*$. Write
\[ w \cdot \Lambda_\infty = \sum_{i=1}^{n} \Lambda_i - \sum_{i=1}^{r} l_i^w \alpha_i . \]
Set $P_w = \{ y = (y_1, \dots, y_r) \in P \mid \deg y_i = l_i^w ,\ i = 1, \dots, r \}$. Clearly, $P = \cup_{w \in W} P_w$, and $P_{w_1} \cap P_{w_2} = \emptyset$ if $w_1 \ne w_2$.

Consider the trivial bundle $p' : (G/B_-) \times \mathbb{P}^1 \to \mathbb{P}^1$ with the discrete connection $\nabla'_{y^0}$. Consider the Bruhat cell decomposition of fibers of $p'$. Assume that $x_0 \in \mathbb{C}$ is a regular point of the Miura d-oper $D_{y^0} = \tau_h - V_{y^0}$ and $x_0$ is a regular point of the $G$-valued rational solutions of the associated difference equation (18), $Y(x + h) = V_{y^0}(x)\, Y(x)$. Let $\phi_{y^0,x_0} : \widehat{Om}_{D_{y^0}} \to G/B_-$ be the isomorphism defined in Sect. 5.2. Let
\[ \pi : \widehat{Om}_{D_{y^0}} \to Om_{D_{y^0}} , \qquad (D^v, v) \mapsto D^v , \]
be the natural projection. Let
\[ \xi : Om_{D_{y^0}} \to P , \qquad D_y \mapsto y , \]
be the isomorphism of Corollary 3.3.

Theorem 6.1. For every $w \in W$, the composition $\xi\, \pi\, \phi^{-1}_{y^0,x_0} : G/B_- \to P$, restricted to the Bruhat cell $B_{w^{-1}}$, is a 1-1 epimorphism of $B_{w^{-1}}$ onto $P_w$.

Corollary 6.4. The projection $\pi : \widehat{Om}_{D_{y^0}} \to Om_{D_{y^0}}$ is an isomorphism, i.e. if $(D^{v_1}, v_1), (D^{v_2}, v_2) \in \widehat{Om}_{D_{y^0}}$ are such that $D^{v_1} = D^{v_2}$, then $v_1 = v_2$.
Corollary 6.5. Let $P$ be a population of solutions of the Bethe ansatz equations associated to integral dominant $\mathfrak{g}$-weights $\Lambda_1, \dots, \Lambda_n$, complex numbers $z_1, \dots, z_n$, special symmetric parameters $b$. Then $P$ is isomorphic to the flag variety $G/B_-$ of the Langlands dual algebra ${}^t\mathfrak{g}$.

Corollary 6.6. Let $\Lambda_1, \dots, \Lambda_n, \Lambda_\infty$ be integral dominant $\mathfrak{g}$-weights. Let $z_1, \dots, z_n$ be complex numbers. Let $w \in W$. Consider the Bethe ansatz equations associated to $\Lambda_1, \dots, \Lambda_n$, $w \cdot \Lambda_\infty = \sum_{i=1}^{n} \Lambda_i - \sum_{i=1}^{r} l_i^w \alpha_i$, $z_1, \dots, z_n$, and special symmetric parameters $b$. A solution of the Bethe ansatz equations is a collection of complex numbers $t = (t_j^{(i)})$, $i = 1, \dots, r$, $j = 1, \dots, l_i^w$. Let $K$ be a connected component of the set of solutions of the Bethe ansatz equations. For each $t \in K$ consider the tuple $y_t \in (\mathbb{C}[x])^r$ of monic polynomials representing the solution $t$. Then the closure of the set $\{ y_t \mid t \in K \}$ in $(\mathbb{C}[x])^r$ is an $l(w)$-dimensional cell.

6.3. Proof of Theorem 6.1.

Lemma 6.7. For $w \in W$, the subset $B_w \times \mathbb{P}^1 \subset (G/B_-) \times \mathbb{P}^1$ is invariant with respect to the discrete connection $\nabla'_{y^0}$.
Proof. Let $Y$ be the rational $G$-valued solution of the equation $D_{y^0} Y = 0$ such that $Y(x_0) = \mathrm{id}$. Then $Y(x) \in B_-$ for all $x$. The rational horizontal sections of $\nabla'_{y^0}$ have the form $x \mapsto (Y(x)\, u B_-) \times x$ for a suitable element $u \in G$. If $u B_- \in B_w$, then $Y(x) u B_- \in B_w$ for all $x$.

Let $w = s_{i_k} \cdots s_{i_1}$ be a reduced decomposition of $w \in W$. For $d = 1, \dots, k$ set
\[ (s_{i_d} \cdots s_{i_1}) \cdot \Lambda_\infty = \sum_{i=1}^{n} \Lambda_i - \sum_{i=1}^{r} l_i^d \alpha_i . \]
From [BGG] it follows that $l^1_{i_1} > l_{i_1}$ and $l^d_{i_d} > l^{d-1}_{i_d}$ for $d = 2, \dots, k$.

Let $i = (i_1, \dots, i_k)$, $1 \le i_j \le r$, be a sequence of integers. We consider the map $Y_{y^0,i} : (\mathbb{P}^1)^k \to P(\mathbb{C}[x])^r$ introduced in Sect. 2.5 for special symmetric parameters $b = (b_{i,j})$. Its image is denoted by $P_{y^0,i}$. The image of a point $(c_1, \dots, c_k) \in (\mathbb{P}^1)^k$ under this map is denoted by $y^{k;\, c_1,\dots,c_k}$. We repeat the definition of $y^{k;\, c_1,\dots,c_k}$ in terms convenient for our present purposes.

We assume that the tuple $y^0$ is a tuple of monic polynomials. For $d = 1, \dots, k$ we define by induction on $d$ a family of tuples of polynomials depending on parameters $c_1, \dots, c_d \in \mathbb{P}^1$. Namely, let $\tilde y_{i_1}$ be a polynomial satisfying the equation
\[ y^0_{i_1}(x + h)\, \tilde y_{i_1}(x) - y^0_{i_1}(x)\, \tilde y_{i_1}(x + h) = T_{i_1}(x) \prod_{j,\ j \ne i_1} \big( y^0_j(x + b_{i_1,j}) \big)^{-a_{i_1,j}} . \]
We fix $\tilde y_{i_1}$ assuming that the coefficient of $x^{l_{i_1}}$ in $\tilde y_{i_1}$ is equal to zero. Set $y^{1;\, c_1} = (y_1^{1;\, c_1}, \dots, y_r^{1;\, c_1}) \in P(\mathbb{C}[x])^r$, where
\[ y^{1;\, c_1}_{i_1}(x) = \tilde y_{i_1}(x) + c_1\, y^0_{i_1}(x) \qquad \text{and} \qquad y^{1;\, c_1}_j(x) = y^0_j(x) \ \text{ for } j \ne i_1 . \]
In particular, $y^{1;\, \infty} = y^0$ in $P(\mathbb{C}[x])^r$.

Assume that the family $y^{d-1;\, c_1,\dots,c_{d-1}} \in P(\mathbb{C}[x])^r$ is already defined. Let $\tilde y^{\,d-1;\, c_1,\dots,c_{d-1}}_{i_d}$ be a polynomial satisfying the equation
\[ y^{d-1;\, c_1,\dots,c_{d-1}}_{i_d}(x + h)\, \tilde y^{\,d-1;\, c_1,\dots,c_{d-1}}_{i_d}(x) - y^{d-1;\, c_1,\dots,c_{d-1}}_{i_d}(x)\, \tilde y^{\,d-1;\, c_1,\dots,c_{d-1}}_{i_d}(x + h) = T_{i_d}(x) \prod_{j,\ j \ne i_d} \big( y^{d-1;\, c_1,\dots,c_{d-1}}_j(x + b_{i_d,j}) \big)^{-a_{i_d,j}} . \]
We fix $\tilde y^{\,d-1;\, c_1,\dots,c_{d-1}}_{i_d}$ assuming that the coefficient of $x^{l^{d-1}_{i_d}}$ in $\tilde y^{\,d-1;\, c_1,\dots,c_{d-1}}_{i_d}$ is equal to zero. Set
\[ y^{d;\, c_1,\dots,c_d} = (y_1^{d;\, c_1,\dots,c_d}, \dots, y_r^{d;\, c_1,\dots,c_d}) \in P(\mathbb{C}[x])^r , \]
where
\[ y^{d;\, c_1,\dots,c_d}_{i_d}(x) = \tilde y^{\,d-1;\, c_1,\dots,c_{d-1}}_{i_d}(x) + c_d\, y^{d-1;\, c_1,\dots,c_{d-1}}_{i_d}(x) \]
and
\[ y^{d;\, c_1,\dots,c_d}_j(x) = y^{d-1;\, c_1,\dots,c_{d-1}}_j(x) \ \text{ for } j \ne i_d . \]
In particular, $y^{d;\, c_1,\dots,c_{d-1},\infty} = y^{d-1;\, c_1,\dots,c_{d-1}}$ in $P(\mathbb{C}[x])^r$.
The $d$-th family is obtained from the $(d-1)$-st family by the generation procedure in the $i_d$-th direction, see Sect. 2.5. For any $(c_1, \dots, c_k) \in (\mathbb{P}^1)^k$ the tuple $y^{k;\, c_1,\dots,c_k}$ lies in $P$. For any $(c_1, \dots, c_k) \in \mathbb{C}^k$ and any $i \in \{1, \dots, r\}$, we have $\deg y_i^{k;\, c_1,\dots,c_k}(x) = l_i^w$. Set $P^{(i_1,\dots,i_k)} = \{ y^{k;\, c_1,\dots,c_k} \mid (c_1, \dots, c_k) \in \mathbb{C}^k \}$.

For every $(c_1, \dots, c_k) \in (\mathbb{P}^1)^k$ we define a rational function $v_{c_1,\dots,c_k} : \mathbb{C} \to N_+$ by the formula
\[ v_{c_1,\dots,c_k} : x \mapsto \exp\big(g_k(x; c_1, \dots, c_k) E_{i_k}\big) \cdots \exp\big(g_1(x; c_1) E_{i_1}\big) , \tag{22} \]
where
\[ g_d(x; c_1, \dots, c_d) = \frac{ T_{i_d}(x) \prod_{j,\ j \ne i_d} \big( y^{d-1;\, c_1,\dots,c_{d-1}}_j(x + b_{i_d,j}) \big)^{-a_{i_d,j}} }{ y^{d;\, c_1,\dots,c_d}_{i_d}(x)\ y^{d-1;\, c_1,\dots,c_{d-1}}_{i_d}(x) } \]
for $d = 1, \dots, k$. In particular, if some of $c_1, \dots, c_k$ are equal to $\infty$, then the corresponding exponential factors in (22) must be replaced by the identity element $\mathrm{id} \in G$. The function $v_{c_1,\dots,c_k}$ continuously depends on $(c_1, \dots, c_k) \in (\mathbb{P}^1)^k$. For any $(c_1, \dots, c_k) \in (\mathbb{P}^1)^k$ the pair $(D^{v_{c_1,\dots,c_k}}, v_{c_1,\dots,c_k})$ belongs to $\widehat{Om}_{D_{y^0}}$. Let $x_0 \in \mathbb{C}$ be a regular point. Consider the map
\[ \phi : \mathbb{C}^k \to G/B_- , \qquad (c_1, \dots, c_k) \mapsto (v_{c_1,\dots,c_k}(x_0))^{-1} B_- . \]

Proposition 6.8. The image of the map $\phi$ is $B_{w^{-1}}$.

Proof of Proposition 6.8. For any $(c_1, \dots, c_k) \in (\mathbb{P}^1)^k$ consider the rational section
\[ S_{(c_1,\dots,c_k)} : x \mapsto \big( (v_{c_1,\dots,c_k}(x))^{-1} B_- \big) \times x \tag{23} \]
of the bundle $p'$. In particular, if some of $c_1, \dots, c_k$ are equal to $\infty$, then the corresponding exponential factors in (22) must be replaced by the identity element $\mathrm{id} \in G$. This section is horizontal with respect to the connection $\nabla'_{y^0}$ and continuously depends on $(c_1, \dots, c_k) \in (\mathbb{P}^1)^k$. This means that $\phi(\mathbb{C}^k) \subset B_{w^{-1}}$, see Corollary 6.2, and if some of $c_1, \dots, c_k$ are equal to $\infty$, then $S_{(c_1,\dots,c_k)}(x) \notin B_{w^{-1}}$, see Corollary 6.3. It remains to show that every point in $B_{w^{-1}}$ is the limit of points of the form $S_{(c_1,\dots,c_k)}(x)$. But that statement follows from Corollary 6.2 and the following lemma.

Lemma 6.9. Assume that $x_0 \in \mathbb{C}$ is such that $T_i(x_0) \ne 0$, $y_i^0(x_0) \ne 0$, for $i = 1, \dots, r$. Assume that $x_0 \in \mathbb{C}$ is such that $y_j^0(x_0 + b_{i,j}) \ne 0$ for all $i \ne j$. Then there exists a proper algebraic subset $K \subset (\mathbb{C} - 0)^k$ with the following property. For any $(c_1^1, \dots, c_k^1) \in (\mathbb{C} - 0)^k - K$ there exists a unique $(c_1^2, \dots, c_k^2) \in \mathbb{C}^k$ such that
\[ (c_1^1, \dots, c_k^1) = \big( g_1(x_0; c_1^2), \dots, g_k(x_0; c_1^2, \dots, c_k^2) \big) . \]
The proposition is proved.
Theorem 6.1 is a direct corollary of Proposition 6.8. Acknowledgement. We thank P. Belkale and S. Kumar for numerous useful discussions.
References
[BIK] Bogoliubov, N.M., Izergin, A.G., Korepin, V.E.: Quantum Inverse Scattering Method and Correlation Functions. Cambridge: Cambridge University Press, 1993
[BGG] Bernshtein, I.N., Gel'fand, I.M., Gel'fand, S.I.: Structure of representations generated by vectors of highest weight. Funct. Anal. Appl. 5, 1–8 (1971)
[BM] Borisov, L., Mukhin, E.: Self-self-dual spaces of polynomials. http://arxiv.org/abs/math.QA/0308128, 2003
[DS] Drinfeld, V., Sokolov, V.: Lie algebras and KdV type equations. J. Sov. Math. 30, 1975–2036 (1985)
[Fa] Faddeev, L.D.: Lectures on Quantum Inverse Scattering Method. In: Integrable Systems, X.-C. Song (ed.), Nankai Lectures Math Phys., Singapore: World Scientific, 1990, pp. 23–70
[FT] Faddeev, L.D., Takhtajan, L.A.: Quantum Inverse Problem Method and the Heisenberg XYZ model. Russ. Math. Surveys 34, 11–68 (1979)
[FFR] Feigin, B., Frenkel, E., Reshetikhin, N.: Gaudin model, Bethe Ansatz and Critical Level. Commun. Math. Phys. 166, 29–62 (1994)
[F1] Frenkel, E.: Opers on the projective line, flag manifolds and Bethe ansatz. http://arxiv.org/abs/math.QA/0308269, 2003
[F2] Frenkel, E.: Affine Algebras, Langlands Duality and Bethe Ansatz. http://xxx.lanl.gov/abs/qalg/9506003, 1995
[FRS] Frenkel, E., Reshetikhin, N., Semenov-Tian-Shansky, M.: Drinfeld-Sokolov Reduction for Difference Operators and Deformations of W-algebras I. The Case of Virasoro Algebra. http://xxx.lanl/gov/abs/g-alg/9704011, 1997
[K] Kac, V.: Infinite-dimensional Lie algebras. Cambridge: Cambridge University Press, 1990
[MV1] Mukhin, E., Varchenko, A.: Critical Points of Master Functions and Flag Varieties. http://arxiv.org/abs/math.QA/0209017, 2002
[MV2] Mukhin, E., Varchenko, A.: Populations of solutions of the XXX Bethe equations associated to Kac-Moody algebras. http://arxiv.org/abs/math.QA/0212092, 2002
[MV3] Mukhin, E., Varchenko, A.: Solutions to the XXX type Bethe Ansatz equations and flag varieties. Cent. Eur. J. Math. 1(2), 238–271 (2003)
[MV4] Mukhin, E., Varchenko, A.: Miura Opers and Critical Points of Master Functions. http://arxiv.org/abs/math.QA/0312406, 2003
[OW] Ogievetsky, E., Wiegman, P.: Factorized S-matrix and the Bethe Ansatz for simple Lie groups. Phys. Lett. 168B(4), 360–366 (1986)
[S] Sevostyanov, A.: Towards Drinfeld-Sokolov reduction for quantum groups. J. Geom. Phys. 33(3–4), 235–256 (2000)
[SS] Semenov-Tian-Shansky, M., Sevostyanov, A.: Drinfeld-Sokolov reduction for difference operators and deformations of W-algebras. II. The general semisimple case. Commun. Math. Phys. 192(3), 631–647 (1998)
Communicated by L. Takhtajan
Commun. Math. Phys. 256, 589–609 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1258-5
Communications in
Mathematical Physics
Existence of Self-Similar Solutions to Smoluchowski’s Coagulation Equation
Nicolas Fournier1 , Philippe Laurençot2
1 Institut Elie Cartan – Nancy, Université Henri Poincaré – Nancy I, BP 239, 54506 Vandœuvre-lès-Nancy cedex, France. E-mail: [email protected]
2 Mathématiques pour l’Industrie et la Physique, CNRS UMR 5640, Université Paul Sabatier – Toulouse 3, 118 route de Narbonne, 31062, Toulouse cedex 4, France. E-mail: [email protected]
Received: 8 March 2004 / Accepted: 14 July 2004 Published online: 13 January 2005 – © Springer-Verlag 2005
Abstract: The existence of self-similar solutions to Smoluchowski’s coagulation equation has been conjectured for several years by physicists, and numerical simulations have confirmed the validity of this conjecture. Still, there was no existence result up to now, except for the constant and additive kernels for which explicit formulae are available. In this paper, the existence of self-similar solutions decaying rapidly at infinity is established for a wide class of homogeneous coagulation kernels.

1. Introduction

The Smoluchowski coagulation equation provides a mean-field description of a system of an infinite number of particles growing by successive mergers, each particle being fully identified by its mass ranging in the set of positive real numbers. In fact, the only mechanism taken into account in this model is the coalescence of two particles to form a larger one. Denoting by $c(t, x) \ge 0$ the concentration of particles of mass $x \in (0, \infty)$ at time $t \ge 0$, the dynamics of $c$ is given by [8, 23]
\[ \partial_t c(t, x) = L_c(c(t, .))(x) , \qquad (t, x) \in (0, \infty) \times (0, \infty) , \tag{1.1} \]
where the coagulation reaction term $L_c$ is defined by
\[ L_c(c)(x) = \frac{1}{2} \int_0^x K(y, x - y)\, c(y)\, c(x - y)\, dy - c(x) \int_0^\infty K(x, y)\, c(y)\, dy \tag{1.2} \]
for $x \in (0, \infty)$. In (1.2), $K(x, y)$ models the likelihood that two particles with respective masses $x$ and $y$ merge into a single one (with mass $x + y$) and the coagulation kernel $K$ is a non-negative and symmetric function from $(0, \infty)^2$ into $[0, \infty)$. The first term of the right-hand side of (1.2) describes the formation of particles of mass $x$ resulting from the coalescence of two particles with respective masses $y$ and $x - y$, $y \in (0, x)$.
The second term accounts for the disappearance of particles of mass $x$ by coalescence with other particles. Observe that, since the coalescence of two particles with respective masses $x$ and $y$ leads to the formation of a particle with mass $x + y$, the total mass of the whole system of particles is expected to remain constant throughout time evolution. In other words, a solution $c$ to (1.1) is expected to satisfy
\[ \int_0^\infty x\, c(t, x)\, dx = \int_0^\infty x\, c(0, x)\, dx \quad \text{for } t \ge 0 . \tag{1.3} \]
It is however well-known by now that this property fails to be true for coagulation kernels which increase sufficiently rapidly for large values of $x$ and $y$. More precisely, some mass can be lost in finite time, a phenomenon known as gelation, and the occurrence of gelation has been extensively studied recently, both by physicists (see [7, 18] and the references therein) and by mathematicians (see [9, 12] and the references therein). For the coagulation kernels to be considered in this paper, the occurrence of the gelation phenomenon in finite time is excluded and solutions to (1.1) do satisfy (1.3). A central issue is then to identify the large time behaviour of solutions. For homogeneous coagulation kernels $K$, that is, $K$ satisfies
\[ K(ux, uy) = u^\lambda K(x, y) , \qquad (u, x, y) \in (0, \infty)^3 , \tag{1.4} \]
for some parameter $\lambda \in (-\infty, 1]$, it is conjectured by physicists that a solution $c$ to (1.1) approaches a self-similar profile for large masses and large times. More precisely, the so-called dynamical scaling hypothesis predicts that
\[ c(t, x) \sim c_S(t, x) = \frac{1}{s(t)^2}\, \psi\!\left( \frac{x}{s(t)} \right) \tag{1.5} \]
after a sufficiently large time, and $c_S$ is a self-similar solution to (1.1) with a finite mass
\[ \int_0^\infty x\, c_S(t, x)\, dx = \int_0^\infty x\, \psi(x)\, dx < \infty , \tag{1.6} \]
see [1, 7, 15, 18] and the references therein. Here the particle mean mass $s(t)$ and the profile $\psi$ are to be determined and depend on the coagulation kernel $K$ but not on specific properties of the initial datum $c(0, .)$. Formal computations have been performed by physicists to identify $s(t)$ and the behaviour of $\psi$ for small and large $x$ [5, 7, 18, 19], while several numerical simulations seem to support the validity of (1.5) [10, 11, 14, 17, 20]. Still, nothing much is known from the rigorous point of view. In fact, the first difficulty encountered is the existence of the scaling profile $\psi$ which satisfies a nonlinear integro-differential equation. It is the purpose of this work to prove the existence of at least one scaling profile $\psi$ for three classes of coagulation kernels $K$, namely
\[ K_1(x, y) = (x^\alpha + y^\alpha)(x^{-\beta} + y^{-\beta}) , \quad \alpha \in [0, 1) ,\ \beta \in (0, \infty) ,\ \lambda = \alpha - \beta \in (-\infty, 1) , \tag{1.7} \]
\[ K_2(x, y) = (x^\alpha + y^\alpha)^\beta , \quad \alpha \in [0, \infty) ,\ \beta \in (0, \infty) ,\ \lambda = \alpha\beta \in [0, 1) , \tag{1.8} \]
\[ K_3(x, y) = x^\alpha y^\beta + x^\beta y^\alpha , \quad \alpha \in (0, 1) ,\ \beta \in (0, 1) ,\ \lambda = \alpha + \beta \in [0, 1) , \tag{1.9} \]
which include several kernels considered in the literature. To our knowledge, no previous existence result was available, except for the constant kernel $K(x, y) = 1$ and the
additive kernel $K(x, y) = x + y$. For these two kernels, explicit formulae for $\psi$ are available, see, e.g., [7, 18]. It has actually been shown recently that, for the constant and additive kernels, there is a one-parameter family of self-similar solutions to (1.1) (not necessarily of the form (1.5), see [3, 22]). Nevertheless only one of them has a fast decay for large $x$ and is expected to describe the large time behaviour of solutions to (1.1) with, say, compactly supported initial data. The self-similar solution to (1.1) we construct in this paper for $K_1$, $K_2$ and $K_3$ also has a fast decay at infinity. Still, it is likely that other self-similar solutions with “fat” tails also exist in that case.

Remark 1.1. It actually seems that the proof of the existence of a scaling profile developed in this paper could be extended to other homogeneous coagulation kernels $K$, and in particular to $K(x, y) = (xy)^{-\alpha}$, $\alpha > 0$.

Before stating our result, let us point out that the existence of a self-similar solution to (1.1) opens the way to the study of the validity of the dynamical scaling hypothesis (1.5). Still, a convergence proof seems to require also the uniqueness of such a solution (in a suitable class) and is still an open problem. For the constant and/or additive coagulation kernels, the validity of (1.5) has been investigated in [1, 4, 6, 13, 16, 18, 21, 22].

From now on, we assume that the coagulation kernel $K$ is given by (1.7), (1.8) or (1.9). Observe that $K$ satisfies (1.4) and that our assumptions imply that $\lambda < 1$. Inserting the self-similar ansatz (1.5) in (1.1) and using (1.4), we are led to find two positive real numbers $(w, \varrho)$ and a non-negative function $\psi \in L^1(0, \infty; x\,dx)$ such that
\[ w\, \partial_x (x^2 \psi(x)) + x L_c(\psi)(x) = 0 , \quad x \in (0, \infty) , \tag{1.10} \]
\[ \int_0^\infty x\, \psi(x)\, dx = \varrho , \tag{1.11} \]
see [7, 18]. We notice that, if $\psi$ solves (1.10), (1.11) for the parameters $(w, \varrho)$, then $\psi_{a,b}(x) = a \psi(bx)$ also solves (1.10), (1.11) but for the parameters $(a w b^{-1-\lambda}, a b^{-2} \varrho)$. Since $\lambda < 1$, the choice
\[ a = \left( \frac{\varrho^{1+\lambda}}{(1 - \lambda)^2 w^2} \right)^{1/(1-\lambda)} , \qquad b = \left( \frac{\varrho}{(1 - \lambda) w} \right)^{1/(1-\lambda)} \tag{1.12} \]
allows us to consider (1.10), (1.11) with $w = 1/(1 - \lambda)$ and $\varrho = 1$ without loss of generality. Setting next $s(t) = t^{1/(1-\lambda)}$, it readily follows from (1.10), (1.11) that $c_S(t, x) = s(t)^{-2} \psi(x s(t)^{-1})$ is a self-similar solution to (1.1) (in a weak sense) with mass 1. To be more precise, we first define the notions of the weak solution to (1.1) and (1.10) we will use in this paper. The following definition relies on the (formal) observation that, for $c : (0, \infty) \to \mathbb{R}$ and for any test function $\phi : (0, \infty) \to \mathbb{R}$ sufficiently smooth, we have
\[ \int_0^\infty \phi(x)\, x L_c(c)(x)\, dx = \int_0^\infty \int_0^\infty x K(x, y) [\phi(x + y) - \phi(x)]\, c(x) c(y)\, dy\, dx = \int_0^\infty \phi'(z) \int_0^z \int_{z-x}^\infty x K(x, y)\, c(x) c(y)\, dy\, dx\, dz . \]
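The homogeneity relation (1.4) for the three kernel classes (1.7)–(1.9) and the normalisation (1.12) can be checked numerically. The following is a minimal sketch, not part of the paper's argument: the function names (`K1`, `K2`, `K3`, `homogeneity_defect`, `rescaling_constants`) and the sample parameter values are illustrative assumptions, and `rho` stands for the mass appearing in (1.11).

```python
def K1(x, y, alpha, beta):
    """Kernel (1.7); homogeneous of degree alpha - beta."""
    return (x**alpha + y**alpha) * (x**(-beta) + y**(-beta))


def K2(x, y, alpha, beta):
    """Kernel (1.8); homogeneous of degree alpha * beta."""
    return (x**alpha + y**alpha) ** beta


def K3(x, y, alpha, beta):
    """Kernel (1.9); homogeneous of degree alpha + beta."""
    return x**alpha * y**beta + x**beta * y**alpha


def homogeneity_defect(K, lam, x, y, u):
    """Return K(ux, uy) - u^lam K(x, y); zero when (1.4) holds."""
    return K(u * x, u * y) - u**lam * K(x, y)


def rescaling_constants(w, rho, lam):
    """Constants (a, b) of (1.12) for given w > 0, rho > 0 and lam < 1."""
    g = 1.0 / (1.0 - lam)
    a = (rho ** (1.0 + lam) / ((1.0 - lam) ** 2 * w**2)) ** g
    b = (rho / ((1.0 - lam) * w)) ** g
    return a, b


if __name__ == "__main__":
    # Illustrative (hypothetical) parameter values.
    alpha, beta = 0.2, 0.3
    k3 = lambda x, y: K3(x, y, alpha, beta)
    print("homogeneity defect of K3:",
          homogeneity_defect(k3, alpha + beta, 1.3, 0.4, 2.0))

    w, rho, lam = 0.7, 2.3, alpha + beta
    a, b = rescaling_constants(w, rho, lam)
    # psi_{a,b}(x) = a psi(bx) solves (1.10), (1.11) with these parameters:
    print("new w   :", a * w * b ** (-1.0 - lam), "target", 1.0 / (1.0 - lam))
    print("new mass:", a * b ** (-2.0) * rho, "target 1")
```

With these constants the rescaled parameters $(a w b^{-1-\lambda}, a b^{-2}\varrho)$ should agree with $(1/(1-\lambda), 1)$ up to rounding.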
Definition 1.2. Assume that $K$ is given by (1.7), (1.8) or (1.9) and set $\gamma = 1/(1 - \lambda)$. (i) A non-negative function $\psi \in L^1(0, \infty; x\,dx)$ is said to be a weak solution to
\[ \gamma\, \partial_x (x^2 \psi(x)) + x L_c(\psi)(x) = 0 , \quad x \in (0, \infty) , \tag{1.13} \]
if $\psi \in L^1(0, \infty; x^2 dx)$ and is such that $(x, y) \mapsto xy K(x, y) \psi(x) \psi(y) \in L^1((0, \infty)^2)$ and
\[ \gamma z^2 \psi(z) = \int_0^z \int_{z-x}^\infty K(x, y)\, x \psi(x) \psi(y)\, dy\, dx \quad \text{for a.e. } z \in (0, \infty) , \tag{1.14} \]
the right-hand side of (1.14) being finite for almost every $z \in (0, \infty)$. (ii) A non-negative function $c \in L^\infty(0, \infty; L^1(0, \infty; x\,dx))$ is said to be a weak solution to (1.1) if, for every $t_2 > t_1 > 0$ and $\phi \in C_b^1([0, \infty))$, the function $(t, x, y) \mapsto xy K(x, y) c(t, x) c(t, y)$ belongs to $L^1((t_1, t_2) \times (0, \infty)^2)$ and
\[ \int_0^\infty x \phi(x) \big( c(t_2, x) - c(t_1, x) \big)\, dx = \int_{t_1}^{t_2} \int_0^\infty \int_0^\infty x K(x, y) [\phi(x + y) - \phi(x)]\, c(t, x) c(t, y)\, dy\, dx\, dt . \]
Note that, if $\psi$ is a weak solution to (1.13) in the sense of Definition 1.2 (i), it satisfies
\[ \gamma \int_0^\infty x^2 \psi(x) \phi'(x)\, dx = \int_0^\infty \int_0^\infty x K(x, y) [\phi(x + y) - \phi(x)]\, \psi(x) \psi(y)\, dy\, dx \tag{1.15} \]
for any $\phi \in C_b^1([0, \infty))$. The main result of this paper is the following.

Theorem 1.3. Assume that $K$ is given by (1.7), (1.8) or (1.9) and set $\gamma = 1/(1 - \lambda)$. (i) There exists a weak solution $\psi$ to (1.13) such that $\int_0^\infty x \psi(x)\, dx = 1$. (ii) Introducing $c_S(t, x) = t^{-2\gamma} \psi(x t^{-\gamma})$ for $t > 0$ and $x \in (0, \infty)$, $c_S$ is a (self-similar) weak solution to (1.1), and $\int_0^\infty x c_S(t, x)\, dx = 1$ for each $t > 0$.

We will only prove the assertion (i) of Theorem 1.3, since (ii) follows by a straightforward calculation. Furthermore, the weak solution $\psi$ to (1.13) constructed in Theorem 1.3 enjoys the following properties.

Theorem 1.4. Assume that $K$ is given by (1.7), (1.8) or (1.9) and consider the weak solution $\psi$ to (1.13) constructed in Theorem 1.3. (i) There exists a continuous and positive function $g \in C((0, \infty))$ such that $\psi(x) \ge g(x) > 0$ for each $x \in (0, \infty)$. (ii) The following estimates hold:
\[ \psi \in L^1(0, \infty; x^\sigma dx) \ \text{ for every } \sigma \in \mathbb{R} \ \text{ if } K = K_1 , \tag{1.16} \]
\[ \exists\, \varepsilon > 0 , \quad \int_0^\infty e^{\varepsilon x^{-\beta}} x^{-\beta} \psi(x)\, dx < \infty \ \text{ if } K = K_1 , \tag{1.17} \]
\[ \psi \in L^1(0, \infty; x^\sigma dx) \ \text{ for every } \sigma \ge \lambda \ \text{ if } K = K_2 , \tag{1.18} \]
\[ \psi \in L^1(0, \infty; x^\sigma dx) \ \text{ for every } \sigma > \lambda \ \text{ if } K = K_3 . \tag{1.19} \]
Remark 1.5. As already mentioned, some conjectures about the behaviour of $\psi$ for small and large $x$ have been proposed by physicists on the grounds of formal computations [7, 18, 19]. On the one hand, it is conjectured that $\psi$ decays at least exponentially for large $x$ which is in perfect agreement with (1.16), (1.18) and (1.19). On the other hand, the small-$x$ behaviour is conjectured to depend heavily on the kernel $K$. For $K_1$, $\psi$ is expected to behave as $x^{-\tau} \exp\{-C x^{-\beta}\}$ (for some $\tau > 0$), which is in compliance with (1.16), (1.17). For $K_2$ and $K_3$, the conjectured behaviour of $\psi$ is that $\psi(x) \sim C x^{-\tau}$ near $x = 0$ with $\tau < 1 + \lambda$ for $K_2$ and $\tau = 1 + \lambda$ for $K_3$, and (1.18) and (1.19) perfectly agree with this conjecture. Thus, the leading order in (1.17) and (1.19) seems to be optimal.

Remark 1.6. Owing to the expected singularity of $\psi$ near $x = 0$ for $K$ given by (1.8) or (1.9), some of the terms in $L_c(\psi)$ are not well-defined and it is thus not likely that the weak formulation (1.14) can be improved. In addition, the properties of $\psi$ obtained in Theorem 1.4 do not seem to be sufficient to prove that the right-hand side of (1.14) is continuous with respect to $z \in (0, \infty)$ (because of the singularity of $\psi$ for $(x, y) \sim (z, z - x) \sim (z, 0)$).

The remainder of the paper is devoted to the proof of Theorems 1.3 and 1.4, which relies on a suitable discretization of (1.13), along with a compactness method. It turns out that it is more convenient to work with $Q(x) = x \psi(x)$. With this notation, the weak formulation (1.15) becomes
\[ \gamma \int_0^\infty x \phi'(x) Q(x)\, dx = \int_0^\infty \int_0^\infty \frac{K(x, y)}{y} [\phi(x + y) - \phi(x)]\, Q(x) Q(y)\, dy\, dx \tag{1.20} \]
for every $\phi \in C_b^1([0, \infty))$. We introduce and study a discrete approximation to (1.20) in Sect. 2. Moment and integrability estimates are the subject of the next two sections, Sects. 3 and 4, respectively. The proof of Theorem 1.3, together with that of (1.16), (1.18) and (1.19), is performed in Sect. 5 while the last section of the paper is devoted to the proof of (1.17) and the strict positivity of $\psi$.

Throughout the paper we use the following notation: if $\mu$ is a measure on a set $E$ and $\phi$ is a function from $E$ in $\mathbb{R}$, we write $\langle \mu, \phi \rangle = \langle \mu(dx), \phi(x) \rangle = \int_E \phi(x)\, \mu(dx)$. For $k \in \mathbb{N}$, the space of bounded and $C^k$-smooth functions from $[0, \infty)$ in $\mathbb{R}$ which have bounded derivatives up to the order $k$ is denoted by $C_b^k([0, \infty))$. Also, if $x \in \mathbb{R}$ and $y \in \mathbb{R}$, we put $x \wedge y = \min\{x, y\}$.

2. A Discrete Approximation

Let $N \ge 1$ be a positive integer and consider two families of non-negative real numbers $(v_i)_{1 \le i \le N+1}$ and $(a_{i,j})_{1 \le i,j \le N}$ such that
\[ v_1 = v_{N+1} = 0 \quad \text{and} \quad a_{i,j} = a_{j,i} \ \text{ for } 1 \le i, j \le N . \tag{2.1} \]
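We define the function $F : \mathbb{R}^N \to \mathbb{R}^N$ by $F = (F_i)_{1 \le i \le N}$ and
\[ F_i(q) = v_{i+1} q_{i+1} - v_i q_i + \sum_{j=1}^{i-1} \frac{a_{i-j,j}}{j}\, q_{i-j} q_j - \sum_{j=1}^{N-i} \frac{a_{i,j}}{j}\, q_i q_j \tag{2.2} \]
for $1 \le i \le N$ and $q = (q_i)_{1 \le i \le N} \in \mathbb{R}^N$. We first prove the existence of a zero of $F$.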
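The map $F$ of (2.2) is easy to evaluate directly, and doing so gives a quick sanity check of two facts used in the existence proof: the components of $F$ sum to zero, and a small explicit Euler step $q + tF(q)$ keeps a non-negative $q$ non-negative. The sketch below is illustrative only; the helper names (`coagulation_map`, `euler_step`, `random_data`) and the random test data are assumptions made for the example, and explicit Euler stepping is used here purely as a check, not as the method of the paper.

```python
import random


def coagulation_map(q, v, a):
    """Evaluate F(q) from (2.2).

    q : list of N floats, q[i-1] stores q_i
    v : list of N+1 floats, v[i-1] stores v_i, with v_1 = v_{N+1} = 0
    a : N x N nested list, a[i-1][j-1] stores a_{i,j} (symmetric, >= 0)
    """
    N = len(q)
    F = []
    for i in range(1, N + 1):
        # q_{N+1} does not exist, but it is multiplied by v_{N+1} = 0.
        q_next = q[i] if i < N else 0.0
        drift = v[i] * q_next - v[i - 1] * q[i - 1]
        gain = sum(a[i - j - 1][j - 1] / j * q[i - j - 1] * q[j - 1]
                   for j in range(1, i))
        loss = sum(a[i - 1][j - 1] / j * q[i - 1] * q[j - 1]
                   for j in range(1, N - i + 1))
        F.append(drift + gain - loss)
    return F


def euler_step(q, v, a, dt):
    """One explicit Euler step q + dt * F(q)."""
    return [qi + dt * fi for qi, fi in zip(q, coagulation_map(q, v, a))]


def random_data(N, seed=0):
    """Hypothetical data satisfying (2.1), with q > 0 normalised to sum 1."""
    rng = random.Random(seed)
    v = [0.0] + [rng.random() for _ in range(N - 1)] + [0.0]
    a = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(i, N):
            a[i][j] = a[j][i] = rng.random()
    q = [rng.random() + 0.01 for _ in range(N)]
    total = sum(q)
    return [qi / total for qi in q], v, a


if __name__ == "__main__":
    q, v, a = random_data(8)
    print("sum of F_i(q):", sum(coagulation_map(q, v, a)))
    # Step size with dt * (sup v_i + sup a_ij * sum |q_i|) = 1/2 < 1.
    dt = 0.5 / (max(v) + max(max(row) for row in a))
    q1 = euler_step(q, v, a, dt)
    print("sum after one step:", sum(q1), " smallest entry:", min(q1))
```

The gain and loss sums run over the same index pairs $(k, l)$ with $k + l \le N$, which is why they cancel after summation over $i$, while the drift terms telescope thanks to $v_1 = v_{N+1} = 0$.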
Proposition 2.1. Under the assumptions (2.1), there exists $q \in [0, \infty)^N$ such that
\[ \sum_{i=1}^{N} q_i = 1 , \tag{2.3} \]
and $F(q) = 0$, that is,
\[ v_{i+1} q_{i+1} - v_i q_i + \sum_{j=1}^{i-1} \frac{a_{i-j,j}}{j}\, q_{i-j} q_j - \sum_{j=1}^{N-i} \frac{a_{i,j}}{j}\, q_i q_j = 0 \quad \text{for } 1 \le i \le N . \tag{2.4} \]

Proof. Clearly $F$ is a locally Lipschitz continuous function and (2.1) implies that
\[ \sum_{i=1}^{N} F_i(q) = 0 \quad \text{for each } q \in \mathbb{R}^N . \tag{2.5} \]
The Cauchy-Lipschitz theorem then ensures that, for each $q \in \mathbb{R}^N$, there is a unique maximal solution $\varphi(., q) = (\varphi_i(., q))_{1 \le i \le N} \in C^1([0, t^+(q)); \mathbb{R}^N)$ to the ordinary differential equation
\[ \frac{d}{dt} \varphi(t, q) = F(\varphi(t, q)) , \qquad \varphi(0, q) = q , \tag{2.6} \]
with the alternative:
\[ \text{either } t^+(q) = \infty \quad \text{or} \quad t^+(q) < \infty \ \text{ and } \lim_{t \to t^+(q)} \sum_{i=1}^{N} |\varphi_i(t, q)| = \infty . \tag{2.7} \]
Moreover, it follows from (2.5) that
\[ \sum_{i=1}^{N} \varphi_i(t, q) = \sum_{i=1}^{N} q_i , \qquad t \in [0, t^+(q)) , \quad q \in \mathbb{R}^N . \tag{2.8} \]
We now consider $q \in [0, \infty)^N$. Then, $q + t F(q) \in [0, \infty)^N$ for $t$ small enough, namely
\[ t \left( \sup_{1 \le i \le N} \{v_i\} + \sup_{1 \le i,j \le N} \{a_{i,j}\} \sum_{i=1}^{N} |q_i| \right) < 1 . \]
Therefore, $\mathrm{dist}\big(q + tF(q), [0, \infty)^N\big) = 0$ for $t$ small enough and the subtangent condition
\[ \liminf_{t \to 0} \frac{1}{t}\, \mathrm{dist}\big(q + tF(q), [0, \infty)^N\big) = 0 \]
is fulfilled for each $q \in [0, \infty)^N$. We then infer from [2, Theorem 16.5] that $[0, \infty)^N$ is positively invariant for the semiflow $\varphi$, that is, $\varphi(t, q) \in [0, \infty)^N$ for each $t \in [0, t^+(q))$ and $q \in [0, \infty)^N$. This property and (2.8) readily imply that
\[ 0 \le \sum_{i=1}^{N} |\varphi_i(t, q)| = \sum_{i=1}^{N} \varphi_i(t, q) = \sum_{i=1}^{N} q_i < \infty \]
for $t \in [0, t^+(q))$ and $q \in [0, \infty)^N$. Recalling (2.7), we thus conclude that $t^+(q) = \infty$ for each $q \in [0, \infty)^N$. We finally introduce the non-empty convex and compact subset $\mathcal{C}$ of $\mathbb{R}^N$ defined by
\[ \mathcal{C} = \Big\{ q = (q_i)_{1 \le i \le N} \in [0, \infty)^N , \ \sum_{i=1}^{N} q_i = 1 \Big\} . \]
Owing to the previous analysis and (2.8), $\mathcal{C}$ is positively invariant for the semiflow $\varphi$ and we may apply [2, Prop. 22.13] to conclude that $F$ has at least one zero in $\mathcal{C}$.

We now use Proposition 2.1 to obtain a solution to a discrete approximation of (1.13). Let $n \ge 1$ be a positive integer and put $N = n^2$, $v^n_{N+1} = 0$, and
\[ v^n_i = \gamma\, \frac{i - 1}{n} , \qquad a^n_{i,j} = K\!\left( \frac{i}{n}, \frac{j}{n} \right) , \qquad 1 \le i, j \le n^2 . \]
The families $(v^n_i)$ and $(a^n_{i,j})$ fulfill (2.1) and we infer from Proposition 2.1 that there exists $q^n = (q^n_i) \in [0, \infty)^{n^2}$ such that
\[ \sum_{i=1}^{n^2} q^n_i = 1 , \tag{2.9} \]
and $F(q^n) = 0$, that is,
\[ \sum_{j=1}^{i-1} \frac{1}{j}\, K\!\left( \frac{i - j}{n}, \frac{j}{n} \right) q^n_{i-j} q^n_j - \sum_{j=1}^{n^2 - i} \frac{1}{j}\, K\!\left( \frac{i}{n}, \frac{j}{n} \right) q^n_i q^n_j = - \frac{\gamma}{n} \Big[ i\, \mathbf{1}_{\{1 \le i \le n^2 - 1\}}\, q^n_{i+1} - (i - 1)\, q^n_i \Big] \tag{2.10} \]
for $1 \le i \le n^2$. We then define the probability measure $Q^n(dx)$ on $(0, \infty)$ by
\[ Q^n(dx) = \sum_{i=1}^{n^2} q^n_i\, \delta_{i/n}(dx) , \tag{2.11} \]
and first show that $Q^n(dx)$ is indeed a solution to an approximation of the weak formulation (1.20) of (1.13).

Lemma 2.2. Let $K$ be a symmetric and non-negative function from $(0, \infty)^2$ in $[0, \infty)$ and $n \ge 1$. There exists a probability measure $Q^n(dx)$ on $(0, \infty)$ such that, for any measurable function $\phi : (0, \infty) \to \mathbb{R}$,
\[ \Big\langle Q^n(dx) Q^n(dy), \frac{K_n(x, y)}{y} [\phi(x + y) - \phi(x)] \Big\rangle = \gamma \Big\langle Q^n(dx), n \big( x - n^{-1} \big) \big[ \phi(x) - \phi(x - n^{-1}) \big] \mathbf{1}_{\{x \ge 2/n\}} \Big\rangle , \tag{2.12} \]
with $K_n(x, y) = K(x, y)\, \mathbf{1}_{\{x + y \le n\}}$.
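Proof. It suffices to consider $Q^n(dx)$ defined previously by (2.11). We multiply the $i$-th equation of (2.10) by $n\phi(i/n)$ and sum up the resulting identities to obtain
$$\begin{aligned}
0 &= \sum_{i=1}^{n^2} \phi(i/n) \Big[ - \sum_{j=1}^{n^2-i} \frac{K(i/n,j/n)}{j/n}\, q_i^n q_j^n + \sum_{j=1}^{i-1} \frac{K(j/n,(i-j)/n)}{(i-j)/n}\, q_j^n q_{i-j}^n \Big] \\
&\quad - \gamma \sum_{i=1}^{n^2} (i-1)\, q_i^n\, \phi(i/n) + \gamma \sum_{i=1}^{n^2-1} i\, q_{i+1}^n\, \phi(i/n) \\
&= \sum_{k=1}^{n^2} \sum_{l=1}^{n^2} q_k^n q_l^n \Big[ - \frac{K_n(k/n,l/n)}{l/n}\, \phi(k/n) + \frac{K_n(k/n,l/n)}{l/n}\, \phi((k+l)/n) \Big] \\
&\quad + \gamma \sum_{k=2}^{n^2} q_k^n \big[ -(k-1)\phi(k/n) + (k-1)\phi((k-1)/n) \big] \\
&= \Big\langle Q^n(dx)Q^n(dy), \frac{K_n(x,y)}{y}\, [\phi(x+y) - \phi(x)] \Big\rangle - \gamma\, \Big\langle Q^n(dx), n \big(x - n^{-1}\big) \big[\phi(x) - \phi(x - n^{-1})\big]\, \mathbf{1}_{\{x\ge 2/n\}} \Big\rangle ,
\end{aligned}$$
whence (2.12).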
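The reindexing used in this proof (setting $k = j$, $l = i - j$ in the gain term and $k = i$, $l = j$ in the loss term) holds for arbitrary weights, not only for a solution of (2.10). The following Python sketch checks it numerically. The kernel `K`, the random weights and the test function are illustrative assumptions and are not taken from the paper.

```python
import math
import random

def K(x, y, alpha=0.5, beta=0.25):
    # Illustrative kernel only (an assumption): symmetric and non-negative,
    # which is all that Lemma 2.2 requires.
    return x**alpha * y**beta + x**beta * y**alpha

def discrete_side(q, n, phi):
    """Sum over i of n*phi(i/n) times the coagulation terms of equation i."""
    N = n * n
    total = 0.0
    for i in range(1, N + 1):
        gain = sum(K((i - j) / n, j / n) / j * q[i - j] * q[j] for j in range(1, i))
        loss = sum(K(i / n, j / n) / j * q[i] * q[j] for j in range(1, N - i + 1))
        total += n * phi(i / n) * (gain - loss)
    return total

def measure_side(q, n, phi):
    """<Q^n(dx) Q^n(dy), K_n(x, y)/y * [phi(x + y) - phi(x)]>."""
    N = n * n
    total = 0.0
    for k in range(1, N + 1):
        for l in range(1, N - k + 1):  # k + l <= n^2 is the cutoff x + y <= n
            x, y = k / n, l / n
            total += q[k] * q[l] * K(x, y) / y * (phi(x + y) - phi(x))
    return total

def random_weights(N, seed=0):
    """Random probability weights q[1..N]; q[0] is an unused placeholder."""
    rng = random.Random(seed)
    w = [rng.random() for _ in range(N)]
    s = sum(w)
    return [0.0] + [v / s for v in w]
```

For instance, with `n = 3` and `q = random_weights(9)`, the two functions agree up to rounding for `phi = math.sin`.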
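3. Moment Estimates

We next prove some key moment estimates.

Lemma 3.1. Assume that the coagulation kernel $K$ is given by (1.7), (1.8) or (1.9). The sequence of probability measures $(Q^n(dx))$ constructed in Lemma 2.2 satisfies
$$\sup_{n\ge 1} \langle Q^n(dx), x^\sigma \rangle < \infty \tag{3.1}$$
for every
$$\sigma \in \mathbb{R} \ \text{ if } K = K_1 , \qquad \sigma \in [\lambda-1,\infty) \ \text{ if } K = K_2 , \qquad \sigma \in (\lambda-1,\infty) \ \text{ if } K = K_3 .$$

Proof. We first note that, since the measure $Q^n(dx)$ has its support included in $[1/n, n]$, it is clear that $\langle Q^n(dx), x^\sigma \rangle$ is well-defined and finite for each $n$ and $\sigma \in \mathbb{R}$. We then set
$$A_{n,\sigma} = \Big\langle Q^n(dx)Q^n(dy), \frac{K_n(x,y)}{y}\, \big[(x+y)^\sigma - x^\sigma\big] \Big\rangle ,$$
$$B_{n,\sigma} = \gamma\, \Big\langle Q^n(dx), n \big(x - n^{-1}\big) \big[x^\sigma - (x - n^{-1})^\sigma\big]\, \mathbf{1}_{\{x\ge 2/n\}} \Big\rangle$$
for $\sigma \in \mathbb{R}$ and notice that $A_{n,\sigma} = B_{n,\sigma}$ by (2.12).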
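Step 1. We first show that (3.1) holds true for $K_1$ and $K_2$ when $\sigma = \lambda - 1$. On the one hand, since $K(x,y) \ge y^\lambda$, we have
$$\begin{aligned}
|A_{n,\lambda-1}| &= \Big\langle Q^n(dx)Q^n(dy), \frac{K_n(x,y)}{y}\, \Big[ \frac{1}{x^{1-\lambda}} - \frac{1}{(x+y)^{1-\lambda}} \Big] \Big\rangle \\
&\ge \Big\langle Q^n(dx)Q^n(dy), \frac{1}{y^{1-\lambda}}\, \mathbf{1}_{\{x+y\le n\}}\, \mathbf{1}_{\{y\ge x\}}\, \frac{1}{x^{1-\lambda}}\, \Big[ 1 - \frac{1}{2^{1-\lambda}} \Big] \Big\rangle \\
&\ge \varepsilon\, \Big\langle Q^n(dx)Q^n(dy), \frac{\mathbf{1}_{\{y\ge x\}}}{(xy)^{1-\lambda}} \Big\rangle - \varepsilon\, \Big\langle Q^n(dx)Q^n(dy), \frac{\mathbf{1}_{\{y\ge x\}}\, \mathbf{1}_{\{x+y>n\}}}{(xy)^{1-\lambda}} \Big\rangle
\end{aligned}$$
with $\varepsilon = 1 - 1/2^{1-\lambda} > 0$. An easy symmetry argument allows us to deduce that
$$\begin{aligned}
|A_{n,\lambda-1}| &\ge \frac{\varepsilon}{2}\, \big\langle Q^n(dx), 1/x^{1-\lambda} \big\rangle^2 - 2\varepsilon\, \Big\langle Q^n(dx)Q^n(dy), \frac{\mathbf{1}_{\{x\ge n/2\}}}{(xy)^{1-\lambda}} \Big\rangle \\
&\ge \frac{\varepsilon}{2}\, \big\langle Q^n(dx), 1/x^{1-\lambda} \big\rangle^2 - \frac{2^{2-\lambda}\, \varepsilon}{n^{1-\lambda}}\, \big\langle Q^n(dx), 1/x^{1-\lambda} \big\rangle .
\end{aligned}$$
On the other hand, an easy computation shows that
$$\begin{aligned}
|B_{n,\lambda-1}| &\le \gamma\, \Big\langle Q^n(dx), (x - n^{-1})\, \frac{(1-\lambda)\, \mathbf{1}_{\{x\ge 2/n\}}}{(x - n^{-1})^{2-\lambda}} \Big\rangle \\
&\le \Big\langle Q^n(dx), \frac{\mathbf{1}_{\{x\ge 2/n\}}}{(x - n^{-1})^{1-\lambda}} \Big\rangle \le 2^{1-\lambda}\, \big\langle Q^n(dx), 1/x^{1-\lambda} \big\rangle ,
\end{aligned}$$
where the last inequality follows from the fact that $1/(x - n^{-1}) \le 2/x$ for every $x \ge 2/n$. We deduce from the bounds on $|A_{n,\lambda-1}|$ and $|B_{n,\lambda-1}|$ that there is a constant $C$ which does not depend on $n \ge 1$ such that
$$\big\langle Q^n(dx), 1/x^{1-\lambda} \big\rangle^2 \le C\, \big\langle Q^n(dx), 1/x^{1-\lambda} \big\rangle$$
for each $n \ge 1$. Since $\langle Q^n(dx), 1/x^{1-\lambda} \rangle < \infty$ for each $n$, we conclude that (3.1) holds true for $K_1$ and $K_2$ when $\sigma = \lambda - 1$.

Step 2. We next prove that (3.1) holds true for $K_2$ and $K_3$ when $\sigma \ge 0$. Let $p \ge 1$ be a positive integer. Since $x \mapsto x^p$ is convex, we have
$$x^p - (x - 1/n)^p \ge \frac{p}{n}\, (x - 1/n)^{p-1} \quad \text{and} \quad (x+y)^p - x^p \le p\, y\, (x+y)^{p-1} .$$
Consequently,
$$p\gamma\, \big\langle Q^n(dx), (x - 1/n)^p\, \mathbf{1}_{\{x\ge 2/n\}} \big\rangle \le B_{n,p} = A_{n,p} \le p\, \big\langle Q^n(dx)Q^n(dy), K_n(x,y)\, (x+y)^{p-1} \big\rangle ,$$
and, since $(x+y)^p \le C\, (x^p + y^p)$ and $\langle Q^n(dx), 1 \rangle = 1$, we end up with
$$\langle Q^n(dx), x^p \rangle \le C \Big( \frac{1}{n^p} + \big\langle Q^n(dx)Q^n(dy), K_n(x,y)\, (x+y)^{p-1} \big\rangle \Big) . \tag{3.2}$$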
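Now, since $\lambda \in [0,1)$, it is straightforward to check that, for $K_2$ and $K_3$ and $\varepsilon > 0$, there is $C(\varepsilon,p) > 0$ such that $K(x,y)\, (x+y)^{p-1} \le \varepsilon\, (x^p + y^p) + C(\varepsilon,p)$. Using this inequality with $\varepsilon = 1/(4C)$ and the fact that $\langle Q^n(dx), 1 \rangle = 1$, we conclude that
$$\begin{aligned}
\langle Q^n(dx), x^p \rangle &\le C \Big( \frac{1}{n^p} + C(\varepsilon,p) + 2\varepsilon\, \langle Q^n(dx)Q^n(dy), x^p \rangle \Big) \\
&\le \frac{1}{2}\, \langle Q^n(dx), x^p \rangle + C \Big( \frac{1}{n^p} + C(\varepsilon,p) \Big) .
\end{aligned}$$
Consequently, (3.1) holds true for $K_2$ and $K_3$ when $\sigma$ is a non-negative integer and thus for each $\sigma \ge 0$ by interpolation.

Step 3. We now check that (3.1) is true for $K_1$ when $\sigma \ge 0$. Let $p$ be a positive integer such that $p > 1 + \beta$. Recalling (3.2) and using a symmetry argument and the inequality $(x+y)^{p-1} \le C\, (x^{p-1} + y^{p-1})$, we obtain
$$\langle Q^n(dx), x^p \rangle \le C \Big( \frac{1}{n^p} + \big\langle Q^n(dx)Q^n(dy), K_n(x,y)\, x^{p-1} \big\rangle \Big) ,$$
whence
$$\begin{aligned}
\langle Q^n(dx), x^p \rangle &\le \frac{C}{n^p} + C\, \langle Q^n(dx), x^{p+\lambda-1} \rangle + C\, \langle Q^n(dx), x^{p+\alpha-1} \rangle\, \langle Q^n(dx), x^{-\beta} \rangle \\
&\quad + C\, \langle Q^n(dx), x^{p-1} \rangle\, \langle Q^n(dx), x^{\lambda} \rangle + C\, \langle Q^n(dx), x^{p-1-\beta} \rangle\, \langle Q^n(dx), x^{\alpha} \rangle .
\end{aligned}$$
Now, since $p > 1 + \beta \ge 1 - \lambda$, one also has $0 < \lambda + p - 1 < p$, $0 < \alpha + p - 1 < p$ and $0 < p - 1 - \beta < p$. Since $Q^n(dx)$ is a probability measure, the Jensen inequality yields
$$\begin{aligned}
\langle Q^n(dx), x^{\lambda+p-1} \rangle &\le \langle Q^n(dx), x^p \rangle^{(p+\lambda-1)/p} , &
\langle Q^n(dx), x^{p-1} \rangle &\le \langle Q^n(dx), x^p \rangle^{(p-1)/p} , \\
\langle Q^n(dx), x^{\alpha+p-1} \rangle &\le \langle Q^n(dx), x^p \rangle^{(p+\alpha-1)/p} , &
\langle Q^n(dx), x^{p-1-\beta} \rangle &\le \langle Q^n(dx), x^p \rangle^{(p-1-\beta)/p} , \\
\langle Q^n(dx), x^{\alpha} \rangle &\le \langle Q^n(dx), x^p \rangle^{\alpha/p} .
\end{aligned}$$
Also, since $\beta \le 1 - \lambda$, we deduce from Step 1 and the Jensen inequality that $\langle Q^n(dx), x^{-\beta} \rangle \le C$ for some constant $C$ independent of $n$. Consequently, by the Young inequality,
$$\begin{aligned}
\langle Q^n(dx), x^p \rangle &\le C \Big( 1 + \langle Q^n(dx), x^p \rangle^{(p+\lambda-1)/p} + \langle Q^n(dx), x^p \rangle^{(p+\alpha-1)/p} \Big) + C\, \langle Q^n(dx), x^p \rangle^{(p-1)/p}\, \langle Q^n(dx), x^{\lambda} \rangle \\
&\le \frac{1}{4}\, \langle Q^n(dx), x^p \rangle + C \Big( 1 + \langle Q^n(dx), x^{\lambda} \rangle^p \Big) .
\end{aligned}$$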
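Now, either $\lambda \le 0$ and we infer from the Jensen inequality and Step 1 that
$$\langle Q^n(dx), x^{\lambda} \rangle^p \le \langle Q^n(dx), x^{\lambda-1} \rangle^{(p|\lambda|)/(1-\lambda)} \le C ,$$
or $\lambda \in (0,1)$ and the Jensen and Young inequalities imply that
$$\langle Q^n(dx), x^{\lambda} \rangle^p \le \langle Q^n(dx), x^p \rangle^{\lambda} \le \frac{1}{4C}\, \langle Q^n(dx), x^p \rangle + C .$$
In both cases, we finally arrive at
$$\langle Q^n(dx), x^p \rangle \le \frac{1}{2}\, \langle Q^n(dx), x^p \rangle + C ,$$
from which Step 3 follows at once.

Step 4. Up to now, we have proved that (3.1) is true for $K_1$ when $\sigma \ge \lambda - 1$. We now complete the proof of (3.1) for $K_1$. We consider $\delta > 0$ and notice that
$$\begin{aligned}
|A_{n,-(1+\delta)}| &= \Big\langle Q^n(dx)Q^n(dy), \frac{K_n(x,y)}{y}\, \Big[ \frac{1}{x^{1+\delta}} - \frac{1}{(x+y)^{1+\delta}} \Big] \Big\rangle \\
&= \Big\langle Q^n(dx)Q^n(dy), \frac{K_n(x,y)}{xy}\, \Big[ \frac{1}{x^{\delta}} - \frac{x}{(x+y)^{1+\delta}} \Big] \Big\rangle \\
&= \Big\langle Q^n(dx)Q^n(dy), \frac{K_n(x,y)}{2xy}\, \Big[ \frac{1}{x^{\delta}} - \frac{x}{(x+y)^{1+\delta}} + \frac{1}{y^{\delta}} - \frac{y}{(x+y)^{1+\delta}} \Big] \Big\rangle \\
&= \Big\langle Q^n(dx)Q^n(dy), \frac{K_n(x,y)}{2xy}\, \Big[ \frac{1}{x^{\delta}} - \frac{1}{2(x+y)^{\delta}} + \frac{1}{y^{\delta}} - \frac{1}{2(x+y)^{\delta}} \Big] \Big\rangle \\
&\ge \Big\langle Q^n(dx)Q^n(dy), \frac{K_n(x,y)}{4xy}\, \Big[ \frac{1}{x^{\delta}} + \frac{1}{y^{\delta}} \Big] \Big\rangle \\
&= \frac{1}{2}\, \Big\langle Q^n(dx)Q^n(dy), \frac{K_n(x,y)}{x^{1+\delta}\, y} \Big\rangle \\
&\ge \frac{1}{2}\, \Big\langle Q^n(dx)Q^n(dy), \frac{\mathbf{1}_{\{x+y\le n\}}}{x^{1+\delta+\beta}\, y^{1-\alpha}} \Big\rangle \\
&\ge \frac{1}{2}\, \big\langle Q^n(dx)Q^n(dy), x^{-(1+\delta+\beta)}\, y^{\alpha-1} \big\rangle - \frac{1}{2}\, \Big\langle Q^n(dx)Q^n(dy), \frac{\mathbf{1}_{\{x+y>n\}}}{x^{1+\delta+\beta}\, y^{1-\alpha}} \Big\rangle \\
&\ge \frac{1}{2}\, \langle Q^n(dx), x^{-(1+\delta+\beta)} \rangle\, \langle Q^n(dx), x^{\alpha-1} \rangle - \frac{2^{\delta+\beta}}{n^{1+\delta+\beta}}\, \langle Q^n(dx), x^{\alpha-1} \rangle - \frac{2^{-\alpha}}{n^{1-\alpha}}\, \langle Q^n(dx), x^{-(1+\delta+\beta)} \rangle \\
&\ge \Big( \frac{1}{2}\, \langle Q^n(dx), x^{\alpha-1} \rangle - \frac{2^{-\alpha}}{n^{1-\alpha}} \Big)\, \langle Q^n(dx), x^{-(1+\delta+\beta)} \rangle - \frac{2^{\delta+\beta}}{n^{1+\delta+\beta}}\, \langle Q^n(dx), x^{\alpha-1} \rangle .
\end{aligned}$$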
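On the one hand, the Jensen inequality and Step 1 imply that
$$\langle Q^n(dx), x^{\alpha-1} \rangle \le \langle Q^n(dx), x^{\lambda-1} \rangle^{(1-\alpha)/(1-\lambda)} \le C ,$$
since $0 < 1 - \alpha < 1 - \lambda$. On the other hand, we infer from Step 3 that
$$\begin{aligned}
\langle Q^n(dx), x^{\alpha-1} \rangle &\ge \Big\langle Q^n(dx), \frac{\mathbf{1}_{\{x\le R\}}}{x^{1-\alpha}} \Big\rangle \ge \frac{1}{R^{1-\alpha}} \Big( \langle Q^n(dx), 1 \rangle - \frac{1}{R}\, \big\langle Q^n(dx), x\, \mathbf{1}_{\{x\ge R\}} \big\rangle \Big) \\
&\ge \frac{1}{R^{1-\alpha}} \Big( 1 - \frac{C}{R} \Big) \ge 2\varepsilon
\end{aligned}$$
for some $\varepsilon > 0$ sufficiently small after choosing $R = 2C$. Gathering the above three estimates we end up with
$$|A_{n,-(1+\delta)}| \ge \Big( \varepsilon - \frac{2^{-\alpha}}{n^{1-\alpha}} \Big)\, \langle Q^n(dx), x^{-(1+\delta+\beta)} \rangle - C \ge \frac{\varepsilon}{2}\, \langle Q^n(dx), x^{-(1+\delta+\beta)} \rangle - C$$
for $n$ large enough, since $\alpha \in [0,1)$. Therefore,
$$\frac{\varepsilon}{2}\, \langle Q^n(dx), x^{-(1+\delta+\beta)} \rangle \le |A_{n,-(1+\delta)}| + C \le |B_{n,-(1+\delta)}| + C \le 2^{1+\delta}\, \gamma\, (1+\delta)\, \langle Q^n(dx), x^{-(1+\delta)} \rangle + C ,$$
where we have proceeded as in Step 1 to obtain the last inequality. Recalling that $\beta > 0$, we have $0 < 1 + \delta < 1 + \delta + \beta$ and the Jensen and Young inequalities yield
$$\frac{\varepsilon}{2}\, \langle Q^n(dx), x^{-(1+\delta+\beta)} \rangle \le C \Big( 1 + \langle Q^n(dx), x^{-(1+\delta+\beta)} \rangle^{(1+\delta)/(1+\delta+\beta)} \Big) \le \frac{\varepsilon}{4}\, \langle Q^n(dx), x^{-(1+\delta+\beta)} \rangle + C .$$
Thus, (3.1) is also valid for $\sigma < -1$ and an interpolation argument completes the proof for $K_1$.

Step 5. It remains to check that (3.1) holds true for $K_3$ when $\sigma \in (\lambda - 1, 0)$. We first remark that $K(x,y) \ge (xy)^{\lambda/2}$. Arguing as in Step 1, we find that
$$|A_{n,\sigma}| \ge \Big\langle Q^n(dx)Q^n(dy), \frac{(xy)^{\lambda/2}}{y}\, \big[ x^\sigma - (x+y)^\sigma \big]\, \mathbf{1}_{\{x+y\le n\}} \Big\rangle , \qquad |B_{n,\sigma}| \le C\, \langle Q^n(dx), x^\sigma \rangle .$$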
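We next define $\tau = 2/(1 - \lambda + \sigma) > 0$ and $a_i = i^{-\tau}$ for $i \ge 1$. Then, for $n \ge 2$,
$$\begin{aligned}
|A_{n,\sigma}| &\ge |\sigma|\, \Big\langle Q^n(dx)Q^n(dy), \frac{(xy)^{\lambda/2}}{(x+y)^{1-\sigma}}\, \mathbf{1}_{\{x+y\le n\}} \Big\rangle \\
&\ge |\sigma|\, \Big\langle Q^n(dx)Q^n(dy), \frac{(xy)^{\lambda/2}}{(x+y)^{1-\sigma}}\, \mathbf{1}_{\{x\in[0,1]\}}\, \mathbf{1}_{\{y\in[0,1]\}} \Big\rangle \\
&\ge |\sigma| \sum_{i\ge 1} (2a_i)^{\sigma-1}\, \big\langle Q^n(dx)Q^n(dy), (xy)^{\lambda/2}\, \mathbf{1}_{\{x\in(a_{i+1},a_i]\}}\, \mathbf{1}_{\{y\in(a_{i+1},a_i]\}} \big\rangle \\
&\ge \varepsilon \sum_{i\ge 1} (a_i)^{\sigma-1}\, \big\langle Q^n(dx), x^{\lambda/2}\, \mathbf{1}_{\{x\in(a_{i+1},a_i]\}} \big\rangle^2
\end{aligned} \tag{3.3}$$
for some constant $\varepsilon > 0$. We next use the Cauchy-Schwarz inequality to obtain
$$\begin{aligned}
\big\langle Q^n(dx), \mathbf{1}_{\{x\in[0,1]\}}\, x^\sigma \big\rangle &= \sum_{i=1}^{\infty} \big\langle Q^n(dx), x^{\sigma-(\lambda/2)}\, x^{\lambda/2}\, \mathbf{1}_{\{x\in(a_{i+1},a_i]\}} \big\rangle \\
&\le \sum_{i=1}^{\infty} \frac{a_i^{(1-\sigma)/2}}{a_{i+1}^{-\sigma+(\lambda/2)}}\; a_i^{(\sigma-1)/2}\, \big\langle Q^n(dx), x^{\lambda/2}\, \mathbf{1}_{\{x\in(a_{i+1},a_i]\}} \big\rangle \\
&\le \Big( \sum_{i=1}^{\infty} \frac{a_i^{1-\sigma}}{a_{i+1}^{\lambda-2\sigma}} \Big)^{1/2} \Big( \sum_{i=1}^{\infty} (a_i)^{\sigma-1}\, \big\langle Q^n(dx), x^{\lambda/2}\, \mathbf{1}_{\{x\in(a_{i+1},a_i]\}} \big\rangle^2 \Big)^{1/2} \\
&\le C\, \Big( \sum_{i=1}^{\infty} (a_i)^{\sigma-1}\, \big\langle Q^n(dx), x^{\lambda/2}\, \mathbf{1}_{\{x\in(a_{i+1},a_i]\}} \big\rangle^2 \Big)^{1/2} ,
\end{aligned} \tag{3.4}$$
the last inequality resulting from the fact that
$$\frac{a_i^{1-\sigma}}{a_{i+1}^{\lambda-2\sigma}} = (i+1)^{\tau(\lambda-2\sigma)}\, i^{-\tau(1-\sigma)} \le 2^{\tau(\lambda-2\sigma)}\, i^{-\tau(1-\lambda+\sigma)} = 2^{\tau(\lambda-2\sigma)}\, i^{-2} .$$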
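The role of the exponent $\tau = 2/(1-\lambda+\sigma)$ is to make the ratio above decay like $i^{-2}$, so that the first factor in (3.4) is a convergent series. A minimal numerical sketch of that inequality follows; the sample values of $\lambda$ and $\sigma$ are arbitrary choices with $\lambda \in (0,1)$ and $\sigma \in (\lambda-1, 0)$, not values from the paper.

```python
def tau_of(lam, sigma):
    """Exponent tau = 2 / (1 - lam + sigma) from Step 5."""
    return 2.0 / (1.0 - lam + sigma)

def ratio(i, lam, sigma):
    """a_i^(1 - sigma) / a_{i+1}^(lam - 2 sigma) with a_i = i^(-tau)."""
    tau = tau_of(lam, sigma)
    a_i = i ** (-tau)
    a_next = (i + 1) ** (-tau)
    return a_i ** (1.0 - sigma) / a_next ** (lam - 2.0 * sigma)

def bound(i, lam, sigma):
    """Claimed upper bound 2^(tau (lam - 2 sigma)) * i^(-2)."""
    tau = tau_of(lam, sigma)
    return 2.0 ** (tau * (lam - 2.0 * sigma)) / (i * i)

def series_is_controlled(lam, sigma, terms=200):
    """Check ratio <= bound termwise (equality holds at i = 1, hence the slack)."""
    return all(ratio(i, lam, sigma) <= bound(i, lam, sigma) * (1.0 + 1e-9)
               for i in range(1, terms + 1))
```

Since $\sum_i i^{-2}$ converges, the termwise bound gives a constant $C$ in (3.4) that depends only on $\lambda$ and $\sigma$.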
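Gathering (3.3) and (3.4), we deduce that
$$|A_{n,\sigma}| \ge \varepsilon\, \big\langle Q^n(dx), \mathbf{1}_{\{x\in[0,1]\}}\, x^\sigma \big\rangle^2$$
for some $\varepsilon > 0$. Finally, since $|A_{n,\sigma}| = |B_{n,\sigma}|$ and $Q^n(dx)$ is a probability measure, the above bounds entail that
$$\langle Q^n(dx), x^\sigma \rangle^2 \le 2\, \big\langle Q^n(dx), x^\sigma\, \mathbf{1}_{\{x\in[0,1]\}} \big\rangle^2 + 2 \le C\, \langle Q^n(dx), x^\sigma \rangle + 2 ,$$
from which the expected result readily follows by the Young inequality.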
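Steps 3 and 4 of the proof repeatedly use the Jensen inequality for a probability measure in the form $\langle Q, x^r \rangle \le \langle Q, x^p \rangle^{r/p}$ for $0 < r < p$, and the same inequality applied to $1/x$ for negative moments. The sketch below illustrates both uses on an arbitrary discrete probability measure. The atoms and weights are random placeholders, not the measures $Q^n$ of the paper.

```python
import random

def moment(points, weights, s):
    """<Q, x^s> for the discrete probability measure Q = sum_k weights[k] * delta_{points[k]}."""
    return sum(w * x ** s for x, w in zip(points, weights))

def jensen_gap(points, weights, r, p):
    """moment(p)^(r/p) - moment(r); non-negative whenever 0 < r < p."""
    return moment(points, weights, p) ** (r / p) - moment(points, weights, r)

def random_measure(size, seed=0):
    """Random atoms in (0.1, 5) with normalised random weights (illustrative only)."""
    rng = random.Random(seed)
    points = [rng.uniform(0.1, 5.0) for _ in range(size)]
    raw = [rng.random() for _ in range(size)]
    total = sum(raw)
    return points, [v / total for v in raw]

def negative_moment_gap(points, weights, beta, lam):
    """<Q, x^(lam-1)>^(beta/(1-lam)) - <Q, x^(-beta)>, for 0 < beta <= 1 - lam."""
    inverses = [1.0 / x for x in points]
    return jensen_gap(inverses, weights, beta, 1.0 - lam)
```

For example, with `points, weights = random_measure(20)` the calls `jensen_gap(points, weights, 2.5, 4)` and `negative_moment_gap(points, weights, 0.3, 0.2)` both return non-negative numbers.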
4. Integrability Estimates

Having studied the behaviour of $Q^n(dx)$ for small and large $x$ in the previous section, we now turn to “integrability” properties of $Q^n(dx)$. Since $Q^n(dx)$ is a measure, such properties can obviously not be enjoyed by $Q^n(dx)$ but rather by
$$Z^n(x) = \gamma\, n\, \Big\langle Q^n(dy), \Big( y - \frac{1}{n} \Big)\, \mathbf{1}_{\{x\le y\le x+1/n\}} \Big\rangle , \qquad x \in (0,\infty) . \tag{4.1}$$
In this direction, we have the following result:
Lemma 4.1.
(i) If $K$ is given by (1.7), then $Z^n$ belongs to $L^\infty(0,\infty)$ and
$$\sup_{n\ge 1} \| Z^n \|_{L^\infty} < \infty . \tag{4.2}$$
(ii) If $K$ is given by (1.8) or (1.9) and $p \in (1, 1/\lambda)$, then $x \mapsto x^{\lambda-1} Z^n(x)$ belongs to $L^p(0,\infty)$ and
$$\sup_{n\ge 1} \int_0^\infty x^{p(\lambda-1)}\, Z^n(x)^p\, dx < \infty . \tag{4.3}$$
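Proof. (i) We take $\vartheta \in C_0^\infty((0,\infty))$ and choose
$$\phi(x) = \int_0^x \vartheta(y)\, dy , \qquad x \in [0,\infty) ,$$
in (2.12). Setting
$$A_n = \Big\langle Q^n(dx)Q^n(dy), \frac{K_n(x,y)}{y}\, [\phi(x+y) - \phi(x)] \Big\rangle ,$$
we infer from the Fubini theorem and (2.12) that
$$\int_0^\infty \vartheta(x)\, Z^n(x)\, dx = A_n . \tag{4.4}$$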
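The Fubini step behind (4.4) says that integrating $\vartheta$ against $Z^n$ reproduces the right-hand side of (2.12) for $\phi(x) = \int_0^x \vartheta$, whatever the weights are. The sketch below checks this for an arbitrary discrete measure. The value of `gamma`, the uniform weights and the integrand $\vartheta(x) = e^{-x}$ are illustrative assumptions (the lemma itself takes $\vartheta \in C_0^\infty$), and the integral is approximated by a midpoint rule.

```python
import math

def Z(x, q, n, gamma):
    """Z^n(x) from (4.1) for Q^n = sum_i q[i] * delta_{i/n}, i = 1..n^2 (q[0] unused)."""
    total = 0.0
    for i in range(1, n * n + 1):
        y = i / n
        if x <= y <= x + 1.0 / n:
            total += q[i] * (y - 1.0 / n)
    return gamma * n * total

def integral_theta_Z(q, n, gamma, theta, cells=400):
    """Midpoint rule for the integral of theta * Z^n over [0, n]; Z^n vanishes for x > n."""
    h = 1.0 / (n * cells)
    steps = n * n * cells
    return h * sum(theta((m + 0.5) * h) * Z((m + 0.5) * h, q, n, gamma)
                   for m in range(steps))

def discrete_rhs(q, n, gamma, phi):
    """gamma * <Q^n(dx), n (x - 1/n) [phi(x) - phi(x - 1/n)]>."""
    return gamma * sum(q[i] * (i - 1) * (phi(i / n) - phi((i - 1) / n))
                       for i in range(1, n * n + 1))

def demo(n=3, gamma=2.0):
    """Return both sides for uniform weights and theta(x) = exp(-x)."""
    q = [0.0] + [1.0 / (n * n)] * (n * n)
    left = integral_theta_Z(q, n, gamma, lambda x: math.exp(-x))
    right = discrete_rhs(q, n, gamma, lambda x: 1.0 - math.exp(-x))
    return left, right
```

The two numbers returned by `demo()` agree up to the quadrature error. Equality with $A_n$ itself additionally requires that the weights solve (2.10), which this sketch does not attempt.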
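Using once more the Fubini theorem, we may estimate $A_n$ as follows:
$$\begin{aligned}
|A_n| &\le \Big\langle Q^n(dx)Q^n(dy), \frac{K(x,y)}{y} \int_x^{x+y} |\vartheta(z)|\, dz \Big\rangle \\
&= \int_0^\infty |\vartheta(z)|\, \Big\langle Q^n(dx)Q^n(dy), \frac{K(x,y)}{y}\, \mathbf{1}_{\{x\le z\}}\, \mathbf{1}_{\{z-x\le y\}} \Big\rangle\, dz \\
&\le \int_0^\infty |\vartheta(z)|\, \Big\langle Q^n(dx)Q^n(dy), \frac{K(x,y)}{y} \Big\rangle\, dz \\
&\le \|\vartheta\|_{L^1} \Big( \langle Q^n(dx), x^{\lambda} \rangle\, \langle Q^n(dx), x^{-1} \rangle + \langle Q^n(dx), x^{\alpha} \rangle\, \langle Q^n(dx), x^{-(\beta+1)} \rangle \Big) \\
&\quad + \|\vartheta\|_{L^1} \Big( \langle Q^n(dx), x^{-\beta} \rangle\, \langle Q^n(dx), x^{\alpha-1} \rangle + \langle Q^n(dx), 1 \rangle\, \langle Q^n(dx), x^{\lambda-1} \rangle \Big) \\
&\le C\, \|\vartheta\|_{L^1}
\end{aligned}$$
by Lemma 3.1. Recalling (4.4) we conclude that, for some constant $C > 0$ not depending on $n$ nor on $\vartheta$,
$$\Big| \int_0^\infty \vartheta(x)\, Z^n(x)\, dx \Big| \le C\, \|\vartheta\|_{L^1}$$
for every $\vartheta \in C_0^\infty((0,\infty))$. A density argument next ensures that the previous estimate is actually true for every $\vartheta \in L^1(0,\infty)$. The dual space of $L^1(0,\infty)$ being $L^\infty(0,\infty)$, the first assertion of Lemma 4.1 readily follows.

(ii) We take $\vartheta \in C_0^\infty((0,\infty))$ and choose
$$\phi(x) = \int_0^x \vartheta(y)\, y^{\lambda-1}\, dy , \qquad x \in [0,\infty) ,$$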
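in (2.12). Setting
$$A_n = \Big\langle Q^n(dx)Q^n(dy), \frac{K_n(x,y)}{y}\, [\phi(x+y) - \phi(x)] \Big\rangle ,$$
we infer from the Fubini theorem and (2.12) that
$$\int_0^\infty \vartheta(x)\, x^{\lambda-1}\, Z^n(x)\, dx = A_n . \tag{4.5}$$
To estimate $A_n$, we first remark that $K_2$ and $K_3$ satisfy, for some constant $C > 0$,
$$K(x,y) \le C\, (x^\mu y^\nu + x^\nu y^\mu) , \qquad 0 \le \nu \le \mu < 1 , \qquad \lambda = \mu + \nu ,$$
with $(\mu,\nu) = (\alpha\beta, 0)$ for $K = K_2$ and $(\mu,\nu) = (\max\{\alpha,\beta\}, \min\{\alpha,\beta\})$ for $K = K_3$. We next fix $p \in (1, 1/\lambda)$. Then there exists $\varepsilon > \nu$ small enough such that
$$(\lambda + \varepsilon - \nu)\, p < 1 . \tag{4.6}$$
Introducing $\sigma = 1 - \lambda - \varepsilon < 1 - \lambda$, we infer from (4.6) that
$$(2 - \lambda - \sigma - \nu)\, p > 1 \quad \text{and} \quad 2 - \lambda - \sigma - \nu - \frac{1}{p} < 1 - \lambda . \tag{4.7}$$
Now, since $\sigma < 1 - \lambda \le 1 - \nu$, we infer from the Fubini theorem that
$$\begin{aligned}
|A_n| &\le \Big\langle Q^n(dx)Q^n(dy), \frac{K(x,y)}{y} \int_x^{x+y} |\vartheta(z)|\, z^{\lambda-1}\, dz \Big\rangle \\
&\le \int_0^\infty \frac{|\vartheta(z)|}{z^{1-\lambda}}\, \Big\langle Q^n(dx)Q^n(dy), \frac{K(x,y)}{y}\, \mathbf{1}_{\{x\le z\}}\, \mathbf{1}_{\{z-x\le y\}} \Big\rangle\, dz \\
&\le C \int_0^\infty \frac{|\vartheta(z)|}{z^{1-\lambda}}\, \Big\langle Q^n(dx)Q^n(dy), \frac{x^\mu y^\nu + x^\nu y^\mu}{y}\, \mathbf{1}_{\{x\le z\}}\, \mathbf{1}_{\{z-x\le y\}} \Big\rangle\, dz \\
&\le C \int_0^\infty \frac{|\vartheta(z)|}{z^{1-\lambda}}\, \Big\langle Q^n(dx)Q^n(dy), \frac{1}{(z-x)^{1-\sigma-\nu}} \Big( \frac{x^\mu}{y^\sigma} + \frac{x^\nu}{y^{\sigma+\nu-\mu}} \Big)\, \mathbf{1}_{\{x\le z\}}\, \mathbf{1}_{\{z-x\le y\}} \Big\rangle\, dz \\
&\le C \int_0^\infty \frac{|\vartheta(z)|}{z^{1-\lambda}}\, \Big\langle Q^n(dx), \frac{x^\mu}{(z-x)^{1-\sigma-\nu}}\, \mathbf{1}_{\{x\le z\}} \Big\rangle\, \langle Q^n(dy), y^{-\sigma} \rangle\, dz \\
&\quad + C \int_0^\infty \frac{|\vartheta(z)|}{z^{1-\lambda}}\, \Big\langle Q^n(dx), \frac{x^\nu}{(z-x)^{1-\sigma-\nu}}\, \mathbf{1}_{\{x\le z\}} \Big\rangle\, \langle Q^n(dy), y^{\mu-\nu-\sigma} \rangle\, dz .
\end{aligned}$$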
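Since $\sigma + \nu - \mu \le \sigma < 1 - \lambda$, we infer from Lemma 3.1 and the Fubini theorem that, setting $p' = p/(p-1)$,
$$\begin{aligned}
|A_n| &\le C \int_0^\infty \frac{|\vartheta(z)|}{z^{1-\lambda}}\, \Big\langle Q^n(dx), \frac{x^\mu + x^\nu}{(z-x)^{1-\sigma-\nu}}\, \mathbf{1}_{\{x\le z\}} \Big\rangle\, dz \\
&\le C\, \Big\langle Q^n(dx), (x^\mu + x^\nu) \int_x^\infty \frac{|\vartheta(z)|}{(z-x)^{1-\sigma-\nu}\, z^{1-\lambda}}\, dz \Big\rangle \\
&\le C\, \|\vartheta\|_{L^{p'}}\, \Big\langle Q^n(dx), (x^\mu + x^\nu) \Big( \int_x^\infty (z-x)^{-p(1-\sigma-\nu)}\, z^{-p(1-\lambda)}\, dz \Big)^{1/p} \Big\rangle \\
&\le C\, \|\vartheta\|_{L^{p'}}\, \Big\langle Q^n(dx), \frac{x^\mu + x^\nu}{x^{2-\lambda-\sigma-\nu-1/p}} \Big\rangle \Big( \int_1^\infty u^{-p(1-\lambda)}\, (u-1)^{-p(1-\sigma-\nu)}\, du \Big)^{1/p} \\
&\le C\, \|\vartheta\|_{L^{p'}} ,
\end{aligned}$$
where the last inequality follows from Lemma 3.1, (4.6) and (4.7). Combining this estimate with (4.5) yields
$$\Big| \int_0^\infty \vartheta(x)\, x^{\lambda-1}\, Z^n(x)\, dx \Big| \le C\, \|\vartheta\|_{L^{p'}}$$
for every $\vartheta \in C_0^\infty((0,\infty))$. A density argument next ensures that the previous estimate is actually true for every $\vartheta \in L^{p'}(0,\infty)$. Since the dual space of $L^{p'}(0,\infty)$ is $L^p(0,\infty)$, this completes the proof of Lemma 4.1.

5. Proof of Theorem 1.3

We are now in a position to complete the proof of Theorem 1.3, (1.16), (1.18) and (1.19). Owing to (2.9) and Lemma 3.1, $(Q^n(dx))$ is a tight sequence of probability measures on $[0,\infty)$. Consequently, there are a subsequence $(Q^{n_k}(dx))$ of $(Q^n(dx))$ and a probability measure $Q(dx)$ on $[0,\infty)$ such that $Q^{n_k}(dx)$ converges narrowly towards $Q(dx)$, that is,
$$\lim_{k\to\infty} \langle Q^{n_k}(dx), \phi \rangle = \langle Q(dx), \phi \rangle \tag{5.1}$$
for any $\phi \in C_b([0,\infty))$. A straightforward consequence of Lemma 3.1 and (5.1) is that
$$\langle Q(dx), x^\sigma \rangle < \infty \tag{5.2}$$
for
$$\sigma \in \mathbb{R} \ \text{ if } K = K_1 , \qquad \sigma \in [\lambda-1,\infty) \ \text{ if } K = K_2 , \qquad \sigma \in (\lambda-1,\infty) \ \text{ if } K = K_3 .$$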
In addition, thanks to the uniform bounds on some negative moments of $Q^n(dx)$ given by Lemma 3.1, we deduce that $Q(\{0\}) = 0$, so that $Q(dx)$ is a probability measure on $(0,\infty)$.

We next check that $Q(dx)$ is absolutely continuous with respect to the Lebesgue measure on $(0,\infty)$. We first consider the case when $K$ is given by (1.7). By Lemma 4.1, the sequence $(Z^n)$ is bounded in $L^\infty(0,\infty)$ and we may thus assume that there is $Z \in L^\infty(0,\infty)$ such that $(Z^{n_k})$ converges weakly towards $Z$ in $L^\infty(0,\infty)$. However, it is straightforward to deduce from (4.1) and (5.1) that $(Z^{n_k})$ converges towards $\gamma x Q(dx)$ in $\mathcal{D}'(0,\infty)$. Consequently, $\gamma x Q(dx) = Z(x)$ and, since $Q(dx)$ does not charge $\{0\}$ and satisfies (5.2), we conclude that $Q(dx) = Q(x)dx$ with $Q \in L^1(0,\infty)$. We next turn to the case where $K$ is given by (1.8) or (1.9) and fix $p \in (1, 1/\lambda)$. We infer from Lemma 4.1 that $(x^{\lambda-1} Z^n)$ is bounded in $L^p(0,\infty)$, so that we may assume that there is $\tilde{Z} \in L^p(0,\infty)$ such that $(x^{\lambda-1} Z^{n_k})$ converges weakly towards $\tilde{Z}$ in $L^p(0,\infty)$. But $(x^{\lambda-1} Z^{n_k})$ also converges towards $\gamma x^\lambda Q(dx)$ in $\mathcal{D}'(0,\infty)$ by (4.1) and (5.1). We then argue as before to conclude that $Q(dx) = Q(x)dx$ with $Q \in L^1(0,\infty)$.

We now prove that $\psi(x) = Q(x)/x$ is a weak solution to (1.13) and first check that (1.20) is satisfied by $Q$ for every $\phi$ in $C_b^2([0,\infty))$ (the extension to $C_b^1$ functions being straightforward). Owing to (5.2), we have $\langle Q(dx)Q(dy), K(x,y) \rangle < \infty$. Since (2.12) holds with $n = n_k$ for each $k$, we only have to prove that
$$\text{(i)} \ \lim_{k\to\infty} A_k = A , \qquad \text{(ii)} \ \lim_{k\to\infty} B_k = B , \tag{5.3}$$
where
$$A_k = \Big\langle Q^{n_k}(dx)Q^{n_k}(dy), \frac{K(x,y)\, \mathbf{1}_{\{x+y\le n_k\}}}{y}\, [\phi(x+y) - \phi(x)] \Big\rangle , \qquad A = \Big\langle Q(dx)Q(dy), \frac{K(x,y)}{y}\, [\phi(x+y) - \phi(x)] \Big\rangle ,$$
$$B_k = \Big\langle Q^{n_k}(dx), n_k \big( x - n_k^{-1} \big) \big[ \phi(x) - \phi(x - n_k^{-1}) \big] \Big\rangle , \qquad B = \langle Q(dx), x \phi'(x) \rangle .$$
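We first prove (5.3) (ii). Since $x\phi'(x)$ is continuous on $[0,\infty)$, it is clear from (5.1) that $(\langle Q^{n_k}(dx), x\phi'(x) \rangle)$ converges towards $\langle Q(dx), x\phi'(x) \rangle$ as $k \to \infty$. The claim (ii) then follows after noticing that
$$\begin{aligned}
\Big| \Big\langle Q^{n_k}(dx), x\phi'(x) - n_k \big( x - n_k^{-1} \big) \big[ \phi(x) - \phi(x - n_k^{-1}) \big] \Big\rangle \Big| &\le \frac{\|\phi''\|_{L^\infty}}{n_k}\, \langle Q^{n_k}(dx), x \rangle + \frac{\|\phi'\|_{L^\infty}}{n_k} \\
&\le C\, \frac{\|\phi\|_{C_b^2}}{n_k} \mathop{\longrightarrow}_{k\to\infty} 0
\end{aligned}$$
by Lemma 3.1. We next prove (5.3) (i). On the one hand, it is not difficult to deduce from (5.1) and (5.2) that
$$\lim_{k\to\infty} \Big\langle Q^{n_k}(dx)Q^{n_k}(dy), \frac{K(x,y)}{y}\, [\phi(x+y) - \phi(x)] \Big\rangle = A .$$
On the other hand,
$$\begin{aligned}
\Big\langle Q^{n_k}(dx)Q^{n_k}(dy), \frac{K(x,y)\, \mathbf{1}_{\{x+y>n_k\}}}{y}\, |\phi(x+y) - \phi(x)| \Big\rangle &\le 2\, \|\phi'\|_{L^\infty}\, \big\langle Q^{n_k}(dx)Q^{n_k}(dy), K(x,y)\, \mathbf{1}_{\{x>n_k/2\}} \big\rangle \\
&\le \frac{4}{n_k}\, \|\phi'\|_{L^\infty}\, \big\langle Q^{n_k}(dx)Q^{n_k}(dy), x K(x,y) \big\rangle .
\end{aligned}$$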
The uniform moment estimates given in Lemma 3.1 allow us to conclude that (i) holds true. Therefore, $Q(dx) = Q(x)dx$ satisfies (1.20). Finally, owing to the moment estimates (5.2), we may proceed as in Sect. 4 to show that
$$\int_0^z \int_{z-x}^\infty \frac{K(x,y)}{y}\, Q(y)\, Q(x)\, dy\, dx < \infty$$
for each $z \in (0,\infty)$. For $\vartheta \in C_b([0,\infty))$, we then take
$$\phi(x) = \int_0^x \vartheta(y)\, dy , \qquad x \in [0,\infty) ,$$
in (1.20) and use the Fubini theorem to deduce that
$$\gamma \int_0^\infty z\, Q(z)\, \vartheta(z)\, dz = \int_0^\infty \vartheta(z) \int_0^z \int_{z-x}^\infty \frac{K(x,y)}{y}\, Q(y)\, Q(x)\, dy\, dx\, dz ,$$
whence
$$\gamma\, z\, Q(z) = \int_0^z \int_{z-x}^\infty \frac{K(x,y)}{y}\, Q(y)\, Q(x)\, dy\, dx \qquad \text{for a.e. } z \in (0,\infty) .$$
Consequently, we realize that $\psi(z) = Q(z)/z$ satisfies all the properties claimed in Theorem 1.3, and the moment estimates (1.16), (1.18) or (1.19) as well.

6. Proof of Theorem 1.4

We first study the positivity of $\psi$.

Proposition 6.1. Consider the function $\psi$ constructed in Theorem 1.3. There is a continuous and positive function $g \in C((0,\infty))$ such that $\psi(x) \ge g(x) > 0$ for any $x \in (0,\infty)$.

Proof. Owing to (1.14) and the moment estimates (1.16), (1.18) or (1.19), it is sufficient to establish that $\psi$ is positive a.e. in $(0,\infty)$. Indeed, we have $K(x,y) > 0$ on $(0,\infty)^2$ and
$$\psi(z) \ge g(z) := \frac{1}{\gamma z^2} \int_0^z \int_{z-x}^\infty K(x,y)\, x\, (y \wedge 1)\, \psi(y)\, \psi(x)\, dy\, dx , \qquad z > 0 ,$$
by (1.14), where $g$ is clearly a continuous and positive function on $(0,\infty)$.

Step 1. We claim that $\psi$ does not vanish in a neighbourhood of $x = 0$. Indeed, assume for contradiction that $\mathrm{supp}\, \psi \subset (\delta,\infty)$ for some $\delta \in (0,1)$. Setting
$$R(z) = \int_0^z x\, \psi(x)\, dx , \qquad z \in [0,\infty) ,$$
we infer from (1.14) and (1.16), (1.18) or (1.19) that, for $z > \delta$,
$$\begin{aligned}
\gamma\, R'(z) &\le \frac{1}{z} \int_0^z \int_\delta^\infty x K(x,y)\, \psi(y)\, \psi(x)\, dy\, dx \\
&\le \frac{C(\delta)}{\delta^2} \int_\delta^z \int_\delta^\infty (1 + x + y)\, x y\, \psi(y)\, \psi(x)\, dy\, dx \\
&\le C(\delta)\, (1 + z)\, R(z) .
\end{aligned}$$
To obtain the previous estimate, we have used that $K(x,y) \le C(\delta)\, (1 + x + y)$ when $(x,y) \in [\delta,\infty)^2$ for the three classes of coagulation kernels (1.7), (1.8) and (1.9), and the moment estimates (1.16), (1.18) or (1.19). Consequently, for each $z > \delta$,
$$R(z) \le R(\delta)\, \exp\Big\{ C(\delta) \int_\delta^z (1 + y)\, dy \Big\} = 0 ,$$
which contradicts the fact that R(z) → 1 as z → ∞. Step 2. Assume now for contradiction that there is z ∈ (0, ∞) such that ψ(z) = 0. By (1.14) we have z
0
∞
which in turn implies that z/2 0
xK(x, y) ψ(y) ψ(x) dydx = 0 ,
z−x
z
xK(x, y) ψ(y) ψ(x) dydx = 0 .
z/2
Since K(x, y) > 0 in (0, ∞)2 , we deduce that z z/2 ψ(x)dx × ψ(x)dx = 0
0
z/2
whence
z/2 z
z
ψ(y) ψ(x) dydx = 0 ,
z/2
ψ(y) dy = 0
z/2
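by Step 1. Consequently, $\psi = 0$ a.e. in $(z/2, z)$ and we may iterate the process to conclude that $\psi = 0$ a.e. in $(0, z)$, which contradicts Step 1.

We conclude the paper with the proof of (1.17).

Proposition 6.2. Assume that $K = K_1$ and consider the function $\psi$ constructed in Theorem 1.3. Then there is a constant $\varepsilon > 0$ such that
$$\int_0^\infty e^{\varepsilon x^{-\beta}}\, x^{-\beta}\, \psi(x)\, dx < \infty . \tag{6.4}$$

Proof. For $a \in (0,1)$, $\varepsilon > 0$ and $x \in (0,\infty)$, we put $\vartheta_a(x) = \exp\big\{ \varepsilon\, (x+a)^{-\beta} \big\}$ and $\phi(x) = \vartheta_a(x)/x$. Though $\phi \notin C_b^1([0,\infty))$, the moment estimates (1.16) allow us to use (1.15) for this choice of test function. On the one hand, we may proceed as in the proof of Step 4 of Lemma 3.1 to obtain that
$$\begin{aligned}
A &:= \int_0^\infty \int_0^\infty x K(x,y)\, [\phi(x) - \phi(x+y)]\, \psi(x)\, \psi(y)\, dy\, dx \\
&\ge \int_0^\infty \int_0^\infty \frac{K(x,y)}{2}\, [\vartheta_a(x) + \vartheta_a(y) - \vartheta_a(x+y)]\, \psi(x)\, \psi(y)\, dy\, dx \\
&\ge \int_0^\infty \int_0^\infty \frac{K(x,y)}{4}\, [\vartheta_a(x) + \vartheta_a(y)]\, \psi(x)\, \psi(y)\, dy\, dx \\
&\ge \int_0^\infty \int_0^\infty \frac{K(x,y)}{2}\, \vartheta_a(x)\, \psi(x)\, \psi(y)\, dy\, dx \\
&\ge \frac{1}{2} \int_0^\infty x^\alpha\, \psi(x)\, dx \int_0^\infty \frac{\vartheta_a(x)}{x^\beta}\, \psi(x)\, dx .
\end{aligned}$$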
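On the other hand, since $\int_0^\infty x\, \psi(x)\, dx = 1$, we have
$$\begin{aligned}
\gamma^{-1} B &:= - \int_0^\infty x^2\, \phi'(x)\, \psi(x)\, dx = \int_0^\infty \Big( 1 + \frac{\varepsilon \beta x}{(x+a)^{1+\beta}} \Big)\, \vartheta_a(x)\, \psi(x)\, dx \\
&\le \int_0^\infty \Big( 1 + \frac{\varepsilon \beta}{x^\beta} \Big)\, \vartheta_a(x)\, \psi(x)\, dx \\
&\le \int_0^{\varepsilon^{1/\beta}} \frac{\varepsilon}{x^\beta}\, \vartheta_a(x)\, \psi(x)\, dx + \frac{1}{\varepsilon^{1/\beta}} \int_{\varepsilon^{1/\beta}}^\infty x\, \vartheta_a(x)\, \psi(x)\, dx + \varepsilon \beta \int_0^\infty \frac{\vartheta_a(x)}{x^\beta}\, \psi(x)\, dx \\
&\le \varepsilon\, (1 + \beta) \int_0^\infty \frac{\vartheta_a(x)}{x^\beta}\, \psi(x)\, dx + \frac{e}{\varepsilon^{1/\beta}} .
\end{aligned}$$
Since $2A = 2B$ by (1.15), we end up with
$$\Big( \int_0^\infty x^\alpha\, \psi(x)\, dx - 2\, \varepsilon\, (1 + \beta)\, \gamma \Big) \int_0^\infty \frac{\vartheta_a(x)}{x^\beta}\, \psi(x)\, dx \le \frac{2 \gamma e}{\varepsilon^{1/\beta}} .$$
Choosing
$$\varepsilon = \frac{1}{4 \gamma (1 + \beta)} \int_0^\infty x^\alpha\, \psi(x)\, dx > 0 ,$$
which is positive since $\int_0^\infty x\, \psi(x)\, dx = 1$, we obtain that
$$\int_0^\infty \frac{\vartheta_a(x)}{x^\beta}\, \psi(x)\, dx \le \frac{4 \gamma e}{\varepsilon^{1/\beta}} \Big( \int_0^\infty x^\alpha\, \psi(x)\, dx \Big)^{-1} . \tag{6.5}$$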
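Two elementary facts drive the bound on $\gamma^{-1}B$: the identity $-x^2\phi'(x) = \big(1 + \varepsilon\beta x/(x+a)^{1+\beta}\big)\vartheta_a(x)$ for $\phi = \vartheta_a/x$, and the bound $\vartheta_a(x) \le e$ for $x \ge \varepsilon^{1/\beta}$. The sketch below checks both numerically with a central finite difference. The parameter values are arbitrary illustrative choices, not values from the paper.

```python
import math

def theta(x, a, eps, beta):
    """theta_a(x) = exp(eps * (x + a)^(-beta))."""
    return math.exp(eps * (x + a) ** (-beta))

def phi(x, a, eps, beta):
    """Test function phi(x) = theta_a(x) / x."""
    return theta(x, a, eps, beta) / x

def lhs(x, a, eps, beta, rel_step=1e-6):
    """-x^2 * phi'(x), with phi' approximated by a central difference."""
    h = rel_step * x
    dphi = (phi(x + h, a, eps, beta) - phi(x - h, a, eps, beta)) / (2.0 * h)
    return -x * x * dphi

def rhs(x, a, eps, beta):
    """(1 + eps * beta * x / (x + a)^(1 + beta)) * theta_a(x)."""
    return (1.0 + eps * beta * x / (x + a) ** (1.0 + beta)) * theta(x, a, eps, beta)

def max_relative_error(points, a=0.2, eps=0.3, beta=0.5):
    """Largest relative discrepancy between lhs and rhs over the sample points."""
    return max(abs(lhs(x, a, eps, beta) - rhs(x, a, eps, beta)) / rhs(x, a, eps, beta)
               for x in points)
```

With these sample parameters, `max_relative_error([0.1, 0.5, 1.0, 3.0])` is of the order of the finite-difference error, and `theta(x, 0.2, 0.3, 0.5)` stays below `math.e` for every `x >= 0.3 ** 2`.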
Since the right-hand side of the above estimate does not depend on $a \in (0,1)$, we let $a \to 0$ in (6.5) and use the Fatou lemma to complete the proof.

Acknowledgements. We thank Christoph Walker for pointing out to us the availability of [2, Prop. 22.13], and Christophe Giraud, Sylvain Rubenthaler and Etienne Tanré for valuable discussions.
References
1. Aldous, D.J.: Deterministic and stochastic models for coalescence (aggregation, coagulation): a review of the mean-field theory for probabilists. Bernoulli 5, 3–48 (1999)
2. Amann, H.: Ordinary differential equations. An introduction to nonlinear analysis. de Gruyter Studies in Mathematics 13, Berlin: Walter de Gruyter & Co., 1990
3. Bertoin, J.: Eternal solutions to Smoluchowski’s coagulation equation with additive kernel and their probabilistic interpretation. Ann. Appl. Probab. 12, 547–564 (2002)
4. da Costa, F.P.: On the dynamic scaling behaviour of solutions to the discrete Smoluchowski equations. Proc. Edinburgh Math. Soc. (2) 39, 547–559 (1996)
5. Cueille, S., Sire, C.: Nontrivial polydispersity exponents in aggregation models. Phys. Rev. E 55, 5465–5478 (1997)
6. Deaconu, M., Tanré, E.: Smoluchowski’s coagulation equation: probabilistic interpretation of solutions for constant, additive and multiplicative kernels. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 29, 549–579 (2000)
7. van Dongen, P.G.J., Ernst, M.H.: Scaling solutions of Smoluchowski’s coagulation equation. J. Statist. Phys. 50, 295–329 (1988)
8. Drake, R.L.: A general mathematical survey of the coagulation equation. In: “Topics in Current Aerosol Research (part 2),” International Reviews in Aerosol Physics and Chemistry, Oxford: Pergamon Press, 1972, pp. 203–376
9. Escobedo, M., Mischler, S., Perthame, B.: Gelation in coagulation and fragmentation models. Commun. Math. Phys. 231, 157–188 (2002)
10. Filbet, F., Laurençot, Ph.: Numerical simulation of the Smoluchowski coagulation equation. SIAM J. Sci. Comput. 25, 2004–2028 (2004)
11. Friedlander, S.K., Wang, C.S.: The self-preserving particle size distribution for coagulation by brownian motion. J. Colloid Interface Sci. 22, 126–132 (1966)
12. Jeon, I.: Existence of gelling solutions for coagulation-fragmentation equations. Commun. Math. Phys. 194, 541–567 (1998)
13. Kreer, M., Penrose, O.: Proof of dynamical scaling in Smoluchowski’s coagulation equation with constant kernel. J. Statist. Phys. 75, 389–407 (1994)
14. Krivitsky, D.S.: Numerical solution of the Smoluchowski kinetic equation and asymptotics of the distribution function. J. Phys. A 28, 2025–2039 (1995)
15. Laurençot, Ph., Mischler, S.: On coalescence equations and related models. In: “Modeling and computational methods for kinetic equations”. P. Degond, L. Pareschi, G. Russo (eds.), Boston: Birkhäuser, 2004, pp. 321–356
16. Laurençot, Ph., Mischler, S.: Liapunov functionals for Smoluchowski’s coagulation equation and convergence to self-similarity. Monatsh. Math., to appear
17. Lee, M.H.: A survey of numerical solutions to the coagulation equation. J. Phys. A 34, 10219–10241 (2001)
18. Leyvraz, F.: Scaling theory and exactly solved models in the kinetics of irreversible aggregation. Phys. Rep. 383, 95–212 (2003)
19. Lushnikov, A.A., Kulmala, M.: Singular self-preserving regimes of coagulation processes. Phys. Rev. E 65, 041604 (2002)
20. Meesters, A., Ernst, M.H.: Numerical evaluation of self-preserving spectra in Smoluchowski’s coagulation theory. J. Colloid Interface Sci. 119, 576–587 (1987)
21. Menon, G., Pego, R.L.: Dynamical scaling in Smoluchowski’s coagulation equations: uniform convergence. SIAM J. Math. Anal., to appear
22. Menon, G., Pego, R.L.: Approach to self-similarity in Smoluchowski’s coagulation equations. Comm. Pure Appl. Math. 57, 1197–1232 (2004)
23. Smoluchowski, M.: Drei Vorträge über Diffusion, Brownsche Molekularbewegung und Koagulation von Kolloidteilchen. Physik. Zeitschr. 17, 557–599 (1916)

Communicated by J.L. Lebowitz
Commun. Math. Phys. 256, 611–620 (2005) Digital Object Identifier (DOI) 10.1007/s00220-005-1289-6
Communications in
Mathematical Physics
Analysis of $S^2$-Valued Maps and Faddeev’s Model
Dave Auckly¹, Lev Kapitanski²
¹ Department of Mathematics, Kansas State University, Manhattan, KS 66506, USA
² Department of Mathematics, University of Miami, Coral Gables, FL 33124, USA
Received: 13 March 2004 / Accepted: 5 October 2004 Published online: 12 February 2005 – © Springer-Verlag 2005
Abstract: In this paper we consider a generalization of the Faddeev model for the maps from a closed three-manifold into the two-sphere. We give a novel representation of smooth $S^2$-valued maps based on flat connections. This representation allows us to obtain an analytic description of the homotopy classes of $S^2$-valued maps that generalizes to Sobolev maps. It also leads to a new proof of an old theorem of Pontrjagin. For the generalized Faddeev model, we prove the existence of minimizers in every homotopy class.

The first author was partially supported by NSF grant DMS-0204651.
The second author was partially supported by NSF grant DMS-0436403.

1. Introduction

In 1975 L. D. Faddeev introduced an interesting nonlinear sigma-model motivated by the Hopf invariant and the Skyrme model, [5]. The Hopf invariant is an integer associated to any continuous map from the 3-sphere to the 2-sphere. Roughly, it counts the linking number of the inverse images of two generic points on $S^2$. Homotopy classes of such maps are completely classified by the Hopf invariant, [8, 3]. Faddeev’s model is now often called the Faddeev–Hopf model. It is also referred to as the Faddeev–Skyrme model. The fields in this model are maps $n$ from $\mathbb{R}^3$ to $S^2$ asymptotically constant at infinity. The energy functional is
$$E(n) = \int_{\mathbb{R}^3} |dn|^2 + |dn \wedge dn|^2\, d^3x . \tag{1}$$
The Hopf invariant can be evaluated analytically as follows, [3],
$$Q(n) = \int_{\mathbb{R}^3} \alpha \wedge d\alpha , \tag{2}$$
where $\alpha$ is the unique $\delta$-closed 1-form vanishing at infinity such that $d\alpha$ equals the pull-back, $n^*\omega_{S^2}$, of the volume 2-form on $S^2$. In coordinates, $\omega_{S^2} = \frac{1}{4\pi} \big( n^1\, dn^2 \wedge dn^3 + n^2\, dn^3 \wedge dn^1 + n^3\, dn^1 \wedge dn^2 \big)$. In the homotopy class corresponding to $Q(n) \in \mathbb{Z}$, the energy is bounded from below (see [18]):
$$E(n) \ge \mathrm{const} \cdot |Q(n)|^{3/4} .$$
This estimate leads one to believe that each sector (homotopy class) should have a ground state, that is, a minimizer of $E(n)$. Faddeev expected the minimizers with $Q(n) \ne 0$ to have interesting knot-like structures, [5], and recently several numerical investigations have provided some support to this conjecture, see [6, 7] and the references therein.

Mathematically, one problem with a domain of $\mathbb{R}^3$ is that it is not compact; intuitively, one may imagine a minimizing sequence with a concentrated lump sliding to infinity. When the domain is a closed three-manifold, this particular difficulty is avoided. However, all of the other interesting physical and mathematical features remain. If one only wished to consider minimizers on $S^3$, the Hopf invariant gives the homotopy classification and has an analytic expression analogous to (2). When $\mathbb{R}^3$ or $S^3$ is replaced by a general Riemannian three-manifold, $M^3$, the homotopy classification of maps to $S^2$ is more complicated. The classification result is due to Pontrjagin, [12]. Pontrjagin, in fact, classifies the maps from general three-complexes to the 2-sphere. For the special case of three-manifolds, we repeat his result in the proposition below. Compared to the $S^3$ case, two new features arise. First of all, there is a new invariant given by the induced map on second cohomology. Second, the Hopf invariant generalizes into a secondary invariant that sometimes takes values in a finite cyclic group.

Theorem 1 (Pontrjagin). Let $M$ be a closed, connected, oriented three-manifold. To any continuous map $\varphi$ from $M$ to $S^2$ one associates a cohomology class $\varphi^*\mu_{S^2} \in H^2(M;\mathbb{Z})$, where $\mu_{S^2}$ is a generator of $H^2(S^2;\mathbb{Z})$. Every class may be obtained from some map, and two maps with different classes lie in different homotopy classes. The homotopy classes of maps with a fixed class $\alpha \in H^2(M;\mathbb{Z})$ are in bijective correspondence with $H^3(M;\mathbb{Z})/(2\alpha \cup H^1(M;\mathbb{Z}))$.
The known proofs of Pontrjagin’s proposition provide a pretty picture of the homotopy classification. These proofs are geometric and combinatorial in nature, and cannot be easily used in our minimization problem. To circumvent this difficulty, we give a novel description of smooth maps from closed three-manifolds into the 2-sphere. Namely, an $S^2$-valued map will be represented by a flat connection and a smooth reference map. Here we rely on our earlier work [1] and on the concurrent research in [2]. This description allows us to obtain an analytically friendly picture of the homotopy classification and a new proof of Pontrjagin’s result. At the same time, this presentation fits in well with Faddeev’s functional, which we rewrite using the connection and reference map. In [2] the developing map associated with the connection is used to compute the fundamental group and rational cohomology of the configuration space. The first part of Sect. 2 relates maps into $S^2$ to maps into $S^3$ which is the basis of our proof of the Pontrjagin theorem. We then introduce flat connections to encode $S^3$-valued maps. To encode an $S^3$-valued map by a flat connection, a framing is required. Without the framing, the map is determined up to an orientation preserving isometry of $S^2$. Since the Faddeev functional is invariant under isometries, we ignore framings. Our description of orientation preserving isometry classes of smooth $S^2$-valued maps in terms of special flat connections is given in Theorem 2. In Sect. 3 we rewrite the Faddeev energy
functional in terms of flat connections. At this point we turn from smooth connections to connections with finite energy. Our main result, Theorem 3, is that the analytical conditions fixing the homotopy class are well-defined for finite energy connections. In addition, we prove the existence of a minimizer of the energy functional in every class. When the primary obstruction vanishes, there is an alternate approach to minimization that may be interesting in its own right. This is discussed in Sect. 4.

2. $S^2$-Valued Maps and Homotopy Classification

Let $M$ be a closed, orientable 3-manifold and $\mu_{S^2}$ be a generator of $H^2(S^2,\mathbb{Z})$. The image of $\mu_{S^2}$ in the de Rham cohomology is the equivalence class of the form $\omega_{S^2}$ given previously. For smooth (or just continuous) maps $\psi : M \to S^2$, it is well known that $\psi^*\mu_{S^2}$ is a homotopy invariant. Every class $\alpha \in H^2(M,\mathbb{Z})$ arises from some map $\psi$. Here is the standard construction: Let $\gamma$ be a 1-cycle in $M$ dual to $\alpha$. Since $M$ is orientable, the normal bundle to $\gamma$ is trivial. Using a trivialization of the normal bundle, each fiber may be identified with $\mathbb{R}^2$ and mapped via stereographic projection onto the 2-sphere. Finally, map the complement of the normal bundle to the North pole. Now, there are many trivializations of the normal bundle. Any two trivializations are related by some number of twists (full rotations of the fiber when moving along $\gamma$). The number of twists is the secondary invariant described in the second part of Proposition 1. To describe the secondary invariant analytically, we will need a few constructions. We start with notation.

Notation.
• $Sp_1$ is the group of unit quaternions, $q^*$ denotes the quaternionic conjugation of $q$. The Lie algebra of $Sp_1$ is the purely imaginary quaternions, denoted $sp_1$.
• $S^1$ is the group of unit complex numbers regarded as a subgroup of $Sp_1$.
• $S^2$ will be identified with the unit sphere in the purely imaginary quaternions.
• The usual dot-product may be expressed using quaternionic multiplication as $\langle p, q \rangle = \frac{1}{2} (p^* q + q^* p)$.
• $C^\infty(X, Y)$ denotes the space of smooth maps from $X$ to $Y$.
• $W^{s,p}$ denotes the usual Sobolev space of functions with $s$ derivatives in $L^p$; $W^{s,p}(M, Sp_1)$ denotes the subset of quaternion-valued $W^{s,p}$ functions on $M$ which take values in $Sp_1$ almost everywhere; $W^{s,p}(M, S^2)$ is defined analogously.

Given any map $\varphi : M \to S^2$ and any map $u : M \to Sp_1$, we construct a new map $\psi : M \to S^2$ by $\psi(x) = u(x)\varphi(x)u(x)^*$. We will show that $\psi$ has the same associated cohomology class as $\varphi$. Conversely, we will see that any map $\psi$ with the same associated cohomology class may be represented in this way. To prove these facts we will need several maps (compare with the discussion in [2]). The most important map is $q : S^2 \times S^1 \to Sp_1$ defined by $q(z, \lambda) = q\lambda q^*$, where $z = q i q^*$. This bizarre looking map will later encode the gauge freedom when we describe $S^2$-valued maps via $S^3$-valued maps. It will be important later that $q$ has degree 2 (with standard orientations). One can check that $i$ is a regular value with inverse image $\{\pm(i, i)\}$, and $q$ is orientation preserving at each point. The second map is given by $f : S^2 \times Sp_1 \to S^2 \times S^2$, where $f(z, q) = (z, q z q^*)$. Define a free right $S^1$-action $\rho : S^2 \times Sp_1 \times S^1 \to S^2 \times Sp_1$ by $\rho(z, p, \lambda) = (z, p\, q(z, \lambda))$. The quotient map associated with $\rho$ is $f$. Thus $f$ is the projection of a principal fiber bundle by standard facts from the theory of group actions on manifolds, and therefore, $f$ is a fibration by
the covering homotopy theorem, [4, 17]. For comparison, recall that the Hopf fibration $h : Sp_1 \to S^2$ is given by the map $h(q) = q i q^*$. Consider two maps, $\varphi$ and $\psi$, going from $M$ to $S^2$. Define $Q_{\varphi,\psi} = \{ (x, q) \in M \times Sp_1 \mid \psi(x) = q \varphi(x) q^* \}$. This bundle is the pull-back of the fibration $f$ under the map $(\varphi, \psi) : M \to S^2 \times S^2$.

Lemma 1. There exists a smooth map $u : M \to Sp_1$ such that $\psi$ and $\varphi$ are related by $\psi = u \varphi u^*$ if and only if $\psi^*\mu_{S^2} = \varphi^*\mu_{S^2}$.

Proof. Denote $P = S^2 \times Sp_1$ considered as a principal bundle over $S^2 \times S^2$ with bundle map $f$. Let $f_1, f_2 : P \to S^2$ be the first and second components of $f$, i.e., $f_1(z, q) = z$ and $f_2(z, q) = q z q^*$. Notice that $f_2^*\mu_{S^2} = f_1^*\mu_{S^2}$ because $f_2^*\mu_{S^2}[S^2 \times \{1\}] = f_1^*\mu_{S^2}[S^2 \times \{1\}]$ and $[S^2 \times \{1\}]$ generates $H_2(S^2 \times Sp_1; \mathbb{Z})$. If $\psi = u \varphi u^*$, then $\psi^*\mu_{S^2} = (\varphi, u)^* f_2^*\mu_{S^2} = (\varphi, u)^* f_1^*\mu_{S^2} = \varphi^*\mu_{S^2}$. In the opposite direction, assume that $\psi^*\mu_{S^2} = \varphi^*\mu_{S^2}$. We will prove that the bundle $Q_{\varphi,\psi}$ is then trivial. Hence, it admits a section $\sigma : M \to Q_{\varphi,\psi}$. The composition of $\sigma$ with the projection on the second component of $Q_{\varphi,\psi}$ is the desired $u$. To see that $Q_{\varphi,\psi}$ is trivial, we compute the first Chern class of the associated line bundle. We have
$$c_1(Q_{\varphi,\psi}) = (\varphi, \psi)^* c_1(P) = (\varphi, \psi)^* (\mathrm{pr}_1^*\mu_{S^2} - \mathrm{pr}_2^*\mu_{S^2}) = \varphi^*\mu_{S^2} - \psi^*\mu_{S^2} = 0 . \tag{3}$$
In the above computation, $\mathrm{pr}_k$ is the projection of $S^2 \times S^2$ onto the $k$-th factor. We have $c_1(P)[\{i\} \times S^2] = -1$, since $f$ restricted to $\{i\} \times S^2$ is the Hopf fibration. In addition, for the diagonal map $\Delta : S^2 \to S^2 \times S^2$ we have $c_1(P)[\Delta(S^2)] = 0$, since $\Delta^* P$ admits the section $\sigma(z) = (z, 1)$. Recalling that $[\Delta(S^2)] = [\{i\} \times S^2] + [S^2 \times \{i\}]$, we conclude that $c_1(P) = \mathrm{pr}_1^*\mu_{S^2} - \mathrm{pr}_2^*\mu_{S^2}$, as used above.

We can now complete our proof of Pontrjagin’s theorem. Fix a reference map $\varphi$. Let $D(M, S^2)$ be the set of smooth maps from $M$ to $S^2$ with the same associated cohomology class as $\varphi$. Using the covering homotopy property of the fibration $f$, we obtain a fibration $C^\infty(M, Sp_1) \to D(M, S^2)$ with homotopy fiber $C^\infty(M, S^1)$. The fiber is included by $\lambda \mapsto q(\varphi, \lambda)$. The fibration is given by $u \mapsto u\varphi u^*$. This fibration induces a short exact sequence at the level of path components:
$$\pi_0(C^\infty(M, S^1)) \to \pi_0(C^\infty(M, Sp_1)) \to \pi_0(D(M, S^2)).$$
It is well known that $\pi_0(C^\infty(M, S^1))$ is isomorphic to $H^1(M; \mathbb{Z})$ by $\lambda \mapsto \lambda^*\mu_{S^1}$, and $\pi_0(C^\infty(M, Sp_1))$ is isomorphic to $H^3(M; \mathbb{Z})$ by $u \mapsto u^*\mu_{Sp_1}$. Now,
$$q(\varphi, \lambda)^*\mu_{Sp_1} = (\varphi, \lambda)^* q^*\mu_{Sp_1} = (\varphi, \lambda)^*(2\mu_{S^2} \cup \mu_{S^1}) = 2\varphi^*\mu_{S^2} \cup \lambda^*\mu_{S^1}. \qquad (4)$$
Here we used the fact that $q$ has degree 2. From this computation and the exact sequence, it follows that
$$\pi_0(D(M, S^2)) \cong H^3(M; \mathbb{Z})/(2\varphi^*\mu_{S^2} \cup H^1(M; \mathbb{Z})).$$
This completes our proof of Pontrjagin’s theorem.

Fix a reference map $\varphi : M \to S^2$. We have just shown that any map $\psi$ with the same associated cohomology class $\varphi^*\mu_{S^2}$ may be represented in the form $\psi = u\varphi u^*$ for some $u : M \to Sp_1$. If $\psi$ is homotopic to $\varphi$, then, in addition, $u$ may be chosen homotopic to a constant map. This follows by a simple application of the covering homotopy property of the fibration $f$. In fact, there are many such maps $u$. For any map $\lambda : M \to S^1$, the
map $u\,q(\varphi, \lambda)$ may also be used to represent $\psi$. However, $u\,q(\varphi, \lambda)$ will not necessarily be null-homotopic. Varying the map $\lambda$ one obtains all $Sp_1$-valued maps representing $\psi$. Using Eq. (4) we see that
$$\deg\, u\,q(\varphi, \lambda) = 2(\varphi^*\mu_{S^2} \cup \lambda^*\mu_{S^1})[M] + \deg u. \qquad (5)$$
The term $2(\varphi^*\mu_{S^2} \cup \lambda^*\mu_{S^1})[M]$ in (5) is an even integer, because $\mu_{S^2}$ and $\mu_{S^1}$ are integral classes. The map $\eta \mapsto (\varphi^*\mu_{S^2} \cup \eta)[M]$ from $H^1(M; \mathbb{Z})$ to $\mathbb{Z}$ is a group homomorphism, and therefore has image $m\mathbb{Z}$ for some $m$ depending on the class $\varphi^*\mu_{S^2}$. Since the degree of a null-homotopic map is zero, we conclude that the degree of any map $u$ corresponding to a map $\psi$ homotopic to $\varphi$ lies in $2m\mathbb{Z}$.

Remark 1. All homotopy classes of maps $\psi : M \to S^2$ with the same second cohomology class $\psi^*\mu_{S^2} = \varphi^*\mu_{S^2}$ are obtained in the form $\psi = u\varphi u^*$. The maps $u_1\varphi u_1^*$ and $u_2\varphi u_2^*$ are homotopic if and only if $\deg u_1 \equiv \deg u_2 \pmod{2m}$.

Every map $u : M \to Sp_1$ is the developing map of a flat connection $a = u^{-1}du$. Connections arising from such $u$’s have trivial holonomy. Conversely, given any flat connection, $a$, with trivial holonomy, one can find a map $u : M \to Sp_1$ so that $a = u^{-1}du$. Such $u$ is unique up to left multiplication by a constant in $Sp_1$. Left multiplication of $u$ by a constant will change $\psi$ by an orientation preserving isometry in $S^2$. In general, the degree of the map $u$ is given by $-\frac{1}{12\pi^2}\int_M \mathrm{Re}(a \wedge a \wedge a)$. Notice that $da + a \wedge a = 0$ implies that this integral is exactly the Chern-Simons invariant,
$$cs(a) = \frac{1}{4\pi^2}\int_M \mathrm{Re}\Big(a \wedge da + \frac{2}{3}\, a \wedge a \wedge a\Big),$$
of the flat connection $a$. Recall from the previous paragraph that $u\varphi u^*$ is homotopic to $\varphi$ if and only if $\deg u = -\frac{1}{12\pi^2}\int_M \mathrm{Re}(a \wedge a \wedge a) \in 2m\mathbb{Z}$.

For any $\psi = u\varphi u^*$ homotopic to $\varphi$, we consider the Hodge decomposition of the $\varphi$-component, $\langle a, \varphi\rangle$, of the associated connection $a = u^{-1}du$. Let $H\langle a, \varphi\rangle$ be the harmonic component of $\langle a, \varphi\rangle$. The space of harmonic 1-forms on $M$ is identified with $H^1(M; \mathbb{R})$. Let $\{\eta_1, \ldots, \eta_b\}$ be an integral basis for $H^1(M; \mathbb{R})$ and write $H\langle a, \varphi\rangle = h_1\eta_1 + \cdots + h_b\eta_b$.
Recall that every class $\eta \in H^1(M; \mathbb{Z})$ is a pull-back of $\mu_{S^1}$ under a smooth map $\lambda : M \to S^1$. Given such $\lambda$, let $a^\lambda = q^{-1}aq + q^{-1}dq$, with $q = q(\varphi, \lambda)$, be the flat connection corresponding to $u\,q(\varphi, \lambda)$. Notice that $\langle a^\lambda, \varphi\rangle - \langle a, \varphi\rangle$ is a de Rham representative of the image of $\lambda^*\mu_{S^1}$ in $H^1(M; \mathbb{R})$. We conclude that by an appropriate choice of gauge, $\lambda$, we can make each coefficient $h_k$ assume values in the interval $[0, 1)$. After this is achieved, turn to the $d$-component of $\langle a, \varphi\rangle$. Note that $u\,q(\varphi, e^{i\theta})$ will also represent $\psi$ for any smooth $\theta : M \to \mathbb{R}$. The flat connection corresponding to $u\,q(\varphi, e^{i\theta})$ is $a^{\exp(i\theta)} = q^{-1}aq + q^{-1}dq$, with $q = q(\varphi, e^{i\theta})$. Now, $\langle a^{\exp(i\theta)}, \varphi\rangle - \langle a, \varphi\rangle = d\theta$. Thus, by an appropriate choice of gauge, we may further assume that the $d$-component of $\langle a, \varphi\rangle$ is zero. We summarize all of these observations in the following theorem.

Theorem 2. Any orientation preserving $S^2$-isometry class of a smooth map from $M$ to $S^2$ homotopic to $\varphi$ is uniquely represented by a smooth flat connection $a$, which has trivial holonomy and satisfies the conditions
1. $cs(a) = -\frac{1}{12\pi^2}\int_M \mathrm{Re}(a \wedge a \wedge a) \in 2m\mathbb{Z} = \{2(\varphi^*\mu_{S^2} \cup \eta)[M] \mid \eta \in H^1(M; \mathbb{Z})\}$.
2. $H\langle a, \varphi\rangle = h_1\eta_1 + \cdots + h_b\eta_b$ with $h_1, \ldots, h_b \in [0, 1)$.
3. $\delta\langle a, \varphi\rangle = 0$ (here and below $\delta$ is the adjoint of the exterior derivative, $d$).
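The quaternionic algebra behind the gauge freedom can be sanity-checked numerically. The following is a minimal sketch, not part of the original argument: it tests at sample points that $u\,q(\varphi,\lambda)$ represents the same $\psi$ as $u$, using the closed form $q(z,\lambda) = \cos\theta + \sin\theta\, z$ for $\lambda = e^{i\theta}$ (which follows from $q\lambda q^* = \cos\theta + \sin\theta\, q i q^*$). All function names (`qmul`, `adjoint`, `gauge`, `gauge_check`, `same_homotopy_class`) are hypothetical helpers introduced only for this illustration; the last one simply encodes the congruence of Remark 1, with the integer $m$ supplied by the caller.

```python
import math
import random


def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z) = w + xi + yj + zk."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (
        pw * qw - px * qx - py * qy - pz * qz,
        pw * qx + px * qw + py * qz - pz * qy,
        pw * qy - px * qz + py * qw + pz * qx,
        pw * qz + px * qy - py * qx + pz * qw,
    )


def conj(q):
    """Quaternionic conjugate q*; for a unit quaternion this is its inverse."""
    w, x, y, z = q
    return (w, -x, -y, -z)


def random_unit(rng, dim):
    """Random unit vector in R^dim (normalized Gaussian sample)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)


def adjoint(u, phi):
    """The conjugation phi -> u phi u*."""
    return qmul(qmul(u, phi), conj(u))


def gauge(z, theta):
    """q(z, lam) for lam = cos(theta) + i sin(theta), z a unit imaginary quaternion.

    If z = q i q*, then q lam q* = cos(theta) + sin(theta) z, so the value
    does not depend on the choice of q.
    """
    c, s = math.cos(theta), math.sin(theta)
    return (c, s * z[1], s * z[2], s * z[3])


def max_abs_diff(p, q):
    """Largest componentwise difference between two quaternions."""
    return max(abs(a - b) for a, b in zip(p, q))


def gauge_check(trials=200, seed=0):
    """Worst deviation from (u q(phi,lam)) phi (u q(phi,lam))* = u phi u* over random samples."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        u = random_unit(rng, 4)               # a point of Sp_1
        phi = (0.0,) + random_unit(rng, 3)    # a point of S^2 (unit imaginary quaternion)
        theta = rng.uniform(0.0, 2.0 * math.pi)
        psi = adjoint(u, phi)
        psi_gauged = adjoint(qmul(u, gauge(phi, theta)), phi)
        # psi must stay purely imaginary and must not change under the gauge.
        worst = max(worst, abs(psi[0]), max_abs_diff(psi, psi_gauged))
    return worst


def same_homotopy_class(deg_u1, deg_u2, m):
    """Criterion of Remark 1: deg u1 = deg u2 (mod 2m); for m = 0 the degrees must agree."""
    if m == 0:
        return deg_u1 == deg_u2
    return (deg_u1 - deg_u2) % (2 * m) == 0
```

The check only probes the pointwise identity; it says nothing about regularity or homotopy classes of the maps themselves, which is where the actual content of Theorem 2 lies.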
3. Minimizers of the Faddeev Functional, (I)

The Faddeev energy of a map $\psi : M \to S^2$, $\psi(x) = \psi^1(x)i + \psi^2(x)j + \psi^3(x)k$, is given by
$$E(\psi) = \int_M |d\psi|^2 + |d\psi \wedge d\psi|^2 \, dvol,$$
where $|d\psi|^2 = |d\psi^1|^2 + |d\psi^2|^2 + |d\psi^3|^2$, and
$$|d\psi \wedge d\psi|^2 = |d\psi^1 \wedge d\psi^2|^2 + |d\psi^2 \wedge d\psi^3|^2 + |d\psi^3 \wedge d\psi^1|^2 = \tfrac{1}{4}|[d\psi, d\psi]|^2.$$
We begin this section by rewriting $E(\psi)$ using the representation of $S^2$-valued maps in Theorem 2. Fix a smooth reference map $\varphi : M \to S^2$. If $\psi$ is smooth and homotopic to $\varphi$, it can be represented as $\psi = u\varphi u^*$ with $u : M \to Sp_1$. Substitute this expression into the energy functional, use Ad-invariance of the norm and the Lie bracket, and the notation $a = u^* du$ to obtain
$$E(\psi) = E_\varphi[a] := \int_M |D_a\varphi|^2 + |D_a\varphi \wedge D_a\varphi|^2, \qquad (6)$$
where $D_a\varphi = d\varphi + [a, \varphi]$. There are several advantages in using $E_\varphi[a]$ as the primary expression for the energy functional; the two main advantages are: the conditions fixing the homotopy class can be expressed analytically in terms of $a$; the primary field, $a$, takes values in a linear space.

The natural space for our minimization problem is the space $A_\varphi$ of finite energy flat connections $a$ satisfying the conditions of Theorem 2. More precisely, we assume that $a \in L^2(M, sp_1)$, that $da + a \wedge a = 0$ in the sense of distributions, and that $a$ has trivial holonomy, $\rho_a = 1$ (see [1], Sect. 3, Lemma 4). In addition, we assume that $E_\varphi[a] < \infty$ and $cs(a) = -\frac{1}{12\pi^2}\int_M \mathrm{Re}(a \wedge a \wedge a) \in 2m\mathbb{Z}$. Also, we require that $H\langle a, \varphi\rangle = h_1\eta_1 + \cdots + h_b\eta_b$ with $h_k \in [0, 1]$ and $\delta\langle a, \varphi\rangle = 0$. Denote this class by $A_\varphi$. Note that we now allow $h_k = 1$. By doing so, we lose the unique representation of the orientation preserving isometry class of $\psi$, but this is more convenient for taking limits, and we can always return to the case $h_k \in [0, 1)$ by a smooth gauge transformation.

Theorem 3. The functional $E_\varphi[a]$ has a minimum in the class $A_\varphi$.

Proof. Let $a_n$ be a minimizing sequence in $A_\varphi$. Our first step is to show that the $a_n$ are uniformly bounded in $L^2$. For each of the forms $a_n$ we use the orthogonal decomposition
$$a_n = \langle a_n, \varphi\rangle\varphi + \tfrac{1}{2}\varphi[a_n, \varphi].$$
Accordingly, the curvature condition, $da_n + a_n \wedge a_n = 0$, decomposes into
$$d\langle a_n, \varphi\rangle = \tfrac{1}{4}[\varphi\, d\varphi, [a_n, \varphi]] + \tfrac{1}{4}\varphi[a_n, \varphi] \wedge [a_n, \varphi], \qquad (7)$$
$$\tfrac{1}{2}\, d\big(\varphi[a_n, \varphi]\big) = \langle a_n, \varphi\rangle \wedge D_{a_n}\varphi + \tfrac{1}{4}\, d\varphi \wedge [a_n, \varphi] + \tfrac{1}{4}[a_n, \varphi] \wedge d\varphi. \qquad (8)$$
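As a side check, the two pointwise identities underlying Eq. (6) and the orthogonal decomposition can be verified by hand. The sketch below assumes only that $u$ is $Sp_1$-valued (so $u^* = u^{-1}$ and $a = u^* du$), that $\varphi$ is a unit imaginary quaternion, and that imaginary quaternions are identified with vectors in $\mathbb{R}^3$ so that $ab = -\langle a, b\rangle + a \times b$; the vector notation is introduced here for illustration only.

```latex
% (i) Conjugation turns d into the covariant derivative D_a.
% From u^* u = 1 we get du = u a and d(u^*) = -u^* (du) u^* = -a u^*, hence
\begin{align*}
  d\psi = d(u \varphi u^*)
        &= (du)\varphi u^* + u (d\varphi) u^* + u \varphi\, d(u^*) \\
        &= u \big( a\varphi + d\varphi - \varphi a \big) u^*
         = u \big( d\varphi + [a, \varphi] \big) u^*
         = u \,(D_a \varphi)\, u^* .
\end{align*}
% Conjugation by a unit quaternion is an isometry, so |d\psi| = |D_a\varphi| and
% d\psi \wedge d\psi = u (D_a\varphi \wedge D_a\varphi) u^*, which gives Eq. (6).

% (ii) Orthogonal decomposition of an imaginary quaternion a along \varphi.
\begin{align*}
  [a, \varphi] &= 2\, a \times \varphi, \\
  \varphi [a, \varphi] &= -2 \langle \varphi, a \times \varphi \rangle
                          + 2\, \varphi \times (a \times \varphi)
                        = 2 \big( a - \langle a, \varphi \rangle \varphi \big), \\
  a &= \langle a, \varphi \rangle \varphi + \tfrac{1}{2}\, \varphi [a, \varphi] .
\end{align*}
```

The second summand in (ii) is perpendicular to $\varphi$, which is why the decomposition is orthogonal.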
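Recall that $\varphi$ is a smooth $S^2$-valued function. The terms $[a_n, \varphi]$ and $[a_n, \varphi] \wedge [a_n, \varphi]$ are uniformly bounded in $L^2$:
$$\|[a_n, \varphi]\|_{L^2} \le \|D_{a_n}\varphi\|_{L^2} + \|d\varphi\|_{L^2},$$
$$\|[a_n, \varphi] \wedge [a_n, \varphi]\|_{L^2} \le \|D_{a_n}\varphi \wedge D_{a_n}\varphi\|_{L^2} + 2\|d\varphi\|_{L^\infty}\|[a_n, \varphi]\|_{L^2} + \|d\varphi \wedge d\varphi\|_{L^2}.$$
The harmonic part of $\langle a_n, \varphi\rangle$ is uniformly bounded in $L^\infty$ by the assumptions of our class $A_\varphi$. From Eq. (7) and $\delta\langle a_n, \varphi\rangle = 0$ we obtain an elliptic estimate
$$\|\langle a_n, \varphi\rangle\|_{W^{1,2}} \le C\big(\|d\langle a_n, \varphi\rangle\|_{L^2} + \|H\langle a_n, \varphi\rangle\|_{L^2}\big). \qquad (9)$$
Thus, the sequence $a_n$ is uniformly bounded in $L^2$. Choose a subsequence, called also $a_n$, that converges weakly in $L^2$, and let the limit be $a$. Note that $\langle a_n, \varphi\rangle\varphi \rightharpoonup \langle a, \varphi\rangle\varphi$ and $\varphi[a_n, \varphi] \rightharpoonup \varphi[a, \varphi]$ in $L^2$. In fact, $\langle a_n, \varphi\rangle$ converges to $\langle a, \varphi\rangle$ strongly in $L^p$ for $2 \le p < 6$ by the compactness of the embedding $W^{1,2} \subset L^p$.

Consider the wedge products $\varphi[a_n, \varphi] \wedge \varphi[a_n, \varphi]$. By passing to a further subsequence of $a_n$, we will assume that $\varphi[a_n, \varphi] \wedge \varphi[a_n, \varphi]$ converges weakly in $L^2$ to some 2-form $\xi$. Let us show that $\xi = \varphi[a, \varphi] \wedge \varphi[a, \varphi]$. We need the following version of the div-curl lemma, [13].

Lemma 2. Let $M$ be a smooth 3-dimensional Riemannian manifold. Let $\omega_1^m \in L^2$ be a sequence of matrix-valued differential forms and $\omega_2^m \in L^2$ be a sequence of matrix-valued differential forms on $M$. If $\omega_1^m$ converges weakly in $L^2$ to a form $\omega_1$ and $\omega_2^m$ converges weakly in $L^2$ to a form $\omega_2$, and if each sequence $d\omega_1^m$ and $d\omega_2^m$ is precompact in $W^{-1,2}_{loc}(M)$, then $\omega_1^m \wedge \omega_2^m$ converges to $\omega_1 \wedge \omega_2$ in the sense of distributions.

In our case, $d(\varphi[a_n, \varphi])$ is given by Eq. (8). We have
$$\|\langle a_n, \varphi\rangle \wedge D_{a_n}\varphi\|_{L^{3/2}} \le \|\langle a_n, \varphi\rangle\|_{L^6}\|D_{a_n}\varphi\|_{L^2},$$
$$\|d\varphi \wedge [a_n, \varphi] + [a_n, \varphi] \wedge d\varphi\|_{L^2} \le 2\|[a_n, \varphi]\|_{L^2}\|d\varphi\|_{L^\infty},$$
and $\|\langle a_n, \varphi\rangle\|_{L^6} \le C\|\langle a_n, \varphi\rangle\|_{W^{1,2}}$ by the embedding $W^{1,2} \subset L^6$. We note that $L^{3/2}$ and $L^2$ are compactly embedded in $W^{-1,2}_{loc}(M)$ by the Sobolev embedding theorem. Using estimate (9), we conclude that the sequence $d(\varphi[a_n, \varphi])$ is precompact in $W^{-1,2}_{loc}(M)$. Applying the div-curl lemma, we obtain that $\xi = \varphi[a, \varphi] \wedge \varphi[a, \varphi]$. At this point we can conclude that $E_\varphi[a] \le \liminf E_\varphi[a_n]$.

It remains to prove that $a \in A_\varphi$. We start by showing that the limit form, $a$, is distributionally flat. We know that $da_n$ converges to $da$ in the sense of distributions since $a_n \rightharpoonup a$ in $L^2$. Next, $a_n \wedge a_n$ converges to $a \wedge a$ in distributions since
$$a_n \wedge a_n = \tfrac{1}{4}[a_n, \varphi] \wedge [a_n, \varphi] - \langle a_n, \varphi\rangle \wedge [a_n, \varphi],$$
and the right-hand side converges in the sense of distributions by the previous arguments. We next note that the holonomy of $a$ is trivial by Lemma 8 of [1]. We have already seen that the energy of $a$ is finite. The harmonic part of $\langle a_n, \varphi\rangle$ converges to the harmonic part of $\langle a, \varphi\rangle$ because the space of harmonic forms is finite dimensional. Finally, consider the degree,
$$-\frac{1}{12\pi^2}\int_M \mathrm{Re}(a_n \wedge a_n \wedge a_n) = -\frac{1}{96\pi^2}\int_M \mathrm{Re}\big(\varphi[a_n, \varphi] \wedge \varphi[a_n, \varphi] \wedge \varphi[a_n, \varphi]\big) - \frac{1}{16\pi^2}\int_M \mathrm{Re}\big(\langle a_n, \varphi\rangle\varphi \wedge [a_n, \varphi] \wedge [a_n, \varphi]\big).$$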
We already know that $\langle a_n, \varphi\rangle$ converges strongly in $L^2$ to $\langle a, \varphi\rangle$ and $\varphi[a_n, \varphi] \wedge \varphi[a_n, \varphi] \rightharpoonup \varphi[a, \varphi] \wedge \varphi[a, \varphi]$. This implies the convergence of the second term. We are going to use the div-curl lemma on the first term. We know that $\varphi[a_n, \varphi] \rightharpoonup \varphi[a, \varphi]$ and $d(\varphi[a_n, \varphi])$ is precompact in $W^{-1,2}$. Now,
$$d\big([a_n, \varphi] \wedge [a_n, \varphi]\big) = 2\langle a_n, \varphi\rangle \wedge \varphi[D_{a_n}\varphi, D_{a_n}\varphi] + 2\langle a_n, \varphi\rangle \wedge [\varphi\, d\varphi, D_{a_n}\varphi] + \varphi\, d\varphi \wedge [a_n, \varphi] \wedge [a_n, \varphi].$$
The first 1-form factor in each term is uniformly bounded in $L^6$ and the following 2-forms are uniformly bounded in $L^2$. It follows that the entire expression is uniformly bounded in $L^{3/2}$, hence precompact in $W^{-1,2}$. The div-curl lemma allows us to conclude that, after taking a subsequence, $cs(a_n) \to cs(a)$. This completes the proof of the theorem.

4. Minimizers of the Faddeev Functional, (II)

Given a smooth reference map $\varphi$, we now know that there is a minimizer of $E_\varphi$ in the class $A_\varphi$. The smooth connections in this class correspond exactly to SO(3)-equivalence classes of smooth maps from $M$ to $S^2$ homotopic to $\varphi$. A general connection, $a$, in the class $A_\varphi$ may also be represented as $a = u^* du$, but now $u$ is only in $W^{1,2}(M, Sp_1)$. This follows from Lemma 6 of [1]. The corresponding map $\psi = u\varphi u^*$ lives in $W^{1,2}(M, S^2)$ and has finite energy $E(\psi)$. We believe that minimizers of $E_\varphi$ are smooth, but this is an open problem. If a minimizer, $a$, is smooth, then the corresponding $u$ and $\psi$ are smooth as well. In the smooth case, by Lemma 1, $\psi^*\mu_{S^2} = \varphi^*\mu_{S^2}$ independent of $u$. Even stronger, $\psi$ is homotopic to $\varphi$. Thus, such a $\psi$ would be a minimizer of the original Faddeev functional, $E$, in the class of smooth maps homotopic to $\varphi$. Even without any additional regularity result, $\psi$ is a minimizer of $E$ in the class of finite energy Sobolev maps of the form $u\varphi u^*$ with $u \in W^{1,2}(M, Sp_1)$ and $cs(u^* du) \in \{2(\varphi^*\mu_{S^2} \cup \eta)[M] \mid \eta \in H^1(M; \mathbb{Z})\}$. It is an open problem to extend obstruction theory to the class of finite energy Sobolev maps.
In particular, a reasonable extension of the definition of pull-back for Sobolev maps with finite energy would imply $\psi^*\mu_{S^2} = \varphi^*\mu_{S^2}$. The pull-back by a $W^{1,2}(M, S^2)$ map is well-defined in the de Rham theory.

Lemma 3. For any $u \in W^{1,2}(M, Sp_1)$, we have
$$(u\varphi u^*)^*\omega_{S^2} = \varphi^*\omega_{S^2} + \frac{1}{2\pi}\, d\,\mathrm{Re}\big(\varphi\, u^* du\big)$$
almost everywhere on $M$.

Proof. In quaternionic notation, the standard volume form on $S^2$ is given by $\omega_{S^2} = -\frac{1}{8\pi}\mathrm{Re}(z\, dz \wedge dz)$. The proof follows by straightforward computation.

Notice that the integrals $\int_M \psi^*\omega_{S^2} \wedge \eta_k$ specify $\varphi^*\mu_{S^2}$ up to torsion. The secondary homotopy invariant works even better. To wit: $\int_M \mathrm{Re}(u^* du \wedge u^* du \wedge u^* du)$ is well defined.

In addition to Theorem 3, we prove the following result. Let $F_k$ denote the class of all finite energy maps $\psi \in W^{1,2}(M, S^2)$ for which there exists a 1-form $\theta^\psi \in W^{1,2}$ such that
$$\psi^*\omega_{S^2} = d\theta^\psi \quad \text{and} \quad \int_M \theta^\psi \wedge d\theta^\psi = k. \qquad (10)$$
The argument from [18] shows that there exists a constant $c_M > 0$ so that
$$E(\psi) \ge c_M |k|^{3/4}, \quad \text{for all } \psi \in F_k.$$

Theorem 4. In every nonvoid class $F_k$ there exists a minimizer of $E$.

Proof. Choose a minimizing sequence, $\psi_n$, and let $\theta_n$ denote the corresponding $\theta^{\psi_n}$. By Hodge theory, we may assume that each $\theta_n$ is co-closed ($\delta\theta_n = 0$), and has trivial harmonic part. Taking a subsequence, $\psi_n$ converges weakly in $W^{1,2}(M, S^2)$ and almost everywhere to some $\psi \in W^{1,2}(M, S^2)$. A direct computation shows that pointwise $|\psi^*\omega_{S^2}| = (8\pi)^{-1}|d\psi \wedge d\psi|$. Since $E(\psi_n)$ is bounded, $\|\psi_n^*\omega_{S^2}\|_{L^2}$ is bounded. Taking another subsequence, $\psi_n^*\omega_{S^2} \rightharpoonup \xi$ in $L^2$, for some $\xi$. To show that $\xi = \psi^*\omega_{S^2}$, we use the div-curl lemma. We have $\psi_n^*\omega_{S^2} = -\frac{1}{8\pi}\mathrm{Re}(\psi_n\, d\psi_n \wedge d\psi_n)$. Certainly, $d\psi_n \rightharpoonup d\psi$ and $\psi_n\, d\psi_n \rightharpoonup \psi\, d\psi$ in $L^2$. Their differentials are $0$ and $d\psi_n \wedge d\psi_n$, respectively, which are both precompact in $W^{-1,2}$. Thus, $\xi = \psi^*\omega_{S^2}$. This implies that $E(\psi) \le \inf_{F_k} E$. The 1-forms $\theta_n$ are uniformly bounded in $W^{1,2}$, hence, by taking a subsequence, converge weakly in $W^{1,2}$ and strongly in $L^2$ to some $\theta$. At the same time, $d\theta_n \rightharpoonup d\theta$ in $L^2$. Recalling that $d\theta_n = \psi_n^*\omega_{S^2} \rightharpoonup \psi^*\omega_{S^2}$, we conclude that $d\theta = \psi^*\omega_{S^2}$. Since $\theta_n \to \theta$ and $d\theta_n \rightharpoonup d\theta$, we obtain $\int_M \theta \wedge d\theta = k$. Thus, $\psi \in F_k$ and $\psi$ is the minimizer of $E$.

5. Concluding Remarks

One can pose many minimization problems for the functionals
$$E(\psi) = \int_M |d\psi|^2 + |d\psi \wedge d\psi|^2 \, dvol,$$
and
$$E_\varphi[a] := \int_M |D_a\varphi|^2 + |D_a\varphi \wedge D_a\varphi|^2.$$
For example, one could minimize the first functional over all maps with fixed primary obstruction only; one could minimize the second functional over the classes of flat connections with fixed nontrivial holonomy, or fixed Chern-Simons invariant, or arbitrary holonomy. The arguments that we used in Theorems 3 and 4 apply equally well to each of these problems.

There are several interesting open questions related to this model. We have already mentioned that we expect the minimizers to be smooth. What about maps in general: Do $W^{1,2}$ maps with finite Faddeev energy have extra regularity? Note that the second term in $E(\psi)$ rules out local singularities of the form $x \mapsto x/|x|$. How does one extend obstruction theory to finite energy maps? In particular, what is the appropriate definition of homotopy of finite energy maps? Is there a cohomology theory that agrees with integral
singular theory for which pull-back is defined for finite energy maps? Pull-backs in such a theory should also be homotopy invariant and analogues of Lemma 1 and Proposition 1 should hold. (Note that the proofs of Lemma 1 and Proposition 1 only require continuity. Thus, the above questions only make sense if there exist honestly discontinuous finite energy maps.) The Hopf invariant given in Eq. (2) or $k$ in (10) should be an integer. This has not been verified for Sobolev maps.

In another direction, it would be very interesting to see the structure of the minimizers. There may very well exist explicit minimizers on special Riemannian manifolds such as the three-sphere, three-torus, lens spaces, etc. There may be new phenomena for closed domains that could be discovered by numerical experimentation. The closed case has the additional advantage that one does not have to worry about behavior at infinity.

References
1. Auckly, D., Kapitanski, L.: Holonomy and Skyrme’s model. Commun. Math. Phys. 240, 97–122 (2003)
2. Auckly, D., Speight, M.: Fermionic quantization and configuration spaces for the Skyrme and Faddeev-Hopf models. http://arxiv.org/abs/hep-th/0411010
3. Bott, R., Tu, L.W.: Differential forms in algebraic topology. New York-Berlin: Springer-Verlag, 1982
4. Bredon, G.E.: Introduction to compact transformation groups. New York-London: Academic Press, 1972
5. Faddeev, L.D.: Quantization of solitons. Preprint IAS print-75-QS70 (1975)
6. Faddeev, L.D.: Knotted solitons and their physical applications. Phil. Trans. R. Soc. Lond. A 359, 1399–1403 (2001)
7. Faddeev, L.D., Niemi, A.J.: Stable knot-like structures in classical field theory. Nature 387, 58–61 (1997)
8. Hopf, H.: Über die Abbildungen der dreidimensionalen Sphäre auf die Kugelfläche. Math. Annalen 104, 637–665 (1931)
9. Kapitanski, L.: On Skyrme’s model. In: Nonlinear Problems in Mathematical Physics and Related Topics II: In Honor of Professor O. A. Ladyzhenskaya, Birman et al., (eds.), Dordrecht: Kluwer, 2002, pp. 229–242
10. Kobayashi, S., Nomizu, K.: Foundations of differential geometry, Vols. I, II. New York: John Wiley & Sons, Inc., 1996
11. Munkres, J.R.: Elementary differential topology. Revised edition. Annals of Mathematics Studies, No. 54, Princeton, N.J.: Princeton University Press, 1966
12. Pontrjagin, L.: A classification of mappings of the three-dimensional complex into the two-dimensional sphere. Rec. Math. [Mat. Sbornik] N. S. 9(51), 331–363 (1941)
13. Robbin, J. W., Rogers, R. C., Temple, B.: On weak continuity and the Hodge decomposition. Trans. AMS 303(2), 609–618 (1987)
14. Skyrme, T.H.R.: A non-linear field theory. Proc. R. Soc. London A 260(1300), 127–138 (1961)
15. Skyrme, T.H.R.: A unified theory of mesons and baryons. Nucl. Phys. 31, 556–569 (1962)
16. Skyrme, T.H.R.: The origins of Skyrmions. Int. J. Mod. Phys. A3, 2745–2751 (1988)
17. Spanier, E.H.: Algebraic topology. New York: Springer, 1966
18. Vakulenko, A.F., Kapitanski, L.: On $S^2$-nonlinear $\sigma$-model. Dokl. Acad. Nauk SSSR 248, 810–814 (1979)

Communicated by L. Takhtajan
Commun. Math. Phys. 256, 621–634 (2005) Digital Object Identifier (DOI) 10.1007/s00220-005-1310-0
Communications in
Mathematical Physics
Punctured Haag Duality in Locally Covariant Quantum Field Theories
Giuseppe Ruzzi
Dipartimento di Matematica, Università di Roma “Tor Vergata”, Via della Ricerca Scientifica, 00133 Roma, Italy. E-mail: [email protected]
Received: 16 March 2004 / Accepted: 30 July 2004 Published online: 2 March 2005 – © Springer-Verlag 2005
Abstract: We investigate a new property of nets of local algebras over 4-dimensional globally hyperbolic spacetimes, called punctured Haag duality. This property consists in the usual Haag duality for the restriction of the net to the causal complement of a point p of the spacetime. Punctured Haag duality implies Haag duality and local definiteness. Our main result is that, if we deal with a locally covariant quantum field theory in the sense of Brunetti, Fredenhagen and Verch, then also the converse holds. The free Klein-Gordon field provides an example in which this property is verified.

1. Introduction

The charged sectors of a net of local observables in a 4-dimensional globally hyperbolic spacetime have been investigated in [4]. The sectors define a C$^*$-category in which, except when there are geometrical obstructions, the charge structure arises from a tensor product, a symmetry and a conjugation. Geometrical obstructions occur when the spacetime has compact Cauchy surfaces: in this situation neither the classification of the statistics of sectors, nor the existence of a conjugation, have been established. Important progress in this direction has been achieved in [9]. In that paper the key assumption is that, given the net of local observables over M, its restriction to the causal complement of a point p of M fulfills Haag duality. This allows the author to classify the statistics of sectors, and to provide several results towards the proof of the existence of a conjugation.

The property assumed in [9], which we call punctured Haag duality, is the subject of the present paper. We will demonstrate that a net satisfying punctured Haag duality is locally definite and fulfills Haag duality. Furthermore, if we deal with a locally covariant quantum field theory [1], then we are able to show that punctured Haag duality is equivalent to Haag duality plus local definiteness. This will allow us to provide an example, the free Klein-Gordon field, satisfying punctured Haag duality.

According to [1], once a locally covariant quantum field theory is given, to any 4-dimensional globally hyperbolic spacetime M, there corresponds a net of local algebras
$\mathscr{A}_{K(M)}$ satisfying the Haag-Kastler axioms [5]. Namely, $\mathscr{A}_{K(M)}$ is an inclusion preserving map $K(M) \ni O \longrightarrow \mathcal{A}(O) \subset \mathscr{A}(M)$ assigning to each element $O$ of a collection $K(M)$ of subregions of $M$ a C$^*$-subalgebra $\mathcal{A}(O)$ of a C$^*$-algebra $\mathscr{A}(M)$, and satisfying
$$O_1 \perp O \;\Rightarrow\; [\mathcal{A}(O_1), \mathcal{A}(O)] = 0 \qquad \text{(causality).}$$
In words, causality means that if $O_1$ is causally disjoint from $O$, then the algebras $\mathcal{A}(O_1)$, $\mathcal{A}(O)$ commute elementwise. $\mathcal{A}(O)$ is the algebra generated by all the observables measurable within the region $O$. The main new feature with respect to the Haag-Kastler framework is that if $\psi$ is an isometric embedding from $M$ into another 4-dimensional globally hyperbolic spacetime $M_1$, then there is a C$^*$-morphism $\alpha_\psi$ from $\mathscr{A}(M)$ into $\mathscr{A}(M_1)$ which maps isomorphically $\mathcal{A}(O)$ onto $\mathcal{A}(\psi(O))$ for each $O \in K(M)$ (see [1] or Sect. 4.1 for details).

Now, turning to the subject of the present paper, given a state $\omega$ of $\mathscr{A}(M)$ and denoting by $\pi$ the GNS representation of $\omega$, we say that $(\mathscr{A}_{K(M)}, \omega)$ satisfies punctured Haag duality if for any point $p$ of $M$ the following identity:
$$\pi(\mathcal{A}(D_1))'' = \cap\{\pi(\mathcal{A}(D))' \mid D \in K^\diamond(M),\ D \perp (D_1 \cup \{p\})\} \qquad (1)$$
is verified for any $D_1 \in K^\diamond(M)$ such that $D_1 \perp \{p\}$, where $K^\diamond(M)$ is the subcollection of $K(M)$ formed by the regular diamonds of $M$ (see Sect. 3.1). In the Haag-Kastler framework, punctured Haag duality implies Haag duality and local definiteness (Proposition 4.1). The main result of the present paper is that, in the setting of a locally covariant quantum field theory, punctured Haag duality is equivalent to Haag duality plus local definiteness (Theorem 4.4 and Corollary 4.5).

The basic idea of the proof of Theorem 4.4 is the following. The condition $D \perp \{p\}$ means that the closure $\overline{D}$ of $D$ is contained in the open set $M \setminus J(p)$. The spacetime $M_p \equiv M \setminus J(p)$ is globally hyperbolic, hence, there is a net of local algebras $\mathscr{A}_{K(M_p)}$ associated with $M_p$. Now, as the injection $\iota_p$ of $M_p$ into $M$ is an isometric embedding, local covariance allows us to identify the net $\mathscr{A}_{K(M_p)}$ with the restriction of $\mathscr{A}_{K(M)}$ to $M \setminus J(p)$, and the algebra $\mathscr{A}(M_p)$ with a subalgebra of $\mathscr{A}(M)$. Then, if $\omega$ is a state of $\mathscr{A}(M)$, punctured Haag duality for $(\mathscr{A}_{K(M)}, \omega)$ seems to be equivalent to Haag duality for $(\mathscr{A}_{K(M_p)}, \omega|_{\mathscr{A}(M_p)})$. This is actually true, but there is a subtle point that has to be carefully considered: the regular diamonds of $M_p$ might not be regular diamonds of $M$. However, we will be able to circumvent this problem by means of the following result: the set $K^\diamond(M_p)$ and the set $\{D \in K^\diamond(M) \mid D \subset M \setminus J(p)\}$ have a common “dense” subset (Proposition 3.3). As an easy consequence of this result, we will show that the free Klein-Gordon field over a 4-dimensional globally hyperbolic spacetime, in the representation associated with a pure quasi-free state satisfying the microlocal spectrum condition, fulfills punctured Haag duality (Proposition 4.6). The same holds for pure adiabatic vacuum states of order N > 25 (Remark 4.7).

2. Preliminaries on Spacetime Geometry

We recall some basics on the causal structure of spacetimes and establish our notation. Standard references for this topic are [8, 3, 12].
Spacetimes. A spacetime $M$ consists in a Hausdorff, paracompact, smooth, oriented 4-dimensional manifold $M$ endowed with a smooth metric $g$ with signature $(-, +, +, +)$, and with a time-orientation, that is a smooth vector field $v$ satisfying $g_p(v_p, v_p) < 0$ for each $p \in M$. (Throughout this paper smooth means $C^\infty$.) A curve $\gamma$ in $M$ is a continuous, piecewise smooth, regular function $\gamma : I \longrightarrow M$, where $I$ is a connected subset of $\mathbb{R}$ with nonempty interior. It is called timelike, lightlike, spacelike if respectively $g(\dot\gamma, \dot\gamma) < 0$, $= 0$, $> 0$ all along $\gamma$, where $\dot\gamma = d\gamma/dt$. Assume now that $\gamma$ is causal, i.e. a nonspacelike curve; we can classify it according to the time-orientation $v$ as future-directed (f-d) or past-directed (p-d) if respectively $g(\dot\gamma, v) < 0$, $> 0$ all along $\gamma$. When $\gamma$ is f-d and there exists $\lim_{t \to \sup I}\gamma(t)$ ($\lim_{t \to \inf I}\gamma(t)$), then it is said to have a future (past) endpoint. In the negative case, it is said to be future (past) endless; $\gamma$ is said to be endless if none of them exist. Analogous definitions are assumed for p-d causal curves.

The chronological future $I^+(S)$, the causal future $J^+(S)$ and the future domain of dependence $D^+(S)$ of a subset $S \subset M$ are defined as:
$I^+(S) \equiv \{x \in M \mid \text{there is a f-d timelike curve from } S \text{ to } x\}$;
$J^+(S) \equiv S \cup \{x \in M \mid \text{there is a f-d causal curve from } S \text{ to } x\}$;
$D^+(S) \equiv \{x \in M \mid \text{any p-d endless causal curve through } x \text{ meets } S\}$.
These definitions have a dual in which “future” is replaced by “past” and the $+$ by $-$. So, we define $I(S) \equiv I^+(S) \cup I^-(S)$, $J(S) \equiv J^+(S) \cup J^-(S)$ and $D(S) \equiv D^+(S) \cup D^-(S)$. We recall that $I^+(S)$ is an open set, and that $\overline{J^+(S)} = \overline{I^+(S)}$ and $(J^+(S))^o = I^+(S)$. Two sets $S, V \subset M$ are causally disjoint, $S \perp V$, if $S \subseteq M \setminus J(V)$ or, equivalently, if $V \subseteq M \setminus J(S)$. A set $S$ is achronal if $S \cap I(S) = \emptyset$; it is acausal if $\{p\} \perp \{q\}$ for each pair $p, q \in S$. A (acausal) Cauchy surface $C$ of $M$ is an achronal (acausal) set verifying $D(C) = M$. Any Cauchy surface is a closed, connected, Lipschitz hypersurface of $M$. A spacelike Cauchy surface is a smooth Cauchy surface whose tangent space is everywhere spacelike. Any spacelike Cauchy surface is acausal.

Global hyperbolicity. A spacetime $M$ is strongly causal if for each point $p$ the following condition holds: any neighbourhood $U$ of $p$ contains a neighbourhood $V$ of $p$ such that for each $q_1, q_2 \in V$ the set $J^+(q_1) \cap J^-(q_2)$ is either empty or contained in $V$. $M$ is globally hyperbolic if it admits a Cauchy surface or, equivalently, if it is strongly causal and for each pair of points $p_1, p_2$ the set $J^+(p_1) \cap J^-(p_2)$ is empty or compact. Assume that $M$ is globally hyperbolic. Then $M$ can be foliated by spacelike Cauchy surfaces [2]. Namely, there is a 3-dimensional smooth manifold $\Sigma$ and a diffeomorphism $F : \mathbb{R} \times \Sigma \longrightarrow M$ such that: for each $t \in \mathbb{R}$ the set $C_t \equiv \{F(t, y) \mid y \in \Sigma\}$ is a spacelike Cauchy surface of $M$; for each $y \in \Sigma$, the curve $t \in \mathbb{R} \longrightarrow F(t, y) \in M$ is a f-d (by convention) endless timelike curve. For any relatively compact $S \subset M$ the following properties hold:
1. $\overline{J^+(S)} = J^+(\overline{S})$;
2. $D^+(\overline{S})$ is compact;
3. for each Cauchy surface $C$ the set $J^+(\overline{S}) \cap C$ is either empty or compact;
4. $J^+(S \cup \{p\}) = J^+(S \cup \{p\})$ for any $p \in M$ such that $S \cap \{p\} = \emptyset$.

The category Man [1]. Let $M$ and $M_1$ be globally hyperbolic spacetimes with metric $g$ and $g_1$ respectively. A smooth function $\psi$ from $M_1$ into $M$ is called an isometric embedding if $\psi : M_1 \longrightarrow \psi(M_1)$ is a diffeomorphism and $\psi_* g_1 = g|_{\psi(M_1)}$. The category Man is the category whose objects are the 4-dimensional globally hyperbolic spacetimes; the arrows $\mathrm{Hom}(M_1, M)$ are the isometric embeddings $\psi : M_1 \longrightarrow M$
preserving the orientation and the time-orientation of the embedded spacetime, and that satisfy the property
$$\forall p, q \in \psi(M_1),\quad J^+(p) \cap J^-(q) \text{ is either empty or contained in } \psi(M_1).$$
The composition law between two arrows $\psi$ and $\phi$, denoted by $\psi \circ \phi$, is given by the usual composition between smooth functions; the identity arrow $\mathrm{id}_M$ is the identity function of $M$.

3. Causal Excisions and Regular Diamonds

We present two index sets for nets of local algebras over a globally hyperbolic spacetime $M$: the set $K(M)$ and the set of the regular diamonds $K^\diamond(M)$. Furthermore, we introduce the spacetime $M_p$, the causal excision of $p \in M$, and study how $K(M_p)$ and $K^\diamond(M_p)$ are embedded in $M$.

3.1. The sets $K(M)$ and $K^\diamond(M)$. Let us consider a globally hyperbolic spacetime $M$ with metric $g$. The set $K(M)$ [1] is defined as the collection of the subsets $O$ of $M$ which are relatively compact, open, connected and enjoy the following property: for each pair $p, q \in O$ the set $J^+(p) \cap J^-(q)$ is either empty or contained in $O$. The importance of $K(M)$ for the locally covariant quantum field theories derives from the following properties that can be easily checked:
$$O \in K(M) \Rightarrow O \in \mathrm{Man}, \qquad (2)$$
$$\psi \in \mathrm{Hom}(M_1, M) \Rightarrow \psi(K(M_1)) \subseteq K(M), \qquad (3)$$
where for $O \in \mathrm{Man}$ we mean that $O$ with the metric $g|_O$ and with induced orientation and time orientation is globally hyperbolic. Property (2) allows one to associate a net of local algebras with $M$; property (3) makes the algebras associated with isometric regions of different spacetimes isomorphic. $K(M)$ however is too big an index set for studying properties of the net like Haag duality and punctured Haag duality (see Sect. 4.2). For this purpose the set of the regular diamonds of $M$ is well suited [4, 11]. Given a smooth manifold $N$, let
$G(N) \equiv \{G \subset N \mid G$ and $N \setminus \overline{G}$ are nonempty and open, $\overline{G}$ is compact and contractible to a point in $G$; $\partial G$ is a two-sided, locally flat embedded $C^0$-submanifold of $N$ having finitely many connected components, and in each connected component there are points near to which $\partial G$ is $C^\infty$-embedded$\}$.
Then, a regular diamond $D$ is an open subset of $M$ of the form $D = (D(G))^o$, where $G \in G(C)$ for some spacelike Cauchy surface $C$; $D$ is said to be based on $C$, while $G$ is called the base of $D$. We denote by $K^\diamond(M)$ the set of the regular diamonds of $M$, and by $K^\diamond(M, C)$ those elements of $K^\diamond(M)$ which are based on the spacelike Cauchy surface $C$. The set $K^\diamond(M)$ is a base of the topology of $M$ and $K^\diamond(M) \subset K(M)$. Moreover, let us consider $D \in K^\diamond(M, C)$ and an open set $U$ such that $D \subset U$. Then there exists ([11, Lemma 3]) $D_1 \in K^\diamond(M, C)$ such that
$$D \subset D_1, \qquad D_1 \subset (D(U \cap C))^o. \qquad (4)$$
Now, notice that ψ ∈ Hom(M1 , M) ⇒ ψ(K (M1 )) ⊂ K(M), but ψ(K (M1 )) might not be contained in K (M). To see this, consider a spacelike Cauchy surface C1 of M1 and notice that ψ(C1 ) is a spacelike hypersurface of M. If D1 is a regular diamond of M1 of the form (D(G1 ))o , where G1 ∈ G(C1 ), then ψ((D(G1 ))o ) = (D(ψ(G1 )))o and ψ(G1 ) ∈ G(ψ(C1 )). However, in general a spacelike Cauchy surface C of M such that ψ(C1 ) ⊆ C does not exist (see for instance Remark 3.2). Moreover, in general, we do not know whether ˜ for some spacelike Cauchy surface C˜ of M. So, we have no way to ψ(G1 ) ∈ G(C) conclude that ψ(D) ∈ K (M). 3.2. Causal excisions. Consider a globally hyperbolic spacetime M, with metric g, and a point p of M. As the set M \ J(p) is open and connected, the manifold Mp ≡ M \ J(p) endowed with the metric gp ≡ g|M\J(p) ,
(5)
and with the induced orientation and time-orientation, is a spacetime. Mp inherits from M the strong causality condition because this property is stable under restriction to open subsets. Moreover, for each pair p1 , p2 ∈ M \ J(p) the set J+ (p1 ) ∩ J− (p2 ) is either empty or compact and contained in M \ J(p) (the compactness follows from the global hyperbolicity of M). Hence Mp is globally hyperbolic, thus Mp ∈ Man. We call Mp the causal excision of p. Let us now denote by ιp the injection of Mp into M, that is M \ J(p) q −→ ιp (q) = q ∈ M. Clearly, ιp ∈ Hom(Mp , M). Proposition 3.1. Given a globally hyperbolic spacetime M, let Mp be the causal excision of p ∈ M. If C is an acausal (spacelike) Cauchy surface of M that meets p, then C \ {p} is an acausal (spacelike) Cauchy surface of Mp . Conversely, if Cp is an acausal Cauchy surface of Mp , then Cp ∪ {p} is an acausal Cauchy surface of M. Proof. (⇒) C\{p} is an acausal set of Mp and D+ (C\{p}) ⊆ D+ (C)\J(p). Conversely, if q ∈ D+ (C) \ J(p) each p-d endless causal curve through q meets C but not p. Hence D+ (C\{p}) ⊇ D+ (C)\J(p). This implies D+ (C\{p}) = D+ (C)\J(p) and the dual equalityD− (C \ {p}) = D− (C) \ J(p). Therefore D(C \ {p}) = D+ (C) \ J(p) ∪ D− (C) \ J(p) = D+ (C) ∪ D− (C) \J(p) = M\J(p). Clearly if C is spacelike, then C\{p} is spacelike. This completes the proof. (⇐) Set C ≡ Cp ∪ {p} and notice that M \ J(p) = D(Cp ) ⊆ D(C). Now, given q ∈ J+ (p) let us consider a p-d endless causal curve γ through q that does not meet p. Because of global hyperbolicity γ leaves J+ (p). Each connected component of γ ∩ (M \ J(p)) is a f-d endless causal curve of Mp , therefore it meets Cp . This entails J(p) ⊂ D(C) and that D(C) = M. In order to prove that C is acausal, assume that there exists a f-d causal curve γ : [0, 1] −→ M that joins two points q1 and q2 lying on C. If one of the two points is p, then γ (t) ∈ J(p) for each t ∈ [0, 1], and this leads to a contradiction. 
If q1, q2 ≠ p and γ ∩ J(p) = ∅, then γ would be a f-d causal curve of Mp joining two points of Cp, and this leads to a contradiction. The same happens in the case where q1, q2 ≠ p and γ ∩ J(p) ≠ ∅. In fact, if γ(t1) ∈ J+(p), then γ(t) ∈ J+(p) for each t ≥ t1. Analogously, if γ(t1) ∈ J−(p), then γ(t) ∈ J−(p) for each t ≤ t1.
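The chain of equalities in the (⇒) part of the proof of Proposition 3.1 is easier to follow with the grouping made explicit. The following restatement is a sketch that uses only the two identities proved there, D±(C \ {p}) = D±(C) \ J(p), the elementary set identity (X \ Z) ∪ (Y \ Z) = (X ∪ Y) \ Z, and D(C) = M for a Cauchy surface C of M; the domains of dependence of C \ {p} are taken in Mp.

```latex
\begin{align*}
D(C\setminus\{p\}) &= D^+(C\setminus\{p\}) \cup D^-(C\setminus\{p\}) \\
 &= \bigl(D^+(C)\setminus J(p)\bigr) \cup \bigl(D^-(C)\setminus J(p)\bigr) \\
 &= \bigl(D^+(C)\cup D^-(C)\bigr)\setminus J(p) \\
 &= D(C)\setminus J(p) \;=\; M\setminus J(p) \;=\; M_p .
\end{align*}
```

Together with the acausality of C \ {p}, this is what makes C \ {p} a Cauchy surface of Mp.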
Remark 3.2. Let Cp be a spacelike Cauchy surface of Mp. Because of Proposition 3.1, Cp ∪ {p} is an acausal Cauchy surface of M. However, it is not smooth in general. Consider for instance the Minkowski space M4. Let
\[ \mathbb{M}^4_o \equiv \{(t,x)\in\mathbb{M}^4 \mid -t^2 + \langle x,x\rangle > 0\}, \qquad C_o \equiv \{(t,x)\in\mathbb{M}^4 \mid -4t^2 + \langle x,x\rangle = 0,\ t>0\}, \]
where ⟨ , ⟩ denotes the canonical scalar product of R3. M4o is the causal excision of o = (0, 0, 0, 0), and Co is a spacelike Cauchy surface of M4o. Clearly, Co ∪ {o} is a nonsmooth hypersurface of M4.

We now turn to study how the injection ιp embeds K(Mp) and K⋄(Mp) into M. Concerning K(Mp) one can easily prove that
\[ K(M_p) = \{O \in K(M) \mid O \perp \{p\}\}. \tag{6} \]
As for regular diamonds, in general we do not know whether K⋄(Mp) ⊂ K⋄(M) (see the observation made in the previous section). However, the set
\[ K^{\diamond}(M_p \wedge M) \equiv K^{\diamond}(M_p) \cap K^{\diamond}(M) \tag{7} \]
of the regular diamonds shared by Mp and M is not empty. In fact, notice that if C is a spacelike Cauchy surface of M that meets p, then by Proposition 3.1 Cp ≡ C \ {p} is a spacelike Cauchy surface of Mp. Since G(Cp) ⊂ G(C), we conclude that
\[ K^{\diamond}(M_p, C_p) = \{D \in K^{\diamond}(M, C) \mid D \perp \{p\}\} \subset K^{\diamond}(M_p \wedge M). \tag{8} \]
Furthermore, K (Mp ∧ M) is a “dense” subset of both K (Mp ) and {D ∈ K (M)|D ⊥ {p}}, as the following proposition shows. Proposition 3.3. For each pair D, D1 ∈ K (Mp ) such that D ⊂ D1 , there exists Do ∈ K (Mp ∧ M) such that D ⊂ Do , Do ⊂ D1 . The same result holds true for each D, D1 ∈ {D ∈ K (M)|D ⊥ {p}} such that D ⊂ D1 . Proof. The proof follows from Propositions A.3, A.4.
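As a check on the Minkowski example of Remark 3.2, one can verify directly that Co lies in the excised spacetime, is spacelike, and fails to be smooth at o. The sketch below assumes the signature (−,+,+,+) suggested by the defining inequality, and writes Co as the graph t = τ(x) with τ(x) = |x|/2 for x ≠ 0; the function τ is notation introduced here, not taken from the text.

```latex
% On C_o one has <x,x> = 4 t^2 with t > 0, hence
\[ -t^2 + \langle x,x\rangle = 3t^2 > 0
   \quad\Longrightarrow\quad C_o \subset \mathbb{M}^4_o . \]
% Graph form: t = \tau(x) = |x|/2, so d\tau = \langle x,dx\rangle / (2|x|).
% Induced metric on the graph, estimated with the Cauchy-Schwarz inequality:
\[ -d\tau^2 + \langle dx,dx\rangle
   = \langle dx,dx\rangle - \frac{\langle x,dx\rangle^2}{4\,|x|^2}
   \;\ge\; \tfrac34\,\langle dx,dx\rangle \;>\; 0 \qquad (dx\neq 0), \]
% so C_o is spacelike.
% At the vertex, \tau(x) = |x|/2 is continuous but not differentiable at x = 0,
% so C_o \cup \{o\} is a nonsmooth graph over \mathbb{R}^3.
```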
4. Punctured Haag Duality This section is devoted to the investigation of punctured Haag duality. We will start by recalling the axioms of a locally covariant quantum field theory. Afterwards, we will show necessary and sufficient conditions for punctured Haag duality, both in the HaagKastler framework and in the setting of the locally covariant quantum field theories. Finally, we will apply these results to the theory of the free Klein-Gordon field.
4.1. Locally covariant quantum field theories. Locally covariant quantum field theory is a categorical approach to the theory of quantum fields incorporating the covariance principle of general relativity [1]. In order to introduce the axioms of the theory, we give a preliminary definition. Let us denote by Alg the category whose objects are unital C∗-algebras and whose arrows Hom(A1, A2) are the unit-preserving injective C∗-morphisms from A1 into A2. The composition law between the arrows α1 and α2, denoted by α1 ◦ α2, is given by the usual composition between C∗-morphisms; the unit arrow idA of Hom(A, A) is the identity morphism of A. A locally covariant quantum field theory is a covariant functor 𝒜 from the category Man (see Sect. 2) into the category Alg, that is, a diagram
\[
\begin{array}{ccc}
M_1 & \xrightarrow{\ \psi\ } & M_2 \\
\downarrow{\scriptstyle\mathscr{A}} & & \downarrow{\scriptstyle\mathscr{A}} \\
\mathscr{A}(M_1) & \xrightarrow{\ \alpha_\psi\ } & \mathscr{A}(M_2),
\end{array}
\]
where αψ ≡ 𝒜(ψ), such that
\[ \alpha_{\mathrm{id}_M} = \mathrm{id}_{\mathscr{A}(M)}, \qquad \alpha_\phi\circ\alpha_\psi = \alpha_{\phi\circ\psi} \]
for each ψ ∈ Hom(M1, M) and φ ∈ Hom(M, M2). The functor 𝒜 is said to be causal if, given ψi ∈ Hom(Mi, M) for i = 1, 2,
\[ \psi_1(M_1) \perp \psi_2(M_2) \ \Longrightarrow\ \bigl[\alpha_{\psi_1}(\mathscr{A}(M_1)),\ \alpha_{\psi_2}(\mathscr{A}(M_2))\bigr] = 0, \]
where ψ1(M1) ⊥ ψ2(M2) means that ψ1(M1) and ψ2(M2) are causally disjoint in M. From now on 𝒜 will denote a causal locally covariant quantum field theory.

We now turn to the notion of a state space of 𝒜. To this aim, let Sts be the category whose objects are the state spaces S(A) of unital C∗-algebras A, namely S(A) is a subset of the states of A closed under finite convex combinations and operations ω(·) → ω(A∗ · A)/ω(A∗A) for A ∈ A. The arrows between two objects S(A) and S′(A′) are the positive maps γ∗ : S(A) −→ S′(A′), arising as dual maps of injective morphisms of C∗-algebras γ : A′ −→ A, by γ∗ω(A) ≡ ω(γ(A)) for each A ∈ A′. The composition law between two arrows and the definition of the identity arrow of an object are the obvious ones. A state space for 𝒜 is a contravariant functor S between Man and Sts, that is, a diagram
\[
\begin{array}{ccc}
M_1 & \xrightarrow{\ \psi\ } & M_2 \\
\downarrow{\scriptstyle S} & & \downarrow{\scriptstyle S} \\
S(M_1) & \xleftarrow{\ \alpha^*_\psi\ } & S(M_2),
\end{array}
\]
where S(M1) is a state space of the algebra 𝒜(M1), such that
\[ \alpha^*_{\mathrm{id}_M} = \mathrm{id}_{S(M)}, \qquad \alpha^*_\psi\circ\alpha^*_\phi = \alpha^*_{\phi\circ\psi} \]
for each ψ ∈ Hom(M1, M) and φ ∈ Hom(M, M2).

In conclusion let us see how a net of local algebras over M ∈ Man can be recovered by a locally covariant quantum field theory 𝒜. For this purpose, recall that by (2) any O ∈ K(M), considered as a spacetime with the metric g|O, belongs to Man. The injection ιM,O of O into M is an element of Hom(O, M) because of the definition of K(M). Then, by using αιM,O ∈ Hom(𝒜(O), 𝒜(M)) to define A(O) ≡ αιM,O(𝒜(O)), it turns out that the correspondence AK(M), defined as
\[ K(M) \ni O \longmapsto A(O) \subset \mathscr{A}(M), \tag{9} \]
is a net of local algebras satisfying the Haag-Kastler axioms. As for the local covariance of the theory, let M1 ∈ Man with the metric g1. Notice that if ψ ∈ Hom(M, M1), then ψ(O) ∈ K(M1) for each O ∈ K(M). Since (ιM1,ψ(O))−1 ◦ ψ ◦ ιM,O is an isometry from the spacetime O into the spacetime ψ(O) (the latter equipped with the metric g1|ψ(O)), one has that
\[ \alpha_\psi : A(O) \subset \mathscr{A}(M) \longrightarrow A(\psi(O)) \subset \mathscr{A}(M_1) \tag{10} \]
is a C∗-isomorphism.

4.2. Punctured Haag duality in the Haag-Kastler framework. We investigate punctured Haag duality (1) in the Haag-Kastler framework. This means that we will study punctured Haag duality on the net of local algebras AK(M), associated with a spacetime M ∈ Man by (9), without making use of the local covariance (10). In this framework, we will obtain two necessary conditions for punctured Haag duality. To begin with, let ω be a state of the algebra 𝒜(M) and let π be the GNS representation associated with ω. We recall that (AK(M), ω) is said to be locally definite if for each p ∈ M,
\[ \mathbb{C}\cdot 1 = \cap\{\pi(A(D))'' \mid p \in D \in K^{\diamond}(M)\}; \]
it is said to satisfy Haag duality if for each D1 ∈ K⋄(M),
\[ \pi(A(D_1))'' = \cap\{\pi(A(D))' \mid D \in K^{\diamond}(M),\ D \perp D_1\}. \]
These properties (and also punctured Haag duality) are defined in terms of the local algebras associated with regular diamonds. The reason is that Haag duality has been proved, in models of quantum fields, only when the local algebras are defined in regions of the spacetime like the regular diamonds [7, 10, 6]. Punctured Haag duality is strongly related to Haag duality and local definiteness, as the following proposition shows.

Proposition 4.1. Assume that (AK(M), ω) satisfies punctured Haag duality. Then, (AK(M), ω) satisfies Haag duality. Furthermore, if ω is pure and
\[ \pi(\mathscr{A}(M)) \subseteq \cup\{\pi(A(O)) \mid O \in K(M),\ M\setminus O \neq \emptyset\}'' , \tag{11} \]
where π is the GNS representation of ω, then (AK(M), ω) is locally definite.

Proof. Obviously (AK(M), ω) fulfills Haag duality. As for the proof of local definiteness, let us define Rp ≡ ∩{π(A(D))′′ | p ∈ D ∈ K⋄(M)}, for p ∈ M. Notice that, as K⋄(M) contains a neighbourhood basis for each point p of M, causality entails that Rp ⊂ π(A(D))′ for each D ∈ K⋄(M), D ⊥ {p}. Combining this with punctured Haag duality we have that Rp ⊂ π(A(D))′′, therefore
\[ R_p \subset \pi(A(D))' \cap \pi(A(D))'', \qquad D \in K^{\diamond}(M),\ D \perp \{p\}. \tag{$*$} \]
Now, let p1 ∈ M be such that {p} ⊥ {p1}. For each D ∈ K⋄(M) containing p1, we can find D1 ∈ K⋄(M) such that p1 ∈ D1 ⊂ D and D1 ⊥ {p}. This and (∗) imply that Rp ⊂ Rp1 and, by the symmetry of ⊥, that Rp = Rp1. For a generic q ∈ M we can find a finite sequence p1, . . . , pn such that {p} ⊥ {p1}, {p1} ⊥ {p2}, . . . , {pn} ⊥ {q}. Consequently, Rp = Rq and Rp ⊂ π(A(D))′ ∩ π(A(D))′′ for each D ∈ K⋄(M). We also have that Rp ⊂ π(A(O))′ ∩ π(A(O))′′ for any O ∈ K(M) such that M \ O is nonempty. The irreducibility of π and (11) complete the proof.
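The step called obvious at the start of the proof of Proposition 4.1 can be spelled out as a short chain of inclusions. This is a sketch under an assumption: the defining equation (1) of punctured Haag duality lies outside this section, so it is read here in the form that appears in Lemma 4.3 and Theorem 4.4. Primes denote commutants, and p is any point with D1 ⊥ {p}, which is assumed to exist.

```latex
% Assumed form of punctured Haag duality, for D_1 \perp \{p\}:
%   \pi(A(D_1))'' = \cap\{ \pi(A(D))' : D \in K^\diamond(M),\ D \perp (D_1 \cup \{p\}) \}.
\begin{align*}
\pi(A(D_1))''
 &\subseteq \cap\{\pi(A(D))' \mid D \in K^{\diamond}(M),\ D \perp D_1\}
   && \text{(causality)} \\
 &\subseteq \cap\{\pi(A(D))' \mid D \in K^{\diamond}(M),\ D \perp (D_1\cup\{p\})\}
   && \text{(fewer sets are intersected)} \\
 &= \pi(A(D_1))''
   && \text{(punctured Haag duality).}
\end{align*}
```

All inclusions are therefore equalities, and the first line is Haag duality for D1.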
For later purposes, it is useful to note that if (AK(M) , ω) satisfies Haag duality, then (AK(M) , ω) is outer regular, that is for any D1 ∈ K (M) we have π(A(D1 )) = ∩ {π(A(D)) |D1 ⊂ D ∈ K (M)}.
(12)
In fact, consider D2 ∈ K (M) such that D2 ⊥ D1 . This means that D1 is contained in the open set M \ J(D2 ) = M \ J(D2 ). By (4) there is a regular diamond D such that D1 ⊂ D and D ⊥ D2 . This with the observation that π(A(D)) ⊂ π(A(D2 )) , imply that π(A(D1 )) ⊆ ∩{π(A(D)) |D ∈ K (M), D1 ⊂ D} ⊆ ∩{π(A(D2 )) |D2 ∈ K (M), D2 ⊥ D1 } = π(A(D1 )) , completing the proof of (12). 4.3. Punctured Haag duality in a locally covariant quantum field theory. In the HaagKastler framework we have seen that punctured Haag duality implies Haag duality and local definiteness. On the other hand, in the setting of the locally covariant quantum field theories, the relation between these properties is stronger. To make a precise claim, let us consider a locally covariant quantum field theory A , and let SHL (A ), SpH (A ) be two sets of state spaces of A defined as follows: • SHL (A ) is the family of state spaces S of A such that, for any M ∈ Man and for any pure state ω ∈ S(M) the pair (AK(M) , ω) is locally definite and satisfies Haag duality; • SpH (A ) is the family of state spaces S of A such that, for any M ∈ Man and for any pure state ω ∈ S(M) the pair (AK(M) , ω) satisfies punctured Haag duality and (11). Then, we will prove that SHL (A ) = SpH (A ). Notice that SpH (A ) ⊆ SHL (A ) because of Proposition 4.1. Moreover, one can easily see that, if S ∈ SHL (A ) and ω ∈ S(M) is pure, then (AK(M) , ω) fulfills (11). So, what remains to prove is that (AK(M) , ω) satisfies punctured Haag duality. To begin with, let us take S ∈ SHL (A ), M ∈ Man, p ∈ M, and a pure state ω ∈ S(M). Let Mp be the causal excision of p, and let AK(Mp ) : K(Mp ) O −→ Ap (O) ⊆ A (Mp ) be the net associated with Mp , where the subscript p is added in order to avoid confusion between the elements of AK(Mp ) and those of AK(M) . Observing that ιp ∈ Hom(Mp , M), we can define ωp (A) ≡ αι∗p ω(A) = ω(αιp (A)),
A ∈ A (Mp ).
Since ω ∈ S(M) and ωp = α∗ιp ω, by the definition of state space ωp belongs to S(Mp). The first step of our proof consists in showing that (AK(Mp), ωp) satisfies Haag duality which, according to the definition of SHL(𝒜), amounts to proving that ωp is pure. To this aim let us define
\[ V_{\iota_p}\,\pi_p(A)\,\Omega_p \equiv \pi(\alpha_{\iota_p}(A))\,\Omega, \qquad A \in \mathscr{A}(M_p), \]
where (H, π, Ω) and (Hp, πp, Ωp) are respectively the GNS constructions associated with ω and ωp.
Proposition 4.2. (AK(Mp), ωp) satisfies Haag duality. In particular: a) the representation π ◦ αιp of 𝒜(Mp) is irreducible; b) Vιp ∈ (πp, π ◦ αιp) is unitary, hence ωp is pure.

Proof. a) Let T ∈ (π ◦ αιp, π ◦ αιp). Given D1 ∈ K⋄(M) such that p ∈ D1, let us consider D ∈ K⋄(M), D ⊥ D1. By (6) D belongs to K(Mp) and by (10) A(D) = αιp(Ap(D)). Thus, T ∈ π(A(D))′ for each regular diamond D of M causally disjoint from D1. By Haag duality T ∈ π(A(D1))′′ for each D1 ∈ K⋄(M) such that p ∈ D1; hence by local definiteness T = c · 1, completing the proof. b) Observe that for each A ∈ 𝒜(Mp) we have
\[ \|V_{\iota_p}\pi_p(A)\Omega_p\|^2 = \|\pi(\alpha_{\iota_p}(A))\Omega\|^2 = (\Omega, \pi(\alpha_{\iota_p}(A^*A))\Omega) = \omega_p(A^*A) = \|\pi_p(A)\Omega_p\|^2 . \]
This entails that Vιp is a unitary intertwiner between πp and π ◦ αιp because Ωp is cyclic for πp and π ◦ αιp is irreducible. Therefore πp is irreducible and, consequently, ωp is pure. Finally, as observed above, (AK(Mp), ωp) satisfies Haag duality.

We do not know whether the sets K⋄(Mp) and {D ∈ K⋄(M) | D ⊥ {p}} are equal or not. If they were the same, punctured Haag duality for (AK(M), ω) would follow from Haag duality for (AK(Mp), ωp). Nevertheless, by Proposition 3.3 the sets K⋄(Mp) and {D ∈ K⋄(M) | D ⊥ {p}} have a common “dense” subset: the set K⋄(Mp ∧ M). This is enough for our aim and will allow us to prove punctured Haag duality in two steps.

Lemma 4.3. For any D1 ∈ K⋄(Mp ∧ M) the following identity holds:
\[ \pi(A(D_1))'' = \cap\{\pi(A(D))' \mid D \in K^{\diamond}(M),\ D \perp (D_1\cup\{p\})\}. \]
Proof. Let us consider D2 ∈ K⋄(Mp), D2 ⊥ D1. The closure of D2 is contained in the open set M \ J(D1 ∪ {p}) (see Sect. 2). By Proposition 3.3 there is Do ∈ K⋄(Mp ∧ M) satisfying the relations D2 ⊂ Do, Do ⊥ (D1 ∪ {p}). This leads to the following inclusions:
\[ \pi(A(D_1))'' \subseteq \cap\{\pi(A(D_o))' \mid D_o \in K^{\diamond}(M),\ D_o \perp (D_1\cup\{p\})\} \subseteq \cap\{\pi(A(D_2))' \mid D_2 \in K^{\diamond}(M_p),\ D_2 \perp D_1\}. \tag{$*$} \]
Recall now that by Proposition 4.2.b π(A(D1))′′ = Vιp πp(Ap(D1))′′ V∗ιp. As (AK(Mp), ωp) satisfies Haag duality we have
\[ \pi(A(D_1))'' = V_{\iota_p}\,\cap\{\pi_p(A_p(D_2))' \mid D_2 \in K^{\diamond}(M_p),\ D_2 \perp D_1\}\,V^*_{\iota_p} = \cap\{\pi(A(D_2))' \mid D_2 \in K^{\diamond}(M_p),\ D_2 \perp D_1\}. \]
Combining this with (∗) we obtain the proof.
Theorem 4.4. Given S ∈ SHL (A ), for any M ∈ Man and for any pure state ω ∈ S(M) the pair (AK(M) , ω) satisfies punctured Haag duality. Proof. Let ω ∈ S(M) be a pure state of A (M), and let π be the GNS representation associated with ω. Fix p ∈ M and D1 ∈ K (M) such that D1 ⊥ {p}. Notice that if D2 ∈ K (M) such that D1 ⊂ D2 , then D1 is contained in the open set D2 ∩ (M \ J(p)). By Proposition 3.3 there is Do ∈ K (Mp ∧ M) such that D1 ⊂ Do , Do ⊂ D2 and Do ⊥ {p}. As Do fulfills the hypotheses of Lemma 4.3, we have
π(A(D1 )) ⊆ ∩{π(A(D)) |D ∈ K (M), D ⊥ (D1 ∪ {p})} ⊆ ∩{π(A(D)) |D ∈ K (M), D ⊥ (Do ∪ {p})} = π(A(Do )) ⊆ π(A(D2 )) . These inclusions are verified for any D2 ∈ K (M) such that D1 ⊂ D2 . Hence, the outer regularity (12) implies that (AK (M) , ω) satisfies punctured Haag duality. This theorem and the observations at the beginning of this section lead to the following Corollary 4.5. SHL (A ) = SpH (A ). 4.4. The case of the free Klein-Gordon field. The theory of the free Klein-Gordon field provides, as shown in [1], an example of a locally covariant quantum field theory W with a state space Sµ . The functor W is defined as the correspondence M −→ W (M) that associates to any M ∈ Man the Weyl algebra W (M) of the free Klein-Gordon field over M. Sµ is defined as the correspondence M −→ Sµ (M) that to any M associates the collection of the states of W (M) which are locally quasiequivalent to quasi-free states of W (M) fulfilling the microlocal spectrum condition (or equivalently the Hadamard condition). We refer the reader to the cited paper and references therein for a detailed description of this example. We now prove that for any ω ∈ Sµ (M) the pair (WK(M) , ω) is locally definite and, if ω is pure, satisfies Haag duality. Hence, Sµ ∈ SHL (W ), and by Theorem 4.4 (WK(M) , ω) satisfies punctured Haag duality for any pure state ω ∈ Sµ (M). Let us start by recalling that two states ω, ω1 of W (M) are said to be locally quasiequivalent if for each O ∈ K(M) there exists an isomorphism ρO : π(W(O)) −→ π1 (W(O)) such that ρO π(A) = π1 (A) for each A ∈ W(O), where π, π1 are respectively the GNS representations of ω and ω1 . Furthermore, we need to recall the following fact ([10, Theorem 3.6]): for each M ∈ Man, if ωµ is a quasi-free state of W (M) satisfying the microlocal spectrum condition, then (WK(M) , ωµ ) is locally definite and, if ωµ is pure, it satisfies Haag duality. Proposition 4.6. Sµ ∈ SHL (W ). 
Therefore, (WK(M) , ω) satisfies punctured Haag duality for any pure state ω ∈ Sµ (M). Proof. Fix M ∈ Man and consider a pure state ω of W (M) which is locally quasiequivalent to a quasi-free state ωµ satisfying the microlocal spectrum condition. Let π and πµ be the GNS representations of ω and ωµ respectively. It has already been shown in [10] that (WK(M) , ω) satisfies Haag duality. Hence it remains to be proved that (WK(M) , ω) is locally definite. To this aim, fix p ∈ M and D1 ∈ K (M) such that p ∈ D1 . As observed above (WK(M) , ωµ ) is locally definite. This entails that ∩{πµ (W(D)) |D ∈ K (M), p ∈ D ⊂ D1 } = C·, because K (M) contains a neighbourhood basis of p. Being ω locally quasiequivalent to ωµ , there is an isomorphism ρD1 from π(W(D1 )) onto πµ (W(D1 )) such that ρD1 π(A) = πµ (A) for each A ∈ W(D1 ). Hence ρD1 : ∩{ π(W(D)) |p ∈ D ⊂ D1 } −→ ∩{ πµ (W(D)) |p ∈ D ⊂ D1 } = C· is an isomorphism and, consequently, (WK(M) , ω) is locally definite.
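The last step of the proof of Proposition 4.6 uses that the isomorphism ρD1 carries the intersection of the local von Neumann algebras onto the corresponding intersection for ωµ. A sketch of why this holds, assuming the standard fact that a ∗-isomorphism between von Neumann algebras is automatically normal; double primes denote the von Neumann algebras generated, and D ranges over regular diamonds of M.

```latex
% For D \subset D_1, \rho_{D_1} sends each generator \pi(A) to \pi_\mu(A), A \in W(D),
% so by normality of \rho_{D_1} and of its inverse
\[ \rho_{D_1}\bigl(\pi(W(D))''\bigr) = \pi_\mu(W(D))'' . \]
% A bijection commutes with intersections, hence
\[ \rho_{D_1}\Bigl(\,\bigcap_{p\in D\subset D_1}\pi(W(D))''\Bigr)
   = \bigcap_{p\in D\subset D_1}\pi_\mu(W(D))'' = \mathbb{C}\cdot 1 , \]
% and since \rho_{D_1} is injective and unital, the intersection on the left
% is \mathbb{C}\cdot 1 as well.
```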
Remark 4.7. It is worth mentioning that the family of the adiabatic vacuum states of order N > 5/2, studied in [6], is contained in Sµ: any such state is locally quasiequivalent to a quasi-free state fulfilling the microlocal spectrum condition.
A. Proof of Proposition 3.3 The proof of Proposition 3.3 comes by a slight modification of Lemmas 5, 7 and 8 of [11]. So, we give a detailed description only of the modified parts of the proofs and refer the reader to the cited paper for the assertions that we will not prove. We start by recalling some results of the cited paper. Let M ∈ Man and let F : R × −→ M be a foliation of M by spacelike Cauchy surfaces. For each acausal (spacelike) Cauchy surface C, there is an associated pair (τC , fC ), where τC : −→ R is a continuous (smooth) function, while fC , defined as fC (y) ≡ F (τC (y), y),
y ∈ ,
is an homeomorphism (diffeomorphism) fC : −→ C. Given another acausal Cauchy surface Co and the corresponding pair (τCo , fCo ), the map Co ,C : C −→ Co defined as
Co ,C (p) ≡ (fCo ◦ fC−1 )(p),
∀p ∈ C,
is an homeomorphism (diffeomorphism if C and Co are spacelike). Now, consider the causal excision Mp of p ∈ M, and a spacelike Cauchy surface Cp of Mp . By Proposition 3.1 C ≡ Cp ∪ {p} is an acausal Cauchy surface of M. Lemma A.1. For each continuous strictly positive function ε : −→ R, there exists a spacelike Cauchy surface Co that meets p, such that |τC (y) − τCo (y)| < ε(y) for each y ∈ . ≡ {F (τC (y) ± λ · ε(y), y) | y ∈ }, 0 < λ < 1, and N ≡ Proof. Let us define A± λ − (A− ) o . Notice that if p = F (t , y ) with t > τ (y ) + λ · ε(y ), M \ J+ (A+ ) ∪ J o o o o C o λ λ then p ∈ N. In fact, the f-d timelike curve γ (t) ≡ F (τC (yo ) + t · ε(yo ), yo ),
t ∈ [λ, (to − τC (yo )) · ε(yo )−1 ],
joins the point F (τC (yo ) + λ · ε(yo ), yo ) ∈ A+ λ with p. Analogously, if p = F (to , yo ) with to < τC (yo ) − λ · ε(yo ), then p does not belong to N. Hence p ∈ N, p = F (t, y) ⇐⇒ |t − τC (y)| < λ · ε(y) ⇒ |t − τC (y)| < ε(y). Now, as N is globally hyperbolic, there is a spacelike Cauchy surface Co of N that meets p. Co is also a spacelike Cauchy surface of M. Since Co ⊂ N, we have |τC (y)−τCo (y)| < ε(y) for each y ∈ . Lemma A.2. Let Cp and C as above. Consider three connected, relatively compact, open subsets G, U1 , U2 of Cp that verify G ⊂ U1 , U 1 ⊂ U2 . Then, there exists a smooth acausal Cauchy surface Co of M that meets p,such that: a) J G ∩ Co ⊂ Co ,C (U1 ), b) J Co ,C (U1 ) ∩ C ⊂ U2 . Proof. The sets N1± , defined as (D± (U1 ))o , are globally hyperbolic. Let us take two spacelike Cauchy surfaces S1± of N1± . Notice that C1± ≡ S1± ∪ (C \ U1 ) are acausal Cauchy surfaces of M, and that, there are two strictly positive continuous functions ε1± : fC−1 (U1 ) −→ R such that S1± = {F (τC (y) ± ε1± (y), y) | y ∈ fC−1 (U1 )}. Let us define U1+ ≡ J− (J+ (G) ∩ S1+ ) ∩ C, U1− ≡ J+ (J− (G) ∩ S1− ) ∩ C.
Since J+ (G) ∩ C1+ is a closed subset of C1+ and J+ (G) ∩ C1+ = J+ (G) ∩ S1+ , we have that U1+ is a closed subset of U1 and G ⊂ U1+ . The same holds for U1− . The set W1 ≡ U1+ ∪ U1− is closed, compact (because it is contained in a relatively compact set) and G ⊂ W1 ⊂ U1 . We now apply the same reasoning with respect to the inclusion U1 ⊂ U2 . Namely, given two spacelike Cauchy surfaces S2± of the spacetimes (D± (U2 ))o , we consider the acausal Cauchy surfaces C2± of M defined as C2± ≡ S2± ∪ (C \ U2 ). As above, there are two strictly positive continuous functions ε2± such that S2± = {F (τC (y) ± ε2± (y), y) | y ∈ fC−1 (U2 )}. Thus, we can find a compact set W2 of C verifying U 1 ⊂ W2 ⊂ U2 . Now let us define ε ≡ min min {ε1+ (y), ε1− (y)}, min {ε2+ (y), ε2− (y)} . y∈f −1 (W1 ) y∈f −1 (W2 ) C
C
By Lemma A.1 there is a spacelike Cauchy surface Co of M that meets p, such that |τCo (y) − τC (y)| < ε for each y ∈ . Since G ⊆ W1 , by the definition of ε the set J+ (G)∩Co is in the past of J+ (G)∩S1+ , while J− (G)∩Co is in the future of J− (G)∩S1− . Hence, we have J J(G) ∩ Co ∩ C ⊂ W1 . This entails
C,Co (J(G) ∩ Co )
=
fC ◦ fC−1 (J(G) ∩ Co ) ⊂ W1 o
⇐⇒ J(G) ∩ Co ⊂ Co ,C (W1 ) ⇒ J(G) ∩ Co ⊂ Co ,C (U1 ), completing the proof of the statement a). The same reasoning applied to the inclusion U1 ⊂ U2 leads to J(J(U 1 ) ∩ Co ) ∩ C ⊂ W2 . As Co ,C (U1 ) is contained in the closed set J(U 1 ) ∩ Co , we have that J( Co ,C (U1 )) ∩ C ⊂ W2 ⊂ U2 . Proposition A.3. Let D ∈ K (Mp ) and let V be an open set of Mp such that D ⊂ V . Then, there exist a spacelike Cauchy surface Co of M that meets p, and Do ∈ K (M, Co ) such that D ⊂ Do , and Do ⊂ (D(V ))o . Proof. Assume that D is based on a spacelike Cauchy surface Cp of Mp and let G ⊂ Cp be the base of D. The set U ≡ V ∩ Cp is open in Cp and G ⊂ U . According to [11, Lemma 3] we can find U1 , U2 ∈ G(Cp ) such that G ⊂ U1 , U 1 ⊂ U2 and U 2 ⊂ U . Let C be the acausal Cauchy surface of M defined as Cp ∪ {p}. By Lemma A.2 there is a spacelike Cauchy surface Co of M that meets p, such that J G ∩ Co ⊂ Co ,C (U1 ), J Co ,C (U1 ) ∩ C ⊂ U2 . fCo is a diffeomorphism between and Co because Co is spacelike. Notice now that fC : −→ Cp ∪ {p} is an homeomorphism because C = Cp ∪ {p} is an acausal, in general nonsmooth, Cauchy surface of M. However, Cp is spacelike, hence smooth. Then, it easily follows from the definition of fC that fC : \ {fC−1 (p)} −→ Cp is a diffeomorphism. Then, Do ≡ (D( Co ,C (U1 )))o is a regular diamond of M based on Co . The previous inclusions entail that D ⊂ Do , Do ⊂ (D(U2 ))o ⊂ (D(V ))o completing the proof.
Because of (8), Proposition A.3 proves the first part of Proposition 3.3. Concerning the second part, we have the following Proposition A.4. Let D ∈ K (M) be such that D ⊥ {p} and let V be an open set of M such that D ⊂ V . Then, there exist a spacelike Cauchy surface Co of M, that meets p, and Do ∈ K (M, Co ) such that D ⊂ Do , Do ⊂ (D(V ))o and Do ⊥ {p}. Proof. The proof is very similar to the proof of Proposition A.3. Assume that D = (D(G))o , where G ∈ G(C) for some spacelike Cauchy surface C of M. Notice G∩J(p) = ∅. Let us define U = (C ∩ V ) \ (C ∩ J(p)). U is an open set of C \ (C ∩ J(p)) and G ⊂ U . By [11, Lemma 3] we can find U1 , U2 ∈ G(C) such that G ⊂ U1 , U 1 ⊂ U2 and U 2 ⊂ U . Now notice that in general C does not meet p, hence Lemmas A.1 and A.2 cannot be applied. In this case, however, we can use directly [11, Lemma 5] that asserts that for each ε > 0 there exists a spacelike Cauchy surface Co that meets p such that |τC (y) − τCo (y)| < ε for any y ∈ fC−1 (U ). Proceeding as in Lemma A.2, one can choose ε in such a way that J G ∩ Co ⊂ Co ,C (U1 ), J Co ,C (U1 ) ∩ C ⊂ U2 . Proceeding now as in Proposition A.3, these inclusions lead to the proof of the statement. Acknowledgement. I would like to thank Daniele Guido and John E. Roberts for helpful discussions and their constant interest in this work. Finally, I am grateful to Gerardo Morsella for his precious help.
References

1. Brunetti, R., Fredenhagen, K., Verch, R.: The generally covariant locality principle – A new paradigm for local quantum physics. Commun. Math. Phys. 237, 31–68 (2003)
2. Dieckmann, J.: Cauchy surfaces in globally hyperbolic spacetimes. J. Math. Phys. 29, 578–579 (1988)
3. Ellis, G.F.R., Hawking, S.W.: The large scale structure of space-time. Cambridge: Cambridge University Press, 1973
4. Guido, D., Longo, R., Roberts, J.E., Verch, R.: Charged sectors, spin and statistics in quantum field theory on curved spacetimes. Rev. Math. Phys. 13(2), 125–198 (2001)
5. Haag, R., Kastler, D.: An algebraic approach to quantum field theory. J. Math. Phys. 5, 848–861 (1964)
6. Junker, W., Schrohe, E.: Adiabatic vacuum states on general spacetime manifolds: Definition, construction, and physical properties. Ann. Henri Poincaré 3, 1113–1181 (2002)
7. Lüders, C., Roberts, J.E.: Local quasiequivalence and adiabatic vacuum states. Commun. Math. Phys. 134, 29–63 (1990)
8. O’Neill, B.: Semi-Riemannian geometry. New York: Academic Press, 1983
9. Roberts, J.E.: More lectures in algebraic quantum field theory. In: S. Doplicher, R. Longo (eds.) Noncommutative geometry (C.I.M.E. Lectures, Martina Franca, Italy, 2000), Berlin-Heidelberg-New York: Springer, 2003
10. Verch, R.: Continuity of symplectically adjoint maps and the algebraic structure of Hadamard vacuum representations for quantum fields in curved spacetime. Rev. Math. Phys. 9(5), 635–674 (1997)
11. Verch, R.: Notes on regular diamonds. Preprint, available as a ps-file at http://www/lqp.unigoe.de/papers/99/07/99070501.html, 1999
12. Wald, R.M.: General relativity. Chicago, IL: University of Chicago Press, 1984

Communicated by Y. Kawahigashi
Commun. Math. Phys. 256, 635–680 (2005) Digital Object Identifier (DOI) 10.1007/s00220-005-1321-x
Communications in
Mathematical Physics
“Real Doubles” of Hurwitz Frobenius Manifolds Vasilisa Shramchenko Department of Mathematics and Statistics, Concordia University, 7141 Sherbrooke West, Montreal H4B 1R6, Quebec, Canada. E-mail: [email protected] Received: 17 March 2004 / Accepted: 22 October 2004 Published online: 15 March 2005 – © Springer-Verlag 2005
Abstract: New Frobenius structures on Hurwitz spaces are found. A Hurwitz space is considered as a real manifold; therefore the number of coordinates is twice as large as the number of coordinates on Hurwitz Frobenius manifolds of Dubrovin. Simple branch points of a ramified covering and their complex conjugates play the role of canonical coordinates on the constructed Frobenius manifolds. Corresponding solutions to WDVV equations and G-functions are obtained.

1. Introduction

Frobenius manifolds were introduced by B. Dubrovin [4] as a geometric interpretation of the Witten-Dijkgraaf-E.Verlinde-H.Verlinde (WDVV) equations from two-dimensional topological field theory [2, 17]. The theory of Frobenius manifolds is related to various branches of mathematics. One is the theory of singularities: some ingredients of a Frobenius manifold had long existed on the base space of the universal unfolding of a hypersurface singularity. Besides singularity theory, Frobenius manifold structures have been found on cohomology spaces of smooth projective varieties (the theory of Gromov-Witten invariants); on extended moduli spaces of Calabi-Yau manifolds; on orbit spaces of Coxeter groups, extended affine Weyl groups and Jacobi groups; and on Hurwitz spaces (see the references in [5, 14]). The aim of the present work is to construct a new class of semisimple Frobenius manifolds associated with Hurwitz spaces (semisimple meaning that the algebra on any tangent space has no nilpotents). The dimension of Dubrovin’s Frobenius manifolds on Hurwitz spaces is equal to the complex dimension of the Hurwitz space. In this paper we build Frobenius structures of double dimension on the real Hurwitz space. We consider the Hurwitz space as a real manifold, i.e. we complement the set of its usual local coordinates by the set of their complex conjugates.
We call new Frobenius manifolds the “real doubles” of Hurwitz Frobenius manifolds of Dubrovin (in some cases the prepotential of a “real double” is real-valued, however this is not always the case).
We start with a construction of a family of Darboux-Egoroff (flat potential diagonal) metrics on a real Hurwitz space in genus greater than zero. The Hurwitz space we consider is the space of coverings (ℒ, λ) of CP1, where ℒ is a Riemann surface of genus g ≥ 1, λ is a meromorphic function on ℒ with simple finite critical points P1, . . . , PL and possibly with critical points at infinity. The real Hurwitz space has local coordinates {λ1, . . . , λL; λ̄1, . . . , λ̄L}, where λi = λ(Pi). The Darboux-Egoroff metrics on this space are written in terms of the Schiffer Ω(P, Q) and Bergman B(P, Q̄) kernels on a Riemann surface of genus g ≥ 1. These kernels are defined by [8]:
\[ \Omega(P,Q) = W(P,Q) - \pi\sum_{i,j=1}^{g} (\mathrm{Im}\,\mathbb{B})^{-1}_{ij}\,\omega_i(P)\,\omega_j(Q), \]
\[ B(P,\bar Q) = \pi\sum_{i,j=1}^{g} (\mathrm{Im}\,\mathbb{B})^{-1}_{ij}\,\omega_i(P)\,\overline{\omega_j(Q)}, \]
where W(P, Q) = dP dQ log E(P, Q) is the canonical bidifferential of the second kind on ℒ; E(P, Q) is the prime form; {ωi}, i = 1, . . . , g, are holomorphic differentials on ℒ normalized with respect to a given canonical basis of cycles, and 𝔹 is the symmetric matrix of their b-periods:
\[ \oint_{a_i}\omega_j = \delta_{ij}, \qquad \mathbb{B}_{ij} = \oint_{b_i}\omega_j . \]
The kernels can equivalently be characterized as follows [8]. The Schiffer kernel is the bidifferential with a singularity of the form (x(P) − x(Q))⁻² dx(P)dx(Q) along the diagonal P = Q such that
\[ \mathrm{p.v.}\iint_{\mathcal{L}} \Omega(P,Q)\,\overline{\omega(P)} = 0 \]
holds for any holomorphic differential ω on the surface. The Bergman kernel is a regular bidifferential on ℒ holomorphic with respect to its first argument and antiholomorphic with respect to the second one. It is (up to a factor of 2πi) a kernel of an integral operator acting in the space L2^(1,0)(ℒ) of (1,0)-forms as an orthogonal projector onto the subspace H^(1,0)(ℒ) of holomorphic (1,0)-forms. In particular, for any holomorphic differential ω on the surface ℒ the following relation holds:
\[ \iint_{\mathcal{L}} B(P,\bar Q)\,\omega(Q) = 2\pi i\,\omega(P). \]
Both kernels, Ω(P, Q) and B(P, Q̄), are independent of the choice of a canonical basis of cycles {ak, bk}.

We consider the following family of metrics on the real Hurwitz space:
\[ ds^2 = \sum_{j=1}^{L}\Bigl(\oint_{l} h(Q)\,\Omega(Q,P_j)\Bigr)^{2}(d\lambda_j)^2 + \sum_{j=1}^{L}\Bigl(\oint_{l} h(Q)\,B(Q,\bar P_j)\Bigr)^{2}(d\bar\lambda_j)^2 . \tag{1.1} \]
Here l is an arbitrary contour on the surface not passing through ramification points and such that its projection on the base of the covering does not depend on coordinates {λi; λ̄i}; h is an arbitrary function defined in a neighbourhood of the contour. The rotation coefficients βij of the metrics (1.1) are given by the Schiffer and Bergman kernels evaluated at the ramification points of the covering with respect to the local parameters given by √(λ(P) − λi):
\[ \beta_{ij} = \Omega(P_i,P_j), \qquad \beta_{i\bar j} = B(P_i,\bar P_j), \qquad \beta_{\bar i\bar j} = \overline{\Omega(P_i,P_j)} . \]
As a consequence of Rauch variational formulas for the Schiffer and Bergman kernels, we have relations ∂λk βij = βik βkj for the rotation coefficients for distinct indices i, j, k from the set {m; m̄}, m = 1, . . . , L. These relations provide main conditions for the flatness of metrics (1.1).
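For orientation, the standard conditions behind the term Darboux-Egoroff can be written out for a general diagonal metric. The block below is a sketch in generic notation that is not taken from the paper: u_1, ..., u_n stand for the canonical coordinates (here the 2L variables λj, λ̄j), g_ii for the diagonal entries of the metric, and the normalization of the rotation coefficients follows the usual convention in Dubrovin's theory. Whether it matches the paper's normalization exactly is not checked here.

```latex
\[ ds^2 = \sum_{i=1}^{n} g_{ii}(u)\,(du_i)^2, \qquad
   \beta_{ij} = \frac{\partial_{u_j}\sqrt{g_{ii}}}{\sqrt{g_{jj}}}, \quad i \neq j . \]
% Potential (Egoroff) metric:
%   g_{ii} = \partial_{u_i} U for some function U   <=>   \beta_{ij} = \beta_{ji}.
% Flatness of such a metric (Darboux-Egoroff system), for distinct i, j, k:
\[ \partial_{u_k}\beta_{ij} = \beta_{ik}\,\beta_{kj}, \qquad
   \sum_{k=1}^{n}\partial_{u_k}\beta_{ij} = 0 . \]
```

The first of the two flatness relations is the one the text attributes to the Rauch variational formulas.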
Some of the metrics (1.1) correspond to Frobenius structures on the Hurwitz space. We describe these structures and find their prepotentials and flat coordinates of the corresponding flat metric. A prepotential as a function of flat coordinates satisfies the WDVV system. Since for the surface of genus zero the Bergman kernel vanishes and the Schiffer kernel coincides with W(P, Q), the metrics (1.1) and therefore the construction of Frobenius manifolds suggested here are only new for a Hurwitz space in genus ≥ 1. For the Riemann sphere, our construction coincides with that of Dubrovin. For the simplest Hurwitz space in genus one, which has the real dimension 6, we compute explicitly prepotentials of three new Frobenius manifolds. One of these prepotentials has the form:
\begin{align*}
F ={}& -\frac14\,t_1t_2^2 - \frac14\,t_1t_5^2 + \frac12\,t_1t_4\Bigl(2t_3 - \frac{1}{2\pi i}\Bigr) - \frac12\,t_1^2t_6 - \frac12\,t_3\Bigl(t_3 - \frac{1}{2\pi i}\Bigr)\frac{t_4^2}{t_6} - \frac{1}{16}\,\frac{t_2^2t_5^2}{t_6} \\
& + \frac{t_3t_4t_5^2}{4t_6} - \frac{t_2^4}{32\,t_6} - \frac{1}{128\pi i}\,\frac{t_2^4}{t_6^2}\,\gamma\Bigl(\frac{t_3}{t_6}\Bigr) \\
& + \frac{\bigl(t_3 - \frac{1}{2\pi i}\bigr)t_4t_2^2}{4t_6} - \frac{t_5^4}{32\,t_6} - \frac{1}{128\pi i}\,\frac{t_5^4}{t_6^2}\,\gamma\Bigl(\frac{1-2\pi i t_3}{2\pi i t_6}\Bigr), \tag{1.2}
\end{align*}
where γ(µ) = 4∂µ log η(µ) for η being the Dedekind η-function. The function F is quasihomogeneous, i.e. it satisfies
\[ F(\kappa t_1, \kappa^{1/2}t_2, t_3, \kappa t_4, \kappa^{1/2}t_5, t_6) = \kappa^2 F(t_1, t_2, t_3, t_4, t_5, t_6) \]
for any nonzero constant κ. The matrix F1 formed by third derivatives Ft1titj is constant and invertible; it gives the flat metric (written in flat coordinates) from the family of metrics (1.1) which corresponds to the Frobenius structure (1.2). The functions
\[ c_{ij}^{k} = \sum_{n} (F_1^{-1})_{kn}\,\frac{\partial^3 F}{\partial t_i\,\partial t_j\,\partial t_n} \]
define an associative commutative algebra in the tangent space to the underlying Hurwitz space:
\[ \partial_{t_i}\cdot\partial_{t_j} = c_{ij}^{k}\,\partial_{t_k} . \]
(This is equivalent [5] to the WDVV system for the function F.)

Associated with any semisimple Frobenius manifold is the G-function, the solution to Getzler’s system of linear differential equations derived in [9] within the study of recursion relations for the genus one Gromov-Witten invariants of smooth projective varieties. This system may be written for any semisimple Frobenius manifold. In [6] it was proven that, for an arbitrary semisimple Frobenius manifold, the Getzler system has a unique quasihomogeneous solution given by
\[ G = \log\frac{\tau_I}{J^{1/24}} . \tag{1.3} \]
Here $J$ is the Jacobian of the transformation between canonical and flat coordinates on the Frobenius manifold; $\tau_I$ is the isomonodromic tau-function associated to the Frobenius manifold. For the Frobenius structures described here the ingredients of the formula (1.3) can be computed using results of papers [12, 13]. For example, the isomonodromic tau-function $\tau_I$ of the new Frobenius manifolds is related to the isomonodromic tau-function $\tau_I^0$ of Dubrovin's Hurwitz Frobenius manifolds by the formula:
638
V. Shramchenko 1
$$ \tau_I = |\tau_I^0|^2\, (\det \mathrm{Im}\,B)^{-1/2} , $$
where $B$ is the matrix of b-periods of the underlying Riemann surface. The function $\tau_I^{-2}$ coincides with an appropriately regularized ratio of the determinant of the Laplacian on the Riemann surface and the surface volume in the singular metric $|d\lambda|^2$, see [3, 12, 16]. For the Frobenius manifold corresponding to the prepotential (1.2), the G-function is expressed in terms of the Dedekind eta-function as follows:
$$ G = -\log\Big\{ \eta\Big(\frac{t_3}{t_6}\Big)\, \eta\Big(\frac{1-2\pi i t_3}{2\pi i t_6}\Big)\, (t_2 t_5)^{1/8}\, t_6^{-3/4} \Big\} + \mathrm{const} . $$

We hope that in the future the construction of a "real double" can be extended to arbitrary Frobenius manifolds. Presumably this extension can be done on the level of the Riemann-Hilbert problem associated with a Frobenius manifold. The most intriguing case would then be the Frobenius manifolds related to quantum cohomologies; we hope that their "real doubles" might find an interesting geometrical application.

We notice that a class of solutions to the WDVV system related to real Hurwitz spaces was previously constructed in the work [7]. However, the full structure of a Frobenius manifold was not discussed in [7], and an explicit relationship between the prepotentials of [7] and the solutions to WDVV equations constructed in this work remains unclear.

The paper is organized as follows. In the next section we give definitions of the WDVV system and Frobenius manifold and discuss the one-to-one correspondence between them. In Sect. 3 we describe the Hurwitz space we shall build Frobenius structures on, the $W$-bidifferential and the Schiffer and Bergman kernels on a Riemann surface and introduce flat metrics on Hurwitz spaces in terms of the kernels. In Sect. 4 we reformulate the structures of Frobenius manifolds on Hurwitz spaces introduced by Dubrovin in terms of the $W$-bidifferential. Section 5 contains the main result of the paper, the Frobenius structures on Hurwitz spaces considered as real manifolds.
Section 6 is devoted to calculation of the G-function for the new Frobenius structures. In Sect. 7 we consider the simplest Hurwitz space in genus one and present explicit expressions for prepotentials and G-functions of the corresponding Frobenius manifolds.

2. Frobenius Manifolds and WDVV Equations

The Witten-Dijkgraaf-E.Verlinde-H.Verlinde (WDVV) system looks as follows:
$$ F_i F_1^{-1} F_j = F_j F_1^{-1} F_i , \qquad i, j = 1, \dots, n , $$
where $F_i$ is the $n \times n$ matrix
$$ (F_i)_{lm} = \frac{\partial^3 F}{\partial t^i\, \partial t^l\, \partial t^m} , $$
and $F$ is a scalar function of $n$ variables $t^1, \dots, t^n$. In the theory of Frobenius manifolds one imposes the following two conditions on the function $F$:
• Quasihomogeneity (up to a quadratic polynomial): for any nonzero $\kappa$ and some numbers $\nu_1, \dots, \nu_n, \nu_F$,
$$ F(\kappa^{\nu_1} t^1, \dots, \kappa^{\nu_n} t^n) = \kappa^{\nu_F} F(t^1, \dots, t^n) + \text{quadratic terms} . \tag{2.1} $$
• Normalization: $F_1$ is a constant nondegenerate matrix.
“Real Doubles” of Hurwitz Frobenius Manifolds
639
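The condition of quasihomogeneity can be rewritten in terms of the Euler vector field
$$ E := \sum_\alpha \nu_\alpha t^\alpha \partial_{t^\alpha} \tag{2.2} $$
as follows:
$$ \mathrm{Lie}_E F = E(F) = \sum_\alpha \nu_\alpha t^\alpha \partial_{t^\alpha} F = \nu_F F + \text{quadratic terms} . \tag{2.3} $$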
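As a worked illustration (a sketch, not part of the original argument), (2.3) is obtained from (2.1) by differentiating in $\kappa$ at $\kappa = 1$; the derivative of the quadratic terms is again quadratic. The weights used in the example are the ones quoted in the Introduction for the prepotential (1.2), and the monomial $t_1 t_2^2$ is one of its terms.

```latex
% Differentiate both sides of (2.1) at kappa = 1:
\frac{d}{d\kappa}\Big|_{\kappa=1} F(\kappa^{\nu_1}t^1,\dots,\kappa^{\nu_n}t^n)
   = \sum_\alpha \nu_\alpha t^\alpha\, \partial_{t^\alpha} F = E(F),
\qquad
\frac{d}{d\kappa}\Big|_{\kappa=1} \kappa^{\nu_F} F = \nu_F F .

% Example: weights (1, 1/2, 0, 1, 1/2, 0) and \nu_F = 2, as for (1.2):
E = t_1\partial_{t_1} + \tfrac12\, t_2\partial_{t_2} + t_4\partial_{t_4} + \tfrac12\, t_5\partial_{t_5},
\qquad
E(t_1 t_2^2) = \big(1 + 2\cdot\tfrac12\big)\, t_1 t_2^2 = 2\, t_1 t_2^2 .
```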
Definition 1. An algebra $A$ over $\mathbb{C}$ is called a (commutative) Frobenius algebra if:
• it is a commutative associative $\mathbb{C}$-algebra with a unity $e$;
• it is supplied with a $\mathbb{C}$-bilinear symmetric nondegenerate inner product $\langle \cdot, \cdot \rangle$ having the property $\langle x \cdot y, z \rangle = \langle x, y \cdot z \rangle$ for arbitrary vectors $x, y, z$ from $A$.

Definition 2. $M$ is a Frobenius manifold of the charge $\nu$ if a structure of a Frobenius algebra smoothly depending on the point $t \in M$ is specified on any tangent plane $T_t M$ such that
F1 the inner product $\langle \cdot, \cdot \rangle$ is a flat metric on $M$ (not necessarily positive definite).
F2 the unit vector field $e$ is covariantly constant with respect to the Levi-Civita connection $\nabla$ for the metric $\langle \cdot, \cdot \rangle$, i.e. $\nabla_x e = 0$ for any vector field $x$ on $M$.
F3 the tensor $(\nabla_w c)(x, y, z)$ is symmetric in four vector fields $x, y, z, w \in T_t M$, where $c$ is the following symmetric 3-tensor: $c(x, y, z) = \langle x \cdot y, z \rangle$.
F4 there exists on $M$ a vector field $E$ (the Euler field) such that the following conditions hold for any vector fields $x, y$ on $M$:
$$ \nabla_x (\nabla_y E) = 0 , \tag{2.4} $$
$$ [E, x \cdot y] - [E, x] \cdot y - x \cdot [E, y] = x \cdot y , \tag{2.5} $$
$$ \mathrm{Lie}_E \langle x, y \rangle := E \langle x, y \rangle - \langle [E, x], y \rangle - \langle x, [E, y] \rangle = (2 - \nu) \langle x, y \rangle . \tag{2.6} $$
The charge $\nu$ of a Frobenius manifold is equal to $3 - \nu_F$, where $\nu_F$ is the quasihomogeneity coefficient from (2.3).

Theorem 1 ([5]). Any solution $F(t)$ of the WDVV equations with $\nu_1 \neq 0$ defined for $t \in M$ determines on $M$ a structure of a Frobenius manifold and vice versa.

Proof (see [5]). Given a Frobenius manifold, denote by $\{t^\alpha\}$ the flat coordinates of the metric $\langle \cdot, \cdot \rangle$ and by $\eta$ the constant matrix $\eta_{\alpha\beta} = \langle \partial_{t^\alpha}, \partial_{t^\beta} \rangle$. Due to the covariant constancy of the unit vector field $e$, we can by a linear change of coordinates put $e = \partial_{t^1}$. In these coordinates, the condition F3 of Definition 2 implies the existence of a function $F$ whose third derivatives give the 3-tensor $c$:
$$ c_{\alpha\beta\gamma} = c(\partial_{t^\alpha}, \partial_{t^\beta}, \partial_{t^\gamma}) = \frac{\partial^3 F}{\partial t^\alpha\, \partial t^\beta\, \partial t^\gamma} . $$
The WDVV equations for the function $F$ provide the associativity condition for the Frobenius algebra defined by relations $\partial_{t^\alpha} \cdot \partial_{t^\beta} = c_{\alpha\beta}^\gamma\, \partial_{t^\gamma}$, where the structure constants $c_{\alpha\beta}^\gamma$ are found from $c_{\alpha\beta}^\delta\, \eta_{\delta\gamma} = c_{\alpha\beta\gamma}$. The existence of the vector field $E$ implies the quasihomogeneity of the function $F$. Indeed, requirements (2.5), (2.6) on the Euler vector field imply
$$ \mathrm{Lie}_E\, c(x, y, z) := E\big(c(x, y, z)\big) - c([E, x], y, z) - c(x, [E, y], z) - c(x, y, [E, z]) = (3 - \nu)\, c(x, y, z) . \tag{2.7} $$
The Lie derivative $\mathrm{Lie}_E$ commutes with the covariant derivative $\nabla$, as can easily be checked in flat coordinates when the Euler vector field (due to (2.4)) has the form (2.2). Therefore, (2.7) implies
$$ \mathrm{Lie}_E F = (3 - \nu) F + \text{quadratic terms} . $$
The converse statement can be proven analogously.

The function $F$, defined up to an addition of an arbitrary quadratic polynomial in $t^1, \dots, t^n$, is called the prepotential of the Frobenius manifold.

Definition 3. A Frobenius manifold $M$ is called semisimple if the Frobenius algebra in the tangent space at any point of $M$ does not have nilpotents.

In this paper we only consider semisimple Frobenius structures.

3. Kernels on Riemann Surfaces and Darboux-Egoroff Metrics

3.1. Hurwitz spaces. Hurwitz space is the moduli space of pairs $(\mathcal{L}, \lambda)$ where $\mathcal{L}$ is a compact Riemann surface of genus $g$ and $\lambda : \mathcal{L} \to \mathbb{C}P^1$ is a meromorphic function on $\mathcal{L}$ of degree $N$. The pair $(\mathcal{L}, \lambda)$ represents the surface as an $N$-fold ramified covering $\mathcal{L}_\lambda$ of $\mathbb{C}P^1$ defined by the equation
$$ \zeta = \lambda(P) , \qquad P \in \mathcal{L} $$
($\zeta$ is a coordinate on $\mathbb{C}P^1$). In this way the surface $\mathcal{L}$ can be viewed as a collection of $N$ copies of $\mathbb{C}P^1$ which are glued together along branch cuts. Critical points $P_j$ of the function $\lambda(P)$ correspond to ramification points of the covering. The projections $\lambda_j$ of ramification points on the base of the covering ($\mathbb{C}P^1$ with coordinate $\zeta$) are the images of the critical points $P_j$ (the $\lambda_j$ are called the branch points): $\lambda'(P_j) = 0$; $\lambda_j = \lambda(P_j)$. We assume that all finite branch points $\{\lambda_j \mid \lambda_j < \infty\}$ are simple (i.e. there are exactly two sheets glued together at the corresponding point) and denote their number by $L$. We also assume that the function $\lambda$ has $m + 1$ poles at the points of $\mathcal{L}$ denoted by $\infty^0, \dots, \infty^m$; the pole at $\infty^i$ has the order $n_i + 1$. In terms of sheets of the covering, there are $m + 1$ points which project to $\zeta = \infty$ on the base; the numbers $\{n_i + 1\}$ give the number of sheets glued at each of these points ($n_0, \dots, n_m \in \mathbb{N}$ are such that $\sum_{i=0}^m (n_i + 1) = N$; they are called the ramification indices). The local parameter near a simple ramification point $P_j \in \mathcal{L}$ (which is not a pole of $\lambda$) is $x_j(P) = \sqrt{\lambda(P) - \lambda_j}$; and in a neighbourhood $P \sim \infty^i$ the local parameter $z_i$ is given by $z_i(P) = (\lambda(P))^{-1/(n_i+1)}$. The Riemann-Hurwitz formula connects the genus $g$ of the surface, the degree $N$ of the function $\lambda$, the number $L$ of simple finite branch points, and the ramification indices $n_i$ over infinity:
$$ 2g - 2 = -2N + L + \sum_{i=0}^m n_i . \tag{3.1} $$
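A short worked instance of (3.1) may help fix the counting. It assumes that the simplest genus-one Hurwitz space mentioned in the Introduction consists of two-sheeted coverings with a single pole of order two ($g = 1$, $N = 2$, $m = 0$, $n_0 = 1$); this identification is an assumption, chosen because it reproduces the real dimension 6 quoted there.

```latex
% Riemann-Hurwitz count for g = 1, N = 2, m = 0, n_0 = 1 (assumed example):
2g - 2 = -2N + L + \sum_{i=0}^{m} n_i
\;\Longrightarrow\;
0 = -4 + L + 1
\;\Longrightarrow\;
L = 3 .
% The branch points \lambda_1, \lambda_2, \lambda_3 are local coordinates, hence
\dim_{\mathbb{C}} = 3 , \qquad \dim_{\mathbb{R}} = 6 .
```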
Two coverings are said to be equivalent if one can be obtained from the other by a permutation of sheets. The set of equivalence classes of described coverings will be denoted by $M = M_{g;n_0,\dots,n_m}$. We shall work with a covering $\widehat{M} = \widehat{M}_{g;n_0,\dots,n_m}$ of this space. A point of the space $\widehat{M}$ is a triple $\{\mathcal{L}, \lambda, \{a_k, b_k\}_{k=1}^g\}$, where $\{a_k, b_k\}_{k=1}^g$ is a canonical basis of cycles on $\mathcal{L}$. The branch points $\lambda_1, \dots, \lambda_L$ play the role of local coordinates on $\widehat{M}$, viewed as a complex manifold.

3.2. Bidifferential W, Bergman and Schiffer kernels. First, we summarize properties of three well-known symmetric bidifferentials on Riemann surfaces. Being suitably evaluated at the ramification points $\{P_j\}$, these kernels will play the role of rotation coefficients of flat metrics on Hurwitz spaces. The meromorphic bidifferential $W(P, Q)$ defined by
$$ W(P, Q) := d_P d_Q \log E(P, Q) \tag{3.2} $$
is the symmetric differential on $\mathcal{L} \times \mathcal{L}$ with the second order pole at the diagonal $P = Q$ with biresidue 1 and the properties:
$$ \oint_{a_k} W(P, Q) = 0 ; \qquad \oint_{b_k} W(P, Q) = 2\pi i\, \omega_k(P) ; \qquad k = 1, \dots, g . \tag{3.3} $$
Here $\{a_k, b_k\}_{k=1}^g$ is the canonical basis of cycles on $\mathcal{L}$; $\{\omega_k(P)\}_{k=1}^g$ is the corresponding set of holomorphic differentials normalized by $\oint_{a_l} \omega_k = \delta_{kl}$; and $E(P, Q)$ is the prime form on the surface $\mathcal{L}$. The dependence of the bidifferential $W$ on branch points of the Riemann surface is given by the Rauch variational formulas [11, 15]:
$$ \frac{\partial W(P, Q)}{\partial \lambda_j} = \frac{1}{2}\, W(P, P_j)\, W(Q, P_j) , \tag{3.4} $$
where $W(P, P_j)$ denotes the evaluation of the bidifferential $W(P, Q)$ at $Q = P_j$ with respect to the standard local parameter $x_j(Q) = \sqrt{\lambda(Q) - \lambda_j}$ near the ramification point $P_j$:
$$ W(P, P_j) := \frac{W(P, Q)}{dx_j(Q)} \Big|_{Q = P_j} . \tag{3.5} $$
The bidifferential $W(P, Q)$ depends holomorphically on the branch points $\{\lambda_j\}$, in contrast to the following two bidifferentials [8]. The Schiffer kernel $\Omega(P, Q)$ is the symmetric differential on $\mathcal{L} \times \mathcal{L}$ defined by:
$$ \Omega(P, Q) := W(P, Q) - \pi \sum_{k,l=1}^g (\mathrm{Im}\,B)^{-1}_{kl}\, \omega_k(P)\, \omega_l(Q) , \tag{3.6} $$
where $B$ is the symmetric matrix of b-periods of holomorphic normalized differentials $\{\omega_k\}$: $B_{kl} = \oint_{b_k} \omega_l$, which depends holomorphically on the branch points $\{\lambda_j\}$. This kernel has the same singularity structure as the bidifferential $W$; it depends on $\{\bar\lambda_j\}$ due to the terms added to $W$, since $\mathrm{Im}\,B = (B - \bar{B})/(2i)$ and $\bar{B}$ is a function of $\{\bar\lambda_j\}$. For a surface of genus zero the Schiffer kernel coincides with $W$.

The Bergman kernel $B(P, \bar{Q})$ is defined by:
$$ B(P, \bar{Q}) = \pi \sum_{k,l=1}^g (\mathrm{Im}\,B)^{-1}_{kl}\, \omega_k(P)\, \overline{\omega_l(Q)} . \tag{3.7} $$
It vanishes for a surface of genus zero.
An important property of the Schiffer and Bergman kernels is independence of the choice of a canonical basis of cycles $\{a_k, b_k\}_{k=1}^g$ on the Riemann surface. This can be seen, for example, from the following definitions (see Fay [8]) equivalent to (3.6) and (3.7). The Schiffer kernel is the unique symmetric bidifferential with a singularity of the form $(x(P) - x(Q))^{-2}\, dx(P)\, dx(Q)$ along $P = Q$ and such that
$$ \mathrm{p.v.} \iint_{\mathcal{L}} \Omega(P, Q)\, \overline{\omega(P)} = 0 \tag{3.8} $$
holds for any holomorphic differential $\omega$. The Bergman kernel is (up to the multiplier $2\pi i$) a kernel of an integral operator which acts in the space $L_2^{(1,0)}(\mathcal{L})$ of (1, 0)-forms as an orthogonal projector onto the subspace $H^{(1,0)}(\mathcal{L})$ of holomorphic (1, 0)-forms. In particular, the following holds for any holomorphic differential $\omega$ on the surface $\mathcal{L}$:
$$ \frac{1}{2\pi i} \iint_{\mathcal{L}} B(P, \bar{Q})\, \omega(Q) = \omega(P) . \tag{3.9} $$
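The reproducing property (3.9) can be checked directly from (3.7). The sketch below assumes the Riemann bilinear relation in the form $(\mathrm{Im}\,B)_{kl} = \frac{i}{2}\iint_{\mathcal{L}} \omega_k \wedge \overline{\omega_l}$ and reads a surface integral of a product of one-forms as the integral of their wedge product in the written order; both conventions are assumptions made for this illustration.

```latex
% Step 1: reorder the one-forms (they anticommute under the wedge product):
\iint_{\mathcal{L}} \overline{\omega_l(Q)} \wedge \omega_m(Q)
   = -\iint_{\mathcal{L}} \omega_m \wedge \overline{\omega_l}
   = -\frac{2}{i}\,(\mathrm{Im}\,B)_{ml}
   = 2i\,(\mathrm{Im}\,B)_{ml} .

% Step 2: insert (3.7) into the left-hand side of (3.9) with \omega = \omega_m:
\frac{1}{2\pi i}\iint_{\mathcal{L}} B(P,\bar{Q})\,\omega_m(Q)
   = \frac{1}{2\pi i}\,\pi \sum_{k,l=1}^{g} (\mathrm{Im}\,B)^{-1}_{kl}\,\omega_k(P)\; 2i\,(\mathrm{Im}\,B)_{lm}
   = \sum_{k=1}^{g} \delta_{km}\,\omega_k(P)
   = \omega_m(P) .
```

Since the $\omega_m$ span the holomorphic differentials, linearity extends the identity to any holomorphic $\omega$.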
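For the Bergman kernel the independence of the choice of a canonical basis of cycles can also be seen directly from (3.7) using $(\mathrm{Im}\,B)_{kl} = \frac{i}{2} \iint_{\mathcal{L}} \omega_k(P)\, \overline{\omega_l(P)}$. The periods of Schiffer and Bergman kernels are related to each other as follows:
$$ \oint_{a_k} \Omega(P, Q) = -\oint_{a_k} B(\bar{P}, Q) , \qquad \oint_{b_k} \Omega(P, Q) = -\oint_{b_k} B(\bar{P}, Q) , \tag{3.10} $$
where the integrals are taken with respect to the first argument. The derivatives of the kernels with respect to branch points and their complex conjugates are given by:
$$ \frac{\partial \Omega(P, Q)}{\partial \lambda_j} = \frac{1}{2}\, \Omega(P, P_j)\, \Omega(Q, P_j) , \qquad \frac{\partial \Omega(P, Q)}{\partial \bar\lambda_j} = \frac{1}{2}\, B(P, \bar{P}_j)\, B(Q, \bar{P}_j) , $$
$$ \frac{\partial B(P, \bar{Q})}{\partial \lambda_j} = \frac{1}{2}\, \Omega(P, P_j)\, B(P_j, \bar{Q}) , \qquad \frac{\partial B(P, \bar{Q})}{\partial \bar\lambda_j} = \frac{1}{2}\, B(P, \bar{P}_j)\, \overline{\Omega(Q, P_j)} . \tag{3.11} $$
The notation here is analogous to that in (3.5), i.e. $\Omega(P, P_j)$ stands for $\Omega(P, Q)/dx_j(Q)\,|_{Q=P_j}$ and $B(P, \bar{P}_j) := B(P, \bar{Q})/\overline{dx_j(Q)}\,|_{Q=P_j}$. To prove (3.11) one uses the variational formulas (3.4) for $W(P, Q)$, and the following Rauch variational formulas for holomorphic normalized differentials $\{\omega_k\}$ and for the matrix of b-periods [15]:
$$ \frac{\partial \omega_k(P)}{\partial \lambda_j} = \frac{1}{2}\, \omega_k(P_j)\, W(P, P_j) , \qquad \frac{\partial B_{kl}}{\partial \lambda_j} = \pi i\, \omega_k(P_j)\, \omega_l(P_j) , \tag{3.12} $$
where we write $\omega_k(P_j)$ for $(\omega_k(P)/dx_j(P))|_{P=P_j}$. Derivatives of $\omega_k$ and $B$ with respect to $\{\bar\lambda_j\}$ vanish.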
3.3. Darboux-Egoroff metrics. Now we are in a position to introduce two families of Darboux-Egoroff (flat potential diagonal) metrics on Hurwitz spaces written in terms of the described bidifferentials. Following the terminology of Dubrovin, we call a bilinear quadratic form a metric even if it is not positive definite. A diagonal metric $ds^2 = \sum_i g_{ii} (d\lambda_i)^2$ is called potential if there exists a function $U$ such that $\partial_{\lambda_i} U = g_{ii}$ for all $i$. A potential diagonal metric is flat (Riemann curvature tensor vanishes) if its rotation coefficients $\beta_{ij}$ defined for $i \neq j$ by
$$ \beta_{ij} := \frac{\partial_{\lambda_j} \sqrt{g_{ii}}}{\sqrt{g_{jj}}} \tag{3.13} $$
satisfy the system of equations:
$$ \partial_{\lambda_k} \beta_{ij} = \beta_{ik}\, \beta_{kj} , \qquad i, j, k \text{ are distinct} , \tag{3.14} $$
$$ \sum_k \partial_{\lambda_k} \beta_{ij} = 0 \qquad \text{for all } \beta_{ij} . \tag{3.15} $$
3.3.1. Darboux-Egoroff metrics in terms of the bidifferential W. The following family of diagonal metrics (bilinear quadratic forms) on the Hurwitz space first appeared in [11], where it was realized that the corresponding rotation coefficients are given by the bidifferential $W$ (see (3.17)) and that the metrics are flat:
$$ ds^2 = \sum_{j=1}^L \Big( \int_l h(Q)\, W(Q, P_j) \Big)^2 (d\lambda_j)^2 . \tag{3.16} $$
Here $l$ is an arbitrary smooth contour on the Riemann surface $\mathcal{L}$ such that $P_j \notin l$ for any $j$, and its image $\lambda(l)$ in $\mathbb{C}P^1$ is independent of the branch points $\{\lambda_j\}$; $h(Q)$ is an arbitrary function, independent of $\{\lambda_j\}$, defined in a neighbourhood of the contour $l$. Using variational formulas (3.4), we find that rotation coefficients of the metric (3.16) are given by the bidifferential $W(P, Q)$ evaluated at the ramification points of the surface $\mathcal{L}$ with respect to the standard local parameters $x_j = \sqrt{\lambda - \lambda_j}$ near $P_j$:
$$ \beta_{ij} = \frac{1}{2}\, W(P_i, P_j) , \qquad i, j = 1, \dots, L , \quad i \neq j . \tag{3.17} $$
Here $W(P_i, P_j)$, similarly to (3.5), stands for $W(P, Q)/(dx_i(P)\, dx_j(Q))\,|_{P=P_i, Q=P_j}$. Note that rotation coefficients $\beta_{ij}$ (3.17) are symmetric with respect to indices, therefore the metrics (3.16) are potential. The next proposition shows that they are Darboux-Egoroff metrics.

Proposition 1 ([11]). Rotation coefficients (3.17) satisfy Eqs. (3.14), (3.15) and therefore metrics (3.16) are flat.

Proof. Variational formulas (3.4), written for the derivative with respect to $\lambda_k$ with $P = P_i$, $Q = P_j$, for different $i, j, k$ imply relations (3.14) for rotation coefficients (3.17). Equations (3.15) hold for the coefficients due to the invariance of $W(P, Q)$ with respect to biholomorphic maps of the Riemann surface. Namely, consider the covering $\mathcal{L}^\delta_\lambda$ obtained from $\mathcal{L}_\lambda$ by a simultaneous $\delta$-shift $\lambda \to \lambda + \delta$ on all sheets. The surface $\mathcal{L}$ is mapped by this transformation to $\mathcal{L}^\delta$ so that the point $P \in \mathcal{L}$ goes to $P^\delta \in \mathcal{L}^\delta$ which belongs to the same sheet of the covering as $P$ and is such that $\lambda(P^\delta) = \lambda(P) + \delta$. Denote by $W^\delta$ the bidifferential $W$ on the surface $\mathcal{L}^\delta$. Since the transformation $\lambda \to \lambda + \delta$ is biholomorphic, we have $W^\delta(P^\delta, Q^\delta) = W(P, Q)$. The same relation is true for $W(P, Q)/(dx_i(P)\, dx_j(Q))$ when points $P$ and $Q$ are in neighbourhoods of ramification points $P_i$ and $P_j$, respectively:
$$ \frac{W^\delta(P^\delta, Q^\delta)}{dx_i^\delta(P^\delta)\, dx_j^\delta(Q^\delta)} = \frac{W(P, Q)}{dx_i(P)\, dx_j(Q)} . \tag{3.18} $$
Note that $x_i(P) = \sqrt{\lambda(P) - \lambda_i}$ does not change under a simultaneous shift of all branch points and $\lambda$. After the substitution $P = P_i$, $Q = P_j$ in (3.18) the differentiation with respect to $\delta$ at $\delta = 0$ gives the sum of derivatives with respect to branch points: $\sum_{k=1}^L \partial_{\lambda_k} W(P_i, P_j) = 0$. Thus, the rotation coefficients (3.17) satisfy also (3.15). Therefore the metrics (3.16) are flat.

3.3.2. Darboux-Egoroff metrics in terms of Schiffer and Bergman kernels. Now let us consider the Hurwitz space $M$ as a real manifold, i.e. a manifold with a set of local coordinates formed by the branch points and their complex conjugates. As an analogue of the family of metrics (3.16) on the space of coverings $M = M_{g;n_0,\dots,n_m}$ with the local coordinates $\{\lambda_1, \dots, \lambda_L; \bar\lambda_1, \dots, \bar\lambda_L\}$ we consider the following two families of metrics:
$$ ds_1^2 = \sum_{j=1}^L \Big( \int_l h(Q)\, \Omega(Q, P_j) \Big)^2 (d\lambda_j)^2 + \sum_{j=1}^L \Big( \int_l h(Q)\, B(Q, \bar{P}_j) \Big)^2 (d\bar\lambda_j)^2 \tag{3.19} $$
and
$$ ds_2^2 = \mathrm{Re} \Big\{ \sum_{j=1}^L \Big( \int_l h(Q)\, \Omega(Q, P_j) + \int_l \overline{h(Q)}\, B(\bar{Q}, P_j) \Big)^2 (d\lambda_j)^2 \Big\} . \tag{3.20} $$
Here, as before, $l$ is an arbitrary contour on the surface not passing through $\{P_j\}$ and such that its image $\lambda(l)$ in the $\zeta$-plane is independent of the branch points $\{\lambda_j\}$; $h$ is an arbitrary function independent of $\{\lambda_j\}$ defined in some neighbourhood of the contour. From variational formulas (3.11) for the Schiffer and Bergman kernels we see that these metrics are potential and their rotation coefficients are given by the kernels evaluated at ramification points of $\mathcal{L}$:
$$ \beta_{ij} = \frac{1}{2}\, \Omega(P_i, P_j) , \qquad \beta_{i\bar{j}} = \frac{1}{2}\, B(P_i, \bar{P}_j) , \qquad \beta_{\bar{i}\bar{j}} = \overline{\beta_{ij}} . \tag{3.21} $$
Here $i, j = 1, \dots, L$ and the index $\bar{j}$ corresponds to differentiation with respect to $\bar\lambda_j$. Similarly to the notation in (3.17), we understand $\Omega(P_i, P_j)$ and $B(P_i, \bar{P}_j)$ as follows:
$$ \Omega(P_i, P_j) := \frac{\Omega(P, Q)}{dx_i(P)\, dx_j(Q)} \Big|_{P=P_i,\, Q=P_j} , \qquad B(P_i, \bar{P}_j) := \frac{B(P, \bar{Q})}{dx_i(P)\, \overline{dx_j(Q)}} \Big|_{P=P_i,\, Q=P_j} . $$

Remark 1. Note that rotation coefficients of the metrics (3.19), (3.20) are defined on the space $M_{g;n_0,\dots,n_m}$, in contrast to rotation coefficients (3.17). The coefficients (3.17) are given by the bidifferential $W$, which depends on the choice of a canonical basis of cycles $\{a_i, b_i\}$, and therefore are defined on the covering $\widehat{M}_{g;n_0,\dots,n_m}$ (see Sect. 3.1). However, the metrics of the type (3.19), (3.20) which will be used in Sect. 5 still depend on the choice of cycles $\{a_i, b_i\}$ through the choice of contours $l$.
Proposition 2. Rotation coefficients (3.21) satisfy Eqs. (3.14), (3.15) and therefore metrics (3.19), (3.20) are flat.

The proof is analogous to that of Proposition 1. Here $\delta$ should be taken real, $\delta \in \mathbb{R}$. Note that in Eqs. (3.14), (3.15) $i, j, k$ run through the set of all possible indices, which in this case is $\{1, \dots, L; \bar{1}, \dots, \bar{L}\}$, where we put $\lambda_{\bar{k}} := \bar\lambda_k$.

4. Dubrovin's Frobenius Structures on Hurwitz Spaces

We start with a description of Dubrovin's construction [5] of Frobenius manifolds on the space $\widehat{M} = \widehat{M}_{g;n_0,\dots,n_m}$ using the bidifferential $W(P, Q)$. The branch points $\lambda_1, \dots, \lambda_L$ are the local coordinates on $\widehat{M}$.

To introduce a structure of a Frobenius algebra on the tangent space $T_t \widehat{M}$ for some point $t \in \widehat{M}$ we take coordinates $\lambda_1, \dots, \lambda_L$ to be canonical for multiplication, i.e. we define
$$ \partial_{\lambda_i} \cdot \partial_{\lambda_j} = \delta_{ij}\, \partial_{\lambda_i} . \tag{4.1} $$
Then, the unit vector field is given by
$$ e = \sum_{i=1}^L \partial_{\lambda_i} . \tag{4.2} $$
For this multiplication law, the diagonal metrics (3.16) obviously have the property $\langle x \cdot y, z \rangle = \langle x, y \cdot z \rangle$ required in the definition of a Frobenius algebra. Therefore together with the multiplication (4.1) the metrics (3.16) define a family of Frobenius algebras on $T_t \widehat{M}$. Among the family of metrics (3.16) (and Frobenius algebras) we are going to isolate those corresponding to Frobenius manifolds. The Euler vector field has the following form in canonical coordinates [5]:
$$ E = \sum_{i=1}^L \lambda_i\, \partial_{\lambda_i} . \tag{4.3} $$
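The invariance property used above can be spelled out in one line. The following is only an illustrative check for a diagonal metric $\langle \partial_{\lambda_i}, \partial_{\lambda_j} \rangle = g_{ii}\, \delta_{ij}$ with the multiplication (4.1); the shorthand $\partial_i := \partial_{\lambda_i}$ is introduced here for brevity.

```latex
% Both sides reduce to the same totally symmetric expression:
\langle \partial_i \cdot \partial_j , \partial_k \rangle
   = \delta_{ij}\, \langle \partial_i , \partial_k \rangle
   = \delta_{ij}\, \delta_{ik}\, g_{ii} ,
\qquad
\langle \partial_i , \partial_j \cdot \partial_k \rangle
   = \delta_{jk}\, \langle \partial_i , \partial_j \rangle
   = \delta_{jk}\, \delta_{ij}\, g_{ii} .

% The field (4.2) is the unit of the multiplication (4.1):
e \cdot \partial_j = \sum_{i=1}^{L} \partial_i \cdot \partial_j
   = \sum_{i=1}^{L} \delta_{ij}\, \partial_i = \partial_j .
```

Both products are nonzero only for $i = j = k$, where they equal $g_{ii}$, so the property holds for any diagonal metric, not only for those of the form (3.16).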
4.1. Primary differentials. As is easy to see, with the Euler field (4.3), the multiplication (4.1) satisfies requirement (2.5) from F4. Condition (2.6) then reduces to
$$ E \langle \partial_{\lambda_i}, \partial_{\lambda_i} \rangle = -\nu\, \langle \partial_{\lambda_i}, \partial_{\lambda_i} \rangle . \tag{4.4} $$
The following proposition describes the metrics from family (3.16) which satisfy this condition.

Proposition 3. Let the contour $l$ in (3.16) be either a closed contour on $\mathcal{L}$ or a contour connecting points $\infty^i$ and $\infty^j$ for some $i$ and $j$. In the latter case we regularize the integral by omitting its divergent part as a function of the corresponding local parameter near $\infty^i$. Choose a function $h(Q)$ in (3.16) to be $h(Q) = C \lambda^n(Q)$ (where $C$ is a constant). Then the Euler vector field (4.3) acts on metrics (3.16) according to (4.4) with $\nu = 1 - 2n$.

Proof. Let us again use the invariance of the bidifferential $W$ under biholomorphic mappings of the Riemann surface $\mathcal{L}$. Consider the mapping $\mathcal{L}_\lambda \to \mathcal{L}^\epsilon_\lambda$ when the transformation $\lambda \to (1 + \epsilon)\lambda$ is performed on every sheet of the covering $\mathcal{L}_\lambda$. A point $P$ of the surface is then mapped to the point $P^\epsilon$ of the same sheet such that $\lambda(P^\epsilon) = (1 + \epsilon)\lambda(P)$. If $W^\epsilon$ is the bidifferential $W$ on $\mathcal{L}^\epsilon_\lambda$, then $W^\epsilon(P^\epsilon, Q^\epsilon) = W(P, Q)$. For the local parameter $x_i = \sqrt{\lambda - \lambda_i}$ in a neighbourhood of a ramification point $P_i$, we have $dx_i^\epsilon = \sqrt{1 + \epsilon}\; dx_i$. A contour $l$ of the specified type is invariant as a path of integration in (3.16) with respect to this transformation. Therefore we have
$$ \Big( \int_l \lambda^n(Q^\epsilon)\, \frac{W^\epsilon(Q^\epsilon, P^\epsilon)}{dx_j^\epsilon(P^\epsilon)} \Big)^2 = (1 + \epsilon)^{2n-1} \Big( \int_l \lambda^n(Q)\, \frac{W(Q, P)}{dx_j(P)} \Big)^2 . \tag{4.5} $$
Putting $P = P_j$, we differentiate (4.5) with respect to $\epsilon$ at $\epsilon = 0$. This yields the action of the vector field $E$ on the metric coefficient in the left-hand side and proves the proposition.
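The two computations behind Proposition 3 can be written out as follows. This is a sketch under the notation $g_{jj} = \langle \partial_{\lambda_j}, \partial_{\lambda_j} \rangle$ for the coefficients of (3.16); the symbol $\epsilon$ for the scaling parameter is a label chosen for this illustration.

```latex
% (a) Reduction of (2.6) to (4.4): for E = \sum_k \lambda_k \partial_{\lambda_k}
%     one has [E, \partial_{\lambda_i}] = -\partial_{\lambda_i}, hence
\mathrm{Lie}_E \langle \partial_{\lambda_i}, \partial_{\lambda_i} \rangle
   = E(g_{ii}) + 2\, g_{ii} = (2 - \nu)\, g_{ii}
\;\Longrightarrow\;
E(g_{ii}) = -\nu\, g_{ii} .

% (b) Scaling (4.5) says g_{jj}((1+\epsilon)\lambda) = (1+\epsilon)^{2n-1} g_{jj}(\lambda); thus
\frac{d}{d\epsilon}\Big|_{\epsilon=0} g_{jj}\big((1+\epsilon)\lambda_1, \dots, (1+\epsilon)\lambda_L\big)
   = \sum_{k=1}^{L} \lambda_k\, \partial_{\lambda_k} g_{jj}
   = E(g_{jj})
   = (2n - 1)\, g_{jj} .

% Comparing (a) and (b):
\nu = 1 - 2n .
```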
Proposition 4. Rotation coefficients (3.17) given by the bidifferential $W$ satisfy $E(\beta_{ij}) = -\beta_{ij}$.

Proof. This is a corollary of Proposition 3 and can be proven by a straightforward calculation using (4.4) and the definition of rotation coefficients (3.13). Alternatively, it can be proven directly by the method used in the proof of Proposition 3.

So far we have restricted the family of flat metrics to those of the form (3.16) with $h = C \lambda^n$ and the contour $l$ being either closed or connecting points $\infty^i$, $\infty^j$:
$$ ds^2 = \sum_{j=1}^L \Big( C \int_l \lambda^n(Q)\, W(Q, P_j) \Big)^2 (d\lambda_j)^2 . \tag{4.6} $$
An additional restriction comes from F2, the requirement of covariant constancy of the unit vector field (4.2) with respect to the Levi-Civita connection.

Lemma 1. If a diagonal metric $ds^2 = \sum_i g_{ii} (d\lambda_i)^2$ is potential (i.e. $\partial_{\lambda_i} g_{jj} = \partial_{\lambda_j} g_{ii}$ holds) and its coefficients $g_{ii}$ are annihilated by the unit vector field (4.2) ($e(g_{ii}) = 0$), then the vector field $e$ is covariantly constant with respect to the Levi-Civita connection of the metric $ds^2$.

The proof is a simple calculation using the expression for the Christoffel symbols via coefficients of a diagonal metric:
$$ \Gamma_{ii}^k = -\frac{1}{2} \frac{\partial_{\lambda_k} g_{ii}}{g_{kk}} , \qquad \Gamma_{ii}^i = \frac{1}{2} \frac{\partial_{\lambda_i} g_{ii}}{g_{ii}} , \qquad \Gamma_{ij}^i = \frac{1}{2} \frac{\partial_{\lambda_j} g_{ii}}{g_{ii}} , \qquad \Gamma_{ij}^k = 0 \qquad \text{for distinct } i, j, k . \tag{4.7} $$
Thus, we need to find the metrics of the form (4.6) such that the unit vector field $e$ annihilates their coefficients. These metrics can be written as
$$ ds_\phi^2 = \sum_{i=1}^L \Big( \mathop{\mathrm{res}}_{P=P_i} \frac{\phi^2(P)}{d\lambda(P)} \Big) (d\lambda_i)^2 \equiv \frac{1}{2} \sum_{i=1}^L \phi^2(P_i)\, (d\lambda_i)^2 , \tag{4.8} $$
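The equality of the two expressions in (4.8) is a local residue computation. The sketch below uses the local parameter $x = \sqrt{\lambda - \lambda_i}$ near $P_i$ and the convention $\phi(P_i) := (\phi(P)/dx(P))|_{P=P_i}$, in analogy with (3.5); it assumes $\phi$ is holomorphic at $P_i$.

```latex
% Near P_i:  \lambda = \lambda_i + x^2,  so  d\lambda = 2x\,dx,  and
\phi(P) = \big(\phi(P_i) + O(x)\big)\,dx .

% Therefore
\frac{\phi^2(P)}{d\lambda(P)}
   = \frac{\big(\phi(P_i) + O(x)\big)^2\,dx^2}{2x\,dx}
   = \Big( \frac{\phi^2(P_i)}{2x} + O(1) \Big)\,dx ,
\qquad
\mathop{\mathrm{res}}_{P=P_i} \frac{\phi^2(P)}{d\lambda(P)} = \frac{1}{2}\,\phi^2(P_i) .
```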
where $\phi$ is a differential of one of the five types listed below in Theorem 2. These differentials are called primary and all have the form $\phi(P) = C \int_l \lambda^n(Q)\, W(Q, P)$ with some specific choice of a contour $l$ and function $C\lambda^n$. In other words, we shall consider five types of combinations of a contour and a function $C\lambda^n$. Let us write these combinations in the form of operations of integration over the contour with the weight function. The operations, applied to a 1-form $f$, have the following form:

1. $I_{t^{i;\alpha}}[f(Q)] := \dfrac{1}{\alpha} \mathop{\mathrm{res}}_{\infty^i} \lambda(Q)^{\frac{\alpha}{n_i+1}} f(Q)$, $\quad i = 0, \dots, m$; $\alpha = 1, \dots, n_i$.
2. $I_{v^i}[f(Q)] := \mathop{\mathrm{res}}_{\infty^i} \lambda(Q) f(Q)$, $\quad i = 1, \dots, m$.
3. $I_{\omega^i}[f(Q)] := \mathrm{v.p.} \displaystyle\int_{\infty^0}^{\infty^i} f(Q)$, $\quad i = 1, \dots, m$.
4. $I_{r^k}[f(Q)] := -\displaystyle\oint_{a_k} \lambda(Q) f(Q)$, $\quad k = 1, \dots, g$.
5. $I_{s^k}[f(Q)] := \dfrac{1}{2\pi i} \displaystyle\oint_{b_k} f(Q)$, $\quad k = 1, \dots, g$.
Here the principal value near infinity is defined by omitting the divergent part of the integral as a function of the local parameter $z_i$ (such that $\lambda = z_i^{-n_i-1}$).

Theorem 2. Let us choose a point $P_0 \in \mathcal{L}$ which is mapped to zero by the function $\lambda$, i.e. $\lambda(P_0) = 0$, and let all basis contours $\{a_k, b_k\}$ start at this point. Then, the defined operations 1.-5. applied to the bidifferential $W$ give a set of $L$ differentials, called primary, with the following singularities (characteristic properties). By $z_i$ we denote the local parameter near $\infty^i$ such that $z_i^{-n_i-1} = \lambda$, $n_i$ being the ramification index at $\infty^i$:

1. $\phi_{t^{i;\alpha}}(P) := I_{t^{i;\alpha}}[W(P, Q)] \sim z_i^{-\alpha-1}(P)\, dz_i(P)$, $P \sim \infty^i$; $\quad i = 0, \dots, m$; $\alpha = 1, \dots, n_i$.
2. $\phi_{v^i}(P) := I_{v^i}[W(P, Q)] \sim -d\lambda(P)$, $P \sim \infty^i$; $\quad i = 1, \dots, m$.
3. $\phi_{\omega^i}(P) := I_{\omega^i}[W(P, Q)]$: $\ \mathop{\mathrm{res}}_{\infty^i} \phi_{\omega^i} = 1$; $\ \mathop{\mathrm{res}}_{\infty^0} \phi_{\omega^i} = -1$; $\quad i = 1, \dots, m$.
4. $\phi_{r^k}(P) := I_{r^k}[W(P, Q)]$: $\ \phi_{r^k}(P^{b_k}) - \phi_{r^k}(P) = 2\pi i\, d\lambda(P)$; $\quad k = 1, \dots, g$.
5. $\phi_{s^k}(P) := I_{s^k}[W(P, Q)]$: holomorphic differential; $\quad k = 1, \dots, g$.

Here $\phi_{r^k}(P^{b_k}) - \phi_{r^k}(P)$ denotes the transformation of the differential under analytic continuation along the cycle $b_k$ on the Riemann surface. All above differentials have zero a-periods except $\phi_{s^l}$, which satisfy $\oint_{a_k} \phi_{s^l} = \delta_{kl}$.

Proof. Let us prove that
$$ \phi_{t^{i;\alpha}}(P) \underset{P \sim \infty^i}{\sim} z_i^{-\alpha-1}(P)\, dz_i(P) . \tag{4.9} $$
It is easy to see that the differential $\phi_{t^{i;\alpha}}(P)$ has a singularity only at $P = \infty^i$. Let us consider the expansion of the bidifferential $W$ at $Q \sim \infty^i$:
$$ W(P, Q) \underset{Q \sim \infty^i}{\simeq} W(P, \infty^i) + W_{,2}(P, \infty^i)\, z_i(Q) + \frac{1}{2} W^{(2)}_{,2}(P, \infty^i)\, z_i^2(Q) + \dots . \tag{4.10} $$
Since $W(P, Q) \simeq \big( (z_i(P) - z_i(Q))^{-2} + O(1) \big)\, dz_i(P)\, dz_i(Q)$ when $P \sim Q \sim \infty^i$, we have for the $(\alpha - 1)$th coefficient of the expansion (4.10),
$$ \frac{1}{\alpha!}\, W^{(\alpha-1)}_{,2}(P, \infty^i) \underset{P \sim \infty^i}{\sim} \frac{dz_i(P)}{z_i^{\alpha+1}(P)} , $$
which proves (4.9). The case $\alpha = n_i + 1$ proves $\phi_{v^i}(P) \underset{P \sim \infty^i}{\sim} -d\lambda(P)$.
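The coefficient extraction in this step can be made explicit with a geometric-series expansion. The sketch below abbreviates $z = z_i(P)$ and $w = z_i(Q)$ (labels chosen for this illustration only) and keeps track only of the singular part of $W$ near $\infty^i$.

```latex
% Expansion of the singular part for |w| < |z|:
\frac{1}{(z - w)^2} = \sum_{k \ge 0} (k+1)\, \frac{w^{k}}{z^{k+2}} .

% The coefficient of w^{\alpha-1} is \alpha / z^{\alpha+1}; it equals the Taylor coefficient
% \frac{1}{(\alpha-1)!} W^{(\alpha-1)}_{,2}, so
\frac{1}{\alpha!}\, W^{(\alpha-1)}_{,2}(P, \infty^i) \sim \frac{dz}{z^{\alpha+1}} .

% Operation 1 picks out this coefficient, since \lambda^{\alpha/(n_i+1)} = w^{-\alpha}:
\frac{1}{\alpha} \mathop{\mathrm{res}}_{w=0} w^{-\alpha}\, W(P,Q)
   = \frac{1}{\alpha} \cdot \frac{\alpha}{z^{\alpha+1}}\, dz = z^{-\alpha-1}\, dz .

% Case \alpha = n_i + 1 without the prefactor 1/\alpha (operation 2):
\mathop{\mathrm{res}}_{w=0} w^{-n_i-1}\, W(P,Q) = (n_i+1)\, z^{-n_i-2}\, dz
   = -\,d\big(z^{-n_i-1}\big) = -\,d\lambda .
```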
For the differentials $\phi_{\omega^i}$ the theorem can be proven analogously.

The differential $\phi_{r^k}(P)$ is not defined at the points of the contour $a_k$, however it has certain limits as $P$ approaches the contour from different sides; thus $\phi_{r^k}(P)$ is defined and single valued on the fundamental polygon $\widetilde{\mathcal{L}}$ of the surface. (The fundamental polygon $\widetilde{\mathcal{L}}$ is obtained by cutting the surface along all basis cycles $a_k$ and $b_k$ provided they all start at one point.) Let us denote $dq_k^i(P) := \phi_{r^k}(P^{b_i}) - \phi_{r^k}(P)$ (as we shall see below, $dq_k^i$ is indeed an exact differential) and consider the differential $\phi_{r^k}(P) \int_{P_0}^P \omega_k$ ($\omega_k$ is one of the normalized holomorphic differentials such that $\oint_{a_j} \omega_k = \delta_{jk}$). This differential has no poles inside $\widetilde{\mathcal{L}}$. Therefore its integral over the boundary of $\widetilde{\mathcal{L}}$ equals zero. On the other hand, since the boundary $\partial\widetilde{\mathcal{L}}$ consists of cycles $\{a_j\}$ and $\{b_j\}$, the integral can be rewritten via periods of the differentials as follows:
$$ 0 = \oint_{\partial\widetilde{\mathcal{L}}} \phi_{r^k}(P) \int_{P_0}^P \omega_k = \oint_{b_k} \phi_{r^k} - \sum_j \oint_{a_j} \phi_{r^k}\, B_{jk} + \sum_j \oint_{a_j} q_k^j\, \omega_k \tag{4.11} $$
($B_{jk} = \oint_{b_j} \omega_k$). Due to the choice of the point $P_0$ where all basis cycles start, we can change the order of integration in expressions $\oint_{b_k} \oint_{a_k} \lambda(Q)\, W(P, Q)$, as can be checked by a local (near the point $P_0$) calculation of the integral. Therefore we have
$$ \oint_{a_j} \phi_{r^k}(P) = 0 \quad \text{for all } j \qquad \text{and} \qquad \oint_{b_k} \phi_{r^k}(P) = -2\pi i \oint_{a_k} \lambda(Q)\, \omega_k(Q) . $$
Then, the relation (4.11) takes the form
$$ 0 = -2\pi i \oint_{a_k} \lambda(Q)\, \omega_k(Q) + \sum_j \oint_{a_j} q_k^j(Q)\, \omega_k(Q) , $$
and we conclude that $q_k^j(Q) = 2\pi i\, \lambda(Q)\, \delta_{jk}$.

For differentials $\phi_{s^k}$ the statement of the theorem follows from properties (3.3) of the bidifferential $W$. For all primary differentials (except $\phi_{s^k}$) a-periods are zero since they are zero for $W$.

4.2. Flat coordinates. For a flat metric there exists a set of coordinates in which coefficients of the metric are constant. These coordinates are called the flat coordinates of the metric. In flat coordinates the Christoffel symbols vanish and the covariant derivative $\nabla_{t^A}$ is the usual partial derivative $\partial_{t^A}$. Therefore flat coordinates can be found from the equation $\nabla_x \nabla_y t = 0$ ($x$ and $y$ are arbitrary vector fields on the manifold). In canonical coordinates this equation has the form:
$$ \partial_{\lambda_i} \partial_{\lambda_j} t = \sum_k \Gamma_{ij}^k\, \partial_{\lambda_k} t , \tag{4.12} $$
where the Christoffel symbols are given by (4.7). For different $i, j, k$, the Christoffel symbols of the metrics $ds_\phi^2$ (4.8) have the form:
$$ \Gamma_{ii}^k = -\beta_{ik}\, \frac{\phi(P_i)}{\phi(P_k)} , \qquad \Gamma_{ii}^i = -\sum_{j,\, j \neq i} \Gamma_{ij}^i , \qquad \Gamma_{ij}^i = \beta_{ij}\, \frac{\phi(P_j)}{\phi(P_i)} , \qquad \Gamma_{ij}^k = 0 . $$
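These expressions follow from (4.7) once the metric coefficients of (4.8) are inserted. The short derivation below is a sketch that uses $g_{ii} = \frac{1}{2}\phi^2(P_i)$, the definition (3.13) of the rotation coefficients, and the condition $e(g_{ii}) = 0$ of Lemma 1.

```latex
% From (3.13):  \partial_{\lambda_j} g_{ii} = 2\sqrt{g_{ii}}\;\partial_{\lambda_j}\sqrt{g_{ii}}
%                                          = 2\,\beta_{ij}\,\sqrt{g_{ii}}\,\sqrt{g_{jj}} .
\Gamma_{ij}^{i} = \frac{1}{2}\,\frac{\partial_{\lambda_j} g_{ii}}{g_{ii}}
   = \beta_{ij}\,\frac{\sqrt{g_{jj}}}{\sqrt{g_{ii}}}
   = \beta_{ij}\,\frac{\phi(P_j)}{\phi(P_i)} ,
\qquad
\Gamma_{ii}^{k} = -\frac{1}{2}\,\frac{\partial_{\lambda_k} g_{ii}}{g_{kk}}
   = -\beta_{ik}\,\frac{\sqrt{g_{ii}}}{\sqrt{g_{kk}}}
   = -\beta_{ik}\,\frac{\phi(P_i)}{\phi(P_k)} .

% From e(g_{ii}) = \sum_j \partial_{\lambda_j} g_{ii} = 0:
\Gamma_{ii}^{i} = \frac{1}{2}\,\frac{\partial_{\lambda_i} g_{ii}}{g_{ii}}
   = -\frac{1}{2}\sum_{j \neq i} \frac{\partial_{\lambda_j} g_{ii}}{g_{ii}}
   = -\sum_{j \neq i} \Gamma_{ij}^{i} .
```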
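Theorem 3 ([5]). The following functions give a set of flat coordinates of the metric $ds_\phi^2$ (4.8):
$$ t^{i;\alpha} = -(n_i + 1)\, I_{t^{i;1+n_i-\alpha}}[\phi] , \qquad i = 0, \dots, m ; \ \alpha = 1, \dots, n_i , $$
$$ v^i = -I_{\omega^i}[\phi] , \qquad i = 1, \dots, m , $$
$$ \omega^i = -I_{v^i}[\phi] , \qquad i = 1, \dots, m , $$
$$ r^k = I_{s^k}[\phi] , \qquad k = 1, \dots, g , $$
$$ s^k = I_{r^k}[\phi] , \qquad k = 1, \dots, g . $$
Non-zero entries of the constant matrix of the metric in these coordinates are:
$$ ds_\phi^2(\partial_{t^{i;\alpha}}, \partial_{t^{j;\beta}}) = \frac{1}{n_i + 1}\, \delta_{ij}\, \delta_{\alpha+\beta,\, n_i+1} , \qquad ds_\phi^2(\partial_{v^i}, \partial_{\omega^j}) = \delta_{ij} , \qquad ds_\phi^2(\partial_{r^k}, \partial_{s^l}) = -\delta_{kl} . $$
For notational convenience we denote an arbitrary flat coordinate by $t^A$, and a primary differential by $\phi_{t^A}$, i.e. $t^A \in \{ t^{i;\alpha};\ v^i, \omega^i;\ r^k, s^k \mid i = 0, \dots, m;\ \alpha = 1, \dots, n_i;\ k = 1, \dots, g \}$.

Proposition 5. In flat coordinates $\{t^A\}$ of the metric $ds_\phi^2$, the Euler vector field (4.3) has the form (2.2) with coefficients $\{\nu_A\}$ depending on the choice of a primary differential $\phi$:

• if $\phi = \phi_{t^{i_o;\alpha}}$ then (with $\beta$ denoting the summation index, to keep it distinct from the fixed $\alpha$)
$$ E = \sum_{i=0}^m \sum_{\beta=1}^{n_i} \Big( 1 + \frac{\alpha}{n_{i_o}+1} - \frac{\beta}{n_i+1} \Big) t^{i;\beta}\, \partial_{t^{i;\beta}} + \sum_{i=1}^m \Big( \frac{\alpha}{n_{i_o}+1}\, v^i \partial_{v^i} + \Big( 1 + \frac{\alpha}{n_{i_o}+1} \Big) \omega^i \partial_{\omega^i} \Big) + \sum_{k=1}^g \Big( \frac{\alpha}{n_{i_o}+1}\, r^k \partial_{r^k} + \Big( 1 + \frac{\alpha}{n_{i_o}+1} \Big) s^k \partial_{s^k} \Big) , $$

• if $\phi = \phi_{v^{i_o}}$ or $\phi = \phi_{r^{k_o}}$ then
$$ E = \sum_{i=0}^m \sum_{\alpha=1}^{n_i} \Big( 2 - \frac{\alpha}{n_i+1} \Big) t^{i;\alpha}\, \partial_{t^{i;\alpha}} + \sum_{i=1}^m \big( v^i \partial_{v^i} + 2\, \omega^i \partial_{\omega^i} \big) + \sum_{k=1}^g \big( r^k \partial_{r^k} + 2\, s^k \partial_{s^k} \big) , $$

• if $\phi = \phi_{\omega^{i_o}}$ or $\phi = \phi_{s^{k_o}}$ then
$$ E = \sum_{i=0}^m \sum_{\alpha=1}^{n_i} \Big( 1 - \frac{\alpha}{n_i+1} \Big) t^{i;\alpha}\, \partial_{t^{i;\alpha}} + \sum_{i=1}^m \omega^i \partial_{\omega^i} + \sum_{k=1}^g s^k \partial_{s^k} . $$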
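A consistency check on these weights: by (2.6), in flat coordinates where the metric is constant and $E = \sum_A \nu_A t^A \partial_{t^A}$, every non-zero entry of the metric must pair two coordinates whose weights add up to $2 - \nu$. The sketch below verifies this for the second and third cases, using the pairing of coordinates from Theorem 3 and the charges $\nu = -1$ and $\nu = 1$ that follow from Proposition 3 with $n = 1$ and $n = 0$ respectively.

```latex
% In flat coordinates [E, \partial_{t^A}] = -\nu_A \partial_{t^A}, so (2.6) gives
(\nu_A + \nu_B)\,\langle \partial_{t^A}, \partial_{t^B} \rangle
   = (2 - \nu)\,\langle \partial_{t^A}, \partial_{t^B} \rangle .

% Case \phi = \phi_{v^{i_o}} or \phi_{r^{k_o}} (\nu = -1, so 2 - \nu = 3):
\Big(2 - \frac{\alpha}{n_i+1}\Big) + \Big(2 - \frac{n_i+1-\alpha}{n_i+1}\Big) = 3 ,
\qquad \nu_{v^i} + \nu_{\omega^i} = 1 + 2 = 3 ,
\qquad \nu_{r^k} + \nu_{s^k} = 1 + 2 = 3 .

% Case \phi = \phi_{\omega^{i_o}} or \phi_{s^{k_o}} (\nu = 1, so 2 - \nu = 1):
\Big(1 - \frac{\alpha}{n_i+1}\Big) + \Big(1 - \frac{n_i+1-\alpha}{n_i+1}\Big) = 1 ,
\qquad \nu_{v^i} + \nu_{\omega^i} = 0 + 1 = 1 ,
\qquad \nu_{r^k} + \nu_{s^k} = 0 + 1 = 1 .
```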
Proposition 6 (see [5]). The unit vector field $e$ (4.2) in the flat coordinates of the metric $ds^2_{\phi_{t^{A_0}}}$ has the form: $e = -\partial_{t^{A_0}}$.

Thus, the coordinate $t^{A_0}$ is naturally marked. Let us denote it by $t^1$ so that $e = -\partial_{t^1}$. In flat coordinates the Christoffel symbols of the Levi-Civita connection vanish. Therefore the proposition implies that the unit vector field is covariantly constant (F2).
4.3. Prepotentials of Frobenius structures.

Definition 4. A prepotential of a Frobenius manifold is a function $F$ of flat coordinates of the corresponding metric such that its third derivatives are given by the symmetric 3-tensor $c$ from the definition of a Frobenius manifold (F3):
$$ \frac{\partial^3 F(t)}{\partial t^A\, \partial t^B\, \partial t^C} = c(\partial_{t^A}, \partial_{t^B}, \partial_{t^C}) = ds_\phi^2(\partial_{t^A} \cdot \partial_{t^B}, \partial_{t^C}) . \tag{4.13} $$
By presenting this function (defined up to a quadratic polynomial in flat coordinates) for each metric $ds_\phi^2$ we shall prove the symmetry in four indices $(A, B, C, D)$ of the tensor $(\nabla_{\partial_{t^D}} c)(\partial_{t^A}, \partial_{t^B}, \partial_{t^C})$ and therefore complete the construction of the Frobenius manifold. We shall denote the Frobenius manifold corresponding to the metric $ds_\phi^2$ by $\widehat{M}^\phi = \widehat{M}^\phi_{g;n_0,\dots,n_m}$ and its prepotential by $F_\phi$.

Remark 2. Proposition 6 implies that the third order derivatives (4.13) are constant if one of the derivatives is taken with respect to the coordinate $t^1$:
$$ \frac{\partial^3 F}{\partial t^1\, \partial t^A\, \partial t^B} = -ds^2_{\phi_{t^1}}(\partial_{t^A}, \partial_{t^B}) . $$

Before writing a formula for the prepotential we shall define a pairing of differentials. Let $\omega^{(1)}$ and $\omega^{(2)}$ be two differentials on the surface $\mathcal{L}$ holomorphic outside of the points $\infty^0, \dots, \infty^m$ with the following behaviour at $\infty^i$:
$$ \omega^{(\alpha)} = \sum_{n=-n^{(\alpha)}}^{\infty} c^{(\alpha)}_{n,i}\, z_i^n\, dz_i + \frac{1}{n_i + 1}\, d\Big( \sum_{n>0} r^{(\alpha)}_{n,i}\, \lambda^n \log \lambda \Big) , \qquad P \sim \infty^i , \tag{4.14} $$
where $n^{(\alpha)} \in \mathbb{Z}$ and $c^{(\alpha)}_{n,i}$, $r^{(\alpha)}_{n,i}$ are some coefficients; $z_i = z_i(P)$ is a local parameter near $\infty^i$. Denote also for $k = 1, \dots, g$:
$$ \oint_{a_k} \omega^{(\alpha)} = A_k^{(\alpha)} , \tag{4.15} $$
$$ \omega^{(\alpha)}(P^{a_k}) - \omega^{(\alpha)}(P) = dp_k^{(\alpha)}(\lambda(P)) , \qquad p_k^{(\alpha)}(\lambda) = \sum_{s>0} p^{(\alpha)}_{sk}\, \lambda^s , \tag{4.16} $$
$$ \omega^{(\alpha)}(P^{b_k}) - \omega^{(\alpha)}(P) = dq_k^{(\alpha)}(\lambda(P)) , \qquad q_k^{(\alpha)}(\lambda) = \sum_{s>0} q^{(\alpha)}_{sk}\, \lambda^s . \tag{4.17} $$
Here, as before, $\omega(P^{a_k})$ and $\omega(P^{b_k})$ denote the analytic continuation of $\omega(P)$ along the corresponding cycle on the Riemann surface. Note that if $\omega^{(\alpha)}$ is one of the primary differentials (defined in Theorem 2), then the coefficients $c_{n,i}$, $r_{n,i}$, $p_{sk}$, $q_{sk}$ and $A_k$ do not depend on coordinates.

Definition 5. For two differentials whose singularity structures are given by (4.14) – (4.17) define a pairing $\mathcal{F}[\ ,\ ]$ as follows:
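\[ \mathcal{F}[\omega^{(\alpha)}, \omega^{(\beta)}] = \sum_{i=0}^{m} \Bigl( \sum_{n\ge 0} \frac{c^{(\alpha)}_{-n-2,i}}{n+1}\, c^{(\beta)}_{n,i} + c^{(\alpha)}_{-1,i}\ \mathrm{v.p.}\!\int_{P_0}^{\infty^i} \omega^{(\beta)} - \mathrm{v.p.}\!\int_{P_0}^{\infty^i} \sum_{n>0} r^{(\alpha)}_{n,i}\, \lambda^n\, \omega^{(\beta)} \Bigr) + \frac{1}{2\pi i} \sum_{k=1}^{g} \Bigl( -\oint_{a_k} q^{(\alpha)}_k(\lambda)\, \omega^{(\beta)} + \oint_{b_k} p^{(\alpha)}_k(\lambda)\, \omega^{(\beta)} + A^{(\alpha)}_k \oint_{b_k} \omega^{(\beta)} \Bigr), \]
where $P_0$ is a marked point on the surface such that $\lambda(P_0) = 0$.

For any primary differential $\phi$ we consider a (multivalued on $L$) function $p$:
\[ p(P) = \mathrm{v.p.}\!\int_{\infty^0}^{P} \phi. \tag{4.18} \]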
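One can see that singularities of the differential $p\,d\lambda$ can be described by formulas similar to (4.14) – (4.17). The corresponding coefficients $c_{n,i}$, $r_{n,i}$, $p_{sk}$, $q_{sk}$ and $A_k$ for $\omega = p\,d\lambda$ depend on coordinates $\{\lambda_k\}$ in contrast to those for primary differentials.

Theorem 4 ([5]). The following function gives a prepotential of the Frobenius manifold $\widehat{M}^\phi$:
\[ F_\phi = \frac{1}{2}\, \mathcal{F}[p\,d\lambda, p\,d\lambda], \tag{4.19} \]
where $p$ is the multivalued function (4.18). The third derivatives of $F_\phi$ are given by
\[ \frac{\partial^3 F_\phi(t)}{\partial t^A\,\partial t^B\,\partial t^C} = c(\partial_{t^A}, \partial_{t^B}, \partial_{t^C}) = -\sum_{i=1}^{L} \mathop{\mathrm{res}}_{P_i} \frac{\phi_{t^A}\,\phi_{t^B}\,\phi_{t^C}}{dp\, d\lambda} \equiv -\frac{1}{2} \sum_{i=1}^{L} \frac{\phi_{t^A}(P_i)\,\phi_{t^B}(P_i)\,\phi_{t^C}(P_i)}{\phi(P_i)}. \tag{4.20} \]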
Theorem 5 ([5]). The second derivatives of the prepotential $F_\phi$ are given by the pairing of the corresponding primary differentials:
\[ \partial_{t^A} \partial_{t^B} F_\phi = \mathcal{F}[\phi_{t^A}, \phi_{t^B}]. \]

For the described Frobenius manifold $\widehat{M}^\phi$, the prepotential (4.19) is a quasihomogeneous function of flat coordinates $\{t^A\}$ of the metric $ds^2_\phi$, i.e. the following holds for some numbers $\{\nu_A\}$ and $\nu_F$ and any non-zero constant $\kappa$:
\[ F_\phi(\kappa^{\nu_1} t^1, \dots, \kappa^{\nu_n} t^n) = \kappa^{\nu_F} F_\phi(t^1, \dots, t^n) + \text{quadratic terms}. \]
This follows from the existence of the Euler vector field satisfying (2.4) – (2.6) (see the proof of Theorem 1). The coefficients of quasihomogeneity $\{\nu_A\}$ are coefficients of the Euler vector field written in flat coordinates (see (2.1) – (2.3)); they are given by Proposition 5. The coefficient $\nu_F = 3 - \nu$ can be computed for each Frobenius structure $\widehat{M}^\phi$ using Proposition 3:

if $\phi = \phi_{t^{i;\alpha}}$, then $\nu = 1 - \dfrac{2\alpha}{n_i+1}$, $\nu_F = \dfrac{2\alpha}{n_i+1} + 2$;
if $\phi = \phi_{v^i}$ or $\phi = \phi_{r^k}$, then $\nu = -1$, $\nu_F = 4$;
if $\phi = \phi_{\omega^i}$ or $\phi = \phi_{s^k}$, then $\nu = 1$, $\nu_F = 2$.
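The relation $\nu_F = 3 - \nu$ and the scaling law above can be checked mechanically. The following is a minimal sketch in exact rational arithmetic. The function `toy_F` is a hypothetical two-variable example chosen only because it is quasihomogeneous with weights $(1, 2/3)$ and degree $8/3$; it is not claimed to be one of the prepotentials $F_\phi$ of this paper. The helper names are illustrative as well.

```python
from fractions import Fraction


def charges_t(alpha, n_i):
    """Charge nu and degree nu_F listed in the text for phi = phi_{t^{i;alpha}}."""
    nu = 1 - Fraction(2 * alpha, n_i + 1)
    nu_F = Fraction(2 * alpha, n_i + 1) + 2
    return nu, nu_F


def toy_F(t1, t2):
    """Hypothetical quasihomogeneous function (an assumption, not from the paper)."""
    return Fraction(1, 2) * t1**2 * t2 + t2**4


def scaling_sides(F, c, d, weights, nu_F, point):
    """Return both sides of F(kappa^nu_A t^A) = kappa^nu_F F(t) for kappa = c**d.

    Choosing kappa = c**d with d a common denominator of the weights keeps
    every power an exact integer power of c.
    """
    scaled = [c ** int(w * d) * t for w, t in zip(weights, point)]
    lhs = F(*scaled)
    rhs = c ** int(nu_F * d) * F(*point)
    return lhs, rhs


nu, nu_F = charges_t(alpha=1, n_i=2)
weights = (Fraction(1), Fraction(2, 3))
lhs, rhs = scaling_sides(toy_F, c=2, d=3, weights=weights, nu_F=nu_F,
                         point=(Fraction(3), Fraction(5)))
```

For $\alpha = 1$, $n_i = 2$ the listed formulas give $\nu = 1/3$ and $\nu_F = 8/3$, which is the degree of the toy function, so the two sides of the scaling relation agree without any quadratic correction.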
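Remark 3. A linear combination of primary differentials corresponding to the same charge $\nu$ also gives a Frobenius structure. Namely, the above construction works for
\[ \phi = \sum_{i=1}^{m} \kappa_i\, \phi_{v^i} + \sum_{k=1}^{g} \sigma_k\, \phi_{r^k} \qquad\text{and}\qquad \phi = \sum_{i=1}^{m} \kappa_i\, \phi_{\omega^i} + \sum_{k=1}^{g} \sigma_k\, \phi_{s^k}, \]
with any constants $\{\kappa_i\}$ and $\{\sigma_k\}$. The unit vector field in these cases, respectively, is given by
\[ e = -\Bigl( \sum_{i=1}^{m} \kappa_i\, \partial_{v^i} + \sum_{k=1}^{g} \sigma_k\, \partial_{r^k} \Bigr) \qquad\text{and}\qquad e = -\Bigl( \sum_{i=1}^{m} \kappa_i\, \partial_{\omega^i} + \sum_{k=1}^{g} \sigma_k\, \partial_{s^k} \Bigr). \]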
After a linear change of variables, the unit field can be written as $e = -\partial_{\xi^1}$ for a new variable $\xi^1$, since the coordinates $\{v^i\}$ and $\{r^k\}$ ($\{\omega^i\}$ and $\{s^k\}$) have equal quasihomogeneity coefficients.

5. “Real Doubles” of Dubrovin’s Frobenius Structures on Hurwitz Spaces

In this section we consider the moduli space $\widehat{M} = \widehat{M}_{g;n_0,\dots,n_m}$ as a real manifold. The set of local coordinates is given by the set of branch points of the covering $L_\lambda$ and their complex conjugates: $\{\lambda_1, \dots, \lambda_L; \bar\lambda_1, \dots, \bar\lambda_L\}$. On the space $\widehat{M}$ with coordinates $\{\lambda_i; \bar\lambda_i\}$ we shall build a Frobenius structure in a way analogous to the one described in Sect. 4. The construction will be based on a family of flat metrics on $\widehat{M}(\{\lambda_i; \bar\lambda_i\})$ of the type (3.19), (3.20) with rotation coefficients given by the Schiffer and Bergman kernels. Since in genus zero the Schiffer kernel coincides with the bidifferential $W$ and the Bergman kernel vanishes, we only get essentially new metrics (and therefore new Frobenius structures) for Hurwitz spaces in genus greater than zero.

We start with a description of a Frobenius algebra in the tangent space. The coordinates $\{\lambda_1, \dots, \lambda_L; \bar\lambda_1, \dots, \bar\lambda_L\}$ are taken to be canonical for multiplication:
\[ \partial_{\lambda_i} \cdot \partial_{\lambda_j} = \delta_{ij}\, \partial_{\lambda_i}, \tag{5.1} \]
where indices $i, j$ range now in the set of all indices, i.e. $i, j \in \{1, \dots, L; \bar 1, \dots, \bar L\}$, and we put $\lambda_{\bar i} := \bar\lambda_i$. The unit vector field of the algebra is given by
\[ e = \sum_{i=1}^{L} \bigl( \partial_{\lambda_i} + \partial_{\bar\lambda_i} \bigr). \tag{5.2} \]
The role of an inner product of the Frobenius algebra is played by one of the metrics (3.19), (3.20). The new vector field $E$, analogously, is
\[ E := \sum_{i=1}^{L} \bigl( \lambda_i\, \partial_{\lambda_i} + \bar\lambda_i\, \partial_{\bar\lambda_i} \bigr). \tag{5.3} \]
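The algebra structure (5.1), (5.2) is simple enough to illustrate directly. The sketch below represents a tangent vector by its list of $2L$ components in the canonical basis, so that (5.1) becomes componentwise multiplication. It is only an illustration under that assumption: the function names are hypothetical, the components are treated as formal numbers, and the diagonal coefficients `g` stand in for any diagonal metric rather than for the specific metrics (3.19), (3.20).

```python
def multiply(x, y):
    """Product (5.1): basis vectors are idempotents, so components multiply."""
    return [a * b for a, b in zip(x, y)]


def unit(dim):
    """Unit vector field (5.2): the sum of all canonical basis vectors."""
    return [1] * dim


def diagonal_metric(g, x, y):
    """A diagonal bilinear form with coefficients g on the canonical basis."""
    return sum(gi * a * b for gi, a, b in zip(g, x, y))


# Example with L = 2, i.e. four canonical directions (two holomorphic, two conjugate).
x = [2, -1, 3, 5]
y = [1, 4, 0, 2]
z = [7, 1, 1, -3]
g = [1, 2, 3, 4]

unit_acts_trivially = multiply(unit(4), x) == x
invariance = diagonal_metric(g, multiply(x, y), z) == diagonal_metric(g, x, multiply(y, z))
```

The second check is the Frobenius invariance property of the inner product with respect to the multiplication, which holds automatically for any diagonal metric in canonical coordinates.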
5.1. Primary differentials. Together with the multiplication (5.1), the Euler field (5.3) satisfies relation (2.5) of F4. Its action (2.6) on a diagonal metric takes the form:
\[ E \langle \partial_{\lambda_k}, \partial_{\lambda_k} \rangle = -\nu\, \langle \partial_{\lambda_k}, \partial_{\lambda_k} \rangle, \qquad k \in \{1, \dots, L; \bar 1, \dots, \bar L\}. \tag{5.4} \]
Among the metrics (3.19), (3.20) we choose, similarly to Proposition 3, those for which this condition holds.

Proposition 7. Let the contour $l$ in (3.19), (3.20) be either closed or connecting points $\infty^i$ and $\infty^j$ for some $i, j$. In the latter case we regularize the integral by omitting its divergent part as a function of the local parameter $z_i$ (or as a function of $\bar z_i$) near $\infty^i$. Then the metrics (3.19), (3.20) with $h(Q) = C\lambda^n(Q)$ (where $C$ is a constant) satisfy (5.4) with $\nu = 1 - 2n$ and the Euler field (5.3).

Proof. The proof is the same as for Proposition 3: we use the fact that Bergman and Schiffer kernels are invariant under biholomorphic mappings of the Riemann surface. The biholomorphic map to be taken in this case is $\lambda \to (1+\epsilon)\lambda$, where $\epsilon$ is real.

Proposition 8. Rotation coefficients (3.21) given by the Schiffer and Bergman kernels satisfy $E(\beta_{ij}) = -\beta_{ij}$, $i, j \in \{1, \dots, L; \bar 1, \dots, \bar L\}$, where the Euler field $E$ is given by (5.3).

Proof. This statement is a corollary of Proposition 7; it can also be proven directly by using the invariance of the kernels under the mapping of Riemann surfaces $L_\lambda \to L_{\lambda'}$, $\lambda \to (1+\epsilon)\lambda$, for $\epsilon \in \mathbb{R}$.

Among the metrics $ds^2 = \sum_i \bigl( g_{ii}(d\lambda_i)^2 + g_{\bar i \bar i}(d\bar\lambda_i)^2 \bigr)$ of the form (3.19), (3.20) with $h = C\lambda^n$ and a contour $l$ of the type required in Proposition 7, only those correspond to Frobenius manifolds whose coefficients satisfy $e(g_{ii}) = e(g_{\bar i \bar i}) = 0$ ($e$ is the unit vector field (5.2)). This follows from F2 and Lemma 1, which is obviously valid for the unit vector field (5.2) and diagonal potential metrics (3.19), (3.20).

Therefore we need to find the combinations of a contour $l$ and a function $h = C\lambda^n$ such that formulas (3.19), (3.20) give metrics whose coefficients are annihilated by the vector field $e$. We list those combinations in the form of operations $I[f(Q)] = \int_l C\lambda^n f(Q)$ applied to a differential $f$ of the form $f = f_{(1,0)} + f_{(0,1)}$. We say that a differential is of the $(1,0)$-type if in a local coordinate $z$ it can be represented as $f_{(1,0)} = f_1(z)\,dz$, and is of the $(0,1)$-type if in a local coordinate it has the form $f_{(0,1)} = f_2(\bar z)\,d\bar z$. We shall also call $f_{(1,0)}$ and $f_{(0,1)}$ the holomorphic and antiholomorphic parts of a differential $f$, respectively. We denote by $\widetilde{\mathrm{res}}$ the coefficient in front of $d\bar z/\bar z$ in the Laurent expansion of a differential. As before, $z_i$ is the local parameter in a neighbourhood of $\infty^i$ such that $z_i^{-n_i-1}(Q) = \lambda(Q)$, $Q \sim \infty^i$.

For $i = 0, \dots, m$; $\alpha = 1, \dots, n_i$ we define:
1. $I_{t^{i;\alpha}}[f(Q)] := \dfrac{1}{\alpha}\, \mathop{\mathrm{res}}_{\infty^i}\, z_i^{-\alpha}(Q)\, f_{(1,0)}(Q)$.
2. $I_{t^{\overline{i;\alpha}}}[f(Q)] := \dfrac{1}{\alpha}\, \mathop{\widetilde{\mathrm{res}}}_{\infty^i}\, \bar z_i^{-\alpha}(Q)\, f_{(0,1)}(Q)$.
3. $I_{v^i}[f(Q)] := \mathop{\mathrm{res}}_{\infty^i}\, \lambda(Q)\, f_{(1,0)}(Q)$.
4. $I_{v^{\bar i}}[f(Q)] := \mathop{\widetilde{\mathrm{res}}}_{\infty^i}\, \bar\lambda(Q)\, f_{(0,1)}(Q)$.

For $i = 1, \dots, m$ we define:
5. $I_{\omega^i}[f(Q)] := \mathrm{v.p.}\!\int_{\infty^0}^{\infty^i} f_{(1,0)}(Q)$.
6. $I_{\omega^{\bar i}}[f(Q)] := \mathrm{v.p.}\!\int_{\infty^0}^{\infty^i} f_{(0,1)}(Q)$.

As before, the principal value near infinity is defined by omitting the divergent part of an integral as a function of the corresponding local parameter.

For $k = 1, \dots, g$ we define:
7. $I_{r^k}[f(Q)] := -\oint_{a_k} \lambda(Q)\, f_{(1,0)}(Q) - \oint_{a_k} \bar\lambda(Q)\, f_{(0,1)}(Q)$.
8. $I_{u^k}[f(Q)] := \oint_{b_k} \lambda(Q)\, f_{(1,0)}(Q) + \oint_{b_k} \bar\lambda(Q)\, f_{(0,1)}(Q)$.
9. $I_{s^k}[f(Q)] := \dfrac{1}{2\pi i} \oint_{b_k} f_{(1,0)}(Q)$.
10. $I_{t^k}[f(Q)] := -\dfrac{1}{2\pi i} \oint_{a_k} f_{(1,0)}(Q)$.

Applying these operations to the sum of Schiffer and Bergman kernels, we shall obtain a set of primary differentials $\Phi$, each of which gives a Darboux-Egoroff metric and a corresponding Frobenius structure. These differentials, listed below, decompose into a sum of holomorphic and antiholomorphic parts. The a-periods vanish for all primary differentials except for the differentials labeled by the index $s^k$; the b-periods do not vanish only for the differentials having the index $t^k$. This normalization and a given type of singularity characterize a primary differential completely due to the following lemma.

Lemma 2. If a single valued differential on a Riemann surface of the form $w = w_{(1,0)} + w_{(0,1)}$ has zero a- and b-periods and its parts $w_{(1,0)}$ and $w_{(0,1)}$ are everywhere analytic with respect to local parameters $z$ and $\bar z$, respectively, then the differential $w$ is zero.

Proof. Since the holomorphic and antiholomorphic parts of the differential must be regular and single valued on the surface, we can write $w$ in the form $w = \sum_{k=1}^{g} \alpha_k\, \omega_k + \sum_{k=1}^{g} \beta_k\, \overline{\omega_k}$, where $\{\omega_i\}$ are holomorphic normalized differentials. The vanishing of a-periods gives $\alpha_k = -\beta_k$ and vanishing of b-periods implies that all $\alpha_k$ should be zero.

We list primary differentials together with their characteristic properties. A proof that the differentials have the given properties is essentially contained in the proof of Theorem 2.

Let us fix a point $P_0$ on $L$ such that $\lambda(P_0) = 0$, and let all the basic cycles $\{a_k, b_k\}_{k=1}^{g}$ on the surface start at this point.
This enables us to change the order of integration in expressions of the type $\oint_{b_k}\oint_{a_k} \lambda(P)\,\Omega(P,Q)$ (this can be checked by a local calculation of the integral near the point $P_0$) and compute a- and b-periods of the following primary differentials.

For $i = 0, \dots, m$; $\alpha = 1, \dots, n_i$:
1. $\Phi_{t^{i;\alpha}}(P) = I_{t^{i;\alpha}}\bigl[\Omega(P,Q) + B(\bar P, Q)\bigr]$; $\quad \Phi_{t^{i;\alpha}}(P) \sim \bigl(z_i^{-\alpha-1} + O(1)\bigr)dz_i + O(1)\,d\bar z_i$, $P \sim \infty^i$.
2. $\Phi_{t^{\overline{i;\alpha}}}(P) = \overline{\Phi_{t^{i;\alpha}}(P)}$.

For $i = 1, \dots, m$:
3. $\Phi_{v^i}(P) = I_{v^i}\bigl[\Omega(P,Q) + B(\bar P, Q)\bigr]$; $\quad \Phi_{v^i}(P) \sim -d\lambda + O(1)(dz_i + d\bar z_i)$, $P \sim \infty^i$.
4. $\Phi_{v^{\bar i}}(P) = \overline{\Phi_{v^i}(P)}$.
5. $\Phi_{\omega^i}(P) = I_{\omega^i}\bigl[\Omega(P,Q) + B(\bar P, Q)\bigr]$; $\quad \mathop{\mathrm{res}}_{\infty^i} \Phi_{\omega^i} = 1$; $\ \mathop{\mathrm{res}}_{\infty^0} \Phi_{\omega^i} = -1$.
6. $\Phi_{\omega^{\bar i}}(P) = \overline{\Phi_{\omega^i}(P)}$.

For $k = 1, \dots, g$:
7. $\Phi_{r^k}(P) = I_{r^k}\bigl[2\,\mathrm{Re}\{\Omega(P,Q) + B(\bar P, Q)\}\bigr]$; $\quad \Phi_{r^k}(P^{b_k}) - \Phi_{r^k}(P) = 2\pi i\, d\lambda - 2\pi i\, d\bar\lambda$.
8. $\Phi_{u^k}(P) = I_{u^k}\bigl[2\,\mathrm{Re}\{\Omega(P,Q) + B(\bar P, Q)\}\bigr]$; $\quad \Phi_{u^k}(P^{a_k}) - \Phi_{u^k}(P) = 2\pi i\, d\lambda - 2\pi i\, d\bar\lambda$.
9. $\Phi_{s^k}(P) = I_{s^k}\bigl[\Omega(P,Q) + B(\bar P, Q)\bigr]$; no singularities.
10. $\Phi_{t^k}(P) = I_{t^k}\bigl[\Omega(P,Q) + B(\bar P, Q)\bigr]$; no singularities.

Here, as before, $\lambda = \lambda(P)$ and $z_i = z_i(P)$ is the local parameter at $P \sim \infty^i$ such that $\lambda = z_i^{-n_i-1}$. Note that due to properties (3.10) of the Schiffer and Bergman kernels and the choice of the point $P_0$ (see the proof of Theorem 2), only the primary differentials of the last two types have non-zero a- and b-periods. Let us denote an arbitrary differential from the list by $\Phi_{\xi^A}$; then the following holds:
\[ \oint_{a_\alpha} \Phi_{\xi^A} = \delta_{\xi^A, s^\alpha}; \qquad \oint_{b_\alpha} \Phi_{\xi^A} = \delta_{\xi^A, t^\alpha} \]
($\delta$ is the Kronecker symbol).

The number of primary differentials is $2L$ by virtue of the Riemann-Hurwitz formula (3.1). Each of the primary differentials $\Phi$ defines a metric of the type (3.19), (3.20) by the formula:
\[ ds^2_\Phi = \frac{1}{2} \sum_{i=1}^{L} \Phi^2_{(1,0)}(P_i)\,(d\lambda_i)^2 + \frac{1}{2} \sum_{i=1}^{L} \Phi^2_{(0,1)}(P_i)\,(d\bar\lambda_i)^2, \tag{5.5} \]
where $\Phi_{(1,0)}$ and $\Phi_{(0,1)}$ are, respectively, the holomorphic and antiholomorphic parts of the differential $\Phi$. The evaluation of differentials at a ramification point $P_i$ is done with respect to the standard local parameter $x_i = \sqrt{\lambda - \lambda_i}$, i.e. $\Phi_{(1,0)}(P_i) = \bigl(\Phi_{(1,0)}(P)/dx_i(P)\bigr)\big|_{P=P_i}$. As is easy to see, metrics of the type (3.20) correspond to differentials $\Phi = \Phi_{u^k}$ and $\Phi = \Phi_{r^k}$.

Proposition 9. Primary differentials satisfy the following relations:
\[ e\bigl(\Phi_{(1,0)}(P_i)\bigr) = 0, \qquad e\bigl(\Phi_{(0,1)}(P_i)\bigr) = 0, \tag{5.6} \]
for any ramification point $P_i$. The proposition implies that the unit vector field $e$ (5.2) annihilates coefficients of the metric $ds^2_\Phi$ (5.5).

Proof. Consider the covering $L_\lambda^\delta$ obtained from $L_\lambda$ by a $\delta$-shift of the points of every sheet, choosing $\delta \in \mathbb{R}$; this shift maps the point $P$ of the surface to the point $P^\delta$ which belongs to the same sheet and for which $\lambda(P^\delta) = \lambda(P) + \delta$. Denote by $\Omega^\delta$ and $B^\delta$ the corresponding kernels on $L_\lambda^\delta$. They are invariant with respect to biholomorphic mappings of the Riemann surface, i.e. $\Omega^\delta(P^\delta, Q^\delta) = \Omega(P,Q)$, and $B^\delta(P^\delta, Q^\delta) = B(P,Q)$. The local parameters near ramification points also do not change: $x_i(P) = x_i^\delta(P^\delta) = \sqrt{\lambda(P) - \lambda_i}$. Therefore for differentials $\Phi_{\omega^i}$, $\Phi_{\omega^{\bar i}}$, $\Phi_{s^k}$, and $\Phi_{t^k}$, the statement of the proposition follows immediately from this invariance. For them we have, for example,
\[ \Phi^\delta_{\omega^i (1,0)}(P_j^\delta) = \Phi_{\omega^i (1,0)}(P_j); \qquad \Phi^\delta_{\omega^i (0,1)}(P_j^\delta) = \Phi_{\omega^i (0,1)}(P_j). \]
Differentiation of these equalities with respect to $\delta$ at $\delta = 0$ gives the action of the unit vector field $e$ (5.2) on the differential in the left side and zero in the right side.

Consider now the differential $\tilde\Phi(P) = -\oint_{a_k} \lambda(Q)\,\Omega(P,Q) - \oint_{a_k} \bar\lambda(Q)\, B(P, \bar Q)$, which is related to the differential $\Phi_{r^k}$ as follows: $\Phi_{r^k}(P) = 2\,\mathrm{Re}\{\tilde\Phi(P)\}$. On the shifted covering $L_\lambda^\delta$ we have
\[ \tilde\Phi^\delta(P_i^\delta) = -\oint_{a_k^\delta} \lambda(Q^\delta)\,\Omega^\delta(P_i^\delta, Q^\delta) - \oint_{a_k^\delta} \bar\lambda(Q^\delta)\, B^\delta(P_i^\delta, \bar Q^\delta) = -\oint_{a_k} \bigl(\lambda(Q) + \delta\bigr)\,\Omega(P_i, Q) - \oint_{a_k} \bigl(\bar\lambda(Q) + \delta\bigr)\, B(P_i, \bar Q). \tag{5.7} \]
Differentiating both sides of this equality with respect to $\delta$ at $\delta = 0$ and using the property (3.10) of the Schiffer and Bergman kernels, we prove formulas (5.6) for the differentials $\Phi_{r^k}$; the proof for $\Phi_{u^k}$ is analogous.

To prove (5.6) for the remaining differentials consider the local parameter $z_i$ near infinity $\infty^i$; under the $\delta$-shift it transforms as follows:
\[ z_i^{-\alpha}(P^\delta) = \bigl(\lambda(P) + \delta\bigr)^{\frac{\alpha}{n_i+1}} \underset{\delta \sim 0}{=} z_i^{-\alpha}(P) + \frac{\alpha}{n_i+1}\,\bigl(z_i(P)\bigr)^{-\alpha+n_i+1}\,\delta + O(\delta^2). \]
Therefore $\Phi_{t^{i;\alpha}}(P_j)$ on the covering $L_\lambda^\delta$ is given by
\[ \Phi^\delta_{t^{i;\alpha}}(P_j^\delta) = \frac{1}{\alpha}\, \mathop{\mathrm{res}}_{\infty^i} \Bigl( z_i^{-\alpha}(P) + \frac{\alpha}{n_i+1}\,\bigl(z_i(P)\bigr)^{-\alpha+n_i+1}\,\delta + O(\delta^2) \Bigr) \bigl( \Omega(P, P_j) + B(P, \bar P_j) \bigr). \]
Differentiating both sides with respect to $\delta$ at $\delta = 0$, we get
\[ e\bigl(\Phi_{t^{i;\alpha}(1,0)}(P_j)\bigr) = \frac{1}{n_i+1}\, \mathop{\mathrm{res}}_{\infty^i} \bigl(z_i(P)\bigr)^{-\alpha+n_i+1}\,\Omega(P, P_j), \qquad e\bigl(\Phi_{t^{i;\alpha}(0,1)}(P_j)\bigr) = \frac{1}{n_i+1}\, \mathop{\mathrm{res}}_{\infty^i} \bigl(z_i(P)\bigr)^{-\alpha+n_i+1}\, B(P, \bar P_j). \]
The right sides are zero for non-negative powers of $z_i$, i.e. for $\alpha = 1, \dots, n_i+1$. This proves the statement of the proposition for differentials $\Phi_{t^{i;\alpha}}$ and $\Phi_{v^i}$ ($\alpha = n_i+1$ corresponds up to a constant to the case of differential $\Phi_{v^i}$).

Remark 4. This calculation also shows that differentials $\Phi_{t^{i;\alpha}}$ for $i = 0, \dots, m$; $\alpha = 1, \dots, n_i$ and $\Phi_{v^i}$ for $i = 1, \dots, m$ give the full set of primary differentials of the type $\oint_l C\lambda^n \bigl(\Omega(P,Q) + B(\bar P, Q)\bigr)$ for $l$ being a small contour encircling one of the infinities. Note that we cannot consider $\Phi_{v^0}(P)$ as an independent differential due to the relation $\sum_{i=0}^{m} \Phi_{v^i}(P) = -(m+1)\,d\lambda(P)$, where $d\lambda(P) = d\zeta$ is a differential on $\mathbb{C}P^1$, the base of the covering.

Thus, we have constructed $2L$ differentials (see the Riemann-Hurwitz formula (3.1)); each of them gives by formula (5.5) a Darboux-Egoroff metric which satisfies F2 ($\nabla e = 0$), and on which the Euler field acts according to (2.6) from F4. Our next goal is to find a set of flat coordinates for each of the metrics (5.5).
5.2. Flat coordinates. Let us write the Christoffel symbols of the metric $ds^2_\Phi$ (5.5) in terms of the corresponding primary differential $\Phi$. We shall use the following lemma which can be proven by a simple calculation using the definition of primary differentials and variational formulas (3.11) for the Schiffer and Bergman kernels.

Lemma 3. The derivatives of primary differentials with respect to canonical coordinates are given by
\[ \frac{\partial \Phi_{\xi^A}(P)}{\partial \lambda_k} = \frac{1}{2}\, \Phi_{\xi^A (1,0)}(P_k)\, \bigl( \Omega(P, P_k) + B(\bar P, P_k) \bigr), \tag{5.8} \]
\[ \frac{\partial \Phi_{\xi^A}(P)}{\partial \bar\lambda_k} = \frac{1}{2}\, \Phi_{\xi^A (0,1)}(P_k)\, \bigl( B(P, \bar P_k) + \overline{\Omega(P, P_k)} \bigr). \tag{5.9} \]

Then non-vanishing Christoffel symbols of the metric $ds^2_\Phi$ can be expressed as follows in terms of the primary differential $\Phi$ and rotation coefficients $\beta_{ij}$ (3.21):
\[ \begin{aligned}
\Gamma^{j}_{jk} &= \beta_{jk}\, \frac{\Phi_{(1,0)}(P_k)}{\Phi_{(1,0)}(P_j)} = -\Gamma^{j}_{kk}; &\qquad
\Gamma^{\bar j}_{\bar j k} &= \beta_{\bar j k}\, \frac{\Phi_{(1,0)}(P_k)}{\Phi_{(0,1)}(P_j)} = -\Gamma^{\bar j}_{kk}; \\
\Gamma^{j}_{j \bar k} &= \beta_{j \bar k}\, \frac{\Phi_{(0,1)}(P_k)}{\Phi_{(1,0)}(P_j)} = -\Gamma^{j}_{\bar k \bar k}; &\qquad
\Gamma^{\bar j}_{\bar j \bar k} &= \beta_{\bar j \bar k}\, \frac{\Phi_{(0,1)}(P_k)}{\Phi_{(0,1)}(P_j)} = -\Gamma^{\bar j}_{\bar k \bar k}; \\
\Gamma^{j}_{jj} &= -\sum_{l \ne j} \Gamma^{j}_{jl}.
\end{aligned} \tag{5.10} \]
Note that in the last formula, the index of summation $l$ runs through the set $\{1, \dots, L; \bar 1, \dots, \bar L\}$.

Flat coordinates can be found from the system of differential equations (4.12). Due to formulas (5.10), this system can be rewritten as follows:
\[ \partial_{\lambda_j} \partial_{\lambda_k} t = \Gamma^{j}_{jk}\, \partial_{\lambda_j} t + \Gamma^{k}_{jk}\, \partial_{\lambda_k} t, \qquad j \ne k \in \{1, \dots, L; \bar 1, \dots, \bar L\}, \tag{5.11} \]
\[ e(t) = \mathrm{const}. \tag{5.12} \]
Substituting expressions (5.10) for Christoffel symbols into system (5.11) and using Lemma 3, one proves the next theorem by a straightforward computation.

Theorem 6. The following functions (and their linear combinations) satisfy system (5.11) with Christoffel symbols corresponding to the metric $ds^2_\Phi$:
\[ t_1 = \int_{l_1} h_1(\lambda(P))\, \Phi_{(1,0)}(P) \qquad\text{and}\qquad t_2 = \int_{l_2} h_2(\bar\lambda(P))\, \Phi_{(0,1)}(P), \tag{5.13} \]
where $l_1$, $l_2$ are two arbitrary contours on the surface $L$ which do not pass through ramification points and are such that their images $\lambda(l_1)$ and $\lambda(l_2)$ in the $\zeta$-plane do not depend on $\{\lambda_k; \bar\lambda_k\}$; arbitrary functions $h_1$, $h_2$ are defined in some neighbourhoods of $l_1$ and $l_2$, respectively, and are also independent of the coordinates $\{\lambda_k; \bar\lambda_k\}$. The integration is regularized by omitting the divergent part where needed.

Among solutions (5.13) we need to isolate those which satisfy Eq. (5.12), the second part of the system identifying flat coordinates. The operations $I_{\xi^A}$ applied to the differential $\Phi(P)$ give functions of the form (5.13), and it turns out that flat coordinates can be obtained in this way. Namely, the following theorem holds.
Theorem 7. Let $P_0$ be a marked point on $L$ such that $\lambda(P_0) = 0$. Let all the basic cycles $\{a_k, b_k\}_{k=1}^{g}$ start at the point $P_0$. Then the following functions give a set of flat coordinates of the metric $ds^2_\Phi$ (5.5):

For $i = 0, \dots, m$; $\alpha = 1, \dots, n_i$:
\[ t^{i;\alpha} := -(n_i+1)\, I_{t^{i;1+n_i-\alpha}}[\Phi] = \frac{n_i+1}{\alpha - n_i - 1}\, \mathop{\mathrm{res}}_{\infty^i}\, z_i^{\alpha-n_i-1}\, \Phi_{(1,0)}; \]
\[ t^{\overline{i;\alpha}} := -(n_i+1)\, I_{t^{\overline{i;1+n_i-\alpha}}}[\Phi] = \frac{n_i+1}{\alpha - n_i - 1}\, \mathop{\widetilde{\mathrm{res}}}_{\infty^i}\, \bar z_i^{\alpha-n_i-1}\, \Phi_{(0,1)}. \]

For $i = 1, \dots, m$:
\[ v^i := -I_{\omega^i}[\Phi] = -\mathrm{v.p.}\!\int_{\infty^0}^{\infty^i} \Phi_{(1,0)}; \qquad v^{\bar i} := -I_{\omega^{\bar i}}[\Phi] = -\mathrm{v.p.}\!\int_{\infty^0}^{\infty^i} \Phi_{(0,1)}; \]
\[ \omega^i := -I_{v^i}[\Phi] = -\mathop{\mathrm{res}}_{\infty^i}\, \lambda\, \Phi_{(1,0)}; \qquad \omega^{\bar i} := -I_{v^{\bar i}}[\Phi] = -\mathop{\widetilde{\mathrm{res}}}_{\infty^i}\, \bar\lambda\, \Phi_{(0,1)}. \]

For $k = 1, \dots, g$:
\[ r^k := I_{s^k}[\Phi] = \frac{1}{2\pi i} \oint_{b_k} \Phi_{(1,0)}; \qquad s^k := I_{r^k}[\Phi] = -\oint_{a_k} \bigl( \lambda\, \Phi_{(1,0)} + \bar\lambda\, \Phi_{(0,1)} \bigr); \]
\[ u^k := -I_{t^k}[\Phi] = \frac{1}{2\pi i} \oint_{a_k} \Phi_{(1,0)}; \qquad t^k := -I_{u^k}[\Phi] = -\oint_{b_k} \bigl( \lambda\, \Phi_{(1,0)} + \bar\lambda\, \Phi_{(0,1)} \bigr). \]

As before, we use the notation $\widetilde{\mathrm{res}}\, f := \overline{\mathrm{res}\, \bar f}$. Let us denote the flat coordinates by $\xi^A$, i.e. we assume
\[ \xi^A \in \{ t^{i;\alpha}, t^{\overline{i;\alpha}};\ v^i, v^{\bar i}, \omega^i, \omega^{\bar i};\ r^k, u^k, s^k, t^k \} \]
for $i = 0, \dots, m$, $\alpha = 1, \dots, n_i$; $k = 1, \dots, g$ (except $v^0$, $v^{\bar 0}$ and $\omega^0$, $\omega^{\bar 0}$, which do not exist).

Proof. Theorem 6 implies that these functions satisfy Eqs. (5.11). The remaining Eqs. (5.12), $e(\xi^A) = \mathrm{const}$, can be proven by the same reasoning as in the proof of Proposition 9. Note that the action of the unit vector field $e$ (5.2) on a coordinate $\xi^A$ is non-zero if and only if the type of the coordinate coincides with the type of the primary differential which defines the metric. That is, for the metric $ds^2_\Phi$ with $\Phi = \Phi_{\xi^{A_0}}$ the coordinate $\xi^{A_0}$ is naturally marked and we shall denote it by $\xi^1$. One can prove that, for any choice of $\Phi$, the corresponding coordinate $\xi^1$ is such that relations $e(\xi^1) = -1$ and $e(\xi^A) = 0$ hold for $\xi^A \ne \xi^1$. Therefore we have $e = -\partial_{\xi^1}$ (see also Proposition 10 below).

Remark 5. By virtue of the Riemann-Hurwitz formula (see Sect. 3.1), the number of functions listed in the theorem equals $2L$, i.e. coincides with the number of canonical coordinates $\{\lambda_i; \bar\lambda_i\}$.

The next theorem gives an expression of the metric $ds^2_\Phi$ in coordinates $\{\xi^A\}$ and by that shows again that functions $\{\xi^A(\{\lambda_k; \bar\lambda_k\})\}$ are independent and play the role of flat coordinates for the metric.
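The count in Remark 5 can be reproduced by a short bookkeeping sketch. It assumes the standard Riemann-Hurwitz relation for a covering of degree $N = \sum_i (n_i + 1)$ with $L$ simple finite branch points and ramification index $n_i + 1$ at $\infty^i$, namely $2g - 2 = -2N + L + \sum_i n_i$; the precise form of formula (3.1) is not reproduced in this section, so this relation and the function names below are assumptions made for illustration.

```python
def count_flat_coordinates(g, n):
    """Count the functions listed in Theorem 7; n = [n_0, ..., n_m]."""
    m = len(n) - 1
    t_type = 2 * sum(n)   # t^{i;alpha} and their conjugate counterparts
    v_omega = 4 * m       # v^i, v^{i-bar}, omega^i, omega^{i-bar} for i = 1..m
    cycles = 4 * g        # r^k, u^k, s^k, t^k for k = 1..g
    return t_type + v_omega + cycles


def finite_branch_points(g, n):
    """Number L of simple finite branch points from Riemann-Hurwitz (assumed form)."""
    degree = sum(k + 1 for k in n)
    return 2 * g - 2 + 2 * degree - sum(n)


# Example: genus 1, one point at infinity of ramification index 2 (n_0 = 1).
example_count = count_flat_coordinates(1, [1])
example_L = finite_branch_points(1, [1])
```

In this example the covering has degree two and three finite branch points, and Theorem 7 lists six functions, in agreement with $2L$.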
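Theorem 8. In coordinates $\{\xi^A\}$ from Theorem 7 the metric $ds^2_\Phi$ (5.5) is given by a constant matrix whose non-zero entries are the following:
\[ ds^2_\Phi\bigl(\partial_{t^{i;\alpha}}, \partial_{t^{j;\beta}}\bigr) = ds^2_\Phi\bigl(\partial_{t^{\overline{i;\alpha}}}, \partial_{t^{\overline{j;\beta}}}\bigr) = \frac{1}{n_i+1}\, \delta_{ij}\, \delta_{\alpha+\beta,\, n_i+1}, \]
\[ ds^2_\Phi\bigl(\partial_{v^i}, \partial_{\omega^j}\bigr) = ds^2_\Phi\bigl(\partial_{v^{\bar i}}, \partial_{\omega^{\bar j}}\bigr) = \delta_{ij}, \]
\[ ds^2_\Phi\bigl(\partial_{r^i}, \partial_{s^j}\bigr) = -\delta_{ij}, \]
\[ ds^2_\Phi\bigl(\partial_{u^i}, \partial_{t^j}\bigr) = \delta_{ij}. \]
We shall prove this theorem later, after introducing a pairing of differentials (5.24).

To further investigate properties of the flat coordinates let us choose one of the primary differentials $\Phi$ and build a multivalued differential on the surface $L$ as follows:
\[ \Psi_\Phi(P) = \Bigl( \mathrm{v.p.}\!\int_{\infty^0}^{P} \Phi_{(1,0)} \Bigr) d\lambda + \Bigl( \mathrm{v.p.}\!\int_{\infty^0}^{P} \Phi_{(0,1)} \Bigr) d\bar\lambda. \tag{5.14} \]
This differential will play a role similar to the role of the differential $p\,d\lambda$ in the construction of Dubrovin (see formula (4.19) for the prepotential). Note that $\Psi_\Phi(P)$ decomposes into a sum of holomorphic and antiholomorphic differentials: $\Psi_\Phi = \Psi_{\Phi(1,0)} + \Psi_{\Phi(0,1)}$.

Theorem 9. The derivatives of the multivalued differential $\Psi_\Phi$ (5.14) with respect to flat coordinates $\{\xi^A\}$ are given by the corresponding primary differentials:
\[ \frac{\partial \Psi_\Phi}{\partial \xi^A} = \Phi_{\xi^A}. \]

Proof. Consider an expansion of the differential $\Psi_\Phi$ in a neighbourhood of one of the infinities $\infty^i$ on the surface. We omit the singular part which does not depend on coordinates. As before, $z_i$ is a local coordinate in a neighbourhood of the $i$th infinity, $n_i$ is the corresponding ramification index. For $i \ne 0$ we have
\[ \begin{aligned}
\Psi_\Phi(P) \underset{P \sim \infty^i}{=} \text{singular part} &+ \Bigl( v^i (n_i+1)\, z_i^{-n_i-2} + \sum_{\alpha=1}^{n_i} t^{i;\alpha}\, z_i^{-\alpha-1} + \omega^i z_i^{-1} + O(1) \Bigr) dz_i \\
&+ \Bigl( v^{\bar i} (n_i+1)\, \bar z_i^{-n_i-2} + \sum_{\alpha=1}^{n_i} t^{\overline{i;\alpha}}\, \bar z_i^{-\alpha-1} + \omega^{\bar i}\, \bar z_i^{-1} + O(1) \Bigr) d\bar z_i.
\end{aligned} \tag{5.15} \]
We see that the expansion coefficients of the singular part are exactly the flat coordinates of the metric $ds^2_\Phi$. The coordinates $t^{0;\alpha}$, $\alpha = 1, \dots, n_0$ appear similarly in the expansion at the infinity $\infty^0$. The remaining coordinates $\xi^A$ correspond to other characteristics of the multivalued differential $\Psi_\Phi$. Namely, we have
\[ \oint_{a_k} \Psi_\Phi = s^k, \qquad \oint_{b_k} \Psi_\Phi = t^k, \tag{5.16} \]
\[ \Psi_\Phi(P^{a_k}) - \Psi_\Phi(P) = 2\pi i\, u^k\, d\lambda - 2\pi i\, u^k\, d\bar\lambda + \delta_{\Phi, \Phi_{s^k}}\, d\bar\lambda + \delta_{\Phi, \Phi_{u^k}}\, (2\pi i\, d\lambda - 2\pi i\, d\bar\lambda), \tag{5.17} \]
\[ \Psi_\Phi(P^{b_k}) - \Psi_\Phi(P) = 2\pi i\, r^k\, d\lambda - 2\pi i\, r^k\, d\bar\lambda + \delta_{\Phi, \Phi_{t^k}}\, d\bar\lambda + \delta_{\Phi, \Phi_{r^k}}\, (2\pi i\, d\lambda - 2\pi i\, d\bar\lambda), \tag{5.18} \]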
where $\Psi_\Phi(P^{a_k})$, $\Psi_\Phi(P^{b_k})$ stand for the analytic continuation of $\Psi_\Phi(P)$ along the corresponding cycles of the Riemann surface. This parameterization of the differential $\Psi_\Phi$ by the flat coordinates, together with Lemma 2, proves the theorem.

As a corollary we get the following lemma.

Lemma 4. The derivatives of canonical coordinates $\{\lambda_i; \bar\lambda_i\}$ with respect to flat coordinates $\{\xi^A\}$ of the metric $ds^2_\Phi$ are as follows:
\[ \frac{\partial \lambda_i}{\partial \xi^A} = -\frac{\Phi_{\xi^A (1,0)}(P_i)}{\Phi_{(1,0)}(P_i)}, \qquad \frac{\partial \bar\lambda_i}{\partial \xi^A} = -\frac{\Phi_{\xi^A (0,1)}(P_i)}{\Phi_{(0,1)}(P_i)}, \]
where $\Phi(P)$ is the primary differential which defines the metric $ds^2_\Phi$.

Proof. Theorem 9 implies the following relations:
\[ \partial_{\xi^A} \Bigl\{ \Bigl( \int_{\infty^0}^{P} \Phi_{(1,0)} \Bigr) d\lambda \Bigr\} = \Phi_{\xi^A (1,0)}, \qquad \partial_{\xi^A} \Bigl\{ \Bigl( \int_{\infty^0}^{P} \Phi_{(0,1)} \Bigr) d\bar\lambda \Bigr\} = \Phi_{\xi^A (0,1)}. \tag{5.19} \]
(The divergent terms which we omit by taking the principal value of the integrals in a neighbourhood of $\infty^0$ do not depend on $\{\xi^A\}$.) We shall use the so-called thermodynamical identity
\[ \partial_\alpha (f\, dg)\big|_{g = \mathrm{const}} = -\partial_\alpha (g\, df)\big|_{f = \mathrm{const}} \tag{5.20} \]
for $f$ being a function of another function $g$ and some parameters $\{p_\alpha\}$, i.e. $f = f(g; p_1, \dots, p_n)$, where $g$ can be expressed locally as a function of $f$, i.e. $g = g(f; p_1, \dots, p_n)$; $\partial_\alpha$ denotes the derivative with respect to one of the parameters $p = \{p_\alpha\}$. Relation (5.20) can be proven by differentiation of the identity $f(g(f; p); p) \equiv f$ with respect to a parameter $p_\alpha$, which gives $\partial_\alpha g\, df/dg + \partial_\alpha f = 0$. We use the thermodynamical identity (5.20) for functions $f(P) = \int_{\infty^0}^{P} \Phi_{(1,0)}$ and $g(P) = \lambda(P)$ to get
\[ \partial_{\xi^A} \Bigl\{ \Bigl( \int_{\infty^0}^{P} \Phi_{(1,0)} \Bigr) d\lambda \Bigr\} = -\partial_{\xi^A} \{\lambda(P)\}\, \Phi_{(1,0)}(P), \]
and similarly,
\[ \partial_{\xi^A} \Bigl\{ \Bigl( \int_{\infty^0}^{P} \Phi_{(0,1)} \Bigr) d\bar\lambda \Bigr\} = -\partial_{\xi^A} \{\bar\lambda(P)\}\, \Phi_{(0,1)}(P). \]
Evaluating these relations at the critical points $P = P_i$, using that $\lambda'(P_i) = 0$ and equalities (5.19), we prove the lemma.

Proposition 10. The unit vector field (5.2) is a tangent vector field in the direction of one of the flat coordinates. Namely, in flat coordinates of the metric $ds^2_\Phi$ (5.5) corresponding to the primary differential $\Phi = \Phi_{\xi^{A_0}}$, the unit vector of the Frobenius algebra is given by $e = -\partial_{\xi^{A_0}}$. Let us denote the marked coordinate by $\xi^1$ so that $e = -\partial_{\xi^1}$.

Proof. This can be verified by a simple calculation using the chain rule
\[ \partial_{\xi^1} = \sum_{i=1}^{L} \Bigl( \frac{\partial \lambda_i}{\partial \xi^1}\, \partial_{\lambda_i} + \frac{\partial \bar\lambda_i}{\partial \xi^1}\, \partial_{\bar\lambda_i} \Bigr) \]
and expressions for $\partial \lambda_i / \partial \xi^1$ provided by Lemma 4.
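The thermodynamical identity (5.20) used in the proof of Lemma 4 can be tested numerically on a simple example. The sketch below takes the hypothetical function $f(g; p) = g^3 + p\,g$, which is strictly increasing in $g$ for $p > 0$ and can therefore be inverted by bisection, and checks the equivalent form $\partial_p f\,|_{g} + \partial_p g\,|_{f} \cdot df/dg = 0$ with central finite differences. The choice of $f$, the sample point and the step size are assumptions made only for illustration.

```python
def f(g, p):
    """Hypothetical sample function f(g; p), strictly increasing in g for p > 0."""
    return g**3 + p * g


def g_of_f(f_value, p, lo=-10.0, hi=10.0):
    """Invert f(., p) by bisection on [lo, hi]."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid, p) < f_value:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


g0, p0, h = 0.7, 1.3, 1e-6
f0 = f(g0, p0)

# Derivative of f with respect to the parameter at fixed g.
df_dp_fixed_g = (f(g0, p0 + h) - f(g0, p0 - h)) / (2 * h)
# Derivative of the inverse function with respect to the parameter at fixed f.
dg_dp_fixed_f = (g_of_f(f0, p0 + h) - g_of_f(f0, p0 - h)) / (2 * h)
# Derivative of f with respect to g at fixed parameter.
df_dg = 3 * g0**2 + p0

residual = df_dp_fixed_g + dg_dp_fixed_f * df_dg
```

The residual vanishes up to discretization error, which is the content of the relation $\partial_\alpha g\, df/dg + \partial_\alpha f = 0$ quoted in the proof.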
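5.3. Prepotentials of new Frobenius structures. A prepotential of the Frobenius structure which corresponds to a primary differential $\Phi$ is a function $F_\Phi(\{\xi^A\})$ of flat coordinates of the metric $ds^2_\Phi$ such that its third derivatives are given by the tensor $c$ from F3:
\[ \frac{\partial^3 F_\Phi(\xi)}{\partial \xi^A\, \partial \xi^B\, \partial \xi^C} = c(\partial_{\xi^A}, \partial_{\xi^B}, \partial_{\xi^C}) = ds^2_\Phi\bigl(\partial_{\xi^A} \cdot \partial_{\xi^B}, \partial_{\xi^C}\bigr). \tag{5.21} \]
We shall construct a prepotential $F_\Phi$ for each primary differential $\Phi$. This will prove that F3 (symmetry of the tensor $(\nabla_{\xi^A} c)(\partial_{\xi^B}, \partial_{\xi^C}, \partial_{\xi^D})$) holds in our construction.

In order to write an expression for the prepotential we define a new pairing of multivalued differentials as follows. Let $\omega^{(\alpha)}(P)$, $\alpha = 1, 2, \dots$ be a differential on $L$ which can be decomposed into a sum of holomorphic ($\omega^{(\alpha)}_{(1,0)}$) and antiholomorphic ($\omega^{(\alpha)}_{(0,1)}$) parts, $\omega^{(\alpha)} = \omega^{(\alpha)}_{(1,0)} + \omega^{(\alpha)}_{(0,1)}$, which are analytic outside infinities and have the following behaviour at $P \sim \infty^i$ (we write $\lambda$ for $\lambda(P)$, and $z_i = z_i(P)$ for a local parameter, $z_i^{-n_i-1} = \lambda$ at $P \sim \infty^i$):
\[ \begin{aligned}
\omega^{(\alpha)}_{(1,0)}(P) &= \sum_{n=-n_1^{(\alpha)}}^{\infty} c^{(\alpha)}_{n,i}\, z_i^n\, dz_i + \frac{1}{n_i+1}\, d\Bigl( \sum_{n>0} r^{(\alpha)}_{n,i}\, \lambda^n \log\lambda \Bigr), \\
\omega^{(\alpha)}_{(0,1)}(P) &= \sum_{n=-n_2^{(\alpha)}}^{\infty} c^{(\alpha)}_{n,\bar i}\, \bar z_i^n\, d\bar z_i + \frac{1}{n_i+1}\, d\Bigl( \sum_{n>0} r^{(\alpha)}_{n,\bar i}\, \bar\lambda^n \log\bar\lambda \Bigr),
\end{aligned} \tag{5.22} \]
where $n_1^{(\alpha)}, n_2^{(\alpha)} \in \mathbb{Z}$; and $c^{(\alpha)}_{n,i}$, $r^{(\alpha)}_{n,i}$, $c^{(\alpha)}_{n,\bar i}$, $r^{(\alpha)}_{n,\bar i}$ are some coefficients. Denote also for $k = 1, \dots, g$:
\[ \begin{aligned}
A^{(\alpha)}_k &:= \oint_{a_k} \omega^{(\alpha)}, & B^{(\alpha)}_k &:= \oint_{b_k} \omega^{(\alpha)}, \\
dp^{(\alpha)}_k(\lambda(P)) &:= \omega^{(\alpha)}_{(1,0)}(P^{a_k}) - \omega^{(\alpha)}_{(1,0)}(P), & p^{(\alpha)}_k(\lambda) &= \sum_{s>0} p^{(\alpha)}_{sk}\, \lambda^s, \\
dp^{(\alpha)}_{\bar k}(\bar\lambda(P)) &:= \omega^{(\alpha)}_{(0,1)}(P^{a_k}) - \omega^{(\alpha)}_{(0,1)}(P), & p^{(\alpha)}_{\bar k}(\bar\lambda) &= \sum_{s>0} p^{(\alpha)}_{\bar s \bar k}\, \bar\lambda^s, \\
dq^{(\alpha)}_k(\lambda(P)) &:= \omega^{(\alpha)}_{(1,0)}(P^{b_k}) - \omega^{(\alpha)}_{(1,0)}(P), & q^{(\alpha)}_k(\lambda) &= \sum_{s>0} q^{(\alpha)}_{sk}\, \lambda^s, \\
dq^{(\alpha)}_{\bar k}(\bar\lambda(P)) &:= \omega^{(\alpha)}_{(0,1)}(P^{b_k}) - \omega^{(\alpha)}_{(0,1)}(P), & q^{(\alpha)}_{\bar k}(\bar\lambda) &= \sum_{s>0} q^{(\alpha)}_{\bar s \bar k}\, \bar\lambda^s.
\end{aligned} \tag{5.23} \]
Note that all primary differentials and the differential $\Psi_\Phi(P)$ have singularity structures which are described by (5.22) – (5.23). For $\omega^{(\alpha)}$ being one of the primary differentials, the coefficients $c_{n,i}$, $r_{n,i}$, $c_{n,\bar i}$, $r_{n,\bar i}$, $A_k$, $B_k$, $p_{sk}$, $q_{sk}$, $p_{\bar s \bar k}$, $q_{\bar s \bar k}$ do not depend on coordinates on the Hurwitz space.

Definition 6. For two differentials $\omega^{(\alpha)}$, $\omega^{(\beta)}$ having singularities of the type (5.22), (5.23), we define the pairing $\mathcal{F}[\ ,\ ]$ as follows: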
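\[ \begin{aligned}
\mathcal{F}[\omega^{(\alpha)}, \omega^{(\beta)}] = \sum_{i=0}^{m} \Bigl( &\sum_{n \ge 0} \frac{c^{(\alpha)}_{-n-2,i}}{n+1}\, c^{(\beta)}_{n,i} + c^{(\alpha)}_{-1,i}\ \mathrm{v.p.}\!\int_{P_0}^{\infty^i} \omega^{(\beta)}_{(1,0)} - \mathrm{v.p.}\!\int_{P_0}^{\infty^i} \sum_{n>0} r^{(\alpha)}_{n,i}\, \lambda^n\, \omega^{(\beta)}_{(1,0)} \\
+ &\sum_{n \ge 0} \frac{c^{(\alpha)}_{-n-2,\bar i}}{n+1}\, c^{(\beta)}_{n,\bar i} + c^{(\alpha)}_{-1,\bar i}\ \mathrm{v.p.}\!\int_{P_0}^{\infty^i} \omega^{(\beta)}_{(0,1)} - \mathrm{v.p.}\!\int_{P_0}^{\infty^i} \sum_{n>0} r^{(\alpha)}_{n,\bar i}\, \bar\lambda^n\, \omega^{(\beta)}_{(0,1)} \Bigr) \\
+ \frac{1}{2\pi i} \sum_{k=1}^{g} \Bigl( &-\oint_{a_k} q^{(\alpha)}_k(\lambda)\, \omega^{(\beta)}_{(1,0)} + \oint_{a_k} q^{(\alpha)}_{\bar k}(\bar\lambda)\, \omega^{(\beta)}_{(0,1)} + \oint_{b_k} p^{(\alpha)}_k(\lambda)\, \omega^{(\beta)}_{(1,0)} \\
&- \oint_{b_k} p^{(\alpha)}_{\bar k}(\bar\lambda)\, \omega^{(\beta)}_{(0,1)} + A^{(\alpha)}_k \oint_{b_k} \omega^{(\beta)}_{(1,0)} - B^{(\alpha)}_k \oint_{a_k} \omega^{(\beta)}_{(1,0)} \Bigr).
\end{aligned} \tag{5.24} \]
As before, $P_0$ is the marked point on $L$ such that $\lambda(P_0) = 0$, and the cycles $\{a_k, b_k\}$ all pass through $P_0$. From this definition one can see that if the first differential in the pairing is one of the primary differentials $\Phi_{\xi^A}$ then this pairing gives the corresponding operation $I_{\xi^A}$ applied to the second differential:
\[ \mathcal{F}[\Phi_{\xi^A}, \omega] = I_{\xi^A}[\omega]. \tag{5.25} \]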
Theorem 10. The pairing (5.24) is commutative for all primary differentials except for differentials $\Phi_{t^k}$ and $\Phi_{s^k}$, $k = 1, \dots, g$, which commute up to a constant:
\[ \mathcal{F}[\Phi_{s^k}, \Phi_{t^k}] = \mathcal{F}[\Phi_{t^k}, \Phi_{s^k}] - \frac{1}{2\pi i}. \tag{5.26} \]

Proof. Due to the relation (5.25) we should compare the action of superpositions of operations $I_{\xi^A} I_{\xi^B}$ and $I_{\xi^B} I_{\xi^A}$ on the sum of Schiffer and Bergman kernels. This sum is only singular when the points $P$ and $Q$ coincide. Therefore among the operations $I_{\omega^i}$, $I_{\omega^{\bar i}}$, $I_{r^k}$, $I_{u^k}$, $I_{s^k}$, $I_{t^k}$ those commute, when applied to $\Omega(P,Q) + B(\bar P, Q)$, which are given by integrals over non-intersecting contours on the surface. In the set of contours used in the definition of the operations $I_{\xi^A}$, the only contours that intersect each other are the basis cycles $a_k$ and $b_k$. A simple local calculation in a neighbourhood of the intersection point $P_0$ shows that the order of integration can be changed in the integral $\oint_{a_k}\oint_{b_k} \lambda(P)\,\Omega(P,Q)$ due to the assumption $\lambda(P_0) = 0$. Therefore the only noncommuting operations, among those mentioned above, are $I_{s^k}$ and $I_{t^k}$. The difference in (5.26) can be computed using formulas (3.3) for integrals of the bidifferential $W(P,Q)$ over a- and b-cycles.

By a similar reasoning one can see that operations of the type $I_{t^{i;\alpha}}$, $I_{t^{\overline{i;\alpha}}}$, $I_{v^i}$ and $I_{v^{\bar i}}$ for $i = 0, \dots, m$, $\alpha = 1, \dots, n_i$ commute with the previous ones. They commute with each other due to the symmetry properties of the kernels.

Now we are in a position to prove Theorem 8, which gives the metric $ds^2_\Phi$ in flat coordinates.

Proof of Theorem 8. For computation of the metric on vectors $\partial_{\xi^A}$ we shall use the relation
\[ ds^2_\Phi(\partial_{\xi^A}, \partial_{\xi^B}) = e\bigl( \mathcal{F}[\Phi_{\xi^A}, \Phi_{\xi^B}] \bigr), \tag{5.27} \]
which we prove first.
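Using Lemma 4, we express the vectors $\partial_{\xi^A}$ via canonical tangent vectors:
\[ \partial_{\xi^A} = -\sum_{i=1}^{L} \Bigl( \frac{\Phi_{\xi^A (1,0)}(P_i)}{\Phi_{(1,0)}(P_i)}\, \partial_{\lambda_i} + \frac{\Phi_{\xi^A (0,1)}(P_i)}{\Phi_{(0,1)}(P_i)}\, \partial_{\bar\lambda_i} \Bigr). \tag{5.28} \]
Therefore for the metric (5.5) we obtain:
\[ ds^2_\Phi(\partial_{\xi^A}, \partial_{\xi^B}) = \frac{1}{2} \sum_{i=1}^{L} \Bigl( \Phi_{\xi^A (1,0)}(P_i)\, \Phi_{\xi^B (1,0)}(P_i) + \Phi_{\xi^A (0,1)}(P_i)\, \Phi_{\xi^B (0,1)}(P_i) \Bigr). \tag{5.29} \]
For computation of the right-hand side of (5.27) we note that, in the pairing of two primary differentials, only the contribution of the second one depends on coordinates, therefore we have
\[ e\bigl( \mathcal{F}[\Phi_{\xi^A}, \Phi_{\xi^B}] \bigr) = \mathcal{F}[\Phi_{\xi^A}, e(\Phi_{\xi^B})]. \tag{5.30} \]
The action of the vector field $e$ on primary differentials is provided by Lemma 3. From (5.25) we know that the pairing in the right side of (5.30) is just the operation $I_{\xi^A}$ applied to $e(\Phi_{\xi^B})$. Therefore in the right-hand side of (5.27) we have
\[ \begin{aligned}
&\frac{1}{2} \sum_{i=1}^{L} \Bigl( \Phi_{\xi^B (1,0)}(P_i)\, I_{\xi^A}\bigl[ \Omega(P, P_i) + B(\bar P, P_i) \bigr] + \Phi_{\xi^B (0,1)}(P_i)\, I_{\xi^A}\bigl[ B(P, \bar P_i) + \overline{\Omega(P, P_i)} \bigr] \Bigr) \\
&\qquad = \frac{1}{2} \sum_{i=1}^{L} \Bigl( \Phi_{\xi^A (1,0)}(P_i)\, \Phi_{\xi^B (1,0)}(P_i) + \Phi_{\xi^A (0,1)}(P_i)\, \Phi_{\xi^B (0,1)}(P_i) \Bigr).
\end{aligned} \tag{5.31} \]
Together with (5.29), this proves (5.27).

Now let us compute $ds^2_\Phi(\partial_{r^i}, \partial_{\xi^A})$. According to (5.27) we need to compute the action of the unit field $e$ on the following quantity:
\[ \mathcal{F}[\Phi_{r^i}, \Phi_{\xi^A}] \equiv I_{r^i}[\Phi_{\xi^A}] = -\oint_{a_i} \lambda(P)\, \Phi_{\xi^A (1,0)}(P) - \oint_{a_i} \bar\lambda(P)\, \Phi_{\xi^A (0,1)}(P). \]
Let us again consider the biholomorphic map of the covering $L_\lambda \to L_\lambda^\delta$ performed by a simultaneous $\delta$-shift ($\delta \in \mathbb{R}$) of the points on all sheets (see the proof of Proposition 9). Since
\[ \Phi^\delta_{\xi^A}(P^\delta) = \Phi_{\xi^A}(P) \tag{5.32} \]
we get
\[ e\bigl( \mathcal{F}[\Phi_{r^i}, \Phi_{\xi^A}] \bigr) = \frac{d}{d\delta}\Big|_{\delta=0} \Bigl\{ -\oint_{a_i} \bigl(\lambda(P) + \delta\bigr)\, \Phi_{\xi^A (1,0)}(P) - \oint_{a_i} \bigl(\bar\lambda(P) + \delta\bigr)\, \Phi_{\xi^A (0,1)}(P) \Bigr\} = -\oint_{a_i} \Phi_{\xi^A}(P) = -\delta_{\xi^A, s^i}. \]
Therefore $ds^2_\Phi(\partial_{r^i}, \partial_{\xi^A}) = -\delta_{\xi^A, s^i}$. Analogously we prove that $ds^2_\Phi(\partial_{u^i}, \partial_{\xi^A}) = \delta_{\xi^A, t^i}$.

To compute the remaining coefficients of the metric consider the operator $D_e = \frac{\partial}{\partial\lambda} + e$. It annihilates any primary differential:
\[ D_e\, \Phi_{\xi^A}(P) = 0 \tag{5.33} \]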
V. Shramchenko
as can be proven by differentiation d/dδ|δ=0 of the equality (5.32). Therefore, applying the operator De to the expansion of the multivalued differential ξ A near the point ∞i , we obtain the following relation for the corresponding (see (5.22)) coefficients cl,i : A l + 1 ξ A e cl,iξ = . c ni + 1 l−ni −1,i Therefore we have
ds2 ∂t i;α , ∂ξ A = e It i;α [ξ A ] = e
1 ξ A c α α−1,i
=
1 δ A i;1+ni −α , ni + 1 ξ ,t
A A
ξ = δξ A ,ωi . Thus, we computed the entries of the and ds2 ∂v i , ∂ξ A = e cniξ,i = c−1,i matrix listed in the theorem and proved that they are the only non-zero ones. Formulas (5.28) and (5.29) yield the following expression for the tensor c = ds2 (∂ξ A · ∂ξ B , ∂ξ C ) (compare with expression (4.20) for the tensor c of Dubrovin’s construction): c(∂ξ A , ∂ξ B , ∂ξ C ) = −
L 1 ξ A (1,0) (Pi )ξ B (1,0) (Pi )ξ C (1,0) (Pi ) 2 (1,0) (Pi ) i=1 ξ A (0,1) (Pi )ξ B (0,1) (Pi )ξ C (0,1) (Pi ) . + (0,1) (Pi )
(5.34)
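The entries $ds^2_\Phi(\partial_{t^{i;\alpha}}, \partial_{t^{i;\beta}})$ obtained in the proof of Theorem 8 follow from two ingredients only: the leading singular coefficient of the differential of type $t^{i;\beta}$, which behaves like $z_i^{-\beta-1}dz_i$ near $\infty^i$, and the relation for $e(c_{l,i})$ above. The following sketch replays this bookkeeping in exact arithmetic for a single infinity. The function names are hypothetical, and only negative powers of $z_i$ are modelled, which suffices because the index $\alpha - n_i - 2$ is always negative for $\alpha \le n_i$.

```python
from fractions import Fraction


def singular_coeff(beta, l):
    """Coefficient of z^l dz (l < 0) in a differential ~ (z^{-beta-1} + O(1)) dz."""
    return 1 if l == -beta - 1 else 0


def metric_entry(alpha, beta, n_i):
    """e((1/alpha) c_{alpha-1}) via e(c_l) = (l+1)/(n_i+1) * c_{l-n_i-1}."""
    l = alpha - 1
    e_c = Fraction(l + 1, n_i + 1) * singular_coeff(beta, l - n_i - 1)
    return Fraction(1, alpha) * e_c


# Gram matrix of the t-type coordinates at one infinity with n_i = 3.
n_i = 3
gram = [[metric_entry(a, b, n_i) for b in range(1, n_i + 1)]
        for a in range(1, n_i + 1)]
```

The resulting matrix is anti-diagonal with entries $1/(n_i+1)$, i.e. it is non-zero exactly when $\alpha + \beta = n_i + 1$, as stated in Theorem 8.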
The next theorem gives a prepotential of the Frobenius manifold, a function of flat coordinates $\{\xi^A\}$, which, according to Theorem 1, solves the WDVV system.

Theorem 11. For each primary differential $\Phi$ consider the differential $\Psi_\Phi(P)$ (5.14), multivalued on the surface $L$. For the Frobenius structure defined on the manifold $\widehat{M}_{g;n_0,\dots,n_m}(\{\lambda_i; \bar\lambda_i\})$ by the metric $ds^2_\Phi$ (5.5), multiplication law (5.1), and Euler field (5.3), the prepotential $F_\Phi$ is given by the pairing (5.24) of the differential $\Psi_\Phi$ with itself:
\[ F_\Phi = \frac{1}{2}\, \mathcal{F}[\Psi_\Phi, \Psi_\Phi]. \tag{5.35} \]
The second order derivatives of the prepotential are given by
\[ \partial_{\xi^A} \partial_{\xi^B} F_\Phi = \mathcal{F}[\Phi_{\xi^A}, \Phi_{\xi^B}] - \frac{1}{4\pi i}\, \delta_{\xi^A, s^k}\, \delta_{\xi^B, t^k} + \frac{1}{4\pi i}\, \delta_{\xi^A, t^k}\, \delta_{\xi^B, s^k}, \tag{5.36} \]
where $\delta$ is the Kronecker symbol.

Proof. To prove that the function $F_\Phi$ is a prepotential we need to check that its third order derivatives coincide with the tensor $c$ (5.34). We shall first prove that the second derivatives have the form (5.36) and then differentiate them with respect to a flat coordinate $\xi^C$. The first differentiation of $F_\Phi$ with respect to a flat coordinate gives:
\[ \partial_{\xi^A} F_\Phi = \frac{1}{2}\, \mathcal{F}[\Phi_{\xi^A}, \Psi_\Phi] + \frac{1}{2}\, \mathcal{F}[\Psi_\Phi, \Phi_{\xi^A}]. \tag{5.37} \]
The first term in the right side of (5.37) equals $\frac{1}{2} I_{\xi^A}[\Psi_\Phi]$ (see (5.25)). Consider the second term. From expansions (5.15) of the multivalued differential $\Psi_\Phi$ and its integrals
and transformations (5.16)–(5.18) over basis cycles we know that the coefficients for which enter formula (5.24) for the pairing are nothing but the flat coordinates of ds2 . Therefore, writing explicitly the singular part in expansions (5.15) and using also (5.16)–(5.18), we have for the second term in (5.37): F[, ξ A ] ni m v i (1 − δi0 )Iv i [ξ A ] + t i;α It i;α [ξ A ] + ωi Iωi [ξ A ] = i=0
α=1
Iv 0 [ξ A ] n0 + 1
ωi
i=0
−
Iv i [ξ A ]
+ δ, i Iωi [λξ A (1,0) ] ω ni + 1 n i α(ni + 1) 1 − δ, i Iv i [λξ A (1,0) ] − δ, i;α It i;α [λξ A (1,0) ] v t 2 n +1+α α=1 i ni m ¯ ¯ + v i (1 − δi0 )Iv i¯ [ξ A ] + t i;α It i;α [ξ A ] + ωi Iωi¯ [ξ A ] +δ,
α=1
Iv 0¯ [ξ A ]
Iv i¯ [ξ A ]
¯ ξ A (0,1) ] + δ, i¯ Iωi¯ [λ ω ni + 1 ni 1 α(ni + 1) ¯ ¯ − δ, i¯ Iv i¯ [λξ A (0,1) ] − δ, i;α It i;α [λξ A (0,1) ] t v 2 ni + 1 + α +δ,
ωi¯
n0 + 1
−
α=1
+ r k Ir k [ξ A ] + uk Iuk [ξ A ] + s k Is k [ξ A ] + t k It k [ξ A ] k=1
1 1 ¯ ξ A (0,1) ] + δ, Iuk [λξ A (1,0) + λ ¯ ξ A (0,1) ] + δ, kIr k [λξ A (1,0) + λ r uk 2 2 1 1 ¯ ξ A (0,1) + ¯ ξ A (0,1) . (5.38) + δ, k λ λ δ, k s t 2πi 2πi bk ak Here the Kronecker symbol, for example, δ, i is equal to one if the primary differential ω
(which defines the metric $ds^2_\Phi$ and the differential $\Psi_\Phi$) is $\Phi_{\omega^i}$. Suppose the primary differential $\Phi_{\xi^A}$ is of the types 1, 3, 5, i.e. suppose $\xi^A \in \{t^{i;\alpha}, v^i, \omega^i\}$. Then $\Phi_{\xi^A}(P) = I_{\xi^A}[\Omega(P,Q) + B(\bar P,Q)]$. In this case the operation $I_{\xi^A}$ commutes with all the others (see Theorem 10). Therefore we can rewrite (5.38) as an action of $I_{\xi^A}$ on some differential which depends on $\lambda(Q)$ only (and does not depend on $\bar\lambda(Q)$): $F[\Psi_\Phi,\Phi_{\xi^A}] = I_{\xi^A}[\tilde\Psi_{(1,0)}(Q)]$. Analogously, we find that for primary differentials of the types 2, 4, 6, when $\xi^A \in \{t^{\bar i;\alpha}, v^{\bar i}, \omega^{\bar i}\}$, the right-hand side in (5.38) is equal to the action of $I_{\xi^A}$ on a differential depending only on $\bar\lambda(Q)$, i.e. $F[\Psi_\Phi,\Phi_{\xi^A}] = I_{\xi^A}[\tilde\Psi_{(0,1)}(Q)]$. Examining the properties of the differential $\tilde\Psi_{(1,0)}(Q) + \tilde\Psi_{(0,1)}(Q)$ such as singularities, behaviour under analytic continuation along cycles $\{a_k, b_k\}$ and integrals over these cycles, we obtain with the help of Lemma 2: $\Psi_\Phi(Q) = \tilde\Psi_{(1,0)}(Q) + \tilde\Psi_{(0,1)}(Q)$, and therefore $\tilde\Psi_{(1,0)}(Q) = \Psi_{\Phi(1,0)}(Q)$, $\tilde\Psi_{(0,1)}(Q) = \Psi_{\Phi(0,1)}(Q)$. Hence, for primary differentials of the types 1-6 we have
$$F[\Psi_\Phi,\Phi_{\xi^A}] = I_{\xi^A}[\Psi_\Phi]. \tag{5.39}$$
Similarly, for differentials $\Phi_{r^k}$ and $\Phi_{u^k}$, we get
$$F[\Psi_\Phi,\Phi_{r^k}] = -\oint_{a_k}\lambda(Q)\,\tilde\Psi_{(1,0)}(Q) - \oint_{a_k}\bar\lambda(Q)\,\tilde\Psi_{(0,1)}(Q),$$
$$F[\Psi_\Phi,\Phi_{u^k}] = -\oint_{b_k}\lambda(Q)\,\tilde\Psi_{(1,0)}(Q) - \oint_{b_k}\bar\lambda(Q)\,\tilde\Psi_{(0,1)}(Q),$$
which proves that (5.39) also holds for $\xi^A \in \{r^k, u^k\}$. Formula (5.39) changes for the primary differentials $\Phi_{s^k}$ and $\Phi_{t^k}$: additional terms appear due to non-commutativity of the corresponding operations (Theorem 10):
$$F[\Psi_\Phi,\Phi_{s^k}] = I_{s^k}[\Psi_\Phi] - \frac{t^k}{2\pi i}; \qquad F[\Psi_\Phi,\Phi_{t^k}] = I_{t^k}[\Psi_\Phi] + \frac{s^k}{2\pi i}.$$
Coming back to the differentiation (5.37) of the function $F_\Phi$, we have
$$\partial_{\xi^A}F_\Phi = F[\Phi_{\xi^A},\Psi_\Phi] - \delta_{\xi^A,s^k}\,\frac{t^k}{4\pi i} + \delta_{\xi^A,t^k}\,\frac{s^k}{4\pi i}. \tag{5.40}$$
Note that the contribution of the primary differential $\Phi_{\xi^A}$ into the pairing $F[\Phi_{\xi^A},\Psi_\Phi]$ does not depend on coordinates. Therefore, by virtue of Theorem 9, the differentiation of (5.40) with respect to $\xi^B$ gives the expression (5.36) for second derivatives of the function $F_\Phi$. To find third derivatives of $F_\Phi$ we differentiate (5.36) with respect to a flat coordinate $\xi^C$:
$$\partial_{\xi^C}\partial_{\xi^B}\partial_{\xi^A}F_\Phi = F[\Phi_{\xi^A},\partial_{\xi^C}\Phi_{\xi^B}] = I_{\xi^A}[\partial_{\xi^C}\Phi_{\xi^B}]. \tag{5.41}$$
Then we express the vector $\partial_{\xi^C}$ via canonical tangent vectors $\{\partial_{\lambda_i}\}$ as in (5.28) and use formulas from Lemma 3 for derivatives of primary differentials. Analogously to the computation (5.31) we find that derivatives (5.41) are given by the right-hand side of (5.34), i.e. are equal to the 3-tensor $c(\partial_{\xi^C},\partial_{\xi^B},\partial_{\xi^A})$.

Thus, by proving that the function $F_\Phi$ given by (5.35) is a prepotential (see Definition 4) we completed the construction of the Frobenius manifold corresponding to the primary differential $\Phi$ on the space $\widehat{M}_{g;n_0,\dots,n_m}$. Let us denote this manifold by $\widehat{M}^\Phi = \widehat{M}^\Phi_{g;n_0,\dots,n_m}$.

5.4. Quasihomogeneity. Now we shall show that the prepotential $F_\Phi$ (5.35) is a quasihomogeneous function of flat coordinates (see (2.1)). According to Theorem 1, the prepotential satisfies
$$E(F_\Phi) = \nu_F F_\Phi + \text{quadratic terms}. \tag{5.42}$$
In the next proposition we prove that the vector field E has the form (2.2), i.e.
$$E = \sum_A \nu_A\,\xi^A\,\partial_{\xi^A}, \tag{5.43}$$
and compute the coefficients $\{\nu_A\}$.
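Proposition 11. In flat coordinates $\{\xi^A\}$ of the metric $ds^2_\Phi$, the Euler vector field (5.3) has the form (5.43) (and therefore is covariantly linear) with coefficients $\{\nu_A\}$ depending on the choice of a primary differential $\Phi$ as follows:

• if $\Phi = \Phi_{t^{i_o;\alpha}}$ or $\Phi = \Phi_{t^{\bar i_o;\alpha}}$ then
$$E = \sum_{i=0}^{m}\sum_{\alpha=1}^{n_i}\Big(1 + \frac{\alpha}{n_{i_o}+1} - \frac{\alpha}{n_i+1}\Big)\big(t^{i;\alpha}\partial_{t^{i;\alpha}} + t^{\bar i;\alpha}\partial_{t^{\bar i;\alpha}}\big) + \sum_{i=1}^{m}\Big(\frac{\alpha}{n_{i_o}+1}\big(v^i\partial_{v^i} + v^{\bar i}\partial_{v^{\bar i}}\big) + \Big(1 + \frac{\alpha}{n_{i_o}+1}\Big)\big(\omega^i\partial_{\omega^i} + \omega^{\bar i}\partial_{\omega^{\bar i}}\big)\Big) + \sum_{k=1}^{g}\Big(\frac{\alpha}{n_{i_o}+1}\big(r^k\partial_{r^k} + u^k\partial_{u^k}\big) + \Big(1 + \frac{\alpha}{n_{i_o}+1}\Big)\big(s^k\partial_{s^k} + t^k\partial_{t^k}\big)\Big)$$

• if $\Phi = \Phi_{v^{i_o}}$, $\Phi = \Phi_{v^{\bar i_o}}$, $\Phi = \Phi_{r^{k_o}}$ or $\Phi = \Phi_{u^{k_o}}$ then
$$E = \sum_{i=0}^{m}\sum_{\alpha=1}^{n_i}\Big(2 - \frac{\alpha}{n_i+1}\Big)\big(t^{i;\alpha}\partial_{t^{i;\alpha}} + t^{\bar i;\alpha}\partial_{t^{\bar i;\alpha}}\big) + \sum_{i=1}^{m}\Big(v^i\partial_{v^i} + v^{\bar i}\partial_{v^{\bar i}} + 2\big(\omega^i\partial_{\omega^i} + \omega^{\bar i}\partial_{\omega^{\bar i}}\big)\Big) + \sum_{k=1}^{g}\Big(r^k\partial_{r^k} + u^k\partial_{u^k} + 2\big(s^k\partial_{s^k} + t^k\partial_{t^k}\big)\Big)$$

• if $\Phi = \Phi_{\omega^{i_o}}$, $\Phi = \Phi_{\omega^{\bar i_o}}$, $\Phi = \Phi_{s^{k_o}}$ or $\Phi = \Phi_{t^{k_o}}$ then
$$E = \sum_{i=0}^{m}\sum_{\alpha=1}^{n_i}\Big(1 - \frac{\alpha}{n_i+1}\Big)\big(t^{i;\alpha}\partial_{t^{i;\alpha}} + t^{\bar i;\alpha}\partial_{t^{\bar i;\alpha}}\big) + \sum_{i=1}^{m}\big(\omega^i\partial_{\omega^i} + \omega^{\bar i}\partial_{\omega^{\bar i}}\big) + \sum_{k=1}^{g}\big(s^k\partial_{s^k} + t^k\partial_{t^k}\big).$$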
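The three cases of Proposition 11 differ only by a constant shift of all coefficients, so they can be tabulated with one small routine. The following is a minimal sketch, not part of the original construction: the function name `euler_weights` and the parameter `shift` are hypothetical, with `shift` standing for $\alpha/(n_{i_o}+1)$, 1 or 0 in the first, second and third case respectively.

```python
from fractions import Fraction

def euler_weights(n, g, shift):
    """Coefficients nu_A of Proposition 11 (illustrative sketch).

    n     -- list [n_0, ..., n_m] attached to the points at infinity
    g     -- genus of the covering surface
    shift -- alpha/(n_io + 1), 1 or 0, depending on the type of the
             primary differential (hypothetical parameter name)
    """
    shift = Fraction(shift)
    weights = {}
    # Coordinates t^{i;alpha} and their barred partners, i = 0..m.
    for i, n_i in enumerate(n):
        for alpha in range(1, n_i + 1):
            w = 1 + shift - Fraction(alpha, n_i + 1)
            weights[f"t^({i};{alpha})"] = w
            weights[f"t^(bar {i};{alpha})"] = w
    # Coordinates v^i, omega^i and their barred partners, i = 1..m.
    for i in range(1, len(n)):
        for name in (f"v^{i}", f"v^(bar {i})"):
            weights[name] = shift
        for name in (f"omega^{i}", f"omega^(bar {i})"):
            weights[name] = 1 + shift
    # Coordinates r^k, u^k, s^k, t^k, k = 1..g.
    for k in range(1, g + 1):
        weights[f"r^{k}"] = weights[f"u^{k}"] = shift
        weights[f"s^{k}"] = weights[f"t^{k}"] = 1 + shift
    return weights

# Genus one, a single point at infinity with n_0 = 1, third case (shift = 0).
genus_one = euler_weights(n=[1], g=1, shift=0)
for name, value in genus_one.items():
    print(name, value)
```

For this choice the routine returns 1/2 for the two coordinates of type $t^{0;1}$, 1 for $s^1$ and $t^1$, and 0 for $r^1$ and $u^1$, which agrees with the factors listed in (7.14) for the genus one example.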
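Proof. Let us compute the action of the Euler vector field on a flat coordinate $\xi^A$. Consider again the biholomorphic map $L_\lambda \to L_{\lambda^\epsilon}$ defined by the transformation $P \to P^\epsilon$ on L such that $\lambda(P^\epsilon) = \lambda(P)(1+\epsilon)$, $\epsilon \in \mathbb{R}$, performed on every sheet of the covering $L_\lambda$. Since the kernels $\Omega$ and B are invariant under this map, the primary differentials transform as follows:

for $\Phi = \Phi_{t^{i;\alpha}}$ or $\Phi = \Phi_{t^{\bar i;\alpha}}$: $\Phi^\epsilon(P^\epsilon) = (1+\epsilon)^{\frac{\alpha}{n_i+1}}\,\Phi(P)$,
for $\Phi = \Phi_{v^i}$, $\Phi = \Phi_{v^{\bar i}}$, $\Phi = \Phi_{r^k}$ or $\Phi = \Phi_{u^k}$: $\Phi^\epsilon(P^\epsilon) = (1+\epsilon)\,\Phi(P)$,
for $\Phi = \Phi_{\omega^i}$, $\Phi = \Phi_{\omega^{\bar i}}$, $\Phi = \Phi_{s^k}$ or $\Phi = \Phi_{t^k}$: $\Phi^\epsilon(P^\epsilon) = \Phi(P)$,

where $\Phi^\epsilon$ is the corresponding differential on the covering $L_{\lambda^\epsilon}$.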
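Let us choose, for example, the primary differential $\Phi_{t^{i_o;\alpha}}$. Flat coordinates of the metric $ds^2_{\Phi_{t^{i_o;\alpha}}}$ are functions of $\{\lambda_j\}$ and $\{\bar\lambda_j\}$ only. If we consider corresponding functions on $L_{\lambda^\epsilon}$ and differentiate them with respect to $\epsilon$ at $\epsilon = 0$, we get the action of the vector field E (5.3) on the flat coordinates:
$$E(t^{i;\alpha}) = \frac{d}{d\epsilon}\Big|_{\epsilon=0}\,\frac{n_i+1}{\alpha-n_i-1}\,\operatorname*{res}_{\infty^i}\,\big(\lambda(P^\epsilon)\big)^{\frac{n_i+1-\alpha}{n_i+1}}\,\Phi^\epsilon_{t^{i_o;\alpha}(1,0)}(P^\epsilon) = \frac{d}{d\epsilon}\Big|_{\epsilon=0}(1+\epsilon)^{\frac{n_i+1-\alpha}{n_i+1}+\frac{\alpha}{n_{i_o}+1}}\,t^{i;\alpha} = \Big(1 - \frac{\alpha}{n_i+1} + \frac{\alpha}{n_{i_o}+1}\Big)t^{i;\alpha}.$$
Therefore the vector field E depends on the coordinate $t^{i;\alpha}$ as $E = \big(1 - \frac{\alpha}{n_i+1} + \frac{\alpha}{n_{i_o}+1}\big)\,t^{i;\alpha}\,\partial_{t^{i;\alpha}} + \cdots$. Similarly we compute the dependence on the other flat coordinates.

The action (5.42) of the Euler field (5.43) on the prepotential $F_\Phi$ is equivalent to the condition of quasihomogeneity for $F_\Phi$, i.e.
$$F_\Phi(\kappa^{\nu_1}\xi^1,\dots,\kappa^{\nu_{2L}}\xi^{2L}) = \kappa^{\nu_F}F_\Phi(\xi^1,\dots,\xi^{2L}) + \text{quadratic terms}$$
with the coefficients of quasihomogeneity $\{\nu_A\}$ computed in Proposition 11. As for the coefficient $\nu_F$, the proof of Theorem 1 implies that $\nu_F = 3 - \nu$, where the charge $\nu$ of a Frobenius manifold was computed in Proposition 7. Thus, we have

for $\Phi = \Phi_{t^{i;\alpha}}$ or $\Phi = \Phi_{t^{\bar i;\alpha}}$: $\nu = 1 - \frac{2\alpha}{n_i+1}$, $\nu_F = \frac{2\alpha}{n_i+1} + 2$,
for $\Phi = \Phi_{v^i}$, $\Phi = \Phi_{v^{\bar i}}$, $\Phi = \Phi_{r^k}$ or $\Phi = \Phi_{u^k}$: $\nu = -1$, $\nu_F = 4$,
for $\Phi = \Phi_{\omega^i}$, $\Phi = \Phi_{\omega^{\bar i}}$, $\Phi = \Phi_{s^k}$ or $\Phi = \Phi_{t^k}$: $\nu = 1$, $\nu_F = 2$.

Remark 6. The described construction also holds for the differential $\Phi$ being a linear combination of the primary differentials which correspond to the same charge $\nu$. In other words, the differential $\Phi$ which defines a Frobenius structure can be one of the following:

1. $\Phi = \sigma_{i;\alpha}\Phi_{t^{i;\alpha}} + \sigma_{\bar i;\alpha}\Phi_{t^{\bar i;\alpha}}$ for some pair $(i;\alpha)$: $i \in \{0,\dots,m\}$, $\alpha \in \{1,\dots,n_i-1\}$,
2. $\Phi = \sum_{i=1}^{m}\big(\kappa_i\Phi_{v^i} + \kappa_{\bar i}\Phi_{v^{\bar i}}\big) + \sum_{k=1}^{g}\big(\sigma_k\Phi_{r^k} + \rho_k\Phi_{u^k}\big)$,
3. $\Phi = \sum_{i=1}^{m}\big(\kappa_i\Phi_{\omega^i} + \kappa_{\bar i}\Phi_{\omega^{\bar i}}\big) + \sum_{k=1}^{g}\big(\sigma_k\Phi_{s^k} + \rho_k\Phi_{t^k}\big)$,

where the coefficients do not depend on a point of the Hurwitz space. The unit vector fields for the structures defined by these combinations, respectively, are given by:

1. $e = -\sigma_{i;\alpha}\partial_{t^{i;\alpha}} - \sigma_{\bar i;\alpha}\partial_{t^{\bar i;\alpha}}$ for some pair $(i;\alpha)$: $i = 0,\dots,m$, $\alpha = 1,\dots,n_i-1$,
2. $e = -\sum_{i=1}^{m}\big(\kappa_i\partial_{v^i} + \kappa_{\bar i}\partial_{v^{\bar i}}\big) - \sum_{k=1}^{g}\big(\sigma_k\partial_{r^k} + \rho_k\partial_{u^k}\big)$,
3. $e = -\sum_{i=1}^{m}\big(\kappa_i\partial_{\omega^i} + \kappa_{\bar i}\partial_{\omega^{\bar i}}\big) - \sum_{k=1}^{g}\big(\sigma_k\partial_{s^k} + \rho_k\partial_{t^k}\big)$.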
In each case, by a linear change of variables, the field e can be made equal to $\partial_{\xi^1}$ for some new variable $\xi^1$. This change of variables does not affect the quasihomogeneity of the prepotential since the flat coordinates which enter each of the three combinations have equal coefficients of quasihomogeneity (see Proposition 11).

6. G-Function of Hurwitz Frobenius Manifolds

The G-function is a solution to the Getzler system of linear differential equations, which was derived in [9] (see also [6]). The system is defined on an arbitrary semisimple Frobenius manifold M. It was proven in [6] that the Getzler system has a unique, up to an additive constant, solution G which satisfies the quasihomogeneity condition
$$E(G) = -\frac14\sum_{A=1}^{n}\Big(1 - \nu_A - \frac{\nu}{2}\Big)^2 + \frac{\nu n}{48},$$
with a constant on the right-hand side: $\nu$ is the charge, n is the dimension of the Frobenius manifold; $\{\nu_A\}$ are the quasihomogeneity coefficients (2.1). In [6] the following formula (which proves the conjecture of A. Givental [10]) for this quasihomogeneous solution was derived:
$$G = \log\frac{\tau_I}{J^{1/24}}, \tag{6.1}$$
where J is the Jacobian of transformation from canonical to the flat coordinates, $J = \det\big(\frac{\partial t^\alpha}{\partial\lambda_i}\big)$; and $\tau_I$ is the isomonodromic tau-function of the Frobenius manifold defined by
$$\frac{\partial\log\tau_I}{\partial\lambda_i} = H_i := \frac12\sum_{j\neq i,\,j=1}^{n}\beta_{ij}^2\,(\lambda_i - \lambda_j), \qquad i = 1,\dots,n. \tag{6.2}$$
The function G (6.1) for the Frobenius manifold $\widehat{M}^{\phi_s}_{1;1}$ was computed in [6]. In [12, 13] expression (6.1) was computed for Dubrovin's Frobenius structures on Hurwitz spaces in arbitrary genus. Theorem 12 below summarizes the main results of papers [12] and [13].

Denote by S the following term in asymptotics of the bidifferential W(P, Q) (3.2) near the diagonal $P \sim Q$:
$$W(P,Q) \underset{Q\sim P}{=} \Big(\frac{1}{(x(P)-x(Q))^2} + S(x(P)) + o(1)\Big)\,dx(P)\,dx(Q)$$
($6S(x(P))$ is called the Bergman projective connection [8]). By $S_i$ we denote the value of S at the ramification point $P_i$ taken with respect to the local parameter $x_i(P) = \sqrt{\lambda - \lambda_i}$:
$$S_i = S(x_i)|_{x_i=0}. \tag{6.3}$$
Since the singular part of the bidifferential W in a neighbourhood of the point $P_i$ does not depend on coordinates $\{\lambda_j\}$, the Rauch variational formulas (3.4) imply
$$\frac{\partial S_i}{\partial\lambda_j} = \frac12\,W^2(P_i,P_j).$$
The symmetry of this expression provides compatibility for the following system of differential equations which defines the Bergman tau-function $\tau_W$:
$$\frac{\partial\log\tau_W}{\partial\lambda_i} = -\frac12\,S_i, \qquad i = 1,\dots,n.$$

Theorem 12. The isomonodromic tau-function $\tau_I$ (6.2) for a holomorphic Frobenius structure $\widehat{M}^\phi$ is related to the Bergman tau-function $\tau_W$ as follows ([13]):
$$\tau_I = (\tau_W)^{-1/2}, \tag{6.4}$$
where $\tau_W$ is given by the following expression independent of the points P and Q ([12]):
$$\tau_W = Q^{2/3}\prod_{\substack{k,l=1\\ k<l}}^{L+m+1}\big[E(D_k,D_l)\big]^{d_k d_l/6} \tag{6.5}$$
and

• Q is given by
$$Q = [d\lambda(P)]^{\frac{g-1}{2}}\;C(P)\prod_{k=1}^{L+m+1}\big[E(P,D_k)\big]^{\frac{(1-g)d_k}{2}},$$
where C(P) is the following multivalued $g(1-g)/2$-differential on L:
$$C(P) = \frac{1}{\det_{1\le\alpha,\beta\le g}\big\|\omega_\beta^{(\alpha-1)}(P)\big\|}\sum_{\alpha_1,\dots,\alpha_g=1}^{g}\frac{\partial^g\theta(K^P)}{\partial z_{\alpha_1}\cdots\partial z_{\alpha_g}}\,\omega_{\alpha_1}(P)\cdots\omega_{\alpha_g}(P)$$
• $\sum_{k=1}^{L+m+1}d_kD_k$ is the divisor $(d\lambda)$ of the differential $d\lambda(P)$, i.e. $D_l = P_l$, $d_l = 1$ for $l = 1,\dots,L$ and $D_{L+i+1} = \infty^i$, $d_{L+i+1} = -(n_i+1)$, $i = 0,\dots,m$. As before, we evaluate a differential at the points of the divisor $(d\lambda)$ with respect to the standard local parameters: $x_j = \sqrt{\lambda - \lambda_j}$ for $j = 1,\dots,L$ and $x_{L+1+i} = \lambda^{-1/(n_i+1)}$ for $i = 0,\dots,m$
• $\theta(z|B)$, $z \in \mathbb{C}^g$ is the theta-function; E(P, Q) is the prime form; $E(D_k,P)$ stands for $E(Q,P)\sqrt{dx_k(Q)}\,|_{Q=D_k}$
• $K^P$ is the vector of Riemann constants; the fundamental domain $\hat L$ is chosen so that the Abel map of the divisor $(d\lambda)$ is given by $\mathcal{A}((d\lambda)) = -2K^P$.

6.1. G-function for manifolds $\widehat{M}^\phi$. Theorem 12 gives the numerator of expression (6.1) for the G-function of holomorphic Frobenius structures $\widehat{M}^\phi$ on Hurwitz spaces described in Sect. 4. For the denominator we have (see [6], [13])
$$J = \frac{1}{2^{L/2}}\prod_{i=1}^{L}\phi(P_i),$$
where $\phi$ is the primary differential from the list of Theorem 2 which corresponds to the Frobenius structure $\widehat{M}^\phi$.
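Summarizing the above formulas, we get the following expression for the G-function of the Frobenius manifold $\widehat{M}^\phi$:
$$G = -\frac12\log\tau_W - \frac{1}{24}\log\prod_{i=1}^{L}\phi(P_i) + \text{const}, \tag{6.6}$$
where $\tau_W$ is given by (6.5).

6.2. G-function for “real doubles” $\widehat{M}^\Phi$. For the Frobenius structures with canonical coordinates $\{\lambda_1,\dots,\lambda_L;\bar\lambda_1,\dots,\bar\lambda_L\}$, corresponding to the primary differentials $\Phi$ from Sect. 5, the Jacobian of transformation between canonical and flat coordinates is given by
$$J = \det\Big(\frac{\partial\xi^A}{\partial\lambda_i}\;\Big|\;\frac{\partial\xi^A}{\partial\bar\lambda_i}\Big) = \frac{1}{2^L}\prod_{i=1}^{L}\Phi_{(1,0)}(P_i)\,\Phi_{(0,1)}(P_i). \tag{6.7}$$
The definition (6.2) of the isomonodromic tau-function in this case becomes:
$$\frac{\partial\log\tau_I}{\partial\lambda_i} = H_i := \frac12\sum_{j\neq i,\,j=1}^{L}\beta_{ij}^2\,(\lambda_i-\lambda_j) + \frac12\sum_{j=1}^{L}\beta_{i\bar j}^2\,(\lambda_i-\bar\lambda_j),$$
$$\frac{\partial\log\tau_I}{\partial\bar\lambda_i} = H_{\bar i} := \frac12\sum_{j=1}^{L}\beta_{\bar i j}^2\,(\bar\lambda_i-\lambda_j) + \frac12\sum_{j\neq i,\,j=1}^{L}\beta_{\bar i\bar j}^2\,(\bar\lambda_i-\bar\lambda_j). \tag{6.8}$$
Analogously to relation (6.4) one can prove (see [13] and Proposition 12 below) that the function $\tau_I$ is the $-1/2$ power of the function $\tau_\Omega$, which is defined by the Schiffer kernel $\Omega(P,Q)$ (3.6) as follows. The asymptotics of the kernel $\Omega(P,Q)$ near the diagonal is
$$\Omega(P,Q) \underset{Q\sim P}{=} \Big(\frac{1}{(x(P)-x(Q))^2} + S_\Omega(x(P)) + o(1)\Big)\,dx(P)\,dx(Q).$$
Denote by $\Omega_i$ the evaluation of the term $S_\Omega(x)$ at the ramification point $P_i$ with respect to the local parameter $x_i = \sqrt{\lambda-\lambda_i}$: $\Omega_i = S_\Omega(x_i)|_{x_i=0} = S_i + \Sigma_i$, where $S_i$ is the same as in (6.3) and $\Sigma_i$ is given by
$$\Sigma_i = -\pi\sum_{k,l=1}^{g}(\mathrm{Im}\,B)^{-1}_{kl}\,\omega_k(P_i)\,\omega_l(P_i).$$
The differentiation formulas (3.11) for the kernels $\Omega$ and B imply
$$\frac{\partial\Omega_i}{\partial\lambda_j} = \frac12\,\Omega^2(P_i,P_j) = 2\beta_{ij}^2, \qquad \frac{\partial\Omega_i}{\partial\bar\lambda_j} = \frac12\,B^2(\bar P_j,P_i) = 2\beta_{i\bar j}^2,$$
$$\frac{\partial\Omega_{\bar i}}{\partial\lambda_j} = \frac12\,B^2(\bar P_i,P_j) = 2\beta_{\bar i j}^2, \qquad \frac{\partial\Omega_{\bar i}}{\partial\bar\lambda_j} = \frac12\,\overline{\Omega^2(P_i,P_j)} = 2\beta_{\bar i\bar j}^2, \tag{6.9}$$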
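which allows the following definition of the tau-function $\tau_\Omega$:
$$\frac{\partial\log\tau_\Omega}{\partial\lambda_i} = -\frac12\,\Omega_i, \qquad \frac{\partial\log\tau_\Omega}{\partial\bar\lambda_i} = -\frac12\,\Omega_{\bar i}. \tag{6.10}$$
From the Rauch variational formulas (3.12) we find
$$\frac{\partial\log\det(\mathrm{Im}\,B)}{\partial\lambda_i} = -\frac12\,\Sigma_i, \qquad \frac{\partial\log\det(\mathrm{Im}\,B)}{\partial\bar\lambda_i} = -\frac12\,\overline{\Sigma_i},$$
and therefore
$$\tau_\Omega = \text{const}\,|\tau_W|^2\,\det(\mathrm{Im}\,B). \tag{6.11}$$
Remark 7. This tau-function coincides with an appropriately regularized ratio of the determinant of the Laplacian on L and the volume of L in the singular metric $|d\lambda|^2$ (see [3, 12, 16]).

Now we are able to compute the function $\tau_I$ (6.8) by proving the following proposition.

Proposition 12. The isomonodromic tau-function $\tau_I$ for a Frobenius structure with canonical coordinates $\{\lambda_1,\dots,\lambda_L;\bar\lambda_1,\dots,\bar\lambda_L\}$ on the Hurwitz space is related to the function $\tau_\Omega$ (6.10) by
$$\tau_I = (\tau_\Omega)^{-1/2}. \tag{6.12}$$
Proof. Using the relation (6.9) between derivatives of $\Omega_i$ and rotation coefficients $\beta_{ij}$, we write for the Hamiltonians $H_i$ (6.8):
$$H_i = \frac14\,\lambda_i\Big(\sum_{j\neq i,\,j=1}^{L}\partial_{\lambda_j}\Omega_i + \sum_{j=1}^{L}\partial_{\bar\lambda_j}\Omega_i\Big) - \frac14\sum_{j\neq i,\,j=1}^{L}\lambda_j\,\partial_{\lambda_j}\Omega_i - \frac14\sum_{j=1}^{L}\bar\lambda_j\,\partial_{\bar\lambda_j}\Omega_i. \tag{6.13}$$
For the quantities $\Omega_i$ one can prove the relations
$$\sum_{j=1}^{L}\Big(\frac{\partial}{\partial\lambda_j} + \frac{\partial}{\partial\bar\lambda_j}\Big)\Omega_i = 0, \qquad \sum_{j=1}^{L}\Big(\lambda_j\frac{\partial}{\partial\lambda_j} + \bar\lambda_j\frac{\partial}{\partial\bar\lambda_j}\Big)\Omega_i = -\Omega_i. \tag{6.14}$$
To prove (6.14) we use the invariance of the Schiffer kernel $\Omega(P,Q)$ under two biholomorphic maps of the Riemann surface $L \to L^\delta$ and $L \to L^\epsilon$ given by transformations $\lambda \to \lambda + \delta$ and $\lambda \to \lambda(1+\epsilon)$ performed simultaneously on all sheets of the covering $L_\lambda$ (see proofs of Propositions 2 and 3). Substitution of (6.14) into (6.13) yields
$$H_i = -\frac14\sum_{j=1}^{L}\big(\lambda_j\,\partial_{\lambda_j}\Omega_i + \bar\lambda_j\,\partial_{\bar\lambda_j}\Omega_i\big) = \frac14\,\Omega_i.$$
Similarly, we get for $H_{\bar i}$ the relation: $H_{\bar i} = \frac14\,\Omega_{\bar i}$.

Formulas (6.7), (6.11) and (6.12) give the expression for the function G (6.1), i.e. we have proven the following theorem.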
Theorem 13. The G-function of the Frobenius manifold $\widehat{M}^\Phi$ is given by
$$G = -\frac12\log\Big\{|\tau_W|^2\det(\mathrm{Im}\,B)\Big\} - \frac{1}{24}\log\Big\{\prod_{i=1}^{L}\Phi_{(1,0)}(P_i)\,\Phi_{(0,1)}(P_i)\Big\} + \text{const}, \tag{6.15}$$
where the Bergman tau-function $\tau_W$ is given by (6.5).

7. Examples in Genus One

Since the described construction in the case of genus zero does not lead to new structures, the simplest examples we can compute are the Frobenius structures in genus one. The simplest Hurwitz space in genus one is $M_{1;1}$. We shall compute the prepotentials of Frobenius manifolds $\widehat{M}^{\phi_s}_{1;1}$ and $\widehat{M}^{\Phi_s}_{1;1}$, $\widehat{M}^{\Phi_t}_{1;1}$, $\widehat{M}^{\Phi_s+\sigma\Phi_t}_{1;1}$ (for a nonzero constant $\sigma \in \mathbb{C}$) given by formulas (4.19) and (5.35), respectively, and the corresponding G-functions (6.6) and (6.15).

The Riemann surface of genus one can be represented as a quotient $L = \mathbb{C}/\{2\omega, 2\omega'\}$, where $\omega, \omega' \in \mathbb{C}$. The space $M_{1;1}$ consists of the genus one two-fold coverings of $\mathbb{CP}^1$ with simple branch points, one of them being at infinity. These coverings can be defined by the function
$$\lambda(\varsigma) = \wp(\varsigma) + c, \tag{7.1}$$
where $\wp$ is the Weierstrass elliptic function $\wp: L \to \mathbb{CP}^1$ and c is a constant with respect to $\varsigma$. We denote by $\lambda_1, \lambda_2, \lambda_3$ the finite branch points of the coverings (7.1) and consider them as local coordinates on the space $\widehat{M}_{1;1}$.

7.1. Holomorphic Frobenius structure $\widehat{M}^{\phi_s}_{1;1}$. The primary differential $\phi_s$ is the holomorphic normalized differential (see (3.3)):
$$\phi(\varsigma) = \phi_s(\varsigma) = \frac{1}{2\pi i}\oint_b W(\varsigma,\tilde\varsigma). \tag{7.2}$$
It can be expressed as follows via $\lambda$ and $\varsigma$:
$$\phi(\lambda(\varsigma)) = \frac{1}{4\omega}\,\frac{d\lambda}{\sqrt{(\lambda-\lambda_1)(\lambda-\lambda_2)(\lambda-\lambda_3)}}, \qquad \phi(\varsigma) = \frac{d\varsigma}{2\omega}. \tag{7.3}$$
The expansion of the multivalued differential $p\,d\lambda = \big(\int_0^\varsigma\phi\big)\,d\lambda$ at infinity with respect to the local parameter $z = \lambda^{-1/2}$ is given by
$$p\,d\lambda = \frac{1}{2\omega}\Big(\frac{2}{z^2} + c + O(z)\Big)\,dz.$$
The Darboux-Egoroff metric (4.8) corresponding to our choice of primary differential $\phi$ has in canonical coordinates $\{\lambda_i\}$ the form
$$ds^2_{\phi_s} = \frac{1}{8\omega^2}\left(\frac{(d\lambda_1)^2}{(\lambda_1-\lambda_2)(\lambda_1-\lambda_3)} + \frac{(d\lambda_2)^2}{(\lambda_2-\lambda_1)(\lambda_2-\lambda_3)} + \frac{(d\lambda_3)^2}{(\lambda_3-\lambda_1)(\lambda_3-\lambda_2)}\right). \tag{7.4}$$
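The set of flat coordinates of this metric is
$$t_1 := s = -\oint_a\lambda\,\phi_s = -\frac{1}{2\omega}\int_x^{x+2\omega}(\wp(\varsigma)+c)\,d\varsigma = -\frac{\pi i}{4\omega^2}\,\gamma - c,$$
$$t_2 := t^{0;1} = \operatorname*{res}_{\varsigma=0}\frac{1}{\sqrt\lambda}\,p\,d\lambda = \frac{1}{\omega},$$
$$t_3 := r = \frac{1}{2\pi i}\oint_b\phi = \frac{1}{2\pi i}\,\frac{\omega'}{\omega}, \tag{7.5}$$
where we denote by $\gamma$ the following function of the period $\mu = 2\pi i\,t_3$ of the torus L:
$$\gamma(\mu) = \frac{1}{3\pi i}\,\frac{\theta_1'''(0;\mu)}{\theta_1'(0;\mu)}. \tag{7.6}$$
This function satisfies the Chazy equation (see for example [5]):
$$\gamma''' = 6\gamma\gamma'' - 9(\gamma')^2. \tag{7.7}$$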
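Reading (7.7) as the third-order equation $\gamma''' = 6\gamma\gamma'' - 9(\gamma')^2$, one can test it on an elementary solution. The sketch below uses the rational function $\gamma(\mu) = -2/(\mu - \mu_0)$, which is an assumption made purely for illustration: it is not the theta-function solution (7.6), only a simple function that satisfies the same equation. The name `chazy_residual` is hypothetical.

```python
from fractions import Fraction

def chazy_residual(mu, mu0=Fraction(0)):
    """Residual gamma''' - (6*gamma*gamma'' - 9*gamma'^2) for gamma = -2/(mu - mu0).

    The derivatives are written out by hand and evaluated in exact
    rational arithmetic, so a vanishing residual is an exact statement.
    """
    x = mu - mu0
    g0 = Fraction(-2) / x        # gamma
    g1 = Fraction(2) / x**2      # gamma'
    g2 = Fraction(-4) / x**3     # gamma''
    g3 = Fraction(12) / x**4     # gamma'''
    return g3 - (6 * g0 * g2 - 9 * g1**2)

# A few sample points away from the pole mu0 = 1/3.
residuals = [chazy_residual(Fraction(p, q), Fraction(1, 3))
             for p, q in [(1, 2), (5, 7), (-4, 3), (9, 1)]]
print(residuals)
```

Every residual vanishes identically, since $12/x^4 = 48/x^4 - 36/x^4$ with $x = \mu - \mu_0$.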
The metric (7.4) in coordinates (7.5) is constant and has the form:
$$ds^2_{\phi_s} = \frac12\,(dt_2)^2 - 2\,dt_1\,dt_3.$$
The prepotential (4.19) (it was computed in [1, 5]) of the Frobenius structure $\widehat{M}^{\phi_s}_{1;1}$ is given by
$$F_{\phi_s} = -\frac14\,t_1t_2^2 + \frac12\,t_1^2t_3 - \frac{\pi i}{32}\,t_2^4\,\gamma(2\pi i\,t_3).$$
This function is quasihomogeneous, i.e. the following relation:
$$F_{\phi_s}(\kappa^{\nu_1}t_1,\kappa^{\nu_2}t_2,\kappa^{\nu_3}t_3) = \kappa^{\nu_F}F_{\phi_s}(t_1,t_2,t_3) \tag{7.8}$$
holds for any $\kappa \neq 0$ and the quasihomogeneity factors
$$\nu_1 = 1, \qquad \nu_2 = \frac12, \qquad \nu_3 = 0 \qquad\text{and}\qquad \nu_F = 2. \tag{7.9}$$
The Euler vector field $E = \sum_{i=1}^{3}\lambda_i\partial_{\lambda_i}$ in flat coordinates has the form:
$$E = \sum_{\alpha=1}^{3}\nu_\alpha\,t_\alpha\,\partial_{t_\alpha} = t_1\partial_{t_1} + \frac12\,t_2\partial_{t_2},$$
and the quasihomogeneity (7.8), (7.9) can be written as $E(F_{\phi_s}(t_1,t_2,t_3)) = 2F_{\phi_s}(t_1,t_2,t_3)$.

The corresponding G-function was computed in [6]:
$$G = -\log\Big\{\eta(2\pi i\,t_3)\,(t_2)^{1/8}\Big\} + \text{const},$$
where $\eta(\mu)$ is the Dedekind eta-function: $\eta(\mu) = (\theta_1'(0))^{1/3}$. (See [12] for the function $\tau_W$ in genus one.)
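The quasihomogeneity relation (7.8), (7.9) and the WDVV property of $F_{\phi_s}$ can be checked numerically. The following is a minimal sketch under stated assumptions: the theta-function $\gamma$ of (7.6) is replaced by the rational stand-in $\gamma(\mu) = -2/\mu$ (which satisfies the same Chazy equation), and the function names are hypothetical. For a prepotential of the form $-\frac14 t_1t_2^2 + \frac12 t_1^2t_3 + g(t_2,t_3)$, the only nontrivial associativity condition reduces to $g_{333} = 4\,(g_{223}^2 - g_{222}\,g_{233})$, and this is the residual evaluated below.

```python
import math

PI_I = 1j * math.pi

def gamma(mu, order=0):
    """Stand-in gamma(mu) = -2/mu and its derivatives up to third order.

    Assumption for illustration only: the paper's gamma is the
    theta-function solution (7.6), not this rational function.
    """
    coeff = [-2, 2, -4, 12][order]   # d^k/dmu^k (-2/mu) = coeff / mu^(k+1)
    return coeff / mu ** (order + 1)

def prepotential(t1, t2, t3):
    """F = -t1 t2^2/4 + t1^2 t3/2 - (pi i/32) t2^4 gamma(2 pi i t3)."""
    return (-t1 * t2**2 / 4 + t1**2 * t3 / 2
            - PI_I / 32 * t2**4 * gamma(2 * PI_I * t3))

def wdvv_residual(t2, t3):
    """g_333 - 4*(g_223^2 - g_222*g_233) for g = -(pi i/32) t2^4 gamma(2 pi i t3)."""
    a, c, mu = -PI_I / 32, 2 * PI_I, 2 * PI_I * t3
    g222 = 24 * a * t2 * gamma(mu)
    g223 = 12 * a * t2**2 * c * gamma(mu, 1)
    g233 = 4 * a * t2**3 * c**2 * gamma(mu, 2)
    g333 = a * t2**4 * c**3 * gamma(mu, 3)
    return g333 - 4 * (g223**2 - g222 * g233)

# Arbitrary complex sample point and scaling parameter.
t = (0.7 + 0.2j, 1.3 - 0.4j, 0.5 + 0.9j)
kappa = 1.7

# Quasihomogeneity with weights (1, 1/2, 0) and nu_F = 2.
scaled = prepotential(kappa * t[0], kappa**0.5 * t[1], t[2])
print(abs(scaled - kappa**2 * prepotential(*t)))

# Associativity residual; vanishes because the stand-in solves the Chazy equation.
print(abs(wdvv_residual(t[1], t[2])))
```

Both printed numbers are zero up to floating-point rounding. The scaling check does not depend on the choice of $\gamma$, whereas the associativity check holds only when $\gamma$ satisfies (7.7).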
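7.2. “Real doubles” in genus one. We consider the same coverings $(L,\lambda)$ with $L = \mathbb{C}/\{2\omega, 2\omega'\}$, and the function $\lambda$ given by (7.1). The coverings have simple branch points $\lambda_1, \lambda_2, \lambda_3$ and $\infty$. The set of such coverings is considered now as a space with local coordinates $\{\lambda_1,\lambda_2,\lambda_3;\bar\lambda_1,\bar\lambda_2,\bar\lambda_3\}$.

7.2.1. The manifold $\widehat{M}^{\Phi_s}_{1;1}$. The primary differential $\Phi = \Phi_s$ has the form ($\mu = \omega'/\omega$ is the period of the torus L):
$$\Phi(\varsigma) = \Phi_s(\varsigma) = \frac{\bar\mu}{\bar\mu-\mu}\,\frac{d\varsigma}{2\omega} + \frac{\mu}{\mu-\bar\mu}\,\frac{d\bar\varsigma}{2\bar\omega}. \tag{7.10}$$
The corresponding Darboux-Egoroff metric (5.5) is given by
$$ds^2_{\Phi_s} = \mathrm{Re}\left\{\frac{1}{4\omega^2}\Big(\frac{\bar\mu}{\bar\mu-\mu}\Big)^2\left(\frac{(d\lambda_1)^2}{(\lambda_1-\lambda_2)(\lambda_1-\lambda_3)} + \frac{(d\lambda_2)^2}{(\lambda_2-\lambda_1)(\lambda_2-\lambda_3)} + \frac{(d\lambda_3)^2}{(\lambda_3-\lambda_1)(\lambda_3-\lambda_2)}\right)\right\}. \tag{7.11}$$
The flat coordinates of this metric are
$$t_1 := s = \mathrm{Re}\left\{\frac{\bar\mu}{\mu-\bar\mu}\int_x^{x+2\omega}(\wp(\varsigma)+c)\,\frac{d\varsigma}{\omega}\right\}, \qquad t_4 := t = \mathrm{Re}\left\{\frac{\bar\mu}{\mu-\bar\mu}\int_x^{x+2\omega'}(\wp(\varsigma)+c)\,\frac{d\varsigma}{\omega}\right\},$$
$$t_2 := t^{0;1} = \frac{\bar\mu}{\bar\mu-\mu}\,\frac{1}{\omega}, \qquad t_5 := t^{\bar 0;1} = \bar t_2,$$
$$t_3 := r = \frac{1}{2\pi i}\,\frac{\mu\bar\mu}{\bar\mu-\mu}, \qquad t_6 := u = \frac{1}{2\pi i}\,\frac{\bar\mu}{\bar\mu-\mu}. \tag{7.12}$$
Note that $\mu = t_3/t_6$, $\bar\mu = 2\pi i\,t_3/(2\pi i\,t_6 - 1)$ and for the solution (7.6) to the Chazy equation we have $\overline{\gamma(\mu)} = -\gamma(-\bar\mu)$. The metric (7.11) in the flat coordinates has the form
$$ds^2_{\Phi_s} = \frac12\,(dt_2)^2 + \frac12\,(dt_5)^2 - 2\,dt_1\,dt_3 + 2\,dt_4\,dt_6.$$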
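The relations between the period $\mu$ and the flat coordinates $t_3$, $t_6$ quoted above can be verified numerically. This is a small sketch, with the hypothetical helper `flat_periods` implementing the expressions for $t_3 = r$ and $t_6 = u$ from (7.12). It also checks two reality properties that follow from those expressions: $t_3$ is real, and $\bar t_6 = t_6 - 1/(2\pi i)$.

```python
import math

TWO_PI_I = 2j * math.pi

def flat_periods(mu):
    """Return (t3, t6) of (7.12) for a period mu with Im(mu) > 0."""
    mu_bar = mu.conjugate()
    t3 = mu * mu_bar / (TWO_PI_I * (mu_bar - mu))
    t6 = mu_bar / (TWO_PI_I * (mu_bar - mu))
    return t3, t6

mu = 0.3 + 1.1j                      # arbitrary sample period
t3, t6 = flat_periods(mu)

# Recover mu and its conjugate from the flat coordinates.
mu_from_t = t3 / t6
mu_bar_from_t = TWO_PI_I * t3 / (TWO_PI_I * t6 - 1)

print(abs(mu_from_t - mu))
print(abs(mu_bar_from_t - mu.conjugate()))
print(abs(t3.imag))                                   # t3 is real
print(abs(t6.conjugate() - (t6 - 1 / TWO_PI_I)))      # constant imaginary part of t6
```

All four printed values vanish up to rounding for any sample period with nonzero imaginary part.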
The corresponding prepotential (5.35) is 1 1 1 1 1 ) Fs = − t1 t22 − t1 t52 + t12 t3 − t1 t4 (2t6 − 4 4 2 2 2π i 1 2 1 1 1 1 1 +t3−1 t2 t4 (t6 − ) + t4 t52 t6 + t42 t6 (t6 − ) + t22 t52 4 2πi 4 2 2π i 16
1 −2 1 4 t3 1 −1 −1 + t3−1 − t2 − t6 γ t3 t6 32 4πi t6 2πi 1 4 πi 2πit3 −1 −1 −1 + t5 − + t . γ + t (2π it − 1) 6 3 3 32 (2πit6 − 1)2 1 − 2πit6 (7.13)
+
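Note that the coordinates $t_1, t_3, t_4$ are real, $t_2$ and $t_5$ are complex conjugates of each other and $t_6$ has a constant imaginary part, $\bar t_6 = t_6 - 1/(2\pi i)$. In these coordinates, the prepotential $F_{\Phi_s}$ is a real-valued function. However, $F_{\Phi_s}$ also satisfies the WDVV system when considered as a function of six complex coordinates; in that case, $F_{\Phi_s}$ is not real. This function is quasihomogeneous: the relation
$$F_{\Phi_s}(\kappa^{\nu_1}t_1,\dots,\kappa^{\nu_6}t_6) = \kappa^{\nu_F}F_{\Phi_s}(t_1,\dots,t_6)$$
holds for any $\kappa \neq 0$ and the quasihomogeneity factors
$$\nu_1 = 1, \quad \nu_2 = \frac12, \quad \nu_3 = 0, \quad \nu_4 = 1, \quad \nu_5 = \frac12, \quad \nu_6 = 0, \quad \nu_F = 2. \tag{7.14}$$
The Euler vector field $E = \sum_{i=1}^{3}(\lambda_i\partial_{\lambda_i} + \bar\lambda_i\partial_{\bar\lambda_i})$ has the following form in the flat coordinates:
$$E = \sum_{\alpha=1}^{6}\nu_\alpha\,t_\alpha\,\partial_{t_\alpha} = t_1\partial_{t_1} + \frac12\,t_2\partial_{t_2} + t_4\partial_{t_4} + \frac12\,t_5\partial_{t_5},$$
and the quasihomogeneity of $F_{\Phi_s}$ can be written as $E(F_{\Phi_s}(t_1,\dots,t_6)) = 2F_{\Phi_s}(t_1,\dots,t_6)$.

The corresponding G-function (6.15) (real-valued as a function of coordinates (7.12)) is given by
$$G = -\log\left\{\eta\Big(\frac{t_3}{t_6}\Big)\,\eta\Big(\frac{2\pi i\,t_3}{1-2\pi i\,t_6}\Big)\,(t_2t_5)^{1/8}\,\Big(\frac{2\pi i\,t_3}{t_6(2\pi i\,t_6-1)}\Big)^{1/2}\right\} + \text{const}.$$
Here we use the relation $\overline{\eta(\mu)} = \eta(-\bar\mu)$ for the Dedekind $\eta$-function.

7.2.2. The manifold $\widehat{M}^{\Phi_t}_{1;1}$. The primary differential $\Phi = \Phi_t$ has the form ($\mu = \omega'/\omega$ is the period of the torus):
$$\Phi(\varsigma) = \Phi_t(\varsigma) = \frac{1}{\mu-\bar\mu}\,\frac{d\varsigma}{2\omega} - \frac{1}{\mu-\bar\mu}\,\frac{d\bar\varsigma}{2\bar\omega}. \tag{7.15}$$
The corresponding Darboux-Egoroff metric (5.5) is given by
$$ds^2_{\Phi_t} = \mathrm{Re}\left\{\frac{1}{4\omega^2}\Big(\frac{1}{\bar\mu-\mu}\Big)^2\left(\frac{(d\lambda_1)^2}{(\lambda_1-\lambda_2)(\lambda_1-\lambda_3)} + \frac{(d\lambda_2)^2}{(\lambda_2-\lambda_1)(\lambda_2-\lambda_3)} + \frac{(d\lambda_3)^2}{(\lambda_3-\lambda_1)(\lambda_3-\lambda_2)}\right)\right\}. \tag{7.16}$$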
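The flat coordinates of this metric are
$$t_1 := t = \mathrm{Re}\left\{\frac{1}{\bar\mu-\mu}\int_x^{x+2\omega'}(\wp(\varsigma)+c)\,\frac{d\varsigma}{\omega}\right\}, \qquad t_4 := s = \mathrm{Re}\left\{\frac{1}{\bar\mu-\mu}\int_x^{x+2\omega}(\wp(\varsigma)+c)\,\frac{d\varsigma}{\omega}\right\},$$
$$t_2 := t^{0;1} = \frac{1}{\mu-\bar\mu}\,\frac{1}{\omega}, \qquad t_5 := t^{\bar 0;1} = \bar t_2,$$
$$t_3 := r = \frac{1}{2\pi i}\,\frac{\mu}{\mu-\bar\mu}, \qquad t_6 := u = \frac{1}{2\pi i}\,\frac{1}{\mu-\bar\mu}. \tag{7.17}$$
In terms of these coordinates, the period of the torus and its conjugate can be expressed as: $\mu = t_3/t_6$ and $\bar\mu = (2\pi i\,t_3 - 1)/(2\pi i\,t_6)$. The metric (7.16) in flat coordinates has the form:
$$ds^2_{\Phi_t} = \frac12\,(dt_2)^2 + \frac12\,(dt_5)^2 + 2\,dt_1\,dt_6 - 2\,dt_3\,dt_4.$$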
The corresponding prepotential (5.35) is given by 1 1 1 1 1 1 1 t42 1 t22 t52 ) − t12 t6 − t3 (t3 − ) − Ft = − t1 t22 − t1 t52 + t1 t4 (2t3 − 4 4 2 2πi 2 2 2π i t6 16 t6 2 4 4 t3 t4 t5 t 1 t2 t3 + − 2 − γ 2 32t6 128πi t6 t6 4t6 1 4 4 (t3 − 2πi )t4 t22 t5 1 t5 1 − 2πit3 + − − γ . (7.18) 32t6 128πi t62 2πit6 4t6 This function is also real if the coordinates are of the form (7.17): in this case t1 , t4 , t6 1 are real, t2 = t¯5 , and t3 has a constant imaginary part, namely, we have t¯3 = t3 − 2πi . The last two lines in (7.18) are complex conjugates of each other since for the function γ (7.6) we have γ (µ) = −γ (−µ). ¯ The function Ft (7.18) is quasihomogeneous. The quasihomogeneity factors {νi } and νF are the same as for the above example (the function Fs ); they are given by (7.14). t (it is also real-valued as a function of coordinates (7.17)) is The G-function for M 1;1 given by 1 −1 t3 1 − 2πit3 2 8 G = − log η η + const , (t2 t5 ) t6 t6 2πit6 where, again, η is the Dedekind eta-function.
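7.2.3. The manifold $\widehat{M}^{\Phi_s+\sigma\Phi_t}_{1;1}$. According to Remark 6 in the end of Sect. 5, there exists a Frobenius structure built from a linear combination of two primary differentials $\Phi_s$ and $\Phi_t$. Here, we compute a prepotential which corresponds to the differential $\Phi = \Phi_s + \sigma\Phi_t$ for $\sigma$ being a non-zero parameter. We start with the differential
$$\Phi(\varsigma) = \Phi_s(\varsigma) + \sigma\Phi_t(\varsigma) = \frac{\bar\mu-\sigma}{\bar\mu-\mu}\,\frac{d\varsigma}{2\omega} + \frac{\sigma-\mu}{\bar\mu-\mu}\,\frac{d\bar\varsigma}{2\bar\omega}.$$
The corresponding Darboux-Egoroff metric (5.5) is given by
$$ds^2_\Phi = \frac{1}{8\omega^2}\Big(\frac{\bar\mu-\sigma}{\bar\mu-\mu}\Big)^2\left(\frac{(d\lambda_1)^2}{(\lambda_1-\lambda_2)(\lambda_1-\lambda_3)} + \frac{(d\lambda_2)^2}{(\lambda_2-\lambda_1)(\lambda_2-\lambda_3)} + \frac{(d\lambda_3)^2}{(\lambda_3-\lambda_1)(\lambda_3-\lambda_2)}\right) + \frac{1}{8\bar\omega^2}\Big(\frac{\sigma-\mu}{\bar\mu-\mu}\Big)^2\left(\frac{(d\bar\lambda_1)^2}{(\bar\lambda_1-\bar\lambda_2)(\bar\lambda_1-\bar\lambda_3)} + \frac{(d\bar\lambda_2)^2}{(\bar\lambda_2-\bar\lambda_1)(\bar\lambda_2-\bar\lambda_3)} + \frac{(d\bar\lambda_3)^2}{(\bar\lambda_3-\bar\lambda_1)(\bar\lambda_3-\bar\lambda_2)}\right). \tag{7.19}$$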
The flat coordinates t and s of the metric (7.19) are µ¯ − σ x+2ω dς d ς¯ σ − µ x+2ω (℘ (ς ) + c) (℘ (ς ) + c) , + µ¯ − µ x 2ω µ¯ − µ x 2ω¯ x+2ω x+2ω dς d ς¯ σ −µ µ¯ − σ (℘ (ς ) + c) (℘ (ς ) + c) + . s= µ¯ − µ x 2ω µ¯ − µ x 2ω¯ t=
(7.20)
We need to perform a linear change of variables in order to have the unit field e in the form $e = -\partial_{t_1}$. After this change of variables, we get the following set of flat coordinates for the metric (7.19):
$$t_1 := s + \sigma^{-1}t, \qquad t_4 := s - \sigma^{-1}t,$$
$$t_2 := t^{0;1} = \frac{1}{\omega}\,\frac{\bar\mu-\sigma}{\bar\mu-\mu}, \qquad t_5 := t^{\bar 0;1} = \frac{1}{\bar\omega}\,\frac{\sigma-\mu}{\bar\mu-\mu},$$
$$t_3 := r = \frac{1}{2\pi i}\,\frac{(\bar\mu-\sigma)\mu}{\bar\mu-\mu}, \qquad t_6 := u = \frac{1}{2\pi i}\,\frac{\bar\mu-\sigma}{\bar\mu-\mu}. \tag{7.21}$$
In the coordinates (7.21), the metric has the form:
$$ds^2_\Phi = \frac12\,(dt_2)^2 + \frac12\,(dt_5)^2 - dt_1\,dt_3 + \sigma\,dt_1\,dt_6 - dt_3\,dt_4 - \sigma\,dt_4\,dt_6. \tag{7.22}$$
The period of the torus and its complex conjugate can be expressed in terms of the coordinates (7.21) as follows: $\mu = t_3/t_6$ and $\bar\mu = (\sigma - 2\pi i\,t_3)/(1 - 2\pi i\,t_6)$, respectively.
Then, the prepotential (5.35) is the following function of 6 variables: t54 πi 1 t22 1 t24 t3 2π it3 − σ − − γ γ (t1 + t4 ) 2 2 64π i t6 t6 16 (2πit6 − 1) 1 − 2π it6 8π i t6 σ 2 1 t3 πi 1 − (t1 + t4 )2 + (t − t42 ) + × 8π i 1 8πi t6 2t6 (t3 − σ t6 ) (2π it6 − 1) 2 (t + t52 )t6 t2 (t1 + t4 )t3 × 2 − 2 − (t1 + t4 )t3 t6 + 4πi 2π i 2 2 σ (t1 − t4 )t6 +σ (t1 − t4 )t62 − . (7.23) 2πi
Fs +σ t = −
In the limit $\sigma \to 0$, the metric (7.22) becomes singular and the function (7.23) does not satisfy the WDVV system. To obtain from (7.23) the prepotential $F_{\Phi_s}$, corresponding to the case $\sigma = 0$, one has to rewrite $F_{\Phi_s+\sigma\Phi_t}$ in terms of the original variables (7.20) and then put $\sigma = 0$. The function $F_{\Phi_s+\sigma\Phi_t}$ is quasihomogeneous. The quasihomogeneity factors $\{\nu_i\}$ and $\nu_F$ are given by (7.14).

The G-function for $\widehat{M}^{\Phi_s+\sigma\Phi_t}_{1;1}$ is given by
$$G = -\log\left\{\eta\Big(\frac{t_3}{t_6}\Big)\,\eta\Big(\frac{2\pi i\,t_3-\sigma}{1-2\pi i\,t_6}\Big)\,(t_2t_5)^{1/8}\,\Big(\frac{t_3-\sigma t_6}{t_6(1-2\pi i\,t_6)}\Big)^{1/2}\right\} + \text{const}.$$
A computer check shows that functions $F_{\Phi_s}$ (7.13), $F_{\Phi_t}$ (7.18), and $F_{\Phi_s+\sigma\Phi_t}$ (7.23) indeed satisfy the WDVV system.

Acknowledgements. I am grateful to D. Korotkin, A. Kokotov, M. Bertola and S. Natanzon for many useful discussions and to B. Dubrovin for important comments and pointing out some mistakes in an earlier version of this paper.
References
1. Bertola, M.: Frobenius manifolds structure on orbit space of Jacobi groups; Parts I and II. Diff. Geom. Appl. 13, 19–41 and 213–233 (2000)
2. Dijkgraaf, R., Verlinde, E., Verlinde, H.: Nucl. Phys. B 352, 59 (1991); Notes on topological string theory and 2D quantum gravity. In: String theory and quantum gravity, Proc. of Trieste Spring School 1990, Green, M. et al. (eds.), Singapore: World Scientific, 1991, pp. 91–156
3. D’Hoker, E., Phong, D.H.: Functional determinants on Mandelstam diagrams. Commun. Math. Phys. 124(4), 629–645 (1989)
4. Dubrovin, B.: Integrable systems and classification of 2-dimensional topological field theories. In: Integrable systems (Luminy, 1991), Progr. Math. 115, Boston: Birkhäuser, 1993, pp. 313–359
5. Dubrovin, B.: Geometry of 2D topological field theories. In: Integrable Systems and Quantum Groups, Montecatini Terme (1993), Lecture Notes in Math. 1620, Berlin: Springer 1996; Geometry and analytic theory of Frobenius manifolds. In: Proceedings of the International Congress of Mathematicians, Vol. II, Berlin, 1998
6. Dubrovin, B., Zhang, Y.: Bi-Hamiltonian hierarchies in 2D topological field theory at one-loop approximation. Commun. Math. Phys. 198(2), 311–361 (1998)
7. Dzhamay, A.: Real-normalized Whitham hierarchies and the WDVV equations. Internat. Math. Res. Notices 21, 1103–1130 (2000)
8. Fay, J.: Kernel functions, analytic torsion, and moduli spaces. Memoirs of the AMS 96(464), AMS (1992)
9. Getzler, E.: Intersection theory on $\overline{M}_{1,4}$ and elliptic Gromov-Witten invariants. J. Amer. Math. Soc. 10(4), 973–998 (1997)
10. Givental, A.: Elliptic Gromov-Witten invariants and the generalized mirror conjecture. In: Integrable systems and algebraic geometry (Kobe/Kyoto, 1997), River Edge, NJ: World Sci. Publishing, 1998, pp. 107–155
11. Kokotov, A., Korotkin, D.: A new hierarchy of integrable systems associated to Hurwitz spaces. http://arxiv.org/list/math-ph/0112051, 2001
12. Kokotov, A., Korotkin, D.: Bergman tau-function on Hurwitz spaces and its applications. http://arxiv.org/list/math-ph/0310008, 2003
13. Kokotov, A., Korotkin, D.: On G-function of Frobenius manifolds related to Hurwitz spaces. IMRN 7, 343–360 (2004)
14. Manin, Yu.: Frobenius manifolds, quantum cohomology, and moduli spaces. Providence, RI: American Mathematical Society 1999
15. Rauch, H. E.: Weierstrass points, branch points, and moduli of Riemann surfaces. Comm. Pure Appl. Math. 12, 543–560 (1959)
16. Sonoda, H.: Functional determinants on punctured Riemann surfaces and their application to string theory. Nucl. Phys. B 294(1), 157–192 (1987)
17. Witten, E.: On the structure of the topological phase of two-dimensional gravity. Nucl. Phys. B 340, 281–332 (1990)

Communicated by G.W. Gibbons
Commun. Math. Phys. 256, 681–735 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1224-2
Communications in
Mathematical Physics
Anomalous Universality in the Anisotropic Ashkin–Teller Model
A. Giuliani1, V. Mastropietro2
1 Dipartimento di Fisica, Università di Roma “La Sapienza”, P.zzale A. Moro, 2, 00185 Roma, Italy, and INFN, sezione di Roma1, Roma, Italy. E-mail: [email protected]
2 Dipartimento di Matematica, Università di Roma “Tor Vergata”, Via della Ricerca Scientifica, 00133 Roma, Italy. E-mail: [email protected]
Received: 19 March 2004 / Accepted: 6 May 2004 Published online: 29 December 2004 – © Springer-Verlag 2004
Abstract: The Ashkin–Teller (AT) model is a generalization of the 2–d Ising model to a four states spin model; it can be written in the form of two Ising layers (in general with different couplings) interacting via a four–spin interaction. It was conjectured long ago (by Kadanoff and Wegner, Wu and Lin, Baxter and others) that AT has in general two critical points, and that universality holds, in the sense that the critical exponents are the same as in the Ising model, except when the couplings of the two Ising layers are equal (isotropic case). We obtain an explicit expression for the specific heat from which we prove this conjecture in the weakly interacting case and we locate precisely the critical points. We find the somewhat unexpected feature that, although universality holds for the specific heat, nonuniversal critical indexes nevertheless appear: for instance the distance between the critical points rescales with an anomalous exponent as we let the couplings of the two Ising layers coincide (isotropic limit); and so does the constant in front of the logarithm in the specific heat. Our result also explains how the crossover from universal to nonuniversal behaviour is realized.
1. Introduction

1.1. Historical introduction. Ashkin and Teller [AT] introduced their model as a generalization of the Ising model to a four component system; in each site of a bidimensional lattice there is a spin which can take four values, and only nearest neighbor spins interact. The model can be also considered a generalization of the four state Potts model to which it reduces for a suitable choice of the parameters. A very convenient representation of the Ashkin–Teller model is in terms of Ising spins [F]; one associates with each site x of the square lattice two spin variables, $\sigma_x^{(1)}$ and $\sigma_x^{(2)}$; the partition function is given by $\Xi^{AT}_{\Lambda_M} = \sum_{\sigma^{(1)},\sigma^{(2)}} e^{-H_{\Lambda_M}}$, where
Partially supported by NSF Grant DMR 01–279–26
A. Giuliani, V. Mastropietro
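\[
\begin{aligned}
H^{AT}_{\Lambda_M}(\sigma^{(1)},\sigma^{(2)}) &= J^{(1)} H_I(\sigma^{(1)}) + J^{(2)} H_I(\sigma^{(2)}) + \lambda V(\sigma^{(1)},\sigma^{(2)}) = \sum_{x\in\Lambda_M} H^{AT}_x ,\\
H_I(\sigma^{(j)}) &= -\sum_{x\in\Lambda_M} \big[\sigma^{(j)}_x\sigma^{(j)}_{x+\hat e_1} + \sigma^{(j)}_x\sigma^{(j)}_{x+\hat e_0}\big] ,\\
V(\sigma^{(1)},\sigma^{(2)}) &= -\sum_{x\in\Lambda_M} \big[\sigma^{(1)}_x\sigma^{(1)}_{x+\hat e_0}\sigma^{(2)}_x\sigma^{(2)}_{x+\hat e_0} + \sigma^{(1)}_x\sigma^{(1)}_{x+\hat e_1}\sigma^{(2)}_x\sigma^{(2)}_{x+\hat e_1}\big] ,
\end{aligned}
\tag{1.1}
\]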
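where $H_I$ is the Ising model Hamiltonian, $\hat e_1$, $\hat e_0$ are the unit vectors $\hat e_1 = (1,0)$, $\hat e_0 = (0,1)$ and $\Lambda_M$ is a square subset of $\mathbb{Z}^2$ of side $M$. The free energy and the specific heat are given by
\[
f = \lim_{M\to\infty}\frac{1}{M^2}\log \Xi^{AT}_{\Lambda_M}, \qquad C_v = \lim_{M\to\infty}\frac{1}{M^2}\sum_{x,y\in\Lambda_M} \langle H^{AT}_x H^{AT}_y\rangle_{\Lambda_M,T}, \tag{1.2}
\]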
where $\langle\cdot\rangle_{\Lambda_M,T}$ denotes the truncated expectation w.r.t. the Gibbs distribution with the Hamiltonian (1.1). The case $J^{(1)} = J^{(2)}$ is called isotropic. For $\lambda = 0$ the model reduces to two independent Ising models and it has two critical points if $J^{(1)} \neq J^{(2)}$; it was conjectured by Kadanoff and Wegner [K, KW] and later on by Wu and Lin [WL] that the AT model has in general two critical points also when $\lambda \neq 0$, except when the model is isotropic. The isotropic case was studied by Kadanoff [K] who, by scaling theory, conjectured a relation between the critical exponents of isotropic AT and those of the Eight vertex model, which had been solved by Baxter and has nonuniversal indexes. Further evidence for the validity of Kadanoff’s prediction was given by [PB] (using second order renormalization group arguments) and by [LP, N] (by a heuristic mapping of both models into the massive Luttinger model describing one dimensional interacting fermions in the continuum). Indeed nonuniversal critical behaviour in the specific heat in the isotropic AT model, for small $\lambda$, has been rigorously established in [M1]. The anisotropic case is much less understood. As we said, it is believed that there are two critical points, contrary to what happens in the isotropic case. Baxter [Ba] conjectured that "presumably" universality holds at the critical points for $J^{(1)} \neq J^{(2)}$ (i.e. the critical indices are the same as in the Ising model), except when $J^{(1)} = J^{(2)}$, when the two critical points coincide and nonuniversal behaviour is found.
Since the 1970’s, the anisotropic AT model has been studied by various approximate or numerical methods: Migdal–Kadanoff Renormalization Group [DR], Monte Carlo Renormalization group [Be], finite size scaling [Bad]; such results give evidence of the fact that, far away from the isotropic point, AT has two critical points and belongs to the same universality class as Ising; however they do not give information about the precise relative location of the critical points and the critical behaviour of the specific heat when $J^{(1)}$ is close to $J^{(2)}$. The problem of how the crossover from universal to nonuniversal behaviour is realized in the isotropic limit remained for years completely unsolved, even at a heuristic level. We will study the anisotropic Ashkin–Teller model by writing the partition function and the specific heat as Grassmann integrals corresponding to a $d = 1 + 1$ interacting fermionic theory; this is possible because the Ising model can be reformulated as a free fermions model (see [SML, H, S or ID]). One can then take advantage of the theory of Grassmann integrals for weakly interacting $d = 1 + 1$ fermions, which is quite well developed, starting from [BG1] (see also [BG, GM or BM] for extensive reviews). Fermionic RG methods for classical spin models have already been applied in [PS] to the Ising model perturbed by a four spin interaction, proving a universality result for the
Anomalous Universality in the Anisotropic Ashkin–Teller Model
specific heat; and in [M1] to prove a nonuniversality result for the 8 vertex or the isotropic AT model. By such techniques one can develop a perturbative expansion, convergent up to the critical points, uniformly in the parameters. 1.2. Main results. We find it convenient to introduce the variables $t^{(j)} = \tanh J^{(j)}$, $j = 1, 2$, and
\[
t = \frac{t^{(1)} + t^{(2)}}{2}, \qquad u = \frac{t^{(1)} - t^{(2)}}{2}. \tag{1.3}
\]
The parameter $u$ measures the anisotropy of the system. We consider then the free energy or the specific heat as functions of $t, u, \lambda$. If $\lambda = 0$, AT is exactly solvable, because the Hamiltonian (1.1) is the sum of two independent Ising model Hamiltonians. From the Ising model exact solution [O, SML, MW] one finds that $f$ is analytic for all $t, u$ except for
\[
t = t_c^\pm = \sqrt2 - 1 \pm |u|, \tag{1.4}
\]
and for $t$ close to $t_c^\pm$ the specific heat $C_v$ has a logarithmic divergence: $C_v \simeq -C\log|t - t_c^\pm|$, where $C > 0$ and $\simeq$ means that the ratio of both sides tends to 1 as $t \to t_c^\pm$. We consider the case in which $\lambda$ is small with respect to $\sqrt2 - 1$ and we distinguish two regimes. 1) If $u$ is much bigger than $\lambda$ (so that the unperturbed critical points are well separated) we find that the presence of $\lambda$ just changes by a small amount the location of the critical points, i.e. we find that the critical points have the form $t_c^\pm = \sqrt2 - 1 + O(\lambda) \pm |u|\big(1 + O(\lambda)\big)$; moreover the asymptotic behaviour of $C_v$ at criticality remains essentially unchanged: $C_v \simeq -C\log|t - t_c^\pm|$. 2) When $u$ is small compared to $\lambda$ the interaction has a more dramatic effect. We find that the system still has only two critical points $t_c^\pm(\lambda, u)$; their center $(t_c^+ + t_c^-)/2$ is just shifted by $O(\lambda)$ from $\sqrt2 - 1$, as in item (1); however their relative location scales, as $u \to 0$, with an “anomalous critical exponent” $\eta(\lambda)$, continuously varying with $\lambda$: more precisely we find that $t_c^+ - t_c^- = O(|u|^{1+\eta})$, where $\eta$ is analytic in $\lambda$ near $\lambda = 0$ and $\eta = -b\lambda + O(\lambda^2)$, $b > 0$. In particular the relative location of the critical points as a function of the anisotropy parameter $u$, with $\lambda$ fixed and small, has a different qualitative behaviour depending on the sign of $\lambda$, see Fig 1. For $t \to t_c^\pm(\lambda, u)$ the specific heat $C_v$ still has a logarithmic divergence but, for all $u \neq 0$, the constant in front of the log is $O(|u|^{\eta_c})$, where $\eta_c$ is analytic in $\lambda$ for small $\lambda$ and $\eta_c = a\lambda + O(\lambda^2)$, $a \neq 0$. The logarithmic behaviour is found only in an extremely small region around the critical points; outside this region, $C_v$ varies as $t \to t_c^\pm(\lambda, u)$ according to a power law behaviour with nonuniversal exponent.
The conclusion is that, for all $u \neq 0$, there is universality for the specific heat (which diverges with the same exponent as in the Ising model); nevertheless nonuniversal critical indexes appear in the theory, in the difference between the critical points and in the constant in front of the logarithm in the specific heat. One can speak of anomalous universality, as the specific heat diverges at criticality as in Ising, but the isotropic limit $u \to 0$ is reached with nonuniversal critical indices. With the notations introduced above and calling $D$ a sufficiently small $O(1)$ interval (i.e. with amplitude independent of $\lambda$) centered around $\sqrt2 - 1$, we can express our main result as follows.
Fig. 1. The qualitative behaviour of $t_c^+(\lambda, u) - t_c^-(\lambda, u)$ as a function of $u$ for two different values of $\lambda$ (in arbitrary units). The graphs are (qualitative) plots of $2|u|^{1+\eta}$, with $\eta \simeq -b\lambda$, $b > 0$
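As a rough numerical companion to Fig. 1, the following minimal sketch evaluates the free critical points (1.4) and the leading-order gap $2|u|^{1+\eta}$ plotted in the figure. The function names and the sample values of $\eta$ are ours and purely illustrative; the constant $b$ relating $\eta$ to $\lambda$ is not specified here, so $\eta$ is passed in directly.

```python
import math

SQRT2_MINUS_1 = math.sqrt(2.0) - 1.0


def free_critical_points(u):
    """Critical points (1.4) of the decoupled model (lambda = 0)."""
    return SQRT2_MINUS_1 - abs(u), SQRT2_MINUS_1 + abs(u)


def critical_gap(u, eta):
    """Leading-order gap t_c^+ - t_c^- ~ 2|u|^(1+eta), as plotted in Fig. 1.

    eta is an input here (hypothetical sample values); the corrections
    F^+- of order lambda are dropped.
    """
    return 2.0 * abs(u) ** (1.0 + eta)


def local_exponent(eta, u1=1e-4, u2=1e-3):
    """Slope of log(gap) versus log|u| between two small anisotropies."""
    g1, g2 = critical_gap(u1, eta), critical_gap(u2, eta)
    return math.log(g2 / g1) / math.log(u2 / u1)


if __name__ == "__main__":
    for eta in (0.1, 0.0, -0.1):
        print(eta, critical_gap(0.01, eta), local_exponent(eta))
```

For $\eta > 0$ the gap closes faster than linearly as $u \to 0$, for $\eta < 0$ more slowly, which is the qualitative difference between the two curves in the figure.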
Main Theorem. There exists $\varepsilon_1$ such that, for $t \pm u \in D$ (that is, $t^{(j)} \in D$, $j = 1, 2$) and $|\lambda| \le \varepsilon_1$, one can define two functions $t_c^\pm(\lambda, u)$ with the following properties:
\[
t_c^\pm(\lambda, u) = \sqrt2 - 1 + \nu^*(\lambda) \pm |u|^{1+\eta}\big(1 + F^\pm(\lambda, u)\big), \tag{1.5}
\]
where $|\nu^*(\lambda)| \le c|\lambda|$, $|F^\pm(\lambda, u)| \le c|\lambda|$, for some positive constant $c$, and $\eta = \eta(\lambda)$ is an analytic function of $\lambda$ s.t. $\eta(\lambda) = -b\lambda + O(\lambda^2)$, $b > 0$, and: 1) the free energy $f(t, u, \lambda)$ and the specific heat $C_v(t, u, \lambda)$ in (1.2) are analytic in the region $t \pm u \in D$, $|\lambda| \le \varepsilon_1$ and $t \neq t_c^\pm(\lambda, u)$; 2) in the same region of parameters, the specific heat can be written as:
\[
C_v = -C_1\,\Delta^{2\eta_c}\log\frac{|t - t_c^-|\cdot|t - t_c^+|}{\Delta^2} + C_2\,\frac{1 - \Delta^{2\eta_c}}{\eta_c} + C_3, \tag{1.6}
\]
where $\Delta^2 \overset{def}{=} (t - \bar t_c)^2 + (u^2)^{1+\eta}$ and $\bar t_c \overset{def}{=} (t_c^+ + t_c^-)/2$; the exponent $\eta_c = \eta_c(\lambda) = a\lambda + O(\lambda^2)$, $a \neq 0$, is analytic in $\lambda$; the functions $C_j = C_j(\lambda, t, u)$, $j = 1, 2, 3$, are bounded above and below by $O(1)$ constants; finally $C_1 - C_2$ vanishes for $\lambda = u = 0$. Remarks. 1) The key hypothesis for the validity of the Main Theorem is the smallness of $\lambda$. When $\lambda = 0$ the critical points correspond to $t \pm u = \sqrt2 - 1$: hence for simplicity we restrict $t \pm u$ to a sufficiently small $O(1)$ interval around $\sqrt2 - 1$. A possible explicit choice for $D$, convenient for our proof, could be $D = \big[\tfrac{3(\sqrt2 - 1)}{4}, \tfrac{5(\sqrt2 - 1)}{4}\big]$. Our technique would allow us to prove the above theorem, at the cost of a lengthier discussion, for any $t^{(1)}, t^{(2)} > 0$: of course in that case we should distinguish different regions of parameters and treat in a different way the cases of low or high temperature or the case of big anisotropy (i.e. the cases $t \ll \sqrt2 - 1$ or $t \gg \sqrt2 - 1$ or $|u| \gg 1$). 2) Equation (1.6) shows how the crossover from universal to nonuniversal behaviour is realized. When $u \neq 0$ only the first term in (1.6) can be singular in correspondence to the two critical points; it has a logarithmic singularity (as in the Ising model) with a constant $O(\Delta^{2\eta_c})$ in front. However the logarithmic term dominates the second one only if $t$ varies inside an extremely small region $O(|u|^{1+\eta} e^{-a/|\lambda|})$, $a > 0$, around the critical points. Outside such a region the power law behaviour corresponding to the second addend in (1.6) dominates. When $u \to 0$ one recovers the power law decay found in [M1] for the isotropic case. See Fig 2.
Fig. 2. The qualitative behaviour of $C_v$ as a function of $t - \bar t_c$, where $\bar t_c = (t_c^+ + t_c^-)/2$. The three graphs are plots of (1.6), with $C_1 = C_2 = 1$, $C_3 = 0$, $u = 0.01$, $\eta = \eta_c = 0.1, 0, -0.1$ respectively; the central curve corresponds to $\lambda = 0$, the upper one to $\lambda > 0$ and the lower to $\lambda < 0$
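The curves of Fig. 2 can be reproduced, at least qualitatively, by evaluating (1.6) directly. The sketch below is only illustrative: it takes $C_1, C_2, C_3$ as constants (as in the caption), drops the $O(\lambda)$ corrections $F^\pm$ so that $t_c^\pm = \bar t_c \pm |u|^{1+\eta}$, and treats $\eta_c = 0$ through the limit $(1 - \Delta^{2\eta_c})/\eta_c \to -\log \Delta^2$. The function name is ours.

```python
import math


def specific_heat(t_minus_tc, u, eta, eta_c, c1=1.0, c2=1.0, c3=0.0):
    """Evaluate the right-hand side of (1.6) away from the critical points.

    t_minus_tc is t - tbar_c. Assumptions of this sketch: constant C_j,
    and t_c^+- = tbar_c +- |u|^(1+eta), i.e. F^+- is neglected.
    """
    shift = abs(u) ** (1.0 + eta)
    delta2 = t_minus_tc ** 2 + (u * u) ** (1.0 + eta)   # Delta^2
    prod = abs(t_minus_tc + shift) * abs(t_minus_tc - shift)
    # Delta^(2 eta_c) = (Delta^2)^eta_c
    log_term = -c1 * delta2 ** eta_c * math.log(prod / delta2)
    if eta_c == 0.0:
        power_term = -c2 * math.log(delta2)   # limit eta_c -> 0
    else:
        power_term = c2 * (1.0 - delta2 ** eta_c) / eta_c
    return log_term + power_term + c3


if __name__ == "__main__":
    u = 0.01
    for eta in (0.1, 0.0, -0.1):
        row = [specific_heat(x, u, eta, eta) for x in (-0.2, -0.05, 0.05, 0.2)]
        print(eta, row)
```

With $\eta_c = 0$ and $C_1 = C_2$ the two terms recombine into $-C_1 \log\big(|t - t_c^-|\,|t - t_c^+|\big)$, the sum of two Ising-like logarithms, consistent with the statement that $C_1 - C_2$ vanishes at $\lambda = u = 0$.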
3) By the result of item (1) of the Main Theorem, $C_v$ is analytic in $\lambda, t, u$ outside the critical line. This is not apparent from (1.6), because $\Delta$ is non-analytic in $u$ at $u = 0$ (of course the bounded functions $C_j$ are non-analytic in $u$ also, in a suitable way compensating the non-analyticity of $\Delta$). We get to (1.6) by interpolating two different asymptotic behaviours of $C_v$ in the regions $|t - \bar t_c| < 2|u|^{1+\eta}$ and $|t - \bar t_c| \ge 2|u|^{1+\eta}$; the non-analyticity of $\Delta$ is introduced “by hand” by our estimates and it is not intrinsic to $C_v$. Equation (1.6) is simply a convenient way to describe the crossover between different critical behaviours of $C_v$. 4) We do not study the free energy directly at $t = t_c^\pm(\lambda, u)$; therefore in order to show that $t = t_c^\pm(\lambda, u)$ is a critical point we must study some thermodynamic property like the specific heat by evaluating it at $t \neq t_c^\pm(\lambda, u)$ and $M = \infty$ and then verify that it has a singular behavior as $t \to t_c^\pm$. The case $t$ precisely equal to $t_c^\pm$ cannot be discussed at the moment with our techniques, in spite of the uniformity of our bounds as $t \to t_c^\pm$. The reason is that we write the AT partition function as a sum of 16 different partition functions, differing by boundary terms. Our estimates on each single term are uniform up to the critical point; however, in order to show that the free energy computed with one of the 16 terms is the same as the complete free energy, we need to stay at $t \neq t_c^\pm$: in this case boundary terms are suppressed as $\sim e^{-\kappa M|t - t_c^\pm|}$, $\kappa > 0$, as $M \to \infty$. If we stay exactly at the critical point, cancellations between the 16 terms can be present (as is well known already from the Ising model exact solution [MW]) and we do not have control on the behaviour of the free energy as the infinite volume limit is approached. 1.3. Strategy of the proof. It is well known that the free energy and the specific heat of the Ising model can be expressed as a sum of Pfaffians [MW] which can be equivalently
written, see [ID, S], as Grassmann functional integrals; see for instance App. A of [M1] or §4 of [GM] for the basic definitions of Grassmann variables and Grassmann integration. The formal action of the Ising model in terms of Grassmann variables $\psi, \bar\psi$ has the form
\[
\frac{t}{4}\sum_x \Big[\psi_x(\partial_1 - i\partial_0)\psi_x + \bar\psi_x(\partial_1 + i\partial_0)\bar\psi_x - 2i\bar\psi_x(\partial_1 + \partial_0)\psi_x + i(\sqrt2 - 1 - t)\bar\psi_x\psi_x\Big], \tag{1.7}
\]
where $\partial_j$ are discrete derivatives. $\psi$ and $\bar\psi$ are called Majorana fields, see [ID], because of an analogy with relativistic Majorana fermions. They are massive, because of the presence of the last term in (1.7); criticality corresponds to the massless case ($t = \sqrt2 - 1$). If $\lambda = 0$ the free energy and specific heat can be written as the sum of Grassmann integrals describing two kinds of Majorana fields, with masses $m^{(1)} = t^{(1)} - \sqrt2 + 1$ and $m^{(2)} = t^{(2)} - \sqrt2 + 1$. The critical points are obtained by choosing one of the two fields massless (in the isotropic case $t^{(1)} = t^{(2)}$ and the two fields become massless together). If $\lambda \neq 0$, again the free energy and the specific heat can be written as Grassmann integrals, but the Majorana fields are interacting with a short range potential. By performing a suitable change of variables, the partition function can be written, see §2 and §3, as a sum of terms $\Xi^{\gamma_1,\gamma_2}_{AT}$ ($\gamma_1, \gamma_2$ label different boundary conditions) of the form
\[
\Xi^{\gamma_1,\gamma_2}_{AT} = \int P(d\psi)\, e^{-V^{(1)}(\sqrt{Z_1}\psi)}, \qquad P(d\psi) = D\psi\, e^{-Z_1(\psi^+, A\psi)}, \tag{1.8}
\]
where $\psi = \{\psi^+_{\omega,x}, \psi^-_{\omega,x}\}_{\omega=\pm1}$ are elements of a Grassmann algebra; $D\psi$ is a symbol for the Grassmann integration; $V^{(1)}$ is a short range interaction, sum of monomials in $\psi$ of any degree, whose quartic term is weighted by a constant $\lambda_1 = O(\lambda)$; and $Z_1(\psi^+, A\psi)$ has the form:
\[
Z_1\sum_{x,\omega}\Big[\psi^+_{\omega,x}(\partial_1 - i\omega\partial_0)\psi^-_{\omega,x} - i\omega\sigma_1\psi^+_{\omega,x}\psi^-_{-\omega,x} + i\omega\mu_1\psi^\alpha_{\omega,x}\psi^\alpha_{-\omega,-x} - \beta_1\psi^\alpha_{\omega,x}(\partial_1 - i\omega\partial_0)\psi^\alpha_{\omega,x}\Big] \tag{1.9}
\]
with $\sigma_1 = O(t - \sqrt2 + 1) + O(\lambda)$, $\mu_1, \beta_1 = O(u)$ (in particular in the isotropic case the terms proportional to $\mu_1$ and $\beta_1$ are absent). If $\lambda = 0$, $\sigma_1 = (m^{(1)} + m^{(2)})/2$ and $\mu_1 = (m^{(2)} - m^{(1)})/2$. $\psi^\pm$ are called Dirac fields, because of an analogy with relativistic Dirac fermions; they are combinations of the Majorana variables $\psi^{(j)}, \bar\psi^{(j)}$, $j = 1, 2$, associated with the two Ising layers in (1.1): hence the description in terms of Dirac variables mixes intrinsically the two Ising models and will be useful in a range of momentum scales in which the two layers appear to be essentially equal. One can compute $\Xi^{\gamma_1,\gamma_2}_{AT}$ by expanding $e^{-V^{(1)}(\sqrt{Z_1}\psi)}$ in Taylor series and integrating term by term the Grassmann monomials; since the propagators of $P(d\psi)$ (i.e. the elements of $A^{-1}$, see (1.8), (1.9)) diverge for $\mathbf{k} = 0$ and $\sigma_1 \pm \mu_1 = 0$ in the infinite volume limit $M \to \infty$, the series can converge uniformly in $M$ only in a region outside $|\sigma_1 \pm \mu_1| \le c$, for some $c$, i.e. in the thermodynamic limit it can converge only far from the critical points. Since we are interested in the critical behaviour of the system, we set up a more complicated procedure to evaluate the partition function, based on the (Wilsonian) Renormalization Group (RG). The first step is to decompose the integration $P(d\psi)$ as a product
of independent integrations: $P(d\psi) = \prod_{h=-\infty}^{1} P(d\psi^{(h)})$, where the momentum space propagator corresponding to $P(d\psi^{(h)})$ is not singular, but $O(\gamma^{-h})$, for $M \to \infty$, $\gamma$ being a fixed scaling parameter larger than 1. This decomposition is realized by slicing in a smooth way the momentum space, so that $\psi^{(h)}$, if $h \le 0$, depends only on the momenta between $\gamma^{h-1}$ and $\gamma^{h+1}$. We compute the Grassmann integrals defining the partition function by iteratively integrating the fields $\psi^{(1)}, \psi^{(0)}, \ldots$, see §4. After each integration step we rewrite the partition function in a way similar to (1.8), with the quadratic form $Z_1(\psi^+, A\psi)$ replaced by $Z_h(\psi^+, A^{(h)}\psi)$, which has the same structure as (1.9), with $Z_h, \sigma_h, \mu_h$ replacing $Z_1, \sigma_1, \mu_1$; the structure of $Z_h(\psi^+, A^{(h)}\psi)$ is preserved because of symmetry properties, guaranteeing that many other possible quadratic “local” terms are indeed vanishing, or irrelevant in a RG sense. The interaction $V^{(1)}$ is replaced by an effective action $V^{(h)}$, $h \le 0$, given by a sum of monomials of $\psi$ of arbitrary order, with kernels decaying in real space on scale $\gamma^{-h}$; in particular the quartic term is weighted by a coupling constant $\lambda_h$ and the kernels of $V^{(h)}$ are analytic functions of $\{\lambda_h, \ldots, \lambda_1\}$, if $\lambda_k$ are small enough, $k \ge h$, and $|\sigma_k|\gamma^{-k}, |\mu_k|\gamma^{-k} \le 1$ (say; the constant 1 could be replaced by any other constant $O(1)$). In this way the problem of finding good bounds for $\log \Xi^{AT}_{\Lambda_M}$ is reformulated into the problem of controlling the size of $\lambda_h, \sigma_h, \mu_h$, $h \le 0$, under the RG iterations. We use a crucial property, called vanishing of Beta function, to prove that actually, if $\lambda$ is small enough, $|\lambda_h| \le 2|\lambda_1|$ (recall that $\lambda_1 = O(\lambda)$). The possibility of controlling the flow of $\lambda_h$ is the main reason for describing the system in terms of Dirac variables. For $\sigma_h, \mu_h, Z_h$, we find that, under RG iterations, they evolve as: $\sigma_h \simeq \sigma_1\gamma^{b_2\lambda h}$, $\mu_h \simeq \mu_1\gamma^{-b_2\lambda h}$, $Z_h \simeq \gamma^{-b_1\lambda^2 h}$.
Note in particular that $Z_h$ grows exponentially with an exponent $O(\lambda^2)$; this is connected with the presence of “critical indexes” in the correlation functions, which means that their long distance behaviour is qualitatively changed by the interaction. We perform the iterative integration described above up to a scale $h_1^*$ such that $(|\sigma_{h_1^*}| + |\mu_{h_1^*}|)\gamma^{-h_1^*} = O(1)$, in such a way that $(|\sigma_h| + |\mu_h|)\gamma^{-h} \le O(1)$, for all $h \ge h_1^*$, and convergence of the kernels of the effective potential can be guaranteed by our estimates. In the range of scales $h \ge h_1^*$ the flow of the effective coupling constant $\lambda_h$ is essentially the same as for the isotropic AT model [M1] (since $|\mu_h|\gamma^{-h}$ is small, the iteration “does not see” the anisotropy and the system seems to behave as if there was just one critical point) and nonuniversal critical indexes are generated (they appear in the flows of $\sigma_h$, $\mu_h$ and $Z_h$), following the same mechanism as in the isotropic case. We note that after the integration of $\psi^{(1)}, \ldots, \psi^{(h_1^*+1)}$, we can still reformulate the problem in terms of the original Majorana fermions $\psi^{(1,\le h_1^*)}, \psi^{(2,\le h_1^*)}$ associated with the two Ising models in (1.1). On scale $h_1^*$ their masses are deeply changed w.r.t. $t^{(1)} - \sqrt2 + 1$ and $t^{(2)} - \sqrt2 + 1$: they are given by $m^{(1)}_{h_1^*} = |\sigma_{h_1^*}| + |\mu_{h_1^*}|$ and $m^{(2)}_{h_1^*} = |\sigma_{h_1^*}| - |\mu_{h_1^*}|$. Note that the condition $|\sigma_{h_1^*}| + |\mu_{h_1^*}| = O(\gamma^{h_1^*})$ implies that the field $\psi^{(1,\le h_1^*)}$ is massive on scale $h_1^*$ (so that the Ising layer with $j = 1$ is “far from criticality” on the same scale). This implies that we can integrate (without any multiscale decomposition) the massive Majorana field $\psi^{(1,\le h_1^*)}$, obtaining an effective theory of a single Majorana field with mass $|\sigma_{h_1^*}| - |\mu_{h_1^*}|$, which can be arbitrarily small. The integration of the scales $\le h_1^*$, see §6, is done again by a multiscale decomposition similar to the one just described; an important feature is however that there are no more quartic marginal terms, because the anticommutativity of Grassmann variables forbids local quartic monomials of a single Majorana fermion. The problem is essentially equivalent
to the study of a single perturbed Ising model with “upper” cutoff on momentum space $O(\gamma^{h_1^*})$ and mass $|\sigma_{h_1^*}| - |\mu_{h_1^*}|$. The flow of the effective mass and of $Z_h$ is nonanomalous in this regime: in particular the mass of the Majorana field is just shifted by $O(\lambda\gamma^{h_1^*})$ from $|\sigma_{h_1^*}| - |\mu_{h_1^*}|$. Criticality is found when the effective mass on scale $-\infty$ is vanishing; the values of $t, u$ for which this happens are found by solving a non-trivial implicit function problem. Finally, see §7, we define a similar expansion for the specific heat and we compute its asymptotic behaviour arbitrarily near the critical points. Technically it is an interesting feature of this problem that there are two regimes in which the system must be described in terms of different fields: the first one in which the natural variables are Dirac Grassmann variables, and the second one in which they are Majorana; note that the scale separating the two regimes is dynamically generated by the RG iterations (and of course its precise location is not crucial and $h_1^*$ can be modified into $h_1^* + n$, $n \in \mathbb{Z}$, without qualitatively affecting the bounds). 2. Fermionic Representation 2.1. The partition function $\Xi_I^{(j)} = \sum_{\sigma^{(j)}} \exp\{-J^{(j)} H_I(\sigma^{(j)})\}$ of the Ising model can be written as a Grassmann integral; this is a classical result, mainly due to [LMS], [Ka, H, MW, S]. In Appendix A1, starting from a formula obtained in [MW], we prove that
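\[
\Xi_I^{(j)} = (-1)^{M^2}\,\frac{(2\cosh J^{(j)})^{M^2}}{2}\sum_{\varepsilon,\varepsilon'=\pm}\int\prod_{x\in\Lambda_M} dH_x^{(j)}\, d\bar H_x^{(j)}\, dV_x^{(j)}\, d\bar V_x^{(j)}\; (-1)^{\delta_\gamma}\, e^{S_\gamma^{(j)}(t^{(j)})}, \tag{2.1}
\]
where $j = 1, 2$ denotes the lattice, $\gamma = (\varepsilon, \varepsilon')$ and $\delta_\gamma$ is $\delta_{+,+} = 1$, $\delta_{+,-} = \delta_{-,+} = \delta_{-,-} = 2$ and, if $t^{(j)} = \tanh J^{(j)}$,
\[
\begin{aligned}
S_\gamma^{(j)}(t^{(j)}) ={}& t^{(j)}\sum_{x\in\Lambda_M}\Big[\bar H_x^{(j)} H_{x+\hat e_1}^{(j)} + \bar V_x^{(j)} V_{x+\hat e_0}^{(j)}\Big]\\
&+ \sum_{x\in\Lambda_M}\Big[\bar H_x^{(j)} H_x^{(j)} + \bar V_x^{(j)} V_x^{(j)} + \bar V_x^{(j)}\bar H_x^{(j)} + V_x^{(j)}\bar H_x^{(j)} + H_x^{(j)}\bar V_x^{(j)} + V_x^{(j)} H_x^{(j)}\Big],
\end{aligned}
\tag{2.2}
\]
where $H_x^{(j)}, \bar H_x^{(j)}, V_x^{(j)}, \bar V_x^{(j)}$ are Grassmann variables verifying different boundary conditions depending on the label $\gamma = (\varepsilon, \varepsilon')$, which is not affixed explicitly, to simplify the notations, i.e.
\[
\bar H^{(j)}_{x+M\hat e_0} = \varepsilon\bar H^{(j)}_x, \quad \bar H^{(j)}_{x+M\hat e_1} = \varepsilon'\bar H^{(j)}_x, \quad H^{(j)}_{x+M\hat e_0} = \varepsilon H^{(j)}_x, \quad H^{(j)}_{x+M\hat e_1} = \varepsilon' H^{(j)}_x, \qquad \varepsilon, \varepsilon' = \pm \tag{2.3}
\]
and identical definitions are set for the variables $V^{(j)}, \bar V^{(j)}$; we shall say that $\bar H^{(j)}, H^{(j)}, \bar V^{(j)}, V^{(j)}$ satisfy $\varepsilon$–periodic ($\varepsilon'$–periodic) boundary conditions in the vertical (horizontal) direction.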
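2.2. By expanding in power series $\exp\{-\lambda V\}$, we see that the partition function of the model (1.1) is
\[
\begin{aligned}
\Xi^{AT}_{\Lambda_M} &= \sum_{\sigma^{(1)},\sigma^{(2)}} e^{-J^{(1)}H_I(\sigma^{(1)})}\, e^{-J^{(2)}H_I(\sigma^{(2)})}\, e^{-\lambda V(\sigma^{(1)},\sigma^{(2)})}\\
&= (\cosh\lambda)^{2M^2}\sum_{\sigma^{(1)},\sigma^{(2)}} e^{-J^{(1)}H_I(\sigma^{(1)}) - J^{(2)}H_I(\sigma^{(2)})}\\
&\qquad\cdot\prod_{x\in\Lambda_M}\Big(1 + \hat\lambda\,\sigma_x^{(1)}\sigma_{x+\hat e_1}^{(1)}\sigma_x^{(2)}\sigma_{x+\hat e_1}^{(2)}\Big)\Big(1 + \hat\lambda\,\sigma_x^{(1)}\sigma_{x+\hat e_0}^{(1)}\sigma_x^{(2)}\sigma_{x+\hat e_0}^{(2)}\Big),
\end{aligned}
\tag{2.4}
\]
where $\hat\lambda = \tanh\lambda$. The r.h.s. of (2.4) can be rewritten as:
\[
\Xi^{AT}_{\Lambda_M} = (\cosh\lambda)^{2M^2}\prod_{\substack{x\in\Lambda_M\\ i=0,1}}\Big(1 + \hat\lambda\,\frac{\partial^2}{\partial J^{(1)}_{x,x+\hat e_i}\,\partial J^{(2)}_{x,x+\hat e_i}}\Big)\,\Xi_I^{(1)}(\{J^{(1)}_{x,x'}\})\,\Xi_I^{(2)}(\{J^{(2)}_{x,x'}\})\,\Big|_{\{J^{(j)}_{x,x'}\}=\{J^{(j)}\}}, \tag{2.5}
\]
where $\Xi_I^{(j)}(\{J^{(j)}_{x,x'}\})$ is the partition function of an Ising model in which the couplings are allowed to depend on the bonds (the coupling associated to the n.n. bond $(x, x')$ on the lattice $j$ is called $J^{(j)}_{x,x'}$). Using for $\Xi_I^{(j)}(\{J^{(j)}_{x,x'}\})$ an expression similar to (2.1), we find that we can express $\Xi^{AT}_{\Lambda_M}$ as a sum of sixteen partition functions labeled by $\gamma_1, \gamma_2 = (\varepsilon_1, \varepsilon_1'), (\varepsilon_2, \varepsilon_2')$ (corresponding to choosing each $\varepsilon_j$ and $\varepsilon_j'$ as $\pm$):
\[
\Xi^{AT}_{\Lambda_M} = \frac14(\cosh\lambda)^{2M^2}\sum_{\gamma_1,\gamma_2}(-1)^{\delta_{\gamma_1}+\delta_{\gamma_2}}\,\Xi^{\gamma_1,\gamma_2}_{AT}, \tag{2.6}
\]
each of which is given by a functional integral
\[
\Xi^{\gamma_1,\gamma_2}_{AT} = 4\big(1 + \hat\lambda t^{(1)}t^{(2)}\big)^{2M^2}\prod_{j=1}^{2}(\cosh J^{(j)})^{M^2}(-1)^{M^2}\cdot\int\prod_{j=1,2}\prod_{x\in\Lambda_M} dH_x^{(j)}\, d\bar H_x^{(j)}\, dV_x^{(j)}\, d\bar V_x^{(j)}\; e^{S^{(1)}_{\gamma_1}(t^{(1)}_\lambda) + S^{(2)}_{\gamma_2}(t^{(2)}_\lambda) + V_\lambda}, \tag{2.7}
\]
where, if we define
\[
\lambda^{(j)} = \hat\lambda\,\frac{t(1 - t^2 + u^2) + (-1)^j u(1 + t^2 - u^2)}{1 + \hat\lambda(t^2 - u^2)}, \tag{2.8}
\]
we have that $t^{(j)}_\lambda$, $j = 1, 2$, is given by $t^{(j)}_\lambda = t^{(j)} + \lambda^{(j)}$ and $V_\lambda$ by:
\[
V_\lambda = \tilde\lambda\sum_{x\in\Lambda_M}\Big[\bar H^{(1)}_x H^{(1)}_{x+\hat e_1}\bar H^{(2)}_x H^{(2)}_{x+\hat e_1} + \bar V^{(1)}_x V^{(1)}_{x+\hat e_0}\bar V^{(2)}_x V^{(2)}_{x+\hat e_0}\Big], \qquad \tilde\lambda = \frac{\lambda^{(1)}\lambda^{(2)}}{\hat\lambda(t^2 - u^2)}. \tag{2.9}
\]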
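2.3. From now on, we shall study in detail only the partition function $\Xi^-_{AT} \overset{def}{=} \Xi^{(-,-),(-,-)}_{AT}$, i.e. the partition function in which all Grassmannian variables verify antiperiodic boundary conditions (see (2.3)). We shall see in §5.5 below that, if $(\lambda, t, u)$ does not belong to the critical surface, which is a suitable 2–dimensional subset of $[-\varepsilon_1, \varepsilon_1] \times D \times [-\frac{|D|}{2}, \frac{|D|}{2}]$ which we will explicitly determine in §5.6, the partition function $\Xi^{\gamma_1,\gamma_2}_{AT}$ divided by $\Xi_I^{(1)\gamma_1}\Xi_I^{(2)\gamma_2}$ is exponentially insensitive to boundary conditions as $M \to \infty$. As in [M1] we find it convenient to perform the following change of variables, $\alpha = \pm$, $\omega = \pm1$:
\[
\begin{aligned}
\frac{1}{\sqrt2}\sum_{j=1,2}(-i\alpha)^{j-1}\big(\bar H_x^{(j)} + i\omega H_x^{(j)}\big) &= e^{i\omega\pi/4}\big(\psi^\alpha_{\omega,x} - \chi^\alpha_{\omega,x}\big),\\
\frac{1}{\sqrt2}\sum_{j=1,2}(-i\alpha)^{j-1}\big(\bar V_x^{(j)} + i\omega V_x^{(j)}\big) &= \psi^\alpha_{\omega,x} + \chi^\alpha_{\omega,x}.
\end{aligned}
\tag{2.10}
\]
Let $\mathbf{k} = (k, k_0) \in D_{-,-}$, where $D_{-,-}$ is the set of $\mathbf{k}$’s such that $k = \frac{2\pi}{M}(n_1 + 1/2)$ and $k_0 = \frac{2\pi}{M}(n_0 + 1/2)$, where $-[M/2] \le n_0, n_1 \le [(M - 1)/2]$, $n_0, n_1 \in \mathbb{Z}$. The Fourier transform of the Grassmannian fields $\phi^\alpha_{\omega,x}$, $\phi = \psi, \chi$, is given by $\hat\phi^\alpha_{\omega,\mathbf{k}} \overset{def}{=} \sum_{x\in\Lambda_M} e^{-i\alpha\mathbf{k}x}\phi^\alpha_{\omega,x}$. With the above definitions, it is straightforward algebra to verify that the final expression is:
\[
\Xi^-_{AT} = e^{-EM^2}\int P(d\psi)P(d\chi)\, e^{Q(\psi,\chi) + V(\psi,\chi)}, \tag{2.11}
\]
where $E$ is a suitable constant; $Q(\psi, \chi)$ collects the quadratic terms of the form $\psi^{\alpha_1}_{\omega_1,x_1}\chi^{\alpha_2}_{\omega_2,x_2}$; $V(\psi, \chi)$ is the quartic interaction (it is equal to $V_\lambda$, see (2.9), in terms of the $\psi^\pm_\omega, \chi^\pm_\omega$ variables); $P(d\phi)$, $\phi = \psi, \chi$, is
\[
\begin{aligned}
P(d\phi) &= \mathcal N_\phi^{-1}\prod_{\mathbf{k}\in D_{-,-}}\prod_{\omega=\pm1} d\hat\phi^+_{\omega,\mathbf{k}}\, d\hat\phi^-_{\omega,\mathbf{k}}\,\exp\Big\{-\frac{t_\lambda}{4M^2}\sum_{\mathbf{k}\in D_{-,-}}\Phi^{+,T}_{\mathbf{k}} A_\phi(\mathbf{k})\Phi_{\mathbf{k}}\Big\},\\
A_\phi(\mathbf{k}) &= \begin{pmatrix}
i\sin k + \sin k_0 & -i\sigma_\phi(\mathbf{k}) & -\frac{\mu}{2}(i\sin k + \sin k_0) & i\mu(\mathbf{k})\\
i\sigma_\phi(\mathbf{k}) & i\sin k - \sin k_0 & -i\mu(\mathbf{k}) & -\frac{\mu}{2}(i\sin k - \sin k_0)\\
-\frac{\mu}{2}(i\sin k + \sin k_0) & i\mu(\mathbf{k}) & i\sin k + \sin k_0 & -i\sigma_\phi(\mathbf{k})\\
-i\mu(\mathbf{k}) & -\frac{\mu}{2}(i\sin k - \sin k_0) & i\sigma_\phi(\mathbf{k}) & i\sin k - \sin k_0
\end{pmatrix},
\end{aligned}
\tag{2.12}
\]
where
\[
\Phi^{+,T}_{\mathbf{k}} = (\hat\phi^+_{1,\mathbf{k}}, \hat\phi^+_{-1,\mathbf{k}}, \hat\phi^-_{1,-\mathbf{k}}, \hat\phi^-_{-1,-\mathbf{k}}), \qquad \Phi^{T}_{\mathbf{k}} = (\hat\phi^-_{1,\mathbf{k}}, \hat\phi^-_{-1,\mathbf{k}}, \hat\phi^+_{1,-\mathbf{k}}, \hat\phi^+_{-1,-\mathbf{k}}), \tag{2.13}
\]
$\mathcal N_\phi$ is chosen in such a way that $\int P(d\phi) = 1$ and, if we define $t_\lambda \overset{def}{=} (t^{(1)}_\lambda + t^{(2)}_\lambda)/2$, $u_\lambda \overset{def}{=} (t^{(1)}_\lambda - t^{(2)}_\lambda)/2$, for $\phi = \psi, \chi$ we have:
\[
\sigma_\phi(\mathbf{k}) = 2\Big(1 + \frac{\pm\sqrt2 + 1}{t_\lambda}\Big) + \cos k_0 + \cos k - 2, \qquad \mu(\mathbf{k}) = -(u_\lambda/t_\lambda)(\cos k + \cos k_0). \tag{2.14}
\]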
In the first of (2.14) the $-$ ($+$) sign corresponds to $\phi = \psi$ ($\phi = \chi$). The parameter $\mu$ in (2.12) is given by $\mu \overset{def}{=} \mu(\mathbf{0})$. It is convenient to split the $\sqrt2 - 1$ appearing in the definition of $\sigma_\psi(\mathbf{k})$ as:
\[
\sqrt2 - 1 = \Big(\sqrt2 - 1 + \frac{\nu}{2}\Big) - \frac{\nu}{2} \overset{def}{=} t_\psi - \frac{\nu}{2}, \tag{2.15}
\]
where $\nu$ is a parameter to be properly chosen later as a function of $\lambda$, in such a way that the average location of the critical points will be given by $t_\lambda = t_\psi$; in other words $\nu$ has the role of a counterterm fixing the middle point of the critical temperatures. The splitting (2.15) induces the following splitting of $P(d\psi)$:
\[
P(d\psi) = P_\sigma(d\psi)\, e^{-\nu F_\nu(\psi)}, \qquad F_\nu(\psi) \overset{def}{=} \frac{1}{2M^2}\sum_{\mathbf{k},\omega}(-i\omega)\,\hat\psi^+_{\omega,\mathbf{k}}\hat\psi^-_{-\omega,\mathbf{k}}, \tag{2.16}
\]
where $P_\sigma(d\psi)$ is given by (2.12) with $\phi = \psi$ and $\sigma \overset{def}{=} 2(1 - t_\psi/t_\lambda)$ replacing $\sigma_\psi(\mathbf{0})$.
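2.4. Integration of the $\chi$ variables. The propagators $\langle\phi^\sigma_{x,\omega}\phi^{\sigma'}_{y,\omega'}\rangle$ of the fermionic integration $P(d\phi)$ verify the following bound, for some $A, \kappa > 0$:
\[
|\langle\phi^\sigma_{x,\omega}\phi^{\sigma'}_{y,\omega'}\rangle| \le A\, e^{-\kappa\bar m_\phi|x - y|}, \tag{2.17}
\]
where $\bar m_\phi$ is the minimum between $|m^{(1)}_\phi|$ and $|m^{(2)}_\phi|$ and, for $j = 1, 2$, $m^{(j)}_\phi$ is given by $m^{(j)}_\phi \overset{def}{=} 2(t^{(j)}_\lambda - t_\phi)/t_\lambda$. Note that both $m^{(1)}_\chi$ and $m^{(2)}_\chi$ are $O(1)$. This suggests integrating the $\chi$ variables first. After the integration of the $\chi$ variables we shall rewrite (2.11) as
\[
\Xi^-_{AT} = e^{-M^2E_1}\int P_{Z_1,\sigma_1,\mu_1,C_1}(d\psi)\, e^{-V^{(1)}(\sqrt{Z_1}\psi)}, \qquad V^{(1)}(0) = 0, \tag{2.18}
\]
where $C_1(\mathbf{k}) = 1$, $Z_1 = t_\psi$, $\sigma_1 = \sigma/(1 - \frac{\sigma}{2})$, $\mu_1 = \mu/(1 - \frac{\sigma}{2})$ and $P_{Z_1,\sigma_1,\mu_1,C_1}(d\psi)$ is the exponential of a quadratic form:
\[
\begin{aligned}
P_{Z_1,\sigma_1,\mu_1,C_1}(d\psi) &= \mathcal N_1^{-1}\prod_{\mathbf{k}\in D_{-,-}}\prod_{\omega=\pm1} d\hat\psi^+_{\omega,\mathbf{k}}\, d\hat\psi^-_{\omega,\mathbf{k}}\\
&\quad\times\exp\Big\{-\frac{1}{4M^2}\sum_{\mathbf{k}\in D_{-,-}} Z_1C_1(\mathbf{k})\,\Psi^{+,T}_{\mathbf{k}} A^{(1)}_\psi(\mathbf{k})\Psi_{\mathbf{k}}\Big\},\\
A^{(1)}_\psi(\mathbf{k}) &= \begin{pmatrix} M^{(1)}(\mathbf{k}) & N^{(1)}(\mathbf{k})\\ N^{(1)}(\mathbf{k}) & M^{(1)}(\mathbf{k})\end{pmatrix},\\
M^{(1)}(\mathbf{k}) &= \begin{pmatrix} i\sin k + \sin k_0 + a_1^+(\mathbf{k}) & -i\big(\sigma_1 + c_1(\mathbf{k})\big)\\ i\big(\sigma_1 + c_1(\mathbf{k})\big) & i\sin k - \sin k_0 + a_1^-(\mathbf{k})\end{pmatrix},\\
N^{(1)}(\mathbf{k}) &= \begin{pmatrix} b_1^+(\mathbf{k}) & i\big(\mu_1 + d_1(\mathbf{k})\big)\\ -i\big(\mu_1 + d_1(\mathbf{k})\big) & b_1^-(\mathbf{k})\end{pmatrix},
\end{aligned}
\tag{2.19}
\]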
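where $\mathcal N_1$ is chosen in such a way that $\int P_{Z_1,\sigma_1,\mu_1,C_1}(d\psi) = 1$. Moreover $V^{(1)}$ is the interaction, which can be expressed as a sum of monomials in $\psi$ of arbitrary order:
\[
V^{(1)}(\psi) = \sum_{n=1}^{\infty}\sum_{\substack{\mathbf{k}_1,\ldots,\mathbf{k}_{2n}\\ \alpha,\omega}}\prod_{i=1}^{2n}\hat\psi^{\alpha_i(\le1)}_{\omega_i,\mathbf{k}_i}\,\hat W^{(1)}_{2n,\alpha,\omega}(\mathbf{k}_1, \ldots, \mathbf{k}_{2n-1})\,\delta\Big(\sum_{i=1}^{2n}\alpha_i\mathbf{k}_i\Big) \tag{2.20}
\]
and $\delta(\mathbf{k}) = \sum_{\mathbf{n}\in\mathbb{Z}^2}\delta_{\mathbf{k},2\pi\mathbf{n}}$. The constant $E_1$ in (2.18), the functions $a_1^\pm, b_1^\pm, c_1, d_1$ in (2.19) and the kernels $\hat W^{(1)}_{2n,\alpha,\omega}$ in (2.20) have the properties described in the following theorem, proved in Appendix A2. Note that from now on we will consider all functions appearing in the theory as functions of $\lambda, \sigma_1, \mu_1$ (of course $t$ and $u$ can be analytically and elementarily expressed in terms of $\lambda, \sigma_1, \mu_1$). We shall also assume $|\sigma_1|, |\mu_1|$ bounded by some $O(1)$ constant. Note that if $t \pm u$ belong to a sufficiently small interval $D$ centered around $\sqrt2 - 1$, as assumed in the hypothesis of the Main Theorem in §1, then of course $|\sigma_1|, |\mu_1| \le c_1$ for a suitable constant $c_1$ (in particular, if $D$ is chosen as in Remark (1) following the Main Theorem, we find $|\sigma_1| \le 1 + O(\varepsilon_1)$ and $|\mu_1| \le 2 + O(\varepsilon_1)$). Theorem 2.1. Assume that $|\sigma_1|, |\mu_1| \le c_1$ for some constant $c_1 > 0$. There exists a constant $\varepsilon_1$ such that, if $|\lambda|, |\nu| \le \varepsilon_1$, then $\Xi^-_{AT}$ can be written as in (2.18), (2.19), (2.20), where: 1) $E_1$ is an $O(1)$ constant; 2) $a_1^\pm(\mathbf{k}), b_1^\pm(\mathbf{k})$ are analytic odd functions of $\mathbf{k}$ and $c_1(\mathbf{k}), d_1(\mathbf{k})$ real analytic even functions of $\mathbf{k}$; in a neighborhood of $\mathbf{k} = \mathbf{0}$, $a_1^\pm(\mathbf{k}) = O(\sigma_1\mathbf{k}) + O(\mathbf{k}^3)$, $b_1^\pm(\mathbf{k}) = O(\mu_1\mathbf{k}) + O(\mathbf{k}^3)$, $c_1(\mathbf{k}) = O(\mathbf{k}^2)$ and $d_1(\mathbf{k}) = O(\mu_1\mathbf{k}^2)$; 3) the determinant $|\det A^{(1)}_\psi(\mathbf{k})|$ can be bounded above and below by some constant times $\big[(\sigma_1 - \mu_1)^2 + |c(\mathbf{k})|\big]\big[(\sigma_1 + \mu_1)^2 + |c(\mathbf{k})|\big]$, where $c(\mathbf{k}) = \cos k_0 + \cos k - 2$; 4) $\hat W^{(1)}_{2n,\alpha,\omega}$ are analytic functions of $\mathbf{k}_i, \lambda, \nu, \sigma_1, \mu_1$, $i = 1, \ldots, 2n$ and, for some constant $C$,
\[
|\hat W^{(1)}_{2n,\alpha,\omega}(\mathbf{k}_1, \ldots, \mathbf{k}_{2n-1})| \le M^2C^n|\lambda|^{\max\{1,n/2\}}; \tag{2.21}
\]
4–a) the terms in (2.20) with $n = 2$ can be written as
\[
L_1\sum_{\mathbf{k}_1,\ldots,\mathbf{k}_4}\hat\psi^+_{1,\mathbf{k}_1}\hat\psi^+_{-1,\mathbf{k}_2}\hat\psi^-_{-1,\mathbf{k}_3}\hat\psi^-_{1,\mathbf{k}_4}\,\delta(\mathbf{k}_1 + \mathbf{k}_2 - \mathbf{k}_3 - \mathbf{k}_4) + \sum_{\substack{\mathbf{k}_1,\ldots,\mathbf{k}_4\\ \alpha,\omega}}\widetilde W_{4,\alpha,\omega}(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3)\,\hat\psi^{\alpha_1}_{\omega_1,\mathbf{k}_1}\hat\psi^{\alpha_2}_{\omega_2,\mathbf{k}_2}\hat\psi^{\alpha_3}_{\omega_3,\mathbf{k}_3}\hat\psi^{\alpha_4}_{\omega_4,\mathbf{k}_4}\,\delta\Big(\sum_{i=1}^{4}\alpha_i\mathbf{k}_i\Big), \tag{2.22}
\]
where $L_1$ is real and $\widetilde W_{4,\alpha,\omega}(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3)$ vanishes at $\mathbf{k}_1 = \mathbf{k}_2 = \mathbf{k}_3 = \big(\frac{\pi}{M}, \frac{\pi}{M}\big)$; 4–b) the term in (2.20) with $n = 1$ can be written as:
\[
\begin{aligned}
\frac14\sum_{\omega,\alpha=\pm}\sum_{\mathbf{k}}\Big[& S_1(-i\omega)\hat\psi^+_{\omega,\mathbf{k}}\hat\psi^-_{-\omega,\mathbf{k}} + M_1(i\omega)\hat\psi^\alpha_{\omega,\mathbf{k}}\hat\psi^\alpha_{-\omega,-\mathbf{k}}\\
&+ F_1(i\sin k + \omega\sin k_0)\hat\psi^\alpha_{\omega,\mathbf{k}}\hat\psi^\alpha_{\omega,-\mathbf{k}} + G_1(i\sin k + \omega\sin k_0)\hat\psi^+_{\omega,\mathbf{k}}\hat\psi^-_{\omega,\mathbf{k}}\Big]\\
&+ \sum_{\mathbf{k}}\sum_{\alpha,\omega}\widetilde W_{2,\alpha,\omega}(\mathbf{k})\,\hat\psi^{\alpha_1}_{\omega_1,\mathbf{k}}\hat\psi^{\alpha_2}_{\omega_2,-\alpha_1\alpha_2\mathbf{k}},
\end{aligned}
\tag{2.23}
\]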
where $\widetilde W_{2,\alpha,\omega}(\mathbf{k})$ is $O(\mathbf{k}^2)$ in a neighborhood of $\mathbf{k} = \mathbf{0}$; $S_1, M_1, F_1, G_1$ are real analytic functions of $\lambda, \sigma_1, \mu_1, \nu$ s.t. $F_1 = O(\lambda\mu_1)$ and
\[
\begin{aligned}
L_1 &= l_1 + O(\lambda\sigma_1) + O(\lambda\mu_1), & S_1 &= s_1 + \gamma n_1 + O(\lambda\sigma_1^2) + O(\lambda\mu_1^2),\\
M_1 &= m_1 + O(\lambda\mu_1\sigma_1) + O(\lambda\mu_1^3), & G_1 &= z_1 + O(\lambda\sigma_1) + O(\lambda\mu_1),
\end{aligned}
\tag{2.24}
\]
with $s_1 = \sigma_1f_1$, $m_1 = \mu_1f_2$ and $l_1, n_1, f_1, f_2, z_1$ independent of $\sigma_1, \mu_1$; moreover $l_1 = \tilde\lambda/Z_1^2 + O(\lambda^2)$, $f_1, f_2 = O(\lambda)$, $\gamma n_1 = \nu/Z_1 + c_1^\nu\lambda + O(\lambda^2)$, for some $c_1^\nu$ independent of $\lambda$, and $z_1 = O(\lambda^2)$. Remark. The meaning of Theorem 2.1 is that after the integration of the $\chi$ fields we are left with a fermionic integration similar to (2.12) up to corrections which are at least $O(\mathbf{k}^2)$, and an effective interaction containing terms with any number of fields. A priori many bilinear terms with kernel $O(1)$ or $O(\mathbf{k})$ with respect to $\mathbf{k}$ near $\mathbf{k} = \mathbf{0}$ could be generated by the $\chi$–integration besides the ones originally present in (2.12); however symmetry considerations restrict drastically the number of possible bilinear terms $O(1)$ or $O(\mathbf{k})$. Only one new term, of the form $\sum_{\mathbf{k}}(i\sin k + \omega\sin k_0)\hat\psi^\alpha_{\omega,\mathbf{k}}\hat\psi^\alpha_{\omega,-\mathbf{k}}$, appears, which is “dimensionally” marginal in a RG sense; however it is weighted by a constant $O(\lambda\mu_1)$ and this will improve its “dimension”, so that it will turn out to be irrelevant, see §3.2 below. 3. Integration of the $\psi$ Variables: First Regime
3.1. Multiscale analysis. From the bound on $\det A^{(1)}_\psi(\mathbf{k})$ described in Theorem 2.1, we see that the $\psi$ fields have a mass given by $\min\{|\sigma_1 - \mu_1|, |\sigma_1 + \mu_1|\}$, which can be arbitrarily small; their integration in the infrared region (small $\mathbf{k}$) needs a multiscale analysis. We introduce a scaling parameter $\gamma > 1$ which will be used to define a geometrically growing sequence of length scales $1, \gamma, \gamma^2, \ldots$, i.e. of geometrically decreasing momentum scales $\gamma^h$, $h = 0, -1, -2, \ldots$ Correspondingly we introduce $C^\infty$ compact support functions $f_h(\mathbf{k})$, $h \le 1$, with the following properties: if $|\mathbf{k}| \overset{def}{=} \sqrt{\sin^2k + \sin^2k_0}$, when $h \le 0$, $f_h(\mathbf{k}) = 0$ for $|\mathbf{k}| < \gamma^{h-2}$ or $|\mathbf{k}| > \gamma^h$, and $f_h(\mathbf{k}) = 1$, if $|\mathbf{k}| = \gamma^{h-1}$; $f_1(\mathbf{k}) = 0$ for $|\mathbf{k}| \le \gamma^{-1}$ and $f_1(\mathbf{k}) = 1$ for $|\mathbf{k}| \ge 1$; furthermore:
\[
1 = \sum_{h=h_M}^{1}f_h(\mathbf{k}), \qquad\text{where:}\quad h_M = \min\Big\{h : \gamma^h > \sqrt2\sin\frac{\pi}{M}\Big\}, \tag{3.1}
\]
and $\sqrt2\sin(\pi/M)$ is the smallest momentum allowed by the antiperiodic boundary conditions, i.e. $\sqrt2\sin(\pi/M) = \min_{\mathbf{k}\in D_{-,-}}|\mathbf{k}|$. The purpose is to perform the integration of (2.19) over the fermion fields in an iterative way. After each iteration we shall be left with a “simpler” Grassmannian integration to perform: if $h = 1, 0, -1, \ldots, h_M$, we shall write
\[
\Xi^-_{AT} = \int P_{Z_h,\sigma_h,\mu_h,C_h}(d\psi^{(\le h)})\, e^{-V^{(h)}(\sqrt{Z_h}\psi^{(\le h)}) - M^2E_h}, \qquad V^{(h)}(0) = 0, \tag{3.2}
\]
where the quantities $Z_h$, $\sigma_h$, $\mu_h$, $C_h$, $P_{Z_h,\sigma_h,\mu_h,C_h}(d\psi^{(\le h)})$, $V^{(h)}$ and $E_h$ have to be defined recursively and the result of the last iteration will be $\Xi^-_{AT} = e^{-M^2E_{-1+h_M}}$, i.e. the value of the partition function.
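$P_{Z_h,\sigma_h,\mu_h,C_h}(d\psi^{(\le h)})$ is defined by (2.19) in which we replace $Z_1, \sigma_1, \mu_1, a_1^\omega, b_1^\omega, c_1, d_1, C_1(\mathbf k)$ with $Z_h, \sigma_h, \mu_h, a_h^\omega, b_h^\omega, c_h, d_h, C_h(\mathbf k)$, where $C_h(\mathbf k)^{-1} = \sum_{j=h_M}^{h} f_j(\mathbf k)$. Moreover
$$V^{(h)}(\psi) = \sum_{n=1}^{\infty}\frac{1}{M^{2n}}\sum_{\mathbf k_1,\ldots,\mathbf k_{2n-1},\,\underline\alpha,\underline\omega}\ \prod_{i=1}^{2n}\hat\psi^{\alpha_i(\le h)}_{\omega_i,\mathbf k_i}\ \hat W^{(h)}_{2n,\underline\alpha,\underline\omega}(\mathbf k_1,\ldots,\mathbf k_{2n-1})\,\delta\Big(\sum_{i=1}^{2n}\alpha_i\mathbf k_i\Big)$$
$$\stackrel{def}{=} \sum_{n=1}^{\infty}\ \sum_{\mathbf x_1,\ldots,\mathbf x_{2n},\,\underline\sigma,\underline j,\underline\omega,\underline\alpha}\ \prod_{i=1}^{2n}\partial^{\sigma_i}_{j_i}\psi^{\alpha_i(\le h)}_{\omega_i,\mathbf x_i}\ W^{(h)}_{2n,\underline\sigma,\underline j,\underline\alpha,\underline\omega}(\mathbf x_1,\ldots,\mathbf x_{2n}), \qquad (3.3)$$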
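where in the last line $j_i = 0, 1$, $\sigma_i \ge 0$ and $\partial_j$ is the forward discrete derivative in the $\hat e_j$ direction.

Note that the field $\psi^{(\le h)}$, whose propagator is given by the inverse of $Z_hC_h(\mathbf k)A^{(h)}_\psi$, has the same support as $C_h^{-1}(\mathbf k)$, that is a strip of width $\gamma^h$ around the singularity $\mathbf k = 0$. The field $\psi^{(\le 1)}$ coincides with the field $\psi$ of the previous section, so that (2.18) is the same as (3.2) with $h = 1$.

It is crucial for the following to think of $\hat W^{(h)}_{2n,\underline\alpha,\underline\omega}$, $h \le 1$, as functions of the variables $\sigma_k(\mathbf k), \mu_k(\mathbf k)$, $k = h, h+1, \ldots, 0, 1$, $\mathbf k \in D_{-,-}$. The iterative construction below will inductively imply that the dependence on these variables is well defined (note that for $h = 1$ we can think of the kernels of $V^{(1)}$ as functions of $\sigma_1, \mu_1$, see Theorem 2.1).

3.2. The localization operator. We now begin to describe the iterative construction leading to (3.2). The first step consists in defining a localization operator $L$ acting on the kernels of $V^{(h)}$, in terms of which we shall rewrite $V^{(h)} = LV^{(h)} + RV^{(h)}$, where $R = 1 - L$. The iterative integration procedure will use such splitting, see §3.3 below.

$L$ will be non-zero only if acting on a kernel $\hat W^{(h)}_{2n,\underline\alpha,\underline\omega}$ with $n = 1, 2$. In this case $L$ will be the combination of four different operators: $L_j$, $j = 0, 1$, whose effect on a function of $\mathbf k$ will be essentially to extract the term of order $j$ from its Taylor series in $\mathbf k$; and $P_j$, $j = 0, 1$, whose effect on a functional of the sequence $\sigma_h(\mathbf k), \mu_h(\mathbf k), \ldots, \sigma_1, \mu_1$ will be essentially to extract the term of order $j$ from its power series in $\sigma_h(\mathbf k), \mu_h(\mathbf k), \ldots, \sigma_1, \mu_1$.

The action of $L_j$, $j = 0, 1$, on the kernels $\hat W^{(h)}_{2n,\underline\alpha,\underline\omega}(\mathbf k_1, \ldots, \mathbf k_{2n})$ is defined as follows:

1) If $n = 1$,
$$L_0\hat W^{(h)}_{2,\underline\alpha,\underline\omega}(\mathbf k, \alpha_1\alpha_2\mathbf k) = \frac14\sum_{\eta,\eta'=\pm1}\hat W^{(h)}_{2,\underline\alpha,\underline\omega}(\bar{\mathbf k}_{\eta\eta'}, \alpha_1\alpha_2\bar{\mathbf k}_{\eta\eta'}),$$
$$L_1\hat W^{(h)}_{2,\underline\alpha,\underline\omega}(\mathbf k, \alpha_1\alpha_2\mathbf k) = \frac14\sum_{\eta,\eta'=\pm1}\hat W^{(h)}_{2,\underline\alpha,\underline\omega}(\bar{\mathbf k}_{\eta\eta'}, \alpha_1\alpha_2\bar{\mathbf k}_{\eta\eta'})\Big[\eta\frac{\sin k}{\sin\frac{\pi}{M}} + \eta'\frac{\sin k_0}{\sin\frac{\pi}{M}}\Big], \qquad (3.4)$$
where $\bar{\mathbf k}_{\eta\eta'} = (\eta\frac{\pi}{M}, \eta'\frac{\pi}{M})$ are the smallest momenta allowed by the antiperiodic boundary conditions.

2) If $n = 2$, $L_1\hat W^{(h)}_{4,\underline\alpha,\underline\omega} = 0$ and
$$L_0\hat W^{(h)}_{4,\underline\alpha,\underline\omega}(\mathbf k_1, \mathbf k_2, \mathbf k_3, \mathbf k_4) \stackrel{def}{=} \hat W^{(h)}_{4,\underline\alpha,\underline\omega}(\bar{\mathbf k}_{++}, \bar{\mathbf k}_{++}, \bar{\mathbf k}_{++}, \bar{\mathbf k}_{++}). \qquad (3.5)$$

3) If $n > 2$, $L_0\hat W^{(h)}_{2n,\underline\alpha,\underline\omega} = L_1\hat W^{(h)}_{2n,\underline\alpha,\underline\omega} = 0$.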
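The action of $L_0$ and $L_1$ in (3.4) can be tried out on an ordinary function of the momentum. The sketch below is only an illustration under simplifying assumptions: the kernel is a plain Python function of $(k, k_0)$, the second argument $\alpha_1\alpha_2\mathbf k$ is dropped, and the test kernel and all names (`k_bar`, `L0`, `L1`, `remainder`) are hypothetical. It shows that $L_0$ picks out the constant part, $L_1$ the part linear in $(\sin k, \sin k_0)$, and that what is left is of second order.

```python
import math

def k_bar(eta, eta_p, M):
    # smallest antiperiodic momenta (eta*pi/M, eta'*pi/M)
    return (eta * math.pi / M, eta_p * math.pi / M)

def L0(W, M):
    # average of W over the four momenta k_bar: the "order zero" part
    value = sum(W(*k_bar(e, ep, M)) for e in (1, -1) for ep in (1, -1)) / 4.0
    return lambda k, k0: value

def L1(W, M):
    # discrete analogue of the first-order Taylor term, as in (3.4)
    s = math.sin(math.pi / M)
    def localized(k, k0):
        total = 0.0
        for e in (1, -1):
            for ep in (1, -1):
                weight = (e * math.sin(k) + ep * math.sin(k0)) / s
                total += W(*k_bar(e, ep, M)) * weight
        return total / 4.0
    return localized

def remainder(W, M):
    # R_2 W = (1 - L_0 - L_1) W
    w0, w1 = L0(W, M), L1(W, M)
    return lambda k, k0: W(k, k0) - w0(k, k0) - w1(k, k0)

# hypothetical test kernel: constant + linear + bilinear terms
A, B, C, D = 0.7, 0.3, -1.1, 0.5
def W_test(k, k0):
    return A + B * math.sin(k) + C * math.sin(k0) + D * math.sin(k) * math.sin(k0)
```

For this test kernel the remainder is exactly the bilinear term, which vanishes quadratically at small momentum; this is the mechanism behind the dimensional gain attributed to $R$.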
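The action of $P_j$, $j = 0, 1$, on the kernels $\hat W^{(h)}_{2n,\underline\alpha,\underline\omega}$, thought of as functionals of the sequence $\sigma_h(\mathbf k), \mu_h(\mathbf k), \ldots, \sigma_1, \mu_1$, is defined as follows:
$$P_0\hat W^{(h)}_{2n,\underline\alpha,\underline\omega} \stackrel{def}{=} \hat W^{(h)}_{2n,\underline\alpha,\underline\omega}\Big|_{\sigma^{(h)}=\mu^{(h)}=0},$$
$$P_1\hat W^{(h)}_{2n,\underline\alpha,\underline\omega} \stackrel{def}{=} \sum_{k\ge h,\,\mathbf k}\Big[\sigma_k(\mathbf k)\frac{\partial\hat W^{(h)}_{2n,\underline\alpha,\underline\omega}}{\partial\sigma_k(\mathbf k)}\Big|_{\sigma^{(h)}=\mu^{(h)}=0} + \mu_k(\mathbf k)\frac{\partial\hat W^{(h)}_{2n,\underline\alpha,\underline\omega}}{\partial\mu_k(\mathbf k)}\Big|_{\sigma^{(h)}=\mu^{(h)}=0}\Big]. \qquad (3.6)$$
Given $L_j, P_j$, $j = 0, 1$ as above, we define the action of $L$ on the kernels $\hat W^{(h)}_{2n,\underline\alpha,\underline\omega}$ as follows:

1) If $n = 1$, then
$$L\hat W^{(h)}_{2,\underline\alpha,\underline\omega} \stackrel{def}{=} \begin{cases} L_0(P_0+P_1)\hat W^{(h)}_{2,\underline\alpha,\underline\omega} & \text{if } \omega_1+\omega_2 = 0 \text{ and } \alpha_1+\alpha_2 = 0,\\ L_0P_1\hat W^{(h)}_{2,\underline\alpha,\underline\omega} & \text{if } \omega_1+\omega_2 = 0 \text{ and } \alpha_1+\alpha_2 \ne 0,\\ L_1P_0\hat W^{(h)}_{2,\underline\alpha,\underline\omega} & \text{if } \omega_1+\omega_2 \ne 0 \text{ and } \alpha_1+\alpha_2 = 0,\\ 0 & \text{if } \omega_1+\omega_2 \ne 0 \text{ and } \alpha_1+\alpha_2 \ne 0.\end{cases}$$

2) If $n = 2$, then $L\hat W^{(h)}_{4,\underline\alpha,\underline\omega} \stackrel{def}{=} L_0P_0\hat W^{(h)}_{4,\underline\alpha,\underline\omega}$.

3) If $n > 2$, then $L\hat W^{(h)}_{2n,\underline\alpha,\underline\omega} = 0$.

Finally, the effect of $L$ on $V^{(h)}$ is, by definition, to replace on the r.h.s. of (3.3) $\hat W^{(h)}_{2n,\underline\alpha,\underline\omega}$ with $L\hat W^{(h)}_{2n,\underline\alpha,\underline\omega}$. Note that $L^2V^{(h)} = LV^{(h)}$. Using the previous definitions we get the following result, proven in Appendix A2.2. We use the notation $\sigma^{(h)} = \{\sigma_k(\mathbf k)\}^{k=h,\ldots,1}_{\mathbf k\in D_{-,-}}$ and $\mu^{(h)} = \{\mu_k(\mathbf k)\}^{k=h,\ldots,1}_{\mathbf k\in D_{-,-}}$.

Lemma 3.1. Let the action of $L$ on $V^{(h)}$ be defined as above. Then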
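$$LV^{(h)}(\psi^{(\le h)}) = (s_h + \gamma^hn_h)F^{(\le h)}_\sigma + m_hF^{(\le h)}_\mu + l_hF^{(\le h)}_\lambda + z_hF^{(\le h)}_\zeta, \qquad (3.7)$$
where $s_h, n_h, m_h, l_h$ and $z_h$ are real constants and $s_h$ is linear in $\sigma^{(h)}$ and independent of $\mu^{(h)}$; $m_h$ is linear in $\mu^{(h)}$ and independent of $\sigma^{(h)}$; $n_h, l_h, z_h$ are independent of $\sigma^{(h)}, \mu^{(h)}$; moreover, if $D_h = D_{-,-}\cap\{\mathbf k : C_h^{-1}(\mathbf k) > 0\}$,
$$F^{(\le h)}_\sigma(\psi^{(\le h)}) = \frac{1}{2M^2}\sum_{\mathbf k\in D_h}\sum_{\omega=\pm1}(-i\omega)\hat\psi^{+(\le h)}_{\omega,\mathbf k}\hat\psi^{-(\le h)}_{-\omega,\mathbf k} \stackrel{def}{=} \frac{1}{M^2}\sum_{\mathbf k\in D_h}\hat F^{(\le h)}_\sigma(\mathbf k),$$
$$F^{(\le h)}_\mu(\psi^{(\le h)}) = \frac{1}{4M^2}\sum_{\mathbf k\in D_h}\sum_{\alpha,\omega=\pm1}i\omega\,\hat\psi^{\alpha(\le h)}_{\omega,\mathbf k}\hat\psi^{\alpha(\le h)}_{-\omega,-\mathbf k} \stackrel{def}{=} \frac{1}{M^2}\sum_{\mathbf k\in D_h}\hat F^{(\le h)}_\mu(\mathbf k),$$
$$F^{(\le h)}_\lambda(\psi^{(\le h)}) = \frac{1}{M^8}\sum_{\mathbf k_1,\ldots,\mathbf k_4\in D_h}\hat\psi^{+(\le h)}_{1,\mathbf k_1}\hat\psi^{+(\le h)}_{-1,\mathbf k_2}\hat\psi^{-(\le h)}_{-1,\mathbf k_3}\hat\psi^{-(\le h)}_{1,\mathbf k_4}\,\delta(\mathbf k_1+\mathbf k_2-\mathbf k_3-\mathbf k_4),$$
$$F^{(\le h)}_\zeta(\psi^{(\le h)}) = \frac{1}{2M^2}\sum_{\mathbf k\in D_h}\sum_{\omega=\pm1}(i\sin k + \omega\sin k_0)\hat\psi^{+(\le h)}_{\omega,\mathbf k}\hat\psi^{-(\le h)}_{\omega,\mathbf k} \stackrel{def}{=} \frac{1}{M^2}\sum_{\mathbf k\in D_h}\hat F^{(\le h)}_\zeta(\mathbf k), \qquad (3.8)$$
where $\delta(\mathbf k) = M^2\sum_{\mathbf n\in\mathbb Z^2}\delta_{\mathbf k,2\pi\mathbf n}$.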
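Remark. The application of $L$ to the kernels of the effective potential generates the sum in (3.7), i.e. a linear combination of the Grassmannian monomials in (3.8) which, in the renormalization group language, are called "relevant" (the first two) or "marginal" operators (the two others).

We now consider the operator $R \stackrel{def}{=} 1 - L$. The following result holds, see Appendix A2 for the proof. We use the notation $R_1 = 1 - L_0$, $R_2 = 1 - L_0 - L_1$, $S_1 = 1 - P_0$, $S_2 = 1 - P_0 - P_1$.

Lemma 3.2. The action of $R$ on $\hat W^{(h)}_{2n,\underline\alpha,\underline\omega}$ for $n = 1, 2$ is the following:

1) If $n = 1$, then
$$R\hat W^{(h)}_{2,\underline\alpha,\underline\omega} = \begin{cases}[S_2 + R_2(P_0+P_1)]\hat W^{(h)}_{2,\underline\alpha,\underline\omega} & \text{if } \omega_1+\omega_2 = 0,\\ [R_1S_1 + R_2P_0]\hat W^{(h)}_{2,\underline\alpha,\underline\omega} & \text{if } \omega_1+\omega_2 \ne 0 \text{ and } \alpha_1+\alpha_2 = 0,\\ R_1S_1\hat W^{(h)}_{2,\underline\alpha,\underline\omega} & \text{if } \omega_1+\omega_2 \ne 0 \text{ and } \alpha_1+\alpha_2 \ne 0.\end{cases}$$

2) If $n = 2$, then $R\hat W^{(h)}_{4,\underline\alpha,\underline\omega} = [S_1 + R_1P_0]\hat W^{(h)}_{4,\underline\alpha,\underline\omega}$.

Remark. The effect of $R_j$, $j = 1, 2$ on $\hat W^{(h)}_{2n,\underline\alpha,\underline\omega}$ consists in extracting the rest of a Taylor series in $\mathbf k$ of order $j$. The effect of $S_j$, $j = 1, 2$ on $\hat W^{(h)}_{2n,\underline\alpha,\underline\omega}$ consists in extracting the rest of a power series in $(\sigma^{(h)}, \mu^{(h)})$ of order $j$. The definitions are given in such a way that $R\hat W^{(h)}_{2n,\underline\alpha,\underline\omega}$ is at least quadratic in $\mathbf k, \sigma^{(h)}, \mu^{(h)}$ if $n = 1$ and at least linear in $\mathbf k, \sigma^{(h)}, \mu^{(h)}$ when $n = 2$. This will give dimensional gain factors in the bounds for $R\hat W^{(h)}_{2n,\underline\alpha,\underline\omega}$ w.r.t. the bounds for $\hat W^{(h)}_{2n,\underline\alpha,\underline\omega}$, $n = 1, 2$, as we shall see in detail in Appendix A4.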
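3.3. Renormalization. Once the above definitions are given we can describe our integration procedure for $h \le 0$. We start from (3.2) and we rewrite it as
$$\int P_{Z_h,\sigma_h,\mu_h,C_h}(d\psi^{(\le h)})\, e^{-LV^{(h)}(\sqrt{Z_h}\psi^{(\le h)}) - RV^{(h)}(\sqrt{Z_h}\psi^{(\le h)}) - M^2E_h}, \qquad (3.9)$$
with $LV^{(h)}$ as in (3.7). Then we include the quadratic part of $LV^{(h)}$ (except the term proportional to $n_h$) in the fermionic integration, so obtaining
$$\int P_{\hat Z_{h-1},\sigma_{h-1},\mu_{h-1},C_h}(d\psi^{(\le h)})\, e^{-l_hF_\lambda(\sqrt{Z_h}\psi^{(\le h)}) - \gamma^hn_hF_\sigma(\sqrt{Z_h}\psi^{(\le h)}) - RV^{(h)}(\sqrt{Z_h}\psi^{(\le h)}) - M^2E_h}, \qquad (3.10)$$
where $\hat Z_{h-1}(\mathbf k) \stackrel{def}{=} Z_h(1 + z_hC_h^{-1}(\mathbf k))$ and
$$\sigma_{h-1}(\mathbf k) \stackrel{def}{=} \frac{Z_h}{\hat Z_{h-1}(\mathbf k)}\big(\sigma_h(\mathbf k) + s_hC_h^{-1}(\mathbf k)\big), \qquad \mu_{h-1}(\mathbf k) \stackrel{def}{=} \frac{Z_h}{\hat Z_{h-1}(\mathbf k)}\big(\mu_h(\mathbf k) + m_hC_h^{-1}(\mathbf k)\big),$$
$$a^\omega_{h-1}(\mathbf k) \stackrel{def}{=} \frac{Z_h}{\hat Z_{h-1}(\mathbf k)}a^\omega_h(\mathbf k), \qquad b^\omega_{h-1}(\mathbf k) \stackrel{def}{=} \frac{Z_h}{\hat Z_{h-1}(\mathbf k)}b^\omega_h(\mathbf k),$$
$$c_{h-1}(\mathbf k) \stackrel{def}{=} \frac{Z_h}{\hat Z_{h-1}(\mathbf k)}c_h(\mathbf k), \qquad d_{h-1}(\mathbf k) \stackrel{def}{=} \frac{Z_h}{\hat Z_{h-1}(\mathbf k)}d_h(\mathbf k). \qquad (3.11)$$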
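The integration in (3.10) differs from the one in (3.2) and (3.9): $P_{\hat Z_{h-1},\sigma_{h-1},\mu_{h-1},C_h}$ is defined by (2.19) with $Z_1$ and $A^{(1)}_\psi$ replaced by $\hat Z_{h-1}(\mathbf k)$ and $A^{(h-1)}_\psi$.

Now we can perform the integration of the $\psi^{(h)}$ field. It is convenient to rescale the fields:
$$\hat V^{(h)}(\sqrt{Z_{h-1}}\psi^{(\le h)}) \stackrel{def}{=} \lambda_hF_\lambda(\sqrt{Z_{h-1}}\psi^{(\le h)}) + \gamma^h\nu_hF_\sigma(\sqrt{Z_{h-1}}\psi^{(\le h)}) + RV^{(h)}(\sqrt{Z_h}\psi^{(\le h)}), \qquad (3.12)$$
where $\lambda_h = \big(\frac{Z_h}{Z_{h-1}}\big)^2l_h$, $\nu_h = \frac{Z_h}{Z_{h-1}}n_h$ and $RV^{(h)} = (1-L)V^{(h)}$ is the irrelevant part of $V^{(h)}$, and rewrite (3.10) as
$$e^{-M^2(t_h+E_h)}\int P_{Z_{h-1},\sigma_{h-1},\mu_{h-1},C_{h-1}}(d\psi^{(\le h-1)})\int P_{Z_{h-1},\sigma_{h-1},\mu_{h-1},\tilde f_h^{-1}}(d\psi^{(h)})\, e^{-\hat V^{(h)}(\sqrt{Z_{h-1}}\psi^{(\le h)})}, \qquad (3.13)$$
where we used the decomposition $\psi^{(\le h)} = \psi^{(\le h-1)} + \psi^{(h)}$ (and $\psi^{(\le h-1)}, \psi^{(h)}$ are independent) and $\tilde f_h(\mathbf k)$ is defined by the relation $C_h^{-1}(\mathbf k)\hat Z_{h-1}^{-1}(\mathbf k) = C_{h-1}^{-1}(\mathbf k)Z_{h-1}^{-1} + \tilde f_h(\mathbf k)Z_{h-1}^{-1}$, namely:
$$\tilde f_h(\mathbf k) \stackrel{def}{=} Z_{h-1}\Big[\frac{C_h^{-1}(\mathbf k)}{\hat Z_{h-1}(\mathbf k)} - \frac{C_{h-1}^{-1}(\mathbf k)}{Z_{h-1}}\Big] = f_h(\mathbf k)\Big[1 + \frac{z_hf_{h+1}(\mathbf k)}{1 + z_hf_h(\mathbf k)}\Big]. \qquad (3.14)$$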
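The second equality in (3.14) follows from the support properties of the cutoff functions and from $Z_{h-1} = Z_h(1+z_h)$. As a sanity check, the sketch below compares the two sides numerically. It rests on assumptions that are not in the text: a specific smooth cutoff profile `chi`, $\gamma = 2$, a radial variable `t` standing for $|\mathbf k|$, and momenta above the infrared cutoff so that $C_h^{-1}$ telescopes to a single cutoff. All function names are hypothetical.

```python
import math

GAMMA = 2.0  # illustrative scaling parameter (assumption)

def _bump(t):
    return math.exp(-1.0 / t) if t > 0 else 0.0

def smooth_step(u):
    # C-infinity step: 0 for u <= 0, 1 for u >= 1
    return _bump(u) / (_bump(u) + _bump(1.0 - u))

def chi(t):
    # assumed cutoff profile: 1 for t <= 1/gamma, 0 for t >= 1
    return 1.0 - smooth_step((t - 1.0 / GAMMA) / (1.0 - 1.0 / GAMMA))

def f(h, t):
    # single-scale cutoff as a function of t = |k|
    if h == 1:
        return 1.0 - chi(t)
    return chi(GAMMA ** (-h) * t) - chi(GAMMA ** (1 - h) * t)

def C_inv(h, t):
    # C_h^{-1} = sum_{j <= h} f_j telescopes to chi(gamma^{-h} t) for h <= 0
    return chi(GAMMA ** (-h) * t)

def f_tilde_from_definition(h, t, Z_h, z_h):
    # first expression in (3.14)
    Z_hat = Z_h * (1.0 + z_h * C_inv(h, t))   # \hat Z_{h-1}(k)
    Z_prev = Z_h * (1.0 + z_h)                # Z_{h-1} = \hat Z_{h-1}(0)
    return Z_prev * (C_inv(h, t) / Z_hat - C_inv(h - 1, t) / Z_prev)

def f_tilde_closed_form(h, t, z_h):
    # second expression in (3.14)
    return f(h, t) * (1.0 + z_h * f(h + 1, t) / (1.0 + z_h * f(h, t)))

def max_mismatch(h, Z_h=1.7, z_h=0.3, points=400):
    ts = [1.4 * (i + 1) / points for i in range(points)]
    return max(abs(f_tilde_from_definition(h, t, Z_h, z_h)
                   - f_tilde_closed_form(h, t, z_h)) for t in ts)
```

The agreement also makes visible that the modified cutoff vanishes wherever $f_h$ does, so the single-scale field lives on the same momentum shell.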
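Note that $\tilde f_h(\mathbf k)$ has the same support as $f_h(\mathbf k)$. Moreover $P_{Z_{h-1},\sigma_{h-1},\mu_{h-1},\tilde f_h^{-1}}(d\psi^{(h)})$ is defined in the same way as $P_{\hat Z_{h-1},\sigma_{h-1},\mu_{h-1},C_h}(d\psi^{(h)})$, with $\hat Z_{h-1}(\mathbf k)$ resp. $C_h$ replaced by $Z_{h-1}$, resp. $\tilde f_h^{-1}$. The single scale propagator is
$$\int P_{Z_{h-1},\sigma_{h-1},\mu_{h-1},\tilde f_h^{-1}}(d\psi^{(h)})\,\psi^{\alpha(h)}_{\mathbf x,\omega}\psi^{\alpha'(h)}_{\mathbf y,\omega'} = \frac{1}{Z_{h-1}}g^{(h)}_{a,a'}(\mathbf x-\mathbf y), \qquad a = (\alpha,\omega),\quad a' = (\alpha',\omega'), \qquad (3.15)$$
where
$$g^{(h)}_{a,a'}(\mathbf x-\mathbf y) = \frac{1}{2M^2}\sum_{\mathbf k}e^{i\alpha\alpha'\mathbf k(\mathbf x-\mathbf y)}\tilde f_h(\mathbf k)\big[A^{(h-1)}_\psi(\mathbf k)\big]^{-1}_{j(a),j'(a')} \qquad (3.16)$$
with $j(-,1) = j'(+,1) = 1$, $j(-,-1) = j'(+,-1) = 2$, $j(+,1) = j'(-,1) = 3$ and $j(+,-1) = j'(-,-1) = 4$. One finds that $g^{(h)}_{a,a'}(\mathbf x) = g^{(1,h)}_{\omega,\omega'}(\mathbf x) - \alpha\alpha'g^{(2,h)}_{\omega,\omega'}(\mathbf x)$, where $g^{(j,h)}_{\omega,\omega'}(\mathbf x)$, $j = 1, 2$ are defined in Appendix A3, see (A3.1). The long distance behaviour of the propagator is given by the following lemma, proved in Appendix A3.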
where gω,ω (x), j = 1, 2 are defined in Appendix A3, see (A3.1). The long distance behaviour of the propagator is given by the following lemma, proved in Appendix A3. def
def
Lemma 3.3. Let σh = σh (0) and µh = µh (0) and assume |λ| ≤ ε1 for a small constant ¯ ε1 . Suppose that for h > h, |zh | ≤
1 2
,
|sh | ≤
1 |σh | , 2
|mh | ≤
1 |µh | , 2
(3.17)
698
A. Giuliani, V. Mastropietro
that there exists c s.t. σ h e−c|λ| ≤ ≤ ec|λ| , σh−1 Z 2 2 h e−c|λ| ≤ ≤ ec|λ| , Zh−1
µ h e−c|λ| ≤ ≤ ec|λ| , µh−1 (3.18)
and that, for some constant C1 , |σh¯ | γ h¯
≤ C1 ,
|µh¯ | γ h¯
≤ C1 ;
(3.19)
¯ given the positive integers N, n0 , n1 and putting n = n0 + n1 , there then, for all h ≥ h, exists a constant CN,n s.t. (h)
γ (1+n)h 1 + (γ h |d(x − y)|)N M πx πx0 = sin , sin ). π M M
|∂xn00 ∂xn1 ga,a (x − y)| ≤ CN,n
,
where d(x) (3.20)
Furthermore, if P0 , P1 are defined as in (3.6) and S1 , S2 are defined as in Lemma 3.2, we (h) (h) have that Pj ga,a , j = 0, 1 and Sj ga,a , j = 1, 2, satisfy the same bound (3.20), times (h) (h) h| j . The bounds for P0 ga,a and P1 ga,a hold even without hypothesis a factor |σh |+|µ γh (3.19). After the integration of the field on scale h we are left with an integral involving the fields ψ (≤h−1) and the new effective interaction V (h−1) , defined as (h−1) (√Z (≤h−1) )−E (≤h) (h) √ ˜hM2 h−1 ψ = PZh−1 ,σh−1 ,µh−1 ,fh (dψ (h) )e−V ( Zh−1 ψ ) . (3.21) e−V It is easy to see that V (h−1) is of the form (3.3) and that Eh−1 = Eh + th + E˜ h . It is sufficient to use the well known identity 1 (h) ( Zh−1 ψ (≤h) ); n), (−1)n+1 EhT (V M 2 E˜ h +V (h−1) ( Zh−1 ψ (≤h−1) ) = n! n≥1
(3.22) where EhT (X(ψ (h) ); n) is the truncated expectation of order n w.r.t. the propagator −1 (h) ga,a , defined as Zh−1 EhT (X(ψ (h) ); n)
∂ = n log ∂λ
PZh−1 ,σh−1 ,µh−1 ,fh (dψ (h) )eλX(ψ
(h) )
λ=0
. (3.23)
Note that the above procedure allows us to write the running coupling constants v h−1 = (λh−1 , νh−1 ), h ≤ 1, in terms of v k , h ≤ k ≤ 1, namely v h−1 = βh ( vh , . . . , v 1 ), where βh is the so–called Beta function.
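The identity (3.22) has the same algebraic structure as the cumulant expansion of a log-moment generating function. The sketch below illustrates that structure in the simplest possible setting, which is an assumption made here for illustration only: an ordinary random variable $V$ with finitely many values replaces the Grassmannian integration, and the truncated expectations become ordinary cumulants. The function names are hypothetical.

```python
import math

def moments(values, probs, order):
    # m_n = E[V^n] for n = 0, ..., order
    return [sum(p * v ** n for v, p in zip(values, probs)) for n in range(order + 1)]

def cumulants(values, probs, order):
    # standard recursion: m_n = sum_{j=1}^{n} C(n-1, j-1) kappa_j m_{n-j}
    m = moments(values, probs, order)
    kappa = [0.0] * (order + 1)
    for n in range(1, order + 1):
        kappa[n] = m[n] - sum(
            math.comb(n - 1, j - 1) * kappa[j] * m[n - j] for j in range(1, n)
        )
    return kappa

def effective_potential_series(values, probs, order):
    # right-hand side pattern of (3.22): sum_n (-1)^(n+1) kappa_n / n!
    kappa = cumulants(values, probs, order)
    return sum((-1) ** (n + 1) * kappa[n] / math.factorial(n)
               for n in range(1, order + 1))

def effective_potential_exact(values, probs):
    # left-hand side pattern: -log E[exp(-V)]
    return -math.log(sum(p * math.exp(-v) for v, p in zip(values, probs)))
```

For a small "potential" the truncated series converges quickly to the exact value; in the paper the analogous convergence is the content of the bounds proved in Appendix A4.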
3.4. Analyticity of the effective potential. We have expressed the effective potential $V^{(h)}$ in terms of the running coupling constants $\lambda_k, \nu_k$, $k \ge h$, and of the renormalization constants $Z_k, \mu_k(\mathbf k), \sigma_k(\mathbf k)$, $k \ge h$. In Appendix A4 we will prove the following result.

Theorem 3.4. Let $\sigma_h \stackrel{def}{=} \sigma_h(0)$ and $\mu_h \stackrel{def}{=} \mu_h(0)$ and assume $|\lambda| \le \varepsilon_1$ for a small constant $\varepsilon_1$. Suppose that for $h > \bar h$ the hypotheses (3.17), (3.18) and (3.19) hold. If, for some constant $c$,
$$\max_{h>\bar h}\{|\lambda_h|, |\nu_h|\} \le c|\lambda|, \qquad (3.24)$$
then there exists $C > 0$ s.t. the kernels in (3.3) satisfy
$$\int d\mathbf x_1\cdots d\mathbf x_{2n}\,|W^{(\bar h)}_{2n,\underline\sigma,\underline j,\underline\alpha,\underline\omega}(\mathbf x_1, \ldots, \mathbf x_{2n})| \le M^2\gamma^{-\bar hD_k(n)}(C|\lambda|)^{\max(1,n-1)}, \qquad (3.25)$$
where $D_k(n) = -2 + n + k$ and $k = \sum_{i=1}^{2n}\sigma_i$. Moreover $|\tilde E_{\bar h+1}| + |t_{\bar h+1}| \le c|\lambda|\gamma^{2\bar h}$ and the kernels of $LV^{(\bar h)}$ satisfy
$$|s_{\bar h}| \le C|\lambda||\sigma_{\bar h}|, \qquad |m_{\bar h}| \le C|\lambda||\mu_{\bar h}| \qquad (3.26)$$
and
$$|n_{\bar h}| \le C|\lambda|, \qquad |z_{\bar h}| \le C|\lambda|^2, \qquad |l_{\bar h}| \le C|\lambda|^2. \qquad (3.27)$$
The bounds (3.26) hold even if (3.19) does not hold. The bounds (3.27) hold even if (3.19) and the first two of (3.18) do not hold.

Remarks. 1) The above result immediately implies analyticity of the effective potential of scale $h$ in the running coupling constants $\lambda_k, \nu_k$, $k \ge h$, under the assumptions (3.17), (3.18), (3.19) and (3.24). 2) The assumptions (3.18) and (3.24) will be proved in §4 and Appendix A5 below, solving the flow equations for $\vec v_h = (\lambda_h, \nu_h)$ and $Z_h, \sigma_h, \mu_h$, given by $\vec v_{h-1} = \vec\beta_h(\vec v_h, \ldots, \vec v_1)$, $Z_{h-1} = Z_h(1 + z_h)$ and (3.11). They will be proved to be true up to $h = -\infty$.

4. The Flow of the Running Coupling Constants

The convergence of the expansion for the effective potential is proved by Theorem 3.4 under the hypothesis that the running coupling constants are small, see (3.24), and that the bounds (3.17), (3.18) and (3.19) are satisfied. We now want to show that, choosing $\lambda$ small enough and $\nu$ as a suitable function of $\lambda$, such hypotheses are indeed verified. In order to prove this, we will solve the flow equations for the renormalization constants (following from (3.11) and the preceding line):
$$\frac{Z_{h-1}}{Z_h} = 1 + z_h, \qquad \frac{\sigma_{h-1}}{\sigma_h} = 1 + \frac{s_h/\sigma_h - z_h}{1 + z_h}, \qquad \frac{\mu_{h-1}}{\mu_h} = 1 + \frac{m_h/\mu_h - z_h}{1 + z_h}, \qquad (4.1)$$
together with those for the running coupling constants:
$$\lambda_{h-1} = \lambda_h + \beta^h_\lambda(\lambda_h, \nu_h; \ldots; \lambda_1, \nu_1), \qquad \nu_{h-1} = \gamma\nu_h + \beta^h_\nu(\lambda_h, \nu_h; \ldots; \lambda_1, \nu_1). \qquad (4.2)$$
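To see how recursions of the form (4.1) produce power laws in $\gamma^h$, one can iterate them in a toy setting. The sketch below assumes, purely for illustration, that $z_h$, $s_h/\sigma_h$ and $m_h/\mu_h$ are scale-independent constants `z`, `a`, `b` (in the paper they depend on $h$, which is the origin of bounded corrections to the exponents) and that $Z_1 = 1$. The function names and numerical values are hypothetical.

```python
import math

GAMMA = 2.0  # illustrative scaling parameter (assumption)

def flow(z, a, b, sigma_1, mu_1, h_final):
    # iterate (4.1) from h = 1 down to h_final with constant z_h = z,
    # s_h / sigma_h = a, m_h / mu_h = b (a simplifying assumption)
    Z, sigma, mu = 1.0, sigma_1, mu_1
    h = 1
    history = {h: (Z, sigma, mu)}
    while h > h_final:
        Z_next = Z * (1.0 + z)
        sigma_next = sigma * (1.0 + (a - z) / (1.0 + z))
        mu_next = mu * (1.0 + (b - z) / (1.0 + z))
        h -= 1
        Z, sigma, mu = Z_next, sigma_next, mu_next
        history[h] = (Z, sigma, mu)
    return history

def exponents(z, a, b):
    # exponents eta such that X_h = X_1 * gamma^(eta * (h - 1))
    eta_z = -math.log(1.0 + z, GAMMA)
    eta_sigma = -math.log((1.0 + a) / (1.0 + z), GAMMA)
    eta_mu = -math.log((1.0 + b) / (1.0 + z), GAMMA)
    return eta_z, eta_sigma, eta_mu
```

With constant ratios the iteration reproduces exact power laws; the difference of the $\sigma$ and $\mu$ exponents is then controlled by the difference between `a` and `b`.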
The functions $\beta^h_\lambda, \beta^h_\nu$ are called the $\lambda$ and $\nu$ components of the Beta function, see the comment after (3.23), and, by construction, are independent of $\sigma_k, \mu_k$, so that their convergence follows just from (3.24) and the last of (3.18), i.e. without assuming (3.19), see Theorem 3.4. While for a general kernel we will apply Theorem 3.4 just up to a finite scale $h_1^*$ (in order to ensure the validity of (3.19) with $\bar h = h_1^*$), we will inductively study the flow generated by (4.2) up to scale $-\infty$, and we shall prove that it is bounded for all scales. The main result on the flows of $\lambda_h$ and $\nu_h$, proven in Appendix A5, is the following.

Theorem 4.1. If $\lambda$ is small enough, there exists an analytic function $\nu^*(\lambda)$ independent of $t, u$ such that the running coupling constants $\{\lambda_h, \nu_h\}_{h\le1}$ with $\nu_1 = \nu^*(\lambda)$ verify $|\nu_h| \le c|\lambda|\gamma^{(\vartheta/2)h}$ and $|\lambda_h| \le c|\lambda|$. Moreover the kernels $z_h, s_h$ and $m_h$ satisfy (3.17) and the solutions of the flow equations (4.1) satisfy (3.18).

Once $\nu_1$ is conveniently chosen as in Theorem 4.1, one can study in more detail the flows of the renormalization constants. In Appendix A5 we prove the following.

Lemma 4.2. If $\lambda$ is small enough and $\nu_1$ is chosen as in Theorem 4.1, the solution of (4.1) can be written as:
$$Z_h = \gamma^{\eta_z(h-1)+F^h_\zeta}, \qquad \mu_h = \mu_1\gamma^{\eta_\mu(h-1)+F^h_\mu}, \qquad \sigma_h = \sigma_1\gamma^{\eta_\sigma(h-1)+F^h_\sigma}, \qquad (4.3)$$
where $\eta_z, \eta_\mu, \eta_\sigma$ and $F^h_\zeta, F^h_\mu, F^h_\sigma$ are $O(\lambda)$ functions, independent of $\sigma_1, \mu_1$. Moreover $\eta_\sigma - \eta_\mu = -b\lambda + O(|\lambda|^2)$, $b > 0$.
4.1. The scale $h_1^*$. The integration described in §3 is iterated up to a scale $h_1^*$ defined in the following way:
$$h_1^* \stackrel{def}{=} \begin{cases}\min\big\{1, \log_\gamma|\sigma_1|^{\frac{1}{1-\eta_\sigma}}\big\} & \text{if } |\sigma_1|^{\frac{1}{1-\eta_\sigma}} > 2|\mu_1|^{\frac{1}{1-\eta_\mu}},\\ \min\big\{1, \log_\gamma|u|^{\frac{1}{1-\eta_\mu}}\big\} & \text{if } |\sigma_1|^{\frac{1}{1-\eta_\sigma}} \le 2|\mu_1|^{\frac{1}{1-\eta_\mu}}.\end{cases} \qquad (4.4)$$
From (4.4) it follows that
$$C_2\gamma^{h_1^*} \le |\sigma_{h_1^*}| + |\mu_{h_1^*}| \le C_1\gamma^{h_1^*}, \qquad (4.5)$$
with $C_1, C_2$ independent of $\lambda, \mu_1, \sigma_1$.

This is obvious in the case $h_1^* = 1$. If $h_1^* < 1$ and $|\sigma_1|^{\frac{1}{1-\eta_\sigma}} > 2|\mu_1|^{\frac{1}{1-\eta_\mu}}$, then $\gamma^{h_1^*-1} = c_\sigma|\sigma_1|^{\frac{1}{1-\eta_\sigma}}$, with $1 \le c_\sigma < \gamma$, so that, using the third of (4.3), we see that $C_2\gamma^{h_1^*} \le |\sigma_{h_1^*}| \le C_1\gamma^{h_1^*}$, for some $C_1, C_2 = O(1)$. Furthermore, using also the second of (4.3), we find
$$\frac{|\mu_{h_1^*}|}{|\sigma_{h_1^*}|} = c_\sigma^{\eta_\mu-\eta_\sigma}\,|\mu_1||\sigma_1|^{-\frac{1-\eta_\mu}{1-\eta_\sigma}}\,\gamma^{F^{h_1^*}_\mu - F^{h_1^*}_\sigma} < 1 \qquad (4.6)$$
and (4.5) follows.

If $h_1^* < 1$ and $|\sigma_1|^{\frac{1}{1-\eta_\sigma}} \le 2|\mu_1|^{\frac{1}{1-\eta_\mu}}$, then $\gamma^{h_1^*-1} = c_u|u|^{\frac{1}{1-\eta_\mu}}$, with $1 \le c_u < \gamma$, so that, using the second of (4.3) and $|\mu_1| = O(|u|)$, we see that $C_2\gamma^{h_1^*} \le |\mu_{h_1^*}| \le C_1\gamma^{h_1^*}$. Furthermore, using the third of (4.3), we find
$$\frac{|\sigma_{h_1^*}|}{|\mu_{h_1^*}|} = c_u^{\eta_\sigma-\eta_\mu}\,|\sigma_1||u|^{-\frac{1-\eta_\sigma}{1-\eta_\mu}}\,\gamma^{F^{h_1^*}_\sigma - F^{h_1^*}_\mu} < C_1, \qquad (4.7)$$
for some $C_1 = O(1)$, and (4.5) again follows.

Remark. The specific value of $h_1^*$ is not crucial: if we change $h_1^*$ to $h_1^* + n$, $n \in \mathbb Z$, the constants $C_1, C_2$ in (4.5) are replaced by different $O(1)$ constants and the estimates below are not qualitatively modified. Of course, the specific values of $C_1, C_2$ (and hence the specific value of $h_1^*$) can affect the convergence radius of the perturbative series in $\lambda$. The optimal value of $h_1^*$ should be chosen by maximizing the corresponding convergence radius. Since here we are not interested in optimal estimates, we find the choice in (4.4) convenient.

Note also that $h_1^*$ is a non-analytic function of $(\lambda, t, u)$ (in particular for small $u$ we have $\gamma^{h_1^*} \sim |u|^{1+O(\lambda)}$). As a consequence, the asymptotic expression for the specific heat near the critical points (that we shall obtain in the next section) will contain non-analytic functions of $u$ (in fact it will contain terms depending on $h_1^*$). However, as explained in Remark (3) after the Main Theorem, this does not imply that $C_v$ is non-analytic: it is clear that in this case the non-analyticity is introduced "by hand" by our specific choice of $h_1^*$.

From the results of Theorem 4.1 and Lemma 4.2, together with (4.4) and (4.5), it follows that the assumptions of Theorem 3.4 are satisfied for any $\bar h \ge h_1^*$. The integration of the scales $\le h_1^*$ must be performed in a different way, as discussed in the next section.
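The role of the scale $h_1^*$ is to mark the point where the running masses become comparable with the momentum scale, as expressed by (4.5). The following sketch locates such a scale by direct search rather than through the closed formula (4.4). It is a simplified proxy under explicit assumptions: the corrections $F^h_\sigma, F^h_\mu$ in (4.3) are set to zero, the exponents and initial data are made-up illustrative numbers, and `crossover_scale` is a hypothetical helper, not the paper's definition.

```python
GAMMA = 2.0  # illustrative scaling parameter (assumption)

def masses(h, sigma_1, mu_1, eta_sigma, eta_mu):
    # power laws of (4.3) with the O(lambda) corrections F^h neglected
    sigma_h = abs(sigma_1) * GAMMA ** (eta_sigma * (h - 1))
    mu_h = abs(mu_1) * GAMMA ** (eta_mu * (h - 1))
    return sigma_h, mu_h

def crossover_scale(sigma_1, mu_1, eta_sigma, eta_mu, h_floor=-200):
    # largest integer h <= 1 with gamma^h <= |sigma_h| + |mu_h|;
    # returns h_floor if the masses never catch up (e.g. at criticality)
    h = 1
    while h > h_floor:
        s, m = masses(h, sigma_1, mu_1, eta_sigma, eta_mu)
        if GAMMA ** h <= s + m:
            return h
        h -= 1
    return h_floor
```

At the scale found this way the sum of the masses is sandwiched between two multiples of $\gamma^h$ with $O(1)$ constants, which is the content of (4.5); shifting the scale by a fixed integer only changes those constants.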
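5. Integration of the ψ Variables: Second Regime

5.1. Integration of the $\psi^{(1)}$ field. If $h_1^*$ is fixed as in §4.1, we can use Theorem 3.4 up to the scale $\bar h = h_1^* + 1$. Once all the scales $> h_1^*$ are integrated out, it is more convenient to describe the system in terms of the fields $\psi^{(1)}_\omega, \psi^{(2)}_\omega$, $\omega = \pm1$, defined through the following change of variables:
$$\hat\psi^{\alpha(\le h_1^*)}_{\omega,\mathbf k} = \frac{1}{\sqrt2}\big(\hat\psi^{(1,\le h_1^*)}_{\omega,-\alpha\mathbf k} - i\alpha\hat\psi^{(2,\le h_1^*)}_{\omega,-\alpha\mathbf k}\big), \qquad \psi^{(j)}_{\omega,\mathbf x} = \frac{1}{M^2}\sum_{\mathbf k}e^{-i\mathbf k\mathbf x}\hat\psi^{(j)}_{\omega,\mathbf k}. \qquad (5.1)$$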
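If we perform this change of variables, we find $P_{Z_{h_1^*},\sigma_{h_1^*},\mu_{h_1^*},C_{h_1^*}} = \prod_{j=1}^{2}P^{(j)}_{Z_{h_1^*},m^{(j)}_{h_1^*},C_{h_1^*}}$, where, if $\Psi^{(j,\le h_1^*),T}_{\mathbf k} \stackrel{def}{=} (\hat\psi^{(j,\le h_1^*)}_{1,\mathbf k}, \hat\psi^{(j,\le h_1^*)}_{-1,\mathbf k})$,
$$P^{(j)}_{Z_{h_1^*},m^{(j)}_{h_1^*},C_{h_1^*}}(d\psi^{(j,\le h_1^*)}) \stackrel{def}{=} \frac{1}{N^{(j)}_{h_1^*}}\prod_{\mathbf k,\omega}d\hat\psi^{(j,\le h_1^*)}_{\omega,\mathbf k}\,\exp\Big\{-\frac{Z_{h_1^*}}{4M^2}\sum_{\mathbf k\in D_{h_1^*}}C_{h_1^*}(\mathbf k)\,\Psi^{(j,\le h_1^*),T}_{\mathbf k}A^{(h_1^*)}_j(\mathbf k)\,\Psi^{(j,\le h_1^*)}_{-\mathbf k}\Big\},$$
$$A^{(h_1^*)}_j(\mathbf k) \stackrel{def}{=} \begin{pmatrix}(-i\sin k - \sin k_0) + a^{+(j)}_{h_1^*}(\mathbf k) & -i\big(m^{(j)}_{h_1^*}(\mathbf k) + c^{(j)}_{h_1^*}(\mathbf k)\big)\\ i\big(m^{(j)}_{h_1^*}(\mathbf k) + c^{(j)}_{h_1^*}(\mathbf k)\big) & (-i\sin k + \sin k_0) + a^{-(j)}_{h_1^*}(\mathbf k)\end{pmatrix} \qquad (5.2)$$
and $a^{\omega(j)}_{h_1^*}, m^{(j)}_{h_1^*}, c^{(j)}_{h_1^*}$ are given by (A3.2) with $h = h_1^* + 1$.

The propagators $g^{(j,\le h_1^*)}_{\omega_1,\omega_2}$ associated with the fermionic integration (5.2) are given by (A3.1) with $h = h_1^* + 1$. Note that, by (4.5), $\max\{|m^{(1)}_{h_1^*}|, |m^{(2)}_{h_1^*}|\} = |\sigma_{h_1^*}| + |\mu_{h_1^*}| = O(\gamma^{h_1^*})$ (see (A3.2) for the definition of $m^{(1)}_{h_1^*}, m^{(2)}_{h_1^*}$). From now on, for definiteness we shall suppose that $\max\{|m^{(1)}_{h_1^*}|, |m^{(2)}_{h_1^*}|\} = |m^{(1)}_{h_1^*}|$. Then, it is easy to realize that the propagator $g^{(1,\le h_1^*)}_{\omega_1,\omega_2}$ is bounded as follows:
$$|\partial^{n_0}_{x_0}\partial^{n_1}_{x}g^{(1,\le h_1^*)}_{\omega_1,\omega_2}(\mathbf x)| \le C_{N,n}\frac{\gamma^{(1+n)h_1^*}}{1 + (\gamma^{h_1^*}|\mathbf d(\mathbf x)|)^N}, \qquad n = n_0 + n_1, \qquad (5.3)$$
namely $g^{(1,\le h_1^*)}_{\omega_1,\omega_2}$ satisfies the same bound as the single scale propagator on scale $h = h_1^*$. This suggests integrating out $\psi^{(1,\le h_1^*)}$, without any other scale decomposition. We find the following result:

Lemma 5.1. If $|\lambda| \le \varepsilon_1$, $|\sigma_1|, |\mu_1| \le c_1$ ($c_1, \varepsilon_1$ being the same as in Theorem 2.1) and $\nu_1$ is fixed as in Theorem 4.1, we can rewrite the partition function as
$$\Xi^-_{AT} = \int P^{(2)}_{Z_{h_1^*},\hat m^{(2)}_{h_1^*},C_{h_1^*}}(d\psi^{(2,\le h_1^*)})\, e^{-\overline V^{(h_1^*)}(\sqrt{Z_{h_1^*}}\psi^{(2,\le h_1^*)}) - M^2\overline E_{h_1^*}}, \qquad (5.4)$$
where: $\hat m^{(2)}_{h_1^*}(\mathbf k) = m^{(2)}_{h_1^*}(\mathbf k) - \gamma^{h_1^*}\pi_{h_1^*}C^{-1}_{h_1^*}(\mathbf k)$, with $\pi_{h_1^*}$ a free parameter, s.t. $|\pi_{h_1^*}| \le c|\lambda|$; $|\overline E_{h_1^*} - E_{h_1^*}| \le c|\lambda|\gamma^{2h_1^*}$; and
$$\overline V^{(h_1^*)}(\psi^{(2)}) - \gamma^{h_1^*}\pi_{h_1^*}F^{(2,\le h_1^*)}_\sigma(\psi^{(2,\le h_1^*)}) = \sum_{n=1}^{\infty}\sum_{\underline\omega}\prod_{i=1}^{2n}\hat\psi^{(2)}_{\omega_i,\mathbf k_i}\,\overline W^{(h_1^*)}_{2n,\underline\omega}(\mathbf k_1, \ldots, \mathbf k_{2n-1})\,\delta\Big(\sum_{i=1}^{2n}\mathbf k_i\Big)$$
$$= \sum_{n=1}^{\infty}\sum_{\underline\sigma,\underline j,\underline\omega}\prod_{i=1}^{2n}\partial^{\sigma_i}_{j_i}\psi^{(2)}_{\omega_i,\mathbf x_i}\,\overline W^{(h_1^*)}_{2n,\underline\sigma,\underline j,\underline\omega}(\mathbf x_1, \ldots, \mathbf x_{2n}), \qquad (5.5)$$
with $F^{(2,\le h)}_\sigma$ given by the first of (3.8) with $\hat\psi^{(2,\le h)}_{\omega,\mathbf k}\hat\psi^{(2,\le h)}_{\omega',-\mathbf k}$ replacing $\hat\psi^{+(\le h)}_{\omega,\mathbf k}\hat\psi^{-(\le h)}_{\omega',\mathbf k}$; and $\overline W^{(h_1^*)}_{2n,\underline\sigma,\underline j,\underline\omega}$ satisfying the same bound (3.25) as $W^{(\bar h)}_{2n,\underline\sigma,\underline j,\underline\alpha,\underline\omega}$ with $\bar h = h_1^*$.

In order to prove the lemma it is sufficient to consider (3.2) with $h = h_1^*$ and rewrite $P_{Z_{h_1^*},\sigma_{h_1^*},\mu_{h_1^*},C_{h_1^*}}$ as the product $\prod_{j=1}^{2}P^{(j)}_{Z_{h_1^*},m^{(j)}_{h_1^*},C_{h_1^*}}$. Then the integration over the $\psi^{(1,\le h_1^*)}$ field is done as the integration of the $\chi$'s in Appendix A2, recalling the bound (5.3). Finally we rewrite $m^{(2)}_{h_1^*}(\mathbf k)$ as $\hat m^{(2)}_{h_1^*}(\mathbf k) + \gamma^{h_1^*}\pi_{h_1^*}C^{-1}_{h_1^*}(\mathbf k)$, where $\pi_{h_1^*}$ is a parameter to be suitably fixed below as a function of $\lambda, \sigma_1, \mu_1$.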
5.2. The localization operator. The integration of the r.h.s. of (5.4) is done in an iterative way similar to the one described in §3. If $h = h_1^*, h_1^*-1, \ldots$, we shall write:
$$\Xi^-_{AT} = \int P^{(2)}_{Z_h,\hat m^{(2)}_h,C_h}(d\psi^{(2,\le h)})\, e^{-\overline V^{(h)}(\sqrt{Z_h}\psi^{(2,\le h)}) - M^2E_h}, \qquad (5.6)$$
where $\overline V^{(h)}$ is given by an expansion similar to (5.5), with $h$ replacing $h_1^*$, and $Z_h, \hat m^{(2)}_h$ are defined recursively in the following way. We first introduce a localization operator $L$. As in §3.2, we define $L$ as a combination of four operators $L_j$ and $\overline P_j$, $j = 0, 1$. $L_j$ are defined as in (3.4) and (3.5), while $\overline P_0$ and $\overline P_1$, in analogy with (3.6), are defined as the operators extracting from a functional of $\hat m^{(2)}_h(\mathbf k)$, $h \le h_1^*$, the contributions independent of and linear in $\hat m^{(2)}_h(\mathbf k)$. Note that inductively the kernels $\overline W^{(h)}_{2n,\underline\omega}$ can be thought of as functionals of $\hat m^{(2)}_k(\mathbf k)$, $h \le k \le h_1^*$. Given $L_j, \overline P_j$, $j = 0, 1$ as above, we define the action of $L$ on the kernels $\overline W^{(h)}_{2n,\underline\omega}$ as follows.

1) If $n = 1$, then
$$L\overline W^{(h)}_{2,\underline\omega} \stackrel{def}{=} \begin{cases}L_0(\overline P_0 + \overline P_1)\overline W^{(h)}_{2,\underline\omega} & \text{if } \omega_1+\omega_2 = 0,\\ L_1\overline P_0\overline W^{(h)}_{2,\underline\omega} & \text{if } \omega_1+\omega_2 \ne 0.\end{cases}$$

2) If $n > 2$, then $L\overline W^{(h)}_{2n,\underline\omega} = 0$.

It is easy to prove the analogue of Lemma 3.1:
$$L\overline V^{(h)} = (s_h + \gamma^hp_h)F^{(2,\le h)}_\sigma + z_hF^{(2,\le h)}_\zeta, \qquad (5.7)$$
where $s_h, p_h$ and $z_h$ are real constants and $s_h$ is linear in $\hat m^{(2)}_k(\mathbf k)$, $h \le k \le h_1^*$; $p_h$ and $z_h$ are independent of $\hat m^{(2)}_k(\mathbf k)$. Furthermore $F^{(2,\le h)}_\sigma$ and $F^{(2,\le h)}_\zeta$ are given by the first and the last of (3.8) with $\hat\psi^{(2,\le h)}_{\omega,\mathbf k}\hat\psi^{(2,\le h)}_{\omega',-\mathbf k}$ replacing $\hat\psi^{+(\le h)}_{\omega,\mathbf k}\hat\psi^{-(\le h)}_{\omega',\mathbf k}$.

Remark. Note that the action of $L$ on the quartic terms is trivial. The reason for such a choice is that in the present case no quartic local term can appear, because of the Pauli principle: $\psi^{(2,h)}_{1,\mathbf x}\psi^{(2,h)}_{1,\mathbf x}\psi^{(2,h)}_{-1,\mathbf x}\psi^{(2,h)}_{-1,\mathbf x} = 0$, so that $L_0\overline W^{(h)}_{4,\underline\omega} = 0$.
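Using the symmetry properties exposed in Appendix A2.2, we can prove the analogue of Lemma 3.2: if $n = 1$, then
$$R\overline W_{2,\underline\omega} = \begin{cases}[\overline S_2 + R_2(\overline P_0 + \overline P_1)]\overline W_{2,\underline\omega} & \text{if } \omega_1+\omega_2 = 0,\\ [R_1\overline S_1 + R_2\overline P_0]\overline W_{2,\underline\omega} & \text{if } \omega_1+\omega_2 \ne 0,\end{cases} \qquad (5.8)$$
where $\overline S_1 = 1 - \overline P_0$ and $\overline S_2 = 1 - \overline P_0 - \overline P_1$; if $n = 2$, then $\overline W_{4,\underline\omega} = R_1\overline W_{4,\underline\omega}$.

5.3. Renormalization for $h \le h_1^*$. If $L$ and $R = 1 - L$ are defined as in the previous subsection, we can rewrite (5.6) as:
$$\int P^{(2)}_{Z_h,\hat m^{(2)}_h,C_h}(d\psi^{(2,\le h)})\, e^{-L\overline V^{(h)}(\sqrt{Z_h}\psi^{(2,\le h)}) - R\overline V^{(h)}(\sqrt{Z_h}\psi^{(2,\le h)}) - M^2E_h}. \qquad (5.9)$$
Furthermore, using (5.7) and defining:
$$\hat Z_{h-1}(\mathbf k) \stackrel{def}{=} Z_h(1 + C_h^{-1}(\mathbf k)z_h), \qquad \hat m^{(2)}_{h-1}(\mathbf k) \stackrel{def}{=} \frac{Z_h}{\hat Z_{h-1}(\mathbf k)}\big(\hat m^{(2)}_h(\mathbf k) + C_h^{-1}(\mathbf k)s_h\big), \qquad (5.10)$$
we see that (5.9) is equal to
$$\int P^{(2)}_{\hat Z_{h-1},\hat m^{(2)}_{h-1},C_h}(d\psi^{(2,\le h)})\, e^{-\gamma^hp_hF^{(2,\le h)}_\sigma(\sqrt{Z_h}\psi^{(2,\le h)}) - R\overline V^{(h)}(\sqrt{Z_h}\psi^{(2,\le h)}) - M^2(E_h+t_h)}. \qquad (5.11)$$
Again, we rescale the potential:
$$\hat{\overline V}^{(h)}(\sqrt{Z_{h-1}}\psi^{(\le h)}) \stackrel{def}{=} \gamma^h\pi_hF^{(2,\le h)}_\sigma(\sqrt{Z_{h-1}}\psi^{(2,\le h)}) + R\overline V^{(h)}(\sqrt{Z_h}\psi^{(2,\le h)}), \qquad (5.12)$$
where $Z_{h-1} = \hat Z_{h-1}(0)$ and $\pi_h = (Z_h/Z_{h-1})p_h$; we define $\tilde f_h^{-1}$ as in (3.14), we perform the single scale integration and we define the new effective potential as
$$e^{-\overline V^{(h-1)}(\sqrt{Z_{h-1}}\psi^{(2,\le h-1)}) - M^2\tilde E_h} \stackrel{def}{=} \int P^{(2)}_{Z_{h-1},\hat m^{(2)}_{h-1},\tilde f_h^{-1}}(d\psi^{(2,h)})\, e^{-\hat{\overline V}^{(h)}(\sqrt{Z_{h-1}}\psi^{(2,\le h)})}. \qquad (5.13)$$
Finally we set $E_{h-1} = E_h + t_h + \tilde E_h$. Note that the above procedure allows us to write the $\pi_h$ in terms of $\pi_k$, $h \le k \le h_1^*$, namely $\pi_{h-1} = \gamma^h\pi_h + \beta^h_\pi(\pi_h, \ldots, \pi_{h_1^*})$, where $\beta^h_\pi$ is the Beta function.

Proceeding as in §3 we can inductively show that $\overline V^{(h)}$ has the structure of (5.5), with $h$ replacing $h_1^*$ and that the kernels of $\overline V^{(h)}$ are bounded as follows.

Lemma 5.2. Let the hypotheses of Lemma 5.1 be satisfied and suppose that, for $\bar h < h \le h_1^*$ and some constants $c, \vartheta > 0$,
$$e^{-c|\lambda|} \le \frac{\hat m^{(2)}_h}{\hat m^{(2)}_{h-1}} \le e^{c|\lambda|}, \qquad e^{-c|\lambda|^2} \le \frac{Z_h}{Z_{h-1}} \le e^{c|\lambda|^2}, \qquad |\pi_h| \le c|\lambda|, \qquad |\hat m^{(2)}_{\bar h}| \le \gamma^{\bar h}. \qquad (5.14)$$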
Then the partition function can be rewritten as in (5.6) and there exists $C > 0$ s.t. the kernels of $\overline V^{(\bar h)}$ satisfy:
$$\int d\mathbf x_1\cdots d\mathbf x_{2n}\,|\overline W^{(\bar h)}_{2n,\underline\sigma,\underline j,\underline\omega}(\mathbf x_1, \ldots, \mathbf x_{2n})| \le M^2\gamma^{-\bar hD_k(n)}(C|\lambda|)^{\max(1,n-1)}, \qquad (5.15)$$
where $D_k(n) = -2 + n + k$ and $k = \sum_{i=1}^{2n}\sigma_i$. Finally $|\tilde E_{\bar h+1}| + |t_{\bar h+1}| \le c|\lambda|\gamma^{2\bar h}$.

The proof of Lemma 5.2 is essentially identical to the proof of Theorem 3.4 and we do not repeat it here. It is possible to fix $\pi_{h_1^*}$ so that the first three assumptions in (5.14) are valid for any $h \le h_1^*$. More precisely, the following result holds, see Appendix A6.

Lemma 5.3. If $|\lambda| \le \varepsilon_1$, $|\sigma_1|, |\mu_1| \le c_1$ and $\nu_1$ is fixed as in Theorem 4.1, there exists $\pi^*_{h_1^*}(\lambda, \sigma_1, \mu_1)$ such that, if we fix $\pi_{h_1^*} = \pi^*_{h_1^*}(\lambda, \sigma_1, \mu_1)$, for $h \le h_1^*$ we have:
$$|\pi_h| \le c|\lambda|\gamma^{(\vartheta/2)(h-h_1^*)}, \qquad \hat m^{(2)}_h = \hat m^{(2)}_{h_1^*}\gamma^{F^h_m}, \qquad Z_h = Z_{h_1^*}\gamma^{\overline F^h_\zeta}, \qquad (5.16)$$
where $F^h_m$ and $\overline F^h_\zeta$ are $O(\lambda)$. Moreover:
$$\big|\pi^*_{h_1^*}(\lambda, \sigma_1, \mu_1) - \pi^*_{h_1^*}(\lambda, \sigma_1', \mu_1')\big| \le c|\lambda|\Big(\gamma^{(\eta_\sigma-1)h_1^*}|\sigma_1 - \sigma_1'| + \gamma^{(\eta_\mu-1)h_1^*}|\mu_1 - \mu_1'|\Big). \qquad (5.17)$$

5.4. The integration of the scales $\le h_2^*$. In order to ensure that the last assumption in (5.14) holds, we iterate the preceding construction up to the scale $h_2^*$ defined as the scale s.t. $|\hat m^{(2)}_k| \le \gamma^{k-1}$ for any $h_2^* \le k \le h_1^*$ and $|\hat m^{(2)}_{h_2^*-1}| > \gamma^{h_2^*-2}$.

Once we have integrated all the fields $\psi^{(>h_2^*)}$, we can integrate $\psi^{(2,\le h_2^*)}$ without any further multiscale decomposition. Note in fact that by definition the propagator satisfies the same bound (5.3) with $h_2^*$ replacing $h_1^*$. Then, if we define
$$e^{-M^2\tilde E_{\le h_2^*}} \stackrel{def}{=} \int P_{Z_{h_2^*-1},\hat m^{(2)}_{h_2^*-1},C_{h_2^*}}\, e^{-\overline V^{(h_2^*)}(\sqrt{Z_{h_2^*-1}}\psi^{(2,\le h_2^*)})}, \qquad (5.18)$$
we find that $|\tilde E_{\le h_2^*}| \le c|\lambda|\gamma^{2h_2^*}$ (the proof is a repetition of the estimates on the single scale integration). Combining this bound with the results of Theorem 3.4, Lemma 5.1, Lemma 5.2 and Lemma 5.3, together with the results of §4 we finally find that the free energy associated to $\Xi^-_{AT}$ is given by the following finite sum, uniformly convergent with the size of $\Lambda_M$:
$$\lim_{M\to\infty}\frac{1}{M^2}\log\Xi^-_{AT} = E_{\le h_2^*} + (\overline E_{h_1^*} - E_{h_1^*}) + \sum_{h=h_2^*+1}^{1}(\tilde E_h + t_h), \qquad (5.19)$$
where $E_{\le h_2^*} = \lim_{M\to\infty}\tilde E_{\le h_2^*}$ and it is easy to see that $E_{\le h_2^*}$, for any finite $h_2^*$, exists and satisfies the same bound as $\tilde E_{h_2^*}$.
5.5. Keeping $h_2^*$ finite. From the discussion of the previous subsection, it follows that, for any finite $h_2^*$, (5.19) is an analytic function of $\lambda, t, u$, for $|\lambda|$ sufficiently small, uniformly in $h_2^*$ (this is an elementary consequence of Vitali's convergence theorem). Moreover, repeating the discussion of Appendix G in [M1], it can be proved that, for any $\gamma^{h_2^*} > 0$ (here $\gamma^{h_2^*}$ plays the role of $|t - t_c|$ in Appendix G of [M1]), the limit (5.19) coincides with $\lim_{M\to\infty}\frac{1}{M^2}\log\hat\Xi^{\gamma_1,\gamma_2}_{AT}$ for any choice $\gamma_1, \gamma_2$ of boundary conditions; hence this limit coincides with $-2\log\cosh\lambda$ plus the free energy in (1.2), see also (2.6). We can state the result as follows.

Lemma 5.4. There exists $\varepsilon_1 > 0$ such that, if $|\lambda| \le \varepsilon_1$ and $t \pm u \in D$ (the same as in the Main Theorem), the free energy $f$ defined in (1.2) is real analytic in $\lambda, t, u$, except possibly for the choices of $\lambda, t, u$ such that $\gamma^{h_2^*} = 0$.

We shall see in §6 below that the specific heat is logarithmically divergent as $\gamma^{h_2^*} \to 0$. So the critical point is really given by the condition $\gamma^{h_2^*} = 0$. We shall explicitly solve the equation for the critical point in the next subsection.

5.6. The critical points. In the present subsection we check that, if $t \pm u \in D$, $D$ being a suitable interval centered around $\sqrt2 - 1$, see the Main Theorem, there are precisely two critical points of the form (1.5). More precisely, keeping in mind that the equation for the critical point is simply $\gamma^{h_2^*} = 0$ (see the end of the previous subsection), we prove the following.

Lemma 5.5. Let $|\lambda| \le \varepsilon_1$, $t \pm u \in D$ and $\pi_{h_1^*}$ be fixed as in Lemma 5.3. Then $\gamma^{h_2^*} = 0$ only if $(\lambda, t, u) = (\lambda, t_c^\pm(\lambda, u), u)$, where $t_c^\pm(\lambda, u)$ is given by (1.5).

Proof. From the definition of $h_2^*$ given above, see §5.4, it follows that $h_2^*$ satisfies the following equation:
$$\gamma^{h_2^*-1} = c_m\gamma^{F^{h_2^*}_m}\Big||\sigma_{h_1^*}| - |\mu_{h_1^*}| - \alpha_\sigma\gamma^{h_1^*}\pi_{h_1^*}\Big|, \qquad (5.20)$$
for some $1 \le c_m < \gamma$ and $\alpha_\sigma = \mathrm{sign}\,\sigma_1$. Then, the equation $\gamma^{h_2^*} = 0$ can be rewritten as:
$$|\sigma_{h_1^*}| - |\mu_{h_1^*}| - \alpha_\sigma\gamma^{h_1^*}\pi_{h_1^*} = 0. \qquad (5.21)$$
First note that the result of Lemma 5.5 is trivial when $h_1^* = 1$. If $h_1^* < 1$, (5.21) cannot be solved when $|\sigma_1|^{\frac{1}{1-\eta_\sigma}} > 2|\mu_1|^{\frac{1}{1-\eta_\mu}}$. In fact,
$$|\sigma_1|\gamma^{\eta_\sigma(h_1^*-1)+F^{h_1^*}_\sigma} - |\mu_1|\gamma^{\eta_\mu(h_1^*-1)+F^{h_1^*}_\mu} - \alpha_\sigma\gamma^{h_1^*}\pi_{h_1^*} = |\sigma_1|^{1+\frac{\eta_\sigma}{1-\eta_\sigma}}c_1 - |\mu_1||\sigma_1|^{\frac{\eta_\mu}{1-\eta_\sigma}}c_1' - \alpha_\sigma\gamma^{h_1^*}\pi_{h_1^*} \ge \frac{\gamma^{h_1^*-1}}{3\gamma}, \qquad (5.22)$$
where $c_1, c_1'$ are constants $= 1 + O(\lambda)$, $\pi_{h_1^*} = O(\lambda)$ and $\gamma^{h_1^*-1} = c_\sigma|\sigma_1|^{\frac{1}{1-\eta_\sigma}}$, with $1 \le c_\sigma < \gamma$. Now, if $|\mu_1| > 0$, the r.h.s. of (5.22) is strictly positive.
So, let us consider the case $h_1^* < 1$ and $|\sigma_1|^{\frac{1}{1-\eta_\sigma}} \le 2|\mu_1|^{\frac{1}{1-\eta_\mu}}$ (s.t. $\gamma^{h_1^*-1} = c_u|u|^{\frac{1}{1-\eta_\mu}}$, with $1 \le c_u < \gamma$). In this case (5.21) can be easily solved to find:
$$|\sigma_1| = |\mu_1||u|^{\frac{\eta_\mu-\eta_\sigma}{1-\eta_\mu}}c_u^{\eta_\mu-\eta_\sigma}\gamma^{F^{h_1^*}_\mu - F^{h_1^*}_\sigma} + |u|^{\frac{1-\eta_\sigma}{1-\eta_\mu}}c_u^{1-\eta_\sigma}\alpha_\sigma\gamma^{1-F^{h_1^*}_\sigma}\pi_{h_1^*}. \qquad (5.23)$$
Note that $c_u^{\eta_\mu-\eta_\sigma}\gamma^{F^{h_1^*}_\mu - F^{h_1^*}_\sigma} = 1 + O(\lambda)$ is just a function of $u$ (it does not depend on $t$), because of our definition of $h_1^*$. Moreover $\pi_{h_1^*}$ is a smooth function of $t$: if we call $\pi_{h_1^*}(t, u)$, resp. $\pi_{h_1^*}(t', u)$, the correction corresponding to the initial data $\sigma_1(t, u), \mu_1(t, u)$, resp. $\sigma_1(t', u), \mu_1(t', u)$, we have
$$|\pi_{h_1^*}(t, u) - \pi_{h_1^*}(t', u)| \le c|\lambda||u|^{\frac{\eta_\sigma-1}{1-\eta_\mu}}|t - t'|, \qquad (5.24)$$
where we used (5.17) and the bounds $|\sigma_1 - \sigma_1'| \le c|t - t'|$ and $|\mu_1 - \mu_1'| \le c|u||t - t'|$, following from the definitions of $(\sigma_1, \mu_1)$ in terms of $(\sigma, \mu)$ and of $(t, u)$, see §2. Using the same definitions we also realize that (5.23) can be rewritten as t=
√ 1 + λ(t ˆ 2 − u2 ) ν(λ) , 2−1+ ± |u|1+η 1 + λf (t, u) 2 1 + λˆ
(5.25)
where
$$1 + \eta \stackrel{def}{=} \frac{1-\eta_\sigma}{1-\eta_\mu}, \qquad (5.26)$$
and the crucial property is that $\eta = -b\lambda + O(\lambda^2)$, $b > 0$, see Lemma 4.2 and Appendix A5. We also recall that both $\eta$ and $\nu$ are functions of $\lambda$ and are independent of $t, u$. Moreover $f(t, u)$ is a suitable bounded function s.t. $|f(t, u) - f(t', u)| \le c|u|^{-(1+\eta)}|t - t'|$, as follows from the Lipschitz property (5.24) of $\pi_{h_1^*}$. The r.h.s. of (5.25) is Lipschitz in $t$ with constant $O(\lambda)$, so that (5.25) can be inverted w.r.t. $t$ by contractions and, for both choices of the sign, we find a unique solution
$$t = t_c^\pm(\lambda, u) = \sqrt2 - 1 + \nu^*(\lambda) \pm |u|^{1+\eta}\big(1 + F^\pm(\lambda, u)\big), \qquad (5.27)$$
with $|F^\pm(\lambda, u)| \le c|\lambda|$, for some $c$.
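The inversion "by contractions" used to reach (5.27) is a fixed-point iteration $t \mapsto G(t)$ for a map with small Lipschitz constant. The sketch below shows the mechanism on a simplified right-hand side. Everything specific in it is an assumption made for illustration: the map `G` is modelled on the shape of (5.27) rather than on the exact equation (5.25), the bounded Lipschitz function `f` is taken to be a sine, and the numerical values of `lam`, `u`, `eta`, `nu` are arbitrary small numbers.

```python
import math

def solve_by_contraction(G, t0, tol=1e-14, max_iter=200):
    # Banach fixed-point iteration: t_{n+1} = G(t_n)
    t = t0
    for _ in range(max_iter):
        t_next = G(t)
        if abs(t_next - t) < tol:
            return t_next
        t = t_next
    raise RuntimeError("fixed-point iteration did not converge")

def make_rhs(sign, lam, u, eta, nu, f):
    # hypothetical right-hand side with Lipschitz constant O(lam) in t
    def G(t):
        return (math.sqrt(2.0) - 1.0 + nu
                + sign * abs(u) ** (1.0 + eta) * (1.0 + lam * f(t, u)))
    return G

def critical_points(lam, u, eta, nu, f, t0=0.4):
    # one solution for each choice of the sign, as in (5.27)
    t_plus = solve_by_contraction(make_rhs(+1, lam, u, eta, nu, f), t0)
    t_minus = solve_by_contraction(make_rhs(-1, lam, u, eta, nu, f), t0)
    return t_plus, t_minus
```

Because the map contracts, the limit does not depend on the starting point, which is the numerical counterpart of the uniqueness statement; the two signs give two solutions separated by roughly $2|u|^{1+\eta}$.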
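5.7. Computation of $h_2^*$. Let us now solve (5.20) in the general case of $\gamma^{h_2^*} \ge 0$. Calling $\varepsilon \stackrel{def}{=} \gamma^{h_2^*-h_1^*-F^{h_2^*}_m}/c_m$, we find:
$$\varepsilon = \Big||\sigma_1|\gamma^{(\eta_\sigma-1)(h_1^*-1)+F^{h_1^*}_\sigma} - |\mu_1|\gamma^{(\eta_\mu-1)(h_1^*-1)+F^{h_1^*}_\mu} - \alpha_\sigma\gamma\pi_{h_1^*}\Big|$$
$$= \gamma^{(\eta_\sigma-1)(h_1^*-1)+F^{h_1^*}_\sigma}\Big||\sigma_1| - |\mu_1|\gamma^{(\eta_\mu-\eta_\sigma)(h_1^*-1)+F^{h_1^*}_\mu-F^{h_1^*}_\sigma} - \alpha_\sigma\gamma^{1+(1-\eta_\sigma)(h_1^*-1)-F^{h_1^*}_\sigma}\pi_{h_1^*}\Big|. \qquad (5.28)$$
If $|\sigma_1|^{1/(1-\eta_\sigma)} \le 2|\mu_1|^{1/(1-\eta_\mu)}$, we use $\gamma^{h_1^*-1} = c_u|u|^{1/(1-\eta_\mu)}$ and, from the second row of (5.28), we find: $\varepsilon = C\big||\sigma_1| - |\sigma^{\alpha_\sigma}_{1,c}|\big||u|^{-(1+\eta)}$, where $\sigma^\pm_{1,c} = \sigma_1(\lambda, t_c^\pm, u)$ and $C = C(\lambda, t, u)$ is bounded above and below by $O(1)$ constants; defining $\Delta$ as in (1.6), we can rewrite:
$$\varepsilon = C\frac{\big||\sigma_1| - |\sigma^{\alpha_\sigma}_{1,c}|\big|}{|u|^{1+\eta}} = C'\frac{\big|\sigma_1^2 - (\sigma^{\alpha_\sigma}_{1,c})^2\big|}{\Delta|u|^{1+\eta}} = C''\frac{|t - t_c^+|\cdot|t - t_c^-|}{\Delta^2}, \qquad (5.29)$$
where $C' = C'(\lambda, t, u)$ and $C'' = C''(\lambda, t, u)$ are bounded above and below by $O(1)$ constants.

In the opposite case ($|\sigma_1|^{1/(1-\eta_\sigma)} > 2|\mu_1|^{1/(1-\eta_\mu)}$), we use $\gamma^{h_1^*-1} = c_\sigma|\sigma_1|^{1/(1-\eta_\sigma)}$ and, from the first row of (5.28), we find $\varepsilon = \tilde C\big(1 - |\mu_1||\sigma_1|^{-1/(1+\eta)} - \alpha_\sigma\gamma\pi_{h_1^*}\big) = \bar C$, where $\tilde C$ and $\bar C$ are bounded above and below by $O(1)$ constants. Since in this region of parameters $|t - t_c^\pm|\Delta^{-1}$ is also bounded above and below by $O(1)$ constants, we can in both cases write
$$\varepsilon = C_\varepsilon(\lambda, t, u)\frac{|t - t_c^+|\cdot|t - t_c^-|}{\Delta^2}, \qquad C_{1,\varepsilon} \le C_\varepsilon(\lambda, t, u) \le C_{2,\varepsilon}, \qquad (5.30)$$
and $C_{j,\varepsilon}$, $j = 1, 2$, are suitable positive $O(1)$ constants.

6. The Specific Heat

Consider the specific heat defined in (1.2). The correlation function $\langle H^{AT}_{\mathbf x}H^{AT}_{\mathbf y}\rangle_{\Lambda_M,T}$ can be conveniently written as
$$\langle H^{AT}_{\mathbf x}H^{AT}_{\mathbf y}\rangle_{\Lambda_M,T} = \frac{\partial^2}{\partial\phi_{\mathbf x}\partial\phi_{\mathbf y}}\log\Xi_{AT}(\phi)\Big|_{\phi=0}, \qquad \Xi_{AT}(\phi) \stackrel{def}{=} \sum_{\sigma^{(1)},\sigma^{(2)}}e^{-\sum_{\mathbf x\in\Lambda_M}(1+\phi_{\mathbf x})H^{AT}_{\mathbf x}}, \qquad (6.1)$$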
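where $\phi_{\mathbf x}$ is a real commuting auxiliary field (with periodic boundary conditions). Repeating the construction of §2, we see that $\Xi_{AT}(\phi)$ admits a Grassmannian representation similar to the one of $\Xi_{AT}$, and in particular, if $\mathbf x \ne \mathbf y$:
$$\frac{\partial^2}{\partial\phi_{\mathbf x}\partial\phi_{\mathbf y}}\log\Xi_{AT}(\phi)\Big|_{\phi=0} = \frac{\partial^2}{\partial\phi_{\mathbf x}\partial\phi_{\mathbf y}}\log\sum_{\gamma_1,\gamma_2}(-1)^{\delta_{\gamma_1}+\delta_{\gamma_2}}\hat\Xi^{\gamma_1,\gamma_2}_{AT}(\phi)\Big|_{\phi=0},$$
$$\hat\Xi^{\gamma_1,\gamma_2}_{AT}(\phi) = \int\prod_{j=1,2}\prod_{\mathbf x\in\Lambda_M}dH^{(j)}_{\mathbf x}d\overline H^{(j)}_{\mathbf x}dV^{(j)}_{\mathbf x}d\overline V^{(j)}_{\mathbf x}\ e^{S^{(1)}_{\gamma_1}(t^{(1)}) + S^{(2)}_{\gamma_2}(t^{(2)}) + V_\lambda + B(\phi)}, \qquad (6.2)$$
where $\delta_\gamma$, $S^{(j)}(t^{(j)})$ and $V_\lambda$ were defined in §2 (see (2.2) and previous lines, and (2.9)), the superscript $\gamma_1, \gamma_2$ attached to $\hat\Xi_{AT}$ refers to the boundary conditions assigned to the Grassmannian fields, as in §2, and finally $B(\phi)$ is defined as:
$$B(\phi) = \sum_{\mathbf x\in\Lambda_M}\phi_{\mathbf x}\Big\{a^{(1)}\big[\overline H^{(1)}_{\mathbf x}H^{(1)}_{\mathbf x+\hat e_1} + \overline V^{(1)}_{\mathbf x}V^{(1)}_{\mathbf x+\hat e_0}\big] + a^{(2)}\big[\overline H^{(2)}_{\mathbf x}H^{(2)}_{\mathbf x+\hat e_1} + \overline V^{(2)}_{\mathbf x}V^{(2)}_{\mathbf x+\hat e_0}\big]$$
$$+ \lambda\tilde a\big[\overline H^{(1)}_{\mathbf x}H^{(1)}_{\mathbf x+\hat e_1}\overline H^{(2)}_{\mathbf x}H^{(2)}_{\mathbf x+\hat e_1} + \overline V^{(1)}_{\mathbf x}V^{(1)}_{\mathbf x+\hat e_0}\overline V^{(2)}_{\mathbf x}V^{(2)}_{\mathbf x+\hat e_0}\big]\Big\} \stackrel{def}{=} \sum_{\mathbf x\in\Lambda_M}\phi_{\mathbf x}A_{\mathbf x}, \qquad (6.3)$$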
where a (1) , a (2) and a are O(1) constants, with a (1) − a (2) = O(u). Using (6.2) and (6.3) we can rewrite: < HxAT HyAT >,T =
γ ,γ 1 2 1 2 γ ,γ2 (cosh J )2M (−1)δγ1 +δγ2 AT < Ax Ay >1M ,T ,(6.4) 4 AT γ ,γ 1
2
γ ,γ2 >1M ,T
where < · is the average w.r.t. the boundary conditions γ1 , γ2 . Proceeding as ∗ γ ,γ2 is expoin Appendix G of [M1] one can show that, if γ h2 > 0, < Ax Ay >1M ,T δγ1 +δγ2 γ1 ,γ2 nentially insensitive to boundary conditions and γ1 ,γ2 (−1) AT /AT is an (φ) and O(1) constant. Then from now on we will study only − AT (φ) = AT (−,−),(−,−) . < Ax Ay >M ,T As in §2 we integrate out the χ fields and, proceeding as in Appendix A2.1, we find: (1) (1) − (6.5) AT (φ) = PZ1 ,σ1 ,µ1 ,C1 (dψ)eV +B , def
(−,−),(−,−)
where B(1) (ψ, φ) =
∞
σ ,j ,α,ω
m
(1)
m,n=1 x1 ···xm
Bm,2n;σ ,j ,α,ω (x1 , . . . , xm ; y1 , . . . , y2n )
i=1
y1 ···y2n
φxi
2n
σ α ∂j i ψyii,ωi .
(6.6)
i
i=1
We proceed as for the partition function, namely as described in §3 above. We introduce the scale decomposition described in §3 and we perform iteratively the integration of the single scale fields, starting from the field of scale 1. After the integration of the fields ψ (1) , . . . , ψ (h+1) , h∗1 < h ≤ 0, we are left with (h) √ (≤h) (h) √ (≤h) − −M 2 Eh +S (h+1) (φ) AT (φ) = e PZh ,σh ,µh ,Ch (dψ ≤h )e−V ( Zh ψ )+B ( Zh ψ ,φ) , (6.7) (dψ (≤h) ) and V (h) are the same as in §3, S (h+1) (φ) denotes the sum
where PZh ,σh ,µh mh ,Ch of the contributions dependent on φ but independent of ψ, and finally B (h) (ψ (≤h) , φ) denotes the sum over all terms containing at least one φ field and two ψ fields. S (h+1) and B (h) can be represented as S (h+1) (φ) =
∞
(h+1) Sm (x1 , . . . , xm )
m=1 x1 ···xm
B (h) (ψ (≤h) , φ) =
,α,ω ∞ σ ,j m,n=1
x1 ···xm y1 ···y2n
2n
×
m
φxi
i=1 (h)
m
Bm,2n;σ ,j ,α,ω (x1 , . . . , xm ; y1 , . . . , y2n )
i . ∂ σi ψy(≤h)α ,ω i i
φxi
i=1
(6.8)
i=1
Since the field φ is equivalent, regarding dimensional bounds, to two ψ fields (see Theorem 6.1 below for a more precise statement), the only terms in the expansion for B (h) which are not irrelevant are those with m = n = 1, σ1 = σ2 = 0 and they are marginal. Hence we extend the definition of the localization operator L, so that its action on (h) B (h) (ψ (≤h) , φ) is defined by its action on the kernels B m,2n;α,ω (q1 , . . ., qm ;k1 , . . ., k2n ):
def (h) 1) if m = n = 1 and α1 + α2 = ω1 + ω2 = 0, then LB 1,2;σ ,α,ω (q1 ; k1 , k2 ) = (h) P0 B 1,2;α,ω (k+ ; k+ , k+ ), where P0 is defined as in (3.6); (h) 2) in all other cases LB = 0. m,2n;α,ω
Using the symmetry considerations of Appendix B together with the remark that φx is invariant under Complex conjugation, Hole–particle and (1)← →(2), while under Parity φx → φ−x and under Rotation φ(x,x0 ) → φ(−x0 ,−x) , we easily realize that LB(h) has necessarily the following form: LB(h) (ψ (≤h) , φ) =
Z h (−iω) (≤h)+ (≤h)− ψ−ω,x , φx ψω,x Zh x,ω 2
(6.9)
where Z h is real and Z 1 = a (1) |σ =µ=0 = a (2) |σ =µ=0 . (≤h)α (≤h)α Note that apriori a term x,ω,α φx ψω,x ψ−ω,x is allowed by symmetry but, using (1)←→(2) symmetry, one sees that its kernel is proportional to µk , k ≥ h. So, with our definition of localization, such a term contributes to RB(h) . Now that the action of L on B is defined, we can describe the single scale integration, for h > h∗1 . The integral in the r.h.s. of (6.7) can be rewritten as: −M 2 th PZh−1 ,σh−1 ,µh−1 ,Ch−1 (dψ ≤h−1 ) e (≤h) (≤h) (h) √ (h) √ · PZh−1 ,σh−1 ,µh−1 ,f−1 (dψ (h) )e−V ( Zh−1 ψ )+B ( Zh−1 ψ ,φ) , (6.10) h
(h) was defined in (3.12) and where V (h) ( Zh−1 ψ (≤h) , φ)def B = B (h) ( Zh ψ (≤h) , φ) .
(6.11)
Finally we define
(h)
√
√
( Zh−1 ψ )+B ( Zh−1 ψ ,φ) e−Eh M +S (φ)−V √ (h) (≤h) (≤h) def (h) √ = PZh−1 ,σh−1 ,µh−1 ,f−1 (dψ (h) )e−V ( Zh−1 ψ )+B ( Zh−1 ψ ,φ) , (6.12) 2
(h−1)
(≤h−1)
(h−1)
(≤h−1)
h
and h Eh−1 = Eh + th + E def
,
S (h) (φ) = S (h+1) (φ) + S (h) (φ) . def
(6.13)
With the definitions above, it is easy to verify that $\bar Z_{h-1}$ satisfies the equation $\bar Z_{h-1} = \bar Z_h(1 + z_h)$, where $z_h = b\lambda_h + O(\lambda^2)$, for some $b \ne 0$. Then, for some $c > 0$, $\bar Z_1 e^{-c|\lambda|h} \le \bar Z_h \le \bar Z_1 e^{c|\lambda|h}$. The analogue of Theorem 3.1 for the kernels of $B^{(h)}$ holds:

Theorem 6.1. Suppose that the hypothesis of Lemma 5.1 is satisfied. Then, for $h_1^* \le \bar h \le 1$ and a suitable constant $C$, the kernels of $B^{(\bar h)}$ satisfy
$$ \int dx_1\cdots dx_{2n}\,\big|B^{(\bar h)}_{2n,m;\sigma,j,\alpha,\omega}(x_1,\dots,x_m;y_1,\dots,y_{2n})\big| \le M^2\,\gamma^{-\bar h(D_k(n)+m)}\,(C|\lambda|)^{\max(1,n-1)}, \qquad (6.14) $$
where $D_k(n) = -2 + n + k$ and $k = \sum_{i=1}^{2n}\sigma_i$.
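The recursion $\bar Z_{h-1} = \bar Z_h(1+z_h)$ stated before Theorem 6.1 can be iterated numerically. The sketch below is only illustrative: it assumes a constant $z_h = b\lambda$ (dropping the $O(\lambda^2)$ terms and the scale dependence of $\lambda_h$), and the values of `GAMMA`, `B`, `LAM` are arbitrary assumptions. Under that simplification the flow is an exact power law $\gamma^{\eta(h-1)}$ with $\eta = -\log_\gamma(1+b\lambda)$, and it stays within exponential bounds in $c|\lambda|(1-h)$, which is how the two-sided bound in the text is read here.

```python
import math


def flow_zbar(zbar_1, z_of_h, h_min):
    """Iterate Zbar_{h-1} = Zbar_h * (1 + z_h) from scale h = 1 down to h_min."""
    zbar = {1: zbar_1}
    for h in range(1, h_min, -1):
        zbar[h - 1] = zbar[h] * (1.0 + z_of_h(h))
    return zbar


# Toy parameters (assumptions, not taken from the paper).
GAMMA = 2.0
B = 0.5
LAM = 0.1
z_const = B * LAM  # constant stand-in for z_h = b * lambda_h + O(lambda^2)

flow = flow_zbar(1.0, lambda h: z_const, h_min=-20)

# With constant z the flow is a pure power law gamma**(eta * (h - 1)).
eta_toy = -math.log(1.0 + z_const, GAMMA)
```

In the actual construction $z_h$ depends on $h$ through the running coupling, which is what produces the additional bounded correction in the exponent.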
Fig. 3. The lowest order diagrams contributing to < HxAT HyAT >M ,T . The wavy lines ending in the points labeled x and y represent the fields φx and φy respectively. The solid lines labeled by h and going from x to y represent the propagators g (h) (x − y). The sums are over the scale indices and, even if not explicitly written, over the indexes α, ω (and the propagators depend on these indexes)
Note that, consistently with our definition of localization, the dimension of (h) B2,1;(0,0),(+,−),(ω,−ω) is D0 (1) + 1 = 0. Again, proceeding as in §4, we can study the flow of Z h up to h = −∞ and prove h that Z h = Z 1 γ η(h−1)+Fz¯ , where η is a non-trivial analytic function of λ (its linear part is non-vanishing) and Fz¯h is a suitable O(λ) function (independent of σ1 , µ1 ). We recall that Z 1 = O(1). We proceed as above up to the scale h∗1 . Once the scale h∗1 is reached we pass to the (1) ψ , ψ (2) variables, we integrate out (say) the ψ (1) fields and we get & ∗ & ∗ ∗ (h∗ 1 ) ( Z ∗ ψ (2,≤h1 ) )+B (h1 ) ( Z ∗ ψ (2,≤h1 ) ) h1 h1 (2) (2)(≤h∗1 ) −V P (dψ )e , (6.15) (2) Zh∗ , mh∗ ,Ch∗ 1
h∗1
&
1
1
∗
with LB ( Zh∗1 ψ (2),≤h1 ) = Z h∗1
(2,≤h∗1 ) (2,≤h∗1 ) ψ−1,x . x iφx ψ1,x
The scales h∗2 ≤ h ≤ h∗1 are integrated as in §5 and one finds that the flow of Z h in h this regime is trivial, i.e. if h∗2 ≤ h ≤ h∗1 , Z h = Z h∗1 γ Fz , with Fzh = O(λ). The result is that the correlation function < HxAT HyAT >M ,T is given by a convergent power series in λ, uniformly in M . Then, the leading behaviour of the specific heat is given by the sum over x and y of the lowest order contributions to < HxAT HyAT >M ,T , namely by the diagrams in Fig. 3. Absolute convergence of the power series of < HxAT HyAT >M ,T implies that the rest is a small correction. √ √ The conclusion is that Cv , for λ small and |t − 2 + 1|, |u| ≤ ( 2 − 1)/4, is given by: Cv =
(1) 1 (Zh∨h )2 1 || Z Z x,y∈M ω1 ,ω2 =±1 h,h =h∗2 h−1 h −1
(h) (h ) × G(+,ω1 ),(+,ω2 ) (x − y)G(−,−ω2 ),(−,−ω1 ) (y − x) (h )
(h)
+G(+,ω1 ),(−,−ω2 ) (x − y)G(−,−ω1 ),(+,ω2 ) (x − y) +
1 1 Z h 2 (h) M (x − y) , || Zh ∗ x,y∈M h2
where h ∨ h = max{h, h } and G(α1 ,ω1 ),(α2 ,ω2 ) (x) must be interpreted as (h)
(6.16)
(h) g(α1 ω1 ),(α2 ,ω2 ) (x) (2,h∗ ) (1,≤h∗1 ) gω1 ,ω2 (x) + gω1 ,ω12 (x) (h) G(α1 ω1 ),(α2 ,ω2 ) (x) = (2,h) gω1 ,ω2 (x) (2,≤h∗2 ) gω1 ,ω2 (x)
if h > h∗1 , if h = h∗1 , if h∗2 < h < h∗1 , if h = h∗2 . (2+n)h
(h)
Moreover, if N, n0 , n1 ≥ 0 and n = n0 + n1 , |∂xn0 ∂x0 M (x)| ≤ CN,n |λ| 1+(γγ h |d(x)|)N . Now, calling ηc the exponent associated to Z h /Zh , from (6.16) we find: Cv = −C1 γ
2ηc h∗1
logγ γ
h∗1 −h∗2
∗ 1 − γ 2ηc (h1 −1) (1) (2) 1 + h∗ ,h∗ (λ) + C2 1 + h∗ (λ) 1 2 1 2ηc
,
(6.17) (1)
(2)
where |h∗ ,h∗ (λ)|, |h∗ (λ)| ≤ c|λ|, for some c. Note that, defining as in (1.6), ∗
1
2
1
$\gamma^{(1-\eta_\sigma)h_1^*}\,\Delta^{-1}$ is bounded above and below by $O(1)$ constants. Then, using (5.30), (1.6) follows.

Appendix A1. Proof of (2.1)

We start from Eq. (V.2.12) in [MW], expressing the partition function of the Ising model with periodic boundary conditions on a lattice with an even number of sites as a combination of the Pfaffians of four matrices with different boundary conditions, defined by (V.2.10) and (V.2.11) in [MW]. In the general case (i.e. $M^2$ not necessarily even), (V.2.12) of [MW] becomes:
$$ Z_I = \sum_\sigma e^{-\beta J H_I(\sigma)} = (-1)^{M^2}\,\frac{1}{2}\,(2\cosh\beta J)^{M^2}\big(-{\rm Pf}\,A_1 + {\rm Pf}\,A_2 + {\rm Pf}\,A_3 + {\rm Pf}\,A_4\big), \qquad (A1.1) $$
where $A_i$ are matrices with elements $(A_i)_{x,j;y,k}$, with $x, y \in \Lambda_M$, $j, k = 1, \dots, 6$, given by:
$$ (A_i)_{x;x} = \begin{pmatrix} 0 & 0 & -1 & 0 & 0 & 1 \\ 0 & 0 & 0 & -1 & 1 & 0 \\ 1 & 0 & 0 & 0 & 0 & -1 \\ 0 & 1 & 0 & 0 & -1 & 0 \\ 0 & -1 & 0 & 1 & 0 & 1 \\ -1 & 0 & 1 & 0 & -1 & 0 \end{pmatrix} \qquad (A1.2) $$
and $\big[(A_i)_{x;x+\hat e_1}\big]_{i,j} = t\,\delta_{i,1}\delta_{j,2}$, $\big[(A_i)_{x;x+\hat e_0}\big]_{i,j} = t\,\delta_{i,2}\delta_{j,1}$, $(A_i)_{x;x+\hat e_1} = -(A_i)^T_{x+\hat e_1;x}$, $(A_i)_{x;x+\hat e_0} = -(A_i)^T_{x+\hat e_0;x}$; moreover
$$ (A_i)_{(M,x_0);(1,x_0)} = -(A_i)^T_{(1,x_0);(M,x_0)} = (-1)^{[\frac{i-1}{2}]}\,(A_i)_{(1,x_0);(2,x_0)}, $$
$$ (A_i)_{(x,M);(x,1)} = -(A_i)^T_{(x,1);(x,M)} = (-1)^{i-1}\,(A_i)_{(x,1);(x,2)}, \qquad (A1.3) $$
where $[\frac{i-1}{2}]$ is the largest integer $\le \frac{i-1}{2}$; in all the other cases the matrices $(A_i)_{x,y}$ are identically zero.
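Since (A1.1) is written in terms of Pfaffians, a small numerical check of the basic identity $({\rm Pf}\,A)^2 = \det A$ may help fix ideas. The sketch below uses a plain recursive expansion along the first row (an illustrative choice, not the method of [MW]) and applies it to the $6\times 6$ on-site block (A1.2); the function names are hypothetical.

```python
def minor(A, removed):
    """Return A with the rows and columns whose indices are in `removed` deleted."""
    keep = [i for i in range(len(A)) if i not in removed]
    return [[A[i][j] for j in keep] for i in keep]


def pfaffian(A):
    """Pfaffian of an antisymmetric matrix by expansion along the first row."""
    n = len(A)
    if n == 0:
        return 1
    if n % 2 == 1:
        return 0
    return sum(
        (-1) ** (j + 1) * A[0][j] * pfaffian(minor(A, {0, j}))
        for j in range(1, n)
    )


def determinant(A):
    """Determinant by Laplace expansion along the first row (fine for small matrices)."""
    n = len(A)
    if n == 0:
        return 1
    total = 0
    for j in range(n):
        cols = [c for c in range(n) if c != j]
        sub = [[A[i][c] for c in cols] for i in range(1, n)]
        total += (-1) ** j * A[0][j] * determinant(sub)
    return total


# On-site 6x6 block (A_i)_{x;x} as written in (A1.2).
A_XX = [
    [0, 0, -1, 0, 0, 1],
    [0, 0, 0, -1, 1, 0],
    [1, 0, 0, 0, 0, -1],
    [0, 1, 0, 0, -1, 0],
    [0, -1, 0, 1, 0, 1],
    [-1, 0, 1, 0, -1, 0],
]

pf_block = pfaffian(A_XX)
det_block = determinant(A_XX)
```

The recursion is exponential in the matrix size, so it is only meant for checks on small blocks like this one.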
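Given a $(2n)\times(2n)$ antisymmetric matrix $A$, it is well-known that ${\rm Pf}\,A = (-1)^n\int d\psi_1\cdots d\psi_{2n}\,\exp\{\frac{1}{2}\sum_{i,j}\psi_i A_{ij}\psi_j\}$, where $\psi_1,\dots,\psi_{2n}$ are Grassmannian variables. Then, we can rewrite (A1.1) as:
$$ \frac{1}{2}\,(2\cosh\beta J)^{M^2}\sum_\gamma (-1)^{\delta_\gamma}\int\prod_{x\in\Lambda_M} d\bar H^\gamma_x\,dH^\gamma_x\,d\bar V^\gamma_x\,dV^\gamma_x\,d\bar T^\gamma_x\,dT^\gamma_x\; e^{S^\gamma(t;H,V,T)}, \qquad (A1.4) $$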
where: γ = (ε, ε ); ε, ε = ±1; δγ is defined after (2.1); H x , Hx , V x , Vx are Grassmannian variables with ε–periodic resp. ε –periodic boundary conditions in the vertical, resp. horizontal, direction, see (2.3) and following lines. Furthermore: γ γ γ γ H x Hx+eˆ + V x Vx+eˆ S γ (t; H, V , T ) = t +
1
x
γ
γ
0
γ γ γ γ γ γ γ γ γ γ γ γ γ γ V x H x + H x T x + V x H x + Hx T x + T x V x + T x V x + T x T x .
x
(A1.5) The T –fields appear only in the diagonal elements and they can be easily integrated out: γ
γ γ γ γ γ γ γ γ γ γ γ dT x dTx exp H x Tx + Hx T x + Tx V x + T x Vx + T x Tx x∈M
=
γ
γ
γ
γ
γ
γ
γ
γ
(−1 − H x Hx − V x Vx − Vx H x − Vx H x )
x∈M
= (−1)M exp
γ γ γ γ γ γ γ γ H x Hx + V x Vx + Vx H x + Hx V x ,
(A1.6)
x∈M
γ γ γ γ γ γ γ γ 2 where in the last identity we used that H x Hx + V x Vx + Vx H x + Hx V x = 0. Substituting (A1.6) into (A1.4) we find (2.1). Appendix A2. Integration of the Heavy Fermions. Symmetry Properties A2.1. Integration of the χ fields. Calling V(ψ, χ ) = Q(ψ, χ ) − νFσ (ψ) + V (ψ, χ ), we obtain ∞ (−1)n T 1 M 2 − Q(1) (ψ) − V (1) (ψ) = log P (dχ )eV (ψ,χ) = −E E (V(ψ, χ ); n), n! χ n=0
(A2.1) 1 is a constant and V (1) is at least quadratic in ψ and vanishing when λ = where E ν = 0. Q(1) is the rest (quadratic in ψ). Given s set of labels Pvi , i = 1, . . . , s and def α(f ) χ (Pvi ) = f ∈Pv χω(f ),x(f ) , the truncated expectation EχT ( χ (Pv1 ), . . . , χ (Pvs )) can be i written as
T 1 2 χ (Pv1 ), . . . , χ (Pvs )) = αT gχ (f , f ) dPT (t)Pf GT (t), (A2.2) Eχ ( T
∈T
where $T$ is a set of lines forming an anchored tree between the clusters of points $P_{v_1},\dots,P_{v_s}$, i.e. $T$ is a set of lines which becomes a tree if one identifies all the points in the same cluster; $t = \{t_{i,i'} \in [0,1],\ 1 \le i,i' \le s\}$; $dP_T(t)$ is a probability measure with support on a set of $t$ such that $t_{i,i'} = u_i\cdot u_{i'}$ for some family of vectors $u_i \in R^s$ of unit norm; $\alpha_T$ is a sign (irrelevant for the subsequent bounds); $f^1_\ell, f^2_\ell$ are the field labels associated to the points connected by $\ell$; if $a(f) = (\alpha(f),\omega(f))$, the propagator $g_\chi(f,f')$ is equal to
$$ g_\chi(f,f') \stackrel{def}{=} g^\chi_{a(f),a(f')}(x(f) - x(f')) = \langle \chi^{\alpha(f)}_{\omega(f),x(f)}\,\chi^{\alpha(f')}_{\omega(f'),x(f')}\rangle; \qquad (A2.3) $$
if $2n = \sum_{i=1}^s |P_{v_i}|$, then $G^T(t)$ is a $(2n-2s+2)\times(2n-2s+2)$ antisymmetric matrix, whose elements are given by $G^T_{f,f'} = t_{i(f),i(f')}\,g_\chi(f,f')$, where: $f, f' \in F_T$
and FT = ∪∈T {f1 , f2 }; i(f ) is s.t. f ∈ Pi(f ) ; finally Pf GT is the Pfaffian of GT . If s = 1 the sum over T is empty, but we can still use the above equation by interpreting the r.h.s. as 1 if Pv1 is empty, and detG(P1 ) otherwise. Sketch of the proof of (A2.2). Equation (A2.2) is a trivial generalization of the well– known formula expressing truncated fermionic expectations in terms of sums of determinants [Le]. The only difference here is that the propagators < χωα1 ,x1 χωα2 ,x2 > are not vanishing, so that Pfaffians appear instead of determinants. The proof can be done along the same lines of Appendix A3 of [GM]. The only difference here is that the identity known as the Berezin integral, see (A3.15) of [GM], that is the starting point to get to (A2.2), must be replaced by the (more general) identity: Eχ
s
j =1
1 χ (Pj ) = Pf G = (−1)n Dχ exp (χ , Gχ) , 2
(A2.4)
s where: the expectation Eχ is w.r.t. P (dχ ); if 2m = j =1 Pj , G is the 2m × 2m χ antisymmetric matrix with entries Gf,f = ga(f ),a(f ) (x(f ) − x(f )); and Dχ =
n
j =1 f ∈Pj
α(f )
dχx(f ),ω(f )
(χ , Gχ ) =
α(f )
f,f ∈∪i Pi
α(f )
χx(f ),ω(f ) Gf,f χx(f ),ω(f ) . (A2.5)
Starting from (A2.4), the proof in Appendix A3 of [GM] can be repeated step by step in the present case, to find finally the analogue of (A.3.55) of [GM]. Then, using again that $\int D\chi\,\exp\{(\chi,G\chi)/2\}$ is, up to a sign, the Pfaffian of $G$, we find (A2.2).
We now use the well-known property $|{\rm Pf}\,G^T| = \sqrt{|\det G^T|}$ and we can bound $\det G^T$ by the Gram–Hadamard (GH) inequality. Let $H \stackrel{def}{=} R^s\otimes H_0$, where $H_0$ is the Hilbert space of complex four-dimensional vectors $F(k) = (F_1(k),\dots,F_4(k))$, $F_i(k)$ being a function on the set $D_{-,-}$, with scalar product $\langle F,G\rangle = \sum_{i=1}^4 \frac{1}{M^2}\sum_k F_i^*(k)\,G_i(k)$. We can write the elements of $G^T$ as inner products of vectors of $H$:
$$ G_{f,f'} = t_{i(f),i(f')}\,g_\chi(f,f') = \langle u_{i(f)}\otimes A_f,\ u_{i(f')}\otimes B_{f'}\rangle, \qquad (A2.6) $$
where $u_i \in R^s$, $i = 1,\dots,s$, are vectors such that $t_{i,i'} = u_i\cdot u_{i'}$, and, if $\hat g^\chi_{a,a'}(k)$ is the Fourier transform of $g^\chi_{a,a'}(x-y)$, $A_f(k)$ and $B_f(k)$ are given by
$$ A_f(k) = e^{-ikx(f)}\big(\hat g^\chi_{a(f),(-,1)}(k),\ \hat g^\chi_{a(f),(-,-1)}(k),\ \hat g^\chi_{a(f),(+,1)}(k),\ \hat g^\chi_{a(f),(+,-1)}(k)\big), $$
$$ B_f(k) = e^{-ikx(f)}\times\begin{cases} (1,0,0,0), & \text{if } a(f) = (-,1), \\ (0,1,0,0), & \text{if } a(f) = (-,-1), \\ (0,0,1,0), & \text{if } a(f) = (+,1), \\ (0,0,0,1), & \text{if } a(f) = (+,-1). \end{cases} \qquad (A2.7) $$
With these definitions and remembering (2.17), it is now clear that $|{\rm Pf}\,G^T| \le C^{n-s+1}$, for some constant $C$. Then, applying (A2.2) and the previous bound we find the second of (2.21).
We now turn to the construction of $P_{Z_1,\sigma_1,\mu_1,C_1}$, in order to prove (2.19).
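The Gram–Hadamard step used above relies on the following fact: if a matrix has entries $G_{ij} = \langle A_i, B_j\rangle$, then $|\det G| \le \prod_i \|A_i\|\,\|B_i\|$. The sketch below checks this numerically on random real vectors; it is a toy verification under simplifying assumptions (real rather than complex vectors, arbitrary dimensions and seed, hypothetical helper names) and does not use the specific vectors $A_f$, $B_f$ of (A2.7).

```python
import math
import random


def inner(a, b):
    """Euclidean inner product of two real vectors."""
    return sum(x * y for x, y in zip(a, b))


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n):
                m[row][k] -= factor * m[col][k]
    return result


rng = random.Random(0)  # fixed seed so the check is reproducible
SIZE, DIM = 4, 6        # 4x4 Gram-type matrix built from vectors in R^6
A_vecs = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(SIZE)]
B_vecs = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(SIZE)]

gram = [[inner(a, b) for b in B_vecs] for a in A_vecs]
gh_bound = math.prod(
    math.sqrt(inner(a, a)) * math.sqrt(inner(b, b)) for a, b in zip(A_vecs, B_vecs)
)
gram_det = det(gram)
```

The point of the inequality in the text is that the bound grows only like $C^{n}$ with the size of the matrix, with no factorial from expanding the determinant.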
We define e−t1 M PZ1 ,σ1 ,µ1 ,C1 (dψ) = Pσ (dψ)e−Q (ψ) , where t1 is a normalization constant. In order to write PZ1 ,σ1 ,µ1 ,C1 (dψ) as an exponential of a quadratic form, it is sufficient to calculate the correlations def α1 α2 < ψω1 ,k ψω2 ,−α1 α2 k >1 = PZ1 ,σ1 ,µ1 ,C1 (dψ)ψωα11,k ψωα22,−α1 α2 k 2 Pσ (dψ)P (dχ )eQ(χ,ψ) ψωα11,k ψωα22,−α1 α2 k . = e−t1 M 2
(1)
(A2.8) It is easy to realize that the measure ∼ Pσ (dψ)P (dχ )eQ(χ,ψ) factorizes into the product (j ) α = (ψ (1) + of two measures generated by the fields ψω,x , j = 1, 2, defined by ψω,x ω,x √ (2) α i(−1) ψω,x )/ 2. In fact, using this change of variables, one finds that
Pσ (dψ)P (dχ )eQ(χ,ψ) = P (j ) (dψ (j ) , dχ (j ) ) j =1,2
=
j =1,2
(j ) tλ (j ),T (j ) (j ) 1 exp{− ξk Ck ξ−k } , (A2.9) 4M 2 N (j ) k
def
(j )
(j )
for two suitable matrices Ck , whose determinants B (j ) (k) = det Ck are equal to B (j ) (k) =
16 # (j ) (tλ )4
$ (j ) (j ) (j ) (j ) 2tλ [1 − (tλ )2 ](2 − cos k − cos k0 ) + (tλ − tψ )2 (tλ − tχ )2 . (A2.10) (j )
From the explicit expression of Ck one finds (j )
(j )
(j )
< ψ−k ψk >1 =
(j )
4M 2 c1,1 (k) (j ) B (j ) (k) t
,
(j ) 4M 2 c−1,−1 (k) (j ) B (j ) (k) tλ
,
λ
(j )
(j )
< ψ −k ψ k >1 =
(j )
(j )
< ψ −k ψk >1 =
4M 2 c−1,1 (k) (j ) B (j ) (k) t
,
λ
(A2.11)
where, if ω = ±1, recalling that tψ = def
(j ) (k) = cω,ω
def
(j )
√ √ 2 − 1 + ν/2 and defining tχ = − 2 − 1,
4 # (j ) 2tλ tχ (−i sin k cos k0 + ω sin k0 cos k) (j ) 2 (tλ ) $ (j ) +[(tλ )2 + tχ2 ](i sin k − ω sin k0 ) ,
4 # (j ) − tλ (3tχ + tψ ) cos k cos k0 (j ) 2 (tλ ) (j ) 2 +[(tλ ) + 2tχ tψ + tχ2 ](cos k + cos k0 ) tψ tχ2 $ (j ) − tλ (tψ + tχ ) + 2 (j ) . tλ
cω,−ω (k) = −iω
It is clear that, for any monomial F (ψ (j ) ), F (ψ (j ) ), with def
P (j ) (dψ (j ) ) =
P (dψ (j ) , dχ (j ) )F (ψ (j ) ) =
(A2.12)
P (j ) (dψ (j ) )
1
(j ) (j ) dψk dψ k Nj k
( ' (' (j ) (j ) (j ) t (j ) B (j ) (k) c−1,−1 (k) −c1,−1 (k) ψ−k (j ) (j ) λ , · exp − (ψk , ψ k ) (j ) (j ) (j ) (j ) −c−1,1 (k) c1,1 (k) ψ −k 4M 2 det ck (j )
(j )
(j )
(j )
(A2.13)
(j )
(j )
where det ck = c1,1 (k)c−1,−1 (k) − c1,−1 (k)c−1,1 (k). If we now use the identity tλ = tψ (2 + (−1)j µ)/(2 − σ ) and rewrite the measure P (1) (dψ (1) )P (2) (dψ (2) ) in terms of ± ψω,k we find: P (1) (dψ (1) )P (2) (dψ (2) ) =
1
Z1 C1 (k) +,T (1) − + − dψω,k dψω,k exp{− k Aψ k } (1) 4M 2 N k,ω
= PZ1 ,σ1 ,µ1 ,C1 (dψ) ,
(A2.14) (1)
with C1 (k), Z1 , σ1 and µ1 defined as after (2.18), and Aψ (k) as in (2.19), with + + (k) c−1,1 (k) −c−1,−1 , + + c1,−1 (k) −c1,1 (k) − − 2 (k) −c−1,−1 (k) c−1,1 (1) , N (k) = − − c1,−1 (k) −c1,1 (k) 2−σ
M (1) (k) =
def
2 2−σ
(1)
(1)
(A2.15) (2)
where cωα 1 ,ω2 (k) = [(1−µ/2)B (1) (k)cω1 ,ω2 (k)/ det ck +α(1+µ/2)B (2) (k)cω1 ,ω2 (k)/ (2)
(1)
det ck ]/2. It is easy to verify that Aψ (k) has the form (2.19). In fact, computing the functions in (A2.15), one finds that, for k, σ1 and µ1 small, 1 + σ21 (i sin k + sin k0 ) + O(k3 ) −iσ1 + O(k2 ) , σ iσ1 + O(k2 ) 1 + 21 (i sin k − sin k0 ) + O(k3 ) µ1 − 2 (i sin k + sin k0 ) + O(k3 ) iµ1 + O(µ1 k2 ) , (A2.16) N (1) (k) = µ1 2 −iµ1 + O(µ1 k ) − 2 (i sin k − sin k0 ) + O(k3 )
M (1) (k) =
where the higher order terms in k, σ1 and µ1 contribute to the corrections a1± (k), b1± (k), c1 (k) and d1 (k). They have the reality and parity properties described after (2.19) and it is apparent that a1± (k) = O(σ1 k) + O(k3 ), b1± (k) = O(µ1 k) + O(k3 ), c1 (k) = O(k2 ) and d1 (k) = O(µ1 k2 ). A2.2. Symmetry properties. In this section we identify some symmetries of model (2.7) and we prove that the quadratic and quartic terms in V (1) have the structure described in (2.22), (2.23) and (2.24). The formal action appearing in (2.7) (see also (2.2) and (2.9) for an explicit form) is invariant under the following transformations: (j )
(j )
(j )
(j )
1) Parity: Hx → H −x , H x → −H−x (the same for V and V ). In terms of the α , this transformation is equivalent to ψ ˆ α → iωψˆ α variables ψˆ ω,k ω,k ω,−k (the same for χ ) and we shall call it parity. α → ψ ˆ −α (the same for χ ) and c → c∗ , where c is 2) Complex conjugation: ψˆ ω,k −ω,k a generic constant appearing in the formal action and c∗ is its complex conjugate. Note that (2.10) is left invariant by this transformation that we shall call complex conjugation. (j ) (j ) 3) Hole-particle: Hx → (−1)j +1 Hx (the same for H , V , V ). This transformation α →ψ ˆ −α (the same for χ) and we shall call it hole-particle. is equivalent to ψˆ ω,k ω,−k (j )
(j )
(j )
(j )
(j )
(j )
(j )
4) Rotation: Hx,x0 → iV −x0 ,−x , H x,x0 → iV−x0 ,−x , Vx,x0 → iH −x0 ,−x , V x,x0 → (j )
iH−x0 ,−x . This transformation is equivalent to α α → −ωe−iωπ/4 ψˆ −ω,(−k ψˆ ω,(k,k 0) 0 ,−k)
,
α α χˆ ω,(k,k → ωe−iωπ/4 χˆ −ω,(−k , 0) 0 ,−k) (A2.17)
and we shall call it rotation. (j ) (j ) (j ) (j ) (j ) (j ) (j ) 5) Reflection: Hx,x0 → iH −x,x0 , H x,x0 → iH−x,x0 , Vx,x0 → −iV−x,x0 , V x,x0 → (j ) α α → i ψˆ −ω,(−k,k (the same iV −x,x0 . This transformation is equivalent to ψˆ ω,(k,k 0) 0) for χ ) and we shall call it reflection. (1) (2) (1) (1) (2) (1) (2) →Hx , H x ← →H x , Vx ← →Vx , V x ← → 6) The (1)←→(2) symmetry: Hx ← (2) −α α ˆ ˆ V x , u → −u. This transformation is equivalent to ψω,k → −iα ψω,−k (the same for χ ) together with u → −u and we shall call it (1)← →(2) symmetry. It is easy to verify that the quadratic forms P (dχ ), P (dψ) and PZ1 ,σ1 ,µ1 ,C1 (dψ) are separately invariant under the symmetries above. Then the effective action V (1) (ψ) is still invariant under the same symmetries. Using the invariance of V (1) under transformations (1)–(6), we now prove that the structure of its quadratic and quartic terms is the one described in Theorem 2.1, see in particular (2.22), (2.23) and (2.24). + ˆ+ Quartic term. The term ki W (k1 , k2 , k3 , k4 )ψˆ 1,k ψ ψˆ − ψˆ − δ(k1 +k2 −k3 − 1 −1,k2 −1,k3 1,k4 − ψˆ − k4 ) under complex conjugation becomes equal to ki W ∗ (k1 , k2 , k3 , k4 )ψˆ −1,k 1 1,k2 + ˆ+ δ(k3 + k4 − k1 − k2 ), so that W (k1 , k2 , k3 , k4 ) = W ∗ (k3 , k4 , k1 , k2 ). ψˆ 1,k ψ 3 −1,k4 Then, defining L1 = W (k¯ ++ , k¯ ++ , k¯ ++ , k¯ ++ ), where k¯ ++ = (π/M, π/M), and l1 = def , we see that L1 and l1 are real. From the explicit computation of P0 L1 = L1 σ1 =µ1 =0
˜ 2 + O(λ2 ). the lower order term we find l1 = λ/Z 1
Quadratic terms. We distinguish 4 cases (items (a)–(d) below). a) Let α1 = −α2 = + and ω1 = −ω2 = ω and consider the expression ω,k Wω (k; µ1 ) + ˆ− ψˆ ω,k ψ . Under parity it becomes −ω,k ˆ+ ˆ− ˆ+ ˆ− W ω,k ω (k; µ1 )(iω)ψω,−k (−iω)ψ−ω,−k = ω,k Wω (−k; µ1 )ψω,k ψ−ω,k , so that Wω (k; µ1 ) is even in k. Under complex conjugation it becomes ∗ ˆ− ∗ ˆ+ ˆ− ˆ+ ω,k Wω (k; µ1 ) ψ−ω,k ψω,k = − ω,k Wω (k; µ1 ) ψω,k ψ−ω,k , so that Wω (k; µ1 ) is purely imaginary. Under hole-particle it becomes ˆ− ˆ+ ˆ+ ˆ− ω,k Wω (k; µ1 )ψω,−k ψ−ω,−k = − ω,k W−ω (k; µ1 )ψω,k ψ−ω,k , so that Wω (k; µ1 ) is odd in ω. Under →(2) it becomes (1)← ˆ− ˆ+ ˆ+ ˆ− W ω,k ω (k; −µ1 )(−i)ψ−ω,−k (i)ψω,−k = ω,k Wω (k; −µ1 )ψω,k ψ−ω,k , so that Wω (k; µ1 ) is even in µ1 . Let us define S1 = iω/2 η,η =±1 Wω (k¯ ηη ), where k¯ ηη = (ηπ/M, η π/M), and γ n1 = P0 S1 , s1 = P1 S1 = σ1 ∂σ1 S1 σ =µ =0 + 1 1 µ1 ∂µ1 S1 σ =µ =0 . From the previous discussion we see that S1 , s1 and n1 are real and s1 is 1 1 independent of µ1 . From the computation of the lower order terms we find s1 = O(λσ1 ) and γ n1 = ν/Z1 + c1ν λ + O(λ2 ), for some constant c1ν independent of λ. Note that since Wω (k; µ1 ) is even in k (so that in particular no linear terms in k appear) in real space + ∂ψ − no terms of the form ψω,x −ω,x can appear. b) Let α1 = α2 = α and ω1 = −ω2 = ω and consider the expression ω,α,k Wωα (k; µ1 ) α ψ α ˆα ψˆ ω,k −ω,−k . We proceed as in item (a) and, by using parity, we see that Wω (k; µ1 ) is even in k and odd in ω. By using complex conjugation, we see that Wωα (k; µ1 ) = −Wω−α (k; µ1 )∗ . By using hole-particle, we see that Wωα (k; µ1 ) is even in α and Wωα (k; µ1 ) = −Wω−α (k; µ1 )∗ implies that Wωα (k; µ1 ) is purely imaginary. By using (1)← →(2) we see that Wωα (k; µ1 ) is odd in µ1 . If we define M1 = −iω/2 η,η Wωα (k¯ ηη ; µ1 ) and m1 = P1 M1 , from the previous properties it follows that M1 and m1 are real, m1 is independent of σ1 and, from the computation of its lower order, m1 = O(λµ1 ). 
Note that since Wωα (k; µ1 ) is even in k (so that in particular no linear terms in k appear) in real space no terms of the form α ∂ψ α ψω,x −ω,x can appear. + c) Let α1 = −α2 = +, ω1 = ω2 = ω and consider the expression ω,k Wω (k; µ1 )ψˆ ω,k − . By using parity we see that Wω (k; µ1 ) is odd in k. ψˆ ω,k By using reflection we see that Wω (k, k0 ; µ1 ) = W−ω (k, −k0 ; µ1 ). By using complex conjugation we see that Wω (k, k0 ; µ1 ) = Wω∗ (−k, k0 ; µ1 ). By using rotation we find Wω (k, k0 ; µ1 ) = −iωWω (k0 , −k; µ1 ). By using (1)← →(2) we see that Wω (k; −µ1 ) is even in µ1 . If we define sin k0 1 sin k Wω (k¯ ηη ; µ1 )(η + η ) G1 (k) = 4 sin π/M sin π/M η,η
= aω sin k + bω sin k0 ,
(A2.18)
it can be easily verified that the previous properties imply that def
aω = a−ω = −aω∗ = iωbω = ia
,
def
bω = −b−ω = bω∗ = −iωaω = ωb = −iωia (A2.19)
with a = b real and independent of ω. As a consequence, G1 (k) = G1 (i sin k +ω sin k0 ) def
for some real constant G1 . If z1 = P0 G1 and we compute the lowest order contribution to z1 , we find z1 = O(λ2 ). α d) Let α1 = α2 = α, ω1 = ω2 = ω and consider the expression α,ω,k Wωα (k; µ1 )ψˆ ω,k α α ψˆ ω,−k . Repeating the proof in item (c) we see that Wω (k; µ1 ) is odd in k and in µ1 k0 k and, if we define F1 (k) = 41 η,η Wωα (k¯ ηη ; µ1 )(η sinsinπ/M + η sinsinπ/M ), we can rewrite α F1 (k) = F1 (i sin k + ω sin k0 ). Since Wω (k; µ1 ) is odd in µ1 , we find F1 = O(λµ1 ). Note that with the definition of L introduced in §3.2, the result of the previous discussion is the following: (≤1)
LV (1) (ψ) = (s1 + γ n1 )Fσ(≤1) + m1 Fµ(≤1) + l1 Fλ
(≤1)
+ z 1 Fζ
,
(A2.20)
where s1 , n1 , m1 , l1 and z1 are real constants and: s1 is linear in σ1 and independent of µ1 ; m1 is linear in µ1 and independent of σ1 ; n1 , l1 , z1 are independent of σ1 , µ1 ; (≤1) (≤1) (≤1) (≤1) are defined by (3.8) with h = 1. moreover Fσ , Fµ , Fλ , Fζ Proof of Lemma 3.1. The symmetries (1)–(6) discussed above are preserved by the iterative integration procedure. In fact it is easy to verify that LV (h) , RV (h) and PZh−1 ,σh−1 ,µh−1 ,fh (dψ (h) ) are, step by step, separately invariant under the transformations (1)–(6). Then Lemma 3.1 can be proven exactly in the same way (A2.20) was proven above. Proof of Lemma 3.2. It is sufficient to note that the symmetry properties discussed above imply that L1 W2,α,ω = 0 if ω1 + ω2 = 0; L0 W2,α,ω = 0 if ω1 + ω2 = 0; P0 W2,α,ω = 0 if α1 + α2 = 0; and use the definitions of Ri , Si , i = 1, 2. Appendix A3. Proof of Lemma 3.3 (j,h)
(h)
The propagators ga,a (x) can be written in terms of the propagators gω,ω (x), j = 1, 2, (j,h)
see (3.16) and the following lines; gω,ω (x) are given by (j,h) gω,ω (x − y)
=
−(j ) −i sin k + ω sin k0 + ah−1 (k) 2 −ik(x−y) , e (k) f h 2 (j ) (j ) M2 sin2 k + sin2 k0 + mh−1 (k) + δBh−1 (k) k
(j,h)
gω,−ω (x − y) (j ) −iωmh−1 (k) 2 −ik(x−y) = 2 e fh (k) , (A3.1) (j ) 2 (j ) M sin2 k + sin2 k0 + mh−1 (k) + δBh−1 (k) k
where ω(j )
def
ch−1 (k) = ch−1 (k) + (−1)j dh−1 (k),
(j )
def
(j )
ω ω (k) + (−1)j bh−1 (k) , ah−1 (k) = −ah−1
(j )
def
def
(j )
mh−1 (k) = σh−1 (k) + (−1)j µh−1 (k) , mh−1 (k) = mh−1 (k) + c(j ) (k), def ω(j ) ω(j ) −ω(j ) (j ) (A3.2) ah−1 (k)(i sin k − ω sin k0 ) + ah−1 (k)ah−1 (k)/2 . δBh−1 (k) = ω
Fig. 4. A tree with its scale labels
In order to bound the propagators defined above, we need estimates on $\sigma_h(k)$, $\mu_h(k)$ and on the "corrections" $a^\omega_{h-1}(k)$, $b^\omega_{h-1}(k)$, $c_{h-1}(k)$, $d_{h-1}(k)$. As regards $\sigma_h(k)$ and $\mu_h(k)$, it is proved in [BM] (see Proof of Lemma 2.6) that, on the support of $f_h(k)$, for some $c$, $c^{-1}|\sigma_h| \le |\sigma_{h-1}(k)| \le c|\sigma_h|$ and $c^{-1}|\mu_h| \le |\mu_{h-1}(k)| \le c|\mu_h|$. Note also that, if $h \ge \bar h$, using the first two of (3.18), we have $\frac{|\sigma_h| + |\mu_h|}{\gamma^h} \le 2C_1$. As regards the corrections, using their iterative definition (3.11), the asymptotic estimates near $k = 0$ of the corrections on scale $h = 1$ (see lines after (2.19)) and the hypothesis (3.18), we easily find that, on the support of $f_h(k)$:
$$ a^\omega_{h-1}(k) = O(\sigma_h\gamma^{(1-2c|\lambda|)h}) + O(\gamma^{(3-c|\lambda|^2)h}), \qquad b^\omega_h(k) = O(\mu_h\gamma^{(1-2c|\lambda|)h}) + O(\gamma^{(3-c|\lambda|^2)h}), $$
$$ c_h(k) = O(\gamma^{(2-c|\lambda|^2)h}), \qquad d_h(k) = O(\mu_h\gamma^{(2-2c|\lambda|)h}). \qquad (A3.3) $$
The bounds on the propagators follow from the remark that, as a consequence of the estimates discussed above, the denominators in (A3.1) are $O(\gamma^{2h})$ on the support of $f_h$.

Appendix A4. Analyticity of the Effective Potentials

It is possible to write $V^{(h)}$ (3.3) in terms of Gallavotti–Nicolò trees; see Fig. 4. We need some definitions and notations.

1) Let us consider the family of all trees which can be constructed by joining a point $r$, the root, with an ordered set of $n \ge 1$ points, the endpoints of the unlabeled tree, so that $r$ is not a branching point. $n$ will be called the order of the unlabeled tree and the branching points will be called the non-trivial vertices. Two unlabeled trees are identified if they can be superposed by a suitable continuous deformation, so that the endpoints with the same index coincide. Then the number of unlabeled trees with $n$ endpoints is bounded by $4^n$.
2) We associate a label $h \le 0$ with the root and we denote $T_{h,n}$ the corresponding set of labeled trees with $n$ endpoints. Moreover, we introduce a family of vertical lines, labeled by an integer taking values in $[h, 2]$, and we represent any tree $\tau \in T_{h,n}$ so that, if $v$ is an endpoint or a non-trivial vertex, it is contained in a vertical line with
index $h_v > h$, to be called the scale of $v$, while the root is on the line with index $h$. There is the constraint that, if $v$ is an endpoint, $h_v > h + 1$; if there is only one endpoint its scale must be equal to $h + 2$, for $h \le 0$. Moreover, there is only one vertex immediately following the root, which will be denoted $v_0$ and cannot be an endpoint; its scale is $h + 1$.
3) With each endpoint $v$ of scale $h_v = +2$ we associate one of the contributions to $V^{(1)}$ given by (2.21); with each endpoint $v$ of scale $h_v \le 1$ one of the terms in $LV^{(h_v-1)}$ defined in (3.7). Moreover, we impose the constraint that, if $v$ is an endpoint and $h_v \le 1$, $h_v = h_{v'} + 1$, if $v'$ is the non-trivial vertex immediately preceding $v$.
4) We introduce a field label $f$ to distinguish the field variables appearing in the terms associated with the endpoints as in item 3); the set of field labels associated with the endpoint $v$ will be called $I_v$. Analogously, if $v$ is not an endpoint, we shall call $I_v$ the set of field labels associated with the endpoints following the vertex $v$; $x(f)$, $\sigma(f)$ and $\omega(f)$ will denote the space-time point, the $\sigma$ index and the $\omega$ index, respectively, of the field variable with label $f$.
5) We associate with any vertex $v$ of the tree a subset $P_v$ of $I_v$, the external fields of $v$. These subsets must satisfy various constraints. First of all, if $v$ is not an endpoint and $v_1,\dots,v_{s_v}$ are the $s_v$ vertices immediately following it, then $P_v \subset \cup_i P_{v_i}$; if $v$ is an endpoint, $P_v = I_v$. We shall denote $Q_{v_i}$ the intersection of $P_v$ and $P_{v_i}$; this definition implies that $P_v = \cup_i Q_{v_i}$. The subsets $P_{v_i}\setminus Q_{v_i}$, whose union will be made, by definition, of the internal fields of $v$, have to be non-empty, if $s_v > 1$, that is if $v$ is a non-trivial vertex. Given $\tau \in T_{j,n}$, there are many possible choices of the subsets $P_v$, $v \in \tau$, compatible with the previous constraints; let us call $\mathbf{P}$ one of these choices.
Given P, we consider the family GP of all connected Feynman graphs, such that, for any v ∈ τ , the internal fields of v are paired by propagators of scale hv , so that the following condition is satisfied: for any v ∈ τ , the subgraph built by the propagators associated with all vertices v ≥ v is connected. The sets Pv have, in this picture, the role of the external legs of the subgraph associated with v. The graphs belonging to GP will be called compatible with P and we shall denote Pτ the family of all choices of P such that GP is not empty. 6) We associate with any vertex v an index ρv ∈ {s, p} and correspondingly an operator Rρv , where Rs or Rp are defined as S2 def R1 S1 Rs = S1 1
if n = 1 and ω1 + ω2 = 0, if n = 1 and ω1 + ω2 = 0, if n = 2, if n > 2;
(A4.1)
and R (P + P1 ) 2 0 R2 P0 def Rp = 0 R1 P0 0
if n = 1 and ω1 + ω2 = 0, if n = 1, ω1 + ω2 = 0 and α1 + α2 = 0, if n = 1, ω1 + ω2 = 0 and α1 + α2 = 0, if n = 2, if n > 2.
Note that Rs + Rp = R, see Lemma 3.2.
(A4.2)
The effective potential can be written in the following way: V
(h)
( Zh ψ
(≤h)
) + M E˜ h+1 = 2
∞
V (h) (τ, Zh ψ (≤h) ),
(A4.3)
n=1 τ ∈Th,n
where, if√v0 is the first vertex of τ and τ1 , . . . , τs are the subtrees of τ with root v0 , V (h) (τ, Zh ψ (≤h) ) is defined inductively by the relation V (h) (τ, Zh ψ (≤h) ) (−1)s+1 T = Eh+1 [V¯ (h+1) (τ1 , Zh ψ (≤h+1) ); . . . ; V¯ (h+1) (τs , Zh ψ (≤h+1) )] , s! (A4.4) √ and V¯ (h+1) (τi , Zh ψ (≤h+1) ): √ (h+1) (τi , Zh ψ (≤h+1) ) if the subtree τi with first vertex vi is not a) is equal to Rρvi V (h) ); trivial (see (3.12) for the definition of V (h+1) , see (3.12), b) if τi is trivial and h ≤ −1, it is equal to one of the terms √ in LV (1) ( Z1 ψ ≤1 ). or, if h = 0, to one of the terms contributing to V
A4.1. The explicit expression for the kernels of $V^{(h)}$ can be found from (A4.3) and (A4.4) by writing the truncated expectations of monomials of $\psi$ fields using the analogue of (A2.2): if $\tilde{\psi}(P_{v_i}) = \prod_{f \in P_{v_i}} \psi^{\alpha(f)(h_v)}_{x(f),\omega(f)}$, the following identity holds:
$$E^T_{h_v}\big(\tilde{\psi}(P_{v_1}), \ldots, \tilde{\psi}(P_{v_s})\big) = \Big(\frac{1}{Z_{h_v-1}}\Big)^{n} \sum_{T_v \in \mathbf{T}_v} \alpha_{T_v} \prod_{l \in T_v} g^{(h_v)}(f^1_l, f^2_l) \int dP_{T_v}(\mathbf{t})\, \mathrm{Pf}\, G^{T_v}(\mathbf{t}), \tag{A4.5}$$
where $g^{(h)}(f, f') = g_{a(f),a(f')}(x(f) - x(f'))$ and the other symbols in (A4.5) have the same meaning as those in (A2.2). Using iteratively (A4.5) we can express the kernels of $V^{(h)}$ as sums of products of propagators of the fields (the ones associated to the anchored trees $T_v$) and Pfaffians of matrices $G^{T_v}$.
A4.2. If the $R$ operator were not applied to the vertices $v \in \tau$ then the result of the iteration would lead to the following relation:
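$$V^*_h(\tau, \sqrt{Z_h}\,\psi^{(\le h)}) = \sqrt{Z_h}^{\,|P_{v_0}|} \sum_{\mathbf{P} \in \mathcal{P}_\tau} \sum_{T \in \mathbf{T}} \int d\mathbf{x}_{v_0}\, W^*_{\tau,\mathbf{P},T}(\mathbf{x}_{v_0}) \prod_{f \in P_{v_0}} \psi^{\alpha(f)(\le h)}_{x(f),\omega(f)}, \tag{A4.6}$$
where $\mathbf{x}_{v_0}$ is the set of integration variables associated to $\tau$ and $T = \cup_v T_v$; $W^*_{\tau,\mathbf{P},T}$ is given by
$$W^*_{\tau,\mathbf{P},T}(\mathbf{x}_{v_0}) = \Big[\prod_{v\ \mathrm{not\ e.p.}} \Big(\frac{Z_{h_v}}{Z_{h_v-1}}\Big)^{\frac{|P_v|}{2}}\Big] \Big[\prod_{i=1}^{n} K^{h_i}_{v_i^*}(\mathbf{x}_{v_i^*})\Big] \prod_{v\ \mathrm{not\ e.p.}} \frac{1}{s_v!} \int dP_{T_v}(\mathbf{t}_v) \cdot \mathrm{Pf}\, G^{h_v,T_v}(\mathbf{t}_v) \Big[\prod_{l \in T_v} g^{(h_v)}(f^1_l, f^2_l)\Big], \tag{A4.7}$$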
Anomalous Universality in the Anisotropic Ashkin–Teller Model
where: e.p. is an abbreviation of “end points”; $v_1^*, \ldots, v_n^*$ are the endpoints of $\tau$, $h_i = h_{v_i^*}$ and $K^{h_v}_v(\mathbf{x}_v)$ are the corresponding kernels (equal to $\lambda_{h_v-1}\delta(\mathbf{x}_v)$ or $\nu_{h_v-1}\delta(\mathbf{x}_v)$ if $v$ is an endpoint of type $\lambda$ or $\nu$ on scale $h_v \le 1$; or equal to one of the kernels of $V^{(1)}$ if $h_v = 2$). We can bound (A4.7) using (3.20) and the Gram–Hadamard inequality, see Appendix A2; we would find:
$$\int d\mathbf{x}_{v_0}\, |W^*_{\tau,\mathbf{P},T}(\mathbf{x}_{v_0})| \le C^n M^2 |\lambda|^n \gamma^{-h(-2+|P_{v_0}|/2)} \prod_{v\ \mathrm{not\ e.p.}} \Big\{ \frac{1}{s_v!} \Big(\frac{Z_{h_v}}{Z_{h_v-1}}\Big)^{\frac{|P_v|}{2}} \gamma^{-[-2+\frac{|P_v|}{2}]} \Big\}. \tag{A4.8}$$
We call $D_v = -2 + \frac{|P_v|}{2}$ the dimension of $v$, depending on the number of the external fields of $v$. If $D_v > 0$ for any $v$ one can sum over $\tau$, $\mathbf{P}$, $T$ obtaining convergence for $\lambda$ small enough; however $D_v \le 0$ when there are two or four external lines. We will take now into account the effect of the $R$ operator and we will see how the bound (A4.21) is improved.
A4.3. The effect of application of $P_j$ and $S_j$ is to replace a kernel $W^{(h)}_{2n,\sigma,j,\alpha,\omega}$ with $P_j W^{(h)}_{2n,\sigma,j,\alpha,\omega}$ and $S_j W^{(h)}_{2n,\sigma,j,\alpha,\omega}$. If inductively, starting from the end-points, we write the kernels $W^{(h)}_{2n,\sigma,j,\alpha,\omega}$ in a form similar to (A4.7), we easily realize that, eventually, $P_j$ or $S_j$ will act on some propagator of an anchored tree or on some Pfaffian $\mathrm{Pf}\, G^{T_v}$, for some $v$. It is easy to realize that $P_j$ and $S_j$, when applied to Pfaffians, do not break the Pfaffian structure. In fact the effect of $P_j$ on the Pfaffian of an antisymmetric matrix $G$ with elements $G_{f,f'}$, $f, f' \in J$, $|J| = 2k$, is the following (the proof is trivial):
$$P_0\, \mathrm{Pf}\, G = \mathrm{Pf}\, G^0, \qquad P_1\, \mathrm{Pf}\, G = \frac{1}{2} \sum_{f_1, f_2 \in J} P_1 G_{f_1,f_2}\, (-1)^\pi\, \mathrm{Pf}\, G^0_1, \tag{A4.9}$$
where $G^0$ is the matrix with elements $P_0 G_{f,f'}$, $f, f' \in J$; $G^0_1$ is the matrix with elements $P_0 G_{f,f'}$, $f, f' \in J_1 \stackrel{def}{=} J \setminus \{f_1 \cup f_2\}$ and $(-1)^\pi$ is the sign of the permutation leading from the ordering $J$ of the labels $f$ in the l.h.s. to the ordering $f_1, f_2, J_1$ in the r.h.s. The effect of $S_j$ is the following, see Appendix A7 for a proof:
$$S_1\, \mathrm{Pf}\, G = \frac{1}{2 \cdot k!} \sum_{f_1, f_2 \in J} S_1 G_{f_1,f_2} \sum^{*}_{J_1 \cup J_2 = J \setminus \cup_i f_i} (-1)^\pi\, k_1!\, k_2!\, \mathrm{Pf}\, G^0_1\, \mathrm{Pf}\, G_2, \tag{A4.10}$$
where the $*$ on the sum means that $J_1 \cap J_2 = \emptyset$; $|J_i| = 2k_i$, $i = 1, 2$; $(-1)^\pi$ is the sign of the permutation leading from the ordering $J$ of the field labels on the l.h.s. to the ordering $f_1, f_2, J_1, J_2$ on the r.h.s.; $G^0_1$ is the matrix with elements $P_0 G_{f,f'}$, $f, f' \in J_1$; $G_2$ is the matrix with elements $G_{f,f'}$, $f, f' \in J_2$. The effect of $S_2$ on $\mathrm{Pf}\, G^T$ is given by a formula similar to (A4.10). Note that the number of terms in the sums appearing in (A4.9), (A4.10) (and in the analogous equation for $S_2\, \mathrm{Pf}\, G^T$) is bounded by $c^k$ for some constant $c$.
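A4.4. It is possible to show that the $R_j$ operators produce derivatives applied to the propagators of the anchored trees and on the elements of $G^{T_v}$; and a product of “zeros” of the form $d^b_j(x(f^1_l) - x(f^2_l))$, $j = 0, 1$, $b = 0, 1, 2$, associated to the lines $l \in T_v$. This is a well known result, and a very detailed discussion can be found in §3 of [BM]. By such analysis, and using (A4.9), (A4.10), we get the following expression for $RV^{(h)}(\tau, \sqrt{Z_h}\,\psi^{(\le h)})$:
$$RV^{(h)}(\tau, \sqrt{Z_h}\,\psi^{(\le h)}) = \sqrt{Z_h}^{\,|P_{v_0}|} \sum_{\mathbf{P} \in \mathcal{P}_\tau} \sum_{T \in \mathbf{T}} \sum_{\beta \in B_T} \int d\mathbf{x}_{v_0}\, W_{\tau,\mathbf{P},T,\beta}(\mathbf{x}_{v_0}) \prod_{f \in P_{v_0}} \hat{\partial}^{q_\beta(f)}_{j_\beta(f)} \psi^{\alpha(f)(\le h)}_{x_\beta(f),\omega(f)}, \tag{A4.11}$$
where $B_T$ is a set of indices which allows to distinguish the different terms produced by the non-trivial $R$ operations; $x_\beta(f)$ is a coordinate obtained by interpolating two points in $\mathbf{x}_{v_0}$, in a suitable way depending on $\beta$; $q_\beta(f)$ is a nonnegative integer $\le 2$; $j_\beta(f) = 0, 1$ and $\hat{\partial}^q_j$ is a suitable differential operator, dimensionally equivalent to $\partial^q_j$ (see [BM] for a precise definition); $W_{\tau,\mathbf{P},T,\beta}$ is given by:
$$W_{\tau,\mathbf{P},T,\beta}(\mathbf{x}_{v_0}) = \Big[\prod_{v\ \mathrm{not\ e.p.}} \Big(\frac{Z_{h_v}}{Z_{h_v-1}}\Big)^{\frac{|P_v|}{2}}\Big] \Big[\prod_{i=1}^{n} d^{\,b_\beta(v_i^*)}_{j_\beta(v_i^*)}(x^i_\beta, y^i_\beta)\, P^{C_\beta(v_i^*)}_{I_\beta(v_i^*)} S^{c_\beta(v_i^*)}_{i_\beta(v_i^*)} K^{h_i}_{v_i^*}(\mathbf{x}_{v_i^*})\Big] \cdot \Big\{\prod_{v\ \mathrm{not\ e.p.}} \frac{1}{s_v!} \int dP_{T_v}(\mathbf{t}_v)\, P^{C_\beta(v)}_{I_\beta(v)} S^{c_\beta(v)}_{i_\beta(v)}\, \mathrm{Pf}\, G^{h_v,T_v}_\beta(\mathbf{t}_v) \cdot \prod_{l \in T_v} \hat{\partial}^{q_\beta(f^1_l)}_{j_\beta(f^1_l)} \hat{\partial}^{q_\beta(f^2_l)}_{j_\beta(f^2_l)} \big[ d^{\,b_\beta(l)}_{j_\beta(l)}(x_l, y_l)\, P^{C_\beta(l)}_{I_\beta(l)} S^{c_\beta(l)}_{i_\beta(l)}\, g^{(h_v)}(f^1_l, f^2_l) \big] \Big\}, \tag{A4.12}$$
where $v_1^*, \ldots, v_n^*$ are the endpoints of $\tau$; $b_\beta(v)$, $b_\beta(l)$, $q_\beta(f^1_l)$ and $q_\beta(f^2_l)$ are nonnegative integers $\le 2$; $j_\beta(v)$, $j_\beta(f^1_l)$, $j_\beta(f^2_l)$ and $j_\beta(l)$ can be 0 or 1; $i_\beta(v)$ and $i_\beta(l)$ can be 1 or 2; $I_\beta(v)$ and $I_\beta(l)$ can be 0 or 1; $C_\beta(v)$, $c_\beta(v)$, $C_\beta(l)$ and $c_\beta(l)$ can be 0, 1 and $\max\{C_\beta(v) + c_\beta(v),\ C_\beta(l) + c_\beta(l)\} \le 1$; $G^{h_v,T_v}_\beta(\mathbf{t}_v)$ is obtained from $G^{h_v,T_v}(\mathbf{t}_v)$ by substituting the element $t_{i(f),i(f')}\, g^{(h_v)}(f, f')$ with $t_{i(f),i(f')}\, \hat{\partial}^{q_\beta(f)}_{j_\beta(f)} \hat{\partial}^{q_\beta(f')}_{j_\beta(f')}\, g^{(h_v)}(f, f')$. It would be very difficult to give a precise description of the various contributions of the sum over $B_T$, but fortunately we only need to know some very general properties, which easily follow from the construction in §3.
1) There is a constant $C$ such that, $\forall T \in \mathbf{T}_\tau$, $|B_T| \le C^n$; for any $\beta \in B_T$, the following inequality is satisfied:
$$\Big[\prod_{f \in \cup_v P_v} \gamma^{h(f) q_\beta(f)}\Big] \Big[\prod_{l \in T} \gamma^{-h(l) b_\beta(l)}\Big] \le \prod_{v\ \mathrm{not\ e.p.}} \gamma^{-z(P_v)}, \tag{A4.13}$$
where $h(f) = h_{v_0} - 1$ if $f \in P_{v_0}$, otherwise it is the scale of the vertex where the field with label $f$ is contracted; $h(l) = h_v$, if $l \in T_v$, and
$$z(P_v) = \begin{cases} 1 & \text{if } |P_v| = 4 \text{ and } \rho_v = p, \\ 2 & \text{if } |P_v| = 2 \text{ and } \rho_v = p, \\ 1 & \text{if } |P_v| = 2,\ \rho_v = s \text{ and } \sum_{f \in P_v} \omega(f) \neq 0, \\ 0 & \text{otherwise.} \end{cases} \tag{A4.14}$$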
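2) If we define
$$\prod_{v \in \tau} \Big[ \Big(\frac{|\sigma_{h_v}| + |\mu_{h_v}|}{\gamma^{h_v}}\Big)^{c_\beta(v) i_\beta(v)} \prod_{l \in T_v} \Big(\frac{|\sigma_{h_v}| + |\mu_{h_v}|}{\gamma^{h_v}}\Big)^{c_\beta(l) i_\beta(l)} \Big] \stackrel{def}{=} \prod_{v \in V_\beta} \Big(\frac{|\sigma_{h_v}| + |\mu_{h_v}|}{\gamma^{h_v}}\Big)^{i(v,\beta)}, \tag{A4.15}$$
the indices $i(v, \beta)$ satisfy, for any $B_T$, the following property:
$$\sum_{w \ge v} i(w, \beta) \ge z'(P_v), \tag{A4.16}$$
where
$$z'(P_v) = \begin{cases} 1 & \text{if } |P_v| = 4 \text{ and } \rho_v = s, \\ 2 & \text{if } |P_v| = 2,\ \rho_v = s \text{ and } \sum_{f \in P_v} \omega(f) = 0, \\ 1 & \text{if } |P_v| = 2,\ \rho_v = s \text{ and } \sum_{f \in P_v} \omega(f) \neq 0, \\ 0 & \text{otherwise.} \end{cases} \tag{A4.17}$$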
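A4.5. We can bound any $|P^{C_\beta(v)}_{I_\beta(v)} S^{c_\beta(v)}_{i_\beta(v)}\, \mathrm{Pf}\, G^{h_v,T_v}_\beta|$ in (A4.12), with $C_\beta(v) + c_\beta(v) = 0, 1$, by using (A4.9), (A4.10) and the Gram inequality, as illustrated in Appendix A2 for the case of the integration of the $\chi$ fields. Using that the elements of $G$ are all propagators on scale $h_v$, dimensionally bounded as in Lemma 3.3, we find:
$$\big|P^{C_\beta(v)}_{I_\beta(v)} S^{c_\beta(v)}_{i_\beta(v)}\, \mathrm{Pf}\, G^{h_v,T_v}_\beta\big| \le C^{\sum_{i=1}^{s_v} |P_{v_i}| - |P_v| - 2(s_v-1)} \cdot \gamma^{\frac{h_v}{2}\left(\sum_{i=1}^{s_v} |P_{v_i}| - |P_v| - 2(s_v-1)\right)} \Big[\prod_{f \in J_v} \gamma^{h_v q_\beta(f)}\Big] \Big(\frac{|\sigma_{h_v}| + |\mu_{h_v}|}{\gamma^{h_v}}\Big)^{c_\beta(v) i_\beta(v) + C_\beta(v) I_\beta(v)}, \tag{A4.18}$$
where $J_v = \cup_{i=1}^{s_v} P_{v_i} \setminus Q_{v_i}$. We will bound the factors $\big(\frac{|\sigma_{h_v}| + |\mu_{h_v}|}{\gamma^{h_v}}\big)^{C_\beta(v) I_\beta(v)}$ using (3.19) by a constant. If we call
$$J_{\tau,\mathbf{P},T,\beta} = \int d\mathbf{x}_{v_0} \Big| \Big[\prod_{i=1}^{n} d^{\,b_\beta(v_i^*)}_{j_\beta(v_i^*)}(x^i_\beta, y^i_\beta)\, P^{C_\beta(v_i^*)}_{I_\beta(v_i^*)} S^{c_\beta(v_i^*)}_{i_\beta(v_i^*)} K^{h_i}_{v_i^*}(\mathbf{x}_{v_i^*})\Big] \cdot \Big\{\prod_{v\ \mathrm{not\ e.p.}} \frac{1}{s_v!} \prod_{l \in T_v} \hat{\partial}^{q_\beta(f^1_l)}_{j_\beta(f^1_l)} \hat{\partial}^{q_\beta(f^2_l)}_{j_\beta(f^2_l)} \big[ d^{\,b_\beta(l)}_{j_\beta(l)}(x_l, y_l)\, P^{C_\beta(l)}_{I_\beta(l)} S^{c_\beta(l)}_{i_\beta(l)}\, g^{(h_v)}(f^1_l, f^2_l) \big] \Big\} \Big|, \tag{A4.19}$$
we have, under the hypothesis (3.24),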
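$$J_{\tau,\mathbf{P},T,\beta} \le C^n M^2 |\lambda|^n \Big[\prod_{i=1}^{n} \Big(\frac{|\sigma_{h_i^*}| + |\mu_{h_i^*}|}{\gamma^{h_i^*}}\Big)^{c_\beta(v_i^*) i_\beta(v_i^*)}\Big] \cdot \prod_{v\ \mathrm{not\ e.p.}} \Big\{ \frac{1}{s_v!}\, C^{2(s_v-1)}\, \gamma^{h_v n_\nu(v)}\, \gamma^{-h_v \sum_{l \in T_v} b_\beta(l)}\, \gamma^{-h_v \sum_{i=1}^{n} b_\beta(v_i^*)}\, \gamma^{-h_v(s_v-1)} \cdot \gamma^{h_v \sum_{l \in T_v} [q_\beta(f^1_l) + q_\beta(f^2_l)]} \prod_{l \in T_v} \Big(\frac{|\sigma_{h_v}| + |\mu_{h_v}|}{\gamma^{h_v}}\Big)^{c_\beta(l) i_\beta(l)} \Big\}, \tag{A4.20}$$
where $n_\nu(v)$ is the number of vertices of type $\nu$ with scale $h_v + 1$. Now, substituting (A4.18), (A4.20) into (A4.12), using (A4.13), we find that: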
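$$\int d\mathbf{x}_{v_0}\, |W_{\tau,\mathbf{P},T,\beta}(\mathbf{x}_{v_0})| \le C^n M^2 |\lambda|^n \gamma^{-h D_k(|P_{v_0}|)} \Big[\prod_{v \in V_\beta} \Big(\frac{|\sigma_{h_v}| + |\mu_{h_v}|}{\gamma^{h_v}}\Big)^{i(v,\beta)}\Big] \cdot \prod_{v\ \mathrm{not\ e.p.}} \Big\{ \frac{1}{s_v!}\, C^{\sum_{i=1}^{s_v} |P_{v_i}| - |P_v|} \Big(\frac{Z_{h_v}}{Z_{h_v-1}}\Big)^{\frac{|P_v|}{2}} \gamma^{-[-2+\frac{|P_v|}{2}+z(P_v)]} \Big\}, \tag{A4.21}$$
where, if $k = \sum_{f \in P_{v_0}} q_\beta(f)$, $D_k(p) = -2 + p + k$ and we have used (A4.15). Note that given $v \in \tau$ and $\tau \in \mathcal{T}_{h,n}$ and using (3.19) together with the first two of (3.18),
$$\frac{|\sigma_{h_v}|}{\gamma^{h_v}} = \frac{|\sigma_h|}{\gamma^h} \frac{|\sigma_{h_v}|}{|\sigma_h|} \gamma^{h-h_v} \le \frac{|\sigma_h|}{\gamma^h} \gamma^{(h-h_v)(1-c|\lambda|)} \le C_1 \gamma^{(h-h_v)(1-c|\lambda|)},$$
$$\frac{|\mu_{h_v}|}{\gamma^{h_v}} = \frac{|\mu_h|}{\gamma^h} \frac{|\mu_{h_v}|}{|\mu_h|} \gamma^{h-h_v} \le \frac{|\mu_h|}{\gamma^h} \gamma^{(h-h_v)(1-c|\lambda|)} \le C_1 \gamma^{(h-h_v)(1-c|\lambda|)}. \tag{A4.22}$$
Moreover the indices $i(v, \beta)$ satisfy, for any $B_T$, (A4.17) so that, using (A4.22) and (A4.16), we find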
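$$\prod_{v \in V_\beta} \Big(\frac{|\sigma_{h_v}| + |\mu_{h_v}|}{\gamma^{h_v}}\Big)^{i(v,\beta)} \le C_1^n \prod_{v\ \mathrm{not\ e.p.}} \gamma^{-(1-c|\lambda|)\, z'(P_v)}. \tag{A4.23}$$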
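Substituting (A4.22) into (A4.21) and using (A4.16), we find:
$$\int d\mathbf{x}_{v_0}\, |W_{\tau,\mathbf{P},T,\beta}(\mathbf{x}_{v_0})| \le C^n M^2 |\lambda|^n \gamma^{-h D_k(|P_{v_0}|)} \cdot \prod_{v\ \mathrm{not\ e.p.}} \Big\{ \frac{1}{s_v!}\, C^{\sum_{i=1}^{s_v} |P_{v_i}| - |P_v|} \Big(\frac{Z_{h_v}}{Z_{h_v-1}}\Big)^{\frac{|P_v|}{2}} \gamma^{-[-2+\frac{|P_v|}{2}+z(P_v)+(1-c|\lambda|) z'(P_v)]} \Big\}, \tag{A4.24}$$
where
$$D_v \stackrel{def}{=} -2 + \frac{|P_v|}{2} + z(P_v) + (1 - c|\lambda|)\, z'(P_v) \ge \frac{|P_v|}{6}. \tag{A4.25}$$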
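The inequality (A4.25) can be checked case by case. The Python sketch below is only an illustration: it assumes the case tables (A4.14) and (A4.17) as read here, with the third case of each table taken to be $\sum_f \omega(f) \neq 0$, and it assumes that vertices with more than four external fields carry $\rho_v = s$ only (since $R_p = 0$ for $n > 2$). The function names and the sample value $c|\lambda| = 0.1$ are hypothetical choices, not taken from the paper.

```python
def naive_dimension(n_ext):
    """Dimension -2 + |P_v|/2 before the R operation is taken into account."""
    return -2 + n_ext / 2


def z(n_ext, rho, omega_sum_zero):
    """Gain z(P_v), following our reading of the case table (A4.14)."""
    if rho == "p":
        return {4: 1, 2: 2}.get(n_ext, 0)
    if rho == "s" and n_ext == 2 and not omega_sum_zero:
        return 1
    return 0


def z_prime(n_ext, rho, omega_sum_zero):
    """Gain z'(P_v), following our reading of the case table (A4.17)."""
    if rho != "s":
        return 0
    if n_ext == 4:
        return 1
    if n_ext == 2:
        return 2 if omega_sum_zero else 1
    return 0


def improved_dimension(n_ext, rho, omega_sum_zero, c_lambda):
    """D_v of (A4.25); c_lambda stands for the small quantity c|lambda|."""
    return (naive_dimension(n_ext)
            + z(n_ext, rho, omega_sum_zero)
            + (1 - c_lambda) * z_prime(n_ext, rho, omega_sum_zero))


def cases(max_ext=20):
    """All (|P_v|, rho_v, omega-sum-is-zero) combinations that are checked."""
    for n_ext in range(2, max_ext + 1, 2):
        labels = ("s", "p") if n_ext <= 4 else ("s",)
        for rho in labels:
            for omega_sum_zero in (True, False):
                yield n_ext, rho, omega_sum_zero


if __name__ == "__main__":
    for n_ext, rho, zero in cases(8):
        d = improved_dimension(n_ext, rho, zero, 0.1)
        print(n_ext, rho, zero, naive_dimension(n_ext), round(d, 3), d >= n_ext / 6)
```

With these assumptions the naive dimension is non-positive exactly for two and four external fields, while the improved one stays above $|P_v|/6$ as long as $c|\lambda| \le 1/3$.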
Then (3.25) in Theorem 3.1 follows from the previous bounds and the remark that
$$\sum_{\tau \in \mathcal{T}_{h,n}} \sum_{\mathbf{P} \in \mathcal{P}_\tau} \sum_{T \in \mathbf{T}} \sum_{\beta \in B_T} \prod_{v} \frac{1}{s_v!}\, \gamma^{-\frac{|P_v|}{6}} \le c^n, \tag{A4.26}$$
for some constant $c$, see [BM] or [GM] for further details. The bound on $\tilde{E}_h$, $t_h$, (3.26) and (3.27) follow from a similar analysis. The remarks following (3.26) and (3.27) follow from noticing that in the expansion for $\mathcal{L}V^{(h)}$ only propagators of type $P_0 g^{(h_v)}_{a,a'}$ or $P_1 g^{(h_v)}_{a,a'}$ appear (in order to bound these propagators we do not need (3.19), see the last statement in Lemma 3.3). Furthermore, by construction $l_h$, $n_h$ and $z_h$ are independent of $\sigma_k$, $\mu_k$, so that, in order to prove (3.27) we do not even need the first two inequalities in (3.18).
A4.6. The sum over all the trees with root scale $h$ and with at least a $v$ with $h_v = k$ is $O(|\lambda| \gamma^{\frac{1}{2}(h-k)})$; this follows from the fact that the bound (A4.26) holds, for some $c = O(1)$, even if $\gamma^{-|P_v|/6}$ is replaced by $\gamma^{-\kappa |P_v|}$, for any constant $\kappa > 0$ independent of $\lambda$; and that $D_v$, instead of using (A4.25), can also be bounded as $D_v \ge 1/2 + |P_v|/12$. This property is called the short memory property.

Appendix A5. Proof of Theorem 4.1 and Lemma 4.2

We consider the space $M_\vartheta$ of sequences $\nu = \{\nu_h\}_{h \le 1}$ such that $|\nu_h| \le c|\lambda| \gamma^{(\vartheta/2)h}$; we shall think of $M_\vartheta$ as a Banach space with norm $\|\cdot\|_\vartheta$, where $\|\nu\|_\vartheta \stackrel{def}{=} \sup_{k \le 1} |\nu_k| \gamma^{-(\vartheta/2)k}$. We will proceed as follows: we first show that, for any sequence $\nu \in M_\vartheta$, the flow equation for $\nu_h$, the hypotheses (3.17), (3.18) and the property $|\lambda_h(\nu)| \le c|\lambda|$ are verified, uniformly in $\nu$. Then we fix $\nu \in M_\vartheta$ via an exponentially convergent iterative procedure, in such a way that the flow equation for $\nu_h$ is satisfied.
A5.1. Proof of Theorem 4.1. Given $\nu \in M_\vartheta$, let us suppose inductively that (3.17), (3.18) hold and that, for $k > \bar{h} + 1$,
$$|\lambda_{k-1}(\nu) - \lambda_k(\nu)| \le c_0 |\lambda|^2 \gamma^{(\vartheta/2)k}, \tag{A5.1}$$
for some $c_0 > 0$. Note that (A5.1) is certainly true for $h = 1$ (in that case the r.h.s. of (A5.1) is just the bound on $\beta^1_\lambda$). Note also that (A5.1) implies that $|\lambda_k| \le c|\lambda|$, for any $k > \bar{h}$. Using (3.26), the second of (3.27) and (4.1) we find that (3.17), (3.18) are true with $\bar{h}$ replaced by $\bar{h} - 1$.
We now consider the equation $\lambda_{h-1} = \lambda_h + \beta^h_\lambda(\lambda_h, \nu_h; \ldots; \lambda_1, \nu_1)$, $h > \bar{h}$. The function $\beta^h_\lambda$ can be expressed as a convergent sum over tree diagrams, as described in Appendix A4; note that it depends on $(\lambda_h, \nu_h; \ldots; \lambda_1, \nu_1)$ directly through the end-points of the trees and indirectly through the factors $Z_h/Z_{h-1}$.
We can write $P_0 g^{(h)}_{(+,\omega),(-,\omega)}(x - y) = g^{(h)}_{L,\omega}(x - y) + r^{(h)}_\omega(x - y)$, where
$$g^{(h)}_{L,\omega}(x - y) \stackrel{def}{=} \frac{4}{M^2} \sum_{\mathbf{k}} e^{-i\mathbf{k}(x-y)} f_h(\mathbf{k}) \frac{1}{ik + \omega k_0} \tag{A5.2}$$
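and $r^{(h)}_\omega$ is the rest, satisfying the same bound as $g^{(h)}_{(+,\omega),(-,\omega)}$, times a factor $\gamma^h$. This decomposition induces the following decomposition for $\beta^h_\lambda$:
$$\beta^h_\lambda(\lambda_h, \nu_h; \ldots; \lambda_1, \nu_1) = \beta^h_{\lambda,L}(\lambda_h, \ldots, \lambda_h) + \sum_{k=h+1}^{1} D^{h,k}_\lambda + r^h_\lambda(\lambda_h, \ldots, \lambda_1) + \sum_{k \ge h} \nu_k \tilde{\beta}^{h,k}_\lambda(\lambda_k, \nu_k; \ldots; \lambda_1, \nu_1), \tag{A5.3}$$
with
$$|\beta^h_{\lambda,L}| \le c|\lambda|^2 \gamma^{\vartheta h}, \quad |D^{h,k}_\lambda| \le c|\lambda| \gamma^{\vartheta(h-k)} |\lambda_k - \lambda_h|, \quad |r^h_\lambda| \le c|\lambda|^2 \gamma^{(\vartheta/2)h}, \quad |\tilde{\beta}^{h,k}_\lambda| \le c|\lambda| \gamma^{\vartheta(h-k)}. \tag{A5.4}$$
The first two terms in (A5.3), $\beta^h_{\lambda,L}$ and the sum of the $D^{h,k}_\lambda$, collect the contributions obtained by posing $r^{(k)}_\omega = 0$, $k \ge h$, and substituting the discrete $\delta$ function defined after (3.8) with $M^2 \delta_{\mathbf{k},0}$. The first of (A5.4) is called the vanishing of the Luttinger model Beta function property, see [BGPS, GS, BM1] (or [BeM1] for a simplified proof), and it is a crucial property of interacting fermionic systems in $d = 1$. Using the decomposition (A5.3) and the bounds (A5.4) we prove the following bounds for $\lambda_{\bar{h}}(\nu)$, $\nu \in M_\vartheta$:
$$|\lambda_{\bar{h}}(\nu) - \lambda_1(\nu)| \le c_0 |\lambda|^2, \qquad |\lambda_{\bar{h}}(\nu) - \lambda_{\bar{h}+1}(\nu)| \le c_0 |\lambda|^2 \gamma^{(\vartheta/2)\bar{h}}, \tag{A5.5}$$
for some $c_0 > 0$. Moreover, given $\nu, \nu' \in M_\vartheta$, we show that:
$$|\lambda_{\bar{h}}(\nu) - \lambda_{\bar{h}}(\nu')| \le c|\lambda|\, \|\nu - \nu'\|_0, \tag{A5.6}$$
where $\|\nu - \nu'\|_0 \stackrel{def}{=} \sup_{h \le 1} |\nu_h - \nu'_h|$.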
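Proof of (A5.5). We decompose $\lambda_{\bar{h}} - \lambda_{\bar{h}+1} = \beta^{\bar{h}+1}_\lambda$ as in (A5.3). Using the bounds (A5.4) and the inductive hypothesis (A5.1), we find:
$$|\lambda_{\bar{h}}(\nu) - \lambda_{\bar{h}+1}(\nu)| \le c|\lambda|^2 \gamma^{\vartheta(\bar{h}+1)} + \sum_{k \ge \bar{h}+2} c|\lambda| \gamma^{\vartheta(\bar{h}+1-k)} \sum_{k'=\bar{h}+2}^{k} c_0 |\lambda|^2 \gamma^{(\vartheta/2)k'} + c|\lambda|^2 \gamma^{(\vartheta/2)(\bar{h}+1)} + \sum_{k \ge \bar{h}+1} c^2 |\lambda|^2 \gamma^{(\vartheta/2)k} \gamma^{\vartheta(\bar{h}+1-k)}, \tag{A5.7}$$
which, for $c_0$ big enough, immediately implies the second of (A5.5) with $\bar{h} \to \bar{h} - 1$; from this bound and the hypothesis (A5.1) follows the first of (A5.5).
Proof of (A5.6). If we take two sequences $\nu, \nu' \in M_\vartheta$, we easily find that the beta function for $\lambda_{\bar{h}}(\nu) - \lambda_{\bar{h}}(\nu')$ can be represented by a tree expansion similar to the one for $\beta^h_\lambda$, with the property that the trees giving a non-vanishing contribution have necessarily one end-point on scale $k \ge h$ associated to a coupling constant $\lambda_k(\nu) - \lambda_k(\nu')$ or $\nu_k - \nu'_k$. Then we find:
$$\lambda_{\bar{h}}(\nu) - \lambda_{\bar{h}}(\nu') = \lambda_1(\nu) - \lambda_1(\nu') + \sum_{\bar{h}+1 \le k \le 1} \big[ \beta^k_\lambda(\lambda_k(\nu), \nu_k; \ldots; \lambda_1, \nu_1) - \beta^k_\lambda(\lambda_k(\nu'), \nu'_k; \ldots; \lambda_1, \nu'_1) \big]. \tag{A5.8}$$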
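Note that $|\lambda_1(\nu) - \lambda_1(\nu')| \le c_0 |\lambda| |\nu_1 - \nu'_1|$, because $\lambda_1 = \lambda/Z_1^2 + O(\lambda^2/Z_1^4)$ and $Z_1 = \sqrt{2} - 1 + \nu/2$. If we inductively suppose that, for any $k > \bar{h}$, $|\lambda_k(\nu) - \lambda_k(\nu')| \le 2c_0 |\lambda|\, \|\nu - \nu'\|_0$, we find, by using the decomposition (A5.3):
$$|\lambda_{\bar{h}}(\nu) - \lambda_{\bar{h}}(\nu')| \le c_0 |\lambda| |\nu_1 - \nu'_1| + c|\lambda| \sum_{k \ge \bar{h}+1} \gamma^{(\vartheta/2)k} \sum_{k' \ge k} \gamma^{\vartheta(k-k')} \big[ 2c_0 |\lambda|\, \|\nu - \nu'\|_0 + |\nu_{k'} - \nu'_{k'}| \big]. \tag{A5.9}$$
Choosing $c_0$ big enough, (A5.6) follows.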
We are now left with fixing the sequence $\nu$ in such a way that the flow equation for $\nu$ is satisfied. Since we want to fix $\nu$ in such a way that $\nu_{-\infty} = 0$, we must have:
$$\nu_1 = -\sum_{k=-\infty}^{1} \gamma^{k-2} \beta^k_\nu(\lambda_k, \nu_k; \ldots; \lambda_1, \nu_1). \tag{A5.10}$$
If we manage to fix $\nu_1$ as in (A5.10), we also get:
$$\nu_h = -\sum_{k \le h} \gamma^{k-h-1} \beta^k_\nu(\lambda_k, \nu_k; \ldots; \lambda_1, \nu_1). \tag{A5.11}$$
We look for a fixed point of the operator $\mathbf{T} : M_\vartheta \to M_\vartheta$ defined as:
$$(\mathbf{T}\nu)_h = -\sum_{k \le h} \gamma^{k-h-1} \beta^k_\nu(\lambda_k(\nu), \nu_k; \ldots; \lambda_1, \nu_1), \tag{A5.12}$$
where $\lambda_k(\nu)$ is the solution of the first line of (4.2), obtained as a function of the parameter $\nu$, as described above. If we find a fixed point $\nu^*$ of (A5.12), the first two lines in (4.2) will be simultaneously solved by $\lambda(\nu^*)$ and $\nu^*$ respectively, and the solution will have the desired smallness properties for $\lambda_h$ and $\nu_h$. First note that, if $|\lambda|$ is sufficiently small, then $\mathbf{T}$ leaves $M_\vartheta$ invariant: in fact, as a consequence of parity cancellations, the $\nu$-component of the Beta function satisfies:
$$\beta^h_\nu(\lambda_h, \nu_h; \ldots; \lambda_1, \nu_1) = \beta^h_{\nu,1}(\lambda_h; \ldots; \lambda_1) + \sum_{k} \nu_k \tilde{\beta}^{h,k}_\nu(\lambda_h, \nu_h; \ldots; \lambda_1, \nu_1), \tag{A5.13}$$
where, if $c_1$, $c_2$ are suitable constants,
$$|\beta^h_{\nu,1}| \le c_1 |\lambda| \gamma^{\vartheta h}, \qquad |\tilde{\beta}^{h,k}_\nu| \le c_2 |\lambda| \gamma^{\vartheta(h-k)}. \tag{A5.14}$$
By using (A5.13) and choosing $c = 2c_1$ we obtain
$$|(\mathbf{T}\nu)_h| \le \sum_{k \le h} 2c_1 |\lambda| \gamma^{(\vartheta/2)k} \gamma^{k-h} \le c|\lambda| \gamma^{(\vartheta/2)h}. \tag{A5.15}$$
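Furthermore, using (A5.13) and (A5.6), we find that $\mathbf{T}$ is a contraction on $M_\vartheta$:
$$|(\mathbf{T}\nu)_h - (\mathbf{T}\nu')_h| \le \sum_{k \le h} \gamma^{k-h-1} \big| \beta^k_\nu(\lambda_k(\nu), \nu_k; \ldots; \lambda_1, \nu_1) - \beta^k_\nu(\lambda_k(\nu'), \nu'_k; \ldots; \lambda_1, \nu'_1) \big|$$
$$\le c \sum_{k \le h} \gamma^{k-h-1} \Big[ \gamma^{\vartheta k} \sum_{k'=k}^{1} |\lambda_{k'}(\nu) - \lambda_{k'}(\nu')| + \sum_{k'=k}^{1} \gamma^{\vartheta(k-k')} |\lambda| |\nu_{k'} - \nu'_{k'}| \Big]$$
$$\le c' \sum_{k \le h} \gamma^{k-h-1} \Big[ |k| \gamma^{\vartheta k} |\lambda|\, \|\nu - \nu'\|_0 + \sum_{k'=k}^{1} \gamma^{\vartheta(k-k')} |\lambda| \gamma^{(\vartheta/2)k'} \|\nu - \nu'\|_\vartheta \Big]$$
$$\le c'' |\lambda| \gamma^{(\vartheta/2)h} \|\nu - \nu'\|_\vartheta, \tag{A5.16}$$
hence $\|(\mathbf{T}\nu) - (\mathbf{T}\nu')\|_\vartheta \le c'' |\lambda|\, \|\nu - \nu'\|_\vartheta$. Then, a unique fixed point $\nu^*$ for $\mathbf{T}$ exists on $M_\vartheta$. The proof of Theorem 4.1 is concluded by noticing that $\mathbf{T}$ is analytic (in fact $\beta^h_\nu$ and $\lambda$ are analytic in $\nu$ in the domain $M_\vartheta$).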
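The fixed-point construction (A5.10) to (A5.16) can be illustrated numerically. The sketch below is a toy model, not the paper's beta function: it assumes a hypothetical $\beta^k_\nu$ with the structure and decay of (A5.13), (A5.14), truncates the scales to a finite window, and uses made-up values for $\gamma$, $\vartheta$ and $\lambda$. It iterates the map of (A5.12) from $\nu = 0$ and records the gap between successive iterates in the weighted norm $\|\cdot\|_\vartheta$.

```python
GAMMA, THETA, LAM = 2.0, 0.5, 0.05   # assumed toy values, not from the paper
H_MIN, H_MAX = -40, 1                # scales truncated to H_MIN <= h <= 1
SCALES = range(H_MIN, H_MAX + 1)


def beta_nu(k, nu):
    """Hypothetical beta function mimicking the structure of (A5.13)-(A5.14)."""
    direct = LAM * GAMMA ** (THETA * k)
    memory = sum(nu[kp] * LAM * GAMMA ** (THETA * (k - kp))
                 for kp in range(k, H_MAX + 1))
    return direct + memory


def apply_T(nu):
    """(T nu)_h = - sum_{k <= h} gamma^(k-h-1) beta_nu^k, as in (A5.12)."""
    beta = {k: beta_nu(k, nu) for k in SCALES}
    return {h: -sum(GAMMA ** (k - h - 1) * beta[k] for k in range(H_MIN, h + 1))
            for h in SCALES}


def weighted_norm(nu):
    """Norm ||nu||_theta = sup_k |nu_k| gamma^(-(theta/2) k)."""
    return max(abs(nu[k]) * GAMMA ** (-(THETA / 2) * k) for k in SCALES)


def solve(n_iter=60):
    """Iterate T from nu = 0; return the last iterate and the successive gaps."""
    nu = {h: 0.0 for h in SCALES}
    gaps = []
    for _ in range(n_iter):
        new = apply_T(nu)
        gaps.append(weighted_norm({h: new[h] - nu[h] for h in SCALES}))
        nu = new
    return nu, gaps


if __name__ == "__main__":
    nu_star, gaps = solve()
    print("first gaps:", gaps[:5])
    print("||nu*||_theta =", weighted_norm(nu_star))
```

In this toy setting the gaps shrink geometrically, and the limit satisfies the flow equation $\nu_{h-1} = \gamma \nu_h + \beta^h_\nu$ on the truncated window, which is the content of (A5.11).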
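A5.2. Proof of Lemma 4.2. From now on we shall think of $\lambda_h$ and $\nu_h$ as fixed, with $\nu_1$ conveniently chosen as above ($\nu_1 = \nu_1^*(\lambda)$). Then we have $|\lambda_h| \le c|\lambda|$ and $|\nu_h| \le c|\lambda| \gamma^{(\vartheta/2)h}$, for some $c, \vartheta > 0$. Having fixed $\nu_1$ as a convenient function of $\lambda$, we can also think of $\lambda_h$ and $\nu_h$ as functions of $\lambda$.
The flow of $Z_h$. The flow of $Z_h$ is given by the first of (4.1) with $z_h$ independent of $\sigma_k$, $\mu_k$, $k \ge h$. By Theorem 3.1 we have that $|z_h| \le c|\lambda|^2$, uniformly in $h$. Again, as for $\lambda_h$ and $\nu_h$, we can formally study this equation up to $h = -\infty$. We define $\gamma^{-\eta_z} \stackrel{def}{=} \lim_{h \to -\infty} (1 + z_h)$, so that
$$\log_\gamma Z_h = \sum_{k \ge h+1} \log_\gamma (1 + z_k) = \eta_z (h - 1) + \sum_{k \ge h+1} r^k_\zeta, \qquad r^k_\zeta \stackrel{def}{=} \log_\gamma \Big( 1 + \frac{z_k - z_{-\infty}}{1 + z_{-\infty}} \Big). \tag{A5.17}$$
Using the fact that $z_{k-1} - z_k$ is necessarily proportional to $\lambda_{k-1} - \lambda_k$ or to $\nu_{k-1} - \nu_k$ and that $\lambda_{k-1} - \lambda_k$ is bounded as in (A5.1), we easily find: $|r^k_\zeta| \le c \sum_{k' \le k} |z_{k'-1} - z_{k'}| \le c' |\lambda|^2 \gamma^{(\vartheta/2)k}$. So, if $F^h_\zeta \stackrel{def}{=} \sum_{k \ge h+1} r^k_\zeta$ and $F^1_\zeta = 0$, then $F^h_\zeta = O(\lambda)$ and $Z_h = \gamma^{\eta_z (h-1) + F^h_\zeta}$. Clearly, by definition, $\eta_z$ and $F^h_\zeta$ only depend on $\lambda_k$, $\nu_k$, $k \le 1$, so they are independent of $t$ and $u$.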
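The decomposition (A5.17) is a purely algebraic identity once a limit $z_{-\infty}$ exists, and it can be checked on a toy sequence. The sketch below assumes a hypothetical $z_k$ that converges exponentially to an assumed limit, takes the sum over $k \ge h+1$ to run up to $k = 1$, and follows the formula as printed (so no separate factor $Z_1$ is included). All numerical values and function names are illustrative.

```python
import math

GAMMA = 2.0      # assumed scaling parameter
THETA = 0.5      # assumed decay exponent
Z_INF = 0.01     # assumed limiting value z_{-infinity}


def z(k):
    """Hypothetical z_k converging exponentially to Z_INF as k -> -infinity."""
    return Z_INF + 0.02 * GAMMA ** (THETA * k / 2)


def log_gamma(x):
    """Logarithm in base GAMMA."""
    return math.log(x) / math.log(GAMMA)


# gamma^(-eta_z) = 1 + z_{-infinity}
ETA_Z = -log_gamma(1.0 + Z_INF)


def log_gamma_Z(h):
    """Left-hand side of (A5.17): sum over k = h+1, ..., 1 of log_gamma(1 + z_k)."""
    return sum(log_gamma(1.0 + z(k)) for k in range(h + 1, 2))


def F_zeta(h):
    """Sum over k = h+1, ..., 1 of the remainders r_zeta^k of (A5.17)."""
    return sum(log_gamma(1.0 + (z(k) - Z_INF) / (1.0 + Z_INF))
               for k in range(h + 1, 2))


if __name__ == "__main__":
    for h in (0, -5, -30):
        print(h, log_gamma_Z(h), ETA_Z * (h - 1) + F_zeta(h))
```

The two printed columns agree up to rounding, and $F^h_\zeta$ saturates as $h \to -\infty$ because the remainders decay exponentially, which is why $Z_h$ behaves like a pure power $\gamma^{\eta_z (h-1)}$ up to a bounded correction.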
The flow of $\mu_h$. The flow of $\mu_h$ is given by the last of (4.1). One can easily show inductively that µk (k)/µh, $k \ge h$, is independent of $\mu_1$, so that one can think that $\mu_{h-1}/\mu_h$ is just a function of $\lambda_h$, $\nu_h$. Then, again we can study the flow equation for $\mu_h$ up to $h \to -\infty$. We define $\gamma^{-\eta_\mu} \stackrel{def}{=} \lim_{h \to -\infty} 1 + (m_h/\mu_h - z_h)/(1 + z_h)$, so that, proceeding as for $Z_h$, we see that
$$\mu_h = \mu_1 \gamma^{\eta_\mu (h-1) + F^h_\mu}, \tag{A5.18}$$
for a suitable $F^h_\mu = O(\lambda)$. Of course $\eta_\mu$ and $F^h_\mu$ are independent of $t$ and $u$.
The flow of $\sigma_h$. The flow of $\sigma_h$ can be studied as the one of $\mu_h$. If we define $\gamma^{-\eta_\sigma} \stackrel{def}{=} \lim_{h \to -\infty} 1 + (s_h/\sigma_h - z_h)/(1 + z_h)$, we find that
$$\sigma_h = \sigma_1 \gamma^{\eta_\sigma (h-1) + F^h_\sigma}, \tag{A5.19}$$
for a suitable $F^h_\sigma = O(\lambda)$. Again, $\eta_\sigma$ and $F^h_\sigma$ are independent of $t$, $u$. We are left with proving that $\eta_\sigma - \eta_\mu \neq 0$. It is sufficient to note that, by direct computation of the lowest order terms, for some $\vartheta > 0$, (4.1) can be written as:
$$z_h = b_1 \lambda_h^2 + O(|\lambda|^2 \gamma^{\vartheta h}) + O(|\lambda|^3), \quad b_1 > 0,$$
$$s_h/\sigma_h = -b_2 \lambda_h + O(|\lambda| \gamma^{\vartheta h}) + O(|\lambda|^2), \quad b_2 > 0,$$
$$m_h/\mu_h = b_2 \lambda_h + O(|\lambda| \gamma^{\vartheta h}) + O(|\lambda|^2), \quad b_2 > 0, \tag{A5.20}$$
where $b_1$, $b_2$ are constants independent of $\lambda$ and $h$. Using (A5.20) and the definitions of $\eta_\mu$ and $\eta_\sigma$ we find: $\eta_\sigma - \eta_\mu = (2b_2/\log \gamma)\lambda + O(\lambda^2)$.

Appendix A6. Proof of Lemma 5.3

Proceeding as in §4 and Appendix A5, we first solve the equations for $Z_h$ and $\hat{m}^{(2)}_h$ parametrically in $\pi = \{\pi_h\}_{h \le h_1^*}$. If $|\pi_h| \le c|\lambda| \gamma^{(\vartheta/2)(h-h_1^*)}$, the first two assumptions of (5.14) easily follow. Now we will construct a sequence $\pi$ such that $|\pi_h| \le c|\lambda| \gamma^{(\vartheta/2)(h-h_1^*)}$ and satisfying the flow equation $\pi_{h-1} = \gamma^h \pi_h + \beta^h_\pi(\pi_h, \ldots, \pi_{h_1^*})$.
A6.1. Tree expansion for $\beta^h_\pi$. $\beta^h_\pi$ can be expressed as a sum over tree diagrams, similar to those used in Appendix A4. The main difference is that they have vertices on scales $k$ between $h$ and $+2$. The vertices on scales $h_v \ge h_1^* + 1$ are associated to the truncated expectations (A4.4); the vertices on scale $h_v = h_1^*$ are associated to truncated expectations w.r.t. the propagators $g^{(1,h_1^*)}_{\omega_1,\omega_2}$; the vertices on scale $h_v < h_1^*$ are associated to truncated expectations w.r.t. the propagators $g^{(2,h_v+1)}_{\omega_1,\omega_2}$. Moreover the end-points on scale $\ge h_1^* + 1$ are associated to the couplings $\lambda_h$ or $\nu_h$, as in Appendix A4; the end-points on scales $h \le h_1^*$ are necessarily associated to the couplings $\pi_h$.
A6.2. Bounds on $\beta^h_\pi$. The non-vanishing trees contributing to $\beta^h_\pi$ must have at least one vertex on scale $\ge h_1^*$: in fact the diagrams depending only on the vertices of type $\pi$ are vanishing (they are chains, so they are vanishing, because of the compact support property of the propagator). This means that, by the short memory property, see the Remark at the end of Appendix A4: $|\beta^h_\pi| \le c|\lambda| \gamma^{\vartheta(h-h_1^*)}$.
A6.3. Fixing the counterterm. We now proceed as in Appendix A5 but the analysis here is easier, because no $\lambda$ end-points can appear and the bound $|\beta^h_\pi| \le c|\lambda| \gamma^{\vartheta(h-h_1^*)}$ holds. As in Appendix A5, we can formally consider the flow equation up to $h = -\infty$, even if $h_2^*$ is a finite integer. This is because the beta function is independent of $\hat{m}^{(2)}_k$, $k \le h_1^*$, and admits bounds uniform in $h$. If we want to fix the counterterm $\pi_{h_1^*}$ in such a way that $\pi_{-\infty} = 0$, we must have, for any $h \le h_1^*$:
$$\pi_h = -\sum_{k \le h} \gamma^{k-h-1} \beta^k_\pi(\pi_k, \ldots, \pi_{h_1^*}). \tag{A6.1}$$
Let $\tilde{M}$ be the space of sequences $\pi = \{\pi_{-\infty}, \ldots, \pi_{h_1^*}\}$ such that $|\pi_h| \le c|\lambda| \gamma^{(\vartheta/2)(h-h_1^*)}$. We look for a fixed point of the operator $\tilde{\mathbf{T}} : \tilde{M} \to \tilde{M}$ defined as:
$$(\tilde{\mathbf{T}}\pi)_h = -\sum_{k \le h} \gamma^{k-h-1} \beta^k_\pi(\pi_k; \ldots; \pi_{h_1^*}). \tag{A6.2}$$
Using that $\beta^k_\pi$ is independent of $\hat{m}^{(2)}_k$ and the bound on the beta function, choosing $\lambda$ small enough and proceeding as in the proof of Theorem 4.1, we find that $\tilde{\mathbf{T}}$ is a contraction on $\tilde{M}$, so that we find a unique fixed point, and the first of (5.16) follows.
A6.4. The flows of $Z_h$ and $\hat{m}^{(2)}_h$. Once $\pi_{h_1^*}$ is fixed via the iterative procedure of §A6.3, we can study in more detail the flows of $Z_h$ and $\hat{m}^{(2)}_h$ given by (5.10). Note that $z_h$ and $s_h$ can be again expressed as a sum over tree diagrams and, as discussed for $\beta^h_\pi$, see §A6.2, any non-vanishing diagram must have at least one vertex on scale $\ge h_1^*$. Then, by the short memory property, see §A4.6, we have $z_h = O(\lambda^2 \gamma^{\vartheta(h-h_1^*)})$ and $s_h = O(\lambda\, \hat{m}^{(2)}_h \gamma^{\vartheta(h-h_1^*)})$ and, repeating the proof of Lemma 4.1, we find the second and third of (5.16).
A6.5. The Lipschitz property (5.17). Clearly, $\pi^*_{h_1^*}(\lambda, \sigma_1, \mu_1) - \pi^*_{h_1^*}(\lambda, \sigma'_1, \mu'_1)$ can be expressed via a tree expansion similar to the one discussed above; in the trees with non-vanishing value, there is either a difference of propagators at scale $h \ge h_1^*$ with couplings $\sigma_h$, $\mu_h$ and $\sigma'_h$, $\mu'_h$, giving in the dimensional bounds an extra factor $O(|\sigma_h - \sigma'_h| \gamma^{-h})$ or $O(|\mu_h - \mu'_h| \gamma^{-h})$; or a difference of propagators at scale $h \le h_1^*$ (computed by definition at $\hat{m}^{(2)}_h = 0$) with the “corrections” $a^h_\omega$, $c_h$ associated to $\sigma_1$, $\mu_1$ or $\sigma'_1$, $\mu'_1$, giving in the dimensional bounds an extra factor $O(|\sigma_1 - \sigma'_1|)$ or $O(|\mu_1 - \mu'_1|)$. Then,
$$\big| \pi^*_{h_1^*}(\lambda, \sigma_1, \mu_1) - \pi^*_{h_1^*}(\lambda, \sigma'_1, \mu'_1) \big| \le c|\lambda| \sum_{k \le h_1^*} \gamma^{k-h_1^*-1} \cdot \Big[ \sum_{h \ge h_1^*} \Big( \frac{|\sigma_h - \sigma'_h|}{\gamma^h} + \frac{|\mu_h - \mu'_h|}{\gamma^h} \Big) + \sum_{k \le h \le h_1^*} \big( |\sigma_1 - \sigma'_1| + |\mu_1 - \mu'_1| \big) \Big], \tag{A6.3}$$
from which, using (A5.18) and (A5.19), we easily get (5.17).

Appendix A7. Proof of (A4.10)

We have, by definition, $\mathrm{Pf}\, G = (2^k k!)^{-1} \sum_p (-1)^p G_{p(1)p(2)} \cdots G_{p(2k-1)p(2k)}$, where $p = (p(1), \ldots, p(|J|))$ is a permutation of the indices $f \in J$ (we suppose $|J| = 2k$) and $(-1)^p$ its sign.
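The permutation-sum definition of the Pfaffian used above can be made concrete with a small, self-contained Python sketch. It is a brute-force illustration only (the cost grows factorially with the matrix size); the function names and the sample matrices are illustrative choices. The sketch compares the definition with the standard expansion along the first row and with the identity $(\mathrm{Pf}\, G)^2 = \det G$.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def perm_sign(p):
    """Sign of a permutation of 0..n-1, computed by sorting it with swaps."""
    p = list(p)
    sign = 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]   # puts the value j at position j
            sign = -sign
    return sign


def pfaffian_by_definition(G):
    """Pf G = (2^k k!)^(-1) sum_p (-1)^p G_{p(1)p(2)} ... G_{p(2k-1)p(2k)}."""
    n = len(G)
    k = n // 2
    total = 0
    for p in permutations(range(n)):
        term = perm_sign(p)
        for i in range(k):
            term *= G[p[2 * i]][p[2 * i + 1]]
        total += term
    return Fraction(total, 2 ** k * factorial(k))


def pfaffian_by_expansion(G):
    """Recursive expansion of the Pfaffian along the first row."""
    n = len(G)
    if n == 0:
        return 1
    total = 0
    for j in range(1, n):
        keep = [i for i in range(n) if i not in (0, j)]
        minor = [[G[a][b] for b in keep] for a in keep]
        total += (-1) ** (j - 1) * G[0][j] * pfaffian_by_expansion(minor)
    return total


def determinant(G):
    """Leibniz formula for the determinant (fine for small matrices)."""
    n = len(G)
    total = 0
    for p in permutations(range(n)):
        term = perm_sign(p)
        for i in range(n):
            term *= G[i][p[i]]
        total += term
    return total


def antisymmetric(n):
    """Deterministic integer antisymmetric test matrix of size n."""
    G = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            G[i][j] = (i + 1) * (j + 2) % 7 - 3
            G[j][i] = -G[i][j]
    return G


G4 = [[0, 1, 2, 3],
      [-1, 0, 4, 5],
      [-2, -4, 0, 6],
      [-3, -5, -6, 0]]

if __name__ == "__main__":
    print(pfaffian_by_definition(G4), pfaffian_by_expansion(G4), determinant(G4))
```

For the $4 \times 4$ example the Pfaffian is $G_{12}G_{34} - G_{13}G_{24} + G_{14}G_{23} = 1\cdot 6 - 2\cdot 5 + 3\cdot 4 = 8$, and the determinant is $64$.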
If we apply $S_1 \stackrel{def}{=} 1 - P_0$ to $\mathrm{Pf}\, G$ and we call $G^0_{f,f'} = P_0 G_{f,f'}$, we find that $S_1\, \mathrm{Pf}\, G$ is equal to
$$\frac{1}{2^k k!} \sum_p (-1)^p \big[ G_{p(1)p(2)} \cdots G_{p(2k-1)p(2k)} - G^0_{p(1)p(2)} \cdots G^0_{p(2k-1)p(2k)} \big]$$
$$= \frac{1}{2^k k!} \sum_p (-1)^p \sum_{j=1}^{k} \big[ G^0_{p(1)p(2)} \cdots G^0_{p(2j-3)p(2j-2)} \big] \times S_1 G_{p(2j-1)p(2j)} \big[ G_{p(2j+1)p(2j+2)} \cdots G_{p(2k-1)p(2k)} \big], \tag{A7.1}$$
where in the last sum the meaningless factors must be put equal to 1. We rewrite the two sums over $p$ and $j$ in the following way:
$$\sum_p \sum_{j=1}^{k} = \sum_{j=1}^{k} \sum_{\substack{f_1, f_2 \in J \\ f_1 \neq f_2}} \sum^{*}_{J_1, J_2} \sum^{**}_{p}, \tag{A7.2}$$
where the $*$ on the sum over $J_1, J_2$ means that the sets $J_1$ and $J_2$ are s.t. $(f_1, f_2, J_1, J_2)$ is a partition of $J$; the $**$ on the sum over $p$ means that $p(1), \ldots, p(2j-2)$ belong to $J_1$, $(p(2j-1), p(2j)) = (f_1, f_2)$ and $p(2j+1), \ldots, p(2k)$ belong to $J_2$. Using (A7.2) we can rewrite (A7.1) as
$$S_1\, \mathrm{Pf}\, G = \frac{1}{2^k k!} \sum_{\substack{f_1, f_2 \in J \\ f_1 \neq f_2}} S_1 G_{f_1,f_2} \sum_{j=1}^{k} \sum^{*}_{J_1, J_2} (-1)^\pi \cdot \sum_{p_1, p_2} (-1)^{p_1 + p_2}\, G^0_{p_1(1)p_1(2)} \cdots G^0_{p_1(2k_1-1)p_1(2k_1)} \times G_{p_2(1)p_2(2)} \cdots G_{p_2(2k_2-1)p_2(2k_2)}, \tag{A7.3}$$
where $(-1)^\pi$ is the sign of the permutation leading from the ordering $J$ to the ordering $(f_1, f_2, J_1, J_2)$; $p_i$, $i = 1, 2$, is a permutation of the labels in $J_i$ (we suppose $|J_i| = 2k_i$) and $(-1)^{p_i}$ is its sign. It is clear that (A7.3) is equivalent to (A4.10).

Acknowledgements. AG thanks Prof. J. L. Lebowitz for his invitation at Rutgers University, where part of this work was done; and acknowledges the NSF Grant DMR 01–279–26, which partially supported his work. VM thanks Prof. T. Spencer for his nice invitation to the Institute for Advanced Studies, in Princeton, where part of this work was done. We both thank Prof. G. Gallavotti for many important remarks and suggestions.
References

[AT] Ashkin, J., Teller, E.: Statistics of Two-Dimensional Lattices with Four Components. Phys. Rev. 64, 178–184 (1943)
[B] Baxter, R.J.: Eight-Vertex Model in Lattice Statistics. Phys. Rev. Lett. 26, 832–833 (1971)
[Ba] Baxter, R.: Exactly solved models in statistical mechanics. London-New York: Academic Press, 1982
[BG1] Benfatto, G., Gallavotti, G.: Perturbation Theory of the Fermi Surface in Quantum Liquid. A General Quasiparticle Formalism and One-Dimensional Systems. J. Stat. Phys. 59, 541–664 (1990)
[BG] Benfatto, G., Gallavotti, G.: Renormalization group. Physics notes 1, Princeton, NJ: Princeton University Press 1995
[BGPS] Benfatto, G., Gallavotti, G., Procacci, A., Scoppola, B.: Beta function and Schwinger functions for a Many Fermions System in One Dimension. Commun. Math. Phys. 160, 93–171 (1994)
[BM] Benfatto, G., Mastropietro, V.: Renormalization group, hidden symmetries and approximate Ward identities in the XYZ model. Rev. Math. Phys. 13 no 11, 1323–143 (2001); Commun. Math. Phys. 231, 97–134 (2002)
[BM1] Bonetto, F., Mastropietro, V.: Beta function and Anomaly of the Fermi Surface for a d=1 System of interacting Fermions in a Periodic Potential. Commun. Math. Phys. 172, 57–93 (1995)
[Bad] Badehdah, M., et al.: Physica B, 291, 394 (2000)
[Bez] Bezerra, C.G., Mariz, A.M.: The anisotropic Ashkin-Teller model: a renormalization Group study. Physica A 292, 429–436 (2001)
[Bar] Bartelt, N.C., Einstein, T.L., et al.: Phys. Rev. B 40, 10759 (1989)
[Be] Bekhechi, S., et al.: Physica A 264, 503 (1999)
[BeM1] Benfatto, G., Mastropietro, V.: Ward identities and Dyson equations in interacting Fermi systems. To appear in J. Stat. Phys.
[DR] Domany, E., Riedel, E.K.: Phys. Rev. Lett. 40, 562 (1978)
[GM] Gentile, G., Mastropietro, V.: Renormalization group for one-dimensional fermions. A review on mathematical results. Phys. Rep. 352(4–6), 273–243 (2001)
[GS] Gentile, G., Scoppola, B.: Renormalization Group and the ultraviolet problem in the Luttinger model. Commun. Math. Phys. 154, 153–179 (1993)
[K] Kadanoff, L.P.: Connections between the Critical Behavior of the Planar Model and That of the Eight-Vertex Model. Phys. Rev. Lett. 39, 903–905 (1977)
[KW] Kadanoff, L.P., Wegner, F.J.: Phys. Rev. B 4, 3989–3993 (1971)
[Ka] Kasteleyn, P.W.: Dimer Statistics and phase transitions. J. Math. Phys. 4, 287 (1963)
[F] Fan, C.: On critical properties of the Ashkin-Teller model. Phys. Lett. 39A, 136–138 (1972)
[H] Hurst, C.: New approach to the Ising problem. J. Math. Phys. 7(2), 305–310 (1966)
[ID] Itzykson, C., Drouffe, J.: Statistical field theory: 1, Cambridge: Cambridge Univ. Press, 1989
[Le] Lesniewski, A.: Effective action for the Yukawa$_2$ quantum field Theory. Commun. Math. Phys. 108, 437–467 (1987)
[Li] Lieb, H.: Exact solution of the problem of entropy of two-dimensional ice. Phys. Rev. Lett. 18, 692–694 (1967)
[LP] Luther, A., Peschel, I.: Calculations of critical exponents in two dimension from quantum field theory in one dimension. Phys. Rev. B 12, 3908–3917 (1975)
[M1] Mastropietro, V.: Ising models with four spin interaction at criticality. Commun. Math. Phys. 244, 595–642 (2004)
[ML] Mattis, D., Lieb, E.: Exact solution of a many fermion system and its associated boson field. J. Math. Phys. 6, 304–312 (1965)
[MW] McCoy, B., Wu, T.: The two-dimensional Ising model. Cambridge, MA: Harvard Univ. Press, 1973
[MPW] Montroll, E., Potts, R., Ward, J.: Correlation and spontaneous magnetization of the two dimensional Ising model. J. Math. Phys. 4, 308 (1963)
[N] den Nijs, M.P.M.: Derivation of extended scaling relations between critical exponents in two dimensional models from the one dimensional Luttinger model. Phys. Rev. B 23(11), 6111–6125 (1981)
[O] Onsager, L.: Critical statistics. A two dimensional model with an order-disorder transition. Phys. Rev. 56, 117–149 (1944)
[PB] Pruisken, A.M.M., Brown, A.C.: Universality for the critical lines of the eight vertex, Ashkin-Teller and Gaussian models. Phys. Rev. B 23(3), 1459–1468 (1981)
[PS] Pinson, H., Spencer, T.: Universality in 2D critical Ising model. To appear in Commun. Math. Phys.
[S] Samuel, S.: The use of anticommuting variable integrals in statistical mechanics. J. Math. Phys. 21, 2806 (1980)
[Su] Sutherland, S.B.: Two-Dimensional Hydrogen Bonded Crystals. J. Math. Phys. 11, 3183–3186 (1970)
[Spe] Spencer, T.: A mathematical approach to universality in two dimensions. Physica A 279, 250–259 (2000)
[SML] Schultz, T., Mattis, D., Lieb, E.: Two-dimensional Ising model as a soluble problem of many Fermions. Rev. Mod. Phys. 36, 856 (1964)
[W] Wegner, F.J.: Duality relation between the Ashkin-Teller and the eight-vertex model. J. Phys. C 5, L131–L132 (1972)
[WL] Wu, F.Y., Lin, K.Y.: Two phase transitions in the Ashkin-Teller model. J. Phys. C 5, L181–L184 (1974)
[Wu] Wu, F.W.: The Ising model with four spin interaction. Phys. Rev. B 4, 2312–2314 (1971)
Communicated by J.Z. Imbrie
Commun. Math. Phys. 256, 737–766 (2005) Digital Object Identifier (DOI) 10.1007/s00220-005-1290-0
Communications in
Mathematical Physics
Onset of Chaotic Kolmogorov Flows Resulting from Interacting Oscillatory Modes
Zhi-Min Chen^{1,2}, W.G. Price^{1}
1 School of Engineering Sciences, Ship Science, University of Southampton, Southampton SO17 1BJ, UK
2 School of Mathematics, Nankai University, Tianjin 300071, P.R. China
Received: 24 March 2004 / Accepted: 14 July 2004
Published online: 8 March 2005 – © Springer-Verlag 2005
Abstract: On the basis of rigorous analysis supported by numerical computation, a systematic study is presented to locate and examine chaotic Kolmogorov flows resulting from the interaction of a basic steady-state flow and oscillatory modes. Referenced to suitably chosen initial conditions of the Kolmogorov flow model, these oscillatory modes are derived from the equation linearized around the basic steady-state flow. The numerical experiments provide insight into the transition process from secondary self-oscillation flows or secondary steady-state flows to chaotic Kolmogorov flows.
1. Introduction

A complete understanding of the mechanics causing the onset of turbulence in a viscous fluid flow remains an elusive problem. In 1959, Kolmogorov suggested (see, for example, Arnold and Meshalkin [1]) that to understand such mechanisms, studies should initially focus on a simple viscous flow model, namely a two-dimensional viscous fluid motion in which the velocity field $v$ is defined by a stream function $\psi$ and governed by the Navier-Stokes equations expressed in the following nondimensional vorticity formulation:
$$\partial_t \Delta\psi - \Delta^2 \psi + R(\partial_y \psi\, \partial_x \Delta\psi - \partial_x \psi\, \partial_y \Delta\psi) = k^3 \cos ky, \tag{1}$$
$$\psi(t, x + 2\pi, y) = \psi(t, x, 2\pi + y) = \psi(t, x, y). \tag{2}$$
Here $R$ denotes the dimensionless Reynolds number defining the viscous fluid motion, the Laplacian operator is $\Delta = \partial_x^2 + \partial_y^2$, and $k$ is a positive integer. Equation (2) defines the spatially periodic conditions assumed to constrain the flow. The velocity field expressed in terms of the stream function is defined by $v = (\partial_y \psi, -\partial_x \psi)$,
738
Z.-M. Chen, W.G. Price
and the basic steady-state flow is defined as v0 = (∂y ψ0, −∂x ψ0) = (sin ky, 0), where the steady-state stream function solution is ψ0 = −(1/k) cos ky. Landau [20], Lorenz [22] and Ruelle and Takens [30] proposed that turbulence occurs through generation of bifurcating flows in the transitional flow phase between basic steady-state and turbulent flows. The turbulence model of Landau needs an infinite number of bifurcations whereas that of Ruelle and Takens requires a finite number. The bifurcation scenario discussed in [20, 30] starts with the transition from basic steady-state to self-oscillation or temporally periodic flows, whereas the bifurcation process produced in [22] starts with the transition from the basic steady-state solution to a pair of steady-state solutions, which eventually become unstable and transit respectively into a pair of self-oscillations leading to the occurrence of chaos (see, for example, [14]). The investigation of Joseph and Sattinger [15] into Hopf bifurcation theory associated with the Navier-Stokes equations showed that the transition from a steady-state solution to self-oscillations results from the occurrence of oscillatory neutral spectral modes or eigenfunctions, of which the corresponding eigenvalues reach the imaginary axis of the complex plane away from the origin. Motivated by this finding, we now seek a chaotic Kolmogorov flow evolving from a perturbation of the basic steady-state solution ψ0 in the direction of these oscillatory eigenmodes. On the basis of analysis and confirmed by numerical experiments, this investigation suggests that oscillatory eigenmodes play crucial roles in determining a chaotic Kolmogorov flow. Kolmogorov's problem loses stability at low values of the Reynolds number. For example, if we consider (1) with k = 1 in an infinite channel (−∞, ∞) × [0, 2π] and ψ satisfies the spatially periodic boundary condition on y = 0 and 2π, the first critical value of the Reynolds number is √2 (see Yudovich [33] and Green [13]). To describe instabilities observed in physical experiments arising at large values of the Reynolds number, Bondarenko et al. [2] introduced a modified Kolmogorov model taking into account Ekman friction layers. In this paper, numerical computation results are given with respect to low regimes of the Reynolds number values when k ≤ 12. This study is restricted to flow invariant spaces, in which the neutral oscillatory spectral spaces are, in fact, two-dimensional and this gives rise to secondary time periodic flows [15]. The neutral oscillatory spectral space of (1) is actually four-dimensional and this gives rise to secondary time periodic flows, of which the collection forms a two-dimensional smooth manifold contained in a three-dimensional ellipsoid. This bifurcation phenomenon will be discussed elsewhere. Ma and Wang [23, 24] present a rigorous bifurcation theory with respect to a dissipative system and prove that an n-dimensional critical eigenvector space gives rise to a bifurcating attractor, of which the dimension is not less than n − 1 for any n > 1. This theory has been successfully applied to nonlinear dynamical systems such as the Bénard problem and reaction-diffusion equations when their linearized operators are self-conjugate. The chaotic behaviour discussed herein is temporal. For investigations on two-dimensional fully developed turbulence, one may refer to Foias et al. [9–11]. A rigorous steady-state bifurcation analysis was carried out by Yudovich [33] on a model without neutral oscillatory spectral spaces [26].
Okamoto and Shoji [28] demonstrate through numerical simulations supercritical pitchfork bifurcation diagrams appertaining to this model, whereas the authors [7] discuss the characteristic behaviour of its secondary flows and show through a rigorous analysis that the collection of the secondary steady-state solutions is a circle at any supercritical Reynolds number value. The numerical investigation [28] has been extended by Kim and Okamoto [17] to a spatially periodic flow forced by two Fourier modes. Hopf bifurcation solutions of (1) for selected forcing were derived by the authors [5, 6].
This paper is organized as follows: Sect. 2 is devoted to a rigorous analysis classifying eigenvalues which vary across the imaginary axis of the complex plane, namely real eigenvalues and non-real eigenvalues on the imaginary axis. The associated eigenfunctions, with respect to non-real eigenvalues, are oscillatory modes. From formulae derived in Sect. 2, elementary numerical computation results are displayed in Sect. 3 to classify these oscillatory modes. In Sect. 4, based on an understanding of the occurrence of oscillatory modes and the flow invariance of the Navier-Stokes equations (1,2), initial stream functions are chosen in the vicinity of the basic stream function ψ0 to excite interactions of the basic steady-state ψ0 and oscillatory modes to give rise to chaotic attractors as illustrated in the numerical experiments on the nonlinear system (1,2). A discussion of these findings is contained in Sect. 5 and the transformation of the Navier-Stokes equations (1,2) under flow invariance is derived in the Appendix.
2. Linearized Problem
In a spectral perturbation analysis (Lin [21], Drazin and Reid [8]), the stream function is expressed as a steady-state component and a perturbation term. That is,
\[ \psi = \psi_0 + \hat\psi, \]
where $\hat\psi$ satisfies (2). The substitution of this decomposed stream function into Eq. (1) gives, after omitting the nonlinear term, the linearized equation
\[ \partial_t\Delta\hat\psi - \Delta^2\hat\psi + R\sin ky\,(\Delta + k^2)\,\partial_x\hat\psi = 0. \tag{3} \]
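As a consistency check on (3), the linearization can be written out term by term using only the definitions of ψ0 and Eq. (1). The following is a sketch of that intermediate step under those definitions; it is not taken from the original derivation.

```latex
\begin{align*}
% derivatives of the basic stream function \psi_0 = -(1/k)\cos ky
\partial_x\psi_0 &= 0, \qquad \partial_y\psi_0 = \sin ky, \qquad
\Delta\psi_0 = k\cos ky, \qquad \Delta^2\psi_0 = -k^3\cos ky, \\
% \psi_0 is steady: the viscous term balances the forcing in (1)
-\Delta^2\psi_0 &= k^3\cos ky, \\
% terms of the Jacobian that are linear in \hat\psi
R\bigl(\partial_y\psi_0\,\partial_x\Delta\hat\psi
      - \partial_x\hat\psi\,\partial_y\Delta\psi_0\bigr)
 &= R\bigl(\sin ky\,\partial_x\Delta\hat\psi + k^2\sin ky\,\partial_x\hat\psi\bigr)
  = R\sin ky\,(\Delta + k^2)\,\partial_x\hat\psi .
\end{align*}
```

The two remaining linear terms of the Jacobian vanish because ψ0 does not depend on x.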
The solution of this equation is of the form
\[ \hat\psi(t, x, y) = e^{\rho t}\,\psi'(x, y), \]
which when substituted into Eq. (3), with the superscript prime omitted, defines the spectral problem
\[ \rho\,\Delta\psi - \Delta^2\psi + R\sin ky\,(\Delta + k^2)\,\partial_x\psi = 0. \tag{4} \]
Here the real part of the eigenvalue ρ, i.e. Re ρ, determines the linear stability of the perturbation. For the critical case Re ρ = 0, the perturbations $\hat\psi$ evolving initially from the corresponding eigenfunctions form neutral circles. The value R with respect to the case Re ρ(R) = 0 defines the critical Reynolds number of the flow. Moreover, the eigenfunction ψ with respect to an eigenvalue with Re ρ > 0, Re ρ < 0 or Re ρ = 0 is referred to as the unstable, stable or neutral mode, respectively. This linear stability problem with k = 1 was studied initially by Meshalkin and Sinai [26], who, in particular, derived the equivalent spectral problem, an ordinary difference equation and a continued fraction equation. A similar analysis is also adopted in this investigation. The nonlinear stability of the basic flow ψ0 was studied by Yudovich [33] and Marchioro [25] when k = 1 for any values of the Reynolds number and by Tran et al. [31] when k ≥ 1 for small values of the Reynolds number. Obukhov [27] focused on representation of the problem by physical experiments. Platt et al. [29] and Beyer and Benkadda [3] performed numerical experiments for the case k = 4 to examine its chaotic
flow evolved from an initial condition involving all possible Fourier modes. Frenkel [12] and Zhang and Frenkel [34] studied the linear stability of a basic steady-state flow and a basic self-oscillation flow, assumed to exist in an extended fluid motion problem. To develop the present investigation, we adopt a Fourier expansion to obtain the formal expression of the eigenfunction of (2, 4) in the form
\[ \sum_{l=0}^{\infty}\sum_{j=0}^{k-1}\sum_{n=-\infty}^{\infty} \xi_{l,j,n}\,e^{i(lx+jy+nky)}, \qquad i = \sqrt{-1}. \]
In particular, it is readily seen that the function sin ky appearing in the spectral problem (4) allows the general eigenfunctions to be expressed in the simpler form
\[ \psi = \psi_{l,j,k} = \sum_{n=-\infty}^{\infty} \xi_{l,j,n}\,e^{i(lx+jy+nky)}, \tag{5} \]
which are controlled by the integer vectors (l, j, k) for l ≥ 1, j = 0, ..., k − 1 and k ≥ 1. The instability around the basic flow arises if the Reynolds number R increases beyond a critical value $R_c$. This initial nontrivial dynamic behaviour is determined by the occurrence of the eigenvalues ρ = ρ(R) increasing across the imaginary axis of the complex plane, Re ρ(R) = 0, for some critical Reynolds number $R = R_c$. Note the spectral problem also depends on the control parameters l, j and k. We may therefore define such eigenvalues $\rho = \rho_{l,j,k}(R)$ and the critical Reynolds numbers $R_c = R_{l,j,k}$. To determine the eigenvalues $\rho = \rho_{l,j,k}(R)$ with the critical value $R = R_{l,j,k}$, we take an inner product of (4) and $\Delta\psi + k^2\psi$ with the eigenfunction ψ in the form of (5) to obtain
\[ 0 = -\int_0^{2\pi}\!\!\int_0^{2\pi} \Delta^2\psi\,(\Delta\bar\psi + k^2\bar\psi)\,dx\,dy = \sum_{n=-\infty}^{\infty} \bigl(l^2 + (j+nk)^2\bigr)\bigl(l^2 + (j+nk)^2 - k^2\bigr)\,|\xi_{l,j,n}|^2 \tag{6} \]
after an integration by parts and using the condition Re ρ = 0. This equation implies that the required eigenvalues and eigenfunctions are determined by the quantities $\beta_n \equiv l^2 + (nk+j)^2$ satisfying
\[ \beta_n - k^2 = l^2 + (nk+j)^2 - k^2 < 0 \qquad (l \ge 1,\ j = 0, \dots, k-1,\ k \ge 1), \tag{7} \]
of which the only possible choices are $\beta_0 - k^2 = l^2 + j^2 - k^2$ and $\beta_{-1} - k^2 = l^2 + (k-j)^2 - k^2$. This investigation suggests that the number of eigenvalues ρ(R) crossing the imaginary axis of the complex plane equals the number of $\beta_n$ satisfying (7). For the case of the integer vector (l, j, k) such that $(\beta_0 - k^2)(\beta_{-1} - k^2) = 0$, k ≥ 1, l ≥ 1, j = 0, ..., k − 1, the nonexistence of the eigenfunction is trivial. The proof is derived by following the argument of Meshalkin and Sinai [26].
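The integer vectors for which both β0 − k² and β−1 − k² are negative can be listed by direct enumeration. The following is a minimal sketch, assuming the symmetric range 1 ≤ j ≤ k/2; the function name `index_set` and the cut-off argument `k_max` are illustrative and do not appear in the paper.

```python
def index_set(k_max):
    """List all (l, j, k) with k <= k_max, 1 <= j <= k/2 and l >= 1 such that
    beta_0 = l^2 + j^2 < k^2 and beta_{-1} = l^2 + (k - j)^2 < k^2."""
    vectors = []
    for k in range(1, k_max + 1):
        for j in range(1, k // 2 + 1):       # 1 <= j <= k/2
            l = 1
            # both beta_0 - k^2 and beta_{-1} - k^2 must be negative
            while l * l + j * j < k * k and l * l + (k - j) ** 2 < k * k:
                vectors.append((l, j, k))
                l += 1
    return vectors


if __name__ == "__main__":
    for vec in index_set(4):
        print(vec)
```

For k ≤ 8 this enumeration gives 57 vectors, which agrees with the 52 entries of Table 1 together with the five entries of Table 2 that have k ≤ 8.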
Fig. 1. The profile of two classes of eigenvalues of (3,4) for (l, j, k) ∈ I with 1 ≤ j ≤ k/2 or −β−1 ≤ −β0. (i) The eigenvalues on the (R, ρ) plane for a typical example of the case (l, j, k) satisfying (10). (ii) The eigenvalues on the complex plane for a typical example of the case (l, j, k) satisfying (11)
For the case of the integer vector (l, j, k) satisfying
\[ (\beta_0 - k^2)(\beta_{-1} - k^2) < 0, \qquad k \ge 1,\ l \ge 1,\ j = 0, \dots, k-1, \tag{8} \]
there is a unique real eigenfunction as shown by Yudovich [33] and Frenkel [12]. For the remaining integer vectors (l, j, k) lying in the set
\[ I = \bigl\{(l, j, k);\ k \ge 1,\ l \ge 1,\ j = 0, \dots, k-1,\ \beta_0 - k^2 < 0,\ \beta_{-1} - k^2 < 0\bigr\}, \tag{9} \]
as discussed by Frenkel [12], it is difficult to determine the existence of the eigenfunctions (5) together with their eigenvalues. In fact, the general profile of such eigenvalues $\rho = \rho^{(0)}_{l,j,k}(R),\ \rho^{(1)}_{l,j,k}(R)$ for R > 0 is demonstrated in Fig. 1. They start from $-\beta_0$ and $-\beta_{-1}$ and either traverse the imaginary axis with oscillatory neutral modes generated at a single critical value $R_{l,j,k}$ (see Fig. 1(i)) or increase across the origin along the real axis with the creation of real neutral modes at the two critical values $R^{(0)}_{l,j,k}$ and $R^{(1)}_{l,j,k}$ (see Fig. 1(ii)). Without loss of generality, we suppose that $-\beta_{-1} \le -\beta_0$ or 1 ≤ j ≤ k/2 in Fig. 1. Thus the control parameters (l, j, k) satisfying (9) can be divided into two classes. The first gives the occurrence of a pair of conjugate oscillatory neutral modes as illustrated in Fig. 1(i), whereas the other is connected to the existence of real neutral modes as shown in Fig. 1(ii). This spectral behaviour was discussed by Chen [4] when k = 6. The rigorous analysis of the present investigation provides the evidence required to classify these two classes of control parameters.
Theorem 1. (i) Let (l, j, k) ∈ I satisfy the condition
\[ \frac{\beta_{-2}^2\,(k^2 - \beta_0)}{\beta_0^2\,(\beta_{-2} - k^2)} \ge 1 \quad\text{and}\quad \frac{\beta_1^2\,(k^2 - \beta_{-1})}{\beta_{-1}^2\,(\beta_1 - k^2)} \ge 1. \tag{10} \]
Then ρ = 0 is not an eigenvalue of (13) for any R > 0.
(ii) Let (l, j, k) ∈ I, and assume that there exists a constant c > 0 such that
\[ \frac{\beta_1^2\,(k^2 - \beta_{-1})}{\beta_{-1}^2\,(\beta_1 - k^2)} \le \Bigl(\frac{c}{c+1}\Bigr)^2, \qquad \frac{\beta_{-2}^2\,(k^2 - \beta_0)}{\beta_0^2\,(\beta_{-2} - k^2)} \ge c^2 + (c+1)^2, \qquad\text{when } 0 \le j \le k/2, \tag{11} \]
and
\[ \frac{\beta_{-2}^2\,(k^2 - \beta_0)}{\beta_0^2\,(\beta_{-2} - k^2)} \le \Bigl(\frac{c}{c+1}\Bigr)^2, \qquad \frac{\beta_1^2\,(k^2 - \beta_{-1})}{\beta_{-1}^2\,(\beta_1 - k^2)} \ge c^2 + (c+1)^2, \qquad\text{when } k/2 \le j \le k-1. \tag{12} \]
Then there exist two different critical values $R^{(0)}_{l,j,k}$ and $R^{(1)}_{l,j,k} > 0$ such that ρ(R) = 0 is an eigenvalue of (13) when $R = R^{(0)}_{l,j,k}$ and $R^{(1)}_{l,j,k}$ respectively.
Assertion (i) suggests that the eigenvalue $\rho_{l,j,k}(R)$ with the control parameters (l, j, k) ∈ I satisfying (10) traverses the imaginary axis away from the origin and thus the corresponding eigenfunction is oscillatory.
Proof. The derivation of this proof is based on an equivalent formulation of the spectral problem and a continued fraction equation in the spirit of Meshalkin and Sinai [26]. The substitution of the Fourier expansion expressed by (5) with $\xi_n = \xi_{l,j,n}$ into the spectral problem presented by (4) allows an equivalent formulation in the form of the ordinary difference equation
\[ 2\beta_n(\beta_n + \rho)\,\xi_n + Rl(\beta_{n-1} - k^2)\,\xi_{n-1} - Rl(\beta_{n+1} - k^2)\,\xi_{n+1} = 0, \qquad n \in Z, \tag{13} \]
or
\[ \frac{2\beta_n(\beta_n + \rho)}{Rl(\beta_n - k^2)} + \frac{(\beta_{n-1} - k^2)\,\xi_{n-1}}{(\beta_n - k^2)\,\xi_n} - \frac{(\beta_{n+1} - k^2)\,\xi_{n+1}}{(\beta_n - k^2)\,\xi_n} = 0, \qquad n \in Z, \tag{14} \]
where Z denotes the integer set. That is,
\[ \frac{(\beta_{\pm n} - k^2)\,\xi_{\pm n}}{(\beta_{\pm(n-1)} - k^2)\,\xi_{\pm(n-1)}} = \cfrac{\mp 1}{\dfrac{2\beta_{\pm n}(\beta_{\pm n} + \rho)}{Rl(\beta_{\pm n} - k^2)} \mp \dfrac{(\beta_{\pm(n+1)} - k^2)\,\xi_{\pm(n+1)}}{(\beta_{\pm n} - k^2)\,\xi_{\pm n}}} = \cfrac{\mp 1}{\dfrac{2\beta_{\pm n}(\beta_{\pm n} + \rho)}{Rl(\beta_{\pm n} - k^2)} + \cfrac{1}{\dfrac{2\beta_{\pm(n+1)}(\beta_{\pm(n+1)} + \rho)}{Rl(\beta_{\pm(n+1)} - k^2)} + \cfrac{1}{\ddots}}}, \qquad n \ge 1, \tag{15} \]
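due to the convergence of the continued fraction (see, for example, Wall [32, Theorem 30.1] and Khinchin [16, Theorem 10]). The substitution of this expression (with n = −1 and 1) into the zeroth equation of (14) gives
\[ \left(\frac{2\beta_0(\beta_0 + \rho)}{Rl(\beta_0 - k^2)} + \cfrac{1}{\dfrac{2\beta_1(\beta_1 + \rho)}{Rl(\beta_1 - k^2)} + \cfrac{1}{\ddots}}\right) \left(\frac{2\beta_{-1}(\beta_{-1} + \rho)}{Rl(\beta_{-1} - k^2)} + \cfrac{1}{\dfrac{2\beta_{-2}(\beta_{-2} + \rho)}{Rl(\beta_{-2} - k^2)} + \cfrac{1}{\ddots}}\right) = -1. \tag{16} \]
Thus the spectral problem (4, 5) is equivalent to the difference equation (13), which is equivalent to the continued fraction equation (16). To prove Assertion (i), we assume ρ = 0 in (16) to obtain
\[ \left(\frac{2\beta_0^2}{Rl(\beta_0 - k^2)} + \cfrac{1}{\dfrac{2\beta_1^2}{Rl(\beta_1 - k^2)} + \cfrac{1}{\ddots}}\right) \left(\frac{2\beta_{-1}^2}{Rl(\beta_{-1} - k^2)} + \cfrac{1}{\dfrac{2\beta_{-2}^2}{Rl(\beta_{-2} - k^2)} + \cfrac{1}{\ddots}}\right) = -1. \tag{17} \]
Let Φ(R) denote the left-hand side of this equation. A contradiction arises if we can prove that Φ(R) > −1. Indeed, since Φ(R) = −1 < 0, the two factors in (17) have opposite signs; let us first assume that
\[ \frac{2\beta_0^2}{Rl(\beta_0 - k^2)} + \cfrac{1}{\dfrac{2\beta_1^2}{Rl(\beta_1 - k^2)} + \cfrac{1}{\dfrac{2\beta_2^2}{Rl(\beta_2 - k^2)} + \cfrac{1}{\ddots}}} > 0. \tag{18} \]
Taking (10,17), $\beta_0 - k^2 < 0$ and $\beta_{-1} - k^2 < 0$ into account, we have
\[ \begin{aligned} \Phi(R) &> \left(\frac{2\beta_0^2}{Rl(\beta_0 - k^2)} + \cfrac{1}{\dfrac{2\beta_1^2}{Rl(\beta_1 - k^2)} + \cfrac{1}{\dfrac{2\beta_2^2}{Rl(\beta_2 - k^2)} + \cfrac{1}{\ddots}}}\right) \frac{2\beta_{-1}^2}{Rl(\beta_{-1} - k^2)} \\ &> \left(\frac{2\beta_0^2}{Rl(\beta_0 - k^2)} + \cfrac{1}{\dfrac{2\beta_1^2}{Rl(\beta_1 - k^2)}}\right) \frac{2\beta_{-1}^2}{Rl(\beta_{-1} - k^2)} \\ &> -\,\frac{\dfrac{\beta_{-1}^2}{k^2 - \beta_{-1}}}{\dfrac{\beta_1^2}{\beta_1 - k^2}} \ \ge\ -1. \end{aligned} \]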
This contradicts the finding of (17) and implies that (18) is not true. Thus (17) gives
\[ \frac{2\beta_{-1}^2}{Rl(\beta_{-1} - k^2)} + \cfrac{1}{\dfrac{2\beta_{-2}^2}{Rl(\beta_{-2} - k^2)} + \cfrac{1}{\dfrac{2\beta_{-3}^2}{Rl(\beta_{-3} - k^2)} + \cfrac{1}{\ddots}}} > 0. \]
In this case, we have
\[ \begin{aligned} \Phi(R) &> \frac{2\beta_0^2}{Rl(\beta_0 - k^2)} \left(\frac{2\beta_{-1}^2}{Rl(\beta_{-1} - k^2)} + \cfrac{1}{\dfrac{2\beta_{-2}^2}{Rl(\beta_{-2} - k^2)} + \cfrac{1}{\dfrac{2\beta_{-3}^2}{Rl(\beta_{-3} - k^2)} + \cfrac{1}{\ddots}}}\right) \\ &> -\,\cfrac{\dfrac{2\beta_0^2}{Rl(k^2 - \beta_0)}}{\dfrac{2\beta_{-2}^2}{Rl(\beta_{-2} - k^2)} + \cfrac{1}{\dfrac{2\beta_{-3}^2}{Rl(\beta_{-3} - k^2)} + \cfrac{1}{\ddots}}} \\ &> -\,\frac{\dfrac{\beta_0^2}{k^2 - \beta_0}}{\dfrac{\beta_{-2}^2}{\beta_{-2} - k^2}} \ \ge\ -1 \end{aligned} \]
by taking into account (10). This gives Φ(R) > −1 again. We thus obtain the desired assertion.
To prove Assertion (ii), we use (17), the equivalent formulation of the spectral problem of (2, 4), to show the existence of two different critical values $R^{(1)}_{l,j,k},\ R^{(0)}_{l,j,k} > 0$ such that (17) admits two solutions $R = R^{(0)}_{l,j,k}$ and $R^{(1)}_{l,j,k}$. Indeed, for Φ(R) denoting the left-hand side of (17), we see that
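\[ \begin{aligned} \Phi(R) &> \frac{2\beta_0^2}{Rl(\beta_0 - k^2)}\cdot\frac{2\beta_{-1}^2}{Rl(\beta_{-1} - k^2)} - \cfrac{\dfrac{2\beta_{-1}^2}{Rl\,|\beta_{-1} - k^2|}}{\dfrac{2\beta_1^2}{Rl(\beta_1 - k^2)} + \cfrac{1}{\dfrac{2\beta_2^2}{Rl(\beta_2 - k^2)} + \cfrac{1}{\ddots}}} - \cfrac{\dfrac{2\beta_0^2}{Rl\,|\beta_0 - k^2|}}{\dfrac{2\beta_{-2}^2}{Rl(\beta_{-2} - k^2)} + \cfrac{1}{\dfrac{2\beta_{-3}^2}{Rl(\beta_{-3} - k^2)} + \cfrac{1}{\ddots}}} \\ &> \frac{2\beta_0^2}{Rl(\beta_0 - k^2)}\cdot\frac{2\beta_{-1}^2}{Rl(\beta_{-1} - k^2)} - \frac{\dfrac{\beta_{-1}^2}{|\beta_{-1} - k^2|}}{\dfrac{\beta_1^2}{\beta_1 - k^2}} - \frac{\dfrac{\beta_0^2}{|\beta_0 - k^2|}}{\dfrac{\beta_{-2}^2}{\beta_{-2} - k^2}} \ \to\ \infty \quad\text{as } R \to 0, \end{aligned} \]
and
\[ \Phi(R) > -\,\cfrac{\dfrac{2\beta_{-1}^2}{l\,|\beta_{-1} - k^2|}}{\dfrac{2\beta_1^2}{l(\beta_1 - k^2)} + \cfrac{1}{\dfrac{2\beta_2^2}{R^2 l(\beta_2 - k^2)} + \cfrac{1}{\dfrac{2\beta_3^2}{l(\beta_3 - k^2)} + \cfrac{1}{\dfrac{2\beta_4^2}{R^2 l(\beta_4 - k^2)} + \cfrac{1}{\ddots}}}}} - \cfrac{\dfrac{2\beta_0^2}{l\,|\beta_0 - k^2|}}{\dfrac{2\beta_{-2}^2}{l(\beta_{-2} - k^2)} + \cfrac{1}{\dfrac{2\beta_{-3}^2}{R^2 l(\beta_{-3} - k^2)} + \cfrac{1}{\dfrac{2\beta_{-4}^2}{l(\beta_{-4} - k^2)} + \cfrac{1}{\dfrac{2\beta_{-5}^2}{R^2 l(\beta_{-5} - k^2)} + \cfrac{1}{\ddots}}}}} \ \to\ 0 \quad\text{as } R \to \infty. \]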
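Thus we have Φ(R) > −1 for R → ∞ or R → 0. Hence the continuity of Φ(R) ensures the existence of two different critical values $R^{(1)}_{l,j,k}$ and $R^{(0)}_{l,j,k}$ if we can find a value $R = R^{*}_{l,j,k} > 0$ such that Φ(R) < −1 or $-R^2\Phi(R) > R^2$. To do so, it suffices to consider the case
\[ (l, j, k) \in I, \qquad 1 \le j \le k/2, \tag{19} \]
since $\beta_{-n-1}$ becomes $\beta_n$ and (12) becomes (11) if we replace k − j by j. We see that $-R^2\Phi(R) > R^2$ is valid if there exists a constant c > 0 such that
\[ -\frac{2\beta_0^2}{l(k^2 - \beta_0)} + \cfrac{1}{\dfrac{2\beta_1^2}{R^2 l(\beta_1 - k^2)} + \cfrac{1}{\dfrac{2\beta_2^2}{l(\beta_2 - k^2)} + \cfrac{1}{\ddots}}} > c\,\frac{2\beta_0^2}{l(k^2 - \beta_0)} \]
and
\[ \frac{2\beta_{-1}^2}{l(k^2 - \beta_{-1})} - \cfrac{1}{\dfrac{2\beta_{-2}^2}{R^2 l(\beta_{-2} - k^2)} + \cfrac{1}{\dfrac{2\beta_{-3}^2}{l(\beta_{-3} - k^2)} + \cfrac{1}{\ddots}}} > \frac{R^2}{c\,\dfrac{2\beta_0^2}{l(k^2 - \beta_0)}} \tag{20} \]
hold true. These inequalities are true if
\[ -\frac{2\beta_0^2}{l(k^2 - \beta_0)} + \cfrac{1}{\dfrac{2\beta_1^2}{R^2 l(\beta_1 - k^2)} + \cfrac{1}{\dfrac{2\beta_2^2}{l(\beta_2 - k^2)}}} \ge c\,\frac{2\beta_0^2}{l(k^2 - \beta_0)}, \tag{21} \]
and
\[ \frac{2\beta_{-1}^2}{l(k^2 - \beta_{-1})} - \cfrac{1}{\dfrac{2\beta_{-2}^2}{R^2 l(\beta_{-2} - k^2)}} \ge \frac{R^2}{c\,\dfrac{2\beta_0^2}{l(k^2 - \beta_0)}}. \tag{22} \]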
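Equation (21) can be written in the form
\[ R^2 \ge \frac{4(c+1)\,\dfrac{\beta_0^2\,\beta_1^2\,\beta_2^2}{l^2(k^2 - \beta_0)(\beta_1 - k^2)(\beta_2 - k^2)}}{\dfrac{\beta_2^2}{\beta_2 - k^2} - (c+1)\dfrac{\beta_0^2}{k^2 - \beta_0}}, \tag{23} \]
provided that
\[ \frac{\beta_2^2}{\beta_2 - k^2} > (c+1)\,\frac{\beta_0^2}{k^2 - \beta_0}, \tag{24} \]
and (22) is the same as the inequality
\[ R^2 \le \frac{4c\,\dfrac{\beta_{-1}^2\,\beta_0^2\,\beta_{-2}^2}{l^2(k^2 - \beta_{-1})(k^2 - \beta_0)(\beta_{-2} - k^2)}}{\dfrac{\beta_{-2}^2}{\beta_{-2} - k^2} + c\,\dfrac{\beta_0^2}{k^2 - \beta_0}}. \tag{25} \]
Thus the numbers R and c exist if (24) holds true and the right-hand side of (23) is bounded by the right-hand side of (25). That is,
\[ (c+1)\,\frac{\dfrac{\beta_1^2\,\beta_2^2}{(\beta_1 - k^2)(\beta_2 - k^2)}}{\dfrac{\beta_2^2}{\beta_2 - k^2} - (c+1)\dfrac{\beta_0^2}{k^2 - \beta_0}} \le c\,\frac{\dfrac{\beta_{-1}^2\,\beta_{-2}^2}{(k^2 - \beta_{-1})(\beta_{-2} - k^2)}}{\dfrac{\beta_{-2}^2}{\beta_{-2} - k^2} + c\,\dfrac{\beta_0^2}{k^2 - \beta_0}}. \tag{26} \]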
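Since the function $f(s) = s^2/(s - k^2)$ increases when $s > 2k^2$ and $\beta_2 > \beta_{-2} = l^2 + (2k - j)^2 > k^2 + 2(k - j)k > 2k^2$ due to 0 ≤ j ≤ k/2 given by (19), we have
\[ \frac{\beta_2^2}{\beta_2 - k^2} > \frac{\beta_{-2}^2}{\beta_{-2} - k^2}. \tag{27} \]
This implies that
\[ \frac{\dfrac{\beta_2^2}{\beta_2 - k^2}}{\dfrac{\beta_2^2}{\beta_2 - k^2} - (c+1)\dfrac{\beta_0^2}{k^2 - \beta_0}} < \frac{\dfrac{\beta_{-2}^2}{\beta_{-2} - k^2}}{\dfrac{\beta_{-2}^2}{\beta_{-2} - k^2} - (c+1)\dfrac{\beta_0^2}{k^2 - \beta_0}}, \]
provided that
\[ \frac{\beta_{-2}^2}{\beta_{-2} - k^2} > (c+1)\,\frac{\beta_0^2}{k^2 - \beta_0}, \tag{28} \]
which implies (24). Thus (26) is valid if the inequality
\[ \frac{(c+1)\,\dfrac{\beta_1^2}{\beta_1 - k^2}}{\dfrac{\beta_{-2}^2}{\beta_{-2} - k^2} - (c+1)\dfrac{\beta_0^2}{k^2 - \beta_0}} \le \frac{c\,\dfrac{\beta_{-1}^2}{k^2 - \beta_{-1}}}{\dfrac{\beta_{-2}^2}{\beta_{-2} - k^2} + c\,\dfrac{\beta_0^2}{k^2 - \beta_0}}, \]
or
\[ \frac{\dfrac{\beta_{-2}^2}{\beta_{-2} - k^2} + c\,\dfrac{\beta_0^2}{k^2 - \beta_0}}{\dfrac{\beta_{-2}^2}{\beta_{-2} - k^2} - (c+1)\dfrac{\beta_0^2}{k^2 - \beta_0}} \le \frac{c\,\dfrac{\beta_{-1}^2}{k^2 - \beta_{-1}}}{(c+1)\,\dfrac{\beta_1^2}{\beta_1 - k^2}} \]
holds. This remains valid whenever we have
\[ \frac{\dfrac{\beta_{-2}^2}{\beta_{-2} - k^2} + c\,\dfrac{\beta_0^2}{k^2 - \beta_0}}{\dfrac{\beta_{-2}^2}{\beta_{-2} - k^2} - (c+1)\dfrac{\beta_0^2}{k^2 - \beta_0}} \le \frac{c+1}{c} \le \frac{c\,\dfrac{\beta_{-1}^2}{k^2 - \beta_{-1}}}{(c+1)\,\dfrac{\beta_1^2}{\beta_1 - k^2}}, \]
giving
\[ \frac{\beta_{-2}^2}{\beta_{-2} - k^2} \ge \bigl(c^2 + (c+1)^2\bigr)\frac{\beta_0^2}{k^2 - \beta_0} > (c+1)\,\frac{\beta_0^2}{k^2 - \beta_0} \]
and
\[ \frac{\beta_1^2}{\beta_1 - k^2} \le \Bigl(\frac{c}{c+1}\Bigr)^2 \frac{\beta_{-1}^2}{k^2 - \beta_{-1}}. \]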
Thus we obtain the condition (28) or (24), and thus the presupposed assumption expressed by (11). Hence we have some constant $R^{*}_{l,j,k} > 0$ such that $\Phi(R^{*}_{l,j,k}) < -1$. By Wall [32, Theorem 28.1], Φ(R) is an analytic function of R. Thus (16) admits two solutions $0 < R^{(0)}_{l,j,k} < R^{*}_{l,j,k} < R^{(1)}_{l,j,k} < \infty$ such that $\Phi(R^{(0)}_{l,j,k}) = \Phi(R^{(1)}_{l,j,k}) = -1$. The proof is complete.
3. Numerical Experiments for the Spectral Problem
Theorem 2.1 allows detection of critical Reynolds numbers to predict whether the associated eigenvalue on the imaginary axis is either zero or non-zero or whether the associated eigenfunction is either oscillatory or not. In this section, we present a selection of numerical computations to show the validity of Theorem 2.1 and to aid understanding of the underlying mechanisms associated with the spectral problem. Once validity is proven, the numerical computation of the spectral problem is simple. From the described analysis, for (l, j, k) ∈ I it follows that $\rho = \rho_{l,j,k}$ is an eigenvalue for $\rho > -\beta_0$ if and only if ρ satisfies (16) with respect to some value R > 0. We represent the left-hand side of (16) by Ψ(ρ, R) − 1. Thus ρ becomes a solution of the nonlinear equation Ψ(ρ, R) = 0. To calculate a value of ρ(R) numerically, we adopt the secant method (see, for example, [18]), i.e. a modified Newton method, which gives
\[ \rho_{n+1} = \rho_n - \frac{(\rho_n - \rho_{n-1})\,\Psi(\rho_n, R)}{\Psi(\rho_n, R) - \Psi(\rho_{n-1}, R)} \]
for every R > 0. Numerical experiments for (l, j, k) ∈ I show the existence of two eigenvalues $\rho^{(1)}_{l,j,k}(R)$ and $\rho^{(0)}_{l,j,k}(R)$ such that
\[ \lim_{R\to 0} \rho^{(1)}_{l,j,k}(R) = -\beta_{-1}, \qquad \lim_{R\to 0} \rho^{(0)}_{l,j,k}(R) = -\beta_0. \]
Schematically, these are either in the form of Fig. 1(i) for a single critical value or in the form of Fig. 1(ii) for two critical values. Since $\beta_n$ becomes $\beta_{-n-1}$ and (17) remains the same when the integer j is substituted by k − j, this implies
\[ \rho_{l,j,k} = \rho_{l,k-j,k} \quad\text{and}\quad R_{l,j,k} = R_{l,k-j,k}. \tag{29} \]
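The secant iteration described above can be sketched numerically as follows. This is a minimal illustration rather than the authors' code: the function names `psi` and `secant`, the truncation of both continued fractions in (16) after `depth` levels, and the starting guesses are assumptions made here. The function `psi` returns the left-hand side of (16) plus one, so its zeros are the eigenvalues, and complex arithmetic allows non-real ρ.

```python
def psi(rho, R, l, j, k, depth=60):
    """Left-hand side of (16) plus one, with both continued fractions
    truncated after `depth` levels (the truncation depth is an assumption)."""
    k2 = k * k

    def a(n):
        b = l * l + (n * k + j) ** 2                 # beta_n
        return 2 * b * (b + rho) / (R * l * (b - k2))

    def tail(sign):
        # sign=+1: 1/(a_1 + 1/(a_2 + ...)); sign=-1: 1/(a_{-2} + 1/(a_{-3} + ...))
        first = 1 if sign > 0 else 2
        t = a(sign * (first + depth))
        for m in range(first + depth - 1, first - 1, -1):
            t = a(sign * m) + 1 / t                  # evaluate bottom-up
        return 1 / t

    return (a(0) + tail(+1)) * (a(-1) + tail(-1)) + 1


def secant(f, x0, x1, tol=1e-10, max_iter=50):
    """Secant iteration x_{n+1} = x_n - (x_n - x_{n-1}) f(x_n) / (f(x_n) - f(x_{n-1}))."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if f1 == f0:
            break
        x0, x1, f0 = x1, x1 - (x1 - x0) * f1 / (f1 - f0), f1
        f1 = f(x1)
        if abs(f1) < tol:
            break
    return x1


if __name__ == "__main__":
    # (l, j, k) = (1, 1, 2) at the critical value R = 13.51 listed in Table 1
    rho = secant(lambda r: psi(r, 13.51, 1, 1, 2), 5.5j, 6.0j)
    print(rho)
```

Started from two purely imaginary guesses near the tabulated value, the iteration returns a root close to the entry ρ = 5.73i of Table 1 for (l, j, k) = (1, 1, 2). The same function with ρ = 0 is close to zero at the critical values R = 19.20 and R = 182.08, which correspond to the vector (3, 1, 6) of Table 2.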
Thus the spectral problem for (l, j, k) ∈ I with k/2 ≤ j ≤ k − 1 is completely the same as the spectral problem for (l, j, k) ∈ I with 1 ≤ j ≤ k/2. Let us for discussion purposes examine the spectral problem with respect to the integer vectors (l, j, k) ∈ I with 1 ≤ j ≤ k/2 and k ≤ 12, or (l, j, k) ∈ {(l, j, k); l ≥ 1, 1 ≤ j ≤ k/2, l² + (k − j)² < k², k ≤ 12}. This set contains 178 integer vectors. Amongst them, 157 integer vectors (l, j, k) give rise to the two oscillatory eigenvalues transversal across the imaginary axis in the form of Fig. 1(i), and the other 21 integer vectors (l, j, k) give rise to the real eigenvalues in the form of Fig. 1(ii). For convenience of discussion, we list all these 21 integer vectors together with the two critical Reynolds numbers in Table 2, and display all the integer vectors with respect to the oscillatory eigenvalues for k ≤ 8 in Table 1. Each one of the 157 integer vectors gives rise to the existence of a single critical value of the Reynolds number $R_{l,j,k}$ and a pair of complex conjugate eigenvalues $\rho_{l,j,k}$ and $\bar\rho_{l,j,k}$ when these eigenvalues reach the imaginary axis of the complex plane as is illustrated in Fig. 1(i). It is noted that the integer vectors listed in Table 1, with the exception of (l, j, k) = (2, 1, 3), (4, 2, 6), (4, 2, 7), (5, 3, 6) and (5, 3, 7), satisfy the condition presented by (10). However, Table 1 also shows the oscillatory behaviour of the eigenvalues with respect to the integer vectors (l, j, k) = (2, 1, 3), (4, 2, 6), (4, 2, 7), (5, 3, 6) and (5, 3, 7). Hence, although (10) is simple, it is not very accurate in separating the integer vectors with oscillatory spectral behaviour from those with real spectral behaviour around the critical values of the Reynolds number. We thus introduce inequalities (11,12), which are more accurate in determining the spectral behaviour. Table 2 lists all the integer vectors (l, j, k) ∈ I with k ≤ 12 and 1 ≤ j ≤ k/2 of which the associated eigenvalues increase across the imaginary axis through the origin 0. Table 2 shows the existence of the constant c of (11) for 20 of these integer vectors, the exception being (l, j, k) = (6, 3, 8). Although there is no constant c satisfying (11) for (l, j, k) = (6, 3, 8), there exist two different real solutions $(\rho, R) = (0, R^{(0)}_{6,3,8}) = (0, 83.12)$ and $(0, R^{(1)}_{6,3,8}) = (0, 168.66)$ of the spectral problem (3, 4) or (16). Thus for (l, j, k) ∈ {(l, j, k) ∈ I; k ≤ 12, 1 ≤ j ≤ k/2}, that is, the described 178 integer vectors (l, j, k), with the exception of (l, j, k) = (6, 3, 8): if the integer vector (l, j, k) does not satisfy (11) for any constant c > 0, then the associated two eigenvalues increase across the imaginary axis away from the origin 0 as is shown in Fig. 1(i), and if (l, j, k) satisfies (11) for some constant c > 0, then the two eigenvalues increase across the imaginary axis through the origin 0 of the complex plane at the two different critical values $R^{(0)}_{l,j,k}$ and $R^{(1)}_{l,j,k}$ as illustrated by the profile displayed in Fig. 1(ii).

Table 1. All the integer vectors (l, j, k) ∈ I with 1 ≤ j ≤ k/2 and k ≤ 8, for which the eigenvalues are $\rho = \pm\rho_{l,j,k}\,i$ around the single critical value $R_{l,j,k}$ of the Reynolds number

l   j   k   $-\frac{\beta_{-2}^2(k^2-\beta_0)}{\beta_0^2(k^2-\beta_{-2})}$   $-\frac{\beta_1^2(k^2-\beta_{-1})}{\beta_{-1}^2(k^2-\beta_1)}$   $\rho_{l,j,k}(R_{l,j,k})$   $R_{l,j,k}$
1   1   2      8.33     8.33    ±5.73i     13.51
1   1   3     69.59     5.88    ±14.31i    20.92
2   1   3      6.73     0.57    ±4.53i     28.65
1   1   4    257.35     4.00    ±25.44i    32.65
1   2   4     28.68    33.33    ±30.79i    31.63
2   1   4     33.40     1.15    ±11.48i    28.56
2   2   4      8.33     8.33    ±22.91i    27.01
3   2   4      1.24     1.23    ±16.18i    53.67
1   1   5    678.30     3.13    ±39.48i    47.98
1   2   5     84.50    14.29    ±49.45i    46.19
2   1   5     96.33     1.33    ±19.08i    35.23
2   2   5     28.56     7.14    ±39.01i    33.08
3   2   5      7.88     2.22    ±27.97i    39.02
1   1   6   1471.09     2.63    ±56.55i    66.80
1   2   6    194.60    10.00    ±71.52i    64.17
1   3   6     38.01    33.33    ±74.16i    63.79
2   1   6    217.70     1.37    ±28.00i    44.66
2   2   6     69.59     5.88    ±57.24i    41.84
2   3   6     20.07    20.00    ±64.71i    42.00
3   2   6     22.15     2.56    ±41.95i    41.52
3   3   6      8.33     8.33    ±51.54i    40.51
4   2   6      6.73     0.57    ±18.12i    57.29
4   3   6      2.71     2.70    ±40.47i    54.66
5   3   6      0.28     0.27    ±35.68i   254.17
1   1   7   2806.41     2.33    ±76.69i    89.09
1   2   7    385.46     7.14    ±97.30i    85.46
1   3   7     79.52    20.00    ±102.04i   84.81
2   1   7    424.80     1.37    ±38.36i    56.24
2   2   7    141.74     4.76    ±78.07i    52.63
2   3   7     43.79    14.29    ±89.81i    51.45
3   2   7     47.95     2.56    ±57.86i    47.53
3   3   7     19.96     7.69    ±75.25i    46.04
4   2   7     16.72     0.93    ±32.25i    53.94
4   3   7      8.19     3.33    ±60.71i    51.63
5   3   7      2.85     0.98    ±45.56i    77.60
1   1   8   4886.90     2.08    ±99.89i   114.82
1   2   8    688.64     5.56    ±126.91i  110.06
1   3   8    147.23    14.29    ±133.66i  109.12
1   4   8     42.21    50.00    ±135.23i  108.97
2   1   8    750.07     1.35    ±50.22i    69.81
2   2   8    257.35     4.00    ±101.72i   65.28
2   3   8     82.86    11.11    ±118.76i   63.66
2   4   8     28.68    33.33    ±123.15i   63.25
3   2   8     89.94     2.50    ±75.76i    55.52
3   3   8     39.46     6.67    ±100.82i   53.62
3   4   8     16.41    16.67    ±107.65i   53.07
4   2   8     33.40     1.15    ±45.92i    57.10
4   3   8     17.65     3.57    ±82.43i    54.75
4   4   8      8.33     8.33    ±91.62i    54.01
5   3   8      7.51     1.45    ±62.34i    66.40
5   4   8      3.72     3.70    ±76.52i    65.06
6   4   8      1.24     1.23    ±64.70i   107.32
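The two ratios that enter (10) and (11), and the question of whether some c > 0 satisfies (11), can be evaluated directly from the definition of βn. The sketch below assumes 1 ≤ j ≤ k/2 and (l, j, k) ∈ I; the function names are illustrative. Since both bounds in (11) are monotone in c, the admissible values of c form an interval, which the last function returns in closed form.

```python
import math


def beta(n, l, j, k):
    """beta_n = l^2 + (n k + j)^2."""
    return l * l + (n * k + j) ** 2


def ratios(l, j, k):
    """The two quantities compared with 1 in (10), for (l, j, k) in I."""
    k2 = k * k
    b0, b1 = beta(0, l, j, k), beta(1, l, j, k)
    bm1, bm2 = beta(-1, l, j, k), beta(-2, l, j, k)
    first = bm2 ** 2 * (k2 - b0) / (b0 ** 2 * (bm2 - k2))
    second = b1 ** 2 * (k2 - bm1) / (bm1 ** 2 * (b1 - k2))
    return first, second


def satisfies_10(l, j, k):
    first, second = ratios(l, j, k)
    return first >= 1 and second >= 1


def admissible_c_interval(l, j, k):
    """Interval (c_min, c_max) of constants c > 0 satisfying (11), or None."""
    first, second = ratios(l, j, k)
    if second >= 1 or first <= 1:
        return None
    s = math.sqrt(second)
    c_min = s / (1 - s)                              # from second <= (c/(c+1))^2
    c_max = (-1 + math.sqrt(2 * first - 1)) / 2      # from first >= c^2 + (c+1)^2
    return (c_min, c_max) if c_min <= c_max else None


if __name__ == "__main__":
    print(ratios(3, 1, 6), admissible_c_interval(3, 1, 6))
    print(ratios(6, 3, 8), admissible_c_interval(6, 3, 8))
```

For (3, 1, 6) the interval contains the value c = 1.5 quoted in Table 2, while for (6, 3, 8) no admissible c exists, in agreement with the entry "none" in Table 2.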
Table 2. All the integer vectors (l, j, k) ∈ I with 1 ≤ j ≤ k/2 and k ≤ 12, for which the eigenvalue ρ = 0 is associated with the two critical values $R^{(0)}_{l,j,k}$ and $R^{(1)}_{l,j,k}$ of the Reynolds number

l   j   k    $-\frac{\beta_{-2}^2(k^2-\beta_0)}{\beta_0^2(k^2-\beta_{-2})}$   $-\frac{\beta_1^2(k^2-\beta_{-1})}{\beta_{-1}^2(k^2-\beta_1)}$   $c^2+(c+1)^2$   $\bigl(\frac{c}{c+1}\bigr)^2$   c      $R^{(0)}_{l,j,k}$   $R^{(1)}_{l,j,k}$
3   1   6     46.74    0.265      8.50    0.360    1.5     19.20     182.08
3   1   7     95.79    0.439     13.00    0.444    2.0     21.62     148.77
3   1   8    173.93    0.556     25.00    0.563    3.0     24.46     150.90
5   2   8     12.95    0.207      5.00    0.250    1.0     37.04     237.14
6   3   8      2.80    0.214      −−      −−       none    83.12     168.66
3   1   9    290.56    0.637     41.00    0.641    4.0     27.54     162.98
4   1   9     91.97    0.060      5.00    0.250    1.0     21.62    1576.40
5   2   9     24.41    0.418     13.00    0.444    2.0     40.76     139.96
3   1   10   456.33    0.694     85.00    0.735    6.0     30.74     180.24
4   1   10   147.36    0.162      5.00    0.250    1.0     23.06     744.54
5   2   10    41.30    0.575     41.00    0.641    4.0     47.16     115.18
7   3   10     5.99    0.084      5.00    0.250    1.0     57.62     671.44
3   1   11   683.21    0.074    113.00    0.763    7.0     34.04     201.16
4   1   11   223.68    0.244      5.00    0.250    1.0     24.68     611.96
5   2   11    65.00    0.690     61.00    0.694    5.0     57.24     100.92
6   2   11    30.55    0.146      5.00    0.250    1.0     36.56     556.60
7   3   11    10.89    0.303      8.50    0.360    1.5     60.04     208.16
3   1   12   984.40    0.769    145.00    0.787    8.0     37.40     225.02
4   1   12   325.50    0.312     13.00    0.444    2.0     26.36     583.48
6   2   12    46.74    0.265     13.00    0.444    2.0     38.40     364.16
7   3   12    17.74    0.478     17.32    0.498    2.4     68.02     142.34

4. Numerical Experiments for Navier-Stokes Flows
By Fourier expansion, the solution of the Navier-Stokes equations (1, 2) is expressed in the following form:
\[ \psi(t, x, y) = \sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty} \bigl[\xi_{m,n}(t)\cos(mx + ny) + \eta_{m,n}(t)\sin(mx + ny)\bigr]. \]
In order to locate the chaotic Kolmogorov flow, we use the flow invariance property of (1,2) as discussed in [5]. For the given integer vector (l, j, k), the initial stream function
\[ \psi(0, x, y) = \sum_{n=1}^{\infty} X_{0,n}(0)\cos(nky) + \sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty} X_{m,n}(0)\cos(mlx + mjy + nky) \tag{30} \]
gives rise to the unique solution of (1, 2) expressible in the same form
\[ \psi(t, x, y) = \sum_{n=1}^{\infty} X_{0,n}(t)\cos(nky) + \sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty} X_{m,n}(t)\cos(mlx + mjy + nky). \tag{31} \]
It follows from Eq. (43), derived in the Appendix, that the substitution of this solution expression ψ = ψ(t, x, y) into the Navier-Stokes equations (1, 2) gives the infinite-dimensional ordinary differential equation system
\[ \frac{d}{dt}X_{0,n} = -n^2k^2 X_{0,n} - k\,\delta_{n,1} + \sum_{m=1}^{\infty}\sum_{n'=-\infty}^{\infty} \frac{lmR}{2nk}\bigl[\beta_{m,n'+n}X_{m,n'+n} - \beta_{m,n'-n}X_{m,n'-n}\bigr]X_{m,n'}, \qquad 1 \le n < \infty, \tag{32} \]
\[ \begin{aligned} \frac{d}{dt}X_{m,n} = {} & -\beta_{m,n}X_{m,n} - \frac{lkR}{2\beta_{m,n}}\sum_{n'=1}^{\infty} m n'\,(\beta_{m,n+n'} - \beta_{0,n'})\,X_{m,n+n'}X_{0,n'} \\ & + \sum_{m'=1}^{\infty}\sum_{n'=-\infty}^{\infty} \frac{lkR}{2\beta_{m,n}}\,[m'n - n'm]\,[\beta_{m'+m,n'+n} - \beta_{m',n'}]\,X_{m',n'}X_{m'+m,n'+n} \\ & - \sum_{m'=1}^{m-1}\sum_{n'=-\infty}^{\infty} \frac{lkR}{2\beta_{m,n}}\,[mn' - nm']\,\beta_{m',n'}\,X_{m-m',n-n'}X_{m',n'} \\ & + \sum_{n'=1}^{\infty} \frac{lkR}{2\beta_{m,n}}\,n'm\,[\beta_{m,n-n'} - \beta_{0,n'}]\,X_{0,n'}X_{m,n-n'}, \qquad 1 \le m < \infty,\ -\infty \le n < \infty, \end{aligned} \tag{33} \]
for $\beta_{m,n} \equiv m^2l^2 + (mj + nk)^2$, where $\delta_{n,1}$ denotes the Kronecker delta, so that the forcing acts only on the component $X_{0,1}$. The basic steady-state flow $\psi_0 = -\frac{1}{k}\cos(ky)$ is now in the form of the steady-state solution $X^0 = \{X_{m,n}\}$ with the only nonzero component $X_{0,1} = -\frac{1}{k}$. As described in Sect. 1, oscillatory eigenfunctions play a crucial role in the transition from laminar to chaotic flows. For discussion purposes, we display computation results with respect to integer vectors (l, j, k) = (1, 1, 3) and (2, 2, 7). These are chosen because the former represents a typical example of a chaotic flow resulting from an instability associated with secondary self-oscillations bifurcating from the basic flow via a Hopf bifurcation, whereas the latter relates to a chaotic flow evolving from an instability of a pair of secondary steady-state flows bifurcating from the basic flow via a pitchfork bifurcation. Let us begin with the choice of (l, j, k) = (1, 1, 3). By the flow invariance discussed in the Appendix, the dynamical behaviour of the Navier-Stokes flow expressed by (31) is influenced by the eigenfunctions in the following form:
\[ \psi = \sum_{n=-\infty}^{\infty} \xi_{m,n}\cos(mlx + mjy + nky), \]
which relate to the eigenvalue with Re ρ = 0 of (3). From (6) it is seen that the only possible choices of such eigenvalues are m = 1 and 2. Table 1 shows that the corresponding critical values are $R_{1,1,3} = 20.92$ and $R_{2,2,3} = R_{2,1,3} = 28.65$ by (29). That is, the system (32,33) with (l, j, k) = (1, 1, 3) has exactly two critical values $R_{1,1,3} = 20.92$ and $R_{2,1,3} = 28.65$ with respect to its linearization around the basic steady-state solution $X^0$. By Theorem 2.1 and Table 1, we see that each of these critical values gives rise to a pair of complex conjugate unstable oscillatory modes. In fact, the interaction of these two pairs of conjugate modes together with the basic flow leads to the occurrence of chaos. By numerical computation, we found that $X_{m,-n}$ and $X_{m,n+1}$ decay like $10^{-n}$ for n ≥ 1 when m ≥ 1 is fixed, and $X_{m,n}$ decays like $10^{-m+2}$ for m ≥ 1 when the integer n is fixed. Thus, for discussion purposes, we display our computational results with respect to the following truncated solution:
\[ \psi(t, x, y) = \sum_{n=1}^{N} X_{0,n}(t)\cos(nky) + \sum_{m=1}^{N}\sum_{n=-N}^{N-1} X_{m,n}(t)\cos(mlx + mjy + nky), \qquad N = 5, \tag{34} \]
which is an approximation of the stream function represented in (31). Thus the coupled system of the infinite-dimensional ordinary differential equations (32,33) is approximated by the 55-dimensional truncated system, for N = 5 (with $\delta_{n,1}$ the Kronecker delta),
\[ \frac{d}{dt}X_{0,n} = -n^2k^2 X_{0,n} - k\,\delta_{n,1} + \sum_{m=1}^{N}\sum_{n'=-N}^{N-1} \frac{lmR}{2nk}\bigl[\beta_{m,n'+n}X_{m,n'+n} - \beta_{m,n'-n}X_{m,n'-n}\bigr]X_{m,n'}, \qquad 1 \le n \le N, \tag{35} \]
\[ \begin{aligned} \frac{d}{dt}X_{m,n} = {} & -\beta_{m,n}X_{m,n} - \frac{lkR}{2\beta_{m,n}}\sum_{n'=1}^{N} m n'\,(\beta_{m,n+n'} - \beta_{0,n'})\,X_{m,n+n'}X_{0,n'} \\ & + \sum_{m'=1}^{N}\sum_{n'=-N}^{N-1} \frac{lkR}{2\beta_{m,n}}\,[m'n - n'm]\,[\beta_{m'+m,n'+n} - \beta_{m',n'}]\,X_{m',n'}X_{m'+m,n'+n} \\ & - \sum_{m'=1}^{m-1}\sum_{n'=-N}^{N-1} \frac{lkR}{2\beta_{m,n}}\,[mn' - nm']\,\beta_{m',n'}\,X_{m-m',n-n'}X_{m',n'} \\ & + \sum_{n'=1}^{N} \frac{lkR}{2\beta_{m,n}}\,n'm\,[\beta_{m,n-n'} - \beta_{0,n'}]\,X_{0,n'}X_{m,n-n'}, \qquad 1 \le m \le N,\ -N \le n \le N-1. \end{aligned} \tag{36} \]
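A truncated system such as (35,36) has the generic form dX/dt = F(t, X), and can be advanced in time by a fixed-step multistep scheme. The following is a minimal sketch of a fourth-order Adams-Bashforth integrator. It is not the authors' implementation: the use of classical Runge-Kutta steps to generate the three starting values, the function names, and the scalar test equation in the usage example are all assumptions made for illustration.

```python
import math


def rk4_step(f, t, x, h):
    """One classical Runge-Kutta step, used only to start the multistep scheme."""
    k1 = f(t, x)
    k2 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k1)])
    k3 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k2)])
    k4 = f(t + h, [xi + h * ki for xi, ki in zip(x, k3)])
    return [xi + h / 6 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def adams_bashforth4(f, x0, t0, h, n_steps):
    """Advance dx/dt = f(t, x) by n_steps steps of size h with the
    fourth-order Adams-Bashforth formula
    x_{n+1} = x_n + h/24 (55 f_n - 59 f_{n-1} + 37 f_{n-2} - 9 f_{n-3})."""
    t, x = t0, list(x0)
    history = []                      # right-hand sides, oldest first
    for _ in range(n_steps):
        history.append(f(t, x))
        if len(history) < 4:
            x = rk4_step(f, t, x, h)  # start-up phase (an assumption of this sketch)
        else:
            f3, f2, f1, f0 = history  # f0 is the newest evaluation
            x = [xi + h / 24 * (55 * a - 59 * b + 37 * c - 9 * d)
                 for xi, a, b, c, d in zip(x, f0, f1, f2, f3)]
            history.pop(0)
        t += h
    return x


if __name__ == "__main__":
    # test equation dx/dt = -x with x(0) = 1, integrated to t = 1
    x_end = adams_bashforth4(lambda t, x: [-x[0]], [1.0], 0.0, 0.001, 1000)
    print(x_end[0], math.exp(-1.0))
```

In an application to (35,36), the state vector would hold the 55 coefficients and `f` would evaluate the right-hand sides; only one new evaluation of `f` is needed per step, which is the main appeal of a multistep scheme for a system with many quadratic couplings.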
To derive numerical results of this 55-dimensional dynamical system (35,36), a numerical scheme of study is developed based on the 4th -order Adams-Bashforth method (see [19]) with an iteration step size h = 0.0008. To detect the unstable behaviour from the basic solution ψ0 = − 13 cos(ky) and the influence of the two nontrivial oscillatory modes, ∞
ξm,n cos(mlx + mjy + nky), m = 1, 2
(37)
n=−∞
for R = R1,1,3 and R = R2,1,3 , it is convenient to express the initial stream function as ψ(0, x, y) = X0,1 (0) cos(ky) + X1,−1 cos(lx + jy − ky) + X1,0 cos(lx + jy) +X2,−1 cos(2lx + 2jy − ky) + X2,0 cos(2lx + 2jy), since under the evolution of Navier-Stokes equations (1,2), the Fourier modes cos(ky), cos(lx + jy − ky), cos(lx + jy) excite every other Fourier mode cos(mlx + mjy + nky) (see [5]). Letting (l, j, k) = (1, 1, 3) and the initial stream function 1 ψ1 (0, x, y) = − cos(3y) + 0.1 cos(x + y − 3y) + 0.1 cos(x + y) 3 or the initial vector X 1 = {Xm,n } with the nonzero components X0,1 = −1/3, X1,−1 = 0.1, X1,0 = 0.1,
we obtain a limit cycle bifurcating from the basic steady-state solution at the first critical value R1,1,3 = 20.92 (see Fig. 2 at R = 21.5). This limit cycle loses stability at the value R = 57 and bifurcates into a quasi-periodic attractor in its phase spaces (see Fig. 2 at R = 57.5 and R = 58.1). On the other hand, for the same value R = 58.1, if we take respectively the two initial stream functions

$$\psi_{2,\pm}(0,x,y) = -\tfrac{1}{3}\cos(3y) \pm 0.1\cos(x+y-3y) \pm 0.1\cos(x+y) - 0.01\cos(2x+2y-3y) + 0.01\cos(2x+2y),$$

or the initial vectors X^{2,±} = {Xm,n} with the nonzero components X0,1 = −1/3, X1,−1 = ±0.1, X1,0 = ±0.1, X2,−1 = −0.01, X2,0 = 0.01, two symmetric limit cycles (see Fig. 3 at R = 58.1) coexist with the two-dimensional quasi-periodic solution described in Fig. 2. In fact, these two limit cycles originally transit from the basic steady-state solution via the second critical value R2,1,3 = 28.65 as a result of the interaction of the two oscillatory modes in the form of (37). Although the transition from the basic flow at the first critical value R1,1,3 to the secondary periodic flow and then to the tertiary quasi-periodic flow described in Fig. 2 can be observed in the numerical computation, the transition of the pair of symmetric periodic flows, illustrated in Fig. 3 at R = 58.1, from the basic flow at the second critical value R2,1,3 = 28.65 eludes our numerical computation due to the strong stability of the flows appearing in Fig. 2. However, as the Reynolds number increases from R = 58.1, the stability of the two attracting periodic flows soon becomes stronger than that of the quasi-periodic flow, which is no longer observed in the numerical computations when R ≥ 58.7. Figure 3 illustrates the transitions of the two periodic flows at R = 58.1 to chaotic flows at R = 60 through a sequence of period-doubling bifurcations. The chaotic behaviour can be confirmed by computing the first Lyapunov exponents, which are positive.
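The time stepping named in the text, a 4th-order Adams-Bashforth method with step size h = 0.0008, can be sketched generically as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the start-up by three classical Runge-Kutta steps is a choice made here (the paper does not say how the first steps were generated), and the right-hand side is a scalar test problem rather than the 55-dimensional system (35,36).

```python
import math


def rk4_step(f, t, x, h):
    """One classical Runge-Kutta step, used here only to start the multistep method."""
    k1 = f(t, x)
    k2 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k1)])
    k3 = f(t + h / 2, [xi + h / 2 * ki for xi, ki in zip(x, k2)])
    k4 = f(t + h, [xi + h * ki for xi, ki in zip(x, k3)])
    return [xi + h / 6 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def adams_bashforth4(f, t0, x0, h, steps):
    """Integrate x' = f(t, x) with the explicit 4-step Adams-Bashforth formula."""
    t, x = t0, list(x0)
    history = []  # right-hand sides at the most recent points, newest last
    for _ in range(steps):
        history.append(f(t, x))
        if len(history) < 4:
            x = rk4_step(f, t, x, h)  # start-up steps (assumption of this sketch)
        else:
            f3, f2, f1, f0 = history[-4:]  # f_{n-3}, f_{n-2}, f_{n-1}, f_n
            x = [xi + h / 24 * (55 * a - 59 * b + 37 * c - 9 * d)
                 for xi, a, b, c, d in zip(x, f0, f1, f2, f3)]
            history.pop(0)
        t += h
    return t, x


# Scalar test problem x' = -x, x(0) = 1, integrated with the step size quoted in the text.
t_end, x_end = adams_bashforth4(lambda t, x: [-x[0]], 0.0, [1.0], 0.0008, 1250)
error = abs(x_end[0] - math.exp(-t_end))
```

For the truncated system, `f` would evaluate the right-hand sides of (35,36) on the flat state vector; only the stepping logic is shown here.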
The two chaotic flows at R = 60, which transit respectively from the two different periodic flows (see Fig. 3 at R = 58.1, 59.2 and 59.3), appear to overlap completely. Figure 4 shows the secondary periodic flow at R = 21.5 close to the bifurcation value R1,1,3 = 20.92 and the temporal chaotic flow at R = 60 in the fluid domain. We now display computational results associated with (l, j, k) = (2, 2, 7) as an alternative example of transition of a chaotic flow from secondary steady-state flows. As we know, for (l, j, k) = (2, 2, 7), the solution of (1,2) is in the form of (31) if the initial stream function is expressed as in (30). Thus for this solution, Navier-Stokes equations (1,2) become the ordinary differential system (32,33). From the derivation of (6), the dynamical behaviour of the Navier-Stokes flow expressed in (31) is influenced by the three eigenfunctions

$$\sum_{n=-\infty}^{\infty}\phi_{m,n}\cos(mlx+mjy+nky), \qquad m = 1, 2, 3,$$

or

$$\sum_{n=-\infty}^{\infty}\phi_{1,n}\cos(2x+2y+7ny), \qquad \sum_{n=-\infty}^{\infty}\phi_{2,n}\cos(4x+3y+7ny), \tag{38}$$
[Fig. 2. Phase portraits of the secondary and tertiary flows originally bifurcated from the basic flow ψ0 at the first critical value R = R1,1,3 = 20.92 when (l, j, k) = (1, 1, 3). Panels: R = 21.5, 56, 57.5 and 58.1 in the (x1,−1, x1,0) plane; R = 57.5 and 58.1 in the (x2,−1, x2,0) plane.]
and

$$\sum_{n=-\infty}^{\infty}\phi_{3,n}\cos(6x+y+7ny) \tag{39}$$
with respect to the eigenvalues ρ = 0. The associated three critical values are expressed as R2,2,7, R4,3,7 and R6,1,7. It follows from Table 1 and Theorem 2.1 that the integer vectors (l, j, k) = (2, 2, 7), (4, 3, 7) satisfy condition (10) and the spectral solutions (ρ2,2,7(R2,2,7), R2,2,7) = (±78.07i, 52.63), (ρ4,3,7(R4,3,7), R4,3,7) = (±60.71i, 51.63). Thus the associated two pairs of conjugate eigenfunctions (38) are oscillatory. For the integer vector (l, j, k) = (6, 1, 7), we see that β−1 = l² + (j − k)² = 72 > k² = 49 > 37 = l² + j² = β0,
[Fig. 3. Phase portraits of a pair of symmetric limit trajectories (left and right columns) originally bifurcated from the basic flow ψ0 at the second critical value R = R2,1,3 = 28.65 when (l, j, k) = (1, 1, 3). Panels: R = 58.1, 59.2, 59.3 and 60; vertical axis x1,0.]
that is, only a single βn is less than k². Thus there exists a single eigenfunction in the form of (39) (see [12, 33]), which is real around the origin of the complex plane. Using the same computation scheme as described in Sect. 3, we have the critical value R6,1,7 = 40.19. Moreover, the rigorous proof of Yudovich [33] implies that (ψ0, R6,1,7) is a pitchfork bifurcation point of (1,2). This observation shows that the system (32,33) has exactly three critical values arranged in the order of magnitude: R6,1,7 = 40.19, R4,3,7 = 51.63, R2,2,7 = 52.43 with respect to the linearization of (32,33) around the basic steady-state solution X0. To display the numerical experiments for the case (l, j, k) = (2, 2, 7), based on the same reasoning as for the case (l, j, k) = (1, 1, 3), we again apply the 4th-order Adams-Bashforth method to the 55-dimensional truncated system (35,36) with (l, j, k) = (2, 2, 7), N = 5, an iteration step size h = 0.00015 and two initial stream functions

$$\psi_{\pm}(0,x,y) = -\tfrac{1}{7}\cos(7y) \pm 0.1\cos(2x+2y) \pm 0.1\cos(2x+2y-7y), \tag{40}$$
[Fig. 4. Secondary temporal periodic flow for R = 21.5 and the temporal chaotic flow for R = 60 at a time t = t0 when (l, j, k) = (1, 1, 3). Panels: R = 21.5 and R = 60 over 0 ≤ x, y ≤ 2π.]

[Fig. 5. Phase portraits of two solution trajectories initially evolved from X± at the proposed values of the Reynolds number for (l, j, k) = (2, 2, 7). Panels: R = 40.6 and R = 50 in the (x1,−1, x1,0) and (x3,−1, x3,0) planes.]

[Fig. 6. Phase portraits of two limit trajectories initially evolved from X± at the proposed values of the Reynolds number for (l, j, k) = (2, 2, 7). Panels: R = 70, 100 and 112 in the (x1,−1, x1,0) and (x3,−1, x3,0) planes.]

[Fig. 7. (a) Phase portraits of the limit chaotic trajectory initially evolved from X+ and (b) Phase portraits of the limit chaotic trajectory initially evolved from X− at R = 114 for (l, j, k) = (2, 2, 7). Panels in the (x1,−1, x1,0) and (x3,−1, x3,0) planes.]
or the two initial vectors X± = {Xm,n} with the nonzero components X0,1 = −1/7, X1,−1 = ±0.1, X1,0 = ±0.1. With this choice, the oscillatory modes in the form of (38) and the real mode as described in (39) are excited (see [5]). Figure 5 shows the two stable steady-state solutions at R = 40.6 just bifurcating from the basic solution X0 via the first critical value R6,1,7 = 40.19. These two bifurcating steady-state solutions become unstable at R = 46.5 and give rise to a pair of stable periodic solutions (see Fig. 5 at R = 50). The (X1,−1, X1,0) phase portraits of the two spiral trajectories at R = 40.6 in Fig. 5 elucidate the influence of the first stable oscillatory mode described in (38). This phenomenon is also reflected in Fig. 5 for the tertiary flow at R = 50. Moreover, these two limit cycles undergo a sequence of period-doubling bifurcations (see Fig. 6) and become chaotic attractors at R = 114 (see Fig. 7). The chaotic behaviour can also be confirmed by computing the first Lyapunov exponents, which are positive. Figures 6 and 7 also show the two different symmetric limit cycles transiting into a single chaotic attractor. Figure 8 illustrates the secondary steady-state flow at R = 40.6 and the chaotic flow at R = 114 in the fluid domain. In fact, if we take the initial conditions

$$\psi(0,x,y) = -\tfrac{1}{7}\cos(7y) \pm 0.01\cos(6x+6y) \pm 0.01\cos(6x+6y-7y)$$
[Fig. 8. Secondary steady-state flow for R = 40.6 and the temporal chaotic flow for R = 114 at a time t = t0 when (l, j, k) = (2, 2, 7). Panels: R = 40.6 and R = 114 over 0 ≤ x, y ≤ 2π.]
instead of (40), or if we consider the Navier-Stokes solutions without the influence of the oscillatory modes (38), the secondary steady-state flows illustrated in Fig. 5 in the (X3,−1, X3,0) phase space at R = 40.6 remain stable as R increases.
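The text confirms chaos through a positive first Lyapunov exponent but does not state how the exponent was computed. One standard possibility is the two-trajectory renormalization procedure sketched below. Everything in it is an assumption made for illustration: the name `largest_lyapunov`, the perturbation size `d0`, and the use of a generic one-step map `step` that advances the state by a time `dt`.

```python
import math


def largest_lyapunov(step, x0, dt, n_steps, d0=1e-8):
    """Estimate the first Lyapunov exponent of a flow sampled by x -> step(x)."""
    x = list(x0)
    # Companion trajectory started at Euclidean distance d0 from x.
    y = [xi + d0 / math.sqrt(len(x)) for xi in x]
    total = 0.0
    for _ in range(n_steps):
        x, y = step(x), step(y)
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
        total += math.log(d / d0)  # growth of the separation over one step
        # Pull the companion back to distance d0 along the current separation.
        y = [a + (b - a) * d0 / d for a, b in zip(x, y)]
    return total / (n_steps * dt)


# Check on the linear flow x' = a x, whose exponent is exactly a.
a, dt = 0.5, 0.01
estimate = largest_lyapunov(lambda x: [xi * math.exp(a * dt) for xi in x], [1.0], dt, 1000)
```

In an application to the truncated system, `step` would be one step of the time integrator applied to (35,36), and a positive estimate after the transient has died out would indicate chaotic behaviour.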
5. Concluding Remarks

The purpose of this paper is to aid understanding of the mechanisms underlying the occurrence of a chaotic Kolmogorov flow. The criteria of Landau [20], Lorenz [22] and Ruelle and Takens [30] suggest that turbulent flows transit from a basic steady-state flow through bifurcation sequences. The first step in the process of bifurcation as discussed in [20, 30] is the transition from a basic steady-state flow to self-oscillation or temporal periodic flows, whereas the first step in the process of bifurcation as discussed in [22] is the transition from a basic steady-state solution to a pair of steady-state solutions, which eventually become unstable and transit respectively into a pair of self-oscillations leading to the occurrence of chaos (see, for example, [14]). By the Hopf bifurcation theory derived by Joseph and Sattinger [15] on viscous incompressible fluid motions, the transition from a steady-state solution to a self-oscillation solution arises from the occurrence of oscillatory neutral modes, or the oscillatory eigenfunctions reaching the imaginary axis of the complex plane away from the origin. Thus chaotic Kolmogorov flows evolve from a perturbation of the basic solution ψ0 in a direction of the oscillatory modes. On the basis of the rigorous analysis and numerical experiments developed for this special fluid motion problem, oscillatory modes play a crucial role in determining chaotic behaviour. The flow eventually develops chaotic behaviour if it is perturbed from the basic flow under the influence of two oscillatory modes associated with two critical values of the Reynolds number. Typical examples of chaotic flows transiting from secondary temporal periodic flows and secondary steady-state flows are provided. Moreover, analytical reasoning allows location of these oscillatory modes and critical Reynolds number values and provides insight into the process underlying transition into chaotic Kolmogorov flows.

A. Flow Invariance of Navier-Stokes Equations (1,2)

In this section, we show the flow invariance of the Navier-Stokes equations (1,2) by using the spectral method to transform (1,2) into an infinite-dimensional ordinary differential system. That is, for a given integer vector (l, j, k), the stream function solution of (1,2) evolved from the initial stream function

$$\psi(0,x,y) = \sum_{m,n} X_{m,n}(0)\cos(mlx+mjy+nky) \quad\text{with}\quad \sum_{m,n} = \sum_{m=0}\sum_{n=1}^{\infty} + \sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty} \tag{41}$$

remains in the same form

$$\psi(t,x,y) = \sum_{m,n} X_{m,n}(t)\cos(mlx+mjy+nky). \tag{42}$$

Indeed, for the stream function ψ described by (42), we have

$$
\begin{aligned}
\partial_x\psi &= \sum_{m,n} -ml\,X_{m,n}\sin(mlx+mjy+nky),\\
\partial_y\psi &= \sum_{m,n} -(mj+nk)\,X_{m,n}\sin(mlx+mjy+nky),\\
\Delta\psi &= -\sum_{m,n} (m^2l^2+(mj+nk)^2)\,X_{m,n}\cos(mlx+mjy+nky).
\end{aligned}
$$
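For convenience, we let βm,n = m²l² + (mj + nk)², φm,n = sin(mlx + mjy + nky), ψm,n = cos(mlx + mjy + nky) to obtain

$$
\begin{aligned}
\partial_y\psi\,\partial_x\Delta\psi - \partial_x\psi\,\partial_y\Delta\psi
&= \sum_{m,n}\sum_{m',n'} [ml(m'j+n'k) - (mj+nk)m'l]\,\beta_{m',n'}X_{m,n}X_{m',n'}\varphi_{m,n}\varphi_{m',n'}\\
&= \sum_{m,n}\sum_{m',n'} \frac{lk}{2}(mn'-nm')\beta_{m',n'}X_{m,n}X_{m',n'}[\psi_{m-m',n-n'} - \psi_{m+m',n+n'}]\\
&= \sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty}\sum_{m'=0}\sum_{n'=1}^{\infty} \frac{lk}{2}(mn'-nm')\beta_{m',n'}X_{m,n}X_{m',n'}\psi_{m-m',n-n'}\\
&\quad + \sum_{m'=1}^{\infty}\sum_{n'=-\infty}^{\infty}\sum_{m=0}\sum_{n=1}^{\infty} \frac{lk}{2}(mn'-nm')\beta_{m',n'}X_{m,n}X_{m',n'}\psi_{m-m',n-n'}\\
&\quad + \sum_{m=m'=1}^{\infty}\sum_{n,n'=-\infty}^{\infty} \frac{lk}{2}(mn'-nm')\beta_{m',n'}X_{m,n}X_{m',n'}\psi_{m-m',n-n'}\\
&\quad + \sum_{m'=1}^{\infty}\sum_{m=m'+1}^{\infty}\sum_{n,n'=-\infty}^{\infty} \frac{lk}{2}(mn'-nm')\beta_{m',n'}X_{m,n}X_{m',n'}\psi_{m-m',n-n'}\\
&\quad + \sum_{m=1}^{\infty}\sum_{m'=m+1}^{\infty}\sum_{n,n'=-\infty}^{\infty} \frac{lk}{2}(mn'-nm')\beta_{m',n'}X_{m,n}X_{m',n'}\psi_{m-m',n-n'}\\
&\quad - \sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty}\sum_{m'=0}\sum_{n'=1}^{\infty} \frac{lk}{2}(mn'-nm')\beta_{m',n'}X_{m,n}X_{m',n'}\psi_{m+m',n+n'}\\
&\quad - \sum_{m'=1}^{\infty}\sum_{n'=-\infty}^{\infty}\sum_{m=0}\sum_{n=1}^{\infty} \frac{lk}{2}(mn'-nm')\beta_{m',n'}X_{m,n}X_{m',n'}\psi_{m+m',n+n'}\\
&\quad - \sum_{m,m'=1}^{\infty}\sum_{n,n'=-\infty}^{\infty} \frac{lk}{2}(mn'-nm')\beta_{m',n'}X_{m,n}X_{m',n'}\psi_{m+m',n+n'}\\
&\equiv I_1 + \cdots + I_8.
\end{aligned}
$$

By an elementary calculation, these eight terms can be written as follows:

$$
\begin{aligned}
I_1 + I_2 &= \sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty}\sum_{n'=1}^{\infty} \frac{lk}{2}\,mn'\beta_{0,n'}X_{m,n}X_{0,n'}\psi_{m,n-n'} - \sum_{m'=1}^{\infty}\sum_{n'=-\infty}^{\infty}\sum_{n=1}^{\infty} \frac{lk}{2}\,nm'\beta_{m',n'}X_{0,n}X_{m',n'}\psi_{m',n'-n}\\
&= \sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty}\sum_{n'=1}^{\infty} \frac{lk}{2}\,mn'\beta_{0,n'}X_{m,n}X_{0,n'}\psi_{m,n-n'} - \sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty}\sum_{n'=1}^{\infty} \frac{lk}{2}\,n'm\beta_{m,n}X_{0,n'}X_{m,n}\psi_{m,n-n'}\\
&= -\sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty}\sum_{n'=1}^{\infty} \frac{lk}{2}\,mn'(\beta_{m,n} - \beta_{0,n'})X_{m,n}X_{0,n'}\psi_{m,n-n'}\\
&= -\sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty}\sum_{n'=1}^{\infty} \frac{lk}{2}\,mn'(\beta_{m,n+n'} - \beta_{0,n'})X_{m,n+n'}X_{0,n'}\psi_{m,n},
\end{aligned}
$$

$$
\begin{aligned}
I_3 &= \sum_{m=1}^{\infty}\sum_{n,n'=-\infty}^{\infty} \frac{lk}{2}\,m(n'-n)\beta_{m,n'}X_{m,n}X_{m,n'}\psi_{0,n-n'}\\
&= \sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty}\sum_{n'=n+1}^{\infty} \frac{lk}{2}\,m(n'-n)\beta_{m,n'}X_{m,n}X_{m,n'}\psi_{0,n-n'} + \sum_{m=1}^{\infty}\sum_{n'=-\infty}^{\infty}\sum_{n=n'+1}^{\infty} \frac{lk}{2}\,m(n'-n)\beta_{m,n'}X_{m,n}X_{m,n'}\psi_{0,n-n'}\\
&= \sum_{m=1}^{\infty}\sum_{n=1}^{\infty}\sum_{n'=-\infty}^{\infty} \frac{lk}{2}\,mn\beta_{m,n'+n}X_{m,n'}X_{m,n'+n}\psi_{0,n} - \sum_{m=1}^{\infty}\sum_{n'=1}^{\infty}\sum_{n=-\infty}^{\infty} \frac{lk}{2}\,mn'\beta_{m,n-n'}X_{m,n}X_{m,n-n'}\psi_{0,n'}\\
&= \sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty}\sum_{n'=1}^{\infty} \frac{lk}{2}\,mn'[\beta_{m,n'+n}X_{m,n'+n} - \beta_{m,n-n'}X_{m,n-n'}]X_{m,n}\psi_{0,n'},
\end{aligned}
$$

$$
\begin{aligned}
I_4 + I_5 &= \sum_{m,m'=1}^{\infty}\sum_{n,n'=-\infty}^{\infty} \frac{lk}{2}\,[(m+m')n' - (n+n')m']\beta_{m',n'}X_{m+m',n+n'}X_{m',n'}\psi_{m,n}\\
&\quad + \sum_{m,m'=1}^{\infty}\sum_{n,n'=-\infty}^{\infty} \frac{lk}{2}\,[m(n'+n) - n(m'+m)]\beta_{m'+m,n'+n}X_{m,n}X_{m'+m,n'+n}\psi_{m',n'}\\
&= \sum_{m,m'=1}^{\infty}\sum_{n,n'=-\infty}^{\infty} \frac{lk}{2}\,[mn' - nm']\beta_{m',n'}X_{m+m',n+n'}X_{m',n'}\psi_{m,n}\\
&\quad + \sum_{m,m'=1}^{\infty}\sum_{n,n'=-\infty}^{\infty} \frac{lk}{2}\,[m'n - n'm]\beta_{m'+m,n'+n}X_{m',n'}X_{m'+m,n'+n}\psi_{m,n}\\
&= \sum_{m,m'=1}^{\infty}\sum_{n,n'=-\infty}^{\infty} \frac{lk}{2}\,[m'n - n'm][\beta_{m'+m,n'+n} - \beta_{m',n'}]X_{m',n'}X_{m'+m,n'+n}\psi_{m,n},
\end{aligned}
$$

$$
\begin{aligned}
I_6 + I_7 &= -\sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty}\sum_{n'=1}^{\infty} \frac{lk}{2}\,mn'\beta_{0,n'}X_{m,n}X_{0,n'}\psi_{m,n+n'} + \sum_{m'=1}^{\infty}\sum_{n'=-\infty}^{\infty}\sum_{n=1}^{\infty} \frac{lk}{2}\,nm'\beta_{m',n'}X_{0,n}X_{m',n'}\psi_{m',n'+n}\\
&= -\sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty}\sum_{n'=1}^{\infty} \frac{lk}{2}\,mn'\beta_{0,n'}X_{m,n}X_{0,n'}\psi_{m,n+n'} + \sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty}\sum_{n'=1}^{\infty} \frac{lk}{2}\,n'm\beta_{m,n}X_{0,n'}X_{m,n}\psi_{m,n+n'}\\
&= \sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty}\sum_{n'=1}^{\infty} \frac{lk}{2}\,n'm[\beta_{m,n} - \beta_{0,n'}]X_{0,n'}X_{m,n}\psi_{m,n+n'}\\
&= \sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty}\sum_{n'=1}^{\infty} \frac{lk}{2}\,n'm[\beta_{m,n-n'} - \beta_{0,n'}]X_{0,n'}X_{m,n-n'}\psi_{m,n},
\end{aligned}
$$

$$
\begin{aligned}
I_8 &= -\sum_{m=2}^{\infty}\sum_{m'=1}^{m-1}\sum_{n,n'=-\infty}^{\infty} \frac{lk}{2}\,[(m-m')n' - (n-n')m']\beta_{m',n'}X_{m-m',n-n'}X_{m',n'}\psi_{m,n}\\
&= -\sum_{m=2}^{\infty}\sum_{m'=1}^{m-1}\sum_{n,n'=-\infty}^{\infty} \frac{lk}{2}\,[mn' - nm']\beta_{m',n'}X_{m-m',n-n'}X_{m',n'}\psi_{m,n}.
\end{aligned}
$$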
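Collecting terms, we have

$$
\begin{aligned}
\partial_y\psi\,\partial_x\Delta\psi - \partial_x\psi\,\partial_y\Delta\psi
&= -\sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty}\sum_{n'=1}^{\infty} \frac{lk}{2}\,mn'(\beta_{m,n+n'} - \beta_{0,n'})X_{m,n+n'}X_{0,n'}\psi_{m,n}\\
&\quad + \sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty}\sum_{n'=1}^{\infty} \frac{lk}{2}\,mn'[\beta_{m,n'+n}X_{m,n'+n} - \beta_{m,n-n'}X_{m,n-n'}]X_{m,n}\psi_{0,n'}\\
&\quad + \sum_{m',m=1}^{\infty}\sum_{n,n'=-\infty}^{\infty} \frac{lk}{2}\,[m'n - n'm][\beta_{m'+m,n'+n} - \beta_{m',n'}]X_{m',n'}X_{m'+m,n'+n}\psi_{m,n}\\
&\quad + \sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty}\sum_{n'=1}^{\infty} \frac{lk}{2}\,n'm[\beta_{m,n-n'} - \beta_{0,n'}]X_{0,n'}X_{m,n-n'}\psi_{m,n}\\
&\quad - \sum_{m=2}^{\infty}\sum_{m'=1}^{m-1}\sum_{n,n'=-\infty}^{\infty} \frac{lk}{2}\,[mn' - nm']\beta_{m',n'}X_{m-m',n-n'}X_{m',n'}\psi_{m,n}.
\end{aligned}
$$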
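The reduction of the nonlinear term to cosine modes in this appendix rests on two elementary identities: the coefficient identity ml(m'j + n'k) − (mj + nk)m'l = lk(mn' − nm') and the product formula sin a sin b = ½[cos(a − b) − cos(a + b)]. A quick numerical spot check of both, with a hypothetical helper name and randomly drawn integers and angles chosen purely for illustration, might look as follows.

```python
import math
import random


def jacobian_coefficient(m, n, mp, np_, l, j, k):
    """Coefficient of phi_{m,n} phi_{m',n'}; mp and np_ stand for m' and n'."""
    return m * l * (mp * j + np_ * k) - (m * j + n * k) * mp * l


random.seed(0)  # fixed seed so the check is reproducible
for _ in range(1000):
    m, n, mp, np_, l, j, k = (random.randint(-9, 9) for _ in range(7))
    # Coefficient identity: the j-dependence cancels and only l k (m n' - n m') remains.
    assert jacobian_coefficient(m, n, mp, np_, l, j, k) == l * k * (m * np_ - n * mp)
    # Product-to-sum formula that turns phi phi' into psi_{m-m',n-n'} - psi_{m+m',n+n'}.
    a, b = random.uniform(-10, 10), random.uniform(-10, 10)
    assert math.isclose(math.sin(a) * math.sin(b),
                        0.5 * (math.cos(a - b) - math.cos(a + b)), abs_tol=1e-12)
```

The cancellation of j in the coefficient is what makes the factor lk/2 common to all eight terms I1, ..., I8.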
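Thus the Navier-Stokes equation (1) or the equation

$$\partial_t\psi = \Delta\psi - k\cos ky + R(-\Delta)^{-1}[\partial_y\psi\,\partial_x\Delta\psi - \partial_x\psi\,\partial_y\Delta\psi]$$

with solution ψ expressed by (42) is in the form of an ordinary differential equation system

$$
\begin{aligned}
&\sum_{m,n}\frac{d}{dt}X_{m,n}\psi_{m,n} + \sum_{m,n}\beta_{m,n}X_{m,n}\psi_{m,n} + k\psi_{0,1}\\
&\quad= -\sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty}\sum_{n'=1}^{\infty} \frac{lkR}{2\beta_{m,n}}\,mn'(\beta_{m,n+n'} - \beta_{0,n'})X_{m,n+n'}X_{0,n'}\psi_{m,n}\\
&\qquad + \sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty}\sum_{n'=1}^{\infty} \frac{lkR}{2\beta_{m,n}}\,n'm[\beta_{m,n-n'} - \beta_{0,n'}]X_{0,n'}X_{m,n-n'}\psi_{m,n}\\
&\qquad - \sum_{m=2}^{\infty}\sum_{m'=1}^{m-1}\sum_{n,n'=-\infty}^{\infty} \frac{lkR}{2\beta_{m,n}}\,[mn' - nm']\beta_{m',n'}X_{m-m',n-n'}X_{m',n'}\psi_{m,n}\\
&\qquad + \sum_{m=1}^{\infty}\sum_{n=-\infty}^{\infty}\sum_{n'=1}^{\infty} \frac{lkR}{2\beta_{0,n'}}\,mn'[\beta_{m,n'+n}X_{m,n'+n} - \beta_{m,n-n'}X_{m,n-n'}]X_{m,n}\psi_{0,n'}\\
&\qquad + \sum_{m',m=1}^{\infty}\sum_{n,n'=-\infty}^{\infty} \frac{lkR}{2\beta_{m,n}}\,[m'n - n'm][\beta_{m'+m,n'+n} - \beta_{m',n'}]X_{m',n'}X_{m'+m,n'+n}\psi_{m,n}.
\end{aligned}
\tag{43}
$$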
This shows the flow invariance of (1,2) in the sense that the initial stream function (41) described by the set of Fourier modes {cos(mlx + mjy + nky)} gives rise to the solution (42) which relates to the same set of Fourier modes {cos(mlx + mjy + nky)}.

References

1. Arnold, V.I., Meshalkin, L.D.: Kolmogorov's seminar on selected problems of analysis (1958–1959). Russ. Math. Surv. 15, 247–250 (1960)
2. Bondarenko, N.F., Gak, M.Z., Dolzhanskiy, F.V.: Laboratory and theoretical models of plane periodic flows. Bull. (Izv.) Acad. Sci. USSR, Atmospheric and Oceanic Physics 15, 711–716 (1979)
3. Beyer, P., Benkadda, S.: Advection of passive particles in the Kolmogorov flow. Chaos 11, 774–779 (2001)
4. Chen, Z.-M.: Bifurcations of a steady-state solution to the two-dimensional Navier-Stokes equations. Commun. Math. Phys. 201, 117–138 (1999)
5. Chen, Z.-M., Price, W.G.: Time-dependent periodic Navier-Stokes flow in a two-dimensional torus. Commun. Math. Phys. 179, 577–597 (1996)
6. Chen, Z.-M., Price, W.G.: Remarks on time dependent periodic Navier-Stokes flows on a two-dimensional torus. Commun. Math. Phys. 207, 81–106 (1999)
7. Chen, Z.-M., Price, W.G.: Circle bifurcation of a two-dimensional spatially periodic flow. To be published
8. Drazin, P.G., Reid, W.H.: Hydrodynamic Stability. Cambridge: Cambridge University Press, 1981
9. Foias, C., Jolly, M.S., Manley, O.P., Rosa, R.: On the Landau-Lifschitz degrees of freedom in 2-D turbulence. J. Stat. Phys. 111, 1017–1019 (2003)
10. Foias, C., Jolly, M.S., Manley, O.P., Rosa, R.: Statistical estimates for the Navier-Stokes equations and the Kraichnan theory of 2-D fully developed turbulence. J. Stat. Phys. 108, 591–645 (2002)
11. Foias, C., Manley, O., Rosa, R., Temam, R.: Navier-Stokes equations and turbulence. In: Encyclopedia of Mathematics and its Applications, Vol. 83. Cambridge: Cambridge University Press, 2001
12. Frenkel, A.L.: Stability of an oscillating Kolmogorov flow. Phys. Fluids A 3, 1718–1729 (1991)
13. Green, J.S.A.: Two-dimensional turbulence near the viscous limit. J. Fluid Mech. 62, 273–287 (1974)
14. Guckenheimer, J., Holmes, P.: Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields. New York: Springer, 1983
15. Joseph, D.D., Sattinger, D.: Bifurcating time periodic solutions and their stability. Arch. Rational Mech. Anal. 45, 75–109 (1972)
16. Khinchin, A.Ya.: Continued Fractions. Chicago, IL: University of Chicago Press, 1964
17. Kim, S.-C., Okamoto, H.: Bifurcations and inviscid limit of rhombic Navier-Stokes flows in tori. IMA J. Appl. Math. 68, 119–134 (2003)
18. Kincaid, D., Cheney, W.: Numerical Analysis. Pacific Grove, CA: Brooks/Cole, 1990
19. Lambert, J.D.: Numerical Methods for Ordinary Differential Systems. Chichester: Wiley, 1991
20. Landau, L.: On the problem of turbulence. Comptes Rend. Acad. Sci. USSR 44, 311–316 (1944)
21. Lin, C.C.: The Theory of Hydrodynamic Stability. Cambridge: Cambridge University Press, 1955
22. Lorenz, E.N.: Deterministic non-periodic flow. J. Atmos. Sci. 20, 130–141 (1963)
23. Ma, T., Wang, S.: Attractor bifurcation theory and its applications to Rayleigh-Bénard convection. Commun. Pure Appl. Anal. 2, 591–599 (2003)
24. Ma, T., Wang, S.: Dynamic bifurcation and stability in the Rayleigh-Bénard convection. Commun. Math. Sci. 2, 159–183 (2004)
25. Marchioro, C.: An example of absence of turbulence for any Reynolds number. Commun. Math. Phys. 105, 99–106 (1986)
26. Meshalkin, L.D., Sinai, Ya.G.: Investigation of the stability of a stationary solution of a system of equations for the plane movement of an incompressible viscous fluid. J. Math. Mech. 19, 1700–1705 (1961)
27. Obukhov, A.M.: Kolmogorov flow and laboratory simulation of it. Russ. Math. Surv. 38, 113–126 (1983)
28. Okamoto, H., Shoji, M.: Bifurcation diagrams in Kolmogorov's problem of viscous incompressible fluid on 2-D Tori. Japan J. Indus. Appl. Math. 10, 191–218 (1993)
29. Platt, N., Sirovich, L., Fitzmaurice, N.: An investigation of chaotic Kolmogorov flows. Phys. Fluids A 3, 681–696 (1991)
30. Ruelle, D., Takens, F.: On the nature of turbulence. Commun. Math. Phys. 20, 167–192 (1971)
31. Tran, C.V., Shepherd, T.G., Cho, H.-R.: Stability of stationary solutions of the forced Navier-Stokes equations on the two-torus. Discrete Cont. Dyn. Syst. B 2, 483–494 (2002)
32. Wall, H.S.: Analytic Theory of Continued Fractions. New York: D. Van Nostrand Company, 1948
33. Yudovich, V.I.: Example of the generation of a secondary stationary or periodic flow when there is loss of stability of the laminar flow of a viscous incompressible fluid. J. Math. Mech. 29, 587–603 (1965)
34. Zhang, X., Frenkel, A.L.: Large-scale instability of generalized oscillating Kolmogorov flows. SIAM J. Appl. Math. 58, 540–564 (1998)

Communicated by J.L. Lebowitz