Commun. Math. Phys. 298, 1–36 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1064-1
Communications in
Mathematical Physics
Hölder-Continuous Rough Paths by Fourier Normal Ordering

Jérémie Unterberger
Institut Elie Cartan, Université Henri Poincaré, BP 239, 54506 Vandoeuvre Cedex, France.

Received: 16 March 2009 / Accepted: 9 March 2010. Published online: 19 May 2010. © Springer-Verlag 2010
Abstract: We construct in this article an explicit geometric rough path over arbitrary d-dimensional paths with finite 1/α-variation for any α ∈ (0, 1). The method may be called 'Fourier normal ordering', since it consists in a regularization obtained after permuting the order of integration in iterated integrals so that innermost integrals have highest Fourier frequencies. In doing so, there appear non-trivial tree combinatorics, which are best understood by using the structure of the Hopf algebra of decorated rooted trees (in connection with the Chen or multiplicative property) and of the Hopf shuffle algebra (in connection with the shuffle or geometric property). Hölder continuity is proved by using Besov norms. The method is well-suited in particular in view of applications to probability theory (see the companion article [34] for the construction of a rough path over multidimensional fractional Brownian motion with Hurst index α < 1/4, or [35] for a short survey in that case).

Contents

0. Introduction
1. Iterated Integrals: Smooth Case
   1.1 From iterated integrals to trees
   1.2 Permutation graphs and Fourier normal ordering for smooth paths
   1.3 Tree Chen property and coproduct structure
   1.4 Skeleton integrals
2. Regularization: The Fourier Normal Ordering Step by Step
3. Proof of the Geometric and Multiplicative Properties
   3.1 Hopf algebras and the Chen and shuffle properties
   3.2 Proof of the Chen and shuffle properties
4. Hölder Estimates
   4.1 Choice of the regularization scheme
   4.2 A key formula for skeleton integrals
   4.3 Estimate for the increment term
   4.4 Estimate for the boundary term
5. Appendix. Hölder and Besov Spaces
References
0. Introduction

Assume $t \mapsto \Gamma_t = (\Gamma_t(1), \ldots, \Gamma_t(d))$, $t \in \mathbb{R}$ is a smooth d-dimensional path, and let $V_1, \ldots, V_d : \mathbb{R}^d \to \mathbb{R}^d$ be smooth vector fields. Then the classical Cauchy-Lipschitz theorem implies that the differential equation driven by $\Gamma$,

\[ dy(t) = \sum_{i=1}^{d} V_i(y(t))\, d\Gamma_t(i), \tag{0.1} \]

admits a unique solution with initial condition $y(0) = y_0$. The usual way to prove this is to show by a functional fixed-point theorem that the iterated integrals

\[ y_n \mapsto y_{n+1}(t) := y_0 + \int_0^t \sum_i V_i(y_n(s))\, d\Gamma_s(i) \tag{0.2} \]

converge when $n \to \infty$.

Assume now that $\Gamma$ is only α-Hölder continuous for some α ∈ (0, 1). Then the Cauchy-Lipschitz theorem does not hold any more because one first needs to give a meaning to the above integrals, and in particular to the iterated integrals $\int_s^t d\Gamma_{t_1}(i_1) \int_s^{t_1} d\Gamma_{t_2}(i_2) \cdots \int_s^{t_{n-1}} d\Gamma_{t_n}(i_n)$, $n \ge 2$, $1 \le i_1, \ldots, i_n \le d$.

The theory of rough paths, invented by T. Lyons [22] and further developed by V. Friz, N. Victoir [14] and M. Gubinelli [15], implies the possibility to solve Eq. (0.1) by a redefinition of the integration along $\Gamma$, using as an essential ingredient a rough path $\boldsymbol{\Gamma}$ over $\Gamma$. By definition, a functional $\boldsymbol{\Gamma} = (\boldsymbol{\Gamma}^1, \ldots, \boldsymbol{\Gamma}^N)$, $N = \lfloor 1/\alpha \rfloor$ (the integer part of 1/α), is called a rough path over $\Gamma$ if $\boldsymbol{\Gamma}^1_{ts} = (\delta\Gamma)_{ts} := \Gamma_t - \Gamma_s$ are the two-point increments of $\Gamma$, and $\boldsymbol{\Gamma}^k = (\boldsymbol{\Gamma}^k(i_1, \ldots, i_k))_{1 \le i_1, \ldots, i_k \le d}$, $k = 1, \ldots, N$ satisfy the following three properties:

(i) Hölder continuity. Each component of $\boldsymbol{\Gamma}^k$, $k = 1, \ldots, N$ is kα-Hölder continuous, that is to say, $\sup_{s \in \mathbb{R}} \sup_{t \in \mathbb{R}} \frac{|\boldsymbol{\Gamma}^k_{ts}(i_1, \ldots, i_k)|}{|t-s|^{k\alpha}} < \infty$;

(ii) Multiplicative/Chen property. Letting $\delta\boldsymbol{\Gamma}^k_{tus} := \boldsymbol{\Gamma}^k_{ts} - \boldsymbol{\Gamma}^k_{tu} - \boldsymbol{\Gamma}^k_{us}$, one requires

\[ \delta\boldsymbol{\Gamma}^k_{tus}(i_1, \ldots, i_k) = \sum_{k_1 + k_2 = k} \boldsymbol{\Gamma}^{k_1}_{tu}(i_1, \ldots, i_{k_1})\, \boldsymbol{\Gamma}^{k_2}_{us}(i_{k_1+1}, \ldots, i_k); \tag{0.3} \]

(iii) Geometric/shuffle property.

\[ \boldsymbol{\Gamma}^{n_1}_{ts}(i_1, \ldots, i_{n_1})\, \boldsymbol{\Gamma}^{n_2}_{ts}(j_1, \ldots, j_{n_2}) = \sum_{\mathbf{k} \in \mathrm{Sh}(\mathbf{i}, \mathbf{j})} \boldsymbol{\Gamma}^{n_1+n_2}_{ts}(k_1, \ldots, k_{n_1+n_2}), \tag{0.4} \]

where $\mathrm{Sh}(\mathbf{i}, \mathbf{j})$ is the set of shuffles of $\mathbf{i} = (i_1, \ldots, i_{n_1})$ and $\mathbf{j} = (j_1, \ldots, j_{n_2})$, that is to say, of permutations of $i_1, \ldots, i_{n_1}, j_1, \ldots, j_{n_2}$ which do not change the orderings of $(i_1, \ldots, i_{n_1})$ and $(j_1, \ldots, j_{n_2})$.
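The set of shuffles appearing in property (iii) can be enumerated explicitly. The following is a minimal sketch, not part of the original construction; the function name `shuffles` is hypothetical. It lists all shuffles of two words by choosing which positions of the merged word receive the letters of the first word, so that the number of shuffles is the binomial coefficient $\binom{n_1+n_2}{n_1}$.

```python
from itertools import combinations


def shuffles(i, j):
    """Return all shuffles of the words i and j (hypothetical helper).

    A shuffle interleaves the letters of i and j while keeping the
    internal order of each word unchanged.
    """
    n1, n2 = len(i), len(j)
    result = []
    # Choose the positions of the merged word that receive letters of i.
    for positions in combinations(range(n1 + n2), n1):
        chosen = set(positions)
        it_i, it_j = iter(i), iter(j)
        word = tuple(next(it_i) if p in chosen else next(it_j)
                     for p in range(n1 + n2))
        result.append(word)
    return result


if __name__ == "__main__":
    for word in shuffles((1, 2), (3,)):
        print(word)
```

For words with repeated letters the list contains repeated entries, which matches the sum over shuffles in Eq. (0.4), where each interleaving is counted separately.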
There is a canonical choice for $\boldsymbol{\Gamma}$, called canonical lift of $\Gamma$, when $\Gamma$ is a smooth path, namely, the iterated integrals of $\Gamma$ of arbitrary order. If one sets

\[ \boldsymbol{\Gamma}^{\mathrm{cano},n}_{ts}(i_1, \ldots, i_n) := \int_s^t d\Gamma_{t_1}(i_1) \int_s^{t_1} d\Gamma_{t_2}(i_2) \cdots \int_s^{t_{n-1}} d\Gamma_{t_n}(i_n), \tag{0.5} \]

then $\boldsymbol{\Gamma}^{\mathrm{cano}} = (\boldsymbol{\Gamma}^{\mathrm{cano},n})_{n=1,2,\ldots}$ satisfies properties (i), (ii), (iii) with α = 1. Axiom (ii) receives a natural geometric interpretation in this case since $\boldsymbol{\Gamma}^{\mathrm{cano}}$ measures the areas, volumes and so forth generated by $\Gamma(1), \ldots, \Gamma(d)$, see [14], while axiom (iii) may be deduced from Fubini's theorem. A further justification of axioms (i), (ii), (iii) comes from the fact that any rough path is a limit in some sense of the iterated integrals of a sequence of smooth paths, so $\boldsymbol{\Gamma}$ plays the rôle of a substitute of iterated integrals for $\Gamma$.

The problem we address here is the existence and construction of rough paths. It is particularly relevant when $\Gamma$ is a random path; it allows for the pathwise construction of stochastic integrals or of solutions of stochastic differential equations driven by $\Gamma$. Rough paths are then usually constructed by choosing some appropriate smooth approximation $\Gamma^\eta$, $\eta \overset{>}{\to} 0$ of $\Gamma$ and proving that the canonical lift of $\Gamma^\eta$ converges in $L^2(\Omega)$ for appropriate Hölder norms to a rough path $\boldsymbol{\Gamma}$ lying above $\Gamma$ (see [11,32] in the case of fractional Brownian motion with Hurst index α > 1/4, and [1,18] for a class of random paths on fractals, or references in [23]).

A general construction of a rough path for deterministic paths has been given, in the original formulation due to T. Lyons, in an article by T. Lyons and N. Victoir [23]. The idea [14] is to see a rough path over $\Gamma$ as a Hölder section of the trivial G-principal bundle over $\mathbb{R}$, where G is a free rank-N nilpotent group (or Carnot group), while the underlying path $\Gamma$ is a section of the corresponding quotient G/K-bundle for some normal subgroup K of G; so one is reduced to the problem of finding Hölder-continuous sections $g_t K \to g_t$. Obviously, there is no canonical way to do this in general. This abstract, group-theoretic construction, which uses the axiom of choice, is unfortunately not particularly appropriate for concrete problems, such as the behaviour of solutions of stochastic differential equations for instance.
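To make the Chen property (0.3) and the shuffle property (0.4) concrete for the canonical lift (0.5), here is a small sketch restricted to levels 1 and 2 and to piecewise-linear paths (an assumption made only for illustration; all function names are hypothetical). On a straight segment with increment $a$, the level-2 integral is $a_i a_j / 2$; consecutive segments are glued with Eq. (0.3) for $k = 2$, and the shuffle identity $\boldsymbol{\Gamma}^1(i)\boldsymbol{\Gamma}^1(j) = \boldsymbol{\Gamma}^2(i,j) + \boldsymbol{\Gamma}^2(j,i)$ is then checked exactly with rational arithmetic.

```python
from fractions import Fraction


def segment_lift(a):
    """Levels 1 and 2 of the canonical lift of a straight segment with increment a."""
    d = len(a)
    level2 = [[a[i] * a[j] / 2 for j in range(d)] for i in range(d)]
    return list(a), level2


def chen_concat(first, second):
    """Lift over [s,t] from the lifts over [s,u] (first) and [u,t] (second).

    Uses Eq. (0.3) with k = 2:
    G2_ts(i,j) = G2_tu(i,j) + G2_us(i,j) + G1_tu(i) * G1_us(j).
    """
    a1, a2 = first
    b1, b2 = second
    d = len(a1)
    level1 = [a1[i] + b1[i] for i in range(d)]
    level2 = [[a2[i][j] + b2[i][j] + b1[i] * a1[j] for j in range(d)]
              for i in range(d)]
    return level1, level2


def canonical_lift(increments):
    """Canonical lift (levels 1 and 2) of a piecewise-linear path."""
    lift = segment_lift(increments[0])
    for a in increments[1:]:
        lift = chen_concat(lift, segment_lift(a))
    return lift


def shuffle_defect(lift):
    """Largest violation of G1(i) G1(j) = G2(i,j) + G2(j,i); zero if (0.4) holds."""
    g1, g2 = lift
    d = len(g1)
    return max(abs(g1[i] * g1[j] - g2[i][j] - g2[j][i])
               for i in range(d) for j in range(d))


if __name__ == "__main__":
    F = Fraction
    path = [(F(1), F(0)), (F(0), F(1)), (F(-1), F(2))]
    print(canonical_lift(path), shuffle_defect(canonical_lift(path)))
```

The antisymmetric part of the level-2 term is the signed area swept by the path, which is the geometric interpretation of axiom (ii) mentioned above.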
We propose here a new, explicit method to construct a rough path $\boldsymbol{\Gamma}$ over an arbitrary α-Hölder path $\Gamma$ which rests on an algorithm that we call Fourier normal ordering. Let us explain the main points of this algorithm.

The first point is the use of the Fourier transform, $\mathcal{F}$; Hölder estimates are obtained by means of Besov norms involving compactly supported Fourier multipliers, see the Appendix. Assume for simplicity that $\Gamma$ is compactly supported; this assumption is essentially void since one may multiply any α-Hölder path by a smooth, compactly supported function equal to 1 over an arbitrarily large compact interval, and then restrict the construction to this interval. What makes the Fourier transform interesting for our problem is that $(\mathcal{F}\Gamma')(\xi) = i\xi (\mathcal{F}\Gamma)(\xi)$ is a well-defined function; thus, the meaningless iterated integral $\int_s^t d\Gamma_{t_1}(i_1) \int_s^{t_1} d\Gamma_{t_2}(i_2) \cdots \int_s^{t_{n-1}} d\Gamma_{t_n}(i_n)$ is rewritten after Fourier transformation as some integral $\int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} f(\xi_1, \ldots, \xi_n)\, d\xi_1 \ldots d\xi_n$, where f is regular but not integrable at infinity along certain directions.

The second, main point is the splitting of the Fourier domain of integration $\mathbb{R}^n$ into $\cup_{\sigma \in \Sigma_n} \mathbb{R}^n_\sigma$, $\Sigma_n$ = set of permutations of {1, . . . , n}, where $\mathbb{R}^n_\sigma := \{|\xi_{\sigma(1)}| \le \ldots \le |\xi_{\sigma(n)}|\}$, see Sect. 2 for a more accurate definition involving the Besov dyadic decomposition. Away from the singular directions, the resulting integrals are naturally shown to have a polynomially decreasing behaviour at infinity implying the correct Hölder behaviour; simple examples may be read from [35]. However, as the computations in Sect. 4 clearly show (see also [35] for an elementary example), these bounds are naturally
obtained only after permuting the order of integration by means of Fubini's theorem, so that the Fourier coordinates $|\xi_1|, \ldots, |\xi_n|$ appear in increasing order. There appear in the process integrals over domains which differ from the simplex $\{t \ge t_1 \ge \ldots \ge t_n \ge s\}$, which are particular instances of tree integrals, and that we call tree skeleton integrals. The next step is to regularize the tree skeleton integrals so that Fourier integrals converge at infinity, without losing the Chen and shuffle properties (ii) and (iii).

At this point it turns out to be both natural and necessary to re-interpret the above scheme in terms of tree Hopf algebra combinatorics. The interest in the study of Hopf algebras of trees or graphs arose from a series of papers by A. Connes and D. Kreimer [8–10] concerning the mathematical structures hidden behind the Bogoliubov-Parasiuk-Hepp-Zimmermann (BPHZ) procedure for renormalizing Feynman diagrams in quantum field theory [17], and is still very much alive, see for instance [3,4,6,7,13,20,25,36], with applications ranging from numerical methods to quantum chromodynamics, multi-zeta functions or operads. It appears that the shuffle property may be stated by saying that regularized skeleton integrals define characters of yet another Hopf algebra called shuffle algebra, while the Chen property follows from the very definition of the regularized iterated integrals as a convolution of regularized skeleton integrals.

We show that the tree skeleton integrals may be regularized by integrating over appropriate subdomains of $\mathbb{R}^n_\sigma$ avoiding the singular directions. The proof of properties (ii), (iii) uses Hopf combinatorics and does not depend on the choice of the above subdomains, while the proof of the Hölder estimates (i) uses both tree combinatorics and some elementary analysis relying on the shape of the subdomains. It seems natural to look for a less arbitrary regularization scheme for the skeleton integrals.
The idea of cancelling singularities by iteratively building counterterms, which originated from the BPHZ procedure, should also apply here. We plan to give such a construction (such as dimensional regularization for instance) in the near future.

Let us state our main result. Throughout the paper α ∈ (0, 1) is some fixed constant and $N = \lfloor 1/\alpha \rfloor$.

Main Theorem. Assume $1/\alpha \notin \mathbb{N}$. Let $\Gamma = (\Gamma(1), \ldots, \Gamma(d)) : \mathbb{R} \to \mathbb{R}^d$ be a compactly supported α-Hölder path. Then the functional $(\mathcal{R}\boldsymbol{\Gamma}^1, \ldots, \mathcal{R}\boldsymbol{\Gamma}^N)$ defined in Sect. 2 is an α-Hölder geometric rough path lying over $\Gamma$ in the sense of properties (i), (ii), (iii) of the Introduction.

In a companion paper [34], we construct by the same algorithm an explicit rough path over a d-dimensional fractional Brownian motion $B^\alpha = (B^\alpha(1), \ldots, B^\alpha(d))$ with arbitrary Hurst index α ∈ (0, 1); recall simply that the paths of $B^\alpha$ are a.s. κ-Hölder for every κ < α. The problem was up to now open for α ≤ 1/4 despite many attempts [11,32,33,12]. Fourier normal ordering turns out to be very efficient in combination with Gaussian tools, and provides explicit bounds for the moments of the rough path, seen as a path-valued random variable.

The above theorem extends to paths $\Gamma$ with finite 1/α-variation. Namely (see [23], [21] or also [14]), a simple change of variable $\Gamma \to \Gamma^\phi := \Gamma \circ \phi^{-1}$ turns $\Gamma$ into an α-Hölder path, with $\phi$ defined for instance as $\phi(t) := \sup_{n \ge 1} \sup_{0 = t_0 \le \ldots \le t_n = t} \sum_{j=0}^{n-1} \|\Gamma(t_{j+1}) - \Gamma(t_j)\|^{1/\alpha}$. The construction of the above theorem, applied to $\Gamma^\phi$, yields a family of paths with Hölder regularities α, 2α, . . . , Nα which may alternatively be seen as a $G^N$-valued α-Hölder path $\boldsymbol{\Gamma}^\phi$, where $G^N$ is the Carnot free nilpotent group of order N equipped with any subadditive homogeneous norm. Then (as proved in [23], Lemma 8) $\boldsymbol{\Gamma} := \boldsymbol{\Gamma}^\phi \circ \phi$ has finite 1/α-variation, which is equivalent to saying that $\boldsymbol{\Gamma}^n$ has finite 1/nα-variation for n = 1, . . . , N, and lies above $\Gamma$.
Corollary. Let α ∈ (0, 1) and α' < α. Then every α-Hölder path $\Gamma$ may be lifted to a strong α'-Hölder geometric rough path, namely, there exists a sequence of canonical lifts $\boldsymbol{\Gamma}^{(n)}$ of smooth paths $\Gamma^{(n)}$ converging to $\mathcal{R}\boldsymbol{\Gamma}$ for the sequence of α'-Hölder norms.

The set of strong α-Hölder geometric rough paths is strictly included in the set of general α-Hölder geometric rough paths. On the other hand, as we already alluded to above, a weak α-Hölder geometric rough path may be seen as a strong α'-Hölder geometric rough path if α' < α. This accounts for the loss of regularity in the corollary (see [14] for a precise discussion). The proviso $1/\alpha \notin \mathbb{N}$ in the statement of the main theorem is a priori needed because otherwise $\mathcal{R}\boldsymbol{\Gamma}^N$ may not be treated in the same way as the lower-order iterated integrals (although we do not know if it is actually necessary). However, if $1/\alpha \in \mathbb{N}$, all one has to do is replace α by a slightly smaller parameter α', so that the corollary holds even in this case.

Note that the present paper unfortunately gives no explicit way of approximating $\mathcal{R}\boldsymbol{\Gamma}$ by canonical lifts of smooth paths, i.e. of seeing it concretely as a strong geometric rough path. The question is currently under investigation in the particular case of fractional Brownian motion by using constructive field theory methods. Interestingly enough, the idea of controlling singularities by separating the Fourier scales according to a dyadic decomposition is at the core of constructive field theory [27].

Here is an outline of the article. A thorough presentation of iterated integrals, together with the skeleton integral variant, the implementation of Fourier normal ordering, and the extension to tree integrals, is given in Sect. 1, where $\Gamma$ is assumed to be smooth. The regularization algorithm is presented in Sect. 2; the regularized rough path $\mathcal{R}\boldsymbol{\Gamma}$ is defined there for an arbitrary α-Hölder path $\Gamma$. The proof of the Chen and shuffle properties is given in Sect. 3, where one may also find two abstract but more compact reformulations of the regularization algorithm, see Lemma 3.5 and Definition 3.7. Hölder estimates are to be found in Sect. 4. Finally, we gathered in an Appendix some technical facts about Besov spaces required for the construction.

Notations. We shall denote by $\mathcal{F}$ the Fourier transform,

\[ \mathcal{F} : L^2(\mathbb{R}^l) \to L^2(\mathbb{R}^l), \quad f \mapsto \mathcal{F}(f)(\xi) = \frac{1}{(2\pi)^{l/2}} \int_{\mathbb{R}^l} f(x)\, e^{-i\langle x, \xi\rangle}\, dx. \tag{0.6} \]
Throughout the article, $\Gamma : \mathbb{R} \to \mathbb{R}^d$ is some compactly supported α-Hölder path; sometimes, it is assumed to be smooth. The permutation group of {1, . . . , n} is denoted by $\Sigma_n$. Also, if $a, b : X \to \mathbb{R}_+$ are functions on some set X such that $a(x) \le C b(x)$ for every $x \in X$, we shall write $a \lesssim b$. Admissible cuts of a tree $\mathbb{T}$, see Subsect. 1.3, are usually denoted by $\mathbf{v}$ or $\mathbf{w}$, and we write $(\mathrm{Roo}_{\mathbf{v}}(\mathbb{T}), \mathrm{Lea}_{\mathbf{v}}(\mathbb{T}))$ (root part and leaves) instead of the traditional notation $(R^c \mathbb{T}, P^c \mathbb{T})$ due to Connes and Kreimer.

1. Iterated Integrals: Smooth Case

Let $t \mapsto \Gamma_t = (\Gamma_t(1), \ldots, \Gamma_t(d))$ be a d-dimensional, compactly supported, smooth path. The purpose of this section is to give proper notations for iterated integrals of $\Gamma$ and to introduce some tools which will pave the way for the regularization algorithm. Subsection 1.1 on tree iterated integrals is standard, see for instance [8]. We introduce permutation graphs and Fourier normal ordering for smooth paths in Subsect. 1.2. The tree Chen property, a generalization of the usual Chen property to tree iterated integrals, is recalled in Subsect. 1.3, in connection with the underlying Hopf algebraic
structure. Finally, a variant of iterated integrals called skeleton integrals is introduced in Subsect. 1.4, together with a variant of the tree Chen property that we call tree skeleton decomposition.

1.1. From iterated integrals to trees. It was noted already a long time ago [5] that iterated integrals could be encoded by trees, see also [20]. This remark has been exploited in connection with the construction of the rough path solution of partial, stochastic differential equations in [16]. The correspondence between trees and iterated integrals goes simply as follows:

Definition 1.1. A decorated rooted tree (to be drawn growing up) is a finite tree with a distinguished vertex called root and edges oriented downwards, i.e. directed towards the root, such that every vertex wears a positive integer label called decoration. If $\mathbb{T}$ is a decorated rooted tree, we let $V(\mathbb{T})$ be the set of its vertices (including the root), and $\ell : V(\mathbb{T}) \to \mathbb{N}$ be its decoration.

Definition 1.2 (tree partial ordering). Let $\mathbb{T}$ be a decorated rooted tree.
• Letting $v, w \in V(\mathbb{T})$, we say that v connects directly to w, and write $v \to w$ or equivalently $w = v^-$, if (v, w) is an edge oriented downwards from v to w. Note that $v^-$ exists and is unique except if v is the root.
• If $v_m \to v_{m-1} \to \ldots \to v_1$, then we shall write $v_m \twoheadrightarrow v_1$, and say that $v_m$ connects to $v_1$. By definition, all vertices (except the root) connect to the root.
• Let $(v_1, \ldots, v_{|V(\mathbb{T})|})$ be an ordering of $V(\mathbb{T})$. Assume that $(v_i \twoheadrightarrow v_j) \Rightarrow (i > j)$; in particular, $v_1$ is the root. Then we shall say that the ordering is compatible with the tree partial ordering defined by $\twoheadrightarrow$.

Definition 1.3 (tree integrals).
(i) Let $\Gamma = (\Gamma(1), \ldots, \Gamma(d))$ be a d-dimensional, compactly supported, smooth path, and $\mathbb{T}$ a rooted tree decorated by $\ell : V(\mathbb{T}) \to \{1, \ldots, d\}$. Then $I_{\mathbb{T}}(\Gamma) : \mathbb{R}^2 \to \mathbb{R}$ is the iterated integral defined as

\[ [I_{\mathbb{T}}(\Gamma)]_{ts} := \int_s^t d\Gamma_{x_{v_1}}(\ell(v_1)) \int_s^{x_{v_2^-}} d\Gamma_{x_{v_2}}(\ell(v_2)) \cdots \int_s^{x_{v^-_{|V(\mathbb{T})|}}} d\Gamma_{x_{v_{|V(\mathbb{T})|}}}(\ell(v_{|V(\mathbb{T})|})), \tag{1.1} \]

where $(v_1, \ldots, v_{|V(\mathbb{T})|})$ is any ordering of $V(\mathbb{T})$ compatible with the tree partial ordering. In particular, if $\mathbb{T}$ is a trunk tree with n vertices (see Fig. 1), so that the tree ordering is total, we shall write

\[ I_{\mathbb{T}}(\Gamma) = I^{\ell}_n(\Gamma), \tag{1.2} \]

where

\[ [I^{\ell}_n(\Gamma)]_{ts} := \int_s^t d\Gamma_{x_1}(\ell(1)) \int_s^{x_1} d\Gamma_{x_2}(\ell(2)) \cdots \int_s^{x_{n-1}} d\Gamma_{x_n}(\ell(n)). \tag{1.3} \]

(ii) Multilinear extension. Assume μ is a compactly supported, signed Borel measure on $\mathbb{R}^{\mathbb{T}} := \{(x_v)_{v \in V(\mathbb{T})},\ x_v \in \mathbb{R}\}$. Then

\[ [I_{\mathbb{T}}(\mu)]_{ts} := \int_s^t \int_s^{x_{v_2^-}} \cdots \int_s^{x_{v^-_{|V(\mathbb{T})|}}} \mu(dx_{v_1}, \ldots, dx_{v_{|V(\mathbb{T})|}}). \tag{1.4} \]
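Fig. 1. Trunk tree with set of vertices $\{n \to n-1 \to \ldots \to 1\}$

Fig. 2. Example 1.6. From left to right: $\mathbb{T}^\sigma_1$; $\mathbb{T}^\sigma_2$; $\mathrm{Roo}_{\{2\}} \mathbb{T}^\sigma_1 \otimes \mathrm{Lea}_{\{2\}} \mathbb{T}^\sigma_1$; $\mathrm{Roo}_{\{2,3\}} \mathbb{T}^\sigma_1 \otimes \mathrm{Lea}_{\{2,3\}} \mathbb{T}^\sigma_1$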
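Clearly, the definition of $[I_{\mathbb{T}}(\Gamma)]_{ts}$ given in Eq. (1.1) does not depend on the choice of the ordering $(v_1, \ldots, v_{|V(\mathbb{T})|})$. For instance, consider $\mathbb{T} = \mathbb{T}^\sigma_1$ to be the first tree in Fig. 2. Then

\[ [I_{\mathbb{T}}(\Gamma)]_{ts} = \int_s^t d\Gamma_{x_1}(\ell(1)) \left( \int_s^{x_1} d\Gamma_{x_2}(\ell(2)) \int_s^{x_1} d\Gamma_{x_3}(\ell(3)) \right) = \int_s^t d\Gamma_{x_1}(\ell(1)) \left( \int_s^{x_1} d\Gamma_{x_2}(\ell(3)) \int_s^{x_1} d\Gamma_{x_3}(\ell(2)) \right). \tag{1.5} \]

Note that the decoration of $\mathbb{T}$ is required only for (i). In case of ambiguity, we shall also use the decoration-independent notation $I_{\mathbb{T}}\left( \otimes_{v \in V(\mathbb{T})} \Gamma(\ell(v)) \right)$ instead of $I_{\mathbb{T}}(\Gamma)$.

The above correspondence extends by multilinearity to the algebra of decorated rooted trees defined by Connes and Kreimer [8], whose definition we now recall.

Definition 1.4 (algebra of decorated rooted trees).
(i) Let $\mathcal{T}$ be the set of decorated rooted trees.
(ii) Let $\mathbf{H}$ be the free commutative algebra over $\mathbb{R}$ generated by $\mathcal{T}$, with unit element denoted by e. If $\mathbb{T}_1, \mathbb{T}_2, \ldots, \mathbb{T}_l$ are decorated rooted trees, then the product $\mathbb{T}_1 \ldots \mathbb{T}_l$ is the forest with connected components $\mathbb{T}_1, \ldots, \mathbb{T}_l$.
(iii) Let $\mathbb{T} = \sum_{l=1}^{L} m_l \mathbb{T}_l \in \mathbf{H}$, where $m_l \in \mathbb{Z}$ and each $\mathbb{T}_l = \mathbb{T}_{l,1} \ldots \mathbb{T}_{l,j_l}$ is a forest whose decorations have values in the set {1, . . . , d}. Then

\[ [I_{\mathbb{T}}(\Gamma)]_{ts} := \sum_{l=1}^{L} m_l\, [I_{\mathbb{T}_{l,1}}(\Gamma)]_{ts} \cdots [I_{\mathbb{T}_{l,j_l}}(\Gamma)]_{ts}. \tag{1.6} \]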
1.2. Permutation graphs and Fourier normal ordering for smooth paths. As explained briefly in the Introduction, and as we shall see in the next sections, an essential step in our regularization algorithm is to rewrite iterated integrals by permuting the order of integration. We shall prove the following lemma in this subsection:
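Lemma 1.5 (permutation graphs). To every trunk tree $\mathbb{T}_n$ with n vertices and decoration $\ell$, and every permutation $\sigma \in \Sigma_n$, is associated in a canonical way an element $\mathbb{T}^\sigma$ of $\mathbf{H}$ called a permutation graph, such that:

(i)
\[ I^{\ell}_n(\Gamma) = I_{\mathbb{T}^\sigma}(\Gamma); \tag{1.7} \]

(ii)
\[ \mathbb{T}^\sigma = \sum_{j=1}^{J_\sigma} g(\sigma, j)\, \mathbb{T}^\sigma_j \in \mathbf{H}, \tag{1.8} \]

where $g(\sigma, j) = \pm 1$ and each $\mathbb{T}^\sigma_j$, $j = 1, \ldots, J_\sigma$ is a forest provided by construction with a total ordering compatible with its tree structure, image of the ordering $\{v_1 < \ldots < v_n\}$ of the trunk tree $\mathbb{T}_n$ by the permutation σ. The decoration of $\mathbb{T}^\sigma$ is $\ell \circ \sigma$.

Proof. Let $\sigma \in \Sigma_n$. Applying Fubini's theorem yields

\[ [I^{\ell}_n(\Gamma)]_{ts} = \int_s^t d\Gamma_{x_1}(\ell(1)) \int_s^{x_1} d\Gamma_{x_2}(\ell(2)) \cdots \int_s^{x_{n-1}} d\Gamma_{x_n}(\ell(n)) = \int_{s_1}^{t_1} d\Gamma_{x_{\sigma(1)}}(\ell(\sigma(1))) \int_{s_2}^{t_2} d\Gamma_{x_{\sigma(2)}}(\ell(\sigma(2))) \cdots \int_{s_n}^{t_n} d\Gamma_{x_{\sigma(n)}}(\ell(\sigma(n))), \tag{1.9} \]

with $s_1 = s$, $t_1 = t$, and for some suitable choice of $s_j \in \{s\} \cup \{x_{\sigma(i)}, i < j\}$, $t_j \in \{t\} \cup \{x_{\sigma(i)}, i < j\}$ ($j \ge 2$). Now decompose $\int_{s_j}^{t_j} d\Gamma_{x_{\sigma(j)}}(\ell(\sigma(j)))$ into

\[ \left( \int_s^{t_j} - \int_s^{s_j} \right) d\Gamma_{x_{\sigma(j)}}(\ell(\sigma(j))) \]

if $s_j \ne s$, $t_j \ne t$, and $\int_{s_j}^{t} d\Gamma_{x_{\sigma(j)}}(\ell(\sigma(j)))$ into

\[ \left( \int_s^{t} - \int_s^{s_j} \right) d\Gamma_{x_{\sigma(j)}}(\ell(\sigma(j))) \]

if $s_j \ne s$. Then $I^{\ell}_n(\Gamma)$ has been rewritten as a sum of terms of the form

\[ \pm \int_s^{\tau_1} d\Gamma_{x_1}(\ell(\sigma(1))) \int_s^{\tau_2} d\Gamma_{x_2}(\ell(\sigma(2))) \cdots \int_s^{\tau_n} d\Gamma_{x_n}(\ell(\sigma(n))), \tag{1.10} \]

where $\tau_1 = t$ and $\tau_j \in \{t\} \cup \{x_i, i < j\}$, $j = 2, \ldots, n$. Note the renaming of variables and vertices from Eq. (1.9) to Eq. (1.10). Encoding each of these expressions by the forest $\mathbb{T}$ with set of vertices $V(\mathbb{T}) = \{1, \ldots, n\}$, label function $\ell \circ \sigma$, roots $\{j = 1, \ldots, n \mid \tau_j = t\}$, and oriented edges $\{(j, j^-) \mid j = 2, \ldots, n,\ \tau_j = x_{j^-}\}$, yields

\[ I^{\ell}_n(\Gamma) = I_{\mathbb{T}^\sigma}(\Gamma) \tag{1.11} \]

for some $\mathbb{T}^\sigma \in \mathbf{H}$ as in Eq. (1.8).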
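The Fubini rearrangement used in this proof can be checked exactly on a small case. The sketch below is only an illustration under stated assumptions: it takes n = 3, the cyclic permutation with σ(1) = 2, σ(2) = 3, σ(3) = 1, and polynomial paths (so that every integral is computed in closed form with rational arithmetic). It compares the trunk-tree integral with the signed sum of a tree integral and a forest integral. All helper names are hypothetical.

```python
from fractions import Fraction


def p_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for a, pa in enumerate(p):
        for b, qb in enumerate(q):
            out[a + b] += pa * qb
    return out


def p_eval(p, x):
    """Value of the polynomial p at x."""
    return sum(c * x**k for k, c in enumerate(p))


def p_deriv(p):
    """Derivative of the polynomial p."""
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]


def p_integral_from(p, s):
    """Polynomial x -> integral of p over [s, x]."""
    prim = [Fraction(0)] + [Fraction(c) / (k + 1) for k, c in enumerate(p)]
    prim[0] = -p_eval(prim, s)
    return prim


def increment(p, s):
    """Polynomial x -> p(x) - p(s)."""
    return [p[0] - p_eval(p, s)] + list(p[1:])


def trunk_integral(g1, g2, g3, s, t):
    """Iterated integral over t >= x1 >= x2 >= x3 >= s of dg1(x1) dg2(x2) dg3(x3)."""
    inner = p_integral_from(p_mul(p_deriv(g2), increment(g3, s)), s)
    return p_eval(p_integral_from(p_mul(p_deriv(g1), inner), s), t)


def permuted_sum(g1, g2, g3, s, t):
    """Minus a tree integral plus a forest integral, integrating x2 first."""
    d2, inc3, inc1 = p_deriv(g2), increment(g3, s), increment(g1, s)
    tree = p_eval(p_integral_from(p_mul(d2, p_mul(inc3, inc1)), s), t)
    forest = p_eval(p_integral_from(p_mul(d2, inc3), s), t) * p_eval(inc1, t)
    return -tree + forest
```

With exact fractions the two functions return identical values for any polynomial inputs, which reflects the fact that the rearrangement is an identity and not an approximation.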
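Example 1.6. Let $\sigma = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$. Then

\[ \int_s^t d\Gamma_{x_1}(\ell(1)) \int_s^{x_1} d\Gamma_{x_2}(\ell(2)) \int_s^{x_2} d\Gamma_{x_3}(\ell(3)) = -\int_s^t d\Gamma_{x_2}(\ell(2)) \int_s^{x_2} d\Gamma_{x_3}(\ell(3)) \int_s^{x_2} d\Gamma_{x_1}(\ell(1)) + \int_s^t d\Gamma_{x_2}(\ell(2)) \int_s^{x_2} d\Gamma_{x_3}(\ell(3)) \cdot \int_s^t d\Gamma_{x_1}(\ell(1)). \tag{1.12} \]

Hence $\mathbb{T}^\sigma = -\mathbb{T}^\sigma_1 + \mathbb{T}^\sigma_2$ is the sum of a tree and of a forest with two components. See Fig. 2, where variables and vertices have been renamed according to the permutation σ.

1.3. Tree Chen property and coproduct structure. The Chen property (ii), see Introduction, may be generalized to tree iterated integrals by using the coproduct structure of $\mathbf{H}$, as explained in [8]. It is an essential feature of our algorithm since it implies the possibility to reconstruct a rough path $\boldsymbol{\Gamma}$ from the quantities $t \mapsto \boldsymbol{\Gamma}^n_{t s_0}$ with fixed $s_0$. This idea will be pursued further in the next subsection, where we shall introduce a variant of these iterated integrals with fixed $s_0$ called skeleton integrals.

Definition 1.7 (admissible cuts). (See [8], Sect. 2.)
1. Let $\mathbb{T}$ be a tree, with set of vertices $V(\mathbb{T})$ and root denoted by 0. If $\mathbf{v} = (v_1, \ldots, v_J)$, $J \ge 1$ is any totally disconnected subset of $V(\mathbb{T}) \setminus \{0\}$, i.e. $v_i \not\twoheadrightarrow v_j$ for all $i, j = 1, \ldots, J$, then we shall say that $\mathbf{v}$ is an admissible cut of $\mathbb{T}$, and write $\mathbf{v} \models V(\mathbb{T})$. We let $\mathrm{Lea}_{\mathbf{v}} \mathbb{T}$ (read: leaves of $\mathbb{T}$) be the sub-forest (or sub-tree if J = 1) obtained by keeping only the vertices above $\mathbf{v}$, i.e. $V(\mathrm{Lea}_{\mathbf{v}} \mathbb{T}) = \mathbf{v} \cup \{w \in V(\mathbb{T}) : \exists j = 1, \ldots, J,\ w \twoheadrightarrow v_j\}$, and $\mathrm{Roo}_{\mathbf{v}} \mathbb{T}$ (read: root part of $\mathbb{T}$) be the sub-tree obtained by keeping all other vertices.
2. Let $\mathbb{T} = \mathbb{T}_1 \ldots \mathbb{T}_l$ be a forest, together with its decomposition into trees. Then an admissible cut of $\mathbb{T}$ is a disjoint union $\mathbf{v}_1 \cup \ldots \cup \mathbf{v}_l$, $\mathbf{v}_i \subset \mathbb{T}_i$, where $\mathbf{v}_i$ is either ∅, $\{0_i\}$ (root of $\mathbb{T}_i$) or an admissible cut of $\mathbb{T}_i$; by convention, the two trivial cuts $\emptyset \cup \ldots \cup \emptyset$ and $\{0_1\} \cup \ldots \cup \{0_l\}$ are excluded. By definition, we let $\mathrm{Roo}_{\mathbf{v}} \mathbb{T} = \mathrm{Roo}_{\mathbf{v}_1} \mathbb{T}_1 \ldots \mathrm{Roo}_{\mathbf{v}_l} \mathbb{T}_l$, $\mathrm{Lea}_{\mathbf{v}} \mathbb{T} = \mathrm{Lea}_{\mathbf{v}_1} \mathbb{T}_1 \ldots \mathrm{Lea}_{\mathbf{v}_l} \mathbb{T}_l$ (if $\mathbf{v}_i = \emptyset$, resp. $\{0_i\}$, then $(\mathrm{Roo}_{\mathbf{v}_i} \mathbb{T}_i, \mathrm{Lea}_{\mathbf{v}_i} \mathbb{T}_i) := (\mathbb{T}_i, \emptyset)$, resp. $(\emptyset, \mathbb{T}_i)$).

See Figs. 3, 4 and 2.

Defining the co-product operation

\[ \Delta : \mathbf{H} \to \mathbf{H} \otimes \mathbf{H}, \quad \mathbb{T} \mapsto e \otimes \mathbb{T} + \mathbb{T} \otimes e + \sum_{\mathbf{v} \models V(\mathbb{T})} \mathrm{Roo}_{\mathbf{v}} \mathbb{T} \otimes \mathrm{Lea}_{\mathbf{v}} \mathbb{T}, \tag{1.13} \]

where e stands for the unit element, yields a coalgebra structure on $\mathbf{H}$. One may also define an antipode S, which makes $\mathbf{H}$ a Hopf algebra (see Sect. 3 for more details).

We may now state the tree Chen property. Recall from the Introduction that $[\delta f]_{tus} := f_{ts} - f_{tu} - f_{us}$ if f is a function of two variables.

Proposition 1.8 (tree Chen property). (See [20] or [16].) Let $\mathbb{T}$ be a forest, then

\[ [\delta I_{\mathbb{T}}(\Gamma)]_{tus} = \sum_{\mathbf{v} \models V(\mathbb{T})} [I_{\mathrm{Roo}_{\mathbf{v}} \mathbb{T}}(\Gamma)]_{tu}\, [I_{\mathrm{Lea}_{\mathbf{v}} \mathbb{T}}(\Gamma)]_{us}. \tag{1.14} \]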
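The admissible cuts of a single tree, over which the sums in Eqs. (1.13) and (1.14) run, can be enumerated by brute force. The following is a minimal sketch, assuming that the tree is encoded as a dictionary sending each non-root vertex to the vertex it connects directly to; this encoding and the function names are hypothetical choices made for illustration, and forests are not handled.

```python
from itertools import combinations


def ancestors(parent, v):
    """Vertices that v connects to (all vertices strictly below v, down to the root)."""
    out = set()
    while v in parent:
        v = parent[v]
        out.add(v)
    return out


def admissible_cuts(parent):
    """All non-empty sets of non-root vertices, no two of which are connected."""
    vertices = sorted(parent)  # the keys of `parent` are the non-root vertices
    cuts = []
    for size in range(1, len(vertices) + 1):
        for subset in combinations(vertices, size):
            chosen = set(subset)
            if all(not (ancestors(parent, v) & chosen) for v in subset):
                cuts.append(subset)
    return cuts


def split(parent, root, cut):
    """Vertex sets of the root part and of the leaves for a given admissible cut."""
    cut = set(cut)
    leaves = {v for v in parent if v in cut or (ancestors(parent, v) & cut)}
    roots = ({root} | set(parent)) - leaves
    return sorted(roots), sorted(leaves)


if __name__ == "__main__":
    bush = {2: 1, 3: 1}   # vertices 2 and 3 both connect directly to the root 1
    print(admissible_cuts(bush))
```

For a trunk tree with n vertices this gives n - 1 cuts, one per non-root vertex, which is the usual Chen property; for the tree with two branches above the root, the cut {2, 3} appears in addition to {2} and {3}.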
Fig. 3. Admissible cut

Fig. 4. Non-admissible cut
This proposition is illustrated in the discussion following Lemma 1.12 in the upcoming paragraph.

1.4. Skeleton integrals. We now introduce a variant of tree iterated integrals that we call tree skeleton integrals, or simply skeleton integrals. We explain after Eq. (1.23) below the reason why we shall use skeleton integrals instead of the usual iterated integrals as building stones for our construction.

Definition 1.9 (formal integral). Let $f : \mathbb{R} \to \mathbb{R}$ be a smooth, compactly supported function such that $\mathcal{F}f(0) = 0$. Then the formal integral $\int^t f$ of f is defined as

\[ \int^t f := \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} (\mathcal{F}f)(\xi)\, \frac{e^{it\xi}}{i\xi}\, d\xi. \tag{1.15} \]

The condition $\mathcal{F}f(0) = 0$ prevents possible infra-red divergence when $\xi \to 0$. Note that

\[ \int^t f - \int^s f = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} (\mathcal{F}f)(\xi) \left( \int_s^t e^{ix\xi}\, dx \right) d\xi = \int_s^t f(x)\, dx \tag{1.16} \]

by the Fourier inversion formula, so $\int^t f$ is an anti-derivative of f. Formally one may write, as an equality of distributions:

\[ \int^t e^{ix\xi}\, dx = \int_{\pm i\infty}^t e^{ix\xi}\, dx = \frac{e^{it\xi}}{i\xi}, \tag{1.17} \]

since $\int_{-\infty}^{+\infty} \frac{e^{ix\xi}}{i\xi}\, \phi(\xi)\, d\xi \to_{x \to \infty} 0$ for any test function φ such that φ(0) = 0. Hence

\[ \int^t f = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} d\xi\, (\mathcal{F}f)(\xi) \int^t e^{ix\xi}\, dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} (\mathcal{F}f)(\xi)\, \frac{e^{it\xi}}{i\xi}\, d\xi, \tag{1.18} \]

in coherence with Eq. (1.15).
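The single-frequency computation behind Eqs. (1.16) and (1.17) can be checked numerically. The sketch below is only an illustration with hypothetical function names: it compares the difference of the formal antiderivative $e^{it\xi}/(i\xi)$ at the two endpoints with a midpoint-rule approximation of $\int_s^t e^{ix\xi}\, dx$.

```python
import cmath


def formal_antiderivative(t, xi):
    """Formal integral of x -> exp(i x xi) up to t, as in Eq. (1.17); requires xi != 0."""
    return cmath.exp(1j * t * xi) / (1j * xi)


def midpoint_integral(s, t, xi, steps=20000):
    """Midpoint-rule approximation of the integral of exp(i x xi) over [s, t]."""
    h = (t - s) / steps
    return sum(cmath.exp(1j * (s + (k + 0.5) * h) * xi) for k in range(steps)) * h


if __name__ == "__main__":
    s, t, xi = 0.3, 2.0, 5.0
    exact = formal_antiderivative(t, xi) - formal_antiderivative(s, xi)
    print(abs(exact - midpoint_integral(s, t, xi)))
```

Taking s = 0 in the difference produces the additional constant $-1/(i\xi)$, which is absent from the formal antiderivative itself.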
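Definition 1.10 (skeleton integrals).
(i) Let $\mathbb{T}$ be a tree with decoration $\ell : \mathbb{T} \to \{1, \ldots, d\}$. Let $(v_1, \ldots, v_{|V(\mathbb{T})|})$ be any ordering of $V(\mathbb{T})$ compatible with the tree partial ordering. Then the skeleton integral of $\Gamma$ along $\mathbb{T}$ is by definition

\[ [\mathrm{SkI}_{\mathbb{T}}(\Gamma)]_t := \int^t d\Gamma_{x_{v_1}}(\ell(v_1)) \int^{x_{v_2^-}} d\Gamma_{x_{v_2}}(\ell(v_2)) \cdots \int^{x_{v^-_{|V(\mathbb{T})|}}} d\Gamma_{x_{v_{|V(\mathbb{T})|}}}(\ell(v_{|V(\mathbb{T})|})). \tag{1.19} \]

(ii) Extension to forests. Let $\mathbb{T} = \mathbb{T}_1 \ldots \mathbb{T}_l$ be a forest, with its tree decomposition. Then one defines

\[ [\mathrm{SkI}_{\mathbb{T}}(\Gamma)]_t := \prod_{j=1}^{l} [\mathrm{SkI}_{\mathbb{T}_j}(\Gamma)]_t. \tag{1.20} \]

(iii) Multilinear extension, see Definition 1.3. Assume $\mathbb{T}$ is a subtree of $\tilde{\mathbb{T}}$, and μ a compactly supported, signed Borel measure on $\mathbb{R}^{\tilde{\mathbb{T}}} := \{(x_v)_{v \in V(\tilde{\mathbb{T}})},\ x_v \in \mathbb{R}\}$. Then

\[ [\mathrm{SkI}_{\mathbb{T}}(\mu)]_t := \int^t \int^{x_{v_2^-}} \cdots \int^{x_{v^-_{|V(\mathbb{T})|}}} \mu(dx_{v_1}, \ldots, dx_{v_{|V(\mathbb{T})|}}) \tag{1.21} \]

is a signed Borel measure on $\{(x_v)_{v \in V(\tilde{\mathbb{T}}) \setminus V(\mathbb{T})},\ x_v \in \mathbb{R}\}$.

Formally again, $[\mathrm{SkI}_{\mathbb{T}}(\Gamma)]_t$ may be seen as $[I_{\mathbb{T}}(\Gamma)]_{t, \pm i\infty}$. Denote by $\hat{\mu}$ the partial Fourier transform of μ with respect to $(x_v)_{v \in V(\mathbb{T})}$, so that

\[ \hat{\mu}\big( (\xi_v)_{v \in V(\mathbb{T})}, (dx_v)_{v \in V(\tilde{\mathbb{T}}) \setminus V(\mathbb{T})} \big) = (2\pi)^{-|V(\mathbb{T})|/2} \left\langle \mu,\ (x_v)_{v \in V(\mathbb{T})} \mapsto e^{-i \sum_{v \in V(\mathbb{T})} x_v \xi_v} \right\rangle. \tag{1.22} \]

Then

\[ [\mathrm{SkI}_{\mathbb{T}}(\mu)]_t = (2\pi)^{-|V(\mathbb{T})|/2} \left\langle \hat{\mu},\ \left[ \mathrm{SkI}_{\mathbb{T}}\left( (x_v)_{v \in V(\mathbb{T})} \mapsto e^{i \sum_{v \in V(\mathbb{T})} x_v \xi_v} \right) \right]_t \right\rangle. \tag{1.23} \]

As explained in the previous subsection, tree skeleton integrals are straightforward generalizations of the usual tree iterated integrals. They are very natural when computing in Fourier coordinates, because every successive integration brings about a new ξ-factor in the denominator, allowing easy Hölder estimates using Besov norms (see the Appendix). On the contrary, $\int_0^t e^{ix\xi}\, dx = \frac{e^{it\xi}}{i\xi} - \frac{1}{i\xi}$ contains a constant term $-\frac{1}{i\xi}$ which does not improve when one integrates again. It is the purpose of Sect. 3 to show that a rough path over an α-Hölder path $\Gamma$ may be obtained from adequately regularized tree skeleton integrals, using the following tree skeleton decomposition, which is a variant of the tree Chen property recalled in Proposition 1.8 above.

Definition 1.11 (multiple cut). Let $\mathbf{v} \subset V(\mathbb{T})$, $\mathbf{v} \ne \emptyset$. If $w \in \mathbf{v}$, one calls $\mathrm{Lev}(w) := 1 + |\{w' \in \mathbf{v};\ w \twoheadrightarrow w'\}|$ the level of w. If $\mathbf{v} \models V(\mathbb{T})$ is an admissible cut, then Lev(w) = 1 for all $w \in \mathbf{v}$. Quite generally, letting $\mathrm{Lev}(\mathbf{v}) = \max\{\mathrm{Lev}(w);\ w \in \mathbf{v}\}$, one writes $\mathbf{v}^j := \{w \in \mathbf{v};\ \mathrm{Lev}(w) = j\}$ for $1 \le j \le \mathrm{Lev}(\mathbf{v})$, and calls $(\mathbf{v}^j)_{j=1,\ldots,\mathrm{Lev}(\mathbf{v})}$ the level decomposition of $\mathbf{v}$ considered as a multiple cut. One shall also write $\mathbf{v}^1 \models \ldots \models \mathbf{v}^{\mathrm{Lev}(\mathbf{v})} \models V(\mathbb{T})$, since $\mathbf{v}^{\mathrm{Lev}(\mathbf{v})} \models V(\mathbb{T})$ and each $\mathbf{v}^j$, $j = 1, \ldots, \mathrm{Lev}(\mathbf{v}) - 1$ is an admissible cut of $\mathrm{Roo}_{\mathbf{v}^{j+1}}(\mathbb{T})$.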
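Lemma 1.12 (tree skeleton decomposition). Let $\mathbb{T}$ be a tree. Then:

(i) Recursive version.

\[ [I_{\mathbb{T}}(\Gamma)]_{tu} = [\delta \mathrm{SkI}_{\mathbb{T}}(\Gamma)]_{tu} - \sum_{\mathbf{v} \models V(\mathbb{T})} [I_{\mathrm{Roo}_{\mathbf{v}} \mathbb{T}}(\Gamma)]_{tu} \cdot [\mathrm{SkI}_{\mathrm{Lea}_{\mathbf{v}} \mathbb{T}}(\Gamma)]_u, \tag{1.24} \]

(ii) Non-recursive version.

\[ [I_{\mathbb{T}}(\Gamma)]_{tu} = [\delta \mathrm{SkI}_{\mathbb{T}}(\Gamma)]_{tu} + \sum_{l \ge 1} \sum_{\mathbf{v}^1 \models \ldots \models \mathbf{v}^l \models V(\mathbb{T})} (-1)^{|\mathbf{v}^1| + \ldots + |\mathbf{v}^l|}\, [\delta \mathrm{SkI}_{\mathrm{Roo}_{\mathbf{v}^1}(\mathbb{T})}(\Gamma)]_{tu} \prod_{m=1}^{l-1} \left[ \mathrm{SkI}_{\mathrm{Lea}_{\mathbf{v}^m} \circ \mathrm{Roo}_{\mathbf{v}^{m+1}}(\mathbb{T})}(\Gamma) \right]_u [\mathrm{SkI}_{\mathrm{Lea}_{\mathbf{v}^l}(\mathbb{T})}(\Gamma)]_u. \tag{1.25} \]

Proof. Same as for Proposition 1.8. Equation (1.24) may formally be seen as a particular case of the Chen property (1.14) by setting $s = \pm i\infty$ (see the previous subsection). The non-recursive version may be deduced from the recursive version in a straightforward way.

Let us illustrate these notions in a more pedestrian way for the reader who is not accustomed to tree integrals. Consider for example the trunk tree $\mathbb{T}_n$ with vertices $n \to n-1 \to \ldots \to 1$ and decoration $\ell : \{1, \ldots, n\} \to \{1, \ldots, d\}$, and the associated iterated integral

\[ [I^{\ell}_n(\Gamma)]_{ts} = [I_{\mathbb{T}_n}(\Gamma)]_{ts} = \int_s^t d\Gamma_{x_1}(\ell(1)) \cdots \int_s^{x_{n-1}} d\Gamma_{x_n}(\ell(n)). \tag{1.26} \]

Cutting $\mathbb{T}_n$ at some vertex $v \in \{2, \ldots, n\}$ produces two trees, $\mathrm{Roo}_v \mathbb{T}_n$ and $\mathrm{Lea}_v \mathbb{T}_n$, with respective vertex subsets {1, . . . , v − 1} and {v, . . . , n}. Then the usual Chen property (ii) in the Introduction reads

\[ [\delta I_{\mathbb{T}_n}(\Gamma)]_{tus} = \sum_{v \in V(\mathbb{T}_n) \setminus \{1\}} [I_{\mathrm{Roo}_v \mathbb{T}_n}(\Gamma)]_{tu}\, [I_{\mathrm{Lea}_v \mathbb{T}_n}(\Gamma)]_{us}. \tag{1.27} \]

On the other hand, rewrite $[I_{\mathbb{T}_n}(\Gamma)]_{tu}$ as the sum of the increment term, which is a skeleton integral,

\[ [\delta \mathrm{SkI}_{\mathbb{T}_n}(\Gamma)]_{tu} = \int^t d\Gamma_{x_1}(\ell(1)) \int^{x_1} d\Gamma_{x_2}(\ell(2)) \cdots \int^{x_{n-1}} d\Gamma_{x_n}(\ell(n)) - \int^u d\Gamma_{x_1}(\ell(1)) \int^{x_1} d\Gamma_{x_2}(\ell(2)) \cdots \int^{x_{n-1}} d\Gamma_{x_n}(\ell(n)), \tag{1.28} \]

and of the boundary term

\[ [I_{\mathbb{T}_n}(\Gamma)(\partial)]_{tu} := -\sum_{n_1 + n_2 = n} \int_u^t d\Gamma_{x_1}(\ell(1)) \cdots \int_u^{x_{n_1 - 1}} d\Gamma_{x_{n_1}}(\ell(n_1)) \cdot \int^u d\Gamma_{x_{n_1+1}}(\ell(n_1 + 1)) \int^{x_{n_1+1}} d\Gamma_{x_{n_1+2}}(\ell(n_1 + 2)) \cdots \int^{x_{n-1}} d\Gamma_{x_n}(\ell(n)). \tag{1.29} \]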
The above decomposition is fairly obvious for n = 2 and obtained by easy induction for general n. One has thus obtained the recursive skeleton decomposition property for trunk trees, [ITn ()]tu = [δSkITn ()]tu − [I Roov Tn ()]tu .[SkI Leav Tn ()]u . (1.30) v∈V (Tn )\{1}
The non-recursive version of the skeleton decomposition property is a straightforward consequence, and reads in this case [ITn ()]tu = [δSkITn ()]tu + (−1)l [δSkI Roo j1 (Tn ) ()]tu j1 <...< jl
l≥1
×
l−1
[SkI Lea jm ◦Roo jm+1 (Tn ) ()]u [SkI Lea jl (Tn ) ()]u ,
(1.31)
m=1
where $Lea_{j_m} \circ Roo_{j_{m+1}} \mathbb{T}_n$ is the piece of $\mathbb{T}_n$ with subset of vertices ranging in $\{j_m, \ldots, j_{m+1} - 1\}$.

2. Regularization: The Fourier Normal Ordering Step by Step

We now come back to the original problem and assume $\Gamma$ is a d-dimensional α-Hölder, compactly supported, non-smooth path. Then none of the previous definitions relative to iterated integrals make sense. However, one may rewrite these as diverging series such that every term is well-defined. This follows easily from the Besov decomposition given in the Appendix. Let us recall briefly, referring to the Appendix for details and notations, that $\Gamma$ may be decomposed as $\sum_{k \in \mathbb{Z}} D(\phi_k)\Gamma$, where $(\phi_k)_{k \in \mathbb{Z}}$ is a dyadic partition of unity, and $D(\phi_k)\Gamma = \mathcal{F}^{-1}(\phi_k \cdot \mathcal{F}\Gamma)$. The Fourier transform $\mathcal{F}$ has been introduced at the end of the Introduction. Since $\phi_k \cdot \mathcal{F}\Gamma$ is a compactly supported $C^\infty$ function,
\[ D(\phi_k)\Gamma : x \mapsto \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} \phi_k(\xi) (\mathcal{F}\Gamma)(\xi) e^{ix\xi} \, d\xi \tag{2.1} \]
is a $C^\infty$-function, and it makes perfect sense to integrate the $D(\phi_k)\Gamma(i)$, $k \in \mathbb{Z}$, $1 \le i \le d$, against each other. We suggest the following definition, where $\mathbb{T} \in \mathcal{T}$ is a fixed tree. All $\mathcal{P}$-projections below extend to measures $\mu \in Meas(\mathbb{R}^{\mathbb{T}})$, where $\mathbb{R}^{\mathbb{T}} := \{(x_v)_{v \in V(\mathbb{T})}, x_v \in \mathbb{R}\}$.

Definition 2.1 ($\mathcal{P}$-projections).

(i) Let, for $\mathbf{k} \in \mathbb{Z}^{\mathbb{T}} := \{(k_v)_{v \in V(\mathbb{T})}, k_v \in \mathbb{Z}\}$,
\[ \mathcal{P}^{\{\mathbf{k}\}}(\Gamma) := \otimes_{v \in V(\mathbb{T})} D(\phi_{k_v})(\Gamma(\ell(v))). \tag{2.2} \]
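(ii) Similarly, let $U \subset \mathbb{Z}^{\mathbb{T}}$. Then
\[ \mathcal{P}^{U}(\Gamma) := \sum_{\mathbf{k} = (k_v)_{v \in V(\mathbb{T})} \in U} \mathcal{P}^{\{\mathbf{k}\}}(\Gamma). \tag{2.3} \]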
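(iii) Let in particular $\mathcal{P}^{+,\mathbb{T}}$ be the $\mathcal{P}$-projection associated to the subset
\[ U = \mathbb{Z}^{\mathbb{T}}_+ := \{(k_v)_{v \in V(\mathbb{T})} \in \mathbb{Z}^{\mathbb{T}} \mid (v \twoheadrightarrow w) \Rightarrow |k_v| \ge |k_w|\}. \tag{2.4} \]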
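If $\mathbb{T} = \mathbb{T}_n$ is the trunk tree with n vertices $\{n \to \ldots \to 1\}$ and decoration $\ell : j \mapsto j$, $j = 1, \ldots, n$, see Fig. 1, we shall simply write $\mathcal{P}^+$ instead of $\mathcal{P}^{+,\mathbb{T}_n}$. More generally, if a tree $\mathbb{T}$ is equipped with a partial or total ordering > compatible with its tree ordering, we let $\mathcal{P}^+ := \mathcal{P}^{U_>}$ with $U_> := \{(k_v)_{v \in V(\mathbb{T})} \in \mathbb{Z}^{\mathbb{T}} \mid (v > w) \Rightarrow |k_v| \ge |k_w|\}$.

(iv) Using the Fourier multipliers $D(\tilde\phi_{k_v})$ instead of $D(\phi_{k_v})$, see Definition 5.3, define similarly
\[ \tilde{\mathcal{P}}^{\{\mathbf{k}\}}(\Gamma) := \frac{1}{|\Sigma_{\mathbf{k}}|} \otimes_{v \in V(\mathbb{T})} D(\tilde\phi_{k_v})(\Gamma(\ell(v))), \tag{2.5} \]
where $\Sigma_{\mathbf{k}} \subset \Sigma_n$ is the subset of permutations τ such that $|k_{\tau(j)}| = |k_j|$ for every j = 1, ..., n, and
\[ \tilde{\mathcal{P}}^{+}(\Gamma) := \sum_{\mathbf{k} = (k_v)_{v \in V(\mathbb{T})} \in U_>} \tilde{\mathcal{P}}^{\{\mathbf{k}\}}(\Gamma). \tag{2.6} \]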
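One way to read the weight $1/|\Sigma_{\mathbf{k}}|$ in (2.5) is as a compensation for ties: when the projections are later summed over all orderings σ with $|k_{\sigma(1)}| \le \ldots \le |k_{\sigma(n)}|$, a multi-index with repeated moduli is picked up by several permutations. The Python sketch below checks this count on small examples. The function names and the 0-based indexing are illustrative choices, not notation from the text.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations
from math import factorial


def sorting_permutations(k):
    """All permutations sigma (tuples of 0-based indices) such that
    |k[sigma[0]]| <= |k[sigma[1]]| <= ... <= |k[sigma[n-1]]|."""
    n = len(k)
    found = []
    for sigma in permutations(range(n)):
        moduli = [abs(k[j]) for j in sigma]
        if all(moduli[i] <= moduli[i + 1] for i in range(n - 1)):
            found.append(sigma)
    return found


def stabilizer_size(k):
    """Size of Sigma_k: permutations tau with |k[tau[j]]| == |k[j]| for all j.
    It is the product of the factorials of the multiplicities of the moduli."""
    size = 1
    for multiplicity in Counter(abs(x) for x in k).values():
        size *= factorial(multiplicity)
    return size


def total_weight(k):
    """Sum of the weight 1/|Sigma_k| over all orderings that sort |k|."""
    weight = Fraction(1, stabilizer_size(k))
    return sum(weight for _ in sorting_permutations(k))
```

For any multi-index the number of sorting permutations equals $|\Sigma_{\mathbf{k}}|$, so `total_weight(k)` is 1: each multi-index is counted exactly once after normalization.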
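Remark. By construction, $\mathcal{P}^+ \tilde{\mathcal{P}}^+ = \tilde{\mathcal{P}}^+$ if $\mathcal{P}^+$, $\tilde{\mathcal{P}}^+$ are associated to a total ordering compatible with the tree ordering of $\mathbb{T}$.

Note that $\mathcal{P}^U$ may be considered as a linear operator $\mathcal{P}^U : (B^\alpha_{\infty,\infty})^{\otimes \mathbb{T}} \to (B^\alpha_{\infty,\infty})^{\otimes \mathbb{T}}$, where $(B^\alpha_{\infty,\infty})^{\otimes \mathbb{T}}$ stands for the vector space generated by the monomials $\otimes_{v \in V(\mathbb{T})} f_v$, $f_v \in B^\alpha_{\infty,\infty}$. It is actually a bounded linear operator, as recalled in the Appendix, see Proposition 5.8 and remarks after Proposition 5.2. We may now proceed to explain our regularization algorithm.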
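• Step 1 (Choice of regularization scheme). Choose for each tree $\mathbb{T} \in \mathcal{T}$ a subset $\mathbb{Z}^{\mathbb{T}}_{reg} \subset \mathbb{Z}^{\mathbb{T}}_+$ such that the series $\sum_{\mathbf{k} \in \mathbb{Z}^{\mathbb{T}}_{reg}} [SkI_{\mathbb{T}}(\mathcal{P}^{\{\mathbf{k}\}}(\Gamma))]_t$ converges absolutely for any α-Hölder path $\Gamma$. By assumption $\mathbb{Z}^{\mathbb{T}}_{reg} = \mathbb{Z}$ if $|V(\mathbb{T})| = 1$.

• Step 2. Let $\mathbb{T}$ be a forest equipped with a partial or total ordering compatible with its tree ordering, and $\tilde{\mathcal{P}}^+$ the corresponding projection operator. For $\mathbf{k} \in \mathbb{Z}^{\mathbb{T}}_+$, we let the projected regularized skeleton integral be the quantity
\[ [\mathcal{R}^{\{\mathbf{k}\}} SkI_{\mathbb{T}}(\tilde{\mathcal{P}}^+ \Gamma)]_t = 1_{\mathbf{k} \in \mathbb{Z}^{\mathbb{T}}_{reg}} \cdot [SkI_{\mathbb{T}}(\mathcal{P}^{\{\mathbf{k}\}} \tilde{\mathcal{P}}^+ \Gamma)]_t . \tag{2.7} \]

• Step 3 (Regularized projected tree integral). For $\mathbf{k} \in \mathbb{Z}^{\mathbb{T}}_+$, let $[\mathcal{R}^{\{\mathbf{k}\}} I_{\mathbb{T}}(\tilde{\mathcal{P}}^+ \Gamma)]_{ts}$ be constructed out of projected regularized skeleton integrals in the following recursive way, as in Lemma 1.12:
\[ [\mathcal{R}^{\{\mathbf{k}\}} I_{\mathbb{T}}(\tilde{\mathcal{P}}^+ \Gamma)]_{ts} := [\delta \mathcal{R}^{\{\mathbf{k}\}} SkI_{\mathbb{T}}(\tilde{\mathcal{P}}^+ \Gamma)]_{ts} - \sum_{\mathbf{v} \models V(\mathbb{T})} [\mathcal{R}^{\{Roo_{\mathbf{v}}(\mathbf{k})\}} I_{Roo_{\mathbf{v}}(\mathbb{T})}(\tilde{\mathcal{P}}^+ \Gamma)]_{ts} \, [\mathcal{R}^{\{Lea_{\mathbf{v}}(\mathbf{k})\}} SkI_{Lea_{\mathbf{v}} \mathbb{T}}(\tilde{\mathcal{P}}^+ \Gamma)]_s , \tag{2.8} \]
where $Roo_{\mathbf{v}}(\mathbf{k}) = (k_w)_{w \in Roo_{\mathbf{v}}(\mathbb{T})} \in \mathbb{Z}^{Roo_{\mathbf{v}}(\mathbb{T})}$, and $Lea_{\mathbf{v}}(\mathbf{k}) = (k_w)_{w \in Lea_{\mathbf{v}}(\mathbb{T})} \in \mathbb{Z}^{Lea_{\mathbf{v}}(\mathbb{T})}$.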
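• Step 4 (Generalization to forests). The generalization is straightforward. Namely, if $\mathbb{T} = \mathbb{T}_1 \ldots \mathbb{T}_l$ is a forest, and $\mathbf{k} = (\mathbf{k}_1, \ldots, \mathbf{k}_l) \in \mathbb{Z}^{\mathbb{T}_1}_+ \times \ldots \times \mathbb{Z}^{\mathbb{T}_l}_+$, we let
\[ \mathcal{R}^{\{\mathbf{k}\}} SkI_{\mathbb{T}}(\tilde{\mathcal{P}}^+ \Gamma) := \prod_{j=1}^{l} \mathcal{R}^{\{\mathbf{k}_j\}} SkI_{\mathbb{T}_j}(\tilde{\mathcal{P}}^+ \Gamma) \tag{2.9} \]
and similarly
\[ \mathcal{R}^{\{\mathbf{k}\}} I_{\mathbb{T}}(\tilde{\mathcal{P}}^+ \Gamma) := \prod_{j=1}^{l} \mathcal{R}^{\{\mathbf{k}_j\}} I_{\mathbb{T}_j}(\tilde{\mathcal{P}}^+ \Gamma). \tag{2.10} \]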
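Consider a partial or total ordering > on $\mathbb{T}$ and denote by $\tilde{\mathcal{P}}^+$ the corresponding projection operator. By summing over all indices $\mathbf{k} \in U_>$, one gets the following quantities:
\[ \mathcal{R} SkI_{\mathbb{T}}(\tilde{\mathcal{P}}^+ \Gamma) := \sum_{\mathbf{k} \in U_>} \mathcal{R}^{\{\mathbf{k}\}} SkI_{\mathbb{T}}(\tilde{\mathcal{P}}^+ \Gamma) \tag{2.11} \]
(see Definition 2.1), and similarly
\[ \mathcal{R} I_{\mathbb{T}}(\tilde{\mathcal{P}}^+ \Gamma) := \sum_{\mathbf{k} \in U_>} \mathcal{R}^{\{\mathbf{k}\}} I_{\mathbb{T}}(\tilde{\mathcal{P}}^+ \Gamma). \tag{2.12} \]
Observe in particular, using Eq. (2.8), and summing over indices $\mathbf{k}$, that $\mathcal{R} I_{\mathbb{T}}(\tilde{\mathcal{P}}^+ \Gamma)$ decomposes naturally into the sum of an increment term, which is a regularized skeleton integral, and of a boundary term denoted by the symbol ∂, namely,
\[ [\mathcal{R} I_{\mathbb{T}}(\tilde{\mathcal{P}}^+ \Gamma)]_{ts} = [\delta \mathcal{R} SkI_{\mathbb{T}}(\tilde{\mathcal{P}}^+ \Gamma)]_{ts} + [\mathcal{R} I_{\mathbb{T}}(\tilde{\mathcal{P}}^+ \Gamma)(\partial)]_{ts} . \tag{2.13} \]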
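This decomposition is a generalization of that obtained in Subsect. 1.4, see Eq. (1.28) and (1.29). Observe also that we have not defined $\mathcal{R} SkI_{\mathbb{T}}(\Gamma)$, nor $\mathcal{R} I_{\mathbb{T}}(\Gamma)$; the regularized integration operators $\mathcal{R} I_{\mathbb{T}}$, $\mathcal{R} SkI_{\mathbb{T}}$ only act on Fourier normal ordered projections of paths $\tilde{\mathcal{P}}^+ \Gamma$.

• Final step (Fourier normal ordering). Let $\mathbb{T}_n$ be a trunk tree with n vertices decorated by $\ell$, and, for each $\sigma \in \Sigma_n$, let $\mathbb{T}^\sigma = \sum_{j=1}^{J_\sigma} g(\sigma, j) \mathbb{T}^\sigma_j$ be the corresponding permutation graph, as in Lemma 1.5. Each forest $\mathbb{T}^\sigma_j$ comes with a total ordering compatible with its tree ordering, which defines a projection operator $\tilde{\mathcal{P}}^+$; we write for short $\tilde{\mathcal{P}}^\sigma \Gamma$ instead of $\tilde{\mathcal{P}}^+(\otimes_{m=1}^n \Gamma(\ell(\sigma(m))))$. Then we let
\[ [\mathcal{R}\Gamma^n(\ell(1), \ldots, \ell(n))]_{ts} := \sum_{\sigma \in \Sigma_n} \sum_{j=1}^{J_\sigma} g(\sigma, j) [\mathcal{R} I_{\mathbb{T}^\sigma_j}(\tilde{\mathcal{P}}^\sigma \Gamma)]_{ts} = \sum_{\sigma \in \Sigma_n} \Big( \sum_{\mathbf{k} = (k_1, \ldots, k_n) \in \mathbb{Z}^n ; \, |k_{\sigma(1)}| \le \ldots \le |k_{\sigma(n)}|} \; \sum_{j=1}^{J_\sigma} g(\sigma, j) [\mathcal{R}^{\{\mathbf{k} \circ \sigma\}} I_{\mathbb{T}^\sigma_j}(\tilde{\mathcal{P}}^\sigma \Gamma)]_{ts} \Big). \tag{2.14} \]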
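We shall prove in the next section that $\mathcal{R}\Gamma$ satisfies the Chen (ii) and shuffle (iii) properties of the Introduction. The Hölder property (i) will be proved in Sect. 4 for an adequate choice of subdomains $\mathbb{Z}^{\mathbb{T}}_{reg}$, $\mathbb{T} \in \mathcal{T}$, satisfying in particular the property required in Step 1.

Some essential comments are in order.

1. Assume that $\Gamma$ is smooth, and do not regularize, i.e., choose $\mathbb{Z}^{\mathbb{T}}_{reg} = \mathbb{Z}^{\mathbb{T}}_+$. Then Eq. (2.8) is a recursive definition of the non-regularized projected integral $[I_{\mathbb{T}}(\mathcal{P}^{\{\mathbf{k}\}} \tilde{\mathcal{P}}^+ \Gamma)]_{ts}$, as follows from the tree skeleton decomposition property, see Lemma 1.12. Hence the right-hand side of formula (2.14) reads simply
\[ \sum_{\sigma \in \Sigma_n} \; \sum_{\mathbf{k} = (k_1, \ldots, k_n) \in \mathbb{Z}^n ; \, |k_{\sigma(1)}| \le \ldots \le |k_{\sigma(n)}|} \; \sum_{j=1}^{J_\sigma} g(\sigma, j) [I_{\mathbb{T}^\sigma_j}(\mathcal{P}^{\{\mathbf{k}\}} \tilde{\mathcal{P}}^\sigma \Gamma)]_{ts} . \tag{2.15} \]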
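But this quantity is the usual iterated integral or canonical lift of $\Gamma$, $[\Gamma^{cano,n}(\ell(1), \ldots, \ell(n))]_{ts}$, since
\[ \sum_{j=1}^{J_\sigma} g(\sigma, j) [I_{\mathbb{T}^\sigma_j}(\mathcal{P}^{\{\mathbf{k}\}} \tilde{\mathcal{P}}^\sigma \Gamma)]_{ts} = [I_{\mathbb{T}^\sigma}(\mathcal{P}^{\{\mathbf{k}\}} \tilde{\mathcal{P}}^\sigma \Gamma)]_{ts} = [I_n(\mathcal{P}^{\{\mathbf{k}\}} \tilde{\mathcal{P}}^\sigma \Gamma)]_{ts} \tag{2.16} \]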
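by Lemma 1.5, and
\[ \sum_{\sigma \in \Sigma_n} \; \sum_{\mathbf{k} = (k_1, \ldots, k_n) \in \mathbb{Z}^n ; \, |k_{\sigma(1)}| \le \ldots \le |k_{\sigma(n)}|} \mathcal{P}^{\{\mathbf{k}\}} \tilde{\mathcal{P}}^\sigma(\Gamma) = \sum_{\sigma \in \Sigma_n} \mathcal{P}^+ \tilde{\mathcal{P}}^+(\otimes_{m=1}^n \Gamma(\ell(\sigma(m)))) = \sum_{\sigma \in \Sigma_n} \tilde{\mathcal{P}}^+(\otimes_{m=1}^n \Gamma(\ell(\sigma(m)))) = \Gamma , \tag{2.17} \]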
see the Remark after Definition 2.1.

2. Iterated integrals of order 1, $[\mathcal{R}\Gamma^1(i)]_{ts}$, $1 \le i \le d$, are not regularized, namely, $[\mathcal{R}\Gamma^1(i)]_{ts} = [\Gamma^1(i)]_{ts} = \Gamma_t(i) - \Gamma_s(i)$, because of the assumption in Step 1 which states that $\mathbb{Z}^{\mathbb{T}}_{reg} = \mathbb{Z}$ if $|V(\mathbb{T})| = 1$. Hence $\mathcal{R}\Gamma$ is a rough path over $\Gamma$.

3. We propose a reformulation of this algorithm in a Hopf algebraic language in Lemma 3.5 below. An equivalent algorithm is given in Definition 3.7. The abstract algebraic language of Sect. 3 turns out to be very appropriate to prove the Chen and shuffle properties.

3. Proof of the Geometric and Multiplicative Properties

Let $\Gamma = (\Gamma(1), \ldots, \Gamma(d))$ be an α-Hölder path. This section is dedicated to the proof of

Theorem 3.1. Choose for each tree $\mathbb{T}$ a subset $\mathbb{Z}^{\mathbb{T}}_{reg} \subset \mathbb{Z}^{\mathbb{T}}$ such that the condition of Step 1 of the construction in Sect. 2 is satisfied, i.e. such that the regularized rough path $\mathcal{R}\Gamma$ defined in Sect. 2 is well-defined. Then $\mathcal{R}\Gamma$ satisfies the Chen (ii) and shuffle (iii) properties of the Introduction.
This theorem is in fact a consequence of the following very general construction, whose essence is really algebraic. Two Hopf algebras are involved in it: the Hopf algebra of decorated rooted trees $\mathbf{H}$, and the shuffle algebra $\mathbf{Sh}$. As we shall presently see, the first one is related to the Chen property, while the second one is related to the shuffle property. The first paragraph below is devoted to an elementary presentation of these Hopf algebras in connection with the Chen/shuffle property. Theorem 3.1 is proved in the second paragraph.

3.1. Hopf algebras and the Chen and shuffle properties.

1. Let us first consider the Hopf algebra of decorated rooted trees, $\mathbf{H}$. Recall the definition of the coproduct on $\mathbf{H}$,
\[ \Delta(\mathbb{T}) = e \otimes \mathbb{T} + \mathbb{T} \otimes e + \sum_{\mathbf{v} \models V(\mathbb{T})} Roo_{\mathbf{v}} \mathbb{T} \otimes Lea_{\mathbf{v}} \mathbb{T}. \tag{3.1} \]
The usual convention [8,9] is to write c (cut) for $\mathbf{v}$, $R^c(\mathbb{T})$ (root part) for $Roo_{\mathbf{v}} \mathbb{T}$, $P^c(\mathbb{T})$ for $Lea_{\mathbf{v}} \mathbb{T}$ (leaves), and to reverse the order of the factors in the tensor product. The convolution of two linear forms f, g on $\mathbf{H}$ is written:
\[ (f * g)(\mathbb{T}) = f(\mathbb{T}) g(e) + f(e) g(\mathbb{T}) + \sum_{\mathbf{v} \models V(\mathbb{T})} f(Roo_{\mathbf{v}} \mathbb{T}) \, g(Lea_{\mathbf{v}} \mathbb{T}), \quad \mathbb{T} \in \mathbf{H}. \tag{3.2} \]
This notion is particularly interesting for characters. A character of $\mathbf{H}$ is a linear map χ such that $\chi(\mathbb{T}_1 . \mathbb{T}_2) = \chi(\mathbb{T}_1) . \chi(\mathbb{T}_2)$. If $\chi_1$, $\chi_2$ are two characters of $\mathbf{H}$, then $\chi_1 * \chi_2$ is also a character of $\mathbf{H}$. The tree Chen property, see Proposition 1.8, may then be stated as follows. Let $\Gamma = (\Gamma(1), \ldots, \Gamma(d))$ be a smooth path, and
\[ \mathbf{H}^d := \{\mathbb{T} \in \mathbf{H}; \; \ell : V(\mathbb{T}) \to \{1, \ldots, d\}\} \tag{3.3} \]
be the subspace of $\mathbf{H}$ generated by forests with decoration valued in {1, ..., d}. Now, define $I^{ts}_\Gamma : \mathbf{H}^d \to \mathbb{R}$ to be the following character of $\mathbf{H}$ (see Definition 1.3)
\[ I^{ts}_\Gamma(\mathbb{T}) = [I_{\mathbb{T}}(\Gamma)]_{ts} . \tag{3.4} \]
Then (as remarked in [20])
\[ I^{ts}_\Gamma = I^{tu}_\Gamma * I^{us}_\Gamma . \tag{3.5} \]
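The coproduct (3.1), and with it the convolution in (3.2) and (3.5), is a sum over admissible cuts. The Python sketch below enumerates them for a small rooted tree. It rests on assumptions stated here rather than taken from the text: an admissible cut is taken to be a non-empty set of non-root vertices no two of which lie on the same path to the root, $Lea_{\mathbf{v}}$ collects the cut vertices together with everything above them, and $Roo_{\mathbf{v}}$ is the remainder. The encoding of a tree as a dictionary `vertex -> parent` and the function names are illustrative.

```python
from itertools import combinations


def ancestors(parent, v):
    """Vertices strictly below v on its path to the root (v connects to each)."""
    below = []
    w = parent[v]
    while w is not None:
        below.append(w)
        w = parent[w]
    return below


def admissible_cuts(parent):
    """Yield (cut, root_part, leaf_part) for every admissible cut of a tree.

    `parent` maps each vertex to its parent; the root is mapped to None.
    """
    vertices = sorted(parent)
    non_root = [v for v in vertices if parent[v] is not None]
    for size in range(1, len(non_root) + 1):
        for cut in combinations(non_root, size):
            chosen = set(cut)
            # Reject cuts in which one cut vertex sits above another one.
            if any(chosen & set(ancestors(parent, v)) for v in cut):
                continue
            leaf_part = {w for w in vertices
                         if w in chosen or chosen & set(ancestors(parent, w))}
            root_part = set(vertices) - leaf_part
            yield cut, sorted(root_part), sorted(leaf_part)


# Trunk tree 3 -> 2 -> 1: the cuts are the single vertices 2 and 3.
trunk = {1: None, 2: 1, 3: 2}
trunk_cuts = list(admissible_cuts(trunk))
```

For a trunk tree with n vertices this produces the n - 1 cuts indexed by $v \in V(\mathbb{T}_n) \setminus \{1\}$, in agreement with the sum in (1.30); for a root carrying two leaves it produces three cuts, the two single leaves and the pair.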
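Generalizing this property to the multilinear setting, one may also write
\[ I^{ts}_\mu(\mathbb{T}) = (I^{tu} * I^{us})_\mu(\mathbb{T}) := I^{tu}_\mu(\mathbb{T}) + I^{us}_\mu(\mathbb{T}) + \sum_{\mathbf{v} \models V(\mathbb{T})} I^{tu}_{Roo_{\mathbf{v}}(\mu)}(Roo_{\mathbf{v}}(\mathbb{T})) \, I^{us}_{Lea_{\mathbf{v}}(\mu)}(Lea_{\mathbf{v}}(\mathbb{T})) \tag{3.6} \]
for a tensor measure $\mu = \otimes_{v \in V(\mathbb{T})} \mu_v$, where $Roo_{\mathbf{v}}(\mu) := \otimes_{w \in V(Roo_{\mathbf{v}}(\mathbb{T}))} \mu_w$, $Lea_{\mathbf{v}}(\mu) := \otimes_{w \in V(Lea_{\mathbf{v}}(\mathbb{T}))} \mu_w$, and
\[ I^{ts}_\mu(\mathbb{T}) := (I^{tu} * I^{us})_\mu(\mathbb{T}) := \sum_{\mathbf{k}} (I^{tu} * I^{us})_{\mu_{\mathbf{k}}}(\mathbb{T}) \tag{3.7} \]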
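for a more general measure $\mu := \sum_{\mathbf{k}} \mu_{\mathbf{k}}$, where each $\mu_{\mathbf{k}}$ is a tensor measure. Later on we shall use these formulas for $\mu_{\mathbf{k}} = 1_{\mathbf{k} \in \mathbb{Z}^{\mathbb{T}}_+} d\mathcal{P}^{\{\mathbf{k}\}}(\Gamma)$ or $1_{\mathbf{k} \in \mathbb{Z}^{\mathbb{T}}_{reg}} d\mathcal{P}^{\{\mathbf{k}\}}(\Gamma)$. As for the antipode S, it is the multiplicative morphism $S : \mathbf{H} \to \mathbf{H}$ defined inductively on tree generators $\mathbb{T}$ by (see [8], p. 219)
\[ S(e) = e; \qquad S(\mathbb{T}) = -\mathbb{T} - \sum_{\mathbf{v} \models V(\mathbb{T})} Roo_{\mathbf{v}} \mathbb{T} . S(Lea_{\mathbf{v}} \mathbb{T}). \tag{3.8} \]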
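Applying iteratively the second relation yields an expression of $S(\mathbb{T})$ in terms of multiple cuts of $\mathbb{T}$ obtained by 'chopping' it [8], see Def. 1.11, namely,
\[ S(\mathbb{T}) = -\mathbb{T} - \sum_{l \ge 1} \; \sum_{\mathbf{v}_1 \models \ldots \models \mathbf{v}_l \models V(\mathbb{T})} (-1)^{|\mathbf{v}_1| + \ldots + |\mathbf{v}_l|} \, Roo_{\mathbf{v}_1}(\mathbb{T}) \prod_{m=1}^{l-1} Lea_{\mathbf{v}_m} \circ Roo_{\mathbf{v}_{m+1}}(\mathbb{T}) \; Lea_{\mathbf{v}_l}(\mathbb{T}). \tag{3.9} \]
Let $\chi_1$, $\chi_2$ be two characters of $\mathbf{H}$. Recall that $\chi_2 \circ S$ is the convolution inverse of $\chi_2$, namely, $\chi_2 \circ S$ is a character and $\chi_2 * (\chi_2 \circ S) = \bar{e}$, where $\bar{e}$ is the counit of $\mathbf{H}$, defined on generators by $\bar{e}(e) = 1$ and $\bar{e}(\mathbb{T}) = 0$ if $\mathbb{T}$ is a non-empty forest. Now Eq. (3.2) and (3.9) yield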
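\[ \chi_1 * (\chi_2 \circ S)(\mathbb{T}) = \chi_1(\mathbb{T}) + \chi_2 \circ S(\mathbb{T}) + \sum_{\mathbf{v} \models V(\mathbb{T})} \chi_1(Roo_{\mathbf{v}}(\mathbb{T})) \, \chi_2 \circ S(Lea_{\mathbf{v}}(\mathbb{T})) \]
\[ = (\chi_1 - \chi_2)(\mathbb{T}) + \sum_{\mathbf{v} \models V(\mathbb{T})} (\chi_1 - \chi_2)(Roo_{\mathbf{v}}(\mathbb{T})) \, \chi_2 \circ S(Lea_{\mathbf{v}}(\mathbb{T})) \]
\[ = (\chi_1 - \chi_2)(\mathbb{T}) + \sum_{l \ge 1} \; \sum_{\vec{\mathbf{v}} = (\mathbf{v}_1, \ldots, \mathbf{v}_l)} (-1)^{|\mathbf{v}_1| + \ldots + |\mathbf{v}_l|} (\chi_1 - \chi_2)(Roo_{\mathbf{v}_1}(\mathbb{T})) \times \prod_{m=1}^{l-1} \chi_2(Lea_{\mathbf{v}_m} \circ Roo_{\mathbf{v}_{m+1}}(\mathbb{T})) \; \chi_2(Lea_{\mathbf{v}_l}(\mathbb{T})), \tag{3.10} \]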
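where $\vec{\mathbf{v}} = (\mathbf{v}_1, \ldots, \mathbf{v}_l)$ is a multiple cut of $\mathbb{T}$ as in Eq. (3.9). In particular, let $SkI^t_\Gamma : \mathbf{H} \to \mathbb{R}$ be the character defined by (see Definition 1.10)
\[ SkI^t_\Gamma(\mathbb{T}) = [SkI_{\mathbb{T}}(\Gamma)]_t . \tag{3.11} \]
Then the tree skeleton decomposition, see Lemma 1.12, reads simply
\[ I^{tu}_\Gamma = SkI^t_\Gamma * (SkI^u_\Gamma \circ S). \tag{3.12} \]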
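To see (3.8), (3.10) and (3.12) at work in the simplest non-trivial case, one may expand them on the trunk tree $\mathbb{T}_2 = \{2 \to 1\}$. The lines below are a sketch; $\bullet_1$ and $\bullet_2$ denote the one-vertex trees carrying the root and the leaf of $\mathbb{T}_2$, a notation introduced here for convenience.

```latex
\begin{align*}
S(\mathbb{T}_2) &= -\mathbb{T}_2 - \bullet_1 . S(\bullet_2)
                 = -\mathbb{T}_2 + \bullet_1 . \bullet_2 , \\
\chi_1 * (\chi_2 \circ S)(\mathbb{T}_2)
 &= \chi_1(\mathbb{T}_2) + \chi_2(S(\mathbb{T}_2)) + \chi_1(\bullet_1)\, \chi_2(S(\bullet_2)) \\
 &= (\chi_1 - \chi_2)(\mathbb{T}_2) - (\chi_1 - \chi_2)(\bullet_1)\, \chi_2(\bullet_2) .
\end{align*}
```

With $\chi_1 = SkI^t_\Gamma$ and $\chi_2 = SkI^u_\Gamma$, the right-hand side is $[\delta SkI_{\mathbb{T}_2}(\Gamma)]_{tu}$ minus the increment of $\Gamma(\ell(1))$ between u and t times the skeleton integral of $\bullet_2$ at u, which is the n = 2 case of the skeleton decomposition (1.30).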
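2. The shuffle algebra over the index set $\mathbb{N}$ [24] may be defined as follows. The algebra $\mathbf{Sh}$ is generated as a vector space over $\mathbb{R}$ by the identity e and by the trunk trees $(\mathbb{T}_n)_{n \ge 1}$ with vertex set $V(\mathbb{T}_n) = \{v_1 < \ldots < v_n\}$, provided with an $\mathbb{N}$-valued decoration $\ell$. Let $\mathbb{T}_n$, $\mathbb{T}'_{n'}$ be trunk trees with n, resp. n' vertices. The shuffle product of $\mathbb{T}_n$ and $\mathbb{T}'_{n'}$ is the formal sum
\[ \mathbb{T}_n \sqcup\!\sqcup \, \mathbb{T}'_{n'} = \sum_{\varepsilon \in Sh((V(\mathbb{T}_n), V(\mathbb{T}'_{n'})))} \varepsilon \Big( \begin{smallmatrix} \mathbb{T}'_{n'} \\ \mathbb{T}_n \end{smallmatrix} \Big), \tag{3.13} \]
where $\begin{smallmatrix} \mathbb{T}'_{n'} \\ \mathbb{T}_n \end{smallmatrix}$ is the trunk tree with n + n' vertices obtained by putting $\mathbb{T}'_{n'}$ on top of $\mathbb{T}_n$, and the shuffle ε permutes the decorations of $\mathbb{T}_n$, $\mathbb{T}'_{n'}$ as in property (iii) discussed in the Introduction. Let $\mathbf{Sh}^d$ be the subspace of $\mathbf{Sh}$ generated by trunk trees with decoration valued in {1, ..., d}. Then the shuffle property for iterated integrals reads
\[ I^{ts}_\Gamma(\mathbb{T}_n) \, I^{ts}_\Gamma(\mathbb{T}'_{n'}) = I^{ts}_\Gamma(\mathbb{T}_n \sqcup\!\sqcup \, \mathbb{T}'_{n'}), \quad \mathbb{T}_n, \mathbb{T}'_{n'} \in \mathbf{Sh}^d . \tag{3.14} \]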
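The shuffle property (3.14) can be checked exactly on a toy example. The Python sketch below computes, in rational arithmetic, the iterated integrals of a path made of two straight segments and compares both sides of (3.14). Several things are assumptions of this sketch: trunk trees are encoded as words (tuples of letters, root first), the iterated integrals of a straight segment with increment a are $a_{i_1} \cdots a_{i_n} / n!$, and the two segments are glued with the Chen relation in the convention of (3.5) and (3.6), root part on the later piece and leaf part on the earlier piece. All function names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def segment_signature(increment, depth):
    """Iterated integrals of a straight segment, indexed by words up to `depth`."""
    dim = len(increment)
    sig = {(): Fraction(1)}
    for n in range(1, depth + 1):
        for word in product(range(dim), repeat=n):
            value = Fraction(1, factorial(n))
            for letter in word:
                value *= increment[letter]
            sig[word] = value
    return sig


def concatenate(early, late):
    """Glue two pieces with the Chen relation: the first letters of the word
    (root part) are integrated over the later piece, the last letters
    (leaf part) over the earlier piece."""
    return {word: sum(late[word[:j]] * early[word[j:]]
                      for j in range(len(word) + 1))
            for word in early}


def shuffles(u, v):
    """All interleavings of the words u and v, with multiplicity."""
    if not u:
        return [v]
    if not v:
        return [u]
    return ([(u[0],) + w for w in shuffles(u[1:], v)] +
            [(v[0],) + w for w in shuffles(u, v[1:])])


def shuffle_defect(sig, u, v):
    """Left-hand side minus right-hand side of the shuffle identity."""
    return sig[u] * sig[v] - sum(sig[w] for w in shuffles(u, v))


a = (Fraction(1), Fraction(2))        # increment of the earlier segment
b = (Fraction(-3), Fraction(1, 2))    # increment of the later segment
total = concatenate(segment_signature(a, 4), segment_signature(b, 4))
```

With these definitions `shuffle_defect(total, u, v)` vanishes for all words u, v whose lengths add up to at most the truncation depth, which is what (3.14) asserts for the canonical lift of this particular path.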
In other words, it may be stated by saying that $I^{ts}_\Gamma : \mathbb{T}_n \mapsto [I_{\mathbb{T}_n}(\Gamma)]_{ts}$ is a character of $\mathbf{Sh}$. Similarly, skeleton integrals $SkI^t_\Gamma : \mathbb{T}_n \mapsto [SkI_{\mathbb{T}_n}(\Gamma)]_t$ also define characters of $\mathbf{Sh}$. The shuffle algebra $\mathbf{Sh}$ is made into a Hopf algebra by re-using the same coproduct $\Delta : \mathbb{T} \mapsto \mathbb{T} \otimes e + e \otimes \mathbb{T} + \sum_{\mathbf{v} \models V(\mathbb{T})} Roo_{\mathbf{v}} \mathbb{T} \otimes Lea_{\mathbf{v}} \mathbb{T}$ as for $\mathbf{H}$, and defining the antipode $\bar{S}$ as $\bar{S}(\mathbb{T}_n) = (-1)^n \bar{\mathbb{T}}_n$, where $\bar{\mathbb{T}}_n$ is obtained from $\mathbb{T}_n$ by reversing the ordering of the vertices, $\ell_{\bar{\mathbb{T}}_n}(v_j) = \ell_{\mathbb{T}_n}(v_{n+1-j})$. The convolution of linear forms or characters f, g on $\mathbf{Sh}$ is given by the same formula as for $\mathbf{H}$.

Proposition 3.1 [24]. The linear morphism $\Theta : \mathbf{H} \to \mathbf{Sh}$ defined by $\Theta(\mathbb{T}) = \sum_j \mathbb{T}_j$, where $\mathbb{T}_j$ ranges over all trunk trees $\{v_1 < \ldots < v_{|V(\mathbb{T})|}\}$ such that the corresponding total ordering of vertices of $\mathbb{T}$ is compatible with its tree partial ordering, is a Hopf algebra map.

$\Theta$ is actually onto. In other words, it is a structure-preserving projection, with the canonical identification of $\mathbf{Sh}$ as a subspace of $\mathbf{H}$. Note that $[I_{\mathbb{T}}(\Gamma)]_{ts} = [SkI_{\mathbb{T}}(\Gamma)]_{ts} = 0$ if $\mathbb{T} \in Ker(\Theta)$ and $\Gamma$ is an arbitrary smooth path, which is a straightforward generalization of the shuffle property; one may call this the tree shuffle property.

Corollary 3.2. Let $\bar\chi$ be a character of $\mathbf{Sh}$. Then $\chi := \bar\chi \circ \Theta$ is a character of $\mathbf{H}$. If $\mathbb{T} \in \mathbf{Sh}$, then $\chi \circ S(\mathbb{T}) = \bar\chi \circ \bar{S}(\mathbb{T})$.

3.2. Proof of the Chen and shuffle properties.

We shall now prove Theorem 3.1. In the next pages, $Meas(\mathbb{R}^n)$ stands for the space of compactly supported, signed Borel measures on $\mathbb{R}^n$. Let us explain the strategy of the proof. We give a general method to construct families of characters of the shuffle algebra, $\bar\chi^t_\Gamma$, depending on a path $\Gamma$, see Lemma 3.6; these quantities satisfy the shuffle property by Eq. (3.14). Then $\bar\chi^t_\Gamma * (\bar\chi^s_\Gamma \circ \bar{S})$ is immediately seen to define a rough path satisfying both the Chen and shuffle properties, see Definition 3.7. For a particular choice of the characters $\bar\chi^t_\Gamma$ related to the regularized skeleton integrals defined in Sect. 2, the rough path of Definition 3.7 is shown to coincide with the regularized rough path $\mathcal{R}\Gamma$ of Sect. 2, see Lemma 3.8. In order to prove this last lemma, one needs a Hopf algebraic reformulation of the Fourier normal ordering algorithm leading to $\mathcal{R}\Gamma$, see Lemma 3.5.

Lemma 3.3 (measure splitting). Let $\mu \in Meas(\mathbb{R}^n)$. Then
\[ \mu = \sum_{\sigma \in \Sigma_n} \mu^\sigma \circ \sigma, \tag{3.15} \]
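where $\mu^\sigma \in \tilde{\mathcal{P}}^+ Meas(\mathbb{R}^n)$ is defined by
\[ \mu^\sigma := \sum_{\mathbf{k} = (k_1, \ldots, k_n) \in \mathbb{Z}^n ; \, |k_{\sigma(1)}| \le \ldots \le |k_{\sigma(n)}|} (\tilde{\mathcal{P}}^{\{\mathbf{k}\}} \mu) \circ \sigma \tag{3.16} \]
as in Eq. (2.14).

Proof. See Eq. (2.17).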
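Definition 3.4. (i) Let $\mathcal{F}^+_{n,n} \subset \mathbf{H}$ (n ≥ 1) be the set of all forests $\mathbb{T}$ with n vertices and one-to-one decoration $\ell : V(\mathbb{T}) \to \{1, \ldots, n\}$ valued in the set {1, ..., n}, such that $(v \twoheadrightarrow w) \Rightarrow \ell(v) \ge \ell(w)$, and $\mathbf{H}^+_{n,n} \subset \mathbf{H}$ the vector space generated by $\mathcal{F}^+_{n,n}$.

(ii) If $\mathbb{T} \in \mathcal{F}^+_{n,n}$, let $\tilde{\mathcal{P}}^{+,\mathbb{T}} Meas(\mathbb{R}^n)$ denote the subspace $\{\tilde{\mathcal{P}}^{+,\mathbb{T}} \mu; \; \mu \in Meas(\mathbb{R}^n)\}$, see Sect. 2 for a definition of the projection operator $\tilde{\mathcal{P}}^{+,\mathbb{T}}$.

(iii) Let $\phi^t_{\mathbb{T}} : \tilde{\mathcal{P}}^{+,\mathbb{T}} Meas(\mathbb{R}^n) \to \mathbb{R}$, $\mu \mapsto \phi^t_{\mathbb{T}}(\mu)$, also written $\phi^t_\mu(\mathbb{T})$ ($t \in \mathbb{R}$, $\mathbb{T} \in \mathcal{F}^+_{n,n}$), be a family of linear forms such that, if $(\mathbb{T}_i, \mu_i) \in \mathcal{F}^+_{n_i,n_i} \times \tilde{\mathcal{P}}^{+,\mathbb{T}_i} Meas(\mathbb{R}^{n_i})$, i = 1, 2, the following $\mathbf{H}$-multiplicative property holds,
\[ \phi^t_{\mu_1}(\mathbb{T}_1) \, \phi^t_{\mu_2}(\mathbb{T}_2) = \phi^t_{\mu_1 \otimes \mu_2}(\mathbb{T}_1 \wedge \mathbb{T}_2), \tag{3.17} \]
where $\mathbb{T}_1 \wedge \mathbb{T}_2 \in \mathcal{F}^+_{n_1+n_2, n_1+n_2}$ is the forest $\mathbb{T}_1 . \mathbb{T}_2$ with decoration $\ell|_{\mathbb{T}_1} = \ell_1$, $\ell|_{\mathbb{T}_2} = n_1 + \ell_2$ ($\ell_i$ = decoration of $\mathbb{T}_i$, i = 1, 2), and $\mu_1 \otimes \mu_2 \in \tilde{\mathcal{P}}^{+,\mathbb{T}_1 \wedge \mathbb{T}_2} Meas(\mathbb{R}^{n_1+n_2})$ is the tensor measure $\mu_1 \otimes \mu_2(dx_1, \ldots, dx_{n_1+n_2}) = \mu_1(dx_1, \ldots, dx_{n_1}) \, \mu_2(dx_{n_1+1}, \ldots, dx_{n_1+n_2})$.

(iv) Let, for $\Gamma = (\Gamma(1), \ldots, \Gamma(d))$, $\bar\chi^t_\Gamma : \mathbf{Sh}^d \to \mathbb{R}$ be the linear form on $\mathbf{Sh}^d$ defined by
\[ \bar\chi^t_\Gamma(\mathbb{T}_n) := \sum_{\sigma \in \Sigma_n} \phi^t_{\mu^\sigma}(\mathbb{T}^\sigma), \tag{3.18} \]
where ($\ell$ being the decoration of $\mathbb{T}_n$) one has set $\mu := \otimes_{j=1}^n d\Gamma(\ell(j))$, and $\mathbb{T}^\sigma$ is the permutation graph associated to σ (see Subsect. 1.2).

Remarks.

1. Note that the $\mathbf{H}$-multiplicative property (3.17) holds in particular for $\phi^t_{\mathbb{T}} = [SkI_{\mathbb{T}}( \, . \, )]_t$ or $[\mathcal{R} SkI_{\mathbb{T}}( \, . \, )]_t$, either trivially or by construction (see Step 4 in the construction of Sect. 2). Note that $[\mathcal{R} SkI_{\mathbb{T}}(\mu)]_t$ has been defined only if $\mu \in \tilde{\mathcal{P}}^+ Meas(\mathbb{R}^n)$. If $\phi^t_{\mathbb{T}} = [SkI_{\mathbb{T}}( \, . \, )]_t$, then simply $\bar\chi^t_\Gamma(\mathbb{T}_n) = [SkI_{\mathbb{T}_n}(\Gamma)]_t$ by the measure splitting lemma.

2. Assume $\mu_i \in \tilde{\mathcal{P}}^+ Meas(\mathbb{R}^{n_i}) \subset \tilde{\mathcal{P}}^{+,\mathbb{T}_i} Meas(\mathbb{R}^{n_i})$, where $\tilde{\mathcal{P}}^+$ is the $\tilde{\mathcal{P}}$-projection associated to the subset $\mathbb{Z}^{n_i}_+ := \{\mathbf{k} = (k_1, \ldots, k_{n_i}); \; |k_1| \le \ldots \le |k_{n_i}|\}$ (i = 1, 2). Then $\mu_1 \otimes \mu_2 \in \tilde{\mathcal{P}}^{+,\mathbb{T}_1 \wedge \mathbb{T}_2} Meas(\mathbb{R}^{n_1+n_2})$ but $\mu_1 \otimes \mu_2 \notin \tilde{\mathcal{P}}^+ Meas(\mathbb{R}^{n_1+n_2})$ in general; the product measure $\mu_1 \otimes \mu_2$ decomposes as a sum over shuffles ε of (1, ..., n_1), (n_1 + 1, ..., n_1 + n_2), namely, $\mu_1 \otimes \mu_2 = \sum_{\varepsilon \ \mathrm{shuffle}} (\mu_1 \otimes \mu_2)^\varepsilon \circ \varepsilon$. Hence the $\mathbf{H}$-multiplicative property (3.17) reads also
\[ \phi^t_{\mu_1}(\mathbb{T}_1) \, \phi^t_{\mu_2}(\mathbb{T}_2) = \sum_{\varepsilon \ \mathrm{shuffle}} \phi^t_{(\mu_1 \otimes \mu_2)^\varepsilon}(\varepsilon^{-1}(\mathbb{T}_1 \wedge \mathbb{T}_2)), \tag{3.19} \]
where $\varepsilon^{-1}(\mathbb{T}_1 \wedge \mathbb{T}_2)$ is the forest $\mathbb{T}_1 \wedge \mathbb{T}_2$ with decoration $\varepsilon^{-1} \circ \ell$, see Definition 3.4 (iii) for the definition of $\ell$.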
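3. The regularization algorithm $\mathcal{R}$ presented in Sect. 2 may be written in a compact way using the structures we have just introduced. Namely, one has:

Lemma 3.5. Let $\Gamma = (\Gamma(1), \ldots, \Gamma(d))$ and $\mu := \otimes_{j=1}^n d\Gamma(\ell(j))$. Then
\[ [\mathcal{R}\Gamma^n(\ell(1), \ldots, \ell(n))]_{ts} = \sum_{\sigma \in \Sigma_n} \big( \phi^t * (\phi^s \circ S) \big)_{\mu^\sigma}(\mathbb{T}^\sigma), \tag{3.20} \]
where
\[ \phi^t_\nu(\mathbb{T}) := [\mathcal{R} SkI_{\mathbb{T}}(\nu)]_t = \Big[ SkI_{\mathbb{T}} \Big( \sum_{\mathbf{k} \in \mathbb{Z}^{\mathbb{T}}_{reg}} \big( \otimes_{v \in V(\mathbb{T})} D(\phi_{k_v}) \big) \nu \Big) \Big]_t \tag{3.21} \]
for $\nu \in \tilde{\mathcal{P}}^{+,\mathbb{T}} Meas(\mathbb{R}^n)$, and $(\phi^t * (\phi^s \circ S))_{\mu^\sigma}$ is the obvious multilinear extension of the convolution, see Eq. (3.7).

Proof. Simple formalization of the regularization procedure explained in Sect. 2.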
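The fundamental result is the following.

Lemma 3.6. Let $\Gamma = (\Gamma(1), \ldots, \Gamma(d))$ be compactly supported, and assume that the condition of Step 1 in Sect. 2 is satisfied. Then $\bar\chi^t_\Gamma$ is a character of $\mathbf{Sh}^d$.

Proof. Let $\mathbb{T}_{n_i} \in \mathbf{Sh}^d$ with $n_i$ vertices (i = 1, 2); define $n := n_1 + n_2$. Let $\mu_i := \otimes_{j=1}^{n_i} d\Gamma(\ell_i(j))$, i = 1, 2, and $\mu := \mu_1 \otimes \mu_2$. If n' ≥ 1, we let $\mathbb{T}'_{n'}$ be the trunk tree with n' vertices $\{n' \to \ldots \to 1\}$ and decoration $\ell(j) = j$, j ≤ n', see Fig. 1. All shuffles ε below are intended to be shuffles of (1, ..., n_1), (n_1 + 1, ..., n_1 + n_2). Then
\[ \bar\chi^t_\Gamma(\mathbb{T}_{n_1} \sqcup\!\sqcup \, \mathbb{T}_{n_2}) = \sum_{\varepsilon \ \mathrm{shuffle}} \bar\chi^t_{\mu \circ \varepsilon}(\mathbb{T}'_n) = \sum_{\sigma \in \Sigma_n} \sum_{\varepsilon \ \mathrm{shuffle}} \phi^t_{(\mu \circ \varepsilon)^\sigma}(\mathbb{T}^\sigma) = \sum_{\sigma \in \Sigma_n} \sum_{\varepsilon \ \mathrm{shuffle}} \phi^t_{\mu^{\varepsilon \circ \sigma}}(\mathbb{T}^\sigma) =: \sum_{\sigma \in \Sigma_n} \phi^t_{\mu^\sigma}(t^1_\sigma) \tag{3.22} \]
with
\[ t^1_\sigma := \sum_{\varepsilon \ \mathrm{shuffle}} \mathbb{T}^{\varepsilon^{-1} \circ \sigma} \in \mathbf{H}^+_{n,n} . \tag{3.23} \]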
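On the other hand,
\[ \bar\chi^t_\Gamma(\mathbb{T}_{n_1}) \, \bar\chi^t_\Gamma(\mathbb{T}_{n_2}) = \bar\chi^t_{\mu_1}(\mathbb{T}'_{n_1}) \, \bar\chi^t_{\mu_2}(\mathbb{T}'_{n_2}) = \sum_{\sigma_1 \in \Sigma_{n_1}, \sigma_2 \in \Sigma_{n_2}} \phi^t_{\mu_1^{\sigma_1}}(\mathbb{T}^{\sigma_1}) \, \phi^t_{\mu_2^{\sigma_2}}(\mathbb{T}^{\sigma_2}) = \sum_{\sigma_1 \in \Sigma_{n_1}, \sigma_2 \in \Sigma_{n_2}} \; \sum_{\varepsilon \ \mathrm{shuffle}} \phi^t_{(\mu_1^{\sigma_1} \otimes \mu_2^{\sigma_2})^\varepsilon}(\varepsilon^{-1}(\mathbb{T}^{\sigma_1} \wedge \mathbb{T}^{\sigma_2})) \tag{3.24} \]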
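by (3.19)
\[ = \sum_{\sigma \in \Sigma_n} \phi^t_{\mu^\sigma}(t^2_\sigma), \tag{3.25} \]
where
\[ t^2_\sigma := \sum_{(\sigma_1, \sigma_2, \varepsilon); \; (\sigma_1 \otimes \sigma_2) \circ \varepsilon = \sigma} \varepsilon^{-1}(\mathbb{T}^{\sigma_1} \wedge \mathbb{T}^{\sigma_2}). \tag{3.26} \]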
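Hence $\bar\chi^t_\Gamma$ is a character of $\mathbf{Sh}$ if and only if $t^1_\sigma = t^2_\sigma$ for every $\sigma \in \Sigma_n$; let us prove this. Extend first (3.22) and (3.25) by multilinearity from tensor measures $\mu_1 \otimes \mu_2$ to a general measure $\mu \in Meas(\mathbb{R}^n)$. By the usual shuffle identity, $SkI^t_\Gamma(\mathbb{T}_{n_1} \sqcup\!\sqcup \, \mathbb{T}_{n_2}) = SkI^t_\Gamma(\mathbb{T}_{n_1}) . SkI^t_\Gamma(\mathbb{T}_{n_2})$, so (3.22) and (3.25) coincide for $\bar\chi^t = [SkI( \, . \, )]_t$. Choose $\sigma \in \Sigma_n$. For any $\mu \in Meas(\mathbb{R}^n)$, one has
\[ [SkI_{\mu^\sigma}(t^1_\sigma - t^2_\sigma)]_t = 0. \tag{3.27} \]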
This fact implies actually that $t^1_\sigma = t^2_\sigma$. Let us first give an informal proof of this statement. To begin with, note that the fact that $[SkI_\Gamma(t)]_t = 0$ for every smooth path $\Gamma$ does not imply in itself that t = 0 if $t \in \mathbf{H}$ is arbitrary. Namely, the character $SkI^t_\Gamma : \mathbf{H} \to \mathbb{R}$ quotients out via the canonical projection $\Theta : \mathbf{H} \to \mathbf{Sh}$, see Proposition 3.1, into a character $\mathbf{Sh} \to \mathbb{R}$, by the tree shuffle property; one may actually prove that $SkI^t_\Gamma(t) = 0$ for every smooth path $\Gamma$ if and only if $t \in Ker(\Theta)$. In our case, the elements of $\mathcal{F}^+_{n,n}$ are linearly independent modulo $Ker(\Theta)$ because the ordering of the labels $\ell(j)$, j = 1, ..., n, is compatible with the tree ordering (which prevents any possibility of shuffling), hence $t^1_\sigma - t^2_\sigma = 0$.

Let us now give a more formal argument. Let $t^1_\sigma - t^2_\sigma =: \sum_j a_j t_j$, $a_j \in \mathbb{Z}$, $t_j \in \mathcal{F}^+_{n,n}$ two-by-two distinct, and define
\[ F_t(\xi_1, \ldots, \xi_n) := \frac{1}{\prod_{v \in V(t)} (\xi_v + \sum_{w \twoheadrightarrow v} \xi_w)} \tag{3.28} \]
if $t \in \mathcal{F}^+_{n,n}$. Applying Lemma 4.5 to $\sum_j a_j [SkI_{\mu_m}(t_j)]_t$, where $(\mu_m \circ \sigma)_{m \ge 1} \in \mathcal{P}^+ Meas(\mathbb{R}^n)$ is a sequence of measures whose Fourier transform converges weakly to the Dirac distribution $\delta_{(\xi_1, \ldots, \xi_n)}$, one gets
\[ \sum_j a_j F_{t_j}(\xi_1, \ldots, \xi_n) = 0, \quad |\xi_1| \le \ldots \le |\xi_n|. \tag{3.29} \]
Since the left-hand side of (3.29) is a rational function, the equation extends to arbitrary $\xi = (\xi_1, \ldots, \xi_n) \in \mathbb{R}^n$. Note that
\[ \prod_{v \in V(t_j)} \Big( \xi_v + \sum_{w \twoheadrightarrow v} \xi_w \Big) = \Big( \xi_1 + \sum_{w \twoheadrightarrow 1} \xi_w \Big) \, F_{\check{t}_j}(\xi_2, \ldots, \xi_n)^{-1} , \tag{3.30} \]
where $\check{t}_j := Lea_{\{1\}}(t_j)$ is $t_j$ severed of the vertex 1, which is one of its roots. Let $J_\Omega$, $\Omega \subset \{2, \ldots, n\}$, be the subset of indices j such that $\{v \in \{1, \ldots, n\}; \; v \twoheadrightarrow 1 \text{ in } t_j\} = \Omega$, i.e. such that the tree component of 1 in $t_j$ has vertex set Ω. Take the residue at $-\sum_{w \in \Omega} \xi_w$ of the left-hand side of (3.29), considered as a function of $\xi_1$. This gives:
\[ \sum_{j \in J_\Omega} a_j F_{\check{t}_j}(\xi_2, \ldots, \xi_n) = 0, \quad \Omega \subset \{2, \ldots, n\}. \tag{3.31} \]
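Shifting by −1 the indices of vertices of $\check{t}_j$ and the labels $\ell(v)$, $v \in V(\check{t}_j)$, one gets a forest in $\mathcal{F}^+_{n-1,n-1}$. One may now conclude by an inductive argument.

Let us now give an alternative definition for the regularization $\mathcal{R}$. As we shall see in Lemma 3.8, the two definitions actually coincide.

Definition 3.7 (alternative definition for regularization $\mathcal{R}'$). Choose for every tree $\mathbb{T} \in \mathbf{H}$ a subset $\mathbb{Z}^{\mathbb{T}}_{reg} \subset \mathbb{Z}^{\mathbb{T}}_+$ satisfying the condition stated in Step 1 of Sect. 2. Let $\Gamma = (\Gamma(1), \ldots, \Gamma(d))$ be a compactly supported, α-Hölder path, and $\mu := \otimes_{j=1}^n d\Gamma(\ell(j))$ the corresponding measure.

(i) Let, for every $\mathbb{T} \in \mathbf{H}^d$ with n vertices,
\[ \phi^t_\nu(\mathbb{T}) = [\mathcal{R} SkI_{\mathbb{T}}(\nu)]_t , \quad \nu \in \tilde{\mathcal{P}}^{+,\mathbb{T}} Meas(\mathbb{R}^n), \tag{3.32} \]
see Eq. (2.11) or Lemma 3.5, and
\[ \bar\chi^t_\Gamma(\mathbb{T}_n) := \sum_{\sigma \in \Sigma_n} \phi^t_{\mu^\sigma}(\mathbb{T}^\sigma) \tag{3.33} \]
be the associated character of $\mathbf{Sh}$ as in Definition 3.4.

(ii) Let, for $\mathbb{T}_n \in \mathbf{Sh}^d$, n ≥ 1, with n vertices and decoration $\ell$,
\[ [\mathcal{R}'\Gamma^n(\ell(1), \ldots, \ell(n))]_{ts} := \bar\chi^t_\Gamma * (\bar\chi^s_\Gamma \circ \bar{S})(\mathbb{T}_n). \tag{3.34} \]

Since $\bar\chi^s_\Gamma$, $\bar\chi^t_\Gamma$ and hence $\bar\chi^t_\Gamma * (\bar\chi^s_\Gamma \circ \bar{S})$ are characters of the shuffle algebra, $\mathcal{R}'\Gamma$ satisfies the shuffle property. Also, $\mathcal{R}'\Gamma$ satisfies the Chen property by construction, since
\[ [\mathcal{R}'\Gamma^n(\ell(1), \ldots, \ell(n))]_{ts} = \big( \bar\chi^t_\Gamma * (\bar\chi^u_\Gamma \circ \bar{S}) \big) * \big( \bar\chi^u_\Gamma * (\bar\chi^s_\Gamma \circ \bar{S}) \big)(\mathbb{T}_n) \]
\[ = [\mathcal{R}'\Gamma^n(\ell(1), \ldots, \ell(n))]_{tu} + [\mathcal{R}'\Gamma^n(\ell(1), \ldots, \ell(n))]_{us} + \sum_j [\mathcal{R}'\Gamma^j(\ell(1), \ldots, \ell(j))]_{tu} \, [\mathcal{R}'\Gamma^{n-j}(\ell(j+1), \ldots, \ell(n))]_{us} \tag{3.35} \]
by definition of the convolution in $\mathbf{Sh}$. Both properties remain valid if $\bar\chi^t$, $t \in \mathbb{R}$, are arbitrary characters of $\mathbf{Sh}$.

Let us make this definition a little more explicit before proving that $\mathcal{R}'\Gamma = \mathcal{R}\Gamma$. Replacing $\bar\chi^s_\Gamma \circ \bar{S}$ with $\chi^s_\Gamma \circ S$, see Corollary 3.2, one gets, see Eq. (3.8),
\[ [\mathcal{R}'\Gamma^n(\ell(1), \ldots, \ell(n))]_{ts} = \chi^t_\Gamma(\mathbb{T}_n) + \chi^s_\Gamma(S(\mathbb{T}_n)) + \sum_j \chi^t_\Gamma(Roo_j \mathbb{T}_n) \, (\chi^s_\Gamma \circ S)(Lea_j \mathbb{T}_n) \]
\[ = (\bar\chi^t_\Gamma - \bar\chi^s_\Gamma)(\mathbb{T}_n) + \sum_j (\bar\chi^t_\Gamma - \bar\chi^s_\Gamma)(Roo_j \mathbb{T}_n) . \chi^s_\Gamma(S(Lea_j \mathbb{T}_n)). \tag{3.36} \]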
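Expanding the formula for $S(Lea_j \mathbb{T}_n)$ in terms of multiple cuts as in the previous subsection, see Eq. (3.9), we get
\[ [\mathcal{R}'\Gamma^n(\ell(1), \ldots, \ell(n))]_{ts} = (\bar\chi^t_\Gamma - \bar\chi^s_\Gamma)(\mathbb{T}_n) + \sum_{l \ge 1} (-1)^l \sum_{j_1 < \ldots < j_l} (\bar\chi^t_\Gamma - \bar\chi^s_\Gamma)(Roo_{j_1} \mathbb{T}_n) \prod_{m=1}^{l-1} \bar\chi^s_\Gamma(Lea_{j_m} \circ Roo_{j_{m+1}}(\mathbb{T}_n)) \; \bar\chi^s_\Gamma(Lea_{j_l} \mathbb{T}_n), \tag{3.37} \]
by chopping the trunk tree $\mathbb{T}_n$. Finally, $\bar\chi^u_\Gamma(\mathbb{T})$, u = t or s, should be split according to Definition 3.4 (iv).

Let us now make the following remark. The difference between $[\mathcal{R}\Gamma^n(\ell(1), \ldots, \ell(n))]_{ts}$ and $[\mathcal{R}'\Gamma^n(\ell(1), \ldots, \ell(n))]_{ts}$ is that $[\mathcal{R}\Gamma^n(\ell(1), \ldots, \ell(n))]_{ts}$ is obtained by first (i) splitting the measure $\mu := \otimes_{j=1}^n d\Gamma(\ell(j))$ into $\sum_{\sigma \in \Sigma_n} \mu^\sigma \circ \sigma$ and then (ii) chopping the forests $\mathbb{T}^\sigma_j$, while $[\mathcal{R}'\Gamma^n(\ell(1), \ldots, \ell(n))]_{ts}$ is obtained by first (i) chopping the trunk tree $\mathbb{T}_n$ and then (ii) splitting the measures on the trunk subtrees. Actually, as may be expected, the two operations commute.

Lemma 3.8. $[\mathcal{R}'\Gamma^n(\ell(1), \ldots, \ell(n))]_{ts} = [\mathcal{R}\Gamma^n(\ell(1), \ldots, \ell(n))]_{ts}$. Hence the regularized iterated integrals $\mathcal{R}\Gamma$ satisfy the Chen and shuffle properties, and Theorem 3.1 is proved.

Proof. The proof goes along the same lines as Lemma 3.6. Let $\mathbb{T}_n$ be some trunk tree with n vertices and decoration $\ell$, and $\mu := \otimes_{j=1}^n d\Gamma(\ell(j))$. Consider for the moment an arbitrary character $\bar\chi^t_\Gamma$ as in Lemma 3.6, associated to linear forms $\phi^t_{\mathbb{T}}$ as in Definition 3.4. Define quite generally
\[ [\mathcal{R}_{\phi,\Gamma}(\mathbb{T}_n)]_{ts} := \sum_{\sigma \in \Sigma_n} \big( \phi^t * (\phi^s \circ S) \big)_{\mu^\sigma}(\mathbb{T}^\sigma), \tag{3.38} \]
see Lemma 3.5, and (see Definition 3.7 (ii))
\[ [\mathcal{R}'_{\phi,\Gamma}(\mathbb{T}_n)]_{ts} := \chi^t_\Gamma * (\chi^s_\Gamma \circ S)(\mathbb{T}_n). \tag{3.39} \]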
If φTt = [RSkIT ( . )]t , then Rφ, = R n and R φ, = R n . On the other hand, if φt = [SkIT ( . )]t , then plainly [Rφ, (Tn )]ts = [R φ, (Tn )]ts = [ cano,n ( (1), . . . ,
(n))]ts , see first comment in Sect. 2 and Eq. (3.12). Let σ ∈ n . Fix some multi-index k = (k1 , . . . , kn ) such that |k1 | ≤ . . . ≤ |kn |, and set μσk = P˜ {k} (μ ◦ σ ). Then, see Eq. (3.7), t φ ∗ (φ s ◦ S) μσ (Tσ ) = k
+
v |V (Tσ )
1 t φ σ (Tσ ) + φμs σ (S(Tσ )) k | k | μk
⎞
t σ φ tRoov (μσ ) (Roov (Tσ ))φ Lea σ (S(Leav (T )))⎠ . v (μ ) k
k
(3.40)
Expand S according to Eq. (3.9). This gives an expression for the P˜ { k} -projection of [Rφ, (Tn )]ts . An expression may also be obtained for the analogous regularized
quantity associated to R by using Eq. (3.37). In the end, one gets two sums over some subsets of {1, . . . , n},
[Rφ,P {k◦σ } (Tn )]ts = φ t σ (tσ1,J, j )φ s σ (t1,σ J¯, j ), (3.41) μ μ j
J ⊂{1,...,n}
j
and similarly [R φ,P {k◦σ } (Tn )]ts =
k J¯
k J
J ⊂{1,...,n}
φ t σ (tσ2,J, j )φ s σ (t2,σ J¯, j ),
μk
μk
J
(3.42)
J¯
where
J¯ = {1, . . . , n} \ J = V (t1,σ J¯, j ) = V (t2,σ J¯, j ); (3.43)
J = V (tσ1,J, j ) = V (tσ2,J, j ),
μσk J = ⊗1≤ j≤n, j∈J d P˜ { k◦σ } ( ( j)), μσk J¯ = ⊗1≤ j≤n, j∈ J¯ d P˜ { k◦σ } ( ( j)); (3.44)
+ as in the proof of Lemma 3.6. and tσ1,J, j .t1,σ J¯, j , tσ2,J, j .t2,σ J¯, j ∈ Fn,n In the case of the regularization scheme R, each tσ1,J, j is a forest such as Roov (Tlσ ), where Tlσ appears in the decomposition of the permutation graph Tσ , and v is some
admissible cut of Tlσ , while t1,σ J¯, j is some complicated product obtained by the mul-
tiple cut decomposition of S(Leav (Tlσ )). In the case of R , one first splits Tn into (Rool Tn , Leal Tn ) and then permutes the vertices of each of the two trunk subtrees, see Eq. (3.37). As in the proof of Lemma 3.6, one now proves the equality
tσ1,J, j ⊗ t1,σ J¯, j = tσ2,J, j ⊗ t2,σ J¯, j (3.45) J
j
J
j
by assuming that φTt = [SkIT ( . )]t , in which case both expressions (3.41) and (3.42) are equal. By considering a sequence of measures (μm ◦ σ )m≥1 whose Fourier transforms converge weakly to δ(ξ1 ,...,ξn ) , one gets by Lemma 4.5 an equation of the type ⎡ ⎣ei(s i∈J ξi +t i∈ J¯ ξi ) Ftσ1,J, j ((ξi )i∈J )Ft σ ((ξi )i∈ J¯ ) J
1, J¯, j
j
+
ei(s
i∈J ξi +t
J
i∈ J¯ ξi )
j
⎤ Ftσ2,J, j ((ξi )i∈J )Ft σ ((ξi )i∈ J¯ )⎦ = 0, 2, J¯, j
(3.46) where the function Ft has been defined in the course of the proof of Lemma 3.6. Under the generic condition that all ξ J := i∈J ξi , J ⊂ {1, . . . , n} are two-by-two distinct, the functions (s, t) → f J (s, t) := ei(sξ J +tξ J¯ ) , J ⊂ {1, . . . , n} are linearly independent. Hence, for every J , Ftσ1,J, j ((ξi )i∈J )Ft σ ((ξi )i∈ J¯ ) + Ftσ2,J, j ((ξi )i∈J )Ft σ ((ξi )i∈ J¯ ) = 0. (3.47) j
1, J¯, j
2, J¯, j
By using the same arguments as in the proof of Lemma 3.6, one obtains Eq. (3.45).
Fig. 5. 3, 4, 6 are leaves; 1, 2 and 5 are nodes, 2 and 5 are uppermost; branches are e.g. $Br(2 \twoheadrightarrow 1) = \{2\}$ or $Br(6 \twoheadrightarrow 1) = \{6, 5\}$; $Leaf(2) = \{3, 4\}$; $w_{max}(2) = 4$
4. Hölder Estimates

Let $\Gamma$ be an α-Hölder path. We shall now choose a regularization scheme, i.e. choose for each tree $\mathbb{T}$ a subset $\mathbb{Z}^{\mathbb{T}}_{reg} \subset \mathbb{Z}^{\mathbb{T}}_+$ such that the convergence condition stated in Sect. 2, Step 1 is verified, and prove that the associated regularized rough path $\mathcal{R}\Gamma^n(\ell(1), \ldots, \ell(n))$ satisfies the required Hölder properties. Following the regularization procedure as explained in Sect. 2, one must first (1) decompose $\mathcal{R}\Gamma^n(\ell(1), \ldots, \ell(n))$ into the sum over all permutations $\sigma \in \Sigma_n$ of $\mathcal{R} I_{\mathbb{T}^\sigma_j}\big(\tilde{\mathcal{P}}^+ \otimes_{v \in V(\mathbb{T}^\sigma_j)} \Gamma(\ell(\sigma(v)))\big)$ as in the final step of Sect. 2, and (2) show Hölder regularity with correct exponent of the increment terms $\mathcal{R} SkI_{\mathbb{T}}(\tilde{\mathcal{P}}^+(\otimes_{v \in V(\mathbb{T})} \Gamma(\ell(\sigma(v)))))$ and of the boundary terms, $\mathcal{R} I_{\mathbb{T}}(\tilde{\mathcal{P}}^+(\otimes_{v \in V(\mathbb{T})} \Gamma(\ell(\sigma(v)))))(\partial)$, see Step 4.

4.1. Choice of the regularization scheme.

Recall that the whole algorithm rests on the choice of a subdomain $\mathbb{Z}^{\mathbb{T}}_{reg} \subset \mathbb{Z}^{\mathbb{T}}_+ := \{(k_v)_{v \in V(\mathbb{T})} \in \mathbb{Z}^{\mathbb{T}} \mid (v \twoheadrightarrow w) \Rightarrow |k_v| \ge |k_w|\}$ for each tree $\mathbb{T} \in \mathcal{T}$. The purpose of this subsection is to propose an adequate choice. We shall first need to introduce a little more terminology concerning tree structures (see Fig. 5).

Definition 4.1. Let $\mathbb{T}$ be a tree.
(i) A vertex v is a leaf if no vertex connects to v. The set of leaves above (i.e. connecting to) $v \in V(\mathbb{T})$ is denoted by Leaf(v).
(ii) Vertices at which 2 or more branches join are called nodes.
(iii) The set $Br(v_1 \twoheadrightarrow v_2)$ of vertices from a leaf or a node $v_1$ to a node $v_2$ or to the root, is called a branch if it does not contain any other node. By convention, $Br(v_1 \twoheadrightarrow v_2)$ includes $v_1$ and excludes $v_2$.
(iv) A node n is called an uppermost node if no other node is connected to n.

Definition 4.2. Let $\mathbb{T}$ be a tree. If $v \in V(\mathbb{T})$, we let $w_{max}(v) := \max\{w \in V(\mathbb{T}) \mid w \twoheadrightarrow v\}$, or simply $w_{max}(v) = v$ if v is a leaf.

Definition 4.3. Let $\mathbb{Z}^{\mathbb{T}}_{reg}$ be the set of $V(\mathbb{T})$-uples $\mathbf{k} = (k_v)_{v \in V(\mathbb{T})} \in \mathbb{Z}^{\mathbb{T}}$ such that the following conditions are satisfied:
(i) if v < w, then $|k_v| \le |k_w|$;
(ii) if $v \in V(\mathbb{T})$ and $w \in Leaf(v)$, $k_w . k_v < 0$, then $|k_v| \le |k_w| - \log_2 10 - \log_2 |V(\mathbb{T})|$;
(iii) if $n \in V(\mathbb{T})$ is a node, then each vertex $w \in \{w_{max}(v) \mid v \to n\}$ such that $k_w . k_{w_{max}(n)} < 0$ satisfies: $|k_w| \le |k_{w_{max}(n)}| - \log_2 10 - \log_2 |V(\mathbb{T})|$.
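Conditions (i) to (iii) of Definition 4.3 can be turned into a membership test. The Python sketch below is one possible reading, under assumptions that are not in the text: vertices are integers whose natural order is the total order used in (i), a tree is encoded as a dictionary `vertex -> parent`, Leaf(v) is taken to contain only leaves strictly above v, and the example tree is a guess at the shape of Fig. 5 read off from its caption (leaves 3 and 4 above 2, leaf 6 above 5, vertices 2 and 5 above 1, and 1 above the root 0). Function names are illustrative.

```python
from math import log2


def connects_to(parent, w, v):
    """True if w lies strictly above v in the tree."""
    w = parent[w]
    while w is not None:
        if w == v:
            return True
        w = parent[w]
    return False


def leaves_above(parent, v):
    """Leaves strictly above v (one reading of Leaf(v) in Definition 4.1)."""
    has_child = {p for p in parent.values() if p is not None}
    return sorted(w for w in parent
                  if w not in has_child and connects_to(parent, w, v))


def w_max(parent, v):
    """Largest vertex above v, or v itself if nothing lies above it."""
    above = [w for w in parent if connects_to(parent, w, v)]
    return max(above) if above else v


def in_Z_reg(parent, k):
    """Check conditions (i)-(iii) of Definition 4.3 for a multi-index k,
    given as a dictionary vertex -> integer."""
    vertices = sorted(parent)
    margin = log2(10) + log2(len(vertices))
    # (i) moduli are non-decreasing along the vertex ordering
    for v in vertices:
        for w in vertices:
            if v < w and abs(k[v]) > abs(k[w]):
                return False
    # (ii) opposite signs force a gap between v and each leaf above it
    for v in vertices:
        for w in leaves_above(parent, v):
            if k[w] * k[v] < 0 and abs(k[v]) > abs(k[w]) - margin:
                return False
    # (iii) at a node, sub-branches of opposite sign must be much smaller
    for n in vertices:
        children = [v for v in vertices if parent[v] == n]
        if len(children) < 2:
            continue
        top = w_max(parent, n)
        for v in children:
            w = w_max(parent, v)
            if k[w] * k[top] < 0 and abs(k[w]) > abs(k[top]) - margin:
                return False
    return True


# Assumed shape of the tree of Fig. 5.
fig5 = {0: None, 1: 0, 2: 1, 3: 2, 4: 2, 5: 1, 6: 5}
```

On this tree the helper functions reproduce the values quoted in the caption of Fig. 5, namely Leaf(2) = {3, 4} and $w_{max}(2) = 4$. A multi-index with all components of the same sign and non-decreasing moduli is accepted, while flipping the sign of a low vertex without leaving the required gap is rejected by condition (ii).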
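Lemma 4.4. Let $\xi = (\xi_v)_{v \in V(\mathbb{T})}$ be such that $\xi_v \in supp(\phi_{k_v})$ for some $\mathbf{k} = (k_v)_{v \in V(\mathbb{T})} \in \mathbb{Z}^{\mathbb{T}}_{reg}$, where $(\phi_k)_{k \in \mathbb{Z}}$ is the dyadic partition of unity defined in the Appendix. Then, for every $v \in V(\mathbb{T})$,
\[ |V(\mathbb{T})| \, . \, |\xi_{w_{max}(v)}| \ge \Big| \xi_v + \sum_{w \twoheadrightarrow v} \xi_w \Big| > \frac{1}{2} |\xi_{w_{max}(v)}|. \tag{4.1} \]

Proof. The left inequality is trivial. As for the right one, assume first that v is on a terminal branch, i.e. $Leaf(v) = \{w_{max}(v)\}$ is a singleton. Then Definition 4.3 (ii) implies the following: for every vertex v' on the branch between $w_{max}(v)$ and v, i.e. $v' \in Br(w_{max}(v) \twoheadrightarrow v) \cup \{v\}$,
- either $\xi_{v'}$ is of the same sign as $\xi_{w_{max}(v)}$;
- or $|\xi_{v'}| \le \frac{|\xi_{w_{max}(v)}|}{2|V(\mathbb{T})|}$, since $|\xi_{v'}| \in (2^{|k_{v'}|-1}, 5 \cdot 2^{|k_{v'}|-1})$ (and similarly for $|\xi_{w_{max}(v)}|$) by the remarks following Proposition 5.2.
Hence $|\xi_v + \sum_{w \twoheadrightarrow v} \xi_w| = |\sum_{v' \in Br(w_{max}(v) \twoheadrightarrow v) \cup \{v\}} \xi_{v'}| > \big(1 - \frac{1}{2} \frac{|\{w : w \twoheadrightarrow v\}|}{|V(\mathbb{T})|}\big) |\xi_{w_{max}(v)}|$ and $\xi_v + \sum_{w \twoheadrightarrow v} \xi_w$ has the same sign as $\xi_{w_{max}(v)}$.

Consider now what happens at a node n. Let $n^+ := \{v \in V(\mathbb{T}) \mid v \to n\}$. Assume by induction on the number of vertices that, for all $v \in n^+$,
\[ (1 + |\{w : w \twoheadrightarrow v\}|) \, |\xi_{w_{max}(v)}| \ge \Big| \xi_v + \sum_{w \twoheadrightarrow v} \xi_w \Big| > \Big( 1 - \frac{1}{2} \frac{|\{w : w \twoheadrightarrow v\}|}{|V(\mathbb{T})|} \Big) . |\xi_{w_{max}(v)}| \tag{4.2} \]
and that $\xi_v + \sum_{w \twoheadrightarrow v} \xi_w$ has the same sign as $\xi_{w_{max}(v)}$. By Definition 4.3 (iii), either $\xi_{w_{max}(v)} . \xi_{w_{max}(n)} > 0$ or $|\xi_{w_{max}(v)}| \le \frac{|\xi_{w_{max}(n)}|}{2|V(\mathbb{T})|}$. Then, letting $v_0$ be the element of $n^+$ such that $w_{max}(v_0) = w_{max}(n)$,
\[ (1 + |\{w : w \twoheadrightarrow n\}|) \, |\xi_{w_{max}(n)}| \ge \Big| \xi_n + \sum_{w \twoheadrightarrow n} \xi_w \Big| = \Big| \xi_n + \sum_{v \in n^+} \Big( \xi_v + \sum_{w \twoheadrightarrow v} \xi_w \Big) \Big| \]
\[ \ge \Big| \xi_{v_0} + \sum_{w \twoheadrightarrow v_0} \xi_w \Big| - \sum_{v \in n^+ ; \; \xi_{w_{max}(v)} . \xi_{w_{max}(n)} < 0} \Big| \xi_v + \sum_{w \twoheadrightarrow v} \xi_w \Big| - |\xi_n| > \Big( 1 - \frac{1}{2} \frac{|\{w : w \twoheadrightarrow n\}|}{|V(\mathbb{T})|} \Big) . |\xi_{w_{max}(n)}|. \tag{4.3} \]

4.2. A key formula for skeleton integrals.

We assume in this paragraph that $\Gamma$ is smooth and denote by $\Gamma'$ its derivative. The Hölder estimates in Subsects. 4.3 and 4.4 rely on the key formula below.

Lemma 4.5. The following formula holds:
\[ [SkI_{\mathbb{T}}(\Gamma)]_s = (i\sqrt{2\pi})^{-|V(\mathbb{T})|} \int \ldots \int \prod_{v \in V(\mathbb{T})} d\xi_v \, . \, e^{is \sum_{v \in V(\mathbb{T})} \xi_v} \, \frac{\prod_{v \in V(\mathbb{T})} \mathcal{F}(\Gamma'(\ell(v)))(\xi_v)}{\prod_{v \in V(\mathbb{T})} (\xi_v + \sum_{w \twoheadrightarrow v} \xi_w)} . \tag{4.4} \]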
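Proof. We use induction on $|V(\mathbb{T})|$. After stripping the root of $\mathbb{T}$, denoted by 0, there remains a forest $\mathbb{T}' = \mathbb{T}'_1 \ldots \mathbb{T}'_J$, whose roots $0_1, \ldots, 0_J$ are the vertices directly connected to 0. Assume
\[ [SkI_{\mathbb{T}'_j}(\Gamma)]_{x_0} = \int \ldots \int \prod_{v \in V(\mathbb{T}'_j)} d\xi_v \, . \, e^{i x_0 \sum_{v \in V(\mathbb{T}'_j)} \xi_v} \, F_j(\xi_{0_j}, (\xi_v)_{v \in \mathbb{T}'_j \setminus \{0_j\}}) \tag{4.5} \]
for some functions $F_j$, j = 1, ..., J. Note that
\[ \mathcal{F}\big( SkI_{\mathbb{T}'_j}(\Gamma) \big)(\xi_j) = \int \ldots \int \Big[ \prod_{v \in V(\mathbb{T}'_j) \setminus \{0_j\}} d\xi_v \Big] F_j\Big( \xi_j - \sum_{v \in V(\mathbb{T}'_j) \setminus \{0_j\}} \xi_v , \; (\xi_v)_{v \in V(\mathbb{T}'_j) \setminus \{0_j\}} \Big). \tag{4.6} \]
Then
\[ [SkI_{\mathbb{T}}(\Gamma)]_s = \int^s d\Gamma_{x_0}(\ell(0)) \prod_{j=1}^{J} [SkI_{\mathbb{T}'_j}(\Gamma)]_{x_0} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} \frac{d\xi}{i\xi} \, e^{is\xi} \, \mathcal{F}\Big( \Gamma'(\ell(0)) \prod_{j=1}^{J} SkI_{\mathbb{T}'_j}(\Gamma) \Big)(\xi) \]
\[ = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} d\xi \, \frac{e^{is\xi}}{i\xi} \, . \int d\xi_1 \ldots \int d\xi_J \; \mathcal{F}(\Gamma'(\ell(0)))\Big( \xi - \sum_{j=1}^{J} \xi_j \Big) \times \prod_{j=1}^{J} \Big[ \int \prod_{v \in V(\mathbb{T}'_j) \setminus \{0_j\}} d\xi_v \Big] F_j\Big( \xi_j - \sum_{v \in V(\mathbb{T}'_j) \setminus \{0_j\}} \xi_v , \; (\xi_v)_{v \in V(\mathbb{T}'_j) \setminus \{0_j\}} \Big), \tag{4.7} \]
hence the result.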
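The structure of the denominator in (4.4) is easiest to see on the trunk tree $\mathbb{T}_2 = \{2 \to 1\}$ with pure exponentials. The formal computation below is only a sketch: it assumes $\Gamma'(\ell(j))(x) = e^{i\eta_j x}$ for two fixed frequencies $\eta_1, \eta_2$ with $\eta_2 \ne 0$ and $\eta_1 + \eta_2 \ne 0$, and it takes the skeleton integral to select the antiderivative $e^{i\eta x}/(i\eta)$ without integration constant. The frequencies $\eta_j$ are notation introduced here.

```latex
\begin{align*}
[SkI_{\mathbb{T}_2}(\Gamma)]_s
 &= \int^s dx_1 \, e^{i\eta_1 x_1} \int^{x_1} dx_2 \, e^{i\eta_2 x_2}
  = \int^s dx_1 \, e^{i\eta_1 x_1} \, \frac{e^{i\eta_2 x_1}}{i\eta_2} \\
 &= \frac{e^{is(\eta_1 + \eta_2)}}{(i\eta_2) \, \big( i(\eta_1 + \eta_2) \big)}
  = i^{-2} \, \frac{e^{is(\eta_1 + \eta_2)}}{(\eta_1 + \eta_2) \, \eta_2} .
\end{align*}
```

The leaf v = 2 contributes the factor $\xi_2$ and the root v = 1 contributes $\xi_1 + \xi_2$, in agreement with the product $\prod_v (\xi_v + \sum_{w \twoheadrightarrow v} \xi_w)$ in (4.4), while the phase $e^{is(\eta_1 + \eta_2)}$ matches $e^{is \sum_v \xi_v}$.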
4.3. Estimate for the increment term.

We now come back to an arbitrary α-Hölder path $\Gamma$ and prove a Hölder estimate for the increment term, see Eq. (2.13), which is simply a regularized skeleton integral. Let $\sigma \in \Sigma_n$ be a permutation, and $\mathbb{T}$ be one of the forests $\mathbb{T}^\sigma_j$ appearing in the permutation graph $\mathbb{T}^\sigma$, see Lemma 1.5. Hölder norms $|| \, . \, ||_{C^\gamma}$ are defined in the Appendix. Recall $\mathbb{T}$ comes with a total ordering compatible with its tree partial ordering. The $\tilde{\mathcal{P}}$-projection $\tilde{\mathcal{P}}^+$ below is defined with respect to this total ordering.

Lemma 4.6 (Hölder estimate of the increment term). The following estimate holds:
\[ \Big|\Big| \mathcal{R} SkI_{\mathbb{T}}\big( \tilde{\mathcal{P}}^+(\otimes_{v \in V(\mathbb{T})} \Gamma(\ell(\sigma(v)))) \big) \Big|\Big|_{C^{|V(\mathbb{T})|\alpha}} < \infty . \tag{4.8} \]
Remark. Although formal integrals are a priori infra-red divergent (see Subsect. 1.4), the formula given in Lemma 4.5 for skeleton integrals delivers infra-red convergent quantities when one restricts the integration over $\xi = (\xi_v)_{v\in V(T)}$ to the subdomain associated to $\mathbb{Z}^T_{reg}$, see Lemma 4.4, because

\[ \left| \frac{\mathcal{F}(\Gamma'(\ell(v)))(\xi_v)}{\xi_v + \sum_{w\twoheadrightarrow v}\xi_w} \right| \lesssim |\mathcal{F}(\Gamma(\ell(v)))(\xi_v)|\, \frac{|\xi_v|}{|\xi_{w_{max}(v)}|} \le |\mathcal{F}(\Gamma(\ell(v)))(\xi_v)| \tag{4.9} \]

is bounded.

Proof. We implicitly assume in the proof that $T$ is a tree, leaving the obvious generalization to forests with several components to the reader. We shall start the computations by adapting the proof of a theorem in [30], §2.6.1 bounding the Hölder-Besov norm of the product of two Hölder functions. Write

\[ G(x) = \Big[ \mathrm{RSkI}_T\Big( \tilde P^+\big( \otimes_{v\in V(T)} \Gamma(\ell(\sigma(v))) \big) \Big) \Big]_x. \tag{4.10} \]
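By Lemma 4.5,

\[ G(x) = (\mathrm{i}\sqrt{2\pi})^{-|V(T)|} \sum_{\mathbf{k}=(k_v)_{v\in V(T)}\in\mathbb{Z}^T_{reg}} \int_{\prod_{v\in V(T)}\mathrm{supp}(\phi_{k_v})} \prod_{v\in V(T)} d\xi_v \; e^{\mathrm{i}x\sum_{v\in V(T)}\xi_v}\, \frac{\prod_{v\in V(T)} \mathcal{F}\big(D(\phi_{k_v})\Gamma'(\ell(\sigma(v)))\big)(\xi_v)}{\prod_{v\in V(T)} \big(\xi_v + \sum_{w\twoheadrightarrow v}\xi_w\big)}. \tag{4.11} \]

Write, for $\xi = (\xi_v)_{v\in V(T)}$,

\[ \Psi(\xi) = \prod_{v\in V(T)} \frac{\xi_v}{\xi_v + \sum_{w\twoheadrightarrow v}\xi_w} \tag{4.12} \]

and

\[ \Psi_1(\mathbf{k}) = \prod_{v\in V(T)} \frac{2^{|k_v|}}{2^{|k_{w_{max}(v)}|}}. \tag{4.13} \]

Let finally

\[ \Psi_{\mathbf{k}}(\xi) := \Big( \prod_{v\in V(T)} \phi_{k_v}(\xi_v) \Big) \cdot \frac{\Psi(\xi)}{\Psi_1(\mathbf{k})}. \tag{4.14} \]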
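By Lemma 4.4, $\|\Psi_{\mathbf{k}}\|_{S^0(\mathbb{R}^{V(T)})}$, see Proposition 5.8, is uniformly bounded in $\mathbf{k}$ if $\mathbf{k} \in \mathbb{Z}^T_{reg}$, which is the key point for the following estimates. Let $k \in \mathbb{Z}$. Apply the operator $D(\phi_k)$ to Eq. (4.11): then, letting $\phi^*_k(\xi) := \phi_k\big(\sum_{v\in V(T)}\xi_v\big)$,

\[ D(\phi_k)G(x) = \Big[ \sum_{\mathbf{k}\in\mathbb{Z}^T_{reg}} \Psi_1(\mathbf{k})\, D(\Psi_{\mathbf{k}})\, D(\phi^*_k) \cdot \Big( \otimes_{v\in V(T)} D(\phi_{k_v})\Gamma(\ell(\sigma(v))) \Big) \Big](\mathbf{x}), \tag{4.15} \]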
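where $\mathbf{x} = (x_v)_{v\in V(T)} = (x, \ldots, x)$ is a vector with $|V(T)|$ identical components. Let $v_{max} := \sup\{v \mid v \in V(T)\}$. Note that $D(\phi^*_k) \cdot D(\otimes_{v\in V(T)}\phi_{k_v})$ vanishes except if

\[ \Big( \prod_{v\in V(T)} \mathrm{supp}(\phi_{k_v}) \Big) \cap \mathrm{supp}(\phi^*_k) \neq \emptyset, \tag{4.16} \]

which implies by Lemma 4.4,

\[ |k_{v_{max}} - k| = O(\log_2 |V(T)|); \tag{4.17} \]

namely, denoting by $0$ the root of $T$,

\[ |V(T)| \cdot |\xi_{k_{v_{max}}}| \ \ge\ \Big| \sum_{v\in V(T)} \xi_{k_v} \Big| = \Big| \xi_{k_0} + \sum_{w\twoheadrightarrow 0} \xi_{k_w} \Big| \ >\ \tfrac{1}{2} |\xi_{k_{v_{max}}}| \]

if $\xi_v \in \mathrm{supp}(\phi_{k_v})$ for every $v$. Since $\Psi_{\mathbf{k}}, \phi^*_k \in S^0(\mathbb{R}^{V(T)})$, one gets by Proposition 5.8,

\[ \|D(\phi_k)G\|_\infty \lesssim \sum_{\mathbf{k}\in\mathbb{Z}^T_{reg},\ k_{v_{max}}=k} \Psi_1(\mathbf{k}) \prod_{v\in V(T)} \|D(\phi_{k_v})\Gamma(\ell(\sigma(v)))\|_\infty. \tag{4.18} \]

Since $\Gamma$ is in $C^\alpha$, one obtains by Propositions 5.7 and 5.8:

\[ \|D(\phi_k)G\|_\infty \lesssim \sum_{\mathbf{k}\in\mathbb{Z}^T_{reg},\ k_{v_{max}}=k} \Psi_1(\mathbf{k}) \prod_{v\in V(T)} 2^{-|k_v|\alpha} = \sum_{\mathbf{k}\in\mathbb{Z}^T_{reg},\ k_{v_{max}}=k}\ \prod_{v\in V(T)} 2^{|k_v|(1-\alpha) - |k_{w_{max}(v)}|}. \tag{4.19} \]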
In other words, loosely speaking, each vertex $v \in V(T)$ contributes a factor $2^{|k_v|(1-\alpha) - |k_{w_{max}(v)}|}$ to $\|D(\phi_k)G\|_\infty$. If $v$ is a leaf, then this factor is simply $2^{-|k_v|\alpha}$. Note that the upper bound $2^{|k_v|(1-\alpha) - |k_{w_{max}(v)}|} \le 2^{-|k_v|\alpha}$ holds true for any vertex $v$. Consider an uppermost node $n$, i.e. a node to which no other node is connected, together with the set of leaves $\{w_1 < \ldots < w_J\}$ above $n$, see Fig. 5. Let $p_j = |V(Br(w_j \twoheadrightarrow n))|$. On the branch number $j$,

\[ 2^{-|k_{w_j}|\alpha} \prod_{v\in Br(w_j\twoheadrightarrow n)\setminus\{w_j\}}\ \sum_{|k_v|\le|k_{w_j}|} 2^{|k_v|(1-\alpha) - |k_{w_j}|} \ \lesssim\ 2^{-|k_{w_j}|\alpha p_j}, \tag{4.20} \]

and (summing over $k_{w_1}, \ldots, k_{w_{J-1}}$ and over $k_n$)

\[ 2^{-|k_{w_J}|\alpha p_J} \sum_{|k_{w_{J-1}}|\le|k_{w_J}|} 2^{-|k_{w_{J-1}}|\alpha p_{J-1}} \Bigg( \ldots \Bigg( \sum_{|k_{w_1}|\le|k_{w_2}|} 2^{-|k_{w_1}|\alpha p_1} \Bigg( \sum_{|k_n|\le|k_{w_1}|} 2^{|k_n|(1-\alpha) - |k_{w_J}|} \Bigg) \Bigg) \ldots \Bigg) \ \lesssim\ 2^{-|k_{w_J}|\alpha W(n)}, \tag{4.21} \]

where $W(n) = p_1 + \ldots + p_J + 1 = |\{v : v \twoheadrightarrow n\}| + 1$ is the weight of $n$. One may then consider the reduced tree $T_n$ obtained by shrinking all vertices above $n$ (including $n$) to one vertex with weight $W(n)$ and perform the same operations on $T_n$. Repeat this inductively until $T$ is shrunk to one point. In the end, one gets $\|D(\phi_k)G\|_\infty \lesssim 2^{-|k_{v_{max}}|\alpha|V(T)|} \lesssim 2^{-|k|\alpha|V(T)|}$, hence $G \in C^{|V(T)|\alpha}$.
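The elementary summation used repeatedly in the proof above can be made explicit. The following is a routine computation added for convenience; the constant $C_\alpha$ is our own notation. For an integer $m \ge 0$ and $\alpha \in (0,1)$, summing over $k_v \in \mathbb{Z}$ with $|k_v| \le m$ gives

```latex
\[
\sum_{|k_v|\le m} 2^{|k_v|(1-\alpha)-m}
  \;\le\; 2\cdot 2^{-m} \sum_{j=0}^{m} 2^{j(1-\alpha)}
  \;\le\; \frac{2}{1-2^{-(1-\alpha)}}\; 2^{-m}\, 2^{m(1-\alpha)}
  \;=\; C_\alpha\, 2^{-m\alpha},
\qquad C_\alpha := \frac{2}{1-2^{-(1-\alpha)}} .
\]
```

Taking $m = |k_{w_j}|$, each vertex below the leaf $w_j$ on a branch therefore contributes one more factor $2^{-|k_{w_j}|\alpha}$, which is how the exponent $\alpha p_j$ in Eq. (4.20) arises.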
Remark. Note that the above proof breaks down for the non-regularized quantities, since the function $\Psi_{\mathbf{k}}(\xi)$ is unbounded on $\mathbb{Z}^T_+ \setminus \mathbb{Z}^T_{reg}$. For instance, the Lévy area of fractional Brownian motion diverges below the barrier α = 1/4, see [11,32,33]. For deterministic, well-behaved paths with very regular, polynomially decreasing Fourier components, the unregularized integrals are probably well-defined at least for α > 1/2 (in which case the much simpler Young integral converges); otherwise the case is not even clear.

4.4. Estimate for the boundary term. We shall now prove a Hölder estimate corresponding to the boundary term. As in the previous paragraph, we let $\sigma \in \Sigma_n$ and $T$ be one of the forests $T^\sigma_j$, $j = 1, \ldots, J_\sigma$. Once again, recall that $T$ comes with a total ordering compatible with its tree partial ordering. The $\tilde P$-projection $\tilde P^+$ below is defined with respect to this total ordering.

Lemma 4.7 (Hölder regularity of the boundary term). The regularized boundary term $\big[ \mathrm{RI}_T\big( \tilde P^+( \otimes_{v\in V(T)} \Gamma(\ell(\sigma(v))) ) \big)(\partial) \big]_{ts}$ is $|V(T)|\alpha$-Hölder.

Proof. As in the previous proof, we assume implicitly that $T$ is a tree, but the proof generalizes with only very minor changes to the case of forests. Solving in terms of multiple cuts as in Sect. 3 the recursive definition of the boundary term $\big[ \mathrm{RI}_T\big( \tilde P^+( \otimes_{v\in V(T)} \Gamma(\ell(\sigma(v))) ) \big)(\partial) \big]_{ts}$ given in Sect. 2, one gets in the end a sum of 'skeleton-type' terms of the form (see Fig. 6)

\[ A_{ts} := [\delta \mathrm{RSkI}_{Roo(T)}]_{ts} \Big( \prod_{m=1}^{l-1} [\mathrm{RSkI}_{Lea_{\mathbf{v}_m}\circ Roo_{\mathbf{v}_{m+1}}(T)}]_s \Big) [\mathrm{RSkI}_{Lea_{\mathbf{v}_l}(T)}]_s \times \tilde P^+\big( \otimes_{v\in V(T)} \Gamma(\ell(\sigma(v))) \big), \tag{4.22} \]
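where $\mathbf{v}_l = (v_{l,1} < \ldots < v_{l,J_l}) \models V(T)$, $\mathbf{v}_{l-1} \models V(Roo_{\mathbf{v}_l}T)$, ..., $\mathbf{v}_1 = (v_{1,1}, \ldots, v_{1,J_1}) \models V(Roo_{\mathbf{v}_2}(T))$ and one has set for short $Roo(T) := Roo_{\mathbf{v}_1}(T)$.

First step. Let $U[\mathbf{k}] \subset \prod_{j=1}^{J_l} \mathbb{Z}_{reg}^{Lea_{v_{l,j}}T}$ be such that $\mathbf{k} = (k_{v_{l,1}}, \ldots, k_{v_{l,J_l}})$ (with $|k_{v_{l,1}}| \le \ldots \le |k_{v_{l,J_l}}|$) is fixed. Then (see after Eq. (4.19) in the proof of Lemma 4.6) each vertex $v$ contributes a factor $2^{|k_v|(1-\alpha) - |k_{w_{max}(v)}|} \le 2^{-|k_v|\alpha}$, hence

\[ \Big\| P^{U[\mathbf{k}]}\, \mathrm{RSkI}_{Lea_{\mathbf{v}_l}T}\big( \otimes_{v\in V(Lea_{\mathbf{v}_l}T)} \Gamma(\ell(\sigma(v))) \big) \Big\|_\infty \lesssim \prod_{v\in\mathbf{v}_l} \Big[ 2^{-|k_v|\alpha} \prod_{w\in Lea_vT\setminus\{v\}}\ \sum_{|k_w|\ge|k_v|} 2^{-|k_w|\alpha} \Big] \lesssim \prod_{v\in\mathbf{v}_l} 2^{-|k_v|\alpha|V(Lea_vT)|}. \tag{4.23} \]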
[Fig. 6. Here $V(Roo(T)) = \{0, 1, 2, 4\}$, $R(0) = R(4) = \emptyset$, $R(1) = \{v_{1,1}\}$, $R(2) = \{v_{1,2}\}$.]

Second step. More generally, let $B_s[\mathbf{k}]$ be the expression obtained by $\tilde P$-projecting

\[ \Big( \prod_{m=1}^{l-1} [\mathrm{RSkI}_{Lea_{\mathbf{v}_m}\circ Roo_{\mathbf{v}_{m+1}}(T)}]_s \Big) [\mathrm{RSkI}_{Lea_{\mathbf{v}_l}(T)}]_s\ \tilde P^+\big( \otimes_{v\in V(Lea_{\mathbf{v}_1}(T))} \Gamma(\ell(\sigma(v))) \big) \]

onto the sum of terms with some fixed value of the indices $\mathbf{k} = (k_{v_{1,1}}, \ldots, k_{v_{1,J_1}})$. Then

\[ \|B_s[\mathbf{k}]\|_\infty \lesssim \prod_{v\in\mathbf{v}_1} 2^{-|k_v|\alpha|V(Lea_vT)|} \tag{4.24} \]
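(proof by induction on $l$).

Third step. We define

\[ A_s(x) := [\mathrm{RSkI}_{Roo(T)}]_x \Big( \prod_{m=1}^{l-1} [\mathrm{RSkI}_{Lea_{\mathbf{v}_m}\circ Roo_{\mathbf{v}_{m+1}}(T)}]_s \Big) [\mathrm{RSkI}_{Lea_{\mathbf{v}_l}(T)}]_s\ \tilde P^+\big( \otimes_{v\in V(T)} \Gamma(\ell(\sigma(v))) \big) \tag{4.25} \]

(see Eq. (4.22)), so that $A_{ts} = A_s(t) - A_s(s)$, and show that $\sup_{s\in\mathbb{R}} \|x \mapsto A_s(x)\|_{B^\alpha_{\infty,\infty}} < \infty$. Note first (see the Remark following Lemma 4.6) that there is no infra-red divergence problem. Let $V(Roo(T)) = \{w_1 < \ldots < w_{max}\}$. Fix $s \in \mathbb{R}$ and $K \in \mathbb{Z}$. By definition, and by Lemma 4.5,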
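\[ (D(\phi_K)A_s)(x) = D(\phi_K)\Bigg( x \mapsto \sum_{\mathbf{k}=(k_{v_{1,1}},\ldots,k_{v_{1,J_1}})}\ \sum_{(k_w)_{w\in V(Roo(T))}\in S_{\mathbf{k}}} \int_{\prod_{v\in V(Roo(T))}\mathrm{supp}(\phi_{k_v})} \prod_{v\in V(Roo(T))} d\xi_v \; e^{\mathrm{i}x\sum_{v\in V(Roo(T))}\xi_v}\, \frac{\prod_{w\in V(Roo(T))} \mathcal{F}\big(D(\phi_{k_w})\Gamma'(\ell(\sigma(w)))\big)(\xi_w)}{\prod_{w\in V(Roo(T))} \big(\xi_w + \sum_{w'\twoheadrightarrow w,\ w'\in V(Roo(T))}\xi_{w'}\big)}\ B_s[\mathbf{k}] \Bigg), \tag{4.26} \]

where indices in $S_{\mathbf{k}}$ satisfy in particular the following conditions:

(i) $\big|\xi_w + \sum_{w'\twoheadrightarrow w,\ w'\in V(Roo(T))}\xi_{w'}\big| > \tfrac{1}{2}\max\{|\xi_{w'}| : w' \twoheadrightarrow w,\ w' \in V(Roo(T))\}$ by Lemma 4.4;
(ii) $\big(\prod_{w\in V(Roo(T))} \mathrm{supp}(\phi_{k_w})\big) \cap \mathrm{supp}(\phi^*_K) \neq \emptyset$, see Eq. (4.16);
(iii) for every $w \in V(Roo(T))$, $|k_w| \le |k_{w_{max}}|$; and
(iv) for every $w \in V(Roo(T))$, $|k_w| \le |k_v|$ for every $v \in R(w) := \{v = v_{1,1}, \ldots, v_{1,J_1} \mid v \to w\}$. Note that $R(w)$ may be empty. See Fig. 6.

Note that $|k_{w_{max}} - K| = O(\log_2 |V(Roo(T))|)$ by (ii) (see Eq. (4.17)). Hence conditions (ii) and (iii) above are more or less equivalent to fixing $k_{w_{max}} \simeq K$ and letting $(k_w)_{w\in V(Roo(T))\setminus\{w_{max}\}}$ range over some subset of $[-|K|, |K|] \times \ldots \times [-|K|, |K|]$.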
The large fraction in Eq. (4.26) contributes to $\|D(\phi_K)A_s\|_\infty$ an overall factor bounded by $|\Psi_1(\mathbf{k})| \prod_{w\in V(Roo(T))} 2^{-|k_w|\alpha}$. If $w \in Roo(T)$, split $R(w)$ into $R(w)_> \cup R(w)_<$, where $R(w)_\gtrless := \{v \in R(w) \mid v \gtrless w_{max}\}$. Summing over indices corresponding to vertices in or above $R_{T>} := \{v = v_{l,1}, \ldots, v_{l,J_l} \mid v > w_{max}\} = \cup_{w\in Roo(T)} R(w)_>$, one gets by Eq. (4.24) a quantity bounded up to a constant by

\[ \prod_{v\in R_{T>}}\ \sum_{|k_v|\ge|K|} 2^{-|k_v|\alpha|V(R_vT)|} \ \lesssim\ 2^{-|K|\alpha\sum_{v\in R_{T>}}|V(R_vT)|}. \tag{4.27} \]
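Let $w \in Roo(T) \setminus \{w_{max}\}$ be such that $R(w)_< \neq \emptyset$ (note that $R(w_{max})_< = \emptyset$). Let $R(w)_< = \{v_{i_1} < \ldots < v_{i_j}\}$. Then the sum over $(k_v)$, $v \in R(w)_<$, contributes a factor bounded by a constant times

\[ 2^{-|k_w|\alpha} \sum_{|k_{v_{i_1}}|=|k_w|}^{\infty}\ \sum_{|k_{v_{i_2}}|=|k_{v_{i_1}}|}^{\infty} \ldots \sum_{|k_{v_{i_j}}|=|k_{v_{i_{j-1}}}|}^{\infty} 2^{-|k_{v_{i_1}}|\alpha|V(Lea_{v_{i_1}}T)|} \ldots 2^{-|k_{v_{i_j}}|\alpha|V(Lea_{v_{i_j}}T)|} \ \lesssim\ 2^{-|k_w|\alpha\left(1+\sum_{v\in R(w)_<}|V(Lea_vT)|\right)}. \tag{4.28} \]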
In other words, each vertex $w \in Roo(T)$ 'behaves' as if it had a weight $1 + \sum_{v\in R(w)_<}|V(R_vT)|$. Hence (by the same method as in the proof of Lemma 4.6), letting $R_{T<} := \cup_{w\in Roo(T)} R(w)_<$,

\[ \|D(\phi_K)A_s\|_\infty \lesssim 2^{-|K|\alpha\left(|V(Roo(T))| + \sum_{v\in R_{T<}}|V(Lea_vT)|\right)} \cdot 2^{-|K|\alpha\sum_{v\in R_{T>}}|V(Lea_vT)|} = 2^{-|K|\alpha|V(T)|}. \tag{4.29} \]
5. Appendix. Hölder and Besov Spaces

We gather in this Appendix some definitions and technical facts about Besov spaces and Hölder norms that are required in Sects. 2 and 4.

Definition 5.1 (Hölder norm). If $f : \mathbb{R}^l \to \mathbb{R}$ is α-Hölder continuous for some α ∈ (0, 1), we let

\[ \|f\|_{C^\alpha} := \|f\|_\infty + \sup_{x,y\in\mathbb{R}^l} \frac{|f(x) - f(y)|}{\|x - y\|^\alpha}. \tag{5.1} \]

The space $C^\alpha = C^\alpha(\mathbb{R}^l)$ of real-valued α-Hölder continuous functions, provided with the above norm $\|\cdot\|_{C^\alpha}$, is a Banach space.

Proposition 5.2 [30]. Let $l \ge 1$. There exists a family of $C^\infty$ functions $\phi_0, (\phi_{1,j})_{j=1,\ldots,4^l-2^l} : \mathbb{R}^l \to [0, 1]$, satisfying the following conditions:

1. $\mathrm{supp}\,\phi_0 \subset [-2, 2]^l$ and $\phi_0|_{[-1,1]^l} \equiv 1$.
2. Cut $[-2, 2]^l$ into $4^l$ equal hypercubes of volume 1, and remove the $2^l$ hypercubes included in $[-1, 1]^l$. Let $K_1, \ldots, K_{4^l-2^l}$ be an arbitrary enumeration of the remaining hypercubes, and $\tilde K_j \supset K_j$ be the hypercube with the same center as $K_j$, but with edges twice longer. Then $\mathrm{supp}\,\phi_{1,j} \subset \tilde K_j$, $j = 1, \ldots, 4^l - 2^l$.
3. Let $(\phi_{k,j})_{k\ge2,\ j=1,\ldots,4^l-2^l}$ be the family of dyadic dilatations of $(\phi_{1,j})$, namely,

\[ \phi_{k,j}(\xi_1, \ldots, \xi_l) := \phi_{1,j}(2^{1-k}\xi_1, \ldots, 2^{1-k}\xi_l). \tag{5.2} \]

Then $(\phi_0, (\phi_{k,j})_{k\ge1,\ j=1,\ldots,4^l-2^l})$ is a partition of unity subordinated to the covering $[-2, 2]^l \cup \bigcup_{k\ge1}\bigcup_{j=1}^{4^l-2^l} 2^{k-1}\tilde K_j$, namely,

\[ \phi_0 + \sum_{k\ge1}\sum_{j=1}^{4^l-2^l} \phi_{k,j} \equiv 1. \tag{5.3} \]
Constructed in this almost canonical way, the family of Fourier multipliers $(\phi_0, (\phi_{k,j}))$ is immediately seen to be uniformly bounded for the norm $\|\cdot\|_{S^0(\mathbb{R}^l)}$ defined in Proposition 5.8 below. If $l = 1$, letting $K_1 = [1, 2]$ and $K_2 = [-2, -1]$, we shall write $\phi_1$, resp. $\phi_{-1}$, instead of $\phi_{1,1}$, resp. $\phi_{1,2}$, and define $\phi_k(\xi) = \phi_{\mathrm{sgn}(k)}(2^{1-|k|}\xi)$ for $|k| \ge 2$, so that $\sum_{k\in\mathbb{Z}} \phi_k \equiv 1$ and

\[ \mathrm{supp}\,\phi_0 \subset [-2, 2], \quad \mathrm{supp}\,\phi_k \subset [2^{k-1}, 5\times2^{k-1}], \quad \mathrm{supp}\,\phi_{-k} \subset [-5\times2^{k-1}, -2^{k-1}] \quad (k \ge 1). \tag{5.4} \]
In this particular case, such a family is easily constructed from an arbitrary even, smooth function $\phi_0 : \mathbb{R} \to [0, 1]$ with the correct support by setting $\phi_k(\xi) = 1_{\mathbb{R}_+}(\xi)\cdot(\phi_0(2^{-k}\xi) - \phi_0(2^{1-k}\xi))$ and $\phi_{-k}(\xi) = 1_{\mathbb{R}_-}(\xi)\cdot(\phi_0(2^{-k}\xi) - \phi_0(2^{1-k}\xi))$ for every $k \ge 1$ (see [31], §1.3.3). In order to avoid setting apart the one-dimensional case, we let $I_l := \mathbb{Z}$ if $l = 1$, and $I_l = \{0\} \cup \{(k, j) \mid k \ge 1,\ 1 \le j \le 4^l - 2^l\}$ if $l \ge 2$. Also, if $l \ge 2$, we define $|\kappa| = k \ge 1$ if $\kappa = (k, j)$ with $k \ge 1$.

Definition 5.3. Let $(\tilde\phi_\kappa)_{\kappa\in I_l}$ be the partition of unity of $\mathbb{R}^l$, $l \ge 1$, defined by (see Proposition 5.2):

(i) \[ \tilde\phi_0 := 1_{[-1,1]^l}, \quad \tilde\phi_{1,j} := 1_{K_j}; \tag{5.5} \]
(ii) if $k \ge 2$, \[ \tilde\phi_{k,j}(\xi_1, \ldots, \xi_l) := \tilde\phi_{1,j}(2^{1-k}\xi_1, \ldots, 2^{1-k}\xi_l). \tag{5.6} \]

We use this auxiliary partition several times in the text.

Definition 5.4 [30]. Let $\ell^\infty(L^\infty)$ be the space of sequences $(f_\kappa)_{\kappa\in I_l}$ of a.s. bounded functions $f_\kappa \in L^\infty(\mathbb{R}^l)$ such that

\[ \|f_\kappa\|_{\ell^\infty(L^\infty)} := \sup_{\kappa\in I_l} \|f_\kappa\|_\infty < \infty. \tag{5.7} \]

Let $S'(\mathbb{R}^l, \mathbb{R})$ be the dual of the Schwartz space of rapidly decreasing functions on $\mathbb{R}^l$. As is well-known, it includes the space of infinitely differentiable slowly growing functions.
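The one-dimensional construction of the family $(\phi_k)_{k\in\mathbb{Z}}$ described before Definition 5.3 can be sketched numerically. The code below is only an illustration: the particular cutoff `phi0`, built from the standard function $e^{-1/t}$, and all function names are our own choices and are not taken from the text; any even smooth $\phi_0$ equal to 1 on $[-1,1]$ and supported in $[-2,2]$ would do.

```python
import math


def h(t):
    """exp(-1/t) for t > 0 and 0 otherwise: smooth on the whole real line."""
    return math.exp(-1.0 / t) if t > 0 else 0.0


def phi0(xi):
    """Even smooth cutoff (our choice): 1 on [-1, 1], 0 outside (-2, 2)."""
    t = 2.0 - abs(xi)
    return h(t) / (h(t) + h(1.0 - t))


def phi(k, xi):
    """phi_k for k in Z: one-sided difference of two dilated copies of phi0."""
    if k == 0:
        return phi0(xi)
    n = abs(k)
    # Indicator of R_+ for k > 0, of R_- for k < 0.
    same_side = xi > 0 if k > 0 else xi < 0
    if not same_side:
        return 0.0
    return phi0(xi * 2.0 ** (-n)) - phi0(xi * 2.0 ** (1 - n))


def partition_sum(xi, kmax):
    """Partial sum of phi_k(xi) over |k| <= kmax; telescopes to phi0(xi / 2**kmax)."""
    return sum(phi(k, xi) for k in range(-kmax, kmax + 1))
```

With this choice the partial sums telescope, so `partition_sum(xi, kmax)` equals 1 (up to rounding) as soon as $|\xi| \le 2^{k_{max}}$, and `phi(k, .)` for $k \ge 1$ vanishes outside $[2^{k-1}, 2^{k+1}]$, which is contained in the interval given in Eq. (5.4).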
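The following definition is classical. Recall that the Fourier transform $\mathcal{F}$ has been defined at the end of the Introduction.

Definition 5.5 (Fourier multipliers). Let $m : \mathbb{R}^l \to \mathbb{R}$ be an infinitely differentiable slowly growing function. Then

\[ D(m) : S'(\mathbb{R}^l, \mathbb{R}) \to S'(\mathbb{R}^l, \mathbb{R}), \quad \phi \mapsto \mathcal{F}^{-1}(m \cdot \mathcal{F}\phi) \tag{5.8} \]

defines a continuous operator. In other words, $m$ is a Fourier multiplier of $S'(\mathbb{R}^l, \mathbb{R})$.

Definition 5.6 [30]. Let $B^\alpha_{\infty,\infty}(\mathbb{R}^l) := \{f \in S'(\mathbb{R}^l, \mathbb{R}) \mid \|f\|_{B^\alpha_{\infty,\infty}} < \infty\}$, where

\[ \|f\|_{B^\alpha_{\infty,\infty}} := \|2^{\alpha|\kappa|} D(\phi_\kappa) f\|_{\ell^\infty(L^\infty)} = \sup_{\kappa\in I_l} 2^{\alpha|\kappa|} \|D(\phi_\kappa) f\|_\infty. \tag{5.9} \]

Proposition 5.7 (see [30], §2.2.9). For every α ∈ (0, 1), $B^\alpha_{\infty,\infty}(\mathbb{R}^l) = C^\alpha(\mathbb{R}^l)$, and the two norms $\|\cdot\|_{C^\alpha}$ and $\|\cdot\|_{B^\alpha_{\infty,\infty}}$ are equivalent.

We shall sometimes call $\|\cdot\|_{B^\alpha_{\infty,\infty}}$ the Hölder-Besov norm. Let us finally give a criterion for a function $m$ to be a Fourier multiplier of the Besov space $B^\alpha_{\infty,\infty}$: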
Proposition 5.8 (Fourier multipliers) (see [30], §2.1.3, p. 30). Let α ∈ (0, 1) and $m : \mathbb{R}^l \to \mathbb{R}$ be an infinitely differentiable function such that

\[ \|m\|_{S^0(\mathbb{R}^l)} := \sup_{|j|\le l+5}\ \sup_{\xi\in\mathbb{R}^l} \big| (1 + \|\xi\|)^{|j|}\, m^{(j)}(\xi) \big| < \infty, \tag{5.10} \]

where $j = (j_1, \ldots, j_l)$, $|j| = j_1 + \ldots + j_l$ and $m^{(j)} := \partial^{j_1}_{\xi_1} \ldots \partial^{j_l}_{\xi_l} m$. Then there exists a constant $C$ depending only on α, such that

\[ \|D(m)f\|_{B^\alpha_{\infty,\infty}} \le C\, \|m\|_{S^0(\mathbb{R}^l)}\, \|f\|_{B^\alpha_{\infty,\infty}}. \]
The space S 0 (Rl ) contains the space of translation-invariant pseudo-differential symbols of order 0 (see for instance [2], Def. 1.1, or [29]). References 1. Bass, R.F., Hambly, B.M., Lyons, T.J.: Extending the Wong-Zakai theorem to reversible Markov processes. J. Eur. Math. Soc. 4, 237–269 (2002) 2. Benassi, A., Jaffard, S., Roux, D.: Elliptic Gaussian random processes. Rev. Mat. Iberoamericana 13(1), 19–90 (1997) 3. Brouder, C., Frabetti, A.: QED Hopf algebras on planar binary trees. J. Alg. 267, 298–322 (2003) 4. Brouder, C., Frabetti, A., Krattenthaler, C.: Non-commutative Hopf algebra of formal diffeomorphisms. Adv. in Math. 200, 479–524 (2006) 5. Butcher, J.C.: An algebraic theory of integration methods. Math. Comp. 26, 79–106 (1972) 6. Calaque, D., Ebrahimi-Fard, K., Manchon, D.: Two Hopf algebras of trees interacting. Preprint http:// arxiv.org/abs/0806.2238v3[math.co], 2009
7. Chapoton, F., Livernet, M.: Relating two Hopf algebras built from an operad, International Mathematics Research Notices, Vol. 2007, Article ID rnm131 8. Connes, A., Kreimer, D.: Hopf algebras, renormalization and non-commutative geometry. Commun. Math. Phys. 199(1), 203–242 (1998) 9. Connes, A., Kreimer, D.: Renormalization in quantum field theory and the Riemann-Hilbert problem (I). Commun. Math. Phys. 210(1), 249–273 (2000) 10. Connes, A., Kreimer, D.: Renormalization in quantum field theory and the Riemann-Hilbert problem (II). Commun. Math. Phys. 216(1), 215–241 (2001) 11. Coutin, L., Qian, Z.: Stochastic analysis, rough path analysis and fractional Brownian motions. Prob. Th. Rel. Fields 122(1), 108–140 (2002) 12. Darses, S., Nourdin, I., Nualart, D.: Limit theorems for nonlinear functionals of Volterra processes via white-noise analysis. http://arxiv.org/abs/0904.1401v1[math.PR], 2009 13. Foissy, L.: Les algèbres de Hopf des arbres enracinés décorés (I). Bull. Sci. Math. 126 (3), 193–239, and (II), Bull. Sci. Math. 126(4), 249–288 (2002) 14. Friz, P., Victoir, N.: Multidimensional dimensional processes seen as rough paths. Cambridge studies in Adv. Math. 120, Cambridge: Cambridge University Press, 2010 15. Gubinelli, M.: Controlling rough paths. J. Funct. Anal. 216, 86–140 (2004) 16. Gubinelli, M.: Ramification of rough paths. Preprint available on http://arxiv.org/abs/math/ 0306433v2[math.PR], 2003 17. Hepp, K.: Proof of the Bogoliubov-Parasiuk theorem on renormalization. Commun. Math. Phys. 2(4), 301–326 (1966) 18. Hambly, B., Lyons, T.J.: Stochastic area for Brownian motion on the Sierpinski basket. Ann. Prob. 26(1), 132–148 (1998) 19. Kahane, J.-P.: Some random series of functions. Cambridge studies in advanced mathematics 5, Cambridge: Cambridge Univ. Press, 1985 20. Kreimer, D.: Chen’s iterated integral represents the operator product expansion. Adv. Theor. Math. Phys. 3(3), 627–670 (1999) 21. Lejay, A.: An introduction to rough paths. 
Séminaire de probabilités XXXVII, Lecture Notes in Mathematics, Berlin-Heidelberg-NewYork: Springer, 2003 22. Lyons, T., Qian, Z.: System control and rough paths. Oxford: Oxford University Press, 2002 23. Lyons, T., Victoir, N.: An extension theorem to rough paths. Ann. Inst. H. Poincaré Anal. Non Linéaire 24(5), 835–847 (2007) 24. Murua, A.: The shuffle Hopf algebra and the commutative Hopf algebra of labelled rooted trees. Available on www.ehu.es/ccwmuura/research/shart1bb.pdf, 2005 25. Murua, A.: The Hopf algebra of rooted trees, free Lie algebras, and Lie series. Found. Comput. Math. 6(4), 387–426 (2006) 26. Nualart, D.: Stochastic calculus with respect to the fractional Brownian motion and applications. Contemporary Mathematics 336, 3–39 (2003) 27. Rivasseau, V.: From Perturbative to Constructive Renormalization. Princeton Series in Physics, Princeton, NJ: Princeton Univ. Press, 1991 28. Tindel, S., Unterberger, J.: The rough path associated to the multidimensional analytic fBm with any Hurst parameter. Preprint available at http://arxiv.org/abs/0810.1408[math.PR], 2008 29. Treves, F.: Introduction to pseudodifferential and Fourier integral operators. Vol. 1. Pseudodifferential operators, The University Series in Mathematics, New York-London: Plenum Press, 1980 30. Triebel, H.: Spaces of Besov-Hardy-Sobolev type. Leipzig: Teubner, 1978 31. Triebel, H.: Theory of function spaces. II. Monographs in Mathematics, 84, Basel: Birkhäuser, 1992 32. Unterberger, J.: Stochastic calculus for fractional Brownian motion with Hurst parameter H > 1/4; a rough path method by analytic extension. Ann. Prob. 37(2), 565–614 (2009) 33. Unterberger, J.: A central limit theorem for the rescaled Lévy area of two-dimensional fractional Brownian motion with Hurst index H < 1/4. Preprint available at http://arxiv.org/abs/0808.3458v2[math.PR], 2008 34. Unterberger, J.: A rough path over multi-dimensional fractional Brownian motion with arbitrary Hurst index by Fourier normal ordering. 
Preprint available at http://arxiv.org/abs/0901.4771v2[math.PR], 2009 35. Unterberger, J.: A Lévy area by Fourier normal ordering for multidimensional fractional Brownian motion with small Hurst index. Preprint available at http://arxiv.org/abs/0906.1416v1[math.PR], 2009 36. Waldschmidt, M.: Valeurs zêta multiples. Une introduction. Journal de Théorie Des Nombres de Bordeaux 12(2), 581–595 (2000) Communicated by A. Connes
Commun. Math. Phys. 298, 37–64 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1066-z
Communications in
Mathematical Physics
Geometrization and Generalization of the Kowalevski Top Vladimir Dragović1,2 1 Mathematical Institute SANU, Kneza Mihaila 36, 11000 Belgrade, Serbia. E-mail:
[email protected] 2 Mathematical Physics Group, University of Lisbon, Av. Prot. Gama Pinto, 2, PT-1649-003 Lisboa, Portugal
Received: 19 May 2009 / Accepted: 16 February 2010 Published online: 20 May 2010 – © Springer-Verlag 2010
Dedicated to my teacher Boris Anatol’evich Dubrovin on the occasion of his sixtieth birthday Abstract: A new view on the Kowalevski top and the Kowalevski integration procedure is presented. For more than a century, the Kowalevski 1889 case, has attracted full attention of a wide community as the highlight of the classical theory of integrable systems. Despite hundreds of papers on the subject, the Kowalevski integration is still understood as a magic recipe, an unbelievable sequence of skillful tricks, unexpected identities and smart changes of variables. The novelty of our present approach is based on our four observations. The first one is that the so-called fundamental Kowalevski equation is an instance of a pencil equation of the theory of conics which leads us to a new geometric interpretation of the Kowalevski variables w, x1 , x2 as the pencil parameter and the Darboux coordinates, respectively. The second is observation of the key algebraic property of the pencil equation which is followed by introduction and study of a new class of discriminantly separable polynomials. All steps of the Kowalevski integration procedure are now derived as easy and transparent logical consequences of our theory of discriminantly separable polynomials. The third observation connects the Kowalevski integration and the pencil equation with the theory of multi-valued groups. The Kowalevski change of variables is now recognized as an example of a two-valued group operation and its action. The final observation is surprising equivalence of the associativity of the two-valued group operation and its action to the n = 3 case of the Great Poncelet Theorem for pencils of conics.
Contents
1. Introduction
2. Pencils of Conics and Discriminantly Separable Polynomials
2.1 Pencils of conics and the Darboux coordinates
2.2 Discriminantly separable polynomials
3. Geometric Interpretation of the Kowalevski Fundamental Equation
4. Generalized Integrable System
4.1 Equations of motion and the first integrals
4.2 Generalized Kotter transformation
4.3 Interpretation of the equations of motion
5. Two-Valued Groups, Kowalevski Equation and Poncelet Porism
5.1 Multivalued groups: defining notions
5.2 The simplest case: 2-valued group p2
5.3 2-valued group structure on CP^1, the Kowalevski fundamental equation and Poncelet porism
1. Introduction The goal of this paper is to give a new view on the Kowalevski top and the Kowalevski integration procedure. For more than a century, the Kowalevski 1889 case [25], has attracted the full attention of a wide community as the highlight of the classical theory of integrable systems. Despite hundreds of papers on the subject, the Kowalevski integration is still understood as a magic recipe, an unbelievable sequence of skillful tricks, unexpected identities and smart changes of variables (see for example [1,2,4,11,14,17,18,20,22–24,26–29,32] and references therein). The novelty of this paper is based on our four observations. The first one is that the so-called fundamental Kowalevski equation (see [20,24,25]) Q(w, x1 , x2 ) = 0, is an instance of a pencil equation from the theory of conics. This leads us to a new interpretation of the Kowalevski variables w, x1 , x2 as the pencil parameter and the Darboux coordinates respectively. Origins and classical applications of the Darboux coordinates can be found in Darboux’s book [9], while some modern applications can be found in [12,13]. The second is observation of the key algebraic property of the pencil equation: all three of its discriminants are expressed as products of two polynomials in one variable each: Dw (Q)(x1 , x2 ) = f 1 (x1 ) f 2 (x2 ), Dx1 (Q)(w, x2 ) = f 3 (w) f 2 (x2 ), Dx2 (Q)(w, x1 ) = f 1 (x1 ) f 3 (w). This serves us as a motivation to introduce a new class of what we call discriminantly separable polynomials. We develop the theory of such polynomials. All steps of the Kowalevski integration now follow as easy and transparent logical consequences of our theory of the discriminantly separable polynomials. The third observation connects the Kowalevski integration and the pencil equation with the theory of multivalued groups. The theory of multivalued groups started in the beginning of the 1970’s by Buchstaber and Novikov (see [5]). 
It has been further developed by Buchstaber and his collaborators in the last forty years (see [6–8]). The Kowalevski change of variables is now recognized as a case of the two-valued group operation $(\Gamma_2, \mathbb{Z}_2)$ and its action, where $\Gamma_2$ is an elliptic curve and $\mathbb{Z}_2$ its subgroup. Our final observation is the surprising equivalence of the associativity condition for this two-valued group operation to a case of the Great Poncelet Theorem for triangles. A well-known mechanical interpretation of the Great Poncelet Theorem is connected with
integrable billiards, see for example [15]. The Great Poncelet Theorem is the milestone of the theory of pencils of conics and the whole classical projective geometry (see [30], and also [3,15,16] and references therein), as the Kowalevski top is the milestone of the classical integrable systems. Now we manage to relate them closely. As a consequence, we get a new connection between the Great Poncelet Theorem and integrable mechanical systems, this time from rigid- body dynamics. The paper is organized as follows. Section 2 starts with a subsection devoted to the pencils of conics and the Darboux coordinates. We derive the key property of the pencil equation-discriminant separability. In the second subsection, we formally introduce the class of discriminantly separable polynomials and systematically study this class. In Sect. 3 we show how the Kowalevski case is embedded into our more general framework. A new geometric interpretation of the Kowalevski variables (w, x1 , x2 ) as the pencil parameter and the Darboux coordinates is obtained. In Sect. 4 general systems are defined, related to the general equation of the pencil. The Kowalevski top can be seen as a special subcase. The first integrals are studied. Their properties are related to the properties of discriminantly separable polynomials, obtained in Sect. 2. It was done by use of what we call the Kotter trick (see [20,24]). The nature of this transformation is going to be clarified in Sect. 5 through the theory of multivalued groups. Then, we manage to generalize another Kotter transformation and this gives us a possibility to integrate the general system defined at the beginning of this section. We reduce the problem to the functions Pi , i = 1, 2, 3. The evolution of those functions in terms of the theta-functions was obtained by Kowalevski herself in [25]. A modern account of the theta-functions and their applications to nonlinear equations can be found for example in [17]. 
Section 5 is devoted to two-valued groups and their connection with the Kowalevski top and the Great Poncelet Theorem. In order to make the text as self-contained as possible, we start the section with a brief introduction to the theory of multivalued groups, following works of Buchstaber and his co-workers. The main role is played by a two-valued coset group obtained from an elliptic curve $\Gamma_2$ and its subgroup $\mathbb{Z}_2$. It appears that the Kowalevski change of variables has its natural expression through this two-valued group and its action. These results complete the picture obtained before by Weil in [33] and Jurdjevic [23]. Within this framework, we give an explanation of the Kotter trick, as we promised in Sect. 4. Finally, we show that the associativity condition for the two-valued group $(\Gamma_2, \mathbb{Z}_2)$ is equivalent to the famous Great Poncelet Theorem ([30]) in its basic n = 3 case.

2. Pencils of Conics and Discriminantly Separable Polynomials

2.1. Pencils of conics and the Darboux coordinates. Let us start with two conics $C_1$ and $C_2$ given by their tangential equations:

\[ C_1 : a_0w_1^2 + a_2w_2^2 + a_4w_3^2 + 2a_3w_2w_3 + 2a_5w_1w_3 + 2a_1w_1w_2 = 0; \qquad C_2 : w_2^2 - 4w_1w_3 = 0. \tag{1} \]

We assume that conics $C_1$ and $C_2$ are in general position. Consider the pencil $C(s)$ of conics $C_1 + sC_2$. The conics from the pencil share four common tangents. The coordinate equation of the conics of the pencil is:

\[ F(s, z_1, z_2, z_3) := \det M(s, z_1, z_2, z_3) = 0, \tag{2} \]
where $M$ is a bordered matrix of the form

\[ M(s, z_1, z_2, z_3) = \begin{pmatrix} 0 & z_1 & z_2 & z_3 \\ z_1 & a_0 & a_1 & a_5 - 2s \\ z_2 & a_1 & a_2 + s & a_3 \\ z_3 & a_5 - 2s & a_3 & a_4 \end{pmatrix}. \tag{3} \]
Then the point equation of the pencil of conics $C(s)$ is of the form of the quadratic polynomial in $s$,

\[ F := H + Ks + Ls^2 = 0, \tag{4} \]

where $H$, $K$ and $L$ are quadratic expressions in $(z_1, z_2, z_3)$. Following Darboux (see [9]), we introduce a new system of coordinates in the plane. Given a plane with standard coordinates $(z_1, z_2, z_3)$, we start from the given conic $C_2$. The conic is given by Eq. (1) and it is rationally parameterized by $(1, \ell, \ell^2)$. The tangent line to the conic $C_2$ through the point with the parameter $\ell_0$ is given by the equation

\[ t_{C_2}(\ell_0) : z_1\ell_0^2 - 2z_2\ell_0 + z_3 = 0. \]

On the other hand, for a given point $P$ in the plane with coordinates $P = (\hat z_1, \hat z_2, \hat z_3)$ there correspond two solutions $x_1$ and $x_2$ of the equation quadratic in $\ell$:

\[ \hat z_1\ell^2 - 2\hat z_2\ell + \hat z_3 = 0. \tag{5} \]
Each solution corresponds to a tangent to the conic $C_2$ from the point $P$. We will call the pair $(x_1, x_2)$ the Darboux coordinates of the point $P$. One finds immediately the converse formulae

\[ \hat z_1 = 1, \quad \hat z_2 = \frac{x_1 + x_2}{2}, \quad \hat z_3 = x_1x_2. \tag{6} \]
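The passage between a point and its Darboux coordinates can be checked on a small example. The sketch below uses exact rational arithmetic; the function names are ours and purely illustrative. It relies on the fact that the conic parameterized by $(1, \ell, \ell^2)$ has point equation $z_2^2 - z_1z_3 = 0$.

```python
from fractions import Fraction


def darboux_to_point(x1, x2):
    """Converse formulae (6): the point (z1, z2, z3) with Darboux coordinates (x1, x2)."""
    return (Fraction(1), Fraction(x1 + x2) / 2, Fraction(x1 * x2))


def tangent_value(lam, z):
    """Left-hand side z1*lam^2 - 2*z2*lam + z3 of the tangent-line equation at z."""
    z1, z2, z3 = z
    return z1 * lam**2 - 2 * z2 * lam + z3


def on_conic_C2(z):
    """True when z lies on the conic parameterized by (1, lam, lam^2)."""
    z1, z2, z3 = z
    return z2**2 - z1 * z3 == 0
```

For instance, the point with Darboux coordinates $(3, -5)$ is $(1, -1, -15)$; it lies on the tangents to $C_2$ with parameters 3 and $-5$ and on no other tangent, and a point lies on $C_2$ itself exactly when $x_1 = x_2$.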
We change the variables in the polynomial $F$ from projective coordinates $(z_1 : z_2 : z_3)$ to the Darboux coordinates according to formulae (6). In the new coordinates we get the formulae:

\[ \begin{aligned}
H(x_1, x_2) &= (a_1^2 - a_0a_2)x_1^2x_2^2 + (a_0a_3 - a_5a_1)x_1x_2(x_1 + x_2) + \tfrac{1}{4}(a_5^2 - a_0a_4)(x_1^2 + x_2^2) \\
&\quad + \big(2(a_5a_2 - a_1a_3) + \tfrac{1}{2}(a_5^2 - a_0a_4)\big)x_1x_2 + (a_1a_4 - a_3a_5)(x_1 + x_2) + a_3^2 - a_2a_4, \\
K(x_1, x_2) &= -a_0x_1^2x_2^2 + 2a_1x_1x_2(x_1 + x_2) - a_5(x_1^2 + x_2^2) - 4a_2x_1x_2 + 2a_3(x_1 + x_2) - a_4, \\
L(x_1, x_2) &= (x_1 - x_2)^2.
\end{aligned} \tag{7} \]

We may notice for further reference that, with the normalization (6), $(x_1 - x_2)^2 = 4(z_2^2 - z_1z_3)$. Now, the polynomial

\[ F(s, x_1, x_2) = L(x_1, x_2)s^2 + K(x_1, x_2)s + H(x_1, x_2) \tag{8} \]
is of the second degree in each of the variables $s$, $x_1$ and $x_2$ and it is symmetric in $(x_1, x_2)$. It has one very exceptional property, as described in the next theorem. For a polynomial $P(y_1, y_2, \ldots, y_n)$ of variables $(y_1, y_2, \ldots, y_n)$ we will denote its discriminant with respect to the variable $y_i$ by $D_{y_i}(P)$, which is a polynomial in the rest of the variables $(y_1, \ldots, y_{i-1}, y_{i+1}, \ldots, y_n)$.

Theorem 1. (i) There exists a polynomial $P = P(x)$ such that the discriminant of the polynomial $F$ in $s$ as a polynomial in variables $x_1$ and $x_2$ separates the variables:

\[ D_s(F)(x_1, x_2) = P(x_1)P(x_2). \tag{9} \]

(ii) There exists a polynomial $J = J(s)$ such that the discriminant of the polynomial $F$ in $x_2$ as a polynomial in variables $x_1$ and $s$ separates the variables:

\[ D_{x_2}(F)(s, x_1) = J(s)P(x_1). \tag{10} \]
Due to the symmetry between x1 and x2 the last statement remains valid after exchanging the places of x1 and x2 . Proof.
(i) A general point belongs to two conics of a tangential pencil. If a point belongs to only one conic, then it belongs to one of the four common tangents of the pencil. At such a point, this unique conic touches one of the four common tangents. Thus, the equation

\[ D_s(F)(x_1, x_2) = 0, \tag{11} \]

which represents the condition of vanishing of the discriminant, is the equation of the four common tangents. Thus, Eq. (11) is equivalent to the system

\[ x_1 = c_1,\ x_1 = c_2,\ x_1 = c_3,\ x_1 = c_4,\ x_2 = c_1,\ x_2 = c_2,\ x_2 = c_3,\ x_2 = c_4, \]

where $c_i$ are parameters which correspond to the points of contact of the four common tangents with the conic $C_2$. As a consequence, we get $D_s(F)(x_1, x_2) = P(x_1)P(x_2)$, where the polynomial $P$ is of the fourth degree and of the form $P(x) = a(x - c_1)(x - c_2)(x - c_3)(x - c_4)$. This proves the first part of the theorem. The second part of the theorem follows from the following lemma:

Lemma 1. Given a polynomial $S = S(x, y, z)$ of the second degree in each of its variables in the form $S(x, y, z) = A(y, z)x^2 + 2B(y, z)x + C(y, z)$, if there are polynomials $P_1$ and $P_2$ of the fourth degree such that

\[ B(y, z)^2 - A(y, z)C(y, z) = P_1(y)P_2(z), \tag{12} \]

then there exists a polynomial $f$ such that $D_yS(x, z) = f(x)P_2(z)$ and $D_zS(x, y) = f(x)P_1(y)$.
Proof. To prove the lemma, rewrite Eq. (12) in the equivalent form (B + u A)2 − A(u 2 A + 2u B + C) = P1 (y)P2 (z). For a zero y = y0 of the polynomial P1 , any zero of S(u, y0 , z) as a polynomial in z is a double zero, according to the last equation. Thus, y0 is a zero of Dz S(x, y). Thus, the polynomial P1 is a factor of the polynomial Dz S(x, y). Since the degree of the polynomial P1 is four, then there exists a polynomial f in x such that Dz S(x, y) = f (x)P1 (y). The rest of the lemma follows by double application of the same arguments.
(ii) Now, the proof of the second part of Theorem 1 follows by immediate application of Lemma 1. Proposition 1.
(i) The explicit formulae for the polynomials P and J are P(x) = a0 x 4 − 4a1 x 3 + (2a5 + 4a2 )x 2 − 4a3 x + a4 , J (s) = −4s 3 + 4(a5 − a2 )s 2 + (a0 a4 − a52 + 4(a5 a2 − a1 a3 ))s −a32 a0
+ a0 a4 a2 + 2a1 a3 a5 − a4 a12
(13)
− a2 a52 .
(ii) If all the zeros of the polynomial P are simple, then the elliptic curves 1 : y 2 = P(x), 2 : t 2 = J (s) are isomorphic and the later can be understood as the Jacobian of the former. Proof. Instead of a straightforward calculation, we are going to consider a double-bordered determinant (see [9,21,31]) obtained from the matrix M (3): 0 0 z 1 z 2 z 3 0 0 z1 z2 z 3 a0 a1 a5 − 2s . (14) Mˆ = z 1 z 1 z z a a + s a 2 1 2 3 2 z z 3 a5 − 2s a3 a4 3
We apply the Jacobi identity and get $\hat M_{11}\hat M_{22} - (\hat M_{12})^2 = \hat M\,\hat M_{12,12}$. Obviously, $\hat M_{12,12}$ is a polynomial only in $s$ of the third degree:
$$\hat M_{12,12} = -4s^3 + 4(a_5 - a_2)s^2 + \bigl((a_0a_4 - a_5^2) + 4(a_5a_2 - a_1a_3)\bigr)s + a_0a_4a_2 - a_3^2a_0 + 2a_1a_3a_5 - a_4a_1^2 - a_2a_5^2 = J(s).$$
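The claim that the minor obtained by deleting the first two rows and columns of the bordered matrix equals $J(s)$ can be tested numerically. The following is a minimal sketch in Python with exact rational arithmetic; it assumes that the inner 3×3 block has rows $(a_0, a_1, a_5 - 2s)$, $(a_1, a_2 + s, a_3)$, $(a_5 - 2s, a_3, a_4)$, as read off from (14). The function names and the sample coefficients are hypothetical and chosen only for illustration.

```python
from fractions import Fraction as Fr

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def inner_block(s, a0, a1, a2, a3, a4, a5):
    """Lower-right 3x3 block of the bordered matrix (assumed layout)."""
    return [[a0, a1, a5 - 2 * s],
            [a1, a2 + s, a3],
            [a5 - 2 * s, a3, a4]]

def J(s, a0, a1, a2, a3, a4, a5):
    """The cubic J(s) as written in Eq. (13)."""
    return (-4 * s**3 + 4 * (a5 - a2) * s**2
            + (a0 * a4 - a5**2 + 4 * (a5 * a2 - a1 * a3)) * s
            - a3**2 * a0 + a0 * a4 * a2 + 2 * a1 * a3 * a5
            - a4 * a1**2 - a2 * a5**2)

# Arbitrary rational sample coefficients (a0, ..., a5); illustration only.
coeffs = (Fr(3), Fr(-1, 2), Fr(2), Fr(5, 3), Fr(-7), Fr(1, 4))
# Two cubics in s that agree at more than three points coincide.
samples = [Fr(n, 3) for n in range(-6, 7)]
identity_holds = all(
    det3(inner_block(s, *coeffs)) == J(s, *coeffs) for s in samples
)
print(identity_holds)
```

Because both sides are cubic polynomials in $s$, agreement at thirteen sample points establishes the identity for the chosen coefficients; other coefficient tuples can be substituted in the same way.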
Moreover, if we substitute
$$z_1 = 1,\quad z_2 = \frac{x_1 + x_2}{2},\quad z_3 = x_1x_2,\qquad z_1' = 1,\quad z_2' = \frac{x_1 + x_2'}{2},\quad z_3' = x_1x_2',$$
we have
$$\hat M = P(x_1)\frac{(x_2 - x_2')^2}{4},\qquad \hat M_{11} = F(s, x_1, x_2'),\qquad \hat M_{22} = F(s, x_1, x_2).$$
If we denote $F(s, x_1, x_2) = T(s, x_1)x_2^2 + V(s, x_1)x_2 + W(s, x_1)$, then
$$\hat M_{12} = Tx_2x_2' + V\frac{x_2 + x_2'}{2} + W.$$
From the last equations, after dividing by $(x_2 - x_2')^2$, we get $V^2 - 4TW = J(s)P(x_1)$, and the proof of the first part of the proposition is finished. The second part follows by direct calculation of the correspondence between two elliptic curves, one of which is defined by a polynomial of degree 3 and the other by a polynomial of degree 4.
2.2. Discriminantly separable polynomials. We saw that a polynomial of three variables which defines a pencil of conics has a very peculiar property: all three of its discriminants are representable as products of two polynomials of one variable each. These considerations motivate the following definition.
Definition 1. For a polynomial $F(x_1, \dots, x_n)$ we say that it is discriminantly separable if there exist polynomials $f_i(x_i)$ such that for every $i = 1, \dots, n$,
$$D_{x_i}F(x_1, \dots, \hat x_i, \dots, x_n) = \prod_{j \neq i} f_j(x_j).$$
It is symmetrically discriminantly separable if $f_2 = f_3 = \dots = f_n$, while it is strongly discriminantly separable if $f_1 = f_2 = f_3 = \dots = f_n$.
It is weakly discriminantly separable if there exist polynomials $f_i^j(x_i)$ such that for every $i = 1, \dots, n$,
$$D_{x_i}F(x_1, \dots, \hat x_i, \dots, x_n) = \prod_{j \neq i} f_j^i(x_j).$$
Theorem 2. Given a polynomial $F(s, x_1, x_2)$ of the second degree in each of the variables $s, x_1, x_2$ of the form $F = s^2A(x_1, x_2) + 2B(x_1, x_2)s + C(x_1, x_2)$, denote by $T_{B^2-AC}$ a $5 \times 5$ matrix such that
$$(B^2 - AC)(x_1, x_2) = \sum_{j=1}^{5}\sum_{i=1}^{5} T^{ij}_{B^2-AC}\,x_1^{i-1}x_2^{j-1}.$$
Then, polynomial $F$ is discriminantly separable if and only if $\operatorname{rank} T_{B^2-AC} = 1$.
Proof. The proof follows from Lemma 1 and the observation that a polynomial in two variables is equal to a product of two polynomials in one variable if and only if its matrix is equal to a tensor product of two vectors. The last condition is equivalent to the condition that the rank of the last matrix is equal to 1.
Proposition 2. Given a polynomial $F(s, x_1, x_2)$ of the second degree in each of the variables $s, x_1, x_2$ of the form $F = s^2A(x_1) + 2B(x_1, x_2)s + C(x_2)$, where $A$ depends only on $x_1$ and $C$ depends only on $x_2$, denote by $T_{B^2}$ a $5 \times 5$ matrix such that
$$(B^2)(x_1, x_2) = \sum_{j=1}^{5}\sum_{i=1}^{5} T^{ij}_{B^2}\,x_1^{i-1}x_2^{j-1}.$$
Then, polynomial $F$ is discriminantly separable if and only if $\operatorname{rank} T_{B^2} = 2$.
Proof. The proof follows from the observation of the proof of the last theorem and the fact that a matrix of rank two is equal to a sum of two matrices of rank one.
The last proposition gives a method to construct nonsymmetric discriminantly separable polynomials.
Lemma 2. Given an arbitrary quadratic polynomial $F = s^2A + 2Bs + C$, then the square of its differential is equal to its discriminant under the condition $F = 0$:
$$\left(\frac{dF}{ds}\right)^2 = 4(B^2 - AC).$$
Corollary 1. For an arbitrary discriminantly separable polynomial $F(x_3, x_1, x_2)$ of the second degree in each of the variables $x_3, x_1, x_2$, its differential is separable on the surface $F(x_3, x_1, x_2) = 0$:
$$\frac{dF}{\sqrt{f_3(x_3)f_1(x_1)f_2(x_2)}} = \frac{dx_3}{\sqrt{f_3(x_3)}} + \frac{dx_1}{\sqrt{f_1(x_1)}} + \frac{dx_2}{\sqrt{f_2(x_2)}}.$$
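The identity of Lemma 2, on which the corollary rests, is stated without proof. A one-line verification, written here as an illustration, is the following.

```latex
\begin{align*}
\left(\frac{dF}{ds}\right)^2
  &= (2As + 2B)^2
   = 4\bigl(A\,(As^2 + 2Bs) + B^2\bigr) \\
  &= 4\bigl(A\,(F - C) + B^2\bigr)
   = 4(B^2 - AC) + 4AF ,
\end{align*}
```

so that on the locus $F = 0$ the extra term $4AF$ vanishes and only the discriminant remains.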
The proof of the corollary is a straightforward application of the previous statements. This property of discriminantly separable polynomials is fundamental in their role in the theory of integrable systems. Observe that the analogous statement is valid for arbitrary discriminantly separable polynomials. From the last corollary, applied to a symmetric discriminantly separable polynomial of the second degree, a variant of the Euler theorem immediately follows.
Corollary 2. The condition $x_3 = \mathrm{const}$ defines a conic from the pencil as an integral curve of the Euler equation:
$$\frac{dx_1}{\sqrt{f_1(x_1)}} + \frac{dx_2}{\sqrt{f_1(x_2)}} = 0,$$
where $f_1$ is a general polynomial of degree 4.
Proposition 3. All symmetric discriminantly separable polynomials $F(s, x_1, x_2)$ of degree two in each variable with the leading coefficient $L(x_1, x_2) = (x_1 - x_2)^2$ are of the form $F(s, x_1, x_2) = (x_1 - x_2)^2s^2 + K(x_1, x_2)s + H(x_1, x_2)$, where $K$ and $H$ are given by formulae (7).
The next lemma gives a possibility to create new discriminantly separable polynomials from a given one.
Lemma 3. Given a discriminantly separable polynomial $F(s, x_1, x_2) := A(x_1, x_2)s^2 + 2B(x_1, x_2)s + C(x_1, x_2)$ of the second degree in each variable:
(a) Let $\alpha(x)$ be a linear transformation. Then the polynomial $F_1(s, x_1, x_2) := F(s, \alpha(x_1), x_2)$ is discriminantly separable.
(b) The polynomial $\hat F(s, x_1, x_2) := C(x_1, x_2)s^2 + 2B(x_1, x_2)s + A(x_1, x_2)$ is discriminantly separable.
The transformation from $F$ to $\hat F$ described in Lemma 3 (b) maps a solution $s$ of the equation $F = 0$ to $1/s$. We will use the term transposition for such a transformation from $F$ to $\hat F$. Thus, summarizing, we get
Corollary 3. Given a discriminantly separable polynomial $F(s, x_1, x_2) := A(x_1, x_2)s^2 + 2B(x_1, x_2)s + C(x_1, x_2)$ of the second degree in each variable and three fractionally-linear transformations $\alpha, \beta, \gamma$, then the polynomial $F_1(s, x_1, x_2) := F(\gamma(s), \alpha(x_1), \beta(x_2))$ is discriminantly separable.
From the last lemma we have a procedure to create non-symmetric discriminantly separable polynomials from a given symmetric discriminantly separable polynomial. The converse statement is also true:
Proposition 4. Given a discriminantly separable polynomial $F(s, x_1, x_2) := A(x_1, x_2)s^2 + 2B(x_1, x_2)s + C(x_1, x_2)$ of the second degree in each variable, suppose that a biquadratic $F(s_0, x_1, x_2)$ is nondegenerate for some value $s = s_0$. Then there exists a fractionally-linear transformation $\alpha$ such that the polynomial $F_1(s, x_1, x_2) := F(s, \alpha(x_1), x_2)$ is symmetrically discriminantly separable.
Proof. Let us fix an arbitrary value for $s$ such that $B(x_1, x_2)$ is a nondegenerate biquadratic. Keeping $s$ fixed, we have a relation
$$\frac{dx_1}{\sqrt{f_1(x_1)}} \pm \frac{dx_2}{\sqrt{f_2(x_2)}} = 0,$$
where $f_1, f_2$ are two polynomials, each in one variable. For a given $x_1$ there are two corresponding points $x_2$ and $\hat x_2$. The last two are connected by the relation
$$\frac{d\hat x_2}{\sqrt{f_2(\hat x_2)}} \pm \frac{dx_2}{\sqrt{f_2(x_2)}} = 0,$$
where now the denominators of both fractions contain one and the same polynomial, $f_2$. This means that there exists an elliptic function $u$ of degree two and a shift $T$ on the elliptic curve $y^2 = f_2(x)$, such that $x_2$ and $\hat x_2$ are parameterized by $x_2 = u(z)$, $\hat x_2 = u(z + T)$. From the relations $B(x_1, x_2) = 0$, $B(x_1, \hat x_2) = 0$, we see that both $y$ and $y^2$ are elliptic functions of degree at most four which can be expressed through $x_2, \hat x_2$. Thus, $y$ is an elliptic function of degree two. There is a fractional-linear transformation which reduces $y$ to $u(z + T/2)$. This concludes the proof of the proposition.
3. Geometric Interpretation of the Kowalevski Fundamental Equation
The magic integration of the Kowalevski top is based on the Kowalevski fundamental equation, see [20,24]:
$$Q(w, x_1, x_2) := (x_1 - x_2)^2w^2 - 2R(x_1, x_2)w - R_1(x_1, x_2) = 0, \tag{15}$$
where
$$\begin{aligned} R(x_1, x_2) &= -x_1^2x_2^2 + 6l_1x_1x_2 + 2lc(x_1 + x_2) + c^2 - k^2, \\ R_1(x_1, x_2) &= -6l_1x_1^2x_2^2 - (c^2 - k^2)(x_1 + x_2)^2 - 4clx_1x_2(x_1 + x_2) + 6l_1(c^2 - k^2) - 4c^2l^2. \end{aligned} \tag{16}$$
If we replace in Eqs. (4) and (7) the following values for the coefficients:
$$a_0 = -2,\quad a_1 = 0,\quad a_5 = 0,\quad a_2 = 3l_1,\quad a_3 = -2cl,\quad a_4 = 2(c^2 - k^2), \tag{17}$$
and compare with (15) and (16), we get the following
Theorem 3. The Kowalevski fundamental equation represents a point pencil of conics given by their tangential equations
$$\hat C_1 : -2w_1^2 + 3l_1w_2^2 + 2(c^2 - k^2)w_3^2 - 4clw_2w_3 = 0;\qquad C_2 : w_2^2 - 4w_1w_3 = 0. \tag{18}$$
The Kowalevski variables $w, x_1, x_2$ in these geometric settings are the pencil parameter and the Darboux coordinates with respect to the conic $C_2$, respectively. The Kowalevski case corresponds to the general case under the restrictions $a_1 = 0$, $a_5 = 0$, $a_0 = -2$. The last of these three relations is just a normalization condition, provided $a_0 \neq 0$. The Kowalevski parameters $l_1, l, c$ are calculated by the formulae
$$l_1 = \frac{a_2}{3},\qquad l = \pm\frac{1}{2}\sqrt{-a_4 + \sqrt{a_4^2 + 4a_3^2}},\qquad c = \mp\frac{a_3}{\sqrt{-a_4 + \sqrt{a_4^2 + 4a_3^2}}},$$
provided that $l$ and $c$ are requested to be real. Let us mention at the end of this section that in the original paper [25], instead of the relation (15), Kowalevski used the equivalent one:
$$\hat Q(s, x_1, x_2) := (x_1 - x_2)^2\left(s - \frac{l_1}{2}\right)^2 - R(x_1, x_2)\left(s - \frac{l_1}{2}\right) - \frac{R_1(x_1, x_2)}{4} = 0.$$
The equivalence is obtained by putting $w = 2s - l_1$.
4. Generalized Integrable System
4.1. Equations of motion and the first integrals. We are going to consider the following system of differential equations on unknown functions $e_1, e_2, x_1, x_2, r, g$:
$$\begin{aligned} \frac{de_1}{dt} &= -\alpha e_1, \\ \frac{de_2}{dt} &= \alpha e_2, \\ \frac{dx_1}{dt} &= -\beta(rx_1 + cg), \\ \frac{dx_2}{dt} &= \beta(rx_2 + cg), \\ \frac{dr}{dt} &= -\beta(x_2 - x_1)(x_1 + x_2 + a_1) - \frac{\alpha}{2r}(e_1 - e_2), \\ \frac{dg}{dt} &= \frac{\beta}{2c}\bigl[(x_2 - x_1)(x_1x_2 - a_5) + e_1x_2 - e_2x_1\bigr] + \frac{2r\beta - \alpha}{2c^2g}\bigl(e_1x_2^2 - e_2x_1^2\bigr). \end{aligned} \tag{19}$$
Here $\beta$ and $\alpha$ are given functions of $e_1, e_2, x_1, x_2, r, g$. The choice of their form defines different systems. The Kowalevski top is equivalent to the above system for $a_1 = 0$, $a_5 = 0$, with the choice
$$\alpha = ir,\qquad \beta = \frac{i}{2}. \tag{20}$$
We will assume in what follows that $a_1$ and $a_5$ are general. Besides the last choice for $\alpha$ and $\beta$, there are many other choices which also provide polynomial vector fields, such as
(A) $\alpha = kr^2$, $\beta = k_2r$,
(B) $\alpha = krg$, $\beta = k_1g$,
(C) $\alpha = kr^2g$, $\beta = k_1g$.
Interesting cases satisfy the system (38) from Proposition 8.
Proposition 5. The system (19) has the following first integrals:
$$\begin{aligned} k^2 &= e_1 \cdot e_2, \\ a_0a_2 &= e_1 + e_2 - (x_1 + x_2)^2 - 2a_1(x_1 + x_2) - r^2, \\ -\frac{a_0a_3}{2} &= -x_2e_1 - x_1e_2 + x_1x_2(x_1 + x_2) + \frac{a_5}{2}(x_1 + x_2) + a_1x_1x_2 - rg, \\ \frac{a_0a_4}{4} &= x_2^2e_1 + x_1^2e_2 - x_1^2x_2^2 - a_5x_1x_2 - g^2. \end{aligned} \tag{21}$$
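The first two integrals of Proposition 5 can be checked directly from the equations of motion. The short computation below is an illustration; it reads the fifth equation of (19) as $\dot r = -\beta(x_2 - x_1)(x_1 + x_2 + a_1) - \frac{\alpha}{2r}(e_1 - e_2)$, which is an assumption about the typeset form of the system.

```latex
\begin{align*}
\frac{d}{dt}(e_1e_2) &= (-\alpha e_1)e_2 + e_1(\alpha e_2) = 0, \\[4pt]
\dot x_1 + \dot x_2 &= -\beta(rx_1 + cg) + \beta(rx_2 + cg) = \beta r\,(x_2 - x_1), \\[4pt]
\frac{d}{dt}\Bigl[e_1 + e_2 - (x_1 + x_2)^2 - 2a_1(x_1 + x_2) - r^2\Bigr]
  &= -\alpha(e_1 - e_2) - 2(x_1 + x_2 + a_1)(\dot x_1 + \dot x_2) - 2r\dot r \\
  &= -\alpha(e_1 - e_2) - 2\beta r\,(x_2 - x_1)(x_1 + x_2 + a_1) \\
  &\quad + 2\beta r\,(x_2 - x_1)(x_1 + x_2 + a_1) + \alpha(e_1 - e_2) = 0 .
\end{align*}
```

Both cancellations hold for arbitrary functions $\alpha$ and $\beta$, which is consistent with the statement that the choice of $\alpha, \beta$ only selects a particular system within the family.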
One can rewrite the last relations in the following form:
$$\begin{aligned} k^2 &= e_1 \cdot e_2, \\ r^2 &= e_1 + e_2 + \hat E(x_1, x_2), \\ rg &= -x_2e_1 - x_1e_2 + \hat F(x_1, x_2), \\ g^2 &= x_2^2e_1 + x_1^2e_2 + \hat G(x_1, x_2), \end{aligned} \tag{22}$$
where
$$\begin{aligned} \hat E(x_1, x_2) &= -a_0a_2 - K(x_1 + x_2)^2 - 2a_1(x_1 + x_2), \\ \hat F(x_1, x_2) &= \frac{a_0a_3}{2} + Kx_1x_2(x_1 + x_2) + \frac{a_5}{2}(x_1 + x_2) + a_1x_1x_2, \\ \hat G(x_1, x_2) &= -\frac{a_0a_4}{4} - Kx_1^2x_2^2 - a_5x_1x_2, \end{aligned} \tag{23}$$
with $K = 1$.
Lemma 4. If the polynomials $\hat E, \hat F, \hat G$ are defined by Eq. (23) then the polynomial $P(x_1) := \hat E(x_1, x_2)x_1^2 + 2\hat F(x_1, x_2)x_1 + \hat G(x_1, x_2)$ depends only on $x_1$.
Proposition 6. Three polynomials $\hat E(x_1, x_2)$, $\hat F(x_1, x_2)$, $\hat G(x_1, x_2)$ of the second degree in each variable are given such that
(1) Polynomials $P, Q$ defined by
$$\begin{aligned} P(x_1) &:= \hat E(x_1, x_2)x_1^2 + 2\hat F(x_1, x_2)x_1 + \hat G(x_1, x_2), \\ Q(x_2) &:= \hat E(x_1, x_2)x_2^2 + 2\hat F(x_1, x_2)x_2 + \hat G(x_1, x_2) \end{aligned} \tag{24}$$
depend only on one variable each.
(2) Polynomials $R(x_1, x_2)$ and $R_1(x_1, x_2)$ defined by
$$\begin{aligned} R(x_1, x_2) &:= \hat E(x_1, x_2)x_1x_2 + \hat F(x_1, x_2)(x_1 + x_2) + \hat G(x_1, x_2), \\ R_1(x_1, x_2) &:= \hat E(x_1, x_2)\hat G(x_1, x_2) - \hat F^2(x_1, x_2) \end{aligned} \tag{25}$$
are of the second degree in each variable.
Then:
(a) The polynomials $\hat E(x_1, x_2)$, $\hat F(x_1, x_2)$, $\hat G(x_1, x_2)$ are symmetric in $x_1, x_2$.
(b) The polynomial $F(s, x_1, x_2) = (x_1 - x_2)^2s^2 - 2R(x_1, x_2)s - R_1(x_1, x_2)$ is discriminantly separable.
(c) The most general form of the polynomials $\hat E, \hat F, \hat G$ is given in Eq. (23), with $K$ arbitrary.
(d) For $K = 1$ the polynomial $P$ is the one given in Proposition 1.
Proof. The proof follows by straightforward calculation with application of Lemma 1. If the coefficient $K$ is nonzero we may normalize it to be equal to one. Under this assumption, Eqs. (23) with $K = 1$ are general. The case $K = 0$ is going to be analyzed separately in one of the following sections.
From Eqs. (22) we get the following
Corollary 4. The relation
$$e_2P(x_1) + e_1P(x_2) - H(x_1, x_2) + k^2(x_1 - x_2)^2 = 0 \tag{26}$$
is satisfied, where $P$ is the polynomial defined in Lemma 4.
Corollary 5. The differentials of $x_1$ and $x_2$ may be written in the form
$$\frac{dx_1}{dt} = -\beta\sqrt{P(x_1) + e_1(x_1 - x_2)^2},\qquad \frac{dx_2}{dt} = \beta\sqrt{P(x_2) + e_2(x_1 - x_2)^2}. \tag{27}$$
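The step from Eqs. (22) and Lemma 4 to the formula for $dx_1/dt$ can be written out as follows. This is a sketch under the assumption of the normalization $c = 1$, so that the third equation of (19) reads $\dot x_1 = -\beta(rx_1 + g)$; the case of $x_2$ is identical after exchanging the indices.

```latex
\begin{align*}
(rx_1 + g)^2
  &= r^2x_1^2 + 2\,rg\,x_1 + g^2 \\
  &= (e_1 + e_2 + \hat E)\,x_1^2
     + 2\,(-x_2e_1 - x_1e_2 + \hat F)\,x_1
     + x_2^2e_1 + x_1^2e_2 + \hat G \\
  &= \underbrace{\hat E x_1^2 + 2\hat F x_1 + \hat G}_{P(x_1)}
     + e_1\,(x_1^2 - 2x_1x_2 + x_2^2)
     + e_2\,(x_1^2 - 2x_1^2 + x_1^2) \\
  &= P(x_1) + e_1\,(x_1 - x_2)^2 ,
\end{align*}
```

so that $\dot x_1^2 = \beta^2\bigl(P(x_1) + e_1(x_1 - x_2)^2\bigr)$. The terms proportional to $e_2$ cancel identically, which is why only $e_1$ appears in the expression for $dx_1/dt$.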
The proof follows from Eqs. (22) and Lemma 4. Now, we apply what we are going to call the Kotter trick:
$$\left(\sqrt{e_1}\,\frac{\sqrt{P(x_2)}}{x_1 - x_2} \pm \sqrt{e_2}\,\frac{\sqrt{P(x_1)}}{x_1 - x_2}\right)^2 = (w_1 \pm k)(w_2 \mp k), \tag{28}$$
where $w_1, w_2$ are solutions of the quadratic equation
$$F(s, x_1, x_2) = (x_1 - x_2)^2s^2 - 2R(x_1, x_2)s - R_1(x_1, x_2) = 0. \tag{29}$$
The Kotter trick appeared in [24] quite mysteriously. The further explanation given by Golubev sixty years later seems to be even trickier and much less clear, see [20]. In the last section of this paper, see Proposition 11, we provide a new interpretation of this transformation as a commuting diagram of morphisms of a double-valued group. Should we hope that our explanation is more transparent than previous ones, since another sixty years passed in the meantime? From the last relations, following Kotter, one gets
$$\left(\frac{dx_1}{\sqrt{P(x_1)}\,dt}\right)^2 = \beta^2\left(1 + \frac{(x_1 - x_2)^4e_1P(x_2)}{P(x_1)P(x_2)(x_1 - x_2)^2}\right) = \beta^2\left(1 + \frac{\bigl(\sqrt{(w_1 - k)(w_2 + k)} + \sqrt{(w_1 + k)(w_2 - k)}\bigr)^2}{(w_1 - w_2)^2}\right),$$
$$\left(\frac{dx_2}{\sqrt{P(x_2)}\,dt}\right)^2 = \beta^2\left(1 + \frac{(x_1 - x_2)^4e_2P(x_1)}{P(x_1)P(x_2)(x_1 - x_2)^2}\right) = \beta^2\left(1 + \frac{\bigl(\sqrt{(w_1 - k)(w_2 + k)} - \sqrt{(w_1 + k)(w_2 - k)}\bigr)^2}{(w_1 - w_2)^2}\right).$$
Next, we get
$$\begin{aligned} \frac{dx_1}{\sqrt{P(x_1)}\,dt} &= -\beta\,\frac{\sqrt{(w_1 - k)(w_1 + k)} + \sqrt{(w_2 + k)(w_2 - k)}}{w_1 - w_2}, \\ \frac{dx_2}{\sqrt{P(x_2)}\,dt} &= -\beta\,\frac{\sqrt{(w_1 - k)(w_1 + k)} - \sqrt{(w_2 + k)(w_2 - k)}}{w_1 - w_2}. \end{aligned} \tag{30}$$
Now we apply the discriminant separability property of the polynomial $F$:
$$\begin{aligned} \frac{dx_1}{\sqrt{P(x_1)}} + \frac{dx_2}{\sqrt{P(x_2)}} &= \frac{dw_1}{\sqrt{J(w_1)}}, \\ \frac{dx_1}{\sqrt{P(x_1)}} - \frac{dx_2}{\sqrt{P(x_2)}} &= \frac{dw_2}{\sqrt{J(w_2)}}. \end{aligned} \tag{31}$$
We will refer to the last relations as the Kowalevski change of variables. The nature of these relations has been studied by Jurdjevic (see [23]) following Weil ([33]). We are going to develop these efforts further in Sect. 5, where we are going to show that the Kowalevski change of variables is the infinitesimal version of a double-valued group operation and its action. From the relations (31) and (30) we finally get:
$$\begin{aligned} \frac{dw_1}{\sqrt{\Phi(w_1)}} + \frac{dw_2}{\sqrt{\Phi(w_2)}} &= 0, \\ \frac{w_1\,dw_1}{\sqrt{\Phi(w_1)}} + \frac{w_2\,dw_2}{\sqrt{\Phi(w_2)}} &= 2\beta\,dt, \end{aligned} \tag{32}$$
where $\Phi(w) = J(w)(w - k)(w + k)$ is the polynomial of fifth degree. Thus, Eqs. (32) represent the Abel-Jacobi map of the genus 2 curve $y^2 = \Phi(w)$.
4.2. Generalized Kotter transformation. In order to integrate the dynamics on the Jacobian of the hyper-elliptic curve $y^2 = \Phi(w)$ we are going to generalize the classical Kotter transformation. In this section we will assume the normalization condition $a_0 = -2$.
Proposition 7. For the polynomial $F(s, x_1, x_2)$ there exist polynomials $A_0(s)$, $f(s)$, $A(s, x_1, x_2)$, $B(s, x_1, x_2)$ such that the following identity:
$$F(s, x_1, x_2) \cdot A_0(s) = A^2(s, x_1, x_2) + f(s) \cdot B(s, x_1, x_2) \tag{33}$$
is satisfied. The polynomials are defined by the formulae:
$$\begin{aligned} A(s, x_1, x_2) &= A_0(s)(x_1x_2 - s) + B_0(s)(x_1 + x_2) + M_0(s), \\ A_0(s) &= a_1^2 - a_0a_2 - sa_0, \\ B_0(s) &= \tfrac{1}{2}(a_0a_3 - a_5a_1 + 2sa_1), \\ M_0(s) &= a_5a_2 - a_1a_3 + s(a_1^2 + a_5), \\ B(s, x_1, x_2) &= (x_1 + x_2)^2 + 2a_1(x_1 + x_2) - 2s - 2a_2, \\ f(s) &= 2s^3 + 2(a_2 - a_5)s^2 + \Bigl(2(a_1a_3 - a_5a_2) + a_4 + \frac{a_5^2}{2}\Bigr)s + f_0, \\ f_0 &= a_4a_2 - a_3^2 - a_1a_3a_5 + \frac{a_4a_1^2 + a_2a_5^2}{2}. \end{aligned}$$
For a5 = a1 = 0 the previous identity has been obtained in [24]. Following Kotter’s idea, consider the identity F(s) = F(u) + (s − u)F (u) + (s − u)2 . From the last two identities we get a quadratic equation in s − u, (s − u)2 (x1 − x2 )2 − 2(s − u)(R(x1 , x2 ) − u(x1 − x2 )) + f (u)B + (x1 − x2 )2 A2 . Corollary 6. (a) The solutions of the last equation satisfy the identity in u: (s1 − u)(s2 − u) =
A2 B + f (u) . (x1 − x2 )2 (x1 − x2 )2
(b) Denote m 1 , m 2 , m 3 the zeros of the polynomial f , and
Pi = (s1 − m i )(s2 − m i ), i = 1, 2, 3. Then 1 B0 (m i ) Pi = +m i (m i − a5 − 2a2 )−2a5 − a1 a3 , A0 (m i )x1 x2 + √ x1 − x2 A0 (m i ) i = 1, 2, 3. (34)
Now we introduce more convenient notation n i = m i + a12 + 2a2 , i = 1, 2, 3, x1 x2 + (2a12 + a5 + 2a2 ) + a21 (x1 − x2 ) , x1 − x2 1 Y = , x1 − x2 (a 3 + 2a2 a1 + 2a5 a1 + 2a3 )(x1 + x2 ) − 2(a12 + 2a2 )(a12 + a5 ) Z = 1 . x1 − x2
X =
Lemma 5. The quantities X, Y, Z satisfy the system of linear equations 1 P1 Z = √ , 2n 1 n1 1 P2 Z = √ , X − n2Y + 2n 2 n2 1 P3 Z = √ . X − n3Y + 2n 3 n3 X − n1Y +
(35)
Denote fˆ(x) = f (x − a12 − 2a2 ). One can easily solve the previous linear system and get Lemma 6. The solutions of the system (35) are √ √ √ P1 n 1 P2 n 2 P3 n 3 Y =− + + , fˆ (n 1 ) fˆ (n 2 ) fˆ (n 3 ) P1 P2 P3 Z = 2n 1 n 2 n 3 √ +√ +√ . n 1 fˆ (n 1 ) n 2 fˆ (n 2 ) n 3 fˆ (n 3 ) The expression in terms of theta functions for Pi = can be obtained from [25] paragraph 7.
√ (s1 − m i )(s2 − m i ) for i = 1, 2, 3
4.3. Interpretation of the equations of motion. Rigid-body coordinates. We are going to present briefly the interpretation of the equations of motion (19) in the standard rigid-body coordinates p, q, r, γ , γ , γ , where e1 = x12 + c(γ + iγ ), e2 = x22 + c(γ − iγ ), x1 + x2 , p= 2 x1 − x2 q= . 2i
From the last four equations of the system (19) we get p˙ = −iβrq, q˙ = iβr p, r˙ = 2βiq(2 p + a1 ) −
iα (2 pq + cγ ), r
(36)
β γ˙ = − (qia5 + 2icγ q − 2icγ p) c 2rβ − α + 2 (icγ ( p 2 − q 2 ) − 2icpqγ ), c γ while the equations for γ˙ , γ˙ can easily be obtained from the first two equations of the system (19): α 2 −x1 x˙1 − x2 x˙2 (x − x12 ) − iαγ + , 2c 2 c α −x1 x˙1 + x2 x˙2 γ˙ = (−x22 − x12 ) − iαγ + . 2c c γ˙ =
Finally, we get 2i(2βr − α) pq − iαγ + 2iβγ q, c 2i(2βr − α) 2 ( p − q 2 ) + iαγ − 2iβγ q. γ˙ = − c γ˙ =
(37)
Proposition 8. The system (36, 37) preserves the standard measure if and only if A0 α + A1 α p + A2 αq + A3 αr + A4 αγ + A5 αγ + A6 αγ + B0 β + B1 β p + B2 βq + B3 βr + B4 βγ + B5 βγ + B6 βγ = 0,
(38)
where A0 = r 2 γ p 2 + c2 γ 2 γ − 2r 2 pqγ + 2cγ 2 pq − r 2 γ q 2 , A1 = 0, A2 = 0, A3 = −2cγ 2 r pq − c2 γ 2 r γ , A4 = −2 pqr 2 γ 2 − γ r 2 cγ 2 , A5 = −2r 2 γ 2 q 2 + gr 2 cγ 2 + 2r 2 γ 2 p 2 , A6 = −r 2 γ γ p 2 + 2r 2 γ pqγ + r 2 γ γ q 2 , B0 = −2r 3 γ p 2 + 2r 3 γ q 2 + 4r 3 pqγ , B1 = −cr 3 qγ 2 , B2 = cr 3 pγ 2 , B3 = 4qr 2 cγ 2 p + 2qr 2 cγ 2 a1 , B4 = 2γ 3 qr 2 c + 4 pqr 3 γ 2 , B5 = −4r 3 γ 2 p 2 − 2γ 3 qr 2 c + 4r 3 γ 2 q 2 , B6 = −r 2 γ 2 qa5 −2r 3 γ γ q 2 −2r 2 γ 2 cγ q +2r 3 γ γ p 2 +2r 2 γ 2 cγ p−4r 3 γ pqγ .
Example 1. From the Kowalevski case, there is a pair $\alpha = ir$, $\beta = i/2$ which satisfies the system (37) written above. We give two more pairs: $\alpha_1 = 2r(p^2 + q^2)$, $\beta_1 = p^2 + q^2$, and $\alpha_2 = r\gamma$, $\beta_2 = 0$. Moreover, any linear combination of the pairs $(\alpha, \beta)$, $(\alpha_1, \beta_1)$ and $(\alpha_2, \beta_2)$ also gives a solution of the system (37) and provides a system with invariant standard measure.
Elastic deformations. Jurdjevic considered a deformation of the Kowalevski case associated to a Kirchhoff elastic problem, see [23]. The systems are defined by the Hamiltonians $H = M_1^2 + M_2^2 + 2M_3^2 + \gamma_1$, where deformed Poisson structures $\{\cdot, \cdot\}_\tau$ are defined by
$$\{M_i, M_j\}_\tau = \epsilon_{ijk}M_k,\qquad \{M_i, \gamma_j\}_\tau = \epsilon_{ijk}\gamma_k,\qquad \{\gamma_i, \gamma_j\}_\tau = \tau\epsilon_{ijk}M_k,$$
where the deformation parameter takes values $\tau = 0, 1, -1$. The classical Kowalevski case corresponds to the case $\tau = 0$. Denote
$$e_1 = x_1^2 - (\gamma_1 + i\gamma_2) + \tau,\qquad e_2 = x_2^2 - (\gamma_1 - i\gamma_2) + \tau,\qquad\text{where } x_{1,2} = \frac{M_1 \pm iM_2}{2}.$$
The integrals of motion
$$\begin{aligned} I_1 &= e_1e_2, \\ I_2 &= H, \\ I_3 &= \gamma_1M_1 + \gamma_2M_2 + \gamma_3M_3, \\ I_4 &= \gamma_1^2 + \gamma_2^2 + \gamma_3^2 + \tau(M_1^2 + M_2^2 + M_3^2) \end{aligned}$$
may be rewritten in the form (22):
$$\begin{aligned} k^2 &= I_1 = e_1 \cdot e_2, \\ M_3^2 &= e_1 + e_2 + \hat E(x_1, x_2), \\ M_3\gamma_3 &= -x_2e_1 - x_1e_2 + \hat F(x_1, x_2), \\ \gamma_3^2 &= x_2^2e_1 + x_1^2e_2 + \hat G(x_1, x_2), \end{aligned}$$
where
$$\begin{aligned} \hat G(x_1, x_2) &= -x_1^2x_2^2 - 2\tau x_1x_2 - 2\tau(I_1 - \tau) + \tau^2 - I_2, \\ \hat F(x_1, x_2) &= (x_1x_2 + \tau)(x_1 + x_2) + I_3, \\ \hat E(x_1, x_2) &= -(x_1 + x_2)^2 + 2(I_1 - \tau). \end{aligned}$$
Proposition 9. The corresponding pencil of conics is determined by equations
$$a_1 = 0,\quad a_5 = 2\tau,\quad a_2 = \frac{2(\tau - I_1)}{a_0},\quad a_3 = \frac{2I_3}{a_0},\quad a_4 = \frac{8\tau(I_1 - \tau) + 4(I_2 - \tau^2)}{a_0},$$
where a0 is arbitrary. 5. Two-Valued Groups, Kowalevski Equation and Poncelet Porism 5.1. Multivalued groups: defining notions. The structure of multivalued groups was introduced by Buchstaber and Novikov in 1971 (see [5]) in their study of characteristic classes of vector bundles, and it has been studied by Buchstaber and his collaborators since then (see [8] and references therein). Following [8], we give the definition of an n-valued group on X as a map: m : X × X → (X )n , m(x, y) = x ∗ y = [z 1 , . . . , z n ], where (X )n denotes the symmetric n th power of X and z i coordinates therein. Associativity is the condition of equality of two n 2 -sets [x ∗ (y ∗ z)1 , . . . , x ∗ (y ∗ z)n ], [(x ∗ y)1 ∗ z, . . . , (x ∗ y)n ∗ z], for all triplets (x, y, z) ∈ X 3 . An element e ∈ X is a unit if e ∗ x = x ∗ e = [x, . . . , x], for all x ∈ X . A map inv : X → X is an inverse if it satisfies e ∈ inv(x) ∗ x, e ∈ x ∗ inv(x), for all x ∈ X . Following Buchstaber, we say that m defines an n-valued group structure (X, m, e, inv) if it is associative, with a unit and an inverse. An n-valued group X acts on the set Y if there is a mapping φ : X × Y → (Y )n , φ(x, y) = x ◦ y, such that the two n 2 -multisubsets of Y , x1 ◦ (x2 ◦ y) (x1 ∗ x2 ) ◦ y, are equal for all x1 , x2 ∈ X, y ∈ Y . It is additionally required that e ◦ y = [y, . . . , y] for all y ∈ Y .
Example 2 (A two-valued group structure on Z+ , [7]). Let us consider the set of nonnegative integers Z+ and define a mapping m : Z+ × Z+ → (Z+ )2 , m(x, y) = [x + y, |x − y|]. This mapping provides a structure of a two-valued group on Z+ with the unit e = 0 and the inverse equal to the identity inv(x) = x. In [7] the sequence of two-valued mappings associated with the Poncelet porism was identified as the algebraic representation of this 2-valued group. Moreover, the algebraic action of this group on CP1 was studied and it was shown that in the irreducible case all such actions are generated by Euler-Chasles correspondences. In the sequel, we are going to show that there is another 2-valued group and its action on CP1 which is even more closely related to the Euler-Chasles correspondence and to the Great Poncelet Theorem, and which is at the same time intimately related to the Kowalevski fundamental equation and to the Kowalevski change of variables. However, we will start our approach with a simple example.
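The axioms of a two-valued group can be tested on Example 2 by brute force. The following Python sketch checks associativity as an equality of multisets, together with the unit and inverse properties, on an initial segment of $\mathbb{Z}_+$; the helper names `left`, `right` and the bound `N` are hypothetical choices made for this illustration.

```python
from itertools import product

def m(x, y):
    """Two-valued multiplication on Z_+: m(x, y) = [x + y, |x - y|]."""
    return [x + y, abs(x - y)]

def left(x, y, z):
    """The 4-element multiset x * (y * z), as a sorted list."""
    return sorted(w for t in m(y, z) for w in m(x, t))

def right(x, y, z):
    """The 4-element multiset (x * y) * z, as a sorted list."""
    return sorted(w for t in m(x, y) for w in m(t, z))

N = 12  # size of the tested range; any positive bound works
associative = all(
    left(x, y, z) == right(x, y, z)
    for x, y, z in product(range(N), repeat=3)
)
unit_ok = all(m(0, x) == [x, x] and m(x, 0) == [x, x] for x in range(N))
inverse_ok = all(0 in m(x, x) for x in range(N))
print(associative, unit_ok, inverse_ok)
```

Both sides of the associativity check reduce to the multiset of the four numbers $|x \pm y \pm z|$, which is the reason the comparison succeeds for every triple.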
5.2. The simplest case: 2-valued group $p_2$. Among the basic examples of multivalued groups, there are $n$-valued additive group structures on $\mathbb{C}$. For $n = 2$, this is a two-valued group $p_2$ defined by the relation
$$m_2 : \mathbb{C} \times \mathbb{C} \to (\mathbb{C})^2,\qquad x *_2 y = \bigl[(\sqrt{x} + \sqrt{y})^2,\ (\sqrt{x} - \sqrt{y})^2\bigr]. \tag{39}$$
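The two values of the product (39) are $(\sqrt{x} \pm \sqrt{y})^2$, and expanding the squares shows that both are roots of the symmetric polynomial $(x + y + z)^2 - 4(xy + yz + zx)$, viewed as a quadratic in $z$. The Python sketch below checks this on perfect squares, so that all arithmetic stays in the integers, and also computes the discriminant in $z$ in the convention $B^2 - AC$ for a quadratic $Az^2 + 2Bz + C$. The helper names are hypothetical.

```python
def p2(z, x, y):
    """The symmetric polynomial (x + y + z)^2 - 4(xy + yz + zx)."""
    return (x + y + z) ** 2 - 4 * (x * y + y * z + z * x)

def star2_on_squares(a, b):
    """Both values of x *_2 y for x = a^2, y = b^2 (a, b integers)."""
    return [(a + b) ** 2, (a - b) ** 2]

def disc_in_z(x, y):
    """B^2 - AC for p2 written as A z^2 + 2 B z + C, via evaluation."""
    c = p2(0, x, y)
    a = (p2(1, x, y) + p2(-1, x, y) - 2 * c) // 2
    b = (p2(1, x, y) - p2(-1, x, y)) // 4
    return b * b - a * c

pairs = [(a, b) for a in range(8) for b in range(8)]
roots_ok = all(
    p2(z, a * a, b * b) == 0 for a, b in pairs for z in star2_on_squares(a, b)
)
disc_ok = all(disc_in_z(x, y) == (2 * x) * (2 * y) for x, y in pairs)
print(roots_ok, disc_ok)
```

The second check is the factorization of the discriminant into a function of $x$ times a function of $y$, here with both factors of the form $2x$.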
The product $x *_2 y$ corresponds to the roots in $z$ of the polynomial equation $p_2(z, x, y) = 0$, where $p_2(z, x, y) = (x + y + z)^2 - 4(xy + yz + zx)$. Our starting point in this section is the following
Lemma 7. The polynomial $p_2(z, x, y)$ is discriminantly separable. The discriminants satisfy relations
$$D_z(p_2)(x, y) = P(x)P(y),\qquad D_x(p_2)(y, z) = P(y)P(z),\qquad D_y(p_2)(x, z) = P(x)P(z),$$
where $P(x) = 2x$.
The polynomial $p_2$, being discriminantly separable, generates a case of the generalized Kowalevski system of differential equations, but this time with $K = 0$. The system is defined by
$$\hat E = 0,\qquad \hat F = 1,\qquad \hat G = 0, \tag{40}$$
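and the equations of motion have the form
$$\begin{aligned} \frac{de_1}{dt} &= -\alpha e_1, \\ \frac{de_2}{dt} &= \alpha e_2, \\ \frac{dx_1}{dt} &= -\beta(rx_1 + cg), \\ \frac{dx_2}{dt} &= \beta(rx_2 + cg), \\ \frac{dr}{dt} &= -\frac{\alpha}{2r}(e_1 - e_2), \\ \frac{dg}{dt} &= 2\beta c + \frac{2r\beta - \alpha}{2c^2g}\bigl(e_1x_2^2 - e_2x_1^2\bigr). \end{aligned} \tag{41}$$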
In the standard rigid-body coordinates with $\alpha = ir$, $\beta = i/2$ the last two equations become r˙ = 2 pq + cγ, γ˙ = ic.
Lemma 8. The integrals of the system defined by Eqs. (40) are
$$\begin{aligned} k^2 &= e_1e_2, \\ r^2 &= e_1 + e_2, \\ crg &= 1 - x_1e_2 - x_2e_1, \\ c^2g^2 &= x_2^2e_1 + x_1^2e_2. \end{aligned}$$
From Lemma 8 we get the relation $2e_1x_2 + 2e_2x_1 - 1 + k^2(x_1 - x_2)^2 = 0$. Now, together with the first integral relation from Lemma 8, similarly as in the Kowalevski case, we get
$$\left(\sqrt{e_1}\,\frac{\sqrt{2x_2}}{x_1 - x_2} \pm \sqrt{e_2}\,\frac{\sqrt{2x_1}}{x_1 - x_2}\right)^2 = (w_1 \pm k)(w_2 \mp k), \tag{42}$$
where $w_1, w_2$ are solutions of the quadratic equation
$$F_2(w, x_1, x_2) := (x_1 - x_2)^2w^2 - 2(x_1 + x_2)w + 1 = 0. \tag{43}$$
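The relation obtained from Lemma 8 above follows by eliminating $r$ and $g$: the product of the second and fourth integrals must equal the square of the third. Written out as an illustration:

```latex
\begin{align*}
0 &= r^2 \cdot c^2g^2 - (crg)^2 \\
  &= (e_1 + e_2)(x_2^2e_1 + x_1^2e_2) - (1 - x_1e_2 - x_2e_1)^2 \\
  &= e_1e_2\,(x_1^2 + x_2^2) - 2x_1x_2\,e_1e_2 + 2x_1e_2 + 2x_2e_1 - 1 \\
  &= k^2\,(x_1 - x_2)^2 + 2e_1x_2 + 2e_2x_1 - 1 ,
\end{align*}
```

where the terms $x_2^2e_1^2$ and $x_1^2e_2^2$ cancel between the two products and $k^2 = e_1e_2$ is used in the last step.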
The polynomial F2 is obtained by transposition from the polynomial p2 and, thus, it is discriminantly separable: Dx (F2 )(y, z) = P(y)ϕ(z), where ϕ(z) = z 3 . Following lines of integration, we finally come to
Proposition 10. The system of differential equations defined by (40) is integrated through the solutions of the system
$$\begin{aligned} \frac{ds_1}{s_1\sqrt{\Phi_1(s_1)}} + \frac{ds_2}{s_2\sqrt{\Phi_1(s_2)}} &= 0, \\ \frac{ds_1}{\sqrt{\Phi_1(s_1)}} + \frac{ds_2}{\sqrt{\Phi_1(s_2)}} &= \frac{i}{2}\,dt, \end{aligned} \tag{44}$$
where $\Phi_1(s) = s(s - e_4)(s - e_5)$ is the polynomial of degree 3.
Similar systems appeared in a slightly different context in the works of Appel'rot, Mlodzeevskii, Delone in their study of degenerations of the Kowalevski top (see [1,11,29]). In particular, we may construct Delone-type solutions of the last system:
$$s_1 = 0,\qquad s_2 = \wp\!\left(\frac{i(t - t_0)}{4}\right).$$
We can also consider an integrable perturbation of the previous integrable system, defined by:
$$\begin{aligned} \hat E &= k_1 - 2a_1(x_1 + x_2), \\ \hat F &= k_2 + \frac{a_5}{2}(x_1 + x_2) + a_1x_1x_2, \\ \hat G &= k_3 - a_5x_1x_2. \end{aligned} \tag{45}$$
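The equations of motion have the form
$$\begin{aligned} \frac{de_1}{dt} &= -\alpha e_1, \\ \frac{de_2}{dt} &= \alpha e_2, \\ \frac{dx_1}{dt} &= -\beta(rx_1 + cg), \\ \frac{dx_2}{dt} &= \beta(rx_2 + cg), \\ \frac{dr}{dt} &= -\frac{\alpha}{2r}(e_1 - e_2) - \frac{a_1}{2}\beta(x_2 - x_1), \\ \frac{dg}{dt} &= 2\beta c + \frac{2r\beta - \alpha}{2c^2g}\bigl(e_1x_2^2 - e_2x_1^2\bigr) + \frac{a_5}{2}c\beta(x_2 - x_1). \end{aligned} \tag{46}$$
In the standard rigid-body coordinates with $\alpha = ir$, $\beta = i/2$, the last two equations become r˙ = 2 pq + cγ + (a1/2) q, γ˙ = ic(1 + i (a5/2) q).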
The corresponding polynomial $F(s, x_1, x_2) = (x_1 - x_2)^2s^2 - 2R(x_1, x_2)s - R_1(x_1, x_2)$, where
$$R(x_1, x_2) = \hat Ex_1x_2 + \hat F(x_1 + x_2) + \hat G,\qquad R_1(x_1, x_2) = \hat E\hat G - \hat F^2,$$
is discriminantly separable and $D_{x_1}F(s, x_2) = \varphi(s)P(x_2)$, where $\varphi(s) = (2s - a_5)(2a_1 + a_5s - 2s^2)$, $P(x) = 2x(2a_1x^2 - a_5x - 2)$.
5.3. 2-valued group structure on $\mathbb{CP}^1$, the Kowalevski fundamental equation and Poncelet porism. Now we pass to the general case. We are going to show that the general pencil equation represents an action of a two-valued group structure. Recognition of this structure enables us to give to 'the mysterious Kowalevski change of variables' a final algebro-geometric expression and explanation, developing further the ideas of Weil and Jurdjevic (see [23,33]). Amazingly, the associativity condition for this action from a geometric point of view is nothing else than the Great Poncelet Theorem for a triangle. As we have already mentioned, the general pencil equation $F(s, x_1, x_2) = 0$ is connected with two isomorphic elliptic curves $\Gamma_1 : y^2 = P(x)$, $\Gamma_2 : t^2 = J(s)$, where the polynomials $P, J$ of degree four and three respectively are defined by Eqs. (13). Suppose that the cubic one $\Gamma_2$ is rewritten in the canonical form $\Gamma_2 : t^2 = J(s) = 4s^3 - g_2s - g_3$. Moreover, denote by $\psi : \Gamma_2 \to \Gamma_1$ a birational morphism between the curves induced by a fractional-linear transformation $\hat\psi$ which maps three zeros of $J$ and $\infty$ to the four zeros of the polynomial $P$. The curve $\Gamma_2$ as a cubic curve has the group structure. Together with its subgroup $\mathbb{Z}_2$ it defines the standard two-valued group structure of coset type on $\mathbb{CP}^1$ (see [6,8]):
$$s_1 *_c s_2 = \left[-s_1 - s_2 + \left(\frac{t_1 - t_2}{2(s_1 - s_2)}\right)^2,\ -s_1 - s_2 + \left(\frac{t_1 + t_2}{2(s_1 - s_2)}\right)^2\right], \tag{47}$$
where $t_i$ satisfies $t_i^2 = J(s_i)$, $i = 1, 2$.
Theorem 4. The general pencil equation after fractional-linear transformations $F(s, \hat\psi^{-1}(x_1), \hat\psi^{-1}(x_2)) = 0$ defines the two-valued coset group structure $(\Gamma_2, \mathbb{Z}_2)$ defined by the relation (47).
Proof. After the fractional-linear transformations, the pencil equation obtains the form $F_1(s, x, y) = T(s, x)y^2 + V(s, x)y + W(s, x)$, where
$$\begin{aligned} T(s, x) &= -4x^2 + 4sx - s^2, \\ V(s, x) &= 4sx^2 + 2s^2x - 2xg_2 - g_2s - 4g_3, \\ W(s, x) &= -s^2x^2 - g_2xs - 4xg_3 - 2g_3s - \frac{g_2^2}{4}. \end{aligned}$$
We apply now a linear change of variables $\gamma$ on $s$: $m = \gamma(s) := s/2$, and get $F_2(m, x, y) = F_1(2m, x, y)$. Denote by $P = (m, n)$ and $M = (x, u)$ two arbitrary points on the curve $\Gamma_2$, which means $n^2 = 4m^3 - g_2m - g_3$, $u^2 = 4x^3 - g_2x - g_3$. We want to find points $N_1 = (y_1, v_1)$ and $N_2 = (y_2, v_2)$ on $\Gamma_2$ which correspond by $F_2$ to $P$ and $M$. These points are
$$\begin{aligned} y_1 &= \frac{-V(s, x) + 4nu}{2T(s, x)}, & v_1 &= -\frac{2xT(s, y_1) + V(s, y_1)}{4n}, \\ y_2 &= \frac{-V(s, x) - 4nu}{2T(s, x)}, & v_2 &= -\frac{2xT(s, y_2) + V(s, y_2)}{4n}. \end{aligned}$$
By trivial algebraic transformations
$$\begin{aligned} y_1 &= \frac{-4mx^2 - 4xm^2 + xg_2 + mg_2 + 2g_3 + 2nu}{-4(x - m)^2} \\ &= \frac{-4mx(x + m) + 4x^3 + 4m^3 - 4x^3 + xg_2 + g_3 - 4m^3 + mg_2 + g_3 + 2nu}{-4(x - m)^2} \\ &= -x - m + \left(\frac{u - n}{2(x - m)}\right)^2, \end{aligned}$$
we get the first part of the operation of the two-valued group $(\Gamma_2, \mathbb{Z}_2)$ defined by the relation (47). Applying similar transformations to $y_2$ we get the second part of the relation (47) as well. This ends the proof of the theorem.
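The first coordinates appearing in (47) are those produced by the classical chord construction on a cubic $t^2 = 4s^3 - g_2s - g_3$: the line through $(s_1, t_1)$ and $(s_2, \pm t_2)$ meets the curve in a third point whose $s$-coordinate is given by the formula. The Python sketch below verifies this with exact rational arithmetic. It fits $g_2, g_3$ so that two chosen rational points lie on the curve; the sample points and all function names are hypothetical and serve only as an illustration.

```python
from fractions import Fraction as Fr

def curve_through(pt1, pt2):
    """Return (g2, g3) such that both points satisfy t^2 = 4 s^3 - g2 s - g3."""
    (s1, t1), (s2, t2) = pt1, pt2
    g2 = (4 * (s1**3 - s2**3) - (t1**2 - t2**2)) / (s1 - s2)
    g3 = 4 * s1**3 - g2 * s1 - t1**2
    return g2, g3

def on_curve(s, t, g2, g3):
    """Exact membership test for the cubic curve."""
    return t**2 == 4 * s**3 - g2 * s - g3

def coset_product(pt1, pt2):
    """Third intersection points of the chords through pt1 and (s2, +t2), (s2, -t2)."""
    (s1, t1), (s2, t2) = pt1, pt2
    out = []
    for sign in (1, -1):
        lam = (t1 - sign * t2) / (s1 - s2)   # slope of the chord
        s3 = -s1 - s2 + lam**2 / 4           # first coordinate as in (47)
        t3 = lam * (s3 - s1) + t1            # point on the same chord
        out.append((s3, t3))
    return out

pt1 = (Fr(1), Fr(2))
pt2 = (Fr(-1, 2), Fr(3))
g2, g3 = curve_through(pt1, pt2)
third_points = coset_product(pt1, pt2)
all_on_curve = all(on_curve(s, t, g2, g3) for s, t in third_points)
print(all_on_curve)
```

The sum of the three $s$-coordinates on a chord of slope $\lambda$ equals $\lambda^2/4$, which is exactly where the squared fraction in (47) comes from; replacing $t_2$ by $-t_2$ gives the second value of the two-valued product.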
The Kowalevski change of variables (see Eqs. (31)) is the infinitesimal version of the correspondence which maps a pair of points $(M_1, M_2)$ from the curve $\Gamma_1$ to a pair of points $(S_1, S_2)$ of the curve $\Gamma_2$. One view of this correspondence has been given in [23] following Weil [33]. In our approach, there is a geometric view of this mapping as the correspondence which maps two tangents to the conic $C$ to the pair of conics from the pencil which contain the intersection point of the two lines. If we apply fractional-linear transformations to transform the curve $\Gamma_1$ into the curve $\Gamma_2$, then the above correspondence is nothing else than the two-valued group operation $*_c$ on $(\Gamma_2, \mathbb{Z}_2)$.
Theorem 5. The Kowalevski change of variables is equivalent to the infinitesimal of the action of the two-valued coset group $(\Gamma_2, \mathbb{Z}_2)$ on $\Gamma_1$. Up to the fractional-linear transformation, it is equivalent to the operation of the two-valued group $(\Gamma_2, \mathbb{Z}_2)$.
Now, the Kotter trick from Sect. 4 (see Eqs. (28, 29)) can be presented as a commutative diagram.
Proposition 11. The Kotter transformation defined by Eqs. (28, 29) makes the following diagram commutative:
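[Commutative diagram. Objects: $\mathbb{C}^4$, $\Gamma_1 \times \Gamma_1 \times \mathbb{C}$, $\Gamma_2 \times \Gamma_2 \times \mathbb{C}$, $\Gamma_1 \times \Gamma_1 \times \mathbb{C} \times \mathbb{C}$, $\mathbb{CP}^1 \times \mathbb{CP}^1 \times \mathbb{C}$ (twice), $\mathbb{C} \times \mathbb{C}$, $\mathbb{CP}^2$, $\mathbb{CP}^2 \times \mathbb{C}/\!\sim$. Arrow labels: $i_1 \times i_1 \times m$, $\psi^{-1} \times \psi^{-1} \times \mathrm{id}$, $i_a \times i_a \times m$, $i_1 \times i_1 \times \mathrm{id} \times \mathrm{id}$, $p_1 \times p_1 \times \mathrm{id}$, $\hat\psi^{-1} \times \hat\psi^{-1} \times \mathrm{id}$, $\varphi_1 \times \varphi_2$, $m_c \times \tau_c$, $m_2$, $f$.]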
The mappings are defined as follows:
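$$\begin{aligned} i_1 &: x \mapsto \bigl(x, \sqrt{P(x)}\bigr), \\ m &: (x, y) \mapsto x \cdot y, \\ i_a &: x \mapsto (x, 1), \\ p_1 &: (x, y) \mapsto x, \\ m_c &: (x, y) \mapsto x *_c y, \\ \tau_c &: x \mapsto (\sqrt{x}, -\sqrt{x}), \\ \varphi_1 &: (x_1, x_2, e_1, e_2) \mapsto \sqrt{e_1}\,\frac{\sqrt{P(x_2)}}{x_1 - x_2}, \\ \varphi_2 &: (x_1, x_2, e_1, e_2) \mapsto \sqrt{e_2}\,\frac{\sqrt{P(x_1)}}{x_1 - x_2}, \\ f &: ((s_1, s_2, 1), (k, -k)) \mapsto \bigl[(\gamma^{-1}(s_1) + k)(\gamma^{-1}(s_2) - k),\ (\gamma^{-1}(s_2) + k)(\gamma^{-1}(s_1) - k)\bigr]. \end{aligned}$$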
From Proposition 11 we see that the two-valued group plays an important role in the Kowalevski system and its generalizations. Putting together the geometric meaning of the pencil equation and algebraic structure of the two valued group we come to the connection with the Great Poncelet Theorem ([30], see also [3,15] and [16]). For the reader’s sake we are going to formulate the Great Poncelet Theorem for triangles in the form we are going to use below. Theorem 6 (Great Poncelet Theorem for triangles [30]). Given four conics C1 , C2 , C3 , C from a pencil and three lines a1 , a2 , a3 , tangents to the conic C such that a1 , a2 intersect on C1 , a2 , a3 intersect on C2 and a2 , a3 intersect on C3 . Moreover, we suppose that the tangents to the conics C1 , C2 , C3 at the intersection points are not concurrent. Given b1 , b2 tangents to the conic C which intersect at C1 . Then there exists b3 , tangent to the conic C such that the triplet (b1 , b2 , b3 ) satisfies all conditions as (a1 , a2 , a3 ). Now, we are going back to the associativity condition for the action of the double-valued group (2 , Z2 ). Theorem 7. Associativity conditions for the group structure of the two-valued coset group (2 , Z2 ) and for its action on 1 are equivalent to the great Poncelet theorem for a triangle. Proof. Denote by P and Q two arbitrary elements of the two-valued group (2 , Z2 ) and M an arbitrary point on the curve 1 . Let Q ∗ P = [P1 , P2 ] and P ◦ M = [N1 , N2 ]. Associativity means the equality of the two quadruples: [Q ◦ N1 , Q ◦ N2 ] = [P1 ◦ M, P2 ◦ M]. Let us consider the previous situation from the geometric point of view. Recall the geometric meaning of the equation of a pencil of conics F(s, x1 , x2 ) = 0. Variables x1 and x2 denote the Darboux coordinates of two tangents to the conic C2 which intersect at the conic Cs with the pencil parameter equal to s. 
Denote by C_P and C_Q the conics from the pencil which correspond to the elements P, Q, and by l_M, l_{N_1}, l_{N_2} the tangents to the conic C_2 which correspond to the points M, N_1, N_2 of the curve 1 . Then l_{N_1} and l_{N_2} are the two lines tangent to C_2 which intersect l_M at the conic C_P. Moreover, if we denote Q ◦ N_1 = [N_3, N_4], Q ◦ N_2 = [N_5, N_6], then the corresponding lines l_{N_3}, l_{N_4}, l_{N_5}, l_{N_6}, tangent to the conic C_2, satisfy the conditions: the pairs of lines (l_{N_1}, l_{N_3}), (l_{N_1}, l_{N_4}), (l_{N_2}, l_{N_5}), (l_{N_2}, l_{N_6}) all intersect at the conic C_Q. Now, associativity of the action is equivalent to the existence of a pair of conics (C_{P_1}, C_{P_2}) such that (l_M, l_{N_3}) and (l_M, l_{N_6}) intersect at the conic C_{P_1}, while (l_M, l_{N_5}) and (l_M, l_{N_4}) intersect at the conic C_{P_2}, see Fig. 1.

Fig. 1. Associativity condition and Poncelet theorem

Consider the intersection of the lines (l_M, l_{N_3}). Choose the conic from the pencil which contains the intersection point, such that the tangent to this conic at the intersection point is not concurrent with the tangents to the conics C_P and C_Q at the intersection points (l_M, l_{N_1}) and (l_{N_1}, l_{N_3}) respectively. Denote this conic by C_{P_1}. Then by applying the Great Poncelet Theorem for a triangle (see the theorem above, [30], see also [3,15,16]), one of the lines l_{N_5} and l_{N_6}, say the last one, intersects l_M at the conic C_{P_1}. The tangent to this conic at the intersection point is not concurrent with the tangents to the conics C_P and C_Q at the intersection points (l_M, l_{N_2}) and (l_{N_2}, l_{N_6}) respectively. In the same way, by considering the intersection of the lines (l_M, l_{N_4}), we come to the conic C_{P_2} from the pencil which, by the Great Poncelet Theorem, contains the intersections of (l_M, l_{N_4}) and (l_M, l_{N_5}).

Since the result of the operation in the double-valued group between elements P, Q does not depend on the choice of the point M to which the action is applied, the conics C_{P_2} and C_{P_1} in the previous construction should not depend on the choice of the line l_M. This independence is equivalent to the poristic nature of the Poncelet Theorem. This demonstrates the equivalence between the associativity condition and the Great Poncelet Theorem for a triangle.

From the last two theorems we finally get

Conclusion. The geometric setting for the Kowalevski change of variables is the Great Poncelet Theorem for a triangle.

Acknowledgement. The author is grateful to Borislav Gajić and Katarina Kukić for helpful remarks. The research was partially supported by the Serbian Ministry of Science and Technology, Project Geometry and Topology of Manifolds and Integrable Dynamical Systems. A part of the paper has been written during a visit to the IHES. The author uses the opportunity to thank the IHES for hospitality and outstanding working conditions.
References
1. Appel’rot, G.G.: Some supplements to the memoir of N. B. Delone. Tr. otd. fiz. nauk, 6 (1893)
2. Audin, M.: Spinning Tops. An introduction to integrable systems. Cambridge studies in advanced mathematics 51, Cambridge: Cambridge Univ. Press, 1999
3. Berger, M.: Geometry. Berlin: Springer-Verlag, 1987
4. Bobenko, A.I., Reyman, A.G., Semenov-Tian-Shansky, M.A.: The Kowalevski top 99 years later: a Lax pair, generalizations and explicit solutions. Commun. Math. Phys. 122, 321–354 (1989)
5. Buchstaber, V.M., Novikov, S.P.: Formal groups, power systems and Adams operators. Mat. Sb. (N. S) 84 (126), 81–118 (1971) (in Russian)
6. Buchstaber, V.M., Rees, E.G.: Multivalued groups, their representations and Hopf algebras. Transform. Groups 2, 325–349 (1997)
7. Buchstaber, V.M., Veselov, A.P.: Integrable correspondences and algebraic representations of multivalued groups. Internat. Math. Res. Notices 1996, 381–400 (1996)
8. Buchstaber, V.: n-valued groups: theory and applications. Moscow Math. J. 6, 57–84 (2006)
9. Darboux, G.: Principes de géométrie analytique. Paris: Gauthier-Villars, 1917, 519 p
10. Darboux, G.: Leçons sur la théorie générale des surfaces et les applications géométriques du calcul infinitesimal. Volumes 2 and 3, Paris: Gauthier-Villars, 1887, 1889
11. Delone, N.B.: Algebraic integrals of motion of a heavy rigid body around a fixed point. Petersburg, 1892
12. Dragović, V.: Multi-valued hyperelliptic continuous fractions of generalized Halphen type. Internat. Math. Res. Notices 2009, 1891–1932 (2009)
13. Dragović, V.: Marden theorem and Poncelet-Darboux curves. http://arxiv.org/abs/0812.4829v1 [math.CA], 2008
14. Dragović, V., Gajić, B.: Systems of Hess-Appel’rot type. Commun. Math. Phys. 265, 397–435 (2006)
15. Dragović, V., Radnović, M.: Geometry of integrable billiards and pencils of quadrics. J. Math. Pures Appl. 85, 758–790 (2006)
16. Dragović, V., Radnović, M.: Hyperelliptic Jacobians as Billiard Algebra of Pencils of Quadrics: Beyond Poncelet Porisms. Adv. Math. 219, 1577–1607 (2008)
17. Dubrovin, B.: Theta-functions and nonlinear equations. Usp. Math. Nauk 36, 11–80 (1981)
18. Dullin, H.R., Richter, P.H., Veselov, A.P.: Action variables of the Kowalevski top. Reg. Chaotic Dynam. 3, 18–26 (1998)
19. Euler, L.: Evolutio generalior formularum comparationi curvarum inservientium. Opera Omnia Ser 1 20, 318–356 (1765)
20. Golubev, V.V.: Lectures on the integration of motion of a heavy rigid body around a fixed point. Moscow: Gostechizdat, 1953 [in Russian], English translation: Israel Program for Scientific Translations; Washington, DC: US Dept. of Commerce, Off. of Tech. Serv., 1960
21. Hirota, R.: The direct method in soliton theory. Cambridge Tracts in Mathematics 155, Cambridge: Cambridge Univ. Press, 2004
22. Horozov, E., van Moerbeke, P.: The full geometry of Kowalevski’s top and (1, 2)-abelian surfaces. Comm. Pure Appl. Math. 42, 357–407 (1989)
23. Jurdjevic, V.: Integrable Hamiltonian systems on Lie Groups: Kowalevski type. Ann. Math. 150, 605–644 (1999)
24. Kotter, F.: Sur le cas traite par M-me Kowalevski de rotation d’un corps solide autour d’un point fixe. Acta Math. 17, 209–263 (1893)
25. Kowalevski, S.: Sur la probleme de la rotation d’un corps solide autour d’un point fixe. Acta Math. 12, 177–232 (1889)
26. Kowalevski, S.: Sur une propriete du systeme d’equations differentielles qui definit la rotation d’un corps solide autour d’un point fixe. Acta Math. 14, 81–93 (1889)
27. Kuznetsov, V.B.: Kowalevski top revisited. CRM Proc. Lecture Notes 32, Providence, RI: Amer. Math. Soc., 2002, pp. 181–196
28. Markushevich, D.: Kowalevski top and genus-2 curves. J. Phys. A 34(11), 2125–2135 (2001)
29. Mlodzeevskii, B.K.: About a case of motion of a heavy rigid body around a fixed point. Mat. Sb. 18 (1895)
30. Poncelet, J.V.: Traité des propriétés projectives des figures. Paris: Mett, 1822
31. Vein, R., Dale, P.: Determinants and their applications in Mathematical Physics. Appl. Math. Sciences 134, Berlin-Heidelberg-New York: Springer, 1999
32. Veselov, A.P., Novikov, S.P.: Poisson brackets and complex tori. Trudy Mat. Inst. Steklov 165, 49–61 (1984)
33. Weil, A.: Euler and the Jacobians of elliptic curves. In: Arithmetics and Geometry, Vol. 1, Progr. Math. 35, Boston, MA: Birkhauser, 1983, pp. 353–359

Communicated by M. Aizenman
Commun. Math. Phys. 298, 65–99 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1059-y
Communications in
Mathematical Physics
Dimension Theory for Invariant Measures of Endomorphisms
Lin Shu∗
LMAM, School of Mathematical Sciences, Peking University, Beijing 100871, P. R. China. E-mail: [email protected]
Received: 29 June 2009 / Accepted: 3 November 2009
Published online: 15 May 2010 – © Springer-Verlag 2010
Abstract: We establish the exact dimensional property of an ergodic hyperbolic measure for a C^2 non-invertible but non-degenerate endomorphism on a compact Riemannian manifold without boundary. Based on this, we give a new formula of Lyapunov dimension of ergodic measures and show it coincides with the dimension of hyperbolic ergodic measures in a setting of random endomorphisms. Our results extend several well known theorems of Barreira et al. (Ann Math 149:755–783, 1999) and Ledrappier and Young [Commun Math Phys 117(4):529–548, 1988] for diffeomorphisms to the case of endomorphisms.

Contents
1. Introduction
2. Notions and Statement of the Main Results
3. Dimension of Hyperbolic Measures for Endomorphisms
   3.1 Preparatory lemmas
   3.2 Proof of Theorem 2.1
4. Volume Lemma and Lyapunov Dimension of Measures
5. Dimension Formula for Random Endomorphisms
   5.1 The proofs of Theorem 2.4 and Theorem 2.6
   5.2 An application of the results to stochastic flows
References
∗ This work is supported by NSFC (No. 10901007) and National Basic Research Program of China (973 Program) (2007 CB 814800).

1. Introduction

The present paper is intended to study the dimension theory for invariant probability measures of a C^2 non-invertible but non-degenerate endomorphism. To motivate the questions, we first give some background of the corresponding theories for diffeomorphisms. Throughout this paper, we let M be a C^∞ compact connected Riemannian manifold without boundary. Let f : M → M be a C^2 (or C^{1+α}) diffeomorphism preserving a Borel probability measure μ. For x ∈ M, the local dimension of μ at x is defined by

$$ d(\mu, x) = \lim_{\rho \to 0} \frac{\log \mu(B(x, \rho))}{\log \rho}, \tag{1.1} $$
provided the limit exists, where B(x, ρ) stands for the ball of radius ρ centered at x. Call μ exact dimensional if d(μ, x) is constant a.e. (in that case, the constant is denoted by dim μ). The exact dimensional property of μ implies that almost all the known characteristics of dimension type of the measure coincide [32]. This partially explains why the study of exact dimensional measures is of great importance in the dimension theory of dynamical systems [3,10,12,23,32]. For further references on this topic, see e.g. Farmer, Ott and Yorke [5], Eckmann and Ruelle [4], and Young [33].

In 1982 [32], Young studied the local dimension of an ergodic measure μ of a C^{1+α} surface diffeomorphism f. She showed that for μ-almost every x,

$$ d(\mu, x) = \frac{1}{\lambda_1} h_\mu(f) + \frac{1}{-\lambda_2} h_\mu(f) =: \delta^u + \delta^s, \tag{1.2} $$

where h_μ(f) is the metric entropy of f (with respect to μ) and λ_1 > 0 > λ_2 are the Lyapunov exponents of μ. The δ^u, δ^s defined as above can roughly be interpreted as the dimension of μ in the direction of the subspaces corresponding to λ_1 and λ_2, respectively. As a simple consequence of (1.2) one has that if μ is an SRB measure, then dim μ = 1 − λ_1/λ_2, which coincides with the Lyapunov dimension of μ (cf. [5,6]).

Arising from this simple, yet delicate model are three natural questions for an invariant probability measure μ of a C^2 (or C^{1+α}) diffeomorphism f of a higher dimensional M:
i) What is the relation between entropy, Lyapunov exponents and dimensions [12]?
ii) In case μ is ergodic, will it be exact dimensional [4]?
iii) When will dim μ coincide with its Lyapunov dimension [5,6]?
The answers to the last two questions rely on that of the first. For μ-a.e. x, let λ_1(x), . . . , λ_{r(x)}(x) be the distinct Lyapunov exponents of f at x and let ⊕_{i≤r(x)} E_i(x) be the corresponding decomposition of T_x M. In 1985, Ledrappier and Young [12] proved the entropy formula

$$ h_\mu(f) = \int_M \sum_{\lambda_i(x) > 0} \lambda_i(x)\, \gamma_i(x)\, d\mu, \tag{1.3} $$
(where γ_i(x) denotes, roughly speaking, the dimension of μ in the direction of the subspace E_i(x)), which gives the existence of stable and unstable pointwise dimensions δ^s(x) = Σ_{λ_i(x)<0} γ_i(x), δ^u(x) = Σ_{λ_i(x)>0} γ_i(x) of μ, resembling that in (1.2). Furthermore, it can be derived from a general inequality in [12] that

$$ \overline{d}(\mu, x) \le \delta^s(x) + \delta^c(x) + \delta^u(x), \quad \mu\text{-a.e.}, \tag{1.4} $$

for any invariant probability measure μ, where d̄(μ, x) is the upper pointwise dimension of μ at x, defined by replacing lim in (1.1) by lim sup, and δ^c(x) is the multiplicity of the zero exponent.
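As a small numerical illustration of the local dimension in (1.1), the following sketch evaluates the quotient log μ(B(x, ρ)) / log ρ for a discrete approximation of the standard middle-third Cantor measure. The Cantor measure, the function names, and the choice of radius are illustrative assumptions and do not come from the paper; the example only shows what the quotient in (1.1) measures.

```python
import math
from itertools import product


def cantor_points(depth):
    """All sums a_1/3 + ... + a_depth/3**depth with digits a_k in {0, 2}.

    Giving each point equal weight approximates the middle-third Cantor
    measure at resolution 3**-depth (an illustrative choice of measure).
    """
    return [sum(a * 3.0 ** -(k + 1) for k, a in enumerate(digits))
            for digits in product((0, 2), repeat=depth)]


def local_dimension_estimate(points, x, rho):
    """Quotient log mu(B(x, rho)) / log rho from (1.1) at one fixed radius.

    mu is the uniform probability measure on `points`; x should be one of
    the points and 0 < rho < 1 so that both logarithms are defined.
    """
    inside = sum(1 for y in points if abs(y - x) < rho)
    mass = inside / len(points)
    return math.log(mass) / math.log(rho)


points = cantor_points(12)
# At x = 0 and rho = 3**-6 the ball carries mass 2**-6, so the quotient
# equals log 2 / log 3 (about 0.6309), the dimension of the Cantor measure.
estimate = local_dimension_estimate(points, 0.0, 3.0 ** -6)
```

For a genuinely exact dimensional measure the quotient stabilizes as ρ → 0 at almost every point; here the self-similarity makes it exact already at a single radius.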
In [10], Ledrappier established the existence of the pointwise dimension of arbitrary SRB measures. In [23], Pesin and Yue extended his approach and proved the existence (of the pointwise dimension) for hyperbolic measures satisfying the so-called semi-local product structure. (Here by hyperbolic one means that the measure has no zero Lyapunov exponent.) Finally in 1999, Barreira, Pesin and Schmeling [3] exploited all the above works, especially [12], in an essential way. They showed that a hyperbolic measure has a kind of asymptotically “almost” local product structure, from which they deduced that

$$ \underline{d}(\mu, x) \ge \delta^s(x) + \delta^u(x), \quad \mu\text{-a.e.}, $$

where d̲(μ, x) is the lower pointwise dimension of μ at x, defined by replacing lim in (1.1) by lim inf. This together with (1.4) established the exact dimensional property of ergodic hyperbolic measures of C^{1+α} diffeomorphisms. (Note that there are examples of non-hyperbolic ergodic measures which are not exact dimensional [11].)

Despite the accuracy of the above formulas for d(μ, x) using the γ_i’s, the formula for dim μ most widely used in practice is the measure’s Lyapunov dimension, which can be calculated using only its Lyapunov exponents. To be precise, let μ be an f-ergodic measure with Lyapunov exponents λ_1 > · · · > λ_r. Let K be the largest integer so that Σ_{i=1}^{K} λ_i m_i > 0, where m_i is the multiplicity of λ_i. Then the Lyapunov dimension of μ is defined to be

$$ \dim_{Ly}(\mu) = \begin{cases} \dim M, & \text{if } \sum_{i=1}^{K} m_i = \dim M; \\ \sum_{i=1}^{K} m_i - \dfrac{1}{\lambda_{K+1}} \sum_{i=1}^{K} \lambda_i m_i, & \text{otherwise.} \end{cases} \tag{1.5} $$

It can be easily verified using dim μ ≤ Σ_{i=1}^{r} γ_i (where γ_i equals the multiplicity of λ_i if λ_i = 0) and (1.3) that dim μ ≤ dim_{Ly}(μ) is always true and that a necessary condition for equality is that μ is an SRB measure. Despite existing counterexamples to the converse direction, dim_{Ly}(μ) and dim μ are close in real calculation. It was conjectured by Frederickson, Kaplan, Yorke and Yorke [6] (see also [5]) that if μ is an SRB measure, then generically,

$$ \dim \mu = \dim_{Ly}(\mu). \tag{1.6} $$
When M has dimension 2, the conjecture is true by Young’s formula (1.2). For the higher dimensional case, it is still unknown what generic condition should be imposed [13]. In a setting of random diffeomorphisms, however, Ledrappier and Young were able to show that the formula (1.6) is mathematically correct. Let ν be a probability measure on Diff(M), the space of diffeomorphisms of M. They considered the composition of maps chosen independently with distribution ν. Let μ be an ergodic stationary measure corresponding to this process. Denote by {μ_w : w ∈ Diff(M)^ℤ} a class of sample measures associated with μ. Consider the backward derivative process naturally induced by the process on the Grassmannian manifold Gr(M), whose transition probabilities are given by

$$ Q(v, \Gamma) = \nu\{ f_w \in \mathrm{Diff}(M) : D(f_w)^{-1} v \in \Gamma \}, \quad v \in \mathrm{Gr}(M),\ \text{Borel } \Gamma \subset \mathrm{Gr}(M). $$

They showed that if μ ≪ Leb and λ_j ≠ 0 for all j, and if for all v ∈ Gr(M) the transition probability Q(v, ·) is absolutely continuous with respect to the Lebesgue measure on Gr(M), then for ν^ℤ-a.e. w,

$$ \dim(\mu_w) = \dim_{Ly}(\mu); \tag{1.7} $$
and Eq. (1.7) continues to hold if the hypothesis is replaced by a weaker assumption about the randomness of the distribution of tangent spaces to the (K+1)th and (K+2)th stable manifolds, or by a nonlinear version formulated in terms of two-point processes on M. We note that the above hypothesis of randomness appears quite naturally in the setting of stochastic flows arising from Stochastic Differential Equations (SDE) (see Sect. 5.2).

The motivation of the present paper is to answer the questions posed at the beginning of the paper in the case where f is a non-invertible endomorphism, in light of all the existing results mentioned above for diffeomorphisms. The main difficulty in setting up the corresponding theories is caused by the non-invertibility of the map. One way around this is to lift the system (M, f) to some higher-dimensional system so that the argument for diffeomorphisms works, see e.g. [29,30]. However, this might encounter the problem that the existence of d(μ, x) cannot always be obtained from that of the lift of μ. Another method is to lift the system to its inverse limit space (denoted by (M^f, θ), which is a lift of (M, f) to form an invertible system) [25]. However, this cannot solve all the problems either, especially when the system has negative exponents, since the lift does not help to split stable manifolds into distinct manifolds corresponding to different past paths. Besides, the dimension of the lift of μ and dim μ are, in general, not equal either (cf. [18]). So an alternative approach to the three questions for endomorphisms might be to follow the above lines for diffeomorphisms. A preliminary nontrivial step is to set up the corresponding entropy formulas. For positive exponents, much has been done with the help of the inverse limit space of (M, f) (see [25–28] and also [16]).
In [25], Qian and Xie established (1.3) for endomorphisms (with γ_i to be interpreted differently from that of (1.3) for diffeomorphisms). For each “typical” x̃ = (x_n)_{n=−∞}^{+∞} with f(x_n) = x_{n+1} for n ∈ ℤ, there exists an unstable manifold W^u(x̃). They were able to show that the dimension of μ along W^u(x̃), denoted by δ^u(x_0), only depends on x_0. Based on this, they proved the existence of d(μ, x) (in the a.e. sense) for C^2 expanding endomorphisms.

Generalizing (1.3) for negative exponents is another story. Since the existence of the stable manifolds W^s only relies on forward iterations of orbits, lifting the system to its inverse limit space cannot clear up the influence of the overlap caused by non-invertibility. The key to this problem is the observation that overlap (or folding) actually diminishes the dimensions of (conditional) measures in stable manifolds. Let f be a C^2 non-degenerate (i.e., det T_x f ≠ 0 for all x ∈ M) non-invertible endomorphism on M and let μ be an invariant probability measure on M. In [24], Ruelle conjectured the inequality

$$ h_\mu(f) \le F_\mu(f) - \int_M \sum_{\lambda_i(x) < 0} \lambda_i(x)\, m_i(x)\, d\mu \tag{1.8} $$

for the purpose of studying positivity of entropy production in nonequilibrium statistical mechanics, which was proved to be true by Liu [17], where F_μ(f) := H_μ(ε | f^{−1}ε), ε being the partition of M into single points, is called the folding entropy of (f, μ) and m_i(x) is the multiplicity of λ_i(x). (Later in [19], Liu also characterized the measures attaining equality in (1.8).) It was asked in [17] by Liu whether the generalized formula

$$ h_\mu(f) = F_\mu(f) - \int_M \sum_{\lambda_i(x) < 0} \lambda_i(x)\, \gamma_i(x)\, d\mu \tag{1.9} $$

holds, where γ_i(x) := δ_i(x) − δ_{i−1}(x) with δ_i(x) (δ_0(x) = 0) being the supposed dimension of μ on the stable manifolds corresponding to λ_i(x).
The formula (1.9) was proved in [31], which involves a development of the techniques in establishing entropy formulas for diffeomorphisms [12] and random transformations [14,16] and a delicate study of the function of folding entropy [16,17] with the help of the ergodic theories in inverse limit spaces [25]. This equality, in particular, implies the existence of the dimension of μ along the stable manifold of typical x, denoted by δ^s(x), which is crucial in establishing the existence of d(μ, x) in the case of endomorphisms.

With all the works mentioned above concerning the existence of δ^u(x) and δ^s(x) for endomorphisms, we will first establish the exact dimensional property of hyperbolic ergodic measures for C^2 non-invertible but non-degenerate endomorphisms. The proof looks like a repetition of the techniques used for diffeomorphisms [3] (see also [15]), but the underlying idea actually guides us to set up the dimension theories for endomorphisms: we regard M^f as a probabilistic disintegration of the original system. Once we obtain the existence of δ^s (using the forward Lyapunov metric), we can switch back to the classical Lyapunov metric in M^f. Then for a typical x and a ball B(x, e^{−n}) for n large, we further slice its preimage in M^f into finer pieces so that on each piece, for “typical” x̃ = (x_n)_{n=−∞}^{+∞} with f(x_n) = x_{n+1} for n ∈ ℤ, locally, the stable manifold at x_0 and the unstable manifold of x_0 at x̃ also provide a good coordinate chart, through which we can obtain a kind of local product structure of the measure. Note that δ^u is independent of the past paths chosen and δ^s exists on the projection of these pieces in M. The measures of these pieces then add up to e^{−n(δ^s+δ^u)}. We emphasize that M^f is not of finite dimension and hence the Besicovitch covering lemma does not apply. The existence of δ^s on M is very important in overcoming this deficiency in the argument (see Lemma 3.11). Besides, one also needs some special treatment to set up a variant of the Borel density lemma in M^f (see Lemma 3.5).

As a simple application of the above results for endomorphisms, we have for any ergodic measure μ of a C^2 non-invertible but non-degenerate endomorphism of a surface,

$$ d(\mu, x) = \frac{1}{\lambda_1} h_\mu(f) + \frac{1}{-\lambda_2} \bigl( h_\mu(f) - F_\mu(f) \bigr), \quad \mu\text{-a.e.}, $$

where λ_1 > 0 > λ_2 are the Lyapunov exponents of μ. In particular, for an SRB measure μ in this setting (see [28] or Sect. 2 for its definition), we have dim μ = 1 − (λ_1 − F_μ(f))/λ_2. This motivates the following new notion of Lyapunov dimension of ergodic measures for endomorphisms. Let λ_1 > · · · > λ_r be the distinct Lyapunov exponents of μ. We define the Lyapunov dimension of μ, denoted by dim_L(μ), as follows:
i) If Σ_{λ_i>0} λ_i m_i ≤ F_μ(f), let dim_L(μ) = Σ_{λ_i≥0} m_i;
ii) Otherwise, let K be the largest integer so that Σ_{i=1}^{K} λ_i m_i > F_μ(f) and define

$$ \dim_L(\mu) = \begin{cases} \dim M, & \text{if } \sum_{i=1}^{K} m_i = \dim M; \\ \sum_{i=1}^{K} m_i - \dfrac{1}{\lambda_{K+1}} \Bigl( \sum_{i=1}^{K} \lambda_i m_i - F_\mu(f) \Bigr), & \text{otherwise.} \end{cases} \tag{1.10} $$

The formula differs from that for diffeomorphisms (see (1.5)) by plugging in the quantity of folding entropy. When f is a diffeomorphism, the folding entropy is zero and (1.10) reduces to the classical one. As we will show later (Proposition 4.2), it is always true for ergodic μ that d(μ, x) ≤ dim_L(μ), μ-a.e., and for the equality to hold, a necessary condition is that μ is SRB.
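The case distinction in (1.10) can be sketched in a few lines of Python. This is only an illustrative reading of the definition: the function names are hypothetical, the exponents are assumed to be distinct and listed in decreasing order, and dim M is taken to be the sum of the multiplicities. The second function evaluates the surface formula for d(μ, x) stated above; the sample numbers, including the choice h = λ_1 for the SRB-type example, are assumptions made for the check.

```python
def lyapunov_dimension(exponents, multiplicities, folding_entropy=0.0):
    """dim_L(mu) read off from (1.10).

    exponents:       distinct Lyapunov exponents, strictly decreasing
    multiplicities:  their multiplicities m_i (assumed to sum to dim M)
    folding_entropy: F_mu(f) >= 0; zero recovers the classical case ii)
    """
    lam, m, F = list(exponents), list(multiplicities), folding_entropy
    dim_M = sum(m)
    # Case i): the positive exponents do not outweigh the folding entropy.
    if sum(l * k for l, k in zip(lam, m) if l > 0) <= F:
        return float(sum(k for l, k in zip(lam, m) if l >= 0))
    # Case ii): K is the largest index whose partial sum still exceeds F.
    partial, K, best = 0.0, 0, 0.0
    for i, (l, k) in enumerate(zip(lam, m), start=1):
        partial += l * k
        if partial > F:
            K, best = i, partial
    count = sum(m[:K])
    if count == dim_M:
        return float(dim_M)
    # lam[K] is lambda_{K+1} because the list is 0-based.
    return count - (best - F) / lam[K]


def surface_dimension(h, F, lam1, lam2):
    """Surface formula d = h/lam1 + (h - F)/(-lam2) for lam1 > 0 > lam2."""
    return h / lam1 + (h - F) / (-lam2)


# Exponents 1 > -2 with folding entropy 0.25: 1 - (1 - 0.25)/(-2) = 1.375.
example = lyapunov_dimension([1.0, -2.0], [1, 1], folding_entropy=0.25)
```

With `folding_entropy=0.0` the same call gives 1 − λ_1/λ_2 = 1.5, the value quoted for SRB measures of surface diffeomorphisms.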
To see that the new Lyapunov dimension is in some sense mathematically correct, we study its relation with the dimension of measures in the setting of random endomorphisms. Let ν be a probability measure (satisfying some regularity conditions to be specified in Sect. 5) on C^2(M, M), the space of C^2 endomorphisms, and consider the composition of maps chosen independently with distribution ν. This process together with an ergodic stationary measure μ is referred to as χ. Let F_μ(χ) denote the folding entropy of χ. We define the Lyapunov dimension of μ as in the deterministic case, replacing F_μ(f) by F_μ(χ). Denote by {μ_w : w ∈ C^2(M, M)^ℤ} the associated class of sample measures. Our third main result, in one sentence, is that in the above setting of random endomorphisms, under the same hypothesis as Ledrappier and Young [13], for ν^ℤ-a.e. w, we have

$$ \dim(\mu_w) = \dim_L(\mu), \quad \text{with } \dim_L(\mu) \text{ as above.} \tag{1.11} $$
The proof of (1.11) depends on the existence of the dimension of ergodic measures of random endomorphisms and the establishment of the corresponding entropy formulas in such a setting. With these, we can show, as in the case of random diffeomorphisms [13], that for the equality (1.11) to hold, a necessary and sufficient configuration is that μ_w tends to fill in the directions λ_1, . . . , λ_j before spilling over into the λ_{j+1} direction. This configuration can be verified by showing that the transversal dimension to the (j+1)th stable manifolds is as large as possible, exactly as in [13].

We will state the main results in the next section, but the proofs, i.e., the answers to the last two questions for endomorphisms, will be presented separately in Sects. 3 and 5. Section 4 is devoted to the relation between d(μ, x) and dim_L(μ) for arbitrary ergodic measures for non-invertible but non-degenerate C^2 endomorphisms.

2. Notions and Statement of the Main Results

We first consider the dimension theory of invariant probability measures for deterministic endomorphisms. Let f be a C^2 non-invertible but non-degenerate endomorphism on M preserving a Borel probability measure μ. Consider M^ℤ endowed with the product topology. Define

$$ M^f := \{ \tilde{x} = (x_n)_{n=-\infty}^{+\infty} : x_n \in M,\ f(x_n) = x_{n+1},\ n \in \mathbb{Z} \}. $$

Denote by θ the left shift transformation on M^f. The pair (M^f, θ) is called the inverse limit space of (M, f). Let p be the natural projection map from M^f to M, i.e., p(x̃) = x_0, ∀ x̃ ∈ M^f. Then p ◦ θ = f ◦ p on M^f. Denote by μ̃ the unique θ-invariant probability measure on M^f that satisfies μ̃ ◦ p^{−1} = μ. Then μ̃ is ergodic whenever μ is. For μ-a.e. x, let λ_1(x) > λ_2(x) > · · · > λ_{r(x)}(x) be the distinct Lyapunov exponents of μ at x with multiplicities m_1(x), . . . , m_{r(x)}(x), respectively.
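To make the inverse limit construction concrete, here is a minimal sketch for the doubling map f(x) = 2x mod 1 on the circle, which is an illustrative choice and not an example taken from the paper. A point of the inverse limit space is a full orbit history; the code stores only a finite window of it. All function names are hypothetical. The sketch checks the relation p ◦ θ = f ◦ p and shows that one base point x_0 carries several distinct pasts.

```python
from fractions import Fraction


def f(x):
    """Doubling map on [0, 1), a non-invertible 2-to-1 endomorphism."""
    return (2 * x) % 1


def orbit_segment(x0, past_bits, n_future):
    """Finite window {n: x_n} of a full orbit with f(x_n) = x_{n+1}.

    past_bits[k] in {0, 1} selects which of the two preimages of x_{-k}
    is used as x_{-(k+1)}, i.e. it encodes the chosen past path.
    """
    seg = {0: x0}
    for n in range(n_future):
        seg[n + 1] = f(seg[n])
    for k, bit in enumerate(past_bits):
        seg[-(k + 1)] = (seg[-k] + bit) / 2
    return seg


def shift(seg):
    """Left shift theta: (theta x)_n = x_{n+1}."""
    return {n - 1: x for n, x in seg.items()}


def p(seg):
    """Natural projection onto the zeroth coordinate."""
    return seg[0]


x0 = Fraction(1, 3)
history_a = orbit_segment(x0, [0, 1, 1], 3)
history_b = orbit_segment(x0, [1, 0, 0], 3)
```

Here `history_a` and `history_b` project to the same point 1/3 but differ in every past coordinate, which is the situation the unstable manifolds W^u(x̃) have to keep track of.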
Applying the Oseledec multiplicative ergodic theorem [21] to (M^f, θ, μ̃), we can obtain a Borel set Γ̃_0 ⊂ M^f with μ̃(Γ̃_0) = 1 such that for each x̃ = (x_n)_{n=−∞}^{+∞} ∈ Γ̃_0, there is a measurable splitting

$$ T_{x_0} M = E_1(\tilde{x}) \oplus E_2(\tilde{x}) \oplus \cdots \oplus E_{r(x_0)}(\tilde{x}) $$

such that for each 1 ≤ i ≤ r(x_0),

$$ \lim_{n \to \pm\infty} \frac{1}{n} \log |D(\tilde{x}, n) v| = \lambda_i(x_0) \quad \text{for } 0 \ne v \in E_i(\tilde{x}), $$

where D(x̃, n) = T_{x_0} f^n for n ≥ 0 and D(x̃, n) = (T_{x_{−n}} f)^{−1} ◦ · · · ◦ (T_{x_{−1}} f)^{−1} for n < 0. Put E^s(x̃) = ⊕_{λ_i(x_0)<0} E_i(x̃), E^u(x̃) = ⊕_{λ_i(x_0)>0} E_i(x̃). Let s(x) = #{λ_i(x) : λ_i(x) < 0}. For x̃ = (x_n)_{n=−∞}^{+∞} ∈ Γ̃_0, define

$$ \widetilde{W}^u(\tilde{x}) := \Bigl\{ \tilde{y} \in M^f : \limsup_{n \to +\infty} \frac{1}{n} \log d(x_{-n}, y_{-n}) < 0 \Bigr\}. $$

It is called the unstable set of (M, f, μ) in M^f at x̃. Let W^u(x̃) := p(W̃^u(x̃)). It is called the unstable manifold of (M, f, μ) in M at x̃. It can be proved that the W^u(x̃)’s are all C^{1,1} immersed submanifolds of M tangent at x_0 to E^u(x̃) [14]. Each W^u(x̃) inherits a Riemannian structure from M. Denote by d^u_{x̃}(·, ·) the corresponding Riemannian metric on each leaf of W^u(x̃) and let B̃^u(x̃, ρ) = {ỹ ∈ W̃^u(x̃) : d^u_{x̃}(x_0, y_0) < ρ}.

A measurable partition ξ of M^f is said to be subordinate to W^u-manifolds of (M, f, μ) (cf. [28]) if for μ̃-a.e. x̃, ξ(x̃) (ξ(x̃) denotes the element of ξ that contains x̃) satisfies:
i) p|_{ξ(x̃)} : ξ(x̃) → p(ξ(x̃)) is bijective;
ii) There exists a Σ_{λ_j(x_0)>0} m_j(x_0)-dimensional C^1 embedded submanifold V^u(x̃) of M with V^u(x̃) ⊂ W^u(x̃) such that p(ξ(x̃)) ⊂ V^u(x̃) and p(ξ(x̃)) contains an open neighborhood of x_0 in V^u(x̃) (with respect to the submanifold topology of V^u(x̃)).

Let ξ^u be a measurable partition of M^f subordinate to W^u-manifolds of (M, f, μ). Denote by {μ̃_{x̃}^{ξ^u}} the canonical system of conditional measures of μ̃ associated with ξ^u. Then μ is said to be SRB (cf. [28]) if for each ξ^u, we have for μ̃-a.e. x̃ that the measure μ̃_{x̃}^{ξ^u} ◦ (p|_{ξ^u(x̃)})^{−1} is absolutely continuous with respect to the Lebesgue measure on V^u(x̃) induced by its inherited Riemannian structure as a submanifold of M.

Let ξ^u be as above. The lower and upper pointwise dimension of μ̃ along W^u-manifolds at x̃ ∈ Γ̃_0 with respect to the partition ξ^u are defined by

$$ \underline{\delta}^u(\tilde{x}, \xi^u) = \liminf_{\rho \to 0} \frac{\log \tilde{\mu}_{\tilde{x}}^{\xi^u}(\widetilde{B}^u(\tilde{x}, \rho))}{\log \rho}; \qquad \overline{\delta}^u(\tilde{x}, \xi^u) = \limsup_{\rho \to 0} \frac{\log \tilde{\mu}_{\tilde{x}}^{\xi^u}(\widetilde{B}^u(\tilde{x}, \rho))}{\log \rho}. $$
It was proved in [25] that there exists a (μ-mod 0) f-invariant measurable function δ^u : M → ℝ, the so-called unstable pointwise dimension of μ, which does not depend on the choice of ξ^u, such that δ̲^u(x̃, ξ^u) = δ̄^u(x̃, ξ^u) = δ^u(x_0), for μ̃-a.e. x̃.

Let Λ denote the set of regular points in the sense of Oseledec for (M, f, μ). We may assume s(x) ≥ 1 for each x ∈ Λ. For x ∈ Λ, define

$$ W^s(x) = \Bigl\{ y \in M : \limsup_{n \to \infty} \frac{1}{n} \log d(f^n(x), f^n(y)) < 0 \Bigr\}. $$
It is called the stable manifold of f at x. Let V^s(x) denote the arc connected component of W^s(x) which contains x. It is a C^{1,1} immersed submanifold of M with dimension Σ_{λ_j<0} m_j(x). Denote by d^s_x the metric on V^s(x) inherited from M. Let B^s(x, ρ) denote the d^s_x ball in V^s(x) centered at x of radius ρ. Let B_μ(M) denote the completion of the Borel σ-algebra of M with respect to μ so that (M, B_μ(M), μ) constitutes a Lebesgue space. A measurable partition ξ of (M, B_μ(M), μ) is said to be subordinate to the W^s-manifolds of (M, f, μ) if for μ-a.e. x ∈ Λ one has ξ(x) ⊂ W^s(x) and ξ(x) contains an open neighborhood of x in V^s(x) (with respect to the submanifold topology of V^s(x)).

Let ξ^s be a measurable partition subordinate to W^s. Denote by {μ_x^{ξ^s}} a system of canonical conditional measures of μ associated with ξ^s. For x ∈ Λ, define

$$ \underline{\delta}^s(x, \xi^s) = \liminf_{\rho \to 0} \frac{\log \mu_x^{\xi^s}(B^s(x, \rho))}{\log \rho}, \qquad \overline{\delta}^s(x, \xi^s) = \limsup_{\rho \to 0} \frac{\log \mu_x^{\xi^s}(B^s(x, \rho))}{\log \rho}. $$
It was shown in [31] that there is a measurable function δ^s on Λ (independent of ξ^s) such that for μ-a.e. x ∈ Λ, δ^s(x) = δ̲^s(x, ξ^s) = δ̄^s(x, ξ^s). Call δ^s(x) the stable dimension of μ at x. We note that when μ is ergodic, δ^s(x), δ^u(x) obtained as above are constants a.e. Denote by δ^s, δ^u the constants.

Theorem 2.1. Let f be a C^2 non-invertible but non-degenerate endomorphism on a smooth Riemannian manifold M without boundary, and μ an f-invariant compactly supported ergodic Borel probability measure. Let μ̃ be the corresponding measure in (M^f, θ) that projects to μ, i.e. μ̃ ◦ p^{−1} = μ. If μ is hyperbolic then the following properties hold:
i) for every ε > 0 there exist a set Λ̃ ⊂ M^f with μ̃(Λ̃) > 1 − ε and a constant κ ≥ 1 such that for every x̃ ∈ Λ̃ and every sufficiently small r (depending on x̃), we have

$$ r^{\varepsilon}\, \mu_{x_0}^{\xi^s}\Bigl(B^s\bigl(x_0, \tfrac{r}{\kappa}\bigr)\Bigr) \cdot \tilde{\mu}_{\tilde{x}}^{\xi^u}\Bigl(\widetilde{B}^u\bigl(\tilde{x}, \tfrac{r}{\kappa}\bigr)\Bigr) \le \mu(B(x_0, r)) \le r^{-\varepsilon}\, \mu_{x_0}^{\xi^s}(B^s(x_0, \kappa r)) \cdot \tilde{\mu}_{\tilde{x}}^{\xi^u}(\widetilde{B}^u(\tilde{x}, \kappa r)); $$

ii) μ is exact dimensional (i.e., d(μ, x) is constant a.e.) and its pointwise dimension is equal to the sum of the stable and unstable pointwise dimensions, i.e.

$$ d(\mu, x) = \delta^s + \delta^u, \quad \text{for } \mu\text{-a.e. } x; \tag{2.1} $$

iii) when μ is an arbitrary hyperbolic invariant probability measure, (2.1) changes into d(μ, x) = δ^s(x) + δ^u(x), for μ-a.e. x.
A simple corollary of the theorem, generalizing (1.2), is:

Theorem 2.2. Let f : M → M be a C^2 non-invertible but non-degenerate endomorphism of a compact surface M and let μ be an ergodic Borel probability measure with exponents λ_1 > 0 > λ_2. Then

$$ d(\mu, x) = \frac{1}{\lambda_1} h_\mu(f) + \frac{1}{-\lambda_2} \bigl( h_\mu(f) - F_\mu(f) \bigr), \quad \mu\text{-a.e.} $$
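One way to read Theorem 2.2 as a corollary of Theorem 2.1 is sketched below. It assumes that on a surface each of the exponents λ_1 > 0 > λ_2 is simple, so that the only γ_i in the entropy formula (1.3) for endomorphisms [25] is δ^u and the only γ_i in (1.9) is δ^s; this bookkeeping is an interpretation of the surrounding text rather than a quotation of the proof.

```latex
% Unstable direction: (1.3) for endomorphisms with a single positive exponent.
h_\mu(f) = \lambda_1 \delta^u
  \quad\Longrightarrow\quad
  \delta^u = \frac{1}{\lambda_1}\, h_\mu(f).

% Stable direction: (1.9) with a single negative exponent.
h_\mu(f) = F_\mu(f) - \lambda_2 \delta^s
  \quad\Longrightarrow\quad
  \delta^s = \frac{1}{-\lambda_2}\,\bigl(h_\mu(f) - F_\mu(f)\bigr).

% Theorem 2.1 ii) for the hyperbolic ergodic measure \mu.
d(\mu, x) = \delta^s + \delta^u
  = \frac{1}{\lambda_1}\, h_\mu(f)
  + \frac{1}{-\lambda_2}\,\bigl(h_\mu(f) - F_\mu(f)\bigr),
  \qquad \mu\text{-a.e.}
```

When f is a diffeomorphism the folding entropy vanishes and the last line is Young's formula (1.2).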
In the case that μ is an arbitrary ergodic measure, which is not necessarily hyperbolic, let δ^c denote the multiplicity of its zero Lyapunov exponent. We have

Theorem 2.3. Let f be a C^2 non-invertible but non-degenerate endomorphism on M preserving an f-ergodic Borel probability measure μ. Then d(μ, x) ≤ δ^s + δ^c + δ^u ≤ dim_L(μ), for μ-a.e. x.

Next we consider the dimension of invariant probability measures for random endomorphisms as in [13]. Let C^2(M, M) be the space of all C^2 endomorphisms of M endowed with the C^2 topology. We denote by w an element of C^2(M, M) and by f_w the corresponding map in C^2(M, M). Let

$$ \Omega = \prod_{-\infty}^{+\infty} C^2(M, M) $$

be the two-sided infinite product of copies of C^2(M, M) endowed with the product topology. For w = (w_n)_{n=−∞}^{+∞} ∈ Ω, let {f_{w_n}}_{n=−∞}^{+∞} be the corresponding sequence of maps and define, for n > 0, f_w^0 = id, f_w^n = f_{w_{n−1}} ◦ f_{w_{n−2}} ◦ · · · ◦ f_{w_0}. Let ν be a Borel probability measure on C^2(M, M) satisfying

$$ \int \log^+ |f_w|_{C^2}\, \nu(dw) < +\infty, \qquad \int \log D(f_w)\, \nu(dw) > -\infty, $$

where |f_w|_{C^2} denotes the C^2 norm of f_w and D(f_w) = inf_{x∈M} |det T_x f_w|. Consider the composition of {f_{w_n}}_{n=−∞}^{+∞}, where the w_n’s are chosen independently with distribution ν. Let τ denote the left shift map on Ω. Then ν^ℤ is ergodic with respect to τ on Ω. The above set-up of the random process will be referred to as χ(M, ν) in the sequel. A Borel probability measure μ on M is called a stationary measure of χ(M, ν), or χ(M, ν)-stationary, if

$$ \int_{C^2(M, M)} f_w \mu\; \nu(dw) = \mu. $$
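The composition rule f_w^n = f_{w_{n−1}} ◦ · · · ◦ f_{w_0} and the stationarity condition can be checked by hand on a toy model. The sketch below replaces M by a three-point set and C^2(M, M) by two maps with equal probability; the maps, the measure, and all names are illustrative assumptions with no counterpart in the paper, chosen so that one map is non-invertible.

```python
from fractions import Fraction


def compose(maps, word):
    """f_w^n for word = (w_0, ..., w_{n-1}); the map f_{w_0} is applied first."""
    def composed(x):
        for w in word:
            x = maps[w](x)
        return x
    return composed


def pushforward(f, mu):
    """Image measure f mu on a finite state space, mu given as {state: mass}."""
    out = {x: Fraction(0) for x in mu}
    for x, mass in mu.items():
        out[f(x)] += mass
    return out


def averaged_pushforward(maps, nu, mu):
    """Finite analogue of the integral of f_w mu against nu(dw)."""
    out = {x: Fraction(0) for x in mu}
    for w, f in maps.items():
        for y, mass in pushforward(f, mu).items():
            out[y] += nu[w] * mass
    return out


# Toy model on {0, 1, 2}: "a" collapses everything to 0, "b" is a rotation.
maps = {"a": lambda x: 0, "b": lambda x: (x + 1) % 3}
nu = {"a": Fraction(1, 2), "b": Fraction(1, 2)}
mu = {0: Fraction(4, 7), 1: Fraction(2, 7), 2: Fraction(1, 7)}
is_stationary = averaged_pushforward(maps, nu, mu) == mu
```

In this toy model the measure (4/7, 2/7, 1/7) is stationary, while neither map preserves it on its own, which is the difference between stationarity for the random process and invariance under a single map.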
Consider the Markov process generated by χ (M, ν) with state space M and transition probabilities P(x, A) = ν{w : f w x ∈ A},
A ∈ B(M),
L. Shu
where $\mathcal{B}(M)$ is the Borel $\sigma$-algebra of $M$. It is easy to see that $\mu$ is $\chi(M, \nu)$-stationary if and only if it is stationary with respect to the transition kernel $P(x, A)$. In the case where the transition probabilities $P(x, \cdot)$, $x \in M$, have a density with respect to the Lebesgue measure $L_M$ on $M$, i.e. there is a measurable function $p : M \times M \to \mathbb{R}^+$ such that for every $x \in M$ one has
\[
P(x, A) = \int_A p(x, y)\, dL_M(y), \quad A \in \mathcal{B}(M),
\]
every $\chi(M, \nu)$-stationary measure $\mu$ is absolutely continuous with respect to $L_M$ and
\[
\frac{d\mu}{dL_M}(y) = \int_M p(x, y)\, \mu(dx).
\]
A Borel set $A \in \mathcal{B}(M)$ is said to be $\nu$-invariant if $x \in A$ if and only if $f_w x \in A$, $\nu$-a.e. Call a $\chi(M, \nu)$-stationary measure $\mu$ ergodic if every $\nu$-invariant Borel set $A$ has $\mu$ measure 0 or 1. Consider the skew map $T : \Omega \times M \to \Omega \times M$ defined by $T(w, x) = (\tau(w), f_{w_0} x)$, where $w = (w_n)_{-\infty}^{+\infty}$. For each $\chi(M, \nu)$-stationary measure $\mu$, there is a unique $T$-invariant Borel probability measure $\mu^*$ on $\Omega \times M$ such that (see [14, Prop. 1.2])
\[
\mathrm{Proj}_{C^2(M,M)^{\mathbb{N}} \times M}\, \mu^* = \nu^{\mathbb{N}} \times \mu, \tag{2.2}
\]
\[
\mathrm{Proj}_{C^2(M,M)^{\mathbb{Z}}}\, \mu^* = \nu^{\mathbb{Z}}, \quad \mathrm{Proj}_{M}\, \mu^* = \mu, \tag{2.3}
\]
and $T^n(\nu^{\mathbb{Z}} \times \mu)$ converges weakly to $\mu^*$. Moreover, $\mu^*$ is ergodic if $\mu$ is. Disintegrating $\mu^*$ with respect to $\nu^{\mathbb{Z}}$, we obtain a class of measures $\{\mu_w\}_{w \in \Omega}$ (unique $\nu^{\mathbb{Z}}$-a.e.) such that
\[
\mu^*(dw, dx) = \nu^{\mathbb{Z}}(dw)\, \mu_w(dx). \tag{2.4}
\]
Call $\{\mu_w\}_{w \in \Omega}$ a class of sample measures of $\mu$. It is easy to see from (2.2), (2.3), and (2.4) that the sample measures $\{\mu_w\}_{w \in \Omega}$ of a $\chi(M, \nu)$-stationary measure $\mu$ satisfy
i) $f_{w_0} \mu_w = \mu_{\tau(w)}$, $\nu^{\mathbb{Z}}$-a.e.;
ii) $w \mapsto \mu_w$ depends only on $(w_n)_{n<0}$, $\nu^{\mathbb{Z}}$-a.e.;
iii) $\int \mu_w\, \nu^{\mathbb{Z}}(dw) = \mu$.
We note that the properties of sample measures may differ dramatically from those of $\mu$. If $\mu_w \ll \mathrm{Leb}$ for $\nu^{\mathbb{Z}}$-a.e. $w$, then we have $\mu \ll \mathrm{Leb}$ by iii) above. As to the converse direction, there is a counterexample given in Arnold [1], where the stationary measure is smooth while the sample measures are all Dirac measures.
In the remainder of this section, the measure $\mu$ is assumed to be $\chi(M, \nu)$-ergodic. It, together with $\chi(M, \nu)$, will be referred to as $\chi(M, \nu; \mu)$ or simply $\chi$. Let $\lambda_1 > \cdots > \lambda_r$ be the Lyapunov exponents of $\chi$ with multiplicities $m_1, \ldots, m_r$ respectively. Then for $\mu^*$-a.e. $(w, x)$, there is an associated sequence of subspaces $T_x M = V^{(0)}(w, x) \supset V^{(1)}(w, x) \supset \cdots \supset V^{(r)}(w, x) = \{0\}$ such that
\[
\lim_{n \to +\infty} \frac{1}{n} \log |T_x f_w^n v| = \lambda_i
\]
for all $v \in V^{(i-1)}(w, x) \setminus V^{(i)}(w, x)$, $1 \le i \le r$. We may assume $\lambda_1 > 0$ for non-triviality. For $j$ with $\lambda_j < 0$, define the stable manifold corresponding to $V^{(j)}$ at $(w, x)$ to be
\[
W^{s,j}(w, x) = \Bigl\{ y \in M : \limsup_{n \to +\infty} \frac{1}{n} \log d(f_w^n x, f_w^n y) < \lambda_j \Bigr\}.
\]
Let $V^{s,j}(w, x)$ denote the arc connected component of $W^{s,j}(w, x)$ which contains $x$. It is a $C^{1,1}$ immersed submanifold of $M$ with dimension $\Sigma_{i \ge j} m_i$. Denote by $d^{s,j}_{(w,x)}$ the metric on $V^{s,j}(w, x)$ inherited from $M$.
The folding entropy of $\mu$ for the system $\chi(M, \nu; \mu)$, denoted by $F_\mu(\chi)$ or $F_\mu$ for simplicity, is defined by
\[
F_\mu(\chi) = - \int\!\!\int \log\, (\mu_w)_x^{f_{w_0}^{-1}\epsilon}(\{x\})\, \mu_w(dx)\, \nu^{\mathbb{Z}}(dw),
\]
where $\epsilon$ denotes the measurable partition of $M$ into single points and $\{(\mu_w)_x^{f_{w_0}^{-1}\epsilon}\}$ is a disintegration of the measure $\mu_w$ with respect to the partition $f_{w_0}^{-1}\epsilon$. This notion is closely related to that of the Jacobian of measure preserving transformations [22] (see also [16,17]). The Lyapunov dimension of $\mu$ for $\chi$, denoted by $\dim_L(\mu)$, is as defined in the introduction using $F_\mu(\chi)$ in place of $F_\mu(f)$. As will be explained in Sect. 5.1.2, if the first case in the definition of $\dim_L(\mu)$ happens, the following theorems hold trivially. So we may assume that $K$ in the definition of $\dim_L(\mu)$ always exists.
Hypotheses A, A′, and B below are taken from [13]. We let $L$ be the smallest integer so that $\Sigma_{j=1}^{L} \lambda_j m_j \le F_\mu$ (i.e. $L = K + 1$ for $K$ in Sect. 2).
Hypothesis A. For $\mu$-a.e. $x$ and $j = L, L + 1$, the distribution of $w \mapsto V^{(j)}(w, x)$ is absolutely continuous with respect to Lebesgue on the space of $(\Sigma_{i \ge j} m_i)$-planes in $T_x M$.
Theorem 2.4. Let $\chi(M, \nu; \mu)$ be so that $\mu$ is absolutely continuous with respect to Lebesgue on $M$ and $\lambda_j \ne 0$ for all $j$. Assume Hypothesis A is satisfied. Then, for $\nu^{\mathbb{Z}}$-a.e. $w$, $\dim(\mu_w) = \dim_L(\mu)$.
A stronger hypothesis which implies Hypothesis A is as follows. Recall the Grassmannian bundle of $M$ is
\[
\mathrm{Gr}(M) = \bigcup_{k=1}^{\dim M} \mathrm{Gr}(M, k),
\]
where $\mathrm{Gr}(M, k)$ is the bundle of $k$-dimensional subspaces of tangent spaces to $M$. For $v \in \mathrm{Gr}(M)$ and $\Gamma \subset \mathrm{Gr}(M)$, the probability transition kernel $Q(v, \Gamma)$ is $Q(v, \Gamma) = \nu\{w : (D f_w)^{-1} v \in \Gamma\}$. By our assumption on $\nu$, the map $(D f_w)^{-1}$ is well-defined for $\nu$-a.e. $w$.
Hypothesis A′. For all $v \in \mathrm{Gr}(M)$, the probability $Q(v, \cdot)$ is absolutely continuous with respect to Lebesgue on $\mathrm{Gr}(M)$.
Theorem 2.5. Theorem 2.4 holds if Hypothesis A is replaced by Hypothesis A′.
Each $x \in M$ generates a partition $P_x$ on $C^2(M, M)$ by $P_x(z) = \{w : f_w x = z\}$. Let $\{\nu^{x,z} : z \in M\}$ be the family of conditional measures associated with $P_x$. Given $y \in M$, let $P_{x,z}(y, \cdot)$ be the image of $\nu^{x,z}$ under the map $w \mapsto f_w y$. Let $\rho_y^{x,z}$ be its density with respect to Lebesgue if it exists.
Hypothesis B. i) For Lebesgue a.e. $(x, z)$ and all $y \ne x$, $P_{x,z}(y, \cdot) \ll \mathrm{Leb}$. ii) For all $\xi > 0$, there exist $G_\xi \subset M \times M$ with $\mathrm{Leb}(M \times M \setminus G_\xi) = 0$ and $E_\xi : G_\xi \to \mathbb{R}^+$ such that for all $(x, z) \in G_\xi$ and all $y$ with $d(x, y) \le E_\xi(x, z)$, $\rho_y^{x,z} \le d(x, y)^{-\dim M - \xi}$.
Theorem 2.6. Let $\chi(M, \nu; \mu)$ be so that $\lambda_j \ne 0$ for all $j$ and Hypothesis B is satisfied. Then, for $\nu^{\mathbb{Z}}$-a.e. $w$, $\dim(\mu_w) = \dim_L(\mu)$.

3. Dimension of Hyperbolic Measures for Endomorphisms
In this section, unless otherwise specified, $\mu$ is assumed to be ergodic and hyperbolic. Let $\rho_0, \rho_1 > 0$ be such that, for any $x \in M$, the map $f|_{B(x,\rho_0)} : B(x, \rho_0) \to M$ is a diffeomorphism onto its image, which contains $B(f x, \rho_1)$. Let $f_x^{-1} : f B(x, \rho_0) \to B(x, \rho_0)$ denote the local inverse.
3.1. Preparatory lemmas.
3.1.1. Lyapunov charts in $(M, \theta)$. Write $\mathbb{R}^{\dim M} = \mathbb{R}^u \times \mathbb{R}^s$, where $u = \dim E^u(x)$, $s = \dim E^s(x)$, $\mu$-a.e. For $v \in \mathbb{R}^{\dim M}$, let $(v^u, v^s)$ be its coordinates with respect to this splitting. Define $|v| = \max\{|v^u|_u, |v^s|_s\}$, where $|\cdot|_u$, $|\cdot|_s$ are the Euclidean norms on $\mathbb{R}^u$, $\mathbb{R}^s$, respectively. The closed disk in $\mathbb{R}^u$ (or $\mathbb{R}^s$) of radius $\rho$ centered at 0 is denoted by $R^u(\rho)$ (or $R^s(\rho)$) and $R(\rho) = R^u(\rho) \times R^s(\rho)$. Put $\lambda^s = \max\{\lambda_i : \lambda_i < 0\}$, $\lambda^u = \min\{\lambda_i : \lambda_i > 0\}$. Let $0 < \varepsilon < \min\{-\lambda^s, \lambda^u\}/200$ be given.
There exist a Borel set Γ0 ⊂ Γ0 with μ(Γ0 ) = 1 and θ (Γ0 ) = Γ0 and a measurable function l : Γ0 → [1, +∞) with l(θ ±1 x) ≤ eε l(x) such that for each x ∈ Γ0 one can define an embedding Φx : R(l(x)−1 ) → M with the following properties: (i) Φx (0) = x0 and T0 Φx takes Ru , Rs to E u (x), E s (x), respectively. −1 −1 := Φθ−1 (ii) Put f x := Φθ−1 −1 x ◦ f x −1 ◦ Φ x , defined whenever they x ◦ f ◦ Φx and f x make sense. Then
\[
|T_0 f_x v| \ge e^{\lambda^u - \varepsilon} |v| \ \text{ for } v \in \mathbb{R}^u, \qquad |T_0 f_x v| \le e^{\lambda^s + \varepsilon} |v| \ \text{ for } v \in \mathbb{R}^s.
\]
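(iii) Let $L(g)$ denote the Lipschitz constant of the function $g$. Then $L(f_x - T_0 f_x) \le \varepsilon$, $L(f_x^{-1} - T_0 f_x^{-1}) \le \varepsilon$ and $L(T f_x), L(T f_x^{-1}) \le l(x)$.
(iv) For any $v, v' \in R(l(x)^{-1})$, we have $\kappa^{-1} d(\Phi_x v, \Phi_x v') \le |v - v'| \le l(x)\, d(\Phi_x v, \Phi_x v')$ for some universal constant $\kappa > 0$.
(v) $|f_x v| \le e^{\lambda} |v|$ and $|f_x^{-1} v| \le e^{\lambda} |v|$ for all $v \in R(e^{-(\lambda+\varepsilon)} l(x)^{-1})$, where $\lambda > 0$ is a number depending only on $\varepsilon$ and the exponents. In particular, $f_x^{\pm 1} R(e^{-(\lambda+\varepsilon)} l(x)^{-1}) \subset R(l(\theta^{\pm 1} x)^{-1})$.
Any system of local charts $\{\Phi_x : x \in \Gamma_0\}$ satisfying (i)-(v) above will be referred to as $(\varepsilon, l)$-charts.
3.1.2. A special partition $\mathcal{P}$. Given two measurable partitions $\xi_1$ and $\xi_2$ of a measurable space $(X, \nu)$, we say that $\xi_1$ refines $\xi_2$ ($\xi_1 > \xi_2$) if $\xi_1(x) \subset \xi_2(x)$ at $\nu$-a.e. $x \in X$. Denote by $\vee$ the join of two partitions. Let $\{\Phi_x, x \in \Gamma_0\}$ be a system of $(\varepsilon, l)$-charts and let $0 < \delta \le 1$ be a reduction factor. For $x \in \Gamma_0$, put $f_x^0 = \mathrm{Id}$ and $f_x^n = f_{\theta^{n-1} x} \circ \cdots \circ f_x$, $f_x^{-n} := f^{-1}_{\theta^{-(n-1)} x} \circ \cdots \circ f_x^{-1}$ for $n \ge 1$ and define
\[
S^{cs}_\delta(x) = \{ z \in R(l(x)^{-1}) : |f_x^n(z)| \le \delta\, l(\theta^n x)^{-1},\ \forall\, n \ge 0 \},
\]
\[
S^{cu}_\delta(x) = \{ z \in R(l(x)^{-1}) : |f_x^{-n}(z)| \le \delta\, l(\theta^{-n} x)^{-1},\ \forall\, n \ge 0 \}.
\]
A measurable partition $\mathcal{P}$ of $(M, \mu)$ is said to be adapted to $(\{\Phi_x\}, \delta)$ if for $\mu$-a.e. $x \in \Gamma_0$ one has
\[
p(\mathcal{P}^-(x)) \subset \Phi_x S^{cs}_\delta(x), \qquad p(\mathcal{P}^+(x)) \subset \Phi_x S^{cu}_\delta(x),
\]
where $\mathcal{P}^- = \bigvee_{n=0}^{\infty} \theta^{-n} \mathcal{P}$ and $\mathcal{P}^+ = \bigvee_{n=0}^{\infty} \theta^{n} \mathcal{P}$.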
Lemma 3.1. For any 0 < δ < e−λ−ε , there exists a measurable partition P of (M, μ), which is adapted to ({Φx }, δ) and satisfies Hμ (P) < ∞. (The proof of Lemma 3.1 only differs slightly from [19, Lemma 4.2.1] in the definition of φ below. We give it for completeness.)
= {x ∈ Γ : l(x) ≤ l0 } has μ positive Proof. Fix some l0 > 0 such that the set Λ 0 let measure. For x ∈ Λ r + (x) = min{k > 0 : θ k (x) ∈ Λ}, r − (x) = min{k > 0 : θ −k (x) ∈ Λ}. Define φ : M → (0, +∞) by min{δ, ρ0 }, if x ∈ Λ; φ(x) = −2 −(λ+3ε) max{r + (x),r − (x)} min{δl0 e , ρ0 }, if x ∈ Λ. + Then −φ is defined μ almost everywhere and log φ is μ-integrable since Λ r (x) dμ = r (x) dμ = 1. Λ Follow [19, Lemma 4.2.1] (to use Mañé’s idea [20]) to construct a partition P with Hμ (P) < ∞ so that diam p(P(x)) ≤ φ(x), for any x ∈ M. Then, using the recurrence functions r + , r − and v) of the properties of ({Φx }, δ), we can conclude that p(P − (x)) ⊂ Φx Sδcs (x) and p(P + (x)) ⊂ Φx Sδcu (x). We use the following notations. Let η be a measurable partition of (M, θ ). For every integer k, l ≥ 1, we define ηkl = ∨ln=−k θ −n η. We observe that ηkl (x) = ηk0 (x) ∩ η0l (x). Let ξ s be a partition of M subordinate to W s -manifolds satisfying ξ s > f −1 ξ s . Denote by ξ s = p −1 ξ s the lift of ξ s to (M, θ ). Let ξ u be a partition of (M, θ ) subordinate to W u -manifolds satisfying ξ u > θ ξ u . Then as in [15] we have Lemma 3.2 (cf. [15, Lemma 4.4]). Let λ0 ≤ min{|λ j |, 1 ≤ j ≤ r } and fix 0 < δ < e−λ−ε arbitrarily. Let P be as constructed in Lemma 3.1. Then there exist some constant κ ≥ 1 and a measurable function n 0 : M → Z+ such that for μ-a.e. x ∈ M and all n ≥ n 0 (x), i) p(Pnn (x)) ⊂ B(x0 , κδe−nλ /2 ); 0 0 B u (x, κδe−nλ /2 ); ii) p([ξ s ∨ Pn0 ](x)) ⊂ B s (x0 , κδe−nλ /2 ), [ξ u ∨ P0n ](x) ⊂ iii) p(P − (x)) ⊂ B s (x0 , κδ), P + (x) ⊂ B u (x, κδ). 0
Moreover, due to the generating properties of ξ s and ξ u (cf. [19] and [25]), we have Lemma 3.3 (cf. [15, Lemma 4.5]). Let P be the partition (depending particularly on 0 < δ < e−λ−ε ) given above. Then one has for μ-a.e. x, 1 log μ(Pnn (x)) = h μ ( f ), δ↓0 n→+∞ 2n ∗ 1 ξs lim lim − log μx (Pn0 (x)) = h μ ( f ), n→+∞ δ↓0 n ∗ 1 ξs lim lim − log μx (Pnn (x)) = h μ ( f ), δ↓0 n→+∞ n ∗ 1 ξu lim lim − log μx (P0n (x)) = h μ ( f ), δ↓0 n→+∞ n ∗ 1 ξu lim lim − log μx (Pnn (x)) = h μ ( f ), δ↓0 n→+∞ n lim lim −
∗ 1 lim lim − log μx (Pnn (x)) = Fμ ( f ), δ↓0 n→+∞ n 1 ξs lim − log μx (P0n (x)) = 0, n→∞ n 1 ξu lim − log μx (Pn0 (x)) = 0, n→∞ n
where the limits lim∗n→∞ above are understood as both lim inf n→∞ and lim supn→∞ . 3.1.3. Points with good local behavior. Let 0 < ε < 1 be given sufficiently small. Let 0 < ε∗ ≤ (1/200) min{λ0 , 1} and let {Φx } be a system of (ε∗ , l∗ ) Lyapunov charts. Let 0 < δ∗ < e−λ−ε be small enough. Set h = h μ ( f ), = p −1 , with being the partition of M into single points and Fμ = Fμ ( f ). Then, by Lemma 3.2 and Lemma 3.3, we can find a measurable partition P of (M, θ ) with Hμ (P) < ∞ and a set Γ ⊂ M of measure μ(Γ ) > 1 − ε4 together with an integer n 0 = n 0 (ε) ≥ 1 and a number C = C(ε) > 1 such that for every x ∈ Γ and n ≥ n 0 , the following statements hold: a) for all integers k ≥ 1 we have C −1 e−k(h+ε) ≤ μ(P0k (x)) ≤ Ce−k(h−ε) , C −1 e−k(h+ε) ≤ μ(Pk0 (x)) ≤ Ce−k(h−ε) , ξs
C −1 e−kε ≤ μx (P0k (x)) ≤ 1, ξs
C −1 e−k(h+ε) ≤ μx (Pk0 (x)) ≤ Ce−k(h−ε) , ξu
C −1 e−kε ≤ μx (Pk0 (x)) ≤ 1, ξu
C −1 e−k(h+ε) ≤ μx (P0k (x)) ≤ Ce−k(h−ε) ; b) for all integers k ≥ 1 we have C −1 e−2k(h+ε) ≤ μ(Pkk (x)) ≤ Ce−2k(h−ε) , ξs
C −1 e−k(h+ε) ≤ μx (Pkk (x)) ≤ Ce−k(h−ε) , ξu
C −1 e−k(h+ε) ≤ μx (Pkk (x)) ≤ Ce−k(h−ε) , C −1 e−k(Fμ +ε) ≤ μx (Pkk (x)) ≤ Ce−k(Fμ −ε) ; c) e−n(δ e
s +ε/2)
−n(δ u +ε/2)
≤ μξx0 (B s (x0 , e−n )) ≤ e−n(δ s
s −ε/2)
ξu
−n(δ u −ε/2)
≤ μx ( B u (x, e−n )) ≤ e
,
;
d) an (x)) ⊂ p(Pan 0 s ξ ∨ Pan (x) ⊂ u ξ ∨ P0an (x) ⊂
B(x0 , e−n ), p −1 (B s (x0 , e−n )) ⊂ ξ s (x), B u (x, e−n ) ⊂ ξ u (x),
where a is the integer part of 2(1 + (λ0 )−1 );
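e) for each $x \in \Gamma$,
\[
d(z, z') \le d^s_{x_0}(z, z') \le 2 d(z, z'), \quad \forall\, z, z' \in B^s(x_0, e^{-n_0}),
\]
\[
d(p(z), p(z')) \le d^u_x(p(z), p(z')) \le 2 d(p(z), p(z')), \quad \forall\, z, z' \in B^u(x, e^{-n_0});
\]
f) for every $x \in \Gamma$ and $n \ge n_0$,
\[
B\bigl(x_0, \tfrac{1}{2} e^{-n}\bigr) \cap \xi^s(x_0) \subset B^s(x_0, e^{-n}) \subset B(x_0, e^{-n}) \cap \xi^s(x_0),
\]
\[
p^{-1}\bigl(B\bigl(x_0, \tfrac{1}{2} e^{-n}\bigr)\bigr) \cap \xi^u(x) \subset B^u(x, e^{-n}) \subset p^{-1}(B(x_0, e^{-n})) \cap \xi^u(x).
\]
3.1.4. Density points of the set $\Gamma$. Since we only have control on points in $\Gamma$, we will pick out the "density points" of $\Gamma$ for later use. Note that $M$ is not a finite dimensional manifold. We need the following slight variant of the Borel density lemma.
Lemma 3.4 ([25, Lemma 3.1]). Let $A \subset M$ be a measurable set with $\mu(A) > 0$. Then for $\mu$-a.e. $x \in A$,
\[
\lim_{\rho \to 0} \frac{\mu_x^{\xi^u}(B^u(x, \rho) \cap A)}{\mu_x^{\xi^u}(B^u(x, \rho))} = 1.
\]
Based on this, we can further use its idea to show
Lemma 3.5. There exist a set $\hat\Gamma \subset \Gamma$ with $\mu(\hat\Gamma) > 1 - 30\varepsilon$ and $\hat n \in \mathbb{N}$ such that for any $x \in \hat\Gamma$, $n \ge \hat n$,
\[
\mu\bigl(p^{-1}(B(x_0, e^{-n})) \cap \mathcal{P}(x) \cap \Gamma\bigr) \ge \frac{1}{8C}\, \mu\bigl(B(x_0, e^{-n})\bigr), \tag{3.1}
\]
\[
\mu_x^{\xi^s}\bigl(p^{-1}(B^s(x_0, e^{-n})) \cap \mathcal{P}(x) \cap \Gamma\bigr) \ge \frac{1}{8C}\, \mu_{x_0}^{\xi^s}\bigl(B^s(x_0, e^{-n})\bigr), \tag{3.2}
\]
\[
\mu_x^{\xi^u}\bigl(B^u(x, e^{-n}) \cap \Gamma\bigr) \ge \frac{1}{2}\, \mu_x^{\xi^u}\bigl(B^u(x, e^{-n})\bigr), \tag{3.3}
\]
\[
\mu_x(\mathcal{P}(x) \cap \Gamma) \ge \frac{1}{2C}. \tag{3.4}
\]
Proof. We first pick out points satisfying (3.4). Let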
1 A := x ∈ M : μx (P(x) ∩ Γ ) ≥ . 2C We show μ(A) > 1 − 3ε2 . Let {μ D : D ∈ P} be a canonical system of conditional measures of μ with respect to the partition P. Denote by μ/P the corresponding induced measure on the factor space of M with respect to P. Put A = {D ∈ P : μ D (Γ ) ≥ 1 − ε2 }, then
μ(Γ ) =
μ D (Γ ) dμ/P ≤ 1 − ε2 (1 − μ/P (A)),
which gives μ/P (A) ≥ 1 − ε2 . For each D ∈ A fixed, define
1 −1 K D := y ∈ M : μ D∩ p {y0 } (Γ ) ≥ . 2 Then μ D (K D ) ≥ 1 − 2ε2 . For y ∈ K D , we see that −1
μy (P(y) ∩ Γ ) =
μP (y)∩ p {y0 } (Γ ) 1 , ≥ 2C μ y (P(y))
where μy (P(y)) ≤ C by b) of Sect. 3.1.3 since p −1 {y0 } ∩ P(y) ∩ Γ = Ø. Hence we have μ(A) ≥ μ D (A) dμ/P A ≥ μ D (K D ) dμ/P A
≥ (1 − ε2 )(1 − 2ε2 ) ≥ 1 − 3ε2 . Let Γ1 = Γ ∩ A. Then μ(Γ1 ) > 1 − 4ε2 . Next, let n 1 ∈ N and define
1 s ξs A1 = x : μx p −1 (B s (x0 , e−n )) ∩ Γ1 ≥ μξx0 (B s (x0 , e−n )), ∀ n ≥ n 1 . 4 We show μ(A1 ) > 1 − 12ε for some n 1 large. Let {μ E : E ∈ ξ s } be a canonical system of conditional measures of μ with respect to the partition ξ s . Denote by μ/ξ s the corresponding induced measure on the factor space of M with respect to ξ s . Put A1 = {E ∈ ξ s : μ E (Γ1 ) ≥ 1 − 2ε}, then μ(Γ1 ) =
μ E (Γ1 ) dμ/ξ s ≤ 1 − 2ε(1 − μ/ξ s (A1 )),
which gives μ/ξ s (A1 ) > 1 − 2ε. For each E ∈ A1 fixed, define
1 −1 . K E := y ∈ M : μ E∩ p {y} (Γ1 ) ≥ 2 Put E = p(E). Then μ E (K E ) ≥ 1 − 4ε and
1 μ E p −1 (B s (x0 , e−n )) ∩ Γ1 ≥ μ E B s (x0 , e−n ) ∩ K E . 2
By the Borel density lemma, there exists n = n (E) and K E ⊂ K E of measure μ E (K E ) ≥ 1 − 6ε such that μ E (B s (x0 , e−n ) ∩ K E ) ≥
1 E s μ (B (x0 , e−n )), ∀n ≥ n , y ∈ K E . 2
Thus we can define a measurable function n : A1 → Z+ such that the above equation holds true. Let n 1 be a large number such that the set 1 := {E ∈ A1 : n (E) ≤ n 1 } A 1 ) ≥ 1 − 4ε. Therefore, if E ∈ A 1 and y ∈ K ∩ p(A ∩ E), then has measure μ/ξ s (A E for n ≥ n 1 ,
1 μ E p −1 (B s (y, e−n )) ∩ Γ1 ≥ μ E (B s (y, e−n )), 4 i.e., p −1 (K E ) ∩ Γ1 ∩ E ⊂ A1 . Thus μ(A1 ) ≥ μ E (A1 ) dμ/ξ s 1 A ≥ μ E ( p −1 (K E ) ∩ Γ1 ∩ E) dμ/ξ s 1 A
≥ (1 − 8ε)(1 − 4ε) ≥ 1 − 12ε. Similarly, let n 2 ∈ N and define
1 A2 = x ∈ M : μ p −1 (B(x0 , e−n )) ∩ Γ1 ≥ μ(B(x0 , e−n )), ∀n ≥ n 2 . 4 We have μ(A2 ) > 1 − 12ε for n 2 large. Let n 3 ∈ N and define
1 ξu u ξu B u (x, e−n ) ∩ Γ1 ) ≥ μx ( B (x, e−n )), ∀n ≥ n 3 . A3 = x ∈ M : μx ( 2 Then points in A3 satisfy (3.3). By Lemma 3.4, we have μ(A3 ) > 1 − ε for n 3 large. Now put Γˆ = A∩ A1 ∩ A2 ∩ A3 ∩Γ . Then μ(Γˆ ) > 1−30ε. Let nˆ = max{n 1 , n 2 , n 3 }. For x ∈ Γˆ and n > n, ˆ we have
s ξ μx p −1 (B s (x0 , e−n )) ∩ P(x) ∩ Γ
ξs ≥ μy p −1 (B s (x0 , e−n )) ∩ P(x) ∩ Γ dμx (y) ξs ≥ μy (P(x) ∩ Γ ) dμx (y) −1 s −n p (B (x0 ,e ))∩Γ ξs ≥ μy (P(x) ∩ Γ ) dμx (y) p −1 (B s (x0 ,e−n ))∩Γ1 ξs
≥ μx ≥
1 p −1 (B s (x0 , e−n )) ∩ Γ1 · 2C
1 ξs μ (B(x0 , e−n )), 8C x0
i.e., $x$ satisfies (3.2). Similarly, we can show (3.1). This finishes the proof of the lemma.
Let $\hat\Gamma \subset \Gamma$ with $\mu(\hat\Gamma) > 1 - 30\varepsilon$ be as obtained in Lemma 3.5. We can further require that for every $n \ge \hat n$ and $x \in \hat\Gamma$,
\[
\mu_x^{\xi^u}\bigl(\mathcal{P}^0_{an}(x) \cap B^u(x, e^{-n}) \cap \Gamma\bigr) \ge e^{-n(\delta^u + \varepsilon)}. \tag{3.5}
\]
This inequality can be obtained by considering $\mu$ in place of the random measure in [15].
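3.2. Proof of Theorem 2.1. We first show i) and ii) of Theorem 2.1. The proof will follow the line in [15] (see also [3]). Let $\mu$ be ergodic. Fix $x \in \Gamma$ and $n \ge n_0$. Consider
\[
\mathcal{R}_n := \{\mathcal{P}^{an}_{an}(y) \subset \mathcal{P}(x) : \mathcal{P}^{an}_{an}(y) \cap \Gamma \ne \emptyset\},
\]
\[
\mathcal{F}^s_n := \{\mathcal{P}^{0}_{an}(y) \subset \mathcal{P}(x) : \mathcal{P}^{0}_{an}(y) \cap \Gamma \ne \emptyset\},
\]
\[
\mathcal{F}^u_n := \{\mathcal{P}^{an}_{0}(y) \subset \mathcal{P}(x) : \mathcal{P}^{an}_{0}(y) \cap \Gamma \ne \emptyset\}.
\]
For each $A \subset \mathcal{P}(x)$, define a series of subsets of $\mathcal{R}_n$, $\mathcal{F}^s_n$ or $\mathcal{F}^u_n$ by the following:
\[
\mathcal{N}(n, A) := \{R \in \mathcal{R}_n : R \cap A \ne \emptyset\},
\]
\[
\mathcal{N}^s(n, y, A) := \mathcal{N}(n, \xi^s(y) \cap \Gamma \cap A),
\]
\[
\mathcal{N}^u(n, y, A) := \mathcal{N}(n, \xi^u(y) \cap \Gamma \cap A),
\]
\[
\hat{\mathcal{N}}^s(n, A) := \{R \in \mathcal{F}^s_n : R \cap A \ne \emptyset\},
\]
\[
\hat{\mathcal{N}}^u(n, A) := \{R \in \mathcal{F}^u_n : R \cap A \ne \emptyset\}.
\]
It is clear that $\mathcal{R}_n \subset \mathcal{F}^s_n \vee \mathcal{F}^u_n := \{R^s \cap R^u : R^s \in \mathcal{F}^s_n,\ R^u \in \mathcal{F}^u_n,\ R^s \cap R^u \ne \emptyset\}$. From this we have
Lemma 3.6. For $x \in \Gamma$ and each $n \ge n_0$,
\[
\#\mathcal{N}\bigl(n, p^{-1}(B(x_0, e^{-n})) \cap \Gamma\bigr) \le \#\hat{\mathcal{N}}^s\bigl(n, p^{-1}(B(x_0, e^{-n})) \cap \Gamma\bigr) \cdot \#\hat{\mathcal{N}}^u\bigl(n, p^{-1}(B(x_0, e^{-n})) \cap \Gamma\bigr).
\]
(Here $\# D$ denotes the cardinality of a countable set $D$.) On the other hand, it is easy to see that
Lemma 3.7. For each $x \in \Gamma$ and integer $n > n_0$, we have
\[
\#\mathcal{N}^s\bigl(n, x, p^{-1}(B(x_0, e^{-n}))\bigr) \le \mu^{\xi^s}_{x_0}(B^s(x_0, 4e^{-n})) \cdot C e^{an(h+\varepsilon)},
\]
\[
\#\mathcal{N}^u\bigl(n, x, p^{-1}(B(x_0, e^{-n}))\bigr) \le \mu^{\xi^u}_{x}(B^u(x, 4e^{-n})) \cdot C e^{an(h+\varepsilon)}.
\]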
Proof. For each R ∈ N s n, x, p −1 (B(x0 , e−n )) , we have by b) of Sect. 3.1.3 that ξs
μx (R) ≥ C −1 e−an(h+ε) . Moreover, we see that p(R) ∩ ξ s (x0 ) ⊂ B s (x0 , 4e−n ). Hence s ξs μx (R) μξx0 (B s (x0 , 4e−n )) ≥ R∈N s (n,x, p −1 (B(x0 ,e−n )))
≥ #N s n, x, p −1 (B(x0 , e−n )) · C −1 e−an(h+ε) , from which the first inequality of the lemma follows. The proof of the second inequality of the lemma is similar. Lemma 3.8. For each x ∈ Γˆ and n > n, ˆ we have
−n −1 μ(B(x0 , e )) ≤ #N n, p (B(x0 , e−n )) ∩ Γ · 8C 2 e−2an(h−ε) . Proof. By Lemma 3.5 and b) of Sect. 3.1.3, we have
1 μ(B(x0 , e−n )) ≤ μ p −1 (B(x0 , e−n )) ∩ P(x) ∩ Γ 8C μ(R) ≤ R∈N (n, p −1 (B(x0 ,e−n ))∩Γ )
≤ #N n, p −1 (B(x0 , e−n )) ∩ Γ · Ce−2an(h−ε) . So, by the above three lemmas, to show i) of Theorem 2.1, it suffices to compare the cardinalities of the sets N s n, x, p −1 (B(x0 , e−n )) and N u n, x, p −1 (B(x0 , e−n )) with that of Nˆ s n, p −1 (B(x0 , e−n )) ∩ Γ and Nˆu n, p −1 (B(x0 , e−n )) ∩ Γ , respectively. Lemma 3.9. For each x ∈ Γ , y ∈ P(x) and n ≥ n 0 , we have
# Nˆ s n, p −1 (B(y0 , e−n )) ∩ Γ ≥ #N s n, y, p −1 (B(y0 , e−n )) · C −2 e−2anε ,
# Nˆu n, p −1 (B(y0 , e−n )) ∩ Γ ≥ #N u n, y, p −1 (B(y0 , e−n )) · C −2 e−2anε . Proof. Obviously, we have
# Nˆ s n, ξ s (y) ∩ p −1 (B(y0 , e−n )) ∩ Γ ≤ # Nˆ s n, p −1 (B(y0 , e−n )) ∩ Γ . (3.6) Fix R s ∈ Nˆ s n, ξ s (y) ∩ p −1 (B(y0 , e−n )) ∩ Γ . For each z that belongs to ξ s (y) ∩ 0 (z), we see that R = P an (z) is a rectangle in p −1(B(y0 , e−n )) ∩ Γ suchthat R s = Pan an s −1 −n N n, y, p (B(y0 , e )) . The number of different R corresponding to R s , denoted by An , satisfies ξs
An ≤
μ y (R s )
ξs min{μ y (R) : R ∈ N s n, y, p −1 (B(y0 , e−n )) , R ∈ R s } ξs
=
0 (z)) μ y (Pan ξs
an (z )) : z ∈ ξ s (y) ∩ p −1 (B(y , e−n )) ∩ R s ∩ Γ } min{μ y (Pan 0
≤ Ce−an(h−ε) /(C −1 e−an(h+ε) ) = C 2 e2anε .
Therefore
# Nˆ s n, ξ s (y) ∩ p −1 (B(y0 , e−n )) ∩ Γ ≥ #N s n, y, p −1 (B(y0 , e−n )) · C −2 e−2anε . This together with (3.6) implies the first inequality of the lemma. The other inequality of the lemma can be similarly obtained. As to the inequalities in the reverse direction, we first have Lemma 3.10. For each x ∈ Γ and n ≥ n 0 , # Nˆ s (n, P(x)) ≤ Cean(h+ε) , # Nˆu (n, P(x)) ≤ Cean(h+ε) . 0 (z). So Proof. The set P(x) is the union of a collection of rectangles R = Pan μ(R) ≥ # Nˆ s (n, P(x)) · C −1 · e−an(h+ε) , 1 ≥ μ(P(x)) ≥ R∈ Nˆ s (n,P (x))
where the last inequality holds since different rectangles R of Nˆ s (n, P(x)) are mutually disjoint. The first inequality of the lemma follows immediately. The second inequality of the lemma can be similarly obtained. Then using the fact of the existence of δ s and δ u and Lemma 3.5, we have Lemma 3.11. For μ-a.e. y ∈ P(x) ∩ Γˆ , we have # Nˆ s n, p −1 (B(y0 , e−n )) ∩ Γ · e−5anε = 0, lim sup n→+∞ #N s n, y, p −1 (B(y0 , e−n )) # Nˆu n, p −1 (B(y0 , e−n )) ∩ Γ · e−5anε = 0. lim sup n→+∞ #N u n, y, p −1 (B(y0 , e−n )) Proof. Let y ∈ Γˆ and n ≥ n, ˆ we have by c) of Sect. 3.1.3 and Lemma 3.5 that s s e−(δ +ε)n ≤ μξy0 B s (y0 , e−n )
ξs = μ y p −1 (B s (y0 , e−n ))
ξs ≤ 8C · μ y p −1 (B s (y0 , e−n )) ∩ P(y) ∩ Γ
ξs ≤ 8C · μ y p −1 (B(y0 , e−n )) ∩ P(y) ∩ Γ . From this, we obtain that
#N n, y, p −1 (B(y0 , e−n )) ≥ s
ξs
μy
p −1 (B(y0 , e−n )) ∩ P(y) ∩ Γ
ξs
an (z)) : z ∈ ξ s (y) ∩ P(y) ∩ Γ } max{μ y (Pan
e−n(δ +ε) 1 · −an(h−ε) 8C e 1 s ≥ · e−n(δ −ah+2aε) . 8C s
≥
Next, for each k ≥ 1, consider the set Fk :=
# Nˆ s n, p −1 (B(y0 , e−n )) ∩ Γ 1 −5anε ·e y ∈ P(x) ∩ Γˆ : lim sup ≥ . k n→+∞ #N s n, y, p −1 (B(y0 , e−n ))
For each y ∈ Fk , there exists an increasing sequence {m j (y)}∞ j=1 of positive integers such that n = m j (y) satisfies
1 #N s n, y, p −1 (B(y0 , e−n )) · e5anε # Nˆ s n, p −1 (B(y0 , e−n )) ∩ Γ ≥ 2k 1 −n(δ s −ah−3aε) ≥ . (3.7) e 16kC Suppose Fk has μ positive measure for some k. Then μ( p(Fk )) ≥ μ(Fk ) > 0. Let Fk ⊂ Fk be the set of points y ∈ Fk for which there exists the limit ξs
log μ y0 (B s (y0 , ρ)) = δs . ρ→0 log ρ lim
Clearly μ(Fk ) = μ(Fk ) > 0. Then we can find y ∈ Fk such that ξs
ξs
ξs
μ y (Fk ) = μ y (Fk ) = μ y
Fk ∩ P(y) ∩ ξ s (y) > 0.
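Hence $\mu^{\xi^s}_{y_0}(p(F_k)) > 0$. So it follows from Frostman's lemma that
\[
\dim_H\bigl(p(F_k) \cap \xi^s(y_0)\bigr) \ge \delta^s. \tag{3.8}
\]
Consider the collection of balls $\mathcal{D} := \{B(z_0, e^{-m_j(z)}) : z \in F_k \cap \xi^s(y),\ j = 1, 2, \ldots\}$. By the Besicovitch covering lemma, one can find a countable subcover $\mathcal{D}' \subset \mathcal{D}$ of $p(F_k) \cap \xi^s(y_0)$ of arbitrarily small diameter and finite multiplicity $q$. This means that for any $L \ge \hat n$ one can choose a sequence of points $\{z^i \in F_k \cap \xi^s(y)\}_{i=1}^{\infty}$ and a sequence of integers $\{t_i\}_{i=1}^{\infty}$, where $t_i \in \{m_j(z^i)\}_{j=1}^{\infty}$ and $t_i \ge L$ for each $i$, such that the collection of balls $\mathcal{D}' = \{B(p(z^i), 4e^{-t_i}) : i = 1, 2, \ldots\}$ comprises a cover of $p(F_k) \cap \xi^s(y_0)$ whose multiplicity does not exceed $q$. Write $B_i = B(p(z^i), 4e^{-t_i})$. The Hausdorff sum corresponding to this cover is
\[
\sum_{B \in \mathcal{D}'} (\operatorname{diam} B)^{\delta^s - \varepsilon} = 8^{\delta^s - \varepsilon} \sum_{i=1}^{\infty} e^{-t_i(\delta^s - \varepsilon)}.
\]
Noting that $a > 1$ (see d) of Sect. 3.1.3), we have by (3.7) that
\begin{align*}
\sum_{i=1}^{\infty} e^{-t_i(\delta^s - \varepsilon)}
&\le \sum_{i=1}^{\infty} \#\hat{\mathcal{N}}^s(t_i, p^{-1}(B_i) \cap \Gamma) \cdot 16kC \cdot e^{-a t_i (h + 2\varepsilon)} \\
&\le 16kC \sum_{l=\hat n}^{\infty} e^{-al(h+2\varepsilon)} \cdot \sum_{i:\, t_i = l} \#\hat{\mathcal{N}}^s(t_i, p^{-1}(B_i) \cap \Gamma) \\
&\le 16kCq \sum_{l=\hat n}^{\infty} e^{-al(h+2\varepsilon)} \cdot \#\hat{\mathcal{N}}^s(l, \mathcal{P}(x)) \\
&\le 16kC^2 q \sum_{l=\hat n}^{\infty} e^{-al(h+2\varepsilon)} \cdot e^{al(h+\varepsilon)} \\
&\le 16kC^2 q \sum_{l=\hat n}^{\infty} e^{-al\varepsilon} < \infty.
\end{align*}
It follows that $\dim_H(p(F_k) \cap \xi^s(y_0)) \le \delta^s - \varepsilon < \delta^s$, which contradicts (3.8). This proves the first equation of the lemma. The second equation can be obtained similarly using the existence of $\delta^u$ and Lemma 3.5.
Proof of Theorem 2.1. We first consider the case where $\mu$ is ergodic. By Lemma 3.11, there exist a set $\Gamma^{\varepsilon} \subset \hat\Gamma$ with $\mu(\Gamma^{\varepsilon}) > 1 - 30\varepsilon$ and $n_\varepsilon \in \mathbb{N}$ such that for all $x \in \Gamma^{\varepsilon}$, $n > n_\varepsilon$,
\[
\#\hat{\mathcal{N}}^s\bigl(n, p^{-1}(B(x_0, e^{-n})) \cap \Gamma\bigr) \le \#\mathcal{N}^s\bigl(n, x, p^{-1}(B(x_0, e^{-n}))\bigr) \cdot e^{5an\varepsilon}, \tag{3.9}
\]
\[
\#\hat{\mathcal{N}}^u\bigl(n, p^{-1}(B(x_0, e^{-n})) \cap \Gamma\bigr) \le \#\mathcal{N}^u\bigl(n, x, p^{-1}(B(x_0, e^{-n}))\bigr) \cdot e^{5an\varepsilon}. \tag{3.10}
\]
Fix $x \in \Gamma^{\varepsilon}$ and an integer $n \ge n_\varepsilon$. We first show
\[
\mu^{\xi^s}_{x_0}(B^s(x_0, e^{-n})) \cdot \mu^{\xi^u}_{x}(B^u(x, e^{-n})) \le \mu(B(x_0, 3e^{-n})) \cdot 8C^6 e^{7an\varepsilon}. \tag{3.11}
\]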
Clearly, for each rectangle R in Rn which intersects p −1 (B(x0 , 2e−n )) ∩ Γ , we have R ⊂ p −1 (B(x0 , 3e−n )). Therefore,
μ p −1 (B(x0 , 3e−n )) ∩ P(x) ≥ μ(R) R∈N (n, p −1 (B(x0 ,2e−n ))∩Γ )
≥ #N n, p −1 (B(x0 , 2e−n )) ∩ Γ · C −1 e−2an(h+ε) . (3.12) On the other hand, we see that
#N n, p −1 (B(x0 , 2e−n )) ∩ Γ
0 ≥ #N u n, z, Pan (z) ∩ p −1 (B(z 0 , e−n ))
0 (z)∈ Nˆ s n, p −1 (B(x ,e−n ))∩Γˆ Pan 0
0 (z) ∩ p −1 (B(z 0 , e−n )) : ≥ # Nˆ s n, p −1 (B(x0 , e−n )) ∩ Γˆ · inf{#N u n, z, Pan z ∈ p −1 (B(x0 , e−n )) ∩ Γˆ }.
(3.13)
0 (z) ∩ p −1 (B(z , e−n )) for z ∈ Γˆ . By f) of Sect. 3.1.3, Now we estimate #N u n, z, Pan 0 we have 0 0 (z) ∩ B u (z, e−n ) ⊂ ξ u (z) ∩ Pan (z) ∩ p −1 (B(z 0 , e−n )). ξ u (z) ∩ Pan
Therefore by (3.5), we have
0 (z) ∩ p −1 (B(z 0 , e−n )) #N u n, z, Pan ξu 0 μz Pan (z) ∩ B u (z, e−n ) ∩ Γ u ≥ ξ 0 (z) ∩ p −1 (B(z , e−n )) max μz (R) : R ∈ N u n, z, Pan 0 ≥ e−n(δ
· C −1 ean(h−ε) ξu ≥ C −1 ean(h−2ε) · μ ( B u (x, e−n )), u +ε)
x
(3.14)
where the last inequality holds by c) of the choice of x in Sect. 3.1.3 using the fact that a ≥ 2 (cf. d) there).
As to # Nˆ s n, p −1 (B(x0 , e−n )) ∩ Γˆ , we can follow the same line as in the proof of Lemma 3.9 to show
# Nˆ s n, p −1 (B(x0 , e−n )) ∩ Γˆ
≥ N s n, x, p −1 (B(x0 , e−n )) ∩ Γˆ · C −2 e−2anε . (3.15) Furthermore, we have by Lemma 3.5 and b) of Sect. 3.1.3 that
#N s n, x, p −1 (B(x0 , e−n )) ∩ Γˆ
ξs μx p −1 (B(x0 , e−n )) ∩ P(x) ∩ Γˆ s
≥ ξ max μx (R) : R ∈ N s n, x, p −1 (B(x0 , e−n )) ∩ Γˆ
ξs ≥ μx p −1 (B(x0 , e−n )) ∩ P(x) ∩ Γˆ · C −1 ean(h−ε) ≥
1 an(h−ε) ξ s s e · μx0 (B (x0 , e−n )), 8C 2
(3.16)
where the last inequality holds if we pick $x$ such that Lemma 3.5 holds for $\hat\Gamma$. Putting the inequalities (3.12), (3.13), (3.14), (3.15), and (3.16) together, we obtain (3.11). Now we have by Lemmas 3.6, 3.7 and 3.8 and inequalities (3.9) and (3.10) that, for $x \in \Gamma^{\varepsilon}$ and $n \ge \hat n$,
\[
\mu(B(x_0, e^{-n})) \le \mu^{\xi^s}_{x_0}(B^s(x_0, 4e^{-n})) \cdot \mu^{\xi^u}_{x}(B^u(x, 4e^{-n})) \cdot 8C^5 e^{14an\varepsilon}.
\]
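The constant in this last estimate can be traced by chaining the cited bounds. The following is a sketch of that bookkeeping, assuming the statements of Lemmas 3.6, 3.7 and 3.8 and of (3.9), (3.10) as given above; the shorthand $N$, $\hat N^s$, $\hat N^u$, $N^s$, $N^u$ is introduced here only for readability and is not notation from the text.

```latex
% Shorthand (an assumption of this sketch):
%   N         = \#\mathcal{N}(n, p^{-1}(B(x_0,e^{-n})) \cap \Gamma)
%   \hat N^s  = \#\hat{\mathcal{N}}^s(n, p^{-1}(B(x_0,e^{-n})) \cap \Gamma), likewise \hat N^u
%   N^s       = \#\mathcal{N}^s(n, x, p^{-1}(B(x_0,e^{-n}))),               likewise N^u
\begin{align*}
\mu(B(x_0,e^{-n}))
 &\le N \cdot 8C^{2} e^{-2an(h-\varepsilon)}
   && \text{(Lemma 3.8)} \\
 &\le \hat N^{s}\, \hat N^{u} \cdot 8C^{2} e^{-2an(h-\varepsilon)}
   && \text{(Lemma 3.6)} \\
 &\le N^{s} N^{u} \cdot e^{10an\varepsilon} \cdot 8C^{2} e^{-2an(h-\varepsilon)}
   && \text{((3.9), (3.10))} \\
 &\le \mu^{\xi^s}_{x_0}(B^s(x_0,4e^{-n}))\, \mu^{\xi^u}_{x}(B^u(x,4e^{-n}))
      \cdot C^{2} e^{2an(h+\varepsilon)} \cdot e^{10an\varepsilon} \cdot 8C^{2} e^{-2an(h-\varepsilon)}
   && \text{(Lemma 3.7)} \\
 &= \mu^{\xi^s}_{x_0}(B^s(x_0,4e^{-n}))\, \mu^{\xi^u}_{x}(B^u(x,4e^{-n}))
      \cdot 8C^{4} e^{14an\varepsilon} .
\end{align*}
% Exponent check: 2an(h+\varepsilon) - 2an(h-\varepsilon) + 10an\varepsilon = 14an\varepsilon.
```

Since $C > 1$, the factor $8C^4$ obtained this way is bounded by the $8C^5$ appearing in the estimate.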
This together with (3.11) proves i) of Theorem 2.1. The equality in ii) of Theorem 2.1 follows immediately from the inequalities of i). As to the case when μ is not ergodic, one can pick up the set of points Γ of (M, θ ) of measure ≥ 1 − ε4 and for every x ∈ Γ , a number h(x0 ) such that a)–f) of Sect. 3.1.3 hold if h is replaced by h(x0 ), δ s by δ s (x0 ) and δ u by δ u (x0 ). Then fix ι > 0 and consider the sets Γ (x) = {y ∈ M : |h(x) − h(y)| < ι, |δ s (x) − δ s (y)| < ι, |δ u (x) − δ u (y)| < ι}.
The collection of these sets covers p(Γ ). Moreover, there exists a countable sub collection {Γ i }i∈N which still covers p(Γ ). Let μi be the conditional measure generated by μ on Γ i . Following the above argument for p −1 (Γ i ) ∩ Γ and μi , we can show for almost every x ∈ Γ i , d(μi , x) ≥ δ s (x) + δ u (x) − cι, d(μi , x) ≤ δ s (x) + δ u (x) + cι, where d(μi , x) and d(μi , x) are the lower and upper pointwise dimension of the measure μi and c does not depend on x or ι. Letting ι go to zero yields that for μ-a.e. x ∈ M, d(μ, x) = δ s (x) + δ u (x). Proof of Theorem 2.2. It is an immediate consequence of Theorem 2.1 using (1.3) for endomorphism (cf. [25]) and (1.9). 4. Volume Lemma and Lyapunov Dimension of Measures When an f -ergodic measure μ is not hyperbolic, let δ c denote the multiplicity of its zero Lyapunov exponent. To see Theorem 2.3, i.e., the relation between the local dimension of μ with its Lyapunov dimension, we first have Lemma 4.1. Let f be a C 2 non-invertible but non-degenerate endomorphism on M preserving an f -ergodic Borel probability measure μ. Then d(μ, x) ≤ δ s + δ c + δ u , for μ − a.e. x. Proof. Let 0 < ε < 1 be given sufficiently small. Let P and Γ be as obtained in Sect. 3.1.3 such that for x ∈ Γ , all the properties there hold except for an p(Pan (x)) ⊂ B(x0 , e−n ) for n ≥ n 0 . ηu
Put ηu = ξ u ∨ P + and let {μx } be a system of conditional measures associated with ηu . Then by Lemma 12.4.1 and Lemma 12.1.2 of [12], we have that at μ-a.e. x, 1 ηu lim − log μx (P0n (x)) = h μ (θ, P) and n→∞ n ηu
log μx ( B u (x, ρ)) ≤ δu . lim sup log ρ ρ→o Hence we may assume that for a point x ∈ Γ and n ≥ n 0 , ηu
μx (P0n (x)) ≤ e−n(h−ε) and ηu
μx ( B u (x, e−n )) ≥ e
−n(δ u +ε)
.
(4.1) (4.2)
Furthermore, by using the properties of the Lyapunov metric in M, we can argue as in [12] to choose a sequence of partitions {Qn }n≥n 0 refining P such that
an and for x ∈ Γ, g) For n ≥ n 0 , we have Qn > Pan −n i) diam p(Qn (x)) ≤ 2e , c an (x)). ii) μ(Qn (x)) ≥ C −1 e−nε · e−nδ · μ(Pan
Then we proceed to pick up the “density points” as in Lemma 3.5. Let Γˆ ⊂ Γ and nˆ be there such that the lemma holds for Γ above. Note that the projection map p restricted on each element of ξ u is injective. Hence the same argument as in [25, Lemma 3.1] gives that for μ-a.e. x ∈ Γ , η μ ( B u (x, ρ) ∩ Γ ) lim x ηu = 1. ρ→0 μ ( B u (x, ρ)) u
x
ˆ So, we may assume that for x ∈ Γˆ and n ≥ n, ηu
μx ( B u (x, e−n ) ∩ Γ ) ≥
1 ηu u μ ( B (x, e−n )). 2 x
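The remaining steps are exactly as in [12]. Indeed, pick $x \in \hat\Gamma$ and set
\[
\delta = \limsup_{n \to \infty} -\frac{1}{n} \log \mu(B(x_0, 4e^{-n})).
\]
There exist infinitely many $n \ge \max\{n_0, \hat n\}$ such that
\[
\mu(p^{-1} B(x_0, 4e^{-n})) = \mu(B(x_0, 4e^{-n})) \le e^{-n(\delta - \varepsilon)}.
\]
Fix such $n$, assuming $16C^2 \le e^{n\varepsilon}$. Consider the number
\[
N = \#\{\text{atoms of } \mathcal{P}^{an}_{an} : \mathcal{P}^{an}_{an} \text{ intersecting } \Gamma \cap p^{-1} B(x_0, 2e^{-n})\}.
\]
We have by b) of Sect. 3.1.3 and ii) of g) that
\[
N \le C e^{n\varepsilon} \cdot e^{n\delta^c} \cdot e^{2an(h+\varepsilon)} \cdot e^{-n(\delta - \varepsilon)}. \tag{4.3}
\]
On the other hand, it is clear that
\[
N \ge \#\{\text{atoms of } \mathcal{P}^{an}_{an} : \mathcal{P}^{an}_{an} \text{ intersecting } \Gamma \cap p^{-1} B(x_0, e^{-n})\}. \tag{4.4}
\]
Note that each atom of $\mathcal{P}^{an}_{an}$ is an intersection of a unique pair from $\mathcal{P}^{an}_{0}$ and $\mathcal{P}^{0}_{an}$. For a lower bound in (4.4), we first estimate using c) and f) of Sect. 3.1.3 that
\[
\#\{\text{atoms of } \mathcal{P}^{0}_{an} : \mathcal{P}^{0}_{an} \text{ intersecting } \Gamma \cap p^{-1} B(x_0, e^{-n})\} \ge \frac{1}{8C}\, e^{-n(\delta^s + \varepsilon)} \cdot e^{an(h - \varepsilon)}.
\]
Fix one of these atoms $P_u$ (of $\mathcal{P}^{0}_{an}$) and choose $y \in P_u \cap \Gamma \cap p^{-1} B(x_0, e^{-n})$. Then for any $z \in P_u \cap \Gamma \cap p^{-1} B(x_0, e^{-n})$, we have by (4.1) that
\[
\mu_y^{\eta^u}(\mathcal{P}^{an}_{0}(z)) = \mu_z^{\eta^u}(\mathcal{P}^{an}_{0}(z)) \le e^{-an(h - \varepsilon)}.
\]
Denote by $n(X)$ the number of atoms of $\mathcal{P}^{an}_{0}$ intersecting the set $X \cap \Gamma \cap p^{-1} B(x_0, e^{-n})$. Then we have by c) and f) of Sect. 3.1.3 and (4.2) that
\[
n(\mathcal{P}^{an}_{0}(y)) \ge \frac{1}{2}\, e^{-n(\delta^u + \varepsilon)} \cdot e^{an(h - \varepsilon)}.
\]
Therefore, we have
\[
N \ge \sum_{\{P_u :\, P_u \cap \Gamma \cap p^{-1} B(x_0, e^{-n}) \ne \emptyset\}} n(P_u) \ge \frac{1}{16C}\, e^{-n(\delta^s + \delta^u + 2\varepsilon)} \cdot e^{2an(h - \varepsilon)}.
\]
Comparing this with (4.3) gives $\delta \le \delta^u + \delta^c + \delta^s + (5 + 4a)\varepsilon$. The conclusion follows since $\varepsilon > 0$ is arbitrary.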
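The comparison in this last step can be unpacked as follows. This is a sketch that uses only (4.3), the lower bound for $N$ just obtained, and the standing assumption $16C^2 \le e^{n\varepsilon}$.

```latex
\begin{align*}
\frac{1}{16C}\, e^{-n(\delta^s+\delta^u+2\varepsilon)}\, e^{2an(h-\varepsilon)}
  \;\le\; N
  &\;\le\; C\, e^{n\varepsilon}\, e^{n\delta^c}\, e^{2an(h+\varepsilon)}\, e^{-n(\delta-\varepsilon)} \\[2pt]
\Longrightarrow\qquad
e^{-n(\delta-\varepsilon)}
  &\;\ge\; \frac{1}{16C^{2}}\, e^{-n(\delta^s+\delta^c+\delta^u+3\varepsilon)}\, e^{-4an\varepsilon} \\
  &\;\ge\; e^{-n(\delta^s+\delta^c+\delta^u+4\varepsilon+4a\varepsilon)}
  \qquad \text{(using } 16C^{2} \le e^{n\varepsilon}\text{)} .
\end{align*}
% Taking logarithms and dividing by -n reverses the inequality:
\[
\delta - \varepsilon \;\le\; \delta^s+\delta^c+\delta^u + 4\varepsilon + 4a\varepsilon
\quad\Longrightarrow\quad
\delta \;\le\; \delta^s+\delta^c+\delta^u + (5+4a)\varepsilon .
\]
```

The entropy terms $e^{2anh}$ cancel between the two bounds, which is why only multiples of $\varepsilon$ survive in the error.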
We remark that for a general invariant probability measure $\mu$, a slight modification of the above proof as in [12] (by dividing $M$ into countably many invariant sets, on each of which the relevant functions are more or less constant) will give
\[
d(\mu, x) \le \delta^s(x) + \delta^c(x) + \delta^u(x), \quad \mu\text{-a.e.},
\]
where $\delta^c(x)$ is the multiplicity of the zero Lyapunov exponent at $x$. The proof is omitted since we will not use this general formula.
Let $\mu$ be an $f$-ergodic probability measure on $M$. It is true by the entropy theories of [25,31] that there are partial dimensions $\{\gamma_i\}_{i=1}^{r}$ such that the following properties hold:
i) $0 \le \gamma_i \le m_i$ for $i = 1, 2, \ldots, r$,
ii) $\gamma_i = m_i$ if $\lambda_i = 0$,
iii) $\delta^s = \Sigma_{\lambda_i < 0}\, \gamma_i$, $\delta^u = \Sigma_{\lambda_i > 0}\, \gamma_i$,
iv) $\Sigma_{i=1}^{r} \lambda_i \gamma_i = F_\mu(f) := F_\mu$.
Proposition 4.2. Let μ be as in Lemma 4.1 with partial dimensions {γi }ri=1 . Then r
γi ≤ dim L (μ)
(4.5)
i=1
with equality attained if and only if the partial dimensions {γ j } satisfy ⎧ if j < jc ; ⎪ ⎨γj = m j, Condition I: ∃ jc such that 0 < γ j ≤ m j , if j = jc ; ⎪ ⎩ γ j = 0, if j > jc . Proof. We first show the inequality (4.5). If Σλi >0 λi m i ≤ Fμ , we have by iv) above that Fμ =
r i=1
λi γi ≤
λi m i ≤ Fμ ,
λi >0
which immediately implies γi = m i for λi > 0 and γi = 0 for λi < 0 and hence K λ m > F . the inequality (4.5). Next, let K be the largest integer such that Σi=1 i i μ
92
L. Shu
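If $\sum_{i=1}^K m_i = \dim M$, then (4.5) holds trivially. Otherwise, we have $\lambda_{K+1} < 0$. Clearly, we have by i) that
$$\sum_{i=1}^r (\lambda_i - \lambda_{K+1})\gamma_i \le \sum_{i=1}^K (\lambda_i - \lambda_{K+1}) m_i. \qquad (4.6)$$
By iv), we have
$$-\lambda_{K+1}\sum_{i=1}^r \gamma_i \le -\lambda_{K+1}\sum_{i=1}^K m_i + \sum_{i=1}^K \lambda_i m_i - F_\mu.$$
Dividing each side of it by $-\lambda_{K+1}$ gives
$$\sum_{i=1}^r \gamma_i \le \sum_{i=1}^K m_i + \frac{\sum_{i=1}^K \lambda_i m_i - F_\mu}{-\lambda_{K+1}} = \dim_L(\mu).$$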
Suppose Condition I holds. If $j_c$ is such that $\lambda_{j_c} = 0$, then $\sum_i \gamma_i = \dim_L(\mu)$ holds by the first case of the definition of $\dim_L(\mu)$. Otherwise, we have by iv) that
$$\gamma_{j_c} = \frac{\sum_{j=1}^{j_c-1} \lambda_j m_j - F_\mu}{-\lambda_{j_c}},$$
and hence $\sum_i \gamma_i = \dim_L(\mu)$ with $K = j_c - 1$ in the definition of $\dim_L(\mu)$. Conversely, if the first case in the definition of $\dim_L(\mu)$ happens, then Condition I holds by the argument in the first paragraph. Otherwise, we have by (4.6) that the equality in (4.5) implies $\gamma_j = 0$ for $j > K+1$ and $\gamma_j = m_j$ for $j < K+1$. Then we have by iv) that
$$-\lambda_{K+1}\gamma_{K+1} = \sum_{j=1}^K \lambda_j\gamma_j - F_\mu,$$
which implies $\gamma_{K+1} > 0$ by our choice of K in the definition of $\dim_L(\mu)$.
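The second case of the proof can be checked numerically. The following is a minimal sketch, assuming strictly decreasing exponents and the case $1 \le K < r$ only; the function names and the sample values of the exponents, multiplicities and $F_\mu$ are hypothetical and serve purely as an illustration. It evaluates the right-hand side of the bound obtained from (4.6) and builds the partial dimensions prescribed by Condition I with $j_c = K+1$.

```python
from typing import List, Sequence, Tuple


def _largest_K(lams: Sequence[float], mults: Sequence[int], F: float) -> Tuple[int, float]:
    """Return (K, S_K), where K is the largest index with S_K = sum_{i<=K} lam_i*m_i > F."""
    K, S_K, running = 0, 0.0, 0.0
    for idx, (lam, m) in enumerate(zip(lams, mults), start=1):
        running += lam * m
        if running > F:
            K, S_K = idx, running
    if K == 0 or K == len(lams):
        # The remaining cases of the definition are not handled in this sketch.
        raise ValueError("this sketch only covers the case 1 <= K < r")
    return K, S_K


def lyapunov_dimension(lams: Sequence[float], mults: Sequence[int], F: float) -> float:
    """sum_{i<=K} m_i + (sum_{i<=K} lam_i*m_i - F) / (-lam_{K+1})."""
    K, S_K = _largest_K(lams, mults, F)
    return sum(mults[:K]) + (S_K - F) / (-lams[K])  # lams[K] is lambda_{K+1} (0-based list)


def condition_one_gammas(lams: Sequence[float], mults: Sequence[int], F: float) -> List[float]:
    """Partial dimensions satisfying Condition I with j_c = K + 1."""
    K, S_K = _largest_K(lams, mults, F)
    gammas = [float(m) for m in mults[:K]]      # gamma_j = m_j for j < j_c
    gammas.append((S_K - F) / (-lams[K]))       # 0 < gamma_{j_c} <= m_{j_c}
    gammas.extend(0.0 for _ in mults[K + 1:])   # gamma_j = 0 for j > j_c
    return gammas


# Hypothetical data: three simple exponents and F_mu = 0.25.
lams, mults, F = [1.0, -0.5, -2.0], [1, 1, 1], 0.25
gammas = condition_one_gammas(lams, mults, F)
print(lyapunov_dimension(lams, mults, F), gammas)
```

In this example the partial dimensions sum to the computed Lyapunov dimension and satisfy property iv), which is the equality case described in the proposition.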
As a consequence of Lemma 4.1 and Proposition 4.2, we have Theorem 2.3. Moreover, for μ as there, we have
$$d(\mu, x) \le \dim_L(\mu), \quad \mu\text{-a.e.},$$
where equality holds only if μ is SRB. (Here we note that $\gamma_j = m_j$ for j with $\lambda_j > 0$ implies the SRB property (cf. [25,28]).) Furthermore, in case μ has no zero Lyapunov exponent, we have $\dim\mu = \dim_L(\mu)$ if and only if Condition I holds.

5. Dimension Formula for Random Endomorphisms

In this section, we show in a random setting that the dimension of a hyperbolic ergodic measure coincides with its Lyapunov dimension. The corresponding dimension theories, which can be obtained in complete parallel with the deterministic case, will be mentioned without proof for conciseness.
5.1. The proofs of Theorem 2.4 and Theorem 2.6. We begin with some preparations concerning the structure of local stable manifolds and dimension properties of sample measures. The notations in the second part of Sect. 2 will be retained.

5.1.1. Properties of local stable manifolds. For j with $\lambda_j < 0$, put $E^j(w,x) = V^{(j)}(w,x)$ and let $F^j(w,x) = E^j(w,x)^\perp$ be the orthogonal complement of $E^j(w,x)$ in $T_xM$. Given ε > 0, there exist positive constants $C_0, \alpha, D_0, \beta, E_0, \delta_0, \delta_1$ and a measurable set $\Lambda = \Lambda(C_0, \alpha, D_0, \beta, E_0, \delta_0, \delta_1) \subset \Omega \times M$ such that the following five properties hold (cf. [13]):
i) Λ depends only on x and $w_n$, $n \ge 0$, and $\mu^*(\Lambda) \ge 1 - \varepsilon$.
ii) Let j be such that $\lambda_j < 0$. For $(w,x) \in \Lambda$ and $n \ge 0$,
$$v \in E^j(w,x) \Rightarrow |T_x f_w^n v| \le C_0 e^{n(\lambda_j+\varepsilon)}|v|, \qquad v \in F^j(w,x) \Rightarrow |T_x f_w^n v| \ge C_0^{-1} e^{n(\lambda_{j-1}-\varepsilon)}|v|.$$
iii) Let j be such that $\lambda_j < 0$. For each $(w,x) \in \Lambda$, there is a $C^1$ embedded connected $\sum_{i\ge j} m_i$-dimensional disk $W_\alpha^{s,j}(w,x)$ such that
 a) $W_\alpha^{s,j}(w,x) = \{y \in V^{s,j}(w,x) : d^{s,j}_{(w,x)}(y,x) \le \alpha\}$.
 b) $\exp_x^{-1} W_\alpha^{s,j}(w,x)$ is part of the graph of a function $g_{w,x} : E^j(w,x) \to F^j(w,x)$ satisfying 1) $g_{w,x}0 = 0$, 2) $T_0 g_{w,x} = 0$, 3) $|T g_{w,x}| \le 1/1000$, 4) $\mathrm{Lip}(T g_{w,x}) \le C_0$.
 c) If $z_1, z_2 \in W_\alpha^{s,j}(w,x)$, then $d^{s,j}_{\theta^n(w,x)}(f_w^n z_1, f_w^n z_2) \le C_0 e^{n(\lambda_j+\varepsilon)} d^{s,j}_{(w,x)}(z_1, z_2)$ for all $n \ge 0$.
iv) Let $\Lambda_w = \{x \in M : (w,x) \in \Lambda\}$. Then for each w with $\Lambda_w$ non-empty, the map $x \mapsto E^j(w,x)$ is locally Hölder continuous on the set $W_\alpha^{s,j}(\Lambda_w) = \cup_{x\in\Lambda_w} W_\alpha^{s,j}(w,x)$ with exponent β, i.e., for all $z_1, z_2 \in W_\alpha^{s,j}(\Lambda_w)$ with $d(z_1,z_2) \le \delta_0$,
$$d(E^j(w,z_1), E^j(w,z_2)) \le D_0\, d(z_1,z_2)^\beta.$$
v) Let w be such that $\Lambda_w$ is non-empty. For $x \in \Lambda_w$, let $T_1$ and $T_2$ be $\exp_x$ images of small disks parallel to $F^j(w,x)$ and at a distance smaller than $\delta_1$ from $F^j(w,x)$. Then the map ψ from $T_1 \cap W_\alpha^{s,j}(\Lambda_w \cap B(x,\delta_0))$ to $T_2$ by sliding along $W_\alpha^{s,j}$-leaves is absolutely continuous with $|\mathrm{Jac}(\psi)| \le E_0$.

For explicit definitions and proofs of Hölder continuity of subbundles and the absolute continuity of the map ψ, we refer the reader to [14] and [7].
5.1.2. Dimension properties of sample measures. Let χ(M, ν; μ) be as in Sect. 2. Parallel to the entropy theories of [25,31] and Lemma 4.1, we obtain partial dimensions $\{\gamma_i\}_{i=1}^r$ for sample measures $\{\mu_w\}_{w\in\Omega}$ such that
i) $0 \le \gamma_i \le m_i$ for $i = 1, 2, \ldots, r$,
ii) $\gamma_i = m_i$ if $\lambda_i = 0$,
iii) $\sum_{i=1}^r \lambda_i\gamma_i = F_\mu(\chi) := F_\mu$,
iv) $\limsup_{\rho\to 0} \log\mu_w(B(x,\rho)) / \log\rho \le \sum_{i=1}^r \gamma_i$ for $\mu^*$-a.e. (w, x),
v) if $\lambda_j < 0$, then $\sum_{i\ge j} \gamma_i$ is the dimension of the conditional measure of $\mu_w$ on (a measurable partition subordinate to) $W^{s,j}$ manifolds.
We may assume the integer K in the definition of $\dim_L(\mu)$ exists. Otherwise, as in the proof of Proposition 4.2, we obtain $\gamma_i = m_i$ for $\lambda_i > 0$ and $\gamma_i = 0$ for $\lambda_i < 0$. Hence if μ is hyperbolic, we have by a parallel result of Theorem 2.1 in our random setting that
$$\dim\mu_w = \sum_{i=1}^r \gamma_i = \dim_L(\mu), \quad \nu^{\mathbb{Z}}\text{-a.e.}$$
Let Condition I be as proposed in Proposition 4.2. It is also a straightforward corollary of Lemma 4.1 and Proposition 4.2 in the random setting that

Lemma 5.1. Let χ(M, ν; μ) be as in Sect. 2. Then there is σ such that for μ-a.e. x,
$$\sigma := \lim_{\rho\to 0} \frac{\log\mu_w(B(x,\rho))}{\log\rho} \le \dim_L(\mu).$$
If $\lambda_j \neq 0$ for all j, then $\dim(\mu_w) = \dim_L(\mu)$, $\nu^{\mathbb{Z}}$-a.e., if and only if Condition I holds.

5.1.3. Proof of the main results. The idea of the proof of Theorem 2.4 is, as in [13], to introduce a notion of transversal dimension of $\mu_w$ with respect to $W^{s,j}$ for $j = L, L+1$ and to show that these make the predominant contribution to the dimension of $\mu_w$. This will imply Condition I and hence the theorem by Lemma 5.1.

Proof of Theorem 2.4. Let $j = L$ or $L+1$. For μ-a.e. x, let $\rho_x^j$ be the density with respect to Lebesgue of the distribution of $w \mapsto E^j(w,x)$ in the space of $\sum_{i\ge j} m_i$-dimensional planes in $T_xM$. Let ξ > 0 be arbitrarily small. Choose E and $r_0 > 0$ with $r_0 \le \alpha/100, \delta_0$ so that
$$\Sigma := \{(w,x) \in \Lambda : \rho_x^j \le E \ \text{and}\ \mu_w(B(x,\bar r)) \le E\,\bar r^{\,\sigma-\xi},\ \forall\, \bar r \le r_0\}$$
has positive $\mu^*$ measure. For $w \in \Omega$, let $\Sigma_w = \{x \in M : (w,x) \in \Sigma\}$ be the w-section of Σ. Let $\pi^j_{(w,x)}$ be the projection along $W_\alpha^{s,j}$ into $\exp_x F^j(w,x)$. For $(w,x) \in \Sigma$ and $t \in (1/2, 1)$ to be specified later, define
$$\Sigma_w^j(x,\bar r,t) = \{y \in \Sigma_w : d(x,y) \le \bar r^{\,t} \ \text{and}\ d(x, \pi^j_{(w,x)}y) \le \bar r\}.$$
For $(w,x) \in \Sigma$, we define the upper transversal dimension of $\mu_w$ with respect to $W^{s,j}$ as
$$\dim\mu_w(x,\Sigma_w^j) := \limsup_{\bar r\to 0} \frac{\log\mu_w(\Sigma_w^j(x,\bar r,t))}{\log\bar r}.$$
Then exactly the same argument as in [13], using the existence of σ and $\{\gamma_i\}_{i\ge j}$, gives

Sublemma 5.2. For $\mu^*$-a.e. $(w,x) \in \Sigma$, we have
$$\dim\mu_w(x,\Sigma_w^j) \ge d_j + t(\sigma - d_j - \xi) \ \text{if } \sigma - \xi > d_j; \qquad \dim\mu_w(x,\Sigma_w^j) \ge \sigma - \xi \ \text{if } \sigma - \xi < d_j,$$
where we denote $d_j := \sum_{i<j} m_i$.

Sublemma 5.3. Let $t > 1 - \beta$. Then for a set of (w, x) with positive measure in Σ,
$$\dim\mu_w(x,\Sigma_w^j) \le \sigma - (1-t)\sum_{i\ge j}\gamma_i + 3\xi.$$
We show that these two sublemmas imply Condition I and hence $\sigma = \dim_L(\mu)$. Firstly, we can exclude the case
$$\sum_{i=1}^{L-1} \lambda_i m_i > F_\mu \quad\text{and}\quad L = r + 1.$$
Suppose otherwise. Since $\mu \ll L_M$, we have that $\mu_w$ has absolutely continuous measure in the unstable direction $W^u$ (see [25] and [16]), so $m_i = \gamma_i$ for i with $\lambda_i > 0$. Hence
$$\sum_{i=1}^r \lambda_i\gamma_i \ge \sum_{i=1}^{L-1} \lambda_i m_i > F_\mu,$$
which is a contradiction to the fact that $\sum_{i=1}^r \lambda_i\gamma_i = F_\mu$. Next, we have by the above two sublemmas and the arbitrary choice of ξ that
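$$\sigma \le \sum_{i<j}\gamma_i + t\sum_{i\ge j}\gamma_i \ \text{ if } \sigma > d_j; \qquad d_j + t(\sigma - d_j) \le \sum_{i<j}\gamma_i + t\sum_{i\ge j}\gamma_i \ \text{ if } \sigma \le d_j. \qquad (5.1)$$
Recall L is such that $\sum_{i=1}^{L-1} \lambda_i m_i > F_\mu$ and $\sum_{i=1}^{L} \lambda_i m_i \le F_\mu$. From this, we deduce
$$-\lambda_L m_L \ge \sum_{i=1}^{L-1} \lambda_i m_i - F_\mu,$$
and hence we have by Lemma 5.1 that
$$\sigma \le \dim_L(\mu) \le \sum_{i=1}^{L} m_i = d_{L+1}.$$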
Apply (5.1) for j = L + 1 to give σ ≤ i
i≥L
which is (1 − t)
γi ≥ (1 − t)d L ,
i
and hence $\gamma_i = m_i$ for $i \le L - 1$. What is left is to show that $\sigma \le d_L$ is impossible. Suppose otherwise. Then $\gamma_L = 0$. Since $\sum_{i=1}^{L-1} \lambda_i\gamma_i = F_\mu$ gives $m_i = \gamma_i$ for i with $\lambda_i > 0$. Now,
$$\sum_{i=1}^{L-1} \lambda_i m_i \le \sum_{i=1}^{L-1} \lambda_i\gamma_i = F_\mu,$$
which is a contradiction to the choice of $K = L - 1$, which satisfies $\sum_{i=1}^{K} \lambda_i m_i > F_\mu$.
The idea of the proof of Theorem 2.6 is similar to that of Theorem 2.4: we show that a certain kind of transversal dimension of the distribution of $\mu_w$ with respect to $W^{s,j}$, for j with $\lambda_j < 0$, is in accordance with the rule of Condition I.

Proof of Theorem 2.6. First of all, since $f_w$ is assumed to be non-degenerate for ν-a.e. w, we have by compactness of M that there exist $\rho_w^0, \rho_w^1 > 0$ (measurable in w) such that for any $x \in M$, the map $f_w|_{B(x,\rho_w^0)} : B(x,\rho_w^0) \to M$ is a diffeomorphism onto its image, which contains $B(f_w x, \rho_w^1)$. Let $f_{w,x}^{-1} : f_w B(x,\rho_w^0) \to B(x,\rho_w^0)$ denote the local inverse. Fix j with $\lambda_j < 0$. Let ξ > 0 be an arbitrarily small constant. We choose δ and C > 0 with $\delta \le \delta_0, \delta_1, \alpha/100$. (See Sect. 5.1.1 for the definitions of $\delta_0$, $\delta_1$ and α.) Let
$$\Gamma = F^{-1}\Lambda \cap \{(w,x) \in \Lambda : \text{i) } |Df_w| \le C, \ \text{ii) } \mu_w(B(x,\bar r)) \le C\,\bar r^{\,\sigma-\xi},\ \bar r \le \delta, \ \text{iii) } (x, f_{w_0}x) \in G_\xi \text{ and } E_\xi(x, f_{w_0}x) \ge \delta/C\},$$
where $G_\xi$ and $E_\xi$ are given in Sect. 2. For δ sufficiently small and C sufficiently large, we may assume Γ has positive $\mu^*$ measure. We may also assume $\delta < \min_{(w,x)\in\Gamma}\{\rho^0_{w_0}, \rho^1_{w_0}\}$. Next, for $(w,x) \in \Gamma$, let
$$\hat T_{w,x} = f^{-1}_{w_0,x}\big(\exp_{f_{w_0}(x)} F^j(T(w,x))\big).$$
Let $\hat\pi_{w,x}$ denote the projection along $W_\alpha^{s,j}$ leaves onto $\hat T_{w,x}$. For $\bar r > 0$, define
$$\mu^{\hat T}(w,x,\bar r) := \mu_w\{y \in \Gamma_w : d(x,y) \le \delta/C \ \text{and}\ d^{\hat T}(\hat\pi_{w,x}y, x) \le \bar r\},$$
where $d^{\hat T}$ denotes the distance on $\hat T_{w,x}$. Let
$$\tau_j(w,x) = \limsup_{\bar r\to 0} \frac{\log\mu^{\hat T}(w,x,\bar r)}{\log\bar r}.$$
Exactly the same argument as in [13] gives

Sublemma 5.4. For $\mu^*$-a.e. $(w,x) \in \Gamma$,
$$\tau_j(w,x) \ge \sigma - 2\xi \ \text{ if } \sigma - 2\xi < d_j; \qquad \tau_j(w,x) \ge d_j \ \text{ if } \sigma - 2\xi \ge d_j.$$
Sublemma 5.5. For a set of (w, x) of Γ with positive measure, we have
$$\sigma \ge \tau_j(w,x) + \sum_{i\ge j}\gamma_i - \xi.$$

We show that these imply Condition I, and hence the theorem follows by Lemma 5.1. First, we have by i) of Hypothesis B and the properties of stationary measures stated in Sect. 2 that $\mu \ll L_M$. Hence the same reasoning as in the proof of Theorem 2.4 yields that $\sum_{i=1}^r \lambda_i m_i \le F_\mu$. Starting from $j = r + 1$, let L be the first integer $j < r$ such that $\sigma > d_j$. Then $\sigma \le d_{L+1}$. Apply the above two sublemmas for the case $j = L + 1$. We have by the arbitrary choice of ξ that
$$\tau_j(w,x) \ge \sigma \ge \tau_j(w,x) + \sum_{i\ge L+1}\gamma_i,$$
which clearly implies that $\gamma_i = 0$ for $i > L$. Hence we may assume $\lambda_L < 0$. Otherwise, $\gamma_i = 0$ for all i with $\lambda_i < 0$, which is impossible since
$$\sum_{\lambda_i<0} (-\lambda_i)\gamma_i = \sum_{\lambda_i>0} \lambda_i\gamma_i - F_\mu = \sum_{\lambda_i>0} \lambda_i m_i - F_\mu > 0.$$
Now, we apply Sublemmas 5.4 and 5.5 for the case $j = L$ and conclude that
$$\sigma \ge \tau_j(w,x) + \sum_{i\ge L}\gamma_i \ge d_L + \gamma_L.$$
Note that $\sigma = \sum_{i=1}^r \gamma_i$. This forces $\gamma_i = m_i$ for $i < L$. Thus the $\gamma_j$'s satisfy the requirement of Condition I and the theorem holds by Lemma 5.1.
5.2. An application of the results to stochastic flows. The model in this section is taken from Liu [16] (see also [2] and [13]). Consider a random perturbation model introduced in Baladi and Young [2]. Suppose that $f : M \to M$ is a $C^2$ map with no singularities. Consider the case that a particle $x \in M$ jumps to $f(x)$ and then performs a diffusion for the time ε > 0 (see also Kifer [8] for a systematic treatment of this set-up). More precisely, let $X_0, X_1, \ldots, X_d$ be $C^\infty$ vector fields on M, and consider the SDE of Stratonovich type
$$d\xi_t = X_0(\xi_t)\,dt + \sum_{i=1}^d X_i(\xi_t)\circ dB_t^i, \qquad (5.2)$$
where {Bt1 , . . . , Btd }t≥0 is a standard d-dimensional Brownian motion defined on a probability space (W, F, P). Realize the solution of this equation as a stochastic process ξt : (W, F, P) → Diff∞ (M)
which satisfies
i) $\xi_0 = \mathrm{id}$;
ii) for $t_0 < t_1 < \cdots < t_n$, the increments $\xi_{t_i}\circ\xi_{t_{i-1}}^{-1}$ are independent;
iii) for $s < t$, the distribution of $\xi_t\circ\xi_s^{-1}$ depends only on $t - s$;
iv) with probability 1 the stochastic flow $\xi_t$ has continuous sample paths.
(See Kunita [9] for more information.) Now consider the randomly perturbed process generated by compositions of random maps
$$\cdots \circ f_{w_1} \circ f_{w_0} \circ f_{w_{-1}} \circ \cdots,$$
where $\ldots, w_1, w_0, w_{-1}, \ldots \in (W, P)$ are independent and $f_{w_i} = \xi_\varepsilon(w_i)\circ f$. The randomly perturbed process introduced above is just $\chi(M, \nu_\varepsilon)$, where $\nu_\varepsilon$ is the distribution on $C^2(M, M)$ induced by the map
$$\Sigma : (W, F, P) \to C^2(M, M), \quad w \mapsto \xi_\varepsilon(w)\circ f.$$
It was verified in [16] that the probability $\nu_\varepsilon$ satisfies
$$\int \log^+|g|_{C^2}\,\nu_\varepsilon(dg) < +\infty, \qquad \int \log D(g)\,\nu_\varepsilon(dg) > -\infty.$$
For ε > 0, the transition probabilities of χ are given by $P_\varepsilon(x, A) = \nu_\varepsilon\{w : \xi_\varepsilon(w)(f(x)) \in A\}$. When the SDE (5.2) is non-degenerate, i.e., $X_0, \ldots, X_d$ span the tangent space of M, the transition probabilities of χ have a density with respect to Lebesgue measure and hence a χ-stationary measure μ satisfies $\mu \ll \mathrm{Leb}$. Furthermore, as was shown in [13], if the operator $L = \tilde X_0 + \sum_{k=1}^d \tilde X_k^2$ on $C^\infty(\mathrm{Gr}(M))$ is hypoelliptic, where $\tilde X_k$ is the natural lifting of $X_k$ to $\mathrm{Gr}(M)$, $0 \le k \le d$, and in particular if $d \ge \dim M + (\dim M)^2$, then there is an open and dense subset in the space of (d + 1)-tuples of vector fields on M on which Hypothesis A is satisfied. Hence Theorem 2.5 applies to this model.

Acknowledgements. The author is grateful to Professor Peidong Liu for introducing her to this field, for many discussions, and constant encouragement. This work was partially revised during the author's visit to CUHK. She would like to thank Professors Dejun Feng and Kasing Lau for hospitality and valuable comments.
References
1. Arnold, L.: Random Dynamical Systems. Berlin-Heidelberg-New York: Springer-Verlag, 1998
2. Baladi, V., Young, L.-S.: On the spectra of randomly perturbed expanding maps. Commun. Math. Phys. 156, 355–385 (1993)
3. Barreira, L., Pesin, Y., Schmeling, J.: Dimension and product structure of hyperbolic measures. Ann. Math. 149, 755–783 (1999)
4. Eckmann, J.-P., Ruelle, D.: Ergodic theory of chaos and strange attractors. Rev. Mod. Phys. 57(3), 617–656 (1985)
5. Farmer, J., Ott, E., Yorke, J.: The dimension of chaotic attractors. Physica 7D, 153–180 (1983)
6. Frederickson, P., Kaplan, J.-L., Yorke, E.-D., Yorke, J.-A.: The Liapunov dimension of strange attractors. J. Diff. Eqs. 49(2), 185–207 (1983)
7. Katok, A., Strelcyn, J.-M.: Invariant Manifold, Entropy and Billiards; Smooth Maps with Singularities. Lecture Notes in Mathematics 1222, Berlin-Heidelberg-New York: Springer Verlag, 1986
8. Kifer, Y.: Ergodic Theory of Random Transformations. Boston: Birkhäuser, 1986
9. Kunita, H.: Stochastic Flows and Stochastic Differential Equations. Cambridge: Cambridge University Press, 1990
10. Ledrappier, F.: Dimension of invariant measures. In: Proceedings of the conference on ergodic theory and related topics, II (Georgenthal, 1986), Stuttgart: Math. 94, Teubner-Tecte, 1987, pp. 116–124
11. Ledrappier, F., Misiurewicz, M.: Dimension of invariant measures for maps with exponent zero. Ergod. Th. & Dynam. Sys. 5, 595–610 (1985)
12. Ledrappier, F., Young, L.-S.: The metric entropy of diffeomorphisms. I. Characterization of measures satisfying Pesin's entropy formula. Ann. of Math. (2) 122, no. 3, 509–539 (1985); The metric entropy of diffeomorphisms. II. Relations between entropy, exponents and dimension. Ann. of Math. (2) 122, no. 3, 540–574 (1985)
13. Ledrappier, F., Young, L.-S.: Dimension formula for random transformations. Commun. Math. Phys. 117(4), 529–548 (1988)
14. Liu, P.-D., Qian, M.: Smooth Ergodic Theory of Random Dynamical Systems. Lecture Notes in Mathematics, 1606, Berlin: Springer-Verlag, 1995
15. Liu, P.-D., Xie, J.-S.: Dimension of hyperbolic measures of random diffeomorphisms. Trans. Amer. Math. Soc. 358(9), 3751–3780 (2006)
16. Liu, P.-D.: Entropy formula of Pesin type for noninvertible random dynamical systems. Math. Z. 230, 201–239 (1999)
17. Liu, P.-D.: Ruelle inequality relating entropy, folding entropy and negative Lyapunov exponents. Commun. Math. Phys. 240(3), 531–538 (2003)
18. Liu, P.-D.: A note on the relationship of pointwise dimensions of an invariant measure and its natural extension. Arch. Math. (Basel) 83(1), 81–87 (2004)
19. Liu, P.-D.: Invariant measures satisfying an equality relating entropy, folding entropy and negative Lyapunov exponents. Commun. Math. Phys. 284, 391–406 (2008)
20. Mañé, R.: A proof of Pesin's formula. Ergod. Th. & Dynam. Syst. 1, 95–102 (1981)
21. Oseledeč, V.-I.: A multiplicative ergodic theorem: Lyapunov characteristic numbers for dynamical systems. Trans. Moscow Math. Soc. 19, 197–221 (1968)
22. Parry, W.: Entropy and Generators in Ergodic Theory. New York: W. A. Benjamin, Inc., 1969
23. Pesin, Y., Yue, C.: The Hausdorff dimension of measures with non-zero Lyapunov exponents and local product structure. PSU preprint
24. Ruelle, D.: Positivity of entropy production in nonequilibrium statistical mechanics. J. Stat. Phys. 85(1–2), 1–23 (1996)
25. Qian, M., Xie, J.-S.: Entropy formula for endomorphisms: relations between entropy, exponents and dimension. Discr. Cont. Dyn. Syst. 21(2), 367–392 (2008)
26. Qian, M., Xie, J.-S., Zhu, S.: Smooth ergodic theory for endomorphisms. Lecture Notes in Mathematics, 1978, Berlin: Springer-Verlag, 2009
27. Qian, M., Zhang, Z.-S.: Ergodic theory for axiom A endomorphisms. Ergod. Th. & Dynam. Sys. 15, 161–174 (1995)
28. Qian, M., Zhu, S.: SRB measures and Pesin's entropy formula for endomorphisms. Trans. Amer. Math. Soc. 354(4), 1453–1471 (2002)
29. Schmeling, J., Troubetzkoy, S.: Dimension and invertibility of hyperbolic endomorphisms with singularities. Ergod. Th. & Dynam. Sys. 18, 1257–1282 (1998)
30. Schmeling, J.: A dimension formula for endomorphisms-the Belykh family. Ergod. Th. & Dynam. Sys. 18, 1283–1309 (1998)
31. Shu, L.: The metric entropy of endomorphisms. Commun. Math. Phys. 291(2), 491–512 (2009)
32. Young, L.-S.: Dimension, entropy and Lyapunov exponents. Ergod. Th. & Dynam. Sys. 2, 109–124 (1982)
33. Young, L.-S.: Ergodic theory of attractors. In: Proceedings of the International Congress of Mathematicians, (Zürich, Switzerland, 1994), Basel: Birkhäuser, 1995, pp. 1230–1237

Communicated by G. Gallavotti
Commun. Math. Phys. 298, 101–138 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1010-2
Communications in
Mathematical Physics
Mean-Field Dynamics: Singular Potentials and Rate of Convergence

Antti Knowles^1, Peter Pickl^2

1 Theoretische Physik, ETH Hönggerberg, CH-8093 Zürich, Switzerland. E-mail: [email protected]
2 Mathematisches Institut, Universität München, Theresienstr. 39, 80333 München, Germany

Received: 24 July 2009 / Accepted: 7 December 2009
Published online: 19 February 2010 – © Springer-Verlag 2010
Abstract: We consider the time evolution of a system of N identical bosons whose interaction potential is rescaled by $N^{-1}$. We choose the initial wave function to describe a condensate in which all particles are in the same one-particle state. It is well known that in the mean-field limit $N \to \infty$ the quantum N-body dynamics is governed by the nonlinear Hartree equation. Using a nonperturbative method, we extend previous results on the mean-field limit in two directions. First, we allow a large class of singular interaction potentials as well as strong, possibly time-dependent external potentials. Second, we derive bounds on the rate of convergence of the quantum N-body dynamics to the Hartree dynamics.

1. Introduction

We consider a system of N identical bosons in d dimensions, described by a wave function $\Psi_N \in H^{(N)}$. Here $H^{(N)} := L^2_+(\mathbb{R}^{Nd}, dx_1\cdots dx_N)$ is the subspace of $L^2(\mathbb{R}^{Nd}, dx_1\cdots dx_N)$ consisting of wave functions $\Psi_N(x_1, \ldots, x_N)$ that are symmetric under permutation of their arguments $x_1, \ldots, x_N \in \mathbb{R}^d$. The Hamiltonian is given by
$$H_N = \sum_{i=1}^N h_i + \frac{1}{N}\sum_{1\le i<j\le N} w(x_i - x_j), \qquad (1.1)$$
where $h_i$ denotes a one-particle Hamiltonian h (to be specified later) acting on the coordinate $x_i$, and w is an interaction potential. Note the mean-field scaling 1/N in front of the interaction potential, which ensures that the free and interacting parts of $H_N$ are of the same order.
The time evolution of $\Psi_N$ is governed by the N-body Schrödinger equation
$$i\partial_t\Psi_N(t) = H_N\Psi_N(t), \qquad \Psi_N(0) = \Psi_{N,0}. \qquad (1.2)$$
For definiteness, let us consider factorized initial data $\Psi_{N,0} = \varphi_0^{\otimes N}$ for some $\varphi_0 \in L^2(\mathbb{R}^d)$ satisfying the normalization condition $\|\varphi_0\|_{L^2(\mathbb{R}^d)} = 1$. Clearly, because of the interaction between the particles, the factorization of the wave function is not preserved by the time evolution. However, it turns out that for large N the interaction potential experienced by any single particle may be approximated by an effective mean-field potential, so that the wave function $\Psi_N(t)$ remains approximately factorized for all times. In other words we have that, in a sense to be made precise, $\Psi_N(t) \approx \varphi(t)^{\otimes N}$ for some appropriate φ(t). A simple argument shows that in a product state $\varphi(t)^{\otimes N}$ the interaction potential experienced by a particle is approximately $w * |\varphi(t)|^2$, where ∗ denotes convolution. This implies that φ(t) is a solution of the nonlinear Hartree equation
$$i\partial_t\varphi(t) = h\varphi(t) + \big(w * |\varphi(t)|^2\big)\varphi(t), \qquad \varphi(0) = \varphi_0. \qquad (1.3)$$
Let us be a little more precise about what one means by $\Psi_N \approx \varphi^{\otimes N}$ (we omit the irrelevant time argument). One does not expect the $L^2$-distance $\|\Psi_N - \varphi^{\otimes N}\|_{L^2(\mathbb{R}^{Nd})}$ to become small as $N \to \infty$. A more useful, weaker, indicator of convergence should depend only on a finite, fixed^1 number, k, of particles. To this end we define the reduced k-particle density matrix
$$\gamma_N^{(k)} := \mathrm{Tr}_{k+1,\ldots,N}\, |\Psi_N\rangle\langle\Psi_N|,$$
where $\mathrm{Tr}_{k+1,\ldots,N}$ denotes the partial trace over the coordinates $x_{k+1}, \ldots, x_N$, and $|\Psi_N\rangle\langle\Psi_N|$ denotes (in accordance with the usual Dirac notation) the orthogonal projector onto $\Psi_N$. In other words, $\gamma_N^{(k)}$ is the positive trace class operator on $L^2_+(\mathbb{R}^{kd}, dx_1\cdots dx_k)$ with operator kernel
$$\gamma_N^{(k)}(x_1,\ldots,x_k; y_1,\ldots,y_k) = \int dx_{k+1}\cdots dx_N\, \Psi_N(x_1,\ldots,x_N)\, \overline{\Psi_N(y_1,\ldots,y_k,x_{k+1},\ldots,x_N)}.$$
The reduced k-particle density matrix $\gamma_N^{(k)}$ embodies all the information contained in the full N-particle wave function that pertains to at most k particles. There are two commonly used indicators of the closeness $\gamma_N^{(k)} \approx (|\varphi\rangle\langle\varphi|)^{\otimes k}$: the projection
$$E_N^{(k)} := 1 - \langle\varphi^{\otimes k}, \gamma_N^{(k)}\varphi^{\otimes k}\rangle$$
and the trace norm distance
$$R_N^{(k)} := \mathrm{Tr}\,\big|\gamma_N^{(k)} - (|\varphi\rangle\langle\varphi|)^{\otimes k}\big|. \qquad (1.4)$$
It is well known (see e.g. [9]) that all of these indicators are equivalent in the sense that the vanishing of either $R_N^{(k)}$ or $E_N^{(k)}$ for some k in the limit $N \to \infty$ implies that $\lim_N R_N^{(k')} = \lim_N E_N^{(k')} = 0$ for all $k'$. However, the rate of convergence may differ

1 In fact, as shown in Corollary 3.2, k may be taken to grow like o(N).
from one indicator to another. Thus, when studying rates of convergence, they are not equivalent (see Sect. 2 below for a full discussion).

The study of the convergence of $\gamma_N^{(k)}(t)$ in the mean-field limit towards $(|\varphi(t)\rangle\langle\varphi(t)|)^{\otimes k}$ for all t has a history going back almost thirty years. The first result is due to Spohn [13], who showed that $\lim_N R_N^{(k)}(t) = 0$ for all t provided that w is bounded. His method is based on the BBGKY hierarchy,
$$i\partial_t\gamma_N^{(k)}(t) = \sum_{i=1}^k \big[h_i, \gamma_N^{(k)}(t)\big] + \frac{1}{N}\sum_{1\le i<j\le k} \big[w(x_i - x_j), \gamma_N^{(k)}(t)\big] + \frac{N-k}{N}\sum_{i=1}^k \mathrm{Tr}_{k+1}\big[w(x_i - x_{k+1}), \gamma_N^{(k+1)}(t)\big], \qquad (1.5)$$
an equation of motion for the family $(\gamma_N^{(k)}(t))_{k\in\mathbb{N}}$ of reduced density matrices. It is a simple computation to check that the BBGKY hierarchy is equivalent to the Schrödinger equation (1.2) for $\Psi_N(t)$. Using a perturbative expansion of the BBGKY hierarchy, Spohn showed that in the limit $N \to \infty$ the family $(\gamma_N^{(k)}(t))_{k\in\mathbb{N}}$ converges to a family $(\gamma_\infty^{(k)}(t))_{k\in\mathbb{N}}$ that satisfies the limiting BBGKY hierarchy obtained by formally setting $N = \infty$ in (1.5). This limiting hierarchy is easily seen to be equivalent to the Hartree equation (1.3) via the identification $\gamma_\infty^{(k)}(t) = (|\varphi(t)\rangle\langle\varphi(t)|)^{\otimes k}$. We refer to [3] for a short discussion of some subsequent developments.

In the past few years considerable progress has been made in strengthening such results in mainly two directions. First, the convergence $\lim_N R_N^{(k)}(t) = 0$ for all t has been proven for singular interaction potentials w. It is for instance of special physical interest to understand the case of a Coulomb potential, $w(x) = \lambda|x|^{-1}$, where $\lambda \in \mathbb{R}$. The proofs for singular interaction potentials are considerably more involved than for bounded interaction potentials. The first result for the case $h = -\Delta$ and $w(x) = \lambda|x|^{-1}$ is due to Erdős and Yau [3]. Their proof uses the BBGKY hierarchy and a weak compactness argument. In [1], Schlein and Elgart extended this result to the technically more demanding case of a semirelativistic kinetic energy, $h = \sqrt{1 - \Delta}$ and $w(x) = \lambda|x|^{-1}$. This is a critical case in the sense that the kinetic energy has the same scaling behaviour as the Coulomb potential energy, thus requiring quite refined estimates. A different approach, based on operator methods, was developed by Fröhlich et al. in [4], where the authors treat the case $h = -\Delta$ and $w(x) = \lambda|x|^{-1}$. Their proof relies on dispersive estimates and counting of Feynman graphs. Yet another approach was adopted by Rodnianski and Schlein in [12]. Using methods inspired by a semiclassical argument of Hepp [6] focusing on the dynamics of coherent states in Fock space, they show convergence to the mean-field limit in the case $h = -\Delta$ and $w(x) = \lambda|x|^{-1}$.

The second area of recent progress in understanding the mean-field limit is deriving estimates on the rate of convergence to the mean-field limit. Methods based on expansions, as used in [13] and [4], give very weak bounds on the error $R_N^{(1)}(t)$, while weak compactness arguments, as used in [3] and [1], yield no information on the rate of convergence. From a physical point of view, where N is large but finite, it is of some interest to have tight error bounds in order to be able to address the question whether the mean-field approximation may be regarded as valid. The first reasonable estimates on the error were derived for the case $h = -\Delta$ and $w(x) = \lambda|x|^{-1}$ by Rodnianski and Schlein in their work [12] mentioned above. In fact they derive an explicit estimate on
the error of the form
$$R_N^{(k)}(t) \le \frac{C_1(k)}{\sqrt N}\, e^{C_2(k)t}$$
for some constants $C_1(k), C_2(k) > 0$. Using a novel approach inspired by Lieb-Robinson bounds, Erdős and Schlein [2] further improved this estimate under the more restrictive assumption that w is bounded and its Fourier transform integrable. Their result is
$$R_N^{(k)}(t) \le \frac{C_1}{N}\, e^{C_2 k}\, e^{C_3 t},$$
for some constants $C_1, C_2, C_3 > 0$.

In the present article we adopt yet another approach based on a method of Pickl [10]. We strengthen and generalize many of the results listed above, by treating more singular interaction potentials as well as deriving estimates on the rate of convergence. Moreover, our approach allows for a large class of (possibly time-dependent) external potentials, which might for instance describe a trap confining the particles to a small volume. We also show that if the solution φ(·) of the Hartree equation satisfies a scattering condition, all of the error estimates are uniform in time.

The outline of the article is as follows. Section 2 is devoted to a short discussion of the indicators of convergence $E_N^{(k)}$ and $R_N^{(k)}$, in which we derive estimates relating them to each other. In Sect. 3 we state and prove our first main result, which concerns the mean-field limit in the case of $L^2$-type singularities in w; see Theorem 3.1 and Corollary 3.2. In Sect. 4 we state and prove our second main result, which allows for a larger class of singularities such as the nonrelativistic critical case $h = -\Delta$ and $w(x) = \lambda|x|^{-2}$; see Theorem 4.1. For an outline of the methods underlying our proofs, see the beginnings of Sects. 3 and 4.

Notation. Except in definitions, in statements of results and where confusion is possible, we refrain from indicating the explicit dependence of a quantity $a_N(t)$ on the time t and the particle number N. When needed, we use the notations a(t) and $a|_t$ interchangeably to denote the value of the quantity a at time t. The symbol C is reserved for a generic positive constant that may depend on some fixed parameters. We abbreviate $a \le Cb$ with $a \lesssim b$. To simplify notation, we assume that $t \ge 0$. We abbreviate $L^p(\mathbb{R}^d, dx) \equiv L^p$ and $\|\cdot\|_{L^p} \equiv \|\cdot\|_p$. We also set $\|\cdot\|_{L^2(\mathbb{R}^{Nd})} \equiv \|\cdot\|$. For $s \in \mathbb{R}$ we use $H^s \equiv H^s(\mathbb{R}^d)$ to denote the Sobolev space with norm $\|f\|_{H^s} = \|(1 + |k|^2)^{s/2}\hat f\|_2$, where $\hat f$ is the Fourier transform of f.

Integer indices on operators denote particle number: A k-particle operator A (i.e. an operator on $H^{(k)}$) acting on the coordinates $x_{i_1}, \ldots, x_{i_k}$, where $i_1 < \cdots < i_k$, is denoted by $A_{i_1\ldots i_k}$. Also, by a slight abuse of notation, we identify k-particle functions $f(x_1, \ldots, x_k)$ with their associated multiplication operators on $H^{(k)}$. The operator norm of the multiplication operator f is equal to, and will always be denoted by, $\|f\|_\infty$. We use the symbol Q(·) to denote the form domain of a semibounded operator. We denote the space of bounded linear maps from $X_1$ to $X_2$ by $L(X_1; X_2)$, and abbreviate $L(X) = L(X; X)$. We abbreviate the operator norm of $L(L^2(\mathbb{R}^{Nd}))$ by $\|\cdot\|$. For two Banach spaces, $X_1$ and $X_2$, contained in some larger space, we set
$$\|f\|_{X_1+X_2} = \inf_{f = f_1 + f_2} \big(\|f_1\|_{X_1} + \|f_2\|_{X_2}\big), \qquad \|f\|_{X_1\cap X_2} = \|f\|_{X_1} + \|f\|_{X_2},$$
and denote by $X_1 + X_2$ and $X_1 \cap X_2$ the corresponding Banach spaces.
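2. Indicators of Convergence

This section is devoted to a discussion, which might also be of independent interest, of quantitative relationships between the indicators $E_N^{(k)}$ and $R_N^{(k)}$. Throughout this section we suppress the irrelevant index N. Take a k-particle density matrix $\gamma^{(k)} \in L(H^{(k)})$ and a one-particle condensate wave function $\varphi \in L^2$. The following lemma gives the relationship between different elements of the sequence $E^{(1)}, E^{(2)}, \ldots$, where, we recall,
$$E^{(k)} = 1 - \langle\varphi^{\otimes k}, \gamma^{(k)}\varphi^{\otimes k}\rangle. \qquad (2.1)$$

Lemma 2.1. Let $\gamma^{(k)} \in L(H^{(k)})$ satisfy
$$\gamma^{(k)} \ge 0, \qquad \mathrm{Tr}\,\gamma^{(k)} = 1.$$
Let $\varphi \in L^2$ satisfy $\|\varphi\| = 1$. Then
$$E^{(k)} \le k\,E^{(1)}. \qquad (2.2)$$

Proof. Let $(\Phi_i^{(k)})_{i\ge 1}$ be an orthonormal basis of $H^{(k)}$ with $\Phi_1^{(k)} = \varphi^{\otimes k}$. Then
$$\langle\varphi^{\otimes k}, \gamma^{(k)}\varphi^{\otimes k}\rangle = \sum_{i\ge 1}\langle\varphi\otimes\Phi_i^{(k-1)}, \gamma^{(k)}\,\varphi\otimes\Phi_i^{(k-1)}\rangle - \sum_{i\ge 2}\langle\varphi\otimes\Phi_i^{(k-1)}, \gamma^{(k)}\,\varphi\otimes\Phi_i^{(k-1)}\rangle = \langle\varphi, \gamma^{(1)}\varphi\rangle - \sum_{i\ge 2}\langle\varphi\otimes\Phi_i^{(k-1)}, \gamma^{(k)}\,\varphi\otimes\Phi_i^{(k-1)}\rangle.$$
Therefore,
$$\langle\varphi, \gamma^{(1)}\varphi\rangle - \langle\varphi^{\otimes k}, \gamma^{(k)}\varphi^{\otimes k}\rangle = \sum_{i\ge 2}\langle\varphi\otimes\Phi_i^{(k-1)}, \gamma^{(k)}\,\varphi\otimes\Phi_i^{(k-1)}\rangle \le \sum_{i\ge 2}\sum_{j\ge 1}\langle\Phi_j^{(1)}\otimes\Phi_i^{(k-1)}, \gamma^{(k)}\,\Phi_j^{(1)}\otimes\Phi_i^{(k-1)}\rangle$$
$$= \sum_{i\ge 1}\sum_{j\ge 1}\langle\Phi_j^{(1)}\otimes\Phi_i^{(k-1)}, \gamma^{(k)}\,\Phi_j^{(1)}\otimes\Phi_i^{(k-1)}\rangle - \sum_{j\ge 1}\langle\Phi_j^{(1)}\otimes\varphi^{\otimes(k-1)}, \gamma^{(k)}\,\Phi_j^{(1)}\otimes\varphi^{\otimes(k-1)}\rangle = 1 - \langle\varphi^{\otimes(k-1)}, \gamma^{(k-1)}\varphi^{\otimes(k-1)}\rangle.$$
This yields $E^{(k)} \le E^{(k-1)} + E^{(1)}$, and the claim follows.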
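The bound (2.2) can be checked directly on product states $\gamma^{(k)} = (|\psi\rangle\langle\psi|)^{\otimes k}$, for which $E^{(k)} = 1 - (1-\alpha)^k$ with $\alpha = E^{(1)}$. The following is a minimal sketch of that check; the function name and the sample values of α and k are hypothetical and chosen only for illustration.

```python
def product_state_indicator(alpha: float, k: int) -> float:
    """E^(k) for a product state, where alpha = E^(1) = 1 - |<phi, psi>|^2."""
    return 1.0 - (1.0 - alpha) ** k


# Check E^(k) <= k * E^(1) on a small grid of hypothetical values.
for k in (1, 2, 5, 20):
    for alpha in (1e-6, 1e-3, 0.1, 0.5, 0.9):
        assert product_state_indicator(alpha, k) <= k * alpha + 1e-12

# The ratio E^(k) / (k * E^(1)) approaches 1 as alpha -> 0, so the factor k cannot be improved.
for alpha in (1e-2, 1e-4, 1e-6):
    print(alpha, product_state_indicator(alpha, 5) / (5 * alpha))
```

The printed ratios increase towards 1 as α decreases, in line with the statement that the bound is attained asymptotically on product states.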
Remark 2.2. The bound in (2.2) is sharp. Indeed, let us suppose that $E^{(k)} \le k f(k) E^{(1)}$ for some function f. Then
$$f(k) \ge \sup_{\gamma^{(k)}} \frac{E^{(k)}}{k\,E^{(1)}} \ge \sup_{0<\alpha<1} \frac{1 - (1-\alpha)^k}{k\alpha} \ge \lim_{\alpha\to 0} \frac{1 - (1-\alpha)^k}{k\alpha} = 1,$$
where the second inequality follows by restricting the supremum to product states $\gamma^{(k)} = (|\psi\rangle\langle\psi|)^{\otimes k}$ and writing $\alpha = E^{(1)}$.

The next lemma describes the relationship between $E^{(k)}$ and $R^{(k)}$, where, we recall,
$$R^{(k)} = \mathrm{Tr}\,\big|\gamma^{(k)} - (|\varphi\rangle\langle\varphi|)^{\otimes k}\big|.$$

Lemma 2.3. Let $\gamma^{(k)} \in L(H^{(k)})$ be a density matrix and $\varphi \in L^2$ satisfy $\|\varphi\| = 1$. Then
$$E^{(k)} \le R^{(k)}, \qquad (2.3a)$$
$$R^{(k)} \le \sqrt{8\,E^{(k)}}. \qquad (2.3b)$$

Proof. It is convenient to introduce the shorthand $p^{(k)} := (|\varphi\rangle\langle\varphi|)^{\otimes k}$. Thus,
$$E^{(k)} = 1 - \langle\varphi^{\otimes k}, \gamma^{(k)}\varphi^{\otimes k}\rangle = \mathrm{Tr}\big(p^{(k)} - p^{(k)}\gamma^{(k)}p^{(k)}\big) \le \mathrm{Tr}\,\big|p^{(k)} - \gamma^{(k)}\big| = R^{(k)},$$
which is (2.3a). In order to prove (2.3b) it is easiest to use the identity
$$\mathrm{Tr}\,\big|p^{(k)} - \gamma^{(k)}\big| = 2\,\big\|p^{(k)} - \gamma^{(k)}\big\|, \qquad (2.4)$$
valid for any one-dimensional projector $p^{(k)}$ and nonnegative density matrix $\gamma^{(k)}$. This was first observed by Seiringer; see [12]. For the convenience of the reader we recall the proof of (2.4). Let $(\lambda_n)_{n\in\mathbb{N}}$ be the sequence of eigenvalues of the trace class operator $A := \gamma^{(k)} - p^{(k)}$. Since $p^{(k)}$ is a rank one projection, A has at most one negative eigenvalue, say $\lambda_0$. Also, $\mathrm{Tr}\,A = 0$ implies that $\sum_n \lambda_n = 0$. Thus, $\sum_n |\lambda_n| = 2|\lambda_0|$, which is (2.4). Now (2.4) yields
$$R^{(k)} = \mathrm{Tr}\,\big|p^{(k)} - \gamma^{(k)}\big| = 2\,\big\|p^{(k)} - \gamma^{(k)}\big\| \le 2\sqrt{\mathrm{Tr}\big(p^{(k)} - \gamma^{(k)}\big)^2}.$$
Then (2.3b) follows from
$$\mathrm{Tr}\big(p^{(k)} - \gamma^{(k)}\big)^2 = 1 - 2\,\mathrm{Tr}\,p^{(k)}\gamma^{(k)} + \mathrm{Tr}\big(\gamma^{(k)}\big)^2 \le E^{(k)} - \mathrm{Tr}\,p^{(k)}\gamma^{(k)} + 1 = 2E^{(k)}.$$
Alternatively, one may prove (2.3b) without (2.4) by using the polar decomposition and the Cauchy-Schwarz inequality for Hilbert-Schmidt operators.
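Remark 2.4. Up to constant factors the bounds (2.3) are sharp, as the following examples show. Here we drop the irrelevant index $k$. Consider first
$$\varphi = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \gamma = \begin{pmatrix} 1-a & 0 \\ 0 & a \end{pmatrix},$$
where $0 \le a \le 1$. As above we set $p := |\varphi\rangle\langle\varphi|$. One finds
$$E = 1 - \langle\varphi, \gamma\varphi\rangle = a, \qquad R = \mathrm{Tr}\,|p - \gamma| = 2a,$$
so that (2.3a) is sharp up to a constant factor. It is not hard to see that if $\gamma$ and $p$ commute then (2.3b) can be replaced with the stronger bound $R \lesssim E$. In order to show that in general (2.3b) is sharp up to a constant factor, consider
$$\varphi = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \gamma = \begin{pmatrix} 1-a & \sqrt{a-a^2} \\ \sqrt{a-a^2} & a \end{pmatrix},$$
where $0 \le a \le 1$. One readily sees that $\gamma$ is a density matrix (in fact, a one-dimensional projector). A short calculation yields $E = 1 - \langle\varphi, \gamma\varphi\rangle = a$ as well as $\mathrm{Tr}\,|\gamma(1-p)| = \sqrt{a}$. Using $\mathrm{Tr}\,|\gamma(1-p)| = \mathrm{Tr}\,|\gamma - p + p - \gamma p| \le 2\,\mathrm{Tr}\,|p - \gamma|$ we therefore find
$$R = \mathrm{Tr}\,|p - \gamma| \ge \frac{\sqrt{a}}{2} = \frac{\sqrt{E}}{2},$$
as desired.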
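The two examples of Remark 2.4 can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes real symmetric $2\times 2$ matrices, for which the eigenvalues (and hence the trace norm) are available in closed form, and the helper names are purely illustrative. For the second example the eigenvalues of $p - \gamma$ are $\pm\sqrt{a}$, so the sketch finds $R = 2\sqrt{a}$, consistent with the lower bound $\sqrt{E}/2$ and the upper bound $\sqrt{8E}$.

```python
import math

def trace_norm_sym2(a11, a12, a22):
    """Trace norm (sum of |eigenvalues|) of the real symmetric matrix [[a11, a12], [a12, a22]]."""
    mean = 0.5 * (a11 + a22)
    radius = math.hypot(0.5 * (a11 - a22), a12)
    return abs(mean + radius) + abs(mean - radius)

def energy_and_trace_distance(gamma):
    """Return (E, R) for phi = (1, 0), i.e. p = [[1, 0], [0, 0]]."""
    (g11, g12), (_, g22) = gamma
    E = 1.0 - g11                                # E = 1 - <phi, gamma phi>
    R = trace_norm_sym2(1.0 - g11, -g12, -g22)   # R = Tr|p - gamma|
    return E, R

def commuting_example(a):
    """First example of Remark 2.4: gamma commutes with p."""
    return ((1.0 - a, 0.0), (0.0, a))

def projector_example(a):
    """Second example of Remark 2.4: gamma is a rank-one projector."""
    s = math.sqrt(a - a * a)
    return ((1.0 - a, s), (s, a))

# Check E <= R <= sqrt(8 E), i.e. (2.3a) and (2.3b), on both examples.
for a in (0.01, 0.1, 0.5):
    for gamma in (commuting_example(a), projector_example(a)):
        E, R = energy_and_trace_distance(gamma)
        assert E <= R + 1e-12 and R <= math.sqrt(8.0 * E) + 1e-12
```

In the commuting case $R$ scales like $E$, while in the projector case it scales like $\sqrt{E}$, which is the distinction the remark draws.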
3. Convergence for $L^2$-type Singularities
This section is devoted to the case $w \in L^2 + L^\infty$.
3.1. Outline and main result. Our method relies on controlling the quantity
$$\alpha_N(t) := E_N^{(1)}(t). \tag{3.1}$$
To this end, we derive an estimate of the form
$$\dot\alpha_N(t) \le A_N(t) + B_N(t)\,\alpha_N(t), \tag{3.2}$$
which, by Grönwall's Lemma, implies
$$\alpha_N(t) \le \alpha_N(0)\, e^{\int_0^t B_N} + \int_0^t A_N(s)\, e^{\int_s^t B_N}\, ds. \tag{3.3}$$
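As an illustration of how (3.3) is used, consider the special case $A_N = B_N / N$ with $B_N \ge 0$ (this is an assumption made here for the worked example; it is the form that arises in the proof of Theorem 3.1). The integral in (3.3) can then be evaluated in closed form, since $\partial_s e^{\int_s^t B_N} = -B_N(s)\, e^{\int_s^t B_N}$:

```latex
\begin{aligned}
\int_0^t \frac{B_N(s)}{N}\, e^{\int_s^t B_N}\, ds
  &= \frac{1}{N} \Bigl[ -e^{\int_s^t B_N} \Bigr]_{s=0}^{s=t}
   = \frac{1}{N} \Bigl( e^{\int_0^t B_N} - 1 \Bigr), \\
\alpha_N(t)
  &\le \Bigl( \alpha_N(0) + \frac{1}{N} \Bigr) e^{\int_0^t B_N} - \frac{1}{N}
   \le \Bigl( \alpha_N(0) + \frac{1}{N} \Bigr) e^{\int_0^t B_N}.
\end{aligned}
```

The last expression has the same shape as the bound stated in Theorem 3.1, with $\int_0^t B_N$ in place of $\phi(t)$.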
In order to show (3.2), we differentiate $\alpha_N(t)$ and note that all terms arising from the one-particle Hamiltonian vanish. We control the remaining terms by introducing the time-dependent orthogonal projections
$$p(t) := |\varphi(t)\rangle\langle\varphi(t)|, \qquad q(t) := 1 - p(t).$$
We then partition $1 = p(t) + q(t)$ appropriately and use the following heuristics for controlling the terms that arise in this manner. Factors $p(t)$ are used to control singularities of $w$ by exploiting the smoothness of the Hartree wave function $\varphi(t)$. Factors $q(t)$ are expected to yield something small, i.e. proportional to $\alpha_N(t)$, in accordance with the identity $\alpha_N(t) = \langle\Psi_N(t), q_1(t)\Psi_N(t)\rangle$.
For the following it is convenient to rewrite the Hamiltonian (1.1) as
$$H_N = \sum_{i=1}^N h_i + \frac{1}{N}\sum_{1\le i<j\le N} W_{ij} =: H_N^0 + H_N^W, \tag{3.4}$$
where $W_{ij} := w(x_i - x_j)$. We may now list our assumptions.
(A1) The one-particle Hamiltonian $h$ is self-adjoint and bounded from below. Without loss of generality we assume that $h \ge 0$. We define the Hilbert space $X_N = Q(H_N^0)$ as the form domain of $H_N^0$ with norm $\|\cdot\|_{X_N} := \|(1 + H_N^0)^{1/2}\,\cdot\,\|$.
(A2) The Hamiltonian (3.4) is self-adjoint and bounded from below. We also assume that $Q(H_N) \subset X_N$.
(A3) The interaction potential $w$ is a real and even function satisfying $w \in L^{p_1} + L^{p_2}$, where $2 \le p_1 \le p_2 \le \infty$.
(A4) The solution $\varphi(\cdot)$ of (1.3) satisfies $\varphi(\cdot) \in C(\mathbb{R}; X_1 \cap L^{q_1}) \cap C^1(\mathbb{R}; X_1^*)$, where $2 \le q_2 \le q_1 \le \infty$ are defined through
$$\frac{1}{2} = \frac{1}{p_i} + \frac{1}{q_i}, \qquad i = 1, 2. \tag{3.5}$$
Here $X_1^*$ denotes the dual space of $X_1$, i.e. the closure of $L^2$ under the norm $\|\varphi\|_{X_1^*} := \|(1+h)^{-1/2}\varphi\|$.
We now state our main result.
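Theorem 3.1. Let $\Psi_{N,0} \in Q(H_N)$ satisfy $\|\Psi_{N,0}\| = 1$, and $\varphi_0 \in X_1 \cap L^{q_1}$ satisfy $\|\varphi_0\| = 1$. Assume that Assumptions (A1) – (A4) hold. Then
$$\alpha_N(t) \le \Bigl(\alpha_N(0) + \frac{1}{N}\Bigr) e^{\phi(t)},$$
where
$$\phi(t) := 32\,\|w\|_{L^{p_1}+L^{p_2}} \int_0^t ds\, \bigl(\|\varphi(s)\|_{q_1} + \|\varphi(s)\|_{q_2}\bigr).$$
We may combine this result with the observations of Sect. 2.
Corollary 3.2. Let the sequence $\Psi_{N,0} \in Q(H_N)$, $N \in \mathbb{N}$, satisfy the assumptions of Theorem 3.1 as well as
$$E_N^{(1)}(0) \lesssim \frac{1}{N}.$$
Then we have
$$E_N^{(k)}(t) \lesssim \frac{k}{N}\, e^{\phi(t)}, \qquad R_N^{(k)}(t) \lesssim \sqrt{\frac{k}{N}}\, e^{\phi(t)/2}.$$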
Remark 3.3. Corollary 3.2 implies that we can control the condensation of $k = o(N)$ particles.
Remark 3.4. Assumption (A3) allows for singularities in $w$ up to, but not including, the type $|x|^{-3/2}$ in three dimensions. In the next section we treat a larger class of interaction potentials.
Remark 3.5. Assumption (A4) is typically verified by solving the Hartree equation in a Sobolev space of high index (see e.g. Sect. 3.2.2). Instead of requiring a global-in-time solution $\varphi(\cdot)$, it is enough to have a local-in-time solution on $[0, T)$ for some $T > 0$.
Remark 3.6. If $\sup_t \phi(t) < \infty$, or in other words if $\|\varphi(t)\|_{q_1}$ and $\|\varphi(t)\|_{q_2}$ are integrable in $t$ over $\mathbb{R}$, then all estimates are uniform in time. This describes a scattering regime where the time evolution is asymptotically free for large times. Such an integrability condition requires large exponents $q_i$, which translates to small exponents $p_i$, i.e. an interaction potential with strong decay.
Remark 3.7. The result easily extends to time-dependent one-particle Hamiltonians $h \equiv h(t)$. Replace (A1) and (A2) with
(A1') The Hamiltonian $h(t)$ is self-adjoint and bounded from below. We assume that there is an operator $h_0 \ge 0$ such that $0 \le h(t) \le h_0$ for all $t$. Define the Hilbert space $X_N = Q\bigl(\sum_i (h_0)_i\bigr)$ as in (A1).
(A2') The Hamiltonian $H_N(t)$ is self-adjoint and bounded from below. We assume that $Q(H_N(t)) \subset X_N$ for all $t$. We also assume that the $N$-body propagator $U_N(t,s)$, defined by
$$i\partial_t U_N(t,s) = H_N(t)\,U_N(t,s), \qquad U_N(s,s) = 1,$$
exists and satisfies $U_N(t,0)\Psi_{N,0} \in Q(H_N(t))$ for all $t$.
It is then straightforward that Theorem 3.1 holds with the same proof.
Remark 3.8. In some cases (see e.g. Sect. 3.2.1 below) it is convenient to modify the assumptions as follows. Replace (A3) and (A4) with
(A3') The interaction potential $w$ is a real and even function satisfying
$$\bigl\|w^2 * |\varphi|^2\bigr\|_\infty \le K\,\|\varphi\|_{X_1}^2 \tag{3.6}$$
for some constant $K > 0$. Without loss of generality we assume that $K \ge 1$.
(A4') The solution $\varphi(\cdot)$ of (1.3) satisfies $\varphi(\cdot) \in C(\mathbb{R}; X_1) \cap C^1(\mathbb{R}; X_1^*)$.
Then Theorem 3.1 and Corollary 3.2 hold with
$$\phi(t) = 32K \int_0^t ds\, \|\varphi(s)\|_{X_1}^2.$$
The proof remains virtually unchanged. One replaces (3.24) with (3.6), as well as (3.20) with
$$\bigl\|w * |\varphi|^2\bigr\|_\infty \le 2K\,\|\varphi\|_{X_1}^2,$$
which is an easy consequence of (3.6).
3.2. Examples. We list two examples of systems satisfying the assumptions of Theorem 3.1.
3.2.1. Particles in a trap. Consider nonrelativistic particles in $\mathbb{R}^3$ confined by a strong trapping potential. The particles interact by means of the Coulomb potential: $w(x) = \lambda|x|^{-1}$, where $\lambda \in \mathbb{R}$. The one-particle Hamiltonian is of the form $h = -\Delta + v$, where $v$ is a measurable function on $\mathbb{R}^3$. Decompose $v$ into its positive and negative parts: $v = v_+ - v_-$, where $v_+, v_- \ge 0$. We assume that $v_+ \in L^1_{\mathrm{loc}}$ and that $v_-$ is $-\Delta$-form bounded with relative bound less than one, i.e. there are constants $0 \le a < 1$ and $0 \le b < \infty$ such that
$$\langle\varphi, v_-\varphi\rangle \le a\,\langle\varphi, -\Delta\varphi\rangle + b\,\langle\varphi, \varphi\rangle. \tag{3.7}$$
Thus $h + b\mathbb{1}$ is positive, and it is not hard to see that $h$ is essentially self-adjoint on $C_c^\infty(\mathbb{R}^3)$. This follows by density and a standard argument using Riesz's representation theorem to show that the equation $(h + (b+1)\mathbb{1})\varphi = f$ has a unique solution $\varphi \in \{\varphi \in L^2 : h\varphi \in L^2\}$ for each $f \in L^2$. It is now easy to see that Assumptions (A1) and (A2) hold with the one-particle Hamiltonian $h + c\mathbb{1}$ for some $c > 0$. Let us assume without loss of generality that $c = 0$. Next, we verify Assumptions (A3') and (A4') (see Remark 3.8). We find
$$\bigl\|w^2 * |\varphi|^2\bigr\|_\infty = \sup_x \int dy\, \frac{\lambda^2}{|x-y|^2}\,|\varphi(y)|^2 \lesssim \langle\varphi, -\Delta\varphi\rangle \lesssim \langle\varphi, h\varphi\rangle + \langle\varphi, \varphi\rangle = \|\varphi\|_{X_1}^2,$$
where the second step follows from Hardy's inequality and translation invariance of $\Delta$, and the third step is a simple consequence of (3.7). This proves (A3').
Next, take $\varphi_0 \in X_1$. By standard methods (see e.g. the presentation of [7]) one finds that (A4') holds. Moreover, the mass $\|\varphi(t)\|^2$ and the energy
$$E^\varphi(t) = \Bigl[\langle\varphi, h\varphi\rangle + \frac{1}{2}\int dx\, dy\; w(x-y)\,|\varphi(x)|^2|\varphi(y)|^2\Bigr]\Big|_t$$
are conserved under time evolution. Using the inequality $|x|^{-1} \le \mathbf{1}_{\{|x|\le\varepsilon\}}\,\varepsilon|x|^{-2} + \mathbf{1}_{\{|x|>\varepsilon\}}\,\varepsilon^{-1}$ and Hardy's inequality one sees that $\|\varphi(t)\|_{X_1}^2 \lesssim E^\varphi(t) + \|\varphi(t)\|^2$, and therefore $\|\varphi(t)\|_{X_1} \le C$ for all $t$. We conclude: Theorem 3.1 holds with $\phi(t) = Ct$.
More generally, the preceding discussion holds for interaction potentials $w \in L^3_w + L^\infty$, where $L^p_w$ denotes the weak $L^p$ space (see e.g. [11]). This follows from a short computation using symmetric-decreasing rearrangements; we omit further details. This example generalizes the results of [3, 12] and [4].
3.2.2. A boson star. Consider semirelativistic particles in $\mathbb{R}^3$ whose one-particle Hamiltonian is given by $h = \sqrt{1 - \Delta}$. The particles interact by means of a Coulomb potential: $w(x) = \lambda|x|^{-1}$. We impose the condition $\lambda > -4/\pi$. This condition is necessary for both the stability of the $N$-body problem (i.e. Assumption (A2)) and the global well-posedness of the Hartree equation. See [7,8] for details. It is well known that Assumptions (A1) and (A2) hold in this case. In order to show (A4) we need some regularity of $\varphi(\cdot)$. To this end, let $s > 1$ and take $\varphi_0 \in H^s$. Theorem 3 of [7] implies that (1.3) has a unique global solution in $H^s$. Therefore Sobolev's inequality implies that (A4) holds with
$$\frac{1}{q_1} = \frac{1}{2} - \frac{s}{3}.$$
Thus $q_1 > 6$, and (A3) holds with appropriately chosen values of $p_1, p_2$. We conclude: Theorem 3.1 holds for some continuous function $\phi(t)$. (In fact, as shown in [7], one has the bound $\phi(t) \lesssim e^{Ct}$.) This example generalizes the result of [1].
3.3. Proof of Theorem 3.1.
3.3.1. A family of projectors. Define the time-dependent projectors
$$p(t) := |\varphi(t)\rangle\langle\varphi(t)|, \qquad q(t) := 1 - p(t).$$
Write
$$1 = (p_1 + q_1)\cdots(p_N + q_N), \tag{3.8}$$
and define $P_k$, for $k = 0, \dots, N$, as the term obtained by multiplying out (3.8) and selecting all summands containing $k$ factors $q$. In other words,
$$P_k = \sum_{a \in \{0,1\}^N :\ \sum_i a_i = k}\ \prod_{i=1}^N p_i^{1-a_i} q_i^{a_i}. \tag{3.9}$$
If $k \notin \{0, \dots, N\}$ we set $P_k = 0$. It is easy to see that the following properties hold:
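(i) $P_k$ is an orthogonal projector,
(ii) $P_k P_l = \delta_{kl} P_k$,
(iii) $\sum_k P_k = 1$.
Next, for any function $f : \{0, \dots, N\} \to \mathbb{C}$ we define the operator
$$\widehat{f} := \sum_k f(k)\,P_k. \tag{3.10}$$
It follows immediately that $\widehat{f}\,\widehat{g} = \widehat{fg}$, and that $\widehat{f}$ commutes with $p_i$ and $P_k$. We shall often make use of the functions
$$m(k) := \frac{k}{N}, \qquad n(k) := \sqrt{\frac{k}{N}}.$$
We have the relation
$$\frac{1}{N}\sum_i q_i = \frac{1}{N}\sum_k \sum_i q_i P_k = \frac{1}{N}\sum_k k\,P_k = \widehat{m}. \tag{3.11}$$
Thus, by symmetry of $\Psi$, we get
$$\alpha = \langle\Psi, q_1\Psi\rangle = \langle\Psi, \widehat{m}\,\Psi\rangle. \tag{3.12}$$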
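The combinatorics behind (3.9) – (3.11) can be illustrated in a commutative toy model. The sketch below is an assumption-laden caricature, not the operators of the paper: each $q_i$ is replaced by multiplication by $a_i \in \{0,1\}$ on configurations $a \in \{0,1\}^N$ and each $p_i$ by multiplication by $1 - a_i$, so that every operator is diagonal and only the counting structure survives. Function names are illustrative.

```python
from itertools import product

N = 4
configs = list(product((0, 1), repeat=N))  # a = (a_1, ..., a_N); a_i = 1 marks a factor q_i

def P(k, a):
    """Diagonal entry of P_k at configuration a: 1 if a contains exactly k factors q."""
    return 1 if sum(a) == k else 0

def hat(f, a):
    """Diagonal entry of f-hat = sum_k f(k) P_k at configuration a, as in (3.10)."""
    return sum(f(k) * P(k, a) for k in range(N + 1))

def m(k):
    return k / N

for a in configs:
    # (iii): the P_k sum to the identity.
    assert sum(P(k, a) for k in range(N + 1)) == 1
    # (ii): P_k P_l = delta_{kl} P_k.
    for k in range(N + 1):
        for l in range(N + 1):
            assert P(k, a) * P(l, a) == (P(k, a) if k == l else 0)
    # (3.11): (1/N) sum_i q_i coincides with m-hat.
    assert abs(sum(a) / N - hat(m, a)) < 1e-12
```

In this picture $\widehat{m}$ simply measures the fraction of particles outside the condensate state, which is the intuition behind (3.12).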
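The correspondence $q_1 \sim \widehat{m}$ of (3.11) yields the following useful bounds.
Lemma 3.9. For any nonnegative function $f : \{0, \dots, N\} \to [0, \infty)$ we have
$$\langle\Psi, \widehat{f}\,q_1\Psi\rangle = \langle\Psi, \widehat{f}\,\widehat{m}\,\Psi\rangle, \tag{3.13}$$
$$\langle\Psi, \widehat{f}\,q_1 q_2\Psi\rangle \le \frac{N}{N-1}\,\langle\Psi, \widehat{f}\,\widehat{m}^2\,\Psi\rangle. \tag{3.14}$$
Proof. The proof of (3.13) is an immediate consequence of (3.11). In order to prove (3.14) we write, using symmetry of $\Psi$ as well as (3.11),
$$\langle\Psi, \widehat{f}\,q_1 q_2\Psi\rangle = \frac{1}{N(N-1)}\sum_{i\ne j}\langle\Psi, \widehat{f}\,q_i q_j\Psi\rangle \le \frac{1}{N(N-1)}\sum_{i,j}\langle\Psi, \widehat{f}\,q_i q_j\Psi\rangle = \frac{N}{N-1}\,\langle\Psi, \widehat{f}\,\widehat{m}^2\,\Psi\rangle,$$
which is the claim.
Next, we introduce the shift operation $\tau_n$, $n \in \mathbb{Z}$, defined on functions $f$ through
$$(\tau_n f)(k) := f(k+n). \tag{3.15}$$
Its usefulness for our purposes is encapsulated by the following lemma.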
Lemma 3.10. Let $r \ge 1$ and $A$ be an operator on $\mathcal{H}^{(r)}$. Let $Q_i$, $i = 1, 2$, be two projectors of the form $Q_i = \#_1 \cdots \#_r$, where each $\#$ stands for either $p$ or $q$. Then
$$Q_1\, A_{1\dots r}\, \widehat{f}\, Q_2 = Q_1\, \widehat{\tau_n f}\, A_{1\dots r}\, Q_2,$$
where $n = n_2 - n_1$ and $n_i$ is the number of factors $q$ in $Q_i$.
Proof. Define
$$P_k^r := \sum_{a \in \{0,1\}^{N-r} :\ \sum_i a_i = k}\ \prod_{i=r+1}^N p_i^{1-a_i} q_i^{a_i}.$$
Then,
$$Q_i\,\widehat{f} = \sum_k f(k)\,Q_i P_k = \sum_k f(k)\,Q_i P^r_{k-n_i} = \sum_k f(k+n_i)\,Q_i P^r_k.$$
The claim follows from the fact that $P_k^r$ commutes with $A_{1\dots r}$.
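3.3.2. A bound on $\dot\alpha$. Let us abbreviate $W^\varphi := w * |\varphi|^2$. From (A3) and (A4) we find $W^\varphi \in L^\infty$ (see (3.20) below). Then $i\partial_t\varphi = (h + W^\varphi)\varphi$, where $h + W^\varphi \in \mathcal{L}(X_1; X_1^*)$. Thus, for any $\psi \in X_1$ independent of $t$ we have
$$i\partial_t\langle\psi, p\,\psi\rangle = \langle\psi, [h + W^\varphi, p]\,\psi\rangle.$$
On the other hand, it is easy to see from (A3) and (A4) that $\widehat{m}\,\Psi \in Q(H)$. Combining these observations, and noting that $\Psi \in Q(H) \subset X$ by (A2), we see that $\alpha$ is differentiable in $t$ with derivative
$$\dot\alpha = i\,\bigl\langle\Psi, [H - H^\varphi, \widehat{m}]\,\Psi\bigr\rangle,$$
where $H^\varphi := \sum_i (h_i + W_i^\varphi)$. Thus,
$$\dot\alpha = i\,\Bigl\langle\Psi, \Bigl[\frac{1}{N}\sum_{i<j} W_{ij} - \sum_i W_i^\varphi,\ \widehat{m}\Bigr]\Psi\Bigr\rangle.$$
By symmetry of $\Psi$ and $\widehat{m}$ we get
$$\dot\alpha = \frac{i}{2}\,\bigl\langle\Psi, \bigl[(N-1)W_{12} - N W_1^\varphi - N W_2^\varphi,\ \widehat{m}\bigr]\Psi\bigr\rangle. \tag{3.16}$$
In order to estimate the right-hand side, we introduce
$$1 = (p_1 + q_1)(p_2 + q_2)$$
on both sides of the commutator in (3.16). Of the sixteen resulting terms only three different types survive:
$$\text{(I)} := \frac{i}{2}\,\bigl\langle\Psi, p_1 p_2 \bigl[(N-1)W_{12} - N W_1^\varphi - N W_2^\varphi,\ \widehat{m}\bigr] q_1 p_2\,\Psi\bigr\rangle,$$
$$\text{(II)} := \frac{i}{2}\,\bigl\langle\Psi, q_1 p_2 \bigl[(N-1)W_{12} - N W_1^\varphi - N W_2^\varphi,\ \widehat{m}\bigr] q_1 q_2\,\Psi\bigr\rangle,$$
$$\text{(III)} := \frac{i}{2}\,\bigl\langle\Psi, p_1 p_2 \bigl[(N-1)W_{12} - N W_1^\varphi - N W_2^\varphi,\ \widehat{m}\bigr] q_1 q_2\,\Psi\bigr\rangle.$$
Indeed, Lemma 3.10 implies that terms with the same number of factors $q$ on the left and on the right vanish. What remains is
$$\dot\alpha = 2\,\text{(I)} + 2\,\text{(II)} + \text{(III)} + \text{complex conjugate}.$$
The remainder of the proof consists in estimating each term.
Term (I). First, we remark that
$$p_2 W_{12}\, p_2 = p_2 W_1^\varphi. \tag{3.17}$$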
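This is easiest to see using operator kernels (we drop the trivial indices $x_3, y_3, \dots, x_N, y_N$):
$$(p_2 W_{12}\, p_2)(x_1, x_2; y_1, y_2) = \int dz\; \varphi(x_2)\,\overline{\varphi(z)}\, w(x_1 - z)\,\delta(x_1 - y_1)\,\varphi(z)\,\overline{\varphi(y_2)} = \varphi(x_2)\,\overline{\varphi(y_2)}\,\delta(x_1 - y_1)\,(w * |\varphi|^2)(x_1).$$
Therefore,
$$\text{(I)} = \frac{i}{2}\,\bigl\langle\Psi, p_1 p_2 \bigl[(N-1)W_1^\varphi - N W_1^\varphi,\ \widehat{m}\bigr] q_1 p_2\,\Psi\bigr\rangle = \frac{-i}{2}\,\bigl\langle\Psi, p_1 p_2 \bigl[W_1^\varphi, \widehat{m}\bigr] q_1 p_2\,\Psi\bigr\rangle.$$
Using Lemma 3.10 we find
$$\text{(I)} = \frac{-i}{2}\,\bigl\langle\Psi, p_1 p_2\, W_1^\varphi\, \bigl(\widehat{m} - \widehat{\tau_{-1} m}\bigr)\, q_1 p_2\,\Psi\bigr\rangle = \frac{-i}{2N}\,\bigl\langle\Psi, p_1 p_2\, W_1^\varphi\, q_1 p_2\,\Psi\bigr\rangle.$$
This gives
$$|\text{(I)}| \le \frac{1}{2N}\,\|W^\varphi\|_\infty = \frac{1}{2N}\,\bigl\|w * |\varphi|^2\bigr\|_\infty.$$
By (A3), we may write
$$w = w^{(1)} + w^{(2)}, \qquad w^{(i)} \in L^{p_i}. \tag{3.18}$$
By Young's inequality,
$$\bigl\|w^{(i)} * |\varphi|^2\bigr\|_\infty \le \|w^{(i)}\|_{p_i}\,\|\varphi\|_{r_i}^2,$$
where $r_1, r_2$ are defined through
$$1 = \frac{1}{p_i} + \frac{2}{r_i}. \tag{3.19}$$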
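Therefore,
$$\bigl\|w * |\varphi|^2\bigr\|_\infty \le \|w^{(1)}\|_{p_1}\|\varphi\|_{r_1}^2 + \|w^{(2)}\|_{p_2}\|\varphi\|_{r_2}^2 \le \bigl(\|w^{(1)}\|_{p_1} + \|w^{(2)}\|_{p_2}\bigr)\bigl(\|\varphi\|_{r_1} + \|\varphi\|_{r_2}\bigr)^2.$$
Taking the infimum over all decompositions (3.18) yields
$$\|W^\varphi\|_\infty = \bigl\|w * |\varphi|^2\bigr\|_\infty \le \|w\|_{L^{p_1}+L^{p_2}}\bigl(\|\varphi\|_{r_1} + \|\varphi\|_{r_2}\bigr)^2. \tag{3.20}$$
Note that (A3) and (A4) imply
$$2 \le r_i \le q_1, \tag{3.21}$$
so that the right-hand side of (3.20) is finite. Summarizing,
$$|\text{(I)}| \le \frac{1}{2N}\,\|w\|_{L^{p_1}+L^{p_2}}\bigl(\|\varphi\|_{r_1} + \|\varphi\|_{r_2}\bigr)^2. \tag{3.22}$$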
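The exponent bookkeeping in (3.5), (3.19) and (3.21) can be checked with exact rational arithmetic. The following is a small sketch under the conventions stated in the comments (in particular, `None` is used here as an ad hoc encoding of the exponent $\infty$; the function names are illustrative). It also verifies the relation $1/r_i = \tfrac12\cdot\tfrac12 + \tfrac12\cdot\tfrac{1}{q_i}$, which is what allows interpolation between $L^2$ and $L^{q_i}$ with equal weights.

```python
from fractions import Fraction

def inverse(p):
    """1/p as an exact fraction; p = None stands for the exponent infinity."""
    return Fraction(0) if p is None else 1 / Fraction(p)

def q_exponent(p):
    """q from (3.5): 1/2 = 1/p + 1/q. Returns None when q is infinite."""
    inv_q = Fraction(1, 2) - inverse(p)
    return None if inv_q == 0 else 1 / inv_q

def r_exponent(p):
    """r from (3.19): 1 = 1/p + 2/r."""
    return 2 / (1 - inverse(p))

for p in (2, 3, 4, 10, None):
    q, r = q_exponent(p), r_exponent(p)
    # Equal-weight interpolation between L^2 and L^q: 1/r = (1/2)(1/2) + (1/2)(1/q).
    assert inverse(r) == Fraction(1, 4) + inverse(q) / 2
    # Ordering 2 <= r <= q for every admissible p >= 2.
    assert 2 <= r and (q is None or r <= q)
```

For instance, $p = 3$ gives $q = 6$ and $r = 3$, while $p = 2$ gives $q = \infty$ and $r = 4$.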
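Term (II). Applying Lemma 3.10 to (II) yields
$$\text{(II)} = \frac{i}{2}\,\bigl\langle\Psi, q_1 p_2 \bigl((N-1)W_{12} - N W_2^\varphi\bigr)\bigl(\widehat{m} - \widehat{\tau_{-1} m}\bigr) q_1 q_2\,\Psi\bigr\rangle = \frac{i}{2}\,\Bigl\langle\Psi, q_1 p_2 \Bigl(\frac{N-1}{N}\,W_{12} - W_2^\varphi\Bigr) q_1 q_2\,\Psi\Bigr\rangle,$$
so that
$$|\text{(II)}| \le \frac{1}{2}\,\bigl|\langle\Psi, q_1 p_2\, W_{12}\, q_1 q_2\,\Psi\rangle\bigr| + \frac{1}{2}\,\bigl|\langle\Psi, q_1 p_2\, W_2^\varphi\, q_1 q_2\,\Psi\rangle\bigr|. \tag{3.23}$$
The second term of (3.23) is bounded by
$$\frac{1}{2}\,\|W^\varphi\|_\infty\,\|q_1\Psi\|^2 \le \frac{1}{2}\,\|w\|_{L^{p_1}+L^{p_2}}\bigl(\|\varphi\|_{r_1} + \|\varphi\|_{r_2}\bigr)^2\,\alpha,$$
where we used the bound (3.20) as well as (3.12). The first term of (3.23) is bounded using Cauchy-Schwarz by
$$\frac{1}{2}\sqrt{\langle\Psi, q_1 p_2\, W_{12}^2\, p_2 q_1\Psi\rangle}\,\sqrt{\langle\Psi, q_1 q_2\Psi\rangle} = \frac{1}{2}\sqrt{\bigl\langle\Psi, q_1 p_2\, (w^2 * |\varphi|^2)_1\, p_2 q_1\Psi\bigr\rangle}\,\sqrt{\langle\Psi, q_1 q_2\Psi\rangle}.$$
This follows by applying (3.17) to $W^2$. Thus we get the bound
$$\frac{1}{2}\,\|q_1\Psi\|^2\,\sqrt{\bigl\|w^2 * |\varphi|^2\bigr\|_\infty} = \frac{1}{2}\,\alpha\,\sqrt{\bigl\|w^2 * |\varphi|^2\bigr\|_\infty}.$$
We now proceed as above. Using the decomposition (3.18) we get
$$\bigl\|w^2 * |\varphi|^2\bigr\|_\infty \le 2\,\bigl\|(w^{(1)})^2 * |\varphi|^2\bigr\|_\infty + 2\,\bigl\|(w^{(2)})^2 * |\varphi|^2\bigr\|_\infty.$$
Then Young's inequality gives
$$\bigl\|(w^{(i)})^2 * |\varphi|^2\bigr\|_\infty \le \|w^{(i)}\|_{p_i}^2\,\|\varphi\|_{q_i}^2,$$
which implies that
$$\bigl\|w^2 * |\varphi|^2\bigr\|_\infty \le 2\,\|w\|_{L^{p_1}+L^{p_2}}^2\bigl(\|\varphi\|_{q_1} + \|\varphi\|_{q_2}\bigr)^2. \tag{3.24}$$
Putting all of this together we get
$$|\text{(II)}| \le \frac{1}{2}\,\|w\|_{L^{p_1}+L^{p_2}}\Bigl(\sqrt{2}\,\bigl(\|\varphi\|_{q_1} + \|\varphi\|_{q_2}\bigr) + \bigl(\|\varphi\|_{r_1} + \|\varphi\|_{r_2}\bigr)^2\Bigr)\,\alpha.$$
Term (III). The final term (III) is equal to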
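$$\frac{i}{2}\,\bigl\langle\Psi, p_1 p_2 \bigl[(N-1)W_{12},\ \widehat{m}\bigr] q_1 q_2\,\Psi\bigr\rangle = \frac{i}{2}\,\bigl\langle\Psi, p_1 p_2\, (N-1)W_{12}\,\bigl(\widehat{m} - \widehat{\tau_{-2} m}\bigr)\, q_1 q_2\,\Psi\bigr\rangle = i\,\frac{N-1}{N}\,\bigl\langle\Psi, p_1 p_2\, W_{12}\, q_1 q_2\,\Psi\bigr\rangle,$$
where we used Lemma 3.10. Next, we note that, on the range of $q_1$, the operator $\widehat{n}^{-1}$ is well-defined and bounded. Thus (III) is equal to
$$i\,\frac{N-1}{N}\,\bigl\langle\Psi, p_1 p_2\, W_{12}\, \widehat{n}\,\widehat{n}^{-1}\, q_1 q_2\,\Psi\bigr\rangle = i\,\frac{N-1}{N}\,\bigl\langle\Psi, p_1 p_2\, \widehat{\tau_2 n}\, W_{12}\, \widehat{n}^{-1}\, q_1 q_2\,\Psi\bigr\rangle,$$
where we used Lemma 3.10 again. We now use Cauchy-Schwarz to get
$$\begin{aligned}
|\text{(III)}| &\le \sqrt{\bigl\langle\Psi, p_1 p_2\, \widehat{\tau_2 n}\, W_{12}^2\, \widehat{\tau_2 n}\, p_1 p_2\,\Psi\bigr\rangle}\ \sqrt{\bigl\langle\Psi, \widehat{n}^{-2}\, q_1 q_2\,\Psi\bigr\rangle} \\
&= \sqrt{\bigl\langle\Psi, p_1 p_2\, \widehat{\tau_2 n}\, (w^2 * |\varphi|^2)_1\, \widehat{\tau_2 n}\, p_1 p_2\,\Psi\bigr\rangle}\ \sqrt{\bigl\langle\Psi, \widehat{m}^{-1}\, q_1 q_2\,\Psi\bigr\rangle} \\
&\le \sqrt{\bigl\|w^2 * |\varphi|^2\bigr\|_\infty}\ \bigl\|\widehat{\tau_2 n}\,\Psi\bigr\|\ \sqrt{\frac{N}{N-1}\,\langle\Psi, \widehat{m}\,\Psi\rangle} \\
&= \sqrt{\bigl\|w^2 * |\varphi|^2\bigr\|_\infty}\ \sqrt{\bigl\langle\Psi, \widehat{\tau_2 m}\,\Psi\bigr\rangle}\ \sqrt{\frac{N}{N-1}}\ \sqrt{\alpha} \\
&= \sqrt{\bigl\|w^2 * |\varphi|^2\bigr\|_\infty}\ \sqrt{\langle\Psi, \widehat{m}\,\Psi\rangle + \frac{2}{N}}\ \sqrt{\frac{N}{N-1}}\ \sqrt{\alpha} \\
&\le \sqrt{\bigl\|w^2 * |\varphi|^2\bigr\|_\infty}\ \sqrt{\frac{N}{N-1}}\,\Bigl(\alpha + \sqrt{\frac{2\alpha}{N}}\Bigr) \\
&\le 2\,\sqrt{\bigl\|w^2 * |\varphi|^2\bigr\|_\infty}\ \sqrt{\frac{N}{N-1}}\,\Bigl(\alpha + \frac{1}{N}\Bigr).
\end{aligned}$$
Using the estimate (3.24) we get finally
$$|\text{(III)}| \le 2\sqrt{2}\,\|w\|_{L^{p_1}+L^{p_2}}\bigl(\|\varphi\|_{q_1} + \|\varphi\|_{q_2}\bigr)\,\sqrt{\frac{N}{N-1}}\,\Bigl(\alpha + \frac{1}{N}\Bigr).$$
Conclusion of the proof. We have shown that the estimate (3.2) holds with
$$B_N(t) = 2\,\|w\|_{L^{p_1}+L^{p_2}}\Bigl(\bigl(\|\varphi(t)\|_{r_1} + \|\varphi(t)\|_{r_2}\bigr)^2 + 6\,\bigl(\|\varphi(t)\|_{q_1} + \|\varphi(t)\|_{q_2}\bigr)\Bigr), \qquad A_N(t) = \frac{B_N(t)}{N}.$$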
Using $L^2$-norm conservation $\|\varphi(t)\| = 1$ and interpolation we find $\|\varphi(t)\|_{r_i}^2 \le \|\varphi(t)\|_{q_i}$. Thus,
$$B_N(t) \le 16\,\|w\|_{L^{p_1}+L^{p_2}}\bigl(\|\varphi(t)\|_{q_1} + \|\varphi(t)\|_{q_2}\bigr).$$
The claim now follows from the Grönwall estimate (3.3).
4. Convergence for Stronger Singularities
In this section we extend the results of Sect. 3 to more singular interaction potentials. We consider the case $w \in L^{p_0} + L^\infty$, where
$$\frac{1}{p_0} = \frac{1}{2} + \frac{1}{d}. \tag{4.1}$$
For example in three dimensions $p_0 = 6/5$, which corresponds to singularities up to, but not including, the type $|x|^{-5/2}$. Of course, there are other restrictions on the interaction potential which ensure the stability of the $N$-body Hamiltonian and the well-posedness of the Hartree equation. In practice, it is often these latter restrictions that determine the class of allowed singularities.
In the words of [11] (p. 169), it is "venerable physical folklore" that an $N$-body Hamiltonian of the form (3.4), with $h = -\Delta$ and $w(x) = |x|^{-\zeta}$ for $\zeta < 2$, produces reasonable quantum dynamics in three dimensions. Mathematically, this means that such a Hamiltonian is self-adjoint; this is a well-known result (see e.g. [11]). The corresponding Hartree equation is known to be globally well-posed (see [5]). This section answers (affirmatively) the question whether, in the case of such singular interaction potentials, the mean-field limit of the $N$-body dynamics is governed by the Hartree equation.
4.1. Outline and main result. As in Sect. 3, we need to control expressions of the form $\|w^2 * |\varphi|^2\|_\infty$. The situation is considerably more involved when $w^2$ is not locally integrable. An important step in dealing with such potentials in our proof is to express $w$ as the divergence of a vector field $\xi \in L^2$. This approach requires the control of not only $\alpha = \|q_1\Psi\|^2$ but also $\|\nabla_1 q_1\Psi\|^2$, which arises from integrating by parts in expressions containing the factor $\nabla\cdot\xi$. As it turns out, $\beta$, defined through
$$\beta_N(t) := \bigl\langle\Psi_N(t), \widehat{n}\,\Psi_N(t)\bigr\rangle, \tag{4.2}$$
does the trick. This follows from an estimate exploiting conservation of energy (see Lemma 4.6 below). The inequality $m \le n$ and the representation (3.12) yield
$$\alpha \le \beta. \tag{4.3}$$
We consider a Hamiltonian of the form (3.4) and make the following assumptions.
(B1) The one-particle Hamiltonian $h$ is self-adjoint and bounded from below. Without loss of generality we assume that $h \ge 0$. We also assume that there are constants $\kappa_1, \kappa_2 > 0$ such that $-\Delta \le \kappa_1 h + \kappa_2$, as an inequality of forms on $\mathcal{H}^{(1)}$.
(B2) The Hamiltonian (3.4) is self-adjoint and bounded from below. We also assume that $Q(H_N) \subset X_N$, where $X_N$ is defined as in Assumption (A1).
(B3) There is a constant $\kappa_3 \in (0,1)$ such that $0 \le (1-\kappa_3)(h_1 + h_2) + W_{12}$, as an inequality of forms on $\mathcal{H}^{(2)}$.
(B4) The interaction potential $w$ is a real and even function satisfying $w \in L^p + L^\infty$, where $p_0 < p \le 2$.
(B5) The solution $\varphi(\cdot)$ of (1.3) satisfies $\varphi(\cdot) \in C(\mathbb{R}; X_1^2 \cap L^\infty) \cap C^1(\mathbb{R}; L^2)$, where $X_1^2 := Q(h^2) \subset L^2$ is equipped with the norm $\|\varphi\|_{X_1^2} := \|(1 + h^2)^{1/2}\varphi\|$.
Next, we define the microscopic energy per particle
$$E_N^\Psi(t) := \frac{1}{N}\,\bigl\langle\Psi_N(t), H_N\,\Psi_N(t)\bigr\rangle,$$
as well as the Hartree energy
$$E^\varphi(t) := \Bigl[\langle\varphi, h\varphi\rangle + \frac{1}{2}\int dx\, dy\; w(x-y)\,|\varphi(x)|^2|\varphi(y)|^2\Bigr]\Big|_t.$$
By spectral calculus, $E_N^\Psi(t)$ is independent of $t$. Also, invoking Assumption (B5) to differentiate $E^\varphi(t)$ with respect to $t$ shows that $E^\varphi(t)$ is conserved as well. Summarizing,
$$E_N^\Psi(t) = E_N^\Psi(0), \qquad E^\varphi(t) = E^\varphi(0), \qquad t \in \mathbb{R}.$$
We may now state the main result of this section.
Theorem 4.1. Let $\Psi_{N,0} \in Q(H_N)$ and assume that Assumptions (B1) – (B5) hold. Then there is a constant $K$, depending only on $d$, $h$, $w$ and $p$, such that
$$\beta_N(t) \lesssim \Bigl(\beta_N(0) + E_N^\Psi - E^\varphi + \frac{1}{N^\eta}\Bigr)\, e^{K\phi(t)},$$
where
$$\eta := \frac{p/p_0 - 1}{2p/p_0 - p/2 - 1} \qquad\text{and}\qquad \phi(t) := \int_0^t ds\,\bigl(1 + \|\varphi(s)\|_{X_1^2\cap L^\infty}^3\bigr). \tag{4.4}$$
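Remark 4.2. We have convergence to the mean-field limit whenever $\lim_N E_N^\Psi = E^\varphi$ and $\lim_N \beta_N(0) = 0$. For instance if we start in a fully factorized state, $\Psi_{N,0} = \varphi_0^{\otimes N}$, then $\beta_N(0) = 0$ and, comparing the $N$-body energy per particle with the Hartree energy,
$$E_N^\Psi - E^\varphi = -\frac{1}{2N}\,\bigl\langle\varphi_0\otimes\varphi_0,\ W_{12}\,\varphi_0\otimes\varphi_0\bigr\rangle,$$
which is of order $1/N$, so that Theorem 4.1 yields
$$E_N^{(1)}(t) \le \beta_N(t) \lesssim \frac{1}{N^\eta}\, e^{K\phi(t)},$$
and the analogue of Corollary 3.2 holds.
Remark 4.3. The following graph shows the dependence of $\eta$ on $p$ for $d = 3$, i.e. $p_0 = 6/5$.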
[Figure: plot of $\eta$ (vertical axis, from 0 to 0.5) against $p$ (horizontal axis, from 1.2 to 2).]
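Since the plot itself does not survive in this text, the values it displays can be recomputed from the formula for $\eta$ in (4.4) together with (4.1). The following is a minimal sketch using exact rational arithmetic; the function names `p0` and `eta` are illustrative choices, not notation from the paper beyond the symbols they evaluate.

```python
from fractions import Fraction

def p0(d):
    """Critical exponent from (4.1): 1/p0 = 1/2 + 1/d, for an integer dimension d."""
    return 1 / (Fraction(1, 2) + Fraction(1, d))

def eta(p, d=3):
    """Rate exponent from (4.4): eta = (p/p0 - 1) / (2 p/p0 - p/2 - 1)."""
    p = Fraction(p)
    ratio = p / p0(d)
    return (ratio - 1) / (2 * ratio - p / 2 - 1)

# Tabulate eta over the admissible range p0 < p <= 2 in three dimensions.
for p in (Fraction(13, 10), Fraction(3, 2), Fraction(17, 10), Fraction(2)):
    print(f"p = {float(p):.2f}  eta = {float(eta(p)):.4f}")
```

In three dimensions this gives $\eta = 0$ at $p = p_0 = 6/5$, $\eta = 1/3$ at $p = 3/2$, and $\eta = 1/2$ at $p = 2$, matching the range of the axes described above.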
Remark 4.4. Theorem 4.1 remains valid for a large class of time-dependent one-particle Hamiltonians $h(t)$. See Sect. 4.4 below for a full discussion.
Remark 4.5. In three dimensions Assumption (B1) and Sobolev's inequality imply that $\|\varphi\|_\infty \lesssim \|\varphi\|_{X_1^2}$, so that Assumption (B5) is equivalent to $\varphi \in C(\mathbb{R}; X_1^2) \cap C^1(\mathbb{R}; L^2)$.
4.2. Example: nonrelativistic particles with interaction potential of critical type. Consider nonrelativistic particles in $\mathbb{R}^3$ with one-particle Hamiltonian $h = -\Delta$. The interaction potential is given by $w(x) = \lambda|x|^{-2}$. This corresponds to a critical nonlinearity of the Hartree equation. We require that $\lambda > -1/2$, which ensures that the $N$-body Hamiltonian is stable and the Hartree equation has global solutions. To see this, recall Hardy's inequality in three dimensions,
$$\langle\varphi, |x|^{-2}\varphi\rangle \le 4\,\langle\varphi, -\Delta\varphi\rangle. \tag{4.5}$$
One easily infers that Assumptions (B1) – (B3) hold. Moreover, Assumption (B4) holds for any $p < 3/2$. In order to verify Assumption (B5) we refer to [5], where local well-posedness is proven. Global existence follows by standard methods using conservation of the mass $\|\varphi\|^2$, conservation of the energy $E^\varphi$, and Hardy's inequality (4.5). Together they yield an a-priori bound on $\|\varphi\|_{X_1}$, from which an a-priori bound for $\|\varphi\|_{X_1^2}$ may be inferred; see [5] for details. We conclude: For any $\eta < 1/3$ there is a continuous function $\phi(t)$ such that Theorem 4.1 holds.
4.3. Proof of Theorem 4.1.
4.3.1. An energy estimate. In the first step of our proof we exploit conservation of energy to derive an estimate on $\|\nabla_1 q_1\Psi\|$.
Lemma 4.6. Assume that Assumptions (B1) – (B5) hold. Then
$$\|\nabla_1 q_1\Psi\|^2 \lesssim E^\Psi - E^\varphi + \bigl(1 + \|\varphi\|_{X_1^2\cap L^\infty}^2\bigr)\Bigl(\beta + \frac{1}{\sqrt{N}}\Bigr).$$
Proof. Write
$$E^\varphi = \langle\varphi, h\varphi\rangle + \frac{1}{2}\,\langle\varphi, W^\varphi\varphi\rangle, \tag{4.6}$$
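as well as
$$E^\Psi = \langle\Psi, h_1\Psi\rangle + \frac{N-1}{2N}\,\langle\Psi, W_{12}\Psi\rangle. \tag{4.7}$$
Inserting $1 = p_1 p_2 + (1 - p_1 p_2)$ in front of every $\Psi$ in (4.7) and multiplying everything out yields
$$\begin{aligned}
\bigl\langle\Psi, (1 - p_1 p_2)\,h_1\,(1 - p_1 p_2)\Psi\bigr\rangle = E^\Psi &- \langle\Psi, p_1 p_2\, h_1\, p_1 p_2\Psi\rangle - \frac{N-1}{2N}\,\langle\Psi, p_1 p_2\, W_{12}\, p_1 p_2\Psi\rangle \\
&- \langle\Psi, (1 - p_1 p_2)\,h_1\, p_1 p_2\Psi\rangle - \langle\Psi, p_1 p_2\, h_1\,(1 - p_1 p_2)\Psi\rangle \\
&- \frac{N-1}{2N}\,\langle\Psi, (1 - p_1 p_2)\,W_{12}\, p_1 p_2\Psi\rangle - \frac{N-1}{2N}\,\langle\Psi, p_1 p_2\, W_{12}\,(1 - p_1 p_2)\Psi\rangle \\
&- \frac{N-1}{2N}\,\langle\Psi, (1 - p_1 p_2)\,W_{12}\,(1 - p_1 p_2)\Psi\rangle.
\end{aligned}$$
We want to find an upper bound for the left-hand side. In order to control the last term on the right-hand side for negative interaction potentials, we need to use some of the kinetic energy on the left-hand side. To this end, we split the left-hand side by multiplying it with $1 = \kappa_3 + (1 - \kappa_3)$. Thus, using (4.6), we get
$$\begin{aligned}
\kappa_3\,\bigl\langle\Psi, (1 - p_1 p_2)\,h_1\,(1 - p_1 p_2)\Psi\bigr\rangle = {}& E^\Psi - E^\varphi \\
&- \langle\Psi, p_1 p_2\, h_1\, p_1 p_2\Psi\rangle + \langle\varphi, h\varphi\rangle \\
&- \frac{N-1}{2N}\,\langle\Psi, p_1 p_2\, W_{12}\, p_1 p_2\Psi\rangle + \frac{1}{2}\,\langle\varphi, W^\varphi\varphi\rangle \\
&- \langle\Psi, (1 - p_1 p_2)\,h_1\, p_1 p_2\Psi\rangle - \langle\Psi, p_1 p_2\, h_1\,(1 - p_1 p_2)\Psi\rangle \\
&- \frac{N-1}{2N}\,\langle\Psi, (1 - p_1 p_2)\,W_{12}\, p_1 p_2\Psi\rangle - \frac{N-1}{2N}\,\langle\Psi, p_1 p_2\, W_{12}\,(1 - p_1 p_2)\Psi\rangle \\
&- \frac{N-1}{2N}\,\langle\Psi, (1 - p_1 p_2)\,W_{12}\,(1 - p_1 p_2)\Psi\rangle \\
&- (1 - \kappa_3)\,\langle\Psi, (1 - p_1 p_2)\,h_1\,(1 - p_1 p_2)\Psi\rangle.
\end{aligned} \tag{4.8}$$
The rest of the proof consists in estimating each line on the right-hand side of (4.8) separately. There is nothing to be done with the first line.
Lines 6–7. The last two lines of (4.8) are equal to
$$\begin{aligned}
&-\frac{N-1}{2N}\,\langle\Psi, (1 - p_1 p_2)\,W_{12}\,(1 - p_1 p_2)\Psi\rangle - \frac{1}{2}(1 - \kappa_3)\,\langle\Psi, (1 - p_1 p_2)(h_1 + h_2)(1 - p_1 p_2)\Psi\rangle \\
&\qquad\le -\frac{N-1}{2N}\,\bigl\langle\Psi, (1 - p_1 p_2)\bigl((1 - \kappa_3)(h_1 + h_2) + W_{12}\bigr)(1 - p_1 p_2)\Psi\bigr\rangle \le 0,
\end{aligned}$$
where in the last step we used Assumption (B3).
Line 2. The second line on the right-hand side of (4.8) is bounded in absolute value by
$$\bigl|\langle\varphi, h\varphi\rangle - \langle\Psi, p_1 p_2\, h_1\, p_1 p_2\Psi\rangle\bigr| = \langle\varphi, h\varphi\rangle\,\langle\Psi, (1 - p_1 p_2)\Psi\rangle = \langle\varphi, h\varphi\rangle\,\langle\Psi, (q_1 p_2 + p_1 q_2 + q_1 q_2)\Psi\rangle \le 3\,\alpha\,\langle\varphi, h\varphi\rangle \le 3\,\beta\,\langle\varphi, h\varphi\rangle,$$
where in the last step we used (4.3).
Line 3. The third line on the right-hand side of (4.8) is bounded in absolute value by
$$\begin{aligned}
\Bigl|\frac{1}{2}\,\langle\varphi, W^\varphi\varphi\rangle - \frac{N-1}{2N}\,\langle\Psi, p_1 p_2\, W_{12}\, p_1 p_2\Psi\rangle\Bigr| &= \frac{1}{2}\,\bigl|\langle\varphi, W^\varphi\varphi\rangle\bigr|\,\Bigl|1 - \frac{N-1}{N}\,\langle\Psi, p_1 p_2\Psi\rangle\Bigr| \\
&\le \frac{1}{2}\,\|W^\varphi\|_\infty\Bigl(\langle\Psi, (q_1 p_2 + p_1 q_2 + q_1 q_2)\Psi\rangle + \frac{1}{N}\,\langle\Psi, p_1 p_2\Psi\rangle\Bigr) \\
&\le \frac{3}{2}\,\|W^\varphi\|_\infty\Bigl(\alpha + \frac{1}{N}\Bigr) \le \frac{3}{2}\,\|W^\varphi\|_\infty\Bigl(\beta + \frac{1}{N}\Bigr).
\end{aligned}$$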
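As in (3.20), one finds that $\|W^\varphi\|_\infty \le \|w\|_{L^1+L^\infty}\,\|\varphi\|_{L^2\cap L^\infty}^2$.
Line 4. The fourth line on the right-hand side of (4.8) is bounded in absolute value by
$$\begin{aligned}
\bigl|\langle\Psi, (1 - p_1 p_2)\,h_1\, p_1 p_2\Psi\rangle\bigr| &= \bigl|\langle\Psi, (q_1 p_2 + p_1 q_2 + q_1 q_2)\,h_1\, p_1 p_2\Psi\rangle\bigr| = \bigl|\langle\Psi, q_1\, h_1\, p_1 p_2\Psi\rangle\bigr| \\
&= \bigl|\langle\Psi, q_1\, \widehat{n}^{-1/2}\,\widehat{n}^{1/2}\, h_1\, p_1 p_2\Psi\rangle\bigr| = \bigl|\langle\Psi, q_1\, \widehat{n}^{-1/2}\, h_1\, \widehat{\tau_1 n}^{1/2}\, p_1 p_2\Psi\rangle\bigr|,
\end{aligned}$$
where in the last step we used Lemma 3.10. Using Cauchy-Schwarz, we thus get
$$\begin{aligned}
\bigl|\langle\Psi, (1 - p_1 p_2)\,h_1\, p_1 p_2\Psi\rangle\bigr| &\le \sqrt{\langle\Psi, q_1\,\widehat{n}^{-1}\Psi\rangle}\ \sqrt{\bigl\langle\Psi, p_1 p_2\, \widehat{\tau_1 n}^{1/2}\, h_1^2\, \widehat{\tau_1 n}^{1/2}\, p_1 p_2\Psi\bigr\rangle} \\
&= \sqrt{\langle\Psi, \widehat{n}\,\Psi\rangle}\ \sqrt{\langle\varphi, h^2\varphi\rangle}\ \sqrt{\langle\Psi, \widehat{\tau_1 n}\, p_1 p_2\Psi\rangle},
\end{aligned}$$
where in the second step we used Lemma 3.9. Using
$$(\tau_1 n)(k) = \sqrt{\frac{k+1}{N}} \le n(k) + \frac{1}{\sqrt{N}}$$
we find
$$\begin{aligned}
\bigl|\langle\Psi, (1 - p_1 p_2)\,h_1\, p_1 p_2\Psi\rangle\bigr| &\le \sqrt{\beta}\ \sqrt{\langle\varphi, h^2\varphi\rangle}\ \sqrt{\langle\Psi, \widehat{n}\,\Psi\rangle + \frac{1}{\sqrt{N}}} \\
&\le \sqrt{\langle\varphi, h^2\varphi\rangle}\ \sqrt{\beta}\,\Bigl(\sqrt{\beta} + \frac{1}{N^{1/4}}\Bigr) \le 2\,\sqrt{\langle\varphi, h^2\varphi\rangle}\,\Bigl(\beta + \frac{1}{\sqrt{N}}\Bigr).
\end{aligned}$$
Line 5. Finally, we turn our attention to the fifth line on the right-hand side of (4.8), which is bounded in absolute value by
$$\bigl|\langle\Psi, p_1 p_2\, W_{12}\,(1 - p_1 p_2)\Psi\rangle\bigr| = \bigl|\langle\Psi, p_1 p_2\, W_{12}\,(p_1 q_2 + q_1 p_2 + q_1 q_2)\Psi\rangle\bigr| \le 2\,(a) + (b),$$
where
$$(a) := \bigl|\langle\Psi, p_1 p_2\, W_{12}\, q_1 p_2\Psi\rangle\bigr|, \qquad (b) := \bigl|\langle\Psi, p_1 p_2\, W_{12}\, q_1 q_2\Psi\rangle\bigr|.$$
One finds, using (3.17), Lemma 3.10 and Lemma 3.9,
$$\begin{aligned}
(a) &= \bigl|\langle\Psi, p_1 p_2\, W_1^\varphi\, q_1\Psi\rangle\bigr| = \bigl|\langle\Psi, p_1 p_2\, W_1^\varphi\, \widehat{n}^{1/2}\,\widehat{n}^{-1/2}\, q_1\Psi\rangle\bigr| = \bigl|\langle\Psi, p_1 p_2\, \widehat{\tau_1 n}^{1/2}\, W_1^\varphi\, \widehat{n}^{-1/2}\, q_1\Psi\rangle\bigr| \\
&\le \|W^\varphi\|_\infty\ \sqrt{\langle\Psi, \widehat{\tau_1 n}\,\Psi\rangle}\ \sqrt{\langle\Psi, \widehat{n}^{-1}\, q_1\Psi\rangle} \le \|W^\varphi\|_\infty\ \sqrt{\langle\Psi, \widehat{n}\,\Psi\rangle + \frac{1}{\sqrt{N}}}\ \sqrt{\langle\Psi, \widehat{n}\,\Psi\rangle} \\
&\le 2\,\|W^\varphi\|_\infty\,\Bigl(\beta + \frac{1}{\sqrt{N}}\Bigr).
\end{aligned}$$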
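The estimation of (b) requires a little more effort. We start by splitting
$$w = w^{(p)} + w^{(\infty)}, \qquad w^{(p)} \in L^p,\ w^{(\infty)} \in L^\infty.$$
This yields $(b) \le (b)^{(p)} + (b)^{(\infty)}$ in self-explanatory notation. Let us first concentrate on $(b)^{(\infty)}$:
$$\begin{aligned}
(b)^{(\infty)} &= \bigl|\langle\Psi, p_1 p_2\, W_{12}^{(\infty)}\, q_1 q_2\Psi\rangle\bigr| = \bigl|\langle\Psi, p_1 p_2\, W_{12}^{(\infty)}\, \widehat{n}\,\widehat{n}^{-1}\, q_1 q_2\Psi\rangle\bigr| = \bigl|\langle\Psi, p_1 p_2\, \widehat{\tau_2 n}\, W_{12}^{(\infty)}\, \widehat{n}^{-1}\, q_1 q_2\Psi\rangle\bigr| \\
&\le \|w^{(\infty)}\|_\infty\ \sqrt{\langle\Psi, \widehat{\tau_2 n}^{\,2}\,\Psi\rangle}\ \sqrt{\langle\Psi, \widehat{n}^{-2}\, q_1 q_2\Psi\rangle} \lesssim \|w^{(\infty)}\|_\infty\ \sqrt{\alpha + \frac{2}{N}}\ \sqrt{\alpha} \le 2\,\|w^{(\infty)}\|_\infty\,\Bigl(\beta + \frac{2}{N}\Bigr).
\end{aligned}$$
Let us now consider $(b)^{(p)}$. In order to deal with the singularities in $w^{(p)}$, we write it as the divergence of a vector field $\xi$,
$$w^{(p)} = \nabla\cdot\xi. \tag{4.9}$$
This is nothing but a problem of electrostatics, which is solved by
$$\xi = C\,\frac{x}{|x|^d} * w^{(p)},$$
with some constant $C$ depending on $d$. By the Hardy-Littlewood-Sobolev inequality, we find
$$\|\xi\|_q \lesssim \|w^{(p)}\|_p, \qquad \frac{1}{q} = \frac{1}{p} - \frac{1}{d}. \tag{4.10}$$
Thus if $p \ge p_0$ then $q \ge 2$. Denote by $X_{12}$ multiplication by $\xi(x_1 - x_2)$. For the following it is convenient to write $\nabla\cdot\xi = \nabla^\rho\xi^\rho$, where a summation over $\rho = 1, \dots, d$ is implied. Recalling Lemma 3.10, we therefore get
$$(b)^{(p)} = \bigl|\langle\Psi, p_1 p_2\, W_{12}^{(p)}\, \widehat{n}\,\widehat{n}^{-1}\, q_1 q_2\Psi\rangle\bigr| = \bigl|\langle\Psi, p_1 p_2\, \widehat{\tau_2 n}\, W_{12}^{(p)}\, \widehat{n}^{-1}\, q_1 q_2\Psi\rangle\bigr| = \bigl|\langle\Psi, p_1 p_2\, \widehat{\tau_2 n}\, (\nabla_1^\rho X_{12}^\rho)\, \widehat{n}^{-1}\, q_1 q_2\Psi\rangle\bigr|.$$
Integrating by parts yields
$$(b)^{(p)} \le \bigl|\langle\nabla_1^\rho\, \widehat{\tau_2 n}\, p_1 p_2\Psi,\ X_{12}^\rho\, \widehat{n}^{-1}\, q_1 q_2\Psi\rangle\bigr| + \bigl|\langle\widehat{\tau_2 n}\, p_1 p_2\Psi,\ X_{12}^\rho\, \nabla_1^\rho\, \widehat{n}^{-1}\, q_1 q_2\Psi\rangle\bigr|. \tag{4.11}$$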
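Let us begin by estimating the first term. Recalling that $p = |\varphi\rangle\langle\varphi|$, we find that the first term on the right-hand side of (4.11) is equal to
$$\begin{aligned}
\bigl|\langle X_{12}^\rho\, p_2\, (\nabla^\rho p)_1\, \widehat{\tau_2 n}\,\Psi,\ \widehat{n}^{-1}\, q_1 q_2\Psi\rangle\bigr| &\le \sqrt{\bigl\langle(\nabla^\rho p)_1\, \widehat{\tau_2 n}\,\Psi,\ p_2\, X_{12}^\rho X_{12}^\sigma\, p_2\, (\nabla^\sigma p)_1\, \widehat{\tau_2 n}\,\Psi\bigr\rangle}\ \bigl\|\widehat{n}^{-1}\, q_1 q_2\Psi\bigr\| \\
&\le \sqrt{\bigl\||\varphi|^2 * \xi^2\bigr\|_\infty}\ \|\nabla\varphi\|\ \bigl\|\widehat{\tau_2 n}\,\Psi\bigr\|\ \bigl\|\widehat{n}^{-1}\, q_1 q_2\Psi\bigr\| \\
&\lesssim \|\xi\|_q\,\|\varphi\|_{L^2\cap L^\infty}\,\|\varphi\|_{X_1}\,\sqrt{\alpha + \frac{2}{N}}\ \sqrt{\alpha},
\end{aligned}$$
where we used Young's inequality, Assumption (B1), and Lemma 3.9. Recalling that $\beta \ge \alpha$, we conclude that the first term on the right-hand side of (4.11) is bounded by
$$C\,\|\varphi\|_{X_1\cap L^\infty}^2\,\Bigl(\beta + \frac{1}{N}\Bigr).$$
Next, we estimate the second term on the right-hand side of (4.11). It is equal to
$$\begin{aligned}
\bigl|\langle X_{12}^\rho\, p_1 p_2\, \widehat{\tau_2 n}\,\Psi,\ \nabla_1^\rho\, \widehat{n}^{-1}\, q_1 q_2\Psi\rangle\bigr| &\le \sqrt{\bigl\langle\widehat{\tau_2 n}\,\Psi,\ p_1 p_2\, X_{12}^2\, p_1 p_2\, \widehat{\tau_2 n}\,\Psi\bigr\rangle}\ \bigl\|\nabla_1\, \widehat{n}^{-1}\, q_1 q_2\Psi\bigr\| \\
&\le \sqrt{\bigl\||\varphi|^2 * \xi^2\bigr\|_\infty}\ \bigl\|\widehat{\tau_2 n}\,\Psi\bigr\|\ \bigl\|\nabla_1\, \widehat{n}^{-1}\, q_1 q_2\Psi\bigr\| \\
&\lesssim \|\xi\|_q\,\|\varphi\|_{L^2\cap L^\infty}\,\sqrt{\alpha + \frac{2}{N}}\ \bigl\|\nabla_1\, \widehat{n}^{-1}\, q_1 q_2\Psi\bigr\|.
\end{aligned}$$
We estimate $\|\nabla_1\, \widehat{n}^{-1}\, q_1 q_2\Psi\|$ by introducing $1 = p_1 + q_1$ on the left. The term arising from $p_1$ is bounded by
$$\begin{aligned}
\bigl\|p_1\, \nabla_1\, \widehat{n}^{-1}\, q_1 q_2\Psi\bigr\| &= \bigl\|p_1 q_2\, \widehat{\tau_1 n}^{-1}\, \nabla_1 q_1\Psi\bigr\| \le \sqrt{\bigl\langle\nabla_1 q_1\Psi,\ q_2\, \widehat{\tau_1 n}^{-2}\, \nabla_1 q_1\Psi\bigr\rangle} \\
&= \sqrt{\frac{1}{N-1}\,\Bigl\langle\nabla_1 q_1\Psi,\ \sum_{i=2}^N q_i\, \widehat{\tau_1 n}^{-2}\, \nabla_1 q_1\Psi\Bigr\rangle} \lesssim \sqrt{\Bigl\langle\nabla_1 q_1\Psi,\ \frac{1}{N}\sum_{i=1}^N q_i\, \widehat{\tau_1 n}^{-2}\, \nabla_1 q_1\Psi\Bigr\rangle} \\
&= \sqrt{\bigl\langle\nabla_1 q_1\Psi,\ \widehat{n}^{2}\, \widehat{\tau_1 n}^{-2}\, \nabla_1 q_1\Psi\bigr\rangle} \le \|\nabla_1 q_1\Psi\|.
\end{aligned}$$
The term arising from $q_1$ in the above splitting is dealt with in exactly the same way. Thus we have proven that the second term on the right-hand side of (4.11) is bounded by
$$C\,\|\varphi\|_{L^2\cap L^\infty}\,\sqrt{\beta + \frac{1}{N}}\ \|\nabla_1 q_1\Psi\|.$$
Summarizing, we have
$$(b)^{(p)} \lesssim \|\varphi\|_{X_1\cap L^\infty}^2\,\Bigl(\beta + \frac{1}{N}\Bigr) + \|\varphi\|_{L^2\cap L^\infty}\,\sqrt{\beta + \frac{1}{N}}\ \|\nabla_1 q_1\Psi\|.$$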
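Conclusion of the proof. Putting all the estimates of the right-hand side of (4.8) together, we find
$$\bigl\langle\Psi, (1 - p_1 p_2)\,h_1\,(1 - p_1 p_2)\Psi\bigr\rangle \lesssim E^\Psi - E^\varphi + \bigl(1 + \|\varphi\|_{X_1^2\cap L^\infty}^2\bigr)\Bigl(\beta + \frac{1}{\sqrt{N}}\Bigr) + \|\varphi\|_{L^2\cap L^\infty}\,\sqrt{\beta + \frac{1}{N}}\ \|\nabla_1 q_1\Psi\|. \tag{4.12}$$
Next, from $1 - p_1 p_2 = p_1 q_2 + q_1$ we deduce
$$\bigl\|\sqrt{h_1}\, q_1\Psi\bigr\| = \bigl\|\sqrt{h_1}\,(1 - p_1 p_2)\Psi - \sqrt{h_1}\, p_1 q_2\Psi\bigr\| \le \bigl\|\sqrt{h_1}\,(1 - p_1 p_2)\Psi\bigr\| + \bigl\|\sqrt{h_1}\, p_1 q_2\Psi\bigr\|.$$
Now, recalling that $p = |\varphi\rangle\langle\varphi|$, we find
$$\bigl\|\sqrt{h_1}\, p_1 q_2\Psi\bigr\| \le \bigl\|\sqrt{h_1}\, p_1\bigr\|\,\|q_2\Psi\| \le \|\varphi\|_{X_1}\,\sqrt{\beta}.$$
Therefore,
$$\bigl\|\sqrt{h_1}\, q_1\Psi\bigr\|^2 \lesssim \bigl\|\sqrt{h_1}\,(1 - p_1 p_2)\Psi\bigr\|^2 + \|\varphi\|_{X_1}^2\,\beta.$$
Plugging in (4.12) yields
$$\bigl\|\sqrt{h_1}\, q_1\Psi\bigr\|^2 \lesssim E^\Psi - E^\varphi + \bigl(1 + \|\varphi\|_{X_1^2\cap L^\infty}^2\bigr)\Bigl(\beta + \frac{1}{\sqrt{N}}\Bigr) + \|\varphi\|_{L^2\cap L^\infty}\,\sqrt{\beta + \frac{1}{N}}\ \|\nabla_1 q_1\Psi\|.$$
Next, we observe that Assumption (B1) implies $\|\nabla_1 q_1\Psi\| \lesssim \|\sqrt{h_1}\, q_1\Psi\| + \sqrt{\beta}$, so that we get
$$\bigl\|\sqrt{h_1}\, q_1\Psi\bigr\|^2 \lesssim E^\Psi - E^\varphi + \bigl(1 + \|\varphi\|_{X_1^2\cap L^\infty}^2\bigr)\Bigl(\beta + \frac{1}{\sqrt{N}}\Bigr) + \|\varphi\|_{L^2\cap L^\infty}\,\sqrt{\beta + \frac{1}{N}}\ \bigl\|\sqrt{h_1}\, q_1\Psi\bigr\|.$$
Now we claim that
$$\bigl\|\sqrt{h_1}\, q_1\Psi\bigr\|^2 \lesssim E^\Psi - E^\varphi + \bigl(1 + \|\varphi\|_{X_1^2\cap L^\infty}^2\bigr)\Bigl(\beta + \frac{1}{\sqrt{N}}\Bigr). \tag{4.13}$$
This follows from the general estimate
$$x^2 \le C(R + ax) \quad\Longrightarrow\quad x^2 \le 2CR + C^2 a^2,$$
which itself follows from the elementary inequality
$$C(R + ax) \le CR + \frac{1}{2}\,C^2 a^2 + \frac{1}{2}\,x^2.$$
The claim of the lemma now follows from (4.13) by using Assumption (B1).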
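For completeness, the general estimate used at the end of the proof can be spelled out in two lines. This is a routine worked derivation added for the reader, with $x, R, a, C \ge 0$ as in the text; the only ingredient is the inequality $2uv \le u^2 + v^2$ applied to $u = Ca$, $v = x$.

```latex
\begin{aligned}
x^2 \le C(R + ax) = CR + (Ca)\,x
    &\le CR + \tfrac{1}{2} C^2 a^2 + \tfrac{1}{2} x^2 , \\
\text{hence}\qquad \tfrac{1}{2} x^2 \le CR + \tfrac{1}{2} C^2 a^2
    \quad&\Longrightarrow\quad x^2 \le 2CR + C^2 a^2 .
\end{aligned}
```

In the application one takes $x = \|\sqrt{h_1}\, q_1 \Psi\|$ and $a = \|\varphi\|_{L^2 \cap L^\infty} \sqrt{\beta + 1/N}$, so that the term $C^2 a^2$ is again of the form $\|\varphi\|^2_{L^2\cap L^\infty}(\beta + 1/N)$.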
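4.3.2. A bound on $\dot\beta$. We start exactly as in Sect. 3. Assumptions (B1) – (B5) imply that $\beta$ is differentiable in $t$ with derivative
$$\dot\beta = \frac{i}{2}\,\bigl\langle\Psi, \bigl[(N-1)W_{12} - N W_1^\varphi - N W_2^\varphi,\ \widehat{n}\bigr]\Psi\bigr\rangle = 2\,\text{(I)} + 2\,\text{(II)} + \text{(III)} + \text{complex conjugate}, \tag{4.14}$$
where
$$\text{(I)} := \frac{i}{2}\,\bigl\langle\Psi, p_1 p_2 \bigl[(N-1)W_{12} - N W_1^\varphi - N W_2^\varphi,\ \widehat{n}\bigr] q_1 p_2\,\Psi\bigr\rangle,$$
$$\text{(II)} := \frac{i}{2}\,\bigl\langle\Psi, q_1 p_2 \bigl[(N-1)W_{12} - N W_1^\varphi - N W_2^\varphi,\ \widehat{n}\bigr] q_1 q_2\,\Psi\bigr\rangle,$$
$$\text{(III)} := \frac{i}{2}\,\bigl\langle\Psi, p_1 p_2 \bigl[(N-1)W_{12} - N W_1^\varphi - N W_2^\varphi,\ \widehat{n}\bigr] q_1 q_2\,\Psi\bigr\rangle.$$
Term (I). Using (3.17) we find
$$2\,|\text{(I)}| = \bigl|\bigl\langle\Psi, p_1 p_2 \bigl[(N-1)W_{12} - N W_1^\varphi - N W_2^\varphi,\ \widehat{n}\bigr] q_1 p_2\,\Psi\bigr\rangle\bigr| = \bigl|\bigl\langle\Psi, p_1 p_2 \bigl[W_1^\varphi, \widehat{n}\bigr] q_1 p_2\,\Psi\bigr\rangle\bigr| = \bigl|\bigl\langle\Psi, p_1 p_2\, W_1^\varphi\, \bigl(\widehat{n} - \widehat{\tau_{-1} n}\bigr)\, q_1 p_2\,\Psi\bigr\rangle\bigr|,$$
where we used Lemma 3.10. Define
$$\mu(k) := N\bigl(n(k) - (\tau_{-1} n)(k)\bigr) = \frac{\sqrt{N}}{\sqrt{k} + \sqrt{k-1}} \le n^{-1}(k), \qquad k = 1, \dots, N. \tag{4.15}$$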
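The closed form in (4.15) follows from rationalising $\sqrt{k} - \sqrt{k-1}$, and the same computation with a shift by $j$ gives $N\bigl(n(k) - n(k-j)\bigr) = j\sqrt{N}/(\sqrt{k} + \sqrt{k-j}) \le j\, n^{-1}(k)$. The following sketch checks this numerically for $j = 1$ (the function $\mu$ above) and $j = 2$ (the analogous function used for the term with two factors $q$); the general-$j$ helper names are illustrative and not notation from the paper.

```python
import math

def n(k, N):
    """The weight n(k) = sqrt(k / N)."""
    return math.sqrt(k / N)

def shifted_difference(k, N, j):
    """N * (n(k) - n(k - j)) for an integer shift 1 <= j <= k."""
    return N * (n(k, N) - n(k - j, N))

def closed_form(k, N, j):
    """j * sqrt(N) / (sqrt(k) + sqrt(k - j)), obtained by rationalising the difference."""
    return j * math.sqrt(N) / (math.sqrt(k) + math.sqrt(k - j))

N = 50
for j in (1, 2):
    for k in range(j, N + 1):
        # The two expressions agree up to rounding ...
        assert math.isclose(shifted_difference(k, N, j), closed_form(k, N, j), rel_tol=1e-9)
        # ... and are bounded by j / n(k), i.e. by n^{-1}(k) up to a constant.
        assert closed_form(k, N, j) <= j / n(k, N) + 1e-12
```

For $j = 1$ the bound holds with constant one, as stated in (4.15); for $j = 2$ a factor of two appears, which is harmless in estimates that only hold up to constants.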
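Thus,
$$|\text{(I)}| \le \frac{1}{N}\,\bigl|\langle\Psi, p_1 p_2\, W_1^\varphi\, \widehat{\mu}\, q_1 p_2\,\Psi\rangle\bigr| \le \frac{1}{N}\,\|W^\varphi\|_\infty\,\sqrt{\langle\Psi, \widehat{\mu}^2\, q_1\Psi\rangle} \le \frac{1}{N}\,\|W^\varphi\|_\infty\,\sqrt{\langle\Psi, \widehat{n}^{-2}\, q_1\Psi\rangle} \lesssim \frac{1}{N}\,\|\varphi\|_{L^2\cap L^\infty}^2,$$
by (3.13).
Term (II). Using Lemma 3.10 we find
$$2\,|\text{(II)}| = \bigl|\bigl\langle\Psi, q_1 p_2 \bigl[(N-1)W_{12} - N W_2^\varphi,\ \widehat{n}\bigr] q_1 q_2\,\Psi\bigr\rangle\bigr| \tag{4.16}$$
$$= \Bigl|\Bigl\langle\Psi, q_1 p_2 \Bigl(\frac{N-1}{N}\,W_{12} - W_2^\varphi\Bigr)\widehat{\mu}\, q_1 q_2\,\Psi\Bigr\rangle\Bigr| \tag{4.17}$$
$$\le \underbrace{\bigl|\langle\Psi, q_1 p_2\, W_{12}\, \widehat{\mu}\, q_1 q_2\,\Psi\rangle\bigr|}_{=:(a)} + \underbrace{\bigl|\langle\Psi, q_1 p_2\, W_2^\varphi\, \widehat{\mu}\, q_1 q_2\,\Psi\rangle\bigr|}_{=:(b)}. \tag{4.18}$$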
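One immediately finds
$$(b) \le \|W^\varphi\|_\infty\,\|q_1\Psi\|\,\sqrt{\langle\Psi, \widehat{\mu}^2\, q_1 q_2\Psi\rangle} \lesssim \|\varphi\|_{L^2\cap L^\infty}^2\,\beta.$$
In (a) we split
$$w = w^{(p)} + w^{(\infty)}, \qquad w^{(p)} \in L^p,\ w^{(\infty)} \in L^\infty,$$
with a resulting splitting $(a) \le (a)^{(p)} + (a)^{(\infty)}$. The easy part is
$$(a)^{(\infty)} \lesssim \|w^{(\infty)}\|_\infty\,\|q_1\Psi\|^2 \lesssim \beta.$$
In order to deal with $(a)^{(p)}$ we write $w^{(p)} = \nabla\cdot\xi$ as the divergence of a vector field $\xi$, exactly as in the proof of Lemma 4.6; see (4.9) and the remarks after it. We integrate by parts to find
$$(a)^{(p)} = \bigl|\langle\Psi, q_1 p_2\, (\nabla_1^\rho X_{12}^\rho)\, \widehat{\mu}\, q_1 q_2\Psi\rangle\bigr| \le \bigl|\langle\nabla_1^\rho q_1 p_2\Psi,\ X_{12}^\rho\, \widehat{\mu}\, q_1 q_2\Psi\rangle\bigr| + \bigl|\langle q_1 p_2\Psi,\ X_{12}^\rho\, \nabla_1^\rho\, \widehat{\mu}\, q_1 q_2\Psi\rangle\bigr|. \tag{4.19}$$
The first term of (4.19) is equal to
$$\begin{aligned}
\bigl|\langle X_{12}^\rho\, p_2\, \nabla_1^\rho q_1\Psi,\ \widehat{\mu}\, q_1 q_2\Psi\rangle\bigr| &\le \sqrt{\bigl\langle\nabla_1^\rho q_1\Psi,\ p_2\, X_{12}^\rho X_{12}^\sigma\, p_2\, \nabla_1^\sigma q_1\Psi\bigr\rangle}\ \sqrt{\langle\Psi, \widehat{\mu}^2\, q_1 q_2\Psi\rangle} \\
&\le \sqrt{\bigl\|\xi^2 * |\varphi|^2\bigr\|_\infty}\ \|\nabla_1 q_1\Psi\|\ \sqrt{\langle\Psi, \widehat{n}^{-2}\, q_1 q_2\Psi\rangle} \\
&\le \sqrt{\bigl\|\xi^2 * |\varphi|^2\bigr\|_\infty}\ \|\nabla_1 q_1\Psi\|\ \sqrt{\frac{N}{N-1}\,\langle\Psi, \widehat{n}^{2}\,\Psi\rangle} \\
&\lesssim \|\xi\|_q\,\|\varphi\|_{L^2\cap L^\infty}\,\|\nabla_1 q_1\Psi\|\,\sqrt{\beta} \le \|\nabla_1 q_1\Psi\|^2\,\|\varphi\|_{L^2\cap L^\infty} + \beta\,\|\varphi\|_{L^2\cap L^\infty},
\end{aligned}$$
where in the second step we used (4.15), in the third Lemma 3.9, and in the last (4.3), Young's inequality, and (4.10). The second term of (4.19) is equal to
$$\bigl|\langle q_1 p_2\Psi,\ X_{12}^\rho\,(p_1 + q_1)\,\nabla_1^\rho\, \widehat{\mu}\, q_1 q_2\Psi\rangle\bigr| \le \bigl|\langle q_1 p_2\Psi,\ X_{12}^\rho\, p_1\, \widehat{\tau_1\mu}\, \nabla_1^\rho q_1 q_2\Psi\rangle\bigr| + \bigl|\langle q_1 p_2\Psi,\ X_{12}^\rho\, q_1\, \widehat{\mu}\, \nabla_1^\rho q_1 q_2\Psi\rangle\bigr|, \tag{4.20}$$
where we used Lemma 3.10. We estimate the first term of (4.20). The second term is dealt with in exactly the same way. We find
$$\begin{aligned}
\bigl|\langle p_1\, X_{12}^\rho\, q_1 p_2\Psi,\ \widehat{\tau_1\mu}\, \nabla_1^\rho q_1 q_2\Psi\rangle\bigr| &\le \sqrt{\langle\Psi, q_1 p_2\, X_{12}^2\, p_2 q_1\Psi\rangle}\ \sqrt{\bigl\langle\nabla_1 q_1\Psi,\ \widehat{\tau_1\mu}^{\,2}\, q_2\, \nabla_1 q_1\Psi\bigr\rangle} \\
&\le \sqrt{\bigl\|\xi^2 * |\varphi|^2\bigr\|_\infty}\ \|q_1\Psi\|\ \sqrt{\bigl\langle\nabla_1 q_1\Psi,\ \widehat{n}^{-2}\, q_2\, \nabla_1 q_1\Psi\bigr\rangle} \\
&\lesssim \|\xi\|_q\,\|\varphi\|_{L^2\cap L^\infty}\,\sqrt{\alpha}\ \sqrt{\frac{1}{N-1}\,\Bigl\langle\nabla_1 q_1\Psi,\ \widehat{n}^{-2}\sum_{i=2}^N q_i\, \nabla_1 q_1\Psi\Bigr\rangle} \\
&\lesssim \|\varphi\|_{L^2\cap L^\infty}\,\sqrt{\beta}\ \sqrt{\frac{1}{N-1}\,\Bigl\langle\nabla_1 q_1\Psi,\ \widehat{n}^{-2}\sum_{i=1}^N q_i\, \nabla_1 q_1\Psi\Bigr\rangle} \\
&= \|\varphi\|_{L^2\cap L^\infty}\,\sqrt{\beta}\ \sqrt{\frac{N}{N-1}\,\bigl\langle\nabla_1 q_1\Psi,\ \widehat{n}^{-2}\,\widehat{n}^{2}\, \nabla_1 q_1\Psi\bigr\rangle} \\
&\lesssim \|\varphi\|_{L^2\cap L^\infty}\,\sqrt{\beta}\ \|\nabla_1 q_1\Psi\| \le \beta\,\|\varphi\|_{L^2\cap L^\infty} + \|\nabla_1 q_1\Psi\|^2\,\|\varphi\|_{L^2\cap L^\infty}.
\end{aligned}$$
In summary, we have proven that
$$|\text{(II)}| \lesssim \beta\,\|\varphi\|_{L^2\cap L^\infty} + \|\nabla_1 q_1\Psi\|^2\,\|\varphi\|_{L^2\cap L^\infty}.$$
Term (III). Using Lemma 3.10 we find
$$2\,|\text{(III)}| = (N-1)\,\bigl|\langle\Psi, p_1 p_2 \bigl[W_{12}, \widehat{n}\bigr] q_1 q_2\,\Psi\rangle\bigr| = (N-1)\,\bigl|\langle\Psi, p_1 p_2\, W_{12}\,\bigl(\widehat{n} - \widehat{\tau_{-2} n}\bigr)\, q_1 q_2\,\Psi\rangle\bigr|.$$
Defining
$$\nu(k) := N\bigl(n(k) - (\tau_{-2} n)(k)\bigr) = \frac{2\sqrt{N}}{\sqrt{k} + \sqrt{k-2}} \lesssim n^{-1}(k), \qquad k = 2, \dots, N, \tag{4.21}$$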
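we have
$$2\,|\text{(III)}| \le \bigl|\langle\Psi, p_1 p_2\, W_{12}\, \widehat{\nu}\, q_1 q_2\,\Psi\rangle\bigr|.$$
As usual we start by splitting
$$w = w^{(p)} + w^{(\infty)}, \qquad w^{(p)} \in L^p,\ w^{(\infty)} \in L^\infty,$$
with the induced splitting $\text{(III)} = \text{(III)}^{(p)} + \text{(III)}^{(\infty)}$. Thus, using Lemma 3.10, we find
$$\begin{aligned}
2\,\bigl|\text{(III)}^{(\infty)}\bigr| &= \bigl|\langle\Psi, p_1 p_2\, W_{12}^{(\infty)}\, \widehat{n}^{1/2}\,\widehat{n}^{-1/2}\, \widehat{\nu}\, q_1 q_2\,\Psi\rangle\bigr| = \bigl|\langle\Psi, p_1 p_2\, \widehat{\tau_2 n}^{1/2}\, W_{12}^{(\infty)}\, \widehat{n}^{-1/2}\, \widehat{\nu}\, q_1 q_2\,\Psi\rangle\bigr| \\
&\le \|w^{(\infty)}\|_\infty\ \sqrt{\langle\Psi, \widehat{\tau_2 n}\,\Psi\rangle}\ \sqrt{\langle\Psi, \widehat{n}^{-1}\,\widehat{\nu}^2\, q_1 q_2\Psi\rangle} \lesssim \sqrt{\beta + \sqrt{\frac{2}{N}}}\ \sqrt{\langle\Psi, \widehat{n}^{-3}\, q_1 q_2\Psi\rangle} \\
&\le \sqrt{\beta + \sqrt{\frac{2}{N}}}\ \sqrt{\frac{N}{N-1}\,\beta} \lesssim \beta + \frac{1}{\sqrt{N}},
\end{aligned}$$
where in the fifth step we used Lemma 3.9. In order to estimate $\text{(III)}^{(p)}$ we introduce a splitting of $w^{(p)}$ into "singular" and "regular" parts,
$$w^{(p)} = w^{(p,1)} + w^{(p,2)} := w^{(p)}\,\mathbf{1}_{\{|w^{(p)}| > a\}} + w^{(p)}\,\mathbf{1}_{\{|w^{(p)}| \le a\}}, \tag{4.22}$$
where $a$ is a positive ($N$-dependent) constant we choose later. For future reference we record the estimates
$$\|w^{(p,1)}\|_{p_0} \le a^{1 - p/p_0}\,\|w^{(p)}\|_p^{p/p_0}, \tag{4.23a}$$
$$\|w^{(p,2)}\|_2 \le a^{1 - p/2}\,\|w^{(p)}\|_p^{p/2}. \tag{4.23b}$$
The proof of (4.23) is elementary; for instance (4.23a) follows from
$$\|w^{(p,1)}\|_{p_0}^{p_0} = \int dx\, |w^{(p)}|^{p}\,|w^{(p)}|^{p_0 - p}\,\mathbf{1}_{\{|w^{(p)}| > a\}} \le a^{p_0 - p}\int dx\, |w^{(p)}|^{p}\,\mathbf{1}_{\{|w^{(p)}| > a\}} \le a^{p_0 - p}\int dx\, |w^{(p)}|^{p}.$$
Let us start with $\text{(III)}^{(p,1)}$. As in (4.9), we use the representation $w^{(p,1)} = \nabla\cdot\xi$. Then (4.10) and (4.23a) imply that
$$\|\xi\|_2 \lesssim \|w^{(p,1)}\|_{p_0} \lesssim a^{1 - p/p_0}. \tag{4.24}$$
Integrating by parts, we find
$$\begin{aligned}
2\,\bigl|\text{(III)}^{(p,1)}\bigr| &= \bigl|\langle\Psi, p_1 p_2\, W_{12}^{(p,1)}\, \widehat{\nu}\, q_1 q_2\,\Psi\rangle\bigr| = \bigl|\langle\Psi, p_1 p_2\, (\nabla_1^\rho X_{12}^\rho)\, \widehat{\nu}\, q_1 q_2\,\Psi\rangle\bigr| \\
&\le \bigl|\langle\nabla_1^\rho p_1 p_2\Psi,\ X_{12}^\rho\, \widehat{\nu}\, q_1 q_2\Psi\rangle\bigr| + \bigl|\langle p_1 p_2\Psi,\ X_{12}^\rho\, \nabla_1^\rho\, \widehat{\nu}\, q_1 q_2\Psi\rangle\bigr|.
\end{aligned} \tag{4.25}$$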
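Using $\|\nabla p\| = \|\nabla\varphi\|$ and Lemma 3.9 we find that the first term of (4.25) is bounded by
$$
\sqrt{\langle\nabla_1^\rho p_1\Psi, p_2 X_{12}^\rho X_{12}^\sigma p_2 \nabla_1^\sigma p_1\Psi\rangle}\,\sqrt{\langle\Psi, \hat\nu^2 q_1 q_2\Psi\rangle}
\le \|\nabla p\|\,\|\varphi\|_\infty\,\|\xi\|_2\,\sqrt{\alpha}
\lesssim \|\nabla\varphi\|\,\|\varphi\|_\infty\, a^{1 - p/p_0}\sqrt{\beta}
\le \|\nabla\varphi\|\,\|\varphi\|_\infty\bigl(\beta + a^{2 - 2p/p_0}\bigr),
$$
where in the second step we used the estimate (4.24). Next, using Lemma 3.10, we find that the second term of (4.25) is equal to
$$
\bigl|\langle p_1 p_2\Psi, X_{12}^\rho\,(p_1 + q_1)\,\nabla_1^\rho\,\hat\nu\, q_1 q_2\Psi\rangle\bigr|
\le \bigl|\langle p_1 p_2\Psi, X_{12}^\rho\, p_1\,\tau_1\hat\nu\,\nabla_1^\rho q_1 q_2\Psi\rangle\bigr| + \bigl|\langle p_1 p_2\Psi, X_{12}^\rho\, q_1\,\hat\nu\,\nabla_1^\rho q_1 q_2\Psi\rangle\bigr|.
$$
We estimate the first term (the second is dealt with in exactly the same way):
$$
\begin{aligned}
\bigl|\langle p_1 p_2\Psi, X_{12}^\rho\, p_1\,\tau_1\hat\nu\,\nabla_1^\rho q_1 q_2\Psi\rangle\bigr|
&\le \sqrt{\langle\Psi, p_1 p_2 X_{12}^2 p_1 p_2\Psi\rangle}\,\sqrt{\langle\nabla_1 q_1\Psi, (\tau_1\hat\nu)^2 q_2\nabla_1 q_1\Psi\rangle}\\
&\lesssim \sqrt{\|p_2 X_{12}^2 p_2\|}\,\sqrt{\frac{1}{N-1}\sum_{i=2}^N\langle\nabla_1 q_1\Psi, \hat n^{-2} q_i\nabla_1 q_1\Psi\rangle}\\
&\le \|\xi\|_2\,\|\varphi\|_\infty\,\sqrt{\frac{1}{N-1}\sum_{i=1}^N\langle\nabla_1 q_1\Psi, \hat n^{-2} q_i\nabla_1 q_1\Psi\rangle}\\
&\lesssim a^{1 - p/p_0}\,\|\varphi\|_\infty\,\sqrt{\frac{N}{N-1}}\,\sqrt{\langle\nabla_1 q_1\Psi, \nabla_1 q_1\Psi\rangle}\\
&\lesssim \|\varphi\|_\infty\bigl(a^{2 - 2p/p_0} + \|\nabla_1 q_1\Psi\|^2\bigr).
\end{aligned}
$$
Summarizing,
$$\bigl|(\mathrm{III})^{(p,1)}\bigr| \lesssim \|\varphi\|_\infty\bigl(\beta\,\|\varphi\|_{X_1} + \|\nabla_1 q_1\Psi\|^2 + a^{2 - 2p/p_0}\,\|\varphi\|_{X_1}\bigr).$$
Finally, we estimate
$$2\bigl|(\mathrm{III})^{(p,2)}\bigr| = \bigl|\langle\Psi, p_1 p_2 W_{12}^{(p,2)}\,\hat\nu\, q_1 q_2\Psi\rangle\bigr| = \bigl|\langle\Psi, p_1 p_2 W_{12}^{(p,2)}\,\hat\nu\,\bigl(\hat\chi^{(1)} + \hat\chi^{(2)}\bigr)\, q_1 q_2\Psi\rangle\bigr|, \tag{4.26}$$
where
$$1 = \chi^{(1)} + \chi^{(2)}, \qquad \chi^{(1)}, \chi^{(2)} \in \{0, 1\}^{\{0, \dots, N\}},$$
is some partition of the unity to be chosen later. The need for this partitioning will soon become clear. In order to bound the term with $\chi^{(1)}$, we note that the operator norm of $p_1 p_2 W_{12}^{(p,2)} q_1 q_2$ on the full space $L^2(\mathbb R^{dN})$ is much larger than on its symmetric subspace. Thus, as a first step, we symmetrize the operator $p_1 p_2 W_{12}^{(p,2)} q_1 q_2$ in coordinate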
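2. We get the bound
$$
\begin{aligned}
\bigl|\langle\Psi, p_1 p_2 W_{12}^{(p,2)}\,\hat\nu\,\hat\chi^{(1)} q_1 q_2\Psi\rangle\bigr|
&= \frac{1}{N-1}\,\Bigl|\Bigl\langle\Psi, \sum_{i=2}^N p_1 p_i W_{1i}^{(p,2)} q_i q_1\,\hat\chi^{(1)}\,\hat\nu\, q_1\Psi\Bigr\rangle\Bigr|\\
&\le \frac{1}{N-1}\,\|\hat\nu\, q_1\Psi\|\,\sqrt{\sum_{i,j=2}^N\bigl\langle\Psi, p_1 p_i W_{1i}^{(p,2)} q_1 q_i\,\hat\chi^{(1)}\, q_1 q_j W_{1j}^{(p,2)} p_j p_1\Psi\bigr\rangle}.
\end{aligned}
$$
Using $\|\hat\nu\, q_1\Psi\| \lesssim \|\hat n^{-1} q_1\Psi\| \le 1$ we find
$$\bigl|\langle\Psi, p_1 p_2 W_{12}^{(p,2)}\,\hat\nu\,\hat\chi^{(1)} q_1 q_2\Psi\rangle\bigr| \lesssim \frac{1}{N-1}\sqrt{A + B}, \tag{4.27}$$
where
$$A := \sum_{2 \le i \ne j \le N}\bigl\langle\Psi, p_1 p_i W_{1i}^{(p,2)} q_1 q_i\,\hat\chi^{(1)}\, q_j W_{1j}^{(p,2)} p_j p_1\Psi\bigr\rangle, \qquad B := \sum_{i=2}^N\bigl\langle\Psi, p_1 p_i W_{1i}^{(p,2)} q_1 q_i\,\hat\chi^{(1)}\, W_{1i}^{(p,2)} p_i p_1\Psi\bigr\rangle.$$
The easy part is
$$B \le \sum_{i=2}^N\bigl\langle\Psi, p_1 p_i\bigl(W_{1i}^{(p,2)}\bigr)^2 p_i p_1\Psi\bigr\rangle \le \sum_{i=2}^N\bigl\|(w^{(p,2)})^2 * |\varphi|^2\bigr\|_\infty\,\langle\Psi, p_1 p_i\Psi\rangle \le (N-1)\,\|\varphi\|_\infty^2\,\|w^{(p,2)}\|_2^2 \lesssim N a^{2-p}\,\|\varphi\|_\infty^2.$$
Let us therefore concentrate on
$$
\begin{aligned}
A &= \sum_{2 \le i \ne j \le N}\bigl\langle\Psi, p_1 p_i W_{1i}^{(p,2)} q_1 q_i\,\hat\chi^{(1)}\hat\chi^{(1)}\, q_j W_{1j}^{(p,2)} p_j p_1\Psi\bigr\rangle\\
&= \sum_{2 \le i \ne j \le N}\bigl\langle\Psi, p_1 p_i q_j\,(\tau_2\hat\chi^{(1)})\, W_{1i}^{(p,2)} q_1 W_{1j}^{(p,2)}\,(\tau_2\hat\chi^{(1)})\, q_i p_j p_1\Psi\bigr\rangle\\
&= A_1 + A_2,
\end{aligned}
$$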
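with $A = A_1 + A_2$ arising from the splitting $q_1 = 1 - p_1$. We start with
$$
\begin{aligned}
|A_1| &\le \sum_{2 \le i \ne j \le N}\bigl|\bigl\langle\Psi, p_1 p_i q_j\,(\tau_2\hat\chi^{(1)})\, W_{1i}^{(p,2)} W_{1j}^{(p,2)}\,(\tau_2\hat\chi^{(1)})\, q_i p_j p_1\Psi\bigr\rangle\bigr|\\
&= \sum_{2 \le i \ne j \le N}\Bigl|\Bigl\langle\Psi, p_1 p_i q_j\,(\tau_2\hat\chi^{(1)})\,\sqrt{W_{1i}^{(p,2)}}\sqrt{W_{1j}^{(p,2)}}\sqrt{W_{1i}^{(p,2)}}\sqrt{W_{1j}^{(p,2)}}\,(\tau_2\hat\chi^{(1)})\, q_i p_j p_1\Psi\Bigr\rangle\Bigr|\\
&\le \sum_{2 \le i \ne j \le N}\bigl\langle\Psi, (\tau_2\hat\chi^{(1)})\, q_j p_1 p_i W_{1i}^{(p,2)} W_{1j}^{(p,2)} p_1 p_i q_j\,(\tau_2\hat\chi^{(1)})\Psi\bigr\rangle,
\end{aligned}
$$
by Cauchy-Schwarz and symmetry of $\Psi$. Here $\sqrt{\cdot}$ is any complex square root. In order to estimate this we claim that, for $i \ne j$,
$$\bigl\|p_1 p_i W_{1i}^{(p,2)} W_{1j}^{(p,2)} p_1 p_i\bigr\| \le \bigl\|w^{(p,2)} * |\varphi|^2\bigr\|_\infty^2. \tag{4.28}$$
Indeed, by (3.17), we have
$$p_1 p_i W_{1i}^{(p,2)} W_{1j}^{(p,2)} p_1 p_i = p_1 p_i W_{1i}^{(p,2)} p_i W_{1j}^{(p,2)} p_1 = p_1 p_i\bigl(w^{(p,2)} * |\varphi|^2\bigr)_1 W_{1j}^{(p,2)} p_1.$$
The operator $p_1\bigl(w^{(p,2)} * |\varphi|^2\bigr)_1 W_{1j}^{(p,2)} p_1$ is equal to $f_j\, p_1$, where
$$f(x_j) = \int dx_1\,\overline{\varphi(x_1)}\,\bigl(w^{(p,2)} * |\varphi|^2\bigr)(x_1)\, w^{(p,2)}(x_1 - x_j)\,\varphi(x_1).$$
Thus, $\|f\|_\infty \le \|w^{(p,2)} * |\varphi|^2\|_\infty^2$, from which (4.28) follows immediately. Using (4.28), we get
$$
\begin{aligned}
|A_1| &\le \sum_{2 \le i \ne j \le N}\bigl\|w^{(p,2)} * |\varphi|^2\bigr\|_\infty^2\,\bigl\|(\tau_2\hat\chi^{(1)})\, q_1\Psi\bigr\|^2\\
&\lesssim N^2\,\|w^{(p)}\|_p^2\,\|\varphi\|_{L^2\cap L^\infty}^4\,\bigl\langle\Psi, (\tau_2\hat\chi^{(1)})\, q_1\Psi\bigr\rangle\\
&\lesssim N^2\,\|\varphi\|_{L^2\cap L^\infty}^4\,\bigl\langle\Psi, (\tau_2\hat\chi^{(1)})\,\hat n^2\,\Psi\bigr\rangle.
\end{aligned}
$$
Now let us choose $\chi^{(1)}(k) := \mathbf 1_{\{k \le N^{1-\delta}\}}$ for some $\delta \in (0, 1)$. Then $(\tau_2\chi^{(1)})\, n^2 \le N^{-\delta}$ implies
$$|A_1| \lesssim \|\varphi\|_{L^2\cap L^\infty}^4\, N^{2-\delta}. \tag{4.29}$$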
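Similarly, we find
$$
\begin{aligned}
|A_2| &\le \sum_{2 \le i \ne j \le N}\bigl|\bigl\langle\Psi, q_j\,(\tau_2\hat\chi^{(1)})\, p_i p_1 W_{1i}^{(p,2)} p_1 W_{1j}^{(p,2)} p_1 p_j\,(\tau_2\hat\chi^{(1)})\, q_i\Psi\bigr\rangle\bigr|\\
&\le \sum_{2 \le i \ne j \le N}\bigl\|w^{(p,2)} * |\varphi|^2\bigr\|_\infty^2\,\bigl\langle\Psi, (\tau_2\hat\chi^{(1)})\, q_1\Psi\bigr\rangle\\
&\lesssim N^2\,\|\varphi\|_{L^2\cap L^\infty}^4\, N^{-\delta} = \|\varphi\|_{L^2\cap L^\infty}^4\, N^{2-\delta}.
\end{aligned}
$$
Thus we have proven $|A| \lesssim \|\varphi\|_{L^2\cap L^\infty}^4\, N^{2-\delta}$. Going back to (4.27), we see that
$$\bigl|\langle\Psi, p_1 p_2 W_{12}^{(p,2)}\,\hat\nu\,\hat\chi^{(1)} q_1 q_2\Psi\rangle\bigr| \lesssim \|\varphi\|_{L^2\cap L^\infty}^2\, N^{-\delta/2} + \|\varphi\|_\infty\, N^{-1/2} a^{1 - p/2}.$$
What remains is to estimate the term of $(\mathrm{III})^{(p,2)}$ containing $\chi^{(2)}$,
$$
\begin{aligned}
\bigl|\langle\Psi, p_1 p_2 W_{12}^{(p,2)}\,\hat\nu\,\hat\chi^{(2)} q_1 q_2\Psi\rangle\bigr|
&= \frac{1}{N-1}\,\Bigl|\Bigl\langle\Psi, \sum_{i=2}^N p_1 p_i W_{1i}^{(p,2)} q_i q_1\,\hat\chi^{(2)}\,\hat\nu^{1/2}\,\hat\nu^{1/2} q_1\Psi\Bigr\rangle\Bigr|\\
&\le \frac{1}{N-1}\,\|\hat\nu^{1/2} q_1\Psi\|\,\sqrt{\sum_{i,j=2}^N\bigl\langle\Psi, p_1 p_i W_{1i}^{(p,2)} q_1 q_i\,\hat\chi^{(2)}\,\hat\nu\, q_1 q_j W_{1j}^{(p,2)} p_j p_1\Psi\bigr\rangle}.
\end{aligned}
$$
Using
$$\|\hat\nu^{1/2} q_1\Psi\| \lesssim \sqrt{\langle\Psi, \hat n^{-1}\hat n^2\,\Psi\rangle} = \sqrt{\beta}$$
we find
$$\bigl|\langle\Psi, p_1 p_2 W_{12}^{(p,2)}\,\hat\nu\,\hat\chi^{(2)} q_1 q_2\Psi\rangle\bigr| \lesssim \frac{\sqrt{\beta}}{N-1}\sqrt{A + B}, \tag{4.30}$$
where
$$A := \sum_{2 \le i \ne j \le N}\bigl\langle\Psi, p_1 p_i W_{1i}^{(p,2)} q_1 q_i\,\hat\chi^{(2)}\,\hat\nu\, q_j W_{1j}^{(p,2)} p_j p_1\Psi\bigr\rangle, \qquad B := \sum_{i=2}^N\bigl\langle\Psi, p_1 p_i W_{1i}^{(p,2)} q_1 q_i\,\hat\chi^{(2)}\,\hat\nu\, W_{1i}^{(p,2)} p_i p_1\Psi\bigr\rangle.$$
Since $\chi^{(2)}(k) = \mathbf 1_{\{k > N^{1-\delta}\}}$ we find
$$\chi^{(2)}\,\nu \lesssim \chi^{(2)}\, n^{-1} \le N^{\delta/2}.$$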
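The two cutoff bounds used above, $(\tau_2\chi^{(1)})\,n^2 \le N^{-\delta}$ and $\chi^{(2)}\,n^{-1} \le N^{\delta/2}$, are elementary and can be confirmed by brute force. The following sketch assumes $n(k) = \sqrt{k/N}$ and the shift convention $(\tau_2 f)(k) = f(k+2)$; both conventions, the function name `check_cutoffs`, and the sample values $N = 10^4$, $\delta = 1/2$ are assumptions made for illustration.

```python
import math

def check_cutoffs(N, delta):
    """Return the maxima of (tau_2 chi1)(k) * n(k)^2 and chi2(k) / n(k) over k.

    Assumed conventions: n(k) = sqrt(k / N), (tau_2 f)(k) = f(k + 2),
    chi1(k) = 1 if k <= N^(1 - delta) else 0, and chi2 = 1 - chi1.
    """
    threshold = N ** (1 - delta)
    worst_chi1 = 0.0
    worst_chi2 = 0.0
    for k in range(0, N + 1):
        if k + 2 <= threshold:                 # (tau_2 chi1)(k) = 1
            worst_chi1 = max(worst_chi1, k / N)
        if k > threshold:                      # chi2(k) = 1, so k >= 1
            worst_chi2 = max(worst_chi2, math.sqrt(N / k))
    return worst_chi1, worst_chi2

N, delta = 10_000, 0.5
worst_chi1, worst_chi2 = check_cutoffs(N, delta)
print(worst_chi1 <= N ** (-delta), worst_chi2 <= N ** (delta / 2))
```

The first cutoff confines the sum to sectors with few excitations, where $n^2$ is small; the second keeps only sectors where $n^{-1}$ is bounded, which is why the two pieces are estimated separately.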
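Thus, $\|q_1 q_i\,\hat\chi^{(2)}\hat\nu\| \lesssim N^{\delta/2}$ and we get
$$B \lesssim N^{\delta/2}\sum_{i=2}^N\bigl\langle\Psi, p_1 p_i\bigl(W_{1i}^{(p,2)}\bigr)^2 p_i p_1\Psi\bigr\rangle \le N^{1+\delta/2}\,\bigl\|(w^{(p,2)})^2 * |\varphi|^2\bigr\|_\infty \le N^{1+\delta/2}\,\|w^{(p,2)}\|_2^2\,\|\varphi\|_\infty^2 \le N^{1+\delta/2}\, a^{2-p}\,\|\varphi\|_\infty^2,$$
by (4.23b). Next, using Lemma 3.10, we find
$$
\begin{aligned}
A &= \sum_{2 \le i \ne j \le N}\bigl\langle\Psi, p_1 p_i q_j W_{1i}^{(p,2)}\,\hat\chi^{(2)}\hat\nu^{1/2}\, q_1\,\hat\chi^{(2)}\hat\nu^{1/2}\, W_{1j}^{(p,2)} q_i p_j p_1\Psi\bigr\rangle\\
&= \sum_{2 \le i \ne j \le N}\bigl\langle\Psi, p_1 p_i q_j\,(\tau_2\hat\chi^{(2)})(\tau_2\hat\nu^{1/2})\, W_{1i}^{(p,2)} q_1 W_{1j}^{(p,2)}\,(\tau_2\hat\chi^{(2)})(\tau_2\hat\nu^{1/2})\, q_i p_j p_1\Psi\bigr\rangle\\
&= A_1 + A_2,
\end{aligned}
$$
where, as above, the splitting $A = A_1 + A_2$ arises from writing $q_1 = 1 - p_1$. Thus,
$$
\begin{aligned}
|A_1| &\le \sum_{2 \le i \ne j \le N}\bigl|\bigl\langle\Psi, p_1 p_i q_j\,(\tau_2\hat\chi^{(2)})(\tau_2\hat\nu^{1/2})\, W_{1i}^{(p,2)} W_{1j}^{(p,2)}\,(\tau_2\hat\chi^{(2)})(\tau_2\hat\nu^{1/2})\, q_i p_j p_1\Psi\bigr\rangle\bigr|\\
&= \sum_{2 \le i \ne j \le N}\Bigl|\Bigl\langle\Psi, p_1 p_i q_j\,(\tau_2\hat\chi^{(2)})(\tau_2\hat\nu^{1/2})\,\sqrt{W_{1i}^{(p,2)}}\sqrt{W_{1j}^{(p,2)}}\sqrt{W_{1i}^{(p,2)}}\sqrt{W_{1j}^{(p,2)}}\,(\tau_2\hat\chi^{(2)})(\tau_2\hat\nu^{1/2})\, q_i p_j p_1\Psi\Bigr\rangle\Bigr|\\
&\le \sum_{2 \le i \ne j \le N}\bigl\langle\Psi, (\tau_2\hat\chi^{(2)})(\tau_2\hat\nu^{1/2})\, q_j p_1 p_i W_{1i}^{(p,2)} W_{1j}^{(p,2)} p_i p_1 q_j\,(\tau_2\hat\chi^{(2)})(\tau_2\hat\nu^{1/2})\Psi\bigr\rangle,
\end{aligned}
$$
by Cauchy-Schwarz and symmetry of $\Psi$. Using (4.28) we get
$$|A_1| \le N^2\,\bigl\|w^{(p,2)} * |\varphi|^2\bigr\|_\infty^2\,\langle\Psi, (\tau_2\hat\nu)\, q_1\Psi\rangle \lesssim N^2\,\|w^{(p,2)}\|_p^2\,\|\varphi\|_{L^2\cap L^\infty}^4\,\langle\Psi, \hat n\,\Psi\rangle \lesssim N^2\,\|\varphi\|_{L^2\cap L^\infty}^4\,\beta.$$
Similarly,
$$
\begin{aligned}
|A_2| &\le \sum_{2 \le i \ne j \le N}\bigl|\bigl\langle\Psi, p_i q_j\,(\tau_2\hat\chi^{(2)})(\tau_2\hat\nu^{1/2})\, p_1 W_{1i}^{(p,2)} p_1 W_{1j}^{(p,2)} p_1\,(\tau_2\hat\chi^{(2)})(\tau_2\hat\nu^{1/2})\, q_i p_j\Psi\bigr\rangle\bigr|\\
&\le \sum_{2 \le i \ne j \le N}\bigl\|w^{(p,2)} * |\varphi|^2\bigr\|_\infty^2\,\langle\Psi, (\tau_2\hat\nu)\, q_1\Psi\rangle\\
&\lesssim N^2\,\|w^{(p)}\|_p^2\,\|\varphi\|_{L^2\cap L^\infty}^4\,\langle\Psi, \hat n\,\Psi\rangle \lesssim N^2\,\|\varphi\|_{L^2\cap L^\infty}^4\,\beta.
\end{aligned}
$$
Plugging all this back into (4.30), we find that
$$\bigl|\langle\Psi, p_1 p_2 W_{12}^{(p,2)}\,\hat\nu\,\hat\chi^{(2)} q_1 q_2\Psi\rangle\bigr| \lesssim \beta\bigl(\|\varphi\|_{L^2\cap L^\infty}^2 + \|\varphi\|_\infty\bigr) + \|\varphi\|_\infty\, a^{2-p} N^{\delta/2 - 1}.$$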
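Summarizing:
$$\bigl|(\mathrm{III})^{(p,2)}\bigr| \lesssim \bigl(1 + \|\varphi\|_{L^2\cap L^\infty}^2\bigr)\bigl(\beta + a^{2-p} N^{\delta/2 - 1} + N^{-\delta/2} + N^{-1/2} a^{1 - p/2}\bigr),$$
from which we deduce
$$\bigl|(\mathrm{III})^{(p)}\bigr| \lesssim \|\varphi\|_\infty\,\|\nabla_1 q_1\Psi\|^2 + \bigl(1 + \|\varphi\|_{X_1\cap L^\infty}\bigr)\bigl(\beta + a^{2-p} N^{\delta/2 - 1} + N^{-\delta/2} + N^{-1/2} a^{1 - p/2} + a^{2 - 2p/p_0}\bigr).$$
Let us set $a \equiv a_N = N^\zeta$ and optimize in $\delta$ and $\zeta$. This yields the relations
$$\zeta(2 - p) + \delta = 1, \qquad -\frac{\delta}{2} = 2\zeta\Bigl(1 - \frac{p}{p_0}\Bigr),$$
which imply
$$\frac{\delta}{2} = \frac{p/p_0 - 1}{2p/p_0 - p/2 - 1},$$
with $\delta \le 1$. Thus,
$$\bigl|(\mathrm{III})^{(p)}\bigr| \lesssim \|\varphi\|_\infty\,\|\nabla_1 q_1\Psi\|^2 + \bigl(1 + \|\varphi\|_{X_1\cap L^\infty}\bigr)\bigl(\beta + N^{-\eta}\bigr),$$
where $\eta = \delta/2$ satisfies (4.4).

Conclusion of the proof. We have shown that
$$\dot\beta \lesssim \|\varphi\|_{L^2\cap L^\infty}\,\|\nabla_1 q_1\Psi\|^2 + \bigl(1 + \|\varphi\|_{X_1\cap L^\infty}\bigr)\bigl(\beta + N^{-\eta}\bigr).$$
Using Lemma 4.6 we find
$$\dot\beta \lesssim \bigl(1 + \|\varphi\|_{X_1^2\cap L^\infty}^3\bigr)\Bigl(\beta + E^\Psi - E^\varphi + \frac{1}{N^\eta}\Bigr). \tag{4.31}$$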
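The optimization in $\delta$ and $\zeta$ carried out above can be verified with exact rational arithmetic. The sketch below solves the two balance relations and confirms that, with $a = N^\zeta$, all four error terms decay at the common rate $N^{-\delta/2}$. The sample exponents $p = 3/2$, $p_0 = 6/5$ and the function names are illustrative assumptions, not values fixed by the text.

```python
from fractions import Fraction

def optimal_exponents(p, p0):
    """Solve zeta*(2-p) + delta = 1 and -delta/2 = 2*zeta*(1 - p/p0)."""
    p, p0 = Fraction(p), Fraction(p0)
    r = p / p0
    delta = 4 * (r - 1) / (4 * r - p - 2)
    zeta = delta / (4 * (r - 1))
    return delta, zeta

def error_exponents(p, p0, delta, zeta):
    """Powers of N carried by the four error terms once a = N**zeta."""
    p, p0 = Fraction(p), Fraction(p0)
    return [
        zeta * (2 - p) + delta / 2 - 1,        # a^(2-p) N^(delta/2 - 1)
        -delta / 2,                            # N^(-delta/2)
        Fraction(-1, 2) + zeta * (1 - p / 2),  # N^(-1/2) a^(1 - p/2)
        zeta * (2 - 2 * p / p0),               # a^(2 - 2p/p0)
    ]

p, p0 = Fraction(3, 2), Fraction(6, 5)   # hypothetical sample exponents
delta, zeta = optimal_exponents(p, p0)
eta = (p / p0 - 1) / (2 * p / p0 - p / 2 - 1)
print(delta, zeta, eta, error_exponents(p, p0, delta, zeta))
```

For these sample values one obtains $\delta = \zeta = 2/3$ and $\eta = 1/3$, and every error term scales like $N^{-1/3}$.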
The claim then follows from the Grönwall estimate (3.3).

4.4. A remark on time-dependent external potentials. Theorem 4.1 can be extended to time-dependent external potentials $h(t)$ without much additional effort. The only complication is that energy is no longer conserved. We overcome this problem by observing that, while the energies $E^\Psi(t)$ and $E^\varphi(t)$ exhibit large variations in $t$, their difference remains small. In the following we estimate the quantity $E^\Psi(t) - E^\varphi(t)$ by controlling its time derivative. We need the following assumptions, which replace Assumptions (B1) – (B3).

(B1') The Hamiltonian $h(t)$ is self-adjoint and bounded from below. We assume that there is an operator $h_0 \ge 0$ such that $0 \le h(t) \le h_0$ for all $t$. We define the Hilbert space $X_N = Q\bigl(\sum_i (h_0)_i\bigr)$ as in (A1), and the space $X_1^2 = Q(h_0^2)$ as in (B5) using $h_0$. We also assume that there are time-independent constants $\kappa_1, \kappa_2 > 0$ such that $-\Delta \le \kappa_1 h(t) + \kappa_2$ for all $t$.
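We make the following assumptions on the differentiability of $h(t)$. The map $t \mapsto \langle\psi, h(t)\psi\rangle$ is continuously differentiable for all $\psi \in X_1$, with derivative $\langle\psi, \dot h(t)\psi\rangle$ for some self-adjoint operator $\dot h(t)$. Moreover, we assume that the quantities
$$\langle\varphi(t), \dot h(t)^2\varphi(t)\rangle, \qquad \bigl\|(1 + h(t))^{-1/2}\,\dot h(t)\,(1 + h(t))^{-1/2}\bigr\|$$
are continuous and finite for all $t$.

(B2') The Hamiltonian $H_N(t)$ is self-adjoint and bounded from below. We assume that $Q(H_N(t)) \subset X_N$ for all $t$. We also assume that the $N$-body propagator $U_N(t, s)$, defined by
$$i\partial_t U_N(t, s) = H_N(t)\,U_N(t, s), \qquad U_N(s, s) = 1,$$
exists and satisfies $U_N(t, 0)\Psi_{N,0} \in Q(H_N(t))$ for all $t$.

(B3') There is a time-independent constant $\kappa_3 \in (0, 1)$ such that $0 \le (1 - \kappa_3)(h_1(t) + h_2(t)) + W_{12}$ for all $t$.

Theorem 4.7. Assume that Assumptions (B1') – (B3'), (B4), and (B5) hold. Then there is a continuous nonnegative function $\phi$, independent of $N$ and $\Psi_{N,0}$, such that
$$\beta_N(t) \le \phi(t)\Bigl(\beta_N(0) + E_N^\Psi(0) - E^\varphi(0) + \frac{1}{N^\eta}\Bigr),$$
with $\eta$ defined in (4.4).

Proof. We start by deriving an upper bound on the energy difference $\mathcal E(t) := E^\Psi(t) - E^\varphi(t)$. Assumptions (B1') and (B2') and the fundamental theorem of calculus imply
$$\mathcal E(t) = \mathcal E(0) + \int_0^t ds\,\underbrace{\Bigl(\langle\Psi(s), \dot h_1(s)\Psi(s)\rangle - \langle\varphi(s), \dot h(s)\varphi(s)\rangle\Bigr)}_{=:\,G(s)}.$$
By inserting $1 = p_1(s) + q_1(s)$ on both sides of $\dot h_1(s)$ we get (omitting the time argument $s$)
$$G = \langle\Psi, p_1\dot h_1 p_1\Psi\rangle - \langle\varphi, \dot h\varphi\rangle + 2\,\mathrm{Re}\,\langle\Psi, p_1\dot h_1 q_1\Psi\rangle + \langle\Psi, q_1\dot h_1 q_1\Psi\rangle. \tag{4.32}$$
The first two terms of (4.32) are equal to
$$\bigl(\langle\Psi, p_1\Psi\rangle - 1\bigr)\langle\varphi, \dot h\varphi\rangle = -\alpha\,\langle\varphi, \dot h\varphi\rangle \le \beta\,|\langle\varphi, \dot h\varphi\rangle|.$$
The third term of (4.32) is bounded, using Lemmas 3.9 and 3.10, by
$$
\begin{aligned}
2\bigl|\langle\Psi, p_1\dot h_1\,\hat n^{1/2}\hat n^{-1/2} q_1\Psi\rangle\bigr|
&= 2\bigl|\langle\dot h_1 p_1\,(\tau_1\hat n)^{1/2}\Psi, \hat n^{-1/2} q_1\Psi\rangle\bigr|\\
&\le 2\sqrt{\langle\Psi, (\tau_1\hat n)^{1/2} p_1\dot h_1^2 p_1 (\tau_1\hat n)^{1/2}\Psi\rangle}\;\|\hat n^{-1/2} q_1\Psi\|\\
&\lesssim \sqrt{|\langle\varphi, \dot h^2\varphi\rangle|}\,\sqrt{\langle\Psi, \tau_1\hat n\,\Psi\rangle}\,\sqrt{\langle\Psi, \hat n^{-1} q_1\Psi\rangle}\\
&\lesssim \sqrt{|\langle\varphi, \dot h^2\varphi\rangle|}\,\sqrt{\beta + \frac{1}{\sqrt N}}\,\sqrt{\beta}\\
&\lesssim \sqrt{|\langle\varphi, \dot h^2\varphi\rangle|}\,\Bigl(\beta + \frac{1}{\sqrt N}\Bigr).
\end{aligned}
$$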
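The last term of (4.32) is equal to
$$\bigl\langle\Psi, q_1 (1 + h_1)^{1/2}(1 + h_1)^{-1/2}\,\dot h_1\,(1 + h_1)^{-1/2}(1 + h_1)^{1/2} q_1\Psi\bigr\rangle \le \bigl\|(1 + h)^{-1/2}\,\dot h\,(1 + h)^{-1/2}\bigr\|\,\bigl\|(1 + h_1)^{1/2} q_1\Psi\bigr\|^2.$$
Thus, using Assumption (B1') we conclude that
$$G(t) \le C(t)\Bigl(\beta(t) + \frac{1}{\sqrt N} + \bigl\|h_1(t)^{1/2} q_1(t)\Psi(t)\bigr\|^2\Bigr) \tag{4.33}$$
for all $t$. Here, and in the following, $C(t)$ denotes some continuous nonnegative function that does not depend on $N$. Next, we observe that, under Assumptions (B1') – (B3'), the proof of Lemma 4.6 remains valid for time-dependent one-particle Hamiltonians. Thus, (4.13) implies
$$\bigl\|h_1(t)^{1/2} q_1(t)\Psi(t)\bigr\|^2 \lesssim \mathcal E(t) + \bigl(1 + \|\varphi(t)\|_{X_1^2\cap L^\infty}^2\bigr)\Bigl(\beta(t) + \frac{1}{\sqrt N}\Bigr).$$
Plugging this into (4.33) yields
$$G(t) \le C(t)\Bigl(\beta(t) + \frac{1}{\sqrt N} + \mathcal E(t)\Bigr).$$
Therefore,
$$\mathcal E(t) \le \mathcal E(0) + \int_0^t ds\, C(s)\Bigl(\beta(s) + \mathcal E(s) + \frac{1}{\sqrt N}\Bigr). \tag{4.34}$$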
Next, we observe that, under Assumptions (B1') – (B3'), the derivation of the estimate (4.31) in the proof of Theorem 4.1 remains valid for time-dependent one-particle Hamiltonians. Therefore,
$$\beta(t) \le \beta(0) + \int_0^t ds\, C(s)\Bigl(\beta(s) + \mathcal E(s) + \frac{1}{N^\eta}\Bigr). \tag{4.35}$$
Applying Grönwall's lemma to the sum of (4.34) and (4.35) yields
$$\beta(t) + \mathcal E(t) \le \bigl(\beta(0) + \mathcal E(0)\bigr)\,e^{\int_0^t C} + \frac{1}{N^\eta}\int_0^t ds\, C(s)\,e^{\int_0^t C}.$$
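The Grönwall step can be illustrated numerically. The sketch below treats the extremal case of an integral inequality of the form $y(t) \le y(0) + \int_0^t C(s)\,(y(s) + \varepsilon)\,ds$, where $y$ stands for $\beta + \mathcal E$ and $\varepsilon$ for the $N$-dependent error. It compares an explicit Euler solution with the sharp Grönwall bound $y(0)e^{\Lambda} + \varepsilon(e^{\Lambda} - 1)$ and with the cruder form $y(0)e^{\Lambda} + \varepsilon\Lambda e^{\Lambda}$ that matches the display above, where $\Lambda = \int_0^t C$. The rate $C(s) = 1 + s$, the initial value, $\varepsilon$, and all function names are hypothetical choices for illustration.

```python
import math

def euler_extremal(y0, eps, C, T, steps):
    """Explicit Euler for y' = C(t) * (y + eps); also returns the Riemann sum of C."""
    h = T / steps
    y, Lam = y0, 0.0
    for k in range(steps):
        c = C(k * h)
        y += h * c * (y + eps)
        Lam += h * c
    return y, Lam

def sharp_bound(y0, eps, Lam):
    """y0 * e^Lam + eps * int_0^t C(s) e^{int_s^t C} ds, evaluated in closed form."""
    return y0 * math.exp(Lam) + eps * (math.exp(Lam) - 1.0)

def crude_bound(y0, eps, Lam):
    """y0 * e^Lam + eps * int_0^t C(s) e^{int_0^t C} ds = (y0 + eps * Lam) * e^Lam."""
    return (y0 + eps * Lam) * math.exp(Lam)

# Hypothetical data: C(s) = 1 + s on [0, 1], y(0) = 0.01, eps = 0.001.
y_T, Lam = euler_extremal(0.01, 1e-3, lambda s: 1.0 + s, T=1.0, steps=20_000)
print(y_T, sharp_bound(0.01, 1e-3, Lam), crude_bound(0.01, 1e-3, Lam))
```

Each Euler step multiplies $y + \varepsilon$ by $1 + hC \le e^{hC}$, so the discrete solution never exceeds the sharp bound computed from the same Riemann sum, and the sharp bound is in turn dominated by the cruder one.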
Plugging this back into (4.35) yields
$$\beta(t) \le C(t)\Bigl(\beta(0) + \mathcal E(0) + \frac{1}{N^\eta}\Bigr),$$
which is the claim.

Acknowledgements. We would like to thank J. Fröhlich and E. Lenzmann for helpful and stimulating discussions. We also gratefully acknowledge discussions with A. Michelangeli which led to Lemma 2.1.
Communicated by H.-T. Yau
Commun. Math. Phys. 298, 139–230 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1061-4
Communications in
Mathematical Physics
Energy Dispersed Large Data Wave Maps in 2 + 1 Dimensions
Jacob Sterbenz (1), Daniel Tataru (2)
(1) Department of Mathematics, University of California, San Diego, CA 92093-0112, USA. E-mail: [email protected]
(2) Department of Mathematics, University of California, Berkeley, CA 94720-3840, USA. E-mail: [email protected]
Received: 24 July 2009 / Accepted: 27 December 2009 Published online: 23 May 2010 – © The Author(s) 2010. This article is published with open access at Springerlink.com
Abstract: In this article we consider large data Wave-Maps from $\mathbb R^{2+1}$ into a compact Riemannian manifold $(M, g)$, and we prove that regularity and dispersive bounds persist as long as a certain type of bulk (non-dispersive) concentration is absent. This is a companion to our concurrent article [21], which together with the present work establishes a full regularity theory for large data Wave-Maps.

The first author was supported in part by the NSF grant DMS-0701087.
The second author was supported in part by the NSF grant DMS-0801261.

Contents
1. Introduction
1.1 A guide to reading the paper
2. Standard Constructions, Function Spaces, and Estimates
2.1 Constants
2.2 Basic harmonic analysis
2.3 Function spaces and standard estimates
3. New Estimates and Intermediate Constructions
3.1 Core technical estimates and constructions
3.2 Derived estimates and intermediate constructions
4. Proof of the Main Result
5. The Iteration Spaces: Basic Tools and Estimates
5.1 Space-time and angular frequency cutoffs
5.2 The S and N function spaces
5.3 Extension and restriction for S and N functions
5.4 Strichartz and Wolff type bounds
6. Bilinear Null Form Estimates
7. Proof of the Trilinear Estimates
8. The Gauge Transformation
8.1 Bounds for B
8.2 The gauge construction
9. The Linear Paradifferential Flow
10. Structure of Finite S Norm Wave-Maps and Energy Dispersion
10.1 Renormalization
10.2 Partial fungibility of the S norm
10.3 The role of the energy dispersion
11. Initial Data Truncation
References
1. Introduction

In this article we consider finite energy large data Wave-Maps from the Minkowski space $\mathbb R^{2+1}$ into a compact Riemannian manifold $(M, g)$. Our main result asserts that regularity and dispersive bounds persist as long as a certain type of bulk concentration is absent. The results proved here are used in the companion article [21] to establish a full regularity theory for large data Wave-Maps. The set-up we consider is the same as the one in [33], using the so-called extrinsic formulation of the Wave-Maps equation. Precisely, we consider the target manifold $(M, g)$ as an isometrically embedded submanifold of $\mathbb R^N$. Then we can view the $M$ valued functions as $\mathbb R^N$ valued functions whose range is contained in $M$. Such an embedding always exists by Nash's theorem [18] (see also Gromov [3] and Günther [4]). In this context the Wave-Maps equation can be expressed in a form which involves the second fundamental form $S$ of $M$, viewed as a symmetric bilinear form:
$$S : TM \times TM \to NM, \qquad \langle S(X, Y), N\rangle = \langle\partial_X N, Y\rangle.$$
For the standard d'Alembertian in $\mathbb R^{2+1}$ we use the notation
$$\Box = \partial_t^2 - \Delta_x = -\partial^\alpha\partial_\alpha.$$
The Cauchy problem for the wave maps equation has the form:
$$\Box\phi^a = -S^a_{bc}(\phi)\,\partial^\alpha\phi^b\,\partial_\alpha\phi^c, \qquad \phi \in \mathbb R^N, \tag{1a}$$
$$\phi(0, x) = \phi_0(x), \qquad \partial_t\phi(0, x) = \dot\phi_0(x), \tag{1b}$$
where the initial data $(\phi_0, \dot\phi_0)$ is chosen to obey the constraint:
$$\phi_0(x) \in M, \qquad \dot\phi_0(x) \in T_{\phi_0(x)}M, \qquad x \in \mathbb R^2.$$
In the sequel, it will be convenient for us to use the notation $\phi[t] = (\phi(t), \partial_t\phi(t))$. The system of Eqs. (1) admits a conserved quantity, namely the Dirichlet energy:
$$E[\phi(t)] := \int_{\mathbb R^2}\bigl(|\partial_t\phi(t)|^2 + |\nabla_x\phi(t)|^2\bigr)\,dx := \|\phi[t]\|^2_{\dot H^1\times L^2} = E. \tag{2}$$
Finite energy solutions for (1) correspond to initial data in the energy space, namely $\phi[t] \in \dot H^1 \times L^2$. We call a Wave-Map "classical" on a bounded time interval $(t_0, t_1) \times \mathbb R^2$ if $\nabla_{x,t}\phi(t)$ belongs to the Schwartz class for all $t \in (t_0, t_1)$. The Wave-Maps equation is also invariant with respect to the change of scale $\phi(t, x) \to \phi(\lambda t, \lambda x)$ for any positive $\lambda \in \mathbb R$. In (2 + 1) dimensions, it is easy to see that the energy $E[\phi]$ is dimensionless with respect to this scale transformation. For this reason, the problem we consider is called energy critical.

For the evolution (1), a local well-posedness theory in Sobolev spaces $H^s \times H^{s-1}$ for $s$ above scaling, $s > 1$, was established some time ago. See [7] and [9], and references therein. The small data Cauchy problem in the scale invariant Sobolev space is, by now, also well understood. Following work of the second author [32] for initial data in a scale invariant Besov space, Tao was the first to consider the wave map equation with small energy data. In the case when the target manifold is a sphere, Tao [29] proved global regularity and scattering for small energy solutions. This result was extended to the case of arbitrary compact target manifolds by the second author in [33]. Finite energy solutions were also introduced in [33] as unique strong limits of classical solutions, and the continuous dependence of the solutions with respect to the initial data was established. The case when the target is the hyperbolic plane was handled by Krieger [15]. There is also an extensive literature devoted to the more tractable higher dimensional case; we refer the reader to [8,14,17,28,31], and [20] for more information.

To measure the dispersive properties of solutions $\phi$ to the Wave-Maps equation, we shall use a variant of the standard dispersive norm $S$ from [33]. This was originally defined in [29] by modifying a construction in [32]. $S$ is used together with its companion space $N$ which has the linear property (precise definitions will be given shortly):
$$\|\phi\|_{S[I]} \lesssim \|\phi\|_{L^\infty_t(L^\infty_x)[I]} + \|\phi[0]\|_{\dot H^1\times L^2} + \|\Box\phi\|_{N[I]}.$$
The main result in [33] asserts that global regularity and scattering hold for the small energy critical problem:

Theorem 1.1. The wave maps Eq. (1) is globally well-posed for small initial data $\phi[0] \in \dot H^1 \times L^2$ in the following sense:
(i) Classical Solutions. If the initial data $\phi[0]$ is constant outside of a compact set and $C^\infty$, then there is a global classical solution $\phi$ with this data.
(ii) Finite Energy Solutions. For each small initial data set $\phi[0] \in \dot H^1 \times L^2$ there is a global solution $\phi \in S$, obtained as the unique $S$ limit of classical solutions, so that:
$$\|\phi\|_S \lesssim \|\phi[0]\|_{\dot H^1\times L^2}. \tag{3}$$
(iii) Continuous dependence. The solution map $\phi[0] \to \phi$ from a small ball in $\dot H^1 \times L^2$ to $S$ is continuous.

We remark that due to the finite speed of propagation one can also state a local version of the above result, where the small energy initial data is taken in a ball, and the solution is defined in the corresponding uniqueness cone. This allows one to define large data finite energy solutions:

Definition 1.2. Let $I$ be a time interval. We say that $\phi$ is a finite energy wave map in $I$ if $\phi[\cdot] \in C(I; \dot H^1 \times L^2)$ and, for each $(t_0, x_0) \in I$ and $r > 0$ so that $E[\phi(t_0)](B(x_0, r))$ is small enough, the solution $\phi$ coincides with the one given by Theorem 1.1 in the uniqueness cone $I \cap \{|x - x_0| + |t - t_0| \le r\}$.

In this work we consider a far more subtle case, which is a conditional version of the large data problem. It is first important to observe that for general targets the
above theorem cannot be extended to arbitrarily large $C^\infty$ initial data, and that this failure can be attributed to several different mechanisms. For instance any harmonic map $\phi_0 : \mathbb R^2 \to M$ yields a time independent wave-map which does not decay in time, therefore it does not belong to $S$. More interesting is that for certain non-convex targets, for example when we take $M = \mathbb S^2$, finite time blow-up of smooth solutions is possible (see [13,19]). In this latter case, the blow-up occurs along a family of rescaled harmonic maps. To avoid such Harmonic-Map based solutions, as well as other possible concentration scenarios, in this article we prove a conditional regularity theorem:

Theorem 1.3 (Energy Dispersed Regularity Theorem). There exist two functions $1 \ll F(E)$ and $0 < \epsilon(E) \ll 1$ of the energy (2) such that the following statement is true. If $\phi$ is a finite energy solution to (1) on the open interval $(t_1, t_2)$ with energy $E$ and:
$$\sup_k \|P_k\phi\|_{L^\infty_{t,x}[(t_1,t_2)\times\mathbb R^2]} \le \epsilon(E) \tag{4}$$
then one also has:
$$\|\phi\|_{S(t_1,t_2)} \le F(E). \tag{5}$$
Finally, such a solution $\phi(t)$ extends in a regular way to a neighborhood of the interval $I = [t_1, t_2]$.

Remark 1.4. In Sect. 4, Theorem 4.1, we shall state a slightly stronger version of this result which uses the language of frequency envelopes from [29]. In particular, we will show the energy dispersion bound (4) implies that a certain range of subcritical Sobolev norms may only grow by a universal energy dependent factor. Put another way, one may interpret this restatement of Theorem 1.3 as saying that in the energy dispersed scenario, the Wave-Maps equation becomes subcritical in the sense that there is a quasi-conserved norm of higher regularity than the physical energy. This information, coupled with the standard regularity theory for Wave-Maps (e.g. see [33]) provides us with the continuation property.

Remark 1.5. The result in this article is stated and proved in space dimension $d = 2$. However, given its perturbative nature, one would expect to have a similar result in higher dimension $d \ge 3$ as well. That is indeed the case. There are two reasons why we have decided to stay with $d = 2$ here. One is to fix the notations. The second, and the more important reason, is to avoid lengthening the paper with an additional argument in Sect. 4, which is the only place in the article where the conservation of energy is used. In higher dimensions, this aspect would have to be replaced by an almost conservation of energy, with errors controlled by the energy dispersion parameter $\epsilon$.

Remark 1.6. The proof of Theorem 1.3 allows us to obtain explicit formulas for $F(E)$ and $\epsilon(E)$. Precisely, in the conclusion of the proof of Corollary 4.4 below, we show that these parameters may be chosen of the form:
$$F(E) = e^{C e^{E^M}}, \qquad \epsilon(E) = e^{-C e^{E^M}},$$
with $C$ and $M$ sufficiently large. As a consequence of the frequency envelope version of this result in Theorem 4.1 we can also state a weaker non-conditional version of the above result:
Corollary 1.7. There exist two functions $1 \ll F(E)$ and $0 < \epsilon(E) \ll 1$ of the energy (2) such that for each initial data $\phi[0]$ satisfying:
$$\sup_k \|P_k\phi[0]\|_{\dot H^1\times L^2} \le \epsilon(E), \tag{6}$$
there exists a unique global finite energy solution $\phi \in S$, satisfying:
$$\|\phi\|_S \le F(E), \tag{7}$$
which depends continuously on the initial data. If in addition the initial data is smooth, then the solution is also smooth.

Our main interest in Theorem 1.3 is to combine it with the results of our concurrent work [21], which together imply a full regularity theory for Wave-Maps. In this context, one may view Theorem 1.3 as providing a "compactness continuation" principle, which roughly states that there is the following dichotomy for classical Wave-Maps defined on the open time interval $(t_0, t_1) \times \mathbb R^2$:
(1) The solution $\phi$ continues to a neighborhood of the closed time interval $[t_0, t_1]$ as a classical Wave-Map.
(2) The solution $\phi$ exhibits a compactness property on a sequence of rescaled times.
In particular, the second case may be used with the energy estimates from [21] to conclude that a portion of any singular Wave-Map must become stationary, and via compactness must therefore rescale to a Harmonic-Map of non-trivial energy. This was known as the bubbling conjecture (see the introduction of [21] for more background).

Finally, we would like to remark that results similar in spirit to the ones of this paper and [21] have been recently announced. In the case where $M = \mathbb H^n$, the hyperbolic spaces, global regularity and scattering follows from the program of Tao [22–24,26,30] and [25]. In the case where the target $M$ is a negatively curved Riemann surface, Krieger and Schlag [16] provide global regularity and scattering via a modification of the Kenig-Merle method [6], which uses as a key component suitably defined Bahouri-Gerard [1] type decompositions.

1.1. A guide to reading the paper. The paper has a "two tier" structure, whose aim is to enable the reader to get quickly to the proof of the main result in Sect. 4. The first tier consists of Sects. 2, 3 and 4, which play the following roles:

Section 2 is where the notations are set up. In addition, in Proposition 2.3 we review the linear, bilinear, trilinear and Moser estimates concerning the $S$ and $N$ spaces, as proved in [29,33]. The $N$ space we use is the same as in [29,32]. For the $S$ space we begin with the definition in [29] and add to it the Strichartz norm S defined later in (148). This modification costs almost nothing, but saves a considerable amount of work in several places.

Section 3 contains new contributions, reaching in several directions:
• Renormalization. A main difficulty in the study of wave maps is that the nonlinearity is non-perturbative at the critical energy level. A key breakthrough in the work of Tao [29] was a renormalization procedure whose aim is to remove the nonperturbative part of the nonlinearity. However, despite subsequent improvements in [33], this procedure only applies to the small data problem. We remedy this in Proposition 3.1, introducing a large data version of the renormalization procedure. This
applies without any reference to the energy dispersion bounds. We note that other large data renormalization procedures are available in certain cases, for instance by using the Coulomb or the caloric gauge.
• S bounds for the paradifferential evolution with a large connection. After peeling off the perturbative part of the nonlinearity in the wave map equation, one is left with a family of frequency localized linear paradifferential evolutions as in (38). In the case of the small data problem, by renormalization this turns into a small perturbation of the linear wave equation. Here this is no longer possible, as the connection coefficients $A_\alpha$ are large, and this cannot be improved using the energy dispersion. However, what the energy dispersion allows us to do is to produce a large frequency gap $m$ in (38). As it turns out, this is all that is needed in order to have good estimates for Eq. (38).
• New bilinear and trilinear estimates which take advantage of the energy dispersion. The main bilinear bound is the $L^2$ estimate in Proposition 3.4. Ideally one would like to have such estimates for functions in $S$, but that is too much to ask. Instead we introduce a narrower class $W$ of "renormalizable" functions $\phi$ of the form $\phi = U^\dagger w$, where $U \in S$ is a gauge transformation, while for $w$ we control both $\|w\|_S$ and $\|\Box w\|_N$. As a consequence of Proposition 3.4 and the more standard bounds in Proposition 2.3, we later derive the trilinear estimates in Proposition 3.6, which are easy to apply subsequently in the proof of our main theorem.

Section 4 contains the proof of Theorem 4.1, which is a stronger frequency envelope version of Theorem 1.3. This is done via an induction on energy argument. The noninductive part of the proof is separated into Propositions 4.2 and 4.3, whose aim is to bound in two steps the difference between a wave-map $\phi$ and a lower energy wave map whose initial data is essentially obtained by truncating in frequency the initial data for $\phi$. The arguments in this section use exclusively the results in Sects. 2, 3.

The second tier of the article contains the proofs of all the results stated in Sects. 2, 3, with the exception of those already proved in [29] and [33]. These are organized as follows:

Section 5's content is as follows:
• A full description of the $S$ and $N$ spaces. Some further properties of these spaces are detailed in Proposition 5.4; most of these are from [29] and [33], with the notable exception of the fungibility estimate (159). The bound (159) is proved using only the definition of $N$.
• Extension properties for the $S$ space. In most of our analysis we do not work with the spaces $S$ and $N$ globally, instead we use their restrictions to time intervals, $S[I]$ and $N[I]$. This is not important for $N$, since the multiplication by a characteristic function of an interval is bounded on $N$. However, that is not the case for $S$. One can define the $S[I]$ norm using minimal extensions. But in our case, we also need good control of the energy dispersion and of the high modulation bounds for the extensions. To address this, in Proposition 5.5 we introduce a canonical way to define the extensions which obey the appropriate bounds, and which also produce an equivalent $S[I]$ norm.
• Strichartz and $L^2$ bilinear estimates. Using the $U^p$ and $V^p$ spaces associated to the half-wave evolutions (for further information on the $U^p$ and $V^p$ spaces we refer the reader to [5,11,12]), we first show that solutions to the wave equation $\Box\phi = F$ with a right hand side $F \in N$ satisfy the full Strichartz estimates. The fungibility estimate (159) plays a significant role here, as it allows us to place the solution $\phi$ in a $V^2$ type space, see (195). A second goal is to prove $L^2$ bilinear bounds for products of
two such inhomogeneous waves with frequency localization and angular frequency separation, see Lemma 5.10. This is accomplished using the Wolff [34]-Tao [27] type $L^p$ bilinear estimates with $p < 2$.

Section 6 is devoted to the proof of the bilinear null form estimates in Proposition 3.4. A preliminary step, achieved in Lemma 6.1, is to establish the counterpart of the bounds (44) and (46) in the absence of the renormalization factor. The proofs here use only Lemma 5.10 and the estimates in Propositions 2.3, 5.4.

Section 7 contains the proof of the trilinear estimates in Proposition 3.6. There are a number of dyadic decompositions and multiple cases to consider, but this is largely routine, using either Proposition 3.4 or the estimates in Propositions 2.3 and 5.4.

Section 8 is concerned with the construction of the gauge transformation in Proposition 3.1. The discrete inductive construction in [29,33] is replaced with a continuous version which serves to ensure that the renormalization matrices U,
2.1. Constants. There will be a number of large and small constants in the present work. For the most part these are flexible, although the specific construction of $F(E)$ and $\epsilon(E)$ from Theorem 1.3 will be sensitive to each other as well as to choices of other constants. Lower case Greek letters such as $\delta$, $\epsilon$, and $\eta$ will always denote small quantities. We shall employ a globally defined string of small constants:
\[ \delta_0 \ll \delta_1 \ll \delta_2 \ll 1, \qquad \delta_i \leqslant \delta_{i+1}^{100}. \tag{8} \]
As it occurs often in the sequel, we will set $\delta = \delta_2$ throughout. For the convenience of the reader we list here the purposes of these constants, which all measure various fractional frequency gains in our dyadic estimates:
• The base constant $\delta$ enters our proof through the various multilinear estimates for the S and N spaces listed below (e.g. in the current section); it measures for instance various dyadic gains in estimates from [29] and [33]. It also influences any portion of our argument which is a direct consequence of these estimates, but has nothing to do with directly bootstrapping large data Wave-Maps. For example, $\delta$ also represents various dyadic gains in our gauge construction (see Proposition 3.1).
• The constant $\delta_1$ measures a small fractional gain coming from energy dispersion in $L^2$ and N-norm null form estimates. It enters our proof through estimates (51) and (52), and variations thereof.
• The constant $\delta_0$ is reserved for slowly varying frequency envelopes, and for the smallest fractional quantities built from the energy dispersion constant $\epsilon$. It enters in the core part of our proof of Theorem 1.3, and is the assumption on the frequency envelopes of Proposition 3.6.
Large quantities, for example C, F, K, and M will be used in various contexts as constants in estimates and the size of norms which are not globally defined. We will also often use m to denote a (possibly) large integer which represents various gaps in frequency truncations. To denote growth and dependence of various estimates on that growth we employ the following notation in the sequel:

Definition 2.1 (Complexity Notation). We say that a positive function $f(y)$ is of “polynomial type” if $f(y) \lesssim y^M$ for some constant M as $y \to \infty$. We use the notation $A \lesssim_F B$ if $A \leqslant K(F)B$ for some function K of polynomial type. This notation does not fix K from line to line, although K is fixed on any single line where it occurs.

2.2. Basic harmonic analysis.
As usual we denote by $\xi$ and $\tau$ the spatial and temporal Fourier variables (resp). We set up both discrete and continuous spatial Littlewood-Paley (LP) multipliers:
\[ I - P_{-\infty} = \sum_k P_k, \qquad I - P_{-\infty} = \int_{-\infty}^{\infty} P_k \, dk. \]
For the purposes of trichotomy, these two sets of multipliers are interchangeable, and we will only distinguish them by the use of $\sum$ or $\int$ in identities. However, for the purposes of proving Moser type estimates or constructing gauge transformations, the integral definition of LP projections is essential. We refer the reader to [33] for an earlier use of continuous LP multipliers, and further information. We often denote by $\phi_k = P_k\phi$. If $\phi$ is any affinely Schwartz function, the above notation means that we have the identities:
\[ \phi - \lim_{|x|\to\infty} \phi(x) = \sum_k \phi_k = \int_{-\infty}^{\infty} \phi_k \, dk. \]
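The identity above can be illustrated numerically in a much simplified setting. The following is a minimal sketch, assuming a one-dimensional periodic grid and sharp dyadic frequency cutoffs in place of the smooth multipliers $P_k$ of the text; the helper names (`dft`, `idft`, `lp_piece`) are hypothetical and chosen only for this illustration. It checks that the dyadic pieces add up to the signal minus its constant (zero frequency) part, which in this periodic toy model stands in for the limit at infinity.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n))
            for f in range(n)]


def idft(coeffs):
    """Naive inverse discrete Fourier transform."""
    n = len(coeffs)
    return [sum(coeffs[f] * cmath.exp(2j * math.pi * f * t / n) for f in range(n)) / n
            for t in range(n)]


def signed_freq(f, n):
    """Map a DFT index to a signed frequency in (-n/2, n/2]."""
    return f if f <= n // 2 else f - n


def lp_piece(x, k):
    """Sharp-cutoff stand-in for P_k: keep frequencies with 2^k <= |xi| < 2^(k+1)."""
    n = len(x)
    coeffs = dft(x)
    kept = [coeffs[f] if 2 ** k <= abs(signed_freq(f, n)) < 2 ** (k + 1) else 0
            for f in range(n)]
    return [value.real for value in idft(kept)]


n = 64
# Illustrative signal: a constant plus three oscillations at frequencies 1, 5, 20.
signal = [3.0
          + math.cos(2 * math.pi * t / n)
          + 0.5 * math.sin(2 * math.pi * 5 * t / n)
          + 0.25 * math.cos(2 * math.pi * 20 * t / n)
          for t in range(n)]

# Dyadic shells k = 0, ..., 5 cover every nonzero frequency 1 <= |xi| <= 32.
pieces = [lp_piece(signal, k) for k in range(6)]
total = [sum(piece[t] for piece in pieces) for t in range(n)]

# The zero frequency (the mean) is the only part not captured by the pieces.
mean = sum(signal) / n
max_err = max(abs(total[t] - (signal[t] - mean)) for t in range(n))
print(max_err)
```

The constant part is invisible to every dyadic piece, so it has to be restored by hand whenever a decomposition is summed back up.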
Therefore, care must be taken to add constants back into certain estimates involving very low frequencies. Many times in the sequel we shall have use for the inequality:
\[ \|P_B\phi\|_{L^q_x} \lesssim |B|^{(\frac{1}{r} - \frac{1}{q})} \|\phi\|_{L^r_x}, \tag{9} \]
where $B \subseteq \mathbb{R}^2_\xi$ is a frequency box. Furthermore, the $P_k$ multipliers enjoy a commutator structure as follows:
\[ P_k(\phi\psi) = \phi\psi_k + L(\nabla_x\phi, 2^{-k}\psi), \tag{10} \]
where the bilinear form L is translation invariant and bounded on all Lebesgue type spaces. Such multilinear expressions occur often in the sequel. We call a multilinear form L of the form:
\[ L(\phi^{(1)}, \ldots, \phi^{(k)})(x) = \int \phi^{(1)}(x + y_1) \cdots \phi^{(k)}(x + y_k)\, d\mu(y_1, \ldots, y_k), \qquad \text{where } \int |d\mu| \lesssim 1, \]
“disposable”. Any disposable operator generates a family of estimates from any single product estimate involving translation invariant norms in the usual way (see [29]). We will also use the variable notation for frequency envelopes from [33] (see [29] for another definition):

Definition 2.2 (Frequency Envelopes). A frequency envelope $\{c_k\}$ is called “$(\sigma, \Delta)$-admissible” if it obeys the bounds:
\[ 2^{\sigma(j-k)} c_k \leqslant c_j \leqslant 2^{\Delta(k-j)} c_k, \tag{11} \]
for any j < k, where $0 < \sigma \leqslant \Delta$. If $\|\phi\|_Y$ is any non-negative real valued functional, and $\{c_k\}$ is a frequency envelope, we define:
\[ \|\phi\|_{Y_c} := \sup_k c_k^{-1} \|P_k\phi\|_Y. \tag{12} \]
There is an exception to this notation for the norm S[I] introduced below, in which case we set:
\[ \|\phi\|_{S_c[I]} := \|\phi\|_{L^\infty_t(L^\infty_x)[I]} + \sup_k c_k^{-1} \|\phi\|_{S_k[I]}. \]
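The admissibility condition of Definition 2.2 and the envelope norm (12) are easy to experiment with on a finite range of frequencies. The sketch below is only an illustration under stated assumptions: it reads (11) as the two-sided bound $2^{\sigma(j-k)}c_k \leqslant c_j \leqslant 2^{\Delta(k-j)}c_k$ for $j < k$, it truncates the index set to a finite window, and the function names (`is_admissible`, `envelope_norm`), the two-sided exponential envelope and the numerical values are hypothetical choices, not taken from the text.

```python
def is_admissible(c, sigma, upper_rate, rel_tol=1e-9):
    """Check 2^{sigma (j-k)} c_k <= c_j <= 2^{upper_rate (k-j)} c_k for all j < k.

    `c` maps integer frequencies k to positive numbers c_k; `upper_rate`
    plays the role of the second admissibility exponent.
    """
    indices = sorted(c)
    for k in indices:
        for j in indices:
            if j >= k:
                continue
            lower = 2.0 ** (sigma * (j - k)) * c[k]
            upper = 2.0 ** (upper_rate * (k - j)) * c[k]
            # A small relative tolerance absorbs floating point rounding.
            if c[j] < lower * (1 - rel_tol) or c[j] > upper * (1 + rel_tol):
                return False
    return True


def envelope_norm(piece_norms, c):
    """Discrete analogue of (12): sup over k of c_k^{-1} times the k-th piece norm."""
    return max(piece_norms[k] / c[k] for k in piece_norms)


# Hypothetical two-sided exponential envelope centred at frequency k_star.
delta, k_star = 0.1, 3
c = {k: 2.0 ** (-delta * abs(k - k_star)) for k in range(-20, 21)}

ok_same_rate = is_admissible(c, delta, delta)        # both exponents equal delta
ok_slower_rate = is_admissible(c, delta / 2, delta)  # smaller sigma is too restrictive

# Made-up dyadic norms that decay faster than the envelope away from k_star.
piece_norms = {k: 0.5 * 2.0 ** (-abs(k - k_star)) for k in c}
norm = envelope_norm(piece_norms, c)
print(ok_same_rate, ok_slower_rate, norm)
```

The second check fails because a smaller first exponent forces the envelope to vary more slowly towards low frequencies than this example does.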
Frequency envelopes may be defined in either the discrete or continuous settings. It is easy to see that for any such frequency envelope we have the pair of sum rules (uniformly):
\[ \sum_{k' < k} 2^{Ak'} c_{k'} \lesssim (A - \Delta)^{-1} 2^{Ak} c_k, \qquad A > \Delta, \tag{13} \]
\[ \sum_{k' > k} 2^{-ak'} c_{k'} \lesssim (a - \sigma)^{-1} 2^{-ak} c_k, \qquad a > \sigma, \tag{14} \]
with similar bounds for integrals. These two inequalities capture the essence of every use we have for the $\{c_k\}$ notation, which is simply to bookkeep (resp.) Low × Low ⇒ High and High × High ⇒ Low frequency cascades.

2.3. Function spaces and standard estimates. We use the function spaces S and N from [32,33] and [29] with only a few minor modifications. The spaces of restrictions of S and N functions to a time interval I are denoted by S[I], respectively N[I], with the induced norms. The first part of our proof does not use the precise structure of these spaces, only the following statement:

Proposition 2.3 (Standard Estimates and Relations: Part I). Let F, $\phi$, and $\phi^{(i)}$ be a collection of test functions, $I \subseteq \mathbb{R}$ any subinterval (including $\mathbb{R}$ itself). Then there exist function spaces S[I] and N[I] with the following properties:
• Triangle Inequality for S. Let $I = \cup_i^K I_i$ be a decomposition of I into consecutive intervals, then the following bounds hold (uniform in K):
\[ \|\phi\|_{S[I]} \lesssim \sum_i \|\phi\|_{S[I_i]}. \tag{15} \]
• Frequency Orthogonality. The spaces S[I] and N[I] are made up of dyadic pieces in the sense that:
\[ \|\phi\|^2_{S[I]} = \|\phi\|^2_{L^\infty_t(L^\infty_x)[I]} + \sum_k \|P_k\phi\|^2_{S[I]}, \tag{16} \]
\[ \|\phi\|^2_{N[I]} = \sum_k \|P_k\phi\|^2_{N[I]}. \tag{17} \]
• Energy Estimates. We have that $L^1_t(L^2_x)[I] \subseteq N[I]$, and also the estimate:
\[ \|\phi_k\|_{S[I]} \lesssim \|\Box\phi_k\|_{N[I]} + \|\phi_k[0]\|_{\dot{H}^1 \times L^2}. \tag{18} \]
• Core Product Estimates. We have that: (1)
(2)
(1)
(2)
φ
Pk (φk(1) 1
· φk(2) ) S[I ] 2
2−(max{ki }−k) φk(1) S[I ] 1
· φk(2) S[I ] , 2
Pk (φ
−δ(k−k2 )+
φk1 S[I ] · Fk2 N [I ] .
(19) (20) (21) (22)
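• Bilinear Null Form Estimates. We have that:
\[ \|P_k(\partial^\alpha\phi^{(1)}_{k_1} \cdot \partial_\alpha\phi^{(2)}_{k_2})\|_{L^2_t(L^2_x)[I]} \lesssim 2^{\frac{1}{2}\min\{k_i\}}\, 2^{-(\frac{1}{2}+\delta)(\max\{k_i\}-k)} \prod_i \|\phi^{(i)}_{k_i}\|_{S[I]}, \tag{23} \]
\[ \|P_k(\partial^\alpha\phi^{(1)}_{k_1} \cdot \partial_\alpha\phi^{(2)}_{k_2})\|_{N[I]} \lesssim 2^{-\delta(\max\{k_i\}-k)} \prod_i \|\phi^{(i)}_{k_i}\|_{S[I]}. \tag{24} \]
• Trilinear Null Form Estimate. We have that:
\[ \|P_k(\phi^{(1)}_{k_1} \cdot \partial^\alpha\phi^{(2)}_{k_2} \cdot \partial_\alpha\phi^{(3)}_{k_3})\|_{N[I]} \lesssim 2^{-\delta(\max\{k_i\}-k)}\, 2^{-\delta(k_1-\min\{k_2,k_3\})_+} \prod_i \|\phi^{(i)}_{k_i}\|_{S[I]}. \tag{25} \]
• Moser Estimates. Let G be any bounded function with uniformly bounded derivatives, and $\{c_k\}$ a $(\delta, \Delta)$-admissible frequency envelope. Then there exists a universal K > 0 such that:
\[ \|G(\phi)\|_{S[I]} \lesssim \|\phi\|_{S[I]} (1 + \|\phi\|^K_{S[I]}), \tag{26} \]
\[ \|G(\phi)\|_{S_c[I]} \lesssim \|\phi\|_{S_c[I]} (1 + \|\phi\|^K_{S[I]}). \tag{27} \]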
The space N is the same one as used in [29,33]. To obtain the space S we start with the one used in [29,33] and add the control of the Strichartz norms $\underline{S}$ defined in (148). The bound (15) is relatively straightforward; we prove it in Sect. 5. The relations (16) and (17) can be thought of as a part of the definition of the spaces S, N starting from their dyadic versions. The linear estimate (18) was proved in [29]; here we show that we can add the Strichartz component $\underline{S}$ in Corollary 5.9. The bounds (19)–(22) as well as (24), (25) were proved in [29]. The one unit gain in the exponent in (20) is not explicitly stated in [29], but it is implicit in the proof. In our context the proofs of (19), (20) need to be augmented to add the control over the Strichartz norm $\underline{S}$; this is a straightforward matter which is left for the reader. The bound (23) is implicit in [29], but for the reader’s convenience we prove it in Sect. 6. The Moser estimates (26) and (27) were proved in [33]. Adding in the $\underline{S}$ norm is again straightforward. An interesting side remark is that in effect the addition of the $\underline{S}$ norm to S can be taken advantage of to simplify considerably the proof of the Moser estimates in [33]. In particular, one can show that it is possible to take K = 2. Since it does not lead to significant improvements in the present article, we leave this as an exercise for the reader.

At several places in our argument, it will be necessary for us to introduce some auxiliary norms. We choose to keep these separate from S defined above for notational purposes:

Definition 2.4 (Auxiliary Energy and $X^{s,b}$ Type Norms). We define:
\[ \|\phi\|_{E[I]} := \|\nabla_{t,x}\phi\|_{L^\infty_t(L^2_x)[I]} + \sup_\omega \|\slashed{\nabla}^\omega_{t,x}\phi\|_{L^\infty_{t_\omega}(L^2_{x_\omega})[I]}, \tag{28} \]
\[ \|\phi\|_{X_k[I]} := 2^{-\frac{1}{2}k} \|\Box P_k\phi\|_{L^2_t(L^2_x)[I]}. \tag{29} \]
Here the second term in the RHS of (28) represents the energy of $\phi$ on characteristic hyperplanes, see [32,29]. We also define X[I] as the square sum of $X_k[I]$, and $X_c[I]$ according to (12). Notice that there are no square sums or frequency localizations in the norm E. The size of this norm depends only on the initial energy of any (global) classical solution to (1).
In the sequel, it will also be notationally convenient for us to work with the following definition, which one should think of as a variant of the S[I] space introduced above. The reader should keep in mind that this is not even a quasinorm due to the lack of any good additivity property:

Definition 2.5 (Renormalizable Functions). Let C > 0 be a large parameter. We define a non-linear functional $W_k$ on S as follows:
\[ |||\phi|||_{W_k} := \inf_{U \in SO(d)} \Big[ \Big( \|U\|_{S \cap X} + \sup_{j > k} 2^{C(j-k)} \|P_j U\|_{S \cap X} \Big) \cdot \sup_{k'} 2^{|k'-k|} \Big( \|P_{k'}(U\phi_k)[0]\|_{\dot{H}^1 \times L^2} + \|\Box P_{k'}(U\phi_k)\|_N \Big) \Big]. \tag{30} \]
The functionals W[I], $W_c[I]$ are also defined as above. Notice that while the definition of W is nonlinear, one still has the scaling relation $|||\lambda\phi|||_{W[I]} = \lambda |||\phi|||_{W[I]}$. The reader should note that while these bounds are cumbersome to state, they are all natural in light of Propositions 3.1–3.2 below.
3. New Estimates and Intermediate Constructions In this section we introduce the main technical components of the paper. We begin with the core underlying tools that allow us to handle more complicated constructions. In a later sub-section we derive some further useful results that encapsulate many of the repetitive computations in the sequel.
3.1. Core technical estimates and constructions. The right-hand side in Eq. (1) is nonperturbative even when the energy is small. In the case of larger energies, it becomes quite a bit more difficult to handle things in a perturbative manner. Therefore, we introduce a set of tools which are general enough to handle large data situations. The first two of these work without any additional properties (e.g. energy dispersion), and form the technical heart of the paper. The first is a novel gauge construction that should be of more general use. It should be noted that this construction is stable regardless of the size of the energy or the convexity properties of the target, as its key properties depend only on the compactness of the underlying gauge group. Proposition 3.1 (The “Diffusion Gauge”). Let φ be a wave-map in a time interval I with energy E, S[I ] norm F, and S[I ] norm (δ, )-admissible envelope {ck }. Let the antisymmetric B be defined by: (Bba )
k
−∞
a b Sbc (φ) − Sac (φ)
φkc dk ,
(31)
a is a smoothly bounded (a, b) symmetric matrix valued vector. We denote the where Sbc integrand by Bk . Then for each real number k there exists an orthogonal matrix U,
• U,
where each U,k = U,
Pk U,k S∩X F 2−δ|k−k | 2−C(k −k)+ ck , J U,k L 1 (L 1 ) Pk ∇t,x t x
(32)
(|J |−3)k −C(k −k)
F 2 2 ck , k > k + 10, |J | 2,
Pk U,
Pk U,k1 · ψk2 N F 2−|k−k2 | 2−δ(k2 −k1 ) ck1 ψk2 S , k1 < k2 − 10.
(33) (34) (35)
In addition, if c˜k is a (δ0 , )-admissible frequency envelope for the energy ∇t,x φk L ∞ then we have a similar bound for U,k : 2 t (L x )[I ]
E 2−|k−k |−C(k −k)+ c˜k . Pk ∇t,x U,k L ∞ 2 t (L x ) Here C 0 is any constant. • The Matrix U Approximately Renormalizes Aα = ∇α B. We have the formula: k † † U,
(36)
(37)
This result is proved in Sect. 8. Next, we state a technical proposition that will help us to deal with the non-fungibility of the S norm. The wave map nonlinearity is nonperturbative. However, due to the small energy dispersion, at fixed frequency we are able to perturbatively replace the nonlinearity in the wave map equation with a paradifferential term, i.e. a linear term involving the lower frequencies of the wave map. This term is large, and due to the non-fungibility of the S norm, it cannot be made small on small time intervals. Fortunately, it has another redeeming feature, namely a large frequency gap (see m below). We take advantage of this in Sect. 9 to prove that: Proposition 3.2 (Gauge Covariant S[I ] Estimate). Let ψk = Pk ψ be a solution to the linear problem: ψk = −2 Aα
c ∂ α φ
(38)
(39)
Assume that $\phi$ is a classical Wave-Map on I with the bounds:
\[ \|\phi\|_{E[I]} + \|\phi\|_{X[I]} + \|\phi\|_{S[I]} \leqslant F. \tag{40} \]
Furthermore, assume that $m \geqslant m(F) > 20$, for a certain function $m(F) \sim \ln(F)$ (to be defined in the proof). Then we have the estimate:
\[ |||\psi_k|||_{W[I]} \lesssim_F \|\psi_k[0]\|_{\dot{H}^1 \times L^2} + \|G\|_{N[I]}. \tag{41} \]
Remark 3.3. As will become apparent in the proof of estimate (41), the only use of the large frequency gap parameter m is to be able to bootstrap the RHS involving ψk . In the sequel, there will be situations where one already has good S[I ] norm bounds on ψk , and the task is to provide a renormalization w,k such that w,k has good N norm bounds. Therefore, we state the following: • Let ψk , Aα
20 we have the following estimate for ψk : ||| ψk |||W [I ] F ψk S[I ] + Pk G N [I ] .
(42)
• Furthermore, in the above situation, the renormalization on the LHS of estimate (42) is given by a matrix as in Proposition 3.1 where the pieces B j are defined from Aα
\[ |||\phi^{(i)}_{k_i}|||_{W[I]} \leqslant 1, \qquad \|\phi^{(1)}_{k_1}\|_{L^\infty_t(L^\infty_x)[I]} \leqslant \eta. \tag{43} \]
Then the following estimates hold:
• Bilinear $L^2$ Estimate. We have that:
\[ \|\partial^\alpha\phi^{(1)}_{k_1} \partial_\alpha\phi^{(2)}_{k_2}\|_{L^2_t(L^2_x)[I]} \lesssim 2^{\frac{1}{2}\max\{k_1,k_2\}} \eta^\delta. \tag{44} \]
• Bilinear N Estimate. Assume that in addition to (43) we also have the high modulation bounds:
\[ \|\Box\phi^{(1)}_{k_1}\|_{L^2_t(L^2_x)[I]} \leqslant 2^{\frac{k_1}{2}} \eta, \qquad \|\Box\phi^{(2)}_{k_2}\|_{L^2_t(L^2_x)[I]} \leqslant 2^{\frac{k_2}{2}} \eta. \tag{45} \]
Then the following estimate holds:
\[ \|\partial^\alpha\phi^{(1)}_{k_1} \partial_\alpha\phi^{(2)}_{k_2}\|_{N[I]} \lesssim 2^{C|k_1-k_2|} \eta^\delta. \tag{46} \]
This is proved in Sect. 6. Finally, we list a technical result concerning initial data frequency truncation. This does not preserve the space of functions with values in TM, so it has to be followed by a non-linear physical space projection back onto TM. We will show that in the energy dispersed case, this operation is very well behaved in the energy norm. Theorems of this type may be useful for other problems involving the need for a “non-linear Littlewood-Paley theory” of functions with values in a manifold:

Proposition 3.5. For each E > 0 there exists $\epsilon_0 > 0$ so that for each initial data set $\phi[0]$ for (1) with energy E and energy dispersion $\epsilon \leqslant \epsilon_0$ and $k, k_* \in \mathbb{Z}$ we have
1 1 (47) Pk P
3.2. Derived estimates and intermediate constructions. A corollary of the above Propositions is the following, which will be needed for the proof of our main theorem. The reader should keep in mind that this proposition is merely a bookkeeping device that will allow us to avoid many repetitive calculations in the sequel:

Proposition 3.6 (Improved Multilinear Estimates). Let $\phi^{(i)}$ be three test functions defined on a time interval I normalized so that:
\[ \|\phi^{(1)}\|_{S[I]} \leqslant 1, \qquad \sup_{i=2,3} |||\phi^{(i)}|||_{W[I]} \leqslant 1. \tag{48} \]
Suppose in addition that $\phi^{(2)}$ has the improved energy dispersion bound on I:
\[ \sup_k \|P_k\phi^{(2)}\|_{L^\infty_t(L^\infty_x)[I]} \leqslant \eta. \tag{49} \]
Finally, let $\{c_k\}$ be any $(\delta_0, \delta_0)$-admissible frequency envelope, and $0 \leqslant m$ an additional integer subject to the condition:
\[ m \leqslant \delta_1 |\ln(\eta)|. \tag{50} \]
Then one has the following multilinear bounds:
i) Core Trilinear $L^2$ Estimate. Suppose along with the above assumptions that $\phi^{(3)}$ has unit $W_c[I]$ norm for the frequency envelope $\{c_k\}$. Then for any disposable trilinear form L we have the bound:
\[ \|L(\phi^{(1)}, \partial^\alpha\phi^{(2)}, \partial_\alpha\phi^{(3)})\|_{L^2_t(\dot{H}^{-\frac{1}{2}})_c[I]} \lesssim \eta^{\delta_1}. \tag{51} \]
ii) Additional Trilinear L 2 Estimate. Suppose again that we have the conditions (48)– (49), and that this time φ (1) has unit Sc [I ] norm for the frequency envelope {ck }. Then for any disposable trilinear form L we have the bound: k Pk L(φ (1) , ∂ α φ (2) , ∂α φ (3) ) L 2 (L 2 )[I ] 2 2 ηδ1 ck + P
x
iii) Core Trilinear N Estimate. For a positive integer m and integer k and disposable trilinear form L, define the following trilinear form: Tkm (φ (1) , φ (2) , φ (3) ) := Pk L(φ (1) , ∂ α φ (2) , ∂α φ (3) ) (1)
(2)
(3)
(1)
(2)
(3)
−L(φ
(53)
Suppose in addition to (48)–(49) we also have unit $W_c[I]$ norm of $\phi^{(3)}$, and furthermore the high modulation bounds:
\[ \|\phi^{(2)}\|_{X[I]} \leqslant \eta, \qquad \|\phi^{(3)}\|_{X_c[I]} \leqslant \eta. \tag{54} \]
Then the following trilinear estimate holds:
\[ \|T^m_k(\phi^{(1)}, \phi^{(2)}, \phi^{(3)})\|_{N[I]} \lesssim \eta^{\delta_1} c_k. \tag{55} \]
iv) Additional Trilinear N Estimate. Suppose in addition to (48)–(49) we have unit $S_c[I]$ norm of $\phi^{(1)}$, and in addition the high modulation bounds:
\[ \sup_{i=2,3} \|\phi^{(i)}\|_{X[I]} \leqslant \eta. \tag{56} \]
Then if Tkm is defined as on line (53) we have the bound: Tkm (φ (1) , φ (2) , φ (3) ) N [I ] ηδ1 ck + P
(57)
Remark 3.7. If the functions $\phi^{(i)}$ admit a common frequency envelope $\{c_k\}$ then we can relax the admissibility condition on $\{c_k\}$ and work with $(\delta_0, \Delta)$ frequency envelopes. Precisely, for any $(\delta_0, \Delta)$-admissible frequency envelope $\{c_k\}$ we have the following:
• If (48) is replaced by
\[ \|\phi^{(1)}\|_{S_c[I]} \leqslant 1, \qquad \sup_{i=2,3} |||\phi^{(i)}|||_{W_c[I]} \leqslant 1, \tag{58} \]
then (51) follows.
• If in addition (54) is replaced by
\[ \|\phi^{(2)}\|_{X_c[I]} \leqslant \eta, \qquad \|\phi^{(3)}\|_{X_c[I]} \leqslant \eta, \tag{59} \]
then the following version of (55) holds:
\[ \|T^m_k(\phi^{(1)}, \phi^{(2)}, \phi^{(3)})\|_{N[I]} \lesssim \eta^{\delta_1} c_k. \tag{60} \]

Remark 3.8. As will become apparent in the sequel, the only use of the renormalized norms W[I] and the high modulation bounds X[I] in the estimates of Proposition 3.6 is to ensure the smallness coming from the parameter $\eta$. Thus, under the simpler assumption that the $\phi^{(i)}$ are only normalized so that $\|\phi^{(i)}\|_{S[I]} \leqslant 1$ we have the following:
• If $\phi^{(3)}$ has a $(\delta_0, \delta_0)$-admissible S[I] norm frequency envelope $\{c_k\}$, then estimate (51) holds with $\eta = 1$.
• If $\phi^{(3)}$ has a $(\delta_0, \delta_0)$-admissible S[I] norm frequency envelope $\{c_k\}$, and if we let $m \geqslant 10$ be any integer, then we may replace estimate (55) with the bound:
\[ \|T^m_k(\phi^{(1)}, \phi^{(2)}, \phi^{(3)})\|_{N[I]} \lesssim 2^{4\delta_0 m} c_k. \tag{61} \]
• If φ (1) has (δ0 , δ0 )-admissible S[I ] norm frequency envelope {ck }, and if we let m 0 be any integer, then we may replace estimate (57) with the bound: Tkm (φ (1) , φ (2) , φ (3) ) N [I ] 24δ0 m ck + P
• Additional Norm Control. We have the bounds:
\[ \|\phi\|_{X[I]} + \|\phi\|_{E[I]} \lesssim_F 1. \tag{63} \]
• Renormalization. If $\{c_k\}$ is a $(\delta_0, \Delta)$-admissible frequency envelope for $\|\phi\|_{S[I]}$, then we may renormalize our wave-map as follows:
\[ |||\phi|||_{W_c[I]} \lesssim_F 1. \tag{64} \]
• Partial Fungibility. If $\|\phi\|_{S[I]} = F$, then there exists a collection of subintervals $I = \cup_{i=1}^K I_i$, such that K = K(F) depends only on F, and such that the following bound holds on each $I_i$:
\[ \|\phi\|_{S[I_i]} \lesssim_E 1. \tag{65} \]
• Smallness of High Modulations. Suppose in addition that we have energy dispersion $\sup_k \|P_k\phi\|_{L^\infty_t(L^\infty_x)[I]} \leqslant \epsilon$. Then we also have the estimate:
\[ \|\phi\|_{X[I]} \lesssim_F \epsilon^{\delta_1}. \tag{66} \]
• Frequency Envelope Control. Suppose that $\phi$ has sufficiently small energy dispersion $\epsilon < \epsilon(F)$. Then if $\{c_k\}$ is a $(\delta_0, \Delta)$-admissible $\dot{H}^1 \times L^2$ frequency envelope for $\phi[0]$ we have:
\[ \|\phi_k\|_{S[I]} \lesssim_F c_k. \tag{67} \]
Proposition 3.9 is proved in Sect. 10. Finally, for the reader’s convenience we group together the results which enable us to carry out our bootstrapping arguments:

Proposition 3.10 (Bootstrapping Tool). Let I = [a, b] be an interval and c a $(\delta_0, \Delta)$ frequency envelope. Then for each affinely Schwartz function $\phi$ in I the following properties hold:
• Seed S bound. Let $I_n \subset I$ be a decreasing sequence of intervals which converges to the point t = 0. Then:
\[ \lim_{n\to\infty} \|\phi\|_{S[I_n]} \lesssim \|\phi[0]\|_{\dot{H}^1 \times L^2}, \qquad \lim_{n\to\infty} \|\phi\|_{S_c[I_n]} \lesssim \|\phi[0]\|_{(\dot{H}^1 \times L^2)_c}. \tag{68} \]
• Continuity Properties. For each subinterval $J \subset I$ we have $\phi \in S[J] \cap S_c[J]$, and its S norm $\|\phi\|_{S[J]}$, its $S_c$ norm $\|\phi\|_{S_c[J]}$, and its energy dispersion norm $\sup_k \|P_k\phi\|_{L^\infty_t(L^\infty_x)[J]}$ all depend continuously on the endpoints of J.
• Closure and Extension Property. Let $I_n$ be an increasing sequence of intervals and $\cup I_n = I = (a, b)$. Let $\phi$ be a classical Wave-Map in I which satisfies the uniform bounds:
\[ \|\phi\|_{S[I_n]} \leqslant F, \qquad \sup_k \|P_k\phi\|_{L^\infty_t(L^\infty_x)[I_n]} \leqslant \epsilon, \]
with $\epsilon \leqslant \epsilon(F)$. Then $\phi \in S[I]$, and furthermore it can be extended to a classical Wave-Map in a larger interval $I_1 = [a_1, b_1]$ with $a_1 < a < b < b_1$.
Proof. The first part is a direct consequence of the solvability bound (18) since $\Box\phi \in L^1_t L^2_x[I]$ as well as $\Box\phi \in (L^1_t L^2_x)_c[I]$. For the second part we first consider the S norm. Let $J_n \subset I$ be a sequence of intervals converging to J. We consider a sequence of rescalings mapping J to $J_n$,
\[ (t, x) \to (\lambda_n t + t_n^0, \lambda_n x), \qquad \lambda_n \to 1, \quad t_n^0 \to 0. \]
This allows us to map functions in $J_n$ to functions in J, $\phi \to \phi_n(t, x) = \phi(\lambda_n t + t_n^0, \lambda_n x)$. Hence using the scale invariance of the S norm, we have
\[ \|\phi\|_{S[J_n]} = \|\phi_n\|_{S[J]} \to \|\phi\|_{S[J]}, \]
where in the last step we simply use the fact that convergence in the Schwartz space implies the convergence in S[J]. For $S_c$ norms the proof is similar. The dyadic convergence $\|\phi_k\|_{S[J_n]} \to \|\phi_k\|_{S[J]}$ follows by the same rescaling argument. This implies the $S_c$ convergence since the tails are small,
\[ \lim_{k\to\pm\infty} c_k^{-1} \|\phi_k\|_{S[I]} = 0, \]
which is due to the Schwartz regularity of $\phi$. A similar decay of the tails yields the continuity of the energy dispersion norm. For the last part we observe that by (67), for each $(\delta_0, \Delta)$ frequency envelope c we obtain a uniform bound for $\|P_k\phi\|_{S_c[I_n]} + \|P_k\phi\|_{X_c[I_n]}$. Letting $n \to \infty$ we directly obtain $P_k\phi \in X_c[I]$, which shows that for each k we have $P_k\phi \in S[I]$ and $\|P_k\phi\|_{S[I_n]} \to \|P_k\phi\|_{S[I]} \lesssim c_k$. Hence $\phi$ is a Schwartz wave map in [a, b], therefore by the local well-posedness result it admits a Schwartz extension to a larger interval.

4. Proof of the Main Result

The purpose of this section is to use the setup of the previous two sections to prove the following result, which easily implies our main Theorem 1.3 as well as Corollary 1.7.

Theorem 4.1 (Frequency Envelope Version of the Main Theorem). There exist two functions $1 \ll F(E)$ and $0 < \epsilon(E) \ll 1$ of the energy (2) such that if $\phi$ is a finite energy solution to (1) in a closed interval $I \times \mathbb{R}^2$, where I = [a, b], with energy E and dispersion (4), then estimate (5) holds in S[I]. In addition, there exists a universal polynomial K(F) such that if $\{c_k\}$ is any $(\delta_0, \Delta)$-admissible frequency envelope for $\phi[0]$, we have the bound:
\[ \|\phi\|_{S_c[I]} \leqslant K(F(E)). \tag{69} \]
In particular, one may extend $\phi$ to a finite energy Wave-Map on the open neighborhood $I \subseteq (a - i_0, b + i_0)$ whose additional length $i_0$ depends only on E, $\{c_k\}$, and $\epsilon$.
We immediately observe that it suffices to prove the result for classical wave-maps. This is due to the small data result in Theorem 1.1, which implies that any finite energy wave map in a closed interval can be approximated in S by classical wave maps. In addition, the S convergence easily implies the convergence of the energy dispersion norm (4). In the sequel we simply focus on proving (5). Once (5) is known, Proposition 3.9 can be applied and the estimate (69) is an immediate consequence of (67). In fact, it would be tempting to use the more direct analysis employed in the proof of (67) to establish (5) as well in a single go. Such a strategy seems to fail basically due to linearized Low × High ⇒ High frequency interactions. These interactions need to be handled via Proposition 3.2, which in turn requires one to already control S type norms (e.g. in assumption (40)). To avoid this dilemma, we employ a simple induction scheme to reduce things to estimates for Wave-Maps of (slightly) smaller energy. The reader should keep in mind however that modulo this single Low × High obstruction, our analysis would work to prove (5) and (67) simultaneously. More specifically, the remaining estimates basically boil down to using (44)–(46) to eliminate matched frequency “semilinear” type interactions (this is the only place where the energy dispersion (4) really comes in), and (24)–(25) to kill off High × High ⇒ Low frequency cascades.

We now construct the functions $F(E)$ and $\epsilon(E)$ such that (4) and (5) hold. Precisely, we will show that there exists a strictly positive nonincreasing function defined for all values of E, $c_0 = c_0(E) \ll 1$, so that if the conclusion of the theorem holds up to energy E then it also holds up to energy $E + c_0$. It is important here that $c_0$ depends only on E and not on the size of $F(E)$ or $\epsilon(E)$, as otherwise we would only be able to conclude the usual first step in an induction on energy proof which is establishing that the set of regular energies is open. (In this latter setup, one is then left with the arduous task of eliminating minimal energy blowup solutions. Our strategy is a bit more direct because we accomplish this as well in our construction of $c_0$, so we are able to avoid a good deal of repetitive analysis.) Also, we note here the monotonicity of $c_0$ is only used to conclude that $c_0$ admits a positive lower bound on any compact set.

According to Theorem 1.1 we know that $\epsilon(E)$ and $F(E)$ can be constructed up to some $E_0 \ll 1$. We now assume that $E_0$ is fixed by induction, and to increase its range we consider a solution $\phi$ defined on an interval I with energy $E[\phi] = E_0 + c$, $c \leqslant c_0(E_0)$, and with energy dispersion $\leqslant \epsilon$ (at first this is a free parameter which we may take as small as we like). We will compare $\phi$ with a solution $\tilde\phi$ with energy $E_0$. To construct $\tilde\phi$ we reduce the initial data energy of $\phi[0]$ by truncation in frequency. We define the “cut frequency” $k_* \in \mathbb{R}$ according to (this can be done by adjusting the definition of the P
(70)
Since $\phi$ has energy dispersion $\leqslant \epsilon$, by (47) it follows that $\tilde\phi$ has energy dispersion $\lesssim_{E_0} \epsilon^{\frac{1}{4}}$ at time t = 0. Again by the usual Cauchy stability theory, if $\epsilon$ is chosen small enough in comparison to the inductively defined parameter $\epsilon(E_0)$ it follows that there exists a non-empty interval $J_0$ where $\tilde\phi$ satisfies:
\[ \sup_k \|P_k\tilde\phi\|_{L^\infty_t(L^\infty_x)[J_0]} \leqslant \epsilon(E_0). \tag{71} \]
Then our induction hypothesis guarantees that we have the dispersive bounds:
\[ \|\tilde\phi\|_{S[J_0]} \leqslant F(E_0). \tag{72} \]
The plan is now very simple. On one hand, we try to pass the space-time control of $\tilde\phi$ up to $\phi$ via linearization around $\tilde\phi$ to control the low frequencies, and conservation of energy and perturbation theory to control the high frequencies. On the other hand, we need to pass the good energy dispersion bounds from $\phi$ back down to $\tilde\phi$ in order to increase the size of $J \subseteq I$ on which (71) holds until it eventually fills up all of I. To achieve all of this, we proceed via two core estimates:

Proposition 4.2 (Evolution of Low Frequency Errors). Let $\phi$ be a Wave-Map defined on an interval J with energy E + c with $0 < c \ll 1$ and bounds:
\[ \sup_k \|P_k\phi\|_{L^\infty_t(L^\infty_x)[J]} \leqslant \epsilon, \qquad \|\phi\|_{S[J]} \leqslant F. \tag{73} \]
Suppose in addition that $\tilde\phi$ is the Wave-Map with energy E defined by $\tilde\phi[0] = P_{k_*}\phi[0]$, and that $\tilde\phi$ is classical on J with bounds:
\[ \sup_k \|P_k\tilde\phi\|_{L^\infty_t(L^\infty_x)[J]} \leqslant \tilde\epsilon, \qquad \|\tilde\phi\|_{S[J]} \leqslant \tilde{F}. \tag{74} \]
Assume also that the two energy dispersion constants are chosen so that: −10
−10
−δ0 , , (C F)−δ0 , (C F)
(75)
E C − 21 and C is a sufficiently large constant. where we may assume that F F Then in addition we have the bound: − Pk∗ φ S[J ] F δ0 . φ
(76)
Proposition 4.3 (High Frequency Evolution Estimates). Let $\phi$ and $\tilde\phi$ be defined as in the last proposition, in particular with the bounds (73) and (74) (resp), and that the dispersion constants obey (75). Then there exists a universal function $c_0(E)$ with $c_0^{-1} \lesssim_E 1$ such that if we assume $c_0 = c_0(E)$ in the definition of $\tilde\phi$ we in addition have the bound:
\[ \|\phi - \tilde\phi\|_{S[J]} \lesssim_{\tilde{F}} 1. \tag{77} \]
We postpone the proof of the above propositions to show how to use them to conclude our induction. By the seed bound (68) we may assume that in addition to (71) and (72) above we also have $\|\phi\|_{S[J]} \leqslant 2F(E_0 + c)$ on some interval J. With this setup, and by an application of the continuity property in Proposition 3.10, it suffices to combine Propositions 4.2–4.3 to show the following:
Corollary 4.4. Assume there exist functions $\epsilon(E)$ and $F(E)$ defined up to $E_0$ such that (4) implies (5). Choose $c_0(E_0)$ according to Proposition 4.3. Then there exist extensions of $\epsilon(E)$ and $F(E)$ such that for each $0 < c \leqslant c_0(E_0)$ and each classical Wave-Map $\phi$ in a time interval J with energy $E_0 + c$ and the bounds:
\[ \sup_k \|P_k\phi\|_{L^\infty_t(L^\infty_x)[J]} \leqslant \epsilon(E_0 + c), \qquad \|\phi\|_{S[J]} \leqslant 2F(E_0 + c), \tag{78} \]
we have:
\[ \|\phi\|_{S[J]} \leqslant F(E_0 + c). \tag{79} \]
Proof. In addition to (78) ⇒ (79), we will make the additional assumption that $\tilde\phi$ is defined as a Schwartz wave map in J and satisfies:
\[ \sup_k \|P_k\tilde\phi\|_{L^\infty_t(L^\infty_x)[J]} \leqslant \epsilon(E_0), \tag{80} \]
and show that if the extensions to $\epsilon(E)$ and $F(E)$ are chosen correctly then we in addition have the following improvement to (80):
\[ \sup_k \|P_k\tilde\phi\|_{L^\infty_t(L^\infty_x)[J]} \leqslant \frac{1}{2}\epsilon(E_0). \tag{81} \]
To see that this is sufficient, we first note that by (71) the bound (80) holds in a smaller interval $J_0 \subset J$. Extending $J_0$ to a maximal interval in J, denoted still $J_0$, so that (80) holds, by the closure property in Proposition 3.10 it follows that $J_0$ must be closed. The same part of Proposition 3.10 shows that $\tilde\phi$ has a Schwartz extension to a neighborhood of $J_0$. Then by (81) applied in $J_0$ and the continuity property in Proposition 3.10 it follows that (80) holds in a larger interval. Hence $J_0$ must be both closed and open in J, and therefore $J_0 = J$.

It remains to find extensions $\epsilon(E)$ and $F(E)$ so that (78) together with (80) imply (79) and (81). Our extensions of $\epsilon(E)$ and $F(E)$ in $(E_0, E_0 + c_0]$ are constant:
\[ \epsilon(E) = \epsilon, \qquad F(E) = F, \qquad E \in (E_0, E_0 + c_0]. \]
Let $K_1(F)$ and $K_2(\tilde{F})$ be the implicit polynomials from lines (76) and (77) (resp). In order to get the improvement (81) we need that:
\[ \epsilon^{\delta_0} \cdot K_1(F) \ll \epsilon(E_0). \]
In order to conclude (79) we need that:
\[ K_2(F(E_0)) \leqslant F. \]
Finally, we also need to choose $\epsilon$ and F so that (75) holds, and so that (which is of course redundant):
\[ E_0^{\frac{1}{2}} \epsilon^{\frac{1}{2}} \ll \epsilon(E_0), \]
which was used right before line (71) to get things started. All of these goals can easily be satisfied as long as we choose $\sigma = \delta_0^{20}$, with $\delta_0 \ll 1$ sufficiently small, and then first choose F, followed by $\epsilon$, such that:
\[ F(E_0) \leqslant F^\sigma, \qquad \epsilon^\sigma \leqslant \min\{\epsilon(E_0), F^{-1}\}. \]
Notice that this process can be carried on indefinitely, regardless of the size of E, because we have taken care to decouple the step size $c_0$ from the growth and decay properties of F and $\epsilon$.

We remark that the above proof allows us to estimate the size of $F(E)$ and $\epsilon(E)$. Indeed, what we have obtained are piecewise constant functions $c_0(E)$, $\epsilon(E)$, and $F(E)$ which at the jump points $E_n$ are given by the recurrence relation:
\[ E_{n+1} = E_n + c_0(E_n), \]
and which satisfy:
\[ c_0(E_n) = c E_n^{-\sigma^{-1}}, \qquad F(E_{n+1}) = C F(E_n)^{\sigma^{-1}}, \qquad \epsilon(E_{n+1}) = c F(E_n)^{-\sigma^{-2}}, \]
with sufficiently small $\sigma$, c and sufficiently large C. The first relation shows that:
\[ E_n \approx n^{\frac{1}{\sigma^{-1}+1}}, \]
while the next two give relations of the form:
\[ F(E_n) \leqslant C_1^{\sigma^{-n}}, \qquad \epsilon(E_n) \geqslant c_1^{\sigma^{-n}}. \]
Together the last two bounds yield estimates for F and $\epsilon$ of the form:
\[ F(E) \leqslant e^{Ce^{E^M}}, \qquad \epsilon(E) \geqslant e^{-Ce^{E^M}}, \]
again with C and M sufficiently large.
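The growth rates quoted above can be checked numerically on a toy version of the recurrence. The following is a minimal sketch, assuming the relations are read as $E_{n+1} = E_n + cE_n^{-1/\sigma}$ and $F(E_{n+1}) = CF(E_n)^{1/\sigma}$; the parameter `p` stands for $1/\sigma$, and the values chosen for `p`, `c`, `C` and the starting data are purely illustrative (in particular `p` is far smaller than the paper's choice would give). The function names are hypothetical.

```python
def energy_steps(e0, c, p, n_steps):
    """Iterate E_{n+1} = E_n + c * E_n**(-p), where p stands for 1/sigma."""
    e = e0
    for _ in range(n_steps):
        e += c * e ** (-p)
    return e


def log_bound_steps(log_f0, log_big_c, p, n_steps):
    """Iterate F_{n+1} = C * F_n**p in logarithmic form to avoid overflow."""
    log_f = log_f0
    for _ in range(n_steps):
        log_f = log_big_c + p * log_f
    return log_f


# Illustrative parameters only.
p, c, e0, n = 3.0, 0.1, 1.0, 100_000

# Polynomial growth of the energies: E_n**(p+1) grows by about (p+1)*c per step,
# so E_n is comparable to n**(1/(p+1)), matching E_n ~ n^{1/(sigma^{-1}+1)}.
e_n = energy_steps(e0, c, p, n)
predicted = (e0 ** (p + 1) + (p + 1) * c * n) ** (1.0 / (p + 1))
ratio = e_n / predicted

# Exponential growth of log F in n: log F_n is of size p**n, i.e. sigma^{-n}.
log_f_10 = log_bound_steps(1.0, 1.0, p, 10)
print(ratio, log_f_10)
```

Combining the two rates, n is a power of the energy while log F grows exponentially in n, which is the source of the double exponential dependence on E.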
The remainder of this section is devoted to the proof of Propositions 4.2–4.3. This will be done in order because we will use some of the estimates of Proposition 4.2 in our demonstration of Proposition 4.3. Proof of Proposition 4.2. Denoting: , ψ = Pk∗ φ − φ
(82)
ψ Sc [J ] 1,
(83)
we will prove the stronger bound:
where {ck } is the (δ0 , δ0 )-admissible frequency envelope ck = 2−δ0 |k−k∗ | δ0 . We first consider the initial data for ψ. By an immediate application of Proposition 11.1 and the energy dispersion bound (4.2) we have: 1
1
Pk ψ[0] H˙ 1 ×L 2 E 4 2− 2 |k−k∗ | .
(84)
Since ψ is a Schwartz function, this implies that for a small interval I ⊂ J containing t = 0 we have: ψ Sc [I ] 1.
(85)
Using this as a seed bound, by the continuity property in Proposition 3.10 it suffices to prove that (83) holds under the bootstrap assumption: ψ Sc [I ] 2.
(86)
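As a preliminary step we use the general renormalization bound (64) as well as the high modulation bound (66), which in light of the estimates on each of lines (73) and (74) imply the set of inequalities:
$|||\phi|||_{W[J]} \lesssim_F 1$, $|||\tilde\phi|||_{W[J]} \lesssim_F 1$, (87)
$\| \phi \|_{X[J]} \lesssim_F \epsilon^{\delta_1}$, $\| \tilde\phi \|_{X[J]} \lesssim_F \epsilon^{\delta_1}$. (88)
The proof is deduced in a series of steps:
Step 1 (Outline of the proof). The equation for $\psi$ has the form:
$\Box \psi = -P_{\leq k_*}\big( S(\phi)\partial^\alpha \phi\, \partial_\alpha \phi \big) + S(\tilde\phi)\partial^\alpha \tilde\phi\, \partial_\alpha \tilde\phi$. (89)
This may be rewritten as follows:
$\Box \psi = -D(\tilde\phi, \psi) + C(\phi)$, (90)
where the difference $D$ and the generalized commutator $C$ are defined as follows:
$D(\tilde\phi, \psi) = S(\tilde\phi + \psi)\partial^\alpha(\tilde\phi + \psi)\partial_\alpha(\tilde\phi + \psi) - S(\tilde\phi)\partial^\alpha \tilde\phi\, \partial_\alpha \tilde\phi$, (91)
$C(\phi) = S(\phi_{\leq k_*})\partial^\alpha \phi_{\leq k_*} \partial_\alpha \phi_{\leq k_*} - P_{\leq k_*}\big( S(\phi)\partial^\alpha \phi\, \partial_\alpha \phi \big)$. (92)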
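As a consistency check, the passage from (89) to (90) uses only the identity $\tilde\phi + \psi = P_{\leq k_*}\phi$ from (82). A sketch, writing $\phi_{\leq k_*} = P_{\leq k_*}\phi$ and abbreviating $G(u) = S(u)\partial^\alpha u\, \partial_\alpha u$ (the symbol $G$ is a shorthand introduced only for this check):

```latex
% G(u) := S(u) \partial^\alpha u \, \partial_\alpha u  (hypothetical shorthand)
\begin{align*}
-D(\tilde\phi,\psi) + C(\phi)
 &= -\big[ G(\tilde\phi + \psi) - G(\tilde\phi) \big]
    + \big[ G(\phi_{\leq k_*}) - P_{\leq k_*} G(\phi) \big] \\
 &= -G(\phi_{\leq k_*}) + G(\tilde\phi) + G(\phi_{\leq k_*}) - P_{\leq k_*} G(\phi) \\
 &= -P_{\leq k_*} G(\phi) + G(\tilde\phi) ,
\end{align*}
```

which is exactly the right-hand side of (89).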
This form of the equation will be used for proving pure L 2 estimates. Alternatively, freezing the spatial frequency k and introducing a frequency gap parameter m 20, we will write (89) in the following paradifferential form: m α
which will be useful for establishing N estimates. Here we are writing: α
(93)
(94)
These terms are chosen roughly as follows. The term $D_k^m$ denotes differences of the general form (91) between $\tilde\phi$ and $\psi$ which are frequency localized according to the $T^m$ structure defined on line (53). In particular, these never contain Low × Low × High or Low × High × Low interactions. The term $L_k^m$ contains certain Low × Low × High and Low × High × Low interactions in $\tilde\phi$ and $\psi$ differences, with the additional structure that $\psi$ is always at Low frequency with a (possibly large) $m$ dependent gap. Finally, the expression $C_k^m(\phi)$ contains $\phi$ dependent commutators of the form (92). With this setup, we prove the following estimates. First, we show that the commutators are always favorable, regardless of $m$: (C, C m )
1 (L 2t ( H˙ − 2 )∩N )c [I ]
1 2
F 2 δ1 .
(95)
Second, under the bootstrapping assumption (86), we will show the first two terms on the RHS of (93) may be estimated as follows:
$\| D_k^m(\tilde\phi, \psi) \|_{N_c[I]} \lesssim_F 2^{4\delta_0 m}$, (96)
$\| L_k^m(\tilde\phi, \psi) \|_{N_c[I]} \lesssim_F \epsilon^{\delta_0} + 2^{-\delta_0 m}$. (97)
While the second of these last two estimates is favorable for closing a bootstrap via Proposition 3.2, the first is not. However, via Remark 3.3 the above estimates with $m = 20$ allow us to gain renormalization control of $\psi$, namely:
$\| \psi \|_{W_c[I]} \lesssim_F 1$. (98)
To close the bootstrap, we now use two additional estimates. The first shows that with (87) and (98), we have improved $L^2$ control:
$\| D(\tilde\phi, \psi) \|_{L^2_t(\dot H^{-\frac12})_c[I]} \lesssim_F \epsilon^{\delta_1}$. (99)
In particular by this, (95), and the gap condition (75) we have:
$\| \psi \|_{X_c[I]} \lesssim_F \epsilon^{\delta_1}$. (100)
Finally, we show that this last estimate, (98), and (87)–(88) allow the following drastic improvement to (96):
$\| D_k^m(\tilde\phi, \psi) \|_{N_c[I]} \lesssim_F \epsilon^{\delta_1^2}$. (101)
The bootstrap is therefore concluded by choosing $m = \delta_1 |\ln(\epsilon)|$ in estimate (97), and applying the linear bound (41) for the paradifferential flow, with the estimates (101), (95) for the right-hand side and (84) for the initial data. We remark that while (41) gives a stronger $W_c[I]$ control for $\psi$, in order to conclude the bootstrap we only use a weaker $S_c$ bound which follows from the $W_c[I]$ bound.
Step 2 (The algebraic decomposition). Here we derive the form of the RHS of (93). To uncover this, we shall employ the following generic notation. We let $T$ be a trilinear expression of the form:
$T\big( S(\phi^{(1)}), \phi^{(2)}, \phi^{(3)} \big) = L\big( S(\phi^{(1)}), \partial^\alpha \phi^{(2)}, \partial_\alpha \phi^{(3)} \big)$, (102)
with $L$ disposable, and $S$ a smooth function with uniformly bounded derivatives. From this we may define the $T$ dependent expressions $D$ and $C$ as on lines (91)–(92). The frequency localized equation for $\psi$ is:
$\Box \psi_k = -P_k P_{\leq k_*}\big( S(\phi)\partial^\alpha \phi\, \partial_\alpha \phi \big) + P_k\big( S(\tilde\phi)\partial^\alpha \tilde\phi\, \partial_\alpha \tilde\phi \big)$, (103)
which may be written in the form:
m ), φ , φ − Pk∗ Tkm (S(φ), φ, φ) = Dm + C m , = Tkm S(φ T1;k 1;k 1;k
(105)
with $T^m$ defined as on line (53). We now employ the geometric identity for the second fundamental form:
$\sum_c S^c_{ab}(\phi)\nabla_{t,x}\phi^c \equiv 0$, (106)
which follows simply because the constraint on the image of $\phi$ to lie in $M$ implies that $\nabla_{t,x}\phi$ lies in $T_\phi M$. This is valid for $\tilde\phi$ as well, because it is an exact wave-map. Therefore, we have the zero expression: ∂α φ
(108)
α α and Aα where both A
(109)
where each individual form is defined as a T m from line (53) applied separately to the two trilinear expressions on the LHS of (107). We now assign the generalized difference labels on the RHS of (93) by setting m , where the two summands were defined on lines (105) and (109). Dkm = i Di;k m the corresponding expression which results To assign Ckm , we further denote by C3;k from commuting the Pk∗ in the second term on the RHS of line (108). We then set m Ckm = i Ci;k . With these choices, Eq. (103) may be written in the form: k − 2 Aα
α α Lm k = 2 A
(110)
and the form of (93) is achieved. The remainder of the proof shows estimates (95), (96), (97), (99), and (101). Step 3 (Estimates for commutators). Here we demonstrate (95). Let C be any expression of the form:
$C = T\big( S(P_{\leq k_*}\phi), P_{\leq k_*}\phi, P_{\leq k_*}\phi \big) - P_{\leq k_*} T\big( S(\phi), \phi, \phi \big)$. (111)
We will prove the general pair of bounds: C
1
1 L 2t ( H˙ − 2 )c [I ]
1 2
F 2 δ1 , C Nc [I ] F 2 δ1 .
(112)
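As a preliminary step we decompose $C = C_1 + C_2$ where:
$C_1 = T\big( S(\phi_{\leq k_*}) - S(\phi)_{\leq k_*}, \phi_{\leq k_*}, \phi_{\leq k_*} \big)$,
$C_2 = T\big( S(\phi)_{\leq k_*}, \phi_{\leq k_*}, \phi_{\leq k_*} \big) - P_{\leq k_*} T\big( S(\phi), \phi, \phi \big)$.
These terms are handled separately:
Step 3A (Estimates for $C_1$). This is based on the Moser type estimate:
$\| S(\phi_{\leq k_*}) - S(\phi)_{\leq k_*} \|_{S_k[I]} \lesssim_F 2^{-\delta |k-k_*|}$. (113)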
To prove this, we further decompose the difference as:
S(φk∗ ) − S(φ)k∗ = P>k∗ S(φk∗ ) + Pk∗ S (φk∗ , φ>k∗ )φ>k∗ , where here S is a bounded and smooth function or its arguments which results from the difference S(φk∗ + φ>k∗ ) − S(φk∗ ). The bound (113) now follows by directly applying the Moser estimate (27) to the first term on the last line above, and by applying a combination of the product estimate (20) and the Moser estimate (26) to the second. To conclude the proof of (112) for the term C1 we need to split the output frequency into two cases: k k∗ + 10 or k > k∗ + 10. In the first case, we directly use (52) and (57), which together provide (112) in light of (87)–(88) and the additional L ∞ estimate: ∞ P k∗ +10), we establish (112) by directly appealing to estimates (51) and (55), which suffice because of (113) and the observation that due to the fact T is translation invariant we have the identity:
Pk C1 = Pk T Pk+O(1) S(φk∗ ) − S(φ)k∗ , φk∗ , φk∗ . Step 3B (Estimating the Term C2 ). We first observe that from the definition we have Pk C2 ≡ 0 whenever k > k∗ + 10. Thus, we only need to deal with the frequency range k k∗ + 10. We split this range into two regions: either k∗ − m k k∗ + 10 or k < k∗ − m. Here m is defined as follows: 2−m = δ1 .
(114)
Note that this definition has nothing to do with the m in the decomposition (93), and is only local to this step. We now estimate separately: Step 3B.1 (The range k∗ − m k k∗ + 10). We may write:
Pk C2 = Tkm S (φ)k∗ , φk∗ , φk∗ − Tkm (S (φ), φ, φ)
+ 2−k∗ L 1 ∇x S (φ)
+ L 3 ∇x S (φ)
(115) where the Tkm are defined as on line (53) with the additional structure and frequency L i are an additional collection of translation localizations from the definition of C2 . The invariant and disposable trilinear forms resulting from the commutator rule (10) applied to the second and third terms on the RHS of line (53). In particular, this commutator is trivial unless k∗ − 10 k k∗ + 10, so L i ≡ 0 without this further restriction. For the first two terms on the RHS (115), we use (87)–(88) which allows us to apply (51) or (55), and these suffice to give (112) in this case because of the frequency gap (114) and the conditions (8) on the δi . It remains to estimate the commutators. From the version of estimate (51) in Remark 3.8, and the fact that: ∇x S(φ)
we directly have the L 2 bound from line (112) via (114) and the range restriction k∗ − 10 k k∗ + 10. To prove the N estimate, we similarly only need to show: 2−k∗ L i N [I ] F 2−m . To estimate 2−k∗ L 1 in N [I ] we use (25) as follows (again using k = k∗ + O(1)):
2−k∗ L 1 N [I ] F 2−k∗
2k1 2−δ(k1 −k2 )+ F 2−m .
k1 ,k2 k−m
The details of these calculations for other L i are similar and left to the reader. Step 3B.2 (The range k < k∗ − m). Here we simply decompose: Pk C2 = −
Pk L S(φ)k1 , ∂ α φk2 , ∂α φk3 ,
ki : max{ki }>k∗
so in particular at least one of the second two factors must be in the range ki > k∗ − 10. We remark that this sum has a T m structure of the form (53), so smallness is guaranteed. The main issue is to also recover the exponential falloff in the definition of {ck }. This may be achieved via a direct application of estimates (51) and (55) by first introducing a high frequency (δ0 , δ0 )-admissible W[I ] envelope for P>k∗ −10 φ which we denote by {dk }. In particular, we have dk F 2δ0 (k−k∗ ) , so we directly have (112) for the above sum. Step 4 (Estimates for matched frequency differences). Here we prove (96), (99), and (101). To do this, it is enough to demonstrate the bound: , ψ) Nc [I ] F 24δ0 m , D m (φ
(116)
under the assumptions (86) and (87), the bound: , ψ) D(φ
1
L 2t ( H˙ − 2 )c [I ]
F δ1 ,
(117)
under the additional assumption (98), and finally the improved estimate: , ψ) Nc [I ] F D m (φ δ1 , 2
(118)
assuming all of the above and also using (88) and (100). Here D is any expression of the form (91) for a general trilinear form T as on line (102), and Dm denotes a similar expression in terms of T m from line (53). To some extent these tasks are redundant, so we will make some effort to collapse cases.3 First we introduce the following decomposition of differences of trilinear expressions of the form (102):
$T\big( S(\tilde\phi), \tilde\phi, \tilde\phi \big) - T\big( S(\tilde\phi + \psi), \tilde\phi + \psi, \tilde\phi + \psi \big) = R_0 + R_1 + R_2 + R_3 + R_4$.
3 We note that one can eliminate this redundancy and also simplify some of the case analysis by simply replacing the $S_c[I]$ bootstrap of the current proof with a direct bootstrap with respect to the $W_c[I]$ space. We will not pursue this here.
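Here we have set:
$R_0 = T\big( S'(\tilde\phi, \psi)\psi, \tilde\phi, \tilde\phi \big)$, (119a)
$R_1 = T\big( S'(\tilde\phi, \psi)\psi, \psi, \tilde\phi \big) + T\big( S'(\tilde\phi, \psi)\psi, \tilde\phi, \psi \big) + T\big( S'(\tilde\phi, \psi)\psi, \psi, \psi \big)$, (119b)
$R_2 = -T\big( S(\tilde\phi), \psi, \tilde\phi \big)$, (119c)
$R_3 = -T\big( S(\tilde\phi), \tilde\phi, \psi \big)$, (119d)
$R_4 = -T\big( S(\tilde\phi), \psi, \psi \big)$. (119e)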
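The decomposition into $R_0, \ldots, R_4$ can be verified using only the linearity of $T$ in each of its three slots. A sketch, in which $\Sigma$ is a hypothetical abbreviation for the difference $S(\tilde\phi) - S(\tilde\phi + \psi)$ appearing in (120):

```latex
% \Sigma := S(\tilde\phi) - S(\tilde\phi + \psi)   (abbreviation used only here)
\begin{align*}
T\big(S(\tilde\phi+\psi), \tilde\phi+\psi, \tilde\phi+\psi\big)
 &= T\big(S(\tilde\phi) - \Sigma,\ \tilde\phi+\psi,\ \tilde\phi+\psi\big) \\
 &= T(S(\tilde\phi),\tilde\phi,\tilde\phi) + T(S(\tilde\phi),\psi,\tilde\phi)
  + T(S(\tilde\phi),\tilde\phi,\psi) + T(S(\tilde\phi),\psi,\psi) \\
 &\quad - T(\Sigma,\tilde\phi,\tilde\phi) - T(\Sigma,\psi,\tilde\phi)
  - T(\Sigma,\tilde\phi,\psi) - T(\Sigma,\psi,\psi) \\
 &= T(S(\tilde\phi),\tilde\phi,\tilde\phi) - R_2 - R_3 - R_4 - R_0 - R_1 .
\end{align*}
```

Moving the $R_i$ to the other side gives the stated identity.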
Here we are using $S'$ as shorthand for the formula:
$S(\tilde\phi) - S(\tilde\phi + \psi) := S'(\tilde\phi, \psi)\psi$, (120)
so that it symbolically represents some additional set of smooth functions obeying the same bounds as the original second fundamental form S. We proceed to estimate the above terms via two subcases: Step 4A (Estimates (116)–(118) for the terms R1 –R4 ). The bound (116) for the terms R2 –R4 follows immediately from the estimate (61) of Remark 3.8 in view of the bounds (86) and (87). The estimates (117) and (118) for terms R2 –R4 under the additional assumptions (98) and then (100) result directly from estimates (51) and (55). It remains to prove these estimates for R1 . This will be accomplished with the aid of the following three bounds, which we state in more detail here for their use in the next step: Sc [I ] 1, P>k∗ +10 φ S (φ , ψ)ψ Sk [I ] F ck , , ψ)ψ S[I ] F ck , for k k∗ + 20. P
(121) (122) (123)
The proof of the first bound follows immediately from the bootstrapping assumption (86). The second bound follows from the first and estimates (86) and (87) after an application of the product bounds (19)–(20) and the Moser estimate (27). The last estimate above follows by summing the second and using the explicit form of the frequency envelope {ck }, and also using the fact that the product vanishes for k = −∞. The proof of (116)–(118) for R1 is a direct application of estimates (61), (51), and (55) in conjunction with the bounds (122)–(123) above. Step 4B (Estimates (116)–(118) for R0 ). We first demonstrate (117)–(118). We split things into two output frequency cases: k k∗ + 20 and k > k∗ + 20. In the first, we combine the bounds (52) and (57) with both (122) and (123) above. Notice that the condition k k∗ + 20 and the specific form of the frequency envelope {ck } below k∗ gives the desired result. To deal with R0 in the output range k > k∗ + 20 requires additional work. Note that any estimate of the form (123) is false for k > k∗ + 20, so the needed frequency factors. For this we employ envelope control needs to come from the second and third φ the following version of (121): Wc [I ] F 1, P>k∗ +10 φ
(124)
which follows from the assumption (98). To use it, we split R0 into a sum of pieces (we may drop L from the picture): ∂α Pk∗ +20 R0,2 and P>k∗ +20 R0,3 are an immediate consequence of estimate (51) and (55) because in either case we can use {ck } frequency factors. To handle the envelope control of the high frequencies of at least one of the φ term P>k∗ +20 R0,1 we again use (51) and (55) along with the bound (122), which provides the needed {ck } factor on account of the forced Pk frequency localization of the first factor in Pk P>k∗ +20 R0,1 . The proof of estimate (116) is similar to (116) above except that we use (121) instead of (124), and (61)–(62) instead of (55) and (57). Step 5 (Estimates for ψ at low frequency). Here we prove (97). From the definition of Lm k on line (110), we see that it suffices to prove the two general bounds: (2) N [I ] F ( (1) ∂ α ψ
(1) ∂α φ (2) N [I ] F ( , ψ)ψ ∂αφ δ0 + 2−δ0 m )ck , S (φ
(126) (127)
(i) S[I ] F 1, and where the φ (i) also has high frequency improvement (121). where φ We split into two cases: Step 5A (The range k < k∗ + 10). For estimate (126) we use (21) and (24) which together give: c j F ck−m F 2−δ0 m ck . (L.H.S.)(126) F j
For estimate (127) we use (25) and (122): (L.H.S.)(127) F 2−δ( j−k1 )+ c j F j,k1
(k − m − j)c j F 2−δ0 m ck .
j
(i) also have high Step 5B (The range k > k∗ + 10). Here we use the fact that the φ frequency improvement (121), which already incorporates {c }. We only need to gain a k small factor. By (21) and (24), and the fact that k ψ Sk [I ] F δ0 we immediately have (126). The proof of (127) follows from (122) and (25), and the summation: |k∗ − j|c j F δ0 ck . (L.H.S.)(127) F ck j
(1) was used in the range k1 > k∗ to reduce the Notice that the exponential falloff of φ factor of (k − m − j) from the previous step to |k∗ − j| here, and thus avoid a logarithmic divergence. This concludes our demonstration of Proposition 4.2. Proof of Proposition 4.3. We denote the difference to be estimated by: , φ high = φ − φ
which represents the evolution of high frequencies in φ. This solves the equation: ∂α φ . )∂ α φ φ high = −S(φ)∂ α φ∂α φ + S(φ
(128)
A natural attempt is to argue directly as in the preceding proof, namely to replace this nonlinear equation for $\phi^{high}$ with a linear paradifferential equation plus a nonlinear perturbative term. However, if we do that directly we encounter some difficulties. Precisely, the initial data $\phi^{high}[0]$ has size on the order of $c_0 = c_0(E)$. Solving the linear paradifferential equation we lose a constant which depends on the $S[I]$ size of the coefficients, namely at least $K(F(E))$. Thus the solution for the approximate linear flow will have size $c_0(E)K(F(E))$, and the key point is that we cannot expect this to be small. This would cause the nonlinear effects to be truly non-perturbative, and therefore outside the scope of the current paper.
One fix to this would be to allow $c_0$ to depend on $F(E)$ instead of $E$. However, this would weaken our conclusion to the point where the induction on energy argument only works to show that the set of energies where one has regularity is open. While this is the usual first step in an induction on energy strategy, it still leaves one to deal with the heart of the matter which is the task of showing that there is no finite upper bound to the set of regular energies. Our path here will be to establish this latter claim more directly.
As a first step in our argument, we subdivide the time interval $I$ into consecutive subintervals $I_k$, and we can ensure that on each such subinterval we have the partial fungibility:
$\| \tilde\phi \|_{S[I_k]} \lesssim E$. (129)
This is possible by estimate (65). This estimate remedies the bootstrapping argument within the first interval because by design $\phi^{high}$ has small initial energy (see (130) below). However, one might expect that in each subinterval $I_i$ the energy of $\phi^{high}$ may grow by a $K(E)$ factor, and the number of intervals where (129) is true unfortunately depends on $F(E)$. Thus a brute force bound would allow the energy of $\phi^{high}$ to grow by a $K(F(E))$ factor, which would bring us back to the core difficulty.
However, such a brute force approach does not make any good use of the fact that both $\phi$ and $\tilde\phi$ are true Wave-Maps, and therefore exactly conserve their energy. In order to take advantage of this last observation, we compute that for each fixed $t \in I$:
$E(\phi) = E(P_{>k_*}\phi) + E(P_{\leq k_*}\phi) + 2\langle P_{>k_*}\phi, P_{\leq k_*}\phi \rangle \geq E(P_{>k_*}\phi) + E(P_{\leq k_*}\phi)$,
where $\langle \cdot, \cdot \rangle$ denotes the $\dot H^1 \times L^2$ inner product. The last inequality holds because $P_{\leq k_*} P_{>k_*}$ is a nonnegative operator because it has a nonnegative symbol. By Proposition 4.2 just proved, we have the fixed time bound in $I$:
$\| (P_{\leq k_*}\phi - \tilde\phi)[t] \|_{\dot H^1 \times L^2} \lesssim_F \epsilon^{\delta_0}$.
Hence we have:
$|E(P_{\leq k_*}\phi) - E(\tilde\phi)| + |E(\phi - \tilde\phi) - E(P_{>k_*}\phi)| \lesssim_F \epsilon^{\delta_0}$.
Thus
$E(\phi^{high}) \leq E(P_{>k_*}\phi) + \epsilon^{\delta_0} K(F(E))$
$\leq E(\phi) - E(P_{\leq k_*}\phi) + \epsilon^{\delta_0} K(F(E))$
$\leq E(\phi) - E(\tilde\phi) + 2\epsilon^{\delta_0} K(F(E))$
$\leq c_0(E) + 2\epsilon^{\delta_0} K(F(E))$. (130)
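The nonnegativity of the cross term used above can be made explicit on the Fourier side. The following is a sketch under the assumption that $P_{\leq k_*}$ and $P_{>k_*}$ are Fourier multipliers with real symbols $p_{\leq}(\xi)$ and $p_{>}(\xi)$ taking values in $[0,1]$ (these symbol names are introduced here only for illustration), and up to the normalization constant of the Fourier transform:

```latex
% p_{\leq}, p_{>} : assumed symbols of P_{\leq k_*}, P_{>k_*}, with 0 \leq p \leq 1.
\begin{align*}
\langle P_{>k_*}\phi, P_{\leq k_*}\phi \rangle_{\dot H^1 \times L^2}
 &= \int_{\mathbb{R}^2} p_{>}(\xi)\, p_{\leq}(\xi)
    \Big( |\xi|^2\, |\hat\phi(t,\xi)|^2 + |\partial_t \hat\phi(t,\xi)|^2 \Big)\, d\xi
 \;\geq\; 0 .
\end{align*}
```

The integrand is pointwise nonnegative because both symbols are, which is the sense in which the product operator has a nonnegative symbol.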
This calculation shows that if $\epsilon$ is small enough with respect to the function $K(F)$ which appears implicitly on the RHS of estimate (76) then we have a good uniform bound on the energy of $\phi^{high}$. The argument now proceeds in a series of steps:
Step 1 (The bootstrapping construction, and the main estimates). We now fix the interval $I_i \subseteq I$ and consider an $S$-norm bootstrap for $\phi^{high}$ on subintervals $J \subseteq I_i$, where we may assume $J$ is centered about $t = 0$. We seek to prove the bound:
$\| \phi^{high} \|_{S[J]} \leq 1$. (131)
Due to (130) we have:
$\| \phi^{high}[0] \|_{\dot H^1 \times L^2} \lesssim c_0$. (132)
Hence by the second part of Proposition 3.10 we obtain the seed bound $\| \phi^{high} \|_{S[J_0]} \lesssim c_0$ for a small enough interval $J_0 \subset J$. Taking this into account and also the continuity of the $S$ norm in Proposition 3.10, it suffices to prove (131) under the additional bootstrap assumption:
$\| \phi^{high} \|_{S[J]} \leq 2$. (133)
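Combining (133) with (129) we obtain:
$\| \phi \|_{S[J]} + \| \tilde\phi \|_{S[J]} \lesssim_E 1$. (134)
By Proposition 3.9 this gives:
$\| \phi \|_{W[J]} + \| \tilde\phi \|_{W[J]} \lesssim_E 1$, $\| \phi \|_{X[J]} + \| \tilde\phi \|_{X[J]} \lesssim_E \epsilon^{\delta_1}$. (135)
We rewrite the bounds (132), (133) and (135) using frequency envelopes. Precisely, we can find a common $(\delta_0/2, \delta_0/2)$-admissible normalized frequency envelope $c_k$ so that $c_{k_*} = 1$ and the following bounds hold:
$\| \phi^{high}[0] \|_{(\dot H^1 \times L^2)_c} \lesssim_E c_0$, (136a)
$\| \phi^{high} \|_{S_c[J]} \lesssim_E 1$, (136b)
$\| \phi \|_{W_c[J]} + \| \tilde\phi \|_{W_c[J]} \lesssim_E 1$, (136c)
$\| \phi \|_{X_c[J]} + \| \tilde\phi \|_{X_c[J]} \lesssim_E \epsilon^{\delta_1}$. (136d)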
From these four bounds, together with the energy dispersion on lines (73)–(74), we will obtain the following vastly improved frequency envelope S bound for φ high : δ0 δ1 . φ high Sλc [J ] E 1, λ = c0 + 2
(137)
The second term on the right is small ( 1) due to (75), therefore the desired conclusion (131) follows if c0 is chosen appropriately small, c0 E 1. It remains to show that (73)–(74) together with (136) imply (137). By estimate (83), we may reduce this demonstration to the frequency range k > k∗ − 10. The mechanics of our argument is to decompose the Pk frequency localized version of (128) as follows: high
φk
+ 2 Aα (φ)
high
) + Tkm (φ) + Lm high ), = Tkm (φ k (φ , φ
(138)
where the large gap parameter $m$ is consistent with Proposition 3.6:
$2^{-m} = \epsilon^{\delta_1}$. (139)
Here the terms Tkm are matched frequency trilinear expressions of the form (53), while high which contain at the term Lm k denotes certain trilinear expressions between φ and φ least one (m dependent) low or high frequency factor with improved exponential bounds. Our first round of estimates shows that: 2 2 ) N [J ] E Tkm (φ δ1 ck , Tkm (φ) N [J ] E δ1 ck .
(140)
Our second round of estimates gives the exponential control: high ) N [J ] E 2− 4 δ0 m 2− 2 δ0 |k−k∗ | = Lm 4 δ0 δ1 2− 2 δ0 |k−k∗ | . k (φ , φ 1
1
1
1
(141)
An application of (41) using (136a) and (140)–(141) then implies (137). Step 2 (Algebraic derivation of (138)). We first write the frequency localized equation for φ high as follows: high
φk
1;k
where the $T^m_{1;k}$ are trilinear forms as defined on line (53). Adding to this a zero expression similar to (107) (i.e. without $P_{\leq k_*}$ on the second factor), and further decomposing the result into principal terms and $T^m$ interactions, we have: high
φk
k + Tkm (φ) + Tkm (φ )
Then Eq. (138) is achieved by setting: α high α k . ∂α φ = − A ( φ + φ ) − A ( φ ) Lm
(142)
Step 3 (Control of matched frequency interactions). Here we prove (140). This is an immediate consequence of (55) using (136c)–(136d). Step 4 (Control of separated frequency interactions). Here we prove the estimate (141). We decompose line (142) as a sum of two terms Lm k = R1 + R2 where: k , )
where S now denotes the antisymmetrization of the original second fundamental form, and S is defined as on line (120). Recall that we are restricted to the conditions k k∗ − 10 and m 20. We proceed to estimate each term separately. In doing so, we repeatedly use the following estimates: S[J ] 2−δ0 (k−k∗ ) , k k∗ , Pk φ −δ0 (k∗ −k)
S[J ] 2 , k k∗ , Pk φ high high −δ0 (k∗ −k) S[J ] E 2 )φ , k k∗ , Pk S (φ , φ high
, φ high )φ high S[J ] E 1. S (φ
(143) (144) (145) (146)
Estimates (143)–(144) are simply a weaker restatement of (83) for the convenience of the reader. Estimates (145)–(146) follow from (144), (133)–(134), and the Moser and product estimates (19)–(20) and (26)–(27) after a standard summation argument. Step 4A (Estimating R1;k ). After an application of (21)–(22) to peel off the first factor, it suffices to show the bound: k N [J ] E 2− 4 δ0 m 2 2 δ0 (k∗ −k) . ∂ α φ
high
1
If k < k∗ this follows at once from (134) and (144), and summing over (24). If k > k∗ , we use (143), and (144) or (133) in (24) to obtain after summation: high k N [J ] E 2−δ0 (k−k∗ ) ∂ α φ
2−δ0 (k∗ − j)+ .
j
If k∗ < k < k∗ + m then the expression on the right gives 2−δ0 m which suffices. If k > k∗ + m then the expression on the right gives |k − k∗ |2−δ0 (k−k∗ ) which is again sufficient for (141). Step 4B (Estimating R2;k ). In this final step we show the estimate: 1
1
R2;k N [J ] E 2− 4 δ0 m 2 2 δ0 (k∗ −k) . In the case when k k ∗ , using (145), (134), and the trilinear estimate (25) we have the sum: R2;k N [J ] E
2−δ( j−k1 )+ 2−δ0 (k∗ − j) E 2−δ0 m 2−δ0 |k−k∗ | .
j,k1
Finally, in the case where k > k∗ we use (134), (143), and (145)–(146) in conjunction with (25) to achieve the sum: R2;k N [J ] E 2−δ0 (k−k∗ )
2−δ( j−k1 )+ 2−δ0 (k∗ − j)+ .
j,k1
If k∗ < k < k∗ + m then the expression on the right gives 2−δ0 m which is enough for (141). If k > k∗ + m then the expression on the right gives |k − k∗ |2−δ0 (k−k∗ ) which is again sufficient. This concludes our demonstration of Proposition 4.3.
5. The Iteration Spaces: Basic Tools and Estimates
This is a continuation of Sect. 2, and our purpose is to fill in any gap between the notation and additional structure of basic function spaces used in this paper and the spaces developed in [32, 33] and [29].
5.1. Space-time and angular frequency cutoffs. As usual we denote by Q j the multiplier with symbol: q j (τ, ξ ) = ϕ 2− j ||τ | − |ξ || , where ϕ truncates smoothly on a unit annulus. We denote by Q ±j the restriction of this multiplier to the upper or lower time frequency space. At times we also denote by Q |τ ||ξ | = Q 0. We denote by κ ∈ K l a collection of caps of diameter ∼ 2−l providing a finitely overlapping cover of the unit sphere. According to this decomposition, we cut up the spatial frequency domain according to: Pk = Pk,κ . κ∈K l
These decompositions often occur in conjunction with modulation cutoffs on the order of j = k − 2l, and a central principle is that the corresponding multipliers Q ±
0, 1
X ∞2
+ φk S + sup φ S[k; j] . j
(147) In general, the fixed frequency space X s,b p is defined as: p p 2 pbj Q j Pk φ L 2 (L 2 ) , Pk φ s,b := 2 psk Xp
t
j
x
s,b . Here we define the “physical space Strichartz” with the obvious definition for X ∞ norms:
φk S :=
sup
2
( q1 + r2 −1)k
∇t,x φk L qt (L r ) , x
(q,r ): q2 + r1 21
(148)
the “modulational Strichartz” norms: ⎛ ⎞1 2 k− j 2 ⎠ > 10, (149) φ S[k; j] := sup ⎝ Q± P φ , l= S[k,κ]
and the “angular Strichartz” space in terms of the three components: φ S[k,κ] := 2k sup dist(ω, κ) φ L ∞ 2 t (L w ω∈2κ / 1
1
+2 2 k |κ|− 2
inf
ω ω φ =φ
ω
ω
ω)
+ 2k φ L ∞ 2 t (L x )
φω L 2
∞ tω (L xω )
.
(150)
The first component on the RHS above will often be referred to as NFA∗ . We define S as the space of functions φ in R2+1 with ∇x,t φ ∈ C(R; L 2x ) and finite norm: φ 2S = φ 2L ∞ (L ∞ ) + φ 2Sk , t
x
k
and also use the frequency envelope convention from Sect. 2 to define Sc . To measure the derivatives of functions in S we introduce a related space DS: Definition 5.2 (Differentiated S functions). We define the norm: φ DSk := φk L ∞ (L 2x ) + φk
0, 1 X ∞2
+ φk DS k + 2k sup φ S[k; j] ,
(151)
j
where the DS norm is as in (148) but without the gradient: φ DS k :=
sup
2
( q1 + r2 −1)k
φ L qt (L r ) .
(152)
x
(q,r ): q2 + r1 21
The DS space is defined as the space of functions for which the square sum of the DSk norms is finite: φ 2DS = φ 2DSk . k
We remark that by definition we have: φ Sk ≈ ∇x,t φ DSk .
(153)
Definition 5.3 (Dyadic Source Term Space). For each integer k we define the following frequency localized norm: ⎛ F Nk :=
inf FA +FB + l,κ FCl,κ =F +
± l>10
κ
⎝ Pk FA 1 2 + Pk FB L (L ) t
x
0,− 21
X1
l,κ 2 inf dist(ω, κ)−2 Q ±
ω∈2κ /
2 tω (L xω )
1 ⎞ 2 ⎠. (154)
We will often refer to the last component on the RHS above as NFA, and the norm applied to a fixed Q ± FCl,κ as NFA[±κ]. ∞ For any closed interval I = [t0 , t1 ] we define spaces X [I ], X c [I ], E[I ], L ∞ t (L x )[I ], p 2 etc. as the restriction of these classical L based norms to the time slab I × Rx . We also need a similar procedure for the non-local S and N spaces. As usual we define Sk [I ], S[I ], Sc [I ], N [I ], etc. in terms of minimal extension.4 For example: (155) φ Sk [I ] = inf Sk ≡ φ on I × R2 .
4 We will modify this procedure somewhat below by an equivalent norm, but for the most part they are
interchangeable.
On an open time interval $(t_0, t_1)$ we may also define localized norms by taking $\| \phi \|_{S(t_0,t_1)} = \sup_{I \subseteq (t_0,t_1)} \| \phi \|_{S[I]}$. This definition will only be important for us as a convenience when stating results like Theorem 1.3, so the reader is safe to ignore the distinction and always assume that $I$ denotes a closed time interval. We now state a continuation of Proposition 2.3:
Proposition 5.4 (Standard Estimates and Relations: Part 2). Let $F$, $\phi$, and $\phi^{(i)}$ be a collection of test functions, $I \subseteq \mathbb{R}$ any subinterval (including $\mathbb{R}$ itself). Then the following list of properties for the $S[I]$ and $N[I]$ spaces hold:
• Time Truncation of S. Let $\chi_I$ be the characteristic function of $I$. Then
$\| \phi \|_{DS_k[I]} \approx \| \chi_I \phi \|_{DS_k} \lesssim \| \phi \|_{DS_k}$, (156)
$\| \phi \|_{S_k[I]} \approx \| \chi_I \nabla_{x,t}\phi \|_{DS_k}$. (157)
• Time Truncation of N. Let $I = \cup_i^K I_i$ be a decomposition of $I$ into consecutive intervals, and let $\chi_I$, $\chi_{I_i}$ be the corresponding sharp time cutoffs. Then the following bounds hold (uniform in $K$):
$\| \chi_I F \|_N \lesssim \| F \|_N$, (158)
$\sum_i \| \chi_{I_i} F \|_N^2 \lesssim \| F \|_N^2$. (159)
Furthermore, for any Schwartz function F the quantity χ I F N is continuous in the endpoints of I . • Basic S and N Relations. We have that: φk(1) · φk(2) DS[I ] 2(k1 −k2 ) φk(1) DS[I ] · φk(2) S[I ] , k1 < k2 − 10, 1 2 1 2 φ, Fk φ DS · Fk N , φk S ∇t,x φk 0, 1 , X1
Fk
0,− 21
X∞
(160) (161) (162)
2
Fk N .
(163)
q
• L t (L rx ) and Disposability Estimates. We have that: 1
Q j φk L 2 (L ∞ ) 2−( j−k)+ 2− 2 j φk S , (164) t
x
−k φk S , φk L ∞ 2 + Q j φk L ∞ (L 2 ) + Q j φk L ∞ (L 2 ) 2 t (L x ) t t x x
(165)
∞ + Q j φk L ∞ (L ∞ ) φk S . Q j φk L ∞ t (L x ) t x
(166)
• Fine Product Estimates. We have that: φk(1) · φk(2) L 2 (L 2 ) |κ| 2 2− 2 k1 φk(1) S[k1 ,κ] · sup φk(2) L ∞ 2 , t (L x ) 1 2 1 2 1
t
1
x
ω
ω∈κ
(2)
ω
(2)
∞ · φ P< j−10 Q < j−10 φ (1) · φk2 S[k2 ; j] φ (1) L ∞ k2 S , t (L x )
· φk(2) ) ∇t,x Pk Q j (φk(1) 1 2
0, 21
X1
(167) (168)
2δ(k−max{ki }) 2δ( j−min{ki }) φk(1) S φk(2) S , (169) 1 2
Pk (Q j Fk1 · φk2 ) N 2δ(k−max{ki }) 2δ( j−min{ki }) Fk1
0,− 21
X∞
where in estimates (169)–(170) we are assuming j min{ki }.
φk 2 S ,
(170)
Estimates (156)–(160) are proved next. The rest of the above bounds are standard, and with the exception of (168) which is Lemma 9.1 in [33], may be found in [29]. For the convenience of the reader we give the detailed citations ([29] page numbers). Estimate (161) is estimate (94) on p. 487. Estimate (162) is Lemma 8 on p. 483. Estimate (163) is Lemma 10 on p. 487. Estimates (164)–(166) are listed in estimates (81)–(84) on p. 483. Estimate (167) is (by duality) the estimate in Step 2 on p. 479, and it also follows more or less immediately by inspecting the third term on line (150) above. Estimate (169) is Lemma 13 on p. 515. Estimate (170) is Lemma 12 on p. 501.
Proof of estimates (156)–(157) and (15). Without any loss of generality we replace $\chi_I$ by $\chi = \chi_{t<0}$. Our main observation here will be that the multipliers $Q_j$ applied to $\chi$ act like time-frequency cutoffs onto dyadic sets $|\tau| \sim 2^j$. For each of these we have the Strichartz type estimate:
$\| Q_j \chi \|_{L^2_t(L^\infty_x)} \lesssim 2^{-\frac12 j}$. (171)
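Estimate (171) can be seen directly from the Fourier transform of a step function. The sketch below assumes the convention $\hat f(\tau) = \int e^{-it\tau} f(t)\, dt$ and uses that $\chi$ does not depend on $x$, so that $Q_j$ acts on it as the time-frequency cutoff $\varphi(2^{-j}|D_t|)$; away from $\tau = 0$ one has $|\hat\chi(\tau)| = |\tau|^{-1}$, and the delta mass at $\tau = 0$ is removed because $\varphi$ is supported on an annulus.

```latex
% \chi = \chi_{t<0} is constant in x, so the L^\infty_x norm is just the modulus.
\begin{align*}
\| Q_j \chi \|_{L^2_t(L^\infty_x)}^2
 = \big\| \varphi(2^{-j}|D_t|)\chi \big\|_{L^2_t}^2
 = \frac{1}{2\pi} \int_{\mathbb{R}} \varphi(2^{-j}|\tau|)^2 \, \frac{d\tau}{\tau^2}
 \approx \int_{|\tau| \sim 2^j} \frac{d\tau}{\tau^2}
 \approx 2^{-j} .
\end{align*}
```

Taking square roots gives the factor $2^{-\frac12 j}$.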
Therefore, one can look upon the estimate (156) as some version of the product bound (19). We rescale to $k = 0$, and set $\phi_0 = P_0\phi$. We begin with the proof of (156). The $DS$ bound in this estimate is immediate. Therefore we focus on proving the $X^{s,b}$ and $S[0,\kappa]$ sum portions of the estimate. This is split into cases:
Step 1 (Controlling the $X_\infty^{0,\frac12}$ norm). Freezing $Q_j$, our goal is to show that:
$\| Q_j(\chi \cdot \phi_0) \|_{L^2_t(L^2_x)} \lesssim 2^{-\frac{j}{2}} \| \phi_0 \|_{DS}$. (172)
We now split into subcases.
Step 1.A ($\chi$ at low modulation). In this case we look at the contribution of the product $Q_j(Q_{<j-10}\chi \cdot \phi_0)$. We may freely insert the multiplier $Q_{[j-5,j+5]}$ in front of $\phi_0$. Then (172) is immediate from $L^\infty$ control of $Q_{<j-10}\chi$.
Step 1.B ($\chi$ at high modulation). In this case we'll rely on the even stronger $L^2$ bound:
$\| Q_{\geq j-10}\chi \cdot \phi_0 \|_{L^2_t(L^2_x)} \lesssim \| Q_{\geq j-10}\chi \|_{L^2_t(L^\infty_x)} \| \phi_0 \|_{L^\infty_t(L^2_x)} \lesssim 2^{-\frac12 j} \| \phi_0 \|_S$, (173)
which results from summing over (171). In particular, isolating the LHS of this last line at frequency Q j we have (172) for this term. Step 2 (Controlling the S[0; j] norms). Freezing j < −20 we need to demonstrate: Q < j (χ · φ0 ) S[0; j] φ0 DS . Step 2.A (χ at low modulation). The contribution of Q < j−10 χ · Q < j φ0 is bounded via estimate (168). Notice that χ is automatically at zero spatial frequency. Step 2.B (χ at high modulation). Adding over estimate (173) we have: Q < j (Q j−10 χ · φ0 )
0, 21
X1
φ0 S ,
which is sufficient via the differentiated version of (162). To wrap things up here, we need to demonstrate the bounds (157) and (15). Beginning with $\phi \in S[I]$ we consider an extension $\phi \in S$ with comparable norm. Then by (156) and (153) we have the chain of inequalities:
$\| \nabla_{x,t}\phi \|_{DS_k[I]} \lesssim \| \chi_I \nabla_{x,t}\phi_k \|_{DS_k} \lesssim \| \nabla_{x,t}\phi_k \|_{DS_k} \approx \| \phi \|_{S_k}$.
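It remains to prove the converse. We begin with the energy norm, observing that for $\phi \in S[I]$, for $I = [-i_0, i_0]$, we have:
$\| \phi_k[\pm i_0] \|_{\dot H^1 \times L^2} \lesssim \| \nabla_{x,t}\phi_k \|_{DS_k[I]}$.
We extend $\phi$ to $I^\pm = [\pm i_0, \pm\infty)$ as a solution to the homogeneous wave equation with data $\phi[\pm i_0]$ and use (153) to compute:
$\| \phi \|_{S_k[I]} \leq \| \phi \|_{S_k} \approx \| \nabla_{x,t}\phi \|_{DS_k} \lesssim \| \chi_I \nabla_{x,t}\phi \|_{DS_k} + \| (1-\chi_I)\nabla_{x,t}\phi \|_{DS_k} \lesssim \| \nabla_{x,t}\phi \|_{DS_k[I]} + \| \phi_k[\pm i_0] \|_{\dot H^1 \times L^2} \lesssim \| \nabla_{x,t}\phi \|_{DS_k[I]}$.
The proof of (157) is concluded. Finally, we use (156) and (157) to prove (15):
$\| \phi \|_{S_k[I]} \approx \| \nabla_{x,t}\phi \|_{DS_k[I]} \lesssim \sum_i \| \chi_{I_i}\nabla_{x,t}\phi \|_{DS_k} \lesssim \sum_i \| \phi \|_{S_k[I_i]}$.
The proof is concluded.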
Proof of estimate (159). Since:
$\| F \|_N^2 \approx \sum_k \| P_k F \|_{N_k}^2$,
it suffices to show that the similar relation holds for the $N_k$ spaces:
$\sum_n \| \chi_{I_n} F \|_{N_k}^2 \lesssim \| F \|_{N_k}^2$. (174)
The space $N_k$ is an atomic space, therefore it suffices to prove (174) for each atom.
Step 1 ($L^1_t(L^2_x)$ atoms). For these we directly have the stronger relation:
$\sum_n \| \chi_{I_n} F_k \|_{L^1_t(L^2_x)} \lesssim \| F_k \|_{L^1_t(L^2_x)}$.
Step 2 ($\dot X_1^{0,-\frac12}$ atoms). For $F$ localized at frequency $2^k$ we will prove the relation:
$\sum_n \| \chi_{I_n} F_k \|^2_{L^1_t(L^2_x) + \dot X_1^{0,-\frac12}} \lesssim \| F_k \|^2_{\dot X_1^{0,-\frac12}}$.
Without any restriction in generality we can assume that $F_k$ is also localized in modulation at $2^j$. By rescaling we can take $j = 0$. At modulation 1 the $\dot X_1^{0,-\frac12}$ norm is equivalent to the $L^2_t(L^2_x)$ norm. Then the last bound would follow from the stronger estimate:
$\sum_n \Big( \| Q_{<-4}(\chi_{I_n} Q_0 F_k) \|^2_{L^1_t(L^2_x)} + \| Q_{>-4}(\chi_{I_n} Q_0 F_k) \|^2_{L^2_t(L^2_x)} \Big) \lesssim \| Q_0 F_k \|^2_{L^2_t(L^2_x)}$. (175)
We trivially have:
$\sum_n \| \chi_{I_n} Q_0 F_k \|^2_{L^2_t(L^2_x)} \lesssim \| Q_0 F_k \|^2_{L^2_t(L^2_x)}$,
therefore it remains to prove the $L^1_t(L^2_x)$ bound on line (175). We do this in two cases:
Step 2.A (Small intervals). We parse the collection of intervals $I_n$ into two sub-collections, intervals $J_n$ such that $|J_n|\geq 1$, and intervals $K_n$ such that $|K_n|<1$. In the latter case we may drop the outer $Q_{<-4}$ and simply use Hölder's inequality to estimate:
\[ \sum_n \|\chi_{K_n}Q_0F_k\|^2_{L^1_t(L^2_x)} \lesssim \sum_n \|\chi_{K_n}Q_0F_k\|^2_{L^2_t(L^2_x)}, \]
so the estimate follows as above.

Step 2.B (Large intervals). In this case we break the first term on LHS (175) up as follows:
\begin{align}
\sum_m \|Q_{<-4}(\chi_{J_m}Q_0F_k)\|^2_{L^1_t(L^2_x)}
 &= \sum_m \|Q_{<-4}(Q_{[-10,10]}\chi_{J_m}\cdot Q_0F_k)\|^2_{L^1_t(L^2_x)} \notag\\
 &\lesssim \sum_m \|Q_{[-10,10]}\chi_{J_m}\cdot Q_0F_k\|^2_{L^1_t(L^2_x)}. \tag{176}
\end{align}
Denoting $J_m=[a_m,b_m]$, for $Q_{[-10,10]}\chi_{J_m}$ we have the pointwise bounds:
\[ |Q_{[-10,10]}\chi_{J_m}(t)| \lesssim (1+|t-a_m|)^{-N} + (1+|t-b_m|)^{-N}. \]
Hence by Cauchy-Schwarz we obtain:
\begin{align*}
\text{L.H.S.}(176) &\lesssim \sum_m \Big( \|(1+|t-a_m|)^{-\frac N2}Q_0F_k\|^2_{L^2_t(L^2_x)} + \|(1+|t-b_m|)^{-\frac N2}Q_0F_k\|^2_{L^2_t(L^2_x)} \Big) \\
 &\lesssim \int_{\mathbb R} \sum_m \Big( (1+|t-a_m|)^{-N} + (1+|t-b_m|)^{-N} \Big) \|Q_0F_k(t)\|^2_{L^2_x}\,dt.
\end{align*}
Since the intervals $J_m$ are disjoint and of size at least 1, the last sum above is $\lesssim 1$ uniformly in $t$, therefore we obtain:
\[ \text{L.H.S.}(176) \lesssim \|Q_0F_k\|^2_{L^2_t(L^2_x)}. \]
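The uniform bound on the sum of weights used in Step 2.B is a counting argument: the left endpoints $a_m$ are separated by at least 1, and so are the right endpoints $b_m$, so for every $t$ the sum is at most $4\sum_{n\geq1}n^{-N}$. The following is a minimal numerical sketch of this, assuming $N=2$ and randomly generated test intervals; the helper names `make_intervals` and `weight_sum` are illustrative and do not come from the paper.

```python
import math
import random

def make_intervals(count, seed=0):
    """Disjoint intervals [a_m, b_m] of length >= 1 (illustrative test data)."""
    rng = random.Random(seed)
    intervals, left = [], 0.0
    for _ in range(count):
        left += rng.uniform(0.0, 3.0)          # gap before the next interval
        length = 1.0 + rng.uniform(0.0, 5.0)   # enforces |J_m| >= 1
        intervals.append((left, left + length))
        left += length
    return intervals

def weight_sum(t, intervals, N):
    """Sum over m of (1+|t-a_m|)^(-N) + (1+|t-b_m|)^(-N)."""
    return sum((1 + abs(t - a)) ** -N + (1 + abs(t - b)) ** -N
               for a, b in intervals)

N = 2
intervals = make_intervals(50)
# Sample t on a grid covering all intervals and record the largest weight sum.
grid = [0.1 * i for i in range(int(intervals[-1][1] / 0.1) + 1)]
worst = max(weight_sum(t, intervals, N) for t in grid)
# Each 1-separated family contributes at most 2*zeta(N); there are two families.
bound = 4 * (math.pi ** 2 / 6)
print(worst, bound)
```

The bound does not depend on the number of intervals, which is what allows the sum over $m$ to be absorbed into the implicit constant.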
Step 3 (NFA atoms). In this case we can express $F_k$ as:
\[ F_k = F_k^+ + F_k^- = \sum_{\pm,\kappa} F^\pm_{k,\pm\kappa}, \]
where $F^\pm_{k,\pm\kappa}$ is supported in the wedge carved by the multiplier $P_{k,\pm\kappa}Q^\pm_{<k-2j}$ with $j>10$, and furthermore:
\[ \sum_\kappa \|F^\pm_{k,\kappa}\|^2_{NFA[\pm\kappa]} \lesssim 1. \]
Without loss of generality we may assume we are in the $+$ case, and we rescale to $k=2j$, and so in particular $k>20$. By summing over (163) we have the $L^2_t(L^2_x)$ bound:
\[ \|F_k^+\|_{L^2_t(L^2_x)} \lesssim 1. \tag{177} \]
The $NFA[\kappa]$ norms are translation invariant, and are defined using characteristic $L^1_{t_\omega}(L^2_{x_\omega})$ norms. Thus they directly satisfy the inequality:
\[ \sum_n \|\chi_{I_n}F^+_{k,\kappa}\|^2_{NFA[\kappa]} \lesssim \|F^+_{k,\kappa}\|^2_{NFA[\kappa]}. \tag{178} \]
We write:
\[ \chi_{I_n}F_k^+ = Q_{>0}(\chi_{I_n}F_k^+) + Q^-_{<0}(\chi_{I_n}F_k^+) + \sum_\kappa Q^+_{<0}(\chi_{I_n}F^+_{k,\kappa}), \]
and estimate the first component in $\dot X_1^{0,-\frac12}$, the second in $L^1_t(L^2_x)$, and the third in NFA. We have from line (177):
\[ \sum_n \|Q_{>0}(\chi_{I_n}F_k^+)\|^2_{\dot X_1^{0,-\frac12}} \lesssim \sum_n \|\chi_{I_n}F_k^+\|^2_{L^2_t(L^2_x)} \lesssim \|F_k^+\|^2_{L^2_t(L^2_x)} \lesssim 1. \]
Next, using the restriction on the Fourier support of $F_k^+$, we have for any single interval the bound:
\[ \|Q^-_{<0}(\chi_{J_n}F_k^+)\|_{L^1_t(L^2_x)} \lesssim \|Q_{[-20,20]}\chi_{J_n}\cdot F_k^+\|_{L^1_t(L^2_x)}. \]
Using this and (177), one may proceed as in Step 2.A and Step 2.B above. On the other hand, by (178) and the disposability of $Q^+_{<0}$ on the Fourier support of the multiplier $P_{k,\kappa}$ we have:
\[ \sum_{n,\kappa} \|Q^+_{<0}(\chi_{I_n}F^+_{k,\kappa})\|^2_{NFA[\kappa]} \lesssim \sum_{n,\kappa} \|\chi_{I_n}F^+_{k,\kappa}\|^2_{NFA[\kappa]} \lesssim \sum_\kappa \|F^+_{k,\kappa}\|^2_{NFA[\kappa]} \lesssim 1. \]
Proof of estimate (160). This is a minor variation of (21), and the proof is similar to (1) that of estimate (156) above. We rescale to k2 = 0, discard I , and set φk1 DS = (2)
φ0 S = 1. Using the fact that: Q k1 +10 φk1 · φ0 . As a general tool we have the L 2 bound: k1 − j · φ0(2) ) L 2 (L 2 ) Q > j−10 φk(1) L 2 (L ∞ ) φ0(2) L ∞ Q > j−10 φk(1) 2 2 2 2 . t (L x ) 1 1 1
t
t
x
x
In particular, via the differentiated version of (162), if the output modulation is $j<k_1$ we have both:
\[ \|Q_j(Q_{>k_1+10}\phi^{(1)}_{k_1}\cdot\phi^{(2)}_0)\|_{X_\infty^{0,\frac12}} + \|Q_{>k_1+10}\phi^{(1)}_{k_1}\cdot\phi^{(2)}_0\|_{S[0;j]} \lesssim 2^{k_1}. \]
On the other hand, if the output modulation is $j>k_1$, then by again using the above general $L^2$ estimate, it suffices to show:
\[ \|Q_j(Q_{[k_1+10,j-10]}\phi^{(1)}_{k_1}\cdot\phi^{(2)}_0)\|_{X_\infty^{0,\frac12}} + \|Q_{[k_1+10,j-10]}\phi^{(1)}_{k_1}\cdot\phi^{(2)}_0\|_{S[0;j]} \lesssim \|\phi^{(1)}_{k_1}\|_{L^\infty_t(L^\infty_x)}, \]
and then conclude via an application of Bernstein's inequality. For the first term on the LHS, we may freely insert a $Q_{[j-5,j+5]}$ multiplier in front of the second factor, which suffices. For the second term, we directly use (168).
5.3. Extension and restriction for S and N functions. In the sequel we will build up estimates through an iterative process by which we first prove bounds in weak spaces (such as $P_kL^\infty$, $X$, and $E$), and then show that these may be used in conjunction with bootstrapping to establish uniform bounds in much stronger spaces (such as $S$). This process unfortunately leads to some technical difficulties regarding compatibility of extensions in various norms. To tame this difficulty, we will make use of a variable but universal extension process. Because this feature is more of a technicality in our proof, we state here for the convenience of the reader where such extensions are necessary in the sequel:

• The primary use of compatible extensions is in the proof of Proposition 3.4, most importantly in the proof of estimate (46). Here we are forced to use several norms simultaneously in a single estimate that involves space-time frequency cutoffs. As will become apparent soon, in such a situation choosing extensions needs to be done carefully because it is not immediate that this can be done in a way that retains smallness of the various component norms.
• Universal extensions are also used in a key way in the proof of Proposition 3.2 because we need to know that extensions still enjoy good characteristic energy estimates when these estimates are only known on a finite interval. This extended control needs to be used in conjunction with $S$ norm control in estimates requiring space-time frequency cutoffs (see Lemma 9.3 in Sect. 9).
• A secondary use of compatible extensions occurs because we do not include $X$ as a component of $S$ defined above. Doing this allows us to quote standard product estimates from [29] modulo physical space Strichartz components. The price one pays is that $X$ bounds are established separately, and one then needs to include this a-posteriori into extension estimates. For example, this feature is used at the beginning of the proof of Proposition 3.1 to extend the connection $B$ with good $S$ and $X$ bounds.

Proposition 5.5 (Existence of S Extensions/Restrictions). Let $\phi$ be any affinely Schwartz function defined on an interval $I=[-i_0,i_0]$.

• Canonical extension. For every $0<\eta\leq 1$ there exists a canonical extension $\Phi^{I,\eta}$ which is compactly supported in time and for which the following estimates are true:
\begin{align}
\|P_k\Phi^{I,\eta}\|_S &\lesssim \|P_k\phi\|_{S[I]}, \tag{179}\\
\|P_k\Box\Phi^{I,\eta}\|_N &\lesssim \|P_k\phi\|_{E[I]} + \|P_k\Box\phi\|_{N[I]}, \tag{180}\\
\|P_k\Phi^{I,\eta}\|_{L^\infty_t(L^\infty_x)} &\lesssim \eta^{-\frac52}\|P_k\phi\|_{L^\infty_t(L^\infty_x)[I]} + \eta^{\frac12}\|P_k\phi\|_{X[I]}, \qquad |I|\geq 2^{-k}\eta^2, \tag{181}\\
\|P_k\Phi^{I,\eta}\|_X &\lesssim \eta^{\frac12}\|P_k\phi\|_{E[I]} + \|P_k\phi\|_{X[I]}, \tag{182}\\
\|P_k\Phi^{I,\eta}\|_E &\lesssim \|P_k\phi\|_{E[I]}, \tag{183}\\
\|P_k\Box\Phi^{I,\eta}\cdot\psi_j\|_N &\lesssim 2^{k-j}\|P_k\phi\|_{E[I]}\|\psi_j\|_S + \|P_k\Box\phi\cdot\psi_j\|_{N[I]}, \tag{184}
\end{align}
where the last bound holds under the additional condition that $j>k+10$.
• Secondary extension. For every $0<\eta\leq 1$ there exists an extension $\tilde\Phi^{I,\eta}$ which is compactly supported in time and such that (179), (180) hold. Furthermore, for this extension the following improvement of (181) is valid:
\[ \|P_k\tilde\Phi^{I,\eta}\|_{L^\infty_t(L^\infty_x)} \lesssim \eta^{-\frac12}\|P_k\phi\|_{L^\infty_t(L^\infty_x)[I]} + \eta\|P_k\phi\|_{E[I]}. \tag{185} \]

The canonical extension above will be used most of the time. Its only disadvantage is that in order to control the $L^\infty_t(L^\infty_x)$ norm of this extension we need to also control the $X$ norm. In the rare (single) case where this is missing, we use the secondary extension.
Proof of Proposition 5.5. The canonical extension will be defined dyadically for each $\phi_k$. By rescaling we only work with $k=0$.

Step 1 (The canonical extension and estimates (179)–(180), (182)–(184)). The obvious candidate $\Phi^I$ for the extension is obtained by solving the homogeneous wave equation to the left of $-i_0$ and to the right of $i_0$, with Cauchy data $P_0\phi[-i_0]$, respectively $P_0\phi[i_0]$. Denoting the complement of $I$ by $I^-\cup I^+$, we have:
\[ \Box\Phi^I = 0, \qquad \Phi^I[\pm i_0] = P_0\phi[\pm i_0], \qquad \text{in } I^\pm. \]
It is relatively easy to verify that the extension $\Phi^I$ satisfies all the properties (179)–(180), (182)–(184). However, there is a core issue with (181), as this bound can easily fail because nonconcentration at time $\pm i_0$, say, does not guarantee nonconcentration at all later times. To avoid this problem, we truncate $\Phi^I$ outside a compact set and define:
\[ \Phi^{I,\eta} = \chi_I^\eta\,\Phi^I, \]
where $\chi_I^\eta$ is a smooth cutoff with $|\partial_t^k\chi_I^\eta|\lesssim\eta^k$, such that $\chi_I^\eta\equiv 1$ on $I$ and vanishing outside of the extended interval $[-i_0-\eta^{-1},\,i_0+\eta^{-1}]$. Furthermore, in $I^\pm$ we have the identity:
\[ \Box\Phi^{I,\eta} = 2\partial_t(\chi_I^\eta)\cdot\partial_t\Phi^I + \partial_t^2(\chi_I^\eta)\cdot\Phi^I. \]
This allows us to estimate:
\[ \|P_0\Box\Phi^{I,\eta}\|_{L^1_t(L^2_x)[I^\pm]} \lesssim \|P_0\phi[\pm i_0]\|_{\dot H^1\times L^2}, \tag{186} \]
which in turn leads to:
\[ \|P_0\Phi^{I,\eta}\|_{S[I^\pm]} + \|P_0\Box\Phi^{I,\eta}\|_{N[I^\pm]} \lesssim \|P_0\phi[\pm i_0]\|_{\dot H^1\times L^2}. \]
Then the bound (179) follows from (15), while (183) follows from energy estimates for $\Phi^{I,\eta}$ in $I^\pm$. The bound (180) is also straightforward, while for (182) we need to compute:
\[ \|P_0\Box\Phi^{I,\eta}\|_{L^2_t(L^2_x)[I^\pm]} \lesssim \eta^{\frac12}\|P_0\phi[\pm i_0]\|_{\dot H^1\times L^2}. \]
To prove (184) we use Bernstein to estimate:
\begin{align*}
\|P_0\Box\Phi^{I,\eta}\cdot\psi_j\|_{L^1_t(L^2_x)[I^\pm]}
 &\lesssim \|P_0\Box\Phi^{I,\eta}\|_{L^1_t(L^\infty_x)[I^\pm]}\cdot\|\psi_j\|_{L^\infty_t(L^2_x)[I^\pm]} \\
 &\lesssim \|P_0\Box\Phi^{I,\eta}\|_{L^1_t(L^2_x)[I^\pm]}\cdot 2^{-j}\|P_0\psi_j\|_S,
\end{align*}
and conclude with (186).

Step 2 (The $L^\infty_t(L^\infty_x)$ estimate (181)). We now turn our attention to the most interesting part, namely (181). The desired bound follows from a reverse dispersive estimate for the 2D wave equation:
\[ \|e^{\pm it|D_x|}P_0f\|_{L^\infty_x} \lesssim \sqrt{1+t}\,\|P_0f\|_{L^\infty_x}, \]
provided that we can first establish the "elliptic" estimate (setting $P_0\phi=\phi_0$):
\[ \|\partial_t\phi_0\|_{L^\infty_t(L^\infty_x)[I]} \lesssim \eta^{-2}\|\phi_0\|_{L^\infty_t(L^\infty_x)[I]} + \eta\|\phi_0\|_{X[I]}, \tag{187} \]
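provided that $|I|\geq\eta^2$. Without loss of generality we may assume we are in the worst case scenario $|I|=\eta^2$. We begin with the Poincaré type inequality:
\[ \|\partial_t\phi_0 - (\partial_t\phi_0)_{av}\|_{L^\infty_t[I]} \lesssim \eta\,\|\partial_t^2\phi_0\|_{L^2_t[I]}, \]
where $(\partial_t\phi_0)_{av} = \eta^{-2}\int_I\partial_t\phi_0\,dt$, so in particular:
\[ \|(\partial_t\phi_0)_{av}\|_{L^\infty_x[I]} \lesssim \eta^{-2}\|\phi_0\|_{L^\infty_t(L^\infty_x)[I]}. \]
Therefore, taking the sup over all space in the second to last line above we have:
\[ \|\partial_t\phi_0\|_{L^\infty_t(L^\infty_x)[I]} \lesssim \eta^{-2}\|\phi_0\|_{L^\infty_t(L^\infty_x)[I]} + \eta\|\Delta\phi_0\|_{L^2_t(L^\infty_x)[I]} + \eta\|\Box\phi_0\|_{L^2_t(L^\infty_x)[I]}. \]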
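The Poincaré type inequality and the bound on the average used in this step are elementary one-dimensional facts. The following short derivation is only a sketch for a fixed spatial point $x$, writing $f(t)=\partial_t\phi_0(t,x)$ and $I=[a,b]$ with $|I|=\eta^2$; the names $f$, $a$, $b$ are introduced here for illustration and do not appear in the paper.

```latex
\begin{align*}
 f(t) - f_{av} &= \frac{1}{|I|}\int_I \big(f(t) - f(s)\big)\,ds
   = \frac{1}{|I|}\int_I \int_s^t f'(\tau)\,d\tau\,ds, \\
 |f(t) - f_{av}| &\le \int_I |f'(\tau)|\,d\tau
   \le |I|^{1/2}\,\|f'\|_{L^2_t[I]}
   = \eta\,\|\partial_t^2\phi_0(\cdot,x)\|_{L^2_t[I]}, \\
 |f_{av}| &= \eta^{-2}\,\big|\phi_0(b,x) - \phi_0(a,x)\big|
   \le 2\,\eta^{-2}\,\|\phi_0(\cdot,x)\|_{L^\infty_t[I]}.
\end{align*}
```

The factor $\eta$ in the first bound is exactly $|I|^{1/2}$, which is why the normalization $|I|=\eta^2$ is the natural one for (187).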
The proof of (187) is now concluded via Bernstein in space for the second term on the RHS above, and Cauchy-Schwarz in time along with the fact that $P_0$ is bounded on $L^\infty_x$ to control the third.

Step 3 (The Secondary Extension). We next turn our attention to the second extension. Again we set $k=0$. The additional difficulty we face here is that we no longer have pointwise bounds for $\partial_tP_0\phi(\pm i_0)$. We split the function $\Phi^I$ outside $I$ into two parts:
\[ \Phi^I = \Phi^I_0 + \Phi^I_1, \]
corresponding to the two different components of its Cauchy data at $\pm i_0$:
\[ \Box\Phi^I_0 = 0, \qquad \Phi^I_0[\pm i_0] = (P_0\phi(\pm i_0),\,0) \qquad \text{in } I^\pm, \]
respectively:
\[ \Box\Phi^I_1 = 0, \qquad \Phi^I_1[\pm i_0] = (0,\,\partial_tP_0\phi(\pm i_0)) \qquad \text{in } I^\pm. \]
Then we define the extension $\tilde\Phi^{I,\eta}$ by truncating the two components on different scales:
\[ \tilde\Phi^{I,\eta} = \chi_I^{\eta}\,\Phi^I_0 + \chi_I^{1/\eta}\,\Phi^I_1. \]
For the first component we argue as before. For the second, we begin with a fixed time $L^2$ bound:
\[ \|P_0\Phi^I_1(t)\|_{L^2_x} \lesssim |t\mp i_0|\cdot\|\partial_tP_0\phi(\pm i_0)\|_{L^2_x} \qquad \text{in } I^\pm, \tag{188} \]
which follows at once from integrating the quantity $\partial_tP_0\Phi^I_1$ and energy estimates. This leads to:
\[ \|P_0\Box(\chi_I^{1/\eta}\Phi^I_1)\|_{L^1_t(L^2_x)[I^\pm]} \lesssim \|\partial_tP_0\phi(\pm i_0)\|_{L^2_x}, \]
which helps us establish bounds of the type (179)–(180). On the other hand, the improved pointwise bound (185) follows simply by using Bernstein's inequality in (188) to give:
\[ \|P_0\Phi^I_1(t)\|_{L^\infty_x} \lesssim |t\mp i_0|\cdot\|\partial_tP_0\phi(\pm i_0)\|_{L^2_x} \qquad \text{in } I^\pm. \]
The proof of the proposition is thus concluded.
5.4. Strichartz and Wolff type bounds. In this section we prove the estimate (18) for the $S$ component on line (148), as well as a key $L^2$ bilinear estimate for transverse waves which takes advantage of the small energy dispersion. The tools we use for these purposes are the $V^p_{\pm|D_x|}$ and $U^p_{\pm|D_x|}$ spaces associated to the two half-wave evolutions. Precisely, $V^p_{\pm|D_x|}$ is the space of right continuous $L^2_x$ valued functions with bounded $p$-variation along the half-wave flow:
\[ \|u\|_{V^p_{\pm|D_x|}} = \|e^{\mp it|D_x|}u(t)\|_{V^p(L^2_x)}, \]
or in expanded form:
\[ \|u\|^p_{V^p_{\pm|D_x|}} := \|u\|^p_{L^\infty_t(L^2_x)} + \sup_{t_k}\sum_{k\in\mathbb Z}\|u(t_{k+1}) - e^{\pm i(t_{k+1}-t_k)|D_x|}u(t_k)\|^p_{L^2_x}, \]
where the supremum is taken over all increasing sequences $t_k$. We note that if $p<\infty$ then $V^p$ functions can have at most countably many discontinuities as $L^2_x$ valued functions.

On the other hand the slightly smaller space $U^p_{\pm|D_x|}$ is defined as the atomic space generated by a family $A_p$ of atoms $a$ which have the form:
\[ a(t) = e^{\pm it|D_x|}\sum_k 1_{[t_k,t_{k+1})}u^{(k)}, \]
where the sequence $t_k$ is increasing and:
\[ \sum_k \|u^{(k)}\|^p_{L^2_x} \leq 1. \]
Precisely, we have:
\[ U^p_{\pm|D_x|} = \Big\{ u = \sum_k c_ka_k;\ \sum_k|c_k|<\infty,\ a_k\in A_p \Big\}. \]
The above sum converges uniformly in $L^2_x$; it also converges in the stronger $V^p_{\pm|D_x|}$ topology. The $U^p_{\pm|D_x|}$ norm is defined by:
\[ \|u\|_{U^p_{\pm|D_x|}} := \inf\Big\{ \sum_k|c_k|;\ u=\sum_k c_ka_k,\ a_k\in A_p \Big\}. \]
These spaces are related as follows:
\[ U^p_{\pm|D_x|} \subset V^p_{\pm|D_x|} \subset U^q_{\pm|D_x|}, \qquad 1\leq p<q\leq\infty. \tag{189} \]
The first inclusion is straightforward. The second is not, and plays a role similar to the Christ-Kiselev lemma. These spaces were first introduced in unpublished work of the second author, and have proved their usefulness as scale invariant substitutes of $X^{s,\frac12}$ type spaces in several problems, see [2,5,11,12].

We use these spaces first in the context of the Strichartz estimates, which for frequency localized homogeneous half-waves can be expressed in the form:
\[ \|e^{\pm it|D_x|}u_k\|_{L^q_t(L^r_x)} \lesssim 2^{-(\frac1q+\frac2r-1)k}\|u_k\|_{L^2_x}, \qquad \frac2q+\frac1r\leq\frac12. \]
Applying this bound for each segment in $U^q_{\pm|D_x|}$ atoms, one directly obtains embeddings of these spaces into Strichartz spaces:
Lemma 5.6. The following estimates hold:
\[ \|\phi_k\|_{L^q_t(L^r_x)} \lesssim 2^{-(\frac1q+\frac2r-1)k}\|\phi_k\|_{U^q_{\pm|D_x|}}, \qquad \frac2q+\frac1r\leq\frac12. \tag{190} \]
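To make the $p$-variation in the definition of $V^p_{\pm|D_x|}$ concrete in the simplest possible setting: after pulling back by the half-wave flow, the second term of the norm is the ordinary $p$-variation of a curve, and for finitely many scalar samples the supremum over increasing subsequences can be computed by dynamic programming. The sketch below is only an illustration under these simplifying assumptions (scalar values instead of $L^2_x$ valued functions, a finite sample, and the $L^\infty_t$ term omitted); the function name `p_variation` is not from the paper.

```python
def p_variation(samples, p):
    """p-variation of a finite sequence.

    Returns (sup over increasing index subsequences of
    sum |x_next - x_prev|^p) ** (1/p), computed by dynamic programming.
    """
    n = len(samples)
    # best[j]: largest sum of p-th powers over subsequences ending at index j
    best = [0.0] * n
    for j in range(n):
        for i in range(j):
            step = abs(samples[j] - samples[i]) ** p
            best[j] = max(best[j], best[i] + step)
    return max(best, default=0.0) ** (1.0 / p)

print(p_variation([0, 1, 2, 3], 1))  # total variation of a monotone sequence
print(p_variation([0, 1, 2], 2))     # skipping the midpoint is optimal for p = 2
```

For $p=1$ this is the total variation and every sample point helps; for $p>1$ a coarser partition can give a larger sum, which is why the definition takes a supremum over all increasing sequences $t_k$ rather than a limit over fine partitions.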
The second place where these spaces come into play is in the context of bilinear $L^2_t(L^2_x)$ estimates for transversal waves. The classical estimate here (see for instance [10]) has the form:
\[ \|e^{\pm it|D_x|}u^{(1)}_{k_1}\cdot e^{\pm it|D_x|}u^{(2)}_{k_2}\|_{L^2_t(L^2_x)} \lesssim 2^{\frac12\min\{k_1,k_2\}}\theta^{-\frac12}\|u^{(1)}_{k_1}\|_{L^2_x}\|u^{(2)}_{k_2}\|_{L^2_x}, \tag{191} \]
provided the $u^{(i)}_{k_i}$ have angular separation in frequency, namely $|\theta_1-\theta_2|>\theta$ in the $(++)$ or $(--)$ cases, and $|\pi+(\theta_1-\theta_2)|>\theta$ in the $(+-)$ or $(-+)$ cases. In subsequent work, Wolff [34] was able to replace the $L^2_t(L^2_x)$ bound on the left with $L^p_t(L^p_x)$ for $p>\frac53$. The endpoint $p=\frac53$ was later obtained by Tao [27]. Our aim here is to first use the Wolff-Tao estimate to strengthen the classical $L^2_t(L^2_x)$ bound in a way which takes advantage of the small energy dispersion, and then phrase it in the set-up of the $V^2_{\pm|D_x|}$ spaces:

Lemma 5.7. Let $\phi^{(i)}_{k_i}\in V^2_{\pm|D_x|}$ be two test functions which have angular separation in frequency, namely $|\theta_1-\theta_2|>\theta$ in the $(++)$ or $(--)$ cases, and also $|\pi+(\theta_1-\theta_2)|>\theta$ in the $(+-)$ or $(-+)$ cases. Then for $c<\frac3{29}$ we have:
\[ \|\phi^{(1)}_{k_1}\phi^{(2)}_{k_2}\|_{L^2_t(L^2_x)} \lesssim 2^{\frac12\max\{k_i\}}\theta^{-1}\|\phi^{(1)}_{k_1}\|_{V^2_{\pm|D_x|}}\|\phi^{(2)}_{k_2}\|^{1-c}_{V^2_{\pm|D_x|}}\Big(2^{-k_2}\|\phi^{(2)}_{k_2}\|_{L^\infty_t(L^\infty_x)}\Big)^c. \tag{192} \]
We remark that we did not make an effort to optimize $c$, the balance of the frequencies, or the power of $\theta$, as these play no role in the present paper.
Proof. Without loss of generality, let us assume we are in the $(++)$ case. If both $\phi^{(i)}_{k_i}$ were free waves, then Wolff's estimate (with Tao's endpoint) would yield (see [27] Prop. 17.2):
\[ \|\phi^{(1)}_{k_1}\phi^{(2)}_{k_2}\|_{L^{\frac53}_t(L^{\frac53}_x)} \lesssim 2^{\frac15\max\{k_i\}}\theta^{-1}\|\phi^{(1)}_{k_1}(0)\|_{L^2_x}\|\phi^{(2)}_{k_2}(0)\|_{L^2_x}. \]
Applying this for each intersection of two segments in a product of atoms, we obtain:
\[ \|\phi^{(1)}_{k_1}\phi^{(2)}_{k_2}\|_{L^{\frac53}_t(L^{\frac53}_x)} \lesssim 2^{\frac15\max\{k_i\}}\theta^{-1}\|\phi^{(1)}_{k_1}\|_{U^{\frac53}_{|D_x|}}\|\phi^{(2)}_{k_2}\|_{U^{\frac53}_{|D_x|}}. \tag{193} \]
On the other hand by (190) with $(q,r)=(6,6)$ we have:
\[ \|\phi^{(1)}_{k_1}\phi^{(2)}_{k_2}\|_{L^3_t(L^3_x)} \lesssim 2^{\frac{k_1+k_2}2}\|\phi^{(1)}_{k_1}\|_{U^6_{|D_x|}}\|\phi^{(2)}_{k_2}\|_{U^6_{|D_x|}}. \tag{194} \]
Interpolating (193) with (194) (it is bilinear interpolation but it suffices to do it for atoms, so it only involves $l^p$ and $L^p$ spaces) we obtain:
\[ \|\phi^{(1)}_{k_1}\phi^{(2)}_{k_2}\|_{L^p_t(L^p_x)} \lesssim 2^{(2-\frac3p)\max\{k_i\}-\frac3{26}|k_1-k_2|}\theta^{-1}\|\phi^{(1)}_{k_1}\|_{U^2_{|D_x|}}\|\phi^{(2)}_{k_2}\|_{U^2_{|D_x|}}, \qquad p=\frac{13}7. \]
We want $V^2_{|D_x|}$ instead, so we use the embedding (189) with $U^{2+}_{|D_x|}$ and $L^{\frac{13}7+}_t(L^{\frac{13}7+}_x)$ in this last estimate, which gives the bound:
\[ \|\phi^{(1)}_{k_1}\phi^{(2)}_{k_2}\|_{L^p_t(L^p_x)} \lesssim 2^{(2-\frac3p)\max\{k_i\}-\frac3{26}|k_1-k_2|}\theta^{-1}\|\phi^{(1)}_{k_1}\|_{V^2_{|D_x|}}\|\phi^{(2)}_{k_2}\|_{V^2_{|D_x|}}, \qquad p>\frac{13}7. \]
On the other hand by using an $L^\infty_t(L^\infty_x)$ bound we get:
\[ \|\phi^{(1)}_{k_1}\phi^{(2)}_{k_2}\|_{L^6_t(L^6_x)} \lesssim 2^{\frac12k_1}\|\phi^{(1)}_{k_1}\|_{V^2_{|D_x|}}\|\phi^{(2)}_{k_2}\|_{L^\infty_t(L^\infty_x)}. \]
Then (192) is obtained interpolating the last two lines.
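The exponents in the two interpolation steps of this proof can be checked with exact rational arithmetic. The sketch below assumes the readings $L^{5/3}$, $U^{5/3}$, exponent $\frac15\max\{k_i\}$ for (193) and $L^3$, $U^6$, exponent $\frac12(k_1+k_2)=\max\{k_i\}-\frac12|k_1-k_2|$ for (194); the variable names and the helper `interpolate` are illustrative only.

```python
from fractions import Fraction as F

def interpolate(a, b, weight):
    """Convex combination weight*a + (1 - weight)*b of two exponents."""
    return weight * a + (1 - weight) * b

# Step 1: interpolate (193) with (194), putting weight 10/13 on (193).
weight = F(10, 13)
inv_p = interpolate(F(3, 5), F(1, 3), weight)    # reciprocal of the L^p index
inv_u = interpolate(F(3, 5), F(1, 6), weight)    # reciprocal of the U index
max_exp = interpolate(F(1, 5), F(1), weight)     # coefficient of max{k_i}
gap_exp = interpolate(F(0), F(1, 2), weight)     # coefficient of -|k_1 - k_2|

# Step 2: interpolate L^{13/7} with L^6 to reach L^2:
#   1/2 = (1 - c) * 7/13 + c * 1/6, solved for c.
c = (F(7, 13) - F(1, 2)) / (F(7, 13) - F(1, 6))

print(inv_p, inv_u, max_exp, gap_exp, c)
```

The endpoint value of $c$ obtained in the second step corresponds to $p=\frac{13}7$ exactly; since the $V^2$ bound is only available for $p>\frac{13}7$, the lemma is stated with a strict inequality for $c$.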
In this article we work with the $S$ and $N$ spaces. The next lemma relates them to the $V^2_{\pm|D_x|}$ spaces.

Lemma 5.8. Let $\phi_k[0]\in\dot H^1\times L^2$ and $F_k\in N$. Then the solution $\phi_k$ to $\Box\phi_k=F_k$ with initial data $\phi_k[0]$ satisfies:
\[ 2^{\frac12k}\|Q_{\geq k-10}\nabla_{t,x}\phi_k\|_{L^2_t(L^2_x)} + \sum_\pm\|Q^\pm_{<k-10}\nabla_{t,x}\phi_k\|_{V^2_{\pm|D_x|}} \lesssim \|\phi_k[0]\|_{\dot H^1\times L^2} + \|F_k\|_N. \tag{195} \]
Proof. By rescaling we may assume that $k=0$, and we'll relabel $\phi_k$ and $F_k$ as $\phi$, $F$ with the implicit understanding that they are both at unit frequency. The estimate for $Q_{\geq-10}\nabla_{t,x}\phi$ is immediate from the structure of $S$ and the estimate (18) (note that this was shown in [29] for all portions of the norm (147) except $S$). The linear wave evolution in the energy space $\dot H^1\times L^2$ is given by the multiplier:
\[ S(t) = \begin{pmatrix} \cos(t|D_x|) & |D_x|^{-1}\sin(t|D_x|) \\ -|D_x|\sin(t|D_x|) & \cos(t|D_x|) \end{pmatrix}. \]
For any increasing sequence $t_j$ we can use the energy component of (18) (again established in [29]) and (159) to estimate:
\[ \sum_j \|\phi[t_{j+1}] - S(t_{j+1}-t_j)\phi[t_j]\|^2_{\dot H^1\times L^2} \lesssim \sum_j \|1_{[t_j,t_{j+1}]}F\|^2_N \lesssim \|F\|^2_N. \]
Diagonalizing, one may write the $L^2\times L^2$ evolution as:
\[ \begin{pmatrix} |D_x| & 0 \\ 0 & 1 \end{pmatrix} S(t)\phi[t_0] = U^*\begin{pmatrix} e^{it|D_x|} & 0 \\ 0 & e^{-it|D_x|} \end{pmatrix} U \begin{pmatrix} |D_x|\phi(t_0) \\ \partial_t\phi(t_0) \end{pmatrix}, \]
where $U = \frac1{\sqrt2}\begin{pmatrix} 1 & -i \\ 1 & i \end{pmatrix}$. Thus, the LHS of the previous difference formula may be rotated via $U$ to yield:
\begin{align*}
\|\phi[t] - S(t-s)\phi[s]\|^2_{\dot H^1\times L^2}
 ={}& \tfrac12\|(\partial_t+i|D_x|)\phi(t) - e^{i(t-s)|D_x|}(\partial_t+i|D_x|)\phi(s)\|^2_{L^2} \\
 &+ \tfrac12\|(\partial_t-i|D_x|)\phi(t) - e^{-i(t-s)|D_x|}(\partial_t-i|D_x|)\phi(s)\|^2_{L^2}.
\end{align*}
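The diagonalization can be checked at the level of a single frequency, where $|D_x|$ acts as multiplication by a number $\lambda=|\xi|>0$ and the conjugated propagator $\mathrm{diag}(|D_x|,1)\,S(t)\,\mathrm{diag}(|D_x|^{-1},1)$ becomes a $2\times2$ rotation matrix. The sketch below assumes the reading $U=\frac1{\sqrt2}\begin{pmatrix}1&-i\\1&i\end{pmatrix}$ and verifies numerically that $U^*\,\mathrm{diag}(e^{it\lambda},e^{-it\lambda})\,U$ reproduces that rotation; the helper names are illustrative.

```python
import cmath
import math

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def conj_transpose(A):
    """Conjugate transpose of a 2x2 matrix."""
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

def max_entry_gap(t, lam):
    """Largest entrywise gap between U* D U and the conjugated propagator."""
    s = 1 / math.sqrt(2)
    U = [[s, -1j * s], [s, 1j * s]]
    D = [[cmath.exp(1j * t * lam), 0j], [0j, cmath.exp(-1j * t * lam)]]
    # diag(lam, 1) S(t) diag(1/lam, 1) at a single frequency lam
    R = [[math.cos(t * lam), math.sin(t * lam)],
         [-math.sin(t * lam), math.cos(t * lam)]]
    M = matmul(conj_transpose(U), matmul(D, U))
    return max(abs(M[i][j] - R[i][j]) for i in range(2) for j in range(2))

print(max_entry_gap(0.7, 3.0))
```

Since $U$ is unitary, applying it to $(|D_x|\phi,\partial_t\phi)$ preserves the $L^2\times L^2$ norm, and its two components are unimodular multiples of $(\partial_t\pm i|D_x|)\phi$, which is the content of the difference formula.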
Hence taking the supremum over all increasing sequences $t_k$ we obtain the pair of bounds:
\[ \|(\partial_t+i|D_x|)\phi\|^2_{V^2_{|D_x|}} + \|(\partial_t-i|D_x|)\phi\|^2_{V^2_{-|D_x|}} \lesssim \|\phi[0]\|^2_{\dot H^1\times L^2} + \|F\|^2_N. \]
We conclude (195) by noting that one has the following "elliptic" estimate:
\[ \|\nabla_{t,x}Q^\pm_{<-10}P_0\phi\|_Y \lesssim \|(\partial_t\pm i|D_x|)\phi\|_Y, \]
for any translation invariant space-time norm $Y$, which is valid because the convolution kernel of the frequency localized ratio $\nabla_{t,x}(\partial_t\pm i|D_x|)^{-1}Q^\pm_{<-10}P_0$ is in $L^1_t(L^1_x)$.

As a quick application of these ideas, notice that if $(q,r)$ is any pair of indices in the range of (148), we must have $q\geq4$. Hence from (190) and (195), and some Sobolev embeddings interpolated with the $L^\infty_t(L^2_x)$ estimate for $\nabla_{t,x}\phi$ to control the first member on the LHS of (195), we obtain:

Corollary 5.9. Let $\phi[0]\in\dot H^1\times L^2$ and $F\in N$. Then the solution $\phi$ to $\Box\phi=F$ with initial data $\phi[0]$ satisfies
\[ \|\phi\|_S \lesssim \|\phi[0]\|_{\dot H^1\times L^2} + \|F\|_N. \tag{196} \]
This proves (18), therefore completing the linear theory in the $S$ and $N$ spaces, as needed in view of our modification of Tao's [29] definition of the $S$ space, namely by adding the $S$ norm to it. In a similar manner, we can combine the bounds (192) and (195) to obtain:

Lemma 5.10. Let $\phi^{(i)}_{k_i}$ be two test functions normalized so that:
\[ \|\phi^{(i)}_{k_i}\|_S + \|\Box\phi^{(i)}_{k_i}\|_N \leq 1, \quad i=1,2, \qquad \|\phi^{(2)}_{k_2}\|_{L^\infty_t(L^\infty_x)} \leq \eta. \]
Assume in addition that the localizations $Q^\pm_{<k_i-10}\phi^{(i)}_{k_i}$ have angular separation in frequency, namely $|\theta_1-\theta_2|>\theta$ in the $(++)$ or $(--)$ cases, and $|\pi+(\theta_1-\theta_2)|>\theta$ in the $(+-)$ or $(-+)$ cases. Then for $c<\frac3{29}$ one has:
\[ \|\phi^{(1)}_{k_1}\phi^{(2)}_{k_2}\|_{L^2_t(L^2_x)} \lesssim 2^{-\frac32\min\{k_i\}}\eta^c\theta^{-1}. \tag{197} \]
Proof. By an application of Lemma 5.7, we need only consider the case where one factor is at high modulation, i.e. a factor of $Q_{\geq k_i-10}\phi^{(i)}_{k_i}$. In this case, if the other factor has the improved $L^\infty_t(L^\infty_x)$ bound, estimate (197) is immediate on account of the $L^2_t(L^2_x)$ estimate on the LHS of (195). On the other hand, if the factor at high modulation is also the one with improved $L^\infty_t(L^\infty_x)$ control, then using $L^6_t(L^6_x)$ Strichartz for the first factor we have:
\[ \|Q_{\geq k_2-10}\phi^{(2)}_{k_2}\,\phi^{(1)}_{k_1}\|_{L^{\frac32}_t(L^{\frac32}_x)} \lesssim 2^{-\frac12k_1}2^{-\frac32k_2}. \]
Interpolating this with the pointwise bound $|\phi^{(1)}_{k_1}Q_{\geq k_2-10}\phi^{(2)}_{k_2}|\lesssim\eta$ we again have (197).
6. Bilinear Null Form Estimates

In this section we prove the estimates (23), (44), and (46). The first of these is essentially standard, being implicitly contained in the calculations of [29]. We provide the proof here for the sake of completeness:

Proof of estimate (23). We begin with the estimate:
\[ \|P_kQ_{<j}F\|_{L^2_t(L^2_x)} \lesssim 2^{\frac j2}\|F\|_N. \tag{198} \]
To see this, notice that if $\Box_0^{-1}$ inverts the wave equation with zero Cauchy data, we immediately have from (18) the inequality:
\[ \|P_kQ_j\Box_0^{-1}F\|_{X_\infty^{1,\frac12}} \lesssim \|F\|_N, \]
which implies the fixed frequency estimate:
\[ \|P_kQ_jF\|_{L^2_t(L^2_x)} \lesssim 2^{\frac j2}\|F\|_N. \]
Summing this last line over all modulations less than $j$, (198) is achieved. We now split the proof of estimate (23) into two cases:

Step 1 (Low × High interaction). In this case we assume that $k_1<k_2-10$. The case $k_2<k_1-10$ can be handled via a similar argument. For relatively low modulations we have from estimates (24) and (198):
k1
(2)
(1)
(2)
Q
x
Therefore, it suffices to look at output modulations larger than k1 +10. In this case we split (1) (1) (1) the modulations of the low frequency term according to φk1 = Q k1 ∂α φk(2) L 2 (L 2 ) 1 2 1 2 t
t
x
(1)
x
(2)
∞ Q >k1 ∇t,x φ ∇t,x φk1 L ∞ k2 L 2 (L 2 ) t (L x ) t
2
k1 2
(1) (2) ∇t,x φk1 L ∞ 2 ∇t,x φk t (L x ) 2
0, 1
X ∞2
x
,
which suffices. For the high modulations of the first factor in the previous decomposition, we estimate:
\[ \|Q_{\geq k_1+10}(Q_{>k_1}\partial^\alpha\phi^{(1)}_{k_1}\cdot\partial_\alpha\phi^{(2)}_{k_2})\|_{L^2_t(L^2_x)} \lesssim \|Q_{>k_1}\nabla_{t,x}\phi^{(1)}_{k_1}\|_{L^2_t(L^\infty_x)}\|\nabla_{t,x}\phi^{(2)}_{k_2}\|_{L^\infty_t(L^2_x)}. \]
We then conclude using (164) for the first factor.

Step 2 (High × High interaction). In this case we consider the frequency interaction $|k_1-k_2|<5$, and without loss of generality we may also assume that $k_1\geq k_2$. By using estimates (198) and (24), we may reduce to considering the case of output modulation larger than $k+\delta(k_1-k)+10$, where $\delta$ is from the RHS of (24) (this ultimately forces a harmless redefinition of $\delta$ to suit line (23)). For this remaining piece, we will show that:
\[ \|P_kQ_{\geq k+\delta(k_1-k)+10}(\partial^\alpha\phi^{(1)}_{k_1}\cdot\partial_\alpha\phi^{(2)}_{k_2})\|_{L^2_t(L^2_x)} \lesssim 2^{\frac k2}2^{-\frac12\delta(k_1-k)}\|\phi^{(1)}_{k_1}\|_S\cdot\|\phi^{(2)}_{k_2}\|_S. \]
The key observation here is that the output modulation combined with the output spatial frequency localization guarantees that at least one term in the product is at modulation greater than $k+\delta(k_1-k)-20$. Without loss of generality we may assume this is the first term in the product, and we estimate via Bernstein:
\begin{align*}
\|P_kQ_{\geq k+\delta(k_1-k)+10}&(Q_{\geq k+\delta(k_1-k)-20}\partial^\alpha\phi^{(1)}_{k_1}\cdot\partial_\alpha\phi^{(2)}_{k_2})\|_{L^2_t(L^2_x)} \\
 &\lesssim 2^k\sum_{j>k+\delta(k_1-k)-20}\|Q_j\nabla_{t,x}\phi^{(1)}_{k_1}\|_{L^2_t(L^2_x)}\cdot\|\nabla_{t,x}\phi^{(2)}_{k_2}\|_{L^\infty_t(L^2_x)} \\
 &\lesssim 2^{\frac k2}2^{-\frac12\delta(k_1-k)}\|\nabla_{t,x}\phi^{(1)}_{k_1}\|_{X_\infty^{0,\frac12}}\cdot\|\nabla_{t,x}\phi^{(2)}_{k_2}\|_{L^\infty_t(L^2_x)}.
\end{align*}
This concludes our demonstration of (23).
Our next step is to prepare for the proof of Proposition 3.4. It will first be useful to have a version of these estimates under simpler assumptions:

Lemma 6.1. a) Let $\phi^{(i)}_{k_i}$ be functions localized at frequency $k_i$. Assume that these functions are normalized as follows:
\[ \|\phi^{(i)}_{k_i}\|_{S[I]} + \|\Box\phi^{(i)}_{k_i}\|_{N[I]} \leq 1, \qquad \|\phi^{(1)}_{k_1}\|_{L^\infty_t(L^\infty_x)[I]} \leq \eta. \tag{199} \]
Then the following bilinear $L^2$ estimate holds:
\[ \|\partial^\alpha\phi^{(1)}_{k_1}\partial_\alpha\phi^{(2)}_{k_2}\|_{L^2_t(L^2_x)[I]} \lesssim 2^{\frac12\max\{k_1,k_2\}}\eta^\delta. \tag{200} \]
b) Assume that in addition to (199) we also have the high modulation bounds:
\[ \|\Box\phi^{(1)}_{k_1}\|_{L^2_t(L^2_x)[I]} \leq 2^{\frac{k_1}2}\eta, \qquad \|\Box\phi^{(2)}_{k_2}\|_{L^2_t(L^2_x)[I]} \leq 2^{\frac{k_2}2}\eta. \tag{201} \]
Then the following estimate holds:
\[ \|\partial^\alpha\phi^{(1)}_{k_1}\partial_\alpha\phi^{(2)}_{k_2}\|_{N[I]} \lesssim 2^{C|k_1-k_2|}\eta^\delta. \tag{202} \]
Proof. We may assume that the interval length is such that $|I|\geq2^{-\min\{k_i\}}\eta^{2\delta}$, as otherwise the desired bounds follow from integrating energy estimates.

We begin by taking extensions of $\phi^{(1)}_{k_1}$ and $\phi^{(2)}_{k_2}$ according to Proposition 5.5 in such a way that the $L^\infty_t(L^\infty_x)$ bound in (199) is preserved; in the case of part (b), we also ensure that (201) is preserved. This is achieved using the second extension in Proposition 5.5 in case (a), respectively the canonical extension in Proposition 5.5 in case (b). Doing this requires balancing the parameter $\eta$ in Proposition 5.5, and has the effect of replacing the $\eta$ in both (199) and (201) with a small power of $\eta$ ($\eta^{\frac18}$ should suffice). This is harmless given the small constant $\delta$ which we seek to obtain in both (200) and (202).

We fix $m$ to be a large spatial frequency separation parameter. In the course of proving (200) and (202), we will decompose into several frequency ranges. In all cases we will show a bound of the form:
\[ \text{L.H.S.} \lesssim 2^{Cm}\eta^c + 2^{-\delta m}, \]
where $c,\delta$ are suitably small constants depending only on the estimates in Propositions 2.3, 5.4, and 5.10 above, and where $C$ is a suitable large constant. In what follows we call any bound of this type a "suitable bound". By choosing $m$ appropriately, and by (globally) redefining the small parameter $\delta$ one may produce the RHS of estimates (200) and (202) from such bounds.

Step 1 (The unbalanced case $|k_1-k_2|\geq m$). Here we neglect the pointwise bound in (199) as well as the high modulation bound in (201). From estimate (23) we immediately have that:
\[ \|\partial^\alpha\phi^{(1)}_{k_1}\partial_\alpha\phi^{(2)}_{k_2}\|_{L^2_t(L^2_x)} \lesssim 2^{\frac12\max\{k_1,k_2\}}2^{-\frac12m}, \qquad |k_1-k_2|\geq m, \]
which is a suitable $L^2$ bound. Similarly, from (24) we obtain a suitable $N$ bound. Hence, in what follows it suffices to consider the range $|k_1-k_2|<m$. For the remainder of the proof we let $k_2=0$. We split into cases depending on the modulations of the factors and the output.

Step 2 (The factor $\phi^{(2)}_{k_2}$ at high modulation). Here we first prove a suitable $L^2$ bound:
\[ \|\partial^\alpha\phi^{(1)}_{k_1}\,Q_{>-10m}\partial_\alpha\phi^{(2)}_0\|_{L^2_t(L^2_x)} \lesssim 2^{Cm}\eta + 2^{-m}. \tag{203} \]
For moderate modulations of the first factor, i.e. for $Q_{<10m}\phi^{(1)}_{k_1}$, we use (199) to place it in $L^\infty$:
\[ \|\partial^\alpha Q_{<10m}\phi^{(1)}_{k_1}\|_{L^\infty_t(L^\infty_x)} \lesssim 2^{Cm}\eta, \]
while the second factor is placed in $L^2$ via the general embedding:
\[ \|Q_{>j}\phi^{(i)}_{k_i}\|_{L^2_t(L^2_x)} \lesssim 2^{-\frac12j}\|\phi^{(i)}_{k_i}\|_{X_\infty^{0,\frac12}}. \tag{204} \]
For high modulations of the first factor, i.e. for $Q_{>10m}\phi^{(1)}_{k_1}$, we reverse the roles and bound the first factor in $L^2$:
\[ \|\partial^\alpha Q_{>10m}\phi^{(1)}_{k_1}\|_{L^2_t(L^2_x)} \lesssim 2^{-5m}\|\phi^{(1)}_{k_1}\|_S, \tag{205} \]
while the second factor has a $\lesssim1$ bound in $L^\infty$ thanks to (166), which leads again to a suitable bound.

In this case it is even easier to obtain the suitable $N$ bound because we have access to the high modulation assumption (201). We prove:
\[ \|\partial^\alpha\phi^{(1)}_{k_1}\,Q_{>-10m}\partial_\alpha\phi^{(2)}_0\|_N \lesssim 2^{Cm}\eta. \tag{206} \]
This follows from (24) combined with:
\[ \|Q_{>-10m}\phi^{(2)}_0\|_S \lesssim 2^{5m}\|\Box\phi^{(2)}_0\|_{L^2_t(L^2_x)} \lesssim 2^{5m}\eta, \]
where the first inequality follows from (162).

Step 3 (The factor $\phi^{(1)}_{k_1}$ at high modulation). Here we can also prove a suitable $L^2$ bound, namely:
\[ \|\partial^\alpha Q_{>-10m}\phi^{(1)}_{k_1}\,Q_{\leq-10m}\partial_\alpha\phi^{(2)}_0\|_{L^2_t(L^2_x)} \lesssim 2^{Cm}\eta^{\frac14} + 2^{-m}. \tag{207} \]
Reusing (205) we can dispense with the very high modulations in $\phi^{(1)}_{k_1}$ and replace the first factor with $\partial^\alpha Q_{[-10m,10m]}\phi^{(1)}_{k_1}$. This time we cannot directly use the $L^\infty_t(L^\infty_x)$ estimate for $\phi^{(1)}_{k_1}$. However, by applying (204) and using the $L^6_t(L^6_x)$ Strichartz estimate contained in (196) we have that:
\[ \|\partial^\alpha Q_{[-10m,10m]}\phi^{(1)}_{k_1}\,Q_{\leq-10m}\partial_\alpha\phi^{(2)}_0\|_{L^{\frac32}_t(L^{\frac32}_x)} \lesssim 2^{Cm}. \]
Next, using (199) and (166) we directly have:
\[ \|\partial^\alpha Q_{[-10m,10m]}\phi^{(1)}_{k_1}\,Q_{\leq-10m}\partial_\alpha\phi^{(2)}_0\|_{L^\infty_t(L^\infty_x)} \lesssim 2^{Cm}\eta. \]
Interpolating these last two estimates yields (207). It is important to notice that in the above estimates one loses a polynomial in $2^m$ because the multipliers $P_0Q_{\leq-10m}$ and $P_{k_1}Q_{[-10m,10m]}$ are not uniformly disposable on $L^p$. However, a short calculation shows that the resulting convolution kernels have $L^1_t(L^1_x)$ bounds on the order of $2^{Cm}$ which is acceptable.

As in the previous step we also have a suitable $N$ bound:
\[ \|\partial^\alpha Q_{>-10m}\phi^{(1)}_{k_1}\,\partial_\alpha Q_{\leq-10m}\phi^{(2)}_0\|_N \lesssim 2^{Cm}\eta. \tag{208} \]
Step 4 (Low frequency output). This is the case when $k_1=k_2+O(1)$, and we seek to estimate $P_k(\partial^\alpha\phi^{(1)}_{k_1}\partial_\alpha\phi^{(2)}_0)$ for $k<-m$. Then we can use (23), respectively (24) to obtain a $2^{-\delta m}$ suitable bound in $L^2$, respectively $N$. Here $\delta$ is the previously defined constant from Proposition 2.3.

Step 5A (Both $\phi^{(i)}_{k_i}$ at low modulation, output at low modulation $<-2m$ and high frequency $k>-m$). In this case, to show (200) we prove the bound:
\[ \|P_kQ_{<-2m}\big(\partial^\alpha Q_{\leq-10m}\phi^{(1)}_{k_1}\,\partial_\alpha Q_{\leq-10m}\phi^{(2)}_0\big)\|_{L^2_t(L^2_x)} \lesssim 2^{-\delta m}, \tag{209} \]
where the $\delta$ is the same as in Propositions 2.3 and 5.4. This estimate again uses only the $S$ bounds on $\phi^{(1)}_{k_1}$ and $\phi^{(2)}_{k_2}$ and the localization conditions $|k_1|\leq m$ and $k_2=0$. To show (209), by (198) it suffices to prove the following set of bounds which together also imply (202) in the present case:
\begin{align}
\|P_kQ_{<-2m}\Box\big(Q_{\leq-10m}\phi^{(1)}_{k_1}\cdot Q_{\leq-10m}\phi^{(2)}_0\big)\|_{X_1^{0,-\frac12}} &\lesssim m2^{-\delta m}, \tag{210}\\
\|P_kQ_{<-2m}\big(\Box Q_{\leq-10m}\phi^{(1)}_{k_1}\cdot Q_{\leq-10m}\phi^{(2)}_0\big)\|_N &\lesssim 2^{-\delta m}, \tag{211}\\
\|P_kQ_{<-2m}\big(Q_{\leq-10m}\phi^{(1)}_{k_1}\cdot\Box Q_{\leq-10m}\phi^{(2)}_0\big)\|_N &\lesssim 2^{-\delta m}. \tag{212}
\end{align}
The first estimate above follows from (169), while the second and third both follow from (170). Note that while the multiplier $Q_{<-2m}$ is not disposable on $N$ (e.g. on the NFA atoms), one may first replace it by $Q_{<0}$, and separately estimate the contribution of $Q_{[-2m,0]}$ as an $X_1^{0,-\frac12}$ atom via (163) at an $O(m)$ loss. A similar method using (162) allows one to handle the interior $Q_{\leq-10m}$ multipliers, which are not disposable on $S$, with another $O(m)$ loss.
Step 5B (Both $\phi^{(i)}_{k_i}$ at low modulation, output at high frequency and high modulation). In this step, which is the heart of the matter, we establish the single bound:
\[ \|P_{\geq-m}Q_{\geq-2m}\big(\partial^\alpha Q_{\leq-10m}\phi^{(1)}_{k_1}\,\partial_\alpha Q_{\leq-10m}\phi^{(2)}_0\big)\|_{L^2_t(L^2_x)} \lesssim 2^{Cm}\eta^c. \tag{213} \]
Here $c$ is the same small constant from the RHS of line (197). To use that estimate, we only need to establish angular separation of the two factors. This is a standard "geometry of the cone" calculation, and one finds that the angle between the two factors must satisfy $|\theta|\gtrsim2^{-m}$ in the $(++)$ or $(--)$ cases, and $|\theta-\pi|\gtrsim2^{-m}$ in the $(+-)$ or $(-+)$ cases (see for example Lemma 11 in Sect. 13 of [29]). By decomposing the product on the LHS of (213) into $O(2^{Cm})$ angular sectors such that each product has these separation properties, and by repeatedly applying estimate (197) on each interaction we have (213). The proof of the lemma is concluded.

Proof of Proposition 3.4. For this we use Lemma 6.1. We begin using the extensions (this will be modified somewhat shortly) and the same parameter $m$ as in the proof of Lemma 6.1. We start with several simplifications. The key observation is that in the proof of Lemma 6.1 we have used the bound on $\|\Box\phi^{(i)}_{k_i}\|_N$ just once, namely in Step 5B. All other cases carry over to the proof of Proposition 3.4. Consequently, it suffices to estimate the expression:
\[ P_kQ_{>-2m}R = P_kQ_{>-2m}(\partial^\alpha\phi^{(1)}_{k_1}\partial_\alpha\phi^{(2)}_{k_2}), \]
in both $L^2$ and $N$ under the assumptions $k_2=0$, $|k_1|\leq m$, and $|k|\leq m+2$. Furthermore, the contribution $P_kQ_{>-2m}((1-\chi_I)R)$ of $R$ in the exterior of $I$ is estimated directly by Lemma 6.1 because the extensions provided by Proposition 5.5 enjoy estimate (186) in the exterior of $I$. Hence, we only need consider the expression $P_kQ_{>-2m}(\chi_IR)$. For this we will establish the pair of suitable bounds:
\[ \|P_kQ_{>-2m}(\chi_IR)\|_{L^2_t(L^2_x)} \lesssim 2^{Cm}\eta^\delta + 2^{-5m}, \qquad \|P_kQ_{>-2m}(\chi_IR)\|_N \lesssim 2^{Cm}\eta^\delta + 2^{-4m}, \]
with $\delta$ as in Lemma 6.1. We remark that due to the frequency and modulation localization of $R$, the second $N$ bound follows from the first $L^2$ bound albeit with a readjusted $C$. Therefore, we drop the modulation and spatial frequency localization and simply prove that:
\[ \|P_kR\|_{L^2_t(L^2_x)[I]} \lesssim 2^{Cm}\eta^\delta + 2^{-5m}. \tag{214} \]
For this we use the renormalization. On the interval $I$, we may decompose $\phi^{(i)}_{k_i}$ as follows:
(i)
(i)
φki = (U,
(i)
Pk wki S[I ] + Pk wki N [I ] 2−|k−ki | A−1 , (i)
(i)
U,ki
for a possibly large constant A. By normalization, we may without loss of generality assume that A = 1, as any bounds for these two quantities will always appear as a (1) ∞ product. Since φk(1) L ∞ η, we obtain a similar relation for w,k , namely: t (L x )[I ] 1 1 (1) Pk w,k ∞ ∞ η. 1 L t (L x )[I ] (i)
Furthermore, by using Proposition 5.5, we may extend the wki so that all of the above listed bounds are global, albeit with a fractional modification of η. Thus, we may drop the interval I , and again work globally. We decompose the null-form R (first on I , then by extension) into R = R1 + R2 + R3 + R4 where: (1)
(1)
(2)
(2)
R1 = −∂ α (U,
(1)
(2)
R2 = ∂ α (U,
(1)
(2)
(2)
R4 = (U,
x
Note that the RHS loss is the effect of summing over frequencies k k1 m on the (1) first factor. We combine this with the pointwise bound on w,k1 to achieve: (1)
$$\|R_1\|_{L^2_t(L^2_x)} \lesssim m\,\|w^{(1)}_{,k_1}\|_{L^\infty_t(L^\infty_x)} \lesssim m\eta.$$
Step 2 (Estimating the term R2 ). This is essentially the same as in the previous step. (1) (2) Here we use the S bounds for U,
(1) use the pointwise bound for w,k . 1
Step 3 (Estimating the term $R_3$). We begin by splitting $\phi^{(1)}_{k_1}$ into a low and a high modulation part. For the high modulation part we have from (204) the $L^2$ bound:

$$\|Q_{>10m}\,\partial^\alpha \phi^{(1)}_{k_1}\|_{L^2_t(L^2_x)} \lesssim 2^{-5m}.$$
Furthermore, by summing over the energy estimate for $U^{(2)}_{,<0}$ and using the decay of high frequencies we have the pointwise bound:

$$\|\nabla_{t,x} U^{(2)}_{,<0}\|_{L^\infty_t(L^\infty_x)} \lesssim \sum_k 2^{k}\,\|P_k \nabla_{t,x} U^{(2)}_{,<0}\|_{L^\infty_t(L^2_x)} \lesssim 1. \qquad (215)$$

Combining these two estimates with the pointwise bound for $w^{(2)}_{,0}$ we can estimate the corresponding part of $R_3$, call it $R_{31}$, by:

$$\|R_{31}\|_{L^2_t(L^2_x)} \lesssim 2^{-5m}.$$
192
J. Sterbenz, D. Tataru (1)
It remains to consider the contribution of the low modulation part $Q_{<10m}\phi^{(1)}_{k_1}$ in $R_3$, which we will label by $R_{32}$. Using the $S$ bounds for $\phi^{(1)}_{k_1}$ and $U^{(2)}_{,<0}$ along with (23), after dyadic summation we obtain the usual $L^2$ estimate:

$$\|\partial^\alpha Q_{<10m}\phi^{(1)}_{k_1} \cdot \partial_\alpha (U^{(2)}_{,<0})^\dagger\|_{L^2_t(L^2_x)} \lesssim 1.$$
On the other hand, from the Strichartz control (148) and the boundedness of the gauge we have:

$$\|w^{(2)}_{,0}\|_{L^6_t(L^6_x)} \lesssim \|\phi^{(2)}_0\|_{L^6_t(L^6_x)} \lesssim 1,$$

therefore we obtain a low index space-time $L^p$ bound for $R_{32}$, namely:

$$\|R_{32}\|_{L^{3/2}_t(L^{3/2}_x)} \lesssim 1.$$
On the other hand, from the pointwise bound (43) for $\phi^{(1)}_{k_1}$ we obtain:

$$\|\partial^\alpha Q_{<10m}\phi^{(1)}_{k_1}\|_{L^\infty_t(L^\infty_x)} \lesssim 2^{Cm}\eta.$$

Combining this with (215) and the pointwise bound for $w^{(2)}_{,0}$ we have:

$$\|R_{32}\|_{L^\infty_t(L^\infty_x)} \lesssim 2^{Cm}\eta.$$

Interpolating the last two lines we obtain:

$$\|R_{32}\|_{L^2_t(L^2_x)} \lesssim 2^{Cm}\eta^{\frac14}.$$
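The exponent $\frac14$ on $\eta$ comes from Hölder interpolation between the $L^{3/2}$ and $L^\infty$ space-time bounds for $R_{32}$. A brief worked check follows; the parameter $\theta$ is auxiliary notation introduced here and does not appear in the text.

```latex
% Hoelder product step: L^2 \cdot L^6 \subset L^{3/2}, since 1/2 + 1/6 = 2/3.
% Interpolation step: choose \theta so that L^2 sits between L^{3/2} and L^\infty.
\begin{align*}
\frac12 &= \frac{\theta}{3/2} + \frac{1-\theta}{\infty}
\quad\Longrightarrow\quad \theta = \frac34 , \\
\|R_{32}\|_{L^2_t(L^2_x)}
&\leq \|R_{32}\|_{L^{3/2}_t(L^{3/2}_x)}^{3/4}\,
      \|R_{32}\|_{L^\infty_t(L^\infty_x)}^{1/4}
\lesssim 1^{3/4}\cdot\big(2^{Cm}\eta\big)^{1/4}
\leq 2^{Cm}\,\eta^{\frac14} .
\end{align*}
```

The last inequality only uses $m \geq 0$, so the constant $C$ does not need to be readjusted.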
Step 4 (Estimating the term $R_4$). We start by dividing the main part of the product into all spatial frequencies:

$$\partial^\alpha w^{(1)}_{,k_1} \cdot \partial_\alpha w^{(2)}_{,0} = \sum_{j_i} \partial^\alpha P_{j_1} w^{(1)}_{,k_1} \cdot \partial_\alpha P_{j_2} w^{(2)}_{,0}.$$

Using the bound (200) if $j_1, j_2 < 10m$, and (23) otherwise in conjunction with the $2^{-|j_i - k_i|}$ frequency separation gains for $P_{j_i} w^{(i)}_{,k_i}$ we have:

$$\|\partial^\alpha w^{(1)}_{,k_1} \cdot \partial_\alpha w^{(2)}_{,0}\|_{L^2_t(L^2_x)} \lesssim 2^{Cm}\eta^{\delta} + 2^{-5m}.$$

This estimate is directly transferred to $R_4$ due to the pointwise bounds on the gauge factors.
7. Proof of the Trilinear Estimates

In this section we will prove estimates (51)–(57). In all cases the desired bounds follow easily from a combination of the standard estimates (23)–(25) for widely separated frequencies, and the improved matched frequency estimates (44) and (46).

Proof of estimate (51). The proof will be accomplished in a series of steps whose goal is to reduce things to the matched frequency bilinear estimate (44).

Step 1 (Disposal of the $\phi^{(1)}$ Factor). As a first step we will show the general estimate:

$$\|\phi \cdot F\|_{L^2_t(\dot H^{-\frac12})_c[I]} \lesssim \|\phi\|_{S[I]} \cdot \|F\|_{L^2_t(\dot H^{-\frac12})_c[I]}, \qquad (216)$$
where {ck } is any (δ0 , δ0 )-admissible frequency envelope. To prove this, we split it into the three main frequency interactions. In the Low × H igh case we immediately have: Pk (φ
k
−1 L 2t ( H˙ x 2 )[I ]
φ S[I ] · 2− 2 P[k−5,k+5] F L 2 (L 2 )[I ] , t
x
which is sufficient. In the High × Low case, we freeze the dyadic frequency of $F$ and we have a similar estimate:

$$\|P_k(\phi \cdot F_{k'})\|_{L^2_t(\dot H_x^{-\frac12})[I]} \lesssim 2^{\frac{k'-k}{2}}\, c_{k'}\, \|\phi\|_{S[I]} \cdot \|F\|_{L^2_t(\dot H_x^{-\frac12})_c[I]},$$

for any $k' \leq k - 10$. Summing this over all such $k' \leq k - 10$ and using (13) we have (216) in this case. In the High × High case we freeze the frequency of the inputs and output to estimate:

$$\|P_k(\phi_{k_1} \cdot F_{k_2})\|_{L^2_t(L^2_x)} \lesssim 2^{k-k_1}\, \|\phi_{k_1}\|_{S[I]} \cdot 2^{\frac{k_2}{2}}\, c_{k_2}\, \|F\|_{L^2_t(\dot H^{-\frac12})_c[I]}, \qquad (217)$$
which follows easily from Bernstein’s inequality (9) and the bound (165). Multiplying this last line by $2^{-\frac12 k}$, and then summing over all $k_1$ and $k_2$ such that $|k_1 - k_2| \leq 20$ and $k_1 \geq k - 10$, and then using (14), we arrive at the estimate (216) for this case.

Step 2 (The Bilinear Estimate). In light of estimate (216) above, it suffices to show that:

$$\|\partial^\alpha \phi^{(2)}\, \partial_\alpha \phi^{(3)}\|_{L^2_t(\dot H^{-\frac12})_c[I]} \lesssim \eta^{\delta_1}, \qquad (218)$$
assuming the conditions of estimate (51). This will be done in two steps.

Step 2A (Reduction to Matched Frequencies). Our first step is to peel off all frequency interactions that cannot be treated by estimate (44). In all of these interactions, we will exploit the fact that there is a wide separation in the frequency. This is measured by choosing a large integer $m_0 = m_0(\eta)$ such that:

$$2^{-\frac12 \delta m_0} = \eta^{\delta_1}, \qquad (219)$$

where we remind the reader that $\delta$ is the small dyadic savings from the standard $L^2$ bilinear estimate on line (23), and because of the definition of $\delta_1$ we have:

$$m_0 \lesssim \delta_1 |\ln(\eta)|. \qquad (220)$$
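Solving (219) for $m_0$ makes the dependence on $\eta$ explicit. The following is a short worked computation that reads (219) as an exact identity and ignores the rounding needed to make $m_0$ an integer; the constant $C_\delta$ is notation introduced here for illustration only.

```latex
% Take logarithms in (219): 2^{-\delta m_0/2} = \eta^{\delta_1}, with 0 < \eta < 1.
\begin{align*}
-\tfrac12\,\delta\, m_0 \ln 2 &= \delta_1 \ln\eta = -\delta_1\,|\ln\eta| , \\
m_0 &= \frac{2}{\delta \ln 2}\;\delta_1\,|\ln\eta| \;=\; C_\delta\,\delta_1\,|\ln\eta| ,
\qquad C_\delta := \frac{2}{\delta\ln 2} .
\end{align*}
% Hence m_0 is comparable to \delta_1 |\ln\eta|, with a constant depending only on \delta.
```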
Our goal in this step is to show the following fixed frequency estimate:

$$\sum_{k_i:\ \max\{|k_i - k|\} \geq m_0} 2^{-\frac12 k}\, \|P_k(\partial^\alpha \phi^{(2)}_{k_2}\, \partial_\alpha \phi^{(3)}_{k_3})\|_{L^2_t(L^2_x)[I]} \lesssim 2^{-\frac12 \delta m_0}\, c_k\, \|\phi^{(2)}\|_{S[I]}\, \|\phi^{(3)}\|_{S_c[I]}, \qquad (221)$$

which in light of (219) suffices to establish (218) for all frequency interactions except for the case $k = k_1 + O(m_0) = k_2 + O(m_0)$. By an application of estimate (23), the two sum rules (13)–(14), and the definition (11) we immediately have:

$$\text{(L.H.S.)}(221) \lesssim \sum_{k_i:\ \max\{|k_i - k|\} \geq m_0} 2^{-\frac12 k}\, 2^{\frac12 \min\{k_i\}}\, 2^{-(\frac12 + \delta)(\max\{k_i\} - k)}\, c_{k_3} \lesssim 2^{-(\delta - \delta_0) m_0}\, c_k,$$
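which by using (219) and the definition of the $\delta_i$ suffices to establish (221).

Step 2B (The Matched Frequency Case). We have now reduced estimate (218) to showing the matched frequency bound:

$$\sum_{k_i:\ \max\{|k_i - k|\} < m_0} 2^{-\frac12 k}\, \|P_k(\partial^\alpha \phi^{(2)}_{k_2}\, \partial_\alpha \phi^{(3)}_{k_3})\|_{L^2_t(L^2_x)[I]} \lesssim \eta^{\delta_1} c_k.$$

Due to the fact that there are only $O(m_0) \lesssim |\ln(\eta)|$ terms in this sum, it suffices to show:

$$\|P_k(\partial^\alpha \phi^{(2)}_{k_2}\, \partial_\alpha \phi^{(3)}_{k_3})\|_{L^2_t(L^2_x)[I]} \lesssim \eta^{2\delta_1}\, 2^{\frac12 k}\, c_k, \qquad \max\{|k_i - k|\} \lesssim \delta_1 |\ln(\eta)|.$$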
But this last estimate follows immediately from (44) and the definition of the δi .
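The loss from the number of terms in the matched frequency sum is harmless because a logarithm of $\eta$ is dominated by any small negative power of $\eta$. A short calculus check, assuming $0 < \eta < 1$ and writing $a$ for a generic positive exponent (here $a = \delta_1$), which is notation introduced only for this sketch:

```latex
% Maximise f(\eta) = |\ln\eta|\,\eta^{a} on (0,1):
% f'(\eta) = -\eta^{a-1}\,(1 + a\ln\eta) vanishes at \eta = e^{-1/a}.
\begin{align*}
\sup_{0<\eta<1} |\ln\eta|\,\eta^{a} &= \frac{1}{e\,a} , \\
|\ln\eta|\;\eta^{2\delta_1} &= \big(|\ln\eta|\,\eta^{\delta_1}\big)\,\eta^{\delta_1}
\leq \frac{1}{e\,\delta_1}\,\eta^{\delta_1} .
\end{align*}
% So O(|\ln\eta|) terms, each of size \eta^{2\delta_1}, sum to O(\eta^{\delta_1}),
% with an implicit constant depending on \delta_1.
```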
Proof of estimate (52). This estimate was essentially established in the previous proof. We split the estimate into a sum of two pieces: (L.H.S.)(52) Pk P
x
For the first term we simply use (51). For the second term, we use the following version of (217) above:

$$\|P_k(\phi_{k_1} \cdot F_{k_2})\|_{L^2_t(\dot H^{-\frac12})} \lesssim 2^{\frac{k-k_1}{2}}\, c_{k_1}\, \|\phi\|_{S_c[I]} \cdot \|F\|_{L^2_t(\dot H^{-\frac12})[I]},$$

for $|k_1 - k_2| \leq 5$, along with (218). This suffices via the sum rule (14).
Remark 7.1. It is possible to prove the frequency envelope estimate (55) with η = 1 in the case where there is no energy dispersion. As the previous step shows, one may first reduce to a bilinear estimate. Then the desired bound follows from summation over (23) using the sum rules (13)–(14). The details are standard and left to the reader.
Proof of estimate (55). The proof will be accomplished in a series of steps whose goal is to reduce things to the bilinear estimate (46). Step 0 (A Preliminary Reduction). The first order of business is to reduce estimate (55) to the case where we replace the condition on line (50) with a maximal case: m = max{ δ1 | ln(η)|, 10}. (222) We claim that a proof of (55) with this choice of m implies (55) for any other choice of m where we turn (222) into an inequality. The only caveat is that we must replace the k with a slightly fattened support, so multiplier Pk in the definition of (53) by a version P that one obtains the quasi-idempotence identity Pk P[k−5,k+5] = P[k−5,k+5] . To see this, simply notice that one has the reshuffling identity: m (φ (1), φ (2), φ (3) ) − T m (φ (1), φ (2), φ (3) ), Tkm 0 (φ (1), φ (2), φ (3) ) = Tkm (φ (1), φ (2), φ (3) ) − T 1;k 2;k m (φ (1) , φ (2) , φ (3) ) are the trilinear forms obtained for any 10 m 0 m, where the T i;k m k instead of Pk , to the second and third terms from applying the definition of Tk , with P (resp.) on the RHS of (53) in the definition of Tkm 0 . Step 1 (Removal of the Commutator). We are now trying to prove (55) under the condition (222). Our next step is to use (10) to write (53) in the form: (1) (2) Tkm (φ (1) , φ (2) , φ (3) ) = Pk φ (1) ∂ α φ (2) ∂α φ (3) − Pk φ
(2)
k φ (3) ), C1 = 2−k L 1 (∇x φ
(1) (2) k φ (3) ), C2 = 2−k L 2 (φ
(3)
k φ (2) , ∂α φ C3 = 2−k L 3 (∇x φ
(3)
k φ (2) , ∇x ∂α φ C4 = 2−k L 4 (φ
(224)
which suffice to establish (55) for all but the first three terms on the RHS of the equation for $T^m_k$ above. It suffices to work with the case of $i = 3, 4$; the cases $i = 1, 2$ are similar but simpler because the frequency envelope is on the high term. For the trilinear form $C_3$ we decompose into all possible frequencies and use (25), which gives:

$$\|C_3\|_{N[I]} \lesssim 2^{-k} \sum_{k_1, k_3} 2^{k_1}\, 2^{-\delta(k_1 - k_3)_+}\, c_{k_3} \lesssim 2^{-m} c_{k-m} \lesssim 2^{-(1-\delta_0)m} c_k,$$

which suffices to show (224) in light of the definition (222) for $m$.
To prove the bound (224) for $C_4$ we only split $\phi^{(3)}$ into separate frequencies, and we use (21) and (24) to bound:

$$\|C_4\|_{N[I]} \lesssim 2^{-k} \sum_{k_3} 2^{k_3} c_{k_3} \lesssim 2^{-m} c_{k-m} \lesssim 2^{-(1-\delta_0)m} c_k.$$
Step 2 (Reduction to Matched Frequencies). We are now trying to bound the sum of the first three terms on the RHS of (223) above. Here we write: (First three terms on R.H.S.)(223) = A1 + A2 + A3 + B1 + B2 , where A1 , A2 and A3 account for the unmatched frequency interactions: (3) α (2) Pk φk(1) ∂ φ ∂ φ A1 = α k2 k3 , 1 k1 k−m max{k2 ,k3 }k+m
A2 =
k1 k−m min{k2 ,k3 }k−2m
A3 =
max{k2 ,k3 }k+m
(1) (2) (3) Pk φk1 ∂ α φk2 ∂α φk3 ,
(1) (2) (3) Pk φ
while B1 and B2 account for the matched frequency interactions: (1) (3) α (2) Pk φ B1 = k−m ∂ φk2 ∂α φk3 , k−2m
B2 =
k−m
(1) (3) Pk φ
The goal of this step is to prove the set of estimates:

$$\|A_i\|_{N[I]} \lesssim 2^{-\frac12 \delta m} c_k, \qquad (225)$$
which is sufficient to establish (55) for these terms because of the definition (222). To prove (225) for the term A1 we use (25). The two highest frequencies can only differ by O(1), therefore we get three distinct contributions if the highest pairs are {12}, {13}, or {23} respectively: A1 N [I ] 2δ(k3 −k2 ) 2δ(k−k2 ) ck3 + 2δ(k2 −k3 ) 2δ(k−k3 ) ck3 k2 k+m k3 k2
+
k3 k+m k2 k3
k3
3
1
2δ(k−k3 ) ck3 m2− 4 δm ck+m 2− 2 δm ck .
k3 k+m k1 =k−m
In the case of the term $A_2$ we must have either the condition $\max\{k_2, k_3\} > k - 10$, or the conditions $\max\{k_2, k_3\} \leq k - 10$ and $k_1 > k - 10$. This gives two distinct contributions using estimate (25), which after summing out the $k_1$ index may be (resp.) written as:

$$\|A_2\|_{N[I]} \lesssim S_1 + S_2,$$
with:
S1 =
2δ(k−max{k2 ,k3 }) 2δ(min{k2 ,k3 }−k+m) ck3 ,
min{k2 ,k3 }k−10
S2 =
2δ(min{k2 ,k3 }−k) ck3 .
min{k2 ,k3 }
For the sum S1 we split into cases depending on which index is minimal, and then sum out k2 which yields: S1
2δ(k3 −k+m) ck3 + 2−δm
k3
1
2δ(k−k3 ) ck3 2−δm (ck−2m + ck ) 2− 2 δm ck .
k3 >k−10
For the sum S2 we again split into cases depending on which index is minimal: S2
(k − k3 )2δ(k3 −k) ck3 +
k3
k−10
2δ(k2 −k) ck3 .
k2
For the first sum on the RHS above we get 2−δm ck−2m which is acceptable. For the second sum we further split the range into k3 < k − 2m and k − 2m k3 < k − 10. In the first case we again get 2−δm ck−2m , while in the second we are left with m2−2δm supk−2m k3
1
2δ(k−k3 ) ck3 2−δm ck+m 2− 2 δm ck .
k3 >k+m
Step 3 (The Matched Frequency Estimate). After the last step, it remains to bound the remaining two terms $B_i$. In both cases, by an application of either (21) or (22), we only need to show the more general matched frequency estimate:

$$\|\partial^\alpha \phi^{(2)}_{[k-O(m),k+O(m)]}\, \partial_\alpha \phi^{(3)}_{[k-O(m),k+O(m)]}\|_{N[I]} \lesssim \eta^{\delta_1} c_k, \qquad (226)$$

under the conditions of Proposition 3.6. Using the bound on $m$ (222) and the definition (11), it suffices to establish the fixed frequency estimate:

$$\|\partial^\alpha \phi^{(2)}_{k_2}\, \partial_\alpha \phi^{(3)}_{k_3}\|_{N[I]} \lesssim 2^{Cm}\, \eta^{\delta}\, c_{k_3},$$

where we are restricting $|k_2 - k_3| \lesssim m$. This follows immediately from (46).
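To see why the factor $2^{Cm}$ in the fixed frequency estimate is affordable, one can rewrite it as a power of $\eta$. The sketch below assumes that the first entry in the maximum on line (222) is the active one, so that $m = \delta_1 |\ln(\eta)|$, and it ignores the additional frequency envelope losses coming from (11); both simplifications are assumptions made here for illustration.

```latex
% With m = \delta_1 |\ln\eta| and 0 < \eta < 1:
\begin{align*}
2^{Cm} &= e^{C\,\delta_1 \ln 2\,|\ln\eta|} = \eta^{-C\delta_1\ln 2} , \\
2^{Cm}\,\eta^{\delta} &= \eta^{\,\delta - C\delta_1\ln 2}
\leq \eta^{\delta_1}
\quad\text{whenever}\quad \delta_1\,(1 + C\ln 2) \leq \delta .
\end{align*}
% The last condition holds once \delta_1 is chosen small enough relative to \delta.
```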
Remark 7.2. We remark here that one may prove estimate (61) by a quick application of the above work. To see this, notice the above proof up to Step 3 does not use Proposition 3.4. Thus, we are left with showing estimate (226) in this case, and by inspection of Step 2 we may assume the gap between k2 and k3 is no larger than 3m. By directly applying estimate (24) we have (61) in this case.
Proof of estimate (57). The proof of this estimate follows from some simple manipulations of the bounds used to produce (55). A quick review of the previous proof shows 1√ that all bounds were achieved with RHS η 2 δ1 δ . Thus, by a direct application of those bounds and using the (δ0 , δ0 ) variance condition on {ck } we have: Tkm (Pk+m φ (1) , φ (2) , φ (3) ) N [I ] 2−δ(max{ki }−k) 2−δ(k1 −min{k2 ,k3 })+ ck1 ki :k1 >k+m
2−δ(k1 −k) ck1 2−(δ−δ0 )m ck ,
k1 : k1 >k+m
which suffices.
Remark 7.3. To prove (62) we follow a similar procedure as in the previous proof, except this time applied to estimate (61) instead of (55). Here it suffices to split cases according to P
(227)
Bk S∩X F ck ,
(228)
Bk · ψk N F 2
δ(k −k)
ck ψk S ,
k < k − 10.
(229)
Proof. By definition we have that: Bk = S(φ)
By the version of estimate (51) in Remark 3.8, we have the pair of bounds:

$$\|P_k \Box S(\phi)\|_{L^2_t(L^2_x)[I]} + \|P_k \Box \phi\|_{L^2_t(L^2_x)[I]} \lesssim_F 2^{\frac{k}{2}} c_k. \qquad (230)$$
By Leibniz’s rule we have: Bk = S(φ)
(231)
Hence using (230) for the first and last term, and again using Remark 3.8 for the null form in the middle term, we obtain an X [I ] bound as on line (228). It remains to prove the estimate (229) localized to I . We again use the expression (231) for Bk . We multiply the RHS of this line by a function ψ j of frequency j > k+10. The contribution of the middle term can be estimated by (25): ∂ α S(φ)
(2)
(3)
ψ j · φ
with $k \in \mathbb{Z}$. This ensures that the $U_k$ are localized at frequency $2^k$, while the smallness of $\phi$ in $S$ is used to prove that $U^\dagger U - I$ is small. Such a construction is no longer satisfactory here, as $\phi$ can be large in $S$ and thus $U$ may fail to be almost orthogonal. Instead we switch to a continuous version of the
above construction where we seek U and its “frequency localized” version U,
−∞
where each U,k is defined by: U,k = U,
(232)
Owing to the antisymmetry of the Bk , solutions to this ODE enjoy the conservation law † = I N , so they are automatically exactly orthogonal. However, the price one U,
(233)
Note however that arbitrarily high frequencies are immediately introduced, and their evolution is not easy to track. In particular a bootstrap argument for the above S norm bound would seem to fail due to the lack of smallness of the Bk ’s. We proceed with the proof in several steps aimed at building up to the full S norm estimate by using the conservation law of (232) in a crucial way: ∞ ∞ 2 Step 1 (L ∞ t (L x ) and L t (L x ) bounds for U,k ). We will work exclusively with the energy frequency envelope { ck } for B in this step. Without loss of generality we may assume that this is bounded by the S norm frequency envelope {ck }. We start with the pointwise and energy bounds: k ∞ E ∞ E 2 Bk L ∞ ck , ∇t,x Bk L ∞ ck , ∇t,x Bk L ∞ ck , 2 E t (L x ) t (L x ) t (L x )
(234) all derived from (227). We claim that this implies the following energy type for U,k itself:
−|k −k| C(k−k )+ 2 ck . Pk ∇t,x U,k L ∞ 2 E 2 t (L x )
(235)
To show this, notice that by construction of U,
We estimate ∇t,x U,k by differentiating (232): d ∇t,x U,
(236)
In view of the second estimate on line (234), we have good bounds for the second term on the RHS of the above expression, and we wish to transfer these to ∇t,x U,
Lemma 8.2 (Unitary Variation of Parameters). Let V,
V,<−∞ = W,−∞ = 0,
(237)
where Bk is antisymmetric and the forcing term W,k is arbitrary. Then in any mixed q Lebesgue space L t (L rx ) we have the following bound: k V,
−∞
x
Proof. We write the formula for V,
where $P$ is the propagator of the unitary problem:

$$\frac{d}{dk} P(k, k') = B_k P(k, k'), \qquad P(k', k') = I_N.$$

In particular, $\|P(k, k')\|_{L^\infty_t(L^\infty_x)} \lesssim 1$. The proof is concluded via an application of Minkowski’s integral inequality.
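The uniform bound on the propagator rests on the antisymmetry of $B_k$. A minimal derivation follows, assuming the propagator equation is read as $\frac{d}{dk}P = B_k P$ with real antisymmetric $B_k$; if the generator acts from the other side, the same computation applies to $P P^\dagger$ instead.

```latex
% Antisymmetry: B_k^\dagger = -B_k.
\begin{align*}
\frac{d}{dk}\big(P^\dagger P\big)
&= (B_k P)^\dagger P + P^\dagger (B_k P)
 = P^\dagger\big(B_k^\dagger + B_k\big)P = 0 , \\
P^\dagger P &\equiv I_N
\quad\text{since } P(k',k') = I_N .
\end{align*}
% Thus P(k,k') is pointwise orthogonal, |P(k,k')v| = |v| for every vector v,
% so multiplication by P does not change any L^q_t(L^r_x) norm, and
% Minkowski's integral inequality moves the norm inside the k' integral.
```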
We now use estimate (238) to integrate (236), which yields: k ∞ E 2 ∇t,x U,
through a direct application of the sum rule (13). From the differentiated equation for U,k this shows that: ∇t,x U,k L ∞ ck . 2 E t (L x ) Repeating the process for all possible spatial derivatives of ∇t,x U,
The second relation shows in particular that:
C(k−k ) Pk ∇t,x U,k L ∞ ck , 2 E 2 t (L x )
for any positive constant C, and therefore by integration that:
C(k−k ) Pk ∇t,x U,
which suffices for (235) if k k. It remains to bound the low frequencies in U,k , and so we write: Pk U,k = Pk (P[k−10,k+10] U
k Pk ∇t,x U,k L ∞ ck 2k −k . 2 2 ∇t,x (P[k−10,k+10] U,
Hence (235) is proved.
(239)
Step 2 (Strichartz bounds for $U_{,k}$). This section largely mimics the previous one, so we will be more terse here. By (228) we have the Strichartz bounds:

$$\|B_k\|_{DS_k} \lesssim_F 2^{-k} c_k, \qquad \|\nabla_{t,x} B_k\|_{DS_k} \lesssim_F c_k,$$

where we recall that $DS_k$ is the space of Strichartz admissible $L^q_t(L^r_x)$ norms from line (152) with appropriate dyadic weight (note that this norm does not include frequency localization, which will be notationally useful here). Using the bounds for $B_k$ with Eq. (236) or its derivatives, we directly have:

$$\|\nabla_x^J \nabla_{t,x} U_{,k}\|_{DS_k} \lesssim_F 2^{|J|k} c_k, \qquad |J| \geq 0.$$

By using this last set of estimates for high frequencies, and (235) and Bernstein’s inequality for low frequencies, we have:
Pk ∇t,x U,k DS k F 2−|k −k| 2C(k −k)+ ck . In particular, one has the inequality:
3
Pk ∇t,x U,k L 4 (L ∞ ) F 2 4 k 2−|k −k| ck , t
x
(240)
which will be useful later in this section. Finally, by interpolating this last bound with (235) and recalling the definition from line (148), we have the following S norm portion of estimate (233): 1
Pk U,k S F 2− 4 |k −k| ck . Step 3 (High modulation bounds for U,k ). Here we will show that: k
1
P j U,k L 2 (L 2 ) F 2 2 2−( 2 +δ)|k− j| ck . t
x
(241)
Differentiating the equation for U,
(242)
Our first goal will be to use Lemma 8.2 to show that: k
U,
(243)
x
By estimate (238) and the X control for Bk from line (228), it suffices to have the null-form bound: k
∂ α U,
x
(244)
Expanding the term on the LHS of this last line we have: k α ∂ U,
−∞ k −∞
U,
k
−∞
∂ α U,
(245)
Estimate (244) for the first term on the RHS of this last line follows by summing over the bound (23). For the second term on the RHS of the last line above we may take a product of two L 4t (L ∞ x ) estimates for the terms at frequency k and , < k , and one energy type bound for Bk . This again yields (244). A similar argument allows us to prove the analog of estimates (244) and (243) for higher spatial derivatives: 1
∇xJ (∂ α U,
x
1
∇xJ U,
(246)
x
Turning our attention to Uk we have the identity: U,k = U,
(247)
By estimates (243), (244), and the analogous bound for Bk from line (228) we directly have: k
U,k L 2 (L 2 ) F 2 2 ck , t
x
∞ while the estimates on line (246) combined with the energy and L ∞ t (L x ) bounds for derivatives of U,
∇xJ U,k L 2 (L 2 ) F 2( 2 +|J |)k ck . t
x
This suffices to give (241) for all but the low frequencies. It remains to obtain improved low frequency bounds, i.e. prove (241) in the case when j < k − 10. The first two terms in (247) are easy to estimate, combining the 2 L 2t (L 2x ) bound for one factor with the L ∞ t (L x ) energy type bound for the other, while using Bernstein’s inequality at low frequency. The third term on the RHS of (247) has already been estimated before using (245), but now we need to be more careful to gain from small j. The first term on the RHS of (245), call it T1 , can at low frequency be split into three contributions, P j T1 = T11 + T12 + T13 , where k T11 = P j P< j+4 U,
k−4 k ∞
k−4 j+4 k−4 −∞
Pl U,
P[k−10,k+10] U,
We explain the estimates for each of these terms. In the case of T11 we bound P< j+4 U,
because there is no extra room in the application of Strichartz estimates to use Bernstein’s inequality. Therefore we reexpand as follows: k k1 U,
+
−∞ −∞
∂ α U,
The first term T21 on the RHS above has a structure very similar to the whole of T1 above. The only new development is that extra factor of Bk1 , but it is harmless due to the fact that its frequency is always greater than the differentiated term ∂ α Bk2 . Therefore, one can use the same methods as in the previous paragraph to bound this term (one could as well use the procedure we are about to describe for bounding the second term T22 ). To handle T22 above, we split it further as: k−8 k1 P>k−20 ∂ α U,
+ Pj
k
−∞ k1
k−8 −∞
∂ α U,
For the first term above we put the two (i.e. first and fourth) high frequency terms in $L^\infty_t(L^2_x)$, while the middle two terms are both estimated with $L^4_t(L^\infty_x)$; then we use Bernstein’s inequality. One is forced to lose in the low frequencies this way, but this is made up for by the arbitrary gain in the difference $(k - k_2)$ coming from estimate (235):

$$\|T_{221}\|_{L^2_t(L^2_x)} \lesssim_F c_k \int_{-\infty}^{k-8}\!\!\int_{-\infty}^{k_1} 2^{j} \cdot 2^{C(k_2 - k)} \cdot 2^{-\frac14 k_2} \cdot 2^{-\frac14 k_1}\, dk_2\, dk_1 \lesssim_F 2^{\frac12 j}\, 2^{\frac12 (j-k)}\, c_k.$$

To bound the term $T_{222}$ we put both the $k_2$ indexed terms in $L^4_t(L^\infty_x)$, and the other two factors in $L^\infty_t(L^2_x)$ while using Bernstein’s inequality at low frequency. This gives the inequality:

$$\|T_{222}\|_{L^2_t(L^2_x)} \lesssim_F c_k \int_{k-8}^{k}\!\!\int_{-\infty}^{k_1} 2^{j} \cdot 2^{\frac12 k_2} \cdot 2^{-k_1}\, dk_2\, dk_1 \lesssim_F 2^{\frac12 j}\, 2^{\frac12 (j-k)}\, c_k.$$
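The double integral in the bound for $T_{221}$ can be evaluated in closed form. The computation below is a sketch that assumes $C > \frac12$, so that both integrals converge at $-\infty$; the explicit constants it produces are not tracked in the text.

```latex
\begin{align*}
\int_{-\infty}^{k_1} 2^{C(k_2-k)}\,2^{-\frac14 k_2}\,dk_2
&= \frac{2^{-Ck}\,2^{(C-\frac14)k_1}}{(C-\frac14)\ln 2} , \\
\int_{-\infty}^{k-8} 2^{(C-\frac14)k_1}\,2^{-\frac14 k_1}\,dk_1
&= \frac{2^{(C-\frac12)(k-8)}}{(C-\frac12)\ln 2} , \\
2^{j}\cdot 2^{-Ck}\cdot 2^{(C-\frac12)(k-8)}
&= 2^{-8(C-\frac12)}\; 2^{\,j-\frac12 k}
 = 2^{-8(C-\frac12)}\; 2^{\frac12 j}\,2^{\frac12(j-k)} .
\end{align*}
% The result matches the right-hand side 2^{j/2} 2^{(j-k)/2} c_k up to constants.
```

The same bookkeeping applies to $T_{222}$: the inner integral of $2^{\frac12 k_2}$ contributes $2^{\frac12 k_1}$, and the outer integral of $2^{-\frac12 k_1}$ over $k - 8 < k_1 < k$ contributes $2^{-\frac12 k}$.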
This completes our demonstration of the estimate (241). Step 4 (High frequency bounds for U,k ). Here we show that the high frequencies in U,k can be estimated in a much more favorable way:
J ∇t,x Pk U,k L 1 (L 1 ) F 2(−3+|J |)k 2−C(k −k) ck , k > k + 10, t
x
(248)
where |J | 2. For this we expand with D = {k5 < k4 < k3 < k2 < k1 < k}: Pk U,k = Pk U,
Due to the frequency localizations we can replace U,
x
D
The bound (248) with $J = 0$ follows after integration. The case $|J| = 1$ is treated similarly. A minor variation is needed in the case $|J| = 2$ when two time derivatives occur. There one writes $\partial_t^2 = \Box + \Delta_x$, using either (228) or (246) for the factor containing the d’Alembertian.

Step 5 (Full $S$ norm bounds for $U_{,k}$). Here we prove that:

$$\|P_{k'} U_{,k}\|_{S} \lesssim_F 2^{-\delta|k - k'|} c_k. \qquad (249)$$
In view of the previous step it suffices to consider the case k < k + 10. Here we encounter the main difficulty compared to [29,33]. The inductive bound used there grows exponentially in k due to lack of smallness, so it is useless. Bootstrapping fails for a similar reason. Instead we consider iterated expansions. There are two bounds we need to prove, namely for Pk Q j U,k 1, 1 and Pk Q j U,k S[k ; j] . Due to the high X ∞2
modulation bound (241) and the high frequency bound (248) it suffices to consider the case j < k − 20 < k − 10. The key technical step asserts that in either case we can bound the contribution of U,< j−20 using only pointwise and high modulation bounds: Lemma 8.3. Let j < k − 10. Then the following estimate holds for test functions u and φ: Q j (u < j−10 φk )
1, 1
X ∞2
∞ + Q < j (u < j−10 φk ) S[k; j] u L ∞ · φk S . t (L x )∩X
(250) Proof. We write u < j−10 φk = Q > j−10 u < j−10 · φk + Q < j−10 u < j−10 · φk . For the first term we obtain an L 2t (L 2x ) bound, which by (162) suffices for both norms on the left in (250): Q > j−10 u < j−10 · φk L 2 (L 2 ) Q > j−10 u < j−10 L 2 (L ∞ ) φk L ∞ 2 t (L x ) t
t
x
x
2 Q > j−10 u < j−10 L 2 (L 2 ) φk L ∞ 2 t (L x ) j
t
2
− 2j −k
u
1
L 2t ( H˙ − 2 )
x
∇t,x φk L ∞ 2 . t (L x )
For the second term we consider separately the two cases. On one hand: Q j (Q < j−10 u < j−10 · φk ) = Q j (Q < j−10 u < j−10 · Q > j−10 φk ). Therefore we directly have: Q j (Q < j−10 u < j−10 · φk )
j
1, 1 X ∞2
∞ Q > j−10 φk 2 2 2 2 +k Q < j−10 u < j−10 L ∞ L (L ) t (L x ) t
∞ φk u L ∞ t (L x )
1, 1
X ∞2
.
On the other hand, by a direct application of estimate (168) we have: ∞ φk S . Q < j (Q < j−10 u < j−10 · φk ) S[k; j] u L ∞ t (L x )
The proof of the lemma is concluded.
x
We now return to the main proof, and consider the two bounds we need in order to bound U,k in S, namely: Pk Q j U,k
1
1, 1 X ∞2
+ Pk Q < j U,k S[k ; j] F 2− 4 |k−k | ck
j < k − 20 < k − 10.
For each fixed modulation index j, we expand U,k in the form: k k k1 U,k = U,< j−20 Bk + U,< j−20 Bk1 Bk dk1 + U,
j−20
(251) Step 5A (Contribution of the first term in (251)). We write: U,< j−20 Bk = P< j−10 U,< j−20 · Bk + P> j−10 U,< j−20 · Bk . The first component has output at frequency k, and its contribution is accounted for due to Lemma 8.3. The second can have both high and low frequency output, so we need to split it further. For the high frequency output we estimate: j
− −k P> j−10 U,< j−20 · Bk L 2 (L 2 ) P> j−10 U,< j−20 L 2 (L ∞ ) Bk L ∞ ck , 2 F 2 2 t (L x ) t
t
x
x
where the L 2t (L ∞ x ) norm is estimated by interpolating the (summed version of the) energy bound (235) with the L 1t (L 1x ) high frequency bound (248) for U,< j−20 , and by using Bernstein’s inequality. In the case of low frequency output k < k − 20, the first factor is further restricted to high frequencies so we may bound: Pk (P>k−10 U,< j−20 · Bk ) L 2 (L 2 ) P>k−10 U,< j−20 L 2 (L ∞ ) Bk L ∞ 2 t (L x ) t
t
x
F 2
−C(k− j) − 23 k
2
x
ck ,
(252)
where we have followed the same procedure as in the previous estimate. The restriction j < k then suffices to produce (249) for this term. Step 5B (Contribution of the second term in (251)). We need to split this into several subcases: Step 5B.1 (Contribution of high frequencies in U< j−20 ). This term may have both low and high frequency output. In the case of high frequencies we estimate directly in L 2t (L 2x ) using Strichartz estimates as follows: P> j−10 U,< j−20 · Bk1 Bk L 2
t,x
P> j−10 U,< j−20 L 4 (L ∞ ) Bk1 L 4 (L ∞ ) Bk L ∞ 2 t (L x ) t
F 2
−
j+k1 4 −k
x
t
x
ck ,
where the j − 20 < k1 < k integration is now straightforward and yields a RHS 1 expression of the form F 2−k 2− 2 j ck which suffices. In the case of low frequency output where k < k − 20, we further split the integrand as follows: Pk (P> j−10 U,< j−20 · Bk1 Bk ) = Pk (P> j−10 U,< j−20 · P>k−10 Bk1 · Bk ) + Pk (P> j−10 U,< j−20 · P
The first term is estimated as above with a gain of $2^{-\frac14 (k-j)}$ due to the restriction on $k_1$ (which in particular restricts the range of integration for this term). This suffices to show (249) for this term. To handle the second term, we use the fact that the first factor is now forced to be at large frequency, which gives an $L^2_t(L^2_x)$ bound as on line (252) above. Notice that the additional integration in $j - 20 < k_1 < k$ may be absorbed via the factor of $2^{-C(k-j)}$.

Step 5B.2 (Contribution of low frequencies but high modulations in $U_{<j-20}$). In this case the only possible low frequency contribution comes when $|k_1 - k| < 10$. Therefore we may proceed as above using the high modulation bound (243) for the first factor as follows:

$$\|P_{k'}(Q_{>j-10} P_{<j-10} U_{,<j-20} \cdot B_{k_1} B_k)\|_{L^2_t(L^2_x)} \lesssim \|Q_{>j-10} P_{<j-10} U_{,<j-20}\|_{L^4_t(L^\infty_x)}\, \|B_{k_1}\|_{L^4_t(L^\infty_x)}\, \|B_k\|_{L^\infty_t(L^2_x)} \lesssim_F 2^{-\frac{j+k_1}{4} - k}\, c_k,$$
and the integral in $k_1$ is the same as above depending on whether $|k' - k| < 10$ or $k' < k - 10$. In either case one gains a RHS factor of $\lesssim_F 2^{-\frac14 (k-k')}\, 2^{-k}\, 2^{-\frac12 j}\, c_k$.

Step 5B.3 (Contribution of low frequencies and low modulations in $U_{,<j-20}$). Here we deal with the expression $Q_{<j-10} P_{<j-10} U_{,<j-20} \cdot B_{k_1} B_k$. We consider two subcases:

Step 5B.3.a (Contribution of the range $k_1 > k - 10$). Under this restriction, we may group the product $B_{k_1} B_k$ as a single term, which we further decompose into all frequencies $k' < k + 10$. For each such localized term we have from the algebra bound (20) the estimate:

$$\|P_{k'}(B_{k_1} B_k)\|_{S} \lesssim 2^{-|k'-k|}\, \|B_{k_1}\|_{S}\, \|B_k\|_{S} \lesssim_F 2^{-|k'-k|}\, c_k.$$

Therefore, in the range $j < k'$ the resulting term may be estimated in essentially the same way the first term on the RHS of (251) was estimated in Step 5A above, with the additional simplification that the low frequency gains are already implicit in the $P_{k'}(B_{k_1} B_k)$ localization.

Step 5B.3.b (Contribution of the range $j - 20 < k_1 < k - 10$). In this case the output is automatically at frequency $2^k$. Notice that if we argue as in the previous case then we run into trouble with the $k_1$ integration. Instead, we observe that one has access to the additional localization:

$$Q_j\big(Q_{<j-10} P_{<j-10} U_{,<j-20} \cdot B_{k_1} B_k\big) = Q_j\big(Q_{<j-10} P_{<j-10} U_{,<j-20} \cdot Q_{<j+4}(B_{k_1} B_k)\big),$$

and according to estimate (169) we may bound the entire contribution of the second factor as:

$$\|Q_{<j+4}(B_{k_1} B_k)\|_{S} \lesssim 2^{-\delta|j - k_1|}\, \|B_{k_1}\|_{S}\, \|B_k\|_{S}, \qquad j < k_1 \leq k.$$
This provides the needed additional gain that enables us to integrate with respect to $k_1$.

Step 5C (Contribution of the last term in (251)). As in the previous step, we need to split into two further subcases depending on the range of integration:

Step 5C.1 (Contribution of the range $k_1 > k - 10$). A direct application of Strichartz bounds gives the estimate:

$$\|B_{k_2} B_{k_1} B_k\|_{L^2_t(L^2_x)} \lesssim \|B_{k_2}\|_{L^4_t(L^\infty_x)}\, \|B_{k_1}\|_{L^4_t(L^\infty_x)}\, \|B_k\|_{L^\infty_t(L^2_x)} \lesssim_F 2^{-k}\, 2^{-\frac{k_1 + k_2}{4}}\, c_k.$$
The integration with respect to $k_1, k_2$ over the region $j - 20 < k_2 < k_1 < k$ with the additional restriction that $k_1 = k + O(1)$ is straightforward and yields the RHS term $\lesssim_F 2^{-\frac14 (k-j)}\, 2^{-k}\, 2^{-\frac12 j}\, c_k$, which suffices to produce (249) for this term in light of the restriction $j < k'$.

Step 5C.2 (Contribution of the range $j - 20 < k_1 < k - 10$). In this case with high frequency output we may proceed as in the previous step. Notice that integration over the full range $j - 20 < k_2 < k_1 < k$ with no additional work still yields a RHS of the form $\lesssim_F 2^{-k}\, 2^{-\frac12 j}\, c_k$. The contribution of this range with low frequency output forces the first term in the product to have localization in the range $k + O(1)$. One may again proceed as in the last case of Step 5A above to produce an $L^2_t(L^2_x)$ estimate via (252). Notice that the integration in both $k_1$ and $k_2$ is safely absorbed by the factor $2^{-C(k-j)}$. This concludes our demonstration of the estimate (249).

Step 6 (Proof of the bound (34)). By the algebra estimates (21) and (22) it suffices to do this for $|k' - k| > 20$. We rescale to $k = 0$. There are two cases:

Step 6.A (Low frequencies; $k' < -20$). Here we may further localize the transformation matrix to $P_{[-10,10]} U_{,<-20}$. Therefore, we have access to (33). For the lower modulations in $G_0$ we estimate via Bernstein:

$$\|P_{k'}(P_{[-10,10]} U_{,<-20} \cdot Q_{<20} G_0)\|_{L^1_t(L^2_x)} \lesssim 2^{k'}\, \|P_{[-10,10]} U_{,<-20}\|_{L^2_t(L^2_x)}\, \|Q_{<20} G_0\|_{L^2_t(L^2_x)}.$$
This suffices by estimate (198) in Sect. 6 above. For the high modulation contribution, we split Q >20 G 0 = G (1) + G (2) , a sum (resp.) 0,− 1
of an L 1t (L 2x ) atom and an X 1 2 atom. For G (1) the bound (34) follows by taking 2 P[−10,10] U,<−20 in L ∞ t (L x ) and using Bernstein. 0,− 1
For the X 1 2 atom G (2) , we may assume we are working with a single modulation Q j G (2) , where j > 20. For modulations Q < j−10 P[−10,10] U,<−20 , estimate (34) follows 2 by again putting the first factor in L ∞ t (L x ) and using Bernstein to estimate the product 0,− 1
as a X 1 2 atom with a 2k gain. For high modulations of the first factor, we estimate: Pk (Q > j−10 P[−10,10] U,<−20 · Q j G (2) 0 ) L 1 (L 2 ) t
k − j
2 2
x
(2)
∂t Q > j−10 P[−10,10] U,<−20 L 2 (L 2 ) Q j G 0 L 2 (L 2 ) , t
x
t
x
0,− 12
which is sufficient to place the second factor in X 1 . Step 6.B (High frequencies; k > 20). Here we may further localize the transformation matrix to P[k −5,k +5] U,<−20 . Therefore, we again have access to (33). In this case we may proceed exactly as above, using at each step the same estimates, which in every case suffice due to the exponential decay in (33) for large frequencies. Step 7 (Proof of the bound (35)). Here we establish the estimate: Pk (U,k1 · ψk2 ) N F 2−|k−k2 | 2−δ(k2 −k1 ) ck1 ψ S , k1 < k2 − 10.
(253)
We use the expansion (242) for U,
We prove the bound (253) separately for each of these in reverse order. Without loss of generality we will assume that ψk2 S = 1. Step 7A (Estimating the term G 3 ). We directly have from (229) and (34) the product estimate: Pk (U,k2 −10 U,
G 11 L 1 (L 2 ) U,
t
x
x
t
t
x
x
The second term G 12 can have both high and low frequency outputs. When the output is in the range k < k2 + 10 we use (248) and Bernstein’s inequality to bound it as follows: ∞ ψk2 L ∞ (L ∞ ) Pk G 12 L 1 (L 2 ) 2k P>k2 −10 U,
t
x
k −k1 −C(k2 −k1 )
F 2 2
2
x
ck1 ,
which suffices to show (253) in this case. When G 12 has output in the high range k > k2 + 10 we have further high frequency localization of the first factor and we may estimate via the same procedure: ∞ ψk2 L ∞ (L ∞ ) , Pk G 12 L 1 (L 2 ) 2k P[k−5,k+5] U,
t
x
x
$\lesssim_F 2^{k}2^{-k_1}2^{-C(k-k_1)}c_{k_1}$, which is again sufficient to show (253) in this case. This concludes our demonstration of Proposition 3.1.

9. The Linear Paradifferential Flow

We now proceed with the proof of Proposition 3.2. The main difficulty here is that we do not necessarily have smallness of the constant from line (40), which would otherwise make estimate (41) a direct consequence of Propositions 2.3 and 3.1. Instead of proceeding directly, we shall follow a more measured approach of building up our estimate piece by piece. Since this is a lengthy argument, we begin with a brief outline. The first step of the proof is to take advantage of the antisymmetry of $A_\alpha$, which makes our paradifferential equation almost conservative. Precisely, the only nontrivial contributions to energy estimates arise from terms where one derivative falls on the
coefficients. But such terms are small due to the large frequency gap $m$. Consequently, we are able to prove a favorable estimate:
$$\|\psi_k\|_{E[I]} \lesssim_F \|\psi_k[0]\|_{\dot H^1\times L^2} + 2^{\delta m}\|G_k\|_{N[I]} + 2^{-\delta m}\|\psi_k\|_{S[I]}, \tag{254}$$
for the energy (28) on both time slices and characteristic surfaces. We still need an estimate on the S norm of ψk , for which we renormalize Eq. (38) using an orthogonal gauge transformation U,
w,k = R,
(255)
In the analysis of the small data problem in [29,33] one uses a perturbative bound of the form: per t
R,
ψk S[I ] F w,k S[I ] , Pk (U,
(258)
Summing up the estimates on the last four lines we obtain the $S$ bound for $\psi_k$:
$$\|\psi_k\|_{S[I]} \lesssim_F \|\psi_k[0]\|_{\dot H^1\times L^2} + \|G_k\|_{N[I]} + 2^{-\delta_1m_0}\|\psi_k\|_{S[I]} + 2^{2m_0}\|\psi_k\|_{E[I]}. \tag{259}$$
Now all we have to do is combine this with (254), carefully balancing the constants. Assuming that $m_0 = m_0(F)$ for a large enough $m_0(F)\sim\ln(F)$, the third term on the right can be absorbed on the left to obtain:
$$\|\psi_k\|_{S[I]} \lesssim_F \|\psi_k[0]\|_{\dot H^1\times L^2} + \|G_k\|_{N[I]} + \|\psi_k\|_{E[I]}.$$
Substituting (254) for the third term on the RHS of this last line we arrive at:
$$\|\psi_k\|_{S[I]} \lesssim_F \|\psi_k[0]\|_{\dot H^1\times L^2} + 2^{\delta m}\|G_k\|_{S[I]} + 2^{-\delta m}\|\psi_k\|_{S[I]},$$
so now assuming $m > m(F)$ for a larger $m(F)\sim\ln(F)$, the last term on the RHS is again absorbed on the left:
$$\|\psi_k\|_{S[I]} \lesssim_F \|\psi_k[0]\|_{\dot H^1\times L^2} + \|G_k\|_{S[I]}.$$
To conclude the proof of (41) we need to improve the $S$ bound above to a $W$ bound. Returning to $w_{,k}$, we have the estimate:
$$\|P_j\Box w_{,k}\|_{N[I]} + \|P_jw_{,k}[0]\|_{\dot H^1\times L^2} \lesssim_F 2^{-|j-k|}\big(\|\psi_k[0]\|_{\dot H^1\times L^2} + \|G_k\|_{S[I]}\big).$$
This follows from (256), (258), and the second member on line (257). It remains to prove the two main estimates above, namely (254) and (256). In the proof we shall make use of three auxiliary lemmas whose proofs we postpone until the end of this section. The first one is used to estimate perturbative expressions which are small due to the large frequency gap $m$.

Lemma 9.1 (Some auxiliary estimates). Let $A_\alpha$ be the connection one-form defined on line (39) above with estimates (40). Then the following bounds hold:
$$\|A_\alpha\|_{L^\infty_t(L^\infty_x)} \lesssim_F 2^{k-m}, \tag{260}$$
$$\|A_\alpha\psi_k\|_{DS[I]} \lesssim_F 2^{-m}\|\psi_k\|_{S[I]}. \tag{261}$$
Also, for three test functions φ (i) normalized with S ∩ E[I ] size one, the following list of multilinear estimates holds: (2) (3) φ (1) ∂ α φ
(2)
∇t,x φ
(1) (2) (3) ∇t,x φ
· ψk N [I ] 2
−δm k
2 ψk S[I ] .
(262) (263) (264) (265)
In proving energy estimates we need to restrict integration to half-spaces. This is where the next lemma comes in handy:

Lemma 9.2 (Half-space duality estimate). Let $\psi_k\in S$ and $H_k\in N$ be frequency localized functions. Then for any time-slab $I$, any unit vector $\omega$, and any spatial point $x_0\in\mathbb{R}^2$ the following truncated duality estimate holds uniformly:
$$\Big|\iint_{I\cap\{t>\omega\cdot(x-x_0)\}} H_k\cdot\psi_k\,dxdt\Big| \lesssim \|H_k\|_{N[I]}\cdot\|\psi_k\|_{DS[I]}. \tag{266}$$
Finally, for the bulk of the estimate (256) we need the following lemma, which improves upon the trilinear bound (25) in the case of balanced low frequencies $|k_1-k_2|\ll m$, $k_1,k_2<k_3-m$:

Lemma 9.3 (An improved trilinear estimate). There exists a universal constant $C>0$ such that for any integer $m\geq0$ and $S[I]$ unit normalized test functions $\phi^{(i)}_{k_i}$ with $k_i\leq k-m$, and $\psi_k$ any additional test function defined on $I$, one has the following imbalanced trilinear estimate:
$$\|\phi^{(1)}_{k_1}\partial_\alpha\phi^{(2)}_{k_2}\partial^\alpha\psi_k\|_{N[I]} \lesssim 2^{C|k_1-k_2|}\big(2^{-\delta^2m}\|\psi_k\|_{S[I]} + 2^{m}\|\psi_k\|_{E[I]}\big). \tag{267}$$
Assuming these estimates, we give a proof of (254) and (256) in a series of steps. To close the argument properly, we will employ our chain of small constants (8) (although their use here is independent of their use in other sections). Step 1 (A-priori control of the energy norm of ψk : proof of (254)). We begin by writing Eq. (38) for ψk on the interval I in a covariant form: A ψk = Ren
(268)
where A = (∂ + A)α (∂ + A)α is the gauge covariant wave equation with the connection Aα is given by the formula on line (39) and the function Ren
α α α Ren
∇ α Q αβ [ψk ] = ( A ψk )† A∇ β ψk + (Fγβ ψk )† A∇ ψk ,
(270)
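where $F_{\alpha\beta} = \partial_\alpha A_\beta - \partial_\beta A_\alpha + [A_\alpha, A_\beta]$ is the curvature of $A_\alpha$. Next, we form the linear momentum one-form $P_\alpha[\psi_k] = Q_{\alpha0}[\psi_k]$. Integrating $\nabla^\alpha P_\alpha[\psi_k] = \nabla^\alpha Q_{\alpha0}[\psi_k]$ over all possible half spaces of the form $[0,t_0]\cap\{t>\omega\cdot(x-x_0)\}$ we have the bound:
$$\|{}^A\nabla_{t,x}\psi_k\|^2_{L^\infty_t(L^2_x)[I]} + \sup_\omega\|{}^A{\not\!\nabla}_{t,x}\psi_k\|^2_{L^\infty_{t_\omega}(L^2_{x_\omega})[I]} \lesssim \|{}^A\nabla_{x,t}\psi_k(0)\|^2_{L^2_x} + I_1 + I_2, \tag{271}$$
where
$$I_1 = \sup_{I,\omega,x_0}\Big|\iint_{I\cap\{t>\omega\cdot(x-x_0)\}} (\Box_A\psi_k)^\dagger\,{}^A\nabla_0\psi_k\,dxdt\Big|,$$
$$I_2 = \sup_{I,\omega,x_0}\Big|\iint_{I\cap\{t>\omega\cdot(x-x_0)\}} (F_{0\gamma}\psi_k)^\dagger\,{}^A\nabla^\gamma\psi_k\,dxdt\Big|.$$
Our task is to estimate $I_1$ and $I_2$ and to show that we can replace covariant differentiation by regular differentiation in (271). For the right hand side of (271) we claim that both:
$$\|{}^A\nabla_{x,t}\psi_k(0)\|^2_{L^2_x} \lesssim_F \|\nabla_{x,t}\psi_k(0)\|^2_{L^2_x}, \tag{272}$$
$$I_1 + I_2 \lesssim_F 2^{-2\delta m}\|\psi_k\|^2_{S[I]} + 2^{2\delta m}\|G_k\|^2_{S[I]}. \tag{273}$$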
The proof of (272) is an immediate consequence of expanding the covariant derivative ${}^A\nabla$ and using the triangle inequality, followed by the $L^\infty_t(L^\infty_x)$ bound for $A_\alpha$ in (260). To obtain (273), we use the half-space duality estimate (266) and Young's inequality for the term involving $G_k$. For the other terms, we again use half-space duality, and then conclude with an application of the estimates (157), (261)–(265). It suffices to establish the bounds:
−δm Ren ψk S[I ] , F0γ · A∇ ψk N [I ] F 2−δm 2k ψk S[I ] .
The first estimate above follows from applying (262)–(263) to each of the terms in Ren
ω
ω )[I ]
∞ F ψk L ∞ ∇t,x ψk L ∞ . 2 t (L x )[I ] t (L x )[I ]
The first of these follows immediately from the bound (260), while the second uses the characteristic energy estimates we are assuming for φ. We remark that using the first bound above requires m to be large enough, i.e. 2m F 1. Step 2 (The S bound for ψ: Proof of (256)). The first thing we need to do is to rewrite Eq. (38) in a gauged formulation (we have no further use for (268)). As usual, we write: α Aα
where the RHS is given by the integrated terms: a b B
k
The connection ∂ α B
(274)
The second term on the right is easy to estimate using (35), which yields:
Pk (U,
(275)
It remains to estimate the first term in R,
Pk U,
α α Pk U,
α Pk U,
R L 1 (L 1 )[I ] F 2−2k 2−Cm , P j R L 1 (L 1 )[I ] F 2−2k 2−C( j−k) 2−Cm . t
t
x
x
These estimates are a consequence of the improved estimate (33), and the fact that at α integral is localized to P least one of the gauge factors in the C α − Clow >k−10 U,
(277)
We’ll do this separately for each of the two terms on the left. Step 2B.1 (Estimation of D α term). The plan is to use Lemma 9.3. To do this we need to separate the connection D α into two pieces, one with essentially matched frequencies α +D α , where and one with wide frequency separation. We write D α = D(δ) α a b Scb D(δ) = (φ) − Sca (φ) ∂ α φkc dk [k −10,k +cδ 2 m 0 ] k
k
Here $c\ll1$ is an additional small constant. By a direct application of estimate (267) we have:
$$\|D^\alpha_{(\delta)}\partial_\alpha\psi_k\|_{N[I]} \lesssim_F 2^{Cc\delta^2m_0}\big(2^{-\delta^2m_0}\|\psi_k\|_{S[I]} + 2^{m_0}\|\psi_k\|_{E[I]}\big). \tag{278}$$
For $c$ small enough in relation to $C$ we have (277) for this term. The remainder term is in the range where the standard trilinear estimate (25) gives additional savings. A quick computation shows that for this term we in fact have:
$$\|(D^\alpha - D^\alpha_{(\delta)})\partial_\alpha\psi_k\|_{N[I]} \lesssim_F 2^{-c\delta^3m_0}\|\psi_k\|_{S[I]}. \tag{279}$$
The details of the dyadic summation are left to the reader.
α term). We follow the same strategy as in the previous Step 2B.2 (Estimation of Clow α α +C α , where argument. We split Clow = C(δ) α C(δ)
=
k−m
−∞
† α Bk , P
The factors P
Estimate (263) is a more elaborate use of such summations, but it is standard and left to the reader. Consider now (264). For modulations at most comparable to the frequencies in the first factor we can replace the time derivative with a frequency factor and prove the estimate (264) by summing over (25). The relevant detail is that one has the dyadic sum: 2k2 2−δ(k2 −k1 )+ 2−m 2k . k1 ,k2 : ki
It remains to bound the expression when the first factor is at high modulation. In this case we take a product of the two bounds: (1)
1
∇t,x Q |ξ | |τ | φ
the first of which follows from summation over (164) and the second of which follows from summation over (23). The estimate (265) follows from similar reasoning and is left to the reader. Proof of Lemma 9.2. The bound we seek is scale invariant, so without loss of generality we may assume that k = 0, and we may rotate and center the estimate so that ω = (1, 0) and x0 = (0, 0). In light of (161) we see that the main point of (266) is to be able to drop half space cutoffs of the form χt<0 and χt<x 1 . The required boundedness of cutoffs with discontinuities across space-like hypersurfaces was already shown in (156). Therefore,
we seek an analog of (156) in the null case. Due to the frequency localization of both factors on the LHS of (266), it suffices to prove the following product estimate:
$$\|P_0(\chi_{t<x_1}\cdot\psi_0)\|_{DS} \lesssim \|\psi_0\|_{DS}. \tag{280}$$
To save notation we will write $\chi = \chi_{t<x_1}$. Our point of view will be to observe that $\chi$ is a singular solution to the wave equation, so one can hope that (280) is in some sense a version of the standard product estimate (19). While this is true, the demonstration requires a bit of care because the PW norm of $P_k(\chi)$ does not gain the usual weight from $L^1$ summation over angles, even though its Fourier support is well localized in the angular variable. In fact, a quick calculation shows that:
$$\widehat{\chi}(\tau,\xi) = \begin{cases} \dfrac{c_+}{\xi_1}\,\delta(\tau+|\xi|)\,\delta(\xi_2), & \xi_1>0;\\[2mm] \dfrac{c_-}{\xi_1}\,\delta(\tau-|\xi|)\,\delta(\xi_2), & \xi_1<0.\end{cases}$$
Here $c_\pm$ are appropriate constants depending on the ones in the definition of the Fourier transform. The above formulas show that the $(+)$ wave portion of $\chi$ is a measure concentrated on the ray $(1,-1,0)$, and opposite for the $(-)$ wave portion. We have the frequency localized PW type bound:
$$\|Q^\pm P_k\chi\|_{L^2_{t_{(1,0)}}(L^\infty_{x_{(1,0)}})} \lesssim 2^{-\frac12k}. \tag{281}$$
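The formula for $\widehat{\chi}$ quoted above can be checked by a direct change of variables. The following is only a sketch, under the assumed convention $\widehat{f}(\tau,\xi)=\int e^{-i(t\tau+x\cdot\xi)}f(t,x)\,dt\,dx$ (the paper's normalization is not restated here; a different one only changes the constants $c_\pm$), and with $H$ denoting the Heaviside function, a name introduced for this illustration.

```latex
% Assumed convention: \hat f(\tau,\xi) = \int e^{-i(t\tau + x\cdot\xi)} f(t,x)\,dt\,dx.
% Write \chi(t,x) = H(x_1 - t) and substitute u = x_1 - t, v = t (unit Jacobian).
\begin{align*}
t\tau + x_1\xi_1 &= v(\tau+\xi_1) + u\,\xi_1, \\
\widehat{\chi}(\tau,\xi)
 &= \Big(\int e^{-iv(\tau+\xi_1)}\,dv\Big)
    \Big(\int e^{-ix_2\xi_2}\,dx_2\Big)
    \Big(\int H(u)\,e^{-iu\xi_1}\,du\Big) \\
 &= (2\pi)^2\,\delta(\tau+\xi_1)\,\delta(\xi_2)\,\widehat{H}(\xi_1),
 \qquad \widehat{H}(\xi_1) = \frac{1}{i\xi_1}\ \text{ for } \xi_1\neq 0 .
\end{align*}
% On the support \xi_2 = 0 one has |\xi| = |\xi_1|, hence
% \tau + \xi_1 = \tau + |\xi| when \xi_1 > 0, and \tau + \xi_1 = \tau - |\xi| when \xi_1 < 0.
```

Away from $\xi_1 = 0$ this reproduces the two cases of the displayed formula, with $c_\pm$ absorbing the factor $(2\pi)^2/i$ under the convention assumed here.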
Finally, note that due to the frequency localization in (280), we may replace the cutoff with $Q_{<10}P_{<10}(\chi)$. Also, if $\psi_0$ is at high modulation ($\geq 10$) then $P_0(\chi_{t<x_1}\cdot\psi_0)$ is at comparable modulation, therefore (280) is immediate due to the $L^\infty$ estimate for $\chi$. We now proceed to prove (280) in a series of steps:

Step 1 (Controlling the Strichartz norms). Due to the boundedness of $\chi$, we easily have:
$$\|P_0(P_{<10}\chi\cdot\psi_0)\|_{L^q_t(L^r_x)} \lesssim \|\psi_0\|_{DS}.$$

Step 2 (Controlling the $X^{s,b}$ norm). Our first order of business is to bound the $X_\infty^{0,\frac12}$ part of the norm (151). Freezing the outer modulation, our goal is to show that:
$$\|Q_jP_0(P_{<10}\chi\cdot\psi_0)\|_{L^2_t(L^2_x)} \lesssim 2^{-\frac{j}{2}}\|\psi_0\|_{DS}. \tag{282}$$
We now split into subcases.

Step 2.A (Output far from cone). In this step we consider the contribution of output modulations $j > 20$. In this case, we may further localize the product to $Q_jP_0(P_{<10}\chi\cdot Q_{j+O(1)}\psi_0)$. Estimate (282) follows immediately from $L^\infty$ control of $\chi$.

Step 2.B ($\chi$ at low frequency ($\leq j-10$)). In this case $\psi_0$ must be at modulation $2^j$, therefore we consider the contribution of the expression $Q_jP_0(P_{<j-10}\chi\cdot Q_{j+O(1)}\psi_0)$. Then (282) is immediate from the $L^\infty$ control of $\chi$.

Step 2.C ($\chi$ at medium frequency, $\psi$ at larger modulation). In this case we consider the contribution of the term $Q_jP_0(P_{[j-10,10]}\chi\cdot Q_{\geq j-20}\psi_0)$. Again, only the boundedness of $\chi$ is used.

Step 2.D ($\chi$ at medium frequency, $\psi$ at low modulation). The contribution of $Q_jP_0(P_{[j-10,10]}\chi\cdot Q_{<j-20}\psi_0)$ is considered here. This is the main term. Without loss of generality, we may assume that we are in a $(++)$ interaction, which we decompose into all possible angular sectors of cap size $|\kappa|\sim2^{\frac12j-10}$, respectively $|\kappa'|\sim2^{\frac12j}$:
$$Q_jP_0(Q^+P_{[j-10,10]}\chi\cdot Q^+_{<j-20}\psi_0) = \sum_{j-10\leq k<10}\ \sum_{\kappa,\kappa'} Q_jP_{0,\kappa}\big(Q^+P_k(\chi)\cdot P_{0,\kappa'}Q^+_{<j-20}\psi_0\big).$$
The main difficulty here is that we cannot really sum over $k$, because $\chi$ is only in an $\ell^\infty$ type Besov space. However, using Lemma 11 of [29] we see that the above sum is both essentially diagonal in $\kappa,\kappa'$, and essentially frequency disjoint in its contribution of angles for each fixed $k$. Precisely, two sectors $\kappa,\kappa'$ and a frequency $k$ can provide nonzero output if and only if:
$$\mathrm{dist}(\kappa,\kappa')\sim2^{\frac{j}{2}-10},\qquad \mathrm{dist}(\kappa,(-1,0))\sim2^{\frac{j-k}{2}}.$$
In particular the sector κ centered at (1, 0) does not yield any output. Taking this into account we may bound: Q j P0 (Q + P j−10·<10 χ · Q +< j−20 ψ0 ) 2L 2 (L 2 ) t
10
k= j−10
j−k
k= j−10
2
∞ t(1,0) (L x(1,0) )
j−k 2
k= j−10 −j
t
Q + Pk (χ ) 2L 2
dist(κ,(−1,0))∼2
10
P0,κ Q + Pk (χ ) · P0,κ Q +< j−20 ψ0 2L 2 (L 2 ) x
dist(κ,(−1,0))∼2 2 j dist(κ,κ )∼2 2 −10
10
x
2−k j−k 2
sup dist(ω,κ)∼2
dist(κ,(−1,0))∼2 P0,κ Q +< j−20 ψ0 2S[0,κ] .
j−k 2
P0,κ Q +< j−20 ψ0 2L ∞
2 t(1,0) (L x(1,0) )
P0,κ Q +< j−20 ψ0 2L ∞ (L 2 tω
xω )
κ
From the definition of the $S$ norm (147), this suffices to prove (282).

Step 3 (Controlling the square sum of $S[0,\kappa]$ norms). Again freezing $j<-10$ we need to demonstrate that:
$$\sup_{\pm}\sum_{\kappa}\|Q^{\pm}_{<j}P_{0,\pm\kappa}(P_{<10}\chi\cdot\psi_0)\|^2_{S[0,\kappa]} \lesssim \|\psi_0\|^2_S, \tag{283}$$
where angular sector size is $|\kappa|\sim2^{\frac12j}$. The subcases repeat Step 2 above with little difference, and are mostly left to the reader:

Step 3.A ($\chi$ at low frequency). This is the contribution of the expression $Q_{<j}P_0(P_{<j-10}\chi\cdot Q_{<j+O(1)}\psi_0)$. In this case (283) is immediate from the $L^\infty$ control of $\chi$.

Step 3.B ($\chi$ at medium frequency, $\psi$ at larger modulation). As before, this is the term $Q_{<j}P_0(P_{[j-10,10]}\chi\cdot Q_{\geq j-20}\psi_0)$ for which we have a stronger $L^2$ bound:
$$\|Q_{<j}P_0(P_{[j-10,10]}\chi\cdot Q_{\geq j-20}\psi_0)\|_{L^2_tL^2_x} \lesssim 2^{-\frac12j}\|\psi_0\|_S.$$
Step 3.C ($\chi$ at medium frequency, $\psi$ at low modulation). Here we consider the contribution of $Q_{<j}P_0(P_{[j-10,10]}\chi\cdot Q_{<j-20}\psi_0)$. This is again the main term. Without loss of generality we may assume that we are in a $(++)$ interaction in terms of output and $\psi_0$ modulation (in particular, from the estimate in Step 2 above we may dispense with the case $(+)$ output and $(-)$ input from $\psi_0$), and we again use Lemma 11 of [29] to decompose into a diagonal sum over caps of size $|\kappa|\sim2^{\frac12j-C}$, respectively $|\kappa'|\sim2^{\frac12j}$:
$$Q^+_{<j}P_0(Q^+P_{[j-C,C]}\chi\cdot Q^+_{<j-C}\psi_0) = \sum_{\mathrm{dist}(\kappa,\kappa')\sim2^{\frac{j}{2}}} Q^+_{<j}P_{0,\kappa}\big(P_{[j-C,C]}(\chi)\cdot P_{0,\kappa'}Q^+_{<j-C}\psi_0\big).$$
Notice that we do not need to frequency localize the factor P[ j−10,10] χ to obtain this diagonally, which is a good thing because the rougher bounds on the output modulation and that of ψ0 do not win us disjoint angular contributions in the k-sum of Pk χ . Plugging the above decomposition into the LHS side of estimate, (283) the RHS bound follows at once from L ∞ control of χ . Proof of Lemma 9.3. We begin by extending ψk via the universal extension in Proposition 5.5 in such a way that we simultaneously maintain the E and S norm control. (i) The functions φki are similarly extended. Thus, it suffices to prove the bound on all of space-time. The constant C will be fixed in the proof in just a moment. Let m 0 be any fixed integer. Without loss of generality, we may assume that k = 0. Furthermore, we may also assume that |k1 − k2 | < δm, for otherwise the estimate follows immediately from an application of the standard trilinear bound (25), and taking C > 1 on the RHS of (267). The proof will be accomplished in a series of steps: Step 1 (Reduction estimate). In this step we consider the contribution of to a bilinear (1) (2) α φk1 Q k2 −δm ∂α φk2 ∂ α ψ0 . We peel off the factor φk1 from the trilinear estimate via the bound (21). It remains to prove the bilinear bound 2 (2) Q >k2 −δm ∂α φk2 ∂ α ψ0 N F 2−δ m ψk S[I ] + 2m ψk E[I ] .
(284)
Step 2 (ψ0 is far from the cone). In this step we consider the contribution of Q k2 −δm 1 α ψ . We will prove that the remaining null-form is an X 0,− 2 atom. In ∂α φk(2) Q ∂ 0 0 1 2 the present case, we freeze the output modulation j and then estimate: (2) (2) ∞ · Q 0 ∇t,x ψ0 2 2 , Q j ∂α φk2 Q 0 ∂ α ψ0 L 2 (L 2 ) ∇t,x φk2 L ∞ L (L ) t (L x ) t
t
x
2
k2
(2) φk 2 S
· ψ0 S .
x
Multiplying both sides of this bound by 2− 2 and then summing over all dyadic j k2 − δm we arrive at: k2 α Q k2 −δm ∂α φk(2) N 2 2 +δm φk(2) Q ∂ ψ S · ψ0 S , 0 0 2 2 which suffices due to the condition k2 < −m. (2) Step 3 (φk2 is far from the cone). In this step we consider the contribution of (2) Q k2 −δm Q >k2 −8δm ∂α φk2 Q <0 ∂ α ψ0 . In this case, we again freeze the output modulation j and proceed to bound: (2) Q j Q >k2 −8δm ∂α φk2 Q <0 ∂ α ψ0 L 2 (L 2 ) t
(2) Q >k2 −8δm ∇t,x φk2 L 2 (L ∞ ) t x
x
· ∇t,x ψ0 L ∞ 2 . t (L x )
(285)
By summing over all j > k2 − 8δm in estimate (164) we have that: L 2 (L ∞ ) 2 2 k2 +4δm φk(2) S . Q >k2 −8δm ∇t,x φk(2) 2 2 1
t
x
1
Substituting this into the RHS of (285), multiplying the result by 2− 2 j , and then summing over all j k2 − δm we have the estimate: (2) (2) Q k2 −δm Q >k2 −8δm ∂α φk2 Q <0 ∂ α ψ0 N 25δm φk2 S · ψ0 E . Step 4 (The core contribution). In this step we consider the contribution of the expression (2) α Q k2 −δm Q
sition into angular sectors of cap size |κ| ∼ 2−4δm . Without loss of generality we may assume we are in the (++) configuration. The other cases (−−), (+−), and (−+) are the same with only minor modifications and are therefore left to the reader. We break the entire contribution into a Q k2 −δm localized sum of two principle terms T1 and T2 , where (2) T1 = Q +
T2 =
κ∈K l
(2)
Q +
To help state the estimates, we introduce the following weaker version of the NFA∗ portion of the S[k, κ] norm from line (150): ψ Sk := sup Pk ψ S[k,κ] , l>10 κ∈K l
where ± k ψ 2 . S[k,κ] := sup sup 2 |κ| · Q
ω
ω
Notice that we do not use the more eccentric Q j multipliers for j < k − 10 in this definition, and there is no square-summing over angles. The reason this notation is useful
is that we have the relation: ψk S ψk E . This is shown through an application of the estimate: Q± 2
ω)
2−k |κ|−1 · ∇ / t,x ψk L ∞ 2 . t (L x ) ω
ω
(286)
Such an inequality may be proved by decomposing the multiplier Q ± 2 j >|κ| Q +
ω)
ω
2−k |κ j |−1 · ∇ / t,x ψk L ∞ 2 , t (L x ) ω
ω
which is an easy consequence of the fact that the kernels associated to the operators: + L = 2k |κ j |/∇ −1 t,x Q
are uniformly in L 1t (L 1x ). The estimate (286) now follows from simply summing over this last bound overall all dyadic 1 < |κ j |−1 < |κ|−1 . Returning to the main thread, we first bound the term T1 above. In this case, we are going to lose a large constant because the sum is not well localized in the second factor and therefore we cannot use orthogonality with respect to κ. Furthermore, we will not bother to gain anything from the null-structure, because the frequency localization of this term eliminates parallel interactions. To compensate for the large number of nonorthogonal sectors, we may use the S norm for the second factor. Using the product estimate (167) we may bound: 1 1 (2) Q j T1 L 2 (L 2 ) |κ| 2 2− 2 k2 Q +
x
κ∈K l
· sup (I − P0,2κ )Q +<0 ∂ α ψ0 L ∞ 2 t (L x ω
ω∈κ
2
1 2 k2
ω)
(2)
φk2 S · 24δm ψ0 S. 1
Multiplying both sides of this last estimate by the factor 2− 2 j and then summing over all j > k2 − δm we have: (2)
Q >k2 −δm T1 N 25δm φk2 S · ψ0 S, which is sufficient. 0, 1 Our final task here is to bound the term Q >k2 −δm T2 in the space X 1 2 . Notice that because of the angular and (++) localization, as well as the fact that j > k2 − δm, for each Q j T2 we may freely insert the multiplier Q > j−10 in front of the second factor, because the complement vanishes (see Lemma 11 of [29]). In this case the resulting sum is both diagonal and orthogonal in κ, so freezing Q j T2 we have with the aid of Bernstein’s inequality (9) the estimate: Q j T2 2L 2 (L 2 ) Q +
x
κ∈K l
t
x
2 1 (2) 2k2 −2δm φk2 S · 2− 2 j ψ0 S .
t
x
Multiplying the square root of this inequality by the factor $2^{-\frac12j}$ and then summing over all $j>k_2-\delta m$ we finally have:
$$\|Q_{>k_2-\delta m}T_2\|_N \lesssim 2^{-\delta m}\|\phi^{(2)}_{k_2}\|_S\cdot\|\psi_0\|_S.$$
This concludes our proof of estimate (267).
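The modulation sums used in the steps above are plain geometric series. As a sketch of the last one, write $a = k_2-\delta m$ and treat $a$ as an integer (an assumption made here only to keep the arithmetic clean):

```latex
\begin{align*}
\sum_{j>a} 2^{-j} &= 2^{-a}\sum_{i\geq 1}2^{-i} = 2^{-a}, \\
\sum_{j>a} 2^{-\frac12 j} &= \frac{2^{-\frac12(a+1)}}{1-2^{-\frac12}} = (\sqrt{2}+1)\,2^{-\frac12 a}.
\end{align*}
% Illustration with a = k_2 - \delta m: if each dyadic piece is bounded by
% 2^{k_2 - 2\delta m} 2^{-j} (our reading of the exponents in the T_2 estimate), then
% 2^{k_2-2\delta m}\sum_{j>a}2^{-j} = 2^{k_2-2\delta m}\cdot 2^{-(k_2-\delta m)} = 2^{-\delta m}.
```

The first identity is the one relevant once the factor $2^{-\frac12 j}$ has been multiplied against a bound that already carries $2^{-\frac12 j}$. The second applies when the frozen-modulation bound is independent of $j$.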
10. Structure of Finite S Norm Wave-Maps and Energy Dispersion In this section we prove Proposition 3.9. There is almost nothing to do for (63). The X bound follows from the reduced version of (51) in Remark 3.8, while the E bound follows from energy estimates on null surfaces.
10.1. Renormalization. Here we establish the renormalization bound (64). Our starting point is the construction of the renormalization matrix U in Proposition 3.1. The frequency localized wave-map equation for φ is given by:
$$\Box\phi_k = -P_k\big(S(\phi)\,\partial^\alpha\phi\,\partial_\alpha\phi\big). \tag{287}$$
For each index $m$ the RHS of this expression can be written in terms of the trilinear form $T^m_k$ from line (53) as follows:
α m S(φ), ∂ α φ, ∂α φ . φ∂α φk + T1;k Pk S(φ)∂ α φ∂α φ = 2S(φ)
φk = −2 Aα
m Ti;k ,
(288)
where the $T^m_{i;k}$ are trilinear forms as on line (53) with $O(m)$ gap indices. By an application of estimates (42) and (61) with $m = 20$ we have the bound:
$$|||\phi_k|||_{W[I]} \lesssim_F c_k,$$
where $\{c_k\}$ is some $S[I]$ frequency envelope for $\phi_k$. This proves (64).
10.2. Partial fungibility of the S norm. Here we prove that there is always a decomposition of intervals $I = \cup_{i,l}^{K(F)} I_{il}$, where $K(F)$ is some polynomial in the $S[I]$ norm of $\phi$, and where (65) holds in each subinterval. Our starting point is the series of frequency localized equations (288). For a fixed $\phi_k$ we use (288) with $m = 20$. As in the previous section, we can find a renormalization w,k = U,
$$\|P_{k'}\Box w_{,k}\|_{N[I]} \lesssim_F 2^{-|k-k'|}c_k. \tag{289}$$
Let $\eta\ll1$ be chosen later. By the fungibility property (159) (and continuity) there exists a polynomial $K_1$ in $F\eta^{-1}$ and a decomposition $I = \cup_i^{K_1}I_i$ such that:
$$\|P_{k'}\Box w_{,k}\|_{N[I_i]} \lesssim 2^{-\frac12|k-k'|}\eta\,c^i_k,$$
where $\{c^i_k\}$ are now some unit normalized frequency envelopes which may depend on the interval $I_i$. We label each time interval as $I_i = [t_i,t_{i+1}]$, and on each of these time slabs we write $w_{,k} = w^{free}_{,k} + w^{source}_{,k}$, where $w^{free}_{,k}$ is a free wave with data $w_{,k}[t_i]$. By the previous line and the energy estimate (18) we have on $I_i$ the bound:
$$\|P_{k'}w^{source}_{,k}\|_{S[I_i]} \lesssim 2^{-\frac12|k-k'|}\eta\,c^i_k. \tag{290}$$
† sour ce of φ we obtain: Consequently, for the corresponding part U,
By choosing η as the reciprocal of an appropriate polynomial in F, we have: † sour ce U,
(291) f r ee
† It remains to bound the free wave contribution U,
$$\|P_{k'}w^{free}_{,k}[t_i]\|_{\dot H^1\times L^2} \lesssim_E 2^{-|k-k'|}c_k,$$
uniformly with respect to $i$, where we may choose the unit frequency envelope $\{c_k\}$ to be the same as on line (289) above. In particular, we have the uniform control:
$$\|P_{k'}w^{free}_{,k}\|_{S[I_i]} \lesssim_E 2^{-|k-k'|}c_k. \tag{292}$$
Now we turn our attention to the U,
(293)
Such indices j are called “good j’s”; the remainder (of which we have at most a polynomial in m F) are called “bad j’s”. We also introduce the corresponding parts f r ee † of U,
† φk ( j) = P[ j−m, j+m] U,
.
The goal of the argument is now to choose a polynomial in m F collection of subintervals Iil , partitioning the Ii , such that on each there is uniform control over all k and j: 1
Pk φk ( j) S[Iil ] E 2− 2 δ|k−k | ck , jl
(294) jl
for some additional set of unit normalized frequency envelopes {ck }. For good j’s this is straightforward in view of (292) and (293). Since there are m F bad j’s, it suffices to consider such a fixed bad j. The equation for each fixed φk ( j) is: f r ee
† φk ( j) = P[ j−m, j+m] U,
† + 2∂ α P[ j−m, j+m] U,
f r ee
.
Therefore, by a direct application of the estimates (292), (32), (24), and (33)–(35) we have on all of Ii the bound:
Pk φk ( j) N [Ii ] m F 2−δ|k−k | ck , and from the energy norm control giving (292) and estimate (36) we also have the uniform energy control:
Pk φk ( j)[t] H˙ 1 ×L 2 E 2−|k−k | ck . Thus, by again using the property (159) we obtain the desired partition {Iil } of I , with estimate (294) uniformly, at a cost of at most m F 1 subdivisions. f r ee † w,k on each subinterval J = Iil , To conclude the proof we need to estimate U,k+m U,
For the high frequency part we use (33) in conjunction with the product bounds (19)–(20) to obtain: f r ee
† Pk (P>k+m U,
) S[I ] F 2−Cm 2−|k −k| ck ,
which suffices provided m is large enough, m ∼ ln F. For the medium frequency part we can use directly (294) with j = k. Thus, we are reduced to providing good S[J ] f r ee † norm bounds for the quantities P
Step 2 (X ∞2 norm control). Fix a modulation Q j . Without loss of generality we will assume that j < k, as the complimentary region is easier to treat using the high modulation bounds in (32) and (33). We decompose as follows: ! Q j φk ( j) + Q j R,k , j < k − 2m; f r ee † = Q j P k − 2m, where R,k =
!
f r ee
f r ee
† † P< j−m U,
, j < k − 2m; j > k − 2m.
By estimate (294) we already control the first terms on the RHS of (295), so we only need to bound the contribution of Q j Pk R,k . This is given by the following analog of Lemma 8.3: Lemma 10.1. Let j < k − 10 and m > 10 be an integer. Then the following estimates hold for test functions u = u
−δm ∞ +2 Q j (u < j−m φk ) 1, 1 +Q < j (u < j−m φk ) S[k; j] u L ∞ u X φk S , t (L x ) X ∞2
Q j (P> j+m u
1, 1 X ∞2
+Q < j (P> j+m u
Proof. The proof of the first bound is immediate from (168) and the product bounds (19)–(20) in conjunction with the following easy estimate for very high modulations: 1
Q > j− 1 m u < j−m S 2− 4 m u X . 2
The second estimate is just a summed version of (169) which also incorporates (162). Using a combination of the estimates in this last lemma, and (292), we have:
Q j Pk R,k 0, 1 E 1 + 2−δm Q 2 (F) ck , X ∞2
which suffices. Step 3 (S[k; j] norm control). This is immediate from the decomposition (295), the estimate (294), and Lemma 10.1. 10.3. The role of the energy dispersion. By applying estimate (64) and then using (51) on Eq. (287) we have (66). Suppose now that {ck } is a frequency envelope for the initial data of φ in H˙ × L 2 . Then by the seed bounds (68) we have full control: φ Sc [J ] K 1 (F),
(296)
on some sufficiently small subinterval J ⊆ I . Here K 1 is a universal polynomial that will be chosen in a moment. The goal now is to bootstrap this control and show that if: φ Sc [J ] 2K 1 (F),
(297)
then we have (296). By Proposition 3.10 we may continue and finally close this last estimate on all of I . By applying estimates (64) and (66) to (297), we have: φ Wc [J ] K 2 (F)K 1 (F), φ X c [J ] δ1 K 2 (F)K 1 (F), for a universal polynomial K 2 . Next, choose the gap m ln(F) in Eq. (288) in a way that is consistent with the assumptions of Proposition 3.2, and apply estimate (41) to (288), while using (55) via the last two bounds. This gives: 2 φk S[I ] K 3 (F) 1 + δ1 K 2 (F)K 1 (F) ck . The proof is concluded by choosing K 1 = 2K 3 and assuming is sufficiently small. 11. Initial Data Truncation Here we prove that for each initial data set with small energy dispersion we can continuously regularize it. In a sufficiently small tubular neighborhood V (M) of the surface M ⊂ R N we introduce a projection operator: : V (M) → M. This also induces a projection operator on the tangent bundle: : T V (M) → T M, RN
which is a product of in and Euclidean linear orthogonal projection onto each fiber in the second factor. Given an initial data set: φ[0] = (φ0 , φ1 ) : R2 → T M, which belongs to H˙ 1 × L 2 , we regularize it as follows: φ, 0 there exists 0 > 0 so that for each initial data set φ[0] for (1) with energy E and energy dispersion 0 and k, k∗ ∈ Z we have: Pk (P
(298)
Proof. By rescaling we assume that k∗ = 0. We begin with two simple Moser type estimates which we will repeatedly use in the sequel. Precisely, for each smooth and bounded function G with bounded derivatives we have: ∇xJ G(P
(299)
∇xJ G(P
(300)
and
which are easily proved using the chain rule and Bernstein’s inequality.
We first show that if $\epsilon$ is small enough then the projection $\Pi P_{<0}\phi[0]$ is well defined:

Lemma 11.2. Under the assumptions of Proposition 11.1 we have:
$$\mathrm{dist}(P_{<0}\phi_0, M) \lesssim_E \epsilon\,|\log\epsilon|.$$

Proof. By translation invariance, it suffices to show that:
$$I = \int_{|x|\lesssim1}|P_{<0}\phi_0(0) - \phi_0(x)|\,dx \lesssim_E \epsilon\,|\log\epsilon|. \tag{301}$$
We use a positive parameter $m$ and a Littlewood-Paley decomposition to estimate $I$ as follows:
$$I \lesssim \|\nabla_xP_{<-m}\phi_0\|_{L^\infty_x} + \|P_{[-m,m]}\phi_0\|_{L^\infty_x} + \|P_{>m}\phi_0\|_{L^2_x}.$$
Using Sobolev embeddings for the first term, energy dispersion for the second, and the $\dot H^1$ norm for the third we obtain:
$$I \lesssim 2^{-m}E + m\epsilon + 2^{-m}E.$$
Then (301) is obtained by choosing $2^m = \epsilon^{-1}$.
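The final choice of $m$ can be spelled out as follows. This is a sketch, writing $\epsilon$ for the energy dispersion bound and assuming $0<\epsilon<\tfrac12$ so that $|\log\epsilon| > \log 2$; the explicit constants are for illustration only and are not taken from the paper.

```latex
% Take m = \log_2(1/\epsilon), i.e. 2^{m} = \epsilon^{-1}, with 0 < \epsilon < 1/2.
\begin{align*}
I &\lesssim 2^{-m}E + m\,\epsilon + 2^{-m}E
   = 2\epsilon E + \epsilon\log_2(1/\epsilon) \\
  &\leq \Big(\frac{2E}{\log 2} + \frac{1}{\log 2}\Big)\,\epsilon\,|\log\epsilon|
   \;\lesssim_E\; \epsilon\,|\log\epsilon| .
\end{align*}
% The second line uses \epsilon \le \epsilon|\log\epsilon|/\log 2 and
% \log_2(1/\epsilon) = |\log\epsilon|/\log 2.
```

The logarithm comes entirely from the middle term: the number of dyadic frequencies in $[-m,m]$ grows like $m$, and each of them contributes at most $\epsilon$.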
To continue the proof of the proposition, we remark that can be expressed as: (302) (φ (1) , φ (2) ) = G(φ (1) ), H (φ (1) )φ (2) , where G is some smooth extension of to all of R N , and H is some extension of the fiber projection composed with G. Note that both G and H may be chosen as bounded functions with bounded derivatives. We separately estimate the high frequencies, middle frequencies and low frequencies of the difference P 0). For the high frequencies we do not use the fact at all that φ[0] takes values in T M. Instead, we use (299) to directly estimate: Pk G(P<0 φ0 ) L 2x E 2−(C+1)k , where C is a large integer. Similarly we have: Pk (H (P<0 φ0 )P<0 φ1 ) L 2x E 2−Ck . Thus we obtain: Pk (P<0 φ[0] − φ,<0 [0]) H˙ 1 ×L 2 E 2−Ck .
(303)
Step 2 (Low frequencies bounds, the contribution of k < 0). Here we take advantage of the identity φ[0] = φ[0]. Then we can write:
Pk P<0 φ[0] − φ,<0 [0] = Pk (φ[0] − (P<0 φ[0])) := ψ[0]. To estimate the last difference we use an integral expansion as follows: ∞ d G(Pk1 −10 G (P
Next, we use Bernstein’s inequality and (300) to estimate:
Pk P>k1 −10 G (Pk1 −10 G (Pk1 −10 G (P
(304)
Hence after integration with respect to k1 0 we obtain: ψ0 L 2x E 2k . A similar computation shows that: ∞ ψ1 = Pk H (P
0
∞
H (P
Then proceeding as above, we may estimate both integrands on the RHS by $\lesssim_E 2^{k-k_1}$, which upon integration over all $k_1\geq0$ yields a similar bound: $\|\psi_1\|_{L^2_x}\lesssim_E 2^k$. Thus we have proved that:
Pk P<0 φ[0] − φ,<0 [0] H˙ 1 ×L 2 E 2k .
(305)
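The integration in $k_1$ used in this step is elementary. As a sketch, treating $k_1$ as a continuous parameter (as in the integral expansion) and assuming an integrand bound of size $2^{k-k_1}$:

```latex
\[
\int_0^\infty 2^{k-k_1}\,dk_1
 \;=\; 2^{k}\int_0^\infty e^{-k_1\log 2}\,dk_1
 \;=\; \frac{2^{k}}{\log 2}.
\]
```

So the low frequency bound loses only the harmless constant $1/\log 2$ relative to the size of the integrand at $k_1 = 0$.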
Step 3 (Intermediate frequency bounds, the contribution of −m < k < m). Here m is some fixed large integer. The goal of the argument here is to show the estimate:
Pk P<0 φ[0] − φ,<0 [0] H˙ 1 ×L 2 E m 2 + 2−m , |k| m. (306) This is used with m chosen so that 2−m ≈ . Due to the identity φ[0] = φ[0] we can rewrite (306) in the form Pk (P<0 φ[0] − P<0 φ[0]) H˙ 1 ×L 2 E m 2 + 2−m .
(307)
This is a direct consequence of the following paradifferential relation: Lemma 11.3. Let be as in (302), and D be its differential. Then for each ψ[0] ∈ H˙ 1 × L 2 with energy E and energy dispersion and each k ∈ R we have Pk ψ[0] − D(P 4,
(308)
where $P_k$ can be substituted by any multiplier whose symbol has similar size, localization and regularity. We remark that in (308) there is no geometry left. That is to say, $\psi[0]$ in (308) need not satisfy the identity $\Pi\psi[0] = \psi[0]$. It is easy to see that (308) implies (307). Indeed, if $k\geq2$ then the first term $P_kP_{<0}\phi[0]$ in (307) does not contribute, while for the second we use (308) with $\psi[0] = P_{<0}\phi[0]$. On the other hand if $k<2$ then for the first term $P_kP_{<0}\phi[0]$ in (307) we use (308) with $P_k$ replaced by $P_kP_{<0}$ and $\psi[0] = \phi[0]$, while for the second term we again use (308) with $\psi[0] = P_{<0}\phi[0]$. It remains to prove the lemma.
Proof of Lemma 11.3. We write Pk ψ[0] − D(P
(309)
We use the integral representation ∞ d Pk (H (ψ0 )ψ1 ) =Pk (H (P
The integrals from −∞ to k − m, respectively from k + m to ∞ can be bounded E 2−m as in Step 1, respectively Step 2 above. For the integral from k − m to k + m we consider the two terms in the integrand separately. The second term is estimated directly, ∇ H (P
k+m
k−m
H (P
The remaining integrand is further expanded, H (P
k1
k−m
∇ H (P
The second term can be estimated as above by E m. We arrive at Pk (H (ψ0 )ψ1 ) = Pk (H (P
E 2−m .
The proof of the lemma is complete.
This concludes our demonstration of Proposition 11.1.
Energy Dispersed Wave Maps
229
Acknowledgements. The authors would like to thank Manos Grillakis, Sergiu Klainerman, Joachim Krieger, Matei Machedon, Igor Rodnianski, and Wilhelm Schlag for many stimulating discussions over the years regarding the wave-map problem. We would also especially like to thank Terry Tao for several key discussions on the nature of induction-on-energy type proofs.
Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
References 1. Bahouri, H., Gérard, P.: High frequency approximation of solutions to critical nonlinear wave equations. Amer. J. Math. 121(1), 131–175 (1999) 2. Bejenaru, I., Ionescu, A., Kenig, C., Tataru, D.: Global Schrödinger maps. http://arXiv.org/abs/0807. 0265v1[math.AD], 2008 3. Gromov, M.L.: Isometric imbeddings and immersions. Dokl. Akad. Nauk SSSR 192, 1206–1209 (1970) 4. Günther, M.: Isometric embeddings of Riemannian manifolds. In: Proceedings of the International Congress of Mathematicians, Vol. I, II (Kyoto, 1990), Tokyo: Math. Soc. Japan, 1991, pp. 1137–1143 5. Hadac, M., Herr, S.S., Koch, H.: Well-posedness and scattering for the kp-ii equation in a critical space. Ann. I. H. Poicavé - AN 26(3), 917–941 (2009) 6. Kenig, C.E., Merle, F.: Global well-posedness, scattering and blow-up for the energy-critical focusing non-linear wave equation. Acta Math. 201(2), 147–212 (2008) 7. Klainerman, S., Machedon, M.: Space-time estimates for null forms and the local existence theorem. Comm. Pure Appl. Math. 46(9), 1221–1268 (1993) 8. Klainerman, S., Rodnianski, I.: On the global regularity of wave maps in the critical Sobolev norm. Internat. Math. Res. Notices 13, 655–677 (2001) 9. Klainerman, S., Selberg, S.: Remark on the optimal regularity for equations of wave maps type. Comm. Part. Diff. Eqs. 22(5-6), 901–918 (1997) 10. Klainerman, S., Selberg, S.: Bilinear estimates and applications to nonlinear wave equations. Commun. Contemp. Math. 4(2), 223–295 (2002) 11. Koch, H., Tataru, D.: Dispersive estimates for principally normal pseudodifferential operators. Comm. Pure Appl. Math. 58(2), 217–284 (2005) 12. Koch, H., Tataru, D.: A priori bounds for the 1D cubic NLS in negative Sobolev spaces. Int. Math. Res. Not. IMRN, 16:Art. ID rnm053, 36, (2007) 13. Krieger, J., Schlag, W., Tataru, D.: Renormalization and blow up for charge one equivariant critical wave maps. Invent. Math. 171(3), 543–615 (2008) 14. 
Krieger, J.: Global regularity of wave maps from $\mathbb{R}^{3+1}$ to surfaces. Commun. Math. Phys. 238(1-2), 333–366 (2003) 15. Krieger, J.: Global regularity of wave maps from $\mathbb{R}^{2+1}$ to $H^2$. Small energy. Commun. Math. Phys. 250(3), 507–580 (2004) 16. Krieger, J., Schlag, W.: Concentration compactness for critical wave-maps. http://arXiv.org/abs/0908.2974v1[math.AP], 2009 17. Nahmod, A., Stefanov, A., Uhlenbeck, K.: On the well-posedness of the wave map problem in high dimensions. Comm. Anal. Geom. 11(1), 49–83 (2003) 18. Nash, J.: The imbedding problem for Riemannian manifolds. Ann. of Math. (2) 63, 20–63 (1956) 19. Rodnianski, I., Sterbenz, J.: On the formation of singularities in the critical O(3) sigma-model. http://arXiv.org/abs/math/0605023v3[math.AP], 2008 20. Shatah, J., Struwe, M.: The Cauchy problem for wave maps. Int. Math. Res. Not. 11, 555–571 (2002) 21. Sterbenz, J., Tataru, D.: Regularity of Wave-Maps in Dimension 2+1. doi:10.1007/s00220-010-1062-3 22. Tao, T.: Global regularity of wave maps III. Large energy from $\mathbb{R}^{2+1}$ to hyperbolic spaces. http://arXiv.org/abs/0805.4666v3[math.AP], 2009 23. Tao, T.: Global regularity of wave maps IV. Absence of stationary or self-similar solutions in the energy class. http://arXiv.org/abs/0806.3592v2[math.AP], 2009 24. Tao, T.: Global regularity of wave maps V. Large data local well-posedness in the energy class. http://arXiv.org/abs/0808.0368v2[math.AP], 2009 25. Tao, T.: Global regularity of wave maps VI. Abstract theory of minimal energy blowup solutions. http://arXiv.org/abs/0906.2833v2[math.AP], 2009 26. Tao, T.: An inverse theorem for the bilinear $L^2$ Strichartz estimate for the wave equation. http://arXiv.org/abs/0904.2880v1[math.AP], 2009 27. Tao, T.: Endpoint bilinear restriction theorems for the cone, and some sharp null form estimates. Math. Z. 238(2), 215–268 (2001)
28. Tao, T.: Global regularity of wave maps. I. Small critical Sobolev norm in high dimension. Internat. Math. Res. Notices 6, 299–328 (2001) 29. Tao, T.: Global regularity of wave maps. II. Small energy in two dimensions. Commun. Math. Phys. 224(2), 443–544 (2001) 30. Tao, T.: Geometric renormalization of large energy wave maps. In: Journées “Équations aux Dérivées Partielles”, pages Exp. No. XI, 32. École Polytech., Palaiseau, 2004 31. Tataru, D.: Local and global results for wave maps. I. Comm. Part. Diff. Eqs. 23(9-10), 1781–1793 (1998) 32. Tataru, D.: On global existence and scattering for the wave maps equation. Amer. J. Math. 123(1), 37–77 (2001) 33. Tataru, D.: Rough solutions for the wave maps equation. Amer. J. Math. 127(2), 293–377 (2005) 34. Wolff, T.: A sharp bilinear cone restriction estimate. Ann. of Math. (2) 153(3), 661–698 (2001) Communicated by P. Constantin
Commun. Math. Phys. 298, 231–264 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1062-3
Communications in
Mathematical Physics
Regularity of Wave-Maps in Dimension 2 + 1 Jacob Sterbenz1, , Daniel Tataru2, 1 Department of Mathematics, University of California, San Diego, CA 92093-0112, USA.
E-mail: [email protected]
2 Department of Mathematics, University of California, Berkeley, CA 94720-3840, USA.
E-mail: [email protected] Received: 24 July 2009 / Accepted: 27 December 2009 Published online: 23 May 2010 – © The Author(s) 2010. This article is published with open access at Springerlink.com
Abstract: In this article we prove a Sacks-Uhlenbeck/Struwe type global regularity result for wave-maps $\Phi : \mathbb{R}^{2+1} \to M$ into general compact target manifolds $M$.
Contents
1. Introduction
1.1 The Question of Blowup
1.2 The Question of Scattering
2. Overview of the Proof
3. Weighted Energy Estimates for the Wave Equation
4. Finite Energy Self Similar Wave-Maps
5. A Simple Compactness Result
6. Proof of Theorems 1.3, 1.5
6.1 Extension and scaling in the blowup scenario
6.2 Extension and scaling in the non-scattering scenario
6.3 Elimination of the null concentration scenario
6.4 Nontrivial energy in a time-like cone
6.5 Propagation of time-like energy concentration
6.6 Final rescaling
6.7 Concentration scales
6.8 The compactness argument
References

The first author was supported in part by the NSF grant DMS-0701087.
The second author was supported in part by the NSF grant DMS-0801261.
1. Introduction

In this article we consider large data Wave-Maps from $\mathbb{R}^{2+1}$ into a compact Riemannian manifold $(M, m)$, and we prove that regularity and dispersive bounds persist as long as a soliton-like concentration is absent. This is a companion to our concurrent article [27], where the same result is proved under a stronger energy dispersion assumption, see Theorem 1.2 below.

The set-up we consider is the same as the one in [41], using the extrinsic formulation of the Wave-Maps equation. Precisely, we consider the target manifold $(M, m)$ as an isometrically embedded submanifold of $\mathbb{R}^N$. Then we can view the $M$ valued functions as $\mathbb{R}^N$ valued functions whose range is contained in $M$. Such an embedding always exists by Nash's theorem [20] (see also Gromov [8] and Günther [9]). In this context the Wave-Maps equation can be expressed in a form which involves the second fundamental form $S$ of $M$, viewed as a symmetric bilinear form:
$$S : TM \times TM \to NM, \qquad \langle S(X, Y), N \rangle = \langle \partial_X N, Y \rangle.$$
The Cauchy problem for the wave maps equation has the form:
$$\Box \phi^a = -S^a_{bc}(\phi)\, \partial^\alpha \phi^b\, \partial_\alpha \phi^c, \qquad (\phi^1, \ldots, \phi^N) := \Phi, \tag{1a}$$
$$\Phi(0, x) = \Phi_0(x), \qquad \partial_t \Phi(0, x) = \Phi_1(x), \tag{1b}$$
where the initial data $(\Phi_0, \Phi_1)$ is chosen to obey the constraint:
$$\Phi_0(x) \in M, \qquad \Phi_1(x) \in T_{\Phi_0(x)} M, \qquad x \in \mathbb{R}^2.$$
In the sequel, it will be convenient for us to use the notation $\Phi[t] = (\Phi(t), \partial_t \Phi(t))$. There is a conserved energy for this problem,
$$\mathcal{E}[\Phi](t) = \frac{1}{2} \int \big( |\partial_t \Phi|^2 + |\nabla_x \Phi|^2 \big)\, dx.$$
This is invariant with respect to the scaling that preserves the equation, $\Phi(x, t) \to \Phi(\lambda x, \lambda t)$. Because of this, we say that the problem is energy critical.

The present article contributes to the understanding of finite energy solutions with arbitrarily large initial data, a problem which has been the subject of intense investigations for some time now. We will not attempt to give a detailed account of the history of this subject here. Instead, we refer the reader to the surveys [40] and [15], and the references therein. The general (i.e. without symmetry assumptions) small energy problem for compact targets was initiated in the ground-breaking work of Klainerman-Machedon [13,14], and completed by the work of Tao [36] when $M$ is a sphere, and Tataru [41] for general isometrically embedded manifolds; see also the work of Krieger [17] on the non-compact hyperbolic plane, which was treated from the intrinsic point of view.

At a minimum one expects the solutions to belong to the space $C(\mathbb{R}; \dot H^1(\mathbb{R}^2)) \cap \dot C^1(\mathbb{R}, L^2(\mathbb{R}^2))$. However, this information does not suffice in order to study the equation and to obtain uniqueness statements. Instead, a smaller Banach space $S \subset C(\mathbb{R}; \dot H^1(\mathbb{R}^2)) \cap \dot C^1(\mathbb{R}, L^2(\mathbb{R}^2))$ was introduced in [36], modifying an earlier structure in [39]. Beside the energy, $S$ also contains Strichartz type information in various frequency localized contexts. The full description of $S$ is not necessary here. However, we do use the fact that $S \cap L^\infty$ is an algebra, as well as the $X^{s,b}$ type embedding $S \subset \dot X^{1,\frac{1}{2}}_\infty$. See our companion paper [27] for precise definitions. The standard small data result is as follows:
Large Data Wave Maps
233
Theorem 1.1 ([36,41,17]). There is some $E_0 > 0$ so that for each smooth initial data $(\Phi_0, \Phi_1)$ satisfying $\mathcal{E}[\Phi](0) \le E_0$ there exists a unique global smooth solution $\Phi = T(\Phi_0, \Phi_1) \in S$. In addition, the above solution operator $T$ extends to a continuous operator from $\dot H^1 \times L^2$ to $S$ with
$$\|\Phi\|_S \lesssim \|(\Phi_0, \Phi_1)\|_{\dot H^1 \times L^2}.$$
We call such solutions "strong finite energy wave-maps". Furthermore, the following weak Lipschitz stability estimate holds for these solutions for $s < 1$ and close to 1:
$$\|\Phi - \Psi\|_{C(\mathbb{R}; \dot H^s) \cap \dot C^1(\mathbb{R}, \dot H^{s-1})} \lesssim \|\Phi[0] - \Psi[0]\|_{\dot H^s \times \dot H^{s-1}}. \tag{2}$$

Due to the finite speed of propagation, the corresponding local result is also valid, say with the initial data in a ball and the solution in the corresponding domain of uniqueness. The aim of the companion [27] to the present work is to provide a conditional $S$ bound for large data solutions, under a weak energy dispersion condition. For that, we have introduced the notion of the energy dispersion of a wave map $\Phi$ defined on an interval $I$,
$$ED[\Phi] = \sup_{k \in \mathbb{Z}} \|P_k \Phi\|_{L^\infty_{t,x}[I]},$$
where $P_k$ are spatial Littlewood-Paley projectors at frequency $2^k$. The main result in [27] is as follows:

Theorem 1.2 (Main Theorem in [27]). For each $E > 0$ there exist $F(E) > 0$ and $\epsilon(E) > 0$ so that any wave-map $\Phi$ in a time interval $I$ with energy $\mathcal{E}[\Phi] \le E$ and energy dispersion $ED[\Phi] \le \epsilon(E)$ must satisfy $\|\Phi\|_{S[I]} \le F(E)$.

In particular, this result implies that a wave map with energy $E$ cannot blow-up as long as its energy dispersion stays below $\epsilon(E)$. In this article we establish unconditional analogs of the above result, in particular settling the blow-up versus global regularity and scattering question for the large data problem.

We begin with some notations. We consider the forward light cone
$$C = \{0 \le t < \infty,\ r \le t\}$$
and its subsets $C_{[t_0, t_1]} = \{t_0 \le t \le t_1,\ r \le t\}$. The lateral boundary of $C_{[t_0, t_1]}$ is denoted by $\partial C_{[t_0, t_1]}$. The time sections of the cone are denoted by $S_{t_0} = \{t = t_0,\ |x| \le t\}$. We also use the translated cones
$$C^\delta = \{\delta \le t < \infty,\ r \le t - \delta\}$$
as well as the corresponding notations $C^\delta_{[t_0, t_1]}$, $\partial C^\delta_{[t_0, t_1]}$ and $S^\delta_{t_0}$ for $t_0 > \delta$. Given a wave map $\Phi$ in $C$ or in a subset $C_{[t_0, t_1]}$ of it we define the energy of $\Phi$ on time sections as
$$\mathcal{E}_{S_t}[\Phi] = \frac{1}{2} \int_{S_t} \big( |\partial_t \Phi|^2 + |\nabla_x \Phi|^2 \big)\, dx.$$
It is convenient to do the computations in terms of the null frame
$$L = \partial_t + \partial_r, \qquad \underline{L} = \partial_t - \partial_r, \qquad \slashed{\partial} = r^{-1} \partial_\theta.$$
We define the flux of $\Phi$ between $t_0$ and $t_1$ as
$$\mathcal{F}_{[t_0, t_1]}[\Phi] = \int_{\partial C_{[t_0, t_1]}} \Big( \frac{1}{4} |L\Phi|^2 + \frac{1}{2} |\slashed{\partial} \Phi|^2 \Big)\, dA.$$
By standard energy estimates we have the energy conservation relation
$$\mathcal{E}_{S_{t_1}}[\Phi] = \mathcal{E}_{S_{t_0}}[\Phi] + \mathcal{F}_{[t_0, t_1]}[\Phi]. \tag{3}$$
This shows that $\mathcal{E}_{S_t}[\Phi]$ is a nondecreasing function of $t$.

1.1. The Question of Blowup. We begin with the blow-up question. A standard argument which uses the small data result and the finite speed of propagation shows that if blow-up occurs then it must occur at the tip of a light-cone where the energy (inside the cone) concentrates. After a translation and rescaling it suffices to consider wave maps $\Phi$ in the cone $C_{[0,1]}$. If
$$\lim_{t \to 0} \mathcal{E}_{S_t}[\Phi] \le E_0,$$
then blow-up cannot occur at the origin due to the local result, and in fact it follows that
$$\lim_{t \to 0} \mathcal{E}_{S_t}[\Phi] = 0.$$
Thus the interesting case is when we are in an energy concentration scenario
$$\lim_{t \to 0} \mathcal{E}_{S_t}[\Phi] > E_0. \tag{4}$$
The main result we prove here is the following:

Theorem 1.3. Let $\Phi : C_{(0,1]} \to M$ be a $C^\infty$ wave map. Then exactly one of the following possibilities must hold:

A) There exists a sequence of points $(t_n, x_n) \in C_{[0,1]}$ and scales $r_n$ with
$$(t_n, x_n) \to (0, 0), \qquad \limsup \frac{|x_n|}{t_n} < 1, \qquad \lim \frac{r_n}{t_n} = 0$$
so that the rescaled sequence of wave-maps
$$\Phi^{(n)}(t, x) = \Phi(t_n + r_n t,\ x_n + r_n x), \tag{5}$$
converges strongly in $H^1_{loc}$ to a Lorentz transform of an entire Harmonic-Map of nontrivial energy:
$$\Phi^{(\infty)} : \mathbb{R}^2 \to M, \qquad 0 < \|\Phi^{(\infty)}\|_{\dot H^1(\mathbb{R}^2)} \le \lim_{t \to 0} \mathcal{E}_{S_t}[\Phi].$$

B) For each $\epsilon > 0$ there exists $0 < t_0 \le 1$ and a wave map extension
$$\Phi : \mathbb{R}^2 \times (0, t_0] \to M$$
with bounded energy
$$\mathcal{E}[\Phi] \le (1 + \epsilon^8) \lim_{t \to 0} \mathcal{E}_{S_t}[\Phi] \tag{6}$$
and energy dispersion,
$$\sup_{t \in (0, t_0]} \sup_{k \in \mathbb{Z}} \Big( \|P_k \Phi(t)\|_{L^\infty_x} + 2^{-k} \|P_k \partial_t \Phi(t)\|_{L^\infty_x} \Big) \le \epsilon. \tag{7}$$
We remark that a nontrivial harmonic map $\Phi^{(\infty)} : \mathbb{R}^2 \to M$ cannot have an arbitrarily small energy. Precisely, there are two possibilities. Either there are no such harmonic maps (for instance, in the case when $M$ is negatively curved, see [19]) or there exists a lowest energy nontrivial harmonic map, whose energy we denote by $\mathcal{E}_0(M) > 0$. Furthermore, a simple computation shows that the energy of any harmonic map will increase if we apply a Lorentz transformation. Hence, combining the results of Theorem 1.3 and Theorem 1.2 we obtain the following:

Corollary 1.4 (Finite Time Regularity for Wave-Maps). The following statements hold:

A) Assume that $M$ is a compact Riemannian manifold so that there are no nontrivial finite energy harmonic maps $\Phi^{(\infty)} : \mathbb{R}^2 \to M$. Then for any finite energy data $\Phi[0] : \mathbb{R}^2 \times \mathbb{R}^2 \to M \times TM$ for the wave map equation (1) there exists a global solution $\Phi \in S(0, T)$ for all $T > 0$. In addition, this global solution retains any additional regularity of the initial data.

B) Let $\pi : \widetilde{M} \to M$ be a Riemannian covering, with $M$ compact, and such that there are no nontrivial finite energy harmonic maps $\Phi^{(\infty)} : \mathbb{R}^2 \to M$. If $\Phi[0] : \mathbb{R}^2 \times \mathbb{R}^2 \to \widetilde{M} \times T\widetilde{M}$ is $C^\infty$, then there is a global $C^\infty$ solution to $\widetilde{M}$ with this data.

C) Suppose that there exists a lowest energy nontrivial harmonic map into $M$ with energy $\mathcal{E}_0(M)$. Then for any data $\Phi[0] : \mathbb{R}^2 \times \mathbb{R}^2 \to M \times TM$ for the wave map equation (1) with energy below $\mathcal{E}_0(M)$, there exists a global solution $\Phi \in S(0, T)$ for all $T > 0$.

We remark that the statement in part B) is a simple consequence of A) and restricting the projection $\pi \circ \Phi$ to a sufficiently small section $S_t$ of a cone where one expects blowup of the original map into $\widetilde{M}$. In particular, since this projection is regular by part A), its image lies in a simply connected set for sufficiently small $t$. Thus, this projection can be inverted to yield regularity of the original map close to the suspected blowup point. Because of this trivial reduction, we work exclusively with compact $M$ in the sequel.

It should be remarked however, that as a (very) special case of this result one obtains global regularity for smooth Wave-Maps into all hyperbolic spaces $\mathbb{H}^n$, which has been a long-standing and important conjecture in geometric wave equations due to its relation with problems in general relativity (see Chapter 16 of [2]). The statement of Corollary 1.4 in its full generality was known as the Threshold Conjecture. Similar results were previously established for the Wave-Map problem via symmetry reductions in the works [4,26,30], and [29]. General results of this type, as
well as fairly strong refinements, have been known for the Harmonic-Map heat-flow for some time (see [28] and [22]). As with the heat flow, Theorem 1.3 does not prevent the formation of multiple singularities on top of each other. To the contrary, such bubble-trees are to be expected (see [42]). Finally, we remark that this result is sharp. In the case of $M = \mathbb{S}^2$ there exists a lowest energy nontrivial harmonic map, namely the stereographic projection $Q$. The results in [16] assert that blow-up with a rescaled $Q$ profile can occur for initial data with energy arbitrarily close to $\mathcal{E}[Q]$. We also refer the reader to [23] for blow-up results near higher energy harmonic maps.

1.2. The Question of Scattering. Next we consider the scattering problem, for which we start with a finite energy wave map $\Phi$ in $\mathbb{R}^2 \times [0, \infty)$ and consider its behavior as $t \to \infty$. Here by scattering we simply mean the fact that $\Phi \in S$; if that is the case, then the structure theorem for large energy Wave-Maps in [27] shows that $\Phi$ behaves at $\infty$ as a linear wave after an appropriate renormalization.

We can select a ball $B$ so that outside $B$ the energy is small, $\mathcal{E}_{B^c}[\Phi] < \frac{1}{10} E_0$. Then outside the influence cone of $B$, the solution $\Phi$ behaves like a small data wave map. Hence it remains to study it within the influence cone of $B$. After scaling and translation, it suffices to work with wave maps $\Phi$ in the outgoing cone $C_{[1,\infty)}$ which have finite energy, i.e.
$$\lim_{t \to \infty} \mathcal{E}_{S_t}[\Phi] < \infty. \tag{8}$$
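We prove the following result:

Theorem 1.5. Let $\Phi : C_{[1,\infty)} \to M$ be a $C^\infty$ wave map which satisfies (8). Then exactly one of the following possibilities must hold:

A) There exists a sequence of points $(t_n, x_n) \in C_{[1,\infty)}$ and scales $r_n$ with
$$t_n \to \infty, \qquad \limsup \frac{|x_n|}{t_n} < 1, \qquad \lim \frac{r_n}{t_n} = 0$$
so that the rescaled sequence of wave-maps
$$\Phi^{(n)}(t, x) = \Phi(t_n + r_n t,\ x_n + r_n x), \tag{9}$$
converges strongly in $H^1_{loc}$ to a Lorentz transform of an entire Harmonic-Map of nontrivial energy:
$$\Phi^{(\infty)} : \mathbb{R}^2 \to M, \qquad 0 < \|\Phi^{(\infty)}\|_{\dot H^1(\mathbb{R}^2)} \le \lim_{t \to \infty} \mathcal{E}_{S_t}[\Phi].$$

B) For each $\epsilon > 0$ there exists $t_0 > 1$ and a wave map extension
$$\Phi : \mathbb{R}^2 \times [t_0, \infty) \to M$$
with bounded energy
$$\mathcal{E}[\Phi] \le (1 + \epsilon^8) \lim_{t \to \infty} \mathcal{E}_{S_t}[\Phi] \tag{10}$$
and energy dispersion,
$$\sup_{t \in [t_0, \infty)} \sup_{k \in \mathbb{Z}} \Big( \|P_k \Phi(t)\|_{L^\infty_x} + 2^{-k} \|P_k \partial_t \Phi(t)\|_{L^\infty_x} \Big) \le \epsilon. \tag{11}$$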
In case B) Theorem 1.2 then implies that scattering holds as $t \to \infty$. Thus if scattering does not hold then we must be in case A). As a corollary, it follows that scattering can only fail for wave-maps $\Phi$ whose energy satisfies
$$\mathcal{E}[\Phi] \ge \mathcal{E}_0(M). \tag{12}$$
Thus Corollary 1.4 can be strengthened to

Corollary 1.6 (Scattering for Large Data Wave-Maps). The following statements hold:

A) Assume that there are no nontrivial finite energy harmonic maps $\Phi^{(\infty)} : \mathbb{R}^2 \to M$. Then for any finite energy data $\Phi[0]$ for the wave map equation (1) there exists a global solution $\Phi \in S$.

B) Suppose that there exists a lowest energy nontrivial harmonic map, with energy $\mathcal{E}_0(M)$. Then for any data $\Phi[0]$ for the wave map equation (1) with energy below $\mathcal{E}_0(M)$ there exists a global solution $\Phi \in S$.

Ideally one would also like to have a constructive bound of the form
$$\|\Phi\|_S \le F(\mathcal{E}[\Phi]).$$
This does not seem to follow directly from our results. Furthermore, our results do not seem to directly imply scattering for non-compact targets in the absence of harmonic maps (only scattering of the projection). Results similar to Corollary 1.6 were previously established in spherically symmetric and equivariant cases, see [3] and [5].

Finally, we would like to remark that results similar in spirit to the ones of this paper and [27] have been recently announced. In the case where $M = \mathbb{H}^n$, the hyperbolic spaces, global regularity and scattering follow from the program of Tao [37,31–33,35] and [34]. In the case where the target $M$ is a negatively curved Riemann surface, Krieger and Schlag [18] provide global regularity and scattering via a modification of the Kenig-Merle method [12], which uses as a key component suitably defined Bahouri-Gérard [1] type decompositions.
2. Overview of the Proof

The proofs of Theorem 1.3 and Theorem 1.5 are almost identical. The three main building blocks of both proofs are (i) weighted energy estimates, (ii) elimination of finite energy self-similar solutions, and (iii) a compactness result.

Our main energy estimates are established in Sect. 3. Beside the standard energy bounds involving the $\partial_t$ vector field we also use the vector field
$$X_0 = \frac{1}{\rho} (t \partial_t + r \partial_r), \qquad \rho = \sqrt{t^2 - r^2} \tag{13}$$
as well as its time translates. This leads to a family of weighted energy estimates, see (26) below, which has appeared in various guises in the literature. The first such reference we are aware of is the work of Grillakis [7]. Our approach is closest to the work of Tao [37] and [31] (see also Chap. 6.3 of [38]). These bounds are also essentially identical to the "rigidity estimate" of Kenig-Merle [12]. It should be noted that estimates of this
type are probably the only generally useful time-like component concentration bounds possible for non-symmetric wave equations, and they will hold for any Lagrangian field equation on (2 + 1) Minkowski space.

Next, we introduce a general argument to rule out the existence of finite energy self-similar solutions to (1). Such results are essentially standard in the literature (e.g. see the section on wave-maps in [25]), but we take some care here to develop a version which applies to the setup of our work. This crucially uses the energy estimates developed in Sect. 3, as well as a boundary regularity result of J. Qing for harmonic maps (see [21]).

The compactness result in Proposition 5.1, proved in Sect. 5, allows us to produce the strongly convergent subsequence of wave maps in case A) of Theorems 1.3, 1.5. It applies to local sequences $\Phi^{(n)}$ of small energy wave maps with the additional property that $X \Phi^{(n)} \to 0$ in $L^2$ for some time-like vector field $X$. This estimate uses only the standard small energy theory of [41], and is completely independent of the more involved regularity result in our companion paper [27].

Given these three building blocks, the proof of Theorems 1.3 and 1.5 presented in Sect. 6 proceeds as follows:

Step 1 (Extension and scaling). We assume that part B) of Theorem 1.3, respectively Theorem 1.5 does not hold for a wave map $\Phi$ and for some $\epsilon > 0$. We construct an extension of $\Phi$ as in part B) satisfying (6), respectively (10). Then the energy dispersion relation (7), respectively (11) must fail. Thus, we can find sequences $t_n$, $x_n$, and $k_n$ so that
$$|P_{k_n} \Phi(t_n, x_n)| + 2^{-k_n} |P_{k_n} \partial_t \Phi(t_n, x_n)| > \epsilon,$$
with $t_n \to 0$ in the case of Theorem 1.3, respectively $t_n \to \infty$ in the case of Theorem 1.5. In addition, the flux-energy relation
$$\mathcal{F}_{[t_1, t_2]}[\Phi] = \mathcal{E}_{S_{t_2}}[\Phi] - \mathcal{E}_{S_{t_1}}[\Phi]$$
shows that in the case of Theorem 1.3 we have
$$\lim_{t_1, t_2 \to 0} \mathcal{F}_{[t_1, t_2]}[\Phi] = 0$$
and in the case of Theorem 1.5 we have
$$\lim_{t_1, t_2 \to \infty} \mathcal{F}_{[t_1, t_2]}[\Phi] = 0.$$
This allows us also to choose $\epsilon_n \to 0$ such that
$$\mathcal{F}_{[\epsilon_n t_n, t_n]}[\Phi] \le \epsilon_n^{\frac{1}{2}} \mathcal{E}[\Phi].$$
Rescaling to $t = 1$ we produce the sequence of wave maps
$$\Phi^{(n)}(t, x) = \Phi(t_n t, t_n x)$$
in the increasing regions $C_{[\epsilon_n, 1]}$ so that
$$\mathcal{F}_{[\epsilon_n, 1]}[\Phi^{(n)}] \le \epsilon_n^{\frac{1}{2}} \mathcal{E}[\Phi],$$
and also points $x_n \in \mathbb{R}^2$ and frequencies $k_n \in \mathbb{Z}$ so that
$$|P_{k_n} \Phi^{(n)}(1, x_n)| + 2^{-k_n} |P_{k_n} \partial_t \Phi^{(n)}(1, x_n)| > \epsilon. \tag{14}$$
From this point on, the proofs of Theorems 1.3, 1.5 are identical.

Step 2 (Elimination of null concentration scenario). Using the fixed time portion of the $X_0$ energy bounds we eliminate the case of null concentration
$$|x_n| \to 1, \qquad k_n \to \infty$$
in estimate (14), and show that the sequence of maps $\Phi^{(n)}$ at time $t = 1$ must either have low frequency concentration in the range:
$$m(\epsilon, E) < k_n < M(\epsilon, E), \qquad |x_n| < R(\epsilon, E)$$
or high frequency concentration strictly inside the cone:
$$k_n \ge M(\epsilon, E), \qquad |x_n| < \gamma(\epsilon, E) < 1.$$

Step 3 (Time-like energy concentration). In both remaining cases above we show that a nontrivial portion of the energy of $\Phi^{(n)}$ at time 1 must be located inside a smaller cone,
$$\frac{1}{2} \int_{t = 1,\ |x| < \gamma_1} \big( |\partial_t \Phi^{(n)}|^2 + |\nabla_x \Phi^{(n)}|^2 \big)\, dx \ge E_1,$$
where $E_1 = E_1(\epsilon, E)$ and $\gamma_1 = \gamma_1(\epsilon, E) < 1$.

Step 4 (Uniform propagation of non-trivial time-like energy). Using again the $X_0$ energy bounds we propagate the above time-like energy concentration for $\Phi^{(n)}$ from time 1 to smaller times $t \in [\epsilon_n^{\frac{1}{2}}, \epsilon_n^{\frac{1}{4}}]$,
$$\frac{1}{2} \int_{|x| < \gamma_2(\epsilon, E) t} \big( |\partial_t \Phi^{(n)}|^2 + |\nabla_x \Phi^{(n)}|^2 \big)\, dx \ge E_0(\epsilon, E), \qquad t \in [\epsilon_n^{\frac{1}{2}}, \epsilon_n^{\frac{1}{4}}].$$
At the same time, we obtain bounds for $X_0 \Phi^{(n)}$ outside smaller and smaller neighborhoods of the cone, namely
$$\iint_{C^{\epsilon_n}_{[\epsilon_n^{1/2}, \epsilon_n^{1/4}]}} \rho^{-1} |X_0 \Phi^{(n)}|^2\, dx\, dt \lesssim 1.$$

Step 5 (Final rescaling). By a pigeonhole argument and rescaling we end up producing another sequence of maps, denoted still by $\Phi^{(n)}$, which are sections of the original wave map and are defined in increasing regions $C_{[1, T_n]}$, $T_n = e^{|\ln \epsilon_n|^{\frac{1}{2}}}$, and satisfy the following three properties:
$$\mathcal{E}_{S_t}[\Phi^{(n)}] \approx E, \qquad t \in [1, T_n] \qquad \text{(Bounded Energy)},$$
$$\mathcal{E}_{S_t^{(1 - \gamma_2) t}}[\Phi^{(n)}] \ge E_2, \qquad t \in [1, T_n] \qquad \text{(Nontrivial Time-like Energy)},$$
$$\iint_{C^{\epsilon_n^{1/2}}_{[1, T_n]}} \frac{1}{\rho} |X_0 \Phi^{(n)}|^2\, dx\, dt \lesssim |\log \epsilon_n|^{-\frac{1}{2}} \qquad \text{(Decay to Self-similar Mode)}.$$
Step 6 (Isolating the concentration scales). The compactness result in Proposition 5.1 only applies to wave maps with energy below the threshold $E_0$ in the small data result. Thus we need to understand on which scales such concentration can occur. Using several additional pigeonholing arguments we show that one of the following two scenarios must occur:

i) Energy Concentration. On a subsequence there exist $(t_n, x_n) \to (t_0, x_0)$, with $(t_0, x_0)$ inside $C^{\frac{1}{2}}_{[\frac{1}{2}, \infty)}$, and scales $r_n \to 0$ so that we have
$$\mathcal{E}_{B(x_n, r_n)}[\Phi^{(n)}](t_n) = \frac{1}{10} E_0,$$
$$\mathcal{E}_{B(x, r_n)}[\Phi^{(n)}](t_n) \le \frac{1}{10} E_0, \qquad x \in B(x_0, r),$$
$$r_n^{-1} \int_{t_n - r_n/2}^{t_n + r_n/2} \int_{B(x_0, r)} |X_0 \Phi^{(n)}|^2\, dx\, dt \to 0.$$

ii) Non-concentration. For each $j \in \mathbb{N}$ there exists an $r_j > 0$ such that for every $(t, x)$ inside $C_j = C^1_{[1, \infty)} \cap \{2^j < t < 2^{j+1}\}$ one has
$$\mathcal{E}_{B(x, r_j)}[\Phi^{(n)}](t) \le \frac{1}{10} E_0, \qquad \forall (t, x) \in C_j,$$
$$\mathcal{E}_{S_t^{(1 - \gamma_2) t}}[\Phi^{(n)}](t) \ge E_2,$$
$$\iint_{C_j} |X_0 \Phi^{(n)}|^2\, dx\, dt \to 0,$$
uniformly in $n$.

Step 7 (The compactness argument). In case i) above we consider the rescaled wave-maps
$$\Psi^{(n)}(t, x) = \Phi^{(n)}(t_n + r_n t,\ x_n + r_n x)$$
and show that on a subsequence they converge locally in the energy norm to a finite energy nontrivial wave map $\Psi$ in $\mathbb{R}^2 \times [-\frac{1}{2}, \frac{1}{2}]$ which satisfies $X(t_0, x_0) \Psi = 0$. Thus $\Psi$ must be a Lorentz transform of a nontrivial harmonic map.

In case ii) above we show directly that the sequence $\Phi^{(n)}$ converges locally on a subsequence in the energy norm to a finite energy nontrivial wave map $\Psi$, defined in the interior of a translated cone $C^2_{[2, \infty)}$, which satisfies $X_0 \Psi = 0$. Consequently, in hyperbolic coordinates we may interpret $\Psi$ as a nontrivial harmonic map
$$\Psi : \mathbb{H}^2 \to M.$$
Compactifying this and using conformal invariance, we obtain a non-trivial finite energy harmonic map
$$\Psi : \mathbb{D}^2 \to M$$
from the unit disk $\mathbb{D}^2$, which according to the estimates of Sect. 3 obeys the additional weighted energy bound:
$$\int_{\mathbb{D}^2} |\nabla_x \Psi|^2 \frac{dx}{1 - r} < \infty.$$
But such maps do not exist via combination of a theorem of Qing [21] and a theorem of Lemaire [19].
3. Weighted Energy Estimates for the Wave Equation

In this section we prove the main energy decay estimates. The technique we use is the standard one of contracting the energy-momentum tensor:
$$T_{\alpha\beta}[\Phi] = m_{ij}(\Phi) \Big[ \partial_\alpha \phi^i \partial_\beta \phi^j - \frac{1}{2} g_{\alpha\beta}\, \partial^\gamma \phi^i \partial_\gamma \phi^j \Big], \tag{15}$$
with well chosen vector-fields. Here $\Phi = (\phi^1, \ldots, \phi^n)$ is a set of local coordinates on the target manifold $(M, m)$ and $(g_{\alpha\beta})$ stands for the Minkowski metric. The main two properties of $T_{\alpha\beta}[\Phi]$ are that it is divergence free, $\nabla^\alpha T_{\alpha\beta} = 0$, and also that it obeys the positive energy condition $T(X, Y) \ge 0$ whenever both $g(X, X) \le 0$ and $g(Y, Y) \le 0$. This implies that contracting $T_{\alpha\beta}[\Phi]$ with timelike/null vector-fields will result in good energy estimates on characteristic and space-like hypersurfaces.

If $X$ is some vector-field, we can form its associated momentum density (i.e. its Noether current)
$${}^{(X)}P_\alpha = T_{\alpha\beta}[\Phi] X^\beta.$$
This one form obeys the divergence rule
$$\nabla^\alpha\, {}^{(X)}P_\alpha = \frac{1}{2} T_{\alpha\beta}[\Phi]\, {}^{(X)}\pi^{\alpha\beta}, \tag{16}$$
where ${}^{(X)}\pi_{\alpha\beta}$ is the deformation tensor of $X$,
$${}^{(X)}\pi_{\alpha\beta} = \nabla_\alpha X_\beta + \nabla_\beta X_\alpha.$$
A simple computation shows that one can also express
$${}^{(X)}\pi = \mathcal{L}_X g.$$
This latter formulation is very convenient when dealing with coordinate derivatives. Recall that in general one has:
$$(\mathcal{L}_X g)_{\alpha\beta} = X(g_{\alpha\beta}) + \partial_\alpha(X^\gamma) g_{\gamma\beta} + \partial_\beta(X^\gamma) g_{\alpha\gamma}.$$
Our energy estimates are obtained by integrating the relation (16) over cones $C^\delta_{[t_1, t_2]}$. Then from (16) we obtain, for $\delta \le t_1 \le t_2$:
$$\int_{S^\delta_{t_2}} {}^{(X)}P_0\, dx + \frac{1}{2} \iint_{C^\delta_{[t_1, t_2]}} T_{\alpha\beta}[\Phi]\, {}^{(X)}\pi^{\alpha\beta}\, dx\, dt = \int_{S^\delta_{t_1}} {}^{(X)}P_0\, dx + \int_{\partial C^\delta_{[t_1, t_2]}} {}^{(X)}P_L\, dA, \tag{17}$$
where $dA$ is an appropriately normalized (Euclidean) surface area element on the lateral boundary of the cone $r = t - \delta$.

The standard energy estimates come from contracting $T_{\alpha\beta}[\Phi]$ with $Y = \partial_t$. Then we have
$${}^{(Y)}\pi = 0, \qquad {}^{(Y)}P_0 = \frac{1}{2} \big( |\partial_t \Phi|^2 + |\nabla_x \Phi|^2 \big), \qquad {}^{(Y)}P_L = \frac{1}{4} |L\Phi|^2 + \frac{1}{2} |\slashed{\partial} \Phi|^2.$$
Applying (17) over $C_{[t_1, t_2]}$ we obtain the energy-flux relation (3) used in the Introduction. Applying (17) over $C^\delta_{[\delta, 1]}$ yields
$$\int_{\partial C^\delta_{[\delta, 1]}} \Big( \frac{1}{4} |L\Phi|^2 + \frac{1}{2} |\slashed{\partial} \Phi|^2 \Big)\, dA \le \mathcal{E}[\Phi]. \tag{18}$$
It will also be necessary for us to have a version of the usual energy estimate adapted to the hyperboloids $\rho = \sqrt{t^2 - r^2} = \text{const}$. Integrating the divergence of the ${}^{(Y)}P_\alpha$ momentum density over regions of the form $R = \{\rho \ge \rho_0,\ t \le t_0\}$ we have:
$$\int_{\{\rho = \rho_0\} \cap \{t \le t_0\}} {}^{(Y)}P^\alpha\, dV_\alpha \le \mathcal{E}[\Phi], \tag{19}$$
where the integrand on the LHS denotes the interior product of ${}^{(Y)}P$ with the Minkowski volume element. To express this estimate in a useful way, we use the hyperbolic coordinates (CMC foliation):
$$t = \rho \cosh(y), \qquad r = \rho \sinh(y), \qquad \theta = \Theta. \tag{20}$$
In this system of coordinates, the Minkowski metric becomes
$$-dt^2 + dr^2 + r^2 d\theta^2 = -d\rho^2 + \rho^2 \big( dy^2 + \sinh^2(y)\, d\Theta^2 \big). \tag{21}$$
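The metric identity (21) can be checked directly from the change of variables (20). The following worked computation is a sketch added for convenience; the intermediate differentials are not written out in the original text.

```latex
% Worked check of (21), starting from the coordinate change (20):
%   t = \rho\cosh(y), \quad r = \rho\sinh(y), \quad \theta = \Theta.
\begin{align*}
dt &= \cosh(y)\,d\rho + \rho\sinh(y)\,dy, \\
dr &= \sinh(y)\,d\rho + \rho\cosh(y)\,dy, \\
-dt^2 + dr^2
   &= -\big(\cosh^2(y) - \sinh^2(y)\big)\,d\rho^2
      + \rho^2\big(\cosh^2(y) - \sinh^2(y)\big)\,dy^2 \\
   &= -d\rho^2 + \rho^2\,dy^2 , \\
r^2\,d\theta^2 &= \rho^2\sinh^2(y)\,d\Theta^2 .
\end{align*}
% The cross terms \pm 2\rho\sinh(y)\cosh(y)\,d\rho\,dy cancel in -dt^2 + dr^2.
% Adding the last two identities gives
%   -dt^2 + dr^2 + r^2\,d\theta^2
%     = -d\rho^2 + \rho^2\big(dy^2 + \sinh^2(y)\,d\Theta^2\big).
```

In particular the level sets $\rho = \text{const}$ carry $\rho^2$ times the metric of the hyperbolic plane, which is why the area element of $\mathbb{H}^2$ appears in the flux through the hyperboloids.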
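A quick calculation shows that the contraction on line (19) becomes the one-form
$${}^{(Y)}P^\alpha\, dV_\alpha = T(\partial_\rho, \partial_t)\, \rho^2\, dA_{\mathbb{H}^2}, \qquad dA_{\mathbb{H}^2} = \sinh(y)\, dy\, d\Theta. \tag{22}$$
The area element $dA_{\mathbb{H}^2}$ is that of the hyperbolic plane $\mathbb{H}^2$. To continue, we note that:
$$\partial_t = \frac{t}{\rho} \partial_\rho - \frac{r}{\rho^2} \partial_y,$$
so in particular
$$T(\partial_\rho, \partial_t) = \frac{\cosh(y)}{2} |\partial_\rho \Phi|^2 - \frac{\sinh(y)}{\rho}\, \partial_\rho \Phi \cdot \partial_y \Phi + \frac{\cosh(y)}{2 \rho^2} \Big( |\partial_y \Phi|^2 + \frac{1}{\sinh^2(y)} |\partial_\Theta \Phi|^2 \Big).$$
Letting $t_0 \to \infty$ in (19) we obtain a useful consequence of this, namely a weighted hyperbolic space estimate for special solutions to the wave-map equations, which will be used in the sequel to rule out the existence of non-trivial finite energy self-similar solutions:

Lemma 3.1. Let $\Phi$ be a finite energy smooth wave-map in the interior of the cone $C$. Assume also that $\partial_\rho \Phi \equiv 0$. Then one has:
$$\frac{1}{2} \int_{\mathbb{H}^2} |\nabla_{\mathbb{H}^2} \Phi|^2 \cosh(y)\, dA_{\mathbb{H}^2} \le \mathcal{E}[\Phi]. \tag{23}$$
Here:
$$|\nabla_{\mathbb{H}^2} \Phi|^2 = |\partial_y \Phi|^2 + \frac{1}{\sinh^2(y)} |\partial_\Theta \Phi|^2,$$
is the covariant energy density for the hyperbolic metric.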
Large Data Wave Maps
243
Our next order of business is to obtain decay estimates for time-like components of the energy density. For this we use the timelike/null vector-field
\[ X_\epsilon = \frac{1}{\rho_\epsilon}\big((t+\epsilon)\partial_t + r\partial_r\big), \qquad \rho_\epsilon = \sqrt{(t+\epsilon)^2 - r^2}. \tag{24} \]
In order to gain some intuition, we first consider the case of $X_0$. This is most readily expressed in the system of hyperbolic coordinates (20) introduced above. One easily checks that the coordinate derivatives turn out to be
\[ \partial_\rho = X_0, \qquad \partial_y = r\partial_t + t\partial_r. \]
In particular, $X_0$ is uniformly timelike with $g(X_0,X_0) = -1$, and one should expect it to generate a good energy estimate on time slices $t = \mathrm{const}$. In the system of coordinates (20) one also has that
\[ \mathcal{L}_{X_0}g = 2\rho\big(dy^2 + \sinh^2(y)\,d\Theta^2\big). \]
Raising this, one then computes
\[ {}^{(X_0)}\pi^{\alpha\beta} = \frac{2}{\rho^3}\big(\partial_y\otimes\partial_y + \sinh^{-2}(y)\,\partial_\Theta\otimes\partial_\Theta\big). \]
Therefore, we have the contraction identity:
\[ \frac12 T_{\alpha\beta}[\phi]\,{}^{(X_0)}\pi^{\alpha\beta} = \frac{1}{\rho}|X_0\phi|^2. \]
To compute the components of ${}^{(X_0)}P_0$ and ${}^{(X_0)}P_L$ we use the associated optical functions
\[ u = t - r, \qquad v = t + r. \]
Notice that $\rho^2 = uv$. Also, simple calculations show that
\[ X_0 = \frac{1}{\rho}\Big(\frac12 vL + \frac12 u\underline{L}\Big), \qquad \partial_t = \frac12 L + \frac12\underline{L}. \tag{25} \]
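The decomposition (25) can be checked componentwise. The sketch below assumes the usual normalization $L = \partial_t + \partial_r$, $\underline{L} = \partial_t - \partial_r$ (an assumption on our part, consistent with $\partial_t = \frac12 L + \frac12\underline{L}$), and represents radial vector fields by their $(\partial_t, \partial_r)$ components; the function names are illustrative.

```python
def x0_cartesian(t, r):
    """Components (dt, dr) of X_0 = (t d_t + r d_r)/rho, for t > r >= 0."""
    rho = (t * t - r * r) ** 0.5
    return t / rho, r / rho

def x0_null_frame(t, r):
    """Components (dt, dr) of (v*L/2 + u*Lbar/2)/rho.

    Assumed normalization: L = d_t + d_r, Lbar = d_t - d_r, u = t - r, v = t + r.
    """
    u, v = t - r, t + r
    rho = (u * v) ** 0.5              # uses the identity rho^2 = u*v
    L, Lbar = (1.0, 1.0), (1.0, -1.0)
    return tuple((0.5 * v * a + 0.5 * u * b) / rho for a, b in zip(L, Lbar))

def minkowski_norm(vec):
    """g(V, V) for V = (V^t, V^r) in the metric -dt^2 + dr^2."""
    return -vec[0] ** 2 + vec[1] ** 2
```

Both representations should agree at any interior point of the cone, and `minkowski_norm` should return $-1$ on either, reflecting $g(X_0, X_0) = -1$.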
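Finally, we record here the components of $T_{\alpha\beta}[\phi]$ in the null frame
\[ T(L,L) = |L\phi|^2, \qquad T(\underline{L},\underline{L}) = |\underline{L}\phi|^2, \qquad T(L,\underline{L}) = |\slashed{\partial}\phi|^2. \]
By combining the above calculations, we see that we may compute
\[ {}^{(X_0)}P_0 = T(\partial_t,X_0) = \frac14\Big(\frac{v}{u}\Big)^{\frac12}|L\phi|^2 + \frac14\Big[\Big(\frac{v}{u}\Big)^{\frac12} + \Big(\frac{u}{v}\Big)^{\frac12}\Big]|\slashed{\partial}\phi|^2 + \frac14\Big(\frac{u}{v}\Big)^{\frac12}|\underline{L}\phi|^2, \]
\[ {}^{(X_0)}P_L = T(L,X_0) = \frac12\Big(\frac{v}{u}\Big)^{\frac12}|L\phi|^2 + \frac12\Big(\frac{u}{v}\Big)^{\frac12}|\slashed{\partial}\phi|^2. \]
These are essentially the same as the components of the usual energy currents ${}^{(\partial_t)}P_0$ and ${}^{(\partial_t)}P_L$ modulo ratios of the optical functions $u$ and $v$. One would expect to get nice space-time estimates for $X_0\phi$ by integrating (16) over the interior cone $r \le t \le 1$. The only problem is that the boundary terms degenerate rather severely when $\rho \to 0$. To avoid this we simply redo everything with the shifted version $X_\epsilon$ from line (24). The above formulas remain valid with $u$, $v$ replaced by their time shifted versions
\[ u_\epsilon = (t+\epsilon) - r, \qquad v_\epsilon = (t+\epsilon) + r. \]
Furthermore, notice that for small $t$ one has in the region $r \le t$ the bounds
\[ \Big(\frac{v_\epsilon}{u_\epsilon}\Big)^{\frac12} \approx 1, \qquad \Big(\frac{u_\epsilon}{v_\epsilon}\Big)^{\frac12} \approx 1, \qquad 0 < t \le \epsilon. \]
Therefore, one has in $r \le t$ that
\[ {}^{(X_\epsilon)}P_0 \approx {}^{(\partial_t)}P_0, \qquad 0 < t \le \epsilon. \]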
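The comparability of the shifted optical functions for $0 < t \le \epsilon$, and its failure near the lateral boundary at time $t \approx 1$, can be seen in a small numerical scan. This is only an illustrative sketch with names of our choosing; it scans a grid rather than proving the bound.

```python
def weight_ratio(t, r, eps):
    """Return (v_eps/u_eps)**0.5 with u_eps = t + eps - r and v_eps = t + eps + r."""
    return ((t + eps + r) / (t + eps - r)) ** 0.5

def ratio_range(eps, n=50):
    """Scan the region 0 <= r <= t <= eps on a grid; return (min, max) of the ratio."""
    values = [weight_ratio(eps * i / n, (eps * i / n) * j / n, eps)
              for i in range(n + 1) for j in range(n + 1)]
    return min(values), max(values)
```

On the region $0 \le r \le t \le \epsilon$ the ratio stays between $1$ and $\sqrt3$ independently of $\epsilon$, whereas on the cone boundary at time one `weight_ratio(1.0, 1.0, eps)` is of size $\epsilon^{-1/2}$, which is the loss appearing in the flux term.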
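In what follows we work with a wave-map $\phi$ in $C_{[\epsilon,1]}$. We denote its total energy and flux by
\[ E = E_{S_1}[\phi], \qquad F = F_{[\epsilon,1]}[\phi]. \]
In the limiting case $F = 0$, $\epsilon = 0$ one could apply (17) to obtain
\[ \int_{S^0_{t_2}} {}^{(X_0)}P_0\,dx + \iint_{C^0_{[t_1,t_2]}} \frac{1}{\rho}|X_0\phi|^2\,dxdt = \int_{S^0_{t_1}} {}^{(X_0)}P_0\,dx. \]
By (26), letting $t_1 \to 0$ followed by $\epsilon \to 0$ and taking the supremum over $0 < t_2 \le 1$ we would get the model estimate
\[ \sup_{t\in(0,1]}\int_{S^0_t} {}^{(X_0)}P_0\,dx + \iint_{C^0_{[0,1]}} \frac{1}{\rho}|X_0\phi|^2\,dxdt \le E. \]
However, here we need to deal with a small nonzero flux. Observing that
\[ {}^{(X_\epsilon)}P_L \lesssim \epsilon^{-\frac12}\,{}^{(\partial_t)}P_L, \]
from (17) we obtain the weaker bound
\[ \int_{S^0_{t_2}} {}^{(X_\epsilon)}P_0\,dx + \iint_{C^0_{[t_1,t_2]}} \frac{1}{\rho_\epsilon}|X_\epsilon\phi|^2\,dxdt \lesssim \int_{S^0_{t_1}} {}^{(X_\epsilon)}P_0\,dx + \epsilon^{-\frac12}F. \]
Letting $t_1 = \epsilon$ and taking supremum over $t_2 \le 1$ we obtain
\[ \sup_{t\in(\epsilon,1]}\int_{S^0_t} {}^{(X_\epsilon)}P_0\,dx + \iint_{C^0_{[\epsilon,1]}} \frac{1}{\rho_\epsilon}|X_\epsilon\phi|^2\,dxdt \lesssim E + \epsilon^{-\frac12}F. \tag{26} \]
A consequence of this is the following, which will be used to rule out the case of asymptotically null pockets of energy:

Lemma 3.2. Let $\phi$ be a smooth wave-map in the cone $C_{(\epsilon,1]}$ which satisfies the flux-energy relation $F \lesssim \epsilon^{\frac12}E$. Then
\[ \int_{S^0_1} {}^{(X_\epsilon)}P_0\,dx \lesssim E. \tag{27} \]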
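Next, we can replace $X_\epsilon$ by $X_0$ in (26) if we restrict the integrals on the left to $r < t - \epsilon$. In this region we have
\[ {}^{(X_\epsilon)}P_0 \approx {}^{(X_0)}P_0, \qquad \rho_\epsilon \approx \rho. \]
Using the second member above, a direct computation shows that in $r < t - \epsilon$,
\[ \frac{1}{\rho}|X_0\phi|^2 \lesssim \frac{1}{\rho_\epsilon}|X_\epsilon\phi|^2 + \frac{\epsilon^2}{\rho^3}|\partial_t\phi|^2, \]
and also
\[ \iint_{C^\epsilon_{(\epsilon,1]}} \frac{\epsilon^2}{\rho^3}|\partial_t\phi|^2\,dxdt \le \iint_{C^\epsilon_{(\epsilon,1]}} \frac{\epsilon^{\frac12}}{t^{\frac32}}|\partial_t\phi|^2\,dxdt \lesssim E. \]
Thus, we have proved the following estimate which will be used to conclude that rescalings of $\phi$ are asymptotically stationary, and also used to help trap uniformly time-like pockets of energy:

Lemma 3.3. Let $\phi$ be a smooth wave-map in the cone $C_{(\epsilon,1]}$ which satisfies the flux-energy relation $F \lesssim \epsilon^{\frac12}E$. Then we have
\[ \sup_{t\in(\epsilon,1]}\int_{S^\epsilon_t} {}^{(X_0)}P_0\,dx + \iint_{C^\epsilon_{[\epsilon,1]}} \frac{1}{\rho}|X_0\phi|^2\,dxdt \lesssim E. \tag{28} \]
Finally, we use the last lemma to propagate pockets of energy forward away from the boundary of the cone. By (17) for $X_0$ we have
\[ \int_{S^\delta_1} {}^{(X_0)}P_0\,dx \le \int_{S^\delta_{t_0}} {}^{(X_0)}P_0\,dx + \int_{\partial C^\delta_{[t_0,1]}} {}^{(X_0)}P_L\,dA, \qquad \epsilon \le \delta < t_0 < 1. \]
We consider the two components of ${}^{(X_0)}P_L$ separately. For the angular component, by (18) we have the bound
\[ \int_{\partial C^\delta_{[t_0,1]}} \Big(\frac{u}{v}\Big)^{\frac12}|\slashed{\partial}\phi|^2\,dA \lesssim \Big(\frac{\delta}{t_0}\Big)^{\frac12}\int_{\partial C^\delta_{[t_0,1]}} |\slashed{\partial}\phi|^2\,dA \lesssim \Big(\frac{\delta}{t_0}\Big)^{\frac12}E. \]
For the $L$ component a direct computation shows that
\[ |L\phi| \lesssim \Big(\frac{u}{v}\Big)^{\frac12}|X_0\phi| + \frac{u}{v}|\underline{L}\phi|. \]
Thus we obtain
\[ \int_{S^\delta_1} {}^{(X_0)}P_0\,dx \lesssim \int_{S^\delta_{t_0}} {}^{(X_0)}P_0\,dx + \Big(\frac{\delta}{t_0}\Big)^{\frac12}E + \int_{\partial C^\delta_{[t_0,1]}} \Big[\Big(\frac{u}{v}\Big)^{\frac12}|X_0\phi|^2 + \Big(\frac{u}{v}\Big)^{\frac32}|\underline{L}\phi|^2\Big]dA. \]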
For the last term we optimize with respect to $\delta \in [\delta_0,\delta_1]$ to obtain:

Lemma 3.4. Let $\phi$ be a smooth wave-map in the cone $C_{(\epsilon,1]}$ which satisfies the flux-energy relation $F \lesssim \epsilon^{\frac12}E$. Suppose that $\epsilon \le \delta_0 \ll \delta_1 \le t_0 \le 1$. Then
\[ \int_{S^{\delta_1}_1} {}^{(X_0)}P_0\,dx \lesssim \int_{S^{\delta_0}_{t_0}} {}^{(X_0)}P_0\,dx + \Big(\Big(\frac{\delta_1}{t_0}\Big)^{\frac12} + \big(\ln(\delta_1/\delta_0)\big)^{-1}\Big)E. \tag{29} \]
To prove this lemma, it suffices to choose $\delta \in [\delta_0,\delta_1]$ so that
\[ \int_{\partial C^\delta_{[t_0,1]}} \Big[\Big(\frac{u}{v}\Big)^{\frac12}|X_0\phi|^2 + \Big(\frac{u}{v}\Big)^{\frac32}|\underline{L}\phi|^2\Big]dA \lesssim |\ln(\delta_1/\delta_0)|^{-1}E. \]
This follows by pigeonholing the estimate
\[ \iint_{C^{\delta_0}_{[t_0,1]}\setminus C^{\delta_1}_{[t_0,1]}} \frac{1}{u}\Big[\Big(\frac{u}{v}\Big)^{\frac12}|X_0\phi|^2 + \Big(\frac{u}{v}\Big)^{\frac32}|\underline{L}\phi|^2\Big]dxdt \lesssim E. \]
The first term is estimated directly by (28). For the second we simply use energy bounds since in the domain of integration we have the relation
\[ \frac{1}{u}\Big(\frac{u}{v}\Big)^{\frac32} \le \frac{\delta_1^{\frac12}}{t^{\frac32}}. \]

4. Finite Energy Self Similar Wave-Maps

The purpose of this section is to prove the following theorem:

Theorem 4.1 (Absence of non-trivial finite energy self similar wave-maps in 2D). Let $\phi$ be a finite energy$^1$ solution to the wave-map equation (1) defined in the forward cone $C$. Suppose also that $\partial_\rho\phi \equiv 0$. Then $\phi \equiv \mathrm{const}$.

Remark 4.2. Theorems of this type are standard in the literature. The first such reference we are aware of is in the work of Shatah–Tahvildar-Zadeh [26] on the equivariant case. This was later extended by Shatah–Struwe [25] to disprove the existence of (initially) smooth self-similar profiles. However, the authors are not aware of an explicit reference in the literature ruling out the possibility of general (i.e. non-symmetric) finite energy self-similar solutions to the system (1), although this statement has by now acquired the status of a folk-lore theorem. It is this latter version in the above form that is necessary in the context of the present work.

Remark 4.3. It is important to remark that the finite energy assumption cannot be dropped, and also that this failure is not due to interior regularity. That is, there are non-trivial $C^\infty$ self similar solutions to (1) in $C$ but these solutions all have infinite energy. However, the energy divergence is marginal, i.e. the energy in $S^\delta_t$ only grows as $|\ln(\delta)|$ as $\delta \to 0$. Further, these solutions have finite energy when viewed as harmonic maps in $\mathbb{H}^2$.

$^1$ That is to say, a weak solution $\phi$ of (1) (in the sense of [10]), such that ess sup 0
Proof of Theorem 4.1. Writing out the system (1) in the coordinates (20), and canceling the balanced factor of $\rho^{-2}$ on both sides we have that $\phi$ obeys:
\[ \Delta_{\mathbb{H}^2}\phi^a = -g^{ij}_{\mathbb{H}^2}\,S^a_{bc}(\phi)\,\partial_i\phi^b\,\partial_j\phi^c, \]
where in polar coordinates $g_{\mathbb{H}^2} = dy^2 + \sinh^2(y)\,d\Theta^2$ is the standard hyperbolic metric. Thus, $\phi$ is an entire harmonic map $\phi : \mathbb{H}^2 \to M$. By elliptic regularity, such a map is smooth with uniform bounds on compact sets (see [10]). In particular, $\phi$ is $C^\infty$ in the interior of the cone $C$. Therefore, from estimate (23) of Lemma 3.1, $\phi$ enjoys the additional weighted energy estimate:
\[ \int_{\mathbb{H}^2} |\nabla_{\mathbb{H}^2}\phi|^2\cosh(y)\,dA_{\mathbb{H}^2} \le 2E(\phi) = \int_{S_t}\big(|\partial_t\phi|^2 + |\nabla_x\phi|^2\big)\,dx, \tag{30} \]
for any fixed $t > 0$. To proceed further, it is convenient to rephrase all of the above in terms of the conformal compactification of $\mathbb{H}^2$. Using the pseudo-spherical stereographic projection from hyperboloids to the unit disk $\mathbb{D}^2 = \{t = 0\}\cap\{x^2 + y^2 < 1\}$ in Minkowski space (see Chap. II of [6]):
\[ \pi(t,x^1,x^2) = (-1,0,0) - \frac{2\rho\,(t+\rho,x^1,x^2)}{\langle (t+\rho,x^1,x^2),(t+\rho,x^1,x^2)\rangle}, \]
as well as the conformal invariance of the 2D harmonic map equation and its associated Dirichlet energy, we have from (30) that $\phi$ induces a finite energy harmonic map $\phi : \mathbb{D}^2 \to M$ with the additional property that
\[ \frac12\int_{\mathbb{D}^2} |\nabla_x\phi|^2\,\frac{1+r^2}{1-r^2}\,dx < \infty. \tag{31} \]
To conclude, we only need to show that such maps are trivial. Notice that the weight $(1-r)^{-1}$ is critical in this respect, for there are many non-trivial$^2$ finite energy harmonic maps from $\mathbb{D}^2 \to M$, regardless of the curvature of $M$, which are also uniformly smooth up to the boundary $\partial\mathbb{D}^2$ and may therefore absorb an energy weight of the form $(1-r)^{-\alpha}$ with $\alpha < 1$. From the bound (31), we have that there exists a sequence of radii $r_n \nearrow 1$ such that:
\[ \int_{r=r_n} |\slashed{\nabla}\phi|^2\,dl = o_n(1). \]
From the uniform boundedness of $\phi(r)$ in $L^2(S^1)$ and the trace theorem, this implies that:
\[ \|\phi|_{\partial\mathbb{D}^2}\|_{L^2(S^1)} < \infty, \qquad \|\phi|_{\partial\mathbb{D}^2}\|_{\dot H^{\frac12}(S^1)} = 0, \]
so therefore $\phi|_{\partial\mathbb{D}^2} \equiv \mathrm{const}$. In particular, $\phi : \mathbb{D}^2 \to M$ is a finite energy harmonic map with smooth boundary values. By a theorem of J. Qing (see [21]), it follows that $\phi$ has uniform regularity up to $\partial\mathbb{D}^2$. Thus, $\phi$ is $C^\infty$ on the closure $\overline{\mathbb{D}^2}$ with constant boundary value. By Lemaire's uniqueness theorem (see Theorem 8.2.3 of [11], and originally [19]) $\phi \equiv \mathrm{const}$ throughout $\mathbb{D}^2$. By inspection of the coordinates (20) and the fact that $\partial_\rho\phi \equiv 0$, this easily implies that $\phi$ is trivial in all of $C$.

$^2$ For example, if $M$ is a complex surface with conformal metric $m = \lambda^2\,d\phi\,d\bar\phi$, then the harmonic map equation becomes $\partial_z\partial_{\bar z}\phi = -2\partial_\phi(\ln\lambda)\,\partial_z\phi\,\partial_{\bar z}\phi$. Thus, any holomorphic function $\phi : \mathbb{D}^2 \to M$ suffices to produce an infinite energy self similar "blowup profile" for (1). See Chap. I of [24].

5. A Simple Compactness Result

The main aim of this section is the following result:

Proposition 5.1. Let $Q$ be the unit cube, and let $\phi^{(n)}$ be a family of wave maps in $3Q$ which have small energy $E[\phi^{(n)}] \le \frac{1}{10}E_0$ and so that
\[ \|X\phi^{(n)}\|_{L^2(3Q)} \to 0 \tag{32} \]
for some smooth time-like vector field $X$. Then there exists a wave map $\phi \in H^{\frac32-\epsilon}(Q)$ with $E[\phi] \le \frac{1}{10}E_0$ so that on a subsequence we have the strong convergence
\[ \phi^{(n)} \to \phi \quad \text{in } H^1(Q). \]
Proof. The argument we present here is inspired by the one in Struwe's work on Harmonic-Map heat flow (see [28]). The main point there is that compactness is gained through the higher regularity afforded via integration in time. For a parabolic equation, integrating $L^2$ in time actually gains a whole derivative by scaling, so in the heat flow case one can control a quantity of the form $\iint |\nabla^2\phi|^2\,dxdt$ which leads to control of $\int |\nabla^2\phi|^2\,dx$ for many individual points in time. By the small data result in [41], we have a similar (uniform) space-time bound for $\phi^{(n)}$ in any compact subset $K$ of $3Q$ which gains us $\frac12$ a derivative over energy, namely
\[ \|\chi\phi^{(n)}\|_{X^{1,\frac12}_\infty} \lesssim 1, \qquad \mathrm{supp}\,\chi \subset 3Q, \tag{33} \]
where $X^{1,\frac12}_\infty$ here denotes the $\ell^\infty$ Besov version of the critical inhomogeneous $X^{s,b}$ space.

We obtain a strongly converging $H^1_{t,x}$ sequence of wave-maps through a simple frequency decomposition argument as follows. The vector field $X$ is timelike, therefore its symbol is elliptic in a region of the form $\{\tau > (1-2\delta)|\xi|\}$ with $\delta > 0$. Hence, given a cutoff function $\chi$ supported in $3Q$ and with symbol equal to 1 in $2Q$, there exists a microlocal space-time decomposition
\[ \chi = Q_{-1}(D_{x,t},x,t)X + Q_0(D_{x,t},x,t) + R(D_{x,t},x,t), \]
where the symbols $q_{-1} \in S^{-1}$ and $q_0 \in S^0$ are supported in $3Q\times\{\tau > (1-2\delta)|\xi|\}$, respectively $3Q\times\{\tau < (1-\delta)|\xi|\}$, while the remainder $R$ has symbol $r \in S^{-\infty}$ with spatial support in $3Q$. This yields a decomposition for $\chi\phi^{(n)}$,
\[ \chi\phi^{(n)} = \big(Q_0(D,x) + R(D,x)\big)\phi^{(n)} + Q_{-1}(D,x)X\phi^{(n)} = \phi^{(n)}_{\mathrm{bulk}} + R_n. \]
Due to the support properties of $q_0$, for the main term we have the bound
\[ \|\phi^{(n)}_{\mathrm{bulk}}\|_{H^{\frac32-\epsilon}_{t,x}} \lesssim \|\chi\phi^{(n)}\|_{X^{1,\frac12}_\infty} \lesssim 1. \tag{34} \]
On the other hand the remainder decays in norm,
\[ \|R_n\|_{H^1_{t,x}} \lesssim \|X\phi^{(n)}\|_{L^2(3Q)} \longrightarrow_{n\to\infty} 0. \]
Hence on a subsequence we have the strong convergence
\[ \chi\phi^{(n)} \to \phi \quad \text{in } H^1_{t,x}(3Q). \]
In addition, $\phi$ must satisfy both
\[ \phi \in H^{\frac32-\epsilon}, \qquad X\phi = 0 \ \text{in } 2Q. \]
It remains to show that $\phi$ is a wave-map in $Q$ (in fact a "strong" finite energy wave-map according to the definition of Theorem 1.1). There exists a time section $2Q_{t_0}$ close to the center of $2Q$ such that both
\[ \|\phi[t_0]\|_{(\dot H^1\times L^2)(2Q_{t_0})} < \infty, \qquad \|\phi^{(n)}[t_0] - \phi[t_0]\|_{(\dot H^s\times\dot H^{s-1})(2Q_{t_0})} \to 0, \]
for some $s < 1$. Letting $\tilde\phi$ be the solution to (1) with data $\phi[t_0]$, from the weak stability result (2) in Theorem 1.1 we have $\phi^{(n)} \to \tilde\phi$ in $H^s_x(Q_t)$ at fixed time for $s < 1$. Thus $\tilde\phi = \phi$ in $H^s_{t,x}(Q)$ which suffices.

We consider now the two cases we are interested in, namely when $X = \partial_t$ or $X = t\partial_t + x\partial_x$. If $X = \partial_t$ (as will be the case for a general time-like $X$ vector after boosting), then $\phi$ is a harmonic map
\[ \phi : Q \to M \]
and is therefore smooth (see [10]). If $X = t\partial_t + x\partial_x$ and $3Q$ is contained within the cone $\{t > |x|\}$ then $\phi$ can be interpreted as a portion of a self-similar Wave-Map, and therefore it is a harmonic map from a domain
\[ \phi : \mathbb{H}^2 \supseteq \Omega \to M \]
and is again smooth. Note that since the Harmonic-Map equation is conformally invariant, one could as well interpret this as a special case of the previous one. However, in the situation where we have similar convergence on a large number of such domains $\Omega$ that fill up $\mathbb{H}^2$, $\phi$ will be globally defined as an $\mathbb{H}^2$ harmonic map to $M$, and we will therefore be in a position to apply Theorem 4.1.
6. Proof of Theorems 1.3, 1.5

We proceed in a series of steps:

6.1. Extension and scaling in the blowup scenario. We begin with Theorem 1.3. Let $\phi$ be a wave map in $C_{(0,1]}$ with terminal energy
\[ E = \lim_{t\to0} E_{S_t}[\phi]. \]
Suppose that the energy dispersion scenario B) does not apply. Let $\epsilon > 0$ be so that B) does not hold. We can choose $\epsilon$ arbitrarily small. We will take advantage of this to construct an extension of $\phi$ outside the cone which satisfies (6), therefore violating (7) on any time interval $(0,t_0]$. For this we use energy estimates. Setting
\[ F_{t_0}[\phi] = \int_{\partial C_{(0,t_0]}} \Big(\frac14|L\phi|^2 + \frac12|\slashed{\partial}\phi|^2\Big)\,dA \]
it follows that as $t_0 \to 0$ we have
\[ F_{t_0}[\phi] = E_{S_{t_0}}[\phi] - E \to 0. \tag{35} \]
Then by pigeonholing we can choose $t_0$ arbitrarily small so that we have the bounds
\[ F_{t_0}[\phi] \ll \epsilon^8E, \qquad \int_{\partial S_{t_0}} |\slashed{\partial}\phi|^2\,ds \ll \frac{\epsilon^8}{t_0}E. \tag{36} \]
The second bound allows us to extend the initial data for $\phi$ at time $t_0$ from $S_{t_0}$ to all of $\mathbb{R}^2$ in such a way that
\[ E[\phi](t_0) - E_{S_{t_0}}[\phi] \ll \epsilon^8E. \]
We remark that by scaling it suffices to consider the case $t_0 = 1$. The second bound in (36) shows, by integration, that the range of $\phi$ restricted to $\partial S_{t_0}$ is contained in a small ball of size $\epsilon^8$ in $M$. Thus the extension problem is purely local in $M$, and can be carried out in a suitable local chart by a variety of methods.

We extend the solution $\phi$ outside the cone $C$ between times $t_0$ and $0$ by solving the wave-map equation. By energy estimates it follows that for $t \in (0,t_0]$ we have
\[ E[\phi](t) - E_{S_t}[\phi] = E[\phi](t_0) - E_{S_{t_0}}[\phi] + F_{t_0}[\phi] - F_t[\phi] \le \frac12\epsilon^8E. \tag{37} \]
Hence the energy stays small outside the cone, and by the small data result in Theorem 1.1 there is no blow-up that occurs outside the cone up to time 0. The extension we have constructed is fixed for the rest of the proof.

Since our extension satisfies (6) but B) does not hold, it follows that we can find a sequence $(t_n,x_n)$ with $t_n \to 0$ and $k_n \in \mathbb{Z}$ so that
\[ |P_{k_n}\phi(t_n,x_n)| + 2^{-k_n}|P_{k_n}\partial_t\phi(t_n,x_n)| \ge \epsilon. \tag{38} \]
The relation (35) shows that $F_{t_n} \to 0$. Hence we can find a sequence $\epsilon_n \to 0$ such that $F_{t_n} < \epsilon_n^{\frac12}E$. Thus
\[ F_{[\epsilon_nt_n,t_n]} < \epsilon_n^{\frac12}E. \tag{39} \]
Rescaling we obtain the sequence of wave maps
\[ \phi^{(n)}(t,x) = \phi(t_nt,t_nx) \]
in the increasing family of regions $\mathbb{R}^2\times[\epsilon_n,1]$ with the following properties:

a) Uniform energy size,
\[ E[\phi^{(n)}] \approx E. \tag{40} \]
b) Small energy outside the cone,
\[ E[\phi^{(n)}] - E_{S_t}[\phi^{(n)}] \lesssim \epsilon^8E. \tag{41} \]
c) Decaying flux,
\[ F_{[\epsilon_n,1]}[\phi^{(n)}] < \epsilon_n^{\frac12}E. \tag{42} \]
d) Pointwise concentration at time 1,
\[ |P_{k_n}\phi^{(n)}(1,x_n)| + 2^{-k_n}|P_{k_n}\partial_t\phi^{(n)}(1,x_n)| > \epsilon \tag{43} \]
for some $x_n \in \mathbb{R}^2$, $k_n \in \mathbb{Z}$.
6.2. Extension and scaling in the non-scattering scenario. Next we consider the case of Theorem 1.5. Again we suppose that the energy dispersion scenario B) does not apply for a finite energy wave map $\phi$ in $C_{[1,\infty)}$. Let $\epsilon > 0$ be so that B) does not hold. Setting
\[ E = \lim_{t_0\to\infty} E_{S_{t_0}}[\phi], \qquad F_{t_0}[\phi] = \int_{\partial C_{[t_0,\infty)}} \Big(\frac14|L\phi|^2 + \frac12|\slashed{\partial}\phi|^2\Big)\,dA \]
it follows that
\[ F_{t_0}[\phi] = E_{S_\infty}[\phi] - E_{S_{t_0}}[\phi] \to 0. \tag{44} \]
We choose $t_0 > 1$ so that $F_{t_0}[\phi] \le \epsilon^8E$. We obtain our extension of $\phi$ to the interval $[t_0,\infty)$ from the following lemma:

Lemma 6.1. Let $\phi$ be a finite energy wave-map in $C_{[1,\infty)}$ and $E$, $t_0$ as above. Then there exists a wave-map extension of $\phi$ to $\mathbb{R}^2\times[t_0,\infty)$ which has energy $E$.

We remark that as $t \to \infty$ all the energy of the extension moves inside the cone. It is likely that an extension with this property is unique. We do not pursue this here, as it is not needed.

Proof. By pigeonholing we can choose a sequence $t_k \to \infty$ so that we have the bounds
\[ F_{t_k}[\phi] \to 0, \qquad t_k\int_{\partial S_{t_k}} |\slashed{\partial}\phi|^2\,dA \to 0. \]
The second bound allows us to obtain an extension $\phi^{(k)}[t_k]$ of the initial data $\phi[t_k]$ for $\phi$ at time $t_k$ from the circle $S_{t_k}$ to all of $\mathbb{R}^2$ in such a way that
\[ E[\phi^{(k)}](t_k) - E_{S_{t_k}}[\phi] \to 0. \]
By rescaling, this extension problem is equivalent to the one in the case of Theorem 1.3.
We solve the wave map equation backwards from time $t_k$ to time $t_0$, with data $\phi^{(k)}[t_k]$. We obtain a wave map $\phi^{(k)}$ in the time interval $[t_0,t_k]$ which coincides with $\phi$ in $C_{[t_0,t_k]}$. The above relation shows that
\[ E[\phi^{(k)}] \to E. \tag{45} \]
By energy estimates it also follows that for large $k$ and $t \in [t_0,t_k]$ we have
\[ E[\phi^{(k)}](t) - E_{S_t}[\phi^{(k)}] = E[\phi^{(k)}](t_k) - E_{S_{t_k}}[\phi] + F_{[t,t_k]}[\phi] \lesssim \epsilon^8E. \tag{46} \]
Hence the energy stays small outside the cone, which by the small data result in Theorem 1.1 shows that no blow-up can occur outside the cone between times $t_k$ and $t_0$.

We will obtain the extension of $\phi$ as the strong limit in the energy norm of a subsequence of the $\phi^{(k)}$,
\[ \phi = \lim_{k\to\infty}\phi^{(k)} \quad \text{in } C(t_0,\infty;\dot H^1)\cap\dot C^1(t_0,\infty;L^2). \tag{47} \]
We begin with the existence of a weak limit. By uniform boundedness and the Banach-Alaoglu theorem we have weak convergence for each fixed time
\[ \phi^{(k)}[t] \rightharpoonup \phi[t] \quad \text{weakly in } \dot H^1\times L^2. \]
Within $C_{[t_0,\infty)}$ all the $\phi^{(k)}$'s coincide, so the above convergence is only relevant outside the cone. But by (46), outside the cone all $\phi^{(k)}$ have small energy. This places us in the context of the results in [41]. Precisely, the weak stability bound (2) in Theorem 1.1 and an argument similar to that used in Sect. 5 shows that the limit $\phi$ is a regular finite energy wave-map in $[t_0,\infty)\times\mathbb{R}^2$.

It remains to upgrade the convergence. On one hand, (45) and weak convergence shows that $E[\phi] \le E_{S_\infty}[\phi]$. On the other hand $E[\phi] \ge E_{S_t}[\phi]$, and the latter converges to $E_{S_\infty}[\phi]$. Thus we obtain
\[ E[\phi] = E_{S_\infty}[\phi] = \lim_{k\to\infty} E[\phi^{(k)}]. \]
From weak convergence and norm convergence we obtain strong convergence
\[ \phi^{(k)}[t] \to \phi[t] \quad \text{in } \dot H^1\times L^2. \]
The uniform convergence in (47) follows by applying the energy continuity in the small data result of Theorem 1.1 outside $C_{[t_0,\infty)}$.

By energy estimates for $\phi$ it follows that
\[ E[\phi] - E_{S_t}[\phi] = F_t[\phi] \le F_{t_0}[\phi] \le \epsilon^8E, \qquad t \in [t_0,\infty). \]
Hence the extended $\phi$ satisfies (10) so it cannot satisfy (11). By (35) we obtain the sequence $(t_n,x_n)$ and $k_n \in \mathbb{Z}$ with $t_n \to \infty$ so that (38) holds. On the other hand from the energy-flux relation we have
\[ \lim_{t_1,t_2\to\infty} F_{[t_1,t_2]}[\phi] = 0. \]
This allows us to select again $\epsilon_n \to 0$ so that (39) holds; clearly in this case we must also have $\epsilon_nt_n \to \infty$. The same rescaling as in the previous subsection leads to a sequence $\phi^{(n)}$ of wave maps which satisfy the conditions a)–d) above.
6.3. Elimination of the null concentration scenario. Due to (41), the contribution of the exterior of the cone to the pointwise bounds for $\phi^{(n)}$ is negligible. Precisely, for low frequencies we have the pointwise bound
\[ |P_k\phi^{(n)}(1,x)| + 2^{-k}|P_k\partial_t\phi^{(n)}(1,x)| \lesssim \big(2^k(1+2^k|x|)^{-N} + \epsilon^4\big)E^{\frac12}, \qquad k \le 0, \]
where the first term contains the contribution from the interior of the cone and the second is the outside contribution. On the other hand, for large frequencies we similarly obtain
\[ |P_k\phi^{(n)}(1,x)| + 2^{-k}|P_k\partial_t\phi^{(n)}(1,x)| \lesssim \big((1+2^k(|x|-1)_+)^{-N} + \epsilon^4\big)E^{\frac12}, \qquad k \ge 0. \tag{48} \]
Hence, in order for (43) to hold, $k_n$ must be large enough, $2^{k_n} > m(\epsilon,E)$, while $x_n$ cannot be too far outside the cone, $|x_n| \le 1 + 2^{-k_n}g(\epsilon,E)$. This allows us to distinguish three cases:

(i) Wide pockets of energy. This is when $2^{k_n} < C < \infty$.
(ii) Sharp time-like pockets of energy. This is when $2^{k_n} \to \infty$, $|x_n| \le \gamma < 1$.
(iii) Sharp pockets of null energy. This is when $2^{k_n} \to \infty$, $|x_n| \to 1$.

Our goal in this subsection is to eliminate the last case. Precisely, we will show that:

Lemma 6.2. There exists $M = M(\epsilon,E) > 0$ and $\gamma = \gamma(\epsilon,E) < 1$ so that for any wave map $\phi^{(n)}$ as in (40)–(43) with a small enough $\epsilon_n$ we have
\[ 2^{k_n} > M \ \Rightarrow\ |x_n| \le \gamma. \tag{49} \]
Proof. We apply the energy estimate (27), with $\epsilon$ replaced by $\epsilon_n$, to $\phi^{(n)}$ in the time interval $[\epsilon_n,1]$. This yields
\[ \int_{S_1} (1-|x|+\epsilon_n)^{-\frac12}\big(|L\phi^{(n)}|^2 + |\slashed{\partial}\phi^{(n)}|^2\big)\,dx \lesssim E. \tag{50} \]
The relation (49) would follow from the pointwise concentration bound in (43) if we can prove that for $k > 0$ the bound (50), together with (40) and (41) at time $t = 1$, imply the pointwise estimate
\[ |P_k\phi^{(n)}(1,x)| + 2^{-k}|P_k\partial_t\phi^{(n)}(1,x)| \lesssim \Big(\big((1-|x|)_+ + 2^{-k} + \epsilon_n\big)^{\frac18} + \epsilon^2\Big)E^{\frac12}. \tag{51} \]
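In view of (48) it suffices to prove this bound in the case where $\frac12 < |x| < 2$. After a rotation we can assume that $x = (x_1,0)$ with $\frac12 < x_1 < 2$. In general we have
\[ (1-|x|)_+ \le (1-x_1)_+ + x_2^2, \]
and therefore from (50) we obtain
\[ \int_{S_1} \frac{1}{(|1-x_1| + x_2^2 + \epsilon_n)^{\frac12}}\big(|L\phi^{(n)}|^2 + |\slashed{\partial}\phi^{(n)}|^2\big)\,dx \lesssim E. \tag{52} \]
At the spatial location $(1,0)$ we have $\slashed{\partial}\phi^{(n)} = \partial_2\phi^{(n)}$, so we obtain the rough bound
\[ |\partial_2\phi^{(n)}| \lesssim |\slashed{\partial}\phi^{(n)}| + (|x_2| + |1-x_1|)|\nabla_x\phi^{(n)}|. \]
Similarly,
\[ |\partial_t\phi^{(n)} - \partial_1\phi^{(n)}| \lesssim |L\phi^{(n)}| + (|x_2| + |1-x_1|)|\nabla_x\phi^{(n)}|. \]
Hence, taking into account (40) and (41), from (52) we obtain
\[ \int_{t=1} \frac{1}{((1-x_1)_+ + x_2^2 + \epsilon_n)^{\frac12} + \epsilon^8}\big(|\partial_t\phi^{(n)} - \partial_1\phi^{(n)}|^2 + |\partial_2\phi^{(n)}|^2\big)\,dx \lesssim E. \tag{53} \]
Next, given a dyadic frequency $2^k \ge 1$ we consider an angular parameter $2^{-\frac k2} \le \theta \le 1$ and we rewrite the multiplier $P_k$ in the form
\[ P_k = P^1_{k,\theta}\partial_1 + P^2_{k,\theta}\partial_2, \]
where $P^1_{k,\theta}$ and $P^2_{k,\theta}$ are multipliers with smooth symbols supported in the sets $\{|\xi| \approx 2^k,\ |\xi_2| \lesssim 2^k\theta\}$, respectively $\{|\xi| \approx 2^k,\ |\xi_2| \gtrsim 2^k\theta\}$. The size of their symbols is given by
\[ |p^1_{k,\theta}(\xi)| \lesssim 2^{-k}, \qquad |p^2_{k,\theta}(\xi)| \lesssim \frac{1}{|\xi_2|}. \]
Therefore, they satisfy the $L^2 \to L^\infty$ bounds
\[ \|P^1_{k,\theta}\|_{L^2\to L^\infty} \lesssim \theta^{\frac12}, \qquad \|P^2_{k,\theta}\|_{L^2\to L^\infty} \lesssim \theta^{-\frac12}. \tag{54} \]
In addition, the kernels of both $P^1_{k,\theta}$ and $P^2_{k,\theta}$ decay rapidly on the $2^{-k}\times\theta^{-1}2^{-k}$ scale. Thus one can add weights in (54) provided that they are slowly varying on the same scale. The weight in (52) is not necessarily slowly varying, but we can remedy this by slightly increasing the denominator to obtain the weaker bound
\[ \int_{t=1} \frac{1}{((1-x_1)_+ + x_2^2 + 2^{-k} + \epsilon_n)^{\frac12} + \epsilon^8}\big(|\partial_t\phi^{(n)} - \partial_1\phi^{(n)}|^2 + |\partial_2\phi^{(n)}|^2\big)\,dx \lesssim E. \tag{55} \]
Using (40), (55) and the weighted version of (54) we obtain
\[ |P_k\phi^{(n)}(1,x)| \le |P^1_{k,\theta}\partial_1\phi^{(n)}(1,x)| + |P^2_{k,\theta}\partial_2\phi^{(n)}(1,x)| \lesssim \Big(\theta^{\frac12} + \theta^{-\frac12}\Big(\big((1-x_1)_+ + x_2^2 + 2^{-k} + \epsilon_n\big)^{\frac14} + \epsilon^4\Big)\Big)E^{\frac12}. \]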
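We set $x_2 = 0$ and optimize with respect to $\theta \in [2^{-k/2},1]$ to obtain the desired bound (51) for $\phi^{(n)}$,
\[ |P_k\phi^{(n)}(1,x_1,0)| \lesssim \Big(\big((1-x_1)_+ + 2^{-k} + \epsilon_n\big)^{\frac18} + \epsilon^2\Big)E^{\frac12}, \qquad \frac12 < x_1 < 2. \]
A similar argument yields
\[ 2^{-k}|P_k\partial_1\phi^{(n)}(1,x_1,0)| \lesssim \Big(\big((1-x_1)_+ + 2^{-k} + \epsilon_n\big)^{\frac18} + \epsilon^2\Big)E^{\frac12}, \qquad \frac12 < x_1 < 2. \]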
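The optimization in $\theta$ is elementary: a factor of the form $\theta^{1/2} + \theta^{-1/2}W$ is minimized at $\theta = W$, with minimum $2W^{1/2}$. The following sketch confirms this on a grid; here $W$ stands for the quantity $((1-x_1)_+ + 2^{-k} + \epsilon_n)^{1/4} + \epsilon^4$, and the function names and sample values are ours, chosen only for illustration.

```python
import math

def theta_bound(theta, W):
    """theta^(1/2) + theta^(-1/2) * W, the theta-dependent factor in the bound."""
    return math.sqrt(theta) + W / math.sqrt(theta)

def optimize_theta(k, W, samples=20001):
    """Minimize theta_bound over a log-spaced grid of theta in [2^(-k/2), 1].

    Returns the pair (best theta, best value).
    """
    lo = -0.5 * k * math.log(2.0)          # log of the left endpoint 2^(-k/2)
    best = None
    for i in range(samples):
        theta = math.exp(lo * (1 - i / (samples - 1)))
        val = theta_bound(theta, W)
        if best is None or val < best[1]:
            best = (theta, val)
    return best
```

Since $W \ge 2^{-k/4} \ge 2^{-k/2}$, the minimizer lies in the admissible range whenever $W \le 1$, and $2W^{1/2}$ is bounded by twice the sum of the eighth root of the first term and $\epsilon^2$, which is the form appearing in (51).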
On the other hand, from (55) we directly obtain
\[ 2^{-k}|P_k(\partial_t - \partial_1)\phi^{(n)}(1,x_1,0)| \lesssim \Big(\big((1-x_1)_+ + 2^{-k} + \epsilon_n\big)^{\frac18} + \epsilon^2\Big)E^{\frac12}, \qquad \frac12 < x_1 < 2. \]
Combined with the previous inequality, this yields the bound in (51) for $\partial_t\phi^{(n)}$.

6.4. Nontrivial energy in a time-like cone. According to the previous step, the points $x_n$ and frequencies $k_n$ in (43) satisfy one of the following two conditions:

(i) Wide pockets of energy. This is when $c(\epsilon,E) < 2^{k_n} < C(\epsilon,E)$.
(ii) Sharp time-like pockets of energy. This is when $2^{k_n} > C(\epsilon,E)$, $|x_n| \le \gamma(\epsilon,E) < 1$.

Using only the bounds (40) and (41), we will prove that there exists $\gamma_1 = \gamma_1(\epsilon,E) < 1$ and $E_1 = E_1(\epsilon,E) > 0$ so that in both cases there is some amount of uniform time-like energy concentration,
\[ \frac12\int_{t=1,\,|x|<\gamma_1} \big(|\partial_t\phi^{(n)}|^2 + |\nabla_x\phi^{(n)}|^2\big)\,dx \ge E_1. \tag{56} \]
For convenience we drop the index $n$ in the following computations. Denote
\[ E(\gamma_1) = \frac12\int_{t=1,\,|x|<\gamma_1} \big(|\partial_t\phi|^2 + |\nabla_x\phi|^2\big)\,dx, \]
where $\gamma_1 \in (\frac{\gamma+1}{2},1)$ will be chosen later. Here $\gamma$ is from line (49) of the previous subsection. Thus at time 1 the function $\phi$ has energy $E(\gamma_1)$ in $\{|x| < \gamma_1\}$, energy $\le E$ in $\{\gamma_1 < |x| < 1\}$, and energy $\le \epsilon^8E$ outside the unit disc. Then we obtain different pointwise estimates for $P_k\phi[1]$ in two main regimes:

(a) $2^k(1-\gamma_1) < 1$. Then we obtain
\[ \|P_k\phi(1)\|_{L^\infty_x} + 2^{-k}\|P_k\partial_t\phi(1)\|_{L^\infty_x} \lesssim E(\gamma_1)^{\frac12} + \Big(\big(2^k(1-\gamma_1)\big)^{\frac12} + \epsilon^4\Big)E^{\frac12}, \]
with a further improvement if both

(a1) $2^k(1-\gamma_1) < 1 < 2^k(1-\gamma)$ and $|x| < \gamma$, namely
\[ |P_k\phi(1,x)| + 2^{-k}|P_k\partial_t\phi(1,x)| \lesssim E(\gamma_1)^{\frac12} + \Big(\big(2^k(1-\gamma_1)\big)^{\frac12}\big(2^k(1-\gamma)\big)^{-N} + \epsilon^4\Big)E^{\frac12}. \]
(b) $2^k(1-\gamma_1) \ge 1$. Then
\[ |P_k\phi(1,x)| + 2^{-k}|P_k\partial_t\phi(1,x)| \lesssim E(\gamma_1)^{\frac12} + \Big(\big(2^k(1-\gamma)\big)^{-N} + \epsilon^4\Big)E^{\frac12}, \qquad |x| < \gamma. \]
We use these estimates to bound $E(\gamma_1)$ from below. We first observe that if $1-\gamma_1$ is small enough then case (i) above implies we are in regime (a), and from (43) we obtain
\[ \epsilon \lesssim E(\gamma_1)^{\frac12} + \Big(\big(C(\epsilon,E)(1-\gamma_1)\big)^{\frac12} + \epsilon^4\Big)E^{\frac12}, \]
which gives a bound from below for $E(\gamma_1)$ if $1-\gamma_1$ and $\epsilon$ are small enough.

Consider now the remaining case (ii) above. If we are in regime (a) but not (a1), then the bound in (a) combined with (43) gives
\[ \epsilon \lesssim E(\gamma_1)^{\frac12} + \Big(\frac{(1-\gamma_1)^{\frac12}}{(1-\gamma)^{\frac12}} + \epsilon^4\Big)E^{\frac12}, \]
which suffices if $1-\gamma_1$ is small enough. If we are in regime (a1) then we obtain exactly the same inequality directly. Finally, if we are in regime (b) then we achieve an even better bound
\[ \epsilon \lesssim E(\gamma_1)^{\frac12} + \Big(\Big(\frac{1-\gamma_1}{1-\gamma}\Big)^N + \epsilon^4\Big)E^{\frac12}. \]
Thus, (56) is proved in all cases for a small enough $1-\gamma_1$.
6.5. Propagation of time-like energy concentration. Here we use the flux relation (39) to propagate the time-like energy concentration in (56) uniformly to smaller times $t \in [\epsilon_n^{\frac12},\epsilon_n^{\frac14}]$. Precisely, we show that there exists $\gamma_2 = \gamma_2(\epsilon,E) < 1$ and $E_2 = E_2(\epsilon,E) > 0$ so that
\[ \frac12\int_{|x|<\gamma_2t} \big(|\partial_t\phi^{(n)}|^2 + |\nabla_x\phi^{(n)}|^2\big)\,dx \ge E_2, \qquad t \in [\epsilon_n^{\frac12},\epsilon_n^{\frac14}]. \tag{57} \]
At the same time, we also obtain uniform weighted $L^2_{t,x}$ bounds for $X_0\phi^{(n)}$ outside smaller and smaller neighborhoods of the cone, namely
\[ \iint_{C^{\epsilon_n}_{[\epsilon_n^{1/2},\,\epsilon_n^{1/4}]}} \rho^{-1}|X_0\phi^{(n)}|^2\,dxdt \lesssim E. \tag{58} \]
The latter bound (58) is a direct consequence of (28), so we turn our attention to (57). Given a parameter $\gamma_2 = \gamma_2(\gamma_1,E_1)$, and any $t_0 \in [\epsilon_n^{\frac12},\epsilon_n^{\frac14}]$, we define $\delta_0$ and $\delta_1$ according to
\[ (1-\gamma_2)t_0 = \delta_0 \ll \delta_1 \le t_0. \]
We apply Lemma 3.4 to $\phi^{(n)}$ with this set of small constants. Optimizing the right-hand side in (29) with respect to the choice of $\delta_1$ it follows that
\[ \int_{S^{\delta_1}_1} {}^{(X_0)}P_0[\phi^{(n)}]\,dx \lesssim \int_{S^{\delta_0}_{t_0}} {}^{(X_0)}P_0[\phi^{(n)}]\,dx + |\ln(t_0/\delta_0)|^{-1}E. \]
Converting the $X_0$ momentum density into the $\partial_t$ momentum density it follows that
\[ (1-\gamma_1)^{\frac12}\int_{S^{1-\gamma_1}_1} {}^{(\partial_t)}P_0[\phi^{(n)}]\,dx \lesssim (1-\gamma_2)^{-\frac12}\int_{S^{\delta_0}_{t_0}} {}^{(\partial_t)}P_0[\phi^{(n)}]\,dx + |\ln(1-\gamma_2)|^{-1}E. \]
Hence by (56) we obtain
\[ (1-\gamma_1)^{\frac12}E_1 \lesssim (1-\gamma_2)^{-\frac12}E_{S^{\delta_0}_{t_0}}[\phi^{(n)}] + |\ln(1-\gamma_2)|^{-1}E. \]
We choose $\gamma_2$ so that
\[ |\ln(1-\gamma_2)|^{-1}E \ll (1-\gamma_1)^{\frac12}E_1. \]
Then the second right-hand side term in the previous inequality can be neglected, and for $0 < E_2 \ll (1-\gamma_1)^{\frac12}(1-\gamma_2)^{\frac12}E_1$ we obtain (57).
6.6. Final rescaling. The one bound concerning the rescaled wave maps $\phi^{(n)}$ which is not yet satisfactory is (58), where we would like to have decay in $n$ instead of uniform boundedness. This can be achieved by further subdividing the time interval $[\epsilon_n^{\frac12},\epsilon_n^{\frac14}]$. For $2 < N < \epsilon_n^{-\frac14}$ we divide the time interval $[\epsilon_n^{\frac12},\epsilon_n^{\frac14}]$ into about $|\ln\epsilon_n|/\ln N$ subintervals of the form $[t,Nt]$. By pigeonholing, there exists one such subinterval which we denote by $[t_n,Nt_n]$ so that
\[ \iint_{C^{\epsilon_n}_{[t_n,Nt_n]}} \frac{1}{\rho}|X_0\phi^{(n)}|^2\,dxdt \lesssim \frac{\ln N}{|\ln\epsilon_n|}E. \tag{59} \]
We assign to $N = N_n$ the value
\[ N_n = e^{\sqrt{|\ln\epsilon_n|}}. \]
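The effect of the choice of $N_n$ can be made concrete with a short computation. This is a sketch only; the function name is ours, and the count of subintervals is the idealized value $\ln(\epsilon_n^{1/4}/\epsilon_n^{1/2})/\ln N_n$ rather than an integer.

```python
import math

def pigeonhole_scales(eps_n):
    """For N_n = exp(sqrt|ln eps_n|), return the triple
    (N_n,
     idealized number of subintervals [t, N_n t] covering [eps_n^(1/2), eps_n^(1/4)],
     pigeonhole factor ln(N_n)/|ln eps_n|)."""
    log_eps = abs(math.log(eps_n))
    N = math.exp(math.sqrt(log_eps))
    count = (0.25 * log_eps) / math.log(N)   # ln(eps^(1/4)/eps^(1/2)) / ln N
    factor = math.log(N) / log_eps           # equals |ln eps_n|^(-1/2)
    return N, count, factor
```

As $\epsilon_n \to 0$ the length $N_n$ of the rescaled time interval tends to infinity while the factor in (59) decays like $|\ln\epsilon_n|^{-1/2}$. The constraint $2 < N_n < \epsilon_n^{-1/4}$ holds once $|\ln\epsilon_n| > 16$.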
Rescaling the wave maps $\phi^{(n)}$ from the time interval $[t_n,N_nt_n]$ to the time interval $[1,N_n]$ we obtain a final sequence of rescaled wave maps, still denoted by $\phi^{(n)}$, defined on increasing sets $C_{[1,T_n]}$, where $T_n \to \infty$, with the following properties:

a) Bounded energy,
\[ E_{S_t}[\phi^{(n)}](t) \approx E, \qquad t \in [1,T_n]. \tag{60} \]
b) Uniform amount of nontrivial time-like energy,
\[ E_{S^{(1-\gamma_2)t}_t}[\phi^{(n)}](t) \ge E_2, \qquad t \in [1,T_n]. \tag{61} \]
c) Decay to self-similar mode,
\[ \iint_{C^{\epsilon_n^{1/2}}_{[1,T_n]}} \frac{1}{\rho}|X_0\phi^{(n)}|^2\,dxdt \lesssim |\log\epsilon_n|^{-\frac12}E. \tag{62} \]
6.7. Concentration scales. We partition the set $C^1_{[1,\infty)}$ into dyadic subsets
\[ C_j = \{(t,x) \in C^1_{[1,\infty)};\ 2^j < t < 2^{j+1}\}, \qquad j \in \mathbb{N}. \]
We also consider slightly larger sets
\[ \tilde C_j = \{(t,x) \in C^{\frac12}_{[\frac12,\infty)};\ 2^j < t < 2^{j+1}\}, \qquad j \in \mathbb{N}. \]
Then we prove that

Lemma 6.3. Let $\phi^{(n)}$ be a sequence of wave maps satisfying (60), (61) and (62). Then for each $j \in \mathbb{N}$ one of the following alternatives must hold on a subsequence:

(i) Concentration of non-trivial energy. There exist points $(t_n,x_n) \in \tilde C_j$, a sequence of scales $r_n \to 0$, and some $r = r_j$ with $0 < r < \frac14$ so that the following three bounds hold:
\[ E_{B(x_n,r_n)}[\phi^{(n)}](t_n) = \frac{1}{10}E_0, \tag{63} \]
\[ E_{B(x,r_n)}[\phi^{(n)}](t_n) \le \frac{1}{10}E_0, \qquad x \in B(x_n,r), \tag{64} \]
\[ r_n^{-1}\int_{t_n-r_n/2}^{t_n+r_n/2}\int_{B(x_n,r)} |X_0\phi^{(n)}|^2\,dxdt \to 0. \tag{65} \]
(ii) Nonconcentration of uniform energy. There exists some $r = r_j$ with $0 < r < \frac14$ so that the following three bounds hold:
\[ E_{B(x,r)}[\phi^{(n)}](t) \le \frac{1}{10}E_0, \qquad \forall (t,x) \in C_j, \tag{66} \]
\[ E_{S^{(1-\gamma_2)t}_t}[\phi^{(n)}](t) \ge E_2, \qquad \text{when } B(0,(1-\gamma_2)t) \subseteq C^1_{[1,\infty)}, \tag{67} \]
\[ \iint_{C_j} |X_0\phi^{(n)}|^2\,dxdt \to 0. \tag{68} \]
Proof. The argument boils down to some straightforward pigeonholing, and is essentially identical for all $j \in \mathbb{N}$, which is now fixed throughout the proof. Given any large parameter $N \in \mathbb{N}$ we partition the time interval $[2^j,2^{j+1}]$ into about $N2^j$ equal intervals,
\[ I_k = [2^j + (k-1)/(10N),\ 2^j + k/(10N)], \qquad k = 1,\dots,10N2^j. \]
Then it suffices to show that the conclusion of the lemma holds with $C_j$ and $\tilde C_j$ replaced by
\[ C^k_j = C_j\cap(I_k\times\mathbb{R}^2), \qquad \tilde C^k_j = \tilde C_j\cap(I_k\times\mathbb{R}^2). \]
We begin by constructing a low energy barrier around $C^k_j$. To do this we partition $\tilde C^k_j\setminus C^k_j$ into $N$ sets
\[ \tilde C^{k,l}_j = \Big\{(t,x) \in \tilde C^k_j;\ \frac14 + \frac{l-1}{4N} < t - |x| < \frac14 + \frac{l}{4N}\Big\}, \qquad l = 1,\dots,N. \]
By integrating energy estimates we have
\[ \sum_{l=1}^N\int_{I_k} E_{\tilde C^{k,l}_j}[\phi^{(n)}](t)\,dt \le \int_{I_k} E_{\tilde C^k_j}[\phi^{(n)}](t)\,dt \le \frac{1}{10N}E. \]
Thus by pigeonholing, for each fixed $n$ there must exist $l_n$ so that
\[ \sum_{l=l_n-1}^{l_n+1}\int_{I_k} E_{\tilde C^{k,l}_j}[\phi^{(n)}](t)\,dt \le \frac{3}{10N^2}E, \]
and further there must be some $t_n \in I_k$ so that
\[ \sum_{l=l_n-1}^{l_n+1} E_{\tilde C^{k,l}_j}[\phi^{(n)}](t_n) \le \frac{3}{N}E. \]
For $t \in I_k$ we have $|t - t_n| < 1/(10N)$, and therefore the $t$ section of $\tilde C^{k,l_n}_j$ lies within the influence cone of the $t_n$ section of $\tilde C^{k,l_n-1}_j\cup\tilde C^{k,l_n}_j\cup\tilde C^{k,l_n+1}_j$. Hence it follows that one has the uniform bound
\[ E_{\tilde C^{k,l_n}_j}[\phi^{(n)}(t)] \le \frac{3}{N}E, \qquad t \in I_k. \]
We choose $N$ large enough so that we beat the perturbation energy
\[ \frac{3}{N}E \le \frac{1}{20}E_0. \]
Then the set $\tilde C^{k,l_n}_j$ acts as an energy barrier for $\phi^{(n)}$ within $\tilde C^k_j$, separating the evolution inside from the evolution outside with a small data region. We denote the inner region by $\tilde C^{k,\le l_n}_j$.
To measure the energy concentration in balls we define the functions f n : [0, r0 ] × Ik → R+ ,
f n (r, t) =
sup k,≤ln } {x;{t}×B(x,r )⊂C j
E B(x,r ) [(n) ](t).
The functions f n are continuous in both variables and nondecreasing with respect to r . We also define the functions inf{r ∈ [0, r0 ]; f n (t, r ) ≥ E100 }, if f n (t, r0 ) ≥ E100 ; rn (t) = rn : Ik → (0, r0 ], otherwise. r0 , which measure the lowest spatial scale on which concentration occurs at time t. Due to the finite speed of propagation it follows that the rn are Lipschitz continuous with Lipschitz constant 1, |rn (t1 ) − rn (t2 )| ≤ |t1 − t2 |. The nonconcentration estimate (66) in case ii) of the lemma corresponds to the case when all functions rn admit a common strictly positive lower bound (note that (61) and (62) give the other conclusions). It remains to consider the case when on a subsequence we have lim inf rn = 0
n→∞ Ik
and show that this yields the concentration scenario i). We denote “kinetic energy” in j by C αn2 = |X 0 (n) |2 d xdt. j C
By (62) we know that $\alpha_n \to 0$. Using $\alpha_n$ as a threshold for the concentration functions $r_n$, after passing to a subsequence we must be in one of the following three cases:

Case 1 ($r_n$ Dominates). For each $n$ we have $r_n(t) > \alpha_n$ in $I_k$. Then we let $t_n$ be the minimum point for $r_n$ in $I_k$ and set $r_n^0 = r_n(t_n) \to 0$. By definition we have $f_n(t_n, r_n(t_n)) = E_0/10$. We choose a point $x_n$ where the maximum of $f_n(t_n, r_n(t_n))$ is attained. This directly gives (63). For (64) we observe that, due to the existence of the energy barrier, $x_n$ must be at least at distance $3r_0$ from the lateral boundary of $C_j^{k,\le l_n}$. Hence, if $x \in B(x_n, r_0)$ then $x$ is at distance at least $2r_0$ from $\partial C_j^{k,\le l_n}$ and (64) follows. For (65) it suffices to know that $r_n(t_n)^{-1}\alpha_n^2 \to 0$, which is straightforward from the assumptions of this case.

Case 2 (Equality). For each $n$ there exists $t_n \in I_k$ such that $\alpha_n = r_n(t_n)$. This argument is a repeat of the previous one, given that we define $r_n^0 = r_n(t_n)$ and set up the estimate (64) around this $t_n$ as opposed to the minimum of $r_n(t)$.

Case 3 ($\alpha_n$ Dominates). For each $n$ we have $r_n(t) < \alpha_n$ in $I_k$. For $t \in I_k$ set
$$g(t) = \int_{C_j^{k,\le l_n}} |X_0 \Phi^{(n)}(t)|^2 \, dx.$$
Large Data Wave Maps
Then by definition
$$\int_{I_k} g(t)\, dt = \alpha_n^2.$$
Let $\tilde I_k$ be the middle third of $I_k$, and consider the localized averages
$$I = \int_{\tilde I_k} \frac{1}{r_n(t)} \int_{t - r_n(t)/2}^{t + r_n(t)/2} g(s)\, ds\, dt.$$
Since $r_n(t)$ is Lipschitz with Lipschitz constant 1, if $s \in [t - r_n(t)/2, t + r_n(t)/2]$ then $\frac{1}{2} r_n(s) < r_n(t) < 2 r_n(s)$ and $t \in [s - r_n(s), s + r_n(s)]$. Hence changing the order of integration in $I$ we obtain
$$I \le 2 \int_{\tilde I_k} \int_{t - r_n(t)/2}^{t + r_n(t)/2} \frac{1}{r_n(s)}\, g(s)\, ds\, dt \le 4 \int_{I_k} g(s)\, ds = 4\alpha_n^2.$$
Hence by pigeonholing there exists some $t_n \in \tilde I_k$ so that
$$\frac{1}{r_n(t_n)} \int_{t_n - r_n(t_n)/2}^{t_n + r_n(t_n)/2} g(s)\, ds \le 4\alpha_n^2 |\tilde I_k|^{-1}.$$
Then let $x_n$ be a point where the supremum in the definition of $f_n(t_n, r_n)$ is attained. The relations (63)-(66) follow as above.
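The averaging step in Case 3 can be sanity-checked numerically. The following is only a sketch under assumed data: the interval, the scale function `r` and the density `g` are hypothetical stand-ins for $I_k$, $r_n$ and $g$, chosen so that `r` is Lipschitz with constant at most 1 and all averaging windows stay inside the interval. It checks the Fubini-type bound and the pigeonhole conclusion, not the wave-map estimates themselves.

```python
import math

def integrate(f, a, b, steps):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def localized_average(g, r, t, steps=200):
    """(1 / r(t)) * integral of g over [t - r(t)/2, t + r(t)/2]."""
    half = r(t) / 2.0
    return integrate(g, t - half, t + half, steps) / r(t)

# Hypothetical data, chosen only for illustration.
I_K = (0.0, 3.0)      # stands in for the interval I_k
MIDDLE = (1.0, 2.0)   # its middle third

def r(t):
    # Lipschitz with constant 0.1 <= 1, values in [0.1, 0.3]
    return 0.2 + 0.1 * math.sin(t)

def g(s):
    # any nonnegative integrable function would do here
    return 1.0 + math.cos(5.0 * s) ** 2

# alpha_sq plays the role of alpha_n^2 = integral of g over I_k
alpha_sq = integrate(g, I_K[0], I_K[1], 3000)

# I_avg plays the role of the double integral I over the middle third
I_avg = integrate(lambda t: localized_average(g, r, t), MIDDLE[0], MIDDLE[1], 400)

# Pigeonhole: some time in the middle third has a small localized average
grid = [MIDDLE[0] + (MIDDLE[1] - MIDDLE[0]) * i / 400 for i in range(401)]
best_t = min(grid, key=lambda t: localized_average(g, r, t))
best_avg = localized_average(g, r, best_t)
```

In this toy setting `best_t` plays the role of $t_n$: the minimum of the localized averages cannot exceed their mean over the middle third, which in turn is controlled by $4\alpha_n^2$.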
6.8. The compactness argument. To conclude the proof of Theorems 1.3,1.5 we consider separately the two cases in Lemma 6.3: (i) Concentration on small scales. Suppose that the alternative (i) in Lemma 6.3 holds j . for some j ∈ N. On a subsequence we can assume that (tn , xn ) → (t0 , x0 ) ∈ C Then we define the rescaled wave maps (n) (t, x) = (n) (tn + rn t, xn + rn x) in the increasing sets B(0, r0 /rn ) × [− 21 , 21 ] They have the following properties: a) Bounded energy: E[ (n) ](t) ≤ E[],
1 1 t ∈ [− , ]. 2 2
b) Small energy in each unit ball: sup E B(x,1) [ (n) ](t) ≤ x
1 E0 , 10
1 1 t ∈ [− , ]. 2 2
c) Energy concentration in the unit ball centered at t = 0, x = 0: E B(0,1) [ (n) ](0) =
1 E0 . 10
d) Time-like energy decay: There exists a constant time-like vector X 0 (t0 , x0 ) such that for each x we have |X 0 (t0 , x0 ) (n) |2 d xdt → 0. [− 21 , 12 ]×B(x,1)
Note in part (d) we used the fact that (tn , xn ) → (t0 , x0 ). By the compactness result in Proposition 5.1 it follows that on a subsequence we have strong uniform convergence on compact sets, 1 1 1 (n) → B(0, r0 /(2rn ) × [− , ] , in Hloc 2 2 3
−
2 where ∈ Hloc is a wave-map. Thus, we have obtained a wave map defined 1 1 on all of [− 2 , 2 ] × R2 , with the additional properties that
1 E 0 ≤ E[] ≤ E[] 10 and X 0 (t0 , x0 ) = 0. Then extends uniquely to a wave map in R × R2 with the above properties (e.g. by transporting its values along the flow of X 0 (t0 , x0 )). After a Lorentz transform that takes X 0 (t0 , x0 ) to ∂t the function is turned into a nontrivial finite energy harmonic map with energy bound E[] ≤ E[]. (ii) Nonconcentration. Assume now that the alternative (ii) in Lemma 6.3 holds for every j ∈ N. There there is no need to rescale. Instead, we successively use directly 2 the compactness result in Proposition 5.1 in the interior of each set C j ∩ C[2,∞) . We obtain strong convergence on a subsequence (n) → 3
1 2 in Hloc (C[2,∞) )
−
2 2 with ∈ Hloc (C[2,∞) ). From (67) and energy bounds (e.g. (61)) we obtain
0 < E 2 ≤ sup E B(0,t−2) [](t) ≤ E[]. t 2
From (68) it follows also that X 0 = 0. By rescaling (i.e. extending via homogeneity), we may replace the interior of the 2 translated cone C[2,∞) with the interior of the full cone t > r and retain the assumptions on , in particular that it is non-trivial with finite energy up to the boundary t = r . But this contradicts Theorem 4.1, and therefore shows that scenario (i) above is in fact the only alternative.
Acknowledgements. The authors would like to thank Manos Grillakis, Sergiu Klainerman, Joachim Krieger, Matei Machedon, Igor Rodnianski, and Wilhelm Schlag for many stimulating discussions over the years regarding the wave-map problem. We would also especially like to thank Terry Tao for several key discussions on the nature of induction-on-energy type proofs. Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
References 1. Bahouri, H., Gérard, P.: High frequency approximation of solutions to critical nonlinear wave equations. Amer. J. Math. 121(1), 131–175 (1999) 2. Choquet-Bruhat, Y.: General relativity and the Einstein equations. Oxford Mathematical Monographs. Oxford: Oxford University Press, 2009 3. Christodoulou, D., Tahvildar-Zadeh, A.S.: On the asymptotic behavior of spherically symmetric wave maps. Duke Math. J. 71(1), 31–69 (1993) 4. Christodoulou, D., Tahvildar-Zadeh, A.S.: On the regularity of spherically symmetric wave maps. Comm. Pure Appl. Math. 46(7), 1041–1091 (1993) 5. Cote, R., Kenig, C.E., Merle, F.: Scattering below critical energy for the radial 4d Yang-Mills equation and for the 2d corotational wave map system. http://arxiv.org/abs/0709.3222v1[math.AP], 2007 6. Gallot, S., Hulin, D., Lafontaine, J.: Riemannian geometry. Universitext. Berlin: Springer-Verlag, Second edition, 1990 7. Grillakis, M.G.: On the wave map problem. In: Nonlinear wave equations (Providence, RI, 1998), Volume 263 of Contemp. Math., Providence, RI: Amer. Math. Soc., 2000, pp. 71–84 8. Gromov, M.L.: Isometric imbeddings and immersions. Dokl. Akad. Nauk SSSR 192, 1206–1209 (1970) 9. Günther, M.: Isometric embeddings of Riemannian manifolds. In: Proceedings of the International Congress of Mathematicians, Vol. I, II (Kyoto, 1990), Tokyo: Math. Soc. Japan, 1991, pp. 1137–1143 10. Hélein, F.: Harmonic maps, conservation laws and moving frames. Volume 150 of Cambridge Tracts in Mathematics. Cambridge: Cambridge University Press, second edition, 2002 11. Jost, J.: Riemannian geometry and geometric analysis. Universitext. Berlin: Springer-Verlag, 1995 12. Kenig, C.E., Merle, F.: Global well-posedness, scattering and blow-up for the energy-critical focusing non-linear wave equation. Acta Math. 201(2), 147–212 (2008) 13. Klainerman, S., Machedon, M.: Space-time estimates for null forms and the local existence theorem. Comm. Pure Appl. Math. 46(9), 1221–1268 (1993) 14. 
Klainerman, S., Machedon, M.: Smoothing estimates for null forms and applications. Duke Math. J. 81(1), 99–133 (1996) 15. Krieger, J.: Global regularity and singularity development for wave maps. In: Surveys in differential geometry. Vol. XII. Geometric flows, Volume 12 of Surv. Differ. Geom., Somerville, MA: Int. Press, 2008, pp. 167–201 16. Krieger, J., Schlag, W., Tataru, D.: Renormalization and blow up for charge one equivariant critical wave maps. Invent. Math. 171(3), 543–615 (2008) 17. Krieger, J.: Global regularity of wave maps from R2+1 to H 2 . Small energy. Commun. Math. Phys. 250(3), 507–580 (2004) 18. Krieger, J., Schlag, W.: Concentration compactness for critical wave-maps. http://arxiv.org/abs/0908. 2474[math.AP], 2009 19. Lemaire, L.: Applications harmoniques de surfaces riemanniennes. J. Diff. Geom. 13(1), 51–78 (1978) 20. Nash, J.: The imbedding problem for Riemannian manifolds. Ann. of Math. (2) 63, 20–63 (1956) 21. Qing, J.: Boundary regularity of weakly harmonic maps from surfaces. J. Funct. Anal. 114(2), 458– 466 (1993) 22. Qing, J., Tian, G.: (1993) Bubbling of the heat flows for harmonic maps from surfaces. Comm. Pure Appl. Math. 50(4), 295–310 (1997) 23. Rodnianski, I., Sterbenz, J.: On the formation of singularities in the critical O(3) sigma-model. http://arxiv.org/abs/math/0605023v3[math.AP], 2008 24. Schoen, R., Yau, S.T.: Lectures on harmonic maps. Conference Proceedings and Lecture Notes in Geometry and Topology, II. Cambridge, MA: International Press, 1997 25. Shatah, J., Struwe, M.: Geometric wave equations, Volume 2 of Courant Lecture Notes in Mathematics. New York: New York University Courant Institute of Mathematical Sciences, 1998 26. Shatah, J., Tahvildar-Zadeh, A.S.: On the Cauchy problem for equivariant wave maps. Comm. Pure Appl. Math. 47(5), 719–754 (1994)
27. Sterbenz, J., Tataru, D.: Energy dispersed large data wave maps in 2 + 1 dimensions. http://arxiv.org/abs/0906.3384v1[math.AP], 2009. doi:10.1007/s00220-010-1061-4 28. Struwe, M.: On the evolution of harmonic mappings of Riemannian surfaces. Comment. Math. Helv. 60(4), 558–581 (1985) 29. Struwe, M.: Equivariant wave maps in two space dimensions. Comm. Pure Appl. Math. 56(7), 815–823 (2003) 30. Struwe, M.: Radially symmetric wave maps from (1 + 2)-dimensional Minkowski space to general targets. Calc. Var. Part. Diff. Eqs. 16(4), 431–437 (2003) 31. Tao, T.: Global regularity of wave maps III. Large energy from R2+1 to hyperbolic spaces. http://arxiv.org/abs/0805.4666v3[math.AP], 2009 32. Tao, T.: Global regularity of wave maps IV. Absence of stationary or self-similar solutions in the energy class. http://arxiv.org/abs/0806.3592v2[math.AP], 2009 33. Tao, T.: Global regularity of wave maps V. Large data local well-posedness in the energy class. http://arxiv.org/abs/0808.0368v2[math.AP], 2009 34. Tao, T.: Global regularity of wave maps VI. Minimal energy blowup solutions. http://arxiv.org/abs/0906.2833v2[math.AP], 2009 35. Tao, T.: An inverse theorem for the bilinear L 2 Strichartz estimate for the wave equation. http://arxiv.org/abs/0904.2880v1[math.NT], 2009 36. Tao, T.: Global regularity of wave maps. II. Small energy in two dimensions. Commun. Math. Phys. 224(2), 443–544 (2001) 37. Tao, T.: Geometric renormalization of large energy wave maps. In: Journées “Équations aux Dérivées Partielles”, pages Exp. No. XI, 32. École Polytech., Palaiseau, 2004 38. Tao, T.: Nonlinear dispersive equations. Volume 106 of CBMS Regional Conference Series in Mathematics. Washington, DC: Conference Board of the Mathematical Sciences, 2006 39. Tataru, D.: On global existence and scattering for the wave maps equation. Amer. J. Math. 123(1), 37–77 (2001) 40. Tataru, D.: The wave maps equation. Bull. Amer. Math. Soc. (N.S.) 41(2), 185–204 (electronic), (2004) 41.
Tataru, D.: Rough solutions for the wave maps equation. Amer. J. Math. 127(2), 293–377 (2005) 42. Topping, P.: An example of a nontrivial bubble tree in the harmonic map heat flow. In: Harmonic morphisms, harmonic maps, and related topics (Brest, 1997), Volume 413 of Chapman & Hall/CRC Res. Notes Math. Boca Raton, FL.: Chapman & Hall/CRC, 2000, pp. 185–191 Communicated by P. Constantin
Commun. Math. Phys. 298, 265–292 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-0999-6
Communications in
Mathematical Physics
Generalized Twisted Modules Associated to General Automorphisms of a Vertex Operator Algebra Yi-Zhi Huang Department of Mathematics, Rutgers University, 110 Frelinghuysen Rd., Piscataway, NJ 08854-8019, USA. E-mail: [email protected] Received: 29 July 2009 / Accepted: 5 November 2009 Published online: 9 February 2010 – © Springer-Verlag 2010
Abstract: We introduce a notion of a strongly C×-graded, or equivalently, C/Z-graded generalized g-twisted V-module associated to an automorphism g, not necessarily of finite order, of a vertex operator algebra. We also introduce a notion of a strongly C-graded generalized g-twisted V-module if V admits an additional C-grading compatible with g. Let $V = \coprod_{n\in\mathbb{Z}} V_{(n)}$ be a vertex operator algebra such that $V_{(0)} = \mathbb{C}\mathbf{1}$ and $V_{(n)} = 0$ for $n < 0$ and let u be an element of V of weight 1 such that L(1)u = 0. Then the exponential of $2\pi\sqrt{-1}\,\mathrm{Res}_x Y(u, x)$ is an automorphism $g_u$ of V. In this case, a strongly C-graded generalized $g_u$-twisted V-module is constructed from a strongly C-graded generalized V-module with a compatible action of $g_u$ by modifying the vertex operator map for the generalized V-module using the exponential of the negative-power part of the vertex operator Y(u, x). In particular, we give examples of such generalized twisted modules associated to the exponentials of some screening operators on certain vertex operator algebras related to the triplet W-algebras. An important feature is that we have to work with generalized (twisted) V-modules which are doubly graded by the group C/Z or C and by generalized eigenspaces (not just eigenspaces) for L(0), and the twisted vertex operators in general involve the logarithm of the formal variable.

1. Introduction

The present paper is the first in a series of papers developing systematically a general theory of twisted modules for vertex operator algebras. This theory is in fact equivalent to the study of orbifold theories in conformal field theory. Though twisted modules associated to finite-order automorphisms of vertex operator algebra have been introduced and studied (see below for more discussions and references), curiously, twisted modules associated to infinite-order automorphisms of vertex operator algebras have not been previously formulated and studied mathematically.
To develop our general theory of twisted representations of vertex operator algebras, it is necessary to first find the correct category of twisted modules associated to both finite- and infinite-order automorphisms
of a vertex operator algebra, and to study and construct such twisted modules. In this paper, we introduce a notion of a strongly C×-, or equivalently, C/Z-graded generalized g-twisted module associated to an automorphism of a vertex operator algebra and also a notion of a strongly C-graded generalized g-twisted V-module when V has an additional C-grading compatible with g. We also construct examples. The category of such generalized twisted modules will be the category of twisted modules that we shall study in this series of papers. Twisted modules for vertex operator algebras were introduced in the construction of the moonshine module vertex operator algebra $V^\natural$ by I. Frenkel, J. Lepowsky and A. Meurman [FLM1,FLM2 and FLM3]. Given a vertex operator algebra V and an automorphism g of V of finite order, the notion of g-twisted V-module is a natural but very subtle generalization of the notion of V-module such that the action of g is incorporated. The main axiom in the definition of the notion of g-twisted V-module is the twisted Jacobi identity of the type proved in [Le1,FLM2,Le2 and FLM3]. In fact, $V^\natural$ was the first example of an orbifold theory, before orbifold theories were studied in physics. Orbifold theories are important examples of conformal field theories obtained from known conformal field theories and their automorphisms (see [DHVW1,DHVW2,H, DFMS,HV,NSV,M,DGH,DVVV,DGM], as well as, for example [KS,FKS,Ba1,Ba2, BHS,dBHO,HO,GHHO,Ba3 and HH]). Mathematically, the study of orbifold theories is equivalent to the theory of twisted representations of vertex operator algebras, in the sense that any result or conjecture on orbifold conformal field theories can be formulated precisely as a result or conjecture in the theory of twisted representations of vertex operator algebras.
There have been a number of papers constructing and studying twisted modules associated to finite-order automorphisms of vertex operator algebras (see for example, [Le1, FLM2,Le2,FLM3,D,DL,DonLM1,DonLM2,Li,BDM,DoyLM1,DoyLM2,BHL], and the references in these papers, especially in the last three papers). However, as far as the author knows, the precise notion of g-twisted V-module for an infinite-order automorphism g has not been previously formulated, and such twisted modules have not been previously constructed and studied. The main difficulty in formulating such a general notion of a g-twisted V-module is that the right-hand side
$$\frac{1}{k} \sum_{j\in\mathbb{Z}/k\mathbb{Z}} x_2^{-1}\, \delta\!\left(\eta^j \frac{(x_1 - x_0)^{1/k}}{x_2^{1/k}}\right) Y^g(Y(g^j u, x_0)v, x_2)$$
of the twisted Jacobi identity (see [Le1,FLM2,Le2 and FLM3]; cf. [DL]) and the form of the expression $(x_2 + x_0)^N Y^g(Y(u, x_0)v, x_2)$ needed in the weak associativity relation (see [Li]) cannot be generalized to the general case. In the present paper, we first reformulate the notion of g-twisted V-module in the case of finite order g using a duality property (incorporating both commutativity and associativity) formulated in terms of complex variables and then generalize this equivalent definition to the general case. The precise generalization involves more subtle issues which fortunately have been treated in the logarithmic tensor product theory developed by Lepowsky, Zhang and the author in [HLZ1 and HLZ2]. First, in the general case, we have to consider a twisted generalization of the notion of generalized V-module in the sense of [HLZ1 and HLZ2] instead of the notion of V-module. Second, the twisted
vertex operators in general involve the logarithm of the formal variable. Third, we have to consider an additional grading by C× (equivalently by C/Z) or by C on such a twisted generalization to introduce a notion of a strongly C/Z- or C-graded generalized g-twisted V-module in the spirit of the notion of strongly $\tilde{A}$-graded generalized V-module in [HLZ1 and HLZ2] for a suitable abelian group $\tilde{A}$. It is important to note that in the general case, all these three new ingredients are necessary. In particular, many of the general results obtained in [HLZ1 and HLZ2] are needed for the study of strongly C/Z- or C-graded generalized g-twisted V-modules introduced and constructed in the present paper. Even when g is a finite-order automorphism of V, how to explicitly construct a nontrivial g-twisted V-module is still an open problem in general. The problem of constructing a strongly C/Z- or C-graded generalized g-twisted V-module for a general automorphism of V is certainly more difficult. However, when we consider certain special types of automorphisms of V, we might still be able to construct such generalized twisted V-modules associated to these automorphisms even though the general problem is not solved; indeed, in the finite-order case, the construction of certain twisted modules for lattice vertex operator algebras was given by Frenkel, Lepowsky and Meurman as one of the many hard steps in the construction of the moonshine module in the work [FLM3] and has been used to solve many other significant problems. Given $u \in V_{(1)}$ such that L(1)u = 0, we show that $g_u = e^{2\pi\sqrt{-1}\,\mathrm{Res}_x Y(u,x)}$ is an automorphism of V. In general, its order might not be finite.
In the present paper, generalizing a construction by Li [Li] in the case that $\mathrm{Res}_x Y(u, x)$ acts on V semisimply and has eigenvalues belonging to $\frac{1}{k}\mathbb{Z}$, we construct a strongly C-graded generalized $g_u$-twisted V-module for $u \in V_{(1)}$ from a strongly C-graded generalized V-module with an action of $g_u$ such that the C-grading is given by the generalized eigenspaces of the action of $\mathrm{Res}_x Y(u, x)$. We modify the vertex operator map of the generalized V-module using the exponentials of the negative-power part of the vertex operator Y(u, x). We study such exponentials for u not necessarily of weight 1 and apply the result to the case of $u \in V_{(1)}$ later in the construction. Our results show the importance of studying exponentials of vertex operators. As explicit examples, we apply this construction to certain vertex operator algebras constructed from certain lattice vertex operator algebras and automorphisms obtained by exponentiating suitable screening operators. The fixed-point subalgebras of these vertex operator algebras under the group generated by these automorphisms are in fact the triplet W-algebras which in recent years have attracted a lot of attention from physicists and mathematicians. These algebras were introduced first by Kausch [Ka1] and have been studied extensively by Flohr [Fl1,Fl2], Gaberdiel-Kausch [GK1,GK2], Kausch [Ka2], Fuchs-Hwang-Semikhatov-Tipunin [FHST], Abe [A], Feigin-Gaĭnutdinov-Semikhatov-Tipunin [FGST1,FGST2,FGST3], Carqueville-Flohr [CF], Flohr-Gaberdiel [FG], Fuchs [Fu], Adamović-Milas [AM1,AM2,AM3], Flohr-Grabow-Koehn [FGK], Flohr-Knuth [FK], Gaberdiel-Runkel [GR1,GR2], Gaĭnutdinov-Tipunin [GT], Pearce-Rasmussen-Ruelle [PRR1,PRR2], Nagatomo-Tsuchiya [NT] and Rasmussen [R]. Our results show, in particular, that in this special case, the intertwining operators for suitable fixed-point subalgebras given by Adamović and Milas in Theorem 9.1 in [AM3] are in fact twisted vertex operators.
This connection between suitable intertwining operators and twisted vertex operators shows that the triplet logarithmic conformal field theories are closely related to orbifold theories. Note that up to now, the triplet W -algebras are the main known examples of vertex operator algebras for logarithmic conformal field theories. We expect that our theory will provide an orbifold-theoretic approach to the
triplet logarithmic conformal field theories and will give more interesting examples of logarithmic conformal field theories. There are certainly many examples of vertex operator algebras which have automorphisms of infinite orders. For example, the automorphism groups of the vertex operator algebras for the Wess-Zumino-Novikov-Witten models contain the automorphism groups of the finite-dimensional Lie algebras that one starts with. There are also spectral flow automorphisms of the vertex operator algebras for the Wess-ZuminoNovikov-Witten models and of superconformal vertex operator algebras. Many of these automorphisms are of infinite orders and do not act semisimply. In the case that these automorphisms are the exponentials of weight 1 elements of the vertex operator algebras, strongly C-graded generalized twisted V -modules associated to these automorphisms are constructed by applying the results of the present paper. The present paper is organized as follows: In Sect. 2, we reformulate the notion of g-twisted V -module in the case of finite-order g using complex variables and duality properties. Then we introduce the notions of a strongly C/Z- and C-graded generalized g-twisted V -module in the general case in Sect. 3. In Sect. 4, we study the exponentials of the negative-power parts of vertex operators. The construction of strongly C-graded generalized gu -twisted V -modules for u ∈ V(1) is given in Sect. 5. The explicit examples related to the triplet W -algebras are also given in this section. 2. Twisted Modules for Finite-Order Automorphisms of a Vertex Operator Algebra The notion of g-twisted module associated to a finite order automorphism g of V is formulated by collecting the properties of twisted vertex operators obtained in [Le1,FLM2, Le2 and FLM3]; in particular, it uses a twisted Jacobi identity (see (2.1) below) as the main axiom. See [FLM3,D and DL]. 
For an infinite-order automorphism g of V, we cannot write down a generalization of the twisted Jacobi identity in the case of finite-order automorphisms. The twisted Jacobi identity can also be replaced by the weak commutativity and (twisted) weak associativity in the case of finite-order automorphisms (see (2.3) below). But the side of the weak associativity containing the iterate of the twisted vertex operator map cannot be generalized to the case of a (necessarily infinite-order) automorphism which acts nonsemisimply on the vertex operator algebra or the twisted module to be defined. These difficulties force us to use the complex variable approach. To motivate our notion of g-twisted module for a general automorphism g of V, we reformulate the definition of g-twisted module in terms of complex variables in this section. Let (V, Y, 1, ω) be a vertex operator algebra and g an automorphism of V of order $k \in \mathbb{Z}_+$. Then $V = \coprod_{j\in\mathbb{Z}/k\mathbb{Z}} V^j$, where $V^j = \{v \in V \mid gv = \eta^j v\}$ for $j \in \mathbb{Z}/k\mathbb{Z}$, where $\eta = e^{2\pi\sqrt{-1}/k}$. We first recall the notion of g-twisted V-module.
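The eigenspace decomposition of V under a finite-order automorphism can be illustrated in a finite-dimensional toy model. The sketch below is an assumption-laden illustration, not a construction from the paper: `cyclic_shift` is a hypothetical order-k linear map on C^k standing in for g, and `eigen_component` computes the standard projection $P_j v = \frac{1}{k}\sum_{m=0}^{k-1} \eta^{-jm} g^m v$ onto the $\eta^j$-eigenspace.

```python
import cmath
import math

def cyclic_shift(v):
    """Toy linear map of order len(v): moves the last entry to the front."""
    return v[-1:] + v[:-1]

def eigen_component(g, v, k, j):
    """Projection of v onto the eta^j-eigenspace of a map g with g^k = 1.

    Uses P_j v = (1/k) * sum_{m=0}^{k-1} eta^(-j*m) * g^m(v),
    where eta = exp(2*pi*i/k).
    """
    eta = cmath.exp(complex(0.0, 2.0 * math.pi / k))
    out = [0j] * len(v)
    w = list(v)                      # w holds g^m(v) at step m
    for m in range(k):
        coeff = eta ** (-j * m) / k
        out = [o + coeff * x for o, x in zip(out, w)]
        w = g(w)
    return out
```

Applying `eigen_component` for j = 0, ..., k-1 splits a vector into pieces on which the map acts by the scalars $\eta^j$, mirroring $V = \coprod_j V^j$; the pieces sum back to the original vector.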
Definition 2.1. A g-twisted V-module W is a C-graded vector space $W = \coprod_{n\in\mathbb{C}} W_{(n)}$ equipped with a linear map
$$Y^g : V \otimes W \to W[[x^{1/k}, x^{-1/k}]], \qquad v \otimes w \mapsto Y^g(v, x)w = \sum_{n\in\frac{1}{k}\mathbb{Z}} Y^g_n(v)w\, x^{-n-1},$$
satisfying the following conditions:

1. The grading-restriction condition: For each $n \in \mathbb{C}$, $\dim W_{(n)} < \infty$ and $W_{(n+l/k)} = 0$ for all sufficiently negative integers $l$.
2. The formal monodromy condition: For $j \in \mathbb{Z}/k\mathbb{Z}$ and $v \in V^j$, $Y^g(v, x) = \sum_{n\in j/k+\mathbb{Z}} Y^g_n(v)x^{-n-1}$.
3. The lower-truncation condition: For $v \in V$ and $w \in W$, $Y^g_n(v)w = 0$ for $n$ sufficiently large.
4. The identity property: $Y^g(\mathbf{1}, x) = 1$.
5. The twisted Jacobi identity: For $u, v \in V$ and $w \in W$,
$$x_0^{-1}\delta\!\left(\frac{x_1 - x_2}{x_0}\right) Y^g(u, x_1)Y^g(v, x_2) - x_0^{-1}\delta\!\left(\frac{x_2 - x_1}{-x_0}\right) Y^g(v, x_2)Y^g(u, x_1) = \frac{1}{k}\sum_{j\in\mathbb{Z}/k\mathbb{Z}} x_2^{-1}\delta\!\left(\eta^j\frac{(x_1 - x_0)^{1/k}}{x_2^{1/k}}\right) Y^g(Y(g^j u, x_0)v, x_2). \qquad (2.1)$$
6. The $L^g(0)$-grading condition: For $n \in \mathbb{Z}$, let $L^g(n) = Y^g_{n+1}(\omega)$, that is, $Y^g(\omega, x) = \sum_{n\in\mathbb{Z}} L^g(n)x^{-n-2}$. Then $L^g(0)w = mw$ for $w \in W_{(m)}$.
7. The $L^g(-1)$-derivative condition: For $v \in V$,
$$\frac{d}{dx} Y^g(v, x) = Y^g(L(-1)v, x).$$

We denote a g-twisted V-module defined above by $(W, Y^g)$ (or briefly, by W). Note that in the definition above, we use $Y^g_n(u)$ instead of $u^g_n$ for $n \in \frac{1}{k}\mathbb{Z}$ to denote the components of the vertex operator $Y^g(u, x)$. In the present paper, we shall always use such notations to denote the components of a vertex operator. For example, for our vertex operator algebra (V, Y, 1, ω), we shall use $Y_n(u)$ instead of $u_n$ for $n \in \mathbb{Z}$ to denote the components of the vertex operator Y(u, x). Similarly, for a V-module $(W, Y_W)$, we shall use $(Y_W)_n(u)$ instead of $u_n$ for $n \in \mathbb{Z}$ to denote the components of the vertex operator $Y_W(u, x)$. This definition of g-twisted V-module is a natural generalization of the definition of V-module given in [Bo,FLM3 and FHL]. The main axiom is the twisted Jacobi identity. The following result is proved by Li [Li]:

Proposition 2.2. The twisted Jacobi identity in Definition 2.1 can be replaced by the following two properties:

1. The weak commutativity or locality: For $u, v \in V$, there exists $N \in \mathbb{Z}_+$ such that
$$(x_1 - x_2)^N Y^g(u, x_1)Y^g(v, x_2) = (x_1 - x_2)^N Y^g(v, x_2)Y^g(u, x_1). \qquad (2.2)$$
2. The weak associativity: For $u \in V^j$ and $w \in W$, there exists $M \in \mathbb{Z}_+$ such that
$$(x_2 + x_0)^{\tilde{j}/k+M}\, Y^g(Y(u, x_0)v, x_2) = (x_0 + x_2)^{\tilde{j}/k+M}\, Y^g(u, x_0 + x_2)Y^g(v, x_2) \qquad (2.3)$$
for $v \in V$, where $\tilde{j} \in j$ satisfies $0 \le \tilde{j} < k$.
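As a consistency check on the twisted Jacobi identity (2.1), one can specialize to k = 1. This is a sketch of the standard reduction rather than a statement taken from the text: for k = 1 the automorphism is the identity, η = 1, the sum over Z/kZ has a single term, and writing $Y_W$ for $Y^g$ one recovers the usual Jacobi identity for an untwisted V-module.

```latex
% Specialization of (2.1) to k = 1 (g = 1_V, \eta = 1, \mathbb{Z}/k\mathbb{Z} = \{0\}, Y^g = Y_W):
x_0^{-1}\delta\!\left(\frac{x_1-x_2}{x_0}\right) Y_W(u,x_1)\,Y_W(v,x_2)
 - x_0^{-1}\delta\!\left(\frac{x_2-x_1}{-x_0}\right) Y_W(v,x_2)\,Y_W(u,x_1)
 = x_2^{-1}\delta\!\left(\frac{x_1-x_0}{x_2}\right) Y_W(Y(u,x_0)v,\,x_2).
```

In the same specialization the exponent $\tilde{j}/k + M$ in (2.3) becomes the integer M, which is the familiar form of weak associativity.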
One advantage of the definition, the weak commutativity and the weak associativity above is that they use only formal variables so that one can discuss the multivaluedness of the correlation functions algebraically. But as we mentioned above, this formulation of g-twisted V-module for g of finite order cannot be generalized to the case that g is of infinite order. So we first reformulate this definition using complex variables and a duality property. We need a complex variable formulation of vertex operator maps and some notations first. For any $z \in \mathbb{C}^\times$, we shall use $\log z$ to denote $\log|z| + \sqrt{-1}\arg z$, where $0 \le \arg z < 2\pi$. In general, we shall use $l_k(z)$ to denote $\log z + 2k\sqrt{-1}\pi$ for $k \in \mathbb{Z}$. Let $W_1$, $W_2$ and $W_3$ be C-graded vector spaces. The C-gradings on $W_1$ and $W_2$ induce a C-grading on $W_1 \otimes W_2$. For a formal variable $x^{1/k}$, we define its degree to be $-1/k$. Then the grading on $W_3$ and the degree of x together give $W_3[[x^{1/k}, x^{-1/k}]]$ a C-graded vector space structure. Let
$$X : W_1 \otimes W_2 \to W_3[[x^{1/k}, x^{-1/k}]], \qquad w_1 \otimes w_2 \mapsto X(w_1, x)w_2 = \sum_{n\in\frac{1}{k}\mathbb{Z}} X_n(w_1)w_2\, x^{-n-1}$$
be a linear map preserving the gradings. For any $w_1 \in W_1$ and $w_2 \in W_2$, we define $X^p(w_1, z)w_2$ for $p \in \mathbb{Z}/k\mathbb{Z}$ to be the elements
$$X(w_1, x)w_2\big|_{x^n = e^{n l_{\tilde{p}}(z)}}$$
of $\overline{W}_3 = \prod_{n\in\mathbb{C}} (W_3)_{(n)}$, where for $p \in \mathbb{Z}/k\mathbb{Z}$, $\tilde{p}$ is the integer in p satisfying $0 \le \tilde{p} < k$. When p = 0, for simplicity, we shall denote $X^0(w_1, z)w_2$ simply as $X(w_1, z)w_2$. For $z \in \mathbb{C}^\times$, we have the linear maps
$$X^p(\cdot, z)\cdot : W_1 \otimes W_2 \to \overline{W}_3, \qquad w_1 \otimes w_2 \mapsto X^p(w_1, z)w_2, \qquad (2.4)$$
and we shall denote $X^0(\cdot, z)\cdot$ simply as $X(\cdot, z)\cdot$. For any $w_3' \in W_3'$, $\langle w_3', X^p(w_1, z)w_2\rangle$ are branches of a multivalued analytic function defined on $\mathbb{C}^\times$. We can view this multivalued analytic function on $\mathbb{C}^\times$ as a single-valued analytic function defined on a k-fold covering space of $\mathbb{C}^\times$. For $p \in \mathbb{Z}/k\mathbb{Z}$, we have a map
$$X^p : \mathbb{C}^\times \to \mathrm{Hom}(W_1 \otimes W_2, \overline{W}_3), \qquad z \mapsto X^p(\cdot, z)\cdot.$$
We shall call $X^p$ the p-th analytic branch of the map X. In particular, for the twisted vertex operator map $Y^g$, we have its p-th analytic branch $Y^{g;p}$. In terms of these analytic branches, we have:

Proposition 2.3. The formal monodromy condition in Definition 2.1 can be replaced by the following condition: For $p \in \mathbb{Z}/k\mathbb{Z}$, $z \in \mathbb{C}^\times$ and $v \in V$,
$$Y^{g;p+1}(gv, z) = Y^{g;p}(v, z). \qquad (2.5)$$
Proof. Let W be a g-twisted V-module satisfying Definition 2.1. Then for $v \in V^j$, we have $gv = \eta^j v \in V^j$. It is easy to see that this fact and the formal monodromy condition give $Y^{g;p+1}(gv, z) = Y^{g;p}(v, z)$ for $p \in \mathbb{Z}/k\mathbb{Z}$, $z \in \mathbb{C}^\times$ and $v \in V^j$. Conversely, assume that W satisfies the modification of Definition 2.1 by replacing the formal monodromy condition with the condition in the proposition. Then it is easy to see that the fact $gv = \eta^j v \in V^j$ for $v \in V^j$ and the condition in the proposition, together with $l_p(z) = l_{p+1}(z) - 2\pi\sqrt{-1}$, give
$$\sum_{n\in\frac{1}{k}\mathbb{Z}} Y^g_n(v)\,\eta^j e^{(-n-1)l_{p+1}(z)} = \sum_{n\in\frac{1}{k}\mathbb{Z}} Y^g_n(v)\, e^{2\pi(n+1)\sqrt{-1}}\, e^{(-n-1)l_{p+1}(z)}$$
for $j \in \mathbb{Z}/k\mathbb{Z}$ and $v \in V^j$. So we have
$$\sum_{n\in\frac{1}{k}\mathbb{Z}} Y^g_n(v)\left(\eta^j - e^{2\pi(n+1)\sqrt{-1}}\right) e^{(-n-1)l_{p+1}(z)} = 0,$$
which implies that
$$Y^g_n(v)\left(\eta^j - e^{2\pi(n+1)\sqrt{-1}}\right) = 0$$
for $n \in \frac{1}{k}\mathbb{Z}$. So we have either $Y^g_n(v) = 0$ or $\eta^j = e^{2\pi(n+1)\sqrt{-1}}$, that is, either $Y^g_n(v) = 0$ or $n \in j/k + \mathbb{Z}$. So we obtain $Y^g(v, x) = \sum_{n\in j/k+\mathbb{Z}} Y^g_n(v)x^{-n-1}$.
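The mechanism behind the equivariance condition (2.5) can be checked numerically on a single monomial. The sketch below is illustrative only; the function names are hypothetical. It implements $l_p(z) = \log z + 2p\sqrt{-1}\pi$ with $0 \le \arg z < 2\pi$, substitutes $x^n = e^{n l_p(z)}$, and compares $\eta^j z^{-n-1}$ on branch p+1 with $z^{-n-1}$ on branch p. The two agree when $n \in j/k + \mathbb{Z}$ and differ otherwise.

```python
import cmath
import math

def log_branch(z, p=0):
    """l_p(z) = log|z| + i*arg(z) + 2*pi*i*p, with 0 <= arg(z) < 2*pi."""
    arg = cmath.phase(z) % (2.0 * math.pi)   # move the argument into [0, 2*pi)
    return complex(math.log(abs(z)), arg + 2.0 * math.pi * p)

def branch_power(z, n, p):
    """Value of x^n under the substitution x^n = exp(n * l_p(z))."""
    return cmath.exp(n * log_branch(z, p))

def equivariance_defect(z, k, j, n, p):
    """|eta^j * z^(-n-1) on branch p+1  -  z^(-n-1) on branch p|.

    Here eta = exp(2*pi*i/k); j labels the eigenvalue eta^j of g,
    and n is the mode index of the monomial x^(-n-1).
    """
    eta = cmath.exp(complex(0.0, 2.0 * math.pi / k))
    lhs = eta ** j * branch_power(z, -n - 1, p + 1)
    rhs = branch_power(z, -n - 1, p)
    return abs(lhs - rhs)
```

For example, with k = 3 and j = 2, a mode index such as n = 2/3 - 2 lies in j/k + Z and gives a defect at the level of rounding error, while n = 1/3 - 2 does not.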
We shall call (2.5) in Proposition 2.3 the equivariance property. For $n \in \mathbb{C}$, let $\pi_n : W \to W_{(n)}$ be the projection. We have:

Theorem 2.4. The twisted Jacobi identity in Definition 2.1 can be replaced by the following property: For any $u, v \in V$, $w \in W$ and $w' \in W'$, there exists a multivalued analytic function of the form
$$f(z_1, z_2) = \sum_{r,s=N_1}^{N_2} a_{rs}\, z_1^{r/k} z_2^{s/k} (z_1 - z_2)^{-N}$$
for $N_1, N_2 \in \mathbb{Z}$ and $N \in \mathbb{Z}_+$, such that the series
$$\langle w', Y^{g;p}(u, z_1)Y^{g;p}(v, z_2)w\rangle = \sum_{n\in\mathbb{C}} \langle w', Y^{g;p}(u, z_1)\pi_n Y^{g;p}(v, z_2)w\rangle, \qquad (2.6)$$
$$\langle w', Y^{g;p}(v, z_2)Y^{g;p}(u, z_1)w\rangle = \sum_{n\in\mathbb{C}} \langle w', Y^{g;p}(v, z_2)\pi_n Y^{g;p}(u, z_1)w\rangle, \qquad (2.7)$$
$$\langle w', Y^{g;p}(Y(u, z_1 - z_2)v, z_2)w\rangle = \sum_{n\in\mathbb{C}} \langle w', Y^{g;p}(\pi_n Y(u, z_1 - z_2)v, z_2)w\rangle \qquad (2.8)$$
are absolutely convergent in the regions $|z_1| > |z_2| > 0$, $|z_2| > |z_1| > 0$, $|z_2| > |z_1 - z_2| > 0$, respectively, to the branch
$$\sum_{r,s=N_1}^{N_2} a_{rs}\, e^{(r/k)l_p(z_1)} e^{(s/k)l_p(z_2)} (z_1 - z_2)^{-N} \qquad (2.9)$$
of $f(z_1, z_2)$.

Proof. Assume that W is a g-twisted V-module satisfying Definition 2.1. By Proposition 2.2, there exists $N \in \mathbb{Z}_+$ such that (2.2) holds. For $w \in W$ and $w' \in W'$, we know that
$$\langle w', Y^g(u, x_1)Y^g(v, x_2)w\rangle \qquad (2.10)$$
involves finitely many positive powers of $x_1^{1/k}$ and finitely many negative powers of $x_2^{1/k}$, and
$$\langle w', Y^g(v, x_2)Y^g(u, x_1)w\rangle \qquad (2.11)$$
involves finitely many negative powers of $x_1^{1/k}$ and finitely many positive powers of $x_2^{1/k}$. Since $(x_1 - x_2)^N$ is a polynomial in $x_1^{1/k}$ and $x_2^{1/k}$, the same statements are also true for $(x_1 - x_2)^N\langle w', Y^g(u, x_1)Y^g(v, x_2)w\rangle$ and $(x_1 - x_2)^N\langle w', Y^g(v, x_2)Y^g(u, x_1)w\rangle$, respectively. By (2.2), they are actually equal. Thus they are both equal to
$$\sum_{r,s=N_1}^{N_2} a_{rs}\, x_1^{r/k} x_2^{s/k} \in \mathbb{C}[x_1^{1/k}, x_1^{-1/k}, x_2^{1/k}, x_2^{-1/k}].$$
Since (2.10) involves finitely many positive powers of $x_1^{1/k}$ and finitely many negative powers of $x_2^{1/k}$, it must be equal to the expansion of
$$\sum_{r,s=N_1}^{N_2} a_{rs}\, x_1^{r/k} x_2^{s/k} (x_1 - x_2)^{-N}$$
as a series of the same form. Thus the expansion of (2.9) in the region $|z_1| > |z_2| > 0$ is equal to (2.6). The same argument shows that the expansion of (2.9) in the region $|z_2| > |z_1| > 0$ is equal to (2.7). For $j \in \mathbb{Z}/k\mathbb{Z}$, let $\tilde{j} \in j$ satisfy $0 \le \tilde{j} < k$. For $u \in V^j$, by Proposition 2.2 there exists $M \in \mathbb{Z}_+$ such that (2.3) holds. For $w \in W$ and $w' \in W'$, we know that (2.10) involves finitely many positive powers of $x_1^{1/k}$ and finitely many negative powers of $x_2^{1/k}$. So
$$\langle w', Y^g(u, x_0 + x_2)Y^g(v, x_2)w\rangle \qquad (2.12)$$
involves finitely many positive powers of $x_0^{1/k}$ and finitely many negative powers of $x_2^{1/k}$. We also know that
$$\langle w', Y^g(Y(u, x_0)v, x_2)w\rangle \qquad (2.13)$$
involves finitely many negative powers of $x_0^{1/k}$ and finitely many positive powers of $x_2^{1/k}$. Since $(x_0 + x_2)^{\tilde{j}/k+M}$ involves finitely many positive powers of $x_0^{1/k}$ and no negative powers of $x_2^{1/k}$, and $(x_2 + x_0)^{\tilde{j}/k+M}$ involves finitely many positive powers of $x_2^{1/k}$ and no negative powers of $x_0^{1/k}$, the same statements above for (2.12) and (2.13) are also true for
$$(x_0 + x_2)^{\tilde{j}/k+M}\langle w', Y^g(u, x_0 + x_2)Y^g(v, x_2)w\rangle$$
and
$$(x_2 + x_0)^{\tilde{j}/k+M}\langle w', Y^g(Y(u, x_0)v, x_2)w\rangle,$$
respectively. Since we have proved that (2.10) is equal to
$$\sum_{r,s=N_1}^{N_2} a_{rs}\, x_1^{r/k} x_2^{s/k} (x_1 - x_2)^{-N},$$
we obtain
$$(x_0 + x_2)^{\tilde{j}/k+M}\langle w', Y^g(u, x_0 + x_2)Y^g(v, x_2)w\rangle = \sum_{r,s=N_1}^{N_2} a_{rs}\, (x_0 + x_2)^{r/k+\tilde{j}/k+M} x_2^{s/k} x_0^{-N}.$$
By (2.3), we also have
$$(x_2 + x_0)^{\tilde{j}/k+M}\langle w', Y^g(Y(u, x_0)v, x_2)w\rangle = \sum_{r,s=N_1}^{N_2} a_{rs}\, (x_2 + x_0)^{r/k+\tilde{j}/k+M} x_2^{s/k} x_0^{-N}.$$
Thus the expansion of (2.9) in the region |z 2 | > |z 1 − z 2 | > 0 is equal to (2.8). Conversely, assume that the property in the theorem holds. Let f j (x1 , x2 , x0 ) =
N2
ar s η− jr x1 x2 x0−N r/k s/k
r,s=N1
for j ∈ Z/k Z . Then this property gives
w , Y g (u, x1 )Y g (v, x2 )w = f 0 (x1 , x2 , x1 − x2 ),
w , Y g (v, x2 )Y g (u, x1 )w = f 0 (x1 , x2 , −x2 + x1 ),
w , Y g (Y (u, z 1 − z 2 )v, z 2 )w = f 0 (x2 + x0 , x2 , x0 )
(2.14) (2.15) (2.16)
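for $u, v \in V$, $w \in W$ and $w' \in W'$. From (2.14), (2.16) and the equivariance property, we have
\[ \langle w', Y^g(Y(g^j u, z_1 - z_2)v, z_2)w\rangle = f_j(x_2 + x_0, x_2, x_0) \tag{2.17} \]
for $u, v \in V$, $w \in W$ and $w' \in W'$. We have the formal variable identity
\[ x_0^{-1}\delta\!\left(\frac{x_1 - x_2}{x_0}\right) - x_0^{-1}\delta\!\left(\frac{x_2 - x_1}{-x_0}\right) = x_2^{-1}\,\frac{1}{k}\sum_{j\in\mathbb{Z}/k\mathbb{Z}} \delta\!\left(\eta^j\frac{(x_1 - x_0)^{1/k}}{x_2^{1/k}}\right). \]
Using this identity, the properties of the delta-function and the explicit form of $f_0(x_1, x_2, x_0)$, we can prove
\[ x_0^{-1}\delta\!\left(\frac{x_1 - x_2}{x_0}\right) f_0(x_1, x_2, x_1 - x_2) - x_0^{-1}\delta\!\left(\frac{x_2 - x_1}{-x_0}\right) f_0(x_1, x_2, -x_2 + x_1) = x_2^{-1}\,\frac{1}{k}\sum_{j\in\mathbb{Z}/k\mathbb{Z}} \delta\!\left(\eta^j\frac{(x_1 - x_0)^{1/k}}{x_2^{1/k}}\right) f_j(x_2 + x_0, x_2, x_0). \tag{2.18} \]
From (2.14)–(2.16) and (2.18), we obtain
\[ x_0^{-1}\delta\!\left(\frac{x_1 - x_2}{x_0}\right) \langle w', Y^g(u, x_1)Y^g(v, x_2)w\rangle - x_0^{-1}\delta\!\left(\frac{x_2 - x_1}{-x_0}\right) \langle w', Y^g(v, x_2)Y^g(u, x_1)w\rangle = x_2^{-1}\,\frac{1}{k}\sum_{j\in\mathbb{Z}/k\mathbb{Z}} \delta\!\left(\eta^j\frac{(x_1 - x_0)^{1/k}}{x_2^{1/k}}\right) \langle w', Y^g(Y(g^j u, x_0)v, x_2)w\rangle \]
for $u, v \in V$, $w \in W$ and $w' \in W'$. Since $w \in W$ and $w' \in W'$ are arbitrary, we obtain the twisted Jacobi identity (2.1) for $u, v \in V$.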
We shall call the property in Theorem 2.4 the duality property.

Remark 2.5. For simplicity, we give only the definition of a twisted module for a vertex operator algebra, not for a vertex operator superalgebra. But it is trivial to generalize it to that of a twisted module for a vertex operator superalgebra.

3. Definitions of Strongly C/Z- and C-Graded Generalized g-Twisted Module

In this section we give a definition of a strongly $\mathbb{C}/\mathbb{Z}$-graded generalized $g$-twisted $V$-module for an automorphism $g$ of $V$ of possibly infinite order. In the case that $V$ admits an additional $\mathbb{C}$-grading compatible with $g$, we also give a definition of a strongly $\mathbb{C}$-graded generalized $g$-twisted $V$-module.

In the case that $g$ is of finite order, $g$ always acts on $V$ semisimply so that we have $V = \bigoplus_{j\in\mathbb{Z}/k\mathbb{Z}} V^j$. When the order of $g$ is infinite, it does not have to act on $V$ semisimply. In particular, we have to allow not only nonintegral powers of $x$ in the twisted vertex operators, but also integral powers of $\log x$ in these operators. As we have mentioned in the Introduction, the left-hand side of the twisted Jacobi identity cannot be generalized to the case that the order of $g$ is infinite.
Also we cannot straightforwardly generalize the formal monodromy condition to the general case. On the other hand, the equivariance property and the duality property in Proposition 2.3 and Theorem 2.4, respectively, can be generalized easily. Our definition will use the generalizations of these axioms.

First we need to further generalize our consideration of analytic branches of the map $X$ in the preceding section. For formal variables $x$ and $\log x$, we define their degrees to be $-1$ and $0$, respectively. Let $W_1$, $W_2$ and $W_3$ be $\mathbb{C}$-graded vector spaces. Then the grading on $W_3$ and the degrees of $x$ and $\log x$ together give $W_3\{x\}[\log x]$ a $\mathbb{C}$-grading. Let
\[ X : W_1 \otimes W_2 \to W_3\{x\}[\log x], \qquad w_1 \otimes w_2 \mapsto X(w_1, x)w_2 = \sum_{n\in\mathbb{C}} \sum_{k=1}^{K} X_{n,k}(w_1)w_2\, x^{-n-1}(\log x)^k \]
be a linear map preserving the gradings. For any $w_1 \in W_1$ and $w_2 \in W_2$, we define $X^p(w_1, z)w_2$ for $p \in \mathbb{Z}$ to be the elements
\[ X(w_1, x)w_2\Big|_{x^n = e^{n l_p(z)},\ \log x = l_p(z)} \]
of $\overline{W}_3$. When $p = 0$, we shall denote $X^0(w_1, z)w_2$ simply as $X(w_1, z)w_2$. For $z \in \mathbb{C}^\times$, we have the linear maps
\[ X^p(\cdot, z)\cdot : W_1 \otimes W_2 \to \overline{W}_3, \qquad w_1 \otimes w_2 \mapsto X^p(w_1, z)w_2, \tag{3.1} \]
and we shall denote $X^0(\cdot, z)\cdot$ simply as $X(\cdot, z)\cdot$. For any $w_3' \in W_3'$, $\langle w_3', X^p(w_1, z)w_2\rangle$ are branches of a multivalued analytic function defined on $\mathbb{C}^\times$. We can view this multivalued analytic function on $\mathbb{C}^\times$ as a single-valued analytic function defined on a covering space of $\mathbb{C}^\times$. For $p \in \mathbb{Z}$, we have a map
\[ X^p : \mathbb{C}^\times \to \mathrm{Hom}(W_1 \otimes W_2, \overline{W}_3), \qquad z \mapsto X^p(\cdot, z)\cdot. \]
We shall call $X^p$ the $p$-th analytic branch of the map $X$.

We also need a $\mathbb{C}^\times$-, or equivalently, $\mathbb{C}/\mathbb{Z}$-grading structure on $V$ given by an automorphism $g$ of $V$. Since $g$ preserves $V_{(n)}$ for $n \in \mathbb{Z}$ and $V_{(n)}$ for $n \in \mathbb{Z}$ are all finite dimensional, we have
\[ V_{(n)} = \bigoplus_{a\in\mathbb{C}^\times} V_{(n)}^{[a]} \]
for $n \in \mathbb{Z}$, where for $n \in \mathbb{Z}$ and $a \in \mathbb{C}^\times$, if $a$ is an eigenvalue for the operator $g$ on $V_{(n)}$, $V_{(n)}^{[a]}$ is the generalized eigenspace of $V_{(n)}$ for $g$ with the eigenvalue $a$, and if $a$ is not an eigenvalue for $g$ on $V_{(n)}$, $V_{(n)}^{[a]} = 0$. Since the multiplicative abelian group $\mathbb{C}^\times$ is isomorphic to the additive abelian group $\mathbb{C}/\mathbb{Z}$ through the isomorphism $a \mapsto \frac{1}{2\pi\sqrt{-1}}\log a + \mathbb{Z}$, we can use $\mathbb{C}/\mathbb{Z}$ instead of $\mathbb{C}^\times$ so that
\[ V_{(n)} = \bigoplus_{\alpha\in\mathbb{C}/\mathbb{Z}} V_{(n)}^{[\alpha]} \]
and $V$ is doubly graded by $\mathbb{C}$ and $\mathbb{C}/\mathbb{Z}$ as
\[ V = \bigoplus_{n\in\mathbb{Z},\,\alpha\in\mathbb{C}/\mathbb{Z}} V_{(n)}^{[\alpha]}. \]
Let $V^{[\alpha]} = \bigoplus_{n\in\mathbb{Z}} V_{(n)}^{[\alpha]}$ for $\alpha \in \mathbb{C}/\mathbb{Z}$. Then $V = \bigoplus_{\alpha\in\mathbb{C}/\mathbb{Z}} V^{[\alpha]}$ and for $v \in V^{[\alpha]}$, $gv = e^{2\pi\sqrt{-1}\,\alpha}v$. It is easy to see that the vertex operator algebra $V$ together with this $\mathbb{C}/\mathbb{Z}$-grading is a strongly $\mathbb{C}/\mathbb{Z}$-graded vertex operator algebra in the sense of [HLZ2].

If, in addition to the $\mathbb{C}$-grading given by the eigenvalues of $L(0)$, $V$ has another $\mathbb{C}$-grading $V = \bigoplus_{\alpha\in\mathbb{C}} V^{[\alpha]}$ (not just a $\mathbb{C}/\mathbb{Z}$-grading as we have given above) which is compatible with $g$ (that is, for $v \in V^{[\alpha]}$, $gv = e^{2\pi\sqrt{-1}\,\alpha}v$), then $V = \bigoplus_{n,\alpha\in\mathbb{C}} V_{(n)}^{[\alpha]}$ equipped with this $\mathbb{C}$-grading has a strongly $\mathbb{C}$-graded vertex operator algebra structure.

For a strongly $\mathbb{C}/\mathbb{Z}$-graded (or $\mathbb{C}$-graded) vertex operator algebra, a natural module category is the category of strongly $\mathbb{C}/\mathbb{Z}$-graded (or $\mathbb{C}$-graded, respectively) generalized $V$-modules. Moreover, if the $\mathbb{C}/\mathbb{Z}$-grading (or $\mathbb{C}$-grading) is given by (or compatible with) $g$, it is natural to consider strongly $\mathbb{C}/\mathbb{Z}$-graded (or $\mathbb{C}$-graded) generalized $V$-modules with actions of $g$ such that the $\mathbb{C}^\times$-gradings (or $\mathbb{C}$-gradings) are given by (or compatible with) generalized eigenspaces for $g$ in the obvious sense. The category of the generalizations of $g$-twisted modules should be a generalization of the category of such strongly $\mathbb{C}/\mathbb{Z}$- or $\mathbb{C}$-graded generalized $V$-modules. In particular, there should also be an action of $g$ and also a $\mathbb{C}/\mathbb{Z}$-grading (or $\mathbb{C}$-grading) given by (or compatible with) generalized eigenspaces for the action of $g$.

Here is our generalization of $g$-twisted $V$-module for finite-order $g$ to the general case:

Definition 3.1. Let $(V, Y, \mathbf{1}, \omega)$ be a vertex operator algebra and let $g$ be an automorphism of $V$. A strongly $\mathbb{C}/\mathbb{Z}$-graded generalized $g$-twisted $V$-module is a $\mathbb{C} \times \mathbb{C}/\mathbb{Z}$-graded vector space $W = \bigoplus_{n\in\mathbb{C},\,\alpha\in\mathbb{C}/\mathbb{Z}} W_{[n]}^{[\alpha]}$ equipped with a linear map
\[ Y^g : V \otimes W \to W\{x\}[\log x], \qquad v \otimes w \mapsto Y^g(v, x)w, \]
and an action of $g$ satisfying the following conditions:

1. The grading-restriction condition: For each $n \in \mathbb{C}$, $\alpha \in \mathbb{C}/\mathbb{Z}$, $\dim W_{[n]}^{[\alpha]} < \infty$ and $W_{[n+l]}^{[\alpha]} = 0$ for all sufficiently negative real numbers $l$.

2. The equivariance property: For $p \in \mathbb{Z}$, $z \in \mathbb{C}^\times$, $v \in V$ and $w \in W$, $Y^{g;p+1}(gv, z)w = Y^{g;p}(v, z)w$, where for $p \in \mathbb{Z}$, $Y^{g;p}$ is the $p$-th analytic branch of $Y^g$.

3. The identity property: For $w \in W$, $Y^g(\mathbf{1}, x)w = w$.

4. The duality property: Let $W' = \bigoplus_{n\in\mathbb{C},\,\alpha\in\mathbb{C}/\mathbb{Z}} (W_{[n]}^{[\alpha]})^*$ and, for $n \in \mathbb{C}$, let $\pi_n : W \to W_{[n]}$ be the projection. For any $u, v \in V$, $w \in W$ and $w' \in W'$, there exists a multivalued analytic function of the form
\[ f(z_1, z_2) = \sum_{i,j,k,l=0}^{N} a_{ijkl}\, z_1^{m_i} z_2^{n_j} (\log z_1)^k (\log z_2)^l (z_1 - z_2)^{-t} \]
for $N \in \mathbb{N}$, $m_1, \dots, m_N, n_1, \dots, n_N \in \mathbb{C}$ and $t \in \mathbb{Z}_+$, such that the series
\[ \langle w', Y^{g;p}(u, z_1)Y^{g;p}(v, z_2)w\rangle = \sum_{n\in\mathbb{C}} \langle w', Y^{g;p}(u, z_1)\pi_n Y^{g;p}(v, z_2)w\rangle, \tag{3.2} \]
\[ \langle w', Y^{g;p}(v, z_2)Y^{g;p}(u, z_1)w\rangle = \sum_{n\in\mathbb{C}} \langle w', Y^{g;p}(v, z_2)\pi_n Y^{g;p}(u, z_1)w\rangle, \tag{3.3} \]
\[ \langle w', Y^{g;p}(Y(u, z_1 - z_2)v, z_2)w\rangle = \sum_{n\in\mathbb{C}} \langle w', Y^{g;p}(\pi_n Y(u, z_1 - z_2)v, z_2)w\rangle \tag{3.4} \]
are absolutely convergent in the regions $|z_1| > |z_2| > 0$, $|z_2| > |z_1| > 0$, $|z_2| > |z_1 - z_2| > 0$, respectively, to the branch
\[ \sum_{i,j,k,l=0}^{N} a_{ijkl}\, e^{m_i l_p(z_1)} e^{n_j l_p(z_2)}\, l_p(z_1)^k\, l_p(z_2)^l\, (z_1 - z_2)^{-t} \]
of $f(z_1, z_2)$.

5. The $L^g(0)$- and $g$-grading conditions: Let $L^g(n)$ for $n \in \mathbb{Z}$ be the operator on $W$ given by $Y^g(\omega, x) = \sum_{n\in\mathbb{Z}} L^g(n)x^{-n-2}$. Then for $n \in \mathbb{C}$ and $\alpha \in \mathbb{C}/\mathbb{Z}$, $w \in W_{[n]}^{[\alpha]}$, there exist $K, \Lambda \in \mathbb{Z}_+$ such that $(L^g(0) - n)^K w = (g - e^{2\pi\sqrt{-1}\,\alpha})^{\Lambda} w = 0$.

6. The $L(-1)$-derivative condition: For $v \in V$,
\[ \frac{d}{dx} Y^g(v, x) = Y^g(L(-1)v, x). \]

If $V$ has a strongly $\mathbb{C}$-graded vertex operator algebra structure compatible with $g$, then a strongly $\mathbb{C}$-graded generalized $g$-twisted $V$-module is a $\mathbb{C} \times \mathbb{C}$-graded vector space $W = \bigoplus_{n,\alpha\in\mathbb{C}} W_{[n]}^{[\alpha]}$ equipped with a linear map
\[ Y^g : V \otimes W \to W\{x\}[\log x], \qquad v \otimes w \mapsto Y^g(v, x)w, \]
and an action of $g$, satisfying the same axioms above except that $\mathbb{C}/\mathbb{Z}$ is replaced by $\mathbb{C}$ and the $g$-grading condition is replaced by the following grading-compatibility condition: For $\alpha, \beta \in \mathbb{C}$, $v \in V^{[\alpha]}$ and $w \in W^{[\beta]}$, $Y^g(v, x)w \in W^{[\alpha+\beta]}\{x\}[\log x]$.

Remark 3.2. As in the preceding section, for simplicity, we give only the definition of such a generalized twisted module for a vertex operator algebra, not for a vertex operator superalgebra. It is also trivial in this general case to generalize the definition above to that of a twisted module for a vertex operator superalgebra.

4. Exponentiating Integrals of Negative Parts of Vertex Operators

In this section we shall exponentiate integrals of negative parts of vertex operators and prove formulas needed in the next section. We first need several commutator formulas. The first is the commutator formula for vertex operators, which is the special case of the Jacobi identity obtained when we take the coefficients of $x_0^{-1}$ and expand the formal delta-function on the right-hand side explicitly:
Proposition 4.1. Let $(V, Y, \mathbf{1}, \omega)$ be a vertex operator algebra and $(W, Y_W)$ a weak $V$-module. Then for $u, v \in V$, we have the commutator formula
\[ [Y_W(u, x_1), Y_W(v, x_2)] = \sum_{m\in\mathbb{Z}} \sum_{k\in\mathbb{N}} \binom{m}{k} x_1^{-m-1} x_2^{m-k}\, Y_W(Y_k(u)v, x_2). \tag{4.1} \]

Let $u \in V$. In this section, we do not require that the conformal weight of $u$ be 1. Let $Y_W^{\le -2}(u, x)$ be the part of $Y_W(u, x)$ consisting of all the monomials in $x$ with powers of $x$ less than or equal to $-2$, that is,
\[ Y_W^{\le -2}(u, x) = \sum_{m\in\mathbb{Z}_+} Y_m(u)x^{-m-1}. \]
For any vector space $W$ and
\[ f(x) = \sum_{n\in\mathbb{Z}\setminus\{-1\}} a_n x^n \in W[[x]] + x^{-2}W[[x^{-1}]], \]
let
\[ \int_0^x f(y) = \sum_{n\in\mathbb{Z}\setminus\{-1\}} \frac{a_n}{n+1}\, x^{n+1}. \]
Then we obtain a linear map (integration from $0$ to $x$)
\[ \int_0^x : W[[x]] + x^{-2}W[[x^{-1}]] \to xW[[x]] + x^{-1}W[[x^{-1}]]. \]
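The formal integration just defined can be tried out on finite Laurent polynomials. The following is only a minimal sketch, not part of the construction in the paper: the representation of a series as a Python dictionary from exponents to coefficients and the function names `integrate` and `differentiate` are assumptions made for illustration. It checks, on one example, that termwise differentiation undoes the integration and that the result has no constant term.

```python
from fractions import Fraction

def integrate(f):
    """Formal integral from 0 to x of sum a_n x^n (n != -1).

    f is a dict {exponent: coefficient}; this encoding is an
    illustrative assumption. Returns sum a_n/(n+1) x^(n+1).
    """
    if -1 in f:
        raise ValueError("the x^-1 term is excluded from the domain")
    return {n + 1: Fraction(a) / (n + 1) for n, a in f.items()}

def differentiate(f):
    """Formal derivative d/dx applied term by term."""
    return {n - 1: Fraction(a) * n for n, a in f.items() if n != 0}

# A sample element of W[[x]] + x^-2 W[[x^-1]] with scalar coefficients.
f = {2: Fraction(3), 0: Fraction(1), -2: Fraction(5), -4: Fraction(-2)}
F = integrate(f)

# The image has no constant term, and d/dx recovers f.
has_constant_term = 0 in F
roundtrip_ok = differentiate(F) == f
```

Here `F` lies in the analogue of $xW[[x]] + x^{-1}W[[x^{-1}]]$, which is the target space stated above.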
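We have:

Proposition 4.2. For $u, v \in V$,
\[ \left[\int_0^{-x_1} Y_W^{\le -2}(u, y),\ Y_W(v, x_2)\right] = Y_W\!\left(\int_0^{-x_1-x_2} Y^{\le -2}(u, y)v,\ x_2\right) + \log\!\left(1 + \frac{x_2}{x_1}\right) Y_W(Y_0(u)v, x_2), \tag{4.2} \]
where $\int_0^{-x_1-x_2} Y^{\le -2}(u, y)v$ is defined by expanding the series in powers of $-x_1 - x_2$ as a series in positive powers of $x_2$ and in powers of $x_1$.

Proof. From (4.1), we obtain
\[ [Y_W^{\le -2}(u, x_1), Y_W(v, x_2)] = \sum_{m\in\mathbb{Z}_+} \sum_{k\in\mathbb{N}} \binom{m}{k} x_1^{-m-1} x_2^{m-k}\, Y_W(Y_k(u)v, x_2) = \sum_{k,l\in\mathbb{N},\ k+l\ne 0} \binom{k+l}{k} x_1^{-k-l-1} x_2^{l}\, Y_W(Y_k(u)v, x_2). \tag{4.3} \]
Then
\[ \begin{aligned} \left[\int_0^{-x_1} Y_W^{\le -2}(u, y),\ Y_W(v, x_2)\right] &= \sum_{k,l\in\mathbb{N},\ k+l\ne 0} \binom{k+l}{k} \frac{(-1)^{k+l+1}}{k+l}\, x_1^{-k-l} x_2^{l}\, Y_W(Y_k(u)v, x_2) \\ &= \sum_{k\in\mathbb{Z}_+} \frac{(-1)^k}{-k} \sum_{l\in\mathbb{N}} \binom{-k}{l} x_1^{-k-l} x_2^{l}\, Y_W(Y_k(u)v, x_2) + \sum_{l\in\mathbb{Z}_+} \frac{(-1)^{l+1}}{l}\, x_1^{-l} x_2^{l}\, Y_W(Y_0(u)v, x_2) \\ &= \sum_{k\in\mathbb{Z}_+} \frac{(-1)^k}{-k} (x_1 + x_2)^{-k}\, Y_W(Y_k(u)v, x_2) + \log\!\left(1 + \frac{x_2}{x_1}\right) Y_W(Y_0(u)v, x_2) \\ &= Y_W\!\left(\int_0^{-x_1-x_2} Y^{\le -2}(u, y)v,\ x_2\right) + \log\!\left(1 + \frac{x_2}{x_1}\right) Y_W(Y_0(u)v, x_2). \end{aligned} \]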
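The second equality in the computation above rests on a coefficient identity for $k \ge 1$ and on the logarithm series for $k = 0$. As a sanity check only, the sketch below verifies both on a finite range; the helper names `gen_binom`, `lhs` and `rhs`, the tested range and the sample value $t = 1/4$ standing in for $x_2/x_1$ are illustrative assumptions, not part of the paper.

```python
from fractions import Fraction
from math import comb, log

def gen_binom(a, l):
    """Generalized binomial coefficient binom(a, l), a any integer, l >= 0."""
    result = Fraction(1)
    for i in range(l):
        result = result * (a - i) / (i + 1)
    return result

def lhs(k, l):
    # Coefficient of x1^(-k-l) x2^l obtained by integrating (4.3).
    return Fraction(comb(k + l, k) * (-1) ** (k + l + 1), k + l)

def rhs(k, l):
    # Coefficient appearing in the regrouped sum for k >= 1.
    return Fraction((-1) ** k, -k) * gen_binom(-k, l)

# Exact check of the coefficient identity on a finite range.
identity_ok = all(lhs(k, l) == rhs(k, l)
                  for k in range(1, 8) for l in range(0, 8))

# For k = 0 the coefficients are (-1)^(l+1)/l, the series of log(1 + t).
t = 0.25
partial_sum = sum(float(lhs(0, l)) * t ** l for l in range(1, 61))
log_ok = abs(partial_sum - log(1 + t)) < 1e-12
```

The check is exact for the rational coefficients and numerical for the logarithm, so it supports the regrouping but is not a proof.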
We also have:

Proposition 4.3. For $u \in V$,
\[ \left[L(-1),\ \int_0^{-x} Y_W^{\le -2}(u, y)\right] = -\frac{d}{dx} \int_0^{-x} Y_W^{\le -2}(u, y) - Y_0(u)x^{-1}. \tag{4.4} \]

Proof. By the $L(-1)$-derivative property, we have
\[ [L(-1), Y_W(u, y)] = \frac{d}{dy} Y_W(u, y), \]
which gives
\[ [L(-1), Y_W^{\le -2}(u, y)] = \frac{d}{dy} Y_W^{\le -2}(u, y) - Y_0(u)y^{-2}. \tag{4.5} \]
Note that the integration $\int_0^x : W[[x]] + x^{-2}W[[x^{-1}]] \to xW[[x]] + x^{-1}W[[x^{-1}]]$ defined above is invertible and its inverse is in fact the restriction of $\frac{d}{dx}$ to the space $xW[[x]] + x^{-1}W[[x^{-1}]]$. Integrating both sides of (4.5) with respect to $y$ and then substituting $-x$ for $y$, we obtain
\[ \left[L(-1),\ \int_0^{-x} Y_W^{\le -2}(u, y)\right] = Y_W^{\le -2}(u, -x) - Y_0(u)x^{-1} = -\frac{d}{dx} \int_0^{-x} Y_W^{\le -2}(u, y) - Y_0(u)x^{-1}. \]
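The last equality in this proof uses the fact that differentiating the integral with upper limit $-x$ produces a minus sign. The sketch below, under the same illustrative dictionary encoding of Laurent polynomials as an assumption of this example (all function names are hypothetical), checks $\frac{d}{dx}\int_0^{-x} f(y) = -f(-x)$ for one series with powers at most $-2$.

```python
from fractions import Fraction

def integrate(f):
    """Formal integral from 0 to x: a_n x^n -> a_n/(n+1) x^(n+1), n != -1."""
    return {n + 1: Fraction(a) / (n + 1) for n, a in f.items()}

def differentiate(f):
    """Formal derivative d/dx applied term by term."""
    return {n - 1: Fraction(a) * n for n, a in f.items() if n != 0}

def substitute_minus_x(f):
    """Replace x by -x: a_n x^n -> (-1)^n a_n x^n."""
    return {n: (a if n % 2 == 0 else -a) for n, a in f.items()}

def negate(f):
    return {n: -a for n, a in f.items()}

# Sample series with powers <= -2, as for the negative part of a vertex operator.
f = {-2: Fraction(1), -3: Fraction(4)}

left = differentiate(substitute_minus_x(integrate(f)))   # d/dx of the integral to -x
right = negate(substitute_minus_x(f))                    # -f(-x)
chain_rule_ok = left == right
```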
Let
\[ \tilde{\Delta}_W^{(u)}(x) = \exp\!\left(\int_0^{-x} Y_W^{\le -2}(u, y) + Y_0(u)\log x\right). \]
Then we see that $\tilde{\Delta}_W^{(u)}(x) \in (\mathrm{End}\, W)[[x^{-1}]][[\log x]]$.

Proposition 4.4. For $u, v \in V$,
\[ \tilde{\Delta}_W^{(u)}(x)\, Y(v, x_2) = Y(\tilde{\Delta}_W^{(u)}(x + x_2)v, x_2)\, \tilde{\Delta}_W^{(u)}(x), \tag{4.6} \]
\[ [L(-1), \tilde{\Delta}_W^{(u)}(x)] = -\frac{d}{dx} \tilde{\Delta}_W^{(u)}(x). \tag{4.7} \]

Proof. By (4.2) and (4.4), we obtain (4.6) and (4.7), respectively.

5. Generalized Twisted V-Modules Associated to Weight 1 Elements of V

In this section, we construct strongly $\mathbb{C}$-graded generalized twisted $V$-modules associated to $u \in V_{(1)}$ under the assumption that $V_{(0)} = \mathbb{C}\mathbf{1}$ and $V_{(n)} = 0$ for $n < 0$ and that $L(1)u = 0$. For such an element $u$, we have an operator $Y_0(u) = \mathrm{Res}_x Y(u, x)$ on $V$. Since the conformal weight of $Y_0(u)$ is 0, it preserves the finite-dimensional homogeneous subspaces $V_{(n)}$ for $n \in \mathbb{C}$. In particular, for any $n \in \mathbb{C}$, $V_{(n)} = \bigoplus_{\alpha\in\mathbb{C}} V_{(n)}^{[\alpha]}$, where $V_{(n)}^{[\alpha]}$ is the generalized eigenspace for $Y_0(u)$ with eigenvalue $\alpha$ if $\alpha$ is an eigenvalue of $Y_0(u)$ and is 0 for other $\alpha \in \mathbb{C}$. Thus $V$ has an additional $\mathbb{C}$-grading and, equipped with this $\mathbb{C}$-grading, is a strongly $\mathbb{C}$-graded vertex operator algebra.

Given a strongly $\mathbb{C}$-graded generalized $V$-module, we would like to construct a strongly $\mathbb{C}$-graded generalized twisted $V$-module by modifying the vertex operator map for the module using the operators $\tilde{\Delta}_V^{(u)}(x)$ constructed in the preceding section. But in general, $\tilde{\Delta}_V^{(u)}(x)v \in V[[x^{-1}]][[\log x]]$ for $v \in V$. On the other hand, in our definition of strongly $\mathbb{C}^\times$-graded generalized twisted module, the image of the vertex operator map must be in $V\{x\}[\log x]$. So we cannot use $\tilde{\Delta}_V^{(u)}(x)$ directly. Instead, we first need to construct a series $\Delta_V^{(u)}(x)$ by modifying $\tilde{\Delta}_V^{(u)}(x)$ and discuss under what assumptions $\Delta_V^{(u)}(x)v$ is in $V\{x\}[\log x]$ for $v \in V$.

Let $(Y_0(u))_{ss}$ be the semisimple part of $Y_0(u)$, that is, for any generalized eigenvector $v \in V$ for $Y_0(u)$ with eigenvalue $\lambda$, $(Y_0(u))_{ss}v = \lambda v$. Then $Y_0(u) = (Y_0(u))_{ss} + (Y_0(u) - (Y_0(u))_{ss})$ and $(Y_0(u) - (Y_0(u))_{ss})^{N_u} = 0$. The commutator formula (4.1) for vertex operators gives
\[ [Y_0(u), Y(v, x_2)] = Y(Y_0(u)v, x_2) \tag{5.1} \]
for any $v \in V$. The commutator formula (5.1) gives
\[ [(Y_0(u))_{ss}, Y(v, x_2)] = Y((Y_0(u))_{ss}v, x_2), \tag{5.2} \]
\[ [(Y_0(u) - (Y_0(u))_{ss}), Y(v, x_2)] = Y((Y_0(u) - (Y_0(u))_{ss})v, x_2). \tag{5.3} \]
The statement that (5.2) is a consequence of (5.1) is a special case of a very general fact in linear algebra. Here we give a direct proof in this special case. Let $v, w \in V$ be
generalized eigenvectors for $Y_0(u)$ with eigenvalues $\lambda_1$ and $\lambda_2$, respectively. Then there exist $N_1, N_2 \in \mathbb{Z}_+$ such that $(Y_0(u) - \lambda_1)^{N_1}v = (Y_0(u) - \lambda_2)^{N_2}w = 0$. From (5.1), we obtain
\[ (Y_0(u) - \lambda_1 - \lambda_2)Y(v, x_2)w = Y((Y_0(u) - \lambda_1)v, x_2)w + Y(v, x_2)(Y_0(u) - \lambda_2)w. \]
Using this formula repeatedly, we obtain
\[ (Y_0(u) - \lambda_1 - \lambda_2)^{N_1+N_2}\, Y(v, x_2)w = \sum_{i=0}^{N_1+N_2} \binom{N_1+N_2}{i} Y((Y_0(u) - \lambda_1)^i v, x_2)(Y_0(u) - \lambda_2)^{N_1+N_2-i}w = 0. \]
This shows that $Y(v, x_2)w$ is a formal series whose coefficients are generalized eigenvectors for $Y_0(u)$ with eigenvalue $\lambda_1 + \lambda_2$. Thus
\[ (Y_0(u))_{ss}Y(v, x_2)w = (\lambda_1 + \lambda_2)Y(v, x_2)w = Y((Y_0(u))_{ss}v, x_2)w + Y(v, x_2)(Y_0(u))_{ss}w, \]
which is equivalent to
\[ [(Y_0(u))_{ss}, Y(v, x_2)]w = Y((Y_0(u))_{ss}v, x_2)w. \]
Since $v$ and $w$ are arbitrary generalized eigenvectors for $Y_0(u)$, we obtain (5.2). The formula (5.3) follows immediately from (5.1) and (5.2).

For any generalized eigenvector $v \in V$ for $Y_0(u)$ with eigenvalue $\lambda$, we have
\[ e^{tY_0(u)}v = e^{t(Y_0(u))_{ss}}\, e^{tY_0(u) - t(Y_0(u))_{ss}}v = e^{t\lambda}\, e^{tY_0(u) - t(Y_0(u))_{ss}}v. \]
Since $V$ is spanned by such generalized eigenvectors $v \in V$ for $Y_0(u)$, we have a well-defined operator $e^{tY_0(u)}$ on $V$ for every $t \in \mathbb{C}$.

Proposition 5.1. For $t \in \mathbb{C}$, the operator $e^{tY_0(u)}$ is an automorphism of $V$.

Proof. Since $Y_0(u)$ preserves the grading of $V$, $e^{tY_0(u)}$ also preserves the grading of $V$. From (5.2) and (5.3), we obtain
\[ e^{tY_0(u)}Y(v, x) = Y(e^{tY_0(u)}v, x)e^{tY_0(u)}. \]
The first half of the creation property says that $Y(u, x)\mathbf{1} \in V[[x]]$. In particular, $Y_0(u)\mathbf{1} = 0$. So $e^{tY_0(u)}\mathbf{1} = \mathbf{1}$.

Since $u \in V_{(1)}$ and $V_{(n)} = 0$ for $n < 0$, $L(n)u = 0$ for $n \ge 2$. We also have $L(1)u = 0$. So $u$ is a lowest weight vector of weight 1 and $Y(u, x)$ is a primary field. Thus we have
\[ [L(-2), Y(u, x)] = x^{-1}\frac{d}{dx}Y(u, x) - x^{-2}Y(u, x). \]
Taking the coefficients of $x^{-1}$ of both sides of this formula, we obtain $[L(-2), Y_0(u)] = 0$. Thus $Y_0(u)\omega = L(-2)Y_0(u)\mathbf{1} = 0$, which implies $e^{tY_0(u)}\omega = \omega$.
By the skew-symmetry for $V$, we have $Y(u, x)u = e^{xL(-1)}Y(u, -x)u$. Taking $\mathrm{Res}_x$ of both sides, we have
\[ Y_0(u)u = \sum_{m\in\mathbb{N}} \frac{(-1)^{-m-1}}{m!} (L(-1))^m Y_m(u)u. \]
Since $\mathrm{wt}\, Y_m(u)u = 1 - m - 1 + 1 = 1 - m < 0$ and $V_{(n)} = 0$ when $n < 0$, $Y_m(u)u = 0$ for $m > 1$. Since $V_{(0)} = \mathbb{C}\mathbf{1}$, $Y_1(u)u$ is proportional to $\mathbf{1}$ since $\mathrm{wt}\, Y_1(u)u = 1 - 1 - 1 + 1 = 0$. So $(L(-1))^m Y_m(u)u = 0$ for $m > 0$. Thus we obtain $Y_0(u)u = -Y_0(u)u$, which implies $Y_0(u)u = 0$. By the component form of the commutator formula (5.1), we obtain
\[ [Y_0(u), Y_n(u)] = Y_n((Y_0(u)u)) = 0 \]
for $n \in \mathbb{Z}$. Then we can define
\[ \Delta_V^{(u)}(x) = x^{Y_0(u)}\, e^{\int_0^{-x} Y^{\le -2}(u, y)} = x^{(Y_0(u))_{ss}}\, e^{(Y_0(u) - (Y_0(u))_{ss})\log x}\, e^{\int_0^{-x} Y^{\le -2}(u, y)}, \]
where $(Y_0(u))_{ss}$ is the semisimple part of $Y_0(u)$ and, on an eigenvector of $(Y_0(u))_{ss}$ (that is, a generalized eigenvector of $Y_0(u)$), $x^{(Y_0(u))_{ss}}$ is defined to be $x^a$ if the eigenvalue of the eigenvector is $a$.

Theorem 5.2. For any $v \in V$, there exist $m_1, \dots, m_l \in \mathbb{R}$ such that
\[ \Delta_V^{(u)}(x)v \in x^{m_1}V[x^{-1}][\log x] + \cdots + x^{m_l}V[x^{-1}][\log x] \]
and the series $\Delta_V^{(u)}(x)$ satisfies
\[ \Delta_V^{(u)}(x)Y(v, x_2) = Y(\Delta_V^{(u)}(x + x_2)v, x_2)\Delta_V^{(u)}(x) \tag{5.4} \]
and
\[ [L(-1), \Delta_V^{(u)}(x)] = -\frac{d}{dx}\Delta_V^{(u)}(x). \tag{5.5} \]

Proof. For any $v \in V$,
\[ \begin{aligned} e^{\int_0^{-x} Y^{\le -2}(u, y)}v &= \sum_{k\in\mathbb{N}} \frac{1}{k!} \left(\int_0^{-x} Y^{\le -2}(u, y)\right)^k v \\ &= \sum_{k\in\mathbb{N}} \frac{1}{k!} \left(\sum_{n\in\mathbb{Z}_+} \frac{(-1)^n}{-n}\, Y_n(u)x^{-n}\right)^k v \\ &= \sum_{k\in\mathbb{N}} \frac{1}{k!} \sum_{l\in\mathbb{N}}\ \sum_{\substack{n_1, \dots, n_k\in\mathbb{Z}_+ \\ n_1 + \cdots + n_k = l}} \frac{(-1)^l}{(-n_1)\cdots(-n_k)}\, Y_{n_1}(u)\cdots Y_{n_k}(u)v\, x^{-l}. \end{aligned} \tag{5.6} \]
Since $\mathrm{wt}\, Y_{n_1}(u)\cdots Y_{n_k}(u) = (1 - n_1 - 1) + \cdots + (1 - n_k - 1) = -n_1 - \cdots - n_k = -l$, we see that $Y_{n_1}(u)\cdots Y_{n_k}(u)v = 0$ when $l$ is sufficiently large. Thus the right-hand side of (5.6) is an element of $V[x^{-1}]$ and our first conclusion
\[ \Delta_V^{(u)}(x)v \in x^{m_1}V[x^{-1}][\log x] + \cdots + x^{m_l}V[x^{-1}][\log x] \]
follows immediately.

From (4.2), we obtain
\[ \begin{aligned} e^{\int_0^{-x} Y^{\le -2}(u, y)}\, Y(v, x_2)\, e^{-\int_0^{-x} Y^{\le -2}(u, y)} &= Y\!\left(e^{\int_0^{-x-x_2} Y^{\le -2}(u, y) + Y_0(u)\log\left(1 + \frac{x_2}{x}\right)}v,\ x_2\right) \\ &= Y\!\left(e^{Y_0(u)\log\left(1 + \frac{x_2}{x}\right)}\, e^{\int_0^{-x-x_2} Y^{\le -2}(u, y)}v,\ x_2\right). \end{aligned} \tag{5.7} \]
On the other hand, by (5.2) and (5.3), we have
\[ \begin{aligned} x^{Y_0(u)}Y(v, x_2) &= x^{(Y_0(u))_{ss}}\, e^{(Y_0(u) - (Y_0(u))_{ss})\log x}\, Y(v, x_2) \\ &= Y(x^{(Y_0(u))_{ss}}\, e^{(Y_0(u) - (Y_0(u))_{ss})\log x}v, x_2)\, x^{(Y_0(u))_{ss}}\, e^{(Y_0(u) - (Y_0(u))_{ss})\log x} \\ &= Y(x^{Y_0(u)}v, x_2)\, x^{Y_0(u)}. \end{aligned} \tag{5.8} \]
Using (5.7) and (5.8), we obtain (5.4).

Finally, notice that both $[L(-1), \cdot]$ and $-\frac{d}{dx}$ are derivations on the space $\mathbb{C}[[\int_0^{-x} Y^{\le -2}(u, y)]]$ of power series in $\int_0^{-x} Y^{\le -2}(u, y)$. So from (4.4), we obtain
\[ \begin{aligned} \left[L(-1),\ e^{\int_0^{-x} Y^{\le -2}(u, y)}\right] + \frac{d}{dx}\, e^{\int_0^{-x} Y^{\le -2}(u, y)} &= \left[L(-1),\ \int_0^{-x} Y^{\le -2}(u, y)\right] e^{\int_0^{-x} Y^{\le -2}(u, y)} + \left(\frac{d}{dx}\int_0^{-x} Y^{\le -2}(u, y)\right) e^{\int_0^{-x} Y^{\le -2}(u, y)} \\ &= -Y_0(u)x^{-1}\, e^{\int_0^{-x} Y^{\le -2}(u, y)}. \end{aligned} \tag{5.9} \]
The equality (5.9) is equivalent to (5.5).

Let $W = \bigoplus_{n,\alpha\in\mathbb{C}} W_{\langle n\rangle}^{[\alpha]}$ be a strongly $\mathbb{C}$-graded generalized $V$-module (see [HLZ1] and [HLZ2]) with the action $e^{2\pi\sqrt{-1}\,(Y_W)_0(u)}$ of the automorphism $e^{2\pi\sqrt{-1}\,Y_0(u)}$ of $V$ such that the $\mathbb{C}$-grading is given by the generalized eigenspaces of $(Y_W)_0(u)$ and thus is compatible with the generalized eigenspaces of the action $e^{2\pi\sqrt{-1}\,(Y_W)_0(u)}$ of $e^{2\pi\sqrt{-1}\,Y_0(u)}$. For example, when $W$ is a $V$-module, $W$ has a natural $\mathbb{C}$-grading given by the generalized
eigenspaces for $(Y_W)_0(u)$ and together with this grading, $W$ is such a generalized $V$-module. We define a linear map
\[ Y_W^{(u)} : V \otimes W \to W\{x\}[\log x], \qquad v \otimes w \mapsto Y_W^{(u)}(v, x)w \]
by
\[ Y_W^{(u)}(v, x)w = Y_W(\Delta_V^{(u)}(x)v, x)w. \]
First, we have:

Lemma 5.3. The vertex operator $Y_W^{(u)}(\omega, x)$ is in $W[[x, x^{-1}]]$ and if we write
\[ Y_W^{(u)}(\omega, x) = \sum_{n\in\mathbb{Z}} L_W^{(u)}(n)x^{-n-2}, \]
then
\[ L_W^{(u)}(0) = L_W(0) + (Y_W)_0(u) - \frac{1}{2}\mu, \]
where $\mu \in \mathbb{C}$ is given by $(Y_W)_1(u)u = \mu\mathbf{1}$ (note that since $(Y_W)_1(u)u \in V_{(0)} = \mathbb{C}\mathbf{1}$, it must be proportional to $\mathbf{1}$). In particular, for $n \in \mathbb{C}$ and $\alpha$ an eigenvalue of $(Y_W)_0(u)$, $W_{\langle n - \alpha + \frac{1}{2}\mu\rangle}^{[\alpha]}$ is the generalized eigenspace for both $L_W^{(u)}(0)$ and $(Y_W)_0(u)$ with the eigenvalues $n$ and $\alpha$, respectively.

Proof. By (4.1), we have
\[ [Y(\omega, x_1), Y(u, x_2)] = \mathrm{Res}_{x_0}\, x_2^{-1}\delta\!\left(\frac{x_1 - x_0}{x_2}\right) Y(Y(\omega, x_0)u, x_2). \]
Taking the coefficients of $x_1^0$ in both sides and noticing that from our assumption, $L(n)u = 0$ for $n > 0$, we obtain
\[ [L(-2), Y(u, x_2)] = -x_2^{-1}Y(L(-1)u, x_2) + x_2^{-2}Y(L(0)u, x_2) = -x_2^{-1}\frac{d}{dx_2}Y(u, x_2) + x_2^{-2}Y(u, x_2). \]
Taking the coefficients of $x_2^{-m-1}$ in both sides, we obtain
\[ [L(-2), Y_m(u)] = mY_{m-2}(u) \]
for $m \in \mathbb{Z}$. Then we have
\[ Y_m(u)\omega = Y_m(u)L(-2)\mathbf{1} = L(-2)Y_m(u)\mathbf{1} - mY_{m-2}(u)\mathbf{1} \]
for $m \in \mathbb{Z}$. Since $Y_m(u)\mathbf{1} = 0$ for $m \ge 0$ and $Y_{-1}(u)\mathbf{1} = u$, we obtain $Y_m(u)\omega = 0$ for $m \in \mathbb{N}$ but $m \ne 1$ and $Y_1(u)\omega = -u$. In particular, $Y_0(u)\omega = (Y_0(u))_{ss}\omega$. We also know that $Y_m(u)u = 0$ for $m > 1$ and $Y_1(u)u = \mu\mathbf{1}$. So
\[ e^{\int_0^{-x} Y^{\le -2}(u, y)}\omega = \omega + ux^{-1} - \frac{1}{2}\mu\mathbf{1}x^{-2} \]
and
\[ \Delta_V^{(u)}(x)\omega = x^{(Y_0(u))_{ss}}\left(\omega + ux^{-1} - \frac{1}{2}\mu\mathbf{1}x^{-2}\right) = \omega + ux^{-1} - \frac{1}{2}\mu\mathbf{1}x^{-2}. \]
Thus
\[ Y_W^{(u)}(\omega, x) = Y_W(\omega, x) + Y(u, x)x^{-1} - \frac{1}{2}\mu x^{-2} \in W[[x, x^{-1}]] \]
and $L_W^{(u)}(0) = L_W(0) + (Y_W)_0(u) - \frac{1}{2}\mu$. The second conclusion follows immediately.

We have the following consequence:

Corollary 5.4. The space $W$ is also doubly graded by the generalized eigenspaces for $L_W^{(u)}(0)$ and $(Y_W)_0(u)$, that is, $W$ has a $\mathbb{C} \times \mathbb{C}$-grading $W = \bigoplus_{n,\alpha\in\mathbb{C}} W_{[n]}^{[\alpha]}$, where for $n, \alpha \in \mathbb{C}$,
\[ W_{[n]}^{[\alpha]} = W_{\langle n - \alpha + \frac{1}{2}\mu\rangle}^{[\alpha]}, \]
that is, $w \in W_{[n]}^{[\alpha]}$ for $n, \alpha \in \mathbb{C}$ if and only if $w$ is a generalized eigenvector for $L_W^{(u)}(0)$ with the eigenvalue $n$ and a generalized eigenvector for $(Y_W)_0(u)$ with the eigenvalue $\alpha$.

Proof. By Lemma 5.3, every element of $W$ is a finite sum of vectors which are generalized eigenvectors for both $L_W^{(u)}(0)$ and $(Y_W)_0(u)$.

The following theorem is the main result of the present paper:

Theorem 5.5. The pair $(W, Y_W^{(u)})$, where $W$ is equipped with the $\mathbb{C} \times \mathbb{C}$-grading given in Corollary 5.4, is a strongly $\mathbb{C}$-graded generalized $e^{2\pi\sqrt{-1}\,Y_0(u)}$-twisted $V$-module.
Proof. We first prove that $W$ satisfies the grading-restriction condition. By Lemma 5.3, $L_W(0) = L_W^{(u)}(0) - (Y_W)_0(u) + \frac{1}{2}\mu$. Since the generalized $V$-module $W$ satisfies the grading-restriction condition, $\dim W_{\langle n\rangle}^{[\alpha]} < \infty$ for $n, \alpha \in \mathbb{C}$ and for any fixed $n, \alpha \in \mathbb{C}$, $W_{\langle n+l\rangle}^{[\alpha]} = 0$ when $l$ is a sufficiently negative integer. So for $n, \alpha \in \mathbb{C}$, $W_{[n]}^{[\alpha]} = W_{\langle n - \alpha + \frac{1}{2}\mu\rangle}^{[\alpha]}$ are finite dimensional and for any fixed $n, \alpha \in \mathbb{C}$, $W_{[n+l]}^{[\alpha]} = W_{\langle n + l - \alpha + \frac{1}{2}\mu\rangle}^{[\alpha]} = 0$ when $l$ is a sufficiently negative integer.

From the first part of the creation property, $Y(u, x)\mathbf{1} \in V[[x]]$ for the vertex operator algebra, we have $Y_n(u)\mathbf{1} = 0$ for $n \ge 0$. In particular, $(Y_0(u))_{ss}\mathbf{1} = (Y_0(u) - (Y_0(u))_{ss})\mathbf{1} = 0$. We obtain
\[ \Delta_V^{(u)}(x)\mathbf{1} = x^{(Y_0(u))_{ss}}\, e^{(Y_0(u) - (Y_0(u))_{ss})\log x}\, e^{\int_0^{-x} Y^{\le -2}(u, y)}\mathbf{1} = \mathbf{1}. \]
For $w \in W$,
\[ Y_W^{(u)}(\mathbf{1}, x)w = Y_W(\Delta_V^{(u)}(x)\mathbf{1}, x)w = Y_W(\mathbf{1}, x)w = w, \]
proving the identity property.

Let $v \in V$ be a generalized eigenvector for $Y_0(u)$ with eigenvalue $\lambda$. Then for $z \in \mathbb{C}^\times$,
\[ \begin{aligned} \Delta_V^{(u)}(x)e^{2\pi i Y_0(u)}v\Big|_{x^n = e^{n l_p(z)},\ \log x = l_p(z)} &= x^{\lambda}\, e^{(Y_0(u) - (Y_0(u))_{ss})\log x}\, e^{2\pi i\lambda}\, e^{2\pi i(Y_0(u) - (Y_0(u))_{ss})}v\Big|_{x^n = e^{n l_p(z)},\ \log x = l_p(z)} \\ &= x^{\lambda}\, e^{(Y_0(u) - (Y_0(u))_{ss})\log x}v\Big|_{x^n = e^{n l_{p+1}(z)},\ \log x = l_{p+1}(z)} \\ &= \Delta_V^{(u)}(x)v\Big|_{x^n = e^{n l_{p+1}(z)},\ \log x = l_{p+1}(z)}. \end{aligned} \]
Thus
\[ \begin{aligned} (Y_W^{(u)})^{e^{2\pi i Y_0(u)};\,p}(e^{2\pi i Y_0(u)}v, z)w &= Y_W(\Delta_V^{(u)}(x)e^{2\pi i Y_0(u)}v, x)w\Big|_{x^n = e^{n l_p(z)},\ \log x = l_p(z)} \\ &= Y_W(\Delta_V^{(u)}(x)v, x)w\Big|_{x^n = e^{n l_{p+1}(z)},\ \log x = l_{p+1}(z)} \\ &= (Y_W^{(u)})^{e^{2\pi i Y_0(u)};\,p+1}(v, z)w \end{aligned} \]
for $v \in V$ and $w \in W$, that is, the equivariance property holds. By Corollary 5.4, the $L^{(u)}(0)$-grading condition and the grading-compatibility condition are satisfied.
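The equivariance computation above hinges on how a power $x^{\lambda}$ changes when the branch index goes from $p$ to $p+1$. The following numerical sketch illustrates this scalar fact only. It assumes the convention $l_p(z) = \log|z| + \sqrt{-1}\,(\arg z + 2\pi p)$ with $0 \le \arg z < 2\pi$, which is not restated in this section, and the function names and sample values are hypothetical.

```python
import cmath
import math

def l_p(z, p):
    """Branch of log z with imaginary part in [2*pi*p, 2*pi*(p+1)).

    The normalization 0 <= arg z < 2*pi is an assumption of this sketch.
    """
    arg = cmath.phase(z) % (2 * math.pi)
    return complex(math.log(abs(z)), arg + 2 * math.pi * p)

def power_branch(z, lam, p):
    """Value of x^lam under the substitution x^n = exp(n * l_p(z))."""
    return cmath.exp(lam * l_p(z, p))

# Sample point, exponent and branch index (arbitrary illustrative values).
z = complex(0.3, -1.2)
lam = complex(0.25, 0.1)
p = 2

# Passing from branch p to branch p + 1 multiplies x^lam by exp(2*pi*i*lam).
next_branch = power_branch(z, lam, p + 1)
twisted = cmath.exp(2j * math.pi * lam) * power_branch(z, lam, p)
branches_agree = cmath.isclose(next_branch, twisted, rel_tol=1e-9)
```

This is the scalar mechanism by which the factor $e^{2\pi i\lambda}$ is absorbed into the change of branch; the nilpotent part contributes through $\log x$ in the same way.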
By the $L(-1)$-derivative property for the vertex operator map $Y_W$ and (5.5), we have
\[ \begin{aligned} \frac{d}{dx}Y_W^{(u)}(v, x) &= \frac{d}{dx}Y_W(\Delta_V^{(u)}(x)v, x) \\ &= Y_W(L(-1)\Delta_V^{(u)}(x)v, x) + Y_W\!\left(\frac{d}{dx}\Delta_V^{(u)}(x)v, x\right) \\ &= Y_W(L(-1)\Delta_V^{(u)}(x)v, x) + Y_W(-[L(-1), \Delta_V^{(u)}(x)]v, x) \\ &= Y_W(\Delta_V^{(u)}(x)L(-1)v, x) \\ &= Y_W^{(u)}(L(-1)v, x) \end{aligned} \]
for $v \in V$, proving the $L(-1)$-derivative property.

Finally we prove the duality property. By Corollary 5.4, $W_{[n]}^{[\alpha]} = W_{\langle n - \alpha + \frac{1}{2}\mu\rangle}^{[\alpha]}$. So the graded dual of $W$ with respect to the grading $W = \bigoplus_{n,\alpha\in\mathbb{C}} W_{\langle n\rangle}^{[\alpha]}$ and the graded dual with respect to the new $\mathbb{C} \times \mathbb{C}$-grading $W = \bigoplus_{n,\alpha\in\mathbb{C}} W_{[n]}^{[\alpha]}$ in Corollary 5.4 are equal as vector spaces, though their gradings are different. We shall use $W'$ to denote the underlying vector space of these two graded duals. It will be clear from the context which grading we will be using.

For $v_1, v_2 \in V$, $w \in W$ and $w' \in W'$, we know that there exist $m_1, \dots, m_r, n_1, \dots, n_s \in \mathbb{R}$ such that
\[ \Delta_V^{(u)}(x_1)e^{2\pi i Y_0(u)}v_1 \in x_1^{m_1}V[x_1^{-1}][\log x_1] + \cdots + x_1^{m_r}V[x_1^{-1}][\log x_1] \]
and
\[ \Delta_V^{(u)}(x_2)e^{2\pi i Y_0(u)}v_2 \in x_2^{n_1}V[x_2^{-1}][\log x_2] + \cdots + x_2^{n_s}V[x_2^{-1}][\log x_2]. \]
Thus, using the rationality, commutativity and associativity properties for the generalized $V$-module $W$ and the fact that in the region $|z_2| > |z_1 - z_2| > 0$,
\[ \Delta_V^{(u)}(x_2 + x_0)v_1\Big|_{x_0^n = (z_1 - z_2)^n,\ x_2^n = e^{n l_p(z_2)},\ \log x_2 = l_p(z_2)} = \Delta_V^{(u)}(x_1)v_1\Big|_{x_1^n = e^{n l_p(z_1)},\ \log x_1 = l_p(z_1)}, \]
we see that there exist $a_{ijkl} \in \mathbb{C}$ for $i, j, k, l = 0, \dots, N$, $\alpha_i, \beta_j \in \mathbb{C}$ for $i, j = 0, \dots, N$ and $t \in \mathbb{N}$ such that the series
\[ \langle w', (Y_W^{(u)})^{e^{2\pi i Y_0(u)};\,p}(v_1, z_1)(Y_W^{(u)})^{e^{2\pi i Y_0(u)};\,p}(v_2, z_2)w\rangle = \langle w', Y_W(\Delta_V^{(u)}(x_1)v_1, x_1)\, Y_W(\Delta_V^{(u)}(x_2)v_2, x_2)w\rangle\Big|_{x_1^n = e^{n l_p(z_1)},\ \log x_1 = l_p(z_1),\ x_2^n = e^{n l_p(z_2)},\ \log x_2 = l_p(z_2)}, \]
\[ \langle w', (Y_W^{(u)})^{e^{2\pi i Y_0(u)};\,p}(v_2, z_2)(Y_W^{(u)})^{e^{2\pi i Y_0(u)};\,p}(v_1, z_1)w\rangle = \langle w', Y_W(\Delta_V^{(u)}(x_2)v_2, x_2)\, Y_W(\Delta_V^{(u)}(x_1)v_1, x_1)w\rangle\Big|_{x_1^n = e^{n l_p(z_1)},\ \log x_1 = l_p(z_1),\ x_2^n = e^{n l_p(z_2)},\ \log x_2 = l_p(z_2)}, \]
\[ \langle w', Y_W(Y(\Delta_V^{(u)}(x_2 + x_0)v_1, x_0)\, \Delta_V^{(u)}(x_2)v_2, x_2)w\rangle\Big|_{x_0^n = (z_1 - z_2)^n,\ x_2^n = e^{n l_p(z_2)},\ \log x_2 = l_p(z_2)} \]
are absolutely convergent in the regions $|z_1| > |z_2| > 0$, $|z_2| > |z_1| > 0$, $|z_2| > |z_1 - z_2| > 0$, respectively, to the branch
\[ \sum_{i,j,k,l=0}^{N} a_{ijkl}\, e^{m_i l_p(z_1)} e^{n_j l_p(z_2)}\, l_p(z_1)^k\, l_p(z_2)^l\, (z_1 - z_2)^{-t} \]
of the multivalued analytic function
\[ \sum_{i,j,k,l=0}^{N} a_{ijkl}\, z_1^{m_i} z_2^{n_j} (\log z_1)^k (\log z_2)^l (z_1 - z_2)^{-t}. \]
By (5.4),
\[ \begin{aligned} &\langle w', Y_W(Y(\Delta_V^{(u)}(x_2 + x_0)v_1, x_0)\, \Delta_V^{(u)}(x_2)v_2, x_2)w\rangle\Big|_{x_0^n = (z_1 - z_2)^n,\ x_2^n = e^{n l_p(z_2)},\ \log x_2 = l_p(z_2)} \\ &\quad = \langle w', Y_W(\Delta_V^{(u)}(x_2)Y(v_1, x_0)v_2, x_2)w\rangle\Big|_{x_0^n = (z_1 - z_2)^n,\ x_2^n = e^{n l_p(z_2)},\ \log x_2 = l_p(z_2)} \\ &\quad = \langle w', (Y_W^{(u)})^{e^{2\pi i Y_0(u)};\,p}(Y(v_1, z_1 - z_2)v_2, z_2)w\rangle. \end{aligned} \]
Thus the duality property is proved.
Remark 5.6. In the case that $\mathrm{Res}_x Y(u, x)$ acts on $V$ semisimply and has eigenvalues belonging to $\frac{1}{k}\mathbb{Z}$, the theorem above reduces to the construction by Li in [Li].

We have the following immediate consequence:

Corollary 5.7. The correspondence given by $(W, Y_W) \mapsto (W, Y_W^{(u)})$ is a functor from the category of strongly $\mathbb{C}$-graded generalized $V$-modules to the category of strongly $\mathbb{C}$-graded generalized $e^{2\pi\sqrt{-1}\,Y_0(u)}$-twisted $V$-modules.

We now apply the theorem above to give some explicit examples. Let $p$ and $q$ be a pair of coprime positive integers, $L$ the one-dimensional positive-definite even lattice generated by $\gamma$ with the bilinear form $\langle\cdot, \cdot\rangle$ given by $\langle\gamma, \gamma\rangle = 2pq$, and $L'$ its dual lattice. Let $V_L$ be the vertex algebra associated to $L$ and $V_{L'}$ the generalized vertex algebra associated to $L'$ (see [DL]). The element
\[ \omega = \frac{1}{pq}\gamma(-1)^2\mathbf{1} + \frac{p - q}{2pq}\gamma(-2)\mathbf{1} \in V_L \subset V_{L'} \]
is a conformal element. If instead of the usual conformal element $\frac{1}{pq}\gamma(-1)^2\mathbf{1}$, we take $\omega$ to be the conformal element for $V_L$ and $V_{L'}$, we obtain a vertex operator algebra (still denoted $V_L$) and a generalized vertex operator algebra (still denoted $V_{L'}$), respectively. Note that in the grading given by the conformal element $\omega$, the weights of $e^{\gamma/q}$ and $e^{-\gamma/p}$ are 1. Consider the $V_L$-modules $V_{L-\frac{\gamma}{p}}$ and $V_{L+\frac{\gamma}{q}}$, which are both graded by $\mathbb{Z}$. Then we have vertex-operator-algebraic extensions $V_0 = V_L \oplus V_{L-\frac{\gamma}{p}}$, $V_0^{o} = V_L \oplus V_{L+\frac{\gamma}{q}}$ and $V(p, q) = V_L \oplus V_{L-\frac{\gamma}{p}} \oplus V_{L+\frac{\gamma}{q}}$ of $V_L$ for which the vertex operators are given by
\[ Y_{V_0}(u_1 + v_1, x)(u_2 + v_2) = Y_{V_L}(u_1 + v_1, x)u_2 + Y_{V_L}(u_1, x)v_2, \]
\[ Y_{V_0^{o}}(u_1 + w_1, x)(u_2 + w_2) = Y_{V_L}(u_1 + w_1, x)u_2 + Y_{V_L}(u_1, x)w_2, \]
\[ Y_{V(p,q)}(u_1 + v_1 + w_1, x)(u_2 + v_2 + w_2) = Y_{V_L}(u_1 + v_1 + w_1, x)u_2 + Y_{V_L}(u_1, x)(v_2 + w_2) \]
for $u_1, u_2 \in V_L$, $v_1, v_2 \in V_{L-\frac{\gamma}{p}}$ and $w_1, w_2 \in V_{L+\frac{\gamma}{q}}$. Note that $V_0$ and $V_0^{o}$ are vertex operator subalgebras of $V(p, q)$ and that $e^{-\gamma/p} \in V_0$ and $e^{\gamma/q} \in V_0^{o}$. Also note that $(V_0)_{(0)} = (V_0^{o})_{(0)} = (V(p, q))_{(0)} = \mathbb{C}\mathbf{1}$ and $(V_0)_{(n)} = (V_0^{o})_{(n)} = (V(p, q))_{(n)} = 0$ for $n < 0$.

Let $Q = \mathrm{Res}_x Y_{V(p,q)}(e^{\gamma/q}, x)$ and $\tilde{Q} = \mathrm{Res}_x Y_{V(p,q)}(e^{-\gamma/p}, x)$. These operators are called screening operators. Then $e^{2\pi\sqrt{-1}\,Q}$ and $e^{2\pi\sqrt{-1}\,\tilde{Q}}$ are automorphisms of $V_0$ and $V_0^{o}$, respectively, and are both automorphisms of $V(p, q)$. In fact, the triplet vertex operator algebra $W(p, q)$ (see [FGST2] and [AM3]) is a vertex operator subalgebra of the fixed-point subalgebra of $V(p, q)$ under the group generated by $e^{2\pi\sqrt{-1}\,Q}$ and $e^{2\pi\sqrt{-1}\,\tilde{Q}}$.

Let $W = \bigoplus_{n,\alpha\in\mathbb{C}} W_{\langle n\rangle}^{[\alpha]}$ be a strongly $\mathbb{C}$-graded generalized $V_0$-, $V_0^{o}$- or $V(p, q)$-module with an action $e^{2\pi\sqrt{-1}\,(Y_W)_0(e^{\gamma/q})}$ of $e^{2\pi\sqrt{-1}\,Q}$, $e^{2\pi\sqrt{-1}\,(Y_W)_0(e^{-\gamma/p})}$ of $e^{2\pi\sqrt{-1}\,\tilde{Q}}$ or either of them, respectively, such that the $\mathbb{C}$-grading is given by the generalized eigenspaces of $(Y_W)_0(e^{\gamma/q})$ or $(Y_W)_0(e^{-\gamma/p})$. For example, we can take $W$ to be $V_0$, $V_0^{o}$ or $V(p, q)$ themselves. In [AM3] (Theorem 9.1), Adamović and Milas proved that $Y_W^{(e^{-\gamma/p})}$ and $Y_W^{(e^{\gamma/q})}$ are intertwining operators of suitable types. Applying Theorem 5.5, we obtain immediately:

Theorem 5.8. For a strongly $\mathbb{C}$-graded generalized $V_0$-module (or $V_0^{o}$- or $V(p, q)$-module) $(W, Y_W)$, the pair $(W, Y_W^{(e^{-\gamma/p})})$ (or the pair $(W, Y_W^{(e^{\gamma/q})})$ or the pairs $(W, Y_W^{(e^{\gamma/q})})$ and $(W, Y_W^{(e^{-\gamma/p})})$) is a strongly $\mathbb{C}$-graded generalized $e^{2\pi\sqrt{-1}\,\tilde{Q}}$-twisted $V_0$-module (or is a strongly $\mathbb{C}$-graded generalized $e^{2\pi\sqrt{-1}\,Q}$-twisted $V_0^{o}$-module or are strongly $\mathbb{C}$-graded generalized $e^{2\pi\sqrt{-1}\,Q}$- and $e^{2\pi\sqrt{-1}\,\tilde{Q}}$-twisted $V(p, q)$-modules, respectively).

Acknowledgements. The author would like to thank Antun Milas for pointing out that Theorem 5.5 can be used to show that certain intertwining operators constructed in [AM3] are in fact the twisted vertex operator maps for certain generalized twisted modules in the sense of the definition given in the present paper. The author is supported in part by NSF grant PHY-0901237.
References

[A] Abe, T.: A Z2-orbifold model of the symplectic fermionic vertex operator superalgebra. Math. Z. 255, 755–792 (2007)
[AM1] Adamović, D., Milas, A.: Logarithmic intertwining operators and W(2, 2p − 1)-algebras. J. Math. Phys. 48, 073503 (2007)
[AM2] Adamović, D., Milas, A.: On the triplet vertex algebra W(p). Adv. in Math. 217, 2664–2699 (2008)
[AM3] Adamović, D., Milas, A.: Lattice construction of logarithmic modules for certain vertex operator algebras. To appear in Selecta Math. http://arXiv.org/abs/0902.3417v1 [math.QA], 2009
[Ba1] Bantay, P.: Algebraic aspects of orbifold models. Int. J. Mod. Phys. A9, 1443–1456 (1994)
[Ba2] Bantay, P.: Characters and modular properties of permutation orbifolds. Phys. Lett. B419, 175–178 (1998)
[Ba3] Bantay, P.: Permutation orbifolds and their applications. In: Vertex Operator Algebras in Mathematics and Physics, Proc. workshop, Fields Institute for Research in Mathematical Sciences, 2000, ed. by S. Berman, Y. Billig, Y.-Z. Huang, J. Lepowsky, Fields Institute Communications, Vol. 39, Amer. Math. Soc., 2003, pp. 13–23
[BDM] Barron, K., Dong, C., Mason, G.: Twisted sectors for tensor product vertex operator algebras associated to permutation groups. Commun. Math. Phys. 227, 349–384 (2002)
[BHL] Barron, K., Huang, Y.-Z., Lepowsky, J.: An equivalence of two constructions of permutation-twisted modules for lattice vertex operator algebras. J. Pure Appl. Alg. 210, 797–826 (2007)
[Bo] Borcherds, R.: Vertex algebras, Kac-Moody algebras, and the Monster. Proc. Natl. Acad. Sci. USA 83, 3068–3071 (1986)
[BHS] Borisov, L., Halpern, M., Schweigert, C.: Systematic approach to cyclic orbifolds. Int. J. Mod. Phys. A13(1), 125–168 (1998)
[CF] Carqueville, N., Flohr, M.: Nonmeromorphic operator product expansion and c2-cofiniteness for a family of W-algebras. J. Phys. A39, 951–966 (2006)
[dBHO] de Boer, J., Halpern, M., Obers, N.: The operator algebra and twisted KZ equations of WZW orbifolds. J. High Energy Phys. 10, 011 (2001)
[DVVV] Dijkgraaf, R., Vafa, C., Verlinde, E., Verlinde, H.: The operator algebra of orbifold models. Commun. Math. Phys. 123, 485–526 (1989)
[DFMS] Dixon, L., Friedan, D., Martinec, E., Shenker, S.: The conformal field theory of orbifolds. Nucl. Phys. B282, 13–73 (1987)
[DGH] Dixon, L., Ginsparg, P., Harvey, J.: Beauty and the beast: Superconformal symmetry in a Monster module. Commun. Math. Phys. 119, 221–241 (1989)
[DHVW1] Dixon, L., Harvey, J., Vafa, C., Witten, E.: Strings on orbifolds. Nucl. Phys. B261, 678–686 (1985)
[DHVW2] Dixon, L., Harvey, J., Vafa, C., Witten, E.: Strings on orbifolds, II. Nucl. Phys. B274, 285–314 (1986)
[DGM] Dolan, L., Goddard, P., Montague, P.: Conformal field theory of twisted vertex operators. Nucl. Phys. B338, 529–601 (1990)
[D] Dong, C.: Twisted modules for vertex algebras associated with even lattice. J. Alg. 165, 91–112 (1994)
[DL] Dong, C., Lepowsky, J.: The algebraic structure of relative twisted vertex operators. J. Pure Appl. Alg. 110, 259–295 (1996)
[DonLM1] Dong, C., Li, H., Mason, G.: Twisted representations of vertex operator algebras. Math. Ann. 310, 571–600 (1998)
[DonLM2] Dong, C., Li, H., Mason, G.: Modular invariance of trace functions in orbifold theory and generalized moonshine. Commun. Math. Phys. 214, 1–56 (2000)
[DoyLM1] Doyon, B., Lepowsky, J., Milas, A.: Twisted modules for vertex operator algebras and Bernoulli polynomials. Int. Math. Res. Not. 44, 2391–2408 (2003)
[DoyLM2] Doyon, B., Lepowsky, J., Milas, A.: Twisted vertex operators and Bernoulli polynomials. Commun. Contemp. Math. 8, 247–307 (2006)
[FGST1] Feigin, B.L., Gaĭnutdinov, A.M., Semikhatov, A.M., Tipunin, I.Yu.: The Kazhdan-Lusztig correspondence for the representation category of the triplet W-algebra in logarithmic conformal field theories (Russian). Teoret. Mat. Fiz. 148(3), 398–427 (2006)
[FGST2] Feigin, B.L., Gaĭnutdinov, A.M., Semikhatov, A.M., Tipunin, I.Yu.: Logarithmic extensions of minimal models: characters and modular transformations. Nucl. Phys. B757, 303–343 (2006)
[FGST3] Feigin, B.L., Gaĭnutdinov, A.M., Semikhatov, A.M., Tipunin, I.Yu.: Modular group representations and fusion in logarithmic conformal field theories and in the quantum group center. Commun. Math. Phys. 265, 47–93 (2006)
Generalized Twisted Modules of a Vertex Operator Algebra
[Fl1] [Fl2] [FG] [FK] [FGK] [FHL] [FLM1] [FLM2] [FLM3] [Fu]
[FHST] [FKS] [GK1] [GK2] [GR1] [GR2] [GT] [GHHO] [HH] [HO] [HV] [H] [HLZ1] [HLZ2] [Ka1] [Ka2] [KS] [Le1] [Le2]
291
Flohr, M.: On modular invariant partition functions of conformal field theories with logarithmic operators. Int. J. Mod. Phys. A11, 4147–4172 (1996) Flohr, M.: On fusion rules in logarithmic conformal field theories. Int. J. Mod. Phys. A12, 1943– 1958 (1996) Flohr, M., Gaberdiel, M.R.: Logarithmic torus amplitudes. J. Phys. A39, 1955–1968 (2006) Flohr, M., Knuth, H.: On Verlinde-Like formulas in c p,1 logarithmic conformal field theories. To appear, http://arXiv.org/abs/0705.0545v1[math.ph], 2007 Flohr, M., Grabow, C., Koehn, M.: Fermionic expressions for the characters of c( p, 1) logarithmic conformal field theories. Nucl. Phys. B768, 263–276 (2007) Frenkel, I., Huang, Y.-Z., Lepowsky, J.: On axiomatic approaches to vertex operator algebras and modules. Memoirs American Math. Soc. 104, 1993 Frenkel, I., Lepowsky, J., Meurman, A.: A natural representation of the fischer-griess monster with the modular function j as character. Proc. Natl. Acad. Sci. USA 81, 3256–3260 (1984) Frenkel, I., Lepowsky, J., Meurman, A.: Vertex operator calculus. In: Mathematical Aspects of String Theory, Proc. 1986 Conference, San Diego, ed. by S.-T. Yau, Singapore: World Scientific, 1987, pp. 150–188 Frenkel, I., Lepowsky, J., Meurman, A.: Vertex Operator Algebras and the Monster. Pure and Applied Math. Vol. 134, London-New York: Academic Press, 1988 Fuchs, J.: On nonsemisimple fusion rules and tensor categories. In: Lie Algebras, Vertex Operator Algebras and their Applications, Proceedings of a conference in honor of James Lepowsky and Robert Wilson, 2005, ed. Y.-Z. Huang, K. Misra, Contemporary Mathematics, Vol. 442, Providence, RI: Amer. Math. Soc., 2007 Fuchs, J., Hwang, S., Semikhatov, A.M., Tipunin, I.Yu.: Nonsemisimple fusion algebras and the verlinde formula. Commun. Math. Phys. 247(3), 713–742 (2004) Fuchs, J., Klemm, A., Schmidt, M.: Orbifolds by cyclic permutations in gepner type superstrings and in the corresponding calabi-yau manifolds. Ann. Phys. 
214, 221–257 (1992) Gaberdiel, M.R., Kausch, H.G.: Indecomposable fusion products. Nucl. Phys. B477, 298– 318 (1996) Gaberdiel, M.R., Kausch, H.G.: A rational logarithmic conformal field theory. Phys. Lett. B386, 131–137 (1996) Gaberdiel, M.R., Runkel, I.: The logarithmic triplet theory with boundary. J. Phys. A39, 14745– 14780 (2006) Gaberdiel, M.R., Runkel, I.: From boundary to bulk in logarithmic cft. J. Phys. A41, 075402 (2008) Ga˘ınutdinov, A.M., Tipunin, I.Yu.: Radford, drinfeld, and cardy boundary states in (1, p) logarithmic conformal field models. J. Phys. A42, 315207 (2009) Ganor, O., Halpern, M., Helfgott, C., Obers, N.: The outer-automorphic WZW orbifolds on so(2n), including five triality orbifolds on so(8). J. High Energy Phys. 12, 019 (2002) Halpern, M., Helfgott, C.: The general twisted open wzw string. Int. J. Mod. Phys. A20, 923– 992 (2005) Halpern, M., Obers, N.: Two large examples in orbifold theory: abelian orbifolds and the charge conjugation orbifold on su(n). Int. J. Mod. Phys. A17, 3897–3961 (2002) Hamidi, S., Vafa, C.: Interactions on orbifolds. Nucl. Phys. B279, 465–513 (1987) Harvey, J.: Twisting the heterotic string. In: Unified String Theories, Proc. 1985 Inst. for Theoretical Physics Workshop, Ed. by M. Green, D. Gross, Singapore: World Scientific, 1086, pp. 704–718 Huang, Y.-Z., Lepowsky, J., Zhang, L.: A logarithmic generalization of tensor product theory for modules for a vertex operator algebra. Int. J. Math. 17, 975–1012 (2006) Huang, Y.-Z., Lepowsky, J., Zhang, L.: Logarithmic tensor product theory for generalized modules for a conformal vertex algebra. To appear, http://arXiv.org/abs/0710.2687v3[math. QA], 2007 Kausch, H.G.: Extended conformal algebras generated by multiplet of primary fields. Phys. Lett. 259B, 448–455 (1991) Kausch, H.G.: Symplectic fermions. Nucl. Phys. B583, 513–541 (2000) Klemm, A., Schmidt, M.G.: Orbifolds by cyclic permutations of tensor product conformal field theories. Phys. Lett. 
B245, 53–58 (1990) Lepowsky, J.: Calculus of twisted vertex operators. Proc. Nat. Acad. Sci. USA 82, 8295–8299 (1985) Lepowsky, J.: Perspectives on vertex operators and the Monster. In: Proc. 1987 Symposium on the Mathematical Heritage of Hermann Weyl, Duke Univ., Proc. Symp. Pure. Math., Amer. Math. Soc. 48, 181–197 (1988)
292
Y.-Z. Huang
[Li]
Li, H.: Local systems of twisted vertex operators, vertex operator superalgebras and twisted modules. In: Moonshine, the Monster, and related topics Mount Holyoke, 1994, ed. C. Dong, G. Mason, Contemporary Math., Vol. 193, Providence, RI: Amer. Math. Soc., 1996, pp. 203–236 Moore, G.: Atkin-lehner symmetry. Nucl. Phys. B293, 139–188 (1987) Nagatomo, K., Tsuchiya, A.: The Triplet Vertex operator algebra W ( p) and the restricted quantum group at root of unity, to appear, http://arXiv.org/abs/0902.4607v2[math.QA], 2009 Narain, K.S., Sarmadi, M.H., Vafa, C.: Asymmetric orbifolds. Nucl. Phys. B288, 551–577 (1987) Pearce, P.A., Rasmussen, J., Ruelle, P.: Integrable boundary conditions and W-extended fusion in the logarithmic minimal models LM(1, p). J. Phys. A41, 295201 (2008) Pearce, P.A., Rasmussen, J., Ruelle, P.: Grothendieck ring and Verlinde formula for the W-extended logarithmic minimal model WLM(1, p). To appear, http://arXiv.org/abs/0907. 0134v1[hep-th], 2009 Rasmussen, J.: Fusion matrices, generalized Verlinde formulas, and partition functions in WLM(1, p). To appear, http://arXiv.org/abs/0908.2014v2[hep-th], 2009
[M] [NT] [NSV] [PRR1] [PRR2] [R]
Communicated by Y. Kawahigashi
Commun. Math. Phys. 298, 293–322 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1065-0
Communications in
Mathematical Physics
Convolution Inequalities for the Boltzmann Collision Operator
Ricardo J. Alonso (1), Emanuel Carneiro (2), Irene M. Gamba (3)
(1) Dept. of Computational & Applied Mathematics, Rice University, Houston, TX 77005-1892, USA. E-mail: [email protected]
(2) School of Mathematics, Institute for Advanced Study, Einstein Drive, Princeton, NJ 08540, USA. E-mail: [email protected]
(3) Department of Mathematics, University of Texas at Austin, Austin, TX 78712-1082, USA. E-mail: [email protected]
Received: 19 March 2009 / Accepted: 10 March 2010
Published online: 15 June 2010 – © Springer-Verlag 2010
Abstract: We study integrability properties of a general version of the Boltzmann collision operator for hard and soft potentials in $n$ dimensions. A reformulation of the collisional integrals allows us to write the weak form of the collision operator as a weighted convolution, where the weight is given by an operator invariant under rotations. Using a symmetrization technique in $L^p$ we prove a Young’s inequality for hard potentials, which is sharp for Maxwell molecules in the $L^2$ case. Further, we find a new Hardy-Littlewood-Sobolev type of inequality for Boltzmann collision integrals with soft potentials. The same method extends to radially symmetric, non-increasing potentials that lie in some $L^s_{\mathrm{weak}}$ or $L^s$. The method we use resembles a Brascamp, Lieb and Luttinger approach for multilinear weighted convolution inequalities and follows a weak formulation setting. Consequently, it is closely connected to the classical analysis of Young and Hardy-Littlewood-Sobolev inequalities. In all cases, the inequality constants are explicitly given by formulas depending on integrability conditions of the angular cross section (in the spirit of Grad cut-off). As an additional application of the technique we also obtain estimates with exponential weights for hard potentials in both conservative and dissipative interactions.

1. Introduction

1.1. Background. The nonlinear Boltzmann equation is a classical model for a gas at low or moderate densities. The gas in a spatial domain $\Omega \subseteq \mathbb{R}^n$, $n \ge 2$, is described by the evolution of the mass density function $f(x, v, t)$, $(x, v) \in \Omega \times \mathbb{R}^n$, which gives the probability of finding a particle at position $x$ with velocity $v$ at time $t \in \mathbb{R}$. The transport equation for $f$ reads

$$(\partial_t + v \cdot \nabla_x) f = Q(f, f), \tag{1.1}$$

where $Q(f, f)$ is a quadratic integral operator, expressing the change of $f$ due to instantaneous binary collisions of particles. The precise form of $Q(f, f)$ will be introduced
below, for both conservative (elastic) [20] and dissipative (inelastic) interactions [19]. The operator $Q(f, f)$ splits as the difference of two positive operators, usually denoted by $Q^+(f, f)(x, v, t)$, the rate of gain of probability due to two pre-collisional velocities one of which will take the direction $v$, and $Q^-(f, f)(x, v, t)$, the rate of loss of probability due to particles that get knocked out of the direction $v$. In addition, these operators depend on the form of their collision kernels, which model the collision frequency depending on the intermolecular potentials between interacting particles. More specifically, these kernels depend on functions of the relative speed and on the scattering angle, the latter modeled by an angular function referred to as the angular cross section. In all cases we assume that the angular cross section is modeled by an integrable angular function on the sphere $S^{n-1}$ (this condition, in the theory of the Boltzmann equation, is called the Grad cut-off assumption). The collisional kernels are further divided into the following classes: hard potentials, corresponding to unbounded functions of the relative speed and modeling collision rates that become stronger as the relative speed grows; soft potentials, modeling collision rates that become weaker as the relative speed grows; and Maxwell molecule type of interactions, where the collisional kernels are independent of the relative speed.

1.2. Aim of the paper. It is the purpose of this work to investigate the $L^r$-integrability (in velocity) of the gain operator as a bilinear form $Q^+(f, g)(v)$ acting on probability mass densities $f(v)$ and $g(v)$, and to search for exact representation formulas for the inequality constants and possible optimal estimates depending on the $L^p$ and $L^q$ norms of $f$ and $g$, respectively. This provides a new detailed study of the Boltzmann equation with harmonic and functional analysis tools.
It is possible due to the weighted convolution nature of its collisional integral operator in weak form. The work focuses on questions on sharp constants for estimates in such functional spaces as well as on the existence and exact description of maximizers. In order to achieve our goals, we introduce a representation of collisional integrals in weak form that allows us to write the gain term of the collision operator as a weighted convolution, with the weight being given by a suitable bilinear operator invariant under rotations, acting on the test function. Following the initial idea developed in [3], the main ingredient to approach the convolution estimates is an $L^p$-radial symmetrization technique, a genuinely new addition to the Boltzmann theory, used here with much more generality. The authors of [3] showed Young’s inequality for the elastic hard potential case and calculated exact constants which were proved optimal for Maxwellian molecule type models in $L^2$. The method was applied to the strong formulation of the gain operator in the Carleman representation. The new representation presented here, which writes the weak formulation of the gain collisional integral as a bilinear form $Q^+(f, g)$ with a symmetric weighted convolution structure, allows us, very handily, to extend their results to collisional forms with soft (i.e. singular) and symmetric decaying potentials, as well as to the case of dissipative interactions. We also observe that the corresponding loss bilinear form $Q^-(f, g)$ does not have the same convolution symmetry property as the gain one, and so the results we obtain reflect this discrepancy between the gain and loss parts of the Boltzmann collision operator. More specifically, we still employ here the $L^p$-radial symmetrization technique, but in addition we develop a Brascamp, Lieb and Luttinger-type approach to multilinear weighted convolution inequalities associated to a weak formulation of the collisional forms [17,18].
In addition, as in the connections between the works of Beckner [7], Brascamp-Lieb [17,28] and multilinear convex inequalities such as the sharp Young and the sharp Hardy-Littlewood-Sobolev inequalities, we also obtain Young’s inequality for hard potentials, and the Hardy-Littlewood-Sobolev inequality for soft potentials where the weighted convolution structure has a singular kernel. We also show a Young’s inequality for radially symmetric, non-increasing potentials, where the weighted convolution structure contains a singular kernel. In our case, the invariant-under-rotations weight multiplier in the convolutional structure of the weak formulation of the collisional operator introduces a nonlinear change of coordinates due to the angular integration over the $(n-1)$-dimensional sphere. That means our inequalities and methods are not a direct application of those in [7,17,18,28] but rather an analogous result for dissipative Boltzmann operators of collisional type. The range of convexity exponents depends on the convolution symmetry property of the associated bilinear form: it recovers the Hardy-Littlewood-Sobolev convexity relation in the case of the gain operator $Q^+$, while it is restricted for the loss operator $Q^-$. All these estimates are valid for conservative or dissipative interactions between particles, whose restitution coefficients have absolute continuity and monotonicity properties as functions of the impact parameter (see (2.4) below). Finally we obtain Young’s inequalities with exponential weights for hard potentials, both for the elastic and the strictly inelastic (dissipative) case. It is remarkable that, in the dissipative case $e(z) < 1$, tails of order $\lambda$ and decay rate $a$ are preserved (called stretched exponential tails), with order $0 \le \lambda \le 2$. However, we show that in the elastic case these estimates hold for classical Maxwellian tails but they have a polynomial weighted norm in one of the components. These last estimates are of interest for the study, for example, of $L^1$, $L^\infty$ and tail propagation in dissipative kinetic collisional models, such as granular flows.
In all cases, the inequality constants are given by explicit formulas depending only on certain integrability conditions of the angular cross section (in the spirit of Grad cut-off). In short, we hope that this work, inspired by the harmonic analysis toolbox, provides simplified methods, extensions to the full range of exponents including soft potentials and radially non-increasing potentials, and also extensions to dissipative interactions and weighted estimates, to achieve a better qualitative understanding of all the convolution-like inequalities governing the Boltzmann theory.

2. Main Results

2.1. Preliminaries. In this paper we study the integrability properties of the Boltzmann collision operator in the case of elastic or inelastic collisions. The gain part of this bilinear collision operator, commonly denoted by $Q^+(f, g)$, is non-local in both components. It is defined via duality by

$$\int_{\mathbb{R}^n} Q^+(f, g)(v)\,\psi(v)\,dv := \int_{\mathbb{R}^n}\int_{\mathbb{R}^n}\int_{S^{n-1}} f(v)\,g(v_*)\,\psi(v')\,B(|u|, \hat{u}\cdot\omega)\,d\omega\,dv_*\,dv, \tag{2.1}$$

where the functions $f, g, \psi \in C_0(\mathbb{R}^n)$ (continuous with compact support). The symbol $\hat{u}$ represents the unit vector in the direction of $u$ ($\hat{u} = u/|u|$) and $d\omega$ is the surface measure on the sphere $S^{n-1}$. The variables $v, v_*$ (pre-collision velocities), $v', v'_*$ (post-collision velocities) and $u$ (relative velocity) are related by

$$u = v - v_*, \qquad v' = v - \frac{\beta}{2}\,(u - |u|\omega) \qquad \text{and} \qquad v' + v'_* = v + v_*. \tag{2.2}$$
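The collision rule (2.2) can be illustrated numerically. The following is only a minimal sketch, assuming that $\beta$ is supplied as a fixed number for a single collision (in the paper it depends on the impact velocity); the function names are illustrative and do not come from the paper. It checks conservation of momentum and compares the change in $|v|^2 + |v_*|^2$ with the value $-\beta(1-\beta)\left(|u|^2 - |u|\,u\cdot\omega\right)$, which follows from (2.2) by direct expansion.

```python
import math
import random


def collide(v, v_star, omega, beta):
    """Post-collision velocities from (2.2); beta is treated as a given number."""
    u = [a - b for a, b in zip(v, v_star)]
    norm_u = math.sqrt(sum(c * c for c in u))
    # u_minus = (beta / 2) * (u - |u| * omega)
    u_minus = [0.5 * beta * (c - norm_u * w) for c, w in zip(u, omega)]
    v_prime = [a - m for a, m in zip(v, u_minus)]
    # v'_* follows from momentum conservation: v' + v'_* = v + v_*
    v_star_prime = [b + m for b, m in zip(v_star, u_minus)]
    return v_prime, v_star_prime


def energy(v, w):
    """Sum of squared speeds |v|^2 + |w|^2 (twice the kinetic energy for unit mass)."""
    return sum(c * c for c in v) + sum(c * c for c in w)


def energy_loss(v, v_star, omega, beta):
    """beta * (1 - beta) * (|u|^2 - |u| u.omega), obtained by expanding (2.2)."""
    u = [a - b for a, b in zip(v, v_star)]
    norm_u = math.sqrt(sum(c * c for c in u))
    u_dot_omega = sum(c * w for c, w in zip(u, omega))
    return beta * (1.0 - beta) * (norm_u * norm_u - norm_u * u_dot_omega)


def random_unit_vector(n, rng):
    """Uniform direction on the sphere via a normalized Gaussian vector."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(c * c for c in g))
    return [c / norm for c in g]
```

With `beta = 1.0` the loss term vanishes (elastic case), while any `beta` in `[0.5, 1.0)` gives a non-negative loss, in line with the dissipative interpretation.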
The corresponding loss part of the Boltzmann collision operator is non-local in only one of its components, say $g$. It is defined in strong form by

$$Q^-(f, g)(v) := \|b\|_{L^1(S^{n-1})}\, f(v) \int_{\mathbb{R}^n} g(v_*)\,\Phi(u)\,dv_*, \tag{2.3}$$

with $\Phi(u) = |u|^{\lambda}$ for the kernels considered below.
We notice that, while $Q^+$ is a bilinear operator with non-locality in both its components, $Q^-$ is non-local only in its second component $g$ and maintains locality in its first component $f$. This difference in the nature of their non-locality plays a crucial role in the range of exponents for the Hardy-Littlewood-Sobolev inequalities for $Q^-$ with singular potentials (Corollaries 9 and 10). The inelastic properties of the collision operator are encoded in the positive scalar function $\beta : [0, \infty) \to [\tfrac{1}{2}, 1]$ defined by $\beta(z) := \frac{1 + e(z)}{2}$, where the parameter $e$ is the so-called restitution coefficient, which enjoys the following two properties that ensure micro-reversibility of the interactions:
(i) $z \mapsto e(z)$ is absolutely continuous and non-increasing,
(ii) $z \mapsto z\,e(z)$ is non-decreasing.
The dependence of the restitution coefficient on the physical variables is commonly given by $z = |u|\sqrt{\frac{1 - \hat{u}\cdot\omega}{2}}$, i.e. the restitution coefficient $e$, and thus $\beta$, depends only on the impact velocity:

$$\beta\!\left(|u|\sqrt{\tfrac{1 - \hat{u}\cdot\omega}{2}}\right) = \frac{1 + e\!\left(|u|\sqrt{\tfrac{1 - \hat{u}\cdot\omega}{2}}\right)}{2}. \tag{2.4}$$

We point out that the model for $e$ could be more complex (for example assuming dependence on macroscopic variables like temperature); however, this will not be the case in this paper. The particle interaction is elastic when the parameter $\beta = 1$, and is referred to as sticky particles when $\beta = 1/2$. A complete discussion of the physical aspects of the restitution coefficient can be found in [19]. Standard models for the restitution coefficient, for example constant restitution coefficient and viscoelastic hard spheres, satisfy the assumptions (i) and (ii) above. We refer the interested reader to [2,8,16,25] and [30] for additional numerical and mathematical references that use this class of models. The nature of the interactions modeled by $Q^+$ is encoded in the kernel $B(|u|, \hat{u}\cdot\omega)$, determined by the strength of the intermolecular potentials, and many physical models accept the representation (henceforth assumed)

$$B(|u|, \hat{u}\cdot\omega) = |u|^{\lambda}\, b(\hat{u}\cdot\omega) \qquad \text{with } -n < \lambda.$$
Depending on the parameter $\lambda$ the interaction receives different names: soft potentials when $-n < \lambda < 0$, meaning that larger relative velocity corresponds to a weaker collision frequency; Maxwell molecules type of interactions when $\lambda = 0$, with collision frequency independent of the relative velocity; hard potentials when $\lambda > 0$, meaning that larger relative velocity corresponds to a stronger collision frequency. For the (nonnegative) angular kernel $b(\hat{u}\cdot\omega)$ we will require the Grad cut-off assumption

$$\int_{S^{n-1}} b(\hat{u}\cdot\omega)\,d\omega < \infty.$$
We refer to [16] and [23] for a detailed discussion on the inelastic collision operator.
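The modeling ingredients of this subsection (the map from the restitution coefficient $e$ to $\beta$, the monotonicity conditions (i) and (ii), and the naming of potentials by the sign of $\lambda$) can be summarized in a short sketch. All function names here are hypothetical and not from the paper, and the grid check is only a necessary numerical test on sample points: it cannot verify absolute continuity.

```python
def beta_from_restitution(e):
    """beta = (1 + e) / 2 for a restitution coefficient e in [0, 1]."""
    if not 0.0 <= e <= 1.0:
        raise ValueError("restitution coefficient must lie in [0, 1]")
    return (1.0 + e) / 2.0


def potential_type(lam, n):
    """Name of the interaction for the kernel |u|^lam * b, assuming -n < lam."""
    if lam <= -n:
        raise ValueError("the representation assumes -n < lambda")
    if lam < 0:
        return "soft"
    if lam == 0:
        return "Maxwell"
    return "hard"


def satisfies_conditions_on_grid(e, zs, tol=1e-12):
    """Check on an increasing grid zs that e is non-increasing and z*e(z) is non-decreasing."""
    es = [e(z) for z in zs]
    zes = [z * val for z, val in zip(zs, es)]
    non_increasing = all(b <= a + tol for a, b in zip(es, es[1:]))
    non_decreasing = all(b >= a - tol for a, b in zip(zes, zes[1:]))
    return non_increasing and non_decreasing
```

For instance, a constant coefficient such as `lambda z: 0.8` passes the grid check, whereas the hypothetical choice `lambda z: 1.0 / (1.0 + z * z)` fails condition (ii) for large `z`.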
2.2. Description of the results. Very recently, Alonso and Carneiro [3] revisited the $L^p$-analysis of the operator $Q^+$ in the elastic case (restitution coefficient $e \equiv 1$) for the case of Maxwell type of interactions and hard potentials (i.e. $0 \le \lambda \le 1$) and developed, by means of radial symmetrization, a Young’s inequality approach that produces sharp constants. Their approach involved the analysis of the collisional integral of the gain operator in strong form by means of the Carleman integral representation. For our current goals, we now use a weak formulation of the collisional integral written as a weighted convolution, where the weight is a nonlinear invariant-under-rotations operator acting on the test function (see (2.8) below). From now on we work in the setting of dissipative interactions satisfying conditions (i), (ii) and (2.4) as described above. Let $\psi$ and $\phi$ be bounded and continuous functions. Define the bilinear operator

$$P(\psi, \phi)(u) := \int_{S^{n-1}} \psi(u^-)\,\phi(u^+)\,b(\hat{u}\cdot\omega)\,d\omega, \tag{2.5}$$

where the symbols $u^+$ and $u^-$, commonly known as Bobylev’s variables, are defined by

$$u^- := \frac{\beta}{2}\,(u - |u|\omega) \qquad \text{and} \qquad u^+ := u - u^- = (1 - \beta)\,u + \frac{\beta}{2}\,(u + |u|\omega). \tag{2.6}$$
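As a short worked computation (not part of the original text), the lengths of Bobylev’s variables follow directly from (2.6). Writing $s := \hat{u}\cdot\omega$ as a shorthand and using $|\omega| = 1$:

```latex
\begin{align*}
|u^-|^2 &= \frac{\beta^2}{4}\,\bigl|u - |u|\omega\bigr|^2
         = \frac{\beta^2}{4}\,\bigl(2|u|^2 - 2|u|^2 s\bigr)
         = \beta^2\,|u|^2\,\frac{1-s}{2},\\[4pt]
|u^+|^2 &= \Bigl|\bigl(1-\tfrac{\beta}{2}\bigr)u + \tfrac{\beta}{2}|u|\omega\Bigr|^2
         = |u|^2\Bigl[\bigl(1-\tfrac{\beta}{2}\bigr)^2 + \tfrac{\beta^2}{4}
           + \beta\bigl(1-\tfrac{\beta}{2}\bigr)s\Bigr]\\
        &= |u|^2\Bigl[\frac{1+s}{2} + (1-\beta)^2\,\frac{1-s}{2}\Bigr].
\end{align*}
```

In the elastic case $\beta = 1$ these reduce to $|u^-|^2 = |u|^2\,\frac{1-s}{2}$ and $|u^+|^2 = |u|^2\,\frac{1+s}{2}$, so that $|u^-|^2 + |u^+|^2 = |u|^2$ and $u^- \cdot u^+ = 0$.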
We shall see in Sect. 3 that, under conditions (2.4) and (2.6), the operator (2.5) has a certain invariance under the group of rotations in $\mathbb{R}^n$. This operator was first introduced by Bobylev [9,10] in a slightly different setting where it was shown that

$$\widehat{Q^+(f, g)} = P(\hat{f}, \hat{g}), \tag{2.7}$$
in the elastic Maxwell molecules case (i.e. $\lambda = 0$ and $\beta \equiv 1$). More recently, when introducing the study of the Boltzmann dissipative model for Maxwell type of interactions in [12], the authors showed that such a relation also holds for constant $\beta \ne 1$. For references on the use of the Fourier transform in the analysis of the elastic Boltzmann collision operator one can consult [21] and [27]. In the dissipative interaction case we also refer to [25] for the Fourier representation of the $Q^+$ operator. In particular, from (2.1) and (2.5) we obtain the following relation between the operators $Q^+$ and $P$ appearing in the weak formulation of the gain operator, now written in velocity and relative velocity coordinates:

$$\int_{\mathbb{R}^n} Q^+(f, g)(v)\,\psi(v)\,dv = \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} f(v)\,g(v - u)\,P(\tau_v R\psi, 1)(u)\,|u|^{\lambda}\,du\,dv = \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} f(u + v)\,g(v)\,P(1, \tau_{-v}\psi)(u)\,|u|^{\lambda}\,du\,dv, \tag{2.8}$$

where $\tau$ and $R$ are the translation and reflection operators $\tau_v\psi(x) := \psi(x - v)$ and $R\psi(x) := \psi(-x)$. Representation (2.8) exhibits the weighted convolution structure of the weak formulation of the gain operator $Q^+$ and also shows that the integrability properties of the collision operator $Q^+$ are closely related to those of the bilinear operator $P$. A similar approach was carried out in [24], which relates the operator $Q^+$ to a slightly different angular averaging operator.
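The role of the translation and reflection operators in (2.8) can be checked pointwise: for fixed $v$, $v_*$ and $\omega$, the test function evaluated at the post-collision velocity $v' = v - u^-$ coincides with $(\tau_v R\psi)(u^-)$ and with $(\tau_{-v_*}\psi)(u^+)$. The sketch below is only an illustration under simplifying assumptions: $\beta$ is a fixed number, and `psi` is an arbitrary hypothetical test function, not one used in the paper.

```python
import math


def bobylev_variables(u, omega, beta):
    """u_minus = (beta/2)(u - |u| omega) and u_plus = u - u_minus, as in (2.6)."""
    norm_u = math.sqrt(sum(c * c for c in u))
    u_minus = [0.5 * beta * (c - norm_u * w) for c, w in zip(u, omega)]
    u_plus = [c - m for c, m in zip(u, u_minus)]
    return u_minus, u_plus


def translate(v, psi):
    """(tau_v psi)(x) = psi(x - v)."""
    return lambda x: psi([a - b for a, b in zip(x, v)])


def reflect(psi):
    """(R psi)(x) = psi(-x)."""
    return lambda x: psi([-a for a in x])


def psi(x):
    """Hypothetical smooth test function, chosen only for illustration."""
    return math.exp(-sum(c * c for c in x)) * (1.0 + x[0])


def weak_form_arguments(v, v_star, omega, beta):
    """Return psi(v'), (tau_v R psi)(u_minus) and (tau_{-v_*} psi)(u_plus)."""
    u = [a - b for a, b in zip(v, v_star)]
    u_minus, u_plus = bobylev_variables(u, omega, beta)
    v_prime = [a - m for a, m in zip(v, u_minus)]
    first = translate(v, reflect(psi))(u_minus)
    minus_v_star = [-b for b in v_star]
    second = translate(minus_v_star, psi)(u_plus)
    return psi(v_prime), first, second
```

The three returned numbers agree up to rounding, which is the pointwise content of the two changes of variables behind (2.8).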
One of the advantages of working with the bilinear operator $P$ is that a simple change of variables in (2.8) allows us to change the roles of $f$ and $g$ without essentially changing the convolution structure of the integral. This convolution symmetry will be especially important later on in the proof of the Hardy-Littlewood-Sobolev (HLS) inequality for $Q^+$ in the full range of exponents (Theorem 2). Such convolution symmetry is not present in the loss operator $Q^-$: as we shall see in Corollaries 9 and 10, the corresponding weak formulation for $Q^-(f, g)$ has a convolution structure only in its second component $g$, but not in its first one $f$. As a consequence, the HLS inequality for this operator will only be valid in a restricted range (Sect. 6.2). In Sect. 3 we develop the $L^p$-analysis of the operator $P$ by means of the weak representation (2.8) using a radial symmetrization method introduced in [3] for a different representation of the gain operator in strong form by means of the Carleman representation. This approach using (2.8) provides exact constants in some of our inequalities, both in the hard and soft potentials settings, that are sharp in some cases. In general, they are calculated depending on (explicit) integral conditions on the angular cross section. The results of Sect. 3 are crucial for all results proved in the following three sections. In Sect. 4 we revisit Young’s inequality for hard potentials, where the new additions are the calculation of exact constants for conservative or dissipative interactions, which are proved to be sharp in the constant energy dissipation rate for Maxwell type of interactions and selected $L^p$-exponents. To this end, consider the weighted Lebesgue spaces $L^p_k(\mathbb{R}^n)$ ($p \ge 1$, $k \ge 0$) defined by the norm

$$\|f\|_{L^p_k(\mathbb{R}^n)} = \left(\int_{\mathbb{R}^n} |f(v)|^p \left(1 + |v|^{pk}\right) dv\right)^{1/p}.$$
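We prove the following.

Theorem 1. Let $1 \le p, q, r \le \infty$ with $1/p + 1/q = 1 + 1/r$. Assume that $B(|u|, \hat{u}\cdot\omega) = |u|^{\lambda}\, b(\hat{u}\cdot\omega)$, with $\lambda \ge 0$. For $\alpha \ge 0$, the bilinear operator $Q^+$ extends to a bounded operator (see Footnote 1) from $L^p_{\alpha+\lambda}(\mathbb{R}^n) \times L^q_{\alpha+\lambda}(\mathbb{R}^n) \to L^r_{\alpha}(\mathbb{R}^n)$ via the estimate

$$\left\|Q^+(f, g)\right\|_{L^r_{\alpha}(\mathbb{R}^n)} \le C\, \|f\|_{L^p_{\alpha+\lambda}(\mathbb{R}^n)}\, \|g\|_{L^q_{\alpha+\lambda}(\mathbb{R}^n)}. \tag{2.9}$$

The constant $C$ is given by

$$C = 2^{\lambda + 2 + \frac{\alpha}{2} + \frac{1}{r}}\, \left|S^{n-2}\right| \left(\int_{-1}^{1} \left(\frac{1-s}{2}\right)^{-\frac{n}{2r'}} d\xi_n^b(s)\right)^{\frac{r'}{q'}} \left(\int_{-1}^{1} \left[\frac{1+s}{2} + (1-\beta_0)^2\,\frac{1-s}{2}\right]^{-\frac{n}{2r'}} d\xi_n^b(s)\right)^{\frac{r'}{p'}},$$

where $p'$, $q'$, $r'$ denote the conjugate exponents of $p$, $q$, $r$, the measure $\xi_n^b$ on $[-1, 1]$ is defined as $d\xi_n^b(s) = b(s)\,(1 - s^2)^{\frac{n-3}{2}}\,ds$, and $\beta_0 = \beta(0)$. In the case $(p, q, r) = (1, 1, 1)$ the constant $C$ is to be understood as

$$C = 2^{\lambda + 3 + \frac{\alpha}{2}}\, \left|S^{n-2}\right| \int_{-1}^{1} d\xi_n^b(s).$$

Footnote 1: In this paper we shall prove all the inequalities for a dense subspace of smooth functions, giving the desired extensions for $L^p$, $1 \le p < \infty$. The $L^\infty$ bounds, when applicable, are easily treated directly by pulling the $L^\infty$-norm out of the integrals.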
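The bookkeeping in Theorem 1 can be sketched numerically. The snippet below is a hedged illustration, not the authors’ procedure: it computes $r$ from the Young relation $1/p + 1/q = 1 + 1/r$, and evaluates the endpoint constant under the assumption that, for $(p, q, r) = (1, 1, 1)$, it reads $C = 2^{\lambda + 3 + \alpha/2}\,|S^{n-2}| \int_{-1}^{1} b(s)(1 - s^2)^{(n-3)/2}\,ds$. The function names and the midpoint quadrature are choices made here for illustration; the quadrature is crude and is not suited to $n = 2$, where the integrand is singular at the endpoints.

```python
import math
from fractions import Fraction


def _inverse(x):
    """1/x as an exact fraction, with the convention 1/inf = 0."""
    return Fraction(0) if math.isinf(x) else 1 / Fraction(x)


def young_exponent(p, q):
    """Solve 1/p + 1/q = 1 + 1/r for r; returns math.inf when 1/r = 0."""
    if p < 1 or q < 1:
        raise ValueError("exponents must satisfy p, q >= 1")
    inv_r = _inverse(p) + _inverse(q) - 1
    if inv_r < 0:
        raise ValueError("no admissible r: need 1/p + 1/q >= 1")
    return math.inf if inv_r == 0 else 1 / inv_r


def sphere_area(k):
    """Surface measure of the unit sphere S^k in R^(k+1)."""
    return 2.0 * math.pi ** ((k + 1) / 2.0) / math.gamma((k + 1) / 2.0)


def xi_mass(b, n, steps=20000):
    """Midpoint approximation of the total mass of b(s)(1 - s^2)^((n-3)/2) ds on [-1, 1]."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        s = -1.0 + (i + 0.5) * h
        total += b(s) * (1.0 - s * s) ** ((n - 3) / 2.0)
    return total * h


def endpoint_constant(alpha, lam, n, b):
    """Assumed form of C for (p, q, r) = (1, 1, 1): 2^(lam + 3 + alpha/2) |S^(n-2)| * mass."""
    return 2.0 ** (lam + 3.0 + alpha / 2.0) * sphere_area(n - 2) * xi_mass(b, n)
```

For example, `young_exponent(2, 1)` recovers the case $(p, q, r) = (2, 1, 2)$ singled out in the text, and with a constant angular kernel in dimension three the mass of the measure is simply the length of the interval.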
In Sect. 5 we prove a Hardy-Littlewood-Sobolev-type inequality for the collision operator in the case of soft potentials, as follows:

Theorem 2. Let $1 < p, q, r < \infty$ with $-n < \lambda < 0$ and $1/p + 1/q = 1 + \lambda/n + 1/r$. For the kernel $B(|u|, \hat{u}\cdot\omega) = |u|^{\lambda}\, b(\hat{u}\cdot\omega)$, the bilinear operator $Q^+$ extends to a bounded operator from $L^p(\mathbb{R}^n) \times L^q(\mathbb{R}^n) \to L^r(\mathbb{R}^n)$ via the estimate

$$\left\|Q^+(f, g)\right\|_{L^r(\mathbb{R}^n)} \le C\, \|f\|_{L^p(\mathbb{R}^n)}\, \|g\|_{L^q(\mathbb{R}^n)}. \tag{2.10}$$

The constant $C$ is explicit in (5.10), (5.14) and (5.18).

The constants we obtain for the two inequalities above are explicit, but generally not sharp. Only in the cases $\alpha = \lambda = 0$, $(p, q, r) = (2, 1, 2)$ and $(p, q, r) = (1, 2, 2)$ do we find the sharp constant for the Young’s inequality (2.9) (see Corollary 6). In fact, the quest for the sharp forms of these inequalities in the other cases, which could be seen as analogues of the remarkable works of Beckner [7], Brascamp-Lieb [17] and Lieb [28], seems to be a very difficult problem in harmonic analysis. Finally, in Sect. 6, we provide further applications of the methods described here. We first present in Sect. 6.1 a description of the estimates for collision kernels with radial non-increasing potentials that leads to multiple Young’s inequalities in Corollaries 7 and 8. In Sect. 6.2, we study the corresponding inequalities for the loss operator $Q^-$ in Corollaries 9 and 10, and see that the lack of convolution symmetry restricts the range of exponents. In Sect. 6.3 we prove Young-type estimates with exponential weights for hard potentials, both for the elastic and the strictly inelastic (dissipative) case. We show in Proposition 11 that in the elastic case the weighted estimates with classical Maxwellian tails hold but have a polynomial weighted norm in one of the components. However, the situation improves in the strictly dissipative case $e(z) < 1$, as shown in Theorem 12, where tails of order $\lambda$ and decay rate $a$ are preserved (called stretched exponential tails), with order $0 \le \lambda \le 2$.
These are important tools in the study of propagation of moments [16] and $L^1$-$L^\infty$ exponentially weighted comparison principles [23].

2.3. Related literature. Young-type inequalities as in Theorem 1 for the collisional integrals, in the case of Maxwell type and hard potentials, reveal the convolution nature of the operator $Q^+$. In the elastic case, this observation was first introduced by Gustafsson [26] for an assumed angular cross section function with pointwise cut-off away from zero (grazing collisions) and $\pi$ (head-on collisions). Later, Mouhot and Villani [31, Theorem 2.1] revisited the work of Gustafsson under the same conditions to obtain a different control of the constants (observe however that the constant in their Young’s inequality blows up at the endpoints, due to the pointwise cut-off assumption). Also,
standard integrability of the angular cross section is required away from the cut-off. Later, Duduchava-Kirsch-Rjasanow [22] used elementary techniques to obtain a simpler proof of the control of the $L^p$-norm for the $Q^+$ operator in three dimensions by means of $L^1$ and $L^p$ norms; the constants in our present work match precisely the ones obtained in [22] for the case $(p, q, r) = (p, 1, p)$. About the same time, Gamba, Panferov and Villani [24, Lemma 4.1] presented related estimates that use dual integrability of the functions, namely, $f$ and $g$ belonging to $L^1 \cap L^p$. This stronger assumption presented in [24] allowed them to remove the restriction of pointwise cut-offs and use only the integrability of the cross section. Related work on $L^p_k$ estimates for the elastic or the dissipative collisional integrals was also done by Gamba-Panferov-Villani [23], Bobylev-Gamba-Panferov [16], and Mischler-Mouhot-Ricard [30]. In particular, the following observation is worth stressing: some techniques valid in the elastic case are not available in the inelastic (dissipative) one. For instance, defining the angular kernel in a half spherical domain is not possible due to the lack of symmetry, and in particular, such lack of symmetry forces a dual integrability assumption in the derivation of the $L^p$ estimates in [24, Lemma 4.1]. In fact, the dual integrability assumption is adequate for the quadratic operator, and also in some cases for bilinear estimates, such as when studying propagation of derivatives for the conservative case. However, such stronger assumptions do not always hold, and so one needs to study a purely bilinear estimate. This is precisely the case for comparison techniques, where a function $g$, which in principle is unrelated to a solution $f$, is “compared” with the solution and so the bilinear form $Q(g, f)$ appears.
In general, there is no reason why this function $g$ should have dual integrability, and so the previous technique will not apply. Finally, we mention that the authors of [16] have already used the angular averaging mechanism occurring for $P$ (to be introduced in the next section) in order to obtain sharp moment decay formulas for the gain operator, both in the case of elastic and inelastic (with constant restitution coefficient) hard potentials. These results lead to the study of the regularity and the aforementioned Gaussian propagation for the solutions of the Boltzmann equation using comparison techniques [4,23]. In the soft potentials case, Theorem 2 also reinforces the convolution character of $Q^+(f, g)$. It is however important to notice that the weak formulation of the collisional integral as a weighted convolution (representation (2.8)) is crucial to attain the results by estimating multilinear integrals by convex type estimates, maximized in their symmetrization, and for the connections to Brascamp-Lieb-Luttinger inequalities [17,18]. It is also worth mentioning that the classical Hardy-Littlewood-Sobolev inequality was used in connection with the Cancellation Lemmas of Alexandre-Desvillettes-Villani-Wennberg [1] and Villani [32], who constructed estimates with singular kernels and non-integrable cross sections, assuming $L^p_k$ integrability to control the $L^1$-norm of the total collisional operator. Our work is not related to theirs, but rather extends to any radial non-increasing potential, showing that the collisional integral satisfies a Hardy-Littlewood-Sobolev-type inequality both for $Q^+$ and $Q^-$ independently (see also Corollaries 9 and 10), providing exact representations for the constants, and assuming only $L^p$-integrability. In this sense it is the first time that such a connection has been made. Finally, recent main applications of this work are mentioned below.
Alonso and Gamba [5] used this result to obtain classical solutions and $L^p$-stability, $1 \le p \le \infty$, for the Cauchy problem associated to the Boltzmann equation for soft potentials and singular radially symmetric decreasing potentials, with integrable cross section, for initial data near vacuum or large data near a local Maxwellian distribution. In addition, Alonso and Lods [6] have shown pointwise control of inelastic homogeneous cooling solutions for the inelastic Boltzmann equation for hard spheres, and the conjectured Haff law, by means of a corresponding Young-type estimate for the inelastic Boltzmann collision operator with stretched exponential weights.

3. Radial Symmetrization and the Operator P

Let $G = SO(n)$ be the group of rotations of $\mathbb{R}^n$ (orthogonal transformations of determinant 1), in which we will use the variable $R$ to designate a generic rotation. We assume that the Haar measure $d\mu$ of this compact topological group is normalized so that
\[ \int_G d\mu(R) = 1. \]
Let $f \in L^p(\mathbb{R}^n)$, $p \ge 1$. We define the radial symmetrization $f^\star_p$ by
\[ f^\star_p(x) = \left( \int_G |f(Rx)|^p \, d\mu(R) \right)^{1/p}, \quad \text{if } 1 \le p < \infty, \tag{3.1} \]
and
\[ f^\star_\infty(x) = \operatorname*{ess\,sup}_{|y| = |x|} |f(y)|, \tag{3.2} \]
where the essential sup in (3.2) is taken over the sphere of radius $|x|$ with respect to the surface measure over this sphere. This radial rearrangement $f^\star_p$ defined in (3.1)–(3.2) can be seen as an $L^p$-average of $f$ over all the rotations $R \in G$ and it satisfies the following properties:

(i) $f^\star_p$ is radial.
(ii) If $f$ is continuous (or compactly supported) then $f^\star_p$ is also continuous (or compactly supported).
(iii) If $g$ is a radial function then $(fg)^\star_p(x) = f^\star_p(x)\, g(x)$.
(iv) Let $d\nu$ be a rotationally invariant measure on $\mathbb{R}^n$. Then
\[ \int_{\mathbb{R}^n} |f(x)|^p \, d\nu(x) = \int_{\mathbb{R}^n} |f^\star_p(x)|^p \, d\nu(x). \]
In particular,
\[ \|f\|_{L^p(\mathbb{R}^n)} = \|f^\star_p\|_{L^p(\mathbb{R}^n)}. \]

Note that this is not the classical rearrangement by monotonicity of level sets associated to $f$, yet it still preserves the $L^p$-norm. We are thus driven to prove a convexity-type inequality that can be viewed as an analog of the Brascamp-Lieb-Luttinger inequalities, where multilinear convolution type integrals are maximized on their (classical) rearrangements and controlled by multiple Minkowski type convexity estimates [17, 18]. The following result, our first in this manuscript, revisits [3, Lemma 4]. We give here a proof different from that in [3] (which uses a Carleman integral type representation), and present a new approach that exhibits the similarity to the arguments of classical rearrangement inequalities such as those in [17, 18]. This new approach is crucial to perform the extensions to soft potentials through Hardy-Littlewood-Sobolev and Lieb [29] type inequalities.
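Properties (i) and (iii) can be checked numerically in the plane. The following is a minimal sketch, assuming $n = 2$, where the normalized Haar measure on $SO(2)$ is $d\theta / 2\pi$; the function name `radial_symmetrization` and the test functions `f` and `g` are hypothetical choices made for this illustration and do not appear in the paper.

```python
import math

def radial_symmetrization(f, p, x, num_angles=2000):
    """L^p-average of |f| over all rotations of the plane, evaluated at x.

    Assumes n = 2, so the group integral in (3.1) becomes a uniform
    average over the rotation angle theta in [0, 2*pi).
    """
    x1, x2 = x
    total = 0.0
    for k in range(num_angles):
        theta = 2.0 * math.pi * (k + 0.5) / num_angles
        c, s = math.cos(theta), math.sin(theta)
        rx = (c * x1 - s * x2, s * x1 + c * x2)  # R_theta applied to x
        total += abs(f(rx)) ** p
    return (total / num_angles) ** (1.0 / p)

# Hypothetical test functions (illustration only).
def f(x):
    return math.exp(-(x[0] - 1.0) ** 2 - 2.0 * x[1] ** 2)  # not radial

def g(x):
    return 1.0 / (1.0 + x[0] ** 2 + x[1] ** 2)  # radial

p = 3.0
x = (0.7, -0.4)
r = math.hypot(*x)
y = (r * math.cos(1.234), r * math.sin(1.234))  # another point with |y| = |x|

star_x = radial_symmetrization(f, p, x)   # f*_p at x
star_y = radial_symmetrization(f, p, y)   # f*_p at y, should agree with star_x
fg_star = radial_symmetrization(lambda z: f(z) * g(z), p, x)  # (fg)*_p at x
```

Under these assumptions `star_x` and `star_y` agree up to quadrature error (property (i)), and `fg_star` agrees with `star_x * g(x)` (property (iii)).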
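Lemma 3. Let $f, g, \psi \in C_0(\mathbb{R}^n)$ and $1/p + 1/q + 1/r = 1$, with $1 \le p, q, r \le \infty$. Then
\[ \left| \int_{\mathbb{R}^n} P(f, g)(u)\, \psi(u)\, du \right| \le \int_{\mathbb{R}^n} P(f^\star_p, g^\star_q)(u)\, \psi^\star_r(u)\, du. \]

Proof. From (2.4), (2.5) and (2.6) we observe that for any rotation $R$ one has $P(f, g)(Ru) = P(f \circ R, g \circ R)(u)$. Therefore,
\[
\begin{aligned}
\left| \int_{\mathbb{R}^n} P(f, g)(u)\, \psi(u)\, du \right|
&= \left| \int_{\mathbb{R}^n} P(f, g)(Ru)\, \psi(Ru)\, du \right| \\
&= \left| \int_{\mathbb{R}^n} P(f \circ R, g \circ R)(u)\, \psi(Ru)\, du \right| \\
&\le \int_{\mathbb{R}^n} \int_{S^{n-1}} |f(Ru^-)|\, |g(Ru^+)|\, |\psi(Ru)|\, b(\hat{u} \cdot \omega)\, d\omega\, du.
\end{aligned}
\tag{3.3}
\]
Note that the left hand side of (3.3) is independent of $R$. Thus, an integration over the group $G = SO(n)$ leads to
\[
\left| \int_{\mathbb{R}^n} P(f, g)(u)\, \psi(u)\, du \right|
\le \int_{\mathbb{R}^n} \int_{S^{n-1}} \left( \int_G |f(Ru^-)|\, |g(Ru^+)|\, |\psi(Ru)|\, d\mu(R) \right) b(\hat{u} \cdot \omega)\, d\omega\, du.
\tag{3.4}
\]
An application of Hölder's inequality with exponents $p$, $q$ and $r$ yields
\[ \int_G |f(Ru^-)|\, |g(Ru^+)|\, |\psi(Ru)|\, d\mu(R) \le f^\star_p(u^-)\, g^\star_q(u^+)\, \psi^\star_r(u), \]
which together with Eq. (3.4) proves the lemma.

Lemma 3 shows that $L^p$-estimates for the operator $P$ will follow by considering radial functions. If $f : \mathbb{R}^n \to \mathbb{R}$ is radial, we define the function $\tilde{f} : \mathbb{R}^+ \to \mathbb{R}$ by $f(x) = \tilde{f}(|x|)$. In addition, for any $p \ge 1$ and $\alpha \in \mathbb{R}$ we have
\[ \int_{\mathbb{R}^n} |f(x)|^p\, |x|^\alpha\, dx = |S^{n-1}| \int_0^\infty |\tilde{f}(t)|^p\, t^{n-1+\alpha}\, dt. \tag{3.5} \]
Hence, if we define the measure $\nu_\alpha$ on $\mathbb{R}^n$ by $d\nu_\alpha(x) = |x|^\alpha\, dx$, and the measure $\sigma_n^\alpha$ on $\mathbb{R}^+$ by $d\sigma_n^\alpha(t) = t^{n-1+\alpha}\, dt$, Eq. (3.5) translates to
\[ \|f\|_{L^p(\mathbb{R}^n,\, d\nu_\alpha)} = |S^{n-1}|^{1/p}\, \|\tilde{f}\|_{L^p(\mathbb{R}^+,\, d\sigma_n^\alpha)}. \tag{3.6} \]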
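In the following computation we show how the operator $P$ simplifies to a 1-dimensional operator when applied to radial functions. If $f$ and $g$ are radial, then
\[
\begin{aligned}
P(f, g)(u) &= \int_{S^{n-1}} \tilde{f}(|u^-|)\, \tilde{g}(|u^+|)\, b(\hat{u} \cdot \omega)\, d\omega \\
&= \int_{S^{n-1}} \tilde{f}\big(a_1(|u|, \hat{u} \cdot \omega)\big)\, \tilde{g}\big(a_2(|u|, \hat{u} \cdot \omega)\big)\, b(\hat{u} \cdot \omega)\, d\omega \\
&= |S^{n-2}| \int_{-1}^{1} \tilde{f}\big(a_1(|u|, s)\big)\, \tilde{g}\big(a_2(|u|, s)\big)\, b(s)\, (1 - s^2)^{\frac{n-3}{2}}\, ds.
\end{aligned}
\tag{3.7}
\]
The functions $a_1$ and $a_2$ are defined on $\mathbb{R}^+ \times [-1, 1] \to \mathbb{R}^+$ by
\[ a_1(x, s) = \beta\, x \left( \frac{1 - s}{2} \right)^{1/2} \quad \text{and} \quad a_2(x, s) = x \left[ \frac{1 + s}{2} + (1 - \beta)^2\, \frac{1 - s}{2} \right]^{1/2}. \tag{3.8} \]
We conclude from (3.7) that
\[ \widetilde{P(f, g)}(x) = |S^{n-2}| \int_{-1}^{1} \tilde{f}\big(a_1(x, s)\big)\, \tilde{g}\big(a_2(x, s)\big)\, d\xi_n^b(s), \tag{3.9} \]
where the measure $\xi_n^b$ on $[-1, 1]$ is defined as
\[ d\xi_n^b(s) = b(s)\, (1 - s^2)^{\frac{n-3}{2}}\, ds. \]
By virtue of Eq. (3.9) we define the following bilinear operator for any two bounded and continuous functions $f, g : \mathbb{R}^+ \to \mathbb{R}$,
\[ B(f, g)(x) := \int_{-1}^{1} f\big(a_1(x, s)\big)\, g\big(a_2(x, s)\big)\, d\xi_n^b(s). \tag{3.10} \]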
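The one-dimensional operator (3.10) is easy to evaluate numerically. The sketch below is an illustration only, assuming a constant parameter $\beta$, a midpoint quadrature rule, and by default $n = 3$ with $b \equiv 1$ (so that $d\xi_n^b(s) = ds$); the names `a1`, `a2`, `B`, `one` and `gauss` are hypothetical helpers. It also illustrates that $a_1^2 + a_2^2 \le x^2$, with equality in the elastic case $\beta = 1$.

```python
import math

def a1(x, s, beta):
    """|u^-| as a function of x = |u| and s = u_hat . omega, see (3.8)."""
    return beta * x * math.sqrt((1.0 - s) / 2.0)

def a2(x, s, beta):
    """|u^+| as a function of x = |u| and s = u_hat . omega, see (3.8)."""
    return x * math.sqrt((1.0 + s) / 2.0 + (1.0 - beta) ** 2 * (1.0 - s) / 2.0)

def B(f, g, x, beta, n=3, b=lambda s: 1.0, num_nodes=4000):
    """Midpoint-rule approximation of (3.10), assuming constant beta."""
    h = 2.0 / num_nodes
    total = 0.0
    for k in range(num_nodes):
        s = -1.0 + (k + 0.5) * h
        weight = b(s) * (1.0 - s * s) ** ((n - 3) / 2.0)  # density of d(xi_n^b)
        total += f(a1(x, s, beta)) * g(a2(x, s, beta)) * weight
    return total * h

def one(t):
    return 1.0

def gauss(t):
    return math.exp(-t * t)

x = 1.3
mass = B(one, one, x, beta=0.8)            # total mass of xi_3^1 on [-1, 1]
elastic = B(gauss, gauss, x, beta=1.0)     # a1^2 + a2^2 = x^2 when beta = 1
inelastic = B(gauss, gauss, x, beta=0.75)  # a1^2 + a2^2 < x^2 when beta < 1
```

For Gaussians and $\beta = 1$ the integrand is constant in $s$, so `elastic` equals $2 e^{-x^2}$ in this setting, while for $\beta < 1$ the value is larger because energy is dissipated.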
Remark. It is worth noticing that in the case of constant parameter $\beta$ (which includes elastic interactions) the functions $a_1$ and $a_2$ are actually of the form $a_1 = x\, \alpha_1(s)$ and $a_2 = x\, \alpha_2(s)$; that is, $a_1$ and $a_2$ are homogeneous of first order in their radial part, and their angular part is a positive function, bounded by unity, of the angular parametrization $s$. This observation is at the heart of the analysis made in [14], which shows compactness properties of the spectral structure associated to the bilinear form (3.10).

For the operator in (3.10) we have the following bound.

Lemma 4. Let $1 \le p, q, r \le \infty$ with $1/p + 1/q = 1/r$. For $f \in L^p(\mathbb{R}^+, d\sigma_n^\alpha)$ and $g \in L^q(\mathbb{R}^+, d\sigma_n^\alpha)$ we have
\[ \|B(f, g)\|_{L^r(\mathbb{R}^+,\, d\sigma_n^\alpha)} \le C\, \|f\|_{L^p(\mathbb{R}^+,\, d\sigma_n^\alpha)}\, \|g\|_{L^q(\mathbb{R}^+,\, d\sigma_n^\alpha)}, \tag{3.11} \]
where the constant $C$ is given in (3.16) below. In the case of a constant restitution coefficient $e$, corresponding to a constant parameter $\beta = (1 + e)/2$, one can show that
\[ C(n, \alpha, p, q, b, \beta) = \beta^{-\frac{n+\alpha}{p}} \int_{-1}^{1} \left( \frac{1 - s}{2} \right)^{-\frac{n+\alpha}{2p}} \left[ \frac{1 + s}{2} + (1 - \beta)^2\, \frac{1 - s}{2} \right]^{-\frac{n+\alpha}{2q}} d\xi_n^b(s) \tag{3.12} \]
is sharp.

Proof. Using Minkowski's inequality and Hölder's inequality with exponents $p/r$ and $q/r$ we obtain
\[
\begin{aligned}
\|B(f, g)\|_{L^r(\mathbb{R}^+,\, d\sigma_n^\alpha)}
&\le \int_{-1}^{1} \left( \int_0^\infty |f(a_1(x, s))|^r\, |g(a_2(x, s))|^r\, d\sigma_n^\alpha(x) \right)^{1/r} d\xi_n^b(s) \\
&\le \int_{-1}^{1} \left( \int_0^\infty |f(a_1(x, s))|^p\, d\sigma_n^\alpha(x) \right)^{1/p} \left( \int_0^\infty |g(a_2(x, s))|^q\, d\sigma_n^\alpha(x) \right)^{1/q} d\xi_n^b(s).
\end{aligned}
\]
Since the function $z \mapsto z\, e(z)$ is non-decreasing, the change of variables $y = a_1(x, s)$ is valid for any fixed $s \in [-1, 1)$, and its Jacobian satisfies
\[ \frac{da_1}{dx} \ge \frac{1}{2} \left( \frac{1 - s}{2} \right)^{1/2}. \tag{3.13} \]
Moreover, using the fact that $\beta \ge 1/2$, we arrive at
\[ \left( \int_0^\infty |f(a_1(x, s))|^p\, d\sigma_n^\alpha(x) \right)^{1/p} \le 2^{\frac{n+\alpha}{p}} \left( \frac{1 - s}{2} \right)^{-\frac{n+\alpha}{2p}} \|f\|_{L^p(\mathbb{R}^+,\, d\sigma_n^\alpha)}. \tag{3.14} \]
Using a similar analysis for the change of variables $y = a_2(x, s)$, exploiting the fact that $\beta$ is non-increasing, we obtain
\[ \frac{da_2}{dx} \ge \left[ \frac{1 + s}{2} + (1 - \beta_0)^2\, \frac{1 - s}{2} \right]^{1/2}, \tag{3.15} \]
where $\beta_0 = \beta(0)$. We then arrive at
\[ \left( \int_0^\infty |g(a_2(x, s))|^q\, d\sigma_n^\alpha(x) \right)^{1/q} \le \left[ \frac{1 + s}{2} + (1 - \beta_0)^2\, \frac{1 - s}{2} \right]^{-\frac{n+\alpha}{2q}} \|g\|_{L^q(\mathbb{R}^+,\, d\sigma_n^\alpha)}. \]
This gives (3.11) with constant
\[ C = 2^{\frac{n+\alpha}{p}} \int_{-1}^{1} \left( \frac{1 - s}{2} \right)^{-\frac{n+\alpha}{2p}} \left[ \frac{1 + s}{2} + (1 - \beta_0)^2\, \frac{1 - s}{2} \right]^{-\frac{n+\alpha}{2q}} d\xi_n^b(s). \tag{3.16} \]
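In the case of constant $\beta$, the Jacobians (3.13) and (3.15) can be explicitly computed and the proposed change of variables leads to the constant (3.12). To prove that the constant (3.12) is the best possible in this case, one can consider the sequences $\{f_\epsilon\}$ and $\{g_\epsilon\}$ with $\epsilon > 0$ defined by
\[ f_\epsilon(x) = \begin{cases} \epsilon^{1/p}\, x^{-(n+\alpha-\epsilon)/p} & \text{for } 0 < x < 1, \\ 0 & \text{otherwise}, \end{cases} \]
and
\[ g_\epsilon(x) = \begin{cases} \epsilon^{1/q}\, x^{-(n+\alpha-\epsilon)/q} & \text{for } 0 < x < 1, \\ 0 & \text{otherwise}. \end{cases} \]
Clearly,
\[ \|f_\epsilon\|_{L^p(\mathbb{R}^+,\, d\sigma_n^\alpha)} = \|g_\epsilon\|_{L^q(\mathbb{R}^+,\, d\sigma_n^\alpha)} = 1, \]
and one can check that
\[ \|B(f_\epsilon, g_\epsilon)\|_{L^r(\mathbb{R}^+,\, d\sigma_n^\alpha)} \to C, \quad \text{as } \epsilon \to 0, \]
where $C$ is the constant defined in (3.12). The detailed argument is outlined in [3], in the case $\beta = 1$.

From Lemma 3 we have
\[ \|P(f, g)\|_{L^r(\mathbb{R}^n,\, d\nu_\alpha)} \le \|P(f^\star_p, g^\star_q)\|_{L^r(\mathbb{R}^n,\, d\nu_\alpha)}, \]
where $1/p + 1/q = 1/r$. Using Eqs. (3.6), (3.9) and Lemma 4 we obtain
\[
\begin{aligned}
\|P(f^\star_p, g^\star_q)\|_{L^r(\mathbb{R}^n,\, d\nu_\alpha)}
&= |S^{n-1}|^{1/r}\, \big\| \widetilde{P(f^\star_p, g^\star_q)} \big\|_{L^r(\mathbb{R}^+,\, d\sigma_n^\alpha)} \\
&= |S^{n-1}|^{1/r}\, |S^{n-2}|\, \big\| B(\tilde{f}^\star_p, \tilde{g}^\star_q) \big\|_{L^r(\mathbb{R}^+,\, d\sigma_n^\alpha)} \\
&\le C\, |S^{n-1}|^{1/r}\, |S^{n-2}|\, \|\tilde{f}^\star_p\|_{L^p(\mathbb{R}^+,\, d\sigma_n^\alpha)}\, \|\tilde{g}^\star_q\|_{L^q(\mathbb{R}^+,\, d\sigma_n^\alpha)} \\
&= C\, |S^{n-2}|\, \|f\|_{L^p(\mathbb{R}^n,\, d\nu_\alpha)}\, \|g\|_{L^q(\mathbb{R}^n,\, d\nu_\alpha)},
\end{aligned}
\tag{3.17}
\]
and thus we have proved the following result.

Theorem 5. Let $1 \le p, q, r \le \infty$ with $1/p + 1/q = 1/r$, and $\alpha \in \mathbb{R}$. The bilinear operator $P$ extends to a bounded operator from $L^p(\mathbb{R}^n, d\nu_\alpha) \times L^q(\mathbb{R}^n, d\nu_\alpha)$ to $L^r(\mathbb{R}^n, d\nu_\alpha)$ via the estimate
\[ \|P(f, g)\|_{L^r(\mathbb{R}^n,\, d\nu_\alpha)} \le C\, \|f\|_{L^p(\mathbb{R}^n,\, d\nu_\alpha)}\, \|g\|_{L^q(\mathbb{R}^n,\, d\nu_\alpha)}. \]
Moreover, in the case of constant restitution coefficient $e$, the constant
\[ C = |S^{n-2}|\, \beta^{-\frac{n+\alpha}{p}} \int_{-1}^{1} \left( \frac{1 - s}{2} \right)^{-\frac{n+\alpha}{2p}} \left[ \frac{1 + s}{2} + (1 - \beta)^2\, \frac{1 - s}{2} \right]^{-\frac{n+\alpha}{2q}} d\xi_n^b(s) \]
is sharp.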
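As a worked instance of the sharp constant in Theorem 5, consider the illustrative special case $n = 3$, $\alpha = 0$, $b \equiv 1$ and $\beta = 1$ (elastic). Then $d\xi_3^b(s) = ds$, $|S^{1}| = 2\pi$, and the substitution $t = (1 - s)/2$ turns the constant into a Beta integral, $C = 4\pi\, \mathrm{B}\big(1 - \tfrac{3}{2p},\, 1 - \tfrac{3}{2q}\big)$, which is finite only when $p, q > 3/2$. The sketch below compares this closed form with a direct midpoint quadrature; the function names are hypothetical and the special case is an assumption made for this example, not a statement from the paper.

```python
import math

def beta_function(x, y):
    """Euler Beta function B(x, y) for x, y > 0."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)

def sharp_constant_elastic_3d(p, q):
    """Closed form of the Theorem 5 constant for n = 3, alpha = 0, b = 1, beta = 1."""
    ex = 1.0 - 3.0 / (2.0 * p)
    ey = 1.0 - 3.0 / (2.0 * q)
    if ex <= 0.0 or ey <= 0.0:
        return math.inf  # the integral diverges at s = 1 or s = -1
    return 4.0 * math.pi * beta_function(ex, ey)

def sharp_constant_quadrature(p, q, num_nodes=200000):
    """Midpoint rule for 2*pi * int_{-1}^{1} ((1-s)/2)^(-3/(2p)) ((1+s)/2)^(-3/(2q)) ds."""
    h = 2.0 / num_nodes
    total = 0.0
    for k in range(num_nodes):
        s = -1.0 + (k + 0.5) * h
        total += ((1.0 - s) / 2.0) ** (-3.0 / (2.0 * p)) * ((1.0 + s) / 2.0) ** (-3.0 / (2.0 * q))
    return 2.0 * math.pi * total * h

closed = sharp_constant_elastic_3d(4.0, 4.0)   # (p, q, r) = (4, 4, 2)
approx = sharp_constant_quadrature(4.0, 4.0)   # slow convergence near s = +-1
```

The quadrature converges slowly because of the integrable endpoint singularities, so the closed form is the more reliable of the two in this special case.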
An application of Theorem 5 with Fourier transform methods provides sharp estimates for the $(2, 1, 2)$ and $(1, 2, 2)$ Young's inequalities in the case of Maxwell molecules and constant parameter $\beta$ (they will be treated with more generality in the next section). These are the only cases where we are able to explicitly find the sharp constant and exhibit a family of maximizers for the Young's inequality.

Corollary 6. Let $f \in L^1(\mathbb{R}^n)$ and $g \in L^2(\mathbb{R}^n)$. Then
\[ \|Q^+(f, g)\|_{L^2(\mathbb{R}^n)} \le C_0\, \|f\|_{L^1(\mathbb{R}^n)}\, \|g\|_{L^2(\mathbb{R}^n)}, \tag{3.18} \]
with the sharp constant given by
\[ C_0 = |S^{n-2}| \int_{-1}^{1} \left[ \frac{1 + s}{2} + (1 - \beta)^2\, \frac{1 - s}{2} \right]^{-\frac{n}{4}} d\xi_n^b(s). \]
Similarly, for $f \in L^2(\mathbb{R}^n)$ and $g \in L^1(\mathbb{R}^n)$ we have
\[ \|Q^+(f, g)\|_{L^2(\mathbb{R}^n)} \le C_1\, \|f\|_{L^2(\mathbb{R}^n)}\, \|g\|_{L^1(\mathbb{R}^n)}, \tag{3.19} \]
with the sharp constant given by
\[ C_1 = |S^{n-2}|\, \beta^{-\frac{n}{2}} \int_{-1}^{1} \left( \frac{1 - s}{2} \right)^{-\frac{n}{4}} d\xi_n^b(s). \]

Proof. To prove (3.18) observe that
\[
\|Q^+(f, g)\|_{L^2(\mathbb{R}^n)} = \big\| \widehat{Q^+(f, g)} \big\|_{L^2(\mathbb{R}^n)} = \|P(\hat{f}, \hat{g})\|_{L^2(\mathbb{R}^n)} \le C_0\, \|\hat{f}\|_{L^\infty(\mathbb{R}^n)}\, \|\hat{g}\|_{L^2(\mathbb{R}^n)} \le C_0\, \|f\|_{L^1(\mathbb{R}^n)}\, \|g\|_{L^2(\mathbb{R}^n)},
\tag{3.20}
\]
with the constant $C_0$ coming directly from Theorem 5. To guarantee that $C_0$ is indeed the sharp constant in the inequality (3.20) we need approximating sequences $\hat{f}_\epsilon$ and $\hat{g}_\epsilon$ slightly different from those presented at the end of the proof of Lemma 4, since we would like to impose the additional constraint $f \ge 0$ to have $\|\hat{f}\|_{L^\infty(\mathbb{R}^n)} = \|f\|_{L^1(\mathbb{R}^n)}$. Heuristically, this can be done by considering $f = \delta(x)$, the Dirac delta, and so $\hat{f} \equiv 1$. In practice we should choose $f_\epsilon$ a Gaussian approximation of the identity by putting
\[ \hat{f}_\epsilon(x) = e^{-\pi \epsilon^2 |x|^2}, \]
and
\[ \hat{g}_\epsilon(x) = \begin{cases} \epsilon^{1/2}\, |x|^{-(n-\epsilon)/2} & \text{for } 0 < |x| < 1, \\ 0 & \text{otherwise}. \end{cases} \]
A similar consideration applies to the inequality (3.19).
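4. Young's Type Inequality for Hard Potentials

The goal of this section is to prove Theorem 1. First we treat the case $\alpha = \lambda = 0$. The main idea is to use the relation (2.8) that establishes a connection between the operators $Q^+$ and $P$, and then use the $L^p$-knowledge of the operator $P$ from the previous section. In what follows we assume that all the functions involved are non-negative (since when working with $L^p$-norms one can always use the modulus of a function). From (2.8) we have
\[ I := \int_{\mathbb{R}^n} Q^+(f, g)(v)\, \psi(v)\, dv = \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} f(v)\, g(v - u)\, P(\tau_v \mathcal{R} \psi, 1)(u)\, du\, dv. \tag{4.1} \]
Suppose first that $(p, q, r) \ne (1, 1, 1),\ (1, \infty, \infty),\ (\infty, 1, \infty)$. The exponents $p$, $q$, $r$ in Theorem 1 satisfy $1/p' + 1/q' + 1/r = 1$, and thus we can regroup the terms conveniently and use Hölder's inequality,
\[
I = \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} \Big( f(v)^{\frac{p}{r}}\, g(v - u)^{\frac{q}{r}} \Big) \Big( f(v)^{\frac{p}{q'}}\, P(\tau_v \mathcal{R} \psi, 1)(u)^{\frac{r'}{q'}} \Big) \Big( g(v - u)^{\frac{q}{p'}}\, P(\tau_v \mathcal{R} \psi, 1)(u)^{\frac{r'}{p'}} \Big)\, du\, dv \le I_1\, I_2\, I_3,
\tag{4.2}
\]
where
\[
\begin{aligned}
I_1 &:= \left( \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} f(v)^p\, g(v - u)^q\, du\, dv \right)^{1/r}, \\
I_2 &:= \left( \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} f(v)^p\, P(\tau_v \mathcal{R} \psi, 1)(u)^{r'}\, du\, dv \right)^{1/q'}, \\
I_3 &:= \left( \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} g(v - u)^q\, P(\tau_v \mathcal{R} \psi, 1)(u)^{r'}\, du\, dv \right)^{1/p'}
= \left( \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} g(v)^q\, P(1, \tau_{-v} \psi)(u)^{r'}\, du\, dv \right)^{1/p'}.
\end{aligned}
\tag{4.3}
\]
Recall that $\tau$ and $\mathcal{R}$ are unitary operators in the $L^p$ spaces; thus, from (4.2) and Theorem 5 we obtain
\[ I \le C\, \|f\|_{L^p(\mathbb{R}^n)}\, \|g\|_{L^q(\mathbb{R}^n)}\, \|\psi\|_{L^{r'}(\mathbb{R}^n)}, \]
with constant given by
\[
C = |S^{n-2}|\, 2^{\frac{n}{r'}} \left( \int_{-1}^{1} \left( \frac{1 - s}{2} \right)^{-\frac{n}{2r'}} d\xi_n^b(s) \right)^{\frac{r'}{q'}} \left( \int_{-1}^{1} \left[ \frac{1 + s}{2} + (1 - \beta_0)^2\, \frac{1 - s}{2} \right]^{-\frac{n}{2r'}} d\xi_n^b(s) \right)^{\frac{r'}{p'}},
\tag{4.4}
\]
which concludes the proof.

The cases we have left over are easier. Indeed, for $(p, q, r) = (1, 1, 1)$, it is a matter of pulling the $L^\infty$-norm of $P(\tau_v \mathcal{R} \psi, 1)(u)$ out of the integral (4.1) using Theorem 5 with $(\infty, \infty, \infty)$, thus arriving at the constant
\[ C = |S^{n-2}| \int_{-1}^{1} d\xi_n^b(s). \]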
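The exponent bookkeeping behind the splitting (4.2) can be verified with exact rational arithmetic. The following sketch is an illustration only: it assumes the relation $1/p + 1/q = 1 + 1/r$ of Theorem 1 with $1 < p, q, r < \infty$, and the helper names `conjugate` and `young_splitting` as well as the sample exponents are hypothetical.

```python
from fractions import Fraction

def conjugate(t):
    """Hölder conjugate t' with 1/t + 1/t' = 1 (assumes 1 < t < infinity)."""
    return t / (t - 1)

def young_splitting(p, q):
    """Exponents used to split the integrand of (4.1) into the three factors of (4.2)."""
    p, q = Fraction(p), Fraction(q)
    r = 1 / (1 / p + 1 / q - 1)          # from 1/p + 1/q = 1 + 1/r
    pc, qc, rc = conjugate(p), conjugate(q), conjugate(r)
    return {
        "r": r,
        "holder": (r, qc, pc),            # Hölder exponents attached to I1, I2, I3
        "f_powers": (p / r, p / qc),      # powers of f(v) in the 1st and 2nd factor
        "g_powers": (q / r, q / pc),      # powers of g(v - u) in the 1st and 3rd factor
        "P_powers": (rc / qc, rc / pc),   # powers of P(tau_v R psi, 1)(u)
    }

# Sample exponents chosen for illustration: p = 3/2, q = 2, hence r = 6.
split = young_splitting(Fraction(3, 2), Fraction(2))
```

Each pair of powers sums to one, so the three factors in (4.2) multiply back to the integrand of (4.1), and the reciprocals of the Hölder exponents $(r, q', p')$ sum to one.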
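In the case $(p, q, r) = (\infty, 1, \infty)$, it is just a matter of pulling the $L^\infty$-norm of $f$ out of the integral (4.1), and repeating the process as in integral $I_3$ above. The case $(p, q, r) = (1, \infty, \infty)$ is basically the same. In both these cases, the final constant can be read from (4.4). Note that one would obtain the same constants starting with the easy cases $(p, q, r) = (1, 1, 1),\ (1, \infty, \infty),\ (\infty, 1, \infty)$ and using Riesz-Thorin interpolation, but we chose to do this directly.

In the case where $\alpha + \lambda > 0$ and $\lambda > 0$, we use two additional inequalities in order to control the post collisional local energy and powers of the post collisional velocity and of the relative velocity by powers of the integration variables appearing in each $I_i$, $i = 1, 2, 3$, in (4.3). Indeed, from the energy dissipation we have $|v'|^2 + |v'_*|^2 \le |v|^2 + |v_*|^2$, and thus
\[ |v - u^-|^\alpha = |v'|^\alpha \le \big( |v|^2 + |v_*|^2 \big)^{\alpha/2} \le 2^{\alpha/2} \big( |v|^\alpha + |v - u|^\alpha \big). \tag{4.5} \]
Also,
\[ |u|^\lambda \le \big( |v - u| + |v| \big)^\lambda \le 2^\lambda \big( |v - u|^\lambda + |v|^\lambda \big). \tag{4.6} \]
Now, take as a test function $\psi_\alpha(v) = \psi(v)\, |v|^\alpha$. The first step is to use (2.8) and (4.6) to obtain
\[
\begin{aligned}
\int_{\mathbb{R}^n} Q^+(f, g)(v)\, \psi_\alpha(v)\, dv
&= \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} f(v)\, g(v - u)\, P(\tau_v \mathcal{R} \psi_\alpha, 1)(u)\, |u|^\lambda\, du\, dv \\
&\le 2^\lambda \bigg( \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} f(v)\, g(v - u)\, P(\tau_v \mathcal{R} \psi_\alpha, 1)(u)\, |v|^\lambda\, du\, dv \\
&\qquad\quad + \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} f(v)\, g(v - u)\, P(\tau_v \mathcal{R} \psi_\alpha, 1)(u)\, |v - u|^\lambda\, du\, dv \bigg).
\end{aligned}
\tag{4.7}
\]
Making use of (2.5) and (4.5) yields
\[
\begin{aligned}
P(\tau_v \mathcal{R} \psi_\alpha, 1)(u)
&= \int_{S^{n-1}} \psi(v - u^-)\, |v - u^-|^\alpha\, b(\hat{u} \cdot \omega)\, d\omega \\
&\le 2^{\alpha/2} \bigg( \int_{S^{n-1}} \psi(v - u^-)\, |v|^\alpha\, b(\hat{u} \cdot \omega)\, d\omega + \int_{S^{n-1}} \psi(v - u^-)\, |v - u|^\alpha\, b(\hat{u} \cdot \omega)\, d\omega \bigg).
\end{aligned}
\tag{4.8}
\]
Recalling the notation $\psi_\alpha(v) = \psi(v)\, |v|^\alpha$, combining (4.7) and (4.8) one obtains
\[
\begin{aligned}
\int_{\mathbb{R}^n} Q^+(f, g)(v)\, \psi_\alpha(v)\, dv
\le 2^{\lambda + \frac{\alpha}{2}} \bigg( &\int_{\mathbb{R}^n} \int_{\mathbb{R}^n} f_{\lambda+\alpha}(v)\, g(v - u)\, P(\tau_v \mathcal{R} \psi, 1)(u)\, du\, dv \\
+ &\int_{\mathbb{R}^n} \int_{\mathbb{R}^n} f_{\lambda}(v)\, g_{\alpha}(v - u)\, P(\tau_v \mathcal{R} \psi, 1)(u)\, du\, dv \\
+ &\int_{\mathbb{R}^n} \int_{\mathbb{R}^n} f_{\alpha}(v)\, g_{\lambda}(v - u)\, P(\tau_v \mathcal{R} \psi, 1)(u)\, du\, dv \\
+ &\int_{\mathbb{R}^n} \int_{\mathbb{R}^n} f(v)\, g_{\lambda+\alpha}(v - u)\, P(\tau_v \mathcal{R} \psi, 1)(u)\, du\, dv \bigg).
\end{aligned}
\tag{4.9}
\]
We now repeat the procedure for the case $\alpha = \lambda = 0$, breaking each of the 4 integrals of (4.9) into 3 parts according to (4.3). One should then estimate each piece using Theorem 5 and simple inequalities of the form
\[ \|f\|_{L^p(\mathbb{R}^n)} \le \|f\|_{L^p_{\alpha+\lambda}(\mathbb{R}^n)} \quad \text{and} \quad \|f_\alpha\|_{L^p(\mathbb{R}^n)} \le \|f\|_{L^p_{\alpha+\lambda}(\mathbb{R}^n)}. \]
At the end we arrive at
\[ \int_{\mathbb{R}^n} Q^+(f, g)(v)\, \psi_\alpha(v)\, dv \le 2^{\lambda + 2 + \frac{\alpha}{2}}\, C\, \|f\|_{L^p_{\alpha+\lambda}(\mathbb{R}^n)}\, \|g\|_{L^q_{\alpha+\lambda}(\mathbb{R}^n)}\, \|\psi\|_{L^{r'}(\mathbb{R}^n)}. \]
This proves that
\[ \big\| Q^+(f, g)(v)\, |v|^\alpha \big\|_{L^r(\mathbb{R}^n)} \le 2^{\lambda + 2 + \frac{\alpha}{2}}\, C\, \|f\|_{L^p_{\alpha+\lambda}(\mathbb{R}^n)}\, \|g\|_{L^q_{\alpha+\lambda}(\mathbb{R}^n)}. \]
A similar reasoning provides
\[ \big\| Q^+(f, g)(v) \big\|_{L^r(\mathbb{R}^n)} \le 2^{\lambda + 1}\, C\, \|f\|_{L^p_{\alpha+\lambda}(\mathbb{R}^n)}\, \|g\|_{L^q_{\alpha+\lambda}(\mathbb{R}^n)}, \]
and finally
\[ \big\| Q^+(f, g)(v) \big\|_{L^r_\alpha(\mathbb{R}^n)} \le 2^{\lambda + 2 + \frac{\alpha}{2} + \frac{1}{r}}\, C\, \|f\|_{L^p_{\alpha+\lambda}(\mathbb{R}^n)}\, \|g\|_{L^q_{\alpha+\lambda}(\mathbb{R}^n)}, \tag{4.10} \]
with $C$ given in (4.4). The proof of Theorem 1 is now completed.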
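5. Hardy-Littlewood-Sobolev Type Inequality for Soft Potentials

In this section we study the collision operator for soft potentials and prove Theorem 2. We divide this proof in three parts, making use of the convolution symmetry of $Q^+(f, g)$.

5.1. The case $r < q$. From (2.8) we have
\[
\begin{aligned}
I := \int_{\mathbb{R}^n} Q^+(f, g)(v)\, \psi(v)\, dv
&= \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} f(v)\, g(v - u)\, P(\tau_v \mathcal{R} \psi, 1)(u)\, |u|^\lambda\, du\, dv \\
&= \int_{\mathbb{R}^n} f(v) \left( \int_{\mathbb{R}^n} \tau_v \mathcal{R} g(u)\, P(\tau_v \mathcal{R} \psi, 1)(u)\, |u|^\lambda\, du \right) dv.
\end{aligned}
\tag{5.1}
\]
Applying Hölder's inequality and then Theorem 5 to the inner integral of (5.1), with $(p, q, r) = (a, \infty, a)$, we obtain
\[
\begin{aligned}
\int_{\mathbb{R}^n} \tau_v \mathcal{R} g(u)\, P(\tau_v \mathcal{R} \psi, 1)(u)\, |u|^\lambda\, du
&\le \|P(\tau_v \mathcal{R} \psi, 1)\|_{L^a(\mathbb{R}^n,\, d\nu_\lambda)}\, \|\tau_v \mathcal{R} g\|_{L^{a'}(\mathbb{R}^n,\, d\nu_\lambda)} \\
&\le C_1\, \|\tau_v \mathcal{R} \psi\|_{L^a(\mathbb{R}^n,\, d\nu_\lambda)}\, \|\tau_v \mathcal{R} g\|_{L^{a'}(\mathbb{R}^n,\, d\nu_\lambda)} \\
&= C_1\, \big( |\psi|^a * |u|^\lambda \big)^{1/a}(v)\, \big( |g|^{a'} * |u|^\lambda \big)^{1/a'}(v),
\end{aligned}
\]
where $1/a + 1/a' = 1$ ($a$ to be chosen later), and the constant $C_1$ is given by
\[ C_1 = |S^{n-2}|\, 2^{\frac{n+\lambda}{a}} \int_{-1}^{1} \left( \frac{1 - s}{2} \right)^{-\frac{n+\lambda}{2a}} d\xi_n^b(s). \tag{5.2} \]
Therefore we arrive at
\[ I \le C_1 \int_{\mathbb{R}^n} f(v)\, \big( |\psi|^a * |u|^\lambda \big)^{1/a}(v)\, \big( |g|^{a'} * |u|^\lambda \big)^{1/a'}(v)\, dv. \tag{5.3} \]
Applying Hölder's inequality in (5.3) with exponents $1/p + 1/b + 1/c = 1$ ($b$ and $c$ to be chosen later) we arrive at
\[ I \le C_1\, \|f\|_{L^p(\mathbb{R}^n)}\, \big\| |\psi|^a * |u|^\lambda \big\|^{1/a}_{L^{b/a}(\mathbb{R}^n)}\, \big\| |g|^{a'} * |u|^\lambda \big\|^{1/a'}_{L^{c/a'}(\mathbb{R}^n)}. \tag{5.4} \]
We now use the classical Hardy-Littlewood-Sobolev inequality to obtain
\[ \big\| |\psi|^a * |u|^\lambda \big\|_{L^{b/a}(\mathbb{R}^n)} \le C_2\, \|\psi\|^a_{L^{ad}(\mathbb{R}^n)} \tag{5.5} \]
and
\[ \big\| |g|^{a'} * |u|^\lambda \big\|_{L^{c/a'}(\mathbb{R}^n)} \le C_3\, \|g\|^{a'}_{L^{a'e}(\mathbb{R}^n)}, \tag{5.6} \]
where
\[ 1 + \frac{a}{b} = \frac{1}{d} - \frac{\lambda}{n} \quad \text{and} \quad 1 + \frac{a'}{c} = \frac{1}{e} - \frac{\lambda}{n}. \]
The constants $C_2$ and $C_3$ (generally not sharp) are explicit in [29, p. 106]. Finally, putting together (5.5) and (5.6) with (5.4) we arrive at
\[ I \le C_1\, C_2^{1/a}\, C_3^{1/a'}\, \|f\|_{L^p(\mathbb{R}^n)}\, \|g\|_{L^{a'e}(\mathbb{R}^n)}\, \|\psi\|_{L^{ad}(\mathbb{R}^n)}. \tag{5.7} \]
To conclude the proof of the theorem it would suffice to have in (5.7) the relations $a'e = q$ and $ad = r'$. Now comes the moment to choose our variables. All the inequalities we used above will be well-posed if the following relations are satisfied:
\[
\begin{cases}
\dfrac{1}{a} + \dfrac{1}{a'} = 1, & 1 \le a \le \infty, \\[2mm]
\dfrac{1}{p} + \dfrac{1}{b} + \dfrac{1}{c} = 1, & 1 < b, c < \infty, \\[2mm]
1 + \dfrac{a}{b} = \dfrac{1}{d} - \dfrac{\lambda}{n}, & b > a,\ 1 < d < \infty, \\[2mm]
1 + \dfrac{a'}{c} = \dfrac{1}{e} - \dfrac{\lambda}{n}, & c > a',\ 1 < e < \infty, \\[2mm]
a'e = q, \\[1mm]
ad = r'.
\end{cases}
\tag{5.8}
\]
The last two equations determine $d$ and $e$ in terms of $a$. The remaining linear system (in the variables $1/a$, $1/a'$, $1/b$ and $1/c$) is undetermined because of the original relation
\[ \frac{1}{p} + \frac{1}{q} = 1 + \frac{\lambda}{n} + \frac{1}{r}. \]
One can check that the choice
\[ \frac{1}{b} = \frac{1}{r'} - \frac{1}{a} \left( 1 + \frac{\lambda}{n} \right) \quad \text{and} \quad \frac{1}{c} = \frac{1}{q} - \frac{1}{a'} \left( 1 + \frac{\lambda}{n} \right), \]
with any $1/a$ in the non-empty interval
\[ \max\left\{ \frac{1}{r'},\ 1 - \frac{1}{q\,(1 + \frac{\lambda}{n})} \right\} < \frac{1}{a} < \min\left\{ \frac{1}{r'\,(1 + \frac{\lambda}{n})},\ \frac{1}{q'} \right\}, \tag{5.9} \]
provides a solution for (5.8). If we define
\[ D_1 = C_1\, C_2^{1/a}\, C_3^{1/a'} \tag{5.10} \]
using expressions (5.2), (5.5), (5.6) and a choice of $a$ given by (5.9), expression (5.7) is plainly equivalent to
\[ \|Q^+(f, g)\|_{L^r(\mathbb{R}^n)} \le D_1\, \|f\|_{L^p(\mathbb{R}^n)}\, \|g\|_{L^q(\mathbb{R}^n)}, \tag{5.11} \]
finishing the proof in this case.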
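The parameter selection described by (5.8) and (5.9) can be traced with exact rational arithmetic. The sketch below is illustrative: the function name `soft_potential_parameters`, the choice of the midpoint of the interval (5.9) for $1/a$, and the sample values $n = 3$, $\lambda = -1$, $(p, q, r) = (6/5, 3, 2)$ are all assumptions made for this example.

```python
from fractions import Fraction as F

def soft_potential_parameters(n, lam, p, q, r):
    """Pick 1/a at the midpoint of (5.9) and solve (5.8) for b, c, d, e."""
    n, lam, p, q, r = map(F, (n, lam, p, q, r))
    k = 1 + lam / n                       # shorthand for 1 + lambda/n
    assert 1 / p + 1 / q == k + 1 / r     # scaling relation between p, q, r
    rc, qc = r / (r - 1), q / (q - 1)     # conjugate exponents r', q'
    lower = max(1 / rc, 1 - 1 / (q * k))
    upper = min(1 / (rc * k), 1 / qc)
    assert lower < upper, "interval (5.9) is empty (requires r < q)"
    inv_a = (lower + upper) / 2           # any point of the open interval works
    a, ac = 1 / inv_a, 1 / (1 - inv_a)    # a and its conjugate a'
    b = 1 / (1 / rc - inv_a * k)
    c = 1 / (1 / q - (1 - inv_a) * k)
    d, e = rc / a, q / ac                 # from a d = r' and a' e = q
    return {"a": a, "a'": ac, "b": b, "c": c, "d": d, "e": e}

# Sample data for illustration: 1/p + 1/q = 5/6 + 1/3 = 7/6 = 1 - 1/3 + 1/2.
params = soft_potential_parameters(n=3, lam=-1, p=F(6, 5), q=3, r=2)
```

For this sample input the interval (5.9) is $(1/2, 2/3)$, and the midpoint choice gives $a = 12/7$, $a' = 12/5$, $b = 9$, $c = 18$, $d = 7/6$ and $e = 5/4$, which satisfy every relation in (5.8).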
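5.2. The case $r < p$. Here we make use of the convolution symmetry of $Q^+$ given by (2.8),
\[
\begin{aligned}
I := \int_{\mathbb{R}^n} Q^+(f, g)(v)\, \psi(v)\, dv
&= \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} f(u + v)\, g(v)\, P(1, \tau_{-v} \psi)(u)\, |u|^\lambda\, du\, dv \\
&= \int_{\mathbb{R}^n} g(v) \left( \int_{\mathbb{R}^n} \tau_{-v} f(u)\, P(1, \tau_{-v} \psi)(u)\, |u|^\lambda\, du \right) dv.
\end{aligned}
\tag{5.12}
\]
We now proceed exactly as in the proof of the first case, noticing that the roles of $f$ and $g$ (and thus of $p$ and $q$) are interchanged. We will find
\[ I \le C_4\, C_5^{1/a}\, C_6^{1/a'}\, \|g\|_{L^q(\mathbb{R}^n)}\, \|f\|_{L^{a'e}(\mathbb{R}^n)}\, \|\psi\|_{L^{ad}(\mathbb{R}^n)}, \]
where $C_4$ (the analogue of $C_1$) is given by Theorem 5 as
\[ C_4 = |S^{n-2}| \int_{-1}^{1} \left[ \frac{1 + s}{2} + (1 - \beta_0)^2\, \frac{1 - s}{2} \right]^{-\frac{n+\lambda}{2a}} d\xi_n^b(s), \]
while $C_5$ and $C_6$ (the analogues of $C_2$ and $C_3$) are given by classical HLS inequalities as in (5.5) and (5.6), for a choice of $a$ now in the non-empty interval
\[ \max\left\{ \frac{1}{r'},\ 1 - \frac{1}{p\,(1 + \frac{\lambda}{n})} \right\} < \frac{1}{a} < \min\left\{ \frac{1}{r'\,(1 + \frac{\lambda}{n})},\ \frac{1}{p'} \right\}. \tag{5.13} \]
In the end, defining
\[ D_2 = C_4\, C_5^{1/a}\, C_6^{1/a'} \tag{5.14} \]
we will have
\[ \|Q^+(f, g)\|_{L^r(\mathbb{R}^n)} \le D_2\, \|f\|_{L^p(\mathbb{R}^n)}\, \|g\|_{L^q(\mathbb{R}^n)}. \tag{5.15} \]

5.3. The case $r \ge \max\{p, q\}$. The remaining range will be covered by the multilinear Riesz-Thorin interpolation [33, Chapter 12, Theorem 3.3]. Recall that the triple $(p, q, r)$ satisfies $1 < p, q, r < \infty$ and
\[ \frac{1}{p} + \frac{1}{q} = 1 + \frac{\lambda}{n} + \frac{1}{r}. \tag{5.16} \]
For fixed $\lambda$ and $n$, with $-n < \lambda < 0$, there exist triples $(p_1, q_1, r)$, with $r < q_1$, and $(p_2, q_2, r)$, with $r < p_2$, such that
\[ \frac{1}{p_1} + \frac{1}{q_1} = \frac{1}{p_2} + \frac{1}{q_2} = 1 + \frac{\lambda}{n} + \frac{1}{r}. \tag{5.17} \]
Therefore, for the triple $(p_1, q_1, r)$ we have bound (5.11) and for the triple $(p_2, q_2, r)$ we have bound (5.15). From (5.16) and (5.17) there is a $t$, with $0 < t < 1$, such that
\[ t \left( \frac{1}{p_1}, \frac{1}{q_1}, \frac{1}{r} \right) + (1 - t) \left( \frac{1}{p_2}, \frac{1}{q_2}, \frac{1}{r} \right) = \left( \frac{1}{p}, \frac{1}{q}, \frac{1}{r} \right), \]
and we can apply the bilinear version of the Riesz-Thorin interpolation to get
\[ \|Q^+(f, g)\|_{L^r(\mathbb{R}^n)} \le D_1^{t}\, D_2^{(1-t)}\, \|f\|_{L^p(\mathbb{R}^n)}\, \|g\|_{L^q(\mathbb{R}^n)}. \tag{5.18} \]
This concludes the proof.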
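The interpolation weight $t$ can be computed directly from the first coordinate of the convex combination above; the second coordinate then matches automatically because both endpoint triples satisfy (5.17). A minimal sketch with hypothetical sample exponents ($n = 3$, $\lambda = -1$, $r = 2$, so that $1/p + 1/q = 7/6$) and a hypothetical helper `interpolation_weight`:

```python
from fractions import Fraction as F

def interpolation_weight(p, p1, p2):
    """Solve t/p1 + (1 - t)/p2 = 1/p for t (assumes p1 != p2)."""
    p, p1, p2 = map(F, (p, p1, p2))
    return (1 / p - 1 / p2) / (1 / p1 - 1 / p2)

# Target triple with r >= max{p, q}: p = q = 12/7, r = 2.
p, q, r = F(12, 7), F(12, 7), F(2)
p1, q1 = F(6, 5), F(3)      # r < q1: covered by the bound (5.11)
p2, q2 = F(3), F(6, 5)      # r < p2: covered by the bound (5.15)
t = interpolation_weight(p, p1, p2)
```

With these sample exponents the weight is $t = 1/2$, so the interpolated constant in (5.18) would be $D_1^{1/2} D_2^{1/2}$.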
6. Some Immediate Applications

The radial symmetrization technique and the key ideas in the proofs of the basic inequalities for Boltzmann (Young's and HLS) described here have a deeper range and may be applied to other important estimates in kinetic theory. In this section we discuss two additional examples.

6.1. Non-increasing potentials. Assume in the following discussion that the collision kernel has the form
\[ B(|u|, \hat{u} \cdot \omega) = \Phi(u)\, b(\hat{u} \cdot \omega), \tag{6.1} \]
for some radial non-increasing potential $\Phi : \mathbb{R}^n \to \mathbb{R}^+$. Let us retake the discussion of Sect. 3 by defining the measure $\nu_\Phi$ on $\mathbb{R}^n$ by $d\nu_\Phi(x) = \Phi(x)\, dx$, and the measure $\sigma_n^\Phi$ on $\mathbb{R}^+$ by
\[ d\sigma_n^\Phi(t) = \tilde{\Phi}(t)\, t^{n-1}\, dt. \]
We claim that under this convention, Lemma 4 and Theorem 5 are still valid with these measures replacing $d\nu_\alpha$ and $d\sigma_n^\alpha$ in the statement of these results. This follows from a simple observation in the proof of Lemma 4 when one performs the change of variables
\[ y = a_1(x, s) = \beta\, x \left( \frac{1 - s}{2} \right)^{1/2}. \]
Indeed, note that since $\Phi$ is non-increasing and $0 \le \beta \left( \frac{1 - s}{2} \right)^{1/2} \le 1$, then $\tilde{\Phi}(x) \le \tilde{\Phi}(y)$. Hence, using that $\beta \ge 1/2$, the estimate (3.14) changes in this context to
\[ \left( \int_0^\infty |f(a_1(x, s))|^p\, d\sigma_n^\Phi(x) \right)^{1/p} \le 2^{\frac{n}{p}} \left( \frac{1 - s}{2} \right)^{-\frac{n}{2p}} \|f\|_{L^p(\mathbb{R}^+,\, d\sigma_n^\Phi)}. \]
The same observation is valid for the change of variables $y = a_2(x, s)$, in the sense that
\[ \left( \int_0^\infty |g(a_2(x, s))|^q\, d\sigma_n^\Phi(x) \right)^{1/q} \le \left[ \frac{1 + s}{2} + (1 - \beta_0)^2\, \frac{1 - s}{2} \right]^{-\frac{n}{2q}} \|g\|_{L^q(\mathbb{R}^+,\, d\sigma_n^\Phi)}, \]
where $\beta_0 = \beta(0)$. We conclude that, for a radial non-increasing potential, Lemma 4 and Theorem 5 hold with the (non-sharp) constant
\[ C(n, p, q, b, \beta) = 2^{\frac{n}{p}} \int_{-1}^{1} \left( \frac{1 - s}{2} \right)^{-\frac{n}{2p}} \left[ \frac{1 + s}{2} + (1 - \beta_0)^2\, \frac{1 - s}{2} \right]^{-\frac{n}{2q}} d\xi_n^b(s). \]
An immediate consequence of this observation is that Theorem 2 holds for any collision kernel (6.1) with a potential $\Phi$ that is radial non-increasing and belongs to a weak Lebesgue space, $\Phi \in L^s_{\mathrm{weak}}(\mathbb{R}^n)$. A careful reading of the proof of this theorem together with the discussion above leads to:
Corollary 7 (Weakly integrable potential for the gain). Let $1 < p, q, r, s < \infty$ with $1/p + 1/q + 1/s = 1 + 1/r$. Assume that $\Phi \in L^s_{\mathrm{weak}}(\mathbb{R}^n)$ is a radial non-increasing function. For the collision kernel (6.1) the bilinear operator $Q^+$ extends to a bounded operator from $L^p(\mathbb{R}^n) \times L^q(\mathbb{R}^n) \to L^r(\mathbb{R}^n)$ via the estimate
\[ \|Q^+(f, g)\|_{L^r(\mathbb{R}^n)} \le C\, \|\Phi\|_{L^s_{\mathrm{weak}}(\mathbb{R}^n)}\, \|f\|_{L^p(\mathbb{R}^n)}\, \|g\|_{L^q(\mathbb{R}^n)}, \]
with $C$ as in the proof of Theorem 2.

An interesting fact that might be suitable for applications related to soft potentials is that if we impose the stronger condition that our radial non-increasing potential $\Phi$ actually belongs to $L^s(\mathbb{R}^n)$, then we can include the endpoints in the range of exponents for our inequality. To see this, it is just a matter of following the proof of Theorem 2 and using the classical Young's inequality for convolution wherever the classical HLS inequality was used (passages (5.5) and (5.6)). One should obtain:

Corollary 8 (Integrable potential for the gain). Let $1 \le p, q, r, s \le \infty$ with $1/p + 1/q + 1/s = 1 + 1/r$. Assume that $\Phi \in L^s(\mathbb{R}^n)$ is a radial non-increasing function. For the collision kernel (6.1) the bilinear operator $Q^+$ extends to a bounded operator from $L^p(\mathbb{R}^n) \times L^q(\mathbb{R}^n) \to L^r(\mathbb{R}^n)$ via the estimate
\[ \|Q^+(f, g)\|_{L^r(\mathbb{R}^n)} \le C\, \|\Phi\|_{L^s(\mathbb{R}^n)}\, \|f\|_{L^p(\mathbb{R}^n)}\, \|g\|_{L^q(\mathbb{R}^n)}, \]
with $C$ as in the proof of Theorem 2.

6.2. Estimates for the loss operator $Q^-(f, g)$. This discussion leads to the application of the previous results to the loss operator $Q^-(f, g)$ defined in (2.3) in strong form. The following results assert that Theorem 2 and its Corollaries 7 and 8 are also valid for this operator, however in a restricted range of exponents, due to the fact that this operator lacks the convolution symmetry of $Q^+$ given by (2.8) (see expression (6.2) in the proof below). We also show a counterexample that exhibits the lack of the full range of exponents that the gain operator $Q^+$ enjoys.

Corollary 9 (Weakly integrable potential for the loss). Let $1 < p, q, r, s < \infty$ with $1/p + 1/q + 1/s = 1 + 1/r$ and $r < p$. Assume that $\Phi \in L^s_{\mathrm{weak}}(\mathbb{R}^n)$ is a radial non-increasing function. For the collision kernel (6.1) the bilinear operator $Q^-$ extends to a bounded operator from $L^p(\mathbb{R}^n) \times L^q(\mathbb{R}^n) \to L^r(\mathbb{R}^n)$ via the estimate
\[ \|Q^-(f, g)\|_{L^r(\mathbb{R}^n)} \le C_5^{1/a}\, C_6^{1/a'}\, \|b\|_{L^1(S^{n-1})}\, \|\Phi\|_{L^s_{\mathrm{weak}}(\mathbb{R}^n)}\, \|f\|_{L^p(\mathbb{R}^n)}\, \|g\|_{L^q(\mathbb{R}^n)}, \]
with $C_5$, $C_6$ and $a$ as in the proof of Theorem 2 (second case).

Corollary 10 (Integrable potential for the loss). Let $1 \le p, q, r, s \le \infty$ with $1/p + 1/q + 1/s = 1 + 1/r$ and $r \le p$. Assume that $\Phi \in L^s(\mathbb{R}^n)$ is a radial non-increasing function. For the collision kernel (6.1) the bilinear operator $Q^-$ extends to a bounded operator from $L^p(\mathbb{R}^n) \times L^q(\mathbb{R}^n) \to L^r(\mathbb{R}^n)$ via the estimate
\[ \|Q^-(f, g)\|_{L^r(\mathbb{R}^n)} \le \|b\|_{L^1(S^{n-1})}\, \|\Phi\|_{L^s(\mathbb{R}^n)}\, \|f\|_{L^p(\mathbb{R}^n)}\, \|g\|_{L^q(\mathbb{R}^n)}. \]
Proof. These estimates can be proved noticing that for any test function $\psi$ one has
\[
\begin{aligned}
\int_{\mathbb{R}^n} Q^-(f, g)(v)\, \psi(v)\, dv
&= \|b\|_{L^1(S^{n-1})} \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} g(v)\, f(u)\, \psi(u)\, \Phi(v - u)\, du\, dv \\
&\le \|b\|_{L^1(S^{n-1})} \int_{\mathbb{R}^n} g(v)\, \big[ (|\psi|^a * \Phi)(v) \big]^{1/a}\, \big[ (|f|^{a'} * \Phi)(v) \big]^{1/a'}\, dv,
\end{aligned}
\tag{6.2}
\]
where $1/a + 1/a' = 1$. Now it is just a matter of following the proof of the second case of Theorem 2. Note that estimates (5.5) and (5.6) are achieved using the classical version of the HLS inequality when $\Phi$ is weakly integrable. Meanwhile, when $\Phi$ is integrable they are achieved using the classical Young's inequality. It is precisely in this step that the ranges of exponents differ in the two corollaries.

Remark. One should not expect Corollaries 9 and 10 to hold outside the ranges above. For instance, in Corollary 9 we could examine the situation $p = q = r = s = 2$, taking the potential $\Phi(x) = |x|^{-n/2}$ and
\[ g(x) = \begin{cases} |x|^{-n/2}\, (\ln |x|)^{-1} & \text{if } |x| \ge e, \\ 0 & \text{otherwise}. \end{cases} \]
In this situation we would have, for any $v$,
\[
\begin{aligned}
\int_{\mathbb{R}^n} g(v_*)\, \Phi(u)\, dv_* &= \int_{\mathbb{R}^n} g(x)\, \Phi(v - x)\, dx \\
&= \int_{\{|x| \ge e\}} |x|^{-n/2}\, (\ln |x|)^{-1}\, |x - v|^{-n/2}\, dx \\
&\ge \frac{1}{2^{n/2}} \int_{\{|x| \ge \max\{e, |v|\}\}} |x|^{-n}\, (\ln |x|)^{-1}\, dx = \infty,
\end{aligned}
\]
which makes $Q^-(f, g)(v)$ blow up pointwise in (2.3), and thus the bound of Corollary 9 would not hold. For Corollary 10 we could even consider a simpler example taking $\Phi(x)$ constant ($s = \infty$), $p = 1$ and $q = r = \infty$. From (2.3) it is easy to see that the bound in Corollary 10 cannot hold in this case.

6.3. Inequalities with exponential weights. In this subsection we will present a Young-type estimate for the Boltzmann collision operator with stretched exponential weights. Our motivation for presenting such estimates is the study of tail propagation, for instance, in solutions of the elastic Boltzmann equation, in self-similar solutions of the inelastic Boltzmann equation, or in solutions of the inelastic Boltzmann equation with added heating sources. It is well known that the homogeneous elastic Boltzmann equation propagates solutions with Maxwellian tails. More specifically, it was proved by Bobylev [11] that such solutions $f(t, v)$, in 3 dimensions and for hard spheres, satisfy the $L^1$-Maxwellian weighted estimate if initially so, that is,
\[ \text{if } \big\| f_0(v)\, e^{a_0 |v|^2} \big\|_{L^1} < \infty \text{ then } \sup_{t \ge 0} \big\| f(t, v)\, e^{a |v|^2} \big\|_{L^1} < \infty \text{ for some } a \le a_0. \]
Later, a program based on a comparison principle was introduced in [23], for the $n$-dimensional case with variable potentials and integrable differential cross section, to obtain exponentially weighted pointwise estimates,
\[ \sup_{t \ge 0} \big\| f(t, v)\, e^{a |v|^2} \big\|_{L^\infty} < \infty, \]
under the assumption that $f_0$ satisfies similar $L^\infty$-Maxwellian integrability, not necessarily with the same rate $a$. One of the key ingredients in this program is the inequality (see Lemma 5 in [23])
\[ \big\| Q^+(e^{-a |v|^2}, f)(v)\, e^{a |v|^2} \big\|_{L^\infty} \le C\, \big\| f(v)\, e^{a |v|^2} \big\|_{L^1_k}, \]
with $k = k(\lambda, n)$, and $k = 0$ for the particular case of hard spheres in three dimensions (i.e. $\lambda = 1$ and $n = 3$). This estimate is a particular case of the Young's inequality with Maxwellian weights. The techniques used in [23], and the subsequent applications to propagation of Maxwellian tails for derivatives [4], were based on the strong formulation of the collisional integral by the Carleman representation. Further studies of the inelastic Boltzmann equation showed that solutions of the homogeneous inelastic Boltzmann equation with constant restitution coefficient behave completely differently from those of the elastic one. Thus, the previous $L^\infty$-Maxwellian estimate does not hold for them. Indeed, as time passes, the gas cools down to complete rest. In other words, the density converges to a Dirac distribution at $v = 0$. This result was first rigorously proved for the Maxwell type of interaction models by Bobylev, Carrillo and Gamba [12]. One may obtain such convergence by a dynamical rescaling that leaves the collisional integral invariant, usually referred to as a self-similar transformation. It was actually shown that for any dissipative model of Maxwell type of interactions, such a transformation leads to a static frame where the distributions have power law high energy tails [13–15]. However, this picture is very different for hard potentials, where stretched exponential tails must be expected. Indeed, Bobylev, Gamba and Panferov proved that any solution to the self-similar transformed static problem of the inelastic Boltzmann equation lies in $L^1$ with a weight that corresponds to a non-Gaussian exponential high energy tail [16]. Their studies also showed similar behavior for other inelastic collisional models under heating sources, whose stationary solutions are shown to have $L^1$-exponential weighted estimates with weights depending on the restitution coefficient and the heating mechanism.
Later, the time propagation of $L^1$-exponential tails by self-similarity, in the hard spheres case, was shown by Mischler, Mouhot and Ricard [30]. All these estimates, rigorously proved in the inelastic case for hard spheres and constant restitution coefficient, yield
\[ \sup_{t \ge 0} \big\| f_s(t, v)\, e^{a |v|} \big\|_{L^1} < \infty, \]
where $f_s$ is the self-similar profile. In addition, for hard potentials $0 < \lambda < 1$ one expects, using formal arguments,
\[ \sup_{t \ge 0} \big\| f_s(t, v)\, e^{a |v|^\lambda} \big\|_{L^1} < \infty. \tag{6.3} \]
Consequently, a companion issue arises on the availability of exponentially weighted pointwise estimates for self-similar profiles $f_s$ associated to dissipative collisional equations, that is
\[ \sup_{t \ge 0} \big\| f_s(t, v)\, e^{a |v|^\lambda} \big\|_{L^\infty} < \infty. \tag{6.4} \]
Passing from (6.3) to (6.4) will require an inequality of the type of (6.5). Another source of motivation for seeking $L^p$-exponentially weighted estimates can be found in a forthcoming work [6], where different issues of the self-similar asymptotics of inelastic collisions for variable restitution coefficient are treated. Motivated by the above discussion, we present two Young-type estimates for the Boltzmann collision operator with stretched exponential weights which preserve the rate of the exponential for elastic and inelastic interactions. In order to shorten notation, fix $a > 0$ and define for $\gamma \ge 0$ the exponential weight and the Maxwellian weight respectively as
\[ M_{a,\gamma}(v) := \exp\big( -a |v|^\gamma \big) \quad \text{and} \quad M_a(v) := M_{a,2}(v). \]
The first result of this section is an inequality which is ideal in the study of pointwise Maxwellian tails for hard potentials in the elastic case. The proof that we give works for both elastic and inelastic cases.

Proposition 11. Let $1 \le p, q, r \le \infty$ with $1/p + 1/q = 1 + 1/r$. Assume that $B(|u|, \hat{u} \cdot \omega) = |u|^\lambda\, b(\hat{u} \cdot \omega)$, with $\lambda \ge 0$. Then, for $a > 0$,
\[ \big\| Q^+(f, g)\, M_a^{-1} \big\|_{L^r(\mathbb{R}^n)} \le C\, \big\| f\, M_a^{-1} \big\|_{L^p(\mathbb{R}^n)}\, \big\| g\, M_a^{-1} \big\|_{L^q_\lambda(\mathbb{R}^n)}. \tag{6.5} \]

Proof. Using (2.1) and (2.2) we obtain
\[
\begin{aligned}
I := \int_{\mathbb{R}^n} Q^+(f, g)(v)\, \big( M_a^{-1} \psi \big)(v)\, dv
= \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} f(v)\, g(v - u) \int_{S^{n-1}} \big( M_a^{-1} \psi \big)(v')\, |u|^\lambda\, b(\hat{u} \cdot \omega)\, d\omega\, du\, dv.
\end{aligned}
\tag{6.6}
\]
From the energy dissipation we have $|v'|^2 + |v'_*|^2 \le |v|^2 + |v_*|^2$, and thus
\[ M_a^{-1}(v') \le M_a^{-1}(v)\, M_a^{-1}(v_*)\, M_a(v'_*). \tag{6.7} \]
Using (6.7) in (6.6) we obtain
\[ I \le \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} \big( M_a^{-1} f \big)(v)\, \big( M_a^{-1} g \big)(v - u) \int_{S^{n-1}} \psi(v')\, M_a(v'_*)\, |u|^\lambda\, b(\hat{u} \cdot \omega)\, d\omega\, du\, dv. \tag{6.8} \]
Recall that
\[ |u^-| = \beta\, |u| \sqrt{\frac{1 - \hat{u} \cdot \omega}{2}}. \]
Therefore, using the fact that β ≥ 1/2, we have the simple pointwise estimate, |u|λ ≤ 2λ
2 1 − uˆ · ω
λ/2
|u − |λ ≤ 22λ−1
2 1 − uˆ · ω
λ/2
|v∗ + u − |λ + |v∗ |λ .
Further, v∗ = v∗ + u − , then
Ma (v∗ ) |v∗ + u − |λ + |v∗ |λ ≤ Cλ,a (1 + |v∗ |λ ). Using (6.9) we have
(6.9)
ψ(v ) Ma (v∗ ) |u|λ b(uˆ · ω) dω
˜ uˆ · ω) dω, ψ(v − u − ) b( ≤ Cλ,a 1 + |v∗ |λ
S n−1
(6.10)
S n−2
where we have defined ˜ uˆ · ω) := b(
2 1 − uˆ · ω
λ/2
b(uˆ · ω).
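Using (6.10) in expression (6.8) we arrive at (recall that $v_* = v - u$)
$$I \le C_{\lambda,a} \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} \left(M_a^{-1} f\right)(v)\, \left(M_a^{-1} g\right)(v-u)\, \left(1 + |v-u|^{\lambda}\right) P(\tau_v R\psi, 1)(u)\, du\, dv.$$
We now have arrived at the same expression given in (4.1), with $f(v)$ changed by $\left(M_a^{-1} f\right)(v)$ and $g(v)$ changed by $\left(M_a^{-1} g\right)(v)\left(1 + |v|^{\lambda}\right)$. Repeating the argument for the Young's inequality in Sect. 3 we will conclude that
$$\left\| Q^{+}(f,g)\, M_a^{-1} \right\|_{L^{r}(\mathbb{R}^n)} \le C\, C_{\lambda,a}\, \left\| f M_a^{-1} \right\|_{L^{p}(\mathbb{R}^n)} \left\| g M_a^{-1} \right\|_{L^{q}_{\lambda}(\mathbb{R}^n)}, \tag{6.11}$$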
with $C$ given by (4.4) with $\tilde{b}$ in place of $b$ and $C_{\lambda,a}$ from (6.10). This concludes the proof.

If we restrict ourselves and impose "strict dissipation conditions" to the model we can do even better. Indeed, note that both norms in the right-hand side of estimate (6.12) below are free from any extra weight coming from the potential. This feature is achieved by imposing additional conditions (meant as "strict dissipation") on the restitution coefficient $e$ stated in the theorem. Such conditions are not stringent since they are satisfied by standard models such as viscoelastic spheres or constant restitution coefficient $e < 1$.

Theorem 12. Let $1 \le p, q, r \le \infty$ with $1/p + 1/q = 1 + 1/r$. Assume that $B(|u|, \hat{u}\cdot\omega) = |u|^{\lambda}\, b(\hat{u}\cdot\omega)$, with $0 \le \lambda \le 2$. Then, for a non-increasing restitution coefficient such that $e(z) < 1$ for $z \in (0, \infty)$,
$$\left\| Q^{+}(f,g)\, M_{a,\lambda}^{-1} \right\|_{L^{r}(\mathbb{R}^n)} \le C\, \left\| f M_{a,\lambda}^{-1} \right\|_{L^{p}(\mathbb{R}^n)} \left\| g M_{a,\lambda}^{-1} \right\|_{L^{q}(\mathbb{R}^n)}. \tag{6.12}$$
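The constant $C := C(n, \lambda, p, q, b, \beta)$ is computed below in the proof. In the important case $(p, q, r) = (\infty, 1, \infty)$ this constant reduces to
$$C = C(n,\lambda) \int_{-1}^{1} \left( \frac{1+s}{2} + (1-\beta(0))^2\, \frac{1-s}{2} \right)^{-n/2} b_{\beta}(s)\, ds, \tag{6.13}$$
where
$$b_{\beta}(s) := \left( 1 - \left( \frac{1 + |\vartheta_{\beta}(s)|}{2} \right)^{\lambda/2} \right)^{-1} b(s),$$
with
$$|\vartheta_{\beta}(s)| = \sqrt{(1-\beta(x))^2 + \beta^2(x) + 2(1-\beta(x))\beta(x)\, s}, \quad\text{for } x = \sqrt{\frac{1-s}{2}}.$$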
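Proof. Let us introduce the classical center of mass-relative velocity change of coordinates
$$U = \frac{v + v_*}{2} \quad\text{and}\quad \vartheta = (1-\beta)\hat{u} + \beta\omega.$$
One can readily verify the following standard identities
$$v' = U + \frac{|u|}{2}\,\vartheta \quad\text{and}\quad E := |v|^2 + |v_*|^2 = 2|U|^2 + \frac{|u|^2}{2}.$$
Therefore, a direct calculation shows that
$$|v'|^2 = E\, \frac{1 + \xi\, \hat{U}\cdot\vartheta}{2}, \quad\text{with } \xi := \frac{2|U||u|}{E} \le 1.$$
Thus, for any test function $\psi$ with unitary $L^{r'}$-norm,
$$I := \int_{\mathbb{R}^n} Q^{+}(f,g)(v)\, \left(M_{a,\lambda}^{-1}\psi\right)(v)\, dv = \int_{\mathbb{R}^n}\int_{\mathbb{R}^n}\int_{S^{n-1}} f(v)\, g(v-u)\, \left(M_{a,\lambda}^{-1}\psi\right)(v')\, |u|^{\lambda}\, b(\hat{u}\cdot\omega)\, d\omega\, du\, dv.$$
In order to estimate $I$, we split the integral into two regions of integration: $A = \{|u| \le 1\}$ and its complement. The integral in the region $A$ is estimated by
$$I_A \le \int_{\mathbb{R}^n}\int_{\mathbb{R}^n}\int_{S^{n-1}} \left(M_{a,\lambda}^{-1} f\right)(v)\, \left(M_{a,\lambda}^{-1} g\right)(v-u)\, \psi(v')\, b(\hat{u}\cdot\omega)\, d\omega\, du\, dv. \tag{6.14}$$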
Indeed, this estimate follows from local dissipation of energy estimates (notice here the condition $0 \le \lambda \le 2$ is used), namely
$$|v'|^{\lambda} \le \left( |v'|^2 + |v'_*|^2 \right)^{\lambda/2} \le E^{\lambda/2} \le |v|^{\lambda} + |v_*|^{\lambda} = |v|^{\lambda} + |v-u|^{\lambda}.$$
Hence, proceeding as in the proof of Young's type inequality in Sect. 4, it follows
$$I_A \le C_1\, \left\| f M_{a,\lambda}^{-1} \right\|_{L^{p}(\mathbb{R}^n)} \left\| g M_{a,\lambda}^{-1} \right\|_{L^{q}(\mathbb{R}^n)}, \tag{6.15}$$
where the constant $C_1$ is explicitly computed by formula (4.4).
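The second integral is slightly more involved. In order to control the integral $I_{A^c}$ we use the inelastic interactions law and conditions (2.4) to our advantage. First, observe that $\vartheta$ is a convex combination of two unitary vectors and since $\beta \ge 1/2$, the magnitude of $\vartheta$ increases as $\beta$ gets closer to 1. Moreover, $\beta$ is non-increasing, thus in $A^c = \{|u| \ge 1\}$ one has
$$\left| \vartheta_{\beta}\!\left( |u| \sqrt{\frac{1-\hat{u}\cdot\omega}{2}} \right) \right| \le \left| \vartheta_{\beta}\!\left( \sqrt{\frac{1-\hat{u}\cdot\omega}{2}} \right) \right| := |\vartheta_{\beta}(\hat{u}\cdot\omega)| = \sqrt{(1-\beta(x))^2 + \beta^2(x) + 2(1-\beta(x))\beta(x)\, \hat{u}\cdot\omega},$$
with $x = \sqrt{(1-\hat{u}\cdot\omega)/2}$. Next, note that by assumptions (2.4) on the restitution coefficient $e$, the magnitude $|\vartheta_{\beta}| < 1$ except for $\hat{u}\cdot\omega = 1$. Therefore, the magnitude of $v'$ is controlled by
$$|v'|^2 \le E\, \frac{1 + |\vartheta|}{2} \le E\, \frac{1 + |\vartheta_{\beta}(s)|}{2}.$$
Also note that $|u|^{\lambda} \le 2^{\lambda/2} E^{\lambda/2}$, and thus, we can estimate the integral $I_{A^c}$ as follows:
$$I_{A^c} \le 2^{\lambda/2} \int_{\mathbb{R}^{2n}}\int_{S^{n-1}} f(v)\, g(v-u)\, E^{\lambda/2} \exp\left( a E^{\lambda/2} \left( \frac{1+|\vartheta_{\beta}|}{2} \right)^{\lambda/2} \right) \psi(v')\, b(\hat{u}\cdot\omega)\, d\omega\, du\, dv.$$
Finally, noting that
$$E^{\lambda/2} \exp\left( a E^{\lambda/2} \left( \frac{1+|\vartheta_{\beta}|}{2} \right)^{\lambda/2} \right) = \exp\left( a E^{\lambda/2} \right) E^{\lambda/2} \exp\left( -a E^{\lambda/2} \left( 1 - \left( \frac{1+|\vartheta_{\beta}|}{2} \right)^{\lambda/2} \right) \right) \le a^{-1} \sup_{x \ge 0} \left\{ x e^{-x} \right\} \left( 1 - \left( \frac{1+|\vartheta_{\beta}|}{2} \right)^{\lambda/2} \right)^{-1} \exp\left( a E^{\lambda/2} \right),$$
one can estimate $I_{A^c}$ by
$$I_{A^c} \le 2^{\lambda/2} (a e)^{-1} \int_{\mathbb{R}^{2n}}\int_{S^{n-1}} \left(M_{a,\lambda}^{-1} f\right)(v)\, \left(M_{a,\lambda}^{-1} g\right)(v-u)\, \psi(v')\, b_{\beta}(\hat{u}\cdot\omega)\, d\omega\, du\, dv \le C_2\, 2^{\lambda/2} (a e)^{-1} \left\| f M_{a,\lambda}^{-1} \right\|_{L^{p}(\mathbb{R}^n)} \left\| g M_{a,\lambda}^{-1} \right\|_{L^{q}(\mathbb{R}^n)}, \tag{6.16}$$
for any $0 \le \lambda \le 2$, where we have defined
$$b_{\beta}(s) := \left( 1 - \left( \frac{1 + |\vartheta_{\beta}(s)|}{2} \right)^{\lambda/2} \right)^{-1} b(s).$$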
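The factor $(a e)^{-1}$ in (6.16) comes from the elementary bound $\sup_{x \ge 0} x e^{-x} = 1/e$. The following Python sketch checks that step numerically. The names `tail_factor` and `tail_factor_bound` are ours, `y` stands for $E^{\lambda/2}$, and `c` stands for $1 - ((1+|\vartheta_\beta|)/2)^{\lambda/2}$, assumed to lie in $(0, 1]$; it is only an illustration of the inequality, not part of the proof.

```python
import math


def tail_factor(y, a, c):
    """Value of y * exp(-a * c * y), the quantity bounded in the last step."""
    return y * math.exp(-a * c * y)


def tail_factor_bound(a, c):
    """Upper bound 1 / (a * e * c), obtained from sup_{x >= 0} x * exp(-x) = 1/e."""
    return 1.0 / (a * math.e * c)


if __name__ == "__main__":
    a, c = 2.0, 0.3  # illustrative values only
    worst = max(tail_factor(0.01 * k, a, c) for k in range(5000))
    print("largest sampled value:", worst)
    print("claimed upper bound  :", tail_factor_bound(a, c))
```

The supremum is attained at $y = 1/(a c)$, which is why the bound degenerates as `c` tends to 0, that is, as $\hat{u}\cdot\omega$ approaches 1.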
The constant $C_2$ is defined again by (4.4) with $b_{\beta}$ replacing $b$. We obtain the final constant $C$ by adding the constants obtained in (6.15) and (6.16), namely, $C = C_1 + C_2$. Note that $b_{\beta}$ has a singularity at $s = 1$, which in most cases of interest is at least of first order. In the case $(p, q, r) = (\infty, 1, \infty)$ the constant can be taken as shown in (6.13).

Remark. One wonders then whether estimate (6.12) is also valid in the elastic case, or perhaps even the also useful (but weaker) estimate
$$\left\| Q^{+}(f,g)\, M_{a,\lambda}^{-1} \right\|_{L^{r}(\mathbb{R}^n)} \le C\, \left\| f M_{a,\lambda}^{-1} \right\|_{L^{p}(\mathbb{R}^n)} \left\| g M_{a,\lambda}^{-1} \right\|_{L^{q}_{\lambda}(\mathbb{R}^n)}.$$
We leave this open question to the reader.

Acknowledgments. The authors thank Diogo Arsenio, William Beckner and Eric Carlen for very valuable discussions that have much improved this manuscript. This material is based upon work supported by the National Science Foundation under agreements No. DMS-0635607 (E. Carneiro), DMS-0636586 and DMS-0807712 (R. Alonso and I. M. Gamba). E. Carneiro would also like to acknowledge support from the CAPES/FULBRIGHT grant BEX 1710-04-4 and the Homer Lindsey Bruce Fellowship from the University of Texas. Support from the Institute of Computational Engineering and Sciences at the University of Texas at Austin is also gratefully acknowledged.
References 1. Alexandre, R., Desvillettes, L., Villani, C., Wennberg, B.: Entropy dissipation and long range interactions. Arch. Rat. Mech. Anal. 152(4), 327–355 (2000) 2. Alonso, R.: Existence of global solutions to the Cauchy problem for the inelastic Boltzmann equation with near-vacuum data. Indiana Univ. Math. J. 58(3), 999–1022 (2009) 3. Alonso, R., Carneiro, E.: Estimates for the Boltzmann collision operator via radial symmetry and Fourier transform. Adv. Math. 223(2), 511–528 (2010) 4. Alonso, R.J., Gamba, I.M.: L 1 − L ∞ -Maxwellian bounds for the derivatives of the solution of the homogeneous Boltzmann equation. J. Math. Pures Et Appl. (9) 89(6), 575–595 (2008) 5. Alonso, R.J., Gamba, I.M.: Distributional and classical solutions to the Cauchy-Boltzmann problem for soft potentials with integrable angular cross section. J. Stat. Phys. 137(5–6), 1147–1165 (2009) 6. Alonso, R.J., Lods, B.: Free cooling of granular gases with variable restitution coefficient. In preparation 7. Beckner, W.: Inequalities in Fourier analysis. Ann. Math. (2) 102(1), 159–182 (1975) 8. Benedetto, D., Pulvirenti, M.: On the one-dimensional Boltzmann equation for granular flow. Math. Model. Numer. Anal. 35(5), 899–905 (2001) 9. Bobylev, A.: The method of the Fourier transform in the theory of the Boltzmann equation for Maxwell molecules. Dokl. Akad. Nauk SSSR 225(6), 1041–1044 (1975) 10. Bobylev, A.: The theory of the nonlinear, spatially uniform Boltzmann equation for Maxwellian molecules. Sov. Sci. Rev. C. Math. Phys. 7, 111–233 (1988) 11. Bobylev, A.: Moment inequalities for the Boltzmann equation and applications to spatially homogeneous problems. J. Stat. Phys. 88(5–6), 1183–1214 (1997) 12. Bobylev, A.V., Carrillo, J.A., Gamba, I.M.: On some properties of kinetic and hydrodynamic equations for inelastic interactions. J. Stat. Phys. 98(3–4), 743–773 (2000) 13. Bobylev, A., Cercignani, C.: Self-similar asymptotics for the Boltzmann equation with inelastic and elastic interactions. 
J. Stat. Phys. 110, 333–375 (2003) 14. Bobylev, A., Cercignani, C., Gamba, I.M.: On the self-similar asymptotics for generalized non-linear kinetic Maxwell models. Commun. Math. Phys. 291, 599–644 (2009) 15. Bobylev, A., Cercignani, C., Toscani, G.: Proof of an asymptotic property of self-similar solutions of the Boltzmann equation for granular materials. J. Stat. Phys. 111, 403–417 (2003) 16. Bobylev, A., Gamba, I.M., Panferov, V.: Moment inequalities and high-energy tails for Boltzmann equations with inelastic interactions. J. Stat. Phys. 116, 1651–1682 (2004) 17. Brascamp, H.J., Lieb, E.H.: Best constants in Young’s inequality, its converse, and its generalization to more than three functions. Adv. Math. 20, 151–173 (1976) 18. Brascamp, H.J., Lieb, E., Luttinger, J.M.: A general rearrangement inequality for multiple integrals. J. Funct. Anal. 17, 227–237 (1974)
19. Brilliantov, N., Pöschel, T.: Kinetic theory of granular gases Oxford: Oxford Univ. Press, 2004 20. Cercignani, C., Illner, R., Pulvirenti, M.: The mathematical theory of dilute gases. Applied Mathematical Sciences, Vol. 106, New York: Springer-Verlag, 1994 21. Desvillettes, L.: About the use of the Fourier transform for the Boltzmann equation. Riv. Mat. Univ. Parma. 7, 1–99 (2003) 22. Duduchava, R., Kirsch, R., Rjasanow, S.: On estimates of the Boltzmann collision operator with cutoff. J. Math. Fluid Mech. 8(2), 242–266 (2006) 23. Gamba, I.M., Panferov, V., Villani, C.: Upper Maxwellian bounds for the spatially homogeneous Boltzmann equation. Arch. Rat. Mech. Anal 194, 253–282 (2009) 24. Gamba, I.M., Panferov, V., Villani, C.: On the Boltzmann equation for diffusively excited granular media. Commun. Math. Phys. 246(3), 503–541 (2004) 25. Gamba, I.M., Sri Harsha, Tharkabhushaman: Spectral-Lagrangian based methods applied to computation of non-equilibrium statistical states. J. Comput. Phys. 228, 2012–2036 (2009) 26. Gustafsson, T.: Global L p properties for the spatially homogeneous Boltzmann equation. Arch. Rat. Mech. Anal. 103, 1–38 (1988) 27. Kirsch, R., Rjasanow, S.: A weak formulation of the Boltzmann equation based on the Fourier Transform. J. Stat. Phys. 129, 483–492 (2007) 28. Lieb, E.H.: Sharp constants in the Hardy-Littlewood-Sobolev and related inequalities. Ann. of Math. (2) 118(2), 349–374 (1983) 29. Lieb, E.H., Loss, M.: Analysis. Graduate Studies in Mathematics, v. 14, Providence, RI: Amer. Math. Soc., 2001 30. Mischler, S., Mouhot, C., Ricard, M.R.: Cooling process for inelastic Boltzmann equations for hard spheres. Part I: The Cauchy problem. J. Stat. Phys. 124, 655–702 (2006) 31. Mouhot, C., Villani, C.: Regularity theory for the spatially homogeneous Boltzmann equation with cutoff. Arch. Rat. Mech. Anal. 173, 169–212 (2004) 32. Villani, C.: A review of mathematical topics in collisional kinetic theory. 
Handbook of mathematical fluid dynamics, Vol. I, Amsterdam: North-Holland, 2002, pp. 71–305 33. Zygmund, A.: Trigonometric Series, Vol. II, Cambridge University Press, 1959 Communicated by H.-T. Yau
Commun. Math. Phys. 298, 323–342 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1067-y
Communications in
Mathematical Physics
Brunet-Derrida Behavior of Branching-Selection Particle Systems on the Line Jean Bérard1 , Jean-Baptiste Gouéré2 1 Institut Camille Jordan - UMR CNRS 5208, Université Lyon 1, 43, Boulevard du 11 Novembre 1918,
Villeurbanne, F-69622, France. E-mail: [email protected]
2 Laboratoire MAPMO - UMR 6628, Université d’Orléans, B.P. 6759, 45067 Orléans Cedex 2, France.
E-mail: [email protected] Received: 28 April 2009 / Accepted: 19 March 2010 Published online: 16 June 2010 – © Springer-Verlag 2010
Abstract: We consider a class of branching-selection particle systems on $\mathbb{R}$ similar to the one considered by E. Brunet and B. Derrida in their 1997 paper "Shift in the velocity of a front due to a cutoff". Based on numerical simulations and heuristic arguments, Brunet and Derrida showed that, as the population size $N$ of the particle system goes to infinity, the asymptotic velocity of the system converges to a limiting value at the unexpectedly slow rate $(\log N)^{-2}$. In this paper, we give a rigorous mathematical proof of this fact, for the class of particle systems we consider. The proof makes use of ideas and results by R. Pemantle, and by N. Gantert, Y. Hu and Z. Shi, and relies on a comparison of the particle system with a family of $N$ independent branching random walks killed below a linear space-time barrier.
1. Introduction

1.1. Brunet-Derrida behavior. In [6,7], E. Brunet and B. Derrida studied, among other things, a discrete-time particle system on $\mathbb{Z}$, in which a population of particles with fixed size $N$ undergoes repeated steps of branching and selection. As time goes to infinity, the population of $N$ particles, taken as a whole, moves ballistically, with an asymptotic speed depending on the population size $N$. One remarkable property of this system is the following: as $N$ goes to infinity, the asymptotic speed of the population of particles converges to a limiting value, but at the unexpectedly slow rate of $(\log N)^{-2}$, bringing to light an unusually large finite-size effect. This behavior was, on the one hand, observed by Brunet and Derrida on direct numerical simulations of the particle system (with large numbers of particles, up to $N = 10^{16}$). On the other hand, Brunet and Derrida provided a justification for this behavior through the following argument. First, in the limit where $N$ goes to infinity, the time-evolution of the distribution of particles in the branching-selection system is governed by a deterministic equation, which can be viewed as a discrete version of the well-known F-KPP equation
$$\frac{\partial u}{\partial t} = \Delta u + u(1-u), \tag{1}$$
where $u = u(x,t)$, $x \in \mathbb{R}$, $t \ge 0$. To account for the fact that there is only a finite number $N$ of particles in the system instead of an infinite one (whence a resolution equal to $1/N$ for representing distributions of mass), one may introduce a cut-off value of $1/N$ in the equation, and expect that this modified equation still reflects at least some of the behavior of the original particle system. Whence the question of studying, for large $N$, an equation of the form:
$$\frac{\partial u}{\partial t} = \Delta u + u(1-u)\, \mathbf{1}(u \ge 1/N). \tag{2}$$
In fact, Brunet and Derrida could provide heuristic arguments for this new problem, showing that, for large $N$, the effect of the cut-off is to shift the speed of travelling wave solutions of Eq. (2) from the speed of those of Eq. (1), by an amount of order $(\log N)^{-2}$. In turn, these arguments were supported by numerical simulations of (discrete versions of) Eq. (2). This result concerning the F-KPP equation with cut-off has recently been given a rigorous mathematical proof, see [2,3,11]. A related question (see [8]) is that of the behavior of the F-KPP equation with small noise, i.e. of the equation
$$\frac{\partial u}{\partial t} = \Delta u + u(1-u) + \sqrt{\frac{u(1-u)}{N}}\, \dot{W}, \tag{3}$$
where $\dot{W}$ is a standard space-time white noise, and $N$ is large. Rigorous results have recently been derived for this model too, see [9,14,15], establishing that the speed of the random travelling wave solutions of Eq. (3) is, for large $N$, shifted from the speed of those of (1) by an amount of order $(\log N)^{-2}$. We thus have (at least) three examples of what may be called Brunet-Derrida behavior, in three different and more or less loosely related frameworks (branching-selection particle systems, F-KPP equation with cut-off, F-KPP equation with noise), two of which have already been established rigorously.
1.2. Main result. The goal of this paper is to give a proof of Brunet-Derrida behavior for a class of branching-selection systems that is similar (but not exactly identical) to the one originally studied by Brunet and Derrida in [6,7]. To be specific, we consider a discrete-time particle system with $N$ particles on $\mathbb{R}$ evolving through the repeated application of branching and selection steps defined as follows:
• Branching: each of the $N$ particles is replaced by two new particles, whose positions are shifted from that of the original particle by independently performing two random walk steps, according to a given distribution $p$;
• Selection: only the $N$ rightmost particles are kept among the $2N$ obtained at the branching step, to form the new population of $N$ particles.
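The two steps above translate directly into a simulation. The following Python sketch is only an illustration under our own naming (`branching_selection_step`, `estimate_speed` and `bernoulli_step` do not come from the paper), and it estimates the speed crudely by the position of the rightmost particle divided by the number of steps.

```python
import random


def branching_selection_step(population, sample_step, rng):
    """Apply one branching step followed by one selection step."""
    n = len(population)
    # Branching: every particle is replaced by two children, each shifted
    # by an independent random walk step drawn from the step distribution.
    children = [x + sample_step(rng) for x in population for _ in range(2)]
    # Selection: keep only the N rightmost of the 2N children.
    children.sort(reverse=True)
    return children[:n]


def estimate_speed(n_particles, n_steps, sample_step, seed=0):
    """Crude speed estimate max(X_n) / n, starting from N particles at 0."""
    rng = random.Random(seed)
    population = [0.0] * n_particles
    for _ in range(n_steps):
        population = branching_selection_step(population, sample_step, rng)
    return max(population) / n_steps


def bernoulli_step(alpha):
    """Step equal to 1 with probability alpha and to 0 otherwise."""
    return lambda rng: 1.0 if rng.random() < alpha else 0.0


if __name__ == "__main__":
    for n_particles in (10, 100, 1000):
        print(n_particles, estimate_speed(n_particles, 2000, bernoulli_step(0.25)))
```

Sorting costs $O(N \log N)$ per step, which is adequate for small experiments; the very large population sizes mentioned in the introduction would require a different representation of the population.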
Our assumptions on the random walk distribution $p$ are listed below, and come from the need to apply the result of the paper [13] by N. Gantert, Y. Hu and Z. Shi on the survival probability of the branching random walk killed below a linear space-time boundary, in the special case of deterministic binary branching. Introduce the logarithmic moment generating function of $p$ defined by
$$\Lambda(t) := \log \int \exp(tx)\, dp(x).$$
Here are the assumptions on $p$:
(A1) The number $\sigma := \sup\{t \ge 0;\ \Lambda(-t) < +\infty\}$ is $> 0$.
(A2) The number $\zeta := \sup\{t \ge 0;\ \Lambda(t) < +\infty\}$ is $> 0$.
(A3) There exists $t^* \in\, ]0, \zeta[$ such that $t^* \Lambda'(t^*) - \Lambda(t^*) = \log 2$.

Under these assumptions, both numbers
$$\chi(p) := \frac{\pi^2}{2}\, t^* \Lambda''(t^*), \qquad v(p) := \Lambda'(t^*)$$
are well-defined, and satisfy $0 < \chi(p) < +\infty$ and $v(p) \in \mathbb{R}$. Simple cases for which these assumptions hold are e.g. the Bernoulli case for $\alpha \in\, ]0, 1/2[$, where $p = \alpha\delta_1 + (1-\alpha)\delta_0$, the uniform case, where $p$ is the uniform distribution on the interval $[0, 1]$, and the Gaussian case, where $p$ is the standard Gaussian distribution on $\mathbb{R}$.

In Sect. 3 below, it is proved that, after a large number of iterated branching-selection steps, the displacement of the whole population of $N$ particles is ballistic, with deterministic asymptotic speed $v_N(p)$, and that, as $N$ goes to infinity, $v_N(p)$ increases to a limit $v_\infty(p)$, which turns out to be equal to the $v(p)$ defined above, and is thus finite under our assumptions. The main result concerning the branching-selection particle system is the following theorem:

Theorem 1. Assume that (A1)-(A2)-(A3) hold. Then, as $N$ goes to infinity,
$$v_\infty(p) - v_N(p) \sim \chi(p)\, (\log N)^{-2}. \tag{4}$$
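For the Bernoulli case, the constants in Theorem 1 can be computed numerically. The sketch below assumes $p = \alpha\delta_1 + (1-\alpha)\delta_0$ with $0 < \alpha < 1/2$, for which $\Lambda(t) = \log(1 - \alpha + \alpha e^{t})$, and solves (A3) by bisection. The function names and the bisection scheme are our own choices, and `predicted_shift` only evaluates the leading-order term $\chi(p)(\log N)^{-2}$ from (4), not the actual difference $v_\infty(p) - v_N(p)$.

```python
import math


def bernoulli_constants(alpha, tol=1e-12):
    """Return (t_star, v, chi) for p = alpha*delta_1 + (1-alpha)*delta_0."""
    if not 0.0 < alpha < 0.5:
        raise ValueError("this sketch assumes 0 < alpha < 1/2")

    def log_mgf(t):
        # Lambda(t) = log E[exp(t X)] for a Bernoulli(alpha) step X.
        return math.log(1.0 - alpha + alpha * math.exp(t))

    def d_log_mgf(t):
        # Lambda'(t), the mean of the exponentially tilted step.
        w = alpha * math.exp(t)
        return w / (1.0 - alpha + w)

    def g(t):
        # (A3) asks for a root of t*Lambda'(t) - Lambda(t) - log 2.
        return t * d_log_mgf(t) - log_mgf(t) - math.log(2.0)

    # g(0) < 0 and g is increasing on t > 0, so bracket the root and bisect.
    lo, hi = 0.0, 1.0
    while g(hi) < 0.0:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    t_star = 0.5 * (lo + hi)
    q = d_log_mgf(t_star)
    v = q                                              # v(p) = Lambda'(t*)
    chi = 0.5 * math.pi ** 2 * t_star * q * (1.0 - q)  # Lambda''(t*) = q(1-q)
    return t_star, v, chi


def predicted_shift(chi, n_particles):
    """Leading-order term chi * (log N)^(-2) appearing in Theorem 1."""
    return chi / math.log(n_particles) ** 2


if __name__ == "__main__":
    t_star, v, chi = bernoulli_constants(0.25)
    print("t* =", t_star, " v(p) =", v, " chi(p) =", chi)
    for n_particles in (10 ** 2, 10 ** 4, 10 ** 8, 10 ** 16):
        print(n_particles, predicted_shift(chi, n_particles))
```

The bracketing loop relies on the limit of $t\Lambda'(t) - \Lambda(t)$ as $t \to \infty$ being $-\log\alpha$, which exceeds $\log 2$ exactly when $\alpha < 1/2$; for $\alpha$ extremely close to $1/2$ the bracket search can lose precision, so the sketch is best used away from that boundary.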
1.3. Credits. The proof of Theorem 1 given in this paper is based on a comparison of the particle system with a family of N independent branching random walks killed below a linear space-time barrier, and makes use in a crucial way of ideas and results from the following two sources: the paper [16] by R. Pemantle on complexity bounds for algorithms seeking near optimal paths in branching random walks, and the paper [13] by N. Gantert, Y. Hu and Z. Shi on the survival probability of the branching random-walk killed below a linear space-time boundary. A detailed description of exactly which ideas and results are used and how is given in Sects. 4, 5 and 6 below. Note that the existence of a link between the Brunet-Derrida behavior of a branching-selection particle system such as the one studied here, and the asymptotics of the survival probability for branching random walks killed below a linear space-time barrier, was already suggested in the papers [10,17] by B. Derrida and D. Simon, where Brunet-Derrida-like features were observed for a quasi-stationary regime of killed branching random walks; the present paper gives an explicit and rigorous version of such a relation. Finally, let us mention that a first version [4] of the present work was completed by one of the authors (J.B.) before the results in [13] became publicly available. In [4], only the (log N )−2 order of magnitude of the difference v∞ ( p) − v N ( p) in the Bernoulli case was established. The results in [13] then allowed us to prove Theorem 1, which is both more precise and more general.
1.4. Organization of the paper. The rest of the paper is organized as follows. In Sect. 2, we provide the precise notations and definitions that are needed in the sequel. Section 3 contains a discussion of various elementary properties of the model we consider. Section 4 collects the results from [13] that are used in the sequel. Section 5 contains the proof of the lower bound part of Theorem 1, while Sect. 6 contains the proof of the upper bound part. Section 7 discusses the Bernoulli($\alpha$) case for $\alpha \ge 1/2$, showing that the conclusion of Theorem 1 may fail to hold when Assumption (A3) is not met. Section 8 is an attempt to provide a self-contained explanation of the $(\log N)^{-2}$ order of magnitude appearing in Theorem 1. The arguments in this section are only discussed in an informal way.

2. Notations and Definitions

2.1. Particle systems on $\mathbb{R}$. It is convenient to represent finite populations of particles by finite counting measures on $\mathbb{R}$. We use the notation $\mathcal{C}$ to represent the set of all finite counting measures on $\mathbb{R}$. For $\nu \in \mathcal{C}$, the total mass of $\nu$ (i.e. the number of particles in the population it describes) is denoted by $M(\nu)$. We denote by $\max \nu$ and $\min \nu$ respectively the maximum and minimum of the (finite) support of $\nu$. We also define the diameter $d(\nu) := \max \nu - \min \nu$. Given $\mu, \nu \in \mathcal{C}$, we use the notation $\prec$ to denote the usual stochastic ordering: $\mu \prec \nu$ if and only if $\mu([x, +\infty[) \le \nu([x, +\infty[)$ for all $x \in \mathbb{R}$. In particular, $\mu \prec \nu$ implies that $M(\mu) \le M(\nu)$, and it is easily seen that, if $\mu = \sum_{i=1}^{M(\mu)} \delta_{x_i}$ and $\nu = \sum_{i=1}^{M(\nu)} \delta_{y_i}$, with $x_1 \ge \cdots \ge x_{M(\mu)}$ and $y_1 \ge \cdots \ge y_{M(\nu)}$, $\mu \prec \nu$ is equivalent to $M(\mu) \le M(\nu)$ and $x_i \le y_i$ for all $i \in [\![1, M(\mu)]\!]$. For all $N \ge 1$, let $\mathcal{C}_N$ denote the set of finite counting measures on $\mathbb{R}$ with total mass equal to $N$. In the sequel, we use the notation $(X_n^N)_{n \ge 0}$ to denote a Markov chain on $\mathcal{C}_N$ whose transition probabilities are given by the branching-selection mechanism with $N$ particles defined in Sect. 1.2, and which starts at a deterministic value $X_0^N \in \mathcal{C}_N$. We assume that this Markov chain is defined on a reference probability space denoted by $(\Omega, \mathcal{F}, \mathbb{P})$.

2.2. Branching random walks. In the sequel, we use the notation BRW to denote a generic branching random walk on a regular rooted binary tree, with value zero at the root, and i.i.d. displacements with common distribution $p$ along each edge. More formally, BRW consists of a pair $(T, \Phi)$, where $T$ is a regular rooted binary tree, and $\Phi$ is a random map, associating to each vertex $u \in T$ a random variable $\Phi(u) \in \mathbb{R}$ in such a way that $\Phi(\mathrm{root}) = 0$ and that the collection $(\Phi(v) - \Phi(u))_{(u,v)}$ is i.i.d. with common distribution $p$, where $(u, v)$ runs over the set of pairs of vertices of $T$ such that $u$ is the father of $v$. We say that $\Phi(u)$ is the value of the branching random walk at vertex $u$. The probability measure governing BRW is denoted by $\mathbb{Q}$. Given $m \ge 1$, we say that a sequence $u_0, \ldots, u_m$ of vertices in $T$ is a descending path if, for all $i \in [\![1, m]\!]$, $u_{i-1}$ is the father of $u_i$. The set of vertices of $T$ located at depth $m$ is denoted by $T(m)$.

3. Elementary Properties of the Model

As a first quite elementary property, note that, from Assumptions (A1) and (A2), $\mathbb{E}(\max X_n^N)$ and $\mathbb{E}(\min X_n^N)$ are finite for all $n \ge 0$, for any choice of the (deterministic) initial condition $X_0^N \in \mathcal{C}_N$.
3.1. Estimates on the diameter.

Proposition 1. Let $u_N := \lceil \log N / \log 2 \rceil + 1$. For all $N \ge 1$, all initial population $X_0^N \in \mathcal{C}_N$, and all $n \ge u_N$, $d(X_n^N)$ is stochastically dominated by $u_N \times (m_N^{(2)} - m_N^{(1)})$, where $m_N^{(2)}$ and $m_N^{(1)}$ are respectively the maximum and the minimum of a family of $2N u_N$ i.i.d. random variables with common distribution $p$.

Proof. Consider $n \ge u_N$, $y = \max X_{n-u_N}^N$, and let us study the evolution of the branching-selection system between times $n - u_N$ and $n$. Define $m_N^{(1)}$ as the minimum of the $2N u_N$ random walk steps performed by the system between times $n - u_N$ and $n$. Consider first the possibility that $\min X_k^N < y + (k - n + u_N)\, m_N^{(1)}$ for all $k \in [\![n + 1 - u_N, n]\!]$. Since all the random walk steps that are performed during branching steps are $\ge m_N^{(1)}$, this implies that all the particles descended by branching from a particle located at $y$ at time $n - u_N$ are preserved by the successive selection steps performed from $X_{n-u_N}^N$ to $X_n^N$. Since there are at least $2^{u_N} > N$ such particles at time $n$, this is a contradiction. As a consequence, we know that there must be an index $k \in [\![n + 1 - u_N, n]\!]$ such that $\min X_k^N \ge y + (k - n + u_N)\, m_N^{(1)}$. Again by the fact that random walk steps are $\ge m_N^{(1)}$, $t \mapsto \min X_t^N - (t - n + u_N)\, m_N^{(1)}$ is non-decreasing on the interval $[\![n + 1 - u_N, n]\!]$, so we deduce that $\min X_n^N \ge y + u_N m_N^{(1)}$. Now, let $m_N^{(2)}$ denote the maximum of the $2N u_N$ random walk steps that are performed at the branching steps between time $n - u_N$ and time $n$. We see from the definition of $y$ that $\max X_n^N \le y + u_N m_N^{(2)}$. We have also just seen that $\min X_n^N \ge y + u_N m_N^{(1)}$, so that $d(X_n^N) = \max X_n^N - \min X_n^N \le u_N (m_N^{(2)} - m_N^{(1)})$.

The following corollary is then a rather straightforward consequence, in view of Assumptions (A1) and (A2).

Corollary 1. For all $N \ge 1$ and any initial population $X_0^N \in \mathcal{C}_N$, $\lim_{n \to +\infty} n^{-1} d(X_n^N) = 0$, both with probability one and in $L^1(\mathbb{P})$.

Proof. Using the notations of Proposition 1, let $F_N := m_N^{(2)} - m_N^{(1)}$. From Assumptions (A1) and (A2), one deduces that $\mathbb{E}(F_N) < +\infty$. Then, by Proposition 1, one has that, for all $n \ge u_N$, $\mathbb{E}(n^{-1} d(X_n^N)) \le \mathbb{E}(F_N)/n$, so that convergence to 0 in $L^1(\mathbb{P})$ is proved. Moreover, for any $\iota > 0$, Proposition 1 yields that $\sum_{n \ge u_N} \mathbb{P}(n^{-1} d(X_n^N) \ge \iota) \le \sum_{n \ge u_N} \mathbb{P}(F_N \ge \iota n) \le \mathbb{E}(F_N)/\iota$, so that convergence to 0 $\mathbb{P}$-a.s. follows from the Borel-Cantelli lemma, since $\iota$ can be taken arbitrarily small.

3.2. Monotonicity properties. The following lemma states a key monotonicity property of our branching-selection mechanism.

Lemma 1. For all $1 \le N_1 \le N_2$, and $\mu_1 \in \mathcal{C}_{N_1}$, $\mu_2 \in \mathcal{C}_{N_2}$ such that $\mu_1 \prec \mu_2$, there exists a pair of random variables $(Z_1, Z_2)$ taking values in $\mathcal{C}_{N_1} \times \mathcal{C}_{N_2}$, such that:
• the distribution of $Z_i$ for $i = 1, 2$ is that of the population of particles obtained by performing one branching-selection step (with $N_i$ particles) starting from the population $\mu_i$;
• with probability one, $Z_1 \prec Z_2$.
Proof. Consider an i.i.d. family $(\varepsilon_{i,j})_{i \in [\![1, N_2]\!],\, j = 1,2}$ with common distribution $p$. For $k = 1, 2$, write $\mu_k = \sum_{i=1}^{N_k} \delta_{x_i(k)}$, with $x_1(k) \ge \cdots \ge x_{N_k}(k)$. Then let $T_k := \sum_{i=1}^{N_k} \sum_{j=1,2} \delta_{x_i(k) + \varepsilon_{i,j}}$, and define $Z_k$ as being formed by the $N_k$ rightmost particles in $T_k$. From the assumption that $\mu_1 \prec \mu_2$, we deduce that $x_i(1) \le x_i(2)$ for all $i \in [\![1, N_1]\!]$, whence the fact that $x_i(1) + \varepsilon_{i,j} \le x_i(2) + \varepsilon_{i,j}$, for all $1 \le i \le N_1$ and $j = 1, 2$. It is easy to deduce that $T_1 \prec T_2$, whence $Z_1 \prec Z_2$. The conclusion follows.

An immediate corollary is the following.

Corollary 2. For all $1 \le N_1 \le N_2$, and $\mu_1 \in \mathcal{C}_{N_1}$, $\mu_2 \in \mathcal{C}_{N_2}$ such that $\mu_1 \prec \mu_2$, there exists a coupling $(Z_n^1, Z_n^2)_{n \ge 0}$ between two versions of the branching-selection particle system, with $N_1$ and $N_2$ particles respectively, such that $Z_0^1 := \mu_1$, $Z_0^2 := \mu_2$, and $Z_n^1 \prec Z_n^2$ for all $n \ge 0$.

Proposition 2. There exists $v_N(p) \in \mathbb{R}$ such that, with probability one, and in $L^1(\mathbb{P})$, for any initial population $X_0^N \in \mathcal{C}_N$,
$$\lim_{n \to +\infty} n^{-1} \min X_n^N = \lim_{n \to +\infty} n^{-1} \max X_n^N = v_N(p).$$

Proof. Note that, in view of Corollary 1, if either of the two limits in the above statement exists, then the other must exist too and have the same value. Moreover, owing to the translation invariance of our particle system (the dynamics is invariant with respect to shifting all the particles by a translation on $\mathbb{R}$), and to Corollary 2, we see that it is enough to prove the result when $X_0^N = N\delta_0$. The idea of the proof is to invoke Kingman's subadditive ergodic theorem (see e.g. [12]), using the monotonicity property described by Lemma 1. Consider an i.i.d. family $(\varepsilon_{\ell,i,j})_{\ell \ge 0,\, i \in [\![1, N]\!],\, j = 1,2}$ with common distribution $p$ (the index $\ell$ will be used to shift the origin of time when applying Kingman's theorem). For all $\ell \ge 0$, denote by $(W_{\ell,k}^N)_{k \ge 0}$ the branching-selection system starting at $W_{\ell,0}^N := N\delta_0$ and governed by the following steps. For $k \ge 0$, write $W_{\ell,k}^N = \sum_{i=1}^{N} \delta_{x_i}$, with $x_1 \ge \cdots \ge x_N$. The population $T_{\ell,k}$ derived from $W_{\ell,k}^N$ by branching is then defined by $T_{\ell,k} := \sum_{(i,j) \in [\![1,N]\!] \times \{1,2\}} \delta_{x_i + \varepsilon_{\ell+k,i,j}}$. Then, $W_{\ell,k+1}^N$ is obtained from $T_{\ell,k}$ by keeping only the $N$ rightmost particles. Observe that $(W_{0,n}^N)_{n \ge 0}$ has the same distribution as $(X_n^N)_{n \ge 0}$. Moreover, the argument used in the proof of Lemma 1 shows that
$$\text{for all } n, m \ge 0, \quad \max W_{0,n+m}^N \le \max W_{0,n}^N + \max W_{n,m}^N. \tag{5}$$
Indeed, it is enough to note that (5) compares the maximum of two populations obtained by performing $m$ branching-selection steps coupled as in Lemma 1, starting respectively from $W_{0,n}^N$ (for the l.h.s.) and from $N\delta_{\max W_{0,n}^N}$ (for the r.h.s.). Moreover, it is easily seen from the definition that, for each $d \ge 1$, the random variables $(W_{dn,d}^N)_{n \ge 0}$ form an i.i.d. family, and that the distribution of $(W_{\ell,k}^N)_{k \ge 0}$ clearly does not depend on $\ell$. One can then check from the definition that the following inequality holds: $\max W_{0,n}^N \le \sum_{k=0}^{n-1} \sum_{(i,j) \in [\![1,N]\!] \times \{1,2\}} |\varepsilon_{k,i,j}|$. Using assumptions (A1) and (A2), it is then quite clear that there exists $\psi > -\infty$ such that, for all $n \ge 0$, $\psi n \le \mathbb{E}(\max W_{0,n}^N) < +\infty$. We conclude that the hypotheses of Kingman's subadditive ergodic theorem hold (see e.g. [12]), and deduce that $\lim_{n \to +\infty} n^{-1} \max X_n^N$ exists both a.s. and in $L^1(\mathbb{P})$, and is constant.
Proposition 3. The sequence $(v_N(p))_{N \ge 1}$ is non-decreasing.

Proof. Consequence of the fact that, when $N_1 \le N_2$, $N_1\delta_0 \prec N_2\delta_0$, and of the monotonic coupling property given in Corollary 2.

We can deduce from the above proposition that there exists $v_\infty(p)$ such that $\lim_{N \to +\infty} v_N(p) = v_\infty(p)$. A consequence of the proof of Theorem 1 below is that $v_\infty(p)$ is in fact equal to the number $v(p) := \Lambda'(t^*)$, which is finite from our assumptions on $p$.

3.3. Coupling with a family of N branching random walks. Let $(\mathrm{BRW}_i)_{i \in [\![1, N]\!]}$ denote $N$ independent copies of a branching random walk BRW as defined in Sect. 2. Each $\mathrm{BRW}_i$ thus consists of a binary tree $T_i$ and a map $\Phi_i$. For $1 \le i \le N$, and $n \ge 0$, remember that $T_i(n)$ denotes the set of vertices of $T_i$ located at depth $n$, and define the disjoint union $T_n^N := T_1(n) \sqcup \cdots \sqcup T_N(n)$. For every $n$, fix an a priori (i.e. depending only on the tree structure, not on the random walk values) total order on $T_n^N$. We now define by induction a sequence $(G_n^N)_{n \ge 0}$ such that, for each $n \ge 0$, $G_n^N$ is a random subset of $T_n^N$ with exactly $N$ elements. First, set $G_0^N := T_0^N$. Then, given $n \ge 0$ and $G_n^N$, let $H_n^N$ denote the subset of $T_{n+1}^N$ formed by the children (each with respect to the tree structure it belongs to) of the vertices in $G_n^N$. Then, define $G_{n+1}^N$ as the subset of $H_n^N$ formed by the $N$ vertices that are associated with the largest values of the underlying random walks $\Phi_i$ (breaking ties by using the a priori order on $T_n^N$). Now let $\mathfrak{X}_n^N$ denote the (random) empirical distribution describing the values taken by the $\Phi_i$ on the (random) set of vertices $G_n^N$. The sequence $(\mathfrak{X}_n^N)_{n \ge 0}$ has the same distribution as $(X_n^N)_{n \ge 0}$, when started from $X_0^N := N\delta_0$. Thus, we can take for our reference probability space $(\Omega, \mathcal{F}, \mathbb{P})$ the one on which $\mathrm{BRW}_1, \ldots, \mathrm{BRW}_N$ are defined, and let $X_n^N$ be equal to the empirical distribution associated with the subset $G_n^N$, and so obtain a coupling between $(X_n^N)_{n \ge 0}$ (with $X_0^N = N\delta_0$) and the $N$ branching random walks $\mathrm{BRW}_1, \ldots, \mathrm{BRW}_N$.

4. Results on the Branching Random Walk Killed Below a Linear Space-Time Barrier

Let us start with the following definition, adapted from [16]. Given $v \in \mathbb{R}$ and $m \ge 1$, we say that a vertex $u \in \mathrm{BRW}$ is $(m, v)$-good if there exists a finite descending path $u =: u_0, u_1, \ldots, u_m$ such that $\Phi(u_i) - \Phi(u_0) \ge v i$ for all $i \in [\![0, m]\!]$. Similarly, we say that $u$ is $(\infty, v)$-good if there exists an infinite descending path $u =: u_0, u_1, \ldots$ such that $\Phi(u_i) - \Phi(u_0) \ge v i$ for all $i \in [\![0, +\infty[\![$. With this terminology, the main result in [13] can be stated as follows, remembering that $v(p) = \Lambda'(t^*)$ and $\chi(p) = \frac{\pi^2}{2}\, t^* \Lambda''(t^*)$.

Theorem 2 (Theorem 1.2 in [13]). Let $\rho(\infty, \varepsilon)$ denote the probability that the root of BRW is $(\infty, v(p) - \varepsilon)$-good. Then, as $\varepsilon$ goes to zero,
$$\rho(\infty, \varepsilon) = \exp\left( -\left( \frac{\chi(p) + o(1)}{\varepsilon} \right)^{1/2} \right).$$

We shall need a result which, although not stated explicitly in [13], appears there as an intermediate step in a proof.
Theorem 3 (Proof of the upper bound part of Theorem 1.2 in [13]). Let ρ(m, ) denote the probability that the root of BRW is (m, v( p) − )−good. For any 0 < β < χ ( p), there exists θ > 0 such that, for all large m, χ ( p) − β 1/2 ρ(m, ) ≤ exp − , with := θ/m 2/3 . One should also consult the papers [10,17] for an approach of these results based on (mathematically non-rigorous) theoretical physics arguments. See also the discussion in Sect. 8. 5. The Lower Bound The arguments used here in the proof of the lower bound, combine ideas from the paper [16] by Pemantle, which deals with the closely related question of obtaining complexity bounds for algorithms that seek near optimal paths in branching random walks, and the estimate on ρ(m, ) from the paper [13], by Gantert, Hu and Shi. In fact, the proof given below is basically a rewriting of the proof of the lower complexity bound in [16] in the special case of algorithms that do not jump, with the following slight differences: we are dealing with N independent branching random walks being explored in parallel, rather than with a single branching random walk; we consider possibly unbounded random walk steps; we use the estimate in [13] instead of the cruder one derived in [16]. We start with an elementary result adapted from [16]. Lemma 2 (Adapted from Lemma 5.2 in [16]). Let v1 , v2 ∈ R be such that v1 < v2 , n ≥ 1, m ∈ [[1, n]], K > 0, and let 0 =: x0 , . . . , xn be a sequence of real numbers such that xi+1 − xi ≤ K for all i ∈ [[0, n − 1]]. Let I := {i ∈ [[0, n − m]]; x j − xi ≥ −v1 n − K /(K − v1 ). v1 ( j − i) for all j ∈ [[i, i + m]]}. If xn ≥ v2 n, then #I ≥ vK2−v 1 m Since Lemma 2 admits so short a proof, we give it below for the sake of completeness, even though it is quite similar to that in [16]. Proof of Lemma 2 (Adapted from [16]). Consider a sequence 0 =: x0 , . . . , xn as in the statement of the lemma. 
Let then $\tau_0 := 0$, and, given $\tau_i \le n$, define inductively $\tau_{i+1} := \inf\{ j \in [\![\tau_i + 1, n]\!];\ x_j < x_{\tau_i} + v_1 (j - \tau_i) \text{ or } j = \tau_i + m \}$, with the convention that $\inf \emptyset = n + 1$. Now "color" the integers $k \in [\![0, n-1]\!]$ according to the following rules: if $x_{\tau_{i+1}} \ge x_{\tau_i} + v_1 (\tau_{i+1} - \tau_i)$ and $\tau_{i+1} \le n$, then $\tau_i, \ldots, \tau_{i+1} - 1$ are colored red. Note that this yields a segment of $m$ consecutive red terms, and that $\tau_i$ then belongs to $I$. Then color in blue the remaining integers in $[\![0, n-1]\!]$. Let $V_{red}$ (resp. $V_{blue}$) denote the number of red (resp. blue) terms in $[\![0, n-1]\!]$. Then decompose the value of $x_n$ into the contributions of the steps $x_{k+1} - x_k$ such that $k$ is red, and such that $k$ is blue, respectively. On the one hand, the contribution of red terms is $\le K V_{red}$. On the other hand, the contribution of blue terms is $\le V_{blue} \times v_1 + Km$, where the term $Km$ is added to take into account a possible last segment colored in blue only because it has reached the index $n$. Writing that $n = V_{red} + V_{blue}$, we deduce that $v_2 n \le K V_{red} + v_1 (n - V_{red}) + Km$, so that
$$V_{red} \ge \frac{v_2 - v_1}{K - v_1}\, n - \frac{Km}{K - v_1}.$$
Then use the fact that at least $V_{red}/m$ terms belong to $I$.

In [16], the result corresponding to our Lemma 2, and an estimate of the type given by Theorem 3, are used in combination with an elaborate second moment argument. In the present context, the following first moment argument turns out to be sufficient.
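The counting statement of Lemma 2 can be checked numerically on small examples. The following is a minimal sketch, not taken from [16] or from the text: the function names `good_indices`, `lemma2_lower_bound` and `random_walk` are hypothetical, the walks use integer steps in {0, 1}, and the parameters are chosen with $0 \le v_1 < v_2 \le K$. Exact rational arithmetic avoids floating-point ties in the comparisons.

```python
from fractions import Fraction
import random


def good_indices(x, v1, m):
    """Return the set I of Lemma 2 for the sequence x = [x_0, ..., x_n]:
    indices i in [0, n-m] with x_j - x_i >= v1*(j-i) for all j in [i, i+m]."""
    n = len(x) - 1
    return [
        i
        for i in range(n - m + 1)
        if all(x[j] - x[i] >= v1 * (j - i) for j in range(i, i + m + 1))
    ]


def lemma2_lower_bound(n, m, K, v1, v2):
    """Right-hand side of Lemma 2: (v2-v1)/(K-v1) * n/m - K/(K-v1)."""
    return (v2 - v1) / (K - v1) * Fraction(n, m) - K / (K - v1)


def random_walk(n, rng):
    """Illustrative walk with steps in {0, 1}, so every step is <= K = 1."""
    x = [0]
    for _ in range(n):
        x.append(x[-1] + rng.choice((0, 1)))
    return x


if __name__ == "__main__":
    K, v1, v2 = Fraction(1), Fraction(1, 5), Fraction(2, 5)
    n, m = 60, 5
    rng = random.Random(0)
    for _ in range(5):
        x = random_walk(n, rng)
        if x[-1] >= v2 * n:  # hypothesis x_n >= v2 * n of the lemma
            print(len(good_indices(x, v1, m)), ">=", lemma2_lower_bound(n, m, K, v1, v2))
```

Only walks that satisfy the hypothesis $x_n \ge v_2 n$ are compared with the bound; the others are skipped, as the lemma says nothing about them.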
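Proof of the lower bound part of Theorem 1. Assume that $X_0^N = N\delta_0$. Let $\beta > 0$ and $\theta > 0$ be as in Theorem 3. Then let $\lambda > 0$, and define
$$m := \left\lceil \theta^{3/2} \left( \frac{(1+\lambda)\log N}{(\chi(p) - \beta)^{1/2}} \right)^{3} \right\rceil \quad \text{and} \quad \varepsilon := \theta/m^{2/3},$$
so that, by Theorem 3,
$$\rho(m, \varepsilon) \le N^{-(1+\lambda)} \text{ for all large } N. \tag{6}$$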
Then let $0 < \gamma < 1$ and define $v_2 := v(p) - (1-\gamma)\varepsilon$ and $v_1 := v(p) - \varepsilon$. Let also $n := \lceil N^{\xi} \rceil$ for some $0 < \xi < \lambda$. Now consider $\kappa > 0$, and let $K := \kappa \log(2Nn)$. Consider the maximum of the random walk steps performed during the branching steps of $(X_k^N)_{k \ge 0}$ between time 0 and time $n$. There are $2Nn$ such steps, so that, by assumption (A2), there exists a value of $\kappa$ such that the probability that this maximum is larger than or equal to $K$ is less than $(2Nn)^{-2008}$ for all large enough $N$. Now denote by $B_n$ the number of vertices in $G_0^N \cup \cdots \cup G_n^N$ (see Sect. 3.3) that are $(m, v(p) - \varepsilon)$-good (each with respect to the $\mathrm{BRW}_i$ it belongs to). Observe that, with our definitions, for $N$ large enough,
$$\frac{v_2 - v_1}{K - v_1}\,\frac{n}{m} - \frac{K}{K - v_1} > 0.$$
As a consequence, using Lemma 2, we see that, for $N$ large enough, the event $\max X_n^N \ge v_2 n$ implies that either there exists a random walk step between time 0 and $n$ which is $\ge K$, or $B_n \ge 1$. Using the union bound and the above estimate, we deduce that
$$P\left(\max X_n^N \ge v_2 n\right) \le (2Nn)^{-2008} + P(B_n \ge 1). \tag{7}$$
On the other hand, $B_n$ can be written as
$$B_n := \sum_{u \in T_1 \cup \cdots \cup T_N} \mathbf{1}(u \text{ is } (m, v(p) - \varepsilon)\text{-good})\, \mathbf{1}(u \in G_0^N \cup \cdots \cup G_n^N). \tag{8}$$
Now observe that, by definition, for a vertex $u$ at depth $\ell$, the event $u \in G_0^N \cup \cdots \cup G_n^N$ is measurable with respect to the random walk increments performed at depth at most $\ell$, that is, the family of random variables $\Phi_i(w) - \Phi_i(v)$, where $i \in [\![1, N]\!]$, $v, w \in T_i$, $w$ is a child of $v$ (with respect to the tree structure of $T_i$), and $w, v$ are both located at a depth $\le \ell$ in $T_i$. On the other hand, the event that $u$ is $(m, v(p) - \varepsilon)$-good is measurable with respect to the random walk increments performed at depth at least $\ell$, that is, the family of random variables $\Phi_i(w) - \Phi_i(v)$, where $i \in [\![1, N]\!]$, $v, w \in T_i$, $w$ is a child of $v$, and $w, v$ are both located at a depth $\ge \ell$ in $T_i$. As a consequence, the two events $\{u \in G_0^N \cup \cdots \cup G_n^N\}$ and $\{u \text{ is } (m, v(p) - \varepsilon)\text{-good}\}$ are independent. Since the total number of vertices in $G_0^N \cup \cdots \cup G_n^N$ is equal to $N(n+1)$, we deduce from (6) and (8) that $E(B_n) \le N(n+1)N^{-(1+\lambda)}$. Using Markov's inequality, we finally deduce from (7) that
$$P(\max X_n^N \ge v_2 n) \le (2Nn)^{-2008} + (n+1)N^{-\lambda}. \tag{9}$$
Now start with the obvious inequality, valid for all $t$,
$$\exp(t \max X_n^N) \le \sum_{i=1}^{N} \sum_{u \in T_i(n)} \exp(t\, \Phi_i(u)).$$
Taking expectations, we deduce that $E(\exp(t \max X_n^N)) \le N 2^n \exp(n\Lambda(t))$. Using the definition of $t^*$ and $v(p)$, we then obtain that
$$E\left(\exp\left(t^* (\max X_n^N - v(p) n)\right)\right) \le N. \tag{10}$$
Using (10), we deduce that$^1$, for all $b > 0$, and all large enough $n$,
$$E\left[\max X_n^N\, \mathbf{1}(\max X_n^N \ge (v(p) + b) n)\right] \le N \exp\left(-\tfrac{2007}{2008}\, t^* b n\right)(1 + |v(p)| n). \tag{11}$$
Now observe that, by definition, $E(n^{-1} \max X_n^N)$ is bounded above by
$$v_2 + (v(p) + b)\, P(\max X_n^N \ge v_2 n) + n^{-1} E\left[\max X_n^N\, \mathbf{1}(\max X_n^N \ge (v(p) + b) n)\right].$$
Fixing some $b > 0$, we deduce from (9), (11), and the definition of $v_2$, that, for all large enough $N$, $E(n^{-1} \max X_n^N) \le (v(p) - (1-\gamma)\varepsilon) + o((\log N)^{-2})$. Using subadditivity (see the proof of Proposition 2), we have that $v_N(p) \le E(n^{-1} \max X_n^N)$, and we easily deduce that $v_N(p) \le (v(p) - (1-\gamma)\varepsilon) + o((\log N)^{-2})$. Now remember that, as $N$ goes to infinity,
$$\varepsilon \sim \frac{\chi(p) - \beta}{(1+\lambda)^2}\, (\log N)^{-2}.$$
Since the above estimates are true for arbitrarily small $\beta$, $\lambda$ and $\gamma$, the conclusion follows.
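As a sanity check on the choice of $m$ in the proof above, the following short computation shows why Theorem 3 yields (6) and where the asymptotics of $\varepsilon$ come from. It is only a sketch: the integer rounding of $m$ is ignored, and the shorthand $c := \chi(p) - \beta$ is introduced here for readability and is not notation from the text.

```latex
\begin{align*}
m \ge \theta^{3/2}\,\frac{\big((1+\lambda)\log N\big)^{3}}{c^{3/2}}
 &\iff m^{2/3} \ge \theta\,\frac{\big((1+\lambda)\log N\big)^{2}}{c} \\
 &\iff \frac{c}{\varepsilon} = \frac{c\,m^{2/3}}{\theta} \ge \big((1+\lambda)\log N\big)^{2}
   \qquad (\varepsilon = \theta m^{-2/3}) \\
 &\iff \exp\!\Big(-\big(c/\varepsilon\big)^{1/2}\Big) \le N^{-(1+\lambda)} .
\end{align*}
% With equality in the first line (up to rounding):
\[
\varepsilon = \theta m^{-2/3} \sim \frac{c}{(1+\lambda)^{2}}\,(\log N)^{-2}
 = \frac{\chi(p)-\beta}{(1+\lambda)^{2}}\,(\log N)^{-2}.
\]
```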
6. The Upper Bound

The proof of the upper bound on $v_\infty(p) - v_N(p)$ given in [4] was in some sense a rigorous version of the heuristic argument of Brunet and Derrida, according to which we should compare the behavior of the particle system with $N$ particles to a version of the infinite population limit dynamics suitably modified by a cutoff. The proof given here relies upon a direct comparison with branching random walks, using the fact that, above the threshold induced by the selection steps, the behavior of our branching-selection particle system is exactly that of a branching random walk.

Consider $0 < \lambda < 1$ and let
$$\varepsilon := \frac{\chi(p)}{((1-\lambda)\log N)^2}.$$
With this choice of $\varepsilon$, as $N$ goes to infinity, Theorem 2 yields that
$$\rho(\infty, \varepsilon) = N^{-(1-\lambda) + o(1)}. \tag{12}$$
Let us now quote the following result, which is a consequence of Theorem 2, Sect. 6, Chap. 1 in [1].

Lemma 3. Let $(M_n)_{n \ge 0}$ denote the population size of a supercritical Galton-Watson process with square-integrable offspring distribution started with $M_0 = 1$. Then there exist $r > 0$ and $\phi > 1$ such that, for all $n \ge 0$, $P(M_n \ge \phi^n) \ge r$.

$^1$ Here are the details. Let $M := \max X_n^N - v(p) n$. From the fact that, for all large enough $x$, $x \le \exp(t^* x/2008)$, we deduce that $E(M \mathbf{1}(M \ge bn)) \le E \exp\left(t^* M - \tfrac{2007}{2008}\, t^* b n\right)$ for all large enough $n$. Similarly, $E(v(p) n \mathbf{1}(M \ge bn)) \le |v(p)| n\, E \exp(t^* M - t^* b n)$. The result follows from summing the two inequalities above and applying (10).
Let $R$ be such that $R < v(p)$ and $p([R, +\infty)) \ge 2/3$. Consider a Galton-Watson tree whose offspring distribution is a binomial with parameters 2 and $p([R, +\infty))$. The average number of offspring is thus equal to $2p([R, +\infty)) \ge 4/3 > 1$ with our assumptions. In the sequel, we use the notations $r$ and $\phi$ to denote the numbers given by Lemma 3 when we use this offspring distribution.

Now, let $s_N := \left\lfloor \frac{\log N}{\log \phi} \right\rfloor + 1$, consider $0 < \eta < 1$, and define
$$m := \left\lceil \frac{(v(p) - R)\, s_N}{\eta \varepsilon} \right\rceil, \qquad n := m + s_N.$$
Let $u$ denote a vertex at depth $m$ in a branching random walk BRW, and assume that $\Phi(u) \ge (v(p) - \varepsilon) m$. Consider the probability that, conditional upon the values of $\Phi$ on the vertices located at depth at most $m$, there are at least $\phi^{s_N}$ distinct descending paths $u =: u_m, \ldots, u_n$ starting at $u$ and satisfying $\Phi(u_{i+1}) - \Phi(u_i) \ge R$ for all $i \in [\![m, n-1]\!]$. Lemma 3 above shows that this probability is $\ge r$. Moreover, with our definition of $m$ and $n$, and our assumption on the value of $\Phi(u)$, any such descending path has the property that $\Phi(u_i) \ge (v(p) - (1+\eta)\varepsilon) i$ for all $i \in [\![m, n]\!]$. We conclude that the probability that there exist at least $\phi^{s_N}$ distinct descending paths of the form $root = u_0, \ldots, u_n$ such that $\Phi(u_i) \ge (v(p) - (1+\eta)\varepsilon) i$ for all $i \in [\![0, n]\!]$ is $\ge \rho(m, \varepsilon) r$.

Now define $A$ as the event that, for all $j \in [\![1, N]\!]$, $\mathrm{BRW}_j$ does not contain more than $\phi^{s_N}$ distinct descending paths of the form $root = u_0, \ldots, u_n$ such that $\Phi_j(u_i) \ge (v(p) - (1+\eta)\varepsilon) i$ for all $i \in [\![0, n]\!]$. Using the fact that $\mathrm{BRW}_1, \ldots, \mathrm{BRW}_N$ are independent and the above discussion, we see that $P(A) \le [1 - \rho(m, \varepsilon) r]^N$. Using (12), the obvious inequality $\rho(m, \varepsilon) \ge \rho(\infty, \varepsilon)$, and the fact that $1 - x \le \exp(-x)$ for all $x$, we deduce that
$$P(A) \le \exp(-N^{\lambda + o(1)}). \tag{13}$$
Let $\delta := (1+\eta)\varepsilon$. Define the event $B := \{\min(X_k^N) < (v(p) - \delta) k \text{ for all } k \in [\![1, n]\!]\}$, and assume that $B \cap A^c$ occurs. From the definition of the selection mechanism, we conclude that there must be at least $\phi^{s_N}$ distinct vertices in the set $G_n^N$, which is a contradiction since $\phi^{s_N} > N$. As a consequence, $B \cap A^c = \emptyset$, so that $B \subset A$. From (13), we thus obtain that
$$P(B) \le \exp(-N^{\lambda + o(1)}). \tag{14}$$
To exploit this bound, we use the following result.

Proposition 4. With the previous notations, for all $N$ large enough,
$$v_N(p) \ge (v(p) - \delta) - |v(p) - \delta|\, n P(B) - n E(|\Theta_n| \mathbf{1}(B)),$$
where $\Theta_n$ is the minimum of $2nN$ i.i.d. random variables with distribution $p$.

Proof. We re-use the coupling construction given in the proof of Proposition 2, and assume that $(X_n^N)_{n \ge 0}$ is defined using this construction by the identity $X_n^N := W_{0,n}^N$. Start with $\Gamma_0 := 0$ and $J_0 := 0$. Given $i \ge 0$, $\Gamma_i$ and $J_i$, let $L_{i+1} := \inf\{k \in [\![1, n]\!];\ \min(W^N_{\Gamma_i, k}) \ge (v(p) - \delta) k\}$, with the convention that $\inf \emptyset := n$. Then let $\Gamma_{i+1} := \Gamma_i + L_{i+1}$, and let $J_{i+1} := J_i + \min(W^N_{\Gamma_i, L_{i+1}})$. Using an argument similar to the proof of Lemma 1, it is then quite easy to deduce that, a.s.,
$$\text{for all } i \ge 0, \quad \min W^N_{0, \Gamma_i} \ge J_i. \tag{15}$$
Observe that the sequence $(\Gamma_{i+1} - \Gamma_i)_{i \ge 0}$ is i.i.d., and that the common distribution of the $\Gamma_{i+1} - \Gamma_i$ is that of the random variable $L$ defined by $L := \inf\{k \in [\![1, n]\!];\ \min(X_k^N) \ge (v(p) - \delta) k\}$, with the convention that $\inf \emptyset := n$. Similarly, the sequence $(J_{i+1} - J_i)_{i \ge 0}$ is i.i.d., the common distribution of the $J_{i+1} - J_i$ being that of $\min X_L^N$. From the law of large numbers and Proposition 2, we have that, a.s., $\lim_{i \to +\infty} i^{-1} \min X^N_{\Gamma_i} = v_N(p) E(L)$, while the law of large numbers and (15) imply that $\liminf_{i \to +\infty} i^{-1} \min X^N_{\Gamma_i} \ge E(\min X_L^N)$. We conclude that
$$v_N(p) \ge \frac{E(\min X_L^N)}{E(L)}.$$
Now, let $\Theta_n$ denote the minimum of all the random walk steps performed by the branching-selection system between time 0 and $n$. By definition we have that $\min X_L^N \ge (v(p) - \delta) L \mathbf{1}(B^c) + L \Theta_n \mathbf{1}(B)$, so that $E(\min X_L^N) \ge (v(p) - \delta)(E(L) - E(L \mathbf{1}(B))) + E(L \Theta_n \mathbf{1}(B))$. Using the fact that $1 \le L \le n$, we obtain that
$$\frac{E(\min X_L^N)}{E(L)} \ge (v(p) - \delta) - |v(p) - \delta|\, n P(B) - n E(|\Theta_n| \mathbf{1}(B)).$$
Proof of the upper bound part in Theorem 1. In view of Proposition 4, we deduce that $v_N(p) \ge (v(p) - (1+\eta)\varepsilon)(1 - n P(B)) - n E(|\Theta_n| \mathbf{1}(B))$. Bounding above $|\Theta_n|$ by the sum of the absolute values of the $2nN$ corresponding i.i.d. variables, and using Schwarz's inequality thanks to Assumptions (A1) and (A2), we deduce that $E(|\Theta_n| \mathbf{1}(B)) \le 2nN C\, P(B)^{1/2}$ for some constant $C$ (depending only on $p$). From (14) and the definition of $n$, we deduce that, as $N$ goes to infinity, $n P(B)$ and $n E(|\Theta_n| \mathbf{1}(B))$ are $o((\log N)^{-2})$, so we obtain that
$$v_N(p) \ge v(p) - \frac{\chi(p)(1+\eta)}{(1-\lambda)^2}\, (\log N)^{-2} + o((\log N)^{-2}).$$
Since $\lambda$ and $\eta$ can be taken arbitrarily small in the argument leading to the above inequality, the conclusion follows.

7. The Bernoulli Case when 1/2 ≤ α < 1

In the Bernoulli case $p = \alpha\delta_1 + (1-\alpha)\delta_0$, with $1/2 \le \alpha < 1$, Assumption (A3) breaks down, and the behavior of the particle system turns out to be quite different from Brunet-Derrida, as stated in the following theorems. Note that, when $1/2 \le \alpha < 1$, $v_\infty(p) = 1$.

Theorem 4. For $\alpha = 1/2$, there exist $0 < c_*(p) \le c^*(p) < +\infty$ such that, for all large $N$,
$$c_*(p) N^{-1} \le 1 - v_N(p) \le c^*(p) N^{-1}. \tag{16}$$

Theorem 5. For $\alpha > 1/2$, there exist $0 < d^*(p) \le d_*(p) < +\infty$ such that, for all large $N$,
$$\exp(-d_*(p) N) \le 1 - v_N(p) \le \exp(-d^*(p) N). \tag{17}$$
7.1. Lower bound when α = 1/2. It is easily checked that, for all $m \ge 0$, the number of particles in the branching-selection system that are located at position $m$ after $m$ steps, that is, $X_m^N(m)$, is stochastically dominated by the total population at the $m$th generation of a family of $N$ independent Galton-Watson trees, with offspring distribution binomial(2, 1/2). This corresponds to the critical case of Galton-Watson trees, and the probability that such a tree survives up to the $m$th generation is $\le c m^{-1}$ for some constant $c > 0$ and all large $m$. As a consequence, the union bound over the $N$ Galton-Watson trees yields that, for large enough $m$, $P(X_m^N(m) \ge 1) \le c N m^{-1}$. On the other hand, we have by definition that $E \max(X_m^N) \le m P(X_m^N(m) \ge 1) + (m-1) P(X_m^N(m) = 0)$. Choosing $m := AN$, where $A \ge 1$ is an integer, we deduce that, for large $N$,
$$m^{-1} E \max(X_m^N) \le 1 - \frac{1}{AN}(1 - c/A).$$
Using subadditivity (see the proof of Proposition 2), we have that $v_N(p) \le E(m^{-1} \max X_m^N)$. The lower bound in (16) follows by choosing $A > c$.

7.2. Upper bound when α = 1/2. Given $m \ge 1$, define $U := \inf\{n \in [\![1, m]\!];\ X_n^N(n) \le 2N/3\}$, with the convention that $\inf \emptyset := m$. Observe that $\min X_U^N \ge U - 1$, since, by definition, $X_{U-1}^N(U-1) \ge 2N/3$, so that, after the branching step applied to $X_{U-1}^N$, the number of particles whose positions are $\ge U - 1$ must be $\ge 2 \times 2N/3$, whence $\ge N$. Using an argument similar to the proof of Proposition 4, we deduce that
$$v_N(p) \ge 1 - \frac{1}{E(U)}. \tag{18}$$
The upper bound in (16) is then a direct consequence of the following claim.

Claim. For small enough $\varepsilon > 0$, with $m := \lceil \varepsilon N \rceil$, there exists $c(\varepsilon) > 0$ such that $E(U) \ge c(\varepsilon) N$ for all large $N$.

To prove the claim, introduce for every $x \in \mathbb{N}$ the Markov chain $(V_k^x)_{k \ge 0}$ defined by the initial condition $V_0^x := x$, and the following transitions: given $V_0^x, \ldots, V_k^x$, the next term $V_{k+1}^x$ is the minimum of $N$ and of a random variable with a binomial($2V_k^x$, 1/2) distribution. Clearly, the sequences $(V_k^N)_{k \ge 0}$ and $(X_k^N(k))_{k \ge 0}$ have the same distribution. Moreover, given two starting points $x, y \in \mathbb{N}$ such that $x \le y$, one can easily couple $(V_k^x)_{k \ge 0}$ and $(V_k^y)_{k \ge 0}$ in such a way that $V_k^x \le V_k^y$ for all $k \ge 0$. As a consequence, choosing $x_N := 3N/4$, we see that $U$ stochastically dominates the random variable $T$ defined by $T := \inf\{n \in [\![1, m]\!];\ V_n^{x_N} \le 2N/3\}$ (again with $\inf \emptyset := m$), so that $P(U = m) \ge P(T = m)$.

Now let us define yet another Markov chain $(Z_k)_{k \ge 0}$ by $Z_0 := x_N$ and the following transitions: given $Z_0, \ldots, Z_k$, the next term $Z_{k+1}$ is a random variable with a binomial($2Z_k$, 1/2) distribution. Clearly we can couple $(Z_k)_k$ and $(V_k^{x_N})_k$ so that they coincide up to one unit of time before the first hitting of $[\![N, +\infty[\![$. As a consequence, the two events $A_1 := \{\sup_{k \in [\![0, m]\!]} |V_k^{x_N} - 3N/4| < N/16\}$ and $A_2 := \{\sup_{k \in [\![0, m]\!]} |Z_k - 3N/4| < N/16\}$ have the same probability. Observing that $(Z_k)_{k \ge 0}$ is a martingale, we can use Doob's maximal inequality to prove that $P(A_1^c) = P(A_2^c) \le E(Z_m - 3N/4)^2 (N/16)^{-2}$. Then, it is easily checked from the definition that $E(Z_{k+1}^2 \mid Z_k) = Z_k^2 + Z_k/2$ for all $k \ge 0$, and, using again the fact that $(Z_k)_{k \ge 0}$ is a martingale, we deduce that $E(Z_m - 3N/4)^2 \le mN/2$. As a consequence, we see that, choosing $\varepsilon > 0$ small enough, we can ensure that $P(A_1^c) \le 1/2008$ for all large $N$. Since, by definition, $A_1$ implies that $T = m$, we finally deduce that, for such an $\varepsilon$, and all $N$ large enough, we have that $P(U = m) \ge P(T = m) \ge 2007/2008$. The conclusion follows.
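The two moment identities used for the chain $(Z_k)$ can be verified exactly on small cases. The sketch below is illustrative only: the function names are hypothetical, and small integer starting points are used instead of $x_N = 3N/4$. It computes $E(Z_{k+1}^2 \mid Z_k = z)$ for a binomial($2z$, 1/2) transition and the exact value of $E(Z_m - Z_0)^2$, which by orthogonality of martingale increments equals $m Z_0 / 2$; with $Z_0 = 3N/4$ this is $3mN/8$, consistent with the bound $mN/2$ used above.

```python
from fractions import Fraction
from math import comb


def binomial_half_moments(z):
    """Exact mean and second moment of a binomial(2z, 1/2) random variable."""
    n = 2 * z
    total = 2 ** n
    mean = Fraction(sum(k * comb(n, k) for k in range(n + 1)), total)
    second = Fraction(sum(k * k * comb(n, k) for k in range(n + 1)), total)
    return mean, second


def step(dist):
    """One transition of the chain: from state z, jump to binomial(2z, 1/2)."""
    new = {}
    for z, pz in dist.items():
        n = 2 * z
        for k in range(n + 1):
            new[k] = new.get(k, Fraction(0)) + pz * Fraction(comb(n, k), 2 ** n)
    return new


def second_moment_about_start(x, m):
    """Exact value of E (Z_m - x)^2 for the chain started at Z_0 = x."""
    dist = {x: Fraction(1)}
    for _ in range(m):
        dist = step(dist)
    return sum(p * (z - x) ** 2 for z, p in dist.items())


if __name__ == "__main__":
    z = 4
    mean, second = binomial_half_moments(z)
    print(mean == z, second == z * z + Fraction(z, 2))
    print(second_moment_about_start(3, 3) == Fraction(3 * 3, 2))
```

The exhaustive enumeration grows quickly with the starting point and the number of steps, so it is only meant for small sanity checks of the identities, not for values of $N$ of practical size.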
7.3. Upper and lower bound when 1/2 < α < 1. As for the lower bound, observe that the probability that all the $2N$ particles generated during a branching step remain at the position from which they originated is $(1-\alpha)^{2N}$, so that $E(\max X_n^N) \le n(1 - (1-\alpha)^{2N})$. As for the upper bound, observe that, starting from $N$ particles at a site, the number of particles generated from these during a branching step and that perform $+1$ random walk steps has a binomial($2N$, $\alpha$) distribution, whose expectation is $2\alpha N$, with $2\alpha > 1$. Using a standard large deviations bound for binomial random variables, we see that the probability for this number to be less than $N$ is $\le \exp(-cN)$ for some $c > 0$. Using superadditivity of $E(\min X_n^N)$ (derived in exactly the same way as the subadditivity property of $E(\max X_n^N)$, see the proof of Proposition 2), it is easy to deduce that $E(\min X_n^N) \ge n(1 - \exp(-cN))$. The result follows.

8. Discussion

This section contains a discussion whose goal is to provide a self-contained qualitative explanation of the $(\log N)^{-2}$ order of magnitude appearing in Theorem 1. Most of the discussion consists in explaining the $\varepsilon^{-1/2}$ scaling of $\log \rho(\infty, \varepsilon)$, and of $\log \rho(m, \varepsilon)$ when $m \propto \varepsilon^{-3/2}$, in a way that is (hopefully) less technically demanding than the proofs presented in [13], although we follow the proof strategy of [13] rather closely. Note that the discussion here deals mostly with the order of magnitude of terms, not with the precise value of the constants as in [13]. For the sake of readability, some of the arguments are only discussed in a quite informal way.

8.1. Asymptotic behavior of ρ(∞, ε) and ρ(m, ε).

8.1.1. Connection between ρ(∞, ε) and ρ(m, ε). A first remark is that the asymptotic behavior of $\rho(\infty, \varepsilon)$ can be connected with that of quantities of the form $\rho(m, \varepsilon)$ under appropriate conditions. One obvious inequality, valid for all $m \ge 0$, is the following:
$$\rho(m, \varepsilon) \ge \rho(\infty, \varepsilon). \tag{19}$$
In the reverse direction, we have the following:

Proposition 5. There exist $R < v(p) - 1$, $\phi > 1$, $r > 0$ and $c > 0$, depending only on $p$, such that, for all $m \ge 0$, and all $0 < \varepsilon < 1$, the condition
$$\phi^q \rho(m, (1-\alpha)\varepsilon) \ge c \tag{20}$$
implies that the following inequality holds:
$$\rho(\infty, \varepsilon) \ge \frac{r}{2}\, \rho(m, (1-\alpha)\varepsilon), \tag{21}$$
where $\alpha$ and $\varepsilon$ are arbitrary numbers satisfying $0 < \alpha < 1$ and $\varepsilon > 0$, and
$$q := \left\lfloor \frac{\alpha \varepsilon m}{v(p) - \varepsilon - R} \right\rfloor.$$
The proof of the above proposition uses the following elementary lemma.
Lemma 4. Consider a Galton-Watson process with offspring distribution $Q$. If there exists $a \ge 1$ such that $a \times Q([a, +\infty[) \ge 2 \log 2$, then the survival probability is larger than or equal to $Q([a, +\infty[)/2$.

Proof of Lemma 4. Let $g(s) := \sum_{k=0}^{+\infty} Q(k) s^k$ for $s \in [0, 1[$. By coupling, it is enough to prove the result under the additional assumption that only the values 0 and $a$ have nonzero probability with respect to $Q$, so we may assume that $g(s) = 1 - Q(a) + s^a Q(a)$. Since $aQ(a) > 1$, we have a super-critical Galton-Watson process, and, from standard theory, we know that the extinction probability $d$ of the process is the unique solution in $[0, 1[$ of the equation $g(d) = d$, with $g(s) > s$ for $s \in ]0, d[$ and $g(s) < s$ for $s \in ]d, 1[$. Our assumption that $a \times Q(a) \ge 2 \log 2$ easily yields the fact that $g(1 - Q(a)/2) \le 1 - Q(a)/2$: indeed, $(1 - Q(a)/2)^a \le \exp(-aQ(a)/2) \le 1/2$. Hence $d$ must be $\le 1 - Q(a)/2$. The result follows.

Proof of Proposition 5. Consider the values of $R$, $\phi$, $r$ defined in Sect. 6, in the argument following Lemma 3. Then consider a descending path $root = u_0, \ldots, u_{m+q} \in T$ such that $\Phi(u_i) \ge (v(p) - (1-\alpha)\varepsilon) i$ for all $i \in [\![0, m]\!]$, and $\Phi(u_{i+1}) - \Phi(u_i) \ge R$ for all $i \in [\![m, m+q-1]\!]$. We see from the definition of $q$ that $\Phi(u_i) \ge (v(p) - \varepsilon) i$ for all $i \in [\![0, m+q]\!]$. We now define a Galton-Watson branching process of vertices of $T$ in which, for all $n$, the $n$th generation of the process is formed by vertices in $T((m+q)n)$. First, the zeroth generation of the process is formed by the root of $T$. Then, given a vertex $x \in T((m+q)n)$ belonging to the $n$th generation of the process, the offspring of this vertex in the branching process is formed by all the endpoints $y$ of descending paths $x =: u_0, \ldots, u_{m+q} := y$ in $T$ such that $\Phi(u_i) - \Phi(u_0) \ge (v(p) - \varepsilon) i$ for all $i \in [\![0, m+q]\!]$. From the definition of the branching mechanism of BRW, we see that we have defined a Galton-Watson branching process. Now, re-doing the argument following Lemma 3 in Sect. 6, we see that the offspring distribution of this branching process gives at least $\phi^q$ children with probability at least $\rho(m, (1-\alpha)\varepsilon) r$. On the other hand, the definition of our branching process shows that if it never goes extinct, the root of $T$ is $(\infty, v(p) - \varepsilon)$-good. As a consequence, the survival probability of our process is a lower bound for $\rho(\infty, \varepsilon)$. The result then follows from Lemma 4, choosing $c > (2 \log 2)/r$.

We conclude this section by the following remark: using the above results, it is possible to deduce the conclusion of Theorem 3 from the conclusion of Theorem 2. In [13], Theorem 3 is in fact an intermediate step in the proof of Theorem 2, so our remark does not lead to an alternative way of proving Theorem 3 from first principles. However, its interest is to show that, as soon as $\varepsilon^{1/2} \log \rho(\infty, \varepsilon)$ converges to some limit, this limit can be approached arbitrarily closely by expressions of the form $\varepsilon^{1/2} \log \rho(m, \varepsilon)$, with $\varepsilon = \theta/m^{2/3}$ for some large enough constant $\theta$.

Proof of Theorem 3 from the conclusion of Theorem 2. Let $0 < \alpha < 1$, $\theta > 0$ and $m \ge 1$. Set $\varepsilon := \theta/m^{2/3}$. Let $R, \phi, r, c, q$ be defined as in the statement of Proposition 5. As $m$ goes to infinity, we have that $\log \phi^q \sim \theta \alpha \log(\phi)(v(p) - R)^{-1} m^{1/3}$, and, by Theorem 2, $\log \rho(\infty, (1-\alpha)\varepsilon) \sim -\chi(p)^{1/2} \theta^{-1/2} (1-\alpha)^{-1/2} m^{1/3}$. Therefore, provided that $\theta$ has been chosen large enough, Inequality (20) holds for large enough $m$. Given such $\theta$ and $m$, Proposition 5 and Theorem 2 yield that
$$\rho(m, (1-\alpha)\varepsilon) \le (2/r)\, \rho(\infty, \varepsilon) = (2/r) \exp\left(-\left(\frac{\chi(p) + o(1)}{\varepsilon}\right)^{1/2}\right).$$
Setting $\varepsilon' := (1-\alpha)\theta m^{-2/3}$, one obtains that
$$\rho(m, \varepsilon') \le \exp\left(-\left(\frac{\chi(p)(1-\alpha) + o(1)}{\varepsilon'}\right)^{1/2}\right).$$
Since $\alpha$ can be chosen arbitrarily small, the conclusion follows.

8.1.2. Strategy and results. Given the results of the previous section, the strategy consists in studying the order of magnitude of $\log \rho(m, \varepsilon)$, when $m$ has the scaling form $m \propto \varepsilon^{-u}$ for some $u > 0$, seeking a value of $u$ such that (20) is satisfied. We shall see that $\log \rho(m, \varepsilon) \propto -\varepsilon^{-h(u)}$, with $h(u) = u/3$ for all $0 < u \le 3/2$, while the integer $q$ in Proposition 5 satisfies $q \propto \varepsilon^{-(u-1)}$ for all $u > 1$. For $u^* := 3/2$, the identity $h(u^*) = u^* - 1$ holds, so that $q$ and $\log \rho(m, (1-\alpha)\varepsilon)$ have the same order of magnitude and (20) can be satisfied. Then (19) and (21) imply that $\log \rho(\infty, \varepsilon) \propto -\varepsilon^{-h(u^*)} = -\varepsilon^{-1/2}$, and (using the fact that $h$ is non-decreasing), $h(u) = h(u^*)$ for all $u \ge u^*$. Thus, the $\varepsilon^{-1/2}$ scaling exponent is "explained" by 3/2 being the solution of the equation $h(u^*) = u^* - 1$. We deduce from these results the existence of two distinct regimes for $\rho(m, \varepsilon)$ with $m \propto \varepsilon^{-u}$:
• when $0 < u \le 3/2$, $\log(\rho(m, \varepsilon)) \propto -m^{1/3}$;
• when $u \ge 3/2$, $\log(\rho(m, \varepsilon)) \propto \log(\rho(\infty, \varepsilon)) \propto -\varepsilon^{-1/2}$.

8.1.3. Asymptotics of log ρ(m, ε) with m ∝ ε^{−u}, 0 < u < 3/2. Let us now explain how to compute $h(u)$, and assume throughout this section that $m \propto \varepsilon^{-u}$, whence $\varepsilon m \propto \varepsilon^{1-u}$. A key idea is to perform a change of measure, replacing the step distribution $p$ of the BRW by the distribution $\tilde{p}$ defined by (see Sect. 1.2)
$$\frac{d\tilde{p}}{dp}(x) := \frac{\exp(t^* x)}{\exp(\Lambda(t^*))}.$$
The mean value of a step with respect to $\tilde{p}$ is now equal to $v(p)$, and, if $(S_k)_{k \ge 0}$ denotes a random walk started at $S_0 := 0$, with i.i.d. increments whose common distribution is $p$ with respect to a probability measure $P$, and $\tilde{p}$ with respect to a probability measure $\tilde{P}$, the following identity holds for all $k$:
$$2^k P[(S_0, \ldots, S_k) \in \cdot\,] = \tilde{E}\left[e^{-t^*(S_k - v(p) k)}\, \mathbf{1}((S_0, \ldots, S_k) \in \cdot\,)\right], \tag{22}$$
where $\tilde{E}$ denotes expectation with respect to $\tilde{P}$. The raison d'être of Assumption (A3) is to allow for such a change of measure. Remember that $\rho(m, \varepsilon)$ is the probability that at least one descending path $root =: u_0, u_1, \ldots, u_m$ exists in $T$ such that
$$\text{for all } i \in [\![0, m]\!],\ \Phi(u_i) \ge (v(p) - \varepsilon) i. \tag{23}$$
Observe that, for such a path, either
$$\text{for all } i \in [\![0, m]\!],\ \Phi(u_i) \le v(p) i + \varepsilon^{-u/3}, \tag{24}$$
or
$$\text{there exists } i \in [\![0, m]\!] \text{ such that } \Phi(u_i) > v(p) i + \varepsilon^{-u/3}. \tag{25}$$
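Denoting by $\Pi_m$ the (random) number of descending paths satisfying (23) and (24), and by $\Xi_m$ the number of those satisfying (23) and (25), we see that $\rho(m, \varepsilon) = Q(\{\Pi_m \ge 1\} \cup \{\Xi_m \ge 1\})$, so that, obviously,
$$Q(\Pi_m \ge 1) \le \rho(m, \varepsilon) \le E(\Pi_m) + Q(\Xi_m \ge 1). \tag{26}$$
By definition, $E(\Pi_m) = 2^m P(\text{for all } i \in [\![0, m]\!],\ v(p) i - \varepsilon i \le S_i \le v(p) i + \varepsilon^{-u/3})$, which is rewritten, using (22), as
$$\tilde{E}\left[e^{-t^*(S_m - v(p) m)}\, \mathbf{1}(\text{for all } i \in [\![0, m]\!],\ v(p) i - \varepsilon i \le S_i \le v(p) i + \varepsilon^{-u/3})\right]. \tag{27}$$
Since the only paths that contribute to the above expectation have $v(p) m - \varepsilon m \le S_m \le v(p) m + \varepsilon^{-u/3}$, we see that
$$E(\Pi_m) \le e^{t^* \varepsilon m}\, \tilde{P}\left(\text{for all } i \in [\![0, m]\!],\ v(p) i - \varepsilon i \le S_i \le v(p) i + \varepsilon^{-u/3}\right), \tag{28}$$
and
$$E(\Pi_m) \ge e^{-t^* \varepsilon^{-u/3}}\, \tilde{P}\left(\text{for all } i \in [\![0, m]\!],\ v(p) i - \varepsilon i \le S_i \le v(p) i + \varepsilon^{-u/3}\right). \tag{29}$$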
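Now observe that, under $\tilde{P}$, $(S_k - v(p) k)_{k \ge 0}$ is a random walk with centered square-integrable increments. Moreover, $(\varepsilon^{-u/3})^2 \ll m$ since $u > 0$, and $\varepsilon^{-u/3} \gg \varepsilon m$ as soon as $u < 3/2$. As a consequence, the usual Brownian scaling for random walks yields that, when $0 < u < 3/2$,
$$\log \tilde{P}\left(\forall i \in [\![0, m]\!],\ v(p) i - \varepsilon i \le S_i \le v(p) i + \varepsilon^{-u/3}\right) \propto -\frac{\varepsilon^{-u}}{(\varepsilon^{-u/3})^2} = -\varepsilon^{-u/3}. \tag{30}$$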
Since $\varepsilon^{-u/3} \gg \varepsilon m$ when $u < 3/2$, we deduce from (28) and (29) that $\log E(\Pi_m) \propto -\varepsilon^{-u/3}$ for $1 < u < 3/2$. As for $\Xi_m$, the union bound yields that
$$Q(\Xi_m \ge 1) \le \sum_{i=0}^{m} Q(\exists x \in T(i);\ \Phi(x) > v(p) i + \varepsilon^{-u/3}),$$
whence
$$Q(\Xi_m \ge 1) \le \sum_{i=0}^{m} 2^i P(S_i > v(p) i + \varepsilon^{-u/3}). \tag{31}$$
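For all $i \in [\![0, m]\!]$, the change of measure shows that
$$2^i P(S_i > v(p) i + \varepsilon^{-u/3}) = \tilde{E}\left[e^{-t^*(S_i - v(p) i)}\, \mathbf{1}(S_i > v(p) i + \varepsilon^{-u/3})\right] \le \exp(-t^* \varepsilon^{-u/3}),$$
so we easily deduce that
$$-\log Q(\Xi_m \ge 1) \text{ is at least } \propto \varepsilon^{-u/3}. \tag{32}$$
We conclude from (31), (32) and (26) that
$$-\log \rho(m, \varepsilon) \text{ is at least } \propto \varepsilon^{-u/3} \text{ for } 0 < u < 3/2. \tag{33}$$
Using the same kind of argument that led to (31), but working a little more (we omit the details), it is possible to show that
$$\log E(\Pi_m^2) \text{ is at most } \propto \varepsilon^{-u/3} \text{ for } 0 < u < 3/2. \tag{34}$$
Then, we can use the classical second moment inequality:
$$Q(\Pi_m > 0) \ge \frac{E(\Pi_m)^2}{E(\Pi_m^2)},$$
to deduce from (26), (31) and (34) that
$$-\log \rho(m, \varepsilon) \text{ is at most } \propto \varepsilon^{-u/3} \text{ for } 0 < u < 3/2. \tag{35}$$
As a consequence, we obtain that
$$\log \rho(m, \varepsilon) \propto -\varepsilon^{-u/3} \text{ for } 0 < u < 3/2. \tag{36}$$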
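The exponent bookkeeping behind this section can be collected in a few lines. The following is a sketch under the stated scaling only, with the shorthand $w := \varepsilon^{-u/3}$ for the width of the tube (a name introduced here, not in the text) and $m = \varepsilon^{-u}$ taken with constant 1.

```latex
% Assumed shorthand: m = \varepsilon^{-u}, tube width w = \varepsilon^{-u/3}.
\begin{align*}
\frac{m}{w^{2}} &= \varepsilon^{-u+2u/3} = \varepsilon^{-u/3}
  && \text{(exponent in the Brownian scaling estimate (30))} \\
w^{2} \ll m &\iff \varepsilon^{-2u/3} \ll \varepsilon^{-u} \iff u > 0 \\
\varepsilon m = \varepsilon^{1-u} \ll w = \varepsilon^{-u/3}
  &\iff 1-u > -\tfrac{u}{3} \iff u < \tfrac{3}{2}
\end{align*}
```

The last line is where the threshold $u = 3/2$ enters: below it, the drift term $t^* \varepsilon m$ in (28) is negligible compared with the Brownian cost $\varepsilon^{-u/3}$, while at $u = 3/2$ the two are of the same order.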
8.1.4. Asymptotics of ρ(m, ε) with m ∝ ε^{−u}, u ≥ 3/2, and ρ(∞, ε). The above discussion dealt only with rough orders of magnitude (denoted by the ∝ symbol), but a more precise analysis is needed to study the competition between positive and negative terms of similar orders of magnitude when $u = 3/2$. For $\lambda > 0$, let $m := \lambda \varepsilon^{-3/2}$ and
$$f_+(\lambda) := \limsup_{\varepsilon \to 0} -\varepsilon^{1/2} \log \rho(m, \varepsilon), \qquad f_-(\lambda) := \liminf_{\varepsilon \to 0} -\varepsilon^{1/2} \log \rho(m, \varepsilon).$$
From the monotonicity property $\rho(m, \varepsilon) \ge \rho(m', \varepsilon)$ when $m' \ge m$, we deduce that $\lambda \mapsto f_+(\lambda)$ and $\lambda \mapsto f_-(\lambda)$ are non-decreasing. Similarly, from the monotonicity property $\rho(m, \varepsilon) \le \rho(m, \varepsilon')$ when $\varepsilon \le \varepsilon'$, we deduce that $\lambda \mapsto \lambda^{-1/3} f_+(\lambda)$ and $\lambda \mapsto \lambda^{-1/3} f_-(\lambda)$ are non-increasing. In particular, the fact that there exists some $\lambda$ for which $f_+(\lambda)$ is finite (resp. positive) implies that $f_+(\lambda)$ is finite (resp. positive) for all $\lambda > 0$. The same property holds for $f_-$.

Now, rework the bounds in the previous section, replacing the $\varepsilon^{-u/3}$ ($= \varepsilon^{-1/2}$ since $u = 3/2$) terms in the definition of $\Pi_m$ and $\Xi_m$ by $\lambda \varepsilon^{-1/2}$ (more precision than only the order of magnitude of terms is needed in order to deal with the case $u = 3/2$). Consider the analog of the bound (28) in the present context, and observe that, for small $\varepsilon$, $\varepsilon m \sim \lambda \varepsilon^{-1/2}$. The Brownian scaling bound then yields the existence of a constant $c > 0$ such that, for small $\varepsilon$,
$$\log \tilde{P}\left(\forall i \in [\![0, m]\!],\ v(p) i - \varepsilon i \le S_i \le v(p) i + \lambda \varepsilon^{-1/2}\right) \le -c\, \frac{m}{(\lambda \varepsilon^{-1/2})^2} \sim -c \lambda^{-1} \varepsilon^{-1/2}.$$
For small enough $\lambda$, this term dominates the $t^* \varepsilon m \sim t^* \lambda \varepsilon^{-1/2}$ term in the exponential, so that $f_-(\lambda) > 0$. We deduce that $f_-(\lambda) > 0$ for all values of $\lambda > 0$. On the other hand, it is straightforward to adapt the estimates in the previous section to show that $f_+(\lambda) < +\infty$ for all $\lambda > 0$. We can thus conclude that, when $m \propto \varepsilon^{-3/2}$,
$$\log \rho(m, \varepsilon) \propto -\varepsilon^{-1/2}. \tag{37}$$
Now, by Proposition 5, the asymptotic scaling $\log \rho(\infty, \varepsilon) \propto -\varepsilon^{-1/2}$ is a consequence of (37), provided that (20) is satisfied for $u = 3/2$ and, at least, large enough $\lambda$. Observe that, on the one hand,
$$q \sim \frac{\alpha \varepsilon m}{v(p) - R} \sim \frac{\alpha \lambda \varepsilon^{-1/2}}{v(p) - R}.$$
On the other hand, $\log \rho(m, (1-\alpha)\varepsilon) \gtrsim -f_+(\lambda (1-\alpha)^{3/2})\, ((1-\alpha)\varepsilon)^{-1/2}$. The fact that $\lambda \mapsto \lambda^{-1/3} f_+(\lambda)$ is non-increasing, and thus bounded above for large $\lambda$, implies that $\phi^q \rho(m, (1-\alpha)\varepsilon) \gg 1$ for large enough $\lambda$, so that (20) is indeed satisfied.

8.2. Deducing the Brunet-Derrida behavior. Broadly speaking, our proof of the Brunet-Derrida behavior of branching-selection systems is based on the fact that there is a loose equivalence between the following two properties:
$$\mathrm{BRW}_1, \ldots, \mathrm{BRW}_N \text{ do not survive killing below a line of slope } v - \varepsilon, \tag{38}$$
and
$$v_N < v - \varepsilon. \tag{39}$$
If one accepts this premise, it is then natural to expect the actual velocity shift $\varepsilon_N := v(p) - v_N$ to satisfy
$$\rho(\infty, \varepsilon_N) \propto 1/N. \tag{40}$$
Indeed, since $\mathrm{BRW}_1, \ldots, \mathrm{BRW}_N$ are independent, $\rho(\infty, \varepsilon_N) \gg 1/N$ would imply that, with probability close to one, at least one of the $\mathrm{BRW}_i$s survives killing, while $\rho(\infty, \varepsilon_N) \ll 1/N$ would imply that, with probability close to one, none of the $\mathrm{BRW}_i$s survives. Using the asymptotics $\log \rho(\infty, \varepsilon) \sim -\chi(p)^{1/2} \varepsilon^{-1/2}$, it is then easily checked that (40) imposes the precise asymptotic behavior $\varepsilon_N \sim \chi(p)(\log N)^{-2}$.

To give an intuition of why (38) and (39) should be related, remember the coupling between the branching-selection particle system and $\mathrm{BRW}_1, \ldots, \mathrm{BRW}_N$ described in Sect. 3.3. If (38) holds, then, loosely speaking, $v(p) - \varepsilon$ is above the sustainable growth speed for a branching system with only $N$ particles available, so the maximum of the branching-selection system should grow at a speed lower than $v(p) - \varepsilon$. Conversely, if (38) does not hold, the population in $\mathrm{BRW}_1, \ldots, \mathrm{BRW}_N$ above the line with slope $v(p) - \varepsilon$ quickly exceeds $N$, since the existing surviving particles quickly yield many surviving descendants. As a consequence, the threshold in the selection steps of the branching-selection system has to be above the line with slope $v(p) - \varepsilon$. A rigorous formulation of the preceding arguments is precisely what we do in Sects. 5 and 6.
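The step from (40) to the $(\log N)^{-2}$ asymptotics can be written out as follows. This is a heuristic sketch in the spirit of the discussion above, reading "∝" in (40) as "bounded above and below by constant multiples", which is an interpretation assumed here.

```latex
\begin{align*}
\rho(\infty,\varepsilon_N) \propto \frac{1}{N}
 &\iff \log\rho(\infty,\varepsilon_N) = -\log N + O(1) \\
 &\iff \chi(p)^{1/2}\,\varepsilon_N^{-1/2}\,(1+o(1)) = \log N + O(1) \\
 &\iff \varepsilon_N = \frac{\chi(p)}{(\log N)^{2}}\,(1+o(1)).
\end{align*}
```

The second line uses $\log \rho(\infty, \varepsilon) \sim -\chi(p)^{1/2} \varepsilon^{-1/2}$ from Theorem 2; the $O(1)$ term is absorbed in the $1 + o(1)$ factor because $\log N \to \infty$.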
It should be noted that the key time scale over which we have to control the particle system to prove both the upper and the lower bound is $\varepsilon^{-3/2}$, with $\varepsilon$ satisfying (40), that is, a time scale of order $(\log N)^3$. This is the same order of magnitude as the one observed for coalescence times of the genealogical process underlying the branching-selection particle system (this question is investigated empirically and with heuristic arguments in e.g. [5]). Understanding more precisely the role of this time scale for the dynamics of the particle system certainly deserves more investigation.

References
1. Athreya, K.B., Ney, P.E.: Branching processes. Mineola, NY: Dover Publications Inc., 2004. Reprint of original, New York: Springer, 1972
2. Benguria, R., Depassier, M.C.: On the speed of pulled fronts with a cutoff. Phys. Rev. E 75, 051106 (2007)
3. Benguria, R., Depassier, M.C., Loss, M.: Validity of the Brunet-Derrida formula for the speed of pulled fronts with a cutoff. Eur. Phys. J. B 61, 331 (2008)
4. Bérard, J.: An example of Brunet-Derrida behavior for a branching-selection particle system on Z. http://arxiv.org/abs/0810.5567v3 [math.PR], 2008
5. Brunet, É., Derrida, B., Mueller, A.H., Munier, S.: Effect of selection on ancestry: an exactly soluble case and its phenomenological generalization. Phys. Rev. E (3) 76(4), 041104 (2007)
6. Brunet, E., Derrida, B.: Shift in the velocity of a front due to a cutoff. Phys. Rev. E (3) 56(3, part A), 2597–2604 (1997)
7. Brunet, É., Derrida, B.: Microscopic models of traveling wave equations. Computer Phys. Commun. 121–122, 376–381 (1999)
8. Brunet, É., Derrida, B.: Effect of microscopic noise on front propagation. J. Stat. Phys. 103(1-2), 269–282 (2001)
9. Conlon, J.G., Doering, C.R.: On travelling waves for the stochastic Fisher-Kolmogorov-Petrovsky-Piscunov equation. J. Stat. Phys. 120(3-4), 421–477 (2005)
10. Derrida, B., Simon, D.: The survival probability of a branching random walk in presence of an absorbing wall. Europhys. Lett. EPL 78(6), Art. 60006, 6 (2007)
11. Dumortier, F., Popović, N., Kaper, T.J.: The critical wave speed for the Fisher-Kolmogorov-Petrowskii-Piscounov equation with cut-off. Nonlinearity 20(4), 855–877 (2007)
12. Durrett, R.: Probability: theory and examples. Belmont, CA: Duxbury Press, second edition, 1996
13. Gantert, N., Hu, Y., Shi, Z.: Asymptotics for the survival probability in a supercritical branching random walk. http://arxiv.org/abs/0811.0262v2 [math.PR], 2008
14. Mueller, C., Mytnik, L., Quastel, J.: Small noise asymptotics of traveling waves. Markov Process. Related Fields 14(3), 333–342 (2008)
15. Mueller, C., Mytnik, L., Quastel, J.: Effect of noise on front propagation in reaction-diffusion equations of KPP type. http://arxiv.org/abs/0902.3423v1 [math.PR], 2009
16. Pemantle, R.: Search cost for a nearly optimal path in a binary tree. Ann. Appl. Prob. 19(4), 1273–1291 (2009)
17. Simon, D., Derrida, B.: Quasi-stationary regime of a branching random walk in presence of an absorbing wall. J. Stat. Phys. 131(2), 203–233 (2008)

Communicated by H. Spohn
Commun. Math. Phys. 298, 343–356 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1060-5
Communications in
Mathematical Physics
Quantum Isometries and Noncommutative Spheres
Teodor Banica1, Debashish Goswami2
1 Department of Mathematics, Cergy-Pontoise University, 95000 Cergy-Pontoise, France. E-mail: [email protected]
2 Theoretical Statistics and Mathematics Unit, 203 Barrackpore Trunk Road, Kolkata 700 108, India. E-mail: [email protected]
Received: 23 May 2009 / Accepted: 16 February 2010 Published online: 13 June 2010 – © Springer-Verlag 2010
Abstract: We introduce and study two new examples of noncommutative spheres: the half-liberated sphere, and the free sphere. Together with the usual sphere, these two spheres have the property that the corresponding quantum isometry group is “easy”, in the representation theory sense. We present as well some general comments on the axiomatization problem, and on the “untwisted” and “non-easy” case.

0. Introduction

The aim of the present paper is to bring some contributions to the theory of noncommutative spheres, by using a number of ideas and tools coming from the recent work on quantum isometry groups [9,21], and on easy quantum groups [6,7].

The noncommutative spheres were introduced by Podleś in [22], as twists of the usual spheres. The natural framework for the study of such noncommutative objects is Connes’ noncommutative geometry [13]. This has led to a systematic study of the associated spectral triples, with the explicit computation of a number of related Riemannian geometric invariants. See Connes and Dubois-Violette [14,15], Connes and Landi [16], Dabrowski, D’Andrea, Landi and Wagner [18].

A useful, alternative point of view comes from the relationship with the quantum groups. The structure of the usual sphere $S^{n-1}$ is intimately related to that of the orthogonal group $O_n$, and when twisting the sphere the orthogonal group gets twisted as well, and becomes a quantum group. See Varilly [23].

The recently developed theory of quantum isometry groups [9,21] provides a good abstract framework for the study of the exact relationship between noncommutative spheres and quantum groups. Some key preliminary results in this direction were obtained in [8], where it was shown that the usual spheres do not have quantum symmetry, and in [10], where the case of the Podleś sphere is studied in detail.

In this paper we present some more general results in this direction, mixing spectral triple and quantum group techniques. The idea is that the “easy” (and “untwisted”)
orthogonal quantum groups, introduced in [6] and classified in [7], are as follows: the orthogonal group $O_n$, the half-liberated orthogonal group $O_n^*$, and the free orthogonal group $O_n^+$. Together with the above considerations, this suggests that in the “untwisted” and “easy” case we should have exactly 3 examples of noncommutative spheres: the usual sphere $S^{n-1}$, a half-liberated sphere $S_*^{n-1}$, and a free sphere $S_+^{n-1}$. We will present here a number of results in this direction, notably by introducing and studying in detail the half-liberated sphere $S_*^{n-1}$, and the free sphere $S_+^{n-1}$.

The paper is organized as follows: in Sections 1–2 we discuss the construction of the three spheres, in Sections 3–5 we study the associated projective spaces and spherical integrals, and in Sections 6–7 we work out the relation with the quantum isometry groups. The final section, 8, contains a few concluding remarks.

1. The Usual Sphere

The simplest example of a noncommutative sphere is the usual sphere $S^{n-1}\subset\mathbb{R}^n$. Our first goal will be to find a convenient functional analytic description of it. In this section we present a series of 5 basic statements in this direction, all well-known, and all to be extended in the next section to the noncommutative setting.

Theorem 1.1. The sphere $S^{n-1}\subset\mathbb{R}^n$ is the spectrum of the $C^*$-algebra
$$A_n=C^*\Big(x_1,\ldots,x_n\ \Big|\ x_i=x_i^*,\ x_ix_j=x_jx_i,\ \sum x_i^2=1\Big)$$
generated by $n$ self-adjoint commuting variables, whose squares sum up to 1.

Proof. The first remark is that the algebra in the statement is indeed well-defined, due to the condition $\sum x_i^2=1$, which shows that we have $\|x_i\|\le 1$ for any $C^*$-norm. Consider the algebra $A_n'=C(S^{n-1})$, with the standard coordinates denoted $x_i'$. By the universal property of $A_n$ we have a morphism $A_n\to A_n'$ mapping $x_i\to x_i'$. On the other hand, by the Gelfand theorem we have $A_n=C(X)$ for a certain compact space $X$, and we can define a map $X\to S^{n-1}$ by $p\to(x_1(p),\ldots,x_n(p))$. By transposing we get a morphism $A_n'\to A_n$ mapping $x_i'\to x_i$, and we are done.

The second ingredient that we will need is a functional analytic description of the action of the orthogonal group $O_n$ on $S^{n-1}$. This action can be seen as follows.

Theorem 1.2. We have a coaction $\Phi:A_n\to C(O_n)\otimes A_n$ given by
$$\Phi(x_i)=\sum_j u_{ij}\otimes x_j,$$
where $u_{ij}\in C(O_n)$ are the standard matrix coordinates: $u_{ij}(g)=g_{ij}$.

Proof. Consider indeed the action map $O_n\times S^{n-1}\to S^{n-1}$, given by $(g,p)\to gp$. By transposing we get a coaction map as in the statement.

The uniform measure on $S^{n-1}$ is the unique probability measure which is invariant under the action of $O_n$. In functional analytic terms, the result is as follows.

Theorem 1.3. There is a unique positive unital trace $tr:A_n\to\mathbb{C}$ satisfying the invariance condition
$$(\mathrm{id}\otimes tr)\Phi(x)=tr(x)1.$$
Proof. We can define indeed $tr:A_n\to\mathbb{C}$ to be the integration with respect to the uniform measure on $S^{n-1}$: the positivity condition follows from definitions, and the invariance condition as in the statement follows from $dp=d(gp)$, for any $g\in O_n$. Conversely, it follows from the general theory of the Gelfand correspondence that a trace as in the statement must come from the integration with respect to a probability measure on $S^{n-1}$ which is invariant under $O_n$, and this gives the uniqueness.

Theorem 1.4. The canonical trace $tr:A_n\to\mathbb{C}$ is faithful.

Proof. This follows from the well-known fact that the uniform measure on the sphere takes a nonzero value on any nonempty open set.

Finally, we have the well-known result stating that $S^{n-1}$ can be identified with the first slice of $O_n$. In functional analytic terms, the result is as follows.

Theorem 1.5. The following algebras, with generators and traces, are isomorphic:
(1) The algebra $A_n$, with generators $x_1,\ldots,x_n$, and with the trace functional.
(2) The algebra $B_n\subset C(O_n)$ generated by $u_{11},\ldots,u_{1n}$, with the integration.

Proof. From the universal property of $A_n$ we get a morphism $\pi:A_n\to B_n$ mapping $x_i\to u_{1i}$. The invariance property of the integration functional $I:C(O_n)\to\mathbb{C}$ shows that $tr'=I\pi$ satisfies the invariance condition in Theorem 1.3, so we have $tr=tr'$. Finally, from the faithfulness of $tr$ we get that $\pi$ is an isomorphism, and we are done.

2. Noncommutative Spheres

We are now in a position to introduce two basic examples of noncommutative spheres: the half-liberated sphere, and the free sphere. The idea will be of course to weaken or simply remove the commutativity conditions in Theorem 1.1.

Definition 2.1. We consider the universal $C^*$-algebras
$$A_n^*=C^*\Big(x_1,\ldots,x_n\ \Big|\ x_i=x_i^*,\ x_ix_jx_k=x_kx_jx_i,\ \sum x_i^2=1\Big)$$
$$A_n^+=C^*\Big(x_1,\ldots,x_n\ \Big|\ x_i=x_i^*,\ \sum x_i^2=1\Big)$$
generated by $n$ self-adjoint variables whose squares sum up to 1, subject to the half-commutation relations $abc=cba$, and to no relations at all.

Our next goal will be to find suitable analogues of Theorems 1.2, 1.3, 1.4, 1.5. For this purpose, we will need appropriate “noncommutative versions” of the orthogonal group $O_n$. So, let us consider the following universal algebras:
$$C(O_n^*)=C^*\Big(u_{11},\ldots,u_{nn}\ \Big|\ u_{ij}=u_{ij}^*,\ u_{ij}u_{kl}u_{st}=u_{st}u_{kl}u_{ij},\ u^t=u^{-1}\Big),$$
$$C(O_n^+)=C^*\Big(u_{11},\ldots,u_{nn}\ \Big|\ u_{ij}=u_{ij}^*,\ u^t=u^{-1}\Big).$$
These algebras, introduced in [6,24], are Hopf algebras in the sense of Woronowicz [25,26]. We refer to the recent paper [7] for a full discussion here.

With these definitions in hand, we can state and prove now an analogue of Theorem 1.2. We agree to use the generic notation $A_n^\times$ for the 3 algebras constructed so far.
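The half-commutation relations of Definition 2.1 can be made concrete on small matrices. The following is a minimal sketch, assuming a hypothetical 2×2 model that is not taken from the paper: to a point $z$ of the complex unit sphere we associate the anti-diagonal matrices $x_i$ with entries $z_i$ and $\bar z_i$. The sample point `raw` and the helper names are illustrative choices only. The script checks that these matrices are self-adjoint, that their squares sum up to 1, and that they half-commute without commuting.

```python
import itertools
import math


def mat_mul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )


def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[r][c] - b[r][c]) < tol for r in range(2) for c in range(2))


# Illustrative point z of the complex unit sphere in C^3 (arbitrary values).
raw = [1 + 2j, -0.5 + 0.3j, 2 - 1j]
norm = math.sqrt(sum(abs(w) ** 2 for w in raw))
z = [w / norm for w in raw]

# Hypothetical model (an assumption, not from the source):
# x_i = [[0, z_i], [conj(z_i), 0]].
x = [((0j, zi), (zi.conjugate(), 0j)) for zi in z]

identity = ((1, 0), (0, 1))
squares = [mat_mul(m, m) for m in x]
sum_of_squares = tuple(
    tuple(sum(s[r][c] for s in squares) for c in range(2)) for r in range(2)
)

self_adjoint = all(
    abs(m[r][c] - m[c][r].conjugate()) < 1e-12
    for m in x for r in range(2) for c in range(2)
)
# Half-commutation: abc = cba for all triples of generators.
half_commuting = all(
    close(mat_mul(mat_mul(a, b), c), mat_mul(mat_mul(c, b), a))
    for a, b, c in itertools.product(x, repeat=3)
)
# Plain commutation ab = ba fails for a generic point z.
commuting = all(
    close(mat_mul(a, b), mat_mul(b, a)) for a, b in itertools.product(x, repeat=2)
)

print(self_adjoint, close(sum_of_squares, identity), half_commuting, commuting)
```

In this model the products $x_ix_j$ are diagonal with entries $z_i\bar z_j$ and $\bar z_iz_j$, which is why reversing a word of length three changes nothing while reversing a word of length two generally does.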
Theorem 2.2. We have a coaction $\Phi:A_n^\times\to C(O_n^\times)\otimes A_n^\times$ given by
$$\Phi(x_i)=\sum_j u_{ij}\otimes x_j,$$
where $u_{ij}\in C(O_n^\times)$ are the standard generators.

Proof. We have to construct three maps, and prove that they are coactions. We will deal with all 3 cases at the same time. Consider the following elements:
$$X_i=\sum_j u_{ij}\otimes x_j.$$
These elements are self-adjoint, and their squares sum up to 1. Moreover, in the case where both sets $\{x_i\}$ and $\{u_{ij}\}$ consist of commuting or half-commuting elements, the set $\{X_i\}$ consists by construction of commuting or half-commuting elements. These observations give the existence of maps as in the statement. The fact that these maps satisfy the coaction condition $(\mathrm{id}\otimes\Phi)\Phi=(\Delta\otimes\mathrm{id})\Phi$ is clear from definitions.

Theorem 2.3. There is a unique positive unital trace $tr:A_n^\times\to\mathbb{C}$ satisfying the condition
$$(\mathrm{id}\otimes tr)\Phi(x)=tr(x)1.$$

Proof. Consider the algebra $B_n^\times\subset C(O_n^\times)$ generated by the elements $u_{11},\ldots,u_{1n}$. By the universal property of $A_n^\times$ we have a morphism $\pi:A_n^\times\to B_n^\times$ mapping $x_i\to u_{1i}$, and by composing with the restriction of the Haar functional $I:C(O_n^\times)\to\mathbb{C}$, we obtain a trace satisfying the invariance condition in the statement.

As for the uniqueness part, this is a quite subtle statement, which will ultimately come from a certain “easiness” property of our 3 spheres. Our first claim is that we have the following key formula, where $tr$ is the trace that we have just constructed:
$$(I\otimes\mathrm{id})\Phi=tr(\cdot)1.$$
In order to prove this claim, we use the orthogonal Weingarten formula [2,4,6]. To any integer $k$ let us associate the following sets:
(1) $D_k$: all pairings of $\{1,\ldots,k\}$.
(2) $D_k^*$: all pairings with an even number of crossings of $\{1,\ldots,k\}$.
(3) $D_k^+$: all noncrossing pairings of $\{1,\ldots,k\}$.

The Weingarten formula tells us that the Haar integration over $O_n^\times$ is given by the following sum, where $W_{kn}=G_{kn}^{-1}$, with $G_{kn}(p,q)=n^{loops(p\vee q)}$, and where the $\delta$ symbols are 1 if all the strings join pairs of equal indices, and 0 if not:
$$I(u_{i_1j_1}\ldots u_{i_kj_k})=\sum_{p,q\in D_k^\times}\delta_p(i)\delta_q(j)W_{kn}(p,q).$$
So, let us go back now to our claim. By linearity it is enough to check the equality on a product of basic generators $x_{i_1}\ldots x_{i_k}$. The left term is as follows:
$$(I\otimes\mathrm{id})\Phi(x_{i_1}\ldots x_{i_k})=\sum_{j_1\ldots j_k}I(u_{i_1j_1}\ldots u_{i_kj_k})\,x_{j_1}\ldots x_{j_k}$$
$$=\sum_{j_1\ldots j_k}\ \sum_{p,q\in D_k^\times}\delta_p(i)\delta_q(j)W_{kn}(p,q)\,x_{j_1}\ldots x_{j_k}$$
$$=\sum_{p,q\in D_k^\times}\delta_p(i)W_{kn}(p,q)\sum_{j_1\ldots j_k}\delta_q(j)\,x_{j_1}\ldots x_{j_k}.$$
Let us look now at the last sum on the right. In the free case we have to sum quantities of type $x_{j_1}\ldots x_{j_k}$, over all choices of multi-indices $j$ which fit into our given noncrossing pairing $q$, and just by using the condition $\sum x_i^2=1$, we conclude that the sum is 1. The same happens in the classical case, with the changes that our pairing $q$ can now be crossing, but we can use now the commutation relations $x_ix_j=x_jx_i$. Finally, the same happens as well in the half-liberated case, because the fact that our pairing $q$ has now an even number of crossings allows us to use the half-commutation relations $x_ix_jx_k=x_kx_jx_i$, in order to conclude that the sum to be computed is 1. Summarizing, in all cases the sum on the right is 1, so we get:
$$(I\otimes\mathrm{id})\Phi(x_{i_1}\ldots x_{i_k})=\sum_{p,q\in D_k^\times}\delta_p(i)W_{kn}(p,q)1.$$
On the other hand, another application of the Weingarten formula gives:
$$tr(x_{i_1}\ldots x_{i_k})1=I(u_{1i_1}\ldots u_{1i_k})1=\sum_{p,q\in D_k^\times}\delta_p(1)\delta_q(i)W_{kn}(p,q)1=\sum_{p,q\in D_k^\times}\delta_q(i)W_{kn}(p,q)1.$$
Since the Weingarten function is symmetric in $p,q$, this finishes the proof of our claim.

So, let us get back now to the original question. Let $\tau:A_n^\times\to\mathbb{C}$ be a trace satisfying the invariance condition in the statement. We have:
$$\tau(I\otimes\mathrm{id})\Phi(x)=(I\otimes\tau)\Phi(x)=I(\mathrm{id}\otimes\tau)\Phi(x)=I(\tau(x)1)=\tau(x).$$
On the other hand, according to our above claim, we have as well:
$$\tau(I\otimes\mathrm{id})\Phi(x)=\tau(tr(x)1)=tr(x).$$
Thus we get $\tau=tr$, which finishes the proof.

We do not have an analogue of Theorem 1.4, and it is best to proceed as follows.

Definition 2.4. We agree to replace from now on $A_n^\times$ with its GNS completion with respect to the canonical trace $tr:A_n^\times\to\mathbb{C}$.

We actually believe that the canonical trace is faithful on the algebraic part, so that our replacement is basically not needed, but we don’t have a proof for this fact.

Theorem 2.5. The following algebras, with generators and traces, are isomorphic:
(1) The algebra $A_n^\times$, with generators $x_1,\ldots,x_n$, and with the trace functional.
(2) The algebra $B_n^\times\subset C(O_n^\times)$ generated by $u_{11},\ldots,u_{1n}$, with the integration.

Proof. Consider the map $\pi:A_n^\times\to B_n^\times$, already constructed in the proof of Theorem 2.3. The invariance property of the integration functional $I:C(O_n^\times)\to\mathbb{C}$ shows that $tr'=I\pi$ satisfies the invariance condition in Theorem 2.3, so we have $tr=tr'$. Together with the positivity of $tr$ and with the basic properties of the GNS construction, this shows that $\pi$ is an isomorphism, and we are done.
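The step in the proof of Theorem 2.3 where the inner sum over multi-indices collapses to 1 can be illustrated numerically. The sketch below assumes a hypothetical pair of 2×2 matrices (scaled Pauli matrices, chosen here for illustration and not taken from the paper) that are self-adjoint, have squares summing to 1, and do not commute. For a noncrossing pairing the sum over fitting multi-indices is the identity, using only the condition on the squares; for the crossing pairing it is not, which is why the free case is restricted to noncrossing pairings.

```python
import itertools
import math


def mat_mul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )


def pairing_sum(xs, pairing):
    """Sum of x_{j_1}...x_{j_k} over all multi-indices j that take the same
    value on the two legs of every string of `pairing` (0-based positions)."""
    k = 2 * len(pairing)
    total = [[0.0, 0.0], [0.0, 0.0]]
    for labels in itertools.product(range(len(xs)), repeat=len(pairing)):
        j = [0] * k
        for (a, b), label in zip(pairing, labels):
            j[a] = j[b] = label
        word = ((1.0, 0.0), (0.0, 1.0))
        for index in j:
            word = mat_mul(word, xs[index])
        for r in range(2):
            for c in range(2):
                total[r][c] += word[r][c]
    return total


# Hypothetical generators (an assumption for illustration):
# x1 = cos(t) * sigma_x, x2 = sin(t) * sigma_z, so that x1^2 + x2^2 = 1.
t = 0.4
x1 = ((0.0, math.cos(t)), (math.cos(t), 0.0))
x2 = ((math.sin(t), 0.0), (0.0, -math.sin(t)))
xs = [x1, x2]

noncrossing = pairing_sum(xs, [(0, 3), (1, 2)])  # strings 1-4 and 2-3
crossing = pairing_sum(xs, [(0, 2), (1, 3)])     # strings 1-3 and 2-4

print(noncrossing)  # the identity matrix, up to rounding
print(crossing)     # cos(2t)^2 times the identity, up to rounding
```

For these particular matrices the crossing sum equals $\cos^2(2t)$ times the identity, since the two generators anticommute; other non-commuting choices would give other values, but in general not 1.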
3. Projective Spaces

In this section we study the projective spaces associated to our noncommutative spheres. Let us first recall that the projective space over a field $F$ is by definition $(F^n-\{0\})/\sim$, where $x\sim y$ when $y=\lambda x$ for some nonzero $\lambda\in F$. We will use the notation $P^{n-1}$ for the real projective space, and $P_c^{n-1}$ for the complex projective space. Let us introduce the following definition.

Definition 3.1. We denote by $C_n^\times$ the subalgebra $\langle x_ix_j\rangle\subset A_n^\times$, taken together with the restriction of the canonical trace.

The noncommutative projective space that we are interested in is by definition the spectrum of $C_n^\times$, viewed as a noncommutative compact measured space. As a first remark, in the classical case we get indeed the real projective space.

Theorem 3.2. We have $C_n=C(P^{n-1})$.

Proof. First, since each product of coordinates $x_ix_j:S^{n-1}\to\mathbb{R}$ takes equal values on $p$ and $-p$, this product can be regarded as being a function on $P^{n-1}$. Now since the collection of functions $\{x_ix_j\}$ separates the points of $P^{n-1}$, we get the result.

Quite surprisingly, in the half-liberated case we get the complex projective space.

Theorem 3.3. We have $C_n^*=C(P_c^{n-1})$.

Proof. First, the half-commutation relations $abc=cba$ give $abcd=cbad=cdab$ for any $a,b,c,d\in\{x_1,\ldots,x_n\}$, so the elements $x_ix_j$ commute indeed with each other. We have to prove that the Gelfand spectrum of $C_n^*=\langle x_ix_j\rangle$ is isomorphic to $P_c^{n-1}$.

For this purpose, we use the isomorphism $PO_n^*\simeq PU_n$ established in [7]. This isomorphism is given by $u_{ij}u_{kl}\to v_{ij}v_{kl}^*$, where $u,v$ denote respectively the fundamental corepresentations of $O_n^*,U_n$. By restricting attention to the first row of coordinates, this gives an embedding $C_n^*\subset C(PU_n)$, mapping $x_ix_j\to v_{1i}v_{1j}^*$.

Consider now the complex sphere $S_c^{n-1}\subset\mathbb{C}^n$, with coordinates denoted $z_1,\ldots,z_n$. By performing the standard identification $v_{1i}=z_i$, coming from the unitary version of Theorem 1.5, we obtain an embedding $C_n^*\subset C(S_c^{n-1})$, mapping $x_ix_j\to z_i\bar z_j$. The image of this embedding is the subalgebra of $C(S_c^{n-1})$ generated by the functions $z_i\bar z_j$. By using the same argument as in the real case, this gives the result.

In view of the above results, it is tempting to conjecture that the “threefold way” that we are currently developing for quantum groups, noncommutative spheres and noncommutative projective spaces is actually part of the usual real/complex/quaternionic “threefold way”, originally discovered by Frobenius, and known to play a fundamental role in mathematical physics, according to Dyson’s paper [20]. We have here the following question.

Question 3.4. Do we have $C_n^+=C(P_k^{n-1})$?

This is of course quite a vague question. The symbol $P_k^{n-1}$ on the right is supposed to correspond to some kind of tricky “quaternionic projective space”. Note that the space $P_k^{n-1}=(K^n-\{0\})/\sim$ appearing in the existing literature won’t be suitable for our purposes, simply because the algebra $C_n^+$ on the left is noncommutative. What we would need is rather a “twist”, in the spirit of the quantum projective spaces in [17].
The main problem here is to construct a representation of $C_n^+$, by using the Pauli matrices. With a bit of luck, this representation can be shown to be faithful, and the corresponding result can be interpreted as answering the above question.

A first piece of evidence comes from [3], where a certain faithful representation of $C(S_4^+)$ is constructed, by using the Pauli matrices. This is probably quite different from what we need, but the main technical fact, namely that “the combinatorics of the noncrossing partitions can be implemented by the Pauli matrices”, is already there.

A second piece of evidence comes from the results in [7], which suggest that the free quantum groups might be actually supergroups. Once again, this kind of argument is quite speculative, and maybe a bit far away from the present considerations.

Let us end this section by recording a few modest facts about $C_n^+$.

Proposition 3.5. The algebras $C_n^+$ are as follows:
(1) At $n=2$ we have $C_2^+=C_2^*$.
(2) At $n\ge 3$ we have $C_n^+\neq C_n^*$.
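Part (2) rests on the existence of three positive 2×2 matrices that sum up to 1 and do not pairwise commute, as used in the proof. The following minimal sketch checks one such choice numerically. The specific values ($p_i=q_i=1/3$ and the off-diagonal entries in `a_values`) are illustrative assumptions, not taken from the paper, and the closed-form square root for positive definite 2×2 matrices is a standard consequence of the Cayley–Hamilton theorem.

```python
import math


def mat_mul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )


def psd_sqrt(y):
    """Square root of a positive definite 2x2 matrix Y, computed as
    (Y + s*1) / t with s = sqrt(det Y) and t = sqrt(tr Y + 2*s)."""
    det = (y[0][0] * y[1][1] - y[0][1] * y[1][0]).real
    s = math.sqrt(det)
    t = math.sqrt((y[0][0] + y[1][1]).real + 2 * s)
    return (
        ((y[0][0] + s) / t, y[0][1] / t),
        (y[1][0] / t, (y[1][1] + s) / t),
    )


# Illustrative data (assumptions): p_i = q_i = 1/3, small a_i summing to 0.
eps = 0.1
a_values = [complex(eps, 0), complex(0, eps), complex(-eps, -eps)]
ys = [
    ((complex(1 / 3), a), (a.conjugate(), complex(1 / 3)))
    for a in a_values
]
xs = [psd_sqrt(y) for y in ys]  # self-adjoint square roots X_i of the Y_i

# Sum of the squares X_i^2, which should be the identity matrix.
squares = [mat_mul(m, m) for m in xs]
sum_of_squares = [
    [sum(s[r][c] for s in squares) for c in range(2)] for r in range(2)
]

# Size of the commutator [Y_1, Y_2]; a nonzero value shows that the
# elements x_1^2 and x_2^2 do not commute in this representation.
y12 = mat_mul(ys[0], ys[1])
y21 = mat_mul(ys[1], ys[0])
commutator_size = max(abs(y12[r][c] - y21[r][c]) for r in range(2) for c in range(2))

print(sum_of_squares)
print(commutator_size)
```

With these values the commutator of $Y_1$ and $Y_2$ has entries of modulus $2\varepsilon^2$, so any sufficiently small nonzero $\varepsilon$ would do equally well.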
Proof. (1) This follows either from the isomorphism $O_2^+=O_2^*$ established in [7], or directly from definitions, by using the fact that $C_2^+$ is commutative.

(2) It is enough here to prove that $C_3^+$ is not commutative. For this purpose, we will use the positive matrices in $M_2(\mathbb{C})$. These are matrices of the following form:
$$Y=\begin{pmatrix}p&a\\ \bar a&q\end{pmatrix}.$$
Here $p,q\in\mathbb{R}$ and $a\in\mathbb{C}$ must be chosen such that both eigenvalues are positive, and this happens for instance when $p,q>0$ and $a\in\mathbb{C}$ is small enough.

Let us fix some numbers $p_i,q_i>0$ for $i=1,2,3$, satisfying $\sum p_i=\sum q_i=1$. For any choice of small complex numbers $a_i\in\mathbb{C}$ satisfying $\sum a_i=0$, the corresponding elements $Y_i$ constructed as above will be positive, and will sum up to 1. Moreover, by carefully choosing the $a_i$’s, we can arrange for $Y_1,Y_2,Y_3$ not to pairwise commute.

Consider now the matrices $X_i=\sqrt{Y_i}$. These are all self-adjoint, and their squares sum up to 1, so we get a representation $A_3^+\to M_2(\mathbb{C})$ mapping $x_i\to X_i$. Now this representation restricts to a representation $C_3^+\to M_2(\mathbb{C})$ mapping $x_i^2\to Y_i$, and since the $Y_i$’s don’t commute, it follows that $C_3^+$ is not commutative, and we are done.

The above result suggests the following extra question regarding $C_n^+$: what is the Gelfand spectrum of the algebra $C_n^+/I$, where $I\subset C_n^+$ is the commutator ideal? Observe that the canonical arrow $C_n^+\to C_n^*$ and Theorem 3.3 tell us that this Gelfand spectrum must contain $P_c^{n-1}$. Moreover, Proposition 3.5 shows that at $n=2$ this inclusion is an equality. However, at $n=3$ already the answer is not clear.

4. Probabilistic Aspects

We know from the previous sections that we have three basic examples of “noncommutative spheres”, namely those corresponding to the algebras $A_n,A_n^*,A_n^+$. In this section and in the next one we investigate the key problem of computing the integral over these noncommutative spheres of polynomial quantities of type $x_{i_1}\ldots x_{i_k}$.

Definition 4.1. The polynomial spherical integrals will be denoted
$$I=\int_{S_\times^{n-1}}x_{i_1}\ldots x_{i_k}\,dx$$
with this quantity standing for the complex number obtained as an image of the well-defined element $x_{i_1}\ldots x_{i_k}\in A_n^\times$ by the well-defined trace functional $tr:A_n^\times\to\mathbb{C}$.

The problem of computing such integrals has been heavily investigated in the last years, and a number of results are available from [2,5,12,19]. In what follows we will make a brief presentation of this material, by focusing of course on the applications to $S_\times^{n-1}$. We will present as well some new results, in the half-liberated case.

Let us begin our study with an elementary result.

Proposition 4.2. We have the formula
$$\int_{S_\times^{n-1}}x_{i_1}\ldots x_{i_k}\,dx=0$$
unless each $x_i$ appears an even number of times.

Proof. This follows from the fact that for any $i$ we have an automorphism of $A_n^\times$ given by $x_i\to-x_i$. Indeed, this automorphism must preserve the trace, so if $x_i$ appears an odd number of times, the integral in the statement satisfies $I=-I$, so $I=0$.

The basic tool for computing spherical integrals is the Weingarten formula. Let us recall from Sect. 2 that associated to any integer $k$ are the following sets:
(1) $D_k$: all pairings of $\{1,\ldots,k\}$.
(2) $D_k^*$: all pairings with an even number of crossings of $\{1,\ldots,k\}$.
(3) $D_k^+$: all noncrossing pairings of $\{1,\ldots,k\}$.

These sets can be regarded as being associated to our spheres $S_\times^{n-1}$, because they come from the representation theory of the associated quantum groups $O_n^\times$.

Theorem 4.3. We have the Weingarten formula
$$\int_{S_\times^{n-1}}x_{i_1}\ldots x_{i_k}\,dx=\sum_{p,q\in D_k^\times}\delta_p(i)W_{kn}(p,q),$$
where $W_{kn}=G_{kn}^{-1}$, with $G_{kn}(p,q)=n^{loops(p\vee q)}$, and where the $\delta$ symbol is 1 if all the strings of $p$ join pairs of equal indices of $i=(i_1,\ldots,i_k)$, and is 0 if not.

Proof. This follows from the Weingarten formula in [2,4,12], via the identification in Theorem 2.5, and from the fact that the Weingarten matrix is symmetric in $p,q$.

As a first application, we have the following result.

Theorem 4.4. With $n\to\infty$, the standard coordinates of $S_\times^{n-1}$ are as follows:
(1) Classical case: real Gaussian, independent.
(2) Half-liberated case: symmetrized Rayleigh, their squares being independent.
(3) Free case: semicircular, free.

Proof. This follows from Theorem 4.3 and from the fact that $W_{kn}$ is asymptotically diagonal, see [2,6]. The only new assertion is the independence one in (2), which can be proved as in [6], by using the fact that the mixed cumulants vanish. Note that the independence in (1,2) follows as well from the exact formulae in Theorems 5.1 and 5.2 below, by letting $n\to\infty$ and by using the Stirling formula.
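The Weingarten formula of Theorem 4.3 is directly computable for small $k$. The sketch below is a minimal illustration in exact rational arithmetic for the classical and free cases only; the half-liberated set $D_k^*$ is left out. All function names are hypothetical helpers introduced here, and indices are 0-based. It builds the Gram matrix $G_{kn}(p,q)=n^{loops(p\vee q)}$, inverts it, and sums the entries selected by $\delta_p(i)$.

```python
from fractions import Fraction
import itertools


def pairings(points):
    """All pairings of a sorted list of points, as tuples of pairs (a, b), a < b."""
    if not points:
        return [()]
    first, rest = points[0], points[1:]
    result = []
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for tail in pairings(remaining):
            result.append(((first, partner),) + tail)
    return result


def is_noncrossing(p):
    """True if no two strings (a, b), (c, d) of p satisfy a < c < b < d."""
    return not any(a < c < b < d for (a, b), (c, d) in itertools.permutations(p, 2))


def loops(p, q, k):
    """Number of blocks of the join of the pairings p and q (union-find)."""
    parent = list(range(k))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in p + q:
        parent[find(a)] = find(b)
    return len({find(v) for v in range(k)})


def inverse(m):
    """Gauss-Jordan inverse of a square matrix of Fractions."""
    size = len(m)
    aug = [list(row) + [Fraction(int(r == c)) for c in range(size)]
           for r, row in enumerate(m)]
    for col in range(size):
        pivot = next(r for r in range(col, size) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(size):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[size:] for row in aug]


def spherical_integral(indices, n, noncrossing_only=False):
    """Sum over p, q of delta_p(indices) * W(p, q), as in Theorem 4.3.
    noncrossing_only=False is the classical case, True the free case."""
    k = len(indices)
    d = pairings(list(range(k)))
    if noncrossing_only:
        d = [p for p in d if is_noncrossing(p)]
    gram = [[Fraction(n) ** loops(p, q, k) for q in d] for p in d]
    w = inverse(gram)
    delta = [all(indices[a] == indices[b] for a, b in p) for p in d]
    return sum(w[r][c] for r in range(len(d)) if delta[r] for c in range(len(d)))


print(spherical_integral((0, 0, 0, 0), 5))                         # 3/35
print(spherical_integral((0, 0, 0, 0), 5, noncrossing_only=True))  # 1/15
```

For $k=4$ these values agree with the closed forms $3/(n(n+2))$ in the classical case and $2/(n(n+1))$ in the free case, which can be obtained by inverting the 3×3 and 2×2 Gram matrices by hand.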
5. Spherical Integrals

We discuss in this section a quite subtle problem, of theoretical physics flavor, namely the exact computation of the polynomial integrals over $S_\times^{n-1}$.

In the classical case, we have the following well-known result.

Theorem 5.1. The spherical integral of $x_{i_1}\ldots x_{i_k}$ vanishes, unless each $a\in\{1,\ldots,n\}$ appears an even number of times in the sequence $i_1,\ldots,i_k$. If $l_a$ denotes this number of occurrences, then
$$\int_{S^{n-1}}x_{i_1}\ldots x_{i_k}\,dx=\frac{(n-1)!!\,l_1!!\ldots l_n!!}{(n+\sum l_i-1)!!}$$
with the notation $m!!=(m-1)(m-3)(m-5)\ldots$.

Proof. The first assertion follows from Proposition 4.2. The second assertion is well-known, and can be proved by using spherical coordinates, the Fubini theorem, and some standard partial integration tricks. See e.g. [4].

In the case of the half-liberated sphere, we have the following result.

Theorem 5.2. The half-liberated spherical integral of $x_{i_1}\ldots x_{i_k}$ vanishes, unless each number $a\in\{1,\ldots,n\}$ appears the same number of times at odd and at even positions in the sequence $i_1,\ldots,i_k$. If $l_a$ denotes this number of occurrences, then:
$$\int_{S_*^{n-1}}x_{i_1}\ldots x_{i_k}\,dx=\frac{(n-1)!\,l_1!\ldots l_n!}{(n+\sum l_i-1)!}.$$

Proof. First, by using Proposition 4.2 we see that the integral $I$ in the statement vanishes, unless $k=2l$ is even. So, assume that we are in the non-vanishing case. By using Theorem 3.3 the corresponding integral over the complex projective space $P_c^{n-1}$ can be viewed as an integral over the complex sphere $S_c^{n-1}$, as follows:
$$I=\int_{S_c^{n-1}}z_{i_1}\bar z_{i_2}\ldots z_{i_{2l-1}}\bar z_{i_{2l}}\,dz.$$
Now by using the same argument as in the proof of Proposition 4.2, but this time with transformations of type $p\to\lambda p$ with $|\lambda|=1$, we see that $I$ vanishes, unless each $z_a$ appears as many times as $\bar z_a$ does, and this gives the first assertion.

Assume now that we are in the non-vanishing case. Then the $l_a$ copies of $z_a$ and the $l_a$ copies of $\bar z_a$ produce by multiplication a factor $|z_a|^{2l_a}$, so we have:
$$I=\int_{S_c^{n-1}}|z_1|^{2l_1}\ldots|z_n|^{2l_n}\,dz.$$
Now by using the standard identification $S_c^{n-1}\simeq S^{2n-1}$, we get:
$$I=\int_{S^{2n-1}}(x_1^2+y_1^2)^{l_1}\ldots(x_n^2+y_n^2)^{l_n}\,d(x,y)$$
$$=\sum_{r_1\ldots r_n}\binom{l_1}{r_1}\ldots\binom{l_n}{r_n}\int_{S^{2n-1}}x_1^{2l_1-2r_1}y_1^{2r_1}\ldots x_n^{2l_n-2r_n}y_n^{2r_n}\,d(x,y).$$
By using the formula in Theorem 5.1, together with the identities $(2m)!!=(2m)!/(2^mm!)$ and $(2m-1)!!=2^{m-1}(m-1)!$, which hold for the above double factorial notation, we get:
$$I=\sum_{r_1\ldots r_n}\binom{l_1}{r_1}\ldots\binom{l_n}{r_n}\frac{(2n-1)!!\,(2r_1)!!\ldots(2r_n)!!\,(2l_1-2r_1)!!\ldots(2l_n-2r_n)!!}{(2n+2\sum l_i-1)!!}$$
$$=\sum_{r_1\ldots r_n}\binom{l_1}{r_1}\ldots\binom{l_n}{r_n}\frac{(n-1)!\,(2r_1)!\ldots(2r_n)!\,(2l_1-2r_1)!\ldots(2l_n-2r_n)!}{4^{\sum l_i}\,(n+\sum l_i-1)!\,r_1!\ldots r_n!\,(l_1-r_1)!\ldots(l_n-r_n)!}.$$
We can rewrite the sum on the right in the following way:
$$I=\sum_{r_1\ldots r_n}\frac{l_1!\ldots l_n!\,(n-1)!\,(2r_1)!\ldots(2r_n)!\,(2l_1-2r_1)!\ldots(2l_n-2r_n)!}{4^{\sum l_i}\,(n+\sum l_i-1)!\,(r_1!\ldots r_n!\,(l_1-r_1)!\ldots(l_n-r_n)!)^2}$$
$$=\frac{(n-1)!\,l_1!\ldots l_n!}{4^{\sum l_i}\,(n+\sum l_i-1)!}\sum_{r_1}\binom{2r_1}{r_1}\binom{2l_1-2r_1}{l_1-r_1}\ldots\sum_{r_n}\binom{2r_n}{r_n}\binom{2l_n-2r_n}{l_n-r_n}.$$
The sums on the right being $4^{l_1},\ldots,4^{l_n}$, we get the formula in the statement.

In the case of the free sphere, we already know from Theorem 4.4 that the standard coordinates $x_1,\ldots,x_n$ are asymptotically semicircular and free. However, the computation of their joint law for a fixed value of $n$ is a well-known open problem, of remarkable difficulty. The point is that the Gram matrix $G_{kn}$, which is nothing but Di Francesco’s “meander matrix” in [19], cannot be diagonalized explicitly. The best result in this direction that is known so far is as follows.

Theorem 5.3. The moments of the free hyperspherical law are given by
$$\int_{S_+^{n-1}}x_1^{2l}\,dx=\frac{1}{(n+1)^l}\cdot\frac{q+1}{q-1}\cdot\frac{1}{l+1}\sum_{r=-l-1}^{l+1}(-1)^r\binom{2l+2}{l+r+1}\frac{r}{1+q^r},$$
where $q\in[-1,0)$ is given by $q+q^{-1}=-n$.

Proof. This is proved in [5], the idea being that $x_1\in A_n^+$ can be modelled by a certain variable over $SU_2^q$, which can be studied by using advanced calculus methods.

Our question is whether Theorem 5.1 and Theorem 5.2 have a free analogue.

Question 5.4. Does the liberated spherical integral
$$\int_{S_+^{n-1}}x_{i_1}\ldots x_{i_k}\,dx$$
appear as a “free analogue” of the quantities computed in Theorems 5.1 and 5.2?

The answer here is very unclear, even in the case where the indices $i_1,\ldots,i_k$ are all equal. In fact, the above question is probably closely related to Question 3.4.

Let us also mention that the meander determinant computed by Di Francesco in [19], which appears as the denominator of the abstract Weingarten-theoretical fraction expressing the integral in Question 5.4, is a product of Chebycheff polynomials. Our question is whether some “magic” simplification appears when computing the fraction. In fact, our Questions 3.4 and 5.4 should be regarded as a slight, very speculative advance on the conceptual understanding of the various formulae in [5,19].
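The ingredients of the proof of Theorem 5.2 lend themselves to a quick exact check. The sketch below is a minimal illustration with hypothetical helper names: it implements the double factorial notation $m!!=(m-1)(m-3)\ldots$ of Theorem 5.1, evaluates the integral of $|z_1|^{2l_1}\ldots|z_n|^{2l_n}$ over the complex sphere by binomial expansion over $S^{2n-1}$, and verifies the binomial identity behind the sums equal to $4^{l}$. As an independent sanity check, the integrals of $|z_a|^2$ must add up to 1, because the squared moduli sum to 1 on the sphere.

```python
from fractions import Fraction
from math import comb
import itertools


def dfact(m):
    """The notation m!! = (m-1)(m-3)(m-5)... used in Theorem 5.1,
    with the product stopping at 1 or 2 (empty product = 1)."""
    result = 1
    t = m - 1
    while t > 1:
        result *= t
        t -= 2
    return result


def real_sphere_integral(exponents):
    """Theorem 5.1: integral of x_1^{l_1}...x_n^{l_n} over S^{n-1},
    where exponents = (l_1, ..., l_n)."""
    if any(l % 2 for l in exponents):
        return Fraction(0)
    n = len(exponents)
    numerator = dfact(n - 1)
    for l in exponents:
        numerator *= dfact(l)
    return Fraction(numerator, dfact(n + sum(exponents) - 1))


def complex_sphere_integral(ls):
    """Integral of |z_1|^{2 l_1}...|z_n|^{2 l_n} over the complex sphere,
    obtained by expanding (x_a^2 + y_a^2)^{l_a} and integrating over S^{2n-1}."""
    total = Fraction(0)
    for rs in itertools.product(*(range(l + 1) for l in ls)):
        coefficient = 1
        exponents = []
        for l, r in zip(ls, rs):
            coefficient *= comb(l, r)
            exponents += [2 * (l - r), 2 * r]
        total += coefficient * real_sphere_integral(exponents)
    return total


def central_binomial_sum(l):
    """Sum over r of C(2r, r) * C(2l - 2r, l - r)."""
    return sum(comb(2 * r, r) * comb(2 * l - 2 * r, l - r) for r in range(l + 1))


print(all(central_binomial_sum(l) == 4 ** l for l in range(10)))
print(complex_sphere_integral((1, 0, 0)))  # 1/3, as forced by symmetry
print(complex_sphere_integral((2, 0)))
```

The expansion is exponential in the number of nonzero exponents, so this is only meant for small test cases.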
6. Spectral Triples

In the remainder of this paper, our goal will be to study the “differential structure” of the noncommutative spheres $S_\times^{n-1}$. Besides being of independent theoretical interest, this study will lead via the results in [7,9] to a “global look” at our 3 spheres.

The natural framework for the study of noncommutative objects like $S_\times^{n-1}$ is Connes’ noncommutative geometry [13], where the basic definition is as follows.

Definition 6.1. A compact spectral triple $(A,H,D)$ consists of the following:
(1) $A$ is a unital $C^*$-algebra.
(2) $H$ is a Hilbert space, on which $A$ acts.
(3) $D$ is a (typically unbounded) self-adjoint operator on $H$, with compact resolvents, such that $[D,a]$ has a bounded extension, for any $a$ in a dense $*$-subalgebra (say $\mathcal{A}$) of $A$.

This definition is of course over-simplified, as to best fit with the purposes of the present paper. We refer to [13] for the exact formulation of the axioms.

In what follows we will be mainly interested in the sphere $S^{n-1}$, and in its noncommutative versions $S_*^{n-1}$ and $S_+^{n-1}$. These objects are all quite simple, geometrically speaking, and we will make only moderate use of the general machinery in [13].

Our guiding examples, all very basic, will be as follows.

Proposition 6.2. Associated to a compact Riemannian manifold $M$ are the following spectral triples $(A,H,D)$, with $A$ and $\mathcal{A}$ being the algebra of continuous functions and that of smooth functions on $M$ respectively:
(1) $H$ is the space of square-integrable spinors, and $D$ is the Dirac operator.
(2) $H$ is the space of forms on $M$, and $D$ is the Hodge-Dirac operator $d+d^*$.
(3) $H=L^2(M,dv)$, $dv$ being the Riemannian volume, and $D=\sqrt{d^*d}$.

Here in the first example $M$ is of course assumed to be a spin manifold. The fact that all the above triples satisfy Connes’ axioms in Definition 6.1 comes from certain standard results in global differential geometry, and we refer here to [13] and references therein.

Let us also remark that the third example, though rather uninteresting from the viewpoint of algebraic topology or K-theory, contains all the useful information about the Riemannian geometry of the manifold, like the volume or the curvature.

Let us go back now to our 3 noncommutative spheres, described by the algebras $A_n^\times$ in the previous sections. It is technically convenient at this point to slightly enlarge our formalism, by starting with the following “minimal” set of axioms.

Definition 6.3. A spherical algebra is a $C^*$-algebra $A$, given with a family of generators $x_1,\ldots,x_n$ and with a faithful positive unital trace $tr:A\to\mathbb{C}$, such that:
(1) $x_1,\ldots,x_n$ are self-adjoint.
(2) $x_1^2+\cdots+x_n^2=1$.
(3) $tr(x_i)=0$, for any $i$.

As a first observation, each $A_n^\times$ is indeed a spherical algebra in the above sense.

We know that for $A_n=C(S^{n-1})$, there are at least 3 spectral triples that can be constructed, namely those in Proposition 6.2. In the case of $A_n^*,A_n^+$, however, or more generally in the case of an arbitrary spherical algebra, the situation with the first two constructions is quite unclear, and the third construction will be our model.

We agree to view the identity 1 as a length 0 word in the generators $x_1,\ldots,x_n$.
Theorem 6.4. Associated to any spherical algebra $A=\langle x_1,\ldots,x_n\rangle$ is the compact spectral triple $(A,H,D)$, where the dense subalgebra $\mathcal{A}$ is the linear span of all the finite words in the generators $x_i$, and $D$ acting on $H=L^2(A,tr)$ is defined as follows:
(1) Let $H_k=\mathrm{span}(x_{i_1}\ldots x_{i_r}\mid i_1,\ldots,i_r\in\{1,\ldots,n\},\ r\le k)$.
(2) Let $E_k=H_k\cap H_{k-1}^\perp$, so that $H=\oplus_{k=0}^\infty E_k$.
(3) We set $Dx=kx$, for any $x\in E_k$.

Proof. We have to show that $[D,T_i]$ is bounded, where $T_i$ is the left multiplication by $x_i$. Since $x_i\in A$ is self-adjoint, so is the corresponding operator $T_i$. Now since $T_i(H_k)\subset H_{k+1}$, by self-adjointness we get $T_i(H_k^\perp)\subset H_{k-1}^\perp$. Thus we have:
$$T_i(E_k)\subset E_{k-1}\oplus E_k\oplus E_{k+1}.$$
This gives a decomposition of type $T_i=T_i^{-1}+T_i^0+T_i^1$. It is routine to check that we have $[D,T_i^\alpha]=\alpha T_i^\alpha$ for any $\alpha\in\{-1,0,1\}$, and this gives the result.

As a first example, in the classical case the situation is as follows.

Theorem 6.5. For the algebra $A_n=C(S^{n-1})$, the spectral triple constructed in Theorem 6.4 essentially coincides with the one described in Proposition 6.2 (3). More precisely, the Dirac operator $D$ of Theorem 6.4 is related to $\sqrt{d^*d}$ by the bijective correspondence $D=f(\sqrt{d^*d})$, where
$$f(s)=1-\frac{n}{2}+\frac{1}{2}\sqrt{4s^2+(n-2)^2},\quad s\in[0,\infty).$$
In particular, the eigenspaces of $D$ and $\sqrt{d^*d}$ coincide.

Proof. This follows from the well-known fact that $d^*d$ diagonalizes as in Theorem 6.4, with the corresponding eigenvalues being $k(k+n-2)$, with $k=0,1,2,\ldots$. Indeed, for $s^2=k(k+n-2)$ we have $4s^2+(n-2)^2=(2k+n-2)^2$, so $f(s)=k$.

7. Quantum Isometries

We know from the previous section that associated to any spherical algebra $A$, and in particular to the algebras $A_n^\times$, is a certain spectral triple $(A,H,D)$. In the classical case $A=A_n$ this spectral triple is the one coming from the operator $D=\sqrt{d^*d}$.

Let us recall now the definition of the quantum isometry groups from [9], slightly modified to fit with our setting.

Let $S=(A,H,D)$ be a spectral triple of compact type, with $H$ assumed to be the GNS space of a certain faithful trace $tr:A\to\mathbb{C}$. Consider the category of compact quantum groups acting on $S$ isometrically, that is, the compact quantum group (say $Q$) must have a unitary representation $U$ on $H$ which commutes with $D$, satisfies $U1_A=1_Q\otimes 1_A$, and is such that $\mathrm{ad}_U$ maps $A$ into itself. If this category has a universal object, then this universal object (which is unique up to isomorphism) will be denoted by $QISO(S)$. See [9] for more details.

Proposition 7.1. Let $A$ be a spherical algebra, and consider the associated spectral triple $S=(A,H,D)$. Then $QISO(S)$ exists.

Proof. The proposition follows from Theorem 2.24 of [9], since the linear space spanned by $1_A$ is an eigenspace of $D$.
Quantum Isometries and Noncommutative Spheres
Theorem 7.2. $QISO(S_\times^{n-1}) = O_n^\times$.

Proof. Consider the standard coaction $\Phi : A_n^\times \to C(O_n^\times) \otimes A_n^\times$. This extends to a unitary representation on the GNS space $H_n^\times$, that we denote by $U$. We have $\Phi(H_k) \subset C(O_n^\times) \otimes H_k$, which reads $U(H_k) \subset H_k$. By unitarity we get as well $U(H_k^\perp) \subset H_k^\perp$, so each $E_k$ is $U$-invariant, and $U$, $D$ must commute. That is, $\Phi$ is isometric with respect to $D$, and $O_n^\times$ must be a quantum subgroup of $QISO(S_\times^{n-1})$.

Assume now that $Q$ is a compact quantum group with a unitary representation $V$ on $H_n^\times$ commuting with $D$, such that $\mathrm{ad}_V$ leaves $A_n^\times$ invariant. Since $D$ has an eigenspace consisting exactly of $x_1, \dots, x_n$, both $V$ and $V^*$ must preserve this subspace, so we can find self-adjoint elements $b_{ij} \in C(Q)$ such that:
$$\mathrm{ad}_V(x_i) = \sum_j b_{ij} \otimes x_j.$$
From the unitarity of $V$, it is also easy to see that $\mathrm{ad}_V$ is trace-preserving, and by using this it follows that $((b_{ij}))$ as well as $((b_{ji}))$ are unitaries. It follows in particular that the antipode $\kappa$ of $Q$ must send $b_{ij}$ to $b_{ji}$. Moreover, using the defining relations satisfied by the $x_i$'s and the fact that $\mathrm{ad}_V$ and $(\kappa \otimes \mathrm{id}) \circ \mathrm{ad}_V$ are $*$-homomorphisms, we can prove that the $b_{ij}$'s will satisfy the same relations as those of the generators $u_{ij}$ of $C(O_n^\times)$. Indeed, for the free case there is nothing to prove, and we have verified such relations for the classical (commutative) case, i.e. for $C(S^{n-1})$, in [8], the proof of which will go through almost verbatim for the half-liberated case too, replacing the words $x_i x_j$ of length two by the length-3 words $x_i x_j x_k$. This shows that $C(Q)$ is a quotient of $C(O_n^\times)$, so $Q$ is a quantum subgroup of $O_n^\times$, and we are done.

There are several questions raised by the above results, concerning the axiomatization of the noncommutative spheres. Perhaps the most important one is the following:

Question 7.3. What conditions on a spherical algebra $A$ ensure the fact that the corresponding quantum isometry group is "easy" in the sense of [6]?

An answer here would of course provide an axiomatization of the "easy spheres", and our above results would translate into a 3-fold classification for the easy spheres, because of the classification results for easy quantum groups in [7].

8. Concluding Remarks

We have seen in this paper that the usual sphere $S^{n-1}$, the half-liberated sphere $S_*^{n-1}$, and the free sphere $S_+^{n-1}$ share a number of remarkable common properties. The general axiomatization and study of these 3 noncommutative spheres has raised a number of concrete questions, notably in connection with the general structure of the associated projective spaces (Question 3.4), with the computation of the associated spherical integrals (Question 5.4), and with the general axiomatization problem (Question 7.3).
We intend to come back to these questions in some future work. In addition, there are many questions about what happens in the “untwisted” case, and in the “non-easy” case. Some results here are already available from [10]. Finally, we have the more general problem of understanding the notion of liberation and half-liberation for more general manifolds. In the 0-dimensional case it is probably possible to use the results in [11] in order to reach some preliminary results. In the continuous case, however, the situation so far appears to be quite unclear.
T. Banica, D. Goswami
Acknowledgement. We would like to thank J. Bichon and S. Curran for several useful discussions. T.B. was supported by the ANR grants “Galoisint” and “Granma”, and D.G. was supported by the project “Noncommutative Geometry and Quantum Groups”, funded by the Indian National Science Academy.
References
1. Banica, T.: Le groupe quantique compact libre U(n). Commun. Math. Phys. 190, 143–172 (1997)
2. Banica, T., Collins, B.: Integration over compact quantum groups. Publ. Res. Inst. Math. Sci. 43, 277–302 (2007)
3. Banica, T., Collins, B.: Integration over the Pauli quantum group. J. Geom. Phys. 58, 942–961 (2008)
4. Banica, T., Collins, B., Schlenker, J.-M.: On orthogonal matrices maximizing the 1-norm. http://arxiv.org/abs/0901.2923v1 [math.FA], 2009
5. Banica, T., Collins, B., Zinn-Justin, P.: Spectral analysis of the free orthogonal matrix. Int. Math. Res. Not. 2009, 3289–3309 (2009)
6. Banica, T., Speicher, R.: Liberation of orthogonal Lie groups. Adv. Math. 222, 1461–1501 (2009)
7. Banica, T., Vergnioux, R.: Invariants of the half-liberated orthogonal group. http://arxiv.org/abs/0902.2719v2 [math.QA], 2009
8. Bhowmick, J., Goswami, D.: Quantum isometry groups: examples and computations. Commun. Math. Phys. 285, 421–444 (2009)
9. Bhowmick, J., Goswami, D.: Quantum group of orientation preserving Riemannian isometries. http://arxiv.org/abs/0806.3687v2 [math.QA], 2008
10. Bhowmick, J., Goswami, D.: Quantum isometry groups of the Podles spheres. http://arxiv.org/abs/0810.0658v4 [math.QA], 2009
11. Bhowmick, J., Goswami, D., Skalski, A.: Quantum isometry groups of 0-dimensional manifolds. Trans. Amer. Math. Soc., to appear
12. Collins, B., Śniady, P.: Integration with respect to the Haar measure on the unitary, orthogonal and symplectic group. Commun. Math. Phys. 264, 773–795 (2006)
13. Connes, A.: Noncommutative geometry. London-New York, Academic Press, 1994
14. Connes, A., Dubois-Violette, M.: Noncommutative finite-dimensional manifolds I: spherical manifolds and related examples. Commun. Math. Phys. 230, 539–579 (2002)
15. Connes, A., Dubois-Violette, M.: Noncommutative finite dimensional manifolds II: moduli space and structure of noncommutative 3-spheres. Commun. Math. Phys. 281, 23–127 (2008)
16. Connes, A., Landi, G.: Noncommutative manifolds, the instanton algebra and isospectral deformations. Commun. Math. Phys. 221, 141–160 (2001)
17. D'Andrea, F., Landi, G.: Bounded and unbounded Fredholm modules for quantum projective spaces. http://arxiv.org/abs/0903.3553v1 [math.QA], 2009
18. Dabrowski, L., D'Andrea, F., Landi, G., Wagner, E.: Dirac operators on all Podles quantum spheres. J. Noncommut. Geom. 1, 213–239 (2007)
19. Di Francesco, P.: Meander determinants. Commun. Math. Phys. 191, 543–583 (1998)
20. Dyson, F.J.: The threefold way. Algebraic Structure of Symmetry Groups and Ensembles in Quantum Mechanics. J. Math. Phys. 3, 1199–1215 (1962)
21. Goswami, D.: Quantum group of isometries in classical and noncommutative geometry. Commun. Math. Phys. 285, 141–160 (2009)
22. Podleś, P.: Quantum spheres. Lett. Math. Phys. 14, 193–202 (1987)
23. Varilly, J.C.: Quantum symmetry groups of noncommutative spheres. Commun. Math. Phys. 221, 511–524 (2001)
24. Wang, S.: Free products of compact quantum groups. Commun. Math. Phys. 167, 671–692 (1995)
25. Woronowicz, S.L.: Compact matrix pseudogroups. Commun. Math. Phys. 111, 613–665 (1987)
26. Woronowicz, S.L.: Tannaka-Krein duality for compact matrix pseudogroups. Twisted SU(N) Groups. Invent. Math. 93, 35–76 (1988)

Communicated by A. Connes
Commun. Math. Phys. 298, 357–368 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1063-2
Communications in
Mathematical Physics
Degree Complexity of Birational Maps Related to Matrix Inversion Eric Bedford, Tuyen Trung Truong Department of Mathematics, Indiana University, Bloomington, IN 47405, USA. E-mail: [email protected], [email protected] Received: 29 June 2009 / Accepted: 22 November 2009 Published online: 30 May 2010 – © Springer-Verlag 2010
Abstract: For a $q \times q$ matrix $x = (x_{i,j})$ we let $J(x) = (x_{i,j}^{-1})$ be the Hadamard inverse, which takes the reciprocal of the elements of $x$. We let $I(x) = (x_{i,j})^{-1}$ denote the matrix inverse, and we define $K = I \circ J$ to be the birational map obtained from the composition of these two involutions. We consider the iterates $K^n = K \circ \cdots \circ K$ and determine the degree complexity of $K$, which is the exponential rate of degree growth $\delta(K) = \lim_{n\to\infty} (\deg(K^n))^{1/n}$ of the degrees of the iterates. Earlier studies of this map were restricted to cyclic matrices, in which case $K$ may be represented by a simpler map. Here we show that for general matrices the value of $\delta(K)$ is equal to the value conjectured by Anglès d'Auriac, Maillard and Viallet.

0. Introduction

Let $M_q$ denote the space of $q \times q$ matrices with coefficients in $\mathbb{C}$, and let $\mathbb{P}(M_q)$ denote its projectivization. We consider two involutions on the space of matrices: $J(x) = (x_{i,j}^{-1})$ takes the reciprocal of each entry of the matrix $x = (x_{i,j})$, and $I(x) = (x_{i,j})^{-1}$ denotes the matrix inverse. The composition $K = I \circ J$ defines a birational map of $\mathbb{P}(M_q)$. For a rational self-map $f$ of projective space, we may define its $n$th iterate $f^n = f \circ \cdots \circ f$, as well as the degree $\deg(f^n)$. The degree complexity or dynamical degree is defined as
$$\delta(f) := \lim_{n\to\infty} (\deg(f^n))^{1/n}.$$
In general it is not easy to determine $\delta(f)$, or even to make a good numerical estimate. Birational maps in dimension 2 were studied in [DF], where a technique was given that, in principle, can be used to determine $\delta(f)$. This method, however, does not carry over to higher dimension. In the case of the map $K_q$, the dimension of the space and the degree of the map both grow quadratically in $q$, so it is difficult to write even a small composition $K_q \circ \cdots \circ K_q$ explicitly. This paper is devoted to determining $\delta(K_q)$.
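The two involutions and their composition are easy to experiment with on a single concrete matrix. The following is a minimal sketch in exact rational arithmetic, not part of the paper: the function names and the sample matrix are illustrative assumptions, and the checks only confirm that $J$ and $I$ are involutions and that $J \circ I$ inverts $K = I \circ J$ at that one point.

```python
from fractions import Fraction

def hadamard_inverse(x):
    """J: take the reciprocal of every entry (all entries must be nonzero)."""
    return [[1 / e for e in row] for row in x]

def matrix_inverse(x):
    """I: ordinary matrix inverse by Gauss-Jordan elimination over the rationals."""
    n = len(x)
    aug = [list(row) + [Fraction(int(r == c)) for c in range(n)]
           for r, row in enumerate(x)]
    for c in range(n):
        # pick a nonzero pivot; raises StopIteration if the matrix is singular
        p = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [e / piv for e in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def K(x):
    """K = I o J."""
    return matrix_inverse(hadamard_inverse(x))

# illustrative sample point (hypothetical choice, entries nonzero)
x = [[Fraction(v) for v in row] for row in ([1, 2, 3], [4, 5, 7], [2, 9, 11])]

print(hadamard_inverse(hadamard_inverse(x)) == x)    # J is an involution
print(matrix_inverse(matrix_inverse(x)) == x)        # I is an involution
print(hadamard_inverse(matrix_inverse(K(x))) == x)   # J o I undoes K
```

Iterating `K` this way quickly produces very large rational numbers, which is one practical reflection of the degree growth that $\delta$ measures.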
Theorem. If $q \ge 5$, then $\delta(K_q) > 1$ is the largest root of the polynomial $\lambda^2 - (q^2 - 4q + 2)\lambda + 1$.

The map $K$ and the question of determining its dynamical degree have received attention because $K$ may be interpreted as acting on the space of matrices of Boltzmann weights and as such represents a basic symmetry in certain problems of lattice statistical mechanics (see [BHM,BM]). In fact there are many $K$-invariant subspaces $T \subset \mathbb{P}(M_q)$ (see, for instance, [AMV1] and [PAM]), and it is of interest to know the values of the restrictions $\delta(K|_T)$. The first invariant subspaces that were considered are $S_q$, the space of symmetric matrices, and $C_q$, the cyclic (also called circulant) matrices. The value $\delta(K|_{C_q})$ was found in [BV], and another proof of this was given in [BK1]. Anglès d'Auriac, Maillard and Viallet [AMV2] developed numerical approaches to finding $\delta$ and found approximate values of $\delta(K_q)$ and $\delta(K|_{S_q})$ for $q \le 14$. A comparison of these values with the (known) values of $\delta(K|_{C_q})$ led them to conjecture that $\delta(K|_{C_q}) = \delta(K_q) = \delta(K|_{S_q})$ for all $q$. The theorem above proves the first of these conjectured equalities.

We note that the second equality, $\delta(K|_{S_q}) = \delta(K_q)$, has been recently established in [T]. This involves additional symmetry, which adds another layer of subtlety to the problem. An example where additional symmetry leads to additional complication was seen already with the $K$-invariant space $C_q \cap S_q$: the value of $\delta(K|_{C_q \cap S_q})$ has been determined in [AMV2] (for prime $q$) and [BK2] (for general $q$), and in the general case it depends on $q$ in a rather involved way.

The reason why the cyclic matrices were handled first was that $K|_{C_q}$ (see [BV]) and $K|_{C_q \cap S_q}$ (see [AMV2]) can be converted to maps of the form $L \circ J$ for certain linear $L$. In the case of $K|_{C_q}$, the associated map is "elementary" in the terminology of [BK1], whereas $K|_{C_q \cap S_q}$ exhibits more complicated singularities, i.e., blow-down/blow-up behavior.
In contrast, the present paper treats matrices in their general form, so our methods should be applicable to wider classes of $K$-invariant subspaces.

The degree of $K$ is the degree of $K^*H$, the pullback of a (general) linear hypersurface $H$. A difficulty is that frequently $(K^*)^n \ne (K^n)^*$ on $H^{1,1}$. To deal with this we analyze the blow-down behavior of $K$, which means that we look at the hypersurfaces $E$ for which $K(E)$ has codimension $\ge 2$. We will construct a new manifold $\pi : X \to \mathbb{P}(M_q)$ by blowing up certain of the sets $K(E)$. The map $\pi$ is a birational equivalence which changes the topology and increases the size of $H^{1,1}$. The birational map $K_X := \pi^{-1} \circ K \circ \pi$ of $X$ induces a well-defined pullback on $H^{1,1}(X)$. The exponential growth rate of degree is equal to the exponential growth rate of the induced maps on cohomology:
$$\delta(K) = \lim_{n\to\infty} \|(K_X^n)^*\|^{1/n}_{H^{1,1}(X)}.$$
Our approach is to choose a space $X$ for which we can determine $(K_X^n)^*$ and thus $\delta(K)$.

In general, $\delta$ decreases when we restrict to a linear subspace, so $\delta(K) \ge \delta(K|_{C_q})$. The paper [BV] shows that $\delta(K|_{C_q})$ is the largest root of the polynomial $\lambda^2 - (q^2 - 4q + 2)\lambda + 1$, so it will suffice to show that this number is also an upper bound for $\delta(K)$. In order to find the right upper bound on $\delta(K_q)$, we construct a blowup space $\pi : Z \to \mathbb{P}(M_q)$. A basic property is that the spectral radius (or modulus of the largest eigenvalue) of the induced map $K_Z^*$ on $H^{1,1}(Z)$ gives an upper bound for $\delta(K)$. Thus the goal of this paper is to construct a space $Z$ such that the spectral radius of $K_Z^*$ is the number given in the theorem.
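For orientation, the number appearing in the Theorem can be evaluated directly from the quadratic formula. The sketch below is an illustration only (the function name is an assumption). Note that for $q = 4$ the middle coefficient $q^2 - 4q + 2$ equals 2, so the polynomial degenerates to $(\lambda - 1)^2$, which is consistent with the hypothesis $q \ge 5$ being needed for a root larger than 1.

```python
import math

def degree_complexity(q):
    """Largest root of lambda^2 - (q^2 - 4q + 2)*lambda + 1, as in the Theorem (q >= 5)."""
    if q < 5:
        raise ValueError("the Theorem is stated for q >= 5")
    c = q * q - 4 * q + 2
    return (c + math.sqrt(c * c - 4)) / 2

for q in range(5, 9):
    print(q, degree_complexity(q))
```

Since the two roots of the polynomial multiply to 1, the largest root is close to $q^2 - 4q + 2$ for large $q$.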
1. Basic Properties of I, J, and K

For $1 \le j \le q - 1$, define $R_j$ as the set of matrices in $M_q$ of rank less than or equal to $j$. In $\mathbb{P}(M_q)$, $R_1$ consists of matrices of rank exactly 1 since the zero matrix is not in $\mathbb{P}(M_q)$. For $\lambda, \nu \in \mathbb{P}^{q-1}$, let $\lambda \otimes \nu = (\lambda_i \nu_j) \in \mathbb{P}(M_q)$ denote the outer vector product. The map
$$\mathbb{P}^{q-1} \times \mathbb{P}^{q-1} \ni (\lambda, \nu) \mapsto \lambda \otimes \nu \in R_1 \subset \mathbb{P}(M_q)$$
is biholomorphic, and thus $R_1$ is a smooth submanifold.

We let $I : \mathbb{P}(M_q) \to \mathbb{P}(M_q)$ denote the birational involution given by matrix inversion $I(A) = A^{-1}$. We let $x_{[k,m]}$ denote the $(q-1) \times (q-1)$ sub-matrix of $(x_{i,j})$ which is obtained by deleting the $k$th row and the $m$th column. We recall the classic formula $I(x) = (\det(x))^{-1} \hat{I}(x)$, where $\hat{I} = (\hat{I}_{i,j})$ is the homogeneous polynomial map of degree $q - 1$ given by the cofactor matrix
$$\hat{I}_{i,j}(x) = C_{j,i}(x) = (-1)^{i+j} \det(x_{[j,i]}). \tag{1.1}$$
Thus $\hat{I}$ is a homogeneous polynomial map which represents $I$ as a map on projective space. We see that $\hat{I}(x) = 0$ exactly when the determinants of all $(q-1) \times (q-1)$ minors of $x$ vanish, i.e., when $x \in R_{q-2}$.

We may always represent a rational map $f = [f_1 : \cdots : f_{q^2}]$ of projective space $\mathbb{P}^{q^2-1}$ in terms of homogeneous polynomials of the same degree and without common factor. We define the degree of $f$ to be the degree of $f_j$, and the indeterminacy locus is defined as $\mathcal{I}(f) = \{f_1 = \cdots = f_{q^2} = 0\}$. The indeterminacy locus represents the points where it is not possible to extend $f$, even as a continuous mapping. The indeterminacy locus always has codimension at least 2. In the case of the rational map $I$, the polynomials $C_{j,i}(x)$ have no common factor. Further, $\hat{I}(x) = 0$ exactly when $x \in R_{q-2}$, so it follows that the indeterminacy set is $\mathcal{I}(I) = R_{q-2}$.

We let $J : \mathbb{P}(M_q) \to \mathbb{P}(M_q)$ be the birational involution given by $J(x) = (J(x)_{i,j}) = (1/x_{i,j})$, which takes the reciprocal of all the entries. In the sequel, we will sometimes write $J(x) = \frac{1}{x}$. We may define
$$\hat{J}(x) = J(x)\,\Pi(x), \tag{1.2}$$
where $\Pi(x) = \prod x_{a,b}$ is the homogeneous polynomial of degree $q^2$ obtained by taking the product of all the entries $x_{a,b}$ of $x$, and $\hat{J}(x) = (\hat{J}_{i,j})$ is the matrix of homogeneous polynomials of degree $q^2 - 1$ such that $\hat{J}_{i,j} = \prod_{(a,b) \ne (i,j)} x_{a,b}$ is the product of all the $x_{a,b}$ except $x_{i,j}$. Thus $\hat{J}$ is the projective representation of $J$ in terms of homogeneous polynomials.

We define $K = I \circ J$. On projective space the map $K$ is represented by the polynomial map (1.4) below. Since $\hat{I} \circ \hat{J}$ has degree $(q-1)(q^2-1)$, we see from Proposition 1.1 that the entries of $\hat{I} \circ \hat{J}$ must have a common factor of degree $q^3 - 2q^2$.

When $V$ is a variety, we write $K(V) = W$ for the strict transform of $V$ under $K$, which is the same as the closure of $K(V - \mathcal{I}(K))$. We say that a hypersurface $V$ is exceptional if $K(V)$ has codimension at least 2. The map $I$ is a biholomorphic map from $M_q - R_{q-1}$ to itself, so the only possible exceptional hypersurface for $I$ is $R_{q-1}$. We define
$$\Sigma_{i,j} = \{x = (x_{k,\ell}) \in M_q : x_{i,j} = 0\}. \tag{1.3}$$
The map $J$ is a biholomorphic map of $M_q - \bigcup_{i,j} \Sigma_{i,j}$ to itself, and the exceptional hypersurfaces are the $\Sigma_{i,j}$. Further, the indeterminacy locus is
$$\mathcal{I}(J) = \bigcup_{(a,b) \ne (c,d)} \Sigma_{a,b} \cap \Sigma_{c,d}.$$
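The polynomial lifts in (1.1) and (1.2), together with the representation of $K$ stated in (1.4) just below, can be compared numerically. The following sketch is an illustration under stated assumptions: the helper names and the sample matrix are hypothetical, $\Pi(x)$ denotes the product of all entries, and $q = 3$. It checks at one point that $\hat{I} \circ \hat{J}$ equals $\Pi^{q-2}$ times the map (1.4), so the common factor has degree $q^2(q-2) = q^3 - 2q^2$, and that the map (1.4) is homogeneous of degree $q^2 - q + 1$.

```python
from fractions import Fraction
from math import prod

def det(m):
    """Determinant by Laplace expansion along the first row (fine for small q)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** c * m[0][c] * det([row[:c] + row[c + 1:] for row in m[1:]])
               for c in range(len(m)))

def minor(m, k, l):
    """Delete row k and column l (0-based indices)."""
    return [row[:l] + row[l + 1:] for r, row in enumerate(m) if r != k]

def I_hat(x):
    """(1.1): entry (i, j) is the cofactor C_{j,i}(x)."""
    n = len(x)
    return [[(-1) ** (i + j) * det(minor(x, j, i)) for j in range(n)] for i in range(n)]

def Pi(x):
    """Product of all entries of x."""
    return prod(e for row in x for e in row)

def J_hat(x):
    """(1.2): J(x) * Pi(x), i.e. the product of all entries except x[i][j]."""
    p = Pi(x)
    return [[p / e for e in row] for row in x]

def K_hat(x):
    """(1.4): entry (i, j) is C_{j,i}(1/x) * Pi(x)."""
    p = Pi(x)
    return [[p * c for c in row] for row in I_hat([[1 / e for e in r] for r in x])]

def scale(t, m):
    return [[t * e for e in row] for row in m]

q = 3
x = [[Fraction(v) for v in row] for row in ([1, 2, 3], [4, 5, 7], [2, 9, 11])]
t = Fraction(3)

# common factor Pi^(q-2) of I_hat o J_hat, of degree q^3 - 2q^2
print(I_hat(J_hat(x)) == scale(Pi(x) ** (q - 2), K_hat(x)))
# homogeneity of degree q^2 - q + 1
print(K_hat(scale(t, x)) == scale(t ** (q * q - q + 1), K_hat(x)))
```

A single sample point does not prove the polynomial identities, but both follow from the multilinearity of the cofactors, so the checks are expected to hold at every point with nonzero entries.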
Proposition 1.1. The degree of $K$ is $q^2 - q + 1$. Its representation $\hat{K} = (\hat{K}_{i,j})$ in terms of homogeneous polynomials is given by
$$\hat{K}_{i,j}(x) = C_{j,i}(1/x)\,\Pi(x), \tag{1.4}$$
where $C_{j,i}$ and $\Pi$ are as in (1.1) and (1.2).

Proof. Observe that $C_{j,i}(1/x)$ is independent of the variable $x_{j,i}$, while $\hat{K}(x)_{i,j}$ is not divisible by the variables $x_{k,\ell}$ with $k \ne j$ and $\ell \ne i$. Hence the greatest common divisor of all polynomials on the right hand side of (1.4) is 1. Thus the algebraic degree of $K$ is equal to the degree of $\hat{K}(x)_{i,j}$, which is $q^2 - q + 1$.

2. Construction of $\mathcal{R}^1$

Divisors $V = \sum c_j V_j$ and $V' = \sum c'_k V'_k$ on a manifold $Z$ are said to be linearly equivalent if there is a rational function $r$ on $Z$ so that the divisor of $r$ is $V - V'$. The Picard group $\mathrm{Pic}(Z)$ is the set of divisors modulo linear equivalence. We will construct a complex manifold $\pi : Z \to \mathbb{P}(M_q)$ by performing a series of blowups, and we consider the induced birational map $K_Z := \pi^{-1} \circ K \circ \pi$. For spaces obtained by iterated blowups of $\mathbb{P}^n$, it is a standard result that the cohomology group $H^{1,1}$ is isomorphic to $\mathrm{Pic}$, and in the sequel, we find it more convenient to work with $\mathrm{Pic}$.

For our construction of $Z$ we first blow up the spaces $R_1$ and $A_{i,j}$, $1 \le i, j \le q$. The exceptional (blowup) hypersurfaces will be denoted $\mathcal{R}^1$ and $\mathcal{A}^{i,j}$. Then we will blow up surfaces $B_{i,j} \subset \mathcal{A}^{i,j}$, which will create exceptional hypersurfaces $\mathcal{B}^{i,j}$. The precise nature of $Z$ depends on the order in which the various blowups are performed. Different orders of blowup will produce different spaces $Z$, but the identity map of $\mathbb{P}(M_q)$ to itself induces a birational equivalence between the spaces, and this equivalence induces the identity map on $\mathrm{Pic}(Z)$ $(\cong H^{1,1}(Z))$. Any of these spaces $Z$ yields an induced birational map $K_Z$, and each $K_Z$ induces essentially the same pullback map $K_Z^*$ on $\mathrm{Pic}(Z)$. This issue is discussed in [BK2, §2].

We start our discussion with $\mathcal{R}^1$. Let $\pi_1 : Z_1 \to \mathbb{P}(M_q)$ denote the blowup of $\mathbb{P}(M_q)$ along $R_1$. We will give a coordinate chart for points of $Z_1$ lying over a point $x^0 \in R_1$. Let us first make a general observation.
Let $\rho_{\ell,m}$ denote the matrix operation which interchanges the $\ell$th and $m$th rows of a matrix $x \in M_q$, and let $\gamma_{\ell,m}$ denote the interchange of the $\ell$th and $m$th columns. It is evident that $J$ commutes with both $\rho_{\ell,m}$ and $\gamma_{\ell,m}$, whereas we have $\rho_{\ell,m}(I(x)) = I(\gamma_{\ell,m}(x))$. Thus, for the purposes of looking at the induced map $K_{Z_1}$, we may permute the coordinates of $(x_{i,j})$, and without loss of generality we may assume that the (1,1) entry of $x^0$ does not vanish. This means that we may assume that $x^0 = \lambda^0 \otimes \nu^0$ with $\lambda^0, \nu^0 \in U_1$, where $U_1 = \{z = (z_1, \dots, z_q) \in \mathbb{C}^q : z_1 = 1\}$. We write the standard affine coordinate charts for $\mathbb{P}(M_q)$ as
$$W_{r,s} = \{x \in M_q : x_{r,s} = 1\} \subset \mathbb{C}^{q^2}, \tag{2.1}$$
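where $1 \le r, s \le q$. Let us define $V$ to be the set of all matrices $x \in M_q$ such that the first row and column vanish. Further, for $2 \le k, \ell \le q$, we define a subset of $V$:
$$V_{k,\ell} = \left\{x \in M_q : x = \begin{pmatrix} 0 & 0 \\ 0 & x_{[1,1]} \end{pmatrix} \text{ and } x_{k,\ell} = 1\right\}. \tag{2.2}$$
Now we may represent a coordinate neighborhood of $Z_1$ over $x^0$ as
$$\pi_1 : \mathbb{C} \times U_1 \times U_1 \times V_{k,\ell} \to W_{1,1}, \qquad \pi_1(s, \lambda, \nu, v) = \lambda \otimes \nu + sv. \tag{2.3}$$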
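Since $\lambda \otimes \nu$ has rank 1 and nonvanishing (1,1) entry, we see that $\pi_1(s, \lambda, \nu, v) \in R_1$ exactly when $s = 0$. Thus the points of $\mathcal{R}^1$ which are in this coordinate neighborhood are given by $\{s = 0\}$. If $y \in M_q$ is a matrix with $y_{k,\ell} \ne 0$, then we find $\pi_1^{-1}(y) = (s, \lambda, \nu, v)$, where
$$\tilde{y} = y/y_{k,\ell}, \quad s = y_{k,\ell}, \quad \lambda = \tilde{y}_{*,1}, \quad \nu = \tilde{y}_{1,*}, \quad v = s^{-1}(\tilde{y} - \lambda \otimes \nu). \tag{2.4}$$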
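We may write the induced map $K_{Z_1} = \pi_1^{-1} \circ K \circ \pi_1$ in a neighborhood of $\mathcal{R}^1$ by using the coordinate projections (2.3) and (2.4). This allows us to show that $K_{Z_1}|_{\mathcal{R}^1}$ has a relatively simple expression:

Proposition 2.1. We have $K_{Z_1}(\mathcal{R}^1) = R_{q-1}$, so $\mathcal{R}^1$ is not exceptional for $K_{Z_1}$. In fact for $z_0 = \pi_1(0, \lambda, \nu, v) \in \mathcal{R}^1$,
$$K_{Z_1}(z_0) = B \begin{pmatrix} 0 & 0 \\ 0 & I_{q-1}(v') \end{pmatrix} A, \tag{2.5}$$
where $I_{q-1}$ denotes matrix inversion on $M_{q-1}$, and
$$v' = \left(\frac{-v_{j,k}}{\lambda_j^2 \nu_k^2}\right)_{2 \le j,k \le q}, \quad A = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ -\lambda_2^{-1} & 1 & & \\ \vdots & & \ddots & \\ -\lambda_q^{-1} & & & 1 \end{pmatrix}, \quad B = \begin{pmatrix} 1 & -\nu_2^{-1} & \cdots & -\nu_q^{-1} \\ 0 & 1 & & \\ \vdots & & \ddots & \\ 0 & & & 1 \end{pmatrix}. \tag{2.6}$$

Proof. Without loss of generality, we work at points $\lambda, \nu \in U_1$ such that $\lambda_j, \nu_k \ne 0$ for all $j, k$ and $v$ such that the $v'$ in (2.6) is invertible. Then
$$J(\pi_1(s, \lambda, \nu, v)) = \frac{1}{\lambda \otimes \nu} + s \begin{pmatrix} 0 & 0 \\ 0 & v' \end{pmatrix} + O(s^2) = \pi_1(s + O(s^2), \lambda^{-1}, \nu^{-1}, v' + O(s)). \tag{2.7}$$
Observe that
$$A\, \frac{1}{\lambda \otimes \nu}\, B = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \quad \text{and} \quad sA \begin{pmatrix} 0 & 0 \\ 0 & v' \end{pmatrix} B = \begin{pmatrix} 0 & 0 \\ 0 & sA_{[1,1]} v' B_{[1,1]} \end{pmatrix}.$$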
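Thus
$$\begin{aligned}
K_{Z_1}(z) &= \pi_1^{-1} \circ I \circ J \circ \pi_1(z) \\
&= \pi_1^{-1}\, I\!\left(\frac{1}{\lambda \otimes \nu} + s \begin{pmatrix} 0 & 0 \\ 0 & v' \end{pmatrix} + O(s^2)\right) \\
&= \pi_1^{-1}\, B\, I\!\left(A \left(\frac{1}{\lambda \otimes \nu} + s \begin{pmatrix} 0 & 0 \\ 0 & v' \end{pmatrix} + O(s^2)\right) B\right) A \\
&= \pi_1^{-1}\, B\, I\!\begin{pmatrix} 1 & 0 \\ 0 & sv' + O(s^2) \end{pmatrix} A,
\end{aligned}$$
and the proposition follows if we let $s \to 0$.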
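The role of the matrices $A$ and $B$ of (2.6) in this proof is purely linear algebra and can be confirmed on an example. The sketch below is illustrative only: the function names and the sample vectors are assumptions, with $q = 3$ and $\lambda, \nu$ normalized to have first coordinate 1. It checks that $A\,\frac{1}{\lambda \otimes \nu}\,B$ is the matrix with a single 1 in the (1,1) slot, and that a matrix whose first row and column vanish is left unchanged by multiplication by $A$ on the left and $B$ on the right.

```python
from fractions import Fraction

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def make_A_B(lam, nu):
    """A and B as in (2.6), for lam, nu with first coordinate 1 and no zero coordinate."""
    q = len(lam)
    A = [[Fraction(int(r == c)) for c in range(q)] for r in range(q)]
    B = [[Fraction(int(r == c)) for c in range(q)] for r in range(q)]
    for j in range(1, q):
        A[j][0] = -1 / lam[j]   # first column of A below the diagonal
        B[0][j] = -1 / nu[j]    # first row of B right of the diagonal
    return A, B

# illustrative sample data (hypothetical choice)
lam = [Fraction(1), Fraction(2), Fraction(3)]
nu = [Fraction(1), Fraction(5), Fraction(7)]
A, B = make_A_B(lam, nu)

# entrywise reciprocal of the rank one matrix lam (x) nu
recip = [[1 / (l * n) for n in nu] for l in lam]
E11 = matmul(matmul(A, recip), B)

# a matrix with vanishing first row and column
v = [[Fraction(0), Fraction(0), Fraction(0)],
     [Fraction(0), Fraction(1), Fraction(2)],
     [Fraction(0), Fraction(3), Fraction(5)]]
unchanged = matmul(matmul(A, v), B)

print(E11)
print(unchanged == v)
```

Because $A$ and $B$ are unipotent, they also have determinant 1, which is what allows them to be used as row and column operations in determinant computations with $\frac{1}{x}$ near $s = 0$.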
Now we will use the identities $K_{Z_1} \circ J_{Z_1} = I_{Z_1}$, $I_{Z_1} \circ K_{Z_1} = J_{Z_1}$.

Proposition 2.2. We have $K_{Z_1}(JR_{q-1}) = \mathcal{R}^1$, and thus $JR_{q-1}$ is not exceptional for $K_{Z_1}$.

Proof. For generic $s$, $\lambda$, $\nu$, $v$, and $v'$ as in (2.6), we have (2.7) in the previous proposition. Letting $s \to 0$, we see that these points are dense in $\mathcal{R}^1$, and thus $J_{Z_1}\mathcal{R}^1 = \mathcal{R}^1$. Now
$$K_{Z_1}(J(R_{q-1})) = I_{Z_1}(R_{q-1}) = I_{Z_1}(K_{Z_1}\mathcal{R}^1) = J_{Z_1}(\mathcal{R}^1) = \mathcal{R}^1,$$
where the second equality follows from the previous proposition.
3. Construction of $\mathcal{A}^{i,j}$

We let $A_{i,j}$ denote the set of $q \times q$ matrices whose $i$th row and $j$th column consist entirely of zeros. Let $\pi_2 : Z_2 \to \mathbb{P}(M_q)$ denote the space obtained by blowing up along all of the centers $A_{i,j}$ for $1 \le i, j \le q$. As we discussed earlier, it will be immaterial for our purposes what order we do the blowups in. Without loss of generality, we fix our discussion on $(i, j) = (1, 1)$. The set $A_{1,1}$ is equal to the set $V$ which was introduced in the previous section. Let us use the notation
$$U = U_{1,r} = \left\{z \in M_q : z = \begin{pmatrix} * & * \\ * & 0_{q-1} \end{pmatrix},\ z_{1,r} = 1\right\} \tag{3.1}$$
for the matrices which consist of zeros except for the first row and column, and which are normalized by the entry $z_{1,r}$. With this notation and with $W_{k,\ell}$ and $V_{k,\ell}$ as in (2.1) and (2.2), we define the coordinate chart
$$\pi_2 : \mathbb{C} \times U \times V_{k,\ell} \to W_{k,\ell} \subset M_q, \qquad \pi_2(s, \zeta, v) = s\zeta + v = \begin{pmatrix} s\zeta & s\zeta \\ s\zeta & v \end{pmatrix}. \tag{3.2}$$
Coordinate charts of this form give a covering of $\mathcal{A}^{1,1}$, and $\{s = 0\}$ defines the set $\mathcal{A}^{1,1}$ within each coordinate chart. If $x \in M_q$, then we normalize to obtain $\tilde{x} := x/x_{k,\ell} \in W_{k,\ell}$, and
$$\pi_2^{-1}(x) = (s, \zeta, v), \quad v = \tilde{x}_{[1,1]}, \quad s = \tilde{x}_{1,r}, \quad \zeta = (\tilde{x} - v)/\tilde{x}_{1,r}. \tag{3.3}$$
We let $K_{Z_2} = \pi_2^{-1} \circ K \circ \pi_2$ denote the induced birational map on $Z_2$.
Proposition 3.1. For $1 \le r, s \le q$, $K_{Z_2}(\Sigma_{r,s}) = \mathcal{A}^{s,r}$, and in particular $\Sigma_{r,s}$ is not exceptional for $K_{Z_2}$.

Proof. As was noted at the beginning of the previous section, it is no loss of generality to assume $(r, s) = (1, 1)$ and $2 \le k, \ell \le q$. For generic $x \in M_q$, we may use $\hat{K}$ from (1.4) and define $y$ by
$$\hat{K}(x) = \Pi(x)\left(C_{j,i}\!\left(\frac{1}{x}\right)\right) = y.$$
We write $\pi(\sigma, \zeta, v) = y$, and we next determine $\sigma$, $\zeta$ and $v$. Now let us use the notation $s = x_{1,1}$, so $\Pi(x) = s\,\Pi'(x)$, where $\Pi'$ denotes the product of all $x_{a,b}$ except $(a, b) = (1, 1)$. For $2 \le i, j \le q$, we have
$$y_{i,j} = s\,\Pi'(x)\left(\frac{1}{s}\, a_{i,j}(x) + O(1)\right)$$
with $a_{i,j}(x) = (-1)^{i+j} \det((1/x)_{[j,i],[1,1]})$, which gives
$$v_{i,j} = \tilde{y}_{i,j} = y_{i,j}/y_{k,\ell} = a_{i,j}(x)/a_{k,\ell}(x) + O(s), \quad 2 \le i, j \le q.$$
For generic $x$, we may let $s \to 0$, and then the value of $v$ approaches $(a_{i,j}(x))/a_{k,\ell}(x)$, which by (1.4) of Proposition 1.1 is just $K_{q-1}(x_{[1,1]})$, normalized at the $(k, \ell)$ slot.

The first row and column of $\left(C_{j,i}\!\left(\frac{1}{x}\right)\right)$ do not involve the (1,1) entry of the matrix $x$, so $y_{1,*}$ and $y_{*,1}$ are divisible by $s$. By (3.3), we have $\sigma = y_{1,r}/y_{k,\ell} = O(s)$, so we see that $\sigma \to 0$ as $s \to 0$. An element of the first row of $y$ is given by $y_{1,j} = \Pi(x)(-1)^{j+1} \det(1/x_{[j,1]})$. If we expand this determinant into minors along the top row, we have
$$y_{1,j} = \Pi(x) \sum_{2 \le p \le q} (-1)^{j+1+p} \det\!\left((1/x_{[j,1]})_{[1,p]}\right) x_{1,p}^{-1}.$$
We use the notation $y_{1,*}$ and $(1/x_{1,*})$ for the vectors $(y_{1,p})_{2 \le p \le q}$ and $(1/x_{1,p})_{2 \le p \le q}$. Thus we find
$$y_{1,*} = -\Pi(x)\,(a^t + O(s)) \cdot (1/x_{1,*}),$$
where $a^t$ is the transpose of the matrix $(a_{i,j} : 2 \le i, j \le q)$, and "$\cdot$" denotes matrix multiplication. It is evident that $y_{1,1} = \Pi(x) \det(1/x_{[1,1]})$.

Now we consider the range of $K$ near $\mathcal{A}^{1,1}$. We have seen that $v = K_{q-1}(x_{[1,1]})$, so the values of $v$ are dense in $V_{k,\ell}$. Now for fixed $v$, we see that the values of $y_{1,*}$ and $y_{*,1}$ span a $2q - 2$ dimensional set. Thus, as we let the values of $x_{1,*}$ and $x_{*,1}$ range over generic values in $\mathbb{C}^{q-1} \times \mathbb{C}^{q-1}$, we see that $\zeta$ is dense in $U$. Thus $K_{Z_2}(\Sigma_{1,1}) = \mathcal{A}^{1,1}$.

4. Construction of $\mathcal{B}^{i,j}$

For $1 \le i, j \le q$, we let $U_{i,j} = \{\zeta \in M_q : \zeta_{[i,j]} = 0\}$ be the set of matrices for which all entries are zero except on the $i$th row and $j$th column. In the construction of $\mathcal{A}^{i,j}$, we may consider $U_{i,j}$ (normalized) to be a coordinate chart in the fiber over a point of $A_{i,j}$. We define the set $B_{i,j} \subset \mathcal{A}^{i,j} \subset Z_2$ to be given in local coordinates by $B_{i,j} = \{(s, \zeta, v) \in \mathcal{A}^{i,j} : s = 0,\ \zeta_{i,j} = 0\}$. This has codimension 2 in $Z_2$, and we let $\pi_3 : Z_3 \to Z_2$ be the new manifold obtained by blowing up all the sets $B_{i,j}$. Let
$K_{Z_3}$ denote the induced birational map on $Z_3$. As we have seen before, we may focus our attention on the case $(i, j) = (1, 1)$. Let us use the $(s, \zeta, v)$ coordinate system (3.2) at $\mathcal{A}^{1,1}$. Let $U$ be as in (3.1), and set $U' = \{\zeta \in U : \zeta_{1,1} = 0\}$. We define the coordinate projection
$$\pi_3 : \mathbb{C} \times \mathbb{C} \times U' \times V_{1,1} \to \mathbb{C} \times U \times V_{1,1}, \quad \pi(t, \tau, \xi, v) = (s, \zeta, v), \quad s = t,\ \zeta = (t\tau, \xi),\ v = v, \tag{4.1}$$
where the notation $\zeta = (t\tau, \xi)$ means that $\zeta_{1,1} = t\tau$, and $\zeta_{a,b} = \xi_{a,b}$ for all $(a, b) \ne (1, 1)$. Thus $\mathcal{B}^{1,1}$ is defined by the condition $\{t = 0\}$ in this coordinate chart. Composing the two coordinate projections, $Z_3 \to Z_2$ and $Z_2 \to M_q$, we have
$$\pi : (t, \tau, \xi, v) \mapsto \begin{pmatrix} t^2\tau & t\xi \\ t\xi & v \end{pmatrix} = x. \tag{4.2}$$
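From (4.2), we see that $\pi^{-1}(x) = (t, \tau, \xi, v)$, where
$$\tilde{x} = x/x_{k,\ell}, \quad v = \tilde{x}_{[1,1]}, \quad t = \tilde{x}_{1,r}, \quad \tau = \tilde{x}_{1,1}/t^2, \quad \xi_{1,j} = x_{1,j}/x_{1,r}, \quad 2 \le j \le q. \tag{4.3}$$
We will use the following homogeneity property of $K$. If $x \in M_q$, we let $\chi_t(x)$ denote the matrix obtained by multiplying the 1st row by $t$ and then the 1st column by $t$, so the (1,1) entry is multiplied by $t^2$. It follows that $\chi_t J \chi_t = J$ and $\chi_t I \chi_t = I$, so
$$K \begin{pmatrix} \tau & \xi \\ \xi & v \end{pmatrix} = \begin{pmatrix} \tau' & \xi' \\ \xi' & v' \end{pmatrix} \quad \text{implies} \quad K \begin{pmatrix} t^2\tau & t\xi \\ t\xi & v \end{pmatrix} = \begin{pmatrix} t^2\tau' & t\xi' \\ t\xi' & v' \end{pmatrix}. \tag{4.4}$$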
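The homogeneity property behind (4.4) says that $K$ commutes with the rescaling $\chi_t$ of the first row and column. A minimal sketch of this check in exact arithmetic follows; the helper names, the sample matrix and the value of $t$ are illustrative assumptions, and the check is made on the affine maps $I$ and $J$ themselves rather than on their projective lifts.

```python
from fractions import Fraction

def inverse(x):
    """Matrix inverse by Gauss-Jordan elimination over the rationals."""
    n = len(x)
    aug = [list(row) + [Fraction(int(r == c)) for c in range(n)]
           for r, row in enumerate(x)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [e / piv for e in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def K(x):
    """K = I o J: invert the entrywise reciprocal."""
    return inverse([[1 / e for e in row] for row in x])

def chi(t, x):
    """Multiply the first row by t, then the first column by t."""
    y = [list(row) for row in x]
    y[0] = [t * e for e in y[0]]
    for row in y:
        row[0] = t * row[0]
    return y

# illustrative sample data (hypothetical choice)
x = [[Fraction(v) for v in row] for row in ([1, 2, 3], [4, 5, 7], [2, 9, 11])]
t = Fraction(1, 7)

print(K(chi(t, x)) == chi(t, K(x)))
```

The reason is that $\chi_t$ is conjugation-like multiplication by $\mathrm{diag}(t, 1, \dots, 1)$ on both sides, which $J$ and $I$ each turn into $\chi_{1/t}$, so the two inversions cancel in the composition.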
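Proposition 4.1. For $1 \le i, j \le q$, we have $K_{Z_3}(\mathcal{B}^{i,j}) = \mathcal{B}^{j,i}$, and in particular, $\mathcal{B}^{i,j}$ is not exceptional.

Proof. As before, we may assume that $(i, j) = (1, 1)$. A point near $\mathcal{B}^{1,1}$ may be represented in the coordinate chart (4.2) as $\pi(t, \tau, \xi, v) = \begin{pmatrix} t^2\tau & t\xi \\ t\xi & v \end{pmatrix} = x$. We define $\tau'$, $\xi'$, and $v'$ by the condition $K \begin{pmatrix} \tau & \xi \\ \xi & v \end{pmatrix} = \begin{pmatrix} \tau' & \xi' \\ \xi' & v' \end{pmatrix}$, so $K(x)$ is given by the right hand side of (4.4). By (4.3), the coordinates $(t'', \tau'', \xi'', v'') = \pi^{-1}K(x)$ are
$$v'' = v'/v'_{k,\ell}, \quad t'' = t\,\xi'_{1,r}/v'_{k,\ell}, \quad \tau'' = \tau'\,(v'_{k,\ell}/\xi'_{1,r})^2.$$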
From this we see that $t'' \to 0$ as $t \to 0$, which means that $K_{Z_3}(\mathcal{B}^{1,1}) \subset \mathcal{B}^{1,1}$. And since $K$ is dominant on $\mathbb{P}(M_q)$, we see that $K_{Z_3}(\mathcal{B}^{1,1})$ is dense in $\mathcal{B}^{1,1}$.

Next we see how $\mathcal{A}^{i,j}$ maps under $K_{Z_3}$. A point near $\mathcal{A}^{1,1}$ may be written in coordinates (3.2) as $(s, \zeta, v)$. We write $K$ of this point in coordinates (4.1) as $(t, \tau, \xi, w)$.

Proposition 4.2. For $1 \le i, j \le q$, we have $K_{Z_3}(\mathcal{A}^{i,j}) \subset \mathcal{B}^{j,i}$. Further, $\frac{dt}{ds} \ne 0$ at generic points $(0, \zeta, v) \in \mathcal{A}^{i,j}$.
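Proof. Without loss of generality we assume $(i, j) = (1, 1)$. Let us define $x$ and $y$ as
$$x = \pi_2(s, \zeta, v) = \begin{pmatrix} s\zeta & s\zeta \\ s\zeta & v \end{pmatrix}, \qquad y = \hat{K}(x) = \Pi(x)\, C\!\left(\frac{1}{x}\right).$$
For $2 \le h, m \le q$ there are polynomials $a_{h,m}(\zeta, v)$ and $b_{h,m}(\zeta, v)$ such that
$$y_{1,1} = s^{2q-1} a_{1,1}(\zeta, v), \quad y_{1,m} = s^{2q-2} a_{1,m}(\zeta, v), \quad y_{h,m} = s^{2q-3} a_{h,m}(\zeta, v) + s^{2q-2} b_{h,m}(\zeta, v).$$
We have $t = s\, a_{1,r}/a_{k,\ell} + O(s^2)$, so $dt/ds \to a_{1,r}/a_{k,\ell}$ as $s \to 0$. Thus $dt/ds \ne 0$ at generic points of $\mathcal{A}^{1,1} = \{s = 0\}$. By (4.3), we see that
$$(t, \tau, \xi, w) \to (0,\ a_{1,1}a_{k,\ell}/a_{1,r}^2,\ a_{1,*}/a_{1,r},\ a_{[1,1]}/a_{k,\ell}) \in \mathcal{B}^{1,1}$$
as $s \to 0$.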
5. Picard Group Pic(Z)

$\mathrm{Pic}(\mathbb{P}(M_q)) = \langle H \rangle$ is generated by any hyperplane $H$. We write $Z := Z_3$ and recall that each time we blow up, the exceptional (blowup) fiber gives a new basis element of the Picard group. We will work with the following basis for $\mathrm{Pic}(Z)$:
$$\{H,\ \mathcal{R}^1,\ \mathcal{A}^{i,j},\ \mathcal{B}^{i,j},\ 1 \le i, j \le q\}. \tag{5.1}$$
Now consider the hypersurface $\Sigma_{i,j}$. Pulling this back under $\pi_1 : Z_1 \to \mathbb{P}(M_q)$, we find $\pi_1^* \Sigma_{i,j} = H_{Z_1} = \Sigma_{i,j}$, where $\Sigma_{i,j}$ on the right hand side denotes the strict transform $\pi^{-1}\Sigma_{i,j}$. The equality between the strict and total transforms follows because the indeterminacy locus $\mathcal{I}(\pi_1^{-1}) = R_1$ is not contained in $\Sigma_{i,j}$. On the other hand, if we define
$$T_{i,j} := \{(a, b) : a = i \text{ or } b = j\}, \tag{5.2}$$
then $\Sigma_{i,j}$ contains $A_{a,b}$ exactly when $(a, b) \in T_{i,j}$. Thus, pulling back under $\pi_2 : Z_2 \to Z_1$, we have
$$\pi_2^* \Sigma_{i,j} = H_{Z_2} = \Sigma_{i,j} + \sum_{(a,b) \in T_{i,j}} \mathcal{A}^{a,b}.$$
We will next pull this back under $\pi_3 : Z_3 \to Z_2$. For this, we note that $B_{a,b} \subset \mathcal{A}^{a,b}$, and in addition $B_{i,j} \subset \Sigma_{i,j}$. Rearranging our answer, we have:
$$\Sigma_{i,j} = H_Z - \mathcal{B}^{i,j} - \sum_{(a,b) \in T_{i,j}} \left(\mathcal{A}^{a,b} + \mathcal{B}^{a,b}\right). \tag{5.3}$$
Proposition 5.1. The class of $JR_{q-1}$ in $\mathrm{Pic}(Z)$ is given in the basis (5.1) by
$$JR_{q-1} = (q^2 - q)H - (q - 1)\mathcal{R}^1 - (2q - 3)\sum_{a,b} \mathcal{A}^{a,b} - (2q - 2)\sum_{a,b} \mathcal{B}^{a,b}. \tag{5.4}$$
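The coefficients $q^2 - q$ and $q - 1$ in (5.4) correspond to the degree of the polynomial $P(x) = \Pi(x)\det(1/x)$, where $\Pi(x)$ is the product of all entries, and to its order of vanishing in $s$ along the chart $x = \lambda \otimes \nu + sv$ of (2.3). The sketch below probes both numerically for $q = 3$; it is an illustration only, with hypothetical helper names and sample data, and it estimates the vanishing order from the ratio $P(s)/P(s/2)$, which should approach $2^{q-1}$.

```python
from fractions import Fraction
from math import prod

def det(m):
    """Determinant by Laplace expansion along the first row (fine for small q)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** c * m[0][c] * det([row[:c] + row[c + 1:] for row in m[1:]])
               for c in range(len(m)))

def P(x):
    """P(x) = Pi(x) * det(1/x), with Pi the product of all entries."""
    return prod(e for row in x for e in row) * det([[1 / e for e in row] for row in x])

def chart(s, lam, nu, v):
    """The chart (2.3): lam (x) nu + s*v, with v vanishing on its first row and column."""
    q = len(lam)
    return [[lam[i] * nu[j] + s * v[i][j] for j in range(q)] for i in range(q)]

# illustrative sample data (hypothetical choice), q = 3
q = 3
lam = [Fraction(1), Fraction(2), Fraction(3)]
nu = [Fraction(1), Fraction(5), Fraction(7)]
v = [[0, 0, 0], [0, 1, 2], [0, 3, 5]]

s = Fraction(1, 10 ** 9)
ratio = P(chart(s, lam, nu, v)) / P(chart(s / 2, lam, nu, v))
print(float(ratio))          # close to 2**(q-1), reflecting vanishing to order q - 1

x0 = chart(Fraction(1), lam, nu, v)
scaled = [[3 * e for e in row] for row in x0]
print(P(scaled) == 3 ** (q * q - q) * P(x0))   # homogeneity of degree q^2 - q
```

The same ratio test could be applied in the charts (3.2) and (4.2) to probe the coefficients $2q - 3$ and $2q - 2$, at the cost of choosing sample data for $\zeta$, $\tau$ and $\xi$.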
Proof. The polynomial $P(x) := \Pi(x)\det\left(\frac{1}{x}\right)$, analogous to (1.4), is irreducible and has degree $q^2 - q$. Thus $JR_{q-1} = \{P = 0\} = (q^2 - q)H$ in $\mathrm{Pic}(\mathbb{P}(M_q))$. Now we pull this back under the coordinate projection $\pi_1$ in (2.3). That is, we evaluate $P(x)$ for $x = \pi_1(s, \lambda, \nu, v)$. For $s = 0$ and generic $\lambda$, $\nu$, and $v$, the entries of $x = \lambda \otimes \nu + sv$ are nonzero, so $\Pi(x) \ne 0$. We will show $\det\left(\frac{1}{x}\right) = \alpha s^{q-1} + \cdots$, where $\alpha \ne 0$ for generic $\lambda$, $\nu$, and $v$. By (2.7), we must evaluate $\det(M)$ with $M = \lambda^{-1} \otimes \nu^{-1} + sv' + O(s^2)$. Now we may do elementary row and column operations, such as adding $-\lambda_j^{-1}\nu^{-1}$ to the $j$th row, which do not change the determinant. In this way, we see that $\det(M)$ is equal to
$$\det \begin{pmatrix} 1 & 0 \\ 0 & sv' + O(s^2) \end{pmatrix} = \alpha s^{q-1} + \cdots.$$
This means that
$$(q^2 - q)H = \pi_1^*(JR_{q-1}) = JR_{q-1} + (q - 1)\mathcal{R}^1 \in \mathrm{Pic}(Z_1).$$
Now we bring this back to $Z_2$ by pulling back under the projection $\pi_2$ defined in (3.2). In this case, we have $\Pi(\pi_2(s, \zeta, v)) = \alpha s^{2q-1} + \cdots$, where $\alpha = \alpha(\zeta, v) \ne 0$ for generic $\zeta$ and $v$. On the other hand, we have
$$\det \begin{pmatrix} s^{-1}\zeta^{-1} & s^{-1}\zeta^{-1} \\ s^{-1}\zeta^{-1} & v^{-1} \end{pmatrix} = s^{-2}\beta + \cdots,$$
and $\beta(\zeta, v) \ne 0$ at generic points. Thus $P(\pi_2(s, \zeta, v)) = cs^{2q-3} + \cdots$, which gives the coefficient $2q - 3$ for each $\mathcal{A}^{i,j}$:
$$(q^2 - q)H = JR_{q-1} + (q - 1)\mathcal{R}^1 + (2q - 3)\sum_{i,j} \mathcal{A}^{i,j} \in \mathrm{Pic}(Z_2).$$
Pulling back to $Z_3$ is similar, except that $\Pi(\pi_3(t, \tau, \xi, v)) = \alpha t^{2q} + \cdots$. Thus we obtain the coefficient $2q - 2$ for $\mathcal{B}^{i,j}$ in (5.4).

6. The Induced Map $K_Z^*$ on Pic(Z)

We define the pullback map on functions by composition $K_Z^*\varphi := \varphi \circ K_Z$. We may apply $K_Z^*$ to local defining functions of a divisor, and since $K_Z$ is well defined off the indeterminacy locus, which has codimension $\ge 2$, $K_Z^*$ induces a well-defined pullback map on $\mathrm{Pic}(Z)$.

Proposition 6.1. $K_Z^*$ maps the basis (5.1) according to:
$$\begin{aligned}
H &\mapsto (q^2 - q + 1)H - (q - 2)\mathcal{R}^1 - \sum_{a,b} \left((2q - 3)\mathcal{A}^{a,b} + (2q - 2)\mathcal{B}^{a,b}\right), \\
\mathcal{R}^1 &\mapsto (q^2 - q)H - (q - 1)\mathcal{R}^1 - \sum_{a,b} \left((2q - 3)\mathcal{A}^{a,b} + (2q - 2)\mathcal{B}^{a,b}\right), \\
\mathcal{A}^{i,j} &\mapsto H - \mathcal{B}^{j,i} - \sum_{(a,b) \in T_{j,i}} \left(\mathcal{A}^{a,b} + \mathcal{B}^{a,b}\right), \\
\mathcal{B}^{i,j} &\mapsto \mathcal{A}^{j,i} + \mathcal{B}^{j,i}.
\end{aligned} \tag{6.1}$$

Proof. Let us start with $\mathcal{R}^1$. By § 2, $K_Z|_{JR_{q-1}}$ is dominant as a map to $\mathcal{R}^1$. Since $K_Z$ is birational, it is a local diffeomorphism at generic points of $JR_{q-1}$. Thus we have $K_Z^*(\mathcal{R}^1) = JR_{q-1}$, so the second line in (6.1) follows from Proposition 5.1.
∗ (Ai, j ) = , and Similarly, since K Z |i, j is a dominant map to A j,i , we have K Z j,i the third line of (6.1) follows from (5.3). −1 i, j ∗ B i, j = In the case of Bi, j , we know from § 4 that K Z B = A j,i ∪ B j,i . Thus K Z j,i j,i λA + μB for some integer weights λ and μ. Again, since K Z is birational, and j,i K Z |Bi, j is a dominant map to B , we have μ = 1. Proposition 4.2 gives us λ = 1. Finally, set h(x) = i, j ai, j xi, j , and let H = {h = 0} be a hyperplane. The pull back is given by the class of {h Kˆ (x) = 0} = i, j ai, j Kˆ i, j (x) = 0, where Kˆ is given by (1.4). Pulling back h is similar to the situation in Proposition 5.1, where we pulled back the function P(x). The difference is that instead of working with det x1 we are working with all of the (q − 1) × (q − 1) minors. By Proposition 1.1, we have K ∗ H = (q 2 − q + 1)H ∈ Pic(P(Mq )). Next we will move up to Z1 by pulling back under π1 and finding the multiplicity of R1 . We consider h Kˆ π1 (s, λ, ν, v), and we recall the matrix M from the proof of Proposition 5.1. We see that each (q − 1) × (q − 1) minor of M is either O(s q−1 ) or O(s q−2 ). Thus for a generic hyperplane, the order of vanishing is q − 2, so we have
\[(q^2 - q + 1)H = K^*H + (q - 2)R_1 \in \mathrm{Pic}(Z_1).\]

Next, to move up to $Z_2$, we look at the order of vanishing of $h\hat K\pi_2(s, \zeta, v)$ in s. Again $\Pi(\pi_2(s, \zeta, v)) = \alpha s^{2q-1} + \cdots$. The (q − 1) × (q − 1) minors of
\[\begin{pmatrix} s^{-1}\zeta^{-1} & s^{-1}\zeta^{-1} \\ s^{-1}\zeta^{-1} & v^{-1} \end{pmatrix}\]
which are most singular at s = 0 behave like $s^{-2}\beta + \cdots$. Thus for generic coefficients $a_{i,j}$ we have vanishing to order 2q − 3 in s, and so 2q − 3 is the coefficient for each $A^{i,j}$ as we pull back to $\mathrm{Pic}(Z_2)$. Coming up to $Z_3 = Z$, we pull back under $\pi_3$, and the calculation of the multiplicity of $B^{i,j}$ is similar. This gives the first line in (6.1).

Proposition 6.2. The characteristic polynomial of the transformation (6.1) is
\[P(\lambda)\,Q(\lambda)^{q-1}\,(\lambda - 1)^{q^2-q+2}\,(\lambda + 1)^{q^2-3q+2},\]
where $P(\lambda) = \lambda^2 - (q^2 - 4q + 2)\lambda + 1$ and $Q(\lambda) = (\lambda^2 + 1)^2 - (q - 2)^2\lambda^2$.

Proof. We will exhibit the invariant subspaces of Pic(Z) which correspond to the various factors of the characteristic polynomial. First, we set $A := \sum A^{k,\ell}$ and $B := \sum B^{k,\ell}$, where we sum over all k and ℓ, and we set $S_1 = \langle H, R_1, A, B\rangle$. By (6.1), $S_1$ is $K_Z^*$-invariant, and the characteristic polynomial of $K_Z^*|_{S_1}$ is seen to be $P(\lambda)(\lambda - 1)^2$.

Next, if i < j, then we set $\alpha_{i,j} = A^{i,i} + A^{j,j} - (A^{i,j} + A^{j,i})$, and similarly for $\beta_{i,j}$, using the $B^{k,\ell}$. Then by (6.1), $S_{i,j} := \langle\alpha_{i,j}, \beta_{i,j}\rangle$ is invariant, and the characteristic polynomial of $K_Z^*|_{S_{i,j}}$ is $(\lambda - 1)^2$.

Similarly, if i < j < k, we set $\alpha_{i,j,k} = A^{i,j} + A^{j,k} + A^{k,i} - (A^{j,i} + A^{k,j} + A^{i,k})$ and define $\beta_{i,j,k}$ similarly. Then the 2-dimensional subspace $S_{i,j,k} := \langle\alpha_{i,j,k}, \beta_{i,j,k}\rangle$ is invariant, and the characteristic polynomial of $K_Z^*|_{S_{i,j,k}}$ is $(\lambda + 1)^2$.

Finally, for each i, we consider the row and column sums
\[Ar_i = q\sum_j A^{i,j} - A, \qquad Ac_j = q\sum_i A^{i,j} - A,\]
and we make the analogous definition for $Br_i$ and $Bc_j$. The 4-dimensional subspace $\langle Ar_i, Ac_i, Br_i, Bc_i\rangle$ is invariant and yields the factor Q(λ). These invariant subspaces span Pic(Z), and the product of these factors gives the characteristic polynomial stated above.
Proof of Theorem. The spectral radius of $K_Z^*$ is the modulus of the largest root of the characteristic polynomial, which is given in Proposition 6.2. By inspection, the largest root of the characteristic polynomial is the largest root of P(λ). The spectral radius of $K_Z^*$ is an upper bound for δ(K). On the other hand, it was shown in [BV] that this same number is also a lower bound for δ(K), so the Theorem is proved.
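The "by inspection" step can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, and the degree check assumes that the basis (5.1) consists of H, R1 and the q² classes A^{i,j} and q² classes B^{i,j}, so that Pic(Z) has rank 2q² + 2.

```python
import cmath


def quadratic_roots(b, c):
    """Return the two roots of x^2 + b*x + c (complex arithmetic)."""
    disc = cmath.sqrt(b * b - 4 * c)
    return [(-b + disc) / 2, (-b - disc) / 2]


def largest_root_of_P(q):
    """Largest modulus of a root of P(x) = x^2 - (q^2 - 4q + 2) x + 1."""
    m = q * q - 4 * q + 2
    return max(abs(r) for r in quadratic_roots(-m, 1))


def spectral_radius(q):
    """Largest modulus among all roots of the polynomial in Proposition 6.2."""
    m = q * q - 4 * q + 2
    roots = quadratic_roots(-m, 1)
    # Q(x) = (x^2 + 1)^2 - (q - 2)^2 x^2 factors as
    # (x^2 - (q - 2) x + 1) * (x^2 + (q - 2) x + 1).
    roots += quadratic_roots(-(q - 2), 1)
    roots += quadratic_roots(q - 2, 1)
    roots += [1, -1]  # the factors (x - 1) and (x + 1)
    return max(abs(r) for r in roots)


def total_degree(q):
    """Degree of P * Q^(q-1) * (x-1)^(q^2-q+2) * (x+1)^(q^2-3q+2)."""
    return 2 + 4 * (q - 1) + (q * q - q + 2) + (q * q - 3 * q + 2)
```

For q = 3 and q = 4 every root has modulus 1, which matches the exceptional cases, and for q ≥ 5 the maximum is attained at a root of P(λ).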
Remark. Let us conclude with a discussion of the exceptional cases q = 3 and 4. The proof above shows that $\delta(K_q) = 1$ if q = 3 or 4. In fact one can show that $(K_Z^*)^n = (K_Z^n)^*$ for all q. To determine the degree growth, we need to know what $(K_Z^*)^n$ does to H, and so we consider the restriction of $K_Z^*$ to the subspace $S_1$ which is defined in the proof of Proposition 6.2. When q = 3, the non-diagonal part of the Jordan canonical form of $K_Z^*|_{S_1}$ is a 2 × 2 Jordan block with eigenvalue 1. Thus the degree of $K^n$ grows linearly in n. When q = 4, $K_Z^*|_{S_1}$ is a 4 × 4 Jordan block with eigenvalue 1, and in this case the degree of $K^n$ grows like the cube of n.

References
[AMV1] Anglès d’Auriac, J.C., Maillard, J.M., Viallet, C.M.: A classification of four-state spin edge Potts models. J. Phys. A 35, 9251–9272 (2002)
[AMV2] Anglès d’Auriac, J.C., Maillard, J.M., Viallet, C.M.: On the complexity of some birational transformations. J. Phys. A 39(14), 3641–3654 (2006)
[BK1] Bedford, E., Kim, K.-H.: On the degree growth of birational mappings in higher dimension. J. Geom. Anal. 14, 567–596 (2004)
[BK2] Bedford, E., Kim, K.-H.: Degree growth of matrix inversion: birational maps of symmetric, cyclic matrices. Disc. Contin. Dyn. Syst. 21(4), 977–1013 (2008)
[BV] Bellon, M., Viallet, C.M.: Algebraic entropy. Commun. Math. Phys. 204, 425–437 (1999)
[BHM] Boukraa, S., Hassani, S., Maillard, J.-M.: Noetherian mappings. Physica D 185(1), 3–44 (2003)
[BM] Boukraa, S., Maillard, J.-M.: Factorization properties of birational mappings. Physica A 220, 403–470 (1995)
[DF] Diller, J., Favre, C.: Dynamics of birational maps of surfaces. Amer. J. Math. 123(6), 1135–1169 (2001)
[PAM] Preissmann, E., Anglès d’Auriac, J.-Ch., Maillard, J.-M.: Birational mappings and matrix subalgebra from the chiral Potts model. J. Math. Phys. 50, 013302 (2009)
[T] Truong, T.: Degree complexity of matrix inversion: symmetric case. Preprint
Communicated by G. Gallavotti
Commun. Math. Phys. 298, 369–405 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1070-3
PV Cohomology of the Pinwheel Tilings, their Integer Group of Coinvariants and Gap-Labeling Haïja Moustafa Univ. Blaise Pascal, Clermont-Ferrand, France. E-mail: [email protected] Received: 28 July 2009 / Accepted: 4 February 2010 Published online: 24 June 2010 – © Springer-Verlag 2010
Abstract: In this paper, we first recall how the “hull” of the pinwheel tiling can be seen as an inverse limit of simplicial complexes (Anderson and Putnam in Ergod Th Dynam Sys 18:509–537, 1998) and we then adapt the PV cohomology introduced in Savinien and Bellissard (Ergod Th Dynam Sys 29:997–1031, 2009) to define it for pinwheel tilings. We then prove that this cohomology is isomorphic to the integer Čech cohomology of the quotient of the hull by $S^1$, which lets us prove that the top integer Čech cohomology of the hull is in fact the integer group of coinvariants of the canonical transversal Ξ of the hull. The gap-labeling for pinwheel tilings is then proved and we end this article by an explicit computation of this gap-labeling, showing that $\mu^t(C(\Xi, \mathbb{Z})) = \frac{1}{264}\mathbb{Z}\left[\frac{1}{5}\right]$.

Contents
1. Introduction
2. Reminders
2.1 Pinwheel tiling and continuous hull
2.2 The canonical transversal
3. PV Cohomology of Pinwheel Tilings and the Integer Group of Coinvariants
3.1 The pinwheel prototile space
3.2 $\Omega/S^1$ as an inverse limit of supertile spaces
3.3 The PV cohomology
3.3.1 Oriented simplicial complexes and PV cohomology
3.3.2 Proof of Theorem 3.10
3.4 The integer group of coinvariants of the pinwheel tiling
4. Proof of the Gap-Labeling for Pinwheel Tilings and Explicit Computations
4.1 Proof of the gap-labeling for pinwheel tilings
4.2 Explicit computations
5. Conclusion
1. Introduction

In this paper, we study some dynamical properties of the pinwheel tiling of the plane (the (1, 2)-pinwheel tiling introduced by Conway and Radin in [Rad94] and [Rad95]). Some remarks on possible extensions of this result will be made in the conclusion.

The study of tilings has gained in intensity since the discovery by physicists, in 1984, of a new material whose atomic distribution had symmetries forbidden for crystals (see [SBGC84]). The atomic distribution was not that of a crystal but was very close to it: it was not periodic but nevertheless showed some order; it was “quasiperiodic”. Such materials were called quasicrystals. Quickly, mathematicians modeled such solids by aperiodic tilings, and the physical properties of the material are closely related to the geometry of the tiling. This link was established by Jean Bellissard in [Bel82,Bel86,Bel92] and it is the content of the so-called gap-labeling conjecture.

To state this conjecture, we need some definitions which will be given in more detail in Sect. 2 of this paper. First, to every tiling of the euclidean space, we can associate a topological space Ω, called the continuous hull or the tiling space, which encodes many properties of our tiling. This space is provided with an action of a subgroup G of the isometries of the euclidean space (in the statement of the gap-labeling conjecture, G will be the translations $\mathbb{R}^n$) and thus we can consider the $C^*$-algebra associated to such a dynamical system (Ω, G), which is the crossed product $C(\Omega) \rtimes G$.

Next, we assume that Ω is provided with an ergodic G-invariant probability measure μ which gives rise to a trace $\tau^\mu$ on $C(\Omega) \rtimes G$ and hence to a linear map $\tau^\mu_* : K_0(C(\Omega) \rtimes G) \to \mathbb{R}$ from the K-theory group of this $C^*$-algebra to the real numbers. The gap-labeling conjecture then predicts the image of $K_0(C(\Omega) \rtimes G)$ under this linear map. Moreover, the hull Ω contains a Cantor set Ξ, called the “canonical transversal” of Ω, which is a sort of discretisation of the hull.

The measure μ then induces a measure $\mu^t$ on Ξ and the gap-labeling conjecture then expresses the link between the image of $K_0(C(\Omega) \rtimes \mathbb{R}^n)$ under the trace and the image under $\mu^t$ of the integer valued functions on Ξ:

Conjecture ([Bel92,BHZ00]).
\[\tau^\mu_*\left(K_0(C(\Omega) \rtimes \mathbb{R}^n)\right) = \mu^t(C(\Xi, \mathbb{Z})),\]
where $C(\Xi, \mathbb{Z})$ is the space of continuous functions on Ξ with values in $\mathbb{Z}$.

Since then, much work has been done to prove this conjecture. First, the Pimsner-Voiculescu exact sequence gave the answer in dimension 1 in [Bel92] and later, the conjecture was proved in dimension 2 by van Elst in [vE94] by iterating this exact sequence. Using a spectral sequence, Bellissard, Kellendonk and Legrand proved the conjecture in dimension 3 in [BKL01]. In 2002, a general proof finally appeared independently in several papers: by Bellissard, Benedetti and Gambaudo in [BBG06], by Benameur and Oyono-Oyono in [BOO02] and by Kaminker and Putnam in [KP03].

The proof in [BOO02] uses an index theorem for foliated spaces due to Alain Connes (see [Con79]) to link the analytical part $\tau^\mu_*(K_0(C(\Omega) \rtimes \mathbb{R}^n))$ to a topological part $Ch_\tau(K_n(C(\Omega)))$ which lies in $H^*_\tau(\Omega)$, the longitudinal cohomology group of Ω ($Ch_\tau$ is the longitudinal Chern character (see [MS06])), this part being more computable.

By analogy, the gap labeling of the pinwheel tiling can be formulated as the computation of $\tau^\mu_*\left(K_0(C(\Omega) \rtimes \mathbb{R}^2 \rtimes S^1)\right)$ in terms of the $\mathbb{Z}$-module of “patch frequencies”
$\mu^t(C(\Xi, \mathbb{Z}))$. We have adapted the method of Benameur and Oyono-Oyono in [Mou] for pinwheel tilings to obtain a similar result:

Theorem. If Ω is the continuous hull of a pinwheel tiling, μ an ergodic invariant probability measure on Ω and $\mu^t$ the induced measure on the canonical transversal Ξ of Ω, then:
\[K_0\left(C(\Omega) \rtimes \mathbb{R}^2 \rtimes S^1\right) \simeq H \oplus \mathbb{Z} \oplus \mathbb{Z}^6,\]
\[\tau^\mu_*(\mathbb{Z}) = 0, \qquad \tau^\mu_*(\mathbb{Z}^6) \subset \mu^t(C(\Xi, \mathbb{Z})), \qquad \tau^\mu_*(H) \subset \left\langle Ch_\tau(K_1(C(\Omega))), [C_{\mu^t}]\right\rangle,\]
where H is a subgroup of $K_0(C(\Omega) \rtimes \mathbb{R}^2 \rtimes S^1)$, $[C_{\mu^t}] \in H^\tau_3(\Omega)$ is the Ruelle-Sullivan current associated to the transverse measure $\mu^t$ (see [MS06]) and $\langle\ ,\ \rangle$ is the pairing of the longitudinal cohomology with the longitudinal homology.

As $\tau^\mu_*(\mathbb{Z})$ and $\tau^\mu_*(\mathbb{Z}^6)$ are already included in the module of patch frequencies, it suffices to study the last formula of the theorem. Since the longitudinal Chern character factorizes through the usual Chern character in Čech cohomology (see Sect. 3) and since the Ruelle-Sullivan current only sees the $H^3_\tau(\Omega)$ part, we will study, in this paper, the top integer Čech cohomology group of Ω.

The aim of this paper is to study carefully the image under the Ruelle-Sullivan current of the top dimensional longitudinal Chern character and to relate it to the module of patch frequencies in order to solve completely the gap-labeling conjecture for the pinwheel tiling.

We must note that the cohomology groups of the pinwheel tiling were computed by Barge, Diamond, Hunton and Sadun in [BDHS]. They proved that
\begin{align*}
\check H^3(\Omega; \mathbb{Z}) &= \mathbb{Z}\left[\tfrac{1}{5}\right] \oplus \mathbb{Z}\left[\tfrac{1}{3}\right]^2 \oplus \mathbb{Z}^5 \oplus \mathbb{Z}/(2\mathbb{Z}),\\
\check H^2(\Omega; \mathbb{Z}) &= \mathbb{Z}\left[\tfrac{1}{5}\right] \oplus \mathbb{Z}\left[\tfrac{1}{3}\right]^2 \oplus \mathbb{Z}^6 \oplus (\mathbb{Z}/(2\mathbb{Z}))^5,\\
\check H^1(\Omega; \mathbb{Z}) &= \mathbb{Z}^2,\\
\check H^0(\Omega; \mathbb{Z}) &= \mathbb{Z}.
\end{align*}
Even though these groups were computed, the author was not able to use these results to compute the gap-labeling explicitly. In fact, it is not easy to see the generators of the top degree cohomology and thus to compute their images under the map $r_*$, which sends this cohomology to the longitudinal cohomology, and then under the Ruelle-Sullivan current.

Another important remark is that, even if the study of the pinwheel tiling is interesting from the mathematical point of view, no example of a real material with the properties of the pinwheel model has been found so far in material science.

The structure of this paper is the following: in Sect. 2 we recall some classical definitions in tiling theory.
In particular, we recall the construction of the pinwheel tiling given by Radin in [Rad94] and then we introduce the notion of continuous hull of a pinwheel tiling, enumerating its properties. We next turn to the definition of the canonical transversal of the hull, which allows us to see the continuous hull of a pinwheel tiling as a foliated space in a well known way.

In Sect. 3, a trick shows that $\check H^3(\Omega; \mathbb{Z})$ is isomorphic to $\check H^2(\Omega/S^1; \mathbb{Z})$. We then study the top integer Čech cohomology of $\Omega/S^1$. To do this, we use results and ideas developed in several papers ([AP98,SB09]). First, we use the idea, initiated by Anderson and Putnam in their paper [AP98], to see $\Omega/S^1$ as an inverse limit of homeomorphic simplicial complexes. Specifically, as the pinwheel tiling isn’t “forcing its border”, we will use the collared version of their construction. This allows us to use simplicial methods to compute $\check H^2(\Omega/S^1; \mathbb{Z})$. This was extended by Savinien and Bellissard in [SB09] to compute the cohomology of tilings in terms of the PV cohomology of its prototile space. In this section, we adapt their method to prove
\[\check H^*\left(\Omega/S^1; \mathbb{Z}\right) \simeq H^*_{PV}\left(B_0^c; C(\Xi_\Delta, \mathbb{Z})\right),\]
where $H^*_{PV}$ is the PV cohomology and $\Xi_\Delta$ is a new transversal. The interesting point in this new cohomology is that the cochains of degree 2 are in fact classes of continuous functions with integer values on the transversal $\Xi_\Delta^2$ (a subspace of $\Xi_\Delta$) of the hull, which is a first step toward the module of “patch frequencies” of the pinwheel tiling, related to the continuous functions with integer values on the canonical transversal.

The key point established in this section is that the top PV cohomology of the pinwheel tiling is isomorphic to the integer group of coinvariants of the transversal $\Xi_\Delta^2$ (this notion of an integer group of coinvariants is given in this section; it is the quotient of the continuous functions on $\Xi_\Delta^2$ by the “local” coinvariants, in a similar way to the definition given in [Kel97]):

Theorem 3.16.
\[H^2_{PV}\left(B_0^c; C(\Xi_\Delta, \mathbb{Z})\right) \cong C(\Xi_\Delta^2, \mathbb{Z})/H_{\Xi_\Delta^2},\]
where $H_{\Xi_\Delta^2}$ is a subgroup of $C(\Xi_\Delta^2, \mathbb{Z})$ such that for all $h \in H_{\Xi_\Delta^2}$, $\mu^t_2(h) = 0$, for the measure $\mu^t_2$ induced by μ on $\Xi_\Delta^2$,

which leads to the important corollary that the top integer Čech cohomology group of the hull is isomorphic to the integer group of coinvariants:

Corollary 3.19. The top integer Čech cohomology of the hull is isomorphic to the integer group of coinvariants of the canonical transversal:
\[\check H^3(\Omega; \mathbb{Z}) \simeq C(\Xi, \mathbb{Z})/H_\Xi.\]

In Sect. 4, this theorem, together with the study of the image under the Ruelle-Sullivan map of the top cohomology of Ω, gives the desired gap-labeling of the pinwheel tiling:

Theorem 4.1. If T is a pinwheel tiling, Ω = Ω(T) its hull provided with an invariant ergodic probability measure μ and Ξ its canonical transversal provided with the induced measure $\mu^t$, we have:
\[\tau^\mu_*\left(K_0(C(\Omega) \rtimes \mathbb{R}^2 \rtimes S^1)\right) = \mu^t(C(\Xi, \mathbb{Z})).\]
Fig. 1. Substitution of the pinwheel tiling
We finally end Sect. 4 by an explicit computation of this image. Viewing $C(\Xi, \mathbb{Z})$ as a direct limit and exhausting the collared prototiles of the pinwheel tiling, we then prove, thanks to a result in [Eff81], that:
\[\tau^\mu_*\left(K_0(C(\Omega) \rtimes \mathbb{R}^2 \rtimes S^1)\right) = \frac{1}{264}\mathbb{Z}\left[\frac{1}{5}\right].\]
This result shows that the gap-labeling of the pinwheel tiling is given by the Z-module of its “patch frequencies”. A natural question is whether this is a general fact.
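To make the stated module concrete, here is a small sketch of a membership test for (1/264)·Z[1/5]; the function name is hypothetical and the snippet only illustrates the arithmetic of the module, not the computation carried out in Sect. 4. A rational x belongs to it exactly when 264·x, written in lowest terms, has a power of 5 as denominator.

```python
from fractions import Fraction


def in_gap_label_module(x):
    """Return True if the rational x lies in (1/264) * Z[1/5]."""
    # Fraction keeps 264*x in lowest terms, so only the denominator matters.
    d = (Fraction(x) * 264).denominator
    # Strip all factors of 5; membership holds iff nothing else remains.
    while d % 5 == 0:
        d //= 5
    return d == 1
```

For example, `in_gap_label_module(Fraction(1, 264))` holds, while `Fraction(1, 7)` is rejected because 264/7 has denominator 7.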
2. Reminders

2.1. Pinwheel tiling and continuous hull. A tiling of the plane is a countable family $P = \{t_1, t_2, \ldots\}$ of non empty compact subsets $t_i$ of $\mathbb{R}^2$, called tiles (each tile being homeomorphic to the unit ball), such that:
• $\bigcup_{i\in\mathbb{N}} t_i = E^2$, where $E^2$ is the euclidean plane with a fixed origin O;
• Tiles meet each other only on their border;
• Tiles’ interiors are pairwise disjoint.

We are interested in the special case where there exists a finite family of tiles $\{p_1, \ldots, p_n\}$, called prototiles, such that each tile $t_i$ is the image of one of these prototiles under a rigid motion (i.e. a direct isometry of the plane). In fact this paper will focus on the particular tiling called pinwheel tiling or (1, 2)-pinwheel tiling, which is obtained by a substitution explained below. Our construction of a pinwheel tiling is based on the construction made by Charles Radin in [Rad94]. It is a tiling of the plane obtained by the substitution described in Fig. 1.
Fig. 2. Construction of a pinwheel tiling
This tiling is constructed from two prototiles, the right triangle in Fig. 1(a) with sides 1, 2 and $\sqrt{5}$ and its mirror image. To obtain this tiling, we begin from the right triangle with the following vertices in the plane: (0, 0), (2, 0) and (2, 1). This tile and its reflection are called supertiles of level 0 or 0-supertiles. We will next define 1-supertiles as follows: take the right triangle with vertices (−2, 1), (2, −1) and (3, 1) and take the decomposition of Fig. 1(b). This 1-supertile is thus decomposed into five 0-supertiles, which are isometric copies of the first tile, with the beginning tile in its center (see Fig. 2(b)). We next repeat this process by mapping this 1-supertile into a 2-supertile with vertices (−5, 5), (1, −3) and (5, 0) (see Fig. 2(c)). Including this 2-supertile in a 3-supertile with correct orientation and so on, this process leads to the desired pinwheel tiling T.

We will now attach to this tiling a topological space reflecting the combinatorial properties of the tiling in topological and dynamical properties of this space. For this, we observe that the direct isometries of the plane act on the euclidean plane $E^2$ where we have fixed the origin O. Direct isometries E2 = $\mathbb{R}^2 \rtimes SO(2)$ thus act naturally on our tiling T on the right. If $R_\theta$ is the rotation about the origin with angle θ and $s \in \mathbb{R}^2$, $T.(s, R_\theta) := R_{-\theta}(T - s)$. We will also denote $(s, R_\theta)$ by (s, θ).

Definition 2.1. A patch is a finite union of tiles of a tiling. A tiling T is of finite E2-type or of Finite Local Complexity (FLC) if for any R > 0, there is only a finite number of patches in T of diameter less than R up to direct isometries. A tiling T of finite E2-type is E2-repetitive if, for any patch A in T, there is R(A) > 0 such that any ball of radius R(A) intersects T in a patch containing an E2-copy of A.

The tiling T is of finite E2-type, E2-repetitive and non periodic for translations (see [Pet05]).
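The supertile vertices quoted in the construction above can be sanity-checked directly. The following sketch (with hypothetical helper names) verifies that the level-k triangle has sides $\sqrt{5}^k \cdot (1, 2, \sqrt{5})$ and area $5^k$, which is consistent with each supertile being decomposed into five supertiles of the previous level.

```python
import math

# Vertices of the 0-, 1- and 2-supertiles as given in the text.
SUPERTILES = {
    0: [(0, 0), (2, 0), (2, 1)],
    1: [(-2, 1), (2, -1), (3, 1)],
    2: [(-5, 5), (1, -3), (5, 0)],
}


def side_lengths(triangle):
    """Sorted euclidean side lengths of a triangle given by three vertices."""
    a, b, c = triangle
    return sorted(math.dist(p, q) for p, q in ((a, b), (b, c), (c, a)))


def area(triangle):
    """Triangle area via the shoelace formula."""
    (x1, y1), (x2, y2), (x3, y3) = triangle
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2


def is_scaled_prototile(level, tol=1e-9):
    """Check that the level-k supertile is the prototile scaled by sqrt(5)^k."""
    scale = math.sqrt(5) ** level
    expected = [scale, 2 * scale, math.sqrt(5) * scale]
    return all(abs(s - e) < tol
               for s, e in zip(side_lengths(SUPERTILES[level]), expected))
```

Each substitution step therefore inflates lengths by $\sqrt{5}$ and areas by 5.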
To attach a topological space to T, we define a metric on T.E2: if $T_1$ and $T_2$ are two tilings in T.E2, we define
\[A = \left\{ \varepsilon \in \left[0, \tfrac{1}{\sqrt{2}}\right] \;\middle|\; \exists\, s, s' \in B^2_\varepsilon(0),\ \theta, \theta' \in B^1_\varepsilon(0) \text{ s.t. } T_1.(s, \theta) \cap B_{1/\varepsilon}(O) = T_2.(s', \theta') \cap B_{1/\varepsilon}(O) \right\},\]
where $B_{1/\varepsilon}(O)$ is the euclidean ball centered at O with radius $1/\varepsilon$ and $B^i_\varepsilon(0)$ are the euclidean balls in $\mathbb{R}^i$ centered at 0 and with radius ε (i.e. we consider direct isometries near Id). Then, define:
\[d(T_1, T_2) = \begin{cases} \inf A & \text{if } A \neq \emptyset, \\ \frac{1}{\sqrt{2}} & \text{else.} \end{cases}\]
d is a bounded metric on T.E2. For this topology, a base of neighborhoods is defined by: two tilings $T_1$ and $T_2$ are close if, up to a small direct isometry, they coincide on a large ball around the origin.

The topology defined here works because the tilings considered are of finite E2-type. There exist other (equivalent) topologies that one can put on this space. The topology thus obtained is metrizable but none of the metrics that can be defined to produce the topology is canonical. A more canonical way to define the topology was given in [BHZ00]. The problem of non-uniqueness of the metric has been investigated in [PB09].

Definition 2.2. The continuous hull of T is then the completion of (T.E2, d) and will be denoted Ω(T).

Let us enumerate some well known properties of this continuous hull:

Property 2.3 ([BBG06,BHZ00,BG03,KP00,LP03,Rad94]).
• Ω(T) is formed by finite E2-type, E2-repetitive and non periodic (for translations) tilings and each tiling of Ω(T) has the same patches as T.
• Ω(T) is a compact space since T is of finite E2-type.
• Each tiling in Ω(T) is uniquely tiled by n-supertiles, for all $n \in \mathbb{N}$.
• The dynamical system (Ω(T), E2) is minimal since T is repetitive, i.e. each orbit under direct isometries is dense in Ω(T).

The last property of Ω(T) allows us to write Ω without mentioning the tiling T (in fact, if $T' \in \Omega(T)$, $\Omega(T') = \Omega(T)$).

Definition 2.4. Any tiling in Ω is called a pinwheel tiling.

Remark. We can easily see that our continuous hull is the compact space $X_\varphi$ defined by Radin and Sadun in [RS98].

2.2. The canonical transversal. In this section, we will construct a Cantor transversal for the action of E2 and we show that this transversal gives the local structure of a foliated space. For this, we fix a point in the interior of the two prototiles of the pinwheel tiling. This, in fact, gives for any tiling $T_1$ in Ω (i.e. constructed by these two prototiles), a punctuation of the plane denoted $T_1^{punct}$.
Define then $\Omega_0$ to be the set of every tiling $T_1$ of Ω such that $O \in T_1^{punct}$. The canonical transversal is the space $\Omega_0/SO(2)$. We can identify this space with a subspace of Ω by constructing a continuous section $s : \Omega_0/SO(2) \to \Omega$. To obtain such a section, we fix an orientation of the two prototiles of our tilings once and for all. Hence when we consider a patch of a tiling in the transversal $\Omega_0$, there is only one orientation of this patch where the tile containing the origin has the orientation chosen for the prototiles. Let then $[\omega] \in \Omega_0/SO(2)$; there is only one $\theta \in [0; 2\pi[$ such that the tile in $R_\theta(\omega)$ containing the origin has the good orientation. We define $s([\omega]) := R_\theta(\omega)$. s is well defined because θ depends on the representative ω chosen but $R_\theta(\omega)$ does not. $s : \Omega_0/SO(2) \to s(\Omega_0/SO(2))$ is then a bijection. We easily see that s is continuous and thus it is a homeomorphism from the canonical transversal onto a compact subspace Ξ of Ω. We also call this space the canonical transversal. We can see Ξ as the set of all the tilings $T_1$ in Ω with the origin on the punctuation of $T_1$ and with the tile containing the origin in the orientation chosen for the prototiles. We then have:

Proposition 2.5 ([BG03]). The canonical transversal is a Cantor space. A base of neighborhoods is obtained as follows: consider $T \in \Xi$ and A a patch around the origin in T, then $U(T, A) = \{T_1 \in \Xi \mid T_1 = T \text{ on } A\}$ is a closed and open set in Ξ, called a clopen set.

Before defining the foliated structure on Ω, we must study the rotations which can fix tilings in Ω. In pinwheel tilings, we can sometimes find regions tiled by supertiles of any level and so we introduce the following definition:

Definition 2.6. A region of a tiling which is tiled by n-supertiles for all $n \in \mathbb{N}$ is called an infinite supertile or supertile of infinite level.

If a ball in a tiling $T_1$ fails to lie in any supertile of any level n, then $T_1$ is tiled by two or more supertiles of infinite level, with the offending ball straddling a boundary. We can, in fact, construct a pinwheel tiling with two half-planes as infinite supertiles as follows: consider the rectangle consisting of two (n − 1)-supertiles in the middle of an n-supertile. For each n ≥ 1, orient this rectangle with its center at the origin and its diagonal on the x-axis, and fill out the rest of a (non-pinwheel) tiling $T_n$ by periodic extension. By compactness this sequence has a convergent subsequence, which will be a pinwheel tiling and which will consist of two infinite supertiles (this example comes from [RS98]).
Note that the boundary of an infinite supertile must either be a line or have a single vertex, since it is tiled by supertiles of all levels. We call such a line a fault line.

Lemma 2.7. If (s, θ) fixes a pinwheel tiling T, then $\theta \in \{0, \pi\} \bmod (2\pi)$. Moreover, if θ = 0 then s = 0. In other terms, translations can’t fix a pinwheel tiling.

Proof. Let us consider the different cases:
1. First, if the tiling T fixed by (s, θ) has no fault line (i.e. has no infinite supertile), then s = 0 and θ = 0 (mod(2π)). Indeed, let $x \in E^2$ be such that $\overrightarrow{Ox} = s$; then O and x are in the interior of an m-supertile since there are no infinite supertiles (see p. 29 in [RS98]). As no direct isometry fixes our prototiles, s and θ must be zero.
2. Let us see the case in which T has some infinite supertiles. By [RS98] p. 30, the number of infinite supertiles in T is bounded by a constant K (in fact for pinwheel tilings, we can take $K = \frac{2\pi}{\alpha}$, where α is the smallest angle in the prototiles). Thus, T doesn’t contain more than K infinite supertiles and in fact, it has only a finite number of fault lines. This will give us the result. Indeed, since (s, θ) fixes T, if F is a fault line, (s, θ) sends it to another fault line $F_1$ in T and thus, $R_\theta$ sends F to a line parallel to $F_1$. As there is only a finite number of fault lines in T, there are $M \in \mathbb{N}^*$, $m \in \mathbb{Z}^*$ such that Mθ = 2πm. If we now use results obtained in [RS98] p. 32, θ must be in the group of relative orientations $G_{RO}(Pin)$ of pinwheel tilings, which is the subgroup of SO(2) generated by $\frac{\pi}{2}$ and 2α. Hence, if $\theta = 2k\alpha + l\frac{\pi}{2}$ with $k \in \mathbb{Z}^*$ and $l \in \mathbb{Z}$, it would mean that α is rational with respect to π, which is impossible (see [Rad94] p. 664), hence k = 0 and $\theta \in \left\{0, \frac{\pi}{2}, \pi, \frac{3\pi}{2}\right\} \bmod (2\pi)$. Now, if we study the first vertex coronas (i.e. the minimal patches around a vertex or around the middle point of the hypotenuses), there are only patches with a 2-fold symmetry ([Sad] or see Fig. 3 and Fig. 4). Thus, $\theta \in \{0, \pi\} \bmod (2\pi)$ and if θ = 0, s = 0 since pinwheel tilings are not fixed by translations.

We note that, in fact, there exist only 6 pinwheel tilings with a 2-fold symmetry up to rotations ([Sad]). Hence, there are only 6 orbits with fixed points for the $\mathbb{R}^2 \rtimes SO(2)$ action on Ω. Moreover, there are exactly 6 circles $F_1, \ldots, F_6$ containing fixed points for the SO(2) action on Ω (of course, therefore, the 6 orbits of these circles contain all the fixed points of the $\mathbb{R}^2 \rtimes SO(2)$-action).

We thus obtain the following important result on the dynamics of our tiling space:

Theorem 2.8 ([BG03]). The continuous hull is a minimal foliated space.

Proof. The proof follows the one in [BG03] except that, locally, Ω looks like an open subset of SO(2) × an open subset of $\mathbb{R}^2$ × a Cantor set instead of SO(2) × an open subset of $\mathbb{R}^2$ × a Cantor set, like in [BG03]. Ω is covered by a finite number of open sets $U_i = \varphi_i(V_i \times T_i)$, where
• $T_i$ is a clopen set in Ξ;
• $V_i$ is an open subset of $\mathbb{R}^2 \rtimes SO(2)$ which is the product of an open subset $W_i$ of $\mathbb{R}^2$ and an open subset of SO(2) of the form $]l\pi/2 - \pi/3; l\pi/2 + \pi/3[$, $l \in \{0, 1, 2, 3\}$;
• $\varphi_i : V_i \times T_i \to \Omega$ is defined by $\varphi_i(v, \omega_0) = \omega_0.v$.
As we can find finite partitions of Ξ in clopen sets with arbitrarily small diameter, it is possible to choose this diameter small enough so that:
• the maps $\varphi_i$ are homeomorphisms onto their images;
• whenever $T_1 \in U_i \cap U_j$, $T_1 = \varphi_i(v, \omega_0) = \varphi_j(v', \omega_0')$, the element $v'.v^{-1}$ is independent of the choice of $T_1$ in $U_i \cap U_j$; we denote it by $g_{ij}$.

The transition maps read: $(v', \omega_0') = (g_{ij}.v, \omega_0.g_{ij}^{-1})$. It follows that the boxes $U_i$ and charts $h_i = \varphi_i^{-1} : U_i \to V_i \times T_i$ define a foliated structure on Ω. By construction, the leaves of Ω are the orbits of Ω under the E2-action.

We must make several remarks on the actions: E2 does not act freely on Ω, even though the translations do, but we could adapt results of Benedetti and Gambaudo obtained in their paper [BG03] by studying the possible symmetries in our pinwheel tilings. The E2-action is not free on $\Omega_0$ either, but the SO(2)-action is. Using the group of relative orientations $G_{RO}(Pin)$, we can see that each $\mathbb{R}^2$-orbit of Ω is in fact a dense subset of Ω (see [HRS05]).

3. PV Cohomology of Pinwheel Tilings and the Integer Group of Coinvariants

In [Mou], we have obtained
\[\tau^\mu_*\left(K_0(C(\Omega) \rtimes \mathbb{R}^2 \rtimes SO(2))\right) \subset [C_{\mu^t}]\left(Ch_\tau(K_1(C(\Omega)))\right).\]
There is a natural map ([MS06]) $r_* : \check H^3(\Omega; \mathbb{R}) \to H^3_\tau(\Omega)$ obtained by inclusion of the sheaf $\mathcal{R}$ of germs of locally constant real-valued functions into the sheaf $\mathcal{R}_\tau$ of germs of continuous real-valued tangentially locally constant functions, and since $\mathbb{Z} \subset \mathbb{R}$, there is also a natural map $r_* : \check H^3(\Omega; \mathbb{Z}) \to H^3_\tau(\Omega)$. $\check H^*(\Omega, \mathbb{Z})$ is the Čech cohomology with integer coefficients of Ω and $H^*_\tau(\Omega)$ is the tangential cohomology of the foliated space Ω (see [MS06]). There is then a factorization: $Ch_\tau = r_* \circ Ch$, where Ch is the Chern character $Ch : K^1(\Omega) \to \check H^{odd}(\Omega, \mathbb{Z})$ (the Chern character takes its values in the integer odd Čech cohomology because we are in dimension 3).

The Ruelle-Sullivan current $[C_{\mu^t}]$ only takes into account the $H^3_\tau(\Omega)$ part of $Ch_\tau$ and thus one must focus on the top Čech cohomology $\check H^3(\Omega; \mathbb{Z})$. In fact, we will study $\check H^2(\Omega/S^1; \mathbb{Z})$, since
\[\check H^3(\Omega; \mathbb{Z}) \simeq \check H^3_c(\Omega\setminus F; \mathbb{Z}) \simeq \check H^2_c\left((\Omega\setminus F)/S^1; \mathbb{Z}\right) \simeq \check H^2(\Omega/S^1; \mathbb{Z}),\]
where $F = F_1 \cup \ldots \cup F_6$ is the union of the six circles fixed by a rotation and $\check H^3_c(\Omega\setminus F; \mathbb{Z})$ is the Čech cohomology with integer coefficients and compact support of $\Omega\setminus F$. The left hand side and the right hand side isomorphisms are obtained by the long exact sequence in cohomology relative to the pairs (Ω, F) and $(\Omega/S^1, F/S^1)$ and use the fact that F is of dimension 1 and $F/S^1$ of dimension 0. The isomorphism in the middle is obtained by a Gysin sequence since the projection $\Omega\setminus F \to (\Omega\setminus F)/S^1$ is an $S^1$-principal bundle, as the $S^1$-action is free on $\Omega\setminus F$ (see [Bre72]).

To study this cohomology, we use techniques developed in the paper of Savinien and Bellissard ([SB09]) to show that the top integer Čech cohomology of $\Omega/S^1$ is isomorphic to the integer group of coinvariants associated with a certain transversal $\Xi_\Delta^2$. This will be achieved by using an idea first introduced by Anderson and Putnam in [AP98] and then used by Savinien and Bellissard in [SB09].

Then we present the PV cohomology $H^*_{PV}(B_0^c; C(\Xi_\Delta, \mathbb{Z}))$ of pinwheel tilings, which links the top Čech cohomology of $\Omega/S^1$ and the integer group of coinvariants of Ξ.

3.1. The pinwheel prototile space. Let T denote a pinwheel tiling (the one constructed in the beginning of this paper for example). Let us recall some terminology:

Definition 3.1.
1. A punctured tile is an ordered pair consisting of a tile and a point in its interior.
2. A prototile of a tiling is an equivalence class of tiles (including the punctuation) up to direct isometries.
3. The first corona of a tile in a tiling T is the union of the tiles of T intersecting it.
4. A collared prototile of T is the subclass of a prototile whose representatives have the same first corona up to direct isometries.

There are in fact 108 collared prototiles for the pinwheel tiling (see Fig. 3 and Fig. 4 in Sect. 5, where we have represented 54 collared prototiles; the 54 other collared prototiles are obtained by reflection). We punctuate each prototile of T by the intersection point of the perpendicular bisector of the shortest side with the median from the vertex intersection of the hypotenuse and the shortest side:
This point is invariant under the substitution as can be seen in the figure in Subsect. 3.2. If tˆ is a prototile then t will denote its representative that has its punctuation at the origin O. We then take the simplicial structure defined in the next figure:
We can now build a finite CW-complex $B_0^c(T)$, called prototile space, out of the collared prototiles by gluing them along their boundaries according to all the local configurations of their representatives in $T$:

Definition 3.2. Let $\hat t_j^c$, $j = 1, \dots, N$, be the collared prototiles of $T$ and let $t_j^c$ denote the representative of $\hat t_j^c$ that has its punctuation at the origin and the prototile orientation fixed during the construction of the canonical transversal. The collared prototile space (or just the prototile space) of $T$, $B_0^c(T)$, is the quotient CW-complex
$$B_0^c(T) = \coprod_{j=1}^{N} t_j^c \Big/ \sim,$$
where two $n$-cells $e_i^n \in t_i^c$ and $e_j^n \in t_j^c$ are identified if there exist direct isometries $(x_i, \theta_i), (x_j, \theta_j) \in \mathbb{R}^2 \rtimes SO(2)$ for which $t_i^c.(x_i, \theta_i)$ and $t_j^c.(x_j, \theta_j)$ are tiles of $T$ such that $e_i^n.(x_i, \theta_i)$ and $e_j^n.(x_j, \theta_j)$ coincide on the intersection of their $n$-skeletons. The images of the tiles $t_j^c$ in $B_0^c(T)$ will be denoted $\tau_j$ and still be called tiles. We then have a projection map from $\Omega/S^1$ onto $B_0^c(T)$:

Proposition 3.3 [SB09]. There is a continuous map $\mathfrak{p}^c_{0,T} : \Omega/S^1 \longrightarrow B_0^c(T)$ from the continuous hull quotiented by $S^1$ onto the collared prototile space.

Proof. Let $\lambda_0^c : \coprod_{j=1}^N t_j^c \to B_0^c(T)$ be the quotient map and let $\rho_0^c : \Omega/S^1 \to \coprod_{j=1}^N t_j^c$ be defined as follows: take $[\omega] \in \Omega/S^1$ and $\omega_1$ a representative of this class. If the origin $O$ belongs to the intersection of $k$ tiles $t^{\alpha_1}, \dots, t^{\alpha_k}$ in $\omega_1$, with $t^{\alpha_l} = t^c_{j_l}.\big(x_{\alpha_l}(\omega_1), \theta_{\alpha_l}(\omega_1)\big)$, $l = 1, \dots, k$, then we set $\rho_0^c([\omega]) = x_{\alpha_s}(\omega_1)$, which is in $t^c_{j_s}$, with $s = \mathrm{Min}\{j_l : l = 1, \dots, k\}$. This function depends on a choice of index, but this choice disappears when we look at the image in the quotient. We recall that the E2-action is given by $\omega.(x, \theta) := R_{-\theta}(\omega - x)$, where $R_{-\theta}$ is the rotation in $\mathbb{R}^2$ of angle $-\theta$ around the origin. Thus, the definition of $\rho_0^c$ doesn't depend on the particular representative $\omega_1$ of $[\omega]$ chosen. The map $\rho_0^c$ sends the origin of $\mathbb{R}^2$, which lies in some tiles of a representative of $[\omega]$, to one of the corresponding tiles $t_j^c$ at the corresponding position. The projection $\mathfrak{p}^c_{0,T}$ is then defined by:
$$\mathfrak{p}^c_{0,T} : \Omega/S^1 \longrightarrow B_0^c(T), \qquad [\omega] \longmapsto \lambda_0^c \circ \rho_0^c([\omega]).$$
H. Moustafa
We then have that, as in [SB09], $\mathfrak{p}^c_{0,T}$ is well defined and continuous, noting that $[\omega']$ is in a neighborhood of $[\omega]$ in $\Omega/S^1$ if the representatives of $[\omega']$ and $[\omega]$ coincide on a big ball up to a small translation and up to rotations. For simplicity, the prototile space $B_0^c(T)$ is written $B_0^c$ and the projection $\mathfrak{p}^c_{0,T}$ is written $\mathfrak{p}_0$. We denote $\Xi(\tau_j)$ the lift of the punctuation of $\tau_j$. This is a subset of the canonical transversal called the acceptance zone of the prototile $\hat t_j^c$. This subset contains all the tilings with the punctuation of a representative of $\hat t_j^c$ at the origin. The $\Xi(\tau_j)$'s, for $j = 1, \dots, N_0$, form a clopen partition of the canonical transversal and are Cantor sets like $\Xi$.

3.2. $\Omega/S^1$ as an inverse limit of supertile spaces. We follow the guideline of [SB09] to see our space $\Omega/S^1$ as the inverse limit of supertile spaces (see also [ORS02] or [AP98] to see the hull as an inverse limit). Let $T$ be the pinwheel tiling constructed in the first section and set the simplicial decomposition on the prototiles of the figures in Subsects. 3.1 and 3.3.1. We are going to define new finite CW-complexes $B_k^c$, called supertile spaces of level $k$, associated to the $k$-supertiles of $T$. The spaces $B_k^c$ are built from the collared prototiles of an appropriate subtiling of $T$, written $T_k$ below, in the same way as $B_0^c$ was built from the prototiles of $T$ in Definition 3.2. The construction goes as follows. As we said in the first section, $T$ can be decomposed uniquely in supertiles of level $k$, obtaining a repetitive non-periodic tiling $T_k$ of E2-finite type whose tiles are $k$-supertiles. Each $k$-supertile is punctured by the punctuation of the tile in its middle, as shown in the next figure:
It's again the intersection point of the perpendicular bisector and the median of the appropriate side. These $k$-supertiles are then compatible CW-complexes since they are made up of tiles of $T$.

Definition 3.4. The supertile space of level $k$, $B_k^c$, is the collared prototile space of $T_k$: $B_k^c = B_0^c(T_k)$.

In fact, since all the tilings $T_k$ are the same, up to a dilation and a rotation, all the spaces $B_k^c$ are homeomorphic, but they will give us important information on the cohomology of our space $\Omega/S^1$. The images in $B_k^c$ of the $k$-supertiles $p_j$ (tiles of $T_k$) are denoted $\pi_j$ and still called supertiles. The projection $\mathfrak{p}_{0,T_k} : \Omega/S^1 \longrightarrow B_k^c$, built in Proposition 3.3, is denoted $\mathfrak{p}_k$. The map $F_k : B_k^c \longrightarrow B_0^c$ defined by $F_k := \mathfrak{p}_0 \circ \mathfrak{p}_k^{-1}$ is well defined, onto and continuous (see [SB09]). It projects $B_k^c$ onto $B_0^c$ in an obvious way: a point $x$ in $B_k^c$ belongs to some supertile $\pi_j$, hence to some tile, and $F_k$ sends $x$ to the corresponding point in the corresponding tile $\tau_j$. Let $p$ and $q$ be two integers such that $q \le p$. The map $f_{q,p} : B_p^c \to B_q^c$ defined by $f_{q,p} := F_q^{-1} \circ F_p = \mathfrak{p}_q \circ \mathfrak{p}_p^{-1}$ is well defined, onto and continuous. As explained in [SB09], the family $(B_p^c, f_{q,p})$ is a projective system. To prove that $\Omega/S^1$ is the inverse
limit $\varprojlim (B_p^c, f_{q,p})$, we will use an important property of the pinwheel tiling: the $(l+1)$-supertiles have the same coronas as $l$-supertiles, up to a dilation by a factor $\sqrt{5}$. Hence, if $d_l$ denotes the distance between an $l$-supertile $p_l$ and the complement of its first corona, the distance $d_{l+1}$ of the $(l+1)$-supertile which has the same first corona as the one of $p_l$ but dilated by $\sqrt{5}$ satisfies $d_{l+1} = \sqrt{5}\, d_l$, and thus these distances go to infinity. This point will allow us to prove the next theorem, taking an appropriate sequence of supertile spaces. We take the sequence of supertile spaces $\{B_l^c, f_l\}_{l \in \mathbb{N}}$ where, for $l \ge 1$, $B_l^c$ is the space of $l$-supertiles and $f_l = f_{(l-1),l}$, with the convention that $f_0 := F_1$ and $B_0^c$ is the prototile space. We then prove, as in [SB09], the following theorem:

Theorem 3.5. The inverse limit of the sequence $\{B_l^c, f_l\}_{l \in \mathbb{N}}$ is homeomorphic to the continuous hull of $T$ up to rotations:
$$\Omega/S^1 \cong \varprojlim (B_l^c, f_l).$$
Proof. The homeomorphism is given by the map $\mathfrak{p} : \Omega/S^1 \longrightarrow \varprojlim (B_l^c, f_l)$, defined by $\mathfrak{p}([\omega]) = (\mathfrak{p}_0([\omega]), \mathfrak{p}_1([\omega]), \dots)$, with inverse $\mathfrak{p}^{-1}(x_0, x_1, \dots) = \cap\{\mathfrak{p}_l^{-1}(x_l),\ l \in \mathbb{N}\}$. The map $\mathfrak{p}$ is surjective since each $\mathfrak{p}_l$ is. In fact, we have
$$\mathfrak{p}_0^{-1}(x_0) \supset \mathfrak{p}_1^{-1}(x_1) \supset \mathfrak{p}_2^{-1}(x_2) \supset \cdots,$$
where each $\mathfrak{p}_i^{-1}(x_i)$ is a nonempty compact subset of $\Omega/S^1$, and thus every tiling in the intersection above defines a lift of $(x_0, x_1, \dots)$. For the injectivity, let $\omega, \omega' \in \Omega$ be such that $\mathfrak{p}([\omega]) = \mathfrak{p}([\omega'])$. For each $l \in \mathbb{N}$, $\mathfrak{p}_l([\omega]) = \mathfrak{p}_l([\omega'])$ in some $l$-supertile $\pi_{l,j}$ of $B_l^c$. This means that the two tilings agree, up to a rotation, on some translate of the supertile $p_{l,j}$ containing the origin. Set then
$$r_l = \inf_{p \in P_l^c}\ \inf_{x \in p}\ d_{\mathbb{R}^2}\big(x, \partial C^1(p)\big),$$
where $P_l^c$ is the set of the collared $l$-supertiles of $T$ and $\partial C^1(p)$ is the boundary of the first corona of $p$. Since our tiling is of finite type, $r_l > 0$ for all $l$. Moreover, the two tilings agree, up to rotations, on the ball $B(0_{\mathbb{R}^2}, r_l)$ since $B_l^c$ was built out of collared supertiles. As mentioned earlier, $r_{l+1} = \sqrt{5}\, r_l$, thus $r_{l+1} = \sqrt{5}^{\,l+1} r_0$, and if we choose $l$ big enough, the two tilings agree on arbitrarily large balls, and so we have proved the injectivity of $\mathfrak{p}$. The map $\mathfrak{p}$ is then a continuous bijection and, since $\Omega/S^1$ is compact and $\varprojlim (B_l^c, f_l)$ is Hausdorff, $\mathfrak{p}$ is in fact a homeomorphism.

3.3. The PV cohomology. We now turn to our first goal, which was to compute the top Čech cohomology of $\Omega/S^1$ with integer coefficients and to prove that it is in fact the integer group of coinvariants $C(\Xi, \mathbb{Z})/\sim$ of the canonical transversal under the partial action of the groupoid $\mathbb{R}^2 \rtimes S^1$ restricted to this transversal. To prove this result, we will show that this cohomology is isomorphic to the PV cohomology introduced in [SB09], modified in order to take into account rotations.
We thus follow the approach of Savinien and Bellissard, introducing first the notion of oriented simplicial complexes, then defining the PV cohomology of the pinwheel tiling and finally proving that this is exactly the integer Čech cohomology of $\Omega/S^1$, thanks to the inverse limit found in
the previous section. The interesting point for us in this cohomology is that its cochains are in fact directly taken to be the continuous functions (with integer values) on the transversal $\Xi_\Delta^2$. We will then prove that the quotient of these cochains under the image of the differential of the PV cohomology is precisely the integer group of coinvariants on $\Xi_\Delta^2$.

3.3.1. Oriented simplicial complexes and PV cohomology. Here again, we follow the presentation of Savinien and Bellissard in [SB09], modifying the notions and the proofs to our oriented simplicial complexes (see [HY61]). Given $n+1$ points $v_0, \dots, v_n$ in $\mathbb{R}^m$, $m > n$, which are not collinear, let $[v_0, \dots, v_n]$ denote the $n$-simplex with vertices $v_0, \dots, v_n$. Let $\Delta^n$ be the standard $n$-simplex:
$$\Delta^n = \Big\{ (x_0, \dots, x_n) \in \mathbb{R}^{n+1} : \sum_{i=0}^{n} x_i = 1 \text{ and } x_i \ge 0 \text{ for all } i \Big\},$$
with vertices the unit vectors along the coordinate axes. If one of the $n+1$ vertices of an $n$-simplex $[v_0, \dots, v_n]$ is deleted, the $n$ remaining vertices span an $(n-1)$-simplex, called a face of $[v_0, \dots, v_n]$. The boundary of $\Delta^n$ is then the union of all the faces of $\Delta^n$ and is denoted $\partial\Delta^n$. The interior of $\Delta^n$ is then the open simplex $\mathring{\Delta}^n = \Delta^n \setminus \partial\Delta^n$.

A space $X$ is a simplicial complex if there is a collection of maps $\sigma_\alpha : \Delta^k \to X$, where $k$ depends on the index $\alpha$, such that:
(i) The restriction $\sigma_\alpha|_{\Delta^n}$ is injective.
(ii) Each restriction of $\sigma_\alpha$ to a face of $\Delta^n$ is one of the maps $\sigma_\beta : \Delta^{n-1} \to X$.
(iii) For each $\alpha$ and $\beta$, $F_{\alpha,\beta} := \sigma_\alpha(\Delta^n) \cap \sigma_\beta(\Delta^p)$ is a face of the two simplices $\sigma_\alpha(\Delta^n)$ and $\sigma_\beta(\Delta^p)$, and there is an affine map $l : \sigma_\alpha^{-1}(F_{\alpha,\beta}) \to \sigma_\beta^{-1}(F_{\alpha,\beta})$ such that $\sigma_\alpha|_{\sigma_\alpha^{-1}(F_{\alpha,\beta})} = \sigma_\beta|_{\sigma_\beta^{-1}(F_{\alpha,\beta})} \circ l$.
(iv) A set $A \subset X$ is open iff $\sigma_\alpha^{-1}(A)$ is open in $\Delta^n$ for each $\sigma_\alpha$.

$\sigma_\alpha(\mathring{\Delta}^n)$ is called an $n$-cell of the complex. We then obtain an oriented simplex from an $n$-simplex $\sigma = [v_0, \dots, v_n]$ as follows: fix an arbitrary ordering of the vertices $v_0, \dots, v_n$. The equivalence class of even permutations of this fixed ordering is the positively oriented simplex, which we denote $+\sigma$. The equivalence class of odd permutations of the chosen ordering is the negatively oriented simplex, $-\sigma$. An oriented simplicial complex is obtained from a simplicial complex by choosing an arbitrary fixed orientation for each simplex in the complex (this may be done without considering how the individual simplices are joined or whether one simplex is a face of another). To define an oriented simplicial structure on $B_0^c$, we decompose each tile of the pinwheel tiling as follows:
we take the orientation of $\mathbb{R}^2$ for the 2-cells and any orientation for the edges in the interior of our tiles. Since $B_0^c$ is of dimension 2, we can consider only the simplices
of dimension 0, 1 and 2. There is no simpler simplicial decomposition of the collared prototiles if one wants to keep the same decomposition for all of them. One can obtain simpler decompositions if different decompositions are allowed for each collared prototile. We must notice that this decomposition was obtained by a careful study of the different 2 coronas of tiles (there are more than three hundred such coronas) appearing in a (1, 2)-pinwheel tiling. We can thus see $T$ as an oriented simplicial decomposition of $\mathbb{R}^2$. We puncture each cell of each tile by the image under $\sigma_\alpha$ of the barycenter of $\Delta^n$:
$B_0^c$ is then a finite oriented simplicial complex and the maps $\sigma_\alpha : \Delta^n \to B_0^c$ are the characteristic maps of the $n$-simplices on $B_0^c$. We next define, as in [SB09], a new transversal (not immediately related to the canonical transversal, see 3.18) which is crucial to define the PV cohomology:

Definition 3.6. The $\Delta$-transversal, written $\Xi_\Delta$, is the subset of $\Omega/S^1$ formed by classes of tilings containing the origin on the punctuation of one of their cells.

The $\Delta$-transversal is the lift of the punctuation of the cells on $B_0^c$. It is partitioned by the lifts of the punctuation of the $n$-cells, written $\Xi_\Delta^n$, i.e. the subsets of $\Omega/S^1$ consisting of classes of tilings containing the origin on the punctuation of one of their $n$-cells. The $\Delta$-transversal is then a Cantor set (as the canonical transversal), and the $\Xi_\Delta^n$'s form a partition of it in clopen subsets. Let $\sigma$ be the characteristic map of an $n$-simplex $e$ on $B_0^c$, denote $\Xi_\Delta(\sigma)$ the lift of the punctuation of $e$, and $\chi_\sigma$ its characteristic function in $\Xi_\Delta$ (i.e. $\chi_\sigma([\omega]) = 1$ if and only if $\mathfrak{p}_0([\omega]) = \mathrm{punct}(e)$). The subset $\Xi_\Delta(\sigma)$ is the acceptance zone of $\sigma$. Since $\Xi_\Delta(\sigma)$ is a clopen set, $\chi_\sigma \in C(\Xi_\Delta^n, \mathbb{Z}) \subset C(\Xi_\Delta, \mathbb{Z})$.

Consider $\sigma : \Delta^n \to B_0^c$ the characteristic map of an $n$-simplex $e$ on $B_0^c$, and let $\tau$ be a face of $\sigma$ with image cell $f$ in $B_0^c$ (a face of $e$). The simplices $e$ and $f$ on $B_0^c$ are contained in some tile $\tau_j$. If we look at these simplices $e$ and $f$ as subsets of the tile $t_j$ in $\mathbb{R}^2$, we can define a vector $x_{\sigma\tau}$ joining the punctuation of $f$ to the one of $e$. The main issue with this construction is that, in the case of the pinwheel tiling, $x_{\sigma\tau}$ depends on the tile $\tau_j$ chosen if $e$ is a 1-simplex (this vector is unique up to rotations). In the case of 2-simplices, this vector is in fact unique since the quotient defining $B_0^c$ only concerns the edges of the cells. We must then find a way to choose such vectors in the case of 1-simplices. There are many ways to make such a choice.
Here is one of them: if $e$ is a 1-simplex, the vector is obtained by orienting the edge horizontally and from left to right in any tiling of $\Xi_\Delta(\sigma)$:

Built in this manner, the vector $x_{\sigma\tau}$ is uniquely attached to $\sigma$ and $\tau$, as if we had a tiling where we only consider translations (as in [SB09]). We next define the “action” of $x_{\sigma\tau}$ on a function in $C(\Xi_\Delta, \mathbb{Z})$ as follows:
• If $\sigma$ is the characteristic map of a 2-simplex $e$, $\tau$ a face of $\sigma$ and $f$ a function in $C(\Xi_\Delta(\tau), \mathbb{Z})$ then, for each $[\omega] \in \Xi_\Delta(\sigma)$, we set $T^{x_{\sigma\tau}} f([\omega]) = f([\omega_0 + x_{\sigma\tau}])$,
where $\omega_0 \in [\omega]$ is “well oriented”, i.e. the origin of the euclidean plane $E^2$ belongs to the punctuation of a cell $e$ (a triangle) of $\omega$ included in a unique tile (which is a direct isometry of the tile $t_j$ used to construct $x_{\sigma\tau}$); $\omega_0$ is then the tiling obtained by rotating $\omega$ to put this tile in this orientation (i.e. such that this tile is a translation of $t_j$):
• If $\sigma$ is the characteristic map of a 1-simplex, $\tau$ a face of $\sigma$ and $f$ a function in $C(\Xi_\Delta(\tau), \mathbb{Z})$ then, for each $[\omega] \in \Xi_\Delta(\sigma)$, we put $T^{x_{\sigma\tau}} f([\omega]) = f([\omega_0 + x_{\sigma\tau}])$, where $\omega_0 \in [\omega]$ is “well oriented”, i.e. the origin of $E^2$ belongs to the punctuation of an edge of $\omega$; $\omega_0$ is then obtained by rotating $\omega$ to put this edge horizontally and the orientation in the positive direction (from left to right).

We then define important operators, useful to define the differential of the PV cohomology:

Definition 3.7. Let $\sigma$ and $\tau$ be the characteristic maps of an $n$-simplex, resp. a $k$-simplex, on $B_0^c$. We define the operator $\theta_{\sigma\tau}$ on $C(\Xi_\Delta, \mathbb{Z})$ by:
$$\theta_{\sigma\tau} = \begin{cases} \chi_\sigma\, T^{x_{\sigma\tau}}\, \chi_\tau & \text{if } \tau \subset \partial\sigma \text{ and } n = 1, 2, \\ 0 & \text{else,} \end{cases}$$
where $\tau \subset \partial\sigma$ means that $\tau$ is a face of $\sigma$ of codimension 1.

We can see from the definition that this operator is only defined (non vanishing) in fact on $n$-simplices $\sigma$ with $n = 1, 2$. This operator is easy to describe: if $f \in C(\Xi_\Delta(\tau), \mathbb{Z})$ and $[\omega] \in \Xi_\Delta$, then $\theta_{\sigma\tau}(f)([\omega])$ is 0 if $\omega$ is not in $\Xi_\Delta(\sigma)$, or if $\tau$ is not a face of $\sigma$ of codimension 1, or if $n \neq 1, 2$; otherwise, $\theta_{\sigma\tau}(f)([\omega])$ is equal to the value of $f$ on the class of the tiling obtained from $\omega$ by a direct isometry sending the origin to the punctuation of the face $\tau$ of $\sigma$ and rotating $\tau$ such that it has the “good” orientation. We can then define the PV cohomology of pinwheel tilings. Let $S_0^n$ denote the set of the characteristic maps $\sigma : \Delta^n \to B_0^c$ of $n$-simplices on $B_0^c$ and $S_0$ the union of the $S_0^n$'s. The group of simplicial $n$-chains on $B_0^c$, $C_{0,n}$, is the free abelian group with basis $S_0^n$. Before defining the PV cohomology, we need one more definition, the incidence number:

Definition 3.8 (see [Eil44]). Let $\sigma$ and $\tau$ be two simplices of dimension $n$ and $n-1$ respectively. The incidence number $[\sigma, \tau]$ is defined by: $[\sigma, \tau] = \pm 1$ if $\tau \subset \partial\sigma$, and $[\sigma, \tau] = 0$ else. If $\tau \subset \partial\sigma$, then $[\sigma, \tau]$ is $1$ if $\tau$ is a positively oriented face of $\sigma$, i.e. the orientation of $\tau$ coincides with that induced by $\sigma$ on its face $\tau$, and this number is $-1$ if $\tau$ is a negatively oriented face of $\sigma$.

Definition 3.9. The PV cohomology of $\Omega/S^1$ is the cohomology of the differential complex $\{C^n_{PV}, d^n_{PV}\}$, with:
1. the PV cochain groups are the groups of continuous integer valued functions on $\Xi_\Delta^n$: $C^n_{PV} = C(\Xi_\Delta^n, \mathbb{Z})$ for $n = 0, 1, 2$,
2. the PV differential, $d_{PV}$, is defined by the sum over $n = 1, 2$ of the operators
$$d^n_{PV} : C^{n-1}_{PV} \longrightarrow C^n_{PV}, \qquad d^n_{PV} = \sum_{\sigma \in S_0^n} \sum_{i=0}^{n} [\sigma, \partial_i\sigma]\, \theta_{\sigma\partial_i\sigma}.$$

The “simplicial form” of $d^n_{PV}$ easily implies $d^2_{PV} \circ d^1_{PV} = 0$. This comes from the fact that, for each simplex $\sigma$ (see [HY61]):
$$\sum_{i,j} [\sigma, \partial_i\sigma]\,[\partial_i\sigma, \partial_j\partial_i\sigma] = 0.$$
We shall also call this cohomology the PV cohomology of $T$. We denote it $H^*_{PV}\big(B_0^c; C(\Xi_\Delta, \mathbb{Z})\big)$. The next subsection will then prove the following theorem:
Theorem 3.10. The integer Čech cohomology of $\Omega/S^1$ is isomorphic to the PV cohomology of $T$:
$$\check{H}^*(\Omega/S^1; \mathbb{Z}) \cong H^*_{PV}\big(B_0^c; C(\Xi_\Delta, \mathbb{Z})\big).$$

3.3.2. Proof of Theorem 3.10. Once again, we follow the guideline of the paper [SB09]. We define first a PV cohomology for the $B_p^c$'s, written $H^*_{PV}(B_0^c; C(\Xi_p, \mathbb{Z}))$. We show that this cohomology is in fact the simplicial cohomology of $B_p^c$ in Proposition 3.13, and then we prove that the PV cohomology of $T$ is isomorphic to the direct limit of the PV cohomologies of the supertile spaces sequence used in Theorem 3.5. Denote $S_p^n$ the set of all the characteristic maps $\sigma_p : \Delta^n \to B_p^c$ of the $n$-simplices on $B_p^c$, and $S_p$ the union of the $S_p^n$'s. The group of simplicial $n$-chains on $B_p^c$, $C_{p,n}$, is the free abelian group with basis $S_p^n$. As above, if $\sigma_p$ is a simplex on $B_p^c$, write $\Xi_{p,\Delta}(\sigma_p)$ for the lift of the punctuation of its image in $B_p^c$ and $\chi_{\sigma_p}$ for its characteristic function. $\Xi_{p,\Delta}(\sigma_p)$ is the acceptance zone of $\sigma_p$. It's a clopen subset of the $\Delta$-transversal.

Lemma 3.11. Given a simplex $\sigma$ on $B_0^c$, its acceptance zone is partitioned by the acceptance zones of its preimages in $B_p^c$:
$$\Xi_\Delta(\sigma) = \bigsqcup_{\sigma_p \in F_{p\#}^{-1}(\sigma)} \Xi_{p,\Delta}(\sigma_p),$$
with $F_{p\#} : S_p^n \to S_0^n$ the map induced by $F_p$.

The proof is exactly the same as the one in [SB09]. We note that the union over $\sigma \in S_0^n$ of the $F_{p\#}^{-1}(\sigma)$'s is $S_p^n$. We then denote $C_p^n$ the simplicial $n$-cochain group $\mathrm{Hom}(C_{p,n}, \mathbb{Z})$, which is the dual of the simplicial $n$-chain
group $C_{p,n}$. We can represent it faithfully on the group of continuous functions with integer values on the $\Delta$-transversal, $C(\Xi_\Delta, \mathbb{Z})$, by:
$$\rho_{p,n} : C_p^n \longrightarrow C(\Xi_\Delta^n, \mathbb{Z}), \qquad \psi \longmapsto \sum_{\sigma_p \in S_p^n} \psi(\sigma_p)\, \chi_{\sigma_p}.$$
We denote $C(\Xi_p^n, \mathbb{Z})$ the image of this representation. $\rho_{p,n}$ is an isomorphism onto its image $C(\Xi_p^n, \mathbb{Z})$; its inverse is defined as follows: given $\phi = \sum_{\sigma_p \in S_p^n} \phi_{\sigma_p} \chi_{\sigma_p}$, where $\phi_{\sigma_p}$ is an integer, $\rho_{p,n}^{-1}(\phi)$ is the group homomorphism from $C_{p,n}$ to $\mathbb{Z}$ whose value on the basis simplex $\sigma_p$ is $\phi_{\sigma_p}$.

Consider the characteristic map $\sigma_p$ of an $n$-simplex $e_p$ on $B_p^c$. This simplex is contained in some supertile $\pi_j$. Viewing $e_p$ as a subset of the supertile $p_j$ in $\mathbb{R}^2$, we can again define, similarly to the method used for the PV cohomology, the vector $x_{\sigma_p \partial_i\sigma_p}$, for $i = 0, \dots, n$, joining the punctuation of the $i$-th face $\partial_i e_p$ to the punctuation of $e_p$. Since $F_p$ preserves the orientation of the simplices, these vectors $x_{\sigma_p \partial_i\sigma_p}$ are identical for each $\sigma_p$ in the preimage of the characteristic map $\sigma$ of a simplex $e$ on $B_0^c$, and they are, in fact, equal to the vector $x_{\sigma\partial_i\sigma}$ used in Definition 3.7 of the operator $\theta_{\sigma\partial_i\sigma}$. In the same way, we define the operators $\theta_{\sigma_p \partial_i\sigma_p}$ as the operators $\chi_{\sigma_p} T^{x_{\sigma_p \partial_i\sigma_p}} \chi_{\partial_i\sigma_p}$. Using the relation $T^{x_{\sigma\partial_i\sigma}} \chi_{\partial_i\sigma} = \chi_\sigma T^{x_{\sigma\partial_i\sigma}}$ and Lemma 3.11, we obtain:
$$\theta_{\sigma\partial_i\sigma} = \sum_{\sigma_p \in F_{p\#}^{-1}(\sigma)} \theta_{\sigma_p \partial_i\sigma_p}.$$
Hence, the PV differential can be written:
$$d^n_{PV} = \sum_{\sigma_p \in S_p^n} \sum_{i=0}^{n} [\sigma_p, \partial_i\sigma_p]\, \theta_{\sigma_p \partial_i\sigma_p},$$
and this defines a differential from $C(\Xi_p^{n-1}, \mathbb{Z})$ to $C(\Xi_p^n, \mathbb{Z})$ (since $F_p$ preserves the orientation of the simplices, for each $\sigma_p \in F_{p\#}^{-1}(\sigma)$, $[\sigma_p, \partial_i\sigma_p] = [\sigma, \partial_i\sigma]$, which justifies the definition of this differential).

Definition 3.12. Set $C^n_{PV}(p) = C(\Xi_p^n, \mathbb{Z})$, for $n = 0, 1, 2$. The PV cohomology of the supertile space $B_p^c$, written $H^*_{PV}\big(B_0^c; C(\Xi_p, \mathbb{Z})\big)$, is the cohomology of the differential complex $\{C^n_{PV}(p), d^n_{PV}\}$.

As in [SB09], we obtain one of the two crucial propositions for the proof of Theorem 3.10:

Proposition 3.13. The PV cohomology of the supertile space $B_p^c$ is isomorphic to its integer simplicial cohomology: $H^*_{PV}\big(B_0^c; C(\Xi_p, \mathbb{Z})\big) \cong H^*\big(B_p^c; \mathbb{Z}\big)$.

Proof. $\rho_{p,n}$ is an isomorphism from the simplicial cochain group $C_p^n$ onto the PV cochain group $C^n_{PV}(p)$. Let $\phi$ be an element in $C^{n-1}_{PV}(p)$ and $\sigma_p$ an $n$-simplex in $S_p^n$; the differential of $\phi$ is then given by:
$$d^n_{PV}\phi(\sigma_p) = \sum_{i=0}^{n} [\sigma_p, \partial_i\sigma_p]\, \phi(\partial_i\sigma_p).$$
On the other hand, the simplicial differential of some $\psi \in C_p^{n-1}$ is:
$$\delta^n\psi(\sigma_p) = \sum_{i=0}^{n} [\sigma_p, \partial_i\sigma_p]\, \psi(\partial_i\sigma_p),$$
thus we have $d^n_{PV} \circ \rho_{p,n-1} = \rho_{p,n} \circ \delta^n$ for $n = 1, 2$. So the $\rho_{p,n}$'s give a chain map and thus induce isomorphisms $\rho^*_{p,n}$ between the $n$-th cohomology groups.

The following lemma is trivial using Theorem 3.5 and is useful to prove the second crucial proposition for the proof of Theorem 3.10:

Lemma 3.14. Let $\{B_l^c, f_l\}$ be the sequence used in Theorem 3.5. We have:
$$\Xi_\Delta \cong \varprojlim (S_l, f_l) \qquad \text{and} \qquad C(\Xi_\Delta, \mathbb{Z}) \cong \varinjlim \big(C(\Xi_l, \mathbb{Z}), f^l\big),$$
where the $f^l$'s are the duals of the $f_l$'s.

Proposition 3.15. Let $\{B_l^c, f_l\}$ be as in the previous lemma. There is an isomorphism:
$$H^*_{PV}\big(B_0^c; C(\Xi_\Delta, \mathbb{Z})\big) \cong \varinjlim \Big(H^*_{PV}\big(B_0^c; C(\Xi_l, \mathbb{Z})\big), f_l^*\Big).$$
Proof. By the previous Lemma 3.14, the cochain groups $C^n_{PV}$ are the direct limits of the cochain groups $C^n_{PV}(l)$ of the supertile spaces $B_l^c$. Consider $f_l^\# : C^n_{PV}(l) \longrightarrow C^n_{PV}(l+1)$ the map induced by $f_l$ on the PV cochain groups. Since the differential $d_{PV}$ is the same for the complexes of each supertile space $B_l^c$, it's enough to check that the following diagram is commutative
$$\begin{array}{ccccccc}
\cdots & \longrightarrow & C^{n-1}_{PV}(l) & \xrightarrow{\ d^n_{PV}\ } & C^n_{PV}(l) & \longrightarrow & \cdots \\
 & & \big\downarrow {\scriptstyle f_l^\#} & & \big\downarrow {\scriptstyle f_l^\#} & & \\
\cdots & \longrightarrow & C^{n-1}_{PV}(l+1) & \xrightarrow{\ d^n_{PV}\ } & C^n_{PV}(l+1) & \longrightarrow & \cdots
\end{array}$$
and this is easy using the relations $(f_l^\#\phi)(\partial_i\sigma_l) = \phi(f_{l\#}(\partial_i\sigma_l))$ and $f_{l\#}(\partial_i\sigma_l) = \partial_i(f_{l\#}(\sigma_l))$.

To end the proof of Theorem 3.10, we then use the fact that Čech cohomology sends inverse limits to direct limits, that Čech cohomology is isomorphic to simplicial cohomology for CW-complexes (the $B_l^c$'s), and that the simplicial cohomology of $B_l^c$ is isomorphic to the PV cohomology of $B_l^c$ by Proposition 3.13. We then conclude that the direct limit of these groups is the PV cohomology of $T$ by Proposition 3.15.
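The identity $d^2_{PV} \circ d^1_{PV} = 0$ rests, via Proposition 3.13, on the same property of the simplicial differential written with incidence numbers. The following is a minimal sketch of that property on a toy complex, not on the pinwheel prototile space: the complex is a single triangle, the names `faces` and `coboundary` are hypothetical, and the sign convention $[\sigma, \partial_i\sigma] = (-1)^i$ for ordered vertices is an assumption standing in for the arbitrary orientations chosen in the text.

```python
def faces(simplex):
    """Codimension-1 faces of an ordered simplex, each with its incidence number.

    Deleting the i-th vertex gives the face d_i(simplex); the incidence number
    is taken to be (-1)**i (an assumed convention for ordered simplices).
    """
    return [(simplex[:i] + simplex[i + 1:], (-1) ** i) for i in range(len(simplex))]


def coboundary(cochain, simplices):
    """Simplicial differential: (delta psi)(s) = sum_i [s, d_i s] * psi(d_i s)."""
    return {s: sum(sign * cochain[f] for f, sign in faces(s)) for s in simplices}


# Toy oriented complex: one triangle with its three edges and three vertices.
edges = [(0, 1), (0, 2), (1, 2)]
triangles = [(0, 1, 2)]

psi = {(0,): 3, (1,): -1, (2,): 7}   # an arbitrary integer-valued 0-cochain
d1_psi = coboundary(psi, edges)      # 1-cochain on the edges
d2_d1_psi = coboundary(d1_psi, triangles)  # 2-cochain on the triangle

print(d1_psi)      # differences of psi along each oriented edge
print(d2_d1_psi)   # vanishes identically
```

On each edge the first differential returns the difference of the vertex values, and the alternating sum around the triangle cancels, which is the content of the relation $\sum_{i,j} [\sigma, \partial_i\sigma][\partial_i\sigma, \partial_j\partial_i\sigma] = 0$ in this small case.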
3.4. The integer group of coinvariants of the pinwheel tiling. In this section, we show that the top PV cohomology $H^2_{PV}(B_0^c; C(\Xi_\Delta, \mathbb{Z}))$ is isomorphic to the integer group of coinvariants $C(\Xi_\Delta^2, \mathbb{Z})/\sim$ of the transversal $\Xi_\Delta^2$ (definition follows), and thus the top integer Čech cohomology of the hull of a pinwheel tiling is also isomorphic to this group, which is a first step in the proof of the gap-labeling. In the same way we identified the canonical transversal $\Xi/S^1$ to a subset of $\Omega$, we can identify $\Xi_\Delta^2$ and $\Xi_\Delta^1$ with a subset of $\Omega$. Indeed, it suffices to choose in each class $[\omega] \in \Xi_\Delta^n$ ($n = 1, 2$) a representative with the “good orientation” (the one used to define the vector $x_{\sigma\tau}$). We also obtain a representation of the clopens $\Xi_{p,\Delta}$ in $\Omega$ in this way. We will denote in the same way these subsets of $\Omega/S^1$ and the corresponding subsets of $\Omega$. When we take a class in $\Xi_\Delta^n$ or $\Xi_{p,\Delta}$, we implicitly mean that we are in $\Omega/S^1$, and if we take a tiling in $\Xi_\Delta^n$ or $\Xi_{p,\Delta}$, we consider the subset of $\Omega$.

Let's begin by the definition of the integer group of coinvariants of the canonical transversal $\Xi$ and of the $n$-th $\Delta$-transversal $\Xi_\Delta^n$ ($n = 1, 2$). We let $\Xi_0$ denote either $\Xi$ or $\Xi_\Delta^n$ ($n = 1, 2$) (subsets of $\Omega$). Let $\omega_0 \in \Xi_0$ and $A_{\omega_0}$ a patch of $\omega_0$ around the origin; we define
$$U(\omega_0, A_{\omega_0}) := \{\omega' \in \Xi_0 \mid \omega_0 \text{ and } \omega' \text{ coincide on } A_{\omega_0}\},$$
which is a clopen subset of $\Xi_0$ (see Subsect. 2.2). Viewing this subset in $\Omega/S^1$, this clopen becomes
$$V([\omega_0], A_{\omega_0}) := \{[\omega_1] \in \Xi_0 \mid \omega_0 \text{ and } \omega_1 \text{ coincide on } A_{\omega_0} \text{ up to rotations}\}.$$
By hypothesis, our tiling $T$ is of finite $\mathbb{R}^2 \rtimes SO(2)$-type, hence the family $\{U(\omega_0, A_{\omega_0}),\ \omega_0 \in \Xi_0,\ A_{\omega_0} \text{ patch of size } k \text{ of } \omega_0 \text{ around } O,\ k \in \mathbb{N}\}$ is countable. We define for each $\omega_0 \in \Xi_0$:
$$G\big(U(\omega_0, A_{\omega_0})\big) = \{(x, \theta) \in \mathbb{R}^2 \rtimes SO(2) \mid \omega_0.(x, \theta) \in \Xi_0 \text{ and } x \in A_{\omega_0}\},$$
where $x \in A_{\omega_0}$ means that $x$ is a vector in $\mathbb{R}^2$ contained in the subset of $\mathbb{R}^2$ defined by $A_{\omega_0}$. $G\big(U(\omega_0, A_{\omega_0})\big)$ is defined like this because, if we take a tiling $\omega'$ in the clopen set $U(\omega_0, A_{\omega_0})$, we know this tiling only on the patch $A_{\omega_0}$, and thus we have:
$$\forall (x, \theta) \in G\big(U(\omega_0, A_{\omega_0})\big),\ \forall\, \omega' \in U(\omega_0, A_{\omega_0}),\quad \omega'.(x, \theta) \in \Xi_0.$$
The integer group of coinvariants of $\Xi_0$ is then the quotient of $C(\Xi_0, \mathbb{Z})$ by the subgroup $H_{\Xi_0}$ spanned by the family
$$\{\chi_{U(\omega_0, A_{\omega_0})} - \chi_{U(\omega_0, A_{\omega_0}).(x, \theta)} \mid (x, \theta) \in G(U(\omega_0, A_{\omega_0})),\ \omega_0 \in \Xi_0\}.$$

Theorem 3.16. $H^2_{PV}(B_0^c; C(\Xi_\Delta, \mathbb{Z})) \cong C(\Xi_\Delta^2, \mathbb{Z})/H_{\Xi_\Delta^2}$.

Proof. By definition, this cohomology is $C(\Xi_\Delta^2, \mathbb{Z})/\mathrm{Im}(d^2_{PV})$. We prove that $\mathrm{Im}(d^2_{PV}) = H_{\Xi_\Delta^2}$.
1. Consider $f \in \mathrm{Im}(d^2_{PV})$; there is some $g \in C(\Xi_\Delta^1, \mathbb{Z})$ such that $f = d^2_{PV}(g)$. Since $C(\Xi_\Delta^1, \mathbb{Z})$ is generated by characteristic functions $\chi_{U(\omega, A_{\omega\tau})}$, with $\tau$ in $S_p^1$ for $p \in \mathbb{N}$, $\omega \in \Xi_{p,\Delta}(\tau)$ and $A_{\omega\tau}$ a patch of $\omega$ around the origin (in fact around the edge which projects on $\tau$ in $B_p^c$) large enough to cover the first corona(s) of the (two) supertile(s) surrounding $\tau$ in $\omega$, it is enough to prove the result for $f = d^2_{PV}(\chi_{U(\omega, A_{\omega\tau})})$ with $\tau$ in $S^1 := \bigcup_p S_p^1$. Let thus $\tau$ be some element in $S_p^1$, $\omega$ a tiling in $\Xi_{p,\Delta}(\tau)$ and $A_{\omega\tau}$ a large enough patch surrounding the origin in $\omega$ (and thus surrounding the “edge” $\tau$); then $\chi_{U(\omega, A_{\omega\tau})}$ is the characteristic function of the set of all the tilings in $\Omega$ with the origin on the punctuation of $\tau$, $\tau$ having the “good” orientation, and which coincide on the patch $A_{\omega\tau}$ with $\omega$. We remark that if $\tau'$ is the characteristic map of another 1-simplex in $S_p^1$, then $\chi_{\tau'} \chi_{U(\omega, A_{\omega\tau})} = \delta_{\tau'\tau}\, \chi_{U(\omega, A_{\omega\tau})}$. As $A_{\omega\tau}$ was chosen large enough, this patch characterizes the (two) collared supertile(s) surrounding the simplex corresponding to $\tau$ in the tilings of $U(\omega, A_{\omega\tau})$. Thus, denoting $\sigma_0$ and $\sigma_1$ the two characteristic maps of the 2-simplices having $\tau$ as an edge in $\omega$ and the above collared supertile(s) (respectively) as supertiles in $B_p^c$, we have: for each $\sigma \in S_p^2$,
$$\chi_\sigma \chi_{U(\omega_0, A_{0\tau})} = \delta_{\sigma\sigma_0}\, \chi_{U(\omega_0, A_{0\tau})} \quad \text{and} \quad \chi_\sigma \chi_{U(\omega_1, A_{1\tau})} = \delta_{\sigma\sigma_1}\, \chi_{U(\omega_1, A_{1\tau})},$$
where $\omega_i = R_{\theta_i}(\omega) - x_{\sigma_i\tau}$ ($i = 0, 1$), $A_{i\tau} = R_{\theta_i}(A_{\omega\tau}) - x_{\sigma_i\tau}$ and $\omega_i$ is in $\Xi_{p,\Delta}(\sigma_i)$ (the $\omega_i$ are the tilings in $\Xi_{p,\Delta}(\sigma_i)$ obtained from $\omega$ by a direct isometry taking the origin on the punctuation of $\sigma_i$ and rotating $\omega_i$ to obtain the good orientation). With these three remarks, we easily see that
$$d^2_{PV}(\chi_{U(\omega, A_{\omega\tau})}) = \pm\big(\chi_{U(\omega_0, A_{0\tau})} - \chi_{U(\omega_1, A_{1\tau})}\big),$$
with $\omega_1 = \omega_0.\big(R_{\theta_0 - \theta_1}(x_{\sigma_1\tau}) - x_{\sigma_0\tau},\ \theta_0 - \theta_1\big)$ and $A_{1\tau}$ of the same form. Hence
$$d^2_{PV}(\chi_{U(\omega, A_{\omega\tau})}) = \pm\big(\chi_{U(\omega_0, A_{0\tau})} - \chi_{U(\omega_0, A_{0\tau}).(y_{\sigma_0\sigma_1}, \theta_{\sigma_0\sigma_1})}\big),$$
where $y_{\sigma_0\sigma_1} = R_{\theta_0 - \theta_1}(x_{\sigma_1\tau}) - x_{\sigma_0\tau}$, $\theta_{\sigma_0\sigma_1} = \theta_0 - \theta_1$, and thus $(y_{\sigma_0\sigma_1}, \theta_{\sigma_0\sigma_1})$ is in $G\big(U(\omega_0, A_{\omega_0})\big)$. We thus have proved that $d^2_{PV}(\chi_{U(\omega, A_{\omega\tau})}) \in H_{\Xi_\Delta^2}$ and thus the inclusion $\mathrm{Im}(d^2_{PV}) \subset H_{\Xi_\Delta^2}$.

2. The inclusion in the other direction is easy to obtain by reasoning on generators. Let $\chi_{U(\omega, A_\omega)} - \chi_{U(\omega, A_\omega).(x, \theta)}$ be a generator of $H_{\Xi_\Delta^2}$. Covering $A_\omega$ by supertiles large enough, we can suppose that $A_\omega$ is in fact a collared supertile, or a union of two collared supertiles of the same level, or a star of collared supertiles of the same level (a star of supertiles is all the supertiles surrounding a fixed vertex). Consider $\omega \in \Xi_\Delta^2$ and $A_\omega$ a patch in $\omega$ consisting of $p$-supertiles around the origin of the above form, called a collared patch. Thanks to the form of collared patches,
we can find a sequence of tilings $\omega_0, \dots, \omega_n$ such that $\omega_0 = \omega$, $\omega_n = \omega.(x, \theta)$, $\omega_i = \omega.(x_i, \theta_i)$ ($i = 1, \dots, n$) with $(x_i, \theta_i) \in G(U(\omega, A_\omega))$, and such that the 2-simplex in $\omega_i$ containing the origin has a common edge with the 2-simplex in $\omega_{i-1}$ which contained the origin (see the figure below). Take, for example, the patch below, with the points representing $\omega$ and $\omega.(x, \theta)$ (in fact, on this figure, we have represented $\omega$ and $\omega - x$: tilings are represented by points, and the tiling must then be rotated to put the tile containing the origin in the “good orientation” to obtain $\omega$ and $\omega.(x, \theta)$):

We may thus represent the sequence of tilings “joining” $\omega$ and $\omega.(x, \theta)$ by the following sequence of points (the path is not unique):

We can thus decompose $\chi_{U(\omega, A_\omega)} - \chi_{U(\omega, A_\omega).(x, \theta)}$ into the sum of the following differences:
$$\chi_{U(\omega_i, A_\omega.(x_i, \theta_i))} - \chi_{U(\omega_{i+1}, A_\omega.(x_{i+1}, \theta_{i+1}))},$$
where $\omega_i$, $\omega_{i+1}$ are two tilings in $\Xi_\Delta^2$ obtained from $\omega$ by a direct isometry $(x_i, \theta_i)$, resp. $(x_{i+1}, \theta_{i+1})$, and containing the origin on the punctuation of two simplices having a common edge. We will thus focus on the difference $\chi_{U(\omega_1, A_\omega.(x_1, \theta_1))} - \chi_{U(\omega_2, A_\omega.(x_2, \theta_2))}$ with $\omega_1 = \omega.(x_1, \theta_1)$, $\omega_2 = \omega.(x_2, \theta_2)$ and $(x_k, \theta_k) \in G(U(\omega, A_\omega))$. We can then write $\omega_2 = \omega_1.(x_{12}, \theta_{12})$ with $(x_{12}, \theta_{12}) = (x_1, \theta_1)^{-1}(x_2, \theta_2)$. Thus $\chi_{U(\omega_1, A_\omega.(x_1, \theta_1))} - \chi_{U(\omega_2, A_\omega.(x_2, \theta_2))}$ is equal to
$$\chi_{U(\omega_1, A_\omega.(x_1, \theta_1))} - \chi_{U(\omega_1, A_\omega.(x_1, \theta_1)).(x_{12}, \theta_{12})},$$
and is therefore, up to sign, the differential of the characteristic function of $U(\omega_3, A'_\omega)$, where $\omega_3$ is the tiling in $\Xi_\Delta^1$ with the origin on the punctuation of the common edge of
the two above simplices and $A'_\omega$ is the patch obtained from $A_\omega$ by a direct isometry bringing the origin on this punctuation and taking the adequate orientation. Thereby, the generators of $H_{\Xi_\Delta^2}$ are images under $d^2_{PV}$ of elements of $C(\Xi_\Delta^1, \mathbb{Z})$ and the reciprocal inclusion is proved.

By Theorem 3.10 and the fact that $\check{H}^3(\Omega; \mathbb{Z}) \simeq \check{H}^2(\Omega/S^1; \mathbb{Z})$, we thus obtain:

Corollary 3.17. The top integer Čech cohomology of the hull is isomorphic to the integer group of coinvariants of $\Xi_\Delta^2$:
$$\check{H}^3(\Omega; \mathbb{Z}) \simeq C(\Xi_\Delta^2, \mathbb{Z})/H_{\Xi_\Delta^2}.$$
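The step $\omega_2 = \omega_1.(x_{12}, \theta_{12})$ with $(x_{12}, \theta_{12}) = (x_1, \theta_1)^{-1}(x_2, \theta_2)$ in the proof above can be checked numerically. The sketch below is only an illustration under stated assumptions: a tiling is replaced by a finite list of points of the plane written as complex numbers, the action is the one recalled in the proof of Proposition 3.3, $\omega.(x, \theta) = R_{-\theta}(\omega - x)$, and the product law in `compose` is not quoted from the text but derived so that this formula defines a right action. The names `act`, `compose`, `inverse` and `close` are hypothetical.

```python
import cmath


def act(points, g):
    """Right action omega.(x, theta) := R_{-theta}(omega - x) on complex points."""
    x, theta = g
    return [(z - x) * cmath.exp(-1j * theta) for z in points]


def compose(g, h):
    """Product on pairs (translation, angle) for which
    act(act(w, g), h) == act(w, compose(g, h)); derived from the action formula."""
    (x, theta), (y, phi) = g, h
    return (x + y * cmath.exp(1j * theta), theta + phi)


def inverse(g):
    """Inverse for the product above: compose(g, inverse(g)) is the identity."""
    x, theta = g
    return (-x * cmath.exp(-1j * theta), -theta)


def close(a, b, tol=1e-12):
    """Compare two point lists entry by entry up to a numerical tolerance."""
    return all(abs(p - q) < tol for p, q in zip(a, b))


omega = [0j, 1 + 0j, 1 + 2j]            # three sample points standing in for a tiling
g1 = (0.5 + 0.25j, 0.3)                 # (x_1, theta_1)
g2 = (-1 + 2j, 1.1)                     # (x_2, theta_2)

omega1, omega2 = act(omega, g1), act(omega, g2)
g12 = compose(inverse(g1), g2)          # (x_12, theta_12)

print(close(act(omega1, g12), omega2))  # omega_2 = omega_1.(x_12, theta_12)
```

With this convention the angle component of `g12` is $\theta_2 - \theta_1$ and its translation component is $R_{-\theta_1}(x_2 - x_1)$, which is the relative displacement seen from the frame of $\omega_1$.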
We now link the coinvariants on the $\Delta$-transversal to the coinvariants of the canonical transversal, which will be useful to prove the gap-labeling in the next section. The invariant ergodic probability measure $\mu$ on $\Omega$ induces a measure $\mu_0^t$ on each transversal $\Xi_0$ of the lamination, which is given, locally, by: if $\big(V_i \times S_i \times C_i, h_i^{-1}\big)_i$ is a maximal atlas of the lamination and $B$ a borelian set in some $C_i$, we have
$$\mu_0^t(B) = \frac{\mu\big(h_i^{-1}(V_i \times S_i \times B)\big)}{\lambda(V_i \times S_i)},$$
where $\lambda$ is a left and right Haar measure on $\mathbb{R}^2 \times S^1$ (with $\lambda\big([0; 1]^2 \times S^1\big) = 1$), $h_i^{-1} : V_i \times S_i \times C_i \longrightarrow U_i$, $V_i$ is an open subset of $\mathbb{R}^2$, $S_i$ an open subset of $S^1$ and $C_i$ a clopen in $\Xi_0$. We then have a link between the $\Delta$-transversal and the canonical transversal:

Lemma 3.18. $C(\Xi_\Delta^2, \mathbb{Z})/H_{\Xi_\Delta^2}$ is isomorphic to $C(\Xi, \mathbb{Z})/H_\Xi$. Moreover, let $\mu_3^t$ be the induced measure on the transversal $\Xi \cup \Xi_\Delta^2$. Denoting $\mu^t$ (resp. $\mu_2^t$) the restriction of $\mu_3^t$ to $\Xi$ (resp. $\Xi_\Delta^2$), we have:
$$\mu_2^t\big(C(\Xi_\Delta^2, \mathbb{Z})\big) = \mu^t\big(C(\Xi, \mathbb{Z})\big).$$

Proof. In each prototile, there are 8 punctuations of 2-simplices, which we number from 1 to 16 once and for all (we are considering the 2 (uncollared) prototiles of the pinwheel tiling, and we number the punctuations in the first prototile from 1 to 8 and those of the second prototile from 9 to 16). Thus, we can define the vectors $x_{0,i}$ joining the punctuation of the prototile taken for the definition of $\Xi$ (see 3.1) to the punctuation of the $i$-th simplex of this prototile. Define the map $\Psi : C(\Xi_\Delta^2, \mathbb{Z}) \longrightarrow C(\Xi, \mathbb{Z})$ by:
$$\Psi(f)(\omega_0) = \begin{cases} \displaystyle\sum_{i=1}^{8} f(\omega_0 - x_{0,i}) \\[2ex] \displaystyle\sum_{i=9}^{16} f(\omega_0 - x_{0,i}) \end{cases},$$
depending on the tile type of the tile containing the origin in $\omega_0$. This map sends a function $f$ defined on $\Xi_\Delta^2$ to the function on $\Xi$ defined on a tiling $\omega_0$ containing the
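origin on the punctuation of a tile, by the sum of the values of $f$ on the tilings containing the origin on the punctuation of the cells constituting this tile. This defines a group homomorphism which is trivially surjective. Indeed, fix a simplex of each prototile (for example, the number 1 for the first prototile and 9 for the second prototile); then we can define a section of $\Psi$, $s : C(\Xi, \mathbb{Z}) \longrightarrow C(\Xi_\Delta^2, \mathbb{Z})$, as follows:
$$s(f)(\omega') = \begin{cases} f(\omega' + x_{0,1}) & \text{if } \omega' \text{ has the origin in a tile congruent to the first prototile and in a “1” simplex,} \\ f(\omega' + x_{0,9}) & \text{if } \omega' \text{ has the origin in a tile congruent to the second prototile and in a “9” simplex,} \\ 0 & \text{else.} \end{cases}$$
Then $\Psi$ induces a surjective homomorphism $\overline{\Psi} : C(\Xi_\Delta^2, \mathbb{Z})/H_{\Xi_\Delta^2} \longrightarrow C(\Xi, \mathbb{Z})/H_\Xi$ because $\Psi$ sends the generators of $H_{\Xi_\Delta^2}$ in $H_\Xi$. $s$ also induces a homomorphism $\overline{s} : C(\Xi, \mathbb{Z})/H_\Xi \longrightarrow C(\Xi_\Delta^2, \mathbb{Z})/H_{\Xi_\Delta^2}$ since $s(H_\Xi) \subset H_{\Xi_\Delta^2}$. We then prove that $\overline{\Psi}$ is an isomorphism and $\overline{\Psi}^{-1} = \overline{s}$. Since $\Psi \circ s = \mathrm{id}$, we have $\overline{\Psi} \circ \overline{s} = \mathrm{id}$. We then have to prove that $\overline{s} \circ \overline{\Psi} = \mathrm{id}$ to end this point. Let $\omega' \in \Xi_\Delta^2$ (having an “$i$” simplex around the origin and the tile surrounding the origin being of the first type, for simplicity) and $A_{\omega'}$ a patch of $\omega'$ around the origin; then we have:
$$\overline{s} \circ \overline{\Psi}\,[\chi_{U(\omega', A_{\omega'})}] = [s \circ \Psi(\chi_{U(\omega', A_{\omega'})})] = [s(\chi_{U(\omega', A_{\omega'}) + x_{0,i}})] = [\chi_{U(\omega', A_{\omega'}) + x_{0,i} - x_{0,1}}] = [\chi_{U(\omega', A_{\omega'})}],$$
the last equality coming from the definition of $H_{\Xi_\Delta^2}$ and from the fact that $-x_{0,i} + x_{0,1} \in G(U(\omega', A_{\omega'}))$ because $A_{\omega'}$ contains the tile surrounding the origin in $\omega'$. Thus $\overline{s} \circ \overline{\Psi} = \mathrm{id}$ and $\overline{\Psi}$ is an isomorphism from $C(\Xi_\Delta^2, \mathbb{Z})/H_{\Xi_\Delta^2}$ onto $C(\Xi, \mathbb{Z})/H_\Xi$.

We then prove that $\Psi$ preserves the measures. Consider $\omega \in \Xi_\Delta^2$ (we suppose that the origin is on the punctuation of the $k$-th simplex) and $f = \chi_{U(\omega, A_\omega)}$ a generator of $C(\Xi_\Delta^2, \mathbb{Z})$. We then have $\Psi(f) = \chi_{U(\omega, A_\omega) + x_{0,k}}$ and thus $\mu^t(\Psi(f)) = \mu^t(U(\omega, A_\omega) + x_{0,k}) = \mu_3^t(U(\omega, A_\omega) + x_{0,k})$, since $\mu^t$ is the restriction of $\mu_3^t$ to $\Xi$. Thereby $\mu^t(U(\omega, A_\omega) + x_{0,k}) = \mu_3^t(U(\omega, A_\omega))$ by invariance of $\mu_3^t$. Finally, we obtain $\mu^t(U(\omega, A_\omega) + x_{0,k}) = \mu_2^t(U(\omega, A_\omega))$. Thus, $\Psi$ preserves the measures and $\mu_2^t\big(C(\Xi_\Delta^2, \mathbb{Z})\big) \subset \mu^t(C(\Xi, \mathbb{Z}))$. The reciprocal inclusion is obtained using the section $s$: if $\mu^t(f) \in \mu^t(C(\Xi, \mathbb{Z}))$, then $\mu^t(f) = \mu^t(\Psi \circ s(f)) = \mu_2^t(s(f)) \in \mu_2^t\big(C(\Xi_\Delta^2, \mathbb{Z})\big)$; hence the reciprocal inclusion is proved, together with the lemma.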
PV Cohomology of the Pinwheel Tilings, Coinvariants Integer Group and Gap-Labeling
Corollary 3.19. The top integer Čech cohomology of the hull is isomorphic to the integer group of coinvariants of the canonical transversal: $\check{H}^3(\Omega; \mathbb{Z}) \simeq C(\Xi, \mathbb{Z})/H_\Xi$.

4. Proof of the Gap-Labeling for Pinwheel Tilings and Explicit Computations

4.1. Proof of the gap-labeling for pinwheel tilings. To prove the gap-labeling for pinwheel tilings, i.e.
\[
[C_{\mu^t_2}]\big(Ch_\tau(K^1(\Omega))\big) \subset \mu^t_2\big(C(\Xi_\Delta^2, \mathbb{Z})\big) = \mu^t\big(C(\Xi, \mathbb{Z})\big),
\]
we must now show that the inclusion of C(, Z) in C(, R) is, at the level of cohomologies, the map r∗ : Hˇ 3 (; Z) −→ Hτ3 () induced by the inclusion of sheaves described in [MS06] (see the diagram below). We first look at the lifting in Hˇ 2 (/S 1 ; Z) of the generators of C(2 , Z)/H2 . For this section, we consider the sequence {Blc , fl }l∈N defined in Theorem 3.5. l, (σl ) σ ∈S 2 ,l∈N is then a base of neighborhoods of 2 and the characteristic funcl
l
tions χσl thus span C(2 , Z). Fix one of these characteristic maps χσl . This function is in fact, by definition, a function in C(l2 , Z) and so defines a class in the cohomology group H P2 V (B0c ; C(l , Z)) which is isomorphic to the simplicial cohomology group of Blc , H 2 (Blc ; Z). It’s the class of the cochain which sends each characteristic map σ on Blc on the integer 1 if σ = σl and 0 else. We would like to know the image of this class under ˇ the isomorphism linking the simplicial cohomology to the Cech cohomology. For this, in view of [HY61] 5-5 and 8-2, we see that this isomorphism is obtained by considering the coverings of Blc by open stars associated to the iterated barycenter decompositions of Blc . Denote Un the covering of Blc by the open stars of the n th barycenter decomposition of Blc , noting that the open stars are the interior of the union of the 2-simplices surrounding the vertices of the simplical structure of the n th barycenter decomposition. The second ˇ integer Cech cohomology of Blc is then the direct limit over the Un ’s of Hˇ 2 (Un ; Z). We need to shrink these open sets to end the proof. We don’t take the open coverings by whole open stars, but we will consider large enough open subsets of these open stars (large enough in order to still cover the space). It suffices to take, for example, the open sets obtained from the open stars by rescaling them from the star vertex by a factor 56 . We call these open subsets, pseudo-open stars. In this way, we make sure that the centre of a 2-simplex is still in the intersection of the 3 pseudo-open stars surrounding its vertices and thus the family Un of pseudo-open stars, is still an open cover of Blc . Moreover, the family Un is still a cofinal family of coverings of Blc and thus Hˇ 2 (Blc ; Z) = lim Hˇ (Un ; Z). −→
The class of χσl in the integer group of coinvariants is thus sent on the class, in Hˇ 2 (Blc ; Z), of the cochain gσl of Hˇ 2 (U0 ; Z) defined by taking the null section on the 2-simplices (U1 , U2 , U3 ) whose intersection is not contained in the simplex σl and the constant
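section equal to 1 on the 2-simplex formed by the 3 pseudo-open stars surrounding $\sigma_l$ (taking the same orientation as $\sigma_l$). Let $U_1$, $U_2$ and $U_3$ denote the 3 open stars surrounding $\sigma_l$; we then see that the previous cohomology class in $\check{H}^2(B_l^c; \mathbb{Z})$ lifts in $\check{H}^2_c(U; \mathbb{Z}) := \check{H}^2(U, \partial U; \mathbb{Z})$ (where $U = U_1 \cap U_2 \cap U_3$), the Čech cohomology with integer coefficients and compact support of $U$, on the cochain $h_{\sigma_l}$ equal to 1 on the intersection of the 3 pseudo-open stars surrounding $\sigma_l$ and vanishing elsewhere. Now, if we look at the cohomology of $\Omega/S^1$, we see that the isomorphism in Theorem 3.10 sends the class of $\sigma_l$ on the class of $p_l^*([g_{\sigma_l}])$ in $\check{H}^2(\Omega/S^1; \mathbb{Z})$ which lifts on $p_l^*([h_{\sigma_l}])$ in $\check{H}^2_c(U \times \Xi_{l,\Delta}(\sigma_l); \mathbb{Z})$. Indeed, we have the following commutative diagram:
\[
\begin{array}{ccc}
\check{H}^2_c(U; \mathbb{Z}) & \longrightarrow & \check{H}^2(B_l^c; \mathbb{Z}) \\
\downarrow {\scriptstyle p_l^*} & & \downarrow {\scriptstyle p_l^*} \\
\check{H}^2_c(U \times \Xi_{l,\Delta}(\sigma_l); \mathbb{Z}) & \longrightarrow & \check{H}^2(\Omega/S^1; \mathbb{Z})
\end{array}
\]
where the horizontal maps are induced by inclusion of the open sets $U$ and $U \times \Xi_{l,\Delta}(\sigma_l) = p_l^{-1}(U)$ in $B_l^c$ and $\Omega/S^1$ respectively. We thus fix an open set $U \times K$ where $U$ is an open subset of $\mathbb{R}^2$ and $K$ a clopen in $\Xi_\Delta^2$. We then have the following commutative diagram:

[Commutative diagram with nodes $C(K, \mathbb{Z})$, $C(K, \mathbb{R})$, $\check{H}^2_c(\mathbb{R}^2 \times K; \mathbb{Z})$, $H^2_{\tau c}(\mathbb{R}^2 \times K)$, $\check{H}^3_c(\mathbb{R}^2 \times S^1 \times K; \mathbb{Z})$, $H^3_{\tau c}(\mathbb{R}^2 \times S^1 \times K)$, $\check{H}^2_c((\Omega \setminus F)/S^1; \mathbb{Z})$, $\check{H}^2(\Omega/S^1; \mathbb{Z})$, $\check{H}^3_c(\Omega \setminus F; \mathbb{Z})$, $\check{H}^3(\Omega; \mathbb{Z})$, $H^3_\tau(\Omega)$ and $\mathbb{R}$; the horizontal arrows are labelled $r_*$, and the arrows into $\mathbb{R}$ are labelled $\mu^t_2$ and $[C_{\mu^t_2}]$.]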
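The horizontal arrows, involving cohomologies, are the restriction map $r_*$ defined in [MS06]. $C(K, \mathbb{Z}) \to C(K, \mathbb{R})$ is the inclusion. $\check{H}^2_c(\mathbb{R}^2 \times K) \simeq C(K, \mathbb{Z})$ and $\check{H}^2_{\tau c}(\mathbb{R}^2 \times K) \simeq C(K, \mathbb{R})$ are Thom isomorphisms and so are natural with respect to sheaf maps. The vertical arrows are induced by inclusion of an open subset in a space and are thus natural for sheaf maps. Next, we note that the following diagram is formed by commutative diagrams:
\[
\begin{array}{ccccc}
\check{H}^2_c(\mathbb{R}^2 \times K; \mathbb{Z}) & \xrightarrow{\ \otimes \mathbb{R}\ } & \check{H}^2_c(\mathbb{R}^2 \times K; \mathbb{R}) & \xrightarrow{\ r_*\ } & H^2_{\tau c}(\mathbb{R}^2 \times K) \\
\Big| & & \Big| & & \Big| \\
\check{H}^3_c(\mathbb{R}^2 \times S^1 \times K; \mathbb{Z}) & \xrightarrow{\ \otimes \mathbb{R}\ } & \check{H}^3_c(\mathbb{R}^2 \times S^1 \times K; \mathbb{R}) & \xrightarrow{\ r_*\ } & H^3_{\tau c}(\mathbb{R}^2 \times S^1 \times K)
\end{array}
\]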
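The left vertical map is the Gysin isomorphism $G$ which becomes, by tensorising, the integration along the fibers $S^1$ in real cohomologies, this one being sent on the integration along the “longitudinal” fibers $S^1$ in longitudinal cohomologies. The Gysin sequence for a fiber bundle $p : E \to B$ with fibers $S^1$ is given by the long exact sequence:
\[
\cdots \longrightarrow \check{H}^n(E; \mathbb{Z}) \xrightarrow{\ G\ } \check{H}^{n-1}(B; \mathbb{Z}) \xrightarrow{\ e\ } \check{H}^{n+1}(B; \mathbb{Z}) \xrightarrow{\ p^*\ } \check{H}^{n+1}(E; \mathbb{Z}) \longrightarrow \cdots,
\]
where $p^*$ is the map induced by $p$ in cohomology and $e$ is the cup product by the Euler class. The two diagrams:
\[
\begin{array}{ccc}
 & \check{H}^2_c(\mathbb{R}^2 \times K; \mathbb{Z}) & \\
 \swarrow & & \searrow \\
\check{H}^2_c((\Omega \setminus F)/S^1; \mathbb{Z}) & \longrightarrow & \check{H}^2(\Omega/S^1; \mathbb{Z})
\end{array}
\qquad \text{and} \qquad
\begin{array}{ccc}
 & \check{H}^3_c(\mathbb{R}^2 \times S^1 \times K; \mathbb{Z}) & \\
 \swarrow & & \searrow \\
\check{H}^3_c(\Omega \setminus F; \mathbb{Z}) & \longrightarrow & \check{H}^3(\Omega; \mathbb{Z})
\end{array}
\]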
are commutative because the open subset R2 × K U × K ( resp. U × S 1 × K ) is an open subset of /S 1 (resp. ) included in (\F)/S 1 ( resp. \F), since tilings in F necessarily have the origin on an edge (see the patches surrounding the origin in tilings of F below).
Thus, the image in Hτ3 () of a coinvariant generator [χσl ] is sent under the Ruelle Sullivan current in μt2 (C(K , Z)) ⊂ μt2 C(2 , Z) = μt (C(, Z)). Thanks to results obtained in [Mou] (see the Introduction), we thus have μ τ∗ K 0 C() R2 S 1 ⊂ μt (C(, Z)) . The reciprocal inclusion is obtained by using the first diagram in Sect. 4.1. Indeed, if one takes a generator χσl in C(2 , Z) it lifts to some C(K , Z) (K is in fact l, (σl )) which can be lifted using Gysin and Thom isomorphisms on a class in Hˇ 3 (; Z) and since the Chern character is surjective, we have a lift [u] of this class in K 1 (C()). We then have μ τ∗ (ϒ([u])) = [Cμt ] (r∗ Ch([u])) = μt2 χσl , 2
where ϒ : K 1 () −→ K 0 C() R2 S 1 is a map defined in [Mou] and is, in fact, the Kasparov product by the unbounded triple defined by the Diracoperator along the leaves of the foliated structure on . Thus μt (C(, Z)) = μt2 C(2 , Z) ⊂ μ τ∗ K 0 C() R2 S 1 . The gap-labeling is thus proved for pinwheel tilings: Theorem 4.1. If T is a pinwheel tiling, = (T ) its hull provided with an invariant ergodic probability measure μ and the canonical transversal provided with the induced measure μt , we have: μ τ∗ K 0 C() R2 S 1 = μt (C(, Z)) .
4.2. Explicit computations. Since τμ∗ K 0 C() R2 S 1 = μt (C(, Z)), it is enough to compute μt (C(, Z)) explicitly. For this, C(, Z) will be seen as the direct limit of some direct system (C(l , Z), f l ) as we have done for C(2 , Z) in 3.14. To define the group C(l , Z), one puts a point in each tile of each l-prototile of Blc .
Now, in the same way as for the -transversal, thanks to characteristic maps, for each tile p in some supertile π j of Blc , we denote ( p) the lifting of the punctuation of p under pl , and χ p its characteristic function in (i.e χ p ([ω]) = 1 if and only if pl ([ω]) = punct ( p)). Thus, ( p) is formed by the tilings in /S 1 having the origin on the punctuation of a representative of p which is itself contained in a representative of the supertile π j . C(l , Z) is the subgroup of C(, Z) spanned by the characteristic functions χ p , where p is a tile of some l-supertile of Blc . f l : C(l−1 , Z) −→ C(l , Z) is defined on the generators as follows: let p be a tile in some (l − 1)-supertile πi . ( p) is uniquely decomposed as the disjoint union ( p j ), where p j is a tile in some l-supertile with fl ( punct ( pl )) = punct ( p). f l (χ p ) is then the sum of the χ p j ’s. One then has C(, Z) = lim C(l , Z), f l . Indeed, l is the set of all the punc−→
tuations in Blc and by Theorem 3.5, we have = lim(l , fl ). ←− For each l ∈ N, consider the subgroup Rl of C(l , Z) spanned by differences χ p − χ p , where p, p are two tiles in the same supertile of Blc . Since there are exactly 108 collared supertiles in a pinwheel tiling, for each l, C(l , Z)/Rl Z108 (see Fig. 3 and Fig. 4 in Sect. 5). Let ql : C(l , Z) −→ C(l , Z)/Rl be the quotient map. f l factorizes through the quotient since, if p and p are two tiles in the same (l − 1)-supertile, f l (χ p − χ p ) = χ pi − χ pi , where pi and pi are two tiles contained in the same l-supertile of Blc and thus f l (χ p − χ p ) ∈ Rl .
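One then has the commutative diagram:
\[
\begin{array}{ccc}
C(\Xi_{l-1}, \mathbb{Z}) & \xrightarrow{\ f_l\ } & C(\Xi_l, \mathbb{Z}) \\
\downarrow {\scriptstyle q_{l-1}} & & \downarrow {\scriptstyle q_l} \\
C(\Xi_{l-1}, \mathbb{Z})/R_{l-1} & \xrightarrow{\ \tilde{f}_l\ } & C(\Xi_l, \mathbb{Z})/R_l \\
\downarrow {\scriptstyle \simeq} & & \downarrow {\scriptstyle \simeq} \\
\mathbb{Z}^{108} & \xrightarrow{\ A'\ } & \mathbb{Z}^{108}
\end{array}
\]
where $A'$ is the transpose of the substitution matrix of the collared prototiles (i.e. the matrix which has in position $(i, j)$ the number of representatives of the collared prototile of type $i$ in the substitution of the collared prototile of type $j$). The system $\big(C(\Xi_l, \mathbb{Z})/R_l, \tilde{f}_l\big)$ is a direct system and one can consider the direct limit $C_R := \varinjlim \big(C(\Xi_l, \mathbb{Z})/R_l, \tilde{f}_l\big)$. Define also $C := C(\Xi, \mathbb{Z})/R$, where $R$ is the subgroup of $C(\Xi, \mathbb{Z})$ spanned by the $R_l$'s and denote $q : C(\Xi, \mathbb{Z}) \to C$ the quotient map.

Lemma 4.2. The group $C_R$ is isomorphic to $C$.

Proof. Let $l \in \mathbb{N}$, let $\psi_l : C(\Xi_l, \mathbb{Z})/R_l \to C$ be the homomorphism defined by $\psi_l(q_l(f)) := q(f)$ ($R_l \subset R$). One obtains the following commutative diagram:
\[
\begin{array}{ccc}
C(\Xi_{l-1}, \mathbb{Z})/R_{l-1} & \xrightarrow{\ \tilde{f}_l\ } & C(\Xi_l, \mathbb{Z})/R_l \\
 {\scriptstyle \psi_{l-1}} \searrow & & \swarrow {\scriptstyle \psi_l} \\
 & C &
\end{array}
\]
since the next diagram is formed by commutative diagrams:

[Commutative diagram: the square formed by $f_l$, $\tilde{f}_l$, $q_{l-1}$ and $q_l$, together with the maps labelled id from $C(\Xi_{l-1}, \mathbb{Z})$ and $C(\Xi_l, \mathbb{Z})$ to $C(\Xi, \mathbb{Z})$, followed by $q : C(\Xi, \mathbb{Z}) \to C$.]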
Denoting jl : C(l , Z)/Rl −→ C R the canonical homomorphisms sending an element of C(l , Z)/Rl on its class in the limit C R, there exists a unique homomorphism j : C R −→ C such that j ◦ jl = ψl (by definition of direct limit). We prove that j is an isomorphism.
Surjectivity. Consider $q(f) \in C$, then, since $C(\Xi, \mathbb{Z})$ is the direct limit of the $C(\Xi_l, \mathbb{Z})$'s, $q(f) = q(f_i)$ with $f_i \in C(\Xi_{l_i}, \mathbb{Z})$. Then $q(f) = q(f_i) = \psi_{l_i}\big(q_{l_i}(f_i)\big) = j \circ j_{l_i} \circ q_{l_i}(f_i)$ and thus $j$ is surjective.

Now, we show that $j$ is also injective. Suppose that $j(f) = 0$ ($f \in C_R$), then, $f$ can be written $j_n(q_n(g))$ with $g \in C(\Xi_n, \mathbb{Z})$ (by definition of direct limit) and thus $j \circ j_n \circ q_n(g) = 0$, i.e. $\psi_n \circ q_n(g) = 0$. Hence, $g \in R$ and there are $k_1 \leq \dots \leq k_r$,
$c_1, \dots, c_r \in \mathbb{Z}$ and $f_1, \dots, f_r$ with $f_i \in R_{k_i}$ such that $g = c_1 f_1 + \dots + c_r f_r$. By definition, if $h \in C(\Xi_j, \mathbb{Z})$, then $f_{j+1}(h) = h$ in $C(\Xi, \mathbb{Z})$ ($h$ is written as a linear combination of characteristic functions of patches formed by supertiles of level $j$ and $f_{j+1}(h)$ is just another way to write this sum obtained by decomposing these patches in patches of $(j+1)$-supertiles that contain them). Hence, $f_i = f_{k_r k_i}(f_i)$ for $i = 1, \dots, r-1$, with $f_{km} := f_k \circ \dots \circ f_{m+1}$ and
\[
f_{k_r n}(g) = g = c_1 f_{k_r k_1}(f_1) + \dots + c_{r-1} f_{k_r k_{r-1}}(f_{r-1}) + c_r f_r
\]
in $C(\Xi, \mathbb{Z})$ and $f_{k_r n}(g) = g$ is then in $R_{k_r}$. Thus $\tilde{f}_{k_r n}(q_n(g)) = q_{k_r}(f_{k_r n}(g)) = 0$ with $\tilde{f}_{k_r n} := \tilde{f}_{k_r} \circ \dots \circ \tilde{f}_{n+1}$. We then conclude that $j_{k_r} \circ \tilde{f}_{k_r n} \circ q_n(g) = 0$ and by definition of direct limit, $0 = j_{k_r} \circ \tilde{f}_{k_r n} \circ q_n(g) = j_n \circ q_n(g) = f$ and $j$ is thus injective.

The previous lemma is not surprising since $R = \cup R_l$ and $C(\Xi, \mathbb{Z}) = \cup C(\Xi_l, \mathbb{Z})$. Moreover, $\mu^t(\chi_p - \chi_{p'}) = 0$ if $p, p'$ are two tiles in the same supertile since, if $\chi_p = \chi_{U(\omega, ST_\omega)}$ where $[\omega] \in \Xi(p)$, $ST_\omega$ being the supertile surrounding $p$ and such that the representative of $p$ containing the origin in $\omega$ has the “good” orientation, then $\chi_{p'} = \chi_{U(\omega, ST_\omega).(x,\theta)}$ with $(x, \theta)$ the direct isometry sending $\omega$ on the tiling of $\Xi(p')$ having a representative of $p'$ surrounding the origin and in the “good” orientation. Hence, the measure factorizes through the quotient and $\mu^t(C) = \mu^t(C(\Xi, \mathbb{Z}))$. It is enough then to compute $\mu^t(C)$ in order to end the computation of the gap-labeling of the pinwheel tiling. Since the following diagram
\[
\begin{array}{ccccccc}
\cdots & \longrightarrow & C(\Xi_{l-1}, \mathbb{Z})/R_{l-1} & \xrightarrow{\ \tilde{f}_l\ } & C(\Xi_l, \mathbb{Z})/R_l & \longrightarrow & \cdots \\
 & & \downarrow & & \downarrow & & \\
\cdots & \longrightarrow & \mathbb{Z}^{108} & \xrightarrow{\ A'\ } & \mathbb{Z}^{108} & \longrightarrow & \cdots
\end{array}
\]
is commutative, one has $C \simeq \varinjlim(\mathbb{Z}^{108}, A')$. We then use results obtained by Effros in [Eff81] since we have a stationary system:

Definition 4.3 (see [Eff81]). A direct system
\[
\mathbb{Z}^r \xrightarrow{\ \varphi\ } \mathbb{Z}^r \xrightarrow{\ \varphi\ } \cdots
\]
with constant spaces Zr and with constant map φ (constant means that it’s always the same map) is called stationary. A stationary system is simple if, for some n, φ n is strictly positive (i.e. all its coefficients are strictly positive). An ordered group G is an abelian group G together with a subset P, called the positive cone and denoted G + , such that:
1. P + P ⊂ P,
2. P − P = G,
3. P ∩ (−P) = {0},
4. if a ∈ G and na ∈ P for some n ∈ N, then a ∈ P.
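As an aside on Definition 4.3, whether a stationary system is simple can be checked mechanically by raising the map to successive powers until all coefficients are strictly positive. The sketch below is only illustrative: the helper names `mat_mul` and `first_positive_power`, the cut-off `max_power`, and the 2×2 toy matrix are assumptions made for the example; it is not the 108×108 substitution matrix of the pinwheel tiling.

```python
def mat_mul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def first_positive_power(phi, max_power=50):
    """Smallest n <= max_power such that phi^n is strictly positive, else None."""
    power = phi  # invariant: power == phi^n at the top of each iteration
    for n in range(1, max_power + 1):
        if all(entry > 0 for row in power for entry in row):
            return n
        power = mat_mul(power, phi)
    return None


# Hypothetical toy map: not strictly positive itself, but its square is.
toy = [[0, 2],
       [1, 1]]
# A permutation matrix never becomes strictly positive.
swap = [[0, 1],
        [1, 0]]

toy_power = first_positive_power(toy)
swap_power = first_positive_power(swap, max_power=10)
```

In this toy case the stationary system defined by `toy` would be simple, whereas the one defined by `swap` would not (at least up to the chosen cut-off).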
We shall write $a \leq b$ (resp. $a < b$) if $b - a \in G^+$ (resp. $G^+ \setminus \{0\}$). We shall say that $u \in G^+$ is an order unit on $G$ if $\{a \in G : 0 \leq a \leq n.u \text{ for some } n \in \mathbb{N}\} = G^+$. A state (depending on $u$) on $G$ is a homomorphism $p : G \to \mathbb{R}$ such that $p(G^+) \geq 0$ and $p(u) = 1$. We denote $S_u(G)$ the set of such states.

For example, if $G$ is the limit of the system $\mathbb{Z}^r \xrightarrow{\varphi} \mathbb{Z}^r \xrightarrow{\varphi} \cdots$ (with $\varphi$ having all its coefficients non-negative), we can define an ordering on $G$ taking $G^+$ to be the union of all the images in $G$ of the $(\mathbb{Z}^+)^r$'s. With these definitions, we have the following theorem:
Theorem 4.4 ([Eff81]). Let $u_1$ be $(1, \dots, 1)$ in the first copy of $\mathbb{Z}^r$ in the simple stationary system of the definition, then $u = \varphi_\infty(u_1)$, its image in the limit $G$, is an order unit for $G$ and $S_u(G)$ contains only one point.

The direct system $(\mathbb{Z}^{108}, A')$ is in fact stationary and simple (since $A$ is the substitution matrix of our tiling which is a primitive substitution ($A^6$ has all its coefficients strictly positive)) and thus there exists only one state $p$ on the direct limit $G = \varinjlim(\mathbb{Z}^{108}, A')$. Moreover, there is an explicit formula for this state: if $\lambda$ is the Perron-Frobenius eigenvalue of $A$ and $\alpha$ the unique eigenvector of $A$ associated to $\lambda$ such that $\sum_i \alpha_i = 1$, then
\[
p([a, n]) = \frac{1}{\lambda^{n-1}} \sum_i \alpha_i a_i.
\]
In the case of pinwheel tilings, $\lambda = 5$, $u_1 = (1, \dots, 1) \in \mathbb{Z}^{108}$ and there is only one state on $G$ defined by:
\[
p\big([(k_1, \dots, k_{108}); n]\big) = \frac{1}{33000} \, \frac{1}{5^{n-1}} \sum_{i=1}^{108} k_i \alpha'_i,
\]
with $\alpha = \frac{1}{33000} \alpha'$ and

α' = (765, 1185, 360, 255, 735, 1185, 360, 255, 765, 90, 90, 255, 250, 80, 735, 255, 80, 400, 360, 660, 600, 660, 300, 360, 360, 255, 400, 360, 90, 255, 90, 350, 90, 255, 250, 255, 80, 80, 163, 18, 237, 72, 90, 360, 237, 72, 204, 204, 163, 300, 51, 18, 50, 51, 765, 1185, 360, 255, 735, 1185, 360, 255, 765, 90, 90, 255, 250, 80, 735, 255, 80, 400, 360, 660, 600, 660, 300, 360, 360, 255, 400, 360, 90, 255, 90, 350, 90, 255, 250, 255, 80, 80, 163, 18, 237, 72, 90, 360, 237, 72, 204, 204, 163, 300, 51, 18, 50, 51).

Moreover, since each $C(\Xi_l, \mathbb{Z})/R_l$ is naturally ordered by the ordering of $\mathbb{Z}^{108}$, $C$ has an ordering compatible with its isomorphism with $G$ and the unit order for this ordering
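is the class of the constant function equal to 1. Hence, there is a unique state on $C$ such that, if $p_i$ are tiles contained in 108 different $n$-supertiles of $B_n^c$, then:
\[
p\Big(\sum_i k_i \, q_n(\chi_{p_i})\Big) = \frac{1}{33000} \, \frac{1}{5^n} \sum_{i=1}^{108} k_i \alpha'_i. \qquad (1)
\]
But $\mu^t / \mu^t(\Xi)$ is in fact a state on $C$, thus $\mu^t = \mu^t(\Xi)\, p$ and, since $33000 = 5^3 \times 264$, one has
\[
\mu^t\big(C(\Xi, \mathbb{Z})\big) = \frac{1}{264} \, \mu^t(\Xi) \, \mathbb{Z}\Big[\frac{1}{5}\Big].
\]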
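The arithmetic behind this last step can be checked directly from the integer vector of 108 entries listed above. The following sketch (the variable names are ours) verifies that the entries sum to 33000, that 33000 = 5^3 × 264, and that the entries have greatest common divisor 1. Under these facts the integer combinations of the entries exhaust Z, so dividing by 33000 times a power of 5 yields exactly the multiples of 1/264 with denominators a power of 5.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

# First 54 entries of the integer vector; in the text the last 54 entries
# repeat the first 54 in the same order.
half = [
    765, 1185, 360, 255, 735, 1185, 360, 255, 765, 90, 90, 255, 250, 80,
    735, 255, 80, 400, 360, 660, 600, 660, 300, 360, 360, 255, 400, 360,
    90, 255, 90, 350, 90, 255, 250, 255, 80, 80, 163, 18, 237, 72, 90,
    360, 237, 72, 204, 204, 163, 300, 51, 18, 50, 51,
]
alpha_int = half + half

total = sum(alpha_int)              # normalising constant of the eigenvector
common = reduce(gcd, alpha_int)     # gcd of all entries
unit = Fraction(5**3, total)        # 5^3 / 33000, i.e. 1/33000 up to a unit of Z[1/5]

print(len(alpha_int), total, common, unit)
```

Since 5 is invertible in Z[1/5], the factor 5^3 in 33000 is absorbed, which is why only 1/264 survives in the final formula.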
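We end this article with two points. The first one is that, in fact, $\mu^t$ is a probability measure on $\Xi$:

Lemma 4.5. $\mu^t(\Xi) = 1$.

Proof. This result is obtained using the formula in [MS06] p.90: if $f : \Omega \to \Xi$ is a Borel function with $f(x)$ in the same leaf as $x$, then for each $\omega \in \Xi$ we can define $\rho_\omega$ on $f^{-1}(\omega)$ as the restriction of the Haar measure on the leaf $l(\omega)$ of $\omega$ to $f^{-1}(\omega) \subset l(\omega)$. We then have:
\[
\mu(E) = \int_\Xi \Big( \int_{f^{-1}(\omega)} \chi_E(x) \, d\rho_\omega(x) \Big) \, d\mu^t(\omega),
\]
for any Borel set $E$ in $\Omega$. One then just needs to produce such a Borel function. For this, consider $\hat{p}_i^c$ ($i = 1, \dots, 108$) the 108 collared prototiles of the pinwheel tiling and let $V_i^k$ be the open subset of $\Omega$ homeomorphic to $\Xi(p_i^c) \times \mathring{p}_i^c \times S_k$, where $\mathring{p}_i^c$ is the interior of the representative of $\hat{p}_i^c$ with the origin on its punctuation (and with the “good” orientation) and
\[
S_k = \Big( \frac{2k\pi}{3} - \frac{\pi}{3}, \ \frac{2k\pi}{3} + \frac{\pi}{3} \Big) \quad \text{for } k = 0, 1, 2.
\]
Set $U_i = \bigcup_{k=0}^{2} V_i^k$ for $i = 1, \dots, 108$. The union of the closure of these open sets is a covering of $\Omega$:
\[
\Omega = \bigcup_{i=1}^{108} \overline{U_i} = \bigcup_{i=1}^{108} \bigcup_{k=0}^{2} \overline{V_i^k}.
\]
Setting
\[
F_1^0 = V_1^0, \quad F_1^1 = V_1^1 \setminus V_1^0, \quad F_1^2 = V_1^2 \setminus \big(V_1^0 \cup V_1^1\big), \quad F_2^0 = V_2^0 \setminus U_1, \quad F_2^1 = V_2^1 \setminus \big(V_2^0 \cup U_1\big), \ \dots, \quad F_{108}^2 = V_{108}^2 \setminus \Big( \bigcup_{l=1}^{107} U_l \cup V_{108}^0 \cup V_{108}^1 \Big),
\]
we have a covering of $\Omega$ by disjoint Borel sets $F_i^k$.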
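The sets F above are obtained by the usual "subtract everything that came before" construction. As a minimal sketch on finite sets (the function name `disjointify` and the sample data are hypothetical, and finite sets only stand in for the Borel sets of the proof), the construction can be written as:

```python
def disjointify(sets):
    """Turn an ordered family of sets into pairwise disjoint pieces
    with the same union: piece j = sets[j] minus all earlier sets."""
    seen = set()
    pieces = []
    for current in sets:
        current = set(current)
        pieces.append(current - seen)  # remove what earlier sets already cover
        seen |= current
    return pieces


# Hypothetical overlapping family, in the order in which it is processed.
family = [{1, 2, 3}, {3, 4}, {1, 5}]
pieces = disjointify(family)
```

The order of the family matters for which piece receives a shared point, but not for the union, which mirrors the fixed ordering of the indices (i, k) used in the proof.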
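Define $f : \Omega \to \Xi$ by $f(\omega) = \omega_i$ if $\omega \in F_i^k$, $\omega = \omega_i.(x_i, \theta_i)$ with $\omega_i \in \Xi(p_i^c)$ and $(x_i, \theta_i) \in p_i^c \times S_k$. This is a Borel function from $\Omega$ to the transversal $\Xi$ and $f(\omega)$ is on the same leaf as $\omega$. This proves that $\mu$ can be reconstructed from $f$ and $\mu^t$:
\[
\mu(E) = \int_\Xi \sum_{k=0}^{2} \int_{\mathring{p}_\omega^c \times S_k} \chi_E\big(\omega.(x, \theta)\big) \, d\lambda(x, \theta) \, d\mu^t(\omega), \qquad (2)
\]
where $p_\omega^c$ is the prototile type of the tile in $\omega$ surrounding the origin. As $V_i^k$ and $V_j^l$ are disjoint if $i \neq j$ or $k \neq l$ and because the border of $V_i^k$ is a set of measure zero, we obtain:
\[
1 = \mu(\Omega) = \sum_{i=1}^{108} \sum_{k=0}^{2} \mu(F_i^k) = \sum_{i=1}^{108} \mu(U_i) = \sum_{i=1}^{108} \mu^t\big(\Xi(p_i^c)\big) = \sum_{i=1}^{108} \mu^t\big(\chi_{p_i^c}\big) = \mu^t(\Xi).
\]
Hence, we have obtained the following result:

Proposition 4.6. For pinwheel tilings, the gap-labeling (or patch frequencies) is given by:
\[
\tau^\mu_*\Big( K_0\big( C(\Omega) \rtimes \mathbb{R}^2 \rtimes S^1 \big) \Big) = \frac{1}{264} \, \mathbb{Z}\Big[\frac{1}{5}\Big].
\]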
The second point is that we can recover from the explicit formula (1) for $\mu^t$ that, in fact, the continuous hull of the pinwheel tiling equipped with the action of the direct isometries is uniquely ergodic (compare with [Fre08,LMS02,Rad95] and [Sol97]):

Lemma 4.7. $(\Omega, \mathbb{R}^2 \rtimes S^1)$ is uniquely ergodic.

Proof. This is readily obtained because $\Xi$ is a Cantor set and hence every Borel measure on $\Xi$ is completely determined by its value on each clopen partition of $\Xi$. If $\mu$ and $\nu$ are two invariant Borel probability measures on $\Omega$ then they induce two invariant probability measures $\mu^t$ and $\nu^t$ on the transversal $\Xi$. By the equality (1), these measures must agree on the clopen sets defined by the successive collared supertiles which form a base of clopen neighborhoods for $\Xi$, so $\mu^t = \nu^t$. Finally, by formula (2), we can see that $\mu = \nu$.

5. Conclusion

Using methods developed in [SB09], we thus proved the gap-labeling for the pinwheel tiling (the (1, 2)-pinwheel tiling). We think that the methods developed in the present paper are in fact more general.
We think that, using the construction of Savinien and Bellissard [SB09], we may prove that the top Čech cohomology group of more general tilings is in fact the integer group of coinvariants of the canonical transversal. This result coupled with the diagram of Sect. 4.1 can then prove the known fact (see [BBG06]) that the image under the Ruelle-Sullivan current of the copy of the top integer Čech cohomology of the hull in the longitudinal cohomology of $\Omega$ is in fact $\mu^t(C(\Xi, \mathbb{Z}))$.

We point out another possible generalization of our work. We guess that, for all $m$ and $n$ in $\mathbb{N}^*$, the number of $(m, n)$-pinwheel tilings fixed by a finite rotation always remains finite allowing us to apply the results of [Mou] and of this paper. The result of the final section can still be applied to such tilings, obtaining that the gap-labeling of $(m, n)$-pinwheel tilings is given by the $\mathbb{Z}$-module of “patch frequencies” $c.\mathbb{Z}\big[\frac{1}{m^2+n^2}\big]$, where $c$ is a scalar normalizing the Perron eigenvector. Here is the substitution for the (2, 3)-pinwheel tiling:
and for the (1, 4)-pinwheel:
Moreover, we think that, for any substitution tiling of R2 which is self-similar, repetitive, of finite type, non periodic and uniquely decomposable (i.e can be decomposed uniquely in supertiles of order k, for all k ∈ N∗ ), the number of tilings fixed by a finite rotation always remains finite allowing again to apply the results obtained in this paper and in [Mou]. For examples of such tilings, see [Fre08]. The results of the present paper would also allow us to retrieve the fact that the tiling space of any substitution tiling is uniquely ergodic, which was already known (see [Fre08,LMS02,Rad95 and Sol97]) (Figs. 3, 4).
Fig. 3. 28 collared prototiles
Fig. 4. 26 other collared prototiles
Acknowledgements. It is a pleasure for me to thank my advisor Hervé Oyono-Oyono who always supported and advised me during this work. I also want to thank Jean Bellissard for useful discussions on the gap-labeling conjecture, Ian Putnam and Michael Whittaker for useful discussions on pinwheel tilings and their gap-labeling. I am also grateful to Dirk Frettlöh for useful conversations on diffraction and patch frequencies. I am also indebted to the SSM Department of Victoria BC University for its hospitality during my stay in March 2009 where this paper was written.
References

[AP98] Anderson, J.E., Putnam, I.F.: Topological invariants for substitution tilings and their associated C∗-algebras. Ergod. Th. & Dynam. Sys. 18, 509–537 (1998)
[BBG06] Bellissard, J., Benedetti, R., Gambaudo, J.-M.: Spaces of tilings, finite telescopic approximation and gap labelings. Commun. Math. Phys. 261, 1–41 (2006)
[BDHS] Barge, M., Diamond, B., Hunton, J., Sadun, L.: Cohomology of Substitution Tiling Spaces. To appear in Ergod. Th. & Dynam. Sys., DOI:10.1017/s0143385709000777, Nov. 2009, also available at http://arXiv.org/abs/0811.2507v1[math.OS], 2008
[Bel82] Bellissard, J.: Schrödinger’s operator with an almost periodic potential: an overview. In: Proc. of VI Int. Conf. on Math. Phys. (W. Berlin, 1981) Lecture Notes in Physics 153, Berlin: Springer, 1982
[Bel86] Bellissard, J.: K-theory of C∗-algebras in Solid State Physics. In: Lecture Notes in Physics 257, Berlin: Springer, pp. 99–156, 1986
[Bel92] Bellissard, J.: Gap labelling theorems for Schrödinger operators. In: From number theory to physics (Les Houches, 1989), Berlin: Springer, 1992, pp. 538–630
[BG03] Benedetti, R., Gambaudo, J.-M.: On the dynamics of G-solenoids. Applications to Delone sets. Ergod. Th. & Dynam. Sys. 29, 673–691 (2003)
[BHZ00] Bellissard, J., Herrmann, D.J.L., Zarrouati, M.: Hulls of aperiodic solids and gap labeling theorems. CRM Monogr. Ser. 13, Providence, RI: Amer. Math. Soc., 2000, pp. 207–258
[BKL01] Bellissard, J., Kellendonk, J., Legrand, A.: Gap-labelling for three-dimensional aperiodic solids. C.R.A.S, Serie I 332, 521–525 (2001)
[BOO02] Benameur, M.-T., Oyono-Oyono, H.: Index theory for quasi-crystals. I. Computation of the gap-label group. J. Funct. Anal. 252, 137–170 (2002)
[Bre72] Bredon, G.E.: Introduction to compact transformation groups. Pure and applied mathematics 46, London-New York: Academic Press, 1972
[Con79] Connes, A.: Sur la théorie non commutative de l’intégration. Lecture Notes in Math. 725, New York: Springer, 1979, pp. 19–143
[Eff81] Effros, E.G.: Dimensions and C∗-algebras. Conference Board of the Mathematical Sciences, Washington, D.C.: Amer. Math. Soc., 1981
[Eil44] Eilenberg, S.: Singular homology theory. Ann. Math. 45, 407–447 (1944)
[Fre08] Frettlöh, D.: About substitution tilings with statistical circular symmetry. Phil. Mag. 88, 2033–2039 (2008)
[HRS05] Holton, C., Radin, C., Sadun, L.: Conjugacies for tiling dynamical systems. Commun. Math. Phys. 254, 343–359 (2005)
[HY61] Hocking, J.G., Young, G.S.: Topology. Reading, MA: Addison-Wesley Publishing Company, 1961
[Kel97] Kellendonk, J.: The local structure of tilings and their integer group of coinvariants. Commun. Math. Phys. 187, 115–157 (1997)
[KP00] Kellendonk, J., Putnam, I.F.: Tilings, C∗-algebras and K-theory. CRM monograph Series 13, M.P. Baake, R.V. Moody, eds., Providence, RI: A.M.S., 2000, pp. 177–206
[KP03] Kaminker, J., Putnam, I.: A proof of the gap labeling conjecture. Michigan Math. J. 51, 537–546 (2003)
[LMS02] Lee, J.-Y., Moody, R.V., Solomyak, B.: Pure point dynamical and diffraction spectra. Ann. Henri Poincaré 3, 1003–1018 (2002)
[LP03] Lagarias, J.C., Pleasant, P.A.B.: Repetitive Delone sets and quasicrystals. Ergod. Th. & Dynam. Sys. 23, 831–867 (2003)
[Mou] Moustafa, H.: An index theorem to solve the gap-labeling conjecture for the pinwheel tiling. http://arxiv.org/abs/1001.4202v1[math.OA], 2010
[MS06] Moore, C.C., Schochet, C.: Global analysis on foliated spaces. MSRI Publications, 9, Berkeley, CA: MSRI, 2006
[ORS02] Ormes, N., Radin, C., Sadun, L.: A homeomorphism invariant for substitution tiling spaces. Geometriae Dedicata 90, 153–182 (2002)
[PB09] Pearson, J., Bellissard, J.: Noncommutative Riemannian geometry and diffusion on ultrametric Cantor sets. J. Noncommutative Geom. 3, 847–865 (2009)
[Pet05] Petite, S.: Pavages du demi-plan hyperbolique et laminations. PhD thesis, Univ. de Bourgogne I, 2005
[Rad94] Radin, C.: The pinwheel tilings of the plane. Ann. of Math. 139, 661–702 (1994)
[Rad95] Radin, C.: Space tilings and substitutions. Geom. Dedicata 55, 257–264 (1995)
[RS98] Radin, C., Sadun, L.: An algebraic invariant for substitution tiling systems. Geom. Dedicata 73, 21–37 (1998)
[Sad] Sadun, L.: Private conversation in September 2007
[SB09] Savinien, J., Bellissard, J.: A spectral sequence for the K-theory of tiling spaces. Ergod. Th. & Dynam. Sys. 29, 997–1031 (2009)
[SBGC84] Shechtman, D., Blech, I., Gratias, D., Cahn, J.V.: Metallic phase with long range orientational order and no translational symmetry. Phys. Rev. Lett. 53, 1951–1953 (1984)
[Sol97] Solomyak, B.: Dynamics of self-similar tilings. Ergod. Th. & Dynam. Sys. 17, 695–738 (1997); Solomyak, B.: Corrections to Dynamics of self-similar tilings. Ergod. Th. & Dynam. Sys. 19, 1685 (1999)
[vE94] van Elst, A.: Gap labelling theorems for Schrödinger operators on the square and cubic lattices. Rev. Math. Phys. 6, 319–342 (1994)
Communicated by A. Connes
Commun. Math. Phys. 298, 407–418 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1000-4
Communications in
Mathematical Physics
Localization of Analytic Regularity Criteria on the Vorticity and Balance Between the Vorticity Magnitude and Coherence of the Vorticity Direction in the 3D NSE
Zoran Grujić1, Rafaela Guberović1,2
1 Department of Mathematics, University of Virginia, Charlottesville, VA 22904, USA. E-mail: [email protected]; [email protected]
2 Seminar für Angewandte Mathematik, HG J 47, Rämistrasse 101,
8092 Zürich, Switzerland. E-mail: [email protected] Received: 29 July 2009 / Accepted: 4 November 2009 Published online: 2 February 2010 – © Springer-Verlag 2010
Abstract: The first part of the paper provides spatio-temporal localization of a family of analytic regularity classes for the 3D NSE obtained by Beirão da Veiga (space-time integrability of the gradient of the velocity on $\mathbb{R}^3 \times (0, T)$ which is out of the range of the Sobolev embedding theorem reduction to the classical Foias-Ladyzhenskaya-Prodi-Serrin space-time integrability conditions on the velocity) as well as the localization of the Beale-Kato-Majda regularity criterion (time integrability of the $L^\infty$-norm of the vorticity). The second part introduces a family of local, scaling invariant, hybrid geometric-analytic classes in which coherence of the vorticity direction serves as a weight in the local spatio-temporal integrability of the vorticity magnitude.

1. Introduction

The classical Foias-Ladyzhenskaya-Prodi-Serrin regularity criterion (FLPS) for weak solutions to the 3D NSE on an open space-time domain $\Omega \times (0, T)$,
\[
u_t - \Delta u + (u \cdot \nabla) u + \nabla p = 0, \qquad (1)
\]
supplemented with the incompressibility condition $\mathrm{div}\, u = 0$, where $u$ is the velocity of the fluid and $p$ is the pressure, reads
\[
\|u\|_{L^q(\Omega)}^{\frac{2q}{q-3}} \in L^1(0, T) \quad \text{for some } 3 < q \leq \infty.
\]
Local versions as well as generalizations to the weak Lebesgue spaces can be found, e.g., in [Se,St,T,KK] and the references therein. In the endpoint case $q = 3$, i.e., $\|u\|_{L^3_x} \in L^\infty_t$, “standard methods” require a smallness condition; this smallness condition was removed in [ESS] via a different method utilizing unique continuation across a spatial boundary, backward uniqueness for the heat equation and Carleman-type estimates. A method for the study of a possible singular set in the class of “suitable weak solutions” was introduced in [CKN]. The method is based on the localized energy
inequality, scaling arguments and local estimates on the pressure (this is also for the velocity-pressure formulation), and it provides blow-up criteria on the family of shrinking parabolic cylinders below a spatio-temporal point. One consequence of the blow-up estimates is that the 1D Hausdorff measure of the singular set in $\Omega \times (0, T)$ is zero (see also recent works [ZS] and [GKT] where a local FLPS criterion is given as a special case of a regularity criterion obtained by the CKN method and a CKN-type condition is presented for a boundary point, respectively). The following family of regularity classes was introduced in [daVeiga],
\[
\|Du\|_{L^q(\mathbb{R}^3)}^{\frac{2q}{2q-3}} \in L^1(0, T) \quad \text{for some } 3 \leq q < \infty. \qquad (2)
\]
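As a quick consistency check (a sketch only, assuming the standard NSE scaling $u_\lambda(x,t) = \lambda u(\lambda x, \lambda^2 t)$ with $\lambda > 0$ on the whole space), one can verify that the class (2) is scaling invariant. The key point is that the exponent satisfies $\big(2 - \tfrac{3}{q}\big) \cdot \tfrac{2q}{2q-3} = 2$, which exactly compensates the change of variable $s = \lambda^2 t$ in time:

```latex
\begin{align*}
Du_\lambda(x,t) &= \lambda^{2}\,(Du)(\lambda x, \lambda^{2} t), \\
\|Du_\lambda(\cdot,t)\|_{L^q(\mathbb{R}^3)}
  &= \lambda^{2-\frac{3}{q}}\,\|Du(\cdot,\lambda^{2} t)\|_{L^q(\mathbb{R}^3)}, \\
\int_0^{T/\lambda^{2}} \|Du_\lambda(\cdot,t)\|_{L^q(\mathbb{R}^3)}^{\frac{2q}{2q-3}}\,dt
  &= \lambda^{2}\int_0^{T/\lambda^{2}} \|Du(\cdot,\lambda^{2} t)\|_{L^q(\mathbb{R}^3)}^{\frac{2q}{2q-3}}\,dt
   = \int_0^{T} \|Du(\cdot,s)\|_{L^q(\mathbb{R}^3)}^{\frac{2q}{2q-3}}\,ds .
\end{align*}
```

The same computation with $u$ in place of $Du$ (one power of $\lambda$ fewer) gives the exponent $\tfrac{2q}{q-3}$ of the FLPS condition.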
Note that in this range of the parameter $q$, $3 \leq q < \infty$, in contrast to the case $\frac{3}{2} \leq q < 3$, the regularity criterion (2) cannot be reduced to the FLPS criterion via the Sobolev embedding theorem; hence, (2) represents a natural extension of the FLPS regularity classes. (In the case of the whole space, the proper $L^p$-norms of $\omega$ and $\nabla u$ are equivalent due to the Calderon-Zygmund theorem; hence, one can view (2) as a regularity criterion on the vorticity.) A generalization to the case of Dirichlet boundary conditions on a $C^2$-domain was given in [B]. The case $q = \infty$, more precisely, the time integrability of $\|\omega\|_{L^\infty(\mathbb{R}^3)}$, where $\omega = \mathrm{curl}\, u$ is the vorticity, is the Beale-Kato-Majda (BKM) criterion [BKM]; actually, less is required: the time integrability of the $BMO$-norm of $\omega$ [KT] (see also [KOT1,KOT2] for generalizations to the time integrability of the homogeneous Besov norm $\dot{B}^0_{\infty,\infty}$). Note that all the aforementioned analytic regularity classes are critical in the sense that they are invariant with respect to the NSE scaling $u_\lambda(x, t) = \lambda u(\lambda x, \lambda^2 t)$, $p_\lambda(x, t) = \lambda^2 p(\lambda x, \lambda^2 t)$, $\omega_\lambda(x, t) = \lambda^2 \omega(\lambda x, \lambda^2 t)$ ($\lambda > 0$).

In the first part of the paper, we present a spatio-temporal localization of the regularity class (2) as well as the BKM regularity criterion to an arbitrarily small space-time cylinder $Q_r(x_0, t_0) = B_r(x_0) \times (t_0 - r^2, t_0)$ utilizing a spatio-temporal localization of vortex-stretching recently obtained in [Gr2]. Localization of the $BMO$-criterion is addressed in the forthcoming work [GG].

The study of geometric criteria for the regularity was pioneered by Constantin in [Co] where an explicit representation formula for the stretching factor in the evolution of the vorticity magnitude was derived. The representation is in the form of a singular integral whose kernel is depleted by coherence of the vorticity direction.
This geometric depletion of the nonlinearity was utilized in [CoFe] where it was shown that Lipschitz coherence in the region of high vorticity on an interval $(0, T)$ prevents a possible formation of singularities at $t = T$. The result was sharpened in [daVeigaBe1] where the Lipschitz coherence was replaced with the $\frac{1}{2}$-Hölder coherence.

A different approach to discovering geometric conditions on the 3D NSE flow preventing the singularity formation was introduced in [Gr] utilizing local-in-time spatial analyticity of the solutions via a plurisubharmonic measure ($p$-measure) maximum principle in $\mathbb{C}^3$, an extension of the classical harmonic measure maximum principle in one complex variable (for computing the $p$-measure using foliations see [GaKa,Ga]). The result states that local existence of a thin direction in the region of intense vorticity on the length scale that is essentially an $L^\infty$-version of the Kolmogorov dissipation scale suffices to control the evolution of the vorticity magnitude, preventing finite time blow-up. It is interesting that the same length scale appeared in [RuGr] utilizing a completely different technique; namely, restricting a representation formula for the vortex-stretching term
to small (comparable to the aforementioned dissipation scale), medium and large scales. The result reads that sparseness (in the sense of volume) of the region of high vorticity depletes the nonlinearity. Detecting additional cancelation properties in the nonlinearity, it was then shown (cf. [RuGr]) that a certain isotropy condition on the velocity in the region of intense fluid activity prevents the possible formation of singularities.

A mixed geometric-analytic regularity class was presented in [daVeiga2]. For $0 < s \le \frac{1}{2}$, let $p$ be such that $\frac{3}{p} = s + 1$. Then the class in view is defined by requiring $s$-Hölder coherence of the vorticity direction and the following space-time integrability condition on the vorticity magnitude,
\[
\omega \in L^2\big(0, T; L^p(\mathbb{R}^3)\big).
\tag{3}
\]
Another mixed geometric-analytic regularity class was independently introduced in [GrRu]; namely, $\frac{1}{q}$-Hölder coherence of the vorticity direction supplemented with the following integrability condition on the vorticity magnitude:
\[
\|\omega\|_{L^q(\mathbb{R}^3)}^{\frac{q}{q-1}} \in L^1(0, T)
\tag{4}
\]
for some $2 \le q < \infty$. Note that the above two conditions are in fact complementary; they meet at the purely geometric case ($\frac{1}{2}$-Hölder coherence of the vorticity direction), and then they run in two separate directions: the first to the endpoint regularity class in (2), and the second to the BKM criterion. A more general family of mixed geometric-analytic regularity classes was recently obtained in [Ch], where an assumption on the space-time integrability of the vorticity magnitude is paired with an assumption of the membership of the vorticity direction in a homogeneous Triebel-Lizorkin space $\dot{F}^s_{p,q}$.

In all the aforementioned purely geometric and mixed geometric-analytic regularity criteria the spatial domain is the whole space $\mathbb{R}^3$, the main reason being that the techniques used rely on the Biot-Savart law, a non-local representation of the velocity by the vorticity over the whole space. The case of the non-slip boundary conditions on smooth bounded domains was treated in [daVeiga2], where a $\frac{1}{2}$-Hölder coherence regularity criterion was established under an additional assumption on the control of the normal derivative of the vorticity magnitude along the boundary. In the case of the slip boundary conditions on the half-space, no additional assumptions are needed (cf. [daVeiga1]). This result was recently generalized to the case of the free boundary-type boundary conditions on smooth bounded domains in [daVeigaBe2]. In each case, the proof is based on a version of the Biot-Savart law corresponding to the particular type of the boundary conditions.

Since the possible formation of singularities is a local phenomenon, a physically relevant question is whether it is possible to localize geometric and mixed geometric-analytic regularity conditions. In the case of the whole space, a localization of the mixed geometric-analytic conditions (4) (including the purely geometric $\frac{1}{2}$-Hölder coherence) to an arbitrarily small space-time cylinder was obtained in [GrZh].
The localization of the transport of the vorticity by the velocity was independent of the choice of the spatial domain/boundary conditions; however, the localization of the vortex-stretching term was based on the restriction of Constantin’s representation formula for the stretching factor in the evolution of the vorticity magnitude (valid on the whole space) to small scales. A localized representation formula for the vortex-stretching term based on a localization
of the Biot-Savart law was recently derived in [Gr2]. This led to complete localization of the evolution of the enstrophy independently of the type of the boundary conditions, and in particular, to complete localization of the $\frac{1}{2}$-Hölder coherence condition (cf. [Gr2]).

In the second part of the paper, we introduce a family of local, critical, hybrid geometric-analytic classes in which coherence of the vorticity direction serves as a weight in the space-time integrability of the vorticity magnitude. (Instead of integrating the vorticity magnitude against the homogeneous Lebesgue measure, we integrate it against a nonhomogeneous measure that depends only on the vorticity direction.) Denote by $\eta$ the vorticity direction. Let $(x, t)$ be a spatio-temporal point, $r > 0$ and $0 < \gamma < 1$. A $\gamma$-Hölder measure of coherence of the vorticity direction at $(x, t)$ is then given by
\[
\rho_{\gamma,r}(x, t) = \sup_{y \in B(x,r),\, y \ne x} \frac{|\sin \varphi(\eta(x, t), \eta(y, t))|}{|x - y|^{\gamma}}.
\]
For a given point $(x_0, t_0)$ and $\alpha, \delta > 0$, the class under consideration is defined by requiring finiteness of
\[
\int_{t_0-(2r)^2}^{t_0} \left( \int_{B(x_0,2r)} |\omega(x, t)|^{\alpha}\, \rho_{\gamma,2r}^{\alpha}(x, t)\, dx \right)^{\delta} dt.
\tag{5}
\]
It will transpire in the proof that
\[
\delta = \frac{2}{(2 + \gamma)\alpha - 3}
\tag{6}
\]
with the suitable restrictions on the parameters $\alpha$ and $\gamma$. It is worth observing that the condition (6) makes the class (5) scaling invariant (critical). The case $\gamma = \frac{1}{2}$, $\alpha = 2$ and $\delta = 1$, namely
\[
\int_{t_0-(2R)^2}^{t_0} \int_{B(x_0,2R)} |\omega(x, t)|^2\, \rho_{\frac{1}{2},2R}^{2}(x, t)\, dx\, dt,
\tag{7}
\]
is included (and is an improvement over the $\frac{1}{2}$-Hölder coherence, that is, the sup in $x$ and $t$ bound on $\rho_{\frac{1}{2}}$). For a comparison with what can be bounded, a local spatio-temporal average of $|\omega||\nabla\eta|^2$ averaged over a ball around a spatial point moving with the fluid over a suitable interval of time is a priori bounded (cf. [CPS]).

As in [GrZh,Gr2], for simplicity of the exposition, we present the relevant calculations on smooth solutions. More precisely, we consider a Leray solution on a space-time domain $\Omega \times (0, T)$ and suppose that $u$ is smooth in an open parabolic cylinder $Q_{2R}(x_0, t_0) = B(x_0, 2R) \times \big(t_0 - (2R)^2, t_0\big)$ contained in $\Omega \times (0, T)$. The goal is to show that, under a suitable local condition (analytic or hybrid geometric-analytic) on $Q_{2R}(x_0, t_0)$, the localized enstrophy remains uniformly bounded up to $t = t_0$, i.e.,
\[
\sup_{t \in (t_0 - R^2,\, t_0)} \int_{B(x_0,R)} |\omega(x, t)|^2\, dx < \infty.
\]
Alternatively, we can consider, e.g., a class of suitable weak solutions constructed in [CKN] as a limit of a family of delayed mollifications (see also [C1]), and perform the calculations on the smooth approximations.
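The scaling invariance of the class (5) under the choice (6) can be checked directly. The following is a short sketch of that computation; the symbols $\omega_\lambda$, $\eta_\lambda$ for the rescaled vorticity and vorticity direction, the bracket notation $\rho_{\gamma,r}[\eta]$, and the primed variables are introduced here only for illustration.

```latex
% Rescaled vorticity and vorticity direction under the NSE scaling:
\omega_\lambda(x,t) = \lambda^2\,\omega(\lambda x,\lambda^2 t), \qquad
\eta_\lambda(x,t) = \eta(\lambda x,\lambda^2 t).

% The Hoelder quotient gains a factor \lambda^\gamma,
% because |x-y|^{-\gamma} = \lambda^{\gamma}\,|\lambda x-\lambda y|^{-\gamma}:
\rho_{\gamma,r}[\eta_\lambda](x,t)
  = \lambda^{\gamma}\,\rho_{\gamma,\lambda r}[\eta](\lambda x,\lambda^2 t).

% Change variables x' = \lambda x, t' = \lambda^2 t
% (so r' = \lambda r, x_0' = \lambda x_0, t_0' = \lambda^2 t_0):
\int_{t_0-(2r)^2}^{t_0}\Big(\int_{B(x_0,2r)}
   |\omega_\lambda|^{\alpha}\,\rho_{\gamma,2r}^{\alpha}[\eta_\lambda]\,dx\Big)^{\delta}dt
 = \lambda^{\,\delta\,((2+\gamma)\alpha-3)\,-\,2}
   \int_{t_0'-(2r')^2}^{t_0'}\Big(\int_{B(x_0',2r')}
   |\omega|^{\alpha}\,\rho_{\gamma,2r'}^{\alpha}[\eta]\,dx'\Big)^{\delta}dt'.

% The power of \lambda vanishes exactly when
\delta = \frac{2}{(2+\gamma)\alpha-3},
% e.g. \gamma = 1/2, \alpha = 2 gives (5/2)\cdot 2 - 3 = 2 and hence \delta = 1.
```

The exponent bookkeeping is: $\lambda^{(2+\gamma)\alpha}$ from the integrand, $\lambda^{-3}$ from $dx$, the result raised to the power $\delta$, and $\lambda^{-2}$ from $dt$.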
Remark 1. Since all the estimates are local, instead of considering (finite energy) Leray solutions, we can consider a class of local Leray solutions in the sense of [LR] (finite locally uniform energy).

The paper is organized as follows. In Sect. 2 we recall the relevant results from [GrZh,Gr2]. Section 3 is devoted to the localization of the family of analytic regularity classes (2) as well as the BKM regularity criterion, and in Sect. 4 we show that the critical hybrid geometric-analytic condition (5) prevents a possible formation of the singularities in the flow.

Remark 2. After the manuscript was submitted, the authors learned that the paper [ChKaLe] contains a local version of the analytic regularity criteria on the vorticity presented in Theorem 1 excluding the endpoint case $q = \infty$, i.e., the BKM criterion. More precisely, using a localization of the velocity-pressure formulation of the equations, the authors in [ChKaLe] obtained a local version of the regularity criteria on $\nabla u$ [ChKaLe, Prop. 1] which, via a localized Biot-Savart law, lead to the local conditions on the vorticity [ChKaLe, Prop. 2]. (Since the $L^\infty$-norms of $\omega$ and $\nabla u$ are not equivalent, this localization method does not lead to a local BKM criterion.) This was then utilized to obtain a local version of the analytic regularity criteria on the two components of the vorticity derived in [ChCh]. In addition, [ChKaLe] contains a localization of the mixed geometric-analytic regularity criteria obtained in [Ch].
2. Bounds on the Localized Enstrophy

Taking the curl of the velocity-pressure formulation of the 3D NSE (1) on a space-time domain $\Omega \times (0, T)$ leads to the vorticity-velocity formulation,
\[
\omega_t - \Delta\omega + (u \cdot \nabla)\omega = (\omega \cdot \nabla)u.
\tag{8}
\]
The left-hand side of the nonlinearity is the transport term and the right-hand side is the vortex-stretching term. Fix a point $(x_0, t_0)$ in $\Omega \times (0, T)$. Let $0 < R < 1$ be such that $Q_{2R}(x_0, t_0) \subset \Omega \times (0, T)$, $r \le R$, and let $\psi(x, t) = \phi(x)\eta(t)$ be a smooth cut-off function on $Q_{2r}(x_0, t_0)$ satisfying
\[
\operatorname{supp} \phi \subset B(x_0, 2r), \quad \phi = 1 \text{ on } B(x_0, r), \quad \frac{|\nabla\phi|}{\phi^{\rho}} \le \frac{c}{r} \text{ for some } \rho \in (0, 1), \quad 0 \le \phi \le 1
\]
and
\[
\operatorname{supp} \eta \subset (t_0 - (2r)^2, t_0], \quad \eta = 1 \text{ on } [t_0 - r^2, t_0], \quad |\eta'| \le \frac{c}{r^2}, \quad 0 \le \eta \le 1.
\]
(It was shown in [GrZh and Gr2] that, choosing the parameter $\rho$ sufficiently close to 1, it is possible to control the lower order terms in the localized transport term $\int_{Q_{2r}} (u \cdot \nabla)\omega \cdot \psi^2\omega \, dx\,dt$ and in the localized vortex-stretching term $\int_{Q_{2r}} (\omega \cdot \nabla)u \cdot \psi^2\omega \, dx\,dt$, respectively.)
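A spatial cut-off with the gradient property $|\nabla\phi| \le \frac{c}{r}\phi^{\rho}$ can be produced by raising an ordinary cut-off to a large power. The sketch below illustrates this under that assumption; it is not necessarily the construction used in [GrZh] or [Gr2], and the auxiliary function $\zeta$, the constant $c_0$ and the integer $k$ are hypothetical names introduced here.

```latex
% Assumption: \zeta is smooth with 0 \le \zeta \le 1,
% supp \zeta \subset B(x_0,2r), \zeta = 1 on B(x_0,r), |\nabla\zeta| \le c_0/r.
% Define \phi := \zeta^k for an integer k \ge 2. Then
|\nabla\phi| \;=\; k\,\zeta^{k-1}\,|\nabla\zeta|
   \;\le\; \frac{k\,c_0}{r}\,\phi^{\frac{k-1}{k}},
% so the required bound holds with
\rho = 1 - \frac{1}{k}, \qquad c = k\,c_0 .
% \phi inherits the support, the plateau \phi = 1 on B(x_0,r),
% and the bounds 0 \le \phi \le 1 from \zeta.
```

Taking $k$ large pushes $\rho$ towards 1, at the cost of a larger constant $c$.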
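The following localization formula for the vortex-stretching term was obtained in [Gr2]:
\[
\begin{aligned}
\phi^2(x)\,(\omega \cdot \nabla)u \cdot \omega\,(x)
&= \phi(x)\, \frac{\partial}{\partial x_i} u_j(x)\, \phi(x)\, \omega_i(x)\, \omega_j(x) \\
&= -c\, \mathrm{P.V.}\! \int_{B(x_0,2r)} \epsilon_{jkl}\, \frac{\partial^2}{\partial x_i\, \partial y_k} \frac{1}{|x - y|}\, \phi\, \omega_l\, dy\;\, \phi(x)\, \omega_i(x)\, \omega_j(x) + \mathrm{LOT} \\
&= -c\, \mathrm{P.V.}\! \int_{B(x_0,2r)} (\omega(x) \times \omega(y)) \cdot G_\omega(x, y)\, \phi(y)\, \phi(x)\, dy + \mathrm{LOT} \\
&= \mathrm{VST}_{\mathrm{loc}} + \mathrm{LOT}
\end{aligned}
\tag{9}
\]
($\epsilon_{jkl}$ is the Levi-Civita symbol) for $x$ in $B(x_0, 2r)$, uniformly in time, where
\[
(G_\omega(x, y))_k = \frac{\partial^2}{\partial x_i\, \partial y_k} \frac{1}{|x - y|}\, \omega_i(x)
\]
and LOT denotes terms that are either lower order by at least one order of differentiation and/or less singular by at least one power of $|x - y|$ than the leading order term $\mathrm{VST}_{\mathrm{loc}}$.

Let $s$ be in $(t_0 - (2r)^2, t_0)$. Then (cf. [GrZh]),
\[
\begin{aligned}
\frac{1}{2} \int_{B(x_0,2r)} \phi^2(x)\, |\omega|^2(x, s)\, dx &+ \int_{Q^s_{2r}} |\nabla(\psi\omega)|^2\, dx\,dt \\
&\le \frac{1}{2} \int_{Q_{2r}} |\nabla(\psi\omega)|^2\, dx\,dt + c(r) \int_{Q_{2r}} |\omega|^2\, dx\,dt + \left| \int_{Q^s_{2r}} (\omega \cdot \nabla)u \cdot \psi^2\omega\, dx\,dt \right| \\
&= \frac{1}{2} \int_{Q_{2r}} |\nabla(\psi\omega)|^2\, dx\,dt + c(r) \int_{Q_{2r}} |\omega|^2\, dx\,dt + I,
\end{aligned}
\tag{10}
\]
where $Q^s_{2r} = B(x_0, 2r) \times (t_0 - (2r)^2, s)$ (this holds for any $\frac{1}{2} \le \rho < 1$). Notice that
\[
I \le \left| \int_{Q^s_{2r}} \eta^2\, \mathrm{VST}_{\mathrm{loc}}\, dx\,dt \right| + \left| \int_{Q^s_{2r}} \eta^2\, \mathrm{LOT}\, dx\,dt \right|.
\]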
Choosing $\rho$ sufficiently close to 1, the second term (the sum of all the lower order vortex-stretching terms) can be bounded (cf. [Gr2]) by
\[
\max\Big\{ c\, \|\nabla u\|_{L^2(Q_{2r})},\, \frac{1}{4} \Big\} \Big( \frac{1}{2} \sup_{t \in (t_0 - (2r)^2,\, t_0)} \|\phi\omega(t)\|^2_{L^2(B(x_0,2r))} + \|\nabla(\psi\omega)\|^2_{L^2(Q_{2r})} \Big) + M\big(r, \|\nabla u\|_{L^2(Q_{2r})}\big).
\]
Since $u$ is a weak solution, the first term in the above expression can, for a sufficiently small $r$, eventually (after sending $s$ to $t_0$) be absorbed by the left-hand side of (10) and the second term is bounded.
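Summarizing, in order to control the evolution of the localized enstrophy, we are left to estimate the leading order vortex-stretching term $\int_{Q^s_{2r}} \eta^2\, \mathrm{VST}_{\mathrm{loc}}\, dx\,dt$, i.e.,
\[
\int_{Q^s_{2r}} \mathrm{P.V.}\! \int_{B(x_0,2r)} \epsilon_{jkl}\, \frac{\partial^2}{\partial x_i\, \partial y_k} \frac{1}{|x - y|}\, \psi(y, t)\, \omega_l(y, t)\, dy\;\, \psi(x, t)\, \omega_i(x, t)\, \omega_j(x, t)\, dx\,dt.
\tag{11}
\]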
3. Localization of Analytic Regularity Criteria on the Vorticity

In this section we obtain spatio-temporal localization of the regularity criteria in (2) and the BKM regularity criterion to an arbitrarily small parabolic cylinder.

Theorem 1. Let $u$ be a Leray solution on a space-time domain $\Omega \times (0, T)$, $(x_0, t_0)$ in $\Omega \times (0, T)$ and $0 < R < 1$ such that the parabolic cylinder $Q_{2R}(x_0, t_0) = B(x_0, 2R) \times \big(t_0 - (2R)^2, t_0\big)$ is contained in $\Omega \times (0, T)$. Suppose that $u$ is smooth in $Q_{2R}(x_0, t_0)$ and
\[
\|\omega\|_{L^q(B(x_0,2R))}^{\frac{2q}{2q-3}} \ \text{is in}\ L^1\big(t_0 - (2R)^2, t_0\big)
\]
for some $3 \le q \le \infty$. Then the localized enstrophy remains uniformly bounded up to $t = t_0$, i.e.,
\[
\sup_{t \in (t_0 - R^2,\, t_0)} \int_{B(x_0,R)} |\omega(x, t)|^2\, dx < \infty.
\]

Proof. Let $r \le R$. Following a discussion in the previous section, it remains to estimate the leading order vortex-stretching term
\[
J = \int_{Q^s_{2r}} \mathrm{P.V.}\! \int_{B(x_0,2r)} \epsilon_{jkl}\, \frac{\partial^2}{\partial x_i\, \partial y_k} \frac{1}{|x - y|}\, \psi(y, t)\, \omega_l(y, t)\, dy\;\, \psi(x, t)\, \omega_i(x, t)\, \omega_j(x, t)\, dx\,dt.
\]
Notice that $\frac{\partial^2}{\partial x_i\, \partial y_k} \frac{1}{|x - y|}$ is a Calderon-Zygmund kernel. Consider first the case $3 \le q < \infty$. Hölder inequality in $x$ with the exponents $2q'$, $2q'$ and $q$, where $q'$ is the conjugate exponent of $q$, followed by an application of the Calderon-Zygmund theorem, yield
\[
J \le c \int_{t_0-(2r)^2}^{t_0} \|\omega(t)\|_{L^q(B(x_0,2r))}\, \|(\psi\omega)(t)\|^2_{L^{2q'}(B(x_0,2r))}\, dt.
\]
Interpolating the second factor in the integrand,
\[
J \le c \int_{t_0-(2r)^2}^{t_0} \|\omega(t)\|_{L^q(B(x_0,2r))}\, \|(\psi\omega)(t)\|^{\frac{2q-3}{q}}_{L^2(B(x_0,2r))}\, \|\nabla(\psi\omega)(t)\|^{\frac{3}{q}}_{L^2(B(x_0,2r))}\, dt.
\]
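Hölder inequality in $t$ with the exponents $\frac{2q}{2q-3}$, $\infty$ and $\frac{2q}{3}$, and the polarization inequality imply our final bound on $J$,
\[
J \le c \left( \int_{t_0-(2r)^2}^{t_0} \|\omega(t)\|^{\frac{2q}{2q-3}}_{L^q(B(x_0,2r))}\, dt \right)^{\frac{2q-3}{2q}} \times \left( \frac{1}{2} \sup_{t \in (t_0 - (2r)^2,\, t_0)} \|\phi\omega(t)\|^2_{L^2(B(x_0,2r))} + \|\nabla(\psi\omega)\|^2_{L^2(Q_{2r})} \right).
\tag{12}
\]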
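Collecting all the estimates, (10) yields
\[
\begin{aligned}
\frac{1}{2} &\sup_{t \in (t_0 - (2r)^2,\, t_0)} \|\phi\omega(t)\|^2_{L^2(B(x_0,2r))} + \|\nabla(\psi\omega)\|^2_{L^2(Q_{2r})} \\
&\le \frac{1}{2} \|\nabla(\psi\omega)\|^2_{L^2(Q_{2r})} + c(r)\, \|\omega\|^2_{L^2(Q_{2r})} \\
&\quad + \max\Big\{ c\, \|\nabla u\|_{L^2(Q_{2r})},\, \frac{1}{4} \Big\} \Big( \frac{1}{2} \sup_{t \in (t_0 - (2r)^2,\, t_0)} \|\phi\omega(t)\|^2_{L^2(B(x_0,2r))} + \|\nabla(\psi\omega)\|^2_{L^2(Q_{2r})} \Big) \\
&\quad + M\big(r, \|\nabla u\|_{L^2(Q_{2r})}\big) \\
&\quad + c \left( \int_{t_0-(2r)^2}^{t_0} \|\omega(t)\|^{\frac{2q}{2q-3}}_{L^q(B(x_0,2r))}\, dt \right)^{\frac{2q-3}{2q}} \times \Big( \frac{1}{2} \sup_{t \in (t_0 - (2r)^2,\, t_0)} \|\phi\omega(t)\|^2_{L^2(B(x_0,2r))} + \|\nabla(\psi\omega)\|^2_{L^2(Q_{2r})} \Big).
\end{aligned}
\]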
Since $u$ is a weak solution, choosing $r$ small enough, the third term can be absorbed. Similarly, our assumption on the local integrability of $\omega$ implies that the last term can be absorbed too.

This proof actually works for any $\frac{3}{2} < q < \infty$ (and for $q = \frac{3}{2}$ under an additional smallness assumption); however, as already mentioned in the Introduction, in the range of the parameter $q$, $\frac{3}{2} \le q < 3$, the regularity criterion (2) can be reduced to the FLSP criterion via the Sobolev embedding theorem.

The case $q = \infty$ requires only minor modifications. Hölder inequality in $x$ with the exponents 2, 2 and $\infty$ and the Calderon-Zygmund theorem yield
\[
J \le c \int_{t_0-(2r)^2}^{t_0} \|\omega(t)\|_{L^\infty(B(x_0,2r))}\, \|(\psi\omega)(t)\|^2_{L^2(B(x_0,2r))}\, dt,
\]
which is in turn bounded by
\[
c \left( \int_{t_0-(2r)^2}^{t_0} \|\omega(t)\|_{L^\infty(B(x_0,2r))}\, dt \right) \sup_{t \in (t_0 - (2r)^2,\, t_0)} \|\phi\omega(t)\|^2_{L^2(B(x_0,2r))};
\]
this can be absorbed for $r$ small enough. The result for $r = R$ follows by covering $B(x_0, 2R)$ with finitely many balls.
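The exponents used in the case $3 \le q < \infty$ can be checked by hand. The sketch below assumes that the interpolation step is the standard Gagliardo-Nirenberg inequality in $\mathbb{R}^3$ applied to the compactly supported function $f = \psi\omega$; the symbols $f$, $p$ and $\theta$ are introduced here for illustration only.

```latex
% Hoelder in x, with q' = q/(q-1):
\frac{1}{2q'} + \frac{1}{2q'} + \frac{1}{q} = \frac{1}{q'} + \frac{1}{q} = 1.

% Gagliardo-Nirenberg in R^3 for 2 \le p \le 6 (assumed form of the interpolation):
\|f\|_{L^p} \le c\,\|f\|_{L^2}^{1-\theta}\,\|\nabla f\|_{L^2}^{\theta},
\qquad \theta = 3\Big(\frac{1}{2}-\frac{1}{p}\Big).

% With p = 2q' = 2q/(q-1):
\theta = 3\Big(\frac{1}{2}-\frac{q-1}{2q}\Big) = \frac{3}{2q},
\qquad
\|f\|_{L^{2q'}}^{2} \le c\,\|f\|_{L^2}^{\frac{2q-3}{q}}\,\|\nabla f\|_{L^2}^{\frac{3}{q}}.

% Hoelder in t with exponents 2q/(2q-3), \infty, 2q/3:
\frac{2q-3}{2q} + 0 + \frac{3}{2q} = 1.
```

The constraint $p = 2q' \le 6$ is equivalent to $q \ge \frac{3}{2}$, which matches the range quoted in the proof.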
4. A Local Hybrid Geometric-Analytic Regularity Criterion

This section introduces a two-parameter family of local, scaling invariant (critical), hybrid geometric-analytic classes in which coherence of the vorticity direction serves as a weight in the space-time integrability of the vorticity magnitude. The classes in view represent an improvement over the corresponding mixed geometric-analytic criteria in which coherence of the vorticity direction and space-time integrability of the vorticity magnitude are considered separately (e.g., compared to the classes presented in [GrRu]), but more importantly, provide a local, more refined measure of the balance between the vorticity magnitude and coherence of the vorticity direction preventing the formation of singularities.

As in the Introduction, denote by $\rho_{\gamma,r}$ the following $\gamma$-Hölder measure of coherence of the vorticity direction $\eta$ at a point $(x, t)$,
\[
\rho_{\gamma,r}(x, t) = \sup_{y \in B(x,r),\, y \ne x} \frac{|\sin \varphi(\eta(x, t), \eta(y, t))|}{|x - y|^{\gamma}}.
\]

Theorem 2. Let $u$ be a Leray solution on a space-time domain $\Omega \times (0, T)$, $(x_0, t_0)$ in $\Omega \times (0, T)$ and $0 < R < 1$ such that the parabolic cylinder $Q_{2R}(x_0, t_0) = B(x_0, 2R) \times \big(t_0 - (2R)^2, t_0\big)$ is contained in $\Omega \times (0, T)$. Let $0 < \gamma < 1$, $\alpha < \frac{3}{\gamma}$, $\alpha > \frac{3}{\gamma + 2}$ and
\[
\delta = \frac{2}{(2 + \gamma)\alpha - 3}
\]
(the scaling invariance). Suppose that $u$ is smooth in $Q_{2R}(x_0, t_0)$ and
\[
\int_{t_0-(2r)^2}^{t_0} \left( \int_{B(x_0,2r)} |\omega(x, t)|^{\alpha}\, \rho_{\gamma,2r}^{\alpha}(x, t)\, dx \right)^{\delta} dt
\]
is finite. Then the localized enstrophy remains uniformly bounded up to $t = t_0$, i.e.,
\[
\sup_{t \in (t_0 - R^2,\, t_0)} \int_{B(x_0,R)} |\omega(x, t)|^2\, dx < \infty.
\]
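Proof. Let $r \le R$. As in [Gr2], we utilize the geometric structure of the leading order vortex-stretching term $J$, and express it as (see (9))
\[
J = \int_{Q^s_{2r}} \mathrm{P.V.}\! \int_{B(x_0,2r)} (\omega(x, t) \times \omega(y, t)) \cdot G_\omega(x, y, t)\, \psi(y, t)\, \psi(x, t)\, dy\, dx\,dt,
\tag{13}
\]
where
\[
(G_\omega(x, y, t))_k = \frac{\partial^2}{\partial x_i\, \partial y_k} \frac{1}{|x - y|}\, \omega_i(x, t).
\]
This leads to the following bound:
\[
J \le c \int_{t_0-(2r)^2}^{t_0} \int_{B(x_0,2r)} \rho_{\gamma,2r}(x, t)\, |\omega(x, t)| \left( \int_{B(x_0,2r)} \frac{1}{|x - y|^{3-\gamma}}\, |(\psi\omega)(y, t)|\, dy \right) |(\psi\omega)(x, t)|\, dx\, dt.
\tag{14}
\]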
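As a preview, we present the proof in a particular case when $\gamma = \frac{1}{2}$, $\alpha = 2$ and $\delta = 1$ first. Hölder inequality in $x$ with the exponents 2, 4 and 4 yields
\[
J \le c \int_{t_0-(2r)^2}^{t_0} \big\|(\rho_{\frac{1}{2},2r}|\omega|)(t)\big\|_{L^2(B(x_0,2r))} \left\| \int_{B(x_0,2r)} \frac{1}{|\cdot - y|^{\frac{5}{2}}}\, |(\psi\omega)(y, t)|\, dy \right\|_{L^4(B(x_0,2r))} \|(\psi\omega)(t)\|_{L^4(B(x_0,2r))}\, dt.
\]
This is by the Hardy-Littlewood-Sobolev inequality bounded by
\[
c \int_{t_0-(2r)^2}^{t_0} \big\|(\rho_{\frac{1}{2},2r}|\omega|)(t)\big\|_{L^2(B(x_0,2r))}\, \|(\psi\omega)(t)\|_{L^{\frac{12}{5}}(B(x_0,2r))}\, \|(\psi\omega)(t)\|_{L^4(B(x_0,2r))}\, dt.
\]
Interpolating the last two factors in the integrand implies
\[
\begin{aligned}
J &\le c \int_{t_0-(2r)^2}^{t_0} \big\|(\rho_{\frac{1}{2},2r}|\omega|)(t)\big\|_{L^2(B(x_0,2r))}\, \|(\psi\omega)(t)\|_{L^2(B(x_0,2r))}\, \|\nabla(\psi\omega)(t)\|_{L^2(B(x_0,2r))}\, dt \\
&\le c \left( \int_{t_0-(2r)^2}^{t_0} \int_{B(x_0,2r)} \rho^2_{\frac{1}{2},2r}(x, t)\, |\omega(x, t)|^2\, dx\, dt \right)^{\frac{1}{2}} \times \left( \frac{1}{2} \sup_{t \in (t_0 - (2r)^2,\, t_0)} \|\phi\omega(t)\|^2_{L^2(B(x_0,2r))} + \|\nabla(\psi\omega)\|^2_{L^2(Q_{2r})} \right).
\end{aligned}
\]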
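As in the proof of Theorem 1, collecting all the estimates, (10) yields
\[
\begin{aligned}
\frac{1}{2} &\sup_{t \in (t_0 - (2r)^2,\, t_0)} \|\phi\omega(t)\|^2_{L^2(B(x_0,2r))} + \|\nabla(\psi\omega)\|^2_{L^2(Q_{2r})} \\
&\le \frac{1}{2} \|\nabla(\psi\omega)\|^2_{L^2(Q_{2r})} + c(r)\, \|\omega\|^2_{L^2(Q_{2r})} \\
&\quad + \max\Big\{ c\, \|\nabla u\|_{L^2(Q_{2r})},\, \frac{1}{4} \Big\} \Big( \frac{1}{2} \sup_{t \in (t_0 - (2r)^2,\, t_0)} \|\phi\omega(t)\|^2_{L^2(B(x_0,2r))} + \|\nabla(\psi\omega)\|^2_{L^2(Q_{2r})} \Big) \\
&\quad + M\big(r, \|\nabla u\|_{L^2(Q_{2r})}\big) \\
&\quad + c \left( \int_{t_0-(2r)^2}^{t_0} \int_{B(x_0,2r)} \rho^2_{\frac{1}{2},2r}(x, t)\, |\omega(x, t)|^2\, dx\, dt \right)^{\frac{1}{2}} \times \Big( \frac{1}{2} \sup_{t \in (t_0 - (2r)^2,\, t_0)} \|\phi\omega(t)\|^2_{L^2(B(x_0,2r))} + \|\nabla(\psi\omega)\|^2_{L^2(Q_{2r})} \Big),
\end{aligned}
\]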
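finishing the argument.

The proof in the general case follows exactly the same steps. Let $0 < \gamma < 1$ and $\alpha > 1$. Hölder inequality in $x$ with the exponents $\alpha$, $p_1$ and $p_2$ applied to (14) yields
\[
J \le c \int_{t_0-(2r)^2}^{t_0} \|(\rho_{\gamma,2r}|\omega|)(t)\|_{L^{\alpha}(B(x_0,2r))} \left\| \int_{B(x_0,2r)} \frac{1}{|\cdot - y|^{3-\gamma}}\, |(\psi\omega)(y, t)|\, dy \right\|_{L^{p_1}(B(x_0,2r))} \|(\psi\omega)(t)\|_{L^{p_2}(B(x_0,2r))}\, dt,
\]
which is by the Hardy-Littlewood-Sobolev inequality bounded by
\[
c \int_{t_0-(2r)^2}^{t_0} \|(\rho_{\gamma,2r}|\omega|)(t)\|_{L^{\alpha}(B(x_0,2r))}\, \|(\psi\omega)(t)\|_{L^{p}(B(x_0,2r))}\, \|(\psi\omega)(t)\|_{L^{p_2}(B(x_0,2r))}\, dt,
\]
where $p = \frac{3p_1}{3 + \gamma p_1}$. Interpolating the last two factors in the integrand implies
\[
J \le c \int_{t_0-(2r)^2}^{t_0} \|(\rho_{\gamma,2r}|\omega|)(t)\|_{L^{\alpha}(B(x_0,2r))}\, \|(\psi\omega)(t)\|^{\gamma + 3(1 - \frac{1}{\alpha}) - 1}_{L^2(B(x_0,2r))}\, \|\nabla(\psi\omega)(t)\|^{3 - \gamma - 3(1 - \frac{1}{\alpha})}_{L^2(B(x_0,2r))}\, dt
\]
\[
\left( \frac{6 - p_2}{2p_2} + \frac{6 - p}{2p} = \gamma + 3\Big(1 - \frac{1}{\alpha}\Big) - 1 \right).
\]
An application of the Hölder inequality in $t$ with the exponents $\frac{2}{\gamma + 3(1 - \frac{1}{\alpha}) - 1}$, $\infty$ and $\frac{2}{3 - \gamma - 3(1 - \frac{1}{\alpha})}$ yields our final bound on $J$,
\[
J \le c \left( \int_{t_0-(2r)^2}^{t_0} \left( \int_{B(x_0,2r)} \rho^{\alpha}_{\gamma,2r}(x, t)\, |\omega(x, t)|^{\alpha}\, dx \right)^{\delta} dt \right)^{\frac{1}{\alpha\delta}} \times \left( \frac{1}{2} \sup_{t \in (t_0 - (2r)^2,\, t_0)} \|\phi\omega(t)\|^2_{L^2(B(x_0,2r))} + \|\nabla(\psi\omega)\|^2_{L^2(Q_{2r})} \right),
\]
provided $\alpha < \frac{3}{\gamma}$ and $\alpha > \frac{3}{\gamma + 2}$.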
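The exponent bookkeeping in the general case can be summarized as follows. This is a sketch for orientation only; it assumes the standard forms of the Hardy-Littlewood-Sobolev and Gagliardo-Nirenberg inequalities in $\mathbb{R}^3$, and the symbols $\theta_p$, $\theta_{p_2}$ are introduced here for illustration.

```latex
% Hoelder in x:
\frac{1}{\alpha} + \frac{1}{p_1} + \frac{1}{p_2} = 1.

% Hardy-Littlewood-Sobolev for the kernel |x-y|^{-(3-\gamma)} in R^3:
\frac{1}{p_1} = \frac{1}{p} - \frac{\gamma}{3}
\quad\Longleftrightarrow\quad
p = \frac{3p_1}{3+\gamma p_1}.

% Gagliardo-Nirenberg exponents \theta_s = 3(1/2 - 1/s) for s = p and s = p_2:
\theta_p + \theta_{p_2}
 = 3 - 3\Big(\frac{1}{p}+\frac{1}{p_2}\Big)
 = 3 - \gamma - 3\Big(1-\frac{1}{\alpha}\Big)
 = \frac{3}{\alpha} - \gamma .

% Time exponent falling on \|\rho_{\gamma,2r}|\omega|\|_{L^\alpha}:
\frac{2}{\gamma + 3(1-\frac{1}{\alpha}) - 1}
 = \frac{2\alpha}{(2+\gamma)\alpha - 3}
 = \alpha\,\delta .

% Both Hoelder-in-t exponents are positive and finite exactly when
\gamma + 2 - \frac{3}{\alpha} > 0 \iff \alpha > \frac{3}{\gamma+2},
\qquad
\frac{3}{\alpha} - \gamma > 0 \iff \alpha < \frac{3}{\gamma}.

% Special case \gamma = 1/2, \alpha = 2:
% p_1 = p_2 = 4, \; p = 12/5, \; \theta_p + \theta_{p_2} = 1/4 + 3/4 = 1, \; \alpha\delta = 2.
```

The identity $\|g\|_{L^\alpha}^{\alpha\delta} = \big(\int |g|^{\alpha}\,dx\big)^{\delta}$ with $g = \rho_{\gamma,2r}|\omega|$ then gives the space-time integral appearing in the final bound.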
References

[BKM] Beale, J.T., Kato, T., Majda, A.: Remarks on the breakdown of smooth solutions for the 3-D Euler equations. Commun. Math. Phys. 94, 61–66 (1984)
[B] Berselli, L.C.: On a regularity criterion for the solutions of the 3D Navier-Stokes equations. Diff. Int. Eq. 15, 1129–1137 (2002)
[CKN] Caffarelli, L., Kohn, R., Nirenberg, L.: Partial regularity of suitable weak solutions of the Navier-Stokes equations. Comm. Pure Appl. Math. 35, 771–831 (1982)
[ChCh] Chae, D., Choe, H.-J.: Regularity of solutions to the Navier-Stokes equations. Electronic J. Diff. Eqs. 1999, 1–7 (1999)
[Ch] Chae, D.: On the regularity conditions for the Navier-Stokes and related equations. Rev. Mat. Iberoamericana 23, 371–384 (2007)
[ChKaLe] Chae, D., Kang, K., Lee, J.: On the interior regularity of suitable weak solutions to the Navier-Stokes equations. Comm. PDE 32, 1189–1207 (2007)
[C1] Constantin, P.: Navier-Stokes equations and area of interfaces. Commun. Math. Phys. 129, 241–266 (1990)
[Co] Constantin, P.: Geometric statistics in turbulence. SIAM Rev. 36, 73–98 (1994)
[CoFe] Constantin, P., Fefferman, C.: Direction of vorticity and the problem of global regularity for the Navier-Stokes equations. Indiana Univ. Math. J. 42, 775–789 (1993)
[CoFo] Constantin, P., Foias, C.: Navier-Stokes Equations. Chicago, IL: The University of Chicago Press, 1989
[CPS] Constantin, P., Procaccia, I., Segel, D.: Creation and dynamics of vortex tubes in three-dimensional turbulence. Phys. Rev. E 51, 3207–3222 (1995)
[daVeiga] Beirao da Veiga, H.: Concerning the regularity problem for the solutions of the Navier-Stokes equations. C. R. Acad. Sci. Paris Ser. I Math. 321, 405–408 (1995)
[daVeiga1] Beirao da Veiga, H.: Vorticity and regularity for flows under the Navier boundary conditions. Comm. Pure Appl. Anal. 5, 483–494 (2006)
[daVeiga2] Beirao da Veiga, H.: Vorticity and smoothness in incompressible viscous flows. In: Wave Phenomena and Asymptotic Analysis, RIMS Kokyuroku 1315. Kyoto: RIMS, Kyoto University, 2003, pp. 37–42
[daVeiga3] Beirao da Veiga, H.: Vorticity and regularity for viscous incompressible flows under the Dirichlet boundary condition. Results and Related Open Problems. J. Math. Fluid Mech. 9, 506–516 (2007)
[daVeigaBe1] Beirao da Veiga, H., Berselli, L.: On the regularizing effect of the vorticity direction in incompressible viscous flows. Diff. Int. Eqs. 15(3), 345–356 (2002)
[daVeigaBe2] Beirao da Veiga, H., Berselli, L.: Navier-Stokes equations: Green’s matrices, vorticity direction, and regularity up to the boundary. J. Diff. Eqs. 246, 597–628 (2008)
[ESS] Escauriaza, L., Seregin, G., Sverak, V.: $L_{3,\infty}$-solutions of the Navier-Stokes equations and backward uniqueness. Russ. Math. Surv. 58:2, 211–250 (2003)
[Ga] Gaveau, B.: Estimations des mesures plurisousharmoniques et des spectres associes. C.R. Acad. Sc. Serie A 288, 969–972 (1979)
[GaKa] Gaveau, B., Kalina, J.: Calculs explicites des mesures plurisousharmoniques et des feuilletages associes. Bull. Sc. Math., 2e Serie 108, 197–223 (1984)
[Gr] Grujić, Z.: The geometric structure of the super-level sets and regularity for 3D Navier-Stokes equations. Indiana Univ. Math. J. 50, 1309–1317 (2001)
[Gr2] Grujić, Z.: Localization and geometric depletion of vortex-stretching in the 3D NSE. Commun. Math. Phys. 290, 861–870 (2009)
[GG] Grujić, Z., Guberović, R.: A regularity criterion for the 3D NSE in a local version of the space of functions of bounded mean oscillations. In preparation
[GrRu] Grujić, Z., Ruzmaikina, A.: Interpolation between algebraic and geometric conditions for smoothness of the vorticity in the 3D NSE. Indiana Univ. Math. J. 53, 1073–1080 (2004)
[GrZh] Grujić, Z., Zhang, Qi S.: Space-time localization of a class of geometric criteria for preventing blow-up in the 3D NSE. Commun. Math. Phys. 262, 555–564 (2006)
[GKT] Gustafson, S., Kang, K., Tsai, T-P.: Regularity criteria for suitable weak solutions of the Navier-Stokes equations near the boundary. J. Diff. Eqs. 226, 594–618 (2006)
[KK] Kim, H., Kozono, H.: Interior regularity criteria in weak spaces for the Navier-Stokes equations. Manus. Math. 115, 85–100 (2004)
[KT] Kozono, H., Taniuchi, Y.: Estimates in BMO and the Navier-Stokes equations. Math. Z. 235, 173–194 (2000)
[KOT1] Kozono, H., Ogawa, T., Taniuchi, Y.: The critical Sobolev inequalities in Besov spaces and regularity criterion to some semi-linear evolution equations. Math. Z. 242, 251–278 (2002)
[KOT2] Kozono, H., Ogawa, T., Taniuchi, Y.: Navier-Stokes equations in the Besov space near $L^\infty$ and BMO. Kyushu J. Math. 57, 303–324 (2003)
[LR] Lemarié-Rieusset, P. G.: Recent developments in the Navier-Stokes problem. Chapman & Hall/CRC Research Notes in Mathematics 431, Boca Raton, FL: CRC Press, 2002
[RuGr] Ruzmaikina, A., Grujić, Z.: On depletion of the vortex-stretching term in the 3D Navier-Stokes equations. Commun. Math. Phys. 247, 601–611 (2004)
[Se] Serrin, J.: On the interior regularity of weak solutions of the Navier-Stokes equations. Arch. Rat. Mech. Anal. 9, 187–195 (1962)
[St] Struwe, M.: On partial regularity results for the Navier-Stokes equations. Comm. Pure Appl. Math. 41, 437–458 (1988)
[T] Takahashi, S.: On interior regularity criteria for weak solutions of the Navier-Stokes equations. Manus. Math. 69, 237–254 (1990)
[ZS] Zajaczkowski, W., Seregin, G.A.: Sufficient condition of local regularity for the Navier-Stokes equations. J. Math. Sci. 143, 2869–2874 (2007)

Communicated by P. Constantin
Commun. Math. Phys. 298, 419–436 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1006-y
Communications in
Mathematical Physics
There is No “Theory of Everything” Inside E8 Jacques Distler1 , Skip Garibaldi2 1 Theory Group, Department of Physics, and Texas Cosmology Center,
University of Texas, Austin, TX 78712, USA. E-mail: [email protected] 2 Department of Mathematics & Computer Science, 400 Dowman Dr., Emory University, Atlanta, GA 30322, USA. E-mail: [email protected]; [email protected] Received: 30 July 2009 / Accepted: 26 October 2009 Published online: 12 February 2010 – © Springer-Verlag 2010
Abstract: We analyze certain subgroups of real and complex forms of the Lie group $E_8$, and deduce that any “Theory of Everything” obtained by embedding the gauge groups of gravity and the Standard Model into a real or complex form of $E_8$ lacks certain representation-theoretic properties required by physical reality. The arguments themselves amount to representation theory of Lie algebras in the spirit of Dynkin’s classic papers and are written for mathematicians.

1. Introduction

Recently, the preprint [1] by Garrett Lisi has generated a lot of popular interest. It boldly claims to be a sketch of a “Theory of Everything”, based on the idea of combining the local Lorentz group and the gauge group of the Standard Model in a real form of $E_8$ (necessarily not the compact form, because it contains a group isogenous to $\mathrm{SL}(2, \mathbb{C})$). The purpose of this paper is to explain some reasons why an entire class of such models (which include the model in [1]) cannot work, using mostly mathematics with relatively little input from physics.

The mathematical set up is as follows. Fix a real Lie group $E$. We are interested in subgroups $\mathrm{SL}(2, \mathbb{C})$ and $G$ of $E$ so that:

$G$ is connected, compact, and centralizes $\mathrm{SL}(2, \mathbb{C})$. (ToE1)

We complexify and then decompose $\mathrm{Lie}(E) \otimes \mathbb{C}$ as a direct sum of representations of $\mathrm{SL}(2, \mathbb{C})$ and $G$. We identify $\mathrm{SL}(2, \mathbb{C}) \otimes_{\mathbb{R}} \mathbb{C}$ with $\mathrm{SL}_{2,\mathbb{C}} \times \mathrm{SL}_{2,\mathbb{C}}$ and write
\[
\mathrm{Lie}(E) = \bigoplus_{m,n \ge 1} m \otimes n \otimes V_{m,n},
\tag{1.1}
\]
where $m$ and $n$ denote the irreducible representation of $\mathrm{SL}_{2,\mathbb{C}}$ of that dimension and $V_{m,n}$ is a complex representation of $G \otimes_{\mathbb{R}} \mathbb{C}$. (Physicists would usually write $2$ and $\bar{2}$ instead of $2 \otimes 1$ and $1 \otimes 2$.) Of course,
\[
\overline{m \otimes n \otimes V_{m,n}} \cong n \otimes m \otimes \overline{V_{m,n}},
\]
and since the action of $\mathrm{SL}(2, \mathbb{C}) \cdot G$ on $\mathrm{Lie}(E)$ is defined over $\mathbb{R}$, we deduce that $\overline{V_{m,n}} \cong V_{n,m}$. We further demand that

$V_{m,n} = 0$ if $m + n > 4$, and (ToE2)

$V_{2,1}$ is a complex representation of $G$. (ToE3)

We recall the definition of complex representation and explain the physical motivation for these hypotheses in the next section. Roughly speaking, (ToE1) is a trivial requirement based on trying to construct a Theory of Everything along the lines suggested by Lisi, (ToE2) is the requirement that the model not contain any “exotic” higher-spin particles, and (ToE3) is the statement that the gauge theory (with gauge group $G$) is chiral, as required by the Standard Model. In fact, physics requires slightly stronger hypotheses on $V_{m,n}$, for $m + n = 4$. We will not impose the stronger version of (ToE2).

Definition 1.1. A candidate ToE subgroup of a real Lie group $E$ is a subgroup generated by a copy of $\mathrm{SL}(2, \mathbb{C})$ and a subgroup $G$ such that (ToE1) and (ToE2) hold. A ToE subgroup is a candidate ToE subgroup for which (ToE3) also holds.

Our main result is:

Theorem 1.2. There are no ToE subgroups in (the transfer of) the complex $E_8$ nor in any real form of $E_8$.

Notation. Unadorned Lie algebras and Lie groups mean ones over the real numbers. We use a subscript $\mathbb{C}$ to denote complex Lie groups; e.g., $\mathrm{SL}_{2,\mathbb{C}}$ is the (complex) group of 2-by-2 complex matrices with determinant 1. We can view a $d$-dimensional complex Lie group $G_{\mathbb{C}}$ as a $2d$-dimensional real Lie group, which we denote by $R(G_{\mathbb{C}})$. (Algebraists call this operation the “transfer” or “Weil restriction of scalars”; geometers, and many physicists, call this operation “realification.”) We use the popular notation of $\mathrm{SL}(2, \mathbb{C})$ for the transfer $R(\mathrm{SL}_{2,\mathbb{C}})$ of $\mathrm{SL}_{2,\mathbb{C}}$; it is a double covering of the “restricted Lorentz group”, i.e., of the identity component $\mathrm{SO}(3, 1)_0$ of $\mathrm{SO}(3, 1)$.

Strategy and main results. Our strategy for proving Theorem 1.2 will be as follows. We will first catalogue, up to conjugation, all possible embeddings of $\mathrm{SL}(2, \mathbb{C})$ satisfying the hypotheses of (ToE2). The list is remarkably short. Specifically, for every candidate ToE subgroup of $E$, the group $G$ is contained in the maximal compact, connected subgroup $G_{\max}$ of the centralizer of $\mathrm{SL}(2, \mathbb{C})$ in $E$. The proof of Theorem 1.2 shows that the only possibilities are:
\[
\begin{array}{lll}
E & G_{\max} & V_{2,1} \\ \hline
E_{8(-24)} & \mathrm{Spin}(11) & 32 \\
E_{8(8)} & \mathrm{Spin}(5) \times \mathrm{Spin}(7) & (4, 8) \\
E_{8(-24)} & \mathrm{Spin}(9) \times \mathrm{Spin}(3) & (16, 2) \\
R(E_{8,\mathbb{C}}) & E_7 & 56 \\
R(E_{8,\mathbb{C}}) & \mathrm{Spin}(12) & 32 \oplus 32 \\
R(E_{8,\mathbb{C}}) & \mathrm{Spin}(13) & 64
\end{array}
\tag{1.2}
\]
We then note that the representation $V_{2,1}$ of $G_{\max}$ (and hence, of any $G \subseteq G_{\max}$) has a self-conjugate structure. In other words, (ToE3) fails.
2. Physics Background

One of the central features of modern particle physics is that the world is described by a chiral gauge theory.

Definition 2.1. Let $M$ be a four-dimensional pseudo-Riemannian manifold, of signature (3, 1), which we will take to be oriented, time-oriented and spin. Let $G$ be a compact Lie group. The data of a gauge theory on $M$ with gauge group $G$ consists of a connection, $A$, on a principal $G$-bundle, $P \to M$, and some “matter fields” transforming as sections of vector bundle(s) associated to unitary representations of $G$.

Of particular interest are the fermions of the theory. The orthonormal frame bundle of $M$ is a principal $\mathrm{SO}(3, 1)_0$ bundle. A choice of spin structure defines a lift to a principal $\mathrm{Spin}(3, 1)_0 = \mathrm{SL}(2, \mathbb{C})$ bundle. Let $S_{\pm} \to M$ be the irreducible spinor bundles, associated, via the defining two-dimensional representation and its complex conjugate, to this $\mathrm{SL}(2, \mathbb{C})$ principal bundle. The fermions of our gauge theory are denoted
\[
\psi \in \Gamma(S_+ \otimes V), \qquad \overline{\psi} \in \Gamma(S_- \otimes \overline{V}),
\]
where $V \to M$ is a vector bundle associated to a (typically reducible) representation $R$ of $G$.

Definition 2.2. Consider $V$, a unitary representation of $G$ over $\mathbb{C}$ (i.e., a homomorphism $G \to \mathrm{U}(V)$) and an antilinear map $J : V \to V$ that commutes with the action of $G$. The map $J$ is called a real structure on $V$ if $J^2 = 1$; physicists call a representation possessing a real structure real. The map $J$ is called a quaternionic structure on $V$ if $J^2 = -1$; physicists call a representation possessing a quaternionic structure pseudoreal. Subsuming these two subcases, we will say that $V$ has a self-conjugate structure if there exists an antilinear map $J : V \to V$ commuting with the action of $G$ and satisfying $J^4 = 1$. Physicists call a representation $V$ that does not possess a self-conjugate structure complex.

Remark 2.3. We sketch how to translate the above definition into the language of algebraic groups and Galois descent as in [2] and [3, §X.2]. Let $G$ be an algebraic group over $\mathbb{R}$ and fix a representation $\rho : G \otimes \mathbb{C} \to \mathrm{GL}(V)$ for some complex vector space $V$. Let $J$ be an antilinear map $V \to V$ that satisfies
\[
\rho(\overline{g}) = J^{-1} \rho(g) J \quad \text{for } g \in G(\mathbb{C}).
\tag{2.1}
\]
We define real, quaternionic, etc., by copying the second and third sentences verbatim from Definition 2.2. (In the special case where $G$ is compact, there is necessarily a positive-definite invariant hermitian form on $V$ and $\rho$ arises by complexifying some map $G \to \mathrm{U}(V)$; this puts us back in the situation of Def. 2.2. In the special case where $G$ is connected, the hypothesis from Def. 2.2 that $J$ commutes with $G(\mathbb{R})$, which is obviously implied by Eq. (2.1), is actually equivalent to Eq. (2.1). Indeed, both sides of Eq. (2.1) are morphisms of varieties over $\mathbb{C}$, so if they agree on $G(\mathbb{R})$, which is Zariski-dense by [2, 18.2(ii)], then they are equal on $G(\mathbb{C})$.)

If $V$ has a real structure $J$, then the $\mathbb{R}$-subspace $V'$ of elements of $V$ fixed by $J$ is a real vector space and $V$ is canonically identified with $V' \otimes \mathbb{C}$ so that
\[
J(v \otimes z) = v \otimes \overline{z}
\]
for v ∈ V′ and z ∈ C; this is Galois descent. Because ρ commutes with complex conjugation (which acts in the obvious manner on G(C) and via J on V), it is the complexification of a homomorphism ρ′ : G → GL(V′) defined over R by [2, AG.14.3]. Conversely, if there is a representation (V′, ρ′) whose complexification is (V, ρ), then taking J to be complex conjugation on V = V′ ⊗ C defines a real structure on (V, ρ).

If V has a quaternionic structure J, then we define a real structure Ĵ on V̂ := V ⊕ V via Ĵ(v1, v2) := (Jv2, −Jv1).

Finally, suppose that G is reductive and V is irreducible (as a representation over C, of course). Then by [4, §7], there is a unique irreducible real representation W whose complexification W ⊗ C contains V as a summand. By Schur, End_G(W) is a division algebra, and we have three possibilities:
• End_G(W) = R, W ⊗ C ≅ V, and V has a real structure.
• End_G(W) = H, W ⊗ C ≅ V ⊕ V, and V has a quaternionic structure.
• End_G(W) = C, W ⊗ C ≅ V ⊕ V̄ where V ≇ V̄, and V is complex.

We have stated this remark for G a group over R, but all of it generalizes easily to the case where G is reductive over a field F and is split by a quadratic extension K of F.

Definition 2.4. A gauge theory, with gauge group G, is said to be chiral if the representation R by which the fermions (2.1) are defined is complex in the above sense. By contrast, a gauge theory is said to be nonchiral if the representation R in 2.1 has a self-conjugate structure.

Note that whether a gauge theory is chiral depends crucially on the choice of G. A gauge theory might be chiral for gauge group G, but nonchiral for a subgroup H ⊂ G. That is, there can be a self-conjugate structure on R compatible with H, even though no such structure exists that is compatible with the full group G. Conversely, suppose that a gauge theory is nonchiral for the gauge group G. It is also necessarily nonchiral for any gauge group H ⊂ G.

GUTs.
The Standard Model is a chiral gauge theory with gauge group G_SM := (SU(3) × SU(2) × U(1))/(Z/6Z). Various grand unified theories (GUTs) proceed by embedding G_SM in some (usually simple) group, G_GUT. Popular choices for G_GUT are SU(5) [5], Spin(10), E6, and the Pati-Salam group, (Spin(6) × Spin(4))/(Z/2Z) [6]. It is easiest to explain what the fermion representation of G_SM is after embedding G_SM in G_GUT := SU(5). Let W be the five-dimensional defining representation of SU(5). The representation R from 2.1 is the direct sum of three copies of R_0 = ∧²W ⊕ W̄. Each such copy is called a “generation” and is 15-dimensional. One identifies each of the 15 weights of R_0 with left-handed fermions: 6 quarks (two in a doublet, each in three colors), two leptons (e.g., the electron and its neutrino), 6 antiquarks, and a positron. With three generations, R is 45-dimensional.

Definition 2.5. As a generalization, physicists sometimes consider the n-generation Standard Model, which is defined in similar fashion, but with R = R_0^{⊕n}. The n-generation Standard Model is a chiral gauge theory, for any positive n. Particle physics, in the real world, is described by “the” Standard Model, which is the case n = 3.
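The dimension count behind a generation can be spelled out in a few lines. The following is only a bookkeeping sketch: the function names and the `CONTENT` dictionary are illustrative choices made here, not notation from the text, and the code checks dimensions only (it does not construct the representations).

```python
from math import comb


def generation_dimension(n_fund: int = 5) -> int:
    """dim of wedge^2(W) + conj(W), where W is the defining rep of SU(n_fund)."""
    return comb(n_fund, 2) + n_fund


def fermion_dimension(generations: int) -> int:
    """dim of R for the n-generation Standard Model (n copies of R_0)."""
    return generations * generation_dimension()


# Particle content of one generation, as listed in the text (hypothetical dict name).
CONTENT = {"quarks": 6, "leptons": 2, "antiquarks": 6, "positron": 1}

if __name__ == "__main__":
    print(generation_dimension())   # dimension of one generation
    print(sum(CONTENT.values()))    # number of left-handed fermions listed
    print(fermion_dimension(3))     # dimension of R for three generations
```

Both counts of a generation agree, and three generations give the 45-dimensional R used in the dimension argument of Section 3.3.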
For the other choices of GUT group, the analogue of a generation (R_0) is higher-dimensional, containing additional fermions that are not seen at low energies. When decomposed under G_SM ⊂ G_GUT, the representation decomposes as R_0 + R′, where R′ is a real representation of G_SM. In Spin(10), a generation is the 16-dimensional half-spinor representation. In E6, it is a 27-dimensional representation, and for the Pati-Salam group it is the (4, 1, 2) ⊕ (4, 2, 1) representation. In each case, these representations are complex representations (in the above sense) of G_GUT, and the complex-conjugate representation is called an “anti-generation.”
3. Lisi’s Proposal from [1]

In the previous section, we have described a chiral gauge theory in a fixed (pseudo-)Riemannian structure on M. Lisi’s proposal [1] is to try to combine the spin connection on M and the gauge connection on P into a single dynamical framework. This motivates Definition 1.1 of a ToE subgroup. More precisely, following [1], we fix subgroups SL(2, C) and G (say, with G = G_SM) satisfying (ToE1) in some real Lie group E. The action of the central element −1 ∈ SL(2, C) provides a Z/2Z-grading on the Lie algebra of E. This Z/2Z-grading allows one to define a sort of superconnection associated to E (precisely what sort of superconnection is explained in a blog post by the first author [7]). In the proposal of [1], we are supposed to identify each of the generators of Lie(E) as either a boson or a fermion. (See Table 9 in [1] for an identification of the 240 roots.) The Spin-Statistics Theorem [8] says that fermions transform as spinorial representations of Spin(3, 1); bosons transform as “tensorial” representations (representations that descend from the double cover Spin(3, 1) to SO(3, 1)). To be consistent with the Spin-Statistics Theorem, we must, therefore, require that the fermions belong to the −1-eigenspace of the aforementioned Z/2Z action, and the bosons to the +1-eigenspace. In fact, to agree with 2.1, we should require that the −1-eigenspace (when tensored with C) decomposes as a direct sum of two-dimensional representations (over C) of SL(2, C), corresponding to “left-handed” and “right-handed” fermions, in the sense of 2.1.
3.1. Interpretations of Vm,n and (ToE2). In the notation of (1.1), the Vm,n with m + n odd correspond to fermions; those with m + n even correspond to bosons. In Lisi’s setup, the bosons are 1-forms on M, with values in a vector bundle associated to the aforementioned Spin(3, 1)₀ principal bundle via the m ⊗ n representation (with m + n even). The Vm,n with m + n = 4 are special; they correspond to the gravitational degrees of freedom in Lisi’s theory. The representation (3 ⊗ 1) ⊕ (1 ⊗ 3) is the adjoint representation of SL(2, C); these correspond to the spin connection. The 1-form with values in the 2 ⊗ 2 representation is the vierbein.¹ It is a substantial result from physics (see Sects. 13.1, 25.4 of [9]) that a unitary interacting theory is incompatible with massless particles in higher representations (m + n ≥ 6). Our hypothesis (ToE2) reflects this and also forbids gravitinos (m + n = 5). In §10, we will revisit the possibility of admitting gravitinos.

¹ In making this identification, we have tacitly assumed that V2,2 is one-dimensional. This is, in fact, required for a unitary interacting theory. We will not, however, impose this additional constraint. Suffice to say that it is not satisfied by any of the candidate ToE subgroups (per Definition 1.1) of E8.
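The parity rule of this subsection can be written as a small lookup. This is a sketch only: the function names and the returned label strings are invented here for illustration, and the sign formula uses the standard fact that −1 ∈ SL2 acts on the m-dimensional irreducible representation by (−1)^(m−1). The left- and right-handed labels follow the assignment of V2,1 and V1,2 made in Section 3.3.

```python
def central_sign(m: int, n: int) -> int:
    """Eigenvalue of (-1, -1) in SL2 x SL2 on the irreducible representation m (x) n."""
    return (-1) ** ((m - 1) + (n - 1))


def classify(m: int, n: int) -> str:
    """Role of V_{m,n} according to Section 3.1 (label strings are illustrative)."""
    if m < 1 or n < 1:
        raise ValueError("m and n are dimensions of irreducible representations")
    total = m + n
    if total >= 6:
        return "excluded: massless higher-spin field"
    if total == 5:
        return "excluded by (ToE2): gravitino"
    if total == 4:
        return "boson: vierbein" if (m, n) == (2, 2) else "boson: spin connection"
    if total == 3:
        return "fermion: left-handed" if (m, n) == (2, 1) else "fermion: right-handed"
    return "boson"


if __name__ == "__main__":
    for m in range(1, 4):
        for n in range(1, 4):
            print((m, n), central_sign(m, n), classify(m, n))
```

The sign −1 occurs exactly when m + n is odd, which is the fermionic eigenspace required by the Spin-Statistics Theorem.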
3.2. Explanation of (ToE3). Our hypothesis (ToE3) says that the candidate “Theory of Everything” one obtains from subgroups SL(2, C) and G as in (ToE1) must be chiral in the sense of Definition 2.4.² In private communication, Lisi has indicated that he objects to our condition (ToE3), because he no longer wishes to identify all 248 generators of Lie(E) as particles (either bosons or fermions). In his new, unpublished formulation, only a subset are to be identified as particles. In particular, V2,1 is typically a reducible representation of G and, in his new formulation, only a subrepresentation corresponds to particles (fermions). This is not the approach followed in [1], where all 248 generators are identified as particles and where, moreover, 20-odd of these are claimed to be new as-yet undiscovered particles, a prediction of his theory. As recently as April 2009, Lisi reiterated this prediction in an essay published in the Financial Times [11]. Our paper assumes that the approach of [1] is to be followed, and that all 248 generators are to be identified as particles, hence (ToE3). In any case, even if one identifies only a subset of the generators as particles, all the fermions must come from the (−1)-eigenspace, which is too small to accommodate 3 generations, as we now show.

3.3. No-go based on dimensions. The fermions of Lisi’s theory correspond to weight vectors in Vm,n, with m + n odd. In particular, the weight vectors in V2,1 and V1,2 correspond (as in §2.1) to left- and right-handed fermions, respectively. Since there are 3 × 15 = 45 known fermions of each chirality, V2,1 must be at least 45-dimensional, and similarly for V1,2. Thus, the −1-eigenspace of the central element of SL(2, C), which contains (2 ⊗ 1 ⊗ V2,1) ⊕ (1 ⊗ 2 ⊗ V1,2), must have dimension at least 2 × 2 × 45 = 180. When E is a real form of E8, the −1-eigenspace has dimension 112 or 128 (this is implicit in Elie Cartan’s classification of real forms of E8 as in [12, p. 518, Table V]),³ so no identification of the fermions as distinct weight vectors in Lie(E) (as in Table 9 in [1]) can be compatible with the Spin-Statistics Theorem and the existence of three generations. These dimensional considerations do not, however, rule out the possibility of accommodating a 1- or 2-generation Standard Model (per Definition 2.5) in a real form of E8. That requires more powerful considerations, which are the subject of our main theorem. We now turn to the proof of that theorem.

² Of course, there are many other features of the Standard Model that a candidate Theory of Everything must reproduce. We have chosen to focus on the requirement that the theory be chiral for two reasons. First, it is “physically robust”: Whatever intricacies a quantum field theory may possess at high energies, if it is non-chiral, there is no known mechanism by which it could reduce to a chiral theory at low energies (and there are strong arguments [10] that no such mechanism exists). Second, chirality is easily translated into a mathematical criterion, namely our (ToE3). This allows us to study a purely representation-theoretic question and side-step the difficulties of making sense of Lisi’s proposal as a dynamical quantum field theory.

³ Alternatively, Serre’s marvelous bound on the trace from [13, Th. 3] or [14, Th. 1] implies that for every element x of order 2 in a reductive complex Lie group G, the −1-eigenspace of Ad(x) has dimension ≤ (dim G + rank G)/2. In particular, when G is a real form of E8, the −1-eigenspace has dimension ≤ 128.

4. sl2 Subalgebras and the Dynkin Index

The Dynkin index. In [15, §2], Dynkin defined the index of an inclusion f : g1 → g2 of simple complex Lie algebras as follows. Fix a Chevalley basis of the two algebras, so that the Cartan subalgebra h1 of g1 is contained in the Cartan subalgebra h2 of g2. The Chevalley basis identifies $\mathfrak{h}_i$ with the complexification $Q_i^\vee \otimes \mathbb{C}$ of the coroot lattice
$Q_i^\vee$ of $\mathfrak{g}_i$, and the inclusion f gives an inclusion $Q_1^\vee \otimes \mathbb{C} \to Q_2^\vee \otimes \mathbb{C}$. Fix the Weyl-invariant inner product $(\ ,\ )_i$ on $Q_i^\vee$ so that $(\alpha^\vee, \alpha^\vee)_i = 2$ for short coroots $\alpha^\vee$. Then the Dynkin index of the inclusion is the ratio $(f(\alpha^\vee), f(\alpha^\vee))_2 / (\alpha^\vee, \alpha^\vee)_1$, where $\alpha^\vee$ is a short coroot of $\mathfrak{g}_1$. For example, the irreducible representation sl2 → sl_n has index $\binom{n+1}{3}$ by [15, Eq. (2.32)].
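The binomial formula for the irreducible representation can be checked directly from the definition. The sketch below assumes two standard facts that are not spelled out in the text: the image of the coroot of sl2 in sl_n is the diagonal matrix diag(n−1, n−3, …, 1−n), and the normalized invariant form on the Cartan subalgebra of sl_n is the trace form (so that coroots have square length 2). The function names are illustrative.

```python
from math import comb


def principal_defining_vector(n: int) -> list:
    """Eigenvalues of the sl2 coroot on the n-dimensional irreducible representation."""
    return [n - 1 - 2 * k for k in range(n)]


def dynkin_index_sl2_in_sln(n: int) -> int:
    """Ratio (f(a^vee), f(a^vee))_2 / (a^vee, a^vee)_1 with (a^vee, a^vee)_1 = 2."""
    norm = sum(x * x for x in principal_defining_vector(n))  # trace form on sl_n
    assert norm % 2 == 0
    return norm // 2


if __name__ == "__main__":
    for n in range(2, 8):
        print(n, dynkin_index_sl2_in_sln(n), comb(n + 1, 3))
```

For each n the two printed numbers coincide, in agreement with the value quoted from [15, Eq. (2.32)].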
sl2 subalgebras. We now consider the case g1 = sl2 and write simply g and $Q^\vee$ for g2 and $Q_2^\vee$. The coroot lattice of sl2 is Z and the image of 1 under the map $\mathbb{Z} \to Q^\vee$ is an element $h \in \mathfrak{h}$ called the defining vector of the inclusion. In §8 of his paper (or see [16, §VIII.11]), Dynkin proved that, after conjugating by an element of the automorphism group of g, one can assume that the defining vector h satisfies the strong restrictions:

\[ h = \sum_{\delta \in \Delta} p_\delta\, \delta^\vee \quad \text{for } p_\delta \text{ real and non-negative [15, Lemma 8.3],} \]

where Δ denotes the set of simple roots of g, and further that

\[ \delta(h) \in \{0, 1, 2\} \quad \text{for all } \delta \in \Delta. \tag{4.1} \]
But note that for each simple root δ, the fundamental irreducible representation of g with highest weight dual to $\delta^\vee$ restricts to a representation of sl2 with $p_\delta$ as a weight, hence $p_\delta$ is an integer. As a consequence of these generalities and specifically [15, Lemma 8.2], one can identify an sl2 subalgebra of g up to conjugacy by writing the Dynkin diagram of g and putting the number δ(h) from Display (4.1) at each vertex; this is the marked Dynkin diagram of the sl2 subalgebra.

Here is an alternative formula for computing the index of an sl2 subalgebra from its marked Dynkin diagram. Write $\kappa_{\mathfrak{g}}$ and $m^\vee$ for the Killing form and dual Coxeter number of g. We have:

\[ (\text{Dynkin index}) = \frac{1}{2}(h, h) = \frac{1}{4 m^\vee}\, \kappa_{\mathfrak{g}}(h, h) = \frac{1}{2 m^\vee} \sum_{\text{positive roots } \alpha \text{ of } \mathfrak{g}} \alpha(h)^2, \tag{4.2} \]
where the second equality is by, e.g., [17, §5], and the third is by the definition of $\kappa_{\mathfrak{g}}$. One can calculate the number α(h) by writing α as a sum of simple roots and applying the marked Dynkin diagram for h.

Lemma 4.1. For every simple complex Lie algebra g, there is a unique copy of sl2 in g of index 1, up to conjugacy.

This is (equivalent to) Theorem 2.4 in [15]. We give a different proof for the convenience of the reader.

Proof. The index of an sl2-subalgebra is (h, h)/2, where the defining vector h belongs to the coroot lattice $Q^\vee$. If g is not of type B, then the coroot lattice is not of type C, and the claim amounts to the statement that the vectors of minimal length in the coroot lattice are actually coroots. This follows from the constructions of the root lattices in [18, §12.1]. Otherwise g has type B and is so_n for some odd n ≥ 5. The conjugacy class of an sl2-subalgebra is determined by the restriction of the natural n-dimensional representation; they are parameterized by partitions of n (i.e., $\sum n_i = n$) so that the even $n_i$ occur
with even multiplicity and some $n_i > 1$, see [19, 5.1.2] or [20, §6.2.2]. The index of the composition sl2 → so_n → sl_n is then $\sum_i \binom{n_i + 1}{3}$; we must classify those partitions such that this sum equals the Dynkin index of so_n → sl_n, which is 2. The unique such partition is 2 + 2 + 1 + ⋯ + 1 (with n − 4 > 0 ones).

In the bijection between conjugacy classes of sl2 subalgebras and orbits of nilpotent elements in g from [19, 3.2.10], the unique orbit of index 1 sl2’s corresponds to the minimal nilpotent orbit described in [19, 4.3.3].

If g has type C, F4, or G2, then the argument in the proof of the lemma shows that there is up to conjugacy a unique copy of sl2 in g with index 2, 2, or 3 respectively. For g of type Bn with n ≥ 4, there are two conjugacy classes of sl2-subalgebras of index 2. This amounts to the fact that there are vectors in the Cn root lattice that are not roots but have the same length as a root: specifically, sums of two strongly orthogonal short roots, cf. Exercise 5 in §12 of [18].

5. Copies of sl2,C in the Complex E8

We now prove some facts about copies of sl2,C in the complex Lie algebra e8 of type E8. Of course, the 69 conjugacy classes of such are known (see [15, pp. 182–185] or [21, pp. 430–433]), but we do not need this information. Fix a pinning for e8; this includes a Cartan subalgebra h, a set of simple roots Δ := {αi | 1 ≤ i ≤ 8} (numbered

    1 - 3 - 4 - 5 - 6 - 7 - 8
            |
            2                        (5.1)
as in [22]), and fundamental weights ωi dual to αi. As all roots of the E8 root system have the same length, we can and do identify the root system with its coroot system (also called the “inverse” or “dual” root system).

Example 5.1. Taking any root of E8, one can generate a copy of sl2,C in e8 with index 1. Doing this with the highest root gives an sl2,C with marked Dynkin diagram

    index 1:   0 - 0 - 0 - 0 - 0 - 0 - 1
                       |
                       0
Every index 1 copy of sl2 in e8 is conjugate to this one by Lemma 4.1.

Example 5.2. One can find a copy of sl2,C × sl2,C in e8 by taking the first copy to be generated by the highest root of E8 and the second copy to be generated by the highest root of the obvious E7 subsystem. If you embed sl2,C diagonally in this algebra, you find a copy of sl2,C with index 2 and marked Dynkin diagram

    index 2:   1 - 0 - 0 - 0 - 0 - 0 - 0
                       |
                       0
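The indices claimed in Examples 5.1 and 5.2 can be recomputed from the marked diagrams with Eq. (4.2). The following is a sketch, not the authors' code: it builds the 240 roots of E8 in the simple-root basis by closing the simple roots under simple reflections (using the Bourbaki numbering of Display (5.1)), and it uses the standard value 30 for the dual Coxeter number of E8. All function and constant names are illustrative.

```python
from fractions import Fraction

# Edges of the E8 Dynkin diagram in the numbering of Display (5.1).
EDGES = [(1, 3), (3, 4), (2, 4), (4, 5), (5, 6), (6, 7), (7, 8)]


def cartan_matrix():
    """Cartan matrix of E8 (symmetric, since E8 is simply laced)."""
    a = [[2 if i == j else 0 for j in range(8)] for i in range(8)]
    for i, j in EDGES:
        a[i - 1][j - 1] = a[j - 1][i - 1] = -1
    return a


def e8_roots():
    """All roots as coefficient vectors in the simple roots.

    Closes the simple roots under s_i(beta) = beta - <beta, alpha_i^vee> alpha_i.
    """
    a = cartan_matrix()
    simple = [tuple(int(i == j) for j in range(8)) for i in range(8)]
    roots, stack = set(simple), list(simple)
    while stack:
        beta = stack.pop()
        for i in range(8):
            pairing = sum(beta[j] * a[j][i] for j in range(8))
            image = beta[:i] + (beta[i] - pairing,) + beta[i + 1:]
            if image not in roots:
                roots.add(image)
                stack.append(image)
    return roots


def weight(root, marks):
    """alpha(h), where marks = (alpha_1(h), ..., alpha_8(h)) is the marked diagram."""
    return sum(c * m for c, m in zip(root, marks))


def dynkin_index(marks):
    """Eq. (4.2) for E8: (1/60) * sum of alpha(h)^2 over positive roots."""
    positive = [r for r in e8_roots() if sum(r) > 0]
    return Fraction(sum(weight(r, marks) ** 2 for r in positive), 60)


INDEX_1 = (0, 0, 0, 0, 0, 0, 0, 1)  # marked diagram of Example 5.1
INDEX_2 = (1, 0, 0, 0, 0, 0, 0, 0)  # marked diagram of Example 5.2

if __name__ == "__main__":
    for marks in (INDEX_1, INDEX_2):
        weights = sorted({weight(r, marks) for r in e8_roots()})
        print(marks, dynkin_index(marks), weights)
```

In both cases the weights that occur lie in {0, ±1, ±2}, which is the situation singled out in Proposition 5.3.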
Proposition 5.3. The following collections of copies of sl2,C in e8 are the same:
(1) copies such that ±1 are weights of e8 (as a representation of sl2,C) and no other odd weights occur,
(2) copies such that every weight of e8 is in {0, ±1, ±2},
(3) copies such that the inclusion sl2,C ⊂ e8 has Dynkin index 1 or 2,
(4) copies of sl2,C conjugate to one of those defined in Examples 5.1 or 5.2.

Proof. One easily checks that (4) is contained in (1)–(3); we prove the opposite inclusion. For (3), we identify $\mathfrak{h}$ with the complexification Q ⊗ C of the (co)root lattice Q, hence h with $\sum_i \alpha_i(h)\,\omega_i$. By Eq. (4.2), the index of h satisfies:

\[ \frac{1}{60} \sum_{\alpha} \alpha(h)^2 = \frac{1}{60} \sum_{\alpha} \Big\langle \sum_i \alpha_i(h)\,\omega_i,\ \alpha \Big\rangle^2 \ \ge\ \sum_i \alpha_i(h)^2 \cdot \frac{1}{60} \sum_{\alpha} \langle \omega_i, \alpha \rangle^2, \]

where the sums vary over the positive roots. We calculate for each fundamental weight ωi the number $\sum_{\alpha} \langle \omega_i, \alpha \rangle^2 / 60$:

    2 - 7 - 15 - 10 - 6 - 3 - 1
            |
            4                        (5.2)
As the numbers αi(h) are all 0, 1, or 2, the numbers in Display (5.2) show that h for an sl2,C with Dynkin index 1 or 2 must be ω1 (index 2) or ω8 (index 1).

For (2), the highest root α̃ of E8 is $\tilde\alpha = \sum_i c_i \alpha_i$, where c1 = c8 = 2 and the other ci’s are all at least 3. As α̃(h) is a weight of e8 relative to a given copy of sl2,C, we deduce that an sl2,C as in (2) must have h = ω1 or ω8, as claimed.

Suppose now that we are given an h for a copy of sl2,C as in (1). As ±1 occur as weights, there is at least one 1 in the marked Dynkin diagram. But note that there cannot be three or more 1’s in the marked Dynkin diagram for h. Indeed, for every connected subset S of vertices of the Dynkin diagram of E8, $\sum_{i \in S} \alpha_i$ is a root [22, §VI.1.6, Cor. 3b]. If the number of 1’s in the marked diagram of h is at least three, then one can pick S so that it meets exactly three of the αi’s with αi(h) = 1, in which case $\sum_{i \in S} \alpha_i(h)$ is odd and at least 3, violating the hypothesis of (1).

For the sake of contradiction, suppose that there are two 1’s in the marked diagram for h, say, corresponding to simple roots αi and αj with i < j. For each i, j, one can find a root β in the list of roots of E8 of large height in [22, Plate VII] such that the coefficients of αi and αj in β have opposite parity and sum at least 3. (Merely taking β to be the highest root suffices for many (i, j).) This contradicts (1), so there is a unique 1 in the marked diagram for h, i.e., αi(h) = 1 for a unique i. If αi(h) = 1 for some i ≠ 1, 8, then we find a contradiction because there is a root α of E8 with αi-coordinate 3. Therefore αi(h) = 1 only for i = 1 or 8 and not for both. By the fact used two paragraphs above, $\beta := \sum_i \alpha_i$ is a root of E8, so $\beta(h) = \sum_i \alpha_i(h)$ is odd and must be 1. It follows that h = ω1 or ω8.

Observation 5.4 (Centralizer for index 1). The sl2,C of index 1 in e8 has centralizer the obvious regular subalgebra e7 of type E7.
(A subalgebra is regular if it is generated by the root subalgebras corresponding to a closed sub-root-system [15, no. 16].) Indeed, it is clear that e7 centralizes this sl2,C . Conversely, the centralizer of sl2,C is contained in the centralizer of h = ω8 —i.e., e7 ⊕ Ch—but does not contain h. Technique 5.5 (Decomposing e8 ). Suppose we are given a copy of sl2,C in e8 specified by a defining vector h. By applying the 240 roots of e8 to h (and throwing in also 0 with multiplicity 8), we obtain the weights of e8 as a representation of sl2,C and therefore also the decomposition of e8 into irreducible representations of sl2,C as in, e.g., [18, §7.2]. Extending this, suppose we are given a copy of sl2,C × sl2,C in e8 , where the two summands are specified by defining vectors in h. (Here we want the defining vectors to
span the Cartan subalgebras in the images of the two sl2,C’s. In particular, they need not be normalized in the sense of Display (4.1).) Computing as in the previous paragraph, we can decompose e8 as a direct sum of irreducible representations m ⊗ n of sl2,C × sl2,C. It is easy to write code from scratch to make a computer algebra system perform this computation. We remark that applying this recipe in the situation from the Introduction gives the dimension of Vm,n as the multiplicity of m ⊗ n.

6. Index 2 Copies of sl2,C in the Complex E8

Lemma 6.1. The centralizer of the index 2 sl2,C in e8 from Example 5.2 is a copy of so13 contained in the regular subalgebra so14 of e8.

Proof. The centralizer of the sl2,C of index 2 in e8 is contained in the centralizer of the defining vector h; this centralizer is reductive with semisimple part the regular subalgebra so14 of type D7. The centralizer of sl2,C contains the centralizer of the sl2,C × sl2,C from Example 5.2, which is the regular subalgebra so12 of type D6, as can be seen by the recipe from [15, pp. 147, 148]. Computing as in 5.5, we see that the centralizer of sl2,C has dimension 78 (as is implicitly claimed in the statement of the lemma), so it lies properly between the regular so12 and the regular so14.

For concreteness, let us suppose that the structure constants for e8 are as in [23]. Define a copy of sl2,C by sending $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ to the sum of the elements in the Chevalley basis of e8 spanning the root subalgebras corresponding to −α8 and the highest root in the obvious D7 subdiagram. This copy of sl2,C has defining vector α2 + α3 + 2α4 + 2α5 + 2α6 + 2α7. One checks using the structure constants that this sl2,C centralizes the index 2 sl2,C we started with, and that together with so12 it generates a copy of so13. In particular, the coroot lattice of this so13 has basis $\beta_1^\vee, \ldots, \beta_6^\vee$, embedded in the (co)root lattice of e8 as in the table:

\[ \begin{array}{c|cccccc} \mathfrak{so}_{13} & \beta_1^\vee & \beta_2^\vee & \beta_3^\vee & \beta_4^\vee & \beta_5^\vee & \beta_6^\vee \\ \hline \mathfrak{e}_8 & \alpha_3 & \alpha_4 & \alpha_5 & \alpha_6 & \alpha_7 & -\alpha_2 - \alpha_3 - 2\alpha_4 - 2\alpha_5 - 2\alpha_6 - 2\alpha_7 \end{array} \tag{6.1} \]
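Technique 5.5 for a single copy of sl2,C, as used in the proof of Lemma 6.1 to find the centralizer dimension, can be sketched as follows. This is an illustrative from-scratch implementation, not the authors' code; the root generation by simple reflections and all names are choices made here. The multiplicity of the irreducible summand of dimension w + 1 is obtained as (multiplicity of weight w) minus (multiplicity of weight w + 2), and the dimension of the centralizer is the multiplicity of the trivial summand.

```python
from collections import Counter

# Edges of the E8 Dynkin diagram in the numbering of Display (5.1).
EDGES = [(1, 3), (3, 4), (2, 4), (4, 5), (5, 6), (6, 7), (7, 8)]


def e8_roots():
    """All roots of E8 as coefficient vectors in the simple roots."""
    a = [[2 if i == j else 0 for j in range(8)] for i in range(8)]
    for i, j in EDGES:
        a[i - 1][j - 1] = a[j - 1][i - 1] = -1
    simple = [tuple(int(i == j) for j in range(8)) for i in range(8)]
    roots, stack = set(simple), list(simple)
    while stack:
        beta = stack.pop()
        for i in range(8):
            pairing = sum(beta[j] * a[j][i] for j in range(8))
            image = beta[:i] + (beta[i] - pairing,) + beta[i + 1:]
            if image not in roots:
                roots.add(image)
                stack.append(image)
    return roots


def sl2_decomposition(marks):
    """Return {dimension: multiplicity} of the irreducible sl2 summands of e8.

    `marks` is the marked Dynkin diagram (alpha_1(h), ..., alpha_8(h)).
    """
    mult = Counter(sum(c * m for c, m in zip(root, marks)) for root in e8_roots())
    mult[0] += 8  # the Cartan subalgebra contributes weight 0 with multiplicity 8
    top = max(mult)
    return {w + 1: mult[w] - mult[w + 2]
            for w in range(top + 1) if mult[w] - mult[w + 2] > 0}


INDEX_1 = (0, 0, 0, 0, 0, 0, 0, 1)  # Example 5.1
INDEX_2 = (1, 0, 0, 0, 0, 0, 0, 0)  # Example 5.2

if __name__ == "__main__":
    for marks in (INDEX_1, INDEX_2):
        decomposition = sl2_decomposition(marks)
        print(marks, decomposition, "centralizer dimension:", decomposition[1])
```

For the index 2 diagram the trivial summand has multiplicity 78, the dimension of so13; for the index 1 diagram it has multiplicity 133, the dimension of e7, consistent with Observation 5.4.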
We remark that the numbering of the coroots $\beta_1^\vee, \ldots, \beta_6^\vee$ corresponds to a numbering of the simple roots of so13 as in the diagram

    β1 - β2 - β3 - β4 - β5 => β6
Dimension count shows that this so13 is the centralizer.

The claim of the lemma is already in [24, p. 125]. We gave the details of a proof because it specifies an inclusion of so13 in e8 and a comparison of the pinnings of the two algebras as in Table (6.1).

The index 2 sl2 and the copy of so13 give an sl2 × so13 subalgebra of e8. We now decompose e8 into irreducible representations of sl2 × so13. We can do this from first principles by restricting the roots of e8 to the Cartan subalgebras of sl2 (using the marked Dynkin diagram from Example 5.2) and so13 (using Table (6.1)). Alternatively, we can read the decomposition off the tables in [25] as follows. As in the proof of Lemma 6.1, sl2 is contained in the regular subalgebra sl2 × sl2 × so12 of e8, and the tables on pages 301 and 305 of ibid. show that e8 decomposes as a sum of the adjoint representation,

\[ 2 \otimes 1 \otimes S_+, \quad 1 \otimes 2 \otimes S_-, \quad \text{and} \quad 2 \otimes 2 \otimes V, \tag{6.2} \]
where S± denotes the half-spin representations of so12 and V is the vector representation. We can restrict the representations of sl2 × sl2 to the diagonal sl2 subalgebra to obtain a decomposition of e8 into representations of sl2 × so12. Consulting the tables in ibid. for restricting representations from type B6 to D6 allows us to deduce the decomposition

\[ 1 \otimes \mathfrak{so}_{13,\mathbb{C}} \ \oplus\ 2 \otimes (\text{spin}) \ \oplus\ 3 \otimes 1 \ \oplus\ 3 \otimes (\text{vector}) \tag{6.3} \]
of e8 as a representation of sl2 × so13. From this it is obvious that so13,C is the Lie algebra of a copy of Spin13 in E8. The main result of this section is the following:

Lemma 6.2. Up to conjugacy, there is a unique copy of SL2,C × SL2,C in E8,C so that each inclusion of SL2,C in E8,C has index 2. The centralizer of this SL2,C × SL2,C has identity component Sp4,C × Sp4,C.

Proof. As in the proof of Lemma 4.1 (or by the method used to prove Prop. 5.3), there are two index 2 copies of sl2 in so13, corresponding to the partitions (a) 3 + 1 + 1 + ⋯ + 1 and (b) 2 + 2 + 2 + 2 + 1 + 1 + ⋯ + 1 of 13. The recipe in [19, §5.3] gives defining vectors for these sl2’s, which we can rewrite in terms of the E8 simple roots using Table (6.1):

\[ \begin{aligned} \text{(a)}\quad & 2\beta_1^\vee + 2\beta_2^\vee + 2\beta_3^\vee + 2\beta_4^\vee + 2\beta_5^\vee + \beta_6^\vee = -\alpha_2 + \alpha_3, \\ \text{(b)}\quad & \beta_1^\vee + 2\beta_2^\vee + 3\beta_3^\vee + 4\beta_4^\vee + 4\beta_5^\vee + 2\beta_6^\vee = -2\alpha_2 - \alpha_3 - 2\alpha_4 - \alpha_5. \end{aligned} \tag{6.4} \]
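Two steps of this proof are mechanical and can be sketched in code: finding the admissible partitions of 13 of index 2, and rewriting the defining vectors in Display (6.4) through Table (6.1). The sketch below uses the partition criterion and the index formula from the proof of Lemma 4.1 (index in so_n equals the sum of binomials divided by 2); it does not implement the recipe of [19, §5.3], and the β-coefficients are simply copied from Display (6.4). Function and constant names are illustrative.

```python
from fractions import Fraction
from math import comb


def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for part in range(min(n, largest), 0, -1):
        for rest in partitions(n - part, part):
            yield (part,) + rest


def is_admissible(p):
    """Even parts occur with even multiplicity and some part exceeds 1."""
    return max(p) > 1 and all(p.count(k) % 2 == 0 for k in set(p) if k % 2 == 0)


def index_in_so(p):
    """Index of sl2 -> so_n: index of sl2 -> so_n -> sl_n divided by 2."""
    return Fraction(sum(comb(k + 1, 3) for k in p), 2)


def partitions_of_index(n, index):
    return [p for p in partitions(n) if is_admissible(p) and index_in_so(p) == index]


# Table (6.1): beta_i^vee written in the E8 simple (co)roots alpha_1..alpha_8.
TABLE_6_1 = [
    (0, 0, 1, 0, 0, 0, 0, 0),
    (0, 0, 0, 1, 0, 0, 0, 0),
    (0, 0, 0, 0, 1, 0, 0, 0),
    (0, 0, 0, 0, 0, 1, 0, 0),
    (0, 0, 0, 0, 0, 0, 1, 0),
    (0, -1, -1, -2, -2, -2, -2, 0),
]


def to_e8(beta_coefficients):
    """Substitute Table (6.1) into a combination of the beta_i^vee."""
    return tuple(sum(c * row[k] for c, row in zip(beta_coefficients, TABLE_6_1))
                 for k in range(8))


if __name__ == "__main__":
    print(partitions_of_index(13, 2))
    print(to_e8((2, 2, 2, 2, 2, 1)))  # case (a)
    print(to_e8((1, 2, 3, 4, 4, 2)))  # case (b)
```

The same enumerator with index 1 returns a single partition, which is the uniqueness statement of Lemma 4.1 for so13.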
We can pair each of (a) and (b) with the copy of sl2 from Example 5.2 to get an sl2 × sl2 subalgebra of e8 where both sl2’s have index 2. Clearly, these represent the only two E8-conjugacy classes of such subalgebras. With Display (6.4) in hand, we can calculate the multiplicities of the irreducible representations of sl2 × sl2 in e8 as in 5.5. In case (a), every irreducible summand m ⊗ n has m + n even. Therefore, this copy of sl2 × sl2 is the Lie algebra of a subgroup of E8 isomorphic to (SL2 × SL2)/(−1, −1). (An alternative way to see this is to note that the simple roots with odd coefficients are the same in Display (6.4a) and the defining vector in Example 5.2.) In case (b), we have the following table of multiplicities for m ⊗ n:

\[ \begin{array}{cc|ccc} & & & m & \\ & & 1 & 2 & 3 \\ \hline & 1 & 20 & 20 & 6 \\ n & 2 & 20 & 16 & 4 \\ & 3 & 6 & 4 & 0 \end{array} \tag{6.5} \]
In particular, it is the Lie algebra of a copy of SL2 × SL2 in E8. The centralizer of (b) in Spin13 has been calculated in [26, IV.2.25], and the identity component is Sp4 × Sp4, as claimed.

We can decompose e8 into a direct sum of irreducible representations of the sl2 × sl2 × sp4 × sp4 subalgebra from Lemma 6.2 by combining the decomposition of e8 into irreducible representations of sl2 × so13 from Decomposition (6.3) with the tables in [25]. Specifically, we restrict representations from so13 to an sp4 × so8 subalgebra and then from so8 to sp4 × sl2, where this sl2 also has index 2. Recall that sp4 has two fundamental irreducible representations: one that is 4-dimensional symplectic and another that is 5-dimensional orthogonal; we denote them by their dimensions. With this notation and (1.1), we find:

\[ V_{2,1} \cong 5 \otimes 4, \quad V_{1,2} \cong 4 \otimes 5, \quad V_{2,3} \cong 1 \otimes 4, \quad V_{3,2} \cong 4 \otimes 1, \quad \text{and} \quad V_{2,2} \cong 4 \otimes 4. \tag{6.6} \]
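The multiplicities in Table (6.5) can be recomputed from the roots of E8 in the manner of Technique 5.5. The sketch below is an illustration written for this purpose, not the authors' code. It takes the first defining vector from the marked diagram of Example 5.2 and the second from Display (6.4b), forms the joint weight multiplicities N(a, b), and extracts the multiplicity of m ⊗ n as N(m−1, n−1) − N(m+1, n−1) − N(m−1, n+1) + N(m+1, n+1). All names are illustrative.

```python
from collections import Counter

# Edges of the E8 Dynkin diagram in the numbering of Display (5.1).
EDGES = [(1, 3), (3, 4), (2, 4), (4, 5), (5, 6), (6, 7), (7, 8)]


def cartan_matrix():
    a = [[2 if i == j else 0 for j in range(8)] for i in range(8)]
    for i, j in EDGES:
        a[i - 1][j - 1] = a[j - 1][i - 1] = -1
    return a


def e8_roots():
    """All roots of E8 as coefficient vectors in the simple roots."""
    a = cartan_matrix()
    simple = [tuple(int(i == j) for j in range(8)) for i in range(8)]
    roots, stack = set(simple), list(simple)
    while stack:
        beta = stack.pop()
        for i in range(8):
            pairing = sum(beta[j] * a[j][i] for j in range(8))
            image = beta[:i] + (beta[i] - pairing,) + beta[i + 1:]
            if image not in roots:
                roots.add(image)
                stack.append(image)
    return roots


def marks_of(coroot_vector):
    """(alpha_1(h), ..., alpha_8(h)) for h = sum_k coroot_vector[k] * alpha_k^vee."""
    a = cartan_matrix()
    return tuple(sum(a[i][k] * coroot_vector[k] for k in range(8)) for i in range(8))


H1_MARKS = (1, 0, 0, 0, 0, 0, 0, 0)                # Example 5.2
H2_MARKS = marks_of((0, -2, -1, -2, -1, 0, 0, 0))  # Display (6.4b)


def multiplicity_table(marks1, marks2, size=3):
    """Rows are n = 1..size, columns are m = 1..size, entries are dim V_{m,n}."""
    count = Counter()
    for root in e8_roots():
        w1 = sum(c * m for c, m in zip(root, marks1))
        w2 = sum(c * m for c, m in zip(root, marks2))
        count[w1, w2] += 1
    count[0, 0] += 8  # Cartan subalgebra

    def mult(m, n):
        return (count[m - 1, n - 1] - count[m + 1, n - 1]
                - count[m - 1, n + 1] + count[m + 1, n + 1])

    return [[mult(m, n) for m in range(1, size + 1)] for n in range(1, size + 1)]


if __name__ == "__main__":
    for row in multiplicity_table(H1_MARKS, H2_MARKS):
        print(row)
```

The dimensions of the tensor products in Display (6.6) match the corresponding entries (for instance 5 × 4 = 20 for V2,1 and 4 × 4 = 16 for V2,2), and the entries weighted by m·n add up to 248.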
7. Copies of SL(2, C) in a Real Form of E8

Suppose now that we have a copy of SL(2, C) inside a real Lie group E of type E8. Over the complex numbers, we decompose Lie(E) ⊗ C into a direct sum of irreducible representations of SL(2, C) ⊗ C ≅ SL2,C × SL2,C; each irreducible representation can be written as m ⊗ n where m and n denote the dimension of an irreducible representation of the first or second SL2,C respectively. The goal of this section is to prove:

Proposition 7.1. Maintain the notation of the previous paragraph. If Lie(E) ⊗ C contains no irreducible summands m ⊗ n with m + n > 4, then the identity component Z of the centralizer of SL(2, C) in E is a subgroup isomorphic to
(1) Spin(7, 5) if E is split; or
(2) Spin(3, 9) or Spin(11, 1) if the Killing form of Lie(E) has signature −24.
In either case, Lie(Z) ⊗ C is the regular so12 subalgebra of Lie(E) ⊗ C.

Proof. Complexifying the inclusion of SL(2, C) in E and going to Lie algebras gives an inclusion of sl2,C × sl2,C in the complex Lie algebra e8. The hypothesis on the irreducible summands m ⊗ n implies that each of the two sl2,C’s has index 1 or 2 by Proposition 5.3. As complex conjugation interchanges the two components, they must have the same index.

Suppose first that both sl2’s have index 2. When we decompose e8 as in 1.1, we find the representation 2 ⊗ 3 with positive multiplicity 4 by Table (6.5), which violates our hypothesis on the SL(2, C) subgroup of E.

Therefore both sl2’s have index 1. Lemma 4.1 (twice) gives that this sl2 × sl2 is conjugate to the one generated by the highest root of E8 from Example 5.1 (so the second sl2 belongs to the centralizer of type E7) and by the highest root of the E7 subsystem and makes up the first two summands of an sl2 × sl2 × so12 subalgebra, the same one used to find decomposition (6.2). That is, so12 centralizes sl2 × sl2.
Conversely, the centralizer of the defining vectors of the two copies of sl2 has semisimple part so12; it follows that Lie(Z) ⊗ C is isomorphic to so12. From this and decomposition (6.2), we see that Z is a real form of Spin12. As Lie(E) is a real representation of Z, we deduce that V is also a real representation of Z but S+ and S− are not; they are interchanged by the Galois action. The first observation shows that Z is Spin(12 − a, a) for some 0 ≤ a ≤ 6. The second shows that a must be 1, 3, or 5, as claimed in the statement of the proposition.

It remains to prove the correspondence between a and the real forms of E8. For a = 5, this is clear: the subgroup generated by SL(2, C) and Spin(7, 5) has real rank 6, so it can only be contained in the split real form. Now suppose that a = 3 or 1 and that SL(2, C) is in the split E8; we will show that the Killing form of E has signature −24. Over C, SL(2, C) is conjugate to the copy of SL2,C × SL2,C in E8,C generated by the highest root of E8 and the highest root of the natural subsystem of type E7. Writing out these two roots in terms of the E8 simple roots, we see that α2 and α3 are the only simple roots whose coefficients have different parities. It follows that the element −1 ∈ SL(2, C) (equivalently, (−1, −1) ∈ SL2 × SL2) is $h_{\alpha_2}(-1)\, h_{\alpha_3}(-1)$ in the notation of [27], where $h_{\alpha_i} : \mathbb{C}^\times \to E \otimes \mathbb{C}$ is the cocharacter corresponding to the coroot $\alpha_i^\vee$. Now, α2 and α3 are the only simple roots with odd coefficients in the fundamental weight ω1, so the subgroup of E ⊗ C fixed by conjugation by this −1 is generated by root subgroups corresponding to roots α such that ⟨ω1, α⟩ is even. These roots form the natural D8 subsystem of E8, and in this way we
see SL(2, C) · Spin(12 − a, a) as a semisimple subgroup of maximal rank in a copy of a half-spin group H in 16 dimensions, namely the identity component of the centralizer of −1.

We claim that H is isogenous to SO(12, 4). As H is a half-spin group with a half-spin representation defined over R, it is isogenous to SO(16 − b, b) for b = 0, 4, or 8 or it is quaternionic; these possibilities have Killing forms of signature −120, −24, 8, or −8 respectively, as can be looked up in [28], for example. The adjoint representation of H, when restricted to SL(2, C) · Spin(12 − a, a), decomposes as the adjoint representation of SL(2, C) · Spin(12 − a, a) and 2 ⊗ 2 ⊗ V by decomposition (6.2). The Killing form on H restricts to a positive multiple of the Killing form on SL(2, C) · Spin(12 − a, a) (as can be seen over C by the explicit formula on p. E-14 of [26]), i.e., has signature −44 or −12 for a = 1 or 3, and a form of signature ±2(12 − 2a) on 2 ⊗ 2 ⊗ V; the sum of these has signature 0, −24, or −64 since a = 1 or 3. Comparing the two lists verifies that H is isogenous to SO(12, 4).

The Killing form on H has signature −24. The invariant bilinear form on the half-spin representation is hyperbolic (because H is isogenous to spin of an isotropic quadratic form of dimension divisible by 8, see [29, 1.1]). As a representation of H, Lie(E) is a sum of these two representations, and we conclude that the Killing form on Lie(E) has signature −24, as claimed.

Remark 7.2. We can determine the centralizer and the real form of E8 also in the excluded case in the proof where both sl2’s have index 2. As in Lemma 6.2, the centralizer is a real form of Sp4,C × Sp4,C. Decomposition (6.6) shows that complex conjugation interchanges the two Sp4,C terms, so the centralizer is R(Sp4,C). Complex conjugation interchanges the irreducible representations appearing in Eq. (1.1) in pairs (contributing 0 to the signature of the Killing form κ_E of E), except for 2 ⊗ 2 ⊗ V2,2, which has dimension 8².
This last piece breaks up into a 36-dimensional even subspace, and a 28-dimensional odd subspace, contributing 8 to the signature of κ_E and proving that the resulting real form of E8 is the split one.

8. No Theory of Everything in a Real Form of E8

In the decomposition of Lie(E) ⊗ C from Eq. (1.1), the integers m, n are positive, so (ToE2) implies

\[ V_{m,n} = 0 \quad \text{if } m \ge 4 \text{ or } n \ge 4. \tag{ToE2$'$} \]
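The signature bookkeeping in the proof of Proposition 7.1 and in Remark 7.2 is elementary arithmetic and can be checked as follows. This sketch assumes the convention visible in the text (compact directions count negatively, so SO(16) has signature −120), uses the standard count pq noncompact and p(p−1)/2 + q(q−1)/2 compact directions for so(p, q), and takes the Killing form of sl(2, C), viewed as a real Lie algebra, to have signature 0. The value −8 for the quaternionic form is copied from the text. Names are illustrative.

```python
from math import comb


def so_signature(p: int, q: int) -> int:
    """Signature of the Killing form of so(p, q): noncompact minus compact dimensions."""
    return p * q - comb(p, 2) - comb(q, 2)


# Candidates for H listed in the proof, with their Killing form signatures.
H_CANDIDATES = {
    "SO(16)": so_signature(16, 0),
    "SO(12,4)": so_signature(12, 4),
    "SO(8,8)": so_signature(8, 8),
    "quaternionic": -8,  # value quoted in the text
}


def possible_signatures(a: int) -> set:
    """Signatures compatible with the restriction to SL(2,C) . Spin(12 - a, a)."""
    subgroup = 0 + so_signature(12 - a, a)  # sl(2,C) as a real Lie algebra contributes 0
    extra = 2 * (12 - 2 * a)                # form on 2 (x) 2 (x) V, sign undetermined
    return {subgroup + extra, subgroup - extra}


if __name__ == "__main__":
    print(H_CANDIDATES)
    for a in (1, 3):
        matches = [name for name, s in H_CANDIDATES.items() if s in possible_signatures(a)]
        print(a, sorted(possible_signatures(a)), matches)
    print(8 * 8, 36 + 28, 36 - 28)  # Remark 7.2: dimension 8^2 and its contribution
```

For both a = 1 and a = 3 the only candidate that survives the comparison is SO(12, 4).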
We prove the following strengthening of the real case of Theorem 1.2: Lemma 8.1. If subgroups SL(2, C) and G of a real form E of E8 satisfy (ToE1) and (ToE2’), then V1,2 is a self-conjugate representation of G, i.e., (ToE3) fails. Proof. As in the proof of Proposition 7.1, over the complex numbers we get two copies of sl2 that embed in E8 with the same index, which is 1 or 2. If the index is 1, we are in the case of that proposition. The −1-eigenspace in Lie(E) (of the element −1 in the center of SL(2, C)) is a real representation of SL(2, C) · G, and G is contained in a copy of Spin(12 − a, a) for a = 1, 3, or 5. As in the proof of the proposition, there is a representation W of SL(2, C) × Spin(12 − a, a) defined over R that is isomorphic to (2 ⊗ 1 ⊗ S+ )
⊕ (1 ⊗ 2 ⊗ S− )
over C. Now G is contained in the maximal compact subgroup of Spin(12 − a, a), i.e., Lie(G) is a subalgebra of so(11), so(9) × so(3), or so(7) × so(5). The restrictions of the two half-spin representations of Spin(12 − a, a) to the compact subalgebra are equivalent [25, p. 264], and we see that in each case the restriction is quaternionic. (To see this, one uses the standard fact that the spin representation of so(2ℓ + 1) is real for ℓ ≡ 0, 3 (mod 4) and quaternionic for ℓ ≡ 1, 2 (mod 4).) That is, the restrictions of S+, S−, and their complex conjugates to the maximal compact subgroup are all equivalent (over C), hence the same is true for their further restrictions to G, and (ToE3) fails.

If the index is 2, then G is contained in a real form of Sp4,C × Sp4,C by Lemma 6.2. When we decompose e8 as in Eq. (1.1), we find V2,1 and V1,2 as in decomposition (6.6). As complex conjugation interchanges these two representations, it follows that complex conjugation interchanges the two Sp4,C factors, i.e., the centralizer of SL(2, C) has as identity component the transfer R(Sp4,C) of Sp4,C. Its maximal compact subgroup is the compact form of Sp4,C (also known as Spin(5)), all of whose irreducible representations are self-conjugate. Therefore, (ToE3) fails.

Remark 8.2. It is worthwhile noting that, in each of the three cases in Proposition 7.1 (the three cases where (ToE2) holds), it is possible to embed G_SM in the centralizer, thus showing that (ToE1) is satisfied. Given such an embedding, a simple computation verifies explicitly that S+ has a self-conjugate structure as a representation of G_SM. First consider Spin(11, 1). There is an obvious embedding of G_GUT := Spin(10). Under this embedding, S+ decomposes as the direct sum of the two half-spinor representations, i.e., as a generation and an anti-generation. For Spin(7, 5), there is an obvious embedding of the Pati-Salam group, G_GUT := (Spin(6) × Spin(4))/(Z/2Z). Again, S+ decomposes as the direct sum of a generation and an anti-generation. Finally, Spin(3, 9) contains (SU(3) × SU(2) × SU(2) × U(1))/(Z/6Z) as a subgroup. Under this subgroup,
\[
S_+ = (3, 2, 2)_{1/6} \oplus (\bar{3}, 2, 2)_{-1/6} \oplus (1, 2, 2)_{-1/2} \oplus (1, 2, 2)_{1/2},
\]
where the subscript indicates the U(1) weights, and the overall normalization is chosen to agree with the physicists' convention for the weights of the Standard Model's U(1)_Y. Embedding the SU(2) of the Standard Model in one of the two SU(2)s, we obtain an embedding of G_SM ⊂ Spin(3, 9) where, again, S+ has a self-conjugate structure as a representation of G_SM.

9. No Theory of Everything in Complex E8

We now complete the proof of Theorem 1.2 by proving the following strengthening of the complex case.

Lemma 9.1. If subgroups SL(2, C) and G of R(E8,C) satisfy (ToE1) and (ToE2’), then V1,2 is a self-conjugate representation of G, i.e., (ToE3) fails.

First, recall the definition of the transfer R(HC) of a complex group HC as described, e.g., in [30, §2.1.2]. Its complexification can be viewed as HC × HC, where complex conjugation acts via
\[
\overline{(h_1, h_2)} = (h_2, h_1).
\]
One can view R(HC) as the subgroup of the complexification consisting of elements fixed by complex conjugation. Now consider an inclusion φ : SL(2, C) = R(SL2,C) → R(E8,C). Complexifying, we identify R(SL2,C) ⊗ C with SL2,C × SL2,C and similarly for R(E8,C), and write out φ as
\[
\phi(h_1, h_2) = \bigl(\phi_1(h_1)\phi_2(h_2),\ \psi_1(h_1)\psi_2(h_2)\bigr) \tag{9.1}
\]
for some homomorphisms φ1, φ2, ψ1, ψ2 : SL2,C → E8,C. As φ is defined over R, we have
\[
\phi(h_1, h_2) = \overline{\phi(h_2, h_1)} = \bigl(\psi_1(h_2)\psi_2(h_1),\ \phi_1(h_2)\phi_2(h_1)\bigr),
\]
and it follows that ψ1(h1) = φ2(h1) and ψ2(h2) = φ1(h2). Conversely, given any two homomorphisms φ1, φ2 : SL2,C → E8,C (over C) with commuting images, the same equations define a homomorphism φ : SL(2, C) → R(E8,C) defined over R.

Proof of Lemma 9.1. Write Z for the identity component of the centralizer of the image of the map φ1 × φ2 : SL2,C × SL2,C → E8,C from Eq. (9.1). Clearly, G is contained in the transfer R(Z) of Z. In each of the cases below, we verify that
\[
Z \text{ is semisimple and } -1 \text{ is in the Weyl group of } Z. \tag{9.2}
\]
It follows from this that the maximal compact subgroup of R(Z ) is the compact real form Z R of Z and that Z R is an inner form. Hence every irreducible representation of Z R is real or quaternionic; hence every representation of Z R is self-conjugate. That is, (ToE3) fails, which is the desired contradiction. Case 1. φ1 or φ2 is trivial. Consider the easiest-to-understand case where φ1 or φ2 is the zero map, say φ2 . In the notation of Eq. (9.1), φ(h 1 , h 2 ) = (φ1 (h 1 ), φ1 (h 2 )), i.e., φ is the transfer of the homomorphism φ1 : SL2,C → E8,C . By Proposition 5.3, φ1 has index 1 or 2. If φ1 has index 1, then Z is simple of type E7 by Example 5.4, hence Property (9.2) holds. If φ1 has index 2, then Lie(Z ) is isomorphic to so13,C by Lemma 6.1, and again Property (9.2) holds. Case 2. Neither φ1 nor φ2 is trivial. Now suppose that neither φ1 nor φ2 is trivial. Again, Proposition 5.3 implies that φ1 and φ2 have Dynkin index 1 or 2. If φ1 and φ2 both have index 1, then (over C) the homomorphism φ1 × φ2 is the one from the proof of Proposition 7.1 and Z is the standard D6 subgroup of E8,C and Property (9.2) holds. Now suppose that φ1 and φ2 both have index 2. As φ is an injection, it is not possible that φ1 and φ2 both vanish on −1 ∈ SL2,C , and it follows from the proof of Lemma 6.2 that φ1 × φ2 is an injection as in the statement of Lemma 6.2. In particular, Z has Lie algebra sp4,C × sp4,C of type B2 × B2 and Property (9.2) holds. Note that (ToE2) fails in this case by Table (6.5). Suppose finally that φ1 has index 1 and φ2 has index 2. We conjugate so that φ2 (sl2 ) is the copy of sl2 from Example 5.2, and (by Lemma 4.1 for the centralizer so13 of φ2 (sl2 )) we can take φ1 (sl2 ) to be a copy of sl2 generated by the highest root of so13 . Calculating as described in 5.5 gives the following table of multiplicities for the irreducible representation m ⊗ n of sl2 × sl2 in e8 :
\[
\begin{array}{c|ccc}
n \backslash m & 1 & 2 & 3 \\ \hline
1 & 39 & 18 & 1 \\
2 & 32 & 16 & 0 \\
3 & 10 & 2 & 0
\end{array}
\tag{9.3}
\]
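As a sanity check on Table (9.3), the multiplicities can be summed against the dimensions mn of the representations m ⊗ n. The sketch below is our own cross-check rather than part of the paper's argument. It assumes the standard branchings 248 = (1 ⊗ 133) ⊕ (2 ⊗ 56) ⊕ (3 ⊗ 1) for an index-1 sl2 with centralizer e7, and 248 = (1 ⊗ 78) ⊕ (2 ⊗ 64) ⊕ (3 ⊗ 14) for the index-2 sl2 with centralizer so13. The dictionary layout and function names are illustrative.

```python
# Multiplicities of m (x) n in e8 from Table (9.3), stored as MULT[n][m].
MULT = {
    1: {1: 39, 2: 18, 3: 1},
    2: {1: 32, 2: 16, 3: 0},
    3: {1: 10, 2: 2, 3: 0},
}


def total_dimension(mult):
    """Sum of (dimension of m (x) n) times its multiplicity; should equal dim e8."""
    return sum(m * n * k for n, row in mult.items() for m, k in row.items())


def multiplicity_space_for_m(mult, m):
    """Dimension of the space multiplying the m-dimensional representation of the first sl2."""
    return sum(n * row[m] for n, row in mult.items())


def multiplicity_space_for_n(mult, n):
    """Dimension of the space multiplying the n-dimensional representation of the second sl2."""
    return sum(m * k for m, k in mult[n].items())


if __name__ == "__main__":
    print(total_dimension(MULT))
    print([multiplicity_space_for_m(MULT, m) for m in (1, 2, 3)])
    print([multiplicity_space_for_n(MULT, n) for n in (1, 2, 3)])
```

The entry for 1 ⊗ 1 is the dimension of the centralizer, and 39 = 3 + 36 is consistent with a subgroup of type A1 × B4.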
In particular, the A1 × B4 subgroup of Spin13 that centralizes the image of φ1 × φ2 is all of the identity component Z of the centralizer in E8. Again Property (9.2) holds. (Of course, Table (9.3) shows that (ToE2) fails.)

10. Relaxing (ToE2) to (ToE2’)

Combining Lemmas 8.1 and 9.1 gives a proof not only of Theorem 1.2, but of the following stronger statement.

Theorem 10.1. There are no subgroups SL(2, C) · G satisfying (ToE1), (ToE2’), and (ToE3) in the (transfer of the) complex E8 or any real form of E8.

We retained hypothesis (ToE2) in the Introduction because that is what is demanded by physics. Technically, it is possible for V2,3 and V3,2 to be nonzero in an interacting theory (so that (ToE2) is false but (ToE2’) still holds), but only in the presence of local supersymmetry (i.e., in supergravity theories) [31]. Lisi's framework is not compatible with local supersymmetry, so we excluded this possibility above. For real forms of E8, weakening (ToE2) to (ToE2’) only adds the case of E8(8), with G_max = Spin(5), where we find
\[
V_{3,2} \simeq V_{2,3} = 4, \qquad V_{2,1} \simeq V_{1,2} = 4 \oplus 16, \tag{10.1}
\]
and we have indicated the irreducible representations of Spin(5) by their dimensions. Because the gravitinos transform nontrivially under G_max and because of their multiplicity, the only consistent possibility would be a gauged N = 4 supergravity theory (for a recent review of such theories, see [32]). Unfortunately, the rest of the matter content (it suffices to look at V2,1) is not compatible with N = 4 supersymmetry. Even if it were, N = 4 supersymmetry would, of course, necessitate that the theory be non-chiral, making it unsuitable as a candidate Theory of Everything.

To summarize the results of this section, the previous subsection, and Remark 7.2, weakening (ToE2) to (ToE2’) adds only three additional entries to Table 1.2.
\[
\begin{array}{llll}
E & G_{\max} & V_{3,2} & V_{2,1} \\ \hline
E_{8(8)} & \mathrm{Spin}(5) & 4 & 4 \oplus 16 \\
R(E_{8,\mathbb{C}}) & \mathrm{Spin}(5) \times \mathrm{Spin}(5) & (4, 1) \oplus (1, 4) & (4, 5) \oplus (5, 4) \\
R(E_{8,\mathbb{C}}) & \mathrm{SU}(2) \times \mathrm{Spin}(9) & (2, 1) & (2, 9) \oplus (2, 16)
\end{array}
\tag{10.2}
\]
In each case the fermion representations, V2,1 ≃ V1,2 and V3,2 ≃ V2,3, are pseudoreal representations of G_max.

11. Conclusion

In Subsect. 3.3 above, we observed by an easy dimension count that no proposed Theory of Everything constructed using subgroups of a real form E of E8 has a sufficient number of weight vectors in the −1-eigenspace to identify with all known fermions. The proof of our Theorem 1.2 was quite a bit more complicated, but it also gives much more.
It shows that you cannot obtain a chiral gauge theory for any candidate ToE subgroup of E, whether E is a real form or the complex form of E8 . In particular, it is impossible to obtain even the 1-generation Standard Model (in the sense of Definition 2.5) in this fashion. Acknowledgements. The second author thanks Fred Helenius for pointing out a reference and Patrick Brosnan for suggesting this project. The authors’ research was supported by the National Science Foundation under grant nos. PHY-0455649 (Distler) and DMS-0653502 (Garibaldi).
References
1. Lisi, A.G.: An exceptionally simple theory of everything. http://arxiv.org/abs/0711.0770v1 [hep-th], 2007
2. Borel, A.: Linear Algebraic Groups. Vol. 126 of Graduate Texts in Mathematics, 2nd ed., New York: Springer-Verlag, 1991
3. Serre, J.-P.: Local Fields. Vol. 67 of Graduate Texts in Mathematics. New York-Berlin: Springer, 1979
4. Tits, J.: Représentations linéaires irréductibles d'un groupe réductif sur un corps quelconque. J. Reine Angew. Math. 247, 196–220 (1971)
5. Georgi, H., Glashow, S.: Unity of all elementary particle forces. Phys. Rev. Lett. 32, 438–441 (1974)
6. Pati, J., Salam, A.: Lepton number as the fourth color. Phys. Rev. D10, 275–289 (1974)
7. Distler, J.: Superconnections for dummies. Weblog entry, May 12, 2008, available at http://golem.ph.utexas.edu/~distler/blog/archives/001680.html
8. Streater, R.F., Wightman, A.S.: PCT, Spin and Statistics, and All That. Princeton Landmarks in Mathematics and Physics, Princeton, NJ: Princeton University Press, 2000
9. Weinberg, S.: The Quantum Theory of Fields. 3 volumes. Cambridge: Cambridge University Press, 1995–2000
10. 't Hooft, G.: In: Recent Developments in Gauge Theories, Cargèse France, 't Hooft, G., Itzykson, C., Jaffe, A., Lehmann, H., Mitter, P.K., Singer, I., Stora, R., eds., no. 59 in Proceedings of the Nato Advanced Study Series B, London: Plenum Press, 1980
11. Lisi, A.G.: First person: A. Garrett Lisi. Financial Times (April 25, 2009), available at http://www.ft.com/cms/s/2/ebead98a-2d71-11de-9eba-00144feabdc0.html
12. Helgason, S.: Differential Geometry, Lie Groups, and Symmetric Spaces. Vol. 34 of Graduate Studies in Mathematics, Providence, RI: Amer. Math. Soc., 2001
13. Serre, J.-P.: On the values of the characters of compact Lie groups. Oberwolfach Reports 1(1), 666–667 (2004)
14. Elashvili, A., Kac, V., Vinberg, E.: On exceptional nilpotents in semisimple Lie algebras. http://arxiv.org/abs/0812.1571v1 [math.GR], 2008
15. Dynkin, E.: Semisimple subalgebras of semisimple Lie algebras. Amer. Math. Soc. Transl. (2) 6, 111–244 (1957) [Russian original: Mat. Sbornik N.S. 30(72), 349–462 (1952)]
16. Bourbaki, N.: Lie Groups and Lie Algebras: Chapters 7–9. Berlin: Springer-Verlag, 2005
17. Gross, B., Nebe, G.: Globally maximal arithmetic groups. J. Algebra 272(2), 625–642 (2004)
18. Humphreys, J.: Introduction to Lie Algebras and Representation Theory. Vol. 9 of Graduate Texts in Mathematics. Third printing, revised. Berlin-Heidelberg-New York: Springer-Verlag, 1980
19. Collingwood, D., McGovern, W.: Nilpotent Orbits in Semisimple Lie Algebras. New York: Van Nostrand Reinhold, 1993
20. Onishchik, A., Vinberg, E.: Lie Groups and Lie Algebras III. Vol. 41 of Encyclopaedia Math. Sci. Berlin-Heidelberg-New York: Springer, 1994
21. Carter, R.: Finite Groups of Lie Type: Conjugacy Classes and Complex Characters. New York: Wiley-Interscience, 1985
22. Bourbaki, N.: Lie Groups and Lie Algebras: Chapters 4–6. Berlin: Springer-Verlag, 2002
23. Vavilov, N.: Do it yourself structure constants for Lie algebras of type E. J. Math. Sci. (N.Y.) 120(4), 1513–1548 (2004)
24. Èlašvili, A.: Centralizers of nilpotent elements in semisimple Lie algebras. Sakharth. SSR Mecn. Akad. Math. Inst. Šrom. 46, 109–132 (1975)
25. McKay, W., Patera, J.: Tables of Dimensions, Indices, and Branching Rules for Representations of Simple Lie Algebras. Vol. 69 of Lecture Notes in Pure and Applied Mathematics. New York: Marcel Dekker Inc., 1981
26. Springer, T., Steinberg, R.: Conjugacy classes. In: Seminar on Algebraic Groups and Related Finite Groups (The Institute for Advanced Study, Princeton, N.J., 1968/69), Vol. 131 of Lecture Notes in Math., Berlin: Springer, 1970, pp. 167–266
27. Steinberg, R.: Lectures on Chevalley Groups. New Haven, CT: Yale University, 1968
28. Tits, J.: Tabellen zu den einfachen Lie Gruppen und ihren Darstellungen. Vol. 40 of Lecture Notes in Mathematics. Berlin: Springer-Verlag, 1967
29. Garibaldi, R.: Clifford algebras of hyperbolic involutions. Math. Zeit. 236, 321–349 (2001)
30. Platonov, V., Rapinchuk, A.: Algebraic Groups and Number Theory. Boston, MA: Academic Press, 1994
31. Grisaru, M.T., Pendleton, H.N.: Soft spin 3/2 fermions require gravity and supersymmetry. Phys. Lett. B67, 323 (1977)
32. Schon, J., Weidner, M.: Gauged N = 4 supergravities. JHEP 05, 034 (2006)

Communicated by A. Kapustin
Commun. Math. Phys. 298, 437–459 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1042-7
On Geometric Problems Related to Brown-York and Liu-Yau Quasilocal Mass

Pengzi Miao1, Yuguang Shi2, Luen-Fai Tam3

1 School of Mathematical Sciences, Monash University, Victoria, 3800, Australia. E-mail: [email protected]
2 Key Laboratory of Pure and Applied Mathematics, School of Mathematics Science, Peking University, Beijing, 100871, P.R. China. E-mail: [email protected]
3 The Institute of Mathematical Sciences and Department of Mathematics, The Chinese University of Hong Kong, Shatin, Hong Kong, China. E-mail: [email protected]

Received: 30 July 2009 / Accepted: 18 December 2009
Published online: 10 April 2010 – © Springer-Verlag 2010
Abstract: We discuss some geometric problems related to the definitions of quasilocal mass proposed by Brown and York (Contemporary mathematics, vol 132, American Mathematical Society, Providence, pp 129–142, 1992; Phys Rev D (3) 47(4):1407–1419, 1993) and Liu and Yau (Phys Rev Lett 90(23):231102, 2003; J Am Math Soc 19(1):181–204, 2006). Our discussion consists of three parts. In the first part, we propose a new variational problem on compact manifolds with boundary, which is motivated by the study of Brown-York mass. We prove that critical points of this variational problem are exactly static metrics. In the second part, we derive a derivative formula for the Brown-York mass of a smooth family of closed two dimensional surfaces evolving in an ambient three dimensional manifold. As a by-product, we are able to write the ADM mass (Arnowitt et al. in Phys. Rev. (2), 122:997–1006, 1961) of an asymptotically flat 3-manifold as the sum of the Brown-York mass of a coordinate sphere $S_r$ and an integral of the scalar curvature plus a geometrically constructed function $\Phi(x)$ in the asymptotic region outside $S_r$. In the third part, we prove that for any closed, spacelike 2-surface $\Sigma$ in the Minkowski space $\mathbb{R}^{3,1}$ for which the Liu-Yau mass is defined, if $\Sigma$ bounds a compact spacelike hypersurface in $\mathbb{R}^{3,1}$, then the Liu-Yau mass of $\Sigma$ is strictly positive unless $\Sigma$ lies on a hyperplane. We also show that the examples given by Ó Murchadha et al. (Phys Rev Lett 92:259001, 2004) are special cases of this result.

Research partially supported by Australian Research Council Discovery Grant #DP0987650.
Research partially supported by Grant of NSFC (10725101).
Research partially supported by Hong Kong RGC General Research Fund #GRF 2160357.

1. Introduction

In this work, we will discuss some geometric problems related to the definitions of quasilocal mass proposed by Brown-York [6,7] and Liu-Yau [14,15]. In general, there are certain properties that a reasonable definition of quasilocal mass should satisfy; see [23] for example. The most important property is the positivity. There are results on positivity of Brown-York mass and Liu-Yau mass in [15,21,22,24,25]. In particular, the following is a consequence on the positivity of Brown-York mass proved by the last two authors in [21]. Let $g_e$ be the standard Euclidean metric on $\mathbb{R}^3$. Let $\Omega$ be a bounded strictly convex domain in $\mathbb{R}^3$ with smooth boundary $\Sigma$ which has mean curvature $H_0$. Then $\int_\Sigma H_0\, d\sigma$ is a maximum of the functional $\int_\Sigma H\, d\sigma$ on the class of smooth metrics with nonnegative scalar curvature on $\Omega$ which agree with $g_e$ tangentially on $\Sigma$ and have positive boundary mean curvature $H$. It is interesting to see if this is still true for general domains in $\mathbb{R}^3$.

In [22], a similar result was proved for domains in $\mathbb{H}^3$, the hyperbolic 3-space. Namely, it was proved that if $g_h$ is the standard hyperbolic metric on $\mathbb{H}^3$ and $\Omega$ is a bounded domain with strictly convex smooth boundary $\Sigma$ which is a topological sphere and has mean curvature $H_0$, then $\int_\Sigma H_0 \cosh r\, d\sigma$ is a maximum of the functional $\int_\Sigma H \cosh r\, d\sigma$ on the class of smooth metrics with scalar curvature bounded below by $-6$ which agree with $g_h$ tangentially on $\Sigma$ and have positive boundary mean curvature $H$. Here $r$ is the distance function on $\mathbb{H}^3$ from a fixed point in $\Omega$. Again it is interesting to see if this is still true for general domains in $\mathbb{H}^3$.

The results and questions above motivate us to study the functional
\[
F_\phi(g) = \int_\Sigma H \phi\, d\sigma,
\]
where $\Sigma$ is the boundary of an $n$-dimensional compact manifold $\Omega$, $\phi$ is a given smooth nontrivial function (that is, $\phi \not\equiv 0$) on $\Sigma$, and $d\sigma$ is the volume form of a fixed metric $\gamma$ on $\Sigma$. The class of metrics $g$ we are interested in is the space of metrics with constant scalar curvature $K$ which induce the metric $\gamma$ on $\Sigma$. In Theorem 2.1, we will prove the following: $g$ is a critical point of $F_\phi(\cdot)$ if and only if $g$ is a static metric with a static potential $N$ that equals $\phi$ on $\Sigma$. That is to say:
\[
\begin{cases}
-(\Delta_g N)\, g + \nabla_g^2 N - N\, \mathrm{Ric}(g) = 0 & \text{on } \Omega, \\
N = \phi & \text{at } \Sigma.
\end{cases}
\]
In the theorem, for $K > 0$, we also assume that the first Dirichlet eigenvalue of $(n - 1)\Delta_g + K$ is positive. In particular, if $\phi = 1$, $K = 0$ and $n = 3$, we can conclude that $g$ is a critical point of $\int_\Sigma H\, d\sigma$ if and only if $g$ is a flat metric.

Another important question on quasilocal mass is whether it has some monotonicity property. In [21], it was shown that the Brown-York mass of the boundaries of certain domains in a space with some quasi-spherical metric is monotonically decreasing rather than increasing as the domains become larger. In Theorem 3.1, we will derive a derivative formula of the Brown-York mass of a smooth family of 2-surfaces with positive Gaussian curvature which evolve in an ambient 3-manifold. The formula gives a generalization of the monotonicity formula in [21] which plays a key role in the proof of the positivity of Brown-York mass. As a by-product of this derivative formula, in Corollary 3.5 we are able to write the ADM mass [1] of an asymptotically flat 3-manifold as the sum of the Brown-York mass of a coordinate sphere $S_r$ and an integral of the scalar curvature plus a geometrically constructed function $\Phi(x)$ in the asymptotic region outside $S_r$. We note that a qualitatively similar expression relating the ADM mass and the scalar curvature can be found in a recent work by Bray and Khuri [5, Th. 7].

The Minkowski space $\mathbb{R}^{3,1}$ represents the zero energy state in general relativity.
Thus, a reasonable notion of quasilocal mass should be such that its value on a spacelike 2-surface in $\mathbb{R}^{3,1}$ equals zero. In [14,15], the Liu-Yau mass was introduced and its positivity
was proved. In the time symmetric case, this coincides with the Brown-York mass. However, Ó Murchadha, Szabados and Tod [20] constructed spacelike 2-surfaces $\Sigma$ with spacelike mean curvature vector $\vec{H}$ in $\mathbb{R}^{3,1}$ and with positive Gaussian curvature such that the Liu-Yau mass of $\Sigma$,
\[
m_{LY}(\Sigma) = \frac{1}{8\pi} \int_\Sigma (H_0 - |\vec{H}|)\, d\sigma,
\]
is strictly positive. Here $H_0$ is the mean curvature of $\Sigma$ when isometrically embedded in $\mathbb{R}^3$ and $|\vec{H}|$ is the Lorentzian norm of $\vec{H}$ in $\mathbb{R}^{3,1}$. In [19], Ó Murchadha further showed that there exist 2-surfaces in $\mathbb{R}^{3,1}$ whose Liu-Yau mass can be unboundedly large. Recently, Wang and Yau [24,25] have introduced another definition of quasilocal mass to address this question. In Theorem 4.1 in this paper, we will prove the following: Let $\Sigma$ be a closed, connected, spacelike 2-surface in the Minkowski space $\mathbb{R}^{3,1}$ with spacelike mean curvature vector and with positive Gaussian curvature. Suppose $\Sigma$ spans a compact, spacelike hypersurface in $\mathbb{R}^{3,1}$. Then the Liu-Yau mass of $\Sigma$ is strictly positive, unless $\Sigma$ lies on a hyperplane. The results give some properties on isometric embeddings of compact surfaces with positive Gaussian curvature in the Minkowski space. We will also show that all the examples in [20] satisfy the conditions in Theorem 4.1.

This paper is organized as follows. In Sect. 2, we will prove that static metrics are the only critical points of the functional $F_\phi(\cdot)$. In Sect. 3, a formula for the derivative of the Brown-York mass will be derived and some applications will be given. In Sect. 4, we will prove that for "most" spacelike 2-surfaces in $\mathbb{R}^{3,1}$ for which the Liu-Yau mass is defined, their Liu-Yau mass is strictly positive. In the Appendix, we prove some results on the differentiability of a 1-parameter family of isometric embeddings in $\mathbb{R}^3$, following the arguments of Nirenberg [18]. The results will be used in Sect. 3.

2. Static Metrics and Brown-York Type Integral

Throughout this section, we let $\Omega$ be an $n$-dimensional ($n \ge 3$) compact manifold with smooth boundary $\Sigma$.
Let $\gamma$ be a smooth Riemannian metric on $\Sigma$. As in [17], for a constant $K$ and any integer $k > \frac{n}{2} + 2$, we let $\mathcal{M}_\gamma^K$ be the set of $W^{k,2}$ metrics $g$ on $\Omega$ with constant scalar curvature $K$ such that $g|_{T(\Sigma)} = \gamma$. If $g \in \mathcal{M}_\gamma^K$ and the first Dirichlet eigenvalue of $(n - 1)\Delta_g + K$ is positive, where $\Delta_g$ is the usual Laplacian operator of $g$, then $\mathcal{M}_\gamma^K$ is a manifold near $g$ (see [17] for details). Let $\phi$ be a given smooth function on $\Sigma$. We define the following functional on $\mathcal{M}_\gamma^K$:
\[
F_\phi(g) = \int_\Sigma H_g \phi\, d\sigma, \tag{2.1}
\]
where $H_g$ is the mean curvature of $\Sigma$ in $(\Omega, g)$ with respect to the outward unit normal and $d\sigma$ is the volume form of $\gamma$. Motivated by the results in [21,22] on the positivity of Brown-York mass and some generalization, we want to determine the critical points of $F_\phi(\cdot)$ on $\mathcal{M}_\gamma^K$. Before we state the main result, we recall the following definition from [8]:

Definition 2.1. A metric $g$ on an open set $U$ is called a static metric on $U$ if there exists a nontrivial function $N$ (called the static potential) on $U$ such that
\[
-(\Delta_g N)\, g + \nabla_g^2 N - N\, \mathrm{Ric}(g) = 0. \tag{2.2}
\]
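Here $\Delta_g$, $\nabla_g^2$ are the usual Laplacian and Hessian operators of $g$, and $\mathrm{Ric}(g)$ is the Ricci curvature of $g$. A basic property of static metrics is that they are necessarily metrics of constant scalar curvature [8, Prop. 2.3]. In the following, we obtain a characterization of static metrics in $\mathcal{M}_\gamma^K$ using the functional $F_\phi(\cdot)$.

Theorem 2.1. With the above notations, let $\phi$ be a nontrivial smooth function on $\Sigma$. Suppose $g \in \mathcal{M}_\gamma^K$ is such that the first Dirichlet eigenvalue of $(n - 1)\Delta_g + K$ is positive. Then $g$ is a critical point of $F_\phi(\cdot)$ defined in (2.1) if and only if $g$ is a static metric with a static potential $N$ such that $N = \phi$ on $\Sigma$.

Proof. Since the first Dirichlet eigenvalue of $(n - 1)\Delta_g + K$ is positive, we know $\mathcal{M}_\gamma^K$ is a manifold near $g$ by the result in [17]. First, we suppose $g$ is a static metric with a potential $N$ such that $N = \phi$ on $\Sigma$. Let $g(t)$ be a smooth curve in $\mathcal{M}_\gamma^K$ with $g(0) = g$. Let
\[
F(t) = \int_\Sigma H(t) \phi\, d\sigma,
\]
where $H(t)$ is the mean curvature of $\Sigma$ in $(\Omega, g(t))$ with respect to the outward unit normal $\nu$. Let $'$ denote the derivative with respect to $t$. We want to prove that $F'(0) = 0$. Let $h = g'(0)$. In what follows, we let $\omega_n$ denote the outward unit normal part of a 1-form $\omega$, i.e. $\omega_n = \omega(\nu)$, let $II$ be the second fundamental form of $\Sigma$ in $(\Omega, g(t))$ with respect to $\nu$, let $X$ be the vector field on $\Sigma$ that is dual to the 1-form $h(\nu, \cdot)|_{T(\Sigma)}$ on $(\Sigma, \gamma)$ and let $\mathrm{div}_\gamma X$ be the divergence of $X$ on $(\Sigma, \gamma)$. For convenience, we often omit writing the volume form in an integral. As in [17, (34)], we have
\[
\begin{aligned}
2F'(0) &= \int_\Sigma 2H'(0) N \\
&= \int_\Sigma N \Bigl( [d(\mathrm{tr}_g h) - \mathrm{div}_g h]_n - \mathrm{div}_\gamma X - \langle II, h \rangle_\gamma \Bigr) \\
&= \int_\Sigma N [d(\mathrm{tr}_g h) - \mathrm{div}_g h]_n - \int_\Sigma N\, \mathrm{div}_\gamma X \quad (\text{because } h|_{T(\Sigma)} = 0) \\
&= \int_\Omega \Bigl( N \bigl[ \Delta_g(\mathrm{tr}_g h) - \mathrm{div}_g(\mathrm{div}_g h) \bigr] - \mathrm{tr}_g h\, \Delta_g N \Bigr) + \int_\Sigma \mathrm{tr}_g h\, (dN)_n - \int_\Omega \langle dN, \mathrm{div}_g h \rangle_g - \int_\Sigma N\, \mathrm{div}_\gamma X \\
&= -\int_\Omega \Bigl( N \langle h, \mathrm{Ric}(g) \rangle_g + \mathrm{tr}_g h\, \Delta_g N \Bigr) + \int_\Sigma \mathrm{tr}_g h\, (dN)_n + \int_\Omega \langle \nabla_g^2 N, h \rangle_g - \int_\Sigma h(\nu, \nabla N) - \int_\Sigma N\, \mathrm{div}_\gamma X \quad (\text{using (2.4) and (2.5)}) \\
&= \int_\Omega \bigl\langle h,\ -N\, \mathrm{Ric}(g) - (\Delta_g N)\, g + \nabla_g^2 N \bigr\rangle_g + \int_\Sigma \bigl[ \mathrm{tr}_g h\, (dN)_n - h(\nu, \nabla N) \bigr] - \int_\Sigma N\, \mathrm{div}_\gamma X,
\end{aligned}
\tag{2.3}
\]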
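where we have used the facts
\[
\int_\Omega \langle dN, \mathrm{div}_g h \rangle_g = -\int_\Omega \langle \nabla_g^2 N, h \rangle_g + \int_\Sigma h(\nu, \nabla N) \tag{2.4}
\]
and
\[
DR_g(h) = \frac{d}{dt} R(t) \Big|_{t=0} = -\Delta_g(\mathrm{tr}_g h) + \mathrm{div}_g(\mathrm{div}_g h) - \langle h, \mathrm{Ric}(g) \rangle_g = 0. \tag{2.5}
\]
Here $R(t)$ is the scalar curvature of $g(t)$. Let $\nabla^\Sigma N$ denote the gradient of $N$ on $(\Sigma, \gamma)$. Integrating by parts on $\Sigma$, we have
\[
\int_\Sigma N\, \mathrm{div}_\gamma X = -\int_\Sigma \langle \nabla^\Sigma N, X \rangle_\gamma = -\int_\Sigma h(\nu, \nabla N) + \int_\Sigma h(\nu, \nu)\, (dN)_n. \tag{2.6}
\]
On the other hand, $\mathrm{tr}_g h = h(\nu, \nu)$ at $\Sigma$. Hence, $F'(0) = 0$ by (2.2), (2.3), and (2.6).

To prove the converse, suppose $g$ is a critical point of $F_\phi(\cdot)$. Let $\{g(t)\}$ be any smooth path in $\mathcal{M}_\gamma^K$ passing through $g = g(0)$. Let $h = g'(0)$ and $F(t) = F_\phi(g(t))$. As before, we have
\[
2F'(0) = \int_\Sigma 2H'(0) \phi = \int_\Sigma \phi\, [d(\mathrm{tr}_g h) - \mathrm{div}_g h]_n - \int_\Sigma \phi\, \mathrm{div}_\gamma X. \tag{2.7}
\]
Since the first Dirichlet eigenvalue of $(n - 1)\Delta_g + K$ is positive and $\phi$ is not identically zero, there exists a unique smooth function $N = N_\phi$ which is not identically zero on $\Omega$ such that
\[
\begin{cases}
(n - 1)\Delta_g N + K N = 0 & \text{on } \Omega, \\
N = \phi & \text{at } \Sigma.
\end{cases}
\tag{2.8}
\]
With such an $N$ given, we have
\[
\begin{aligned}
\int_\Sigma \phi\, [d(\mathrm{tr}_g h) - \mathrm{div}_g h]_n - \int_\Sigma \phi\, \mathrm{div}_\gamma X
&= \int_\Omega \Bigl( N \bigl[ \Delta_g(\mathrm{tr}_g h) - \mathrm{div}_g(\mathrm{div}_g h) \bigr] - \mathrm{tr}_g h\, \Delta_g N \Bigr) + \int_\Sigma \mathrm{tr}_g h\, (dN)_n - \int_\Omega \langle dN, \mathrm{div}_g h \rangle_g - \int_\Sigma N\, \mathrm{div}_\gamma X \\
&= \int_\Omega \Bigl( N (-1) \langle h, \mathrm{Ric}(g) \rangle_g - \mathrm{tr}_g h\, \Delta_g N + \langle \nabla_g^2 N, h \rangle_g \Bigr),
\end{aligned}
\tag{2.9}
\]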
where we used the fact $DR_g(h) = 0$ (the boundary terms cancel as before, and we have not used (2.8) yet). Now let $\hat{h}$ be any smooth symmetric (0,2) tensor with compact support in $\Omega$. For each $t$ sufficiently small, we can find a smooth positive function $u(t)$ on $\Omega$ such that $u(t) = 1$ at $\Sigma$ and
\[
g(t) = u(t)^{\frac{4}{n-2}} (g + t\hat{h}) \in \mathcal{M}_\gamma^K.
\]
Moreover, $u(t)$ is differentiable at $t = 0$ and $u(0) \equiv 1$ on $\Omega$. See the proof of [17, Th. 5] for details on the existence of such a $u(t)$. Now $g'(0) = \frac{4}{n-2} u'(0)\, g + \hat{h}$. Hence, by (2.7) and (2.9) we have
\[
\begin{aligned}
2F'(0) &= \int_\Omega \Bigl\langle \tfrac{4}{n-2} u'(0)\, g + \hat{h},\ -N\, \mathrm{Ric}(g) - (\Delta_g N)\, g + \nabla_g^2 N \Bigr\rangle_g \\
&= \int_\Omega \bigl\langle \hat{h},\ -N\, \mathrm{Ric}(g) - (\Delta_g N)\, g + \nabla_g^2 N \bigr\rangle_g + \int_\Omega \frac{4}{n-2} u'(0) \bigl[ -K N - n(\Delta_g N) + \Delta_g N \bigr].
\end{aligned}
\tag{2.10}
\]
By (2.8), the second integral in the above equation is zero. Hence, we have
\[
2F'(0) = \int_\Omega \bigl\langle \hat{h},\ -N\, \mathrm{Ric}(g) - (\Delta_g N)\, g + \nabla_g^2 N \bigr\rangle_g. \tag{2.11}
\]
Since $\hat{h}$ can be arbitrary, we conclude that $g$ and $N$ satisfy (2.2).
Remark 2.1. If $K \le 0$, then the condition that the first Dirichlet eigenvalue of $(n - 1)\Delta_g + K$ is positive holds automatically for $g \in \mathcal{M}_\gamma^K$.

As a direct corollary of Theorem 2.1, we have

Corollary 2.1. With the notations given as in Theorem 2.1, suppose $K = 0$ and $\phi = 1$. Then $g \in \mathcal{M}_\gamma^0$ is a critical point of $\int_\Sigma H\, d\sigma$ if and only if $g$ is a Ricci flat metric. In particular, if $n = 3$, then $g \in \mathcal{M}_\gamma^0$ is a critical point of $\int_\Sigma H\, d\sigma$ if and only if $g$ is a flat metric.

If $\phi$ does not change sign on the boundary, we further have:

Corollary 2.2. With the notations given as in Theorem 2.1, suppose $\phi \ge 0$ or $\phi \le 0$ on $\Sigma$. Suppose $g \in \mathcal{M}_\gamma^K$ is a static metric. If $K > 0$, we also assume that the first Dirichlet eigenvalue of $(n - 1)\Delta_g + K$ is positive. Let $g(t)$ be a smooth family of smooth metrics on $\Omega$ with $g(0) = g$ such that
(i) the scalar curvature of $g(t)$ is at least $K$;
(ii) $g(t)$ induces $\gamma$ on $\Sigma$.
Then
\[
\frac{d}{dt} F_\phi(g(t)) \Big|_{t=0} = 0.
\]

Proof. We prove the case that $\phi \ge 0$ on $\Sigma$. The case that $\phi \le 0$ on $\Sigma$ is similar. By the assumption on $g$, for $t$ small, we can find smooth positive functions $u(t)$ on $\Omega$ with $u(t) = 1$ on $\Sigma$ such that $\hat{g}(t) = u^{\frac{4}{n-2}}(t)\, g(t) \in \mathcal{M}_\gamma^K$, $u(t)$ is differentiable at $t = 0$ and $u(0) \equiv 1$ (see the proof of Proposition 1 in [17]). The mean curvature $\hat{H}(t)$ of $\Sigma$ in $(\Omega, \hat{g}(t))$ is given by
\[
\hat{H}(t) = H(t) + \frac{2(n - 1)}{n - 2} \frac{\partial u}{\partial \nu_t}, \tag{2.12}
\]
where $H(t)$ and $\nu_t$ are the mean curvature and the unit outward normal of $\Sigma$ in $(\Omega, g(t))$. Note that $u$ satisfies:
\[
\begin{cases}
\dfrac{4(n - 1)}{n - 2} \Delta_{g(t)} u - K(t)\, u = -K u^{\frac{n+2}{n-2}} & \text{in } \Omega, \\
u = 1 & \text{on } \Sigma,
\end{cases}
\tag{2.13}
\]
where $K(t)$ is the scalar curvature of $g(t)$. Since $K(t) \ge K$, by the maximum principle, we have
\[
\frac{\partial u}{\partial \nu_t} \ge 0.
\]
Hence, $\hat{H}(t) \ge H(t)$ and consequently $F_\phi(\hat{g}(t)) \ge F_\phi(g(t))$ by the assumption $\phi \ge 0$ on $\Sigma$. By Theorem 2.1, we have
\[
\frac{d}{dt} F_\phi(\hat{g}(t)) \Big|_{t=0} = 0.
\]
Since $\hat{g}(0) = g(0)$, we conclude
\[
\frac{d}{dt} F_\phi(g(t)) \Big|_{t=0} = 0.
\]
Here are some examples provided by Theorem 2.1:

Example 1. Let $\Omega$ be a bounded domain in $\mathbb{R}^n$ with smooth boundary $\Sigma$. Then the standard Euclidean metric is a critical point of $F_\phi(\cdot)$ with $\phi \equiv 1$. If $\Sigma$ is strictly convex, then this follows also from the result in [21].

Example 2. Let $\Omega$ be a bounded domain in $\mathbb{H}^n$ with smooth boundary $\Sigma$. Then the standard hyperbolic metric is a critical point of $F_\phi(\cdot)$ with $\phi = \cosh r$, where $r$ is the distance function on $\mathbb{H}^n$ from a fixed point. If $\Sigma$ is strictly convex and $n = 3$, then this follows also from the result in [22].

Example 3. Let $\Omega$ be a domain in $\mathbb{S}^n$ with smooth boundary $\Sigma$ such that the volume of $\Omega$ is less than $2\pi$. Then the standard metric on $\mathbb{S}^n$ is a critical point of $F_\phi(\cdot)$ with $\phi = \cos r$, where $r$ is the distance function on $\mathbb{S}^n$ from a fixed point.

Example 4. Let $\Omega$ be a bounded domain with smooth boundary in the Schwarzschild manifold $\mathbb{R}^3 \setminus \{0\}$ with metric
\[
g_{ij} = \left( 1 + \frac{m}{2r} \right)^4 \delta_{ij}
\]
with $m > 0$ and $r = |x|$. Then on $\Omega$, $g$ is a critical point for $F_\phi(\cdot)$ with $\phi = \left( 1 - \frac{m}{2r} \right) / \left( 1 + \frac{m}{2r} \right)$.

Example 5. Complete conformally flat Riemannian manifolds with static metrics have been classified by Kobayashi in [13, Th. 3.1]. In addition to the manifolds in the previous examples, there are other kinds of static metrics with $N$ being explicitly constructed. Domains in these manifolds will be critical points of $F_\phi(\cdot)$, where $\phi$ is the restriction of $N$ to the boundary. See [13] for more details.
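Example 4 admits a quick partial numerical check. Taking the trace of (2.2) on a scalar-flat 3-manifold gives $\Delta_g N = 0$, and for a radial function in the conformally flat metric $g = u^4 \delta$ this is equivalent to $r^2 u^2\, dN/dr$ being constant. The sketch below tests only this necessary condition (not the full static equation), using a central finite difference. The function names, the sample radii and the step size are illustrative assumptions.

```python
def static_potential(r, m):
    """N = (1 - m/(2r)) / (1 + m/(2r)) on the Schwarzschild manifold of mass m."""
    return (1.0 - m / (2.0 * r)) / (1.0 + m / (2.0 * r))


def radial_flux(r, m, eps=1e-5):
    """r^2 * u^2 * dN/dr with u = 1 + m/(2r); constant in r exactly when N is g-harmonic.

    For g = u^4 * delta in dimension 3, the Laplacian of a radial function is
    u^{-6} * r^{-2} * d/dr (r^2 * u^2 * dN/dr).
    """
    u = 1.0 + m / (2.0 * r)
    dn_dr = (static_potential(r + eps, m) - static_potential(r - eps, m)) / (2.0 * eps)
    return r * r * u * u * dn_dr


if __name__ == "__main__":
    for radius in (1.0, 2.5, 10.0):
        print(radius, radial_flux(radius, m=1.0))
```

A direct computation gives $r^2 u^2\, dN/dr = m$ for all $r$, so the printed values should all be close to the mass parameter.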
3. Derivative of the Brown-York Mass

In this section, we give a derivative formula that describes how the Brown-York mass of a surface changes if the surface is evolving in an ambient Riemannian manifold. Our main result is:

Theorem 3.1. Let $\mathbb{S}^2$ be the 2-dimensional sphere. Let $(M, g)$ be a 3-dimensional Riemannian manifold. Let $I$ be an open interval in $\mathbb{R}^1$. Suppose $F : \mathbb{S}^2 \times I \to M$ is a smooth map such that, for $t \in I$,
(i) $\Sigma_t = F(\mathbb{S}^2, t)$ is an embedded surface in $M$ and $\Sigma_t$ has positive Gaussian curvature.
(ii) The velocity vector $\frac{\partial F}{\partial t}$ is always perpendicular to $\Sigma_t$, i.e.
\[
\frac{\partial F}{\partial t} = \eta \nu,
\]
where $\nu$ is a given unit vector field normal to $\Sigma_t$ and $\eta = \langle \frac{\partial F}{\partial t}, \nu \rangle$ denotes the speed of $\Sigma_t$ with respect to $\nu$.
Consider $m_{BY}(\Sigma_t)$, the Brown-York mass of $\Sigma_t$ in $(M, g)$, defined by
\[
m_{BY}(\Sigma_t) = \frac{1}{8\pi} \int_{\Sigma_t} (H_0 - H)\, d\sigma_t, \tag{3.1}
\]
where $H_0$ is the mean curvature of $\Sigma_t$ with respect to the outward normal when isometrically embedded in $\mathbb{R}^3$, $H$ is the mean curvature of $\Sigma_t$ with respect to $\nu$ in $(M, g)$, and $d\sigma_t$ is the volume form of the induced metric on $\Sigma_t$. We have
\[
\frac{d}{dt} m_{BY}(\Sigma_t) = \frac{1}{16\pi} \int_{\Sigma_t} \bigl( |A_0 - A|^2 - |H_0 - H|^2 + R \bigr) \eta\, d\sigma_t, \tag{3.2}
\]
where $A_0$ is the second fundamental form of $\Sigma_t$ with respect to the outward normal when isometrically embedded in $\mathbb{R}^3$, $A$ is the second fundamental form of $\Sigma_t$ with respect to $\nu$ in $(M, g)$, and $R$ is the scalar curvature of $(M, g)$.

Our proof of Theorem 3.1 makes use of a recent formula of Wang and Yau (Prop. 6.1 in [25]):

Proposition 3.1. Let $\Sigma$ be an orientable closed embedded hypersurface in $\mathbb{R}^{n+1}$. Let $\{\Sigma_t\}_{|t|<\delta}$ be a smooth variation of $\Sigma$ in $\mathbb{R}^{n+1}$. Then
\[
\frac{d}{dt} \int_{\Sigma_t} H_0\, d\sigma_t \Big|_{t=0} = \frac{1}{2} \int_\Sigma \bigl( H_0\, \mathrm{tr}\, h - \langle A_0, h \rangle \bigr)\, d\sigma, \tag{3.3}
\]
where $H_0$ and $A_0$ are the mean curvature and the second fundamental form of $\Sigma$ with respect to the outward normal in $\mathbb{R}^{n+1}$, $h$ is the variation of the induced metric $\sigma$ on $\Sigma$, $\mathrm{tr}\, h = \langle \sigma, h \rangle$ denotes the trace of $h$ with respect to $\sigma$, and $d\sigma_t$, $d\sigma$ denote the volume forms on $\Sigma_t$, $\Sigma$.
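Formula (3.2) can be sanity-checked on coordinate spheres in the Schwarzschild metric $g = u^4 \delta$, $u = 1 + \frac{m}{2r}$, from Example 4. The sketch below is an illustration under stated assumptions, not part of the paper's proof. It takes the flow $F(x, r) = r x$, whose velocity $\partial_r$ is normal to the spheres with speed $\eta = u^2$, uses the conformal-change formula $H = u^{-2}(2/r + 4u'/u)$ for the mean curvature in $g$, and uses $R = 0$ together with $|A_0 - A|^2 = \frac{1}{2}(H_0 - H)^2$, which holds because the spheres are umbilic in both metrics. All function names are illustrative.

```python
import math


def sphere_data(r, m):
    """Geometric data of the coordinate sphere |x| = r in g = u^4 * delta, u = 1 + m/(2r)."""
    u = 1.0 + m / (2.0 * r)
    du_dr = -m / (2.0 * r * r)
    rho = r * u * u                              # area radius of the sphere in g
    h0 = 2.0 / rho                               # mean curvature of a round sphere of radius rho in R^3
    h = (2.0 / r + 4.0 * du_dr / u) / (u * u)    # mean curvature in g (conformal change, assumed formula)
    area = 4.0 * math.pi * rho * rho
    return u, h0, h, area


def brown_york_mass(r, m):
    """Left-hand quantity: m_BY of the coordinate sphere, as in (3.1)."""
    _, h0, h, area = sphere_data(r, m)
    return (h0 - h) * area / (8.0 * math.pi)


def rhs_of_formula_3_2(r, m):
    """Right-hand side of (3.2) for the flow by coordinate radius."""
    u, h0, h, area = sphere_data(r, m)
    eta = u * u                                  # g-length of the velocity d/dr
    scalar_curvature = 0.0                       # Schwarzschild is scalar flat
    integrand = 0.5 * (h0 - h) ** 2 - (h0 - h) ** 2 + scalar_curvature
    return integrand * eta * area / (16.0 * math.pi)


def finite_difference(r, m, eps=1e-5):
    """Central difference approximation of d/dr m_BY."""
    return (brown_york_mass(r + eps, m) - brown_york_mass(r - eps, m)) / (2.0 * eps)


if __name__ == "__main__":
    print(brown_york_mass(3.0, 1.0), finite_difference(3.0, 1.0), rhs_of_formula_3_2(3.0, 1.0))
```

In this example one finds in closed form $m_{BY} = m\left(1 + \frac{m}{2r}\right)$, which decreases to $m$ as $r \to \infty$, with derivative $-\frac{m^2}{2r^2}$; the finite difference and the right-hand side of (3.2) should both reproduce this value.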
In order to apply Proposition 3.1, we will need to show that, on a closed convex surface $\Sigma$ in $\mathbb{R}^3$, an abstract metric variation on $\Sigma$ indeed arises from a surface variation $\{\Sigma_t\}$ of $\Sigma$ in $\mathbb{R}^3$. Precisely, we have:

Proposition 3.2. Given an integer $k \ge 6$ and a number $0 < \alpha < 1$, let $\{\sigma(t)\}_{|t| < 1}$ be a path of $C^{k,\alpha}$ metrics on $S^2$ such that $\{\sigma(t)\}$ is differentiable at $t = 0$ in the space of $C^{k,\alpha}$ metrics. Suppose $\sigma(0)$ has positive Gaussian curvature. Then there exist a small number $\delta > 0$ and a path of $C^{k,\alpha}$ embeddings $\{f(t)\}_{|t| < \delta}$ of $S^2$ in $\mathbb{R}^3$ such that $f(t)$ is an isometric embedding of $(S^2, \sigma(t))$ for $|t| < \delta$ and $\{f(t)\}$ is differentiable at $t = 0$ in the space of $C^{2,\alpha}$ embeddings.

The proposition above follows from the arguments by Nirenberg in [18]. For completeness, we include its proof here.

Proof. Given $\{\sigma(t)\}_{|t| < 1}$, a path of $C^{k,\alpha}$ metrics on $S^2$, let $h = \sigma'(0)$. Then $h$ is a $C^{k,\alpha}$ symmetric (0,2) tensor. Since $\sigma(0)$ has positive Gaussian curvature, by the result in [18], there exists a $C^{k,\alpha}$ isometric embedding of $(S^2, \sigma(0))$ in $\mathbb{R}^3$, which we denote by $X$. Given such an $X$, let $Y : S^2 \to \mathbb{R}^3$ be a $C^{2,\alpha}$ solution to the linear equation
$$ 2\, dX \cdot dY = h, \tag{3.4} $$
where "$\cdot$" denotes the Euclidean dot product in $\mathbb{R}^3$ and (3.4) is understood as $dX(e_1) \cdot dY(e_2) + dX(e_2) \cdot dY(e_1) = h(e_1, e_2)$ for any tangent vectors $e_1, e_2$ to $S^2$. The existence of such a $Y$ is provided by Theorem 2' in [18]. Let $d\bar\sigma^2 = h$ and let $\phi, p_1, p_2$ be given as in (6.5), (6.6) in [18]; then $\phi$ satisfies (6.15) in [18]. Using the fact that $X$ is in $C^{k,\alpha}$ and $d\bar\sigma^2$ is in $C^{k,\alpha}$, we check that the coefficients of (6.15) in [18] (when written in a non-divergence form) are in $C^{k-3,\alpha}$. Thus, it follows from (6.15) in [18] that $\phi \in C^{k-1,\alpha}$, from which we conclude $Y \in C^{k-1,\alpha}$ by (6.11)-(6.13) in [18]. Now consider the $C^{k-1,\alpha}$ path of embeddings $\{G(t)\}_{|t| < t_0}$, where
$$ G(t) = X + tY \tag{3.5} $$
and $t_0$ is chosen so that $G(t)$ is an embedding. Let $g_e$ be the Euclidean metric on $\mathbb{R}^3$. The pull back metric $\tau(t) = G(t)^*(g_e)$, which is in $C^{k-2,\alpha}$, satisfies
$$ \tau(0) = \sigma(0), \quad \tau'(0) = \sigma'(0), \tag{3.6} $$
which implies
$$ \|\tau(t) - \sigma(t)\|_{C^{2,\alpha}} = O(t^2). \tag{3.7} $$
Apply Lemma 5.3 in the Appendix to $\sigma^0 = \sigma(0) = \tau(0)$; for each $t$ sufficiently small, we can find a $C^{2,\alpha}$ isometric embedding $X(t)$ of $(S^2, \sigma(t))$ in $\mathbb{R}^3$ such that
$$ \|G(t) - X(t)\|_{C^{2,\alpha}} \le C \|\tau(t) - \sigma(t)\|_{C^{2,\alpha}} = O(t^2). \tag{3.8} $$
(By Lemma 1' in [18], $X(t)$ indeed lies in $C^{k,\alpha}$.) It follows from (3.8) that $\{X(t)\}$, when viewed as a path in the space of $C^{2,\alpha}$ embeddings, is differentiable at $t = 0$. Proposition 3.2 is therefore proved.
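The relations in (3.6) can be read off by expanding the pull back metric. The sketch below takes the path in (3.5) to be the linear one, $G(t) = X + tY$; this choice is consistent with (3.4) and (3.6), but it should be treated as an assumption if (3.5) is read differently.

```latex
% X is an isometric embedding of (S^2, \sigma(0)), so dX \cdot dX = \sigma(0);
% Y solves (3.4), so 2 dX \cdot dY = h = \sigma'(0).
\begin{align*}
\tau(t) &= dG(t)\cdot dG(t) \\
        &= dX\cdot dX + 2t\, dX\cdot dY + t^{2}\, dY\cdot dY \\
        &= \sigma(0) + t\,\sigma'(0) + t^{2}\, dY\cdot dY .
\end{align*}
% Hence \tau(0) = \sigma(0) and \tau'(0) = \sigma'(0), which is (3.6).
```

In other words, $\tau(t)$ and $\sigma(t)$ agree to first order at $t = 0$, which is what the estimate (3.7) quantifies.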
Proposition 3.1 and Proposition 3.2 together imply:

Proposition 3.3. Given an integer $k \ge 6$ and a number $0 < \alpha < 1$, suppose $\{\sigma(t)\}_{|t| < 1}$ is a differentiable path in the space of $C^{k,\alpha}$ metrics on $S^2$. Suppose $\sigma(t)$ has positive Gaussian curvature for each $t$. Let $H_0$ be the mean curvature of the isometric embedding of $(S^2, \sigma(t))$ in $\mathbb{R}^3$ with respect to the outward normal. Let $d\sigma_t$ be the volume form of $\sigma(t)$. Then $\int_{S^2} H_0 \, d\sigma_t$ is a differentiable function of $t$, and
$$ \frac{d}{dt} \int_{S^2} H_0 \, d\sigma_t = \frac{1}{2} \int_{S^2} \langle H_0 \sigma(t) - A_0, h \rangle \, d\sigma_t, \tag{3.9} $$
where $A_0$ is the second fundamental form of the isometric embedding of $(S^2, \sigma(t))$ in $\mathbb{R}^3$ with respect to the outward normal, $h = \sigma'(t)$, and "$\langle \cdot, \cdot \rangle$" denotes the metric product with respect to $\sigma(t)$ on the space of symmetric (0, 2) tensors.

Proof. Take any $t_0 \in (-1, 1)$. By Proposition 3.2, there exists a small positive number $\delta$ (depending on $t_0$) and a path of $C^{k,\alpha}$ embeddings $\{f(t)\}_{|t - t_0| < \delta}$ of $S^2$ in $\mathbb{R}^3$, such that $f(t)$ is an isometric embedding of $(S^2, \sigma(t))$ and $\{f(t)\}$ is differentiable at $t = t_0$ in the space of $C^{2,\alpha}$ embeddings. Let $\Sigma_t = f(t)(S^2)$, and let $H_0(t)$ be the mean curvature of $\Sigma_t$ with respect to the outward normal in $\mathbb{R}^3$; by definition we have
$$ \int_{S^2} H_0 \, d\sigma_t = \int_{\Sigma_t} H_0(t) \, d\sigma_t, \quad \forall\, |t - t_0| < \delta. \tag{3.10} $$
Apply the fact that $\{f(t)\}$ is differentiable at $t = t_0$ in the space of $C^{2,\alpha}$ embeddings and note that $H_0$ only involves derivatives of $f(t)$ up to the second order; we conclude that $\int_{\Sigma_t} H_0(t) \, d\sigma_t$ is differentiable at $t_0$. By (3.10), $\int_{S^2} H_0 \, d\sigma_t$ is differentiable at $t_0$ as well. This shows $\int_{S^2} H_0 \, d\sigma_t$ is a differentiable function of $t$. Equation (3.9) then follows directly from (3.10) and Proposition 3.1.
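We are now ready to prove Theorem 3.1 using Proposition 3.3.

Proof of Theorem 3.1. By Proposition 3.3, the function $m_{BY}(\Sigma_t)$ is a differentiable function of $t$. We have
$$ \frac{d}{dt}\, m_{BY}(\Sigma_t) = \frac{1}{8\pi} \frac{d}{dt} \int_{\Sigma_t} H_0 \, d\sigma_t - \frac{1}{8\pi} \frac{d}{dt} \int_{\Sigma_t} H \, d\sigma_t. \tag{3.11} $$
Let $\sigma = \sigma(t)$ be the induced metric on $\Sigma_t$. By (3.9) in Proposition 3.3, we have
$$ \frac{d}{dt} \int_{\Sigma_t} H_0 \, d\sigma_t = \frac{1}{2} \int_{\Sigma_t} \left\langle H_0 \sigma(t) - A_0, \frac{\partial \sigma}{\partial t} \right\rangle d\sigma_t. \tag{3.12} $$
Now, applying the fact that $\{\Sigma_t\}$ evolves in $(M, g)$ according to
$$ \frac{\partial F}{\partial t} = \eta \nu, \tag{3.13} $$
we have
$$ \frac{\partial \sigma}{\partial t} = 2 \eta A \tag{3.14} $$
and
$$ \frac{\partial H}{\partial t} = -\Delta \eta - \left( |A|^2 + \mathrm{Ric}(\nu, \nu) \right) \eta, \tag{3.15} $$
where $\mathrm{Ric}(\nu, \nu)$ is the Ricci curvature of $(M, g)$ along $\nu$. Thus,
$$ \begin{aligned} \frac{d}{dt} \int_{\Sigma_t} H_0 \, d\sigma_t &= \frac{1}{2} \int_{\Sigma_t} \langle H_0 \sigma(t) - A_0, 2 \eta A \rangle \, d\sigma_t \\ &= \int_{\Sigma_t} \left( H_0 H - \langle A_0, A \rangle \right) \eta \, d\sigma_t \end{aligned} \tag{3.16} $$
and
$$ \begin{aligned} \frac{d}{dt} \int_{\Sigma_t} H \, d\sigma_t &= \int_{\Sigma_t} \left( \frac{\partial H}{\partial t} + H^2 \eta \right) d\sigma_t \\ &= \int_{\Sigma_t} \left[ -\left( |A|^2 + \mathrm{Ric}(\nu, \nu) \right) \eta + H^2 \eta \right] d\sigma_t. \end{aligned} \tag{3.17} $$
Hence, it follows from (3.11), (3.16) and (3.17) that
$$ \frac{d}{dt}\, m_{BY}(\Sigma_t) = \frac{1}{8\pi} \int_{\Sigma_t} \left[ H_0 H - \langle A_0, A \rangle + \left( |A|^2 + \mathrm{Ric}(\nu, \nu) \right) - H^2 \right] \eta \, d\sigma_t. \tag{3.18} $$
Applying the Gauss equation to $\Sigma_t$ in $(M, g)$ and to the isometric embedding of $\Sigma_t$ in $\mathbb{R}^3$ respectively, we have
$$ 2K = R - 2\,\mathrm{Ric}(\nu, \nu) + H^2 - |A|^2 \tag{3.19} $$
and
$$ 2K = (H_0)^2 - |A_0|^2, \tag{3.20} $$
where $K$ is the Gaussian curvature of $\Sigma_t$. Hence, (3.18)–(3.20) imply that
$$ \frac{d}{dt}\, m_{BY}(\Sigma_t) = \frac{1}{16\pi} \int_{\Sigma_t} \left( |A_0 - A|^2 - (H_0 - H)^2 + R \right) \eta \, d\sigma_t. \tag{3.21} $$
Therefore, (3.2) is proved.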
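The passage from (3.18) to (3.21) is pure algebra. A short sketch of the omitted step, using only the two Gauss equations (3.19) and (3.20) and the notation of the proof:

```latex
% Eliminate K between (3.19) and (3.20) to solve for Ric(\nu,\nu):
\[
2\,\mathrm{Ric}(\nu,\nu) = R + H^{2} - |A|^{2} - H_0^{2} + |A_0|^{2}.
\]
% Substitute into the integrand of (3.18):
\begin{align*}
& H_0 H - \langle A_0, A\rangle + |A|^{2} + \mathrm{Ric}(\nu,\nu) - H^{2} \\
&\quad = \tfrac12\bigl(|A_0|^{2} - 2\langle A_0, A\rangle + |A|^{2}\bigr)
        - \tfrac12\bigl(H_0^{2} - 2H_0H + H^{2}\bigr) + \tfrac12 R \\
&\quad = \tfrac12\bigl(|A_0 - A|^{2} - (H_0 - H)^{2} + R\bigr).
\end{align*}
```

Multiplying by the prefactor $\frac{1}{8\pi}$ of (3.18) produces the factor $\frac{1}{16\pi}$ in (3.21).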
Next, we want to discuss some applications of Theorem 3.1. The first two applications below put the monotonicity property of the Brown-York mass in the construction in [21] into a more general context.

Corollary 3.1. Let $(M, g)$, $I$, $F$, $\{\Sigma_t\}$, $\eta$, $A$, $A_0$, $H$ and $H_0$ be given as in Theorem 3.1 with $\eta > 0$. Suppose at each point $x \in \Sigma_t$, $t \in I$, $A_0 - A$ is either positive semi-definite or negative semi-definite, and $R \le 0$; then $m_{BY}(\Sigma_t)$ is nonincreasing in $t$. If in addition, $A = \alpha A_0$ for some number $\alpha$ depending on $x \in \Sigma_t$, then $m_{BY}(\Sigma_t)$ is constant in $I$ if and only if $(S^2 \times I, F^*(g))$ is a domain in $\mathbb{R}^3$.
Proof. Let $\lambda_1, \lambda_2$ be the eigenvalues of $A_0 - A$. Suppose $A_0 - A$ is either positive semi-definite or negative semi-definite; then $\lambda_1 \lambda_2 \ge 0$ and hence
$$ |A_0 - A|^2 - |H_0 - H|^2 = -2 \lambda_1 \lambda_2 \le 0. $$
Since $R \le 0$, by Theorem 3.1, we have
$$ \frac{d}{dt}\, m_{BY}(\Sigma_t) \le 0 $$
because $\eta > 0$. This proves the first assertion.

Suppose $(S^2 \times I, F^*(g))$ is a domain in $\mathbb{R}^3$; then by definition we have $m_{BY}(\Sigma_t) = 0$ for all $t$. Hence,
$$ \frac{d}{dt}\, m_{BY}(\Sigma_t) = 0. $$
Conversely, suppose $A = \alpha A_0$ and
$$ \frac{d}{dt}\, m_{BY}(\Sigma_t) = 0. $$
Then $R = 0$ and $A = A_0$. In particular, $H = H_0$. For any $(t_1, t_2) \subset I$, let $\Omega = S^2 \times (t_1, t_2)$ with the pull back metric $F^*(g)$. Let $D$ be the interior of $\Sigma_{t_1} = F(S^2 \times \{t_1\})$ when it is isometrically embedded in $\mathbb{R}^3$ and $E$ be the exterior of $\Sigma_{t_2} = F(S^2 \times \{t_2\})$ when it is isometrically embedded in $\mathbb{R}^3$. By gluing $\Omega$ with $D$ along $S^2 \times \{t_1\}$, which is identified with $\Sigma_{t_1}$ through $F$, and gluing $\Omega$ with $E$ along $S^2 \times \{t_2\}$, which is identified with $\Sigma_{t_2}$ through $F$, we have an asymptotically flat and scalar flat manifold with corners and with zero mass, and it must be flat by [16,21]. Hence, $\Omega$ is flat. Since it is simply connected, $\Omega$ can be isometrically embedded in $\mathbb{R}^3$.
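The pointwise identity used in the first step of the proof holds for every symmetric 2×2 matrix $B$ (here $B$ stands for $A_0 - A$ written in an orthonormal frame, so that $\mathrm{tr}\, B = H_0 - H$): $|B|^2 - (\mathrm{tr}\, B)^2 = -2 \det B = -2\lambda_1\lambda_2$. The following is a small numerical sanity check, not part of the original argument; all function names are illustrative.

```python
import random


def norm_sq(b11, b12, b22):
    """Squared norm |B|^2 of the symmetric matrix [[b11, b12], [b12, b22]]."""
    return b11 * b11 + 2.0 * b12 * b12 + b22 * b22


def trace(b11, b12, b22):
    """Trace of B, i.e. the sum lambda_1 + lambda_2 of its eigenvalues."""
    return b11 + b22


def det(b11, b12, b22):
    """Determinant of B, i.e. the product lambda_1 * lambda_2."""
    return b11 * b22 - b12 * b12


def defect(b11, b12, b22):
    """|B|^2 - (tr B)^2 + 2 det B, which should vanish for every B."""
    return (norm_sq(b11, b12, b22)
            - trace(b11, b12, b22) ** 2
            + 2.0 * det(b11, b12, b22))


# Evaluate the defect on random symmetric matrices.
random.seed(0)
samples = [tuple(random.uniform(-5.0, 5.0) for _ in range(3))
           for _ in range(1000)]
max_defect = max(abs(defect(*s)) for s in samples)
```

When $B$ is semi-definite, $\det B \ge 0$, so the quantity $|B|^2 - (\mathrm{tr}\, B)^2$ is nonpositive, which is the sign used in the proof.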
Corollary 3.2. Let $(M, g)$, $I$, $F$, $\{\Sigma_t\}$, $\eta$, $A$, $A_0$, $H$ and $H_0$ be given as in Theorem 3.1. Let $g_e$ be the Euclidean metric on $\mathbb{R}^3$. Suppose there exists another smooth map $F^0 : S^2 \times I \to \mathbb{R}^3$ such that
(i) $\Sigma^0_t = F^0(S^2, t)$ is an embedded closed convex surface in $\mathbb{R}^3$ and $(F^0_t)^*(g_e) = F_t^*(g)$, where $F^0_t(\cdot) = F^0(\cdot, t)$ and $F_t(\cdot) = F(\cdot, t)$.
(ii) The velocity vector $\frac{\partial F^0}{\partial t}$ is always perpendicular to $\Sigma^0_t$, i.e. $\frac{\partial F^0}{\partial t} = \eta^0 \nu^0$, where $\nu^0$ is the outward unit normal to $\Sigma^0_t$ in $\mathbb{R}^3$ and $\eta^0$ denotes the speed of $\Sigma^0_t$ with respect to $\nu^0$.
Suppose $\eta^0 > 0$, $\eta > 0$ and $(M, g)$ has zero scalar curvature; then the Brown-York mass $m_{BY}(\Sigma_t)$ is monotonically non-increasing, and $m_{BY}(\Sigma_t)$ is a constant if and only if $(S^2 \times I, F^*(g))$ is a domain in $\mathbb{R}^3$.
Proof. Since $\eta^0 > 0$ and $\eta > 0$, we can write $(F^0)^*(g_e)$ and $F^*(g)$ as
$$ (F^0)^*(g_e) = (\eta^0)^2 dt^2 + g_t \quad \text{and} \quad F^*(g) = \eta^2 dt^2 + g_t, \tag{3.22} $$
where $g_t$ denotes the same induced metric on both $\Sigma^0_t$ and $\Sigma_t$. Now it follows from (3.22) that
$$ A = \frac{\eta^0}{\eta} A_0. \tag{3.23} $$
Since $A_0$ is positive definite, the results follow from Corollary 3.1.
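Formula (3.23) can be checked from the first variation of the induced metric, (3.14), applied once in $(M, g)$ and once in $\mathbb{R}^3$. A sketch, under the assumption that (3.14) is used for both foliations with the common induced metric $g_t$:

```latex
\begin{align*}
\frac{\partial g_t}{\partial t} &= 2\,\eta\, A
  && \text{(evolution of $\Sigma_t$ in $(M,g)$)},\\
\frac{\partial g_t}{\partial t} &= 2\,\eta^{0} A_0
  && \text{(evolution of $\Sigma^{0}_t$ in $\mathbb{R}^{3}$)},
\end{align*}
% so that \eta A = \eta^0 A_0, which is (3.23). Consequently
\[
A_0 - A = \Bigl(1 - \frac{\eta^{0}}{\eta}\Bigr) A_0 .
\]
```

Because $A_0$ is positive definite, $A_0 - A$ is a scalar multiple of a definite form at each point and is therefore semi-definite; together with $R = 0$ this is exactly the hypothesis of Corollary 3.1, with $\alpha = \eta^0/\eta$.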
Remark 3.1. We note that
(i) Quasi-spherical metrics constructed in [21] satisfy all the assumptions of Corollary 3.2.
(ii) In case $\eta^0 = 1$, one recovers the monotonicity formula in [21].

By applying the co-area formula directly to (3.2), we also obtain

Corollary 3.3. Let $(M, g)$, $F$, $\{\Sigma_t\}$, $\eta$, $A$, $A_0$, $H$ and $H_0$ be given as in Theorem 3.1. Suppose $\eta > 0$. For any $t_1 < t_2$, let $\Omega_{[t_1, t_2]}$ be the region bounded by $\Sigma_{t_1}$ and $\Sigma_{t_2}$. Then
$$ m_{BY}(\Sigma_{t_2}) - m_{BY}(\Sigma_{t_1}) = \frac{1}{16\pi} \left( \int_{\Omega_{[t_1, t_2]}} R \, dV + \int_{\Omega_{[t_1, t_2]}} \Phi \, dV \right), \tag{3.24} $$
where $R$ is the scalar curvature of $(M, g)$, $dV$ is the volume form of $g$ on $M$, and $\Phi$ is the function on $\Omega_{[t_1, t_2]}$, depending on $\{\Sigma_t\}$, defined by
$$ \Phi(x) = |A_0 - A|^2 - (H_0 - H)^2, \quad x \in \Sigma_t. \tag{3.25} $$
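A sketch of how (3.24) follows from (3.2), writing $\Omega_{[t_1,t_2]}$ for the region between $\Sigma_{t_1}$ and $\Sigma_{t_2}$ and $\Phi$ for the function in (3.25). Since the leaves move with normal speed $\eta > 0$, the volume form of $g$ splits as $dV = \eta \, d\sigma_t \, dt$, and integrating (3.2) in $t$ gives:

```latex
\begin{align*}
m_{BY}(\Sigma_{t_2}) - m_{BY}(\Sigma_{t_1})
 &= \int_{t_1}^{t_2} \frac{d}{dt}\, m_{BY}(\Sigma_t)\, dt \\
 &= \frac{1}{16\pi}\int_{t_1}^{t_2}\!\int_{\Sigma_t}
    \bigl(|A_0-A|^{2}-(H_0-H)^{2}+R\bigr)\,\eta\, d\sigma_t\, dt \\
 &= \frac{1}{16\pi}\int_{\Omega_{[t_1,t_2]}} (\Phi + R)\, dV .
\end{align*}
```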
The function $\Phi(x)$ defined above clearly depends on the foliation $\{\Sigma_t\}$ connecting $\Sigma_{t_1}$ to $\Sigma_{t_2}$. However, it is interesting to note that the integral $\int_{\Omega_{[t_1, t_2]}} \Phi \, dV$ turns out to be independent of $\{\Sigma_t\}$ by (3.24). We can apply formula (3.24) to small geodesic balls in a general 3-manifold and to asymptotically flat regions in an asymptotically flat 3-manifold.

Corollary 3.4. Let $(M, g)$ be a 3-dimensional Riemannian manifold. Let $p \in M$ and $B_\delta(p)$ be a geodesic ball centered at $p$ with geodesic radius $\delta$. Suppose $\delta$ is small enough such that
(1) $\delta < i_p(M)$, where $i_p(M)$ is the injectivity radius of $(M, g)$ at $p$.
(2) For any $0 < r \le \delta$, the geodesic sphere $S_r(p)$, centered at $p$ with geodesic radius $r$, has positive Gaussian curvature.
Then the Brown-York mass of $S_\delta(p)$ can be written as
$$ m_{BY}(S_\delta(p)) = \frac{1}{16\pi} \left( \int_{B_\delta(p)} R \, dV + \int_{B_\delta(p) \setminus \{p\}} \Phi \, dV \right), \tag{3.26} $$
where $R$ is the scalar curvature of $M$, $dV$ is the volume form on $M$, and $\Phi$ is the function on $B_\delta(p) \setminus \{p\}$ defined by
$$ \Phi(x) = |A_0 - A|^2 - (H_0 - H)^2, \quad x \in S_r. \tag{3.27} $$
Here $A$, $H$ are the second fundamental form and the mean curvature of $S_r$ in $M$ with respect to the outward normal; and $A_0$, $H_0$ are the second fundamental form and the mean curvature of the isometric embedding of $S_r$ in $\mathbb{R}^3$ with respect to the outward normal.
Proof. Let $(r, \omega)$ be the geodesic polar coordinate of $x \in B_\delta(p) \setminus \{p\}$, where $r$ denotes the distance from $x$ to $p$. Since $\frac{\partial}{\partial r} \perp S_r$, we can choose the foliation $\{\Sigma_t\}$ in Corollary 3.1 to be $\{S_r\}$ with $t = r$. By (3.24), we have
$$ m_{BY}(S_\delta(p)) - m_{BY}(S_r(p)) = \frac{1}{16\pi} \int_{B_\delta(p) \setminus B_r(p)} (R + \Phi) \, dV. \tag{3.28} $$
By [9], we have
$$ \lim_{r \to 0^+} m_{BY}(S_r(p)) = 0. \tag{3.29} $$
Hence, (3.26) follows from (3.28) and (3.29).
Next, we express the ADM mass [1] as the sum of the Brown-York mass of a coordinate sphere and an integral involving the scalar curvature and the function $\Phi(x)$.

Corollary 3.5. Let $(M, g)$ be an asymptotically flat 3-manifold with a given end. Let $\{x^i \mid i = 1, 2, 3\}$ be a coordinate system at $\infty$ defining the asymptotic structure of $(M, g)$. Let $S_r = \{x \in M \mid |x| = r\}$ be the coordinate sphere, where $|x|$ denotes the coordinate length. Suppose $r_0 \gg 1$ is a constant such that $S_r$ has positive Gaussian curvature for each $r \ge r_0$. Then
$$ m_{ADM} = m_{BY}(S_{r_0}) + \frac{1}{16\pi} \int_{M \setminus D_{r_0}} R \, dV + \frac{1}{16\pi} \int_{M \setminus D_{r_0}} \Phi \, dV, \tag{3.30} $$
where $m_{ADM}$ is the ADM mass of $(M, g)$, $R$ is the scalar curvature of $(M, g)$, $D_{r_0}$ is the bounded open set in $M$ enclosed by $S_{r_0}$, and $\Phi$ is the function on $M \setminus D_{r_0}$ defined by
$$ \Phi(x) = |A_0 - A|^2 - (H_0 - H)^2, \quad x \in S_r. \tag{3.31} $$
Here $A$, $H$ are the second fundamental form and the mean curvature of $S_r$ in $M$; and $A_0$, $H_0$ are the second fundamental form and the mean curvature of $S_r$ when isometrically embedded in $\mathbb{R}^3$.

Proof. $\{S_r\}_{r \ge r_0}$ consists of level sets of the function $r$ on $M \setminus D_{r_0}$, hence can be reparameterized to evolve in a way that its velocity vector is perpendicular to the surface at each time. To be precise, we can define the vector field $X = \frac{\nabla r}{|\nabla r|^2}$ on $M \setminus D_{r_0}$ and let $\gamma_p(t)$ be the integral curve of $X$ starting at $p \in S_{r_0}$. For any $t \ge 0$, let $\Sigma_t = \{\gamma_p(t) \mid p \in S_{r_0}\}$; then $\Sigma_t = S_{r_0 + t}$. For any $T > 0$, applying (3.24) to $\{\Sigma_t\}_{0 \le t \le T}$, we have
$$ m_{BY}(S_{r_0 + T}) - m_{BY}(S_{r_0}) = \frac{1}{16\pi} \left( \int_{\Omega_{[0, T]}} R \, dV + \int_{\Omega_{[0, T]}} \Phi \, dV \right), \tag{3.32} $$
where $\Omega_{[0, T]}$ is the region in $M$ bounded by $\Sigma_0 = S_{r_0}$ and $\Sigma_T = S_{r_0 + T}$. Letting $T \to +\infty$, by [9] we have
$$ \lim_{T \to +\infty} m_{BY}(S_T) = m_{ADM}. \tag{3.33} $$
Hence, (3.30) follows from (3.32) and (3.33).
4. Liu-Yau Mass of Spacelike Two-Surfaces in $\mathbb{R}^{3,1}$

Let $\Sigma$ be a closed, connected, 2-dimensional spacelike surface in a spacetime $N$. Suppose $\Sigma$ has positive Gaussian curvature and has spacelike mean curvature vector $\vec{H}$ in $N$. Let $H_0$ be the mean curvature of $\Sigma$ with respect to the outward unit normal when it is isometrically embedded in $\mathbb{R}^3$. The Liu-Yau mass of $\Sigma$ is then defined as (see [14,15]):
$$ m_{LY}(\Sigma) = \frac{1}{8\pi} \int_{\Sigma} (H_0 - |\vec{H}|) \, d\sigma, $$
where $|\vec{H}|$ is the Lorentzian norm of $\vec{H}$ in $N$ and $d\sigma$ is the volume form of the induced metric on $\Sigma$. In [15], the following positivity result was proved: Let $\Omega$ be a compact, spacelike hypersurface in a spacetime $N$ satisfying the dominant energy conditions. Suppose the boundary $\partial\Omega$ has finitely many components $\Sigma_i$, $1 \le i \le l$, each of which has positive Gaussian curvature and has spacelike mean curvature vector in $N$. Then $m_{LY}(\Sigma_i) \ge 0$ for all $i$; moreover if $m_{LY}(\Sigma_i) = 0$ for some $i$, then $\partial\Omega$ is connected and $N$ is a flat spacetime along $\Omega$.

We note that in the proof of the above result in [15], it was assumed implicitly that the mean curvature of $\partial\Omega$ in $\Omega$ with respect to the outward unit normal is positive. See the statement [24, Th. 1.1]. Such a condition is necessary, as can be seen by the following example in the time symmetric case. Let $g_e$ be the Euclidean metric on $\mathbb{R}^3$ and let $m > 0$ be a constant. Consider the Schwarzschild metric (with negative mass)
$$ g = \left( 1 - \frac{m}{2|x|} \right)^4 g_e, $$
defined on $\{0 < |x| < \frac{m}{2}\}$. Given any $0 < r_1 < r_2 < \frac{m}{2}$, consider the domain
$$ \Omega = \{ r_1 < |x| < r_2 \}. $$
For any constant $r$, the mean curvature $H$ of the sphere $S_r = \{|x| = r\}$ with respect to the unit normal in the direction of $\partial/\partial r$ is
$$ H = \frac{1}{\left(1 - \frac{m}{2r}\right)^2} \left[ \frac{2}{r} + \frac{4}{1 - \frac{m}{2r}} \cdot \frac{m}{2r^2} \right]. $$
The mean curvature of $S_r$ when it is embedded in $\mathbb{R}^3$ is
$$ H_0 = \frac{1}{\left(1 - \frac{m}{2r}\right)^2} \cdot \frac{2}{r}. $$
Suppose $r < \frac{m}{2}$; then $H < 0$ and
$$ |H| - H_0 = -\frac{4}{r \left(1 - \frac{m}{2r}\right)^3} > 0. $$
In particular, this shows the mean curvature of $S_{r_2}$ in $\Omega$ with respect to the outward unit normal is negative and $m_{LY}(S_{r_2}) < 0$.
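The formulas in this example can be checked numerically. The sketch below is not from the original text: it assumes the standard transformation law for mean curvature under a conformal change $g = u^4 g_e$ in dimension 3, namely $H = u^{-2}\left(H_e + 4u^{-1}\partial_r u\right)$ with $H_e = 2/r$ and $u = 1 - \frac{m}{2r}$, and the function names are illustrative.

```python
def conformal_factor(r, m):
    """u(r) = 1 - m/(2r); negative for 0 < r < m/2."""
    return 1.0 - m / (2.0 * r)


def mean_curvature(r, m):
    """Mean curvature H of S_r in g = u^4 g_e, normal in the d/dr direction."""
    u = conformal_factor(r, m)
    du_dr = m / (2.0 * r * r)  # derivative of u with respect to r
    return (2.0 / r + 4.0 * du_dr / u) / (u * u)


def euclidean_mean_curvature(r, m):
    """H_0: S_r with the induced metric is a round sphere of radius u^2 r."""
    u = conformal_factor(r, m)
    return 2.0 / (u * u * r)


def gap_closed_form(r, m):
    """Closed form -4 / (r u^3) for |H| - H_0 when r < m/2."""
    u = conformal_factor(r, m)
    return -4.0 / (r * u ** 3)


# Sample radii inside (0, m/2) for m = 1.
m = 1.0
radii = [0.05, 0.1, 0.25, 0.4, 0.49]
results = [(r, mean_curvature(r, m), euclidean_mean_curvature(r, m))
           for r in radii]
```

For each sampled radius one can compare `abs(H) - H0` with `gap_closed_form(r, m)`; the integrand $H_0 - |H|$ of the Liu-Yau mass is then negative pointwise on $S_r$.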
Remark 4.1. In the above example, $\partial\Omega$ is not connected since $\partial\Omega = S_{r_1} \cup S_{r_2}$. To get an example with connected boundary, we can simply glue $\Omega$ with a Euclidean disk $B$ with radius $r_1$ along their common boundary $S_{r_1}$. Let $\tilde\Omega$ be the resulting compact 3-manifold whose boundary $\partial\tilde\Omega = S_{r_2}$ is connected. On $\tilde\Omega$, the metric has a "corner" in the sense of [16]. By applying the approximation procedure in [16], we can indeed have a smooth metric $\tilde g$ on $\tilde\Omega$ such that $\tilde g$ has nonnegative scalar curvature, the mean curvature of $\partial\tilde\Omega$ with respect to the outward unit normal is negative and $m_{LY}(\partial\tilde\Omega) < 0$. We leave the details to the interested reader.

In [20], Ó Murchadha, Szabados and Tod gave some examples of a spacelike 2-surface, lying on the light cone of the Minkowski space $\mathbb{R}^{3,1}$, whose Liu-Yau mass is strictly positive. Motivated by their result, we want to understand the Liu-Yau mass of more general spacelike 2-surfaces in $\mathbb{R}^{3,1}$. In the sequel, we always regard $\mathbb{R}^3$ as the $t = 0$ slice in $\mathbb{R}^{3,1}$. We have the following:

Theorem 4.1. Let $\Sigma$ be a closed, connected, smooth, spacelike 2-surface in $\mathbb{R}^{3,1}$. Suppose $\Sigma$ spans a compact spacelike hypersurface in $\mathbb{R}^{3,1}$. If $\Sigma$ has positive Gaussian curvature and has spacelike mean curvature vector, then $m_{LY}(\Sigma) \ge 0$; moreover $m_{LY}(\Sigma) = 0$ if and only if $\Sigma$ lies on a hyperplane in $\mathbb{R}^{3,1}$.

In order to prove this theorem, we need the following result which can be proved by the method of Bartnik and Simon [4] and by an idea from Bartnik [2]. In fact, it is just a special case of the results by Bartnik [3].

Lemma 4.1. Let $\Sigma$ be a closed, connected, smooth, spacelike 2-surface in $\mathbb{R}^{3,1}$. Suppose $\Sigma$ spans a compact spacelike hypersurface in $\mathbb{R}^{3,1}$. Then $\Sigma$ spans a compact, smoothly immersed, maximal spacelike hypersurface in $\mathbb{R}^{3,1}$.

Proof. Let $M$ be a compact spacelike hypersurface in $\mathbb{R}^{3,1}$ spanned by $\Sigma$. By extending $M$ a bit, we may assume that there exists a spacelike hypersurface $\tilde M$ in $\mathbb{R}^{3,1}$ such that $M \subset \tilde M$. Since $\tilde M$ is spacelike, $\tilde M$ is locally a graph over an open set in $\mathbb{R}^3$. Hence, the projection map $\pi : \tilde M \to \mathbb{R}^3$, given by $\pi(x, t) = x$, is a local diffeomorphism. Now consider the map
$$ F : \tilde M \times \mathbb{R}^1 \to \mathbb{R}^{3,1}, \tag{4.1} $$
given by $F(p, s) = (x, s)$ for any $p = (x, t) \in \tilde M$; then $F$ is a local diffeomorphism as well. Let $N = \tilde M \times \mathbb{R}^1$ equipped with the pull back metric. Let $v$ be the time function on $\tilde M$ in $\mathbb{R}^{3,1}$, i.e. $v(x, t) = t$. Since $v$ is a smooth function on $\tilde M$, we can consider its graph in $N$. Let $\hat\Sigma$ and $\hat G$ be the graph of $v$ over $\Sigma$ and $M$ in $N$ respectively. Then $\hat G$ is a compact, spacelike hypersurface in $N$ whose boundary is $\hat\Sigma$. Moreover, $F|_{\hat G} : \hat G \to M \subset \mathbb{R}^{3,1}$ and $F|_{\hat\Sigma} : \hat\Sigma \to \Sigma \subset \mathbb{R}^{3,1}$ are both isometries.

Now one can carry over the arguments in Sect. 3 in [4] to prove that there is a smooth solution (defined on $M$) to the maximal surface equation in $N$ such that, if $G$ is its graph in $N$, then $\partial G = \hat\Sigma$. For example, Lemma 3.3 in [4] can be rephrased as: For $\theta > 0$, let $\mathcal{D} = \{\phi \in C^{0,1}(M) \mid |D\phi| \le (1 - \theta)\}$ and
$$ \mathcal{F} = \left\{ u \in C^2(M) \mid |Du| < 1 \text{ with maximal graph and } u = \phi \text{ on } \Sigma \text{ for some } \phi \in \mathcal{D} \right\}. \tag{4.2} $$
Then there exist $r_0 > 0$ and $\theta_1 > 0$ such that for all $u \in \mathcal{F}$ and for all $p, q \in M$ with $d(p, \Sigma), d(q, \Sigma) < \frac{1}{3} r_0$ (say) and $d(p, q) = r_0$ we have: $|u(p) - u(q)| \le (1 - \theta_1) r_0$. One then readily checks that $F(G)$ is a compact, smoothly immersed, maximal hypersurface in $\mathbb{R}^{3,1}$ spanned by $\Sigma = F(\hat\Sigma)$.
To prove Theorem 4.1, we also need a technical lemma concerning the boundary mean curvature of a compact spacelike hypersurface in $\mathbb{R}^{3,1}$, whose boundary has spacelike mean curvature vector.

Lemma 4.2. Let $M$ be a compact 3-manifold with boundary $\partial M$. Let $F : M \to \mathbb{R}^{3,1}$ be a smooth spacelike immersion such that $F|_{\partial M} : \partial M \to \mathbb{R}^{3,1}$ has spacelike mean curvature vector. Let $g$ be the pull back metric on $M$ and $k$ be the mean curvature of $\partial M$ in $(M, g)$ with respect to the outward unit normal. Then $k$ cannot be negative everywhere on $\partial M$.

Proof. Suppose $k < 0$ everywhere on $\partial M$. Since $F(M)$ is a compact subset in $\mathbb{R}^{3,1}$, without loss of generality, we may assume that $F(M) \subset \{x_1 \le 0\}$ and $F(M) \cap \{x_1 = 0\} \ne \emptyset$. Let $X_0 = F(q) \in F(M) \cap \{x_1 = 0\}$ for some $q \in M$. If $q$ is an interior point of $M$, then there exists an open neighborhood $V$ of $q$ in the interior of $M$ such that the tangent space of $F(V)$ at $X_0$ is $\{x_1 = 0\}$. This is impossible, because $F(V)$ needs to be spacelike. Therefore, $q \in \partial M$. Using the fact that $F$ is a spacelike immersion again, we know there exists an open neighborhood $U$ of $q$ in $M$ such that $F(U)$ is a spacelike graph of some function $f$ over $D$ for some open set $D \subset \mathbb{R}^3 \cap \{x_1 \le 0\}$. Let $B = F(U \cap \partial M)$ and let $\hat B$ be the part of $\partial D$ such that $B$ is the graph of $f$ over $\hat B$. We note that $X_0 \in B$. Without loss of generality, we may assume that $X_0$ is the origin.

To proceed, we let $T = \frac{\partial}{\partial t}$ and define the following notations:
$n$: the future timelike unit normal to $F(U)$ in $\mathbb{R}^{3,1}$;
$\nu$: the unit outward normal to $B$ in $F(U)$;
$\hat\nu$: the unit outward normal to $\hat B$ in $D$.
We parallel translate $\nu$, $\hat\nu$ and all the tangent vectors of $B$, $\hat B$ along the $T$ direction. Also, we consider $f$ as a function on $D \times (-\infty, \infty)$ so that $f$ is independent of $t$. Now $\hat\nu$ is normal to $B$, so $\hat\nu = u\nu + vn$ for some numbers $u, v$ satisfying $u^2 - v^2 = 1$. At $X_0 \in B$, we have $\hat\nu = \frac{\partial}{\partial x_1}$. Suppose $\alpha(s) = (x_1(s), x_2(s), x_3(s), t(s))$ is a curve in $F(U)$ such that $\alpha(0) = X_0$ and $\alpha'(0) = \nu$. Then, for $s < 0$ small, $\alpha(s) \in F(U)$ and so $x_1(s) < 0$. Since $x_1(0) = 0$, we have $x_1'(0) \ge 0$, hence $u = \langle \hat\nu, \nu \rangle \ge 0$. Since $u^2 = 1 + v^2$, we have $u > |v|$ at $X_0$. Let $\vec{H}$ be the mean curvature vector of $B$ in $\mathbb{R}^{3,1}$. Let $p = p_{ij}$ be the second fundamental form of $F(U)$ in $\mathbb{R}^{3,1}$ with respect to $n$. Then
$$ \vec{H} = -k\nu + (\mathrm{tr}_B\, p)\, n, \tag{4.3} $$
where $\mathrm{tr}_B\, p$ denotes the trace of $p$ restricted to $B$. Hence,
$$ -\langle \vec{H}, \hat\nu \rangle = uk + v\,(\mathrm{tr}_B\, p). \tag{4.4} $$
At $X_0$, we have shown $u > |v|$. On the other hand, we know $|k| > |\mathrm{tr}_B\, p|$ (because $\vec{H}$ is spacelike) and $k < 0$ (by the assumption); therefore we have $-\langle \vec{H}, \hat\nu \rangle < 0$ at $X_0$. Recall that
$$ \langle \vec{H}, \hat\nu \rangle = \sum_{i=1}^{2} \langle \nabla_{e_i} e_i, \hat\nu \rangle, \tag{4.5} $$
where $\{e_1, e_2\}$ is an orthonormal frame in $T_{X_0} B$ and $\nabla$ is the covariant derivative in $\mathbb{R}^{3,1}$. Hence there exists a unit vector $e \in T_{X_0} B$ such that
$$ -\langle \nabla_e e, \hat\nu \rangle < 0. \tag{4.6} $$
Suppose $e$ is the tangent of a curve $\gamma(s) \subset B$ at $s = 0$. Let $\hat\gamma(s) \subset \hat B$ be the projection of $\gamma(s)$ in $\mathbb{R}^3$. Then
$$ \gamma'(s) = \hat\gamma'(s) + \frac{d}{ds} f(\hat\gamma(s))\, T. \tag{4.7} $$
Hence,
$$ -\langle \nabla_{\gamma'(s)} \gamma'(s), \hat\nu \rangle = -\langle \nabla_{\gamma'(s)} \hat\gamma'(s), \hat\nu \rangle = -\langle \nabla_{\hat\gamma'(s)} \hat\gamma'(s), \hat\nu \rangle, \tag{4.8} $$
where we have used the facts that $T$ is parallel, $T \perp \hat\nu$ and $\hat\gamma'(s)$ is parallel translated along $T$. Thus, it follows from (4.6), (4.8) and the fact $e = \gamma'(0)$ that
$$ -\langle \nabla_{\hat\gamma'(0)} \hat\gamma'(0), \hat\nu \rangle < 0. \tag{4.9} $$
But this is impossible because $\hat\gamma(s) \subset \hat B \subset \{x_1 \le 0\} \cap \mathbb{R}^3$ and $\hat\nu = \frac{\partial}{\partial x_1}$ at $\hat\gamma(0)$. Therefore, we have proved that $k$ cannot be negative everywhere on $\partial M$.

Proof of Theorem 4.1. By Lemma 4.1, we know that $\Sigma$ indeed bounds a compact, smoothly immersed, maximal spacelike hypersurface in $\mathbb{R}^{3,1}$. Precisely, this means that there exists a compact 3-manifold $M$ with boundary $\partial M$ and a smooth, maximal spacelike immersion $F : M \to \mathbb{R}^{3,1}$ such that $F : \partial M \to \Sigma$ is a diffeomorphism. Let $g = g_{ij}$ be the pull back metric on $M$. Let $p = p_{ij}$ be the second fundamental form of the immersion $F : M \to \mathbb{R}^{3,1}$. Let $R$ be the scalar curvature of $(M, g)$. Since $F$ is a maximal immersion, it follows from the constraint equations (or simply the Gauss equation) that
$$ R = |p|^2 \ge 0, \tag{4.10} $$
where "$|\cdot|$" is taken with respect to $g$. On the other hand, let $k$ be the mean curvature of $\partial M$ in $(M, g)$ with respect to the outward unit normal and let $\vec{H}$ be the mean curvature vector of $\Sigma = F(\partial M)$ in $\mathbb{R}^{3,1}$; it is known that
$$ |\vec{H}|^2 = k^2 - (\mathrm{tr}_\Sigma\, p)^2, \tag{4.11} $$
where $\mathrm{tr}_\Sigma\, p$ is the trace of $p$ restricted to $\Sigma$. Since $\vec{H}$ is spacelike, (4.11) implies that either $k > 0$ or $k < 0$ on $\partial M$ because $\partial M$ is connected. By Lemma 4.2, we have $k > 0$ on $\partial M$. Now let $k_0$ be the mean curvature of $\Sigma$ with respect to the unit outward normal when it is isometrically embedded in $\mathbb{R}^3$. It follows from (4.11) that
$$ \int_\Sigma (k_0 - |\vec{H}|) \, d\sigma \ge \int_\Sigma (k_0 - k) \, d\sigma. \tag{4.12} $$
On the other hand, by the result of [21], we have
$$ \int_\Sigma (k_0 - k) \, d\sigma \ge 0, \tag{4.13} $$
and equality holds if and only if $(M, g)$ is a domain in $\mathbb{R}^3$. Thus, we conclude from (4.12) and (4.13) that $m_{LY}(\Sigma) \ge 0$. Moreover, if $m_{LY}(\Sigma) = 0$, then $(M, g)$ must be flat, hence $R = 0$ and consequently $p = 0$. Therefore, $F(M)$ and hence $\Sigma$ lie on a hyperplane in $\mathbb{R}^{3,1}$. Conversely, if $\Sigma$ lies on a hyperplane in $\mathbb{R}^{3,1}$, then obviously $m_{LY}(\Sigma) = 0$.
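The step from (4.11) to (4.12) is a pointwise inequality. Spelled out, using $k > 0$ on $\partial M$ and writing $\mathrm{tr}_\Sigma\, p$ for the trace of $p$ restricted to $\Sigma$:

```latex
\[
|\vec H|^{2} = k^{2} - (\operatorname{tr}_{\Sigma} p)^{2} \le k^{2}
\quad\Longrightarrow\quad
|\vec H| \le |k| = k
\quad\Longrightarrow\quad
k_0 - |\vec H| \ge k_0 - k .
\]
```

Integrating the last inequality over $\Sigma$ gives (4.12).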
In the sequel, we want to show that the examples given in [20] satisfy the assumption in Theorem 4.1. To do that, we need the following definition:

Definition 4.1. Two points $p$ and $q$ in a Lorentzian manifold $N$ are said to be causally related if $p$ and $q$ can be joined by a timelike or null path. A set $S$ in $N$ is called acausal if no two points in $S$ are causally related.

We claim that all surfaces in the examples in [20] are acausal. Suppose this claim is true; then by Theorem 3 in [10, p. 4765], we know that those surfaces span spacelike hypersurfaces in $\mathbb{R}^{3,1}$, hence satisfying the assumption in Theorem 4.1. To verify the claim, let $\Sigma$ be an example given in [20], i.e. in terms of the usual spherical coordinates $(t, r, \theta, \phi)$ in $\mathbb{R}^{3,1}$, $\Sigma$ is determined by the equation
$$ t = r = F(\theta, \phi), \tag{4.14} $$
where $F = F(\theta, \phi)$ is a smooth positive function of $(\theta, \phi) \in S^2$. Suppose $\Sigma$ is not acausal; then there exist two distinct points $p, q$ in $\Sigma$ and a path $\gamma(\tau)$ such that $\gamma(0) = p$, $\gamma(1) = q$, and
$$ (\dot x)^2 + (\dot y)^2 + (\dot z)^2 \le (\dot t)^2, \quad \forall \tau \in [0, 1], \tag{4.15} $$
where we denote $\gamma(\tau) = (x(\tau), y(\tau), z(\tau), t(\tau))$ and the dot denotes the derivative with respect to $\tau$. Since $\dot\gamma \ne 0$, without loss of generality, we may assume $\dot t > 0$. Let $r = \sqrt{x^2 + y^2 + z^2}$; then (4.15) implies that
$$ |\dot r| \le \dot t. \tag{4.16} $$
Note that $r(0) = t(0)$ and $r(1) = t(1)$; we see that
$$ |\dot r| = \dot t \tag{4.17} $$
for all $\tau \in [0, 1]$. By the equality case in the Cauchy-Schwarz inequality, we must have
$$ x = k(\tau) \dot x, \quad y = k(\tau) \dot y, \quad z = k(\tau) \dot z \tag{4.18} $$
for some function $k = k(\tau)$ and for all $\tau \in [0, 1]$. Clearly, this implies that $p$ and $q$ lie on a line which passes through the origin, or equivalently, $p = aq$ for some positive number $a$. On the other hand, using the Cartesian coordinates, we may write $p$ as
$$ (F(\theta_1, \phi_1),\ F(\theta_1, \phi_1) \sin\theta_1 \cos\phi_1,\ F(\theta_1, \phi_1) \sin\theta_1 \sin\phi_1,\ F(\theta_1, \phi_1) \cos\theta_1) $$
and write $q$ as
$$ (F(\theta_2, \phi_2),\ F(\theta_2, \phi_2) \sin\theta_2 \cos\phi_2,\ F(\theta_2, \phi_2) \sin\theta_2 \sin\phi_2,\ F(\theta_2, \phi_2) \cos\theta_2) $$
for some $(\theta_i, \phi_i) \in S^2$, $i = 1, 2$. The fact $p = aq$, for some $a > 0$, then implies $p = q$, which is a contradiction. Therefore, $\Sigma$ is acausal.

Acknowledgements. We would like to thank Robert Bartnik for useful discussions on the existence of maximal surfaces.
5. Appendix

In this Appendix, we give some lemmas which are needed to complete the proof of Proposition 3.2. We will follow closely Nirenberg's argument in [18]. First, we introduce some notations: given an integer $k \ge 2$ and a positive number $\alpha < 1$, let
$\mathcal{E}^{k,\alpha}$ = the space of $C^{k,\alpha}$ embeddings of $S^2$ into $\mathbb{R}^3$,
$\mathcal{X}^{k,\alpha}$ = the space of $C^{k,\alpha}$ $\mathbb{R}^3$-valued vector functions on $S^2$,
$\mathcal{S}^{k,\alpha}$ = the space of $C^{k,\alpha}$ symmetric (0, 2) tensors on $S^2$,
$\mathcal{M}^{k,\alpha}$ = the space of $C^{k,\alpha}$ Riemannian metrics on $S^2$.

On page 353 in [18], Nirenberg proved

Lemma 5.1. Let $\sigma \in \mathcal{M}^{4,\alpha}$ be a metric with positive Gaussian curvature. Let $X \in \mathcal{E}^{4,\alpha}$ be an isometric embedding of $(S^2, \sigma)$ in $\mathbb{R}^3$. There exist two positive numbers $\epsilon$ and $C$, depending only on $\sigma$, such that if $\tau \in \mathcal{M}^{2,\alpha}$ satisfies $\|\sigma - \tau\|_{C^{2,\alpha}} < \epsilon$, then there is an isometric embedding $Y \in \mathcal{E}^{2,\alpha}$ of $(S^2, \tau)$ in $\mathbb{R}^3$ such that $\|X - Y\|_{C^{2,\alpha}} \le C \|\sigma - \tau\|_{C^{2,\alpha}}$.

In what follows, we want to show that the constants $\epsilon$ and $C$ in the above lemma can be chosen to be independent of $\sigma$, provided $\sigma$ is sufficiently close to some $\sigma^0 \in \mathcal{M}^{5,\alpha}$ (see Lemma 5.3). First, we prove the following:

Lemma 5.2. Let $\sigma^0 \in \mathcal{M}^{5,\alpha}$ be a metric with positive Gaussian curvature. There exist positive numbers $\delta$ and $\hat K$, depending only on $\sigma^0$, such that if $\sigma \in \mathcal{M}^{4,\alpha}$ satisfies $\|\sigma^0 - \sigma\|_{C^{2,\alpha}} < \delta$, then for any $\gamma \in \mathcal{S}^{2,\alpha}$ and any $Z \in \mathcal{X}^{2,\alpha}$, there exists a solution $Y \in \mathcal{X}^{2,\alpha}$ to the linear equation
$$ 2\, dX_\sigma \cdot dY = \gamma - (dZ)^2. \tag{5.1} $$
Here $X_\sigma \in \mathcal{E}^{4,\alpha}$ is any given isometric embedding of $(S^2, \sigma)$. Moreover, for every $Z$ (with $\gamma$ fixed), a particular solution $Y$ denoted by $\Phi(Z)$ may be chosen so that
$$ \|\Phi(Z)\|_{C^{2,\alpha}} \le \hat K \left( \|\gamma\|_{C^{2,\alpha}} + \|Z\|^2_{C^{2,\alpha}} \right), \tag{5.2} $$
and for any $Z, Z_1 \in \mathcal{E}^{2,\alpha}$,
$$ \|\Phi(Z) - \Phi(Z_1)\|_{C^{2,\alpha}} \le \hat K \|Z + Z_1\|_{C^{2,\alpha}} \cdot \|Z - Z_1\|_{C^{2,\alpha}}. \tag{5.3} $$
Proof. We proceed exactly as in [18]. For any $\sigma \in \mathcal{M}^{4,\alpha}$, let $X_\sigma$ be a given isometric embedding of $(S^2, \sigma)$ in $\mathbb{R}^3$. Let $X_3$ be the unit inner normal to the surface $X_\sigma(S^2)$. Let $\{u, v\}$ be a local coordinate chart on $S^2$. Let $\phi, p_1, p_2$ be defined as in (6.5), (6.6) in [18]. Then $\phi, p_1, p_2$ satisfy the system of Eqs. (6.11)–(6.13) in [18] with $c_1, c_2, \Delta$ defined on pp. 356–357 in [18]. By Sect. 6.3 in [18], the derivatives of $Y$ are completely determined by $\phi$, which satisfies (6.15) in [18]. Let $\phi$ be given by (7.3) in [18]; following the first paragraph in Sect. 8.1 in [18], we obtain a unique solution $Y \in \mathcal{E}^{2,\alpha}$ to (5.1), normalized to vanish at a fixed point on $S^2$. We denote such a $Y$ by $Y = \Phi(Z)$. To prove estimates (5.2) and (5.3), by the Remark on page 365 in [18] and the proof following it, we know it suffices to show
$$ \|Y\|_{C^{2,\alpha}} \le C \left( \|d\bar\sigma^2\|_{C^{1,\alpha}} + \left\| \frac{1}{\Delta} (c_{1v} - c_{2u}) \right\|_{C^{\alpha}} \right), \tag{5.4} $$
where $d\bar\sigma^2 = \gamma - (dZ)^2$, and $c_{1v}$, $c_{2u}$ are derivatives of $c_1$, $c_2$ with respect to $v$, $u$ respectively. On the other hand, by Sect. 9 in [18], to prove (5.4), it suffices to establish a $C^{1,\alpha}$ estimate of $\phi$:
$$ \|\phi\|_{C^{1,\alpha}} \le C \|d\bar\sigma^2\|_{C^{1,\alpha}}. \tag{5.5} $$
Therefore, in what follows, we will prove that there are positive numbers $\delta$ and $C$, depending only on $\sigma^0$, such that (5.5) holds for any $\sigma$ satisfying $\|\sigma^0 - \sigma\|_{C^{2,\alpha}} < \delta$.

We first recall the fact that $\phi$ is a solution to the second order elliptic equation (6.15) in [18]. For simplicity, we let
$$ L_\sigma(\phi) = L(\phi_u, \phi_v), \quad F_\sigma(d\bar\sigma^2) = L(c_1, c_2) - T, \tag{5.6} $$
where $L(\phi_u, \phi_v)$, $L(c_1, c_2) - T$ are given as in (6.16) and (6.14) in [18]; then (6.15) in [18] becomes
$$ L_\sigma(\phi) + H_\sigma \phi = F_\sigma(d\bar\sigma^2), \tag{5.7} $$
where $H_\sigma$ is the mean curvature of $X_\sigma(S^2)$ with respect to $X_3$ (note that our $H$ here equals $2H$ in [18]). On the other hand, since we have chosen $\phi$ to be given by the integral formula (7.3) in [18], we know $\phi$ is a special solution to (5.7) in the sense that $\phi$ is $L^2$-perpendicular to the kernel of the operator $L_\sigma(\cdot) + H_\sigma$ (see p. 359 in [18]). For any $\sigma \in \mathcal{M}^{4,\alpha}$, let $Ker(\sigma)$ denote the space of solutions $\psi$ to the homogeneous equation
$$ L_\sigma(\psi) + H_\sigma \psi = 0. \tag{5.8} $$
On p. 360 in [18], it was shown that $Ker(\sigma)$ is spanned by the coordinate functions of $X_3$.

Note that the coefficients of (5.7) depend only on the metric $\sigma$. Therefore, if $\sigma$ is close to $\sigma^0$ in $C^{2,\alpha}$, we know by Theorem 8.32 in [11] that, to prove the $C^{1,\alpha}$ estimate (5.5), it suffices to prove the following $C^0$ estimate:
$$ \|\phi\|_{C^0} \le C \|d\bar\sigma^2\|_{C^{1,\alpha}}, \tag{5.9} $$
where $C$ is some positive constant independent of $\sigma$, provided $\sigma$ is sufficiently close to $\sigma^0$ in $C^{2,\alpha}$. Suppose (5.9) is not true; then there exist $\{\sigma_i\} \subset \mathcal{M}^{4,\alpha}$ which converges to $\sigma^0$ in $C^{2,\alpha}$, $\{d\bar\sigma_i^2\} \subset \mathcal{S}^{1,\alpha}$ with $\|d\bar\sigma_i^2\|_{C^{1,\alpha}} = 1$, and a sequence of numbers $\{C_i\}$ approaching $+\infty$ so that the corresponding $\phi_i$ (of $Y = Y_i$) satisfies $\|\phi_i\|_{C^0} \ge C_i$. Consider $\xi_i = \phi_i / \|\phi_i\|_{C^0}$; then $\xi_i$ satisfies
$$ L_{\sigma_i}(\xi_i) + H_{\sigma_i} \xi_i = \|\phi_i\|_{C^0}^{-1} F_{\sigma_i}(d\bar\sigma_i^2). \tag{5.10} $$
By Theorem 8.32 in [11], we conclude from (5.10) and the facts that $\{\sigma_i\}$ converges to $\sigma^0$ in $C^{2,\alpha}$ and $\|\xi_i\|_{C^0} = 1$ that
$$ \|\xi_i\|_{C^{1,\alpha}} \le C, \tag{5.11} $$
where $C$ is some positive constant independent of $i$. Now (5.11) implies that, after passing to a subsequence, $\xi_i$ converges in $C^1$ to some $\xi$ which is also in $C^{1,\alpha}$. Moreover, $\|\xi\|_{C^0} = 1$. By (5.10), $\xi$ is a weak solution to the equation
$$ L_{\sigma^0} \xi + H_{\sigma^0} \xi = 0. \tag{5.12} $$
Since $\sigma^0 \in C^{5,\alpha}$, the coefficients of (5.12) (given by (6.16) in [18]) are then in $C^{3,\alpha}$, hence in $C^{2,1}$. By Theorem 8.10 in [11], we know $\xi \in W^{4,2}$, hence in $C^2$. Therefore, $\xi$ is a classical solution to (5.12), i.e. $\xi \in Ker(\sigma^0)$. On the other hand, we know $\phi_i$, hence $\xi_i$, is $L^2$-perpendicular to $Ker(\sigma_i)$ for each $i$. Since $\{\sigma_i\}$ converges to $\sigma^0$ in $C^{2,\alpha}$ and $\{\xi_i\}$ converges to $\xi$ in $C^1$, we conclude that $\xi$ must be $L^2$-perpendicular to $Ker(\sigma^0)$. Hence, $\xi$ must be zero. This is a contradiction to the fact $\|\xi\|_{C^0} = 1$. Therefore, we conclude that (5.9) holds. As mentioned earlier, once we establish the $C^0$ estimate (5.9), we will have the $C^{1,\alpha}$ estimate (5.5). Then we can proceed as in the rest of Sect. 9 in [18] to prove (5.4), hence prove (5.2) and (5.3).
We note that the constants and C in Lemma 5.1 indeed can be chosen as = 4 K1¯ 2 and C = 2 K¯ , where K¯ is the constant in Theorem 2’ on p. 352 in [18]. Therefore, by applying the exactly same iteration argument as on pp. 352–353 in [18], one concludes from Lemma 5.2 that Lemma 5.3. Let σ 0 ∈ M5,α be a metric with positive Gaussian curvature. There exist positive numbers δ, and C, depending only on σ 0 , such that for any σ ∈ M4,α satisfying ||σ 0 − σ ||C 2,α < δ, if τ ∈ M2,α satisfying ||σ − τ ||C 2,α < , then there is an isometric embedding Y ∈ E 2,α of (S2 , τ ) in R3 such that ||X − Y ||C 2,α ≤ C||σ − τ ||C 2,α . Here X ∈ E 4,α is any given isometric embedding of (S2 , σ ). References 1. Arnowitt, R., Deser, S., Misner, C.W.: Coordinate invariance and energy expressions in general relativity. Phys. Rev. (2) 122, 997–1006 (1961) 2. Bartnik, R.: Private communications 3. Bartnik, R.: Regularity of variational maximal surfaces. Acta Math. 161(3–4), 145–181 (1988) 4. Bartnik, R., Simon, L.: Spacelike hypersurfaces with prescribed boundary values and mean curvature. Commun. Math. Phys. 87, 131–152 (1982) 5. Bray, H., Khuri, M.: P.D.E.’s which imply the Penrose Conjecture. http://arxiv.org/abs/0905.2622v1[math. DG], 2009
6. Brown, J.D., York, J.W. Jr.: Quasilocal energy in general relativity. In: Mathematical aspects of classical field theory (Seattle, WA, 1991), Volume 132 of Contemp. Math., Providence, RI: American Mathematical Society, 1992, pp. 129–142
7. Brown, J.D., York, J.W. Jr.: Quasilocal energy and conserved charges derived from the gravitational action. Phys. Rev. D (3) 47(4), 1407–1419 (1993)
8. Corvino, J.: Scalar curvature deformation and a gluing construction for the Einstein constraint equations. Commun. Math. Phys. 214(1), 137–189 (2000)
9. Fan, X.-Q., Shi, Y.-G., Tam, L.-F.: Large-sphere and small-sphere limits of the Brown-York mass. Comm. Anal. Geom. 17, 37–72 (2009)
10. Flaherty, F.J.: The boundary value problem for maximal hypersurfaces. Proc. Natl. Acad. Sci. USA 76(10), 4765–4767 (1979)
11. Gilbarg, D., Trudinger, N.S.: Elliptic partial differential equations of second order. Second edition, Berlin-Heidelberg-New York: Springer-Verlag (1983)
12. Huisken, G., Ilmanen, T.: The inverse mean curvature flow and the Riemannian Penrose Inequality. J. Diff. Geom. 59, 353–437 (2001)
13. Kobayashi, O.: A differential equation arising from scalar curvature function. J. Math. Soc. Japan 34(4), 665–675 (1982)
14. Liu, C.-C.M., Yau, S.-T.: Positivity of quasilocal mass. Phys. Rev. Lett. 90(23), 231102 (2003)
15. Liu, C.-C.M., Yau, S.-T.: Positivity of quasilocal mass II. J. Am. Math. Soc. 19(1), 181–204 (2006)
16. Miao, P.: Positive mass theorem on manifolds admitting corners along a hypersurface. Adv. Theor. Math. Phys. 6(6), 1163–1182 (2002)
17. Miao, P., Tam, L.-F.: On the volume functional of compact manifolds with boundary with constant scalar curvature. Calc. Var. Part. Diff. Eq. 36(2), 141 (2009)
18. Nirenberg, L.: The Weyl and Minkowski problems in differential geometry in the large. Comm. Pure Appl. Math. 6, 337–394 (1953)
19. Ó Murchadha, N.: The Liu-Yau mass as a quasi-local energy in general relativity. http://arxiv.org/abs/0706.1166v1 [gr-qc], 2007
20. Ó Murchadha, N., Szabados, L.B., Tod, K.P.: Comment on “Positivity of quasilocal mass”. Phys. Rev. Lett. 92, 259001 (2004)
21. Shi, Y.-G., Tam, L.-F.: Positive mass theorem and the boundary behaviors of compact manifolds with nonnegative scalar curvature. J. Diff. Geom. 62, 79–125 (2002)
22. Shi, Y.-G., Tam, L.-F.: Rigidity of compact manifolds and positivity of quasi-local mass. Class. Quant. Grav. 24(9), 2357–2366 (2007)
23. Christodoulou, D., Yau, S.-T.: Some remarks on the quasi-local mass. In: Mathematics and general relativity (Santa Cruz, CA, 1986), Contemp. Math. 71, Providence, RI: Amer. Math. Soc., 1986, pp. 9–14
24. Wang, M.-T., Yau, S.-T.: A generalization of Liu-Yau’s quasi-local mass. Comm. Anal. Geom. 15(2), 249–282 (2007)
25. Wang, M.-T., Yau, S.-T.: Isometric embeddings into the Minkowski space and new quasi-local mass. Commun. Math. Phys. 288(3), 919–942 (2009)

Communicated by P.T. Chruściel
Commun. Math. Phys. 298, 461–484 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1074-z
Communications in
Mathematical Physics
Universality Limits of a Reproducing Kernel for a Half-Line Schrödinger Operator and Clock Behavior of Eigenvalues Anna Maltsev Mathematics 253-37, California Institute of Technology, Pasadena, CA 91125, USA. E-mail: [email protected] Received: 31 July 2009 / Accepted: 16 March 2010 Published online: 26 June 2010 – © Springer-Verlag 2010
Abstract: We extend some recent results of Lubinsky, Levin, Simon, and Totik from measures with compact support to spectral measures of Schrödinger operators on the half-line. In particular, we define a reproducing kernel $S_L$ for Schrödinger operators and we use it to study the fine spacing of eigenvalues in a box of the half-line Schrödinger operator with perturbed periodic potential. We show that if solutions $u(\xi, x)$ are bounded in $x$ by $e^{\epsilon x}$ uniformly for $\xi$ near the spectrum in an average sense and the spectral measure is positive and absolutely continuous in a bounded interval $I$ in the interior of the spectrum with $\xi_0 \in I$, then uniformly in $I$,
\[ \frac{S_L(\xi_0 + a/L,\ \xi_0 + b/L)}{S_L(\xi_0, \xi_0)} \to \frac{\sin(\pi\rho(\xi_0)(a - b))}{\pi\rho(\xi_0)(a - b)}, \]
where $\rho(\xi)\,d\xi$ is the density of states. We deduce that the eigenvalues near $\xi_0$ in a large box of size $L$ are spaced asymptotically as $\frac{1}{L\rho}$. We adapt the methods used to show similar results for orthogonal polynomials.
1. Introduction

In this paper we exploit the similarities between differential and difference equations to show a half-line Schrödinger operator analogue of recent results of Lubinsky, Levin, Simon, and Totik. Let $d\eta = w(x)\,dx + d\eta_s$ be a probability measure supported on $[-1, 1]$. Let the polynomials $p_n$ be orthonormal with respect to the $L^2(d\eta)$ inner product. The Christoffel–Darboux kernel $K_n$, given by
\[ K_n(x, y) = \sum_{k=0}^{n} p_k(x)\,p_k(y) \tag{1.1} \]
(see for example [8,13,15]), is characterized by the reproducing property, i.e. for all $k < n$,
\[ p_k(y) = \int K_n(x, y)\,p_k(x)\,d\eta(x). \tag{1.2} \]
The measure $d\eta$ on a compact set $e$ is regular [15] if for any $\epsilon > 0$ there exist $\delta > 0$ and a constant $C$ so that
\[ \sup_{\mathrm{dist}(y,e)\le\delta} |p_n(y, d\eta)| \le C e^{\epsilon n}. \tag{1.3} \]
Let $I \subset (-1, 1)$ be a closed interval and let $d\eta$ be regular, with $\mathrm{supp}(d\eta_s) \cap I = \emptyset$ and $w$ continuous and nonvanishing on $I$. Then Lubinsky [8] shows that for $a, b \in \mathbb{R}$ and uniformly for $x_0 \in I$,
\[ \lim_{n\to\infty} \frac{K_n\big(x_0 + \frac{a}{n},\ x_0 + \frac{b}{n}\big)}{K_n(x_0, x_0)} = \frac{\sin(\pi\rho_{[-1,1]}(x_0)(a - b))}{\pi\rho_{[-1,1]}(x_0)(a - b)}, \tag{1.4} \]
where $\rho_{[-1,1]}(x_0) = \big(\pi\sqrt{1 - x_0^2}\big)^{-1}$ is the density of states for $[-1, 1]$. This result is interesting for both the study of orthogonal polynomials and of random matrices. It relates a fundamental object to the sine kernel and implies that the left-hand side of (1.4) only depends on the continuity and positivity of the measure $d\eta$ at $x_0$ and its essential support. Additionally, Levin-Lubinsky [7] obtain the asymptotic spacing of the zeros of orthogonal polynomials near $x_0$ from (1.4). In this paper we provide definitions of a reproducing kernel $S_L$ and of regularity for half-line Schrödinger operators. We prove the analogous results for perturbed periodic half-line Schrödinger operators. Let
\[ A\phi(x) = -\frac{d^2\phi(x)}{dx^2} + V(x)\phi(x) \tag{1.5} \]
be a Schrödinger operator on $L^2[0, \infty)$ with either Dirichlet or Neumann boundary condition at $x = 0$. We assume throughout that $V$ is locally integrable and bounded from below. Let $u, y$ be the standard fundamental solutions of the eigenvalue equation of the operator $A$:
\[ A\phi(\xi, x) = \xi\,\phi(\xi, x) \tag{1.6} \]
with initial conditions
\[ u(\xi, 0) = 1 = y'(\xi, 0), \qquad u'(\xi, 0) = 0 = y(\xi, 0). \tag{1.7} \]
Throughout the paper, $u'$, $y'$ denote the derivatives with respect to $x$, and $e = \sigma_{\mathrm{ess}}(A)$. Our results are valid for both Dirichlet and Neumann boundary conditions, but we only give the proofs for the Neumann case. There is a shift of notation here, so $x$ in our setup is analogous to $n$ of the discrete case, and our $\xi$ is the analogue of $x$ of the discrete case. The analogues are illustrated in the following “translation” table:
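Difference                           Differential
-----------------------------------  ------------------------------------------------------------------
recurrence relation                  differential expression
orthogonal polynomials               solutions of eigenvalue equation
$x^n$                                $\cos(\sqrt{\xi}\,x)$
$x^n + a_{n-1}x^{n-1} + \dots$       $\int_0^L f(x)\cos(\sqrt{\xi}\,x)\,dx$ with $f \in L^2[0, \infty)$
$x$                                  $\xi$ in $u(\xi, x)$
$n$                                  $x$ in $u(\xi, x)$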
We now define a reproducing kernel $S_L$ for Schrödinger operators.

Definition 1.1. Given a Schrödinger operator $A$ as in (1.5) with the Neumann boundary condition, we let the reproducing kernel be
\[ S_L(\xi, \zeta) = \int_0^L u(\xi, t)\,u(\zeta, t)\,dt. \tag{1.8} \]
There exists a measure $d\mu$ which makes the following two formulas hold for every function $f \in L^2[0, \infty)$:
\[ W(\zeta, f) = \int_0^\infty f(x)\,u(\zeta, x)\,dx, \tag{1.9} \]
\[ f(x) = \int W(\zeta, f)\,u(\zeta, x)\,d\mu(\zeta). \tag{1.10} \]
See Theorem 2.2.3 of [10]; we change variables from Marchenko so that his $\xi$ is our $\sqrt{\xi}$. Here $d\mu$ is the spectral measure of the operator $A$ as in (1.5). We see that the reproducing property is satisfied with respect to $d\mu$:
\[ u(\xi, x)\,\chi_{[0,L]}(x) = \int S_L(\xi, \zeta)\,u(\zeta, x)\,d\mu(\zeta). \tag{1.11} \]
We are primarily interested in the case where the potential $V = q + p$, where $p$ is periodic with period $P$ and continuous.

Definition 1.2. We call a perturbation $q$ non-destructive if it leaves the essential spectrum unchanged, and zero-average if
\[ \frac{1}{x}\int_0^x |q(t)|\,dt \to 0. \tag{1.12} \]
We assume throughout that the perturbation q is a non-destructive zero-average perturbation e.g. q → 0 at ∞.
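The zero-average condition (1.12) can be explored numerically. The following is a minimal sketch, not part of the original argument: the function names, the sample perturbation $q(t) = 1/(1+t)$ and the contrasting choice $q(t) = \cos t$ are illustrative assumptions, and the limit in (1.12) is read as $x \to \infty$. The sketch says nothing about the non-destructive property.

```python
import math


def running_average_abs(q, x, steps=100_000):
    """Midpoint-rule approximation of (1/x) * integral_0^x |q(t)| dt."""
    h = x / steps
    total = sum(abs(q((j + 0.5) * h)) for j in range(steps))
    return total * h / x


def decaying_q(t):
    # Hypothetical perturbation with q -> 0 at infinity (an assumption for illustration).
    return 1.0 / (1.0 + t)


if __name__ == "__main__":
    for x in (10.0, 100.0, 1000.0):
        # The first average shrinks with x (it equals log(1+x)/x);
        # the second stays near 2/pi, so cos does not satisfy (1.12).
        print(x, running_average_abs(decaying_q, x), running_average_abs(math.cos, x))
```

For the decaying choice the average equals $\log(1+x)/x$ in closed form, which gives an independent check of the quadrature.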
We can now state our main result:

Theorem 1.3. Let $A = -\frac{d^2}{dx^2} + p(x) + q(x)$ with periodic and continuous $p$ and non-destructive zero-average $q$, and let $d\mu(\xi) = w(\xi)\,d\xi + d\mu_s$ be its spectral measure. Let $I \subset e^{\mathrm{int}}$ be a closed and bounded interval such that $w$ is continuous and non-zero on $I$ and $\mathrm{supp}(d\mu_s) \cap I = \emptyset$. Let $\xi_0 \in I$ and $a, b, B \in \mathbb{R}$. Then uniformly in $I$ and $|a|, |b| < B$,
\[ \frac{S_L(\xi_0 + a/L,\ \xi_0 + b/L)}{S_L(\xi_0, \xi_0)} \to \frac{\sin(\pi\rho(\xi_0)(a - b))}{\pi\rho(\xi_0)(a - b)}, \tag{1.13} \]
where $\rho(\xi)\,d\xi$ is the density of states. As in the discrete case, the asymptotic behavior of the kernel $S_L$ for the perturbed periodic operator $A$ depends on the density of states $\rho(\xi)\,d\xi$ of the periodic operator $A^\#$, defined, for example, in Berezin-Shubin (Sect. 2.3 of [1]). The measure $\rho(\xi)\,d\xi$ is the same for Dirichlet and Neumann boundary conditions. It is well known that
\[ K_n(x, y) = \frac{\gamma_{n-1}}{\gamma_n}\,\frac{p_n(x)\,p_{n-1}(y) - p_{n-1}(x)\,p_n(y)}{x - y}, \]
where $\gamma_n$ is the leading coefficient as in [15]. This expression is called the Christoffel–Darboux formula, and we show its analogue in Sect. 3 for $S_L$. From (1.13) and the Christoffel–Darboux formula (3.4) we deduce that the zeros of $u'(-, L)$, scaled by the density of states, will be asymptotically equally spaced, like the zeros of the sine function. We adapt the definition from [6]:

Definition 1.4. Fix $\xi^*$ in an interval $I$, and number the zeros $\xi_n$ of $u'(-, L)$ with increasing positive integers to the right of $\xi^*$ and decreasing negative integers to the left so that $\dots < \xi_{-1} < \xi^* \le \xi_0 < \dots$. We say there is strong clock behavior of zeros of $u'$ at $\xi^*$ on an interval $I$ if the density of states $\rho(\xi)\,d\xi$ is continuous and nonvanishing on $I$ and for fixed $n$,
\[ \lim_{L\to\infty} L\,|\xi_n - \xi_{n+1}|\,\rho(\xi^*) = 1, \tag{1.14} \]
and we say there is uniform clock behavior on $I$ if the limit in (1.14) is uniform on $I$ for fixed $n$. This nomenclature comes from the theory of orthogonal polynomials on the unit circle. There, when zeros of polynomials exhibit clock behavior, they do indeed look like marks on a clock. In Sect. 6, we show

Corollary 1.5. Let $A$, $e$, $I$, $\xi_0$ be as in Theorem 1.3. Then there is uniform clock behavior of the zeros of $u'$ and $y$ on $I$.

The spacing of eigenvalues for functions on $[0, L]$ with periodic Dirichlet boundary condition is the same as the spacing of zeros of $y(\xi, L)$ in $\xi$ in case of the Dirichlet boundary condition at 0 and $L$, since whenever $y(\xi_0, L) = 0$ the periodic boundary condition is satisfied. Similar logic applies to the zeros of $u'(\xi, L)$ in case of the Neumann boundary condition.
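The clock condition (1.14) can be illustrated in the simplest case $V = 0$ with the Neumann condition, where $u(\xi, x) = \cos(\sqrt{\xi}\,x)$ and the zeros of $u'(\xi, L)$ are $\xi_n = (n\pi/L)^2$. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the free density of states $\rho(\xi) = 1/(2\pi\sqrt{\xi})$ is taken as an assumption, chosen to match the $p = 0$ sine-kernel limit quoted in this introduction.

```python
import math


def neumann_zeros_right_of(xi_star, L, count):
    """Zeros xi_n = (n*pi/L)^2 of u'(xi, L) for V = 0, starting at the first one >= xi_star."""
    n0 = math.ceil(math.sqrt(xi_star) * L / math.pi)
    return [((n0 + j) * math.pi / L) ** 2 for j in range(count)]


def free_density_of_states(xi):
    # Assumed density of states for p = 0: rho(xi) = 1 / (2*pi*sqrt(xi)).
    return 1.0 / (2.0 * math.pi * math.sqrt(xi))


def scaled_gaps(xi_star, L, count=3):
    """The quantities L * |xi_n - xi_{n+1}| * rho(xi_star) appearing in (1.14)."""
    zeros = neumann_zeros_right_of(xi_star, L, count + 1)
    rho = free_density_of_states(xi_star)
    return [L * (zeros[j + 1] - zeros[j]) * rho for j in range(count)]


if __name__ == "__main__":
    for L in (1e2, 1e3, 1e4):
        print(L, scaled_gaps(4.0, L))  # the scaled gaps approach 1 as L grows
```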
In our setup, we use the space
\[ H_L = \Big\{ \pi : \pi(\xi) = \int_0^L f(x)\cos(\sqrt{\xi}\,x)\,dx,\ f \in L^2[0, \infty) \Big\} \tag{1.15} \]
as the analogue of the space of polynomials with degree less than or equal to $n$. Just like polynomial degree, when two functions with parameters $L$ and $N$ are multiplied, if the product is in $H_M$ for some $M$, then $M = L + N$, since multiplying exponentials adds their exponents. The orthogonal polynomials with degree smaller than or equal to $n$ are a basis for the space of polynomials with degree less than or equal to $n$. The analogous property of $H_L$ is
\[ H_L = \Big\{ \pi : \pi(\xi) = \int_0^L f(x)\,u(\xi, x)\,dx,\ f \in L^2[0, \infty) \Big\}. \]
This follows easily from Marchenko (see (1.2.10), (1.2.10”) of [10]), which gives the existence of a continuous integral kernel $M$, such that
\[ \pi(\xi) = \int_0^L f(x)\Big[ u(\xi, x) + \int_0^x M(x, t)\,u(\xi, t)\,dt \Big]\,dx. \tag{1.16} \]
The space of polynomials of degree less than or equal to $n$ is usually considered with the $L^2(d\eta)$ inner product. Analogously, we give $H_L$ the following inner product:
\[ \langle \pi_1, \pi_2 \rangle = \int \pi_1(\zeta)\,\pi_2(\zeta)\,d\mu(\zeta), \tag{1.17} \]
where $d\mu$ is the spectral measure. The minimizer of $\|\pi(y)\|^2_{L^2(d\eta)}$ over polynomials $\pi$ with $\deg \pi \le n$ and $\pi(x) = 1$ is equal to $\frac{K_n(x,y)}{K_n(x,x)}$ and the minimum is equal to $K_n(x, x)^{-1}$. This property is called the variational principle and we show its analogue for $S_L$:

Theorem 1.6. If $\mu$ is an unnormalized spectral measure, then
\[ \min\{ \|Q\|^2_{d\mu} : Q \in H_L,\ Q(\xi_0) = 1 \} = S_L(\xi_0, \xi_0)^{-1}, \tag{1.18} \]
and the minimizer is given by
\[ \frac{S_L(\xi, \xi_0)}{S_L(\xi_0, \xi_0)}. \tag{1.19} \]
We give the minimum its own letter:
\[ \lambda_L(\xi) = S_L(\xi, \xi)^{-1}. \tag{1.20} \]
Returning to the orthogonal polynomials case for motivation, we summarize Lubinsky’s method for showing (1.4). He notes that if $d\eta$, $d\eta^*$ are regular measures on $[-1, 1]$ with $d\eta \le d\eta^*$ and $K^*$ is the Christoffel-Darboux kernel associated with $d\eta^*$, then
\[ \frac{|K_n(x, y) - K_n^*(x, y)|}{K_n(x, x)} \le \Big( \frac{K_n(y, y)}{K_n(x, x)} \Big)^{1/2} \Big[ 1 - \frac{K_n^*(x, x)}{K_n(x, x)} \Big]^{1/2}. \tag{1.21} \]
This inequality, called Lubinsky’s inequality, implies that in order to understand the left-hand side of (1.4), it is sufficient to understand $K_n^\#(x, y)$ for some model measure $d\eta^\#$ and the behavior of a ratio of diagonal kernels. A model $d\eta^\#$ with $w^\#(x_0) = w(x_0)$ is chosen, for which $K_n^\#(x, y)$ can be computed directly. Then $d\eta^* = \sup\{d\eta^\#, d\eta\}$ dominates both $d\eta$ and $d\eta^\#$ and has similarly nice local behavior at $x_0$ with $w^*(x_0) = w(x_0)$. By the variational principle, the ratios of the diagonal kernels $\frac{K_n^\#(x,x)}{K_n^*(x,x)}$ and $\frac{K_n(x,x)}{K_n^*(x,x)}$ both converge to 1, and Lubinsky’s inequality and a comparison of the two resulting expressions yields the desired result. Simon [14] and Totik [17] extend this argument to measures with $\mathrm{supp}_{\mathrm{ess}}(d\eta) = \cup I_j$ a finite union of intervals. In this paper we adapt all the steps to Schrödinger operators. We adapt the regularity condition to spectral sets of half-line Schrödinger operators as follows:
Definition 1.7. Suppose $e \subset \mathbb{R}$ is the essential support of a spectral measure $d\mu$ of a Schrödinger operator with Neumann boundary condition. We say $d\mu$ satisfies regularity bounds if for any $\epsilon > 0$ there exist $\delta_1 > 0$ and $C$ such that for all $\xi$ with $\mathrm{dist}(\xi, e) \le \delta_1$ the solution $u$ satisfies
\[ \int_0^L u(\xi, x)^2\,dx \le C e^{\epsilon L}, \tag{1.22} \]
with $C$ not dependent on $\xi$, $L$.

In Sect. 2 we show that a Schrödinger operator with potential of the form $q(x) + p(x)$ with continuous periodic $p$ and non-destructive zero-average $q$ (as in Definition 1.2) satisfies regularity bounds. Lubinsky’s inequality carries over exactly to our setup, as we show in Sect. 6. Similar to Simon [14] and Lubinsky [8], we need a measure $d\mu^\#(\xi) = w^\#(\xi)\,d\xi + d\mu^\#_s$, which corresponds to a Schrödinger operator $A^\#$ and satisfies the following properties:
(1) $\sigma_{\mathrm{ess}}(\mu^\#) = e$.
(2) $w^\#$ is continuous and nonvanishing on $e$.
(3) For any compact interval $I \subset e^{\mathrm{int}}$ and $\epsilon > 0$, as $L \to \infty$, uniformly on $I$,
\[ \sup_{\xi \in I} e^{-\epsilon L}\,S_L(\xi, \xi, d\mu^\#) \to 0. \tag{1.23} \]
(4) For any compact interval $I \subset e^{\mathrm{int}}$, for all $\xi \in I$ uniformly,
\[ \lim_{\epsilon \to 0}\,\lim_{L \to \infty} \frac{S_{L + \epsilon L}(\xi, \xi, \mu^\#)}{S_L(\xi, \xi, \mu^\#)} = 1. \tag{1.24} \]
(5) For $\xi(L) \to \xi_0$ in $e^{\mathrm{int}}$,
\[ \lim_{L \to \infty} \frac{S_L(\xi(L), \xi(L))}{S_L(\xi_0, \xi_0)} = 1 \tag{1.25} \]
and this limit is uniform in $I$.
We call such a measure a model. We need these properties in the proof of Theorem 1.8. Theorem 5.3 immediately implies that the operator $A^\#$ with periodic potential satisfies model conditions 3-4. In Theorem 5.1, we notice that model condition 5 is satisfied. Thus, $A^\#$ is a model. We therefore can use the periodic potential as a model for $e$, whenever $q$ is non-destructive.
The periodic Schrödinger operator comes up naturally in the study of crystalline solids, and has been extensively studied ([1,11]). It shares a lot of structure with periodic Jacobi operators. The essential spectrum $e$ of a periodic Schrödinger operator is a union of closed intervals. One shows this by considering the operator $T$ which shifts a solution by the period length in $x$, i.e. $T u(\xi, x) = u(\xi, x + P)$, where $P$ is the period. This operator is analogous to stripping $p$ coefficients from a $p$-periodic Jacobi matrix. The eigenvalue equation for $T$ is $s^2 - \Delta(\xi)s + 1 = 0$, where $\Delta(\xi) = u(\xi, P) + y'(\xi, P)$ is called the discriminant as in, for example, Chapter 2 of [9]. The solutions are not exponentially growing only when $|s| = 1$, so $\{\xi : |\Delta(\xi)| \le 2\}$ is the spectrum. One can furthermore show that $\Delta'(\xi) \ne 0$ whenever $\Delta(\xi) \in (-2, 2)$. We let $e = \cup[l_n, r_n]$ so that $\Delta$ is invertible on each $[l_n, r_n]$. We call each $[l_n, r_n]$ a band and each interval in $\mathbb{R}\setminus e$ a gap. When $r_n = l_{n+1}$, we call the point $\xi = r_n$ a closed gap. The perturbed operator may have countably many eigenvalues in each gap, but the only limit points are the bands’ endpoints. Furthermore, there exists a first band, so shifting $q$ by a constant in energy, we can assume that $\min e = 0$. When $p$ is bounded, the size of the $n$th gap goes to 0 as $n \to \infty$ (Lemma 2.9 of [9]), so only finitely many gaps and finitely many eigenvalues do not lie in $\{\xi : \mathrm{dist}(\xi, e) \le \delta_1\}$ for any $\delta_1 > 0$.

The same is true for the comparison measure $d\mu^*$ which we construct to dominate both $d\mu$ and $d\mu^\#$ and to be continuous and non-vanishing on $I$ with $w^*(\xi_0) = w(\xi_0)$. We let $d\mu^*$ be the sup of $d\mu$, $d\mu^\#$ on a compact subset of $\mathbb{R}$ and $d\mu + d\mu^\#$ on the rest of $\mathbb{R}$. The comparison measure is a scalar multiple of a spectral measure, as we show in Sect. 6. We call such measures unnormalized spectral measures, as analogous to unnormalized measures on compact sets.
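The band condition $|\Delta(\xi)| \le 2$ described above can be explored numerically. The sketch below is an illustration under stated assumptions, not the paper's method: it takes the discriminant to be the trace of the one-period transfer matrix, $\Delta(\xi) = u(\xi, P) + y'(\xi, P)$, for the fundamental solutions with initial conditions (1.7), and integrates the eigenvalue equation with a classical fourth-order Runge–Kutta scheme. The function names and the sample potentials are hypothetical.

```python
import math


def transfer_over_period(xi, potential, period, steps=4000):
    """RK4 integration of -phi'' + p(x) phi = xi phi over [0, period].

    Returns (u, u', y, y') at x = period for u(0)=1, u'(0)=0, y(0)=0, y'(0)=1.
    """
    h = period / steps

    def rhs(x, s):
        u, du, y, dy = s
        c = potential(x) - xi  # phi'' = (p(x) - xi) * phi
        return (du, c * u, dy, c * y)

    def shifted(s, k, scale):
        return tuple(si + scale * ki for si, ki in zip(s, k))

    s = (1.0, 0.0, 0.0, 1.0)
    for j in range(steps):
        x = j * h
        k1 = rhs(x, s)
        k2 = rhs(x + h / 2, shifted(s, k1, h / 2))
        k3 = rhs(x + h / 2, shifted(s, k2, h / 2))
        k4 = rhs(x + h, shifted(s, k3, h))
        s = tuple(si + h / 6 * (a + 2 * b + 2 * c + d)
                  for si, a, b, c, d in zip(s, k1, k2, k3, k4))
    return s


def discriminant(xi, potential, period):
    """Trace of the one-period transfer matrix."""
    u, _du, _y, dy = transfer_over_period(xi, potential, period)
    return u + dy


if __name__ == "__main__":
    free = lambda x: 0.0
    # For p = 0 the discriminant is 2*cos(sqrt(xi)*P) for xi > 0.
    print(discriminant(5.0, free, 1.0), 2 * math.cos(math.sqrt(5.0)))
    # Hypothetical periodic potential: report which energies satisfy |Delta| <= 2.
    cosine = lambda x: math.cos(2 * math.pi * x)
    for xi in (0.5, 5.0, 9.8, 20.0):
        d = discriminant(xi, cosine, 1.0)
        print(xi, d, abs(d) <= 2.0)
```

For $p = 0$ the output can be compared against the closed form $2\cos(\sqrt{\xi}P)$ for $\xi > 0$ and $2\cosh(\sqrt{-\xi}P)$ for $\xi < 0$, which is what the checks accompanying this sketch rely on.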
If $u, y$ is a fundamental system of solutions and $S_L$ is the reproducing kernel associated to a spectral measure $d\mu$, then, for $s > 0$, we associate $\frac{u}{\sqrt{s}}$, $\frac{y}{\sqrt{s}}$, and the reproducing kernel $\frac{1}{s}S_L(\zeta, \xi, d\mu(\xi))$ to $d(s\mu)$. A spectral measure $d\mu$ must have a prescribed asymptotic at infinity (Theorem 2.4.2 of [10]), which implies that the normalization constant $s$ is unique and the reproducing kernel is well-defined. Henceforward, we use the letters $d\mu$, $d\mu^*$ to denote spectral measures which may be unnormalized, and all results in Sect. 3 are shown for unnormalized spectral measures. Also, the definition of regularity bounds works just as well. In Sect. 4, we show

Theorem 1.8. Suppose $d\mu(\xi) = w(\xi)\,d\xi + d\mu_s$, $d\mu^*(\xi) = w^*(\xi)\,d\xi + d\mu^*_s$ are unnormalized spectral measures with $\sigma_{\mathrm{ess}}(d\mu) = \sigma_{\mathrm{ess}}(d\mu^*) = e$. Suppose $d\mu$, $d\mu^*$ satisfy regularity bounds and have finitely many eigenvalues outside of $\{\xi : \mathrm{dist}(\xi, e) < \delta_1\}$ for any $\delta_1 > 0$. Let $I \subset e^{\mathrm{int}}$ be a closed and bounded interval such that $w$, $w^*$ are continuous and strictly positive on $I$ and $(\mathrm{supp}(d\mu_s) \cup \mathrm{supp}(d\mu^*_s)) \cap I = \emptyset$. Let $\xi_0 \in I$ and $\xi(L) \to \xi_0$ as $L \to \infty$. Suppose $d\mu^*$ satisfies (1.24), (1.25). Then uniformly in $I$,
\[ \frac{S_L(\xi(L), \xi(L), \mu)}{S_L(\xi(L), \xi(L), \mu^*)} \to \frac{w^*(\xi_0)}{w(\xi_0)}. \tag{1.26} \]
In Sect. 5, we compute the universality limit of the kernel in the unperturbed periodic case to be
\[ \lim_{L \to \infty} \frac{S_L\big(\xi_0 + \frac{a}{L},\ \xi_0 + \frac{b}{L}\big)}{S_L(\xi_0, \xi_0)} = \frac{\sin(\pi\rho(\xi_0)(a - b))}{\pi\rho(\xi_0)(a - b)}, \tag{1.27} \]
where $\rho(\xi)\,d\xi$ is the density of states corresponding to the periodic Schrödinger operator. To make this calculation, we use a standard formula to express the density of states in terms of the imaginary part of the diagonal Green’s function, and then we express the Green’s function in terms of the solution $u$. From Theorem 1.8 and the adapted Lubinsky inequality, we deduce Theorem 1.3. As an example we consider the case $p = 0$. In Sect. 7, we show by direct computation that given the same conditions on the measure as in Theorem 1.3 we have
\[ \lim_{L \to \infty} \frac{S_L\big(\xi + \frac{a}{L},\ \xi + \frac{b}{L}\big)}{S_L(\xi, \xi)} = \frac{(2\sqrt{\xi})\,\sin\big(\frac{a - b}{2\sqrt{\xi}}\big)}{a - b}. \]
This yields that the eigenvalues in a box of size $L$ are spaced asymptotically as
1√ . 2L ξ
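The $p = 0$ limit of the scaled kernel stated above can be sanity-checked in closed form. The sketch below assumes the unperturbed Neumann solution $u(\xi, x) = \cos(\sqrt{\xi}\,x)$ with $\xi > 0$, for which the integral defining $S_L$ follows from the product-to-sum identity; the function names are hypothetical and the check covers only this free case.

```python
import math


def kernel_free(xi, zeta, L):
    """S_L(xi, zeta) = int_0^L cos(sqrt(xi) t) cos(sqrt(zeta) t) dt for xi, zeta > 0."""
    k, m = math.sqrt(xi), math.sqrt(zeta)
    if k == m:
        return 0.5 * (L + math.sin(2 * k * L) / (2 * k))
    return 0.5 * (math.sin((k - m) * L) / (k - m) + math.sin((k + m) * L) / (k + m))


def scaled_ratio(xi, a, b, L):
    """Left-hand side of the p = 0 limit at finite L."""
    return kernel_free(xi + a / L, xi + b / L, L) / kernel_free(xi, xi, L)


def sine_kernel_limit(xi, a, b):
    """Right-hand side: sin(t)/t with t = (a - b) / (2 sqrt(xi))."""
    t = (a - b) / (2 * math.sqrt(xi))
    return 1.0 if t == 0 else math.sin(t) / t


if __name__ == "__main__":
    for L in (1e2, 1e4, 1e6):
        print(L, scaled_ratio(4.0, 1.0, -2.0, L), sine_kernel_limit(4.0, 1.0, -2.0))
```

The finite-$L$ ratio differs from the limit by terms of order $1/L$, so the two printed columns agree more closely as $L$ grows.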
2. The Perturbed Periodic Potential

Let $e$ be the essential spectrum of a Schrödinger operator with period $P$ periodic potential $p$ and either Neumann or Dirichlet boundary condition. The goal of this section is to show

Proposition 2.1. A Schrödinger operator with essential spectrum $e$ and potential $V(x) = p(x) + q(x)$, where $p$ is periodic and continuous and $\frac{1}{x}\int_0^x |q(t)|\,dt \to 0$, satisfies regularity bounds.

Fix $\epsilon > 0$ and let $\frac{1}{x}\int_0^x |p(t) + q(t)|\,dt \le M$ for $x > x_0$, for some $x_0$. To prove (1.22), it is sufficient to show that $\int_0^L u(\xi, x)^2\,dx \le C e^{\epsilon L}$ separately for three cases of $\xi$, where $C$ is uniform in $\xi$, $L$:
(1) $\xi > \frac{4M^2}{\epsilon^2}$, shown in Lemma 2.2.
(2) $\xi \le \frac{4M^2}{\epsilon^2}$, $\xi$ in the interior of $e$, but slightly away from the endpoints of the intervals, i.e. $\xi \in (\cup[l_n + \epsilon, r_n - \epsilon]) \cap \big[0, \frac{4M^2}{\epsilon^2}\big]$, shown in Lemma 2.3.
(3) $\xi \le \frac{4M^2}{\epsilon^2}$ and $\xi$ near the interval endpoints, i.e. $\xi \in (\cup[l_n - \epsilon, l_n + \epsilon] \cup [r_n - \epsilon, r_n + \epsilon]) \cap \big[0, \frac{4M^2}{\epsilon^2}\big]$, shown in Lemma 2.4.
The three cases are illustrated in Fig. 1
Fig. 1. Three cases for the proof of regularity of the perturbed periodic operator
Lemma 2.2. Let $A = -\frac{d^2}{dx^2} + V(x)$ be a Schrödinger operator such that $\frac{1}{x}\int_0^x |V(t)|\,dt$ is bounded in $x$ as $x \to \infty$. Then the solutions $u, y$ of the eigenvalue equation satisfy
\[ u(\xi, x) \le C \exp\Big( \frac{\int_0^x |V(t)|\,dt}{\sqrt{\xi}} \Big), \tag{2.1} \]
\[ y(\xi, x) \le C \exp\Big( \frac{\int_0^x |V(t)|\,dt}{\sqrt{\xi}} \Big). \tag{2.2} \]

Proof. Using successive approximations, we can perturb about the solutions with $V = 0$. Chadan-Sabatier ((I.2.3), (I.2.4), (I.2.6), (I.2.8a) [2]) show (2.2), and using $\cos(\sqrt{\xi}\,x)$ as initial data, instead of (I.2.3), gives (2.1).

This lemma indeed implies that for $\sqrt{\xi} \ge \frac{2M}{\epsilon}$ the solution $u$ satisfies $u(x) \le C e^{\frac{1}{2}\epsilon x}$, which implies $\int_0^L u(\xi, x)^2\,dx \le C e^{\epsilon L}$.
Lemma 2.3. Let $[l_n, r_n]$ be a band of the spectrum for a Schrödinger operator $A = -\frac{d^2}{dx^2} + q(x) + p(x)$ with periodic and continuous $p$ and non-destructive zero-average $q$ (Definition 1.2). Then the solution $u$ of the eigenvalue equation with the Neumann boundary condition satisfies $\int_0^L u(\xi, x)^2\,dx \le C e^{\epsilon L}$ for $\xi \in (\cup[l_n + \epsilon, r_n - \epsilon]) \cap [0, R]$, where $R = \frac{4M^2}{\epsilon^2}$, and the same holds for the solution with the Dirichlet boundary condition.

Proof. Let $u_p(\xi, x)$, $y_p(\xi, x)$ be the solutions of $A^\# = -\frac{d^2}{dx^2} + p(x)$ with boundary conditions $u_p(\xi, 0) = 1 = y_p'(\xi, 0)$, $y_p(\xi, 0) = 0 = u_p'(\xi, 0)$. By Floquet’s theorem (for example Sect. 1.2 of [9] and Theorem XIII.89 of [11]), there exists a solution $f(\xi, x) = e^{i\theta(\xi)x}\phi(\xi, x)$, where $\phi$ is periodic in $x$ with period $P$. We normalize $f'(\xi, 0) = 1$. The exponent $\theta(\xi)$ is not 0 or $\pi$ away from band endpoints, so that $f$ is linearly independent of $\bar{f}$ for $\xi \in \cup[l_n + \epsilon, r_n - \epsilon]$. Then $u(\xi, x) = a_1(\xi) f(\xi, x) + a_2(\xi)\bar{f}(\xi, x)$. We solve for $a_1, a_2$ in terms of $\xi$. We get that
\[ 1 = u(\xi, 0) = a_1(\xi) f(\xi, 0) + a_2(\xi)\bar{f}(\xi, 0), \]
\[ 0 = u'(\xi, 0) = a_1(\xi) f'(\xi, 0) + a_2(\xi)\bar{f}'(\xi, 0) = a_1(\xi) + a_2(\xi), \]
so that $a_1(\xi) = -a_2(\xi)$. Substituting, $1 = a_1(\xi) f(\xi, 0) - a_1(\xi)\bar{f}(\xi, 0)$, we get
\[ a_1(\xi) = (2i\,\mathrm{Im} f(\xi, 0))^{-1} = -a_2(\xi). \tag{2.3} \]
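Since $f$, $\bar{f}$ are independent, $\mathrm{Im} f \ne 0$ and, by Theorem XIII.89 of [11], $f$ is analytic in $\xi$ on $[l_n + \epsilon, r_n - \epsilon]$. This implies that $a_1, a_2$ are analytic as well. The function $|f|$ is continuous in both $x$ and $\xi$ on $[0, P] \times (\cup[l_n + \epsilon, r_n - \epsilon] \cap [0, R])$, therefore it achieves its maximum on this set. Since $|f|$ is periodic and continuous in $x$ with period $P$, the maximum of $|f|$ in $x$ for fixed $\xi$ occurs on $[0, P]$. This implies that $u_p(\xi, x) \le K$, where $K$ is constant in $x$ and $\xi \in \cup[l_n + \epsilon, r_n - \epsilon] \cap [0, R]$. We use the method of variation of parameters about $u_p(\xi, -)$, $y_p(\xi, -)$ and the Gronwall inequality. Let $a, b$ be given by
\[ \begin{pmatrix} u(x) \\ u'(x) \end{pmatrix} = \begin{pmatrix} u_p(x) & y_p(x) \\ u_p'(x) & y_p'(x) \end{pmatrix} \begin{pmatrix} a(x) \\ b(x) \end{pmatrix}. \tag{2.4} \]
Then we get that
\[ \begin{pmatrix} a(x) \\ b(x) \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \int_0^x q(t) \begin{pmatrix} -y_p u_p & -y_p^2 \\ u_p^2 & u_p y_p \end{pmatrix} \begin{pmatrix} a(t) \\ b(t) \end{pmatrix} dt, \]
\[ \Big\| \begin{pmatrix} a(x) \\ b(x) \end{pmatrix} \Big\| \le 1 + K_1 \int_0^x |q(t)|\, \Big\| \begin{pmatrix} a(t) \\ b(t) \end{pmatrix} \Big\|\,dt, \]
where $K_1 \ge |y_p u_p| + y_p^2 + u_p^2 + |u_p y_p|$ is constant in $x$ and $\xi$ by the argument above. We apply the Gronwall inequality to this integral equation to get
\[ |a(x)| + |b(x)| \le K_2\, e^{K_1 \int_0^x |q(t)|\,dt}. \tag{2.5} \]
Then we take the matrix norm in (2.4) and, recalling that $\frac{1}{x}\int_0^x |q(t)|\,dt \to 0$, we get (1.22) for large $L$, and for all $L$ by choosing $C$ appropriately.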
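Lemma 2.4. Let $[l_n, r_n]$ be a band of the spectrum for a Schrödinger operator $A = -\frac{d^2}{dx^2} + q(x) + p(x)$ with continuous periodic $p$ and non-destructive zero-average $q$ (Definition 1.2). Then the solution $u$ of the eigenvalue equation with Neumann boundary condition satisfies $\int_0^L u(\xi, x)^2\,dx \le C e^{\epsilon L}$ for
\[ \xi \in (\cup[l_n - \epsilon, l_n + \epsilon] \cup [r_n - \epsilon, r_n + \epsilon]) \cap \Big[0, \frac{4M^2}{\epsilon^2}\Big]. \]
The same holds for the solution with the Dirichlet boundary condition.

Proof. Let $\xi \in [l_n - \epsilon, l_n + \epsilon]$. We once again use the method of variation of parameters, but this time about the solutions $u_p(-, l_n + \epsilon)$ and $y_p(-, l_n + \epsilon)$, i.e. the periodic solutions as before but at $\xi = l_n + \epsilon$ fixed. As in the previous lemma, $u_p(x, l_n + \epsilon), y_p(x, l_n + \epsilon) < K$, where $K$ is constant in $x$ and $\xi \in \{l_n, r_n\}_{n \in \mathbb{N}} \cap \big[0, \frac{4M^2}{\epsilon^2}\big]$. We get
\[ \begin{pmatrix} a(x) \\ b(x) \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \int_0^x \big(l_n + \epsilon - \xi + q(t)\big) \begin{pmatrix} -y_p u_p & -y_p^2 \\ u_p^2 & u_p y_p \end{pmatrix} \begin{pmatrix} a(t) \\ b(t) \end{pmatrix} dt, \]
\[ \Big\| \begin{pmatrix} a(x) \\ b(x) \end{pmatrix} \Big\| \le 1 + K_1 \int_0^x \big(2\epsilon + |q(t)|\big)\, \Big\| \begin{pmatrix} a(t) \\ b(t) \end{pmatrix} \Big\|\,dt. \]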
As in proof of the previous lemma, applying the Gronwall inequality and picking C appropriately we get (1.22). The three lemmas imply Proposition 2.1. From Lemma 2.2 we get (1.22) for large ξ . This leaves only finitely many bands, so it suffices to consider the remaining bands one at a time as in Lemmas 2.4 and 2.3.
3. Variational Principle and the Christoffel-Darboux Formula

We let $T_L F(\xi) = \int F(\zeta)\,S_L(\xi, \zeta)\,d\mu(\zeta)$, where $d\mu = d(s\nu)$ is a scalar multiple of a spectral measure $d\nu$. We show that $T_L$ is the orthogonal projection onto $H_L$. We first show

Lemma 3.1. The function $\cos(\sqrt{\xi}\,N)$ is fixed by $T_L$ for $N \le L$.

Proof. Let $u$ be the solution associated to $d\mu$. There exists a continuous integration kernel $M$ ([3], (1.2.5”) [10]) such that
\[ \frac{\cos(\sqrt{\xi}\,x)}{\sqrt{s}} = u(\xi, x) + \int_0^x M(x, t)\,u(\xi, t)\,dt. \tag{3.1} \]
Substituting this expression for $\frac{\cos(\sqrt{\xi}\,x)}{\sqrt{s}}$ in evaluating $T_L\big(\frac{\cos(\sqrt{\xi}\,x)}{\sqrt{s}}\big)$, we check
\[ \int \frac{\cos(\sqrt{\xi}\,N)}{\sqrt{s}}\,S_L(\zeta, \xi)\,d\mu(\xi) = u(\zeta, N) + \int_0^N M(N, t) \int u(\xi, t)\,S_L(\zeta, \xi)\,d\mu(\xi)\,dt = u(\zeta, N) + \int_0^N M(N, t)\,u(\zeta, t)\,dt = \frac{\cos(\sqrt{\zeta}\,N)}{\sqrt{s}}. \]
Here we use Fubini’s theorem, the reproducing property of $S_L$ (noting that $N \le L$), and we recover the last equality again by (3.1). We then show that $T_L$ fixes $\pi_N \in H_N$ for $N \le L$.

Corollary 3.2. If $\pi_N(\xi) = \int_0^N f(x)\cos(\sqrt{\xi}\,x)\,dx$ for some function $f \in L^2([0, N])$ and $N \le L$, then $\pi_N(\xi) = \int \pi_N(\zeta)\,S_L(\xi, \zeta)\,d\mu(\zeta)$.

Proof. This is a straightforward calculation, using (3.1):
\[ \int \pi_N(\zeta)\,S_L(\xi, \zeta)\,d\mu(\zeta) = \int \int_0^N f(x)\cos(\sqrt{\zeta}\,x)\,S_L(\xi, \zeta)\,dx\,d\mu(\zeta) = \int_0^N f(x) \int \cos(\sqrt{\zeta}\,x)\,S_L(\xi, \zeta)\,d\mu(\zeta)\,dx = \pi_N(\xi). \]
Here we make use of Fubini’s theorem and Lemma 3.1.

Theorem 3.3. The operator $(T_L \pi_N)(\xi) = \int \pi_N(\zeta)\,S_L(\xi, \zeta)\,d\mu(\zeta)$ is an orthogonal projection onto the Hilbert space $H_L$.

Proof. To show that $T_L$ is a projection, by Corollary 3.2, it suffices to show that $T_L \pi_N(\xi) \in H_L$ for $N \ge L$. Recalling that $\pi_N(\xi) = \int_0^N f(x)\cos(\sqrt{\xi}\,x)\,dx$, we compute:
\[ \int \pi_N(\zeta)\,S_L(\xi, \zeta)\,d\mu(\zeta) = \int d\mu(\zeta) \Big( \int_0^L + \int_L^N \Big) f(x)\cos(\sqrt{\zeta}\,x)\,S_L(\xi, \zeta)\,dx = \pi_L(\xi) + \int d\mu(\zeta) \int_L^N f(x)\cos(\sqrt{\zeta}\,x)\,S_L(\xi, \zeta)\,dx. \]
We substitute (3.1) for $\cos(\sqrt{\zeta}\,x)$ to get
\[ \int d\mu(\zeta) \int_L^N f(x)\cos(\sqrt{\zeta}\,x)\,S_L(\xi, \zeta)\,dx = \int d\mu(\zeta) \int_L^N f(x) \Big[ u(\zeta, x) + \int_0^x M(x, t)\,u(\zeta, t)\,dt \Big] S_L(\xi, \zeta)\,dx. \]
By Fubini and the reproducing property of the kernel, the first term is 0. The second term is $\int_0^L g(t)\,u(\xi, t)\,dt$, where $g(t) = \int_L^N f(x)\,M(x, t)\,dx$, and $\int_0^L g(t)\,u(\zeta, t)\,dt \in H_L$ (by (1.2.10) in [10]). We next check that $T$ is self-adjoint:
\[ \langle g, T f \rangle_{d(s\mu)} = \int d\mu(\xi)\,g(\xi) \int d\mu(\zeta)\,f(\zeta)\,S_L(\xi, \zeta) = \int d\mu(\zeta)\,f(\zeta) \int d\mu(\xi)\,g(\xi)\,S_L(\zeta, \xi), \]
since our definition of $S_L$ is symmetric in $\zeta$ and $\xi$.
We now prove Theorem 1.6.

Proof. Fixing $\xi_0 \in \mathbb{C}$ we consider
\[ \inf\Big\{ \|\pi\|^2 : \pi_L(\xi) = \int_0^L f(x)\cos(\sqrt{\xi}\,x)\,dx;\ \pi(\xi_0) = 1 \Big\}. \tag{3.2} \]
If $\phi \ne 0$ is in some Hilbert space $H$, then
\[ \min\{ \|\psi\|^2 : \langle \psi, \phi \rangle = 1 \} = \frac{1}{\|\phi\|^2} \tag{3.3} \]
and the minimizer is given by $\frac{\phi}{\|\phi\|^2}$ (Proposition 1.2.1 of [12]). In our case, the Hilbert space is $H_L$. The condition that $\pi(\xi_0) = 1$ is equivalent to
\[ 1 = \pi(\xi_0) = \int d\mu(\zeta)\,\pi(\zeta)\,S_L(\zeta, \xi_0) = \langle \pi, S_L(-, \xi_0) \rangle. \]
The proposition is applicable with $\phi(\xi) = S_L(\xi, \xi_0) \in H_L$ as shown above. Therefore the minimum is equal to
\[ \frac{1}{\|S_L(-, \xi_0)\|^2} = \frac{1}{S_L(\xi_0, \xi_0)} \]
and the minimizer is $S_L(\xi, \xi_0)/S_L(\xi_0, \xi_0)$.
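The Hilbert-space fact (3.3) behind this proof can be checked in a finite-dimensional toy model. The sketch below is an illustration only: it replaces $H_L$ by $\mathbb{R}^3$ with the Euclidean inner product, and the function names are hypothetical. It constructs the claimed minimizer $\phi/\|\phi\|^2$ and compares it with randomly generated vectors that satisfy the same constraint $\langle\psi, \phi\rangle = 1$.

```python
import random


def inner(a, b):
    """Euclidean inner product, standing in for the inner product (1.17)."""
    return sum(x * y for x, y in zip(a, b))


def minimizer(phi):
    """Claimed minimizer phi / ||phi||^2 of ||psi||^2 subject to <psi, phi> = 1."""
    norm_sq = inner(phi, phi)
    return [x / norm_sq for x in phi]


def random_feasible(phi, rng):
    """A random psi with <psi, phi> = 1: the minimizer plus a vector orthogonal to phi."""
    v = [rng.gauss(0.0, 1.0) for _ in phi]
    coeff = inner(v, phi) / inner(phi, phi)
    orthogonal_part = [vi - coeff * pi for vi, pi in zip(v, phi)]
    return [m + o for m, o in zip(minimizer(phi), orthogonal_part)]


if __name__ == "__main__":
    rng = random.Random(0)
    phi = [1.0, 2.0, -0.5]
    best = minimizer(phi)
    print("minimum:", inner(best, best), "1/||phi||^2:", 1.0 / inner(phi, phi))
    for _ in range(3):
        psi = random_feasible(phi, rng)
        print(inner(psi, phi), inner(psi, psi))  # constraint holds, norm is not smaller
```

Every feasible vector decomposes as the minimizer plus a component orthogonal to $\phi$, which is why no feasible vector can have smaller norm.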
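We show the analogue of the Christoffel-Darboux formula here:

Lemma 3.4.
\[ S_L(\alpha, \beta) = \frac{u(\alpha, L)\,u'(\beta, L) - u(\beta, L)\,u'(\alpha, L)}{\alpha - \beta}. \tag{3.4} \]

Proof. By the eigenvalue equation,
\[ u(\alpha, x)\,u''(\beta, x) = u(\alpha, x)\,(V(x) - \beta)\,u(\beta, x), \]
\[ u(\beta, x)\,u''(\alpha, x) = u(\beta, x)\,(V(x) - \alpha)\,u(\alpha, x). \]
We subtract to get
\[ u(\alpha, x)\,u''(\beta, x) - u(\beta, x)\,u''(\alpha, x) = (\alpha - \beta)\,u(\alpha, x)\,u(\beta, x). \tag{3.5} \]
Integrating both sides $dx$ from 0 to $L$, we get the desired formula. The left-hand side has to be integrated by parts; the terms containing $u'(\alpha, x)\,u'(\beta, x)$ cancel, leaving only boundary terms:
\[ \int_0^L \big[ u(\alpha, x)\,u''(\beta, x) - u(\beta, x)\,u''(\alpha, x) \big]\,dx = u(\alpha, L)\,u'(\beta, L) - u(\alpha, 0)\,u'(\beta, 0) - u(\beta, L)\,u'(\alpha, L) + u(\beta, 0)\,u'(\alpha, 0) = u(\alpha, L)\,u'(\beta, L) - u(\beta, L)\,u'(\alpha, L), \]
for any boundary condition given at 0 and independent of $\alpha$, $\beta$, such as Dirichlet or Neumann.

On the diagonal, the Christoffel-Darboux formula becomes
\[ S_L(\xi, \xi) = u'(\xi, L)\,\frac{d}{d\xi}u(\xi, L) - \frac{d}{d\xi}u'(\xi, L)\,u(\xi, L). \tag{3.6} \]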
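The Christoffel-Darboux analogue (3.4) and its diagonal form (3.6) can be checked numerically in the simplest case. The sketch below assumes $V = 0$ with the Neumann condition, so that $u(\xi, x) = \cos(\sqrt{\xi}\,x)$, and compares the boundary expression at $x = L$ with a direct Simpson-rule evaluation of the integral (1.8). All function names are hypothetical and the check covers only this free case.

```python
import math


def u(xi, x):
    """Neumann solution for V = 0 and xi > 0."""
    return math.cos(math.sqrt(xi) * x)


def du_dx(xi, x):
    """Derivative of u with respect to x."""
    k = math.sqrt(xi)
    return -k * math.sin(k * x)


def kernel_by_simpson(alpha, beta, L, steps=20_000):
    """Composite Simpson approximation of int_0^L u(alpha, x) u(beta, x) dx (steps even)."""
    h = L / steps
    total = u(alpha, 0.0) * u(beta, 0.0) + u(alpha, L) * u(beta, L)
    for j in range(1, steps):
        x = j * h
        total += (4 if j % 2 else 2) * u(alpha, x) * u(beta, x)
    return total * h / 3


def kernel_by_cd(alpha, beta, L):
    """Right-hand side of (3.4), evaluated at the endpoint x = L."""
    return (u(alpha, L) * du_dx(beta, L) - u(beta, L) * du_dx(alpha, L)) / (alpha - beta)


def diagonal_by_cd(xi, L):
    """Right-hand side of (3.6), using the explicit xi-derivatives of cos and its x-derivative."""
    k = math.sqrt(xi)
    du_dxi = -L * math.sin(k * L) / (2 * k)
    ddu_dxi = -math.sin(k * L) / (2 * k) - (L / 2) * math.cos(k * L)
    return du_dx(xi, L) * du_dxi - ddu_dxi * u(xi, L)


if __name__ == "__main__":
    print(kernel_by_simpson(2.0, 3.0, 10.0), kernel_by_cd(2.0, 3.0, 10.0))
    print(kernel_by_simpson(2.0, 2.0, 10.0), diagonal_by_cd(2.0, 10.0))
```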
4. Bounds on the Diagonal Kernel

We will show the analogue of Lemma 3.1 in Simon [14]. Assume regularity bounds (1.22) on the measure $d\mu$. Let
\[ Q_L(\xi, \xi_0) = \frac{S_L(\xi, \xi_0)}{S_L(\xi_0, \xi_0)} \tag{4.1} \]
be the minimizer in (3.2).

Lemma 4.1. Let $d\mu$ be a measure that satisfies regularity bounds. Then for all $\epsilon > 0$ there exist $C$, $\delta_1$ such that $|Q_L(\xi)| \le C e^{\epsilon L}\lambda_L(\xi_0)$ for $\xi \in \{\xi : \mathrm{dist}(\xi, e) \le \delta_1\}$.

Proof. Fix $\epsilon$. A regularity bound (1.22) on a measure $d\mu$ implies a bound on $|S_L(\xi, \xi_0)|$ by Cauchy-Schwarz:
\[ |S_L(\xi, \xi_0)| \le \Big( \int_0^L u(\xi, x)^2\,dx \Big)^{1/2} \Big( \int_0^L u(\xi_0, x)^2\,dx \Big)^{1/2} \le C e^{\epsilon L}. \]
To show Lemma 4.3 we need the following fact about the spectral measure:

Lemma 4.2. Let $A$ be a self adjoint Schrödinger operator and $d\mu$ be a scalar multiple of its spectral measure. Then for $n \ge 2$ there exists a constant $K$ such that
\[ \int_2^\infty \frac{d\mu(\xi)}{\xi^n} \le K\,2^{-n}. \tag{4.2} \]

Proof. This follows easily from Marchenko (Theorem 2.4.2 of [10]) for the Neumann boundary condition and Sect. 6 of Gesztesy-Simon [4] for the Dirichlet boundary condition.

Lemma 4.3. Suppose $d\mu(\xi) = w(\xi)\,d\xi + d\mu_s$, $d\mu^*(\xi) = w^*(\xi)\,d\xi + d\mu^*_s$ are two unnormalized spectral measures with $\sigma_{\mathrm{ess}}(d\mu) = \sigma_{\mathrm{ess}}(d\mu^*) = e$. Suppose $d\mu$, $d\mu^*$ satisfy regularity bounds and have finitely many eigenvalues outside of $\{\xi : \mathrm{dist}(\xi, e) < \delta_1\}$ for any $\delta_1 > 0$. Let $I \subset e^{\mathrm{int}}$ be a closed and bounded interval such that $w$, $w^*$ are continuous and strictly positive on $I$ and $(\mathrm{supp}(d\mu_s) \cup \mathrm{supp}(d\mu^*_s)) \cap I = \emptyset$. Let $\xi_0 \in I$ and $\xi(L) \to \xi_0$ as $L \to \infty$. Then for all sufficiently small $\delta$ and all $\epsilon > 0$ and all $M$ there exist $\gamma < 1$, $C$, $n$ such that for all $N > n + 1$,
\[ \lambda_L(\xi_0, \mu^*) \le \sup_{|\xi - \xi_0| < \delta} \Big[ \frac{w^*(\xi)}{w(\xi)} \Big] \lambda_M(\xi_0, \mu) + C e^{2\epsilon M}\gamma^N + C e^{2\epsilon M}\,2^{-2N}, \tag{4.3} \]
where $L = M + \frac{\pi}{4\xi_0}N$.

Proof. We use the methods of Lubinsky [8] and Simon [14]. Let $Q_M$ be the minimizing function for the measure $\mu$ and
\[ F(\xi) = \frac{4\xi_0}{T\pi} \Bigg( \frac{\sin\big( \frac{\pi}{4\xi_0}(\xi - \xi_0) \big)}{\xi - \xi_0} + \frac{\sin\big( \frac{\pi}{4\xi_0}(\xi + \xi_0) \big)}{\xi + \xi_0} \Bigg), \tag{4.4} \]
where $T = 1 + \frac{2}{\pi}$. We notice that
(1) $|F(\xi_0)| = 1$,
(2) $|F(\xi)| < \gamma$ whenever $|\xi - \xi_0| \ge \delta$, for some $0 < \gamma < 1$ depending on $\delta$, and
(3) $|F(\xi)| < \frac{C\xi_0}{|\xi - \xi_0|}$ whenever $|\xi - \xi_0| > 1$.
The function $F$ is just $\frac{\sin(\xi)}{\xi}$ shifted so that 0 is at $\xi_0$, scaled so that exactly one period of the sine happens between 0 and $\xi_0$, then symmetrized to make it even, and then scaled by a factor of $\frac{1}{T}$ again to make $F(\xi_0) = 1$. Since $\frac{\sin \xi}{\xi} = \int_0^1 \cos(\xi x)\,dx$, $F$ is a Fourier transform of some even function $f$ supported on $\big[ -\frac{\pi}{4\xi_0}, \frac{\pi}{4\xi_0} \big]$, and $F^N$ is the Fourier transform of an even function with support in $\big[ -\frac{N\pi}{4\xi_0}, \frac{N\pi}{4\xi_0} \big]$.

Fix $\epsilon$. Since the measures $d\mu$ and $d\mu^*$ are essentially supported on the same set $e$, we can let $\delta_1$ be as in the definition of regularity bounds (1.22) for both measures. Let $e_{\delta_1} = \{\xi : \mathrm{dist}(\xi, e) < \delta_1\}$. We label the mass points of $d\mu^*$ outside $e_{\delta_1}$ with $\{\xi_1, \xi_2, \xi_3, \dots, \xi_n\}$. We can construct a polynomial $P$ with zeros at $\xi_1, \dots, \xi_n$ and a local maximum at $\xi_0$ of $P(\xi_0) = 1$ with degree $n + 1$.
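As a small numerical illustration of the auxiliary function in (4.4), the sketch below evaluates $F(\xi) = \frac{4\xi_0}{T\pi}\big[\frac{\sin(c(\xi - \xi_0))}{\xi - \xi_0} + \frac{\sin(c(\xi + \xi_0))}{\xi + \xi_0}\big]$ with $c = \frac{\pi}{4\xi_0}$ and $T = 1 + \frac{2}{\pi}$, which is an assumed reading of the formula. It checks only the normalization $F(\xi_0) = 1$ and a decay bound of the type in property (3) for $\xi \ge 0$; property (2) is not tested, and the function names are hypothetical.

```python
import math


def peak_function(xi, xi0):
    """F from (4.4), with the removable singularities at xi = +-xi0 filled in by the limit value."""
    c = math.pi / (4.0 * xi0)
    T = 1.0 + 2.0 / math.pi

    def sinc_term(d):
        # sin(c*d)/d, with the limit value c at d = 0
        return c if d == 0 else math.sin(c * d) / d

    return (sinc_term(xi - xi0) + sinc_term(xi + xi0)) / (T * c)


def decay_bound(xi, xi0):
    """Elementary bound 8*xi0 / (T*pi*|xi - xi0|), valid for xi >= 0 because |sin| <= 1."""
    T = 1.0 + 2.0 / math.pi
    return 8.0 * xi0 / (T * math.pi * abs(xi - xi0))


if __name__ == "__main__":
    xi0 = 3.0
    print("F(xi0) =", peak_function(xi0, xi0))
    for xi in (5.0, 10.0, 50.0, 500.0):
        print(xi, abs(peak_function(xi, xi0)), decay_bound(xi, xi0))
```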
Then let Q(ξ ) = Q M (ξ, ξ0 , μ)F N P. Since Q(ξ0 ) = 1, by the minimizing property of λ L , Q2HL (dμ∗ ) ≥ λ L (ξ0 , μ∗ ). We then find a bound on Q2HL (dμ∗ ) from above, Q =
∗
|Q(ξ )| dμ (ξ ) =
2
2
|ξ −ξ0 |<δ
+
|ξ −ξ0 |≥δ
|Q(ξ )|2 dμ∗ (ξ ).
Both F and P have a local maximum of 1 at ξ0 , so we see that |ξ −ξ0 |<δ
|Q(ξ )|2 dμ∗ (ξ ) ≤ ≤
sup
w ∗ (ξ ) w(ξ )
sup
w ∗ (ξ ) λ M (ξ0 , μ). w(ξ )
|ξ0 −ξ |<δ
|ξ0 −ξ |<δ
|ξ0 −ξ |<δ
|Q M (ξ )|2 dμ(ξ )
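The measure $\mu^*$ is pure point on $\mathbb{R}\setminus e_{\delta_1}$ and the zeros of $P$ coincide with the mass points of $\mu^*$, so integrating $|F^N P|^2$ over the set $e_{\delta_1}$ is the same as integrating over $\mathbb{R}$. We use (1.22) to show that the integral of $|Q^2|$ over $|\xi-\xi_0| \ge \delta$ is small for large $N$:
$$\int_{|\xi-\xi_0|\ge\delta}|Q(\xi)|^2\,d\mu^*(\xi) \le \frac{C\lambda_M(\xi_0)e^{4\varepsilon M}}{T}\int_{|\xi-\xi_0|\ge\delta,\ \xi\in e_{\delta_1}}|F(\xi)^N P(\xi)|^2\,d\mu^*(\xi) \le \frac{C\lambda_M(\xi_0)e^{4\varepsilon M}}{T}\left(\int_{\delta\le|\xi-\xi_0|\le 2} + \int_{|\xi-\xi_0|>2}\right)|F(\xi)|^{2N}P^2(\xi)\,d\mu^*(\xi).$$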
We have split the integral into two pieces: one that is close to $\xi_0$ and one that is far. For the close piece, since 1 is a maximum of $F$ on $[\xi_0-2, \xi_0+2]$ there exists $\gamma < 1$ such that $F(\xi) < \gamma$ on $\{\xi : \delta < |\xi-\xi_0| \le 2\}$. Therefore,
$$\int_{\{\xi:|\xi-\xi_0|\le 2\}\setminus[\xi_0-\delta,\xi_0+\delta]}|F(\xi)|^{2N}P^2(\xi)\,d\mu^*(\xi) \le C\gamma^{2N}.$$
For the second piece,
$$\int_{|\xi-\xi_0|>2}|F(\xi)|^{2N}P^2(\xi)\,d\mu^*(\xi) \le \int_{|\xi-\xi_0|>2}\frac{C_{\xi_0}\,\xi^{2n+2}}{(\xi-\xi_0)^{2N}}\,d\mu^*(\xi) \le C_{\xi_0}2^{-2N},$$
for $N > n + 1$. The last bound follows from Lemma 4.2. Since $\xi_0 \in I \subset e^{int}$ for a compact interval $I$ and $\lambda_M(\xi_0)$ is continuous on $I$, we can choose $C$ that is uniform in $\xi_0$ on $I$ in Lemma 4.3.
We now prove Theorem 1.8. Suppose $d\mu^*$, $d\mu$, $I$ as in the theorem and let $\xi(L) \to \xi_0 \in I$. Fix $\delta$, $\varepsilon$. Let $\delta_1$ be small enough so that regularity bounds (1.22) hold for both $\mu$, $\mu^*$ on $E_{\delta_1}$ and let $n$ be the number of mass points of $\mu^*$ outside of $E_{\delta_1}$. Pick $N_1, N_2 > (n+1)/\varepsilon$ so that $(1/2)^{N_1} < e^{-4\varepsilon}$ and $\gamma^{N_2} < e^{-4\varepsilon}$. Let $N_3 = \max\{N_1, N_2\}$ and $N = 2N_3M$, so that Lemma 4.3 is applicable, and the sum of the second and third terms in (4.3) is $O(e^{-\varepsilon M})$. Divide by $\lambda_L(\xi_0,\mu)$ to get
$$\frac{\lambda_L(\xi_0,\mu^*)}{\lambda_L(\xi_0,\mu)} \le \sup_{|\xi-\xi_0|<\delta}\frac{w^*(\xi)}{w(\xi)}\,\frac{\lambda_M(\xi_0,\mu)}{\lambda_L(\xi_0,\mu)} + O(e^{-2\varepsilon M})\,S_L(\xi_0,\xi_0,\mu). \qquad (4.5)$$
From regularity bounds (1.22) on $\mu$ and for fixed $N$, the second term on the right-hand side tends to 0 as $M \to \infty$:
$$O(e^{-2\varepsilon M})\,S_L(\xi_0,\xi_0,\mu) \le O(e^{-2\varepsilon M})\,Ce^{\varepsilon\left(M+\frac{\pi}{4\xi_0}N\right)} = O(e^{-\varepsilon M}).$$
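Then we take $\inf_{|\xi-\xi_0|<\delta}$ on both sides of (4.5) and we adjust the sup accordingly to get
$$\inf_{|\xi-\xi_0|<\delta}\frac{\lambda_L(\xi,\mu^*)}{\lambda_L(\xi,\mu)} \le \sup_{|\xi-\xi_0|<2\delta}\frac{w^*(\xi)}{w(\xi)}\,\inf_{|\xi-\xi_0|<\delta}\frac{\lambda_M(\xi,\mu)}{\lambda_L(\xi,\mu)}.$$
We then let $\delta \to 0$, then $M \to \infty$, and then $\varepsilon \to 0$. We get by continuity and positivity of $w$ that
$$\liminf_{L\to\infty}\frac{\lambda_L(\xi(L),\mu^*)}{\lambda_L(\xi(L),\mu)} \le \frac{w^*(\xi_0)}{w(\xi_0)}.$$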
To get the opposite inequality, we can interchange $\mu$ and $\mu^*$ in (4.3), use the corresponding $N$ given by the same formula, and divide by $\lambda_L(\xi_0,\mu^*)$. All arguments given are uniform in $\xi_0 \in I$.

5. Calculation of the Reproducing Kernel in the Case of a Periodic Potential

As in Gesztesy–Zinchenko ((2.8) of [5]), for $z \in \mathbb{C}\setminus\mathbb{R}$ let $\psi(z,-) \in L^2$, with $\psi(z,0) = 1$. Then the m-function is given by
$$\psi(z,x) = y(z,x) - m(z)u(z,x). \qquad (5.1)$$
Similarly let $\tilde\psi$ be the $L^2$ solution with $\tilde\psi(z,0) = 1$. Then the corresponding m-function is given by
$$\tilde\psi(z,x) = u(z,x) + \tilde m\,y(z,x). \qquad (5.2)$$
Theorem 5.1. Let $A^\# = -\frac{d^2}{dx^2} + p$ be a Schrödinger operator with continuous periodic potential $p$ and either the Neumann or the Dirichlet boundary condition, and let $\rho(\xi)\,d\xi$ be its density of states. Let $\xi_0 \in I \subset \sigma_{ess}(A^\#)^{int}$, where $I$ is a closed and bounded interval. Then for $a, b \in \mathbb{R}$ uniformly in $I$,
(1)
$$\lim_{L\to\infty}\frac{S_L(\xi_0,\xi_0)}{\pi L} = \frac{\rho(\xi_0)}{w(\xi_0)} \qquad (5.3)$$
and
(2)
$$\lim_{L\to\infty}\frac{S_L\left(\xi_0+\frac{a}{L},\,\xi_0+\frac{b}{L}\right)}{S_L(\xi_0,\xi_0)} = \frac{\sin(\pi\rho(\xi_0)(b-a))}{\pi\rho(\xi_0)(b-a)}. \qquad (5.4)$$
(3) Furthermore, (1.25) is satisfied. Proof. The methods used here are similar to [14]. (1) We first show convergence then uniformity. We use the well known formula relating the ρ(ξ ) and G, where G is the Green’s function. Gesztesy–Zinchenko ((2.18) of [5]) gives the Green’s function explicitly, so we compute: 1 ρ(ξ ) = lim lim L→∞ L ↓0 = lim
L→∞
1 lim L ↓0
L
(G(x, x, ξ + i))d x
0 L
(u(ξ + i, x)ψ(ξ + i, x))d x
0
L 1 lim m(ξ + i) u(ξ, x)2 d x L→∞ L ↓0 0 w(ξ ) L = lim u(ξ, x)2 d x. L→∞ π L 0
= lim
Now, lim↓0 m(ξ + i) = w(ξ ) a.e., so the equality holds a.e.. We use continuity to show equality everywhere and uniformity of convergence. We let ξ ∈ I ⊂ eint and f (ξ, x) = eiθ(ξ )x φ(ξ, x) be the Floquet solution normalized so that f (ξ, 0) = 1. Here φ is periodic in x as in [9]. Then f (ξ, 0) ∈ / R, and we claim that u(ξ, x) =
f (ξ, x) − f (ξ, x) f (ξ, 0) − f (ξ, 0)
.
(5.5)
Since f , f are solutions of the eigenvalue equation, so is the right-hand side of (5.5). Therefore it suffices to check that the right-hand side satisfies the Neumann boundary conditions, and it does. Let g(ξ, x) =
φ(ξ, x) f (ξ, 0) − f (ξ, 0)
.
(5.6)
Then u(ξ, x) = eiθ(ξ )x g(ξ, x) + e−iθ(ξ )x g(ξ, x).
(5.7)
The Wronskian of eiθ(ξ )x g(ξ, x) and e−iθ(ξ )x g(ξ, x) is W (ξ ) = −2ig(ξ, x)g(ξ, x)θ (ξ ) − g(ξ, x)g (ξ, x) + g(ξ, x)g (ξ, x). Substituting (5.7) for u in (3.6) we get that SL (ξ, ξ ) = 2θ (ξ )i L W (ξ ) + O(1),
(5.8)
where $O(1)$ is bounded uniformly in $\xi \in I$ and $L$. Both $2\theta'(\xi)\,i\,W(\xi)$ and $\frac{\pi\rho(\xi)}{w(\xi)}$ are continuous in $\xi$ and equal a.e., meaning that
$$\lim_{L\to\infty}\frac{S_L(\xi,\xi)}{L} = 2\theta'(\xi)\,i\,W(\xi) = \frac{\pi\rho(\xi)}{w(\xi)} \qquad (5.9)$$
for all ξ ∈ I . The convergence in (5.3) is uniform. A similar argument yields the result for SL corresponding to the Dirichlet boundary condition. (2) For the Floquet solution f normalized so that f (ξ, 0) = 1 we have f (ξ, Pk + s) = f (ξ, s)eikθ(ξ ) . By analytic perturbation theory (e. g. Theorems XII.13 and XII.3 of [11]), f is real analytic in θ for θ ∈ (0, π ) ∪ (π, 2π ) and at closed gaps, i.e. θ = π and (θ ) = 0. By Theorem XIII.89 of [11], ξ(θ ) is analytic and ξ (θ ) = 0, which implies that θ (ξ ) is analytic on the interiors of the bands. The function θ (ξ ) is also analytic at ξ0 if ξ0 is a closed gap. To see this we take the derivative of the discriminant equation (ξ ) = 2 cos(θ ): d d d ((ξ ))) = D(ξ ) ξ(θ ) = −2 sin(θ ). dξ dξ dθ At a closed gap ξ0 , the right-hand side has a single zero and
$\frac{d}{d\xi}D(\xi)$ also has a single zero. This implies that $\frac{d}{d\theta}\xi(\theta) \neq 0$ at a closed gap so that $\theta(\xi)$ is analytic at $\xi_0$. We can therefore take the Taylor series of $\theta(\xi)$, $f(\xi,s)$, and $f'(\xi,s)$ to get
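$$f\left(\xi_0+\frac{a}{L},\,x\right) = \left(f(\xi_0,s) + O\left(\frac{1}{L}\right)\right)e^{ik\left(\theta(\xi_0)+\frac{a\theta'(\xi_0)}{L}+O\left(\frac{1}{L^2}\right)\right)},$$
$$\frac{d}{dx}f\left(\xi_0+\frac{a}{L},\,x\right) = \left(\frac{d}{ds}f(\xi_0,s) + O\left(\frac{1}{L}\right)\right)e^{ik\left(\theta(\xi_0)+\frac{a\theta'(\xi_0)}{L}+O\left(\frac{1}{L^2}\right)\right)}.$$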
Letting L = Pk + s, we substitute this into (5.5) to get a a 2y ξ0 + , x f (ξ0 + , 0) L L
aθ (ξ ) ik θ(ξ0 )+ L 0 +O 12 1 L = f (ξ0 , s) + O e L
aθ (ξ ) −ik θ(ξ0 )+ L 0 +O 12 1 L e − f (ξ0 , s) + O . L From this we get the following by direct computation:
a b a b y ξ0 + , L y ξ0 + , L 2 f ξ0 + , 0 f ξ0 + , 0 L L L L
a b − y ξ0 + , L y ξ + , L L L
a−b θ (ξ0 ) + O(L −1 ) . = W ( f, f )i sin P
(5.10) (5.11)
Then substituting into the left-hand side of (5.4), we get SL ξ0 + La , ξ0 + Lb SL (ξ0 , ξ0 )
w(ξ0 )( f (ξ0 , 0)) y L , ξ0 + La y L , ξ0 + Lb − y L , ξ0 + Lb y L , ξ + La L→∞ f ξ0 + La , 0 f ξ0 + Lb , 0 ρ(ξ0 )(b − a) sin(πρ(ξ0 )(b − a)) . = πρ(ξ0 )(b − a)
= lim
Here we have used that w(ξ ) = f (ξ, 0),
(5.12)
which we get by substituting W (ξ ) =
f (0) f (0) − f (0) f (0) = (2i f (ξ, 0))−1 , (2i f (ξ, 0))2
(5.13)
in (5.9). An identical calculation yields the result for the Dirichlet boundary condition. To show (1.25), let $\varepsilon(L) \to 0$ as $L \to \infty$. Since $u$ is real analytic in $\xi$,
$$u^2(\xi+\varepsilon(L),x) = u^2(\xi,x) + \frac{d}{d\xi}(u^2(\xi,x))\,\varepsilon(L) + o(\varepsilon(L)),$$
and since $I$ is compact, $\frac{d}{d\xi}(u^2(\xi,x))$ achieves a maximum, so that $u^2(\xi+\varepsilon(L),x) = u^2(\xi,x) + O(\varepsilon(L))$ uniformly on $I$. Thus,
$$\lim_{L\to\infty}\frac{w(\xi)}{\pi L}\int_0^L u(\xi+\varepsilon(L),x)^2\,dx = \lim_{L\to\infty}\frac{w(\xi)}{\pi L}\int_0^L u(\xi,x)^2\,dx + O(\varepsilon(L)).$$
6. Off-Diagonal Kernel and Clock Behavior

The main goal of this section is to prove our main result, Theorem 1.3. We start by proving Lubinsky's inequality, which is similar to the discrete case:

Lemma 6.1. Let two measures $d\mu(\xi)$, $d\mu^*(\xi)$ with $d\mu(\xi) \le d\mu^*(\xi)$ be unnormalized spectral measures of Schrödinger operators. Then for any $\xi, \beta \in \mathbb{R}$,
$$\frac{|S_L(\xi,\beta,\mu) - S_L(\xi,\beta,\mu^*)|}{S_L(\xi,\xi,\mu)} \le \left(\frac{S_L(\beta,\beta,\mu)}{S_L(\xi,\xi,\mu)}\right)^{1/2}\left(1 - \frac{S_L(\xi,\xi,\mu^*)}{S_L(\xi,\xi,\mu)}\right)^{1/2}. \qquad (6.1)$$
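Proof. The proof carries over from [8]. Expanding,
$$\int (S_L(\xi,\zeta,\mu) - S_L(\xi,\zeta,\mu^*))^2\,d\mu(\zeta) = \int S_L^2(\xi,\zeta,\mu)\,d\mu(\zeta) - 2\int S_L(\xi,\zeta,\mu)S_L(\xi,\zeta,\mu^*)\,d\mu(\zeta) + \int S_L^2(\xi,\zeta,\mu^*)\,d\mu(\zeta)$$
$$= S_L(\xi,\xi,\mu) - 2S_L(\xi,\xi,\mu^*) + \int S_L^2(\xi,\zeta,\mu^*)\,d\mu(\zeta).$$
Since $d\mu \le d\mu^*$,
$$\int S_L^2(\xi,\zeta,\mu^*)\,d\mu(\zeta) \le \int S_L^2(\xi,\zeta,\mu^*)\,d\mu^*(\zeta) = S_L(\xi,\xi,\mu^*). \qquad (6.2)$$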
Therefore,
$$\int (S_L(\xi,\zeta,\mu) - S_L(\xi,\zeta,\mu^*))^2\,d\mu(\zeta) \le S_L(\xi,\xi,\mu) - S_L(\xi,\xi,\mu^*).$$
Using the variational principle for the Christoffel–Darboux symbol, e.g. the minimizing property, for any $\pi(\zeta) \in H_L$ and any $\beta \in \mathbb{R}$,
$$S_L(\beta,\beta,\mu)^{-1} \le \frac{\int \pi(\zeta)^2\,d\mu(\zeta)}{\pi(\beta)^2}.$$
Using $\pi(\zeta) = S_L(\xi,\zeta,\mu) - S_L(\xi,\zeta,\mu^*)$ we get that
$$|S_L(\xi,\beta,\mu) - S_L(\xi,\beta,\mu^*)| \le S_L(\beta,\beta,\mu)^{1/2}\,(S_L(\xi,\xi,\mu) - S_L(\xi,\xi,\mu^*))^{1/2},$$
and dividing by $S_L(\xi,\xi,\mu)$ gives (6.1).

We then show

Lemma 6.2. Let $d\mu$, $d\mu^*$ be unnormalized spectral measures with $\sigma_{ess}(d\mu) = \sigma_{ess}(d\mu^*)$. If $d\mu(\xi)$ obeys regularity bounds and $d\mu(\xi) \le d\mu^*(\xi)$, then $d\mu^*(\xi)$ also obeys regularity bounds.

Proof. Since $d\mu \le d\mu^*$, $\|Q\|_{d\mu} \le \|Q\|_{d\mu^*}$ for all $Q \in L^2(d\mu) \cap L^2(d\mu^*)$, so
$$\inf\left\{\|Q\|_{d\mu} : Q(\xi_0) = 1,\ Q(\xi) = \int_0^L f(x)\cos(\sqrt{\xi}\,x)\,dx\right\} \le \inf\left\{\|Q\|_{d\mu^*} : Q(\xi_0) = 1,\ Q(\xi) = \int_0^L f(x)\cos(\sqrt{\xi}\,x)\,dx\right\}.$$
By the variational principle, this implies that $\lambda_L(\xi,\mu) \le \lambda_L(\xi,\mu^*)$. If $u$, $u^*$ are the solutions of the eigenvalue equations corresponding to $d\mu$, $d\mu^*$ respectively, then
$$Ce^{\varepsilon L} \ge \int_0^L u(\xi,x)^2\,dx \ge \int_0^L u^*(\xi,x)^2\,dx.$$
We now prove Theorem 1.3.

Proof. Let $A = -\frac{d^2}{dx^2} + p(x) + q(x)$ and $A^\# = -\frac{d^2}{dx^2} + p(x)$ be Schrödinger operators with periodic continuous $p$ and non-destructive zero-average $q$ (Definition 1.2). Suppose the corresponding spectral measures $d\mu$, $d\mu^\#$ satisfy regularity bounds. Suppose there exists a closed and bounded interval $I \subset \sigma_{ess}(A)^{int}$ such that $\xi_0 \in I$, $w$ is absolutely continuous and positive on $I$, and $(\sigma_{ess}(d\mu_s) \cup \sigma_{ess}(d\mu^\#_s)) \cap I = \emptyset$. Let $s > 0$ such that $sw^\#(\xi_0) = w(\xi_0)$. From $\mu$, $\mu^\#$ we construct a new unnormalized spectral measure $\mu^*$ which dominates $\mu$, $s\mu^\#$ and is absolutely continuous on $I$ with $w^*(\xi_0) = w(\xi_0)$. Let
$$d\mu^*(\xi) = \sup\{s\,d\mu^\#(\xi),\ d\mu(\xi)\} \ \text{ for } \xi < R, \qquad d\mu^*(\xi) = s\,d\mu^\#(\xi) + d\mu(\xi) \ \text{ for } \xi \ge R,$$
where $R \in \mathbb{R}$ with $I \subset (-\infty, R)$. We claim that $\mu^*$ is an unnormalized spectral measure. A measure $d\nu$ is a spectral measure for a boundary value problem (Theorem 2.3.1 of [10]) if and only if
(1) The functional on $H_L$ given by the inner product $\langle -, \pi(\xi)\rangle_{d\nu}$ is non-trivial for all non-trivial $\pi$.
(2) The function
$$\Phi(x,\nu) = \int \frac{1-\cos(\sqrt{\xi}\,x)}{\xi}\,d\nu(\xi) \qquad (6.3)$$
is thrice continuously differentiable in $x$ and $\Phi'(0+,\nu) = 1$.
Condition (1) is true for $d\mu^*$, since it is true for both $\mu$ and $\mu^\#$. To show condition (2), let $\Phi_R(x,\nu) = \int_{-\infty}^{R}\frac{1-\cos(\sqrt{\xi}\,x)}{\xi}\,d\nu$, for any locally finite measure $d\nu$. Then $\Phi_R(x,\mu)$, $\Phi_R(x,\mu^\#)$, $\Phi_R(x,\mu^*)$ are in $C^\infty$ by the Dominated Convergence Theorem and
$$\int_R^\infty \frac{1-\cos(\sqrt{\xi}\,x)}{\xi}\,d\mu^* = \Phi(x,\mu) - \Phi_R(x,\mu) + \Phi(x,\mu^\#) - \Phi_R(x,\mu^\#)$$
is in $C^3$ as a sum of $C^3$ functions, making $\Phi(x,\mu^*) \in C^3$. By continuity of $\Phi_R'(x)$ and the Dominated Convergence Theorem,
$$\Phi_R'(0+,\mu^*) = \Phi_R'(0,\mu^*) = \int_0^R \frac{\sin(0)}{\sqrt{\xi}}\,d\mu^*(\xi) = 0,$$
so $\Phi'(0+,\mu^*) = \Phi'(0+,\mu) + \Phi'(0+,\mu^\#) = 1 + s$. Thus, dividing $d\mu^*$ by $1 + s$ will yield a spectral measure. Additionally, the boundary condition of $d\mu^*$ is the same as that for $d\mu$, $d\mu^\#$ (Theorem 2.4.2 of Marchenko [10]).
By Lemma 6.2 above, $\mu^*$ obeys the regularity bound. Applying Theorem 1.8 to $d\mu^*$, $d\mu^\#$, we see that $d\mu^*$ satisfies (1.24), (1.25).
Thus, by (1.8)
$$\frac{S_L(\xi_0+a/L,\ \xi_0+a/L,\ \mu)}{S_L(\xi_0+b/L,\ \xi_0+b/L,\ \mu^*)} \to 1 \quad\text{and}\quad \frac{S_L(\xi_0+a/L,\ \xi_0+a/L,\ s\mu^\#)}{S_L(\xi_0+b/L,\ \xi_0+b/L,\ \mu^*)} \to 1.$$
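Dividing by $S_L(\xi_0,\xi_0)$ and applying Lubinsky's inequality, we get that
$$\frac{\left|S_L\left(\xi_0+\frac{a}{L},\xi_0+\frac{b}{L},\mu\right) - S_L\left(\xi_0+\frac{a}{L},\xi_0+\frac{b}{L},\mu^*\right)\right|^2}{S_L\left(\xi_0+\frac{b}{L},\xi_0+\frac{b}{L},\mu^*\right)} \le S_L\left(\xi_0+\frac{a}{L},\xi_0+\frac{a}{L},\mu\right) - S_L\left(\xi_0+\frac{a}{L},\xi_0+\frac{a}{L},\mu^*\right),$$
and
$$\frac{\left|S_L\left(\xi_0+\frac{a}{L},\xi_0+\frac{b}{L},s\mu^\#\right) - S_L\left(\xi_0+\frac{a}{L},\xi_0+\frac{b}{L},\mu^*\right)\right|^2}{S_L\left(\xi_0+\frac{b}{L},\xi_0+\frac{b}{L},\mu^*\right)} \le S_L\left(\xi_0+\frac{a}{L},\xi_0+\frac{a}{L},s\mu^\#\right) - S_L\left(\xi_0+\frac{a}{L},\xi_0+\frac{a}{L},\mu^*\right),$$
which gives that
$$\frac{S_L\left(\xi_0+\frac{a}{L},\xi_0+\frac{b}{L},\mu\right)}{S_L\left(\xi_0+\frac{a}{L},\xi_0+\frac{b}{L},s\mu^\#\right)} \to 1.$$
Since
$$\frac{S_L(\xi_0,\xi_0,\mu)}{S_L(\xi_0,\xi_0,s\mu^\#)} \to 1,$$
we get that
$$\lim_{L\to\infty}\frac{S_L\left(\xi_0+\frac{a}{L},\xi_0+\frac{b}{L},\mu\right)}{S_L(\xi_0,\xi_0,\mu)} = \lim_{L\to\infty}\frac{S_L\left(\xi_0+\frac{a}{L},\xi_0+\frac{b}{L},s\mu^\#\right)}{S_L(\xi_0,\xi_0,s\mu^\#)}.$$
The limit on the right is equal to (1.13) and all limits are uniform on $I$ and $|a|, |b| < B$.

Like [7,14], we can now deduce clock spacing of the zeros for a perturbed periodic potential. Here we prove Corollary 1.5.

Proof. Fix an interval $I \subset e^{int}$ and $\xi^* \in I$. We want to show uniform clock behavior at $\xi^*$ of zeros of $u$ and $y$ in $\xi$ as $L$ gets large. More precisely, if $\xi_n$ is a successive numbering of zeros with $\ldots \xi_{-1} < \xi^* \le \xi_0 < \xi_1 < \ldots$, then
$$\lim_{L} L\,|\xi_n - \xi_{n+1}|\,\rho(\xi^*) = 1.$$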
By the Christoffel–Darboux formula (3.4),
$$\frac{u(\xi^*+a/L,\ L)}{u'(\xi^*+a/L,\ L)} = \frac{u(\xi^*,\ L)}{u'(\xi^*,\ L)} \qquad (6.4)$$
for $a \neq 0$ if and only if $S_L(\xi^*, \xi^*+a/L) = 0$. From (1.26) and (5.3) we see that $S_L(\xi^*,\xi^*) = O(L)$. Now, by (1.13) and since $S_L(\xi^*,\xi^*) = O(L)$, $S_L(\xi^*,\xi^*+a/L) = o(1/L)$ if and only if $a = \frac{k}{\rho(\xi^*)} + o(1/L)$. The convergence in $L$ is uniform on $I$, since (1.13) is uniform on $I$. The argument is the same for $y$.
7. Example: The Free Schrödinger Operator

The arguments in Sect. 2 apply also to non-destructive zero-average perturbations of the free Schrödinger operator, thus giving us the regularity bounds condition. We know the spectral measure for the free Schrödinger operator [16], and it is indeed continuous and non-negative on $[0, \infty)$. The solution of the eigenvalue equation for the free Schrödinger operator
$$-\frac{d^2}{dx^2}u(x,\xi) = \xi\,u(x,\xi)$$
with the Neumann boundary condition is $\cos(\sqrt{\xi}\,x) < e^{x}$ on $[0, \infty)$. We compute $S_L(\xi,\beta)$ and $S_L(\xi,\xi)$ directly:
$$S_L(\xi,\beta) = \int_0^L \cos(\sqrt{\xi}\,x)\cos(\sqrt{\beta}\,x)\,dx = \frac{\sin((\sqrt{\xi}-\sqrt{\beta})L)}{2(\sqrt{\xi}-\sqrt{\beta})} + \frac{\sin((\sqrt{\xi}+\sqrt{\beta})L)}{2(\sqrt{\xi}+\sqrt{\beta})},$$
and
$$S_L(\xi,\xi) = \frac{L}{2} + \frac{\sin(2\sqrt{\xi}\,L)}{4\sqrt{\xi}}.$$
Then model property (3) is clear and we check property (4):
$$\limsup_{\varepsilon\to 0}\ \limsup_{L\to\infty}\ \frac{\frac{L+\varepsilon L}{2} + \frac{\sin(2\sqrt{\xi}(L+\varepsilon L))}{4\sqrt{\xi}}}{\frac{L}{2} + \frac{\sin(2\sqrt{\xi}\,L)}{4\sqrt{\xi}}} = 1.$$
Locally at $\xi_0$ we get
$$\lim_{L\to\infty}\frac{S_L(\xi_0+a/L,\ \xi_0+b/L)}{S_L(\xi_0,\xi_0)} = \frac{2\sqrt{\xi_0}\,\sin\left(\frac{a-b}{2\sqrt{\xi_0}}\right)}{a-b}.$$
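The closed forms for the free operator make this local limit easy to check numerically. The following is a minimal sketch, not part of the original argument: the function names `free_kernel` and `sine_kernel_limit` are hypothetical, and the values of $\xi_0$, $a$, $b$ are arbitrary test inputs. It evaluates the ratio $S_L(\xi_0+a/L, \xi_0+b/L)/S_L(\xi_0,\xi_0)$ for growing $L$ and compares it with $\sin(\pi\rho(\xi_0)(b-a))/(\pi\rho(\xi_0)(b-a))$, taking $\rho(\xi) = (2\pi)^{-1}\xi^{-1/2}$.

```python
import math


def free_kernel(xi, beta, L):
    """S_L(xi, beta) = int_0^L cos(sqrt(xi) x) cos(sqrt(beta) x) dx (free operator, Neumann)."""
    p, q = math.sqrt(xi), math.sqrt(beta)
    diff, total = p - q, p + q
    # The first term tends to L/2 as beta -> xi; handle the diagonal explicitly.
    first = L / 2 if diff == 0 else math.sin(diff * L) / (2 * diff)
    return first + math.sin(total * L) / (2 * total)


def sine_kernel_limit(xi0, a, b):
    """sin(pi rho (b - a)) / (pi rho (b - a)) with rho(xi0) = 1 / (2 pi sqrt(xi0))."""
    rho = 1 / (2 * math.pi * math.sqrt(xi0))
    t = math.pi * rho * (b - a)
    return 1.0 if t == 0 else math.sin(t) / t


# Arbitrary illustrative inputs (assumptions, not values from the paper).
xi0, a, b = 2.0, 1.3, -0.7
for L in (1e2, 1e4, 1e6):
    ratio = free_kernel(xi0 + a / L, xi0 + b / L, L) / free_kernel(xi0, xi0, L)
    print(f"L = {L:.0e}: ratio = {ratio:.8f}, sine kernel = {sine_kernel_limit(xi0, a, b):.8f}")
```

The two printed columns should agree to more digits as $L$ grows, since the bounded second term of $S_L$ is divided by a diagonal kernel of size about $L/2$.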
This coincides with (5.3), since the density of states for the free Schrödinger operator is
$$\rho(\xi) = (2\pi)^{-1}\xi^{-1/2} \qquad (7.1)$$
for $\xi \in [0, \infty)$ (Example 8.1 of [1]).

Acknowledgement. I would like to thank my advisor Professor Barry Simon for all his help. I would also like to thank Professors Jonathan Breuer and Fritz Gesztesy for useful discussions.
References
1. Berezin, F.A., Shubin, M.A.: The Schrödinger Equation. Volume 66 of Mathematics and its Applications (Soviet Series). Dordrecht: Kluwer Academic Publishers Group, 1991; translated from the 1983 Russian edition by Rajabov, Yu., Leĭtes, D.A., Sakharova, N.A. and revised by Shubin, with contributions by G. L. Litvinov and Leĭtes
2. Chadan, K., Sabatier, P.C.: Inverse Problems in Quantum Scattering Theory. New York: Springer-Verlag, 1977, with a foreword by R. G. Newton
3. Gel'fand, I.M., Levitan, B.M.: On the determination of a differential equation from its spectral function. Amer. Math. Soc. Transl. (2) 1, 253–304 (1955)
4. Gesztesy, F., Simon, B.: A new approach to inverse spectral theory. II. General real potentials and the connection to the spectral measure. Ann. Math. (2) 152(2), 593–643 (2000)
5. Gesztesy, F., Zinchenko, M.: On spectral theory for Schrödinger operators with strongly singular potentials. Math. Nachr. 279(9–10), 1041–1082 (2006)
6. Last, Y., Simon, B.: Fine structure of the zeros of orthogonal polynomials. IV. A priori bounds and clock behavior. Comm. Pure Appl. Math. 61(4), 486–538 (2008)
7. Levin, E., Lubinsky, D.S.: Applications of universality limits to zeros and reproducing kernels of orthogonal polynomials. J. Approx. Theory 150(1), 69–95 (2008)
8. Lubinsky, D.S.: A new approach to universality limits involving orthogonal polynomials. Ann. Math. 170, 915–939 (2009)
9. Magnus, W., Winkler, S.: Hill's Equation. Interscience Tracts in Pure and Applied Mathematics, No. 20. New York-London-Sydney: Interscience Publishers/John Wiley & Sons, 1966
10. Marchenko, V.A.: Sturm-Liouville Operators and Applications. Volume 22 of Operator Theory: Advances and Applications. Basel: Birkhäuser Verlag, 1986, translated from the Russian by A. Iacob
11. Reed, M., Simon, B.: Methods of Modern Mathematical Physics. IV. Analysis of Operators. New York: Academic Press [Harcourt Brace Jovanovich Publishers], 1978
12. Simon, B.: Orthogonal Polynomials on the Unit Circle. Part 1, Volume 54 of American Mathematical Society Colloquium Publications. Providence, RI: Amer. Math. Soc., 2005
13. Simon, B.: The Christoffel-Darboux kernel. In: 'Perspectives in PDE, Harmonic Analysis and Applications,' a Volume in Honor of V.G. Maz'ya's 70th birthday, Proceedings of Symposia in Pure Mathematics, 79, 295–335 (2008)
14. Simon, B.: Two extensions of Lubinsky's universality theorem. J. Anal. Math. 105, 345–362 (2008)
15. Stahl, H., Totik, V.: General Orthogonal Polynomials. Volume 43 of Encyclopedia of Mathematics and its Applications. Cambridge: Cambridge University Press, 1992
16. Teschl, G.: Mathematical Methods in Quantum Mechanics. Volume 150 of Graduate Studies in Mathematics. Providence, RI: Amer. Math. Soc., 2009
17. Totik, V.: Universality and fine zero spacing on general sets. Arkiv for Math. 4(2), 361–391 (2009)

Communicated by H.-T. Yau
Commun. Math. Phys. 298, 485–514 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1043-6
Communications in
Mathematical Physics
On Infinite-Volume Mixing
Marco Lenci
Dipartimento di Matematica, Università di Bologna, Piazza di Porta S. Donato 5, 40126 Bologna, Italy. E-mail: [email protected]
Received: 3 August 2009 / Accepted: 19 January 2010
Published online: 9 April 2010 – © Springer-Verlag 2010
Abstract: In the context of the long-standing issue of mixing in infinite ergodic theory, we introduce the idea of mixing for observables possessing an infinite-volume average. The idea is borrowed from statistical mechanics and appears to be relevant, at least for extended systems with a direct physical interpretation. We discuss the pros and cons of a few mathematical definitions that can be devised, testing them on a prototypical class of infinite measure-preserving dynamical systems, namely, the random walks.

1. Historical Introduction

The textbook definition of mixing for a transformation $T : M \to M$ preserving a probability measure $\mu$ is
$$\lim_{n\to\infty}\mu(T^{-n}A \cap B) = \mu(A)\mu(B) \qquad (1.1)$$
for all measurable sets $A, B \subset M$ [W]. Extending this definition to the case where $\mu$ is a $\sigma$-finite measure with $\mu(M) = \infty$ is a fundamental issue in infinite ergodic theory. References to this problem can be found in the literature at least as far back as 1937, when Hopf devoted a section of his famous Ergodentheorie [H] to an example of a dynamical system that he calls 'mixing'. It consists of a set $M \subset \mathbb{R}^2$, of infinite Lebesgue measure, and a map $T : M \to M$ preserving $\mu$, the Lebesgue measure on $M$. He proved a property that is equivalent to this one: there exists a sequence $\{\rho_n\}_{n\in\mathbb{N}}$ of positive numbers such that
$$\lim_{n\to\infty}\rho_n\,\mu(T^{-n}A \cap B) = \mu(A)\mu(B) \qquad (1.2)$$
for all squarable sets A, B ⊂ M (a squarable set is a bounded set whose boundary has measure zero). For a long time, the community did not seem to act on this suggestion, perhaps due in part to the impossibility, in any reasonable dynamical system, of verifying (1.2) for all
finite-measure sets A, B. (This fact, which might not have been clear to Hopf himself, can be derived from a famous 1964 paper by Hajian and Kakutani [HK].) Work in this direction, however, picked up rather intensively in the 1960’s [Or,KP, KO,Pr,Kr,Pa], to the point that Krickeberg in 1967 [Kr] proposed (1.2) as the definition of mixing for almost-everywhere continuous endomorphisms of a Borel space (M, μ) (with some extra, inessential, conditions on the sets A, B). Krickeberg applied his definition to Markov chains with an infinite state space and an infinite invariant measure, which is very interesting in the context of this paper because our main examples will be the prototypical infinite-state Markov chains, namely the random walks (cf. Sects. 2.2, 4 and 5). Krickeberg’s definition has been studied by several researchers since then [Fr,To1, To2] and, in recent times, it was independently rediscovered by Isola, who uses (1.2) with A = B and calls {ρn } the scaling rate [I1,I2]. It failed, however, to establish itself as the ultimate definition of mixing in infinite measure. In my opinion, this is not so much because of the less-than-perfect requirement of a topological structure in a measure-theoretic problem, but rather for its inherent inability to describe the “global” infinite-measure aspects of a dynamics: after all, (1.2) only involves finite-measure sets. Related to this, it is unclear how this definition may be specified towards stronger and more physically relevant chaotic properties, such as, for example, the rate of correlation decay. A little thinking convinces one that the speed of convergence in (1.2) cannot in general be uniform, even for uniformly nice sets (A and B can be arbitrarily far from each other so that the l.h.s. of (1.2) is negligible for arbitrarily long times). 
At any rate, by the end of the 1960's, Krengel and Sucheston [KS] approached the problem from a more measure-theoretic point of view and devised the following two definitions: A discrete-time, nonsingular dynamical system $(M, \mu, T)$ is called mixing if and only if the sequence $\{T^{-n}A\}_{n\in\mathbb{N}}$ is semiremotely trivial for all measurable $A \subset M$ with $\mu(A) < \infty$; it is called completely mixing if the condition holds for all measurable $A$. (A nonsingular map is one for which $\mu(A) = 0$ implies $\mu(T^{-1}A) = 0$. As for the definition of semiremotely trivial, which is unimportant here, we refer the reader to [KS].) Both definitions reduce to the standard definition (1.1) for maps preserving a probability measure [Su]. However, Krengel and Sucheston themselves proved results that imply that many reasonable (including all invertible) measure-preserving maps cannot be completely mixing [KS, Thms. 3.1 and 5.1]. As for mixing, again specializing to measure-preserving maps, their definition is equivalent to
$$\lim_{n\to\infty}\mu(T^{-n}A \cap B) = 0 \qquad (1.3)$$
for all finite-measure sets A, B [KS, §2]. This is a rather brutal weakening of (1.2): for instance, it would classify a translation in Rd as mixing! Therefore, however illuminating [Sa], the Krengel–Sucheston definitions are of little applicability to most simple extensive systems that mathematicians would like to study. Aaronson in 1997 [A, §2.5] wrote [...] the discussion in [KS] indicates that there is no reasonable generalisation of mixing. Be that as it may, the drive to produce a general definition of mixing in infinite ergodic theory had apparently ceased by the mid 1970’s. In this paper I will not try to give a universal and firm definition of mixing—in which I am not sure I believe myself—for σ -finite measure-preserving dynamical systems, but
rather a few very general notions, that can be completely specified on a case-by-case basis, depending on what type of information one wants to extract from the dynamical system under scrutiny. To do so, I will borrow some ideas and a little terminology from physics, in particular from statistical mechanics. The key concept this work is based on is that of infinite-volume average, which I illustrate with an example.

Let us consider a measurable unbounded $A \subset \mathbb{R}^2$ and ask, what is the probability that a random $x \in \mathbb{R}^2$ belongs to $A$? Clearly, the answer fully depends on what we mean by random. Suppose we specify that random means that each $x$ can be drawn with equal probability. Then the question itself no longer makes sense because the Lebesgue measure $m$ on $\mathbb{R}^2$, which is the only uniform measure on $\mathbb{R}^2$, cannot be normalized. However, remembering that long-gone course in statistical mechanics, one might come up with the idea that the sought probability is something like
$$\lim_{r\to+\infty}\frac{m(A \cap [-r,r]^2)}{4r^2}, \qquad (1.4)$$
provided the limit exists. Of course, such an answer is riddled with issues, but it does capture the idea that in physics one only looks at finite quantities. Infinity is a mental construct to fit an endless amount of situations, and a finite limit at infinity is the formal way to say that most of these situations will look alike.

More generally, for a given dynamical system $(M, \mathcal{A}, \mu, \{T^t\})$, where $(M, \mathcal{A})$ is a measure space and $\{T^t\}$ is a (semi-)group of transformations $M \to M$ preserving the infinite measure $\mu$, we will choose a family of ever-larger sets $V$, with $\mu(V) < \infty$, that "approximate $M$". Using the language of physics, one might say that choosing these sets will define how we measure our infinite system (more precisely, how we measure its observables). We will deal with two types of observables: The global, or macroscopic, observables will be a suitable class of functions $F \in L^\infty(M, \mathcal{A}, \mu)$ for which
$$\overline{\mu}(F) := \lim_{V\nearrow M}\frac{1}{\mu(V)}\int_V F\,d\mu \qquad (1.5)$$
exists. What the above limit means and what class of functions it applies to will be clarified in Sect. 2. The local, or microscopic, observables will be essentially the elements of $L^1(M, \mathcal{A}, \mu)$. Then, skipping many necessary details and all-important specifications which are found in Sects. 2 and 3, our notions of mixing will basically reduce to the two limits:
$$\lim_{t\to\infty}\overline{\mu}((F \circ T^t)G) = \overline{\mu}(F)\,\overline{\mu}(G), \qquad (1.6)$$
for any two global observables $F$, $G$; and
$$\lim_{t\to\infty}\mu((F \circ T^t)g) = \overline{\mu}(F)\,\mu(g), \qquad (1.7)$$
for $F$ global and $g$ local (with the obvious notation $\mu(g) := \int g\,d\mu$).

To the extent to which the above notions can be made into rigorous definitions (and they can, cf. Sect. 3) they seem to improve on the attempted definitions that we have recalled earlier, chiefly because they involve observables which can be supported throughout the phase space (think, for example, of the velocity of a particle in an aperiodic Lorentz gas, or the potential energy of a small mass in a formally infinite celestial
conglomerate, etc.). So they are more apt to reveal the large-scale aspects of a given dynamics. In particular, (1.6) may be called global-global mixing, because it somehow expresses the vanishing of the correlation coefficient between two global observables, while (1.7) may be called global-local mixing, because the coupling is between a global and a local observable. The latter notion can be quite useful if one takes $g \ge 0$ with $\mu(g) = 1$. Then $\mu_g$, the probability measure defined by $d\mu_g := g\,d\mu$, can be considered an initial state for the system. In this interpretation, the l.h.s. of (1.7) reads $\mu_g(F \circ T^t) =: T^t_*\mu_g(F)$, that is, the expected value of the observable $F$ relative to the state at time $t$, and (1.7) asserts that such quantity converges to $\overline{\mu}(F)$. Hence, $\overline{\mu}$ acts as a sort of equilibrium state for the system.

In Sect. 4 we will apply the above ideas to certain basic yet representative examples of infinite-measure dynamical systems, the random walks in $\mathbb{Z}^d$. We will specify all the mathematical objects needed to obtain rigorous definitions out of (1.6)-(1.7) and we will check that, under reasonable conditions, all these definitions are verified. The proofs, given in Sect. 5, use basic harmonic analysis on groups [R] and an estimate for a certain Fourier norm. The latter is presented in the Appendix, together with other technical results.

2. Infinite-Volume Limit and Observables

A measure-preserving dynamical system is the quadruple $(M, \mathcal{A}, \mu, \{T^t\})$, where $M$ is a measure space with the $\sigma$-algebra $\mathcal{A}$, endowed with the $\sigma$-finite measure $\mu$, while $\{T^t\}_{t\in G}$ is a group (respectively, semigroup) of automorphisms (respectively, endomorphisms) $M \to M$, labeled by the free additive parameter $t \in G$ (without significant loss of generality, we assume $G = \mathbb{N}$, $\mathbb{Z}$, or $\mathbb{R}$). This means that
$$\mu(T^{-t}A) = \mu(A), \qquad \forall A \in \mathcal{A},\ \forall t \in G. \qquad (2.1)$$
If $G = \mathbb{R}$, $\{T^t\}$ is called the flow, whereas if $G = \mathbb{Z}$ or $\mathbb{N}$, the generator $T := T^1$ is called the map, and one usually denotes the dynamical system by $(M, \mu, T)$. In this paper we are interested in the case $\mu(M) = \infty$. The measure-theoretic properties of dynamical systems preserving an infinite measure are the subject of infinite ergodic theory [A], whose most basic definition, perhaps, is the following [HK]:

Definition 2.1. The measure-preserving dynamical system $(M, \mathcal{A}, \mu, \{T^t\})$ is called ergodic if every measurable invariant set (i.e., any $A \in \mathcal{A}$ such that $\mu(T^{-t}A \,\triangle\, A) = 0$ $\forall t$), has either zero measure or full measure (the latter meaning $\mu(M \setminus A) = 0$).

In the case where $\mu$ is a probability measure, the above is one of the several equivalent formulations of ergodicity, including among others: the equivalence of the Birkhoff average with the phase average, for all $f \in L^1(M, \mathcal{A}, \mu)$ (the definition generally ascribed to Boltzmann); the absence of nontrivial integrals of motion in $L^1(M, \mathcal{A}, \mu)$; the strong law of large numbers for the random variables $\{f \circ T^t\}$. In infinite measure, these definitions are no longer equivalent and, among those that keep making sense, Definition 2.1 is in some sense the strongest.

A notion that is much harder to transport to infinite ergodic theory, as we have discussed in the Introduction, is that of mixing. In terms of observables, i.e., scalar functions on $M$, it reads:
$$\forall f, g \in L^2(M, \mathcal{A}, \mu), \qquad \lim_{t\to\infty}\mu((f \circ T^t)g) = \mu(f)\mu(g). \qquad (2.2)$$
In other words, the correlation coefficient between the two random variables f ◦ T t and g vanishes asymptotically. This phrasing makes it apparent that the notion of finite-measure mixing is intrinsically probabilistic.
2.1. Infinite-volume limit. The discussion in the Introduction suggests that one should find an asymptotic decorrelation formula, similar to (2.2), which applies to observables that, unlike $L^2$ functions, "see" a nonnegligible portion of the space. For this, one needs to define a sort of "normalized measure" for these observables. This is how one comes to think of averaging a function over $M$ by means of an infinite-volume limit. The idea is borrowed from statistical mechanics where the question arises of measuring the intensive quantities of a system (for example, the temperature or the density of a gas). These are represented by sequences of functions defined over larger and larger phase spaces, corresponding to larger and larger portions of the physical system, normalized in such a way that their integral converges to a finite number. This number is supposed to predict the outcome of an experimental measurement.

Coming back to our math, we introduce a notation that is going to be used often in the remainder:
$$\mathcal{A}_f := \{A \in \mathcal{A} \mid \mu(A) < \infty\}. \qquad (2.3)$$

Definition 2.2. The family $\mathcal{V} \subset \mathcal{A}_f$ is called exhaustive if it contains a sequence $\{V_n\}_{n\in\mathbb{N}}$, increasing w.r.t. inclusion, such that $\bigcup_n V_n = M$.

Definition 2.3. Let $\mathcal{V}$ be exhaustive. If $\varphi$ is defined on $\mathcal{A}_f$ and has values in some topological space, we write
$$\lim_{V\nearrow M}\varphi(V) := \lim_{\substack{V\in\mathcal{V}\\ \mu(V)\to\infty}}\varphi(V) = L,$$
when, for every neighborhood $U$ of $L$, there exists an $M \in \mathbb{R}^+$ such that $V \in \mathcal{V}$, $\mu(V) \ge M$ $\Rightarrow$ $\varphi(V) \in U$. This will be called the $\mu$-uniform infinite-volume limit w.r.t. the family $\mathcal{V}$, or simply the infinite-volume limit.

It is apparent that the above definition depends decisively on the choice of $\mathcal{V}$. In the example discussed in the Introduction, cf. (1.4), $\mathcal{V} = \{[-r,r]^2\}_{r>0}$, and it is easy to think of other exhaustive families of subsets of $\mathbb{R}^2$ for which the infinite-volume limit for the "probability" of $A$ differs from (1.4). In general, all the results that we are going to discuss in this paper will depend on the choice of $\mathcal{V}$ in Definition 2.3. This is no shortcoming! In fact, we want to retain this choice, because this is how we incorporate in the mathematical description of an extended system the way we observe the system, that is, how we measure the observables of that system. In other words, the choice of $\mathcal{V}$ defines what it means to pick a large region of $M$, which we assume (or rather declare) represents the whole space.

The first property we assume of our systems may be called 'compatibility of the infinite-volume limit with the dynamics':

(A1) For any fixed $t \in G$, for $V \nearrow M$, $\mu(T^{-t}V \,\triangle\, V) = o(\mu(V))$.
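Both the dependence on the exhaustive family and assumption (A1) can be seen in a one-dimensional toy computation. The sketch below is an illustration under stated assumptions rather than anything taken from the paper: the observable is the indicator of the half-line $[0, \infty)$ in $\mathbb{R}$ with Lebesgue measure, the dynamics is the translation $T^t x = x + t$, and the helper names `average_of_half_line` and `shifted_symdiff_ratio` are hypothetical. Exact rational arithmetic is used so that the averages come out as fractions.

```python
from fractions import Fraction


def average_of_half_line(lo, hi):
    """Average over V = [lo, hi] of the indicator of [0, +inf): mu(V ∩ [0, ∞)) / mu(V)."""
    lo, hi = Fraction(lo), Fraction(hi)
    positive_part = max(hi, Fraction(0)) - max(lo, Fraction(0))
    return positive_part / (hi - lo)


def shifted_symdiff_ratio(lo, hi, t):
    """mu(T^{-t}V △ V) / mu(V) for the translation T^t x = x + t and V = [lo, hi]."""
    lo, hi, t = Fraction(lo), Fraction(hi), Fraction(t)
    length = hi - lo
    overlap = max(Fraction(0), length - abs(t))  # length of V ∩ (V - t)
    return 2 * (length - overlap) / length


for r in (10, 1000, 100000):
    symmetric = average_of_half_line(-r, r)    # exhaustive family {[-r, r]}
    skewed = average_of_half_line(-r, 2 * r)   # exhaustive family {[-r, 2r]}
    surface = shifted_symdiff_ratio(-r, r, 3)  # relative size of the symmetric difference, t = 3
    print(r, symmetric, skewed, float(surface))
```

The symmetric boxes assign the half-line an average of 1/2 and the skewed boxes 2/3, while for the translation the relative size of $T^{-t}V \triangle V$ decays like $|t|/r$, which is the behavior required by (A1).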
This means that the scale of the dynamics is local and not global: If V is a very large set “approximating M”, then its evolution at a fixed time should differ little from V, in relative terms. (In statistical mechanics one would say that, over a large region of the space, the dynamics can only produce negligible surface effects; we will come back to this point in Sect. 3.1.) One can expect (A1) to hold in most situations. It does, for instance, when M is a metric space such that the μ-measure of a ball grows like a power of the radius, uniformly in the center of the ball; the dynamics is bounded, i.e., ∀t ∈ G, ∃K = K(t) such that, for μ-a.e. x ∈ M, dist(T^t x, x) ≤ K; and the elements of V are balls.

2.2. A couple of examples. A very simple example of this setup is the dynamical system defined as follows. Set

\[ \varphi(x) := \begin{cases} 3x - 1, & \text{for } x \in [0,1); \\ 0, & \text{otherwise.} \end{cases} \tag{2.4} \]

Then, on M = R, define the self-map

\[ T x := \sum_{j \in \mathbb{Z}} \varphi(x - j) + j. \tag{2.5} \]

In other words, T x = 3(x − j) − 1 + j whenever x ∈ [j, j + 1), j ∈ Z.
The definition is well-posed because, for any given x ∈ R, only one term of the sum is nonzero. Furthermore, each x has three distinct counterimages via T , where the derivative of the map is constantly equal to 3. This shows that T is a noninvertible map which preserves the Lebesgue measure μ. (M, μ, T ) describes a random walk in Z, in the following sense: Suppose an initial condition x ∈ [0, 1) is randomly chosen according to μ0 , the Lebesgue measure in [0, 1) (it is no loss of generality to restrict to [0, 1) because the dynamical system is clearly invariant for the action of Z). Then T x will land in one of the intervals [−1, 0), [0, 1), [1, 2), with probability 1/3 in each case. More generally, if [x] := max {m ∈ Z | m ≤ x} denotes the integer part of x ∈ R, we have
\[ \mu_0\bigl(\{ x \in [0,1) \mid [T^n x] = k_n,\ [T^{n-1} x] = k_{n-1},\ \dots,\ [T x] = k_1 \}\bigr) = 3^{-n}, \tag{2.6} \]

provided that |k_j − k_{j−1}| ≤ 1, for j = 1, …, n (with k_0 = 0). This implies that, using the notation of conditional probability,

\[ \mu_0\bigl([T^n x] = k_n \mid [T^{n-1} x] = k_{n-1}, \dots, [T x] = k_1\bigr) = \mu_0\bigl([T^n x] = k_n \mid [T^{n-1} x] = k_{n-1}\bigr) = 1/3. \tag{2.7} \]

Hence, {[T^n x]}_n is a Markov chain in Z with same-site and nearest-neighbor jumps, each with probability 1/3; namely, it is a (space-)homogeneous random walk.

As for the choice of V, the example of the Introduction would seem to suggest that we pick sets of the type V = [−r, r] (with r ∈ R) or, to fully exploit the Z-structure of this dynamical system with no appreciable loss of generality, sets of the type V = [−k, k] (with k ∈ N). Although this is a legitimate choice, we will see later that a better option is

\[ V := \{ [k, \ell] \mid k, \ell \in \mathbb{Z},\ k < \ell \}. \tag{2.8} \]
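As a quick numerical illustration of (2.6), the following sketch iterates the map on an evenly spaced grid of initial conditions in [0, 1) (a stand-in for Lebesgue-random points) and counts the itineraries of integer parts. It assumes the reading T x = 3(x − j) − 1 + j for x ∈ [j, j + 1); the function names are illustrative and not part of the paper.

```python
import math
from collections import Counter

def T(x):
    """One step of the map (2.5), read as T x = 3(x - j) - 1 + j for x in [j, j+1)."""
    j = math.floor(x)
    return 3.0 * (x - j) - 1.0 + j

def itinerary_counts(n, samples):
    """Count itineraries ([Tx], ..., [T^n x]) over midpoints of a uniform grid on [0, 1)."""
    counts = Counter()
    for i in range(samples):
        x = (i + 0.5) / samples      # grid midpoint, never on a branch boundary
        path = []
        for _ in range(n):
            x = T(x)
            path.append(math.floor(x))
        counts[tuple(path)] += 1
    return counts

counts = itinerary_counts(n=2, samples=9000)
for path, c in sorted(counts.items()):
    print(path, c / 9000)            # each admissible itinerary has frequency 3**-2
```

With n = 2 there are nine admissible itineraries (k_1, k_2), each with |k_1| ≤ 1 and |k_2 − k_1| ≤ 1, and the grid assigns the same weight to each of them.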
(Actually, this will be a crucial part of our discussion, and we refer the reader to Sect. 3.1.) For V = [k, ℓ], n ∈ N, and ℓ − k > 2n, it is easy to verify that

\[ [k + n, \ell - n] \subset T^{-n} V \subset [k - n, \ell + n], \tag{2.9} \]

which implies (A1).

A less trivial system that fits the framework we are describing well is the (aperiodic) Lorentz gas [L1,L2]. In R^2 (just to fix the smallest interesting dimension) a Lorentz gas is the billiard system in C := R^2 \ ⋃_{n∈N} O_n, where {O_n} is a countable collection of pairwise disjoint, convex, bounded regular sets. This means that a point particle moves with constant unit velocity in C until it hits an obstacle O_n, which reflects the particle according to the Fresnel law: the angle of reflection equals the angle of incidence (the modulus of the velocity remains equal to 1). The phase space for this system is then M = C × S^1, where q ∈ C represents the position and v ∈ S^1 the velocity of the particle. If T^t denotes the flow just described (which is unambiguously defined at all noncollision times), it is well known that T^t preserves the Liouville measure μ, which turns out to be the product of the Lebesgue measure on C and the Haar measure on S^1. Clearly, save for bizarre situations, μ(M) = ∞. Since sufficient conditions for the ergodicity of (M, μ, T) are known [L1], and since, for the finite-measure version of the Lorentz gas (the so-called Sinai billiard), mixing and stronger stochastic properties essentially follow from ergodicity [Si,BS,CM], it is of interest to devise one or more sound definitions of mixing for this dynamical system. As for the exhaustive family V, in analogy to the previous system, a reasonable choice would be

\[ V := \{ (C \cap R) \times S^1 \mid R = [a,b] \times [c,d],\ \text{with } a < b,\ c < d \}. \tag{2.10} \]

In Sect. 4 we will present a third example, which generalizes, in more than one way, the first system introduced above. It is a class of invertible dynamical systems representing all homogeneous random walks in Z^d. One of its points of relevance is that it is designed to retain the most essential features of the Lorentz gas discussed above. It is thus a greatly simplified toy model, which we are able to study in depth. As a matter of fact, it will be the testing ground for our new notions of mixing, cf. Sects. 4 and 5.
2.3. Global and local observables. In order to define a surrogate probability measure for our system, we need to declare what we intend to measure. In other words, we need to specify the observables, namely, the functions M → R which represent the (sole) information that we can get on the state of the system. In finite ergodic theory, this class of functions is L^1(M, A, μ), or sometimes L^2(M, A, μ). In virtually every situation, both are amply sufficient to give a full description of the state of the system (in fact, quite generally, the position itself of x ∈ M is given by a finite number of square-integrable functions). This is conspicuously not true in infinite ergodic theory. Indeed, the forthcoming discussion will try to convince the reader that the choice of the observables is precisely at the heart of the matter in infinite-measure mixing (a point that, after all, was already contained in [KS]).

We deal with two categories of observables: the global, or macroscopic, observables and the local, or microscopic, observables. Let us start by introducing the former, whose space we denote by G. We will not give a definition, but rather a presentation of the minimal features that G should have.
A precise definition only makes sense on a case-by-case basis and, indeed, the choice of G is part of our description of the system, just like the choice of V. As a typographical rule, we indicate a global observable with an upper-case Roman letter, as in F : M → R. We require at least the following conditions:

(A2) G ⊂ L^∞(M, A, μ);

(A3) ∀F ∈ G,

\[ \exists\ \overline{\mu}(F) := \lim_{V \nearrow M} \mu_V(F) := \lim_{V \nearrow M} \frac{1}{\mu(V)} \int_V F \, d\mu. \]

We call $\overline{\mu}(F)$ the average of F (w.r.t. μ and V). This functional is dynamics-invariant:

Lemma 2.4. Under Assumptions (A1)–(A3), $\overline{\mu}(F) = \overline{\mu}(F \circ T^t)$, ∀t ∈ G.

Proof. Using the invariance of μ and then (A1)–(A2), we have

\[ \frac{1}{\mu(V)} \int_V F \, d\mu = \frac{1}{\mu(V)} \int_{T^{-t}V} (F \circ T^t) \, d\mu = \frac{1}{\mu(V)} \int_V (F \circ T^t) \, d\mu + o(1). \tag{2.11} \]

Applying (A3) gives the assertion.
As for the class of local observables, denoted by L, this can be generally taken to be L^1(M, A, μ). As we will see below, this choice is much less delicate than the choice of G. Nonetheless, some results may require occasional restrictions on L^1, so, in the same spirit as (A2), we only require

(A4) L ⊆ L^1(M, A, μ).

Local observables are indicated with a lower-case Roman letter, as in g : M → R.

3. Definitions and Related Questions

3.1. Global-global mixing. On the basis of Lemma 2.4, one might attempt the following definition of mixing:

(M1) ∀F, G ∈ G, $\lim_{t \to \infty} \overline{\mu}((F \circ T^t) G) = \overline{\mu}(F)\, \overline{\mu}(G)$,

provided that $\overline{\mu}((F \circ T^t) G)$ exists for all t ∈ G, or at least for t large enough. This last point represents a problem, because it is not easy, in general, to devise a space G with the property that F, G ∈ G implies (F ∘ T^t)G ∈ G for all large t (sometimes G is not even T^t-invariant, cf. (4.9) later on).

Generally speaking, there are only two solutions to this problem (which would be more honestly described as ways around it). The first solution is to declare that this question should be dealt with on a case-by-case basis. The second solution is to devise another definition of mixing which just does away with the problem:

(M2) ∀F, G ∈ G, $\lim_{t \to \infty,\ V \nearrow M} \mu_V((F \circ T^t) G) = \overline{\mu}(F)\, \overline{\mu}(G)$,

having adopted the notation μ_V(·) = ∫_V (·) dμ / μ(V), as introduced in (A3). The above means that, ∀ε > 0, ∃M > 0 such that, for all t ≥ M and V ∈ V with μ(V) ≥ M,

\[ \bigl| \mu_V((F \circ T^t) G) - \overline{\mu}(F)\, \overline{\mu}(G) \bigr| < \varepsilon. \tag{3.1} \]

Though cast in a less polished form than (M1), (M2) still retains a great deal of the physical meaning of mixing because it prescribes that, if the region V is big enough and
the time t is large enough, the two observables F ∘ T^t and G are practically uncorrelated on V. Actually, in some sense, (M2) is even stronger than (M1), because it implies that, fixed V, (3.1) occurs uniformly in t, for t large. The same is not guaranteed by (M1). See also Proposition 3.1 later on. We refer to (M1) and (M2) as definitions of global-global mixing because they consider the coupling of two global observables.

Let us now focus on a couple of less technical and more substantial questions concerning both (M1) and (M2). The first has to do with the importance of V too, not just G, for either condition to function as a sound definition of mixing. Let us exemplify the question by means of the dynamical system defined by (2.4)–(2.5). This is a system that should be classified as mixing by any reasonable definition (cf. also Sect. 4). Suppose that for that system we had made the first, more restrictive, choice of V presented in Sect. 2.2, that is, we had chosen sets of the type V = [−k, k], with k ∈ N. The function F : R → R, defined by

\[ F(x) := \begin{cases} -1, & \text{for } x < 0, \\ 1, & \text{for } x \ge 0, \end{cases} \tag{3.2} \]

is bounded and has average $\overline{\mu}(F) = 0$. Now, fix n > 0 and consider F ∘ T^n. Given the action of the map T and its interpretation as a random walk, it is not hard to see that, for x < −n, F(T^n x) = −1 and, for x ≥ n, F(T^n x) = 1 (determining F(T^n x) for x ∈ [−n, n) is more complicated and irrelevant here). This and (3.2) imply that, for |x| > n, F(T^n x)F(x) = 1. Therefore, for k much larger than n and V = [−k, k], μ_V((F ∘ T^n)F) is very close to 1 and indeed $\overline{\mu}((F \circ T^n) F) = 1$. Since $\overline{\mu}(F) = 0$, this shows that, for F as in (3.2) and G = F, both limits in (M1) and (M2) fail to hold! But this is reasonable: after all, F has variations (causing it to be nonconstant) only on a negligible set, namely {x = 0} ⊂ M. By negligible set we mean, in this context, a set whose ρ-neighborhoods, for all ρ > 0, have “measure” zero w.r.t. $\overline{\mu}$.
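The computation just described can be reproduced numerically. The sketch below approximates μ_V by an average over grid midpoints and evaluates both the average of F and the correlation of F ∘ T^n with F on V = [−k, k], as well as the average of F on a translated window. It assumes the reading T x = 3(x − j) − 1 + j on [j, j + 1) of (2.5); all function names are illustrative.

```python
import math

def T(x):
    """One step of the map (2.5), read as T x = 3(x - j) - 1 + j for x in [j, j+1)."""
    j = math.floor(x)
    return 3.0 * (x - j) - 1.0 + j

def F(x):
    """The global observable (3.2)."""
    return -1.0 if x < 0 else 1.0

def average(func, a, b, per_unit=300):
    """Midpoint-rule approximation of mu_V(func) for V = [a, b]."""
    m = int((b - a) * per_unit)
    return sum(func(a + (i + 0.5) * (b - a) / m) for i in range(m)) / m

def correlation(n):
    """Return the function x -> F(T^n x) F(x)."""
    def prod(x):
        y = x
        for _ in range(n):
            y = T(y)
        return F(y) * F(x)
    return prod

k, n = 50, 3
print(average(F, -k, k))               # symmetric window: the two signs cancel
print(average(correlation(n), -k, k))  # at least 1 - 2n/k, although the average of F is 0
print(average(F, 100, 120))            # a translated window sees only F = 1
```

The last line hints at why translated windows behave differently: on windows far to the right the average of F is 1, on windows far to the left it is −1, so F has no well-defined average once such windows are admitted.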
Therefore this is an instance of the phenomenon, which is well known in statistical mechanics, whereby the infinite-volume limit does not see surface effects. (When the dynamics is bounded, in the sense specified in the last paragraph of Sect. 2.1, the evolution of F can produce no more than surface effects.) So we must avoid global observables that have significant variations on negligible sets. But this does not mean that we should cherry-pick our observables (although there is nothing wrong with that)! In the case at hand, for instance, the unusable observables are automatically eliminated by a smarter choice of V , the one given in (2.8). That exhaustive family is translation invariant, which seems right for a system that is translation invariant. An analogous discussion can be made for the other example presented in Sect. 2.2, the Lorentz gas, and for most extended dynamical systems one can imagine. We may conclude that V should not be so small as to make the verification of (M1) impossible nor, at the same time, so large as to make the class of global observables satisfying (A3) too meager. A happy medium might be for V to include all the symmetries, or “quasi-symmetries”, of the system and no more. The second, and more critical, question concerning (M1)–(M2) is that these definitions are completely blind to the local aspects of the dynamics. For instance, they are not able to detect an invariant set A, if μ(A) < ∞. One can easily produce a system that is mixing in one of the above senses, but not ergodic as per Definition 2.1. (For example, take a Lorentz gas and make one scatterer hollow: points inside that scatterer will stay confined there, thus breaking ergodicity, while all other trajectories will be the same as
in the unperturbed system, which one believes to be at least (M1)-mixing, with the right choice of G and V.) We have no fix for this issue, other than giving a few more definitions which take into account local observables as well.

3.2. Global-local mixing. The most natural way to couple global and local observables in a definition of mixing is this:

(M4) ∀F ∈ G, ∀g ∈ L, $\lim_{t \to \infty} \mu((F \circ T^t) g) = \overline{\mu}(F)\, \mu(g)$,

where we have used the convenient notation μ(g) := ∫_M g dμ. This is the first notion of global-local mixing we give. Since for some systems, such as the Lorentz gas, this can be rather hard to prove [L3], we give a weaker version as well:

(M3) ∀F ∈ G, ∀g ∈ L with μ(g) = 0, $\lim_{t \to \infty} \mu((F \circ T^t) g) = 0$.

(Notice that (M3) and (M4) are equivalent in ordinary ergodic theory, because one can always subtract a constant function from any observable in order to make its integral vanish. Not so in infinite measure!) We will see momentarily that, for systems for which a uniform version of (M4) can be established, it is possible to pass from global-local mixing to global-global mixing. So we give one last definition:

(M5) ∀F ∈ G, $\lim_{t \to \infty} \sup_{g \in L \setminus 0} \frac{1}{\mu(|g|)} \bigl| \mu((F \circ T^t) g) - \overline{\mu}(F)\, \mu(g) \bigr| = 0$.

3.3. Summary of assumptions and definitions. For the convenience of the reader, we summarize here all the assumptions we have made and all the definitions of mixing we have given, listing the latter in the correct hierarchical order, as clarified by Propositions 3.1 and 3.2 below.

The following are the minimal requirements on the dynamical system (M, A, μ, {T^t}), the exhaustive family V, the space of the global observables G, and the space of the local observables L:

(A1) For any fixed t ∈ G, for V ↗ M, μ(T^{−t}V △ V) = o(μ(V)).
(A2) G ⊂ L^∞(M, A, μ).
(A3) ∀F ∈ G, ∃ $\overline{\mu}(F) := \lim_{V \nearrow M} \mu_V(F) := \lim_{V \nearrow M} \frac{1}{\mu(V)} \int_V F \, d\mu$.
(A4) L ⊆ L^1(M, A, μ).

The definitions of global-global mixing are:

(M1) ∀F, G ∈ G, $\lim_{t \to \infty} \overline{\mu}((F \circ T^t) G) = \overline{\mu}(F)\, \overline{\mu}(G)$.
(M2) ∀F, G ∈ G, $\lim_{t \to \infty,\ V \nearrow M} \mu_V((F \circ T^t) G) = \overline{\mu}(F)\, \overline{\mu}(G)$.

The definitions of global-local mixing are:

(M3) ∀F ∈ G, ∀g ∈ L with μ(g) = 0, $\lim_{t \to \infty} \mu((F \circ T^t) g) = 0$.
(M4) ∀F ∈ G, ∀g ∈ L, $\lim_{t \to \infty} \mu((F \circ T^t) g) = \overline{\mu}(F)\, \mu(g)$.
(M5) ∀F ∈ G, $\lim_{t \to \infty} \sup_{g \in L \setminus 0} \frac{1}{\mu(|g|)} \bigl| \mu((F \circ T^t) g) - \overline{\mu}(F)\, \mu(g) \bigr| = 0$.
Proposition 3.1. Under all the assumptions made so far, (M5) ⇒ (M4) ⇒ (M3). Furthermore, (M2) implies that the limit in (M1) holds for all pairs F, G ∈ G such that $\overline{\mu}((F \circ T^t) G)$ exists for all t large enough.

Proof. The chain of implications is obvious. The last assertion follows directly from the definition of the double limit t → ∞, V ↗ M; cf. Sect. 3.1.

With reasonable hypotheses, the strongest version of global-local mixing implies the “strongest” version of global-global mixing:

Proposition 3.2. Suppose that every G ∈ G can be written μ-almost everywhere as

\[ G(x) = \sum_{j \in \mathbb{N}} g_j(x), \quad \text{with } g_j \in L, \]

and, for every V ∈ V, there exists a finite subset J_V of N, such that

\[ \mu\Bigl( \Bigl| G\, 1_V - \sum_{j \in J_V} g_j \Bigr| \Bigr) = o(\mu(V)); \tag{3.3} \]

\[ \sum_{j \in J_V} \| g_j \|_{L^1} = O(\mu(V)). \tag{3.4} \]

Then (M5) ⇒ (M2).

Remark 3.3. The hypotheses of Proposition 3.2 above are less cumbersome than they appear. One should think of the very common situation in which M admits a partition of unity, Σ_j ψ_j(x) ≡ 1, where the ψ_j are nonnegative integrable functions which are roughly translations of one another. In many such cases one can expect g_j := Gψ_j to verify all of the above conditions. At any rate, if the g_j are all nonnegative or all nonpositive, then (3.4) follows from (3.3).

Proof of Proposition 3.2. Fix F, G ∈ G. We may assume that $\overline{\mu}(F) \ne 0$, otherwise in the following argument we consider F_c := F + c, where c is a nonnull constant, and easily derive the sought result at the end of the proof. Take ε > 0 and denote for short F^t := F ∘ T^t and g_V := Σ_{j∈J_V} g_j. Equation (3.3) and (A3) imply that, for μ(V) large enough,

\[ \Bigl| \frac{\mu(g_V)}{\mu(V)} - \overline{\mu}(G) \Bigr| \le \Bigl| \frac{\mu(g_V)}{\mu(V)} - \mu_V(G) \Bigr| + \bigl| \mu_V(G) - \overline{\mu}(G) \bigr| \le \frac{\varepsilon}{3\, |\overline{\mu}(F)|}, \tag{3.5} \]

and, since F is bounded,

\[ \Bigl| \mu_V(F^t G) - \frac{\mu(F^t g_V)}{\mu(V)} \Bigr| \le \frac{\varepsilon}{3}. \tag{3.6} \]
On the other hand, (M5) implies that

\[ \bigl| \mu(F^t g_j) - \overline{\mu}(F)\, \mu(g_j) \bigr| \le \vartheta(t)\, \| g_j \|_{L^1}, \tag{3.7} \]

where lim_{t→∞} ϑ(t) = 0 and ϑ(t) does not depend on j or V. Summing over j ∈ J_V and using (3.4), one gets

\[ \Bigl| \frac{\mu(F^t g_V)}{\mu(V)} - \frac{\overline{\mu}(F)\, \mu(g_V)}{\mu(V)} \Bigr| \le \vartheta(t)\, \frac{\sum_{j \in J_V} \| g_j \|_{L^1}}{\mu(V)} \le \frac{\varepsilon}{3}, \tag{3.8} \]

for both μ(V) and t large enough. Putting together (3.6), (3.8) and (3.5), in that order, we conclude that there exists M = M(ε) such that, for μ(V) ≥ M and t ≥ M,

\[ \bigl| \mu_V(F^t G) - \overline{\mu}(F)\, \overline{\mu}(G) \bigr| \le \varepsilon, \tag{3.9} \]

which is precisely (M2).

4. Mixing for Random Walks
In this section we see how the previous definitions play out for a fairly representative family of infinite measure-preserving dynamical systems. These are lattices of coupled baker’s maps which generalize the random walk of Sect. 2.2 in two ways. First and foremost, they represent all the random walks in Z^d. Secondly, they are invertible dynamical systems, which can be reduced, for example, to the noninvertible dynamical system of Sect. 2.2 by a mere restriction of the σ-algebra.

To begin with, let a random walk in Z^d be defined by the transition probabilities {p_β}_{β∈Z^d}, with p_β ≥ 0 and Σ_β p_β = 1. This means that, if the walker is in α ∈ Z^d, he will have probability p_β to move to α + β in the next step. We introduce some notation that will be useful later:

\[ D := \{ \beta \in \mathbb{Z}^d \mid p_\beta > 0 \} =: \{ \beta^{(j)} \}_{j \in \mathbb{Z}_N} \tag{4.1} \]

is the set of the “active” directions for the random walk, endowed with some enumeration Z_N ∋ j ↦ β^{(j)} ∈ D. If D is infinite, then N := ∞ and Z_N := Z+; if D is finite, then N denotes its cardinality and Z_N := Z+ ∩ [1, N].

We view this random walk as a dynamical system (M, A, μ, T), where:

• M := Z^d × [0, 1)^2. If we denote S_α := {α} × [0, 1)^2, for α ∈ Z^d, then M = ⋃_α S_α can be interpreted as the disjoint union of Z^d copies of the unit square.
• A is the natural σ-algebra for M, i.e., the σ-algebra generated by all the Lebesgue-measurable subsets of S_α, ∀α ∈ Z^d, with the natural identification S_α ≅ [0, 1)^2.
• μ is the infinite measure that coincides with the Lebesgue measure when restricted to each S_α.
• In order to define T, set q_0 = 0 and, for k ∈ Z_N,

\[ q_k := \sum_{j=1}^{k} p_{\beta^{(j)}}; \qquad R_k := [q_{k-1}, q_k) \times [0, 1). \tag{4.2} \]
Clearly {R_k}_{k∈Z_N} is a partition of [0, 1)^2 into adjacent rectangles of height 1 and width, respectively,

\[ q_k - q_{k-1} = p_{\beta^{(k)}} = \mu(R_k). \tag{4.3} \]

For x = (α, y) = (α; y_1, y_2) ∈ Z^d × [0, 1)^2, let k ∈ Z_N be the unique positive integer such that y ∈ R_k (equivalently, q_{k−1} ≤ y_1 < q_k). One defines

\[ T x = T(\alpha; y_1, y_2) := \Bigl( \alpha + \beta^{(k)};\ p_{\beta^{(k)}}^{-1} (y_1 - q_{k-1}),\ p_{\beta^{(k)}}\, y_2 + q_{k-1} \Bigr). \tag{4.4} \]

Therefore T is a piecewise linear, hyperbolic, invertible map M → M which preserves μ (because its determinant, in the variables (y_1, y_2), is 1). Denoting R_{α,k} := {α} × R_k, it is easy to see that T is a Markov map for the partition {R_{α,k}}_{α,k}.

Now define ψ : M → Z^d as ψ(α, y) = α. It is evident that, having chosen x = (0, y) at random in S_0 w.r.t. μ, the stochastic process {ψ(T^n x)}_{n∈N} is the random walk introduced at the beginning of the section, with initial position in the origin.

Moving on, we need to specify V, the exhaustive family of sets that determines the infinite-volume limit: For all γ = (γ_1, …, γ_d) ∈ Z^d and r ∈ Z+, the set

\[ B_{\gamma, r} = \{ \alpha = (\alpha_1, \dots, \alpha_d) \in \mathbb{Z}^d \mid \gamma_i - r \le \alpha_i \le \gamma_i + r,\ \forall i = 1, \dots, d \} \tag{4.5} \]

is called a square box in Z^d. Then we set

\[ V := \{ B_{\gamma, r} \times [0, 1)^2 \mid \gamma \in \mathbb{Z}^d,\ r \in \mathbb{Z}^+ \}. \tag{4.6} \]
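The map (4.4) can be sketched in a few lines for d = 1 and a finite set D of active directions. The code below is only an illustration under those assumptions: it uses exact rational arithmetic, the names `make_maps`, `T` and `T_inv` are ours, and the inverse is obtained by reading (4.4) backwards (the image of R_k is the horizontal strip [0, 1) × [q_{k−1}, q_k)).

```python
from fractions import Fraction

def make_maps(steps):
    """steps: list of (beta, p_beta), beta an int (d = 1), p_beta > 0 Fractions summing to 1."""
    q = [Fraction(0)]
    for _, p in steps:
        q.append(q[-1] + p)          # q_k = p_{beta(1)} + ... + p_{beta(k)}, cf. (4.2)
    assert q[-1] == 1

    def T(x):
        alpha, y1, y2 = x
        for k, (beta, p) in enumerate(steps, start=1):
            if q[k - 1] <= y1 < q[k]:            # y lies in the rectangle R_k
                return (alpha + beta, (y1 - q[k - 1]) / p, p * y2 + q[k - 1])
        raise ValueError("y1 must lie in [0, 1)")

    def T_inv(x):
        alpha, y1, y2 = x
        for k, (beta, p) in enumerate(steps, start=1):
            if q[k - 1] <= y2 < q[k]:            # image strip of R_k
                return (alpha - beta, p * y1 + q[k - 1], (y2 - q[k - 1]) / p)
        raise ValueError("y2 must lie in [0, 1)")

    return T, T_inv

steps = [(-1, Fraction(1, 4)), (0, Fraction(1, 2)), (1, Fraction(1, 4))]
T, T_inv = make_maps(steps)
x = (0, Fraction(1, 2), Fraction(1, 3))
print(T(x))              # (0, Fraction(1, 2), Fraction(5, 12))
print(T_inv(T(x)) == x)  # True
```

The horizontal coordinate is stretched by 1/p while the vertical one is contracted by p, which is the area-preserving, baker-type behavior described above.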
Remark 4.1. Clearly, definition (4.5) does not capture all the square boxes in Z^d, but only those whose side length is odd. This choice, made on grounds of simplicity, does not really limit the generality of V, and indeed the forthcoming results can be proven even in the case when (4.6) is modified to include all square boxes.

Lemma 4.2. The dynamical system and the exhaustive family defined above verify (A1).

When D is finite, the proof of Lemma 4.2 is rather straightforward, along the same lines as (2.9) for the first example of Sect. 2.2. In the general case, it has inessential technical complications, so we postpone it to Sect. A.1 of the Appendix.

In order to introduce our observables, we need some preliminary notation. Let B ⊂ A be the σ-algebra on M generated by the partition {S_α} and, as is customary, T^m B = {T^{−m}A | A ∈ B}. Then, for ℓ, m ∈ Z with ℓ ≤ m, define

\[ B_{\ell, m} := \bigvee_{j=\ell}^{m} T^j B. \tag{4.7} \]

To fix the ideas, B_{0,1} is the σ-algebra corresponding to the partition {R_{α,k}}. More generally, consider ℓ < 0 < m: Recalling that N = #D is the number of rectangles in each partition {R_k} of S_α, one can see that the fundamental partition of B_{ℓ,m} is made up of N^{|ℓ|+m} rectangles whose widths are bounded by λ^m and whose heights are bounded by λ^{|ℓ|}, where λ := max{p_β}. Finally, set

\[ B_{\ell, +\infty} := \bigvee_{m \in \mathbb{N}} B_{\ell, m}; \qquad B_{-\infty, m} := \bigvee_{-\ell \in \mathbb{N}} B_{\ell, m}; \qquad B_{-\infty, +\infty} := \bigvee_{-\ell, m \in \mathbb{N}} B_{\ell, m}. \tag{4.8} \]
If we exclude, now and for the remainder of the section, the trivial case where N = 1 (i.e., p_β = 0, ∀β ≠ β^{(1)}), one has that λ < 1. This and the previous observation on the fundamental sets of B_{ℓ,m} imply that the sets of B_{0,+∞} are measurable unions of segments of the type {α} × {y_1} × [0, 1). We call those the local stable manifolds (LSMs) of the system and B_{0,+∞} the stable σ-algebra, also denoted by A_s. Analogously, A_u := B_{−∞,0} is called the unstable σ-algebra and its sets are measurable unions of local unstable manifolds (LUMs) {α} × [0, 1) × {y_2}. Clearly, then, B_{−∞,+∞} = A.

We define several classes of global observables:

\[ G_m := \{ F \in L^\infty(M, B_{-m,m}, \mu) \mid \exists\ \overline{\mu}(F) \text{ as per definition (A3)} \}; \tag{4.9} \]

\[ G := \overline{\bigcup_{m \in \mathbb{N}} G_m}, \tag{4.10} \]

where the closure is meant in the L^∞ norm. One should notice that

Lemma 4.3. Given the definitions (4.9)–(4.10),

\[ G = \Bigl\{ F \in \overline{\bigcup_{m \in \mathbb{N}} L^\infty(M, B_{-m,m}, \mu)} \ \Bigm|\ \exists\ \overline{\mu}(F) \Bigr\}. \]

Proof. Let us prove the left-to-right inclusion. Given F ∈ G and n ≥ 1, there exists an F_n ∈ ⋃_m G_m such that ‖F_n − F‖_{L^∞} ≤ 1/n. This implies that

\[ \overline{\mu}(F_n) - \frac{1}{n} \le \liminf_{V \nearrow M} \mu_V(F) \le \limsup_{V \nearrow M} \mu_V(F) \le \overline{\mu}(F_n) + \frac{1}{n}. \tag{4.11} \]

On the other hand, {$\overline{\mu}(F_n)$} is a Cauchy sequence, because {F_n} is. Its convergence thus proves the existence of $\overline{\mu}(F)$.

Conversely, if F is an observable in the closure of ⋃_m L^∞(M, B_{−m,m}, μ) for which $\overline{\mu}(F)$ exists, then, for any ε > 0, there exist m ∈ N and F′ ∈ L^∞(M, B_{−m,m}, μ) such that

\[ \| F' - F \|_{L^\infty} \le \varepsilon / 2. \tag{4.12} \]

Denoting F″ := E(F | B_{−m,m}), (4.12) implies that ‖F″ − F′‖_{L^∞} ≤ ε/2, hence ‖F″ − F‖_{L^∞} ≤ ε. Furthermore, since V ⊂ B ⊂ B_{−m,m}, then $\overline{\mu}(F'')$ exists and equals $\overline{\mu}(F)$. This shows that F″ ∈ G_m. Therefore F ∈ G.

Remark 4.4. In view of Lemma 4.3, one might wonder why we do not consider the more natural class

\[ G_\infty := \{ F \in L^\infty(M, A, \mu) \mid \exists\ \overline{\mu}(F) \} \tag{4.13} \]

instead of G. (The inclusion G ⊂ G_∞ is clearly strict.) The reason, which will hardly surprise the “hyperbolic” dynamicist, is that we need to approximate a global observable with locally constant functions uniformly over M, cf. (4.9)–(4.10). At any rate, G does not lack generality: for instance, any uniformly continuous F verifying (A3) belongs in that class.
As for the local observables, we also introduce countably many classes:

\[ L_m := L^1(M, B_{-m,m}, \mu); \tag{4.14} \]

\[ L := L^1(M, A, \mu). \tag{4.15} \]

Prior to stating the main theorem of this section, we give a lemma that will help appreciate its statement. If {α^{(j)}}_{j∈J} ⊂ Z^d, the expression span_Z{α^{(j)}}_{j∈J} denotes the subgroup of all the finite linear combinations of the α^{(j)} with coefficients in Z.

Lemma 4.5. Let {β^{(j)}}_{j∈Z_N} ⊂ Z^d and j′ ∈ Z_N. Then span_Z{β^{(j)} − β^{(j′)}}_{j∈Z_N} does not depend on j′.

Proof. Section A.2 of the Appendix.

Theorem 4.6. Let (M, A, μ, T) be the dynamical system described above. Set ν := max{2, [d/2] + 1}, where [·] is the integer part of a positive number, and suppose that

(i) the probability distribution p has a finite ν-th moment: Σ_{β∈Z^d} |β|^ν p_β < ∞;
(ii) for a given j′ ∈ Z_N, span_Z{β^{(j)} − β^{(j′)}}_{j∈Z_N} = Z^d.

Then the system is mixing in the following senses:

(a) (M5) relative to G_m and L_m, for all m ∈ N;
(b) (M4)–(M3) relative to G and L;
(c) (M2) relative to G;
(d) (M1) relative to G_m, for all m ∈ N, with the extra requirement that F be Z^d-periodic, i.e., F(α, y) = F(0, y), ∀α ∈ Z^d, ∀y ∈ [0, 1)^2.

Remark 4.7. Condition (ii) is essential as it has to do with the irreducibility of the random walk [Sp]. In fact, assuming for simplicity that 0 ∈ D, if span_Z(D) ≠ Z^d, then the random walk is reducible and the system cannot be mixing in any sense, as is ascertained via the global observable

\[ F(\alpha, y) := \begin{cases} 1, & \text{for } \alpha \in \mathrm{span}_{\mathbb{Z}}(D); \\ 0, & \text{otherwise.} \end{cases} \tag{4.16} \]

One last observation that may be of interest is that statement (d) is far from optimal. (M1) holds for a much larger class of global observables, depending especially on the distribution {p_β}. In the formulation of Theorem 4.6, however, I was mainly interested in a nontrivial case in which (M1) could be verified easily.

5. Proof of Theorem 4.6

Since the proof is rather lengthy, we will divide it into pieces, or stages, as follows:

Stage 1: We prove (a) using three extra assumptions.
Stage 2: We remove one of the extra assumptions.
Stage 3: We remove the remaining two extra assumptions.
Stage 4: We prove (b)–(d).
5.1. Stage 1: Extra assumptions. Let us initially assume that:

(E1) F ∈ A_s = B_{0,+∞};
(E2) L_m only comprises indicator functions of the type g = 1_Q, where Q is a fundamental set of B_{−m,0} and Q ⊆ S_0;
(E3) the random walk has zero drift, i.e., Σ_{β∈Z^d} β p_β = 0.

From (E2), Q is a rectangle of the type {0} × [0, 1) × I. We denote the length of I by

\[ h := |I| = \mu(Q). \tag{5.1} \]

In this setting, (M5) amounts to showing that

\[ \lim_{n \to \infty} \frac{1}{\mu(|g|)} \int_M (F \circ T^n)\, g \, d\mu = \lim_{n \to \infty} \frac{1}{h} \int_{T^n Q} F \, d\mu = \overline{\mu}(F), \tag{5.2} \]

uniformly in Q, that is, with a speed of convergence that does not depend on the choice of I. (In the above we have used the invariance of μ and the fact that μ(|g|) = h = μ(g).) Achieving this will be Stage 1 of the proof.

Q can be thought of as partitioned into {Q ∩ R_{0,j}}_{j∈Z_N}, which are rectangles of width p_{β^{(j)}} and height h. By construction, T acts on each such rectangle by stretching it horizontally and shrinking it vertically by a factor p_{β^{(j)}}^{−1}, and then mapping the resulting rectangle, of width 1, rigidly into S_{β^{(j)}}. Iterating this procedure n times, we obtain

\[ T^n Q = \bigcup_{j_1, \dots, j_n \in \mathbb{Z}_N} Q_{j_1, \dots, j_n} := \bigcup_{j_1, \dots, j_n \in \mathbb{Z}_N} \{ \alpha_{j_1, \dots, j_n} \} \times [0, 1) \times I_{j_1, \dots, j_n}, \tag{5.3} \]

which is a disjoint union of N^n thin rectangles of width 1. Each Q_{j_1,…,j_n} is the set of all the points T^n x for which the trajectory of x ∈ Q followed the itinerary S_{β^{(j_1)}}, S_{β^{(j_1)}+β^{(j_2)}}, …, S_{α_{j_1,…,j_n}}, where

\[ \alpha_{j_1, \dots, j_n} := \beta^{(j_1)} + \beta^{(j_2)} + \cdots + \beta^{(j_n)}. \tag{5.4} \]

Therefore, recalling (5.1), the height of Q_{j_1,…,j_n} is

\[ h_{j_1, \dots, j_n} := |I_{j_1, \dots, j_n}| = \mu(Q_{j_1, \dots, j_n}) = h\, p_{\beta^{(j_1)}} p_{\beta^{(j_2)}} \cdots p_{\beta^{(j_n)}}. \tag{5.5} \]

Since F ∈ A_s, we can write, with a slight abuse of notation, F(α; y_1, y_2) = F_α(y_1). Then

\[ \begin{aligned} \int_{T^n Q} F \, d\mu &= \sum_{j_1, \dots, j_n \in \mathbb{Z}_N} \int_{Q_{j_1, \dots, j_n}} F \, d\mu \\ &= \sum_{j_1, \dots, j_n \in \mathbb{Z}_N} \int_{I_{j_1, \dots, j_n}} \int_0^1 F_{\alpha_{j_1, \dots, j_n}}(y_1) \, dy_1 \, dy_2 \\ &= \sum_{j_1, \dots, j_n \in \mathbb{Z}_N} h\, p_{\beta^{(j_1)}} p_{\beta^{(j_2)}} \cdots p_{\beta^{(j_n)}}\, f_{\alpha_{j_1, \dots, j_n}}, \end{aligned} \tag{5.6} \]
having used (5.3), (5.5) and the following definition:

\[ f_\alpha := \int_0^1 F_\alpha(y_1) \, dy_1. \tag{5.7} \]

In view of (5.4) and (4.1), another way to write (5.6) is

\[ \int_{T^n Q} F \, d\mu = h \sum_{\beta^{(1)}, \dots, \beta^{(n)} \in \mathbb{Z}^d} p_{\beta^{(1)}} p_{\beta^{(2)}} \cdots p_{\beta^{(n)}}\, f_{\beta^{(1)} + \dots + \beta^{(n)}}. \tag{5.8} \]

5.2. Fourier analysis. The technical backbone of the proof is Fourier analysis on Z^d, for which we proceed to establish the necessary notation [K,R]. Let a := {a_α}_{α∈Z^d} ∈ ℓ^s(Z^d; C), with s ∈ [1, ∞]. Its Fourier transform is denoted

\[ \hat{a}(\theta) = \hat{a}(\theta_1, \dots, \theta_d) := \sum_{\alpha \in \mathbb{Z}^d} a_\alpha e^{\imath \alpha \cdot \theta} = \sum_{\alpha \in \mathbb{Z}^d} a_\alpha e^{\imath (\alpha_1 \theta_1 + \cdots + \alpha_d \theta_d)}, \tag{5.9} \]

where ı is the imaginary unit and θ = (θ_1, …, θ_d) ∈ T^d := (R/2πZ)^d. If s > 2, (5.9) must be intended in the weak sense, cf. (5.11). The corresponding inverse transform is given by

\[ a_\alpha = \int_{\mathbb{T}^d} \hat{a}(\theta)\, e^{-\imath \alpha \cdot \theta} \, đ\theta, \tag{5.10} \]

where đθ := (2π)^{−d} dθ. The Parseval formula, in this setting, reads as follows: If b = {b_α} ∈ ℓ^{s′}(Z^d; C), with 1/s + 1/s′ = 1, then

\[ \langle a, b \rangle := \sum_{\alpha \in \mathbb{Z}^d} a_\alpha \overline{b_\alpha} = \int_{\mathbb{T}^d} \hat{a}(\theta)\, \overline{\hat{b}(\theta)} \, đ\theta, \tag{5.11} \]

the bar denoting complex conjugation. Another standard result that we need is the duality between convolution and product: If

\[ (a * b)_\alpha := \sum_{\beta \in \mathbb{Z}^d} a_\beta b_{\alpha - \beta} = \sum_{\beta \in \mathbb{Z}^d} a_{\alpha - \beta} b_\beta \tag{5.12} \]

is well-defined in the proper or weak sense, then

\[ \widehat{(a * b)}(\theta) = \hat{a}(\theta)\, \hat{b}(\theta). \tag{5.13} \]

Applying these concepts to our case, we see that f := {f_α} ∈ ℓ^∞ by construction (because F ∈ G ⊂ L^∞(M)), so $\hat{f}$ is a distribution on T^d. On the other hand, p := {p_α} ∈ ℓ^1, which makes $\hat{p}$ continuous (it is actually much more than that, cf. Sect. 5.3). In particular, since p is a probability distribution on Z^d, $\hat{p}(0) = 1$. Defining

\[ p^{(n)} := \underbrace{p * \cdots * p}_{n \text{ times}} \tag{5.14} \]
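we can rearrange (5.8) into

\[ \frac{1}{h} \int_{T^n Q} F \, d\mu = \langle f, p^{(n)} \rangle = \int_{\mathbb{T}^d} \hat{f}(\theta)\, \overline{\widehat{p^{(n)}}(\theta)} \, đ\theta = \int_{\mathbb{T}^d} \hat{f}(\theta)\, \overline{\hat{p}^{\,n}(\theta)} \, đ\theta, \tag{5.15} \]

where we have used (5.13), (5.14) and the fact that f_α ∈ R. Hence Stage 1 of the proof, cf. (5.2), reduces to showing that

\[ \lim_{n \to \infty} \langle f, p^{(n)} \rangle = \overline{\mu}(F). \tag{5.16} \]

Now recall definition (4.5). For r ∈ N, let

\[ q^{(r)}_\alpha := \begin{cases} (2r + 1)^{-d}, & \text{if } \alpha \in B_{0,r}; \\ 0, & \text{otherwise,} \end{cases} \tag{5.17} \]

define a function q^{(r)} : Z^d → R. Its Fourier transform is easily computed to be a product of normalized Dirichlet kernels:

\[ \widehat{q^{(r)}}(\theta_1, \dots, \theta_d) = \prod_{i=1}^{d} \frac{\sin((r + 1/2)\theta_i)}{(2r + 1) \sin(\theta_i / 2)}. \tag{5.18} \]

In view of (4.6), let us denote V_{α,r} := B_{α,r} × [0, 1)^2. Since F verifies (A3) and the infinite-volume limit is μ-uniform (cf. Definition 2.3), we have that

\[ \lim_{r \to \infty} \frac{1}{\mu(V_{\alpha,r})} \int_{V_{\alpha,r}} F \, d\mu = \lim_{r \to \infty} \frac{1}{(2r + 1)^d} \sum_{\beta \in B_{\alpha,r}} f_\beta = \lim_{r \to \infty} \bigl( f * q^{(r)} \bigr)_\alpha = \overline{\mu}(F), \tag{5.19} \]

uniformly in α. (We have used notation (5.7).) This, in turn, yields

\[ \lim_{r \to \infty} \langle f, q^{(r)} * p^{(n)} \rangle = \lim_{r \to \infty} \langle f * q^{(r)}, p^{(n)} \rangle = \overline{\mu}(F) \tag{5.20} \]

uniformly in n, because p^{(n)} is a probability distribution on Z^d. (In the first equality we have used the fact that q^{(r)}_α = q^{(r)}_{−α}, by construction.) This fact implies that, for any sequence {r_n} ⊂ N with r_n → ∞,

\[ \lim_{n \to \infty} \langle f, q^{(r_n)} * p^{(n)} \rangle = \overline{\mu}(F). \tag{5.21} \]

Therefore, comparing (5.21) with (5.16), we see that Stage 1 is achieved once we have shown that there exists a diverging sequence {r_n} of natural numbers such that

\[ \lim_{n \to \infty} \langle f, p^{(n)} - q^{(r_n)} * p^{(n)} \rangle = 0. \tag{5.22} \]

For an a fortiori choice of {r_n}, let us set

\[ g^{(n)}_\alpha := p^{(n)}_\alpha - \bigl( q^{(r_n)} * p^{(n)} \bigr)_\alpha, \tag{5.23} \]

which gives

\[ \widehat{g^{(n)}}(\theta) = \bigl( 1 - \widehat{q^{(r_n)}}(\theta) \bigr)\, \hat{p}^{\,n}(\theta). \tag{5.24} \]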
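The identities used above lend themselves to a direct numerical check in dimension d = 1. The sketch below is an illustration only: it represents finitely supported sequences as dictionaries, takes the three-point walk of Sect. 2.2 for p, uses the standard normalized Dirichlet-kernel closed form sin((r + 1/2)θ) / ((2r + 1) sin(θ/2)) for the transform of q^{(r)}, and compares both sides of (5.24) at one arbitrary value of θ. All helper names are ours.

```python
import cmath
import math

def fourier(a, theta):
    """Fourier transform (5.9) of a finitely supported sequence a: Z -> R, given as a dict."""
    return sum(v * cmath.exp(1j * k * theta) for k, v in a.items())

def convolve(a, b):
    """Convolution (5.12) of two finitely supported sequences."""
    out = {}
    for i, u in a.items():
        for j, v in b.items():
            out[i + j] = out.get(i + j, 0.0) + u * v
    return out

def box(r):
    """The sequence q^(r) of (5.17) for d = 1."""
    return {k: 1.0 / (2 * r + 1) for k in range(-r, r + 1)}

def box_hat(r, theta):
    """Closed form of the transform of q^(r) for d = 1 (normalized Dirichlet kernel)."""
    return math.sin((r + 0.5) * theta) / ((2 * r + 1) * math.sin(theta / 2))

p = {-1: 1 / 3, 0: 1 / 3, 1: 1 / 3}      # same-site and nearest-neighbor jumps
n, r, theta = 5, 4, 0.7
pn = p
for _ in range(n - 1):
    pn = convolve(pn, p)                  # p^(n), cf. (5.14)
qp = convolve(box(r), pn)
g = {k: pn.get(k, 0.0) - qp[k] for k in qp}   # g^(n), cf. (5.23)

print(abs(fourier(box(r), theta) - box_hat(r, theta)))
print(abs(fourier(g, theta) - (1 - box_hat(r, theta)) * fourier(p, theta) ** n))
```

Both printed differences should be at the level of floating-point rounding, which is what the convolution-product duality (5.13) predicts.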
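A convenient estimate in view of (5.22) is

\[ \bigl| \langle f, g^{(n)} \rangle \bigr| \le \| f \|_{\ell^\infty} \| g^{(n)} \|_{\ell^1} = \| f \|_{\ell^\infty} \bigl\| \widehat{g^{(n)}} \bigr\|_{A} \le C\, \bigl\| \widehat{g^{(n)}} \bigr\|_{H^\nu}, \tag{5.25} \]

where the norms ‖·‖_A and ‖·‖_{H^ν} are introduced in Sect. A.3 of the Appendix (cf. in particular Lemma A.2 and notice that $\|\cdot\|_{H^{\bar\nu}} \le \|\cdot\|_{H^{\nu}}$). Therefore (5.22) will be proved once we establish that

\[ \bigl\| \widehat{g^{(n)}} \bigr\|_{H^\nu} \le \bigl\| \widehat{g^{(n)}} \bigr\|_{L^1} + \sum_{i=1}^{d} \bigl\| \partial_i^\nu \widehat{g^{(n)}} \bigr\|_{L^2} \to 0, \quad \text{as } n \to \infty, \tag{5.26} \]

for a suitable choice of {r_n} in definition (5.23). Here ∂_i := ∂/∂θ_i, acting on functions T^d → C.

5.3. Properties of $\widehat{q^{(r)}}$ and $\hat{p}$. In view of the above goal we need to study some properties of the functions $\widehat{q^{(r)}}$ and $\hat{p}$.

Remark 5.1. None of the proofs in this section will use (E3).

As a start, let us notice that $\widehat{q^{(r)}}$ is C^∞ by construction and $\hat{p}$ is C^ν, with ν ≥ 2, by hypothesis (i) of Theorem 4.6.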
Lemma 5.2. Fix r ∈ Z+. On T^d, $\widehat{q^{(r)}}(0) = \hat{p}(0) = 1$ and, for θ ≠ 0,

\[ \bigl| \widehat{q^{(r)}}(\theta) \bigr| < 1, \qquad |\hat{p}(\theta)| < 1. \]

Proof. We first prove the assertions on $\hat{p}$. Since p is a probability distribution, $\hat{p}(0) = 1$ and $|\hat{p}(\theta)| \le 1$, ∀θ. Suppose by contradiction that ∃θ ∈ T^d, θ ≠ 0, such that $|\hat{p}(\theta)| = 1$, that is,

\[ \hat{p}(\theta) = \sum_{j \in \mathbb{Z}_N} p_{\beta^{(j)}}\, e^{\imath \theta \cdot \beta^{(j)}} = e^{\imath a}, \tag{5.27} \]

for some a ∈ R. Since p_{β^{(j)}} > 0 and Σ_j p_{β^{(j)}} = 1, necessarily $e^{\imath \theta \cdot \beta^{(j)}} = e^{\imath a}$, ∀j ∈ Z_N, whence, ∀j, j′,

\[ e^{\imath \theta \cdot (\beta^{(j)} - \beta^{(j')})} = 1. \tag{5.28} \]

Let us define the character (i.e., the homomorphism Z^d → S^1 ⊂ C) $\eta_\theta(\alpha) := e^{\imath \theta \cdot \alpha}$. It is easy to see that η_θ is not the trivial character (which instead corresponds to θ = 0; this is a particular case of the so-called Pontryagin Duality [R, Thm. 2.1.2]). On the other hand, (5.28) reads η_θ(β^{(j)} − β^{(j′)}) = 1 and hypothesis (ii) of Theorem 4.6 implies that η_θ ≡ 1, thereby creating a contradiction.

As for the assertions on $\widehat{q^{(r)}}$, there is nothing more to prove, because q^{(r)} satisfies the same properties as p, insofar as the above argument is concerned.
Notational convention. From now on, C will denote a generic universal constant. This means that its actual value will vary from formula to formula but will never depend on n, r , or θ .
Lemma 5.3. If we think of $\tilde p$ as a periodic function on $\mathbb{R}^d$ (as opposed to a function on $\mathbb{T}^d$), there exists a neighborhood $U$ of $\theta = 0$ and a positive constant $C$ such that, for $\theta \in U$,
$$|\tilde p(\theta)| \le 1 - C|\theta|^2 = 1 - C\left(\theta_1^2 + \cdots + \theta_d^2\right).$$

Proof. As $\theta \to 0$,
$$\tilde p(\theta) = 1 + \imath v\cdot\theta + O(|\theta|^2), \tag{5.29}$$
where $v \in \mathbb{R}^d$ is the drift of the random walk, defined as
$$v := \sum_{\beta\in\mathbb{Z}^d} \beta\, p_\beta. \tag{5.30}$$
The Lagrange remainder in (5.29) holds because $\tilde p$ is at least $C^2$. Hence $|\tilde p(\theta)|^2 = 1 + O(|\theta|^2)$, which implies the assertion.

Lemma 5.4. Regarding $\tilde q^{(r)}$ as a periodic function on $\mathbb{R}^d$, one has that the following expansion:
$$\tilde q^{(r)}(\theta_1, \ldots, \theta_d) = \prod_{i=1}^{d}\left(1 + \sum_{j=1}^{\infty} \xi_j(r)\,\theta_i^{2j}\right),$$
holds uniformly on the compact subsets of $\mathbb{R}^d$. Furthermore, as $r \to \infty$,
$$|\xi_j(r)| \le C\,\frac{r^{2j}}{(2j)!}.$$

Proof. By the factorizability of $q^{(r)}$, it is sufficient to treat the case $d = 1$. Since $q^{(r)}$ is compactly supported in $\mathbb{Z}^d$, $\tilde q^{(r)}(\theta)$ is an entire function of $\theta$, which we have already calculated in (5.17). Its Taylor expansion at the origin is even and its (even) coefficients are
$$\xi_j(r) = \frac{1}{(2j)!}\,\partial^{2j}\tilde q^{(r)}(0) = \frac{(-1)^j}{(2j)!}\,\frac{1}{2r+1}\sum_{\alpha=-r}^{r}\alpha^{2j}. \tag{5.31}$$
This gives $\xi_0(r) \equiv 1$ and the desired estimates.
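The coefficients in (5.31) can be checked numerically. The following is a minimal sketch for $d = 1$, assuming (as (5.31) and (5.39) suggest, since (5.17) is not reproduced here) that $q^{(r)}$ is the uniform distribution on $\{-r, \ldots, r\}$. The function names `xi`, `q_tilde` and `q_series` are illustrative and not part of the paper.

```python
from fractions import Fraction
from math import cos, factorial


def xi(j: int, r: int) -> Fraction:
    """Coefficient xi_j(r) of theta^(2j), following (5.31), as an exact rational."""
    # Python evaluates 0 ** 0 as 1, so j = 0 gives power_sum = 2r + 1 and xi_0 = 1.
    power_sum = sum(alpha ** (2 * j) for alpha in range(-r, r + 1))
    return Fraction((-1) ** j * power_sum, factorial(2 * j) * (2 * r + 1))


def q_tilde(theta: float, r: int) -> float:
    """Fourier transform of the uniform law on {-r, ..., r} (assumed form of q^(r))."""
    return sum(cos(alpha * theta) for alpha in range(-r, r + 1)) / (2 * r + 1)


def q_series(theta: float, r: int, terms: int = 30) -> float:
    """Truncated expansion sum_{j < terms} xi_j(r) * theta^(2j) from Lemma 5.4."""
    return sum(float(xi(j, r)) * theta ** (2 * j) for j in range(terms))
```

In this sketch the bound $|\xi_j(r)| \le r^{2j}/(2j)!$ holds with $C = 1$, because each of the $2r+1$ summands in (5.31) is at most $r^{2j}$.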
We estimate the norms in the r.h.s. of (5.26) by splitting the corresponding integrals into two parts: one over $B_n$, which is the ball of center 0 and radius $n^{-(1-\varepsilon)/2}$ in $\mathbb{T}^d$, and the other over $\mathbb{T}^d \setminus B_n$. Here $\varepsilon > 0$ is a small constant to be fixed later and $n$ is a large integer.

Lemma 5.5. There exists a $\kappa > 0$ such that, for $n$ sufficiently large,
$$\max_{\mathbb{T}^d\setminus B_n} |\tilde p|^n \le e^{-\kappa n^{\varepsilon}}.$$
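Proof. By elementary Taylor approximations, Lemma 5.3 implies that there exists a constant $\kappa > 0$ such that, for $\theta \in U$,
$$|\tilde p(\theta)| \le e^{-\kappa|\theta|^2}. \tag{5.32}$$
For $n$ large enough, by Lemma 5.2, the continuity of $\tilde p$ and the compactness of $\mathbb{T}^d$,
$$\max_{\theta\in\mathbb{T}^d\setminus B_n} |\tilde p(\theta)| = \max_{\theta\in\partial B_n} |\tilde p(\theta)| \le e^{-\kappa n^{-1+\varepsilon}} \tag{5.33}$$
(in the last inequality we have used (5.32), which applies because, for $n$ large, $B_n \subset U$).

5.4. End of Stage 1. Let us begin to attack (5.26) by estimating $\|\tilde g^{(n)}\|_{L^1}$. First of all,
$$\|\tilde g^{(n)}\|_{L^1(\mathbb{T}^d\setminus B_n)} \le (2\pi)^d\, 2\, e^{-\kappa n^{\varepsilon}}, \tag{5.34}$$
by Lemma 5.2 applied to $\tilde q^{(r)}$ and Lemma 5.5 (see (5.24)). Again, applying Lemma 5.2 to both $\tilde q^{(r)}$ and $\tilde p$,
$$\|\tilde g^{(n)}\|_{L^1(B_n)} \le \frac{C}{n^{d(1-\varepsilon)/2}}, \tag{5.35}$$
where $C$ does not depend on $r$, that is, $r_n$. As for the remaining terms in the r.h.s. of (5.26), clearly
$$\partial_i^{\nu}\tilde g^{(n)} = \sum_{k+l=\nu}\binom{\nu}{k}\,\partial_i^{k}\left(1 - \tilde q^{(r)}\right)\partial_i^{l}\tilde p^{\,n}. \tag{5.36}$$
Let us estimate (5.36) on $\mathbb{T}^d \setminus B_n$. Fixing $l \ge 1$, one verifies by repeated differentiation that
$$\partial_i^{l}\tilde p^{\,n} = \sum_{w=1}^{l}\ \sum_{j_1+\cdots+j_w=l} C^{l}_{j_1,\ldots,j_w}\,\frac{n!}{(n-w)!}\,\tilde p^{\,n-w}\,(\partial_i^{j_1}\tilde p)(\partial_i^{j_2}\tilde p)\cdots(\partial_i^{j_w}\tilde p), \tag{5.37}$$
where the combination of the two sums above represents the sum over all the partitions $\{j_1, j_2, \ldots, j_w\}$ of $l$ (i.e., $j_u \ge 1$ and $j_1 + j_2 + \cdots + j_w = l$), with any cardinality $w$, and $C^{l}_{j_1,\ldots,j_w} \in \mathbb{N}$ is a combinatorial coefficient independent of $n$. Since $\tilde p \in C^\nu$, all the derivatives that appear on the r.h.s. of (5.37) are continuous functions of $\theta$. Therefore, by Lemma 5.5,
$$\max_{\mathbb{T}^d\setminus B_n}\left|\partial_i^{l}\tilde p^{\,n}\right| \le C n^{l} e^{-\kappa(n-l)^{\varepsilon}}. \tag{5.38}$$
As for the other factors in the r.h.s. of (5.36), for $k \ge 1$ we use definition (5.17) to estimate
$$\max_{\mathbb{T}^d}\left|\partial_i^{k}\tilde q^{(r)}\right| \le \sum_{\alpha\in\mathbb{Z}^d}|\alpha_i|^{k}\, q^{(r)}_{\alpha} = \frac{1}{2r+1}\sum_{\alpha_i=-r}^{r}|\alpha_i|^{k} \le C r^{k}. \tag{5.39}$$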
(For $k = 0$, we already know that $1 - \tilde q^{(r)}$ is bounded.) Using (5.38)–(5.39) in (5.36), we get
$$\left\|\partial_i^{\nu}\tilde g^{(n)}\right\|^2_{L^2(\mathbb{T}^d\setminus B_n)} \le C r^{2\nu} n^{2\nu} e^{-2\kappa(n-\nu)^{\varepsilon}}, \tag{5.40}$$
which tends to zero, as $n \to \infty$, provided that $r = r_n$ grows no faster than a power of $n$. This will be verified a fortiori, see (5.41).

The estimation of the last term, namely $\|\partial_i^{\nu}\tilde g^{(n)}\|_{L^2(B_n)}$, is the most delicate, therefore we organize most of the computations involved in the following

Lemma 5.6. For $\nu \in \mathbb{Z}^+$ (not necessarily as in the statement of Theorem 4.6), assume $\tilde p \in C^\nu$. Then take any sequence of positive numbers $\epsilon_n$, with $\epsilon_n \to 0$. For $n$ large enough and uniformly for $1 \le r \le \epsilon_n n^{(1-\varepsilon)/2}$, one has
$$\max_{B_n}\left|\partial_i^{\nu}\tilde g^{(n)}\right| \le C r^{2\nu} n^{(-2+2\varepsilon+\nu(1+\varepsilon))/2}.$$

Once Lemma 5.6 is established, which will happen momentarily, we can finally choose both the sequence $\{r_n\}$ and the parameter $\varepsilon$: the former will be any diverging sequence such that
$$r_n \le C n^{\varepsilon}, \tag{5.41}$$
whereas
$$\varepsilon < \min\left\{\frac{1}{3},\ \frac{1}{2(5\nu + 2 + d/2)}\right\}. \tag{5.42}$$
The condition $\varepsilon < 1/3$ is necessary in order to apply Lemma 5.6 to $r = r_n$. In fact, via (5.41), $r_n n^{-(1-\varepsilon)/2} \le C n^{-(1-3\varepsilon)/2} =: \epsilon_n$. The other inequality is needed when we use the assertion of Lemma 5.6 to estimate
$$\left\|\partial_i^{\nu}\tilde g^{(n)}\right\|^2_{L^2(B_n)} \le C r_n^{4\nu}\, n^{-2+2\varepsilon+\nu(1+\varepsilon)}\, n^{-d(1-\varepsilon)/2} \le C n^{-2+\nu-d/2+(5\nu+2+d/2)\varepsilon}. \tag{5.43}$$
For $\varepsilon$ so small that $(5\nu + 2 + d/2)\varepsilon < 1/2$ the above vanishes, as $n \to \infty$, because $\nu - d/2 \le 3/2$ (for $d = 1$, it equals $3/2$; for $d \ge 2$ it equals 1 or $1/2$, depending on $d$ being even or odd).

In conclusion, estimates (5.34), (5.35), (5.40) and (5.43) prove (5.26), thus (5.22), thus (5.16) and, lastly, (5.2), uniformly in $Q$ as requested. This ends Stage 1 of the proof.

Proof of Lemma 5.6. Throughout the proof it is understood that $\theta \in B_n$, i.e., $|\theta| \le n^{-(1-\varepsilon)/2}$. The condition on $r$ ensures that
$$\left|\xi_j(r)\,\theta^{2j}\right| \le C r^{2j} n^{-(1-\varepsilon)j} \le C \epsilon_n^{2j}, \tag{5.44}$$
which implies that, for $n$ so large that $\epsilon_n < 1$, the expansion of Lemma 5.4 and all those derived from it by differentiation w.r.t. $\theta_i$ are meaningful. More importantly, as $n \to \infty$
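and uniformly in $r$ as described in the statement of the lemma, each such expansion can be approximated by a convenient upper bound on its first term. We indicate this with the symbol $\sim$: for example,
$$\left|1 - \tilde q^{(r)}(\theta)\right| \sim |\xi_1(r)|\,|\theta|^2 \le C r^2 n^{-(1-\varepsilon)} \tag{5.45}$$
and, for $k \ge 1$,
$$\left|\partial_i^{k}\tilde q^{(r)}(\theta)\right| \sim \begin{cases} (k+1)!\,\left|\xi_{(k+1)/2}(r)\,\theta_i\right| \le C r^{k+1} n^{-(1-\varepsilon)/2}, & \text{if } k \text{ is odd},\\ k!\,\left|\xi_{k/2}(r)\right| \le C r^{k}, & \text{if } k \text{ is even}. \end{cases} \tag{5.46}$$
Furthermore, (E3) is equivalent to $\nabla\tilde p(0) = 0$, which implies
$$|\partial_i\tilde p(\theta)| \le C|\theta| \le C n^{-(1-\varepsilon)/2}. \tag{5.47}$$
We will proceed by induction on $\nu \ge 1$. When $\nu = 1$ our function reads
$$\partial_i\tilde g^{(n)} = -\partial_i\tilde q^{(r)}\,\tilde p^{\,n} + \left(1 - \tilde q^{(r)}\right) n\,\tilde p^{\,n-1}\,\partial_i\tilde p. \tag{5.48}$$
Hence, from (5.45)–(5.47), and Lemma 5.2 applied to $\tilde p$,
$$\max_{B_n}\left|\partial_i\tilde g^{(n)}\right| \le C r^2 n^{(-1+3\varepsilon)/2}, \tag{5.49}$$
proving the assertion for $\nu = 1$.

Now we assume the assertion with $\nu$ and set out to prove the one with $\nu + 1$. In practice, this means that increasing the order of the derivative by one must worsen the inequality of Lemma 5.6 at most by a factor $r^2 n^{(1+\varepsilon)/2}$. We apply $\partial_i$ to (5.36). On the r.h.s., $\partial_i$ can either hit $\partial_i^{k}(1 - \tilde q^{(r)})$ or $\partial_i^{l}\tilde p^{\,n}$. Let us analyze the two cases separately.

In the first case, assuming for the moment $k \ge 1$, we see via (5.46) that
$$\left|\partial_i^{k+1}\tilde q^{(r)}(\theta)\right| \le \begin{cases} C r^{k+1} = C\left(r^{k+1} n^{-(1-\varepsilon)/2}\right) n^{(1-\varepsilon)/2}, & \text{if } k \text{ is odd},\\ C r^{k+2} n^{-(1-\varepsilon)/2} = C\left(r^{k}\right) r^2 n^{-(1-\varepsilon)/2}, & \text{if } k \text{ is even}. \end{cases} \tag{5.50}$$
The terms within parentheses in the above represent the estimates for $\partial_i^{k}\tilde q^{(r)}$, respectively for $k$ odd and even, coming from (5.46). Also, for $k = 0$,
$$\left|\partial_i\tilde q^{(r)}\right| \le C r^2 n^{-(1-\varepsilon)/2} = C\left(r^2 n^{-(1-\varepsilon)}\right) n^{(1-\varepsilon)/2}. \tag{5.51}$$
Again, the term in the parentheses is the estimate for $1 - \tilde q^{(r)}$ coming from (5.45). In any event, applying another derivative to the term $\partial_i^{k}(1 - \tilde q^{(r)})$ will change our estimate at most by a factor $r^2 n^{(1-\varepsilon)/2}$, which is consistent with our inductive step.

In the second case, we use the expansion (5.37): $\partial_i$ can either hit $\tilde p^{\,n-w}$ or one of the $\partial_i^{j_u}\tilde p$. In this first sub-case,
$$\left|\frac{\partial_i\tilde p^{\,n-w}(\theta)}{\tilde p^{\,n-w-1}(\theta)}\right| = (n-w)\left|\partial_i\tilde p(\theta)\right| \le C n|\theta| \le C n^{(1+\varepsilon)/2}, \tag{5.52}$$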
via (5.47). As for the second sub-case, without loss of generality, we assume the worst-case estimate for $\partial_i^{j_u}\tilde p$ on $B_n$, that is,
$$\left|\partial_i^{j_u}\tilde p(\theta)\right| \le \begin{cases} C|\theta| \le C n^{-(1-\varepsilon)/2}, & \text{if } j_u = 1,\\ C, & \text{if } j_u \ge 2. \end{cases} \tag{5.53}$$
This implies that increasing by one the order of the derivative in the l.h.s. of (5.53) will worsen our most conservative estimate at most by a factor $n^{(1-\varepsilon)/2}$. Considering (5.52) as well, we conclude that applying another derivative to $\partial_i^{l}\tilde p^{\,n}$ will change its estimate at most by a factor $n^{(1+\varepsilon)/2}$, which is again consistent with our inductive step.

Remark 5.7. The careful reader might worry that the unrigorous use of the symbol $C$ for a generic constant may jeopardize the above proof. It does not, since all the constants that have been used do not depend on $r$ or $n$. In principle, they may depend on $k$ (though it is easy to see that they do not), or $i$, or $j_u$, but these integers only take on a finite number of values, so bounds can be found that do not depend on any of the variables.

5.5. Stage 2: Removing (E3). If, contrary to assumption (E3), $v = -\imath\nabla\tilde p(0) \ne 0$, cf. (5.30), we define $\delta^{(n)} \in \mathbb{Z}^d$ to be the (not necessarily unique) lattice point for which $\delta^{(n)}/n$ best approximates $v \in \mathbb{R}^d$. One clearly has
$$\left|\frac{\delta_i^{(n)}}{n} - v_i\right| \le \frac{1}{2n}, \tag{5.54}$$
where the subscript $i$ denotes, as usual, the $i$th component of a $d$-dimensional vector. Now, for $\theta \in (-\pi, \pi)^d$, set
$$\pi_n(\theta) := \tilde p(\theta)\, e^{-\imath(\delta^{(n)}\cdot\theta)/n}. \tag{5.55}$$
We want to interpret $\pi_n$ as a generally discontinuous function $\mathbb{T}^d \to \mathbb{C}$. On the other hand, $\pi_n^{\,n}$ is a smooth function on $\mathbb{T}^d$ and, by (5.24) and Lemma A.1 of the Appendix,
$$\left\|\tilde g^{(n)}\right\|_A = \left\|\left(1 - \tilde q^{(r_n)}\right)\tilde p^{\,n}\right\|_A = \left\|\left(1 - \tilde q^{(r_n)}\right)\tilde p^{\,n}\,\omega_{-\delta^{(n)}}\right\|_A = \left\|\left(1 - \tilde q^{(r_n)}\right)\pi_n^{\,n}\right\|_A \tag{5.56}$$
(having used notation (A.9) as well). Comparing the above with (5.25), in view of our goal (5.26), we see that it is sufficient to repeat all the estimations of Sects. 5.3–5.4, replacing $\tilde p$ with $\pi_n$. This is no problem, except for estimate (5.47), which is also reflected in (5.52) and (5.53). (Consider Remark 5.1 and the fact that Lemmas 5.2 and 5.5 cannot distinguish between $\tilde p$ and $\pi_n$.) In order to find an effective substitute for (5.47), we write, for $\theta \in B_n$,
$$\partial_i\pi_n(\theta) = \partial_i\pi_n(0) + u_n(\theta)\cdot\theta, \tag{5.57}$$
where $u_n(\theta)$ is the $i$th row of the Hessian of $\pi_n$ evaluated at some $\theta' \in B_n$. This has a finite limit, for $n \to \infty$, as one can easily verify by direct computation on (5.55) (it is in fact, up to a minus sign, the $i$th row of the covariance matrix of $p$). Therefore
$$|u_n(\theta)\cdot\theta| \le C|\theta| \le C n^{-(1-\varepsilon)/2}. \tag{5.58}$$
On the other hand, by (5.54),
$$|\partial_i\pi_n(0)| = \left|\partial_i\tilde p(0) - \imath\frac{\delta_i^{(n)}}{n}\right| = \left|v_i - \frac{\delta_i^{(n)}}{n}\right| \le C n^{-1}. \tag{5.59}$$
Thus, using (5.58)–(5.59) in (5.57),
$$|\partial_i\pi_n(\theta)| \le C n^{-(1-\varepsilon)/2}, \tag{5.60}$$
which is the same bound as (5.47). This proves that the $H^\nu$-norm of $\tilde g^{(n)}$ can be estimated as in Sect. 5.4 even when $\tilde p$ is replaced by $\pi_n$. That is, (5.2) holds even when (E3) does not, which completes Stage 2.

5.6. Stage 3: Removing (E1) and (E2). It is easy to realize that the convergence rate in (5.2) is not only independent of the choice of $Q \subseteq S_0$, it is also independent of the fact that $Q$ is contained in $S_0$, as long as it remains an element of the countable partition associated to $\mathcal{B}_{-m,0}$. In fact, if we take $Q \subseteq S_\alpha$, with $\alpha \ne 0$, we can shift, via the natural action of $\mathbb{Z}^d$ onto $M$, both $Q$ and $F$ by the quantity $-\alpha$. Equation (5.2) continues to hold with the same convergence rate because all the properties of $F$ that were used in Stages 1 and 2 are translation invariant. In formula, there exists a positive vanishing sequence $\{\vartheta_n^{(m)}\}_{n\in\mathbb{N}}$ such that, if $g = 1_Q$ and $Q$ is a fundamental set of $\mathcal{B}_{-m,0}$,
$$\left|\mu((F\circ T^n)g) - \bar\mu(F)\mu(g)\right| \le \|g\|_{L^1}\,\vartheta_n^{(m)}. \tag{5.61}$$
Since (5.61) depends continuously on $g \in L^1$, it is immediate to extend it to $g = \sum_{j\in\mathbb{N}} a_j 1_{Q_j}$, with $a_j > 0$, that is, to a generic positive function in $L^1(M, \mathcal{B}_{-m,0}, \mu)$. If $g$ is such that both the positive part $g^+$ and the negative part $g^-$ are nonzero, we apply (5.61) twice to $g^+$ and $g^-$. An easy estimate proves that the formula holds in this case as well.

Therefore (M5) holds w.r.t. $\tilde{\mathcal{G}} := \{F \in L^\infty(M, \mathcal{A}_s, \mu) \mid \exists\,\bar\mu(F)\}$ and $\tilde{\mathcal{L}}_m := L^1(M, \mathcal{B}_{-m,0}, \mu)$, for all $m \in \mathbb{N}$. Now, if $F \in \mathcal{G}_m$ and $g \in \mathcal{L}_m$, the invariance of $\mu$ and Lemma 2.4 give that
$$\mu((F\circ T^n)g) - \bar\mu(F)\mu(g) = \mu((F\circ T^{n-m})(g\circ T^{-m})) - \bar\mu(F\circ T^{m})\,\mu(g\circ T^{-m}). \tag{5.62}$$
Since $g\circ T^{-m} \in \tilde{\mathcal{L}}_{2m}$ and $F\circ T^{m} \in \tilde{\mathcal{G}}$ (because $\mathcal{B}_{0,2m} \subset \mathcal{A}_s$), we apply the previous result and see that (a) holds with a convergence rate $\vartheta_{n-2m}^{(2m)}$.
5.7. Stage 4: Proof of the remaining assertions. Statement (a) immediately implies (M4) relative to $\mathcal{G}_m$ and $\mathcal{L}_m$ (Proposition 3.1). One readily extends it to $\mathcal{G}$ and $\mathcal{L}$, thus proving (b), by means of the following obvious lemma:

Lemma 5.8. If $\mathcal{G}'$ is a dense subset of $\mathcal{G}$ in the $L^\infty$-norm and $\mathcal{L}'$ is a dense subset of $\mathcal{L}$ in the $L^1$-norm, then (M4) for $\mathcal{G}'$ and $\mathcal{L}'$ implies (M4) for $\mathcal{G}$ and $\mathcal{L}$.

As concerns (c), it is easy to verify that Proposition 3.2 applies to the classes of global observables $\mathcal{G}_m$ and local observables $\mathcal{L}_m$ (using the family of local observables $g_\alpha := G\,1_{S_\alpha}$). Therefore (a) implies (M2) relative to $\mathcal{G}_m$. We extend it to $\mathcal{G}$ by means of another obvious result.

Lemma 5.9. If $\mathcal{G}'$ is a dense subset of $\mathcal{G}$ in the $L^\infty$-norm, then (M2) for $\mathcal{G}'$ implies (M2) for $\mathcal{G}$.

Finally, let us consider (d). By the second part of Proposition 3.1, it suffices to show that, if $F, G \in \mathcal{G}_m$ and $F$ is $\mathbb{Z}^d$-periodic, then $\bar\mu((F\circ T^n)G)$ exists for $n$ large enough. By the same arguments as in the proof of Lemma 2.4, when $V \nearrow M$,
$$\mu_V((F\circ T^n)G) = \mu_V((F\circ T^{n-m})(G\circ T^{-m})) + o(1). \tag{5.63}$$
So we can reduce to proving the existence of the infinite-volume limit of the above r.h.s., for all $n \ge 2m$. Since $G\circ T^{-m}$ is measurable w.r.t. $\mathcal{B}_{-2m,0} \subset \mathcal{A}_u$, with a slight abuse of notation we can define, for $\alpha \in \mathbb{Z}^d$,
$$b_\alpha := \int_{S_\alpha} G\circ T^{-m}\,d\mu = \int_0^1 G\circ T^{-m}(\alpha; y_2)\,dy_2. \tag{5.64}$$
An analogous definition can be made for $F\circ T^{n-m}$, which is measurable w.r.t. $\mathcal{B}_{0,2m} \subset \mathcal{A}_s$. In this case, notice that $F\circ T^{n-m}$ is also $\mathbb{Z}^d$-periodic, so we can write
$$a := \int_{S_\alpha} F\circ T^{n-m}\,d\mu = \int_0^1 F\circ T^{n-m}(y_1)\,dy_1. \tag{5.65}$$
Clearly, then,
$$\int_{S_\alpha} (F\circ T^{n-m})(G\circ T^{-m})\,d\mu = a\,b_\alpha \tag{5.66}$$
and, for $V = \bigcup_{\alpha\in B_{\gamma,r}} S_\alpha$,
$$\mu_V((F\circ T^{n-m})(G\circ T^{-m})) = \frac{a}{(2r+1)^d}\sum_{\alpha\in B_{\gamma,r}} b_\alpha. \tag{5.67}$$
Since $\bar\mu(G)$ exists, by Lemma 2.4,
$$\bar\mu(G\circ T^{-m}) = \lim_{r\to\infty}\frac{1}{(2r+1)^d}\sum_{\alpha\in B_{\gamma,r}} b_\alpha \tag{5.68}$$
exists and the limit is uniform in $\gamma$. Also, it is obvious that $\bar\mu(F\circ T^{n-m}) = a$. Hence, as $V \nearrow M$, the r.h.s. of (5.67) tends to $\bar\mu(F\circ T^{n-m})\,\bar\mu(G\circ T^{-m})$, which is what we wanted to prove. This concludes the proof of Theorem 4.6.
A. Appendix

We collect here a few technical results which would have been distracting in the body of the paper. The most important of them is an estimate on a certain Fourier norm that is pivotal in the proof of Theorem 4.6. This is presented in Sect. A.3.

A.1. Proof of Lemma 4.2. Since $T$ is an automorphism, it is no loss of generality to prove the assertion for $t = -1$. Also, since (A1) is invariant for the action of $\mathbb{Z}^d$ on $\mathcal{V}$, we may assume that all $V \in \mathcal{V}$ are of the form $V_r := B_{0,r} \times [0,1)^2$. Thus, the infinite-volume limit becomes the limit $r \to \infty$. Let $r' = r'(r) := [r^{1/2}]$ ($[\cdot]$ is the integer part of a positive number) and
$$\varphi(r') := \mu(T S_0 \setminus V_{r'}) = \sum_{\beta\notin B_{0,r'}} p_\beta. \tag{A.1}$$
Clearly, $\varphi(r') \to 0$, as $r$ and $r'$ tend to infinity. This and the translation invariance of $T$ imply that
$$\mu(T V_r \setminus V_{r+r'}) \le \mu(V_r)\,\varphi(r'), \tag{A.2}$$
whence
$$\mu(T V_r \cup V_r) \le \mu(V_{r+r'}) + \mu(T V_r \setminus V_{r+r'}) = (2r+1)^d + o((2r+1)^d). \tag{A.3}$$
With a dual argument, considering that $\mu(V_{r-r'} \setminus T V_r) = \mu(T^{-1}V_{r-r'} \setminus V_r)$ and that $T^{-1}$ acts essentially as $T$ (after a swapping of the coordinates $y_1$ and $y_2$, the map $T^{-1}$ becomes of the same type as $T$), we obtain
$$\mu(T V_r \cap V_r) \ge \mu(V_{r-r'}) - \mu(V_{r-r'} \setminus T V_r) = (2r+1)^d + o((2r+1)^d). \tag{A.4}$$
Taking the difference of (A.3) and (A.4) yields (A1) with $t = -1$.
A.2. Proof of Lemma 4.5. We must show that, if $j', j'' \in \mathbb{Z}_N$ with $j' \ne j''$, then
$$\mathrm{span}_{\mathbb{Z}}\{\beta^{(j)} - \beta^{(j')}\}_{j\ne j'} = \mathrm{span}_{\mathbb{Z}}\{\beta^{(j)} - \beta^{(j'')}\}_{j\ne j''}. \tag{A.5}$$
The generic element of the l.h.s. of (A.5) is
$$\gamma = \sum_{j\ne j'} n_j\left(\beta^{(j)} - \beta^{(j')}\right) = \sum_{j\ne j'} n_j\,\beta^{(j)} - \Big(\sum_{j\ne j'} n_j\Big)\beta^{(j')}, \tag{A.6}$$
where $\{n_j\}_{j\ne j'}$ are free variables, i.e., are arbitrarily chosen integers. Upon defining $n_{j'} := -\sum_{j\ne j'} n_j$, which implies $n_{j''} = -\sum_{j\ne j''} n_j$, (A.6) becomes
$$\gamma = \sum_{j\in\mathbb{Z}_N} n_j\,\beta^{(j)} = \sum_{j\ne j''} n_j\,\beta^{(j)} - \Big(\sum_{j\ne j''} n_j\Big)\beta^{(j'')} = \sum_{j\ne j''} n_j\left(\beta^{(j)} - \beta^{(j'')}\right), \tag{A.7}$$
which is the generic element of the r.h.s. of (A.5), if we consider $\{n_j\}_{j\ne j''}$ to be the free variables and $n_{j''}$ to depend on them.
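The change of base point in (A.6)–(A.7) is a purely algebraic bookkeeping step, which the following minimal sketch replays on explicit integer vectors. The helper names `combination` and `change_base`, and the sample data in the usage note, are illustrative assumptions and do not come from the paper.

```python
from typing import Dict, Tuple

Vector = Tuple[int, ...]


def combination(beta: Dict[int, Vector], coeffs: Dict[int, int], base: int) -> Vector:
    """Return sum_j n_j * (beta^(j) - beta^(base)) over the indices j in coeffs."""
    dim = len(beta[base])
    gamma = [0] * dim
    for j, n_j in coeffs.items():
        for i in range(dim):
            gamma[i] += n_j * (beta[j][i] - beta[base][i])
    return tuple(gamma)


def change_base(coeffs: Dict[int, int], old_base: int, new_base: int) -> Dict[int, int]:
    """Re-express the same lattice element with respect to new_base, as in (A.7)."""
    full = dict(coeffs)
    # n_{j'} := -(sum of the free coefficients), so that all n_j sum to zero.
    full[old_base] = -sum(coeffs.values())
    # n_{j''} is determined by the remaining coefficients and drops out.
    del full[new_base]
    return full
```

For instance, with `beta = {0: (1, 0), 1: (0, 1), 2: (2, 3), 3: (-1, 4)}` and `coeffs = {1: 2, 2: -5, 3: 7}`, the calls `combination(beta, coeffs, 0)` and `combination(beta, change_base(coeffs, 0, 2), 2)` return the same vector.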
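A.3. Absolutely convergent Fourier series. In this section we present a convenient estimate for the space $A$ of functions $\tilde a : \mathbb{T}^d \to \mathbb{C}$ with an absolutely convergent Fourier series $\{a_\beta\} = a$ [K, §6]. This functional space is defined as the maximal domain of the norm
$$\|\tilde a\|_A := \|a\|_1 := \sum_{\beta\in\mathbb{Z}^d} |a_\beta|. \tag{A.8}$$
This norm has a couple of straightforward invariances. For $\gamma \in \mathbb{Z}^d$, $\zeta \in \mathbb{T}^d$, and $\tilde a \in A$, let
$$\omega_\gamma(\theta) := e^{\imath\gamma\cdot\theta}; \tag{A.9}$$
$$(\tau_\zeta\tilde a)(\theta) := \tilde a(\theta + \zeta). \tag{A.10}$$

Lemma A.1. Given $\tilde a \in A$, for all $\gamma \in \mathbb{Z}^d$ and $\zeta \in \mathbb{T}^d$, $\|\tilde a\,\omega_\gamma\|_A = \|\tau_\zeta\tilde a\|_A = \|\tilde a\|_A$.

Proof. Trivial verification upon computation of the Fourier series of $\tilde a\,\omega_\gamma$ and $\tau_\zeta\tilde a$.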
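The verification behind Lemma A.1 can be sketched on finitely supported coefficient families. The sketch below assumes the sign convention $\tilde a(\theta) = \sum_\beta a_\beta e^{\imath\beta\cdot\theta}$, in line with (5.27). Under that convention, multiplying by $\omega_\gamma$ shifts the indices by $\gamma$ and translating by $\zeta$ multiplies $a_\beta$ by a unimodular phase, so neither changes $\sum_\beta |a_\beta|$. All function names are illustrative.

```python
import cmath
from typing import Dict, Sequence, Tuple

Index = Tuple[int, ...]
Coeffs = Dict[Index, complex]


def dot(beta: Sequence[float], theta: Sequence[float]) -> float:
    return sum(b * t for b, t in zip(beta, theta))


def a_norm(a: Coeffs) -> float:
    """The norm (A.8): sum of the moduli of the Fourier coefficients."""
    return sum(abs(c) for c in a.values())


def evaluate(a: Coeffs, theta: Sequence[float]) -> complex:
    """Evaluate the trigonometric polynomial with coefficients a at theta."""
    return sum(c * cmath.exp(1j * dot(beta, theta)) for beta, c in a.items())


def times_omega(a: Coeffs, gamma: Index) -> Coeffs:
    """Coefficients of a~ * omega_gamma: every index is shifted by gamma."""
    return {tuple(b + g for b, g in zip(beta, gamma)): c for beta, c in a.items()}


def translate(a: Coeffs, zeta: Sequence[float]) -> Coeffs:
    """Coefficients of tau_zeta a~: a_beta is multiplied by exp(i beta . zeta)."""
    return {beta: c * cmath.exp(1j * dot(beta, zeta)) for beta, c in a.items()}
```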
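The following estimate is a modification (mostly, a simplification) of a 1984 result by Nowak [No]. The proof, which we give for completeness, is practically copied from that article.

Lemma A.2. Let $\bar\nu = [d/2] + 1$ be the smallest integer strictly bigger than $d/2$. There exists a constant $C_d > 0$ such that $\|\tilde a\|_A \le C_d\,\|\tilde a\|_{H^{\bar\nu}}$, where
$$\|\tilde a\|_{H^{\bar\nu}} := |a_0| + \sum_{i=1}^{d}\Big(\sum_{\beta\in\mathbb{Z}^d}\left|\beta_i^{\bar\nu} a_\beta\right|^2\Big)^{1/2} = \left|\int_{\mathbb{T}^d}\tilde a(\theta)\,d\theta\right| + \sum_{i=1}^{d}\left(\int_{\mathbb{T}^d}\left|\frac{\partial^{\bar\nu}\tilde a}{\partial\theta_i^{\bar\nu}}(\theta)\right|^2 d\theta\right)^{1/2}.$$

Proof. Let $\sigma = (\sigma_1, \sigma_2, \ldots, \sigma_d)$ be a permutation of $(1, 2, \ldots, d)$. Let us define
$$Z_\sigma := \left\{(\beta_1, \beta_2, \ldots, \beta_d) \in \mathbb{Z}^d \;\middle|\; 0 < |\beta_{\sigma_1}| \le |\beta_{\sigma_2}| \le \cdots \le |\beta_{\sigma_d}|\right\}. \tag{A.11}$$
Clearly,
$$\mathbb{Z}^d = \{0\} \cup \bigcup_{\sigma} Z_\sigma, \tag{A.12}$$
although the union is not disjoint. Using the Cauchy-Schwarz inequality,
$$\Big(\sum_{\beta\in Z_\sigma}|a_\beta|\Big)^2 \le C_{\bar\nu}\sum_{\beta\in Z_\sigma}\beta_{\sigma_d}^{2\bar\nu}|a_\beta|^2 \le C_{\bar\nu}\sum_{\beta\in\mathbb{Z}^d}\left|\beta_{\sigma_d}^{\bar\nu} a_\beta\right|^2, \tag{A.13}$$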
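where we have denoted
$$C_{\bar\nu} := \sum_{\beta\in Z_\sigma}\beta_{\sigma_d}^{-2\bar\nu} = \sum_{|\alpha_1|>0}\ \sum_{|\alpha_2|\ge|\alpha_1|}\cdots\sum_{|\alpha_{d-1}|\ge|\alpha_{d-2}|}\ \sum_{|\alpha_d|\ge|\alpha_{d-1}|}\alpha_d^{-2\bar\nu}$$
$$\le C\sum_{|\alpha_1|>0}\ \sum_{|\alpha_2|\ge|\alpha_1|}\cdots\sum_{|\alpha_{d-1}|\ge|\alpha_{d-2}|}|\alpha_{d-1}|^{-2\bar\nu+1} \le \cdots \le C\sum_{|\alpha_1|>0}|\alpha_1|^{-2\bar\nu+d-1} < \infty \tag{A.14}$$
(as in Sect. 5, $C$ represents a generic constant). In view of (A.12), summing the square root of (A.13) over all the permutations $\sigma$, we obtain
$$\sum_{\beta\ne 0}|a_\beta| \le (d-1)!\,\sqrt{C_{\bar\nu}}\,\sum_{i=1}^{d}\Big(\sum_{\beta\in\mathbb{Z}^d}\left|\beta_i^{\bar\nu} a_\beta\right|^2\Big)^{1/2}, \tag{A.15}$$
whence the assertion of the lemma.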
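As a sanity check of the Cauchy-Schwarz step, the case $d = 1$ of (A.15) can be tested directly. There $\bar\nu = 1$, the only permutation is the identity, $(d-1)! = 1$ and $C_{\bar\nu} = \sum_{\alpha\ne 0}\alpha^{-2} = \pi^2/3$. The sketch below assumes a finitely supported coefficient family stored as a dictionary; the names `lhs_A15` and `rhs_A15` are illustrative.

```python
from math import pi, sqrt
from typing import Dict


def lhs_A15(a: Dict[int, complex]) -> float:
    """Left-hand side of (A.15) for d = 1: sum of |a_beta| over beta != 0."""
    return sum(abs(c) for beta, c in a.items() if beta != 0)


def rhs_A15(a: Dict[int, complex]) -> float:
    """Right-hand side of (A.15) for d = 1, with C = sum_{alpha != 0} alpha^(-2) = pi^2 / 3."""
    c_nu = pi ** 2 / 3
    weighted = sum(abs(beta * c) ** 2 for beta, c in a.items())
    return sqrt(c_nu) * sqrt(weighted)
```

Any finitely supported family satisfies `lhs_A15(a) <= rhs_A15(a)`, for example `a = {beta: 1 / (1 + beta * beta) for beta in range(-200, 201)}`.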
Acknowledgements. I would like to thank an anonymous referee for pointing out a relevant mistake in the first draft of the manuscript.
References

[A] Aaronson, J.: An introduction to infinite ergodic theory. Mathematical Surveys and Monographs, 50. Providence, RI: Amer. Math. Soc., 1997
[BS] Bunimovich, L.A., Sinai, Ya.G.: Statistical properties of Lorentz gas with periodic configuration of scatterers. Commun. Math. Phys. 78(4), 479–497 (1980/81)
[CM] Chernov, N., Markarian, R.: Chaotic billiards. Mathematical Surveys and Monographs, 127. Providence, RI: Amer. Math. Soc., 2006
[Fr] Friedman, N.A.: Mixing transformations in an infinite measure space. In: Studies in probability and ergodic theory, Adv. in Math. Suppl. Stud., 2. New York-London: Academic Press, 1978, pp. 167–184
[HK] Hajian, A.B., Kakutani, S.: Weakly wandering sets and invariant measures. Trans. Amer. Math. Soc. 110, 136–151 (1964)
[H] Hopf, E.: Ergodentheorie. Berlin: Springer-Verlag, 1937
[I1] Isola, S.: Renewal sequences and intermittency. J. Stat. Phys. 97(1–2), 263–280 (1999)
[I2] Isola, S.: On systems with finite ergodic degree. Far East J. Dyn. Syst. 5(1), 1–62 (2003)
[KP] Kakutani, S., Parry, W.: Infinite measure preserving transformations with “mixing”. Bull. Amer. Math. Soc. 69, 752–756 (1963)
[K] Katznelson, Y.: An introduction to harmonic analysis. 3rd ed. Cambridge Mathematical Library. Cambridge: Cambridge University Press, 2004
[KO] Kingman, J.F.C., Orey, S.: Ratio limit theorems for Markov chains. Proc. Amer. Math. Soc. 15, 907–910 (1964)
[KS] Krengel, U., Sucheston, L.: Mixing in infinite measure spaces. Z. Wahr. Verw. Geb. 13, 150–164 (1969)
[Kr] Krickeberg, K.: Strong mixing properties of Markov chains with infinite invariant measure. In: 1967 Proc. Fifth Berkeley Sympos. Math. Statist. and Probability (Berkeley, CA, 1965/66), Vol. II, Part 2, Berkeley, CA: Univ. California Press, 1967, pp. 431–446
[L1] Lenci, M.: Aperiodic Lorentz gas: recurrence and ergodicity. Erg. Th. Dynam. Systs. 23(3), 869–883 (2003)
[L2] Lenci, M.: Typicality of recurrence for Lorentz gases. Erg. Th. Dynam. Systs. 26(3), 799–820 (2006)
[L3] Lenci, M.: Mixing properties of infinite Lorentz gases. In preparation
[No] Nowak, Z.: Criteria for absolute convergence of multiple Fourier series. Ark. Mat. 22(1), 25–32 (1984)
[Or] Orey, S.: Strong ratio limit property. Bull. Amer. Math. Soc. 67, 571–574 (1961)
[Pa] Papangelou, F.: Strong ratio limits, r-recurrence and mixing properties of discrete parameter Markov processes. Z. Wahr. Verw. Geb. 8, 259–297 (1967)
[Pr] Pruitt, W.E.: Strong ratio limit property for r-recurrent Markov chains. Proc. Amer. Math. Soc. 16, 196–200 (1965)
[R] Rudin, W.: Fourier analysis on groups. Interscience Tracts in Pure and Applied Mathematics, no. 12, New York-London: Interscience Publishers (a division of John Wiley and Sons), 1962
[Sa] Sachdeva, U.: On category of mixing in infinite measure spaces. Math. Systems Theory 5, 319–330 (1971)
[Si] Sinai, Ya.G.: Dynamical systems with elastic reflections. Russ. Math. Surv. 25(2), 137–189 (1970)
[Sp] Spitzer, F.: Principles of random walk, 2nd ed. Graduate Texts in Mathematics, 34. New York-Heidelberg: Springer-Verlag, 1976
[Su] Sucheston, L.: On mixing and the zero-one law. J. Math. Anal. Appl. 6, 447–456 (1963)
[Th] Thaler, M.: The asymptotics of the Perron-Frobenius operator of a class of interval maps preserving infinite measures. Studia Math. 143(2), 103–119 (2000)
[To1] Tomatsu, S.: Uniformity of mixing transformations with infinite measure. Bull. Fac. Gen. Ed. Gifu Univ. No. 17, 43–49 (1981)
[To2] Tomatsu, S.: Local uniformity of mixing transformations with infinite measure. Bull. Fac. Gen. Ed. Gifu Univ. No. 20, 1–5 (1984)
[W] Walters, P.: An introduction to ergodic theory. Graduate Texts in Mathematics, 79. New York-Berlin: Springer-Verlag, 1982
Communicated by G. Gallavotti
Commun. Math. Phys. 298, 515–522 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1073-0
Communications in
Mathematical Physics
An Energy Gap for Yang-Mills Connections Claus Gerhardt Institut für Angewandte Mathematik, Ruprecht-Karls-Universität, Im Neuenheimer Feld 294, 69120 Heidelberg, Germany. E-mail: [email protected] Received: 8 August 2009 / Accepted: 16 February 2010 Published online: 11 June 2010 – © Springer-Verlag 2010
Abstract: Consider a Yang-Mills connection over a Riemann manifold $M = M^n$, $n \ge 3$, where $M$ may be compact or complete. Then its energy must be bounded from below by some positive constant, if $M$ satisfies certain conditions, unless the connection is flat.

Contents
1. Introduction
2. The Compact Case
3. The Non-compact Case
References
1. Introduction

We consider the problem: When is a Yang-Mills connection non-flat? Of course, the trivial answer $F_{\mu\lambda} \not\equiv 0$ is unsatisfactory. Bourguignon and Lawson proved in [3, Theorem C], among other results, that any Yang-Mills connection over $S^n$, $n \ge 3$, the field strength of which satisfies the pointwise estimate
$$F^2 = -\operatorname{tr}(F_{\mu\lambda}F^{\mu\lambda}) < \binom{n}{2} \tag{1.1}$$
is flat. We want to prove that under certain assumptions on the base space $M$, which is supposed to be a Riemannian manifold of dimension $n \ge 3$, the energy of a Yang-Mills connection has to satisfy
$$\left(\int_M |F|^{\frac{n}{2}}\right)^{\frac{2}{n}} \ge \kappa_0 > 0, \tag{1.2}$$
where $\kappa_0$ depends only on the Sobolev constants of $M$, $n$ and the dimension of the Lie group $G$, unless the connection is flat. Here,
$$|F| = \sqrt{F^2}, \tag{1.3}$$
and we also call the left-hand side of (1.2) energy though this label is only correct when $n = 4$. However, this norm is also the crucial norm, which has to be (locally) small, used to prove regularity of a connection, cf. [4, Theorem 1.3]. The exponent $\frac{n}{2}$ naturally pops up when Sobolev inequalities are applied to solutions of differential equations satisfied by the field strength or the energy density of a connection in the adjoint bundle.

This work has been supported by the DFG.

We distinguish two cases: $M$ compact and $M$ complete and non-compact. When $M$ is compact, we require
$$\bar R_{\alpha\beta}\Lambda^{\alpha\lambda}\Lambda^{\beta}{}_{\lambda} - \tfrac12\bar R_{\alpha\beta\mu\lambda}\Lambda^{\alpha\beta}\Lambda^{\mu\lambda} \ge c_0\,\Lambda_{\alpha\beta}\Lambda^{\alpha\beta} \tag{1.4}$$
for all skew-symmetric $\Lambda_{\alpha\beta} \in T^{0,2}(M)$, where $0 < c_0$, while for non-compact $M$ the weaker assumption
$$\bar R_{\alpha\beta}\Lambda^{\alpha\lambda}\Lambda^{\beta}{}_{\lambda} - \tfrac12\bar R_{\alpha\beta\mu\lambda}\Lambda^{\alpha\beta}\Lambda^{\mu\lambda} \ge 0, \tag{1.5}$$
and in addition
$$\left(\int_M |u|^{\frac{2n}{n-2}}\right)^{\frac{n-2}{n}} \le c_1\int_M |Du|^2 \quad \forall\, u \in H^{1,2}(M) \tag{1.6}$$
should be satisfied.

Remark 1.1. (i) If $M$ is a space of constant curvature
$$\bar R_{\alpha\beta\mu\lambda} = K_M\left(\bar g_{\alpha\mu}\bar g_{\beta\lambda} - \bar g_{\alpha\lambda}\bar g_{\beta\mu}\right), \tag{1.7}$$
then
$$\bar R_{\alpha\beta}\Lambda^{\alpha\lambda}\Lambda^{\beta}{}_{\lambda} - \tfrac12\bar R_{\alpha\beta\mu\lambda}\Lambda^{\alpha\beta}\Lambda^{\mu\lambda} = (n-2)K_M\,\Lambda_{\alpha\beta}\Lambda^{\alpha\beta}. \tag{1.8}$$
In case $n = 2$ the curvature term therefore vanishes, and this result is also valid for an arbitrary two-dimensional Riemannian manifold, since the curvature tensor then has the same structure as in (1.7) though $K_M$ is not necessarily constant.

(ii) If $M = \mathbb{R}^n$, $n \ge 3$, the conditions (1.5) and (1.6) are always valid.

Theorem 1.2. Let $M = M^n$, $n \ge 3$, be a compact Riemannian manifold for which the condition (1.4) with $c_0 > 0$ holds. Then any Yang-Mills connection over $M$ with compact, semi-simple Lie group is either flat or satisfies (1.2) for some constant $\kappa_0 > 0$ depending on the Sobolev constants of $M$, $n$, $c_0$, and the dimension of the Lie group.

Theorem 1.3. Let $M = M^n$, $n \ge 3$, be complete, non-compact and assume that the conditions (1.5) and (1.6) hold. Then any Yang-Mills connection over $M$ with compact, semi-simple Lie group is either flat or the estimate (1.2) is valid. The constant $\kappa_0 > 0$ in (1.2) depends on the constant $c_1$ in (1.6), $n$, and the dimension of the Lie group.
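The identity (1.8) of Remark 1.1 can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the paper: it works in an orthonormal frame (so $\bar g_{\alpha\beta} = \delta_{\alpha\beta}$ and index positions do not matter) and uses the Ricci convention $\bar R_{\alpha\beta} = \bar g^{\gamma\delta}\bar R_{\gamma\alpha\delta\beta}$, which gives $\bar R_{\alpha\beta} = (n-1)K_M\bar g_{\alpha\beta}$ for (1.7). The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from typing import List

Matrix = List[List[int]]


def curvature_term(lam: Matrix, K: int) -> Fraction:
    """Left-hand side of (1.8) for the constant-curvature tensor (1.7), orthonormal frame."""
    n = len(lam)

    def delta(a: int, b: int) -> int:
        return 1 if a == b else 0

    def riem(a: int, b: int, m: int, l: int) -> int:
        # (1.7) with the metric replaced by the Kronecker delta.
        return K * (delta(a, m) * delta(b, l) - delta(a, l) * delta(b, m))

    def ricci(a: int, b: int) -> int:
        # Assumed convention: contraction of the first and third index.
        return sum(riem(g, a, g, b) for g in range(n))

    idx = range(n)
    ricci_part = sum(ricci(a, b) * lam[a][l] * lam[b][l] for a, b, l in product(idx, repeat=3))
    riem_part = sum(
        riem(a, b, m, l) * lam[a][b] * lam[m][l] for a, b, m, l in product(idx, repeat=4)
    )
    return ricci_part - Fraction(riem_part, 2)


def norm_squared(lam: Matrix) -> int:
    """Lambda_{alpha beta} Lambda^{alpha beta}, summed over all index pairs."""
    return sum(x * x for row in lam for x in row)
```

For a skew-symmetric integer matrix `lam` of size `n`, `curvature_term(lam, K)` then equals `(n - 2) * K * norm_squared(lam)`; in particular it vanishes for `n = 2`, as noted in the remark.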
2. The Compact Case

Let $(P, M, G, G)$ be a principal fiber bundle, where $M = M^n$, $n \ge 3$, is a compact Riemannian manifold with metric $\bar g_{\alpha\beta}$ and $G$ a compact, semi-simple Lie group with Lie algebra $\mathfrak{g}$. Let $f_c = (f^a_{cb})$ be a basis of $\operatorname{ad}\mathfrak{g}$ and
$$A_\mu = f_c A^c_\mu \tag{2.1}$$
a Yang-Mills connection in the adjoint bundle $(E, M, \mathfrak{g}, \operatorname{Ad}(G))$. The curvature tensor of the connection is given by
$$R^a{}_{b\mu\lambda} = f^a_{cb}F^c_{\mu\lambda}, \tag{2.2}$$
where
$$F_{\mu\lambda} = f_c F^c_{\mu\lambda} \tag{2.3}$$
is the field strength of the connection, and
$$F^2 \equiv \gamma_{ab}F^a_{\mu\lambda}F^{b\mu\lambda} = R_{ab\mu\lambda}R^{ab\mu\lambda} \tag{2.4}$$
the energy density of the connection (at least up to a factor $\tfrac14$). Here, $\gamma_{ab}$ is the Cartan-Killing metric acting on elements of the fiber $\mathfrak{g}$, and Latin indices are raised or lowered with respect to the inverse $\gamma^{ab}$ or $\gamma_{ab}$, and Greek indices with respect to the metric of $M$.

Definition 2.1. The adjoint bundle $E$ is a vector bundle; let $E^*$ be the dual bundle, then we denote by
$$T^{r,s}(E) = \Gamma(\underbrace{E\otimes\cdots\otimes E}_{r}\otimes\underbrace{E^*\otimes\cdots\otimes E^*}_{s}) \tag{2.5}$$
the sections of the corresponding tensor bundle. Thus, we have
$$F^a_{\mu\lambda} \in T^{1,0}(E)\otimes T^{0,2}(M). \tag{2.6}$$
Since $A_\mu$ is a Yang-Mills connection it solves the Yang-Mills equation
$$F^{a\alpha}{}_{\lambda;\alpha} = 0, \tag{2.7}$$
where we use Einstein's summation convention, a semi-colon indicates covariant differentiation, and where we stipulate that a covariant derivative is always a full tensor, i.e.,
$$F^a_{\mu\lambda;\alpha} = F^a_{\mu\lambda,\alpha} + f^a_{bc}A^b_\alpha F^c_{\mu\lambda} - \bar\Gamma^{\gamma}_{\alpha\mu}F^a_{\gamma\lambda} - \bar\Gamma^{\gamma}_{\alpha\lambda}F^a_{\mu\gamma}, \tag{2.8}$$
where $\bar\Gamma^{\gamma}_{\alpha\beta}$ are the Christoffel symbols of the Riemannian connection; a comma indicates partial differentiation.

Before we formulate the crucial lemma let us note that $\bar R_{\alpha\beta\gamma\delta}$ resp. $\bar R_{\alpha\beta}$ symbolize the Riemann curvature tensor resp. the Ricci tensor of $\bar g_{\alpha\beta}$.
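Lemma 2.2. Let $A_\mu$ be a Yang-Mills connection, then its energy density $F^2$ solves the equation
$$-\tfrac14\Delta F^2 + \tfrac12 F_{a\mu\lambda;\alpha}F^{a\mu\lambda}{}_{;}{}^{\alpha} + \bar R_{\beta\mu}F^{a\beta}{}_{\lambda}F_a{}^{\mu\lambda} - \tfrac12\bar R_{\alpha\beta\mu\lambda}F_a{}^{\alpha\beta}F^{a\mu\lambda} = -f^a_{cb}F^c_{\alpha\mu}F^{b\alpha}{}_{\lambda}F_a{}^{\mu\lambda}. \tag{2.9}$$

Proof. Differentiating (2.7) covariantly with respect to $x^\mu$ and using the Ricci identities we obtain
$$0 = -F^{a\alpha}{}_{\lambda;\alpha\mu} = -F^{a\alpha}{}_{\lambda;\mu\alpha} + R^a{}_{b\alpha\mu}F^{b\alpha}{}_{\lambda} + \bar R^{\alpha}{}_{\beta\alpha\mu}F^{a\beta}{}_{\lambda} + \bar R^{\beta}{}_{\lambda\mu\alpha}F^{a\alpha}{}_{\beta}. \tag{2.10}$$
On the other hand, differentiating the second Bianchi identities
$$0 = F^a_{\alpha\lambda;\mu} + F^a_{\mu\alpha;\lambda} + F^a_{\lambda\mu;\alpha} \tag{2.11}$$
we infer
$$0 = F^{a\alpha}{}_{\lambda;\mu\alpha} + F^{a}{}_{\mu}{}^{\alpha}{}_{;\lambda\alpha} + \Delta F^a_{\lambda\mu}, \tag{2.12}$$
and we deduce further
$$-\Delta F^a_{\mu\lambda}F_a{}^{\mu\lambda} = -2F^{a\alpha}{}_{\lambda;\mu\alpha}F_a{}^{\mu\lambda}. \tag{2.13}$$
In view of (2.10) we then conclude
$$0 = -\tfrac12\Delta F^a_{\mu\lambda}F_a{}^{\mu\lambda} + R^a{}_{b\alpha\mu}F^{b\alpha}{}_{\lambda}F_a{}^{\mu\lambda} + \bar R_{\beta\mu}F^{a\beta}{}_{\lambda}F_a{}^{\mu\lambda} + \bar R^{\beta}{}_{\lambda\mu\alpha}F^{a\alpha}{}_{\beta}F_a{}^{\mu\lambda}, \tag{2.14}$$
which is equivalent to
$$0 = -\tfrac12\Delta F^a_{\mu\lambda}F_a{}^{\mu\lambda} + f^a_{cb}F^c_{\alpha\mu}F^{b\alpha}{}_{\lambda}F_a{}^{\mu\lambda} + \bar R_{\beta\mu}F^{a\beta}{}_{\lambda}F_a{}^{\mu\lambda} - \bar R_{\alpha\mu\beta\lambda}F^{a\alpha\beta}F_a{}^{\mu\lambda}, \tag{2.15}$$
in view of (2.2). Finally, using the first Bianchi identities,
$$\bar R_{\alpha\beta\mu\lambda} + \bar R_{\alpha\mu\lambda\beta} + \bar R_{\alpha\lambda\beta\mu} = 0, \tag{2.16}$$
we deduce
$$\bar R_{\alpha\beta\mu\lambda}F^{a\alpha\beta}F_a{}^{\mu\lambda} + \bar R_{\alpha\mu\lambda\beta}F^{a\alpha\beta}F_a{}^{\mu\lambda} + \bar R_{\alpha\lambda\beta\mu}F^{a\alpha\beta}F_a{}^{\mu\lambda} = 0, \tag{2.17}$$
and hence
$$\bar R_{\alpha\beta\mu\lambda}F^{a\alpha\beta}F_a{}^{\mu\lambda} = 2\bar R_{\alpha\mu\beta\lambda}F^{a\alpha\beta}F_a{}^{\mu\lambda}, \tag{2.18}$$
from which Eq. (2.9) immediately follows.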
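Proof of Theorem 1.2. Define
$$u = F^2, \tag{2.19}$$
then
$$\bar R_{\beta\mu}F^{a\beta}{}_{\lambda}F_a{}^{\mu\lambda} - \tfrac12\bar R_{\alpha\beta\mu\lambda}F_a{}^{\alpha\beta}F^{a\mu\lambda} \ge c_0 u, \tag{2.20}$$
where $c_0 > 0$, in view of the assumption (1.4). Multiplying (2.9) with $u$ and integrating by parts we obtain
$$\tfrac38\int_M |Du|^2 + c_0\int_M u^2 \le c\int_M u^2\sqrt{u}, \tag{2.21}$$
where we used the simple estimate
$$|Du|^2 \le 4F_{a\mu\lambda;\alpha}F^{a\mu\lambda}{}_{;}{}^{\alpha}\,u, \tag{2.22}$$
and where $c$ depends on $n$ and the dimension of $\mathfrak{g}$; note that
$$f_c \in \mathrm{SO}(\mathfrak{g}, \gamma_{ab}). \tag{2.23}$$
The integral on the right-hand side of (2.21) is estimated by
$$\int_M u^2\sqrt{u} \le \left(\int_M u^{\frac{n}{4}}\right)^{\frac{2}{n}}\left(\int_M u^{\frac{2n}{n-2}}\right)^{\frac{n-2}{n}}, \tag{2.24}$$
where
$$\left(\int_M u^{\frac{n}{4}}\right)^{\frac{2}{n}} = \left(\int_M |F|^{\frac{n}{2}}\right)^{\frac{2}{n}}. \tag{2.25}$$
Applying then the Sobolev inequality
$$\left(\int_M u^{\frac{2n}{n-2}}\right)^{\frac{n-2}{n}} \le c_1\int_M |Du|^2 + c_2\int_M u^2, \tag{2.26}$$
cf. [1], we obtain
$$\left(\int_M u^{\frac{2n}{n-2}}\right)^{\frac{n-2}{n}} \le c_3\left(\int_M |F|^{\frac{n}{2}}\right)^{\frac{2}{n}}\left(\int_M u^{\frac{2n}{n-2}}\right)^{\frac{n-2}{n}}, \tag{2.27}$$
where $c_3$ depends on $c_1$, $c_2$, $c_0$ and $c$. Hence, we deduce $u \equiv 0$ or
$$c_3^{-1} \le \left(\int_M |F|^{\frac{n}{2}}\right)^{\frac{2}{n}}. \tag{2.28}$$
Setting
$$\kappa_0 = c_3^{-1} \tag{2.29}$$
finishes the proof.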
3. The Non-compact Case

We now suppose that $M = M^n$ is a complete, non-compact Riemannian manifold. Then there holds
$$H^{1,2}(M) = H^{1,2}_0(M), \tag{3.1}$$
i.e., the test functions $C_c^\infty(M)$ are dense in the Sobolev space $H^{1,2}(M)$, see [1, Lemma 4] or [2, Theorem 2.6]. Since we do not a priori know
$$F^2 \in H^{1,2}(M), \tag{3.2}$$
but only
$$F^2 \in H^{1,2}_{\mathrm{loc}}(M), \tag{3.3}$$
the preceding proof has to be modified. Let $\eta = \eta(t)$ be defined through
$$\eta(t) = \begin{cases} 1, & t \le 1,\\ (2-t)^q, & 1 \le t \le 2,\\ 0, & t \ge 2, \end{cases} \tag{3.4}$$
where
$$q = \max\left(1, \tfrac{8}{n}\right). \tag{3.5}$$
Fix a point $x_0 \in M$ and let $r$ be the Riemannian distance function with center in $x_0$,
$$r(x) = d(x_0, x). \tag{3.6}$$
Then $r$ is Lipschitz such that
$$|Dr| = 1 \tag{3.7}$$
almost everywhere. For $k \ge 1$ define
$$\eta_k(x) = \eta(k^{-1}r). \tag{3.8}$$
The functions
$$u^{p-1}\eta_k^{p}, \tag{3.9}$$
where
$$p = \tfrac{n}{4}, \tag{3.10}$$
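then have compact support, and multiplying (2.9) with $u^{p-1}\eta_k^{p}$ yields
$$\left(\tfrac{p}{4} + \tfrac18 - \epsilon\right)\int_M |Du|^2 u^{p-2}\eta_k^{p} \le c\left(\int_M |F|^{\frac{n}{2}}\right)^{\frac{2}{n}}\left(\int_M (u\eta_k)^{\frac{pn}{n-2}}\right)^{\frac{n-2}{n}} + c_\epsilon\int_M |D\eta_k|^2\eta_k^{p-2}u^{p}, \tag{3.11}$$
where $0 < \epsilon$ is supposed to be small. Furthermore, there holds
$$\int_M \left|D(u\eta_k)^{\frac{p}{2}}\right|^2 = \frac{p^2}{4}\int_M |Du\,\eta_k + u\,D\eta_k|^2 (u\eta_k)^{p-2} \le (1+\epsilon)\frac{p^2}{4}\int_M |Du|^2 u^{p-2}\eta_k^{p} + c_\epsilon\frac{p^2}{4}\int_M |D\eta_k|^2\eta_k^{p-2}u^{p}. \tag{3.12}$$
Now, choosing $\epsilon$ so small such that
$$(1+\epsilon)\frac{p^2}{4} \le p\left(\tfrac{p}{4} + \tfrac18 - \epsilon\right) \tag{3.13}$$
and setting
$$\varphi = (u\eta_k)^{\frac{p}{2}}, \tag{3.14}$$
we obtain
$$\int_M |D\varphi|^2 \le pc\left(\int_M |F|^{\frac{n}{2}}\right)^{\frac{2}{n}}\left(\int_M \varphi^{\frac{2n}{n-2}}\right)^{\frac{n-2}{n}} + c_\epsilon\int_M |D\eta_k|^2\eta_k^{p-2}u^{p}, \tag{3.15}$$
where $c_\epsilon$ is a new constant. We furthermore observe that
$$|D\eta_k|^2\eta_k^{p-2} \le q^2k^{-2}(2 - k^{-1}r)^{qp-2}, \tag{3.16}$$
subject to
$$1 \le k^{-1}r \le 2. \tag{3.17}$$
In view of (3.5) and (3.10)
$$qp - 2 \ge 0, \tag{3.18}$$
and hence
$$|D\eta_k|^2\eta_k^{p-2} \le q^2k^{-2}. \tag{3.19}$$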
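The bound (3.19) can be explored numerically. The sketch below is a rough check under the assumptions $|Dr| = 1$, $p = n/4$ and $q = \max(1, 8/n)$ from (3.5), (3.7) and (3.10); it treats $\eta_k$ as a function of the distance $r$ alone. The function names and the sampling grid in the note are illustrative.

```python
def eta(t: float, q: float) -> float:
    """Cut-off function (3.4)."""
    if t <= 1:
        return 1.0
    if t >= 2:
        return 0.0
    return (2 - t) ** q


def eta_prime(t: float, q: float) -> float:
    """Derivative of (3.4) away from the break points t = 1 and t = 2."""
    if t <= 1 or t >= 2:
        return 0.0
    return -q * (2 - t) ** (q - 1)


def weight(r: float, k: float, n: int) -> float:
    """|D eta_k|^2 * eta_k^(p-2) at distance r, using |Dr| = 1 and eta_k = eta(r / k)."""
    p = n / 4
    q = max(1.0, 8 / n)
    t = r / k
    if t <= 1 or t >= 2:
        return 0.0  # D eta_k vanishes outside the annulus 1 < r/k < 2
    return (eta_prime(t, q) / k) ** 2 * eta(t, q) ** (p - 2)


def bound(k: float, n: int) -> float:
    """Right-hand side q^2 k^(-2) of (3.19)."""
    return max(1.0, 8 / n) ** 2 / k ** 2
```

On the annulus the product equals $q^2k^{-2}(2-t)^{qp-2}$ with $t = r/k$, so for $n \le 8$, where $qp = 2$, the bound is attained up to rounding and a small relative tolerance is advisable when comparing `weight(r, k, n)` with `bound(k, n)` in floating point.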
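Applying now the Sobolev inequality (1.6) to $\varphi$ and choosing
$$\kappa_0 = (c_1cp)^{-1}, \tag{3.20}$$
we conclude $|F| \equiv 0$, if
$$\left(\int_M |F|^{\frac{n}{2}}\right)^{\frac{2}{n}} < \kappa_0. \tag{3.21}$$
Indeed, if the preceding inequality is valid, then we deduce from (3.15),
$$\left(1 - \kappa_0^{-1}\left(\int_M |F|^{\frac{n}{2}}\right)^{\frac{2}{n}}\right)\left(\int_M |\varphi|^{\frac{2n}{n-2}}\right)^{\frac{n-2}{n}} \le c_\epsilon q^2k^{-2}\int_M |F|^{\frac{n}{2}}. \tag{3.22}$$
In the limit $k \to \infty$ we obtain
$$\left(\int_M |u|^{\frac{pn}{n-2}}\right)^{\frac{n-2}{n}} \le 0. \tag{3.23}$$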
References

1. Aubin, T.: Problèmes isopérimétriques et espaces de Sobolev. J. Diff. Geom. 11(4), 573–598 (1976)
2. Aubin, T.: Nonlinear analysis on manifolds. Monge-Ampère equations. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], Vol. 252, New York: Springer-Verlag, 1982
3. Bourguignon, J.-P., Lawson, H.B. Jr.: Stability and isolation phenomena for Yang-Mills fields. Commun. Math. Phys. 79(2), 189–230 (1981)
4. Uhlenbeck, K.K.: Connections with $L^p$ bounds on curvature. Commun. Math. Phys. 83(1), 31–42 (1982)

Communicated by A. Connes
Commun. Math. Phys. 298, 523–547 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1011-1
Communications in
Mathematical Physics
Generators of KMS Symmetric Markov Semigroups on B(h)
Symmetry and Quantum Detailed Balance

Franco Fagnola¹, Veronica Umanità²

¹ Department of Mathematics, Politecnico di Milano, P. L. da Vinci 32, I-20133 Milano, Italy. E-mail: [email protected]
² Department of Mathematics, University of Genoa, V. Dodecaneso 35, I-16146 Genova, Italy. E-mail: [email protected]

Received: 11 August 2009 / Accepted: 4 December 2009 Published online: 24 February 2010 – © Springer-Verlag 2010
Abstract: We find the structure of generators of norm-continuous quantum Markov semigroups on $\mathcal{B}(\mathsf{h})$ that are symmetric with respect to the scalar product $\mathrm{tr}(\rho^{1/2} x^* \rho^{1/2} y)$ induced by a faithful normal invariant state $\rho$ and satisfy two quantum generalisations of the classical detailed balance condition related with this non-commutative notion of symmetry: the so-called standard detailed balance condition and the standard detailed balance condition with an antiunitary time reversal.

1. Introduction

Symmetric Markov semigroups have been extensively studied in classical stochastic analysis (Fukushima et al. [13] and the references therein) because their generators and associated Dirichlet forms are very well tractable by Hilbert space and probabilistic methods. Their non-commutative counterpart has also been deeply investigated (Albeverio and Goswami [1], Cipriani [6], Davies and Lindsay [8], Goldstein and Lindsay [15], Guido, Isola and Scarlatti [17], Park [23], Sauvageot [26] and the references therein).
The classical notion of symmetry with respect to a measure, however, admits several non-commutative generalisations. Here we shall consider the so-called KMS-symmetry that seems more natural from a mathematical point of view (see e.g. Accardi and Mohari [3], Cipriani [6,7], Goldstein and Lindsay [14], Petz [25]) and find the structure of generators of norm-continuous quantum Markov semigroups (QMS) on the von Neumann algebra $\mathcal{B}(\mathsf{h})$ of all bounded operators on a complex separable Hilbert space $\mathsf{h}$ that are symmetric or satisfy quantum detailed balance conditions associated with KMS-symmetry or generalising it.
We consider QMS on $\mathcal{B}(\mathsf{h})$, i.e. weak$^*$-continuous semigroups of normal, completely positive, identity preserving maps $\mathcal{T} = (\mathcal{T}_t)_{t\ge 0}$ on $\mathcal{B}(\mathsf{h})$, with a faithful normal invariant state $\rho$. This defines pre-scalar products on $\mathcal{B}(\mathsf{h})$ by $(x, y)_s = \mathrm{tr}(\rho^{1-s} x^* \rho^{s} y)$ for $s \in [0, 1]$ and allows one to define the $s$-dual semigroup $\mathcal{T}'$ on $\mathcal{B}(\mathsf{h})$ satisfying
\[
\mathrm{tr}\big(\rho^{1-s} x^* \rho^{s}\, \mathcal{T}_t(y)\big) = \mathrm{tr}\big(\rho^{1-s}\, \mathcal{T}'_t(x)^* \rho^{s} y\big)
\]
for all $x, y \in \mathcal{B}(\mathsf{h})$. The above scalar products coincide on an Abelian von Neumann algebra; the notion of symmetry $\mathcal{T} = \mathcal{T}'$, however, clearly depends on the choice of the parameter $s$. The most studied cases are $s = 0$ and $s = 1/2$. Denoting by $\mathcal{T}_*$ the predual semigroup, a simple computation yields
\[
\mathcal{T}'_t(x) = \rho^{-(1-s)}\, \mathcal{T}_{*t}(\rho^{1-s} x \rho^{s})\, \rho^{-s},
\]
and shows that for $s = 1/2$ the maps $\mathcal{T}'_t$ are positive but, for $s \ne 1/2$, this may not be the case. Indeed, it is well-known that, for $s \ne 1/2$, the maps $\mathcal{T}'_t$ are positive if and only if the maps $\mathcal{T}_t$ commute with the modular group $(\sigma_t)_{t\in\mathbb{R}}$, $\sigma_t(x) = \rho^{it} x \rho^{-it}$ (see e.g. [18] Prop. 2.1, p. 98, [22] Th. 6, p. 7985, for $s = 0$, [11] Th. 3.1, p. 341, Prop. 8.1, p. 362 for $s \ne 1/2$). This quite restrictive condition implies that the generator has a very special form that makes the mathematical study of symmetry simpler but imposes strong structural constraints (see e.g. [18] and [12]).
Here we shall consider the most natural choice $s = 1/2$, whose consequences are not so stringent, and say that $\mathcal{T}$ is KMS-symmetric if it coincides with its dual $\mathcal{T}'$. KMS-symmetric QMS were introduced by Cipriani [6] and Goldstein and Lindsay [14]; we refer to [7] for a discussion of the connection with the KMS condition justifying this terminology.
All quantum versions of the classical principle of detailed balance (Agarwal [4], Alicki [5], Frigerio, Gorini, Kossakowski and Verri [18], Majewski [20,21]), which is at the basis of equilibrium physics, are formulated prescribing a certain relationship between $\mathcal{T}$ and $\mathcal{T}'$ or between their generators; therefore they depend on the underlying notion of symmetry.
This work clarifies the structure of generators of QMS that are KMS-symmetric or satisfy a quantum detailed balance condition involving the above scalar product with $s = 1/2$. It is a key step towards understanding which condition is the most natural and flexible in view of the study of their generalisations for quantum systems out of equilibrium as, for instance, the dynamical detailed balance condition introduced by Accardi and Imafuku [2].
The generator $\mathcal{L}$ of a norm-continuous QMS can be written in the standard Gorini-Kossakowski-Sudarshan [16] and Lindblad [19] (GKSL) form
\[
\mathcal{L}(x) = i[H, x] - \frac{1}{2}\sum_{\ell\ge 1}\big(L_\ell^* L_\ell\, x - 2 L_\ell^*\, x\, L_\ell + x\, L_\ell^* L_\ell\big), \tag{1}
\]
where $H, L_\ell \in \mathcal{B}(\mathsf{h})$ with $H = H^*$ and the series $\sum_{\ell\ge 1} L_\ell^* L_\ell$ is strongly convergent. The operators $L_\ell$, $H$ in (1) are not uniquely determined by $\mathcal{L}$; however, under a natural minimality condition (Theorem 2 below) and a zero-mean condition $\mathrm{tr}(\rho L_\ell) = 0$ for all $\ell \ge 1$, $H$ is determined up to a scalar multiple of the identity operator and the $(L_\ell)_{\ell\ge 1}$ up to a unitary transformation of the multiplicity space of the completely positive part of $\mathcal{L}$. We shall call special a GKSL representation of $\mathcal{L}$ by operators $H, L_\ell$ satisfying these conditions.
As a result, by the remark following Theorem 2, in a special GKSL representation of $\mathcal{L}$, the operator $G = -2^{-1}\sum_{\ell\ge 1} L_\ell^* L_\ell - iH$ is uniquely determined by $\mathcal{L}$ up to a purely imaginary multiple of the identity operator and allows us to write $\mathcal{L}$ in the form
\[
\mathcal{L}(x) = G^* x + \sum_{\ell\ge 1} L_\ell^*\, x\, L_\ell + x\, G. \tag{2}
\]
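The equivalence of the two forms (1) and (2) can be checked numerically on a small example. The following is a minimal sketch on $M_2(\mathbb{C})$ using only the Python standard library; the Hamiltonian `H` and the operators in `Ls` are illustrative assumptions, not operators taken from this paper.

```python
# Sketch only: H and Ls below are illustrative choices, not from the text.

def mat(rows):
    """Build a complex matrix (list of lists) from nested numbers."""
    return [[complex(v) for v in row] for row in rows]

def mul(a, b):
    """Matrix product a b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(*ms):
    """Entrywise sum of matrices of equal shape."""
    return [[sum(m[i][j] for m in ms) for j in range(len(ms[0][0]))]
            for i in range(len(ms[0]))]

def scale(c, a):
    """Scalar multiple c * a."""
    return [[c * v for v in row] for row in a]

def dag(a):
    """Adjoint (conjugate transpose)."""
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def close(a, b, tol=1e-12):
    """Entrywise comparison up to a tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(len(a)) for j in range(len(a[0])))

I2 = mat([[1, 0], [0, 1]])
H = mat([[0.5, 0], [0, -0.5]])      # self-adjoint Hamiltonian (assumed)
Ls = [mat([[0, 0], [0.8, 0]]),      # L_1 = 0.8 * sigma_minus (assumed)
      mat([[0, 0.6], [0, 0]])]      # L_2 = 0.6 * sigma_plus  (assumed)

def gksl_form1(x):
    """Form (1): i[H,x] - 1/2 sum_l (L*L x - 2 L* x L + x L*L)."""
    out = add(scale(1j, mul(H, x)), scale(-1j, mul(x, H)))
    for L in Ls:
        LdL = mul(dag(L), L)
        out = add(out, scale(-0.5, mul(LdL, x)),
                  mul(mul(dag(L), x), L), scale(-0.5, mul(x, LdL)))
    return out

def gksl_form2(x):
    """Form (2): G* x + sum_l L* x L + x G with G = -1/2 sum_l L*L - iH."""
    G = scale(-1j, H)
    for L in Ls:
        G = add(G, scale(-0.5, mul(dag(L), L)))
    out = add(mul(dag(G), x), mul(x, G))
    for L in Ls:
        out = add(out, mul(mul(dag(L), x), L))
    return out

x = mat([[1, 2 + 1j], [3 - 2j, -1]])  # arbitrary test observable
same = close(gksl_form1(x), gksl_form2(x))
unital = close(gksl_form1(I2), mat([[0, 0], [0, 0]]))
```

Here `same` compares the two forms on one test matrix and `unital` checks that the generator annihilates the identity, which is the infinitesimal version of identity preservation.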
Our characterisations of QMS that are KMS-symmetric or satisfy a quantum detailed balance condition generalising, or related with, KMS-symmetry are given in terms of the operators $G, L_\ell$ (or, in an equivalent way, $H, L_\ell$) of a special GKSL representation.
Theorem 7 shows that a QMS is KMS-symmetric if and only if the operators $G, L_\ell$ of a special GKSL representation of its generator satisfy $\rho^{1/2} G^* = G \rho^{1/2} + ic\rho^{1/2}$ for some $c \in \mathbb{R}$ and $\rho^{1/2} L_k^* = \sum_\ell u_{k\ell} L_\ell \rho^{1/2}$ for all $k$ and some unitary $(u_{k\ell})_{k\ell}$ on the multiplicity space of the completely positive part of $\mathcal{L}$ coinciding with its transpose, i.e. such that $u_{k\ell} = u_{\ell k}$ for all $k, \ell$.
In order to describe our results on the structure of generators of QMS satisfying a quantum detailed balance condition we first recall some basic definitions. The best known is due to Alicki [5] and Frigerio-Gorini-Kossakowski-Verri [18]: a norm-continuous QMS $\mathcal{T} = (\mathcal{T}_t)_{t\ge 0}$ on $\mathcal{B}(\mathsf{h})$ satisfies the Quantum Detailed Balance (QDB) condition if there exists an operator $\widetilde{\mathcal{L}}$ on $\mathcal{B}(\mathsf{h})$ and a self-adjoint operator $K$ on $\mathsf{h}$ such that $\mathrm{tr}(\rho\, \widetilde{\mathcal{L}}(x) y) = \mathrm{tr}(\rho\, x \mathcal{L}(y))$ and $\mathcal{L}(x) - \widetilde{\mathcal{L}}(x) = 2i[K, x]$ for all $x, y \in \mathcal{B}(\mathsf{h})$. Roughly speaking we can say that $\mathcal{L}$ satisfies the QDB condition if the difference of $\mathcal{L}$ and its adjoint $\widetilde{\mathcal{L}}$ with respect to the pre-scalar product on $\mathcal{B}(\mathsf{h})$ given by $\mathrm{tr}(\rho a^* b)$ is a derivation.
This QDB implies that the operator $\widetilde{\mathcal{L}} = \mathcal{L} - 2i[K, \cdot\,]$ can be written in the form (2) replacing $G$ by $G + 2iK$ and then generates a QMS $\widetilde{\mathcal{T}}$. Therefore $\mathcal{L}$ and the maps $\mathcal{T}_t$ commute with the modular group. This restriction does not follow if the dual QMS is defined with respect to the symmetric pre-scalar product with $s = 1/2$.
The QDB can be readily reformulated replacing $\widetilde{\mathcal{L}}$ with the adjoint $\mathcal{L}'$ defined via the symmetric scalar product; the resulting condition will be called the Standard Quantum Detailed Balance condition (SQDB) (see e.g. [9]). Theorem 5 characterises generators $\mathcal{L}$ satisfying the SQDB and extends previous partial results by Park [23] and the authors [11]: the SQDB holds if and only if there exists a unitary matrix $(u_{k\ell})_{k\ell}$, coinciding with its transpose, i.e. $u_{k\ell} = u_{\ell k}$ for all $k, \ell$, such that $\rho^{1/2} L_k^* = \sum_\ell u_{k\ell} L_\ell \rho^{1/2}$. This shows, in particular, that the SQDB depends only on the $L_\ell$'s and does not involve directly $H$ and $G$.
Moreover, we find explicitly the unitary $(u_{k\ell})_{k\ell}$, providing also a geometrical characterisation of the SQDB (Theorem 6) in terms of the operators $L_\ell \rho^{1/2}$ and their adjoints as Hilbert-Schmidt operators on $\mathsf{h}$.
We also consider (Definition 3) another notion of quantum detailed balance, inspired by Agarwal's original notion (see [4], Majewski [20,21], Talkner [27]), involving an antiunitary time reversal operator $\theta$ which does not play any role in the Alicki et al. definition. Time reversal is introduced to take into account the parity of quantum observables; position and energy, for instance, are even, i.e. invariant under time reversal, while momentum is odd, i.e. it changes sign under time reversal. Agarwal's original definition, however, depends on the $s = 0$ pre-scalar product and therefore implies that a QMS satisfying this quantum detailed balance condition must commute with the modular automorphism. Here we study the modified version (Definition 3) involving the symmetric $s = 1/2$ pre-scalar product, which we call the SQDB-$\theta$ condition. Theorem 8 shows that $\mathcal{L}$ satisfies the SQDB-$\theta$ condition if and only if there exists a special GKSL representation of $\mathcal{L}$ by means of operators $H, L_\ell$ such that $G\rho^{1/2} = \rho^{1/2}\theta G^* \theta$ and a unitary self-adjoint $(u_{k\ell})_{k\ell}$ such that $\rho^{1/2} L_k^* = \sum_\ell u_{k\ell}\, \theta L_\ell \theta\, \rho^{1/2}$ for all $k$. Here again $(u_{k\ell})_{k\ell}$ is explicitly determined by the operators $L_\ell \rho^{1/2}$ (Theorem 9).
We think that these results show that the SQDB condition is somewhat weaker than the SQDB-$\theta$ condition because the first does not involve directly the operators $H$, $G$. Moreover, the unitary operator in the linear relationship between the $L_\ell \rho^{1/2}$ and their adjoints is transpose symmetric and any point of the unit circle could be in its spectrum while, for generators satisfying the SQDB-$\theta$, it is self-adjoint and its spectrum is contained in $\{-1, 1\}$. Therefore, by the spectral theorem, it is possible in principle to find a standard form for the generators of QMSs satisfying the SQDB-$\theta$ generalising the
standard form of generators satisfying the usual QDB condition (that commute with the modular group), as illustrated in the case of QMSs on $M_2(\mathbb{C})$ studied in the last section. This classification must be much more complex for generators of QMSs satisfying the SQDB.
The above arguments and the fact that the SQDB-$\theta$ condition can be formulated in a simple way both on the QMS and on its generator (this is not the case for the QDB when $\mathcal{L}$ and its Hamiltonian part $i[H, \cdot\,]$ do not commute) lead us to the conclusion that the SQDB-$\theta$ is the more natural non-commutative version of the classical detailed balance condition.
The paper is organised as follows. In Sect. 2 we construct the dual QMS $\mathcal{T}'$ and recall the quantum detailed balance conditions we investigate, then we study the relationship between the generators of a QMS and its adjoint in Sect. 3. Our main results on the structure of generators are proved in Sects. 4 (QDB without time reversal) and 5 (with time reversal).

2. The Dual QMS, KMS-Symmetry and Quantum Detailed Balance

We start this section by constructing the dual semigroup of a norm-continuous QMS with respect to the $(\cdot,\cdot)_{1/2}$ pre-scalar product on $\mathcal{B}(\mathsf{h})$ defined by an invariant state $\rho$ and prove some properties that will be useful in the sequel. Although this result may be known, the presentation given here leads in a simple and direct way to the dual QMS, avoiding non-commutative $L^p$-space techniques.
Proposition 1. Let $\Phi$ be a positive unital normal map on $\mathcal{B}(\mathsf{h})$ with a faithful normal invariant state $\rho$. There exists a unique positive unital normal map $\Phi'$ on $\mathcal{B}(\mathsf{h})$ such that
\[
\mathrm{tr}\big(\rho^{1/2}\, \Phi'(x)\, \rho^{1/2} y\big) = \mathrm{tr}\big(\rho^{1/2} x \rho^{1/2}\, \Phi(y)\big)
\]
for all $x, y \in \mathcal{B}(\mathsf{h})$. If $\Phi$ is completely positive, then $\Phi'$ is also completely positive.
Proof. Let $\Phi_*$ be the predual map on the Banach space of trace class operators on $\mathsf{h}$ and let $\mathrm{Rk}(\rho^{1/2})$ denote the range of the operator $\rho^{1/2}$. This is clearly dense in $\mathsf{h}$ because $\rho$ is faithful and coincides with the domain of the unbounded self-adjoint operator $\rho^{-1/2}$.
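For all self-adjoint $x \in \mathcal{B}(\mathsf{h})$ consider the sesquilinear form on the domain $\mathrm{Rk}(\rho^{1/2}) \times \mathrm{Rk}(\rho^{1/2})$,
\[
F(v, u) = \big\langle \rho^{-1/2} v,\ \Phi_*(\rho^{1/2} x \rho^{1/2})\, \rho^{-1/2} u \big\rangle.
\]
By the invariance of $\rho$ and positivity of $\Phi_*$ we have
\[
-\|x\|\rho = -\|x\|\,\Phi_*(\rho) \le \Phi_*(\rho^{1/2} x \rho^{1/2}) \le \|x\|\,\Phi_*(\rho) = \|x\|\rho.
\]
Therefore $|F(v, u)| \le \|x\| \cdot \|v\| \cdot \|u\|$. Thus the sesquilinear form is bounded and there exists a unique bounded operator $y$ such that, for all $u, v \in \mathrm{Rk}(\rho^{1/2})$,
\[
\langle v, y u\rangle = \big\langle \rho^{-1/2} v,\ \Phi_*(\rho^{1/2} x \rho^{1/2})\, \rho^{-1/2} u \big\rangle.
\]
Note that, $\Phi$ being a $*$-map and $x$ self-adjoint,
\[
\langle v, y^* u\rangle = \overline{\langle y^* u, v\rangle}
= \overline{\big\langle \rho^{-1/2} u,\ \Phi_*(\rho^{1/2} x \rho^{1/2})\, \rho^{-1/2} v \big\rangle}
= \overline{\big\langle \Phi_*(\rho^{1/2} x \rho^{1/2})\, \rho^{-1/2} u,\ \rho^{-1/2} v \big\rangle}
= \big\langle \rho^{-1/2} v,\ \Phi_*(\rho^{1/2} x \rho^{1/2})\, \rho^{-1/2} u \big\rangle.
\]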
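This shows that $y$ is self-adjoint. Defining $\Phi'(x) := y$, we find a real-linear map on self-adjoint operators in $\mathcal{B}(\mathsf{h})$ that can be extended to a linear map on $\mathcal{B}(\mathsf{h})$ by decomposing each operator as the sum of its self-adjoint and anti self-adjoint parts. Clearly $\Phi'$ is positive because $\rho^{1/2}\Phi'(x^* x)\rho^{1/2} = \Phi_*(\rho^{1/2} x^* x \rho^{1/2})$ and $\Phi_*$ is positive. Moreover, by the above construction $\Phi'(\mathbf{1}) = \mathbf{1}$, i.e. $\Phi'$ is unital. Therefore $\Phi'$ is a norm-one contraction.
If $\Phi$ is completely positive, then $\Phi_*$ is also completely positive and the formula $\rho^{1/2}\Phi'(x)\rho^{1/2} = \Phi_*(\rho^{1/2} x \rho^{1/2})$ shows that $\Phi'$ is completely positive.
Finally we show that $\Phi'$ is normal. Let $(x_\alpha)_\alpha$ be an increasing net of positive operators in $\mathcal{B}(\mathsf{h})$ with least upper bound $x \in \mathcal{B}(\mathsf{h})$. For all $u \in \mathsf{h}$ we have then
\[
\sup_\alpha \big\langle \rho^{1/2} u, \Phi'(x_\alpha)\rho^{1/2} u \big\rangle
= \sup_\alpha \big\langle u, \Phi_*(\rho^{1/2} x_\alpha \rho^{1/2}) u \big\rangle
= \big\langle u, \Phi_*(\rho^{1/2} x \rho^{1/2}) u \big\rangle
= \big\langle \rho^{1/2} u, \Phi'(x)\rho^{1/2} u \big\rangle.
\]
Now if $u \in \mathsf{h}$, for every $\varepsilon > 0$, we can find a $u_\varepsilon \in \mathrm{Rk}(\rho^{1/2})$ such that $\|u - u_\varepsilon\| < \varepsilon$ by the density of the range of $\rho^{1/2}$. We have then
\[
\big|\big\langle u, (\Phi'(x_\alpha) - \Phi'(x)) u \big\rangle\big|
\le \varepsilon\, \|\Phi'(x_\alpha) - \Phi'(x)\|\, (\|u\| + \|u_\varepsilon\|)
+ \big|\big\langle u_\varepsilon, (\Phi'(x_\alpha) - \Phi'(x)) u_\varepsilon \big\rangle\big|
\]
for all $\alpha$. The conclusion follows from the arbitrariness of $\varepsilon$ and the uniform boundedness of $\|\Phi'(x_\alpha) - \Phi'(x)\|$ and $\|u_\varepsilon\|$.
Theorem 1. Let $\mathcal{T}$ be a QMS on $\mathcal{B}(\mathsf{h})$ with a faithful normal invariant state $\rho$. There exists a QMS $\mathcal{T}'$ on $\mathcal{B}(\mathsf{h})$ such that
\[
\rho^{1/2}\, \mathcal{T}'_t(x)\, \rho^{1/2} = \mathcal{T}_{*t}(\rho^{1/2} x \rho^{1/2}) \tag{3}
\]
for all $x \in \mathcal{B}(\mathsf{h})$ and all $t \ge 0$.
Proof. By Proposition 1, for each $t \ge 0$, there exists a unique completely positive normal and unital contraction $\mathcal{T}'_t$ on $\mathcal{B}(\mathsf{h})$ satisfying (3). The semigroup property follows from the algebraic computation
\[
\rho^{1/2}\, \mathcal{T}'_{t+s}(x)\, \rho^{1/2} = \mathcal{T}_{*t}\big(\mathcal{T}_{*s}(\rho^{1/2} x \rho^{1/2})\big)
= \mathcal{T}_{*t}\big(\rho^{1/2}\, \mathcal{T}'_s(x)\, \rho^{1/2}\big)
= \rho^{1/2}\, \mathcal{T}'_t\big(\mathcal{T}'_s(x)\big)\, \rho^{1/2}.
\]
Since the map $t \mapsto \langle \rho^{1/2} v, \mathcal{T}'_t(x) \rho^{1/2} u\rangle$ is continuous by the identity (3) for all $u, v \in \mathsf{h}$, and $\|\mathcal{T}'_t(x)\| \le \|x\|$ for all $t \ge 0$, a $2\varepsilon$ approximation argument shows that $t \mapsto \mathcal{T}'_t(x)$ is continuous for the weak$^*$-operator topology on $\mathcal{B}(\mathsf{h})$. It follows that $\mathcal{T}' = (\mathcal{T}'_t)_{t\ge 0}$ is a QMS on $\mathcal{B}(\mathsf{h})$.
Definition 1. The quantum Markov semigroup $\mathcal{T}'$ is called the dual semigroup of $\mathcal{T}$ with respect to the invariant state $\rho$.
It is easy to see, using (3), that $\rho$ is an invariant state also for $\mathcal{T}'$.
Remark 1. When $\mathcal{T}$ is norm-continuous it is not clear whether $\mathcal{T}'$ is also norm-continuous. Here, however, we are interested in generators of symmetric or detailed balance QMS. We shall see that these additional properties of $\mathcal{T}$ imply that $\mathcal{T}'$ is also norm-continuous. Therefore we proceed by studying norm-continuous QMSs whose dual is also norm-continuous.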
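The quantum detailed balance condition of Alicki, Frigerio, Gorini, Kossakowski and Verri modified by considering the pre-scalar product $(\cdot,\cdot)_{1/2}$ on $\mathcal{B}(\mathsf{h})$, usually called standard (see e.g. [9]) because of multiplications by $\rho^{1/2}$ as in the standard representation of $\mathcal{B}(\mathsf{h})$, is defined as follows.
Definition 2. The QMS $\mathcal{T}$ generated by $\mathcal{L}$ satisfies the standard quantum detailed balance condition (SQDB) if there exists an operator $\mathcal{L}'$ on $\mathcal{B}(\mathsf{h})$ and a self-adjoint operator $K$ on $\mathsf{h}$ such that
\[
\mathrm{tr}\big(\rho^{1/2} x \rho^{1/2} \mathcal{L}(y)\big) = \mathrm{tr}\big(\rho^{1/2} \mathcal{L}'(x) \rho^{1/2} y\big), \qquad \mathcal{L}(x) - \mathcal{L}'(x) = 2i[K, x] \tag{4}
\]
for all $x, y \in \mathcal{B}(\mathsf{h})$.
The operator $\mathcal{L}'$ in the above definition must be norm-bounded because it is everywhere defined and norm closed. To see this consider a sequence $(x_n)_{n\ge 1}$ in $\mathcal{B}(\mathsf{h})$ converging in norm to an $x \in \mathcal{B}(\mathsf{h})$ such that $(\mathcal{L}'(x_n))_{n\ge 1}$ converges in norm to $b \in \mathcal{B}(\mathsf{h})$ and note that
\[
\mathrm{tr}\big(\rho^{1/2} \mathcal{L}'(x) \rho^{1/2} y\big) = \lim_{n\to\infty} \mathrm{tr}\big(\rho^{1/2} x_n \rho^{1/2} \mathcal{L}(y)\big)
= \lim_{n\to\infty} \mathrm{tr}\big(\rho^{1/2} \mathcal{L}'(x_n) \rho^{1/2} y\big) = \mathrm{tr}\big(\rho^{1/2} b\, \rho^{1/2} y\big)
\]
for all $y \in \mathcal{B}(\mathsf{h})$. The elements $\rho^{1/2} y \rho^{1/2}$, with $y \in \mathcal{B}(\mathsf{h})$, are dense in the Banach space of trace class operators on $\mathsf{h}$ because $\rho$ is faithful. Therefore $\mathcal{L}'(x) = b$ and $\mathcal{L}'$ is closed. Since both $\mathcal{L}$ and $\mathcal{L}'$ are bounded, $K$ is also bounded.
We now introduce another definition of quantum detailed balance, due to Agarwal [4] with the $s = 0$ pre-scalar product, that involves a time reversal $\theta$. This is an antiunitary operator on $\mathsf{h}$, i.e. $\langle\theta u, \theta v\rangle = \langle v, u\rangle$ for all $u, v \in \mathsf{h}$, such that $\theta^2 = \mathbf{1}$ and $\theta^{-1} = \theta^* = \theta$.
Recall that $\theta$ is antilinear, i.e. $\theta(z u) = \bar{z}\,\theta u$ for all $u \in \mathsf{h}$, $z \in \mathbb{C}$, and its adjoint $\theta^*$ satisfies $\langle u, \theta v\rangle = \langle v, \theta^* u\rangle$ for all $u, v \in \mathsf{h}$. Moreover $\theta x \theta$ belongs to $\mathcal{B}(\mathsf{h})$ (linearity is re-established) and $\mathrm{tr}(\theta x \theta) = \mathrm{tr}(x^*)$ for every trace-class operator $x$ ([10] Prop. 4). Indeed, taking an orthonormal basis $(e_j)_j$ of $\mathsf{h}$, we have
\[
\mathrm{tr}(\theta x \theta) = \sum_j \langle e_j, \theta x \theta e_j\rangle = \sum_j \langle x \theta e_j, \theta^* e_j\rangle = \sum_j \langle \theta e_j, x^* \theta^* e_j\rangle = \mathrm{tr}(x^*).
\]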
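As a small illustration of the identity $\mathrm{tr}(\theta x \theta) = \mathrm{tr}(x^*)$, consider the following hedged example; the choice of $\theta$ as complex conjugation in a fixed basis and the matrix $x$ are our own assumptions, not taken from the text. Let $\theta\big(\sum_j z_j e_j\big) = \sum_j \bar z_j e_j$ and write $x_{jk} = \langle e_j, x e_k\rangle$. Then

```latex
\[
\theta x \theta\, e_k = \theta\Big(\sum_j x_{jk}\, e_j\Big) = \sum_j \overline{x_{jk}}\, e_j ,
\qquad\text{so}\qquad
\mathrm{tr}(\theta x \theta) = \sum_j \overline{x_{jj}} = \overline{\mathrm{tr}(x)} = \mathrm{tr}(x^*).
\]
% Concrete instance on C^2 (illustrative):
\[
x = \begin{pmatrix} i & 0 \\ 0 & 0 \end{pmatrix}, \qquad
\theta x \theta = \begin{pmatrix} -i & 0 \\ 0 & 0 \end{pmatrix}, \qquad
\mathrm{tr}(\theta x \theta) = -i = \mathrm{tr}(x^*) \neq \mathrm{tr}(x) = i .
\]
```

In this instance $\theta x \theta$ is simply the entrywise complex conjugate of $x$, which makes it visible why the trace is conjugated rather than preserved when $x$ is not self-adjoint.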
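It is worth noticing that the cyclic property of the trace does not hold for $\theta$, since $\mathrm{tr}(\theta x \theta) = \mathrm{tr}(x^*)$ may not be equal to $\mathrm{tr}(x)$ for non-self-adjoint $x$.
Definition 3. The QMS $\mathcal{T}$ generated by $\mathcal{L}$ satisfies the standard quantum detailed balance condition with respect to the time reversal $\theta$ (SQDB-$\theta$) if
\[
\mathrm{tr}\big(\rho^{1/2} x \rho^{1/2} \mathcal{L}(y)\big) = \mathrm{tr}\big(\rho^{1/2}\, \theta y^* \theta\, \rho^{1/2} \mathcal{L}(\theta x^* \theta)\big) \tag{5}
\]
for all $x, y \in \mathcal{B}(\mathsf{h})$.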
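The operator $\theta$ is used to take into account the parity of the observables under time reversal. Indeed, a self-adjoint operator $x \in \mathcal{B}(\mathsf{h})$ is called even (resp. odd) if $\theta x \theta = x$ (resp. $\theta x \theta = -x$). The typical example of antilinear time reversal is a conjugation (with respect to some orthonormal basis of $\mathsf{h}$).
This condition is usually stated ([20,21,27]) for the QMS $\mathcal{T}$ as
\[
\mathrm{tr}\big(\rho^{1/2} x \rho^{1/2} \mathcal{T}_t(y)\big) = \mathrm{tr}\big(\rho^{1/2}\, \theta y^* \theta\, \rho^{1/2} \mathcal{T}_t(\theta x^* \theta)\big) \tag{6}
\]
for all $t \ge 0$, $x, y \in \mathcal{B}(\mathsf{h})$. In particular, for $t = 0$ we find that this identity holds if and only if $\rho$ and $\theta$ commute, i.e. $\rho$ is an even observable. This is the case, for instance, when $\rho$ is a function of the energy.
Lemma 1. The following conditions are equivalent:
(i) $\theta$ and $\rho$ commute,
(ii) $\mathrm{tr}(\rho^{1/2} x \rho^{1/2} y) = \mathrm{tr}(\rho^{1/2}\, \theta y^* \theta\, \rho^{1/2}\, \theta x^* \theta)$ for all $x, y \in \mathcal{B}(\mathsf{h})$.
Proof. If $\rho$ and $\theta$ commute, from $\mathrm{tr}(\theta a \theta) = \mathrm{tr}(a^*)$, we have
\[
\mathrm{tr}\big(\rho^{1/2}\, \theta y^* \theta\, \rho^{1/2}\, \theta x^* \theta\big) = \mathrm{tr}\big(\theta(\rho^{1/2} y^* \rho^{1/2} x^*)\theta\big) = \mathrm{tr}\big(x \rho^{1/2} y \rho^{1/2}\big)
\]
and (ii) follows by cycling $\rho^{1/2}$. Conversely, if (ii) holds, taking $x = \mathbf{1}$, we have
\[
\mathrm{tr}(\rho y) = \mathrm{tr}(\rho\, \theta y^* \theta) = \mathrm{tr}\big(\theta(\theta y^* \theta)^* \rho\, \theta\big) = \mathrm{tr}(y\, \theta\rho\theta) = \mathrm{tr}(\theta\rho\theta\, y)
\]
for all $y \in \mathcal{B}(\mathsf{h})$, and $\rho = \theta\rho\theta$.
Proposition 2. If $\rho$ and $\theta$ commute then (5) and (6) are equivalent.
Proof. Clearly (5) follows from (6) by differentiating at $t = 0$. Conversely, putting $\alpha(x) = \theta x \theta$ and denoting by $\mathcal{L}_*$ the predual of $\mathcal{L}$, we can write (5) as
\[
\mathrm{tr}\big(\mathcal{L}_*(\rho^{1/2} x \rho^{1/2}) y\big) = \mathrm{tr}\big(\rho^{1/2} \alpha(y^*) \rho^{1/2} \mathcal{L}(\alpha(x^*))\big) = \mathrm{tr}\big(\rho^{1/2} \alpha(\mathcal{L}(\alpha(x))) \rho^{1/2} y\big)
\]
for all $y \in \mathcal{B}(\mathsf{h})$, because $\mathrm{tr}(\alpha(a)) = \mathrm{tr}(a^*)$. Therefore we have $\mathcal{L}_*(\rho^{1/2} x \rho^{1/2}) = \rho^{1/2} \alpha(\mathcal{L}(\alpha(x))) \rho^{1/2}$ and, iterating, $\mathcal{L}_*^n(\rho^{1/2} x \rho^{1/2}) = \rho^{1/2} \alpha(\mathcal{L}^n(\alpha(x))) \rho^{1/2}$ for all $n \ge 1$. It follows that (5) holds for all powers $\mathcal{L}^n$ with $n \ge 1$. Since $\rho$ and $\theta$ commute, it is true also for $n = 0$ and we find (6) by the exponentiation formula $\mathcal{T}_t = \sum_{n\ge 0} t^n \mathcal{L}^n / n!$.
We do not know whether the SQDB condition (4) of Definition 2 has a simple explicit formulation in terms of the maps $\mathcal{T}_t$ if $\mathcal{L}$ and $\mathcal{L}'$ do not commute.
Remark 2. The SQDB-$\theta$ condition (5), by $\mathrm{tr}(\theta a \theta) = \mathrm{tr}(a^*)$, reads $\mathrm{tr}(\rho^{1/2} x \rho^{1/2} \mathcal{L}(y)) = \mathrm{tr}(\rho^{1/2}(\theta \mathcal{L}(\theta x \theta)\theta)\rho^{1/2} y)$ for all $x, y \in \mathcal{B}(\mathsf{h})$, i.e. $\mathcal{L}'(x) = \theta \mathcal{L}(\theta x \theta)\theta$. Write $\mathcal{L}$ in a special GKSL form as in (1) and decompose the generator $\mathcal{L} = \mathcal{L}_0 + i[H, \cdot\,]$ into the sum of its dissipative part $\mathcal{L}_0$ and derivation part $i[H, \cdot\,]$. If $H$ commutes with $\theta$, by the antilinearity of $\theta$, we find $\mathcal{L}'(x) = \theta \mathcal{L}_0(\theta x \theta)\theta - i[H, x]$. Therefore, if the dissipative part is time reversal invariant, i.e. $\mathcal{L}_0(x) = \theta \mathcal{L}_0(\theta x \theta)\theta$, we end up with $\mathcal{L}' = \mathcal{L} - 2i[H, \cdot\,]$. The relationship with Definition 2 of SQDB, in this case, is then clear. The SQDB conditions of Definitions 2 and 3, however, are in general not comparable.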
3. The Generator of a QMS and its Dual

We shall always consider special GKSL representations of the generator of a norm-continuous QMS by means of operators $L_\ell, H$. These are described by the following theorem (we refer to [24] Theorem 30.16 for the proof).
Theorem 2. Let $\mathcal{L}$ be the generator of a norm-continuous QMS on $\mathcal{B}(\mathsf{h})$ and let $\rho$ be a normal state on $\mathcal{B}(\mathsf{h})$. There exists a bounded self-adjoint operator $H$ and a finite or infinite sequence $(L_\ell)_{\ell\ge 1}$ of elements of $\mathcal{B}(\mathsf{h})$ such that:
(i) $\mathrm{tr}(\rho L_\ell) = 0$ for each $\ell \ge 1$,
(ii) $\sum_{\ell\ge 1} L_\ell^* L_\ell$ is a strongly convergent sum,
(iii) if $\sum_{\ell\ge 0} |c_\ell|^2 < \infty$ and $c_0 + \sum_{\ell\ge 1} c_\ell L_\ell = 0$ for complex scalars $(c_k)_{k\ge 0}$ then $c_k = 0$ for every $k \ge 0$,
(iv) the GKSL representation (1) holds.
If $H'$, $(L'_\ell)_{\ell\ge 1}$ is another family of bounded operators in $\mathcal{B}(\mathsf{h})$ with $H'$ self-adjoint and the sequence $(L'_\ell)_{\ell\ge 1}$ is finite or infinite, then the conditions (i)–(iv) are fulfilled with $H, (L_\ell)_{\ell\ge 1}$ replaced by $H', (L'_\ell)_{\ell\ge 1}$ respectively if and only if the lengths of the sequences $(L_\ell)_{\ell\ge 1}$, $(L'_\ell)_{\ell\ge 1}$ are equal and for some scalar $c \in \mathbb{R}$ and a unitary matrix $(u_{\ell j})_{\ell, j}$ we have
\[
H' = H + c, \qquad L'_\ell = \sum_j u_{\ell j} L_j.
\]
As an immediate consequence of the uniqueness (up to a scalar) of the Hamiltonian $H$, the decomposition of $\mathcal{L}$ as the sum of the derivation $i[H, \cdot\,]$ and a dissipative part $\mathcal{L}_0 = \mathcal{L} - i[H, \cdot\,]$ determined by special GKSL representations of $\mathcal{L}$ is unique. Moreover, since $(u_{\ell j})$ is unitary, we have
\[
\sum_{\ell\ge 1} L'^{*}_\ell L'_\ell = \sum_{\ell,k,j\ge 1} \bar{u}_{\ell k}\, u_{\ell j}\, L_k^* L_j = \sum_{k,j\ge 1}\Big(\sum_{\ell\ge 1} \bar{u}_{\ell k}\, u_{\ell j}\Big) L_k^* L_j = \sum_{k\ge 1} L_k^* L_k.
\]
Therefore, putting $G = -2^{-1}\sum_{\ell\ge 1} L_\ell^* L_\ell - iH$, we can write $\mathcal{L}$ in the form (2), where $G$ is uniquely determined by $\mathcal{L}$ up to a purely imaginary multiple of the identity operator.
Theorem 2 can be restated in the index free form ([24] Thm. 30.12).
Theorem 3. Let $\mathcal{L}$ be the generator of a norm-continuous QMS on $\mathcal{B}(\mathsf{h})$; then there exist a Hilbert space $\mathsf{k}$, a bounded linear operator $L : \mathsf{h} \to \mathsf{h} \otimes \mathsf{k}$ and a bounded self-adjoint operator $H$ on $\mathsf{h}$ satisfying the following:
1. $\mathcal{L}(x) = i[H, x] - \frac{1}{2}\big(L^* L x - 2 L^* (x \otimes \mathbf{1}_{\mathsf{k}}) L + x L^* L\big)$ for all $x \in \mathcal{B}(\mathsf{h})$;
2. the set $\{(x \otimes \mathbf{1}_{\mathsf{k}}) L u : x \in \mathcal{B}(\mathsf{h}),\ u \in \mathsf{h}\}$ is total in $\mathsf{h} \otimes \mathsf{k}$.
Proof. Let $\mathsf{k}$ be a Hilbert space with Hilbertian dimension equal to the length of the sequence $(L_k)_k$ and let $(f_k)$ be an orthonormal basis of $\mathsf{k}$. Defining $L u = \sum_k L_k u \otimes f_k$, where the $L_k$ are as in Theorem 2, a simple calculation shows that 1 is fulfilled. Suppose that there exists a non-zero vector $\xi$ orthogonal to the set of $(x \otimes \mathbf{1}_{\mathsf{k}}) L u$ with $x \in \mathcal{B}(\mathsf{h})$, $u \in \mathsf{h}$; then $\xi = \sum_k v_k \otimes f_k$ with $v_k \in \mathsf{h}$ and
\[
0 = \langle \xi, (x \otimes \mathbf{1}_{\mathsf{k}}) L u\rangle = \sum_k \langle v_k, x L_k u\rangle = \sum_k \langle L_k^* x^* v_k, u\rangle
\]
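for all $x \in \mathcal{B}(\mathsf{h})$, $u \in \mathsf{h}$. Hence $\sum_k L_k^* x^* v_k = 0$. Since $\xi \ne 0$, we can suppose $\|v_1\| = 1$; then, putting $p = |v_1\rangle\langle v_1|$ and $x = p y^*$, $y \in \mathcal{B}(\mathsf{h})$, we get
\[
0 = L_1^* y v_1 + \sum_{k\ge 2} \langle v_1, v_k\rangle L_k^* y v_1 = \Big(L_1^* + \sum_{k\ge 2} \langle v_1, v_k\rangle L_k^*\Big) y v_1. \tag{7}
\]
Since $y \in \mathcal{B}(\mathsf{h})$ is arbitrary, Eq. (7) contradicts the linear independence (see Theorem 2 (iii)) of the $L_k$'s. Therefore the set in item 2 must be total.
The Hilbert space $\mathsf{k}$ is called the multiplicity space of the completely positive part of $\mathcal{L}$. A unitary matrix $(u_{\ell j})_{\ell, j\ge 1}$, in the above basis $(f_k)_{k\ge 1}$, clearly defines a unitary operator on $\mathsf{k}$. From now on we shall identify such matrices with operators on $\mathsf{k}$.
We end this section by establishing the relationship between the operators $G, L_\ell$ and $G', L'_\ell$ in two special GKSL representations of $\mathcal{L}$ and $\mathcal{L}'$ when these generators are both bounded. The dual QMS $\mathcal{T}'$ clearly satisfies
\[
\rho^{1/2}\, \mathcal{T}'_t(x)\, \rho^{1/2} = \mathcal{T}_{*t}(\rho^{1/2} x \rho^{1/2}),
\]
where $\mathcal{T}_*$ denotes the predual semigroup of $\mathcal{T}$. Since $\mathcal{L}'$ is bounded, differentiating at $t = 0$, we find the relationship between the generator $\mathcal{L}'$ of $\mathcal{T}'$ and $\mathcal{L}_*$ of the predual semigroup $\mathcal{T}_*$ of $\mathcal{T}$,
\[
\rho^{1/2}\, \mathcal{L}'(x)\, \rho^{1/2} = \mathcal{L}_*(\rho^{1/2} x \rho^{1/2}). \tag{8}
\]
Proposition 3. Let $\mathcal{L}(a) = G^* a + aG + \sum_\ell L_\ell^* a L_\ell$ be a special GKSL representation of $\mathcal{L}$ with respect to a $\mathcal{T}$-invariant state $\rho = \sum_k \rho_k |e_k\rangle\langle e_k|$. Then
\[
G^* u = \sum_{k\ge 1} \rho_k\, \mathcal{L}(|u\rangle\langle e_k|) e_k - \mathrm{tr}(\rho G)\, u, \tag{9}
\]
\[
G v = \sum_{k\ge 1} \rho_k\, \mathcal{L}_*(|v\rangle\langle e_k|) e_k - \mathrm{tr}(\rho G^*)\, v \tag{10}
\]
for every $u, v \in \mathsf{h}$.
Proof. Since $\mathcal{L}(|u\rangle\langle v|) = |G^* u\rangle\langle v| + |u\rangle\langle G^* v| + \sum_\ell |L_\ell^* u\rangle\langle L_\ell^* v|$, putting $v = e_k$ we have $G^* u = |G^* u\rangle\langle e_k|\, e_k$ and
\[
G^* u = \mathcal{L}(|u\rangle\langle e_k|) e_k - \sum_\ell \langle e_k, L_\ell e_k\rangle L_\ell^* u - \langle e_k, G e_k\rangle u.
\]
Multiplying both sides by $\rho_k$ and summing on $k$, we find then
\[
G^* u = \sum_{k\ge 1} \rho_k\, \mathcal{L}(|u\rangle\langle e_k|) e_k - \sum_{\ell, k} \rho_k \langle e_k, L_\ell e_k\rangle L_\ell^* u - \sum_{k\ge 1} \rho_k \langle e_k, G e_k\rangle u
= \sum_{k\ge 1} \rho_k\, \mathcal{L}(|u\rangle\langle e_k|) e_k - \sum_{\ell} \mathrm{tr}(\rho L_\ell) L_\ell^* u - \mathrm{tr}(\rho G)\, u
\]
and (9) follows since $\mathrm{tr}(\rho L_j) = 0$. The identity (10) is now immediate computing the adjoint of $G$.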
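Proposition 4. Let $\mathcal{T}'$ be the dual of a QMS $\mathcal{T}$ generated by $\mathcal{L}$ with normal invariant state $\rho$. If $G$ and $G'$ are the operators (10) in two GKSL representations of $\mathcal{L}$ and $\mathcal{L}'$ then
\[
G' \rho^{1/2} = \rho^{1/2} G^* + \big(\mathrm{tr}(\rho G) - \mathrm{tr}(\rho G'^{*})\big) \rho^{1/2}. \tag{11}
\]
Moreover, we have $\mathrm{tr}(\rho G) - \mathrm{tr}(\rho G'^{*}) = ic$ for some $c \in \mathbb{R}$.
Proof. The identities (10) and (8) yield
\[
\begin{aligned}
G' \rho^{1/2} v &= \sum_{k\ge 1} \mathcal{L}'_*\big(\rho^{1/2} |v\rangle\langle \rho_k^{1/2} e_k|\big)\, \rho_k^{1/2} e_k - \mathrm{tr}(\rho G'^{*})\, \rho^{1/2} v \\
&= \sum_{k\ge 1} \mathcal{L}'_*\big(\rho^{1/2} (|v\rangle\langle e_k|) \rho^{1/2}\big)\, \rho^{1/2} e_k - \mathrm{tr}(\rho G'^{*})\, \rho^{1/2} v \\
&= \sum_{k\ge 1} \rho^{1/2} \mathcal{L}(|v\rangle\langle e_k|) \rho^{1/2} \rho^{1/2} e_k - \mathrm{tr}(\rho G'^{*})\, \rho^{1/2} v \\
&= \rho^{1/2} G^* v + \big(\mathrm{tr}(\rho G) - \mathrm{tr}(\rho G'^{*})\big) \rho^{1/2} v.
\end{aligned}
\]
Therefore, we obtain (11). Right multiplying this equation by $\rho^{1/2}$ we have $G' \rho = \rho^{1/2} G^* \rho^{1/2} + \big(\mathrm{tr}(\rho G) - \mathrm{tr}(\rho G'^{*})\big)\rho$, and, taking the trace,
\[
\mathrm{tr}(\rho G) - \mathrm{tr}(\rho G'^{*}) = \mathrm{tr}(G' \rho) - \mathrm{tr}(\rho^{1/2} G^* \rho^{1/2}) = \mathrm{tr}(G' \rho) - \mathrm{tr}(G^* \rho) = -\overline{\big(\mathrm{tr}(\rho G) - \mathrm{tr}(\rho G'^{*})\big)};
\]
this proves the last claim.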
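We can now prove, as in [11] Th. 7.2, p. 358, the following
Theorem 4. For all special GKSL representations of $\mathcal{L}$ by means of operators $G, L_\ell$ as in (2) there exists a special GKSL representation of $\mathcal{L}'$ by means of operators $G', L'_\ell$ such that:
1. $G' \rho^{1/2} = \rho^{1/2} G^* + ic\rho^{1/2}$ for some $c \in \mathbb{R}$,
2. $L'_\ell \rho^{1/2} = \rho^{1/2} L_\ell^*$ for all $\ell \ge 1$.
Proof. Since $\mathcal{L}'$ is bounded, it admits a special GKSL representation $\mathcal{L}'(a) = G'^{*} a + \sum_k L'^{*}_k a L'_k + a G'$. Moreover, by Proposition 4, we have $G' \rho^{1/2} = \rho^{1/2} G^* + ic\rho^{1/2}$, $c \in \mathbb{R}$, and so (8) implies
\[
\sum_k \rho^{1/2} L'^{*}_k\, x\, L'_k \rho^{1/2} = \sum_k L_k \rho^{1/2} x \rho^{1/2} L_k^*. \tag{12}
\]
Let $\mathsf{k}$ (resp. $\mathsf{k}'$) be the multiplicity space of the completely positive part of $\mathcal{L}$ (resp. $\mathcal{L}'$), $(f_k)_k$ (resp. $(f'_k)_k$) an orthonormal basis of $\mathsf{k}$ (resp. $\mathsf{k}'$) and define a linear operator $X : \mathsf{h} \otimes \mathsf{k}' \to \mathsf{h} \otimes \mathsf{k}$,
\[
X (x \otimes \mathbf{1}_{\mathsf{k}'}) L' \rho^{1/2} u = (x \otimes \mathbf{1}_{\mathsf{k}}) \sum_k \rho^{1/2} L_k^* u \otimes f_k
\]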
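for all $x \in \mathcal{B}(\mathsf{h})$ and $u \in \mathsf{h}$, where $L : \mathsf{h} \to \mathsf{h} \otimes \mathsf{k}$, $L u = \sum_k L_k u \otimes f_k$, and $L' : \mathsf{h} \to \mathsf{h} \otimes \mathsf{k}'$, $L' u = \sum_k L'_k u \otimes f'_k$. Note that the right-hand side series is convergent for all $u \in \mathsf{h}$ because of (12), since
\[
\Big\|\sum_{k=m}^{n} \rho^{1/2} L_k^* u \otimes f_k\Big\|^2 = \sum_{k=m}^{n} \big\|\rho^{1/2} L_k^* u\big\|^2 = \sum_{k=m}^{n} \langle u, L_k \rho L_k^* u\rangle,
\]
and the right-hand side goes to 0 for $n, m$ tending to infinity because $\rho$ is an invariant state and the series $\sum_k L_k \rho L_k^* = -(G\rho + \rho G^*)$ is trace-norm convergent.
The identity (12) yields
\[
\big\langle X (x \otimes \mathbf{1}_{\mathsf{k}'}) L' \rho^{1/2} u,\ X (y \otimes \mathbf{1}_{\mathsf{k}'}) L' \rho^{1/2} v \big\rangle
= \sum_k \big\langle u,\ \rho^{1/2} L'^{*}_k\, x^* y\, L'_k \rho^{1/2} v \big\rangle
= \big\langle (x \otimes \mathbf{1}_{\mathsf{k}'}) L' \rho^{1/2} u,\ (y \otimes \mathbf{1}_{\mathsf{k}'}) L' \rho^{1/2} v \big\rangle
\]
for all $x, y \in \mathcal{B}(\mathsf{h})$ and $u, v \in \mathsf{h}$, i.e. $X$ preserves the scalar product. Therefore, since the set $\{(x \otimes \mathbf{1}_{\mathsf{k}'}) L' \rho^{1/2} u \mid x \in \mathcal{B}(\mathsf{h}),\ u \in \mathsf{h}\}$ is total in $\mathsf{h} \otimes \mathsf{k}'$ (for $\rho^{1/2}(\mathsf{h})$ is dense in $\mathsf{h}$ and Theorem 3 holds), $X$ is well defined and extends to an isometry from $\mathsf{h} \otimes \mathsf{k}'$ to $\mathsf{h} \otimes \mathsf{k}$.
The operator $X$ is unitary because its range is dense in $\mathsf{h} \otimes \mathsf{k}$. Indeed, suppose that there exists a vector $\xi = \sum_k v_k \otimes f_k$, with $v_k \in \mathsf{h}$ and $\sum_k \|v_k\|^2 < \infty$, orthogonal to all $(x \otimes \mathbf{1}_{\mathsf{k}}) \sum_k \rho^{1/2} L_k^* u \otimes f_k$; then
\[
0 = \Big\langle \xi,\ (x \otimes \mathbf{1}_{\mathsf{k}}) \sum_k \rho^{1/2} L_k^* u \otimes f_k \Big\rangle = \sum_k \langle v_k, x \rho^{1/2} L_k^* u\rangle = \sum_k \langle L_k \rho^{1/2} x^* v_k, u\rangle
\]
for all $x \in \mathcal{B}(\mathsf{h})$, $u \in \mathsf{h}$. Taking $x = |w_1\rangle\langle w_2|$, by the arbitrariness of $u$, we have then $\sum_k \langle w_1, v_k\rangle L_k \rho^{1/2} w_2 = 0$. Since $w_2$ is arbitrary, the range of $\rho^{1/2}$ is dense in $\mathsf{h}$ and the sequence $(\langle w_1, v_k\rangle)_{k\ge 1}$ is square-summable, we find $\sum_k \langle w_1, v_k\rangle L_k = 0$. The linear independence of the $L_k$, in the sense of Theorem 2 (iii), implies then $\langle w_1, v_k\rangle = 0$ for all $k$ and all $w_1 \in \mathsf{h}$, i.e. $\xi = 0$.
As a consequence we have $X^* X = \mathbf{1}_{\mathsf{h}\otimes\mathsf{k}'}$ and $X X^* = \mathbf{1}_{\mathsf{h}\otimes\mathsf{k}}$. Moreover, since $X (y \otimes \mathbf{1}_{\mathsf{k}'}) = (y \otimes \mathbf{1}_{\mathsf{k}}) X$ for all $y \in \mathcal{B}(\mathsf{h})$, we can conclude that $X = \mathbf{1}_{\mathsf{h}} \otimes Y$ for some unitary map $Y : \mathsf{k}' \to \mathsf{k}$.
The definition of $X$ implies then
\[
(\rho^{1/2} \otimes \mathbf{1}_{\mathsf{k}}) L^* = X L' \rho^{1/2} = (\mathbf{1}_{\mathsf{h}} \otimes Y) L' \rho^{1/2}.
\]
This means that, replacing $L'$ by $(\mathbf{1}_{\mathsf{h}} \otimes Y) L'$, or more precisely $L'_k$ by $\sum_\ell u_{k\ell} L'_\ell$ for all $k$, we have $\rho^{1/2} L_k^* = L'_k \rho^{1/2}$. Since $\mathrm{tr}(\rho L'_k) = \mathrm{tr}(\rho L_k^*) = 0$ and, from $\mathcal{L}'(\mathbf{1}) = 0$, $G'^{*} + G' = -\sum_k L'^{*}_k L'_k$, the properties of a special GKSL representation follow.
Remark 3. Condition 2 implies that the completely positive parts $\Phi(x) = \sum_\ell L_\ell^* x L_\ell$ and $\Phi'$ of the generators $\mathcal{L}$ and $\mathcal{L}'$, respectively, are mutually adjoint, i.e.
\[
\mathrm{tr}\big(\rho^{1/2}\, \Phi'(x)\, \rho^{1/2} y\big) = \mathrm{tr}\big(\rho^{1/2} x \rho^{1/2}\, \Phi(y)\big) \tag{13}
\]
for all $x, y \in \mathcal{B}(\mathsf{h})$. As a consequence, also the maps $x \mapsto G^* x + xG$ and $x \mapsto (G')^* x + xG'$ are mutually adjoint.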
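The relations of Theorem 4 can be tested on a small example. The sketch below works on $M_2(\mathbb{C})$ with the Python standard library only; the state `rho`, the Hamiltonian `H` and the operators in `Ls` are illustrative assumptions (chosen so that `rho` is invariant), not data from this paper. It builds $G' = \rho^{1/2} G^* \rho^{-1/2}$ and $L'_\ell = \rho^{1/2} L_\ell^* \rho^{-1/2}$, i.e. conditions 1 and 2 with $c = 0$, and checks the duality relation of Definition 2 for the resulting generator.

```python
# Sketch only: rho, H and Ls are illustrative choices, not from the text.

def mat(rows):
    return [[complex(v) for v in row] for row in rows]

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(*ms):
    return [[sum(m[i][j] for m in ms) for j in range(len(ms[0][0]))]
            for i in range(len(ms[0]))]

def scale(c, a):
    return [[c * v for v in row] for row in a]

def dag(a):
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(len(a)) for j in range(len(a[0])))

I2 = mat([[1, 0], [0, 1]])
ZERO = mat([[0, 0], [0, 0]])
rho_h = mat([[0.6, 0], [0, 0.8]])           # rho^{1/2}, rho = diag(0.36, 0.64)
rho_hi = mat([[1 / 0.6, 0], [0, 1 / 0.8]])  # rho^{-1/2}
rho = mul(rho_h, rho_h)
H = mat([[0.5, 0], [0, -0.5]])
Ls = [mat([[0, 0], [0.8, 0]]), mat([[0, 0.6], [0, 0]])]

# G = -1/2 sum_l L_l^* L_l - iH, as in the text
G = scale(-1j, H)
for L in Ls:
    G = add(G, scale(-0.5, mul(dag(L), L)))

def gen(x, G, Ls):
    """Generator in form (2): G* x + sum_l L* x L + x G."""
    out = add(mul(dag(G), x), mul(x, G))
    for L in Ls:
        out = add(out, mul(mul(dag(L), x), L))
    return out

def predual(s, G, Ls):
    """Predual generator: G s + sum_l L s L* + s G*."""
    out = add(mul(G, s), mul(s, dag(G)))
    for L in Ls:
        out = add(out, mul(mul(L, s), dag(L)))
    return out

# Dual representation from Theorem 4 (taking c = 0)
Gd = mul(mul(rho_h, dag(G)), rho_hi)
Lds = [mul(mul(rho_h, dag(L)), rho_hi) for L in Ls]

x = mat([[1, 2 + 1j], [3 - 2j, -1]])
y = mat([[0.5j, 1], [-2, 4 - 1j]])
lhs = trace(mul(mul(mul(rho_h, x), rho_h), gen(y, G, Ls)))
rhs = trace(mul(mul(mul(rho_h, gen(x, Gd, Lds)), rho_h), y))

rho_invariant = close(predual(rho, G, Ls), ZERO)
dual_unital = close(gen(I2, Gd, Lds), ZERO)
duality_holds = abs(lhs - rhs) < 1e-12
```

In this particular example one also finds $L'_1 = L_2$ and $L'_2 = L_1$, so the dual generator differs from $\mathcal{L}$ only through the Hamiltonian part; this is specific to the chosen parameters.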
4. Generators of Standard Detailed Balance QMSs

In this section we characterise the generators of norm-continuous QMSs satisfying the SQDB of Definition 2.
We start by noting that, since $\rho$ is invariant for $\mathcal{T}$ and $\mathcal{T}'$, i.e. $\mathcal{L}_*(\rho) = \mathcal{L}'_*(\rho) = 0$, the operator $K$ commutes with $\rho$. Moreover, by comparing two special GKSL representations of $\mathcal{L}$ and $\mathcal{L}' + 2i[K, \cdot\,]$, we have immediately the following
Lemma 2. A QMS $\mathcal{T}$ satisfies the SQDB $\mathcal{L} - \mathcal{L}' = 2i[K, \cdot\,]$ if and only if for all special GKSL representations of the generators $\mathcal{L}$ and $\mathcal{L}'$ by means of operators $G, L_k$ and $G', L'_k$ respectively, we have
\[
G' = G + 2iK + ic, \qquad L'_k = \sum_j u_{kj} L_j
\]
for some $c \in \mathbb{R}$ and some unitary $(u_{kj})_{kj}$ on $\mathsf{k}$.
Since we know the relationship between the operators $G', L'_k$ and $G, L_k$ thanks to Theorem 4, we can now characterise generators of QMSs satisfying the SQDB. We emphasize the following definition of $T$-symmetric matrix (operator) on $\mathsf{k}$ in order to avoid confusion with the usual notion of symmetric operator $X$ meaning that $X^*$ is an extension of $X$.
Definition 4. Let $Y = (y_{k\ell})_{k,\ell\ge 1}$ be a matrix with entries indexed by $k, \ell$ running on the set (finite or infinite) of indices of the sequence $(L_\ell)_{\ell\ge 1}$. We denote by $Y^T$ the transpose matrix $Y^T = (y_{\ell k})_{k,\ell\ge 1}$. The matrix $Y$ is called $T$-symmetric if $Y = Y^T$.
Theorem 5. $\mathcal{T}$ satisfies the SQDB if and only if for all special GKSL representations of the generator $\mathcal{L}$ by means of operators $G, L_k$ there exists a $T$-symmetric unitary $(u_{m\ell})_{m\ell}$ on $\mathsf{k}$ such that, for all $k \ge 1$,
\[
\rho^{1/2} L_k^* = \sum_\ell u_{k\ell}\, L_\ell\, \rho^{1/2}. \tag{14}
\]
Proof. Given a special GKSL representation of $\mathcal{L}$, adding a purely imaginary multiple of the identity operator to the anti-selfadjoint part of $G$ if necessary, Theorem 4 allows us to write the dual $\mathcal{L}'$ in a special GKSL representation by means of operators $G', L'_k$ with
\[
G' \rho^{1/2} = \rho^{1/2} G^*, \qquad L'_k \rho^{1/2} = \rho^{1/2} L_k^*. \tag{15}
\]
Suppose first that $\mathcal{T}$ satisfies the SQDB. Since $L'_k = \sum_j u_{kj} L_j$ for some unitary $(u_{kj})_{kj}$ by Lemma 2, we can find (14) substituting $L'_k$ with $\sum_j u_{kj} L_j$ in the second formula (15).
Finally we show that the unitary matrix $u = (u_{m\ell})_{m\ell}$ is $T$-symmetric. Indeed, taking the adjoint of (14) we find $L_\ell \rho^{1/2} = \sum_m \bar{u}_{\ell m}\, \rho^{1/2} L_m^*$. Writing $\rho^{1/2} L_m^*$ as in (14) we have then
\[
L_\ell \rho^{1/2} = \sum_{m,k} \bar{u}_{\ell m}\, u_{mk}\, L_k \rho^{1/2} = \sum_k \big((u^*)^T u\big)_{\ell k}\, L_k \rho^{1/2}.
\]
The operators $L_\ell \rho^{1/2}$ are linearly independent by property (iii) of Theorem 2 of a special GKSL representation, therefore $(u^*)^T u$ is the identity operator on $\mathsf{k}$. Since $u$ is also unitary, we have also $u^* u = (u^*)^T u$, namely $u^* = (u^*)^T$ and $u = u^T$.
Conversely, if (14) holds, by (15), we have $L'_k \rho^{1/2} = \sum_\ell u_{k\ell} L_\ell \rho^{1/2}$, so that $L'_k = \sum_j u_{kj} L_j$ for all $k$ and for some unitary $(u_{kj})_{kj}$. Therefore, thanks to Lemma 2, to conclude it is enough to prove that $G' = G + i(2K + c)$, namely, that $G' - G$ is anti self-adjoint.
To this end note that, since $\rho$ is an invariant state, we have
\[
0 = \rho G^* + \sum_k L_k \rho L_k^* + G\rho, \tag{16}
\]
with
\[
\sum_k L_k \rho L_k^* = \sum_k (L_k \rho^{1/2})(\rho^{1/2} L_k^*) = \sum_{k,\ell,j} \bar{u}_{k\ell}\, u_{kj}\, \rho^{1/2} L_\ell^* L_j \rho^{1/2}
= \sum_\ell \rho^{1/2} L_\ell^* L_\ell \rho^{1/2} = -\rho^{1/2}(G + G^*)\rho^{1/2}
\]
(for condition (14) holds), and so, by substituting in Eq. (16), we get
\[
0 = \rho G^* - \rho^{1/2} G \rho^{1/2} - \rho^{1/2} G^* \rho^{1/2} + G\rho
= \rho^{1/2}\big(\rho^{1/2} G^* - G\rho^{1/2}\big) - \big(\rho^{1/2} G^* - G\rho^{1/2}\big)\rho^{1/2}
= \big[G\rho^{1/2} - \rho^{1/2} G^*,\ \rho^{1/2}\big],
\]
i.e. $G\rho^{1/2} - \rho^{1/2} G^*$ commutes with $\rho^{1/2}$.
We can now prove that $G' - G$ is anti self-adjoint. Clearly, it suffices to show that $\rho^{1/2} G \rho^{1/2} - \rho^{1/2} G' \rho^{1/2}$ is anti self-adjoint. Indeed, by (15), we have
\[
\begin{aligned}
\big(\rho^{1/2} G \rho^{1/2} - \rho^{1/2} G' \rho^{1/2}\big)^* &= \big(\rho^{1/2} G \rho^{1/2} - \rho G^*\big)^*
= \big(\rho^{1/2}(G\rho^{1/2} - \rho^{1/2} G^*)\big)^* \\
&= \big((G\rho^{1/2} - \rho^{1/2} G^*)\rho^{1/2}\big)^*
= \rho G^* - \rho^{1/2} G \rho^{1/2} \\
&= \rho^{1/2} G' \rho^{1/2} - \rho^{1/2} G \rho^{1/2},
\end{aligned}
\]
because $G\rho^{1/2} - \rho^{1/2} G^*$ commutes with $\rho^{1/2}$. This completes the proof.
It is worth noticing that, as in Remark 3, $\mathcal{T}$ satisfies the SQDB if and only if the completely positive part $\Phi$ of the generator $\mathcal{L}$ is symmetric. This improves our previous result, Thm. 7.3 [11], where we gave $G\rho^{1/2} = \rho^{1/2} G^* - (2iK + ic)\rho^{1/2}$ for some $c \in \mathbb{R}$ as an additional condition. Here we showed that it follows from (14) and the invariance of $\rho$.
Remark 4. Note that (14) holds for the operators $L_\ell$ of a special GKSL representation of $\mathcal{L}$ if and only if it is true for all special GKSL representations because of the second part of Theorem 2. Therefore the conclusion of Theorem 5 holds true also if and only if we can find a single special GKSL representation of $\mathcal{L}$ satisfying (14).
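To make condition (14) concrete, here is a hedged worked example on $M_2(\mathbb{C})$; the operators and the parameters $a, b, r_1, r_2$ are illustrative assumptions and are not taken from the text. Let $\rho = r_1 |e_1\rangle\langle e_1| + r_2 |e_2\rangle\langle e_2|$ with $r_1, r_2 > 0$, $r_1 + r_2 = 1$, let $\sigma_+ = |e_1\rangle\langle e_2|$, $\sigma_- = \sigma_+^*$, take $H$ diagonal in $(e_1, e_2)$ and $L_1 = a\,\sigma_-$, $L_2 = b\,\sigma_+$ with $a, b > 0$. A direct computation gives

```latex
\[
\begin{aligned}
\sum_k L_k \rho L_k^* - \tfrac12 \sum_k \{L_k^* L_k, \rho\}
  &= (b^2 r_2 - a^2 r_1)\big(|e_1\rangle\langle e_1| - |e_2\rangle\langle e_2|\big), \\
\rho^{1/2} L_1^* = a\sqrt{r_1}\,\sigma_+ , \qquad
  L_2\, \rho^{1/2} &= b\sqrt{r_2}\,\sigma_+ , \\
\rho^{1/2} L_2^* = b\sqrt{r_2}\,\sigma_- , \qquad
  L_1\, \rho^{1/2} &= a\sqrt{r_1}\,\sigma_- .
\end{aligned}
\]
% Under the balance condition a^2 r_1 = b^2 r_2 :
\[
\rho^{1/2} L_1^* = L_2\, \rho^{1/2}, \qquad
\rho^{1/2} L_2^* = L_1\, \rho^{1/2}, \qquad
u = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = u^T = (u^*)^{-1}.
\]
```

Since $[H, \rho] = 0$, the first line shows that $\rho$ is invariant exactly when $a^2 r_1 = b^2 r_2$. In that case $\mathrm{tr}(\rho L_k) = 0$ and $\{\mathbf{1}, L_1, L_2\}$ is linearly independent, so the representation is special, and (14) holds with the $T$-symmetric unitary $u$ above; by Theorem 5 this semigroup then satisfies the SQDB, for any diagonal $H$.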
The T-symmetric unitary $(u_{\ell m})_{\ell m}$ is determined by the $L_\ell$'s because they are linearly independent. We shall now exploit this fact to give a more geometrical characterisation of SQDB. When the SQDB holds, the matrices $(b_{kj})_{k,j\ge 1}$ and $(c_{kj})_{k,j\ge 1}$ with
\[ b_{kj} = \mathrm{tr}\big(\rho^{1/2}L_k^*\rho^{1/2}L_j^*\big), \qquad c_{kj} = \mathrm{tr}\big(\rho L_k^*L_j\big) \tag{17} \]
define two trace class operators $B$ and $C$ on $\mathsf{k}$ by Lemma 3 (see the Appendix); $B$ is T-symmetric and $C$ is self-adjoint. Moreover, $C$ admits a self-adjoint inverse $C^{-1}$ because $\rho$ is faithful. When $\mathsf{k}$ is infinite dimensional, $C^{-1}$ is unbounded and its domain coincides with the range of $C$. We can now give the following characterisation of QMS satisfying the SQDB condition, which is more direct because the unitary $(u_{k\ell})_{k\ell}$ in Theorem 5 is explicitly given by $C^{-1}B$.

Theorem 6. $\mathcal{T}$ satisfies the SQDB if and only if the operators $G$, $L_k$ of a special GKSL representation of the generator $\mathcal{L}$ satisfy the following conditions:
(i) the closed linear spans of $\{\rho^{1/2}L_\ell^* \mid \ell \ge 1\}$ and $\{L_\ell\rho^{1/2} \mid \ell \ge 1\}$ in the Hilbert space of Hilbert-Schmidt operators on $\mathsf{h}$ coincide,
(ii) the trace-class operators $B$, $C$ defined by (17) satisfy $CB = BC^T$ and $C^{-1}B$ is unitary T-symmetric.

Proof. If $\mathcal{T}$ satisfies the SQDB then, by Theorem 5, the identity (14) holds. The series in the right-hand side of (14) is convergent with respect to the Hilbert-Schmidt norm because
\[ \Big\|\sum_{m+1\le \ell\le n} u_{k\ell}L_\ell\rho^{1/2}\Big\|_{HS}^2 = \sum_{m+1\le \ell,\ell'\le n} \bar u_{k\ell'}\,u_{k\ell}\,\mathrm{tr}\big(\rho L_{\ell'}^*L_\ell\big) \le \frac12\sum_{m+1\le \ell,\ell'\le n} |u_{k\ell}|^2|u_{k\ell'}|^2 + \frac12\sum_{m+1\le \ell,\ell'\le n} |c_{\ell'\ell}|^2 \le \frac12\Big(\sum_{m+1\le \ell\le n} |u_{k\ell}|^2\Big)^2 + \frac12\sum_{m+1\le \ell,\ell'\le n} |c_{\ell'\ell}|^2, \]
and the right-hand side vanishes as $n, m$ go to infinity because the operator $C$ is trace class by Lemma 3 and the columns of $U = (u_{k\ell})_{k\ell}$ are unit vectors in $\mathsf{k}$ by unitarity. Left multiplying both sides of (14) by $\rho^{1/2}L_j^*$ and taking the trace we find $B = CU^T = CU$. It follows that the ranges of the operators $B$, $CU$ and $C$ coincide and $C^{-1}B = U$ is everywhere defined, unitary and T-symmetric because $U$ is T-symmetric. Moreover, since $B$ is T-symmetric by the cyclic property of the trace, we have also
\[ BC^T = CU^TC^T = C(CU)^T = CB^T = CB. \]
Conversely, we show that (i) and (ii) imply the SQDB. To this end notice that, by the spectral theorem, we can find a unitary linear transformation $V = (v_{mn})_{m,n\ge 1}$ on $\mathsf{k}$ such that $V^*CV$ is diagonal. Therefore, choosing a new GKSL representation of the generator $\mathcal{L}$ by means of the operators $L'_k = \sum_{n\ge 1} v_{nk}L_n$, if necessary, we can suppose
that both $(L_\ell\rho^{1/2})_{\ell\ge 1}$ and $(\rho^{1/2}L_k^*)_{k\ge 1}$ are orthogonal bases of the same closed linear space. Note that
\[ \mathrm{tr}\big(\rho^{1/2}(L')_k^*\rho^{1/2}(L')_j^*\big) = \sum_{m,n\ge 1} \bar v_{nk}\,\bar v_{mj}\,\mathrm{tr}\big(\rho^{1/2}L_n^*\rho^{1/2}L_m^*\big) \]
and the operator $B$, after this change of GKSL representation, becomes $V^*B(V^*)^T$, which is also T-symmetric. Writing the expansion of $\rho^{1/2}L_k^*$ with respect to the orthogonal basis $(L_\ell\rho^{1/2})_{\ell\ge 1}$, for all $k \ge 1$ we have
\[ \rho^{1/2}L_k^* = \sum_{\ell\ge 1} \frac{\mathrm{tr}\big(\rho^{1/2}L_\ell^*\rho^{1/2}L_k^*\big)}{\|L_\ell\rho^{1/2}\|_{HS}^2}\,L_\ell\rho^{1/2}. \tag{18} \]
In this way we find a matrix $Y$ of complex numbers $y_{k\ell}$ such that $\rho^{1/2}L_k^* = \sum_\ell y_{k\ell}L_\ell\rho^{1/2}$ and the series is Hilbert-Schmidt norm convergent. Clearly, since $C$ is diagonal and $B$ is T-symmetric,
\[ y_{k\ell} = (BC^{-1})_{k\ell} = \big(B(C^{-1})^T\big)_{k\ell} = \big((C^{-1}B)^T\big)_{k\ell}. \]
It follows from (ii) that $Y$ coincides with the unitary operator $(C^{-1}B)^T$ and (14) holds. Moreover, $Y$ is symmetric because
\[ y_{k\ell} = (BC^{-1})_{k\ell} = \big(B(C^{-1})^T\big)_{k\ell} = (C^{-1}B)_{k\ell} = y_{\ell k}. \]
This completes the proof.
Formula (18) has the following consequence.

Corollary 1. Suppose that a QMS $\mathcal{T}$ satisfies the SQDB condition. For every special GKSL representation of $\mathcal{L}$ with operators $L_\ell\rho^{1/2}$ that are orthogonal in the Hilbert space of Hilbert-Schmidt operators on $\mathsf{h}$, if $\mathrm{tr}(\rho^{1/2}L_\ell^*\rho^{1/2}L_k^*) \ne 0$ for a pair of indices $k, \ell \ge 1$, then $\mathrm{tr}(\rho L_\ell^*L_\ell) = \mathrm{tr}(\rho L_k^*L_k)$.

Proof. It suffices to note that the matrix $(u_{k\ell})$ with entries
\[ u_{k\ell} = \frac{\mathrm{tr}\big(\rho^{1/2}L_\ell^*\rho^{1/2}L_k^*\big)}{\|L_\ell\rho^{1/2}\|_{HS}^2} = \frac{\mathrm{tr}\big(\rho^{1/2}L_\ell^*\rho^{1/2}L_k^*\big)}{\mathrm{tr}\big(\rho L_\ell^*L_\ell\big)} \]
must be T-symmetric.
Remark 5. The matrix $C$ can be viewed as the covariance matrix of the zero-mean (recall that $\mathrm{tr}(\rho L_\ell) = 0$) "random variables" $\{L_\ell \mid \ell \ge 1\}$ and, in a similar way, $B$ can be viewed as a sort of mixed covariance matrix between the previous random variables and the adjoints $\{L_\ell^* \mid \ell \ge 1\}$. Thus the SQDB condition holds when the random variables $L_\ell$ right multiplied by $\rho^{1/2}$ and the adjoint variables $L_\ell^*$ left multiplied by $\rho^{1/2}$ generate the same subspace of Hilbert-Schmidt operators and the mixed covariance matrix $B$ is a left unitary transformation of the covariance matrix $C$. If we consider a special GKSL representation of $\mathcal{L}$ with operators $L_\ell\rho^{1/2}$ that are orthogonal, then, by Corollary 1 and the identity $\|L_\ell\rho^{1/2}\|_{HS} = \|L_k\rho^{1/2}\|_{HS}$, the unitary matrix $U$ can be written as $C^{-1/2}BC^{-1/2}$. This, although not positive definite, can be interpreted as a correlation coefficient matrix of $\{L_\ell \mid \ell \ge 1\}$ and $\{L_\ell^* \mid \ell \ge 1\}$.

The characterisation of generators of symmetric QMSs with respect to the $s = 1/2$ scalar product follows along the same lines.
Theorem 7. A norm-continuous QMS $\mathcal{T}$ is symmetric if and only if there exists a special GKSL representation of the generator $\mathcal{L}$ by means of operators $G$, $L_\ell$ such that
(1) $G\rho^{1/2} = \rho^{1/2}G^* + ic\rho^{1/2}$ for some $c \in \mathbb{R}$,
(2) $\rho^{1/2}L_k^* = \sum_\ell u_{k\ell}L_\ell\rho^{1/2}$, for all $k$, for some unitary $(u_{k\ell})_{k\ell}$ on $\mathsf{k}$ which is also T-symmetric.

Proof. Choose a special GKSL representation of $\mathcal{L}$ by means of operators $G$, $L_k$. Theorem 4 allows us to write the symmetric dual $\mathcal{L}'$ in a special GKSL representation by means of operators $G'$, $L'_k$ as in (15). Suppose first that $\mathcal{T}$ is KMS-symmetric. Comparing the special GKSL representations of $\mathcal{L}$ and $\mathcal{L}'$, by Theorem 2 we find
\[ G = G' + ic, \qquad L'_k = \sum_j u_{kj}L_j, \]
for some unitary matrix $(u_{kj})$ and some $c \in \mathbb{R}$. This, together with (15), implies that conditions (1) and (2) hold.

Assume now that conditions (1) and (2) hold. Taking the adjoint of (2) we find immediately $L_k\rho^{1/2} = \sum_\ell \bar u_{k\ell}\,\rho^{1/2}L_\ell^*$. Then a straightforward computation, by the unitarity of the matrix $(u_{k\ell})$, yields
\[ \mathcal{L}_*(\rho^{1/2}x\rho^{1/2}) = G\rho^{1/2}x\rho^{1/2} + \sum_k L_k\rho^{1/2}x\rho^{1/2}L_k^* + \rho^{1/2}x\rho^{1/2}G^* = \rho^{1/2}G^*x\rho^{1/2} + \sum_{k,\ell,j} \bar u_{k\ell}\,u_{kj}\,\rho^{1/2}L_\ell^*xL_j\rho^{1/2} + \rho^{1/2}xG\rho^{1/2} = \rho^{1/2}\mathcal{L}(x)\rho^{1/2} \]
for all $x \in \mathcal{B}(\mathsf{h})$. Iterating we find $\mathcal{L}_*^n(\rho^{1/2}x\rho^{1/2}) = \rho^{1/2}\mathcal{L}^n(x)\rho^{1/2}$ for all $n \ge 0$, therefore, exponentiating, we find $\mathcal{T}_{*t}(\rho^{1/2}x\rho^{1/2}) = \rho^{1/2}\mathcal{T}_t(x)\rho^{1/2}$ for all $t \ge 0$. This, together with (3), implies that $\mathcal{T}$ is KMS-symmetric.

Remark 6. Note that condition (2) in Theorem 7 implies that the completely positive part of $\mathcal{L}$ is KMS-symmetric. This makes a parallel with Theorem 4, where condition (2) implies that the completely positive parts of the generators $\mathcal{L}$ and $\mathcal{L}'$ are mutually adjoint. The above theorem simplifies a previous result by Park ([23], Thm 2.2) where conditions (1) and (2) appear in a much more complicated way.
5. Generators of Standard Detailed Balance (with Time Reversal) QMSs

We shall now study generators of semigroups satisfying the SQDB-θ introduced in Definition 3 involving the time reversal operation. In this section, we always assume that the invariant state ρ and the anti-unitary time reversal θ commute. The relationship between the QMS satisfying the SQDB-θ, its dual and their generators is clarified by the following
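Proposition 5. A QMS $\mathcal{T}$ satisfies the SQDB-θ if and only if the dual semigroup $\mathcal{T}'$ is given by
\[ \mathcal{T}'_t(x) = \theta\,\mathcal{T}_t(\theta x\theta)\,\theta \quad \text{for all } x \in \mathcal{B}(\mathsf{h}). \tag{19} \]
In particular, if $\mathcal{T}$ is norm-continuous, then $\mathcal{T}'$ is also norm-continuous. Moreover, in this case $\mathcal{T}'$ is generated by
\[ \mathcal{L}'(x) = \theta\,\mathcal{L}(\theta x\theta)\,\theta, \qquad x \in \mathcal{B}(\mathsf{h}). \tag{20} \]

Proof. Suppose that $\mathcal{T}$ satisfies the SQDB-θ and put $\sigma(x) = \theta x\theta$. Taking $t = 0$, Eq. (6) reduces to $\mathrm{tr}(\rho^{1/2}x\rho^{1/2}y) = \mathrm{tr}(\rho^{1/2}\sigma(y^*)\rho^{1/2}\sigma(x^*))$ for all $x, y \in \mathcal{B}(\mathsf{h})$, so that
\[ \mathrm{tr}\big(\rho^{1/2}x\rho^{1/2}\mathcal{T}_t(y)\big) = \mathrm{tr}\big(\rho^{1/2}\sigma(y^*)\rho^{1/2}\mathcal{T}_t(\sigma(x^*))\big) = \mathrm{tr}\big(\rho^{1/2}\sigma(\mathcal{T}_t(\sigma(x^*))^*)\rho^{1/2}\sigma(\sigma(y^*)^*)\big) = \mathrm{tr}\big(\rho^{1/2}\sigma(\mathcal{T}_t(\sigma(x)))\rho^{1/2}y\big) \]
for every $x, y \in \mathcal{B}(\mathsf{h})$ and (19) follows. Therefore, if $\mathcal{T}$ is norm continuous, $\mathcal{T}' = (\sigma \circ \mathcal{T}_t \circ \sigma)_t$ is also.

Conversely, if (19) holds, the commutation between ρ and θ implies
\[ \mathrm{tr}\big(\rho^{1/2}\mathcal{T}'_t(x)\rho^{1/2}y\big) = \mathrm{tr}\big(\rho^{1/2}\theta\,\mathcal{T}_t(\theta x\theta)\,\theta\rho^{1/2}y\big) = \mathrm{tr}\big(\theta\big(\rho^{1/2}\mathcal{T}_t(\theta x\theta)\theta\rho^{1/2}y\theta\big)\theta\big) = \mathrm{tr}\big(\rho^{1/2}\theta y^*\rho^{1/2}\theta\,\mathcal{T}_t(\theta x^*\theta)\big), \]
where the last step uses $\mathrm{tr}(\theta z\theta) = \mathrm{tr}(z^*)$, and (6) is proved. Now (20) follows from (19) differentiating at $t = 0$.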
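We can now describe the relationship between special GKSL representations of $\mathcal{L}$ and $\mathcal{L}'$.

Proposition 6. If $\mathcal{T}$ satisfies the SQDB-θ then, for every special GKSL representation of $\mathcal{L}$ by means of operators $H$, $L_k$, the operators $H' = -\theta H\theta$ and $L'_k = \theta L_k\theta$ yield a special GKSL representation of $\mathcal{L}'$.

Proof. Consider a special GKSL representation of $\mathcal{L}$ by means of operators $H$, $L_k$. Since $\mathcal{L}'(a) = \theta\mathcal{L}(\theta a\theta)\theta$ by Proposition 5, from the antilinearity of θ and $\theta^2 = \mathbb{1}$ we get
\[ \theta\,\mathcal{L}'(a)\,\theta = i[H, \theta a\theta] - \frac12\sum_k \big(L_k^*L_k\theta a\theta - 2L_k^*\theta a\theta L_k + \theta a\theta L_k^*L_k\big) \]
\[ = i\theta\big(\theta H\theta a - a\theta H\theta\big)\theta + \sum_k \theta(\theta L_k^*\theta)a(\theta L_k\theta)\theta - \frac12\sum_k \theta\big((\theta L_k^*\theta)(\theta L_k\theta)a + a(\theta L_k^*\theta)(\theta L_k\theta)\big)\theta \]
\[ = \theta\big(-i[\theta H\theta, a]\big)\theta - \frac12\sum_k \theta\big(L'^*_kL'_ka - 2L'^*_kaL'_k + aL'^*_kL'_k\big)\theta, \]
where $L'_k := \theta L_k\theta$. Therefore, putting $H' = -\theta H\theta$, we find a GKSL representation of $\mathcal{L}'$ which is also special because $\mathrm{tr}(\rho L'_k) = \mathrm{tr}(\theta\rho L_k\theta) = \mathrm{tr}(L_k^*\rho) = \overline{\mathrm{tr}(\rho L_k)} = 0$.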
The structure of generators of QMSs satisfying the SQDB-θ is described by the following

Theorem 8. A QMS $\mathcal{T}$ satisfies the SQDB-θ condition if and only if there exists a special GKSL representation of $\mathcal{L}$, with operators $G$, $L_\ell$, such that:
1. $\rho^{1/2}\theta G^*\theta = G\rho^{1/2}$,
2. $\rho^{1/2}\theta L_k^*\theta = \sum_j u_{kj}L_j\rho^{1/2}$ for a self-adjoint unitary $(u_{kj})_{kj}$ on $\mathsf{k}$.

Proof. Suppose that $\mathcal{T}$ satisfies the SQDB-θ condition and consider a special GKSL representation of the generator $\mathcal{L}$ with operators $G$, $L_k$. The operators $-\theta H\theta$ and $\theta L_k\theta$ give then a special GKSL representation of $\mathcal{L}'$ by Proposition 6. Moreover, by Theorem 4, we have another special GKSL representation of $\mathcal{L}'$ by means of operators $G'$, $L'_k$ such that $G'\rho^{1/2} = \rho^{1/2}G^* + ic\rho^{1/2}$ for some $c \in \mathbb{R}$, and $L'_k\rho^{1/2} = \rho^{1/2}L_k^*$. Therefore there exists a unitary $(v_{kj})_{kj}$ on $\mathsf{k}$ such that $L'_k = \sum_j v_{kj}\theta L_j\theta$, and $\rho^{1/2}L_k^* = \sum_j v_{kj}\theta L_j\theta\rho^{1/2}$. Condition 2 follows then with $u_{kj} = \bar v_{kj}$, left and right multiplying by the antiunitary θ. In order to find condition 1, first notice that, by the unitarity of $(v_{kj})_{kj}$,
\[ \sum_k L'^*_kL'_k = \sum_k \theta L_k^*L_k\theta. \tag{21} \]
Now, by the uniqueness of $G'$ up to a purely imaginary multiple of the identity in a special GKSL representation, $H' = (G'^* - G')/(2i)$ is equal to $-\theta H\theta + c_1$ for some $c_1 \in \mathbb{R}$. From (21) and $G'\rho^{1/2} = \rho^{1/2}G^* + ic\rho^{1/2}$ we obtain then
\[ \rho^{1/2}G^* + ic\rho^{1/2} = G'\rho^{1/2} = -iH'\rho^{1/2} - \frac12\sum_k L'^*_kL'_k\rho^{1/2} = i\theta H\theta\rho^{1/2} + ic_1\rho^{1/2} - \frac12\sum_k \theta L_k^*L_k\theta\rho^{1/2} = \theta G\theta\rho^{1/2} + ic_1\rho^{1/2}. \]
It follows that
\[ \rho^{1/2}\theta G^*\theta = G\rho^{1/2} + ic_2\rho^{1/2} \]
for some $c_2 \in \mathbb{R}$. Left multiplying by $\rho^{1/2}$ and tracing we find
\[ ic_2 = \mathrm{tr}\big(\theta\rho G^*\theta\big) - \mathrm{tr}(\rho G) = \mathrm{tr}(G\rho) - \mathrm{tr}(\rho G) = 0 \]
and condition 1 holds.

Finally we show that the square of the unitary $(u_{kj})_{kj}$ on $\mathsf{k}$ is the identity operator. Indeed, taking the adjoint of the identity $\rho^{1/2}\theta L_k^*\theta = \sum_j u_{kj}L_j\rho^{1/2}$, we have
\[ \theta L_k\theta\rho^{1/2} = \sum_j \bar u_{kj}\,\rho^{1/2}L_j^*. \]
Left and right multiplying by the antilinear time reversal θ (commuting with ρ) we find
\[ L_k\rho^{1/2} = \theta\Big(\sum_j \bar u_{kj}\,\rho^{1/2}L_j^*\Big)\theta = \sum_j u_{kj}\,\rho^{1/2}\theta L_j^*\theta. \]
Writing $\rho^{1/2}\theta L_j^*\theta$ as $\sum_m u_{jm}L_m\rho^{1/2}$ by condition 2 we have then
\[ L_k\rho^{1/2} = \sum_{j,m} u_{kj}u_{jm}L_m\rho^{1/2} = \sum_m (u^2)_{km}L_m\rho^{1/2} \]
which implies that $u^2 = \mathbb{1}$ by the linear independence of the $L_m\rho^{1/2}$. Therefore, since $u$ is unitary, $u = u^*$.

Conversely, if 1 and 2 hold, we can write $\rho^{1/2}\theta\mathcal{L}(\theta x\theta)\theta\rho^{1/2}$ as
\[ \rho^{1/2}\theta G^*\theta x\rho^{1/2} + \sum_k \rho^{1/2}\theta L_k^*\theta\,x\,\theta L_k\theta\rho^{1/2} + \rho^{1/2}x\theta G\theta\rho^{1/2} = G\rho^{1/2}x\rho^{1/2} + \sum_j L_j\rho^{1/2}x\rho^{1/2}L_j^* + \rho^{1/2}x\rho^{1/2}G^*. \]
This, by Theorem 4, can be written as
\[ \rho^{1/2}(G')^*x\rho^{1/2} + \sum_j \rho^{1/2}(L'_j)^*xL'_j\rho^{1/2} + \rho^{1/2}xG'\rho^{1/2} = \rho^{1/2}\mathcal{L}'(x)\rho^{1/2}. \]
It follows that $\theta\mathcal{L}(\theta x\theta)\theta = \mathcal{L}'(x)$ for all $x \in \mathcal{B}(\mathsf{h})$ because ρ is faithful. Moreover, it is easy to check by induction that $\theta\mathcal{L}^n(\theta x\theta)\theta = (\mathcal{L}')^n(x)$ for all $n \ge 0$. Therefore $\theta\mathcal{T}_t(\theta x\theta)\theta = \mathcal{T}'_t(x)$ for all $t \ge 0$ and $\mathcal{T}$ satisfies the SQDB-θ condition by Proposition 5.

We now provide a geometrical characterisation of the SQDB-θ condition as in Theorem 6. To this end we introduce the trace class operator $R$ on $\mathsf{k}$,
\[ R_{jk} = \mathrm{tr}\big(\rho^{1/2}L_j^*\rho^{1/2}\theta L_k^*\theta\big). \tag{22} \]
A direct application of Lemma 3 shows that $R$ is trace class. Moreover it is self-adjoint because, by the property $\mathrm{tr}(\theta x\theta) = \mathrm{tr}(x^*)$ of the antilinear time reversal, we have
\[ R_{jk} = \mathrm{tr}\big(\rho^{1/2}L_j^*\rho^{1/2}\theta L_k^*\theta\big) = \mathrm{tr}\big(\theta(L_k\theta\rho^{1/2}L_j\rho^{1/2}\theta)\theta\big) = \overline{\mathrm{tr}\big(\rho^{1/2}\theta L_j^*\rho^{1/2}\theta L_k^*\big)} = \overline{\mathrm{tr}\big((\rho^{1/2}\theta L_j^*\theta)(\rho^{1/2}L_k^*)\big)} = \overline{R_{kj}}. \]

Theorem 9. $\mathcal{T}$ satisfies the SQDB-θ if and only if the operators $G$, $L_k$ of a special GKSL representation of the generator $\mathcal{L}$ fulfill the following conditions:
1. $\rho^{1/2}\theta G^*\theta = G\rho^{1/2}$,
2. the closed linear spans of $\{\rho^{1/2}\theta L_\ell^*\theta \mid \ell \ge 1\}$ and $\{L_\ell\rho^{1/2} \mid \ell \ge 1\}$ in the Hilbert space of Hilbert-Schmidt operators on $\mathsf{h}$ coincide,
3. the self-adjoint trace class operators $R$, $C$ defined by (17) and (22) commute and $C^{-1}R$ is unitary and self-adjoint.

Proof. It suffices to show that conditions 2 and 3 above are equivalent to condition 2 of Theorem 8. If $\mathcal{T}$ satisfies the SQDB-θ, then it can be shown as in the proof of Theorem 6 that 2 follows from condition 2 of Theorem 8. Moreover, left multiplying by $\rho^{1/2}L_\ell^*$ the identity $\rho^{1/2}\theta L_k^*\theta = \sum_j u_{kj}L_j\rho^{1/2}$ and tracing, we find
\[ \mathrm{tr}\big(\rho^{1/2}L_\ell^*\rho^{1/2}\theta L_k^*\theta\big) = \sum_j u_{kj}\,\mathrm{tr}\big(\rho L_\ell^*L_j\big) \]
for all $k, \ell$, i.e. $R = CU^T$. The operator $U^T$ is also self-adjoint and unitary. Therefore $R$ and $C$ have the same range and, since the domain of $C^{-1}$ coincides with the range of $C$, the operator $C^{-1}R$ is everywhere defined, unitary and self-adjoint. It follows that the densely defined operator $RC^{-1}$ is a restriction of $(C^{-1}R)^* = C^{-1}R$ and $CR = RC$.

In order to prove, conversely, that 2 and 3 imply condition 2 of Theorem 8, we first notice that, by the spectral theorem, there exists a unitary $V = (v_{mn})_{m,n\ge 1}$ on the multiplicity space $\mathsf{k}$ such that $V^*CV$ is diagonal. Choosing a new GKSL representation of the generator $\mathcal{L}$ by means of the operators $L'_k = \sum_{n\ge 1} v_{nk}L_n$, if necessary, we can suppose that both $(L_\ell\rho^{1/2})_{\ell\ge 1}$ and $(\rho^{1/2}L_k^*)_{k\ge 1}$ are orthogonal bases of the same closed linear space. Note that
\[ \mathrm{tr}\big(\rho^{1/2}(L')_k^*\rho^{1/2}\theta(L')_j^*\theta\big) = \sum_{m,n\ge 1} \bar v_{nk}\,v_{mj}\,\mathrm{tr}\big(\rho^{1/2}L_n^*\rho^{1/2}\theta L_m^*\theta\big) \]
and the operator $R$, in the new GKSL representation, transforms into $V^*RV$, which is also self-adjoint. Expanding $\rho^{1/2}\theta L_k^*\theta$ with respect to the orthogonal basis $(L_\ell\rho^{1/2})_{\ell\ge 1}$, for all $k \ge 1$, we have
\[ \rho^{1/2}\theta L_k^*\theta = \sum_{\ell\ge 1} \frac{\mathrm{tr}\big(\rho^{1/2}L_\ell^*\rho^{1/2}\theta L_k^*\theta\big)}{\|L_\ell\rho^{1/2}\|_{HS}^2}\,L_\ell\rho^{1/2}, \tag{23} \]
i.e. $\rho^{1/2}\theta L_k^*\theta = \sum_\ell y_{k\ell}L_\ell\rho^{1/2}$ with a matrix $Y$ of complex numbers $y_{k\ell}$. Clearly, we have $y_{k\ell} = (C^{-1}R)_{\ell k}$. It follows then from condition 3 above that $Y$ coincides with the unitary operator $(C^{-1}R)^T$ and condition 2 of Theorem 8 holds. Moreover, $Y$ is self-adjoint because both $R$ and $C$ are.

As an immediate consequence of the commutation of $R$ and $C$ we have the following parallel of Corollary 1 for the SQDB condition.

Corollary 2. Suppose that a QMS $\mathcal{T}$ satisfies the SQDB-θ condition. For every special GKSL representation of $\mathcal{L}$ with operators $L_\ell\rho^{1/2}$ orthogonal as Hilbert-Schmidt operators on $\mathsf{h}$, if $\mathrm{tr}(\rho^{1/2}L_\ell^*\rho^{1/2}\theta L_k^*\theta) \ne 0$ for a pair of indices $k, \ell \ge 1$, then $\mathrm{tr}(\rho L_\ell^*L_\ell) = \mathrm{tr}(\rho L_k^*L_k)$.

When the time reversal θ is given by the conjugation $\theta u = \bar u$ (with respect to some orthonormal basis of $\mathsf{h}$), $\theta x^*\theta$ is equal to the transpose $x^T$ of $x$ and we find the following

Corollary 3. $\mathcal{T}$ satisfies the SQDB-θ condition if and only if there exists a special GKSL representation of $\mathcal{L}$, with operators $G$, $L_k$, such that:
1. $\rho^{1/2}G^T = G\rho^{1/2}$;
2. $\rho^{1/2}L_k^T = \sum_j u_{kj}L_j\rho^{1/2}$ for some unitary self-adjoint $(u_{kj})_{kj}$.

6. SQDB-θ for QMS on $M_2(\mathbb{C})$

In this section, as an application, we find a standard form of a special GKSL representation of the generator $\mathcal{L}$ of a QMS on $M_2(\mathbb{C})$ satisfying the SQDB-θ. The faithful invariant state ρ, in a suitable basis of $\mathbb{C}^2$, can be written in the form
\[ \rho = \begin{pmatrix} \nu & 0 \\ 0 & 1-\nu \end{pmatrix} = \frac12\big(\sigma_0 + (2\nu - 1)\sigma_3\big), \qquad 0 < \nu < 1, \]
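where $\sigma_0$ is the identity matrix and $\sigma_1, \sigma_2, \sigma_3$ are the Pauli matrices
\[ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \]
The time reversal θ is the usual conjugation in the same basis of $\mathbb{C}^2$. In order to determine the structure of the operators $G$ and $L_k$ satisfying the conditions of Corollary 3, we first find a convenient basis of $M_2(\mathbb{C})$. We choose a basis of eigenvectors of the linear map $X \mapsto \rho^{1/2}X^T\rho^{-1/2}$ in $M_2(\mathbb{C})$ given by $\sigma_0, \sigma_1^\nu, \sigma_2^\nu, \sigma_3$, where
\[ \sigma_1^\nu = \begin{pmatrix} 0 & \sqrt{2\nu} \\ \sqrt{2(1-\nu)} & 0 \end{pmatrix}, \qquad \sigma_2^\nu = \begin{pmatrix} 0 & -i\sqrt{2\nu} \\ i\sqrt{2(1-\nu)} & 0 \end{pmatrix}. \]
Indeed, $\sigma_0, \sigma_1^\nu, \sigma_3$ (resp. $\sigma_2^\nu$) are eigenvectors of the eigenvalue 1 (resp. −1). Every special GKSL representation of $\mathcal{L}$ is given by (see [11], Lemma 6.1)
\[ L_k = -(2\nu - 1)z_{k3}\sigma_0 + z_{k1}\sigma_1^\nu + z_{k2}\sigma_2^\nu + z_{k3}\sigma_3, \qquad k \in J \subseteq \{1, 2, 3\}, \]
with vectors $z_k := (z_{k1}, z_{k2}, z_{k3})$ ($k \in J$) linearly independent in $\mathbb{C}^3$. The SQDB-θ holds if and only if $G$, $L_k$ satisfy
(i) $G = \rho^{1/2}G^T\rho^{-1/2}$,
(ii) $L_k = \sum_{j\in J} u_{kj}\,\rho^{1/2}L_j^T\rho^{-1/2}$ for some unitary self-adjoint $U = (u_{kj})_{k,j\in J}$.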
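The eigenvector claim for the map $X \mapsto \rho^{1/2}X^T\rho^{-1/2}$ can be checked numerically. The following is a minimal sketch in plain Python, assuming the diagonal form of ρ given above; the helper names (`twisted_transpose`, `candidate_basis`, `eigenvalue_of`, `eigenvalues`) are illustrative and not taken from the paper.

```python
import math


def twisted_transpose(x, nu):
    """Return rho^{1/2} x^T rho^{-1/2} for rho = diag(nu, 1 - nu)."""
    d = [math.sqrt(nu), math.sqrt(1.0 - nu)]
    # Entry (i, j) of rho^{1/2} x^T rho^{-1/2} is d_i * x_{ji} / d_j.
    return [[d[i] * x[j][i] / d[j] for j in range(2)] for i in range(2)]


def candidate_basis(nu):
    """The matrices sigma_0, sigma_1^nu, sigma_2^nu, sigma_3 as nested lists."""
    a = math.sqrt(2.0 * nu)
    b = math.sqrt(2.0 * (1.0 - nu))
    sigma0 = [[1, 0], [0, 1]]
    sigma1_nu = [[0, a], [b, 0]]
    sigma2_nu = [[0, -1j * a], [1j * b, 0]]
    sigma3 = [[1, 0], [0, -1]]
    return [sigma0, sigma1_nu, sigma2_nu, sigma3]


def eigenvalue_of(x, nu, tol=1e-12):
    """Return +1 or -1 if x is an eigenvector of the map, otherwise None."""
    y = twisted_transpose(x, nu)
    for sign in (1, -1):
        if all(abs(y[i][j] - sign * x[i][j]) < tol
               for i in range(2) for j in range(2)):
            return sign
    return None


def eigenvalues(nu):
    """Eigenvalues found for sigma_0, sigma_1^nu, sigma_2^nu, sigma_3, in that order."""
    return [eigenvalue_of(x, nu) for x in candidate_basis(nu)]


if __name__ == "__main__":
    for nu in (0.1, 0.3, 0.5, 0.9):
        print(nu, eigenvalues(nu))
```

For any $0 < \nu < 1$ the sketch should report the sign pattern $+1, +1, -1, +1$, in agreement with the statement that only $\sigma_2^\nu$ belongs to the eigenvalue −1.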
Now, if $J \ne \emptyset$, since every unitary self-adjoint matrix is diagonalizable and its spectrum is contained in $\{-1, 1\}$, it follows that $U = W^*DW$ for some unitary matrix $W = (w_{ij})_{i,j\in J}$ and some diagonal matrix $D$ of the form
\[ \mathrm{diag}(\epsilon_1, \ldots, \epsilon_{|J|}), \qquad \epsilon_i \in \{-1, 1\}, \tag{24} \]
where $|J|$ denotes the cardinality of $J$. Therefore, replacing the $L_k$'s by operators $L'_k := \sum_{j\in J} w_{kj}L_j$ if necessary, we can take $U$ of the form (24). We now analyze the structure of the $L_k$'s corresponding to the different (diagonal) forms of $U$. By condition (ii) we have either $L_k = \rho^{1/2}L_k^T\rho^{-1/2}$ or $L_k = -\rho^{1/2}L_k^T\rho^{-1/2}$; an easy calculation shows that
\[ L_k = \rho^{1/2}L_k^T\rho^{-1/2} \quad \text{if and only if} \quad z_{k2} = 0 \tag{25} \]
and
\[ L_k = -\rho^{1/2}L_k^T\rho^{-1/2} \quad \text{if and only if} \quad z_{k1} = z_{k3} = 0. \tag{26} \]
Therefore, the linear independence of $\{z_j : j \in J\}$ forces $U$ to have at most two eigenvalues equal to 1 and at most one equal to −1 and, with a suitable choice of a phase factor for each $L_k$, we can write
\[ L_k = (1 - 2\nu)r_k\sigma_0 + r_k\sigma_3 + \zeta_k\sigma_1^\nu \quad \text{for } k = 1, 2 \text{ and } r_k \in \mathbb{R},\ \zeta_k \in \mathbb{C}, \tag{27} \]
\[ L_3 = r_3\sigma_2^\nu, \qquad r_3 \in \mathbb{R}. \tag{28} \]
Clearly $L_1$ and $L_2$ are linearly independent if and only if $r_1\zeta_2 \ne r_2\zeta_1$. This, together with non-triviality conditions, leaves us, up to a change of indices, with the following possibilities:
(a) $|J| = 1$, $U = 1$, then $J = \{1\}$ with $r_1\zeta_1 \ne 0$,
(b) $|J| = 1$, $U = -1$, then $J = \{3\}$ with $r_3 \ne 0$,
(c) $|J| = 2$, $U = \mathrm{diag}(1, 1)$, then $J = \{1, 2\}$ with $r_1\zeta_1r_2\zeta_2 \ne 0$, $r_1\zeta_2 \ne r_2\zeta_1$,
(d) $|J| = 2$, $U = \mathrm{diag}(1, -1)$, then $J = \{1, 3\}$ with $r_3 \ne 0$, $r_1\zeta_1 \ne 0$,
(e) $|J| = 3$, $U = \mathrm{diag}(1, 1, -1)$, then $J = \{1, 2, 3\}$ with $r_1\zeta_2 \ne r_2\zeta_1$, $r_3 \ne 0$, $r_1\zeta_1r_2\zeta_2 \ne 0$.

To conclude, we analyze condition (i). If $G = (g_{jk})_{1\le j,k\le 2}$ then statement (i) is equivalent to
\[ \sqrt{\nu}\,g_{21} = \sqrt{1-\nu}\,g_{12}. \tag{29} \]
Since $G = -iH - 2^{-1}\sum_k L_k^*L_k$ with $H = \sum_{j=1}^3 v_j\sigma_j$, $v_j \in \mathbb{R}$, and $\sum_k L_k^*L_k$ is equal to the sum of a term depending only on $\sigma_0$ and $\sigma_3$ plus
\[ \sum_{k=1,2} 2r_k \begin{pmatrix} 0 & \zeta_k\sqrt{2\nu}\,(1-\nu) - \bar\zeta_k\,\nu\sqrt{2(1-\nu)} \\ \bar\zeta_k\sqrt{2\nu}\,(1-\nu) - \zeta_k\,\nu\sqrt{2(1-\nu)} & 0 \end{pmatrix}, \]
in the case $J \ne \emptyset$ the identity (29) holds if and only if
\[ v_1\big(\sqrt{1-\nu} - \sqrt{\nu}\big) = -\sqrt{2\nu(1-\nu)}\,\big(\sqrt{1-\nu} + \sqrt{\nu}\big)^2 \sum_{k=1}^2 r_k\,\Im\zeta_k, \qquad v_2\big(\sqrt{1-\nu} + \sqrt{\nu}\big) = -\sqrt{2\nu(1-\nu)}\,\big(\sqrt{1-\nu} - \sqrt{\nu}\big)^2 \sum_{k=1}^2 r_k\,\Re\zeta_k. \tag{30} \]
On the other hand, when $J = \emptyset$, condition (29) is equivalent to $\sqrt{\nu}(v_1 + iv_2) = \sqrt{1-\nu}(v_1 - iv_2)$, i.e.
\[ v_1\big(\sqrt{1-\nu} - \sqrt{\nu}\big) = 0, \qquad v_2 = 0. \tag{31} \]
Therefore we have the following possible standard forms for $\mathcal{L}$.

Theorem 10. Let $L_1, L_2, L_3$ be as in (27), (28), $H = \sum_{j=1}^3 v_j\sigma_j$ with $v_1, v_2$ as in (30) and $v_3 \in \mathbb{R}$. The QMS $\mathcal{T}$ satisfies the SQDB-θ if and only if there exists a special GKSL representation of $\mathcal{L}$ given, up to phase factors multiplying $L_1, L_2, L_3$, in one of the following ways:
(o) $H$ with $v_1 = v_2 = 0$ if $\nu \ne 1/2$, and $v_1 \in \mathbb{R}$, $v_2 = 0$ if $\nu = 1/2$,
(a) $H, L_1$ with $r_1\zeta_1 \ne 0$,
(b) $H, L_3$ with $r_3 \ne 0$,
(c) $H, L_1, L_2$ with $r_1\zeta_1r_2\zeta_2 \ne 0$ and $r_1\zeta_2 \ne r_2\zeta_1$,
(d) $H, L_1, L_3$ with $r_3 \ne 0$ and $r_1\zeta_1 \ne 0$,
(e) $H, L_1, L_2, L_3$ with $r_1\zeta_2 \ne r_2\zeta_1$, $r_1\zeta_1r_2\zeta_2 \ne 0$ and $r_3 \ne 0$.
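As a sanity check on the role of (30) in Theorem 10, one can build $G = -iH - \frac12\sum_k L_k^*L_k$ from operators of the form (27), (28) and test the symmetry condition (29) directly. The sketch below is only an illustration: it reads (30) with squared factors $(\sqrt{1-\nu} \pm \sqrt{\nu})^2$ on the right-hand sides (as obtained by expanding (29) entry by entry), assumes $\nu \ne 1/2$ so that $v_1$ can be solved for, and uses hypothetical helper names and sample parameter values.

```python
import math


def dagger(a):
    """Conjugate transpose of a 2x2 nested list."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def matmul(a, b):
    """Product of two 2x2 nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def op_27(nu, r, zeta):
    """L = (1-2nu) r sigma_0 + r sigma_3 + zeta sigma_1^nu, as in (27)."""
    a, b = math.sqrt(2.0 * nu), math.sqrt(2.0 * (1.0 - nu))
    return [[2.0 * (1.0 - nu) * r, zeta * a], [zeta * b, -2.0 * nu * r]]


def op_28(nu, r3):
    """L_3 = r3 sigma_2^nu, as in (28)."""
    a, b = math.sqrt(2.0 * nu), math.sqrt(2.0 * (1.0 - nu))
    return [[0, -1j * r3 * a], [1j * r3 * b, 0]]


def v12_from_30(nu, pairs):
    """Solve (30) for (v1, v2); pairs = [(r1, zeta1), (r2, zeta2)], nu != 1/2."""
    s = math.sqrt(2.0 * nu * (1.0 - nu))
    p = math.sqrt(1.0 - nu) + math.sqrt(nu)
    q = math.sqrt(1.0 - nu) - math.sqrt(nu)
    im = sum(r * complex(z).imag for r, z in pairs)
    re = sum(r * complex(z).real for r, z in pairs)
    return -s * p ** 2 * im / q, -s * q ** 2 * re / p


def build_g(v, ops):
    """G = -iH - (1/2) sum_k L_k^* L_k with H = v1 s1 + v2 s2 + v3 s3."""
    v1, v2, v3 = v
    h = [[v3, v1 - 1j * v2], [v1 + 1j * v2, -v3]]
    total = [[0j, 0j], [0j, 0j]]
    for op in ops:
        prod = matmul(dagger(op), op)
        total = [[total[i][j] + prod[i][j] for j in range(2)] for i in range(2)]
    return [[-1j * h[i][j] - 0.5 * total[i][j] for j in range(2)]
            for i in range(2)]


def defect_29(nu, g):
    """sqrt(nu) g21 - sqrt(1-nu) g12, which vanishes exactly when (29) holds."""
    return math.sqrt(nu) * g[1][0] - math.sqrt(1.0 - nu) * g[0][1]


def check_case_e(nu, pairs, r3, v3, shift=0.0):
    """|defect| for case (e); `shift` perturbs v1 away from the value in (30)."""
    v1, v2 = v12_from_30(nu, pairs)
    ops = [op_27(nu, *pairs[0]), op_27(nu, *pairs[1]), op_28(nu, r3)]
    return abs(defect_29(nu, build_g((v1 + shift, v2, v3), ops)))


if __name__ == "__main__":
    sample = [(0.7, 1 + 2j), (-0.4, 0.5 - 1j)]
    print(check_case_e(0.3, sample, r3=1.1, v3=0.25))
    print(check_case_e(0.3, sample, r3=1.1, v3=0.25, shift=0.1))
```

With $v_1, v_2$ taken from (30) the defect in (29) should vanish up to rounding, while shifting $v_1$ away from that value makes it visibly non-zero; $v_3$ and $r_3$ only affect the diagonal of $G$ and play no role in (29).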
Roughly speaking, the standard form of $\mathcal{L}$ corresponds, up to degeneracies when some of the parameters vanish or when some linear dependence arises, to the case (e). We know that a QMS satisfying the usual (i.e. with pre-scalar product with $s = 0$) QDB-θ condition must commute with the modular group. Moreover, when this happens, the SQDB-θ and QDB-θ conditions are equivalent (see e.g. [6,11]). We finally show how the generators of QMSs on $M_2(\mathbb{C})$ satisfying the usual QDB-θ condition can be recovered by a special choice of the parameters $r_1, r_2, r_3, \zeta_1, \zeta_2$ in Theorem 10 describing the generator of a QMS satisfying the SQDB-θ condition.
To this end, we recall that $\mathcal{T}$ fulfills the QDB-θ when $\mathrm{tr}(\rho x\mathcal{T}_t(y)) = \mathrm{tr}(\rho\,\theta y^*\theta\,\mathcal{T}_t(\theta x^*\theta))$ for all $x, y \in \mathcal{B}(\mathsf{h})$. In [11] we classified generators of QMS on $M_2(\mathbb{C})$ satisfying the QDB condition without time reversal (i.e., formally, replacing θ by the identity operator, that is, of course, not antiunitary). The same type of arguments show that, disregarding trivialisations that may occur when some of the parameters below vanish, QMSs on $M_2(\mathbb{C})$ satisfying the QDB-θ condition have the following standard form
\[ \mathcal{L}(x) = i[H, x] - \frac{|\eta|^2}{2}\big(L^2x - 2LxL + xL^2\big) - \frac{|\lambda|^2}{2}\big(\sigma^-\sigma^+x - 2\sigma^-x\sigma^+ + x\sigma^-\sigma^+\big) - \frac{|\mu|^2}{2}\big(\sigma^+\sigma^-x - 2\sigma^+x\sigma^- + x\sigma^+\sigma^-\big), \tag{32} \]
where $H = h_0\sigma_0 + h_3\sigma_3$ ($h_0, h_3 \in \mathbb{R}$), $L = -(2\nu - 1)\sigma_0 + \sigma_3$, $\sigma^\pm = (\sigma_1 \pm i\sigma_2)/2$ and, changing phases if necessary, $\lambda, \mu, \eta$ can be chosen as non-negative real numbers satisfying
\[ \lambda^2(1 - \nu) = \nu\mu^2. \tag{33} \]
Choosing $r_1 = \eta$, $\zeta_1 = 0$ we find immediately that the operator $L$ in (32) coincides with the operator $L_1$ in (27). Moreover, choosing $r_2 = 0$ we find $v_2 = 0$ and also $v_1 = 0$ for $\nu \ne 1/2$. A straightforward computation yields
\[ \begin{pmatrix} \lambda\sigma^+ \\ \mu\sigma^- \end{pmatrix} = \begin{pmatrix} \lambda/(2\zeta_2\sqrt{2\nu}) & i\lambda/(2r_3\sqrt{2\nu}) \\ \mu/(2\zeta_2\sqrt{2(1-\nu)}) & -i\mu/(2r_3\sqrt{2(1-\nu)}) \end{pmatrix} \begin{pmatrix} L_2 \\ L_3 \end{pmatrix} \]
and the above $2 \times 2$ matrix is unitary if we choose $\zeta_2 = \lambda/(2\sqrt{\nu})$, $r_3 = i\mu/(2\sqrt{1-\nu}) = i\zeta_2$ because of (33) and changing the phase of $r_3$ in order to find a unitary that is also self-adjoint. This shows that we can recover the standard form (32) choosing $H, L_1, L_2, L_3$ as in Theorem 10 (e) with $r_1 = \eta$, $\zeta_1 = 0$, $r_2 = 0$, $\zeta_2 = \lambda/(2\sqrt{\nu})$, $r_3 = i\mu/(2\sqrt{1-\nu})$, $v_1 = v_2 = 0$.

Appendix

We denote by $\ell^2(J)$ the Hilbert space of complex-valued, square summable sequences indexed by a finite or countable set $J$.

Lemma 3. Let $\mathcal{J}$ be a complex separable Hilbert space and let $(\xi_j)_{j\in J}$, $(\eta_j)_{j\in J}$ be two Hilbertian bases of $\mathcal{J}$ satisfying $\sum_{j\in J}\|\xi_j\|^2 < \infty$, $\sum_{j\in J}\|\eta_j\|^2 < \infty$. The complex matrices $A = (a_{jk})_{j,k\in J}$, $B = (b_{jk})_{j,k\in J}$, $C = (c_{jk})_{j,k\in J}$ given by
\[ a_{jk} = \langle\xi_j, \xi_k\rangle, \qquad b_{jk} = \langle\xi_j, \eta_k\rangle, \qquad c_{jk} = \langle\eta_j, \eta_k\rangle \]
define trace class operators on $\ell^2(J)$ satisfying $B^*A^{-1}B = C$. Moreover $A$ and $C$ are self-adjoint and positive.

Proof. Note that
\[ \sum_{j,k\ge 1} |b_{jk}|^2 \le \sum_{j,k\ge 1} \|\xi_j\|^2\cdot\|\eta_k\|^2 = \sum_j \|\xi_j\|^2\cdot\sum_k \|\eta_k\|^2 < \infty. \]
Therefore $B$ defines a Hilbert-Schmidt operator on $\ell^2(J)$.
In a similar way $A$ and $C$ define Hilbert-Schmidt operators on $\ell^2(J)$ that are obviously self-adjoint. These are also positive because for any sequence $(z_m)_{m\in J}$ of complex numbers with $z_m \ne 0$ for a finite number of indices $m$ at most we have
\[ \sum_{m,n\in J} \bar z_m a_{mn} z_n = \sum_{m,n\in J} \bar z_m\langle\xi_m, \xi_n\rangle z_n = \Big\|\sum_{m\in J} z_m\xi_m\Big\|^2 \ge 0. \]
Moreover, they are trace class because
\[ \sum_{j\in J} a_{jj} = \sum_{j\in J} \|\xi_j\|^2 < \infty, \qquad \sum_{j\in J} c_{jj} = \sum_{j\in J} \|\eta_j\|^2 < \infty. \]
Finally, we show that $B$ is also trace class. By the spectral theorem, we can find a unitary $V = (v_{kj})_{k,j\in J}$ on $\ell^2(J)$ such that $V^*AV$ is diagonal. The series $\sum_{m\in J} v_{mj}\xi_m$ is norm convergent because
\[ \Big\|\sum_m v_{mj}\xi_m\Big\|^2 = \sum_{m,n\in J} \bar v_{nj}\,a_{nm}\,v_{mj} = (V^*AV)_{jj}. \]
The series $\sum_{m\in J} v_{mj}\eta_m$ is norm convergent as well for a similar reason. Therefore, putting $\xi'_j = \sum_{m\in J} v_{mj}\xi_m$ and $\eta'_j = \sum_{m\in J} v_{mj}\eta_m$, we find immediately $(V^*AV)_{kj} = \langle\xi'_k, \xi'_j\rangle = 0$ for $j \ne k$, $(V^*AV)_{jj} = \|\xi'_j\|^2$ and
\[ (V^*BV)_{kj} = \sum_{m,n} \bar v_{mk}v_{nj}\langle\xi_m, \eta_n\rangle = \langle\xi'_k, \eta'_j\rangle, \qquad (V^*CV)_{kj} = \sum_{m,n} \bar v_{mk}v_{nj}\langle\eta_m, \eta_n\rangle = \langle\eta'_k, \eta'_j\rangle. \]
As a consequence, the following identity
\[ \big(V^*B^*A^{-1}BV\big)_{kj} = \big((V^*B^*V)(V^*AV)^{-1}(V^*BV)\big)_{kj} = \sum_{m\in J} (V^*B^*V)_{km}\,(V^*AV)^{-1}_{mm}\,(V^*BV)_{mj} = \sum_{m\in J} \Big\langle\eta'_k, \frac{\xi'_m}{\|\xi'_m\|}\Big\rangle\Big\langle\frac{\xi'_m}{\|\xi'_m\|}, \eta'_j\Big\rangle = \langle\eta'_k, \eta'_j\rangle = (V^*CV)_{kj} \]
holds because $(\xi'_m/\|\xi'_m\|)_{m\in J}$ is an orthonormal basis of $\mathcal{J}$. This proves that $V^*B^*A^{-1}BV = V^*CV$, i.e. $B^*A^{-1}B = C$. It follows that $|A^{-1/2}B| = C^{1/2}$ is Hilbert-Schmidt as well as $A^{-1/2}B$, and $B = A^{1/2}(A^{-1/2}B)$ is trace class, being the product of two Hilbert-Schmidt operators.

Acknowledgements. The financial support from the MIUR PRIN 2007 project "Quantum Probability and Applications to Information Theory" is gratefully acknowledged.
References
1. Albeverio, S., Goswami, D.: A remark on the structure of symmetric quantum dynamical semigroups. Inf. Dim. Anal. Quant. Prob. Relat. Top. 5, 571–579 (2002)
2. Accardi, L., Imafuku, K.: Dynamical detailed balance and local KMS condition for non-equilibrium states. Int. J. Mod. Phys. B 18(4-5), 435–467 (2004)
3. Accardi, L., Mohari, A.: Time reflected Markov processes. Inf. Dim. Anal. Quant. Prob. Relat. Top. 2, 397–426 (1999)
4. Agarwal, G.S.: Open quantum Markovian systems and the microreversibility. Z. Physik 258(5), 409–422 (1973)
5. Alicki, R.: On the detailed balance condition for non-Hamiltonian systems. Rep. Math. Phys. 10, 249–258 (1976)
6. Cipriani, F.: Dirichlet forms and Markovian semigroups on standard forms of von Neumann algebras. J. Funct. Anal. 147, 259–300 (1997)
7. Cipriani, F.: Dirichlet forms on noncommutative spaces. In: Quantum Potential Theory, Lecture Notes in Math., 1954, Berlin-Heidelberg-New York: Springer, 2008, pp. 161–276
8. Davies, E.B., Lindsay, J.M.: Non-commutative symmetric Markov semigroups. Math. Z. 210, 379–411 (1992)
9. Dereziński, J., Früboes, R.: Fermi golden rule and open quantum systems. In: S. Attal et al. (eds.) Open Quantum Systems III, Lecture Notes in Mathematics 1882, Berlin-Heidelberg-New York: Springer, 2006, pp. 67–116
10. Fagnola, F., Umanità, V.: Detailed balance, time reversal and generators of quantum Markov semigroups, M. Zametki, 84 (1) 108–116 (2008) (Russian); translation Math. Notes 84 (1–2), 108–115 (2008)
11. Fagnola, F., Umanità, V.: Generators of detailed balance quantum Markov semigroups. Inf. Dim. Anal. Quant. Prob. Relat. Top. 10(3), 335–363 (2007)
12. Fagnola, F., Umanità, V.: On two quantum versions of the detailed balance condition. To appear in: Noncommutative harmonic analysis with applications to probability, M. Bozejko, et al. eds., Banach Center Publications, Polish Academy of Sciences 2009
13. Fukushima, M., Oshima, Y., Takeda, M.: Dirichlet Forms and Symmetric Markov Processes, de Gruyter Studies in Mathematics 19, Berlin: de Gruyter, 1994
14. Goldstein, S., Lindsay, J.M.: Beurling-Deny condition for KMS-symmetric dynamical semigroups. C. R. Acad. Sci. Paris 317, 1053–1057 (1993)
15. Goldstein, S., Lindsay, J.M.: KMS symmetric semigroups. Math. Z. 219, 591–608 (1995)
16. Gorini, V., Kossakowski, A., Sudarshan, E.C.G.: Completely positive dynamical semigroups of N-level systems. J. Math. Phys. 17, 821–825 (1976)
17. Guido, D., Isola, T., Scarlatti, S.: Non-symmetric Dirichlet forms on semifinite von Neumann algebras. J. Funct. Anal. 135(1), 50–75 (1996)
18. Kossakowski, A., Frigerio, A., Gorini, V., Verri, M.: Quantum detailed balance and KMS condition. Commun. Math. Phys. 57, 97–110 (1977)
19. Lindblad, G.: On the generators of quantum dynamical semigroups. Commun. Math. Phys. 48, 119–130 (1976)
20. Majewski, W.A.: On the relationship between the reversibility of detailed balance conditions. Ann. Inst. Henri Poincaré, A 39, 45–54 (1983)
21. Majewski, W.A.: The detailed balance condition in quantum statistical mechanics. J. Math. Phys. 25(3), 614–616 (1984)
22. Majewski, W.A., Streater, R.F.: Detailed balance and quantum dynamical maps. J. Phys. A: Math. Gen. 31, 7981–7995 (1998)
23. Park, Y.M.: Remarks on the structure of Dirichlet forms on standard forms of von Neumann Algebras. Inf. Dim. Anal. Quant. Prob. Rel. Top. 8, 179–197 (2005)
24. Parthasarathy, K.R.: An Introduction to Quantum Stochastic Calculus. Monographs in Mathematics 85, Basel: Birkhäuser-Verlag, 1992
25. Petz, D.: Conditional expectation in quantum probability. In: L. Accardi and W. von Waldenfels (eds.), Quantum Probability and Applications III. Proceedings, Oberwolfach 1987. LNM 1303 Berlin-Heidelberg-New York: Springer, 1988, pp. 251–260
26. Sauvageot, J.L.: Quantum Dirichlet forms, differential calculus and semigroups. In: L. Accardi, W. von Waldenfels (eds.), Quantum Probability and Applications V. LNM 1442, Berlin-Heidelberg-New York: Springer, 1990, pp. 334–346
27. Talkner, P.: The failure of the quantum regression hypothesis. Ann. Phys. 167(2), 390–436 (1986)

Communicated by M.B. Ruskai
Commun. Math. Phys. 298, 549–572 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1044-5
Communications in
Mathematical Physics
Random Matrices: Universality of Local Eigenvalue Statistics up to the Edge
Terence Tao1,∗, Van Vu2,∗∗
1 Department of Mathematics, UCLA, Los Angeles CA 90095-1555, USA. E-mail: [email protected]
2 Department of Mathematics, Rutgers, Piscataway, NJ 08854, USA. E-mail: [email protected]
Received: 13 August 2009 / Accepted: 8 January 2010 Published online: 3 April 2010 – © The Author(s) 2010. This article is published with open access at Springerlink.com
Abstract: This is a continuation of our earlier paper (Tao and Vu, http://arxiv.org/abs/0908.1982v4 [math.PR], 2010) on the universality of the eigenvalues of Wigner random matrices. The main new result of this paper is an extension of the results in Tao and Vu (http://arxiv.org/abs/0908.1982v4 [math.PR], 2010) from the bulk of the spectrum up to the edge. In particular, we prove a variant of the universality results of Soshnikov (Commun Math Phys 207(3):697–733, 1999) for the largest eigenvalues, assuming moment conditions rather than symmetry conditions. The main new technical observation is that there is a significant bias in the Cauchy interlacing law near the edge of the spectrum which allows one to continue ensuring the delocalization of eigenvectors.

∗ T. Tao is supported by a grant from the MacArthur Foundation, by NSF grant DMS-0649473, and by the NSF Waterman award.
∗∗ V. Vu is supported by research grants DMS-0901216 and AFOSAR-FA-9550-09-1-0167.

1. Introduction

In our recent paper [27], a universality result (the Four Moment Theorem) was established for the eigenvalue spacings in the bulk of the spectrum of random Hermitian matrices. (See [6] for an extended discussion of the universality phenomenon, and [27] for further references on universality results in the context of Wigner Hermitian matrices.) The main purpose of this paper is to extend this universality result to the edge of the spectrum as well.

1.1. Universality in the bulk. To recall the Four Moment Theorem, we need some notation.

Definition 1.1 (Condition C0). A random Hermitian matrix $M_n = (\zeta_{ij})_{1\le i,j\le n}$ is said to obey condition C0 if
• The $\zeta_{ij}$ are independent (but not necessarily identically distributed) for $1 \le i \le j \le n$. For $1 \le i < j \le n$, they have mean zero and variance 1; for $i = j$, they have mean zero and variance $c$ for some fixed $c > 0$ independent of $n$.
• (Uniform exponential decay) There exist constants $C, C' > 0$ such that
\[ \mathbf{P}(|\zeta_{ij}| \ge t^C) \le \exp(-t) \tag{1} \]
for all $t \ge C'$ and $1 \le i, j \le n$.

Examples of random Hermitian matrices obeying Condition C0 include the GUE and GOE ensembles, or the random symmetric Bernoulli ensemble in which each of the $\zeta_{ij}$ are equal to ±1 with equal probability 1/2. In GOE one has $c = 2$, but in the other two cases one has $c = 1$. The arguments in the previous paper [27] were largely phrased for the case $c = 1$, but it is not difficult to see that the arguments extend without difficulty to other values of $c$ (the main point being that a modification of the variance of a single entry of a row vector does not significantly affect the Talagrand concentration inequality, [27, Lemma 43], or Lemma 2.1 below).

Given an $n \times n$ Hermitian matrix $A$, we denote its $n$ eigenvalues as $\lambda_1(A) \le \cdots \le \lambda_n(A)$, and write $\lambda(A) := (\lambda_1(A), \ldots, \lambda_n(A))$. We also let $u_1(A), \ldots, u_n(A) \in \mathbb{C}^n$ be an orthonormal basis of eigenvectors of $A$ with $Au_i(A) = \lambda_i(A)u_i(A)$; these eigenvectors $u_i(A)$ are only determined up to a complex phase even when the eigenvalues are simple, but this ambiguity will not cause a difficulty in our results as we will only be interested in the magnitude $|u_i(A)^*X|$ of various inner products $u_i(A)^*X$ of $u_i(A)$ with other vectors $X$.

It will be convenient to introduce the following notation for frequent events depending on $n$, in increasing order of likelihood:

Definition 1.2 (Frequent events). Let $E$ be an event depending on $n$.
• $E$ holds asymptotically almost surely$^1$ if $\mathbf{P}(E) = 1 - o(1)$.
• $E$ holds with high probability if $\mathbf{P}(E) \ge 1 - O(n^{-c})$ for some constant $c > 0$.
• $E$ holds with overwhelming probability if $\mathbf{P}(E) \ge 1 - O_C(n^{-C})$ for every constant $C > 0$ (or equivalently, that $\mathbf{P}(E) \ge 1 - \exp(-\omega(\log n))$).
• $E$ holds almost surely if $\mathbf{P}(E) = 1$.

Definition 1.3 (Moment matching). We say that two complex random variables $\zeta$ and $\zeta'$ match to order $k$ if
\[ \mathbf{E}\,\mathrm{Re}(\zeta)^m\,\mathrm{Im}(\zeta)^l = \mathbf{E}\,\mathrm{Re}(\zeta')^m\,\mathrm{Im}(\zeta')^l \]
for all $m, l \ge 0$ such that $m + l \le k$.
The first main result of [27] can now be stated as follows:

Theorem 1.4 (Four Moment Theorem) [27, Theorem 15]. There is a small positive constant $c_0$ such that for every $0 < \varepsilon < 1$ and $k \ge 1$ the following holds. Let $M_n = (\zeta_{ij})_{1 \le i,j \le n}$ and $M'_n = (\zeta'_{ij})_{1 \le i,j \le n}$ be two random matrices satisfying C0. Assume furthermore that for any $1 \le i < j \le n$, $\zeta_{ij}$ and $\zeta'_{ij}$ match to order 4 and for any $1 \le i \le n$,

¹ See Sect. 1.4 for our conventions on asymptotic notation.
Universality up to the Edge
$\zeta_{ii}$ and $\zeta'_{ii}$ match to order 2. Set $A_n := \sqrt{n} M_n$ and $A'_n := \sqrt{n} M'_n$, and let $G : \mathbb{R}^k \to \mathbb{R}$ be a smooth function obeying the derivative bounds
$$|\nabla^j G(x)| \le n^{c_0} \tag{2}$$
for all $0 \le j \le 5$ and $x \in \mathbb{R}^k$. Then for any $\varepsilon n \le i_1 < i_2 < \cdots < i_k \le (1 - \varepsilon)n$, and for $n$ sufficiently large depending on $\varepsilon, k$ (and the constants $C, C'$ in Definition 1.1) we have
$$|E(G(\lambda_{i_1}(A_n), \ldots, \lambda_{i_k}(A_n))) - E(G(\lambda_{i_1}(A'_n), \ldots, \lambda_{i_k}(A'_n)))| \le n^{-c_0}. \tag{3}$$
If $\zeta_{ij}$ and $\zeta'_{ij}$ only match to order 3 rather than 4, then there is a positive constant $C$ independent of $c_0$ such that the conclusion (3) still holds provided that one strengthens (2) to $|\nabla^j G(x)| \le n^{-Cjc_0}$ for all $0 \le j \le 5$ and $x \in \mathbb{R}^k$.

Informally, this theorem asserts that the distribution of any bounded number of eigenvalues in the bulk of the spectrum of a random Hermitian matrix obeying condition C0 depends only on the first four moments of the coefficients. There is also a useful companion result to Theorem 1.4, which is used both in the proof of that theorem, and in several of its applications:

Theorem 1.5 (Lower tail estimates) [27, Theorem 17]. Let $0 < \varepsilon < 1$ be a constant, and let $M_n$ be a random matrix obeying Condition C0. Set $A_n := \sqrt{n} M_n$. Then for every $c_0 > 0$, and for $n$ sufficiently large depending on $\varepsilon$, $c_0$ and the constants $C, C'$ in Definition 1.1, and for each $\varepsilon n \le i \le (1 - \varepsilon)n$, one has $\lambda_{i+1}(A_n) - \lambda_i(A_n) \ge n^{-c_0}$ with high probability. In fact, one has
$$P(\lambda_{i+1}(A_n) - \lambda_i(A_n) \le n^{-c_0}) \le n^{-c_1}$$
for some $c_1 > 0$ depending on $c_0$ (and independent of $\varepsilon$).

Theorem 1.4 (and to a lesser extent, Theorem 1.5) can be used to extend the range of applicability for various results on eigenvalue statistics in the bulk for Hermitian or symmetric matrices, for instance in extending results for special ensembles such as GUE or GOE (or ensembles obeying some regularity or divisibility conditions) to more general classes of matrices. See [27,13,10] for some examples of this type of extension. We also remark that a level repulsion estimate which has a similar spirit to Theorem 1.5 was established in [9, Theorem 3.5], although the latter result establishes repulsion of eigenvalues in a fixed small interval $I$, rather than at a fixed index $i$ of the sequence of eigenvalues, and does not seem to be directly substitutable for Theorem 1.5 in the arguments of this paper.
The results of Theorem 1.4 and Theorem 1.5 only control eigenvalues $\lambda_i(A_n)$ in the bulk region $\varepsilon n \le i \le (1 - \varepsilon)n$ for some fixed $\varepsilon > 0$ (independent of $n$). The reason for this restriction was technical, and originated from the use of the following two related results (which are variants of previous results of Erdős, Schlein, and Yau [7–9]), whose proof relied on the assumption that one was in the bulk:
T. Tao, V. Vu
Theorem 1.6 (Concentration for ESD) [27, Theorem 56]. For any $\varepsilon, \delta > 0$ and any random Hermitian matrix $M_n = (\zeta_{ij})_{1 \le i,j \le n}$ whose upper-triangular entries are independent with mean zero and variance 1, and such that $|\zeta_{ij}| \le K$ almost surely for all $i, j$ and some $1 \le K \le n^{1/2 - \varepsilon}$, and any interval $I$ in $[-2 + \varepsilon, 2 - \varepsilon]$ of width $|I| \ge \frac{K^2 \log^{20} n}{n}$, the number of eigenvalues $N_I$ of $W_n := \frac{1}{\sqrt{n}} M_n$ in $I$ obeys the concentration estimate
$$\left| N_I - n \int_I \rho_{sc}(x)\,dx \right| \ll \delta n |I|$$
with overwhelming probability, where $\rho_{sc}$ is the semicircular distribution
$$\rho_{sc}(x) := \begin{cases} \frac{1}{2\pi}\sqrt{4 - x^2}, & |x| \le 2 \\ 0, & |x| > 2. \end{cases} \tag{4}$$
In particular, $N_I = \Theta_\varepsilon(n|I|)$ with overwhelming probability.

Proposition 1.7 (Delocalization of eigenvectors) [27, Prop. 58]. Let $\varepsilon, M_n, W_n, \zeta_{ij}, K$ be as in Theorem 1.6. Then for any $1 \le i \le n$ with $\lambda_i(W_n) \in [-2 + \varepsilon, 2 - \varepsilon]$, if $u_i(W_n)$ denotes a unit eigenvector corresponding to $\lambda_i(W_n)$, then with overwhelming probability each coordinate of $u_i(M_n)$ is $O_\varepsilon\!\left(\frac{K^2 \log^{20} n}{n^{1/2}}\right)$.

In the bulk region $[-2 + \varepsilon, 2 - \varepsilon]$, the semicircular function $\rho_{sc}$ is bounded away from zero. Thus, Theorem 1.6 ensures that the eigenvalues of $W_n$ in the bulk tend to have a mean spacing of $\Theta_\varepsilon(1/n)$ on the average. Applying the Cauchy interlacing law
$$\lambda_i(W_n) \le \lambda_i(W_{n-1}) \le \lambda_{i+1}(W_n), \tag{5}$$
where $W_{n-1}$ is an $(n-1) \times (n-1)$ minor of $W_n$, this implies that the bulk eigenvalues of $W_{n-1}$ are within $O_\varepsilon(1/n)$ of the corresponding eigenvalues of $W_n$ on the average. Using linear algebra to express the coordinates of the eigenvector $u_i(M_n)$ in terms of $W_n$ and a minor $W_{n-1}$ (see Lemma 4.1 below), and using some concentration of measure results concerning the projection of a random vector to a subspace (see Lemma 2.1), we eventually obtain Proposition 1.7.

1.2. Universality up to the edge. The main results of this paper are that the above four theorems can be extended to the edge of the spectrum (thus effectively sending $\varepsilon$ to zero). Let us now state these results more precisely. Firstly, we have the following extension of Theorem 1.6:

Theorem 1.8 (Concentration for ESD up to edge). Consider a random Hermitian matrix $M_n = (\zeta_{ij})_{1 \le i,j \le n}$ whose upper-triangular entries are independent with mean zero and variance 1, and such that $|\zeta_{ij}| \le K$ almost surely for all $i, j$ and some $K \ge 1$. Let $0 < \delta < 1/2$ be a quantity which can depend on $n$, and let $I$ be an interval such that
$$|I| \ge \frac{K^2 \log^4 n}{n \delta^{10}}.$$
We also make the mild assumption $K = o(n^{1/2}\delta^2)$. Then the number of eigenvalues $N_I$ of $W_n := \frac{1}{\sqrt{n}} M_n$ in $I$ obeys the concentration estimate
$$\left| N_I - n \int_I \rho_{sc}(x)\,dx \right| \ll \delta n |I|$$
with overwhelming probability.
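As a rough numerical illustration of the concentration estimate (not a proof, and not taken from the paper), the sketch below counts the eigenvalues of a symmetric Bernoulli matrix in an interval touching the edge and compares the count with $n \int_I \rho_{sc}$. The function names, the matrix size and the seed are arbitrary assumptions.

```python
import numpy as np

def semicircle_mass(a, b, grid=20001):
    """Integral of rho_sc over [a, b], by the trapezoid rule on [-2, 2] ∩ [a, b]."""
    lo, hi = max(a, -2.0), min(b, 2.0)
    if hi <= lo:
        return 0.0
    x = np.linspace(lo, hi, grid)
    rho = np.sqrt(np.maximum(4.0 - x**2, 0.0)) / (2.0 * np.pi)
    return float(np.sum((rho[1:] + rho[:-1]) * np.diff(x)) / 2.0)

def eigenvalue_count(n, a, b, seed=0):
    """Number of eigenvalues of W_n = M_n / sqrt(n) in [a, b], M_n symmetric Bernoulli."""
    rng = np.random.default_rng(seed)
    signs = rng.choice([-1.0, 1.0], size=(n, n))
    upper = np.triu(signs)                 # keep the upper triangle, diagonal included
    m = upper + np.triu(upper, 1).T        # reflect the strict upper triangle
    eig = np.linalg.eigvalsh(m / np.sqrt(n))
    return int(np.count_nonzero((eig >= a) & (eig <= b)))

n, a, b = 400, 1.0, 2.0
observed = eigenvalue_count(n, a, b)
predicted = n * semicircle_mass(a, b)
print(observed, round(predicted, 1))
```

For a single moderate $n$ this only shows that the count is close to the semicircle prediction; the theorem itself concerns the regime $n \to \infty$ with $\delta$ and $|I|$ allowed to shrink.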
Remark 1.9. The powers of $K$, $\delta$ and $\log n$ here are probably not best possible, but this will not be relevant for our purposes. In our applications $K$ will be a power of $\log n$, and $\delta$ will be a negative power of $\log n$. (This allows the error term $O(\delta n |I|)$ in the above estimate for $N_I$ to exceed the main term $n \int_I \rho_{sc}(x)\,dx$ when one is very near the edge, but this will not impact our arguments.)

We prove this theorem in Sect. 3, using the same (standard) Stieltjes transform method that was used to prove Theorem 1.6 in [27] (see also [9]), with a somewhat more careful analysis. We next use it to obtain the following extension of Proposition 1.7:

Proposition 1.10 (Delocalization of eigenvectors up to the edge). Let $M_n$ be a random matrix obeying Condition C0. Then with overwhelming probability, every unit eigenvector $u_i(M_n)$ of $M_n$ has coefficients at most $n^{-1/2} \log^{O(1)} n$, thus
$$\sup_{1 \le i, j \le n} |u_i(M_n)^* e_j| \ll n^{-1/2} \log^{O(1)} n,$$
where $e_1, \ldots, e_n$ is the standard basis.

The deduction of Proposition 1.10 from Theorem 1.8 differs significantly from the analogous deduction of Proposition 1.7 from Theorem 1.6 in [27]. The main difference is that in the current case $\rho_{sc}$ is no longer bounded away from zero, which causes the average eigenvalue spacing between $\lambda_i(W_n)$ and $\lambda_{i+1}(W_n)$ to be considerably larger than $1/n$. For instance, the gap between the second largest eigenvalue $\lambda_{n-1}(W_n)$ and the largest eigenvalue $\lambda_n(W_n)$ is typically of size $n^{-2/3}$. The key new ingredient that helps us to deal with this problem is the following observation: the Cauchy interlacing law (5), when applied to the eigenvalues at the edge, is strongly biased. In particular, the gap between $\lambda_i(W_{n-1})$ and $\lambda_i(W_n)$ is significantly smaller than the gap between $\lambda_i(W_{n-1})$ and $\lambda_{i+1}(W_n)$. We can show that (with high probability), the first gap is of order $n^{-1+o(1)}$ while the second can be as large as $n^{-2/3}$ (and similarly for the gap between $\lambda_{i+1}(W_n)$ and $\lambda_i(W_{n-1})$ when $n/2 \le i \le n$). This new ingredient will be sufficient to recover Proposition 1.10; see Sect. 4, where the above proposition is proved.

Using Theorem 1.8 and Proposition 1.10, one can continue the arguments from [27] to establish the following extensions of Theorem 1.4 and Theorem 1.5:

Theorem 1.11 (Four Moment Theorem up to the edge). There is a small positive constant $c_0$ such that for every $k \ge 1$ the following holds. Let $M_n = (\zeta_{ij})_{1 \le i,j \le n}$ and $M'_n = (\zeta'_{ij})_{1 \le i,j \le n}$ be two random matrices satisfying C0. Assume furthermore that for any $1 \le i < j \le n$, $\zeta_{ij}$ and $\zeta'_{ij}$ match to order 4 and for any $1 \le i \le n$, $\zeta_{ii}$ and $\zeta'_{ii}$ match to order 2. Set $A_n := \sqrt{n} M_n$ and $A'_n := \sqrt{n} M'_n$, and let $G : \mathbb{R}^k \to \mathbb{R}$ be a smooth function obeying the derivative bounds (2) for all $0 \le j \le 5$ and $x \in \mathbb{R}^k$. Then for any $1 \le i_1 < i_2 < \cdots < i_k \le n$, and for $n$ sufficiently large depending on $k$ (and the constants $C, C'$ in Definition 1.1) we have (3).

If $\zeta_{ij}$ and $\zeta'_{ij}$ only match to order 3 rather than 4, then there is a positive constant $C$ independent of $c_0$ such that the conclusion (3) still holds provided that one strengthens (2) to
$$|\nabla^j G(x)| \le n^{-Cjc_0} \tag{6}$$
for all $0 \le j \le 5$ and $x \in \mathbb{R}^k$.
Theorem 1.12 (Lower tail estimates up to the edge). Let $M_n$ be a random matrix obeying Condition C0. Set $A_n := \sqrt{n} M_n$. Then for every $c_0 > 0$, and for $n$ sufficiently large depending on $c_0$ and the constants $C, C'$ in Definition 1.1, and for each $1 \le i \le n$, one has $\lambda_{i+1}(A_n) - \lambda_i(A_n) \ge n^{-c_0}$ with high probability, uniformly in $i$.

The novelty here is that we have no assumption on the indices $i_j$ and $i$. We present the proof of these theorems in Sects. 5, 6, following the arguments in [27] closely.

1.3. Applications. As Theorems 1.11, 1.12 extend Theorems 1.4, 1.5, all the applications of the latter theorems in [27] (concerning the bulk of the spectrum) can also be viewed as applications of these theorems. But because these results extend all the way to the edge, we can now obtain some results on the edge of the spectrum as well. For instance, we can prove

Theorem 1.13. Let $k$ be a fixed integer and $M_n$ be a matrix obeying Condition C0, and suppose that the real and imaginary part of the atom variables have the same covariance matrix as the GUE ensemble (i.e. both components have variance 1/2, and have covariance 0). Assume furthermore that all third moments of the atom variables vanish. Set $W_n := \frac{1}{\sqrt{n}} M_n$. Then the joint distribution of the $k$-dimensional random vector
$$\left( (\lambda_n(W_n) - 2) n^{2/3}, \ldots, (\lambda_{n-k+1}(W_n) - 2) n^{2/3} \right) \tag{7}$$
has a weak limit as $n \to \infty$, which coincides with that in the GUE case (in particular, the limit for $k = 1$ is the GUE Tracy-Widom distribution [28], and for higher $k$ is controlled by the Airy kernel [14]). The result also holds for the smallest eigenvalues $\lambda_1, \ldots, \lambda_k$, with the offset $-2$ replaced by $+2$. If the atom variables have the same covariance matrix as the GOE ensemble (i.e. they are real with variance 1 off the diagonal, and 2 on the diagonal), instead of the GUE ensemble, then the same conclusion applies but with the GUE distribution replaced of course by the GOE distribution (see [29] for the $k = 1$ case).
This result was previously established by Soshnikov [25] (see also [23,24]) in the case when $M_n$ is a Wigner Hermitian matrix (i.e. the off-diagonal entries are iid, and the matrix matches GUE to second order at least) with symmetric distribution (which implies, but is stronger than, matching to third order). For some additional partial results in the non-symmetric case see [20,21]. The exponential decay condition in Soshnikov's result has been lowered to a finite number of moments; see [22,18]. It is reasonable to conjecture that the exponential decay conditions in this current paper can similarly be lowered, but we will not pursue this issue here. It also seems plausible that the third moment matching conditions could be dropped, though this is barely beyond the reach of the current method.²

Proof. We just prove the claim for the largest $k$ eigenvalues and for GUE, as the claim for the smallest $k$ and/or GOE is similar. Set $A_n := \sqrt{n} M_n$. It suffices to show that for every smooth function $G : \mathbb{R}^k \to \mathbb{R}$, the expectation
$$E\,G\!\left( (\lambda_n(A_n) - 2n)/n^{1/3}, \ldots, (\lambda_{n-k+1}(A_n) - 2n)/n^{1/3} \right) \tag{8}$$

² Note added in proof. The third moment condition has recently been dropped in [16], by combining the four moment theorem here with a new proof of universality for the distribution of the largest eigenvalue for Gauss divisible matrices.
only changes by $o(1)$ when the matrix $M_n$ is replaced with GUE. But this follows from the final conclusion of Theorem 1.11, thanks to the extra factor $n^{-1/3}$.
Remark 1.14. Notice that there is some room to spare in this argument, as the $n^{-1/3}$ gain in (8) is far more than is needed for (6). Because of this, one can obtain similar universality results for suitably normalised eigenvalues $\lambda_i(A_n)$ with $i \le n^{1-\varepsilon}$ or $i \ge n - n^{1-\varepsilon}$ for any $\varepsilon > 0$ (where the normalisation factor for $\lambda_i(A_n)$ is now $n^{2/3} \min(i, n - i)^{1/3}$, and the offset $-2$ is replaced by $-t$, where $\int_{-2}^{t} \rho_{sc}(x)\,dx = \frac{i}{n}$). We omit the details.

Remark 1.15. In analogy with [13], one should be able to drop the third moment condition in Theorem 1.13 if one can control the distribution of the largest (or smallest) eigenvalues from random matrices obtained from a suitable Ornstein-Uhlenbeck process, as in [12].

1.4. Notation. We consider $n$ as an asymptotic parameter tending to infinity. We use $X \ll Y$, $Y \gg X$, $Y = \Omega(X)$, or $X = O(Y)$ to denote the bound $X \le CY$ for all sufficiently large $n$ and for some constant $C$. Notations such as $X \ll_k Y$, $X = O_k(Y)$ mean that the hidden constant $C$ depends on another constant $k$. $X = o(Y)$ or $Y = \omega(X)$ means that $X/Y \to 0$ as $n \to \infty$; the rate of decay here will be allowed to depend on other parameters. We write $X = \Theta(Y)$ for $Y \ll X \ll Y$.

We view vectors $x \in \mathbb{C}^n$ as column vectors. The Euclidean norm of a vector $x \in \mathbb{C}^n$ is defined as $\|x\| := (x^* x)^{1/2}$. Eigenvalues are always ordered in increasing order, thus for instance $\lambda_n(A_n)$ is the largest eigenvalue of a Hermitian matrix $A_n$, and $\lambda_1(A_n)$ is the smallest.

2. General Tools

In this section we record some general tools (proven in [27]) which we will use repeatedly in the sequel. We begin with a very useful concentration of measure result that describes the projection of a random vector to a subspace.

Lemma 2.1 (Projection Lemma). Let $X = (\xi_1, \ldots, \xi_n) \in \mathbb{C}^n$ be a random vector whose entries are independent with mean zero, variance 1, and are bounded in magnitude by $K$ almost surely for some $K$, where $K \ge 10(E|\xi|^4 + 1)$. Let $H$ be a subspace of dimension $d$ and $\pi_H$ the orthogonal projection onto $H$.
Then
$$P\left( \left| \|\pi_H(X)\| - \sqrt{d} \right| \ge t \right) \le 10 \exp\left( -\frac{t^2}{10 K^2} \right).$$
In particular, one has
$$\|\pi_H(X)\| = \sqrt{d} + O(K \log n)$$
with overwhelming probability. The same conclusion holds (with 10 replaced by another explicit constant) if one of the entries $\xi_j$ of $X$ is assumed to have variance $c$ instead of 1, for some absolute constant $c > 0$.

Proof. See [27, Lem. 40]. (The main tool in the proof is Talagrand's concentration inequality.) It is clear from the triangle inequality that the modification of variance in a single entry does not significantly affect the conclusion except for constants.
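A quick Monte Carlo sanity check of the Projection Lemma can be sketched as follows. The random subspace built by QR factorisation, the Bernoulli entries, the dimensions and the seed are all illustrative assumptions; the lemma itself holds for any fixed subspace $H$.

```python
import numpy as np

def projection_norm(n, d, seed=0):
    """Norm of the projection of a +-1 Bernoulli vector onto a d-dimensional subspace."""
    rng = np.random.default_rng(seed)
    # Columns of q form an orthonormal basis of a (hypothetical) random subspace H.
    q, _ = np.linalg.qr(rng.standard_normal((n, d)))
    # Entries: mean zero, variance 1, bounded by 1 in magnitude.
    x = rng.choice([-1.0, 1.0], size=n)
    # ||pi_H(x)|| equals the norm of the coordinates of x in the basis q.
    return float(np.linalg.norm(q.T @ x))

n, d = 500, 100
print(projection_norm(n, d), np.sqrt(d))
```

The printed norm should be close to $\sqrt{d} = 10$, with a deviation of order 1 rather than of order $\sqrt{d}$, which is the content of the concentration bound.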
Next, we record a crude but useful upper bound on the number of eigenvalues in a short interval.

Lemma 2.2 (Upper bound on ESD). Consider a random Hermitian matrix $M_n = (\zeta_{ij})_{1 \le i,j \le n}$ whose upper-triangular entries are independent with mean zero and variance 1 (with variance $c$ on the diagonal for some absolute constant $c > 0$), and such that $|\zeta_{ij}| \le K$ almost surely for all $i, j$ and some $K \ge 1$. Set $W_n := \frac{1}{\sqrt{n}} M_n$. Then for any interval $I \subset \mathbb{R}$ with $|I| \ge \frac{K^2 \log^2 n}{n}$,
$$N_I \ll n |I|$$
with overwhelming probability, where $N_I$ is the number of eigenvalues of $W_n$ in $I$.

Proof. See [27, Prop. 62]. (The main tools in the proof are the Stieltjes transform method, Lemma 3.3 below, and Lemma 2.1.) Again, the generalisation to variances other than 1 on the diagonal does not cause significant changes to the argument.
Finally, we recall a Berry-Esséen type theorem:

Theorem 2.3 (Tail bounds for complex random walks). Let $1 \le N \le n$ be integers, and let $A = (a_{i,j})_{1 \le i \le N; 1 \le j \le n}$ be an $N \times n$ complex matrix whose $N$ rows are orthonormal in $\mathbb{C}^n$, and obeying the incompressibility condition
$$\sup_{1 \le i \le N; 1 \le j \le n} |a_{i,j}| \le \sigma \tag{9}$$
for some $\sigma > 0$. Let $\zeta_1, \ldots, \zeta_n$ be independent complex random variables with mean zero, variance $E|\zeta_j|^2$ equal to 1, and obeying $E|\zeta_i|^3 \le C$ for some $C \ge 1$. For each $1 \le i \le N$, let $S_i$ be the complex random variable
$$S_i := \sum_{j=1}^{n} a_{i,j} \zeta_j$$
and let $\vec{S}$ be the $\mathbb{C}^N$-valued random variable with coefficients $S_1, \ldots, S_N$:
• (Upper tail bound on $S_i$) For $t \ge 1$, we have $P(|S_i| \ge t) \ll \exp(-ct^2) + C\sigma$ for some absolute constant $c > 0$.
• (Lower tail bound on $\vec{S}$) For any $t \le \sqrt{N}$, one has $P(|\vec{S}| \le t) \ll O(t/\sqrt{N})^{\lfloor N/4 \rfloor} + C N^4 t^{-3} \sigma$.
The same claim holds if one of the $\zeta_i$ is assumed to have variance $c$ instead of 1 for some absolute constant $c > 0$.

Proof. See [27, Th. 41]. Again, the modification of the variance on a single entry can be easily seen to have no substantial effect on the conclusion.
3. Asymptotics for the ESD

In this section we prove Theorem 1.8, using the Stieltjes transform method (see [2] for a general discussion of this method). We may assume throughout that $n$ is large, since the claim is vacuous for $n$ small. It is known by the moment method (see e.g. [2] or [4]) that with overwhelming probability, all eigenvalues of $W_n$ have magnitude at most $2 + o(1)$. Because of this, we may restrict attention to the case when $I$ lies in the interval $[-3, 3]$ (say). We recall the Stieltjes transform $s_n(z)$ of a Hermitian matrix $W_n$, defined for complex $z$ by the formula
$$s_n(z) := \frac{1}{n} \sum_{i=1}^{n} \frac{1}{\lambda_i(W_n) - z}. \tag{10}$$
We also introduce the semicircular counterpart
$$s(z) := \int_{-2}^{2} \frac{1}{x - z} \rho_{sc}(x)\,dx,$$
which by a standard contour integral computation can be given explicitly as
$$s(z) = \frac{1}{2}\left( -z + \sqrt{z^2 - 4} \right), \tag{11}$$
where we use the branch of the square root of $z^2 - 4$ with cut at $[-2, 2]$ which is asymptotic to $z$ at infinity. It is well known that one can control the empirical spectral distribution $N_I$ via the Stieltjes transform. We will use the following formalization of this principle:

Lemma 3.1 (Control of Stieltjes transform implies control on ESD). There is a positive constant $C$ such that the following holds for any Hermitian matrix $W_n$. Let $1/10 \ge \eta \ge 1/n$ and $L, \varepsilon, \delta > 0$. Suppose that one has the bound
$$|s_n(z) - s(z)| \le \delta \tag{12}$$
with (uniformly) overwhelming probability for all $z$ with $|\mathrm{Re}(z)| \le L$ and $\mathrm{Im}(z) \ge \eta$. Then for any interval $I$ in $[-L + \varepsilon, L - \varepsilon]$ with $|I| \ge \max\left(2\eta, \frac{\eta}{\delta} \log \frac{1}{\delta}\right)$, one has
$$\left| N_I - n \int_I \rho_{sc}(x)\,dx \right| \ll_\varepsilon \delta n |I|$$
with overwhelming probability, where $N_I$ is the number of eigenvalues of $W_n$ in $I$.

Proof. See [27, Lem. 60].
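The explicit formula (11) for the semicircular Stieltjes transform can be checked numerically. The sketch below is only an illustration under stated assumptions: the branch is implemented as $\sqrt{z-2}\,\sqrt{z+2}$ with principal square roots (one standard way to obtain a cut on $[-2, 2]$ with the right behaviour at infinity), and the function names and the midpoint quadrature are not from the paper.

```python
import cmath
import math

def s_semicircle(z):
    """Formula (11): s(z) = (-z + sqrt(z^2 - 4)) / 2, branch cut on [-2, 2]."""
    # sqrt(z - 2) * sqrt(z + 2) with principal roots behaves like z at infinity.
    root = cmath.sqrt(z - 2) * cmath.sqrt(z + 2)
    return (-z + root) / 2

def s_by_quadrature(z, steps=20000):
    """Midpoint-rule approximation of the integral of rho_sc(x) / (x - z) over [-2, 2]."""
    h = 4.0 / steps
    total = 0j
    for i in range(steps):
        x = -2.0 + (i + 0.5) * h
        total += math.sqrt(4.0 - x * x) / (2.0 * math.pi) / (x - z)
    return total * h

z = 0.5 + 0.5j
s = s_semicircle(z)
print(abs(s - s_by_quadrature(z)))   # quadrature error, should be small
print(abs(s + 1 / (s + z)))          # residual of s + 1/(s + z) = 0
```

The second printed quantity checks the self-consistent equation $s(z) + \frac{1}{s(z) + z} = 0$, which follows algebraically from (11) and is the relation exploited in the proof of Theorem 3.2.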
As a consequence of this lemma (with $L = 4$ and $\varepsilon = 1$, say), we see that Theorem 1.8 follows from

Theorem 3.2 (Concentration for the Stieltjes transform up to edge). Consider a random Hermitian matrix $M_n = (\zeta_{ij})_{1 \le i,j \le n}$ whose upper-triangular entries are independent with mean zero and variance 1, with variance $c$ on the diagonal for some absolute constant $c > 0$, and such that $|\zeta_{ij}| \le K$ almost surely for all $i, j$ and some $K \ge 1$. Set $W_n := \frac{1}{\sqrt{n}} M_n$. Let $0 < \delta < 1/2$ (which can depend on $n$), and suppose that $K = o(n^{1/2}\delta^2)$. Then (12) holds with (uniformly) overwhelming probability for all $z$ with $|\mathrm{Re}(z)| \le 4$ and
$$\mathrm{Im}(z) \ge \frac{K^2 \log^{3.5} n}{\delta^8 n}.$$

The remainder of this section is devoted to proving Theorem 3.2. Fix $z$ as in Theorem 3.2, thus $|\mathrm{Re}(z)| \le 4$ and $\mathrm{Im}(z) = \eta$, where
$$\eta n \ge \frac{K^2 \log^{3.5} n}{\delta^8}. \tag{13}$$
Our objective is to show (12) with (uniformly) overwhelming probability. As in previous works (in particular [9,27]), the key is to exploit the fact that when $\mathrm{Im}\,z > 0$, $s(z)$ is the unique solution to the equation
$$s(z) + \frac{1}{s(z) + z} = 0 \tag{14}$$
with $\mathrm{Im}\,s(z) > 0$; this is immediate from (11). We now seek a similar relation for $s_n$. Note that $\mathrm{Im}\,s_n(z) > 0$ by (10). We use the following standard matrix identity (cf. [27, Lem. 39], or [2, Chap. 11]):

Lemma 3.3. We have
$$s_n(z) = \frac{1}{n} \sum_{k=1}^{n} \frac{1}{\frac{1}{\sqrt{n}}\zeta_{kk} - z - Y_k}, \tag{15}$$
where $Y_k := a_k^* (W_{n,k} - zI)^{-1} a_k$, $W_{n,k}$ is the matrix $W_n$ with the $k$th row and column removed, and $a_k$ is the $k$th row of $W_n$ with the $k$th element removed.

Proof. By Schur's complement, $\frac{1}{\frac{1}{\sqrt{n}}\zeta_{kk} - z - a_k^* (W_{n,k} - zI)^{-1} a_k}$ is the $k$th diagonal entry of $(W_n - zI)^{-1}$. Taking traces, one obtains the claim.
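The identity of Lemma 3.3 is exact linear algebra and can be verified numerically on a small matrix. In the sketch below (an illustration, not the paper's code) the vector $a_k$ is taken as the $k$th column of $W_n$ with its $k$th entry removed, which is the column-vector form under which the Schur complement formula holds for complex Hermitian matrices; the Gaussian test matrix and the function name are assumptions.

```python
import numpy as np

def stieltjes_two_ways(n, z, seed=0):
    """Return s_n(z) computed from eigenvalues and from the Schur complement sum (15)."""
    rng = np.random.default_rng(seed)
    g = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    w = (g + g.conj().T) / np.sqrt(4 * n)        # Hermitian test matrix W_n

    # Direct definition (10): average of 1 / (lambda_i - z).
    direct = np.mean(1.0 / (np.linalg.eigvalsh(w) - z))

    total = 0.0
    for k in range(n):
        w_k = np.delete(np.delete(w, k, axis=0), k, axis=1)   # W_{n,k}
        a_k = np.delete(w[:, k], k)                           # column k without entry k
        y_k = a_k.conj() @ np.linalg.solve(w_k - z * np.eye(n - 1), a_k)
        total += 1.0 / (w[k, k] - z - y_k)       # w[k, k] plays the role of zeta_kk / sqrt(n)
    return direct, total / n

direct, via_schur = stieltjes_two_ways(20, 0.3 + 0.2j)
print(abs(direct - via_schur))
```

The two values agree up to rounding error, since each summand is exactly the $k$th diagonal entry of the resolvent $(W_n - zI)^{-1}$.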
Proposition 3.4 (Concentration of $Y_k$). For each $1 \le k \le n$, one has $Y_k = s_n(z) + o(\delta^2)$ with overwhelming probability.

Proof. Fix $k$, and write $z = x + \sqrt{-1}\eta$. The entries of $a_k$ are independent of each other and of $W_{n,k}$, and have mean zero and variance $\frac{1}{n}$. By linearity of expectation we thus have, on conditioning on $W_{n,k}$,
$$E(Y_k | W_{n,k}) = \frac{1}{n} \operatorname{trace} (W_{n,k} - zI)^{-1} = \left(1 - \frac{1}{n}\right) s_{n,k}(z),$$
where
$$s_{n,k}(z) := \frac{1}{n-1} \sum_{i=1}^{n-1} \frac{1}{\lambda_i(W_{n,k}) - z}$$
is the Stieltjes transform of $W_{n,k}$. From the Cauchy interlacing law (5) and (13), we have
$$s_n(z) - \left(1 - \frac{1}{n}\right) s_{n,k}(z) = O\left( \frac{1}{n} \int_{\mathbb{R}} \frac{1}{|x - z|^2}\,dx \right) = O\left( \frac{1}{n\eta} \right) = o(\delta^2).$$
It follows that $E(Y_k | W_{n,k}) = s_n(z) + o(\delta^2)$, and so it will remain to show the concentration estimate $Y_k - E(Y_k | W_{n,k}) = o(\delta^2)$ with overwhelming probability. Rewriting $Y_k$, it suffices to show that
$$\sum_{j=1}^{n-1} \frac{R_j}{\lambda_j(W_{n,k}) - (x + \sqrt{-1}\eta)} = o(\delta^2) \tag{16}$$
with overwhelming probability, where $R_j := |u_j(W_{n,k})^* a_k|^2 - 1/n$. Let $1 \le i_- < i_+ \le n$, then
$$\sum_{i_- \le j \le i_+} R_j = \|P_H a_k\|^2 - \frac{\dim(H)}{n},$$
where $H$ is the space spanned by the $u_j(W_{n,k})$ for $i_- \le j \le i_+$. From Lemma 2.1 and the union bound, we conclude that with overwhelming probability
$$\left| \sum_{i_- \le j \le i_+} R_j \right| \ll \frac{\sqrt{i_+ - i_-}\, K \log n + K^2 \log^2 n}{n}. \tag{17}$$
By the triangle inequality, this implies that
$$\|P_H a_k\|^2 = \sum_{i_- \le j \le i_+} |u_j(W_{n,k})^* a_k|^2 \ll \frac{i_+ - i_-}{n} + \frac{\sqrt{i_+ - i_-}\, K \log n + K^2 \log^2 n}{n},$$
and hence by a further application of the triangle inequality
$$\sum_{i_- \le j \le i_+} |R_j| \ll \frac{(i_+ - i_-) + K^2 \log^2 n}{n} \tag{18}$$
with overwhelming probability.

The plan is to use (17) and (18) to establish (16). Accordingly, we split the LHS of (16) into several subsums according to the distance $|\lambda_j - x|$. Lemma 2.2 provides a sharp estimate on the number of terms of each subsum, which will allow us to obtain a good upper bound on the absolute value.
We turn to the details. From (13) we can choose two auxiliary parameters $0 < \delta', \alpha < 1$ such that
$$\delta' = o(\delta^2); \quad \alpha \log n = o(\delta^2); \quad \alpha \delta' \eta n \ge K^2 \log^2 n; \quad \frac{K \log n}{\sqrt{\alpha \delta' \eta n}} = o(\delta^2). \tag{19}$$
Indeed, one could set $\delta' := \delta^2 \log^{-0.01} n$ and $\alpha := \delta^2 \log^{-1.01} n$ and use (13).

Fix such parameters, and consider the contribution to (16) of the indices $j$ for which $|\lambda_j(W_n) - x| \le \delta' \eta$. By Lemma 2.2 and (19), the interval of $j$ for which this occurs has cardinality $O(\delta' \eta n)$ (with overwhelming probability). On this interval, the quantity $\frac{1}{\lambda_j(W_{n,k}) - (x + \sqrt{-1}\eta)}$ has magnitude $O(\frac{1}{\eta})$. Applying (18) (and (19)), we see that the contribution of this case is thus
$$\ll \frac{1}{\eta} \cdot \frac{\delta' \eta n}{n} = o(\delta^2),$$
which is acceptable.

Next, we consider the contribution to (16) of those indices $j$ for which
$$(1 + \alpha)^l \delta' \eta < |\lambda_j(W_n) - x| \le (1 + \alpha)^{l+1} \delta' \eta$$
for some integer $0 \le l \ll \log n / \alpha$, and then sum over $l$. By Lemma 2.2 and (19), the set of $j$ for which this occurs is contained (with overwhelming probability) in at most two intervals of cardinality $O((1 + \alpha)^l \alpha \delta' \eta n)$. On each of these intervals, the quantity $\frac{1}{\lambda_j(W_{n,k}) - (x + \sqrt{-1}\eta)}$ has magnitude $O\left(\frac{1}{(1+\alpha)^l \delta' \eta}\right)$ and fluctuates by $O\left(\frac{\alpha}{(1+\alpha)^l \delta' \eta}\right)$. Applying (17), (18) (and noting that $(1 + \alpha)^l \alpha \delta' \eta n$ exceeds $K^2 \log^2 n$, by (19)) we see that the contribution of a single $l$ to (16) is at most
$$\ll \frac{1}{(1+\alpha)^l \delta' \eta} \cdot \frac{\sqrt{\alpha (1+\alpha)^l \delta' \eta n}\, K \log n}{n} + \frac{\alpha}{(1+\alpha)^l \delta' \eta} \cdot \frac{\alpha (1+\alpha)^l \delta' \eta n}{n},$$
which simplifies to
$$\ll \alpha (1+\alpha)^{-l/2} \frac{K \log n}{\sqrt{\alpha \delta' \eta n}} + \alpha^2.$$
Summing over $l$ we obtain a bound of
$$\ll \frac{K \log n}{\sqrt{\alpha \delta' \eta n}} + \alpha \log n,$$
which is acceptable by (19).
We now conclude the proof of Theorem 1.8. By hypothesis,
$$\left| \frac{1}{\sqrt{n}} \zeta_{kk} \right| \le K/\sqrt{n} = o(\delta^2)$$
almost surely. Inserting these bounds into (15), we see that with overwhelming probability
$$s_n(z) + \frac{1}{n} \sum_{k=1}^{n} \frac{1}{s_n(z) + z + o(\delta^2)} = 0.$$
By the triangle inequality (and square rooting the $o()$ decay), we can assume that either the error term $o(\delta^2)$ is $o(\delta^2 |s_n(z) + z|)$, or that $|s_n(z) + z|$ is $o(1)$. Suppose the former holds. Then by Taylor expansion
$$\frac{1}{s_n(z) + z + o(\delta^2)} = \frac{1}{s_n(z) + z} + o(\delta^2),$$
and thus
$$s_n(z) + \frac{1}{s_n(z) + z} = o(\delta^2).$$
If we assume $|z| \le 10$ (say), we conclude that $|s_n(z)| \le 100$. Multiplying out by $s_n(z) + z$ and rearranging, we obtain
$$\left( s_n(z) + \frac{z}{2} \right)^2 = \frac{z^2 - 4}{4} + o(\delta^2).$$
Thus
$$s_n(z) + \frac{z}{2} = \pm \sqrt{\frac{z^2 - 4}{4}} + o(\delta)$$
(treating the case when $z^2 - 4 = o(\delta^2)$ separately). To summarise, we have shown (with overwhelming probability) in the region
$$|z| \le 10; \quad |\mathrm{Re}(z)| \le 4; \quad \mathrm{Im}(z) \ge \frac{K^2 \log^{3.5} n}{\delta^8 n}$$
that one either has $s_n(z) = s(z) + o(\delta)$, $s_n(z) = -z - s(z) + o(1) = s(z) - \sqrt{z^2 - 4} + o(1)$, or $|s_n(z) + z| = o(1)$. It is not hard to see that the first two cases are disconnected from the third (for $n$ large enough) in this region, because $s(z) = \frac{-1}{s(z) + z}$ is bounded away from zero, as is $s(z) + z = \frac{-1}{s(z)}$. Furthermore, the first and second possibilities are also disconnected from each other except when $z^2 - 4 = o(\delta^2)$. Also, the second and third possibilities can only hold for $\mathrm{Im}(z) = o(1)$ since $s_n(z)$ and $z$ both have positive imaginary part. A continuity argument thus shows that the first possibility must hold throughout the region except when $z^2 - 4 = o(\delta^2)$, in which case either the first or second possibility can hold; but in that region, the first and second possibility are equivalent, and (12) follows. The proof of Theorem 1.8 is now complete.
4. Delocalization of Eigenvectors

Without loss of generality, we can assume that the entries are continuously distributed. Having established Theorem 1.8, we now use this theorem to establish Proposition 1.10. Let $M_n$ obey Condition C0. Then by Markov's inequality, one has $|\zeta_{ij}| \ll \log^{O(1)} n$ with overwhelming probability (here and in the sequel we allow implied constants in the $O()$ notation to depend on the constants $C, C'$ in (1)). By conditioning the $\zeta_{ij}$ to this event³, we may thus assume that
$$|\zeta_{ij}| \le K \tag{20}$$
almost surely for some $K = O(\log^{O(1)} n)$. Fix $1 \le i \le n$; by symmetry we may take $i \ge n/2$. By the union bound and another application of symmetry, it suffices to show that
$$|u_i(M_n)^* e_1| \ll n^{-1/2} \log^{O(1)} n$$
with overwhelming probability. To compute $u_i(M_n)^* e_1$ we use the following identity from [7] (see also [27, Lem. 38]):

Lemma 4.1. Let
$$A_n = \begin{pmatrix} a & X^* \\ X & A_{n-1} \end{pmatrix}$$
be an $n \times n$ Hermitian matrix for some $a \in \mathbb{R}$ and $X \in \mathbb{C}^{n-1}$, and let $\begin{pmatrix} x \\ v \end{pmatrix}$ be a unit eigenvector of $A$ with eigenvalue $\lambda_i(A)$, where $x \in \mathbb{C}$ and $v \in \mathbb{C}^{n-1}$. Suppose that none of the eigenvalues of $A_{n-1}$ are equal to $\lambda_i(A)$. Then
$$|x|^2 = \frac{1}{1 + \sum_{j=1}^{n-1} (\lambda_j(A_{n-1}) - \lambda_i(A_n))^{-2} |u_j(A_{n-1})^* X|^2},$$
where $u_j(A_{n-1})$ is a unit eigenvector corresponding to the eigenvalue $\lambda_j(A_{n-1})$.

Proof. By subtracting $\lambda_i(A) I$ from $A$ we may assume $\lambda_i(A) = 0$. The eigenvector equation then gives
$$xX + A_{n-1} v = 0,$$
thus
$$v = -x A_{n-1}^{-1} X.$$
Since $\|v\|^2 + |x|^2 = 1$, we conclude
$$|x|^2 \left( 1 + \|A_{n-1}^{-1} X\|^2 \right) = 1.$$
Since $\|A_{n-1}^{-1} X\|^2 = \sum_{j=1}^{n-1} (\lambda_j(A_{n-1}))^{-2} |u_j(A_{n-1})^* X|^2$, the claim follows.
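Lemma 4.1 is also an exact identity, so it can be tested on a small random Hermitian matrix. The following sketch is an illustration under stated assumptions (Gaussian test matrix, zero-based eigenvalue index `i`, hypothetical function name): it compares $|x|^2$ read off from an eigenvector with the right-hand side of the lemma.

```python
import numpy as np

def first_coordinate_two_ways(n, i, seed=0):
    """Compare |x|^2 from the eigenvector with the formula of Lemma 4.1."""
    rng = np.random.default_rng(seed)
    g = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    a = (g + g.conj().T) / 2                     # Hermitian A_n

    lam, u = np.linalg.eigh(a)                   # eigenvalues in increasing order
    direct = abs(u[0, i]) ** 2                   # |x|^2, first coordinate of u_i(A_n)

    x_vec = a[1:, 0]                             # X: first column below the corner entry a
    mu, v = np.linalg.eigh(a[1:, 1:])            # spectrum of the minor A_{n-1}
    weights = np.abs(v.conj().T @ x_vec) ** 2    # |u_j(A_{n-1})^* X|^2
    formula = 1.0 / (1.0 + np.sum(weights / (mu - lam[i]) ** 2))
    return float(direct), float(formula)

print(first_coordinate_two_ways(8, 3))
```

Because the entries are continuously distributed, the eigenvalues of the minor almost surely differ from $\lambda_i(A_n)$, so the denominators are non-zero, matching the hypothesis of the lemma.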
³ Strictly speaking, this distorts the mean and variance of $\zeta_{ij}$ by an exponentially small amount, but one can easily check that this does not significantly impact any of the arguments in this section.
Let $M_{n-1}$ be the bottom right $(n-1) \times (n-1)$ minor of $M_n$. As we are assuming that the coefficients of $M_n$ are continuously distributed, we see almost surely that none of the eigenvalues of $M_{n-1}$ are equal to $\lambda_i(M_n)$. We may thus apply Lemma 4.1 and conclude that
$$|u_i(M_n)^* e_1|^2 = \frac{1}{1 + \sum_{j=1}^{n-1} \frac{|u_j(M_{n-1})^* X|^2}{(\lambda_j(M_{n-1}) - \lambda_i(M_n))^2}},$$
where $X$ is the bottom left $(n-1) \times 1$ vector of $M_n$ (and thus has entries $\zeta_{j1}$ for $1 < j \le n$). It thus suffices to show that
$$\sum_{j=1}^{n-1} \frac{|u_j(M_{n-1})^* X|^2}{(\lambda_j(M_{n-1}) - \lambda_i(M_n))^2} \gg n \log^{-O(1)} n$$
with overwhelming probability. It will be convenient to eliminate the exponent 2 in the denominator, as follows. From Lemma 2.1, one has $|u_j(M_{n-1})^* X| \ll \log^{O(1)} n$ with overwhelming probability for each $j$ (and hence for all $j$, by the union bound). It thus suffices to show that
$$\sum_{j=1}^{n-1} \frac{|u_j(M_{n-1})^* X|^4}{(\lambda_j(M_{n-1}) - \lambda_i(M_n))^2} \gg n \log^{-O(1)} n$$
with overwhelming probability. By the Cauchy-Schwarz inequality, it thus suffices to show that
$$\sum_{j : i - T_- \le j \le i + T_+} \frac{|u_j(M_{n-1})^* X|^2}{|\lambda_j(M_{n-1}) - \lambda_i(M_n)|} \gg n^{1/2} \log^{-O(1)} n$$
with overwhelming probability for some $1 \le T_-, T_+ \ll \log^{O(1)} n$. It is convenient to work with the normalized matrix $W_n := \frac{1}{\sqrt{n}} M_n$, thus we need to show
$$\sum_{j : i - T_- \le j \le i + T_+} \frac{|u_j(W_{n-1})^* Y|^2}{|\lambda_j(W_{n-1}) - \lambda_i(W_n)|} \gg \log^{-O(1)} n \tag{21}$$
with overwhelming probability for some $1 \le T_-, T_+ \ll \log^{O(1)} n$, where $Y := \frac{1}{\sqrt{n}} X$ has entries $\frac{1}{\sqrt{n}} \zeta_{j1}$ for $1 < j \le n$.

There are two cases: the bulk case and the edge case; the former was already treated in [27], but the latter is new.

4.1. The bulk case. Suppose that $n/2 \le i < 0.999n$. Then from the semicircular law (or Theorem 1.8) we see that $\lambda_i(W_n) \in [-2 + \varepsilon, 2 - \varepsilon]$ with overwhelming probability for some absolute constant $\varepsilon > 0$. Let $A$ be a large constant to be chosen later. A further application of Theorem 1.8 then shows that there is an interval $I$ of length $\log^A n / n$ centered at $\lambda_i(W_n)$ which contains $\Theta(\log^A n)$ eigenvalues of $W_n$. If $\lambda_j(W_n), \lambda_{j+1}(W_n)$ lie in $I$, then by the Cauchy interlacing property (5), $|\lambda_j(W_{n-1}) - \lambda_i(W_n)| \ll \log^A n / n$. One can thus lower bound the left-hand side of (21) (for suitable values of $T$) by
$$\gg n \log^{-A} n \sum_{j : \lambda_j(W_n), \lambda_{j+1}(W_n) \in I} |u_j(W_{n-1})^* Y|^2.$$
One can rewrite this as $\log^{-A} n\, \|\pi_H X\|^2$, where $H$ is the span of the $u_j(W_{n-1})$ for $\lambda_j(W_n), \lambda_{j+1}(W_n) \in I$. The claim then follows from Lemma 2.1 (for $A$ large enough).

4.2. The edge case. We now turn to the more interesting edge case when $0.999n \le i \le n$. Using the semicircular law, we now see that
$$\lambda_i(W_n) \ge 1.9 \tag{22}$$
(say) with overwhelming probability. Next, we can exploit the following identity:

Lemma 4.2 (Interlacing identity) [27, Lem. 37]. If $u_j(W_{n-1})^* X$ is non-zero for every $j$, then
$$\sum_{j=1}^{n-1} \frac{|u_j(W_{n-1})^* X|^2}{\lambda_j(W_{n-1}) - \lambda_i(W_n)} = \frac{1}{\sqrt{n}} \zeta_{nn} - \lambda_i(W_n). \tag{23}$$

Proof. By diagonalising $W_{n-1}$ (noting that this does not affect either side of (23)), we may assume that $W_{n-1} = \mathrm{diag}(\lambda_1(W_{n-1}), \ldots, \lambda_{n-1}(W_{n-1}))$ and $u_j(W_{n-1}) = e_j$ for $j = 1, \ldots, n-1$. One then easily verifies that the characteristic polynomial $\det(W_n - \lambda I)$ of $W_n$ is equal to
$$\prod_{j=1}^{n-1} (\lambda_j(W_{n-1}) - \lambda) \left[ \frac{1}{\sqrt{n}} \zeta_{nn} - \lambda - \sum_{j=1}^{n-1} \frac{|u_j(W_{n-1})^* X|^2}{\lambda_j(W_{n-1}) - \lambda} \right]$$
when $\lambda$ is distinct from $\lambda_1(W_{n-1}), \ldots, \lambda_{n-1}(W_{n-1})$. Since $u_j(W_{n-1})^* X$ is non-zero by hypothesis, we see that this polynomial does not vanish at any of the $\lambda_j(W_{n-1})$. Substituting $\lambda_i(W_n)$ for $\lambda$, we obtain (23).
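The interlacing identity (23), together with the Cauchy interlacing law (5), can be checked numerically. In the sketch below (illustrative only; the Gaussian test matrix and function name are assumptions) $W_{n-1}$ is the top left minor of a Hermitian matrix $W_n$, the vector playing the role of $X$ is the last column of $W_n$ above the corner, and the corner entry plays the role of $\frac{1}{\sqrt{n}}\zeta_{nn}$.

```python
import numpy as np

def interlacing_identity(n, i, seed=0):
    """Return both sides of (23) plus the spectra of W_n and its top left minor."""
    rng = np.random.default_rng(seed)
    g = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    w = (g + g.conj().T) / np.sqrt(4 * n)        # Hermitian W_n

    lam = np.linalg.eigvalsh(w)                  # eigenvalues of W_n, increasing
    mu, v = np.linalg.eigh(w[:-1, :-1])          # eigen-decomposition of W_{n-1}
    x_vec = w[:-1, -1]                           # last column without the corner entry
    weights = np.abs(v.conj().T @ x_vec) ** 2    # |u_j(W_{n-1})^* X|^2

    lhs = float(np.sum(weights / (mu - lam[i])))
    rhs = float(w[-1, -1].real - lam[i])
    return lhs, rhs, lam, mu

lhs, rhs, lam, mu = interlacing_identity(10, 9)   # i = 9 is the largest eigenvalue
print(lhs, rhs)
```

At the largest eigenvalue every denominator $\lambda_j(W_{n-1}) - \lambda_n(W_n)$ is negative, so all terms on the left have the same sign; this one-sidedness is the imbalance of "forces" that the edge case exploits.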
Again, the continuity of the entries of $M_n$ ensures that the hypothesis of Lemma 4.2 is obeyed almost surely. From (20), (22), (23) one has
$$\left| \sum_{j=1}^{n-1} \frac{|u_j(W_{n-1})^* X|^2}{\lambda_j(W_{n-1}) - \lambda_i(W_n)} \right| \ge 1.9 - o(1)$$
with overwhelming probability, so to show (21), it will suffice by the triangle inequality to show that
$$\left| \sum_{j > i + T_+ \text{ or } j < i - T_-} \frac{|u_j(W_{n-1})^* X|^2}{\lambda_j(W_{n-1}) - \lambda_i(W_n)} \right| \le 1.8 + o(1). \tag{24}$$
Let A > 100 be a large constant to be chosen later. By Theorem 1.8, we see (if A is large enough) that N I = nα I |I | + O(|I |n log−A/20 n)
(25)
with overwhelming probability for any interval I of length |I | = n/n, where α I := |I1| I ρsc (x) d x. For any such interval, we see from Lemma 2.1 (and Cauchy interlacing (5)) that with overwhelming probability log A/2+O(1) n NI ∗ 2 +O |u j (Wn−1 ) X | = n n log A
j:λ j (Wn−1 )∈I
and thus by (25) (for A large enough) |u j (Wn−1 )∗ X |2 = α I |I | + O(|I | log−A/20) n). j:λ j (Wn−1 )∈I
Set d I :=
dist(λi (Wn ),I ) . |I |
If d I ≥ log n (say), then
1 1 = +O λ j (Wn−1 ) − λi (Wn ) d I |I |
1 d I2 |I |
for all j in the above sum, thus j:λ j (Wn−1 )∈I
|u j (Wn−1 )∗ X |2 αI = +O λ j (Wn−1 ) − λi (Wn ) dI
log−A/20 n dI
+O
αI d I2
.
(26)
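We now partition the real line into intervals $I$ of length $\log^A n / n$, and sum (26) over all $I$ with $d_I \ge \log n$. Bounding $\alpha_I$ crudely by $O(1)$, we see that $\sum_I O\left( \frac{\alpha_I}{d_I^2} \right) = O\left( \frac{1}{\log n} \right) = o(1)$. Similarly, one has
$$\sum_I O\left( \frac{\log^{-A/20} n}{d_I} \right) = O(\log^{-A/20} n \log n) = o(1)$$
if $A$ is large enough. Finally, Riemann integration of the principal value integral
$$\mathrm{p.v.} \int_{-2}^{2} \frac{\rho_{sc}(x)}{x - \lambda_i(W_n)}\, dx := \lim_{\varepsilon \to 0} \int_{|x| \le 2:\ |x - \lambda_i(W_n)| > \varepsilon} \frac{\rho_{sc}(x)}{x - \lambda_i(W_n)}\, dx$$
shows that
$$\sum_I \frac{\alpha_I}{d_I} = \mathrm{p.v.} \int_{-2}^{2} \frac{\rho_{sc}(x)}{x - \lambda_i(W_n)}\, dx + o(1).$$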
The operator norm of $W_n$ is $2 + o(1)$ with overwhelming probability (see e.g. [2,4]), so $|\lambda_i(W_n)| \le 2 + o(1)$. Using the formula (11) for the Stieltjes transform, one obtains from residue calculus that
$$\mathrm{p.v.} \int_{-2}^{2} \frac{\rho_{sc}(x)}{x - \lambda_i(W_n)}\, dx = -\lambda_i(W_n)/2$$
for $|\lambda_i(W_n)| \le 2$, with the right-hand side replaced by $-\lambda_i(W_n)/2 + \sqrt{\lambda_i(W_n)^2 - 4}/2$ for $|\lambda_i(W_n)| > 2$. In either event, we have
$$\left| \mathrm{p.v.} \int_{-2}^{2} \frac{\rho_{sc}(x)}{x - \lambda_i(W_n)}\, dx \right| \le 1 + o(1).$$
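The closed form of this principal value integral can be sanity-checked numerically. The sketch below assumes the standard semicircle density $\rho_{sc}(x) = \frac{1}{2\pi}\sqrt{4 - x^2}$ on $[-2, 2]$ and uses a singularity-subtraction trick with a midpoint rule; the function names are hypothetical and the quadrature is only illustrative.

```python
import math


def rho_sc(x):
    """Semicircle density on [-2, 2] (zero outside)."""
    return math.sqrt(max(4.0 - x * x, 0.0)) / (2.0 * math.pi)


def pv_stieltjes(lam, n=100_000):
    """Approximate p.v. int_{-2}^{2} rho_sc(x) / (x - lam) dx by a midpoint rule.

    For |lam| < 2 the singular part rho_sc(lam) / (x - lam) is subtracted and
    integrated exactly: p.v. int dx / (x - lam) = log((2 - lam) / (2 + lam)).
    Choose lam so that it is not a midpoint of the grid (multiples of 4/n are safe).
    """
    inside = abs(lam) < 2.0
    r = rho_sc(lam) if inside else 0.0
    h = 4.0 / n
    total = 0.0
    for k in range(n):
        x = -2.0 + (k + 0.5) * h
        total += (rho_sc(x) - r) / (x - lam)
    total *= h
    if inside:
        total += r * math.log((2.0 - lam) / (2.0 + lam))
    return total
```

For `lam` inside $(-2, 2)$ the result should be close to `-lam / 2`, and for `lam > 2` close to `(-lam + math.sqrt(lam * lam - 4)) / 2`, so that the modulus stays below 1 in both regimes.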
Putting all this together, we see that
$$\left| \sum_{I: d_I \ge \log n}\ \sum_{j: \lambda_j(W_{n-1}) \in I} \frac{|u_j(W_{n-1})^* X|^2}{\lambda_j(W_{n-1}) - \lambda_i(W_n)} \right| \le 1 + o(1).$$
The intervals $I$ with $d_I < \log n$ will contribute at most $\log^{A+O(1)} n$ eigenvalues, by (25) (and Cauchy interlacing (5)). The claim (24) now follows by setting $T_-$ and $T_+$ appropriately. The proof of Proposition 1.10 is now complete.

Remark 4.3. From (21) and Lemma 2.1 one sees that $|\lambda_{i-1}(W_{n-1}) - \lambda_i(W_n)| \ll \log^{O(1)} n / n$ with overwhelming probability for all $n/2 \le i \le n$, and similarly one has $|\lambda_i(W_{n-1}) - \lambda_i(W_n)| \ll \log^{O(1)} n / n$ with overwhelming probability for all $1 \le i \le n/2$. On the other hand, according to the Tracy-Widom law, the gap between $\lambda_n(W_n)$ and $\lambda_{n-1}(W_n)$ (or between $\lambda_1(W_n)$ and $\lambda_2(W_n)$) can be expected to be as large as $n^{-2/3}$. Thus we see that there is a significant bias at the edge in the interlacing law (5), which can ultimately be traced to the imbalance of "forces" in the interlacing identity (23) at that edge.
5. Lower Bound on Eigenvalue Gap

We now give the proof of Theorem 1.12. Most of the proof will follow closely the proof of Theorem 1.5 in [27], so we shall focus on the changes needed to that argument. As such, this section will assume substantial familiarity with the material from [27], and will cite from it repeatedly (similarly for the next section). For technical reasons relating to an induction argument, it turns out that one has to treat the extreme cases $i = 1, n$ separately:

Proposition 5.1 (Extreme cases). Theorem 1.12 is true when $i = 1$ or $i = n$.

Proof. By symmetry it suffices to do this for $i = n$. By a limiting argument we may assume that the entries $\zeta_{ij}$ of $M_n$ are continuously distributed. From Lemma 4.2 one has (almost surely) that
$$\sum_{j=1}^{n-1} \frac{|u_j(W_{n-1})^* X|^2}{\lambda_j(W_{n-1}) - \lambda_n(W_n)} = \frac{1}{\sqrt{n}} \zeta_{nn} - \lambda_n(W_n).$$
Recall that $\lambda_n(W_n) = 2 + o(1)$ with overwhelming probability; also, $\frac{1}{\sqrt{n}} \zeta_{nn} = o(1)$ with overwhelming probability. As all the terms in the left-hand side have the same sign, we conclude that
$$\frac{|u_{n-1}(W_{n-1})^* X|^2}{|\lambda_{n-1}(W_{n-1}) - \lambda_n(W_n)|} \ll 1.$$
From Theorem 2.3 and Proposition 1.10, we have $|u_{n-1}(W_{n-1})^* X| \ge n^{-c_0/10}$ (say) with high probability, and so $|\lambda_{n-1}(W_{n-1}) - \lambda_n(W_n)| \ge n^{-c_0}$ with high probability. The claim now follows from the Cauchy interlacing property (5).
Remark 5.2. In fact, at the edge, one should be able to improve the lower bound on the eigenvalue gap substantially, from $n^{-c_0}$ to $n^{1/3-c_0}$, in accordance with the Tracy-Widom law, but we will not need to do so here.

Now we handle the general case of Theorem 1.12. Fix $M_n$ and $c_0$. We write $n_0, i_0$ for $n, i$, thus $1 \le i_0 \le n_0$ and our task is to show that $\lambda_{i_0+1}(A_{n_0}) - \lambda_{i_0}(A_{n_0}) \ge n_0^{-c_0}$ with high probability. By Proposition 5.1 we may assume $1 < i_0 < n_0$. We may also assume $n_0$ to be large, as the claim is vacuous otherwise. As in previous sections, we may truncate so that all coefficients $\zeta_{ij}$ are of size $O(\log^{O(1)} n_0)$ (as before, the exponentially small corrections to the mean and variance of $\zeta_{ij}$ caused by this are easily controlled), and approximate so that the distribution is continuous rather than discrete. For each $n_0/2 \le n \le n_0$, let $A_n$ be the top left $n \times n$ minor of $A_{n_0}$. As in [27, Sect. 3.4], we introduce the regularized gap
$$g_{i,l,n} := \inf_{1 \le i_- \le i-l < i \le i_+ \le n} \frac{\lambda_{i_+}(A_n) - \lambda_{i_-}(A_n)}{\min(i_+ - i_-, \log^{C_1} n_0)^{\log^{0.9} n_0}}, \qquad (27)$$
for all $n_0/2 \le n \le n_0$ and $1 \le i - l < i \le n$, where $C_1$ is a large constant to be chosen later. It will suffice to show that for each $1 < i_0 < n_0$, $g_{i_0,1,n_0} \le n^{-c_0}$ with high probability. By symmetry we may assume that $n_0/2 \le i_0 < n_0$.

As before, we let $u_1(A_n), \ldots, u_n(A_n)$ be an orthonormal eigenbasis of $A_n$ associated to the eigenvalues $\lambda_1(A_n), \ldots, \lambda_n(A_n)$. We also let $X_n \in \mathbb{C}^n$ be the rightmost column of $A_{n+1}$ with the bottom coordinate $\sqrt{n}\, \zeta_{n+1,n+1}$ removed. We will need two key lemmas. First, we have the following deterministic lemma (a minor variant of [27, Lem. 47]), showing that a narrow gap can be propagated backwards in $n$ unless one of a small number of exceptional events happens:

Lemma 5.3 (Backwards propagation of gap). Suppose that $i_0 \le n + 1 \le n_0$ and $l \le n/10$ is such that
$$g_{i_0,l,n+1} \le \delta \qquad (28)$$
for some $0 < \delta \le 1$ (which can depend on $n$), and that
$$\lambda_{n+1}(A_{n+1}) - \lambda_n(A_{n+1}) \ge \delta \exp(\log^{0.91} n_0). \qquad (29)$$
Then $i_0 \le n$. Suppose further that
$$g_{i_0,l+1,n} \ge 2^m g_{i_0,l,n+1} \qquad (30)$$
for some $m \ge 0$ with
$$2^m \le \delta^{-1/2}. \qquad (31)$$
Then one of the following statements holds:
(i) (Macroscopic spectral concentration) There exists $1 \le i_- < i_+ \le n + 1$ with $i_+ - i_- \ge \log^{C_1/2} n$ such that $|\lambda_{i_+}(A_{n+1}) - \lambda_{i_-}(A_{n+1})| \le \delta \exp(\log^{0.95} n)(i_+ - i_-)$.
(ii) (Small inner products) There exists $1 \le i_- \le i_0 - l < i_0 \le i_+ \le n$ with $i_+ - i_- \le \log^{C_1/2} n$ such that
$$\sum_{i_- \le j < i_+} |u_j(A_n)^* X_n|^2 \le \frac{n (i_+ - i_-)}{2^{m/2} \log^{0.01} n}. \qquad (32)$$
(iii) (Large coefficient) We have $|\zeta_{n+1,n+1}| \ge n^{0.4}$.
(iv) (Large eigenvalue) For some $1 \le i \le n + 1$ one has $|\lambda_i(A_{n+1})| \ge \frac{n \exp(-\log^{0.95} n)}{\delta^{1/2}}$.
(v) (Large inner product) There exists $1 \le i \le n$ such that $|u_i(A_n)^* X_n|^2 \ge \frac{n \exp(-\log^{0.96} n)}{\delta^{1/2}}$.
(vi) (Large row) We have $\|X_n\|^2 \ge \frac{n^2 \exp(-\log^{0.96} n)}{\delta^{1/2}}$.
(vii) (Large inner product near $i_0$) There exists $1 \le i \le n$ with $|i - i_0| \le \log^{C_1} n$ such that $|u_i(A_n)^* X_n|^2 \ge 2^{m/2} n \log^{0.8} n$.

Proof. The proof of this lemma repeats the proof of [27, Lem. 47 in Sect. 6] almost exactly. Only the following changes have to be made:
• We have the upper bound $\lambda_{i_+}(A_{n+1}) - \lambda_{i_-}(A_{n+1}) \le \delta (\log^{C_1} n)^{\log^{0.9} n_0}$, which together with (29) forces $i_+ \ne n + 1$ and thus $i_0 \le n$ as required.
• The variable $j$ now lies in the range $1 \le j \le n$ rather than $\varepsilon n/10 \le j \le (1 - \varepsilon/10) n$.
• $i_{--}$ has to be defined as $\max(i_- - 2^{k-1}, 1)$ rather than just $i_- - 2^{k-1}$ (and similarly for $i_{++}$).
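To make the regularized gap (27) that drives Lemma 5.3 more concrete, here is a brute-force sketch that evaluates it for a given sorted spectrum. It assumes the infimum in (27) ranges over $1 \le i_- \le i - l < i \le i_+ \le n$ (matching the index pattern in statement (ii)); the function name and argument layout are hypothetical, and the double loop is meant for small examples only.

```python
import math


def regularized_gap(eigs, i, l, n0, c1):
    """Brute-force g_{i,l,n} for sorted eigenvalues eigs (n = len(eigs)).

    Indices i, i_-, i_+ are 1-based, as in the text; n0 must exceed e so that
    the cap log^{c1} n0 is at least 1.
    """
    n = len(eigs)
    log_n0 = math.log(n0)
    cap = log_n0 ** c1          # log^{C_1} n_0
    exponent = log_n0 ** 0.9    # log^{0.9} n_0
    best = math.inf
    for i_minus in range(1, i - l + 1):
        for i_plus in range(i, n + 1):
            width = min(i_plus - i_minus, cap)
            ratio = (eigs[i_plus - 1] - eigs[i_minus - 1]) / width ** exponent
            best = min(best, ratio)
    return best
```

With `l = 1` the pair `(i - 1, i)` is always admissible and has width 1, so the regularized gap never exceeds the ordinary gap `eigs[i - 1] - eigs[i - 2]`; wider windows can only lower the value further.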
Next, we need the following result that asserts that the events (i)-(vii) are rare:

Proposition 5.4 (Bad events are rare). Suppose that $n_0/2 \le n < n_0$ and $l \le n/10$, and set $\delta := n_0^{-\kappa}$ for some sufficiently small fixed $\kappa > 0$. Let $m \ge 0$ be such that $2^m \le \delta^{-1/2}$. Then:
(a) The events (i), (iii), (iv), (v), (vi) in Lemma 5.3 all fail with high probability.
(b) There is a constant $C$ such that all the coefficients of the eigenvectors $u_j(A_n)$ for $1 \le j \le n$ are of magnitude at most $n^{-1/2} \log^{C} n$ with overwhelming probability. Conditioning $A_n$ to be a matrix with this property, the events (ii) and (vii) occur with a conditional probability of at most $2^{-\kappa m} + n^{-\kappa}$.
(c) Furthermore, there is a constant $C_2$ (depending on $C$, $\kappa$, $C_1$) such that if $l \ge C_2$ and $A_n$ is conditioned as in (b), then (ii) and (vii) in fact occur with a conditional probability of at most $2^{-\kappa m} \log^{-2C_1} n + n^{-\kappa}$.

Proof. The proof of this proposition repeats the proof of [27, Prop. 49 in Sect. 7] almost exactly. Only the following changes have to be made:
• All references to [27, Th. 56] (i.e. Theorem 1.6) need to be replaced with Theorem 1.8.
• The variable $j$ now lies in the range $1 \le j \le n$ rather than $\varepsilon n/2 \le j \le (1 - \varepsilon/2) n$.
Given Lemma 5.3 and Proposition 5.4, the proof of Theorem 1.12 exactly follows the proof of Theorem 1.5 in [27, Sect. 3.5], with the following minor changes:
• In the definition of the event $E_n$, the range $\varepsilon n/2 \le j \le (1 - \varepsilon/2) n$ needs to be expanded to $1 \le j \le n$.
• In the definition of the event $E_0$, the events that (29) fail for some $n_0 - \log^{2C_1} n_0 \le n \le n_0$ have to be included; but these events occur with polynomially small probability, thanks to Proposition 5.1 and the union bound.
This concludes the proof of Theorem 1.12.

6. The Four Moment Theorem

We now prove Theorem 1.11. As in [27, Sect. 3.3], the proof is based on two key propositions. The first proposition asserts that one can swap a single coefficient (or more precisely, two coefficients) of a (deterministic) matrix $A$ as long as $A$ obeys a certain "good configuration condition":

Proposition 6.1 (Replacement given a good configuration). There exists a positive constant $C_1$ such that the following holds. Let $k \ge 1$ and $\varepsilon_1 > 0$, and assume $n$ sufficiently large depending on these parameters. Let $1 \le i_1 < \cdots < i_k \le n$. For a complex parameter $z$, let $A(z)$ be a (deterministic) family of $n \times n$ Hermitian matrices of the form
$$A(z) = A(0) + z e_p e_q^* + \bar{z} e_q e_p^*,$$
where $e_p, e_q$ are unit vectors. We assume that for every $1 \le j \le k$ and every $|z| \le n^{1/2+\varepsilon_1}$ whose real and imaginary parts are multiples of $n^{-C_1}$, we have
• (Eigenvalue separation) For any $1 \le i \le n$ with $|i - i_j| \ge n^{\varepsilon_1}$, we have
$$|\lambda_i(A(z)) - \lambda_{i_j}(A(z))| \ge n^{-\varepsilon_1} |i - i_j|. \qquad (33)$$
• (Delocalization at $i_j$) If $P_{i_j}(A(z))$ is the orthogonal projection to the eigenspace associated to $\lambda_{i_j}(A(z))$, then
$$\|P_{i_j}(A(z)) e_p\|,\ \|P_{i_j}(A(z)) e_q\| \le n^{-1/2+\varepsilon_1}. \qquad (34)$$
• For every $\alpha \ge 0$,
$$\|P_{i_j,\alpha}(A(z)) e_p\|,\ \|P_{i_j,\alpha}(A(z)) e_q\| \le 2^{\alpha/2} n^{-1/2+\varepsilon_1}, \qquad (35)$$
whenever $P_{i_j,\alpha}$ is the orthogonal projection to the eigenspaces corresponding to eigenvalues $\lambda_i(A(z))$ with $2^\alpha \le |i - i_j| < 2^{\alpha+1}$.
We say that $A(0), e_p, e_q$ are a good configuration for $i_1, \ldots, i_k$ if the above properties hold. Assuming this good configuration, then we have
$$\mathbf{E}(F(\zeta)) = \mathbf{E}(F(\zeta')) + O(n^{-(r+1)/2+O(\varepsilon_1)}), \qquad (36)$$
whenever
$$F(z) := G(\lambda_{i_1}(A(z)), \ldots, \lambda_{i_k}(A(z)), Q_{i_1}(A(z)), \ldots, Q_{i_k}(A(z))),$$
and $G = G(\lambda_{i_1}, \ldots, \lambda_{i_k}, Q_{i_1}, \ldots, Q_{i_k})$ is a smooth function from $\mathbb{R}^k \times \mathbb{R}^k_+ \to \mathbb{R}$ that is supported on the region $Q_{i_1}, \ldots, Q_{i_k} \le n^{\varepsilon_1}$ and obeys the derivative bounds $|\nabla^j G| \le n^{\varepsilon_1}$ for all $0 \le j \le 5$, and $\zeta, \zeta'$ are random variables with $|\zeta|, |\zeta'| \le n^{1/2+\varepsilon_1}$ almost surely, which match to order $r$ for some $r = 2, 3, 4$. If $G$ obeys the improved derivative bounds $|\nabla^j G| \le n^{-C j \varepsilon_1}$ for $0 \le j \le 5$ and some sufficiently large absolute constant $C$, then we can strengthen $n^{-(r+1)/2+O(\varepsilon_1)}$ in (36) to $n^{-(r+1)/2-\varepsilon_1}$.

Proof. See [27, Prop. 43].
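The hypothesis that $\zeta$ and $\zeta'$ "match to order $r$" can be illustrated with finitely supported distributions. The sketch below assumes the usual convention that all mixed moments $\mathbf{E}\,\mathrm{Re}(\zeta)^a \mathrm{Im}(\zeta)^b$ with $a + b \le r$ agree; this convention is taken as an assumption here rather than quoted from the text, and the function names are hypothetical.

```python
def mixed_moment(dist, a, b):
    """E[Re(z)^a Im(z)^b] for a finite distribution given as (value, probability) pairs."""
    return sum(p * (z.real ** a) * (z.imag ** b) for z, p in dist)


def match_to_order(dist1, dist2, r, tol=1e-12):
    """True if all mixed moments of total degree at most r agree up to tol."""
    for a in range(r + 1):
        for b in range(r + 1 - a):
            if abs(mixed_moment(dist1, a, b) - mixed_moment(dist2, a, b)) > tol:
                return False
    return True


# Two real-valued examples with mean 0 and variance 1:
# a random sign, and a three-point law on {-sqrt(3), 0, sqrt(3)}.
SIGN = [(complex(1.0), 0.5), (complex(-1.0), 0.5)]
THREE_POINT = [(complex(3 ** 0.5), 1 / 6), (complex(0.0), 2 / 3), (complex(-(3 ** 0.5)), 1 / 6)]
```

These two laws share their first three moments but differ in the fourth (1 versus 3), so `match_to_order(SIGN, THREE_POINT, 3)` holds while `match_to_order(SIGN, THREE_POINT, 4)` does not; in the notation of (36) this is the case $r = 3$.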
The second proposition asserts that these good configurations occur very frequently:

Proposition 6.2 (Good configurations occur very frequently). Let $\varepsilon_1 > 0$ and $C, C_1, k \ge 1$. Let $1 \le i_1 < \cdots < i_k \le n$, let $1 \le p, q \le n$, let $e_1, \ldots, e_n$ be the standard basis of $\mathbb{C}^n$, and let $A(0) = (\zeta_{ij})_{1 \le i,j \le n}$ be a random Hermitian matrix with independent upper-triangular entries and $|\zeta_{ij}| \le n^{1/2} \log^C n$ for all $1 \le i, j \le n$, with $\zeta_{pq} = \zeta_{qp} = 0$, but with $\zeta_{ij}$ having mean zero and variance 1 for all other $ij$, except on the diagonal where the variance is instead $c$ for some absolute constant $c > 0$, and also being distributed continuously in the complex plane. Then $A(0), e_p, e_q$ obey the Good Configuration Condition in Proposition 6.1 for $i_1, \ldots, i_k$ and with the indicated value of $\varepsilon_1, C_1$ with overwhelming probability.
Proof. The proof of this proposition repeats the proof of [27, Prop. 44 in Sect. 5] almost exactly. Only the following changes have to be made:
• All references to [27, Th. 56] (i.e. Theorem 1.6) need to be replaced with Theorem 1.8.
• All references to [27, Prop. 58] (i.e. Proposition 1.7) need to be replaced with Proposition 1.10.
• The edge regions in which $\lambda_i(A(z))$ do not fall inside the bulk region $[(-2 + \varepsilon) n, (2 - \varepsilon) n]$ no longer need to be treated separately, thus simplifying the last paragraph of the proof somewhat.
Given these two propositions, the proof of Theorem 1.11 repeats the proof of [27, Th. 15 in Sect. 3.3] almost exactly. Only the following changes have to be made:
• All references to [27, Prop. 44] need to be replaced with Proposition 6.2.
The proof of Theorem 1.11 is now complete.

Acknowledgements. The authors thank the anonymous referee for helpful comments and references, and Horng-Tzer Yau for additional references.

Open Access. This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
References
1. Anderson, G., Guionnet, A., Zeitouni, O.: An introduction to random matrices. To be published by Cambridge Univ. Press
2. Bai, Z.D., Silverstein, J.: Spectral analysis of large dimensional random matrices. Mathematics Monograph Series 2, Beijing: Science Press, 2006
3. Bai, Z.D., Yin, Y.Q.: Convergence to the semicircle law. Ann. Probab. 16, 863–875 (1988)
4. Bai, Z.D., Yin, Y.Q.: Necessary and Sufficient Conditions for Almost Sure Convergence of the Largest Eigenvalue of a Wigner Matrix. Ann. Probab. 16, 1729–1741 (1988)
5. Deift, P.: Orthogonal polynomials and random matrices: a Riemann-Hilbert approach. Courant Lecture Notes in Mathematics, 3. New York University, Courant Institute of Mathematical Sciences, New York; Providence, RI: Amer. Math. Soc., 1999
6. Deift, P.: Universality for mathematical and physical systems. In: International Congress of Mathematicians Vol. I, Zürich: Eur. Math. Soc., 2007, pp. 125–152
7. Erdős, L., Schlein, B., Yau, H.-T.: Semicircle law on short scales and delocalization of eigenvectors for Wigner random matrices. Ann. Prob. 37(3), 815–852 (2009)
8. Erdős, L., Schlein, B., Yau, H.-T.: Local semicircle law and complete delocalization for Wigner random matrices. Commun. Math. Phys. 287(2), 641–655 (2009)
9. Erdős, L., Schlein, B., Yau, H.-T.: Wegner estimate and level repulsion for Wigner random matrices. Submitted, available at http://arxiv.org/abs/0811.2591v3 [math.ph], 2009
10. Erdős, L., Schlein, B., Yau, H.-T.: Universality of Random Matrices and Local Relaxation Flow. http://arxiv.org/abs/0907.5605v3 [math-ph], 2009
11. Erdős, L., Ramirez, J., Schlein, B., Yau, H.-T.: Universality of sine-kernel for Wigner matrices with a small Gaussian perturbation. http://arxiv.org/abs/0905.2089v1 [math-ph], 2009
12. Erdős, L., Ramirez, J., Schlein, B., Yau, H.-T.: Bulk universality for Wigner matrices. http://arxiv.org/abs/0905.4176v2 [math-ph], 2009
13. Erdős, L., Ramirez, J., Schlein, B., Tao, T., Vu, V., Yau, H.-T.: Bulk universality for Wigner hermitian matrices with subexponential decay. http://arxiv.org/abs/0906.4400v1 [math.PR], 2009
14. Forrester, P.: The spectral edge of random matrix ensembles. Nucl. Phys. B 402, 709–728 (1993)
15. Johansson, K.: Universality of the local spacing distribution in certain ensembles of Hermitian Wigner matrices. Commun. Math. Phys. 215(3), 683–705 (2001)
16. Johansson, K.: Universality for certain Hermitian Wigner matrices under weak moment conditions, preprint
17. Katz, N., Sarnak, P.: Random matrices, Frobenius eigenvalues, and monodromy. American Mathematical Society Colloquium Publications, 45. Providence, RI: Amer. Math. Soc., 1999
18. Khorunzhiy, O.: High Moments of Large Wigner Random Matrices and Asymptotic Properties of the Spectral Norm. http://arxiv.org/abs/0907.3743v2 [math.PR], 2009
19. Mehta, M.L.: Random Matrices and the Statistical Theory of Energy Levels. New York: Academic Press, 1967
20. Péché, S., Soshnikov, A.: On the lower bound of the spectral norm of symmetric random matrices with independent entries. Electron. Commun. Probab. 13, 280–290 (2008)
21. Péché, S., Soshnikov, A.: Wigner random matrices with non-symmetrically distributed entries. J. Stat. Phys. 129(5–6), 857–884 (2007)
22. Ruzmaikina, A.: Universality of the edge distribution of eigenvalues of Wigner random matrices with polynomially decaying distributions of entries. Commun. Math. Phys. 261(2), 277–296 (2006)
23. Sinai, Y., Soshnikov, A.: Central limit theorem for traces of large symmetric matrices with independent matrix elements. Bol. Soc. Brazil. Mat. 29, 1–24 (1998)
24. Sinai, Y., Soshnikov, A.: A refinement of Wigner's semicircle law in a neighborhood of the spectrum edge for random symmetric matrices. Func. Anal. Appl. 32, 114–131 (1998)
25. Soshnikov, A.: Universality at the edge of the spectrum in Wigner random matrices. Commun. Math. Phys. 207(3), 697–733 (1999)
26. Soshnikov, A.: Gaussian limit for determinantal random point fields. Ann. Probab. 30(1), 171–187 (2002)
27. Tao, T., Vu, V.: Random matrices: Universality of the local eigenvalue statistics. Submitted, available at http://arxiv.org/abs/0908.1982v4 [math.PR], 2010
28. Tracy, C., Widom, H.: Level spacing distribution and Airy kernel. Commun. Math. Phys. 159, 151–174 (1994)
29. Tracy, C., Widom, H.: On orthogonal and symplectic matrix ensembles. Commun. Math. Phys. 177, 727–754 (1996)

Communicated by H.-T. Yau
Commun. Math. Phys. 298, 573–583 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1046-3
Communications in
Mathematical Physics
A Quantum Boltzmann Equation for Haldane Statistics and Hard Forces; the Space-Homogeneous Initial Value Problem L. Arkeryd Mathematical Sciences, Chalmers, S-41296 Gothenburg, Sweden. E-mail: [email protected] Received: 27 August 2009 / Accepted: 12 January 2010 Published online: 8 April 2010 – © Springer-Verlag 2010
Abstract: The paper considers equations of Boltzmann type for Haldane exclusion statistics. Existence and some basic properties of the solutions are studied for the space homogeneous initial value problem with hard forces and angular cut-off. The approach uses strong $L^1$ compactness. Some of the technical estimates are based on $L^\infty$ decay properties, and the control of the filling factor on range estimates for the solutions.

1. Introduction

The quantum Boltzmann Haldane equation (BH) is a kinetic equation of Boltzmann type for confined quasi-particles with Haldane statistics [H], e.g. in the interior of condensed matter. This exclusion statistics interpolates between the Fermi and Bose quantum behaviours. In quantum statistical mechanics, the number of quantum states of $N$ identical particles occupying $G$ states is given by
$$\frac{(G + N - 1)!}{N!\,(G - 1)!} \quad \text{and} \quad \frac{G!}{N!\,(G - N)!}$$
in the boson resp. fermion cases. The interpolated number of quantum states for the fractional exclusion of Haldane and Wu is ([W])
$$\frac{(G + (N - 1)(1 - \alpha))!}{N!\,(G - \alpha N - (1 - \alpha))!}, \quad 0 < \alpha < 1. \qquad (1.1)$$
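The interpolation in (1.1) can be checked directly by extending the factorial through the gamma function, $x! = \Gamma(x + 1)$. The following is a small sketch under that assumption; the function name `haldane_states` is hypothetical.

```python
import math


def haldane_states(g, n, alpha):
    """Number of states from (1.1), with x! interpreted as Gamma(x + 1).

    g: number of single-particle states G, n: number of particles N,
    alpha: statistics parameter (0 gives the boson count, 1 the fermion count).
    """
    top = g + (n - 1) * (1.0 - alpha)
    bottom = g - alpha * n - (1.0 - alpha)
    log_w = math.lgamma(top + 1.0) - math.lgamma(n + 1.0) - math.lgamma(bottom + 1.0)
    return math.exp(log_w)
```

For example, with `g = 10` and `n = 3` the endpoint values are the binomial coefficients 220 (bosons, `alpha = 0`) and 120 (fermions, `alpha = 1`), and `alpha = 0.5` gives an intermediate count.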
When applied to the fractional quantum Hall effect, the Haldane statistics coincides with the two-dimensional anyon definition in terms of the braiding of particle trajectories. Haldane statistics may also be realized [BMB] for neutral fermionic atoms at ultra-low temperature in three dimensions at unitarity.

Elastic collisions in a Boltzmann type collision operator are pair collisions preserving mass, first moments, and energy. For two particles having pre-collisional velocities $v$, $v_*$ in $\mathbb{R}^d$, the velocities after collision are denoted by $v'$, $v'_*$. The density function in the corresponding variables is denoted by $f$, $f_*$ respectively $f'$, $f'_*$. The collision operator $Q$ for Haldane statistics, first introduced in [BBM], is
$$Q(f) = \int_{\mathbb{R}^d \times S^{d-1}} B(v - v_*, \omega) \left[ f' f'_* F(f) F(f_*) - f f_* F(f') F(f'_*) \right] dv_*\, d\omega.$$
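The conservation properties of an elastic pair collision can be illustrated with a few lines of code. The text does not spell out how $v'$, $v'_*$ depend on $(v, v_*, \omega)$; the sketch below assumes the standard $\omega$-representation $v' = v - ((v - v_*) \cdot \omega)\,\omega$, $v'_* = v_* + ((v - v_*) \cdot \omega)\,\omega$ with $|\omega| = 1$, and the function name `collide` is hypothetical.

```python
def collide(v, v_star, omega):
    """Post-collisional velocities in the (assumed) omega-representation.

    v, v_star: velocity vectors as lists of floats; omega: unit vector.
    """
    dot = sum((a - b) * w for a, b, w in zip(v, v_star, omega))
    v_prime = [a - dot * w for a, w in zip(v, omega)]
    v_star_prime = [b + dot * w for b, w in zip(v_star, omega)]
    return v_prime, v_star_prime


def momentum(v, v_star):
    """Total first moment of the pair (unit masses)."""
    return [a + b for a, b in zip(v, v_star)]


def energy(v, v_star):
    """Total kinetic energy of the pair up to the factor 1/2."""
    return sum(a * a for a in v) + sum(b * b for b in v_star)
```

Under this parametrisation `momentum` and `energy` take the same values before and after `collide`, which is the sense in which the collisions above preserve first moments and energy.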
Here $d\omega$ corresponds to the Lebesgue probability measure on the sphere. The collision kernel $B$ in the variables $(z, \omega) \in \mathbb{R}^d \times S^{d-1}$ is positive, locally integrable, and only depends on $|z|$ and $|(z, \omega)|$. The filling factor $F$ is given by
$$F(f) = (1 - \alpha f)^{\alpha} (1 + (1 - \alpha) f)^{1-\alpha}, \quad 0 < \alpha < 1.$$
It is concave with maximum value one at $f = 0$ for $\alpha \ge \frac{1}{2}$, and maximum value $(\frac{1}{\alpha} - 1)^{1-2\alpha} > 1$ at $f = \frac{1-2\alpha}{\alpha(1-\alpha)}$ for $\alpha < \frac{1}{2}$. With this filling factor, the collision operator vanishes identically for the Haldane equilibrium distribution functions as obtained in [W] under (1.1), but for no other functions.

The Boltzmann equation for the limiting cases, representing boson statistics ($\alpha = 0$) and fermion statistics ($\alpha = 1$), were first studied by [Lu1] ($\alpha = 0$ in a space-homogeneous isotropic situation) resp. [D,L]. In their case ($\alpha = 1$) the cancellation of quartic terms in the collision integral, and applicability of Lions' compactness result for the classical gain term, together allow for a space-dependent study resembling the weak $L^1$ analysis for the quadratic Boltzmann equation. For $0 < \alpha < 1$, however, there is no cancellation in the collision term. Moreover, the Lipschitz continuity of the collision term in the Fermi-Dirac case is now replaced by a weaker Hölder continuity. Those features led to the choice of a strong $L^1$-compactness approach in this paper.

Consider the initial value problem for the Boltzmann equation with Haldane statistics in the space-homogeneous case with velocities in $\mathbb{R}^d$, $d \ge 2$,
$$\frac{df}{dt} = Q(f). \qquad (1.2)$$
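The stated properties of the filling factor $F(f) = (1 - \alpha f)^{\alpha}(1 + (1 - \alpha) f)^{1-\alpha}$ on the range $[0, \frac{1}{\alpha}]$ are easy to probe numerically. The sketch below is illustrative only; the function names are hypothetical, and the clamp at zero merely guards against rounding at the endpoint $f = \frac{1}{\alpha}$.

```python
def filling_factor(f, alpha):
    """F(f) = (1 - alpha f)^alpha * (1 + (1 - alpha) f)^(1 - alpha) on [0, 1/alpha]."""
    u = max(0.0, 1.0 - alpha * f)  # clamp rounding errors at f = 1/alpha
    return u ** alpha * (1.0 + (1.0 - alpha) * f) ** (1.0 - alpha)


def argmax_filling(alpha):
    """Maximiser of F on [0, 1/alpha]: 0 for alpha >= 1/2, else (1 - 2 alpha) / (alpha (1 - alpha))."""
    return max(0.0, (1.0 - 2.0 * alpha) / (alpha * (1.0 - alpha)))


def max_filling(alpha):
    """Maximum value of F on [0, 1/alpha]."""
    return filling_factor(argmax_filling(alpha), alpha)
```

For `alpha = 0.25` the maximum `max_filling(0.25)` equals $(\frac{1}{\alpha} - 1)^{1-2\alpha} = \sqrt{3}$, while for `alpha = 0.75` it equals 1 and is attained at `f = 0`; in both cases `F` vanishes at `f = 1 / alpha`.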
Because of the filling factor $F$, the range for the initial value $f_0$ should belong to $[0, \frac{1}{\alpha}]$, which is then formally preserved by the equation. The general BH equation ($0 < \alpha < 1$) retains important properties from the Fermi-Dirac case (cf. [BBM]), but it has so far not been validated from basic quantum theory. Therefore the choice here is to consider the case of hard forces, $B(z, \omega) = |z|^{\beta} b(\frac{(z,\omega)}{|z|})$ with $0 < \beta \le 1$ and Grad angular cut-off, which also agrees with the better understood limiting cases $\alpha = 0, 1$. In the approach to existence in this paper, stronger limitations on the cut-off allow for weaker moment conditions for $f_0$. For the discussion below it is assumed that $0 < b \le c |\sin\theta \cos\theta|^{d-1}$, with an initial moment $(1 + |v|^s) f_0$ in $L^\infty$ for $s \ge d - 1 + \beta$.

In Section Two this initial value problem is considered for a family of approximations with bounded support for the kernel $B$, when $0 < f_0 \le \operatorname{esssup} f_0 < \frac{1}{\alpha}$. Starting from approximations with Lipschitz continuous filling factor, the corresponding solutions are shown to stay away uniformly from $\frac{1}{\alpha}$, the upper bound for the range. Uniform Lipschitz continuity follows for the approximating operators and leads to well posedness for the limiting problem. Section Three studies uniform $L^\infty$ moment bounds for the approximate solutions using an approach from the classical Boltzmann case [A]. Based on those preliminary results, in Section Four the main global existence result for hard forces, Theorem 4.1, is stated and proved. Section Five extends the result to initial values with $0 < f_0 \le \frac{1}{\alpha}$. Mass and first moments are conserved and energy is bounded by its initial value. That bound on energy in turn implies energy conservation using the arguments for energy conservation from [Lu2] or [MW]. A stability property of the solutions is discussed. The question of long time behaviour is left open.
Equations of Boltzmann Type for Haldane Exclusion Statistics
2. A Particular Collision Kernel

This initial value problem for (1.2) will first be considered for kernels of a particular pseudo-Maxwellian type with bounded support, to be used as approximations for the main problem of the paper. For $n$ sufficiently large, take $\bar{B}_n = \chi_n B$, where $\chi_n$ is the characteristic function of the complement of the set
$$\{(v, v_*, \omega);\ v, v_* \in \mathbb{R}^d,\ \omega \in S^{d-1},\ |v - v_*| > n,\ \text{or}\ |(v - v_*, \omega)| < \tfrac{1}{n},\ \text{or}\ |v - v_*| - |(v - v_*) \cdot \omega| < \tfrac{1}{n}\}.$$
Let $\chi(x)$ be a cut-off function on $\mathbb{R}$, equal to one for $x \le 0$ and vanishing for $x > 0$, define $\chi_n(x) = \chi(x - n)$, and set $\bar{B} = B_n = \chi_n(|v|) \chi_n(|v_*|) \chi_n(|v'|) \chi_n(|v'_*|) \bar{B}_n$.

Proposition 2.1. Suppose that the initial value $0 < f_0 \in L^1(\mathbb{R}^d)$ for the space-homogeneous equation (1.2) with kernel $\bar{B}$ has finite energy and $\operatorname{esssup} f_0(v) < \frac{1}{\alpha}$. This initial value problem is well posed in $L^1$ with conservation of mass and energy. The essential supremum of the solution remains smaller than $\frac{1}{\alpha}$ on any set $\{(t, v);\ 0 \le t \le t_0,\ v \in \mathbb{R}^d\}$.

Proof. A family of locally Lipschitz continuous approximations is first introduced that preserves mass and the bounds for $f_0$. In the collision operator for the approximations, $Q_\varepsilon$ with $\varepsilon > 0$, the modified filling factor is
$$F_\varepsilon(f) = \frac{(1 - \alpha f)}{(\varepsilon + (1 - \alpha f)^{1-\alpha})} (1 + (1 - \alpha) f)^{1-\alpha}. \qquad (2.1)$$
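The effect of the regularisation in (2.1) can be seen by comparing difference quotients near the upper end $f = \frac{1}{\alpha}$ of the range. The sketch below assumes that (2.1) reads $F_\varepsilon(f) = (1 - \alpha f)(1 + (1 - \alpha) f)^{1-\alpha} / (\varepsilon + (1 - \alpha f)^{1-\alpha})$, with $\varepsilon$ denoting the small regularisation parameter; the function names are hypothetical.

```python
def filling_factor(f, alpha):
    """Unregularised F(f) = (1 - alpha f)^alpha * (1 + (1 - alpha) f)^(1 - alpha)."""
    u = max(0.0, 1.0 - alpha * f)
    return u ** alpha * (1.0 + (1.0 - alpha) * f) ** (1.0 - alpha)


def filling_factor_eps(f, alpha, eps):
    """Assumed form of (2.1); requires eps > 0 so the denominator never vanishes."""
    u = max(0.0, 1.0 - alpha * f)
    return u * (1.0 + (1.0 - alpha) * f) ** (1.0 - alpha) / (eps + u ** (1.0 - alpha))


def end_slope(func, alpha, step=1e-8):
    """Magnitude of the difference quotient of func between 1/alpha - step and 1/alpha."""
    f_end = 1.0 / alpha
    return abs(func(f_end) - func(f_end - step)) / step
```

With `alpha = 0.5`, the quotient `end_slope(lambda f: filling_factor(f, 0.5), 0.5)` is of order $10^4$ and grows as `step` shrinks, whereas the regularised version with `eps = 0.1` stays of order 10. This reflects the bounded first derivatives that the argument relies on.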
It has bounded first derivatives with respect to $f$, when the range of $f$ is contained in $[0, \frac{1}{\alpha}]$, and so there is Lipschitz continuity for the corresponding approximate collision operators. With $\bar{f} = 0$ for $f < 0$, $\bar{f} = f$ for $0 \le f \le \frac{1}{\alpha}$, $\bar{f} = \frac{1}{\alpha}$ for $f > \frac{1}{\alpha}$, the Lipschitz continuity implies that the initial value problem
$$\frac{df}{dt} = Q_\varepsilon(\bar{f}), \qquad f(0, v) = f_0(v)$$
is solvable for $t$ near 0 with values in $L^\infty$. It follows from the equation that $f$ is strictly increasing whenever it attains the value 0, and strictly decreasing at the value $\frac{1}{\alpha}$. So $f$ is absolutely continuous in $t$ for a.e. $v$, satisfies $0 < f < \frac{1}{\alpha}$ for $t > 0$, and $f = \bar{f}$ solves the approximate initial value problem uniquely. This local solution can similarly be continued into a unique, global solution. For an initial value in $L^1$, the solution stays in $L^1$ and conserves mass and energy. The function $f(\cdot, v)$ is decreasing whenever
$$\int dv_*\, d\omega\, \bar{B} \left( f' f'_* F_\varepsilon(f) F_\varepsilon(f_*) - f f_* F_\varepsilon(f') F_\varepsilon(f'_*) \right) \le 0,$$
or
$$F_\varepsilon(f) \le \frac{\int dv_*\, d\omega\, \bar{B} f f_* F_\varepsilon(f') F_\varepsilon(f'_*)}{\int dv_*\, d\omega\, \bar{B} f' f'_* F_\varepsilon(f_*)},$$
and in particular for v in t = {v; |v| ≤ n, F ( f (t, v)) ≤ 41 } , if dv∗ dω B¯ f f ∗ F ( f )F ( f ∗ ) F ( f )(t, v) ≤ inf t . dv∗ dω B¯ f f ∗ F ( f ∗ )
(2.2)
Given t0 > 0 and ∪t≤t0 t = ∅, the infimum over ∪t≤t0 t is positive. This holds using the conditions on B and the bounds on f , since the denominator has a uniform (in ) upper bound. Define b0 by F( α1 − b0 ) = 21 . Take > 0 small so that F ( α1 − b0 ) ≥ 41 . A positive lower bound of the numerator can be obtained as follows. In the integrand, by definition f (t, v) ≥ α1 − b0 on t for t ≤ t0 . For the factor f (t, v∗ ), the exponential form of the equation gives a lower bound coming from the initial value term f 0 , which is multiplied by an exponential factor. The exponent is a negative time integral of dv∗ dω B¯ f ∗ F ( f )F ( f ∗ ), again with uniform bound, and so f (t, v∗ ) ≥ e−t0 C f 0 (v∗ ). For the factor F ( f )F ( f ∗ ) in the numerator, remove from the integrand the set (of uniformly in bounded measure), where F ( f ) < 21 , or F ( f ∗ ) < 21 . This leads for n large to an -independent positive lower bound for the numerator. Hence there is a constant C0 > 0 (independent of ) such that, if 0 < f 0 ≤ α1 − C1 with 0 < C1 ≤ C0 , then the function f (t, v) of t for v fixed, starts decreasing not later than when reaching the value α1 − C1 . And so the inequality is preserved by f (t) for 0 < t ≤ t0 . This implies that the derivatives (with respect to f) ddf F ( f )(t, v) are uniformly in bounded. It then follows from the equation in mild form that sup | f (t, v + q) − f (t, v)|dv ≤ | f 0 (v + q) − f 0 (v)|dv t≤t0 |v|
For t0 small enough this leads to lim
q→0 |v|
| f (t, v + q) − f (t, v)|dv = 0,
uniformly in $\varepsilon$ and $t < t_0$, implying precompactness of the $(f_\varepsilon)$ sequence. A limit satisfies the BH initial value problem for $t \le t_0$. Iteration of the argument proves the existence statement of Proposition 2.1 for $t > 0$ (i.e. globally in time with $\varepsilon = 0$ in (2.1)). Uniqueness follows by a contraction argument as well as stability with respect to a family of initial values uniformly strictly less than $\frac{1}{\alpha}$. The energy conservation and the essential supremum property are immediate.
3. Bounds for $L^\infty$ Moments

For kernels $B = B(z, \theta) = |z|^{\beta} b(\theta)$ of hard force type with $0 < b(\theta) \le C |\sin\theta \cos\theta|$, and with $B_n$ the corresponding truncated kernels of Section 2, the solutions $f_n$ of the previous section with $n$ large enough satisfy the following $L^\infty$ estimate.

Proposition 3.1. Assume that $0 < (1 + |v|^2) f_0 \in L^1(\mathbb{R}^d)$, $(1 + |v|^s) f_0 \in L^\infty(\mathbb{R}^d)$, where $s = d - 1 + \beta$, and fix $t_0 > 0$. For $d = 2$ consider $0 < \beta < 1$, and for $d > 2$ consider $0 < \beta \le 1$. Then for the solutions $f_n$ of Eq. (1.2) with kernel $B_n$ and initial
value f 0 , the moments (1 + |v|s ) f n (t, v) are uniformly in n ∈ N and t ≤ t0 bounded in L ∞ (IR d ), where s = min(s, 2β(d+1)+2 ). d The proof is an adaptation of a corresponding proof in the Boltzmann case. In fact a stronger result holds, the same moments are bounded as in the classical Boltzmann case, but contrary to that case, here the method does not lead to globally in time uniform bounds for those moments (due to the weaker lower bound estimates used below for the collision frequency). In particular for high enough initial moments in L ∞ , such moments of order exceeding d + 2 are under control for t > 0, which in turn implies energy conservation in Theorem 4.1 and Theorem 5.1 below. We give the main steps of the proof of Proposition 3.1 and refer to [A] for further details. Lemma 3.2. For large n the collision frequency is bounded from below, dv∗ dωBn f n (t, v∗ )F( f n )F( f n ∗ ) ≥ C(1 + |v|β )χn (|v|), with C independent of n. Proof. It follows from the mass and energy conservation that for t > 0, there is an upper bound B, independent of n and of t > 0, for the measure of the set where f n (t, v) > α1 − b0 := c0 . Outside of such a set a factor of type F( f n ) has the positive lower bound 21 . Also given t0 > 0, for t ≤ t0 and independent of n, the exponential form of the equation gives a lower bound for the factor f n (t, v∗ ) equal to the positive initial value f 0 , multiplied by an exponential factor, which is bounded from below for |v| ≤ n 0 (with bound independent of n ≥ n 0 ). So for n 0 large enough (depending on B) and for n ≥ n 0 , the following estimate holds for the collision frequency: dv∗ dωBn f n (t, v∗ )F( f n )F( f n ∗ ) ≥ dω dv∗ n χn0 B f n (t, v∗ )F( f n )F( f n ∗ ) f n , f n ∗ r
≥ Cr β χn (|v|)
f n , f n ∗ r
n 0 χn0 f n ∗ dv∗ dω ≥ C1 χn (|v|),
for some C1 > 0 independent of n. Since |v − v∗ |β ≥ |v|β − |v∗ |β , it also holds that dv∗ dωBn f n (t, v∗ )F( f n )F( f n ∗ ) β ≥ c1 |v| n χn0 f n ∗ dv∗ dω − c2 (1 + v 2 ) f n ∗ dv∗ χn (|v|). f n , f n ∗
The last two estimates together imply dv∗ dωBn f n (t, v∗ )F( f n )F( f n ∗ ) ≥ C(1 + |v|β )χn (|v|), with C depending on f 0 but not on n ≥ n 0 .
Lemma 3.3. The gain term for $f_n$ is bounded from above by $c (1 + |v|)^{-h} \chi_n(|v|)$ with $h = \beta + \min(d - 1 - \beta, \frac{2(1+\beta)}{d})$ and $c$ only depending on $f_0$ and $t_0$.
Proof. The Carleman representation for the gain term Jn ( f n , f n )(v) is f n dv n χn b(θ )|v − v |−d+1+β (cos θ )−β f n ∗ F( f n ∗ )d E ∗ . F( f n ) RN
(3.1)
E vv
Here E vv = {v1 ∈ Rd ; (v − v )(v − v1 ) = 0}, and d E ∗ denotes the Lebesgue measure of this plane. Given v, it holds that f (v ) f (v∗ ) = 0 for any choice of v∗ ∈ Rd , |v| provided f (v1 ) = 0 for all |v1 | ≥ √ . Given v, split f = f i + f o (= f iv + f ov ), 2
|v| where f i (v1 ) = f (v1 ) for |v1 | ≤ √ and = 0 otherwise. Then Jn ( f, f )(t, v) = 2 Jn ( f o , f o )(t, v) + Jn ( f o , f i )(t, v) + Jn ( f i , f o )(t, v), where the notation assumes the filling factors F( f ), F( f ∗ ) to be unchanged. Using the Carleman representation, straightforward elementary computations give J ( f n , f n )(t, v) ≤ cχn (|v|) f n (t, v )|v − v |−α χn (|v∗ |) f no (t, v∗ )d E ∗ dv ,
Rd
E vv
(3.2) where α = d − 1 − β. From here it is a consequence of the estimates in Lemma 3.4 below that (3.4) is bounded from above by cχn (|v|)(1 + |v|)−h with h = β + min(d − 1 − β, 2(1+β) d ) and c only depending on f 0 and t0 . That ends the proof of Lemma 3.3.
Lemma 3.4. It holds that
f (t, v )|v − v |−α dv ≤ c(1 + |v|)−a ,
Rd E vv
χn (|v∗ |) f no (t, v∗ )d E ∗ ≤ c(1 + |v|)−β ,
with $a = \min(\alpha, \frac{2(1+\beta)}{d})$ and $c$ only depending on $f_0$ and $t_0$. This follows with only minor modifications (similar to those in the proof of Lemma 3.2 above) of the corresponding proofs in [A] for the classical Boltzmann case. Recall that

Lemma 3.5. If $h_1, h_2$ are continuous, real valued functions on $\mathbb{R}_+$ with $h_1 > 0$, and $f' + h_1 f \le h_2$ ($t > 0$), then
$$\sup_{t > 0} f(t) \le \max\left( f(0),\ \sup_{t > 0} \frac{h_2(t)}{h_1(t)} \right).$$
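The comparison bound of Lemma 3.5 can be illustrated on the extremal case $f' + h_1 f = h_2$ with an explicit Euler scheme. This is only a sketch with hypothetical names; it assumes `dt * h1(t) <= 1` on the time grid, in which case each Euler step is a convex combination of the current value and $h_2 / h_1$, so the discrete solution obeys the same bound.

```python
import math


def simulate(f0, h1, h2, t_end=10.0, dt=1e-3):
    """Explicit Euler for f' = h2(t) - h1(t) f.

    Returns (max of f over the grid, max of h2 / h1 over the grid).
    """
    steps = int(round(t_end / dt))
    f, t = f0, 0.0
    f_max, ratio_max = f0, -math.inf
    for _ in range(steps):
        ratio_max = max(ratio_max, h2(t) / h1(t))
        f += dt * (h2(t) - h1(t) * f)
        t += dt
        f_max = max(f_max, f)
    return f_max, ratio_max


def h1_example(t):
    """A positive, growing damping coefficient (illustrative choice)."""
    return 1.0 + t


def h2_example(t):
    """A bounded positive source term (illustrative choice)."""
    return 2.0 + math.sin(t)
```

Calling `simulate(0.5, h1_example, h2_example)` returns a pair `(f_max, ratio_max)` with `f_max <= max(0.5, ratio_max)`, in line with the lemma; starting above the ratio, say from `f0 = 5.0`, the maximum is the initial value itself.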
Proof of Proposition 3.1. Given $t_0 > 0$, the proposition follows from Lemma 3.2, Lemma 3.4, and Lemma 3.5 with $s' = \min(d - 1 + \beta, \frac{2\beta(d+1)+2}{d})$.

4. Hard Force Collisions

In this section the above existence result for (1.2) with truncated kernels will be extended to hard force kernels with
$$0 < B(z, \omega) \le C |z|^{\beta} |\sin\theta \cos\theta|^{d-1}, \quad \text{where } 0 < \beta \le 1,\ d > 2, \text{ and } 0 < \beta < 1,\ d = 2. \qquad (4.1)$$
Equations of Boltzmann Type for Haldane Exclusion Statistics
Theorem 4.1. Let the initial value $0 < f_0 \in L^1$ of the space-homogeneous equation (1.2) for hard forces have finite energy and satisfy $\operatorname{ess\,sup} f_0(v) < \frac{1}{\alpha}$. If $\operatorname{ess\,sup}(1 + |v|^s) f_0 < \infty$ for $s = d - 1 + \beta$, then this initial value problem for (1.2) has a solution in the space of functions continuous from $t \ge 0$ into $L^1 \cap L^\infty$, which conserves mass and energy, and for $t_0 > 0$ given, has $\operatorname{ess\,sup}_{v,\, t \le t_0} |v|^{s'} f(t, v)$ bounded, where $s' = \min\left(s, \frac{2\beta(d+1)+2}{d}\right)$.
Proof. We shall use the solutions $f_n$ for the approximate kernels $B_n(v, \omega)$ of the previous sections to which Propositions 2.1 and 3.1 apply, and base the proof on a strong $L^1$-compactness property for the sequence $(f_n)$. Strong precompactness of the sequence $(f_n)$ in $L^1$ is equivalent to
$$\lim_{q \to 0} \int |f_n(v + q) - f_n(v)|\,dv = 0, \tag{4.2}$$
uniformly in n. 1 Given t0 > 0, by Proposition 3.1 there is λ > 0 such that f n (t, v) ≤ 2α for all n ∈ N and all |v| > λ. It follows from the proof of Proposition 2.1 that there is an n 0 ≥ λ and C > 0, such that f (t, v) ≤ α1 − C for all n ≥ n 0 and all |v| ≤ λ. Hence for functions f = f n with n ≥ n 0 , the derivative ddf F( f ) is uniformly bounded when t ≤ t0 , v ∈ R N . With Sgn the sign function for f n (t, q + v) − f n (t, v), it holds that d Sgn( f n (t, q + v) − f n (t, v)) = Sgn(Q n ( f n )(t, q + v) − Q n ( f n )(t, v)). (4.3) dt The right hand side is split into a sum of four differences, Sgn( dv∗ dωBn f n f n ∗ F( f n ∗ )(t, q + v) − dv∗ dωBn f n f n ∗ F( f n ∗ )(t, v))F( f n )(t, q + v) − Sgn( dv∗ dωBn f n∗ F( f n )F( f n ∗ )(t, q + v) − dv∗ dωBn f n∗ F( f n )F( f n ∗ )(t, v)) f n (t, v) + Sgn(F( f n )(t, q + v) − F( f n )(t, v)) dv∗ dωBn f n f n ∗ F( f n ∗ )(t, v) − Sgn( f n (t, q + v) − f n (t, v)) dv∗ dωBn f n ∗ F( f n )F( f n ∗ )(t, q + v).
(4.4)
Here the last, negative term is removed, and integrals of the first three terms will be estimated using the uniform bound on the derivative ddf F( f ) for the sequence ( f n )n≥n 0 and t ≤ t0 . In the third term, by the proof of Lemma 3.3 the (gain type) integral is uniformly in n bounded. And so the estimate of the third term from above by C | f n (t, v + q) − f n (t, v)|dvdt follows with C independent of n.
L. Arkeryd
The remaining two terms are split into two further differences. Of the four ensuing terms, the two terms dvdt dv∗ dω|Bn (v+q −v∗ , ω)− Bn (v−v∗ , ω)| f n (t, v∗ )(F( f n )F( f n ∗ ) f n )(t, q +v), dvdt dv∗ dω|Bn (v+q −v∗ , ω)− Bn (v−v∗ , ω)|F( f n )(t, v∗ )( f n f n ∗ F( f n ))(t, q +v), tend to zero when q → 0, uniformly with respect to n, since mass and energy of the f n ’s are conserved, and the factor F( f n ) is uniformly in v and n bounded. In a third term, dvdt dv∗ dωBn (v − v∗ , ω) f n (t, v∗ )|F( f n )F( f n ∗ )(t, q + v) −F( f n )F( f n ∗ )(t, v)| f n (t, q + v),
(4.5)
it is used that f (t, v)(1 + |v|2β ) is uniformly in n and t ≤ t0 bounded in L 1 ∩ L ∞ . The Carleman representation (3.1) together with Proposition 3.1, the condition (4.1), and the bound on the derivative ddf F( f ), can be used to estimate the difference in F( f n )F( f n ∗ ). When the difference is taken for F( f n ), the hyperplane integral in the Carleman representation is evaluated for the integrand (1 + |v − v∗ |β )−1 |v − v∗ |−d+1
b(θ )(sin θ )d−1−β (sin θ )β |v − v∗ |β (1 + |v − v∗ |β ) , (cos θ )d−1 (1 + |v|2β )(1 + |v∗ |2β )
which is convergent, and similarly for the difference in F( f n ∗ ). The terms in the integrand are evaluated at v (v + q, v∗ ) etc., and an upper bound C sup | f n (t, v + q ) − f n (t, v)|dvdt |q |≤|q|
is obtained with C independent of n. The same bound can actually be obtained for all the other terms of (4.3), if the integration with respect to time is first carried out, followed by taking a supremum with respect to |q | ≤ |q|, and only then the integration in v. This will be used. There remains the term Sgn( dv∗ dωBn (v − v∗ , ω)F( f n )(t, v∗ )( f n f n ∗ (t, q + v) − f n f n ∗ (t, v))F( f n )(t, q + v)).
(4.6)
We shall insert F( f ) = F(0) + f F ( ), where 0 < < f , and use that f n (t, v) ≤ c s , uniformly in n and for t ≤ t0 . Then 1+|v| the term resulting from F(0)F(0) (= 1), gives a Boltzmann gain term sequence, which is here compact (cf. [L]). It follows that (after taking the subsequence), uniformly in n, dvdt sup | dv∗ dωBn (v − v∗ , ω)( f n f n ∗ (t, q + v) − f n f n ∗ (t, v))| |q |≤|q|
converges to zero when q → 0. In the remaining terms, there is at least one factor from F( f n ) or F( f n ∗ ) that can be estimated by c s or c s , e.g. f n ∗ in the term Sgn
1+|v|
1+|v∗ |
dv∗ dωBn (v − v∗ , ω) f n ∗ F ( n )( f n (t, v (v + q , v∗ )) − f n (t, v (v, v∗ ))) f n ∗
can after integration be estimated from above by c | f (t, v (v + q , v∗ )) dv∗ dωBn (v − v∗ , ω) dvdt sup s n 1 + |v ∗| |q |≤|q| 1 β f (1 + |v | ) ≤ c dvdtdv∗ sup | f n (t, v + q ) − f n (t, v )| ∗ 1 + |v∗ |β n ∗ |q |≤|q| − f n (t, v)|(1 + |v∗ |2 ) f n (t, v∗ ). Here (4.1), together with |v∗ − v∗ | = cos θ |v − v∗ |, |v∗ − v∗ |β ≤ (1 + |v∗ |β )(1 + |v∗ |β ) were used. The other remaining terms can be handled similarly. That gives sup sup | f n (t, v + q ) − f n (t, v)|dv ≤ sup | f 0 (v + q ) t≤t0 |v|
|q |≤|q|
− f 0 (v)|dv + ct0 sup s≤t0
sup | f n (s, v + q ) − f n (s, v)|dv + o(1),
|q |≤|q|
(4.7)
when q → 0. The precompactness of ( f n ) follows from here, and we may choose the subsequence so that ( f n ) converges in L 1 . The limit f solves the initial value problem for (1.2). Obviously the solution f preserves the L ∞ bounds from ((1 + |v|s ) f n ). It also follows that the mass conservation of ( f n ) is preserved by the solution f , and that the energy of f is non-increasing. This completes the proof of the theorem.
Remark. The proof implies stability. Given a sequence of initial values $(f_{0n})$ with $\sup_n \operatorname{ess\,sup}_v f_{0n}(v) < \frac{1}{\alpha}$ and converging in $L^1$ to $f_0$, there is a subsequence of the solutions converging in $L^1$ to a solution with initial value $f_0$.
5. An Initial Layer Problem
Theorem 4.1 can be extended to the case of initial values $0 < f_0 \le \frac{1}{\alpha}$.
Theorem 5.1. In the space-homogeneous equation (1.2) for hard forces, let the initial value $0 < f_0 \in L^1$ have finite energy. If $\operatorname{ess\,sup}(1 + |v|^s) f_0 < \infty$ for $s = d - 1 + \beta$, then this initial value problem for (1.2) has a solution in the space of functions continuous from $t \ge 0$ into $L^1 \cap L^\infty$, which conserves mass and energy, and for $t_0 > 0$ given, has $\operatorname{ess\,sup}_{v,\, t \le t_0} |v|^{s'} f(t, v)$ bounded, where $s' = \min\left(s, \frac{2\beta(d+1)+2}{d}\right)$.
The theorem follows from a study of the negativity of the collision term for $f(t, v)$ close to $\frac{1}{\alpha}$. (The case of $f_0(v) = 0$ for certain $v$’s can be analyzed in a similar way.)
Proof. Introduce the approximate initial values $f_{0n} = \min\left(f_0, \frac{1}{\alpha} - \frac{1}{n}\right)$, and consider the corresponding initial value problems for (1.2). The solutions exist by Sect. 4. Using the frame of (4.3), (4.4) the limiting behaviour when $n \to \infty$ can be studied with the method
of Sect. 4. The differences from the analysis will be discussed next. The discussion holds uniformly in n. First consider the initial layer. Define b2 by F( α1 − b2 ) = 21 . Given t0 , by the 1 proofs in λ such that f (t, v) Sect. 3 there is < α − b2 if t ≤ t0 , |v| > λ. Set ν(v) = dv∗ dωB f ∗ F( f )F( f ∗ ) and ν˜ (v) = dv∗ dωB f f ∗ F( f ∗ ). It holds that Q f < 0 if F( f ) < fν˜ν , in particular if F( f ) < in f t fν˜ν := 2C1 . Here t = {v; |v| ≤ λ, F( f (t, v)) ≤ 21 } and obviously C1 > 0. Define b1 by F( α1 − b1 ) = min{ 41 , C1 }. Hence for t ≤ t0 it holds F( f ) ≤ C1 if and only if |v| ≤ λ and α1 ≥ f ≥ α1 − b1 . For such f -values F( f )˜ν ≤ C1 ν˜ ≤ 21 f ν, and so 1 1 1 Q f ≤ − f ν ≤ − ( − b1 )C12 B f ∗ dωdv∗ 2 2 α F( f ),F( f ∗ )>C1 t0 1 1 2 β ≤ − ( − b1 )C1 r inf exp(− νds)dωdv∗ |v|≤λ F( f ),F( f )>C1 ,|v−v∗ |>r 2 α 0 ∗ := −C2 , with C2 > 0. This gives a maximum time for the equation ddtf = Q f with initial value f 0 to reach f (t, v) ≤ α1 − b1 . Also for any t > 0 the solutions stay uniformly in n away from α1 . From here a version of the previous study of (4.4) can be used to prove existence for an initial time interval and an initial value f 0 not remaining uniformly in v away from 1 ∞ 1 α . At a few places the L estimate in time should then be replaced by an L -estimate to α α−1 providing a ’small handle factors of the type α(b + ct) (estimates for integrals of t factor’). Thus in the third term of (4.4) the factor F( f )(t, q + v) − F( f )(t, v) is now bounded by (c3 + c4 t α−1 )( f (t, q + v) − f ((t, v)). This gives an upper bound for the integral of the third term (c3 t + c4 t α ) sup | f (t, q + v) − f (t, v)|dv. (5.1) t≤t0
In the remaining two terms of (4.4) the B-differences are treated as in Sect. 4. For (4.5) the F( f )-difference is estimated as above (the previous third term), now giving the upper bound (c3 t + c4 t α ) supt≤t0 sup|q |≤|q| | f (t, v + q ) − f (t, v)|dv. There remains to consider (4.6) and (4.7) in the present setting. Again an estimate of F(θ )-factors by (c3 + c4 t α−1 ) leads to upper bounds of the type (5.1). Similarly to Sect. 4 the claims of the theorem follow.
Acknowledgement. The author would like to thank J. Bergh for useful discussions during the preparation of the paper. The insightful comments from one referee helped to improve the paper.
References
[A] Arkeryd, L.: L∞-estimates for the space-homogeneous Boltzmann equation. J. Stat. Phys. 31, 347–361 (1983)
[BBM] Bhaduri, R.K., Bhalero, R.S., Murthy, M.V.: Haldane exclusion statistics and the Boltzmann equation. J. Stat. Phys. 82, 1659–1668 (1996)
[BMB] Bhaduri, R.K., Murthy, M.V., Brack, M.: Fermionic ground state at unitarity and Haldane exclusion statistics. J. Phys. B 41, 115301 (2008)
[D] Dolbeault, J.: Kinetic models and quantum effects: a modified Boltzmann equation for Fermi-Dirac particles. Arch. Rat. Mech. Anal. 127, 101–131 (1994)
[H] Haldane, F.D.: Fractional statistics in arbitrary dimensions: a generalization of the Pauli principle. Phys. Rev. Lett. 67, 937–940 (1991)
[L] Lions, P.L.: Compactness in Boltzmann’s equation via Fourier integral operators and applications I, III. J. Math. Kyoto Univ. 34, 391–427, 539–584 (1994)
[Lu1] Lu, X.: A modified Boltzmann equation for Bose-Einstein particles: isotropic solutions and long time behaviour. J. Stat. Phys. 98, 1335–1394 (2000)
[Lu2] Lu, X.: Conservation of energy, entropy identity, and local stability for the spatially homogeneous Boltzmann equation. J. Stat. Phys. 96, 765–796 (1999)
[MW] Mischler, S., Wennberg, B.: On the spatially homogeneous Boltzmann equation. Ann. Inst. Henri Poincaré 16, 467–501 (1999)
[W] Wu, Y.S.: Statistical distribution for generalized ideal gas of fractional-statistics particles. Phys. Rev. Lett. 73, 922–925 (1994)
Communicated by H. Spohn
Commun. Math. Phys. 298, 585–611 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1086-8
Communications in
Mathematical Physics
Deformation Quasi-Hopf Algebras of Non-semisimple Type from Cochain Twists C. A. S. Young, R. Zegers Department of Mathematical Sciences, University of Durham, South Road, Durham DH1 3LE, UK. E-mail: [email protected]; [email protected] Received: 18 December 2008 / Accepted: 7 June 2010 Published online: 11 July 2010 – © Springer-Verlag 2010
Abstract: Given a symmetric decomposition g = h ⊕ p of a semisimple Lie algebra g, we define the notion of a p-contractible quantized universal enveloping algebra (QUEA): for these QUEAs the contraction g → g0 making p abelian is nonsingular and yields a QUEA of g0 . For a certain class of symmetric decompositions, we prove, by refining cohomological arguments due to Drinfel’d, that every QUEA of g0 so obtained is isomorphic to a cochain twist of the undeformed envelope U(g0 ). To do so we introduce the p-contractible Chevalley-Eilenberg complex and prove, for this class of symmetric decompositions, a version of Whitehead’s lemma for this complex. By virtue of the existence of the cochain twist, there exist triangular quasi-Hopf algebras based on these contracted QUEAs and, in the approach due to Beggs and Majid, the dual quantized coordinate algebras admit quasi-associative differential calculi of classical dimensions. As examples, we consider κ-Poincaré in 3 and 4 spacetime dimensions. 1. Introduction This paper is concerned with deformation quantizations of the universal enveloping algebras (UEAs) of a certain class of non-semisimple Lie algebras, and more particularly with proving that these deformations are cochain twists of their undeformed counterparts. The Lie algebras we consider have the property that they can be obtained by contraction of semisimple Lie algebras; among them is the Poincaré algebra, which is the case of clearest physical interest and will be the example we treat in detail. Let us first recall the situation concerning twists of (semi)simple Lie algebras. For any simple Lie algebra g, the standard Drinfel’d-Jimbo quantization Uh (g) comes equipped with a quasitriangular structure R, which provides the isomorphisms required to turn its category of representations into a quasitensor category [1–3]. 
As a quasitriangular Hopf algebra, $(U_h(\mathfrak{g}), R)$ is not twist-equivalent by any cocycle twist to $(U(\mathfrak{g}), 1 \otimes 1)$, the undeformed UEA equipped with the usual Hopf algebra structure and trivial triangular structure. It cannot be, because $R$ is strictly quasitriangular (i.e. $R_{21} \neq R^{-1}$) and
the property of triangularity is preserved by twisting. The celebrated result of Drinfel’d [4–6] is that $U_h(\mathfrak{g})$ and $U(\mathfrak{g})$ are twist equivalent in the larger category of quasi-Hopf algebras. Here one drops the requirement that the twist element $F$ obey the cocycle condition. Since this condition is what guarantees the preservation of coassociativity under twisting, a quasi-Hopf algebra may fail to be coassociative; but it does so in a controlled fashion, specified by the coassociator $\Phi$. In the special case $\Phi = 1 \otimes 1 \otimes 1$ one recovers the definition of a Hopf algebra. Drinfel’d showed that $(U_h(\mathfrak{g}), R, 1 \otimes 1 \otimes 1)$ can be reached by a cochain twist starting from $(U(\mathfrak{g}), R_{KZ}, \Phi_{KZ})$; that is, starting from the quasitriangular quasi-Hopf (qtqH) algebra obtained by equipping $U(\mathfrak{g})$ with a certain R-matrix $R_{KZ}$ and coassociator $\Phi_{KZ}$ constructed from the monodromies of a Knizhnik-Zamolodchikov system of equations, which in turn depend on the quadratic Casimir $t$ of $\mathfrak{g}$. One has $R_{KZ} = e^{ht}$, where the Casimir is split over the tensor product. An alternative possibility, discussed notably by Beggs and Majid [7,8], is to start instead with $(U(\mathfrak{g}), 1^{\otimes 2}, 1^{\otimes 3})$ and twist by the same cochain $F$ as in Drinfel’d’s construction. What results is, necessarily, the same algebra and coproduct as $U_h(\mathfrak{g})$, but now equipped with non-standard R-matrix $F_{21} F^{-1}$ and coassociator $\Phi_F$ (the coboundary of $F$, closely related to $\Phi_{KZ}$). $\Phi_F$ is central in the sense that the coproduct of $U_h(\mathfrak{g})$ is coassociative, but it is nevertheless non-trivial and thus $(U_h(\mathfrak{g}), F_{21} F^{-1}, \Phi_F)$ is a triangular but strictly quasi-Hopf algebra. Dually, the deformed function algebra $C_h(G)$ becomes a co-quasi-Hopf algebra which happens to be associative. The non-triviality of $\Phi_F$ is seen at the level of intertwiners of representations (the category of representations is symmetric but non-trivially monoidal).
It also appears when one tries to construct a differential calculus on $C_h(G)$, and in fact this was one of the original motivations for considering the set-up: Beggs and Majid showed that, at least for semisimple $\mathfrak{g}$, the standard quantum groups $C_h(G)$ do not admit any bi-covariant associative differential calculus of classical dimensions in deformation theory. But, by the existence of the cochain twist, one can construct a quasi-associative differential calculus $\Omega(C_h(G))$ of classical dimensions [7,8].
The results summarized above pertain to semisimple Lie algebras. To the authors’ knowledge no systematic extension to quantized universal enveloping algebras (QUEAs) of general Lie algebras is known. The proof of the existence of Drinfel’d’s twist element $F$ relies on the vanishing of a certain cohomology class, which holds for semisimple Lie algebras but may fail more generally. Drinfel’d did show [5] that any qtqH QUEA is isomorphic to a cochain twist of the undeformed UEA of the underlying Lie algebra $\mathfrak{g}$. So the existence of a qtqH structure is sufficient as well as necessary for the existence of the twist. But when $\mathfrak{g}$ is not semisimple, one has no general means of knowing whether the given QUEA admits a qtqH algebra structure.
In physics one is also concerned with non-semisimple Lie algebras. In particular, in trying to formulate non-commutative quantum field theory by (paralleling the usual approach in [9]) beginning with particles, regarded as irreducible representations of the algebra of spacetime symmetries, one is certainly interested in the Poincaré algebra $\mathfrak{iso}(1, n)$ and its deformations. Possibly the most well-known deformation of $U(\mathfrak{iso}(1, n))$ is the θ-deformation, which is dual to the usual non-commutative coordinate algebra $[x_i, x_j] = \theta_{ij}$, with $\theta_{ij}$ constant. It is known to be twist-equivalent to $U(\mathfrak{iso}(1, n))$ by a cocycle twist [10,11], making it in a sense a rather mild deformation.
The results presented here will apply, rather, to what is referred to as the κ-deformation of U(iso(1, n)) [12–19]. κ-Poincaré can be understood in more than one way. From one perspective, it arises as a certain bicrossproduct [20], and this viewpoint allows for a nice geometrical interpretation discussed in [21]. Another formulation [22–24] is as a
particular contraction limit of the appropriate real form of the standard Drinfel’d-Jimbo QUEA $U_h(\mathfrak{so}(n + 1, \mathbb{C}))$. It is this property which will be relevant in the present work.
The main idea, then, is to consider a class of QUEAs obtained by applying to (e.g. the standard) QUEAs $U_h(\mathfrak{g})$ of semisimple Lie algebras $\mathfrak{g}$ a contraction procedure modelled on that used to obtain κ-Poincaré. As we recall in detail below, to every symmetric decomposition $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$ of a Lie algebra $\mathfrak{g}$ there is associated an İnönü-Wigner contraction, in which $\mathfrak{p}$ is rescaled to become an abelian ideal of the contracted Lie algebra $\mathfrak{g}_0$. Whenever the contraction procedure is non-singular at the level of the $U_h(\mathfrak{g})$, this yields a quantization $U_h(\mathfrak{g}_0)$ of $U(\mathfrak{g}_0)$. One must specify how the formal deformation parameter is rescaled in the contraction limit to produce the limiting parameter $h$; obviously there are many possibilities, and we will consider the choice that ensures that κ-Poincaré is captured by our results. We will show (Theorem 6.1) that, given a certain restriction on the allowed symmetric decomposition (see Definition 3.2), every such QUEA $U_h(\mathfrak{g}_0)$ is isomorphic to a twist of the undeformed UEA $U(\mathfrak{g}_0)$ by a cochain $F_0$. We do this by refining the cohomological arguments of Drinfel’d so as to prove that the twist element $F$ which relates $U(\mathfrak{g})$ and $U_h(\mathfrak{g})$ can be chosen to be non-singular in the contraction limit. Since $U(\mathfrak{g}_0)$ can always be endowed with the trivial qtqH structure $R = 1^{\otimes 2}$, $\Phi = 1^{\otimes 3}$, the existence of this twist $F_0$ means that one can certainly obtain (see Corollary 5.4 below) a triangular quasi-Hopf algebra $(U_h(\mathfrak{g}_0), (F_0)_{21} F_0^{-1}, \Phi_{F_0})$. And, from [7,8], the deformed coordinate algebra $C_h(G_0)$ dual to $U_h(\mathfrak{g}_0)$ will admit a quasi-associative bicovariant differential calculus of classical dimensions. It is a separate question whether $U_h(\mathfrak{g}_0)$ admits a quasitriangular Hopf algebra structure. In Sect.
5, we give a necessary condition for such a structure to arise by contraction (see Corollary 5.5). Examples, in the case of κ-Poincaré, are in Sect. 7.
The paper is organised as follows. In Sect. 2, we recall the definition of symmetric semisimple Lie algebras. The important notion of contractibility is introduced in Sect. 3 after a brief reminder of the definitions of the filtered and graded algebras associated to UEAs. Sect. 4 is dedicated to the cohomology of associative algebras and Lie algebras. After a brief account of Hochschild and Chevalley-Eilenberg cohomology, we introduce the notion of contractible Chevalley-Eilenberg cohomology. We establish, in particular, the vanishing of the first contractible Chevalley-Eilenberg cohomology module for symmetric semisimple Lie algebras possessing the restriction property (Definition 3.2). This will be crucial in proving the existence of a contractible twist. In Sect. 5, the usual rigidity theorems for semisimple Lie algebras are then refined, with special regard to the contractibility of the structures. We construct, in particular, a contractible twist from every contractible QUEA of restrictive type to the undeformed UEA of the underlying Lie algebra. The actual contraction is performed in Sect. 6. In Sect. 7 we comment on the implications of our mathematical results for the particular example of κ-Poincaré. We discuss how they are compatible with previous work and explain certain previous results. Throughout Sects. 1 through 6, $\mathbb{K}$ denotes a field of characteristic zero.
2. Symmetric Decompositions of Lie Algebras
Let us briefly review some well-known facts concerning symmetric semisimple Lie algebras. Following [25,26], we have
Definition 2.1. A symmetric Lie algebra is a pair $(\mathfrak{g}, \theta)$, where $\mathfrak{g}$ is a Lie algebra and $\theta : \mathfrak{g} \to \mathfrak{g}$ is an involutive (i.e. $\theta \circ \theta = \mathrm{id}$ and $\theta \neq \mathrm{id}$) automorphism of Lie algebras.
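Definition 2.1 can be illustrated on a small matrix example. The sketch below is an assumption-laden illustration rather than anything taken from the paper: it picks $\mathfrak{g} = \mathfrak{so}(3)$ with $\theta$ given by conjugation with $\mathrm{diag}(1, 1, -1)$, computes the $\pm 1$ eigenspaces of $\theta$ (called `h_basis` and `p_basis` here), and checks numerically that brackets respect the eigenspace grading and that brackets of elements of the $-1$ eigenspace span the $+1$ eigenspace. All helper names are hypothetical.

```python
import numpy as np

def E(i, j, n=3):
    """Antisymmetric matrix e_ij - e_ji, a standard basis element of so(n)."""
    m = np.zeros((n, n))
    m[i, j], m[j, i] = 1.0, -1.0
    return m

def bracket(a, b):
    """Matrix commutator [a, b]."""
    return a @ b - b @ a

def in_span(x, vectors):
    """True if x lies in the linear span of the given matrices."""
    A = np.stack([v.ravel() for v in vectors], axis=1)
    coeffs = np.linalg.lstsq(A, x.ravel(), rcond=None)[0]
    return bool(np.allclose(A @ coeffs, x.ravel()))

# Assumed example: theta is conjugation by D = diag(1, 1, -1); D is its own inverse.
D = np.diag([1.0, 1.0, -1.0])

def theta(x):
    return D @ x @ D

# This basis of so(3) happens to consist of eigenvectors of theta.
basis = [E(0, 1), E(0, 2), E(1, 2)]
h_basis = [x for x in basis if np.allclose(theta(x), x)]    # +1 eigenspace
p_basis = [x for x in basis if np.allclose(theta(x), -x)]   # -1 eigenspace

involutive = all(np.allclose(theta(theta(x)), x) for x in basis)
automorphism = all(
    np.allclose(theta(bracket(a, b)), bracket(theta(a), theta(b)))
    for a in basis for b in basis
)
h_closed = all(in_span(bracket(a, b), h_basis) for a in h_basis for b in h_basis)
hp_in_p = all(in_span(bracket(a, b), p_basis) for a in h_basis for b in p_basis)
pp_in_h = all(in_span(bracket(a, b), h_basis) for a in p_basis for b in p_basis)

# Rank of the family of brackets of -1 eigenvectors, compared with dim of the +1 eigenspace.
pp_brackets = np.stack([bracket(a, b).ravel() for a in p_basis for b in p_basis])
pp_spans_h = bool(np.linalg.matrix_rank(pp_brackets) == len(h_basis))
```

In this example the $+1$ eigenspace is a copy of $\mathfrak{so}(2)$ and the $-1$ eigenspace is two-dimensional, so the pair is the $n = 2$ member of the family $(\mathfrak{so}(n+1), \mathfrak{so}(n))$.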
As $\theta \circ \theta = \mathrm{id}$, the eigenvalues of $\theta$ are $+1$ and $-1$. Let $\mathfrak{h} = \ker(\theta - \mathrm{id})$ and $\mathfrak{p} = \ker(\theta + \mathrm{id})$ be the corresponding eigenspaces. Every such $\theta$ thus defines a symmetric decomposition of $\mathfrak{g}$, i.e. a triple $(\mathfrak{g}, \mathfrak{h}, \mathfrak{p})$ such that
• $\mathfrak{h} \subset \mathfrak{g}$ is a Lie subalgebra;
• $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$ as $\mathbb{K}$-modules;
• $[\mathfrak{h}, \mathfrak{p}] \subseteq \mathfrak{p}$ and $[\mathfrak{p}, \mathfrak{p}] \subseteq \mathfrak{h}$.
Any Lie subalgebra $\mathfrak{h}$ of $\mathfrak{g}$ that is the fixed point set of some involutive automorphism will be referred to as a symmetrizing subalgebra. If, in addition, $\mathfrak{g}$ is semisimple then $\mathfrak{p}$ must be the orthogonal complement of $\mathfrak{h}$ in $\mathfrak{g}$ with respect to the (non-degenerate) Killing form, and thus every given symmetrizing subalgebra $\mathfrak{h}$ uniquely determines $\mathfrak{p}$ and hence $\theta$. In this case, we shall refer to $(\mathfrak{g}, \mathfrak{h})$ as a symmetric pair. A symmetric semisimple Lie algebra $(\mathfrak{g}, \theta)$ is said to be diagonal if $\mathfrak{g} = \mathfrak{v} \oplus \mathfrak{v}$ for some semisimple Lie algebra $\mathfrak{v}$ and $\theta(x, y) = (y, x)$ for all $(x, y) \in \mathfrak{g}$. A symmetric Lie algebra splits into symmetric subalgebras $(\mathfrak{g}_i, \theta_i)_{i \in I}$ if $\mathfrak{g} = \bigoplus_{i \in I} \mathfrak{g}_i$ and the restrictions $\theta|_{\mathfrak{g}_i} = \theta_i$ for all $i \in I$.
Lemma 2.2. Every symmetric semisimple Lie algebra $(\mathfrak{g}, \theta)$ splits into a diagonal symmetric Lie algebra $(\mathfrak{g}_d, \theta_d)$ and a collection of symmetric simple Lie subalgebras $(\mathfrak{g}_i, \theta_i)_{i \in I}$.
A proof can be found in Chap. 8 of [26]. Lemma 2.2 allows for a complete classification of the symmetric semisimple Lie algebras; see [26,27]. It also follows that we have the following
Lemma 2.3. Let $(\mathfrak{g}, \theta)$ be a symmetric semisimple Lie algebra and let $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$ be the associated symmetric decomposition of $\mathfrak{g}$. Then $\mathfrak{h}$ is linearly generated by $[\mathfrak{p}, \mathfrak{p}]$.
Proof. By virtue of Lemma 2.2, it suffices to prove this result on symmetric simple Lie algebras and on diagonal symmetric Lie algebras. Let us first assume that $\mathfrak{g}$ is simple. The linear span of $[\mathfrak{p}, \mathfrak{p}]$ defines a non-trivial ideal in $\mathfrak{h}$ and $\mathrm{span}([\mathfrak{p}, \mathfrak{p}]) \oplus \mathfrak{p}$ therefore defines a non-trivial ideal in $\mathfrak{g}$. If we assume that $\mathfrak{g}$ is simple, it immediately follows that $\mathrm{span}([\mathfrak{p}, \mathfrak{p}]) = \mathfrak{h}$. Suppose now that $(\mathfrak{g}, \theta)$ is a diagonal symmetric Lie algebra, i.e.
that there exists a semisimple Lie algebra $\mathfrak{v}$ such that $\mathfrak{g} = \mathfrak{v} \oplus \mathfrak{v}$ and $\theta(x, y) = (y, x)$ for all $(x, y) \in \mathfrak{g}$. In this case, we have a symmetric decomposition $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$, where $\mathfrak{h}$ is the set of elements of the form $(x, x)$ for all $x \in \mathfrak{v}$, whereas $\mathfrak{p}$ is the set of elements of the form $(x, -x)$ for all $x \in \mathfrak{v}$. We naturally have $[\mathfrak{p}, \mathfrak{p}] \subseteq \mathfrak{h}$. Now, as $\mathfrak{v}$ is semisimple, it follows that for every $x \in \mathfrak{v}$, there exist $y, z \in \mathfrak{v}$ such that $x = [y, z]$. Then for all $(x, x) \in \mathfrak{h}$, we have $(x, x) = ([y, z], [y, z]) = [(y, y), (z, z)] = [(y, -y), (z, -z)]$. But both $(y, -y)$ and $(z, -z)$ are in $\mathfrak{p}$.
3. Contractible QUEAs
3.1. Filtrations of the Universal Enveloping Algebra. Given a Lie algebra $\mathfrak{g}$ over $\mathbb{K}$, its universal enveloping algebra $U(\mathfrak{g})$ is defined as the quotient of the graded tensor algebra $T\mathfrak{g} = \bigoplus_{n \ge 0} \mathfrak{g}^{\otimes n}$ by the two-sided ideal $I(\mathfrak{g})$ generated by the elements of the form $x \otimes y - y \otimes x - [x, y]$, for all $x, y \in \mathfrak{g}$. This quotient constitutes a filtered $\mathbb{K}$-algebra, i.e. there exists an increasing sequence
$$\{0\} \subset F_0(U(\mathfrak{g})) \subset \cdots \subset F_n(U(\mathfrak{g})) \subset \cdots \subset U(\mathfrak{g}), \tag{3.1}$$
such that$^1$
$$U(\mathfrak{g}) = \bigcup_{n \ge 0} F_n(U(\mathfrak{g})) \quad \text{and} \quad F_n(U(\mathfrak{g})) \cdot F_m(U(\mathfrak{g})) \subset F_{n+m}(U(\mathfrak{g})). \tag{3.2}$$
The elements of this sequence are, for all $n \in \mathbb{N}_0$,
$$F_n(U(\mathfrak{g})) = \bigoplus_{m=0}^{n} \mathfrak{g}^{\otimes m} / I(\mathfrak{g}). \tag{3.3}$$
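In particular, $F_0(U(\mathfrak{g})) = \mathbb{K}$ and $F_1(U(\mathfrak{g})) = \mathbb{K} \oplus \mathfrak{g}$. Let us identify $\mathfrak{g}$ with its image under the canonical inclusion $\mathfrak{g} \hookrightarrow U(\mathfrak{g})$, and further write $x_1 \cdots x_n$ for the equivalence class of $x_1 \otimes \cdots \otimes x_n$. In this notation, $F_n(U(\mathfrak{g}))$ is linearly generated by elements that can be written as words composed of at most $n$ symbols from $\mathfrak{g}$. We define the left action of $\mathfrak{g}$ on $\mathfrak{g}^{\otimes n}$ by extending the adjoint action $x \triangleright x' = [x, x']$ of $\mathfrak{g}$ on $\mathfrak{g}$ as a derivation:
$$x \triangleright (x_1 \otimes \cdots \otimes x_n) = \sum_{i=1}^{n} x_1 \otimes \cdots \otimes [x, x_i] \otimes \cdots \otimes x_n \in \mathfrak{g}^{\otimes n}, \tag{3.4}$$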
for all $x, x_1, \dots, x_n \in \mathfrak{g}$. In this way we endow $T\mathfrak{g}$ with the structure of a left $\mathfrak{g}$-module. As the ideal $I(\mathfrak{g})$ is stable under this action, the $F_n(U(\mathfrak{g}))$ are also left $\mathfrak{g}$-modules. We therefore have a filtration of $U(\mathfrak{g})$ not only as a $\mathbb{K}$-algebra, but also as a left $\mathfrak{g}$-module. We will also need such a filtration on $(U(\mathfrak{g}))^{\otimes 2}$. In fact, for all $m \in \mathbb{N}_0$, there is a $\mathbb{K}$-algebra filtration on the universal envelope $U(\mathfrak{g}^{\oplus m})$ of the Lie algebra $\mathfrak{g}^{\oplus m}$, as defined above. If we endow $\mathfrak{g}^{\oplus m}$ with the structure of a left $\mathfrak{g}$-module according to
$$x \triangleright (x_1, \dots, x_m) := ([x, x_1], \dots, [x, x_m]), \tag{3.5}$$
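and extend this action to all of $U(\mathfrak{g}^{\oplus m})$ as a derivation, then we have a filtration of $U(\mathfrak{g}^{\oplus m})$ as a left $\mathfrak{g}$-module. But there is a natural isomorphism
$$\rho_m : U(\mathfrak{g}^{\oplus m}) \xrightarrow{\ \sim\ } (U(\mathfrak{g}))^{\otimes m} \tag{3.6}$$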
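of $\mathbb{K}$-algebras (see e.g. [25, Sect. 2.2]). This induces a left action of $\mathfrak{g}$ on $(U(\mathfrak{g}))^{\otimes m}$ and a filtration of $(U(\mathfrak{g}))^{\otimes m}$ as a left $\mathfrak{g}$-module. We write the elements of this filtration as $F_n((U(\mathfrak{g}))^{\otimes m})$.
$^1$ Although $F_n(U(\mathfrak{g})) \cdot F_m(U(\mathfrak{g}))$ is usually strictly contained in $F_{n+m}(U(\mathfrak{g}))$, it linearly generates the latter.
Given now any symmetric decomposition
$$\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}, \tag{3.7}$$
there is an associated bifiltration $\left(F_{n,m}(U(\mathfrak{g}))\right)_{n,m \in \mathbb{N}_0}$ of $U(\mathfrak{g})$, i.e. a doubly increasing sequence
$$F_{n,m}(U(\mathfrak{g})) \subset F_{n+1,m}(U(\mathfrak{g})) \quad \text{and} \quad F_{n,m}(U(\mathfrak{g})) \subset F_{n,m+1}(U(\mathfrak{g})), \tag{3.8}$$
such that
$$U(\mathfrak{g}) = \bigcup_{n,m \ge 0} F_{n,m}(U(\mathfrak{g})) \quad \text{and} \quad F_{n,m}(U(\mathfrak{g})) \cdot F_{k,l}(U(\mathfrak{g})) \subset F_{n+k,m+l}(U(\mathfrak{g})), \tag{3.9}$$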
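for all $n, m, k, l \in \mathbb{N}_0$. The elements of this sequence are, for all $n, m \in \mathbb{N}_0$,
$$F_{n,m}(U(\mathfrak{g})) = \bigoplus_{p=0}^{n} \bigoplus_{q=0}^{m} \operatorname{Sym}\left(\mathfrak{h}^{\otimes p} \otimes \mathfrak{p}^{\otimes q}\right) / I(\mathfrak{g}), \tag{3.10}$$
where, for all $n \in \mathbb{N}_0$ and all $\mathbb{K}$-submodules $X_1, \dots, X_n \subset \mathfrak{g}$,
$$\operatorname{Sym}(X_1 \otimes \cdots \otimes X_n) = \bigoplus_{\sigma \in S_n} X_{\sigma(1)} \otimes \cdots \otimes X_{\sigma(n)} \tag{3.11}$$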
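is the direct sum over all permutations of submodules in the tensor product. Each $F_{n,m}(U(\mathfrak{g}))$ is therefore the left $\mathfrak{h}$-module linearly generated by elements of $U(\mathfrak{g})$ that can be written as words containing at most $n$ symbols in $\mathfrak{h}$ and at most $m$ symbols in $\mathfrak{p}$. In particular, $F_{1,0}(U(\mathfrak{g})) = \mathbb{K} \oplus \mathfrak{h}$ and $F_{0,1}(U(\mathfrak{g})) = \mathbb{K} \oplus \mathfrak{p}$. We also have, for all $m, n \in \mathbb{N}_0$,
$$F_{n,m}(U(\mathfrak{g})) \subset F_{n+m}(U(\mathfrak{g})) \quad \text{and} \quad F_n(U(\mathfrak{g})) = \sum_{m=0}^{n} F_{n-m,m}(U(\mathfrak{g})). \tag{3.12}$$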
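As a quick sanity check of these relations in the lowest non-trivial degree, the following short computation may help. It is only a sketch, and it reads the second relation in (3.12) as a sum of subspaces of $U(\mathfrak{g})$, which is an assumption; the elements $x$ and $y$ are arbitrary illustrative choices.

```latex
% Case n = 1 of (3.12), using F_{1,0} = K + h and F_{0,1} = K + p:
\begin{align*}
F_{1,0}(U(\mathfrak{g})) + F_{0,1}(U(\mathfrak{g}))
  &= (\mathbb{K} \oplus \mathfrak{h}) + (\mathbb{K} \oplus \mathfrak{p}) \\
  &= \mathbb{K} \oplus \mathfrak{h} \oplus \mathfrak{p}
   = \mathbb{K} \oplus \mathfrak{g}
   = F_1(U(\mathfrak{g})).
\end{align*}
% Multiplicativity (3.9) followed by the first relation in (3.12):
% for x in h and y in p, the word xy satisfies
\begin{equation*}
xy \in F_{1,0}(U(\mathfrak{g})) \cdot F_{0,1}(U(\mathfrak{g}))
   \subset F_{1,1}(U(\mathfrak{g}))
   \subset F_{2}(U(\mathfrak{g})).
\end{equation*}
```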
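In complete analogy with the $F_n((U(\mathfrak{g}))^{\otimes m})$, we can construct bifiltrations $F_{n,p}((U(\mathfrak{g}))^{\otimes m})$ of all the $m$-fold tensor products of $U(\mathfrak{g})$.
3.2. Symmetric tensors. Let $S(\mathfrak{g})$ be the graded algebra associated to the filtration of $U(\mathfrak{g})$ by setting, for all $n \in \mathbb{N}_0$,
$$S_n(\mathfrak{g}) = F_n(U(\mathfrak{g})) / F_{n-1}(U(\mathfrak{g})) \quad \text{and} \quad S(\mathfrak{g}) = \bigoplus_{n \ge 0} S_n(\mathfrak{g}). \tag{3.13}$$
Since the $F_n(U(\mathfrak{g}))$ are left $\mathfrak{g}$-modules, so are the $S_n(\mathfrak{g})$. The symmetrization map, $\operatorname{sym} : S(\mathfrak{g}) \to U(\mathfrak{g})$, defined by
$$\operatorname{sym}(x_1 \cdots x_n) = \frac{1}{n!} \sum_{\sigma \in S_n} x_{\sigma(1)} \cdots x_{\sigma(n)} \tag{3.14}$$
for all $n \in \mathbb{N}_0$ and all $x_1, \dots, x_n \in \mathfrak{g}$, constitutes an isomorphism of left $\mathfrak{g}$-modules$^2$. The image of a given $S_n(\mathfrak{g})$ through $\operatorname{sym}$ is the $\mathfrak{g}$-module of symmetric tensors in $\mathfrak{g}^{\otimes n}$. If now $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$ is a symmetric decomposition, let
$$S_{m,n}(\mathfrak{g}) = F_{m,n}(U(\mathfrak{g})) / F_{m+n-1}(U(\mathfrak{g})), \tag{3.15}$$
for all $m, n \in \mathbb{N}_0$. These obviously constitute left $\mathfrak{h}$-modules. As such, they are isomorphic to the left $\mathfrak{h}$-modules of symmetric tensors in the $\operatorname{Sym}\left(\mathfrak{h}^{\otimes m} \otimes \mathfrak{p}^{\otimes n}\right)$, which are linearly generated by totally symmetric words with exactly $m$ symbols in $\mathfrak{h}$ and exactly $n$ symbols in $\mathfrak{p}$. Note that these $\mathfrak{h}$-modules are mixed under the left $\mathfrak{p}$-action. Indeed, let $m, n \in \mathbb{N}_0$ be two non-negative integers and let $x \in S_{m,n}(\mathfrak{g})$. We have:
• if $m > 0$ and $n = 0$, then $\mathfrak{p} \triangleright x \in S_{m-1,n+1}(\mathfrak{g})$;
• if $m > 0$ and $n > 0$, then $\mathfrak{p} \triangleright x \in S_{m+1,n-1}(\mathfrak{g}) \oplus S_{m-1,n+1}(\mathfrak{g})$;
• if $m = 0$ and $n > 0$, then $\mathfrak{p} \triangleright x \in S_{m+1,n-1}(\mathfrak{g})$.
$^2$ Recall that we assume $\mathbb{K}$ has characteristic zero.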
This is better represented by the following diagram in $S_{m+n}(\mathfrak{g})$.
[Diagram: the modules $\cdots$, $S_{m+1,n-1}(\mathfrak{g})$, $S_{m,n}(\mathfrak{g})$, $S_{m-1,n+1}(\mathfrak{g})$, $S_{m-2,n+2}(\mathfrak{g})$, $\cdots$ drawn in two rows, with arrows labelled $\mathfrak{h}$ sending each module to itself and arrows labelled $\mathfrak{p}$ sending each module to its two neighbours.]
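The mixing of the modules $S_{m,n}(\mathfrak{g})$ under the $\mathfrak{p}$-action can be made concrete on a small instance. The example assumes, purely for illustration (this choice is not made in the text), that $\mathfrak{g} = \mathfrak{so}(3)$ with $\mathfrak{h}$ spanned by a single element $h$ and $\mathfrak{p}$ spanned by $p_1, p_2$, normalised so that $[h, p_1] = -p_2$, $[h, p_2] = p_1$ and $[p_1, p_2] = -h$. Extending the adjoint action as a derivation, as in (3.4), one finds for the word $h\,p_1 \in S_{1,1}(\mathfrak{g})$:

```latex
% Assumed brackets: [h, p_1] = -p_2, [h, p_2] = p_1, [p_1, p_2] = -h.
\begin{align*}
p_2 \triangleright (h\, p_1)
  &= [p_2, h]\, p_1 + h\, [p_2, p_1] \\
  &= (-p_1)\, p_1 + h\, h \\
  &= h^2 - p_1^2
   \;\in\; S_{2,0}(\mathfrak{g}) \oplus S_{0,2}(\mathfrak{g}).
\end{align*}
```

This is the case $m = n = 1$ of the second rule above: acting with an element of $\mathfrak{p}$ produces one component with an extra symbol in $\mathfrak{h}$ and one with an extra symbol in $\mathfrak{p}$.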
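Using the action (3.5) of $\mathfrak{g}$ on $\mathfrak{g}^{\oplus m}$ we have entirely analogous structures for $\mathfrak{g}^{\oplus m}$ with
$$S_{n,p}(\mathfrak{g}^{\oplus m}) = F_{n,p}(U(\mathfrak{g}^{\oplus m})) / F_{n+p-1}(U(\mathfrak{g}^{\oplus m})). \tag{3.16}$$
In view of (3.6), it follows that
$$S_{n,p}(\mathfrak{g}^{\oplus m}) \cong F_{n,p}\left((U(\mathfrak{g}))^{\otimes m}\right) / F_{n+p-1}\left((U(\mathfrak{g}))^{\otimes m}\right) \tag{3.17}$$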
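for all $n, p \in \mathbb{N}_0$. We shall therefore identify each $S_{n,p}(\mathfrak{g}^{\oplus m})$ with the left $\mathfrak{h}$-module of symmetric tensors on $(U(\mathfrak{g}))^{\otimes m}$ containing exactly $n$ factors in $\mathfrak{h}$ and $p$ in $\mathfrak{p}$.
3.3. Symmetric invariants and the restriction property. For all $n, p \in \mathbb{N}_0$, let $S_n(\mathfrak{g} \oplus \mathfrak{g})^{\mathfrak{g}}$ be the set of $\mathfrak{g}$-invariant elements of the left $\mathfrak{g}$-module $S_n(\mathfrak{g} \oplus \mathfrak{g})$ and let $S_{n,p}(\mathfrak{g} \oplus \mathfrak{g})^{\mathfrak{h}}$ denote the set of $\mathfrak{h}$-invariant elements of the left $\mathfrak{h}$-module $S_{n,p}(\mathfrak{g} \oplus \mathfrak{g})$. We have the following two lemmas.
Lemma 3.1. Let $n$ and $p$ be positive integers. Every $x \in S_{n-p,p}(\mathfrak{g} \oplus \mathfrak{g})^{\mathfrak{h}}$ such that $\mathfrak{p} \triangleright x \in S_{n-p+1,p-1}(\mathfrak{g} \oplus \mathfrak{g})$ is in the linear span of $S_{n-p,0}(\mathfrak{g} \oplus \mathfrak{g})^{\mathfrak{g}}\, S_{0,p}(\mathfrak{g} \oplus \mathfrak{g})^{\mathfrak{h}}$.
Proof. Let $(h_i)_{i \in I}$ and $(p_j)_{j \in J}$ be ordered bases of $\mathfrak{h} \oplus \mathfrak{h}$ and $\mathfrak{p} \oplus \mathfrak{p}$ respectively. Every element $x \in S_{n-p,p}(\mathfrak{g} \oplus \mathfrak{g})$ can be written as
$$x = \sum_{\substack{i_1 \le \cdots \le i_{n-p} \\ j_1 \le \cdots \le j_p}} x_{i_1 \dots i_{n-p} j_1 \dots j_p}\, h_{i_1} \cdots h_{i_{n-p}}\, p_{j_1} \cdots p_{j_p},$$
where, for all $i_1, \dots, i_{n-p} \in I$ and $j_1, \dots, j_p \in J$, $x_{i_1 \dots i_{n-p} j_1 \dots j_p} \in \mathbb{K}$. Then, omitting the ordered sums, we have
$$\mathfrak{p} \triangleright x = x_{i_1 \dots i_{n-p} j_1 \dots j_p} \left[ \left( \mathfrak{p} \triangleright h_{i_1} \cdots h_{i_{n-p}} \right) p_{j_1} \cdots p_{j_p} + h_{i_1} \cdots h_{i_{n-p}} \left( \mathfrak{p} \triangleright p_{j_1} \cdots p_{j_p} \right) \right].$$
Since $(\mathfrak{p} \triangleright x) \cap S_{n-p-1,p+1}(\mathfrak{g} \oplus \mathfrak{g}) = \{0\}$, we have
$$\mathfrak{p} \triangleright \left( x_{i_1 \dots i_{n-p} j_1 \dots j_p}\, h_{i_1} \cdots h_{i_{n-p}} \right) = 0,$$
for all $j_1 \le \cdots \le j_p \in J$; it follows that this quantity is also invariant under $[\mathfrak{p}, \mathfrak{p}]$ and hence, by Lemma 2.3, under $\mathfrak{h}$. Thus it is actually $\mathfrak{g}$-invariant. Introduce a basis $(y_k)_{k \in K}$ of the $\mathbb{K}$-module $S_{n-p,0}(\mathfrak{g} \oplus \mathfrak{g})^{\mathfrak{g}}$, so that we can write
$$x_{i_1 \dots i_{n-p} j_1 \dots j_p}\, h_{i_1} \cdots h_{i_{n-p}} = \sum_{k \in K} b_{k j_1 \dots j_p}\, y_k,$$
with $b_{k j_1 \dots j_p} \in \mathbb{K}$, for all $j_1 \le \cdots \le j_p \in J$. Now, as $x$ is $\mathfrak{h}$-invariant, we also have
$$\mathfrak{h} \triangleright x = b_{k j_1 \dots j_p}\, y_k \left( \mathfrak{h} \triangleright p_{j_1} \cdots p_{j_p} \right) = 0.$$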
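This yields $\mathfrak{h} \triangleright \left( b_{k j_1 \dots j_p}\, p_{j_1} \cdots p_{j_p} \right) = 0$, for all $k \in K$. Introduce a basis $(z_l)_{l \in L}$ for the $\mathbb{K}$-module $S_{0,p}(\mathfrak{g} \oplus \mathfrak{g})^{\mathfrak{h}}$, so that we can write, for all $k \in K$,
$$b_{k j_1 \dots j_p}\, p_{j_1} \cdots p_{j_p} = \sum_{l \in L} a_{kl}\, z_l,$$
with $a_{kl} \in \mathbb{K}$ for all $k \in K$ and $l \in L$. Now, $x$ can be rewritten as
$$x = \sum_{k \in K} \sum_{l \in L} a_{kl}\, y_k z_l,$$
with $y_k \in S_{n-p,0}(\mathfrak{g} \oplus \mathfrak{g})^{\mathfrak{g}}$ for all $k \in K$ and $z_l \in S_{0,p}(\mathfrak{g} \oplus \mathfrak{g})^{\mathfrak{h}}$ for all $l \in L$.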
Let us now restrict our attention to the class of symmetric Lie algebras encompassed by the following Definition 3.2. We say that a symmetric semisimple Lie algebra (g, θ ) with associated symmetric decomposition g = h ⊕ p is of restrictive type (or has the restriction property) if and only if for all p ∈ N0 , the projection from g to p maps S p (g ⊕ g)g onto S0, p (g ⊕ g)h. This restriction property will be sufficient to allow us to prove a refined version of Whitehead’s lemma in the next section. Note that it is similar to the so-called surjection property – namely that the restriction from g to p maps S(g)g onto S(p)h – which is known to hold for all classical symmetric Lie algebras [28] and which has proven useful in a number of contexts [29,30]. In our case we have, at least, Lemma 3.3. If a symmetric semisimple Lie algebra splits (as in Lemma 2.2), in such a way that its simple factors are drawn only from the following classical families of simple symmetric Lie algebras: AIn>2 : (su(n), so(n))n>2 , AIIn : (su(2n), sp(2n))n∈N∗ , BDIn>2,1 : (so(n + 1), so(n))n>2 , then it is of restrictive type. Proof. See Appendix.
3.4. Contractible homomorphisms of K[[h]]-modules. Let K[[h]] denote the K-algebra of formal power series in h with coefficients in the field K and let U(g)[[h]] be the U(g)-algebra of formal power series in h with coefficients in U(g). We have a natural K-algebra monomorphism $i : U(g) \to U(g)[[h]]$. There is also an epimorphism of K-algebras $j : U(g)[[h]] \twoheadrightarrow U(g)$ such that $j \circ i = \mathrm{id}$ on U(g). We shall therefore identify U(g) with its image $i(U(g)) \subset U(g)[[h]]$. We shall also consider complete K[[h]]-modules and it is assumed that the tensor products considered from now on are completed in the h-adic topology. In this subsection, we further assume that g = h ⊕ p is a symmetric decomposition.
Definition 3.4. Let p ∈ Z, m ∈ N0 be integers. An element x of $(U(g))^{\otimes m}[[h]]$ is (p, p)-contractible if and only if there exists a collection $(x_n)_{n \in N_0}$ of elements of $(U(g))^{\otimes m}$ such that
\[ x = \sum_{n \ge 0} h^n x_n \tag{3.18} \]
and, for all $n \in N_0$, there exists $l(n) \in N_0$ such that $x_n \in F_{l(n),n+p}\left((U(g))^{\otimes m}\right)$. Similarly, a subset $X \subset (U(g))^{\otimes m}[[h]]$ is (p, p)-contractible if all its elements are, according to the previous definition.
Note that, for the sake of simplicity, we shall refer to (0, p)-contractible elements or sets as p-contractible. Let us now define the notion of contractibility for K[[h]]-module homomorphisms in $\mathrm{Hom}\left((U(g))^{\otimes m}[[h]], (U(g))^{\otimes n}[[h]]\right)$.
Definition 3.5. Let r, s ∈ N0 and p ∈ Z be integers. A homomorphism of K[[h]]-modules $φ : (U(g))^{\otimes r}[[h]] \to (U(g))^{\otimes s}[[h]]$ is p-contractible if and only if, for all n, m ∈ N0, $φ\left(F_{n,m}((U(g))^{\otimes r})\right)$ is (m, p)-contractible as a subset.
Let us emphasize that for every p-contractible K[[h]]-module homomorphism $φ : (U(g))^{\otimes r}[[h]] \to (U(g))^{\otimes s}[[h]]$, there exists a collection $(ϕ_n)_{n \in N_0}$ of K[[h]]-module homomorphisms $ϕ_n : (U(g))^{\otimes r}[[h]] \to (U(g))^{\otimes s}[[h]]$ such that
\[ φ = \sum_{n \ge 0} h^n ϕ_n \tag{3.19} \]
and, for all n, m, p ∈ N0, there exists $l(n) \in N_0$ such that $ϕ_n\left(F_{m,p}((U(g))^{\otimes r})\right) \subseteq F_{l(n),n+p}\left((U(g))^{\otimes s}\right)$. The following two lemmas will be useful in the next sections.
Lemma 3.6. Let φ and ψ be two p-contractible homomorphisms of K[[h]]-modules. Then the K[[h]]-module homomorphism φ ◦ ψ is p-contractible.
Proof. We have
\[ φ = \sum_{n \ge 0} h^n ϕ_n \quad \text{and} \quad ψ = \sum_{n \ge 0} h^n ψ_n, \]
with, for all n, m, p ∈ N0, $ϕ_n(F_{m,p}) \subseteq F_{*,n+p}$ and $ψ_n(F_{m,p}) \subseteq F_{*,n+p}$. For the sake of simplicity we shall omit the arguments of the bifiltration and denote by ∗ the integer l(n) whose existence is guaranteed by the definition of contractibility. We thus have
\[ φ \circ ψ = \sum_{n \ge 0} \sum_{m \ge 0} h^{n+m} \, ϕ_n \circ ψ_m = \sum_{n \ge 0} h^n \sum_{m=0}^{n} ϕ_m \circ ψ_{n-m}, \]
with, for all l, m, n, p ∈ N0, $ϕ_m \circ ψ_{n-m}(F_{l,p}) \subseteq ϕ_m(F_{*,n-m+p}) \subseteq F_{*,n+p}$.
The following holds for the inverse.
Lemma 3.7. Let φ be a p-contractible homomorphism of K[[h]]-modules, congruent with id mod h. Then the K[[h]]-module homomorphism $φ^{-1} = \mathrm{id} \bmod h$ is p-contractible.
Proof. We shall construct
\[ φ^{-1} = \sum_{n \ge 0} h^n ϕ_n \]
by recursion on the order in h, by demanding that $φ \circ φ^{-1} = \mathrm{id}$. At leading order, we have $ϕ_0 = \mathrm{id}$ and therefore $ϕ_0(F_{m,p}) \subseteq F_{m,p}$, for all m, p ∈ N0. Let us assume that we have a polynomial $φ_n^{-1}$ of degree n > 0 such that
\[ φ \circ φ_n^{-1} - \mathrm{id} = h^{n+1} q \mod h^{n+2}. \]
Assuming that $φ_n^{-1}$ is p-contractible, we have by Lemma 3.6 that $φ \circ φ_n^{-1}$ is p-contractible, as φ is p-contractible by assumption. Therefore, $q(F_{m,p}) \subseteq F_{*,n+1+p}$. Now, to complete the recursion, we have to find $ϕ_{n+1}$ such that
\[ φ \circ \left( φ_n^{-1} + h^{n+1} ϕ_{n+1} \right) - \mathrm{id} = 0 \mod h^{n+2}. \]
Since φ is congruent with id mod h, this is achieved by taking $ϕ_{n+1} = -q$. We thus have $ϕ_{n+1}(F_{m,p}) \subseteq F_{*,n+1+p}$.
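The order-by-order recursion in the proof of Lemma 3.7 can be illustrated in a toy setting. The sketch below is only an illustration under a simplifying assumption that is not in the paper: the coefficient maps are taken to be 2x2 integer matrices, so that composition is matrix multiplication. The function names are hypothetical. It builds the inverse of a series congruent with the identity mod h by setting each new coefficient to minus the obstruction q.

```python
# Toy model (an assumption for illustration): each coefficient map is a
# 2x2 integer matrix, so composition is matrix multiplication.

def mat_mul(a, b):
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_neg(a):
    return [[-x for x in row] for row in a]

def identity(size):
    return [[int(i == j) for j in range(size)] for i in range(size)]

def zero(size):
    return [[0] * size for _ in range(size)]

def compose_coefficient(phi, psi, n):
    """Coefficient of h^n in phi∘psi: sum over m of phi[m]∘psi[n-m]."""
    acc = zero(len(phi[0]))
    for m in range(n + 1):
        if m < len(phi) and n - m < len(psi):
            acc = mat_add(acc, mat_mul(phi[m], psi[n - m]))
    return acc

def inverse_series(phi, order):
    """Coefficients of phi^{-1} up to h^order, assuming phi[0] is the identity."""
    inv = [identity(len(phi[0]))]
    for n in range(order):
        # q is the h^(n+1) coefficient of phi∘inv, with inv the current polynomial
        q = compose_coefficient(phi, inv, n + 1)
        inv.append(mat_neg(q))  # the next coefficient is -q, as in the proof
    return inv

phi = [identity(2), [[0, 1], [2, 3]], [[1, 1], [0, -1]]]
inv = inverse_series(phi, 5)
product = [compose_coefficient(phi, inv, n) for n in range(6)]
```

In this model `product[0]` is the identity and all higher coefficients of the composite vanish up to the chosen order, which is the defining property of the recursively constructed inverse.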
Finally, when φ is not only a K[[h]]-module homomorphism but also a K[[h]]-algebra homomorphism, we have the following useful lemma.
Lemma 3.8. Let $φ : (U(g))^{\otimes s}[[h]] \to (U(g))^{\otimes t}[[h]]$ be a homomorphism of K[[h]]-algebras. It is p-contractible if and only if $φ\left(F_{1,0}((U(g))^{\otimes s})\right)$ is (0, p)-contractible and $φ\left(F_{0,1}((U(g))^{\otimes s})\right)$ is (1, p)-contractible.
Proof. If φ is p-contractible, it follows from the definition that, in particular, $φ(F_{1,0})$ is (0, p)-contractible and $φ(F_{0,1})$ is (1, p)-contractible. Now, assuming that $φ(F_{1,0})$ is (0, p)-contractible and $φ(F_{0,1})$ is (1, p)-contractible, we want to prove that, for all m, p ∈ N0, $φ(F_{m,p})$ is (p, p)-contractible. We proceed by recursion on m and p. We have assumed the result for m = 1 and p = 0, as well as for m = 0 and p = 1. Suppose that, for some m, p ∈ N0, we have proven that, for all $m' < m$, $p' < p$ and n ∈ N0, there exists l ∈ N0 such that $ϕ_n(F_{m',p'}) \subseteq F_{l,n+p'}$. Then, for all n ∈ N0,
\[
\begin{aligned}
ϕ_n\left(F_{m,p+1}((U(g))^{\otimes s})\right)
&= ϕ_n\left( \sum_{k=0}^{m} \sum_{l=0}^{p} \mathrm{span}\, F_{k,l} \cdot F_{0,1} \cdot F_{m-k,p-l} \right) \\
&= \sum_{k=0}^{m} \sum_{l=0}^{p} \mathrm{span}_{σ \in C_3(n)} \; ϕ_{σ_1}(F_{k,l}) \cdot ϕ_{σ_2}(F_{0,1}) \cdot ϕ_{σ_3}(F_{m-k,p-l}) \\
&\subseteq \sum_{k=0}^{m} \sum_{l=0}^{p} \mathrm{span}_{σ \in C_3(n)} \; F_{*,σ_1+l} \cdot F_{*,σ_2+1} \cdot F_{*,σ_3+p-l} \\
&= F_{*,n+p+1},
\end{aligned}
\]
where, for all $X \subseteq (U(g))^{\otimes s}$, span X denotes the K-module linearly generated by X and $C_3(n)$ is the set $\{σ = (σ_1, σ_2, σ_3) \in N_0^3 : \sum_{i=1}^{3} σ_i = n\}$ of weak 3-compositions of n.
Similarly, we have
\[
\begin{aligned}
ϕ_n\left(F_{m+1,p}((U(g))^{\otimes s})\right)
&= ϕ_n\left( \sum_{k=0}^{m} \sum_{l=0}^{p} \mathrm{span}\, F_{k,l} \cdot F_{1,0} \cdot F_{m-k,p-l} \right) \\
&= \sum_{k=0}^{m} \sum_{l=0}^{p} \mathrm{span}_{σ \in C_3(n)} \; ϕ_{σ_1}(F_{k,l}) \cdot ϕ_{σ_2}(F_{1,0}) \cdot ϕ_{σ_3}(F_{m-k,p-l}) \\
&\subseteq \sum_{k=0}^{m} \sum_{l=0}^{p} \mathrm{span}_{σ \in C_3(n)} \; F_{*,σ_1+l} \cdot F_{*,σ_2} \cdot F_{*,σ_3+p-l} = F_{*,n+p},
\end{aligned}
\]
for all n ∈ N0.
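The index bookkeeping in the proof of Lemma 3.8 rests on the weak 3-compositions of n. As a minimal sketch (the function names are hypothetical, and the second filtration index is assumed to be additive under products, as the proof uses), one can enumerate $C_3(n)$ and check that the three second indices always add up to n + p + 1.

```python
from math import comb

def weak_compositions(total, parts=3):
    """All tuples of `parts` non-negative integers summing to `total`."""
    if parts == 1:
        return [(total,)]
    return [(first,) + rest
            for first in range(total + 1)
            for rest in weak_compositions(total - first, parts - 1)]

def second_index_sum(sigma, p, l):
    """Sum of the second indices in F_{*,σ1+l} · F_{*,σ2+1} · F_{*,σ3+p-l}."""
    s1, s2, s3 = sigma
    return (s1 + l) + (s2 + 1) + (s3 + p - l)

c3_of_2 = weak_compositions(2)
```

For example, `c3_of_2` lists the six weak 3-compositions of 2, and in general the number of weak 3-compositions of n is the binomial coefficient (n + 2 choose 2).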
3.5. Contractible deformation Hopf algebras. We recall that U(g) possesses a natural cocommutative Hopf algebra structure, whose coproduct is the algebra homomorphism $Δ_0 : U(g) \to U(g) \otimes U(g)$ defined by $Δ_0(x) = x \otimes 1 + 1 \otimes x$ for all x ∈ g, and whose counit and antipode are specified by $ε_0(1) = 1$ and $S_0(1) = 1$. We refer to this as the undeformed Hopf algebra structure. Given the notion of contractibility introduced in the preceding subsections, it is natural to specialize the usual notion of a quantization, i.e. a deformation, of a universal enveloping algebra, as follows.
Definition 3.9. Let (g, θ) be a symmetric Lie algebra, with symmetric decomposition g = h ⊕ p. A p-contractible deformation $(U_h(g), \cdot_h, Δ_h, ε_h, S_h)$ of the Hopf algebra $(U(g), \cdot, Δ_0, ε_0, S_0)$ is a topological Hopf algebra such that
• there exists a K[[h]]-module isomorphism $η : U_h(g) \xrightarrow{\sim} U(g)[[h]]$;
• $μ_h := η \circ (\cdot_h) \circ (η^{-1} \otimes η^{-1}) = \cdot \bmod h$ and $μ_h$ is p-contractible;
• $\tilde{Δ}_h := (η \otimes η) \circ Δ_h \circ η^{-1} = Δ_0 \bmod h$ and $\tilde{Δ}_h$ is p-contractible;
• $\tilde{S}_h := η \circ S_h \circ η^{-1} = S_0 \bmod h$ and $\tilde{S}_h$ is p-contractible;
• $\tilde{ε}_h := ε_h \circ η^{-1} = ε_0 \bmod h$ and $\tilde{ε}_h$ is p-contractible.
This definition can be naturally restricted to bialgebras and algebras.
4. On the Cohomology of Associative and Lie Algebras
4.1. The Hochschild cohomology. Let A be a K-algebra. For any (A, A)-bimodule $(M, \triangleright, \triangleleft)$ and all $n \in N_0^*$, we define the (A, A)-bimodule of n-cochains $C^n(A, M) = \mathrm{Hom}(A^{\otimes n}, M)$. We also set $C^0(A, M) = M$. To each cochain module $C^n(A, M)$, we associate a coboundary operator, i.e. a derivation operator $δ_n : C^n(A, M) \to C^{n+1}(A, M)$, by setting, for all $f \in C^n(A, M)$,
\[ δ_n f(x_1, \dots, x_{n+1}) = x_1 \triangleright f(x_2, \dots, x_{n+1}) + \sum_{i=1}^{n} (-1)^i f(x_1, \dots, x_i x_{i+1}, \dots, x_{n+1}) + (-1)^{n+1} f(x_1, \dots, x_n) \triangleleft x_{n+1} \tag{4.1} \]
for all $x_1, \dots, x_{n+1} \in A$. One can check that $δ_{n+1} \circ δ_n = 0$ for all n. Therefore, the $(C^n, δ_n)$ thus defined constitute a cochain complex. It is known as the Hochschild or standard complex [31,32] (see also [33] or [34]). An element of the (A, A)-bimodule $Z^n(A, M) = \ker δ_n \subset C^n(A, M)$ is called an n-cocycle, while an element of the (A, A)-bimodule $B^n(A, M) = \mathrm{im}\, δ_{n-1} \subset C^n(A, M)$ is called an n-coboundary. As usual, the quotient
\[ HH^n(A, M) = Z^n(A, M)/B^n(A, M) \tag{4.2} \]
defines the n-th cohomology module of A with coefficients in M. In the next section, we shall be particularly interested in the Hochschild cohomology of the universal enveloping algebra of a given Lie algebra g, i.e. A = U(g), with coefficients in M = U(g). The latter trivially constitutes a (U(g), U(g))-bimodule with the multiplication · of U(g) as left and right U(g)-action. Concerning the Hochschild cohomology we will need the following result (see for example Theorem 6.1.8 in [2]).
Lemma 4.1. Let g be a semisimple Lie algebra over K. Then, $HH^2(U(g), U(g)) = 0$.
4.2. The Chevalley-Eilenberg cohomology. Let g be a Lie algebra over K and $(M, \triangleright)$ a left g-module. For all $n \in N_0^*$, we define the left g-module of n-cochains $C^n(g, M) = \mathrm{Hom}(\wedge^n g, M)$, with left g-action
\[ (x \triangleright f)(x_1, \dots, x_n) = x \triangleright \left( f(x_1, \dots, x_n) \right) - \sum_{i=1}^{n} f(x_1, \dots, [x, x_i], \dots, x_n), \tag{4.3} \]
for all $f \in C^n(g, M)$ and all $x, x_1, \dots, x_n \in g$. We also set $C^0(g, M) = M$ with its natural left g-module structure. To each cochain module $C^n(g, M)$, we associate a coboundary operator, i.e. a derivation operator $d_n : C^n(g, M) \to C^{n+1}(g, M)$, by setting, for all $f \in C^n(g, M)$,
\[ d_n f(x_1, \dots, x_{n+1}) = \sum_{i=1}^{n+1} (-1)^{i+1} x_i \triangleright f(x_1, \dots, \hat{x}_i, \dots, x_{n+1}) + \sum_{1 \le i < j \le n+1} (-1)^{i+j} f\left([x_i, x_j], x_1, \dots, \hat{x}_i, \dots, \hat{x}_j, \dots, x_{n+1}\right) \tag{4.4} \]
for all $x_1, \dots, x_{n+1} \in g$. In (4.4), hatted quantities are omitted and $\triangleright$ denotes the left g-action on M. One can check that $d_{n+1} \circ d_n = 0$ for all n. Therefore, the $(C^n, d_n)$ thus defined constitute a cochain complex. It is known as the Chevalley-Eilenberg complex [35] (see also [33] or [34]). An element of $Z^n(g, M) = \ker d_n \subset C^n(g, M)$ is called an n-cocycle, while an element of $B^n(g, M) = \mathrm{im}\, d_{n-1} \subset C^n(g, M)$ is called an n-coboundary. As usual, the quotient
\[ H^n(g, M) = Z^n(g, M)/B^n(g, M) \tag{4.5} \]
defines the n-th cohomology module of g with coefficients in M. One can check that, for all n ∈ N0, $Z^n(g, M)$, $B^n(g, M)$ and $H^n(g, M)$ naturally inherit the left g-module structure of $C^n(g, M)$, as for all n ∈ N0,
\[ d(x \triangleright f) = x \triangleright d f, \tag{4.6} \]
for all $f \in C^n(g, M)$ and all x ∈ g. An important result about the Chevalley-Eilenberg cohomology of Lie algebras concerns finite dimensional complex semisimple Lie algebras. It is known as Whitehead’s Lemma.
Lemma 4.2. Let g be a semisimple Lie algebra over K. If M is any finite-dimensional left g-module, then $H^1(g, M) = H^2(g, M) = 0$.
A proof of this result can be found, for instance, in Sect. 7.8 of [34].
4.3. Contractible Chevalley-Eilenberg cohomology. In the next section, we will be mostly interested in the module M = U(g) ⊗ U(g), with the left g-action induced by (3.5) and (3.6), i.e.
\[ g \triangleright x = [Δ_0(g), x], \tag{4.7} \]
for all g ∈ g and all x ∈ U(g) ⊗ U(g). In particular, we shall need a refinement of Whitehead’s Lemma, in the case of symmetric semisimple Lie algebras of restrictive type, taking into account the possible p-contractibility of the generating cocycles of $Z^*(g, U(g) \otimes U(g))$. For all m, n ∈ N0, we therefore define $C^n_{m,p}(g, U(g) \otimes U(g))$ as the set of (m, p)-contractible n-cochains, by which we mean the set of n-cochains $f \in C^n(g, U(g) \otimes U(g))$ such that, for all 0 ≤ p ≤ n, $f\left((\wedge^{n-p} h) \wedge (\wedge^p p)\right) \subseteq F_{l,m+p}(U(g) \otimes U(g))$, for some l ∈ N0. Defining similarly $Z^n_{m,p}(g, U(g) \otimes U(g)) = \ker d_n \cap C^n_{m,p}(g, U(g) \otimes U(g))$ and $B^n_{m,p}(g, U(g) \otimes U(g)) = d_{n-1} C^{n-1}_{m,p}(g, U(g) \otimes U(g))$ as the modules of the (m, p)-contractible n-cocycles and of the n-coboundaries of (m, p)-contractible (n − 1)-cochains, respectively, we can define the n-th (m, p)-contractible cohomology module as
\[ H^n_{m,p}(g, U(g) \otimes U(g)) = Z^n_{m,p}(g, U(g) \otimes U(g)) / B^n_{m,p}(g, U(g) \otimes U(g)). \tag{4.8} \]
It is worth emphasizing that these cohomology modules generally differ from the usual ones $H^n(g, U(g) \otimes U(g))$. Consider for instance a case for which $H^1(g, U(g) \otimes U(g)) = 0$. We have that every 1-cocycle in $Z^1(g, U(g) \otimes U(g))$, and therefore every cocycle $f \in Z^1_{m,p}(g, U(g) \otimes U(g))$, is the coboundary of an element x ∈ U(g) ⊗ U(g). However, although the considered f is (m, p)-contractible, it may be that it can only be obtained as the coboundary of an element x ∈ U(g) ⊗ U(g) that does not belong to any $F_{*,m}(U(g) \otimes U(g))$, thus yielding a non-trivial cohomology class in $H^1_{m,p}(g, U(g) \otimes U(g))$. When g is a symmetric semisimple Lie algebra of restrictive type, we nonetheless establish the following lemma concerning the first (m, p)-contractible cohomology module $H^1_{m,p}(g, U(g) \otimes U(g))$.
Lemma 4.3. Let (g, θ) be a symmetric semisimple Lie algebra of restrictive type over K and let g = h ⊕ p be the associated symmetric decomposition of g. We have $H^1_{m,p}(g, U(g) \otimes U(g)) = 0$, for all m ∈ N0.
Proof. Let m ∈ N0 be a non-negative integer. We have to prove that every (m, p)-contractible 1-cocycle $f \in Z^1_{m,p}(g, U(g) \otimes U(g))$ is the coboundary of an element $α \in F_{l,m}(U(g) \otimes U(g))$, for some l ∈ N0. From Lemma 4.2, there exists an x ∈ U(g) ⊗ U(g) such that $f = d_0 x$. All we have to prove is that we can always find a left g-invariant $y \in (U(g) \otimes U(g))^g$, such that x = y modulo $F_{l,m}(U(g) \otimes U(g))$ for some l ∈ N0. Then, we can check that for $α = x - y \in F_{l,m}(U(g) \otimes U(g))$, we have $d_0 α = d_0(x - y) = d_0 x = f$.
In view of (3.17), we can first expand x into its components in the left g-modules isomorphic to the $S_n(g \oplus g)$, for all n ∈ N0. Up to the isomorphism of left g-modules, which we shall omit here, we have $x = \sum_{n \ge 0} x_n$ where, for all n ∈ N0, $x_n \in S_n(g \oplus g)$. Similarly, we can further decompose each $S_n(g \oplus g)$ into the left h-modules $S_{n-p,p}(g \oplus g)$, with 0 ≤ p ≤ n, and, accordingly, each $x_n$. We are now going to construct the desired $y \in (U(g) \otimes U(g))^g$ by recursion, submodule by submodule. If $x_n = 0$ for all n > m, we can set y = 0 and we are done. So, suppose that there exists an n > m such that $x_n \ne 0$ and let $x_{0,n}$ be the component of $x_n$ in $S_{0,n}(g \oplus g)$. If $x_{0,n}$ vanishes, we can skip to the component of $x_n$ in $S_{1,n-1}(g \oplus g)$. Otherwise, we are going to prove that there exists a g-invariant $y_{n,0} \in S_n(g \oplus g)^g$, such that the component of $x_n - y_{n,0}$ in $S_{0,n}(g \oplus g)$ vanishes. From f being (m, p)-contractible, we know that
\[ f(h) = d_0 x(h) = h \triangleright \left( x_n + \sum_{n' \ne n} x_{n'} \right) \subseteq F_{l,m}(U(g) \otimes U(g)), \tag{4.9} \]
for some l ∈ N0. Therefore, since the $S_{m,p}(g \oplus g)$ are left h-modules, we have $h \triangleright x_{0,n} = 0$. Since g has the restriction property, Definition 3.2, it follows that the h-invariant tensor $x_{0,n} \in S_{0,n}(g \oplus g)^h$ is the restriction to p of a g-invariant tensor $y_{n,0} \in S_n(g \oplus g)^g$. Now consider $x_n - y_{n,0}$. By construction, it has no component in $S_{0,n}(g \oplus g)$. If n − 1 ≤ m, we set $y_n = y_{n,0}$ and skip to another g-module $S_{n'>m}(g \oplus g)$, where x has a non-vanishing component, if any. Otherwise, let 0 ≤ k < n − m and assume that we have found $y_{n,k} \in S_n(g \oplus g)^g$, such that $x_n - y_{n,k}$ has vanishing component in all the $S_{n-p,p}(g \oplus g)$ with p ≥ n − k > m. We are going to prove that there exists $y_{n,k+1} \in S_n(g \oplus g)^g$ such that $x_n - y_{n,k+1}$ has vanishing component in all the $S_{n-p,p}(g \oplus g)$ with p ≥ n − k − 1. To do so, let $x_{k+1,n-k-1}$ be the component of $x_n - y_{n,k}$ in $S_{k+1,n-k-1}(g \oplus g)$. If it is zero, we set $y_{n,k+1} = y_{n,k}$. Otherwise, note that from (4.9), we have $h \triangleright x_{k+1,n-k-1} = 0$. But the (m, p)-contractibility of f also implies that
\[ f(p) = d_0 x(p) = p \triangleright \left( x_n - y_{n,k} + \sum_{n' \ne n} x_{n'} \right) \subseteq F_{l,m+1}(U(g) \otimes U(g)), \]
from which it follows that $p \triangleright x_{k+1,n-k-1} \in S_{k+2,n-k-2}(g \oplus g)$. According to Lemma 3.1, we can write $x_{k+1,n-k-1} = \sum_{i,j} a_{ij} w_i z_j$, with $a_{ij} \in K$, $w_i \in S_{k+1,0}(g \oplus g)^g$ and $z_j \in S_{0,n-k-1}(g \oplus g)^h$. Since g has the restriction property, all the $z_j$ are the restrictions to p of g-invariant elements $ζ_j \in S_{n-k-1}(g \oplus g)^g$. Now, set $y_{n,k+1} = y_{n,k} + \sum_{i,j} a_{ij} w_i ζ_j$. It is obvious that $y_{n,k+1} \in S_n(g \oplus g)^g$ and, by construction, $x_n - y_{n,k+1}$ has no component in any of the $S_{n-p,p}(g \oplus g)$, with p ≥ n − k − 1. The recursion goes on until we have $y_{n,n-m} \in S_n(g \oplus g)^g$ such that $x_n - y_{n,n-m}$ has vanishing components in all the $S_{n-p,p}(g \oplus g)$, with p > m. We therefore set $y_n = y_{n,n-m}$. By repeating this a finite number of times³, in all the $S_{n'>m}(g \oplus g)$ in which x has non-vanishing components, we obtain the desired $y = \sum_{n \ge 0} y_n$.
3 It is rather obvious that x has non-vanishing components in a finite number of submodules $S_n(g \oplus g)$, as there always exists an l ∈ N such that $x \in F_l(U(g) \otimes U(g))$.
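The Chevalley-Eilenberg coboundary (4.4) used throughout this section can be sanity-checked numerically. The sketch below makes a simplifying assumption that differs from the text: the module M is taken to be the adjoint module of so(3), with structure constants given by the Levi-Civita symbol, rather than U(g) ⊗ U(g). All function names are hypothetical. It builds random alternating cochains and checks that applying the coboundary twice gives zero.

```python
import itertools
import random

DIM = 3  # so(3) with basis e_0, e_1, e_2 and [e_i, e_j] = eps(i, j, k) e_k

def eps(i, j, k):
    """Levi-Civita symbol on the indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def bracket(i, j):
    return [eps(i, j, k) for k in range(DIM)]

def act(i, v):
    """Adjoint action of e_i on a coefficient vector v."""
    return [sum(v[j] * eps(i, j, k) for j in range(DIM)) for k in range(DIM)]

def perm_sign(xs):
    sign = 1
    for a in range(len(xs)):
        for b in range(a + 1, len(xs)):
            if xs[a] > xs[b]:
                sign = -sign
    return sign

def random_cochain(n, rng):
    """Random alternating n-cochain, stored on all n-tuples of basis indices."""
    base = {c: [rng.randint(-3, 3) for _ in range(DIM)]
            for c in itertools.combinations(range(DIM), n)}
    cochain = {}
    for xs in itertools.product(range(DIM), repeat=n):
        if len(set(xs)) < n:
            cochain[xs] = [0] * DIM
        else:
            sign = perm_sign(xs)
            cochain[xs] = [sign * c for c in base[tuple(sorted(xs))]]
    return cochain

def coboundary(f, n):
    """The operator d_n of (4.4), with 0-based indices (signs are unchanged)."""
    out = {}
    for xs in itertools.product(range(DIM), repeat=n + 1):
        value = [0] * DIM
        for i in range(n + 1):
            rest = xs[:i] + xs[i + 1:]
            term = act(xs[i], f[rest])
            value = [a + (-1) ** i * b for a, b in zip(value, term)]
        for i in range(n + 1):
            for j in range(i + 1, n + 1):
                rest = xs[:i] + xs[i + 1:j] + xs[j + 1:]
                for k, coeff in enumerate(bracket(xs[i], xs[j])):
                    if coeff:
                        term = f[(k,) + rest]
                        value = [a + (-1) ** (i + j) * coeff * b
                                 for a, b in zip(value, term)]
        out[xs] = value
    return out

rng = random.Random(0)
checks = []
for n in range(2):
    f = random_cochain(n, rng)
    ddf = coboundary(coboundary(f, n), n + 1)
    checks.append(all(v == [0] * DIM for v in ddf.values()))
```

Each entry of `checks` records whether the double coboundary vanished identically for a random n-cochain, for n = 0 and n = 1.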
5. Rigidity Theorems
5.1. Contractible algebra isomorphisms.
Proposition 5.1. Let g be a semisimple Lie algebra over K and let h be a symmetrizing Lie subalgebra with orthogonal complement p in g. Then, for every p-contractible deformation algebra $(U_h(g), \cdot_h)$ of $(U(g), \cdot)$, there exists a p-contractible isomorphism of K[[h]]-algebras $(U_h(g), \cdot_h) \xrightarrow{\sim} (U(g)[[h]], \cdot)$, that is congruent with id mod h.
Proof. By definition, there exists a K[[h]]-module isomorphism $η : U_h(g) \xrightarrow{\sim} U(g)[[h]]$. The latter defines a K[[h]]-algebra isomorphism between $(U_h(g), \cdot_h)$ and $(U(g)[[h]], μ_h)$, where $μ_h := η \circ (\cdot_h) \circ (η^{-1} \otimes η^{-1}) = \cdot \bmod h$. If we found a p-contractible K[[h]]-algebra automorphism
\[ φ : (U(g)[[h]], μ_h) \xrightarrow{\sim} (U(g)[[h]], \cdot), \tag{5.1} \]
we would prove the proposition as φ ◦ η would constitute the desired K[[h]]-algebra isomorphism from $(U_h(g), \cdot_h)$ to $(U(g)[[h]], \cdot)$. Let φ be a K[[h]]-module automorphism on U(g)[[h]]. The condition for such an automorphism to be the K[[h]]-algebra automorphism (5.1) is
\[ μ_h = φ^{-1} \circ (\cdot) \circ (φ \otimes φ). \tag{5.2} \]
Let us construct
\[ φ = \sum_{n \ge 0} h^n ϕ_n, \tag{5.3} \]
order by order in h. At leading order, we have $μ_0 = \cdot$ and we can take $ϕ_0 = \mathrm{id} \in \mathrm{Hom}(U(g)[[h]], U(g)[[h]])$. We thus have $ϕ_0(F_{m,p}(U(g))) \subseteq F_{m,p}(U(g))$, for all m, p ∈ N0. Suppose now that we have found a polynomial of degree n > 0,
\[ φ_n = \sum_{m=0}^{n} h^m ϕ_m, \tag{5.4} \]
such that
\[ μ_h - φ_n^{-1} \circ (\cdot) \circ (φ_n \otimes φ_n) = h^{n+1} r \mod h^{n+2}, \tag{5.5} \]
where $φ_n^{-1}$ denotes the exact inverse series of $φ_n$ defined by $φ_n \circ φ_n^{-1} = \mathrm{id}$ and $r \in \mathrm{Hom}(U(g) \otimes U(g)[[h]], U(g)[[h]])$. We assume that $φ_n$ is p-contractible. Therefore, $(\cdot) \circ (φ_n \otimes φ_n)$ is p-contractible. By Lemma 3.7, $φ_n^{-1}$ is p-contractible and, by Lemma 3.6, $φ_n^{-1} \circ (\cdot) \circ (φ_n \otimes φ_n)$ is p-contractible. By definition of a p-contractible deformation algebra, we know that $μ_h$ is p-contractible. It therefore follows from (5.5) at order $h^{n+1}$ that $r(F_{m,p}(U(g) \otimes U(g))) \subseteq F_{*,n+1+p}(U(g))$, for all m, p ∈ N0. From the associativity of $μ_h$, we deduce that r is a 2-cocycle in the Hochschild complex,
\[ δ_2 r = 0. \tag{5.6} \]
As g is semisimple, it follows from Lemma 4.1 that its second Hochschild cohomology module $HH^2(U(g), U(g))$ vanishes, so that r is a coboundary. We thus have $r = δ_1 β$, for some $β \in \mathrm{Hom}(U(g)[[h]], U(g)[[h]])$. But we know that, in particular, $r(F_{2,0}(U(g) \otimes U(g))) \subseteq F_{*,n+1}(U(g))$ and $r(F_{1,1}(U(g) \otimes U(g))) \subseteq F_{*,n+2}(U(g))$. It follows that β can be consistently chosen so that $β(F_{1,0}(U(g))) \subseteq F_{*,n+1}(U(g))$ and $β(F_{0,1}(U(g))) \subseteq F_{*,n+2}(U(g))$. To complete the recursion, we have to solve
\[ μ_h = \left( φ_n^{-1} - h^{n+1} ϕ_{n+1} \bmod h^{n+2} \right) \circ \left[ \left( φ_n + h^{n+1} ϕ_{n+1} \right) \cdot \left( φ_n + h^{n+1} ϕ_{n+1} \right) \right] \mod h^{n+2}, \]
that is
\[ δ_1 ϕ_{n+1} = r. \tag{5.7} \]
This equation can be solved by taking $ϕ_{n+1} = -β$, which implies that $ϕ_{n+1}(F_{1,0}(U(g))) \subseteq F_{*,n+1}(U(g))$ and $ϕ_{n+1}(F_{0,1}(U(g))) \subseteq F_{*,n+2}(U(g))$. The proposition then follows from Lemma 3.8.
5.2. Contractible twisting for symmetric semisimple Lie algebras.
Proposition 5.2. Let (g, θ) be a symmetric semisimple Lie algebra over K having the restriction property, and let g = h ⊕ p be the associated symmetric decomposition of g. Every p-contractible deformation $(U_h(g), Δ, ε, S)$ of the Hopf algebra $(U(g), Δ_0, ε_0, S_0)$ is isomorphic, as a Hopf algebra over K[[h]], to a twist of $(U(g), Δ_0, ε_0, S_0)$ by a p-contractible invertible element $F \in U(g) \otimes U(g)[[h]]$, congruent with 1 ⊗ 1 mod h.
Proof. We consider the composite map
\[ \tilde{Δ} : U(g)[[h]] \xrightarrow{\;φ^{-1},\ \sim\;} U_h(g) \xrightarrow{\;Δ\;} U_h(g) \otimes U_h(g) \xrightarrow{\;φ \otimes φ,\ \sim\;} U(g) \otimes U(g)[[h]], \tag{5.8} \]
where the existence of a p-contractible isomorphism of K[[h]]-algebras φ follows from Proposition 5.1. As φ is an algebra isomorphism, the composite map $\tilde{Δ}$ is an algebra homomorphism. By repeated use of Lemma 3.6, one can show that it is p-contractible. Now, we want to prove that there exists a p-contractible and invertible element $F \in U(g) \otimes U(g)[[h]]$, such that F = 1 ⊗ 1 mod h and
\[ \tilde{Δ} = F Δ_0 F^{-1}. \tag{5.9} \]
We shall proceed by recursion on the order in h. To first order, we have, by construction
\[ \tilde{Δ} = Δ_0 \mod h, \tag{5.10} \]
and we can take F = 1 ⊗ 1 mod h. We thus have $F|_{h=0} \in F_{0,0}(U(g) \otimes U(g))$. Suppose now that we have found a polynomial $F_n \in U(g) \otimes U(g)[h]$ of degree n,
\[ F_n = \sum_{m=0}^{n} h^m f_m, \tag{5.11} \]
such that
\[ \tilde{Δ} - F_n Δ_0 F_n^{-1} = h^{n+1} ξ \mod h^{n+2}, \tag{5.12} \]
where $F_n^{-1} \in U(g) \otimes U(g)[[h]]$ is the formal inverse of $F_n$ in the sense that $F_n^{-1} F_n = 1$ and $ξ \in \mathrm{Hom}(U(g)[[h]], U(g) \otimes U(g)[[h]])$. We assume that $F_n$ is p-contractible, i.e.
for all n ∈ N0, $f_n \in F_{*,n}(U(g) \otimes U(g))$. Since $\tilde{Δ}$ is p-contractible, we deduce that $ξ(F_{1,0}(U(g))) \subseteq F_{*,n+1}(U(g) \otimes U(g))$ and $ξ(F_{0,1}(U(g))) \subseteq F_{*,n+2}(U(g) \otimes U(g))$. It follows from (5.12) that, for all X, Y ∈ g, we have
\[ \left( \tilde{Δ} - F_n Δ_0 F_n^{-1} \right)([X, Y]) = h^{n+1} ξ([X, Y]) \mod h^{n+2}, \tag{5.13} \]
on one hand and, on the other hand, since $\tilde{Δ}$ is an algebra homomorphism,
\[ \left( \tilde{Δ} - F_n Δ_0 F_n^{-1} \right)([X, Y]) = \left[ \tilde{Δ} X, \tilde{Δ} Y \right] - F_n Δ_0([X, Y]) F_n^{-1} = h^{n+1} \left( [Δ_0 X, ξ(Y)] + [ξ(X), Δ_0 Y] \right) \mod h^{n+2}. \tag{5.14} \]
Equating (5.13) and (5.14), we finally get
\[ d_1 ξ = 0. \tag{5.15} \]
The map ξ is thus a 1-cocycle of $Z^1(g, U(g) \otimes U(g))$ in the sense of the Chevalley-Eilenberg complex⁴. As g is semisimple, it follows from Lemma 4.2 that the cohomology module $H^1(g, U(g) \otimes U(g))$ vanishes. We therefore conclude that ξ is a coboundary. But we know that $ξ(F_{0,1}(U(g))) \subseteq F_{*,n+2}(U(g) \otimes U(g))$ and $ξ(F_{1,0}(U(g))) \subseteq F_{*,n+1}(U(g) \otimes U(g))$, so that ξ is an (n + 1, p)-contractible 1-cocycle in the contractible Chevalley-Eilenberg complex defined in Subsect. 4.3. As g is of restrictive type, it follows from Lemma 4.3 that $H^1_{n+1,p}(g, U(g) \otimes U(g)) = 0$, so that ξ is the coboundary of an (n + 1, p)-contractible element in U(g) ⊗ U(g), i.e. there exists an $α \in F_{*,n+1}(U(g) \otimes U(g))$ such that $ξ = d_0 α = δ_0 α$. In order to complete the recursion, we have to find an $f_{n+1} \in U(g) \otimes U(g)$ such that
\[ \tilde{Δ} - \left( F_n + h^{n+1} f_{n+1} \right) Δ_0 \left( F_n^{-1} - h^{n+1} f_{n+1} \right) \bmod h^{n+2} = 0 \mod h^{n+2}. \tag{5.16} \]
Expanding the above equation to order $h^{n+1}$ yields
\[ δ_0 f_{n+1} + ξ = 0. \tag{5.17} \]
This equation can then be solved by choosing $f_{n+1} = -α \in F_{*,n+1}(U(g) \otimes U(g))$.
5.3. Contractible quasi-Hopf algebras. Generically, cochain twists map quasi-Hopf algebras to quasi-Hopf algebras [2–4]. Under twisting, the coproduct and coassociator of a given quasi-Hopf algebra transform as
\[ Δ_F X = F \cdot (Δ X) \cdot F^{-1}, \qquad Φ_F = F_{12} \cdot (Δ \otimes \mathrm{id})(F) \cdot Φ \cdot (\mathrm{id} \otimes Δ)(F^{-1}) \cdot F_{23}^{-1}, \tag{5.18} \]
and, if the quasi-Hopf algebra is in addition quasitriangular, then the R-matrix R transforms as
\[ R_F = F_{21} R F^{-1}. \tag{5.19} \]
4 By rewriting (5.13)–(5.14) for the associative product of two arbitrary elements in U(g), we also show that ξ is a 1-cocycle in the sense of the Hochschild complex. This indeed provides a unique continuation of ξ from g to U(g) as a derivation.
In the previous section it happened that both Δ and $Δ_0$ were coassociative, so that both $U_h(g)$ and U(g) happened to be Hopf algebras, but the theory applies more generally. Suppose now that $R \in (U_h(g))^{\otimes 2}$ and $Φ \in (U_h(g))^{\otimes 3}$ are any R-matrix and coassociator that make a given QUEA $(U_h(g), Δ, ε, S)$ into a (coassociative) qtqH algebra, which we denote, by a slight abbreviation, as $(U_h(g), Δ, R, Φ)$. We say that this qtqH algebra is p-contractible with respect to a symmetric decomposition g = h ⊕ p if and only if $(U_h(g), Δ, ε, S)$ is p-contractible in the sense of Definition 3.9 and R and Φ are p-contractible as elements of their respective tensor products. It then follows from the definitions above that
Proposition 5.3. For any QUEA $U_h(g)$ and any symmetric decomposition g = h ⊕ p, if $(U_h(g), Δ, R, Φ)$ is a p-contractible qtqH algebra and $F \in (U_h(g))^{\otimes 2}$ is a p-contractible twist then $((U_h(g))_F, Δ_F, R_F, Φ_F)$ is a p-contractible qtqH algebra.
Combining this with Propositions 5.1 and 5.2, we have that every p-contractible qtqH algebra $(U_h(g), Δ, R, Φ)$ can be obtained, via p-contractible change of basis and twist, from some p-contractible qtqH algebra $(U(g), Δ_0, R', Φ')$ based on the undeformed UEA. In particular, starting from the trivial triangular quasi-Hopf structure (R = 1 ⊗ 1, Φ = 1 ⊗ 1 ⊗ 1) on U(g), which is obviously p-contractible, we have
Corollary 5.4. For any p-contractible deformation Hopf algebra $(U_h(g), Δ, ε, S)$ based on a symmetric semisimple Lie algebra of restrictive type with symmetric decomposition g = h ⊕ p, there is an R-matrix R and coassociator Φ such that $(U_h(g), Δ, R, Φ)$ is a p-contractible triangular quasi-Hopf algebra.
Proof. Explicitly, by Propositions 5.1 and 5.2, there exists a p-contractible invertible element $F \in U(g) \otimes U(g)[[h]]$ and a p-contractible K[[h]]-algebra isomorphism φ, such that
\[ Δ = \left( φ^{-1} \otimes φ^{-1} \right) \circ F Δ_0 F^{-1} \circ φ. \]
Defining
\[ R := \left( φ^{-1} \otimes φ^{-1} \right) \left( F_{21} F^{-1} \right), \tag{5.20} \]
\[ Φ := \left( φ^{-1} \otimes φ^{-1} \otimes φ^{-1} \right) \left( F_{12} \cdot (Δ_0 \otimes \mathrm{id})(F) \cdot (\mathrm{id} \otimes Δ_0)(F^{-1}) \cdot F_{23}^{-1} \right) \tag{5.21} \]
provides the required structure.
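The triangularity of the R-matrix obtained by twisting the trivial structure can be checked in a finite-dimensional toy model. The sketch below is an illustration under assumptions that are not in the paper: U(g) is replaced by 2x2 matrices, so that elements of the tensor square are 4x4 matrices, the isomorphism φ is taken to be the identity, and F is an arbitrary invertible matrix chosen for the example. It verifies that $R = F_{21} F^{-1}$ satisfies $R_{21} R = 1 \otimes 1$.

```python
from fractions import Fraction

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def identity(size):
    return [[int(i == j) for j in range(size)] for i in range(size)]

def mat_inverse(m):
    """Gauss-Jordan inverse over the rationals (raises if m is singular)."""
    n = len(m)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def flip(dim):
    """Matrix of the flip e_a ⊗ e_b -> e_b ⊗ e_a, with e_a ⊗ e_b at index a*dim + b."""
    size = dim * dim
    p = [[0] * size for _ in range(size)]
    for a in range(dim):
        for b in range(dim):
            p[b * dim + a][a * dim + b] = 1
    return p

def swap_legs(x, p):
    """X_21 = P X P for X acting on the tensor square."""
    return mat_mul(mat_mul(p, x), p)

P = flip(2)
F = [[1, 2, 0, 1], [0, 1, 3, 0], [1, 0, 1, 2], [0, 1, 0, 1]]  # arbitrary invertible twist
R = mat_mul(swap_legs(F, P), mat_inverse(F))                  # R = F_21 F^{-1}
R21_R = mat_mul(swap_legs(R, P), R)
```

Here `R21_R` comes out as the 4x4 identity, since $R_{21} R = F P F^{-1} P \cdot P F P F^{-1}$ collapses using $P^2 = 1$.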
One may also want to know when a given p-contractible Hopf QUEA $(U_h(g), Δ, ε, S)$ admits a p-contractible quasitriangular structure. When g is a semisimple Lie algebra, we can at least give necessary conditions, by adapting the argument surrounding Proposition 3.16 in [4]. Let $t \in \mathrm{sym}(g \otimes g)^g$ be a g-invariant symmetric element. For semisimple g, t is a linear combination of invariant symmetric elements of the simple factors of g. Let $(U_h(g), Δ, \bar{R})$ be the corresponding standard quasitriangular Hopf QUEA and $(U(g)[[h]], Δ_0, R, Φ)$ the qtqH algebra with $R = e^{ht/2}$, both as defined (simple factor by simple factor) in [4].
Corollary 5.5. Let g = h ⊕ p be a symmetric decomposition of restrictive type of a semisimple Lie algebra g. If $(U_h(g), Δ, \bar{R})$ is p-contractible, then it is isomorphic, via a p-contractible isomorphism of K[[h]]-algebras, to a p-contractible twist of $(U(g)[[h]], Δ_0, R = e^{ht/2}, Φ)$. Furthermore, ht is necessarily p-contractible.
Proof. (Outline) One follows the proof of Proposition 3.16 in [4] to reach the qtqH algebra $(U(g)[[h]], Δ_0, R, Φ)$, where R and Φ are g-invariants but, as above, one knows from Propositions 5.1 and 5.2 that the required isomorphism φ and twist F can be chosen to be p-contractible. Indeed, a further g-invariant twist may be required to ensure that $R_{21} = R$, but this twist is p-contractible as R is (cf. Prop 3.5 in [4]). Then the rest of the proof is unmodified, and one has that $R = e^{ht/2}$ and that Φ is the corresponding coassociator, as defined in [4]. Moreover, since both R and Φ depend on h and t solely through ht, their p-contractibility implies that of ht.
Remark. Knowing, ahead of time, that the standard quasitriangular Hopf QUEAs of semisimple Lie algebras exist allows one to conclude that, to the datum (g, t), corresponds, via twisting, a quasitriangular Hopf algebra. It does not, however, allow us to conclude anything about p-contractibility. In order to decide whether the existence of a p-contractible $ht \in h\, \mathrm{sym}(g \otimes g)^g$ is also a sufficient condition for the existence of a p-contractible quasitriangular Hopf algebra $(U_h(g), Δ, \bar{R})$ based on $(U_h(g), Δ, ε, S)$, it might be helpful to refine the approach of Donin and Shnider [38], where it is shown by direct cohomological arguments that there exists a twist from $(U(g), Δ_0, R, Φ)$ to the latter, therefore setting the coassociator to unity. In Sect. 7 we will see an example for which a p-contractible ht (and a p-contractible quasitriangular Hopf algebra) does exist, and one for which it does not.
6. Twists and p-Contractions
We can now finally turn to the objects in which we are really interested in this paper: those deformed enveloping algebras of non-semisimple Lie algebras that are obtained by a certain contraction procedure modelled on that used in [22–24] to obtain the κ-deformation of Poincaré.
The notion of p-contractibility introduced in the previous sections is formulated with this type of contraction in mind, as we now discuss. Recall first that if g = h ⊕ p is a symmetric decomposition of a Lie algebra g, a standard procedure known as İnönü-Wigner contraction [36,37] consists in contracting the submodule p by means of a one-parameter family of linear automorphisms of the form
\[ Γ_t = π_h + t\, π_p, \tag{6.1} \]
where $π_h : g \twoheadrightarrow h$ and $π_p : g \twoheadrightarrow p$ denote the linear projections from g to h and p respectively and t ∈ (0, 1]. For all t ∈ (0, 1], the image of g by the automorphism $Γ_t^{-1}$ is the symmetric semisimple Lie algebra $g_t$, isomorphic to g = h ⊕ p as a K-module, with Lie bracket
\[ [X, Y]_t = Γ_t^{-1}\left( [Γ_t(X), Γ_t(Y)] \right) \tag{6.2} \]
for all X, Y ∈ g. It has the property that
\[ [h, h]_t \subset h, \quad [h, p]_t \subset p, \quad \text{and} \quad [p, p]_t \subset t^2 h, \tag{6.3} \]
so in the limit t → 0 one obtains a Lie algebra $g_0$, isomorphic to g = h ⊕ p as a K-module, whose Lie bracket $[\,,]_0 = \lim_{t \to 0} [\,,]_t$ obeys
\[ [h, h]_0 \subset h, \quad [h, p]_0 \subset p, \quad \text{and} \quad [p, p]_0 = \{0\}. \tag{6.4} \]
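The scaling behaviour (6.3) and the limit (6.4) can be made concrete on the smallest example. The sketch below assumes, for illustration only, the decomposition so(3) = so(2) ⊕ p with basis $e_0$ spanning h and $e_1, e_2$ spanning p; the function names are hypothetical. It computes the rescaled structure constants of (6.2), under which the bracket of two elements of p picks up a factor $t^2$ and vanishes at t = 0, while the Jacobi identity keeps holding.

```python
import itertools
from fractions import Fraction

# Basis of so(3): e_0 spans h = so(2); e_1 and e_2 span p.
DEGREE = (0, 1, 1)  # 0 for a basis vector in h, 1 for a basis vector in p

def eps(i, j, k):
    """Levi-Civita symbol: structure constants of so(3) in this basis."""
    return (i - j) * (j - k) * (k - i) // 2

def contracted_bracket(i, j, t):
    """Components of [e_i, e_j]_t for the rescaling e_a -> t**DEGREE[a] * e_a."""
    result = []
    for k in range(3):
        c = eps(i, j, k)
        exponent = DEGREE[i] + DEGREE[j] - DEGREE[k]  # 0 or 2 whenever c != 0
        result.append(c * t ** exponent if c else 0)
    return result

def bracket_vec(u, v, t):
    out = [0, 0, 0]
    for i in range(3):
        for j in range(3):
            if u[i] and v[j]:
                comp = contracted_bracket(i, j, t)
                out = [o + u[i] * v[j] * c for o, c in zip(out, comp)]
    return out

def jacobi_defect(t):
    """Largest absolute component of the Jacobi sum over all basis triples."""
    basis = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    worst = 0
    for a, b, c in itertools.product(basis, repeat=3):
        terms = [bracket_vec(a, bracket_vec(b, c, t), t),
                 bracket_vec(b, bracket_vec(c, a, t), t),
                 bracket_vec(c, bracket_vec(a, b, t), t)]
        total = [sum(col) for col in zip(*terms)]
        worst = max(worst, max(abs(x) for x in total))
    return worst

t = Fraction(1, 4)
pp_bracket = contracted_bracket(1, 2, t)   # lies in t^2 h
hp_bracket = contracted_bracket(0, 1, t)   # stays in p, independent of t
pp_limit = contracted_bracket(1, 2, 0)     # p becomes abelian at t = 0
```

Writing the structure constants as polynomials in t, rather than conjugating by the rescaling map and its inverse, is a design choice that lets the same function be evaluated at t = 0, where the rescaling is no longer invertible.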
The submodule p is therefore an abelian ideal in $g_0$. The undeformed Hopf algebra structure defined in Sect. 3.1 is preserved as t tends to zero. There is thus a natural undeformed Hopf algebra structure on the envelope $U(g_0)$ of the contracted Lie algebra, which we may write as $(U(g_0), Δ_0, S_0, ε_0)$ without ambiguity. We may extend $Γ_t$ over U(g)[[h]] as a K[[h]]-algebra homomorphism. Further, by means of the K[[h]]-module isomorphism η of Definition 3.9, we can regard $Γ_t$ as a map $U_h(g) \to U_h(g)$ on any QUEA $U_h(g)$. This specifies how every element of the latter is to be rescaled in the contraction limit. The relevance of the definition of p-contractibility from Sect. 3 is then contained in the following
Definition-Proposition 6.1. Let (g, θ) be a symmetric semisimple Lie algebra with symmetric decomposition g = h ⊕ p and let $(U_h(g), Δ_h, S_h, ε_h)$ be a deformation of the Hopf algebra $(U(g), Δ_0, S_0, ε_0)$. For all t ∈ (0, 1], set
\[ Δ^{(t)} = \left( Γ_t^{-1} \otimes Γ_t^{-1} \right) \circ Δ_{th'} \circ Γ_t, \quad S^{(t)} = Γ_t^{-1} \circ S_{th'} \circ Γ_t \quad \text{and} \quad ε^{(t)} = ε_{th'} \circ Γ_t, \tag{6.5} \]
where $h' = h/t$ is the rescaled deformation parameter. Then the limit of $(U_{th'}(g_t), Δ^{(t)}, S^{(t)}, ε^{(t)})$ as t → 0 exists if and only if $(U_h(g), Δ_h, S_h, ε_h)$ is p-contractible. If so, one has a deformation of $(U(g_0), Δ_0, S_0, ε_0)$ which we denote by $(U_{h'}(g_0), Δ_{h'}, S_{h'}, ε_{h'})$, and refer to as the p-contraction of $(U_h(g), Δ_h, S_h, ε_h)$.
Proof. Let r, s ∈ N and let $φ : (U(g))^{\otimes r}[[h]] \to (U(g))^{\otimes s}[[h]]$ be a homomorphism of K[[h]]-modules. We want to prove that $φ_t = (Γ_t^{-1})^{\otimes s} \circ φ \circ (Γ_t)^{\otimes r}$ has a finite limit when t → 0 if and only if φ is p-contractible. First assume that φ is p-contractible; then from Lemma 3.8, there exists a collection $(ϕ_n)_{n \in N_0}$ of K[[h]]-module homomorphisms $ϕ_n : (U(g))^{\otimes r}[[h]] \to (U(g))^{\otimes s}[[h]]$ such that
\[ φ = \sum_{n \ge 0} h^n ϕ_n \tag{6.6} \]
and, for all n, m, p ∈ N0, there exists l ∈ N0 such that $ϕ_n\left(F_{m,p}((U(g))^{\otimes r})\right) \subseteq F_{l,n+p}\left((U(g))^{\otimes s}\right)$. We thus have, for all n, m, p ∈ N0,
\[
\begin{aligned}
h^n (Γ_t^{-1})^{\otimes s} \circ ϕ_n \circ (Γ_t)^{\otimes r} \left( S_{m,p}(g^{\oplus r}) \right)
&= h'^{\,n} t^{n+p} (Γ_t^{-1})^{\otimes s} \circ ϕ_n \left( S_{m,p}(g^{\oplus r}) \right) \\
&\subseteq h'^{\,n} t^{n+p} (Γ_t^{-1})^{\otimes s} \left( F_{l,n+p}((U(g))^{\otimes s}) \right) \\
&= h'^{\,n} t^{n+p} \, O(t^{-(n+p)}) \, F_{l,n+p}((U(g))^{\otimes s}) \\
&= h'^{\,n} \, O(1) \, F_{l,n+p}((U(g))^{\otimes s}).
\end{aligned}
\]
This obviously has a finite limit when t → 0 and so does $φ_t$. Conversely, one sees that if φ is not p-contractible, $φ_t$ diverges at least as $t^{-1}$.
It is worth emphasizing that the notion of p-contraction defined here is not the only possible contraction that can be performed on a QUEA of g with respect to a given symmetric decomposition g = h ⊕ p: one could also, for example, consider contractions where the deformation parameter h is not rescaled in the limit.
Finally, we can state our main result concerning twists and p-contracted QUEAs:
Theorem 6.1. If a deformation Hopf algebra $(U_{h'}(g_0), Δ_{h'}, S_{h'}, ε_{h'})$ is the p-contraction of a QUEA of a symmetric semisimple Lie algebra (g, θ) having the restriction property, then it is isomorphic, as a Hopf algebra over $K[[h']]$, to a twist of the undeformed Hopf algebra $(U(g_0), Δ_0, S_0, ε_0)$ by an invertible element $F_0 \in U_{h'}(g_0) \otimes U_{h'}(g_0)[[h']]$ congruent with 1 ⊗ 1 modulo $h'$.
Proof. By Proposition 6.1, Proposition 5.2 applies. By arguing as in the proof of 6.1, we have that if F is the p-contractible twist element of Proposition 5.2, then
\[ F_0 = \lim_{t \to 0} \left( Γ_t^{-1} \otimes Γ_t^{-1} \right)(F) \tag{6.7} \]
is well-defined. By construction, this is the twist we seek.
From Corollary 5.4, one has similarly that for every such p-contracted QUEA $U_h(\mathfrak{g}_0)$ there exists an R-matrix $R$ and coassociator $\Phi$ that make $(U_h(\mathfrak{g}_0), R, \Phi)$ into a triangular quasi-Hopf algebra.

7. Examples: κ-Poincaré in 3 and 4 Dimensions

We now turn to explicit examples. Let $\mathbb{K} = \mathbb{C}$, and consider the symmetric decomposition
\[ \mathfrak{so}(n+1) = \mathfrak{so}(n) \oplus \mathfrak{p}_n, \qquad n > 2, \qquad (7.1) \]
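whose İnönü-Wigner contraction of course yields the Lie algebra $\mathfrak{iso}(n)$ of the complexified Euclidean group in $n$ dimensions, $ISO(n, \mathbb{C})$. By Lemma 3.3, this decomposition is of restrictive type. Thus, the results above will apply to any $\mathfrak{p}_n$-contractible deformation algebra $U_h(\mathfrak{so}(n+1))$. Finding such deformations is itself a non-trivial task. In the cases $n = 3, 4$, this was achieved in [23,24]⁵, yielding the κ-deformations $U_\kappa(\mathfrak{iso}(3))$ and $U_\kappa(\mathfrak{iso}(4))$. These can be written in terms of the generators
\[ M_{ij} = -M_{ji}, \qquad N_i, \qquad P_i, \qquad P_0 = E, \qquad (7.2) \]
for all $1 \le i, j \le n-1$ and $n = 3, 4$. The deformation parameter is conventionally denoted as $\kappa = 1/h$, and the algebra is then given by
\[ [M_{ij}, P_k] = \delta_{k[i} P_{j]}, \qquad (7.3) \]
\[ [N_i, E] = P_i, \qquad [N_i, P_j] = \delta_{ij}\, \kappa \sinh\frac{E}{\kappa}, \qquad (7.4) \]
\[ [N_i, N_j] = -M_{ij} \cosh\frac{E}{\kappa} + \frac{1}{4\kappa^2} \left( P \cdot P\, M_{ij} + P_k P_{[i} M_{j]k} \right), \qquad (7.5) \]
for all $1 \le i, j, k, l \le n-1$. The coproduct is given by
\[ \Delta_\kappa(E) = E \otimes 1 + 1 \otimes E, \qquad (7.6) \]
\[ \Delta_\kappa(P_i) = P_i \otimes e^{\frac{E}{2\kappa}} + e^{-\frac{E}{2\kappa}} \otimes P_i, \qquad (7.7) \]
\[ \Delta_\kappa(N_i) = N_i \otimes e^{\frac{E}{2\kappa}} + e^{-\frac{E}{2\kappa}} \otimes N_i + \frac{1}{2\kappa} \left( P_j \otimes e^{\frac{E}{2\kappa}} M_{ij} - e^{-\frac{E}{2\kappa}} M_{ij} \otimes P_j \right), \qquad (7.8) \]
\[ \Delta_\kappa(M_{ij}) = M_{ij} \otimes 1 + 1 \otimes M_{ij}, \qquad (7.9) \]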
5 Note that although the κ-Poincaré algebra exists in arbitrary dimension [39], to the authors’ knowledge it has only explicitly been shown to arise as a p-contraction for n ≤ 4.
and the antipode by
\[ S_\kappa(P_\mu) = -P_\mu, \qquad S_\kappa(M_{ij}) = -M_{ij}, \qquad S_\kappa(N_i) = -N_i + \frac{d}{2\kappa} P_i. \qquad (7.10) \]
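As a consistency check on the coproduct and antipode above, two short computations can be written out. This is only a sketch: it assumes that $E$ commutes with $M_{ij}$ and with $P_j$ (relations not listed explicitly in the text), that $S_\kappa(E) = -E$ follows from $S_\kappa(P_\mu) = -P_\mu$ with $P_0 = E$, and it writes $m$ for the multiplication map and $\epsilon$ for the counit.

```latex
\begin{align*}
[\Delta_\kappa(N_i), \Delta_\kappa(E)]
  &= [N_i, E] \otimes e^{\frac{E}{2\kappa}} + e^{-\frac{E}{2\kappa}} \otimes [N_i, E] \\
  &= P_i \otimes e^{\frac{E}{2\kappa}} + e^{-\frac{E}{2\kappa}} \otimes P_i
   = \Delta_\kappa(P_i), \\[4pt]
m \circ (S_\kappa \otimes \mathrm{id}) \circ \Delta_\kappa(P_i)
  &= S_\kappa(P_i)\, e^{\frac{E}{2\kappa}} + S_\kappa\bigl(e^{-\frac{E}{2\kappa}}\bigr) P_i \\
  &= -P_i\, e^{\frac{E}{2\kappa}} + e^{\frac{E}{2\kappa}} P_i = 0 = \epsilon(P_i).
\end{align*}
```

The first line shows that the coproduct respects the relation $[N_i, E] = P_i$; the terms proportional to $1/2\kappa$ drop out because, under the stated assumptions, every factor in them commutes with $E$. The second line is the antipode axiom evaluated on $P_i$.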
The counit map is undeformed, $\epsilon(M_{ij}) = \epsilon(N_i) = \epsilon(P_\mu) = 0$, for all $0 \le \mu \le n-1$. It follows from the results presented in the previous sections that $U_\kappa(\mathfrak{iso}(3))$ and $U_\kappa(\mathfrak{iso}(4))$ are isomorphic to cochain twists of $U(\mathfrak{iso}(3))$ and $U(\mathfrak{iso}(4))$ respectively. Let us comment on the relationship between this statement and various previous results. First, it should not be confused with other statements that exist in the literature [40,41] concerning twists and κ-deformed Minkowski space-time, which involve enlarged algebras that include the dilatation generator. Next, as we saw above, the existence of the cochain twist means there certainly exist triangular quasi-Hopf algebras $(U_\kappa(\mathfrak{iso}(n)), R, \Phi)$, at least for $n = 3, 4$. They are obtained, as in the approach of Beggs and Majid [7,8], by twisting $(U_\kappa(\mathfrak{iso}(n)), 1^{\otimes 2}, 1^{\otimes 3})$. To the first few orders in $h = 1/\kappa$, the structures $R$ and $\Phi$ were explicitly computed, for any $n \ge 2$, in [42]; see also [43,44]. One can also understand the existence of the quasitriangular Hopf algebra structure of $U_\kappa(\mathfrak{iso}(3))$ exhibited in [23] in the context of the results above. Among the special orthogonal algebras, $\mathfrak{so}(4, \mathbb{C})$ alone is not simple: $\mathfrak{so}(4, \mathbb{C}) = \mathfrak{a}_1 \oplus \mathfrak{a}_1$. There is thus a two-dimensional space of quadratic Casimirs. It is straightforward to verify that a one-dimensional subspace of them is p-contractible, namely $h\, t := h\, \epsilon_{ijk} M_{ij} P_k$. For $n \neq 3$, it is known that there is no classical r-matrix obeying the classical Yang-Baxter equation [42,45] and therefore no quasitriangular Hopf algebra structure. This now also follows from Corollary 5.5, given that for all $n \neq 3$ the unique quadratic Casimir of $\mathfrak{so}(n+1)$ fails to be p-contractible. As for versions of the κ-deformed Poincaré algebra in higher and lower space-time dimensions, a consistent definition was first given in [39].
The main idea is that the four-dimensional case is generic enough that the (1 + d)-dimensional case can be obtained by simply extending or truncating the range of the spatial indices from 1, . . . , 3 to 1, . . . , d. It is reasonable to think that the twist obtained in the four-dimensional case can be similarly extended to arbitrary dimensions, thus extending to all dimensions the existence of a triangular quasi-Hopf algebra structure on the κ-deformation of the Poincaré algebra. In particular, we expect that the κ-deformation of U(sl(2)) admits a triangular quasi-Hopf algebra structure [44], but a proof of this statement would require a refinement of the arguments used here so as to circumvent the obstructions arising in this case (cf. the Appendix). Such a refinement could, for instance, rely on a further symmetry property of the p-contractible Chevalley-Eilenberg cohomology of sl(2). Finally, we note that it would be interesting to understand the existence of the twist from the point of view of the other, conceptually distinct, construction of κ-Poincaré, namely as a bicrossproduct [20,21,46,47].

Acknowledgments. The research of C.A.S.Y. was supported by the Leverhulme foundation. R.Z. was funded by an EPSRC postdoctoral fellowship.
Appendix: Proof of Lemma 3.3

In this Appendix, we provide a proof of Lemma 3.3. Let $(\mathfrak{g}, \theta)$ be a symmetric semisimple Lie algebra obeying the conditions of the lemma. If $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$ is the associated symmetric decomposition of $\mathfrak{g}$, we want to prove that, for all $p \in \mathbb{N}$, the projection from $\mathfrak{g}$ to $\mathfrak{p}$ maps $S_p(\mathfrak{g} \oplus \mathfrak{g})^{\mathfrak{g}}$ onto $S_{0,p}(\mathfrak{g} \oplus \mathfrak{g})^{\mathfrak{h}}$. The isomorphism of left $\mathfrak{g}$-modules (3.6) induces a similar isomorphism $S(\mathfrak{g} \oplus \mathfrak{g}) \cong S(\mathfrak{g}) \otimes S(\mathfrak{g})$ at the level of the symmetric algebras, from which it follows that
\[ S_m(\mathfrak{g} \oplus \mathfrak{g}) \cong \bigoplus_{k=0}^{m} S_k(\mathfrak{g}) \otimes S_{m-k}(\mathfrak{g}), \qquad (7.11) \]
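for all $m \in \mathbb{N}$. We thus have a decomposition of $S(\mathfrak{g} \oplus \mathfrak{g})$ into the $\mathfrak{g}$-submodules isomorphic to $S_k(\mathfrak{g}) \otimes S_{m-k}(\mathfrak{g})$. There is an analogous decomposition of $S_{0,m}(\mathfrak{g} \oplus \mathfrak{g})$ into $\mathfrak{h}$-submodules isomorphic to $S_{0,k}(\mathfrak{g}) \otimes S_{0,m-k}(\mathfrak{g}) = S_k(\mathfrak{p}) \otimes S_{m-k}(\mathfrak{p})$. It therefore suffices to show that, for all $k, \ell \in \mathbb{N}$, the restriction map induces a surjection
\[ (S_k(\mathfrak{g}) \otimes S_\ell(\mathfrak{g}))^{\mathfrak{g}} \twoheadrightarrow (S_k(\mathfrak{p}) \otimes S_\ell(\mathfrak{p}))^{\mathfrak{h}}. \qquad (7.12) \]
Identifying $\mathfrak{g} \cong \mathfrak{g}^*$, and in particular $\mathfrak{p} \cong \mathfrak{p}^*$, by means of the Killing form, an element $d \in S_k(\mathfrak{p}) \otimes S_\ell(\mathfrak{p})$ can be regarded as a $(k+\ell)$-linear map
\[ \mathfrak{p} \times \cdots \times \mathfrak{p} \to \mathbb{K}; \qquad (X, \ldots, Y) \mapsto d(X, \ldots, Y) \qquad (7.13) \]
that is symmetric in its first $k$ and final $\ell$ slots. In view of the polarization formulae, such maps are in bijection with polynomials of two variables in $\mathfrak{p}$, according to
\[ p^{(d)}(X, Y) = d(\underbrace{X, \ldots, X}_{k}, \underbrace{Y, \ldots, Y}_{\ell}). \qquad (7.14) \]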
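These polynomials are $(k, \ell)$-homogeneous, by which we mean that they are homogeneous of degree $k$ with respect to their first argument and of degree $\ell$ with respect to their second argument. We denote by $\mathbb{K}_{k,\ell}[\mathfrak{p}, \mathfrak{p}]$ the left $\mathfrak{h}$-module of $(k, \ell)$-homogeneous polynomials on $\mathfrak{p}$. Then for all $k, \ell \in \mathbb{N}$, $(S_k(\mathfrak{p}) \otimes S_\ell(\mathfrak{p}))^{\mathfrak{h}}$ is in bijection with the submodule $\mathbb{K}_{k,\ell}[\mathfrak{p}, \mathfrak{p}]^{\mathfrak{h}}$ of $\mathfrak{h}$-invariant $(k, \ell)$-homogeneous polynomials. Similarly, $(S_k(\mathfrak{g}) \otimes S_\ell(\mathfrak{g}))^{\mathfrak{g}}$ is in bijection with $\mathbb{K}_{k,\ell}[\mathfrak{g}, \mathfrak{g}]^{\mathfrak{g}}$. Therefore, it suffices to show that the restriction map from $\mathfrak{g}$ to $\mathfrak{p}$ maps $\mathbb{K}_{k,\ell}[\mathfrak{g}, \mathfrak{g}]^{\mathfrak{g}}$ onto $\mathbb{K}_{k,\ell}[\mathfrak{p}, \mathfrak{p}]^{\mathfrak{h}}$. By virtue of Lemma 2.2, it will be sufficient to consider separately the cases of diagonal symmetric Lie algebras and of the symmetric simple Lie algebras listed in Lemma 3.3.

We recall that a diagonal symmetric Lie algebra is a pair $(\mathfrak{g}, \theta)$, where $\mathfrak{g} = \mathfrak{v} \oplus \mathfrak{v}$, for some semisimple Lie algebra $\mathfrak{v}$, and $\theta$ is the involutive automorphism of Lie algebras defined by $\theta(x, y) = (y, x)$, for all $(x, y) \in \mathfrak{g}$. We thus have $\mathfrak{g} = \mathfrak{h} \oplus \mathfrak{p}$, where $\mathfrak{h}$ is the set of elements of $\mathfrak{g}$ of the form $(x, x)$, whereas $\mathfrak{p}$ is the set of elements of $\mathfrak{g}$ of the form $(x, -x)$, for $x \in \mathfrak{v}$. We are first going to prove that $\mathbb{K}_{k,\ell}[\mathfrak{p}, \mathfrak{p}]^{\mathfrak{h}} \cong \mathbb{K}_{k,\ell}[\mathfrak{v}, \mathfrak{v}]^{\mathfrak{v}}$. Let $p \in \mathbb{K}_{k,\ell}[\mathfrak{p}, \mathfrak{p}]$ be a polynomial. For all $X, Y \in \mathfrak{p}$, we have
\[ p(X, Y) = p((x, -x), (y, -y)) = \tilde{p}(x, y), \qquad (7.15) \]
for some $x, y \in \mathfrak{v}$. The left $\mathfrak{h}$-action on $\mathfrak{p}$ induces a left $\mathfrak{h}$-action on $\mathfrak{p} \times \mathfrak{p}$, given, for all $h \in \mathfrak{h}$ and all $X, Y \in \mathfrak{p}$, by
\[ h \triangleright (X, Y) = (z, z) \triangleright ((x, -x), (y, -y)) = ((z \triangleright x, -z \triangleright x), (z \triangleright y, -z \triangleright y)), \qquad (7.16) \]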
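for some $x, y \in \mathfrak{v}$ and some $z \in \mathfrak{v}$; from which it follows that $\tilde{p}$ is $\mathfrak{v}$-invariant if and only if $p$ is $\mathfrak{h}$-invariant. Now, we are going to prove that the restriction map is a surjection from $\mathbb{K}_{k,\ell}[\mathfrak{g}, \mathfrak{g}]^{\mathfrak{g}}$ onto $\mathbb{K}_{k,\ell}[\mathfrak{v}, \mathfrak{v}]^{\mathfrak{v}}$. Let $p \in \mathbb{K}_{k,\ell}[\mathfrak{g}, \mathfrak{g}]^{\mathfrak{g}}$ be a $\mathfrak{g}$-invariant polynomial on $\mathfrak{g}$. The left $\mathfrak{g}$-action on $\mathfrak{g} \oplus \mathfrak{g}$ is given, for all $g \in \mathfrak{g}$ and all $X, Y \in \mathfrak{g}$, by
\[ g \triangleright (X, Y) = (g_1, g_2) \triangleright ((x_1, x_2), (y_1, y_2)) = ((g_1 \triangleright x_1, g_2 \triangleright x_2), (g_1 \triangleright y_1, g_2 \triangleright y_2)), \qquad (7.17) \]
for some $g_1, g_2 \in \mathfrak{v}$ and some $x_1, x_2, y_1, y_2 \in \mathfrak{v}$. As one can always choose $g_1$ and $g_2$ independently, it follows that in order for $p$ to be $\mathfrak{g}$-invariant, there must be a polynomial $f : \mathbb{K} \times \mathbb{K} \to \mathbb{K}$ and two $\mathfrak{v}$-invariant polynomials $p_1, p_2 \in \mathbb{K}_{k,\ell}[\mathfrak{v}, \mathfrak{v}]^{\mathfrak{v}}$ such that
\[ p((x_1, x_2), (y_1, y_2)) = f(p_1(x_1, y_1), p_2(x_2, y_2)), \qquad (7.18) \]
for all $x_1, x_2, y_1, y_2 \in \mathfrak{v}$. Now restricting $p$ to $\mathfrak{p}$, we get
\[ p((x_1, -x_1), (y_1, -y_1)) = f(p_1(x_1, y_1), p_2(-(x_1, y_1))) = \tilde{p}(x_1, y_1) \in \mathbb{K}_{k,\ell}[\mathfrak{v}, \mathfrak{v}]^{\mathfrak{v}}, \qquad (7.19) \]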
for all $x_1, y_1 \in \mathfrak{v}$. Now, every polynomial in $\mathbb{K}_{k,\ell}[\mathfrak{v}, \mathfrak{v}]^{\mathfrak{v}}$ can be obtained as the restriction to $\mathfrak{p}$ of a polynomial in $\mathbb{K}_{k,\ell}[\mathfrak{g}, \mathfrak{g}]^{\mathfrak{g}}$; e.g. take $p_2 = 0$, $f = \mathrm{id}$ and $p_1 = \tilde{p}$.

We are now going to consider the different symmetric simple Lie algebras listed in Lemma 3.3. Let us first consider the symmetric simple Lie algebras of type $\mathrm{AI}_n$ for all $n > 2$. In this case, we have $\mathfrak{g} = \mathfrak{su}(n)$ endowed with an involutive automorphism $\theta$ given by complex conjugation, i.e. $\theta(x) = \bar{x}$, for all $x \in \mathfrak{su}(n)$. The fixed points of $\theta$ are traceless real antisymmetric matrices which generate an $\mathfrak{so}(n)$ subalgebra. We thus have the symmetric decomposition $\mathfrak{su}(n) = \mathfrak{so}(n) \oplus \mathfrak{p}$, where the orthogonal complement $\mathfrak{p}$ is the left $\mathfrak{so}(n)$-module generated by the traceless imaginary symmetric matrices of $\mathfrak{su}(n)$. It follows from the first fundamental theorem for $\mathfrak{so}(n)$-invariant polynomials on $n \times n$ matrices [48] that $\mathbb{K}_{k,\ell}[\mathfrak{p}, \mathfrak{p}]^{\mathfrak{so}(n)}$ is generated by the following polynomials:
\[ (x, y) \in \mathfrak{p} \times \mathfrak{p} \mapsto \operatorname{tr} P(x, y), \qquad (7.20) \]
for all $(i, j)$-homogeneous noncommutative polynomials $P \in \mathbb{K}_{i,j}[X, Y]$, with $i \le k$ and $j \le \ell$. The polynomials defined in (7.20) are restrictions to $\mathfrak{p}$ of $\mathfrak{su}(n)$-invariant polynomials on $\mathfrak{su}(n)$ as, for all $P \in \mathbb{K}_{i,j}[X, Y]$ and all $x, y \in \mathfrak{su}(n)$,
\[ (x, y) \mapsto \operatorname{tr} P(x, y) \qquad (7.21) \]
defines an element in $\mathbb{K}_{i,j}[\mathfrak{su}(n), \mathfrak{su}(n)]^{\mathfrak{su}(n)}$. This proves Lemma 3.3 for simple symmetric Lie algebras of type $\mathrm{AI}_{n>2}$. It is worth noting that in the case of $\mathrm{AI}_2$, there exist obstructions to the above result which are related to the existence of a further $\mathfrak{so}(2)$-invariant with appropriate symmetries, namely the pfaffian $(x, y) \in \mathfrak{p} \times \mathfrak{p} \mapsto \operatorname{Pf}([x, y])$. As the latter is not the restriction to $\mathfrak{p}$ of any $\mathfrak{su}(2)$ invariant on $\mathfrak{su}(2)$, Lemma 3.3 does not hold in this case.

We now turn to type $\mathrm{AII}_n$. In this case, we have $\mathfrak{g} = \mathfrak{su}(2n)$ endowed with an involutive automorphism $\theta$ given by the symplectic transpose, i.e., for all $x \in \mathfrak{su}(2n)$, $\theta(x) = J x^{t} J$, where $J$ is a non-singular skew-symmetric $2n \times 2n$ matrix such that $J^2 = -1$. The fixed point set of $\theta$ constitutes an $\mathfrak{sp}(2n)$ subalgebra and we have the following symmetric decomposition $\mathfrak{su}(2n) = \mathfrak{sp}(2n) \oplus \mathfrak{p}$, where $\mathfrak{p} \subset \mathfrak{su}(2n)$ is the left $\mathfrak{sp}(2n)$-module of matrices $x \in \mathfrak{su}(2n)$ such that $\theta(x) = -x$. It follows from the first fundamental theorem for $\mathfrak{sp}(2n)$-invariant polynomials on $2n \times 2n$ matrices [48] that $\mathbb{K}_{k,\ell}[\mathfrak{p}, \mathfrak{p}]^{\mathfrak{sp}(2n)}$ is generated by the following polynomials:
\[ (x, y) \in \mathfrak{p} \times \mathfrak{p} \mapsto \operatorname{tr} P(x, y), \qquad (7.22) \]
for all noncommutative $(i, j)$-homogeneous polynomials $P \in \mathbb{K}_{i,j}[X, Y]$, with $i \le k$ and $j \le \ell$. These polynomials are restrictions to $\mathfrak{p}$ of $\mathfrak{su}(2n)$-invariant polynomials on $\mathfrak{su}(2n)$ as, for all $P \in \mathbb{K}_{i,j}[X, Y]$ and all $x, y \in \mathfrak{su}(2n)$,
\[ (x, y) \mapsto \operatorname{tr} P(x, y) \qquad (7.23) \]
defines an element in $\mathbb{K}_{i,j}[\mathfrak{su}(2n), \mathfrak{su}(2n)]^{\mathfrak{su}(2n)}$. This proves Lemma 3.3 for simple symmetric Lie algebras of type $\mathrm{AII}_n$.

We finally consider the symmetric simple Lie algebras of type $\mathrm{BDI}_{n,1}$ for all $n > 2$. In this case, we have the symmetric pairs $(\mathfrak{so}(n+1), \mathfrak{so}(n))_{n>2}$. We introduce the usual basis of $\mathfrak{gl}(n+1)$, i.e. the $(E_{ij})_{0 \le i, j \le n}$ defined as the $(n+1) \times (n+1)$ matrices with a 1 at the intersection of the $i$-th row and $j$-th column and 0 everywhere else. The matrices $M_{ij} = E_{ij} - E_{ji}$, $0 \le i, j \le n$, constitute a basis of $\mathfrak{so}(n+1)$, and of these, the $M_{ij}$ with $1 \le i, j \le n$ generate an $\mathfrak{so}(n)$ subalgebra. We thus have the symmetric decomposition $\mathfrak{so}(n+1) = \mathfrak{so}(n) \oplus \mathfrak{p}$, where $\mathfrak{p}$ is the $n$-dimensional $\mathfrak{so}(n)$-module spanned by the $P_i = M_{0,i}$, for all $1 \le i \le n$. The $P_i$ transform under the fundamental representation $\mathbf{n}$ of $\mathfrak{so}(n)$, as can be checked from
\[ M_{ij} \triangleright P_k = [M_{ij}, P_k] = \delta_{jk} P_i - \delta_{ik} P_j, \qquad (7.24) \]
for all $1 \le i, j, k \le n$. This means that we are looking for $SO(n)$-invariant $(k, \ell)$-homogeneous polynomials on $\mathfrak{p} \times \mathfrak{p} = \mathbf{n} \times \mathbf{n}$. For all $n > 2$, it follows from the first fundamental theorem for $\mathfrak{so}(n)$-invariant polynomials on vectors [49,50] that such polynomials only depend on the $SO(n)$ scalars built out of the scalar products of their arguments. Let $q$ be the quadratic form defined on $\mathfrak{p} \times \mathfrak{p}$ by $q(P_i, P_j) = \delta_{ij}$ for all $1 \le i, j \le n$. For all $p \in \mathbb{K}_{k,\ell}[\mathfrak{p}, \mathfrak{p}]^{\mathfrak{h}}$, there exists a polynomial $f : \mathbb{K}^3 \to \mathbb{K}$ such that, for all $X, Y \in \mathfrak{p}$,
\[ p(X, Y) = f(q(X, X), q(X, Y), q(Y, Y)). \qquad (7.25) \]
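The relation (7.24) and the normalisation of the quadratic form q can be checked numerically with explicit matrices. The sketch below is illustrative only: the helper names (`unit`, `lin`, `comm`, `check`) are not from the text, matrices are plain nested lists, and the trace form $-\tfrac{1}{2}\operatorname{tr}(XY)$ is the standard invariant form on so(n + 1) that restricts to q on p.

```python
def zeros(d):
    return [[0] * d for _ in range(d)]

def unit(i, j, d):
    """Elementary matrix E_ij of size d x d (indices start at 0)."""
    m = zeros(d)
    m[i][j] = 1
    return m

def lin(a, A, b, B):
    """Return the linear combination a*A + b*B of two square matrices."""
    d = len(A)
    return [[a * A[r][c] + b * B[r][c] for c in range(d)] for r in range(d)]

def mul(A, B):
    d = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(d)) for c in range(d)]
            for r in range(d)]

def comm(A, B):
    return lin(1, mul(A, B), -1, mul(B, A))

def trace(A):
    return sum(A[r][r] for r in range(len(A)))

def M(i, j, n):
    """Generator M_ij = E_ij - E_ji of so(n+1), with 0 <= i, j <= n."""
    return lin(1, unit(i, j, n + 1), -1, unit(j, i, n + 1))

def P(i, n):
    """Generator P_i = M_{0,i} spanning the complement p."""
    return M(0, i, n)

def check(n):
    """Verify (7.24), [P_i, P_j] = -M_ij, and -1/2 tr(P_i P_j) = delta_ij."""
    delta = lambda a, b: 1 if a == b else 0
    idx = range(1, n + 1)
    for i in idx:
        for j in idx:
            if -trace(mul(P(i, n), P(j, n))) != 2 * delta(i, j):
                return False
            if comm(P(i, n), P(j, n)) != lin(-1, M(i, j, n), 0, zeros(n + 1)):
                return False
            for k in idx:
                lhs = comm(M(i, j, n), P(k, n))
                rhs = lin(delta(j, k), P(i, n), -delta(i, k), P(j, n))
                if lhs != rhs:
                    return False
    return True

if __name__ == "__main__":
    print(check(3), check(4))
```

The second check, $[P_i, P_j] = -M_{ij}$, confirms that the commutator of two elements of p lands in so(n), as required for a symmetric decomposition.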
Now, $q$ is the restriction to $\mathfrak{p}$ of the map
\[ \mathfrak{so}(n+1) \times \mathfrak{so}(n+1) \to \mathbb{K}; \qquad (X, Y) \mapsto -\tfrac{1}{2} \operatorname{tr}(XY), \]
which is $\mathfrak{so}(n+1)$-invariant. This proves the result for symmetric simple Lie algebras of type $\mathrm{BDI}_{n>2,1}$. It is worth noting that in the case of $\mathrm{BDI}_{2,1}$, there exist obstructions to the above result which are related to the existence of a further $SO(2)$ invariant, namely $(X, Y) \in \mathfrak{p} \times \mathfrak{p} \mapsto \det(X, Y)$. As the latter is not the restriction to $\mathfrak{p}$ of any $\mathfrak{so}(3)$ invariant, Lemma 3.3 does not hold in this case. By virtue of the special isomorphisms between lower rank simple Lie algebras, the list of summands in Lemma 3.3 actually includes $\mathrm{CII}_{1,1} = \mathrm{BDI}_{4,1}$ and $\mathrm{BDI}_{3,3} = \mathrm{AI}_4$. The latter respectively correspond to the symmetric decompositions $\mathfrak{sp}(4) = (\mathfrak{sp}(2) \oplus \mathfrak{sp}(2)) \oplus \mathfrak{p}$ and $\mathfrak{so}(6) = (\mathfrak{so}(3) \oplus \mathfrak{so}(3)) \oplus \mathfrak{p}$.

References
1. Drinfeld, V.G.: Quantum groups. J. Sov. Math. 41, 898 (1988) [Zap. Nauchn. Semin. 155, 18 (1986)]. Also in Proc. Int. Cong. Math. (Berkeley, 1986) 1, 1987, pp. 798–820
2. Chari, V., Pressley, A.: A Guide to Quantum Groups. Cambridge: Cambridge University Press, 1994
3. Majid, S.: Foundations of Quantum Group Theory. Cambridge: Cambridge University Press, 2000
4. Drinfel’d, V.G.: Quasi-Hopf algebras. (Russian) Algebra i Analiz 1(6), 114–148 (1989); translation in Leningrad Math. J. 1(6), 1419–1457 (1990)
5. Drinfel’d, V.G.: On the structure of quasitriangular quasi-Hopf algebras. (Russian) Funktsional. Anal. i Prilozhen. 26(1), 78–80 (1992); translation in Funct. Anal. Appl. 26(1), 63–65 (1992)
6. Drinfel’d, V.G.: On almost cocommutative Hopf algebras. (Russian) Algebra i Analiz 1(2), 30–46 (1989); translation in Leningrad Math. J. 1(2), 321–342 (1990)
7. Beggs, E.J., Majid, S.: Semi-classical differential structures. Pac. J. Math. 224(1), 1–44 (2006)
8. Beggs, E.J., Majid, S.: Quantization by cochain twists and nonassociative differentials. http://arxiv.org/abs/math/0506450v2 [math.QA], 2005
9. Weinberg, S.: The Quantum Theory of Fields. Vol. 1: Foundations. Cambridge: Cambridge University Press, 1995
10. Lukierski, J.: Quantum deformations of Einstein’s relativistic symmetries. AIP Conf. Proc. 861, 398 (2006)
11. Oeckl, R.: Untwisting noncommutative R**d and the equivalence of quantum field theories. Nucl. Phys. B 581, 559 (2000)
12. Lukierski, J., Ruegg, H., Zakrzewski, W.J.: Classical quantum mechanics of free kappa relativistic systems. Annals Phys. 243, 90 (1995)
13. Kosinski, P., Lukierski, J., Maslanka, P.: Local D = 4 field theory on kappa-deformed Minkowski space. Phys. Rev. D 62, 025004 (2000)
14. Amelino-Camelia, G., Majid, S.: Waves on noncommutative spacetime and gamma-ray bursts. Int. J. Mod. Phys. A 15, 4301 (2000)
15. Agostini, A., Amelino-Camelia, G., D’Andrea, F.: Hopf-algebra description of noncommutative-spacetime symmetries. Int. J. Mod. Phys. A 19, 5187 (2004)
16. Dimitrijevic, M., Jonke, L., Moller, L., Tsouchnika, E., Wess, J., Wohlgenannt, M.: Field theory on kappa-spacetime. Czech. J. Phys. 54, 1243 (2004)
17. Grosse, H., Wohlgenannt, M.: On kappa-deformation and UV/IR mixing. Nucl. Phys. B 748, 473 (2006)
18. Kresic-Juric, S., Meljanac, S., Stojic, M.: Covariant realizations of kappa-deformed space. Eur. Phys. J. C 51, 229 (2007)
19. Daszkiewicz, M., Lukierski, J., Woronowicz, M.: κ-deformed statistics and classical four-momentum addition law. Mod. Phys. Lett. A 23, 653–665 (2008)
20. Majid, S., Ruegg, H.: Bicrossproduct structure of kappa Poincaré group and noncommutative geometry. Phys. Lett. B 334, 348 (1994)
21. Freidel, L., Kowalski-Glikman, J., Nowak, S.: Field theory on κ-Minkowski space revisited: Noether charges and breaking of Lorentz symmetry. Int. J. Mod. Phys. A 23, 2687 (2008)
22. Celeghini, E., Giachetti, R., Sorace, E., Tarlini, M.: Three dimensional quantum groups from contraction of SU(2)_Q. J. Math. Phys. 31, 2548 (1990)
23. Celeghini, E., Giachetti, R., Sorace, E., Tarlini, M.: The three-dimensional Euclidean quantum group E(3)-q and its R matrix. J. Math. Phys. 32, 1159 (1991)
24. Lukierski, J., Ruegg, H., Nowicki, A., Tolstoi, V.N.: Q-deformation of Poincaré algebra. Phys. Lett. B 264, 331 (1991)
25. Dixmier, J.: Enveloping Algebras. Amsterdam: North Holland Publishing Company, 1977
26. Helgason, S.: Differential Geometry and Symmetric Spaces. London: Academic Press, 1962
27. Cartan, E.: Sur certaines formes riemanniennes remarquables des géométries à groupe fondamental simple. Ann. Sci. Ecole Norm. Sup. 44, 345–467 (1927)
28. Helgason, S.: Fundamental solutions of invariant differential operators on symmetric spaces. Amer. J. Math. 86(3), 565–601 (1964)
29. Burstall, F.E., Ferus, D., Pedit, F., Pinkall, U.: Harmonic tori in symmetric spaces and commuting Hamiltonian systems on loop algebras. Ann. Math. 138(1), 173–212 (1993)
30. Evans, J.M.: Integrable sigma-models and Drinfeld-Sokolov hierarchies. Nucl. Phys. B 608, 591 (2001)
31. Hochschild, G.: On the cohomology groups of an associative algebra. Ann. Math. 46(1), 58–67 (1945)
32. Hochschild, G.: On the cohomology theory for associative algebras. Ann. Math. 47(3), 568–579 (1946)
33. Cartan, H., Eilenberg, S.: Homological Algebra. Princeton, NJ: Princeton University Press, 1956
34. Weibel, C.A.: An Introduction to Homological Algebra. Cambridge studies in advanced mathematics 38, Cambridge: Cambridge University Press, 1994
35. Chevalley, C., Eilenberg, S.: Cohomology theory of Lie groups and Lie algebras. Trans. Amer. Math. Soc. 63, 85–124 (1948)
36. Inönü, E., Wigner, E.P.: On the contraction of groups and their representations. Proc. Natl. Acad. Sci. U.S.A. 39, 510–524 (1953)
37. Saletan, E.J.: Contraction of Lie groups. J. Math. Phys. 2, 1–21 (1961)
38. Donin, J., Shnider, S.: Cohomological construction of quantized universal enveloping algebras. Trans. Am. Math. Soc. 349, 1611 (1997)
39. Lukierski, J., Ruegg, H.: Quantum kappa Poincaré in any dimension. Phys. Lett. B 329, 189 (1994)
40. Govindarajan, T.R., Gupta, K.S., Harikumar, E., Meljanac, S., Meljanac, D.: Twisted statistics in kappa-Minkowski spacetime. Phys. Rev. D 77, 105010 (2008)
41. Borowiec, A., Pachol, A.: kappa-Minkowski spacetime as the result of Jordanian twist deformation. Phys. Rev. D 79, 045012 (2009)
42. Young, C.A.S., Zegers, R.: On kappa-deformation and triangular quasibialgebra structure. Nucl. Phys. B 809, 439–457 (2009)
43. Young, C.A.S., Zegers, R.: Covariant particle statistics and intertwiners of the kappa-deformed Poincaré algebra. Nucl. Phys. B 797, 537 (2008)
44. Young, C.A.S., Zegers, R.: Covariant particle exchange for kappa-deformed theories in 1+1 dimensions. Nucl. Phys. B 804, 342 (2008)
45. Zakrzewski, S.: Poisson Poincaré groups. http://arxiv.org/abs/hep-th/9412099v1, 1994
46. Schroers, B.J.: Lessons from (2 + 1)-dimensional quantum gravity. PoS QG-PH, 035 (2007)
47. Majid, S., Schroers, B.J.: q-Deformation and semidualisation in 3d quantum gravity. http://arxiv.org/abs/0806.2587v2 [gr-qc], 2009
48. Procesi, C.: The invariants of n × n matrices. Adv. Math. 19, 306–381 (1976)
49. Kraft, H., Procesi, C.: Classical invariant theory. Available online at http://www.math.unibas.ch/~kraft/Papers/KP-Primer.pdf, 1996
50. Spivak, M.: A comprehensive introduction to differential geometry. Vol. 5, Houston, TX: Publish or Perish, 1979

Communicated by A. Connes
Commun. Math. Phys. 298, 613–643 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1088-6
Communications in
Mathematical Physics
String Theory and the Kauffman Polynomial
Marcos Mariño
Département de Physique Théorique et Section de Mathématiques, Université de Genève, Genève CH-1211, Switzerland. E-mail: [email protected]
Received: 6 June 2009 / Accepted: 16 April 2010
Published online: 6 July 2010, © Springer-Verlag 2010
Abstract: We propose a new, precise integrality conjecture for the colored Kauffman polynomial of knots and links inspired by large N dualities and the structure of topological string theory on orientifolds. According to this conjecture, the natural knot invariant in an unoriented theory involves both the colored Kauffman polynomial and the colored HOMFLY polynomial for composite representations, i.e. it involves the full HOMFLY skein of the annulus. The conjecture sheds new light on the relationship between the Kauffman and the HOMFLY polynomials, and it implies for example Rudolph’s theorem. We provide various non-trivial tests of the conjecture and we sketch the string theory arguments that lead to it.

Contents
1. Introduction
2. Colored HOMFLY and Kauffman Polynomials
   2.1 Basic ingredients from representation theory
   2.2 The colored HOMFLY polynomial
   2.3 The colored Kauffman polynomial
   2.4 Relationships between the HOMFLY and the Kauffman invariants
3. The Conjecture
   3.1 Review of the conjecture for the colored HOMFLY invariant
   3.2 The conjecture for the colored Kauffman invariant
4. Evidence
   4.1 Direct computations
   4.2 General predictions for knots
   4.3 General predictions for links
5. String Theory Interpretation
   5.1 Chern–Simons theory and D-branes
   5.2 Topological string dual
6. Conclusions and Outlook
1. Introduction

The HOMFLY [12] and the Kauffman [23] polynomials are probably the most useful two-variable polynomial invariants of knots and links. Both of them generalize the Jones polynomial, and they have become basic building blocks of quantum topology. However, many aspects of these polynomial invariants are still poorly understood. As Joan Birman remarked in 1993, “we can compute the simplest of the invariants by hand and quickly fill pages (...) without having the slightest idea what they mean” [6]. One particularly interesting question concerns the relationship between the HOMFLY and the Kauffman invariants. Since their discovery almost thirty years ago, a number of isolated connections have been found between them. For example, when written in an appropriate way, they have the same lowest order term [33]. Other connections can be found when one considers their colored versions. The colored invariants can be formulated in terms of skein theory, in terms of quantum groups, or in terms of Chern–Simons gauge theory [53]. In the language of quantum groups or Chern–Simons theory, different colorings correspond to different choices of group representation. The original HOMFLY and Kauffman invariants are obtained when one considers the fundamental representation of SU(N) and SO(N)/Sp(N), respectively. One could consider other representations, such as the adjoint representation. An intriguing result of Rudolph [49] states that the HOMFLY invariant of a link colored by the adjoint representation equals the square of the Kauffman polynomial of the same link, after the coefficients are reduced modulo two. This type of relationship has been recently extended by Morton and Ryder to more general colorings [41,43]. In spite of these connections, no unified, general picture has emerged to describe both invariants.
More recently, knot invariants have been reinterpreted in the context of string theory thanks to the Gopakumar–Vafa conjecture [14], which postulates an equivalence between the 1/N expansion of Chern–Simons theory on the three-sphere, and topological string theory on a Calabi–Yau manifold called the resolved conifold. As a consequence of this conjecture, correlation functions of Chern–Simons gauge theory with U(N ) gauge group (i.e. colored HOMFLY invariants) are given by correlation functions in open topological string theory, which mathematically correspond to open Gromov–Witten invariants. Since Gromov–Witten invariants enjoy highly nontrivial integrality properties [13,31,44], this equivalence provides strong structural results on the colored HOMFLY polynomial [29,31,44] which have been tested in detail in various cases [29,36,48] and finally proved in [37]. Moreover, there is a full cohomology theory behind these invariants [31] which should be connected to categorifications of the HOMFLY polynomial [16]. Therefore, the string theory description “explains” to a large extent many aspects of the colored U(N ) invariants and leads to new predictions about their algebraic structure. The string theory perspective is potentially the most powerful tool to understand the connections between the colored HOMFLY and Kauffman polynomials. From the point of view of Chern–Simons theory, these polynomials correspond to the gauge groups U(N ) and SO(N )/Sp(N ), respectively. But when a gauge theory has a string theory large N dual, as is the case here, the theory with orthogonal or symplectic gauge groups can be obtained from the theory with unitary gauge group by using a special type of orbifold action called an orientifold. The building block of an orientifold is an involution I in the target space X of the string theory, which is then combined with an orientation reversal in the worldsheet of the string to produce unoriented strings in the quotient space X/I. 
Very roughly, one finds that correlation functions in the orbifold theory are given by correlation functions in the SO/Sp gauge theories. As for any orbifold, these
functions are given by a sum over an “untwisted” sector involving oriented strings, and a “twisted” sector involving the unoriented strings introduced by the orientifold. The contributions from oriented strings are still given by correlation functions in the U(N) gauge theory. The use of orientifolds in the context of the Gopakumar–Vafa conjecture was initiated in [51], which identified the relevant involution of the resolved conifold and studied the closed string sector. This line of research was developed in more detail in [7,8]. In particular, [8] extended the orientifold action to the open string sector and pointed out that, as a consequence of the underlying string/gauge theory correspondence, the colored Kauffman invariant of a link should be given by the sum of an appropriate HOMFLY invariant plus an “unoriented” contribution. The results of [8] made it possible to formulate some partial conjectures on the structure of the Kauffman polynomial and test them in examples (see [45] for further tests).¹ Unfortunately, these results were not precise enough to provide a full, detailed string-based picture. The reason was that one of the crucial ingredients (the appropriate HOMFLY invariant that corresponds to the untwisted sector of the orientifold) was not identified. In this paper we remedy this situation and we identify these invariants as HOMFLY polynomials colored by composite representations of U(N). This will allow us to state a precise conjecture on the structure of the colored Kauffman polynomial of knots and links. In skein-theoretic language, the appearance of composite representations means that, in order to understand the colored Kauffman polynomial in the light of string theory, one has to use the full HOMFLY skein of the annulus (see for example [19]).
We will indeed see that the natural link invariant to consider in an unoriented theory involves both the colored Kauffman polynomial and the colored HOMFLY polynomial for composite representations and for all possible orientations of the link components. Our conjecture generalizes the results of [31,44] for the U(N ) case, and it “explains” various aspects of the relationship between the HOMFLY and the Kauffman polynomials, like for example Rudolph’s theorem. It also predicts some new, simple relationships between the Kauffman and the HOMFLY polynomial of links. In terms of open topological string theory, this paper adds little to the results of [8]. The bulk of the paper is then devoted to a detailed statement and discussion of the conjecture in the language of knot theory. Sect. 2 introduces our notation and reviews the construction of the colored HOMFLY and Kauffman polynomials, as well as of their relations. In Sect. 3 we review the conjecture of [31,44] and we state the new conjecture for the colored Kauffman polynomial. Sect. 4 provides some nontrivial evidence for the conjecture by looking at particular knots and links, and it explains how some standard results relating the HOMFLY and the Kauffman polynomials follow easily from our conjecture. In Sect. 5 we sketch the string theory arguments that lead to the conjecture, building on [7,8,51]. Finally, Sect. 6 contains some conclusions and prospects for future work.
2. Colored HOMFLY and Kauffman Polynomials

In this section we introduce various tools from the theory of symmetric polynomials and we recall the construction of the colored Kauffman and HOMFLY polynomials, mainly to fix notations.

¹ Some of the proposals of [8] were reformulated and recently proved in [10].
2.1. Basic ingredients from representation theory. Let $R$ be an irreducible representation of the symmetric group $S_\ell$. We will represent it by a Young diagram or partition,
\[ R = \{l_i\}_{i=1,\ldots,r(R)}, \qquad l_1 \ge l_2 \ge \cdots \ge l_{r(R)}, \qquad (2.1) \]
where $l_i$ is the number of boxes in the $i$-th row of the diagram and $r(R)$ is the total number of rows. Important quantities associated to the diagram are its total number of boxes,
\[ \ell(R) = \sum_{i=1}^{r(R)} l_i, \qquad (2.2) \]
which equals $\ell$, as well as the quantity
\[ \kappa_R = \sum_{i=1}^{r(R)} l_i (l_i - 2i + 1). \qquad (2.3) \]
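The quantities (2.2) and (2.3) are easy to evaluate by hand, but a few lines of code make the conventions unambiguous. The following is a minimal sketch in which a diagram is a tuple of row lengths; the function names `num_boxes`, `kappa` and `transpose` are illustrative and not notation from the text.

```python
def num_boxes(R):
    """Total number of boxes ell(R) of a Young diagram, eq. (2.2)."""
    return sum(R)

def kappa(R):
    """kappa_R = sum_i l_i (l_i - 2 i + 1), eq. (2.3); rows are numbered from i = 1."""
    return sum(l * (l - 2 * i + 1) for i, l in enumerate(R, start=1))

def transpose(R):
    """Conjugate diagram (rows and columns exchanged), used only for a sanity check."""
    return tuple(sum(1 for l in R if l > j) for j in range(R[0])) if R else ()

if __name__ == "__main__":
    for R in [(1,), (2,), (1, 1), (2, 1), (3, 1, 1)]:
        print(R, num_boxes(R), kappa(R), kappa(transpose(R)))
```

The last column illustrates the standard fact that $\kappa$ changes sign under transposition of the diagram, so it vanishes on self-conjugate diagrams such as (2, 1).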
The ring of symmetric polynomials in an infinite number of variables $\{v_i\}_{i \ge 1}$ will be denoted by $\Lambda$. It can be easily constructed as a direct limit of the ring of symmetric polynomials with a finite number of variables, see for example [38] for the details. It has a basis given by the Schur polynomials $s_R(v)$, which are labelled by Young diagrams. The multiplication rule for these polynomials is encoded in the Littlewood–Richardson coefficients
\[ s_{R_1}(v)\, s_{R_2}(v) = \sum_R N^{R}_{R_1 R_2}\, s_R(v). \qquad (2.4) \]
The identity of this ring is the Schur polynomial associated to the empty diagram, which we will denote by $R = \cdot$. We will also need the $n$-th Adams operation
\[ \psi_n(s_R(v)) = s_R(v^n) = s_R(v_1^n, v_2^n, \ldots). \qquad (2.5) \]
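For orientation, the simplest instances of (2.4) and (2.5) can be written out explicitly. These are standard identities for Schur polynomials, included only as a check on conventions; $p_2$ denotes the second power sum, a symbol not otherwise used in the text.

```latex
\[
  s_{\square}(v)\, s_{\square}(v) = s_{(2)}(v) + s_{(1,1)}(v),
  \qquad
  \psi_2\bigl(s_{\square}(v)\bigr) = s_{\square}(v^2)
    = \sum_{i \ge 1} v_i^2 = p_2(v) = s_{(2)}(v) - s_{(1,1)}(v).
\]
```

The first identity says that the only nonzero Littlewood–Richardson coefficients $N^{R}_{\square\square}$ are those with $R = (2)$ and $R = (1,1)$, both equal to one.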
One can use elementary representation theory of the symmetric group to express s R (v n ) as a linear combination of Schur polynomials labelled by representations U with n ·(R) boxes U s R (v n ) = cn;R sU (v). (2.6) U
Let $\chi_R$ be the character of the symmetric group associated to the diagram $R$. Let $C_\mu$ be the conjugacy class associated to the partition $\mu$, and let $|C_\mu|$ be the number of elements in the conjugacy class. It is easy to show that the coefficients $c^U_{n;R}$ are given by [36]
$$c^U_{n;R} = \sum_\mu \frac{1}{z_\mu}\, \chi_R(C_\mu)\, \chi_U(C_{n\mu}), \tag{2.7}$$
where $n\mu = (n\mu_1, n\mu_2, \cdots)$ and
$$z_\mu = \frac{\ell(\mu)!}{|C_\mu|}. \tag{2.8}$$
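The factor $z_\mu$ in (2.8) can be computed without enumerating the conjugacy class. The sketch below assumes the standard closed form $z_\mu = \prod_i i^{m_i} m_i!$, where $m_i$ is the number of parts of $\mu$ equal to $i$; this formula is not stated in the text, and the helper names are ours. Solving (2.8) for $|C_\mu|$ then gives class sizes that must add up to the order of the symmetric group.

```python
from collections import Counter
from fractions import Fraction
from math import factorial


def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def z_mu(mu):
    """z_mu = prod_i i^{m_i} m_i!, with m_i the multiplicity of the part i."""
    z = 1
    for part, mult in Counter(mu).items():
        z *= part ** mult * factorial(mult)
    return z


def class_size(mu):
    """|C_mu| obtained by solving eq. (2.8) for the class size."""
    return factorial(sum(mu)) // z_mu(mu)


if __name__ == "__main__":
    n = 4
    for mu in partitions(n):
        print(mu, z_mu(mu), class_size(mu))
    # The class sizes exhaust the group, so the weights 1/z_mu sum to one.
    print(sum(Fraction(1, z_mu(mu)) for mu in partitions(n)))
```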
Fig. 1. A composite representation made out of the diagrams R and S
We will also regard a Young diagram $R$ as an irreducible representation of U($N$). The quadratic Casimir of $R$ is then given by
$$C_R = \kappa_R + N \ell(R). \tag{2.9}$$
The most general irreducible representation of U($N$) is a composite representation (see for example [25] for a collection of useful results on composite representations). Composite representations are labelled by a pair of Young diagrams
$$(R, S). \tag{2.10}$$
This representation is usually depicted as in the left-hand side of Fig. 1, where the second representation $S$ is drawn upside down at the bottom of the diagram. When regarded as a representation of SU($N$), the composite representation corresponds to the diagram depicted on the right-hand side of Fig. 1, and it has in total
$$N \mu_1 + \ell(R) - \ell(S) \tag{2.11}$$
boxes, where $\mu_1$ is the number of boxes in the first row of $S$. For example, the composite representation $(\square, \square)$ is the adjoint representation of SU($N$). It is easy to show that [15]
$$C_{(R,S)} = C_R + C_S. \tag{2.12}$$
The composite representation can be understood as the tensor product $R \otimes \bar S$, where $\bar S$ is the conjugate representation to $S$, plus a series of “lower order corrections” involving tensor products of smaller representations. The precise formula is [25]
$$(R, S) = \sum_{U,V,W} (-1)^{\ell(U)}\, N^R_{UV}\, N^S_{U^T W}\, (V \otimes \bar W). \tag{2.13}$$
2.2. The colored HOMFLY polynomial. The HOMFLY polynomial of an oriented link L, PL (t, ν), can be defined by using a planar projection of L. This gives an oriented diagram in the plane which will be denoted as DL . The skein of the plane is the set of linear combinations of these diagrams, modulo the skein relations
(2.14)
Using the skein relations, the diagram of a link $D_L$ can be seen to be proportional to the trivial diagram $\bigcirc$. The proportionality factor $\langle D_L \rangle$ is a scalar and gives a regular isotopy invariant (i.e. a quantity which is invariant under the Reidemeister moves II and III, but not under move I). A true ambient isotopy invariant is obtained by defining
$$P_L(t, \nu) = \nu^{-\bar w(D_L)} \langle D_L \rangle, \tag{2.15}$$
where $\bar w(D_L)$ is the self-writhe of the link diagram $D_L$ (see for example [33], p. 173). This is defined as the sum of the signs of the crossings at which link components cross themselves, and not other components. It differs from the standard writhe $w(D_L)$ by twice the total linking number of the link, ${\rm lk}(\mathcal L)$. Notice that the standard HOMFLY polynomial is usually defined by using the total writhe in (2.15) [33]. Therefore, the HOMFLY polynomial, as defined in (2.15), is given by the standard HOMFLY polynomial times a factor
$$\nu^{2\,{\rm lk}(\mathcal L)}. \tag{2.16}$$
As discussed in detail in [31], this is the natural version of the HOMFLY polynomial from the string theory point of view. The HOMFLY invariant of the link is then defined by
$$H(\mathcal L) = P_L(t, \nu)\, H(\bigcirc), \tag{2.17}$$
and we choose the normalization
$$H(\bigcirc) = \frac{\nu - \nu^{-1}}{t - t^{-1}}. \tag{2.18}$$
From the skein theory point of view, the colored HOMFLY invariant of a link is obtained by considering satellites of the knot. Let $\mathcal K$ be a framed knot, and let $P$ be a knot diagram in the annulus. Around $\mathcal K$ there is a framing annulus, or equivalently a parallel of $\mathcal K$. The satellite knot
$$\mathcal K \star P \tag{2.19}$$
is obtained by replacing the framing annulus around $\mathcal K$ by $P$, or equivalently by mapping $P$ to $S^3$ using the parallel of $\mathcal K$. Here, $\mathcal K$ is called the companion knot while $P$ is called the pattern. In Fig. 2 we show a satellite where the companion $\mathcal K$ is the trefoil knot. Since the diagrams in the annulus form a vector space (called the skein of the annulus) we can obtain the most general satellite of a knot by considering a basis of this vector space. There is a very convenient basis constructed in [19] whose elements $P_{(R,S)}$ are labelled by pairs of Young diagrams. Given a knot $\mathcal K$, the HOMFLY invariant colored by the partitions $(R, S)$ is simply
$$H_{(R,S)}(\mathcal K) = H(\mathcal K \star P_{(R,S)}). \tag{2.20}$$
If we have a link $\mathcal L$ with $L$ components $\mathcal K_1, \dots, \mathcal K_L$, one can color each component independently, and one obtains an invariant of the form
$$H_{(R_1,S_1),\dots,(R_L,S_L)}(\mathcal L). \tag{2.21}$$
The colored HOMFLY invariant has various important properties which will be needed in the following:
Fig. 2. An example of a satellite knot. The companion knot $\mathcal K$ is the trefoil knot, and below we show the framing annulus. If we replace this framing annulus by the pattern $P$, we obtain the satellite $\mathcal K \star P$
1. The pattern $P_{(S,R)}$ is equal to the pattern $P_{(R,S)}$ with its orientation reversed. In particular, coloring with $(\square, \cdot)$ gives the original knot $\mathcal K$, while coloring with $(\cdot, \square)$ gives the knot $\bar{\mathcal K}$ with the opposite orientation. Since the HOMFLY invariant of a knot is invariant under reversal of orientation, we have that
$$H_{(S,R)}(\mathcal K) = H_{(R,S)}(\bar{\mathcal K}) = H_{(R,S)}(\mathcal K). \tag{2.22}$$
However, the HOMFLY invariant of a link is only invariant under a global reversal of orientation, therefore in general one has that
$$H_{(R_1,S_1),\dots,(S_j,R_j),\dots,(R_L,S_L)}(\mathcal L) = H_{(R_1,S_1),\dots,(R_j,S_j),\dots,(R_L,S_L)}(\bar{\mathcal L}_j), \tag{2.23}$$
where $\bar{\mathcal L}_j$ is the link obtained from the link $\mathcal L$ by reversing the orientation of the $j$-th component, see for example Fig. 3.
2. If one of the patterns is empty, say $S = \cdot$, the skein theory is simpler and it has been developed in for example [3,4]. In this case, the HOMFLY invariant of the knot $\mathcal K$ (which we denote by $H_R(\mathcal K)$) is equal to the invariant of $\mathcal K$ obtained from the quantum group $U_q({\rm sl}(N, \mathbb C))$ in the representation $R$, with the identification
$$t = q^{1/2}, \qquad \nu = t^N. \tag{2.24}$$
In particular we have
$$H_R(\bigcirc) = \dim_q R, \tag{2.25}$$
where $\dim_q R$ is the quantum dimension of $R$.
Fig. 3. Changing (R, S) into (S, R) reverses the orientation of a knot. If the knot is a component of a link, this leads in general to different HOMFLY invariants
Fig. 4. Examples of patterns for various representations. The patterns are written as formal combinations of braids, and after closing them we find elements in the skein of the annulus. In the last example, 1 refers to the empty diagram
3. For a general pattern labeled by two representations $(R, S)$, the HOMFLY invariant of the knot $\mathcal K$, $H_{(R,S)}(\mathcal K)$, equals the invariant of $\mathcal K$ obtained from the quantum group $U_q({\rm gl}(N, \mathbb C))$ in the composite representation $(R, S)$. In particular [19]
$$H_{(R,S)}(\bigcirc) = \dim_q (R, S) = \sum_{U,V,W} (-1)^{\ell(U)}\, N^R_{UV}\, N^S_{U^T W}\, \dim_q V \, \dim_q W. \tag{2.26}$$
Here, $U^T$ is the transposed Young diagram. The second equality follows from (2.13). In Fig. 4 we show some examples of patterns associated to different representations. The patterns are represented as elements in the braid group, which can be closed by joining the endpoints to produce patterns in the annulus. In the following we will denote
$$z = t - t^{-1}. \tag{2.27}$$
Remark 2.1. In this paper, the above skein rules will be used to compute the values of the HOMFLY and Kauffman invariants in the standard framing. Starting from this framing, a change of framing by $f$ units is done through [40]
$$H_R(\mathcal K) \to (-1)^{f \ell(R)}\, t^{f \kappa_R}\, H_R(\mathcal K). \tag{2.28}$$
This is the rule that preserves the integrality properties of the invariants that will be discussed below. The framing of links is done in a similar way, with one framing factor like (2.28) for each component.
Example 2.2. The HOMFLY polynomial of the trefoil knot is, in our conventions,
$$P_{3_1}(t, \nu) = 2\nu^2 - \nu^4 + z^2 \nu^2, \tag{2.29}$$
while the HOMFLY polynomial of the Hopf link is
$$P_{2^2_1}(t, \nu) = (\nu - \nu^{-1})\, z^{-1} + \nu z. \tag{2.30}$$
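These two polynomials can be sanity-checked at special values of $\nu = t^N$. The sketch below uses exact rational arithmetic and assumes (i) the polynomials $2\nu^2 - \nu^4 + z^2\nu^2$ and $(\nu-\nu^{-1})z^{-1} + \nu z$, (ii) that at $N = 2$ the trefoil value should reduce to the Jones polynomial $q + q^3 - q^4$ with $q = t^2$ (one common normalization and chirality, not fixed by the text), and (iii) that at $N = 1$ a knot gives 1 while the Hopf link keeps only the factor (2.16) with linking number one. The function names are ours.

```python
from fractions import Fraction


def homfly_trefoil(t, nu):
    """P_{3_1}(t, nu) = 2 nu^2 - nu^4 + z^2 nu^2 with z = t - 1/t."""
    z = t - 1 / t
    return 2 * nu**2 - nu**4 + z**2 * nu**2


def homfly_hopf(t, nu):
    """P_{2^2_1}(t, nu) = (nu - 1/nu)/z + nu z with z = t - 1/t."""
    z = t - 1 / t
    return (nu - 1 / nu) / z + nu * z


def jones_trefoil(q):
    """Jones polynomial of the trefoil in one common convention (assumed)."""
    return q + q**3 - q**4


if __name__ == "__main__":
    for t in (Fraction(3, 2), Fraction(5, 7), Fraction(-4, 3)):
        assert homfly_trefoil(t, t**2) == jones_trefoil(t**2)  # N = 2
        assert homfly_trefoil(t, t) == 1                        # N = 1
        assert homfly_hopf(t, t) == t**2                        # N = 1, lk = 1
    print("specializations agree")
```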
2.3. The colored Kauffman polynomial. The Kauffman polynomial is also defined by a skein theory [23], but the diagrams correspond now to planar projections of unoriented knots and links. The skein relations are (2.31)
This is sometimes called the “Dubrovnik” version of the Kauffman invariant. As in the case of HOMFLY, the diagram of an unoriented link $\mathcal L$, which will be denoted by $E_L$, is proportional to the trivial diagram $\bigcirc$, and the proportionality factor $\langle E_L \rangle$ is a regular isotopy invariant. The Kauffman polynomial is defined as
$$F_L(t, \nu) = \nu^{-\bar w(E_L)} \langle E_L \rangle. \tag{2.32}$$
Like before, this differs from the standard Kauffman polynomial (as defined for example in [33]) by an overall factor (2.16). More importantly, the use of the self-writhe guarantees that the resulting polynomial is still an invariant of unoriented links. The Kauffman invariant of the link will be defined as
$$G(\mathcal L) = F_L(t, \nu)\, G(\bigcirc), \tag{2.33}$$
and we choose the normalization
$$G(\bigcirc) = 1 + \frac{\nu - \nu^{-1}}{t - t^{-1}}. \tag{2.34}$$
The colored Kauffman polynomial is obtained, similarly to the HOMFLY case, by considering the Kauffman skein of the annulus and by forming satellites with elements of this skein taken as patterns. There is again a basis $y_R$ labelled by Young tableaux [5], and we define
$$G_R(\mathcal K) = G(\mathcal K \star y_R). \tag{2.35}$$
For a link of $L$ components, we can color each component independently, and one obtains in this way the colored Kauffman invariant of the link
$$G_{R_1,\dots,R_L}(t, \nu). \tag{2.36}$$
The invariant defined in this way equals the invariant obtained from the quantum group $U_q({\rm so}(N, \mathbb C))$ in the representation $R$, after identifying
$$t = q^{1/2}, \qquad \nu = q^{N-1}. \tag{2.37}$$
In particular, for the unknot the invariant is equal to the quantum dimension of $R$,
$$G_R(\bigcirc) = \dim_q^{SO(N)} R. \tag{2.38}$$
The results for the Kauffman invariants of knots and links will be presented in the standard framing. The change of framing is also done with the rule (2.28).
Example 2.3. The Kauffman polynomial of the trefoil knot is, in our conventions,
$$F_{3_1}(t, \nu) = 2\nu^2 - \nu^4 + z(-\nu^3 + \nu^5) + z^2(\nu^2 - \nu^4), \tag{2.39}$$
while that of the Hopf link is
$$F_{2^2_1}(t, \nu) = z^{-1}\left(\nu - \nu^{-1} + z + z^2(\nu - \nu^{-1})\right). \tag{2.40}$$
2.4. Relationships between the HOMFLY and the Kauffman invariants. As we mentioned in the Introduction, the colored HOMFLY and Kauffman invariants of a link are not unrelated. The simplest relation concerns the invariants of a link $\mathcal L$ in which all components have the coloring $R = \square$, i.e. the original HOMFLY and Kauffman polynomials. It is easy to show that these polynomials have the structure
$$P_L(z, \nu) = z^{1-L} \sum_{i\ge 0} p_i(\nu)\, z^{2i}, \qquad F_L(z, \nu) = z^{1-L} \sum_{i\ge 0} k_i(\nu)\, z^{i}. \tag{2.41}$$
It turns out that (see for example [33], Prop. 16.9)
$$p_0(\nu) = k_0(\nu). \tag{2.42}$$
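The relation (2.42) can be checked on the trefoil and the Hopf link. In the sketch below the HOMFLY and Kauffman polynomials of Examples 2.2 and 2.3 are entered by hand as dictionaries mapping (power of $z$, power of $\nu$) to an integer coefficient; this encoding and the helper name are our own choices, not part of the text.

```python
# Laurent polynomials in (z, nu), stored as {(power of z, power of nu): coefficient}.
P_TREFOIL = {(0, 2): 2, (0, 4): -1, (2, 2): 1}
F_TREFOIL = {(0, 2): 2, (0, 4): -1, (1, 3): -1, (1, 5): 1, (2, 2): 1, (2, 4): -1}
P_HOPF = {(-1, 1): 1, (-1, -1): -1, (1, 1): 1}
F_HOPF = {(-1, 1): 1, (-1, -1): -1, (0, 0): 1, (1, 1): 1, (1, -1): -1}


def lowest_coefficient(poly, n_components):
    """Coefficient of z^(1-L) in the expansion (2.41), as {power of nu: coefficient}."""
    target = 1 - n_components
    # (2.41) says that no power of z below 1 - L may appear.
    assert min(z_pow for z_pow, _ in poly) >= target
    return {nu_pow: c for (z_pow, nu_pow), c in poly.items() if z_pow == target and c != 0}


if __name__ == "__main__":
    print(lowest_coefficient(P_TREFOIL, 1), lowest_coefficient(F_TREFOIL, 1))
    print(lowest_coefficient(P_HOPF, 2), lowest_coefficient(F_HOPF, 2))
```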
In general, the Kauffman polynomial contains many more terms than the HOMFLY polynomial. In particular, as (2.41) shows, it contains both even and odd powers of $z$, while the HOMFLY polynomial only contains even powers. In the case of torus knots the HOMFLY polynomial can even be obtained from the Kauffman polynomial by the formula [32]
$$H(z, \nu) = \frac{1}{2}\left(G(z, \nu) - G(-z, \nu)\right). \tag{2.43}$$
There are also highly nontrivial relations between the two invariants when we consider colorings. An intriguing theorem of Rudolph [49] states the following. Let $\mathcal L$ be an unoriented link with $L$ components. Pick an arbitrary orientation of $\mathcal L$ and consider its HOMFLY invariant
$$H_{(\square,\square),\dots,(\square,\square)}(\mathcal L). \tag{2.44}$$
Due to (2.23), this invariant does not depend on the choice of orientation in $\mathcal L$, and it is therefore an invariant of the unoriented link. One can show that (2.44) is an element in $\mathbb Z[z^{\pm 1}, \nu^{\pm 1}]$ (see for example [41]). The square of the Kauffman invariant of $\mathcal L$, $G^2(\mathcal L)$, belongs to the same ring. By reducing the coefficients of these polynomials modulo 2, we obtain two polynomials in $\mathbb Z_2[z^{\pm 1}, \nu^{\pm 1}]$. Rudolph’s theorem states that these reduced polynomials are the same. In other words,
$$G^2(\mathcal L) \equiv H_{(\square,\square),\dots,(\square,\square)}(\mathcal L) \mod 2, \tag{2.45}$$
see [50] for this statement of Rudolph’s theorem. Morton and Ryder have recently extended this result to more general colorings [41,43]. This generalization requires more care since now the invariants have denominators involving products of $t^r - t^{-r}$, $r \in \mathbb Z_{>0}$. However, one can still make sense of the reduction modulo 2, and one obtains that, for any unoriented link $\mathcal L$,
$$G^2_{R_1,\dots,R_L}(\mathcal L) \equiv H_{(R_1,R_1),\dots,(R_L,R_L)}(\mathcal L) \mod 2. \tag{2.46}$$
3. The Conjecture
3.1. Review of the conjecture for the colored HOMFLY invariant. We start by recalling the conjecture of [30,31,44] on the integrality structure of the colored HOMFLY polynomial. We first state the conjecture for knots, and then we briefly consider the generalization to links. Notice that these conjectures have now been proved in [37]. Let $\mathcal K$ be a knot, and let $H_R(\mathcal K)$ be its colored HOMFLY invariant with the coloring $R$. We first define the generating functional
$$Z_H(v) = \sum_R H_R(\mathcal K)\, s_R(v), \tag{3.1}$$
understood as a formal power series in $s_R(v)$. Here we sum over all possible colorings, including the empty one $R = \cdot$. We also define the free energy
$$F_H(v) = \log Z_H(v), \tag{3.2}$$
which is also a formal power series. The reformulated HOMFLY invariants of $\mathcal K$, $f_R(t, \nu)$, are defined through the equation
$$F_H(v) = \sum_{d=1}^{\infty} \sum_R \frac{1}{d}\, f_R(t^d, \nu^d)\, s_R(v^d). \tag{3.3}$$
One can easily prove [30] that this equation determines uniquely the reformulated HOMFLY invariants $f_R$ in terms of the colored HOMFLY invariants of $\mathcal K$. Explicit formulae for $f_R$ in terms of $H_R$ for representations with up to three boxes are listed in [30].
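To make (3.3) concrete, one can expand both sides up to two boxes. The following is only a sketch of that computation. It assumes that the empty coloring contributes $H_\cdot = 1$, it uses $s_\square(v)^2 = s_{(2)}(v) + s_{(1,1)}(v)$ and $s_\square(v^2) = s_{(2)}(v) - s_{(1,1)}(v)$, and the labels $(2)$, $(1,1)$ for the two-box diagrams are our notation.

```latex
\begin{align*}
\log Z_H(v) &= H_\square\, s_\square
  + \Big(H_{(2)} - \tfrac12 H_\square^2\Big) s_{(2)}
  + \Big(H_{(1,1)} - \tfrac12 H_\square^2\Big) s_{(1,1)} + \cdots, \\
\sum_{d,R} \tfrac1d f_R(t^d,\nu^d)\, s_R(v^d) &= f_\square\, s_\square
  + \Big(f_{(2)} + \tfrac12 f_\square(t^2,\nu^2)\Big) s_{(2)}
  + \Big(f_{(1,1)} - \tfrac12 f_\square(t^2,\nu^2)\Big) s_{(1,1)} + \cdots, \\
f_\square(t,\nu) &= H_\square(t,\nu), \\
f_{(2)}(t,\nu) &= H_{(2)}(t,\nu) - \tfrac12 H_\square(t,\nu)^2 - \tfrac12 H_\square(t^2,\nu^2), \\
f_{(1,1)}(t,\nu) &= H_{(1,1)}(t,\nu) - \tfrac12 H_\square(t,\nu)^2 + \tfrac12 H_\square(t^2,\nu^2).
\end{align*}
```

The $d = 2$ term of (3.3) is what produces the contributions evaluated at $(t^2, \nu^2)$.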
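If $\ell(R) = \ell(S)$ we define the matrix
$$M_{RS} = \sum_\mu \frac{1}{z_\mu}\, \chi_R(C_\mu)\, \chi_S(C_\mu)\, \frac{\prod_{i=1}^{\ell(\mu)} \left(t^{\mu_i} - t^{-\mu_i}\right)}{t - t^{-1}}, \tag{3.4}$$
where the product runs over the nonzero parts $\mu_i$ of $\mu$ (so that in the upper limit $\ell(\mu)$ stands for the number of parts); if $\ell(R) \neq \ell(S)$ the matrix element is zero. It is easy to show that this matrix is invertible (see for example [31,30]). We now define
$$\hat f_R(t, \nu) = \sum_S M^{-1}_{RS}\, f_S(t, \nu). \tag{3.5}$$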
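For orientation, the two-box block of this matrix can be written out explicitly. The sketch below assumes that (3.4) has a single factor $t - t^{-1}$ in the denominator, uses the character table of $S_2$ (both classes $\mu = (1,1)$ and $\mu = (2)$ have $z_\mu = 2$), and orders the basis as $(2)$, $(1,1)$; the shorthand $[2] = t + t^{-1}$ is ours.

```latex
\begin{gather*}
M = \frac12
\begin{pmatrix}
 z + [2] & z - [2] \\
 z - [2] & z + [2]
\end{pmatrix},
\qquad \det M = z\,[2], \\
M^{-1} = \frac{1}{2 z [2]}
\begin{pmatrix}
 z + [2] & [2] - z \\
 [2] - z & z + [2]
\end{pmatrix}.
\end{gather*}
```

In this block the determinant is nonzero as a function of $t$, in line with the invertibility statement, and $M^{-1}$ introduces a single pole in $z$.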
In principle, $\hat f_R(t, \nu)$ are rational functions, i.e. they belong to the ring $\mathbb Q[t^{\pm 1}, \nu^{\pm 1}]$ with denominators given by products of $t^r - t^{-r}$. However, we have the following
Conjecture 3.1. $\hat f_R(t, \nu) \in z^{-1} \mathbb Z[z^2, \nu^{\pm 1}]$, i.e. they have the structure
$$\hat f_R(t, \nu) = z^{-1} \sum_{g\ge 0} \sum_{Q\in\mathbb Z} N_{R;g,Q}\, z^{2g} \nu^{Q}, \tag{3.6}$$
where $N_{R;g,Q}$ are integers, called the BPS invariants of the knot $\mathcal K$. The sum appearing here is finite, i.e. for a given knot and a given coloring $R$, the $N_{R;g,Q}$ vanish except for finitely many values of $g$, $Q$. The conjecture can be generalized to links. Let $\mathcal L$ be a link of $L$ components $\mathcal K_1, \dots, \mathcal K_L$, and let $v_l$, $l = 1, \dots, L$, be formal sets of infinitely many variables. The subindex $l$ refers here to the $l$-th component of the link, and each $v_l$ has the form $v_l = ((v_l)_1, (v_l)_2, \cdots)$. We define
$$Z_H(v_1, \dots, v_L) = \sum_{R_1,\dots,R_L} H_{R_1,\dots,R_L}(\mathcal L)\, s_{R_1}(v_1) \cdots s_{R_L}(v_L) \tag{3.7}$$
as well as the free energy
$$F_H(v_1, \dots, v_L) = \log Z_H(v_1, \dots, v_L). \tag{3.8}$$
If any of the $R_i$ is given by the trivial coloring $R_i = \cdot$, it is understood that $H_{R_1,\dots,R_L}(\mathcal L)$ is the HOMFLY invariant of the sublink of $\mathcal L$ obtained after removing the corresponding $\mathcal K_i$. The reformulated invariants are now defined by
$$F_H(v_1, \dots, v_L) = \sum_{d=1}^{\infty} \sum_{R_1,\dots,R_L} \frac{1}{d}\, f_{R_1,\dots,R_L}(t^d, \nu^d)\, s_{R_1}(v_1^d) \cdots s_{R_L}(v_L^d), \tag{3.9}$$
which reduces to (3.3) for $L = 1$, and
$$\hat f_{R_1,\dots,R_L}(t, \nu) = \sum_{S_1,\dots,S_L} M^{-1}_{R_1 S_1} \cdots M^{-1}_{R_L S_L}\, f_{S_1,\dots,S_L}(t, \nu). \tag{3.10}$$
Remark 3.1. Notice that, for the fundamental representation,
$$\hat f_{\square,\dots,\square}(t, \nu) = f_{\square,\dots,\square}(t, \nu). \tag{3.11}$$
We can now state the conjecture for links.
Conjecture 3.2. $\hat f_{R_1,\dots,R_L}(t, \nu) \in z^{L-2} \mathbb Z[z^2, \nu^{\pm 1}]$, i.e. they have the structure
$$\hat f_{R_1,\dots,R_L}(t, \nu) = z^{L-2} \sum_{g\ge 0} \sum_{Q\in\mathbb Z} N_{R_1,\dots,R_L;g,Q}\, z^{2g} \nu^{Q}. \tag{3.12}$$
These conjectures also hold for framed knots and links [40].
3.2. The conjecture for the colored Kauffman invariant. We will first state the conjecture for knots. Let $\mathcal K$ be an oriented knot, and let $H_{(R,S)}$ be its HOMFLY invariant in the composite representation $(R, S)$. The composite invariant of the knot $\mathcal K$, colored by the representation $R$, and denoted by $\mathcal R_R$, is given by
$$\mathcal R_R(\mathcal K) = \sum_{R_1,R_2} N^R_{R_1 R_2}\, H_{(R_1,R_2)}(\mathcal K), \tag{3.13}$$
where $N^R_{R_1 R_2}$ are the Littlewood–Richardson coefficients defined by (2.4). Notice that, due to (2.22), this invariant is independent of the choice of orientation of the knot, and it is therefore an invariant of unoriented knots.
Example 3.2. We give some simple examples of the composite invariant for colorings with up to two boxes, labelling the two-box diagrams by the partitions $(2)$ and $(1,1)$:
$$\mathcal R_{\square} = 2H_{\square}, \qquad \mathcal R_{(2)} = 2H_{(2)} + H_{(\square,\square)}, \qquad \mathcal R_{(1,1)} = 2H_{(1,1)} + H_{(\square,\square)}. \tag{3.14}$$
Using these invariants we define the generating functionals
$$Z_{\mathcal R}(v) = \sum_R \mathcal R_R(\mathcal K)\, s_R(v), \qquad F_{\mathcal R}(v) = \log Z_{\mathcal R}(v). \tag{3.15}$$
This allows us to define two sets of reformulated invariants, $h_R$ and $g_R$, as follows. The $h_R$ are defined by a relation identical to (3.3),
$$F_{\mathcal R}(v) = \sum_{d=1}^{\infty} \sum_R \frac{1}{d}\, h_R(t^d, \nu^d)\, s_R(v^d), \tag{3.17}$$
while the $g_R$ are defined by
$$F_G(v) - \frac{1}{2} F_{\mathcal R}(v) = \sum_{d\ {\rm odd}} \sum_R \frac{1}{d}\, g_R(t^d, \nu^d)\, s_R(v^d). \tag{3.18}$$
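Here the sum over $d$ is over all positive odd integers. The $h_R$ can be explicitly obtained in terms of colored HOMFLY invariants for composite representations, while the $g_R$ can be written in terms of these invariants and the colored Kauffman invariants.
Example 3.3. We list here explicit expressions for the reformulated invariants $g_R$ of a knot, where $R$ is a representation of up to three boxes, labelled by the corresponding partition. We have
$$\begin{aligned}
g_{\square} &= G_{\square} - H_{\square}, \\
g_{(2)} &= G_{(2)} - \frac{1}{2} G_{\square}^2 - H_{(2)} + H_{\square}^2 - \frac{1}{2} H_{(\square,\square)}, \\
g_{(1,1)} &= G_{(1,1)} - \frac{1}{2} G_{\square}^2 - H_{(1,1)} + H_{\square}^2 - \frac{1}{2} H_{(\square,\square)},
\end{aligned} \tag{3.19}$$
as well as
$$\begin{aligned}
g_{(3)} &= G_{(3)} - G_{\square} G_{(2)} + \frac{1}{3} G_{\square}^3 - \frac{1}{3} g_{\square}(t^3, \nu^3) \\
&\quad - H_{(3)} - H_{((2),\square)} + 2 H_{\square} H_{(2)} + H_{(\square,\square)} H_{\square} - \frac{4}{3} H_{\square}^3, \\
g_{(2,1)} &= G_{(2,1)} - G_{\square} G_{(2)} - G_{\square} G_{(1,1)} + \frac{2}{3} G_{\square}^3 + \frac{1}{3} g_{\square}(t^3, \nu^3) \\
&\quad - H_{(2,1)} - H_{((2),\square)} - H_{((1,1),\square)} + 2 H_{\square} H_{(2)} + 2 H_{\square} H_{(1,1)} + 2 H_{(\square,\square)} H_{\square} - \frac{8}{3} H_{\square}^3, \\
g_{(1,1,1)} &= G_{(1,1,1)} - G_{\square} G_{(1,1)} + \frac{1}{3} G_{\square}^3 - \frac{1}{3} g_{\square}(t^3, \nu^3) \\
&\quad - H_{(1,1,1)} - H_{((1,1),\square)} + 2 H_{\square} H_{(1,1)} + H_{(\square,\square)} H_{\square} - \frac{4}{3} H_{\square}^3.
\end{aligned} \tag{3.20}$$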
The invariants $\hat h_R$, $\hat g_R$ are defined by a relation identical to (3.5),
$$\hat h_R(t, \nu) = \sum_S M^{-1}_{RS}\, h_S(t, \nu), \qquad \hat g_R(t, \nu) = \sum_S M^{-1}_{RS}\, g_S(t, \nu). \tag{3.21}$$
Like before, $\hat h_R(t, \nu)$ and $\hat g_R(t, \nu)$ belong in principle to the ring $\mathbb Q[t^{\pm 1}, \nu^{\pm 1}]$ with denominators given by products of $t^r - t^{-r}$. The conjecture for the colored Kauffman polynomial states an integrality property similar to the one we stated for the colored HOMFLY invariant.
Conjecture 3.3. We have that
$$\hat h_R(t, \nu) \in z^{-1} \mathbb Z[z^2, \nu^{\pm 1}], \qquad \hat g_R(t, \nu) \in \mathbb Z[z, \nu^{\pm 1}], \tag{3.22}$$
i.e. they have the structure
$$\begin{aligned}
\hat h_R(t, \nu) &= z^{-1} \sum_{g\ge 0} \sum_{Q\in\mathbb Z} N^{c=0}_{R;g,Q}\, z^{2g} \nu^{Q}, \\
\hat g_R(t, \nu) &= \sum_{g\ge 0} \sum_{Q\in\mathbb Z} \left( N^{c=1}_{R;g,Q}\, z^{2g} \nu^{Q} + N^{c=2}_{R;g,Q}\, z^{2g+1} \nu^{Q} \right),
\end{aligned} \tag{3.23}$$
where $N^{c=0,1,2}_{R;g,Q}$ are integers.
Again, there is a generalization to links as follows. Let $\mathcal L$ be an unoriented link, and pick an arbitrary orientation. We define the composite invariant of $\mathcal L$ as
$$\mathcal R_{R_1,\dots,R_L}(\mathcal L) = \sum_{U_1,V_1,\cdots,U_L,V_L} N^{R_1}_{U_1 V_1} \cdots N^{R_L}_{U_L V_L}\, H_{(U_1,V_1),\dots,(U_L,V_L)}(\mathcal L). \tag{3.24}$$
Due to (2.23), this invariant does not depend on the choice of orientation of $\mathcal L$, and it is therefore an invariant of unoriented links. We further define the generating functionals
$$\begin{aligned}
Z_{\mathcal R}(v_1, \dots, v_L) &= \sum_{R_1,\dots,R_L} \mathcal R_{R_1,\dots,R_L}(\mathcal L)\, s_{R_1}(v_1) \cdots s_{R_L}(v_L), \\
Z_G(v_1, \dots, v_L) &= \sum_{R_1,\dots,R_L} G_{R_1,\dots,R_L}(\mathcal L)\, s_{R_1}(v_1) \cdots s_{R_L}(v_L),
\end{aligned} \tag{3.25}$$
as well as the free energies
$$F_{\mathcal R}(v_1, \dots, v_L) = \log Z_{\mathcal R}(v_1, \dots, v_L), \qquad F_G(v_1, \dots, v_L) = \log Z_G(v_1, \dots, v_L). \tag{3.26}$$
The reformulated invariants $h_{R_1,\dots,R_L}$, $g_{R_1,\dots,R_L}$ are now defined by
$$F_{\mathcal R}(v_1, \dots, v_L) = \sum_{d=1}^{\infty} \sum_{R_1,\dots,R_L} \frac{1}{d}\, h_{R_1,\dots,R_L}(t^d, \nu^d)\, s_{R_1}(v_1^d) \cdots s_{R_L}(v_L^d) \tag{3.27}$$
and
$$F_G(v_1, \dots, v_L) - \frac{1}{2} F_{\mathcal R}(v_1, \dots, v_L) = \sum_{d\ {\rm odd}} \sum_{R_1,\dots,R_L} \frac{1}{d}\, g_{R_1,\dots,R_L}(t^d, \nu^d)\, s_{R_1}(v_1^d) \cdots s_{R_L}(v_L^d), \tag{3.28}$$
which reduce to (3.17) and (3.18) for $L = 1$. Finally, the “hatted” invariants are defined by the relation
$$\begin{aligned}
\hat h_{R_1,\dots,R_L}(t, \nu) &= \sum_{S_1,\dots,S_L} M^{-1}_{R_1 S_1} \cdots M^{-1}_{R_L S_L}\, h_{S_1,\dots,S_L}(t, \nu), \\
\hat g_{R_1,\dots,R_L}(t, \nu) &= \sum_{S_1,\dots,S_L} M^{-1}_{R_1 S_1} \cdots M^{-1}_{R_L S_L}\, g_{S_1,\dots,S_L}(t, \nu).
\end{aligned} \tag{3.29}$$
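Example 3.4. For links $\mathcal L$ of two components $\mathcal K_1$, $\mathcal K_2$ we have, labelling the two-box diagrams by the partitions $(2)$ and $(1,1)$,
$$\begin{aligned}
g_{\square,\square}(\mathcal L) &= G_{\square,\square}(\mathcal L) - G_{\square}(\mathcal K_1) G_{\square}(\mathcal K_2) - H_{\square,\square}(\mathcal L) - H_{\square,\square}(\bar{\mathcal L}) + 2 H_{\square}(\mathcal K_1) H_{\square}(\mathcal K_2), \\
g_{(2),\square}(\mathcal L) &= G_{(2),\square}(\mathcal L) - G_{\square,\square}(\mathcal L) G_{\square}(\mathcal K_1) - G_{(2)}(\mathcal K_1) G_{\square}(\mathcal K_2) + G_{\square}(\mathcal K_1)^2 G_{\square}(\mathcal K_2) \\
&\quad - H_{(2),\square}(\mathcal L) - H_{(2),\square}(\bar{\mathcal L}) - H_{(\square,\square),\square}(\mathcal L) + 2\left(H_{\square,\square}(\mathcal L) + H_{\square,\square}(\bar{\mathcal L})\right) H_{\square}(\mathcal K_1) \\
&\quad + 2 H_{(2)}(\mathcal K_1) H_{\square}(\mathcal K_2) + H_{(\square,\square)}(\mathcal K_1) H_{\square}(\mathcal K_2) - 4 H_{\square}(\mathcal K_1)^2 H_{\square}(\mathcal K_2), \\
g_{(1,1),\square}(\mathcal L) &= G_{(1,1),\square}(\mathcal L) - G_{\square,\square}(\mathcal L) G_{\square}(\mathcal K_1) - G_{(1,1)}(\mathcal K_1) G_{\square}(\mathcal K_2) + G_{\square}(\mathcal K_1)^2 G_{\square}(\mathcal K_2) \\
&\quad - H_{(1,1),\square}(\mathcal L) - H_{(1,1),\square}(\bar{\mathcal L}) - H_{(\square,\square),\square}(\mathcal L) + 2\left(H_{\square,\square}(\mathcal L) + H_{\square,\square}(\bar{\mathcal L})\right) H_{\square}(\mathcal K_1) \\
&\quad + 2 H_{(1,1)}(\mathcal K_1) H_{\square}(\mathcal K_2) + H_{(\square,\square)}(\mathcal K_1) H_{\square}(\mathcal K_2) - 4 H_{\square}(\mathcal K_1)^2 H_{\square}(\mathcal K_2).
\end{aligned} \tag{3.30}$$
In these equations, $\bar{\mathcal L}$ is the link obtained from $\mathcal L$ by inverting the orientation of one of its components.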
Fig. 5. The reformulated invariant of an unoriented link L, gˆ R1 ,...R L (L), involves the Kauffman invariant of L, together with the HOMFLY invariants of all possible choices of orientations for the components of the link. Here we illustrate it for the fundamental representation and for a two-component link
Example 3.5. For general links of $L$ components it is easy to write down a general formula for $g_{\square,\dots,\square}(\mathcal L)$. We first define the connected Kauffman invariant of a link $\mathcal L$ as the term multiplying $s_{\square}(v_1) \cdots s_{\square}(v_L)$ in the expansion of $F_G(v_1, \cdots, v_L)$. It is given by
$$G^{(c)}(\mathcal L) = G(\mathcal L) - \sum_{j=1}^{L} G(\mathcal K_j)\, G(\mathcal L_j) + \cdots, \tag{3.31}$$
where the link $\mathcal L_j$ is obtained from $\mathcal L$ by removing the $j$-th component. Further corrections involve all possible sublinks of $\mathcal L$, and the combinatorics appearing in the formula is the same one that appears in the calculation of the cumulants of a probability distribution. A similar definition gives the connected HOMFLY invariant of a link, $H^{(c)}(\mathcal L)$, which was studied in detail in [29,31]. We now consider all possible oriented links that can be obtained from an unoriented link $\mathcal L$ of $L$ components by choosing different orientations in their component knots. In principle there are $2^L$ oriented links that can be obtained in this way, but they can be grouped in pairs that differ in an overall reversal of orientation, and therefore lead to the same HOMFLY invariant. We conclude that there are $2^{L-1}$ different links which differ in the relative orientation of their components and have a priori different HOMFLY invariants. We will denote these links by $\mathcal L_\alpha$, where $\alpha = 1, \cdots, 2^{L-1}$. Using (2.23), it is easy to see that the composite invariant (3.24) for $R_1 = \cdots = R_L = \square$ involves the sum over all possible orientations of the link, and we have
$$\mathcal R_{\square,\dots,\square}(\mathcal L) = 2 \sum_{\alpha=1}^{2^{L-1}} H(\mathcal L_\alpha). \tag{3.32}$$
The reformulated invariant $g_{\square,\dots,\square}(\mathcal L)$ is then given by
$$g_{\square,\dots,\square}(\mathcal L) = G^{(c)}(\mathcal L) - \sum_{\alpha=1}^{2^{L-1}} H^{(c)}(\mathcal L_\alpha). \tag{3.33}$$
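The cumulant-like combinatorics behind the connected invariants can be sketched in a few lines. The code below assumes the standard joint-cumulant formula, a sum over set partitions $\pi$ of the components weighted by $(-1)^{|\pi|-1}(|\pi|-1)!$, which is our reading of the remark on cumulants rather than a formula quoted from the text. The function `invariant` is a hypothetical callable returning the (Kauffman or HOMFLY) invariant of the sublink made of the given components.

```python
from fractions import Fraction
from math import factorial


def set_partitions(items):
    """Yield all partitions of the list `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        # Put `first` in a block of its own ...
        yield [[first]] + partition
        # ... or add it to one of the existing blocks.
        for i, block in enumerate(partition):
            yield partition[:i] + [[first] + block] + partition[i + 1:]


def connected_invariant(components, invariant):
    """Sum over set partitions pi of (-1)^(|pi|-1) (|pi|-1)! prod_B invariant(B)."""
    total = 0
    for partition in set_partitions(list(components)):
        k = len(partition)
        term = (-1) ** (k - 1) * factorial(k - 1)
        for block in partition:
            term *= invariant(frozenset(block))
        total += term
    return total


if __name__ == "__main__":
    # Made-up numerical values for the sublinks of a three-component link.
    values = {
        frozenset(s): Fraction(v)
        for s, v in [((1,), 2), ((2,), 3), ((3,), 5),
                     ((1, 2), 7), ((1, 3), 11), ((2, 3), 13), ((1, 2, 3), 17)]
    }
    print(connected_invariant([1, 2, 3], values.__getitem__))
```

For two components this reduces to $G(\mathcal L) - G(\mathcal K_1) G(\mathcal K_2)$, and for three components it reproduces the first terms displayed in (3.31) plus the triple product $2\, G(\mathcal K_1) G(\mathcal K_2) G(\mathcal K_3)$. When the invariant factorizes over components, the connected part vanishes.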
In general, the reformulated invariant of an unoriented link $\mathcal L$, $\hat g_{R_1,\dots,R_L}(\mathcal L)$, involves colored Kauffman invariants of $\mathcal L$, together with colored HOMFLY invariants of all possible choices of orientations of the link. This is an important feature of the reformulated invariants, and we illustrate it graphically for a two-component link in Fig. 5. The fact that one has to consider all possible orientations of the unoriented link bears some resemblance to Jaeger’s model for the Kauffman polynomial in terms of the HOMFLY polynomial (see for example [24], pp. 219–222), and it has appeared before in the context of the Kauffman invariant in [46]. We can now state our conjecture for the Kauffman invariant of links.
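Conjecture 3.4. We have that
$$\hat h_{R_1,\dots,R_L}(t, \nu) \in z^{L-2} \mathbb Z[z^2, \nu^{\pm 1}], \qquad \hat g_{R_1,\dots,R_L}(t, \nu) \in z^{L-1} \mathbb Z[z, \nu^{\pm 1}], \tag{3.34}$$
i.e. they have the structure
$$\begin{aligned}
\hat h_{R_1,\dots,R_L}(t, \nu) &= z^{L-2} \sum_{g\ge 0} \sum_{Q\in\mathbb Z} N^{c=0}_{R_1,\dots,R_L;g,Q}\, z^{2g} \nu^{Q}, \\
\hat g_{R_1,\dots,R_L}(t, \nu) &= z^{L-1} \sum_{g\ge 0} \sum_{Q\in\mathbb Z} \left( N^{c=1}_{R_1,\dots,R_L;g,Q}\, z^{2g} \nu^{Q} + N^{c=2}_{R_1,\dots,R_L;g,Q}\, z^{2g+1} \nu^{Q} \right).
\end{aligned} \tag{3.35}$$
Remark 3.6. It follows from this conjecture that
$$h_{R_1,\dots,R_L} \in z^{L-2} \mathbb Z[t^{\pm 1}, \nu^{\pm 1}], \qquad g_{R_1,\dots,R_L} \in z^{L-1} \mathbb Z[t^{\pm 1}, \nu^{\pm 1}]. \tag{3.36}$$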
As in the colored HOMFLY case, the conjecture is supposed to hold as well for framed knots and links.
4. Evidence
4.1. Direct computations. In this section we provide some evidence for our conjectures concerning the colored Kauffman invariant of knots and links. The first type of evidence follows from direct computation of the invariants $\hat h_{R_1,\dots,R_L}$, $\hat g_{R_1,\dots,R_L}$ for simple knots and links and for representations with a small number of boxes.
Example 4.1. The simplest example is of course the unknot. The colored HOMFLY and Kauffman invariants are just quantum dimensions. For the standard framing one finds that the only nonvanishing $\hat h_R$, $\hat g_R$ are
$$\hat h_{\square} = 2(\nu - \nu^{-1})\, z^{-1}, \qquad \hat h_{(2)} = -z^{-1}, \qquad \hat h_{(1,1)} = -z^{-1}, \qquad \hat g_{\square} = 1, \tag{4.1}$$
where $(2)$ and $(1,1)$ label the two diagrams with two boxes.
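The two-box values of $\hat h_R$ for the unknot can be reproduced numerically. The sketch below makes several assumptions that are not spelled out in the text: the standard U($N$) quantum dimensions $[N][N\pm 1]/[2]$ for the two-box diagrams and $[N+1][N-1]$ for the composite representation $(\square,\square)$, the two-box expansion of (3.17), and a reading of (3.4) with a single factor $t - t^{-1}$ in the denominator. All names are ours, and exact rational arithmetic is used throughout.

```python
from fractions import Fraction


def hatted_two_box_unknot(t, nu):
    """Return (h_hat for (2), h_hat for (1,1)) for the unknot at rational t, nu."""
    z = t - 1 / t
    two = t + 1 / t                        # quantum integer [2]
    q_n = (nu - 1 / nu) / z                # [N], with nu = t^N
    q_np = (nu * t - 1 / (nu * t)) / z     # [N+1]
    q_nm = (nu / t - t / nu) / z           # [N-1]

    # Quantum dimensions (assumed standard formulas, not quoted from the text).
    dim_box = q_n
    dim_sym = q_n * q_np / two
    dim_asym = q_n * q_nm / two
    dim_adj = q_np * q_nm                  # composite representation (box, box)

    # Composite invariants of Example 3.2 evaluated on the unknot.
    r_box = 2 * dim_box
    r_sym = 2 * dim_sym + dim_adj
    r_asym = 2 * dim_asym + dim_adj
    # R_box at (t^2, nu^2), needed for the d = 2 term of (3.17).
    r_box_d2 = 2 * (nu**2 - nu**-2) / (t**2 - t**-2)

    h_sym = r_sym - r_box**2 / 2 - r_box_d2 / 2
    h_asym = r_asym - r_box**2 / 2 + r_box_d2 / 2

    # Two-box block of M from (3.4); characters of S_2 are hard-coded.
    # mu = (1,1): z_mu = 2, factor (t - 1/t)^2 / (t - 1/t) = z
    # mu = (2):   z_mu = 2, factor (t^2 - t^-2) / (t - 1/t) = [2]
    m_diag = (z + two) / 2
    m_off = (z - two) / 2
    det = m_diag**2 - m_off**2
    h_hat_sym = (m_diag * h_sym - m_off * h_asym) / det
    h_hat_asym = (-m_off * h_sym + m_diag * h_asym) / det
    return h_hat_sym, h_hat_asym


if __name__ == "__main__":
    t, nu = Fraction(3, 2), Fraction(5, 3)
    print(hatted_two_box_unknot(t, nu), -1 / (t - 1 / t))
```

Under these assumptions both entries equal $-z^{-1}$, independently of $\nu$.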
Although we have only computed the reformulated invariants up to four boxes, we conjecture that the $\hat h_R$, $\hat g_R$ vanish for all remaining representations. Of course, this has the structure predicted by our conjecture. We now consider more complicated examples. As in [29,31], torus knots and links provide a useful testing ground, since for them one can write down general expressions for the colored invariants in any representation. Torus knots are labelled by two coprime integers $n$, $m$, and we will denote them by $\mathcal K_{n,m}$. Torus links are labelled by two integers, and their g.c.d. is the number of components of the link, $L$. We will denote a torus link by $\mathcal L_{Ln,Lm}$, where $n$, $m$ are coprime. Explicit formulae for the HOMFLY invariant of a torus knot $\mathcal K_{n,m}$, colored by a representation $R$, can be obtained in many ways. In the context of Chern–Simons theory, one can use for example the formalism of knot operators of [28] to write down general expressions [29]. In fact [52], one can obtain formulae in the knot operator formalism which are much simpler than those presented in [29] and make contact with the elegant result derived in [36] by using Hecke algebras. The formula one obtains is simply
$$H_R(\mathcal K_{n,m}) = t^{nm C_R} \sum_U c^U_{n;R}\, t^{-\frac{m}{n} C_U} \dim_q U, \tag{4.2}$$
where $c^U_{n;R}$ is defined in (2.7). Of course one has to set $t^N = \nu$. This formula is also valid for composite representations, which after all are just a special type of representations of U($N$). The generalization to torus links is immediate, as noticed in [31], and the invariant for $\mathcal L_{Ln,Lm}$ is given by
$$H_{R_1,\dots,R_L}(\mathcal L_{Ln,Lm}) = \sum_S N^S_{R_1,\dots,R_L}\, t^{\,mn\left(C_S - \sum_{j=1}^{L} C_{R_j}\right)}\, H_S(\mathcal K_{n,m}). \tag{4.3}$$
This expression is also valid for composite representations, but one has to use the appropriate Littlewood–Richardson coefficients (as computed in for example [25]).
Example 4.2. By using these formulae one obtains, for the trefoil knot,
$$H_{(\square,\square)}(\mathcal K_{2,3}) = \dim_q(\square,\square) \left[ 4\nu^4 - 4\nu^6 + \nu^8 + z^2\left(4\nu^4 - 7\nu^6 + 2\nu^8 + \nu^{10}\right) + z^4\left(\nu^4 - 2\nu^6 + \nu^8\right) \right], \tag{4.4}$$
while for the Hopf link we have for example
$$H_{(\square,\square),(\square,\cdot)}(\mathcal L_{2,2}) = \dim_q(\square,\square)\, \dim_q \square\, (1 + z^2). \tag{4.5}$$
Remark 4.3. In the case of the Hopf link, a general expression for H(R1 ,S1 ),(R2 ,S2 ) in terms of the topological vertex [1] can be read from the results for the “covering contribution” in [8]. This expression has reappeared in other studies of topological string theory, see [2,22]. Particular cases have been computed by using skein theory in [42]. One also needs to compute the colored Kauffman invariants of torus knots and links. Very likely, the expression (4.2) generalizes to the Kauffman case by using the group theory data for SO(N )/Sp(N ), but we have used the expression presented in [8] for torus knots of the type (2, 2m + 1), based on the approach of [11]. With these ingredients it is straightforward to compute the reformulated invariants gˆ R for torus knots, although the expressions quickly become quite complicated. We have verified the conjecture for various framed torus knots and links and representations with up to four boxes.
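As a small consistency check of (4.2), one can evaluate it for the trefoil $\mathcal K_{2,3}$ in the fundamental representation and compare with the trefoil HOMFLY polynomial $2\nu^2 - \nu^4 + z^2\nu^2$ multiplied by the unknot normalization (2.18). The sketch below assumes the Adams coefficients $c^{(2)}_{2;\square} = 1$, $c^{(1,1)}_{2;\square} = -1$ and the standard quantum dimensions $[N][N\pm1]/[2]$ of the two-box diagrams, neither of which is quoted from the text; the helper names are ours.

```python
from fractions import Fraction


def qint(k, t):
    """Quantum integer [k] = (t^k - t^-k) / (t - t^-1)."""
    return (t**k - t**(-k)) / (t - 1 / t)


def casimir(kappa, boxes, N):
    """Quadratic Casimir C_R = kappa_R + N * ell(R), eq. (2.9)."""
    return kappa + N * boxes


def trefoil_from_torus_formula(t, N):
    """Right-hand side of (4.2) for (n, m) = (2, 3) and R = box, at nu = t^N."""
    n, m = 2, 3
    # Assumed standard U(N) quantum dimensions of the two-box diagrams.
    dim_sym = qint(N, t) * qint(N + 1, t) / qint(2, t)
    dim_asym = qint(N, t) * qint(N - 1, t) / qint(2, t)
    # ((kappa_U, ell(U)), Adams coefficient, dim_q U) for U = (2) and U = (1,1).
    terms = [((2, 2), 1, dim_sym), ((-2, 2), -1, dim_asym)]
    total = 0
    for (kappa_u, boxes_u), coeff, dim_u in terms:
        exponent = m * casimir(kappa_u, boxes_u, N)
        assert exponent % n == 0           # the power of t is an integer here
        total += coeff * t ** (-(exponent // n)) * dim_u
    return t ** (n * m * casimir(0, 1, N)) * total


def trefoil_from_polynomial(t, N):
    """(2 nu^2 - nu^4 + z^2 nu^2) times H(unknot), at nu = t^N."""
    nu, z = t**N, t - 1 / t
    return (2 * nu**2 - nu**4 + z**2 * nu**2) * qint(N, t)


if __name__ == "__main__":
    t = Fraction(3, 2)
    for N in range(2, 6):
        print(N, trefoil_from_torus_formula(t, N) == trefoil_from_polynomial(t, N))
```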
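Example 4.4. For the trefoil knot in the standard framing one finds
$$\begin{aligned}
\hat g &= -21\nu^{11} + 79\nu^{9} - 111\nu^{7} + 69\nu^{5} - 16\nu^{3} + 21z\left(\nu^{12} - 3\nu^{10} + 3\nu^{8} - \nu^{6}\right) \\
&\quad + z^2\left(-70\nu^{11} + 251\nu^{9} - 307\nu^{7} + 146\nu^{5} - 20\nu^{3}\right) + 7z^3\left(10\nu^{12} - 33\nu^{10} + 33\nu^{8} - 10\nu^{6}\right) \\
&\quad - 2z^4\left(42\nu^{11} - 165\nu^{9} + 183\nu^{7} - 64\nu^{5} + 4\nu^{3}\right) + 14z^5\left(6\nu^{12} - 23\nu^{10} + 23\nu^{8} - 6\nu^{6}\right) \\
&\quad + z^6\left(-45\nu^{11} + 220\nu^{9} - 230\nu^{7} + 56\nu^{5} - \nu^{3}\right) + 3z^7\left(15\nu^{12} - 73\nu^{10} + 73\nu^{8} - 15\nu^{6}\right) \\
&\quad + z^8\left(-11\nu^{11} + 78\nu^{9} - 79\nu^{7} + 12\nu^{5}\right) + z^9\left(-11\nu^{11} + 78\nu^{9} - 79\nu^{7} + 12\nu^{5}\right) \\
&\quad + z^{10}\left(-\nu^{11} + 14\nu^{9} - 14\nu^{7} + \nu^{5}\right) + z^{11}\left(\nu^{12} - 14\nu^{10} + 14\nu^{8} - \nu^{6}\right) \\
&\quad + z^{12}\left(\nu^{9} - \nu^{7}\right) + z^{13}\left(\nu^{8} - \nu^{10}\right), \\
\hat g &= -15\nu^{11} + 53\nu^{9} - 69\nu^{7} + 39\nu^{5} - 8\nu^{3} + 15z\left(\nu^{12} - 3\nu^{10} + 3\nu^{8} - \nu^{6}\right) \\
&\quad + z^2\left(-35\nu^{11} + 126\nu^{9} - 146\nu^{7} + 61\nu^{5} - 6\nu^{3}\right) + 5z^3\left(7\nu^{12} - 24\nu^{10} + 24\nu^{8} - 7\nu^{6}\right) \\
&\quad + z^4\left(-28\nu^{11} + 120\nu^{9} - 128\nu^{7} + 37\nu^{5} - \nu^{3}\right) + 7z^5\left(4\nu^{12} - 17\nu^{10} + 17\nu^{8} - 4\nu^{6}\right) \\
&\quad + z^6\left(-9\nu^{11} + 55\nu^{9} - 56\nu^{7} + 10\nu^{5}\right) + z^7\left(9\nu^{12} - 55\nu^{10} + 55\nu^{8} - 9\nu^{6}\right) \\
&\quad + z^8\left(-\nu^{11} + 12\nu^{9} - 12\nu^{7} + \nu^{5}\right) + z^9\left(\nu^{12} - 12\nu^{10} + 12\nu^{8} - \nu^{6}\right) \\
&\quad + z^{10}\left(\nu^{9} - \nu^{7}\right) + z^{11}\left(\nu^{8} - \nu^{10}\right).
\end{aligned} \tag{4.6}$$
From these expressions one can read the BPS invariants $N^{c=1,2}_{R;g,Q}$ for the two representations involved. In [8] the invariants with $c = 1$ were obtained by exploiting parity properties of the Kauffman invariant, but the $c = 2$ invariants were not determined. Notice that our convention for the matrix $M_{RS}$ is different from the one in [8], so in order to compare with the results for $c = 1$ presented in [8] one has to change $N_{R;g,Q} \to (-1)^{\ell(R)-1} N_{R^T;g,Q}$.
Example 4.5. For the Hopf link one finds
$$\hat g_{\square,\square} = z(\nu - \nu^{-1}), \qquad \hat g_{\;,\;} = z(\nu^{2} - 1), \qquad \hat g_{\;,\;} = z(\nu^{-2} - 1). \tag{4.7}$$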
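The value $z(\nu - \nu^{-1})$ for the Hopf link with both components in the fundamental representation can be recovered from the first line of (3.30). The sketch below uses the Hopf link polynomials of Examples 2.2 and 2.3 and assumes, beyond what the text states, that reversing one component of the Hopf link gives its mirror image, whose HOMFLY polynomial is obtained by $\nu \to \nu^{-1}$, $z \to -z$. The function name is ours.

```python
from fractions import Fraction


def g_fundamental_hopf(t, nu):
    """g for the Hopf link, both components colored by the fundamental representation."""
    z = t - 1 / t
    h_unknot = (nu - 1 / nu) / z               # normalization (2.18)
    g_unknot = 1 + h_unknot                    # normalization (2.34)

    p_hopf = (nu - 1 / nu) / z + nu * z        # HOMFLY polynomial, Example 2.2
    p_hopf_rev = (nu - 1 / nu) / z - z / nu    # assumed: mirror image of the above
    f_hopf = ((nu - 1 / nu) + z + z**2 * (nu - 1 / nu)) / z   # Example 2.3

    g_link = f_hopf * g_unknot                 # eq. (2.33)
    h_link = p_hopf * h_unknot                 # eq. (2.17)
    h_link_rev = p_hopf_rev * h_unknot

    # First line of (3.30), with both components equal to the unknot.
    return (g_link - g_unknot**2
            - h_link - h_link_rev
            + 2 * h_unknot**2)


if __name__ == "__main__":
    t, nu = Fraction(3, 2), Fraction(5, 3)
    print(g_fundamental_hopf(t, nu), (t - 1 / t) * (nu - 1 / nu))
```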
4.2. General predictions for knots. We now discuss general predictions of our conjecture. We will see that it makes contact with well-known properties of the Kauffman invariant, and that it makes some simple, new predictions for the structure of the Kauffman invariant of links. We start by discussing general predictions for knots. From their definitions (2.17), (2.33) we can write
$$\begin{aligned}
G(\mathcal K) &= \left(1 + \frac{\nu - \nu^{-1}}{z}\right) \sum_{i\ge 0} k_i^{\mathcal K}(\nu)\, z^{i}, \\
H(\mathcal K) &= \frac{\nu - \nu^{-1}}{z} \sum_{i\ge 0} p_i^{\mathcal K}(\nu)\, z^{2i}.
\end{aligned} \tag{4.8}$$
According to our conjecture, $\hat g_{\square} = g_{\square}$ has no terms in $z^{-1}$, therefore one must have
$$k_0^{\mathcal K}(\nu) = p_0^{\mathcal K}(\nu), \tag{4.9}$$
which is (2.42) in the case of knots. Therefore, the equality of the lowest order terms of the HOMFLY and Kauffman polynomials is a simple consequence of our conjecture. This was already noticed in [8]. We now consider the reformulated polynomial $g_R$ for representations with two boxes. Our conjecture implies that this quantity belongs to $z^{-1} \mathbb Z[\nu^{\pm 1}, t^{\pm 1}]$. By looking at the definition of $g_{(2)}(\mathcal K)$, $g_{(1,1)}(\mathcal K)$ in terms of colored Kauffman and HOMFLY invariants, we see that the only possible term which might spoil integrality is
$$\frac{1}{2}\left(G(\mathcal K)^2 + H_{(\square,\square)}(\mathcal K)\right). \tag{4.10}$$
Therefore our conjecture implies that
$$G(\mathcal K)^2 \equiv H_{(\square,\square)}(\mathcal K) \mod 2. \tag{4.11}$$
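This is precisely Rudolph’s theorem (2.45) for knots. Very likely, our integrality conjecture also leads to the generalization of Rudolph’s theorem due to Morton and Ryder (2.46), although the combinatorics becomes more involved. As an example, we will briefly show how to derive (2.46) in the case of knots ($L = 1$) and with $R = (2)$. To do this, we look at $g_{(4)}$, which is given by
$$\begin{aligned}
g_{(4)} &= G_{(4)} - G_{\square} G_{(3)} - \frac{1}{2} G_{(2)}^2 + G_{\square}^2 G_{(2)} - \frac{1}{4} G_{\square}^4 \\
&\quad - \frac{1}{2}\left( \mathcal R_{(4)} - \mathcal R_{\square} \mathcal R_{(3)} - \frac{1}{2} \mathcal R_{(2)}^2 + \mathcal R_{\square}^2 \mathcal R_{(2)} - \frac{1}{4} \mathcal R_{\square}^4 \right).
\end{aligned} \tag{4.12}$$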
Most terms in the r.h.s. are manifestly elements in $\mathbb Z[t^{\pm 1}, \nu^{\pm 1}]$ with denominators given by products of $t^r - t^{-r}$. The only possible source for rational coefficients is the term
$$-\frac{1}{2}\left(G_{(2)}^2 + H_{((2),(2))}\right) - \frac{1}{4}\left(G_{\square}^4 - H_{(\square,\square)}^2\right). \tag{4.13}$$
However, the last two terms inside the bracket are also in $\mathbb Z[t^{\pm 1}, \nu^{\pm 1}]$ thanks to Rudolph’s theorem, and we conclude that integrality of $g_{(4)}$ requires
$$G_{(2)}^2 \equiv H_{((2),(2))} \mod 2. \tag{4.14}$$
This is Morton–Ryder’s theorem (2.46) for a knot colored by $R = (2)$. It seems likely that the general case of their theorem, for an arbitrary representation $R$, follows from integrality of $g_S$, where $S \in R \otimes R$.
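The step leading from the absence of a $z^{-1}$ term in $g_\square$ to (4.9) can be spelled out as follows. This is a sketch that only uses $g_\square = G(\mathcal K) - H(\mathcal K)$ from (3.19) and the expansions (4.8).

```latex
\begin{align*}
g_\square &= G(\mathcal K) - H(\mathcal K) \\
  &= \sum_{i \ge 0} k_i^{\mathcal K}(\nu)\, z^{i}
   + \frac{\nu - \nu^{-1}}{z}
     \Big( \sum_{i \ge 0} k_i^{\mathcal K}(\nu)\, z^{i}
         - \sum_{i \ge 0} p_i^{\mathcal K}(\nu)\, z^{2i} \Big) \\
  &= \frac{\nu - \nu^{-1}}{z}\, \big( k_0^{\mathcal K}(\nu) - p_0^{\mathcal K}(\nu) \big)
   + O(z^0).
\end{align*}
```

The coefficient of $z^{-1}$ vanishes precisely when $k_0^{\mathcal K}(\nu) = p_0^{\mathcal K}(\nu)$.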
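4.3. General predictions for links. Let us now consider two-component links. When both components are colored by $\square$, the HOMFLY and Kauffman invariants have the form
\[ G(\mathcal{L}) = \left(1 + \frac{\nu - \nu^{-1}}{z}\right) \sum_{i \ge 0} k_i^{\mathcal{L}}(\nu)\, z^{i-1}, \qquad H(\mathcal{L}) = \frac{\nu - \nu^{-1}}{z} \sum_{i \ge 0} p_i^{\mathcal{L}}(\nu)\, z^{2i-1}. \tag{4.15} \]
Our conjecture says that $g_{\square,\square}(\mathcal{L})$ belongs to $\mathbb{Z}[\nu^{\pm 1}, z]$, i.e. it has no negative powers of $z$. From its explicit definition in terms of the HOMFLY and Kauffman polynomials of $\mathcal{L}$ we find that this condition leads to three different relations. The first one is
\[ k_0^{\mathcal{L}}(\nu) = (\nu - \nu^{-1})\, k_0^{K_1}(\nu)\, k_0^{K_2}(\nu). \tag{4.16} \]
The conjecture in the case of HOMFLY leads to a similar relation [31]
\[ p_0^{\mathcal{L}}(\nu) = (\nu - \nu^{-1})\, p_0^{K_1}(\nu)\, p_0^{K_2}(\nu). \tag{4.17} \]
Notice that, due to (4.9), we also have from (4.17) and (4.16) that
\[ p_0^{\mathcal{L}}(\nu) = k_0^{\mathcal{L}}(\nu) \tag{4.18} \]
for links of two components. The second relation determines the second coefficient of the Kauffman polynomial of a link as
\[ k_1^{\mathcal{L}}(\nu) = k_0^{K_1}(\nu)\, k_0^{K_2}(\nu) + (\nu - \nu^{-1}) \left( k_0^{K_1}(\nu)\, k_1^{K_2}(\nu) + k_1^{K_1}(\nu)\, k_0^{K_2}(\nu) \right). \tag{4.19} \]
Finally, the third relation gives an equation for $k_2^{\mathcal{L}}(\nu)$,
\[ k_2^{\mathcal{L}}(\nu) = p_1^{\mathcal{L}}(\nu) + p_1^{\bar{\mathcal{L}}}(\nu) - 2 \left( p_0^{K_1}(\nu)\, p_1^{K_2}(\nu) + p_1^{K_1}(\nu)\, p_0^{K_2}(\nu) \right) + k_0^{K_1}(\nu)\, k_1^{K_2}(\nu) + k_1^{K_1}(\nu)\, k_0^{K_2}(\nu) + (\nu - \nu^{-1}) \left( k_0^{K_1}(\nu)\, k_2^{K_2}(\nu) + k_2^{K_1}(\nu)\, k_0^{K_2}(\nu) + k_1^{K_1}(\nu)\, k_1^{K_2}(\nu) \right). \tag{4.20} \]
These results can be easily generalized to a general link $\mathcal{L}$ with $L$ components $K_j$, $j = 1, \ldots, L$, as follows. If we calculate the connected invariants of the link from their definitions in terms of invariants of sublinks, we obtain an expression of the form
\[ G^{(c)}(\mathcal{L}) = \left(1 + \frac{\nu - \nu^{-1}}{z}\right) \sum_{i \ge 0} k_i^{(c),\mathcal{L}}(\nu)\, z^{i+1-L}, \qquad H^{(c)}(\mathcal{L}) = \frac{\nu - \nu^{-1}}{z} \sum_{i \ge 0} p_i^{(c),\mathcal{L}}(\nu)\, z^{2i+1-L}, \tag{4.21} \]
where $p_i^{(c),\mathcal{L}}(\nu)$, $k_i^{(c),\mathcal{L}}(\nu)$ can be obtained in terms of the polynomials $p_i^{\mathcal{L}'}(\nu)$, $k_i^{\mathcal{L}'}(\nu)$ of the different sublinks of $\mathcal{L}$, $\mathcal{L}' \subset \mathcal{L}$. For example,
\[ p_0^{(c),\mathcal{L}}(\nu) = p_0^{\mathcal{L}}(\nu) - (\nu - \nu^{-1})^{L-1} \prod_{j=1}^{L} p_0^{K_j}(\nu). \tag{4.22} \]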
The conjecture of [30,31,44] for the colored HOMFLY invariant implies in particular that the connected HOMFLY invariant belongs to $z^{L-2}\, \mathbb{Z}[z^2, \nu^{\pm 1}]$. This leads to [30,31]
\[ p_0^{(c),\mathcal{L}}(\nu) = \cdots = p_{L-2}^{(c),\mathcal{L}}(\nu) = 0 \tag{4.23} \]
for any link $\mathcal{L}$. The fact that $p_0^{(c),\mathcal{L}}(\nu) = 0$ is a result of Lickorish and Millett [34], and the vanishing of $p_{1,2}^{(c),\mathcal{L}}(\nu)$ has been proved in [21]. Our conjecture for the colored Kauffman invariant implies that (3.33) belongs to $z^{L-1}\, \mathbb{Z}[z, \nu^{\pm 1}]$. This gives the relations
\[ k_0^{(c),\mathcal{L}}(\nu) = \cdots = k_{2L-3}^{(c),\mathcal{L}}(\nu) = 0 \tag{4.24} \]
as well as
\[ k_{2L-2}^{(c),\mathcal{L}}(\nu) = \sum_{\alpha=1}^{2^{L-1}} p_{L-1}^{(c),\mathcal{L}_\alpha}(\nu). \tag{4.25} \]
The relations (4.24) generalize (4.16) and (4.19), while (4.25) generalizes (4.20) to any link. The equality (2.42) for any link now follows from the vanishing of $p_0^{(c),\mathcal{L}}(\nu)$, $k_0^{(c),\mathcal{L}}(\nu)$ and the equality in the case of knots (4.9). The vanishing of $k_i^{(c),\mathcal{L}}(\nu)$ with $i = 0, \ldots, 3$ has been proved by Kanenobu in [20].

Fig. 6. A Brunnian link with four components

More evidence for (4.24) comes from Brunnian links. A Brunnian link is a nontrivial link with the property that every proper sublink is trivial. The Hopf link is a Brunnian link of two components, while the famous Borromean rings give a Brunnian link of three components. A Brunnian link with four components is shown in Fig. 6. It is easy to see that the connected invariants of a Brunnian link $B$ with $L$ components are of the form
\[ G^{(c)}(B) = \left(1 + \frac{\nu - \nu^{-1}}{z}\right) F_B(z, \nu) - \left(1 + \frac{\nu - \nu^{-1}}{z}\right)^{L}, \qquad H^{(c)}(B) = \frac{\nu - \nu^{-1}}{z}\, P_B(z, \nu) - \left(\frac{\nu - \nu^{-1}}{z}\right)^{L}. \tag{4.26} \]
Conjectures (4.23) and (4.24) imply that, for Brunnian links,
\[ P_B(z, \nu) - \left(\frac{\nu - \nu^{-1}}{z}\right)^{L-1} = O(z^{L-1}), \qquad F_B(z, \nu) - \left(1 + \frac{\nu - \nu^{-1}}{z}\right)^{L-1} = O(z^{L-1}). \tag{4.27} \]
This has been proved in [18,47].
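As an illustrative check of the HOMFLY half of (4.27), added here and not part of the original argument, one may take $B$ to be the Hopf link, which is Brunnian with $L = 2$, and insert its HOMFLY polynomial as quoted in (4.29). The Kauffman half is not checked, since it would need the Hopf-link Kauffman polynomial (2.40).

```latex
\[
P_B(z,\nu) - \left(\frac{\nu-\nu^{-1}}{z}\right)^{L-1}
= \left(\frac{\nu-\nu^{-1}}{z} + \nu z\right) - \frac{\nu-\nu^{-1}}{z}
= \nu z = O(z^{L-1}), \qquad L = 2 .
\]
```

The other orientation in (4.29) gives $-\nu^{-1} z$ instead of $\nu z$, which is again $O(z)$.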
Fig. 7. The two oriented links $\mathcal{L}$ and $\bar{\mathcal{L}}$, tabulated as $4^2_1$ and differing in the relative orientation of their components
The relation (4.25) (and in particular (4.20) for links with two components) seems however to be a new result in the theory of the Kauffman polynomial. It relates the Kauffman polynomial of an unoriented link $\mathcal{L}$ to the HOMFLY polynomial of all the oriented links that can be obtained from $\mathcal{L}$, modulo an overall reversal of the orientation. In the case of links made out of two unknots, it further simplifies to
\[ k_2^{\mathcal{L}}(\nu) = p_1^{\mathcal{L}}(\nu) + p_1^{\bar{\mathcal{L}}}(\nu), \tag{4.28} \]
and it can be easily checked in various cases by looking for example at the tables presented in [35].

Example 4.6. Let us check (4.28) for some simple links made out of two unknots. The easiest example is of course the Hopf link, where $\mathcal{L}$ and $\bar{\mathcal{L}}$ are depicted in Fig. 3. Their HOMFLY polynomials are given by
\[ P_{\mathcal{L}} = \frac{\nu - \nu^{-1}}{z} + \nu z, \qquad P_{\bar{\mathcal{L}}} = \frac{\nu - \nu^{-1}}{z} - \nu^{-1} z, \tag{4.29} \]
and
\[ p_1^{\mathcal{L}}(\nu) + p_1^{\bar{\mathcal{L}}}(\nu) = \nu - \nu^{-1}. \tag{4.30} \]
By comparing with (2.40), we see that (4.28) holds. Let us now consider the pair of oriented links depicted in Fig. 7. Their HOMFLY polynomials are
\[ P_{\mathcal{L}} = (\nu - \nu^{-1})\, z^{-1} + (\nu - 3\nu^{-1})\, z - \nu^{-1} z^3, \qquad P_{\bar{\mathcal{L}}} = (\nu - \nu^{-1})\, z^{-1} + (\nu + \nu^3)\, z, \tag{4.31} \]
while the Kauffman polynomial is
\[ F_{\mathcal{L}} = (\nu - \nu^{-1})\, z^{-1} + 1 + (\nu^3 + 2\nu - 3\nu^{-1})\, z + (1 - \nu^2)\, z^2 + (\nu - \nu^{-1})\, z^3. \tag{4.32} \]
Again, the relation (4.28) holds. Finally, we consider the link depicted in Fig. 8, and tabulated as $5^2_1$. This link is invariant under reversal of orientation of its components, hence $\bar{\mathcal{L}} = \mathcal{L}$, and its HOMFLY polynomial equals
\[ P_{\mathcal{L}} = (\nu - \nu^{-1})\, z^{-1} + (-\nu^{-1} + 2\nu - \nu^3)\, z + \nu z^3, \tag{4.33} \]
Fig. 8. The oriented link $\mathcal{L}$, tabulated as $5^2_1$. It is invariant under reversal of orientation of its components, hence $\bar{\mathcal{L}} = \mathcal{L}$
while its Kauffman polynomial is
\[ F_{\mathcal{L}} = (\nu - \nu^{-1})\, z^{-1} + 1 + (-2\nu^{-1} + 4\nu - 2\nu^3)\, z + (-1 + \nu^4)\, z^2 + (-\nu^{-1} + 3\nu - 2\nu^3)\, z^3 + (-1 + \nu^2)\, z^4. \tag{4.34} \]
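The coefficient comparisons behind (4.28) for the links of Figs. 7 and 8 can be replayed mechanically. The following is a minimal sketch in Python, not taken from the paper: Laurent polynomials in ν are stored as dictionaries from exponent to coefficient, and the names `add`, `links` and `satisfies_4_28` are purely illustrative. Following (4.15), the coefficient of z in a HOMFLY polynomial is p_1 and the coefficient of z in the Kauffman polynomial is k_2. The Hopf link is left out because its Kauffman polynomial (2.40) is not reproduced in this part of the text.

```python
from collections import Counter


def add(p, q):
    """Add two Laurent polynomials in nu, stored as {exponent: coefficient}."""
    out = Counter(p)
    for exponent, coeff in q.items():
        out[exponent] += coeff
    # Drop vanishing terms so that equal polynomials compare equal as dicts.
    return {e: c for e, c in out.items() if c != 0}


# Coefficients of z^1, read off from the displayed polynomials.
# HOMFLY: P = sum_i p_i z^(2i-1), so the z^1 coefficient is p_1.
# Kauffman: F = sum_i k_i z^(i-1), so the z^1 coefficient is k_2.
links = {
    "4^2_1": {
        "p1": {1: 1, -1: -3},          # (4.31), first polynomial
        "p1_bar": {1: 1, 3: 1},        # (4.31), second polynomial
        "k2": {3: 1, 1: 2, -1: -3},    # (4.32)
    },
    "5^2_1": {
        "p1": {-1: -1, 1: 2, 3: -1},       # (4.33)
        "p1_bar": {-1: -1, 1: 2, 3: -1},   # same link: reversal-invariant
        "k2": {-1: -2, 1: 4, 3: -2},       # (4.34)
    },
}


def satisfies_4_28(data):
    """Check k_2 = p_1 + p_1-bar, i.e. relation (4.28)."""
    return add(data["p1"], data["p1_bar"]) == data["k2"]


results = {name: satisfies_4_28(data) for name, data in links.items()}
```

Storing only the z^1 coefficients keeps the sketch short; a fuller check would parse the complete polynomials and also compare the lowest coefficients as in (4.18).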
Here, $k_2^{\mathcal{L}}(\nu) = 2 p_1^{\mathcal{L}}(\nu)$, again in agreement with (4.28).

Finally, it is easy to see that Rudolph’s theorem for a link $\mathcal{L}$ can be obtained by requiring integrality of, for example, the reformulated invariant g ,..., , generalizing in this way our analysis for knots. It is likely that (2.46) follows from looking at $g_{S_1, \ldots, S_L}$ with $S_i \in R_i \otimes R_i$.

5. String Theory Interpretation

The conjecture stated in this paper is mostly based on the analysis performed in [8], which in turn builds upon previous work on the large N duality between Chern–Simons theory and topological strings (see [39] for a review of these developments). In this section we sketch some of the string theory considerations which lead to the above conjecture. For simplicity we will restrict ourselves to the case of knots. The extension of these considerations to the case of links is straightforward.

5.1. Chern–Simons theory and D-branes. In [54], Witten showed that Chern–Simons theory on a three-manifold $M$ can be obtained by considering open topological strings on the cotangent space $T^*M$ with boundaries lying on $M$, which is a Lagrangian submanifold of $T^*M$. Equivalently, one can say that the theory describing $N$ topological branes wrapping $M$ inside $T^*M$ is U(N) Chern–Simons theory. To incorporate knots and links into this framework one has to introduce a different set of branes, as explained by Ooguri and Vafa [44]. This goes as follows: given any knot $K$ in $S^3$, one can construct a natural Lagrangian submanifold $N_K$ in $T^*S^3$. This construction is rather canonical, and it is called the conormal bundle of $K$. Let us parametrize the knot $K$ by a curve $q(s)$, where $s \in [0, 2\pi]$. The conormal bundle of $K$ is the space
\[ N_K = \Big\{ (q(s), p) \in T^*S^3 \ \Big|\ \sum_i p_i \dot{q}_i = 0,\ 0 \le s \le 2\pi \Big\}, \tag{5.1} \]
where $q_i$, $p_i$ are coordinates for the base and the fibre of the cotangent bundle, respectively, and $\dot{q}_i$ denote derivatives w.r.t. $s$. The space $N_K$ is an $\mathbb{R}^2$-fibration of the knot itself, where the fiber on the point $q(s)$ is given by the two-dimensional subspace of $T^*_{q(s)}S^3$ of planes orthogonal to $\dot{q}(s)$. $N_K$ has the topology of $S^1 \times \mathbb{R}^2$, and intersects $S^3$ along the knot $K$. As a matter of fact, for some aspects of the construction, the appropriate submanifolds to consider are deformations of $N_K$ which are disconnected from the zero section. For example, [26] considers a perturbation
\[ N_{K,\epsilon} = \Big\{ (q(s), p + \epsilon\, \dot{q}(s)) \in T^*S^3 \ \Big|\ \sum_i p_i \dot{q}_i = 0,\ 0 \le s \le 2\pi \Big\}. \tag{5.2} \]
Let us now wrap $M$ probe branes around $N_K$. There will be open strings with one endpoint on $S^3$, and another endpoint on $N_K$. These open strings lead to the insertion of the following operator (also called the Ooguri–Vafa operator) in the Chern–Simons theory on $S^3$ [44]:
\[ Z_{U(N)}(v) = \sum_R \mathrm{Tr}_R^{U(N)}(U_K)\, s_R(v). \tag{5.3} \]
Here $U_K$ is the holonomy of the Chern–Simons gauge field around $K$, while $v$ is a U(M) matrix associated to the $M$ branes wrapping $N_K$. After computing the expectation value of this operator in Chern–Simons theory, we obtain the generating functional (3.1).

In order to describe the Kauffman polynomial, we need a Chern–Simons theory on $S^3$ with gauge group SO(N) or Sp(N). From the point of view of the open string description, we need an orientifold of topological string theory on $T^*S^3$. This orientifold was constructed in [51], and it can be described as follows. As a complex manifold, the cotangent space $T^*S^3$ is a Calabi–Yau manifold called the deformed conifold. It can be described by the equation
\[ \sum_{i=1}^{4} x_i^2 = \mu, \tag{5.4} \]
where $x_i$ are complex coordinates. For real $\mu > 0$, the submanifold $\mathrm{Im}\, x_i = 0$ is nothing but $S^3$, while $\mathrm{Im}\, x_i$ are coordinates of the cotangent space. We now consider the following involution of the geometry
\[ I : x_i \to \bar{x}_i. \tag{5.5} \]
This leaves the $S^3$ invariant, and acts as a reflection on the coordinates of the fiber:
\[ p_i \to -p_i. \tag{5.6} \]
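The statements around (5.4)–(5.6) can be illustrated numerically. The sketch below is an illustration only and rests on stated assumptions: a point is written as x = q + i p with real q and p, so that (5.4) with real μ amounts to |q|² − |p|² = μ and q · p = 0, and the helper names `conifold_point`, `defining_function` and `involution` are hypothetical. It checks that complex conjugation maps the hypersurface to itself, flips the sign of p, and fixes points with p = 0.

```python
import math


def conifold_point(u, p, mu):
    """Return x = q + i p on sum(x_i^2) = mu.

    Assumes u is a real unit vector and p is real and orthogonal to u;
    q is taken along u with |q|^2 = mu + |p|^2.
    """
    scale = math.sqrt(mu + sum(c * c for c in p))
    return [complex(scale * ui, pi) for ui, pi in zip(u, p)]


def defining_function(x):
    """Left-hand side of (5.4)."""
    return sum(xi * xi for xi in x)


def involution(x):
    """The map (5.5): complex conjugation of every coordinate."""
    return [xi.conjugate() for xi in x]


mu = 2.0
u = [1.0, 0.0, 0.0, 0.0]           # unit vector fixing the direction of q
p = [0.0, 0.5, -1.5, 0.25]         # fiber coordinates, orthogonal to u
x = conifold_point(u, p, mu)
ix = involution(x)

# A point on the zero section p = 0, i.e. on the S^3 of radius sqrt(mu).
x0 = conifold_point(u, [0.0, 0.0, 0.0, 0.0], mu)
```

Comparing the imaginary parts of `x` and `ix` reproduces the reflection (5.6), while `x0` is left unchanged, in line with the statement that the $S^3$ is invariant.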
If we now wrap $N$ D-branes on $S^3$, the corresponding gauge theory description is Chern–Simons theory with gauge group SO(N) or Sp(N), depending on the choice of orientifold action on the gauge group [51]. Since an orientifold theory is a particular case of a $\mathbb{Z}_2$ orbifold, the partition function is expected to be the sum of the partition function in the untwisted sector, plus the partition function of the twisted sector. The partition function in the untwisted sector corresponds to a theory of oriented open strings in the “covering geometry,” i.e. the original target space geometry but with the closed moduli identified
according to the action of the involution. The partition function in the twisted sector is given by the contributions of unoriented strings. We can then write [7,51]
\[ Z_{SO(N)/Sp(N)} = \frac{1}{2} Z_{\mathrm{cov}} + Z_{\mathrm{unor}}. \tag{5.7} \]
In the case considered in [51], the partition function for the covering geometry is just the U(N) partition function. We can introduce Wilson loops around knots and links by following the strategy of Ooguri and Vafa, i.e. by introducing branes wrapping the Lagrangian submanifold $N_K$. This leads to the insertion of the operator [7,8]
\[ Z_{SO(N)/Sp(N)}(v) = \sum_R \mathrm{Tr}_R^{SO(N)/Sp(N)}(U_K)\, s_R(v). \tag{5.8} \]
After computing the expectation value of this operator in Chern–Simons theory, we obtain the generating functional (3.16). In this paper we have used the Kauffman invariant which follows naturally from the gauge group SO(N), but in fact the two choices of gauge group lead to essentially identical theories due to the “SO(N) = Sp(−N)” equivalence, see [7] for a discussion and references.

How does (5.8) decompose into a sum (5.7) of covering and unoriented contributions? In general, if we have a geometry $X$ with a submanifold $L$, there will be two submanifolds in the covering geometry: the original submanifold $L$ and its image under the involution $I(L)$ [8]. Although $I(N_K) = N_K$, if one considers deformations of the conormal bundle, the resulting submanifolds will be different (this has been previously noticed in [17]). For example, for the deformation (5.2) one has that $I(N_{K,\epsilon}) = N_{K,-\epsilon}$. Therefore, after deformation we will have two sets of probe D-branes in $T^*S^3$, wrapping two different submanifolds related by the involution $I$, and leading to two different sources $v_1$ and $v_2$ [8]. In particular, we have two sets of open strings, going from the two sets of probe branes to the branes wrapping $S^3$ in the orientifold plane, and related by the orientifold action. This action involves both the target space involution $p_i \to -p_i$ and an orientation reversal which conjugates the Chan–Paton charges. We then conclude that one set of open strings will lead to the insertion of Wilson lines in $S^3$ involving representations R = ·, , , . . ., while the other set of open strings will lead to conjugate representations S = . . . , , , . . .. This is illustrated in Fig. 9. The partition function of the covering geometry will then have the structure
\[ Z_{\mathrm{cov}}(v) = \sum_{R,S} \mathrm{Tr}_{(R,S)}^{U(N)}(U_K)\, s_R(v)\, s_S(v). \tag{5.9} \]
Since we have to identify the closed and open moduli according to the action of the involution, in (5.9) we have set v1 = v2 = v in the source terms s R (v1 ) and s S (v2 ). After computing the expectation value of this operator in Chern–Simons theory, we obtain the generating functional Z R in (3.15). This argument by itself does not make it possible to decide if the representation induced by the orientifold action is the composite representation (R, S) or the tensor product representation R ⊗ S, which differ in “lower order corrections” as specified in (2.13). One needs in principle a more detailed study of the orientifold, but as we will see in a moment, by looking at the topological string theory realization for simple knots and links, we can verify that the covering geometry involves indeed the composite representation.
Fig. 9. The two sets of open strings in the covering geometry, going from the probe branes to the orientifold “plane” in $S^3$ and extending along the cotangent directions. They are related by the target space involution, which sends $p_i \to -p_i$, and by orientation reversal. The Chan–Paton charges ending on $S^3$ lead to a Wilson line colored by $(R, S)$, while the Chan–Paton charges in the probe branes lead to sources $v_1$, $v_2$ which have to be identified by the involution: $v_1 = v_2$
5.2. Topological string dual. It was conjectured in [14] that open topological string theory on $T^*S^3$ with $N$ D-branes wrapping $S^3$ is equivalent to a closed topological string theory on the resolved conifold
\[ X = \mathcal{O}(-1) \oplus \mathcal{O}(-1) \to \mathbb{P}^1. \tag{5.10} \]
This leads to a large N duality between Chern–Simons theory on $S^3$ and the closed topological string on (5.10). The open string theory on the deformed conifold is related to the closed string theory on the resolved conifold by a so-called geometric transition. In this case this is the conifold transition. This duality was extended by Ooguri and Vafa to the situation in which one has probe branes in $T^*S^3$ wrapping the Lagrangian submanifold $N_K$. They postulated that, given any knot $K$, one can construct a Lagrangian submanifold in the resolved conifold, $L_K$, which can be understood as a geometric transition of the Lagrangian $N_K$ in the deformed conifold. The total free energy in the deformed conifold can be computed in terms of Chern–Simons theory and it is given by $F_H(v)$. By the large N duality, it should be equal to the free energy of an open topological string theory on the resolved conifold $X$ with Lagrangian boundary conditions given by $L_K$. Since open topological string amplitudes can be reformulated in terms of counting of BPS invariants and satisfy integrality conditions [13,31,44], one obtains the conjecture about the integrality properties of the colored HOMFLY invariant [30,39] which we reviewed above.

As first shown in [51], one can extend the large N duality of [14] to the orientifold case and obtain an equivalence between Chern–Simons theory on $S^3$ with SO(N)/Sp(N) gauge groups and topological string theory on an orientifold of the resolved conifold. A convenient description of (5.10) is as a toric manifold, defined by the equation
\[ |X_1|^2 + |X_2|^2 - |X_3|^2 - |X_4|^2 = t \tag{5.11} \]
and a further quotient by a U(1) action where the coordinates $(X_1, \ldots, X_4)$ have charges $(1, 1, -1, -1)$. In this description, the orientifold is defined by the involution
\[ (X_1, X_2, X_3, X_4) \to (X_2, -X_1, X_4, -X_3). \tag{5.12} \]
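Two elementary properties of the map (5.12) can be checked directly: it preserves the left-hand side of (5.11), and applying it twice gives −X, which coincides with the U(1) action at angle π and is therefore trivial on the quotient. The following Python sketch is an illustration under stated assumptions, not part of the paper: the map is implemented exactly as printed in (5.12), the U(1) action is taken to be multiplication of $X_i$ by $e^{i q_i \theta}$ with the charges $q_i$ quoted above, and the names `moment_map`, `u1_action` and `involution` are hypothetical.

```python
import cmath

CHARGES = (1, 1, -1, -1)  # U(1) charges of (X_1, X_2, X_3, X_4)


def moment_map(X):
    """Left-hand side of (5.11): |X1|^2 + |X2|^2 - |X3|^2 - |X4|^2."""
    return sum(q * abs(x) ** 2 for q, x in zip(CHARGES, X))


def u1_action(X, theta):
    """Assumed form of the U(1) action: X_i -> exp(i q_i theta) X_i."""
    return tuple(cmath.exp(1j * q * theta) * x for q, x in zip(CHARGES, X))


def involution(X):
    """The map (5.12), implemented as printed."""
    x1, x2, x3, x4 = X
    return (x2, -x1, x4, -x3)


# An arbitrary sample point; t is whatever value (5.11) takes on it.
X = (1.0 + 2.0j, -0.5 + 0.25j, 0.75 - 1.0j, 0.5 + 0.5j)
t = moment_map(X)

twice = involution(involution(X))   # equals -X componentwise
gauge = u1_action(X, cmath.pi)      # also -X, up to rounding
```

Because the square of the map is a gauge transformation, it descends to an honest involution of the quotient, which is what the orientifold construction requires.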
Let us now consider the open topological string theory on the resolved conifold defined by the Lagrangian submanifold $L_K$ associated to a knot. If we perform the orientifold action (5.12) we obtain an orientifold of this open topological string theory. The total partition function of this orientifold of the resolved conifold should be equal to the total partition function of the orientifold of the deformed conifold, namely $Z_G(v)$. The contribution of the covering geometry will be given by the partition function of topological open strings on $X$ in the presence of two Lagrangian submanifolds, $L_K$ and $I(L_K)$, after identifying the sources. This should be equal to the contribution of the covering of the deformed geometry, i.e. $Z_R(v)$. This partition function can then be expressed in terms of integer BPS invariants $N^{c=0}_{R;g,Q}$.

On the other hand, the unoriented contribution to the orientifold partition function will be given by the partition function of unoriented topological open strings in the quotient geometry $X/I$ with a brane $L_K$. This unoriented partition function also has an integrality structure [7,8] generalizing [13]. In particular, it can be written in terms of BPS invariants $N^{c=1,2}_{R;g,Q}$ related to the counting of curves with boundaries ending on $L_K$ and with one or two crosscaps. This explains the integrality properties for the colored Kauffman polynomial that we conjectured in this paper.

Remark 5.1. Note that the two choices of orientifold action which lead to the gauge groups SO(N)/Sp(N) in the deformed conifold become here a choice of overall sign for the $c = 1$ contribution, see for example [7] for a discussion and examples.

Remark 5.2. The sum over odd positive integers $d$ in (3.18) seems to be a general feature of multicovering formulae for unoriented surfaces, as noticed in [7,8,51]. See [27] for recent examples.

When $K$ is the unknot it is possible to construct explicitly the corresponding Lagrangian submanifold $L_K$ in $X$. It turns out to be given by a toric construction, and it is possible to compute $Z_R(v)$ by using the topological vertex [8]. The explicit computation in Eq. (3.10) of [8] confirms that the vacuum expectation value of the operator appearing in (5.9) is indeed the quantum dimension of the composite representation (2.26),
\[ \left\langle \mathrm{Tr}_{(R,S)}^{U(N)}(U_K) \right\rangle = \dim_q (R, S). \tag{5.13} \]
One can also find an explicit description of the Lagrangian submanifolds in $X$ corresponding to the Hopf link [8], and compute the covering contribution to the orientifold partition function by using the topological vertex. The resulting expression (Eq. (3.17) of [8]) agrees again with the HOMFLY invariant of the Hopf link for general composite representations, which was computed in [2,22] in a different context.

6. Conclusions and Outlook

In this paper we have formulated a new conjecture on the structure of the colored Kauffman polynomial of knots and links. This conjecture is mainly based on the results of [8], but it adds a crucial ingredient which was missing in that paper: the fact that partition
functions in the untwisted sector of the orientifold are given by HOMFLY invariants colored by composite representations. This makes it possible to extend the results obtained for the colored HOMFLY invariant in [30,31,44] to the colored Kauffman polynomial.

According to our conjecture, the natural invariant of unoriented knots and links involves both the Kauffman polynomial and the HOMFLY polynomial colored with composite representations. In particular, in the case of links, it involves considering all possible orientations for the components of a link. This is probably the most interesting aspect of the conjecture, and it “explains” many aspects of the relationship between these invariants, like Rudolph’s theorem [49]. It also leads to new, simple results about the Kauffman polynomial, like for example (4.25). From the point of view of physics, the results presented in this paper provide new precision tests of a large N string/gauge theory correspondence.

It would be very interesting to relate the integrality properties conjectured here to appropriate generalizations of Khovanov homology, as in [16]. Indeed, as in the case of the colored HOMFLY invariants, the integers $N^{c=0,1,2}_{R_1, \ldots, R_L; g, Q}$ are Euler characteristics of cohomology theories associated to BPS states, and it is natural to conjecture that these cohomologies give categorifications of the colored Kauffman invariant. There has been already work in this direction for knots colored by the fundamental representation [17]. The case of links and/or higher representations should involve, as conjectured in this paper, both the Kauffman invariant and the HOMFLY invariant for composite representations.

Finally, it was noticed in [29] that the reformulated invariants $f_R$, when expanded in power series with $t = \exp(x/2)$, lead to Vassiliev invariants. On the other hand, some of the properties that follow from our conjectures (like (4.27)) have a natural interpretation in Vassiliev theory. Therefore, it would be interesting to have a precise interpretation of our conjectures in terms of Vassiliev invariants, especially now that we have a rather complete understanding of Chern–Simons invariants for all classical gauge groups in terms of string theory.

Note added. After this paper was submitted, two papers appeared [9,52] with extensive checks of Conjecture 3.3 for framed torus knots and links.

Acknowledgements. My interest in this problem was revived by the recent paper of Morton and Ryder [43], and I would like to thank them for a very useful correspondence. I would also like to thank Vincent Bouchard, Jose Labastida and Cumrun Vafa for many conversations on this topic over the years, and Sébastien Stevan for recent discussions on torus knots. Finally, I would like to thank Vincent Bouchard, Stavros Garoufalidis and especially Hugh Morton for a detailed reading of the manuscript. This work was supported in part by the Fonds National Suisse.
References

1. Aganagic, M., Klemm, A., Mariño, M., Vafa, C.: The topological vertex. Commun. Math. Phys. 254, 425 (2005)
2. Aganagic, M., Neitzke, A., Vafa, C.: BPS microstates and the open topological string wave function. http://arxiv.org/abs/hep-th/0504054v1, 2005
3. Aiston, A.K.: Skein theoretic idempotents of Hecke algebras and quantum group invariants. Ph.D. Thesis, University of Liverpool (1996), available in http://www.liv.ac.uk/~su14/knotprints.html
4. Aiston, A.K., Morton, H.R.: Idempotents of Hecke algebras of type A. J. Knot Theory Ramifications 7, 463 (1998)
5. Beliakova, A., Blanchet, C.: Skein construction of idempotents in Birman-Murakami-Wenzl algebras. Math. Ann. 321, 347 (2001)
6. Birman, J.: New points of view in knot theory. Bull. Am. Math. Soc., New Ser. 28, 253 (1993)
7. Bouchard, V., Florea, B., Mariño, M.: Counting higher genus curves with crosscaps in Calabi-Yau orientifolds. JHEP 0412, 035 (2004)
8. Bouchard, V., Florea, B., Mariño, M.: Topological open string amplitudes on orientifolds. JHEP 0502, 002 (2005)
9. Chandrima, P., Pravina, B., Ramadevi, P.: Composite Invariants and Unoriented Topological String Amplitudes. http://arxiv.org/abs/1003.5282v1[hep-th], 2010
10. Chen, L., Chen, Q., Reshetikhin, N.: Orthogonal quantum group invariants of links (to appear)
11. Rama Devi, P., Govindarajan, T.R., Kaul, R.K.: Three-dimensional Chern-Simons theory as a theory of knots and links. 3. Compact semisimple group. Nucl. Phys. B 402, 548 (1993)
12. Freyd, P., Yetter, D., Hoste, J., Lickorish, W.B.R., Millett, K., Ocneanu, A.: A new polynomial invariant of knots and links. Bull. Amer. Math. Soc. 12, 239 (2002)
13. Gopakumar, R., Vafa, C.: M-theory and topological strings. I, II. http://arxiv.org/abs/hep-th/9809187v1, 1998 and http://arxiv.org/abs/hep-th/9812127v1, 1998
14. Gopakumar, R., Vafa, C.: On the gauge theory/geometry correspondence. Adv. Theor. Math. Phys. 3, 1415 (1999)
15. Gross, D.J., Taylor, W.: Two-dimensional QCD is a string theory. Nucl. Phys. B 400, 181 (1993)
16. Gukov, S., Schwarz, A.S., Vafa, C.: Khovanov-Rozansky homology and topological strings. Lett. Math. Phys. 74, 53 (2005)
17. Gukov, S., Walcher, J.: Matrix factorizations and Kauffman homology. http://arxiv.org/abs/hep-th/0512298v1, 2005
18. Habiro, K.: Brunnian links, claspers and Goussarov-Vassiliev finite type invariants. Math. Proc. Cambridge Philos. Soc. 142, 459 (2007)
19. Hadji, R.J., Morton, H.R.: A basis for the full Homfly skein of the annulus. Math. Proc. Cambridge Philos. Soc. 141, 81 (2006)
20. Kanenobu, T.: The first four terms of the Kauffman’s link polynomial. Kyungpook Math. J. 46, 509 (2006)
21. Kanenobu, T., Miyazawa, Y.: The second and third terms of the HOMFLY polynomial of a link. Kobe J. Math. 16, 147 (1999)
22. Kanno, H.: Universal character and large N factorization in topological gauge/string theory. Nucl. Phys. B 745, 165 (2006)
23. Kauffman, L.H.: An invariant of regular isotopy. Trans. Amer. Math. Soc. 318, 417 (1990)
24. Kauffman, L.H.: Knots and physics. Third edition, Singapore: World Scientific, 2001
25. Koike, K.: On the decomposition of tensor products of the representations of the classical groups: by means of the universal characters. Adv. Math. 74, 57 (1989)
26. Koshkin, S.: Conormal bundles to knots and the Gopakumar–Vafa conjecture. Adv. Theor. Math. Phys. 11, 591 (2007)
27. Krefl, D., Walcher, J.: The Real Topological String on a local Calabi-Yau. http://arxiv.org/abs/0902.0616v1[hep-th], 2009
28. Labastida, J.M.F., Llatas, P.M., Ramallo, A.V.: Knot operators in Chern-Simons gauge theory. Nucl. Phys. B 348, 651 (1991)
29. Labastida, J.M.F., Mariño, M.: Polynomial invariants for torus knots and topological strings. Commun. Math. Phys. 217, 423 (2001)
30. Labastida, J.M.F., Mariño, M.: A new point of view in the theory of knot and link invariants. J. Knot Theory Ramifications 11, 173 (2002)
31. Labastida, J.M.F., Mariño, M., Vafa, C.: Knots, links and branes at large N. JHEP 0011, 007 (2000)
32. Labastida, J.M.F., Pérez, E.: A Relation Between The Kauffman And The Homfly Polynomials For Torus Knots. J. Math. Phys. 37, 2013 (1996)
33. Lickorish, W.B.R.: An introduction to knot theory. Berlin-Heidelberg-New York: Springer-Verlag, 1997
34. Lickorish, W.B.R., Millett, K.C.: A polynomial invariant of oriented links. Topology 26, 107 (1987)
35. Lickorish, W.B.R., Millett, K.C.: The new polynomial invariants of knots and links. Math. Mag. 61, 3 (1988)
36. Lin, X.-S., Zheng, H.: On the Hecke algebras and the colored HOMFLY polynomial. http://arxiv.org/abs/math/0601267v1[math.QA], 2006
37. Liu, K., Peng, P.: Proof of the Labastida–Mariño–Ooguri–Vafa conjecture. http://arxiv.org/abs/0704.1526v3[math.QA], 2009
38. Macdonald, I.G.: Symmetric functions and Hall polynomials. Second edition, Oxford: Oxford University Press, 1995
39. Mariño, M.: Chern-Simons theory and topological strings. Rev. Mod. Phys. 77, 675 (2005)
40. Mariño, M., Vafa, C.: Framed knots at large N. http://arxiv.org/abs/hep-th/0108064v1, 2001
41. Morton, H.R.: Integrality of Homfly 1-tangle invariants. Algebr. Geom. Topol. 7, 327 (2007)
42. Morton, H.R., Hadji, R.J.: Homfly polynomials of generalized Hopf links. Algebr. Geom. Topol. 2, 11 (2002)
43. Morton, H.R., Ryder, N.D.A.: Relations between Kauffman and Homfly satellite invariants. Math. Proc. Phil. Soc. 149, 105–114 (2010)
44. Ooguri, H., Vafa, C.: Knot invariants and topological strings. Nucl. Phys. B 577, 419 (2000)
45. Pravina, B., Ramadevi, P.: SO(N) reformulated link invariants from topological strings. Nucl. Phys. B 727, 471 (2005)
46. Przytycki, J.H.: A note on the Lickorish–Millett–Turaev formula for the Kauffman polynomial. Proc. Amer. Math. Soc. 121, 645 (1994)
47. Przytycki, J.H., Taniyama, K.: The Kanenobu-Miyazawa conjecture and the Vassiliev-Gusarov skein modules based on mixed crossings. Proc. Amer. Math. Soc. 129, 2799 (2001)
48. Ramadevi, P., Sarkar, T.: On link invariants and topological string amplitudes. Nucl. Phys. B 600, 487 (2001)
49. Rudolph, L.: A congruence between link polynomials. Math. Proc. Cambridge Philos. Soc. 107, 319 (1990)
50. Ryder, N.D.A.: Skein based invariants and the Kauffman polynomial. Ph.D. Thesis, University of Liverpool, 2008
51. Sinha, S., Vafa, C.: SO and Sp Chern-Simons at large N. http://arxiv.org/abs/hep-th/0012136v1, 2000
52. Stevan, S.: Chern-Simons Invariants of Torus Knots and Links. http://arxiv.org/abs/1003.2861v1[hep-th], 2010
53. Witten, E.: Quantum field theory and the Jones polynomial. Commun. Math. Phys. 121, 351 (1989)
54. Witten, E.: Chern-Simons Gauge Theory As A String Theory. Prog. Math. 133, 637 (1995)

Communicated by A. Kapustin
Commun. Math. Phys. 298, 645–672 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1087-7
Irreducible Characters of General Linear Superalgebra and Super Duality

Shun-Jen Cheng^1, Ngau Lam^2

1 Institute of Mathematics, Academia Sinica, Taipei 10617, Taiwan. E-mail: [email protected]
2 Department of Mathematics, National Cheng-Kung University, Tainan 70101, Taiwan. E-mail: [email protected]

Received: 5 August 2009 / Accepted: 16 April 2010
Published online: 9 July 2010 – © Springer-Verlag 2010
Abstract: We develop a new method to solve the irreducible character problem for a wide class of modules over the general linear superalgebra, including all the finite-dimensional modules, by directly relating the problem to the classical Kazhdan-Lusztig theory. Furthermore, we prove that certain parabolic BGG categories over the general linear algebra and over the general linear superalgebra are equivalent. We also verify a parabolic version of a conjecture of Brundan on the irreducible characters in the BGG category of the general linear superalgebra.
First author partially supported by an NSC-grant and an Academia Sinica Investigator grant. Second author partially supported by an NSC-grant.

1. Introduction

The problem of finding the finite-dimensional irreducible characters of simple Lie superalgebras was first posed in [K1,K2]. This problem turned out to be one of the most challenging problems in the theory of Lie superalgebras, and in the type A case was first solved by Serganova [Se]. Later on, inspired by [LLT], Brundan in [B] provided an elegant new solution of the problem. To be more precise, Brundan in [B, Conjecture 4.32 and (4.35)] gave a conjectural character formula for every irreducible highest weight $\mathfrak{gl}(m|n)$-module in the Bernstein-Gelfand-Gelfand category $\mathcal{O}$ in terms of certain Brundan-Kazhdan-Lusztig polynomials. The validity of the conjecture would imply a remarkable formulation of the Kazhdan-Lusztig theory of $\mathcal{O}$ in terms of canonical and dual canonical bases on a certain Fock space. Brundan then solved the finite-dimensional irreducible character problem by verifying the conjecture for the subcategory of finite-dimensional $\mathfrak{gl}(m|n)$-modules, in which case the Fock space is $E^{m|n}$ (see Sect. 4.3). One of the main purposes of the present paper is to establish Brundan’s conjecture for a substantially larger subcategory of $\mathcal{O}$ of $\mathfrak{gl}(m|n)$-modules, which includes all the finite-dimensional ones. We note that a similar Fock space formulation is known among experts for modules of the general linear algebra $\mathfrak{gl}(m+n)$ in the category $\mathcal{O}$, and in particular for modules in the maximal parabolic subcategory corresponding to the Levi subalgebra $\mathfrak{gl}(m) \oplus \mathfrak{gl}(n)$, in which case the Fock space is $E^{m+n}$ (see Sect. 4.3).

Let $\mathfrak{g}$ and $\overline{\mathfrak{g}}$ denote direct limits of general linear algebras $\mathfrak{gl}(m+n)$ and of general linear superalgebras $\mathfrak{gl}(m|n)$, respectively, as $n \to \infty$ (see Sect. 2.2 and Sect. 2.3). Motivated by [B] it was shown in [CWZ] that in the limit $n \to \infty$ the Fock spaces $E^{m+n}$ and $E^{m|n}$ have compatible canonical and dual canonical bases, and that the Kazhdan-Lusztig polynomials in $E^{m+\infty}$ and $E^{m|\infty}$ can be identified. Now the Kazhdan-Lusztig polynomials of $E^{m+\infty}$ describe the $\mathfrak{g}$-module category $\mathcal{O}^f_{[-m,-2]}$ whose objects are the direct limits of modules in the above-mentioned maximal parabolic subcategory, while those of $E^{m|\infty}$ describe the $\overline{\mathfrak{g}}$-module category $\overline{\mathcal{O}}^f_{[-m,-2]}$ whose objects are direct limits of finite-dimensional $\mathfrak{gl}(m|n)$-modules. From this and Brundan’s formulation it follows that the classical (parabolic) Kazhdan-Lusztig polynomials of the general linear algebra also give a solution to the finite-dimensional irreducible character problem for the general linear superalgebra.

Motivated by [CWZ], Wang and the first author in [CW] compare a more general parabolic $\mathfrak{g}$-module category $\mathcal{O}^f_Y$ with a corresponding $\overline{\mathfrak{g}}$-module category $\overline{\mathcal{O}}^f_Y$, where $Y$ here is any subset of $[-m,-2]$ (see Sects. 2.2, 2.3, and Remark 3.13). A precise statement of the parabolic Brundan conjecture ([B, Conj. 4.32]) on the character of irreducible $\overline{\mathfrak{g}}$-modules in $\overline{\mathcal{O}}^f_Y$ was given in [CW, Conj. 3.10]. The results in [CWZ,CW] suggest a direct connection between the categories $\mathcal{O}^f_Y$ and $\overline{\mathcal{O}}^f_Y$. In fact, the categories $\mathcal{O}^f_Y$ and $\overline{\mathcal{O}}^f_Y$ are conjectured to be equivalent in [CW, Conj. 4.18], which was referred to as super duality.

The purpose of the present paper is to establish this super duality. Our main idea is the introduction of a bigger Lie superalgebra $\widetilde{\mathfrak{g}}$ (Sect. 2.1), which contains and interpolates $\mathfrak{g}$ and $\overline{\mathfrak{g}}$. We then study a corresponding category $\widetilde{\mathcal{O}}^f_Y$ of $\widetilde{\mathfrak{g}}$-modules and define certain truncation functors $T : \widetilde{\mathcal{O}}^f_Y \to \mathcal{O}^f_Y$ and $\overline{T} : \widetilde{\mathcal{O}}^f_Y \to \overline{\mathcal{O}}^f_Y$ (Sect. 3.2). These functors are shown to send parabolic Verma $\widetilde{\mathfrak{g}}$-modules to the respective parabolic Verma $\mathfrak{g}$- and $\overline{\mathfrak{g}}$-modules, and furthermore irreducible $\widetilde{\mathfrak{g}}$-modules to the respective irreducible $\mathfrak{g}$- and $\overline{\mathfrak{g}}$-modules. From this we obtain in Theorem 3.16 a solution of the irreducible character problem for $\overline{\mathfrak{g}}$-modules in $\overline{\mathcal{O}}^f_Y$. The solution of the irreducible character problem then allows us to compare the Kazhdan-Lusztig polynomials in $\mathcal{O}^f_Y$ with those in $\overline{\mathcal{O}}^f_Y$. This then enables us to prove in Theorem 5.1 that the functors $T$ and $\overline{T}$ define equivalences of categories, from which super duality follows.

Note that a special case of the super duality conjecture was already formulated for the categories $\mathcal{O}^f_{[-m,-2]}$ and $\overline{\mathcal{O}}^f_{[-m,-2]}$ in [CWZ, Conj. 6.10]. A proof of this special case was announced recently in [BS], with a proof to appear in a sequel of [BS]. Our method differs significantly from that of Brundan and Stroppel, as ours is independent of [B]. Furthermore our approach enables us to explicitly construct functors inducing this equivalence, and it is applicable to more general module categories.

We want to emphasize that, in contrast to [CWZ], the arguments presented in this article do not depend on [B] or [Se], and hence Theorem 3.16 also gives an independent solution to the finite-dimensional irreducible character problem for the general linear superalgebra as a special case. By directly relating the irreducible character problem of Lie superalgebras to that of Lie algebras our solution of the problem becomes
surprisingly elementary. Our method is applicable to other finite and infinite-dimensional superalgebras, e.g. the ortho-symplectic Lie superalgebras [CLW].

This article is organized as follows. In Sect. 2 the Lie superalgebras $\tilde{\mathfrak g}$, $\mathfrak g$ and $\bar{\mathfrak g}$ are defined, together with the module categories $\tilde{\mathcal O}^f_Y$, $\mathcal O^f_Y$ and $\bar{\mathcal O}^f_Y$. In Sect. 3 the main tool, odd reflections [LSS], of making connections between these categories is introduced, and the crucial Lemma 3.2 is proved. In Sect. 4 we show that the Kazhdan-Lusztig polynomials of these categories coincide, from which we then derive in Sect. 5 the equivalence of these categories.

We conclude this Introduction by setting the notation to be used throughout this article. The symbols $\mathbb Z$, $\mathbb N$, and $\mathbb Z_+$ stand for the sets of all, positive and non-negative integers, respectively. For $m\in\mathbb Z$ we set $\langle m\rangle := m$ if $m>0$, and $\langle m\rangle := 0$ otherwise. For integers $a < b$ we set $[a,b] := \{a, a+1, \ldots, b\}$. Let $\mathcal P$ denote the set of partitions. For $\lambda\in\mathcal P$ we denote by $\lambda'$ the transpose partition of $\lambda$, by $\ell(\lambda)$ the length of $\lambda$ and by $s_\lambda(y_1,y_2,\ldots)$ the Schur function in the indeterminates $y_1,y_2,\ldots$ associated with $\lambda$. For a super space $V=V_{\bar 0}\oplus V_{\bar 1}$ and a homogeneous element $v\in V$, we use the notation $|v|$ to denote the $\mathbb Z_2$-degree of $v$. Let $U(\mathfrak g)$ denote the universal enveloping algebra of a Lie (super)algebra $\mathfrak g$. Finally all vector spaces, algebras, tensor products, et cetera, are over the field of complex numbers $\mathbb C$.
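2. The Lie Superalgebras $\tilde{\mathfrak g}$, $\mathfrak g$ and $\bar{\mathfrak g}$

2.1. The Lie superalgebra $\tilde{\mathfrak g}$. For $m\in\mathbb N$, let $\tilde V$ denote the complex super space with homogeneous basis $\{v_r \mid r\in[-m,-1]\cup\tfrac12\mathbb N\}$. The $\mathbb Z_2$-gradation is determined by $|v_r|=\bar 1$ for $r\in\tfrac12+\mathbb Z_+$, and $|v_i|=\bar 0$ for $i\in[-m,-1]\cup\mathbb N$. We denote by $\tilde{\mathfrak g}$ the Lie superalgebra of endomorphisms of $\tilde V$ vanishing on all but finitely many $v_r$'s. For $r,s,p\in[-m,-1]\cup\tfrac12\mathbb N$, let $E_{rs}$ denote the endomorphism defined by $E_{rs}(v_p) := \delta_{sp}v_r$. Then $\tilde{\mathfrak g}$ equals the Lie superalgebra spanned by these $E_{rs}$'s. Let $\tilde{\mathfrak g}^{<0}$ be the subalgebra isomorphic to the general linear algebra $\mathfrak{gl}(m)$ spanned by $E_{ij}$, $i,j\in[-m,-1]$.

Let $\tilde{\mathfrak h}$ stand for the Cartan subalgebra spanned by the $E_{rr}$'s and let $\{\epsilon_r\in\tilde{\mathfrak h}^* \mid r\in[-m,-1]\cup\tfrac12\mathbb N\}$ be the basis dual to $\{E_{rr}\in\tilde{\mathfrak h} \mid r\in[-m,-1]\cup\tfrac12\mathbb N\}$. Let $\tilde\Pi$ denote the simple roots
$$\{\alpha_{-m} := \epsilon_{-m}-\epsilon_{-m+1},\ \ldots,\ \alpha_{-1} := \epsilon_{-1}-\epsilon_{1/2}\}\cup\{\alpha_r := \epsilon_r-\epsilon_{r+1/2} \mid r\in\tfrac12\mathbb N\}.$$
The corresponding Dynkin diagram is

(D1) [Dynkin diagram: a chain whose nodes, from left to right, are $\alpha_{-m},\ \alpha_{-m+1},\ \ldots,\ \alpha_{-2},\ \alpha_{-1},\ \alpha_{1/2},\ \ldots,\ \alpha_n,\ \ldots$]

For $\alpha\in\tilde\Pi$ let $\alpha^\vee$ denote the simple coroot corresponding to $\alpha$. Explicitly, we have $\alpha_{-1}^\vee = E_{-1,-1}+E_{1/2,1/2}$, $\alpha_{1/2}^\vee = -E_{1/2,1/2}-E_{11}$, $\alpha_1^\vee = E_{11}+E_{3/2,3/2}$, $\alpha_{3/2}^\vee = -E_{3/2,3/2}-E_{22}$, et cetera. Given $\alpha\in\tilde{\mathfrak h}^*$, let $\tilde{\mathfrak g}_\alpha := \{x\in\tilde{\mathfrak g} \mid [h,x]=\alpha(h)x,\ \forall h\in\tilde{\mathfrak h}\}$ and let $\tilde\Delta$ denote the set of all roots. The positive roots with respect to $\tilde\Pi$ will be denoted by $\tilde\Delta_+$, while $\tilde{\mathfrak b}$ denotes the Borel subalgebra with respect to $\tilde\Delta_+$. Let $\tilde{\mathfrak n} := [\tilde{\mathfrak b},\tilde{\mathfrak b}]$ and let $\tilde{\mathfrak n}_-$ be the opposite nilradical.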
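For any subset $Y\subseteq[-m,-2]$ (including $Y=\emptyset$), define
$$\tilde{\mathfrak l}_Y := \tilde{\mathfrak h}\oplus\Big(\bigoplus_{\alpha\in\tilde\Delta_Y}\tilde{\mathfrak g}_\alpha\Big),\qquad \tilde{\mathfrak u}_Y := \bigoplus_{\alpha\in\tilde\Delta_+\setminus(\tilde\Delta_Y)_+}\tilde{\mathfrak g}_\alpha,\qquad (\tilde{\mathfrak u}_Y)_- := \bigoplus_{\alpha\in\tilde\Delta_+\setminus(\tilde\Delta_Y)_+}\tilde{\mathfrak g}_{-\alpha},\qquad \tilde{\mathfrak p}_Y := \tilde{\mathfrak l}_Y\oplus\tilde{\mathfrak u}_Y,$$
where $\tilde\Delta_Y := \tilde\Delta\cap\big(\bigoplus_{r\in Y\cup\frac12\mathbb N}\mathbb Z\alpha_r\big)$ and $(\tilde\Delta_Y)_+ := \tilde\Delta_+\cap\tilde\Delta_Y$. Then $\tilde{\mathfrak p}_Y$ is a parabolic subalgebra of $\tilde{\mathfrak g}$ with Levi subalgebra $\tilde{\mathfrak l}_Y$ and nilpotent radical $\tilde{\mathfrak u}_Y$. Set $\tilde{\mathfrak p}^{<0}_Y := \tilde{\mathfrak g}^{<0}\cap\tilde{\mathfrak p}_Y$ and $\tilde{\mathfrak l}^{<0}_Y := \tilde{\mathfrak g}^{<0}\cap\tilde{\mathfrak l}_Y$.

Given $\lambda\in\tilde{\mathfrak h}^*$, we denote by $L(\tilde{\mathfrak l}_Y,\lambda)$ the irreducible $\tilde{\mathfrak l}_Y$-module of highest weight $\lambda$ with respect to $\tilde{\mathfrak l}_Y\cap\tilde{\mathfrak b}$, which we may regard as an irreducible $\tilde{\mathfrak p}_Y$-module in the usual way. Define the parabolic Verma $\tilde{\mathfrak g}$-module
$$\tilde K(\lambda) := \mathrm{Ind}^{\tilde{\mathfrak g}}_{\tilde{\mathfrak p}_Y} L(\tilde{\mathfrak l}_Y,\lambda).$$
Let $\tilde L(\lambda)$ be the irreducible $\tilde{\mathfrak g}$-module of highest weight $\lambda$ with respect to $\tilde{\mathfrak b}$. Set
$$\mathcal P_Y := \Big\{\lambda=(\lambda_{-m},\lambda_{-m+1},\ldots,\lambda_{-1},\lambda_1,\lambda_2,\ldots) \ \Big|\ \lambda_i\in\mathbb Z,\ \forall i;\ \Big\langle\sum_{i=-m}^{-1}\lambda_i\epsilon_i,\alpha_j^\vee\Big\rangle\in\mathbb Z_+,\ \forall j\in Y;\ (\lambda_1,\lambda_2,\cdots)\in\mathcal P\Big\}.$$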
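For $\lambda\in\mathcal P_Y$ we let $\lambda_+ := (\lambda_1,\lambda_2,\cdots)\in\mathcal P$ and $\lambda^{<0} := \sum_{i=-m}^{-1}\lambda_i\epsilon_i$. Given a partition $\mu=(\mu_1,\mu_2,\cdots)$ we set $\theta(\mu)$ to be the sequence of integers
$$\theta(\mu) := (\theta(\mu)_{1/2},\theta(\mu)_1,\theta(\mu)_{3/2},\theta(\mu)_2,\cdots),$$
where $\theta(\mu)_{i-1/2} := \langle\mu'_i-(i-1)\rangle$ and $\theta(\mu)_i := \langle\mu_i-i\rangle$, for $i\in\mathbb N$. We recall that $\langle\cdot\rangle$ is defined at the end of the Introduction.

Example 2.1. For the partition $\lambda=(7,6,3,3,1)$, we have $\theta(\lambda)=(5,6,3,4,2,0,0,\cdots)$. This can be read off the Young diagram of $\lambda$ as follows:

[Figure: Young diagram of $\lambda=(7,6,3,3,1)$, marked with the entries $5, 6, 3, 4, 2$ of $\theta(\lambda)$.]

Set
$$\tilde{\mathcal P}_Y := \Big\{\lambda^{<0}+\sum_{r\in\frac12\mathbb N}\theta(\lambda_+)_r\,\epsilon_r\in\tilde{\mathfrak h}^* \ \Big|\ \lambda=(\lambda_i)\in\mathcal P_Y\Big\}.$$
For a semisimple $\tilde{\mathfrak h}$-module $\tilde M$ and $\gamma\in\tilde{\mathfrak h}^*$, we define $\tilde M_\gamma := \{m\in\tilde M \mid hm=\gamma(h)m,\ \forall h\in\tilde{\mathfrak h}\}$.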
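The sequence θ(μ) used in Example 2.1 is straightforward to compute. The following is a minimal sketch, assuming partitions are given as weakly decreasing lists of positive integers; the function names `transpose`, `bracket` and `theta` are ours and do not come from the article.

```python
def transpose(partition):
    """Return the transpose (conjugate) partition."""
    if not partition:
        return []
    return [sum(1 for part in partition if part > j) for j in range(partition[0])]


def bracket(m):
    """<m> := m if m > 0, and 0 otherwise (notation from the Introduction)."""
    return m if m > 0 else 0


def theta(partition, terms):
    """First `terms` entries of theta(mu) = (theta_{1/2}, theta_1, theta_{3/2}, ...)."""
    mu = list(partition)
    mu_t = transpose(mu)

    def part(seq, i):
        # 1-indexed access, padded with zeros beyond the length of the partition
        return seq[i - 1] if i <= len(seq) else 0

    out = []
    i = 1
    while len(out) < terms:
        out.append(bracket(part(mu_t, i) - (i - 1)))  # theta(mu)_{i-1/2}
        out.append(bracket(part(mu, i) - i))          # theta(mu)_i
        i += 1
    return out[:terms]


print(theta([7, 6, 3, 3, 1], 7))
```

For the partition of Example 2.1 this reproduces the first seven entries of θ(λ) stated there.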
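2.2. The subalgebra $\mathfrak g$. We denote by $\mathfrak g$ the subalgebra of $\tilde{\mathfrak g}$ generated by $E_{rs}$, $r,s\in[-m,-1]\cup\mathbb N$. The Cartan subalgebra $\mathfrak h$ has basis $\{E_{ii} \mid i\in[-m,-1]\cup\mathbb N\}$, with dual basis $\{\epsilon_i \mid i\in[-m,-1]\cup\mathbb N\}$. The corresponding Dynkin diagram is

[Dynkin diagram: a chain whose nodes, from left to right, are $\alpha_{-m},\ \alpha_{-m+1},\ \ldots,\ \alpha_{-2},\ \beta_{-1},\ \beta_1,\ \ldots,\ \beta_n,\ \ldots$]

where $\beta_{-1} := \epsilon_{-1}-\epsilon_1$, $\beta_i := \epsilon_i-\epsilon_{i+1}$, $i\ge 1$.

Set $\mathfrak b := \tilde{\mathfrak b}\cap\mathfrak g$, $\mathfrak n := \tilde{\mathfrak n}\cap\mathfrak g$ and $\mathfrak n_- := \tilde{\mathfrak n}_-\cap\mathfrak g$. Set $\mathfrak l_Y := \tilde{\mathfrak l}_Y\cap\mathfrak g$, $\mathfrak p_Y := \tilde{\mathfrak p}_Y\cap\mathfrak g$, $\mathfrak u_Y := \tilde{\mathfrak u}_Y\cap\mathfrak g$, and $(\mathfrak u_Y)_- := (\tilde{\mathfrak u}_Y)_-\cap\mathfrak g$. Given $\lambda\in\mathfrak h^*$, we denote by $L(\mathfrak l_Y,\lambda)$ the irreducible $\mathfrak l_Y$-module of highest weight $\lambda$ with respect to $\mathfrak l_Y\cap\mathfrak b$, which we may regard as an irreducible $\mathfrak p_Y$-module in the usual way. Define the parabolic Verma $\mathfrak g$-module
$$K(\lambda) := \mathrm{Ind}^{\mathfrak g}_{\mathfrak p_Y} L(\mathfrak l_Y,\lambda).$$
Let $L(\lambda)$ be the irreducible $\mathfrak g$-module of highest weight $\lambda$ with respect to $\mathfrak b$. As usual, we identify $\mathcal P_Y$ with the following set of weights in $\mathfrak h^*$:
$$\mathcal P_Y = \Big\{\lambda^{<0}+\sum_{j\in\mathbb N}\lambda_j\epsilon_j\in\mathfrak h^* \ \Big|\ \lambda=(\lambda_i)\in\mathcal P_Y\Big\}.$$
For a semisimple $\mathfrak h$-module $M$ and $\gamma\in\mathfrak h^*$, we define $M_\gamma := \{m\in M \mid hm=\gamma(h)m,\ \forall h\in\mathfrak h\}$.

2.3. The subalgebra $\bar{\mathfrak g}$. We denote by $\bar{\mathfrak g}$ the subalgebra of $\tilde{\mathfrak g}$ generated by $E_{rs}$, $r,s\in[-m,-1]\cup(\tfrac12+\mathbb Z_+)$. The Cartan subalgebra $\bar{\mathfrak h}$ has basis $\{E_{rr} \mid r\in[-m,-1]\cup(\tfrac12+\mathbb Z_+)\}$, with dual basis $\{\epsilon_r \mid r\in[-m,-1]\cup(\tfrac12+\mathbb Z_+)\}$. The corresponding Dynkin diagram is

[Dynkin diagram: a chain whose nodes, from left to right, are $\alpha_{-m},\ \alpha_{-m+1},\ \ldots,\ \alpha_{-2},\ \alpha_{-1},\ \beta_{1/2},\ \ldots,\ \beta_{n+1/2},\ \ldots$]

where $\beta_{i-1/2} := \epsilon_{i-1/2}-\epsilon_{i+1/2}$, $i\ge 1$.

Set $\bar{\mathfrak b} := \tilde{\mathfrak b}\cap\bar{\mathfrak g}$, $\bar{\mathfrak n} := \tilde{\mathfrak n}\cap\bar{\mathfrak g}$ and $\bar{\mathfrak n}_- := \tilde{\mathfrak n}_-\cap\bar{\mathfrak g}$. Set $\bar{\mathfrak l}_Y := \tilde{\mathfrak l}_Y\cap\bar{\mathfrak g}$, $\bar{\mathfrak p}_Y := \tilde{\mathfrak p}_Y\cap\bar{\mathfrak g}$, $\bar{\mathfrak u}_Y := \tilde{\mathfrak u}_Y\cap\bar{\mathfrak g}$, and $(\bar{\mathfrak u}_Y)_- := (\tilde{\mathfrak u}_Y)_-\cap\bar{\mathfrak g}$. Given $\lambda\in\bar{\mathfrak h}^*$ let $L(\bar{\mathfrak l}_Y,\lambda)$ be the irreducible $\bar{\mathfrak l}_Y$-module of highest weight $\lambda$ with respect to $\bar{\mathfrak l}_Y\cap\bar{\mathfrak b}$, which we may regard as an irreducible $\bar{\mathfrak p}_Y$-module. Define the parabolic Verma $\bar{\mathfrak g}$-module
$$\bar K(\lambda) := \mathrm{Ind}^{\bar{\mathfrak g}}_{\bar{\mathfrak p}_Y} L(\bar{\mathfrak l}_Y,\lambda).$$
Let $\bar L(\lambda)$ be the irreducible $\bar{\mathfrak g}$-module of highest weight $\lambda$ with respect to $\bar{\mathfrak b}$.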
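Let
$$\bar{\mathcal P}_Y := \Big\{\lambda^{<0}+\sum_{i\in\mathbb N}(\lambda'_+)_i\,\epsilon_{i-\frac12}\in\bar{\mathfrak h}^* \ \Big|\ \lambda=(\lambda_i)\in\mathcal P_Y\Big\}.$$
For a semisimple $\bar{\mathfrak h}$-module $\bar M$ and $\gamma\in\bar{\mathfrak h}^*$, we define $\bar M_\gamma := \{m\in\bar M \mid hm=\gamma(h)m,\ \forall h\in\bar{\mathfrak h}\}$.

2.4. Parametrization for $\mathcal P_Y$, $\bar{\mathcal P}_Y$ and $\tilde{\mathcal P}_Y$. The set $\mathcal P_Y$ parameterizes the sets $\mathcal P_Y$, $\bar{\mathcal P}_Y$ and $\tilde{\mathcal P}_Y$. From now on we will use the following notation. For $\lambda=(\lambda_i)\in\mathcal P_Y$, let
$$\lambda := \lambda^{<0}+\sum_{i=1}^{\infty}\lambda_i\epsilon_i\in\mathcal P_Y,$$
$$\lambda^\natural := \lambda^{<0}+\sum_{i=1}^{\infty}(\lambda'_+)_i\,\epsilon_{i-1/2}\in\bar{\mathcal P}_Y,$$
$$\lambda^\theta := \lambda^{<0}+\sum_{r\in\frac12\mathbb N}\theta(\lambda_+)_r\,\epsilon_r\in\tilde{\mathcal P}_Y.$$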
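Example 2.2. Let $m=3$, $Y=\emptyset$ and $\lambda=(-5,2,-3,7,6,3,3,1)$ (cf. Example 2.1). We have
$$\lambda = -5\epsilon_{-3}+2\epsilon_{-2}-3\epsilon_{-1}+7\epsilon_1+6\epsilon_2+3\epsilon_3+3\epsilon_4+\epsilon_5,$$
$$\lambda^\natural = -5\epsilon_{-3}+2\epsilon_{-2}-3\epsilon_{-1}+5\epsilon_{1/2}+4\epsilon_{3/2}+4\epsilon_{5/2}+2\epsilon_{7/2}+2\epsilon_{9/2}+2\epsilon_{11/2}+\epsilon_{13/2},$$
$$\lambda^\theta = -5\epsilon_{-3}+2\epsilon_{-2}-3\epsilon_{-1}+5\epsilon_{1/2}+6\epsilon_1+3\epsilon_{3/2}+4\epsilon_2+2\epsilon_{5/2}.$$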
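The three weights attached to an element of the parameter set can be computed mechanically. The sketch below is an illustration only: it encodes a weight as a dictionary from the index r of ε_r to its coefficient, and the helper name `three_weights` and this encoding are our assumptions, not notation from the article.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def transpose(partition):
    """Return the transpose (conjugate) partition."""
    if not partition:
        return []
    return [sum(1 for part in partition if part > j) for j in range(partition[0])]


def three_weights(lam_neg, lam_pos):
    """Return (lambda, lambda^natural, lambda^theta) as dicts {r: coefficient of eps_r}.

    lam_neg = (lambda_{-m}, ..., lambda_{-1}); lam_pos = the partition lambda_+.
    Zero coefficients are omitted.
    """
    m = len(lam_neg)
    base = {Fraction(-m + k): c for k, c in enumerate(lam_neg) if c != 0}
    lam_t = transpose(list(lam_pos))

    lam = dict(base)
    for i, part in enumerate(lam_pos, start=1):
        lam[Fraction(i)] = part                      # lambda_i * eps_i

    natural = dict(base)
    for i, part in enumerate(lam_t, start=1):
        natural[i - HALF] = part                     # (lambda_+')_i * eps_{i-1/2}

    theta = dict(base)
    for i in range(1, max(len(lam_pos), len(lam_t)) + 1):
        col = lam_t[i - 1] if i <= len(lam_t) else 0
        row = lam_pos[i - 1] if i <= len(lam_pos) else 0
        if col - (i - 1) > 0:
            theta[i - HALF] = col - (i - 1)          # <lambda'_i - (i-1)> * eps_{i-1/2}
        if row - i > 0:
            theta[Fraction(i)] = row - i             # <lambda_i - i> * eps_i
    return lam, natural, theta


lam, natural, theta = three_weights((-5, 2, -3), (7, 6, 3, 3, 1))
for name, weight in (("lambda", lam), ("natural", natural), ("theta", theta)):
    print(name, {str(r): c for r, c in sorted(weight.items())})
```

With the data of Example 2.2 (m = 3) the three dictionaries agree with the three expansions displayed there.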
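2.5. Categories of $\tilde{\mathfrak g}$-, $\mathfrak g$- and $\bar{\mathfrak g}$-modules. Let $\tilde{\mathcal O}_Y$ (respectively $\mathcal O_Y$, $\bar{\mathcal O}_Y$) be the category of $\tilde{\mathfrak g}$- (respectively $\mathfrak g$-, $\bar{\mathfrak g}$-)modules $\tilde M$ such that $\tilde M$ is a semisimple $\tilde{\mathfrak h}$- (respectively $\mathfrak h$-, $\bar{\mathfrak h}$-)module and $\dim\tilde M_\gamma<\infty$ for each $\gamma\in\tilde{\mathfrak h}^*$ (respectively $\mathfrak h^*$, $\bar{\mathfrak h}^*$) satisfying

(i) $\tilde M$ decomposes over $\tilde{\mathfrak l}_Y$ (respectively $\mathfrak l_Y$, $\bar{\mathfrak l}_Y$) into a direct sum of $L(\tilde{\mathfrak l}_Y,\mu^\theta)$ (respectively $L(\mathfrak l_Y,\mu)$, $L(\bar{\mathfrak l}_Y,\mu^\natural)$), $\mu\in\mathcal P_Y$,

(ii) $\tilde M$ has a filtration of $\tilde{\mathfrak g}$- (respectively $\mathfrak g$-, $\bar{\mathfrak g}$-)modules $\tilde M=\tilde M_0\supseteq\tilde M_1\supseteq\tilde M_2\supseteq\cdots$ such that for all $i\ge 0$ we have $\tilde M_i/\tilde M_{i+1}\cong\tilde L(\nu_i^\theta)$ (respectively $L(\nu_i)$, $\bar L(\nu_i^\natural)$), for some $\nu_i\in\mathcal P_Y$.

The morphisms in $\mathcal O_Y$ are $\mathfrak g$-homomorphisms. The morphisms in $\tilde{\mathcal O}_Y$ (respectively $\bar{\mathcal O}_Y$) are (not necessarily even) $\tilde{\mathfrak g}$- (respectively $\bar{\mathfrak g}$-)homomorphisms. Clearly $K(\lambda)$, for $\lambda\in\mathcal P_Y$, lies in $\mathcal O_Y$. By [CK, Theorem 3.2] and [CK, Theorem 3.1], $\tilde K(\lambda^\theta)$ and $\bar K(\lambda^\natural)$, for $\lambda\in\mathcal P_Y$, lie in $\tilde{\mathcal O}_Y$ and $\bar{\mathcal O}_Y$, respectively.

Let $\tilde{\mathcal O}^f_Y$ (respectively $\mathcal O^f_Y$, $\bar{\mathcal O}^f_Y$) denote the full subcategory of $\tilde{\mathcal O}_Y$ (respectively $\mathcal O_Y$, $\bar{\mathcal O}_Y$) consisting of objects having finite composition series.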
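Set $\tilde\Xi := \sum_{j=-m}^{-1}\mathbb Z\epsilon_j+\sum_{r\in\frac12\mathbb N}\mathbb Z\epsilon_r$. Let $\tilde V$ be a semisimple $\tilde{\mathfrak h}$-module such that $\tilde V=\bigoplus_{\gamma\in\tilde\Xi}\tilde V_\gamma$. Then $\tilde V$ is a $\mathbb Z_2$-graded vector space $\tilde V=\tilde V_{\bar 0}\oplus\tilde V_{\bar 1}$ such that
$$\tilde V_{\bar 0} := \bigoplus_{\mu\in\tilde\Xi_{\bar 0}}\tilde V_\mu\quad\text{and}\quad\tilde V_{\bar 1} := \bigoplus_{\mu\in\tilde\Xi_{\bar 1}}\tilde V_\mu, \qquad (2.1)$$
where $\tilde\Xi_\varepsilon := \{\mu\in\tilde\Xi \mid \sum_{r\in 1/2+\mathbb Z_+}\mu(E_{r,r})\equiv\varepsilon \pmod 2\}$.

Set $\bar\Xi := \sum_{j=-m}^{-1}\mathbb Z\epsilon_j+\sum_{r\in\frac12+\mathbb Z_+}\mathbb Z\epsilon_r$. Let $\bar V$ be a semisimple $\bar{\mathfrak h}$-module such that $\bar V=\bigoplus_{\gamma\in\bar\Xi}\bar V_\gamma$. Then $\bar V$ is a $\mathbb Z_2$-graded vector space $\bar V=\bar V_{\bar 0}\oplus\bar V_{\bar 1}$ such that
$$\bar V_{\bar 0} := \bigoplus_{\mu\in\bar\Xi_{\bar 0}}\bar V_\mu\quad\text{and}\quad\bar V_{\bar 1} := \bigoplus_{\mu\in\bar\Xi_{\bar 1}}\bar V_\mu, \qquad (2.2)$$
where $\bar\Xi_\varepsilon := \{\mu\in\bar\Xi \mid \sum_{r\in\frac12+\mathbb Z_+}\mu(E_{r,r})\equiv\varepsilon \pmod 2\}$.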
Y , let M Y denote the ∈ O ∈ O It is clear that OY is an abelian category. For M equipped with the Z2 -gradation given by (2.1). The Z2 -gradation given by g-module M Y , and N ∈ O (2.2) is compatible with the g-action. Therefore, for M, ϕ ∈ HomO ϕ have structures of Z2 -graded vector Y ( M, N ), the kernel and the cokernel of spaces defined by (2.1). The induced g-actions on the kernel and cokernel are compatible Y , and hence with this Z2 -gradation. Thus the kernel and the cokernel of ϕ belong to O f OY and OY are abelian categories. Note that the homomorphic image of ϕ may not be . Similarly, the Z2 -gradation given by (2.2) of any object M ∈ OY a g-submodule of N f is compatible with its g-action. Moreover, OY and OY are abelian categories. 0¯ and O f,0¯ to be the full subcategories of O Y and O f , respectively, We define O Y Y Y consisting of objects with Z2 -gradations given by (2.1) (cf. [B, §4-e]). Note that the 0¯ and O f,0¯ are of degree 0. Y , it is clear that M ∈ O is iso¯ For M morphisms in O Y Y ¯ 0 morphic to M in OY . Thus OY and OY have isomorphic skeletons and hence they are f and O f,0¯ are equivalent categories. equivalent categories. Similarly, O 0¯
Analogously define $\bar{\mathcal O}^{\bar 0}_Y$ and $\bar{\mathcal O}^{f,\bar 0}_Y$ to be the respective full subcategories of $\bar{\mathcal O}_Y$ and $\bar{\mathcal O}^f_Y$ consisting of objects with $\mathbb Z_2$-gradations given by (2.2). Similarly the morphisms in $\bar{\mathcal O}^{\bar 0}_Y$ and $\bar{\mathcal O}^{f,\bar 0}_Y$ are of degree $\bar 0$. Moreover, $\bar{\mathcal O}_Y$ and $\bar{\mathcal O}^{\bar 0}_Y$ are equivalent categories, and $\bar{\mathcal O}^f_Y$ and $\bar{\mathcal O}^{f,\bar 0}_Y$ are equivalent categories.

3. Odd Reflection and Character Formulae

We shall briefly explain the effect of an odd reflection on the highest weight of an irreducible module that was studied in [PS, Lemma 1] (see also [KW, Lemma 1.4]). Fix a Borel subalgebra $\mathcal B$ with corresponding set of positive roots $\Delta_+(\mathcal B)$. Let $\alpha$ be an isotropic odd simple root and $\alpha^\vee$ be its corresponding coroot. Applying the odd reflection with respect to $\alpha$ changes the Borel subalgebra $\mathcal B$ into a new Borel subalgebra $\mathcal B(\alpha)$ with corresponding set of positive roots $\Delta_+(\mathcal B(\alpha)) = \{-\alpha\}\cup\Delta_+(\mathcal B)\setminus\{\alpha\}$. Now let $\lambda$ be the highest weight with respect to $\mathcal B$ of an irreducible module. If $\langle\lambda,\alpha^\vee\rangle\neq 0$, then the highest weight of this irreducible module with respect to $\mathcal B(\alpha)$ is $\lambda-\alpha$. If $\langle\lambda,\alpha^\vee\rangle=0$, then the highest weight remains unchanged. In the sequel we will sometimes refer to the highest weight with respect to the new Borel subalgebra as the new highest weight.
3.1. Odd reflection and a fundamental lemma.

3.1.1. A sequence of odd reflections and $\tilde\Pi^c(n)$. Starting with the Dynkin diagram (D1) of Sect. 2.1 and given a positive integer $n$ we apply the following sequence of $\frac{n(n+1)}{2}$ odd reflections. First we apply one odd reflection corresponding to $\epsilon_{1/2}-\epsilon_1$, then we apply two odd reflections corresponding to $\epsilon_{3/2}-\epsilon_2$ and $\epsilon_{1/2}-\epsilon_2$. After that we apply three odd reflections corresponding to $\epsilon_{5/2}-\epsilon_3$, $\epsilon_{3/2}-\epsilon_3$, and $\epsilon_{1/2}-\epsilon_3$, et cetera, until finally we apply $n$ odd reflections corresponding to $\epsilon_{n-1/2}-\epsilon_n$, $\epsilon_{n-3/2}-\epsilon_n$, $\ldots$, $\epsilon_{1/2}-\epsilon_n$. The resulting new Borel subalgebra for $\tilde{\mathfrak g}$ will be denoted by $\tilde{\mathfrak b}^c(n)$ and the corresponding simple roots are
$$\tilde\Pi^c(n) := \{\alpha_{-m},\ldots,\alpha_{-2};\ \beta_{-1};\ \beta_1,\ldots,\beta_{n-1};\ \epsilon_n-\epsilon_{1/2};\ \beta_{1/2},\ldots,\beta_{n-1/2};\ \alpha_{n+1/2},\alpha_{n+1},\ldots\}.$$

[Dynkin diagram: a chain whose nodes, from left to right, are $\alpha_{-m},\ \alpha_{-m+1},\ \ldots,\ \alpha_{-2},\ \beta_{-1},\ \ldots,\ \beta_{n-1},\ \epsilon_n-\epsilon_{1/2},\ \beta_{1/2},\ \ldots,\ \beta_{n-1/2},\ \alpha_{n+1/2},\ \alpha_{n+1},\ \ldots$]

A Borel subalgebra is completely determined by giving an ordered homogeneous basis for the standard module. The ordered basis corresponding to the Borel subalgebra $\tilde{\mathfrak b}^c(n)$ of $\tilde{\mathfrak g}$ is given below:
$$\{v_{-m},\ldots,v_{-1},\ v_1,v_2,\ldots,v_{n-1},v_n,\ v_{1/2},v_{3/2},\ldots,v_{n+1/2},\ v_{n+1},v_{n+3/2},v_{n+2},v_{n+5/2},\ldots\}.$$
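3.1.2. A sequence of odd reflections and $\tilde\Pi^s(n)$. On the other hand given (D1) and $n$ we can also apply the following different sequence of $\frac{n(n+1)}{2}$ odd reflections. First we apply one odd reflection corresponding to $\epsilon_1-\epsilon_{3/2}$, then we apply two odd reflections corresponding to $\epsilon_2-\epsilon_{5/2}$ and $\epsilon_1-\epsilon_{5/2}$. After that we apply three odd reflections corresponding to $\epsilon_3-\epsilon_{7/2}$, $\epsilon_2-\epsilon_{7/2}$, and $\epsilon_1-\epsilon_{7/2}$, et cetera, until finally we apply $n$ odd reflections corresponding to $\epsilon_n-\epsilon_{n+1/2}$, $\epsilon_{n-1}-\epsilon_{n+1/2}$, $\ldots$, $\epsilon_1-\epsilon_{n+1/2}$. The resulting new Borel subalgebra for $\tilde{\mathfrak g}$ will be denoted by $\tilde{\mathfrak b}^s(n)$ and the corresponding simple roots are
$$\tilde\Pi^s(n) := \{\alpha_{-m},\ldots,\alpha_{-2};\ \alpha_{-1};\ \beta_{1/2},\ldots,\beta_{n-1/2};\ \epsilon_{n+1/2}-\epsilon_1;\ \beta_1,\ldots,\beta_n;\ \alpha_{n+1},\alpha_{n+3/2},\ldots\}.$$

[Dynkin diagram: a chain whose nodes, from left to right, are $\alpha_{-m},\ \ldots,\ \alpha_{-2},\ \alpha_{-1},\ \beta_{1/2},\ \ldots,\ \beta_{n-1/2},\ \epsilon_{n+1/2}-\epsilon_1,\ \beta_1,\ \ldots,\ \beta_n,\ \alpha_{n+1},\ \alpha_{n+3/2},\ \ldots$]

The ordered basis corresponding to the Borel subalgebra $\tilde{\mathfrak b}^s(n)$ of $\tilde{\mathfrak g}$ is given below:
$$\{v_{-m},\ldots,v_{-1},\ v_{1/2},v_{3/2},\ldots,v_{n-1/2},v_{n+1/2},\ v_1,v_2,\ldots,v_n,\ v_{n+1},v_{n+3/2},v_{n+2},v_{n+5/2},\ldots\}.$$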
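The two sequences of odd reflections and the resulting ordered bases can be checked by a small computation. The sketch below rests on one modelling assumption that we state explicitly: an odd reflection in a simple root ε_a − ε_b is represented as swapping the adjacent basis vectors v_a, v_b in the ordered basis. Only the indices in ½ℕ are tracked (the vectors v_{−m}, ..., v_{−1} never move), the basis is truncated at a finite `cutoff`, and all function names are ours.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def reflection_sequence(n, kind):
    """Pairs (a, b) standing for the odd roots eps_a - eps_b, in the order applied.

    kind "c": the sequence of Sect. 3.1.1; kind "s": the sequence of Sect. 3.1.2.
    """
    seq = []
    for k in range(1, n + 1):
        for i in range(k, 0, -1):
            if kind == "c":
                seq.append((i - HALF, Fraction(k)))   # eps_{i-1/2} - eps_k
            else:
                seq.append((Fraction(i), k + HALF))   # eps_i - eps_{k+1/2}
    return seq


def ordered_basis(n, kind, cutoff):
    """Indices of the ordered basis after all reflections; requires cutoff > n."""
    order = [Fraction(j, 2) for j in range(1, 2 * cutoff + 1)]  # 1/2, 1, 3/2, ...
    for a, b in reflection_sequence(n, kind):
        pos = order.index(a)
        # The root must be simple for the current Borel, i.e. v_a, v_b adjacent.
        assert order[pos + 1] == b, "root is not simple for the current Borel"
        order[pos], order[pos + 1] = order[pos + 1], order[pos]
    return order


print([str(r) for r in ordered_basis(3, "c", 5)])
print([str(r) for r in ordered_basis(3, "s", 5)])
```

The internal assertion confirms that every root in either sequence is simple at the moment it is used, and for n = 3 the printed orders match the two ordered bases given above.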
Remark 3.1. We note that the simple roots used in the above two sequences of odd reflections are all roots of $\tilde{\mathfrak l}_Y$ and hence these sequences of odd reflections leave the set of roots of $\tilde{\mathfrak u}_Y$ invariant. We denote by $\tilde{\mathfrak b}^c_Y(n)$ and $\tilde{\mathfrak b}^s_Y(n)$ the Borel subalgebras of $\tilde{\mathfrak l}_Y$ corresponding to the sets of simple roots $\tilde\Pi^c(n)\cap\tilde\Delta_Y$ and $\tilde\Pi^s(n)\cap\tilde\Delta_Y$, respectively.
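3.1.3. A fundamental lemma.

Lemma 3.2. Given $\lambda\in\mathcal P_Y$, let $n\in\mathbb N$.
(i) Suppose that $\ell(\lambda_+)\le n$. Then the highest weight of $L(\tilde{\mathfrak l}_Y,\lambda^\theta)$ with respect to the Borel subalgebra $\tilde{\mathfrak b}^c_Y(n)$ is $\lambda$, regarded as an element in $\tilde{\mathfrak h}^*$.
(ii) Suppose that $\ell(\lambda'_+)\le n$. Then the highest weight of $L(\tilde{\mathfrak l}_Y,\lambda^\theta)$ with respect to the Borel subalgebra $\tilde{\mathfrak b}^s_Y(n)$ is $\lambda^\natural$, regarded as an element in $\tilde{\mathfrak h}^*$.

Proof. We shall only give the proof for (i), as (ii) is analogous. Certainly $\lambda^{<0}$ is unaffected by the sequence of odd reflections in Sect. 3.1.1. We will show more generally by induction on $k$ that after applying the first $k(k+1)/2$ odd reflections in Sect. 3.1.1 this weight becomes
$$\lambda^{[k]} = \lambda^{<0}+\sum_{i=1}^{k}\lambda_i\epsilon_i+\sum_{i=1}^{k}\langle(\lambda'_+)_i-k\rangle\epsilon_{i-1/2}+\sum_{j\ge k+1}\langle(\lambda'_+)_j-j+1\rangle\epsilon_{j-1/2}+\sum_{j\ge k+1}\langle\lambda_j-j\rangle\epsilon_j. \qquad (3.1)$$
From (3.1) the lemma follows.

Suppose that $k=1$. If $\ell(\lambda_+) < 1$, then $\lambda^\theta=\lambda^{<0}$ and in particular $\langle\lambda^\theta,E_{1/2,1/2}+E_{11}\rangle=0$, and thus the new highest weight is $\lambda^{[1]}=\lambda^{<0}=\lambda^\theta$. If $\ell(\lambda_+)\ge 1$, then $(\lambda'_+)_1\ge 1$ and $\lambda_1\ge 1$ and thus $\lambda^\theta=\lambda^{<0}+(\lambda'_+)_1\epsilon_{1/2}+(\lambda_1-1)\epsilon_1+\cdots$. Now $\langle\lambda^\theta,E_{1/2,1/2}+E_{11}\rangle>0$, and hence the highest weight after the odd reflection with respect to $\epsilon_{1/2}-\epsilon_1$ is $\lambda^{[1]}=\lambda^{<0}+\lambda_1\epsilon_1+((\lambda'_+)_1-1)\epsilon_{1/2}+\cdots$, proving (3.1) in the case $k=1$.

Now suppose that (3.1) is true for $k$. We shall derive the formula for $k+1$. If $\ell(\lambda_+)\le k$, then $\lambda^{[k]}=\lambda^{<0}+\sum_{i=1}^{k}\lambda_i\epsilon_i$. Therefore we have $\langle\lambda^{[k]},E_{i-1/2,i-1/2}+E_{k+1,k+1}\rangle=0$, for $1\le i\le k+1$. So the odd reflections with respect to $\epsilon_{i-1/2}-\epsilon_{k+1}$ do not affect $\lambda^{[k]}$. Thus we have $\lambda^{[k+1]}=\lambda^{[k]}$. So in this case we are done.

Now assume that $\ell(\lambda_+)\ge k+1$. Let $s=\lambda_{k+1}$. We distinguish two cases. First suppose that $\lambda_{k+1}\ge k+1$. Then $(\lambda'_+)_{k+1}\ge k+1$ and hence (3.1) becomes
$$\lambda^{[k]} = \lambda^{<0}+\sum_{i=1}^{k}\lambda_i\epsilon_i+\sum_{i=1}^{k}((\lambda'_+)_i-k)\epsilon_{i-1/2}+((\lambda'_+)_{k+1}-k)\epsilon_{k+1/2}+(\lambda_{k+1}-k-1)\epsilon_{k+1}+\sum_{j\ge k+2}\langle(\lambda'_+)_j-j+1\rangle\epsilon_{j-1/2}+\sum_{j\ge k+2}\langle\lambda_j-j\rangle\epsilon_j.$$
Now $\langle\lambda^{[k]},E_{k+1/2,k+1/2}+E_{k+1,k+1}\rangle>0$ so that after the odd reflection with respect to $\epsilon_{k+1/2}-\epsilon_{k+1}$ the new weight becomes
$$\lambda^{[k,1]} = \lambda^{<0}+\sum_{i=1}^{k}\lambda_i\epsilon_i+\sum_{i=1}^{k}((\lambda'_+)_i-k)\epsilon_{i-1/2}+((\lambda'_+)_{k+1}-k-1)\epsilon_{k+1/2}+(\lambda_{k+1}-k)\epsilon_{k+1}+\sum_{j\ge k+2}\langle(\lambda'_+)_j-j+1\rangle\epsilon_{j-1/2}+\sum_{j\ge k+2}\langle\lambda_j-j\rangle\epsilon_j.$$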
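Now $\langle\lambda^{[k,1]},E_{k-1/2,k-1/2}+E_{k+1,k+1}\rangle>0$ so after the odd reflection with respect to $\epsilon_{k-1/2}-\epsilon_{k+1}$ we get
$$\lambda^{[k,2]} = \lambda^{<0}+\sum_{i=1}^{k}\lambda_i\epsilon_i+\sum_{i=1}^{k-1}((\lambda'_+)_i-k)\epsilon_{i-1/2}+((\lambda'_+)_k-k-1)\epsilon_{k-1/2}+((\lambda'_+)_{k+1}-k-1)\epsilon_{k+1/2}+(\lambda_{k+1}-k+1)\epsilon_{k+1}+\sum_{j\ge k+2}\langle(\lambda'_+)_j-j+1\rangle\epsilon_{j-1/2}+\sum_{j\ge k+2}\langle\lambda_j-j\rangle\epsilon_j.$$
Finally after a total of $k+1$ odd reflections we end up with
$$\lambda^{[k,k+1]} = \lambda^{<0}+\sum_{i=1}^{k+1}\lambda_i\epsilon_i+\sum_{i=1}^{k+1}((\lambda'_+)_i-k-1)\epsilon_{i-1/2}+\sum_{j\ge k+2}\langle(\lambda'_+)_j-j+1\rangle\epsilon_{j-1/2}+\sum_{j\ge k+2}\langle\lambda_j-j\rangle\epsilon_j,$$
which equals $\lambda^{[k+1]}$.

Now consider the case $\lambda_{k+1}=s < k+1$. We have $(\lambda'_+)_j\ge k+1$, for $j\le s$, and $(\lambda'_+)_j < k+1$, for $j>s$. Thus (3.1) becomes
$$\lambda^{[k]} = \lambda^{<0}+\sum_{i=1}^{k}\lambda_i\epsilon_i+\sum_{i=1}^{s}((\lambda'_+)_i-k)\epsilon_{i-1/2},$$
where $((\lambda'_+)_i-k)>0$, for $i\le s$. It follows that odd reflections with respect to $\epsilon_{k+1/2}-\epsilon_{k+1},\ldots,\epsilon_{s+1/2}-\epsilon_{k+1}$ do not affect $\lambda^{[k]}$, while odd reflections with respect to $\epsilon_{s-1/2}-\epsilon_{k+1},\ldots,\epsilon_{1/2}-\epsilon_{k+1}$ affect $\lambda^{[k]}$. From this we obtain
$$\lambda^{[k+1]} = \lambda^{[k,k+1]} = \lambda^{<0}+\sum_{i=1}^{k+1}\lambda_i\epsilon_i+\sum_{i=1}^{s}((\lambda'_+)_i-k-1)\epsilon_{i-1/2},$$
which concludes the proof.

Corollary 3.3. Let $\lambda\in\mathcal P_Y$ and $n\in\mathbb N$.
(i) Suppose that $\ell(\lambda_+)\le n$. Then $\tilde K(\lambda^\theta)$ is a highest weight module with respect to the Borel subalgebra $\tilde{\mathfrak b}^c(n)$ with highest weight $\lambda$, regarded as an element in $\tilde{\mathfrak h}^*$. Also the highest weight of $\tilde L(\lambda^\theta)$ with respect to the Borel subalgebra $\tilde{\mathfrak b}^c(n)$ is $\lambda$, regarded as an element in $\tilde{\mathfrak h}^*$.
(ii) Suppose that $\ell(\lambda'_+)\le n$. Then $\tilde K(\lambda^\theta)$ is a highest weight module with respect to the Borel subalgebra $\tilde{\mathfrak b}^s(n)$ with highest weight $\lambda^\natural$, regarded as an element in $\tilde{\mathfrak h}^*$. Also the highest weight of $\tilde L(\lambda^\theta)$ with respect to the Borel subalgebra $\tilde{\mathfrak b}^s(n)$ is $\lambda^\natural$, regarded as an element in $\tilde{\mathfrak h}^*$.

Proof. As an $\tilde{\mathfrak l}_Y$-module $\tilde K(\lambda^\theta)$ contains a unique copy of $L(\tilde{\mathfrak l}_Y,\lambda^\theta)$ that is annihilated by $\tilde{\mathfrak u}_Y$. By Lemma 3.2 with respect to $\tilde{\mathfrak b}^c_Y(n)$ the highest weight of $L(\tilde{\mathfrak l}_Y,\lambda^\theta)$ is $\lambda$. Now by Remark 3.1 $\tilde{\mathfrak b}^c_Y(n)+\tilde{\mathfrak u}_Y=\tilde{\mathfrak b}^c(n)$. Thus $\tilde K(\lambda^\theta)$ has a non-zero vector of weight $\lambda$ annihilated by $\tilde{\mathfrak b}^c(n)$. This vector clearly generates $\tilde K(\lambda^\theta)$ over $\tilde{\mathfrak g}$, proving the first statement of (i). A verbatim argument proves the second statement as well. Part (ii) is similar and so its proof is omitted.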
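The statement of Lemma 3.2 can be tested numerically by applying the odd reflection rule of Sect. 3 step by step. The following is a minimal sketch under our own conventions: only the coefficients of ε_r with r > 0 are tracked (the part of the weight on negative indices is unaffected), weights are dictionaries from r to the coefficient of ε_r, and the helper names are hypothetical. An odd reflection in ε_a − ε_b replaces a weight w by w − (ε_a − ε_b) exactly when the sum of its coefficients at a and b is non-zero.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def transpose(partition):
    """Return the transpose (conjugate) partition."""
    if not partition:
        return []
    return [sum(1 for part in partition if part > j) for j in range(partition[0])]


def theta_weight(lam_pos):
    """Positive-index part of lambda^theta as a dict {r: coefficient of eps_r}."""
    lam_t = transpose(list(lam_pos))
    weight = {}
    for i in range(1, max(len(lam_pos), len(lam_t)) + 1):
        col = lam_t[i - 1] if i <= len(lam_t) else 0
        row = lam_pos[i - 1] if i <= len(lam_pos) else 0
        if col - (i - 1) > 0:
            weight[i - HALF] = col - (i - 1)
        if row - i > 0:
            weight[Fraction(i)] = row - i
    return weight


def reflect(weight, a, b):
    """Odd reflection in eps_a - eps_b applied to a highest weight."""
    new = dict(weight)
    if new.get(a, 0) + new.get(b, 0) != 0:   # pairing with the coroot is non-zero
        new[a] = new.get(a, 0) - 1
        new[b] = new.get(b, 0) + 1
    return {r: c for r, c in new.items() if c != 0}


def new_highest_weight(lam_pos, n, kind):
    """Apply the n(n+1)/2 reflections of Sect. 3.1.1 ("c") or 3.1.2 ("s")."""
    weight = theta_weight(lam_pos)
    for k in range(1, n + 1):
        for i in range(k, 0, -1):
            if kind == "c":
                weight = reflect(weight, i - HALF, Fraction(k))
            else:
                weight = reflect(weight, Fraction(i), k + HALF)
    return weight


lam_pos = [7, 6, 3, 3, 1]
print(new_highest_weight(lam_pos, 5, "c"))   # n = 5 >= length of lam_pos
print(new_highest_weight(lam_pos, 7, "s"))   # n = 7 >= length of the transpose
```

For the partition of Example 2.1 the first call returns the coefficients of λ and the second those of λ♮, in agreement with parts (i) and (ii) of the lemma.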
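3.2. The Functors $T$ and $\bar T$. Recall that $\tilde\Xi := \sum_{j=-m}^{-1}\mathbb Z\epsilon_j+\sum_{r\in\frac12\mathbb N}\mathbb Z\epsilon_r$ and $\bar\Xi := \sum_{j=-m}^{-1}\mathbb Z\epsilon_j+\sum_{r\in\frac12+\mathbb Z_+}\mathbb Z\epsilon_r$. Set $\Xi := \sum_{j=-m}^{-1}\mathbb Z\epsilon_j+\sum_{i\in\mathbb N}\mathbb Z\epsilon_i$. Given a semisimple $\tilde{\mathfrak h}$-module $\tilde M$ such that $\tilde M=\bigoplus_{\gamma\in\tilde\Xi}\tilde M_\gamma$, we define
$$T(\tilde M) := \bigoplus_{\gamma\in\Xi}\tilde M_\gamma\quad\text{and}\quad\bar T(\tilde M) := \bigoplus_{\gamma\in\bar\Xi}\tilde M_\gamma.$$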
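On the level of weight multiplicities the two truncations simply discard weight spaces. The sketch below illustrates this on made-up data: a weight is encoded as a tuple of pairs (r, coefficient of ε_r), a module is represented only by a dictionary from weights to dimensions, and the names `truncate_T`, `truncate_T_bar` and the toy multiplicities are our assumptions.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def support(weight):
    """Indices r whose coefficient of eps_r in `weight` is non-zero."""
    return [r for r, c in weight if c != 0]


def is_half_integer(r):
    return Fraction(r).denominator == 2


def truncate_T(weight_spaces):
    """Keep the weights involving no eps_r with r a half-integer."""
    return {w: d for w, d in weight_spaces.items()
            if not any(is_half_integer(r) for r in support(w))}


def truncate_T_bar(weight_spaces):
    """Keep the weights involving no eps_i with i a positive integer."""
    return {w: d for w, d in weight_spaces.items()
            if all(r < 0 or is_half_integer(r) for r in support(w))}


# Toy weight multiplicities (illustrative numbers, not taken from the article).
toy = {
    ((Fraction(-1), 1),): 1,               # eps_{-1}
    ((HALF, 1),): 1,                       # eps_{1/2}
    ((Fraction(1), 1),): 1,                # eps_1
    ((HALF, 1), (Fraction(1), 1)): 2,      # eps_{1/2} + eps_1
}
print(truncate_T(toy))
print(truncate_T_bar(toy))
```

In terms of formal characters, the first truncation corresponds to setting every variable attached to a half-integer index to zero, and the second to setting every variable attached to a positive integer index to zero.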
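Note that $T(\tilde M)$ is an $\mathfrak h$-submodule of $\tilde M$ (regarded as an $\mathfrak h$-module), and $\bar T(\tilde M)$ is an $\bar{\mathfrak h}$-submodule of $\tilde M$ (regarded as an $\bar{\mathfrak h}$-module). If $\tilde M$ is also an $\tilde{\mathfrak l}_Y$-module, then $T(\tilde M)$ is an $\mathfrak l_Y$-submodule of $\tilde M$ (regarded as an $\mathfrak l_Y$-module), and $\bar T(\tilde M)$ is an $\bar{\mathfrak l}_Y$-submodule of $\tilde M$ (regarded as an $\bar{\mathfrak l}_Y$-module). Furthermore if $\tilde M\in\tilde{\mathcal O}_Y$, then $T(\tilde M)$ is a $\mathfrak g$-submodule of $\tilde M$ (regarded as a $\mathfrak g$-module), and $\bar T(\tilde M)$ is a $\bar{\mathfrak g}$-submodule of $\tilde M$ (regarded as a $\bar{\mathfrak g}$-module).

Let $\tilde M=\bigoplus_{\gamma\in\tilde\Xi}\tilde M_\gamma$ and $\tilde N=\bigoplus_{\gamma\in\tilde\Xi}\tilde N_\gamma$ be two semisimple $\tilde{\mathfrak h}$-modules. We let
$$T_{\tilde M}:\tilde M\longrightarrow T(\tilde M)\quad\text{and}\quad\bar T_{\tilde M}:\tilde M\longrightarrow\bar T(\tilde M)$$
be the natural projections. If $\tilde f:\tilde M\to\tilde N$ is an $\tilde{\mathfrak h}$-homomorphism, we let
$$T[\tilde f]:T(\tilde M)\longrightarrow T(\tilde N)\quad\text{and}\quad\bar T[\tilde f]:\bar T(\tilde M)\longrightarrow\bar T(\tilde N)$$
be the corresponding restriction maps. Note that $T_{\tilde M}$ and $T[\tilde f]$ (respectively, $\bar T_{\tilde M}$ and $\bar T[\tilde f]$) are $\mathfrak h$- (respectively, $\bar{\mathfrak h}$-) homomorphisms. If $\tilde f$ is also an $\tilde{\mathfrak l}_Y$-homomorphism of $\tilde{\mathfrak l}_Y$-modules, then $T_{\tilde M}$ and $T[\tilde f]$ (respectively, $\bar T_{\tilde M}$ and $\bar T[\tilde f]$) are $\mathfrak l_Y$- (respectively, $\bar{\mathfrak l}_Y$-) homomorphisms. Furthermore if $\tilde f$ is also a $\tilde{\mathfrak g}$-homomorphism of $\tilde{\mathfrak g}$-modules, then $T_{\tilde M}$ and $T[\tilde f]$ (respectively, $\bar T_{\tilde M}$ and $\bar T[\tilde f]$) are $\mathfrak g$- (respectively, $\bar{\mathfrak g}$-) homomorphisms. It is easy to see that we have the following commutative diagrams:
$$\begin{array}{ccc}\tilde M & \xrightarrow{\ \tilde f\ } & \tilde N\\ \downarrow{\scriptstyle T_{\tilde M}} & & \downarrow{\scriptstyle T_{\tilde N}}\\ T(\tilde M) & \xrightarrow{\ T[\tilde f]\ } & T(\tilde N)\end{array}\qquad\qquad\begin{array}{ccc}\tilde M & \xrightarrow{\ \tilde f\ } & \tilde N\\ \downarrow{\scriptstyle\bar T_{\tilde M}} & & \downarrow{\scriptstyle\bar T_{\tilde N}}\\ \bar T(\tilde M) & \xrightarrow{\ \bar T[\tilde f]\ } & \bar T(\tilde N)\end{array}\qquad (3.2)$$

For an indeterminate $e$ we let $x_r := e^{\epsilon_r}$, $r\in[-m,-1]\cup\tfrac12\mathbb N$. The formal character of an object in $\tilde{\mathcal O}_Y$, $\bar{\mathcal O}_Y$, and $\mathcal O_Y$ is then an element in $\mathbb Z[[x_{-m}^{\pm1},\ldots,x_{-1}^{\pm1}]]\otimes\mathbb Z[[x_{1/2},x_1,\ldots]]$, $\mathbb Z[[x_{-m}^{\pm1},\ldots,x_{-1}^{\pm1}]]\otimes\mathbb Z[[x_{1/2},x_{3/2},\ldots]]$, and $\mathbb Z[[x_{-m}^{\pm1},\ldots,x_{-1}^{\pm1}]]\otimes\mathbb Z[[x_1,x_2,\ldots]]$, respectively. For $\lambda\in\mathcal P_Y$, we remark that the character of $L(\tilde{\mathfrak l}_Y,\lambda^\theta)$ is given by [CK, Sect. 3.2.3]
$$\mathrm{ch}\,L(\tilde{\mathfrak l}_Y,\lambda^\theta) = \mathrm{ch}\,L(\tilde{\mathfrak l}^{<0}_Y,\lambda^{<0})\,HS_{\lambda_+}(x_{1/2},x_1,x_{3/2},x_2,\cdots), \qquad (3.3)$$
where (cf. [S,BR])
$$HS_\eta(x_{1/2},x_1,x_{3/2},x_2,\cdots) := \sum_{\mu\subseteq\eta}s_{\mu'}(x_{1/2},x_{3/2},\cdots)\,s_{\eta/\mu}(x_1,x_2,\cdots),\qquad\eta\in\mathcal P.$$
Here and below $L(\tilde{\mathfrak l}^{<0}_Y,\lambda^{<0})$ stands for the irreducible $\tilde{\mathfrak l}^{<0}_Y$-module of highest weight $\lambda^{<0}$ and so $\mathrm{ch}\,L(\tilde{\mathfrak l}^{<0}_Y,\lambda^{<0})$ is a product of Schur Laurent polynomials in $x_{-m},\ldots,x_{-1}$ depending on $Y$ and $\lambda^{<0}$.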
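Lemma 3.4. For $\lambda\in\mathcal P_Y$, we have
(i) $T(L(\tilde{\mathfrak l}_Y,\lambda^\theta)) = L(\mathfrak l_Y,\lambda)$,
(ii) $\bar T(L(\tilde{\mathfrak l}_Y,\lambda^\theta)) = L(\bar{\mathfrak l}_Y,\lambda^\natural)$.

Proof. Since $L(\mathfrak l_Y,\lambda)$ is irreducible, it is enough to prove that $T(L(\tilde{\mathfrak l}_Y,\lambda^\theta))$ and $L(\mathfrak l_Y,\lambda)$ have the same character. Applying $T$ to $L(\tilde{\mathfrak l}_Y,\lambda^\theta)$ has the effect of setting $x_{j-1/2}=0$, $j\in\mathbb N$, in the character. Thus we have
$$\mathrm{ch}\,T(L(\tilde{\mathfrak l}_Y,\lambda^\theta)) = \mathrm{ch}\,L(\tilde{\mathfrak l}^{<0}_Y,\lambda^{<0})\,s_{\lambda_+}(x_1,x_2,\cdots),$$
which is the character of $L(\mathfrak l_Y,\lambda)$. This proves (i). The proof for (ii) is analogous and hence omitted.

Lemma 3.5. If $\tilde M$ is a highest weight $\tilde{\mathfrak g}$-module of highest weight $\lambda^\theta$ with $\lambda\in\mathcal P_Y$, then $T(\tilde M)$ and $\bar T(\tilde M)$ are highest weight $\mathfrak g$- and $\bar{\mathfrak g}$-modules of highest weight $\lambda\in\mathcal P_Y$ and $\lambda^\natural\in\bar{\mathcal P}_Y$, respectively.

Proof. We will only show this for $T(\tilde M)$, as the case of $\bar T(\tilde M)$ is analogous. Let $v$ be a nonzero vector in $\tilde M$ of weight $\lambda$ obtained from a non-zero vector of weight $\lambda^\theta$ by applying the sequence of odd reflections of Sect. 3.1.1. Such a vector by Corollary 3.3 is a $\tilde{\mathfrak b}^c(n)$-highest weight vector of the $\tilde{\mathfrak g}$-module $\tilde M$, for $n\gg 0$. Evidently $v\in T(\tilde M)$ and, since $\mathfrak b=\tilde{\mathfrak b}^c(n)\cap\mathfrak g$, $v$ is a $\mathfrak b$-singular vector. The $\mathfrak g$-module $T(\tilde M)$, regarded as an $\mathfrak l_Y$-module, is completely reducible by Lemma 3.4. Thus to prove the lemma it is enough to show that every vector $w\in T(\tilde M)$ of weight $\mu\in\mathcal P_Y$ lies in $U(\mathfrak n_-)v$. To see this, choose $n$ so that $\ell(\lambda_+) < n$ and $\ell(\mu_+) < n$. Then with respect to $\tilde{\mathfrak b}^c(n)$, $v$ is a highest weight vector of $\tilde M$ and hence $w\in U(\tilde{\mathfrak n}^c_-(n))v$, where $\tilde{\mathfrak n}^c_-(n)$ is the opposite nilradical of $\tilde{\mathfrak b}^c(n)$. Now the conditions $\ell(\lambda_+) < n$ and $\ell(\mu_+) < n$ imply that
$$\lambda-\mu = \sum_{i=-m}^{-1}a_i\epsilon_i+\sum_{j=1}^{n-1}b_j\epsilon_j,\qquad a_i,b_j\in\mathbb Z. \qquad (3.4)$$
But $\lambda-\mu$ is also a finite $\mathbb Z_+$-linear combination of simple roots from $\tilde\Pi^c(n)$. So we can write
$$\lambda-\mu = \sum_{\alpha\in\tilde\Pi^c(n)}a_\alpha\alpha,\qquad a_\alpha\in\mathbb Z_+.$$
If there were some $\alpha\in\{\epsilon_n-\epsilon_{1/2};\ \beta_{1/2},\ldots,\beta_{n-1/2};\ \alpha_{n+1/2},\alpha_{n+1},\ldots\}$ with $a_\alpha\neq 0$, then it is easy to see that $\langle\lambda-\mu,E_{rr}\rangle\neq 0$, for some $r\notin[-m,-1]\cup[1,n-1]$. It contradicts (3.4). Therefore $\lambda-\mu$ is a $\mathbb Z_+$-linear combination of $\{\alpha_{-m},\cdots,\alpha_{-2};\ \beta_{-1};\ \beta_1,\ldots,\beta_{n-1}\}$, and hence $w\in U(\mathfrak n_-)v$.

Theorem 3.6. For $\lambda\in\mathcal P_Y$, we have
$$T(\tilde K(\lambda^\theta)) = K(\lambda),\qquad T(\tilde L(\lambda^\theta)) = L(\lambda);$$
$$\bar T(\tilde K(\lambda^\theta)) = \bar K(\lambda^\natural),\qquad\bar T(\tilde L(\lambda^\theta)) = \bar L(\lambda^\natural).$$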
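Proof. We will show this for $T$. The argument for $\bar T$ is analogous. Computing the character of $\tilde K(\lambda^\theta)$ we have (see (3.3))
$$\mathrm{ch}\,\tilde K(\lambda^\theta) = \frac{\prod_{i<0,\,r\in\frac12+\mathbb Z_+}(1+x_i^{-1}x_r)}{\prod_{i<0,\,j\in\mathbb N}(1-x_i^{-1}x_j)}\ \mathrm{ch}\,\mathrm{Ind}^{\tilde{\mathfrak g}^{<0}}_{\tilde{\mathfrak p}^{<0}_Y}L(\tilde{\mathfrak l}^{<0}_Y,\lambda^{<0})\ HS_{\lambda_+}(x_{1/2},x_1,x_{3/2},x_2,\ldots).$$
Application of $T$ amounts to setting the variables $x_r=0$, $r\in\tfrac12+\mathbb Z_+$. Thus
$$\mathrm{ch}\,T(\tilde K(\lambda^\theta)) = \prod_{i<0,\,j\in\mathbb N}\frac{1}{(1-x_i^{-1}x_j)}\ \mathrm{ch}\,\mathrm{Ind}^{\tilde{\mathfrak g}^{<0}}_{\tilde{\mathfrak p}^{<0}_Y}L(\tilde{\mathfrak l}^{<0}_Y,\lambda^{<0})\ s_{\lambda_+}(x_1,x_2,\ldots),$$
which equals $\mathrm{ch}\,K(\lambda)$. Since $T(\tilde K(\lambda^\theta))$ is a highest weight module by Lemma 3.5, we see that $T(\tilde K(\lambda^\theta)) = K(\lambda)$.

Let $\tilde M := \tilde L(\lambda^\theta)$ with $\lambda\in\mathcal P_Y$. Suppose that $M := T(\tilde M)$ is not irreducible. Since by Lemma 3.5 the $\mathfrak g$-module $M$ is a highest weight module, it must have a $\mathfrak b$-singular vector inside $M$ that is not a highest weight vector. Suppose that $w$ is such a $\mathfrak b$-singular vector of weight $\mu\in\mathcal P_Y$. We can choose $n\gg 0$ such that $\lambda$ is the highest weight of $\tilde M$ with respect to $\tilde{\mathfrak b}^c(n)$, and $\ell(\lambda_+) < n$ and $\ell(\mu_+) < n$. By Corollary 3.3 there exists a $\tilde{\mathfrak b}^c(n)$-highest weight vector $v_\lambda$ of the $\tilde{\mathfrak g}$-module $\tilde M$ of weight $\lambda$. It is clear that $v_\lambda$ is a $\mathfrak b$-highest weight vector of $M$ and hence $w\in U(\mathfrak a)v_\lambda$, where $\mathfrak a$ is the subalgebra of $\mathfrak n_-$ generated by root vectors in $\mathfrak n_-$ corresponding to the roots $-\alpha_{-m},\ldots,-\alpha_{-2},-\beta_{-1}$ and $-\beta_j$, $1\le j\le k$, for some $k$. Choose $q\in\mathbb N$ such that $q\ge n$ and $q>k+1$. Note that $v_\lambda$ is also a $\tilde{\mathfrak b}^c(q)$-highest weight vector of the $\tilde{\mathfrak g}$-module $\tilde M$ of weight $\lambda$. Since $w$ is $\mathfrak b$-singular it is annihilated by the root vectors corresponding to the roots $\alpha_{-m},\ldots,\alpha_{-2},\beta_{-1}$ and $\beta_j$, for all $j\in\mathbb N$. Also $w$ is annihilated by the root vectors corresponding to the roots in $\tilde\Pi^c(q)\setminus\{\alpha_{-m},\ldots,\alpha_{-2},\beta_{-1},\beta_1,\beta_2,\ldots,\beta_{q-1}\}$ since $w\in U(\mathfrak a)v_\lambda$ and these root vectors commute with $\mathfrak a$. It follows that $w$, regarded as a vector in $\tilde M$, is then a $\tilde{\mathfrak b}^c(q)$-singular vector, contradicting the irreducibility of $\tilde M$.

Proposition 3.7. $T$ and $\bar T$ define exact functors from $\tilde{\mathcal O}_Y$ to $\mathcal O_Y$ and from $\tilde{\mathcal O}_Y$ to $\bar{\mathcal O}_Y$, respectively. Furthermore, $T$ and $\bar T$ send $\tilde{\mathcal O}^f_Y$ to $\mathcal O^f_Y$ and $\tilde{\mathcal O}^f_Y$ to $\bar{\mathcal O}^f_Y$, respectively.

Proof. Exactness is clear from the definitions. It remains to prove that if $\tilde M\in\tilde{\mathcal O}_Y$, then $T(\tilde M)\in\mathcal O_Y$ and $\bar T(\tilde M)\in\bar{\mathcal O}_Y$. We will only prove $T(\tilde M)\in\mathcal O_Y$, as the proof of $\bar T(\tilde M)\in\bar{\mathcal O}_Y$ is analogous. Clearly $\dim T(\tilde M)_\gamma<\infty$, for all $\gamma\in\mathfrak h^*$. Now if $\tilde M\cong\bigoplus_{\mu\in\mathcal P_Y}L(\tilde{\mathfrak l}_Y,\mu^\theta)^{m(\mu)}$, then by Lemma 3.4 $T(\tilde M)\cong\bigoplus_{\mu\in\mathcal P_Y}L(\mathfrak l_Y,\mu)^{m(\mu)}$. (Here and below $m(\mu)$ stands for the multiplicity of $L(\tilde{\mathfrak l}_Y,\mu^\theta)$ in $\tilde M$.) Finally by Theorem 3.6 $T(\tilde L(\lambda^\theta)) = L(\lambda)$. By exactness of $T$ it follows that a downward filtration for $\tilde M$ gives rise to a corresponding downward filtration of $T(\tilde M)$ with a one-to-one correspondence between the composition factors. Hence $T(\tilde M)\in\mathcal O_Y$. The second part of the proposition is clear.

3.3. Some consequences. Let $M\in\mathcal O_Y$. We may regard $\mathrm{ch}\,M$ as an element in $\mathbb Z[[x_{-m}^{\pm1},\ldots,x_{-1}^{\pm1}]]\otimes\Lambda_{\mathbb Z}(x_1,x_2,\ldots)$, where $\Lambda_{\mathbb Z}(x_1,x_2,\ldots)$ denotes the space of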
(completed) symmetric functions in the variables $x_1,x_2,\ldots$. Similarly, for $\bar M\in\bar{\mathcal O}_Y$ and $\tilde M\in\tilde{\mathcal O}_Y$, $\mathrm{ch}\,\bar M$ and $\mathrm{ch}\,\tilde M$ may be viewed as elements in $\mathbb Z[[x_{-m}^{\pm1},\ldots,x_{-1}^{\pm1}]]\otimes\Lambda_{\mathbb Z}(x_{1/2},x_{3/2},\ldots)$ and $\mathbb Z[[x_{-m}^{\pm1},\ldots,x_{-1}^{\pm1}]]\otimes\Lambda_{\mathbb Z}(x_1,x_2,\ldots)\otimes\Lambda_{\mathbb Z}(x_{1/2},x_{3/2},\ldots)$, respectively. Let $\bar\omega:\Lambda_{\mathbb Z}(x_1,x_2,\ldots)\to\Lambda_{\mathbb Z}(x_{1/2},x_{3/2},\ldots)$ be the ring homomorphism that sends the $n$th complete symmetric function in $x_1,x_2,\ldots$ to the $n$th elementary symmetric function in $x_{1/2},x_{3/2},\ldots$. Let $\tilde\omega:\Lambda_{\mathbb Z}(x_1,x_2,\ldots)\to\Lambda_{\mathbb Z}(x_1,x_2,\ldots)\otimes\Lambda_{\mathbb Z}(x_{1/2},x_{3/2},\ldots)$ be the ring homomorphism defined by sending the $n$th complete symmetric function in $x_2,x_4,\ldots$ (respectively in $x_1,x_3,\ldots$) to the $n$th complete symmetric function in $x_1,x_2,\ldots$ (respectively to the $n$th elementary symmetric function in $x_{1/2},x_{3/2},\ldots$).

Corollary 3.8. Let $\lambda\in\mathcal P_Y$. We have
(i) $\tilde\omega(\mathrm{ch}\,L(\lambda)) = \mathrm{ch}\,\tilde L(\lambda^\theta)$.
(ii) $\bar\omega(\mathrm{ch}\,L(\lambda)) = \mathrm{ch}\,\bar L(\lambda^\natural)$.

Proof. Since $\tilde\omega(\mathrm{ch}\,L(\mathfrak l_Y,\lambda)) = \mathrm{ch}\,L(\tilde{\mathfrak l}_Y,\lambda^\theta)$ and $\bar\omega(\mathrm{ch}\,L(\mathfrak l_Y,\lambda)) = \mathrm{ch}\,L(\bar{\mathfrak l}_Y,\lambda^\natural)$ the corollary follows directly from Lemma 3.4 and Theorem 3.6.

Remark 3.9. Corollary 3.8 (ii) is consistent with the prediction of the super duality conjecture, and in the case of irreducible polynomial representations (i.e. $\lambda\in\mathcal P_{[-m,-2]}$ with $\lambda_{-1}\ge\lambda_1$) gives [BR, Theorem 6.10]. In the case of $Y=[-m,-2]$ it gives [CWZ, Cor. 6.15]. For infinite-dimensional unitary modules appearing in certain Howe dualities it also recovers [CLZ, Theorem 5.3].

For $n\in\mathbb N$, we recall the truncation functor $\mathrm{tr}_n:\bar{\mathcal O}_Y\to(\bar{\mathcal O}_Y)_n$ of [CW, Def. 3.1], where here and further we use a subscript $n$ to indicate a corresponding truncated category of $\mathfrak{gl}(m|n)$-modules. For $\gamma\in\sum_{i=-m}^{-1}\mathbb Z\epsilon_i+\sum_{j=1}^{n}\mathbb Z\epsilon_{j-\frac12}$ let $\bar K_n(\gamma)$ and $\bar L_n(\gamma)$ be the parabolic Verma $\mathfrak{gl}(m|n)$-module and irreducible $\mathfrak{gl}(m|n)$-module of highest weight $\gamma$ in the category $(\bar{\mathcal O}_Y)_n$, respectively. We recall the following.

Lemma 3.10. [CW, Cor. 3.3] Let $\lambda\in\bar{\mathcal P}_Y$. The truncation functor $\mathrm{tr}_n$, for every $n\in\mathbb N$, is exact and it sends $\bar K(\lambda)$ and $\bar L(\lambda)$ to $\bar K_n(\lambda)$ and $\bar L_n(\lambda)$, respectively, if $\langle\lambda,E_{n+1/2,n+1/2}\rangle=0$, and to zero otherwise.

Proposition 3.11. The module $\bar K(\lambda)$ lies in $\bar{\mathcal O}^f_Y$, for all $\lambda\in\bar{\mathcal P}_Y$. Thus category $\bar{\mathcal O}^f_Y$ is the category of finitely generated $\bar{\mathfrak g}$-modules that as $\bar{\mathfrak l}_Y$-modules are direct sums of $L(\bar{\mathfrak l}_Y,\mu)$, $\mu\in\bar{\mathcal P}_Y$, with a locally nilpotent $\bar{\mathfrak u}_Y$-action.

Proof. Consider a fixed $\lambda\in\bar{\mathcal P}_Y$. Choose $n\gg 0$ so that $\langle\lambda,E_{n+1/2,n+1/2}\rangle=0$ and the degree of atypicality for $\lambda$ does not increase anymore with increasing $n$. Assume $\bar L(\mu)$ is a composition factor in $\bar K(\lambda)$. We have $\mu\in\bar{\mathcal P}_Y$. Choose $k\ge n$ such that $\mathrm{tr}_k(\bar L(\mu))\neq 0$. Then $\lambda$ and $\mu$ share the same central character in $(\bar{\mathcal O}_Y)_k$. Therefore our choice of $n$ together with $\mu\in\bar{\mathcal P}_Y$ implies that $\langle\mu,E_{n+1/2,n+1/2}\rangle=0$. Thus by Lemma 3.10 the multiplicity of each $\bar L_n(\mu)$ inside each $\bar K_n(\lambda)$ is the same as that of $\bar L(\mu)$ in $\bar K(\lambda)$. Since the $\mathfrak{gl}(m|n)$-module $\bar K_n(\lambda)$ has finite composition series (because as a $\mathfrak{gl}(m|n)_{\bar 0}$-module it is isomorphic to the tensor product of a generalized Verma module and a finite-dimensional module), it follows that $\bar K(\lambda)\in\bar{\mathcal O}^f_Y$.

By a standard argument a finitely generated $\bar{\mathfrak g}$-module $\bar M$ that as an $\bar{\mathfrak l}_Y$-module is a direct sum of $L(\bar{\mathfrak l}_Y,\mu)$, $\mu\in\bar{\mathcal P}_Y$, with a locally nilpotent $\bar{\mathfrak u}_Y$-action, has a finite filtration
by highest weight modules, which are quotients of $\bar K(\lambda)$, for $\lambda\in\bar{\mathcal P}_Y$. Thus $\bar M\in\bar{\mathcal O}^f_Y$ and hence the proposition follows.

Theorem 3.6 and Proposition 3.11 give the following.

Corollary 3.12. (i) The module $\tilde K(\lambda)\in\tilde{\mathcal O}^f_Y$, for all $\lambda\in\tilde{\mathcal P}_Y$. Hence the category $\tilde{\mathcal O}^f_Y$ is the category of finitely generated $\tilde{\mathfrak g}$-modules that as $\tilde{\mathfrak l}_Y$-modules are direct sums of $L(\tilde{\mathfrak l}_Y,\mu)$, $\mu\in\tilde{\mathcal P}_Y$, with a locally nilpotent $\tilde{\mathfrak u}_Y$-action.
(ii) The module $K(\lambda)\in\mathcal O^f_Y$, for all $\lambda\in\mathcal P_Y$. Hence the category $\mathcal O^f_Y$ is the category of finitely generated $\mathfrak g$-modules that as $\mathfrak l_Y$-modules are direct sums of $L(\mathfrak l_Y,\mu)$, $\mu\in\mathcal P_Y$, with a locally nilpotent $\mathfrak u_Y$-action.

Remark 3.13. Proposition 3.11 and its Corollary 3.12 imply that the categories $\bar{\mathcal O}^f_Y$ and $\mathcal O^f_Y$ are the categories $\mathcal O^{++}_{m|\infty}$ and $\mathcal O^{++}_{m+\infty}$ of [CW], respectively. We note that the proof of Proposition 3.11 that we have presented above is elementary. In the proof above we have only used the rather easy Lemma 3.10.

By Theorem 3.6 and Proposition 3.7 and Corollary 3.12, we have the following.

Corollary 3.14. Let $\lambda,\mu\in\mathcal P_Y$. The numbers of composition factors of $\tilde K(\lambda^\theta)$, $K(\lambda)$ and $\bar K(\lambda^\natural)$ that are isomorphic to $\tilde L(\mu^\theta)$, $L(\mu)$ and $\bar L(\mu^\natural)$, respectively, are the same.

Recall the super Bruhat ordering for weights in $\bar{\mathfrak h}^*$ (see e.g. [B, §2-b] or [CW, §2.3]) which we denote by $\succcurlyeq$. Let us denote by $\ge$ the classical Bruhat ordering on $\mathfrak h^*$. As a further application we present a super analogue of a classical theorem of BGG.

Corollary 3.15. Let $\lambda\in\mathcal P_Y$ and $\gamma\in\bar{\mathfrak h}^*$. If $\bar L(\gamma)$ is a subquotient of $\bar K(\lambda^\natural)$, then $\lambda^\natural\succcurlyeq\gamma$.

Proof. Clearly $\gamma=\mu^\natural$ for some $\mu\in\mathcal P_Y$. By Corollary 3.14 $\bar L(\mu^\natural)$ is a subquotient of $\bar K(\lambda^\natural)$ if and only if $L(\mu)$ is a subquotient of $K(\lambda)$. By the classical version of the BGG Theorem (e.g. [H, Sect. 5.1]) $\mu\le\lambda$. Now [CW, Lemma 4.6] implies that $\mu^\natural\preccurlyeq\lambda^\natural$.

3.4. Irreducible characters.

Theorem 3.16. Let $\lambda\in\mathcal P_Y$. Let $\mathrm{ch}\,L(\lambda)=\sum_{\mu\in\mathcal P_Y}a_{\mu\lambda}\,\mathrm{ch}\,K(\mu)$. Then
(i) $\mathrm{ch}\,\tilde L(\lambda^\theta)=\sum_{\mu\in\mathcal P_Y}a_{\mu\lambda}\,\mathrm{ch}\,\tilde K(\mu^\theta)$,
(ii) $\mathrm{ch}\,\bar L(\lambda^\natural)=\sum_{\mu\in\mathcal P_Y}a_{\mu\lambda}\,\mathrm{ch}\,\bar K(\mu^\natural)$.

Proof. Since $\bar\omega(\mathrm{ch}\,K(\lambda))=\mathrm{ch}\,\bar K(\lambda^\natural)$ and $\tilde\omega(\mathrm{ch}\,K(\lambda))=\mathrm{ch}\,\tilde K(\lambda^\theta)$, for $\lambda\in\mathcal P_Y$, the theorem follows directly from Corollary 3.8.

Remark 3.17. By (4.4) the coefficients $a_{\mu\lambda}$ in Theorem 3.16 equal $l_{\mu\lambda}(1)$, where $l_{\mu\lambda}(q)$ are the classical (parabolic) Kazhdan-Lusztig polynomials [D,KL] (see also [CW, Prop. 4.4]). Since by [CW, Theorem 4.7] the polynomials $l_{\mu\lambda}(q)$ equal $\ell_{\mu^\natural\lambda^\natural}(q)$ (see (4.3)) Theorem 3.16 verifies [CW, Conj. 3.10], which is a parabolic version of a conjecture of Brundan [B, Conj. 4.32]. In particular, Theorem 3.16 in the special case $Y=[-m,-2]$, together with Lemma 3.10, gives an independent new proof of the first part of [B, Theorem 4.37]. We note that our results do not rely on [B,Se].
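Below we work out in more detail a character formula for the irreducible $\mathfrak{gl}(m|n)$-module $\bar L_n(\gamma)$, where $\gamma$ is a weight of the form
$$\sum_{i=-m}^{-1}\gamma_i\epsilon_i+\sum_{j=1}^{n}\gamma'_j\epsilon_{j-1/2},\qquad\gamma_i,\gamma'_j\in\mathbb Z,$$
with $\gamma'_1\ge\gamma'_2\ge\cdots\ge\gamma'_n$. Recall that the one-dimensional determinant module $\det$ has (highest) weight $1_{m|n}=\sum_{i=-m}^{-1}\epsilon_i-\sum_{j=1}^{n}\epsilon_{j-1/2}$. For $k\in\mathbb Z$ and an $\bar{\mathfrak h}$-semisimple $\mathfrak{gl}(m|n)$-module $M$ with $\mathrm{ch}\,M=\sum_\eta\dim M_\eta\,e^\eta$ we have
$$\mathrm{ch}(M\otimes{\det}^{\otimes k})=\sum_\eta\dim M_\eta\,e^{\eta+k1_{m|n}}.$$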
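The effect of tensoring with a power of the determinant module on a character is a uniform shift of all weights. A minimal sketch follows, in which a weight of gl(m|n) is encoded as a tuple listing its m coefficients on negative indices followed by its n coefficients on the indices 1/2, ..., n − 1/2; the function name and the sample character (the two weights of the natural module of gl(1|1)) are our illustrative choices.

```python
def twist_by_det(character, m, n, k):
    """Character of M tensored with the k-th power of det.

    `character` maps weight tuples
    (eta_{-m}, ..., eta_{-1}, eta_{1/2}, ..., eta_{n-1/2}) to dimensions.
    Every weight is shifted by k * 1_{m|n}, where 1_{m|n} has coefficient +1 on
    the first m coordinates and -1 on the last n coordinates.
    """
    one = (1,) * m + (-1,) * n
    return {
        tuple(e + k * o for e, o in zip(eta, one)): dim
        for eta, dim in character.items()
    }


# Natural module of gl(1|1): weights eps_{-1} and eps_{1/2}, each of dimension 1.
natural = {(1, 0): 1, (0, 1): 1}
print(twist_by_det(natural, 1, 1, 2))
```

Since the shift lowers every coefficient on the last n coordinates by k, a suitable (possibly negative) choice of k moves the smallest of these coefficients to any prescribed integer without changing the multiplicities.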
Clearly L n (γ + k1m|n ) = L n (γ ) ⊗ det⊗k . Thus taking the tensor product with a suitable power of the determinant module, if necessary, we may assume that γ1 ≥ γ2 ≥ · · · ≥ γn ≥ 0 and so γ ∈ P Y . Let λ ∈ PY (with Y = ∅) be such that λ = γ . We have (λ+ ) ≤ n. Now Theorem 3.16 (ii) (together with Lemma 3.10) implies that the character of the irreducible gl(m|n)-module of highest weight λ equals to chL n (γ ) = aμλ chK n (μ ), (3.5) μ∈PY ,(μ+ )≤n
where $K_n(\mu^\natural)$ is the parabolic Verma gl(m|n)-module corresponding to Y = ∅. As the coefficients $a_{\mu\lambda}$ are known by Remark 3.17, (3.5) gives the irreducible character of the gl(m|n)-module $L_n(\gamma)$. In the special case $\gamma_{-m} \ge \gamma_{-m+1} \ge \cdots \ge \gamma_{-1}$ we obtain an irreducible character formula for finite-dimensional irreducible gl(m|n)-modules. A formula (corresponding to our case Y = [−m, −2]) was obtained in [Se,B]. Theorem 3.16 is obtained using an approach very different from [Se] and [B], and provides an independent solution of the irreducible character problem.

4. Kazhdan-Lusztig Polynomials

4.1. Homology of Lie superalgebras. Let $L = L_{\bar 0} \oplus L_{\bar 1}$ be a Lie superalgebra and let $T(L)$ be the tensor algebra of $L$. Then $T(L) = \bigoplus_{n=0}^{\infty} T^n(L)$ is an associative superalgebra with a canonical $\mathbb{Z}$-gradation. For $v \in L_\epsilon$, we let $|v| := \epsilon$, $\epsilon \in \mathbb{Z}_2$. The exterior algebra of $L$ is the quotient algebra $\Lambda(L) := T(L)/J$, where $J$ is the homogeneous two-sided ideal of $T(L)$ generated by the elements of the form $x \otimes y + (-1)^{|x||y|} y \otimes x$, where $x$ and $y$ are homogeneous elements of $L$. The algebra $\Lambda(L)$ is also an associative superalgebra with a $\mathbb{Z}$-gradation inherited from $T(L)$. More precisely, we have $\Lambda(L) = \bigoplus_{n=0}^{\infty} \Lambda^n L$, where $\Lambda^n L$ is the set of all homogeneous elements of $\mathbb{Z}$-degree $n$ in $\Lambda(L)$, for each $n \ge 0$. For $\mathbb{Z}_2$-homogeneous elements $x_1, x_2, \ldots, x_k \in L$, the image of the element $x_1 \otimes x_2 \otimes \cdots \otimes x_k$ under the canonical quotient map from $T^k(L)$ to $\Lambda^k(L)$ will be denoted by $x_1 x_2 \cdots x_k$.
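For an $L$-module $V$, the $k$-th homology group $H_k(L; V)$ of $L$ with coefficients in $V$ is defined to be the $k$-th homology group of the following complex (see e.g. [T]):

$$\cdots \xrightarrow{\ \partial\ } \Lambda^n(L) \otimes V \xrightarrow{\ \partial\ } \Lambda^{n-1}(L) \otimes V \xrightarrow{\ \partial\ } \cdots \xrightarrow{\ \partial\ } \Lambda^1(L) \otimes V \xrightarrow{\ \partial\ } \Lambda^0(L) \otimes V \longrightarrow 0,$$

where the boundary operator $\partial$ is given by

$$\partial(x_1 x_2 \cdots x_n \otimes v) := \sum_{1 \le s < t \le n} (-1)^{s+t+|x_s| \sum_{i=1}^{s-1} |x_i| + |x_t| \sum_{j=1}^{t-1} |x_j| + |x_s||x_t|} \, [x_s, x_t] \, x_1 \cdots \widehat{x_s} \cdots \widehat{x_t} \cdots x_n \otimes v \; + \; \sum_{s=1}^{n} (-1)^{s+|x_s| \sum_{i=s+1}^{n} |x_i|} \, x_1 \cdots \widehat{x_s} \cdots x_n \otimes x_s v. \tag{4.1}$$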
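As a sanity check on (4.1), the lowest degrees can be written out by hand. The block below is an illustrative special case, not part of the original text: it substitutes n = 1 and n = 2 into the two sums of (4.1) for homogeneous $x_1, x_2 \in L$ and $v \in V$, writes 1 for the unit of $\Lambda^0(L)$, and assumes the usual module axiom for Lie superalgebras.

```latex
% n = 1: only the second sum contributes (s = 1, empty parity sum), sign (-1)^1.
\[
  \partial(x_1 \otimes v) = -\,1 \otimes x_1 v .
\]
% n = 2: the first sum has the single term (s,t) = (1,2); its exponent
% 1 + 2 + 0 + |x_2||x_1| + |x_1||x_2| is odd, so the sign is -1.
% The second sum has s = 1 (sign (-1)^{1+|x_1||x_2|}) and s = 2 (sign +1).
\[
  \partial(x_1 x_2 \otimes v)
    = -\,[x_1,x_2] \otimes v
      \;-\; (-1)^{|x_1||x_2|}\, x_2 \otimes x_1 v
      \;+\; x_1 \otimes x_2 v .
\]
% Consistency check: \partial^2 = 0, using the module axiom
% [x_1,x_2] v = x_1 x_2 v - (-1)^{|x_1||x_2|} x_2 x_1 v.
\[
  \partial^2(x_1 x_2 \otimes v)
    = 1 \otimes \bigl( [x_1,x_2] v
        + (-1)^{|x_1||x_2|}\, x_2 x_1 v
        - x_1 x_2 v \bigr)
    = 0 .
\]
```

The vanishing of $\partial^2$ in degree 2 is exactly the statement that $V$ is an $L$-module, which is a quick way to confirm the signs in (4.1).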
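Here the $x_i$'s are homogeneous elements in $L$ and $v \in V$. Furthermore $[x_s, x_t] \in \Lambda(L)$ denotes the linear term corresponding to $[x_s, x_t] \in L$ and, as usual, $\widehat{y}$ indicates that the term $y$ is omitted.

4.2. Comparison of homology groups. Here and further we shall suppress the subscript $Y$ and denote $(\widetilde{u}_Y)_-$, $(u_Y)_-$ and $(\overline{u}_Y)_-$ by $\widetilde{u}_-$, $u_-$ and $\overline{u}_-$, respectively. For $\widetilde{M} \in \widetilde{O}_Y^f$ we denote by $M = T(\widetilde{M}) \in O_Y^f$ and $\overline{M} = \overline{T}(\widetilde{M}) \in \overline{O}_Y^f$. Let

$$\widetilde{d} : \Lambda(\widetilde{u}_-) \otimes \widetilde{M} \to \Lambda(\widetilde{u}_-) \otimes \widetilde{M}, \qquad d : \Lambda(u_-) \otimes M \to \Lambda(u_-) \otimes M \qquad \text{and} \qquad \overline{d} : \Lambda(\overline{u}_-) \otimes \overline{M} \to \Lambda(\overline{u}_-) \otimes \overline{M}$$

be the boundary operator of the complex of $\widetilde{u}_-$-homology with coefficients in $\widetilde{M}$, the boundary operator of the complex of $u_-$-homology with coefficients in $M$ and the boundary operator of the complex of $\overline{u}_-$-homology with coefficients in $\overline{M}$, respectively. Note that $\widetilde{d}$, $d$ and $\overline{d}$ are an $\widetilde{l}_Y$-homomorphism, an $l_Y$-homomorphism and an $\overline{l}_Y$-homomorphism, respectively. The following lemma is easy.

Lemma 4.1. We have
(i) $T(\Lambda(\widetilde{u}_-)) = \Lambda(u_-)$,
(ii) $\overline{T}(\Lambda(\widetilde{u}_-)) = \Lambda(\overline{u}_-)$.

The $l_Y$-module $\Lambda(u_-)$ is a direct sum of $L(l_Y, \mu)$, $\mu \in P_Y$, each appearing with finite multiplicity. Using [S,BR] one can show that $\Lambda(\overline{u}_-)$, as an $\overline{l}_Y$-module, is a direct sum of $L(\overline{l}_Y, \mu^\natural)$, $\mu \in P_Y$, each appearing with finite multiplicity ([CK, Lemma 3.2]). Similarly it follows that $\Lambda(\widetilde{u}_-)$, as an $\widetilde{l}_Y$-module, is also a direct sum of $L(\widetilde{l}_Y, \mu^\theta)$, $\mu \in P_Y$, each appearing with finite multiplicity ([CK, Sect. 3.2.3]). The $l_Y$-module $\Lambda(u_-) \otimes M$ is of course completely reducible. The $\widetilde{l}_Y$-module $\Lambda(\widetilde{u}_-) \otimes \widetilde{M}$ and the $\overline{l}_Y$-module $\Lambda(\overline{u}_-) \otimes \overline{M}$ are completely reducible by [CK, Theorem 3.2] and [CK, Theorem 3.1], respectively.

Lemma 4.2. For $\widetilde{M} \in \widetilde{O}_Y^f$ and $\lambda \in P_Y$, we have
(i) $T(\Lambda(\widetilde{u}_-) \otimes \widetilde{M}) = \Lambda(u_-) \otimes M$, and thus $T(\Lambda(\widetilde{u}_-) \otimes \widetilde{L}(\lambda^\theta)) = \Lambda(u_-) \otimes L(\lambda)$. Moreover, $T[\widetilde{d}] = d$.
(ii) $\overline{T}(\Lambda(\widetilde{u}_-) \otimes \widetilde{M}) = \Lambda(\overline{u}_-) \otimes \overline{M}$, and thus $\overline{T}(\Lambda(\widetilde{u}_-) \otimes \widetilde{L}(\lambda^\theta)) = \Lambda(\overline{u}_-) \otimes \overline{L}(\lambda^\natural)$. Moreover, $\overline{T}[\widetilde{d}] = \overline{d}$.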
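Proof. By Lemma 4.1, Theorem 3.6 and the compatibility of $T$ and $\overline{T}$ under tensor product we have the first part of (i) and (ii). Using the definitions (4.1) of $\widetilde{d}$, $d$ and $\overline{d}$, we have $\widetilde{d}(v) = d(v)$ for all $v \in \Lambda(u_-) \otimes M$ and $\widetilde{d}(w) = \overline{d}(w)$ for all $w \in \Lambda(\overline{u}_-) \otimes \overline{M}$. Hence we have $T[\widetilde{d}] = d$ and $\overline{T}[\widetilde{d}] = \overline{d}$.

Lemmas 3.4 and 4.2 now imply the following.

Lemma 4.3. Suppose $\Lambda(\widetilde{u}_-) \otimes \widetilde{M} \cong \bigoplus_{\mu \in P_Y} L(\widetilde{l}_Y, \mu^\theta)^{m(\mu)}$, as $\widetilde{l}_Y$-modules. Then
(i) $\Lambda(u_-) \otimes M \cong \bigoplus_{\mu \in P_Y} L(l_Y, \mu)^{m(\mu)}$, as $l_Y$-modules.
(ii) $\Lambda(\overline{u}_-) \otimes \overline{M} \cong \bigoplus_{\mu \in P_Y} L(\overline{l}_Y, \mu^\natural)^{m(\mu)}$, as $\overline{l}_Y$-modules.

By Lemma 4.2 and (3.2), we have the following commutative diagram:

$$\begin{array}{ccccccccc}
\cdots & \xrightarrow{\ \widetilde{d}\ } & \Lambda^{n+1}(\widetilde{u}_-) \otimes \widetilde{M} & \xrightarrow{\ \widetilde{d}\ } & \Lambda^{n}(\widetilde{u}_-) \otimes \widetilde{M} & \xrightarrow{\ \widetilde{d}\ } & \Lambda^{n-1}(\widetilde{u}_-) \otimes \widetilde{M} & \xrightarrow{\ \widetilde{d}\ } & \cdots \\
 & & \Big\downarrow {\scriptstyle T_{\Lambda^{n+1}(\widetilde{u}_-) \otimes \widetilde{M}}} & & \Big\downarrow {\scriptstyle T_{\Lambda^{n}(\widetilde{u}_-) \otimes \widetilde{M}}} & & \Big\downarrow {\scriptstyle T_{\Lambda^{n-1}(\widetilde{u}_-) \otimes \widetilde{M}}} & & \\
\cdots & \xrightarrow{\ d\ } & \Lambda^{n+1}(u_-) \otimes M & \xrightarrow{\ d\ } & \Lambda^{n}(u_-) \otimes M & \xrightarrow{\ d\ } & \Lambda^{n-1}(u_-) \otimes M & \xrightarrow{\ d\ } & \cdots
\end{array} \tag{4.2}$$

Thus $T$ induces an $l_Y$-homomorphism from $H_n(\widetilde{u}_-; \widetilde{M})$ to $H_n(u_-; M)$. Similarly, $\overline{T}$ induces an $\overline{l}_Y$-homomorphism from $H_n(\widetilde{u}_-; \widetilde{M})$ to $H_n(\overline{u}_-; \overline{M})$. Moreover, we have the following.

Theorem 4.4. We have for $n \ge 0$,
(i) $T(H_n(\widetilde{u}_-; \widetilde{M})) \cong H_n(u_-; M)$, as $l_Y$-modules.
(ii) $\overline{T}(H_n(\widetilde{u}_-; \widetilde{M})) \cong H_n(\overline{u}_-; \overline{M})$, as $\overline{l}_Y$-modules.

Proof. We shall only prove (i), as the argument for (ii) is parallel. By Lemma 4.2 and (4.2), we have

$$T(\mathrm{Ker}(\widetilde{d})) = \mathrm{Ker}(\widetilde{d}) \cap (\Lambda(u_-) \otimes M) = \mathrm{Ker}(d)$$

and

$$T(\mathrm{Im}(\widetilde{d})) = \mathrm{Im}(\widetilde{d}) \cap (\Lambda(u_-) \otimes M) = \mathrm{Im}(d).$$

Since $T$ is an exact functor, we have

$$T\Big(\bigoplus_{n \ge 0} H_n(\widetilde{u}_-; \widetilde{M})\Big) = T(\mathrm{Ker}(\widetilde{d})) / T(\mathrm{Im}(\widetilde{d})) = \mathrm{Ker}(d)/\mathrm{Im}(d) = \bigoplus_{n \ge 0} H_n(u_-; M).$$

This completes the proof of the theorem.

Theorem 3.6 implies the following.

Corollary 4.5. For $\lambda \in P_Y$ and $n \ge 0$, we have
(i) $T(H_n(\widetilde{u}_-; \widetilde{L}(\lambda^\theta))) \cong H_n(u_-; L(\lambda))$, as $l_Y$-modules.
(ii) $\overline{T}(H_n(\widetilde{u}_-; \widetilde{L}(\lambda^\theta))) \cong H_n(\overline{u}_-; \overline{L}(\lambda^\natural))$, as $\overline{l}_Y$-modules.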
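The step "since T is an exact functor" in the proof of Theorem 4.4 rests on a general fact about exact functors and homology. The following is a sketch in generic notation; the symbols $F$, $C$, $Z$, $B$, $H$ are placeholders introduced here for illustration and do not appear in the original text.

```latex
% Generic setting: (C, d) is a complex in an abelian category, d^2 = 0,
% Z = Ker d, B = Im d, H = Z/B, and F is an exact (additive) functor.
\begin{align*}
  0 \to Z \to C \xrightarrow{\,d\,} C \ \text{exact}
    &\;\Longrightarrow\;
    0 \to F(Z) \to F(C) \xrightarrow{\,F(d)\,} F(C) \ \text{exact},
    &&\text{so } F(Z) = \operatorname{Ker} F(d); \\
  C \twoheadrightarrow B \hookrightarrow C \ \text{factors } d
    &\;\Longrightarrow\;
    F(C) \twoheadrightarrow F(B) \hookrightarrow F(C) \ \text{factors } F(d),
    &&\text{so } F(B) = \operatorname{Im} F(d); \\
  0 \to B \to Z \to H \to 0 \ \text{exact}
    &\;\Longrightarrow\;
    0 \to F(B) \to F(Z) \to F(H) \to 0 \ \text{exact},
    &&\text{so } F(H) \cong \operatorname{Ker} F(d) / \operatorname{Im} F(d).
\end{align*}
```

In the proof above this is applied with $F = T$ and with the total complex, which is why the conclusion is stated for the direct sum of all homology groups at once.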
4.3. Kazhdan-Lusztig polynomials. Let gl∞ be the infinite-dimensional general linear algebra with basis consisting of elementary matrices E i j , i, j ∈ Z. Let Uq (gl∞ ) be its quantum group acting on the natural module V (see [B, §2-c] or [CW, §2.1] for precise definition). Let W be the restricted dual of V ([B, §2-d], [CW, §2.3]). Let s s ∼ m 1 , . . . , m s ∈ N with i=1 m i = m and l<0 i=1 gl(m i ). Consider a certain Y = topological completion Em|∞ of the Fock space ([B, §2-d], [CW, §2.3]) E m|∞ := m 1 (V) ⊗ m 2 (V) ⊗ · · · m s (V) ⊗ ∞ (W). By arguments essentially going back to [KL] (cf. [B, Theorem 2.17]) Em|∞ has three sets of distinguished basis, namely the standard, canonical and dual canonical basis, parameterized by PY , denoted respectively by {K fλ |λ ∈ PY }, {U fλ |λ ∈ PY }, and {L fλ |λ ∈ PY }. Furthermore one has U fλ =
u μ λ (q)K
μ∈PY
f μ ,
L fλ =
μ∈PY
μ λ (q)K
f μ ,
(4.3)
where u μ λ (q) ∈ Z[q] and μ λ (q) ∈ Z[q −1 ] [CW, (2.3)], which are parabolic versions of [B, (2.18)]. The following theorem is an analogue of Vogan’s cohomological interpretation of the Kazhdan-Lusztig polynomials. Theorem 4.6. We have for λ, μ ∈ PY , μ λ (−q −1 ) =
∞
dimC HomlY L(lY , μ ), Hn (u− ; L(λ )) q n .
n=0
Proof. Consider a topological completion Em+∞ of the Fock space [CW, §2.2] E m+∞ := m 1 (V) ⊗ m 2 (V) ⊗ · · · m s (V) ⊗ ∞ (V). Em+∞ has the standard, canonical and dual canonical basis, parameterized by PY , denoted respectively by {K fλ |λ ∈ PY }, {U f λ |λ ∈ PY }, and {L f λ |λ ∈ PY }. Similarly to (4.3) one has U fλ = uμλ (q)K fμ , L fλ = lμλ (q)K fμ , μ∈PY
μ∈PY
where uμλ (q) ∈ Z[q] and lμλ (q) ∈ Z[q −1 ]. It is folklore that (cf. [CW, Theorems 4.15 and 4.16]) chL(λ) = lμλ (1)chK (μ). (4.4) μ∈PY
From [V, Conj. 3.4] and the Kazhdan-Lusztig conjecture proved in [BB,BK] we conclude lμλ (−q −1 ) =
∞ n=0
dimC HomlY (L(lY , μ), Hn (u− ; L(λ))) q n .
Now by [CW, Theorem 4.7] we have μ λ (q) = lμλ (q). By Corollary 4.5 we have dimC HomlY (L(lY , μ), Hn (u− ; L(λ))) = dimC HomlY L(lY , μ ), Hn (u− ; L(λ )) , and hence the theorem follows. f
Remark 4.7. Let U (λ ) denote the tilting module in OY corresponding to λ ∈ P Y [CW, Theorem 3.14]. By Remark 3.17 we have chL(λ ) = μ∈PY μ λ (1)chK (μ ). This, together with the remark following [CW, Conj. 3.10], implies that chU (λ ) =
u μ λ (1)chK (μ ).
μ∈PY
5. Super Duality

5.1. Equivalence of the categories $O_Y^f$ and $\overline{O}_Y^f$. The goal of this section is to establish the following:

Theorem 5.1. Recall $T$ and $\overline{T}$ from Sect. 3.2. We have the following:
(i) $T : \widetilde{O}_Y^f \to O_Y^f$ is an equivalence of categories.
(ii) $\overline{T} : \widetilde{O}_Y^f \to \overline{O}_Y^f$ is an equivalence of categories.
(iii) The categories $O_Y^f$ and $\overline{O}_Y^f$ are equivalent.

Since by Sect. 2.5 $\overline{O}_Y^f \equiv \overline{O}_Y^{f,\bar 0}$ and $\widetilde{O}_Y^f \equiv \widetilde{O}_Y^{f,\bar 0}$, it is enough to prove Theorem 5.1 for $\overline{O}_Y^{f,\bar 0}$ and $\widetilde{O}_Y^{f,\bar 0}$. In order to keep notation simple we will from now on drop the superscript $\bar 0$ and use $\overline{O}_Y^f$ and $\widetilde{O}_Y^f$ to denote the respective categories $\overline{O}_Y^{f,\bar 0}$ and $\widetilde{O}_Y^{f,\bar 0}$ for the remainder of the article. Henceforth, when we write $\widetilde{K}(\lambda^\theta) \in \widetilde{O}_Y^f$ and $\widetilde{L}(\lambda^\theta) \in \widetilde{O}_Y^f$, $\lambda \in P_Y$, we will mean the corresponding modules equipped with the $\mathbb{Z}_2$-gradation (2.1). A similar convention applies to $\overline{K}(\lambda^\natural) \in \overline{O}_Y^f$ and $\overline{L}(\lambda^\natural) \in \overline{O}_Y^f$.

For $M, N \in O_Y^f$ and $i \in \mathbb{N}$ the $i$-th extension $\mathrm{Ext}^i_{O_Y^f}(M, N)$ can be understood in the sense of Baer-Yoneda (see e.g. [M, Chap. VII]) and $\mathrm{Ext}^0_{O_Y^f}(M, N) := \mathrm{Hom}_{O_Y^f}(M, N)$. In a similar way extensions in $\overline{O}_Y^f$ and $\widetilde{O}_Y^f$ can be interpreted. From this viewpoint the exact functors $T$ and $\overline{T}$ induce natural maps on extensions by taking the projection of the corresponding exact sequences.

For $\widetilde{M} \in \widetilde{O}_Y^f$ we let $M = T(\widetilde{M}) \in O_Y^f$ and $\overline{M} = \overline{T}(\widetilde{M}) \in \overline{O}_Y^f$. Since all the proofs in this section for the functors $T$ and $\overline{T}$ are parallel, we shall only give proofs for the functor $T$ without further explanation.
Lemma 5.2. Let 0 −→ A −→ B−→C −→ 0 be an exact sequence of g-modules (respectively, g-modules, g-modules) such that A, f f f f (respectively, O f , OYf ). C ∈ OY (respectively, OY , OY ). Then B also belongs to O Y Y f f f Proof. The statement for OY is clear. The statements for the categories OY and O Y follow, for example, from [CK, Theorems 3.1 and 3.2].
For g-(respectively, g-, and g-)modules A and C, let E xt i tively, E xt i (C, (U (g),U (lY ))
A) and E xt i(U (g),U (lY )) (C,
(C, A) (respec-
g),U (lY )) (U ( A)) denote the i th relative extension
group of A by C (see e.g. [Ku, Appendix D]). Let C be a uY -(respectively, uY -, and uY -)modules. Let H i ( uY ; C) (respectively, H i (uY ; C) and H i (uY ; C) ) denote the i th uY -(respectively, uY - and uY -)cohomology group with coefficients in C. Let Hi ( uY ; C) (respectively, Hi (uY ; C) and Hi (uY ; C) ) th denote the i restricted (in the sense of [L, Sect. 4]) uY -(respectively, uY - and uY -) cohomology group with coefficients in the C. The following proposition is an analogue of [RW, §7 Theorem 2] (cf. [Ku, Lemma 9.1.8]). Y . For i ≥ 0, we have ∈O Proposition 5.3. Let λ ∈ PY and N (i) (λθ ), N ) ∼ ) E xt i(U (g),U (l )) ( K lY , λθ ), H i ( uY ; N = HomlY L( Y ∼ ) . lY , λθ ), Hi ( uY ; N = HomlY L( (ii) (K (λ ), N ) ∼ HomlY L(lY , λ ), H i (uY ; N ) = (U (g),U (lY )) ∼ = HomlY L(lY , λ ), Hi (uY ; N ) .
E xt i
(iii) E xt i(U (g),U (lY )) (K (λ), N ) ∼ = HomlY L(lY , λ), H i (uY ; N ) ∼ = HomlY L(lY , λ), Hi (uY ; N ) . Proof. We have the following relative version of Koszul resolution for the trivial module pY -modules (see e.g. [GL, §1]): ∂k+1
∂k
∂k−1
∂1
k −→ C k−1 −→ · · · −→ C 0 −→ C −→ 0, · · · −→ C
(5.1)
k := U( pY ) ⊗U (lY ) k ( pY / lY ) is a pY -module with pY acting on the left of first where C factor for k ≥ 0 and is the augmentation map from U( pY ) to C. The pY -homomorphism ∂k is given by
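$$\partial_k(a \otimes \overline{x}_1 \overline{x}_2 \cdots \overline{x}_k) := \sum_{1 \le s < t \le k} (-1)^{s+t+|x_s| \sum_{i=1}^{s-1} |x_i| + |x_t| \sum_{j=1}^{t-1} |x_j| + |x_s||x_t|} \, a \otimes \overline{[x_s, x_t]} \, \overline{x}_1 \cdots \widehat{\overline{x}_s} \cdots \widehat{\overline{x}_t} \cdots \overline{x}_k \; + \; \sum_{s=1}^{k} (-1)^{s+1+|x_s| \sum_{i=1}^{s-1} |x_i|} \, a x_s \otimes \overline{x}_1 \cdots \widehat{\overline{x}_s} \cdots \overline{x}_k. \tag{5.2}$$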
pY and x i denotes xi + lY Here a ∈ U( pY ) and the xi s are homogeneous elements in k ∼ uY in pY /lY . As usual, y indicates that the term y is omitted. Since C uY ) ⊗ k = U( k is completely reducible as as lY -module, C lY -module, and hence the image of ∂k is a k−1 . direct summand of C Let L( lY , λθ ) also denote the irreducible pY -module on which uY acts trivially. For k := C k ⊗ L( λ ∈ PY and k ≥ 0, D lY , λθ ) is a pY -module. Tensoring (5.1) with L( lY , λ θ ) we obtain an exact sequence of pY -modules
dk−1 dk+1 dk d0 d1 k −→ k−1 −→ 0 −→ D · · · −→ D L( lY , λθ ) −→ 0, · · · −→ D
(5.3)
where dk := ∂k ⊗ 1 for k > 0 and d0 := ⊗ 1. k := U( k . Tensoring (5.3) with U( For k ≥ 0, let E g) ⊗U (pY ) D g)⊗U (pY ) we obtain an exact sequence of g-modules ρ k−1 ρ k ρ 0 ρ k+1 ρ 1 k −→ k−1 −→ 0 −→ (λθ ) −→ 0, E · · · −→ E K · · · −→ E
(5.4)
where ρ k := 1 ⊗ dk for k ≥ 0. We observe that k ⊗ L 0 ( k = U( g) ⊗U (pY ) C lY , λ θ ) E = U( g) ⊗U (pY ) U( pY ) ⊗U (lY ) k ( pY / lY ) ⊗ L( lY , λ θ ) ∼ pY / lY ) ⊗ L( lY , λ θ ) . g) ⊗U (lY ) k ( = U( k s are (U( By [Ku, Lemma 3.1.7] the E g), U( lY ))-projective modules. Since the image k−1 , the image of ρ of ∂k in (5.1) is a U( lY )-direct summand of C k in (5.4) is also a U(lY )-direct summand of E k−1 , for k ≥ 1, and hence (5.4) is a ((U( g), U( lY ))(λθ ). It follows therefore that the relative extension group projective resolution of K th θ E xt i ( K (λ ), N ) equals the i cohomology group of the following complex: g),U (lY )) (U (
ρ ∗
ρ ∗
ρ ∗
1 2 3 0 , N ) −→ 1 , N ) −→ 2 , N ) −→ HomU (g) ( E HomU (g) ( E ··· . 0 −→ HomU (g) ( E (5.5)
Since
i , N ) ∼ ) , lY , λθ ), HomC (i HomU (g) ( E uY , N = HomU (lY ) L(
) for i ≥ 0, the i th cohomology group of (5.5) equals HomlY L( lY , λθ ), H i ( uY ; N and hence for each i ≥ 0, (λθ ), N ) ∼ ) . E xt i(U (g),U (l )) ( K lY , λθ ), H i ( uY ; N = HomlY L( Y
) with weights in ) Since the h-semisimple submodule of H i ( uY ; N equals Hi ( uY ; N we have ) ∼ ) . HomlY L( lY , λθ ), H i ( lY , λθ ), Hi ( uY ; N uY ; N = HomlY L( This completes the proof of part (i). The proofs of (ii) and (iii) are analogous. Y . For i ≥ 0, we have ∈O Corollary 5.4. Let λ ∈ PY and N (λθ ), N ) ∼ E xt i(U (g),U (l )) ( K = E xt i
(U (g),U (lY ))
Y
(K (λ ), N ) ∼ = E xt i(U (g),U (lY )) (K (λ), N ).
Proof. Using the arguments of the proof of Theorem 4.4, we can show that ) ∼ T Hi ( uY ; N = Hi (uY ; N ), and hence T induces an isomorphism (∀λ ∈ PY , ∀i ∈ Z+ ) ∼ = ) −→ lY , λθ ), Hi ( uY ; N HomlY L(lY , λ), Hi (uY ; N ) . T : HomlY L(
(5.6)
By Proposition 5.3, we have (λθ ), N ) ∼ E xt i(U (g),U (l )) ( K = E xt i(U (g),U (lY )) (K (λ), N ). Y
Similarly, we have (λθ ), N ) ∼ E xt i(U (g),U (l )) ( K = E xt i
(U (g),U (lY ))
Y
(K (λ ), N ).
Y . For i = 0, 1, we have ∈O Lemma 5.5. Let λ ∈ PY and N i θ (λ ), N ) → Exti (K (λ), N ) is an isomorphism, (i) T : Ext ( K Y O
OY
OY
OY
(λθ ), N ) → Exti (ii) T : Exti ( K
(K (λ ), N ) is an isomorphism.
(λθ ), N ) is isomorphic to the equivalence Proof. It is well known that E xt 1(U (g),U (l )) ( K Y by K (λθ ) [Ho, §2] and classes of lY -trivial extensions of N (λθ ), N ) = Homg( K (λθ ), N ). E xt 0(U (g),U (l )) ( K Y
Hence we have, for i = 0, 1, i θ ∼ (λθ ), N ). (K ExtiO ( K (λ ), N ) = E xt (U ( g),U (l )) Y
Y
Similarly, for i = 0, 1, ExtiOY (K (λ), N ) ∼ = E xt i(U (g),U (lY )) (K (λ), N ). By Corollary 5.4, we have, for i = 0, 1, i θ ∼ ExtiO ( K (λ ), N ) = Ext OY (K (λ), N ). Y
(5.7)
Since all the isomorphisms involved are natural, it is not hard to see the isomorphism (5.7) is indeed induced by T . This completes the proof of (i). Part (ii) is analogous.
f and ∈O Lemma 5.6. Let N Y
i −→ 0 −→ M−→ M 0 −→ M
(5.8)
. Then be an exact sequence of g-modules in O Y f
(i) The (5.8) induces the following commutative diagram with exact rows. (We will use subscripts to distinguish various maps induced by T .) ∂ ,N N −−−−−→ Hom f M ,N −−−−−→ Hom f M, −−−− 0 −−−−−→ Hom f M −→ OY O O Y Y ⏐ ⏐ ⏐ ⏐T ⏐T ⏐T ,N N ,N M M, M ∂ 0 −−−−−→ Hom f M , N −−−−−→ Hom f (M, N ) −−−−−→ Hom f M , N −−−−−→ OY
OY
∂
−−−−−→ Ext1
f O Y
∂
−−−−−→ Ext1
f
OY
OY
,N N −−−−−→ Ext 1 ,N −−−−−→ Ext 1 M M, M f f O O Y Y ⏐ ⏐ ⏐ ⏐T 1 ⏐T 1 ⏐T 1 M M, M , N N , N 1 1 M , N −−−−−→ Ext f (M, N ) −−−−−→ Ext f M , N . OY
OY
(ii) The analogous statement holds replacing T by T in (i), M by M, et cetera. Proof. We shall only prove (i), as the argument for (ii) is analogous. By [M, Chap. VII, Prop. 2.2], the rows are exact. We only need to show that the following diagram is commutative: −−∂−→ Ext 1 f M ,N HomO f M , N − OY Y ⏐ ⏐ ⏐T ⏐T 1 M , N M , N ∂ HomO f M , N −−−−→ Ext 1 f M , N . Y
Let f ∈ HomO f
Y
, N . Then M ∂( f ) ∈ Ext 1
f O
OY
, N is the bottom exact row of the M
Y
following commutative diagram:
−−−i−→ 0 −−−−→ M ⏐ ⏐ f
−−−−→ M ⏐ ⏐
−−−−→ 0 M
−−−−→ E −−−−→ M −−−−→ 0. 0 −−−−→ N is the pushout of f ]. Here E f and i, and all maps are the obvious ones. Let f := TM , N [ 1 Since T is compatible with pushouts, TM , N (∂( f )) = ∂( f ), which is the pushout of f ]) = ∂( f ). This f and T [i]. On the other hand, TM , N , N [ f ] = f, and hence ∂(TM [ completes the proof. By Lemma 3.4 and the fact that T ( ϕ ) = 0 and T ( ϕ ) = 0 for any nonzero ϕ from L( lY , λθ ) to itself with λ ∈ PY , we have the following. lY -homomorphism
Y . We have N ∈O Lemma 5.7. Let M, (i) T : Hom f ( M, N ) → Hom f (M, N ) is an injection, OY
OY
Y
OY
(ii) T : HomO f ( M, N ) → Hom
f
(M, N ) is an injection.
f . We have ∈O Lemma 5.8. Let λ ∈ PY and N Y θ (i) T : HomO f ( L(λ ), N ) → Hom O f (L(λ), N ) is an isomorphism. Y Y ) → Hom f (L(λ ), N ) is an isomorphism. (ii) T : Hom f ( L(λθ ), N OY
OY
Proof. Consider the commutative diagram with exact rows: −−−−→ K (λθ ) −−−−→ 0 −−−−→ M L(λθ ) −−−−→ 0 ⏐ ⏐ ⏐ ⏐T θ ⏐T θ ⏐T K(λ ) L(λ ) M 0 −−−−→ M −−−−→ K (λ) −−−−→ L(λ) −−−−→ 0. We obtain the following commutative diagram with exact rows: θ θ 0 −−−−→ HomO f L(λ ), N −−−−→ Hom O f K (λ ), N −−−−→ Hom O f M, N Y ⏐ Y ⏐ Y⏐ ⏐T θ ⏐T θ ⏐T N L(λ ), N K(λ ), N M, 0 −−−−→ HomO f (L(λ), N ) −−−−→ HomO f (K (λ), N ) −−−−→ HomO f (M, N ) . Y
Y
Y
By Lemma 5.5 TK(λθ ), N is an isomorphism and by Lemma 5.7 TM, N is an injection. This implies that T is an isomorphism. θ L(λ ), N f . We have ∈O Lemma 5.9. Let λ ∈ PY and N Y ) → Ext 1 f (L(λ), N ) is an injection. (i) T : Ext 1 f ( L(λθ ), N OY
OY
OY
OY
) → Ext 1 f (L(λ ), N ) is an injection. (ii) T : Ext 1 f ( L(λθ ), N Proof. Let
f −→ E −→ 0 −→ N L(λθ ) −→ 0
(5.9)
be an exact sequence of g-modules. Suppose that (5.9) gives rise to a split exact sequence of g-modules T[ f]
0 −→ N −→ E −→ L(λ) −→ 0. Thus there exists ψ ∈ HomO f (L(λ), E) such that T [ f ] ◦ ψ = 1 L(λ) . By Lemma 5.8 Y θ such that T [ψ ∈ Hom f ( ] = ψ. Thus T [ ] = T [ there exists ψ L(λ ), E) f ◦ψ f]◦ OY
] = 1 L(λ) . By Lemma 5.8 we have = 1 T [ψ f ◦ψ L(λθ ) , and hence (5.9) is split. f . We have N ∈O Lemma 5.10. Let M, Y (i) T : HomO f ( M, N ) → Hom O f (M, N ) is an isomorphism. Y Y N ) → Hom f (M, N ) is an isomorphism. (ii) T : Hom f ( M, OY
OY
If M is Proof. We proceed by induction on the length of a composition series of M. irreducible, then it is true by Lemma 5.8. Consider the following commutative diagram with exact top row of g-modules and exact bottom row of g-modules,
−−−i−→ 0 −−−−→ M ⏐ ⏐T M
−−−−→ M L(λθ ) −−−−→ 0 ⏐ ⏐ ⏐T ⏐T θ M L(λ )
(5.10)
i
0 −−−−→ M −−−−→ M −−−−→ L(λ) −−−−→ 0. The sequence (5.10) induces the following commutative diagram with exact rows: i∗ θ 0 −−−−→ HomO f L(λ ), N ) −−−−→ Hom O f M, N −−−−→ Y Y ⏐ ⏐ ⏐T θ ⏐T N L(λ ), N M, i∗
0 −−−−→ HomO f (L(λ), N ) −−−−→ HomO f (M, N ) −−−−→ Y
Y
i∗ 1 −−−−→ HomO f M , N −−−−→ Ext O f Y Y ⏐ ⏐T M ,N i∗ −−−−→ HomO f M , N −−−−→ Ext 1 f Y
OY
L(λθ ), N ⏐ ⏐T 1 L(λθ ), N
(L(λ), N ) .
1 The map TM , N is an isomorphism by induction. The map T is an injection by L(λθ ), N Lemma 5.9. Also T N is is an isomorphism by Lemma 5.8. This implies that TM, L(λθ ), N an isomorphism.
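The last step of the proof of Lemma 5.10 is an instance of the four lemma. The sketch below records the diagram chase in generic notation; the objects $A, B, C, D$ and the Greek letters are placeholders introduced here. In the proof, $A$ and $D$ play the role of the Hom and $\mathrm{Ext}^1$ groups out of the irreducible quotient, $B$ is the Hom group to be compared, and $C$ is the Hom group out of the submodule $M'$.

```latex
% Assumptions: rows exact, all squares commute,
%   alpha and gamma are isomorphisms, delta is injective.
\[
\begin{array}{ccccccccc}
 0 & \to & A  & \xrightarrow{\,f\,}  & B  & \xrightarrow{\,g\,}  & C  & \xrightarrow{\,h\,}  & D  \\
   &     & \downarrow{\scriptstyle\alpha} & & \downarrow{\scriptstyle\beta}
   &     & \downarrow{\scriptstyle\gamma} & & \downarrow{\scriptstyle\delta} \\
 0 & \to & A' & \xrightarrow{\,f'\,} & B' & \xrightarrow{\,g'\,} & C' & \xrightarrow{\,h'\,} & D'
\end{array}
\]
% Claim: beta is an isomorphism.
\begin{itemize}
  \item \emph{Injective.} If $\beta(b) = 0$, then $\gamma g(b) = g'\beta(b) = 0$,
        so $g(b) = 0$ and $b = f(a)$. Then $f'\alpha(a) = \beta f(a) = 0$;
        since $f'$ and $\alpha$ are injective, $a = 0$ and hence $b = 0$.
  \item \emph{Surjective.} Given $b' \in B'$, choose $c$ with $\gamma(c) = g'(b')$.
        Then $\delta h(c) = h' g'(b') = 0$, so $h(c) = 0$ and $c = g(b)$.
        Now $g'(b' - \beta(b)) = 0$, so $b' - \beta(b) = f'(a') = f'\alpha(a) = \beta f(a)$,
        and therefore $b' = \beta(b + f(a))$.
\end{itemize}
```

Only injectivity of $\delta$ is used, which is why Lemma 5.9 (an injection on $\mathrm{Ext}^1$) suffices at this stage and the stronger Lemma 5.12 can be proved afterwards.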
f . We have N ∈O Lemma 5.11. Let M, Y N ) → Ext 1 f (M, N ) is an injection, (i) T : Ext 1 f ( M, OY
OY
OY
OY
N ) → Ext 1 f (M, N ) is an injection. (ii) T : Ext 1 f ( M, Proof. The proof is virtually identical to the proof of Lemma 5.9. Here we use Lemma 5.10 instead of Lemma 5.8. f . We have ∈O Lemma 5.12. Let λ ∈ PY and N Y ) → Ext 1 f (L(λ), N ) is an isomorphism. (i) T : Ext 1 f ( L(λθ ), N OY
OY
OY
OY
) → Ext 1 f (L(λ ), N ) is an isomorphism. (ii) T : Ext 1 f ( L(λθ ), N Proof. Consider the following commutative diagram with exact rows: π −−−−→ K (λθ ) −−− −→ L(λθ ) −−−−→ 0 0 −−−−→ M ⏐ ⏐ ⏐ ⏐T θ ⏐T θ ⏐T K(λ ) L(λ ) M π
0 −−−−→ M −−−−→ K (λ) −−−−→ L(λ) −−−−→ 0.
(5.11)
The sequence (5.11) induces the following commutative diagram with exact rows: π∗ θ ), N θ −−−→ Hom f M, N ) −−−−→ Ext1 f −−− HomO L(λ −→ f K (λ ), N − OY O Y Y ⏐ ⏐ ⏐ ⏐T θ ⏐T ⏐T 1 N K(λ ), N M, L(λθ ), N π∗
HomO f (K (λ), N ) −−−−→ HomO f (M, N ) −−−−→ Ext 1 f (L(λ), N ) −−−−→ OY Y Y π∗ (λθ ), N N −−−−→ Ext1 f M, −−−−→ Ext 1 f K OY O Y ⏐ ⏐ ⏐T 1 ⏐T 1 K(λθ ), N M, N π∗
−−−−→ Ext 1
f
OY
(K (λ), N ) −−−−→ Ext 1
f
OY
(M, N ) .
1 1 The map TM, N is an injection by Lemma 5.11. The map TK (λθ ), N is an isomorphism by
1 Lemma 5.5. Also TK(λθ ), N and TM, N are isomorphisms by Lemma 5.10. Thus T L(λθ ), N is an isomorphism.
We now have all the ingredients to prove Theorem 5.1.

Proof of Theorem 5.1. Lemma 5.12 implies that for every $M \in O_Y^f$ there exists $\widetilde{M} \in \widetilde{O}_Y^f$ such that $T(\widetilde{M}) \cong M$, i.e. the functor $T$ is essentially surjective. Now Lemma 5.10 says that $T$ is full and faithful. It is well-known that an essentially surjective functor that is full and faithful is an equivalence of categories (see e.g. [P]), proving (i). Part (ii) is proved in an analogous fashion, while (iii) follows from combining (i) and (ii).
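For reference, the categorical criterion invoked in the proof of Theorem 5.1 can be stated as follows. This is the standard formulation in generic notation (the names $F$, $\mathcal{A}$, $\mathcal{B}$ are placeholders, not the paper's), together with an indication of which lemma supplies each condition.

```latex
% A functor F : \mathcal{A} \to \mathcal{B} is an equivalence of categories
% if and only if the following three conditions hold.
\begin{enumerate}
  \item \emph{Faithful:} for all objects $X, Y$ of $\mathcal{A}$ the map
        $\operatorname{Hom}_{\mathcal{A}}(X,Y) \to
         \operatorname{Hom}_{\mathcal{B}}(F X, F Y)$ is injective.
  \item \emph{Full:} the same map is surjective.
  \item \emph{Essentially surjective:} every object of $\mathcal{B}$ is
        isomorphic to $F(X)$ for some object $X$ of $\mathcal{A}$.
\end{enumerate}
% In the proof: (1) and (2) together are the bijectivity on Hom spaces of
% Lemma 5.10; (3) is deduced from the isomorphism on Ext^1 of Lemma 5.12,
% since objects of finite length are built from irreducibles by
% successive extensions.
```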
Remark 5.13. Theorem 5.1 (iii) was stated as a conjecture in [CW, Conj. 4.18]. [CW, Conj. 4.18] in the special case of Y = [−m, −2] was already formulated in [CWZ, Conj. 6.10]. A proof of Theorem 5.1 (iii) in the special case Y = [−m, −2] was announced by Brundan and Stroppel in [BS]. The proof we have presented here is different from the one announced in [BS], as it constructs directly the functors inducing this equivalence and also does not rely on [B]. Remark 5.14. The classical BGG-type resolutions for the finite-dimensional modules of [Le] and for the unitarizable modules of [EW] in terms of parabolic Verma g-modules together with Theorem 5.1 imply the existence of BGG-type resolutions for the corresponding g- and g-modules in terms of parabolic Verma g- and g-modules, respectively. The case of g and Y = [−m, −2] was already established in [CKL]. Acknowledgements. We are very grateful to Weiqiang Wang for numerous helpful comments and suggestions. We also thank one of the referees for suggestions.
References

[B] Brundan, J.: Kazhdan-Lusztig polynomials and character formulae for the Lie superalgebra gl(m|n). J. Amer. Math. Soc. 16, 185–231 (2003)
[BB] Beilinson, A., Bernstein, J.: Localisation de g-modules. C.R. Acad. Sci. Paris Ser. I Math. 292, 15–18 (1981)
[BK] Brylinski, J.L., Kashiwara, M.: Kazhdan-Lusztig conjecture and holonomic systems. Invent. Math. 64, 387–410 (1981)
[BR] Berele, A., Regev, A.: Hook Young diagrams with applications to combinatorics and representations of Lie superalgebras. Adv. Math. 64, 118–175 (1987)
[BS] Brundan, J., Stroppel, C.: Highest weight categories arising from Khovanov’s diagram algebras I: cellularity. http://arxiv.org/abs/0806.1532v1[math.RT], 2008
[CK] Cheng, S.-J., Kwon, J.-H.: Howe duality and Kostant’s homology formula for infinite-dimensional Lie superalgebras. Int. Math. Res. Not. 2008, Art. ID rnn 085 (2008), 52 pp
[CKL] Cheng, S.-J., Kwon, J.-H., Lam, N.: A BGG-type resolution for tensor modules over general linear superalgebra. Lett. Math. Phys. 84, 75–87 (2008)
[CLW] Cheng, S.-J., Lam, N., Wang, W.: Super duality and irreducible characters of ortho-symplectic Lie superalgebras. http://arxiv.org/abs/0911.0129v1[math.RT], 2009
[CLZ] Cheng, S.-J., Lam, N., Zhang, R.B.: Character formula for infinite dimensional unitarizable modules of the general linear superalgebra. J. Algebra 273, 780–805 (2004)
[CW] Cheng, S.-J., Wang, W.: Brundan-Kazhdan-Lusztig and super duality conjectures. Publ. Res. Inst. Math. Sci. 44, 1219–1272 (2008)
[CWZ] Cheng, S.-J., Wang, W., Zhang, R.B.: Super duality and Kazhdan-Lusztig polynomials. Trans. Amer. Math. Soc. 360, 5883–5924 (2008)
[D] Deodhar, V.: On some geometric aspects of Bruhat orderings II: the parabolic analogue of Kazhdan-Lusztig polynomials. J. Algebra 111, 483–506 (1987)
[EW] Enright, T., Willenbring, J.: Hilbert series, Howe duality and branching for classical groups. Ann. of Math. (2) 159, 337–375 (2004)
[GL] Garland, H., Lepowsky, J.: Lie algebra homology and the Macdonald-Kac formulas. Invent. Math. 34, 37–76 (1976)
[H] Humphreys, J.: Representations of Semisimple Lie Algebras in the BGG Category O. Graduate Studies in Mathematics 94, Providence, RI: Amer. Math. Soc. 2008
[Ho] Hochschild, G.: Relative homological algebra. Trans. Amer. Math. Soc. 82, 246–269 (1956)
[K1] Kac, V.: Lie superalgebras. Adv. Math. 16, 8–96 (1977)
[K2] Kac, V.: Representations of Classical Lie Superalgebras. Lect. Notes in Math. 676, Berlin-Heidelberg-New York: Springer Verlag, 1978, pp. 597–626
[Ku] Kumar, S.: Kac-Moody Groups, their Flag Varieties and Representation Theory. Progress in Mathematics, 204. Boston, MA: Birkhauser Boston, Inc., 2002
[KW] Kac, V., Wakimoto, M.: Integrable highest weight modules over affine superalgebras and Appell’s function. Commun. Math. Phys. 215, 631–682 (2001)
[KL] Kazhdan, D., Lusztig, G.: Representations of Coxeter groups and Hecke algebras. Invent. Math. 53, 165–184 (1979)
[L] Liu, L.: Kostant’s formula for Kac-Moody Lie algebras. J. Algebra 149, 155–178 (1992)
[Le] Lepowsky, J.: A generalization of the Bernstein-Gelfand-Gelfand resolution. J. Algebra 49, 496–511 (1977)
[LSS] Leites, D., Saveliev, M., Serganova, V.: Embedding of osp(N/2) and the associated non-linear supersymmetric equations. In: Group Theoretical Methods in Physics, Vol. I (Yurmala, 1985), Utrecht: VNU Sci. Press, 1986, pp. 255–297
[LLT] Lascoux, A., Leclerc, B., Thibon, J.-Y.: Hecke algebras at roots of unity and crystal bases of quantum affine algebras. Commun. Math. Phys. 181, 205–263 (1996)
[M] Mitchell, B.: Theory of Categories. Pure and Applied Mathematics XVII, New York-San Francisco-London: Academic Press, 1965
[P] Popescu, N.: Abelian Categories with Applications to Rings and Modules. London Mathematical Society Monographs 3, London-New York: Academic Press, 1973
[PS] Penkov, I., Serganova, V.: Cohomology of G/P for classical complex Lie supergroups G and characters of some atypical G-modules. Ann. Inst. Fourier 39, 845–873 (1989)
[RW] Rocha-Caridi, A., Wallach, N.: Projective modules over graded Lie algebras I. Math. Z. 180, 151–177 (1982)
[S] Sergeev, A.: The tensor algebra of the identity representation as a module over the Lie superalgebras gl(n, m) and Q(n). Math. USSR Sbornik 51, 419–427 (1985)
[Se] Serganova, V.: Kazhdan-Lusztig polynomials and character formula for the Lie superalgebra gl(m|n). Selecta Math. (N.S.) 2, 607–651 (1996)
[T] Tanaka, J.: On homology and cohomology of Lie superalgebras with coefficients in their finite-dimensional representations. Proc. Japan Acad. Ser. A Math. Sci. 71(3), 51–53 (1995)
[V] Vogan, D.: Irreducible characters of semisimple Lie groups II: the Kazhdan-Lusztig conjectures. Duke Math. J. 46, 805–859 (1979)
Communicated by Y. Kawahigashi
Commun. Math. Phys. 298, 673–706 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1002-2
A Rigidity Property of Asymptotically Simple Spacetimes Arising from Conformally Flat Data Juan Antonio Valiente Kroon School of Mathematical Sciences, Queen Mary, University of London, Mile End Road, London, E1 4NS, United Kingdom. E-mail: [email protected] Received: 24 August 2009 / Accepted: 16 November 2009 Published online: 10 February 2010 – © Springer-Verlag 2010
Abstract: Given a time symmetric initial data set for the vacuum Einstein field equations which is conformally flat, it is shown that the solutions to the regular finite initial value problem at spatial infinity extend smoothly through the critical sets where null infinity touches spatial infinity if and only if the initial data coincides with Schwarzschild data.

1. Introduction

A vacuum spacetime $(\tilde{M}, \tilde{g}_{\mu\nu})$ is said to be asymptotically simple (cfr. [21,24,33]) if there exists a smooth, oriented, time-oriented, causal spacetime $(M, g_{\mu\nu})$ and a smooth function $\Omega$ (the conformal factor) on $M$ such that: (i) $M$ is a manifold with boundary $\mathscr{I}$; (ii) $\Omega > 0$ on $M \setminus \mathscr{I}$ and $\Omega = 0$, $d\Omega \neq 0$ on $\mathscr{I}$; (iii) there exists a conformal embedding $\Phi : \tilde{M} \to M \setminus \mathscr{I}$ such that $\Omega^2 \Phi^{-1*} \tilde{g}_{\mu\nu} = g_{\mu\nu}$; (iv) each null geodesic of $(\tilde{M}, \tilde{g}_{\mu\nu})$ acquires two distinct endpoints on $\mathscr{I}$. In this definition and throughout this article the word smooth is used as a synonym for $C^\infty$. Point (iv) in the definition of asymptotic simplicity excludes black hole spacetimes. In order to incorporate this class of spacetimes into the framework, the definition has to be modified. A way of doing this is through the notion of weak asymptotic simplicity (cfr. [33,37]). This point will not be of relevance in our considerations.

The definition of asymptotic simplicity was introduced in [32] as a way of characterising the asymptotic behaviour of spacetimes describing isolated systems. As such, it allows one to reformulate questions concerning the asymptotic decay of physical fields in terms of their differentiability at the conformal boundary. In particular, asymptotic simplicity implies the peeling behaviour of the Weyl tensor of the physical spacetime $(\tilde{M}, \tilde{g}_{\mu\nu})$ (see e.g. [33]). It has long been debated whether the condition on the smoothness of the conformal boundary is too strong a requirement to be imposed on a spacetime describing interesting physical phenomena.
In [9] it has been shown that there exists a large class of non-trivial asymptotically simple spacetimes. These spacetimes have the peculiarity of
being isometric to the Schwarzschild spacetime in a neighbourhood of spatial infinity. Given this particular state of affairs, the question is now: is this the only possible way to construct non-trivial asymptotically simple spacetimes? The Cauchy problem in General Relativity provides a systematic approach to address the existence of more general classes of asymptotically simple spacetimes. In the spirit of asymptotic simplicity, it is natural to work with objects and equations which are defined in the conformally rescaled spacetime (M, gμν ). The conformal Einstein field equations introduced in [13–15] and extensions thereof —see [19–21,24]— provide a suitable system of equations and unknowns. These conformal field equations are formally regular at the points where the conformal factor vanishes, and such that a solution of them implies a solution to the vacuum Einstein field equations. In particular, the conformal Einstein field equations have been used to prove a semi-global existence and stability result for hyperboloidal data [17] —see also [28]. The construction of asymptotically simple spacetimes in [9] makes use of this result. Cauchy data sets for asymptotically simple spacetimes are prescribed on hypersurfaces which are asymptotically Euclidean. The compactification of an asymptotically Euclidean hypersurface includes a singled out point for each asymptotic end —the corresponding “point at infinity”. These points become the spatial infinities of the development of the Cauchy data. Note however, that the definition of asymptotic simplicity makes no reference to these points; the reason being that if the spacetime has non-zero ADM mass then spatial infinity is a singular point of the conformal geometry —see e.g. [18]. 
It has long been acknowledged that the main obstacle in the construction of asymptotically simple spacetimes from Cauchy initial data is the lack of a detailed understanding of the structure of the Einstein field equations at spatial infinity —cfr. [16,18]. The difficulties in the analysis of the structure of spatial infinity arise from the singular nature of this point with respect to the conformal geometry —as mentioned in the previous paragraph. In particular, the Weyl tensor is singular at spatial infinity if the ADM mass is non-zero. Thus, one needs to obtain a better representation of spatial infinity. The regular finite initial value problem near spatial infinity introduced in [20] provides a powerful tool for analysing the properties of the gravitational field in the regions of spacetime “close” to both spatial and null infinity. This initial value problem makes use of the so-called extended conformal Einstein field equations and general properties of conformal structures. It is such that both the equations and the data are regular at the conformal boundary —the regular finite initial value problem at spatial infinity. Whereas the standard compactification of spacetimes considers spatial infinity as a point, the approach used in [20] represents spatial infinity as an extended set with the topology [−1, 1] × S2 . This so-called cylinder at spatial infinity is obtained as follows: starting from an asymptotically Euclidean initial data set for the Einstein vacuum equations, one performs a “standard” conformal compactification to obtain a compact manifold, S, with a singled out point, i, representing the infinity of the initial hypersurface —for simplicity it is assumed there is only one asymptotic end. In a second stage, the point i is blown-up to a 2-sphere. This blowing-up of i is achieved through the lifting of a neighbourhood of i to the bundle of space-spinor dyads. 
In the final step, one uses a congruence of timelike conformal geodesics to obtain an analogue of Gaussian coordinates in a neighbourhood of the initial hypersurface. The conformal geodesics are conformal invariants that can be used to construct a gauge which renders a canonical class of conformal factors for the development of the initial data. These conformal factors can be written entirely in terms of initial data quantities. Hence, the location of the conformal boundary is known a priori. The conformal boundary described by the
Rigidity Property of Asymptotically Simple Spacetimes
canonical conformal factor contains a null infinity with the usual structure and a spatial infinity which extends in the time dimension, so that one can speak of the cylinder at spatial infinity. The sets $\{\pm 1\} \times S^2$ will be called critical sets as they can be regarded as the collection of points where null infinity “touches” spatial infinity. Null infinity and spatial infinity do not meet tangentially at the critical points. As a consequence, some of the propagation equations implied by the conformal field equations degenerate at these sets. The analysis in [20] has shown that, as a result, the solutions to the conformal field equations develop certain types of logarithmic singularities at the critical sets. These singularities form an intrinsic part of the conformal structure. It is to be expected that the hyperbolic nature of the propagation equations will propagate these logarithmic singularities. Consequently, they will have an effect on the regularity of null infinity. However, there is no proof of this yet. It should be pointed out that the idea of resolving the structure of spatial infinity by blowing up the point $i$ to $S^2$ has arisen previously in the literature. The idea of direction-dependent tensors introduced by Geroch (see e.g. [26,27]) is essentially equivalent to the lifting of spinors to the bundle of space-spinor dyads. The notion of direction-dependent tensors was subsequently elaborated in further investigations of the structure of spatial infinity (cfr. [3,35]). One of the aspects that sets the approach put forward in [20] apart from previous studies is the prominence that the partial differential equation aspects of the problem take in the analysis. Concerning this last point, it is of interest to recall the early attempts to combine geometric and PDE points of view to study the structure of spatial infinity carried out in [4,5].
The analysis carried out in [20] —restricted to the developments of time symmetric data with an analytic conformal compactification at infinity— rendered an infinite set of necessary conditions on the initial data for the spacetime to extend analytically through the critical sets. These conditions can be formulated in terms of the Cotton tensor of the initial metric and its higher order derivatives. If these conditions are not satisfied at some order then the solutions to the conformal Einstein equations develop singularities of a very definite type at the critical sets. It is important to point out that these singularities are associated to structural properties of the principal part of the evolution equations. The ideas and techniques of [20] can be implemented in a computer algebra system. In [40] this approach has been used to analyse a certain type of asymptotic expansions of the development of initial data sets which are conformally flat in a neighbourhood of infinity. The rationale behind the use of conformally flat data sets is that they satisfy the regularity conditions of [20] automatically. They constitute the simplest type of initial data sets on which the methods of [20] can be applied to obtain non-trivial results. Although analytically simple, conformally flat initial data sets give rise to spacetimes of full complexity, and indeed, particular examples are used routinely in the simulation of head-on black hole collisions —see e.g. [1,8,29]. The key finding of [40] was to show the existence of a further type of logarithmic singularities at the critical sets. This new class of obstructions to the smoothness of null infinity arises from the interaction of the principal part (which is singular at the critical sets) and the lower order terms in the conformal Einstein field equations. 
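The way in which a degenerate principal part can produce logarithms may be seen in a toy model. The ordinary differential equation below is an illustrative assumption chosen for this purpose; it is not one of the transport equations of [20] or [40]. Its principal part vanishes at $\tau = 1$, loosely mimicking the degeneracy at the critical sets, and $n$ is a positive integer.

```latex
% Toy model (assumption, for illustration only): c and C are constants, n = 1, 2, ...
\[
(1-\tau)\,\partial_\tau y + n\,y = c\,(1-\tau)^n .
\]
% Homogeneous solution: y_h = C (1-\tau)^n.
% The source term is resonant with y_h, so the particular solution carries a logarithm:
\[
y = C\,(1-\tau)^n - c\,(1-\tau)^n \ln(1-\tau).
\]
% Check of the particular solution:
%   (1-\tau)\,\partial_\tau\bigl[(1-\tau)^n \ln(1-\tau)\bigr] + n\,(1-\tau)^n \ln(1-\tau) = -(1-\tau)^n .
```

For $c \neq 0$ the solution of the toy model is of class $C^{n-1}$ but not $C^n$ at $\tau = 1$, and the logarithm is absent exactly when $c = 0$. In this sense the model reproduces, in a much simplified form, the pattern described above: a singularity of a definite type, appearing at a definite order, which is removed by a condition on the source.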
An important observation arising from the analysis in [40] is that particular subsets of these singularities can be eliminated by setting certain pieces of the initial data to zero. Proceeding in this way it is possible to gain insight into the algebraic structure of the conformal field equations at spatial infinity, and a very definite pattern is observed. One has the following conjecture.
J. A. Valiente Kroon
Conjecture. If an initial data set for the Einstein vacuum equations which is time symmetric and conformally flat in a neighbourhood of infinity yields a development with a smooth null infinity, then the initial data set is exactly Schwarzschildean in the neighbourhood. In this article we make definite progress towards a proof of this conjecture. The analysis presented in this article is concerned with the behaviour of the solutions of the conformal Einstein field equations at the critical sets. The main result of this article is the following: Theorem. Consider a time symmetric initial data set for the Einstein vacuum field equations which is conformally flat near infinity. The solution to the regular finite initial value problem at spatial infinity is smooth through the critical sets if and only if the data is exactly Schwarzschildean in a neighbourhood of infinity. The result presented here falls short of providing a proof of the conjecture for it is a priori not clear that smoothness at null infinity follows from the smoothness at the critical sets —although the expectation is that this will be the case. Conversely, the connection between a singular behaviour at the critical sets and non-smoothness at null infinity is, as yet, not fully understood. It is expected that estimates of the type constructed in [23] should provide a way of linking the asymptotic expansions obtained by the methods of [20] with actual solutions to the conformal field equations. A generalisation of these estimates to the problem of a linear field propagating on a curved background has been constructed in [46]. In [25] it has been shown how under certain assumptions it is possible to relate the type of asymptotic expansions discussed in the present work to the better known formal asymptotic expansions of the gravitational field in the so-called Newman-Penrose gauge —see e.g. [33]. 
In [44] the methods of [25] have been used to show that the logarithmic singularities at the critical sets give rise to asymptotic expansions in the Newman-Penrose gauge which are polyhomogeneous, that is, the asymptotic expansions are given in terms of powers of $1/r$ and $\ln r$, where $r$ is a suitable affine parameter along the generators of outgoing null cones. The possibility of having this type of asymptotic behaviour was already pointed out in the early works on the subject (see e.g. [7,34]). This issue has been brought back to the fore by several authors (cfr. [10,31,47]). Asymptotic expansions containing logarithmic terms also appear in post-Newtonian expansions of the gravitational field (see e.g. [6]). At least to a certain order in the expansions, these logarithmic terms can be removed by means of a regularisation procedure. It is not clear whether these logarithmic terms in the post-Newtonian expansions are related to those appearing in our analysis, although this is certainly an issue worth exploring. In order to bring our main result into a wider context, it is pointed out that the Schwarzschild spacetime is the only static spacetime admitting conformally flat slices (see e.g. [24]). Thus, our main result together with the analysis in [39] of asymptotic expansions for more general classes of time symmetric data suggests that it should be possible to prove a generalisation of our main result in which smoothness at the critical sets implies that the data are asymptotically static at all orders at spatial infinity in some suitable sense. In particular, if the data is assumed to be real analytic in a neighbourhood of spatial infinity, then it is conjectured that the data will be static in that neighbourhood. It should be pointed out that in [20], the assumption of real analyticity was made for the sake of simplicity. However, as a consequence of the analysis of the smooth case contained in [11], it follows that the conclusions of [20] remain valid when real analyticity is replaced by smoothness.
The situation for
initial data sets with a non-vanishing second fundamental form is not as clear cut. In any case, initial data sets which are stationary in a neighbourhood of infinity are expected to play a special role (see e.g. [41–43]). The analysis leading to the main theorem provides very detailed information about the behaviour of the solutions to the conformal Einstein field equations at spatial infinity. In particular, it will be shown that the logarithmic singularities appear at higher orders than is to be expected from general arguments. Indeed, there are some structural properties of the conformal field equations (some cancellations) which make the solutions more regular than they are expected to be. It is certainly desirable to obtain a deeper understanding of the structures responsible for this behaviour. It is important to emphasise that these cancellations make the analysis much more computationally challenging than it otherwise would be. It is conceivable that the formalism of the cylinder at spatial infinity can be used to analyse the behaviour of the gravitational field coupled to matter with an appropriate conformal behaviour (like a scalar field, the Maxwell field and/or Yang-Mills fields). Although at present there is no analysis in this direction, logarithmic singularities at the critical sets similar to the ones being discussed here will arise. The reason for this is that this singular behaviour is caused by structural properties of hyperbolic equations which are shared by more complex systems of field equations. What is less clear is the type of restrictions on the initial data that one would obtain, as this would involve a detailed analysis of the constraint equations associated with the relevant field equations. In any case, the analysis is bound to be more elaborate for the non-vacuum case, as one loses the a priori knowledge of the location of the conformal boundary. Outline of the article. The present work is based on the analysis of [20] (cfr. also [24]).
It makes use of a number of structures and constructions which are not very standard. Therefore, in order to make our discussion accessible, it will be necessary to introduce a certain amount of notation and definitions. The reader is, in any case, referred to the references [20,24] for complete details. I have striven, in as much as it is possible, to respect the original conventions of [20]. The main difference is the use of capital Latin letters to denote spinorial indices. The structure of the rest of the article is as follows: Section 2 introduces some basic notational conventions and presents the notion of asymptotically Euclidean data. Section 3 provides a brief overview of time symmetric, conformally flat initial data sets and introduces the notion of data sets which are Schwarzschildean up to a certain order. Section 4 provides a summary of the blowing-up of the point at spatial infinity. The blow-up is realised by the introduction of a certain manifold Ca which is a subset of the bundle of spin-frames. It is also shown how one can introduce a calculus on this manifold, and the form that normal expansions acquire when lifted up to this manifold. The contents of this section follow reference [20], of which they constitute a very terse summary. Section 5 discusses the extended conformal field equations and the evolution equations they imply when written in a conformal Gaussian gauge system —the F-gauge. It discusses some structural properties of these evolution equations and introduces the notion of transport equations at the cylinder at spatial infinity. At the end of the section there are some brief remarks concerning the Schwarzschild spacetime in the F-gauge. Section 6 presents the decomposition of the linear transport equations in spherical harmonics. It introduces the notion of a spherical harmonic sector and discusses the
discrete symmetries implied by the time reflection symmetry of the spacetime. It also presents a procedure to solve the hierarchy of transport equations. This procedure is different from the one given in [20] as it exploits a certain analogy with the Maxwell equations (cfr. [45]). Section 7 presents a systematic analysis of the properties of the solutions of the transport equations for initial data sets which are Schwarzschildean up to a certain order. It briefly reviews the computer algebra calculations of reference [40]. It is shown that the solutions to these transport equations are more regular than what one would a priori expect, but that nevertheless, they eventually develop certain logarithmic singularities at the critical sets. The orders at which these singularities arise are determined. The key results of this section are based on very lengthy computer algebra calculations. Only qualitative aspects of these calculations are presented. In Sect. 8, the results of the previous sections are summarised in Theorem 2. This theorem is a more precise version of the main result discussed in the introductory paragraphs. Some further discussion is given, in particular regarding the potential generalisation to initial data sets which are not conformally flat.

2. Notation and Conventions

This article is concerned with the asymptotic properties of spacetimes $(\tilde{M}, \tilde{g}_{\mu\nu})$ solving the Einstein vacuum field equations
\[ \tilde{R}_{\mu\nu} = 0. \tag{1} \]
The metric $\tilde{g}_{\mu\nu}$ will be assumed to have signature $(+,-,-,-)$ and $\mu, \nu, \ldots$ are spacetime indices taking the values $0, \ldots, 3$. The spacetime $(\tilde{M}, \tilde{g}_{\mu\nu})$ will be thought of as the development of some time symmetric initial data set prescribed on an asymptotically Euclidean Cauchy hypersurface $\tilde{S}$. The time symmetric data on $\tilde{S}$ are given in terms of a 3-metric $\tilde{h}_{ij}$ of signature $(-,-,-)$. Due to the requirement of time symmetry one has that the other piece of the data, the second fundamental form of $\tilde{S}$, vanishes: $\tilde{\chi}_{ij} = 0$. The letters $i, j, k, \ldots$ will be used as spatial tensorial indices taking the values 1, 2, 3. In this case the Einstein vacuum field equations imply the constraint equation $\tilde{r} = 0$, where $\tilde{r}$ denotes the Ricci scalar of the metric $\tilde{h}_{ij}$. For simplicity, only one asymptotically flat end will be assumed. The asymptotic flatness of the initial manifold $\tilde{S}$ will be expressed in terms of conditions on a conformally rescaled manifold $S$. More precisely, it will be assumed that there is a 3-dimensional, orientable, smooth compact manifold $(S, h_{ij})$, a point $i \in S$, a diffeomorphism $\Phi : S \setminus \{i\} \to \tilde{S}$ and a function $\Omega \in C^2(S) \cap C^\infty(S \setminus \{i\})$ with the properties
\[ \Omega(i) = 0, \qquad D_j\Omega(i) = 0, \qquad D_j D_k \Omega(i) = -2 h_{jk}(i), \tag{2a} \]
\[ \Omega > 0 \ \text{on} \ S \setminus \{i\}, \tag{2b} \]
\[ h_{ij} = \Omega^2 \Phi_* \tilde{h}_{ij}. \tag{2c} \]
The last condition shall be, sloppily, written as $h_{ij} = \Omega^2 \tilde{h}_{ij}$, that is, $S \setminus \{i\}$ will be identified with $\tilde{S}$. Under these assumptions $(\tilde{S}, \tilde{h}_{ij})$ will be said to be asymptotically Euclidean and regular. Suitable punctured neighbourhoods of the point $i$ will be mapped into the asymptotic end of $\tilde{S}$. It should be clear from the context whether $i$ denotes a point or a tensorial index.
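As a quick consistency check of the sign conventions in (2a)-(2c), the following sketch works out the simplest case. It assumes, purely for illustration, that $h_{ij} = -\delta_{ij}$ in Cartesian coordinates $x^i$ centred at $i$, and it takes $\Omega = |x|^2$; neither choice is made in the text at this point.

```latex
% Assumed example: h_{ij} = -\delta_{ij} near i and \Omega = |x|^2 = \delta_{kl} x^k x^l.
\begin{align*}
\Omega(i) &= 0, \qquad \Omega > 0 \ \text{for} \ x \neq 0, \\
D_j \Omega &= 2\,\delta_{jk}\, x^k \quad\Longrightarrow\quad D_j\Omega(i) = 0, \\
D_j D_k \Omega &= 2\,\delta_{jk} = -2\,h_{jk}, \\
\tilde{h}_{ij} &= \Omega^{-2} h_{ij} = -|x|^{-4}\,\delta_{ij}.
\end{align*}
% With the inversion y^i = x^i/|x|^2 one has
%   \delta_{ij}\, dy^i dy^j = |x|^{-4}\, \delta_{ij}\, dx^i dx^j ,
% so \tilde{h}_{ij} is the flat metric of signature (-,-,-) in the inverted coordinates y^i.
```

The negative definite signature is what produces the factor $-2h_{jk}$ in (2a): with $h_{ij} = -\delta_{ij}$ the Hessian of $|x|^2$ equals $2\delta_{jk} = -2h_{jk}$, and the point $x = 0$ corresponds to the infinity $|y| \to \infty$ of the physical metric.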
3. Conformally Flat, Time Symmetric Initial Data Sets

All throughout this article it will be assumed that one has initial data sets which are conformally flat in a neighbourhood $B_a \subset S$, $a > 0$, of $i \in S$. The expressions “near infinity” and “in a neighbourhood of infinity” used in the introductory paragraphs should be understood in this sense. The Hamiltonian constraint together with the boundary conditions (2a)-(2c) imply on $B_a(i)$ the Yamabe equation
\[ \Delta\vartheta = -4\pi\,\delta(i), \qquad \vartheta \equiv \Omega^{-1/2}, \tag{3} \]
where $\delta(i)$ denotes the Dirac delta distribution with support on $i$, and $\Delta$ is the flat Laplacian. Let $x^i$ denote Cartesian coordinates on $B_a$ such that $x^i(i) = 0$. On $B_a \setminus \{i\}$ the physical 3-metric $\tilde{h}_{ij}$ is given by $\tilde{h}_{ij} = -\vartheta^4 \delta_{ij}$, where $\delta_{ij}$ is the standard Euclidean metric in Cartesian coordinates. If $a > 0$ is suitably small, the solutions of (3) can be written on $B_a$ in the form
\[ \vartheta = \frac{1}{|x|} + W, \tag{4} \]
where $|x| = \sqrt{(x^1)^2 + (x^2)^2 + (x^3)^2}$. The function $W$ satisfies
\[ \Delta W = 0, \tag{5} \]
so that $W$ is a harmonic and analytic function. From general considerations on asymptotics, one has that $W(i) = m/2$, where $m$ is the ADM mass of the time symmetric initial data set $(\tilde{S}, \tilde{h}_{ij})$ (cfr. [18,20,24]). More generally, one has that
\[ W = \frac{m}{2} + \sum_{k=1}^{\infty} w_{i_1 \cdots i_k}\, x^{i_1} \cdots x^{i_k}, \tag{6} \]
with $w_{i_1 \cdots i_k}$ denoting constant tensors which are totally symmetric and $\delta$-tracefree so that the expression $w_{i_1 \cdots i_k} x^{i_1} \cdots x^{i_k}$ is a homogeneous harmonic polynomial of degree $k$. Conversely, due to the analyticity of the solutions to the Laplace equation (5), given a sequence $\{w_{i_1 \cdots i_k}\}$, $k = 0, \ldots, \infty$, of constant, totally symmetric and trace-free tensors (the germ of $W$) such that the sum
\[ \sum_{k=1}^{\infty} w_{i_1 \cdots i_k}\, x^{i_1} \cdots x^{i_k} \]
converges in $B_a$ for a given $a > 0$, then the sequence defines a unique solution of the Yamabe equation (3). By choosing the centre of mass of the data suitably one can, without loss of generality, consider conformal factors of the form
\[ \vartheta = \frac{1}{|x|} + \frac{m}{2} + w_{ij}\, x^i x^j + O(|x|^3), \]
that is, without dipolar contributions. The latter consideration simplifies considerably the present analysis. Data which is exactly Schwarzschildean on $B_a$ is characterised by $W = m/2$. More generally, we have the following.
Definition. A time symmetric initial data set which is conformally flat in a suitable neighbourhood $B_a$ of infinity will be said to be Schwarzschildean up to order $p_\bullet$ if and only if
\[ \vartheta = \frac{1}{|x|} + \frac{m}{2} + w_{i_1 \cdots i_{p_\bullet+1}}\, x^{i_1} \cdots x^{i_{p_\bullet+1}} + O(|x|^{p_\bullet+2}). \]
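The statement that symmetric, trace-free coefficients give a harmonic $W$ can be checked numerically. The sketch below is only an illustration under stated assumptions: it truncates the expansion of $\vartheta$ at the quadrupole term, uses made-up values for $m$ and $w_{ij}$, and approximates the flat Laplacian by central differences. The function names `theta` and `flat_laplacian` are hypothetical helpers, not part of any library.

```python
import math


def theta(x, m, w):
    """Truncated test function 1/|x| + m/2 + w_ij x^i x^j (assumption: no higher multipoles)."""
    r = math.sqrt(sum(c * c for c in x))
    quad = sum(w[i][j] * x[i] * x[j] for i in range(3) for j in range(3))
    return 1.0 / r + m / 2.0 + quad


def flat_laplacian(f, x, h=1e-3):
    """Second-order central-difference approximation of the flat Laplacian of f at x."""
    total = 0.0
    for i in range(3):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        total += (f(xp) - 2.0 * f(x) + f(xm)) / (h * h)
    return total


# Symmetric, trace-free quadrupole (illustrative values only): trace = 1 - 3 + 2 = 0.
w_tracefree = [[1.0, 0.5, 0.0], [0.5, -3.0, 0.2], [0.0, 0.2, 2.0]]
# Same tensor with the trace spoiled: trace = 2 - 3 + 2 = 1.
w_with_trace = [[2.0, 0.5, 0.0], [0.5, -3.0, 0.2], [0.0, 0.2, 2.0]]

point = [0.3, -0.4, 0.5]  # any point away from x = 0
mass = 1.0                # illustrative value of m

lap_good = flat_laplacian(lambda x: theta(x, mass, w_tracefree), point)
lap_bad = flat_laplacian(lambda x: theta(x, mass, w_with_trace), point)
```

Away from $x = 0$ the terms $1/|x|$ and $m/2$ are harmonic, so `lap_good` vanishes up to discretisation error, whereas `lap_bad` is close to $2\,\delta^{ij} w_{ij} = 2$, the Laplacian of a quadratic form with unit trace. This is why the germ of $W$ has to be trace-free.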
4. The Manifold $C_a$

In [20] a representation of the region of spacetime close to null infinity and spatial infinity has been introduced (see also the comprehensive discussion in [24]). The standard representation of this region of spacetime depicts $i^0$ as a point. In contrast, the representation introduced in [20] depicts spatial infinity as a cylinder, the cylinder at spatial infinity. This construction is briefly reviewed for the case of time symmetric initial data sets which are conformally flat in a neighbourhood $B_a(i)$ of infinity. The reader is referred to [20,24] for a thorough discussion of the details. Starting on the initial hypersurface $S$, the construction introduced in [20] makes use of a blow-up of the point $i \in S$ to the 2-sphere $S^2$. This blow-up requires the introduction of a particular bundle of spin-frames over $B_a$. Consider the (conformally rescaled) spacetime $(M, g_{\mu\nu})$ obtained as the development of the time symmetric initial data set $(S, h_{ij})$. Let $SL(S)$ be the set of spin dyads $\delta = \{\delta_A\}_{A=0,1}$ on $S$ which are normalised with respect to the alternating spinor $\epsilon_{AB}$ in such a way that $\epsilon_{01} = 1$. Let $\tau = \sqrt{2}\, e_0$, where $e_0$ is the future $g$-unit normal of $S$, and let $\tau^{AA'}$ be its spinorial counterpart. The spinor $\tau^{AA'}$ enables the introduction of space-spinors, sometimes also called $SU(2)$ spinors; see [2,12,36]. It defines a sub-bundle $SU(S)$ of $SL(S)$ with structure group $SU(2,\mathbb{C})$ and projection $\pi$ onto $S$. Given a spinorial dyad $\delta \in SU(S)$ one can define an associated vector frame $e_a$, $a = 1, 2, 3$. We shall restrict our attention to dyads related to frames $\{e_j\}_{j=0,\ldots,3}$ on $B_a$ such that $e_3$ is tangent to the $h$-geodesics starting at $i$. Let $\check{H}$ denote the horizontal vector field on $SU(S)$ projecting to the radial vector $e_3$. The fibre $\pi^{-1}(i) \subset SU(S)$ (the fibre “over” $i$) can be parametrised by choosing a fixed dyad $\delta^*$ and then letting the group $SU(2,\mathbb{C})$ act on it.

Let $(-a, a) \ni \rho \mapsto \delta(\rho, t^A{}_B) \in SU(S)$ be the integral curve to the vector $\check{H}$ satisfying $\delta(0, t^A{}_B) = \delta(t^A{}_B) \in \pi^{-1}(i)$. With this notation one defines the set
\[ C_a = \left\{ \delta(\rho, t^A{}_B) \in SU(B_a) \ \middle|\ |\rho| < a,\ t^A{}_B \in SU(2,\mathbb{C}) \right\}, \]
which is a smooth submanifold of $SU(S)$ diffeomorphic to $(-a, a) \times SU(2,\mathbb{C})$. It follows that the projection map $\pi$ of the bundle $SU(S)$ maps $C_a$ into $B_a$. The manifold $C_a$ inherits a number of structures from $B_a$. In particular, the solder and connection forms can be pulled back to smooth 1-forms on $C_a$ satisfying the structure equations which relate them to the curvature form. In the conformally flat setting discussed here, the curvature form vanishes. In the sequel $t^A{}_B \in SU(2,\mathbb{C})$ and $\rho \in \mathbb{R}$ will be used as coordinates on $C_a$. Consequently, one has that $\check{H} = \partial_\rho$. Vector fields $X_\pm$, $X$ relative to the $SU(2,\mathbb{C})$-dependent part of the coordinates can be introduced by requiring the commutation relations
\[ [X, X_+] = 2X_+, \qquad [X, X_-] = -2X_-, \qquad [X_+, X_-] = -X, \]
and by requiring that they commute with $\check{H} = \partial_\rho$. More importantly, it can be seen that for $p \in B_a \setminus \{i\}$ the projections of the fields $\partial_\rho$, $X_\pm$ span the tangent space at $p$. Given these vector fields, define the frame $c_{AB} = c_{(AB)}$ by
\[ c_{AB} = x_{AB}\,\partial_\rho + \frac{1}{\rho}\, z_{AB}\, X_+ + \frac{1}{\rho}\, y_{AB}\, X_-, \]
with constant spinors $x_{AB}$, $y_{AB}$ and $z_{AB}$ given by
\[ x_{AB} \equiv \sqrt{2}\,\epsilon_{(A}{}^{0}\epsilon_{B)}{}^{1}, \qquad y_{AB} \equiv -\frac{1}{\sqrt{2}}\,\epsilon_{A}{}^{1}\epsilon_{B}{}^{1}, \qquad z_{AB} = \frac{1}{\sqrt{2}}\,\epsilon_{A}{}^{0}\epsilon_{B}{}^{0}. \]
For these frames, the connection coefficients, $\gamma_{ABCD}$, are given in the conformally flat setting by
\[ \gamma_{ABCD} = \frac{1}{2\rho}\left( \epsilon_{AC}\, x_{BD} + \epsilon_{BD}\, x_{AC} \right). \tag{7} \]
Given $t^A{}_B \in SU(2,\mathbb{C})$, define
\[ T_m{}^{j}{}_{k}(t^A{}_B) = \binom{m}{j}^{1/2} \binom{m}{k}^{1/2} t^{(B_1}{}_{(A_1} \cdots t^{B_m)_j}{}_{A_m)_k}, \qquad T_0{}^{0}{}_{0}(t^A{}_B) = 1, \]
with $j, k = 0, \ldots, m$ and $m = 1, 2, 3, \ldots$. The subindex expression $(A_1 \cdots A_m)_k$ means that the indices are symmetrised and then $k$ of them are set equal to 1, while the remaining ones are set to 0. Details about the properties of these functions can be found in [16,20]. The functions $\sqrt{m+1}\, T_m{}^{j}{}_{k}$ form a complete orthonormal set in the Hilbert space $L^2(\mu, SU(2,\mathbb{C}))$, where $\mu$ denotes the normalised Haar measure on $SU(2,\mathbb{C})$. The action of the differential operators $X_\pm$ on the functions $T_m{}^{k}{}_{j}$ is given by
\[ X_+ T_m{}^{k}{}_{j} = \sqrt{j(m-j+1)}\; T_m{}^{k}{}_{j-1}, \qquad X_- T_m{}^{k}{}_{j} = -\sqrt{(j+1)(m-j)}\; T_m{}^{k}{}_{j+1}. \]
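The two ladder relations already fix the action of $X$ through the commutator $[X_+, X_-] = -X$ of the vector fields introduced earlier in this section. The short computation below is a sketch added for orientation; the shorthand $\beta_{m,j} \equiv \sqrt{j(m-j+1)}$ is introduced only for this purpose.

```latex
% Shorthand (introduced for this check only): \beta_{m,j} = \sqrt{j(m-j+1)}, so that
%   X_+ T_m{}^k{}_j = \beta_{m,j}\, T_m{}^k{}_{j-1}, \qquad
%   X_- T_m{}^k{}_j = -\beta_{m,j+1}\, T_m{}^k{}_{j+1}.
\begin{align*}
X_+ X_-\, T_m{}^k{}_j &= -\beta_{m,j+1}^{\,2}\, T_m{}^k{}_j = -(j+1)(m-j)\, T_m{}^k{}_j, \\
X_- X_+\, T_m{}^k{}_j &= -\beta_{m,j}^{\,2}\, T_m{}^k{}_j = -j(m-j+1)\, T_m{}^k{}_j, \\
[X_+, X_-]\, T_m{}^k{}_j &= \bigl( j(m-j+1) - (j+1)(m-j) \bigr)\, T_m{}^k{}_j = (2j-m)\, T_m{}^k{}_j, \\
X\, T_m{}^k{}_j &= -[X_+, X_-]\, T_m{}^k{}_j = (m-2j)\, T_m{}^k{}_j .
\end{align*}
```

The functions $T_m{}^{k}{}_{j}$ are thus eigenfunctions of $X$ with eigenvalue $m - 2j$. As a cross-check, $X X_+ T_m{}^{k}{}_{j} - X_+ X T_m{}^{k}{}_{j} = \bigl((m-2j+2) - (m-2j)\bigr)\beta_{m,j}\, T_m{}^{k}{}_{j-1} = 2 X_+ T_m{}^{k}{}_{j}$, in agreement with $[X, X_+] = 2X_+$.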
In the sequel, it will be necessary to lift analytic fields defined on $B_a$ to $C_a$. In particular, the lift of $|x|$ is $\rho$. More generally, let $\xi_{A_1B_1 \cdots A_lB_l}$ denote a spinorial field on $B_a$. Denote, again, by $\xi_{A_1B_1 \cdots A_lB_l}$ its lift to $C_a$. Denote by $\xi_j = \xi_{(A_1B_1 \cdots A_lB_l)_j}$, $0 \le j \le 2l$, its essential components. The function $\xi_j$ has spin weight $s = l - j$ and a unique expansion of the form
\[ \xi_j = \sum_{p=0}^{\infty} \xi_{j,p}\, \rho^p, \qquad \xi_{j,p} = \sum_{q=\max\{|l-j|,\, l-p\}}^{p+l}\ \sum_{k=0}^{2q} \xi_{j,p;2q,k}\; T_{2q}{}^{k}{}_{q-l+j}, \tag{8} \]
with complex coefficients $\xi_{j,p;2q,k}$. More generally, we shall consider symmetric spinorial fields $\xi_{A_1 \cdots A_{2r}}$ on $C_a$ with independent components $\xi_j = \xi_{(A_1 \cdots A_{2r})_j}$, $0 \le j \le 2r$, and spin-weight $s = r - j$ which do not descend to analytic spinor fields on $B_a$. In this case one has that
\[ \xi_j = \sum_{p=0}^{\infty} \xi_{j,p}\, \rho^p, \qquad \xi_{j,p} = \sum_{q=|r-j|}^{q(p)}\ \sum_{k=0}^{2q} \xi_{j,p;2q,k}\; T_{2q}{}^{k}{}_{q-r+j}, \]
where one has a priori that $0 \le |r-j| \le q(p)$. An expansion of the latter form will be said to be of type $q(p)$. Of particular relevance for the present investigation will be the lift to $C_a$ of the function $W$ on $B_a$. The function $W$ admits an expansion
\[ W = \frac{m}{2} + \sum_{p=2}^{\infty} \sum_{k=0}^{2p} \frac{1}{p!}\, w_{p;2p,k}\; T_{2p}{}^{k}{}_{p}\; \rho^p, \]
with $w_{p;2p,k} \in \mathbb{C}$ given by
\[ w_{p;2p,k} = (\sqrt{2})^p \binom{2p}{k}^{1/2} \binom{2p}{p}^{-1/2} \sigma^{i_1}{}_{(B_1C_1} \cdots \sigma^{i_p}{}_{B_pC_p)_k}\; w_{i_1 \cdots i_p}, \]
with $\sigma^{i}{}_{BC}$ denoting the spatial Infeld symbols (Pauli matrices). Thus, $w_{i_1 \cdots i_p} = 0$ if and only if $w_{p;2p,k} = 0$, $k = 0, \ldots, 2p$. The function $W$ is the lift to $C_a$ of a real function. Hence, it satisfies $W = \overline{W}$. It follows that the coefficients $w_{p;2p,k}$ satisfy the reality condition
\[ \overline{w}_{p;2p,k} = (-1)^{p+k}\, w_{p;2p,2p-k}, \qquad k = 0, \ldots, 2p. \]

5. The Spacetime Friedrich Gauge

The formulation of the initial value problem near spatial infinity presented in [20] employs gauge conditions based on timelike conformal geodesics. The conformal geodesics are curves which are autoparallel with respect to a Weyl connection, i.e. a torsion-free connection which is not necessarily the Levi-Civita connection of a metric. An analysis of Weyl connections in the context of the conformal field equations has been given in [19]. In terms of this gauge based on conformal geodesics, which shall be called the Friedrich gauge or F-gauge for short, the conformal factor of the spacetime can be determined explicitly in terms of the initial data for the Einstein vacuum equations. Hence, provided that the congruence of conformal geodesics and the fields describing the gravitational field extend in a regular manner to null infinity, one has complete control over the location of null infinity. This can be ensured by making $B_a$ suitably small. In addition, the F-gauge renders a particularly simple representation of the propagation equations. Using this framework, the singular initial value problem at spatial infinity can be reformulated into another problem where null infinity is represented by an explicitly known hypersurface and where the data are regular at spacelike infinity. The construction of the bundle manifold $C_a$ and the blowing up of the point $i \in B_a$ to the set $\mathcal{I}^0 \subset C_a$, briefly described in Sect. 4, are the first steps in the construction of this regular setting. The next step is to introduce a rescaling of the frame bundle so that fields that are singular at $\mathcal{I}^0$ become regular. Following the discussion of [20] assume that given the development of data prescribed on $B_a$, the timelike spinor $\tau^{AA'}$ introduced in Sect. 4 is tangent to a congruence of timelike conformal geodesics which are orthogonal to $B_a$.
The canonical conformal factor rendered by this congruence of conformal geodesics is given in terms of an affine parameter $\tau$ of the conformal geodesics by
\[ \Theta = \kappa^{-1}\,\Omega \left( 1 - \frac{\kappa^2 \tau^2}{\omega^2} \right), \qquad \text{with} \quad \omega = \frac{2\,\Omega}{\sqrt{|D_\alpha \Omega\, D^\alpha \Omega|}}, \tag{9} \]
where $\Omega = \vartheta^{-2}$ and $\vartheta$ solves the Yamabe equation (3) (see [19,20,22]). The function $\kappa > 0$ expresses the remaining conformal freedom in the construction. It will be taken to be of the form $\kappa = \kappa' \rho$, with $\kappa'$ analytic, $\kappa'(i) = 1$. Associated to the conformal factor $\Theta$ there is a 1-form $d_\mu$ from which the Weyl connection can be obtained. In spinorial terms, one has that for conformally flat data
\[ d_{AA'} = \frac{1}{\sqrt{2}}\, \tau_{AA'}\, \partial_\tau \Theta - \tau^{B}{}_{A'}\, d_{AB}, \qquad d_{AB} = 2\rho\, \frac{x_{AB} - \rho^2 D_{AB} W}{(1 + \rho W)^3}. \]
The a priori knowledge of $\Theta$ and $d_{AA'}$ is a feature of the fact that one is dealing with a vacuum physical spacetime. This property is lost when working with, for example, the Einstein-Maxwell equations. The function $\kappa$ in the conformal factor $\Theta$ induces a scaling $\delta_A \to \kappa^{1/2} \delta_A$ of the spin frame. Accordingly, one considers the bundle manifold $C_{a,\kappa} = \kappa^{1/2} C_a$ of scaled spinor frames. Using $C_{a,\kappa}$ one defines the set
\[ M_{a,\kappa} = \left\{ (\tau, q) \ \middle|\ q \in C_{a,\kappa},\ -\frac{\omega(q)}{\kappa(q)} \le \tau \le \frac{\omega(q)}{\kappa(q)} \right\}, \]
which, assuming that the congruence of conformal geodesics and the relevant fields extend adequately, can be identified with the development of $B_a$ up to null infinity, that is, the region of spacetime near null and spatial infinity. In addition, one defines the sets:
\[ \mathcal{I} = \left\{ (\tau, q) \in M_{a,\kappa} \ \middle|\ \rho(q) = 0,\ |\tau| < 1 \right\}, \qquad \mathcal{I}^\pm = \left\{ (\tau, q) \in M_{a,\kappa} \ \middle|\ \rho(q) = 0,\ \tau = \pm 1 \right\}, \]
\[ \mathscr{I}^\pm = \left\{ (\tau, q) \in M_{a,\kappa} \ \middle|\ \rho(q) > 0,\ \tau = \pm \frac{\omega(q)}{\kappa(q)} \right\}, \]
which will be referred to as, respectively, the cylinder at spatial infinity, the critical sets and future and past null infinity. In order to coordinatise the hypersurfaces of constant parameter $\tau$, one extends the coordinates $(\rho, t^A{}_B)$ off $C_{a,\kappa}$ by requiring them to be constant along the conformal geodesics, i.e. one has a system of conformal Gaussian coordinates. The cylinder at spatial infinity, $\mathcal{I}$, can be imagined as the limit set of outgoing and incoming null hypersurfaces for the metric determined by the solution to the conformal field equations (see below). However, it must be pointed out that this metric degenerates as $\rho \to 0$. This has as a consequence that $\mathcal{I}$ is a total characteristic of the field equations, as will be seen in the sequel.

Remark. For the purposes of the analysis carried out in this article it turns out that the most convenient choice of the function $\kappa$ in the conformal factor of Eq. (9) is $\kappa = \rho$. This leads to considerable simplifications in all the relevant expressions. From this point onwards, this choice will always be assumed.

On the manifold $M_{a,\kappa}$ it is possible to introduce a calculus based on the derivatives $\partial_\tau$ and $\partial_\rho$ and on the operators $X_+$, $X_-$ and $X$. The operators $\partial_\rho$, $X_+$, $X_-$ and $X$ originally defined on $C_a$ can be suitably extended to the rest of the manifold by requiring them to commute with the vector field $\partial_\tau$. In order to derive the propagation equations, a frame $c_{AA'}$ and the associated spin connection coefficients $\Gamma_{AA'BC}$ of the Weyl connection $\nabla$
will be used. The gravitational field is, in addition, described by the spinorial counterparts of the Schouten tensor of the Weyl connection, $\Theta_{AA'BB'}$, and of the rescaled Weyl tensor, $\phi_{ABCD}$ (see [19,20,24]). Let $\phi_i \equiv \phi_{(ABCD)_i}$. In the present gauge, the information of the spacetime spinors $c^\mu{}_{AA'}$, $\Theta_{AA'BB'}$ and $\Gamma_{AA'BC}$ is encoded, respectively, in space spinors $c^\mu{}_{AB}$, $\Theta_{ABCD}$ and $\Gamma_{ABCD}$; see [24] for the detailed relation between the two sets of spinors. Introduce the notation
\[ \upsilon \equiv \left( c^\mu{}_{AB},\ \Gamma_{ABCD},\ \Theta_{ABCD} \right), \qquad \phi \equiv \left( \phi_0, \phi_1, \phi_2, \phi_3, \phi_4 \right). \]
Suitable field equations for the fields contained in $\upsilon$ and $\phi$ can be obtained from the first and second Cartan structure equations, and the Bianchi identities (see e.g. [21,24] for details). It can be proved that a solution to the equations thus constructed implies a solution to the vacuum Einstein field equations. The unknown vector $\upsilon$ has 45 independent complex components, while $\phi$ has 5 independent complex components. Using the F-gauge it can be shown that the extended conformal field equations given in [20] imply the following evolution equations for the unknowns $\upsilon$:
\[ \partial_\tau \upsilon = K\upsilon + Q(\upsilon, \upsilon) + L\phi, \tag{11} \]
where $K$ and $Q$ denote, respectively, a linear and a quadratic matrix-valued function with constant entries and $L$ is a linear matrix-valued function with coefficients depending on the coordinates and such that $L|_{\rho=0} = 0$. For the unknowns $\phi$, the Bianchi identity $\nabla^{AA'} \phi_{ABCD} = 0$ implies, respectively, a set of propagation and constraint equations of the form:
\[ \sqrt{2}\, E\, \partial_\tau \phi + A^{AB} c^\mu{}_{AB}\, \partial_\mu \phi = B(\Gamma_{ABCD})\, \phi, \tag{12a} \]
\[ F^{AB} c^\mu{}_{AB}\, \partial_\mu \phi = H(\Gamma_{ABCD})\, \phi, \tag{12b} \]
where $E$ denotes the $5 \times 5$ unit matrix and $A^{AB} c^\mu{}_{AB}$, $\mu = 0, \ldots, 3$, are $5 \times 5$ matrices depending on the coordinates, while $B(\Gamma_{ABCD})$ denotes a constant matrix-valued linear function of the connection coefficients $\Gamma_{ABCD}$. On the other hand, $F^{AB} c^\mu{}_{AB}$ denotes $3 \times 5$ matrices with coordinate dependent entries and $H(\Gamma_{ABCD})$ is another constant matrix-valued linear function of the connection coefficients $\Gamma_{ABCD}$. Consider now the system (11)-(12a) with data given on $C_{a,\kappa}$. Given a neighbourhood $\mathcal{W}$ of $C_{a,\kappa}$ in $M_{a,\kappa}$ on which a unique smooth solution of the Cauchy problem is given, from the point of view of the propagation equations, the subset $\mathcal{W} \cap \mathcal{I}$ is a regular hypersurface. Introduce the notation $\upsilon^{(0)} \equiv \upsilon|_{\mathcal{W} \cap \mathcal{I}}$, $\phi^{(0)} \equiv \phi|_{\mathcal{W} \cap \mathcal{I}}$. Due to the property $L|_{\rho=0} = 0$, Eqs. (11) decouple from Eqs. (12a) and can be integrated on $\mathcal{W} \cap \mathcal{I}$ using the observation that the restriction of the initial data to $\mathcal{I}^0$ coincides with Minkowski data. The solutions thus obtained extend analytically to the whole of $\mathcal{I}$ and in particular to the critical sets $\mathcal{I}^\pm$. The set $\mathcal{I}$ turns out to be a total characteristic of the system (11)-(12a) in the sense that the whole system reduces to an interior system on $\mathcal{I}$. Moreover, the constraint equations (12b) also reduce to an interior system on $\mathcal{I}$. As mentioned before, this feature is a consequence of the fact that the unphysical metric $g_{\mu\nu}$ determined by a solution to the conformal field equations degenerates as $\rho \to 0$. Another crucial structural property is that
\[ A^0 \equiv \sqrt{2}\, E + A^{AB} c^0{}_{AB} = \sqrt{2}\, \mathrm{diag}(1+\tau,\ 1,\ 1,\ 1,\ 1-\tau) \quad \text{on } \mathcal{I}, \]
Rigidity Property of Asymptotically Simple Spacetimes
so that the matrix $A^{0}$, which is positive definite for |τ| < 1, degenerates at $\mathcal{I}^{\pm}$. Understanding the effects of this degeneracy is the main motivation behind the analysis in this article. The previous discussion can be generalised by repeated application of the differential operator $\partial_\rho$ to Eqs. (11), (12a) and (12b) to obtain interior systems for the quantities $\upsilon^{(p)} = \partial^{(p)}_\rho\upsilon|_{\mathcal{I}}$ and $\phi^{(p)} = \partial^{(p)}_\rho\phi|_{\mathcal{I}}$ which will be called the order p transport equations. Their behaviour on the whole of $\mathcal{I}$ will be studied in the sequel. The transport equations then take the following form for p ≥ 1:
$$\partial_\tau\upsilon^{(p)} = K\upsilon^{(p)} + Q(\upsilon^{(0)},\upsilon^{(p)}) + Q(\upsilon^{(p)},\upsilon^{(0)}) + \sum_{j=1}^{p-1}\binom{p}{j}\Bigl(Q(\upsilon^{(j)},\upsilon^{(p-j)}) + L^{(j)}\phi^{(p-j)}\Bigr) + L^{(p)}\phi^{(0)}, \tag{13a}$$
$$\Bigl(\sqrt{2}E + A^{AB}(c^{0}_{AB})^{(0)}\Bigr)\partial_\tau\phi^{(p)} + A^{AB}(c^{\mu}_{AB})^{(0)}\partial_\mu\phi^{(p)} = B(\Gamma^{(0)}_{ABCD})\phi^{(p)} + \sum_{j=1}^{p}\binom{p}{j}\Bigl(B(\Gamma^{(j)}_{ABCD})\phi^{(p-j)} - A^{AB}(c^{\mu}_{AB})^{(j)}\partial_\mu\phi^{(p-j)}\Bigr), \tag{13b}$$
$$F^{AB}(c^{0}_{AB})^{(0)}\partial_\tau\phi^{(p)} + F^{AB}(c^{\mu}_{AB})^{(0)}\partial_\mu\phi^{(p)} = H(\Gamma^{(0)}_{ABCD})\phi^{(p)} + \sum_{j=1}^{p}\binom{p}{j}\Bigl(H(\Gamma^{(j)}_{ABCD})\phi^{(p-j)} - F^{AB}(c^{\mu}_{AB})^{(j)}\partial_\mu\phi^{(p-j)}\Bigr). \tag{13c}$$
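To see where the binomial sums in (13a) come from, it may help to spell out the Leibniz rule for one representative term. The lines below are only a sketch: they assume that the coupling to φ in Eq. (11) has the product form Lφ (Eq. (11) itself is not reproduced here) and use the stated property $L|_{\rho=0} = 0$.

```latex
\partial_\rho^{\,p}(L\phi)\big|_{\mathcal{I}}
  = \sum_{j=0}^{p}\binom{p}{j}L^{(j)}\phi^{(p-j)}
  = \underbrace{L^{(0)}\phi^{(p)}}_{=\,0}
    + \sum_{j=1}^{p-1}\binom{p}{j}L^{(j)}\phi^{(p-j)}
    + L^{(p)}\phi^{(0)} .
```

Under this assumption the j = 0 term drops out because $L^{(0)} = L|_{\rho=0} = 0$, so the right hand side of (13a) involves $\phi^{(p')}$ only for $p' \le p-1$.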
Note that the non-homogeneous terms in Eqs. (13a)-(13c) depend on $\upsilon^{(p')}$, $\phi^{(p')}$ for $0 \le p' < p$. Thus, if their values are known, then (13a)-(13b) constitutes an interior system of linear equations for $\upsilon^{(p)}$ and $\phi^{(p)}$. The principal part of these equations is universal, in the sense that it is independent of the value of p. If the initial data on $\mathcal{C}_{a,\kappa}$ for the system (11)-(12a) is analytic, as is the case in the present analysis, then suitable initial data for the interior system (13a)-(13b) can be obtained by repeated ρ-differentiation and evaluation on $\mathcal{I}^0$. The interior system (13a)-(13b) is decoupled in the following sense: if $\upsilon^{(p')}$, $\phi^{(p')}$ are known for $0 \le p' < p$ one can solve first (13a) as it contains at most quantities of order $\phi^{(p-1)}$. With the knowledge of $\upsilon^{(p)}$ at hand one can then solve (13b) to obtain $\phi^{(p)}$.

The language of jets is natural in the present context. For p = 0, 1, 2, . . . and any sufficiently smooth (possibly vector valued) function f defined on $\mathcal{M}_{a,\kappa}$, the set of functions $\{f^{(0)}, f^{(1)}, \ldots, f^{(p)}\}$ on $\mathcal{I}$ will be denoted by $J^{(p)}_{\mathcal{I}}[f]$ and referred to as the jet of order p of f on $\mathcal{I}$, and similarly with $\mathcal{I}$ replaced by $\mathcal{I}^0$. If u = (υ, φ) is a solution to Eqs. (13a), (13b) and (13c), we refer to $J^{(p)}_{\mathcal{I}}[u]$ as the s-jet of u of order p and to the data $J^{(p)}_{\mathcal{I}^0}[u]$ as the d-jet of u of order p. An s-jet $J^{(p)}_{\mathcal{I}}[u]$ of order p will be called regular on $\bar{\mathcal{I}} \equiv \mathcal{I} \cup \mathcal{I}^{+} \cup \mathcal{I}^{-}$ if the corresponding functions extend smoothly to the critical sets $\mathcal{I}^{\pm}$.

A remark on Schwarzschild spacetime in the F-gauge. Of particular relevance for our analysis is the Schwarzschild spacetime. A detailed analysis of the structure of spatial
J. A. Valiente Kroon
infinity of this spacetime in the Friedrich gauge has been given in [20]. The key result for our discussion is the following.

Proposition 1. The solutions of the transport equations (13a) and (13b) for time symmetric Schwarzschild initial data extend analytically through $\mathcal{I}^{\pm}$ for all orders p. Moreover, the solutions to the transport equations are polynomial in τ.

For the purposes of the present article it turns out that it will be necessary to know the expansions explicitly up to order p = 4 (inclusive). These straightforward, but nevertheless lengthy computations have been performed with the aid of the computer algebra system Maple V.

6. Further Properties of the Bianchi Transport Equations

The Bianchi propagation subsystem of interior equations (13b) reads explicitly
$$(1+\tau)\partial_\tau\phi^{(p)}_0 + X_+\phi^{(p)}_1 - (p-2)\phi^{(p)}_0 = R^{(p)}_0, \tag{14a}$$
$$\partial_\tau\phi^{(p)}_1 + \tfrac{1}{2}X_+\phi^{(p)}_2 + \tfrac{1}{2}X_-\phi^{(p)}_0 + \phi^{(p)}_1 = R^{(p)}_1, \tag{14b}$$
$$\partial_\tau\phi^{(p)}_2 + \tfrac{1}{2}X_+\phi^{(p)}_3 + \tfrac{1}{2}X_-\phi^{(p)}_1 = R^{(p)}_2, \tag{14c}$$
$$\partial_\tau\phi^{(p)}_3 + \tfrac{1}{2}X_+\phi^{(p)}_4 + \tfrac{1}{2}X_-\phi^{(p)}_2 - \phi^{(p)}_3 = R^{(p)}_3, \tag{14d}$$
$$(1-\tau)\partial_\tau\phi^{(p)}_4 + X_-\phi^{(p)}_3 + (p-2)\phi^{(p)}_4 = R^{(p)}_4, \tag{14e}$$
with $R^{(p)}_j = R^{(p)}_j(u^{(0)}, \ldots, u^{(p-1)})$, j = 0, . . . , 4. On the other hand, the Bianchi transport constraint equations (13c) are given by
$$\tau\partial_\tau\phi^{(p)}_1 + \tfrac{1}{2}X_+\phi^{(p)}_2 - \tfrac{1}{2}X_-\phi^{(p)}_0 - p\,\phi^{(p)}_1 = S^{(p)}_1, \tag{15a}$$
$$\tau\partial_\tau\phi^{(p)}_2 + \tfrac{1}{2}X_+\phi^{(p)}_3 - \tfrac{1}{2}X_-\phi^{(p)}_1 - p\,\phi^{(p)}_2 = S^{(p)}_2, \tag{15b}$$
$$\tau\partial_\tau\phi^{(p)}_3 + \tfrac{1}{2}X_+\phi^{(p)}_4 - \tfrac{1}{2}X_-\phi^{(p)}_2 - p\,\phi^{(p)}_3 = S^{(p)}_3, \tag{15c}$$
with $S^{(p)}_j = S^{(p)}_j(u^{(0)}, \ldots, u^{(p-1)})$, j = 1, 2, 3. In order to extract detailed information from the above equations one makes use of an explicit decomposition of the various functions in terms of the spherical harmonics $T_i{}^{j}{}_{k}$. We recall the following lemma which was proved in [20].

Lemma 1. The following rules for expansion types hold:
(i) The functions $(c^{1}_{AB} - \rho x_{AB})^{(p)}$, $\upsilon^{(p)}$, $\phi^{(p)}$, p = 1, 2, . . . on $\mathcal{I}$ are of expansion type p − 2, p − 1, p respectively.
(ii) The functions $R^{(p)}_i$, i = 0, . . . , 4 and $S^{(p)}_j$, j = 1, 2, 3 are of expansion type p − 1 for p = 1, 2, . . ..
(iii) If for a given integer p ≥ 1 the data for $\phi^{(p)}$ on $\mathcal{C}_{a,\kappa}$ are of type p − 1, then $\phi^{(p)}$ on $\mathcal{I}$ is of type p − 1.
6.1. Decomposition in terms of spherical harmonics. Given the vector $u^{(p)} = (u^{(p)}_1, \ldots, u^{(p)}_N)$ (respectively $\upsilon^{(p)}$, $\phi^{(p)}$) and non-negative integers q and k = 0, . . . , 2q one defines the sector $S_{q,k}[u^{(p)}]$ as the collection of coefficients
$$u_{i;2q,k} = (2q+1)\int_{SU(2)} \bar{u}^{(p)}_i\, T_{2q}{}^{k}{}_{q-s}\, d\mu,$$
where s is the spin-weight of $u^{(p)}_i$, and dμ is the Haar measure of SU(2). Furthermore, one defines
$$S_q[u^{(p)}] = \bigcup_{k=0}^{2q} S_{q,k}[u^{(p)}].$$
A sector will be said to be vanishing if $S_q[u^{(p)}] = \{0\}$. The Weyl spinor of time symmetric, conformally flat initial data is of expansion type p − 1 on $\mathcal{C}_{a,\kappa}$. Accordingly, one writes
$$\phi^{(p)}_j = \sum_{q=|2-j|}^{p}\sum_{k=0}^{2q}\binom{4}{j}^{-1} a_{j,p;2q,k}\, T_{2q}{}^{k}{}_{q-2+j},$$
with complex (τ-dependent) coefficients $a_{j,p;2q,k}$. The normalisation factor $\binom{4}{j}^{-1}$ has been added for convenience. The substitution of the latter expression into Eqs. (14a)-(14e) and (15a)-(15c) renders equations for the various coefficients $a_{j,p;2q,k}$. In the cases p ≥ 0, q = 0, one finds the equations
$$a'_{2,p;0,0} = 6R_{2,p;0,0},$$
and
$$\tau a'_{2,p;0,0} - p\,a_{2,p;0,0} = 6S_{2,p;0,0}.$$
If p ≥ 1, q = 1, k = 0, 1, 2 one finds
$$a'_{1,p;2,k} + \tfrac{1}{3}\beta_2 a_{2,p;2,k} + a_{1,p;2,k} = 4R_{1,p;2,k},$$
$$a'_{2,p;2,k} + \tfrac{3}{4}\beta_2 a_{3,p;2,k} - \tfrac{3}{4}\beta_2 a_{1,p;2,k} = 6R_{2,p;2,k},$$
$$a'_{3,p;2,k} - \tfrac{1}{3}\beta_2 a_{2,p;2,k} - a_{3,p;2,k} = 4R_{3,p;2,k},$$
and
$$\tau a'_{2,p;2,k} + \tfrac{3}{4}\beta_2 a_{3,p;2,k} + \tfrac{3}{4}\beta_2 a_{1,p;2,k} - p\,a_{2,p;2,k} = 6S_{2,p;2,k}.$$
More crucially, one obtains for 2 ≤ p, 2 ≤ q, k = 0, . . . , 2q the equations
$$(1+\tau)a'_{0,p;2q,k} + \tfrac{1}{4}\beta_1 a_{1,p;2q,k} - (p-2)a_{0,p;2q,k} = R_{0,p;2q,k}, \tag{16a}$$
$$a'_{1,p;2q,k} + \tfrac{1}{3}\beta_2 a_{2,p;2q,k} - 2\beta_1 a_{0,p;2q,k} + a_{1,p;2q,k} = 4R_{1,p;2q,k}, \tag{16b}$$
$$a'_{2,p;2q,k} + \tfrac{3}{4}\beta_2 a_{3,p;2q,k} - \tfrac{3}{4}\beta_2 a_{1,p;2q,k} = 6R_{2,p;2q,k}, \tag{16c}$$
$$a'_{3,p;2q,k} + 2\beta_1 a_{4,p;2q,k} - \tfrac{1}{3}\beta_2 a_{2,p;2q,k} - a_{3,p;2q,k} = 4R_{3,p;2q,k}, \tag{16d}$$
$$(1-\tau)a'_{4,p;2q,k} - \tfrac{1}{4}\beta_1 a_{3,p;2q,k} + (p-2)a_{4,p;2q,k} = R_{4,p;2q,k}, \tag{16e}$$
and
$$\tau a'_{1,p;2q,k} + \tfrac{1}{3}\beta_2 a_{2,p;2q,k} + 2\beta_1 a_{0,p;2q,k} - p\,a_{1,p;2q,k} = 4S_{1,p;2q,k}, \tag{17a}$$
$$\tau a'_{2,p;2q,k} + \tfrac{3}{4}\beta_2 a_{3,p;2q,k} + \tfrac{3}{4}\beta_2 a_{1,p;2q,k} - p\,a_{2,p;2q,k} = 6S_{2,p;2q,k}, \tag{17b}$$
$$\tau a'_{3,p;2q,k} + 2\beta_1 a_{4,p;2q,k} + \tfrac{1}{3}\beta_2 a_{2,p;2q,k} - p\,a_{3,p;2q,k} = 4S_{3,p;2q,k}, \tag{17c}$$
where
$$\beta_1 = \sqrt{(q-1)(q+2)}, \qquad \beta_2 = \sqrt{q(q+1)},$$
and $R_{j,p;2q,k}$, j = 0, . . . , 4 and $S_{j,p;2q,k}$, j = 1, 2, 3 are such that
$$R^{(p)}_j = \sum_{q=|2-j|}^{p}\sum_{k=0}^{2q} R_{j,p;2q,k}\, T_{2q}{}^{k}{}_{q-2+j}, \qquad S^{(p)}_j = \sum_{q=|2-j|}^{p}\sum_{k=0}^{2q} S_{j,p;2q,k}\, T_{2q}{}^{k}{}_{q-2+j}. \tag{18}$$
The functions $R^{(p)}_i$ and $S^{(p)}_j$ contain products of $\phi^{(p')}$ and $\upsilon^{(p'')}$ for $0 \le p' \le p-1$, $0 \le p'' \le p-1$ so that in order to obtain the representation (18) one has to linearise products of the form $T_{i_1}{}^{j_1}{}_{k_1} \times T_{i_2}{}^{j_2}{}_{k_2}$ using Clebsch-Gordan coefficients. For later reference the following result is noted.

Lemma 2. If the s-jets $J^{(p-1)}_{\mathcal{I}}[\upsilon]$ and $J^{(p-1)}_{\mathcal{I}}[\phi]$ have polynomial dependence in τ for some p ≥ 1, then $J^{(p)}_{\mathcal{I}}[\upsilon]$ also has polynomial dependence in τ.
The proof of this lemma follows directly from the structure of the transport equation (13a).

6.2. A procedure to solve the Bianchi transport equations. A first analysis of the structure of the solutions to Eqs. (16a)-(16e) has been given in [20]. In particular, in the aforementioned reference a procedure was given by means of which the constraint equations (17a)-(17c) are used to eliminate the unknowns $a_{1,p;2q,k}$, $a_{2,p;2q,k}$, $a_{3,p;2q,k}$ so that to find a solution to the transport equations (16a)-(16e) it is only necessary to solve a reduced system involving $a_{0,p;2q,k}$ and $a_{4,p;2q,k}$. The remaining coefficients are then obtained by means of purely algebraic manipulations.
For the purpose of the present investigation it will turn out to be more convenient to consider an alternative procedure to find solutions of Eqs. (16a)-(16e). Again, Eqs. (17a)-(17c) will be used to obtain a reduced system. But in this case, the system will involve $a_{1,p;2q,k}$ and $a_{3,p;2q,k}$. Due to the formal similarity with the reduced systems obtained in [45], this type of reduced system will be called Maxwell-like. The reasons to prefer this approach over the one put forward in [20] will be explained towards the end of this section. In the following discussion, the values of the indices p, q and k are considered as fixed. In order to ease the formulae, obvious indices will be suppressed. One can construct a Maxwell-like propagation system involving the coefficients $a_1$, $a_2$ and $a_3$ by considering the sum of Eqs. (16b) and (17a), Eq. (16c) and the difference of Eqs. (16d) and (17c). The resulting equations are given by
$$(1+\tau)a'_1 + \tfrac{2}{3}\beta_2 a_2 - (p-1)a_1 = 2R_1 + 4S_1, \tag{19a}$$
$$a'_2 + \tfrac{3}{4}\beta_2 a_3 - \tfrac{3}{4}\beta_2 a_1 = 6R_2, \tag{19b}$$
$$(1-\tau)a'_3 - \tfrac{2}{3}\beta_2 a_2 + (p-1)a_3 = 2R_3 - 4S_3, \tag{19c}$$
where the prime denotes differentiation with respect to τ. Note that the above equations do not contain $a_0$ or $a_4$. The associated Maxwell-like constraint equation is given simply by Eq. (17b). Namely
$$\tau a'_2 + \tfrac{3}{4}\beta_2 a_3 + \tfrac{3}{4}\beta_2 a_1 - p\,a_2 = 6S_2. \tag{20}$$
From Eqs. (20) and (19b) one obtains the algebraic relation
$$\tfrac{3}{4}\beta_2(1-\tau)a_3 + \tfrac{3}{4}\beta_2(1+\tau)a_1 - p\,a_2 = 6S_2 - 6\tau R_2, \tag{21}$$
which can be used, in turn, to eliminate $a_2$ from both (19a) and (19c) so that
$$(1+\tau)a'_1 + \Bigl(1 - p + \frac{1}{2p}q(q+1)(1+\tau)\Bigr)a_1 + \frac{1}{2p}q(q+1)(1-\tau)a_3 = 2R_1 + 4S_1 + \frac{4}{p}\beta_2 S_2 - \frac{4}{p}\beta_2\tau R_2, \tag{22a}$$
$$(1-\tau)a'_3 - \frac{1}{2p}q(q+1)(1+\tau)a_1 + \Bigl(p - 1 - \frac{1}{2p}q(q+1)(1-\tau)\Bigr)a_3 = 2R_3 - 4S_3 - \frac{4}{p}\beta_2 S_2 + \frac{4}{p}\beta_2\tau R_2. \tag{22b}$$
Equations (22a)-(22b) will be referred to as the reduced Maxwell-like system for $S_{q,k}[\phi^{(p)}]$. Given a solution $a_1$, $a_3$ to the reduced system (22a)-(22b), the coefficient $a_2$ is obtained from Eq. (21) by means of an algebraic manipulation, while $a_0$ and $a_4$ are obtained as a solution of the ordinary differential equations (16a) and (16e).
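The elimination of $a_2$ can be checked numerically. The following is a minimal sketch, not part of the original computations (which were done in Maple V): it assumes $\beta_2 = \sqrt{q(q+1)}$ and the forms of (19a), (19c), (21), (22a) and (22b) as read above, and the function names are illustrative. For random inputs it solves (21) for $a_2$, inserts the result into (19a) and (19c), and compares the residuals with those of (22a) and (22b).

```python
import math
import random


def beta2(q):
    """beta_2 = sqrt(q(q+1)) (assumed form of the coefficient)."""
    return math.sqrt(q * (q + 1))


def a2_from_eq21(p, q, tau, a1, a3, R2, S2):
    """Solve the algebraic relation (21) for a2."""
    b2 = beta2(q)
    return (0.75 * b2 * (1 - tau) * a3 + 0.75 * b2 * (1 + tau) * a1
            - 6 * S2 + 6 * tau * R2) / p


def residual_19a(p, q, tau, a1, da1, a2, R1, S1):
    """Left minus right hand side of (19a); da1 stands for a1'."""
    return ((1 + tau) * da1 + (2 / 3) * beta2(q) * a2 - (p - 1) * a1
            - (2 * R1 + 4 * S1))


def residual_19c(p, q, tau, a3, da3, a2, R3, S3):
    """Left minus right hand side of (19c); da3 stands for a3'."""
    return ((1 - tau) * da3 - (2 / 3) * beta2(q) * a2 + (p - 1) * a3
            - (2 * R3 - 4 * S3))


def residual_22a(p, q, tau, a1, da1, a3, R1, S1, R2, S2):
    """Left minus right hand side of (22a)."""
    b2, c = beta2(q), q * (q + 1) / (2 * p)
    lhs = (1 + tau) * da1 + (1 - p + c * (1 + tau)) * a1 + c * (1 - tau) * a3
    rhs = 2 * R1 + 4 * S1 + (4 / p) * b2 * S2 - (4 / p) * b2 * tau * R2
    return lhs - rhs


def residual_22b(p, q, tau, a1, a3, da3, R3, S3, R2, S2):
    """Left minus right hand side of (22b)."""
    b2, c = beta2(q), q * (q + 1) / (2 * p)
    lhs = (1 - tau) * da3 - c * (1 + tau) * a1 + (p - 1 - c * (1 - tau)) * a3
    rhs = 2 * R3 - 4 * S3 - (4 / p) * b2 * S2 + (4 / p) * b2 * tau * R2
    return lhs - rhs


def max_mismatch(trials=200, seed=0):
    """Largest difference between the (19) and (22) residuals over random inputs."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        p = rng.randint(2, 9)
        q = rng.randint(2, p)
        tau = rng.uniform(-0.9, 0.9)
        a1, a3, da1, da3, R1, R2, R3, S1, S2, S3 = (
            rng.uniform(-1, 1) for _ in range(10))
        a2 = a2_from_eq21(p, q, tau, a1, a3, R2, S2)
        d_a = (residual_19a(p, q, tau, a1, da1, a2, R1, S1)
               - residual_22a(p, q, tau, a1, da1, a3, R1, S1, R2, S2))
        d_b = (residual_19c(p, q, tau, a3, da3, a2, R3, S3)
               - residual_22b(p, q, tau, a1, a3, da3, R3, S3, R2, S2))
        worst = max(worst, abs(d_a), abs(d_b))
    return worst
```

Calling `max_mismatch()` should return a value at the level of floating point round-off, which is what one expects if (22a)-(22b) are (19a) and (19c) with $a_2$ eliminated through (21).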
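In order to ease the subsequent discussion, the system (22a) and (22b) is written in matrix form as
$$y'(\tau) = A(\tau)y(\tau) + b(\tau), \tag{23}$$
with
$$A(\tau) \equiv \begin{pmatrix} -\dfrac{1}{1+\tau}\Bigl(1 - p + \dfrac{1}{2p}q(q+1)(1+\tau)\Bigr) & -\dfrac{1}{2p}q(q+1)\dfrac{1-\tau}{1+\tau} \\[2ex] \dfrac{1}{2p}q(q+1)\dfrac{1+\tau}{1-\tau} & -\dfrac{1}{1-\tau}\Bigl(p - 1 - \dfrac{1}{2p}q(q+1)(1-\tau)\Bigr) \end{pmatrix},$$
and
$$y(\tau) \equiv \begin{pmatrix} a_1(\tau) \\ a_3(\tau) \end{pmatrix}, \qquad b(\tau) \equiv \begin{pmatrix} \dfrac{1}{1+\tau}F_1(\tau) \\[2ex] \dfrac{1}{1-\tau}F_3(\tau) \end{pmatrix},$$
with
$$F_1 \equiv 2R_1 + 4S_1 + \frac{4}{p}\beta_2 S_2 - \frac{4}{p}\beta_2\tau R_2,$$
$$F_3 \equiv 2R_3 - 4S_3 - \frac{4}{p}\beta_2 S_2 + \frac{4}{p}\beta_2\tau R_2.$$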
It can be verified that the time symmetry of the setting implies that $F_3(\tau) = -F_1(-\tau) = -F^{s}_1(\tau)$. Following the ideas of the discussion in [45], it is possible to find a fundamental matrix for the system (23). One obtains:
$$X_{p,q} \equiv \begin{pmatrix} Q_1 & (-1)^{q+1}Q_3 \\ (-1)^{q+1}Q^{s}_3 & Q^{s}_1 \end{pmatrix},$$
with
$$Q_1(\tau) \equiv \Bigl(\frac{1-\tau}{2}\Bigr)^{p+1}P^{(p+1,\,1-p)}_{q-1}(\tau), \qquad Q_3(\tau) \equiv \Bigl(\frac{1+\tau}{2}\Bigr)^{p-1}P^{(-p-1,\,p-1)}_{q+1}(\tau), \tag{24}$$
where $P^{(\alpha,\beta)}_n(\tau)$, with n a non-negative integer, denotes a Jacobi polynomial (see e.g. [38] for definitions and properties). Furthermore $Q^{s}_1(\tau) \equiv Q_1(-\tau)$, $Q^{s}_3(\tau) \equiv Q_3(-\tau)$. The determinant of the fundamental matrix $X_{p,q}$ (the Wronskian) is given by
$$\det X_{p,q} = W_0(1-\tau^2)^{p-1}, \tag{25}$$
with $W_0$ a constant. Consequently, the inverse $X^{-1}_{p,q}$ is given in terms of rational functions with poles at τ = ±1. The solution to the system (23) is given by
$$y(\tau) = X(\tau)X^{-1}(0)y(0) + X(\tau)\int_0^{\tau}X^{-1}(s)b(s)\,ds. \tag{26}$$
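As a quick consistency check of formula (26), one can differentiate it directly. The lines below are a sketch that assumes only that X is a fundamental matrix of (23), that is, $X' = AX$ with X invertible on (−1, 1):

```latex
\begin{aligned}
y'(\tau) &= X'(\tau)\Bigl[X^{-1}(0)y(0) + \int_0^{\tau}X^{-1}(s)b(s)\,ds\Bigr]
            + X(\tau)X^{-1}(\tau)b(\tau) \\
         &= A(\tau)X(\tau)\Bigl[X^{-1}(0)y(0) + \int_0^{\tau}X^{-1}(s)b(s)\,ds\Bigr]
            + b(\tau) \\
         &= A(\tau)y(\tau) + b(\tau).
\end{aligned}
```

Setting τ = 0 in (26) returns y(0), so under this assumption (26) is the solution of (23) with the prescribed initial value.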
One can verify that the procedure described in the previous lines does indeed provide a solution to Eqs. (16a)-(16e) and (17a)-(17c). The argument is as follows:
(i) One solves the Maxwell-like reduced system (22a)-(22b) using formula (26) or any other method.
(ii) Next, one substitutes the values of the coefficients $a_1$ and $a_3$ obtained in this way into the evolution equation (19b). This equation can be solved for $a_2$ by a direct integration.
(iii) Given the evolution equations (22a), (19b) and (22b) one can produce an argument to show the propagation of the constraint equation (20): if (20) is satisfied initially for τ = 0, then it is satisfied at later times.
(iv) Equations (19b) and (20) imply the algebraic condition (21). The latter, together with (22a) and (22b), implies the evolution equations (19a) and (19c).
(v) One substitutes the coefficients $a_1$, $a_2$ and $a_3$ into the Bianchi propagation equations (16a) and (16e). Again, these equations can be solved for the coefficients $a_0$ and $a_4$ by means of a direct integration.
(vi) It can be shown that the evolution equations (16a)-(16e) imply the propagation of the constraint equations (17a) and (17c).
6.3. General properties of the solutions to the reduced Maxwell-like system. Let $C^{\omega}(a, b)$ denote the set of analytic functions on the interval (a, b). From formula (26) one obtains the following result.

Proposition 2. If the components of the vector b(τ) are polynomials, then the solutions $a_1$, $a_3$ to the Maxwell-like reduced system (23) will be either polynomial in τ or of the form
$$a_1 = P(\tau) + \frac{1}{W_0}Q_1(\tau)\ln(1-\tau) + (-1)^{q+1}\frac{1}{W_0}Q_3(\tau)\ln(1+\tau),$$
$$a_3 = P^{s}(\tau) + (-1)^{q+1}\frac{1}{W_0}Q^{s}_3(\tau)\ln(1-\tau) + \frac{1}{W_0}Q^{s}_1(\tau)\ln(1+\tau),$$
where P(τ) is a polynomial in τ and $Q_1(\tau)$ and $Q_3(\tau)$ are as in (24). These solutions are of class $C^{\omega}(-1, 1) \cap C^{p-1}[-1, 1]$.

Proof. The crucial observation is that the particular form of the Wronskian (25) implies that the partial fraction decomposition of the entries in the integrand, $X^{-1}_{p,q}b$, of formula (26) contains negative integer powers of (1 ± τ). The terms $(1 \pm \tau)^{-1}$ will integrate to ln |1 ± τ|. The terms with $(1 \pm \tau)^{-r}$, r ≥ 2 integrate to terms of the same form. Multiplication by the matrix $X_{p,q}$ removes these rational terms.

Remark 1. In order to have polynomial-only solutions, some special cancellations should occur in the partial fraction decomposition of the entries of $X^{-1}_{p,q}b$. In the sequel it will be shown that these cancellations do occur for particular combinations of the multi-indices (p; q, k).

Remark 2. The particular form of the Wronskian (25) is the reason why the approach described in this section has been preferred to the one originally described in [20]. The Wronskian of the reduced system advocated in the aforementioned reference is of the form
$$c\,f(\tau)(1-\tau)^{p-2},$$
with c a constant and $f(\tau) \equiv 2(p+1)(p-1) - (q-1)(q+2)(1-\tau^2)$. The partial fraction decomposition of the entries of the matrix product $X^{-1}_{p,q}b$ will contain terms of the form
$$\frac{\alpha\tau + \beta}{2(p+1)(p-1) - (q-1)(q+2)(1-\tau^2)},$$
for α, β some constants. These will integrate to give multiples of terms of the form
$$\ln\bigl(2(p+1)(p-1) - (q-1)(q+2)(1-\tau^2)\bigr), \qquad \arctan\Biggl(\frac{\sqrt{(q-1)(q+2)}\,\tau}{\sqrt{2(p+1)(p-1) - (q-1)(q+2)}}\Biggr).$$
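To make the last statement of Remark 2 concrete, the sketch below writes down an antiderivative of $(\alpha\tau + \beta)/f(\tau)$ and checks it by central finite differences. It is an illustration under stated assumptions, not a computation from [20]: it assumes q ≥ 2 and 2(p+1)(p−1) > (q−1)(q+2), so that both square roots are real, and the function names are hypothetical.

```python
import math


def f(p, q, tau):
    """f(tau) = 2(p+1)(p-1) - (q-1)(q+2)(1 - tau^2), as in Remark 2."""
    return 2 * (p + 1) * (p - 1) - (q - 1) * (q + 2) * (1 - tau ** 2)


def integrand(p, q, alpha, beta, tau):
    """The rational term (alpha*tau + beta) / f(tau)."""
    return (alpha * tau + beta) / f(p, q, tau)


def antiderivative(p, q, alpha, beta, tau):
    """An antiderivative of the integrand, assuming B > 0 and C > 0.

    Writing f(tau) = C + B*tau^2, the alpha-part integrates to a logarithm
    of f and the beta-part to an arctangent.
    """
    B = (q - 1) * (q + 2)
    C = 2 * (p + 1) * (p - 1) - B
    log_part = alpha / (2 * B) * math.log(f(p, q, tau))
    atan_part = beta / math.sqrt(B * C) * math.atan(math.sqrt(B / C) * tau)
    return log_part + atan_part


def max_error(p=5, q=3, alpha=0.7, beta=-1.3, h=1e-5):
    """Largest gap between the numerical slope and the integrand on [-0.9, 0.9]."""
    worst = 0.0
    for i in range(-9, 10):
        tau = i / 10
        slope = (antiderivative(p, q, alpha, beta, tau + h)
                 - antiderivative(p, q, alpha, beta, tau - h)) / (2 * h)
        worst = max(worst, abs(slope - integrand(p, q, alpha, beta, tau)))
    return worst
```

The logarithm and arctangent produced here are of a different type from the ln |1 ± τ| terms of Proposition 2, which is the point made in Remark 2.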
As a result of Proposition 2, one is expecting solutions consisting only of polynomials and ln |1 ± τ|. Thus, there are some non-trivial cancellations in formula (26) that need to be explained by means of some further arguments.

6.4. The case p = q and the regularity condition at i. As first pointed out in [20], for p ≥ 2 if q = p, the vector b appearing in formula (26) is such that $S_{p,k}[b] = 0$, so that the solution to the reduced system (23) is given entirely by the solution to the homogeneous problem. From here, it is possible to identify conditions on the initial data so that the solutions to the Bianchi transport equations for these particular sectors extend smoothly through the critical sets $\mathcal{I}^{\pm}$. In particular, one has the following result (Theorem 8.2 in [20]).

Theorem 1. Given a vacuum initial data set which is time symmetric and analytic in a neighbourhood of infinity, the solution to the regular finite initial value problem is smooth through $\mathcal{I}^{\pm}$ only if the condition
$$D_{(E_pF_p}\cdots D_{E_1F_1}b_{ABCD)}(i) = 0, \qquad p = 1, 2, \ldots \tag{27}$$
holds. If this condition is violated at some order $p'$, then the solution will develop logarithmic singularities in $S_{p'}[\phi^{(p')}]$ at $\mathcal{I}^{\pm}$.

In the previous result $b_{ABCD}$ denotes the spinorial counterpart of the Cotton tensor.

Remark. Time symmetric data which is conformally flat in a neighbourhood of infinity satisfies the condition (27) trivially to all orders as $b_{ABCD} = 0$.

7. The Solutions to the Transport Equations for Data Sets Which are Schwarzschildean up to a Certain Order

In this section the main analysis of the present work is presented. The key idea behind this is to analyse the solutions to the transport equation at spatial infinity for initial data sets which are Schwarzschildean up to a certain order.
7.1. Results from explicit calculations. In reference [40] one assumed a function W of the form
$$W = \frac{m}{2} + \sum_{p=2}^{\infty}\sum_{k=0}^{2p}\frac{1}{p!}\,w_{p;2p,k}\,T_{2p}{}^{k}{}_{p}\,\rho^{p}.$$
In fact, the form of W assumed in [40] is slightly more general than this as it includes terms $w_{1;2,k}$, k = 0, 1, 2, which (as seen in Sect. 3) can always be removed by choosing the centre of mass properly. Using scripts in the computer algebra system Maple V one can calculate the explicit solutions to the transport equations (13a)-(13c) via the decomposition in terms of spherical harmonics discussed in Sect. 6. The solutions for the orders p = 0, 1, 2, 3 have been calculated in [20,25]. These solutions have polynomial dependence in τ, and thus extend smoothly through the critical sets, although as seen in Proposition 2, the solutions could have had logarithmic singularities. The use of computer algebra methods allows one to go beyond this point and to calculate further orders in the expansions. For p = 4 one finds again that the solutions have polynomial dependence on τ. However, for p = 5 the situation is different. One finds that the sectors $S_2[\phi^{(5)}]$ have solutions with logarithmic terms of the form given by Proposition 2. An important observation is that these logarithmic solutions do not appear if one chooses a function W for which $w_{2;4,k} = 0$, k = 0, . . . , 4. In the terminology of Sect. 3 this means that the data set is Schwarzschildean up to order p = 2. Assuming that this is the case, one can proceed further with the expansions. At order p = 6 one finds logarithmic solutions only in the sectors $S_3[\phi^{(6)}]$. The logarithmic solutions can be avoided by considering data such that $w_{3;6,k} = 0$, k = 0, . . . , 6, that is, data which is Schwarzschildean up to order p = 3. From these results one can already infer a pattern which has been confirmed to all orders for which the calculations have been carried out. The calculations reported in [40] are carried out up to order p = 9, but there is no reason, besides computing power, why the calculations cannot be carried out any further. The pattern inferred from the analysis in [40] is as follows.
Assume that one has an initial data set which is Schwarzschildean up to order p = p•. Then, the solutions to the transport equations for p ≤ p• will only contain the sectors $S_0[u^{(p)}]$. These will coincide with the sectors implied by the s-jet $J^{(p_\bullet)}_{\mathcal{I}}[u_\bullet]$ of the Schwarzschild spacetime. At order p = p• + 1 the non-vanishing sectors are
$$S_0[u^{(p_\bullet+1)}], \quad S_{p_\bullet+1}[u^{(p_\bullet+1)}].$$
Both of them are polynomial in τ. At order p = p• + 2 the non-vanishing sectors are
$$S_0[u^{(p_\bullet+2)}], \quad S_{p_\bullet+1}[u^{(p_\bullet+2)}], \quad S_{p_\bullet+2}[u^{(p_\bullet+2)}].$$
Again, all the non-vanishing sectors are polynomial in τ. At order p = p• + 3 the non-vanishing sectors are
$$S_0[u^{(p_\bullet+3)}], \quad S_{p_\bullet+1}[u^{(p_\bullet+3)}], \quad S_{p_\bullet+2}[u^{(p_\bullet+3)}], \quad S_{p_\bullet+3}[u^{(p_\bullet+3)}].$$
All the non-vanishing sectors are polynomial in τ. Finally, at order p = p• + 4 one has the following non-vanishing sectors:
$$S_0[u^{(p_\bullet+4)}], \quad S_{p_\bullet+1}[u^{(p_\bullet+4)}], \quad S_{p_\bullet+2}[u^{(p_\bullet+4)}], \quad S_{p_\bullet+3}[u^{(p_\bullet+4)}], \quad S_{p_\bullet+4}[u^{(p_\bullet+4)}].$$
The sectors $S_0[u^{(p_\bullet+4)}]$, $S_{p_\bullet+2}[u^{(p_\bullet+4)}]$, $S_{p_\bullet+3}[u^{(p_\bullet+4)}]$ and $S_{p_\bullet+4}[u^{(p_\bullet+4)}]$ have polynomial dependence in τ, while $S_{p_\bullet+1}[u^{(p_\bullet+4)}]$ will have logarithmic singularities of the type indicated in Proposition 2. Furthermore, the solutions will be logarithmic free if $w_{p_\bullet+1;2(p_\bullet+1),k} = 0$, k = 0, . . . , 2(p• + 1). This pattern will be effectively proved in the sequel. It readily suggests an inductive procedure to prove the main theorem presented in the introductory section. The calculations in [40] are the base step of this inductive procedure.

7.2. Properties of data which is Schwarzschildean up to order p = p•. We start with some generic observations which will be used systematically in the sequel. Assume that the function W, appearing in expression (4) for the conformal factor ϑ, is of the form
$$W = \frac{m}{2} + \sum_{p=p_\bullet+1}^{\infty}\sum_{k=0}^{2p}\frac{1}{p!}\,w_{p;2p,k}\,T_{2p}{}^{k}{}_{p}\,\rho^{p}, \tag{28}$$
so that the initial data is Schwarzschildean up to order p•. The function W (and hence also the coefficients $w_{p;2p,k}$, p• ≤ p, k = 0, . . . , 2p) appears non-linearly in the expression
$$\Omega = \frac{\rho^2}{(1+\rho W)^2}$$
for the conformal factor and, moreover, in the expressions for the initial data for the spinors $\phi_{ABCD}$ and $\Theta_{ABCD}$ on $\mathcal{C}_{a,\kappa}$:
$$\Theta_{ABCD} = -\frac{\rho^2}{\Omega}D_{(AB}D_{CD)}\Omega, \qquad \phi_{ABCD} = \frac{\rho^3}{\Omega^2}D_{(AB}D_{CD)}\Omega.$$
Hence, when calculating the normal expansions of $\phi_{ABCD}$ and $\Theta_{ABCD}$ one encounters products of the form $T_{i_1}{}^{j_1}{}_{k_1} \times T_{i_2}{}^{j_2}{}_{k_2}$ with $T_{i_1}{}^{j_1}{}_{k_1}, T_{i_2}{}^{j_2}{}_{k_2} \neq T_0{}^{0}{}_{0}$ which have to be linearised, that is, expressed as a linear combination of other functions $T_i{}^{j}{}_{k}$. This linearisation procedure is extremely cumbersome and involves the use of the Clebsch-Gordan coefficients of SU(2, C) [40]. Remarkably, it turns out that if one only considers expansions up to order p = p• + 4, which is what will be required in the present analysis, these higher order products do not arise. An inspection renders the following result.

Lemma 3. For initial data sets which are Schwarzschildean up to order p = p• in $\mathcal{B}_a$ one has that on $\mathcal{C}_{a,\kappa}$,
$$\phi_{ABCD} - \phi^{\bullet}_{ABCD} = O(\rho^{p_\bullet+1}), \qquad \Theta_{ABCD} - \Theta^{\bullet}_{ABCD} = O(\rho^{p_\bullet+2}),$$
where $\phi^{\bullet}_{ABCD}$ and $\Theta^{\bullet}_{ABCD}$ denote, respectively, the Weyl and Ricci spinors of the Schwarzschild data.
The following result, which can be proved by direct inspection, shows that the calculations described in the present work do not require the calculation of complicated SU(2, C) Clebsch-Gordan coefficients.

Lemma 4. For the class of initial data under consideration, the terms
$$\sum_{j=1}^{p-1}\binom{p}{j}\Bigl(Q(\upsilon^{(j)},\upsilon^{(p-j)}) + L^{(j)}\phi^{(p-j)}\Bigr),$$
$$\sum_{j=1}^{p}\binom{p}{j}\Bigl(B(\Gamma^{(j)}_{ABCD})\phi^{(p-j)} - A^{AB}(c^{\mu}_{AB})^{(j)}\partial_\mu\phi^{(p-j)}\Bigr),$$
$$\sum_{j=1}^{p}\binom{p}{j}\Bigl(H(\Gamma^{(j)}_{ABCD})\phi^{(p-j)} - F^{AB}(c^{\mu}_{AB})^{(j)}\partial_\mu\phi^{(p-j)}\Bigr),$$
with p = p• + 1, . . . , p• + 4 and p• ≥ 3, appearing, respectively, in the transport equations (13a), (13b) and (13c) contain, before linearisation, only products of the form $T_0{}^{0}{}_{0} \times T_i{}^{j}{}_{k}$.

A direct observation is the following.
Lemma 5. Let $J^{(p)}_{\mathcal{I}}[\upsilon]$ and $J^{(p)}_{\mathcal{I}}[\phi]$ be the s-jets of order p arising from initial data sets which are Schwarzschildean up to order p = p•. Then one has that
$$J^{(p)}_{\mathcal{I}}[\upsilon] = J^{(p)}_{\mathcal{I}}[\upsilon_\bullet], \qquad J^{(p)}_{\mathcal{I}}[\phi] = J^{(p)}_{\mathcal{I}}[\phi_\bullet],$$
for p = 0, . . . , p•, where $J^{(p)}_{\mathcal{I}}[\upsilon_\bullet]$ and $J^{(p)}_{\mathcal{I}}[\phi_\bullet]$ are the s-jets of order p arising from exactly Schwarzschildean data sets. The first difference between these two sets of jets arises at order p = p• + 1.

Remark. An inspection of the terms discussed in Lemma 4 for p = p• + 4 shows that the present discussion requires at most the explicit knowledge of the s-jet $J^{(4)}[u_\bullet]$ of the Schwarzschild spacetime.

In order to appreciate the following arguments, the non-vanishing sectors in the s-jets $J^{(p_\bullet+5)}_{\mathcal{I}}[\upsilon - \upsilon_\bullet]$ and $J^{(p_\bullet+5)}_{\mathcal{I}}[\phi - \phi_\bullet]$ will be listed. This list can be deduced from Lemma 1.
• At order p = p• + 1: $S_0[\upsilon^{(p_\bullet+1)}]$, $S_0[\phi^{(p_\bullet+1)}]$, $S_{p_\bullet+1}[\phi^{(p_\bullet+1)}]$.
• At order p = p• + 2: $S_0[\upsilon^{(p_\bullet+2)}]$, $S_{p_\bullet+1}[\upsilon^{(p_\bullet+2)}]$, $S_0[\phi^{(p_\bullet+2)}]$, $S_{p_\bullet+1}[\phi^{(p_\bullet+2)}]$, $S_{p_\bullet+2}[\phi^{(p_\bullet+2)}]$.
• At order p = p• + 3: $S_0[\upsilon^{(p_\bullet+3)}]$, $S_{p_\bullet+1}[\upsilon^{(p_\bullet+3)}]$, $S_{p_\bullet+2}[\upsilon^{(p_\bullet+3)}]$, $S_0[\phi^{(p_\bullet+3)}]$, $S_{p_\bullet+1}[\phi^{(p_\bullet+3)}]$, $S_{p_\bullet+2}[\phi^{(p_\bullet+3)}]$, $S_{p_\bullet+3}[\phi^{(p_\bullet+3)}]$.
• At order p = p• + 4: $S_0[\upsilon^{(p_\bullet+4)}]$, $S_{p_\bullet+1}[\upsilon^{(p_\bullet+4)}]$, $S_{p_\bullet+2}[\upsilon^{(p_\bullet+4)}]$, $S_{p_\bullet+3}[\upsilon^{(p_\bullet+4)}]$, $S_0[\phi^{(p_\bullet+4)}]$, $S_{p_\bullet+1}[\phi^{(p_\bullet+4)}]$, $S_{p_\bullet+2}[\phi^{(p_\bullet+4)}]$, $S_{p_\bullet+3}[\phi^{(p_\bullet+4)}]$, $S_{p_\bullet+4}[\phi^{(p_\bullet+4)}]$.

Remark. From Lemma 4 it follows that there is no mixing between the various sectors arising at each order. Thus, it is only necessary to carry out a discussion of the solutions in the difference jet $J^{(p_\bullet+5)}_{\mathcal{I}}[u - u_\bullet]$ for the sectors $S_{p_\bullet+1}$. The analysis of the corresponding sectors $S_{p_\bullet+2}$, $S_{p_\bullet+3}$, $S_{p_\bullet+4}$ and $S_{p_\bullet+5}$ can, in principle, be obtained from that of the sector $S_{p_\bullet+1}$ by performing, respectively, the formal replacements p• → p• + 1, p• → p• + 2, p• → p• + 3 and p• → p• + 4.

Warning. In order to improve the readability, obvious strings of subindices will be omitted in the sequel.
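The sector list above follows a simple rule, which the following sketch encodes for bookkeeping purposes. It is only a transcription of the list for the orders p = p• + n with n = 1, . . . , 4; the function name and the returned dictionary layout are illustrative choices, not notation from the text.

```python
def difference_jet_sectors(p_bullet, n):
    """Labels q of the non-vanishing sectors S_q at order p = p_bullet + n.

    Transcribes the list of Sect. 7.2 for n = 1, ..., 4:
    upsilon^(p) carries S_0 and S_{p_bullet+1}, ..., S_{p_bullet+n-1};
    phi^(p) carries S_0 and S_{p_bullet+1}, ..., S_{p_bullet+n}.
    """
    if n not in (1, 2, 3, 4):
        raise ValueError("the list in the text only covers n = 1, ..., 4")
    upsilon = [0] + [p_bullet + j for j in range(1, n)]
    phi = [0] + [p_bullet + j for j in range(1, n + 1)]
    return {"order": p_bullet + n, "upsilon": upsilon, "phi": phi}
```

For instance, `difference_jet_sectors(4, 2)` reports the sectors with q = 0, 5 for υ and q = 0, 5, 6 for φ at order p = 6, in agreement with the second item of the list.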
7.3. Properties of the transport equations for p = p• + 1. The solutions to the transport equations at order p = p• + 1 can be read from the original analysis carried out in [20]. Using Lemma 1 it follows that $\upsilon^{(p_\bullet+1)}$ contains no contribution to the sectors $S_{p_\bullet+1}$. The only non-vanishing sector in $\upsilon^{(p_\bullet+1)}$ is $S_0$, which coincides with the Schwarzschildean solution. The (non-vanishing) contribution of $\phi^{(p_\bullet+1)}$ to the sectors $S_{p_\bullet+1}$ is given by:
$$a_{0,p_\bullet+1;2(p_\bullet+1),k} = A_{0,k}(1-\tau)^{p_\bullet+3}(1+\tau)^{p_\bullet-1}, \tag{29a}$$
$$a_{1,p_\bullet+1;2(p_\bullet+1),k} = A_{1,k}(1-\tau)^{p_\bullet+2}(1+\tau)^{p_\bullet}, \tag{29b}$$
$$a_{2,p_\bullet+1;2(p_\bullet+1),k} = A_{2,k}(1-\tau)^{p_\bullet+1}(1+\tau)^{p_\bullet+1}, \tag{29c}$$
$$a_{3,p_\bullet+1;2(p_\bullet+1),k} = A_{1,k}(1-\tau)^{p_\bullet}(1+\tau)^{p_\bullet+2}, \tag{29d}$$
$$a_{4,p_\bullet+1;2(p_\bullet+1),k} = A_{0,k}(1-\tau)^{p_\bullet-1}(1+\tau)^{p_\bullet+3}, \tag{29e}$$
with k = 0, . . . , 2(p• + 1) and
$$A_{0,k} \equiv -\sqrt{p_\bullet(p_\bullet+1)(p_\bullet+2)(p_\bullet+3)}\;w_{p_\bullet+1;2(p_\bullet+1),k},$$
$$A_{1,k} \equiv -4(p_\bullet+3)\sqrt{(p_\bullet+1)(p_\bullet+2)}\;w_{p_\bullet+1;2(p_\bullet+1),k},$$
$$A_{2,k} \equiv -6(p_\bullet+2)(p_\bullet+3)\,w_{p_\bullet+1;2(p_\bullet+1),k}.$$
Hence, the solutions extend analytically through the sets $\mathcal{I}^{\pm}$, consistently with the theorem concerning the regularity condition (27).
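The expressions (29a)-(29e) can be tested against the transport equations of Sect. 6. The sketch below is an independent numerical check, not the Maple V computation referred to in the text. It assumes that in the sector q = p = p• + 1 all the terms R and S vanish (as suggested by Lemma 1 (ii)), that $\beta_1 = \sqrt{(q-1)(q+2)}$ and $\beta_2 = \sqrt{q(q+1)}$, and that the amplitudes $A_{0,k}$ and $A_{1,k}$ carry the square roots $\sqrt{p_\bullet(p_\bullet+1)(p_\bullet+2)(p_\bullet+3)}$ and $\sqrt{(p_\bullet+1)(p_\bullet+2)}$; the helper names are illustrative.

```python
import math


def monomial(A, m, n):
    """Return value and derivative callables for A (1-tau)^m (1+tau)^n."""
    def value(t):
        return A * (1 - t) ** m * (1 + t) ** n

    def deriv(t):
        return A * (-m * (1 - t) ** (m - 1) * (1 + t) ** n
                    + n * (1 - t) ** m * (1 + t) ** (n - 1))
    return value, deriv


def residuals(pb, w, tau):
    """Residuals of the homogeneous Eqs. (16a)-(16e), (17a)-(17c), p = q = pb + 1."""
    p = q = pb + 1
    b1 = math.sqrt((q - 1) * (q + 2))   # beta_1 (assumed form)
    b2 = math.sqrt(q * (q + 1))         # beta_2 (assumed form)
    A0 = -math.sqrt(pb * (pb + 1) * (pb + 2) * (pb + 3)) * w
    A1 = -4 * (pb + 3) * math.sqrt((pb + 1) * (pb + 2)) * w
    A2 = -6 * (pb + 2) * (pb + 3) * w
    amplitudes = [A0, A1, A2, A1, A0]
    a, da = [], []
    for j in range(5):
        # a_j = A (1-tau)^(pb+3-j) (1+tau)^(pb-1+j), cf. (29a)-(29e)
        val, der = monomial(amplitudes[j], pb + 3 - j, pb - 1 + j)
        a.append(val(tau))
        da.append(der(tau))
    return [
        (1 + tau) * da[0] + b1 * a[1] / 4 - (p - 2) * a[0],        # (16a)
        da[1] + b2 * a[2] / 3 - 2 * b1 * a[0] + a[1],              # (16b)
        da[2] + 0.75 * b2 * (a[3] - a[1]),                         # (16c)
        da[3] + 2 * b1 * a[4] - b2 * a[2] / 3 - a[3],              # (16d)
        (1 - tau) * da[4] - b1 * a[3] / 4 + (p - 2) * a[4],        # (16e)
        tau * da[1] + b2 * a[2] / 3 + 2 * b1 * a[0] - p * a[1],    # (17a)
        tau * da[2] + 0.75 * b2 * (a[3] + a[1]) - p * a[2],        # (17b)
        tau * da[3] + 2 * b1 * a[4] + b2 * a[2] / 3 - p * a[3],    # (17c)
    ]
```

Evaluating `residuals(pb, 1.0, tau)` for a few values of `pb` and of `tau` in (−1, 1) should give numbers at round-off level, which supports the reading of the amplitudes adopted here.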
7.4. Properties of the transport equations for p = p• + 2. The solutions for $S_{p_\bullet+1}[\upsilon^{(p_\bullet+2)}]$ can be calculated using the solutions for $S_{p_\bullet+1}[\phi^{(p_\bullet+1)}]$ given by (29a)-(29e). According to Lemma 2 the solutions for $S_{p_\bullet+1}[\upsilon^{(p_\bullet+2)}]$ will be polynomial in τ and it turns out that they can be expressed in terms of hypergeometric functions. With this information at hand it would be, in principle, possible to analyse the Maxwell-like reduced system (22a)-(22b) at order p = p• + 2 for the coefficients $a_{1,p_\bullet+2}$ and $a_{3,p_\bullet+2}$. However, the form of the explicit solutions in $S_{p_\bullet+1}[\upsilon^{(p_\bullet+2)}]$ makes it very difficult to identify useful structures in the equations. Instead, it is desirable to find a way to extract information on the solutions to the p = p• + 2 Maxwell-like reduced system without having to make use of the explicit solutions of $S_{p_\bullet+1}[\upsilon^{(p_\bullet+2)}]$. In order to get around this problem we do the following: a close inspection of the right hand side of Eqs. (22a) and (22b) reveals that by differentiating three times with respect to τ and then using the $\upsilon^{(p_\bullet+2)}$ and $\phi^{(p_\bullet+1)}$ transport equations and τ-derivatives thereof, it is possible to obtain a reduced system for
$$a^{[3]}_{1,p_\bullet+2} \equiv \partial^{3}_\tau a_{1,p_\bullet+2}, \qquad a^{[3]}_{3,p_\bullet+2} \equiv \partial^{3}_\tau a_{3,p_\bullet+2},$$
where the non-homogeneous terms depend explicitly only on $a_{1,p_\bullet+1}$ and $a_{3,p_\bullet+1}$ as given by expressions (29b) and (29d). Clearly, if the solutions $a^{[3]}_{1,p_\bullet+2}$ and $a^{[3]}_{3,p_\bullet+2}$ for these equations are regular at τ = ±1, this will also be the case for $a_{1,p_\bullet+2}$ and $a_{3,p_\bullet+2}$. Similarly, if $a^{[3]}_{1,p_\bullet+2}$ and $a^{[3]}_{3,p_\bullet+2}$ have logarithmic singularities at τ = ±1, $a_{1,p_\bullet+2}$ and $a_{3,p_\bullet+2}$ will also contain them. The calculation of suitable equations for $a^{[3]}_{1,p_\bullet+2}$ and $a^{[3]}_{3,p_\bullet+2}$ requires systematic and extensive use of computer algebra methods. Calculations in the system Maple V render a reduced system of the form:
$$(1+\tau)\partial_\tau a^{[3]}_{1,p_\bullet+2} + \frac{1}{2(p_\bullet-1)}\Bigl((p_\bullet+1)(p_\bullet+2)\tau - (p_\bullet^2 - 9p_\bullet + 2)\Bigr)a^{[3]}_{1,p_\bullet+2} + \frac{1}{2(p_\bullet-1)}(p_\bullet+1)(p_\bullet+2)(1-\tau)a^{[3]}_{3,p_\bullet+2} = \frac{1}{(1-\tau)^3(1+\tau)^4}G_{p_\bullet+2}\,a_{1,p_\bullet+1} + \frac{1}{(1-\tau)^3(1+\tau)^4}H_{p_\bullet+2}\,a_{3,p_\bullet+1}, \tag{30a}$$
$$(1-\tau)\partial_\tau a^{[3]}_{3,p_\bullet+2} - \frac{1}{2(p_\bullet-1)}(p_\bullet+1)(p_\bullet+2)(1+\tau)a^{[3]}_{1,p_\bullet+2} + \frac{1}{2(p_\bullet-1)}\Bigl((p_\bullet+1)(p_\bullet+2)\tau + (p_\bullet^2 - 9p_\bullet + 2)\Bigr)a^{[3]}_{3,p_\bullet+2} = \frac{1}{(1+\tau)^3(1-\tau)^4}H^{s}_{p_\bullet+2}\,a_{1,p_\bullet+1} + \frac{1}{(1+\tau)^3(1-\tau)^4}G^{s}_{p_\bullet+2}\,a_{3,p_\bullet+1}. \tag{30b}$$
In the above expressions $G_{p_\bullet+2}$ and $H_{p_\bullet+2}$ are explicit polynomials in τ of degree 9 with coefficients which are themselves polynomials in p•. Furthermore, $G^{s}_{p_\bullet+2}(\tau) \equiv G_{p_\bullet+2}(-\tau)$, $H^{s}_{p_\bullet+2}(\tau) \equiv H_{p_\bullet+2}(-\tau)$. Both $G_{p_\bullet+2}$ and $H_{p_\bullet+2}$ contain an overall factor of m. The explicit form of these polynomials is given in the Appendix. The system
(30a)-(30b) is supplemented by the initial conditions
$$a^{[3]}_{1,p_\bullet+2}(0) = -2mA_1(6p_\bullet+23)(p_\bullet+2), \qquad a^{[3]}_{3,p_\bullet+2}(0) = 2mA_1(6p_\bullet+23)(p_\bullet+2). \tag{31}$$
The data is calculated by using the values of $\phi^{(p_\bullet+1)}(0)$, $\phi^{(p_\bullet+2)}(0)$, $\upsilon^{(p_\bullet+2)}(0)$ implied by the data jets $J^{(p_\bullet+2)}_{\mathcal{I}}[\upsilon]$, $J^{(p_\bullet+2)}_{\mathcal{I}}[\phi]$ and by τ-differentiating as necessary the $\upsilon^{(p_\bullet+2)}$ and $\phi^{(p_\bullet+2)}$ transport equations. This cumbersome calculation has been carried out in the computer algebra system Maple V. Substitution of the explicit $\phi^{(p_\bullet+1)}$ solutions (29a)-(29e) into Eqs. (30a) and (30b) shows that the right hand sides of these equations are, respectively, of the form
$$(1-\tau)^{p_\bullet-3}(1+\tau)^{p_\bullet-4}Q_{p_\bullet+2}(\tau), \qquad (1+\tau)^{p_\bullet-3}(1-\tau)^{p_\bullet-4}Q^{s}_{p_\bullet+2}(\tau),$$
where $Q_{p_\bullet+2}(\tau)$ is a polynomial of degree 11 in τ such that $Q_{p_\bullet+2}(\pm 1) \neq 0$. Consistently with the above, it is assumed that p• ≥ 4. The cases with p• < 4 can be analysed on a case by case basis (cf. the calculations in [40]). The remarkable structure of the zeros of the right hand sides of Eqs. (30a) and (30b) eases the task of looking for polynomial solutions to these equations. Indeed, we note the following lemma.

Lemma 6. All polynomial solutions of Eqs. (30a) and (30b) for p• ≥ 4 are of the form
$$a^{[3]}_{1,p_\bullet+2}(\tau) = (1-\tau)^{p_\bullet-2}(1+\tau)^{p_\bullet-4}\,b_{p_\bullet+2}(\tau), \tag{32a}$$
$$a^{[3]}_{3,p_\bullet+2}(\tau) = -(1+\tau)^{p_\bullet-2}(1-\tau)^{p_\bullet-4}\,b^{s}_{p_\bullet+2}(\tau), \tag{32b}$$
where $b_{p_\bullet+2}(\tau)$ is a polynomial of degree 9 and $b^{s}_{p_\bullet+2}(\tau) \equiv b_{p_\bullet+2}(-\tau)$.

Proof. The proof of this lemma is inspired by the discussion in Chapter 4 of [30]. One starts by looking at the possible zeros of the solution to Eqs. (30a) and (30b) at τ = 1. If $a^{[3]}_1$ and $a^{[3]}_3$ are polynomial then one can write
$$a^{[3]}_1 = \sum_{k=n_*}^{n^*}\alpha_k(1-\tau)^k, \qquad a^{[3]}_3 = \sum_{k=m_*}^{m^*}\beta_k(1-\tau)^k, \tag{33}$$
for some integers $n_*, m_*, n^*, m^* \ge 0$ and some complex numbers $\alpha_k$, $\beta_l$, $n_* \le k \le n^*$, $m_* \le l \le m^*$. Dividing Eq. (30a) by 1 + τ and Eq. (30b) by 1 − τ and then substituting the expressions (33) into the ordinary differential Eqs. (30a) and (30b) one finds that
$$\min\{n_* - 1,\, m_* + 1\} = p_\bullet - 3, \qquad \min\{n_* - 1,\, m_* - 1\} = p_\bullet - 5,$$
as τ = 1 is a zero of the right hand sides of (30a) and (30b) and, moreover, the left and right hand sides must have the same multiplicity. The above conditions are satisfied by setting $n_* = p_\bullet - 2$, $m_* = p_\bullet - 4$. The discussion of zeros at τ = −1 follows by symmetry. The degree of the polynomial $b_{p_\bullet+2}(\tau)$ follows by inspection of $Q_{p_\bullet+2}$.
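The two multiplicity conditions in the proof are simple enough to explore by brute force. The sketch below enumerates the pairs of lowest powers of (1 − τ) compatible with them; the reading of the conditions as $\min\{n_* - 1, m_* + 1\} = p_\bullet - 3$ and $\min\{n_* - 1, m_* - 1\} = p_\bullet - 5$ is an assumption about the intended indices, and the function name and search window are illustrative.

```python
def admissible_lowest_powers(p_bullet, search_max=None):
    """Pairs (n_low, m_low) satisfying both multiplicity conditions.

    n_low and m_low stand for the lowest powers of (1 - tau) in the
    expansions (33) of the two unknowns.
    """
    if search_max is None:
        search_max = p_bullet + 5
    hits = []
    for n_low in range(search_max + 1):
        for m_low in range(search_max + 1):
            if (min(n_low - 1, m_low + 1) == p_bullet - 3
                    and min(n_low - 1, m_low - 1) == p_bullet - 5):
                hits.append((n_low, m_low))
    return hits
```

Within the search window the second entry is always p• − 4 and the first is at least p• − 2, so the choice made in the proof is the smallest admissible pair; larger values of the first entry correspond to the polynomial factor in (32a) itself vanishing at τ = 1.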
Now, the substitution of the Ansatz (32a) and (32b) with b p• +2 (τ ) =
9
B p• +2,k τ k ,
B p• +2,k ∈ C
k=0
into Eqs. (30a) and (30b) leads to a system of 11 linear algebraic equations for the 10 unknowns B p• +2,k , k = 0, . . . , 9. This overdetermined system can be seen to have a solution which can be explicitly calculated. The result of this calculation is also presented in the Appendix. In particular, the solution obtained is such that [3] [3] a1, p• +2 (0) = −2m A1 (6 p• + 23)( p• + 2), a3, p• +2 (0) = 2m A1 (6 p• + 23)( p• + 2),
consistent with the initial data (31) for the system (32a) and (32b). Thus, the particular solution to the system (30a)-(30b) calculated by the above procedure is, in fact, the solution to the system in question with data given by (31). From the procedure described in the above paragraphs together with explicit calculations in [40] for p• < 4 one has the following result. Proposition 3. The solution of the Maxwell-like reduced system (30a) and (30b) with data given by (31) is polynomial in τ and thus it extends smoothly through τ = ±1. [3] [3] Remark. It follows that not only a1, p• +2 and a3, p• +2 but also a1, p• +2 and a3, p• +2 are polynomial in τ . Moreover, from the discussion in Sect. 6.2 and Lemma 6 also a0, p• +2 , a2, p• +2 , a4, p• +2 and the sector S p• +1 [υ ( p• +3) ] are polynomial in τ .
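7.5. Properties of the transport equations for $p = p_\bullet + 3$. We extend the approach used in the previous section to decide whether the coefficients $a_{1,p_\bullet+3}$ and $a_{3,p_\bullet+3}$ are polynomial in $\tau$ or not. For these coefficients, the calculations are more involved, and require $\tau$-differentiating the Maxwell-like reduced equations satisfied by $a_{1,p_\bullet+3}$ and $a_{3,p_\bullet+3}$ seven times to obtain a linear system for

$$a^{[7]}_{1,p_\bullet+3} \equiv \partial_\tau^{7} a_{1,p_\bullet+3}, \qquad a^{[7]}_{3,p_\bullet+3} \equiv \partial_\tau^{7} a_{3,p_\bullet+3}.$$

After computer algebra computations involving the substitution of the transport equations satisfied by $\upsilon^{(p_\bullet+2)}$, $\upsilon^{(p_\bullet+3)}$ and $\phi^{(p_\bullet+1)}$, $\phi^{(p_\bullet+2)}$ and $\tau$-derivatives thereof, one obtains the following system of equations:

$$
\begin{aligned}
(1+\tau)\,\partial_\tau a^{[7]}_{1,p_\bullet+3}
&- \frac{1}{2(p_\bullet-4)}\Big[(p_\bullet+1)(p_\bullet+2)\tau - (p_\bullet^2 - 21p_\bullet - 38)\Big] a^{[7]}_{1,p_\bullet+3}
- \frac{1}{2(p_\bullet-4)}(p_\bullet+1)(p_\bullet+2)(1-\tau)\, a^{[7]}_{3,p_\bullet+3} \\
&= \frac{1}{(1-\tau)^{8}(1+\tau)^{9}}\, G_{p_\bullet+3}\, a_{1,p_\bullet+1}
+ \frac{1}{(1-\tau)^{8}(1+\tau)^{9}}\, H_{p_\bullet+3}\, a_{3,p_\bullet+1} \\
&\quad + \frac{1}{(1-\tau)^{4}(1+\tau)^{5}}\, K_{p_\bullet+3}\, a^{[3]}_{1,p_\bullet+2}
+ \frac{1}{(1-\tau)^{4}(1+\tau)^{5}}\, L_{p_\bullet+3}\, a^{[3]}_{3,p_\bullet+2},
\end{aligned}
\tag{34a}
$$

$$
\begin{aligned}
(1-\tau)\,\partial_\tau a^{[7]}_{3,p_\bullet+3}
&+ \frac{1}{2(p_\bullet-4)}(p_\bullet+1)(p_\bullet+2)(1+\tau)\, a^{[7]}_{1,p_\bullet+3}
- \frac{1}{2(p_\bullet-4)}\Big[(p_\bullet+1)(p_\bullet+2)\tau + (p_\bullet^2 - 21p_\bullet + 38)\Big] a^{[7]}_{3,p_\bullet+3} \\
&= \frac{1}{(1+\tau)^{8}(1-\tau)^{9}}\, H^{s}_{p_\bullet+3}\, a_{1,p_\bullet+1}
+ \frac{1}{(1+\tau)^{8}(1-\tau)^{9}}\, G^{s}_{p_\bullet+3}\, a_{3,p_\bullet+1} \\
&\quad + \frac{1}{(1+\tau)^{4}(1-\tau)^{5}}\, L^{s}_{p_\bullet+3}\, a^{[3]}_{1,p_\bullet+2}
+ \frac{1}{(1+\tau)^{4}(1-\tau)^{5}}\, K^{s}_{p_\bullet+3}\, a^{[3]}_{3,p_\bullet+2}.
\end{aligned}
\tag{34b}
$$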
In the above expressions $G_{p_\bullet+3}$ and $H_{p_\bullet+3}$ are explicit polynomials in $\tau$ of degree 19, while $K_{p_\bullet+3}$ and $L_{p_\bullet+3}$ are explicit polynomials of degree 10. The polynomials $G_{p_\bullet+3}$ and $H_{p_\bullet+3}$ contain an overall factor of $m^2$ while $K_{p_\bullet+3}$ and $L_{p_\bullet+3}$ have an overall factor of $m$. The coefficients of these polynomials are themselves polynomials in $p_\bullet$. As in the case of $p = p_\bullet + 2$, initial conditions for the system (34a)-(34b) are calculated using the values of $\phi^{(p_\bullet+1)}(0)$, $\phi^{(p_\bullet+2)}(0)$, $\phi^{(p_\bullet+3)}(0)$, $\upsilon^{(p_\bullet+2)}(0)$, $\upsilon^{(p_\bullet+3)}(0)$ implied by the data jets $J^{(p_\bullet+2)}_{\mathcal{I}}[\upsilon]$, $J^{(p_\bullet+3)}_{\mathcal{I}}[\upsilon]$, $J^{(p_\bullet+2)}_{\mathcal{I}}[\phi]$, $J^{(p_\bullet+3)}_{\mathcal{I}}[\phi]$ and by $\tau$-differentiating as necessary the $\upsilon^{(p_\bullet+3)}$ and $\phi^{(p_\bullet+3)}$ transport equations. One obtains

$$a^{[7]}_{1,p_\bullet+3}(0) = \frac{1}{8} m^2 A\,(p_\bullet+2)(p_\bullet+3)\big(3p_\bullet^5 + 72p_\bullet^4 + 49353p_\bullet^3 + 142260p_\bullet^2 + 610272p_\bullet + 302048\big), \tag{35a}$$

$$a^{[7]}_{3,p_\bullet+3}(0) = -\frac{1}{8} m^2 A\,(p_\bullet+2)(p_\bullet+3)\big(3p_\bullet^5 + 72p_\bullet^4 + 49353p_\bullet^3 + 142260p_\bullet^2 + 610272p_\bullet + 302048\big). \tag{35b}$$

We proceed to analyse this system along the same lines as was done for (30a)-(30b). First one notices that direct substitution of the explicit solutions $a_{1,p_\bullet+1}$, $a_{3,p_\bullet+1}$, $a_{1,p_\bullet+2}$ and $a_{3,p_\bullet+2}$ which were obtained, respectively, in Sects. 7.3 and 7.4 shows that the right hand sides of Eqs. (34a) and (34b) are of the form

$$(1-\tau)^{p_\bullet-8}(1+\tau)^{p_\bullet-9}\,Q_{p_\bullet+3}(\tau), \qquad (1+\tau)^{p_\bullet-8}(1-\tau)^{p_\bullet-9}\,Q^{s}_{p_\bullet+3}(\tau),$$

where $Q_{p_\bullet+3}(\tau)$ is a polynomial in $\tau$ such that $Q_{p_\bullet+3}(\pm 1) \neq 0$. In order for the following calculations to make sense, it is assumed that $p_\bullet \geq 9$. The cases $p_\bullet < 9$ can be analysed individually and provide the same qualitative picture. The remarkable structure of the zeros of the right hand sides of Eqs. (34a) and (34b) eases the task of looking for polynomial solutions to these equations. Indeed, one has the following lemma, whose proof is similar to that of Lemma 6.

Lemma 7. All polynomial solutions of Eqs. (34a) and (34b) are of the form

$$a^{[7]}_{1,p_\bullet+3}(\tau) = (1-\tau)^{p_\bullet-7}(1+\tau)^{p_\bullet-9}\, b_{p_\bullet+3}(\tau), \tag{36a}$$

$$a^{[7]}_{3,p_\bullet+3}(\tau) = -(1+\tau)^{p_\bullet-7}(1-\tau)^{p_\bullet-9}\, b^{s}_{p_\bullet+3}(\tau), \tag{36b}$$

where $b_{p_\bullet+3}(\tau)$ is a polynomial of degree 19, and $b^{s}_{p_\bullet+3}(\tau) \equiv b_{p_\bullet+3}(-\tau)$.
Now, we use the expressions (36a) and (36b) as an Ansatz for $a^{[7]}_{1,p_\bullet+3}$ and $a^{[7]}_{3,p_\bullet+3}$ with

$$b_{p_\bullet+3}(\tau) = \sum_{k=0}^{19} B_{p_\bullet+3,k}\, \tau^k, \qquad B_{p_\bullet+3,k} \in \mathbb{C}.$$

The substitution of this Ansatz into Eqs. (34a)-(34b) leads to a system of 21 linear algebraic equations for the 20 unknowns $B_{p_\bullet+3,k}$, $k = 0, \ldots, 19$. By means of an explicit calculation, this overdetermined system can be seen to have a solution; that is, not all the 21 equations are linearly independent. The solution so obtained contains a free parameter which can be adjusted to set the value of the coefficient $B_{p_\bullet+3,0}$ such that the initial conditions (35a)-(35b) are satisfied. With the procedure described in the above paragraphs and explicit calculations for the cases with $p_\bullet < 9$ one has the following result.

Proposition 4. The solution of the Maxwell-like reduced system (34a)-(34b) with data given by (35a)-(35b) is polynomial in $\tau$ and thus it extends smoothly through $\tau = \pm 1$.

Remark. It follows that not only $a^{[7]}_{1,p_\bullet+3}$ and $a^{[7]}_{3,p_\bullet+3}$ but also $a_{1,p_\bullet+3}$ and $a_{3,p_\bullet+3}$ are polynomial in $\tau$. Moreover, from the discussion in Sect. 6.2 and Lemma 7, $a_{0,p_\bullet+3}$, $a_{2,p_\bullet+3}$, $a_{4,p_\bullet+3}$ and the sector $S_{p_\bullet+1}[\upsilon^{(p_\bullet+4)}]$ are polynomial in $\tau$.
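The kind of consistency check used in this section, an overdetermined linear system for the coefficients of a polynomial Ansatz which may or may not admit a solution, can be illustrated on a toy problem. The sketch below is not the Maple computation referred to in the text. It assumes a hypothetical model operator $L[y] = (1-\tau^2)\,y' + 2\tau y$ and a quadratic Ansatz $y = b_0 + b_1\tau + b_2\tau^2$, and the helper names (`operator_on_monomial`, `rank`, `polynomial_solution_exists`) are introduced here purely for illustration. Matching powers of $\tau$ gives 4 equations for 3 unknowns, and a polynomial solution exists exactly when the coefficient matrix and the augmented matrix have the same rank.

```python
from fractions import Fraction


def operator_on_monomial(k, n_rows):
    """Coefficients (in powers of tau) of L[tau^k] for the toy operator
    L[y] = (1 - tau^2) y' + 2 tau y, i.e. k tau^(k-1) + (2 - k) tau^(k+1)."""
    col = [Fraction(0)] * n_rows
    if k >= 1:
        col[k - 1] += k
    col[k + 1] += 2 - k
    return col


def rank(rows):
    """Rank of a matrix by exact Gauss-Jordan elimination over the rationals."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    n_cols = len(m[0]) if m else 0
    for c in range(n_cols):
        pivot = next((i for i in range(rk, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][c] != 0:
                f = m[i][c] / m[rk][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk


def polynomial_solution_exists(rhs, degree=2):
    """True if L[y] = rhs has a polynomial solution y of the given degree.
    `rhs` lists the coefficients of 1, tau, tau^2, ... of the right hand side."""
    n_rows = degree + 2  # L raises the degree by at most one
    cols = [operator_on_monomial(k, n_rows) for k in range(degree + 1)]
    # Row r collects the coefficient of tau^r; column k belongs to unknown b_k.
    a = [[cols[k][r] for k in range(degree + 1)] for r in range(n_rows)]
    padded = list(rhs) + [0] * (n_rows - len(rhs))
    augmented = [row + [Fraction(x)] for row, x in zip(a, padded)]
    return rank(a) == rank(augmented)


if __name__ == "__main__":
    print(polynomial_solution_exists([1, 4, 1, 0]))  # rhs = 1 + 4 tau + tau^2
    print(polynomial_solution_exists([1, 4, 0, 0]))  # rhs = 1 + 4 tau
```

In this toy model the right hand side $1 + 4\tau + \tau^2$ gives a consistent system whose rank is smaller than the number of unknowns, so one coefficient remains free, loosely mirroring the free parameter that is fixed by the initial conditions (35a)-(35b). For the right hand side $1 + 4\tau$ the augmented matrix has larger rank and no polynomial solution of this degree exists.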
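7.6. Properties of the transport equations for $p = p_\bullet + 4$. Finally, it is shown, by methods similar to the ones used for the cases $p = p_\bullet + 2$ and $p = p_\bullet + 3$, that the solutions $a_{1,p_\bullet+4}$ and $a_{3,p_\bullet+4}$ of the order $p = p_\bullet + 4$ Maxwell-like transport equations cannot be purely polynomial. In order to show this, one $\tau$-differentiates eleven times(!) the Maxwell-like reduced equations satisfied by $a_{1,p_\bullet+4}$ and $a_{3,p_\bullet+4}$ to obtain a linear system for

$$a^{[11]}_{1,p_\bullet+4} \equiv \partial_\tau^{11} a_{1,p_\bullet+4}, \qquad a^{[11]}_{3,p_\bullet+4} \equiv \partial_\tau^{11} a_{3,p_\bullet+4}.$$

After lengthy computer algebra computations involving the substitution of the transport equations satisfied by $\upsilon^{(p_\bullet+4)}$, $\upsilon^{(p_\bullet+3)}$, $\upsilon^{(p_\bullet+2)}$, $\phi^{(p_\bullet+3)}$, $\phi^{(p_\bullet+2)}$, $\phi^{(p_\bullet+1)}$ and $\tau$-derivatives thereof into the equations for $a^{[11]}_{1,p_\bullet+4}$ and $a^{[11]}_{3,p_\bullet+4}$, one obtains a system of equations of the form:

$$
\begin{aligned}
(1+\tau)\,\partial_\tau a^{[11]}_{1,p_\bullet+4}
&- \frac{1}{2(p_\bullet-7)}\Big[(p_\bullet+1)(p_\bullet+2)\tau - (p_\bullet^2 - 33p_\bullet + 110)\Big] a^{[11]}_{1,p_\bullet+4}
- \frac{1}{2(p_\bullet-7)}(p_\bullet+1)(p_\bullet+2)(1-\tau)\, a^{[11]}_{3,p_\bullet+4} \\
&= \frac{1}{(1-\tau)^{13}(1+\tau)^{14}}\, G_{p_\bullet+4}\, a_{1,p_\bullet+1}
+ \frac{1}{(1-\tau)^{13}(1+\tau)^{14}}\, H_{p_\bullet+4}\, a_{3,p_\bullet+1} \\
&\quad + \frac{1}{(1-\tau)^{9}(1+\tau)^{10}}\, K_{p_\bullet+4}\, a^{[3]}_{1,p_\bullet+2}
+ \frac{1}{(1-\tau)^{9}(1+\tau)^{10}}\, L_{p_\bullet+4}\, a^{[3]}_{3,p_\bullet+2} \\
&\quad + \frac{1}{(1-\tau)^{4}(1+\tau)^{5}}\, M_{p_\bullet+4}\, a^{[7]}_{1,p_\bullet+3}
+ \frac{1}{(1-\tau)^{4}(1+\tau)^{5}}\, N_{p_\bullet+4}\, a^{[7]}_{3,p_\bullet+3},
\end{aligned}
\tag{37a}
$$

$$
\begin{aligned}
(1-\tau)\,\partial_\tau a^{[11]}_{3,p_\bullet+4}
&+ \frac{1}{2(p_\bullet-7)}(p_\bullet+1)(p_\bullet+2)(1+\tau)\, a^{[11]}_{1,p_\bullet+4}
- \frac{1}{2(p_\bullet-7)}\Big[(p_\bullet+1)(p_\bullet+2)\tau + (p_\bullet^2 - 33p_\bullet + 110)\Big] a^{[11]}_{3,p_\bullet+4} \\
&= \frac{1}{(1+\tau)^{13}(1-\tau)^{14}}\, H^{s}_{p_\bullet+4}\, a_{1,p_\bullet+1}
+ \frac{1}{(1+\tau)^{13}(1-\tau)^{14}}\, G^{s}_{p_\bullet+4}\, a_{3,p_\bullet+1} \\
&\quad + \frac{1}{(1+\tau)^{9}(1-\tau)^{10}}\, L^{s}_{p_\bullet+4}\, a^{[3]}_{1,p_\bullet+2}
+ \frac{1}{(1+\tau)^{9}(1-\tau)^{10}}\, K^{s}_{p_\bullet+4}\, a^{[3]}_{3,p_\bullet+2} \\
&\quad + \frac{1}{(1+\tau)^{4}(1-\tau)^{5}}\, N^{s}_{p_\bullet+4}\, a^{[7]}_{1,p_\bullet+3}
+ \frac{1}{(1+\tau)^{4}(1-\tau)^{5}}\, M^{s}_{p_\bullet+4}\, a^{[7]}_{3,p_\bullet+3}.
\end{aligned}
\tag{37b}
$$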
In the above expressions $G_{p_\bullet+4}$ and $H_{p_\bullet+4}$ are polynomials in $\tau$ of degree 29 with an overall factor of $m^3$; $K_{p_\bullet+4}$ and $L_{p_\bullet+4}$ are polynomials in $\tau$ of degree 20 with an overall factor of $m^2$; finally, $M_{p_\bullet+4}$ and $N_{p_\bullet+4}$ are polynomials in $\tau$ of degree 9 with an overall factor of $m$. The coefficients of all these polynomials are themselves polynomial in $p_\bullet$. The initial conditions for the system (37a)-(37b) can be calculated following a similar procedure to the one used for the lower order systems (30a)-(30b) and (34a)-(34b). The initial conditions read

$$
\begin{aligned}
a^{[11]}_{1,p_\bullet+4}(0) = -\frac{1}{8} m^3 A\,(p_\bullet+2)(p_\bullet+3)(p_\bullet+4)\big(&1737p_\bullet^7 + 64317p_\bullet^6 + 84931161p_\bullet^5 - 175033869p_\bullet^4 \\
&+ 3959957166p_\bullet^3 - 1379005296p_\bullet^2 + 6565457856p_\bullet + 1464602048\big),
\end{aligned}
\tag{38}
$$

$$
\begin{aligned}
a^{[11]}_{3,p_\bullet+4}(0) = \frac{1}{8} m^3 A\,(p_\bullet+2)(p_\bullet+3)(p_\bullet+4)\big(&1737p_\bullet^7 + 64317p_\bullet^6 + 84931161p_\bullet^5 - 175033869p_\bullet^4 \\
&+ 3959957166p_\bullet^3 - 1379005296p_\bullet^2 + 6565457856p_\bullet + 1464602048\big).
\end{aligned}
\tag{39}
$$

As in the previous cases, the crucial observation is that using the explicit expressions for $a_{1,p_\bullet+1}$, $a_{3,p_\bullet+1}$, $a^{[3]}_{1,p_\bullet+2}$, $a^{[3]}_{3,p_\bullet+2}$, $a^{[7]}_{1,p_\bullet+3}$ and $a^{[7]}_{3,p_\bullet+3}$ discussed in the previous sections, one finds that the right hand sides of Eqs. (37a) and (37b) are, respectively, of the form

$$(1-\tau)^{p_\bullet-13}(1+\tau)^{p_\bullet-14}\,Q_{p_\bullet+4}(\tau), \qquad -(1+\tau)^{p_\bullet-13}(1-\tau)^{p_\bullet-14}\,Q^{s}_{p_\bullet+4}(\tau),$$

with $Q_{p_\bullet+4}(\tau)$ a polynomial in $\tau$ such that $Q_{p_\bullet+4}(\pm 1) \neq 0$. In what follows it is assumed that $p_\bullet \geq 14$, so that the discussions make sense. The cases $p_\bullet < 14$ can be dealt with using direct case-by-case calculations. The qualitative results are the same as in the general case. As in the cases discussed in Sects. 7.4 and 7.5, the structure of the polynomial solutions (if any) of the system (37a)-(37b) is very particular. More precisely, one has the following result.

Lemma 8. The polynomial solutions (if any) of the reduced system (37a) and (37b) are of the form

$$a^{[11]}_{1,p_\bullet+4}(\tau) = (1-\tau)^{p_\bullet-12}(1+\tau)^{p_\bullet-14}\, b_{p_\bullet+4}(\tau), \tag{40a}$$

$$a^{[11]}_{3,p_\bullet+4}(\tau) = -(1+\tau)^{p_\bullet-12}(1-\tau)^{p_\bullet-14}\, b^{s}_{p_\bullet+4}(\tau), \tag{40b}$$

where $b_{p_\bullet+4}$ is a polynomial of degree 29.
As in the previous orders, we use expressions (40a)-(40b) as an Ansatz for $a^{[11]}_{1,p_\bullet+4}$ and $a^{[11]}_{3,p_\bullet+4}$ with

$$b_{p_\bullet+4}(\tau) = \sum_{k=0}^{29} B_{p_\bullet+4,k}\, \tau^k, \qquad B_{p_\bullet+4,k} \in \mathbb{C}.$$

Again, this Ansatz leads to a system of 31 linear algebraic equations for the 30 unknowns $B_{p_\bullet+4,k}$, $k = 0, \ldots, 29$. By means of an explicit calculation, this overdetermined system can be seen to have no solutions unless $w_{p_\bullet+1,2p_\bullet+2,k} = 0$ or $m = 0$. This is the crucial result of our analysis. One has the following proposition.

Proposition 5. The Maxwell-like reduced system (37a) and (37b) admits no polynomial solutions unless $w_{p_\bullet+1,2p_\bullet+2,k} = 0$ or $m = 0$.

This last proposition, together with Proposition 2 in Sect. 6.3, renders the following corollary.

Corollary 1. The solution of the Maxwell-like reduced system (37a)-(37b) with data given by (38)-(39) develops logarithmic singularities at $\tau = \pm 1$ unless $w_{p_\bullet+1,2p_\bullet+2,k} = 0$ or $m = 0$. The logarithmic solutions are of class $C^{\omega}(-1,1) \cap C^{p_\bullet+3}[-1,1]$.

Remark. From the discussion in Sect. 6.2 it follows that the coefficients $a_{i,p_\bullet+4}$, $i = 0, \ldots, 4$ will have logarithmic singularities of the form given by Proposition 2.

8. The Main Result

The discussion in the previous sections is summarised in the following result.

Proposition 6. Given a time symmetric initial data set which in a neighbourhood $B_a$ of infinity is Schwarzschildean up to order $p = p_\bullet$, the solutions to the transport equations for the orders $p = p_\bullet + 1$, $p_\bullet + 2$, $p_\bullet + 3$ are polynomial in $\tau$ and hence extend smoothly through the critical sets $\mathcal{I}^{\pm}$. On the other hand, the solutions at order $p = p_\bullet + 4$ contain logarithmic singularities which can be avoided if and only if the initial data is, in fact, Schwarzschildean to order $p = p_\bullet + 1$.

From this result, an induction argument which uses the explicit calculations performed in [40] as a base step renders our main result (cf. the main theorem in the Introduction).

Theorem 2.
The solution to the regular finite initial value problem at spatial infinity for initial data which is time symmetric and conformally flat in a neighbourhood of infinity is smooth through $\mathcal{I}^{\pm}$ if and only if the restriction of the data to $\mathcal{I}^{0}$ coincides with the restriction of Schwarzschildean data at every order. Furthermore, the analyticity of the data implies that the initial data set is exactly Schwarzschildean in a neighbourhood of infinity.

As mentioned in the Introduction, the evidence gathered in [39] suggests that it is possible to obtain a generalisation of this result for time symmetric initial data sets which are not conformally flat. In that case, the expected conclusion is that smoothness at the critical sets implies the data to be asymptotically static at all orders at spatial infinity, in a suitable sense to be determined. In particular, under the further assumption of real analyticity, it is conjectured that one should be able to prove that the data is exactly static in a neighbourhood of infinity. In [24] it has been shown that the conformal structure of static spacetimes is as regular as one would expect it to be. In particular, static spacetimes extend smoothly (in fact analytically) at the critical sets $\mathcal{I}^{\pm}$. Key to such a generalisation of Theorem 2 is to consider a real analytic class of time symmetric initial data sets for which it is simple to decide whether its development will be a static spacetime or not.

Acknowledgements. This research is funded by an EPSRC Advanced Research Fellowship. I thank Helmut Friedrich for introducing me to this problem many years ago, and for many helpful discussions through the years. Thanks are also due to Robert Beig, Malcolm MacCallum, Christian Lübbe, Thomas Bäckdahl and José Luis Jaramillo for several discussions. I thank Christiane M. Losert-VK for support, encouragement and a careful reading of the manuscript.
A. Some of the Polynomials in Sect. 7

In order to exemplify the type of expressions one has to deal with in the analysis described in Sect. 7, here we present some of the simpler polynomials.

A.1. The polynomials in Sect. 7.4.

$$
\begin{aligned}
\frac{1}{m(p_\bullet+2)}\, G_{p_\bullet+2} ={}& (p_\bullet+4)(p_\bullet+2)\big(2p_\bullet^3 + 10p_\bullet^2 - 2p_\bullet - 55\big)\tau^9 \\
&- \big(17p_\bullet^3 + 635 + 305p_\bullet + 14p_\bullet^4 + 2p_\bullet^5 + 8p_\bullet^2\big)\tau^8 \\
&- 2\big(175p_\bullet^2 + 7p_\bullet^5 - 550p_\bullet + 246p_\bullet^3 - 652 + 72p_\bullet^4\big)\tau^7 \\
&- 2\big(5p_\bullet^5 + 155p_\bullet^2 + 115p_\bullet^3 + 34p_\bullet^4 - 471p_\bullet - 1080\big)\tau^6 \\
&+ \big(833p_\bullet^3 - 1124 + 795p_\bullet^2 + 12p_\bullet^5 - 1172p_\bullet + 206p_\bullet^4\big)\tau^5 \\
&+ \big(36p_\bullet^5 - 2494 + 601p_\bullet^3 - 946p_\bullet + 166p_\bullet^4 + 891p_\bullet^2\big)\tau^4 \\
&+ 2\big(263p_\bullet + 70 + 12p_\bullet^4 - 280p_\bullet^2 - 263p_\bullet^3\big)\tau^3 \\
&- 12\big(-32p_\bullet - 95 + 78p_\bullet^2 + 12p_\bullet^4 + 22p_\bullet^3\big)\tau^2 \\
&- 3\big(-62p_\bullet^2 - 72 + 51p_\bullet^3 + 44p_\bullet\big)\tau - 75 + 45p_\bullet + 54p_\bullet^3 + 39p_\bullet^2,
\end{aligned}
$$

$$
\begin{aligned}
\frac{1}{m(p_\bullet+2)}\, H_{p_\bullet+2} ={}& -2(p_\bullet+4)(p_\bullet+2)\big(2p_\bullet^3 + 10p_\bullet^2 - 2p_\bullet - 55\big)\tau^9 \\
&+ 2\big(161p_\bullet + 102p_\bullet^2 + 18p_\bullet^4 + 33 + 53p_\bullet^3 + 2p_\bullet^5\big)\tau^8 \\
&+ 4(p_\bullet+2)\big(7p_\bullet^4 + 56p_\bullet^3 + 122p_\bullet^2 - 71p_\bullet - 450\big)\tau^7 \\
&+ 4\big(3p_\bullet^3 - 113p_\bullet^2 - 305p_\bullet + 20p_\bullet^4 - 78 + 5p_\bullet^5\big)\tau^6 \\
&- 2\big(-1352p_\bullet + 905p_\bullet^3 + 959p_\bullet^2 - 2748 + 12p_\bullet^5 + 226p_\bullet^4\big)\tau^5 \\
&- 2\big(245p_\bullet^3 + 36p_\bullet^5 + 142p_\bullet^4 - 866p_\bullet - 246 - 13p_\bullet^2\big)\tau^4 \\
&+ 4\big(335p_\bullet^3 + 24p_\bullet^4 - 359p_\bullet + 562p_\bullet^2 - 886\big)\tau^3 \\
&+ 48\big(11p_\bullet^3 - 7p_\bullet - 13 + 6p_\bullet^4\big)\tau^2 \\
&- 6\big(-176 + 98p_\bullet^2 + 21p_\bullet^3 + 60p_\bullet\big)\tau + 90 + 78p_\bullet - 108p_\bullet^3 - 186p_\bullet^2,
\end{aligned}
$$

$$
\begin{aligned}
\frac{1}{mA_1}\, b_{p_\bullet+2}(\tau) ={}& -2(6p_\bullet+23)(p_\bullet+2) - 2(p_\bullet+2)\big(6p_\bullet^2 + 14p_\bullet + 5\big)\tau \\
&+ 4(p_\bullet+2)\big(21p_\bullet^2 + 45p_\bullet + 73\big)\tau^2 + 4(p_\bullet+2)\big(7p_\bullet^3 + 16p_\bullet^2 + 88p_\bullet + 48\big)\tau^3 \\
&- 2(p_\bullet+2)\big(20p_\bullet^3 + 55p_\bullet^2 + 229p_\bullet + 256\big)\tau^4 \\
&- 2(p_\bullet+1)(p_\bullet+2)\big(4p_\bullet^3 + 15p_\bullet^2 + 122p_\bullet + 214\big)\tau^5 \\
&+ \frac{4}{3}(p_\bullet+2)\big(7p_\bullet^3 + 69p_\bullet^2 + 260p_\bullet + 249\big)\tau^6 \\
&+ \frac{4}{3}\big(3p_\bullet^3 + 29p_\bullet^2 + 127p_\bullet + 120\big)(p_\bullet+2)^2\tau^7 \\
&- \frac{2}{3}(p_\bullet+2)\big(2p_\bullet^3 + 27p_\bullet^2 + 109p_\bullet + 99\big)\tau^8 \\
&- \frac{2}{3}(p_\bullet+2)\big(2p_\bullet^4 + 23p_\bullet^3 + 97p_\bullet^2 + 178p_\bullet + 111\big)\tau^9 .
\end{aligned}
$$

The expressions of the polynomials appearing in the analyses at order $p = p_\bullet + 3$ and $p = p_\bullet + 4$ are too involved to be presented here.

References

1. Anninos, P., Hobill, D., Seidel, E., Smarr, L., Suen, W.M.: Collision of two black holes. Phys. Rev. Lett. 71, 2851 (1993)
2. Ashtekar, A.: Lectures on Non-perturbative Canonical Gravity. Singapore: World Scientific, 1991
3. Ashtekar, A., Hansen, R.O.: A unified treatment of null and spatial infinity in general relativity. I. Universal structure, asymptotic symmetries, and conserved quantities at spatial infinity. J. Math. Phys. 19, 1542 (1978)
4. Beig, R.: Integration of Einstein's equations near spatial infinity. Proc. Roy. Soc. Lond. A 391, 295 (1984)
5. Beig, R., Schmidt, B.G.: Einstein's equation near spatial infinity. Commun. Math. Phys. 87, 65 (1982)
6. Blanchet, L.: Gravitational radiation from Post-Newtonian sources and inspiralling compact binaries. Living Rev. Relativity 9 (2006)
7. Bondi, H., van der Burg, M.G.J., Metzner, A.W.K.: Gravitational waves in general relativity VII. Waves from Axi-Symmetric Isolated Systems. Proc. Roy. Soc. Lond. A 269, 21 (1962)
8. Brill, D.R., Lindquist, R.W.: Interaction energy in geometrostatics. Phys. Rev. 131, 471 (1963)
9. Chruściel, P.T., Delay, E.: Existence of non-trivial, vacuum, asymptotically simple spacetimes. Class. Quant. Grav. 19, L71 (2002)
10. Chruściel, P.T., MacCallum, M.A.H., Singleton, D.B.: Gravitational waves in general relativity XIV. Bondi expansions and the "polyhomogeneity" of $\mathcal{I}$. Phil. Trans. Roy. Soc. Lond. A 350, 113 (1995)
11. Dain, S., Friedrich, H.: Asymptotically flat initial data with prescribed regularity at infinity. Commun. Math. Phys. 222, 569 (2001)
12. Frauendiener, J.: Numerical treatment of the hyperboloidal initial value problem for the vacuum Einstein equations. I. The conformal field equations. Phys. Rev. D 58, 064002 (1998)
13. Friedrich, H.: The asymptotic characteristic initial value problem for Einstein's vacuum field equations as an initial value problem for a first-order quasilinear symmetric hyperbolic system. Proc. Roy. Soc. Lond. A 378, 401 (1981)
14. Friedrich, H.: On the regular and the asymptotic characteristic initial value problem for Einstein's vacuum field equations. Proc. Roy. Soc. Lond. A 375, 169 (1981)
15. Friedrich, H.: On the existence of analytic null asymptotically flat solutions of Einstein's vacuum field equations. Proc. Roy. Soc. Lond. A 381, 361 (1982)
16. Friedrich, H.: On purely radiative space-times. Commun. Math. Phys. 103, 35 (1986)
17. Friedrich, H.: On the existence of N-geodesically complete or future complete solutions of Einstein's field equations with smooth asymptotic structure. Commun. Math. Phys. 107, 587 (1986)
18. Friedrich, H.: On static and radiative space-times. Commun. Math. Phys. 119, 51 (1988)
19. Friedrich, H.: Einstein equations and conformal structure: existence of anti-de Sitter-type space-times. J. Geom. Phys. 17, 125 (1995)
20. Friedrich, H.: Gravitational fields near space-like and null infinity. J. Geom. Phys. 24, 83 (1998)
21. Friedrich, H.: Conformal Einstein evolution. In: The Conformal Structure of Spacetime: Geometry, Analysis, Numerics, edited by J. Frauendiener, H. Friedrich, Lecture Notes in Physics, 604, Berlin-Heidelberg-New York: Springer, 2002, p. 1
22. Friedrich, H.: Conformal geodesics on vacuum spacetimes. Commun. Math. Phys. 235, 513 (2003)
23. Friedrich, H.: Spin-2 fields on Minkowski space near space-like and null infinity. Class. Quant. Grav. 20, 101 (2003)
24. Friedrich, H.: Smoothness at null infinity and the structure of initial data. In: 50 years of the Cauchy Problem in General Relativity, edited by P.T. Chruściel, H. Friedrich, Basel-Boston: Birkhäuser, 2004
25. Friedrich, H., Kánnár, J.: Bondi-type systems near space-like infinity and the calculation of the NP-constants. J. Math. Phys. 41, 2195 (2000)
26. Geroch, R.: Structure of the gravitational field at spatial infinity. J. Math. Phys. 13, 956 (1972)
27. Geroch, R.: Asymptotic structure of space-time. In: Asymptotic Structure of Spacetime, edited by E.P. Esposito, L. Witten, London: Plenum Press, 1976
28. Lübbe, C., Valiente Kroon, J.A.: On de Sitter-like and Minkowski-like spacetimes. Class. Quant. Grav. 26, 145012 (2009)
29. Misner, C.W.: The method of images in geometrodynamics. Ann. Phys. 24, 102 (1963)
30. van der Put, M., Singer, M.: Galois Theory of Linear Differential Equations. Berlin-Heidelberg-New York: Springer, 2003
31. Novak, S., Goldberg, J.N.: Conformal properties of nonpeeling vacuum spacetimes. Gen. Rel. Grav. 14, 655 (1982)
32. Penrose, R.: Asymptotic properties of fields and space-times. Phys. Rev. Lett. 10, 66 (1963)
33. Penrose, R., Rindler, W.: Spinors and Space-time. Volume 2. Spinor and twistor methods in space-time geometry. Cambridge: Cambridge University Press, 1986
34. Sachs, R.K.: Gravitational waves in general relativity VIII. Waves in asymptotically flat space-time. Proc. Roy. Soc. Lond. A 270, 103 (1962)
35. Sommers, P.: The geometry of the gravitational field at spacelike infinity. J. Math. Phys. 19, 549 (1978)
36. Sommers, P.: Space spinors. J. Math. Phys. 21, 2567 (1980)
37. Stewart, J.: Advanced General Relativity. Cambridge: Cambridge University Press, 1991
38. Szegö, G.: Orthogonal Polynomials. Volume 23 of AMS Colloq. Pub., Providence, RI: Amer. Math. Soc., 1978
39. Valiente Kroon, J.A.: Does asymptotic simplicity allow for radiation near spatial infinity? Commun. Math. Phys. 251, 211–234 (2004)
40. Valiente Kroon, J.A.: A new class of obstructions to the smoothness of null infinity. Commun. Math. Phys. 244, 133–234 (2004)
41. Valiente Kroon, J.A.: Time asymmetric spacetimes near null and spatial infinity. I. Expansions of developments of conformally flat data. Class. Quant. Grav. 23, 5457 (2004)
42. Valiente Kroon, J.A.: Time asymmetric spacetimes near null and spatial infinity. II. Expansions of developments of initial data sets with non-smooth conformal metrics. Class. Quant. Grav. 22, 1683 (2005)
43. Valiente Kroon, J.A.: On smoothness-asymmetric null infinities. Class. Quant. Grav. 23, 3593 (2006)
44. Valiente Kroon, J.A.: Asymptotic properties of the development of conformally flat data near spatial infinity. Class. Quantum Grav. 24, 3037 (2007)
45. Valiente Kroon, J.A.: The Maxwell field on the Schwarzschild spacetime: behaviour near spatial infinity. Proc. Roy. Soc. Lond. A 463, 2609 (2007)
46. Valiente Kroon, J.A.: Estimates for the Maxwell field near the spatial and null infinity of the Schwarzschild spacetime. J. Hyper. Differ. Equ. 6, 229 (2009)
47. Winicour, J.: Logarithmic asymptotic flatness. Found. Phys. 15, 605 (1985)

Communicated by P.T. Chruściel
Commun. Math. Phys. 298, 707–739 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1001-3
Communications in
Mathematical Physics
On Intermediate Subfactors of Goodman-de la Harpe-Jones Subfactors Feng Xu Department of Mathematics, University of California at Riverside, Riverside, CA 92521, USA. E-mail: [email protected] Received: 1 September 2009 / Accepted: 1 November 2009 Published online: 7 February 2010 – © Springer-Verlag 2010
Supported in part by NSF.

Abstract: In this paper we present a conjecture on intermediate subfactors which is a generalization of Wall's conjecture from the theory of finite groups. Motivated by this conjecture, we determine all intermediate subfactors of Goodman-Harpe-Jones subfactors, and as a result we verify that Goodman-Harpe-Jones subfactors satisfy our conjecture. Our result also gives a negative answer to a question motivated by a conjecture of Aschbacher-Guralnick.

1. Introduction

Let $M$ be a factor represented on a Hilbert space and $N$ a subfactor of $M$ which is irreducible, i.e., $N' \cap M = \mathbb{C}$. Let $K$ be an intermediate von Neumann subalgebra for the inclusion $N \subset M$. Since $K' \cap K \subset N' \cap M = \mathbb{C}$, $K$ is automatically a factor. Hence the set of all intermediate subfactors for $N \subset M$ forms a lattice under two natural operations $\wedge$ and $\vee$ defined by

$$K_1 \wedge K_2 = K_1 \cap K_2, \qquad K_1 \vee K_2 = (K_1 \cup K_2)''.$$

The commutant map $K \to K'$ maps an intermediate subfactor $N \subset K \subset M$ to $M' \subset K' \subset N'$. This map exchanges the two natural operations defined above. Let $M \subset M_1$ be the Jones basic construction of $N \subset M$. Then $M \subset M_1$ is canonically anti-isomorphic to $M' \subset N'$, and the lattice of intermediate subfactors for $N \subset M$ is related to the lattice of intermediate subfactors for $M \subset M_1$ by the commutant map defined as above.

Let $G_1$ be a group and $G_2$ be a subgroup of $G_1$. An interval sublattice $[G_1/G_2]$ is the lattice formed by all intermediate subgroups $K$, $G_2 \subseteq K \subseteq G_1$. By cross product construction and Galois correspondence, every interval sublattice of finite groups can be realized as an intermediate subfactor lattice of finite index. Hence
the study of an intermediate subfactor lattice of finite index is a natural generalization of the study of an interval sublattice of finite groups. The study of intermediate subfactors has been very active in recent years (cf. [9,14,20,22,23,31,41 and 38] for only a partial list). There are a number of old problems about the interval sublattice of finite groups. It is therefore a natural programme to investigate whether these old problems have any generalizations to the subfactor setting. The hope is that subfactor theory may provide a new perspective on these old problems.

In [44] we consider the problem of whether the very simple lattice $M_n$, consisting of a largest, a smallest and $n$ pairwise incomparable elements, can be realized as a subfactor lattice. We showed in [44] that all $M_{2n}$ are realized as the lattice of intermediate subfactors of a pair of hyperfinite type III$_1$ factors with finite depth. Since it is conjectured that infinitely many $M_{2n}$ cannot be realized as interval sublattices of finite groups (cf. [2] and [33]), our result shows that if one is looking for obstructions to realizing a finite lattice as a lattice of intermediate subfactors with finite index, then the obstruction is very different from what one may find in finite group theory.

In 1961 G. E. Wall conjectured that the number of maximal subgroups of a finite group $G$ is less than $|G|$, the order of $G$ (cf. [40]). In the same paper he proved his conjecture when $G$ is solvable. See [27] for a more recent result on Wall's conjecture. Wall's conjecture can be naturally generalized to a conjecture about maximal elements in the lattice of intermediate subfactors. More precisely, since $M$ is the maximal element under inclusion, what we mean by maximal elements are those subfactors $K \neq M, N$ with the property that if $K_1$ is an intermediate subfactor and $K \subset K_1$, then $K_1 = M$ or $K$. Minimal elements are defined similarly, where $N$, $M$ are not considered as minimal elements.
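Wall's conjecture, and the relative version discussed in this Introduction, can be checked by brute force for small groups. The following sketch is only an illustration under stated assumptions: it works with the symmetric group $S_n$ realised as tuples of images, and the helper names (`compose`, `generate`, `all_subgroups`, `maximal_over`) are hypothetical, introduced here for the example. It enumerates all subgroups by repeatedly adjoining one element to a known subgroup, and then counts the maximal subgroups strictly containing a given subgroup $H$.

```python
from itertools import permutations


def compose(p, q):
    """Composition p∘q of permutations given as tuples of images."""
    return tuple(p[i] for i in q)


def generate(gens, n):
    """Subgroup of S_n generated by `gens` (closure under right multiplication,
    which suffices in a finite group)."""
    identity = tuple(range(n))
    elems = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return frozenset(elems)


def all_subgroups(n):
    """All subgroups of S_n, found by adjoining one element at a time."""
    group = list(permutations(range(n)))
    trivial = generate([], n)
    found = {trivial}
    queue = [trivial]
    while queue:
        h = queue.pop()
        for g in group:
            if g in h:
                continue
            k = generate(list(h) + [g], n)
            if k not in found:
                found.add(k)
                queue.append(k)
    return found


def maximal_over(subgroups, h):
    """Maximal proper subgroups of the whole group that strictly contain h."""
    whole = max(subgroups, key=len)
    proper = [k for k in subgroups if h < k < whole]  # '<' is proper inclusion
    return [k for k in proper if not any(k < l for l in proper)]


if __name__ == "__main__":
    n = 4
    subgroups = all_subgroups(n)
    order = len(max(subgroups, key=len))
    trivial = generate([], n)
    # Wall's conjecture: number of maximal subgroups < |G|.
    print(len(maximal_over(subgroups, trivial)) < order)
    # Relative version with H generated by a transposition: count < |G|/|H|.
    h = generate([(1, 0, 2, 3)], n)
    print(len(maximal_over(subgroups, h)) < order // len(h))
```

Taking $H$ to be the trivial subgroup recovers the original conjecture. The exhaustive search is only practical for very small $n$, which is enough to see the inequality at work in examples such as $S_3$ and $S_4$.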
When $M$ is the cross product of $N$ by a finite group $G$, the maximal elements correspond to maximal subgroups of $G$, and the order of $G$ is the dimension of a second higher relative commutant. Hence a natural generalization of Wall's conjecture is the following:

Conjecture 1.1. Let $N \subset M$ be an irreducible subfactor with a finite index. Then the number of maximal intermediate subfactors is less than the dimension of $N' \cap M_1$ (the dimension of the second higher relative commutant of $N \subset M$).

We note that since maximal intermediate subfactors in $N \subset M$ correspond to minimal intermediate subfactors in $M \subset M_1$, and the dimension of the second higher relative commutant remains the same, the conjecture is equivalent to a similar conjecture as above with maximal replaced by minimal. If we take $N$ and $M$ to be cross products of a factor $P$ by $H$ and $G$ with $H$ a subgroup of $G$, then the above conjecture gives a generalization of Wall's conjecture which we call a relative version of Wall's conjecture. The relative version of Wall's conjecture states that the number of maximal subgroups of $G$ strictly containing a subgroup $H$ is less than $|G|/|H|$. In the Appendix we give a "subfactor friendly" proof of the relative version of Wall's conjecture when $G$ is solvable. We also discuss a question which is naturally motivated by a conjecture of Aschbacher-Guralnick. This question also partially motivates our work in this paper. A negative answer to this question is presented in §4.5.

When subfactors do not come from groups, with a few exceptions such as [14] and [44], very little is known about their maximal intermediate subfactors. To test Conjecture 1.1, it is therefore desirable to determine lattices of intermediate subfactors for more examples of subfactors not coming from groups. As shown in [44], a rich source of such subfactors comes from conformal field theories, and the techniques developed in [44] allow one to determine intermediate subfactors in many cases. In [14], as part
of the effort to classify subfactors with no extra structure, the intermediate subfactor lattice of a Goodman-Harpe-Jones (GHJ) subfactor (cf. [13]) is determined. Since the dual GHJ subfactors are closely related to conformal field theories based on the loop group $LSU(2)$, we can use the method of [44] to determine lattices of intermediate subfactors for GHJ subfactors. (The idea is already presented in [44].) In this paper we carry out this idea. Compared to [44], the main difference is that we need to determine the structure of a larger ring, but such rings have been determined in [5,8,43]. Combining these results we determine intermediate subfactor lattices for all GHJ subfactors. One interesting consequence of our work is that the intermediate subfactors of (dual) GHJ subfactors are again (dual) GHJ subfactors. Also as a result we do not find any counterexamples to our Conjecture 1.1, and we give a negative answer to Question A.12. We also find several surprising intermediate subfactors which are not visible at first sight (cf. Figs. 18, 22, 41). Since P. Grossman has proved that all non-commuting quadrilaterals of subfactors with bottom two subfactors of type A come from GHJ subfactors of type D (cf. [15]), Figs. 23, 20 also determine all intermediate subfactor lattices of non-commuting quadrilaterals of subfactors with bottom two subfactors of type A.

Besides what is already described above, this paper is organized as follows: §2 is a preliminary section on sectors, representations of intermediate subfactors by a pair of sectors, sectors from conformal nets, inductions, Jones-Wassermann subfactors and a description of GHJ subfactors. The simple idea of representations of intermediate subfactors by a pair of sectors will prove to be crucial in later classifications. In §3 we first explain the basic idea in [44] to determine intermediate subfactors by fusion, and we carry out this idea for GHJ subfactors of type A, D and E respectively. In §4 we apply the results of §3 to determine the lattice relations of intermediate subfactors, and these lattices are listed.

2. Preliminaries

For the convenience of the reader we collect here some basic notions that appear in this paper. This is only a guideline and the reader should look at the references such as the preliminary sections of [25] for a more complete treatment.
2.1. Sectors. Let $M$ be a properly infinite factor and $\mathrm{End}(M)$ the semigroup of unit preserving endomorphisms of $M$. In this paper $M$ will always be the unique hyperfinite type III$_1$ factor. Let $\mathrm{Sect}(M)$ denote the quotient of $\mathrm{End}(M)$ modulo unitary equivalence in $M$. We denote by $[\rho]$ the image of $\rho \in \mathrm{End}(M)$ in $\mathrm{Sect}(M)$. It follows from [28 and 29] that $\mathrm{Sect}(M)$, with $M$ a properly infinite von Neumann algebra, is endowed with a natural involution $\theta \to \bar\theta$; moreover, $\mathrm{Sect}(M)$ is a semiring.

Let $\rho \in \mathrm{End}(M)$, and let $\varepsilon$ be a normal faithful conditional expectation $\varepsilon : M \to \rho(M)$. We define a number $d_\varepsilon$ (possibly $\infty$) by:

$$d_\varepsilon^{-2} := \mathrm{Max}\{\lambda \in [0, +\infty) \mid \varepsilon(m_+) \geq \lambda m_+,\ \forall m_+ \in M_+\}$$

(cf. [PP]). We define

$$d = \mathrm{Min}_\varepsilon\{d_\varepsilon \mid d_\varepsilon < \infty\}.$$

$d$ is called the statistical dimension of $\rho$ and $d^2$ is called the Jones index of $\rho$. It is clear from the definition that the statistical dimension of $\rho$ depends only on the unitary equivalence class of $\rho$. The properties of the statistical dimension can be found in [28,29 and 30]. Denote by $\mathrm{Sect}_0(M)$ those elements of $\mathrm{Sect}(M)$ with finite statistical dimensions. For $\lambda, \mu \in \mathrm{Sect}_0(M)$, let $\mathrm{Hom}(\lambda, \mu)$ denote the space of intertwiners from $\lambda$ to $\mu$, i.e. $a \in \mathrm{Hom}(\lambda, \mu)$ iff $a\lambda(x) = \mu(x)a$ for any $x \in M$. $\mathrm{Hom}(\lambda, \mu)$ is a finite dimensional vector space and we use $\langle\lambda, \mu\rangle$ to denote the dimension of this space. $\langle\lambda, \mu\rangle$ depends only on $[\lambda]$ and $[\mu]$. Moreover we have $\langle\nu\lambda, \mu\rangle = \langle\lambda, \bar\nu\mu\rangle$, $\langle\nu\lambda, \mu\rangle = \langle\nu, \mu\bar\lambda\rangle$, which follows from the Frobenius duality (see [29]). We will also use the following notation: if $\mu$ is a subsector of $\lambda$, we will write it as $\mu \prec \lambda$ or $\lambda \succ \mu$. A sector is said to be irreducible if it has only one subsector. For any $\rho \in \mathrm{End}(M)$ with finite index, there is a unique standard minimal inverse $\phi_\rho : M \to M$ which satisfies

$$\phi_\rho(\rho(m)\, m'\, \rho(m'')) = m\, \phi_\rho(m')\, m'', \qquad m, m', m'' \in M.$$

$\phi_\rho$ is completely positive. If $t \in \mathrm{Hom}(\rho_1, \rho_2)$ then we have

$$d_{\rho_1} \phi_{\rho_1}(mt) = d_{\rho_2} \phi_{\rho_2}(tm), \qquad m \in M. \tag{1}$$
2.2. Representation of intermediate subfactors by a pair of sectors. Let M be an AFD type III_1 factor and ρ ∈ End(M). Let K be a factor such that ρ(M) ⊂ K ⊂ M. Since K is also AFD, one can choose ρ1 ∈ End(M) with ρ1(M) = K. Then we have ρ = ρ1ρ2 with ρ2 = ρ1^{-1}ρ ∈ End(M). Conversely, if ρ = ρ1ρ2 with ρ1, ρ2 ∈ End(M), then ρ(M) ⊂ ρ1(M) ⊂ M. The following lemma follows directly from definitions:
Lemma 2.1. Suppose that ρ1ρ2 = σ1σ2, ρi, σi ∈ End(M), i = 1, 2, and ρ1(M) ⊂ σ1(M). Then, setting σ = σ1^{-1}ρ1 ∈ End(M), we have ρ1 = σ1σ, σ2 = σρ2. Conversely, if there is σ ∈ End(M) such that ρ1 = σ1σ, σ2 = σρ2, then ρ1ρ2 = σ1σ2 and ρ1(M) ⊂ σ1(M). In addition, σ is an automorphism iff ρ1(M) = σ1(M).
By Lemma 2.1 we can represent the intermediate subfactors of ρ by pairs ρ1, ρ2 with ρ1ρ2 = ρ, and two pairs ρ1, ρ2 and σ1, σ2 represent the same intermediate subfactor iff there is an automorphism σ of M such that ρ1 = σ1σ, σρ2 = σ2. The next lemma shows that we can replace the pair ρ1, ρ2 by [ρ1], [ρ2] when ρ is irreducible:
Lemma 2.2. Suppose that ρ = ρ1ρ2 = σ1σ2, ρi, σi ∈ End(M), [ρi] = [σi], i = 1, 2. Then ρ1(M) = σ1(M).
Proof. By assumption we have unitaries Ui ∈ M such that ρi = Ad_{Ui} σi, i = 1, 2. Since ρ = ρ1ρ2 = σ1σ2, we have ρ = Ad_{U1σ1(U2)} ρ. Since ρ is irreducible, U1σ1(U2) must be a scalar multiple of the identity, and it follows that ρ1(M) = Ad_{U1} σ1(M) = Ad_{σ1(U2^*)} σ1(M) = σ1(Ad_{U2^*}(M)) = σ1(M).
In view of Lemma 2.1 and Lemma 2.2, we introduce the following notation:
Definition 2.3. We say that two pairs of sectors [ρ1], [ρ2] and [σ1], [σ2] are equivalent if there is an automorphism σ of M such that [ρ1] = [σ1σ], [σρ2] = [σ2]. We denote the equivalence class of such a pair [ρ1], [ρ2] by [[ρ1], [ρ2]]. When no confusion arises we will write [[ρ1], [ρ2]] simply as [ρ1, ρ2].
On Intermediate Subfactors of Goodman-de la Harpe-Jones Subfactors
The following follows from our definition, Lemma 2.1 and Lemma 2.2:
Corollary 2.4. Let ρ ∈ End(M) be irreducible. Then the set of intermediate subfactors between M and ρ(M) can be represented naturally by [[ρ1], [ρ2]] such that ρ1ρ2 = ρ, and the intermediate subfactor is ρ1(M).
In the following we will denote the intermediate subfactor ρ1(M) simply by [[ρ1], [ρ2]]. Then [[ρ1], [ρ2]] ⊂ [[σ1], [σ2]] as intermediate subfactors between M and ρ(M) iff there is σ ∈ End(M) such that [ρ1] = [σ1σ], [σ2] = [σρ2].
2.3. Symmetries of a subfactor.
Definition 2.5. Let ρ ∈ End(M). Define Aut(ρ) := {σ ∈ Aut(M) | σρ = ρ}.
Lemma 2.6. If ρ is irreducible and has finite index, then Aut(ρ) is a finite group. There is a one to one correspondence between σ ∈ Aut(ρ) and sectors [σ] of index one which appear in [ρρ̄], and σ is constructed from [σ] in the following way: if [σ] is a sector of index one which appears in [ρρ̄], then we can find a unique representative σ ∈ Aut(M) in the sector [σ] such that σρ = ρ.
Proof. Let σ ∈ Aut(ρ). Since σρ = ρ, by Frobenius duality we have σ ≺ ρρ̄, and [σ] is a sector of index one which appears in [ρρ̄] with (necessarily) multiplicity one since ρ is irreducible. On the other hand, if [σ] is a sector of index one which appears in [ρρ̄], then we can find a representative σ in the sector [σ] such that σρ = ρ. Since ρ is irreducible, this representative σ in the sector [σ] must be unique. Since ρ has finite index, there are only finitely many subsectors of [ρρ̄], and so Aut(ρ) is finite.
Definition 2.7. Let ρ̄ ∈ End(M). Denote by U(M) (resp. U(ρ̄(M))) the group of unitaries in M (resp. ρ̄(M)). Let U1(M) be the subgroup of U(M) which consists of unitaries in M whose conjugate action on M preserves ρ̄(M) as a set. Note that U(ρ̄(M)) is a normal subgroup of U1(M). Define N(ρ̄) := U1(M)/U(ρ̄(M)).
Lemma 2.8. If ρ is irreducible and has finite index, then N(ρ̄) is a finite group isomorphic to Aut(ρ).
In fact there is a one to one correspondence between U_σ ∈ N(ρ̄) and sectors [σ] of index one which appear in [ρρ̄], and U_σ is constructed from [σ] in the following way: if [σ] is a sector of index one which appears in [ρρ̄], then we can find a unique representative U_σ ∈ N(ρ̄) such that Ad_{U_σ} ρ̄σ = ρ̄.
Proof. Let U be a unitary representative of an element in N(ρ̄). Since Ad_U ρ̄ = ρ̄σ for some automorphism σ of M, by Frobenius duality we have σ ≺ ρρ̄, and [σ] is a sector of index one which appears in [ρρ̄] with (necessarily) multiplicity one since ρ is irreducible. On the other hand, if [σ] is a sector of index one which appears in [ρρ̄], then we can find a unitary U_σ such that Ad_{U_σ} ρ̄ = ρ̄σ. Since ρ is irreducible, the element with representative U_σ in the group N(ρ̄) must be unique. Since ρ has finite index, there are only finitely many subsectors of [ρρ̄], and so N(ρ̄) is finite and it is isomorphic to Aut(ρ) by Lemma 2.6.
2.4. Sectors from conformal nets and their representations. We refer the reader to §3 of [25] for definitions of conformal nets and their representations. Suppose a conformal net A and a representation λ are given. Fix an open interval I of the circle and let M := A(I) be a fixed type III_1 factor. Then λ gives rise to an endomorphism of M, still denoted by λ. We will recall some of the results of [37] and introduce notations. Suppose {[λ]} is a finite set of all equivalence classes of irreducible, covariant, finite-index representations of an irreducible local conformal net A. We will use Δ_A or simply Δ to denote all finite index representations of the net A and will use the same notation Δ_A to denote the corresponding sectors of M. We will denote the conjugate of [λ] by [λ̄] and the identity sector (corresponding to the vacuum representation) by [0] if no confusion arises, and let N_{λμ}^ν = ⟨[λ][μ], [ν]⟩. Here ⟨μ, ν⟩ denotes the dimension of the space of intertwiners from μ to ν (denoted by Hom(μ, ν)). We will denote by {T_e} a basis of isometries in Hom(ν, λμ). The univalence of λ and the statistical dimension of λ (cf. §2 of [16]) will be denoted by ω_λ and d(λ) (or d_λ) respectively. The unitary braiding operator ε(μ, λ) (cf. [16]) verifies the following
Proposition 2.9. (1) Yang-Baxter-Equation (YBE):
ε(μ, γ)μ(ε(λ, γ))ε(λ, μ) = γ(ε(λ, μ))ε(λ, γ)λ(ε(μ, γ)).
(2) Braiding-Fusion-Equation (BFE): For any w ∈ Hom(μγ, δ),
ε(λ, δ)λ(w) = wμ(ε(λ, γ))ε(λ, μ),
ε(δ, λ)w = λ(w)ε(μ, λ)μ(ε(γ, λ)),
ε(δ, λ)^* λ(w) = wμ(ε(γ, λ)^*)ε(μ, λ)^*,
ε(λ, δ)^* λ(w) = wμ(ε(γ, λ)^*)ε(λ, μ)^*.
Lemma 2.10. If λ, μ are irreducible, and t_ν ∈ Hom(ν, λμ) is an isometry, then
t_ν^* ε(μ, λ)ε(λ, μ) t_ν = ω_ν/(ω_λ ω_μ).
By Prop. 2.9, it follows that if t_i ∈ Hom(μ_i, λ) is an isometry, then
ε(μ, μ_i)ε(μ_i, μ) = t_i^* ε(μ, λ)ε(λ, μ) t_i.
We shall always identify the center of M with C. Then we have the following
Lemma 2.11. If ε(μ, λ)ε(λ, μ) ∈ C, then ε(μ, μ_i)ε(μ_i, μ) ∈ C, ∀μ_i ≺ λ.
Let φ_λ be the unique minimal left inverse of λ, and define
Y_{λμ} := d(λ)d(μ)φ_μ(ε(μ, λ)^* ε(λ, μ)^*),
where ε(μ, λ) is the unitary braiding operator (cf. [16]). We list two properties of Y_{λμ} (cf. (5.13), (5.14) of [37]):
Lemma 2.12.
Y_{λμ} = Y_{μλ} = Y^*_{λμ̄} = Y_{λ̄μ̄},
Y_{λμ} = Σ_ν N_{λμ}^ν (ω_λ ω_μ / ω_ν) d(ν). (2)
We note that one may take the second equation in the above lemma as the definition of Y_{λμ}. Define a := Σ_i d_{ρ_i}^2 ω_{ρ_i}^{-1}. If the matrix (Y_{μν}) is invertible, by the Proposition on p. 351 of [37] a satisfies |a|² = Σ_λ d(λ)².
Definition 2.13. Let a = |a| exp(−2πi c0/8), where c0 ∈ R and c0 is well defined mod 8Z. Define matrices
S := |a|^{-1} Y, T := C Diag(ω_λ), (3)
where C := exp(−2πi c0/24).
Then these matrices satisfy (cf. [37]):
Lemma 2.14.
S S† = T T† = id,
S T S = T^{-1} S T^{-1},
S² = Ĉ,
T Ĉ = Ĉ T,
where Ĉ_{λμ} = δ_{λμ̄} is the conjugation matrix. Moreover
N_{λμ}^ν = Σ_δ S_{λδ} S_{μδ} S^*_{νδ} / S_{1δ} (4)
is known as the Verlinde formula. The commutative algebra generated by the λ's with structure constants N_{λμ}^ν is called a fusion algebra of A. If Y is invertible, it follows from Lemma 2.14, (4) that any nontrivial irreducible representation of the fusion algebra is of the form λ → S_{λμ}/S_{1μ} for some μ.
2.5. Induced endomorphisms. Suppose that ρ ∈ End(M) has the property that γ = ρρ̄ ∈ Δ_A. By §2.7 of [32], we can find two isometries v1 ∈ Hom(γ, γ²), w1 ∈ Hom(1, γ)¹ such that ρ̄(M) and v1 generate M and v1^* w1 = v1^* γ(w1) = d_ρ^{-1}, v1v1 = γ(v1)v1. By Thm. 4.9 of [32], we shall say that ρ is local if
v1^* w1 = v1^* γ(w1) = d_ρ^{-1}, (5)
v1 v1 = γ(v1) v1, (6)
ρ̄(ε(γ, γ)) v1 = v1. (7)
¹ We use v1, w1 instead of v, w here since v, w are used to denote sectors in Sect. 2.6.
Note that if ρ is local, then
ω_μ = 1, ∀μ ≺ ρρ̄. (8)
For each (not necessarily irreducible) λ ∈ Δ_A, let ε(λ, γ) : λγ → γλ (resp. ε̃(λ, γ)) be the positive (resp. negative) braiding operator as defined in Sect. 1.4 of [43]. Denote by λ_ε, λ_ε̃ ∈ End(M) the endomorphisms defined by
λ_ε(x) := ad(ε(λ, γ))λ(x) = ε(λ, γ)λ(x)ε(λ, γ)^*,
λ_ε̃(x) := ad(ε̃(λ, γ))λ(x) = ε̃(λ, γ)λ(x)ε̃(λ, γ)^*, ∀x ∈ M.
By (1) of Theorem 3.1 of [43], λ_ε ρ(M) ⊂ ρ(M), λ_ε̃ ρ(M) ⊂ ρ(M), hence the following definition makes sense².
Definition 2.15. If λ ∈ Δ_A define two elements of End(M) by
a_λ^ρ(m) := ρ^{-1}(λ_ε ρ(m)), ã_λ^ρ(m) := ρ^{-1}(λ_ε̃ ρ(m)), ∀m ∈ M.
a_λ^ρ (resp. ã_λ^ρ) will be referred to as positive (resp. negative) induction of λ with respect to ρ.
Remark 2.16. For simplicity we will use a_λ, ã_λ to denote a_λ^ρ, ã_λ^ρ when it is clear that inductions are with respect to the same ρ.
The endomorphisms a_λ are called braided endomorphisms in [43] due to their braiding properties (cf. (2) of Corollary 3.4 in [43]), and enjoy an interesting set of properties (cf. Sect. 3 of [43]). Though [43] focuses on the local case, which was clearly the most interesting case in terms of producing subfactors, it was observed in [3–6] that many of the arguments in [43] can be generalized. These properties are also studied in a slightly different context in [3–5]. In these papers, the induction is between M and a subfactor N of M, while the induction above is on the same algebra. A dictionary between our notations here and these papers has been set up in [45], which simply uses an isomorphism between N and M. Here one has a choice to use this isomorphism to translate all endomorphisms of N to endomorphisms of M, or equivalently all endomorphisms of M to endomorphisms of N. In [45] the latter choice is made (hence M in [45] will be our N below). Here we make the first choice, which makes the dictionary slightly simpler. Our dictionary here is equivalent to that of [45]. Set N = ρ̄(M). In the following the notations from [3] will be given a subscript BE. The formulas are:
ρ↾N = i_BE, ρ̄ρ↾N = ī_BE i_BE, (9)
γ = ρ̄^{-1} θ_BE ρ̄, ρ̄ρ = γ_BE, (10)
λ = ρ̄^{-1} λ_BE ρ̄, ε(λ, μ) = ρ̄(ε^+(λ_BE, μ_BE)), (11)
ε̃(λ, μ) = ρ̄(ε^−(λ_BE, μ_BE)). (12)
The dictionary between a_λ ∈ End(M) in Definition 2.15 and α_λ^− as in Definition 3.3, Definition 3.5 of [3] is given by:
a_λ = α^+_{λ_BE}, ã_λ = α^−_{λ_BE}. (13)
² We have changed the notations a_λ, ã_λ of [43] to ã_λ, a_λ of this paper to make some of the formulas such as Eq. (13) simpler.
The above formulas will be referred to as our dictionary between the notations of [43] and that of [3]. The proof is the same as that of [45]. Using this dictionary one can easily translate results of [43] into the settings of [3–8] and vice versa. First we summarize a few properties from [43] which will be used in this paper (cf. Th. 3.1, Co. 3.2 and Th. 3.3 of [43]):
Proposition 2.17. (1) The maps [λ] → [a_λ], [λ] → [ã_λ] are ring homomorphisms; (2) a_λ ρ̄ = ã_λ ρ̄ = ρ̄λ; (3) When ρρ̄ is local, ⟨a_λ, a_μ⟩ = ⟨ã_λ, ã_μ⟩ = ⟨a_λρ̄, a_μρ̄⟩ = ⟨ã_λρ̄, ã_μρ̄⟩; (4) (3) remains valid if a_λ, a_μ (resp. ã_λ, ã_μ) are replaced by their subsectors.
Definition 2.18. H_ρ is a finite dimensional vector space over C with orthonormal basis consisting of irreducible sectors of [λρ], ∀λ ∈ Δ_A. [λ] acts linearly on H_ρ by [λ][a] = Σ_b ⟨λa, b⟩[b], where [b] are elements in the basis of H_ρ.³
By abuse of notation, we use [λ] to denote the corresponding matrix relative to the basis of H_ρ. By definition these matrices are normal and commuting, so they can be simultaneously diagonalized. Recall the irreducible representations of the fusion algebra of A are given by λ → S_{λμ}/S_{1μ}.
Definition 2.19. Assume
⟨λa, b⟩ = Σ_{(μ,i)∈Exp} (S_{λμ}/S_{0μ}) · φ_a^{(μ,i)} φ_b^{(μ,i)*},
where φ_a^{(μ,i)} are normalized orthogonal eigenvectors of [λ] with eigenvalue S_{λμ}/S_{0μ}, Exp is a set of pairs (μ, i), and i is an index indicating the multiplicity of μ. Exp is called the set of exponents of H_ρ.
Recall if a representation is denoted by 0, it will always be the vacuum representation. The following lemma is elementary:
Lemma 2.20. (1)
Σ_b d_b² = 1/S_{00}²,
where the sum is over the basis of H_ρ. The vacuum appears once in Exp and φ_a^{(1)} = S_{00} d_a; (2)
Σ_i φ_a^{(λ,i)} φ_b^{(λ,i)*} / S_{0λ}² = Σ_ν (S_{νλ}/S_{0λ}) ⟨ν̄a, b⟩,
where if λ does not appear in Exp then the right-hand side is zero.
³ By abuse of notation, in this paper we use Σ_b to denote the sum over the basis [b] in H_ρ.
Proof. Ad (1): By definition we have
[aρ̄] = Σ_λ ⟨aρ̄, λ⟩[λ] = Σ_λ ⟨a, λρ⟩[λ],
where in the second equality we have used Frobenius reciprocity. Hence
d_a d_ρ̄ = Σ_λ ⟨aρ̄, λ⟩ d_λ,
and we obtain
Σ_λ d_λ² = Σ_{λ,a} ⟨aρ̄, λ⟩ d_λ d_a / d_ρ = Σ_a d_a².
(2) follows from the definition and the orthogonality of the S matrix.
In [5 and 43], commutativity among subsectors of a_λ, ã_μ, λ, μ ∈ Δ was studied. We record these results in the following for later use:
Lemma 2.21. (1) Let [b] (resp. [b′]) be any subsector of a_λ (resp. ã_λ). Then [a_μ b] = [b a_μ], [ã_μ b′] = [b′ ã_μ] ∀μ, [bb′] = [b′b]; (2) Let [b] be a subsector of a_μ ã_λ, then [a_ν b] = [b a_ν], [ã_ν b] = [b ã_ν], ∀ν; (3) If σ ≺ λρ, then [μσ] = [σ a_μ] = [σ ã_μ].
2.6. Jones-Wassermann subfactors from representation of Loop groups. Let G = SU(n). We denote by LG the group of smooth maps f : S¹ → G under pointwise multiplication. The diffeomorphism group of the circle Diff S¹ is naturally a subgroup of Aut(LG) with the action given by reparametrization. In particular the group of rotations Rot S¹ ≃ U(1) acts on LG. We will be interested in the projective unitary representations π : LG → U(H) that are both irreducible and have positive energy. This means that π should extend to LG ⋊ Rot S¹ so that H = ⊕_{n≥0} H(n), where the H(n) are the eigenspaces for the action of Rot S¹, i.e., r_θ ξ = exp(inθ)ξ for ξ ∈ H(n), and dim H(n) < ∞ with H(0) ≠ 0. It follows from [36] that for a fixed level k, which is a positive integer, there are only a finite number of such irreducible representations, indexed by the finite set
P_{++}^k = { λ ∈ P | λ = Σ_{i=1,…,n−1} λ_i Λ_i, λ_i ≥ 0, Σ_{i=1,…,n−1} λ_i ≤ k },
where P is the weight lattice of SU(n) and Λ_i are the fundamental weights. We will write λ = (λ_1, …, λ_{n−1}), λ_0 = k − Σ_{1≤i≤n−1} λ_i and refer to λ_0, …, λ_{n−1} as components of λ. We will use Λ_0 or simply 1 to denote the trivial representation of SU(n). For λ, μ, ν ∈ P_{++}^k, define
N_{λμ}^ν = Σ_{δ∈P_{++}^k} S_λ^{(δ)} S_μ^{(δ)} S_ν^{(δ)*} / S_{Λ_0}^{(δ)},
where S_λ^{(δ)} is given by the Kac-Peterson formula:
S_λ^{(δ)} = c Σ_{w∈S_n} ε_w exp(i w(δ) · λ 2π/n),
where ε_w = det(w) and c is a normalization constant fixed by the requirement that S_μ^{(δ)} is an orthonormal system. It is shown in [24] P. 288 that the N_{λμ}^ν are non-negative integers. Moreover, define Gr(C_k) to be the ring whose basis are the elements of P_{++}^k with structure constants N_{λμ}^ν. The natural involution ∗ on P_{++}^k is defined by λ → λ^∗ = the conjugate of λ as a representation of SU(n). We shall also denote S_{Λ_0}^{(Λ)} by S_1^{(Λ)}. Define d_λ = S_1^{(λ)} / S_1^{(Λ_0)}. We shall call (S_ν^{(δ)}) the S-matrix of LSU(n) at level k. The following result is proved in [42] (see Corollary 1 of Chapter V in [42]).
Theorem 2.22. Each λ ∈ P_{++}^{(k)} has finite index with index value d_λ². The fusion ring generated by all λ ∈ P_{++}^{(k)} is isomorphic to Gr(C_k).
Remark 2.23. The subfactors in the above theorem are called Jones-Wassermann subfactors after the authors who first studied them (cf. [21,42]).
We will concentrate on the n = 2 case in this paper. Fix a level k ≥ 1; the representations are labeled by half integers i with 0 ≤ i ≤ k/2. For example, 0 will label the vacuum representation, and 1/2 will label the vector representation. The statistical dimension of 1/2 is 2 cos(π/(k+2)). The fusion rules are given by
[i][j] = Σ_{|i−j| ≤ l ≤ i+j, i+j+l ≤ k, i+j+l ∈ Z} [l].
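As an illustration, the fusion rule above can be enumerated directly. The following is a minimal sketch in Python; the function name `fuse` and the use of exact fractions for the half-integer labels are choices made here for illustration and are not part of the paper.

```python
from fractions import Fraction

def fuse(i, j, k):
    """Return the spins l occurring in [i][j] at level k (hypothetical helper).

    The index l runs over |i-j| <= l <= i+j with i+j+l <= k and i+j+l an integer.
    """
    i, j = Fraction(i), Fraction(j)
    out = []
    l = abs(i - j)          # i + j + |i - j| = 2 max(i, j) is an integer
    while l <= i + j:
        if i + j + l <= k:
            out.append(l)
        l += 1              # integer steps keep i + j + l integral
    return out

# [1/2][1/2] at level 4
print(fuse(Fraction(1, 2), Fraction(1, 2), 4))
```

For instance, `fuse(Fraction(1, 2), Fraction(1, 2), 4)` returns the spins 0 and 1, i.e. [1/2][1/2] = [0] + [1], which is the relation behind a_{1/2}² = 1 + a_1 in the proof of Lemma 3.1.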
The S matrix for n = 2 is given by
S_{ij} = √(2/(k+2)) sin(π(2i+1)(2j+1)/(k+2)).
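The n = 2 data can be checked numerically. The sketch below, under the assumption that labels are stored as a = 2i (so a = 0, …, k), builds the S matrix, reads off the statistical dimensions d_i = S_{i0}/S_{00}, and evaluates the Verlinde formula (4); all function names are hypothetical and chosen only for this illustration.

```python
import math

def s_matrix(k):
    """S[a][b] for spins i = a/2, j = b/2, so (2i+1)(2j+1) = (a+1)(b+1)."""
    norm = math.sqrt(2.0 / (k + 2))
    return [[norm * math.sin(math.pi * (a + 1) * (b + 1) / (k + 2))
             for b in range(k + 1)] for a in range(k + 1)]

def quantum_dimensions(k):
    """Statistical dimensions d_i = S_{i0} / S_{00}."""
    S = s_matrix(k)
    return [S[a][0] / S[0][0] for a in range(k + 1)]

def verlinde(k):
    """N[a][b][c] from the Verlinde formula; S is real, so S* = S here."""
    S = s_matrix(k)
    n = k + 1
    return [[[round(sum(S[a][d] * S[b][d] * S[c][d] / S[0][d] for d in range(n)))
              for c in range(n)] for b in range(n)] for a in range(n)]

def selection_rule(a, b, c, k):
    """1 if spin c/2 occurs in [a/2][b/2] by the stated fusion rule, else 0."""
    return int(abs(a - b) <= c <= a + b and a + b + c <= 2 * k
               and (a + b + c) % 2 == 0)

k = 4
N = verlinde(k)
agree = all(N[a][b][c] == selection_rule(a, b, c, k)
            for a in range(k + 1) for b in range(k + 1) for c in range(k + 1))
print(agree, quantum_dimensions(k)[1], 2 * math.cos(math.pi / (k + 2)))
```

The same script can be used to confirm, for a given k, that Σ_i d_i² = 1/S_{00}², which is Lemma 2.20 (1) in the case ρ = 0.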
2.7. Dual Goodman-Harpe-Jones subfactors from Conformal field theory. The dual GHJ subfactors are obtained from irreducible sectors of [i][ρ] with ρρ̄ ∈ Δ_A (cf. [8,43], Appendix of [5]). The induction in the following will be with respect to ρ.
Definition 2.24. The fusion graph of [1/2] on H_ρ is the graph whose vertices are irreducible sectors in H_ρ, and two vertices a, b are connected by an edge if ⟨1/2 a, b⟩ = 1. We will say that H_ρ is type A, D, E if the fusion graph is A, D, E respectively. By an end point of the fusion graph we mean a vertex on the graph which is connected to only one other point on the graph.
Since the norm of the fusion graph of [1/2] is less than 2, the graph must be A−D−E (cf. [13]). When ρ = 0, the GHJ subfactors are Jones subfactors. The fusion graph of 1/2 is given by:
Fig. 1. Fusion graph of 1/2, type A_{k+1}
When k is even and [ρρ̄] = [0] + [k/2], the fusion graph of [1/2] on H_ρ is a type D graph. In this case [i][ρ] is irreducible iff i ≠ k/4. If 4|k then [a_{k/4}] = [b] + [b′], and if [ρ̄ρ] = [0] + [g], [g²] = [0], then [gbg] = [b′]. If k is not divisible by 4, [k/4][ρ] = [b] + [b′] where b, b′ are irreducible with the same index, and [ρ̄ρ] = [0] + [a_{k/2}] and [k/2][b] = [b′] (cf. §5.2 of [8]).
Fig. 2. Fusion graph of 1/2, type D_{k/2+2}, 4|k
Fig. 3. Fusion graph of 1/2, type D_{k/2+2}, 4∤k
Fig. 4. Fusion graph of 1/2, type E_6
Fig. 5. Fusion graph of 1/2, type E_7
Notation 2.25. For convenience we shall use ρ0 to denote a fixed endomorphism with [ρ0ρ̄0] = [0] + [k/2]. When the fusion graph of [1/2] is E_6, the graph is given by Fig. 4 (cf. pages 392-393 of [43]). Note that k = 10, [b0] = [a_{3/2}] − [a_{9/2}], [ρρ̄] = [0] + [3]. When the fusion graph of [1/2] is E_7, k = 16, [ρρ̄] = [0] + [4] + [8], and the graph is given by Fig. 5 (cf. Fig. 42 of [8]). Note that [b2] = [5/2ρ] − [3/2ρ], [b1] = [1/2b2], [b1′] = [5/2ρ] − [b1].
Fig. 6. Normalized eigenvectors
Fig. 7. Fusion graph of 1/2, type E_8
Fig. 8. Normalized eigenvectors
Fig. 9. Fusion graph of ã_{1/2}
The square root of indices of the corresponding subfactors divided by d_ρ is given by Fig. 6. When the fusion graph of [1/2] is E_8, k = 28, [ρρ̄] = [0] + [5] + [9] + [14] (cf. [43]), and the graph is given by Fig. 7. The square root of indices of the corresponding subfactors divided by d_ρ is given by Fig. 8. The fusion graph of ã_{1/2} is given by Fig. 9. Note that we have [a_{5/2}] = [b3] + [b3′], [b3] = [a_{1/2}b4], [ã_{5/2}] = [b̃3] + [b̃3′], [b̃3] = [ã_{1/2}b4].
3. Classification of Intermediate Subfactors of Dual GHJ Subfactors
Let σ be a dual GHJ subfactor. We assume that k ≥ 5 in this section. Recall that a subfactor is maximal if there is no nontrivial intermediate subfactor. To list the nontrivial intermediate subfactors of σ, according to Cor. 2.4, we need to determine all pairs [σ1, σ2] such that [σ1σ2] = [σ] with 1 < d_{σ1} < d_σ. Since σ1σ̄1 ≺ σσ̄ ∈ Δ, by Definition 2.15 we can apply induction with respect to σ1. Our basic strategy, as explained at the end of §2 of [44], is to consider the fusion graph of [1/2] on H_{σ1}. By the A−D−E classification, this graph must be one of the A−D−E graphs. The following is useful:
Lemma 3.1. Let σ be a dual GHJ subfactor such that σ = σ1σ2. Then either [1/2σ1] is irreducible, or both [a_{1/2}σ2] and [ã_{1/2}σ2] are irreducible, where the inductions are with respect to σ1. In terms of fusion graphs of [1/2] on σ1, [a_{1/2}] on σ2 and [ã_{1/2}] on σ2, this means that either σ1 is an end point or σ2 is an end point on the fusion graphs of [a_{1/2}] on σ2 and [ã_{1/2}] on σ2.
Proof. Since σ is irreducible, we have
1 = ⟨σ1σ2, σ1σ2⟩ = ⟨σ̄1σ1, σ2σ̄2⟩.
On the other hand
⟨1/2σ1, 1/2σ1⟩ = ⟨σ1a_{1/2}, σ1a_{1/2}⟩ = ⟨σ̄1σ1, a_{1/2}²⟩ = 1 + ⟨σ̄1σ1, a_1⟩,
⟨1/2σ1, 1/2σ1⟩ = ⟨σ1ã_{1/2}, σ1ã_{1/2}⟩ = ⟨σ̄1σ1, ã_{1/2}²⟩ = 1 + ⟨σ̄1σ1, ã_1⟩.
Similarly
⟨a_{1/2}σ2, a_{1/2}σ2⟩ = ⟨σ2σ̄2, a_{1/2}²⟩ = 1 + ⟨σ2σ̄2, a_1⟩,
⟨ã_{1/2}σ2, ã_{1/2}σ2⟩ = ⟨σ2σ̄2, ã_{1/2}²⟩ = 1 + ⟨σ2σ̄2, ã_1⟩.
Since a_1, ã_1 are irreducible by Lemma 2.33 of [44], the lemma follows.
By Lemma 3.1, either σ1 or σ2 is an end point on a graph of type A−D−E, which has two or three end points. This, together with the known values of indices, greatly reduces the possible choices of σ1, σ2, and in the next few sections we will determine all such choices.
3.1. Type A. By Example 5.24 of [44], in this case i is maximal if i ≠ k/4. When k = 2, 3 all GHJ subfactors are maximal, and when k = 4 a direct computation gives the lattice of intermediate subfactors represented by the sector [1] as in Fig. 13. Assume now k ≥ 5.
Lemma 3.2. Assume that [k/4] = [σ1][σ2]. If 0 ≤ j ≤ k/2 and j is an integer, then j ∈ Exp of H_{σ1} as defined in Definition 2.19.
Proof. Applying (2) of Lemma 2.20 to a = σ̄2, b = σ1, we have
Σ_i φ_a^{(j,i)} φ_b^{(j,i)*} / S_{0j}² = Σ_ν ⟨ν̄a, b⟩ S_{νj}/S_{0j} = S_{k/4,j}/S_{0j}.
Since S_{k/4,j}/S_{0j} is, up to a nonzero constant, sin((2j+1)π/2) ≠ 0 when 0 ≤ j ≤ k/2 and j is an integer, it follows that j ∈ Exp of H_{σ1} as defined in Definition 2.19.
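The nonvanishing used in this proof is easy to confirm numerically from the n = 2 S matrix of Sect. 2.6. The sketch below assumes k is even and stores j as the integer `two_j` = 2j; the function name `ratio` is hypothetical.

```python
import math

def ratio(k, two_j):
    """S_{k/4, j} / S_{0, j} for spin j = two_j / 2 at even level k.

    The normalization sqrt(2/(k+2)) cancels; 2*(k/4) + 1 = k/2 + 1.
    """
    num = math.sin(math.pi * (k // 2 + 1) * (two_j + 1) / (k + 2))
    den = math.sin(math.pi * (two_j + 1) / (k + 2))
    return num / den

k = 8
for two_j in range(k + 1):
    kind = "integer" if two_j % 2 == 0 else "half-integer"
    print(two_j / 2, kind, round(ratio(k, two_j), 6))
```

For integer j the numerator is sin((2j+1)π/2) = ±1, so the ratio has absolute value at least 1; for half-integer j the numerator vanishes.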
When k is even and i = k/4, there are two cases to consider. When 4|k, by Lemma 3.2 if j is an integer then j ∈ Exp, and it follows that H_{σ1} is either type A or D. When H_{σ1} is type A, then by the fusion rule, up to right multiplication by an automorphism, [σ1] = [σ] or d_{σ1} = 1. When H_{σ1} is type D, by left multiplication by an automorphism on σ1 if necessary we may assume that ρ ∈ H_{σ1}, with [ρρ̄] = [0] + [k/2]. By Lemma 3.1 either σ1 or σ2 has to be an end point. It follows that [σ1, σ2] = [ρb, ρ̄] or [σ1, σ2] = [ρ, bρ̄]. When k is not divisible by 4, a similar argument shows that [σ1, σ2] = [b, ρ̄] or [σ1, σ2] = [ρ, b̄]. We summarize the result in the following
Theorem 3.3. Suppose that k ≥ 4. If [i] ≠ [k/4], the corresponding subfactor is maximal. If k = 4, the intermediate subfactor of [1] is given by [ρb, ρ̄], where [ρρ̄] = [0] + [1], ρb ≺ 2ρ. If 4|k, k ≥ 8, the intermediate subfactors of [k/4] are given by [ρb, ρ̄] and [ρ, bρ̄], where [ρρ̄] = [0] + [k/2], ρb ≺ k/2ρ. If k is even but not divisible by 4, the intermediate subfactors of [k/4] are given by [b, ρ̄] and [ρ, b̄], where [ρρ̄] = [0] + [k/2], b ≺ k/2ρ.
3.2. Type E_6. In this section we assume that σ is a dual GHJ subfactor appearing in jρ with [ρρ̄] = [0] + [3]. Let σ = σ1σ2. If H_{σ1} is type A, then, multiplying σ1 on the right by an automorphism if necessary, we may assume that σ1 ∈ Δ, hence σ2 ∈ Δρ. By Lemma 3.1 either 1/2σ1 is irreducible or a_{1/2}σ2 is irreducible. Then from Fig. 4 it is clear that the possible pairs [σ1, σ2] are given by [1/2, ρ], [1, ρ], [9/2, ρ], [4, ρ], [1/2, ρb].
If H_{σ1} is type D, we may assume that σ1 ∈ Δρ0, and it follows that σ2 ∈ ρ̄0Δρ. We note that [5ρb0] = [ρb0], [5ρa_1] = [ρa_1]. By Lemma 2.5, ρb0(M) (resp. ρa_1(M)) is contained in a subfactor of index 2 which is the fixed point subalgebra of M under an automorphism determined by [5]. Hence there are sectors x, y such that [ρ0x] = [ρb0], [ρ0y] = [ρa_1]. Since k = 10, we have [ρ̄0ρ0] = [0] + [a_5^{ρ0}]. From
1 = ⟨ρ0x, ρ0x⟩ = 1 + ⟨a_5^{ρ0}x, x⟩
we conclude that [x] ≠ [a_5^{ρ0}x]. Similarly [y] ≠ [a_5^{ρ0}y].
Let us determine all irreducible sectors of ρ̄0Δρ. Note that a_{1/2}^{ρ0} acts on σ2 ∈ ρ̄0Δρ, and the corresponding fusion graph of a_{1/2}^{ρ0} is an A−D−E graph. Since ⟨ρ̄0ρ, ρ̄0ρ⟩ = ⟨ρ0ρ̄0, ρρ̄⟩ = 1, ⟨a_{1/2}^{ρ0}ρ̄0ρ, a_{1/2}^{ρ0}ρ̄0ρ⟩ = ⟨([0] + [1])ρ0ρ̄0, ρρ̄⟩ = 1, and [a_1^{ρ0}ρ̄0ρ] = [ρ̄0ρ0y] = [y] + [a_5^{ρ0}y], it follows that the fusion graph of the action of a_{1/2}^{ρ0} is given by Fig. 10 (the vertices are labeled by all irreducible sectors of ρ̄0Δρ). By Lemma 3.1 and Fig. 10, it follows that the following is a list of possible intermediate subfactors: [ρ0, x], [ρ0, y], [1/2ρ0, x].
Note that from [ρ0x] = [ρb0] we have
[ρx̄] ≺ [ρb̄0ρ̄ρ0] = [ρ]([a_{3/2}] − [a_{9/2}])[ρ̄ρ0] = ([3/2] − [9/2])([0] + [3])[ρ0] = 2[3/2ρ0].
Similarly
[ρx̄a_5^{ρ0}] ≺ [ρb̄0ρ̄ρ0] = [ρ]([a_{3/2}] − [a_{9/2}])[ρ̄ρ0] = ([3/2] − [9/2])([0] + [3])[ρ0] = 2[3/2ρ0].
By comparing indices we have proved the following:
[ρx̄] = [ρx̄a_5^{ρ0}] = [3/2ρ0]. (14)
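The fusion-ring arithmetic in the last step, ([3/2] − [9/2])([0] + [3]) at level k = 10, can be verified mechanically. The sketch below is an illustration only: sectors are stored by the label a = 2i, formal differences are dictionaries with integer coefficients, and the helper names `fuse` and `multiply` are hypothetical.

```python
from collections import Counter

K = 10  # level of the E6 case

def fuse(a, b, k=K):
    """Product of spins a/2 and b/2 by the n = 2 fusion rule, as a Counter."""
    return Counter(c for c in range(abs(a - b), a + b + 1, 2)
                   if a + b + c <= 2 * k)

def multiply(x, y, k=K):
    """Product of two integer combinations of sectors (dict: label -> coefficient)."""
    out = Counter()
    for a, ca in x.items():
        for b, cb in y.items():
            for c, n in fuse(a, b, k).items():
                out[c] += ca * cb * n
    return {c: n for c, n in out.items() if n != 0}

# ([3/2] - [9/2]) ([0] + [3]) with labels doubled: 3/2 -> 3, 9/2 -> 9, 3 -> 6
lhs = multiply({3: 1, 9: -1}, {0: 1, 6: 1})
print(lhs)                 # coefficients of [3/2] and [7/2]
print(dict(fuse(10, 3)))   # [5][3/2]
```

The product comes out as [3/2] + [7/2], and [5][3/2] = [7/2]; together with [5ρ0] = [ρ0] (which holds because [ρ0ρ̄0] = [0] + [5]) this gives [7/2ρ0] = [3/2ρ0], hence the value 2[3/2ρ0] used above.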
Fig. 10. Fusion graph of a_{1/2}^{ρ0}
When H_{σ1} is E_6, then there is ρ2 ∈ H_{σ1} such that [ρ2ρ̄2] = [0] + [3]. By the cohomology vanishing result of Remark 5.4 in [26], multiplying σ1 on the right by an automorphism if necessary, we can assume that [ρ2] = [ρ], and so σ1 ∈ Δρ, and σ2 ∈ ρ̄Δρ. The set of irreducible sectors of ρ̄Δρ and the fusion graphs of the action of a_{1/2}, ã_{1/2} are given by [43] and Fig. 5 of [5]. By using Lemma 3.1 it is straightforward to determine the following list of possible intermediate subfactors: [ρ, a_{1/2}], [ρ, ã_{1/2}], [ρ, a_{9/2}], [ρ, ã_{9/2}], [ρ, a_1], [ρ, ã_1], [ρ, b], [ρb, a_{1/2}], [ρb, ã_{1/2}], [1/2ρ, b].
Theorem 3.4. Assume that σ is a dual GHJ subfactor appearing in jρ with [ρρ̄] = [0] + [3]. Then the following is the list of all intermediate subfactors that can occur: [1/2, ρ], [1, ρ], [9/2, ρ], [4, ρ], [1/2, ρb], [ρ0, x], [ρ0, y], [1/2ρ0, x], [ρ, a_{1/2}], [ρ, ã_{1/2}], [ρ, a_{9/2}], [ρ, ã_{9/2}], [ρ, a_1], [ρ, ã_1], [ρ, b], [ρb, a_{1/2}], [ρb, ã_{1/2}], [1/2ρ, b].
3.3. E_7 case. In this section we assume that σ is a dual GHJ subfactor appearing in jρ with [ρρ̄] = [0] + [4] + [8]. Since [8ρ] = [ρ], by Lemma 2.5 we may assume that ρ = ρ0ρ1, and similarly [b1] = [ρ0b̂1], [b1′] = [ρ0b̂1′], [b2] = [ρ0b̂2]. Let σ = σ1σ2. If H_{σ1} is type A, then, multiplying σ1 on the right by an automorphism if necessary, we may assume that σ1 ∈ Δ, hence σ2 ∈ Δρ. By Lemma 3.1 either 1/2σ1 is irreducible or a_{1/2}σ2 is irreducible. Then from Fig. 5 it is clear that the possible pairs [σ1, σ2] are given by [1/2, ρ], [1, ρ], [3/2, ρ], [1/2, b1], [1/2, b2], [1, b2].
If H_{σ1} is type D, we may assume that σ1 ∈ Δρ0, and it follows that σ2 ∈ ρ̄0Δρ. From ρ = ρ0ρ1 we get [ρ̄0ρ] = [ρ1] + [gρ1] where [ρ̄0ρ0] = [0] + [g], [g²] = [0]. It is then easy to determine all irreducible sectors of ρ̄0Δρ. There are two E_7 graphs in Fig. 11 encoding the action of a_{1/2}^{ρ0}. By Lemma 3.1 and Fig. 11 it follows that the following is a list of possible intermediate subfactors: [ρ0, w] where w is one of the vertices in the first graph of Fig. 11, [ρ0b, b̂2], [ρ0b, gb̂2], where ρ0b ≺ [4ρ0], [ρ0bb̂2] = [ρ0bgb̂2] = [3/2ρ], [1/2ρ0, ρ1], [1ρ0, ρ1], [3/2ρ0, ρ1], [1/2ρ0, b̂1], [1/2ρ0, b̂2], [1ρ0, b̂2].
Fig. 11. Fusion graphs of a_{1/2}^{ρ0}
Note that from [5/2ρ] = [3/2ρ] + [b2] we have
[5/2ρρ̄1] = [3/2ρρ̄1] + [b2ρ̄1], [5/2ρρ̄1g] = [3/2ρρ̄1g] + [b2ρ̄1g].
On the other hand, from [ρ] = [ρ0ρ1] and computing the index it is easy to derive that
[ρ1ρ̄1] + [gρ1ρ̄1g] = 2[0] + [a_4^{ρ0}]
or
[ρ1ρ̄1] + [gρ1ρ̄1g] = 2[0] + [ã_4^{ρ0}].
Plugging this into the equations above and comparing indices, we find that
[b2ρ̄1] = [b2ρ̄1g] = [5/2ρ0].
It follows that
[b̂2ρ̄1] ≺ [ρ̄0 5/2ρ0] = [a_{5/2}^{ρ0}] + [ga_{5/2}^{ρ0}].
Hence [b̂2ρ̄1] = [ga_{5/2}^{ρ0}] or [b̂2ρ̄1] = [a_{5/2}^{ρ0}]. By taking conjugates we have [b̂2ρ̄1] = [ρ1b̂̄2]. We have therefore proved the following:
[b2ρ̄1] = [b2ρ̄1g] = [5/2ρ0] = [ρb̂̄2] = [ρb̂̄2g]. (15)
When H_{σ1} is E_7, then there is ρ2 ∈ H_{σ1} such that [ρ2ρ̄2] = [ρρ̄]. By the cohomology vanishing result of Remark 5.4 in [26], multiplying σ1 on the right by an automorphism if necessary, we can assume that [ρ2] = [ρ], and so σ1 ∈ Δρ, and σ2 ∈ ρ̄Δρ. The set of irreducible sectors of ρ̄Δρ and the fusion graphs of the action of a_{1/2}, ã_{1/2} are given by Fig. 42 of [8]. By using Lemma 3.1 it is straightforward to determine the following list of possible intermediate subfactors: [ρ, a_{1/2}], [ρ, ã_{1/2}], [ρ, a_{3/2}], [ρ, ã_{3/2}], [ρ, a_1], [ρ, ã_1], [ρ, τ], [b1, a_{1/2}], [b1, ã_{1/2}], [b2, a_{1/2}], [b2, ã_{1/2}], [b2, a_1], [b2, ã_1], [b2, δ], where [τ] = [a_{1/2}]([ã_{5/2}] − [ã_{3/2}]), [δ] = [a_4] − [ã_1] and [ρτ] = [b1], [b2δ] = [3/2ρ].
Theorem 3.5. Assume that σ is a dual GHJ subfactor appearing in jρ with [ρρ̄] = [0] + [4] + [8]. Then the following is the list of all intermediate subfactors that can occur: [1/2, ρ], [1, ρ], [3/2, ρ], [1/2, b1], [1/2, b2], [1, b2]; [ρ0, w] where w is one of the vertices in the first graph of Fig. 11; [ρ0b, b̂2], [ρ0b, gb̂2], where ρ0b ≺ [4ρ0], [ρ0bb̂2] = [ρ0bgb̂2] = [3/2ρ]; [1/2ρ0, ρ1], [1ρ0, ρ1], [3/2ρ0, ρ1], [1/2ρ0, b̂1], [1/2ρ0, b̂2], [1ρ0, b̂2]; [ρ, a_{1/2}], [ρ, ã_{1/2}], [ρ, a_{3/2}], [ρ, ã_{3/2}], [ρ, a_1], [ρ, ã_1], [ρ, τ], [b1, a_{1/2}], [b1, ã_{1/2}], [b2, a_{1/2}], [b2, ã_{1/2}], [b2, a_1], [b2, ã_1], [b2, δ], where [τ] = [a_{1/2}]([ã_{5/2}] − [ã_{3/2}]), [δ] = [a_4] − [ã_1] and [ρτ] = [b1], [b2δ] = [3/2ρ].
3.4. Type D. In this section we assume that σ is a dual GHJ subfactor appearing in jρ with [ρρ̄] = [0] + [k/2]. Let σ = σ1σ2.
Lemma 3.6. Assume that σ is a dual GHJ subfactor appearing in jρ with [ρρ̄] = [0] + [k/2]. Let σ = σ1σ2. If 0 ≤ i ≤ k/2 and i is an integer, and sin(π(2i+1)(2j+1)/(k+2)) ≠ 0, then i ∈ Exp of H_{σ1} as defined in Definition 2.19.
Proof. When j ≠ k/4, we have [σ] = [jρ]. Applying (2) of Lemma 2.20 to a = ρσ̄2, b = σ1, we have
Σ_l φ_a^{(i,l)} φ_b^{(i,l)*} / S_{0i}² = Σ_ν ⟨ν̄a, b⟩ S_{νi}/S_{0i} = S_{ji}/S_{0i} + S_{(k/2−j)i}/S_{0i}.
When j = k/4, we have [σρ̄] = [k/4]. Applying (2) of Lemma 2.20 to a = ρσ̄2, b = σ1, we have
Σ_l φ_a^{(i,l)} φ_b^{(i,l)*} / S_{0i}² = Σ_ν ⟨ν̄a, b⟩ S_{νi}/S_{0i} = S_{(k/4)i}/S_{0i}.
Note that S_{ji}/S_{0i} + S_{(k/2−j)i}/S_{0i} = 0 if i is not an integer. When i is an integer, up to a nonzero constant S_{ji}/S_{0i} + S_{(k/2−j)i}/S_{0i} is sin(π(2i+1)(2j+1)/(k+2)); it follows that if 0 ≤ i ≤ k/2, i is an integer, and sin(π(2i+1)(2j+1)/(k+2)) ≠ 0, then i ∈ Exp of H_{σ1} as defined in Definition 2.19.
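The condition of Lemma 3.6 is a divisibility test, since sin(πm/(k+2)) vanishes exactly when k + 2 divides m. The following sketch lists the integers i that the lemma forces into Exp for given k and j; the function name `forced_exponents` is hypothetical, and j is passed as the integer `two_j` = 2j.

```python
def forced_exponents(k, two_j):
    """Integers i in [0, k/2] with sin(pi (2i+1)(2j+1)/(k+2)) != 0, j = two_j / 2."""
    return [i for i in range(k // 2 + 1)
            if ((2 * i + 1) * (two_j + 1)) % (k + 2) != 0]

print(forced_exponents(10, 3))   # k = 10, j = 3/2: [0, 2, 3, 5]
print(forced_exponents(16, 5))   # k = 16, j = 5/2: [0, 2, 3, 5, 6, 8]
```

Such lists can then be compared with the exponents of the A−D−E graphs tabulated in [13] to see which graph types remain possible for H_{σ1}.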
By Lemma 3.6, if i is an integer and (2i+1)(2j+1) is not divisible by k+2, then i ∈ Exp of H_{σ1}. By inspecting exponents on page 18 of [13], if k ≠ 10, 16, H_{σ1} must be of type A or D. We will see in the following that when k = 10, 16, H_{σ1} can be E_6, E_7 respectively.
3.4.1. Local case: 4|k, k ≠ 16. In this section we assume that 4|k, k ≠ 16 and σ is a dual GHJ subfactor appearing in jρ with [ρρ̄] = [0] + [k/2]. By our assumption H_{σ1} must be of type A or D. If H_{σ1} is type A, then, multiplying σ1 on the right by an automorphism if necessary, we may assume that σ1 ∈ Δ, hence σ2 ∈ Δρ. By Lemma 3.1 either 1/2σ1 is irreducible or a_{1/2}σ2 is irreducible. It is clear that the possible pairs [σ1, σ2] are given by [i, ρ], [i] ≠ [k/4], [1/2, ρb], [1/2, ρb′], where [a_{k/4}] = [b] + [b′], [1/2ρb] = [1/2ρb′] = [ρa_{k/4−1/2}].
If H_{σ1} is type D, we may assume that σ1 ∈ Δρ, and it follows that σ2 ∈ ρ̄Δρ. It follows that the fusion graph of the action of a_{1/2}^{ρ0} is given by two D graphs, whose vertices are labeled by irreducible components of a_i, i ∈ Δ and ã_i, i ∈ Δ respectively. By Lemma 3.1, it follows that the following is a list of possible intermediate subfactors: [ρ, w] where w is an irreducible component of a_i, i ∈ Δ, [ρb, a_{1/2}], [ρb, ã_{1/2}], [1/2ρ, b], [1/2ρ, b′].
Theorem 3.7. Assume that 4|k, k ≠ 16 and σ is a dual GHJ subfactor appearing in jρ with [ρρ̄] = [0] + [k/2]. Then the following is a list of possible intermediate subfactors: [i, ρ], [i] ≠ [k/4], [1/2, ρb], [1/2, ρb′], where [a_{k/4}] = [b] + [b′], [1/2ρb] = [1/2ρb′] = [ρa_{k/4−1/2}]; [ρ, w] where w is an irreducible component of a_i, i ∈ Δ; [ρb, a_{1/2}], [ρb, ã_{1/2}], [1/2ρ, b], [1/2ρ, b′].
3.4.2. k = 16 By Lemma 3.6, Hσ1 may be E 7 when σ = 5/2ρ0 = σ1 σ2 . Suppose that Hσ1 is E 7 . Then we can assume that σ1 ∈ ρ2 with [ρ2 ρ¯2 ] = [0] + [4] + [8]. It follows that σ2 ∈ ρ¯2 ρ0 . But the conjugate of ρ¯2 ρ0 has already been determined by Fig. 11. By computing index we conclude that either dσ1 = db2 , dσ2 = dρ1 or dσ1 = dρ2 , dσ2 = dbˆ2 . By Eq. (15) we conclude that [σ1 , σ2 ] = [b2 , ρ¯1 ], [σ1 , σ2 ] = [b2 , ρ¯1 g], [σ1 , σ2 ] = [ρ2 , b¯ˆ2 ], or [σ1 , σ2 ] = [ρ2 , b¯ˆ2 g].
Theorem 3.8. Assume that k = 16 and σ is a dual GHJ subfactor appearing in jρ with [ρ ρ¯] = [0] + [8]. Then the following is a list of possible intermediate subfactors: [i, ρ], i = [k/4], [1/2, ρb], [1/2, ρb ], where [ak/4 ] = [b] + [b ], [1/2ρb] = [1/2ρb ] = [ρak/4−1/2 ] [ρ, w] where w is an irreducible component of ai , i ∈ , [ρb, a1/2 ], [ρb, a˜ 1/2 ], [1/2ρ, b], [1/2ρ, b ] [b2 , ρ¯1 ], [b2 , ρ¯1 g], [ρ2 , b¯ˆ2 ], [ρ2 , b¯ˆ2 g]
where [b2 ρ¯1 ] = [b2 ρ¯1 g] = [ρ2 b¯ˆ2 ] = [ρ2 b¯ˆ2 g] = [5/2ρ]. 3.4.3. Nonlocal case: k is even but not divisible by 4, k = 10 In this section we assume that k is even but not divisible by 4, k = 10 and σ is a dual GHJ subfactor appearing in jρ with [ρ ρ] ¯ = [0] + [k/2]. By our assumption Hσ1 must be of type A or D. If Hσ1 is type A, then multiply σ1 on the right by an automorphism if necessary, we may assume that σ1 ∈ , hence σ1 ∈ ρ. By Lemma 3.1 either 1/2σ1 is irreducible or a1/2 σ2 is irreducible. Then it is clear that the possible pairs of [σ1 , σ2 ] are given by [i, ρ], i = [k/4], [1/2, b], [1/2, b ], where [ρak/4 = [b] + b ], [1/2b] = [1/2b ] = [ρak/4−1/2 ]. If Hσ1 is type D, we may assume that σ1 ∈ ρ, and it follows that σ2 ∈ ρ ρ. ¯ It ρ0 follows that the fusion graph of the action of a1/2 is given by one D graph, whose vertices are labeled by irreducible components of ai , i ∈ . By Lemma 3.1, it follows that the following is a list of possible intermediate subfactors: [ρ, w] where w is an irreducible component of ai , i = k/4, [b, a1/2 ], [b, a˜ 1/2 ]. Theorem 3.9. Assume that k is even but not divisible by 4, k = 10 and σ is a dual GHJ subfactor appearing in jρ with [ρ ρ] ¯ = [0] + [k/2]. Then the following is a list of possible intermediate subfactors: [i, ρ], i = [k/4], [1/2, b], [1/2, b ], where [ρak/4 ] = [b] + b ], [1/2b] = [1/2b ] = [ρak/4−1/2 ], [ρ, w] where w is an irreducible component of ai , i ∈ , [b, a1/2 ], [b, a˜ 1/2 ].
3.4.4. k = 10 By Lemma 3.6, Hσ1 may be E 6 when σ = 3/2ρ0 . Suppose that Hσ1 is E 6 . Then we can assume that σ1 ∈ ρ2 with [ρ2 ρ¯2 ] = [0] + [3]. It follows that σ2 ∈ ρ¯2 ρ0 . But the conjugate of ρ¯2 ρ0 has already been determined by Fig. 10. By computing index we conclude that dσ1 = dρ2 = dσ2 . By Eq. (14) we conclude that [σ1 , σ2 ] = [ρ2 , x] ¯ or [σ1 , σ2 ] = [ρ2 , xa ¯ 5 ]. Theorem 3.10. Assume that k = 10 and σ is a dual GHJ subfactor appearing in jρ with [ρ ρ] ¯ = [0] + [5]. Then the following is a list of possible intermediate subfactors: [i, ρ], i = [k/4], [1/2, ρb], [1/2, ρb ], where [ρak/4 ] = [b] + [b ], [1/2ρb] = [1/2ρb ] = [ρak/4−1/2 ], [ρ, w] where w is an irreducible component of ai , i ∈ , [b, a1/2 ], [b, a˜ 1/2 ] [ρ2 , x], ¯ [ρ2 , xa ¯ 5 ] where ρ
¯ = [ρ2 xa ¯ 5 ] = [3/2ρ]. [ρ2 x]
3.5. E 8 case. Assume that σ is a dual GHJ subfactor appearing in jρ with [ρ ρ] ¯ = [0] + [5] + [9] + [14]. Since [8ρ] = [ρ], by Lemma 2.5 we may assume that ρ = ρ0 ρ1 . Let σ = σ1 σ2 . If Hσ1 is type A, then multiply σ1 on the right by an automorphism if necessary, we may assume that σ1 ∈ , hence σ1 ∈ ρ. By Lemma 3.1 either 1/2σ1 is irreducible or a1/2 σ2 is irreducible. Then from Fig. 7 and fusion rules it is clear that the possible pairs of [σ1 , σ2 ] are given by [1/2, ρ], [1, ρ], [3/2, ρ], [2, ρ], [1/2, ρb3 ], [1/2, b4 ], [1, ρb4 ] If Hσ1 is type D, we may assume that σ1 ∈ ρ0 , and it follows that σ2 ∈ ρ¯0 ρ. From ρ = ρ0 ρ1 we get [ρ¯0 ρ] = [ρ1 ] + [gρ1 ] where [ρ¯0 ρ0 ] = [1] + [g], [g 2 ] = [0]. It is then easy to determine all irreducible sectors of ρ¯0 ρ. There are two E 8 graphs encoding ρ0 the action of a1/2 in Fig. 12. By Lemma 3.1 and Fig. 12, it follows that the following is a list of possible intermediate subfactors: [ρ0 , w], where w is one of the vertices in the first graph of Fig. 12, [ρ0 b, ρ1 ], [ρ0 b, gρ1 ] where ρ0 b ≺ [7ρ0 ] and [ρ0 bρ1 ] = [ρ0 bgρ1 ] = [2ρ], [1/2ρ0 , ρ1 ], [1ρ0 , ρ1 ], [3/2ρ0 , ρ1 ], [2ρ0 , ρ1 ], [1/2ρ0 , ρ1 b3 ], [1/2ρ0 , ρ1 b4 ], [1ρ0 , ρ1 b4 ]
Fig. 12. Fusion graph of $a_{1/2}^{\rho_0}$
When Hσ1 is E 8 , there is ρ2 ∈ Hσ1 such that [ρ2 ρ¯2 ] = [ρ ρ¯]. By the cohomology vanishing result of Remark 5.4 in [26], multiplying σ1 on the right by an automorphism if necessary, we can assume that [ρ2 ] = [ρ], and so σ1 ∈ ρ, and σ2 ∈ ρ¯ρ. The set of irreducible sectors of ρ¯ρ and the fusion graphs of the action of a1/2 , a˜ 1/2 are given by Fig. 5 of [5]. By using Lemma 3.1 and comparing indices it is straightforward to determine the following list of possible intermediate subfactors: [ρ, a1/2 ], [ρ, a˜ 1/2 ], [ρ, a3/2 ], [ρ, a˜ 3/2 ], [ρ, a1 ], [ρ, a˜ 1 ], [ρ, a2 ], [ρ, a˜ 2 ], [ρ, a1/2 b˜3 ], [ρ, a˜ 1/2 b3 ], [ρb3 , b4 ], [ρb3 , a1/2 ], [ρb3 , a˜ 1/2 ], [ρb4 , a1/2 ], [ρb4 , a˜ 1/2 ], [ρb4 , a1 ], [ρb4 , a˜ 1 ], [ρb4 , b3 ], [ρb4 , b˜3 ], [ρa1/2 , b4 ], [ρa1 , b4 ], [ρa1/2 , b3 ], [ρa1/2 , b˜3 ] The additional fusion rules which are not immediately visible from Fig. 7 are [b3 b4 ] = [a3/2 ], [a1 b4 ] = [a2 ].
Theorem 3.11. Assume that σ is a dual GHJ subfactor appearing in jρ with [ρ ρ¯] = [0] + [5] + [9] + [14]. Then the following is the list of all intermediate subfactors that can occur:
[1/2, ρ], [1, ρ], [3/2, ρ], [2, ρ], [1/2, ρb3 ], [1/2, b4 ], [1, ρb4 ]
[ρ0 , w], where w is one of the vertices in the first graph of Fig. 12, [ρ0 b, ρ1 ], [ρ0 b, gρ1 ] where ρ0 b ≺ [7ρ0 ] and [ρ0 bρ1 ] = [ρ0 bgρ1 ] = [2ρ], [1/2ρ0 , ρ1 ], [1ρ0 , ρ1 ], [3/2ρ0 , ρ1 ], [2ρ0 , ρ1 ], [1/2ρ0 , ρ1 b3 ], [1/2ρ0 , ρ1 b4 ], [1ρ0 , ρ1 b4 ],
[ρ, a1/2 ], [ρ, a˜ 1/2 ], [ρ, a3/2 ], [ρ, a˜ 3/2 ], [ρ, a1 ], [ρ, a˜ 1 ], [ρ, a2 ], [ρ, a˜ 2 ], [ρ, a1/2 b˜3 ], [ρ, a˜ 1/2 b3 ], [ρb3 , b4 ], [ρb3 , a1/2 ], [ρb3 , a˜ 1/2 ], [ρb4 , a1/2 ], [ρb4 , a˜ 1/2 ], [ρb4 , a1 ], [ρb4 , a˜ 1 ], [ρb4 , b3 ], [ρb4 , b˜3 ], [ρa1/2 , b4 ], [ρa1 , b4 ], [ρa1/2 , b3 ], [ρa1/2 , b˜3 ]
The additional fusion rules which are not immediately visible from Fig. 7 are [b3 b4 ] = [a3/2 ], [a1 b4 ] = [a2 ].

4. The Lattice Structure of Intermediate Dual GHJ Subfactors

In this section we list the lattices of intermediate subfactors of dual GHJ subfactors. Given a dual GHJ subfactor σ ≺ ρ, first we inspect all the pairs [σ1 , σ2 ] listed in Th. 3.3, Th. 3.4, Th. 3.5, Th. 3.7, Th. 3.8, Th. 3.9, Th. 3.10 and Th. 3.11 such that [σ ] = [σ1 σ2 ]. This gives all intermediate subfactors of σ. Then we use Cor. 2.4 and known fusion rules to determine the relations between these intermediate subfactors. The results are listed in the following figures. In each figure indexed by a dual GHJ subfactor σ we list all nontrivial intermediate subfactors [σ1 , σ2 ] with [σ1 σ2 ] = [σ ]. If [σ1 , σ2 ] lies above [τ1 , τ2 ] and there is a line connecting them, then [τ1 , τ2 ] ⊂ [σ1 , σ2 ].

4.1. Type A. When k is odd or k = 2, all type A GHJ subfactors are maximal. When k ≥ 4 is even, all i ≠ k/4 are maximal.
Fig. 13. 1, k = 4 type A
Fig. 14. k/4, 4|k, k > 4, type A
Fig. 15. k/4, k ≥ 6 is even but not divisible by 4, type A
Fig. 16. ρ0 b, 4|k, k ≥ 8, type D
Fig. 17. iρ0 , i ≠ k/4, k/4 − 1/2, 4|k, i ≠ 5/2 if k = 16, type D
Fig. 18. 5/2ρ0 , k = 16, type D
4.2. Type D. When k = 2 the GHJ subfactor is maximal. When k = 4, ρ0 b, ρ0 b′ are maximal. When k is not divisible by 4, b, b′ ≺ k/4ρ0 are maximal. We note that [ρ0 b] = [ρ0 b′ g] when k is divisible by 4, so ρ0 b and ρ0 b′ have identical intermediate subfactor lattices. The lattice in Fig. 23 for the case when k = 6 was first obtained in [14] in the setting of type II1 factors.
Fig. 19. (1/2)ρ0 , [a1/2 b] = [a1/2 b ] = [a1/2 ], k = 4, type D
Fig. 20. (k/4 − 1/2)ρ0 , [a1/2 b] = [a1/2 b ] = [ak/4−1/2 ], 4|k, k ≥ 8, type D
Fig. 21. iρ0 , i ≠ k/4, k/4 − 1/2, k is even but not divisible by 4, i ≠ 3/2 if k = 10, type D
Fig. 22. 3/2ρ0 , k = 10, type D
Fig. 23. (k/4 − 1/2)ρ0 , [a1/2 b] = [a1/2 b ] = [ak/4−1/2 ], k ≥ 6 is not divisible by 4,
Fig. 24. 1/2ρ, type E 6
4.3. E 6 . In this case ρ, ρa5 are maximal. The lattice of intermediate subfactors of 9/2ρ is isomorphic to that of 1/2ρ since [9/2ρ] = [1/2ρa5 ] and a5 is an automorphism.
Fig. 25. ρb0 , type E 6
Fig. 26. 1ρ, type E 6
Fig. 27. ρ, type E 7
Fig. 28. 1/2ρ, type E 7
Fig. 29. 1ρ, type E 7
Fig. 30. b1 , type E 7
Fig. 31. b2 , type E 7
4.4. E 7 , E 8 cases, and an example. The labeling of the type E 7 and E 8 cases is described in Th. 3.5 and Th. 3.11 respectively. Here we use the most complicated case of 2ρ, type E 8 , to explain how we obtain the lattice structure. Let us first use Lemma 2.1 to show that [ρ, a2 ] ≠ [ρ, a˜ 2 ]. If [ρ, a2 ] = [ρ, a˜ 2 ], we can find an automorphism σ such that [ρσ ] = [ρ], [σ a2 ] = [a˜ 2 ]. But a˜ 2 a2 is irreducible, and this implies that [σ ] = [a˜ 2 a2 ], contradicting our assumption that σ is an automorphism. Alternatively we can argue from [ρσ ] = [ρ] that σ ≺ ρ¯ρ, and from the formula for ρ¯ρ in [7] we conclude that [σ ] = [0], and hence [a2 ] = [a˜ 2 ], again a contradiction. A good way to look at the intermediate subfactors of 2ρ, type E 8 , is to start with dual GHJ of type A: these are pairs with the first components labeled by a half integer, and there are three of them; for type D the first components are labeled by a half integer multiplied by ρ0 , and there are six of them; for type E the first components are labeled by the vertices of Fig. 7, and there are eleven of them.
Fig. 32. b1 , type E 7
Fig. 33. 3/2ρ, type E 7
Fig. 34. ρ, type E 8
Fig. 35. 1/2ρ, type E 8
Fig. 36. 1ρ, type E 8
Fig. 37. 3/2ρ, type E 8
Fig. 38. ρb3 , type E 8
Fig. 39. ρb3 ,type E 8
Fig. 40. ρb4 ,type E 8
Fig. 41. 2ρ, type E 8
Let us now explain why [ρb4 , a1 ] ⊂ [ρ, a2 ] and yet [ρb4 , a1 ] is not a subfactor of [ρ, a˜ 2 ]. By Lemma 2.1, [ρb4 , a1 ] ⊂ [ρ, b4 a1 ], and [b4 a1 ] = [a2 ], so [ρb4 , a1 ] ⊂ [ρ, a2 ]. Now if [ρb4 , a1 ] ⊂ [ρ, a˜ 2 ], by Lemma 2.1 we can find σ such that [ρb4 ] = [ρσ ], [σ a1 ] = [a˜ 2 ]. It follows that [ρ, σ ] is an intermediate subfactor of ρb4 . But all such pairs are classified in Th. 3.11, and by inspection we conclude that [ρ, σ ] = [ρ, b4 ]. By definition we have an automorphism σ1 such that [ρσ1 ] = [ρ], [σ1 σ ] = [b4 ]. Since by [7] the only subsector of ρ¯ρ which is an automorphism is [0], we have [σ ] = [b4 ], and [σ a1 ] = [a2 ] ≠ [a˜ 2 ], a contradiction. The rest of the relations in Fig. 41 are derived in a similar way.
4.5. A negative answer to question A.12. From the list of lattices we can easily verify Conjecture 1.1 for GHJ subfactors. It is an interesting question to check, using our list, whether the stronger Conjecture A.1 is true for all GHJ subfactors. We claim that the GHJ subfactor ρ¯3/2 of type E 7 gives a negative answer to Question A.12 in the appendix. This subfactor is dual to 3/2ρ, whose lattice of intermediate subfactors is given by Fig. 33. Since 3/2ρ has 12 minimal subfactors, it follows that ρ¯3/2 has 12 maximal subfactors. Note that [ρ¯3/2][3/2ρ] ∈ ρ¯ρ, and it is known by Fig. 42 of [8] that the only sector with index 1 which appears in ρ¯ρ is [0]. By Lemma 2.6, Aut(ρ¯3/2) is trivial. On the other hand it is easy to calculate [3/2ρ ρ¯3/2], and we find that there are 9 irreducible sectors which can appear in [3/2ρ ρ¯3/2]. Since 12 > 9, this gives a negative answer to Question A.12.
A. A Proof of Relative Version of Wall’s Conjecture for Solvable Groups

As pointed out in the introduction, Wall proved his conjecture for solvable groups. Another piece of supporting evidence, as observed by V. F. R. Jones, is that the minimal version of Conjecture 1.1 holds for subfactors with a commutative $N' \cap M_1$. In this appendix we will prove the relative version of Wall’s conjecture for solvable groups. Our proof is partially inspired by some comments of V. F. R. Jones.

Let $N \subset M$ be an irreducible subfactor with finite Jones index, $e_N$ the Jones projection from $M$ to $N$, and let $P_i$, $1 \le i \le n$ be the set of minimal intermediate subfactors. Then the Jones projections $e_i$ from $M$ onto $P_i$ are in $N' \cap M_1$. Conjecture 1.1 will follow if $e_i$, $1 \le i \le n$, $e_N$ are linearly independent. Unfortunately this is not true in general. In fact, let $G$ act properly on the hyperfinite type $II_1$ factor $R$, and consider the subfactors $R^G \subset R^H \subset R$, where $R^G$, $R^H$ are the fixed point subalgebras of $R$ under the action of $G$, $H$ respectively. Let $K_i$, $1 \le i \le n$ be the set of maximal subgroups of $G$ which strictly contain $H$. Let $e_i$ be the Jones projections from $R$ onto $R^{K_i}$. Note that $e_i = \frac{1}{|K_i|} \sum_{g \in K_i} g \in \mathbb{C}G$. For any subgroup $K$ of $G$ we will denote $e_K = \frac{1}{|K|} \sum_{g \in K} g \in \mathbb{C}G$. Note that $\mathbb{C}G$ is a $C^*$-algebra. We denote by $l(G)$ the abelian algebra of complex valued functions on $G$. There are examples where $G$ is a semidirect product of an elementary abelian group $V$ by $G_1$ which acts irreducibly on $V$, and we find that the set $\frac{1}{|G_1|} \sum_{g \in G_1} v g v^{-1}$, $v \in V$ is not linearly independent (note that for any $v \in V$, $v G_1 v^{-1}$ is maximal in $G$ by our assumption). However the following modification seems to be interesting:

Conjecture A.1. Let $N \subset M$ be an irreducible subfactor with finite Jones index, and let $P_i$, $1 \le i \le n$ be the set of minimal intermediate subfactors. Denote by $e_i \in N' \cap M_1$, $1 \le i \le n$ the Jones projections from $M$ onto $P_i$ and by $e_N$ the Jones projection from $M$ onto $N$. Then there are vectors $\xi_i, \xi \in N' \cap M_1$ such that $e_i \xi_i = \xi_i$, $1 \le i \le n$, $e_N \xi = \xi$, and $\xi_i$, $1 \le i \le n$, $\xi$ are linearly independent.

Remark A.2. We note that unlike Conjecture 1.1, the conjecture above makes use of the algebra structure of $N' \cap M_1$ and therefore does not immediately imply the dual version, nor the version in which minimal is replaced by maximal.
By definition Conjecture A.1 implies Conjecture 1.1, and we shall prove Conjecture A.1 for $R^G \subset R$, $G$ solvable. In fact it is easy to check that for $R^G \subset R$ Conjecture A.1 is equivalent to:

Conjecture A.3. Let $K_i$, $1 \le i \le n$ be a set of maximal subgroups of $G$. Then there are vectors $\xi_i \in l(G)$, $1 \le i \le n$ such that $e_G \xi_i = 0$, $\xi_i$ are $K_i$ invariant and linearly independent.

We will prove Conjecture A.3 when $G$ is solvable.

Lemma A.4. Suppose that $K$ acts irreducibly on an elementary abelian group $V$, and $G$ is the semi-direct product of $V$ by $K$. For each $v K v^{-1}$, $v \neq 0$, we assign a vector $\xi_v := \delta_{vK} - \frac{1}{|V|} 1 \in l(G)$, and for $K$ we assign $\xi_K := \delta_{Kb} - \frac{1}{|V|} 1 \in l(G)$ where $b \neq 0$. Here $\delta_S$ denotes the characteristic function of a set $S \subset G$, and $1$ stands for the constant function with value 1. Then $\xi_v$, $v \in V$, $v \neq 0$, $\xi_K$ verify Conjecture A.3 for $v K v^{-1}$, $v \in V$.

Proof. By definition we just have to check that $\xi_v$, $v \in V$, $v \neq 0$, $\xi_K$ are linearly independent. Suppose that $\lambda_v$ are complex numbers such that
$$\sum_{v \neq 0,\, v \in V} \lambda_v \xi_v + \lambda_0 \xi_K = 0.$$
Then we have
$$\sum_{v \neq 0,\, v \in V} \lambda_v \delta_{vK} + \lambda_0 \delta_{Kb} - \frac{1}{|V|} \Big( \sum_{v \in V} \lambda_v \Big) 1 = 0.$$
For a fixed $v \in V$, $v \neq 0$, since $K$ acts irreducibly on $V$, we can find $k \in K$ such that $k(v) := k^{-1} v k \neq b$. It follows that $\delta_{Kb}(vk) = 0$. Evaluating the LHS of the above sum at $vk$, $v \neq 0$, we have
$$\lambda_v = \frac{1}{|V|} \sum_{v' \in V} \lambda_{v'}.$$
Since $kb = k^{-1}(b) k$, evaluating the above sum at $kb$ we have
$$\lambda_0 + \lambda_{k^{-1}(b)} - \frac{1}{|V|} \sum_{v \in V} \lambda_v = 0.$$
Notice that $k^{-1}(b) \neq 0$ since $b \neq 0$. We conclude that $\lambda_0 = 0$, all $\lambda_v$, $v \neq 0$ are identical to the same value $\lambda$, and $\frac{1}{|V|} (|V| - 1) \lambda = \lambda$. It follows that $\lambda_v = 0$, $\forall v \in V$.

Lemma A.5. Suppose that $H$ is a normal subgroup of $G$ and $K_i \ge H$, $1 \le i \le n$ is a set of maximal subgroups of $G$. If Conjecture A.3 is true for $K_i/H \le G/H$, then it is also true for $K_i \le G$, $1 \le i \le n$.

Proof. By assumption we have functions $\xi_i : G/H \to \mathbb{C}$ such that the $\xi_i$ are linearly independent and $K_i/H$ invariant, $e_{G/H} \xi_i = 0$, $1 \le i \le n$. Let $\pi : G \to G/H$ be the projection. Then the $\xi_i \pi : G \to \mathbb{C}$ are linearly independent and $K_i$ invariant, $e_G \xi_i \pi = 0$, $1 \le i \le n$.

Recall that for a subgroup $K \le G$, $\mathrm{Core}_G K := \cap_{g \in G}\, g K g^{-1}$ is the largest normal subgroup of $G$ that is contained in $K$.
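The linear independence asserted in Lemma A.4 can be sanity-checked by brute force in the smallest case. The following is only an illustrative sketch, not part of the original argument: it assumes $V = \mathbb{Z}/3$ with $K = \{+1, -1\}$ acting by multiplication (so $G$ is the symmetric group on three letters), and the helper names `mul`, `xi` and `rank` are hypothetical.

```python
from fractions import Fraction

P = 3                      # V = Z/3, an elementary abelian group (assumed example)
K = (1, -1)                # K = {+1, -1} acts on V by multiplication
G = [(v, k) for v in range(P) for k in K]   # elements (v, k) of the semidirect product

def mul(x, y):
    """(v1, k1)(v2, k2) = (v1 + k1*v2, k1*k2) in the semidirect product."""
    return ((x[0] + x[1] * y[0]) % P, x[1] * y[1])

def xi(subset):
    """The function delta_S - (1/|V|) * 1, stored as a vector indexed by G."""
    return [Fraction(int(g in subset)) - Fraction(1, P) for g in G]

def rank(rows):
    """Rank of a list of rational vectors, by Gaussian elimination."""
    rows = [r[:] for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                factor = rows[i][col] / rows[rk][col]
                rows[i] = [a - factor * c for a, c in zip(rows[i], rows[rk])]
        rk += 1
    return rk

b = 1                                                                 # any b != 0
cosets_vK = [{mul((v, 1), (0, k)) for k in K} for v in range(1, P)]   # vK for v != 0
coset_Kb = {mul((0, k), (b, 1)) for k in K}                           # Kb
vectors = [xi(S) for S in cosets_vK] + [xi(coset_Kb)]
```

Here `rank(vectors)` equals the number of vectors exactly when they are linearly independent, and each vector sums to zero, which corresponds to the condition $e_G \xi = 0$.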
Lemma A.6. Let $H_1$, $H_2$ be subgroups of a finite group $G$. Denote by $H_1 H_2$ the set of different elements $g$ which can be written as $h_1 h_2$ with $h_1 \in H_1$, $h_2 \in H_2$. Then
$$e_{H_1} e_{H_2} = \frac{1}{|H_1 H_2|} \sum_{g \in H_1 H_2} g.$$

Proof. Let $g = h_1 h_2$ with $h_1 \in H_1$, $h_2 \in H_2$. Then $h_1 h_2 = h_1' h_2'$ iff $h_1'^{-1} h_1 = h_2' h_2^{-1} \in H_1 \cap H_2$. It follows that for each $g = h_1 h_2 \in H_1 H_2$, there are $|H_1 \cap H_2|$ different pairs $(h_1, h_2) \in H_1 \times H_2$ such that $g = h_1 h_2$. Hence $|H_1 H_2| = \frac{|H_1||H_2|}{|H_1 \cap H_2|}$ and the lemma follows from the definition.
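As a sanity check of Lemma A.6, the identity can be verified by brute force in a small group. The sketch below is illustrative only and not part of the original proof: it assumes $G$ is the symmetric group on $\{0, 1, 2\}$ with permutations stored as tuples, takes two subgroups of order 2 whose product is not a subgroup, and uses hypothetical helper names (`compose`, `averaging_element`, `multiply`).

```python
from fractions import Fraction
from itertools import product

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def averaging_element(subgroup):
    """e_H = (1/|H|) * sum of h in H, stored as {permutation: coefficient}."""
    return {h: Fraction(1, len(subgroup)) for h in subgroup}

def multiply(x, y):
    """Product of two elements of the group algebra."""
    out = {}
    for (g, a), (h, c) in product(x.items(), y.items()):
        gh = compose(g, h)
        out[gh] = out.get(gh, 0) + a * c
    return out

e = (0, 1, 2)
H1 = [e, (1, 0, 2)]   # generated by the transposition swapping 0 and 1
H2 = [e, (0, 2, 1)]   # generated by the transposition swapping 1 and 2

H1H2 = {compose(h1, h2) for h1 in H1 for h2 in H2}   # the set H1 H2
common = set(H1) & set(H2)                           # H1 ∩ H2

lhs = multiply(averaging_element(H1), averaging_element(H2))
rhs = {g: Fraction(1, len(H1H2)) for g in H1H2}
```

In this example `lhs` and `rhs` agree, and `len(H1H2)` equals `len(H1) * len(H2) // len(common)`, matching the counting step in the proof.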
Proposition A.7. Conjecture A.3 is true for $G$ solvable.

Proof. The proof goes by induction on $|G|n$. Consider $H := \mathrm{Core}_G K_1$. If $\mathrm{Core}_G K_i$ does not contain $H$, then $K_i H \neq K_i$, and since $K_i H$ is a subgroup of $G$ and $K_i$ is maximal, we have $K_i H = G$. Suppose that there is at least one $K_i$ such that $\mathrm{Core}_G K_i$ does not contain $H$. By the induction hypothesis, for the set of $K_i$ with $\mathrm{Core}_G K_i$ not containing $H$ we can find vectors $\xi_i$ which verify Conjecture A.3, and for the set of $K_i$ with $\mathrm{Core}_G K_i$ containing $H$ we can find vectors $\xi_i$ which verify Conjecture A.3. We claim that such a set $\xi_i$, $1 \le i \le n$ is linearly independent. Suppose that $\lambda_i \in \mathbb{C}$, $1 \le i \le n$ are such that $\sum_{1 \le i \le n} \lambda_i \xi_i = 0$. Multiply on the left by $e_H$. We have
$$\sum_{i,\ \mathrm{Core}_G K_i \ge H} \lambda_i e_H \xi_i + \sum_{i,\ \mathrm{Core}_G K_i \cap H \neq H} \lambda_i e_H \xi_i = 0.$$
Note that if $\mathrm{Core}_G K_i \ge H$, then $e_H \xi_i = e_H e_{K_i} \xi_i = e_{K_i} \xi_i$; if $\mathrm{Core}_G K_i$ does not contain $H$, then $e_H \xi_i = e_H e_{K_i} \xi_i = e_G \xi_i = 0$ by Lemma A.6. It follows that
$$\sum_{i,\ \mathrm{Core}_G K_i \ge H} \lambda_i \xi_i = 0, \qquad \sum_{i,\ \mathrm{Core}_G K_i \cap H \neq H} \lambda_i \xi_i = 0.$$
By our assumption $\lambda_i = 0$, $1 \le i \le n$. So we are left with the case that $\mathrm{Core}_G K_i \ge H$, $1 \le i \le n$. By replacing $H = \mathrm{Core}_G K_1$ by $H = \mathrm{Core}_G K_j$ for some $1 \le j \le n$ we can now assume that $\mathrm{Core}_G K_i = H$, $1 \le i \le n$. If $H$ is nontrivial, by Lemma A.5 we are done. If $H$ is trivial, by Th. 15.6 of [11], $G$ is the semidirect product of an elementary abelian group $V$ by $K_1$, and the action of $K_1$ on $V$ is irreducible. Moreover by Th. 16.1 of [11] every maximal subgroup $K$ of $G$ with $\mathrm{Core}_G K = H$ is of the form $v K_1 v^{-1}$ for some $v \in V$. By Lemma A.4 we are done.

Remark A.8. The reduction in the proof of the above proposition works for general groups, and Conjecture A.3 can be reduced to the case where $G$ is a primitive group and the maximal subgroups in the set have trivial core. Such groups are classified by the O’Nan-Scott theorem (cf. §4 of [10]). The first case is when $G$ is the semidirect product of an elementary abelian group $V$ by $K_1$, and the action of $K_1$ on $V$ is irreducible. When $G$ is not solvable, maximal subgroups $K$ of $G$ with trivial core are not necessarily conjugates of $K_1$, and our proof above does not work. Such maximal subgroups are related to the first cohomology of $K_1$ with coefficients in $V$, and Conjecture A.3 implies that the order of this cohomology is less than $|K_1|$ (cf. Question 12.2 of [19]). Unfortunately, even though it is believed that the order of this cohomology is small (cf. [18]), the bound $|K_1|$ has not been achieved yet.

Corollary A.9. The relative version of Wall’s conjecture is true for solvable groups.
Proof. Let $K_i$, $1 \le i \le n$ be a set of maximal subgroups of $G$ strictly containing $H$. By Prop. A.7 we can find vectors $\xi_i \in l(G)$, $1 \le i \le n$ such that $e_G \xi_i = 0$, $\xi_i$ are $K_i$ invariant and linearly independent. Since $K_i \ge H$, we have $e_H \xi_i = e_H e_{K_i} \xi_i = e_{K_i} \xi_i = \xi_i$, $1 \le i \le n$. It follows that the $\xi_i$ are $H$ invariant, and can be thought of as functions in $l(G/H)$. Since $e_G \xi_i = 0$, we conclude that $1$, $\xi_i$, $1 \le i \le n$ are linearly independent functions in $l(G/H)$ and the corollary follows.

At the end of this appendix we discuss a question which is motivated by the following conjecture of Aschbacher-Guralnick in [1]:

Conjecture A.10. Let $G$ be a finite group. Then the number of conjugacy classes of maximal subgroups is less than or equal to the number of conjugacy classes of $G$.

Conjecture A.10 was proved in [1] for solvable $G$. Here we give a slightly different proof (with strict inequality) in the spirit of the proof of Prop. A.7.

Proposition A.11. If $G$ is a finite solvable group, then the number of conjugacy classes of maximal subgroups is less than the number of conjugacy classes of $G$.

Proof. Let $K_i$, $1 \le i \le n$ be a set of representatives of conjugacy classes of maximal subgroups of $G$. Let $C_1, \ldots, C_k$ be the conjugacy classes of $G$. Define
$$f_i := \sum_{g \in G} g e_{K_i} g^{-1} - |G| e_G, \quad 1 \le i \le n, \qquad h_j := \sum_{g \in C_j} g, \quad 1 \le j \le k.$$
Note that $e_G f_i = 0$, $e_G = |G|^{-1} \sum_{1 \le j \le k} h_j$, the $h_j$, $1 \le j \le k$ are linearly independent, and $f_i$ is in the space spanned by $h_j$, $1 \le j \le k$. We claim that $f_i$, $1 \le i \le n$ are linearly independent. This will prove the proposition since $f_i$ is in the space spanned by $h_j$, $1 \le j \le k$, and $\sum_{1 \le j \le k} h_j f_i = 0$. Suppose that $\lambda_i \in \mathbb{C}$, $1 \le i \le n$ are such that $\sum_{1 \le i \le n} \lambda_i f_i = 0$. Fix $g$ and $1 \le i \le n$. If $j \neq i$, then $g K_i g^{-1}$ is not conjugate to $h K_j h^{-1}$, $h \in G$, and by Th. 16.2 of [11] we have $g K_i g^{-1} h K_j h^{-1} = G$, and by Lemma A.6 $g e_{K_i} g^{-1} h e_{K_j} h^{-1} = e_G$. Multiplying $\sum_{1 \le i \le n} \lambda_i f_i = 0$ on the left by $g e_{K_i} g^{-1}$ we have
$$\lambda_i\, g e_{K_i} g^{-1} f_i = g e_{K_i} g^{-1} \lambda_i \Big( \sum_{h \in G} h e_{K_i} h^{-1} - |G| e_G \Big) = 0, \quad \forall g \in G.$$
It follows that $|\lambda_i|^2 f_i f_i^* = 0$, and since $\mathbb{C}G$ is a $C^*$-algebra, $\lambda_i f_i = 0$. By looking at the coefficient of the identity element of $G$ in $f_i$ we conclude that $\lambda_i (|G|/|K_i| - 1) = 0$. Since $|G| > |K_i|$ we conclude that $\lambda_i = 0$.

The following question is motivated by Conjecture A.10:

Question A.12. Let $N \subset M$ be an irreducible subfactor with finite index. Let $\mathrm{Aut}(M|N) := \{\alpha \in \mathrm{Aut}(M) \mid \alpha(n) = n, \forall n \in N\}$. We say that two intermediate subfactors $P_1$, $P_2$ are conjugate if there is an $\alpha \in \mathrm{Aut}(M|N)$ such that $\alpha(P_1) = P_2$. Is the number of conjugacy classes of maximal or minimal subfactors less than or equal to the number of irreducible representations of $N' \cap M_1$?

Remark A.13. There is a similar formulation of the above question using N(M|N ) and Lemma 2.8. Take $N = M^G \subset M$. Then $\mathrm{Aut}(M|M^G) = G$, $N' \cap M_1 = \mathbb{C}G$. The conjugacy classes of maximal subfactors are the same as the conjugacy classes of minimal subgroups of $G$, and it is easy to see that the number of conjugacy classes of minimal subgroups of $G$
is less than the number of conjugacy classes of $G$, which is the same as the number of irreducible representations of $N' \cap M_1 = \mathbb{C}G$. On the other hand the conjugacy classes of minimal subfactors are the same as the conjugacy classes of maximal subgroups of $G$, and Question A.12 is equivalent to Conjecture A.10. Let $M$ be the crossed product of $N$ by $G$. Then $\mathrm{Aut}(M|N)$ is isomorphic to the set of one dimensional representations of $G$, and $N' \cap M_1 = l(G)$. In this case $\mathrm{Aut}(M|N)$ preserves every intermediate subfactor. The conjugacy classes of minimal subfactors can be identified with the set of minimal subgroups of $G$, and it is easy to see that the number of minimal subgroups of $G$ is less than $|G|$, which is the same as the number of irreducible representations of $N' \cap M_1 = l(G)$. On the other hand the conjugacy classes of maximal subfactors can be identified with the set of maximal subgroups of $G$, and Question A.12 is equivalent to Wall’s conjecture (with ≤ instead of <). However, as shown in §4.5, Question A.12 has a negative answer for general subfactors. Is there any natural modification of the statement in Question A.12 so that it has a better chance of being true while still generalizing Conjecture A.10?

Acknowledgements. We’d like to thank Professors M. Aschbacher, P. Grossman and R. Guralnick for useful discussions, and especially Prof. V. F. R. Jones for his interest and useful comments on Conjecture 1.1 which inspired the results of the Appendix.
References
1. Aschbacher, M., Guralnick, R.: Some applications of the first cohomology group. J. Algebra 90(2), 446–460 (1984)
2. Baddeley, R., Lucchini, A.: On representing finite lattices as intervals in subgroup lattices of finite groups. J. Algebra 196, 1–100 (1997)
3. Böckenhauer, J., Evans, D.E.: Modular invariants, graphs and α-induction for nets of subfactors. I. Commun. Math. Phys. 197, 361–386 (1998)
4. Böckenhauer, J., Evans, D.E.: Modular invariants, graphs and α-induction for nets of subfactors. II. Commun. Math. Phys. 200, 57–103 (1999)
5. Böckenhauer, J., Evans, D.E.: Modular invariants, graphs and α-induction for nets of subfactors. III. Commun. Math. Phys. 205, 183–228 (1999)
6. Böckenhauer, J., Evans, D.E.: Modular invariants from subfactors: Type I coupling matrices and intermediate subfactors. Commun. Math. Phys. 213(2), 267–289 (2000)
7. Böckenhauer, J., Evans, D.E., Kawahigashi, Y.: On α-induction, chiral generators and modular invariants for subfactors. Commun. Math. Phys. 208, 429–487, 1999; also see the preprint version, http://arXiv.org/abs/math/9904109v2 [math.OA], 1999
8. Böckenhauer, J., Evans, D.E., Kawahigashi, Y.: Chiral structure of modular invariants for subfactors. Commun. Math. Phys. 210, 733–784 (2000)
9. Bisch, D., Jones, V.F.R.: Algebras associated to intermediate subfactors. Invent. Math. 128(1), 89–157 (1997)
10. Dixon, J.D., Mortimer, B.: Permutation Groups. Graduate Texts in Mathematics, 163. New York: Springer-Verlag, 1996
11. Doerk, K., Hawkes, T.: Finite Soluble Groups. Berlin-New York: W. de Gruyter, 1992
12. Feit, W.: An interval in the subgroup lattice of a finite group which is isomorphic to M7. Algebra Universalis 17(2), 220–221 (1983)
13. Goodman, F., de la Harpe, P., Jones, V.F.R.: Coxeter Graphs and Towers of Algebras. Mathematical Sciences Research Institute Publications 14, New York: Springer-Verlag, 1989
14. Grossman, P., Jones, V.F.R.: Intermediate subfactors with no extra structure. J. Amer. Math. Soc. 20(1), 219–265 (2007)
15. Grossman, P.: Forked Temperley-Lieb algebras and intermediate subfactors. J. Funct. Anal. 247(2), 477–491 (2007)
16. Guido, D., Longo, R.: Relativistic invariance and charge conjugation in quantum field theory. Commun. Math. Phys. 148, 521–551 (1992)
17. Guido, D., Longo, R.: The conformal spin and statistics theorem. Commun. Math. Phys. 181, 11–35 (1996)
18. Guralnick, R.M., Hoffman, C.: The first cohomology group and generation of simple groups. In: Groups and Geometries (Siena, 1996), Trends Math., Basel: Birkhäuser, 1998, pp. 81–89
19. Guralnick, R.M., Kantor, W., Kassabov, M., Lubotzky, A.: Presentations of finite simple groups: profinite and cohomological approaches. Groups Geom. Dyn. 1(4), 469–523 (2007)
20. Izumi, M., Longo, R., Popa, S.: A Galois correspondence for compact groups of automorphisms of von Neumann Algebras with a generalization to Kac algebras. J. Funct. Anal. 155, 25–63 (1998)
21. Jones, V.F.R.: Fusion en algebres de von Neumann et groupes de lacets (d’apres A. Wassermann). (French) [Fusion in von Neumann algebras and loop groups (after A. Wassermann)], Seminaire Bourbaki, Vol. 1994/95. Asterisque No. 237, Exp. No. 800, 5, 251–273, 1996
22. Jones, V.F.R.: Two subfactors and the algebraic decompositions of bimodules over II1 factors. Preprint available from http://math.berkeley.edu/~vfr/algebraic.pdf, 2008
23. Jones, V.F.R., Xu, F.: Intersections of finite families of finite index subfactors. Int. J. Math. 15(7), 717–733 (2004)
24. Kac, V.G.: Infinite Dimensional Lie Algebras, 3rd Edition, Cambridge: Cambridge University Press, 1990
25. Kac, V.G., Longo, R., Xu, F.: Solitons in affine and permutation orbifolds. Commun. Math. Phys. 253(3), 723–764 (2005)
26. Kawahigashi, Y., Longo, R.: Classification of two-dimensional local conformal nets with c < 1 and 2-cohomology vanishing for tensor categories. Commun. Math. Phys. 244(1), 63–97 (2004)
27. Liebeck, M.W., Pyber, L., Shalev, A.: On a conjecture of G. E. Wall. J. Algebra 317(1), 184–197 (2007)
28. Longo, R.: Index of subfactors and statistics of quantum fields. I. Commun. Math. Phys. 126, 217–247 (1989)
29. Longo, R.: Index of subfactors and statistics of quantum fields. II. Commun. Math. Phys. 130, 285–309 (1990)
30. Longo, R.: Minimal index and braided subfactors. J. Funct. Anal. 109, 98–112 (1992)
31. Longo, R.: Conformal subnets and intermediate subfactors. Commun. Math. Phys. 237(1-2), 7–30 (2003)
32. Longo, R., Rehren, K.-H.: Nets of subfactors. Rev. Math. Phys. 7, 567–597 (1995)
33. Pálfy, P.P.: Groups and Lattices. Groups St Andrews 2001, London Mathematical Society Lecture Note Series 305, C. M. Campbell, E. F. Robertson, G. C. Smith, eds., Oxford: London Math. Soc., 2001, pp. 429–454
34. Pimsner, M., Popa, S.: Entropy and index for subfactors. Ann. Scient. Ec. Norm. Sup. 19, 57–106 (1986)
35. Popa, S.: Correspondence. INCREST preprint, 56, 1986, available at http://www.math.uda.edu/~popa/popa-correspondances.pdf
36. Pressley, A., Segal, G.: Loop Groups. Oxford: Oxford University Press, 1986
37. Rehren, K.-H.: Braid group statistics and their superselection rules. In: The Algebraic Theory of Superselection Sectors, D. Kastler, ed., Singapore: World Scientific, 1990
38. Teruya, T., Watatani, Y.: Lattices of intermediate subfactors for type III factors. Arch. Math. (Basel) 68(6), 454–463 (1997)
39. Turaev, V.G.: Quantum Invariants of Knots and 3-Manifolds. Berlin-New York: Walter de Gruyter, 1994
40. Wall, G.E.: Some applications of the Eulerian functions of a finite group. J. Austral. Math. Soc. 2, 35–59 (1961/1962)
41. Watatani, Y.: Lattices of intermediate subfactors. J. Funct. Anal. 140(2), 312–334 (1996)
42. Wassermann, A.: Operator algebras and Conformal field theories III. Invent. Math. 133, 467–538 (1998)
43. Xu, F.: New braided endomorphisms from conformal inclusions. Commun. Math. Phys. 192, 347–403 (1998)
44. Xu, F.: On representing some lattices as lattices of intermediate subfactors of finite index. http://arXiv.org/abs/math/0703248v2 [math.OA], 2009, to appear in Advance in Mathematics
45. Xu, F.: 3-manifold invariants from cosets. J. Knot Theory and Ram. 14(1), 21–90 (2005)

Communicated by Y. Kawahigashi
Commun. Math. Phys. 298, 741–756 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1077-9
Communications in
Mathematical Physics
On Dirac Physical Measures for Transitive Flows
Radu Saghin1, Wenxiang Sun2, Edson Vargas1
1 Departamento de Matematica, IME-USP, Sao Paulo, Brasil. E-mail: [email protected]
2 School of Mathematical Sciences, PKU, Beijing, China
Received: 7 September 2009 / Accepted: 16 March 2010 Published online: 23 June 2010 – © Springer-Verlag 2010
Abstract: We discuss some examples of smooth transitive flows with physical measures supported at fixed points. We give some conditions under which stopping a flow at a point will create a Dirac physical measure at that indifferent fixed point. Using the Anosov-Katok method, we construct transitive flows on surfaces with the only ergodic invariant probabilities being Dirac measures at hyperbolic fixed points. When there is only one such point, the corresponding Dirac measure is necessarily the only physical measure with full basin of attraction. Using an example due to Hu and Young, we also construct a transitive flow on a three-dimensional compact manifold without boundary, with the only physical measure the average of two Dirac measures at two hyperbolic fixed points.
1. Introduction

Let $M$ be a compact manifold, and $f : M \to M$ a continuous map. A probability measure $\mu$ on $M$ is invariant under $f$ if for any measurable set $A \subset M$ we have $\mu(f^{-1}(A)) = \mu(A)$. The basin of attraction of $\mu$, denoted $B(\mu)$, is the set of points $x \in M$ such that
$$\lim_{n \to \infty} \frac{1}{n} \left( \delta_x + \delta_{f(x)} + \delta_{f^2(x)} + \cdots + \delta_{f^{n-1}(x)} \right) = \mu,$$
where $\delta_y$ is the Dirac measure at the point $y \in M$, and the limit is with respect to the weak* topology. This means that for any continuous observable $\alpha : M \to \mathbb{R}$, the average of $\alpha$ along the orbit of $x$ will converge to the integral of $\alpha$ with respect to $\mu$, which makes this notion useful in applications. We say that the invariant measure $\mu$ is physical if $B(\mu)$ has positive Lebesgue measure on $M$. One has similar definitions for the case of flows. If $\phi$ is a continuous flow on the compact manifold $M$, then a probability measure $\mu$ is invariant if $\mu(\phi_t(A)) = \mu(A)$,
for any $t \in \mathbb{R}$ and any measurable $A \subset M$. Denote by $m_t(x)$ the probability measure along the orbit of the flow of length $t$ starting at $x$, i.e.
\[ \int_M \alpha \, dm_t(x) = \frac{1}{t}\int_0^t \alpha(\phi_s(x))\,ds \]
for any $\alpha \in C^0(M,\mathbb{R})$. The basin of attraction $B(\mu)$ is in this case the set of points $x \in M$ such that
\[ \lim_{t\to\infty} m_t(x) = \mu. \]
The measure μ is physical if the basin of attraction again has positive Lebesgue measure. Physical measures are important in dynamics because they describe statistically the orbits of a significant set of points, and they can be detected experimentally. There is a large amount of work in the area of physical measures, especially when they are also absolutely continuous with respect to Lebesgue measure, or along unstable manifolds (the SRB measures). Such measures generally have very good statistical properties. However, in this paper we would like to turn our attention to situations where the physical measures are rather bad, for example when they are supported only on some fixed points. In order to avoid simple examples such as sinks, or the Bowen eye, we are interested in systems with some recurrence: we will require transitivity at least in the basins of attraction of the measures. In other words, even if a typical orbit travels everywhere on the manifold (or at least on some large subset), it will spend most of the time arbitrarily close to some given points.
The motivation comes from some well-known examples in one-dimensional dynamics. There are examples of non-invertible maps of the interval (or the circle) such that the only physical measure is supported on an indifferent fixed point. Suppose $f$ is a piecewise smooth expanding map of the interval (or the circle), with $f(x) = x + x^a + o(x^a)$ in a neighborhood of the origin, where $a > 1$, and the origin is the only indifferent fixed point of $f$. If $a < 2$, then $f$ has an absolutely continuous finite invariant measure, but if $a \ge 2$, then there is no absolutely continuous finite invariant measure, the Dirac measure at the origin is the only physical measure, and its basin of attraction has full Lebesgue measure (see [6]). There are also more unexpected examples of maps in the logistic family whose only physical measure is supported on a repelling fixed point.
In [3] it is shown that, in any full continuous family of S-unimodal maps, there are uncountably many parameters such that, for the corresponding maps, the Dirac measure at the positive repelling fixed point is the unique physical measure, and its basin of attraction has full Lebesgue measure. These maps are transitive (at least on some subset). These maps are also not invertible; this phenomenon cannot happen for invertible maps in dimension 1. We are interested in the existence of such examples in higher dimensions and for smooth invertible systems. The simplest systems to consider (and at the same time the most restrictive) are the flows on surfaces. Examples in higher dimensions and/or for maps should be easier to construct. In Sect. 2 we discuss some transitive flows on surfaces with physical measures supported on indifferent fixed points. The existence of such flows is not surprising: they can be easily obtained, for example, by stopping a transitive flow at a point. We give some conditions under which a reparametrization of a given transitive flow creates a physical measure at an indifferent fixed point.
In Sect. 3 we give examples of transitive flows on surfaces with the only physical measure supported on a hyperbolic fixed point. The construction of these flows is more difficult, and requires the construction of diffeomorphisms of the circle with 'bad' invariant measures; this is done in Sect. 4, and uses a method due to Anosov and Katok. Unfortunately the flows we obtain are either transitive only inside the basin of attraction of the Dirac measure at the fixed point, or the manifold has boundary. Finally, in the last section we discuss some other examples with hyperbolic fixed points. We construct flows on surfaces of genus two with only two ergodic invariant measures, supported at two hyperbolic fixed points, but we cannot decide whether a combination of them must be a physical measure, or whether there is no physical measure at all. We also give an example of a transitive flow on a three-dimensional compact manifold without boundary with the only physical measure being the average of two Dirac measures at two hyperbolic fixed points.
2. Indifferent Fixed Points
In this section we discuss 'singular' reparametrizations of flows which create physical measures supported on indifferent fixed points. Here by 'singular' reparametrizations we mean that we allow the creation of fixed points, so the new flow is not exactly equivalent to the initial one, but the orbits of the initial flow are unions of orbits of the new flow (possibly including fixed points). The following discussion is more general, but the example to have in mind is an irrational linear flow on the torus, modified such that it has a fixed point, while the trajectories are the same straight lines. Some of the following facts about time-changes can also be found in other papers like [5] or [8], even under more general conditions; however we include them here because they are simple and intuitive in the smooth case. Suppose we have a smooth flow φ on a compact manifold M.
Let $X$ be the vector field generating $\phi$, and $f : M \to [0,\infty)$ a smooth function. Let $Y = fX$ and let $\psi$ be the flow generated by $Y$. Let $Z = \{x \in M : f(x) = 0\}$ be the zero set of $f$. We would like to investigate whether the invariant measures of $\phi$ will 'survive' when we modify the flow to $\psi$, and how this depends on $f$. The function $f$ will induce a linear continuous operator $T : \mathcal{M}(M) \to \mathcal{M}(M)$ on the set of signed finite Borel measures on $M$ in the following way: given a finite measure $\nu$ on $M$, let $T\nu$ be the unique finite measure such that $f$ is the Radon-Nikodym derivative of $T\nu$ with respect to $\nu$. In other words we have $T\nu(A) = \int_A f\,d\nu$ for any measurable set $A$, or $\int_M g\,dT\nu = \int_M g f\,d\nu$ for any $g \in C^0(M,\mathbb{R})$. $T\nu$ is absolutely continuous with respect to $\nu$, and $\nu$ is supported on the zero set of $f$ iff $T\nu$ is the zero measure, so the kernel of the operator is the set of finite measures supported on $Z$. The image is the set of finite measures $\mu$ such that $\int_M \frac{1}{f}\,d|\mu| < \infty$. If $\mu$ satisfies this condition, then there is a 'distinguished' element in $T^{-1}(\mu)$: it is the measure $\nu$ with the property that $\nu|_Z = 0$; we will denote this as $S\mu$ (this is some kind of partial inverse, defined only on the image of $T$). This can also be defined by $S\mu(A) = \int_A \frac{1}{f}\,d\mu$, or $\int_M g\,dS\mu = \int_M \frac{g}{f}\,d\mu$. If $\nu$ is an invariant measure for $\psi$, then
\[ \int_M g\,d\nu = \int_M g\circ\psi_t\,d\nu, \quad \forall t \in \mathbb{R},\ \forall g \in C^0(M,\mathbb{R}), \]
and from here we get
\[ \lim_{t\to 0}\int_M \frac{1}{t}(g\circ\psi_t - g)\,d\nu = \int_M Y(g)\,d\nu = \int_M f X(g)\,d\nu = 0, \quad \forall g \in C^1(M,\mathbb{R}), \]
and then
\[ \int_M X(g)\,dT\nu = 0, \quad \forall g \in C^1(M,\mathbb{R}). \]
If we denote $I(t) = \int_M g\circ\phi_t\,dT\nu$, for some fixed $g \in C^1(M,\mathbb{R})$, then we have
\[ I'(t) = \int_M X(g\circ\phi_t)\,dT\nu = 0. \]
This means that $I(t)$ is constant for every $g \in C^1(M,\mathbb{R})$, or
\[ \int_M g\,dT\nu = \int_M g\circ\phi_t\,dT\nu, \quad \forall t \in \mathbb{R},\ \forall g \in C^1(M,\mathbb{R}). \]
But $C^1(M,\mathbb{R})$ is dense in $C^0(M,\mathbb{R})$ in the $C^0$ topology, so we get that $T\nu$ must be an invariant measure for $\phi$. In other words $T\mathcal{M}_\psi \subset \mathcal{M}_\phi$: $T$ sends invariant measures of $\psi$ to invariant measures of $\phi$. In a similar way one can prove that $S$ sends invariant measures of $\phi$ lying in the image of $T$ to invariant measures of $\psi$. If $\nu$ is an ergodic invariant probability for $\psi$, then $T\nu$ is either zero (if $\nu$ is supported on $Z$) or an ergodic invariant measure for $\phi$, and it can be made an ergodic probability after rescaling. Conversely, if $\mu$ from the image of $T$ is an ergodic probability for $\phi$, then $S(\mu)$ is an ergodic invariant measure for $\psi$ and it can be made a probability after rescaling. We also have $\mathrm{supp}(T(\nu)) = \mathrm{cl}(\mathrm{supp}(\nu) \setminus Z)$, and for $\mu$ in the image of $T$ we have $\mathrm{supp}(S(\mu)) = \mathrm{supp}(\mu)$.
Regarding the basins of attraction, let us denote first
\[ B_\psi(Z) = \{x \in M : m_{t,\psi}(x) \text{ has all the weak limit measures supported in } Z\}, \]
where $M_{t,\psi}(x)$ is the measure given by the piece of orbit of $\psi$ starting at $x$ and of length $t$, and $m_{t,\psi}(x)$ is the corresponding probability measure obtained by rescaling. Let $\mathcal{M}_\psi(x)$ be the limit set of $m_{t,\psi}(x)$ as $t$ tends to infinity. We use the same notations for $\phi$. For any $x \in M$ and $t \ge 0$ there exists $s(t) \ge 0$ such that $\phi_{s(t)}(x) = \psi_t(x)$ (this is because the speed of $\phi$ is greater than or equal to a constant times the speed of $\psi$; the other direction is not always true). Then
\[ \frac{d\psi_t(x)}{dt} = Y(\psi_t(x)) = f(\psi_t(x))X(\psi_t(x)) = \frac{d\phi_{s(t)}(x)}{dt} = s'(t)X(\psi_t(x)), \]
or $ds = f(\psi_t(x))\,dt$. Also
\[ TM_{t,\psi}(x)(g) = \int_0^t g(\psi_u(x)) f(\psi_u(x))\,du = \int_0^{s(t)} g(\phi_v(x))\,dv = M_{s(t),\phi}(x)(g), \]
for any $g \in C^1(M,\mathbb{R})$, by using the change of variables $v = s(u)$. So, because $C^1(M,\mathbb{R})$ is dense in $C^0(M,\mathbb{R})$, we have $TM_{t,\psi}(x) = M_{s(t),\phi}(x)$, or $Tm_{t,\psi}(x) = \frac{s(t)}{t}\, m_{s(t),\phi}(x)$. From this we can conclude that the measure $\nu$ is in $\mathcal{M}_\psi(x)$ iff $\mu = T\nu$ is a limit of $\frac{s(t)}{t}\, m_{s(t),\phi}(x)$. We get the following result regarding physical measures for $\phi$ and $\psi$.
Proposition 1. Let $\phi$ be the flow on the compact manifold $M$ generated by the smooth vector field $X$, and $\psi$ the flow generated by the vector field $fX$, where $f : M \to [0,\infty)$ is smooth. If $x \in M$ is in the basin of attraction of the invariant measure $\mu$ for $\phi$, and $\int_M \frac{1}{f}\,d\mu = \infty$, then $x \in B(Z)$. If the zero set of $f$ consists of a single point, $Z = \{p\}$, then $B(Z) = B(\delta_p)$, so the basin of attraction (w.r.t. $\psi$) of the Dirac measure at $p$ contains the basin of attraction of $\mu$ (w.r.t. $\phi$).
Proof. Assume that $x \in B(\mu)$ for some invariant measure $\mu$ of $\phi$; then we have $\lim_{s\to\infty} m_{s,\phi}(x) = \mu$. If $\int_M \frac{1}{f}\,d\mu = \infty$, then $\mu$ is not in the image of $T$. If $\frac{s(t)}{t}$ does not converge to $0$ as $t \to \infty$, then there exist $t_i \to \infty$ such that $\lim_{i\to\infty} \frac{s(t_i)}{t_i} = c > 0$, and, passing to a subsequence if necessary, there exists a measure $\nu$ such that $\nu = \lim_{i\to\infty} m_{t_i,\psi}(x)$. But then from the continuity of $T$ one gets that $T\nu = c\mu$, which is a contradiction because $\mu$ is not in the image of $T$. So we must have $\frac{s(t)}{t} \to 0$, and this in turn implies that $Tm_{t,\psi}(x) \to 0$, and again from the continuity of $T$ we get $x \in B(Z)$.
Obviously if Z = { p} then B(Z ) = B(δ p ) and the conclusion follows.
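As a hedged illustration of the hypothesis $\int_M \frac{1}{f}\,d\mu = \infty$ in Proposition 1, the following sketch treats a model case that is an assumption of ours and not part of the statement: $\mu$ is Lebesgue measure on a surface and $f(x) = |x|^a$ exactly on the unit disk around its only zero. In polar coordinates the integral of $1/f$ over the annulus $\varepsilon \le |x| \le 1$ equals $2\pi\int_\varepsilon^1 r^{1-a}\,dr$, which the hypothetical helper `truncated_integral` evaluates in closed form. The values stay bounded as $\varepsilon \to 0$ exactly when $a < 2$.

```python
import math

def truncated_integral(a: float, eps: float) -> float:
    """Integral of |x|^(-a) over the planar annulus eps <= |x| <= 1.

    Model assumption (not from the paper): f(x) = |x|^a near its zero and
    mu is Lebesgue measure, so in polar coordinates the integral is
    2*pi * int_eps^1 r^(1-a) dr.
    """
    if not 0.0 < eps <= 1.0:
        raise ValueError("eps must lie in (0, 1]")
    if a == 2.0:
        # Borderline case: logarithmic divergence.
        return 2.0 * math.pi * math.log(1.0 / eps)
    return 2.0 * math.pi * (1.0 - eps ** (2.0 - a)) / (2.0 - a)

if __name__ == "__main__":
    for a in (1.5, 2.0, 3.0):
        values = [truncated_integral(a, 10.0 ** (-k)) for k in (2, 4, 8)]
        print(a, [round(v, 3) for v in values])
```

For $a = 1.5$ the printed values approach $4\pi$ from below, while for $a = 2$ and $a = 3$ they grow without bound as the inner radius shrinks.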
As an example consider the case when $\phi_t$ is an irrational translation of the torus, generated by the vector field $X$. Then $\phi$ is minimal, transitive, not mixing, and uniquely ergodic, the Lebesgue measure being the unique invariant measure. Let $f$ be a function on the torus with a unique zero at the origin, such that around the origin $f(x) \sim |x|^a$, and let $\psi$ be the flow generated by $fX$. The new flow $\psi$ is not minimal anymore, but it is topologically mixing. If $a \ge 2$, then it is uniquely ergodic, the only invariant probability measure is $\delta_0$, and its basin of attraction is the whole manifold. If $a < 2$ then $\psi$ will have another ergodic probability measure $\nu$, supported on the entire manifold, with density proportional to $\frac{1}{f}$ with respect to Lebesgue measure. This new measure will have a basin of attraction with full Lebesgue measure (but not the entire manifold). We remark that when there is a global cross section to the flow $\phi$, one can consider the return map to the section, and then relate the invariant measures of the map with the ones of the flow with the help of the return time map to the transversal. For the modified flow $\psi$ the return map is the same, but the return time will have some singularities corresponding to the zeros of $f$, and in order to see whether invariant measures survive one has to look at the integral of the return time with respect to the invariant measures on the transversal. We will discuss this further in the next section. We should also mention here the example in [4] of a transitive map on the two-torus with the physical measure at a point (see Theorem 1). The fixed point in their example has a contracting direction and a weakly expanding (indifferent) one. The map cannot be made the time one map of a flow because it is not homotopic to the identity; however, we will use it in the last section to create flows in dimension three with the unique physical measure supported on hyperbolic fixed points.
3. Hyperbolic Fixed Points
In this section we will investigate the existence of transitive flows on surfaces with physical measures supported on hyperbolic fixed points. This turns out to be more difficult, and a reason for that is the fact that the hyperbolic fixed points create a singularity of the return time map of logarithmic type (unlike indifferent points, where it is usually power-like), and thus it is still integrable with respect to many invariant measures (the ones coming from a Hölder conjugacy with a circle rotation for example). Consequently
Fig. 1. Saddle-Node bifurcation
we will also need to create a map on the circle (the transversal) with a ‘bad’ invariant measure (or ‘good’ for our purpose) - this will be done in the next section. We are not aware of any other known example of this type. In [4] the fixed point which supports the physical measure is only weakly expanding (although it has a contracting direction). There is also an example in [2], but in that case the map is not transitive, the basin of attraction contains a wandering domain. Let f : T1 → T1 be a C ∞ diffeomorphism of the circle conjugated to an irrational rotation, i. e. transitive. Let φ be the suspension flow over f with constant roof function 1, defined on a torus T2 . We will consider T1 = R(mod Z), and we denote T = T1 × {0} ⊂ T2 . Then the Poincaré return function of the flow φ to the circle T will be φ1 |T = f . We will make an abuse of notation and consider that T = R(mod Z). One can also define directly the flow on the flat torus, using interpolation between the identity and f, because f is isotopic to the identity. We will make a saddle-node bifurcation to the suspension flow in the following way. By the Flow Box Theorem, there is a C ∞ chart on some small open domain U where the suspension flow is linear, let’s say it is the hamiltonian flow given by the hamiltonian h 0 (x, y) = x (the trajectories are vertical lines). We can modify this hamiltonian by adding a C ∞ bump function α supported on some smaller disk inside U , h 1 (x, y) = x + tα(x, y). If t > 0 is large enough, then h 1 will have two critical points, a saddle and a local maximum. Then the hamiltonian flow associated to h 1 will have a saddle with a homoclinic loop and a center, while it coincides with the initial flow outside the support of α. This can be seen in Fig. 1, where we show the graph of h 1 and some level curves, which represent trajectories of the new hamiltonian flow, together with the disk where the perturbation is supported. 
Let $\psi$ be the new $C^\infty$ flow, which coincides with $\phi$ outside $U$ and is constructed as above inside $U$. Let also $a$ be the hyperbolic point and $\gamma_0$ the homoclinic loop. Without loss of generality we can assume that $0$ belongs to the stable manifold of $a$, while $f(0) = \phi_1(0)$ belongs to the unstable manifold of $a$. We remark that the return map to the transversal $T$ is the same $f$ (considering that $0$ returns to $f(0)$), because the level curves of $h_1$ coincide with the ones of $h_0$ outside the support of $\alpha$. The return time will change, and there will be a singularity at $0$. This flow is not transitive: inside the homoclinic loop it has invariant circles. It is however transitive, even topologically mixing, outside of the loop. To see this let $A, B \subset T^2$ be two open sets, disjoint from the interior of the homoclinic loop. Then the stable manifold of $a$ intersects $A$ because it is dense
outside the homoclinic loop, and in the same way the unstable manifold of $a$ intersects $B$. Then $\psi_t(A)$ accumulates on the unstable manifold of $a$ when $t$ tends to infinity, so for $t$ sufficiently large it has to intersect $B$, q.e.d. Let $M$ be the surface with boundary and a corner obtained by removing the interior of the homoclinic loop. We will denote by $[x, y]$ (or $[x, y)$, $(x, y]$, $(x, y)$) the piece of the trajectory of the flow between $x$ and $y$ (possibly without one or both endpoints). In this case $x$ or $y$ can also be the saddle, in which case the trajectories will be pieces of the invariant manifolds. Let $\tau : T \to \mathbb{R}$ be the return time of $\psi$ to $T$. Because $\psi$ is a $C^\infty$ flow in dimension two, there exists a smooth conjugacy $h_L$ with the linearized flow $\psi_L$ from a neighborhood $V$ of the saddle $a$ to a square $S = [-\delta, \delta]^2$ in $\mathbb{R}^2$. Suppose that the linear flow maps $(u, v)$ to $(ue^{\lambda t}, ve^{-\lambda t})$ at time $t$, where $\lambda > 0$. If an orbit enters $S$ at some point $(u, \pm\delta)$ and leaves at a point $(\pm\delta, v)$, then the time spent in $S$ will be
\[ t(u) = \frac{\log\delta - \log|u|}{\lambda}. \]
Let $T_1 = \{h_L^{-1}(u, \delta),\ -\delta \le u \le \delta\}$ and $T_2 = \{h_L^{-1}(u, -\delta),\ -\delta \le u \le \delta\}$. Because $h_L$ is smooth, $T_1$ and $T_2$ are smooth curves in $M$ transversal to the flow, so the holonomy between $T$ and $T_1$ (and $T_2$) is bi-Lipschitz. From this, the fact that $h_L$ is smooth and hence also bi-Lipschitz, and the fact that the time spent outside $U$ by a trajectory going from $T$ back to itself is uniformly bounded from above, we can conclude that on a neighborhood of $0$ we have $\tau(x) \sim -\log|x|$. Here $u(x) \sim v(x)$ means that $\frac{u(x)}{v(x)}$ is uniformly bounded away from zero and infinity. Actually one can show that the two one-sided limits $\lim_{x\to 0^\pm} -\frac{\tau(x)}{\log|x|} \in (0, \infty)$ also exist; in this case they are different (a non-symmetric singularity). We assume that $f$ is conjugated to an irrational rotation $R_\rho$ by the homeomorphism $h$, i.e. $h \circ f = R_\rho \circ h$.
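The logarithmic passage time through the linearizing square can be checked directly. The sketch below is only an illustration under stated assumptions: it works in the linearized coordinates $(u, v)$ alone, ignoring the conjugacy $h_L$, and the function names `passage_time` and `exit_point` are hypothetical. It uses the closed-form solution $u(t) = u_0 e^{\lambda t}$, $v(t) = \delta e^{-\lambda t}$ to confirm that an orbit entering at $(u_0, \delta)$ leaves through the side $|u| = \delta$ after time $(\log\delta - \log|u_0|)/\lambda$.

```python
import math

def passage_time(u0: float, delta: float, lam: float) -> float:
    """Time spent in the square [-delta, delta]^2 by the orbit of the
    linear saddle (u, v) -> (u e^{lam t}, v e^{-lam t}) that enters at
    (u0, +-delta), for 0 < |u0| <= delta."""
    if u0 == 0.0:
        # Points on the stable manifold never leave the square.
        return math.inf
    return (math.log(delta) - math.log(abs(u0))) / lam

def exit_point(u0: float, delta: float, lam: float):
    """Position at the exit time for the orbit starting at (u0, delta),
    with u0 != 0."""
    t = passage_time(u0, delta, lam)
    return (u0 * math.exp(lam * t), delta * math.exp(-lam * t))
```

Dividing the entry coordinate $u_0$ by $e$ adds exactly $1/\lambda$ to the passage time, which is the logarithmic growth behind $\tau(x) \sim -\log|x|$.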
Then $f$ is uniquely ergodic, with the only invariant probability $\nu = h^*\mathrm{Leb}$, meaning that for any measurable subset $A$ of $T$ we have $\nu(A) = \mathrm{Leb}(h(A))$, where $\mathrm{Leb}$ is the Lebesgue measure on the circle. We have the following result.
Proposition 2. Let $T$, $M$, $f$, $\nu$, $\psi$ and $\tau$ be as before. Then we have the following dichotomy:
– if $\int_T -\log|x|\,d\nu = \infty$, then $\psi$ has only one ergodic probability measure, the Dirac measure at the fixed point $a$; consequently $\delta_a$ is the only physical measure for $\psi$, with the basin of attraction the whole manifold;
– if $\int_T -\log|x|\,d\nu < \infty$, then there exists also exactly one other ergodic probability measure $\mu$ of $\psi$, fully supported on $M$.
Here by $\int_T g(x)\,d\nu$ we mean $\int_{-1/2}^{1/2} g(x)\,d\nu'$, where $\nu' = i^*\nu$ and $i : [-\frac{1}{2}, \frac{1}{2}) \to T$ is the natural inclusion.
Proof. Let $\tilde\psi$ be the suspension semi-flow over $f$ with roof function $\tau$, defined on the non-compact manifold $N$. Let $g : N \to M$ be the map given by $g(\tilde\psi_t(x)) = \psi_t(x)$, $\forall t \ge 0$, $\forall x \in T$. Then $g$ is a diffeomorphism onto its image, which is $M \setminus (\gamma_0 \cup [a, f(0)))$, and from the definition it is a conjugacy between $\tilde\psi$ and $\psi$ restricted to this set. Let $\mu$ be an invariant probability measure (ipm) for $\psi$. Then $\mu(A) = 0$ for any wandering set $A$. It is easy to see that $\gamma_0$ and $(a, f(0))$ are a countable union of wandering segments, so their measure must be zero. The Dirac measure at $a$ is clearly an ipm for $\psi$. Suppose that $\mu(\{a\}) = 0$. Then $\tilde\mu = g^*(\mu)$ is an ipm for $\tilde\psi$. Because $\tilde\psi$ is a suspension semi-flow for $f$, for every ipm $\tilde\mu$ for $\tilde\psi$ there exists a unique invariant measure $\tilde\nu$ for $f$ such that $\tilde\mu = \tilde\nu \times \mathrm{Leb}$, meaning that for any measurable set $A \subset N$ we have
\[ \tilde\mu(A) = \int_T \mathrm{Leb}\big(A \cap [x, \tilde\psi_{\tau(x)}(x)]\big)\,d\tilde\nu(x). \]
Here $\mathrm{Leb}$ is the Lebesgue measure on $[x, \tilde\psi_{\tau(x)}(x)] \cong [0, \tau(x)]$. But $f$ is uniquely ergodic, so $\tilde\nu$ must be a multiple of $\nu$. From this we see that there exists an ipm for $\tilde\psi$ if and only if
\[ \int_T \tau(x)\,d\nu \sim \int_T -\log|x|\,d\nu < \infty. \]
In conclusion, if $\int_T -\log|x|\,d\nu = \infty$, then there is no invariant probability for $\tilde\psi$, so the only invariant probability for $\psi$ is $\delta_a$. In this case $\delta_a$ must also be the only physical measure of $\psi$, with the basin of attraction the whole manifold $M$, because $\delta_a$ is the only possible limit of a sequence of probabilities supported on a sequence of pieces of orbits of increasing length. If $\int_T -\log|x|\,d\nu < \infty$, then $\tilde\psi$ will have exactly one ipm, the rescaling of $\nu \times \mathrm{Leb}$; then $\psi$ will have two ergodic ipm's, $\delta_a$ and $\mu$, which is the rescaling of $g_*(\nu \times \mathrm{Leb})$. Obviously $\mu$ is fully supported in $M$, because $\nu$ is fully supported on $T$.
We remark that if the rotation number $\rho$ of $f$ is Diophantine, and $f$ is $C^\infty$, then the conjugacy $h$ is smooth, so $\nu = h'\,\mathrm{Leb}$, and consequently
\[ \int_T \tau(x)\,d\nu \sim \int_T -\log|x|\,d\nu = \int_T -h'(x)\log|x|\,dx < \infty. \]
In this case we are in the second situation of the proposition above: the flow $\psi$ has two ergodic ipm's, $\delta_a$ and $\mu$, which is fully supported. Actually we will have the same situation whenever the conjugacy $h$ between $f$ and the rigid rotation is only Hölder continuous, because $x^{\alpha-1}\log|x|$ is integrable if $\alpha > 0$. If the rotation number of $f$ is Liouville, we have two possibilities. If the conjugacy is nice enough, then we have again the two ergodic ipm's. But it is also possible that the conjugacy is bad, and for the ipm $\nu$ for $f$ the integral $\int_T \tau(x)\,d\nu$ is divergent, and in this case $\psi$ has only one ipm, the Dirac measure at $a$, and this is the physical measure for $\psi$. An example is given in the next section.
4. A $C^\infty$ Diffeomorphism of the Circle with 'Bad' Invariant Measure
In this section we will prove the following result.
Proposition 3. There exists a $C^\infty$ diffeomorphism $f$ of the circle, with an irrational rotation number and a unique invariant measure $\nu$, such that
\[ \int_T -\log|x|\,d\nu = \infty. \]
Proof. The construction of the diffeomorphism uses a method due to Anosov and Katok (see [1]). We construct a sequence of diffeomorphisms $f_n$ conjugated to the rational rotations $R_{\rho_n}$ by the diffeomorphisms $h_n$. Then $\rho_n$ will converge to the irrational number $\rho$, $h_n$ will converge to the homeomorphism $h$ in the $C^0$ topology, and $f_n$ will converge to the diffeomorphism $f$ in the $C^\infty$ topology. The construction is done by induction: $h_{n+1}$ is chosen such that
\[ h_n^{-1} \circ R_{\rho_n} \circ h_n = h_{n+1}^{-1} \circ R_{\rho_n} \circ h_{n+1}, \]
and then $\rho_{n+1}$ is chosen such that
\[ \|f_n - f_{n+1}\|_{C^n},\ \|f_n^{-1} - f_{n+1}^{-1}\|_{C^n} < 2^{-n} \]
(here the choice of $h_{n+1}$ plays an important role). This condition implies that $f_n$ is convergent in the $C^\infty$ topology. Now one has the freedom to choose the sequence $h_n$ to converge in the $C^0$ topology to a homeomorphism $h$ which has the property that it maps a sequence of small intervals near $0$ to some relatively large intervals. Thus the $\nu$-measure of these intervals is very large compared with their Lebesgue size, and this will imply that the integral is divergent.
Let $\rho_1 = 0$, $h_1 = f_1 = Id$. Now suppose that $\rho_n = \frac{p_n}{q_n}$ and the diffeomorphism $h_n$ of $T$ are fixed. We will construct $h_{n+1}$ of the form $h_{n+1} = A_n \circ h_n$, where $A_n$ is a diffeomorphism of $T$ of the form $A_n = Id + a_n$, for some $C^\infty$ map $a_n : T \to \mathbb{R}$ which is periodic with period $\frac{1}{q_n}$. Then
\[ R_{\rho_n}(A_n(x)) = (x + a_n(x)) + \rho_n = (x + \rho_n) + a_n(x + \rho_n) = A_n(R_{\rho_n}(x)), \quad \forall x \in T, \]
so $R_{\rho_n}$ and $A_n$ commute. This implies that
\[ h_{n+1}^{-1} \circ R_{\rho_n} \circ h_{n+1} = h_n^{-1} \circ A_n^{-1} \circ R_{\rho_n} \circ A_n \circ h_n = h_n^{-1} \circ R_{\rho_n} \circ h_n = f_n. \]
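The commutation of $A_n$ with $R_{\rho_n}$ can be illustrated numerically. The sketch below is not the perturbation used in the proof: it takes the hypothetical choice $a(x) = \varepsilon \sin(2\pi q x)/(2\pi q)$, which is $\frac{1}{q}$-periodic with $a' > -1$ for $|\varepsilon| < 1$, works with lifts to the real line, and measures the largest value of $|A(R(x)) - R(A(x))|$ on a grid. All function names are illustrative assumptions.

```python
import math

def make_A(q: int, eps: float = 0.5):
    """Lift of A = Id + a to the real line, with the hypothetical choice
    a(x) = eps * sin(2*pi*q*x) / (2*pi*q). This a is 1/q-periodic and
    a'(x) = eps * cos(2*pi*q*x) > -1 whenever |eps| < 1."""
    def A(x: float) -> float:
        return x + eps * math.sin(2.0 * math.pi * q * x) / (2.0 * math.pi * q)
    return A

def make_rotation(p: int, q: int):
    """Lift of the rational rotation R_{p/q}: x -> x + p/q."""
    def R(x: float) -> float:
        return x + p / q
    return R

def max_commutator_defect(A, R, samples: int = 1000) -> float:
    """Largest |A(R(x)) - R(A(x))| over the grid x = k/samples in [0, 1)."""
    return max(abs(A(R(k / samples)) - R(A(k / samples)))
               for k in range(samples))
```

With `A = make_A(7)`, the defect against `make_rotation(3, 7)` vanishes up to rounding error, whereas against `make_rotation(1, 5)` it is of the order of the amplitude of $a$, because $\frac{1}{5}$ is not a multiple of the period $\frac{1}{7}$.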
We choose the $C^\infty$ function $a_n$ such that it satisfies the following conditions:
– $a_n$ is periodic with period $\frac{1}{q_n}$;
– $a_n(x) = c_n x - x$ for $x \in [0, \frac{1}{2c_n q_n}]$;
– $\|a_n\|_{C^0} < \frac{1}{q_n}$;
– $a_n' > -1$.
The first condition implies that $A_n$ and $R_{\rho_n}$ commute, a fact needed in order to prove that $f_n$ is convergent in the $C^\infty$ topology. The second condition (together with the right choice of the constants $c_n$) is used to prove that $f$ will have a 'bad' invariant measure. The third condition is used to prove that $h_n$ converges in the $C^0$ topology, and the fourth condition says that the $h_n$ are strictly increasing, a fact used to deduce that $h$ is a homeomorphism. The sequence of numbers $c_n > 1$ will be chosen later. Then $A_n$ is a $C^\infty$ diffeomorphism of $T$ which commutes with $R_{\rho_n}$, and $A_n(x) = c_n x$ for $x \in [0, \frac{1}{2c_n q_n}]$. We will use the following result (Lemma 3.2 in [7]).
Lemma 1. There exists $C_n > 0$ (depending on the natural number $n$), such that if $g$ is a $C^{n+1}$ diffeomorphism of the circle and $t_1$ and $t_2$ are two real numbers in $[0, 1)$, then
\[ \|g^{-1} \circ R_{t_1} \circ g - g^{-1} \circ R_{t_2} \circ g\|_{C^n} \le C_n \max\{\|g\|_{C^{n+1}}^{n+1}, \|g^{-1}\|_{C^{n+1}}^{n+1}\}\,|t_1 - t_2|. \]
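Now we choose the rational number $\rho_{n+1} = \frac{p_{n+1}}{q_{n+1}}$ such that
– $q_{n+1} > 2^{n+1}$;
– $q_{n+1}$ is a multiple of $2q_n$;
– $|\rho_{n+1} - \rho_n| < \frac{1}{2^n q_n}$ and $q_{n+1} > q_n$;
– $|\rho_{n+1} - \rho_n| < \frac{1}{C_n \max\{\|h_{n+1}\|_{C^{n+1}}^{n+1}, \|h_{n+1}^{-1}\|_{C^{n+1}}^{n+1}\}}\, 2^{-n}$.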
The first condition is again used to prove that $h_n$ is convergent in the $C^0$ topology. The second condition says that $h_n$ will preserve some small intervals for all large enough $n$, a fact used to prove that the invariant measure of $f$ is 'bad'. The third condition will imply that $\rho$ is irrational, while the fourth condition is used again to prove that $f_n$ converges to $f$ in the $C^\infty$ topology.
Step 1. $f_n$ converges in the $C^\infty$ topology to the $C^\infty$ diffeomorphism $f$. For any $n \ge 1$ we have
\[ \|f_{n+1} - f_n\|_{C^n} = \|h_{n+1}^{-1} \circ R_{\rho_{n+1}} \circ h_{n+1} - h_{n+1}^{-1} \circ R_{\rho_n} \circ h_{n+1}\|_{C^n} < 2^{-n}, \]
then for any fixed $m \ge 1$ and any $n_1 > n_2 > n \ge m$ we have
\[ \|f_{n_1} - f_{n_2}\|_{C^m} \le \|f_{n_1} - f_{n_1-1}\|_{C^m} + \cdots + \|f_{n_2+1} - f_{n_2}\|_{C^m} \le \|f_{n_1} - f_{n_1-1}\|_{C^{n_1-1}} + \cdots + \|f_{n_2+1} - f_{n_2}\|_{C^{n_2}} < 2^{-n_1+1} + \cdots + 2^{-n_2} < 2^{-n_2+1} \le 2^{-n}, \]
so for any $m \ge 1$ the sequence $f_n$ is Cauchy in the $C^m$ topology, which means that $f_n$ is convergent in the $C^\infty$ topology to some $C^\infty$ map $f$. A similar argument can be made for the sequence $f_n^{-1}$, which again must converge in the $C^\infty$ topology to the $C^\infty$ map $f^{-1}$, so $f$ must be a $C^\infty$ diffeomorphism.
Step 2. $\rho$ is irrational. Obviously the sequence $\rho_n$ must be convergent to some real number $\rho$ in $[0, 1]$. We recall that in the choice of the sequence $\rho_n$ we required that $q_{n+1} > q_n$ and $|\rho_{n+1} - \rho_n| < \frac{1}{2^n q_n}$. If we suppose that $\rho = \frac{p}{q}$ is rational, we obtain
\[ \frac{1}{q q_n} \le |\rho - \rho_n| \le \sum_{k=n}^{\infty} |\rho_{k+1} - \rho_k| < \sum_{k=n}^{\infty} \frac{1}{2^k q_k} < \frac{1}{2^{n-1} q_n}, \]
or $q > 2^{n-1}$ for any natural number $n$, which is a contradiction. Consequently $\rho$ must be an irrational number (actually it will be a Liouville number).
Step 3. $h_n$ converges in the $C^0$ topology to the homeomorphism $h$. We already know that $R_{\rho_n}$ converges in the $C^\infty$ topology to the irrational rotation $R_\rho$. We also have
\[ \|h_{n+1} - h_n\|_{C^0} = \|A_n \circ h_n - h_n\|_{C^0} = \|A_n - Id\|_{C^0} = \|a_n\|_{C^0} < 2^{-n}, \]
from the choice of $a_n$ and $\rho_n$, which implies that $h_n$ is convergent in the $C^0$ topology to the map $h$. Then $h$ must be a semi-conjugacy between $f$ and $R_\rho$, and because the $h_n$ are all increasing homeomorphisms, $h$ must also be increasing. If $h$ is not a homeomorphism, then it must map an interval to a point, and consequently we would get a wandering interval for $f$, but this would contradict the Denjoy Theorem because $f$ is a $C^\infty$ diffeomorphism and the rotation number is irrational. So $h$ is a homeomorphism and $f$ is then conjugated by $h$ to the irrational rotation $R_\rho$.
Step 4. $\int_T -\log(x)\,d\nu = \infty$. Now we will choose the sequence $c_n$ in order to obtain the desired conclusion. Let $I_n = [0, \frac{1}{2q_n}]$ and $J_n = h_{n+1}^{-1}(I_n)$. If we denote $d_n = \prod_{i=1}^{n} c_i$, then, from the definition of $a_n$ (and $h_{n+1}$), we have $J_n = [0, \frac{1}{2q_n d_n}]$. Because in the choice of the sequence $\rho_n$ we required that $q_{n+1}$ is a multiple of $2q_n$ for all $n > 0$, we obtain that for any $m > n > 0$ we have $h_m(J_n) = I_n$ (this is because for $m > n$ the perturbations $a_m$ are periodic with period $\frac{1}{2q_n}$, and then $A_m$ will leave invariant the interval $I_n$). By taking the limit we get that $h(J_n) = I_n$.
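Let $\nu$ be the invariant measure for $f$. Then
\[ \int_{J_n} -\log(x)\,d\nu \ge -\nu(J_n)\log\frac{1}{2q_n d_n} = \mathrm{Leb}(I_n)\log(2q_n d_n) = \frac{\log(2q_n d_n)}{2q_n}. \]
Now it is enough to choose the sequence $c_n$ such that the sequence $\frac{\log(2q_n d_n)}{2q_n}$ does not converge to zero as $n$ tends to infinity. Indeed, the intervals $J_n$ shrink to the point $0$ and $\nu$ has no atoms, so if the integral $\int_T -\log(x)\,d\nu$ were finite, the integrals over $J_n$ would tend to zero; hence the integral is divergent. Such a sequence is for example $c_n = 2^{q_n}$, so we get $d_n = 2^{\sum_{i=1}^{n} q_i}$, and then
\[ \frac{\log(2q_n d_n)}{2q_n} = \frac{\log(2q_n)}{2q_n} + \frac{\sum_{i=1}^{n} q_i}{2q_n}\log 2 \ge \frac{\log 2}{2}, \]
which is clearly not convergent to zero.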
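The lower bound in Step 4 can be tabulated for a concrete sequence of denominators. The sketch below is only illustrative: the helper `next_q` is a hypothetical choice that enforces just the two arithmetic conditions on $q_{n+1}$ (a multiple of $2q_n$ exceeding $2^{n+1}$) and ignores the closeness conditions on $\rho_{n+1}$, which involve the constants $C_n$ and the norms of $h_{n+1}$. It uses $\log d_n = (\sum_{i \le n} q_i)\log 2$ for $c_n = 2^{q_n}$, so the huge numbers $d_n$ are never formed.

```python
import math

def next_q(q_n: int, n: int) -> int:
    """Smallest multiple of 2*q_n that is strictly larger than 2**(n+1).

    Hypothetical choice: only the arithmetic conditions on q_{n+1} are
    modelled; the closeness conditions on rho_{n+1} are ignored.
    """
    step = 2 * q_n
    return step * (2 ** (n + 1) // step + 1)

def lower_bounds(n_max: int):
    """Values log(2 q_n d_n) / (2 q_n) for n = 1..n_max, with
    c_n = 2**q_n and d_n = c_1 * ... * c_n."""
    q, total, out = 1, 0, []      # q_1 = 1 because rho_1 = 0 = 0/1
    for n in range(1, n_max + 1):
        total += q                # log2(d_n) = q_1 + ... + q_n
        out.append((math.log(2 * q) + total * math.log(2)) / (2 * q))
        q = next_q(q, n)
    return out

if __name__ == "__main__":
    for n, value in enumerate(lower_bounds(8), start=1):
        print(n, round(value, 4))
```

Since $\sum_{i \le n} q_i \ge q_n$, every value produced is at least $\frac{\log 2}{2}$, in line with the estimate at the end of the proof.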
5. Other Examples and Remarks
We constructed in the previous section a transitive flow with the only physical measure supported at a hyperbolic fixed point, but the inconvenience of that example is that the flow is defined on a manifold with boundary and with a corner. An interesting question to ask is whether the presence of this phenomenon (physical measures at hyperbolic points, or even sets, for transitive flows) would impose some restrictions on the topology of the support of the measure and/or its basin of attraction. We believe that in higher dimensions there are no such restrictions, but in lower dimensions it is unclear. In this section we will present two more related examples, this time with transitive flows on manifolds without boundary. The first example is a transitive flow $\psi$ on a surface $M$ of genus two with two hyperbolic fixed points. Let $g : [-1, 1] \to \mathbb{R}$ be a continuous, even, strictly convex function, $C^\infty$ on $(-1, 1)$, such that $g(0) = 1$ is a minimum, $g(-1) = g(1) = 2$, and all the derivatives of the (local) inverse of $g$ at $1$ and $-1$ are zero. Rotate the graph $\{(x, g(x)),\ -1 \le x \le 1\}$ around the $x$-axis to obtain a surface of revolution $S$. Let $R_1$ and $D_1$ be the square, respectively the disk, inside the plane $x = -1$, centered at $(-1, 0, 0)$ and with the side equal to 8, respectively the radius equal to 2. Define similarly $R_2$ and $D_2$ for $x = 1$. Let $N$ be the $C^\infty$ surface $S \cup (R_1 \setminus D_1) \cup (R_2 \setminus D_2)$. On $N$ we consider the gradient semi-flow given by the height function $z$. This semi-flow will have two saddles $a = (0, 0, -1)$ and $b = (0, 0, 1)$, connected by two heteroclinic loops. Next identify the lateral sides of $R_1$ (and $R_2$) corresponding to $x = -1$, $y = -4$ and $x = -1$, $y = 4$ (respectively $x = 1$, $y = -4$ and $x = 1$, $y = 4$) to obtain the surface in Fig. 2, Part a.
Then identify C2 with C3 using a rotation by π (c goes to d), in order to create a heteroclinic loop from b to a, and identify C4 and C1 using again a rotation by π composed with the C ∞ circle homeomorphism f (e goes to f (0)). In this way we obtain a C ∞ flow ψ on a surface of genus two M. This flow is also shown in Fig. 2, Part b, with the two circles identified. The Poincaré return map to T = C1 is f , with the convention that the return point of 0 is f (0), with the return time being infinity. The return time τ will have again a logarithmic singularity at zero, this time symmetric. The flow is topologically mixing on the whole manifold without boundary M, and the forward orbit of any point which does not belong to the heteroclinic connections or the stable manifold of a is dense on M, while the backward orbit of any point which does not belong to the heteroclinic connections or the unstable manifold of b is dense in M.
Fig. 2. Flow with two saddles (Parts a and b)
If $\nu$ is the invariant measure for $f$, and $\int_T -\log|x|\,d\nu = \infty$, then from Proposition 2 we can conclude that $\psi$ has only two ergodic ipm's, the Dirac measures at $a$ and $b$. However we cannot conclude that a combination of them is a physical measure for the system. A priori it is possible that there is no physical measure at all, meaning that for Lebesgue almost every $x \in M$ the sequence $m_{t,\psi}(x)$ is not convergent; the weak limits may form an interval of measures between $\delta_a$ and $\frac{1}{2}(\delta_a + \delta_b)$. In conclusion we have the following result.
Proposition 4. There exists a transitive flow on a surface of genus two with two hyperbolic fixed points, whose only ergodic probability measures are the Dirac measures at these two points.
One can also create similar examples with hyperbolic points which are not conservative (the eigenvalues are not symmetric); however this will also modify the return map to the transversal: it will create critical points, with the criticality depending on the respective eigenvalues. Also one can create similar examples with more hyperbolic fixed points and/or fewer homoclinic connections, and in this case the return map to the transversal becomes a piecewise monotone piecewise continuous map. Unfortunately we know less about the ergodic theory of critical circle maps and piecewise monotone piecewise continuous maps.
The second example of this section is a three-dimensional version of the previous one. In this case we can construct a transitive flow on a three-dimensional compact manifold without boundary such that it has two hyperbolic fixed points, and this time the average of the two Dirac measures at the two hyperbolic fixed points is indeed the only physical measure, and the basin of attraction has full Lebesgue measure on the manifold. Let $f : T^2 \to T^2$ be a $C^\infty$ diffeomorphism of the 2-torus with the following properties:
1. $f$ has a fixed point $p$, i.e. $f(p) = p$;
2. There exists a dominated splitting $TT^2 = E^s \oplus E^u$ for $f$ such that:
– $\|Df|_{E^s}\| < 1$;
– $\|Df|_{E^u_x}\| > 1$ if $x \ne p$, and $\|Df|_{E^u_p}\| = 1$;
3. $f$ is transitive.
Such a diffeomorphism can be obtained by deforming a linear Anosov automorphism of the torus, in order to make the expanding eigenvalue at the origin equal to 1. The following result can be found in [4].

Theorem 1. The diffeomorphism $f$ with the properties listed above has one unique physical measure supported on the fixed point $p$. The basin of attraction of the Dirac measure at $p$ has full Lebesgue measure on $\mathbb{T}^2$: for almost every $x \in \mathbb{T}^2$,
\[ \lim_{n\to\infty} \frac{1}{n}\sum_{i=0}^{n-1} \delta_{f^i(x)} = \delta_p, \]
where the convergence is with respect to the weak* topology.

We have the following result.

Proposition 5. There exists a transitive flow $\psi$ on a compact three-dimensional manifold $M$, which has two hyperbolic fixed points $a$ and $b$, and a unique physical measure equal to $\frac{1}{2}(\delta_a + \delta_b)$, with the basin of attraction having full Lebesgue measure on $M$.

Proof. Step 1. The construction of the flow. The construction is a three-dimensional version of the previous one, using now as the return map the function $f$ introduced above, with the physical measure being the Dirac measure at the fixed point $p$. Let $g$ be the function used in the previous example, and let
\[ S = \{(x, y, z, u) \in \mathbb{R}^4 : -1 \le x \le 1,\ y^2 + z^2 + u^2 = g^2(x)\}. \]
Let $R_1$ and $D_1$ be the three-dimensional solid cube, respectively solid sphere, inside the hyperplane $x = -1$, centered at $(-1, 0, 0, 0)$, and with the side equal to 8, respectively the radius equal to 2, and let $R_2$ and $D_2$ be defined similarly for $x = 1$. Let $N = S \cup (R_1 \setminus D_1) \cup (R_2 \setminus D_2)$ and let $\varphi$ be the gradient flow given by the function $u$. There are again two saddles $a$ and $b$, and a heteroclinic sphere connecting them. Make similar identifications as in the previous example, using the $f$ from the theorem above, such that in the end we obtain a $C^\infty$ transitive flow $\psi$ on a $C^\infty$ compact manifold without boundary $M$ (this time $T = C_1$ becomes a two-torus $\mathbb{T}^2$, and the circle $C_0$ becomes a heteroclinic two-dimensional sphere). The return map to the transversal $\mathbb{T}^2$ is $f$, with the convention that $p$ returns to $f(p)$, and the return time of the flow is $\tau : \mathbb{T}^2 \to \mathbb{R}$, which has again a logarithmic singularity at $p$, or $\tau(x) \sim -\log(d(x, p))$ in a neighborhood of $p$, where $d(\cdot,\cdot)$ is the standard distance in $\mathbb{T}^2$. From the symmetry of the construction we can conclude that the time spent by a trajectory going from $\mathbb{T}^2$ to itself near $a$ will be equal to the time spent near $b$.

Step 2.
Almost all trajectories spend most of the time near $a$ and $b$. This fact can be proved using the techniques from Sect. 3, but there is also a direct proof as follows. Let $A$ be the set of points in the base $\mathbb{T}^2$ which are generic with respect to the invariant measure $\delta_p$. This set is invariant with respect to $f$ and has full Lebesgue measure. Let $B = \cup_{t\in\mathbb{R}} \psi_t(A)$; this set will have again full Lebesgue measure in $M$. We will prove that for every $x \in B$ and every neighborhood $U$ of $\{a, b\}$, we have
\[ \lim_{t\to\infty} \frac{1}{t}\int_0^t \chi_U(\psi_s(x))\,ds = 1, \tag{1} \]
where $\chi_U$ is the indicator function of $U$. This will imply that all the possible physical measures are of the form $u\delta_a + (1-u)\delta_b$, for some $u \in [0, 1]$.

The idea of the proof is simple: the iterates of $x$ under $f$ tend to accumulate close to $p$, while trajectories starting closer to $p$ spend more and more time close to $a$ and $b$. The speed of the flow is bounded away from zero outside $U$, $M$ is compact, and the length of the trajectories of the flow from $y$ to $\psi_{\tau(y)}(y)$ is uniformly bounded for every $y \in \mathbb{T}^2 \setminus \{p\}$, so $\int_0^{\tau(y)} \chi_{M\setminus U}(\psi_s(y))\,ds < L$ for some fixed positive number $L$ and every $y \in \mathbb{T}^2 \setminus \{p\}$ (this can be extended for $p$ too). Let $\varepsilon > 0$ be fixed. There exists a neighborhood $V$ of $p$ such that for every $y \in V$ we have $\tau(y) > \frac{4L}{\varepsilon}$. Because $x$ is in the basin of $\delta_p$, there exists $n_0$ such that for any $n \ge n_0$ we have $\frac{1}{n}\,\mathrm{Card}\{k : 0 \le k < n,\ f^k(x) \in V\} > \frac{1}{2}$. Let $t_0 = \sum_{k=0}^{n_0-1} \tau(f^k(x))$. For any $t > t_0$, there exist $n \ge n_0$ and $0 \le t' < \tau(f^n(x))$ such that $t = \sum_{k=0}^{n-1} \tau(f^k(x)) + t'$. Then
\[ \int_0^t \chi_U(\psi_s(x))\,ds = t - \int_0^t \chi_{M\setminus U}(\psi_s(x))\,ds \]
\[ = t - \sum_{k=0}^{n-1}\int_0^{\tau(f^k(x))} \chi_{M\setminus U}(\psi_s(f^k(x)))\,ds - \int_0^{t'} \chi_{M\setminus U}(\psi_s(f^n(x)))\,ds \]
\[ > t - (n+1)L, \]
so
\[ \frac{1}{t}\int_0^t \chi_U(\psi_s(x))\,ds > 1 - \frac{(n+1)L}{t}. \]
But
\[ t = \sum_{k=0}^{n-1} \tau(f^k(x)) + t' \ \ge \sum_{0 \le k < n,\ f^k(x)\in V} \tau(f^k(x)) \ \ge\ \frac{2nL}{\varepsilon} \ \ge\ \frac{(n+1)L}{\varepsilon}, \]
so $\frac{(n+1)L}{t} < \varepsilon$. Consequently
\[ \frac{1}{t}\int_0^t \chi_U(\psi_s(x))\,ds > 1 - \varepsilon, \qquad \forall t > t_0, \]
which proves formula (1).

Step 3. $\frac{1}{2}(\delta_a + \delta_b)$ is the only physical measure. Let $m_t$ be the probability measure given by the piece of trajectory $\psi_s(x)$ for $0 \le s \le t$, i.e. for any continuous function $\alpha$ on $M$ we have $\int_M \alpha\,dm_t = \frac{1}{t}\int_0^t \alpha(\psi_s(x))\,ds$. We proved that if $x \in B$, then every weak limit measure of $m_t$ as $t$ tends to infinity must be in the span of $\{\delta_a, \delta_b\}$. Now we will prove that the only limit is $\frac{1}{2}(\delta_a + \delta_b)$. The rough idea of the proof is that the orbits cannot jump suddenly very close to $p$ (or $a$ and $b$), but they approach it gradually, at a maximal exponential rate given by the derivative of $f$.

We start with the remark that when the trajectory of $x$ comes near $p$, then it will come close to $a$, $b$, $a$ and $b$ again and then go to $f(x)$. By the symmetry of the construction, we have that $\frac{1}{2}(\delta_a + \delta_b)$ must be a limit measure, and if there is another limit measure then it should be of the type $u\delta_a + (1-u)\delta_b$, for some $\frac{1}{2} \le u \le 1$ (this is because the trajectories come first to $a$). Let $t_n = \sum_{k=0}^{n-1} \tau(f^k(x))$; then for $t_n \le t < t_{n+1}$ we have $t = t_n + t'$ with $0 \le t' < \tau(f^n(x))$. Let $M_t = t\,m_t$, let $\tilde m$ be the corresponding probability measure given by the piece of trajectory $\psi_s(f^n(x))$ for $0 \le s \le t'$, and let $\tilde M = t'\tilde m$. Then
\[ m_t = \frac{M_t}{t} = \frac{M_{t_n} + \tilde M}{t_n + t'} = \frac{t_n}{t_n + t'}\,m_{t_n} + \frac{t'}{t_n + t'}\,\tilde m. \]
As we remarked before, $m_{t_n}$ will converge to $\frac{1}{2}(\delta_a + \delta_b)$, so if $\frac{t'}{t_n}$ converges to zero then we get the desired conclusion. Consequently it is enough to prove that
\[ \lim_{n\to\infty} \frac{\tau(f^n(x))}{t_n} = 0. \tag{2} \]
By rescaling the metric if necessary we can assume that d(x, p) > 1. Let $d(f^n(x), p) = d$, $\lambda = \sup_{y\in\mathbb{T}^2} \|Df_y^{-1}\| > 1$, let $A$ be a neighborhood of $p$ which contains a ball of radius $\frac{1}{2}$ in T, and $C > 0$ such that
\[ \frac{1}{C} \le \frac{\tau(y)}{-\log(d(y, p))} \le C \quad \text{for all } y \in A. \]
We have that $d(f^{n-k}(x), p) \le d\lambda^k$ for all $0 \le k \le n$. We remark that $\lambda^k d \le \frac{1}{2}$ is equivalent to $k \le -\frac{\log 2d}{\log\lambda}$; this means that if $k \le -\frac{\log 2d}{\log\lambda}$ then $f^{n-k}(x)$ belongs to $A$. Let $n_0$ be the integer part of $-\frac{\log 2d}{\log\lambda}$. For $d < \frac{1}{2\lambda}$ we have $n_0 \ge 1$ and $f^{n-k}(x) \in A$ for $0 \le k \le n_0$, and then
\[ \tau(f^n(x)) \le -C\log d, \]
\[ t_n \ge \sum_{k=1}^{n_0} \tau(f^{n-k}(x)) \ge -\frac{1}{C}\sum_{k=1}^{n_0} \log(d\lambda^k) = -\frac{n_0\log d}{C} - \frac{n_0(n_0+1)\log\lambda}{2C}. \]
By using the fact that $-\frac{\log 2d}{\log\lambda} - 1 < n_0 \le -\frac{\log 2d}{\log\lambda}$ and plugging in the above inequality we get $t_n \ge \alpha(\log d)^2 + \beta\log d + \gamma$, where $\alpha = \frac{1}{2C\log\lambda} > 0$, and $\beta$ and $\gamma$ are constants depending on $C$ and $\lambda$. There exists a $D > 0$ such that if $\log d < -D$, or $d < e^{-D}$, then $t_n \ge \frac{\alpha}{2}(\log d)^2$, or $\log d \ge -C'\sqrt{t_n}$ for some positive constant $C'$ (this is because $\log d$ is negative; we choose $D$ such that $e^{-D} < \frac{1}{2\lambda}$). Consequently, if $d < e^{-D}$, we get
\[ \tau(f^n(x)) \le -C\log d \le C''\sqrt{t_n}, \]
for some positive constant $C''$. To conclude, let $T = \sup\{\tau(y) : y \in \mathbb{T}^2,\ d(y, p) \ge e^{-D}\} < \infty$, where $D$ is defined before. Then $\tau(f^n(x)) \le \max\{T, C''\sqrt{t_n}\}$, so
\[ \frac{\tau(f^n(x))}{t_n} \le \max\left\{\frac{T}{t_n}, \frac{C''}{\sqrt{t_n}}\right\}. \]
The fact that $\lim_{n\to\infty} t_n = \infty$ finishes the proof of (2).
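The two estimates used in Step 3 can be checked numerically. The following is only a sketch under stated assumptions: the values of λ and C are illustrative choices, not taken from the construction, and the function name `ratio_bound` is hypothetical. It evaluates the upper bound −C log d for τ(f^n(x)) and the lower bound for t_n obtained above, and shows that their quotient shrinks as d = d(f^n(x), p) tends to zero.

```python
import math


def ratio_bound(d, lam=2.0, C=2.0):
    """Quotient of the two bounds from Step 3 (illustrative parameters).

    d   : the distance d(f^n(x), p); must satisfy 0 < d < 1/(2*lam)
    lam : the expansion bound lambda > 1 (assumed value)
    C   : the constant comparing tau(y) with -log d(y, p) (assumed value)
    """
    if not 0 < d < 1 / (2 * lam):
        raise ValueError("need 0 < d < 1/(2*lam)")
    log_d = math.log(d)
    # n0 is the integer part of -log(2d) / log(lam)
    n0 = math.floor(-math.log(2 * d) / math.log(lam))
    tau_upper = -C * log_d
    # lower bound for t_n: -(n0 log d)/C - n0 (n0 + 1) log(lam) / (2C)
    t_lower = -n0 * log_d / C - n0 * (n0 + 1) * math.log(lam) / (2 * C)
    return tau_upper / t_lower


# The quotient decreases as the orbit point gets closer to p.
ratios = [ratio_bound(d) for d in (1e-3, 1e-6, 1e-12)]
print(ratios)
```

The lower bound grows like (log d)^2 while the upper bound grows like |log d|, which is the mechanism behind (2).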
The flow is clearly transitive. Unlike the examples from the previous sections, it has infinitely many invariant measures and minimal sets (actually it has positive entropy and dense periodic orbits).

Question 1. Is it possible to construct a smooth transitive flow on a surface without boundary such that the physical measure is supported on a union of hyperbolic points? What about diffeomorphisms (with the physical measure supported even on a uniformly hyperbolic set which is not the whole surface)?

Acknowledgements. The first author has been supported by the grant FAPESP 2007/03995-2, and would like to thank IME-USP for their hospitality. The second author has been supported by NNRFC (no. 10831003) and National Basic Research Program of China (973 Program no. 2006CB805903).
References
1. Anosov, D., Katok, A.: New Examples in Smooth Ergodic Theory. Ergodic Diffeomorphisms. Trans. of the Moscow Math. Soc. 23, 1–35 (1970)
2. Colli, E., Vargas, E.: Non-trivial wandering domains and homoclinic bifurcations. Erg. Th. Dyn. Syst. 21, 1657–1681 (2001)
3. Hofbauer, F., Keller, G.: Quadratic maps without asymptotic measure. Commun. Math. Phys. 127, 319–337 (1990)
4. Hu, H., Young, L.S.: Nonexistence of SRB measures for some diffeomorphisms that are "almost Anosov". Erg. Th. Dyn. Syst. 15, 67–76 (1995)
5. Nakamura, M.: Time change and orbit equivalence in ergodic theory. Hiroshima Math. J. 18, 399–412 (1988)
6. Pianigiani, G.: First return map and invariant measures. Israel J. Math. 35, 32–48 (1980)
7. Sadovskaya, V.: Dimensional characteristics of invariant measures for circle diffeomorphisms. Preprint at http://arXiv.org/abs/0809.0343v1[math.DS], 2008
8. Totoki, H.: Time changes of flows. Mem. Fac. Sci Kyushu Univ. 20, 29–55 (1966)

Communicated by G. Gallavotti
Commun. Math. Phys. 298, 757–785 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1045-4
Communications in
Mathematical Physics
Link Homologies and the Refined Topological Vertex
Sergei Gukov1, Amer Iqbal2, Can Kozçaz3, Cumrun Vafa4,5
1 Department of Physics, University of California, Santa Barbara, CA, 93106, USA
2 Department of Physics, LUMS School of Science & Engineering, U Block, D.H.A, Lahore, Pakistan. E-mail: [email protected]
3 Department of Physics, University of Washington, Seattle, WA, 98195, USA
4 Center for Theoretical Physics, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
5 Jefferson Physical Laboratory, Harvard University, Cambridge, MA, 02138, USA
Received: 16 September 2009 / Accepted: 14 December 2009 Published online: 20 April 2010 – © The Author(s) 2010. This article is published with open access at Springerlink.com
Abstract: We establish a direct map between the refined topological vertex and sl(N) homological invariants of the Hopf link, which include Khovanov-Rozansky homology as a special case. This relation provides an exact answer for homological invariants of the Hopf link, whose components are colored by arbitrary representations of sl(N). At present, the mathematical formulation of such homological invariants is available only for the fundamental representation (the Khovanov-Rozansky theory) and the relation with the refined topological vertex should be useful for categorizing quantum group invariants associated with other representations (R1, R2). Our result is a first direct verification of a series of conjectures which identifies link homologies with the Hilbert space of BPS states in the presence of branes, where the physical interpretation of gradings is in terms of charges of the branes ending on Lagrangian branes.
Contents
1. Introduction
2. BPS States, Link Invariants, and Open Topological Strings
   2.1 Geometric transition and the Hopf link
   2.2 Knots, links and open topological string amplitudes
3. Link Homologies and Topological Strings
   3.1 Hopf link: the fundamental representation
4. Refined Topological Vertex
   4.1 Open topological string amplitudes
       4.1.1 Hopf link
5. Refined Vertex and Link Homologies
   5.1 Unknot
   5.2 Hopf link
A. Appendix: Other Representations
   A.1 Unknot
   A.2 Hopf link
   A.3 Specialization to $Q = -t\,q^{-2N}$: Some examples
References
1. Introduction One of the most promising recent developments in a deeper understanding of link invariants involves the study of homological invariants. First, these invariants provide a refinement of the familiar polynomial invariants. Secondly, and more importantly, they often lift to functors. However, constructing such homological invariants for arbitrary groups and representations has been a challenging problem, and at present only a handful of link homologies is known. Most of the existing examples are related to the fundamental representation of classical groups of type A and include the Khovanov homology [1], the link Floer homology [2–4], and the sl(N ) knot homology [5,6]. On the physics side, polynomial invariants of knots and links can be realized in the Chern-Simons gauge theory [7]. On the other hand, a physical interpretation of link homologies was first proposed in [8] and further developed in [9,10]. The interpretation involves BPS states in the context of physical interpretation of open topological string amplitudes [11]. In order to explain the realization in topological string theory one first needs to consider embedding the Chern-Simons gauge theory in string theory [12] and the large N dual description in terms of topological strings [13]. As was shown in [11] and will be reviewed in the next section, in this dual description polynomial invariants of knots and links are mapped to open topological string amplitudes which, in turn, can be reformulated in terms of integer enumerative invariants counting degeneracy of states in Hilbert spaces, roughly the number of holomorphic branes ending on Lagrangian branes. 
This leads to a physical reformulation of polynomial link invariants in terms of the so-called Ooguri-Vafa invariants which, roughly speaking, compute the Euler characteristic of the Q-cohomology, that is cohomology with respect to the nilpotent components of the supercharge.1 This, however, is not the full answer to homological link invariants, which require the understanding of an extra grading. In other words, there is an extra physical charge needed to characterize these invariants. In closed string theory, an extension of topological string was constructed for certain non-compact Calabi-Yau geometries [14]. It involves an extra parameter which has the interpretation of an extra rotation in the four-dimensional space. It was shown in [15] that this extra charge indeed accounts for the charges of the M2 branes on holomorphic curves inside a Calabi-Yau three-fold. It was proposed in [8] that the homological grading of link homologies is related to the extra charge in the extension of topological string proposed in [14]. In particular, supersymmetric states of holomorphic branes ending on Lagrangian branes, labeled by all physical charges, should reproduce homological invariants of knots and links,
\[ \mathcal{H}(L) = \mathcal{H}_{BPS}. \tag{1} \]

1 Elements of this cohomology can be viewed as the ground states of the supersymmetric theory of M2 branes ending on M5 branes in a particular geometry [11], as we review below.

Table 1. Enumerative invariants of Calabi-Yau three-folds
         Rational             Integer                            Refinement
Closed   Gromov-Witten        Gopakumar-Vafa/Donaldson-Thomas    Refined BPS invariants
Open     Open Gromov-Witten   Ooguri-Vafa invariants             Triply graded invariants $D_{J,s,r}$ and $N_{J,s,r}$

This conjecture led to a number of predictions regarding the structure of sl(N) knot homologies, in particular to the triply-graded knot homology categorifying the HOMFLY polynomial [9,16], see also [10]. However, a direct test of this conjecture and
computation of homological link invariants from string theory was difficult due to lack of techniques suitable for calculating degeneracies of BPS states in the physical setup. Thus, even for the unknot, the only case where one can compute both sides of (1) independently is the case of the fundamental representation. For other representations, a mathematical formulation of homological knot invariants is not available at present, while on the string theory side the direct analysis of $\mathcal{H}_{BPS}$ becomes more difficult. For a certain class of representations (which, for example, include totally symmetric and totally anti-symmetric representations of sl(N)) it was argued in [10] that the corresponding cohomology ring of the unknot, $\mathcal{H}^{g,R}$, is related to the Jacobi ring of a potential $W_{g,R}(x_i)$,
\[ \mathcal{H}^{g,R}(\text{unknot}) \cong \mathcal{J}(W_{g,R}(x_i)). \tag{2} \]
It is expected that for this class of representations the corresponding link homologies can be defined using matrix factorizations of the potential $W_{g,R}(x_i)$, as in the original construction of Khovanov and Rozansky [6]. The simplest set of examples of such representations involves totally anti-symmetric representations of sl(N). For the $k$-th antisymmetric representation of sl(N), the potential is the Landau-Ginzburg potential of the $A_N^{\otimes k}$ minimal model, and the corresponding homology ring of the unknot (2) is the cohomology ring of the Grassmannian of $k$-planes in $\mathbb{C}^N$ [6,10],
\[ \mathcal{H}^{sl(N),\Lambda^k}(\text{unknot}) \cong H^*(Gr(k, N)), \tag{3} \]
where all cohomology groups are localized in the single homological grading. This will be one of our examples below.

We will be able to compute the homology groups $\mathcal{H}^{g,R}$ directly from string theory using the recent work [17], where it was shown how the topological vertex [18] (which computes topological string amplitudes in toric geometries (Table 1)) can be refined to compute Refined BPS invariants [15]. Since the topological vertex formalism is composed of open string amplitudes, this refinement together with the conjecture of [8] implies that the refined topological vertex should be computing homological link invariants, at least for the class of links which can be formulated in terms of local toric geometries. The basic example of such a link is the Hopf link. This is one of the few examples where we can directly verify our conjectures, at least in the case of the fundamental representation, where Khovanov-Rozansky homology of the Hopf link can be computed. We find in this paper that these highly non-trivial computations agree with each other exactly! This provides a strong check of the various conjectures leading to this statement. Moreover, since the refined topological vertex is easily computable for arbitrary representations, this leads to a prediction of all homological invariants of a large class of links (of which the Hopf link is the simplest example) colored by arbitrary representations $(R_1, \ldots, R_\ell)$,
\[ \mathcal{H}^{sl(N);R_1,\ldots,R_\ell}(L). \tag{4} \]
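As a small illustration of the Grassmannian example (3), the graded dimensions of $H^*(Gr(k, N))$ can be tabulated with the Gaussian binomial coefficient, whose coefficients are the even Betti numbers of the Grassmannian. This is a sketch only: the function name `gaussian_binomial` is hypothetical, the formal variable tracks half the cohomological degree, and no claim is made here about how this grading is normalized against the q-grading used in the paper.

```python
from math import comb


def gaussian_binomial(n, k):
    """Coefficient list [c_0, c_1, ...] of the Gaussian binomial [n choose k]_x.

    c_m is the Betti number b_{2m} of the Grassmannian Gr(k, n).
    Uses the q-Pascal recurrence [n,k] = [n-1,k-1] + x^k [n-1,k].
    """
    if k < 0 or k > n:
        return [0]
    if k == 0 or k == n:
        return [1]
    first = gaussian_binomial(n - 1, k - 1)
    shifted = [0] * k + gaussian_binomial(n - 1, k)  # multiply by x^k
    length = max(len(first), len(shifted))
    return [
        (first[i] if i < len(first) else 0)
        + (shifted[i] if i < len(shifted) else 0)
        for i in range(length)
    ]


# Gr(2, 4): Betti numbers in degrees 0, 2, 4, 6, 8
betti_gr24 = gaussian_binomial(4, 2)
print(betti_gr24)

# The total dimension equals the ordinary binomial coefficient,
# i.e. the dimension of the k-th antisymmetric representation.
total_gr25 = sum(gaussian_binomial(5, 2))
print(total_gr25, comb(5, 2))
```

Setting the formal variable to 1 recovers the dimension of the k-th antisymmetric representation of sl(N), which is the count one expects for the unknot colored by that representation.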
This is a highly non-trivial new prediction which we are currently studying, and it would be very interesting to compare it with the mathematical formulation of link homologies, once those are developed. It is likely that these predictions lead to a deeper mathematical understanding of homological link invariants. In particular, we hope that the combinatorial interpretation of the refined vertex in terms of 3D partitions will be useful for finding the combinatorial definition of link homologies (4).

The organization of this paper is as follows: In Sect. 2 we review the relation between the BPS state counting, link invariants, and open topological strings, including the large N description of the Chern-Simons theory. In Sect. 3 we review aspects of homological link invariants and their interpretation as Hilbert spaces of BPS states. In particular, we use this interpretation to compute the Khovanov-Rozansky homology of the Hopf link. In Sect. 4 we review the refined topological vertex, which is used in Sect. 5, together with some facts from Sect. 2, to compute the homological invariants for the Hopf link colored by arbitrary representations (R1, R2), see Eq. (67) below. In particular, in the case of the fundamental representation we reproduce the Khovanov-Rozansky homology derived in Sect. 3, and make new predictions.

Conventions. The triply-graded invariants discussed in this paper are naturally organized into generating functions, which are polynomials in three variables. Unfortunately, the conventions between the physics literature and the knot theory literature are slightly different. In order to be careful about such differences and to agree with the standard notations, we use the variables (Q, q1, q2) when we talk about topological string amplitudes computed by the topological vertex, cf. [17], and we use the variables (a, q, t) when we discuss link homologies, cf. [9].
The two sets of variables are related as follows:
\[ \sqrt{q_2} = q, \qquad \sqrt{q_1} = -t\,q, \qquad Q = -t\,a^{-2}. \tag{5} \]
In particular, expressions written in terms of (a, q, t) involve integer powers of q and t, while expressions written in terms of $(Q, q_1, q_2)$ involve half-integer powers of $q_1$ and $q_2$. Specialization to the Ooguri-Vafa invariants and to knot polynomials is achieved, respectively, by setting $q_1 = q_2$ and $t = -1$.

2. BPS States, Link Invariants, and Open Topological Strings

For the benefit of the reader not very familiar with the description of D-branes in toric varieties, following [19,20], let us briefly review the basics of this description necessary for understanding the topological string interpretation of link homologies. Consider a toric variety,
\[ X = \mathbb{C}^{k+3}/U(1)^k, \tag{6} \]
where $\mathbb{C}^{k+3}$ is parametrized by coordinates $X_i$, $i = 1, \ldots, k+3$, and the symplectic quotient is obtained by imposing
\[ D^a = Q^a_1 |X_1|^2 + Q^a_2 |X_2|^2 + \cdots + Q^a_{k+3} |X_{k+3}|^2 - r^a = 0, \qquad U(1)_a:\ X_i \to e^{i Q^a_i \epsilon_a} X_i \tag{7} \]
for every $a = 1, \ldots, k$. We can think of (6) as a gauged linear sigma model with gauge group $U(1)^k$ and chiral fields $X_i$ of charges $Q^a_i$. The charges $Q^a_i$ should obey $\sum_i Q^a_i = 0$.

Fig. 1. A Lagrangian D-brane in $\mathbb{C}^3$, projected to the base of the toric fibration (the edges of the base are labelled $x_1 = 0$, $x_2 = 0$, $x_3 = 0$)

Using toric geometry, we can also describe Lagrangian D-branes invariant under the torus action. There are two interesting types of Lagrangian D-branes:
1. Lagrangians which project to a 1-dimensional subspace in the base of the toric variety X. These can be described by three equations of the form
\[ \sum_i q_i^\alpha |X_i|^2 = c^\alpha, \quad \alpha = 1, 2, \qquad \sum_i \arg X_i = 0, \tag{8} \]
where $q_i^\alpha$ is a set of charges such that $\sum_i q_i^\alpha = 0$.
2. Lagrangians which project to a 2-dimensional subspace in the base of the toric variety X. These can be defined by the following equations:
\[ \sum_i q_i^1 |X_i|^2 = c, \qquad \sum_i q_i^\alpha \arg X_i = 0, \quad \alpha = 2, 3, \tag{9} \]
where the charges should satisfy $\sum_i q_i^1 q_i^\alpha = 0$, $\alpha = 2, 3$.

Let us consider $X = \mathbb{C}^3$ with a Lagrangian D-brane on L, where L is defined by
\[ |X_1|^2 - |X_3|^2 = c > 0, \qquad |X_2|^2 - |X_3|^2 = 0, \qquad \sum_i \arg X_i = 0. \tag{10} \]
The projection of this Lagrangian D-brane to the base of the toric fibration is shown in Fig. 1.
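The conditions (10) are easy to test on explicit points. The sketch below is an illustration only: the helper names `sample_point`, `on_lagrangian` and `base_point` and the three-parameter description $(r, \theta_1, \theta_2)$ are assumptions introduced here, not notation from the text. It parametrizes points of L, checks that they satisfy (10), and projects them to the base coordinates $(|X_1|^2, |X_2|^2, |X_3|^2)$, where they trace out the half-line $(r + c, r, r)$, a 1-dimensional subspace as in case 1 above.

```python
import cmath
import math


def sample_point(r, theta1, theta2, c):
    """Point of L with |X3|^2 = |X2|^2 = r >= 0 and |X1|^2 = r + c.

    The phase of X3 is fixed so that the three arguments sum to zero.
    """
    x1 = math.sqrt(r + c) * cmath.exp(1j * theta1)
    x2 = math.sqrt(r) * cmath.exp(1j * theta2)
    x3 = math.sqrt(r) * cmath.exp(-1j * (theta1 + theta2))
    return (x1, x2, x3)


def on_lagrangian(point, c, tol=1e-9):
    """Check the three real conditions of (10) up to a tolerance."""
    x1, x2, x3 = point
    moduli_ok = (
        abs(abs(x1) ** 2 - abs(x3) ** 2 - c) < tol
        and abs(abs(x2) ** 2 - abs(x3) ** 2) < tol
    )
    # arguments are only defined modulo 2*pi, so compare on the unit circle
    total_phase = sum(cmath.phase(x) for x in point)
    phase_ok = abs(cmath.exp(1j * total_phase) - 1) < tol
    return moduli_ok and phase_ok


def base_point(point):
    """Projection to the base of the toric fibration."""
    return tuple(abs(x) ** 2 for x in point)


p = sample_point(0.7, 0.3, 1.1, c=2.0)
print(on_lagrangian(p, c=2.0), base_point(p))
```

Three real parameters describe L, as expected for a Lagrangian submanifold of $\mathbb{C}^3$, while only r survives the projection to the base.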
2.1. Geometric transition and the Hopf link. The conjecture on the geometric transition [13] was originally checked at the level of free energies and later at the level of observables of the theory in more detail in [11]. A worldsheet explanation of this duality was discovered in [21]. See [22] for a detailed review of this duality and its consequences for link invariants. Let us briefly review the conjectured equivalence between the Chern-Simons theory in $S^3$ with the closed topological string theory on the resolved conifold, or in other words, with the open topological string theory on $T^*S^3$. In his work, 't Hooft noted that U(N) or SU(N) gauge theories should have a string theory description. If we consider the perturbative Feynman diagram expansion in the 't Hooft coupling λ = Ng using the double line notation, these diagrams can be regarded as a triangulation of a Riemann surface. The contributions to the free energy coming from these diagrams can be arranged in a way that looks like an open string expansion on a worldsheet with genus g and h boundaries:
\[ F = \sum_{g=0,\,h=1}^{\infty} C_{g,h}\, N^{2-2g} \lambda^{2g-2+h}. \tag{11} \]
It was shown by Witten for the SU(N) Chern-Simons theory on a three dimensional manifold $S^3$ that the coefficients $C_{g,h}$ are equal to the A-model topological open string theory on a worldsheet with genus g and h boundaries [12] with the target space $T^*S^3$. The N D-branes are wrapped on the base $S^3$ in this six dimensional cotangent bundle. The summation over the number of holes in Eq. (11) can be carried out first. The free energy takes the following form, which looks like the closed string expansion:
\[ F = \sum_{g=0}^{\infty} N^{2-2g} F_g(\lambda), \tag{12} \]
where λ acts like some modulus of the theory. The natural question that arises is “what is the closed string theory for the Chern-Simons theory on S3 ?” In [13] it was conjectured that if we start with the open topological string theory on T ∗ S3 which can be regarded as the deformed conifold and wrap N D-branes on the base and take the large N limit, the geometry of the target space undergoes the conifold transition: the base S3 shrinks and then is blown up to S2 , where the D-branes disappear. Instead, the Kähler moduli of the blown up S2 is proportional to the ’t Hooft coupling. The equivalence was checked for all values of the ’t Hooft coupling and for all genera of the free energy of the Chern-Simons theory and the closed topological strings on the resolved conifold. It is worth mentioning that the resolution of the geometry, however, is not unique: two different ways of resolving the singularity give rise to topologically distinct spaces which are birationally equivalent. In Fig. 5, two different resolutions of the conifold singularity are shown which are related by flop. If we insert probe branes in the target geometry and compute the open string partition function using the “usual” topological vertex the partition function is invariant under flop. However, for the “refined” topological vertex this invariance does not hold, and it will be crucial in our discussion to choose the ‘correct’ blowup. 2.2. Knots, links and open topological string amplitudes. The equivalence between the open topological string on the deformed conifold and the closed string on the resolved conifold was also checked in terms of the observables [11]. The basic observables in the
Chern-Simons theory are the Wilson loops. As mentioned before, there are N D-branes wrapped on the base, and to study their dynamics another set of D-branes can be introduced, say M of them. This new set of D-branes will be wrapped on a Lagrangian 3-cycle which is associated with a knot. A closed loop q(s), (0 ≤ s < 2π), is used to parametrize a knot in $S^3$. Then the conormal bundle associated with the knot, defined as
\[ \mathcal{C} = \left\{ (q(s), p) \ \Big|\ p_i \frac{dq_i}{ds} = 0,\ 0 \le s < 2\pi \right\}, \tag{13} \]
is Lagrangian. The M D-branes wrapped on the Lagrangian cycle C give rise to SU(M) Chern-Simons theory. However, in addition to the Chern-Simons theory on C there is another topological open string sector coming from strings stretching between the M D-branes around C and the N D-branes around the base $S^3$. We obtain a complex scalar which transforms as a bi-fundamental of SU(N) ⊗ SU(M) and lives in the intersection of the D-branes, i.e. on the knot. This complex field can be integrated out and we obtain an effective action for the U(N) gauge connection A on $S^3$,
\[ S_{CS}(A) + \sum_{n=1}^{\infty} \frac{1}{n}\, \mathrm{Tr}\, U^n\, \mathrm{Tr}\, V^{-n}, \tag{14} \]
which can be rephrased as correlations of [23]
\[ \sum_R \mathrm{Tr}_R U\ \mathrm{Tr}_R V^{-1}. \tag{15} \]
In the previous section we mentioned that the geometry changes from deformed conifold with branes to the resolved conifold without branes if we take the large N limit. We can take the same limit in this brane system while keeping the number of non-compact probe branes, M, fixed and trace what happens to the probe branes during this transition. According to [11], the non-compact Lagrangian cycle C will be mapped to a new Lagrangian cycle C in the resolved conifold, with M D-branes wrapping it. This will provide boundary conditions for the open strings to end on in the resolved geometry. Aspects of this transition including how one can find the Lagrangian brane for certain knots and links (including the Hopf link) have been discussed in [24]. A precise mathematical description of the Lagrangian D-brane C after transition has been offered in [25]. For the case of the unknot, discussed in detail in [11], the normalized CS expectation is given by
\[ W_{\lambda(R)} = \langle \mathrm{Tr}_R U \rangle, \qquad U = P e^{\oint A}, \tag{16} \]
where λ(R) is the highest weight of the irreducible representation R, i.e., it is a 2D partition. The above expectation value can be calculated exactly and is given by
\[ W_\lambda = \text{Quantum dimension of } \lambda = \prod_{(i,j)\in\lambda} \frac{q_1^{\frac{N+c(i,j)}{2}} - q_1^{-\frac{N+c(i,j)}{2}}}{q_1^{\frac{h(i,j)}{2}} - q_1^{-\frac{h(i,j)}{2}}} = q_1^{-\frac{N|\lambda|}{2}}\, s_\lambda(1, q_1, q_1^2, \ldots, q_1^{N-1}), \tag{17} \]
S. Gukov, A. Iqbal, C. Kozçaz, C. Vafa
Fig. 2. The content and the hook length of a box in a Young diagram
where sλ (x) is the Schur function labelled by the partition λ and c(i, j) = j − i, h(i, j) = λi − j + λtj − i + 1 are the content and the hook length of a box in the Young diagram of λ as shown in Fig. 2. Similarly for the Hopf link we can color the two component knots by two different representations to obtain Wλ μ = Trλ U1 Trμ U2 ,
(18)
where U1 and U2 are the two holonomy matrices around two component unknots. This can also be calculated exactly to obtain κ(μ)
−ρ
−ρ−λ
Wλ μ = q1 2 sλ (q1 ) sμ (q1
ρ
, Q q1 )
i− j
(1 − Q q1
).
(19)
(i, j)∈λ
Here $Q = q_1^{-N}$. We will recall the geometry of D-branes for the unknot and Hopf link in Sect. 4 and review how the open topological string amplitudes in the presence of these branes reproduce the above knot and link invariants, before extending it to more refined invariants. In [11] the open topological string amplitudes were interpreted as counting a certain BPS partition function. This interpretation is crucial for connecting it to link homologies as the Hilbert space is naturally in the problem. Moreover the gradation of the homology is nothing but the charges of BPS states in the physical theory. The geometry considered in [11] was as follows: We can lift the type IIA geometry of the resolved conifold to M-theory. In this context the probe branes get mapped to M5 branes wrapping the Lagrangian cycles and filling the non-compact $\mathbb{R}^3$ spacetime. The open topological string simply computes the number of M2 branes ending on the M5 branes. The representation of the link invariant encodes the geometry of the ending of the M2 brane on the M5 brane. Moreover the coefficient of $q^s Q^J$ in the topological string amplitudes, $N_{R,J,s}$, is determined by the number of such bound states which wrap the $\mathbb{P}^1$ J times and have spin s under the SO(2) rotation of the spatial $\mathbb{R}^2 \subset \mathbb{R}^3$.2

2 For a complete mathematical proof of the integrality of $N_{R,J,s}$ see [26].

The precise structure of the connection between open topological strings and BPS counting was further elaborated in [27], to which we refer the interested reader. For a single knot, for example, one finds that the free energy F = log(Z) as a function of V defined above, is given by
\[ F(V) = -\sum_{R,\,n>0} f_R(q^n, Q^n)\, \frac{\mathrm{Tr}_R V^n}{n}, \]
where $f_R(q, Q)$ is completely determined by the BPS degeneracies of the M2 brane, $N_{R,J,s}$, where R denotes the representation the BPS state transforms in, J is the charge
of the brane and s is the spin. Moreover the sign of N is correlated with its fermion number. It was proposed in [8] that there is a further charge one can consider in labeling the BPS states of M2 branes ending on M5 branes: The normal geometry to the M5 brane includes, in addition to the spacetime R3 , and the three normal directions inside the CY, an extra R2 plane. It was proposed there that the extra S O(2) rotation in this plane will provide an extra gradation which could be viewed as a refinement of topological strings and it was conjectured that this is related to link homologies that we will review in the next section. This gives a refinement of N R,J,s → N R,J,r,s . In other words for a given representation R we have a triply graded structure labeling the BPS states. 3. Link Homologies and Topological Strings Now, let us proceed to describing the properties of link homologies suggested by their relation to Hilbert spaces of BPS states. We mostly follow notations of [8,9]. Let L be an oriented link in S3 with components, K 1 , . . . , K . We shall consider homological as well as polynomial invariants of L whose components are colored by representations R1 , . . . , R of the Lie algebra g. Although in this paper we shall consider only g = sl(N ), there is a natural generalization to other classical Lie algebras of type B, C, and D. In particular, there are obvious analogs of the structural properties of sl(N ) knot homologies for so(N ) and sp(N ) homologies (see [10,28] for some work in this direction). Given a link colored by a collection of representations R1 , . . . , R of sl(N ), we denote the corresponding polynomial invariant by P sl(N );R1 ,...,R (q).
(20)
Here and below, the "bar" means that (20) is the unnormalized invariant; its normalized version $P_{sl(N);R_1,\dots,R_\ell}(q)$, obtained by dividing by the invariant of the unknot, is written without a bar. Since this "reduced" version depends on the choice of the "preferred" component of the link $L$, below we mainly consider the more natural, unnormalized invariant (20). In the special case when every $R_a$, $a = 1,\dots,\ell$, is the fundamental representation of $sl(N)$ we simply write
\[ \bar P_N(q) \equiv \bar P_{sl(N);\square,\dots,\square}(q). \tag{21} \]
The polynomial invariants (20) are related to expectation values of Wilson loop operators $W(L) = W_{R_1,\dots,R_\ell}(L)$ in Chern-Simons theory. For example, the polynomial $sl(N)$ invariant $\bar P_N(q)$ is related to the expectation value of the Wilson loop operator $W(L) = W_{\square,\dots,\square}(L)$,
\[ \bar P_N(L) = q^{-2N\,\mathrm{lk}(L)}\,\langle W(L)\rangle, \tag{22} \]
where lk(L) = a
The graded Poincaré polynomial,
\[ \bar P_{sl(N);R_1,\dots,R_\ell}(q,t) := \sum_{i,j\in\mathbb{Z}} q^i t^j \dim H^{sl(N);R_1,\dots,R_\ell}_{i,j}(L), \tag{24} \]
is, by definition, a polynomial in $q^{\pm 1}$ and $t^{\pm 1}$ with integer non-negative coefficients. Clearly, evaluating (24) at $t = -1$ gives (23). When $R_a = \square$ for all $a = 1,\dots,\ell$, the homology $H^{sl(N);R_1,\dots,R_\ell}_{i,j}(L)$ is the Khovanov-Rozansky homology, $HKR^N_{i,j}(L)$, and
\[ \overline{KhR}_N(q,t) \equiv \bar P_{sl(N);\square,\dots,\square}(q,t) = \sum_{i,j\in\mathbb{Z}} q^i t^j \dim HKR^N_{i,j}(L) \tag{25} \]
is its graded Poincaré polynomial. The physical interpretation of homological link invariants via Hilbert spaces of BPS states leads to certain predictions regarding the behavior of link homologies with rank $N$. In particular, the total dimension of $H^{sl(N);R_1,\dots,R_\ell}_{*,*}(L)$ grows as
\[ \dim H^{sl(N);R_1,\dots,R_\ell}_{*,*}(L) \sim N^d, \qquad N \to \infty, \tag{26} \]
where
\[ d = \sum_{i=1}^{\ell} \dim R_i. \tag{27} \]
More specifically, a general form of the conjecture in [8] states:

Conjecture. There exists a "superpolynomial" $\bar{\mathcal P}_{R_1,\dots,R_\ell}(a,q,t)$, a rational function$^3$ in three variables $a$, $q$, and $t$, such that
\[ \bar P_{sl(N);R_1,\dots,R_\ell}(q,t) = \bar{\mathcal P}_{R_1,\dots,R_\ell}(a = q^N, q, t) \tag{28} \]
for sufficiently large $N$. The coefficients of the superpolynomial, say, in the case of the fundamental representation,
\[ \bar{\mathcal P}_N(a,q,t) = \frac{1}{q - q^{-1}} \sum_{J,s,r} a^J q^s t^r D_{J,s,r}, \tag{29} \]
encode the dimensions of the Hilbert space of states, related to BPS states,
\[ D_{J,s,r} := (-1)^F \dim \mathcal{H}^{F,J,s,r}_{BPS}, \tag{30} \]
graded by the fermion number $F$, the membrane charge $J$, and the $U(1)_L \times U(1)_R$ quantum numbers $s$ and $r$. However, note that $D_{J,s,r}$ is not the same as $N_{J,s,r}$: $N_{J,s,r}$ encodes the integral structure in the free energy, whereas $D_{J,s,r}$ is the exponentiated version of it. It is not difficult to see that the integrality of $N_{J,s,r}$ guarantees that of $D_{J,s,r}$ (as in the closed string case, where the integrality of GV invariants implies integrality of the DT invariants). This in particular explains that the Hilbert space structure of BPS states captured by $N_{J,s,r}$ will indeed encode the Hilbert space structure for $D_{J,s,r}$ and thus its integrality. However, it is not completely obvious from the physical picture why (28) is a finite polynomial, for any given $N$, as has been conjectured.

3 This definition differs slightly from the ones introduced in [9], where it is the numerator of the rational function $\bar{\mathcal P}_{R_1,\dots,R_\ell}(a,q,t)$ which was called the superpolynomial. Since in general one has very good control of the denominators, the two definitions are clearly related.

The conjecture (28) can be refined even further. Indeed, the large $N$ growth described in (26) and (28) is characterized by the contribution of individual link components,
\[ \bigoplus_{a=1}^{\ell} H^{sl(N);R_a}_{*,*}(K_a). \tag{31} \]
Often, it is convenient to remove this contribution and consider only the "connected" part of the polynomial (resp. homological) link invariant. For example, in the simplest case when all components of the link $L$ carry the fundamental representation, the corresponding $sl(N)$ invariant $\bar P_N(L)$ or, equivalently, the Wilson loop correlation function (22) can be written in terms of the integer BPS invariants $N_{(\square,\dots,\square),J,s}$ as
\[ \langle W(L)\rangle^{(c)} = (q^{-1} - q)^{\ell-2} \sum_{J,s} N_{(\square,\dots,\square),J,s}\, q^{NJ+s}, \tag{32} \]
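where $\langle W(L)\rangle^{(c)}$ is the connected correlation function. Thus, for a two-component link, we have
\[ \langle W(L)\rangle^{(c)} = \langle W(L)\rangle - \langle W(K_1)\rangle \langle W(K_2)\rangle \tag{33} \]
and
\[ \bar P_N(L) = q^{-2N\,\mathrm{lk}(L)} \Big[ \bar P_N(K_1)\,\bar P_N(K_2) + \sum_{J,s} N_{(\square,\square),J,s}\, q^{NJ+s} \Big], \tag{34} \]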
where $\bar P_N(K_1)$ and $\bar P_N(K_2)$ denote the unnormalized $sl(N)$ polynomials of the individual link components. Similarly, the homological $sl(N)$ invariant of a two-component link $L$ can be written as a sum of connected and disconnected terms [8]:
\[ \overline{KhR}_N(L) = q^{-2N\,\mathrm{lk}(L)} \Big[ t^{\alpha}\, \overline{KhR}_N(K_1)\,\overline{KhR}_N(K_2) + \frac{1}{q - q^{-1}} \sum_{J,s,r\in\mathbb{Z}} D_{J,s,r}\, q^{NJ+s} t^r \Big], \tag{35} \]
where the integer invariants $D_{J,s,r}(L)$ are related to the dimensions of the Hilbert space of BPS states, $N_{J,s,r}$, and $\alpha$ is a simple invariant of $L$. At $t = -1$ this expression specializes to (34).
3.1. Hopf link: the fundamental representation. The Hopf link, $L = 2_1^2$, consists of two components, $K_1 \cong K_2 \cong$ unknot, which are linked with the linking number $\mathrm{lk}(K_1,K_2) = -1$. The $sl(2)$ homological invariant for the Hopf link is
\[ \overline{KhR}_2(2_1^2) = 1 + q^2 + q^4 t^2 + q^6 t^2. \tag{36} \]
It can be written in the form (35) with the following non-zero invariants:
\[ D_{0,-1,0} = 1, \qquad D_{-2,-1,0} = -1, \qquad D_{0,1,2} = -1, \qquad D_{-2,1,2} = 1. \tag{37} \]
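The statement that (36) can be written in the form (35) can be spot-checked numerically. The following is a minimal sketch in exact rational arithmetic; it assumes $\alpha = 2$ for the Hopf link (a value inferred here from matching (36), not stated in the text), and the function names are illustrative only.

```python
from fractions import Fraction

# Non-zero invariants D_{J,s,r} of the Hopf link, as listed in (37).
D = {(0, -1, 0): 1, (-2, -1, 0): -1, (0, 1, 2): -1, (-2, 1, 2): 1}


def unknot(q, N):
    """Unnormalized sl(N) invariant of the unknot, (q^N - q^-N)/(q - q^-1)."""
    return (q**N - q**(-N)) / (q - 1 / q)


def hopf_from_D(q, t, N, lk=-1, alpha=2):
    """Right-hand side of (35) for the Hopf link.

    alpha = 2 is an assumption made for this sketch.
    """
    disconnected = t**alpha * unknot(q, N) ** 2
    connected = sum(
        d * q ** (N * J + s) * t**r for (J, s, r), d in D.items()
    ) / (q - 1 / q)
    return q ** (-2 * N * lk) * (disconnected + connected)


def khr2(q, t):
    """Equation (36)."""
    return 1 + q**2 + q**4 * t**2 + q**6 * t**2


if __name__ == "__main__":
    for q, t in [(Fraction(2), Fraction(3)), (Fraction(3, 5), Fraction(-7, 2))]:
        print(hopf_from_D(q, t, N=2) == khr2(q, t))
```

Agreement at a few generic rational points is of course only a sanity check, not a proof of the identity.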
This gives the "superpolynomial" for the Hopf link,
\[ \bar{\mathcal P}(2_1^2) = \frac{1}{(q - q^{-1})^2} \Big[ q^{-2} - 1 + q^2 t^2 + a^2 \big( 1 - q^{-2} - t^2 - q^2 t^2 \big) + a^4 t^2 \Big], \tag{38} \]
which after specializing to $a = q^N$ gives the graded Poincaré polynomial of the $sl(N)$ link homology:
\[ \overline{KhR}_N(2_1^2) = q^{N-1}\,\frac{q^N - q^{-N}}{q - q^{-1}} + q^{2N} t^2 \left( \frac{q^N - q^{-N}}{q - q^{-1}} \right)^{2} - q^{N+1} t^2\, \frac{q^N - q^{-N}}{q - q^{-1}}. \tag{39} \]
Notice that at $t = -1$ this expression reduces to the correct formula for the $sl(N)$ polynomial invariant of the Hopf link,
\[ \bar P_N(2_1^2) = 1 - q^{2N} + q^{2N} \left( \frac{q^N - q^{-N}}{q - q^{-1}} \right)^{2}. \tag{40} \]
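The reduction from (39) to (40) can be written out in two steps. In the sketch below, $[N]$ is a shorthand introduced here (not used in the original) for $(q^N - q^{-N})/(q - q^{-1})$.

```latex
\begin{aligned}
\overline{KhR}_N(2_1^2)\big|_{t=-1}
  &= q^{2N}[N]^2 + q^{N}\,(q^{-1} - q)\,[N] \\
  &= q^{2N}[N]^2 - q^{N}\,(q^{N} - q^{-N}) \\
  &= 1 - q^{2N} + q^{2N}[N]^2 .
\end{aligned}
```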
The result (39) agrees with the direct computation of Khovanov-Rozansky homology for small values of $N$:
\[ \begin{aligned}
\overline{KhR}_3(2_1^2) &= 1 + q^2 + q^4 + q^4 t^2 + 2q^6 t^2 + 2q^8 t^2 + q^{10} t^2, \\
\overline{KhR}_4(2_1^2) &= 1 + q^2 + q^4 + q^4 t^2 + q^6 + 2q^6 t^2 + 3q^8 t^2 + 3q^{10} t^2 + 2q^{12} t^2 + q^{14} t^2, \\
\overline{KhR}_5(2_1^2) &= 1 + q^2 + q^4 + q^4 t^2 + q^6 + 2q^6 t^2 + q^8 + 3q^8 t^2 + 4q^{10} t^2 + 4q^{12} t^2 + 3q^{14} t^2 + 2q^{16} t^2 + q^{18} t^2.
\end{aligned} \tag{41} \]
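The expansions in (41) follow mechanically from (39). As a hedged illustration, the sketch below represents Laurent polynomials in $q$ and $t$ as dictionaries mapping exponent pairs to integer coefficients (a representation chosen here for convenience) and expands (39) for a given $N$. It also shows that the total dimension, obtained by summing the coefficients, equals $N^2$ for these cases, in line with the growth (26) with $d = 2$.

```python
from collections import defaultdict


def mul(p, r):
    """Product of two Laurent polynomials stored as {(q_exp, t_exp): coeff}."""
    out = defaultdict(int)
    for (i, j), c in p.items():
        for (k, l), d in r.items():
            out[(i + k, j + l)] += c * d
    return {m: c for m, c in out.items() if c}


def add(*polys):
    """Sum of Laurent polynomials, dropping zero coefficients."""
    out = defaultdict(int)
    for p in polys:
        for m, c in p.items():
            out[m] += c
    return {m: c for m, c in out.items() if c}


def mono(i, j, c=1):
    """The monomial c * q^i * t^j."""
    return {(i, j): c}


def qint(N):
    """(q^N - q^-N)/(q - q^-1) = q^(N-1) + q^(N-3) + ... + q^(1-N)."""
    return {(N - 1 - 2 * k, 0): 1 for k in range(N)}


def khr_hopf(N):
    """Expansion of the right-hand side of (39)."""
    n = qint(N)
    return add(
        mul(mono(N - 1, 0), n),
        mul(mono(2 * N, 2), mul(n, n)),
        mul(mono(N + 1, 2, -1), n),
    )


if __name__ == "__main__":
    for N in (2, 3, 4, 5):
        poly = khr_hopf(N)
        print(N, sorted(poly.items()), "total dimension:", sum(poly.values()))
```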
4. Refined Topological Vertex

In this section we will briefly explain the combinatorial interpretation of the refined vertex in terms of 3D partitions; more details can be found in [17]. Recall that the generating function of the 3D partitions is given by the MacMahon function,
\[ M(q) = \sum_{n \ge 0} C_n q^n = \prod_{n=1}^{\infty} (1 - q^n)^{-n}, \qquad C_n = \#\ \text{of 3D partitions with } n \text{ boxes}. \tag{42} \]
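The coefficients $C_n$ in (42) can be generated directly from the product formula. The following is a minimal sketch; the function name and the in-place series multiplication are choices made here for illustration.

```python
def macmahon_coefficients(n_max):
    """Return [C_0, ..., C_{n_max}] from prod_{n>=1} (1 - q^n)^(-n).

    Each factor 1/(1 - q^k) is applied k times; multiplying a truncated
    series by 1/(1 - q^k) is the in-place update c[m] += c[m - k].
    """
    coeffs = [1] + [0] * n_max
    for k in range(1, n_max + 1):
        for _ in range(k):
            for m in range(k, n_max + 1):
                coeffs[m] += coeffs[m - k]
    return coeffs


if __name__ == "__main__":
    print(macmahon_coefficients(8))
```

For example, the entry $C_3 = 6$ counts the six 3D partitions with three boxes.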
Fig. 3. (a) π• (λ, μ, ν) for λ = (6, 4, 3, 1, 1), μ = (5, 4, 3, 2, 2), ν = (4, 3, 2, 1). (b) An example of π(λ, μ, ν)
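The topological vertex $C_{\lambda\mu\nu}(q)$ [18],
\[ C_{\lambda\mu\nu}(q) = q^{\frac{\kappa(\mu)}{2}}\, s_{\nu^t}(q^{-\rho}) \sum_{\eta} s_{\lambda^t/\eta}(q^{-\rho-\nu})\, s_{\mu/\eta}(q^{-\rho-\nu^t}), \tag{43} \]
has the following combinatorial interpretation [30]:
\[ M(q)\, C_{\lambda\mu\nu}(q) = f_{\lambda\mu\nu}(q) \sum_{\pi(\lambda,\mu,\nu)} q^{|\pi(\lambda,\mu,\nu)| - |\pi_\bullet(\lambda,\mu,\nu)|}, \tag{44} \]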
where $\pi(\lambda,\mu,\nu)$ is a 3D partition which asymptotically approaches the three 2D partitions $\lambda$, $\mu$ and $\nu$ along the three axes, $|\pi|$ is the number of boxes (volume) of the 3D partition $\pi$, and $\pi_\bullet$ is the 3D partition with the least number of boxes satisfying the same boundary condition.$^4$ Figure 3(a) shows $\pi_\bullet$ for $\lambda = (6,4,3,1,1)$, $\mu = (5,4,3,2,2)$ and $\nu = (4,3,2,1)$. Figure 3(b) shows an example of the partition $\pi(\lambda,\mu,\nu)$ for $\lambda$, $\mu$, $\nu$ the same as in Fig. 3(a). $f_{\lambda\mu\nu}(q)$ is the framing factor, which appears because of the change from perpendicular slicing of the 3D partition to diagonal slicing of the 3D partition [30]. The refined topological vertex [17],
\[ C_{\lambda\mu\nu}(q_1,q_2) = \left(\frac{q_1}{q_2}\right)^{\frac{||\mu||^2 - |\mu|}{2}} q_2^{\frac{\kappa(\mu)}{2}}\, q_2^{\frac{||\nu||^2}{2}}\, Z_\nu(q_1,q_2) \sum_{\eta} \left(\frac{q_2}{q_1}\right)^{\frac{|\eta| + |\lambda| - |\mu|}{2}} s_{\lambda^t/\eta}(q_1^{-\rho} q_2^{-\nu})\, s_{\mu/\eta}(q_2^{-\rho} q_1^{-\nu^t}), \tag{45} \]
also has a similar combinatorial interpretation in terms of 3D partitions, which we will explain now. Recall that the diagonal slices of a 3D partition $\pi$ are 2D partitions which interlace with each other. These are the 2D partitions living on the planes $x - y = a$, where $a \in \mathbb{Z}$. We will denote these 2D partitions by $\pi_a$. For the usual vertex the $a$th slice is weighted with $q^{|\pi_a|}$, where $|\pi_a|$ is the number of boxes cut by the slice (the number of boxes in the 2D partition $\pi_a$). The 3D partition is then weighted by
\[ \prod_{a\in\mathbb{Z}} q^{|\pi_a|} = q^{\sum_{a\in\mathbb{Z}} |\pi_a|} = q^{\#\ \text{of boxes in } \pi}. \tag{46} \]
In the case of the refined vertex the 3D partition is weighted in a different manner. Given a 3D partition $\pi$ and its diagonal slices $\pi_a$, we weigh the slices for $a < 0$ with parameter $q_2$ and the slices with $a \ge 0$ with parameter $q_1$, so that the measure associated with $\pi$ is given by
\[ \Big( \prod_{a<0} q_2^{|\pi_a|} \Big) \Big( \prod_{a\ge 0} q_1^{|\pi_a|} \Big) = q_2^{\sum_{i=1}^{\infty} |\pi(-i)|}\; q_1^{\sum_{j=1}^{\infty} |\pi(j-1)|}. \tag{47} \]

4 Since even the partition with the least number of boxes has an infinite number of boxes, we need to regularize this by putting it in an $N \times N \times N$ box as discussed in [30].

Fig. 4. Slices of the 3D partitions are counted with parameters q1 and q2 depending on the shape of ν
The generating function for this counting is a generalization of the MacMahon function and is given by
\[ M(q_1,q_2) := \sum_{\pi} q_2^{\sum_{i=1}^{\infty} |\pi(-i)|}\; q_1^{\sum_{j=1}^{\infty} |\pi(j-1)|} = \prod_{i,j=1}^{\infty} \big( 1 - q_1^{\,j}\, q_2^{\,i-1} \big)^{-1}. \tag{48} \]
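The equality of the sum and the product in (48) can be checked to low order by brute force. The sketch below enumerates 3D partitions as arrays of column heights, assigns a box at row $i$, column $j$ to the slice $a = j - i$ (an orientation convention chosen here; the transposed convention gives the same generating function), and compares with the truncated product. All function names are illustrative.

```python
from collections import Counter


def _rows(bound, max_sum):
    """Yield non-empty non-increasing rows r with r[k] <= bound[k], sum(r) <= max_sum."""
    def rec(k, prev, acc, s):
        if acc:
            yield tuple(acc)
        if k < len(bound):
            for v in range(1, min(prev, bound[k], max_sum - s) + 1):
                yield from rec(k + 1, v, acc + [v], s + v)
    yield from rec(0, max_sum, [], 0)


def plane_partitions(n, bound=None):
    """Yield all 3D partitions with exactly n boxes as tuples of rows of heights."""
    if n == 0:
        yield ()
        return
    if bound is None:
        bound = (n,) * n
    for row in _rows(bound, n):
        for rest in plane_partitions(n - sum(row), row):
            yield (row,) + rest


def slice_exponents(pi):
    """Exponents (of q1, of q2) in the measure (47): slices a >= 0 carry q1, a < 0 carry q2."""
    e1 = e2 = 0
    for i, row in enumerate(pi):
        for j, height in enumerate(row):
            if j - i >= 0:
                e1 += height
            else:
                e2 += height
    return e1, e2


def enumerated_series(n_max):
    """Left-hand side of (48), truncated at total degree n_max."""
    series = Counter()
    for n in range(n_max + 1):
        for pi in plane_partitions(n):
            series[slice_exponents(pi)] += 1
    return dict(series)


def product_series(n_max):
    """Right-hand side of (48), truncated at total degree n_max."""
    series = {(0, 0): 1}
    for i in range(1, n_max + 1):
        for j in range(1, n_max + 2 - i):
            d1, d2 = j, i - 1  # factor 1/(1 - q1^j q2^(i-1))
            new = {}
            for (e1, e2), c in series.items():
                k = 0
                while e1 + e2 + k * (d1 + d2) <= n_max:
                    key = (e1 + k * d1, e2 + k * d2)
                    new[key] = new.get(key, 0) + c
                    k += 1
            series = new
    return series


if __name__ == "__main__":
    print(enumerated_series(4) == product_series(4))
```

At total degree 3, for instance, both sides give $3q_1^3 + 2q_1^2 q_2 + q_1 q_2^2$, and setting $q_1 = q_2$ recovers the MacMahon coefficients of (42).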
We can think of this assignment of q1 and q2 to the slices in the following way. If we start from large positive a and move toward the slice passing through the origin, then every time we move the slice towards the left we count it with q1 and every time we move the slice up (which happens when we go from a = i to a = i − 1, i = 0, 1, 2 . . .) we count it with q2 . Since we are slicing the skew 3D partitions with planes x − y = a we naturally have a preferred direction given by the z-axis. We take the 2D-partition along the z-axis to be ν. The case we discussed above, obtaining the refined MacMahon function, had ν = ∅. For non-trivial ν the assignment of q2 and q1 to various slices is different and depends on the shape of ν. As we go from +∞ to −∞ the slices are counted with q1 if we go towards the left and are counted with q2 if we move up. An example is shown in Fig. 4. After taking into account the framing and the fact that the slices relevant for the topological vertex are not the perpendicular slices [30] the generating function is given by G λ μ ν (q1 , q2 ) = M(q1 , q2 ) × Cλ μ ν (q1 , q2 ),
where Cλ μ ν (q1 , q2 ) is the refined topological vertex, Cλ μ ν (q1 , q2 ) =
q2 q1
||ν||2 −||ν||2
κ(μ) 2
2
q2
−ρ Pν t (q1 ; q2 , q1 )
−ρ
−ν t
× sλt /η (q1 q2−ν )sμ/η (q1
q2
|η|+|λ|−|μ| 2
q1
η
−ρ
q2 ).
In the above expression $P_\nu(x; q_2, q_1)$ is the Macdonald function such that
\[ P_{\nu^t}(q_1^{-\rho}; q_2, q_1) = q_1^{\frac{||\nu||^2}{2}}\, Z_\nu(q_1,q_2), \qquad Z_\nu(q_1,q_2) = \prod_{(i,j)\in\nu} \Big( 1 - q_1^{\,a(i,j)+1}\, q_2^{\,\ell(i,j)} \Big)^{-1}, \tag{49} \]
with $a(i,j) = \nu^t_j - i$ and $\ell(i,j) = \nu_i - j$.

4.1. Open topological string amplitudes. In this section we will discuss the open string partition function obtained from the topological vertex and its relation with polynomial Hopf link invariants. Recall that the usual topological vertex is given by [18,30]
\[ C_{\lambda\mu\nu}(q_1) = q_1^{\frac{\kappa(\mu)}{2}}\, s_{\nu^t}(q_1^{-\rho}) \sum_{\eta} s_{\lambda^t/\eta}(q_1^{-\rho-\nu})\, s_{\mu/\eta}(q_1^{-\rho-\nu^t}). \tag{50} \]
Although written in terms of the Schur and skew-Schur functions in the above equation, it can be rewritten in terms of $sl(N)$ Hopf link invariants for large $N$ [18],
\[ W_{\lambda\mu}(q_1) = q_1^{-\frac{\kappa(\mu)}{2}}\, C_{\lambda^t \mu \emptyset}(q_1) =
\begin{cases}
\sum_{\eta} s_{\lambda/\eta}(q_1^{-\rho})\, s_{\mu/\eta}(q_1^{-\rho}) \\[4pt]
q_1^{-\frac{\kappa(\mu)}{2}}\, s_{\lambda}(q_1^{-\rho})\, s_{\mu^t}(q_1^{-\rho-\lambda}) \\[4pt]
q_1^{-\frac{\kappa(\mu)+\kappa(\lambda)}{2}}\, s_{\mu^t}(q_1^{-\rho})\, s_{\lambda^t}(q_1^{-\rho-\mu^t}).
\end{cases} \tag{51} \]
The above three expressions are equivalent because of the cyclic symmetry of the topological vertex. Next, we will show that $sl(N)$ Hopf link invariants can be related to the open string partition function calculated using the topological vertex. Equation (51) will guide us in formulating the precise relation between the $sl(N)$ Hopf link invariant and the open string partition function.

4.1.1. Hopf link. As we discussed in Sect. 2, after geometric transition, the Hopf link is represented by a pair of toric Lagrangian branes in the geometry $\mathcal{O}(-1)\oplus\mathcal{O}(-1) \to \mathbb{P}^1$. Furthermore, as we also discussed earlier, there are two possible resolutions of the singular conifold, both given by $\mathcal{O}(-1)\oplus\mathcal{O}(-1) \to \mathbb{P}^1$, related to each other by a flop transition as shown in Fig. 5. We will determine the open string partition function for both these configurations. The open string partition function for the configuration shown in Fig. 5(a) is given by
\[ Z^{I}(q_1, Q, V_1, V_2) = \sum_{\lambda,\mu} Z^{I}_{\lambda\mu}(q_1, Q)\, \mathrm{Tr}_\lambda V_1\, \mathrm{Tr}_\mu V_2, \tag{52} \]
Fig. 5. Two different resolutions of the conifold, (a) and (b), related to each other by flop transition. The normalized partition function of the geometry (b) gives homological sl(N) invariants of the Hopf link decorated by representations (R1, R2). The red mark indicates the choice of the preferred direction for the refined vertex
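where $V_1$ and $V_2$ are the two holonomy matrices associated with the two unknot components of the Hopf link and
\[ \begin{aligned}
Z^{I}_{\lambda\mu}(q_1, Q) &= \sum_{\nu} (-Q)^{|\nu|}\, C_{\lambda\mu\nu}(q_1)\, C_{\emptyset\emptyset\nu^t}(q_1) \\
&= s_{\lambda^t}(q_1^{-\rho})\, s_{\mu^t}(q_1^{-\rho-\lambda},\, Q\, q_1^{\rho}) \prod_{i,j=1}^{\infty} \Big( 1 - Q\, q_1^{\,i+j-1-\lambda^t_j} \Big).
\end{aligned} \tag{53} \]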
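We normalize the above open string partition function by dividing by the closed string partition function to obtain
\[ \widehat{Z}^{I}_{\lambda\mu}(q_1, Q) := \frac{Z^{I}_{\lambda\mu}(q_1, Q)}{Z^{I}_{\emptyset\emptyset}(q_1, Q)} = s_{\lambda^t}(q_1^{-\rho})\, s_{\mu^t}(q_1^{-\rho-\lambda},\, Q\, q_1^{\rho}) \prod_{(i,j)\in\lambda} \big( 1 - Q\, q_1^{\,j-i} \big). \]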
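In the limit $Q \to 0$ we get
\[ \widehat{Z}^{I}_{\lambda\mu}(q_1, Q = 0) = C_{\lambda\mu\emptyset} = q_1^{\frac{\kappa(\mu)}{2}}\, W_{\lambda^t\mu}(q_1). \tag{54} \]
The right-hand side is the large $N$ limit of the $sl(N)$ Hopf link invariant. The above equation suggests the following relation between the open string partition function and the $sl(N)$ Hopf link invariant:
\[ W_{\lambda\mu}(q_1, N) = q_1^{-\frac{\kappa(\mu)}{2}}\, \widehat{Z}^{I}_{\lambda^t\mu}(q_1, Q), \qquad Q = q_1^{N}. \tag{55} \]
For $(\lambda,\mu) = (\square,\square)$ we get
\[ \begin{aligned}
W_{\square\square}(q_1, N) = \widehat{Z}^{I}_{\square\square}(q_1, Q) &= s_{\square}(q_1^{-\rho})\, s_{\square}(q_1^{-\rho-\square},\, Q\, q_1^{\rho})\, (1 - Q) \\
&= \frac{q_1^{1/2}}{1 - q_1} \left( q_1^{-1/2} + \frac{q_1^{3/2}}{1 - q_1} - Q\, \frac{q_1^{1/2}}{1 - q_1} \right) (1 - Q) \\
&= \frac{1 - q_1 + q_1^2}{(1 - q_1)^2} - Q\, \frac{1 + q_1^2}{(1 - q_1)^2} + Q^2\, \frac{q_1}{(1 - q_1)^2}.
\end{aligned} \tag{56} \]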
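The last equality in (56) is elementary algebra; a short worked version, added here as a check and not taken from the original, reads:

```latex
\begin{aligned}
\frac{q_1^{1/2}}{1-q_1}\left(q_1^{-1/2} + \frac{q_1^{3/2}}{1-q_1} - Q\,\frac{q_1^{1/2}}{1-q_1}\right)
  &= \frac{(1-q_1) + q_1^{2} - Q\,q_1}{(1-q_1)^{2}}, \\
\big(1 - q_1 + q_1^{2} - Q\,q_1\big)\,(1-Q)
  &= (1 - q_1 + q_1^{2}) - Q\,(1 + q_1^{2}) + Q^{2}\,q_1 .
\end{aligned}
```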
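Flop transition. The other possibility for the geometry after transition is as shown in Fig. 5(b). In this case the partition function is given by
\[ \begin{aligned}
Z^{II}_{\lambda\mu}(q_1, \widehat{Q}) &= \sum_{\nu} (-\widehat{Q})^{|\nu|}\, C_{\emptyset\mu\nu}(q_1)\, C_{\lambda\emptyset\nu^t}(q_1) \\
&= q_1^{\frac{\kappa(\mu)}{2}} \sum_{\nu} (-\widehat{Q})^{|\nu|}\, s_{\nu}(q_1^{-\rho})\, s_{\nu^t}(q_1^{-\rho})\, s_{\lambda^t}(q_1^{-\rho-\nu^t})\, s_{\mu}(q_1^{-\rho-\nu^t}).
\end{aligned} \tag{57} \]
For $(\lambda,\mu) = (\square,\square)$ we get
\[ \begin{aligned}
\widehat{Z}^{II}_{\square\square}(q_1, \widehat{Q}) = \frac{Z^{II}_{\square\square}(q_1, \widehat{Q})}{Z^{II}_{\emptyset\emptyset}(q_1, \widehat{Q})}
&= \frac{q_1}{(1 - q_1)^2} - \widehat{Q}\, \frac{1 + q_1^2}{(1 - q_1)^2} + \widehat{Q}^2\, \frac{1 - q_1 + q_1^2}{(1 - q_1)^2} \\
&= \widehat{Q}^2 \left[ \frac{1 - q_1 + q_1^2}{(1 - q_1)^2} - \widehat{Q}^{-1}\, \frac{1 + q_1^2}{(1 - q_1)^2} + \widehat{Q}^{-2}\, \frac{q_1}{(1 - q_1)^2} \right] \\
&= \widehat{Q}^2\, \widehat{Z}^{I}_{\square\square}(q_1, \widehat{Q}^{-1}).
\end{aligned} \tag{58} \]
Thus we see that the two partition functions are equal (up to an overall factor) if we define the Kähler parameters for these two cases, related by the flop transition, as
\[ \widehat{Q} = Q^{-1}. \tag{59} \]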
This implies that
\[ W_{\lambda\mu}(q_1, N) = q_1^{-\frac{\kappa(\mu)}{2}} \big( Q^{-1} \big)^{\frac{|\lambda|+|\mu|}{2}}\, \widehat{Z}^{II}_{\lambda^t\mu}(q_1, Q), \qquad Q = q_1^{-N}. \tag{60} \]
Thus we see that when using the usual topological vertex we get the same result for the two geometries (with branes) related by flop transition. This "symmetry", however, is not preserved by the refined topological vertex, as we will see in the next section.

5. Refined Vertex and Link Homologies

In this section we will determine the refined open topological string partition functions for the two configurations of branes on the resolved conifold shown in Fig. 5. Let us begin by defining the refined topological vertex that we will use:
\[ C_{\lambda\mu\nu}(q_1,q_2) = \left(\frac{q_1}{q_2}\right)^{\frac{||\mu||^2 - |\mu|}{2}} q_2^{\frac{\kappa(\mu)}{2}}\, q_2^{\frac{||\nu||^2}{2}}\, Z_\nu(q_1,q_2) \sum_{\eta} \left(\frac{q_2}{q_1}\right)^{\frac{|\eta| + |\lambda| - |\mu|}{2}} s_{\lambda^t/\eta}(q_1^{-\rho} q_2^{-\nu})\, s_{\mu/\eta}(q_2^{-\rho} q_1^{-\nu^t}). \]
The above definition of the refined topological vertex differs from the refined vertex in [17] by a factor which does not affect the closed string calculations, because it cancels due to the interchange of $q_1$, $q_2$ in gluing the vertex along an internal line. For the open string partition functions this factor only appears as an overall factor multiplying the partition function. The open string refined partition function of the geometry shown in Fig. 5(b) is given by
\[ Z_{\lambda\mu}(q_1,q_2,Q) = \sum_{\nu} (-Q)^{|\nu|}\, C_{\emptyset\mu\nu}(q_1,q_2)\, C_{\lambda\emptyset\nu^t}(q_2,q_1). \tag{61} \]
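Since
\[ \begin{aligned}
C_{\emptyset\mu\nu}(q_1,q_2) &= \left(\frac{q_1}{q_2}\right)^{\frac{||\mu||^2}{2}} q_2^{\frac{\kappa(\mu)}{2}}\, q_2^{\frac{||\nu||^2}{2}}\, Z_\nu(q_1,q_2)\, s_{\mu}(q_2^{-\rho} q_1^{-\nu^t}), \\
C_{\lambda\emptyset\nu^t}(q_2,q_1) &= \left(\frac{q_2}{q_1}\right)^{\frac{|\lambda|}{2}} q_1^{\frac{||\nu^t||^2}{2}}\, Z_{\nu^t}(q_2,q_1)\, s_{\lambda^t}(q_2^{-\rho} q_1^{-\nu^t}),
\end{aligned} \tag{62} \]
the open string partition function becomes
\[ \begin{aligned}
Z_{\lambda\mu}(q_1,q_2,Q) &= h_{\lambda\mu}(q_1,q_2) \sum_{\nu} (-Q)^{|\nu|}\, q_2^{\frac{||\nu||^2}{2}}\, q_1^{\frac{||\nu^t||^2}{2}}\, Z_{\nu^t}(q_2,q_1)\, Z_{\nu}(q_1,q_2)\; s_{\lambda^t}(q_2^{-\rho} q_1^{-\nu^t})\, s_{\mu}(q_2^{-\rho} q_1^{-\nu^t}), \\
h_{\lambda\mu}(q_1,q_2) &= \left(\frac{q_1}{q_2}\right)^{\frac{||\mu||^2 - |\lambda|}{2}} q_2^{\frac{\kappa(\mu)}{2}}.
\end{aligned} \]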
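The normalized partition function is given by
\[ \widehat{Z}_{\lambda\mu}(q_1,q_2,Q) = \frac{Z_{\lambda\mu}(q_1,q_2,Q)}{Z_{\emptyset\emptyset}(q_1,q_2,Q)}, \tag{63} \]
where
\[ Z_{\emptyset\emptyset}(q_1,q_2,Q) = \sum_{\nu} (-Q)^{|\nu|}\, q_2^{\frac{||\nu||^2}{2}}\, q_1^{\frac{||\nu^t||^2}{2}}\, Z_{\nu^t}(q_2,q_1)\, Z_{\nu}(q_1,q_2) = \prod_{i,j=1}^{\infty} \Big( 1 - Q\, q_1^{\,i-\frac12}\, q_2^{\,j-\frac12} \Big). \tag{64} \]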
Recall that the $sl(N)$ Hopf link invariant is related to the open string partition function as
\[ W_{\lambda\mu}(q, N) = q^{-\frac{\kappa(\mu)}{2}} \big( Q^{-1} \big)^{\frac{|\lambda|+|\mu|}{2}}\, \widehat{Z}^{II}_{\lambda^t\mu}(q, Q = q^{-N}). \tag{65} \]
The factor $q^{-\frac{\kappa(\mu)}{2}}$ is the framing factor for the usual topological vertex. For the case of the refined vertex the framing factor is given by [17]
\[ f_{\lambda}(q_1,q_2) = \left(\frac{q_2}{q_1}\right)^{\frac{||\mu^t||^2 - |\mu|}{2}} q_1^{-\frac{\kappa(\mu)}{2}}. \tag{66} \]
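Therefore we conjecture the following relation between the homological $sl(N)$ invariants of the Hopf link and the refined open string partition function$^5$:
\[ \begin{aligned}
\bar{\mathcal P}_{\lambda\mu}(q,t,a) &= (-1)^{|\lambda|+|\mu|} \left(\frac{q_1}{q_2}\right)^{|\lambda| + |\lambda||\mu|} f_{\lambda}(q_1,q_2) \left( Q^{-1} \sqrt{\frac{q_1}{q_2}} \right)^{\frac{|\lambda|+|\mu|}{2}} \widehat{Z}^{II}_{\lambda^t\mu}(q_1,q_2,Q) \\
&= \Big[ \sum_{\nu} (-Q)^{|\nu|}\, q_2^{\frac{||\nu||^2}{2}}\, q_1^{\frac{||\nu^t||^2}{2}}\, Z_{\nu}(q_1,q_2)\, Z_{\nu^t}(q_2,q_1)\, s_{\lambda}(q_2^{-\rho} q_1^{-\nu^t})\, s_{\mu}(q_2^{-\rho} q_1^{-\nu^t}) \Big] \\
&\quad \times \Big[ Z_{\emptyset\emptyset}(q_1,q_2,Q) \Big]^{-1} \times \left( Q^{-1} \sqrt{\frac{q_1}{q_2}} \right)^{\frac{|\lambda|+|\mu|}{2}} \times \left(\frac{q_1}{q_2}\right)^{|\lambda||\mu|} (-1)^{|\lambda|+|\mu|}.
\end{aligned} \tag{67} \]
This is one of the main results of the present paper. The map between the knot theory parameters $(q,t,a)$ and the vertex parameters $(q_1,q_2,Q)$ is given by (5), where $a = q^N$, and the limit in which we recover the usual topological vertex calculation is given by $t = -1$.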
5.1. Unknot. From now on we will drop the superscript II on the normalized partition function and will just write it as $\widehat{Z}_{\lambda\mu}(q_1,q_2,Q)$. Below we compute the Poincaré polynomial (67) of the triply-graded homology for small representations $(\lambda,\mu)$ and compare with known results, whenever they are available. For the case $(\lambda,\mu) = (\square,\emptyset)$ we get
\[ \begin{aligned}
\bar{\mathcal P}_{\square\emptyset}(t,q,a) &= -a\, \frac{\sum_{\nu} (-Q)^{|\nu|}\, q_2^{\frac{||\nu||^2}{2}}\, q_1^{\frac{||\nu^t||^2}{2}}\, Z_{\nu}(q_1,q_2)\, Z_{\nu^t}(q_2,q_1)\, s_{\square}(q_2^{-\rho} q_1^{-\nu^t})}{\prod_{i,j=1}^{\infty} \big( 1 - Q\, q_1^{\,i-\frac12}\, q_2^{\,j-\frac12} \big)} \\
&= -a \left( \frac{\sqrt{q_2}}{1 - q_2} - Q\, \sqrt{\frac{1}{q_1}}\, \frac{q_2}{1 - q_2} \right) = a \left( \frac{1}{q - q^{-1}} - \frac{a^{-2}}{q - q^{-1}} \right) \\
&= \frac{a - a^{-1}}{q - q^{-1}}, \qquad a = q^N,
\end{aligned} \]
which is exactly the superpolynomial of the unknot [9]. It is interesting to note that for generic representations the partition function for the unknot depends on both parameters $q$ and $t$, whose interpretation we are currently investigating [31]. However, for totally anti-symmetric representations it is expected to be only a function of $q$ given by (3).

5 The factor $(q_1/q_2)^{|\lambda|}$ has been introduced to make the expression symmetric in $\lambda$ and $\mu$.

Indeed, for $\Lambda^2$ and $\Lambda^3$ we find:
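\[ \begin{aligned}
\bar{\mathcal P}_{\Lambda^2}(t,q,a) &= a^2 \left[ \frac{q^4}{(1-q^2)(1-q^4)} - \frac{a^{-2} q^4}{(1-q^2)^2} + \frac{a^{-4} q^6}{(1-q^2)(1-q^4)} \right], \\
\bar{\mathcal P}_{\Lambda^3}(t,q,a) &= a^3 \left[ -\frac{q^9}{(1-q^2)^3(1+2q^2+2q^4+q^6)} + \frac{a^{-2} q^9}{(1-q^2)^2(1-q^4)} - \frac{a^{-4} q^{11}}{(1-q^2)^2(1-q^4)} + \frac{a^{-6} q^{15}}{(1-q^2)^3(1+2q^2+2q^4+q^6)} \right]
\end{aligned} \tag{68} \]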
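As a hedged cross-check of the first line of (68), one can collect the three terms over a common denominator. This rewriting is made here and is not in the original: the sum equals $q^4 a^{-2}(a^2-1)(a^2-q^2)/((1-q^2)(1-q^4))$, so at $a = q^N$ it is $q^{6-2N}$ times the Gaussian binomial $\binom{N}{2}$ in the variable $x = q^2$, whose coefficients are non-negative integers. The sketch below verifies this numerically in exact arithmetic; all names are illustrative.

```python
from fractions import Fraction


def gaussian_binomial(n, k):
    """Coefficient list (in x) of the Gaussian binomial [n choose k]_x.

    Uses the recurrence [n,k] = [n-1,k-1] + x^k [n-1,k].
    """
    if k < 0 or k > n:
        return [0]
    if k == 0 or k == n:
        return [1]
    a = gaussian_binomial(n - 1, k - 1)
    b = gaussian_binomial(n - 1, k)
    out = [0] * (k * (n - k) + 1)
    for i, c in enumerate(a):
        out[i] += c
    for i, c in enumerate(b):
        out[i + k] += c
    return out


def p_antisym2(q, a):
    """First line of (68), evaluated at numerical q and a."""
    x = q * q
    return a**2 * (
        q**4 / ((1 - x) * (1 - x**2))
        - a**-2 * q**4 / (1 - x) ** 2
        + a**-4 * q**6 / ((1 - x) * (1 - x**2))
    )


def p_antisym2_at_N(q, N):
    """q^(6-2N) times [N choose 2] in x = q^2 (rewriting assumed in this sketch)."""
    x = q * q
    return q ** (6 - 2 * N) * sum(c * x**i for i, c in enumerate(gaussian_binomial(N, 2)))


if __name__ == "__main__":
    q = Fraction(3, 2)
    for N in range(2, 7):
        print(N, gaussian_binomial(N, 2), p_antisym2(q, q**N) == p_antisym2_at_N(q, N))
```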
in complete agreement with (3). Note that for $a = q^N$ the partition functions reduce to finite polynomials in $q$ with non-negative integer coefficients. For representations other than the antisymmetric ones the refined partition function (67) depends on $t$ in a non-trivial way.

5.2. Hopf link. Let us now consider the Hopf link colored by $(R_1,R_2) = (\square,\square)$. In this case, from Eqs. (5) and (67) we get
\[ \begin{aligned}
\bar{\mathcal P}_{\square\square}(t,q,a) &= Q^{-1} \sqrt{\frac{q_1}{q_2}}\; \frac{q_1}{q_2}\; \frac{\sum_{\nu} (-Q)^{|\nu|}\, q_2^{\frac{||\nu||^2}{2}}\, q_1^{\frac{||\nu^t||^2}{2}}\, Z_{\nu}(q_1,q_2)\, Z_{\nu^t}(q_2,q_1)\, \big( s_{\square}(q_2^{-\rho} q_1^{-\nu^t}) \big)^2}{\prod_{i,j=1}^{\infty} \big( 1 - Q\, q_1^{\,i-\frac12}\, q_2^{\,j-\frac12} \big)} \\
&= a^2 \left[ \frac{q_1}{(1-q_2)^2} - Q \sqrt{\frac{q_2}{q_1}}\, \frac{1 + q_1 - q_2 + q_1 q_2}{(1-q_2)^2} + Q^2\, \frac{q_2}{q_1}\, \frac{1 - q_2 + q_1 q_2}{(1-q_2)^2} \right] \\
&= a^2 \left[ \frac{q_1}{(1-q_2)^2} - a^{-2}\, \frac{1 + q_1 - q_2 + q_1 q_2}{(1-q_2)^2} + a^{-4}\, \frac{1 - q_2 + q_1 q_2}{(1-q_2)^2} \right] \\
&= a^{-2} \left[ \frac{1 - q^2 + q^4 t^2}{(1-q^2)^2} - a^{2}\, \frac{1 + q^2 t^2 - q^2 + q^4 t^2}{(1-q^2)^2} + a^{4}\, \frac{q^2 t^2}{(1-q^2)^2} \right].
\end{aligned} \]
This result agrees with the superpolynomial of the Hopf link computed in Eq. (38). For $a = q^N$ we get
\[ \begin{aligned}
\bar{\mathcal P}_{\square\square}(q,t,a = q^N) &= q^{-2N} \left[ \frac{1 - q^2 + q^4 t^2}{(1-q^2)^2} - q^{2N}\, \frac{1 + q^2 t^2 - q^2 + q^4 t^2}{(1-q^2)^2} + q^{4N}\, \frac{q^2 t^2}{(1-q^2)^2} \right] \\
&= q^{-2N} \left[ \frac{(1-q^2)(1-q^{2N})}{(1-q^2)^2} + t^2\, \frac{q^4 - q^{2N+2} - q^{2N+4} + q^{4N+2}}{(1-q^2)^2} \right] \\
&= q^{-2N} \left[ q^{N-1}\, \frac{q^N - q^{-N}}{q - q^{-1}} + t^2\, \frac{q^2 - q^{2N} - q^{2N+2} + q^{4N}}{(q - q^{-1})^2} \right] \\
&= q^{-2N} \left[ q^{N-1}\, \frac{q^N - q^{-N}}{q - q^{-1}} + t^2\, \frac{(1-q^{2N})^2 - (1-q^2)(1-q^{2N})}{(q - q^{-1})^2} \right] \\
&= q^{-2N} \left[ q^{N-1}\, \frac{q^N - q^{-N}}{q - q^{-1}} + t^2 \left( \frac{1-q^{2N}}{q - q^{-1}} \right)^{2} + t^2 q\, \frac{1-q^{2N}}{q - q^{-1}} \right] \\
&= q^{-2N} \left[ q^{N-1}\, \frac{q^N - q^{-N}}{q - q^{-1}} + t^2 q^{2N} \left( \frac{q^N - q^{-N}}{q - q^{-1}} \right)^{2} - t^2 q^{N+1}\, \frac{q^N - q^{-N}}{q - q^{-1}} \right] \\
&= q^{-2N}\, \overline{KhR}_N(2_1^2),
\end{aligned} \]
which is exactly the expression Eq. (39) calculated in Sect. 3.

Hopf link colored by ( , ).
For the Hopf link colored by ( , ) we get
\[ \begin{aligned}
\bar{\mathcal P}_{(\;,\;)}(t,q,a) &= \frac{a^{-3}(1 - q^4 + q^6 t^2)}{(1-q^2)^2(1-q^4)} - \frac{a^{-1} q^{-2}(1 + q^2 - q^4 - q^6 + q^4 t^2 + q^6 t^2 + q^8 t^2)}{(1-q^2)^2(1-q^4)} \\
&\quad + \frac{a\, q^{-2}(1 - q^4 + q^2 t^2 + q^4 t^2 + q^6 t^2)}{(1-q^2)^2(1-q^4)} - \frac{a^3 t^2}{(1-q^2)^2(1-q^4)}.
\end{aligned} \]
There is no knot theory result with which we can compare this result. However, note that this has all the right properties. It vanishes for $a = 1$, i.e., $N = 0$, and for $a = q^N$ it gives $q^{-3N}$ times a finite polynomial with positive integer coefficients:
\[ \begin{aligned}
\bar{\mathcal P}_{(\;,\;)}(t,q,a = 1) &= \bar{\mathcal P}_{(\;,\;)}(t,q,a = q) = 0, \\
\bar{\mathcal P}_{(\;,\;)}(t,q,a = q^2) &= q^{-6}(1 + q^2), \\
\bar{\mathcal P}_{(\;,\;)}(t,q,a = q^3) &= q^{-9}\big(1 + 2q^2 + 2q^4 + q^6 + t^2 q^6 + t^2 q^8 + t^2 q^{10}\big), \\
\bar{\mathcal P}_{(\;,\;)}(t,q,a = q^4) &= q^{-12}\big(1 + 2q^2 + 3q^4 + 3q^6 + 2q^8 + q^{10} + t^2 q^6 (1 + 2q^2 + 3q^4 + 3q^6 + 2q^8 + q^{10})\big), \\
\bar{\mathcal P}_{(\;,\;)}(t,q,a = q^5) &= q^{-15}\big(1 + 2q^2 + 3q^4 + (4 + t^2) q^6 + (4 + 2t^2) q^8 + (3 + 4t^2) q^{10} + (2 + 5t^2) q^{12} \\
&\qquad + (1 + 6t^2) q^{14} + 5t^2 q^{16} + 4t^2 q^{18} + 2t^2 q^{20} + t^2 q^{22}\big), \\
\bar{\mathcal P}_{(\;,\;)}(t,q,a = q^6) &= q^{-18}\big(1 + 2q^2 + 3q^4 + (4 + t^2) q^6 + (5 + 2t^2) q^8 + (5 + 4t^2) q^{10} + (4 + 6t^2) q^{12} + (3 + 8t^2) q^{14} \\
&\qquad + (2 + 9t^2) q^{16} + (1 + 9t^2) q^{18} + 8t^2 q^{20} + 6t^2 q^{22} + 4t^2 q^{24} + 2t^2 q^{26} + t^2 q^{28}\big).
\end{aligned} \tag{69} \]
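The specializations quoted in this section lend themselves to a quick numerical spot check. The sketch below evaluates, in exact rational arithmetic, the closed forms of Sect. 5.2 and of the colored Hopf link as they are read here from the text, and compares them with (39) and with the first few lines of (69). The function names are illustrative, and agreement at a few rational points is a sanity check rather than a proof.

```python
from fractions import Fraction


def qnum(q, N):
    """(q^N - q^-N)/(q - q^-1)."""
    return (q**N - q**(-N)) / (q - 1 / q)


def khr_hopf(q, t, N):
    """Equation (39)."""
    n = qnum(q, N)
    return q ** (N - 1) * n + q ** (2 * N) * t**2 * n**2 - q ** (N + 1) * t**2 * n


def p_fund(q, t, a):
    """Result of Sect. 5.2 for the Hopf link in the fundamental representation."""
    x = q * q
    num = (
        a**-2 * (1 - x + x**2 * t**2)
        - (1 + x * t**2 - x + x**2 * t**2)
        + a**2 * x * t**2
    )
    return num / (1 - x) ** 2


def p_colored(q, t, a):
    """Expression preceding (69) for the colored Hopf link."""
    den = (1 - q**2) ** 2 * (1 - q**4)
    num = (
        a**-3 * (1 - q**4 + q**6 * t**2)
        - a**-1 * q**-2 * (1 + q**2 - q**4 - q**6 + q**4 * t**2 + q**6 * t**2 + q**8 * t**2)
        + a * q**-2 * (1 - q**4 + q**2 * t**2 + q**4 * t**2 + q**6 * t**2)
        - a**3 * t**2
    )
    return num / den


if __name__ == "__main__":
    q, t = Fraction(3, 2), Fraction(-5, 7)
    print(all(p_fund(q, t, q**N) == q ** (-2 * N) * khr_hopf(q, t, N) for N in range(1, 6)))
    print(p_colored(q, t, Fraction(1)), p_colored(q, t, q))
    print(p_colored(q, t, q**2) == q**-6 * (1 + q**2))
```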
Acknowledgements. We would like to thank C. Doran, J. Rasmussen, and B. Webster for valuable discussions. It is our pleasure to thank the Stony Brook physics department and the 4th Simons Workshop in Mathematics and Physics for hospitality during the initial stages of this work. In addition, C.V. thanks the CTP at MIT for hospitality during his sabbatical leave. The work of S.G. is supported in part by DOE grant DE-FG03-92-ER40701, in part by RFBR grant 04-02-16880, and in part by the grant for support of scientific schools NSh-8004.2006.2. The work of C.V. is supported in part by NSF grants PHY-0244821 and DMS-0244464. Open Access This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
A. Appendix: Other Representations

In this appendix we write the normalized partition function of the Hopf link and unknot colored by other representations of $sl(N)$ which, as usual, we label by partitions (or Young diagrams). Specifically, we list simple examples where Young diagrams have at most two columns. Let us define
\[ G_{\lambda\mu}(-Q,q,t) := \Big[ \sum_{\nu} (-Q)^{|\nu|}\, q_2^{\frac{||\nu||^2}{2}}\, q_1^{\frac{||\nu^t||^2}{2}}\, Z_{\nu}(q_1,q_2)\, Z_{\nu^t}(q_2,q_1)\, s_{\lambda}(q_2^{-\rho} q_1^{-\nu^t})\, s_{\mu}(q_2^{-\rho} q_1^{-\nu^t}) \Big] \times \Big[ Z_{\emptyset\emptyset}(q_1,q_2,Q) \Big]^{-1}, \]
where we used the identification $(q_1,q_2) = (t^2 q^2, q^2)$ to write $G$ as a function of $q$ and $t$. In terms of $G_{\lambda\mu}(Q,q,t)$ the normalized partition function is given by
\[ \widehat{Z}_{\lambda\mu}(q_1,q_2,Q) = h_{\lambda\mu}\, G_{\lambda^t\mu}(-Q,q,t), \qquad h_{\lambda\mu} = q^{\kappa(\mu)}\, t^{\,||\mu||^2 - |\lambda|}. \tag{70} \]
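The combinatorial data entering (70) and the arm and leg lengths of (49) are easy to compute for explicit Young diagrams. The sketch below is illustrative only; in particular it assumes the standard definition $\kappa(\mu) = ||\mu||^2 - ||\mu^t||^2$, which is not spelled out in this part of the text, and all function names are introduced here.

```python
def transpose(mu):
    """Transposed partition mu^t of a partition given as a non-increasing tuple."""
    return tuple(sum(1 for part in mu if part > j) for j in range(mu[0])) if mu else ()


def norm_sq(mu):
    """||mu||^2 = sum_i mu_i^2."""
    return sum(part * part for part in mu)


def kappa(mu):
    """kappa(mu) = ||mu||^2 - ||mu^t||^2 (assumed standard definition)."""
    return norm_sq(mu) - norm_sq(transpose(mu))


def h_exponents(lam, mu):
    """Exponents (of q, of t) in h_{lam mu} = q^kappa(mu) t^(||mu||^2 - |lam|), cf. (70)."""
    return kappa(mu), norm_sq(mu) - sum(lam)


def arm_leg(nu):
    """Map each box (i, j) of nu (1-indexed) to (a, l) with a = nu^t_j - i, l = nu_i - j, cf. (49)."""
    nut = transpose(nu)
    return {
        (i, j): (nut[j - 1] - i, nu[i - 1] - j)
        for i in range(1, len(nu) + 1)
        for j in range(1, nu[i - 1] + 1)
    }


if __name__ == "__main__":
    print(transpose((3, 1)), kappa((2,)), kappa((1, 1)))
    print(h_exponents((1,), (2,)))
    print(arm_leg((2, 1)))
```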
In the next two sections we list G λ μ for various Young diagrams. A.1. Unknot. G (1) (Q, q, t) =
q 1−q 2
G (12 ) (Q, q, t) = G (2) (Q, q, t) = G (13 ) (Q, q, t) = G (21 11 ) (Q, q, t) =
G (14 ) (Q, q, t) =
( qt )Q 1−q 2
q2 (1−q 2 )(1−q 4 )
q5 (1−q 2 )3 (1+q 2 +q 4 )
+
6
4
q4 (1−q 2 )(1−q 4 )
+
+
q9 (1−q 2 )3 (1+2q 2 +2q 4 +q 6 )
( qt )Q (1−q 2 )2
+
( q2 )Q 2
t (1−q 2 )(1−q 4 )
(1−q 2 +q 2 t 2 )Q t 3 (1−q 2 )2
+
q9 Q t (1−q 2 )2 (1−q 4 )
Q(q 3 −q 5 +q 5 t 2 ) t 3 (1−q 2 )3
+
+
(1−q 4 +q 4 t 2 )Q 2 t 4 (1−q 2 )(1−q 4 )
+
q 11 Q 2 t 2 (1−q 2 )2 (1−q 4 )
Q 2 (q 3 −q 7 +q 7 t 2 ) t 4 (1−q 2 )3
+
+
q 15 Q 3 t 3 (1−q 2 )3 (1+2q 2 +2q 4 +q 6 )
Q 3 (q 5 −q 11 +q 11 t 2 ) t 5 (1−q 2 )3 (1+q 2 +q 4 )
q 16 q 16 Q q 18 Q 2 + + 2 (1−q 2 )2 (1−q 4 )2 (1+q 2 +2q 4 +q 6 +q 8 ) t (1−q 2 )4 (1+2q 2 +2q 4 +q 6 ) t (1−q 2 )2 (1−q 4 )2
+ G (2 12 ) (Q, q, t) =
+
q 22 Q 3 q 28 Q 4 + 4 t 3 (1−q 2 )4 (1+2q 2 +2q 4 +q 6 ) t (1−q 2 )2 (1−q 4 )2 (1+q 2 +2q 4 +q 6 +q 8 )
8 −q 10 +q 10 t 2 ) 2 8 10 14 16 12 2 14 t 2 +q 16 t 2 ) q 10 + Q(q + Q (q +q −q4 −q 2 2+q t 4+q (1−q 2 )(1−q 4 )(1−q 8 ) t 3 (1−q 2 )3 (1−q 4 ) t (1−q ) (1−q )2 3 10 16 +q 16 t 2 ) Q 4 (q 14 −q 22 +q 22 t 2 ) + Q5 (q −q 2 2 4 2 2 2 4 2 + 6
t (1−q ) (1−q )
t (1−q ) (1−q )
6 12 +q 8 t 2 +q 12 t 2 ) q8 G (22 ) (Q, q, t) = + 3Q(q −q (1−q 2 )2 (1−q 4 )2 (1+q 2 +q 4 ) t (1−q 2 )4 (1+2q 2 +2q 4 +q 6 ) 2 6 8 10 12 6 2 8 2 12 2 14 2 10 4 14 4 + Q (q −q −q +q +q6 t +q2 t2 −q 4 t 2 −q t +q t +q t )
t (1−q ) (1−q )
3 6 10 12 16 8 2 10 2 12 2 14 t 2 −q 16 t 2 −q 18 t 2 +q 14 t 4 +q 18 t 4 ) + Q (q −q −q +q +q t7 +q 2t 4+q t −q t (1−q ) (1+2q 2 +2q 4 +q 6 ) 4 8 12 14 18 12 2 14 2 18 2 20 2 20 4 + Q (q −q −q 8 +q 2+q2 t +q4 2 t −q2 t4 −q t +q t )
t (1−q ) (1−q ) (1+q +q )
25
q 25 Q t (1−q 2 )2 (1−q 4 )(1−q 8 )(1−q 6 )
q G (15 ) = − (1−q 2 )(1−q 4 )(1−q 8 )(1−q 6 )(1−q 10 ) + 27
2
q 31 Q 3 t 3 (1−q 2 )2 (1−q 4 )2 (1−q 6 )
Q + t 2 (1−q 2 )2q(1−q 4 )2 (1−q 6 ) + 37
4
q 45 Q 5 t 5 (1−q 2 )(1−q 4 )(1−q 8 )(1−q 6 )(1−q 10 ) * + Q q 15 −q 17 +q 17 t 2
q Q + t 4 (1−q 2 )2 (1−q 4 )(1−q 8 )(1−q 6 ) + q 17
G (2 13 ) = − +
Q −q 15 +q 23 −q 19 t 2 −q 23 t
G (22 1) =
+ 2
−
(−1+q 2 )5 (1+2q 2 +2q 4 +q 6 )t 4 * 4
+
−
(−1+q 2 )5 (1+q 2 )(1+q 2 +q 4 )(1+q 2 +q 4 +q 6 +q 8 ) * 2
Q −q 21 +q 29 −q 29 t
(
+ 2
)(
5 −1+q 2
1+2q 2 +2q 4 +q 6
−
)
t6
q 13 (1−q 2 )2 (1−q 4 )(1−q 6 )(1−q 8 )
+
(−1+q 2 ) (1+2q 2 +2q 4 +q 6 )t 3 * + Q 3 q 17 +q 21 −q 23 −q 27 +q 23 t 2 +q 27 t 2 5
(−1+q 2 )5 (1+2q 2 +2q 4 +q 6 )t 5
* + Q 5 q 27 −q 37 +q 37 t 2
(
) (1+q 2 )(1+q 2 +q 4 )(1+q 2 +q 4 +q 6 +q 8 )t 7
5 −1+q 2
Q(q 11 +q 13 −q 19 −q 21 +q 13 t 2 +q 15 t 2 +q 17 t 2 +q 19 t 2 +q 21 t 2 ) t 3 (1−q 2 )3 (1−q 4 )2 (1+q 2 +2q 4 +q 6 +q 8 )
2 11 15 17 21 2 11 13 +q 15 −q 19 −2q 21 −q 23 )+t 4 (q 15 +q 17 +q 19 +q 21 +q 23 )) + Q (q −q −q +q +t (q +2q t 6 (1−q 2 )2 (1−q 4 )2 (1−q 6 )
+Q +
G (2 14 ) =
5 (q 17 −q 23 −q 25 +q 31 +t 2 (q 23 +q 25 −q 31 −q 33 )+q 33 t 4 )
(1−q 2 )3 (1−q 4 )2 (1+q 2 +2q 4 +q 6 +q 8 ) q 36
(−1+q 2 )6 (1+q 2 )3 (1+q 2 +q 4 )2 (1+2q 4 +q 6 +2q 8 +q 10 +2q 12 +q 16 ) q 36 Q
+
(1−2q 2 +q 6 +q 10 −2q 16 +q 22 +q 26 −2q 30 +q 32 )t
+
(−1+q 2 )6 (1+q 2 )3 (1+q 4 )(1+q 2 +q 4 )t 2
+
(−1+q 2 )6 (1+q 2 )3 (1+q 4 )(1+q 2 +q 4 )t 4
+
(−1+q 2 )6 (1+q 2 )3 (1+q 2 +q 4 )2 (1+2q 4 +q 6 +2q 8 +q 10 +2q 12 +q 16 )t 6
q 38 Q 2 q 48 Q 4
(
)(
Q2
− +
(−1+q 2 )6 (1+2q 2 +2q 4 +q 6 )2 t 3
+
(1−2q 2 +q 6 +q 10 −2q 16 +q 22 +q 26 −2q 30 +q 32 )t 5
)
3 1+q 2
q 26 (1+q 4 )(1−q 2 +q 4 )(1+q 2 +q 4 )2
Q3
*
+
* + Q −q 24 +q 26 −q 26 t 2
(
) (1+q 2 )2 (1+q 4 )(1+q 2 +q 4 )t 3
6 −1+q 2
(−1+q 2 )6 (1+q 2 )3 (1+q 2 +2q 4 +q 6 +q 8 )t 4 −q 26 −q 28 −q 30 +q 36 +q 38 +q 40 −q 32 t 2 −q 34 t 2 −q 36 t 2 −q 38 t 2 −q 40 t 2
+
(−1+q 2 )6 (1+2q 2 +2q 4 +q 6 )2 t 5
(−1+q 2 ) (1+q 2 ) (1+q 2 +2q 4 +q 6 +q 8 )t 6 * + * + Q 5 −q 36 +q 46 −q 46 t 2 Q 6 q 44 −q 56 +q 56 t 2 6
3
(−1+q 2 )6 (1+q 2 )2 (1+q 4 )(1+q 2 +q 4 )t 7
+
(−1+q 2 )6 (1+q 2 )3 (1+q 2 +q 4 )2 (1−q 2 +2q 4 −q 6 +q 8 )t 8 +
* + Q −q 18 +q 28 −q 20 t 2 −q 24 t 2 −q 28 t 2
)( )( )( ) (−1+q 2 ) (1+q 2 ) (1+q 4 )(1+q 2 +q 4 +q 6 +q 8 )t 3 * + Q 2 q 18 −q 20 −q 26 +q 28 +q 18 t 2 +q 20 t 2 −q 28 t 2 −q 30 t 2 +q 22 t 4 +q 26 t 4 +q 30 t 4
6 −1+q 2
+
−
* 24 26 34 36 28 2 30 2 32 2 34 2 36 2 + q +q −q −q +q t +q t +q t +q t +q t
q 20
+
q 56 Q 5
* + Q 4 q 30 +q 32 +q 34 +q 36 −q 40 −q 42 −q 44 −q 46 +q 38 t 2 +q 40 t 2 +q 42 t 2 +q 44 t 2 +q 46 t 2
− (
q 42 Q 3
+
q 66 Q 6
6 −1+q 2
+
G (22 12 ) =
t 7 (1−q 2 )2 (1−q 4 )2 (1−q 6 )
Q 4 q 13 (1+q 2 +q 4 −q 6 −2q 8 −2q 10 −q 12 +q 14 +q 16 +q 18 +t 2 q 4 (1+2q 2 +2q 4 +2q 6 −2q 10 −2q 12 −2q 14 −q 33 )+t 4 q 12 (1+q 2 +q 4 +q 6 +q 8 )) t 8 (1−q 2 )3 (1−q 4 )2 (1+q 2 +2q 4 +q 6 +q 8 )
+Q G (16 ) =
3 (q 11 +q 13 −2q 17 −2q 19 +q 23 +q 25 +t 2 (q 13 +2q 15 +2q 17 +q 19 −q 21 −2q 23 −2q 25 −q 27 )+t 4 (q 19 +q 23 +q 25 +q 27 ))
3 1+q 2
1+q 4
1+q 2 +q 4 +q 6 +q 8
6
2
Q3
(−1+q 2 ) (1+q 2 ) (1+q 4 )t 6 * 18 20 26 28 20 2 30 2 26 4 28 4 30 4 + −q +q +q −q −q t +q t −q t +q t −q t
Q4
(−1+q 2 ) (1+q 2 ) t 7 * 20 24 26 28 30 32 34 38 2 24 26 28 30 34 36 38 40 4 32 36 40 + q +q −q −q −q −q +q +q +t (q +q +q +q −q −q −q −q )+t (q +q +q )
6
6
Q 5 q 24
*
3
2
(−1+q 2 )6 (1+q 2 )3 (1+q 4 )t 8 (−1+q 2 )6 (1+q 2 )2 (1+q 2 +2q 4 +2q 6 +2q 8 +q 10 +q 12 )t 9
+
−
1+q 4 −q 8 −q 10 −q 12 −q 14 +q 18 +q 22 +t 2 q 6 (1+q 2 +q 4 +q 6 +q 8 −q 10 −q 12 −q 14 −q 16 −q 18 )+t 4 q 16 (1+q 4 +q 8 )
* + Q 6 q 30 −q 38 −q 40 +q 48 +q 38 t 2 +q 40 t 2 −q 48 t 2 −q 50 t 2 +q 50 t 4
(−1+q 2 )6 (1+q 2 )3 (1+q 2 +2q 4 +2q 6 +2q 8 +q 10 +q 12 )t 10
+
G (23 ) =
q 18 (1−q 2 )6 (1+q 2 )3 (1+q 4 )(1+q 2 +q 4 )2
+Q
−
Q(q 16 −q 18 +q 20 −q 22 +q 18 t 2 −q 20 t 2 +q 22 t 2 ) (1−q 2 )6 (1+q 2 )2 (1+q 2 +2q 4 +q 6 +q 8 )t 3
2 (q 16 −q 22 −q 24 +q 30 +t 2 (q 16 +q 18 +q 20 +q 22 −q 26 −q 28 −q 30 −q 32 )+t 4 (q 20 +q 24 +q 26 +q 28 +q 32 ))
(1−q 2 )6 (1+q 2 )3 (1+q 2 +2q 4 +q 6 +q 8 )t 6
3 18 2 4 8 10 12 +t 2 q −2 (1+q 2 +q 4 −q 6 −2q 8 −2q 10 −q 12 +q 14 +q 16 +q 18 )) + Q q (1−q −q +q +q −q(1−q 2 )6 (1+2q 2 +2q 4 +q 6 )2 t 9
+Q
3 q 20 t 4 (1+q 2 +2q 4 +q 6 +q 8 −q 10 −q 12 −2q 14 −q 16 −q 18 +t 2 q 4 (1+q 4 +q 6 +q 8 +q 12 ))
(1−q 2 )6 (1+2q 2 +2q 4 +q 6 )2 t 9
4 18 22 24 26 28 30 +q 32 −q 36 +t 2 (q 18 +q 20 +2q 22 −q 26 −3q 28 −3q 30 −q 32 +2q 36 +q 38 +q 40 )) + Q (q −q −q −q +q +q (1−q 2 )6 (1+q 2 )3 (1+q 2 +2q 4 +q 6 +q 8 )t 10
Q 4 t 4 (q 22 +q 24 +2q 26 +2q 28 +q 30 −q 34 −2q 36 −2q 38 −q 40 −q 42 +t 2 (q 30 +q 34 +q 36 +q 38 +q 42 )) (1−q 2 )6 (1+q 2 )3 (1+q 2 +2q 4 +q 6 +q 8 )t 10
−Q
5 q 20 (1−q 2 −q 6 +q 10 +q 14 −q 16 +t 2 q 2 (1+q 4 −q 6 −q 8 −q 10 −q 12 +q 14 +q 18 )+t 4 q 8 (1+q 4 −q 10 −q 14 )+t 6 q 18 (1−q 2 +q 4 ))
(1−q 2 )6 (1+q 2 )2 (1+q 2 +2q 4 +q 6 +q 8 )t 11
Q 6 q 24 (1−q 4 −q 6 −q 8 +q 10 +q 12 +q 14 −q 18 +t 2 q 4 (1+q 2 +q 4 −q 6 −2q 8 −2q 10 −q 12 +q 14 +q 16 +q 18 )) (−1+q 2 )6 (1+q 2 )3 (1+q 4 )(1+q 2 +q 4 )2 t 12 +q −q −q −q )+q t ) + Q t q 2 ((1+q (−1+q )6 (1+q 2 )3 (1+q 4 )(1+q 2 +q 4 )2 t 12 6 4 36
G (23 1) = − +
2
4
8
10
12
24 6
q 25
(−1+q 2 )7 (1+q 2 )2 (1+q 4 )(1+q 2 +q 4 )2 (1+q 2 +q 4 +q 6 +q 8 )
* + Q q 23 +q 25 +q 27 −q 33 −q 35 −q 37 +q 25 t 2 +q 27 t 2 +q 29 t 2 +q 31 t 2 +q 33 t 2 +q 35 t 2 +q 37 t 2 Q 2 q 23
−
*
(−1+q 2 )7 (1+2q 2 +2q 4 +q 6 )2 (1+q 2 +2q 4 +2q 6 +2q 8 +q 10 +q 12 )t 3 1−q 8 −q 10 +q 18 +t 2 (1+q 2 +q 4 +q 6 +q 8 −q 12 −q 14 −q 16 −q 18 −q 20 )+t 4 q 4 (1+q 4 +q 6 +q 8 +q 10 +q 12 +q 16 )
(
)(
7 −1+q 2
)(
2 1+q 2
1+2q 2 +4q 4 +5q 6 +6q 8 +5q 10 +4q 12 +2q 14 +q 16
+
)
t6
3 25 27 31 35 39 −q 41 +t 2 (q 23 +q 25 +q 27 −q 31 −2q 33 −2q 35 −q 37 +q 41 +q 43 +q 45 )) + Q (q −q −q +q +q (1−q 2 )7 (1+q 4 )(1+2q 2 +2q 4 +q 6 )2 t 9
+Q
3 t 4 (q 25 +q 27 +2q 29 +q 31 +2q 33 −2q 39 −q 41 −2q 43 −q 45 −q 47 +t 2 (q 31 +q 35 +q 37 +q 39 +q 41 +q 43 +q 47 ))
(1−q 2 )7 (1+q 4 )(1+2q 2 +2q 4 +q 6 )2 t 9
4 25 6 8 14 16 22 2 2 +2q 4 +q 6 −2q 10 −3q 12 −3q 14 −2q 16 +q 20 +2q 22 +q 24 +q 26 )) − Q q (1−q −2q +2q +q −q +t 2(1+q (−1+q )7 (1+q 4 )(1+2q 2 +2q 4 +q 6 )2 t 10
+Q
4 t 4 q 25 (q 4 +q 6 +2q 8 +2q 10 +2q 12 +q 14 −q 18 −2q 20 −2q 22 −2q 24 −q 26 −q 28 )+t 6 (q 12 +q 16 +q 18 +q 20 +q 22 +q 24 +q 28 ))
(−1+q 2 )7 (1+q 4 )(1+2q 2 +2q 4 +q 6 )2 t 10
Q 5 (q 27 +q 31 −q 33 −q 35 −2q 37 −q 39 +q 43 +2q 45 +q 47 +q 49 −q 51 −q 55 ) + (−1+q 2 )7 (1+q 2 )2 (1+2q 2 +4q 4 +5q 6 +6q 8 +5q 10 +4q 12 +2q 14 +q 16 )t 11
+Q
5 (q 29 t 2 +q 31 t 2 +2q 33 t 2 +2q 35 t 2 +q 37 t 2 −q 39 t 2 −2q 41 t 2 −4q 43 t 2 −4q 45 t 2 −2q 47 t 2 −q 49 t 2 +q 51 t 2 +2q 53 t 2 +2q 55 t 2 )
(−1+q 2 )7 (1+q 2 )2 (1+2q 2 +4q 4 +5q 6 +6q 8 +5q 10 +4q 12 +2q 14 +q 16 )t 11
5 57 2 59 2 35 t 4 +q 37 t 4 +2q 39 t 4 +2q 41 t 4 +3q 43 t 4 +q 45 t 4 +q 47 t 4 −q 49 t 4 −q 51 t 4 −3q 53 t 4 −2q 55 t 4 −2q 57 t 4 ) + Q (q t +q t +q (−1+q 2 )7 (1+q 2 )2 (1+2q 2 +4q 4 +5q 6 +6q 8 +5q 10 +4q 12 +2q 14 +q 16 )t 11
−
Q 5 (q 59 t 4 −q 61 t 4 +q 45 t 6 +q 49 t 6 +q 51 t 6 +q 53 t 6 +q 55 t 6 +q 57 t 6 +q 61 t 6 )
(−1+q 2 )7 (1+q 2 )2 (1+2q 2 +4q 4 +5q 6 +6q 8 +5q 10 +4q 12 +2q 14 +q 16 )t 11
−Q
6 (q 31 +q 33 +q 35 −2q 39 −3q 41 −3q 43 −q 45 +q 47 +3q 49 +3q 51 +2q 53 −q 57 −q 59 −q 61 )
(−1+q 2 )7 (1+2q 2 +2q 4 +q 6 )2 (1+q 2 +2q 4 +2q 6 +2q 8 +q 10 +q 12 )t 12
6 35 2 2 4 6 8 10 12 −6q 14 −6q 16 −4q 18 −q 20 +2q 22 +3q 24 +3q 26 +2q 28 +q 30 ) − Q q t (1+2q +3q +3q2 7+2q −q2 −4q (−1+q ) (1+2q +2q 4 +q 6 )2 (1+q 2 +2q 4 +2q 6 +2q 8 +q 10 +q 12 )t 12
−Q
6 q 43 t 4 (1+2q 2 +3q 4 +3q 6 +3q 8 +2q 10 −2q 14 −3q 16 −3q 18 −3q 20 −2q 22 −q 24 +t 2 (q 12 +q 14 +16 +q 18 +q 20 +q 22 +q 24 ))
(−1+q 2 )7 (1+2q 2 +2q 4 +q 6 )2 (1+q 2 +2q 4 +2q 6 +2q 8 +q 10 +q 12 )t 12
Q 7 (q 37 −q 43 −q 45 −q 47 +q 51 +q 53 +q 55 −q 61 +q 43 t 2 +q 45 t 2 +q 47 t 2 −q 51 t 2 −2q 53 t 2 −2q 55 t 2 −q 57 t 2 +q 61 t 2 +q 63 t 2 )
(−1+q 2 )7 (1+2q 2 +2q 4 +q 6 )2 (1+q 2 +2q 4 +2q 6 +2q 8 +q 10 +q 12 )t 13
+
Q 7 (q 65 t 2 +q 53 t 4 +q 55 t 4 +q 57 t 4 −q 63 t 4 −q 65 t 4 −q 67 t 4 +q 67 t 6 )
(−1+q 2 )7 (1+2q 2 +2q 4 +q 6 )2 (1+q 2 +2q 4 +2q 6 +2q 8 +q 10 +q 12 )t 13
A.2. Hopf link. G (1) (1) = G (1) (12 ) =
q2 (1−q 2 )2
−
q5 (1−q 2 )2 (1−q 4 )
Q(1−q 2 +q 2 t 2 +q 4 t 2 ) t 3 (1−q 2 )2
−
+
Q 2 (1−q 2 +q 4 t 2 ) t 4 (1−q 2 )2
Q(q 3 −q 7 +q 5 t 2 +q 7 t 2 +q 9 t 2 ) t 3 (1−q 2 )2 (1−q 4 )
+
Q 2 (q 3 +q 5 −q 7 −q 9 +q 7 t 2 +q 9 t 2 +q 11 t 2 ) t 4 (1−q 2 )2 (1−q 4 )
−
Q 3 (q 5 −q 9 +q 11 t 2 ) t 5 (1−q 2 )2 (1−q 4 )
G (1) (13 ) = −Q
q 10 (1−q 2 )4 (1+2q 2 +2q 4 +q 6 )
Q(q 8 −q 14 +q 10 t 2 +q 12 t 2 +q 14 t 2 +q 16 t 2 ) t 3 (1−q 2 )4 (1+2q 2 +2q 4 +q 6 )
−
3 (q 10 +q 12 +q 14 −q 16 −q 18 −q 20 +q 16 t 2 +q 18 t 2 +q 20 t 2 +q 22 t 2 )
+
t 5 (1−q 2 )4 (1+2q 2 +2q 4 +q 6 )
G (1,1) (12 ) = +Q
q8 (1−q 2 )4 (1+q 2 )2
+
Q 2 (q 8 −q 14 +q 12 t 2 +q 16 t 2 ) t 4 (1−q 2 )3 (1−q 4 )
Q 4 (q 14 −q 20 +q 22 t 2 ) t 6 (1−q 2 )4 (1+2q 2 +2q 4 +q 6 )
Q(q 6 −q 10 +q 8 t 2 +q 12 t 2 ) t 3 (1−q 2 )4 (1+q 2 )
−
2 (q 6 −q 8 −q 10 +q 12 +t 2 (q 6 +2q 8 +q 10 −q 12 −2q 14 −q 16 )+t 4 (q 10 +q 12 +2q 14 +q 16 +q 18 ))
t 6 (1−q 2 )4 (1+q 2 )2
3 6 8 10 12 2 (q 8 +q 10 −q 14 −q 16 )+t 4 (q 14 +q 18 )) − Q (q −q −q +q t+t 7 (1−q 2 )4 (1+q 2 )
+Q
4 (q 8 −q 10 −q 12 +q 14 +t 2 (q 12 +q 14 −q 16 −q 18 )+q 20 t 4 )
t 8 (1−q 2 )4 (1+q 2 )2
q 17 (1−q 2 )5 (1+q 2 )2 (1+q 4 )(1+q 2 +q 4 )
G (1) (14 ) =
−
Q(q 15 −q 23 +q 17 t 2 +q 19 t 2 +q 21 t 2 +q 23 t 2 +q 25 t 2 ) t 3 (1−q 2 )5 (1+q 2 )(1+q 2 +2q 4 +q 6 +q 8 )
+Q
2 (q 15 +q 17 −q 23 −q 25 +q 19 t 2 +q 21 t 2 +q 23 t 2 +q 25 t 2 +q 27 t 2 )
+Q
4 (q 21 +q 23 +q 25 +q 27 −q 29 −q 31 −q 33 −q 35 +t 2 (q 29 +q 31 +q 33 +q 35 +q 37 ))
t 4 (1−q 2 )5 (1+q 2 )2 (1+q 2 +q 4 )
Q 3 (q 17 +q 19 +q 21 −q 25 −q 27 −q 29 +q 23 t 2 +q 25 t 2 +q 27 t 2 +q 29 t 2 +q 31 t 2 ) t 5 (1−q 2 )5 (1+q 2 )2 (1+q 2 +q 4 )
t 6 (1−q 2 )5 (1+q 2 )2 (1+q 2 +2q 4 +q 6 +q 8 )
G (12 ) (13 ) = +Q
−
q 13 (1−q 2 )5 (1+q 2 )2 (1+q 2 +q 4 )
−
−
Q 5 (q 27 −q 35 +q 37 t 2 ) t 7 (1−q 2 )5 (1+q 2 )2 (1+q 4 )(1+q 2 +q 4 )
Q(q 11 +q 13 −q 17 −q 19 +t 2 (q 13 +q 15 +q 17 +q 19 +q 21 )) t 3 (1−q 2 )5 (1+q 2 )2 (1+q 2 +q 4 )
2 (q 11 −q 15 −q 17 +q 21 +t 2 (q 11 +2q 13 +2q 15 +q 17 −q 19 −2q 21 −2q 23 −q 25 )+t 4 (q 15 +q 17 +2q 19 +2q 21 +2q 23 +q 25 +q 27 ))
t 6 (1−q 2 )5 (1+q 2 )2 (1+q 2 +q 4 )
3 11 13 17 19 23 25 2 13 15 17 +2q 19 −2q 23 −3q 25 −2q 27 −q 29 )+t 4 (q 19 +q 21 +2q 23 +2q 25 +2q 27 +2q 29 +q 31 )) − Q (q +q −2q −2q +q +q +t (q +2q +3q t 7 (1−q 2 )5 (1+q 2 )2 (1+q 2 +q 4 )
+Q
4 (q 13 +q 15 −2q 19 −2q 21 +q 25 +q 27 +t 2 (q 17 +2q 19 +2q 21 +q 23 −q 25 −2q 27 −2q 29 −q 31 )+t 4 (q 25 +q 27 +q 29 +q 31 +q 33 ))
t 8 (1−q 2 )5 (1+q 2 )2 (1+q 2 +q 4 )
5 17 21 23 +q 27 +t 2 (q 23 +q 25 −q 29 −q 31 )+t 4 q 33 ) − Q (q −q −q t 9 (1−q 2 )5 (1+q 2 )2 (1+q 2 +q 4 )
q 26 1−2q 2 +q 6 +q 10 −2 q 16 +q 22 +q 26 −2q 30 +q 32
G (1) (15 ) =
Q (q −q +q t +q t +q t ) + t 4 (1−q 2 )6 (1+q 2 )2 (1+q 2 +2q 4 +q 6 +q 8 ) − 2
−Q
24
34
28 2
32 2
36 2
−
Q(q 24 −q 34 +q 26 t 2 +q 28 t 2 +q 30 t 2 +q 32 t 2 +q 34 t 2 +q 36 t 2 ) t 3 (1−2q 2 +q 6 +q 10 −2q 16 +q 22 +q 26 −2q 30 +q 32 )
Q 3 (q 26 −q 36 +q 32 t 2 +q 38 t 2 ) t 5 (1−q 2 )6 (1+q 2 )2 (1+q 2 +q 4 )
+
Q 4 (q 30 +q 34 −q 40 −q 44 +q 38 t 2 +q 42 t 2 +q 46 t 2 ) t 6 (1−q 2 )6 (1+q 2 )2 (1+q 2 +2q 4 +q 6 +q 8 )
5 (q 36 +q 38 +q 40 +q 42 +q 44 −q 46 −q 48 −q 50 −q 52 −q 54 +q 46 t 2 +q 48 t 2 +q 50 t 2 +q 52 t 2 +q 54 t 2 +q 56 t 2 )
t 7 (1−2q 2 +q 6 +q 10 −2q 16 +q 22 +q 26 −2q 30 +q 32 )
Q 6 (q 44 −q 54 +q 56 t 2 ) (1−2q 2 +q 6 +q 10 −2q 16 +q 22 +q 26 −2q 30 +q 32 )t 8
G (12 ) (14 ) = +Q
q 20 (1−q 2 )6 (1+q 2 )3 (1+q 4 )(1+q 2 +q 4 )
−
Q(q 18 −q 26 +q 20 t 2 +q 24 t 2 +q 28 t 2 ) t 3 (1−q 2 )6 (1+q 2 )2 (1+q 4 )(1+q 2 +q 4 )
2 q 18 (1−q 6 −q 8 +q 14 +t 2 (1+2q 2 +2q 4 +2q 6 +q 8 −q 10 −2q 12 −2q 14 −2q 16 −q 18 )+t 4 q 4 (1+q 2 +2q 4 +2q 6 +3q 8 +2q 10 +2q 12 +q 14 +q 16 ))
t 6 (1−q 2 )6 (1+q 2 )3 (1+q 2 +2q 4 +q 6 +q 8 )
3 18 24 26 32 2 20 22 +q 24 +q 26 −q 30 −q 32 −q 34 −q 36 )+t 4 (q 26 +q 30 +q 32 +q 34 +q 38 )) − Q (q −q −q +q +t (q +q t 7 (1−q 2 )6 (1+q 2 )2 (1+q 2 +q 4 )
+Q
4 (q 20 +q 22 +2q 24 −q 28 −3q 30 −3q 32 −q 34 +2q 38 +q 40 +q 42 )
t 8 (1−q 2 )6 (1+q 2 )3 (1+q 2 +2q 4 +q 6 +q 8 )
4 2 24 26 28 30 32 34 36 38 −4q 40 −3q 42 −2q 44 −q 46 +t 2 (q 32 +q 34 +2q 36 +2q 38 +3q 40 +2q 42 +2q 44 +q 46 +q 48 )) + Q t (q +2q +3q +4q +3q +q −q −3q t 8 (1−q 2 )6 (1+q 2 )3 (1+q 2 +2q 4 +q 6 +q 8 )
−Q
5 (q 24 +q 28 −q 30 −q 32 −q 34 −q 36 +q 38 +q 42 +t 2 (q 30 +q 32 +q 34 +q 36 −q 40 −q 42 −q 44 −q 46 )+t 4 (q 40 +q 44 +q 48 ))
t 9 (1−q 2 )6 (1+q 2 )2 (1+q 2 +2q 4 +q 6 +q 8 )
6 30 36 −q 38 +q 44 +t 2 (q 38 +q 40 −q 46 −q 48 )+t 4 q 50 ) + Q (q −q t 10 (1−q 2 )6 (1+q 2 )3 (1+q 2 +2q 4 +q 6 +q 8 )
G (13 ) (13 ) = +Q
q 18 (1−q 2 )6 (1+2q 2 +2q 4 +q 6 )2
−
Q(q 16 −q 22 +q 18 t 2 +q 24 t 2 ) t 3 (1−q 2 )6 (1+q 2 )2 (1+q 2 +q 4 )
2 (q 16 −q 20 −q 22 +q 26 +t 2 (q 16 +q 18 +q 20 −q 26 −q 28 −q 30 )+t 4 (q 20 +q 24 +q 26 +q 28 +q 32 ))
t 6 (1−q 2 )6 (1+q 2 )2 (1+q 2 +q 4 )
3 18 20 22 26 28 30 2 (q 16 +2q 18 +2q 20 −3q 24 −4q 26 −3q 28 +2q 32 +2q 34 +q 36 )) − Q (q −q −q +q +q −q t 9 +t (1−q 2 )6 (1+2q 2 +2q 4 +q 6 )2
+Q
3 t 4 q 18 (1+2q 2 +4q 4 +4q 6 +4q 8 +q 10 −q 12 −4q 14 −4q 16 −4q 18 −2q 20 −q 22 +t 2 q 6 (1+q 2 +2q 4 +3q 6 +3q 8 +3q 10 +3q 12 +2q 14 +q 16 +q 18 ))
t 9 (1−q 2 )6 (1+2q 2 +2q 4 +q 6 )2
+Q
4 (q 18 −q 20 −q 22 +q 26 +q 28 −q 30 +t 2 (q 18 +q 20 +q 22 −q 24 −2q 26 −2q 28 −q 30 +q 32 +q 34 +q 36 ))
t 10 (1−q 2 )6 (1+q 2 )2 (1+q 2 +q 4 )
4 4 22 24 26 28 30 32 −q 34 −3q 36 −q 38 −q 40 +t 2 (q 30 +q 34 +q 36 +q 38 +q 42 )) + Q t (q +q +2q +q +q t 10−q (1−q 2 )6 (1+q 2 )2 (1+q 2 +q 4 )
−Q
5 (q 20 −q 22 −q 24 +q 28 +q 30 −q 32 +t 2 (q 22 +q 24 −q 28 −2q 30 −q 32 +q 36 +q 38 )+t 4 (q 28 +q 30 +q 32 −q 38 −q 40 −q 42 )+t 6 (q 38 +q 44 ))
t 11 (1−q 2 )6 (1+q 2 )2 (1+q 2 +q 4 )
6 24 26 28 32 34 36 2 28 +q 30 −2q 34 −2q 36 +q 40 +q 42 )+t 4 (q 36 +q 38 +q 40 −q 42 −q 44 −q 46 +q 48 )) + Q (q −q −q +q +q −q +t (q t 12 (1−q 2 )6 (1+2q 2 +2q 4 +q 6 )2
A.3. Specialization to $Q = -t q^{-2N}$: Some examples. In this section we consider the specialization $Q = -t q^{-2N}$ for the case of the Hopf link colored by $(R_1, R_2) = (1, 1^2)$, $(1^2, 1^2)$ and $(1^3, 1^4)$. We see that $G_{\lambda\mu}$ after this specialization is (up to an overall factor) a polynomial in $q$ and $t$:
\[
\begin{aligned}
G_{(1)(1^2)}(Q=-t, q, t) &= G_{(1)(1^2)}(Q=-tq^{-2}, q, t) = 0,\\
G_{(1)(1^2)}(Q=-tq^{-4}, q, t) &= -q^{-7}t^{-2}\bigl(1+q^2\bigr),\\
G_{(1)(1^2)}(Q=-tq^{-6}, q, t) &= -q^{-13}t^{-2}\bigl(1+2q^2+2q^4+q^6+t^2q^6+t^2q^8+t^2q^{10}\bigr),\\
G_{(1)(1^2)}(Q=-tq^{-8}, q, t) &= -q^{-19}t^{-2}\bigl(1+2q^2+3q^4+3q^6+2q^8+q^{10}+t^2q^6(1+2q^2+3q^4+3q^6+2q^8+q^{10})\bigr),\\
G_{(1)(1^2)}(Q=-tq^{-10}, q, t) &= -q^{-25}t^{-2}\bigl(1+2q^2+3q^4+(4+t^2)q^6+(4+2t^2)q^8+(3+4t^2)q^{10}+(2+5t^2)q^{12}\\
&\qquad+(1+6t^2)q^{14}+5t^2q^{16}+4t^2q^{18}+2t^2q^{20}+t^2q^{22}\bigr),\\
G_{(1)(1^2)}(Q=-tq^{-12}, q, t) &= -q^{-32}t^{-2}\bigl(1+2q^2+3q^4+(4+t^2)q^6+(5+2t^2)q^8+(5+4t^2)q^{10}+(4+6t^2)q^{12}+(3+8t^2)q^{14}\\
&\qquad+(2+9t^2)q^{16}+(1+9t^2)q^{18}+8t^2q^{20}+6t^2q^{22}+4t^2q^{24}+2t^2q^{26}+t^2q^{28}\bigr).
\end{aligned}
\]
\[
\begin{aligned}
G_{(1^2)(1^2)}(Q=-tq^{-2N}, q, t) &= 0, \qquad N = 0, 1,\\
G_{(1^2)(1^2)}(Q=-tq^{-4}, q, t) &= q^{-8}t^{-4},\\
G_{(1^2)(1^2)}(Q=-tq^{-6}, q, t) &= q^{-16}t^{-4}\bigl(1+q^2+(1+t^2)q^4+2t^2q^6+2t^2q^8+t^2q^{10}\bigr),\\
G_{(1^2)(1^2)}(Q=-tq^{-8}, q, t) &= q^{-24}t^{-4}\bigl(1+q^2+(2+t^2)q^4+(1+3t^2)q^6+(1+5t^2)q^8+6t^2q^{10}+(5t^2+t^4)q^{12}\\
&\qquad+(3t^2+t^4)q^{14}+(t^2+2t^4)q^{16}+t^4q^{18}+t^4q^{20}\bigr),
\end{aligned}
\]
\[
\begin{aligned}
G_{(1^2)(1^2)}(Q=-tq^{-10}, q, t) &= q^{-32}t^{-4}\bigl(1+q^2+(2+t^2)q^4+(2+3t^2)q^6+(2+6t^2)q^8+(1+9t^2)q^{10}+(1+11t^2+t^4)q^{12}\\
&\qquad+(11t^2+2t^4)q^{14}+(9t^2+4t^4)q^{16}+(6t^2+5t^4)q^{18}+(3t^2+6t^4)q^{20}+(t^2+5t^4)q^{22}\\
&\qquad+4t^4q^{24}+2t^4q^{26}+t^4q^{28}\bigr),\\
G_{(1^2)(1^2)}(Q=-tq^{-12}, q, t) &= q^{-40}t^{-4}\bigl(1+q^2+(2+t^2)q^4+(2+3t^2)q^6+(3+6t^2)q^8+(2+10t^2)q^{10}+(2+14t^2+t^4)q^{12}\\
&\qquad+(1+17t^2+2t^4)q^{14}+(1+18t^2+5t^4)q^{16}+t^2(17+7t^2)q^{18}+t^2(14+11t^2)q^{20}\\
&\qquad+2t^2(5+6t^2)q^{22}+2t^2(3+7t^2)q^{24}+3t^2(1+4t^2)q^{26}+(t^2+11t^4)q^{28}\\
&\qquad+7t^4q^{30}+5t^4q^{32}+2t^4q^{34}+t^4q^{36}\bigr),\\
G_{(1^2)(1^2)}(Q=-tq^{-14}, q, t) &= q^{-48}t^{-4}\bigl(1+q^2+(2+t^2)q^4+(2+3t^2)q^6+(3+6t^2)q^8+(3+10t^2)q^{10}+(3+15t^2+t^4)q^{12}\\
&\qquad+2(1+10t^2+t^4)q^{14}+(2+24t^2+5t^4)q^{16}+(1+26t^2+8t^4)q^{18}+(1+13t^2(2+t^2))q^{20}\\
&\qquad+t^2(24+17t^2)q^{22}+(20t^2+22t^4)q^{24}+3t^2(5+8t^2)q^{26}+2t^2(5+13t^2)q^{28}\\
&\qquad+6t^2(1+4t^2)q^{30}+t^2(3+22t^2)q^{32}+(t^2+17t^4)q^{34}+13t^4q^{36}+8t^4q^{38}\\
&\qquad+5t^4q^{40}+2t^4q^{42}+t^4q^{44}\bigr),
\end{aligned}
\]
\[
\begin{aligned}
G_{(1^3)(1^4)}(Q=-tq^{-2N}, q, t) &= 0, \qquad N = 0, 1, 2, 3,\\
G_{(1^3)(1^4)}(Q=-tq^{-8}, q, t) &= -q^{-19}t^{-6}\bigl(1+q^2+q^4+q^6\bigr),\\
G_{(1^3)(1^4)}(Q=-tq^{-10}, q, t) &= -q^{-33}t^{-6}\bigl(1+2q^2+3q^4+(4+t^2)q^6+(4+2t^2)q^8+(3+4t^2)q^{10}+(2+5t^2)q^{12}\\
&\qquad+(1+6t^2)q^{14}+5t^2q^{16}+4t^2q^{18}+2t^2q^{20}+t^2q^{22}\bigr),\\
G_{(1^3)(1^4)}(Q=-tq^{-12}, q, t) &= -q^{-47}t^{-6}\bigl(1+2q^2+4q^4+(6+t^2)q^6+(8+3t^2)q^8+(9+7t^2)q^{10}+(9+12t^2)q^{12}\\
&\qquad+(8+18t^2)q^{14}+(6+23t^2+t^4)q^{16}+(4+26t^2+2t^4)q^{18}+(2+26t^2+4t^4)q^{20}\\
&\qquad+(1+23t^2+6t^4)q^{22}+(18t^2+8t^4)q^{24}+(12t^2+9t^4)q^{26}+(7t^2+9t^4)q^{28}\\
&\qquad+(3t^2+8t^4)q^{30}+(t^2+6t^4)q^{32}+4t^4q^{34}+2t^4q^{36}+t^4q^{38}\bigr),\\
G_{(1^3)(1^4)}(Q=-tq^{-14}, q, t) &= -q^{-61}t^{-6}\bigl(1+2q^2+4q^4+(7+t^2)q^6+(10+3t^2)q^8+(13+8t^2)q^{10}+(16+15t^2)q^{12}\\
&\qquad+(17+26t^2)q^{14}+(17+38t^2+t^4)q^{16}+(16+52t^2+3t^4)q^{18}+(13+7t^2(9+t^2))q^{20}\\
&\qquad+(10+72t^2+13t^4)q^{22}+(7+74t^2+21t^4)q^{24}+(4+72t^2+30t^4)q^{26}\\
&\qquad+(2+63t^2+39t^4)q^{28}+(1+52t^2+46t^4+t^6)q^{30}+t^2(38+50t^2+t^4)q^{32}\\
&\qquad+2t^2(13+25t^2+t^4)q^{34}+t^2(15+t^2)(1+3t^2)q^{36}+t^2(8+39t^2+4t^4)q^{38}\\
&\qquad+t^2(3+30t^2+4t^4)q^{40}+(t^2+21t^4+5t^6)q^{42}+t^4(13+4t^2)q^{44}+t^4(7+4t^2)q^{46}\\
&\qquad+3t^4(1+t^2)q^{48}+(t^4+2t^6)q^{50}+t^6q^{52}+t^6q^{54}\bigr).
\end{aligned}
\]
Communicated by N.A. Nekrasov
Commun. Math. Phys. 298, 787–831 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1078-8
Communications in
Mathematical Physics
Fluctuations of the Nodal Length of Random Spherical Harmonics Igor Wigman, Centre de Recherches Mathématiques (CRM), Université de Montréal, C.P. 6128, Succ. Centre-Ville, Montréal, Québec H3C 3J7, Canada Received: 16 September 2009 / Accepted: 9 March 2010 Published online: 26 June 2010 – © Springer-Verlag 2010
Abstract: Using the multiplicities of the Laplace eigenspace on the sphere (the space of spherical harmonics) we endow the space with Gaussian probability measure. This induces a notion of random Gaussian spherical harmonics of degree $n$ having Laplace eigenvalue $E = n(n + 1)$. We study the length distribution of the nodal lines of random spherical harmonics. It is known that the expected length is of order $n$. It is natural to conjecture that the variance should be of order $n$, due to the natural scaling. Our principal result is that, due to an unexpected cancelation, the variance of the nodal length of random spherical harmonics is of order $\log n$. This behaviour is consistent with the one predicted by Berry for nodal lines on chaotic billiards (Random Wave Model). In addition we find that a similar result is applicable for “generic” linear statistics of the nodal lines.

1. Introduction

Nodal patterns (first described by Ernest Chladni in the 18th century) appear in many problems in engineering, physics and natural sciences: they describe the sets that remain stationary during vibrations. Hence their importance in such diverse areas as musical instruments, mechanical structures and earthquake study. They also arise in the study of wave propagation and in astrophysics; this is a very active and rapidly developing research area.

Let $(M, g)$ be a compact manifold and $f : M \to \mathbb{R}$ be a real valued function. The nodal set of $f$ is its zero set $f^{-1}(0) = \{x \in M : f(x) = 0\}$. The most important or fundamental case is that of $f$ being an eigenfunction of the Laplace-Beltrami operator $\Delta_g$ on $M$,
\[ \Delta_g f + E f = 0, \tag{1} \]
Current address: Institutionen för Matematik, Kungliga Tekniska Högskolan (KTH), Lindstedtsvägen 25, 10044 Stockholm, Sweden. E-mail: [email protected] The author is supported by a CRM ISM fellowship, Montréal and the Knut and Alice Wallenberg Foundation, grant KAW.2005.0098.
with $E \ge 0$. In this case it is known [8] that, generically, the nodal sets are smooth submanifolds of $M$ of codimension 1. For example, if $M$ is a surface, the nodal sets are smooth curves, also called the nodal lines. One is interested in studying their volume (i.e. the length of the nodal line for the 2-dimensional case) and other properties for highly excited eigenstates. Yau conjectured [25,26] that the volume of the nodal set is commensurable to $\sqrt{E}$ in the sense that there exist constants $c_M, C_M > 0$ such that if $f$ satisfies (1) then
\[ c_M \sqrt{E} \le \mathrm{Vol}(f^{-1}(0)) \le C_M \sqrt{E}. \tag{2} \]
The lower bound was proved by Bruning and Gromes [7] and Bruning [6] for the planar case. Donnelly and Fefferman [10] finally settled Yau’s conjecture for real analytic metrics. However, the general case of a smooth manifold is still open.

1.1. Spherical Harmonics. In this paper, we will concentrate on the nodal sets on the sphere. It is well known that the eigenvalues $E$ of the Laplace equation $\Delta f + E f = 0$ on the $m$-dimensional sphere $S^m$ are all the numbers of the form
\[ E_n^m = n(n + m - 1), \tag{3} \]
where $n$ is an integer. Given a number $E_n^m$, the corresponding eigenspace is the space $\mathcal{E}_n^m$ of spherical harmonics of degree $n$. Its dimension is given by
\[ \mathcal{N} = \mathcal{N}_n^m = \frac{2n+m-1}{n+m-1}\binom{n+m-1}{m-1} \sim \frac{2}{(m-1)!}\, n^{m-1}. \tag{4} \]
Given an integral number $n$, we fix an $L^2(S^m)$ orthonormal basis of $\mathcal{E}_n^m$,
\[ \eta_1^{n;m}(x),\ \eta_2^{n;m}(x),\ \ldots,\ \eta_{\mathcal{N}_n^m}^{n;m}(x), \]
giving an identification $\mathcal{E}_n^m \cong \mathbb{R}^{\mathcal{N}_n^m}$. For further reading on the spherical harmonics we refer the reader to [1], Chap. 9.
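As a quick sanity check of (4), the following minimal Python sketch evaluates the dimension formula exactly and compares it with the leading term $2n^{m-1}/(m-1)!$. The function names `harmonic_dimension` and `leading_term` are illustrative choices, not notation from the paper.

```python
from math import comb, factorial


def harmonic_dimension(n: int, m: int) -> int:
    """Dimension of the space of degree-n spherical harmonics on S^m, formula (4).

    Assumes n >= 0 and m >= 2; the quotient below is always an integer.
    """
    return (2 * n + m - 1) * comb(n + m - 1, m - 1) // (n + m - 1)


def leading_term(n: int, m: int) -> float:
    """Leading asymptotic term 2 n^(m-1) / (m-1)! of the dimension as n grows."""
    return 2 * n ** (m - 1) / factorial(m - 1)


if __name__ == "__main__":
    n = 1000
    for m in (2, 3, 4):
        dim = harmonic_dimension(n, m)
        # The ratio should approach 1 as n grows.
        print(m, dim, dim / leading_term(n, m))
```

For $m = 2$ the formula reduces to $2n + 1$, and for $m = 3$ to $(n+1)^2$.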
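1.2. Random model. We consider a random eigenfunction
\[ f_n^m(x) = \sqrt{\frac{|S^m|}{\mathcal{N}_n^m}} \sum_{k=1}^{\mathcal{N}_n^m} a_k \eta_k^{n;m}(x), \tag{5} \]
where $a_k$ are i.i.d. standard Gaussian $N(0, 1)$ random variables. That is, we use the identification $\mathcal{E}_n^m \cong \mathbb{R}^{\mathcal{N}_n^m}$ to endow the space $\mathcal{E}_n^m$ with the Gaussian probability measure $\upsilon$ given by
\[ d\upsilon(f_n^m) = e^{-\frac{1}{2}\|a\|^2}\, \frac{da_1 \cdots da_{\mathcal{N}_n^m}}{(2\pi)^{\mathcal{N}_n^m/2}}, \]
where $a = (a_i) \in \mathbb{R}^{\mathcal{N}_n^m}$ are as in (5).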
Note that $\upsilon$ is invariant with respect to the choice of the orthonormal basis for $\mathcal{E}_n^m$. Moreover, the Gaussian random field $f_n^m$ is isotropic in the sense that for every $x_1, \ldots, x_l \in S^m$ and every orthogonal $R \in O(m + 1)$,
\[ (f_n(Rx_1), \ldots, f_n(Rx_l)) \stackrel{d}{=} (f_n(x_1), \ldots, f_n(x_l)). \tag{6} \]
As usual, for any random variable $X$, we denote its expectation by $\mathbb{E}X$. For example, with the normalization factor in (5), for every $m \ge 2$, $n$ and fixed point $x \in S^m$, one has
\[ \mathbb{E}[f_n^m(x)^2] = \frac{|S^m|}{\mathcal{N}_n^m} \sum_{k=1}^{\mathcal{N}_n^m} \eta_k^{n;m}(x)^2 = 1, \tag{7} \]
a simple corollary of the addition theorem (see [1], or (18) for $m = 2$).

Any characteristic $X(L)$ of the nodal set $L = L(f_n^m) = \{x \in S^m : f_n^m(x) = 0\}$ is a random variable. The most natural characteristic of the nodal set $L_{f_n^m}$ of $f_n^m$ is, of course, its $(m - 1)$-dimensional volume $\mathcal{Z}(f_n^m)$. The main goal of the present paper is the study of the distribution of the random variable $\mathcal{Z}(f_n^m)$ for a random Gaussian $f_n \in \mathcal{E}_n$.

1.3. Some Conventions. Throughout the paper, the letters $x, y$ will denote either points on the sphere $S^m$ or spherical variables. For $x, y \in S^m$, $d(x, y)$ will stand for the spherical distance between $x$ and $y$. Given a set $F \subseteq S^m$, we denote its area by $|F|$; $\mathrm{len}(C)$ will stand for the length of a smooth curve $C \subset S^m$. For example, $|S^2| = 4\pi$, and $\mathcal{Z}(f_n^2) = \mathrm{len}(\{f_n^2(x) = 0\})$. In this paper we are mainly concerned with the 2-dimensional case. Therefore, to simplify the notation we will use $f_n(x) := f_n^2(x)$, and accordingly $\mathcal{E}_n := \mathcal{E}_n^2$, $E_n := E_n^2$, $\mathcal{N}_n := \mathcal{N}_n^2$, $\eta_k^n := \eta_k^{n;2}$. In this manuscript, we will use the notations $A \ll B$ and $A = O(B)$ interchangeably. If necessary, the constant involved will depend on the parameters written in the subscript. For example, $O_\varphi$ or $\ll_\varphi$ means that the constants involved depend on the function $\varphi$.

1.4. Nodal length and related subjects. It is widely believed that for generic chaotic billiards, one can model the nodal lines for eigenfunctions of eigenvalue of order $\approx E$ with nodal lines of isotropic, monochromatic random waves of wavenumber $\sqrt{E}$ (this is called Berry’s Random Wave Model or RWM). Berry [3] found that the expected length (per unit area) of the nodal lines for the RWM is of size approximately $\sqrt{E}$, and he argued that the variance should be of order $\log E$. Berard [2] proved that for every $m \ge 2$,
\[ \mathbb{E}\,\mathcal{Z}(f_n^m) = c_m \cdot \sqrt{E_n^m}, \tag{8} \]
where
\[ c_m = \frac{2\pi^{m/2}}{\sqrt{m}\,\Gamma\!\left(\frac{m}{2}\right)} \]
(see also [15 and 23]). Furthermore, Neuheisel [15] established an asymptotic upper bound for the variance of the form
\[ \mathrm{Var}(\mathcal{Z}(f_n^m)) = O\!\left(\frac{E_n^m}{(\mathcal{N}_n^m)^{\frac{m-1}{3m+1}}}\right) = O\!\left(\frac{E_n^m}{n^{\frac{(m-1)^2}{3m+1}}}\right) \tag{9} \]
and in our previous work [23], we improved the latter to
\[ \mathrm{Var}(\mathcal{Z}(f_n^m)) = O\!\left(\frac{E_n^m}{\sqrt{\mathcal{N}_n^m}}\right). \]
Either of the bounds implies that the variance of the length, normalized so that its expected value is 1, vanishes with prescribed rate,
\[ \mathrm{Var}\!\left(\frac{\mathcal{Z}(f_n)}{\mathbb{E}[\mathcal{Z}(f_n)]}\right) = O\!\left(\frac{1}{\sqrt{\mathcal{N}_n}}\right) \]
for the latter bound. This means that the constants $c_{S^2}$ and $C_{S^2}$ guaranteed by Donnelly-Fefferman (2) may be taken as essentially equal for “generic” eigenfunctions $f_n^m \in \mathcal{E}_n^m$, where $n$ is large.

The volume of the nodal line of a random eigenfunction on the torus $T^2 = \mathbb{R}^2/\mathbb{Z}^2$ was studied by Rudnick and Wigman [16] and subsequently by Krishnapur and Wigman [14]. In this case, it is not difficult to see that the expectation is again $\mathbb{E}[\mathcal{Z}(f^{T^2})] = \mathrm{const} \cdot \sqrt{E}$. Their principal result is that as the eigenspace dimension $\mathcal{N}$ grows to infinity, the variance is bounded by
\[ \mathrm{Var}(\mathcal{Z}(f^{T^2})) = O\!\left(\frac{E}{\mathcal{N}^2}\right), \]
and it is likely that it is asymptotic to $\mathrm{Var}(\mathcal{Z}(f^{T^2})) \sim c_* \frac{E}{\mathcal{N}^2}$ for a “generic” sequence of eigenvalues.

For generic manifolds, one does not expect the Laplacian to have any multiplicities, so that we cannot introduce a Gaussian ensemble on the eigenspace. Let $E_j$ be the eigenvalues and $\phi_j$ the corresponding eigenfunctions. It is well known that the $E_j$ are discrete, $E_j \to \infty$ and $L^2(M) = \mathrm{span}\{\phi_j\}$. In this case, rather than considering random eigenfunctions, one considers random combinations of eigenfunctions with growing energy window of either type
\[ f^L(x) = \sum_{E_j \in [0, E]} a_j \phi_j(x) \]
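(called the long range), or
\[ f^S(x) = \sum_{\sqrt{E_j} \in [\sqrt{E}, \sqrt{E}+1]} a_j \phi_j(x) \]
(called the short range), as $E \to \infty$. Berard [2] and Zelditch [27] found that
\[ \mathbb{E}\,\mathcal{Z}(f^L) \sim \tilde{C}_M \cdot \sqrt{E} \]
and recently Zelditch [27] proved that
\[ \mathbb{E}\,\mathcal{Z}(f^S) \sim \tilde{C}_M \cdot \sqrt{E}, \]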
notably with the same constant $\tilde{C}_M > 0$ for both the long and the short ranges. For billiards (i.e. surfaces with piecewise smooth boundary), one is interested in the number of intersections of the nodal line with the boundary, or, equivalently, the number of open nodal components. Toth and Wigman [21] studied the number of boundary intersections for random combinations of eigenfunctions $f^L(x)$ and $f^S(x)$ on generic billiards, defined precisely as above. They found that the expected number of the intersections is of order $\sqrt{E}$.

In the first part of this paper, we resolve the high energy asymptotic behaviour for the variance of the nodal length for random 2-dimensional spherical harmonics $f_n = f_n^2 : S^2 \to \mathbb{R}$.

Theorem 1.1. One has
\[ \mathrm{Var}(\mathcal{Z}(f_n)) = \frac{65}{32} \log n + O(1), \tag{10} \]
asymptotically as $n \to \infty$.

For the higher dimensional sphere $S^m \subseteq \mathbb{R}^{m+1}$ with $m \ge 3$, it is possible to prove [24] that
\[ \mathrm{Var}(\mathcal{Z}(f_n^m)) = O\!\left(\frac{1}{n^{m-2}}\right) = O\!\left(\frac{E_n^m}{n\,\mathcal{N}_n^m}\right), \]
and it is likely that
\[ \mathrm{Var}(\mathcal{Z}(f_n^m)) \sim \frac{c}{n^{m-2}} \]
for some constant $c > 0$. We intend to address the question of precise asymptotics for the higher dimensional case in the future.
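The constants appearing in (8) and in Theorem 1.1 are easy to evaluate numerically. The sketch below, with hypothetical helper names, computes $c_m = 2\pi^{m/2}/(\sqrt{m}\,\Gamma(m/2))$, the expected nodal volume $c_m\sqrt{E_n^m}$ and, for $m = 2$, the leading term $\frac{65}{32}\log n$ of the variance. It only evaluates the stated formulas and does not simulate random harmonics.

```python
from math import pi, sqrt, gamma, log


def c_m(m: int) -> float:
    """Constant c_m = 2 pi^(m/2) / (sqrt(m) Gamma(m/2)) from (8)."""
    return 2.0 * pi ** (m / 2) / (sqrt(m) * gamma(m / 2))


def eigenvalue(n: int, m: int) -> int:
    """Laplace eigenvalue E_n^m = n (n + m - 1) on S^m, formula (3)."""
    return n * (n + m - 1)


def expected_nodal_volume(n: int, m: int) -> float:
    """Expected (m-1)-dimensional nodal volume c_m * sqrt(E_n^m), formula (8)."""
    return c_m(m) * sqrt(eigenvalue(n, m))


def variance_leading_term(n: int) -> float:
    """Leading term (65/32) log n of the nodal length variance (m = 2 only)."""
    return 65.0 / 32.0 * log(n)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        mean = expected_nodal_volume(n, 2)
        std = sqrt(variance_leading_term(n))
        # Relative fluctuation implied by the leading terms alone.
        print(n, mean, std / mean)
```

For $n = 1$ and $m = 2$ the expected length evaluates to $2\pi$, the length of a great circle, which is the nodal line of every nonzero degree-1 harmonic.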
1.5. Smooth linear statistics.$^1$ Rather than considering the volume of the full nodal line one may choose a nice submanifold $F \subseteq S^m$ of the sphere and consider the nodal volume $\mathcal{Z}^F(f_n^m) := \mathrm{Vol}(\{f_n^m = 0\} \cap F)$ inside $F$. More generally, let $\varphi : S^m \to \mathbb{R}$ be a function. One then defines
\[ \mathcal{Z}^\varphi(f_n^m) := \int_{(f_n^m)^{-1}(0)} \varphi(x)\, d\mathrm{Vol}_{(f_n^m)^{-1}(0)}(x). \]
The random variable $\mathcal{Z}^\varphi(f_n^m)$ is called a (smooth) linear statistic of the nodal set. A priori, this definition makes sense only for a continuous test function $\varphi \in C(S^m)$, so that the restriction $\varphi|_{(f_n^m)^{-1}(0)} \in C((f_n^m)^{-1}(0))$ is defined. Unfortunately, the class $C(S^2)$ of continuous functions does not contain the characteristic functions of smooth sets. However, it is known [13] that for a smooth $(m - 1)$-dimensional hypersurface $C$ one can define the trace $\mathrm{tr}_C(\varphi) \in L^1(C)$ of $\varphi$ for some wider classes of functions such as $W^{1,1}(S^m)$, the class of integrable functions with integrable weak derivatives, even though the values of $\varphi \in W^{1,1}(S^m)$ are defined up to measure zero spherical sets. To define the trace, one exploits the values of $\varphi$ in a small tubular neighbourhood of $C$. Unfortunately again, the class $W^{1,1}$ does not contain the family of characteristic functions of nice sets.

As an example, let us consider the 2-dimensional spherical disc $F = B(N, \frac{\pi}{4}) \subseteq S^2$ centered at the north pole of radius $\frac{\pi}{4}$, $C = \partial F$ its boundary, and $\varphi = \chi_F$. Then the definition of $\mathrm{tr}_C(\chi_F)$ is ambiguous since one may define it as either $0 \in L^1(C)$ or $1 \in L^1(C)$. This phenomenon (i.e. the jump in $\varphi$ occurring precisely on $C$) is typical to the class $BV(S^m)$ of functions of bounded variation; it is known [13] that for any characteristic function $\chi_F$ of a submanifold $F$ with $C^2$ boundary, $\chi_F \in BV(S^m)$, and, in addition, $W^{1,1}(S^m) \subset BV(S^m)$. It turns out that, despite this subtlety, one can still extend the notion of average trace
\[ \varphi_C^\pm = \mathrm{tr}_C^\pm(\varphi) \in L^1(C) \]
to the full class $\varphi \in BV(S^m)$ (see Appendix C for more details). For instance, in our previous example, $\mathrm{tr}_{\partial F}^\pm(\chi_F) \equiv \frac{1}{2}$. It is then natural to define
\[ \mathcal{Z}^\varphi(f_n^m) := \int_{(f_n^m)^{-1}(0)} \varphi_C^\pm(x)\, d\mathrm{Vol}_{(f_n^m)^{-1}(0)}(x), \]
with $C = (f_n^m)^{-1}(0)$. It is easy to compute the expected value of a “generic” linear statistic, following along the lines of the proof of [23], Prop. 1.4, starting from (121).

Lemma 1.2. For $\varphi \in BV(S^2) \cap L^\infty(S^2)$ we have
\[ \mathbb{E}\left[\mathcal{Z}^\varphi(f_n)\right] = \frac{\int_{S^2} \varphi(x)\,dx}{2^{3/2}}\, \sqrt{E_n}. \tag{11} \]
1 The author wishes to thank Steve Zelditch for his suggestion to consider the smooth linear statistics as a measure of the “stability” for the result obtained for the length and “Berry’s cancelation phenomenon” (discussed in Sect. 1.6.1 below)
Remark 1.3. Note that $f_n$ is odd or even if $n$ is odd or even respectively, so that in particular the nodal lines are symmetric w.r.t. the involution $x \mapsto -x$. Therefore if $\varphi$ is odd then $\mathcal{Z}^\varphi(f_n)$ vanishes identically in either case. Moreover,
\[ \mathcal{Z}^\varphi(f_n) = \mathcal{Z}^{\varphi^{ev}}(f_n), \]
where the even part of $\varphi$,
\[ \varphi^{ev}(x) := \frac{\varphi(x) + \varphi(-x)}{2}, \]
does not vanish identically if and only if $\varphi$ is not odd. Therefore, we may assume that $\varphi$ is even in the first place, and we will assume so throughout the rest of this paper.

Under the assumption of continuous differentiability we have the following result for the variance of $\mathcal{Z}^\varphi$ for 2-dimensional spherical harmonics.

Theorem 1.4. Let $\varphi : S^2 \to \mathbb{R}$ be a continuously differentiable even function, which does not vanish identically. Then as $n \to \infty$,
\[ \mathrm{Var}(\mathcal{Z}^\varphi(f_n)) = c(\varphi) \cdot \log n + O_{\|\varphi\|_\infty, V(\varphi)}(1), \tag{12} \]
where
\[ c(\varphi) := 65\, \frac{\|\varphi\|^2_{L^2(S^2)}}{128\pi} > 0, \tag{13} \]
i.e. the constant involved in the “O”-notation depends only on the $L^\infty$ norm $\|\varphi\|_\infty$ and the total variation $V(\varphi)$ of $\varphi$, and moreover, this dependency is monotone increasing.

Unfortunately, Theorem 1.4 does not cover the characteristic functions of nice submanifolds. For this, we have Theorem 1.5; the main idea of its proof is approximating a function $\varphi \in BV(S^2)$ with $C^\infty$ functions $\varphi_i$, for which we apply Theorem 1.4. We control the error term in (12) applied to $\varphi_i$ using its $L^\infty$ norm and variation, which is why we included this technical statement in the formulation of Theorem 1.4 in the first place.

Theorem 1.5. Let $\varphi \in BV(S^2) \cap L^\infty(S^2)$ be a not identically vanishing even function. Then as $n \to \infty$,
\[ \mathrm{Var}(\mathcal{Z}^\varphi(f_n)) = c(\varphi) \cdot \log n + O_\varphi(1), \tag{14} \]
where
\[ c(\varphi) := 65\, \frac{\|\varphi\|^2_{L^2(S^2)}}{128\pi} > 0. \]
The characteristic function $\chi_F$ of a subsurface $F \subseteq S^2$ with $C^2$ boundary is of bounded variation, i.e. $\chi_F \in BV(S^2)$ ([13], Example 1.4). Therefore, in this case the statement of Theorem 1.5 is valid for $\mathcal{Z}^F$, as the following corollary states.
Corollary 1.6. Let $F \subseteq S^2$ be a subsurface of the sphere with $C^2$ boundary. Then as $n \to \infty$,
\[ \mathrm{Var}(\mathcal{Z}^F(f_n)) = c \cdot \log n + O_F(1), \qquad c = c(F) := 65\, \frac{|F|}{128\pi} > 0. \tag{15} \]
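As a consistency check, the constant $c(F) = 65|F|/(128\pi)$ of Corollary 1.6 with $F = S^2$ should reproduce the constant $\frac{65}{32}$ of Theorem 1.1. The short Python sketch below verifies this arithmetic and also evaluates the constant for a pair of antipodal spherical caps, chosen here (as an illustrative assumption) so that $\chi_F$ is even, in line with the evenness convention of Remark 1.3. The function `rwm_constant` evaluates Berry's Random Wave Model prediction $A/(512\pi)$ discussed in Sect. 1.6.2; all function names are hypothetical.

```python
from math import pi, cos


def variance_constant(area: float) -> float:
    """c(F) = 65 |F| / (128 pi) from Corollary 1.6 (chi_F assumed even)."""
    return 65.0 * area / (128.0 * pi)


def rwm_constant(area: float) -> float:
    """Random Wave Model prediction A / (512 pi), the coefficient of log E."""
    return area / (512.0 * pi)


def cap_area(radius: float) -> float:
    """Area of a spherical cap of angular radius `radius` on the unit sphere."""
    return 2.0 * pi * (1.0 - cos(radius))


if __name__ == "__main__":
    full_sphere = 4.0 * pi
    two_caps = 2.0 * cap_area(pi / 4)  # antipodal pair, so chi_F is even
    print(variance_constant(full_sphere))  # coefficient of log n for the full sphere
    print(variance_constant(two_caps))     # coefficient of log n for the two caps
    print(rwm_constant(2.0 * pi))          # RWM coefficient for effective area 2 pi
```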
Remark 1.7. One may observe from the proof of Theorem 1.5 that the constant involved in the “O”-notation in (14) depends only on $\|\varphi\|_\infty$ and the total variation $V(\varphi)$. In particular, the constant involved in the “O”-notation (15) depends only on the length of the boundary $\partial F$.

1.6. Discussion.

1.6.1. “Berry’s Cancelation Phenomenon”. Originally, it was conjectured that the variance $\mathrm{Var}(\mathcal{Z}(f_n))$ should be asymptotic to $c \cdot n$, where $c > 0$ is a constant, due to the natural scaling; however, it turned out that $c$ vanishes, precisely as predicted by Berry [3] for the RWM. The reason for this phenomenon, which we refer to as “Berry’s cancelation phenomenon”, is that the leading nonconstant term in the long range asymptotics of the 2-point correlation function is purely oscillating (see the Key Proposition 3.5), so that it does not contribute to the variance. The non-oscillating leading terms cancel (which is, according to Michael Berry [3], “obscure”). It seems that “Berry’s cancelation phenomenon” is of general nature: it also occurs on the torus [14], and it is likely to hold for random combinations of eigenfunctions on a generic manifold [22].

1.6.2. Spherical Harmonics vs. RWM. The principal result of the present paper shows that the behaviour of the nodal lines of 2-dimensional spherical harmonics of eigenvalue $E$ is consistent with the RWM of wavenumber $\sqrt{E}$, predicted for nodal lines of generic chaotic systems. In both cases, the expected nodal length is of order $\sqrt{E}$ and variance of order $\log E$. More precisely, Berry [3] argued that for a billiard of area $A$, the variance of the nodal length should be asymptotic to
\[ \frac{A}{512\pi} \log E \]
in the high energy limit. Taking into account the symmetry of the nodal lines on $S^2$, its “effective” area is $2\pi$, and therefore, according to the RWM, the variance should be asymptotic to $\frac{1}{256} \log E$, which differs from the statement of Theorem 1.1 by a constant. There is a direct relation between the random spherical harmonics and the RWM.
Kolmogorov’s theorem implies that a random centered Gaussian ensemble of functions is determined by its covariance function (see Sect. 2.1). The covariance function for the RWM is
\[ r_{RWM}(x, y) = J_0(\sqrt{E}\,|x - y|), \qquad x, y \in \mathbb{R}^2, \]
and for the random spherical harmonics is
\[ r_n(x, y) = P_n(\cos(d(x, y))), \]
Fluctuations of Nodal Length
795
where Pn are the Legendre polynomials. The Legendre polynomials admit Hilb’s asymptotics, φ J0 (φ(n + 1/2)), Pn (cos(φ)) ≈ sin φ i.e. almost identical to RWM, up to the “correction factor” sinφ φ . This factor seems to “know” about the geometry of the sphere; it is one of the underlying factors responsible for the difference in the constants in the variance asymptotics. The geometry of the sphere occurs in some other places as well. 1.6.3. Nodal Set vs. Level Sets. Interestingly, the behaviour of the level curves f n−1 (L) for L > 0 is very different. Let Z L ( f n ) be the length of the level curve of f n . The expected length is [24] 2 E[Z L ( f n )] = c1 e−L /2 E n , consistent with the nodal case L = 0. However, unlike the nodal lines, the variance of the level curves length is asymptotic to [24] Var(Z L ( f n )) ∼ c2 L 4 e−L · n. 2
1.6.4. Real vs. Complex Zeros.² The behaviour of the zeros of complex analytic functions was studied extensively in recent years, and it is interesting to learn that it is very different from our case of real valued spherical harmonics. Sodin and Tsirelson [20] considered 3 different models of random complex analytic functions $\psi_L : M \to \mathbb{C}$, all parametrized by an integer $L \to \infty$, roughly corresponding to the degree $n$ of the harmonic polynomials. Here $M$ is the natural domain corresponding to the model, with $G$-invariant measure $m^*$, where $G$ is a group of symmetries; $M$ is either the sphere $\mathbb{C} \cup \{\infty\}$, the complex plane $\mathbb{C}$ or the unit disc $\{|z| < 1\}$. In this case the set of zeros is almost surely finite. The authors establish the asymptotic Gaussianity for smooth linear statistics $h : M \to \mathbb{R}$, $h \in C_c^2(M)$,
\[ Z^h(\psi_L) := \sum_{z:\, \psi_L(z) = 0} h(z), \]
where the expected value is given by
\[ E[Z^h(\psi_L)] = L \cdot \frac{1}{\pi} \int_M h \, dm^*, \]
for each of the models considered, consistent with (11). However, the variance is of order
\[ \operatorname{Var}(Z^h(\psi_L)) \sim \frac{\kappa}{L} \|\Delta^* h\|^2_{L^2(m^*)} \tag{16} \]
decaying with $L \to \infty$; here $\kappa > 0$ is a universal constant, and $\Delta^*$ is the invariant Laplacian. Note also that here the dependency on the test function $h$ is via the $L^2$ norm of a second order differential operator acting on $h$ (namely the invariant Laplacian), whereas in the real valued spherical harmonics case it depends on the $L^2$ norm of $\varphi$ itself (i.e. the operator is the identity, see (12) and (13)).

² The author wishes to thank Mikhail Sodin and Steve Zelditch for pointing out the unexpected differences between the real and the complex analytic cases.

For $h = \chi_U$ the characteristic function of a smooth domain $U \subseteq M$ (i.e. $Z^h$ is the number of zeros in $U$), while the expected value of $Z^h(\psi_L)$ is still proportional to $\operatorname{area}(U) \cdot L$, the variance is of a different shape (cf. Corollary 1.6 in the spherical harmonics case). Namely, it is known [12] that the variance is asymptotically proportional to $\sqrt{L} \cdot \operatorname{len}(\partial U)$, different from Corollary 1.6 both in the power of $L$ and in the dependency on the test function. This reflects the high frequency oscillations of the zeros smoothed out by a smooth test function.

Shiffman and Zelditch [18,19] considered a more general situation of random independent Gaussian sections $s_1 = s_1^{L}, \dots, s_k = s_k^{L} \in \Gamma(\mathcal{L}^L, M)$ of high powers $\mathcal{L}^L$, $L \to \infty$, of holomorphic line bundles $\mathcal{L}$ on an $m$-dimensional Kähler manifold $M$, where $1 \le k \le m$. They considered the volume of the intersection of the zero sets of $s_i$,
\[ Z^U(s_1, \dots, s_k) = \operatorname{Vol}_{2m-2k}\big((s_1, \dots, s_k)^{-1}(0)\big), \]
and its smooth linear statistics
\[ Z^h(s_1, \dots, s_k) = \int_{(s_1, \dots, s_k)^{-1}(0)} h(z) \, d\operatorname{Vol}_{(s_1, \dots, s_k)^{-1}(0)} \]
(here, in case the system is full, $k = m$, the volume is the number of points, and the integral is a sum). In both cases the expected value is asymptotic to
\[ E\big[Z^U(s_1^L, \dots, s_k^L)\big],\ E\big[Z^h(s_1^L, \dots, s_k^L)\big] \sim c L^k, \]
where, as earlier, $c > 0$ is proportional to either $\operatorname{Vol}(U)$ or the mass of $h$. For the “sharp” random variable they obtained [18] the asymptotic
\[ \operatorname{Var}(Z^U(s_1^L, \dots, s_k^L)) \sim c_{mk} L^{2k-m-1/2} \cdot \operatorname{Vol}(\partial U), \]
where $c_{mk}$ are some universal constants, extending Forrester-Honner [12], whereas for the smooth statistics they established a Central Limit Theorem with variance
\[ \operatorname{Var}(Z^h(s_1^L, \dots, s_k^L)) \sim c_h L^{2k-m-2}, \]
where, as in the case of Sodin-Tsirelson (16), $c_h$ involves a certain 2nd order differential operator acting on $h$.

1.7. On the proof of the main results. The proof of Theorem 1.1 involves some geometric as well as some probabilistic aspects; we improve upon both in comparison with our previous paper. We employ the Kac-Rice formula, which reduces the computation of the length variance to the 2-point correlation function, given in terms of the distribution of the values $f_n(x)$ as well as their gradients $\nabla f_n(x) \in T_x(S^2)$, for all $x \in S^2$. Thanks to the isotropy of the model, it is sufficient to evaluate the 2-point correlation function only on the arc $\{\theta = 0\}$ (in the usual spherical coordinates); this reduces the problem to an essentially 1-dimensional one. One then has to identify the spaces $T_x(S^2)$
via a family of isometries $\phi_x$, smooth w.r.t. $x$, for $x$ on the arc only, which is natural in the spherical coordinates. Scaling the arc, we find that for typical $x, y \in S^2$ the distribution of the values and the gradients is a small perturbation of standard Gaussian i.i.d. random variables $N(0, I)$, the latter recovering the square of the expected value of the nodal length, which is to be canceled. We then expand the 2-point correlation function into a Taylor polynomial around the asymptotic one; to do so we use Berry’s elegant method. It turns out that the long range behaviour of the two-point correlation function is also sufficient to extend the result to continuously differentiable linear statistics (i.e. Theorem 1.4). In the course of generalizing the proof to include this case we naturally encounter an auxiliary function $W^{\varphi} : [0, \pi] \to \mathbb{R}$. To conclude the proof of Theorem 1.4 we will have to understand its behaviour at the origin. To prove Theorem 1.5, we apply a standard density argument, approximating $\varphi$ with $C^\infty$ functions $\varphi_i$, to which we apply Theorem 1.4. To this end we use the full strength of the statement of Theorem 1.4 applied to $\varphi_i$, which enables us to control the error term in (12) uniformly. For a more detailed explanation see Sect. 5.1.

1.8. Plan of the paper. The goal of Sect. 2 is to give a formula for the length variance, as explicit as possible, starting from the classical Kac-Rice formula. In Sect. 3, we use the formula obtained to analyze the variance asymptotically for high energy (i.e. prove Theorem 1.1). In Sects. 4 and 5, we give the proofs of Theorem 1.4 and Theorem 1.5, respectively. Appendix A carries out a certain technical computation we will encounter in this paper, namely, that of the covariance matrix of a random vector involving values and gradients of $f_n$. Appendix B is devoted to the Legendre polynomials and some of their basic properties.
The goal of Appendix C is to give the definition and some properties of the class $BV(S^2)$ of functions of bounded variation, including their traces on smooth curves.

2. An Explicit Integral Formula for the Variance

In this section, culminating in Proposition 2.7, we derive an “explicit” integral formula for the variance. First we need to introduce the covariance function.

2.1. Covariance function. The covariance function (sometimes also referred to as the two-point function) is defined as
\[ u_n(x, y) := E[f_n(x) f_n(y)] = \frac{|S^2|}{N_n} \sum_{k=1}^{N_n} \eta_k^n(x) \eta_k^n(y). \tag{17} \]
It follows from the Kolmogorov theorem [9] that, in principle, $u_n(x, y)$ determines the centered Gaussian random field $f_n$, so that one can compute any property of $f_n$ in terms of $u_n$ and its derivatives. By the addition theorem ([1], p. 456, Theorem 9.6.3), $u_n(x, y)$ has an explicit expression as
\[ u_n(x, y) = P_n(\cos d(x, y)), \tag{18} \]
where $P_n : [-1, 1] \to \mathbb{R}$ is the Legendre polynomial of degree $n$ (see e.g. [17]). Recall that $d(x, y)$ is the spherical distance, so that $\cos d(x, y) = \langle x, y \rangle$, thinking of $S^2$ as being embedded into $\mathbb{R}^3$. The orthogonal invariance (6) is then equivalent to the corresponding property of the covariance function, namely
\[ u_n(Rx, Ry) = u_n(x, y) \tag{19} \]
for every orthogonal $R \in O(3)$. In case $y$ is not specified, we take it to be the northern pole $N \in S^2$, that is
\[ u_n(x) := u_n(x, N). \tag{20} \]
For every $t \in [-1, 1]$, $|P_n(t)| \le 1$, and $|P_n(t)| = 1$ if and only if $t = \pm 1$. Therefore
\[ (u_n(x, y) = \pm 1) \Leftrightarrow (x = \pm y), \tag{21} \]
and
\[ (u_n(x) = \pm 1) \Leftrightarrow (x \in \{N, S\}), \tag{22} \]
where $N$ and $S$ are the northern and the southern poles respectively.
2.2. Kac-Rice formulas for moments of length. In this section we express the first two moments of $\mathcal{Z}(f_n)$ via the Kac-Rice formula. The most general version, due to Bleher-Shiffman-Zelditch [4,5], gives an integral expression for all the moments $k \ge 1$ of the $(m - l)$-dimensional volume of $\{F = 0\}$ for a “generic” smooth vector valued random field $F = (F_i)_{1 \le i \le l} : M \to \mathbb{R}^l$ defined on an $m$-dimensional smooth manifold $M$, $1 \le l \le m$. In our previous paper [23] we gave an independent elementary proof of the Kac-Rice formula in the particular case of our interest, $M = S^m$, $F = f_n^m$, $k = 1, 2$. To present the Kac-Rice formula in our case we will need some notation. For $x, y \in S^2$ we define the following random vectors:
\[ Z_1^{n;x} = (f_n(x), \nabla f_n(x)) \in \mathbb{R} \times T_x(S^2) \]
and
\[ Z_2^{n;x,y} = (f_n(x), f_n(y), \nabla f_n(x), \nabla f_n(y)) \in \mathbb{R}^2 \times T_x(S^2) \times T_y(S^2). \]
More generally, for $k \ge 1$ and $x_1, \dots, x_k \in S^2$ one may define
\[ Z_k = Z_k^{n;x_1,\dots,x_k} \in \mathbb{R}^k \times \prod_{i=1}^{k} T_{x_i}(S^2) \tag{23} \]
in similar fashion. The vectors $Z_k$ are all centered Gaussian in the sense that for every fixed $x_1, \dots, x_k \in S^2$ any linear functional of $Z_k$ is mean zero Gaussian. Let $D_k^{n;x_1,\dots,x_k}(v_1, \dots, v_k, \xi_1, \dots, \xi_k)$ be the (mean zero Gaussian) probability density function of $Z_k^{n;x_1,\dots,x_k}$. The Kac-Rice formula expresses the $k$-th moment of $\mathcal{Z}(f_n)$ in terms of the distributions of $Z_k$ only (see Lemma 2.1), namely $D_k^{n;*}$. Therefore, to express the variance (and the expected value) of $\mathcal{Z}(f_n)$ we will only need to study $D_1^{n;*}$ and $D_2^{n;*}$.

Lemma 2.1. ([4] Th. 2.2; [5] Th. 4.3; [23] Prop. 3.3) The first two moments of the nodal length of the spherical harmonics are given by the following formulas:

(1) Expectation:
\[ E[\mathcal{Z}(f_n)] = \int_{S^2} \tilde{P}_n(x) \, dx, \tag{24} \]
where the density of the zero set $\tilde{P}_n(x)$ is given by
\[ \tilde{P}_n(x) = \int_{T_x(S^2)} \|\xi\| \, D_1^{n;x}(0, \xi) \, d\xi. \]

(2) Second moment:
\[ E\big[\mathcal{Z}(f_n)^2\big] = \iint_{S^2 \times S^2} \tilde{K}_n(x, y) \, dx \, dy, \tag{25} \]
where the 2-point correlation function $\tilde{K}_n$ is given by
\[ \tilde{K}_n(x, y) = \frac{1}{2\pi \sqrt{1 - u_n(x, y)^2}} \iint_{T_x(S^2) \times T_y(S^2)} \|\xi_x\| \cdot \|\xi_y\| \, D_2^{n}(0, 0, \xi_x, \xi_y) \, d\xi_x \, d\xi_y. \tag{26} \]

Neuheisel [15] (see also [23]) noticed that for every $x \in S^2$, under any isometry $T_x(S^2) \cong \mathbb{R}^2$ (i.e. any choice of orthonormal basis of $T_x(S^2)$), the distribution of $Z_1^{n;x}$ is mean zero Gaussian with the diagonal covariance matrix
\[ \begin{pmatrix} 1 & \\ & \frac{E_n}{2} I_2 \end{pmatrix}, \]
where $I_2$ is the $2 \times 2$ identity matrix. It is then clear that $\tilde{P}_n$ is $x$-independent (this also follows from the rotational invariance), and
\[ \tilde{P}_n(x) \equiv \frac{1}{\sqrt{2}} \cdot \frac{\sqrt{E_n}}{2}, \]
by a standard computation. This, together with (24), yields (8) for $m = 2$ and finishes the treatment of the expectation in this case³. Moreover, slightly modifying the proof of (24), we obtain
\[ E\big[\mathcal{Z}^{\varphi}(f_n)\big] = \int_{S^2} \varphi(x) \tilde{P}_n(x) \, dx, \]
and (11) follows. The goal of the remaining part of the present section, culminating with Proposition 2.7, is to make the formula (25) for the second moment “explicit” and suitable for asymptotic analysis. The rotational invariance (6) of our model implies that $\tilde{K}_n$ depends only on the spherical distance $d(x, y)$ between $x$ and $y$, i.e. (with a slight abuse of notation)
\[ \tilde{K}_n(x, y) = \tilde{K}_n(d(x, y)), \tag{27} \]
which will be used later.

Remark 2.2. One may define $\tilde{K}_n(x, y)$ intrinsically as
\[ \tilde{K}_n(x, y) = \frac{1}{2\pi \sqrt{1 - u_n(x, y)^2}} \, E\big[\|\nabla f(x)\| \cdot \|\nabla f(y)\| \,\big|\, f(x) = f(y) = 0\big], \]
the expectation being that of the product of the gradient norms, conditioned on $f$ vanishing at $x$ and $y$.

Remark 2.3. It is important to note that the symmetry of the nodal lines w.r.t. the involution $x \mapsto -x$ (see Remark 1.3) implies that
\[ \tilde{K}_n(x, y) = \tilde{K}_n(x, -y). \tag{28} \]
The main disadvantage of the formula (26) is that one has to work with probability densities defined on the tangent planes $T_x(S^2)$, which depend on the point $x \in S^2$. In principle one may consider the tangent planes as being embedded in $\mathbb{R}^3$. This, however, is highly inadvisable, since it would result in working with singular Gaussians supported on a plane corresponding to $T_x(S^2)$. It is thus desirable to identify, for every $x \in S^2$, $T_x(S^2) \cong \mathbb{R}^2$ via an isometry, i.e. to fix an orthonormal basis $B_x$ varying smoothly for $x \in S^2$ (i.e. an orthonormal frame). Unfortunately it is impossible to choose a global orthonormal frame on $S^2$; however, one can still get around that by noting that in fact all we need is a local choice for given $x, y \in S^2$. In general, the orthonormal frame chosen will affect the probability density function $D_2^{n;*}$ of $Z_2^{n;*}$ induced on $\mathbb{R}^6$ (though $D_1^{n;*}$ will stay invariant); in Sect. 2.3 we show how to compute $D_2^{n;*}$ for a given choice of local orthonormal frames. In Sect. 2.4 we will show how to choose the orthonormal frames to simplify the computations; we will use this construction while evaluating the two-point correlation function (26).

³ The same computation gives the result for every $m \ge 2$.
2.3. Kac-Rice formula in coordinate system. Given $x, y \in S^2$, we consider two local orthonormal frames $F^x(z) = \{e_1^x, e_2^x\}$ and $F^y(z) = \{e_1^y, e_2^y\}$, defined in some neighbourhood of $x$ and $y$ respectively. This gives rise to (local) identifications
\[ T_x(S^2) \cong \mathbb{R}^2 \cong T_y(S^2), \tag{29} \]
which are isometries. Under the identification (29) the random vector (23) is an $\mathbb{R}^6$ mean zero Gaussian with covariance matrix
\[ \Sigma = \Sigma(x, y) = \begin{pmatrix} A & B \\ B^t & C \end{pmatrix}, \]
where
\[ A_{2 \times 2} = A_n(x, y) = \begin{pmatrix} 1 & u_n(x, y) \\ u_n(x, y) & 1 \end{pmatrix}, \tag{30} \]
\[ B_{2 \times 4} = B_n(x, y) = \begin{pmatrix} 0 & \nabla_y u_n(x, y) \\ \nabla_x u_n(x, y) & 0 \end{pmatrix} \tag{31} \]
and
\[ C_{4 \times 4} = C_n(x, y) = \begin{pmatrix} \frac{E_n}{2} I_2 & H \\ H^t & \frac{E_n}{2} I_2 \end{pmatrix} \tag{32} \]
with “pseudo-Hessian”
\[ H_{2 \times 2}(x, y) = \nabla_x \otimes \nabla_y u_n(x, y), \tag{33} \]
i.e. $H = (h_{jk})_{j,k=1,2}$ with entries given by
\[ h_{jk} = \frac{\partial^2}{\partial e_j^x \, \partial e_k^y} u_n(x, y). \]
The covariance matrix of the Gaussian distribution of $Z_2$ in (23) conditioned upon $f_n(x) = f_n(y) = 0$ is given by⁴
\[ \Omega_n(x, y) = C - B^t A^{-1} B. \tag{34} \]
We then have a frame-dependent formula for the two-point correlation function (26),
\[ \tilde{K}_n(x, y) = \frac{1}{\sqrt{1 - u_n(x, y)^2}} \iint_{\mathbb{R}^2 \times \mathbb{R}^2} \|w_1\| \cdot \|w_2\| \exp\Big(-\frac{1}{2} (w_1, w_2) \Omega_n(x, y)^{-1} (w_1, w_2)^t\Big) \frac{dw_1 \, dw_2}{(2\pi)^3 \sqrt{\det \Omega_n(x, y)}}, \]
where $\Omega_n(x, y)$ is given by (34).

Remark 2.4. Note that even though $\tilde{K}_n$ is rotationally invariant (i.e. $\tilde{K}_n(x, y)$ depends only on the spherical distance $d(x, y)$), the same is, in general, false for the covariance matrices $\Omega_n$ (and $\Sigma_n$).

⁴ This is the inverse of the lower right corner of $\Sigma^{-1}$.
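The conditioning step (34), together with the footnote's remark that the conditional covariance is the inverse of the lower right corner of the inverse of the full covariance matrix, can be illustrated on a toy example. The following is a minimal sketch in exact rational arithmetic; the block sizes (2 + 2 rather than 2 + 4) and all numerical entries are assumptions chosen purely for illustration, and the helper names are hypothetical.

```python
from fractions import Fraction as Fr

def inverse(mat):
    """Invert a square matrix of Fractions by Gauss-Jordan elimination."""
    n = len(mat)
    aug = [list(row) + [Fr(int(i == j)) for j in range(n)]
           for i, row in enumerate(mat)]
    for col in range(n):
        # pick any row with a nonzero entry in this column as pivot
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def matmul(x, y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def transpose(x):
    return [list(r) for r in zip(*x)]

def schur_complement(A, B, C):
    """C - B^t A^{-1} B, the analogue of formula (34)."""
    correction = matmul(matmul(transpose(B), inverse(A)), B)
    return [[c - d for c, d in zip(rc, rd)] for rc, rd in zip(C, correction)]

def assemble(A, B, C):
    """Block matrix [[A, B], [B^t, C]]."""
    top = [ra + rb for ra, rb in zip(A, B)]
    bottom = [rbt + rc for rbt, rc in zip(transpose(B), C)]
    return top + bottom

# Toy blocks (illustrative values only, not taken from the paper).
A = [[Fr(1), Fr(1, 3)], [Fr(1, 3), Fr(1)]]
B = [[Fr(0), Fr(1, 5)], [Fr(-1, 5), Fr(0)]]
C = [[Fr(2), Fr(1, 4)], [Fr(1, 4), Fr(2)]]

sigma = assemble(A, B, C)
omega = schur_complement(A, B, C)
k = len(A)
lower_right = [row[k:] for row in inverse(sigma)[k:]]
print(inverse(lower_right) == omega)
```

Exact fractions are used so that the two matrices can be compared with `==` rather than up to a floating-point tolerance.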
Let $\phi, \theta$ be the standard spherical coordinates on $S^2$. Using the rotational invariance (27) of the 2-point correlation function we obtain
\[ E\big[\mathcal{Z}(f_n)^2\big] = \iint_{S^2 \times S^2} \tilde{K}_n(x, y) \, dx \, dy = |S^2| \int_{S^2} \tilde{K}_n(N, x) \, dx = 2\pi |S^2| \int_0^{\pi} \tilde{K}_n(N, x(\phi)) \sin\phi \, d\phi, \]
where $x(\phi) \in S^2$ is the point corresponding to the spherical coordinates $(\phi, \theta = 0)$. Note that $\tilde{K}_n(N, x(\phi)) = \tilde{K}_n(x, y)$ for any $x, y \in S^2$ with $d(x, y) = \phi$. We therefore have the following corollary.

Corollary 2.5. One has
\[ E\big[\mathcal{Z}(f_n)^2\big] = 2\pi |S^2| \int_0^{\pi} \tilde{K}_n(\phi) \sin\phi \, d\phi, \tag{35} \]
where $\tilde{K}_n(\phi) = \tilde{K}_n(x, y)$, $x, y \in S^2$ being any pair of points with $d(x, y) = \phi$.

The main goal of the present paper is to understand the asymptotic behaviour of the function $\tilde{K}_n(\phi)$. To this end we will have to provide a more explicit formula for $\tilde{K}_n$ by choosing concrete orthonormal frames in Sect. 2.4. It also turns out that it is more natural to scale the parameter $\phi$ by essentially $n$; this will be done in Sect. 2.5.
2.4. Choosing orthonormal frames. Corollary 2.5 implies that it is sufficient to provide a choice of $x, y \in S^2$ with $d(x, y) = \phi$ for any given $\phi \in (0, \pi)$, and, for the choice made, to provide local frames around $x$ and $y$. Hence we may restrict ourselves only to points on the half circular arc $\overset{\frown}{NS} = \{\theta = 0\}$. Let
\[ F = \Big\{ e_1 = \frac{\partial}{\partial \phi},\ e_2 = \frac{1}{\sin\phi} \frac{\partial}{\partial \theta} \Big\} \tag{36} \]
be the orthonormal frame defined on $S^2 \setminus \{N, S\}$. Given $\phi \in (0, \pi)$ we choose any pair of points $x, y \in \overset{\frown}{NS} \setminus \{N, S\}$ with $d(x, y) = \phi$ and set $F^x := F$ and $F^y := F$ locally in the neighbourhood of $x$ and $y$ respectively. An explicit computation shows that in this case the covariance matrix $\Sigma_n(x, y)$ depends only on $\phi$ rather than on $x, y$, and thus so does $\Omega_n(\phi) = \Omega_n(x, y)$ of our interest. We compute in Appendix A the conditional distribution covariance matrix explicitly to be
\[ \Omega_n(\phi) = \begin{pmatrix} \frac{E_n}{2} + \tilde{a} & 0 & \tilde{b} & 0 \\ 0 & \frac{E_n}{2} & 0 & \tilde{c} \\ \tilde{b} & 0 & \frac{E_n}{2} + \tilde{a} & 0 \\ 0 & \tilde{c} & 0 & \frac{E_n}{2} \end{pmatrix}, \tag{37} \]
whose entries are given by
\[ \tilde{a} = \tilde{a}_n(\phi) = -\frac{1}{1 - P_n(\cos\phi)^2} \cdot P_n'(\cos\phi)^2 (\sin\phi)^2, \tag{38} \]
\[ \tilde{b} = \tilde{b}_n(\phi) = P_n'(\cos\phi) \cos\phi - P_n''(\cos\phi) (\sin\phi)^2 + \frac{P_n(\cos\phi)}{1 - P_n(\cos\phi)^2} \cdot P_n'(\cos\phi)^2 (\sin\phi)^2 \tag{39} \]
and
\[ \tilde{c} = \tilde{c}_n(\phi) = P_n'(\cos\phi). \tag{40} \]
We then have
\[ \tilde{K}_n(\phi) = \frac{1}{\sqrt{1 - u(x)^2}} \iint_{\mathbb{R}^2 \times \mathbb{R}^2} \|w_1\| \cdot \|w_2\| \exp\Big(-\frac{1}{2} (w_1, w_2) \Omega_n(\phi)^{-1} (w_1, w_2)^t\Big) \frac{dw_1 \, dw_2}{(2\pi)^3 \sqrt{\det \Omega_n(\phi)}}, \]
with the covariance matrix $\Omega_n$ given by (37).

Remark 2.6. We choose to work with points on the arc $\{\theta = 0\}$ since here the covariance matrix $\Omega_n(\phi)$ is relatively simple. This corresponds to Berry’s [3] choice of points on the $x$ axis while dealing with random waves on $\mathbb{R}^2$, which takes advantage of the fact that for two points $x, y \in \mathbb{R}^2$ on the $x$ axis the canonical orthonormal bases for $T_x(\mathbb{R}^2)$ and $T_y(\mathbb{R}^2)$ coincide under the natural identification $T_x(\mathbb{R}^2) \cong T_y(\mathbb{R}^2)$. Rather than working with the canonical bases, one may of course choose to work with such orthonormal bases for any two points on the plane; this approach results in the same computation as on the $x$ axis.
2.5. Scaling the integral formula. As pointed out earlier, the two-point correlation function $\tilde{K}_n$ is expressible in terms of the covariance function $u_n$ and a couple of its derivatives, which in turn are expressible in terms of the degree $n$ Legendre polynomial and its derivatives. The high energy asymptotics $n \to \infty$ of $\tilde{K}_n$ is then intimately related to the behaviour of $P_n(\cos d)$ for large $n$. It is known from Hilb’s asymptotics (see Appendix B) that
\[ P_n(\cos\phi) \approx \sqrt{\frac{\phi}{\sin\phi}}\, J_0(\phi(n + 1/2)). \]
It is thus only natural to introduce a new parameter $\psi$ related to $\phi$ by
\[ \phi = \frac{\psi}{m}, \]
where from this point and throughout the rest of the paper we denote
\[ m := n + \frac{1}{2}. \tag{41} \]
We will rewrite the formula (35) in terms of $\psi$ rather than $\phi$, in the hope of simplifying the subsequent computations.

Proposition 2.7. The variance of the nodal length is given by
\[ \operatorname{Var}(\mathcal{Z}(f_n)) = 4\pi^2 \frac{E_n}{n + 1/2} I_n, \tag{42} \]
where
\[ I_n = \int_0^{\pi m} \Big( K_n(\psi) - \frac{1}{4} \Big) \sin(\psi/m) \, d\psi, \tag{43} \]
the scaled two-point correlation function is
\[ K_n(\psi) = \frac{1}{\sqrt{1 - u(x)^2}} \iint_{\mathbb{R}^2 \times \mathbb{R}^2} \|w_1\| \cdot \|w_2\| \exp\Big(-\frac{1}{2} (w_1, w_2) \Delta_n(\psi)^{-1} (w_1, w_2)^t\Big) \frac{dw_1 \, dw_2}{(2\pi)^3 \sqrt{\det \Delta_n(\psi)}} \tag{44} \]
with scaled covariance matrix
\[ \Delta_n(\psi) = \frac{\Omega_n(\psi/m)}{E_n/2} = \begin{pmatrix} 1 + 2a & 0 & 2b & 0 \\ 0 & 1 & 0 & 2c \\ 2b & 0 & 1 + 2a & 0 \\ 0 & 2c & 0 & 1 \end{pmatrix}, \tag{45} \]
whose entries are explicitly given by
\[ a = a_n(\psi) = \frac{1}{E_n} \tilde{a}_n(\psi/m) = -\frac{1}{E_n} \cdot \frac{1}{1 - P_n(\cos(\psi/m))^2} \cdot P_n'(\cos(\psi/m))^2 \sin(\psi/m)^2, \tag{46} \]
\[ b = b_n(\psi) = \frac{1}{E_n} \tilde{b}_n(\psi/m) = \frac{1}{E_n} \Big( P_n'(\cos(\psi/m)) \cos(\psi/m) - P_n''(\cos(\psi/m)) \sin(\psi/m)^2 + \frac{P_n(\cos(\psi/m))}{1 - P_n(\cos(\psi/m))^2} \cdot P_n'(\cos(\psi/m))^2 \sin(\psi/m)^2 \Big) \tag{47} \]
and
\[ c = c_n(\psi) = \frac{1}{E_n} \tilde{c}_n(\psi/m) = \frac{1}{E_n} P_n'(\cos(\psi/m)). \tag{48} \]
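The scaling $\psi = m\phi$ with $m = n + 1/2$ is motivated by Hilb's asymptotics, and the quality of that approximation is easy to probe numerically. The sketch below is only an illustration under stated assumptions: the Legendre polynomial is evaluated by the standard three-term recurrence, $J_0$ by its integral representation $J_0(x) = \frac{1}{\pi}\int_0^\pi \cos(x \sin\theta)\,d\theta$ with the trapezoid rule, and the degree $n = 200$ and the grid of angles are arbitrary choices, not values from the paper.

```python
import math

def legendre(n, t):
    """P_n(t) via the recurrence (k+1) P_{k+1} = (2k+1) t P_k - k P_{k-1}."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, t
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * t * p - k * p_prev) / (k + 1)
    return p

def bessel_j0(x, steps=2000):
    """J_0(x) from its integral representation, by the trapezoid rule."""
    h = math.pi / steps
    total = 0.5 * (math.cos(0.0) + math.cos(x * math.sin(math.pi)))
    total += sum(math.cos(x * math.sin(i * h)) for i in range(1, steps))
    return total * h / math.pi

def hilb(n, phi):
    """Right-hand side of Hilb's approximation for P_n(cos(phi))."""
    m = n + 0.5
    return math.sqrt(phi / math.sin(phi)) * bessel_j0(m * phi)

n = 200  # illustrative degree
angles = [0.01 * j for j in range(1, 150)]  # phi in (0, 1.5)
max_err = max(abs(legendre(n, math.cos(phi)) - hilb(n, phi)) for phi in angles)
print(f"max |P_n(cos phi) - Hilb| on the grid: {max_err:.2e}")
```

The trapezoid rule is a reasonable choice here because the integrand is smooth and periodic, so a few thousand nodes are ample for arguments of size a few hundred.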
Remark 2.8. Using the Cauchy-Schwarz inequality one can easily check that $|b_n(\psi)|, |c_n(\psi)| \le \frac{1}{2}$. The inequality $|a_n(\psi)| \le \frac{1}{2}$ is obvious.

Remark 2.9. We rewrite (28) as
\[ \tilde{K}_n(\phi) = \tilde{K}_n(\pi - \phi). \tag{49} \]

Remark 2.10. One can express $K_n(\psi)$ in probabilistic language as
\[ K_n(\psi) = \frac{1}{2\pi \sqrt{1 - P_n(\cos(\psi/m))^2}} \, E[\|U\| \cdot \|V\|], \]
where $(U, V)$ are mean zero Gaussian random variables with covariance matrix $\Delta_n(\psi)$. We will find this expression useful later, when we study the asymptotic behaviour of $K_n$ for large $\psi$ (see Proposition 3.5).

3. Asymptotics for the Variance

In this section we establish the asymptotics for the variance of the nodal length, i.e. prove Theorem 1.1. Recall that the variance of the nodal length is given by (42). Thus Theorem 1.1 is equivalent to the following proposition.

Proposition 3.1. As $n \to \infty$ one has
\[ I_n = \frac{65}{128\pi^2} \frac{\log n}{n} + O\Big(\frac{1}{n}\Big). \]
The rest of the present section is dedicated to the proof of Proposition 3.1.

3.1. Asymptotics for $I_n$. Recall the definition
\[ I_n = \int_0^{\pi m} \Big( K_n(\psi) - \frac{1}{4} \Big) \sin(\psi/m) \, d\psi \]
of $I_n$, where $K_n$ is given by (44) or, equivalently, (61), and $m$ is related to $n$ via (41). We note that the scaled version of (49) is
\[ K_n(\psi) = K_n(\pi m - \psi). \tag{50} \]
Thus, by the definition (43) of $I_n$ and (50), we have
\[ I_n = 2 \cdot \int_0^{\pi m/2} \Big( K_n(\psi) - \frac{1}{4} \Big) \sin(\psi/m) \, d\psi. \tag{51} \]
Therefore we may concentrate on $[0, \pi m/2]$ rather than the full interval. Ideally, to evaluate $I_n$, one would hope to obtain an explicit formula for $K_n(\psi)$. Unfortunately, to the best knowledge of the author, no such formula exists. However, we will still be able to give an asymptotic expression for $K_n(\psi)$ for large values of $\psi$, uniformly w.r.t. $n$ and $\psi$.
For small values of $\psi$ the behaviour of $K_n$ is very different, due to the fact that as $\psi \to 0+$, $P_n(\cos(\psi/m))$ approaches 1, which results in the singularity of $\frac{1}{\sqrt{1 - P_n(\cos(\psi/m))^2}}$ and of the covariance matrix $\Delta_n(\psi)$ at the origin. Nevertheless, we will see that this “singular” contribution is negligible, so that a relatively soft upper bound already obtained in [23] will suffice (see Lemma 3.2). More precisely, we choose a constant $C > 0$, which is kept fixed throughout the rest of the computations, and write
\[ \int_0^{\pi m/2} = \int_0^{C} + \int_C^{\pi m/2}. \tag{52} \]
The main contribution to the integral will come from the second (“nonsingular”) integral in (52), i.e. away from the origin. Our first task is then to bound the first (“singular”) integral of (52). A satisfactory bound was already given in [23].

Lemma 3.2. (Restatement of Lemma 4.2 from [23])⁵ For any constant $C > 0$ we have, as $n \to \infty$,
\[ \int_0^{C} \Big( K_n(\psi) - \frac{1}{4} \Big) \sin(\psi/m) \, d\psi = O\Big(\frac{1}{n}\Big). \]

Lemma 3.2 together with (52) and (51) yield the following lemma.

Lemma 3.3. For any choice of the constant $C > 0$ we have, as $n \to \infty$,
\[ I_n = 2 \tilde{I}_n + O\Big(\frac{1}{n}\Big), \tag{53} \]
where
\[ \tilde{I}_n = \int_C^{\pi m/2} \Big( K_n(\psi) - \frac{1}{4} \Big) \sin(\psi/m) \, d\psi. \tag{54} \]

Therefore, to understand the asymptotic behaviour of $I_n$ it is sufficient to understand the asymptotic behaviour of $\tilde{I}_n$. Proposition 3.4 resolves the latter. The proof of Proposition 3.4 occupies the rest of the present section.

Proposition 3.4. For any choice of the constant $C > 0$ in the definition (54) of $\tilde{I}_n$, we have, as $n \to \infty$,
\[ \tilde{I}_n = \frac{65}{256\pi^2} \frac{\log n}{n} + O\Big(\frac{1}{n}\Big). \tag{55} \]

Proof of Proposition 3.1 assuming Proposition 3.4. Just use Proposition 3.4 together with (53).

⁵ Note that in [23], Lemma 4.2 was given without scaling, i.e. in terms of $\phi$ rather than $\psi$.
3.2. Asymptotics for the 2-point correlation function. Recall that $K_n(\psi)$ is given by (44). One may notice that
\[ K_n(\psi) = \frac{1}{2\pi \sqrt{1 - P_n(\cos(\psi/m))^2}} F(a_n(\psi), b_n(\psi), c_n(\psi)), \]
where $F(\alpha, \beta, \gamma)$ is a smooth function independent of $n$, defined on some neighbourhood of the origin $(\alpha, \beta, \gamma) = (0, 0, 0)$. Its arguments $a_n(\psi)$, $b_n(\psi)$ and $c_n(\psi)$ are uniformly small for $\psi > C$ (see Lemma 3.9). An easy explicit computation shows that
\[ F(0, 0, 0) = \frac{1}{4}, \]
which cancels out with the constant term in (43), which corresponds to $(E\mathcal{Z})^2$. This is not a coincidence, since the origin $\alpha = \beta = \gamma = 0$ corresponds to the covariance matrix $\Delta_n$ being the identity matrix $\Delta_n = I$; in this case the probability density function factors. We choose to expand $F(\alpha, \beta, \gamma)$ into a finite Taylor polynomial around the origin.⁶ We note that the matrix elements are of different order or decay rate, so that we may cut the smaller terms earlier than the larger ones. The decay rate of $a_n$, $b_n$ and $c_n$ prescribed by Lemma 3.9 implies that it is sufficient to expand in $a_n$, $b_n$ and $c_n$ up to 2nd, 4th and 1st degrees respectively. The following is the Key Proposition for the whole paper. We will reuse it while proving Theorem 1.4 (see Sect. 4) for smooth linear statistics of the nodal line (see also Remark 3.7).

Proposition 3.5. (Key Proposition) For any choice of $C > 0$, as $n \to \infty$, one has
\[ K_n(\psi) = \frac{1}{4} + \frac{1}{2} \frac{\sin(2\psi)}{\pi n \sin(\psi/m)} + \frac{65}{256} \frac{1}{\pi^2 n \sin(\psi/m) \psi} + \frac{9}{32} \frac{\cos(2\psi)}{\pi n \psi \sin(\psi/m)} + \frac{\frac{27}{64} \sin(2\psi) - \frac{11}{256} \cos(4\psi)}{\pi^2 n \psi \sin(\psi/m)} + O\Big(\frac{1}{\psi^3} + \frac{1}{n\psi}\Big) \tag{56} \]
uniformly for $C < \psi < \pi m/2$.

Remark 3.6. It is important to notice that the leading nonconstant term $\frac{1}{2} \frac{\sin(2\psi)}{\pi n \sin(\psi/m)}$ is oscillating, and we will see that it does not contribute to the variance (see the proof of Proposition 3.1). We observe this obscure “Berry’s cancelation phenomenon”, which is responsible for the variance being surprisingly small, in some other situations, such as Berry’s original work [3], and on the torus [14]. This suggests that this phenomenon is of a more general nature, and we expect it to occur on a “generic” surface [22].

Remark 3.7. As another application of Proposition 3.5, one may exploit it to study the morphology of the nodal lines for $n$-dependent linear statistics $\varphi = \varphi_n$. It is most efficient for $\varphi_n$ whose support is not shrinking too rapidly (relative to the scaling $\psi \approx n\phi$ we introduced earlier), for example $\varphi_n$ a characteristic function of a spherical disc of radius $a_n$, where $a_n \cdot n \to \infty$.

⁶ Intuitively, the origin $a = b = c = 0$ corresponds to $\psi$ “=” $\infty$ (see the decay at infinity in Lemma 3.9), hence this expansion should be good for large values of $\psi$.
For finer “local” statistics of the nodal lines, such as studying the nodal length inside a spherical disc of radius $\approx \frac{1}{n}$, one needs to expand $K_n(\psi)$ around the origin, where the behaviour is very different from the one at infinity. We may want to do so in the future.

The asymptotic evaluation (56) in Proposition 3.5 is done in two steps. Lemma 3.8 provides an approximation of the two-point correlation function $K_n(\psi)$ by a polynomial in $P_n(\cos(\psi/m))$, $a_n(\psi)$, $b_n(\psi)$ and $c_n(\psi)$, i.e. the Taylor expansion of $K_n$ as a function of the above expressions. In the second step, performed in Lemma 3.9, we evaluate each of the terms appearing in the Taylor expansion obtained in the first step, using the high degree asymptotics of the Legendre polynomials and their derivatives (Hilb asymptotics). Lemmas 3.8 and 3.9 are proved in Sects. 3.4.1 and 3.4.2 respectively.

Lemma 3.8. For $C > 0$ large enough, one has the following expansion on $[C, \pi m/2]$:
\[ \begin{aligned} K_n(\psi) = {} & \frac{1}{4} + \frac{1}{4} a_n(\psi) + \frac{1}{8} b_n(\psi)^2 - \frac{1}{32} a_n(\psi)^2 - \frac{3}{16} a_n(\psi) b_n(\psi)^2 + \frac{3}{128} b_n(\psi)^4 \\ & + \frac{1}{8} P_n(\cos(\psi/m))^2 + \frac{1}{8} a_n(\psi) P_n(\cos(\psi/m))^2 + \frac{3}{32} P_n(\cos(\psi/m))^4 + \frac{1}{16} b_n(\psi)^2 P_n(\cos(\psi/m))^2 \\ & + O\big( P_n(\cos(\psi/m))^6 + a_n(\psi)^3 + b_n(\psi)^5 + c_n(\psi)^2 \big), \end{aligned} \tag{57} \]
where the constants involved in the “O”-notation depend only on $C$.

Lemma 3.9. For $n \ge 1$, $C < \psi < \pi m/2$ we have the following estimates for $a$, $b$, $c$ and $P_n(\cos(\psi/m))$:

(1)
\[ P_n(\cos(\psi/m))^2 = \frac{1 + \sin(2\psi)}{\pi n \sin(\psi/m)} - \frac{\cos(2\psi)}{4\pi n \psi \sin(\psi/m)} + O\Big(\frac{1}{\psi^3} + \frac{1}{\psi n}\Big) = \frac{1 + \sin(2\psi)}{\pi n \sin(\psi/m)} + O\Big(\frac{1}{\psi^2}\Big). \tag{58} \]

(2)
\[ a_n(\psi) = -\frac{1 - \sin(2\psi)}{\pi n \sin(\psi/m)} + \frac{3\cos(2\psi)}{4\pi n^2 \sin(\psi/m)^2} - \frac{1 + \cos(4\psi)}{2\pi^2 n^2 \sin(\psi/m)^2} + O\Big(\frac{1}{\psi^3} + \frac{1}{n\psi}\Big) = -\frac{1 - \sin(2\psi)}{\pi n \sin(\psi/m)} + O\Big(\frac{1}{\psi^2}\Big). \]

(3)
\[ b_n(\psi)^2 = \frac{1 + \sin(2\psi)}{\pi n \sin(\psi/m)} + \frac{7\cos(2\psi)}{4\pi n \sin(\psi/m) \psi} + \frac{1 + \cos(4\psi)}{\pi^2 n \sin(\psi/m) \psi} + O\Big(\frac{1}{\psi^3} + \frac{1}{n\psi}\Big). \]

(4)
\[ |c_n(\psi)| = O\Big(\frac{1}{\psi^{3/2}}\Big). \]
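(5)
\[ P_n(\cos(\psi/m))^4 = \frac{\frac{3}{2} + 2\sin(2\psi) - \frac{1}{2}\cos(4\psi)}{\pi^2 n^2 \sin(\psi/m)^2} + O\Big(\frac{1}{\psi^3}\Big). \]

(6)
\[ a_n(\psi)^2 = \frac{\frac{3}{2} - 2\sin(2\psi) - \frac{1}{2}\cos(4\psi)}{\pi^2 n^2 \sin(\psi/m)^2} + O\Big(\frac{1}{\psi^3}\Big). \]

(7)
\[ b_n(\psi)^4 = \frac{\frac{3}{2} + 2\sin(2\psi) - \frac{1}{2}\cos(4\psi)}{\pi^2 n^2 \sin(\psi/m)^2} + O\Big(\frac{1}{\psi^3}\Big). \]

(8)
\[ P_n(\cos(\psi/m))^2 a_n(\psi) = -\frac{1 + \cos(4\psi)}{2\pi^2 n^2 \sin(\psi/m)^2} + O\Big(\frac{1}{\psi^3}\Big). \]

(9)
\[ P_n(\cos(\psi/m))^2 b_n(\psi)^2 = \frac{\frac{3}{2} + 2\sin(2\psi) - \frac{1}{2}\cos(4\psi)}{\pi^2 n^2 \sin(\psi/m)^2} + O\Big(\frac{1}{\psi^3}\Big). \]

(10)
\[ a_n(\psi) b_n(\psi)^2 = -\frac{1 + \cos(4\psi)}{2\pi^2 n^2 \sin(\psi/m)^2} + O\Big(\frac{1}{\psi^3}\Big). \]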
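Proof of Proposition 3.5. Substituting all the various estimates in Lemma 3.9 into (57) we obtain, after collecting similar terms together and some reorganization (replacing $\frac{1}{n^2 \sin(\psi/m)^2}$ by $\frac{1}{n \sin(\psi/m) \psi}$ whenever necessary),
\[ \begin{aligned} K_n(\psi) = {} & \frac{1}{4} + \frac{1}{\pi n \sin(\psi/m)} \Big[ \frac{1}{4} \sin(2\psi) + \frac{1}{8} \sin(2\psi) + \frac{1}{8} \sin(2\psi) \Big] \\ & + \frac{1}{\pi^2 n \sin(\psi/m) \psi} \Big[ -\frac{1}{4} \cdot \frac{1}{2} + \frac{1}{8} \cdot 1 - \frac{1}{32} \cdot \frac{3}{2} + \frac{3}{16} \cdot \frac{1}{2} + \frac{3}{128} \cdot \frac{3}{2} + \frac{1}{8} \cdot 0 - \frac{1}{8} \cdot \frac{1}{2} + \frac{1}{16} \cdot \frac{3}{2} + \frac{3}{32} \cdot \frac{3}{2} \Big] \\ & + \frac{1}{\pi n \psi \sin(\psi/m)} \Big[ \frac{1}{4} \cdot \frac{3}{4} \cos(2\psi) + \frac{1}{8} \cdot \frac{7}{4} \cos(2\psi) - \frac{1}{8} \cdot \frac{1}{4} \cos(2\psi) \Big] \\ & + \frac{1}{\pi^2 n \psi \sin(\psi/m)} \Big[ -\frac{1}{4} \cdot \frac{1}{2} \cos(4\psi) + \frac{1}{8} \cos(4\psi) + \frac{1}{32} \Big( 2\sin(2\psi) + \frac{1}{2} \cos(4\psi) \Big) + \frac{3}{16} \cdot \frac{1}{2} \cos(4\psi) \\ & \qquad + \frac{3}{128} \Big( 2\sin(2\psi) - \frac{1}{2} \cos(4\psi) \Big) - \frac{1}{8} \cdot \frac{1}{2} \cos(4\psi) + \frac{1}{16} \Big( 2\sin(2\psi) - \frac{1}{2} \cos(4\psi) \Big) + \frac{3}{32} \Big( 2\sin(2\psi) - \frac{1}{2} \cos(4\psi) \Big) \Big] \\ & + O\Big(\frac{1}{\psi^3} + \frac{1}{n\psi}\Big) \\ = {} & \frac{1}{4} + \frac{1}{2} \frac{\sin(2\psi)}{\pi n \sin(\psi/m)} + \frac{65}{256} \frac{1}{\pi^2 n \sin(\psi/m) \psi} + \frac{3}{8} \frac{\cos(2\psi)}{\pi n \psi \sin(\psi/m)} + \frac{\frac{27}{64} \sin(2\psi) - \frac{11}{256} \cos(4\psi)}{\pi^2 n \psi \sin(\psi/m)} + O\Big(\frac{1}{\psi^3} + \frac{1}{n\psi}\Big). \end{aligned} \]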
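The bookkeeping in this proof amounts to multiplying each Taylor coefficient from (57) by the corresponding numerator from Lemma 3.9 and summing. As a sanity check, the sketch below redoes that arithmetic in exact fractions. The dictionary layout and key names are hypothetical conveniences; the numbers are the coefficients of (57) and the second-order numerators of Lemma 3.9, each written as a triple (constant, coefficient of $\sin 2\psi$, coefficient of $\cos 4\psi$) in units of $1/(\pi^2 n \psi \sin(\psi/m))$.

```python
from fractions import Fraction as Fr

# Taylor coefficients of (57), keyed by the monomial they multiply.
coefficients = {
    "a": Fr(1, 4), "b^2": Fr(1, 8), "a^2": Fr(-1, 32), "a b^2": Fr(-3, 16),
    "b^4": Fr(3, 128), "P^2": Fr(1, 8), "a P^2": Fr(1, 8),
    "b^2 P^2": Fr(1, 16), "P^4": Fr(3, 32),
}

# Second-order parts from Lemma 3.9: (constant, sin(2 psi), cos(4 psi)).
# "b^2 P^2" and "P^4" carry the same triple, so the totals below do not
# depend on which of the two receives 1/16 and which receives 3/32.
second_order = {
    "a": (Fr(-1, 2), Fr(0), Fr(-1, 2)),
    "b^2": (Fr(1), Fr(0), Fr(1)),
    "a^2": (Fr(3, 2), Fr(-2), Fr(-1, 2)),
    "a b^2": (Fr(-1, 2), Fr(0), Fr(-1, 2)),
    "b^4": (Fr(3, 2), Fr(2), Fr(-1, 2)),
    "P^2": (Fr(0), Fr(0), Fr(0)),
    "a P^2": (Fr(-1, 2), Fr(0), Fr(-1, 2)),
    "b^2 P^2": (Fr(3, 2), Fr(2), Fr(-1, 2)),
    "P^4": (Fr(3, 2), Fr(2), Fr(-1, 2)),
}

# Leading parts in units of 1/(pi n sin(psi/m)): (constant, sin(2 psi)).
first_order = {"a": (Fr(-1), Fr(1)), "b^2": (Fr(1), Fr(1)), "P^2": (Fr(1), Fr(1))}

second_totals = [sum(coefficients[k] * second_order[k][i] for k in coefficients)
                 for i in range(3)]
first_totals = [sum(coefficients[k] * first_order[k][i] for k in first_order)
                for i in range(2)]

print("constant, sin(2psi), cos(4psi):", second_totals)
print("leading constant, sin(2psi):", first_totals)
```

The vanishing leading constant is the cancelation of the non-oscillating first-order terms discussed in Remark 3.6, and the second-order constant is the coefficient that drives the logarithm in Proposition 3.4.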
3.3. Concluding the proof of Proposition 3.4. All the hard work establishing the asymptotics (56) of $K_n(\psi)$ at infinity finally pays off, as the proof of Proposition 3.4 is now straightforward.

Proof of Proposition 3.4. Recall that $\tilde{I}_n$ is given by (54), where for large $\psi$ we may asymptotically expand $K_n(\psi)$ as in (56). First note that the constant term $\frac{1}{4}$ in (56) cancels out in (54). Thus we have
\[ \begin{aligned} \tilde{I}_n = {} & \int_C^{\pi m/2} \Big[ \frac{1}{2} \frac{\sin(2\psi)}{\pi n \sin(\psi/m)} + \frac{65}{256} \frac{1}{\pi^2 n \sin(\psi/m) \psi} + \frac{9}{32} \frac{\cos(2\psi)}{\pi n \psi \sin(\psi/m)} + \frac{\frac{27}{64} \sin(2\psi) - \frac{11}{256} \cos(4\psi)}{\pi^2 n \psi \sin(\psi/m)} \Big] \sin(\psi/m) \, d\psi \\ & + O\Big( \int_C^{\pi m/2} \Big( \frac{1}{\psi^3} + \frac{1}{n\psi} \Big) \sin(\psi/m) \, d\psi \Big) \\ = {} & \frac{1}{\pi n} \int_C^{\pi m/2} \Big[ \frac{1}{2} \sin(2\psi) + \frac{65}{256} \frac{1}{\pi\psi} + \frac{9}{32} \frac{\cos(2\psi)}{\psi} + \frac{\frac{27}{64} \sin(2\psi) - \frac{11}{256} \cos(4\psi)}{\pi\psi} \Big] d\psi + O\Big(\frac{1}{n}\Big). \end{aligned} \tag{59} \]
The contribution of the first term in (59) is bounded by
\[ \frac{1}{n} \int_C^{\pi m/2} \sin(2\psi) \, d\psi = O\Big(\frac{1}{n}\Big). \]
The main contribution to (59) comes from the leading non-oscillatory term, i.e. the second term:
\[ \frac{65}{256\pi^2 n} \int_C^{\pi m/2} \frac{d\psi}{\psi} = \frac{65}{256\pi^2} \cdot \frac{\log n}{n} + O\Big(\frac{1}{n}\Big). \tag{60} \]
Bounding the contribution of the other oscillatory terms using integration by parts, as well as bounding the error term in (59), is easy. The asymptotic expression (60) together with the bound for the contribution of the other terms in (56) yields the result (55) of the present proposition.
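The contrast between the non-oscillatory and the oscillatory terms in (59) can be seen numerically. The sketch below is an illustration only, with the constant $C = 1$, the sample degrees, the step size and the composite Simpson rule all chosen as assumptions for convenience: it compares $\int_C^{\pi m/2} d\psi/\psi$, which grows like $\log n$, with $\int_C^{\pi m/2} \cos(2\psi)/\psi \, d\psi$, which stays bounded as $n$ grows.

```python
import math

def simpson(f, lo, hi, n_steps):
    """Composite Simpson rule on [lo, hi] with an even number of subintervals."""
    if n_steps % 2:
        n_steps += 1
    h = (hi - lo) / n_steps
    total = f(lo) + f(hi)
    for i in range(1, n_steps):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

C = 1.0  # illustrative cutoff
results = {}
for n in (100, 1000):
    m = n + 0.5
    upper = math.pi * m / 2
    steps = int((upper - C) / 0.01)  # step size of about 0.01
    monotone = simpson(lambda p: 1.0 / p, C, upper, steps)
    oscillating = simpson(lambda p: math.cos(2 * p) / p, C, upper, steps)
    results[n] = (monotone, oscillating, math.log(upper / C))
    print(n, monotone, oscillating)
```

Only the first integral produces a $\log n$; after the prefactor $\frac{65}{256\pi^2 n}$ it differs from $\frac{65}{256\pi^2}\frac{\log n}{n}$ by a bounded multiple of $\frac{1}{n}$, in line with (60).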
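3.4. Proofs of auxiliary lemmas.

3.4.1. Taylor expansion for $K_n(\psi)$. In principle, one may compute the coefficients of the multivariate Taylor expansion directly, using the Leibniz rule for differentiating under the integral sign. The following elegant method due to Berry [3] gives the necessary Taylor coefficients while avoiding long and tedious computations.

Proof of Lemma 3.8. We use Remark 2.10 to write
\[ K_n(\psi) = \frac{1}{2\pi \sqrt{1 - P_n(\cos(\psi/m))^2}} \, E[\|U\| \cdot \|V\|], \tag{61} \]
where $(U, V)$ is a mean zero multivariate Gaussian random vector with covariance matrix $\Delta = \Delta_n(\psi)$ given by (45). On $[C, \pi m/2]$ we may Taylor expand the first factor in (61) as
\[ \frac{1}{\sqrt{1 - P_n(\cos(\psi/m))^2}} = 1 + \frac{1}{2} P_n(\cos(\psi/m))^2 + \frac{3}{8} P_n(\cos(\psi/m))^4 + O\big( P_n(\cos(\psi/m))^6 \big), \tag{62} \]
since by part 1 of Lemma 3.9, $|P_n(\cos(\psi/m))|$ is bounded away from 1, provided that $C$ is large enough. It then remains to expand the remaining part of (61), i.e. $E[\|U\| \cdot \|V\|]$, in terms of powers of $a$, $b$ and $c$. To this end we use the identity
\[ \sqrt{\alpha} = \frac{1}{\sqrt{2\pi}} \int_0^{\infty} \big(1 - e^{-\frac{\alpha t}{2}}\big) \frac{dt}{t^{3/2}}, \]
which implies
\[ \begin{aligned} E[\|U\| \cdot \|V\|] & = \frac{1}{2\pi} E\Big[ \int_0^{\infty} \int_0^{\infty} \Big(1 - e^{-\frac{t\|U\|^2}{2}}\Big) \Big(1 - e^{-\frac{s\|V\|^2}{2}}\Big) \frac{dt \, ds}{(ts)^{3/2}} \Big] \\ & = \frac{1}{2\pi} \int_0^{\infty} \int_0^{\infty} \big[ f(0, 0) - f(t, 0) - f(0, s) + f(t, s) \big] \frac{dt \, ds}{(ts)^{3/2}}, \end{aligned} \tag{63} \]
where we define
\[ \begin{aligned} f(t, s) = f^{\Delta_n(\psi)}(t, s) & := E\Big[ e^{-\frac{t\|U\|^2 + s\|V\|^2}{2}} \Big] \\ & = \frac{1}{(2\pi)^2 (\det \Delta_n(\psi))^{1/2}} \int_{\mathbb{R}^2 \times \mathbb{R}^2} \exp\Big( -\frac{1}{2} W^t \Big( \Delta_n(\psi)^{-1} + \begin{pmatrix} t I_2 & \\ & s I_2 \end{pmatrix} \Big) W \Big) dW \\ & = \frac{1}{(\det \Delta_n(\psi))^{1/2} \det\Big( \Delta_n(\psi)^{-1} + \begin{pmatrix} t I_2 & \\ & s I_2 \end{pmatrix} \Big)^{1/2}} \\ & = \frac{1}{\det\Big( I + \begin{pmatrix} t I_2 & \\ & s I_2 \end{pmatrix} \Delta_n(\psi) \Big)^{1/2}}. \end{aligned} \tag{64} \]
Let
\[ \Delta_n(\psi) = I + \begin{pmatrix} A & B \\ B & A \end{pmatrix}, \]
where
\[ A = A_n(\psi) = \begin{pmatrix} 2a & \\ & 0 \end{pmatrix}; \qquad B = B_n(\psi) = \begin{pmatrix} 2b & \\ & 2c \end{pmatrix}, \]
with entries defined by (46), (47) and (48). Thus we have
\[ I + \begin{pmatrix} t I_2 & \\ & s I_2 \end{pmatrix} \Delta_n(\psi) = \begin{pmatrix} (1 + t) I + tA & tB \\ sB & (1 + s) I + sA \end{pmatrix}, \]
so that
\[ \begin{aligned} \det\Big( I + \begin{pmatrix} t I_2 & \\ & s I_2 \end{pmatrix} \Delta_n(\psi) \Big) & = \det\big( (1 + t) I + tA \big) \det\big( (1 + s) I + sA - st B ((1 + t) I + tA)^{-1} B \big) \\ & = (1 + t)^2 (1 + s)^2 \det\Big( I + \frac{t}{1 + t} A \Big) \\ & \quad \times \det\Big( I + \frac{s}{1 + s} A - \frac{st}{(1 + s)(1 + t)} B \Big( I + \frac{t}{1 + t} A \Big)^{-1} B \Big), \end{aligned} \tag{65} \]
where we make use of the fact that both $A$ and $B$ are diagonal and hence commute. We compute the first determinant explicitly as
\[ \det\Big( I + \frac{t}{1 + t} A \Big) = 1 + \frac{t}{1 + t} 2a. \tag{66} \]
Next we wish to compute the other determinant in (65). For this we write
\[ \Big( I + \frac{t}{1 + t} A \Big)^{-1} = I - \frac{t}{1 + t} A + O\big(a^2\big), \]
where we understand the “O”-notation entry-wise. Therefore (taking advantage of the fact that all the matrices involved are diagonal), we have
\[ \begin{aligned} & I + \frac{s}{1 + s} A - \frac{st}{(1 + s)(1 + t)} B^2 \Big( I + \frac{t}{1 + t} A \Big)^{-1} \\ & \quad = I + \frac{s}{1 + s} A - \frac{st}{(1 + s)(1 + t)} B^2 + \frac{st^2}{(1 + s)(1 + t)^2} A B^2 + O\big(a^2 b^2\big) \\ & \quad = \begin{pmatrix} 1 + \frac{2s}{1 + s} a - \frac{4st}{(1 + s)(1 + t)} b^2 + \frac{8st^2}{(1 + s)(1 + t)^2} a b^2 & \\ & 1 - \frac{4st}{(1 + s)(1 + t)} c^2 \end{pmatrix} + O\big(a^2 b^2\big). \end{aligned} \]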
Therefore, we have
$$\det\Big(I + \frac{s}{1+s}A - \frac{st}{(1+s)(1+t)}\, B^2 \Big(I + \frac{t}{1+t}A\Big)^{-1}\Big) = \Big(1 + \frac{2s}{1+s}a - \frac{4st}{(1+s)(1+t)}b^2 + \frac{8st^2}{(1+s)(1+t)^2}ab^2\Big)\Big(1 - \frac{4st}{(1+s)(1+t)}c^2\Big) + O(a^2b^2 + c^2)$$
$$= 1 + \frac{2s}{1+s}a - \frac{4st}{(1+s)(1+t)}b^2 + \frac{8st^2}{(1+s)(1+t)^2}ab^2 + O(a^2b^2 + c^2). \tag{67}$$
Substituting (66) and (67) into (65) we obtain
$$\det\Big(I + \begin{pmatrix} tI_2 & \\ & sI_2 \end{pmatrix}\Delta_n(\psi)\Big) = (1+t)^2(1+s)^2 \Big(1 + \frac{2t}{1+t}a\Big) \times \Big(1 + \frac{2s}{1+s}a - \frac{4st}{(1+s)(1+t)}b^2 + \frac{8st^2}{(1+s)(1+t)^2}ab^2 + O(a^2b^2 + c^2)\Big)$$
$$= (1+t)^2(1+s)^2 \times \Big(1 + \frac{2(t+s+2st)}{(1+s)(1+t)}a + \frac{4st}{(1+s)(1+t)}(a^2 - b^2) + O(a^2b^2 + c^2)\Big).$$
Now we use the expansion
$$\frac{1}{\sqrt{1+x}} = 1 - \frac{1}{2}x + \frac{3}{8}x^2 + O(x^3)$$
to write
$$f^{\Delta_n(\psi)}(t,s) = \frac{1}{\det\Big(I + \begin{pmatrix} tI_2 & \\ & sI_2 \end{pmatrix}\Delta_n(\psi)\Big)^{1/2}} = \frac{1}{(1+t)(1+s)} \times \Big(1 - \frac{t+s+2st}{(1+s)(1+t)}a + \frac{2st}{(1+s)(1+t)}b^2 + \Big[\frac{3s^2}{2(1+s)^2} + \frac{3t^2}{2(1+t)^2} + \frac{st}{(1+s)(1+t)}\Big]a^2$$
$$\qquad - \frac{6st}{(1+t)^2(1+s)^2}(t+s+2st)\,ab^2 + \frac{6t^2s^2}{(1+t)^2(1+s)^2}b^4 + O(a^3 + b^5 + c^2)\Big). \tag{68}$$
The asymptotic expansion (68) implies, in particular,
$$f^{\Delta_n(\psi)}(0,0) = 1, \tag{69}$$
$$f^{\Delta_n(\psi)}(t,0) = \frac{1}{1+t}\Big(1 - \frac{t}{1+t}a + \frac{3t^2}{2(1+t)^2}a^2 + O(a^3 + b^5 + c^2)\Big), \tag{70}$$
and
$$f^{\Delta_n(\psi)}(0,s) = \frac{1}{1+s}\Big(1 - \frac{s}{1+s}a + \frac{3s^2}{2(1+s)^2}a^2 + O(a^3 + b^5 + c^2)\Big). \tag{71}$$
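As a numerical sanity check of the expansion (68) (a sketch only, not part of the original argument), one may compare the closed form (64) with the truncated series for small $a$, $b$, $c$. The snippet below assumes the block structure of the covariance matrix described above, with $A = \mathrm{diag}(2a, 0)$ and $B = \mathrm{diag}(2b, 2c)$; the function names and the sample values of $t, s, a, b, c$ are chosen here purely for illustration.

```python
import math

def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    size = len(m)
    det = 1.0
    for i in range(size):
        pivot = max(range(i, size), key=lambda r: abs(m[r][i]))
        if m[pivot][i] == 0.0:
            return 0.0
        if pivot != i:
            m[i], m[pivot] = m[pivot], m[i]
            det = -det
        det *= m[i][i]
        for r in range(i + 1, size):
            factor = m[r][i] / m[i][i]
            for k in range(i, size):
                m[r][k] -= factor * m[i][k]
    return det

def f_exact(t, s, a, b, c):
    """Closed form (64): det(I + diag(t, t, s, s) * Delta)^(-1/2)."""
    # Assumed structure: Delta = I + [[A, B], [B, A]], A = diag(2a, 0), B = diag(2b, 2c).
    delta = [
        [1 + 2 * a, 0.0, 2 * b, 0.0],
        [0.0, 1.0, 0.0, 2 * c],
        [2 * b, 0.0, 1 + 2 * a, 0.0],
        [0.0, 2 * c, 0.0, 1.0],
    ]
    scale = [t, t, s, s]
    m = [[(1.0 if i == j else 0.0) + scale[i] * delta[i][j] for j in range(4)]
         for i in range(4)]
    return determinant(m) ** -0.5

def f_series(t, s, a, b):
    """Truncated expansion (68), with the O(a^3 + b^5 + c^2) remainder dropped."""
    d = (1 + t) * (1 + s)
    q = t + s + 2 * s * t
    bracket = (1 - q / d * a + 2 * s * t / d * b ** 2
               + (3 * s ** 2 / (2 * (1 + s) ** 2)
                  + 3 * t ** 2 / (2 * (1 + t) ** 2) + s * t / d) * a ** 2
               - 6 * s * t * q / d ** 2 * a * b ** 2
               + 6 * t ** 2 * s ** 2 / d ** 2 * b ** 4)
    return bracket / d

# Illustrative parameters: the gap should be of order a^3 + b^5 + c^2.
gap = abs(f_exact(0.7, 1.3, 0.01, 0.02, 0.001) - f_series(0.7, 1.3, 0.01, 0.02))
print(gap)
```

For these sample values the second order terms of (68) are of size about $10^{-5}$, so agreement well below that level indicates that the coefficients of $a^2$ and $b^2$ are consistent with (64).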
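Define
$$F^{\Delta_n(\psi)}(t,s) := f^{\Delta_n(\psi)}(0,0) - f^{\Delta_n(\psi)}(t,0) - f^{\Delta_n(\psi)}(0,s) + f^{\Delta_n(\psi)}(t,s), \tag{72}$$
so that in the new notation (63) is
$$E\big[\|U\|\,\|V\|\big] = \frac{1}{2\pi} \int_0^\infty\!\!\int_0^\infty F^{\Delta_n(\psi)}(t,s)\, \frac{dt\,ds}{(ts)^{3/2}}. \tag{73}$$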
Plugging the estimates (68), (69), (70) and (71) into the definition (72) of $F^{\Delta_n(\psi)}$ yields
$$F^{\Delta_n(\psi)}(t,s) = \frac{ts}{(1+t)(1+s)} + \frac{st(2+t+s)}{(1+t)^2(1+s)^2}\cdot a + \frac{2st}{(1+s)^2(1+t)^2}\cdot b^2 - \frac{ts(t + 10ts + s + 3ts^2 + 3t^2s - 2)}{2(1+t)^3(1+s)^3}\cdot a^2$$
$$\qquad - \frac{6st}{(1+t)^3(1+s)^3}(t+s+2st)\cdot ab^2 + \frac{6t^2s^2}{(1+t)^3(1+s)^3}\cdot b^4 + O(a^3 + b^5 + c^2), \tag{74}$$
where the constants involved in the "O"-notation are universal. We wish to plug (74) into (73) and integrate with respect to $t$ and $s$. The problem is that the integral $\int_0^\infty \frac{dt}{t^{3/2}}$ diverges at the origin, so that the bound for the error term in (74) is not sufficient. To resolve this issue we notice that (72) implies that we have
$$F^{\Delta_n(\psi)}(t,s)\big|_{t=0} = F^{\Delta_n(\psi)}(t,s)\big|_{s=0} = 0, \tag{75}$$
and identify the expression (74) as the Taylor expansion of $F^{\Delta_n(\psi)}(t,s)$, considered as a function of $(a,b,c)$ with fixed parameters $t, s$, around the origin $(a,b,c) = (0,0,0)$. The vanishing property (75) implies that all the Taylor coefficients in the expansion (74), considered as functions of $t, s$, are divisible by $ts$, so that we may improve the error term in (74) as
$$F^{\Delta_n(\psi)}(t,s) = \frac{ts}{(1+t)(1+s)} + \frac{st(2+t+s)}{(1+t)^2(1+s)^2}\cdot a + \frac{2st}{(1+s)^2(1+t)^2}\cdot b^2 - \frac{ts(t + 10ts + s + 3ts^2 + 3t^2s - 2)}{2(1+t)^3(1+s)^3}\cdot a^2$$
$$\qquad - \frac{6st}{(1+t)^3(1+s)^3}(t+s+2st)\cdot ab^2 + \frac{6t^2s^2}{(1+t)^3(1+s)^3}\cdot b^4 + O\big(m(t,s)\cdot(a^3 + b^5 + c^2)\big), \tag{76}$$
where we introduce the notations $m(t) := \min\{t, 1\}$ and $m(t,s) := m(t)\cdot m(s)$. Plugging (76) into (73) and integrating term by term we obtain
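$$E\big[\|U\|\,\|V\|\big] = \frac{\pi}{2} + \frac{\pi}{2}\cdot a + \frac{\pi}{4}\cdot b^2 - \frac{\pi}{16}\cdot a^2 - \frac{3\pi}{8}\cdot ab^2 + \frac{3\pi}{64}\, b^4 + O(a^3 + b^5 + c^2), \tag{77}$$
where we used the standard integrals
$$\int_0^\infty \frac{dt}{\sqrt{t}(1+t)} = \pi, \qquad \int_0^\infty \frac{dt}{\sqrt{t}(1+t)^2} = \frac{\pi}{2}, \qquad \int_0^\infty \frac{t^{3/2}}{(1+t)^3}\, dt = \frac{3\pi}{8},$$
$$\int_0^\infty \frac{\sqrt{t}}{(1+t)^3}\, dt = \frac{\pi}{8}, \qquad \int_0^\infty \frac{\sqrt{t}}{(1+t)^2}\, dt = \frac{\pi}{2}, \qquad \int_0^\infty \frac{dt}{\sqrt{t}(1+t)^3} = \frac{3\pi}{8}.$$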
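The term-by-term integration leading to (77) can be retraced numerically. The following sketch (an illustration, not the author's computation) evaluates each standard integral through the Beta function, using $\int_0^\infty t^p (1+t)^{-q}\, dt = B(p+1, q-p-1)$, and then assembles the coefficients of (77) from the rational functions in (76). The helper name `beta_integral` and the dictionary layout are introduced here only for the example.

```python
import math

def beta_integral(p, q):
    """Integral of t**p / (1 + t)**q over (0, inf), i.e. B(p + 1, q - p - 1)."""
    x, y = p + 1.0, q - p - 1.0
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)

# The six standard integrals quoted in the text.
J1 = beta_integral(-0.5, 1)  # pi
J2 = beta_integral(-0.5, 2)  # pi / 2
J3 = beta_integral(-0.5, 3)  # 3 pi / 8
K2 = beta_integral(0.5, 2)   # pi / 2
K3 = beta_integral(0.5, 3)   # pi / 8
L3 = beta_integral(1.5, 3)   # 3 pi / 8

# Integrate each coefficient of (76) against dt ds / (2 pi (ts)^(3/2)).
two_pi = 2 * math.pi
coefficients = {
    "1": J1 * J1 / two_pi,
    "a": (2 * J2 * J2 + 2 * K2 * J2) / two_pi,
    "b^2": 2 * J2 * J2 / two_pi,
    "a^2": -(2 * K3 * J3 + 10 * K3 * K3 + 6 * K3 * L3 - 2 * J3 * J3) / (2 * two_pi),
    "a b^2": -6 * (2 * K3 * J3 + 2 * K3 * K3) / two_pi,
    "b^4": 6 * K3 * K3 / two_pi,
}

for name, value in coefficients.items():
    print(name, value / math.pi)  # each coefficient as a multiple of pi
```

The printed multiples of $\pi$ can be compared directly with the coefficients $1/2$, $1/2$, $1/4$, $-1/16$, $-3/8$ and $3/64$ appearing in (77).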
We finally plug the estimates (62) and (77) into (61) to obtain (57), that is, the statement of the present lemma.

3.4.2. Some estimates related to the matrix elements. In this section we evaluate the various expressions appearing in (57) asymptotically as $\psi \to \infty$, namely we prove Lemma 3.9. To evaluate the matrix elements $a, b, c$ we will need to deal with the asymptotic behaviour of the Legendre polynomials of high degree. The reader may find the necessary background on the Legendre polynomials, as well as some basic asymptotic estimates, in Appendix B (see Lemma B.3).

Proof of Lemma 3.9. It is easy to check that parts 5-10 of the present lemma follow directly from parts 1-4. Moreover, part 1 may be obtained by a straightforward application of (116), and part 4 is a direct consequence of the high degree asymptotics (117) for the derivatives of Legendre polynomials. It then remains to prove parts 2-3. Recall that we assume that $C < \psi < \pi m/2$, so that $P_n(\cos(\psi/m))$ is bounded away from 1 by Hilb's asymptotics (115). Hence we may write
$$a_n(\psi) = -\frac{1}{n^2} P_n'(\cos(\psi/m))^2 \sin(\psi/m)^2 - \frac{1}{n^2} P_n(\cos(\psi/m))^2 P_n'(\cos(\psi/m))^2 \sin(\psi/m)^2 + O\Big(\frac{1}{n\psi} + \frac{1}{\psi^3}\Big), \tag{78}$$
where to bound the error term we used the decay
$$|P_n'(\cos(\psi/m))| = O\Big(\frac{n^2}{\psi^{3/2}}\Big), \tag{79}$$
which follows from (117).
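Now we use (117) to obtain
$$\frac{1}{n^2} P_n'(\cos(\psi/m))^2 \sin(\psi/m)^2 = \frac{2}{\pi n \sin(\psi/m)^3}\Big[\sin(\psi/m)^2 \sin\Big(\psi - \frac{\pi}{4}\Big)^2 - \frac{3}{8n}\sin(\psi/m)\cos(2\psi) + O\Big(\frac{1}{n^2}\Big)\Big] + O\Big(\frac{1}{\psi^3} + \frac{1}{n\psi}\Big)$$
$$= \frac{1 - \sin(2\psi)}{\pi n \sin(\psi/m)} - \frac{3\cos(2\psi)}{4\pi n^2 \sin(\psi/m)^2} + O\Big(\frac{1}{\psi^3} + \frac{1}{n\psi}\Big), \tag{80}$$
and (58) together with (80) imply
$$\frac{1}{n^2} P_n(\cos(\psi/m))^2 P_n'(\cos(\psi/m))^2 \sin(\psi/m)^2 = \frac{1 + \cos(4\psi)}{2\pi^2 n^2 \sin(\psi/m)^2} + O\Big(\frac{1}{\psi^3} + \frac{1}{n\psi}\Big). \tag{81}$$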
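The passage from the first to the second line of (80), and from (80) to (81), rests on three elementary trigonometric identities: $\sin(\psi - \pi/4)^2 = (1 - \sin 2\psi)/2$, $\sin(\psi - \pi/4)\sin(\psi + \pi/4) = -\cos(2\psi)/2$, and $(1 + \sin 2\psi)(1 - \sin 2\psi) = (1 + \cos 4\psi)/2$. A short numerical sketch confirming them on a grid is given below; the helper `max_identity_error` and the grid parameters are assumptions of this example only.

```python
import math

def max_identity_error(lhs, rhs, samples=2000, upper=50.0):
    """Largest |lhs(psi) - rhs(psi)| over an evenly spaced grid in (0, upper]."""
    return max(abs(lhs(upper * k / samples) - rhs(upper * k / samples))
               for k in range(1, samples + 1))

quarter = math.pi / 4

# sin(psi - pi/4)^2 = (1 - sin 2 psi) / 2, used for the leading term of (80).
err_square = max_identity_error(
    lambda p: math.sin(p - quarter) ** 2,
    lambda p: (1 - math.sin(2 * p)) / 2)

# sin(psi - pi/4) sin(psi + pi/4) = -cos(2 psi) / 2, used for the cross term.
err_cross = max_identity_error(
    lambda p: math.sin(p - quarter) * math.sin(p + quarter),
    lambda p: -math.cos(2 * p) / 2)

# (1 + sin 2 psi)(1 - sin 2 psi) = (1 + cos 4 psi) / 2, used for (81).
err_product = max_identity_error(
    lambda p: (1 + math.sin(2 * p)) * (1 - math.sin(2 * p)),
    lambda p: (1 + math.cos(4 * p)) / 2)

print(err_square, err_cross, err_product)
```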
Substituting (80) and (81) into (78) we obtain part 2 of the present lemma. It then remains to prove part 3 of the lemma, i.e. to establish a two-term asymptotics for $b_n(\psi)^2$. To achieve that we first evaluate $b_n(\psi)$. From the definition (47) of $b_n(\psi)$ we have, using (79) to replace $E_n = n(n+1)$ by $n^2$ and $\cos(\psi/m) = 1 + O\big(\frac{\psi^2}{n^2}\big)$ to replace $\cos(\psi/m)$ by 1,
$$b_n(\psi) = \frac{1}{n^2} P_n'(\cos(\psi/m))\cos(\psi/m) - \frac{1}{n^2} P_n''(\cos(\psi/m))\sin(\psi/m)^2 + \frac{1}{n^2} P_n(\cos(\psi/m)) P_n'(\cos(\psi/m))^2 \sin(\psi/m)^2 + O\Big(\frac{1}{\psi^{5/2}} + \frac{1}{n\sqrt{\psi}}\Big)$$
$$= \frac{1}{n^2} P_n'(\cos(\psi/m)) - \frac{1}{n^2} P_n''(\cos(\psi/m))\sin(\psi/m)^2 + \frac{1}{n^2} P_n(\cos(\psi/m)) P_n'(\cos(\psi/m))^2 \sin(\psi/m)^2 + O\Big(\frac{1}{\psi^{5/2}} + \frac{1}{n\sqrt{\psi}}\Big).$$
Next we use the differential equation (113) satisfied by the Legendre polynomials to write
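$$b_n(\psi) = P_n(\cos(\psi/m)) - \frac{1}{n^2} P_n'(\cos(\psi/m)) + \frac{1}{n^2} P_n(\cos(\psi/m)) P_n'(\cos(\psi/m))^2 \sin(\psi/m)^2 + O\Big(\frac{1}{\psi^{5/2}} + \frac{1}{n\sqrt{\psi}}\Big)$$
$$= \sqrt{\frac{2}{\pi n \sin(\psi/m)}}\Big[\sin\Big(\psi + \frac{\pi}{4}\Big) - \frac{1}{8}\frac{\cos(\psi + \frac{\pi}{4})}{\psi}\Big] - \sqrt{\frac{2}{\pi}}\, \frac{\sin(\psi - \frac{\pi}{4})}{n^{3/2}\sin(\psi/m)^{3/2}} + \sqrt{\frac{2}{\pi n \sin(\psi/m)}}\, \sin\Big(\psi + \frac{\pi}{4}\Big)\cdot \frac{1 - \sin(2\psi)}{\pi n \sin(\psi/m)} + O\Big(\frac{1}{\psi^{5/2}} + \frac{1}{n\sqrt{\psi}}\Big),$$
where we used (116), (117) and reused (80) once more to obtain the second equality. Reorganizing the terms in the last expression, we have
$$b_n(\psi) = \sqrt{\frac{2}{\pi n \sin(\psi/m)}}\Big[\sin\Big(\psi + \frac{\pi}{4}\Big) + \frac{7}{8}\frac{\cos(\psi + \frac{\pi}{4})}{\psi}\Big] + \frac{\sqrt{2}}{\pi^{3/2} n^{3/2} \sin(\psi/m)^{3/2}}\cdot\Big[\sin\Big(\psi + \frac{\pi}{4}\Big) + \frac{1}{2}\cos\Big(3\psi + \frac{\pi}{4}\Big) - \frac{1}{2}\cos\Big(\psi - \frac{\pi}{4}\Big)\Big] + O\Big(\frac{1}{\psi^{5/2}} + \frac{1}{n\sqrt{\psi}}\Big)$$
$$= \sqrt{\frac{2}{\pi n \sin(\psi/m)}}\Big[\sin\Big(\psi + \frac{\pi}{4}\Big) + \frac{7}{8}\frac{\cos(\psi + \frac{\pi}{4})}{\psi}\Big] + \frac{\sin(\psi + \frac{\pi}{4}) + \cos(3\psi + \frac{\pi}{4})}{\sqrt{2}\,\pi^{3/2} n^{3/2} \sin(\psi/m)^{3/2}} + O\Big(\frac{1}{\psi^{5/2}} + \frac{1}{n\sqrt{\psi}}\Big),$$
and we obtain part 3 of the present lemma by squaring the last equality.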
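The reorganization of $b_n(\psi)$ above replaces $n\sin(\psi/m)$ by $\psi$ in one term and simplifies a product of sines; both steps can be tested numerically. The sketch below evaluates the expression after the second equality and the final reorganized expression (both without their error terms) and compares their gap with the stated error size $\psi^{-5/2} + (n\sqrt{\psi})^{-1}$. The function names and the sample pairs $(n, \psi)$ are assumptions made for this illustration.

```python
import math

QUARTER = math.pi / 4

def b_second_equality(n, psi):
    """b_n(psi) as written after the second equality (error term omitted)."""
    s = math.sin(psi / (n + 0.5))
    amp = math.sqrt(2.0 / (math.pi * n * s))
    term1 = amp * (math.sin(psi + QUARTER) - math.cos(psi + QUARTER) / (8 * psi))
    term2 = -math.sqrt(2 / math.pi) * math.sin(psi - QUARTER) / (n ** 1.5 * s ** 1.5)
    term3 = amp * math.sin(psi + QUARTER) * (1 - math.sin(2 * psi)) / (math.pi * n * s)
    return term1 + term2 + term3

def b_reorganized(n, psi):
    """Final reorganized form of b_n(psi) (error term omitted)."""
    s = math.sin(psi / (n + 0.5))
    amp = math.sqrt(2.0 / (math.pi * n * s))
    term1 = amp * (math.sin(psi + QUARTER) + 7 * math.cos(psi + QUARTER) / (8 * psi))
    term2 = ((math.sin(psi + QUARTER) + math.cos(3 * psi + QUARTER))
             / (math.sqrt(2) * math.pi ** 1.5 * n ** 1.5 * s ** 1.5))
    return term1 + term2

def error_size(n, psi):
    """The size psi^(-5/2) + 1/(n sqrt(psi)) of the admissible error."""
    return psi ** -2.5 + 1.0 / (n * math.sqrt(psi))

for n, psi in [(1000, 30.0), (1000, 100.0), (1000, 500.0)]:
    gap = abs(b_second_equality(n, psi) - b_reorganized(n, psi))
    print(n, psi, gap, error_size(n, psi))
```

In each sampled case the gap between the two forms should lie below the admissible error size, which is what the reorganization requires.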
4. Proof of Theorem 1.4

In this section we assume that $\varphi : S^2 \to \mathbb{R}$ is a continuously differentiable even function. For the sake of proving Theorem 1.5, we will conduct the analysis of the error terms in terms of the $L^\infty$ norm $\|\varphi\|_\infty$ and the total variation $V(\varphi)$ of the test function, as prescribed by Theorem 1.4. Our first goal is to formulate an analogue of Proposition 2.7 for the variance of $\mathcal{Z}^\varphi(f_n)$. It turns out that a certain auxiliary function $W^\varphi$, defined below, arises from a straightforward repetition of the steps we performed in Sect. 2.5, adapted to suit $\mathcal{Z}^\varphi$ rather than $\mathcal{Z}$. For $\varphi \in C^1(S^2)$ the analogue of (25) is
$$E\big[\mathcal{Z}^\varphi(f_n)^2\big] = \iint_{S^2\times S^2} \varphi(x)\varphi(y)\, \tilde{K}_n(x,y)\, dx\, dy,$$
where $\tilde{K}_n(x,y)$ is given again by (26). Since $\tilde{K}_n(x,y) = \tilde{K}_n(\phi)$, where $\phi = d(x,y)$, we may employ Fubini to obtain (cf. (35))
$$E\big[\mathcal{Z}^\varphi(f_n)^2\big] = 2\pi |S^2| \int_0^\pi \tilde{K}_n(\phi)\, W^\varphi(\phi)\, d\phi,$$
where $W^\varphi : [0, \pi] \to \mathbb{R}$ is a continuously differentiable function defined by
$$W^\varphi(\phi) := \frac{1}{8\pi^2} \iint_{d(x,y)=\phi} \varphi(x)\varphi(y)\, dx\, dy. \tag{82}$$
For example, for the constant function $\varphi \equiv 1$ we have $W^1(\phi) = \sin(\phi)$. It is easy to check that, since $d(x,-y) = \pi - d(x,y)$, we have
$$W^\varphi(\pi - \phi) = W^\varphi(\phi), \tag{83}$$
as we assume that $\varphi$ is even. Scaling the integrand in exactly the same manner as in Sect. 2.5, we finally obtain the following lemma (cf. Proposition 2.7).

Lemma 4.1. The variance of $\mathcal{Z}^\varphi(f_n)$ is given by
$$\mathrm{Var}(\mathcal{Z}^\varphi(f_n)) = 4\pi^2\, \frac{E_n}{n + 1/2}\, I_n^\varphi, \tag{84}$$
where
$$I_n^\varphi = \int_0^{\pi m} \Big(K_n(\psi) - \frac{1}{4}\Big) W^\varphi(\psi/m)\, d\psi.$$
Remark 4.2. One deduces from (50) and (83) that
$$I_n^\varphi = 2\int_0^{\pi m/2} \Big(K_n(\psi) - \frac{1}{4}\Big) W^\varphi(\psi/m)\, d\psi. \tag{85}$$
We will need some rather simple properties of $W^\varphi$. Writing the double integral (82) as an iterated integral and using the spherical coordinates with pole at $x$ for each $x \in S^2$ we obtain
$$W^\varphi(\phi) = \frac{1}{8\pi^2} \sin(\phi)\, W_0^\varphi(\phi), \tag{86}$$
where
$$W_0^\varphi(\phi) = \int_{S^2} \varphi(x)\, dx \int_{ST_x(S^2)} \varphi(\exp_x(\phi\cdot\eta))\, d\eta$$
is a continuously differentiable function with
$$W_0^\varphi(0) = 2\pi \|\varphi\|^2_{L^2(S^2)}, \tag{87}$$
whose values are uniformly bounded by
$$|W_0^\varphi(\phi)| \le 2\pi \|\varphi\|_\infty \|\varphi\|_{L^1(S^2)} \le 8\pi^2 \|\varphi\|_\infty^2 \tag{88}$$
and derivative uniformly bounded by
$$|W_0^{\varphi\,\prime}(\phi)| \le 2\pi \|\varphi\|_\infty V(\varphi). \tag{89}$$
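Now we pursue the proof of Theorem 1.4. By (84), evaluating the variance of $\mathcal{Z}^\varphi$ is equivalent to evaluating $I_n^\varphi$, and for notational convenience we choose to work with the expression (85). As in the proof of Theorem 1.1, we choose a constant $C > 0$, which remains fixed throughout the present section, and divide the interval $[0, \pi m/2] = [0, C] \cup [C, \pi m/2]$ (see Sect. 3.1). We then have the following lemma (cf. Lemma 3.3); to prove it, just use (86) and the bound (88) for $W_0^\varphi$ together with Lemma 3.2.

Lemma 4.3. For any constant $C > 0$, we have as $n \to \infty$,
$$I_n^\varphi = 2\tilde{I}_n^\varphi + O_{\|\varphi\|_\infty}\Big(\frac{1}{n}\Big), \tag{90}$$
where
$$\tilde{I}_n^\varphi := \int_C^{\pi m/2} \Big(K_n(\psi) - \frac{1}{4}\Big) W^\varphi\Big(\frac{\psi}{m}\Big)\, d\psi. \tag{91}$$

Proof of Theorem 1.4. First we evaluate $\tilde{I}_n^\varphi$ as defined in (91). Plugging (56) into (91), we have (cf. (59))
$$\tilde{I}_n^\varphi = \int_C^{\pi m/2} \Big[\frac{1}{2}\frac{\sin(2\psi)}{\pi n \sin(\psi/m)} + \frac{65}{256}\frac{1}{\pi^2 n\psi \sin(\psi/m)} + \frac{9}{32}\frac{\cos(2\psi)}{\pi n\psi \sin(\psi/m)} + \frac{\frac{27}{64}\sin(2\psi) - \frac{11}{256}\cos(4\psi)}{\pi^2 n\psi \sin(\psi/m)}\Big] W^\varphi\Big(\frac{\psi}{m}\Big)\, d\psi + O\Big(\int_C^{\pi m/2} \Big(\frac{1}{\psi^3} + \frac{1}{n\psi}\Big) W^\varphi\Big(\frac{\psi}{m}\Big)\, d\psi\Big)$$
$$= \frac{1}{16\pi^3 n} \int_C^{\pi m/2} \Big[\sin(2\psi) + \frac{65}{128}\frac{1}{\pi\psi} + \frac{9}{16}\frac{\cos(2\psi)}{\psi} + \frac{\frac{27}{32}\sin(2\psi) - \frac{11}{128}\cos(4\psi)}{\pi\psi}\Big] W_0^\varphi\Big(\frac{\psi}{m}\Big)\, d\psi + O\Big(\|\varphi\|_\infty^2\, \frac{1}{n}\Big), \tag{92}$$
with constants involved in the "O"-notation universal. Here we used the identity (86); to effectively control the error term we use (88).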
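We integrate by parts the first oscillatory term in (92), using the continuous differentiability assumptions; this yields the bound for its contribution
$$\frac{1}{n}\int_C^{\pi m/2} \sin(2\psi)\, W_0^\varphi(\psi/m)\, d\psi \ll \frac{1}{n}\Big|\cos(2\psi)\cdot W_0^\varphi(\psi/m)\Big|_{\psi=\pi m/2}^{C}\Big| + \frac{1}{n^2}\Big|\int_C^{\pi m/2} \cos(2\psi)\, W_0^{\varphi\,\prime}(\psi/m)\, d\psi\Big|$$
$$\ll \frac{\|\varphi\|_\infty^2}{n} + \frac{\|W_0^{\varphi\,\prime}\|_{L^1([0,\pi])}}{n} \ll \big(\|\varphi\|_\infty^2 + \|\varphi\|_\infty V(\varphi)\big)\cdot\frac{1}{n}$$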
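with constants involved in the "$\ll$"-notation universal, by (89). It is easy to establish similar bounds for the remaining oscillatory terms in (92), i.e. the 3rd and the 4th terms. To analyze the main contribution, which comes from the remaining second term in (92), we note that the continuous differentiability of $W_0^\varphi$ implies
$$W_0^\varphi(\phi) = 2\pi\|\varphi\|^2_{L^2(S^2)} + O_{\|\varphi\|_\infty, V(\varphi)}(\phi),$$
by (87) and (89). The main contribution to (92) is then
$$\frac{65}{2048\pi^4}\frac{1}{n}\int_C^{\pi m/2} \frac{W_0^\varphi(\psi/m)}{\psi}\, d\psi = \frac{65}{1024\pi^3}\|\varphi\|^2_{L^2(S^2)}\cdot\frac{1}{n}\int_C^{\pi m/2}\frac{d\psi}{\psi} + O_{\|\varphi\|_\infty, V(\varphi)}\Big(\frac{1}{n^2}\int_C^{\pi m/2} d\psi\Big)$$
$$= \frac{65}{1024\pi^3}\|\varphi\|^2_{L^2(S^2)}\cdot\frac{\log n}{n} + O_{\|\varphi\|_\infty, V(\varphi)}\Big(\frac{1}{n}\Big).$$
All in all we evaluated $\tilde{I}_n^\varphi$ as
$$\tilde{I}_n^\varphi = \frac{65}{1024\pi^3}\|\varphi\|^2_{L^2(S^2)}\cdot\frac{\log n}{n} + O_{\|\varphi\|_\infty, V(\varphi)}\Big(\frac{1}{n}\Big).$$
Plugging this into (90) yields
$$I_n^\varphi = \frac{65}{512\pi^3}\|\varphi\|^2_{L^2(S^2)}\cdot\frac{\log n}{n} + O_{\|\varphi\|_\infty, V(\varphi)}\Big(\frac{1}{n}\Big). \tag{93}$$
We finally obtain the statement of Theorem 1.4 by plugging (93) into (84).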
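The chain of constants from (56) through (92) and (93) to the leading coefficient $65/(128\pi)$ of the variance can be retraced with exact rational arithmetic. The sketch below tracks each constant as a pair (rational factor, exponent of $\pi$); this representation and the variable names are assumptions of the example, and the last step uses only that $E_n/(n+1/2)$ is asymptotic to $n$, which cancels the factor $1/n$ in (93).

```python
from fractions import Fraction

def times(x, y):
    """Multiply two constants stored as (rational factor, exponent of pi)."""
    return (x[0] * y[0], x[1] + y[1])

# Second (non-oscillatory) term of (56): (65/256) / (pi^2 n psi sin(psi/m)).
second_term = (Fraction(65, 256), -2)
# Identity (86): W = sin * W_0 / (8 pi^2).
w_factor = (Fraction(1, 8), -2)
# (87): W_0(0) = 2 pi ||phi||^2.
w0_at_zero = (Fraction(2), 1)

main_integrand = times(second_term, w_factor)         # constant in front of W_0 / (n psi)
i_tilde = times(main_integrand, w0_at_zero)           # coefficient of ||phi||^2 log(n) / n
i_full = times((Fraction(2), 0), i_tilde)             # (90): I = 2 * I_tilde + ...
variance = times((Fraction(4), 2), i_full)            # (84): prefactor 4 pi^2

print(main_integrand, i_tilde, i_full, variance)
```

The final pair corresponds to $65/(128\pi)$, in agreement with the constant $c(\varphi)/\|\varphi\|^2_{L^2(S^2)}$ used in the proof of Theorem 1.5.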
5. Proof of Theorem 1.5

As implied by the formulation of Theorem 1.5, in this section we will deal with functions of bounded variation. The definition and some basic properties of the class $BV(S^2)$ of functions of bounded variation are given in Appendix C.
5.1. On the proof of Theorem 1.5. To prove Theorem 1.5 one wishes to apply a standard approximation argument, approximating our test function $\varphi$ of bounded variation with a sequence $\varphi_i$ of $C^\infty$ functions, for which we can apply Theorem 1.4. There are two major issues with this approach, however. On one hand, one needs to check that $\varphi_i$ approximating $\varphi$ implies the corresponding statement for the random variables $\mathcal{Z}^\varphi(f_n)$ and $\mathcal{Z}^{\varphi_i}(f_n)$, and, in particular, their variance. While it is easy to check that if $\varphi_i \to \varphi$ in $L^1$ then for every fixed $n$ we also have $E[\mathcal{Z}^{\varphi_i}(f_n)] \to E[\mathcal{Z}^\varphi(f_n)]$, the analogous statement for the variance is much less trivial (see Proposition 5.1⁷). On the other hand, when applying Theorem 1.4 for $\varphi_i$, one needs to control the error term in (14), which may a priori depend on $\varphi_i$. To resolve the latter we take advantage⁸ of the fact that Theorem 1.4 allows us to control the dependency of the error term in (14) on the test function in terms of its $L^\infty$ norm and total variation. Thus to resolve this issue it would be sufficient to require $\varphi_i$ to be essentially uniformly bounded and to have uniformly bounded total variation. Fortunately, the standard symmetric mollifier construction from [13], as given in Appendix C, satisfies both requirements above. Namely, given a function $\varphi \in BV(S^2)$ we obtain a sequence $\varphi_i$ of $C^\infty$ functions that converge in $L^1$ to $\varphi$, with $\|\varphi_i\|_\infty \le \|\varphi\|_\infty$ and, in addition, $V(\varphi_i) \to V(\varphi)$.

5.2. Continuity of the distribution of $\mathcal{Z}^\varphi$. As pointed out in Sect. 5.1, to prove Theorem 1.5 we will need to show that the distribution of $\mathcal{Z}^\varphi$ depends continuously on $\varphi$. Proposition 5.1 makes this statement precise. We believe that it is of independent interest.

Proposition 5.1. Let $\varphi \in BV(S^2) \cap L^\infty(S^2)$ be any test function. Then
$$E\big[\mathcal{Z}^\varphi(f_n)^2\big] = O\Big(n^2\|\varphi\|^2_{L^1(S^2)} + \|\varphi\|_\infty\|\varphi\|_{L^1(S^2)}\Big), \tag{94}$$
where the constants involved in the "O"-notation are universal. In particular, if $F \subseteq S^2$ has a $C^2$ boundary then
$$E\big[\mathcal{Z}^F(f_n)^2\big] = O\big(n^2|F|^2 + |F|\big).$$

Proof. Recall that we defined $W^\varphi$ as (82); the assumption $\varphi \in L^\infty(S^2)$ saves us from dealing with the validity of this definition. Starting from (121), and repeating the steps in the proof of Lemma 2.1 from either [23] or [4,5], we may extend the validity of the Kac-Rice formula (84) with (85) for this class as well. Note that the constant term in (85) comes from the squared expectation, so that we need to omit it if we want to compute the second moment. We then have
$$E\big[\mathcal{Z}^\varphi(f_n)^2\big] = 8\pi^2\, \frac{E_n}{n + 1/2}\, J_n^\varphi, \tag{95}$$
where
$$J_n^\varphi = \int_0^{\pi m/2} K_n(\psi)\, W^\varphi\Big(\frac{\psi}{m}\Big)\, d\psi,$$

⁷ Proposition 5.1 gives a stronger claim. First, it evaluates the second moment rather than the variance. Secondly, it gives a general bound for $E\big[(\mathcal{Z}^{\varphi_i}(f_n) - \mathcal{Z}^\varphi(f_n))^2\big] = E\big[\mathcal{Z}^{\varphi_i - \varphi}(f_n)^2\big]$. It is easy to derive the result we need employing the triangle inequality.
⁸ This is by no means a lucky coincidence; it is precisely the proof of Theorem 1.5 that motivated the technical statement made in Theorem 1.4.
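denoting as usual $m := n + 1/2$. As usual when estimating integrals of this kind, we remove the origin by choosing a constant $C > 0$ and writing
$$J_n^\varphi = J_{n,1}^\varphi + J_{n,2}^\varphi, \tag{96}$$
where
$$J_{n,1}^\varphi = \int_0^C K_n(\psi)\, W^\varphi\Big(\frac{\psi}{m}\Big)\, d\psi,$$
and
$$J_{n,2}^\varphi = \int_C^{\pi m/2} K_n(\psi)\, W^\varphi\Big(\frac{\psi}{m}\Big)\, d\psi.$$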
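First, for $C < \psi < \frac{\pi m}{2}$, $K_n(\psi)$ is bounded by a constant, which may depend only on $C$, i.e.
$$|K_n(\psi)| = O_C(1),$$
which follows directly from Proposition 3.5. Therefore we may bound $J_{n,2}^\varphi$ as
$$|J_{n,2}^\varphi| \ll \int_C^{\pi m/2} \Big|W^\varphi\Big(\frac{\psi}{m}\Big)\Big|\, d\psi \le \int_0^{\pi m/2} \Big|W^\varphi\Big(\frac{\psi}{m}\Big)\Big|\, d\psi = m\int_0^{\pi/2} |W^\varphi(\phi)|\, d\phi \ll n\|\varphi\|^2_{L^1(S^2)}, \tag{97}$$
as earlier. We claim that for $0 < \psi < C$ we may bound $K_n$ as
$$|K_n(\psi)| = O_C\Big(\frac{1}{\psi}\Big). \tag{98}$$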
Before proving this estimate we will show how it helps us to bound $J_{n,1}^\varphi$. We have by the definition of $J_{n,1}^\varphi$,
$$|J_{n,1}^\varphi| \ll \int_0^C \frac{1}{\psi}\Big|W^\varphi\Big(\frac{\psi}{m}\Big)\Big|\, d\psi \ll \frac{1}{n}\int_0^C \Big|W_0^\varphi\Big(\frac{\psi}{m}\Big)\Big|\, d\psi \ll \int_0^{C/n} |W_0^\varphi(\phi)|\, d\phi \ll_C \frac{1}{n}\|\varphi\|_\infty\|\varphi\|_{L^1(S^2)}, \tag{99}$$
by (86) and the first inequality of (88). The statement of the present proposition now follows from plugging the estimates (97) and (99) into (96) and (95). We still have to prove (98) though. To see (98) we use Remark 2.10 and the Cauchy-Schwarz inequality to write
$$K_n(\psi) = \frac{1}{2\pi\sqrt{1 - P_n(\cos(\psi/m))^2}}\, E\big[\|U\|\cdot\|V\|\big], \tag{100}$$
where $U$ and $V$ are 2-dimensional mean zero Gaussian vectors with covariance matrix (45), whose entries are uniformly bounded by an absolute constant, whence
$$E\big[\|U\|\cdot\|V\|\big] \le \sqrt{E\big[\|U\|^2\big]\, E\big[\|V\|^2\big]} = O(1), \tag{101}$$
with the constant involved in the "O"-notation uniform. For the other term Lemma B.2 yields
$$\sqrt{1 - P_n(\cos(\psi/m))^2} \gg \psi, \tag{102}$$
so that we obtain the necessary bound (98) for $K_n(\psi)$ by plugging the estimates (101) and (102) into (100).

5.3. Proof of Theorem 1.5. Now we are ready to give a proof of Theorem 1.5.

Proof of Theorem 1.5. Given a function $\varphi \in BV(S^2)$, let $\varphi_i \in C^\infty$ be a sequence of smooth functions such that
$$\varphi_i \to \varphi \text{ in } L^1(S^2), \quad V_i := V(\varphi_i) \to V(\varphi) \quad \text{and} \quad \|\varphi_i\|_\infty \le \|\varphi\|_\infty \tag{103}$$
(see Appendix C). Let $M_1 := \|\varphi\|_\infty$ and $M_2 := \max\{V_i\}_{i\ge 1} < \infty$, since $V_i$ is convergent. Theorem 1.4 applied to $\varphi_i \in C^\infty(S^2)$ states that
$$\mathrm{Var}(\mathcal{Z}^{\varphi_i}(f_n)) = c(\varphi_i)\cdot\log n + O_{M_1,M_2}(1), \tag{104}$$
where $c(\varphi_i)$ is given by
$$c(\varphi_i) := 65\, \frac{\|\varphi_i\|^2_{L^2(S^2)}}{128\pi} > 0.$$
Note that since $\varphi_i$ and $\varphi$ are uniformly bounded (103), $L^1(S^2)$ convergence implies $L^2(S^2)$ convergence, so that
$$c(\varphi_i) \to c(\varphi), \tag{105}$$
the latter being given by (13). On the other hand we know from Proposition 5.1 that
$$E\big[(\mathcal{Z}^{\varphi_i}(f_n) - \mathcal{Z}^\varphi(f_n))^2\big] = E\big[\mathcal{Z}^{\varphi_i - \varphi}(f_n)^2\big] \to 0,$$
using the uniform boundedness (103) again to ensure that (94) holds uniformly. This together with the triangle inequality implies that
$$\mathrm{Var}\big(\mathcal{Z}^{\varphi_i}(f_n)\big) \to \mathrm{Var}\big(\mathcal{Z}^\varphi(f_n)\big), \tag{106}$$
and we take the limit $i \to \infty$ in (104) to finally obtain the main statement of Theorem 1.5.

Remark 5.2. From the proof presented, it is easy to see that the constant in the "O"-notation in the statement (14) of Theorem 1.5 could be made dependent only on $\|\varphi\|_\infty$ and $V(\varphi)$.

Appendix A. Computation of the Covariance Matrix

In this section we compute the matrix $\Omega_n(\phi)$ explicitly, as prescribed by (37). The matrix $\Omega_n(\phi)$ is the $4\times 4$ covariance matrix of the mean zero Gaussian random vector $Z_2$ in (23), with $x \ne y \in S^2$ any two points on the arc $\{\theta = 0\}$ with $d(x,y) = \phi$, conditioned upon $f(x) = f(y) = 0$. Recall that as such, $\Omega_n(\phi)$ is given by (34), where $A = A_n(x,y)$, $B = B_n(x,y)$ and $C = C_n(x,y)$ are given by (30), (31) and (32) respectively, and $x, y \in S^2$ are any points on the arc $\{\theta = 0\}$ with $d(x,y) = \phi$. Here the gradients are given in the orthonormal frame (36) of the tangent planes $T_x(S^2)$ associated to the spherical coordinates (see Sect. 2.4 for explanation). Let $x$ and $y$ correspond to the spherical coordinates $(\phi_x, \theta_x = 0)$ and $(\phi_y, \theta_y = 0)$, and denote $\phi = d(x,y) = |\phi_x - \phi_y|$. Recall that
$$u_n(x,y) = P_n(\cos(d(x,y))) = P_n(\cos\phi).$$
First we compute the inverse of $A$ in (30) as
$$A_n(\phi)^{-1} = \frac{1}{1 - P_n(\cos\phi)^2}\begin{pmatrix} 1 & -P_n(\cos\phi) \\ -P_n(\cos\phi) & 1 \end{pmatrix}. \tag{107}$$
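It is easy to either see from the geometric picture or compute explicitly that
$$\nabla_x u_n(x,y) = -\nabla_y u_n(x,y) = \pm P_n'(\cos\phi)\sin(\phi)(1, 0), \tag{108}$$
depending on whether $\phi_x > \phi_y$ or $\phi_x < \phi_y$, so that
$$B_n(\phi) = \pm\begin{pmatrix} 0 & 0 & P_n'(\cos\phi)\sin\phi & 0 \\ -P_n'(\cos\phi)\sin\phi & 0 & 0 & 0 \end{pmatrix}. \tag{109}$$
Next we turn to the missing part of $C_n(\phi)$ defined in (32), i.e. the "pseudo-Hessian" $H_n(\phi)$ given by (33). By the chain rule
$$H_n(\phi) = \big(\nabla_x \otimes \nabla_y\big) u_n(x,y) = \nabla_x \otimes \big(P_n'(\cos(d(x,y)))\, \nabla_y \cos(d(x,y))\big)$$
$$= P_n''(\cos\phi)\, \nabla_x \cos(d(x,y)) \otimes \nabla_y \cos(d(x,y)) + P_n'(\cos\phi)\, \big(\nabla_x \otimes \nabla_y\big)\cos(d(x,y)). \tag{110}$$
We denote
$$h(x,y) := \cos d(x,y) = \cos\phi_x\cos\phi_y + \sin\phi_x\sin\phi_y\cos(\theta_x - \theta_y),$$
and compute explicitly that for $\theta_x = \theta_y = 0$ we have
$$\big(\nabla_x \otimes \nabla_y\big)\cos(d(x,y)) = \big(\nabla_x \otimes \nabla_y\big) h(x,y) = \begin{pmatrix} \cos\phi & 0 \\ 0 & 1 \end{pmatrix}. \tag{111}$$
Plugging (108) and (111) into (110) we obtain
$$H = \begin{pmatrix} P_n'(\cos\phi)\cos\phi - P_n''(\cos\phi)\sin(\phi)^2 & 0 \\ 0 & P_n'(\cos\phi) \end{pmatrix}. \tag{112}$$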
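The entries of (111) can be checked by finite differences. The sketch below differentiates $h$ numerically in the orthonormal frame $(\partial_\phi, \frac{1}{\sin\phi}\partial_\theta)$ at a pair of points on the arc $\{\theta = 0\}$; the helper `mixed_partial`, the step size and the sample angles are assumptions introduced only for this illustration.

```python
import math

def h(phi_x, theta_x, phi_y, theta_y):
    """cos d(x, y) in spherical coordinates."""
    return (math.cos(phi_x) * math.cos(phi_y)
            + math.sin(phi_x) * math.sin(phi_y) * math.cos(theta_x - theta_y))

def mixed_partial(func, i, j, point, step=1e-4):
    """Central-difference approximation of d^2 func / (d p_i d p_j) for i != j."""
    def shifted(di, dj):
        p = list(point)
        p[i] += di
        p[j] += dj
        return func(*p)
    return (shifted(step, step) - shifted(step, -step)
            - shifted(-step, step) + shifted(-step, -step)) / (4 * step * step)

phi_x, phi_y = 1.1, 0.4          # sample points on the arc {theta = 0}
point = (phi_x, 0.0, phi_y, 0.0)
phi = abs(phi_x - phi_y)

# Entries of the mixed derivative matrix in the orthonormal frame.
entry_11 = mixed_partial(h, 0, 2, point)
entry_22 = mixed_partial(h, 1, 3, point) / (math.sin(phi_x) * math.sin(phi_y))
entry_12 = mixed_partial(h, 0, 3, point) / math.sin(phi_y)
entry_21 = mixed_partial(h, 1, 2, point) / math.sin(phi_x)

print(entry_11, math.cos(phi), entry_22, entry_12, entry_21)
```

Up to discretization error, the four entries should reproduce the diagonal matrix with entries $\cos\phi$ and $1$ in (111).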
Finally, plugging (112) into (32), and plugging that together with (107) and (109) into (34), we obtain an explicit expression for $\Omega_n(\phi)$ as prescribed by (37), with entries given by (38), (39) and (40).

Appendix B. Estimates for the Legendre Polynomials and Related Functions

The goal of this section is to give a brief introduction to the Legendre polynomials $P_n : [-1, 1] \to \mathbb{R}$ and give some relevant basic information necessary for the purposes of the present paper. The high degree asymptotic analysis of the behaviour of $P_n$ and its first two derivatives involves Hilb's asymptotics in Lemma B.1 together with the recursion (114) for the first derivative and the differential equation (113) for the second one. We refer the reader to [17] for more information.

The Legendre polynomials $P_n$ are defined as the unique polynomials of degree $n$ orthogonal w.r.t. the constant weight function $\omega(t) \equiv 1$ on $[-1, 1]$ with the normalization $P_n(1) = 1$. They satisfy the following second order differential equation:
$$P_n''(\cos(\psi/m)) = -\frac{n(n+1)}{\sin(\psi/m)^2}\, P_n(\cos(\psi/m)) + \frac{2\cos(\psi/m)}{\sin(\psi/m)^2}\, P_n'(\cos(\psi/m)), \tag{113}$$
as well as the recursion
$$P_n'(\cos(\psi/m)) = \big(P_{n-1}(\cos(\psi/m)) - \cos(\psi/m)\, P_n(\cos(\psi/m))\big)\, \frac{n}{\sin(\psi/m)^2}. \tag{114}$$
The Hilb asymptotics gives the high degree asymptotic behaviour of $P_n$.
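Lemma B.1. (Hilb Asymptotics (formula (8.21.17) on p. 197 of Szego [17]))
$$P_n(\cos\phi) = \Big(\frac{\phi}{\sin\phi}\Big)^{1/2} J_0((n + 1/2)\phi) + \delta(\phi), \tag{115}$$
uniformly for $0 \le \phi \le \pi/2$, where $J_0$ is the Bessel $J$ function of order 0 and the error term is
$$\delta(\phi) \ll \begin{cases} \phi^{1/2}\, O(n^{-3/2}), & Cn^{-1} < \phi < \pi/2, \\ \phi^{\alpha+2}\, O(n^{\alpha}), & 0 < \phi < Cn^{-1}, \end{cases}$$
where $C > 0$ is any constant and the constants involved in the "O"-notation depend on $C$ only.

We have the following rough estimate for the behaviour of the Legendre polynomials at $\pm 1$, which follows directly from Hilb's asymptotics.

Lemma B.2. For $0 < \phi < \frac{\pi}{2}$ one has
$$1 - P_n(\cos(\phi))^2 \gg n^2\phi^2,$$
where the constant in the "$\gg$"-notation is universal.

Lemma B.3. The Legendre polynomials $P_n$ and their first two derivatives satisfy, uniformly for $n \ge 1$, $\psi > C$:
(1)
$$P_n(\cos(\psi/m)) = \sqrt{\frac{2}{\pi n \sin(\psi/m)}}\Big[\sin\Big(\psi + \frac{\pi}{4}\Big) - \frac{1}{8}\frac{\cos(\psi + \frac{\pi}{4})}{\psi}\Big] + O\Big(\frac{1}{\psi^{5/2}} + \frac{1}{\sqrt{\psi}\, n}\Big), \tag{116}$$
(2)
$$P_n'(\cos(\psi/m)) = \sqrt{\frac{2}{\pi}}\, \frac{\sqrt{n}}{\sin(\psi/m)^{5/2}}\Big[\sin(\psi/m)\sin\Big(\psi - \frac{\pi}{4}\Big) + \frac{3}{8n}\sin\Big(\psi + \frac{\pi}{4}\Big)\Big] + O\Big(\frac{n^2}{\psi^{7/2}} + \frac{n}{\psi^{3/2}}\Big), \tag{117}$$
(3)
$$P_n''(\cos(\psi/m)) = -\frac{n^2}{\sin(\psi/m)^2}\, P_n(\cos(\psi/m)) + \frac{2}{\sin(\psi/m)^2}\, P_n'(\cos(\psi/m)) + O\Big(\frac{n^3}{\psi^{5/2}}\Big). \tag{118}$$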
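Proof. By Lemma B.1 and the standard asymptotics for the Bessel functions we obtain
$$P_n(\cos(\psi/m)) = \frac{\sqrt{\psi/m}}{\sqrt{\sin(\psi/m)}}\, J_0(\psi) + O\Big(\frac{\sqrt{\psi}}{n^2}\Big)$$
$$= \sqrt{\frac{2}{\pi}}\, \frac{\sqrt{\psi/m}}{\sqrt{\sin(\psi/m)}}\Big[\frac{\sin(\psi + \frac{\pi}{4})}{\sqrt{\psi}} - \frac{1}{8}\frac{\cos(\psi + \frac{\pi}{4})}{\psi^{3/2}} + O\Big(\frac{1}{\psi^{5/2}}\Big)\Big] + O\Big(\frac{\sqrt{\psi}}{n^2}\Big)$$
$$= \sqrt{\frac{2}{\pi n \sin(\psi/m)}}\Big[\sin\Big(\psi + \frac{\pi}{4}\Big) - \frac{1}{8}\frac{\cos(\psi + \frac{\pi}{4})}{\psi}\Big] + O\Big(\frac{1}{\psi^{5/2}} + \frac{1}{\sqrt{\psi}\, n}\Big),$$
which is (116). To obtain (117) we employ the recursive formula (114), evaluating the Legendre polynomials appearing there using (116). Finally we obtain a simple approximate differential equation (118), replacing $n(n+1)$ by $n^2$ and $\cos(\psi/m)$ by 1 in the differential equation (113) satisfied by the Legendre polynomials. To do so we use the decay
$$|P_n(\cos(\psi/m))| = O\Big(\frac{1}{\sqrt{\psi}}\Big)$$
of $P_n$, which follows directly from (116), as well as the decay (79) of its derivative.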
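The two-term asymptotics (116) can be compared with exact values of $P_n$ computed from the classical three-term (Bonnet) recurrence $(k+1)P_{k+1}(x) = (2k+1)xP_k(x) - kP_{k-1}(x)$. The following sketch does so for one degree and a few values of $\psi$; the degree $n = 400$, the sample points and the function names are choices made here for illustration, and no claim is made about the size of the implied constants.

```python
import math

def legendre(n, x):
    """P_n(x) via Bonnet's three-term recurrence."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def legendre_asymptotic(n, psi):
    """Main terms of (116) for P_n(cos(psi/m)), m = n + 1/2 (error term omitted)."""
    m = n + 0.5
    amplitude = math.sqrt(2.0 / (math.pi * n * math.sin(psi / m)))
    return amplitude * (math.sin(psi + math.pi / 4)
                        - math.cos(psi + math.pi / 4) / (8 * psi))

n = 400
errors = {}
for psi in (20.0, 50.0, 100.0, 200.0):
    exact = legendre(n, math.cos(psi / (n + 0.5)))
    errors[psi] = abs(exact - legendre_asymptotic(n, psi))
    print(psi, exact, legendre_asymptotic(n, psi), errors[psi])
```

The observed discrepancies should be small compared with the amplitude $\sqrt{2/(\pi n \sin(\psi/m))}$ of the main term, in line with the error term of (116).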
Appendix C. Functions of Bounded Variation

In this section we give the definition and some basic properties of functions of bounded variation. For more information we refer the reader to [13].

Classically, the variation of a function $\eta : [a, b] \to \mathbb{R}$ on $[a, x]$ is defined as
$$V(\eta; x) := \sup_{\lambda:\ a = t_1 < \dots < t_k = x}\ \sum_{i=1}^{k-1} |\eta(t_{i+1}) - \eta(t_i)|,$$
where the supremum is over all the partitions $\lambda$ of $[a, x]$. We denote $I := [a, b]$. If $\eta \in C^1(I)$ then the variation is
$$V(\eta; x) = \int_a^x |\eta'(t)|\, dt.$$
In fact, the last equality holds even for $\eta \in W^{1,1}(I)$, where for this class of functions the derivative $\eta'$ is the weak derivative. This definition has two major disadvantages. First, one wishes to identify functions
$$\eta_1 \sim \eta_2, \quad \text{if } \eta_1(x) = \eta_2(x) \text{ for almost all } x \in I. \tag{119}$$
However, altering the values of $\eta$ on a measure zero set does impact its variation. Secondly, one cannot extend this definition to the multivariate case.
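The first disadvantage is easy to observe numerically. The sketch below evaluates the partition sum from the classical definition for $\eta = \sin$ on $[0, 2\pi]$, and then for a function that differs from it at a single point; the uniform partition, the altered value $5$ and the function names are assumptions of this example. The partition sum is only a lower bound for the supremum, which suffices to exhibit the jump in variation.

```python
import math

def partition_sum(eta, points):
    """Sum of |eta(t_{i+1}) - eta(t_i)| over consecutive partition points."""
    return sum(abs(eta(points[i + 1]) - eta(points[i]))
               for i in range(len(points) - 1))

# Uniform partition of [0, 2*pi] containing the extrema of sin.
grid = [2 * math.pi * i / 1000 for i in range(1001)]

variation_smooth = partition_sum(math.sin, grid)

def altered(t):
    """Equals sin except at the single partition point grid[500]."""
    return 5.0 if t == grid[500] else math.sin(t)

variation_altered = partition_sum(altered, grid)

print(variation_smooth, variation_altered)
```

The two functions agree almost everywhere, so they are identified under (119), yet their classical variations differ by roughly twice the size of the altered value.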
We then need to find a better definition. Fortunately, the following definition eliminates the disadvantages of the previous one. Let
$$V(\eta; x) := \sup_g \int_a^x \eta(t)\, g'(t)\, dt,$$
where the supremum is over all the continuously differentiable functions $g : [a, x] \to \mathbb{R}$ with $|g(t)| \le 1$ for all $t \in [a, x]$. The number $V(\eta) := V(\eta; I)$ is called the total variation of $\eta$ on $I$. We define the space $BV(I)$ to be the equivalence classes of functions $\eta$ with finite total variation, i.e.
$$BV(I) := \{\eta \in L^1(I) : V(\eta) < \infty\}/\sim,$$
where the equivalence relation is given by (119). It is known [13] that $W^{1,1}(I) \subsetneq BV(I)$.

We may extend the latter definition quite naturally to the multivariate case. Of interest to us is the case of the sphere. Let $\varphi \in L^1(S^2)$ be an integrable function. We define its variation on an open subset $\Omega \subseteq S^2$ as
$$V(\varphi; \Omega) := \sup_g \int_\Omega \varphi(x)\, \operatorname{div} g(x)\, dx,$$
where the supremum is over the continuously differentiable compactly supported vector fields $g \in C_c^1(\Omega, T\Omega)$ with $|g(x)| \le 1$ for all $x \in \Omega$. We define the total variation as $V(\varphi) := V(\varphi; S^2)$. The space $BV(\Omega)$ is defined as the set of equivalence classes of functions $\varphi$ with $V(\varphi) < \infty$, with the equivalence relation (119) adapted to the sphere. Again, for a smooth (and $W^{1,1}(S^2)$) function $\varphi \in C^1(S^2)$ we have
$$V(\varphi) = \int_{S^2} \|\nabla\varphi(x)\|\, dx,$$
and $W^{1,1}(\Omega) \subsetneq BV(\Omega)$. For a function $\varphi \in BV(S^2)$, [13], Theorem 1.17 gives a construction⁹ of a sequence $\varphi_i \in C^\infty$ of smooth test functions such that $\varphi_i \to \varphi$ in $L^1(S^2)$ as well as $V(\varphi_i) \to V(\varphi)$.

⁹ This book gives only the theory of functions of bounded variation on $\mathbb{R}^m$. One can obtain a similar theory for the sphere by only slightly modifying the one given.
Moreover, part (b) of that theorem implies that $\|\varphi_i\|_\infty \le \|\varphi\|_\infty$.

We are interested in the linear statistics of the nodal sets of smooth functions, where the test functions are of bounded variation. The definition of the linear statistics is natural for continuous test functions $\varphi : S^2 \to \mathbb{R}$ as
$$\mathcal{Z}^\varphi(f) = \int_{f^{-1}(0)} \varphi(x)\, dx,$$
i.e. integrating the restriction of $\varphi$ to the nodal line. However, things become more complicated as one drops the continuity assumption; since the values of $\varphi \in BV(S^2)$ (or $\varphi \in L^1(S^2)$) are only defined up to measure zero sets, there is no meaning to restricting $\varphi$ to curves. In general, one cannot define linear statistics corresponding to integrable functions, and to define a notion of trace of $\varphi$ on a smooth curve $C$, we will have to exploit the values of $\varphi$ in a tubular neighbourhood around $C$. Such a construction is known for the functions belonging to the class $W^{1,1}(S^2)$, i.e. for every smooth curve $C \subseteq \Omega$ there exists a map
$$\mathrm{tr}_C : W^{1,1}(\Omega) \to L^1(C)$$
satisfying the natural properties.

The situation is more involved in the $BV$ case, which is essential to us, since $W^{1,1}$ does not contain the characteristic functions of nice spherical subsets. A smooth curve divides the sphere and a tubular neighbourhood around it into two parts. One may then define ([13], chap. 2) two traces $\varphi^+ = \mathrm{tr}_C^+ \varphi$ and $\varphi^- = \mathrm{tr}_C^- \varphi$, both belonging to $L^1(C)$, corresponding to the values of $\varphi$ on the different parts. The traces $\varphi^+$ and $\varphi^-$ may in general be different¹⁰, and moreover, one cannot canonically distinguish between the traces. For instance, if $F \subseteq S^2$ is a nice subset, and $\chi_F$ is its characteristic function, then $\mathrm{tr}_{\partial F}(\chi_F)$ might be defined as either 1 or 0, depending on whether we approach the circle from inside or outside the disc respectively. Accordingly, the corresponding linear statistic might be $\mathrm{len}(\partial F)$ or 0. We define the average trace of $\varphi$ on a smooth curve $C \subseteq S^2$ as
$$\varphi^\pm := \frac{1}{2}\varphi^+ + \frac{1}{2}\varphi^-, \tag{120}$$
and this is the notion that appears in the formulation of Theorem 1.5 and throughout the present paper. For $\varphi \in L^\infty(S^2)$ we have $\|\varphi^\pm\|_\infty \le \|\varphi\|_\infty$.

¹⁰ Intuitively, the traces $\varphi^+$ and $\varphi^-$ will be different precisely if the jump of $\varphi$ occurs on a subset of $C$, as follows from [13], Proposition 2.8. It is plausible that with probability 1 this situation will not happen for the nodal lines of spherical harmonics; we believe that this is a minor issue and of little interest to the present paper. This situation is almost surely impossible for the characteristic functions of nice sets, which are the main motivation for considering the class $BV$.
Following the approach of [13], (2.10) and Federer's co-area formula [11], one may obtain the inequality
$$\frac{1}{\epsilon}\int_0^\epsilon \Big(\int_{f^{-1}(t)} \varphi(x)\, dx - \int_{f^{-1}(0)} \varphi^+(x)\, dx\Big)\, dt = O_f\Big(V\big(\varphi; f^{-1}((0, \epsilon))\big) + \sup_{0 < t < \epsilon}\big|\mathrm{len}(f^{-1}(t)) - \mathrm{len}(f^{-1}(0))\big|\Big).$$
As $\epsilon \to 0$, the right-hand side of the last inequality vanishes. Therefore we have the following Kac-Rice type formula:
$$\int_{f^{-1}(0)} \varphi^+(x)\, dx = \lim_{\epsilon\to 0} \frac{1}{\epsilon}\int_{0 < f(x) < \epsilon} \|\nabla f(x)\|\, \varphi(x)\, dx,$$
and similarly
$$\int_{f^{-1}(0)} \varphi^-(x)\, dx = \lim_{\epsilon\to 0} \frac{1}{\epsilon}\int_{-\epsilon < f(x) < 0} \|\nabla f(x)\|\, \varphi(x)\, dx.$$
Combining the last two formulas we obtain
$$\int_{f^{-1}(0)} \varphi^\pm(x)\, dx = \lim_{\epsilon\to 0} \frac{1}{2\epsilon}\int_{|f(x)| < \epsilon} \|\nabla f(x)\|\, \varphi(x)\, dx. \tag{121}$$
We employ (121) to extend the validity of the Kac-Rice formula for the second moment for $\varphi \in BV(S^2)$ (see (95)).

Acknowledgements. The author wishes to express his deepest gratitude to Zeev Rudnick, Mikhail Sodin, Stéphane Nonnenmacher, Manjunath Krishnapur and Steve Zelditch for many fruitful and stimulating conversations that inspired this research and pushed it forward. I wish to especially thank Dan Mangoubi for all his help and patience. While conducting the research the author benefited from the expertise and experience of Pengfei Guan, Dmitry Jakobson, Iosif Polterovich, John Toth and Domenico Marinucci that proved invaluable. The author is grateful to Sherwin Maslowe for proofreading this paper. Finally, the author would like to thank the anonymous referee for many useful comments.
References
1. Andrews, G.E., Askey, R., Roy, R.: Special functions. Encyclopedia of Mathematics and its Applications 71. Cambridge: Cambridge University Press, 1999
2. Bérard, P.: Volume des ensembles nodaux des fonctions propres du laplacien. Bony-Sjostrand-Meyer seminar, 1984–1985, Exp. No. 14, 10 pp., École Polytech., Palaiseau, 1985
3. Berry, M.V.: Statistics of nodal lines and points in chaotic quantum billiards: perimeter corrections, fluctuations, curvature. J. Phys. A 35, 3025–3038 (2002)
4. Bleher, P., Shiffman, B., Zelditch, S.: Universality and scaling of correlations between zeros on complex manifolds. Invent. Math. 142(2), 351–395 (2000)
5. Bleher, P., Shiffman, B., Zelditch, S.: Universality and scaling of zeros on symplectic manifolds. In: Random Matrix Models and their Applications, Math. Sci. Res. Inst. Publ. 40, Cambridge: Cambridge Univ. Press, 2001, pp. 31–69
6. Brüning, J.: Über Knoten Eigenfunktionen des Laplace-Beltrami Operators. Math. Z. 158, 15–21 (1978)
7. Brüning, J., Gromes, D.: Über die Länge der Knotenlinien schwingender Membranen. Math. Z. 124, 79–82 (1972)
8. Cheng, S.Y.: Eigenfunctions and nodal sets. Comm. Math. Helv. 51, 43–55 (1976)
9. Cramér, H., Leadbetter, M.R.: Stationary and Related Stochastic Processes. Sample Function Properties and Their Applications. Reprint of the 1967 original. Mineola, NY: Dover Publications, Inc., 2004
10. Donnelly, H., Fefferman, C.: Nodal sets of eigenfunctions on Riemannian manifolds. Invent. Math. 93, 161–183 (1988)
11. Federer, H.: Curvature measures. Trans. Amer. Math. Soc. 93, 418–491 (1959)
12. Forrester, P.J., Honner, G.: Exact statistical properties of the zeros of complex random polynomials. J. Phys. A 32(16), 2961–2981 (1999)
13. Giusti, E.: Minimal Surfaces and Functions of Bounded Variation. Monographs in Mathematics, 80. Basel: Birkhäuser Verlag, 1984
14. Krishnapur, M., Wigman, I.: Fluctuations of the Nodal Length of Random Eigenfunctions of the Laplacian on the Torus. In preparation
15. Neuheisel, J.: The Asymptotic Distribution of Nodal Sets on Spheres. Johns Hopkins Ph.D. thesis, 2000
16. Rudnick, Z., Wigman, I.: On the volume of nodal sets for eigenfunctions of the Laplacian on the torus. Ann. Henri Poincaré 9(1), 109–130 (2008)
17. Szego, G.: Orthogonal Polynomials. Fourth edition. American Mathematical Society, Colloquium Publications, Vol. XXIII. Providence, RI: Amer. Math. Soc., 1975
18. Shiffman, B., Zelditch, S.: Number variance of random zeros on complex manifolds. Geom. Funct. Anal. 18(4), 1422–1475 (2008)
19. Shiffman, B., Zelditch, S.: Number variance of random zeros on complex manifolds, II: smooth statistics. Available online http://arxiv.org/abs/0711.1840v1[math.CV], 2007
20. Sodin, M., Tsirelson, B.: Random complex zeroes. I. Asymptotic normality. Israel J. Math. 144, 125–149 (2004)
21. Toth, J.A., Wigman, I.: Counting open nodal lines of random waves on planar domains. IMRN 2009, 3337–3365 (2009)
22. Toth, J.A., Wigman, I.: Universality of length distribution of nodal lines of random waves on generic surfaces. In progress (2009)
23. Wigman, I.: On the distribution of the nodal sets of random spherical harmonics. J. Math. Phys 50, 013521 (2009)
24. Wigman, I.: Volume fluctuations of the nodal sets of random Gaussian subordinated spherical harmonics. In preparation
25. Yau, S.T.: Survey on partial differential equations in differential geometry. In: Seminar on Differential Geometry, Ann. of Math. Stud. 102, Princeton, NJ: Princeton Univ. Press, 1982, pp. 3–71
26. Yau, S.T.: Open problems in geometry. In: Differential Geometry: Partial Differential Equations on Manifolds (Los Angeles, CA, 1990), Proc. Sympos. Pure Math. 54, Part 1, Providence, RI: Amer. Math. Soc., 1993, pp. 1–28
27. Zelditch, S.: Real and Complex zeros of Riemannian Random Waves. To appear in the Proceedings of the Conference "Spectral Analysis in Geometry and Number Theory on the occasion of Toshikazu Sunada's 60th birthday", Contemp. Math. Series, available online http://arxiv.org/abs/0803.4334v1[math.Sp], 2008

Communicated by S. Zelditch
Commun. Math. Phys. 298, 833–853 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1049-0
Communications in
Mathematical Physics
Classification of Simple Linearly Compact n-Lie Superalgebras
Nicoletta Cantarini¹, Victor G. Kac²
¹ Dipartimento di Matematica Pura ed Applicata, Università di Padova, Padova, Italy.
E-mail: [email protected]
2 Department of Mathematics, MIT, Cambridge, Massachusetts 02139, USA
Received: 17 September 2009 / Accepted: 15 January 2010 Published online: 21 April 2010 – © Springer-Verlag 2010
Abstract: We classify simple linearly compact n-Lie superalgebras with $n > 2$ over a field $\mathbb{F}$ of characteristic 0. The classification is based on a bijective correspondence between non-abelian n-Lie superalgebras and transitive $\mathbb{Z}$-graded Lie superalgebras of the form $L = \oplus_{j=-1}^{n-1} L_j$, where $\dim L_{n-1} = 1$, $L_{-1}$ and $L_{n-1}$ generate $L$, and $[L_j, L_{n-j-1}] = 0$ for all $j$, thereby reducing it to the known classification of simple linearly compact Lie superalgebras and their $\mathbb{Z}$-gradings. The list consists of four examples, one of them being the $(n+1)$-dimensional vector product n-Lie algebra, and the remaining three infinite-dimensional n-Lie algebras.

0. Introduction

Given an integer $n \geq 2$, an n-Lie algebra $\mathfrak{g}$ is a vector space over a field $\mathbb{F}$, endowed with an n-ary anti-commutative product
\[ \Lambda^n \mathfrak{g} \to \mathfrak{g}, \qquad a_1 \wedge \dots \wedge a_n \mapsto [a_1, \dots, a_n], \]
subject to the following Filippov-Jacobi identity:
\[ [a_1, \dots, a_{n-1}, [b_1, \dots, b_n]] = [[a_1, \dots, a_{n-1}, b_1], b_2, \dots, b_n] + [b_1, [a_1, \dots, a_{n-1}, b_2], b_3, \dots, b_n] + \dots + [b_1, \dots, b_{n-1}, [a_1, \dots, a_{n-1}, b_n]]. \tag{0.1} \]
The meaning of this identity is similar to that of the usual Jacobi identity for a Lie algebra (which is a 2-Lie algebra), namely, given $a_1, \dots, a_{n-1} \in \mathfrak{g}$, the map $D_{a_1,\dots,a_{n-1}} : \mathfrak{g} \to \mathfrak{g}$, given by $D_{a_1,\dots,a_{n-1}}(a) = [a_1, \dots, a_{n-1}, a]$, is a derivation of the n-ary bracket. These derivations are called inner.

Author footnotes: Partially supported by Progetto di ateneo CPDA071244. Partially supported by an NSF grant.
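The derivation property behind (0.1) can be checked by brute force in a small case. The following is a minimal sketch, not taken from the paper: it assumes $n = 3$, $V = \mathbb{F}^4$ with an orthonormal basis (so that the dual bases coincide), and the bracket of basis vectors given by the totally antisymmetric symbol normalized by $\epsilon_{0123} = 1$, which is the vector product example recalled in this Introduction. All function names are hypothetical.

```python
from itertools import product

N = 3        # arity n of the bracket (assumption: smallest case with n > 2)
DIM = N + 1  # dimension of V


def perm_sign(p):
    """Sign of a permutation of 0..len(p)-1, found by sorting with transpositions."""
    p, sign = list(p), 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            sign = -sign
    return sign


def basis(i):
    """Coordinate vector of the i-th basis element."""
    return [1 if j == i else 0 for j in range(DIM)]


def bracket_basis(idx):
    """Bracket of basis vectors: eps_{i_1...i_{n+1}} times the remaining basis vector."""
    out = [0] * DIM
    if len(set(idx)) == N:                    # eps vanishes on repeated indices
        (last,) = set(range(DIM)) - set(idx)  # the single unused index
        out[last] = perm_sign(tuple(idx) + (last,))
    return out


def bracket(*vecs):
    """n-linear extension of bracket_basis to arbitrary coordinate vectors."""
    out = [0] * DIM
    for idx in product(range(DIM), repeat=N):
        coeff = 1
        for v, i in zip(vecs, idx):
            coeff *= v[i]
        if coeff:
            out = [o + coeff * b for o, b in zip(out, bracket_basis(idx))]
    return out


def filippov_jacobi_holds():
    """Check identity (0.1) on all tuples of basis vectors (enough by multilinearity)."""
    E = [basis(i) for i in range(DIM)]
    for a in product(E, repeat=N - 1):
        for b in product(E, repeat=N):
            lhs = bracket(*a, bracket(*b))
            rhs = [0] * DIM
            for k in range(N):
                inner = bracket(*a, b[k])     # D_{a_1,a_2} applied to b_k
                term = bracket(*b[:k], inner, *b[k + 1:])
                rhs = [r + t for r, t in zip(rhs, term)]
            if lhs != rhs:
                return False
    return True
```

Under these assumptions `filippov_jacobi_holds()` returns `True`, which says that each map $D_{a_1,a_2}$ acts as a derivation of the ternary bracket.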
The notion of an n-Lie algebra was introduced by Filippov in 1985 [11]. In this and several subsequent papers, [12,18–20], a structure theory of finite-dimensional n-Lie algebras over a field $\mathbb{F}$ of characteristic 0 was developed. In particular, Ling in [20] discovered the following disappointing feature of n-Lie algebras for $n > 2$: there exists only one simple finite-dimensional n-Lie algebra over an algebraically closed field $\mathbb{F}$ of characteristic 0. It is given by the vector product of $n$ vectors in the $(n+1)$-dimensional vector space $V$, endowed with a non-degenerate symmetric bilinear form $(\cdot, \cdot)$. Recall that, choosing dual bases $\{a_i\}$ and $\{a^i\}$ of $V$, i.e., $(a_i, a^j) = \delta_{ij}$, $i, j = 1, \dots, n+1$, the vector product of $n$ vectors from the basis $\{a_i\}$ is defined as the following n-ary bracket:
\[ [a_{i_1}, \dots, a_{i_n}] = \epsilon_{i_1,\dots,i_{n+1}} a^{i_{n+1}}, \]
where $\epsilon_{i_1,\dots,i_{n+1}}$ is a non-zero totally antisymmetric tensor with values in $\mathbb{F}$, and extended by n-linearity. This is a simple n-Lie algebra, which is called the vector product n-Lie algebra; we denote it by $O^n$.

Another example of an n-Lie algebra appeared earlier in Nambu's generalization of Hamiltonian dynamics [23]. It is the space $C^\infty(M)$ of $C^\infty$-functions on a finite-dimensional manifold $M$, endowed with the following n-ary bracket, associated to $n$ commuting vector fields $D_1, \dots, D_n$ on $M$:
\[ [f_1, \dots, f_n] = \det \begin{pmatrix} D_1(f_1) & \dots & D_1(f_n) \\ \vdots & \ddots & \vdots \\ D_n(f_1) & \dots & D_n(f_n) \end{pmatrix}. \tag{0.2} \]
The fact that this n-ary bracket satisfies the Filippov-Jacobi identity was noticed later by Filippov (who was unaware of Nambu's work), and by Takhtajan [25], who introduced the notion of an n-Poisson algebra (and was unaware of Filippov's work).

A more recent important example of an n-Lie algebra structure on $C^\infty(M)$, given by Dzhumadildaev [6], is associated to $n-1$ commuting vector fields $D_1, \dots, D_{n-1}$ on $M$:
\[ [f_1, \dots, f_n] = \det \begin{pmatrix} f_1 & \dots & f_n \\ D_1(f_1) & \dots & D_1(f_n) \\ \vdots & \ddots & \vdots \\ D_{n-1}(f_1) & \dots & D_{n-1}(f_n) \end{pmatrix}. \tag{0.3} \]
In fact, Dzhumadildaev considered examples (0.2) and (0.3) in a more general context, where $C^\infty(M)$ is replaced by an arbitrary commutative associative algebra $A$ over $\mathbb{F}$ and the $D_i$ by derivations of $A$. He showed in [9] that (0.2) and (0.3) satisfy the Filippov-Jacobi identity if and only if the vector space $\sum_i \mathbb{F} D_i$ is closed under the Lie bracket.

In the past few years there has been some interest in n-Lie algebras in the physics community, related to M-branes in string theory. We shall quote here two sources: a survey paper [13], containing a rather extensive list of references, and a paper by Friedmann [14], where simple finite-dimensional 3-Lie algebras over $\mathbb{C}$ were classified (she was unaware of the earlier work). At the same time we (the authors of the present paper) have been completing our work [5] on simple rigid linearly compact superalgebras, and it occurred to us that the method of this work also applies to the classification of simple linearly compact n-Lie superalgebras. Our main result can be stated as follows.

Theorem 0.1. (a) Any simple linearly compact n-Lie algebra with $n > 2$, over an algebraically closed field $\mathbb{F}$ of characteristic 0, is isomorphic to one of the following four examples:
(i) the $(n+1)$-dimensional vector product n-Lie algebra $O^n$;
(ii) the n-Lie algebra, denoted by $S^n$, which is the linearly compact vector space of formal power series $\mathbb{F}[[x_1, \dots, x_n]]$, endowed with the n-ary bracket (0.2), where $D_i = \partial/\partial x_i$;
(iii) the n-Lie algebra, denoted by $W^n$, which is the linearly compact vector space of formal power series $\mathbb{F}[[x_1, \dots, x_{n-1}]]$, endowed with the n-ary bracket (0.3), where $D_i = \partial/\partial x_i$;
(iv) the n-Lie algebra, denoted by $SW^n$, which is the direct sum of $n-1$ copies of $\mathbb{F}[[x]]$, endowed with the following n-ary bracket, where $f^{\langle j \rangle}$ is an element of the $j$-th copy and $f' = df/dx$:
\[ [f_1^{\langle j_1 \rangle}, \dots, f_n^{\langle j_n \rangle}] = 0, \quad \text{unless } \{j_1, \dots, j_n\} \supset \{1, \dots, n-1\}, \]
\[ [f_1^{\langle 1 \rangle}, \dots, f_{k-1}^{\langle k-1 \rangle}, f_k^{\langle k \rangle}, f_{k+1}^{\langle k \rangle}, f_{k+2}^{\langle k+1 \rangle}, \dots, f_n^{\langle n-1 \rangle}] = (-1)^{k+n} \bigl( f_1 \cdots f_{k-1} (f_k' f_{k+1} - f_{k+1}' f_k) f_{k+2} \cdots f_n \bigr)^{\langle k \rangle}. \]
(b) There are no simple linearly compact n-Lie superalgebras over $\mathbb{F}$, which are not n-Lie algebras, if $n > 2$.

Recall that a linearly compact algebra is a topological algebra, whose underlying vector space is linearly compact, namely is a topological product of finite-dimensional vector spaces, endowed with discrete topology (and it is assumed that the algebra product is continuous in this topology). In particular, any finite-dimensional algebra is automatically linearly compact. The basic example of an infinite-dimensional linearly compact space is the space of formal power series $\mathbb{F}[[x_1, \dots, x_k]]$, endowed with the formal topology, or a direct sum of a finite number of such spaces.

The proof of Theorem 0.1 is based on a construction, which associates to an n-Lie (super)algebra $\mathfrak{g}$ a pair $(\mathrm{Lie}\,\mathfrak{g}, \mu)$, where $\mathrm{Lie}\,\mathfrak{g} = \prod_{j \geq -1} \mathrm{Lie}_j\,\mathfrak{g}$ is a $\mathbb{Z}$-graded Lie superalgebra of depth 1 and $\mu \in \mathrm{Lie}_{n-1}\,\mathfrak{g}$, such that the following properties hold:
(L1) $\mathrm{Lie}\,\mathfrak{g}$ is transitive, i.e., if $a \in \mathrm{Lie}_j\,\mathfrak{g}$ with $j \geq 0$ and $[a, \mathrm{Lie}_{-1}\,\mathfrak{g}] = 0$, then $a = 0$;
(L2) $\mathrm{Lie}\,\mathfrak{g}$ is generated by $\mathrm{Lie}_{-1}\,\mathfrak{g}$ and $\mu$;
(L3) $[\mu, \mathrm{Lie}_0\,\mathfrak{g}] = 0$.
A pair $(L, \mu)$, where $L = \prod_{j \geq -1} L_j$ is a transitive $\mathbb{Z}$-graded Lie superalgebra and $\mu \in L_{n-1}$, such that (L2) and (L3) hold, is called admissible.

The construction of the admissible pair $(\mathrm{Lie}\,\mathfrak{g}, \mu)$, associated to an n-Lie (super)algebra $\mathfrak{g}$, uses the universal $\mathbb{Z}$-graded Lie superalgebra $W(V) = \prod_{j \geq -1} W_j(V)$, associated to a vector superspace $V$ (see Sect. 1 for details). One has $W_j(V) = \mathrm{Hom}(S^{j+1} V, V)$, so that an element $\mu \in W_{n-1}(V)$ defines a commutative n-superalgebra structure on $V$ and vice versa. Universality means that any transitive $\mathbb{Z}$-graded Lie superalgebra $L = \prod_{j \geq -1} L_j$ with $L_{-1} = V$ canonically embeds in $W(V)$ (the embedding being given by $L_j \ni a \mapsto \varphi_a \in W_j(V)$, where $\varphi_a(a_1, \dots, a_{j+1}) = [\dots[a, a_1], \dots, a_{j+1}]$).
So, given a commutative n-ary product on a superspace $V$, we get an element $\mu \in W_{n-1}(V)$, and we denote by $\mathrm{Lie}\,V$ the $\mathbb{Z}$-graded subalgebra of $W(V)$, generated by $W_{-1}(V)$ and $\mu$. The pair $(\mathrm{Lie}\,V, \mu)$ obviously satisfies properties (L1) and (L2).

How do we pass from commutative to anti-commutative n-superalgebras? Given a commutative n-superalgebra $V$ with n-ary product $(a_1, \dots, a_n)$, the vector superspace $\Pi V$ ($\Pi$ stands, as usual, for reversing the parity) becomes an anti-commutative n-superalgebra with n-ary product
\[ [a_1, \dots, a_n] = p(a_1, \dots, a_n)\,(a_1, \dots, a_n), \tag{0.4} \]
where
\[ p(a_1, \dots, a_n) = \begin{cases} (-1)^{p(a_1) + p(a_3) + \dots + p(a_{n-1})} & \text{if } n \text{ is even}, \\ (-1)^{p(a_2) + p(a_4) + \dots + p(a_{n-1})} & \text{if } n \text{ is odd}, \end{cases} \tag{0.5} \]
and vice versa. Thus, given an anti-commutative n-superalgebra $\mathfrak{g}$, with n-ary product $[a_1, \dots, a_n]$, we consider the vector superspace $\Pi\mathfrak{g}$ with commutative n-ary product $(a_1, \dots, a_n)$, given by (0.4), consider the element $\mu \in W_{n-1}(\Pi\mathfrak{g})$, corresponding to the latter n-ary product, and let $\mathrm{Lie}\,\mathfrak{g}$ be the graded subalgebra of $W(\Pi\mathfrak{g})$, generated by $W_{-1}(\Pi\mathfrak{g})$ and $\mu$. Note that properties (L1) and (L2) of the pair $(\mathrm{Lie}\,\mathfrak{g}, \mu)$ still hold, and it remains to note that property (L3) is equivalent to the (super analogue of the) Filippov-Jacobi identity. Finally, the simplicity of the n-Lie (super)algebra $\mathfrak{g}$ is equivalent to
(L4) the $\mathrm{Lie}_0\,\mathfrak{g}$-module $\mathrm{Lie}_{-1}\,\mathfrak{g}$ is irreducible.
An admissible pair, satisfying property (L4), is called irreducible. Thus, the proof of Theorem 0.1 reduces to the classification of all irreducible admissible pairs $(L, \mu)$, where $L$ is a linearly compact Lie superalgebra.

It is not difficult to show, as in [5], that $S \subseteq L \subseteq \mathrm{Der}\,S$, where $S$ is a simple linearly compact Lie superalgebra and $\mathrm{Der}\,S$ is the Lie superalgebra of its continuous derivations (it is at this point that the condition $n > 2$ is essential). Up to now the arguments worked over an arbitrary field $\mathbb{F}$. In the case $\mathbb{F}$ is algebraically closed of characteristic 0, there is a complete classification of simple linearly compact Lie superalgebras, their derivations and their $\mathbb{Z}$-gradings [2,3,16,17]. Applying these classifications completes the proof of Theorem 0.1.

For example, if $\mathfrak{g}$ is a finite-dimensional simple n-Lie superalgebra, then $\dim \mathrm{Lie}\,\mathfrak{g} < \infty$, and from [16] we see that the only possibility for $\mathrm{Lie}\,\mathfrak{g}$ is $L = P/\mathbb{F}1$, where $P$ is the Lie superalgebra defined by the super Poisson bracket on the Grassmann algebra in the indeterminates $\xi_1, \dots, \xi_{n+1}$, given by
\[ \{\xi_i, \xi_j\} = b_{ij}, \quad i, j = 1, \dots, n+1, \tag{0.6} \]
where $(b_{ij})$ is a non-degenerate symmetric matrix, the $\mathbb{Z}$-grading on $L$ being given by $\deg(\xi_{i_1} \cdots \xi_{i_s}) = s - 2$, and $\mu = \xi_1 \xi_2 \cdots \xi_{n+1}$. We conclude that $\mathfrak{g}$ is the vector product n-Lie algebra. (The proof of this result in the non-super case, obtained by Ling [20], is based on the study of the linear Lie algebra spanned by the derivations $D_{a_1,\dots,a_{n-1}}$, and is applicable neither in the super nor in the infinite-dimensional case.) We have no a priori proof of part (b) of Theorem 0.1; it comes out only after the classification process.

If $\mathrm{char}\,\mathbb{F} = 0$, we have a more precise result on the structure of an admissible pair $(L, \mu)$.

Theorem 0.2. If $\mathrm{char}\,\mathbb{F} = 0$ and $(L, \mu)$ is an admissible pair, then $L = \oplus_{j=-1}^{n-1} L_j$, where $L_{n-1} = \mathbb{F}\mu$, $S := \oplus_{j=-1}^{n-2} L_j$ is an ideal in $L$, and $L_j = (\mathrm{ad}\,L_{-1})^{n-j-1} \mu$ for $j = 0, \dots, n-1$.
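As a small arithmetic illustration of the shape predicted by Theorem 0.2, one can tabulate the graded dimensions in the finite-dimensional example $L = P/\mathbb{F}1$ described above. The sketch below assumes only what is stated there: monomials of degree $s$ in the $n+1$ odd indeterminates sit in degree $s - 2$, and the constants ($s = 0$) are factored out. The function name is hypothetical.

```python
from math import comb


def graded_dims(n):
    """dim L_j for L = P/F1, j = -1..n-1.

    L_j is spanned by the monomials of degree s = j + 2 in n + 1
    anticommuting indeterminates, so dim L_j = C(n+1, j+2).
    """
    return {j: comb(n + 1, j + 2) for j in range(-1, n)}


for n in (3, 4, 5):
    dims = graded_dims(n)
    # L_{-1} has the dimension n + 1 of the vector product algebra,
    # and the top component L_{n-1} is the line spanned by mu.
    print(n, dims)
```

In each case the top component has dimension 1, in agreement with $L_{n-1} = \mathbb{F}\mu$, and the total dimension is $2^{n+1} - 1$.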
Of course, Theorem 0.2 significantly reduces the case-by-case inspection in the proof of Theorem 0.1. Moreover, Theorem 0.2 can also be used in the study of representations of n-Lie algebras. Namely, a representation of an n-Lie algebra $\mathfrak{g}$ in a vector space $M$ corresponds to an n-Lie algebra structure on the semidirect product $L_{-1} = \mathfrak{g} \ltimes M$, where $M$ is an abelian ideal. Hence, by Theorem 0.2, we obtain a graded representation of the Lie superalgebra $S = \oplus_{i=-1}^{n-2} L_i$ in the graded supervector space $L(M) = \oplus_{j=-1}^{n-2} M_j$, $M_j = (\mathrm{ad}\,M)^{n-j-1} \mu$, so that $L_i M_j \subset M_{i+j}$. Such "degenerate" representations of the Lie superalgebra $S$ are not difficult to classify, and this corresponds to a classification of representations of the n-Lie algebra $\mathfrak{g}$. In particular, representations of the n-Lie algebra $O^n$ correspond to "degenerate" representations of the simple Lie superalgebra $H(0, n)$ (finite-dimensional representations of $O^n$ were classified in [7], using Ling's method, mentioned above).

Finally, note that, using our discussion on $\mathbb{F}$-forms of simple linearly compact Lie superalgebras in [3], we can extend Theorem 0.1 to the case of an arbitrary field $\mathbb{F}$ of characteristic 0. The result is almost the same, namely the $\mathbb{F}$-forms are as follows: $O^n$, depending on the equivalence class of the symmetric bilinear form up to a non-zero factor, and the n-Lie algebras $S^n$, $W^n$ and $SW^n$ over $\mathbb{F}$.

1. Preliminaries on n-Superalgebras

Let $V$ be a vector superspace over a field $\mathbb{F}$, namely we have a decomposition $V = V_{\bar 0} \oplus V_{\bar 1}$ in a direct sum of subspaces, where $V_{\bar 0}$ (resp. $V_{\bar 1}$) is called the subspace of even (resp. odd) elements; if $v \in V_\alpha$, $\alpha \in \mathbb{Z}/2\mathbb{Z} = \{\bar 0, \bar 1\}$, we write $p(v) = \alpha$. Given two vector superspaces $U$ and $V$, the space $\mathrm{Hom}(U, V)$ is naturally a vector superspace, for which even (resp. odd) elements are parity preserving (resp. reversing) maps; also $U \otimes V$ is a vector superspace via letting $p(a \otimes b) = p(a) + p(b)$ for $a \in U$, $b \in V$. In particular, the tensor algebra $T(V) = \oplus_{j \in \mathbb{Z}_+} T^j(V)$ is an associative superalgebra. The symmetric (resp. exterior) superalgebra over $V$ is the quotient of the superalgebra $T(V)$ by the 2-sided ideal, generated by the elements $a \otimes b - (-1)^{p(a)p(b)} b \otimes a$ (resp. $a \otimes b + (-1)^{p(a)p(b)} b \otimes a$), where $a, b \in V$. They are denoted by $S(V)$ and $\Lambda(V)$ respectively. Both inherit a $\mathbb{Z}$-grading from $T(V)$: $S(V) = \oplus_{j \in \mathbb{Z}_+} S^j(V)$, $\Lambda(V) = \oplus_{j \in \mathbb{Z}_+} \Lambda^j(V)$.

A well known trivial, but important, observation is that the reversal of parity of $V$, i.e., taking the superspace $\Pi V$, where $(\Pi V)_\alpha = V_{\alpha + \bar 1}$, establishes a canonical isomorphism:
\[ S(\Pi V) \simeq \Lambda(V). \tag{1.1} \]
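A quick dimension count makes (1.1) concrete. The sketch below is only an illustration under the following assumption: for a superspace with $a$ even and $b$ odd basis vectors, $S^j$ is spanned by monomials in which the even variables commute and the odd ones anticommute, while in $\Lambda^j$ the roles are exchanged. The helper names are hypothetical.

```python
from math import comb


def multichoose(a, i):
    """Number of degree-i monomials in a commuting variables."""
    if i == 0:
        return 1
    if a == 0:
        return 0
    return comb(a + i - 1, i)


def dim_sym(even, odd, j):
    """dim S^j of a superspace of dimension (even|odd)."""
    return sum(multichoose(even, i) * comb(odd, j - i) for i in range(j + 1))


def dim_ext(even, odd, j):
    """dim Lambda^j of a superspace of dimension (even|odd)."""
    return sum(comb(even, i) * multichoose(odd, j - i) for i in range(j + 1))


def degreewise_dims_agree(max_dim=4, max_deg=6):
    """Compare dim S^j(Pi V) with dim Lambda^j(V) for V of dimension (m|k).

    Parity reversal swaps the two numbers, so Pi V has dimension (k|m).
    """
    return all(
        dim_sym(k, m, j) == dim_ext(m, k, j)
        for m in range(max_dim)
        for k in range(max_dim)
        for j in range(max_deg)
    )
```

The comparison is degree by degree, which is consistent with (1.1) being an isomorphism of graded spaces; it does not, of course, check the algebra structure.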
Definition 1.1. Let $n \in \mathbb{Z}_+$ and let $V$ be a vector superspace. An n-superalgebra structure (or n-ary product) on $V$ of parity $\alpha \in \mathbb{Z}/2\mathbb{Z}$ is a linear map $\mu : T^n(V) \to V$ of parity $\alpha$. A commutative (resp. anti-commutative) n-superalgebra of parity $\alpha$ is a linear map $\mu : S^n(V) \to V$ (resp. $\Lambda^n V \to V$) of parity $\alpha$, denoted by $\mu(a_1 \otimes \dots \otimes a_n) = (a_1, \dots, a_n)$ (resp. $= [a_1, \dots, a_n]$).

Lemma 1.2. Let $(V, \mu)$ be an anti-commutative n-superalgebra. Then $(\Pi V, \bar\mu)$ is a commutative n-superalgebra (of parity $p(\mu) + n - 1 \bmod 2$) with the n-ary product
\[ \bar\mu(a_1 \otimes \dots \otimes a_n) = p(a_1, \dots, a_n)\,\mu(a_1 \otimes \dots \otimes a_n), \tag{1.2} \]
where
\[ p(a_1, \dots, a_n) = (-1)^{\sum_{k=0}^{\lfloor (n-2)/2 \rfloor} p(a_{n-1-2k})}, \tag{1.3} \]
and vice versa.

Proof. Denote by $p'$ the parity in $\Pi V$. Then, for $a_1, \dots, a_n \in V$,
\[ p'(\bar\mu(a_1 \otimes \dots \otimes a_n)) = p(\mu) + 1 + \sum_{i=1}^n p(a_i) = \sum_{i=1}^n p'(a_i) + p(\mu) + n + 1 \bmod 2, \]
i.e., $p'(\bar\mu) = p(\mu) + n - 1 \bmod 2$. Besides, we have $p(a_1, \dots, a_i, a_{i+1}, \dots, a_n)\, p(a_1, \dots, a_{i+1}, a_i, \dots, a_n) = (-1)^{p(a_i) + p(a_{i+1})}$, hence:
\[ \begin{aligned} \bar\mu(a_1 \otimes \dots \otimes a_i \otimes a_{i+1} \otimes \dots \otimes a_n) &= p(a_1, \dots, a_n)\,\mu(a_1 \otimes \dots \otimes a_i \otimes a_{i+1} \otimes \dots \otimes a_n) \\ &= -(-1)^{p(a_i)p(a_{i+1})} p(a_1, \dots, a_n)\,\mu(a_1 \otimes \dots \otimes a_{i+1} \otimes a_i \otimes \dots \otimes a_n) \\ &= -(-1)^{p(a_i)p(a_{i+1})} p(a_1, \dots, a_n)\, p(a_1, \dots, a_{i+1}, a_i, \dots, a_n)\, \bar\mu(a_1 \otimes \dots \otimes a_{i+1} \otimes a_i \otimes \dots \otimes a_n) \\ &= (-1)^{p'(a_i)p'(a_{i+1})} \bar\mu(a_1 \otimes \dots \otimes a_{i+1} \otimes a_i \otimes \dots \otimes a_n). \end{aligned} \]

Definition 1.3. A derivation $D$ of parity $\alpha \in \mathbb{Z}/2\mathbb{Z}$ of an n-superalgebra $(V, \mu)$ is an endomorphism of the vector superspace $V$ of parity $\alpha$, such that:
\[ D(\mu(a_1 \otimes \dots \otimes a_n)) = (-1)^{\alpha p(\mu)} \bigl( \mu(Da_1 \otimes a_2 \otimes \dots \otimes a_n) + (-1)^{\alpha p(a_1)} \mu(a_1 \otimes Da_2 \otimes \dots \otimes a_n) + \dots + (-1)^{\alpha(p(a_1) + \dots + p(a_{n-1}))} \mu(a_1 \otimes \dots \otimes D(a_n)) \bigr). \]
It is clear that derivations of an n-superalgebra $V$ form a Lie superalgebra, which is denoted by $\mathrm{Der}\,V$. It is not difficult to show that all inner derivations of an n-Lie algebra $\mathfrak{g}$ span an ideal of $\mathrm{Der}\,\mathfrak{g}$ (see e.g. [7]), denoted by $\mathrm{Inder}\,\mathfrak{g}$.

Now we recall the construction of the universal Lie superalgebra $W(V)$, associated to the vector superspace $V$. For an integer $k \geq -1$, let $W_k(V) = \mathrm{Hom}(S^{k+1}(V), V)$, in other words, $W_k(V)$ is the vector superspace of all commutative $(k+1)$-superalgebra structures on $V$, in particular, $W_{-1}(V) = V$, $W_0(V) = \mathrm{End}(V)$, $W_1(V)$ is the space of all commutative superalgebra structures on $V$, etc. We endow the vector superspace
\[ W(V) = \prod_{k=-1}^{\infty} W_k(V) \]
with a product $f \square g$, making $W(V)$ a $\mathbb{Z}$-graded superalgebra, given by the following formula for $f \in W_p(V)$, $g \in W_q(V)$:
\[ f \square g(x_0, \dots, x_{p+q}) = \sum_{\substack{i_0 < \dots < i_q \\ i_{q+1} < \dots < i_{p+q}}} \varepsilon(i_0, \dots, i_q, i_{q+1}, \dots, i_{p+q})\, f(g(x_{i_0}, \dots, x_{i_q}), x_{i_{q+1}}, \dots, x_{i_{p+q}}), \tag{1.4} \]
where $\varepsilon = (-1)^N$, $N$ being the number of interchanges of indices of odd $x_i$'s in the permutation $\sigma(s) = i_s$, $s = 0, 1, \dots, p+q$. Then the bracket
\[ [f, g] = f \square g - (-1)^{p(f)p(g)} g \square f \tag{1.5} \]
defines a Lie superalgebra structure on $W(V)$.

Lemma 1.4. Let $V$ be a vector superspace and let $\mu \in W_{n-1}(V)$, $D \in W_0(V)$. Then
(a) $[\mu, D] = 0$ if and only if $D$ is a derivation of the n-superalgebra $(V, \mu)$.
(b) $D$ is a derivation of parity $\alpha$ of the commutative n-superalgebra $(V, \mu)$ if and only if $D$ is a derivation of parity $\alpha$ of the anti-commutative n-superalgebra $(\Pi V, \bar\mu)$, where
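\[ \bar\mu(a_1 \otimes \dots \otimes a_n) = p(a_1, \dots, a_n)\,\mu(a_1 \otimes \dots \otimes a_n), \]
and $p(a_1, \dots, a_n)$ is defined by (1.3).

Proof. By (1.5) and (1.4), we have:
\[ \begin{aligned} [\mu, D](b_1 \otimes \dots \otimes b_n) &= (\mu \square D)(b_1 \otimes \dots \otimes b_n) - (-1)^{\alpha p(\mu)} (D \square \mu)(b_1 \otimes \dots \otimes b_n) \\ &= \sum_{\substack{i_1 \\ i_2 < \dots < i_n}} \varepsilon(i_1, \dots, i_n)\, \mu(D(b_{i_1}) \otimes b_{i_2} \otimes \dots \otimes b_{i_n}) - (-1)^{\alpha p(\mu)} D(\mu(b_1 \otimes \dots \otimes b_n)), \end{aligned} \]
where $\alpha$ is the parity of $D$. Therefore $[\mu, D] = 0$ if and only if
\[ \begin{aligned} D(\mu(b_1 \otimes \dots \otimes b_n)) &= (-1)^{\alpha p(\mu)} \sum_{\substack{i_1 \\ i_2 < \dots < i_n}} \varepsilon(i_1, \dots, i_n)\, \mu(D(b_{i_1}) \otimes b_{i_2} \otimes \dots \otimes b_{i_n}) \\ &= (-1)^{\alpha p(\mu)} \bigl( \mu(D(b_1) \otimes b_2 \otimes \dots \otimes b_n) + (-1)^{\alpha p(b_1)} \mu(b_1 \otimes D(b_2) \otimes \dots \otimes b_n) + \dots + (-1)^{\alpha(p(b_1) + \dots + p(b_{n-1}))} \mu(b_1 \otimes \dots \otimes b_{n-1} \otimes D(b_n)) \bigr), \end{aligned} \]
i.e., if and only if $D$ is a derivation of $(V, \mu)$ of parity $\alpha$, proving (a).

In order to prove (b), note that $D$ is a derivation of parity $\alpha$ of $(V, \mu)$ if and only if
\[ \begin{aligned} D(\bar\mu(a_1 \otimes \dots \otimes a_n)) &= p(a_1, \dots, a_n)\, D(\mu(a_1 \otimes \dots \otimes a_n)) \\ &= p(a_1, \dots, a_n) (-1)^{\alpha p(\mu)} \bigl( \mu(D(a_1) \otimes a_2 \otimes \dots \otimes a_n) + (-1)^{\alpha p(a_1)} \mu(a_1 \otimes D(a_2) \otimes \dots \otimes a_n) + \dots + (-1)^{\alpha(p(a_1) + \dots + p(a_{n-1}))} \mu(a_1 \otimes \dots \otimes D(a_n)) \bigr) \\ &= p(a_1, \dots, a_n) (-1)^{\alpha p(\mu)} p(D(a_1), a_2, \dots, a_n) \bigl( \bar\mu(D(a_1) \otimes a_2 \otimes \dots \otimes a_n) \\ &\quad + (-1)^{\alpha p(a_1)} p(D(a_1), a_2, \dots, a_n)\, p(a_1, D(a_2), \dots, a_n)\, \bar\mu(a_1 \otimes D(a_2) \otimes \dots \otimes a_n) + \dots \\ &\quad + (-1)^{\alpha(p(a_1) + \dots + p(a_{n-1}))} p(D(a_1), a_2, \dots, a_n)\, p(a_1, \dots, D(a_n))\, \bar\mu(a_1 \otimes \dots \otimes D(a_n)) \bigr). \end{aligned} \]
If $n$ is even, we have:
\[ \begin{aligned} p(a_1, \dots, a_n) (-1)^{\alpha p(\mu)} p(D(a_1), a_2, \dots, a_n) &= (-1)^{\alpha(p(\mu) + 1)}, \\ (-1)^{\alpha p(a_1)} p(D(a_1), a_2, \dots, a_n)\, p(a_1, D(a_2), \dots, a_n) &= (-1)^{\alpha(p(a_1) + 1)}, \\ &\ \,\vdots \\ (-1)^{\alpha(p(a_1) + \dots + p(a_{n-1}))} p(D(a_1), a_2, \dots, a_n)\, p(a_1, \dots, D(a_n)) &= (-1)^{\alpha(p(a_1) + \dots + p(a_{n-1}) + 1)}. \end{aligned} \]
If $n$ is odd, we have:
\[ \begin{aligned} p(a_1, \dots, a_n) (-1)^{\alpha p(\mu)} p(D(a_1), a_2, \dots, a_n) &= (-1)^{\alpha p(\mu)}, \\ (-1)^{\alpha p(a_1)} p(D(a_1), a_2, \dots, a_n)\, p(a_1, D(a_2), \dots, a_n) &= (-1)^{\alpha(p(a_1) + 1)}, \\ &\ \,\vdots \\ (-1)^{\alpha(p(a_1) + \dots + p(a_{n-1}))} p(D(a_1), a_2, \dots, a_n)\, p(a_1, \dots, D(a_n)) &= (-1)^{\alpha(p(a_1) + \dots + p(a_{n-1}))}. \end{aligned} \]
Since $\bar\mu$ has parity equal to $p(\mu) + n - 1 \bmod 2$, (b) is proved.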
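The sign bookkeeping in the proofs of Lemma 1.2 and Lemma 1.4(b) can be confirmed by exhaustive enumeration over parities. The following is a sketch under the reading of (1.3) in which the exponent runs over the indices $n-1, n-3, \dots$ down to 1 or 2; the function names are hypothetical, and the last two checks spell out, for every position $k$, the pattern that the displayed identities follow (a relative sign $(-1)^{\alpha(k-1)}$ for the $k$-th term and an overall sign $(-1)^{\alpha(n-1)}$).

```python
from itertools import product


def index_set(n):
    """1-based indices n-1, n-3, ... (>= 1) appearing in the exponent of (1.3)."""
    return [n - 1 - 2 * k for k in range((n - 2) // 2 + 1)]


def p_sign(par):
    """Sign p(a_1,...,a_n) of (1.3); par[i-1] is the parity p(a_i) in {0, 1}."""
    return (-1) ** sum(par[i - 1] for i in index_set(len(par)))


def p_sign_intro(par):
    """The same sign written as in (0.5): odd indices if n is even, even ones if n is odd."""
    n = len(par)
    start = 1 if n % 2 == 0 else 2
    return (-1) ** sum(par[i - 1] for i in range(start, n, 2))


def shift(par, k, alpha):
    """Parities after replacing a_k by D(a_k), where D has parity alpha."""
    out = list(par)
    out[k - 1] = (out[k - 1] + alpha) % 2
    return tuple(out)


def signs_consistent(max_n=6):
    for n in range(2, max_n + 1):
        for par in product((0, 1), repeat=n):
            # (0.5) and (1.3) describe the same sign
            if p_sign(par) != p_sign_intro(par):
                return False
            # swap identity used in the proof of Lemma 1.2
            for i in range(n - 1):
                swapped = list(par)
                swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
                if p_sign(par) * p_sign(swapped) != (-1) ** (par[i] + par[i + 1]):
                    return False
            for alpha in (0, 1):
                # overall factor in the proof of Lemma 1.4(b)
                if p_sign(par) * p_sign(shift(par, 1, alpha)) != (-1) ** (alpha * (n - 1)):
                    return False
                # relative factor of the k-th term
                for k in range(1, n + 1):
                    rel = p_sign(shift(par, 1, alpha)) * p_sign(shift(par, k, alpha))
                    if rel != (-1) ** (alpha * (k - 1)):
                        return False
    return True
```

The factor $(-1)^{\alpha(k-1)}$ is exactly what is needed to turn the exponent $p(a_1) + \dots + p(a_{k-1})$ into the corresponding sum of shifted parities in $\Pi V$.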
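2. The Main Construction

Let $(\mathfrak{g}, \mu)$ be an anti-commutative n-superalgebra over a field $\mathbb{F}$ with n-ary product $[a_1, \dots, a_n]$, and let $V = \Pi\mathfrak{g}$. Consider the universal Lie superalgebra $W(V) = \prod_{k=-1}^{\infty} W_k(V)$, and let $\bar\mu \in W_{n-1}(V)$ be the element defined by (1.2). Let $\mathrm{Lie}\,\mathfrak{g} = \prod_{j=-1}^{\infty} \mathrm{Lie}_j\,\mathfrak{g}$ be the $\mathbb{Z}$-graded subalgebra of the Lie superalgebra $W(V)$, generated by $W_{-1}(V) = V$ and $\bar\mu$.

Lemma 2.1. (a) $\mathrm{Lie}\,\mathfrak{g}$ is a transitive subalgebra of $W(V)$.
(b) If $D \in \mathrm{Lie}_0\,\mathfrak{g}$, then the action of $D$ on $\mathrm{Lie}_{-1}\,\mathfrak{g} = V$ $(= \Pi\mathfrak{g})$ is a derivation of the n-superalgebra $\mathfrak{g}$ if and only if $[D, \bar\mu] = 0$.
(c) $\mathrm{Lie}_0\,\mathfrak{g}$ is generated by elements of the form
\[ (\mathrm{ad}\,a_1) \cdots (\mathrm{ad}\,a_{n-1})\,\bar\mu, \quad \text{where } a_i \in \mathrm{Lie}_{-1}\,\mathfrak{g} = V. \tag{2.1} \]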
Proof. (a) is clear since $W(V)$ is transitive and the latter holds since, for $f \in W_k(V)$ and $a, a_1, \dots, a_k \in W_{-1}(V) = V$ one has: $[f, a](a_1, \dots, a_k) = f(a, a_1, \dots, a_k)$. (b) follows from Lemma 1.4. In order to prove (c) let $\tilde L_{-1} = V$ and let $\tilde L_0$ be the subalgebra of the Lie algebra $W_0(V)$, generated by elements (2.1). Let $\prod_{j \geq -1} \tilde L_j$ be the full prolongation of $\tilde L_{-1} \oplus \tilde L_0$, i.e., $\tilde L_j = \{a \in W_j(V) \mid [a, \tilde L_{-1}] \subset \tilde L_{j-1}\}$ for $j \geq 1$. This is a subalgebra of $W(V)$, containing $V$ and $\bar\mu$, hence $\mathrm{Lie}\,\mathfrak{g}$. This proves (c).

Definition 2.2. An n-Lie superalgebra is an anti-commutative n-superalgebra $\mathfrak{g}$ of parity $\alpha$, such that all endomorphisms $D_{a_1,\dots,a_{n-1}}$ of $\mathfrak{g}$ ($a_1, \dots, a_{n-1} \in \mathfrak{g}$), defined by $D_{a_1,\dots,a_{n-1}}(a) = [a_1, \dots, a_{n-1}, a]$, are derivations of $\mathfrak{g}$, i.e., the following Filippov-Jacobi identity holds:
\[ \begin{aligned} [a_1, \dots, a_{n-1}, [b_1, \dots, b_n]] = (-1)^{\alpha(p(a_1) + \dots + p(a_{n-1}))} \bigl( &[[a_1, \dots, a_{n-1}, b_1], b_2, \dots, b_n] \\ &+ (-1)^{p(b_1)(p(a_1) + \dots + p(a_{n-1}))} [b_1, [a_1, \dots, a_{n-1}, b_2], b_3, \dots, b_n] + \dots \\ &+ (-1)^{(p(b_1) + \dots + p(b_{n-1}))(p(a_1) + \dots + p(a_{n-1}))} [b_1, \dots, b_{n-1}, [a_1, \dots, a_{n-1}, b_n]] \bigr). \end{aligned} \tag{2.2} \]
Recall from the Introduction that the pair $(L, \mu)$, where $L = \prod_{j \geq -1} L_j$ is a $\mathbb{Z}$-graded Lie superalgebra and $\mu \in L_{n-1}$, is called admissible if properties (L1), (L2) and (L3) hold. Two admissible pairs $(L, \mu)$ and $(L', \mu')$ are called isomorphic if there exists a Lie superalgebra isomorphism $\phi : L \to L'$, such that $\phi(L_j) = L'_j$ for all $j$ and $\phi(\mu) \in \mathbb{F}^\times \mu'$.

The following corollary of Lemma 2.1 is immediate.

Corollary 2.3. If $\mathfrak{g}$ is an n-Lie superalgebra, then the pair $(\mathrm{Lie}\,\mathfrak{g}, \bar\mu)$ is admissible.

Now it is easy to prove the following key result.
Proposition 2.4. The map $\mathfrak{g} \mapsto (\mathrm{Lie}\,\mathfrak{g}, \bar\mu)$ induces a bijection between isomorphism classes of n-Lie superalgebras, considered up to rescaling the n-ary bracket, and isomorphism classes of admissible pairs. Under this bijection, simple n-Lie algebras correspond to irreducible admissible pairs. Moreover, $\mathfrak{g}$ is linearly compact if and only if $\mathrm{Lie}\,\mathfrak{g}$ is.

Proof. Given an admissible pair $(L, \mu)$, where $L = \prod_{j \geq -1} L_j$, $\mu \in L_{n-1}$, we let $\mathfrak{g} = \Pi L_{-1}$, and define an n-ary bracket on $\mathfrak{g}$ by the formula
\[ [a_1, \dots, a_n] = p(a_1, \dots, a_n)\,[\dots[\mu, a_1] \dots, a_n], \]
where $p(a_1, \dots, a_n)$ is given by (1.3). Obviously, this n-ary bracket is anti-commutative. The Filippov-Jacobi identity follows from the property (L3) using the embedding of $L$ in $W(L_{-1})$ and applying Lemma 2.1(b). Thus, $\mathfrak{g}$ is an n-Lie superalgebra. Due to properties (L1) and (L2), the map in question is bijective. It is obvious that $\mathfrak{g}$ is simple if and only if the pair $(L, \mu)$ is irreducible. The fact that the linear compactness of $\mathfrak{g}$ implies that of $\mathrm{Lie}\,\mathfrak{g}$ is proved in the same way as Proposition 7.2(c) from [5].

Remark 2.5. If $\mathfrak{g}$ is a finite-dimensional n-Lie algebra, then $\mathrm{Lie}_{-1}\,\mathfrak{g} = \Pi\mathfrak{g}$ is purely odd, hence $\dim W(\Pi\mathfrak{g}) < \infty$ and therefore $\dim \mathrm{Lie}\,\mathfrak{g} < \infty$. In the super case this follows from Theorem 0.2 if $\mathrm{char}\,\mathbb{F} = 0$, and from the fact that any finite-dimensional subspace of $W(V)$ generates a finite-dimensional subalgebra if $\mathrm{char}\,\mathbb{F} > 0$. Thus, an n-Lie superalgebra $\mathfrak{g}$ is finite-dimensional if and only if the Lie superalgebra $\mathrm{Lie}\,\mathfrak{g}$ is finite-dimensional.

Remark 2.6. Let $V$ be a vector superspace. Recall that a sequence of anti-commutative $(n+1)$-ary products $d_n$, $n = 0, 1, \dots$, of parity $n + 1 \bmod 2$ on $V$ endows $V$ with a structure of a homotopy Lie algebra if they satisfy a sequence of certain quadratic identities, which mean that $d_0^2 = 0$, $d_1$ is a Lie (super)algebra bracket modulo the image of $d_0$ and $d_0$ is the derivation of this bracket, etc. [24]. (Usually one also requires a $\mathbb{Z}$-grading on $V$ for which $d_n$ has degree $n - 1$, but we ignore this requirement here.) On the other hand, recall that if $\mu_n$ is an $(n+1)$-ary anti-commutative product on $V$ of parity $n + 1 \bmod 2$, then the $(n+1)$-ary product $\bar\mu_n$, defined in Lemma 1.4, is a commutative odd product on the vector superspace $\Pi V$. It is easy to see that the sequence of $(n+1)$-ary products $\bar\mu_n$ defines a homotopy Lie algebra structure on $V$ if and only if the odd element $\mu = \sum_n \bar\mu_n \in W(\Pi V)$ satisfies the identity $[\mu, \mu] = 0$. As above, we can associate to a given homotopy Lie algebra structure on $V$ the subalgebra of $W(\Pi V)$, denoted by $\mathrm{Lie}(V, \mu)$, which is generated by $W_{-1}(\Pi V)$ and all $\bar\mu_n$, $n = 0, 1, \dots$. If the superspace $V$ is linearly compact and the homotopy Lie algebra is simple with $\bar\mu_n \neq 0$ for some $n > 2$, then the derived algebra of $\mathrm{Lie}(V, \mu)$ is simple, hence is of one of the types $X(m, n)$, according to the classification of [17]. Then the simple homotopy Lie algebra is called of type $X(m, n)$. (Of course, there are many homotopy Lie algebras of a given type.) Lemma 3.1 below shows, in particular, that in characteristic 0 any n-Lie superalgebra of parity $n \bmod 2$ is a homotopy Lie algebra, for which $\bar\mu_j = 0$ if $j \neq n - 1$. This was proved earlier in [8] and [22].

3. Proof of Theorem 0.2

For the sake of simplicity we consider the n-Lie algebra case, i.e. we assume that $L_{-1}$ is purely odd. The same proof works verbatim when $L_{-1}$ is not purely odd, using identity
(2.2). Alternatively, the use of the standard Grassmann envelope argument reduces the case of n-Lie superalgebras of even parity to the case of n-Lie algebras.

First, introduce some notation. Let $S_{2n-1}$ be the group of permutations of the $(2n-1)$-element set $\{1, \dots, 2n-1\}$ and, for $\sigma \in S_{2n-1}$, let $\varepsilon(\sigma)$ be the sign of $\sigma$. Denote by $S$ the subset of $S_{2n-1}$ consisting of permutations $\sigma$, such that $\sigma(1) < \dots < \sigma(n-1)$, $\sigma(n) < \dots < \sigma(2n-1)$. Consider the following subsets of $S$ ($l$ and $s$ stand for "long" and "short" as in [8]):
\[ S^{l_1} = \{\sigma \in S \mid \sigma(2n-1) = 2n-1\}, \qquad S^{s_1} = \{\sigma \in S \mid \sigma(n-1) = 2n-1\}. \]
It is immediate to see that $S = S^{l_1} \cup S^{s_1}$. Likewise, let
\[ \begin{aligned} S^{l_1 l_2} &= \{\sigma \in S^{l_1} \mid \sigma(2n-2) = 2n-2\}, \\ S^{l_1 s_2} &= \{\sigma \in S^{l_1} \mid \sigma(n-1) = 2n-2\}, \\ S^{s_1 l_2} &= \{\sigma \in S^{s_1} \mid \sigma(2n-1) = 2n-2\}, \\ S^{s_1 s_2} &= \{\sigma \in S^{s_1} \mid \sigma(n-2) = 2n-2\}. \end{aligned} \]
Then $S^{l_1} = S^{l_1 l_2} \cup S^{l_1 s_2}$, and $S^{s_1} = S^{s_1 l_2} \cup S^{s_1 s_2}$. Likewise, we define the subsets $S^{a_1 \dots a_k}$, with $a = l$ or $a = s$, for $1 \leq k \leq 2n-1$, so that
\[ S^{a_1 \dots a_{k-1}} = S^{a_1 \dots a_{k-1} s_k} \cup S^{a_1 \dots a_{k-1} l_k}. \tag{3.1} \]
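The first two levels of the decomposition (3.1) can be enumerated directly for small $n$. The sketch below is only an illustration: it assumes $n \geq 3$ (so that position $n-2$ exists), stores a permutation $\sigma$ as a tuple with `sigma[i-1]` equal to $\sigma(i)$, and uses hypothetical function names.

```python
from itertools import combinations
from math import comb


def shuffles(n):
    """All sigma in S: increasing on positions 1..n-1 and on positions n..2n-1."""
    full = range(1, 2 * n)
    out = []
    for first in combinations(full, n - 1):       # values sigma(1) < ... < sigma(n-1)
        rest = tuple(x for x in full if x not in first)
        out.append(first + rest)
    return out


def at(sigma, i):
    """sigma(i) with 1-based position i."""
    return sigma[i - 1]


def first_two_levels(n):
    """Return S and the subsets S^{l1}, S^{s1}, S^{l1 l2}, S^{l1 s2}, S^{s1 l2}, S^{s1 s2}."""
    S = shuffles(n)
    l1 = [s for s in S if at(s, 2 * n - 1) == 2 * n - 1]
    s1 = [s for s in S if at(s, n - 1) == 2 * n - 1]
    l1l2 = [s for s in l1 if at(s, 2 * n - 2) == 2 * n - 2]
    l1s2 = [s for s in l1 if at(s, n - 1) == 2 * n - 2]
    s1l2 = [s for s in s1 if at(s, 2 * n - 1) == 2 * n - 2]
    s1s2 = [s for s in s1 if at(s, n - 2) == 2 * n - 2]
    return S, l1, s1, l1l2, l1s2, s1l2, s1s2


def is_partition(whole, *parts):
    """True if the parts are pairwise disjoint and their union is the whole set."""
    return sum(len(p) for p in parts) == len(whole) and set().union(*parts) == set(whole)
```

For $n = 3$ this gives $|S| = \binom{5}{2} = 10$, split as $6 + 4$ into $S^{l_1}$ and $S^{s_1}$; the unions in the text are disjoint, which is what allows sums over $S^{a_1 \dots a_{k-1}}$ to be split as sums over the two smaller sets.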
Lemma 3.1. If $(L = \prod_{j \geq -1} L_j, \mu)$ is an admissible pair, then $[(\mathrm{ad}\,L_{-1})^{n-j-1} \mu, \mu] = 0$ for every $j = 0, \dots, n-1$.

Proof. If $(L, \mu)$ is an admissible pair, then, by Lemma 2.1(c), $L_0 = (\mathrm{ad}\,L_{-1})^{n-1} \mu$, hence $[(\mathrm{ad}\,L_{-1})^{n-1} \mu, \mu] = 0$ by property (L3). Now we will show that $[(\mathrm{ad}\,L_{-1})^{n-j-1} \mu, \mu] = 0$ for every $j = 1, \dots, n-1$. By Lemma 2.1(b) and property (L3), the Filippov-Jacobi identity holds for elements in $L_{-1}$ with product (1.2). Let $x_1, \dots, x_{n-j-1} \in L_{-1}$. By definition,
\[ [\mu, [x_1, \dots, [x_{n-j-1}, \mu]]] = \mu \square [x_1, \dots, [x_{n-j-1}, \mu]] - (-1)^{j(n-1)} [x_1, \dots, [x_{n-j-1}, \mu]] \square \mu. \]
One checks by a direct calculation, using the Filippov-Jacobi identity, that
\[ \mu \square \mathrm{Alt}[x_1, \dots, [x_{n-j-1}, \mu]] - (-1)^{j(n-1)} [x_1, \dots, [x_{n-j-1}, \mu]] \square \mu = 0, \]
where, for $f \in \mathrm{Hom}(V^{\otimes k}, V)$, $\mathrm{Alt} f \in \mathrm{Hom}(V^{\otimes k}, V)$ denotes the alternator of $f$, i.e., $\mathrm{Alt} f(a_1, \dots, a_k) = \sum_{\sigma \in S_k} f(a_{\sigma(1)}, \dots, a_{\sigma(k)})$. Hence, since $\mu \in W_{n-1}(L_{-1})$, we have
\[ [\mu, [x_1, \dots, [x_{n-j-1}, \mu]]] = ((j+1)! + 1)\, \mu \square [x_1, \dots, [x_{n-j-1}, \mu]]. \]
We will prove a stronger statement than the lemma, namely, we will show that, for every $j = 1, \dots, n-1$ one has:
\[ \mu \square [x_1, \dots, [x_{n-j-1}, \mu]] = 0. \tag{3.2} \]
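Note that, by definition, for $a_1, \dots, a_{n+j} \in L_{-1}$, we have:
\[ \begin{aligned} (\mu \square [x_1, \dots, [x_{n-j-1}, \mu]])(a_1, \dots, a_{n+j}) &= \sum_{\substack{\sigma(1) < \dots < \sigma(j+1) \\ \sigma(j+2) < \dots < \sigma(n+j)}} \varepsilon(\sigma)\, \mu([x_1, \dots, [x_{n-j-1}, \mu]](a_{\sigma(1)}, \dots, a_{\sigma(j+1)}), a_{\sigma(j+2)}, \dots, a_{\sigma(n+j)}) \\ &= \sum_{\substack{\sigma(1) < \dots < \sigma(j+1) \\ \sigma(j+2) < \dots < \sigma(n+j)}} \varepsilon(\sigma)\, \mu(\mu(x_{n-j-1}, \dots, x_1, a_{\sigma(1)}, \dots, a_{\sigma(j+1)}), a_{\sigma(j+2)}, \dots, a_{\sigma(n+j)}). \end{aligned} \]
Therefore (3.2) is equivalent to the following:
\[ \sum_{\sigma \in S^{l_1 \dots l_{n-j-1}}} \varepsilon(\sigma)\, \mu(x_{\sigma(1)}, \dots, x_{\sigma(n-1)}, \mu(x_{\sigma(n)}, \dots, x_{\sigma(2n-1)})) = 0. \tag{3.3} \]
Set $A_\sigma = \varepsilon(\sigma)\, \mu(x_{\sigma(1)}, \dots, x_{\sigma(n-1)}, \mu(x_{\sigma(n)}, \dots, x_{\sigma(2n-1)}))$, $Q^{l_1 \dots l_t} = \sum_{\sigma \in S^{l_1 \dots l_t}} A_\sigma$, and similarly define $Q^{a_1 \dots a_t}$, where $a = s$ or $a = l$. Then (3.3) is equivalent to $Q^{l_1 \dots l_{n-j-1}} = 0$. In fact, we shall prove more:
\[ Q^{l_1 \dots l_t} = Q^{l_1 \dots l_{t-1} s_t} = Q^{s_1 \dots s_{t-1} l_t} = Q^{s_1 \dots s_t} = 0 \quad \text{for } t = 0, \dots, n-2. \tag{3.4} \]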
For $t = 0$ and $t = 1$, equality (3.4) can be proved as in [8, Prop. 2.1]. Namely, by the Filippov-Jacobi identity, for any $\sigma \in S^{s_1}$, $A_\sigma$ can be written as a sum of $n$ elements $A_\tau$, where $\tau \in S^{l_1}$ is such that $\{\tau(1), \dots, \tau(n-1)\} \subset \{\sigma(n), \dots, \sigma(2n-1)\}$. Since the sets $\{\tau(1), \dots, \tau(n-1)\}$ and $\{\sigma(n), \dots, \sigma(2n-1)\}$ have $n-1$ and $n$ elements, respectively, there exists only one $i$ such that $\{\tau(1), \dots, \tau(n-1)\} \cup \{i\} = \{\sigma(n), \dots, \sigma(2n-1)\}$. Then $i \leq 2n-2$ and $i \neq \tau(1), \dots, \tau(n-1)$. Therefore there are $n-1$ possibilities to choose $i$. It follows that
\[ Q^{s_1} = (n-1)\, Q^{l_1}. \tag{3.5} \]
Likewise, by the Filippov-Jacobi identity, for any $\sigma \in S^{l_1}$, $A_\sigma$ can be written as a sum of one element $A_\rho$ with $\rho \in S^{l_1}$, and $n-1$ elements $A_\tau$, with $\tau \in S^{s_1}$ such that $\{\tau(1), \dots, \tau(n-2)\} \subset \{\sigma(n), \dots, \sigma(2n-2)\}$. As above, there exists only one $i$ such that $\{\tau(1), \dots, \tau(n-2)\} \cup \{i\} = \{\sigma(n), \dots, \sigma(2n-2)\}$. Such an $i$ is different from $\tau(1), \dots, \tau(n-2)$ and $i \leq 2n-2$. Hence there are $n$ possibilities to choose $i$. Notice that
\[ \rho = \begin{pmatrix} 1 & \dots & n-1 & n & \dots & 2n-2 & 2n-1 \\ \sigma(n) & \dots & \sigma(2n-2) & \sigma(1) & \dots & \sigma(n-1) & 2n-1 \end{pmatrix}, \tag{3.6} \]
hence $\varepsilon(\rho) = (-1)^{n-1} \varepsilon(\sigma)$. Therefore
\[ Q^{l_1} = n\, Q^{s_1} + (-1)^{n-1} Q^{l_1}. \tag{3.7} \]
Equations (3.5) and (3.7) form a system of two linear equations in the two indeterminates $Q^{s_1}$ and $Q^{l_1}$, whose determinant is equal to $n^2 - n - 1 - (-1)^n$, which is different from zero for every $n > 2$. It follows that $Q^{s_1} = 0 = Q^{l_1}$, i.e., (3.4) is proved for $t = 1$. Since, as we have already noticed, $S = S^{l_1} \cup S^{s_1}$, (3.4) for $t = 0$ also follows.
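The determinant quoted for the system (3.5), (3.7) is easy to confirm numerically. The sketch below assumes the unknowns are ordered as $(Q^{s_1}, Q^{l_1})$ and both equations are moved to the left-hand side; the function names are hypothetical.

```python
def det_t1(n):
    """Determinant of the system (3.5), (3.7) in the unknowns (Q^{s_1}, Q^{l_1}).

    Row 1:  Q^{s_1} - (n-1) Q^{l_1} = 0
    Row 2:  n Q^{s_1} + ((-1)^(n-1) - 1) Q^{l_1} = 0
    """
    m = [[1, -(n - 1)],
         [n, (-1) ** (n - 1) - 1]]
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def closed_form(n):
    """The value n^2 - n - 1 - (-1)^n stated in the text."""
    return n * n - n - 1 - (-1) ** n
```

The two expressions agree, the value vanishes at $n = 2$ and is positive for every $n \geq 3$ (already 6 at $n = 3$), which matches the role of the hypothesis $n > 2$ in this step.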
Now we argue by induction on t. We already proved (3.4) for t = 0 and t = 1. Assume that Q l1 ...lt−1 = Q l1 ...lt−2 st−1 = Q s1 ...st−2 lt−1 = Q s1 ...st−1 = 0 for some 1 ≤ t < n − 2. Similarly as above, by the Filippov-Jacobi identity, for any σ ∈ S s1 ...st , Aσ can be written as a sum of n elements Aτ with τ ∈ Sl1 ...lt , such that {τ (1), . . . , τ (n − 1)} ⊂ {σ (n), . . . , σ (2n − 1)}, i.e., {τ (1), . . . , τ (n − 1)} ∪ {i} = {σ (n), . . . , σ (2n − 1)}, for some i ≤ 2n − t − 1 and i = τ (1), . . . , τ (n − 1). It follows that there are n − t choices for i, hence Q s1 ...st = (n − t)Q l1 ...lt .
(3.8)
Likewise, if σ ∈ S s1 ...st−1 lt , then, by the Filippov-Jacobi identity, Aσ can be written as a sum of one element Aρ with ρ ∈ Sl1 ...lt as in (3.6), and n − 1 elements Aτ with τ ∈ Sl1 ...lt−1 st , such that {τ (1), . . . , τ (n − 2)} ⊂ {σ (n), . . . , σ (2n − 2)}. As above, there exists only one i such that {τ (1), . . . , τ (n − 2)} ∪ {i} = {σ (n), . . . , σ (2n − 2)}. Then i ≤ 2n − t − 1, i = τ (1), . . . , τ (n − 2). It follows that Q s1 ...st−1 lt = (n − t + 1)Q l1 ...lt−1 st + (−1)n−1 Q l1 ...lt .
(3.9)
Then, using (3.1) and the inductive hypotheses $Q_{s_1 \dots s_{t-1}} = 0 = Q_{l_1 \dots l_{t-1}}$, we get the following system of linear equations:
$$\begin{cases} Q_{s_1 \dots s_t} = (n - t)\, Q_{l_1 \dots l_t} \\ Q_{s_1 \dots s_{t-1} l_t} = (n - t + 1)\, Q_{l_1 \dots l_{t-1} s_t} + (-1)^{n-1} Q_{l_1 \dots l_t} \\ Q_{s_1 \dots s_t} + Q_{s_1 \dots s_{t-1} l_t} = 0 \\ Q_{l_1 \dots l_t} + Q_{l_1 \dots l_{t-1} s_t} = 0, \end{cases} \tag{3.10}$$
whose determinant is equal to $(-1)^n + 1$. It follows that if $n$ is even then $Q_{l_1 \dots l_t} = 0 = Q_{s_1 \dots s_t} = Q_{s_1 \dots s_{t-1} l_t} = Q_{l_1 \dots l_{t-1} s_t}$, hence (3.4) is proved.

Now assume that $n$ is odd. Then (3.10) reduces to
$$\begin{cases} Q_{s_1 \dots s_t} = (n - t)\, Q_{l_1 \dots l_t} \\ Q_{s_1 \dots s_{t-1} l_t} = (n - t + 1)\, Q_{l_1 \dots l_{t-1} s_t} + Q_{l_1 \dots l_t} \\ Q_{s_1 \dots s_t} + Q_{s_1 \dots s_{t-1} l_t} = 0. \end{cases} \tag{3.11}$$
Using the Filippov-Jacobi identity as above, one gets the following system of linear equations:
$$\begin{cases} Q_{s_1 \dots s_{t-2} l_{t-1} l_t} = (n - t + 2)\, Q_{l_1 \dots l_{t-2} s_{t-1} s_t} - Q_{l_1 \dots l_{t-2} s_{t-1} l_t} + Q_{l_1 \dots l_{t-2} l_{t-1} s_t} \\ Q_{s_1 \dots s_{t-2} l_{t-1} s_t} = (n - t + 1)\, Q_{l_1 \dots l_{t-2} s_{t-1} l_t} + Q_{l_1 \dots l_{t-2} l_{t-1} l_t}. \end{cases} \tag{3.12}$$
Besides, using the inductive hypotheses, we get:
$$\begin{cases} Q_{s_1 \dots s_{t-2} l_{t-1} l_t} + Q_{s_1 \dots s_{t-2} l_{t-1} s_t} = Q_{s_1 \dots s_{t-2} l_{t-1}} = 0 \\ Q_{l_1 \dots l_{t-2} s_{t-1} s_t} + Q_{l_1 \dots l_{t-2} s_{t-1} l_t} = Q_{l_1 \dots l_{t-2} s_{t-1}} = 0 \\ Q_{l_1 \dots l_{t-2} l_{t-1} s_t} + Q_{l_1 \dots l_{t-2} l_{t-1} l_t} = Q_{l_1 \dots l_{t-2} l_{t-1}} = 0. \end{cases} \tag{3.13}$$
Taking the sum of the two equations in (3.12), and using the three equations in (3.13), we get: $Q_{l_1 \dots l_{t-2} s_{t-1} s_t} = Q_{l_1 \dots l_{t-2} s_{t-1} l_t} = 0$. By arguing in the same way, one shows that
$$Q_{l_1 \dots l_k s_{k+1} l_{k+2} \dots l_t} = 0 \quad \text{for every } k = 0, \dots, t - 2. \tag{3.14}$$
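As a sanity check on the determinant stated for system (3.10), the following minimal Python sketch encodes the four equations as an integer matrix and evaluates its determinant by cofactor expansion. The ordering of the unknowns and the helper names `det` and `system_310` are choices made here purely for illustration; they are not notation from the paper.

```python
def det(m):
    """Determinant of a square integer matrix by cofactor expansion along row 0."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def system_310(n, t):
    """Coefficient matrix of (3.10).

    Columns (an ordering assumed here): Q_{s..s}, Q_{s..s l}, Q_{l..l s}, Q_{l..l}.
    Each row is one equation of (3.10) moved to the form "... = 0".
    """
    sign = (-1) ** (n - 1)
    return [
        [1, 0, 0, -(n - t)],
        [0, 1, -(n - t + 1), -sign],
        [1, 1, 0, 0],
        [0, 0, 1, 1],
    ]


for n in range(3, 8):
    for t in range(1, n - 1):
        print(n, t, det(system_310(n, t)))
```

With this ordering the determinant equals $(-1)^n + 1$ for every $t$: it is non-zero for even $n$, so only the trivial solution exists, and it vanishes for odd $n$, which is why the odd case needs the additional equations derived next.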
Finally, using the Filippov-Jacobi identity as above, one gets the following system of linear equations:
$$\begin{cases} Q_{l_1 \dots l_t} = n\, Q_{s_1 \dots s_t} + (-1)^{n+t-2} Q_{s_1 \dots s_{t-1} l_t} + \cdots - Q_{s_1 l_2 s_3 \dots s_t} + Q_{l_1 s_2 \dots s_t} \\ Q_{s_1 \dots s_{t-2} l_{t-1} s_t} = (n - t + 1)\, Q_{l_1 \dots l_{t-2} s_{t-1} l_t} + Q_{l_1 \dots l_t} \\ \quad \vdots \\ Q_{l_1 s_2 \dots s_t} = (n - t + 1)\, Q_{s_1 l_2 \dots l_t} + Q_{l_1 \dots l_t}, \end{cases}$$
which reduces, by (3.14), to the following:
$$\begin{cases} Q_{l_1 \dots l_t} = n\, Q_{s_1 \dots s_t} + (-1)^{n+t-2} Q_{s_1 \dots s_{t-1} l_t} + \cdots - Q_{s_1 l_2 s_3 \dots s_t} + Q_{l_1 s_2 \dots s_t} \\ Q_{s_1 \dots s_{t-2} l_{t-1} s_t} = Q_{l_1 \dots l_t} \\ \quad \vdots \\ Q_{l_1 s_2 \dots s_t} = Q_{l_1 \dots l_t}. \end{cases} \tag{3.15}$$
System (3.15) implies the following equation:
$$Q_{l_1 \dots l_t} = n\, Q_{s_1 \dots s_t} + (-1)^{n+t-2} Q_{s_1 \dots s_{t-1} l_t} + \frac{(-1)^t + 1}{2}\, Q_{l_1 \dots l_t}.$$
This equation together with (3.11) form a system of four linear equations in four indeterminates, whose determinant is equal to
$$(n - t + 1)\left((-1)^{n-t}(n - t) + \frac{1 - (-1)^t}{2} - n(n - t)\right).$$
It is different from 0 for every $t = 1, \dots, n - 2$. Hence $Q_{l_1 \dots l_t} = 0 = Q_{l_1 \dots l_{t-1} s_t} = Q_{s_1 \dots s_{t-1} l_t} = Q_{s_1 \dots s_t}$, and (3.4) is proved.

Proof of Theorem 0.2. Any element of the subalgebra generated by $L_{-1}$ and $\mu$ is a linear combination of elements of the form:
$$[\dots [[\dots [[[\dots [\mu, a_1], \dots, a_s], \mu], b_1], \dots, b_k], \mu], \dots ] \tag{3.16}$$
with a1 , . . . , as , b1 , . . . , bk , . . . in L −1 . By Lemma 3.1, every element of the form [[[. . . [μ, a1 ], . . . , as ], μ] is either 0 or an element in L −1 , therefore we can assume that μ appears only once in (3.16), i.e., any element of L lies in [. . . [μ, L −1 ], . . . , L −1 ]. Remark 3.2. We conjecture that Theorem 0.2 holds also in non-zero characteristic if char F ≥ n. Our argument works for char F > (n − 1)2 . The following example shows that Theorem 0.2 (and Theorem 0.1) fails if 0 < char F < n. Let g = Fa be a 1-dimensional odd space, which we endow by the following n-bracket: [a, a, . . . , a] = a. The Filippov-Jacobi identity holds if n = sp + 1, where p = char F and s is a positive integer. However, Lie g is not of the form described by Theorem 0.2. 4. Classification of Irreducible Admissible Pairs First, we briefly recall some examples of Z-graded linearly compact Lie superalgebras over a field F of characteristic 0, and some of their properties. For more details, see [17] and [4]. Given a finite-dimensional vector superspace V of dimension (m|n) (i.e. dim V0¯ = m, dim V1¯ = n), the universal Lie superalgebra W (V ) is isomorphic to the Lie superalgebra W (m, n) of continuous derivations of the tensor product F(m, n) of the algebra of
formal power series in $m$ commuting variables $x_1, \dots, x_m$ and the Grassmann algebra in $n$ anti-commuting variables $\xi_1, \dots, \xi_n$. Elements of $W(m, n)$ can be viewed as linear differential operators of the form
$$X = \sum_{i=1}^{m} P_i(x, \xi)\, \frac{\partial}{\partial x_i} + \sum_{j=1}^{n} Q_j(x, \xi)\, \frac{\partial}{\partial \xi_j}, \qquad P_i, Q_j \in \mathbb{F}(m, n).$$
The Lie superalgebra $W(m, n)$ is simple linearly compact (and it is finite-dimensional if and only if $m = 0$). Letting $\deg x_i = -\deg \frac{\partial}{\partial x_i} = k_i$, $\deg \xi_i = -\deg \frac{\partial}{\partial \xi_i} = s_i$, where $k_i, s_i \in \mathbb{Z}$, defines a $\mathbb{Z}$-grading on $W(m, n)$, called the $\mathbb{Z}$-grading of type $(k_1, \dots, k_m | s_1, \dots, s_n)$. Any $\mathbb{Z}$-grading of $W(m, n)$ is conjugate (i.e. can be mapped by an automorphism of $W(m, n)$) to one of these. Clearly, such a grading has finite depth $d$ (meaning that $W(m, n)_j \ne 0$ if and only if $j \ge -d$) if and only if $k_i \ge 0$ for all $i$. It is easy to show that the depth $d = 1$ if all $k_i$'s and $s_i$'s are 0 or 1, or if all $k_i$'s are 0, $s_j = -1$ for some $j$, and $s_i = 0$ for every $i \ne j$.

Now we shall describe some closed (hence linearly compact) subalgebras of $W(m, n)$. First, given a subalgebra $L$ of $W(m, n)$, a continuous linear map $\mathrm{Div} : L \to \mathbb{F}(m, n)$ is called a divergence if the action $\pi_\lambda$ of $L$ on $\mathbb{F}(m, n)$, given by
$$\pi_\lambda(X) f = X f + (-1)^{p(X) p(f)} \lambda f\, \mathrm{Div}\, X, \qquad X \in L,$$
is a representation of $L$ in $\mathbb{F}(m, n)$ for any $\lambda \in \mathbb{F}$. Note that
$$S'_{\mathrm{Div}}(L) := \{X \in L \mid \mathrm{Div}\, X = 0\}$$
is a closed subalgebra of $L$. We denote by $S_{\mathrm{Div}}(L)$ its derived subalgebra (recall that the derived subalgebra of $\mathfrak{g}$ is $[\mathfrak{g}, \mathfrak{g}]$). An example of a divergence on $L = W(m, n)$ is the following, denoted by div:
$$\mathrm{div}\left(\sum_{i=1}^{m} P_i \frac{\partial}{\partial x_i} + \sum_{j=1}^{n} Q_j \frac{\partial}{\partial \xi_j}\right) = \sum_{i=1}^{m} \frac{\partial P_i}{\partial x_i} + \sum_{j=1}^{n} (-1)^{p(Q_j)} \frac{\partial Q_j}{\partial \xi_j}.$$
Hence for any $\lambda \in \mathbb{F}$ we get the representation $\pi_\lambda$ of $W(m, n)$ in $\mathbb{F}(m, n)$. Also, we get closed subalgebras $S'_{\mathrm{div}}(W(m, n)) \supset S_{\mathrm{div}}(W(m, n))$ denoted by $S'(m, n) \supset S(m, n)$. Recall that $S'(m, n) = S(m, n)$ is simple if $m > 1$, and
$$S'(1, n) = S(1, n) \oplus \mathbb{F}\, \xi_1 \dots \xi_n \frac{\partial}{\partial x_1}, \tag{4.1}$$
where $S(1, n)$ is a simple ideal. The $\mathbb{Z}$-gradings of type $(k_1, \dots, k_m | s_1, \dots, s_n)$ of $W(m, n)$ induce ones on $S'(m, n)$ and $S(m, n)$ and any $\mathbb{Z}$-grading is conjugate to those. The description of $\mathbb{Z}$-gradings of depth 1 for $S'(m, n)$ and $S(m, n)$ is the same as for $W(m, n)$.

Next examples of subalgebras of $W(m, n)$, needed in this paper, are of the form $L(\omega) = \{X \in W(m, n) \mid X\omega = 0\}$, where $\omega$ is a differential form.
In the case $m = 2k$ is even, consider the symplectic differential form
$$\omega_s = 2 \sum_{i=1}^{k} dx_i \wedge dx_{k+i} + \sum_{i=1}^{n} d\xi_i\, d\xi_{n-i+1}.$$
The corresponding subalgebra $L(\omega_s)$ is denoted by $H'(m, n)$ and is called a Hamiltonian superalgebra. This superalgebra is simple, hence coincides with its derived subalgebra $H(m, n)$, unless $m = 0$, when the Hamiltonian superalgebra is finite-dimensional. It is convenient to consider the “Poisson” realization of $H'(m, n)$. For that let $p_i = x_i$, $q_i = x_{k+i}$, $i = 1, \dots, k$, and introduce on $\mathbb{F}(m, n)$ the structure of a Poisson superalgebra $P(m, n)$ by letting the non-zero brackets between generators be as follows: $\{p_i, q_i\} = 1 = \{\xi_i, \xi_{n-i+1}\}$, and extend by the Leibniz rule. Then the map $P(m, n) \to H'(m, n)$, given by
$$f \mapsto \sum_{i=1}^{k} \left(\frac{\partial f}{\partial p_i} \frac{\partial}{\partial q_i} - \frac{\partial f}{\partial q_i} \frac{\partial}{\partial p_i}\right) - (-1)^{p(f)} \sum_{i=1}^{n} \frac{\partial f}{\partial \xi_i} \frac{\partial}{\partial \xi_{n-i+1}},$$
defines a surjective Lie superalgebra homomorphism with kernel $\mathbb{F}1$. Thus, $H'(m, n) = P(m, n)/\mathbb{F}1$. In this realization $H(0, n)$ is spanned by all monomials in $\xi_i$ mod $\mathbb{F}1$ except for the one of top degree, and we have:
$$H'(0, n) = H(0, n) \oplus \mathbb{F}\, \xi_1 \dots \xi_n. \tag{4.2}$$
Note that $H(0, n)$ is simple if and only if $n \ge 4$. All $\mathbb{Z}$-gradings of depth 1 of $H'(0, n)$ are, up to conjugacy, those of type $(|1, \dots, 1)$, $(|1, 0, \dots, 0, -1)$, and $(|\underbrace{1, \dots, 1}_{n/2}, 0, \dots, 0)$, if $n$ is even [5].

Another example is $HO(n, n) = L(\omega_{os}) \subset W(n, n)$, where $\omega_{os} = \sum_{i=1}^{n} dx_i\, d\xi_i$ is an odd symplectic form. This Lie superalgebra is simple if and only if $n \ge 2$. It contains the subalgebra $SHO'(n, n) = HO(n, n) \cap S'(n, n)$, which is important for this paper. Its derived subalgebra $SHO(n, n)$ is simple if and only if $n \ge 3$. Again, it is convenient to consider a “Poisson” realization of $HO(n, n)$. For this consider the Buttin bracket on $\mathbb{F}(n, n)$:
$$\{f, g\}_B = \sum_{i=1}^{n} \left(\frac{\partial f}{\partial x_i} \frac{\partial g}{\partial \xi_i} - (-1)^{p(f)} \frac{\partial f}{\partial \xi_i} \frac{\partial g}{\partial x_i}\right).$$
This is a Lie superalgebra, which we denote by $PO(n, n)$, and the map $PO(n, n) \to HO(n, n)$, given by
$$f \mapsto \sum_{i=1}^{n} \left(\frac{\partial f}{\partial x_i} \frac{\partial}{\partial \xi_i} - (-1)^{p(f)} \frac{\partial f}{\partial \xi_i} \frac{\partial}{\partial x_i}\right)$$
is a surjective Lie superalgebra homomorphism, whose kernel is $\mathbb{F}1$. Thus, $HO(n, n) = PO(n, n)/\mathbb{F}1$. In this realization we have:
$$SHO'(n, n) = \{f \in PO(n, n)/\mathbb{F}1 \mid \Delta f = 0\},$$
where $\Delta = \sum_{i=1}^{n} \frac{\partial}{\partial x_i} \frac{\partial}{\partial \xi_i}$ is the odd Laplace operator. Then $SHO(n, n)$ is an ideal of codimension 1 in $SHO'(n, n)$, and we have:
$$SHO'(n, n) = SHO(n, n) \oplus \mathbb{F}\, \xi_1 \dots \xi_n. \tag{4.3}$$
All $\mathbb{Z}$-gradings of depth 1 of $SHO'(n, n)$ are, up to conjugacy, those of type $(1, \dots, 1 | 1, \dots, 1)$, $(0, \dots, 0, 1 | 0, \dots, 0, -1)$, and $(\underbrace{1, \dots, 1}_{k}, 0, \dots, 0 \,|\, \underbrace{0, \dots, 0}_{k}, 1, \dots, 1)$, where $k = 0, \dots, n$ [5].

The next example important for us is
$$KO(n, n + 1) = \{X \in W(n, n + 1) \mid X\omega_{oc} = f\omega_{oc} \text{ for some } f \in \mathbb{F}(n, n + 1)\},$$
where $\omega_{oc} = d\xi_{n+1} + \sum_{i=1}^{n} (\xi_i\, dx_i + x_i\, d\xi_i)$ is an odd contact form. This superalgebra is simple for all $n \ge 1$. Another realization of this Lie superalgebra is $PO(n, n + 1) = \mathbb{F}(n, n + 1)$ with the bracket
$$\{f, g\}_{BO} = (2 - E)f\, \frac{\partial g}{\partial \xi_{n+1}} - (-1)^{p(f)} \frac{\partial f}{\partial \xi_{n+1}} (2 - E)g - \sum_{i=1}^{n} \left(\frac{\partial f}{\partial x_i} \frac{\partial g}{\partial \xi_i} - (-1)^{p(f)} \frac{\partial f}{\partial \xi_i} \frac{\partial g}{\partial x_i}\right),$$
where $E = \sum_{i=1}^{n} \left(x_i \frac{\partial}{\partial x_i} + \xi_i \frac{\partial}{\partial \xi_i}\right)$. The isomorphism $PO(n, n + 1) \to KO(n, n + 1)$ is given by
$$f \mapsto (2 - E)f\, \frac{\partial}{\partial \xi_{n+1}} + (-1)^{p(f)} \frac{\partial f}{\partial \xi_{n+1}} E - \sum_{i=1}^{n} \left(\frac{\partial f}{\partial x_i} \frac{\partial}{\partial \xi_i} - (-1)^{p(f)} \frac{\partial f}{\partial \xi_i} \frac{\partial}{\partial x_i}\right).$$
It turns out that for each $\beta \in \mathbb{F}$ the Lie superalgebra $KO(n, n + 1)$ admits a divergence
$$\mathrm{div}_\beta f = \Delta f + (E - n\beta) \frac{\partial f}{\partial \xi_{n+1}}, \qquad f \in PO(n, n + 1).$$
We let $SKO'(n, n + 1; \beta) = \{f \in PO(n, n + 1) \mid \mathrm{div}_\beta f = 0\}$. This Lie superalgebra is not always simple, but its derived algebra, denoted by $SKO(n, n + 1; \beta)$, is simple if and only if $n \ge 2$. In fact, $SKO'(n, n + 1; \beta) = SKO(n, n + 1; \beta)$, unless $\beta = 1$ or $\beta = \frac{n-2}{n}$. In the latter cases $SKO(n, n + 1; \beta)$ is an ideal of codimension 1 in $SKO'(n, n + 1; \beta)$, and we have:
$$SKO'(n, n + 1; 1) = SKO(n, n + 1; 1) + \mathbb{F}\, \xi_1 \dots \xi_{n+1}, \tag{4.4}$$
$$SKO'\left(n, n + 1; \tfrac{n-2}{n}\right) = SKO\left(n, n + 1; \tfrac{n-2}{n}\right) + \mathbb{F}\, \xi_1 \dots \xi_n. \tag{4.5}$$
All $\mathbb{Z}$-gradings of depth 1 of $SKO'(n, n + 1; \beta)$ are, up to conjugacy, of type $(0, \dots, 0, 1 | 0, \dots, 0, -1, 0)$, and $(\underbrace{1, \dots, 1}_{k}, 0, \dots, 0 \,|\, \underbrace{0, \dots, 0}_{k}, 1, \dots, 1)$, where $k = 0, \dots, n$ [5].
Theorem 4.1. Let $(L = \oplus_{j=-1}^{n-1} L_j, \mu)$ be an irreducible admissible pair over an algebraically closed field $\mathbb{F}$ of characteristic 0, where $L$ is a linearly compact Lie superalgebra, and $n > 2$. Then
(a) $L = \oplus_{j=-1}^{n-1} L_j$ is a semidirect product of the simple ideal $S = \oplus_{j=-1}^{n-2} L_j$ and the 1-dimensional subalgebra $L_{n-1} = \mathbb{F}\mu$, where $\mu$ is an outer derivation of $L$, such that $[\mu, L_0] = 0$.
(b) The pair $(L, \mu)$ is isomorphic to one of the following four irreducible admissible pairs:
(i) $(H'(0, n + 1), \xi_1 \dots \xi_{n+1})$, $n \ge 3$, with the grading of type $(|1, \dots, 1)$;
(ii) $(SHO'(n, n), \xi_1 \dots \xi_n)$, $n \ge 3$, with the grading of type $(0, \dots, 0 | 1, \dots, 1)$;
(iii) $(SKO'(n - 1, n; 1), \xi_1 \dots \xi_{n-1} \xi_n)$, $n \ge 3$, with the grading of type $(0, \dots, 0 | 1, \dots, 1)$;
(iv) $(S'(1, n - 1), \xi_1 \dots \xi_{n-1} \frac{\partial}{\partial x})$, $n \ge 3$, with the grading of type $(0 | 1, \dots, 1)$.
Proof. The decomposition L = S Fμ in (a) follows from Theorem 0.2. The fact that S is simple is proved in the same way as in [5, Th. 7.3]. Indeed, S is the minimal among non-zero closed ideals of L, since if I is a non-zero closed ideal of L, then I ∩ L −1 = ∅ by transitivity, hence, by irreducibility, I ∩ L −1 = L −1 , from which it follows that I contains S. Next, by the super-analogue of Cartan-Guillemin’s theorem [1,15], estabˆ lished in [10], S = S ⊗(m, h), for some simple linearly compact Lie superalgebra S ˆ ˆ and some m, h ∈ Z≥0 , and μ lies in Der (S ⊗O(m, h)). h)) = Since Der (S ⊗O(m, ˆ Der S ⊗O(m, h) + 1 ⊗ W (m, h) [10], we have: μ = i (di ⊗ ai ) + 1 ⊗ μ for some di ∈ Der S , ai ∈ O(m, h) and μ ∈ W (m, h). First consider the case when μ is even. Then μ is an even element of W (m, h), ˆ hence, by the minimality of the ideal S ⊗O(m, h), h = 0. Now suppose m ≥ 1. If μ lies in the non-negative part of W (m, 0) with the grading of type (1, . . . , 1|), then ˆ the ideal generated by S x1 is a proper μ-invariant ideal of S ⊗O(m, 0), contradicting its minimality. Therefore we may assume, up to a linear change of indeterminates, that μ = ∂∂x1 + D, for some derivation D lying in the non-negative part of W (m, 0). Since μ lies in L n−1 , we have deg(x1 ) = −n + 1, but this is a contradiction since the Z-grading of L has depth 1. It follows that m = 0. Now consider the case when μ is odd. Consider the grading of W (m, h) of type (1, . . . , 1|1, . . . , 1), and denote by W (m, h)≥0 its non-negative part. If μ ∈ W (m, h)≥0 , ˆ then the minimality of the ideal S ⊗O(m, h) implies m = h = 0. Now suppose that h ≥ 1 and that μ has a non-zero projection on W (m, h)−1 . Then, up to a linear change of indeterminates, μ = ∂ξ∂ 1 + D for some odd derivation D ∈ W (m, h)≥0 . Since μ lies in L n−1 , we have deg(ξ1 ) = −n + 1 < −1. Since L ⊃ S ξ1 and the grading of L has depth 1, it follows that every element in S has positive degree, but this is a contradiction since S is simple. 
This concludes the proof of the simplicity of S. In order to prove (b), note that the grading operator D of the simple Z-graded Lie superalgebra S is an outer derivation (since [μ, L 0 ] = 0, D ∈ / L 0 ). Another outer derivation of S is μ. From the classification of simple linearly compact Lie superalgebras [16,17] and their derivations in [16,17], [2, Prop. 1.8] (see also Lemma 4.3 below), we see that the only possibilities for L are H (0, n + 1), S H O (n, n), S K O (n − n−2 1, n; 1), S K O n − 1, n; n , and S (1, n) for n ≥ 3. From the description of Zgradings of depth 1 of these Lie superalgebras, given above, it follows that L = S K O n − 1, n; n−1 is ruled out, whereas for the remaining four possibilities for L n only the grading of type (0, . . . , 0|1, . . . , 1) is possible, and for them indeed L n−1 = Fμ, where μ is as described above. It is immediate to check that in these four cases the pair (L , μ) is admissible. Irreducibility of the L 0 -module L −1 follows automatically from the simplicity of S since its depth is 1. Remark 4.2. In cases (i)–(iv) of Theorem 4.1(b) the subalgebra L 0 and the L 0 -module L −1 are as follows: (i) L 0 ∼ = son+1 (F), L −1 = Fn+1 with the standard action of son+1 (F); (ii) L 0 ∼ = S(n, 0), L −1 = F[[x1 , . . . , xn ]]/F1, where F[[x1 , . . . , xn ]] is the standard module over S(n, 0); (iii) L 0 ∼ = W (n − 1, 0), L −1 = F[[x1 , . . . , xn−1 ]], which carries the representation πλ=−1 of W (n − 1, 0); (iv) L 0 ∼ = W (1, 0) sln−1 (F[[x]]), L −1 = Fn−1 ⊗ F[[x]] with the standard action of sln−1 (F[[x]]) and the representation πλ=−1/(n−1) of W (1, 0) on F[[x]].
As we have seen, an important part of the classification of irreducible admissible pairs is the description of derivations of simple linearly compact Lie algebras. This description is based on the following simple lemma. Lemma 4.3. Let L be a linearly compact Lie superalgebra and let a be a reductive subalgebra of L (i.e. the adjoint representation of a on L decomposes in a direct product of finite-dimensional irreducible a-modules). Then any continuous derivation of L is a sum of an inner derivation and a derivation commuting with the adjoint action of a. Proof. [16] We have closed a-submodules: Inder L ⊂ Der L ⊂ End L , where Inder L and Der L denote the subspaces of all inner derivations and all continuous derivations of the Lie superalgebra L in the space of continuous endomorphisms of the linearly compact vector space L. Since L = j V j , where V j are finite-dimensional irreducible a-modules, we have: End L = i, j Hom (Vi , V j ), hence End L, and therefore Der L, decomposes into a direct product of irreducible a-submodules. Hence Der L = Inder L ⊕ V, where V is an a-submodule. But aV ⊂ Inder L since Inder L is an ideal in Der L. Hence aV = 0, i.e., any derivation from V commutes with the adjoint action of a on L. 5. Classification of Simple Linearly Compact n-Lie Algebras Over a Field of Characteristic 0, and Their Derivations Proof of Theorem 0.1. By Proposition 2.4, the classification of simple linearly compact n-Lie algebras is equivalent to the classification of admissible pairs (L , μ), for which L is linearly compact. The list of the latter consists of the four examples (i)–(iv) given in Theorem 4.1(b). It is easy to see that the corresponding n-Lie algebras are O n , S n , W n and SW n . (By Lemma 1.4(a), we automatically get from [μ, L 0 ] = 0 that the Filippov-Jacobi identity indeed holds.) The notation for the four simple n-Lie algebras comes from the following fact. Proposition 5.1. 
(a) The Lie algebra of continuous derivations of the n-Lie algebras O n , S n , W n and SW n is isomorphic to son+1 (F), S(n, 0), W (n−1, 0) and W (1, 0) sln−1 (F[[x]]), respectively. Its representation on the n-Lie algebra is described in Remark 4.2. (b) All continuous derivations of a simple linearly compact n-Lie algebra g over an algebraically closed field of characteristic 0 lie in the closure Inder g of the span of the inner ones. Proof. Let g be one of the four simple n-Lie algebras and let Der g be the Lie algebra of all continuous derivations of g. Then L 0 := Inder g is an ideal of Der g. By Remark 4.2, L 0 is isomorphic to the Lie algebras listed in (a). But all derivations of the Lie algebras L 0 = son+1 (F), W (n − 1, 0) and W (1, 0) sln−1 (F[[x]]) are inner, and Der S(n, 0) = S(n, 0) ⊕ FE, where E = i xi ∂∂xi . This is well known, except for the case L = W (1, 0) sln−1 (F[[x]]). We apply Lemma 4.3 to this case, taking
a = Fx ddx ⊕ sln−1 (F). If D is an endomorphism of the vector space L, commuting with a, we have, by Schur’s lemma: d d = αk x k , D(x k a) = βk x k a, for a ∈ sln−1 (F), where αk , βk ∈ F. D xk dx dx Since D is also a derivation of L, we conclude that D is a multiple of ad x ddx . Let now D ∈ Der g\( Inder g = L 0 ). Since [D, L 0 ] ⊂ L 0 , D induces a derivation of L 0 . Since all derivations of L 0 are inner, except for E in the case g ∼ = S n , but E is not a derivation of g, we conclude that there exists a ∈ L 0 , such that D| L 0 = ad a| L 0 . Therefore D = D − a commutes with the action of L 0 on L −1 . But the latter representation is described in Remark 4.2, and, clearly, in all cases the only operators, commuting with the representation operators of L 0 on L −1 , are scalars. Since a non-zero scalar cannot be a derivation of g, we conclude that D = 0, hence Der g = L 0 . In conclusion we discuss F-forms of the four simple n-Lie algebras, where F is a field of characteristic 0. Let F ⊃ F be the algebraic closure of F. Given a linearly compact n-Lie algebra g over F, its F-form is defined as an n-Lie algebra gF over F, such that F ⊗F gF is isomorphic to g. Due to the bijection given by Proposition 2.4, the F-forms of g are in one-to-one correspondence with the F-forms of the Z-graded Lie superalgebras Sg = [Lie g, Lie g]. But the latter are parameterized by the set H 1 (Gal, Aut Sg), where Gal is the Galois group of F over F, and Aut Sg is the group of continuous automorphisms of the Lie superalgebra Sg, preserving its Z-grading (cf. [3]). By the method of [3] it is easy to compute the group Aut Sg, using Remark 4.2. Proposition 5.2. One has: Aut Sg = G g U, where U is a prounipotent group and G g is a reductive group, isomorphic to On+1 (F), × G L n (F), G L n−1 (F) and F × S L n−1 (F), if g is isomorphic to O n , S n , W n and SW n over F, respectively. We have H 1 (Gal, Aut Sg) = H 1 (G g, Gal) (see, e.g., [3]). 
Furthermore, H 1 (G g, Gal) = 1 in the last three cases of Proposition 5.2, hence the only F-forms of S n , W n and SW n over F are S n , W n and SW n over F. Finally, it follows from [3] that the F-forms of the Z-graded Lie superalgebra H (0, n + 1) are the derived algebras of the Lie superalgebras P/F1, where P is a Poisson algebra, defined by (0.6). Hence F-forms of O n are vector product n-Lie algebras on Fn+1 , n ≥ 3, with a non-degenerate symmetric bilinear form (up to isomorphism, these n-Lie algebras depend on the equivalence class of the bilinear form up to a non-zero factor). Acknowledgements. We would like to thank T. Friedmann and J. Thierry-Mieg, who drew our attention to this topic, and D. Balibanu, A. Dzhumadildaev, A. Kiselev and E. Zelmanov for useful discussions and correspondence. Also we wish to thank the referees for very useful comments. In particular, one of the referees pointed out that in the case n = 3 the construction of [21] is closely related to our construction.
Appendix A

Below we list all known examples of infinite-dimensional simple $n$-Lie algebras over an algebraically closed field $\mathbb{F}$ of characteristic 0 for $n \ge 3$. Let $A$ be a commutative associative algebra over $\mathbb{F}$ and let $\mathfrak{g}$ be a Lie algebra of derivations of $A$, such that $A$ contains no non-trivial $\mathfrak{g}$-invariant ideals.

Example 1. $S(A, \mathfrak{g}) = A$, where $\mathfrak{g}$ is an $n$-dimensional Lie algebra with basis $D_1, \dots, D_n$, the $n$-ary Lie bracket being
$$[f_1, \dots, f_n] = \det \begin{pmatrix} D_1(f_1) & \dots & D_1(f_n) \\ \vdots & & \vdots \\ D_n(f_1) & \dots & D_n(f_n) \end{pmatrix}.$$

Example 2. $W(A, \mathfrak{g})$, where $\mathfrak{g}$ is an $(n - 1)$-dimensional Lie algebra with basis $D_1, \dots, D_{n-1}$, the $n$-ary Lie bracket being
$$[f_1, \dots, f_n] = \det \begin{pmatrix} f_1 & \dots & f_n \\ D_1(f_1) & \dots & D_1(f_n) \\ \vdots & & \vdots \\ D_{n-1}(f_1) & \dots & D_{n-1}(f_n) \end{pmatrix}.$$

Example 3. $SW(A, D) = A^{\langle 1 \rangle} \oplus \cdots \oplus A^{\langle n-1 \rangle}$ is the sum of $n - 1$ copies of $A$ and $\mathfrak{g} = \mathbb{F}D$, the $n$-ary Lie bracket being the following. For $h \in A$, denote by $h^{\langle k \rangle}$ the corresponding element in $A^{\langle k \rangle}$, then
$$[f_1^{\langle j_1 \rangle}, \dots, f_n^{\langle j_n \rangle}] = 0, \quad \text{unless } \{j_1, \dots, j_n\} \supset \{1, \dots, n - 1\};$$
$$[f_1^{\langle 1 \rangle}, \dots, f_{k-1}^{\langle k-1 \rangle}, f_k^{\langle k \rangle}, f_{k+1}^{\langle k \rangle}, f_{k+2}^{\langle k+1 \rangle}, \dots, f_n^{\langle n-1 \rangle}] = (-1)^{k+n-1} \left(f_1 \dots f_{k-1} (D(f_k) f_{k+1} - f_k D(f_{k+1})) f_{k+2} \dots f_n\right)^{\langle k \rangle},$$
extended on SW (A, D) by anticommutativity. It is an open problem whether there exist any other simple infinite-dimensional n-Lie (super)algebras over an algebraically closed field of characteristic 0 if n > 2. In particular are there any examples of infinite-dimensional simple n-Lie superalgebras over a field of characteristic 0, which are not n-Lie algebras, if n > 2? References 1. Blattner, R.J.: A theorem of Cartan and Guillemin. J. Diff. Geom. 5, 295–305 (1970) 2. Cantarini, N., Kac, V.: Infinite-dimensional primitive linearly compact Lie superalgebras. Adv. Math. 207, 328–419 (2006) 3. Cantarini, N., Kac, V.: Automorphisms and forms of simple infinite-dimensional linearly compact Lie superalgebras. Int. J. Geom. Meth. Phys. 5, 6, 845–867 (2006) 4. Cantarini, N., Kac, V.: Classification of linearly compact simple Jordan and generalized Poisson superalgebras. J. Algebra 313, 100–124 (2007) 5. Cantarini, N., Kac, V.: Classification of linearly compact simple rigid superalgebras. IMRN, doi:10.1093/imrn/rnp231; preprint available at http://arxiv.org/abs/0909.3100v1[math.QA], 2010 6. Dzhumadildaev, A.S.: Identities and derivations for Jacobian algebras. Contemp. Math. 315, 245–278 (2002) 7. Dzhumadildaev, A.S.: Representations of vector product n-Lie algebras. Comm. Algebra 32(9), 3315– 3326 (2004)
8. Dzhumadildaev, A.S.: n-Lie structures that are generated by Wronskians. Sib. Mat. Zh. 46(4), 759–773 (2005), translation in Sib. Math. J. 46(4), 601–612 (2005) 9. Dzhumadildaev, A.S.: n-Lie property of Jacobian as a complete integrability condition. Sib. Mat. Zh. 47(4), 780–790 (2006), translation in Sib. Math. J. 47(4), 643–652 (2006) 10. Fattori, D., Kac, V.G.: Classification of finite simple Lie conformal superalgebras. J. Algebra 258, 23–59 (2002) 11. Filippov, V.T.: n-Lie algebras. Sib. Mat. Zh. 26(6), 126–140 (1985), translation in Sib. Math. J. 26(6), 879–891 (1985) 12. Filippov, V.T.: On n-Lie algebras of Jacobians. Sib. Mat. Zh. 39(3), 660–669 (1998), translation in Sib. Math. J. 39(3), 573–581 (1998) 13. Figueroa-O’Farrill, J.: Three lectures on 3-algebras. http://arxiv.org/abs/0812.2865v1[hep-th], 2008 14. Friedmann, T.: Orbifold singularities, the LATKe, and pure Yang-Mills with matter. http://arxiv.org/abs/ 0806.0024v1[math.AG], 2008 15. Guillemin, V.W.: A Jordan-Hölder decomposition for a certain class of infinite-dimensional Lie algebras. J. Diff. Geom. 2, 313–345 (1968) 16. Kac, V.G.: Lie superalgebras. Adv. Math. 26, 8–96 (1977) 17. Kac, V.G.: Classification of infinite-dimensional simple linearly compact Lie superalgebras. Adv. Math. 139, 1–55 (1998) 18. Kasymov, S.M.: On the theory of n-Lie algebras. Algebra i Logika 26(3), 277–297 (1987), translation in Algebra and Logic 26(3), 155–166 (1987) 19. Kasymov, S.M.: On nil-elements and nil-subsets of n-Lie algebras. Sib. Mat. J. 32(6), 77–80 (1991) 20. Ling, W.X.: On the structure of n-Lie algebras. PhD thesis, Siegen, 1993 21. de Medeiros, P., Figueroa-OFarrill, J., Méndez-Escobar, E.: Superpotentials for superconformal ChernSimons theories from representation theory. arXiv 0908.2125 22. Michor, P.W., Vinogradov, A.M.: n-ary Lie and associative algebras. Rend. Sem. Mat. Univ. Pol. Torino 54(4), 373–392 (1996) 23. Nambu, Y.: Generalized Hamiltonian mechanics. Phys. Rev. D7, 2405–2412 (1973) 24. 
Schlessinger, M., Stasheff, J.D.: The Lie algebra structure of tangent cohomology and deformation theory. J. Pure Appl. Algebra 38, 313–322 (1985) 25. Takhtajan, L.: On foundations of generalized Nambu mechanics. Commun. Math. Phys. 160, 295–315 (1994) Communicated by Y. Kawahigashi
Commun. Math. Phys. 298, 855–868 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1048-1
Communications in
Mathematical Physics
Time Functions as Utilities E. Minguzzi Dipartimento di Matematica Applicata, Università degli Studi di Firenze, Via S. Marta 3, I-50139 Firenze, Italy. E-mail: [email protected] Received: 22 September 2009 / Accepted: 8 January 2010 Published online: 13 April 2010 – © Springer-Verlag 2010
Abstract: Every time function on spacetime gives a (continuous) total preordering of the spacetime events which respects the notion of causal precedence. The problem of the existence of a (semi-)time function on spacetime and the problem of recovering the causal structure starting from the set of time functions are studied. It is pointed out that these problems have an analog in the field of microeconomics known as utility theory. In a chronological spacetime the semi-time functions correspond to the utilities for the chronological relation, while in a K -causal (stably causal) spacetime the time functions correspond to the utilities for the K + relation (Seifert’s relation). By exploiting this analogy, we are able to import some mathematical results, most notably Peleg’s and Levin’s theorems, to the spacetime framework. As a consequence, we prove that a K -causal (i.e. stably causal) spacetime admits a time function and that the time or temporal functions can be used to recover the K + (or Seifert) relation which indeed turns out to be the intersection of the time or temporal orderings. This result tells us in which circumstances it is possible to recover the chronological or causal relation starting from the set of time or temporal functions allowed by the spacetime. Moreover, it is proved that a chronological spacetime in which the closure of the causal relation is transitive (for instance a reflective spacetime) admits a semi-time function. Along the way a new proof avoiding smoothing techniques is given that the existence of a time function implies stable causality, and a new short proof of the equivalence between K -causality and stable causality is given which takes advantage of Levin’s theorem and smoothing techniques. 1. Introduction On the spacetime (M, g) we write as usual p < q if there is a future directed causal curve connecting p to q, and write p ≤ q if p < q or p = q. The causal relation is given by J + = {( p, q) ∈ M × M : p ≤ q}. 
For fixed time orientation these notions depend only on the class g of metrics conformal to g. Once the causal relation is defined it is possible to define a time function as a continuous function t : M → R such that if p < q
then $t(p) < t(q)$. In other words a time function is defined all over the spacetime, it is continuous, and increases over every causal curve. A time function which is $C^1$ with a past directed timelike gradient is a temporal function. Following [43], a semi-time function is a continuous function $t : M \to \mathbb{R}$ such that if $p \ll q$ then $t(p) < t(q)$ (note that by continuity we have also $(p, q) \in \overline{I^+} \Rightarrow t(p) \le t(q)$). These definitions clarify that in the framework of general relativity the notion of causality is more fundamental than that of time. Indeed, not all spacetimes admit a time function. A spacetime admits a time function iff it admits a temporal function iff it is stably causal [4,18,19]. The history of this result is quite interesting. In order to prove the existence of a time function Geroch [17] suggested introducing a positive measure $\mu$ on spacetime so that $M$ has unit measure (in fact this measure has to be chosen so as to satisfy some admissibility constraints [12]), and defining $t^-(p) = \mu(I^-(p))$. In globally hyperbolic spacetimes, and actually in causally continuous ones [12,20,36], the idea works; in fact $t^-$ can be shown to be continuous. Nevertheless, these causality conditions are stronger than stable causality and without them $t^-$ is only lower semi-continuous. The proof that stable causality implies the existence of a time function was obtained by Hawking through a nice averaging technique [18,19]. In short he noted that if the spacetime is stably causal then there is a one parameter family of metrics with cones strictly larger than $g$, $g_\lambda > g$ with $\lambda \in [1, 2]$, $\lambda < \lambda' \Rightarrow g_\lambda < g_{\lambda'}$, so that $(M, g_\lambda)$ is causal. He then defined $t(p) = \int_1^2 t_\lambda^-(p)\, d\lambda$, where the function $t_\lambda^-$ is defined as before but with respect to the metric $g_\lambda$. He was then able to prove the continuity of $t$ (see [19]).
The proof of the converse presents several difficulties particularly because a time function $t$ for $(M, g)$ need not be a time function for some $(M, g')$ with $g' > g$; consider for instance $t = x^0 - \tanh x^1$ in the 1+1 Minkowski spacetime of metric $g = -(dx^0)^2 + (dx^1)^2$. Nevertheless, the proof that a temporal function implies stable causality is easy [19] and thus there remained the issue of proving that the existence of a time function implies the existence of a temporal function. This smoothability problem was considered by Seifert [43] but his arguments were unclear. A rigorous proof was finally given by Bernal and Sánchez in [4].

Further insight into the problem of the existence of time comes from the relational approach to causality. Stable causality can be shown to be equivalent to the antisymmetry of the Seifert relation [42] $J_S^+ = \bigcap_{g' > g} J_{g'}^+$ (a rigorous proof can be found in [20] and [32]). The nice feature of this relation is that it is both closed and transitive whereas $J^+$ has only the latter property and $\overline{J^+}$ has only the former. In fact one may ask if $J_S^+$ is the smallest relation containing $J^+$ with this property. The answer is negative unless some causality conditions are added [32]. Therefore, it is natural to introduce the relation $K^+$ defined as the smallest closed and transitive relation which contains $J^+$ (see [45]). The spacetime is said to be $K$-causal if $K^+$ is antisymmetric. Recently [35], I have proved the equivalence between $K$-causality and stable causality and that if $K$-causality holds then $K^+ = J_S^+$.

The previous result shows that the antisymmetry of $K^+$ implies stable causality and hence the existence of a time function. It seems reasonable to expect that
(i) This theorem depends only on the transitivity and closure properties of $K^+$, and that therefore passing through stable causality should not be essential.
(ii) The existence of a time function should imply $K$-causality (or stable causality) directly without using the smoothing argument.
Finally, given the fact that $K$-causality implies the existence of a time function one would like to prove that
(iii) Under stable causality the set of time functions allowed by the spacetime can be used to recover the relation $K^+$.
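The example $t = x^0 - \tanh x^1$ mentioned above can be probed numerically. The sketch below is only an illustration, not a proof: it samples straight null segments of the Minkowski metric (the borderline causal case) and checks that $t$ increases along them, then exhibits one short segment through $x^1 = 0$ that is causal only for a metric with slightly wider cones and along which $t$ decreases. The sampling grid, the helper names `t` and `increments`, and the particular widened segment are assumptions chosen here for illustration.

```python
import math


def t(x0, x1):
    """Candidate time function t = x^0 - tanh(x^1) on 1+1 Minkowski spacetime."""
    return x0 - math.tanh(x1)


def increments(cone_slope):
    """Values t(q) - t(p) along straight segments with dx0 = cone_slope * |dx1|.

    cone_slope = 1 gives future directed null segments of g; a value below 1
    gives segments that are causal only for a metric with wider cones.
    """
    out = []
    for x1 in range(-3, 4):
        for dx1 in (-2.0, -1.0, -0.5, 0.5, 1.0, 2.0):
            dx0 = cone_slope * abs(dx1)
            out.append(t(dx0, x1 + dx1) - t(0.0, float(x1)))
    return out


# t increases along every sampled null segment of g.
null_steps = increments(1.0)
print(min(null_steps) > 0)

# Segment from (0, -0.1) to (0.18, 0.1): slope 0.9, causal only for wider cones.
widened = t(0.18, 0.1) - t(0.0, -0.1)
print(widened < 0)
```

The decrease occurs near $x^1 = 0$, where the derivative of $\tanh$ equals 1, so any widening of the cones there admits a causal direction along which $t$ fails to increase.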
Time Functions as Utilities
In a first version of this work I presented proofs for points (ii) and (iii), but then, searching for fundamental results using only the closure of a relation in connection with problem (i), I discovered a large body of literature in utility theory with important implications for causality (most articles were published in economics journals). In fact the problem of the existence of a utility function in a set of alternatives for an individual is formally similar to that of the existence of time. Surprisingly, these results have been totally overlooked by relativists. In the next section I summarize this long parallel line of research, which will be used to draw implications for causality theory and in particular for the problem of the existence of time. I refer the reader to [31,36] for most of the conventions used in this work. In particular, I denote with $(M, g)$ a $C^r$ spacetime (connected, time-oriented Lorentzian manifold), $r \in \{3, \ldots, \infty\}$, of arbitrary dimension $n \geq 2$ and signature $(-, +, \ldots, +)$. On $M \times M$ the usual product topology is defined. All the causal curves that we shall consider are future directed. The subset symbol $\subset$ is reflexive, $X \subset X$.

2. Preorders and Utility Theory

Recall¹ that a binary relation $R \subset X \times X$ on a set $X$ is called a preorder if it is reflexive and transitive, a strict partial order if it is irreflexive and transitive, and an equivalence relation if it is reflexive, transitive and symmetric. A preorder which satisfies the antisymmetry property, $(x, y) \in R$ and $(y, x) \in R \Rightarrow x = y$, is a partial order. A preorder or strict partial order $R$ such that $x \neq y \Rightarrow (x, y) \in R$ or $(y, x) \in R$ is complete. The property, if $a, b \in X$ then $(a, b) \in R$ or $(b, a) \in R$, is the totality property and is equivalent to completeness and reflexivity. A preorder which satisfies the totality property is a total preorder. A preorder which satisfies both the totality and the antisymmetry property is a total order.
A strict partial order which is complete is a complete order. For short we also write² $x \leq_R y$ if $(x, y) \in R$; $x \sim_R y$ if $(x, y) \in R$ and $(y, x) \in R$; and $x <_R y$ if $x \leq_R y$ and not $x \sim_R y$. The relation $\sim_R$ is an equivalence relation called the equivalence relation part while $<_R$ is a strict partial order called the strict partial order part. Their union gives $R$, i.e. $x \leq_R y$ iff $x \sim_R y$ or $x <_R y$. Note that if $R$ is a partial order then $<_R$ is obtained from $R$ by removing the diagonal $\Delta = \{(x, x) : x \in X\}$, and conversely $R$ is obtained by adding the diagonal to $<_R$, that is, $R$ is the smallest reflexive relation containing $<_R$ (the reflexive closure). Given a (total) preorder $R$, the quotient $X/\!\sim_R$ endowed with the induced order $[p] \leq_R [q]$ iff $p \leq_R q$, is a partial (resp. total) order. As usual we denote $R^+(x) = \{y \in X : (x, y) \in R\}$ and $R^-(x) = \{y \in X : (y, x) \in R\}$. We say that $R_2$ extends $R_1$ if $x \sim_{R_1} y \Rightarrow x \sim_{R_2} y$ and $x <_{R_1} y \Rightarrow x <_{R_2} y$. Through a rather simple application of Zorn’s lemma Szpilrajn [46] proved

Theorem 1 (Szpilrajn). Every strict partial order can be extended to a complete order. Moreover, every strict partial order is the intersection of all the complete orders that extend it. (By adding the diagonal one has a corresponding statement for partial orders extended by total orders.)

The former statement in the theorem is also known as the order extension principle. The latter statement is sometimes attributed to Dushnik and Miller [14]; however, although not stated explicitly in [46], it is a trivial consequence of a remark in Szpilrajn’s paper.

¹ Unfortunately, in the literature there is no homogeneous terminology, so that used in this paper differs from that of many cited articles.
² Unfortunately, if $R = J^+$ then while $\leq_{J^+}$ has the same meaning of the symbol $\leq$ in relativity, the relation $<_{J^+}$ coincides with $<$ only in causal spacetimes.
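On a finite set both statements of Theorem 1 can be checked directly, without Zorn's lemma. The following Python sketch is an illustration only and is not taken from the cited papers; the function names and the three-element example are assumptions made for the purpose of the illustration. It builds one complete extension by repeatedly extracting a minimal element, and then recovers the original strict partial order as the intersection of all complete extensions.

```python
from itertools import permutations


def complete_extension(elements, strict_order):
    """Rank `elements` so that x precedes y whenever (x, y) is in strict_order.

    `strict_order` is assumed to be an irreflexive, transitive set of pairs.
    """
    remaining = list(elements)
    ranking = []
    while remaining:
        # A minimal element has no strict predecessor among the remaining ones;
        # it exists because a finite strict partial order has no cycles.
        minimal = next(
            x for x in remaining
            if not any((y, x) in strict_order for y in remaining)
        )
        ranking.append(minimal)
        remaining.remove(minimal)
    return ranking


def all_complete_extensions(elements, strict_order):
    """Yield every complete order (as a set of pairs) extending strict_order."""
    for perm in permutations(elements):
        pos = {x: i for i, x in enumerate(perm)}
        if all(pos[x] < pos[y] for (x, y) in strict_order):
            yield {(x, y) for x in perm for y in perm if pos[x] < pos[y]}


# Hypothetical example: a < c and b < c, with a and b incomparable.
elements = ["a", "b", "c"]
strict_order = {("a", "c"), ("b", "c")}

ranking = complete_extension(elements, strict_order)
position = {x: i for i, x in enumerate(ranking)}
extension = {(x, y) for x in elements for y in elements if position[x] < position[y]}

intersection = set.intersection(*all_complete_extensions(elements, strict_order))
```

In this example the two complete extensions rank `a` and `b` in opposite ways, so their intersection drops both pairs and leaves exactly the original order. The brute-force enumeration over permutations is only practical for very small sets.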
E. Minguzzi
Since to any preorder one can associate a partial order passing to the quotient with respect to the equivalence relation ∼ R , it is easy to prove from Theorem 1 the following [6,13] Theorem 2. Every preorder can be extended to a total preorder. Moreover, every preorder is the intersection of all the total preorders that extend it. I note that the proof in [6] is such that the total extension can be chosen (or restricted in the second part) in such a way that the ‘indifference sets’ [ p] are not enlarged passing from the preorder R to its total extension C, i.e. x ∼ R y ⇔ x ∼C y. These results were taken as reference for many other developments, and in fact have been generalized in several directions [1]. Meanwhile, in microeconomics the preference of an individual for a set of alternatives or prospects X was modeled as a total preorder R on X . The idea was that an individual is able to tell whether one option or the other is preferred. These preferences were quantified by an utility function. An utility for a transitive relation R is a function u : X → R with the strictly isotone3 property namely that4
\[ x \sim_R y \Rightarrow u(x) = u(y) \quad \text{and} \quad x <_R y \Rightarrow u(x) < u(y). \tag{1} \]
Often on X one has a topology which makes rigorous the idea that an alternative is similar or close to another. In this case one would like to have a continuous utility function, otherwise the closeness of the alternatives would not be correctly represented by the utility. Eilenberg [15] and Debreu [10,11] (see also [28,40]) were able to prove, under weak topological assumptions, that a continuous utility exists provided R − (x) and R + (x) are closed for every x ∈ X . With the work of Aumann [2] and other economists it became clear that the assumption of totality was too restrictive. Indeed, it is not always possible to compare two alternatives to extract a preference. This conclusion is even more compelling if one models a group of persons rather than an individual. In order to include some indecisiveness the space of alternatives X has to be endowed with a preorder, the totality condition being removed. The previous results on the existence of a continuous utility must therefore be generalized, and there is indeed a large literature on the subject. The reader is referred to the monograph by Bridges and Mehta for a nice reasoned account [7]. The problem is of course that of finding natural conditions which imply the continuity of the utility function. We already suspect, given the suggestions from the spacetime problem above, that this property is the closure of the relation R in X × X , R = R¯ (sometimes called continuity in the economics literature). In fact, it can be easily shown [47] that for a total preorder this property coincides with that used in the Eilenberg and Debreu theorem, namely: R − (x) and R + (x) are closed for every x ∈ X (sometimes called semicontinuity in the economics literature). In the literature many other directions have been explored, but as we shall see the closure condition has given the most powerful results. 
The work of Nachbin [37] has proved particularly important: he studied closed preorders on topological spaces and obtained an extension to this domain of the Urysohn separation and extension theorems. These results were used to obtain new proofs of the Debreu theorem [28], and the subsequent generalizations have been built on them. For our purposes, the final goal was reached by Levin [7,23,26], who proved the following theorem.

³ A function is isotone if $(x, y) \in R \Rightarrow u(x) \leq u(y)$. Note that constant functions are isotone.
⁴ For a total preorder this definition can be replaced by: $x \leq_R y$ if and only if $u(x) \leq u(y)$.
Theorem 3 (Levin). Let $X$ be a second countable locally compact Hausdorff space, and $R$ a closed preorder on $X$, then there exists a continuous utility function. Moreover, denoting with $\mathcal{U}$ the set of continuous utilities we have that the preorder $R$ can be recovered from the continuous utility functions, namely there is a multi-utility representation
\[ (x, y) \in R \Leftrightarrow \forall u \in \mathcal{U},\ u(x) \leq u(y). \tag{2} \]
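On a finite set with the discrete topology every function is continuous and every relation is closed, so the multi-utility representation (2) becomes a purely combinatorial statement that can be checked by hand. The Python sketch below is an illustration under that assumption and is not Levin's construction; the names `utilities_for` and `is_utility` and the four-element example are hypothetical. For each element $x$ it builds a utility $u_x$ satisfying $u_x(x) > u_x(y)$ whenever $(x, y) \notin R$, which is the separation property needed for (2).

```python
def utilities_for(elements, preorder):
    """Build one utility per element of a finite preordered set.

    `preorder` is assumed to be a reflexive, transitive set of pairs (x, y),
    read as x <=_R y.
    """
    # count[z] = size of R^-(z). It is constant on indifference classes and
    # strictly increasing along the strict part of R, hence a utility.
    count = {z: sum((w, z) in preorder for w in elements) for z in elements}
    bonus = len(elements) + 1
    family = []
    for x in elements:
        # Adding an isotone indicator of R^+(x) preserves the utility property
        # and forces u(x) > u(y) whenever (x, y) is not in R.
        family.append(
            {z: count[z] + bonus * ((x, z) in preorder) for z in elements}
        )
    return family


def is_utility(u, elements, preorder):
    """Check the two implications of Eq. (1) for the function u."""
    for x in elements:
        for y in elements:
            if (x, y) in preorder:
                if (y, x) in preorder and u[x] != u[y]:
                    return False
                if (y, x) not in preorder and not u[x] < u[y]:
                    return False
    return True


# Hypothetical example: a <= c, b <= c, c ~ d; a and b are incomparable.
elements = ["a", "b", "c", "d"]
preorder = {(x, x) for x in elements} | {
    ("a", "c"), ("b", "c"), ("c", "d"), ("d", "c"), ("a", "d"), ("b", "d"),
}

family = utilities_for(elements, preorder)
recovered = {
    (x, y) for x in elements for y in elements
    if all(u[x] <= u[y] for u in family)
}
```

Here `recovered` coincides with `preorder`: the incomparable pair `a`, `b` is excluded because the utility built for `a` ranks `a` above `b`, while the one built for `b` does the opposite.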
Curiously, as happened for Szpilrajn’s theorem, the second part of this statement was not explicitly given in Levin’s paper. Nevertheless, it is a consequence of his proof that if $x \nleq_R y$, then there is a continuous utility function such that $u(x) > u(y)$ (see the end of the proof of [7, Lemma 8.3.4]). Apparently, this representation possibility has been pointed out only quite recently in a preprint by Evren and Ok [16]. Clearly, Levin’s theorem can be regarded as the continuous analog of Szpilrajn’s Theorem. It is curious to note that while there was, as far as I know, no communication between the communities of relativists and economists, these two parallel lines of research passed through the very same concepts. For instance, Sondermann [44] (see also [8]) introduced a measure on $X$ and built an increasing function exactly as Geroch did; moreover, his admissibility requirements for the measure were close to those later introduced by Dieckmann [12] in the relativity literature. As expected he could only obtain lower semi-continuity for the utility; indeed we know that Geroch’s time is continuous only if some form of reflectivity is imposed [20]. Most theorems which use in an essential way some closure assumption can be deduced from Levin’s theorem. Peleg’s theorem [39], which instead uses an openness condition, deserves a special mention. According to Peleg a strict partial order $S$ is separable if there is a countable subset $C$ of $X$ such that for any $(x, y) \in S$ the diamond $S^+(x) \cap S^-(y)$ contains some element of the subset $C$, and spacious if for $(x, y) \in S$, $\overline{S^-(x)} \subset S^-(y)$.

Theorem 4 (Peleg). Let $S$ be a strict partial order on a topological space $X$. Suppose that (a) $S^-(x)$ is open for every $x \in X$, (b) $S$ is separable, and (c) $S$ is spacious, then there is a function $u : X \to \mathbb{R}$ such that $(x, y) \in S \Rightarrow u(x) < u(y)$.

Note that $u$ is a utility in the sense of Eq. (1).
It has been clarified that Debreu’s theorem can be regarded as a consequence of Peleg’s [25,29], while the relation with Levin’s theorem is less clear (but see [21,22]). We will be able to say more on that in the next section where we shall apply the previous theorems to the spacetime case.

3. Application of Utility Theory to Causality

Our strategy will be that of applying Peleg’s theorem to the open relation $I^+$ and Levin’s theorem to the closed relation $K^+$. As we shall see the semi-time functions and the time functions correspond to the utilities for the relations $I^+$ and $K^+$ respectively, provided they are antisymmetric. We start with the former case.

3.1. Semi-time functions and $I^+$-utilities. In this section we apply the Peleg theorem by setting $X = M$, the spacetime manifold, and $S = I^+$. The assumption that $I^+$ is a strict partial order is equivalent to chronology and conditions (a) and (b) of Peleg’s theorem are satisfied. Note that in a chronological spacetime a continuous utility for $I^+$ is a continuous function $t$ such that $x \ll y \Rightarrow t(x) < t(y)$, thus the continuous utilities
for $I^+$ are exactly the semi-time functions. Therefore we have only to understand the condition of spaciousness: $x \ll y \Rightarrow \overline{I^-(x)} \subset I^-(y)$. This problem is answered by the following

Lemma 1. Let $(M, g)$ be a spacetime, then the spaciousness condition, $x \ll y \Rightarrow \overline{I^-(x)} \subset I^-(y)$, is equivalent to the future reflectivity condition, $p \in \overline{I^-(q)} \Rightarrow q \in \overline{I^+(p)}$. (The previous definition of future reflectivity is equivalent to the usual one $I^-(w) \subset I^-(z) \Rightarrow I^+(z) \subset I^+(w)$, see [20,36].)

Proof. Assume that $(M, g)$ is future reflective, take $x \ll y$ and let $z \in \overline{I^-(x)}$, then $x \in \overline{I^+(z)}$, and since $I^-(y)$ is open, $z \ll y$. As $z$ is arbitrary, $\overline{I^-(x)} \subset I^-(y)$. Conversely, assume the spacetime is spacious. Let $p \in \overline{I^-(q)}$ and take $r \in I^+(q)$, then by spaciousness, $I^-(r) \supset \overline{I^-(q)} \ni p$ so that $r \in I^+(p)$. As $r$ can be chosen arbitrarily close to $q$, we have $q \in \overline{I^+(p)}$.

With these preliminaries, Peleg’s theorem becomes (in the statement we include the past version)

Theorem 5. A chronological, future or past reflective spacetime admits a semi-time function.

This theorem is new in causality theory as there were no previous results establishing the existence of a semi-time function. We observe that if the causality assumption were somewhat stronger, with chronology replaced by distinction ($I^+(x) = I^+(y)$ or $I^-(x) = I^-(y) \Rightarrow x = y$), then the spacetime would be $K$-causal [31, Th. 3.7], thus it would admit a time function. Also note that if future or past reflectivity is strengthened to reflectivity then the chronological assumption can be weakened to non-total viciousness (namely the chronology violating set does not coincide with $M$). This fact follows from a theorem by Clarke and Joshi which states that a non-totally vicious spacetime which is reflective is chronological [9, Prop. 2.5] (see also [24]).
In [35] I have introduced the transverse conformal ladder and proved that future or past reflectivity implies the transitivity of $\bar{J}^+$, that is $K^+ = \bar{J}^+$ (see the proof of Theorem 3 in [35]). Thus one could try to strengthen the previous theorem by replacing ‘future or past reflectivity’ with the transitivity of $\bar{J}^+$. As we shall see in the next section it is indeed possible to do that.

3.2. Time functions and $K^+$-utilities. Any spacetime is a paracompact Hausdorff manifold and as such it satisfies the topological conditions of Levin’s theorem (in fact it even admits a complete Riemannian metric [38]). Since we wish to apply this theorem to the relation $K^+$ we have first to establish the relation between the time functions and the continuous $K^+$-utilities. In this section we shall prove, without using smoothing techniques [4] or the equivalence between $K$-causality and stable causality [35], that a spacetime is $K$-causal if and only if it admits a time function. Along the way we shall also prove that in a $K$-causal spacetime the $K^+$-utilities are exactly the time functions.
Lemma 2. A spacetime which admits a time function $t$ is strongly causal.

Proof. The proof of [31, Th. 3.4] shows that if $(M, g)$ is not strongly causal then there are points $x, c \in M$ such that $x < c$ and $(c, x) \in \bar{J}^+$. Since for any pair $(p, q) \in J^+$ it is $t(q) - t(p) \geq 0$ and $t$ is continuous, $t(x) < t(c) \leq t(x)$, a contradiction.

Recall that a spacetime is non-total imprisoning if no future inextendible causal curve is contained in a compact set (replacing future with past gives the same property [3,34]). Strong causality implies non-total imprisonment [19].

Lemma 3. Let $(M, g)$ be non-total imprisoning. Let $(p, q) \in K^+$, then either $(p, q) \in J^+$ or for every relatively compact open set⁵ $B \ni p$ there is $r \in \dot{B}$ such that $p < r$ and $(r, q) \in K^+$.

Proof. Consider the relation $R^+ = \{(p, q) \in K^+ : (p, q) \in J^+$ or for every relatively compact open set $B \ni p$ there is $r \in \dot{B}$ such that $p < r$ and $(r, q) \in K^+\}$. It is easy to check that $J^+ \subset R^+ \subset K^+$. We are going to prove that $R^+$ is closed and transitive. From that and from the minimality of $K^+$ it follows $R^+ = K^+$, and hence the thesis.

Transitivity. Assume $(p, q) \in R^+$ and $(q, s) \in R^+$. If $(p, s) \in J^+$ there is nothing to prove. Otherwise we have $(p, q) \notin J^+$ or $(q, s) \notin J^+$. Let $B \ni p$ be an open relatively compact set. If $(p, q) \notin J^+$ there is $r \in \dot{B}$ such that $p < r$ and $(r, q) \in K^+$, thus $(r, s) \in K^+$ and hence $(p, s) \in R^+$. It remains to consider the case $(p, q) \in J^+$ and $(q, s) \notin J^+$. If $p = q$ then $(p, s) = (q, s) \in R^+$. Otherwise, $p < q$ and if $q \notin B$ the causal curve $\gamma$ joining $p$ to $q$ intersects $\dot{B}$ at a point $r \in \dot{B}$ (possibly coincident with $q$ but different from $p$). Thus $p < r$, $(r, q) \in J^+$, hence $p < r$ and $(r, s) \in K^+$. If $q \in B$, since $(q, s) \in R^+ \setminus J^+$, there is $r \in \dot{B}$ such that $q < r$ and $(r, s) \in K^+$, moreover, since $p \leq q$, $p < r$. Since the sought conclusion holds for every $B$, $(p, s) \in R^+$.

Closure. Let $(p_n, q_n) \to (p, q)$, $(p_n, q_n) \in R^+$. Assume, by contradiction, that $(p, q) \notin R^+$, then $p \neq q$ as $J^+ \subset R^+$. Without loss of generality we can assume two cases: (a) $(p_n, q_n) \in J^+$ for all $n$; (b) $(p_n, q_n) \notin J^+$ for all $n$.

(a) Let $B \ni p$ be an open relatively compact set. For sufficiently large $n$, $p_n \neq q_n$ and $p_n \in B$. By the limit curve theorem [33] either there is a limit continuous causal curve joining $p$ to $q$, and thus $p < q$ (a contradiction), or there is a future inextendible continuous causal curve $\sigma^p$ starting from $p$ such that for every $p' \in \sigma^p$, $(p', q) \in \bar{J}^+$. Since $(M, g)$ is non-total imprisoning, $\sigma^p$ intersects $\dot{B}$ at some point $r$. Thus $p < r$ and since $(r, q) \in \bar{J}^+ \subset K^+$ we have $(p, q) \in R^+$, a contradiction.

(b) Let $B \ni p$ be an open relatively compact set. For sufficiently large $n$, $p_n \neq q_n$ and $p_n \in B$. Since $(p_n, q_n) \in R^+$ there is $r_n \in \dot{B}$, $p_n < r_n$, and $(r_n, q_n) \in K^+$. Without loss of generality we can assume $r_n \to r \in \dot{B}$, so that $(r, q) \in K^+$. Arguing as in (a) either $p < r$ (and $(r, q) \in K^+$) or there is $r' \in \dot{B}$ such that $p < r'$ and $(r', r) \in \bar{J}^+ \subset K^+$, from which it follows $(r', q) \in K^+$. Because of the arbitrariness of $B$, $(p, q) \in R^+$, a contradiction.

⁵ The set $B$ can have any ‘size’, i.e. it need not be sufficiently small.
Corollary 1. Let $(M, g)$ be non-total imprisoning. The spacetime $(M, g)$ is $K$-causal if and only if $x \leq y$ and $(y, x) \in K^+$ implies $x = y$.

Proof. Since $J^+ \subset K^+$ to the right it is trivial. To the left, assume $(M, g)$ is not $K$-causal, then there are $z, y \in M$, $z \neq y$ such that $(z, y) \in K^+$ and $(y, z) \in K^+$. If $(z, y) \in J^+$ we have finished with $x = z$. Otherwise, let $B \ni z$ be an open relatively compact set. By Lemma 3 there is a point $x \in \dot{B}$ such that $z < x$ and $(x, y) \in K^+$ (and thus $(x, z) \in K^+$) which implies $z = x \in \dot{B}$, a contradiction.

Lemma 4. (a) Let $\tilde{t}$ be a continuous function such that $x \leq y \Rightarrow \tilde{t}(x) \leq \tilde{t}(y)$. If $(p, q) \in K^+$ then $\tilde{t}(p) \leq \tilde{t}(q)$. (b) Let $t$ be a time function on $(M, g)$. If $(p, q) \in K^+$ then $p = q$ or $t(p) < t(q)$.

Proof. Proof of (a). Consider the relation $\tilde{R}^+ = \{(p, q) \in K^+ : \tilde{t}(p) \leq \tilde{t}(q)\}$. Clearly $J^+ \subset \tilde{R}^+ \subset K^+$ and $\tilde{R}^+$ is transitive. Let us prove that $\tilde{R}^+$ is closed. If $(x_n, z_n) \in \tilde{R}^+$ is a sequence such that $(x_n, z_n) \to (x, z)$, then passing to the limit $\tilde{t}(x_n) \leq \tilde{t}(z_n)$ and using the continuity of $\tilde{t}$ we get $\tilde{t}(x) \leq \tilde{t}(z)$, moreover since $K^+$ is closed, $(x, z) \in K^+$, which implies $(x, z) \in \tilde{R}^+$, that is $\tilde{R}^+$ is closed. Since $J^+ \subset \tilde{R}^+ \subset K^+$, and $\tilde{R}^+$ is closed and transitive, by using the minimality of $K^+$ it follows that $\tilde{R}^+ = K^+$. As a consequence, if $(p, q) \in K^+$ then $\tilde{t}(p) \leq \tilde{t}(q)$.

Proof of (b). By Lemma 2, since $t$ is a time function $(M, g)$ is strongly causal and thus non-total imprisoning. Consider the relation $R^+ = \{(p, q) \in K^+ : p = q$ or $t(p) < t(q)\}$. Clearly $J^+ \subset R^+ \subset K^+$ and $R^+$ is transitive. Let us prove that $R^+$ is closed by keeping in mind the just obtained result given by (a). Let $(p_n, q_n) \in R^+ \subset K^+$ be a sequence such that $(p_n, q_n) \to (p, q)$. As $K^+$ is closed, $(p, q) \in K^+$. If, by contradiction, $(p, q) \notin R^+$, then $(p, q) \notin J^+$, thus by Lemma 3, choosing an open relatively compact set $B \ni p$ there is $r \in \dot{B}$, with $p < r$, $(r, q) \in K^+$, thus $t(p) < t(r) \leq t(q)$ and hence $(p, q) \in R^+$, a contradiction. Since $J^+ \subset R^+ \subset K^+$, and $R^+$ is closed and transitive, by using the minimality of $K^+$ it follows that $R^+ = K^+$. As a consequence, if $(p, q) \in K^+$ then either $p = q$ or $t(p) < t(q)$.

Theorem 6. In a $K$-causal spacetime the continuous $K^+$-utilities are the time functions.

Proof. A $K^+$-utility is a function $u$ which satisfies (i) $(x, y) \in K^+ \Rightarrow u(x) \leq u(y)$ and (ii) $(x, y) \in K^+$ and $(y, x) \notin K^+ \Rightarrow u(x) < u(y)$. Since the spacetime is $K$-causal this condition is equivalent to $(x, y) \in K^+ \Rightarrow x = y$ or $u(x) < u(y)$. Thus by Lemma 4 point (b), every time function is a continuous $K^+$-utility. Conversely, in a $K$-causal spacetime a continuous $K^+$-utility satisfies $x < y \Rightarrow (x, y) \in K^+ \setminus \Delta \Rightarrow u(x) < u(y)$ and hence it is a time function.
Theorem 7. A spacetime is $K$-causal if and only if it admits a time function (as a consequence time functions are always $K^+$-utilities). In this case, denoting with $\mathcal{A}$ the set of time functions, we have that the partial order $K^+$ can be recovered from the time functions, that is
\[ (x, y) \in K^+ \Leftrightarrow \forall t \in \mathcal{A},\ t(x) \leq t(y). \tag{3} \]
Proof. Assume that the spacetime admits a time function, then it is $K$-causal, that is $K^+$ is antisymmetric, indeed otherwise there would be $p, q \in M$, $p \neq q$, such that $(p, q) \in K^+$ and $(q, p) \in K^+$. By Lemma 4(b), $t(p) < t(q) < t(p)$, a contradiction. By Theorem 6 we also infer that the time function is a continuous $K^+$-utility. Assume that the spacetime is $K$-causal, then by Levin’s theorem it admits a continuous $K^+$-utility which by Theorem 6 is a time function. The last statement is also an application of Levin’s theorem.

Actually Levin’s theorem states something more because it applies also to the case in which $K$-causality does not hold. However, in this case the $K^+$-utilities are not time functions. Nevertheless, we have the following

Lemma 5. In a chronological spacetime in which $\bar{J}^+$ is transitive (that is $K^+ = \bar{J}^+$) the continuous $K^+$-utilities are also continuous $I^+$-utilities, that is, they are also semi-time functions.

Proof. Let $u$ be a $K^+$-utility, since the spacetime is chronological, we have only to prove $(x, y) \in I^+ \Rightarrow u(x) < u(y)$. The hypothesis is (i) $(x, y) \in \bar{J}^+ \Rightarrow u(x) \leq u(y)$ and (ii) $(x, y) \in \bar{J}^+$ and $(y, x) \notin \bar{J}^+ \Rightarrow u(x) < u(y)$. Note that if $(x, y) \in I^+$ then $(y, x) \notin \bar{J}^+$ because the relation $I^+$ is open and the spacetime is chronological, thus $(x, y) \in I^+ \Rightarrow (x, y) \in \bar{J}^+$ and $(y, x) \notin \bar{J}^+ \Rightarrow u(x) < u(y)$, which is the thesis.

As a consequence we are able to clarify that the consequences of Levin’s theorem are actually stronger than those of Peleg’s theorem, as we can now infer from Theorem 3.

Theorem 8. A chronological spacetime in which $\bar{J}^+$ is transitive admits a semi-time function.

Of course it actually admits a continuous $K^+$-utility, which is a stronger concept than that of the semi-time function. We have expressed the theorem in this form for the sake of comparison with Theorem 5.
By using Levin’s theorem and the smoothing result for time functions [4] it is possible to give another proof of the equivalence between K -causality and stable causality Theorem 9. K -causality coincides with stable causality. Proof. The proof that stable causality implies K -causality goes as usual. The thesis follows from K + ⊂ JS+ , because JS+ is closed, transitive and contains J + while K + is by definition the smallest relation with this property. Thus this direction follows from the equivalence between the antisymmetry of JS+ and stable causality, the antisymmetry condition being inherited by the inclusion of relations. For the other direction K -causality implies the existence of a time function, thus the existence of a temporal function and hence stable causality.
4. Time Orderings

This section is independent of the previous one. Here the representation theorem for $K^+$ (or $J_S^+$) through the time functions is proved again without the help of Levin’s theorem but using the results of [35]. I gave this proof before discovering the connection with utility theory. It is quite short and uses in an essential way the equivalence between $K$-causality and stable causality. In the last part of the proof I also use the smoothability results of [4] in order to generalize to temporal functions the representation theorem. This improvement is important because it allows us to make a connection with ‘observers’ on spacetime provided we model them with congruences of timelike curves. Given a time function on spacetime let us introduce the total preorder
\[ T^+[t] = \{(p, q) \in M \times M : t(p) \leq t(q)\}. \tag{4} \]
Any such preorder, here called time ordering, extends $J^+$ according to the definition of Sect. 2, in particular $J^+ \subset T^+[t]$. The relation $T^+[t]$ is closed because $t$ is continuous. If $t$ is temporal then we shall say that the time ordering $T^+[t]$ is also a temporal ordering. Note that the relation $T^+[t]$ is invariant under monotone time reparametrizations, that is, if $f$ is increasing, $f(t(\cdot))$ is a time function and $T^+[f(t)] = T^+[t]$. In other words, the relation $T^+[t]$ keeps the information on the simultaneity convention associated to the time function $t$, but it is insensitive to the actual values of the time intervals $t(q) - t(p)$, $p, q \in M$. As a matter of convention, in the next intersections if the index sets $\mathcal{A}$ or $\mathcal{B}$ are empty then the intersection is the whole ambient space $M \times M$.

Theorem 10. Let $\mathcal{A}$ and $\mathcal{B}$ be respectively the set of time functions and the set of temporal functions allowed by a spacetime. In every spacetime
\[ K^+ \subset J_S^+ \subset \bigcap_{t \in \mathcal{A}} T^+[t] \subset \bigcap_{t \in \mathcal{B}} T^+[t]. \]
In a stably causal spacetime
\[ K^+ = J_S^+ = \bigcap_{t \in \mathcal{A}} T^+[t] = \bigcap_{t \in \mathcal{B}} T^+[t]. \]
Proof. The first inclusion is well known while the latter is obvious because $\mathcal{B} \subset \mathcal{A}$. If, by contradiction, $J_S^+ \subset \bigcap_{t \in \mathcal{A}} T^+[t]$ does not hold then there is $(p, q) \in J_S^+ \setminus \bigcap_{t \in \mathcal{A}} T^+[t]$. In particular $\mathcal{A}$ is not empty and there is a time function $t$ such that $t(p) > t(q)$. Now, note that $J_S^+ \cap \bigcap_{t \in \mathcal{A}} T^+[t]$ is a proper subset of $J_S^+$ and, being the intersection of closed and transitive relations which contain $J^+$, shares all these same properties. As a consequence $K^+ \neq J_S^+$, but it is known [35] that in a stably causal spacetime $K^+ = J_S^+$, thus the spacetime is not stably causal although there is a time function $t$, a contradiction.

Let $(M, g)$ be a stably causal spacetime. The equality $K^+ = J_S^+$ has been proved in [32,35]. Let us prove $\bigcap_{t \in \mathcal{A}} T^+[t] \subset J_S^+$. By contradiction, assume it does not hold, then there is a pair $(p, q) \in \bigcap_{t \in \mathcal{A}} T^+[t] \setminus J_S^+$. Recall [32, Lemma 3.3] that $J_S^+ = \bigcap_{g' > g} \bar{J}^+_{g'}$. Since $(p, q) \notin J_S^+$ there is $g' > g$ such that $p \notin \bar{J}^-_{g'}(q)$ and $(M, g')$ is causal.
Fig. 1. The last argument in the proof of Theorem 10
Let $A \ni p$ be an open set such that $A \cap \bar{J}^-_{g'}(q) = \emptyset$. We are going to construct a time function $\hat{t}$ such that $\hat{t}(p) > \hat{t}(q)$, a contradiction with $(p, q) \in \bigcap_{t \in \mathcal{A}} T^+[t]$. Basically we are going to use Hawking’s averaging technique [19, Prop. 6.4.9]. We introduce a volume measure $\mu$ as in [19, Prop. 6.4.9] so that $\mu(M)$ is finite. We can find a family of Lorentz metrics $h(a)$, $a \in [0, 3]$, such that points (1)-(3) in that proof are satisfied and $h(3) = g'$. Then we construct a continuous function $\bar{\theta}(x) = \int_1^2 \theta(x, a)\,da$, where $\theta(x, a) = \mu(I^-_{h(a)}(x))$ as done there. However, here we make just a little change. The measure $\mu$ is taken with support in $A \cap I^-_g(p)$. As a consequence the function $\bar{\theta}$ is continuous and non-decreasing over every future directed causal curve while in Hawking’s construction it is increasing. Let $t$ be a time function. The continuous function $\hat{t} = t + \bar{\theta}$ is a time function and $\hat{t}(q) = t(q)$ while $\hat{t}(p) = t(p) + \mu(M)$. By choosing $\mu(M) > t(q) - t(p)$ we get the thesis.

It remains to prove the inclusion $\bigcap_{t \in \mathcal{A}} T^+[t] \supset \bigcap_{t \in \mathcal{B}} T^+[t]$. By contradiction, suppose it does not hold, then there is a pair $(p, q) \in \bigcap_{t \in \mathcal{B}} T^+[t] \setminus \bigcap_{t \in \mathcal{A}} T^+[t]$. In other words for every temporal function $t$ (there is at least one temporal function $\tau$ because $(M, g)$ is stably causal [4]), we have $t(p) \leq t(q)$, but there is a time function $\hat{t}$ such that $\hat{t}(p) > \hat{t}(q)$. Consider the (acausal) partial Cauchy hypersurface $S = \hat{t}^{-1}(\hat{t}(p))$, see Fig. 1. The set $S$ does not contain $q$; let $N = M \setminus \{q\}$ so that $S$ is a partial Cauchy hypersurface for $(N, g|_N)$. Let $D_N(S)$ be the Cauchy development of $S$ on $(N, g|_N)$ and $H_N^+(S)$ and $H_N^-(S)$ the future and past Cauchy horizons. We have $S \cap H_N^+(S) = \emptyset$, because if $r \in S \cap H_N^+(S)$ then, as $H_N^+(S)$ is generated by past inextendible lightlike geodesics on $N$, there would be a past inextendible geodesic passing through $r$, so there is a portion of the inextendible geodesic to the causal past of $r$. No other point of the geodesic can belong to $S$ because of its acausality, but since $H_N^+(S) \setminus S \subset I_N^+(S)$, we get a contradiction with the achronality of $S$. Analogously, $S \cap H_N^-(S) = \emptyset$. As a consequence the set $\mathrm{Int}\,D_N(S)$ is non-empty and being globally hyperbolic it is diffeomorphic to $\mathbb{R} \times S$, where the slices diffeomorphic to $S$ are the level sets of a temporal function $t$ on the spacetime $\mathrm{Int}\,D_N(S)$ with the induced metric (see [4]).
Choosing $a, b$, $a < b$, so that $b < t(p)$ and $a < b + \tau(p) - \tau(q)$, we construct a function $t'$ on $\mathrm{Int}\,D_N(S)$ so that $t' = t$ at those points where $a \leq t \leq b$, $t' = b$ at those points such that $t \geq b$, and $t' = a$ at those points where $t \leq a$. Note that $t'(p) = b$. Clearly $t'$ has past directed timelike gradient for $a < t < b$ but there is a discontinuity in the gradient for $t = a$ or $t = b$. However, a smooth monotone reparametrization $t'' = f(t')$ exists which sends $a$ to $a$, $b$ to $b$, and makes the gradient everywhere continuous, timelike on $a < t < b$ and vanishing for $t \leq a$ and $t \geq b$. A possible choice is
\[ t'' = \frac{b + a}{2} - \frac{b - a}{2} \cos\left( \pi\, \frac{t' - a}{b - a} \right) \quad \text{for } a \leq t' \leq b, \qquad t'' = t' \text{ elsewhere.} \]
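The properties claimed for the cosine reparametrization can be verified numerically. The following Python sketch is only a check of the formula stated above, with hypothetical values of $a$ and $b$; the function names are assumptions. It confirms that the endpoints are fixed, that the derivative of the cosine branch vanishes at $a$ and $b$ (so that the gradient of the composed function matches the zero gradient of the constant pieces), and that the map is increasing in between.

```python
import math


def reparametrize(s, a, b):
    """Cosine reparametrization on [a, b]; identity outside, as in the text."""
    if a <= s <= b:
        return (b + a) / 2 - (b - a) / 2 * math.cos(math.pi * (s - a) / (b - a))
    return s


def derivative(s, a, b):
    """Derivative of the cosine branch, valid for a <= s <= b."""
    return math.pi / 2 * math.sin(math.pi * (s - a) / (b - a))


a, b = -1.0, 2.0  # hypothetical values with a < b

# Sample the interval [a, b] and evaluate the reparametrization on the grid.
grid = [a + (b - a) * k / 200 for k in range(201)]
values = [reparametrize(s, a, b) for s in grid]
```

Only the first derivative is examined here, which is all that the statement about the continuity of the gradient requires.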
The function $t''$ can be extended in a smooth way to $M$ by setting $t'' = b$ on $\hat{t}^{-1}((\hat{t}(p), +\infty)) \setminus \mathrm{Int}\,D_N(S)$ and $t'' = a$ on $\hat{t}^{-1}((-\infty, \hat{t}(p))) \setminus \mathrm{Int}\,D_N(S)$. In particular, since $q \notin \mathrm{Int}\,D_N(S)$ and $\hat{t}(p) > \hat{t}(q)$ we have $t''(q) = a$. The function $\check{t} = \tau + t''$ is a temporal function and $\check{t}(q) = \tau(q) + a < \tau(p) + b = \check{t}(p)$, a contradiction.

It must be remarked that to every temporal function $t$ there corresponds a flow generated by the future directed timelike unit vector $u = -\nabla t / \sqrt{-g(\nabla t, \nabla t)}$. The generated congruence of timelike curves represents an extended reference frame so that every curve of the congruence is identified with an observer “at rest in the frame”. The flow is orthogonal to the slices $t = \text{const.}$, which therefore are the natural simultaneity slices as they would be obtained by the observers at rest in the frame by a local application of Einstein’s simultaneity convention [27,30,41]. This observation shows that the temporal functions, at least in principle, can be physically realized through a well defined operational procedure. The above theorem then states that while observers living in different extended reference frames may disagree on which event of a pair comes “before” or “after” the other, according to their own time function, they certainly agree whenever the pair of compared events belongs to the $K^+$ (Seifert) relation, and in fact only for pairs of that type. In other words the $K^+$ (Seifert) relation provides the set of pairs of events for which all the observers agree on their temporal order. Equation (3) can be rewritten in the equivalent form
\[ K^+ = \bigcap_{t \in \mathcal{A}} T^+[t], \tag{5} \]
thus we have just obtained an alternative proof for the same equation. This result allows us to establish those circumstances in which the chronological or causal relation can be recovered from the knowledge of the time or temporal functions. Recall that a spacetime is causally easy if it is strongly causal and $\bar{J}^+$ is transitive [35]. Recall also that a causally continuous spacetime is a spacetime which is distinguishing and reflective. Finally a spacetime is causally simple [5] if it is causal and $\bar{J}^+ = J^+$. We have causal simplicity ⇒ causal continuity ⇒ causal easiness ⇒ $K$-causality. By definition of causal easiness $K^+ = \bar{J}^+$, thus as $\bar{I}^+ = \bar{J}^+$, we easily find

Proposition 1. In a causally easy spacetime $I^+ = \mathrm{Int} \bigcap_t T^+[t]$, and in a causally simple spacetime $J^+ = \bigcap_t T^+[t]$, where the intersections are with respect to the sets of time or temporal functions.
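The second statement of Proposition 1 can be illustrated in the simplest case, 1+1 Minkowski spacetime, which is causally simple. The Python sketch below is an illustration and not part of the original argument. It uses only the one-parameter family of boosted coordinate times $t_v = \gamma (x^0 - v x^1)$, $|v| < 1$, each of which is a temporal function, sampled at finitely many velocities; the function names and the sample events are assumptions. All sampled observers agree on the order of a causally related pair, while for a spacelike pair some boosted observer reverses the order.

```python
import math


def boosted_time(v):
    """Time coordinate of the inertial frame moving with velocity v, |v| < 1 (c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return lambda event: gamma * (event[0] - v * event[1])


def in_J_plus(p, q):
    """(p, q) in J^+ for 1+1 Minkowski: q lies in the closed future cone of p."""
    return q[0] - p[0] >= abs(q[1] - p[1])


def all_observers_agree(p, q, velocities):
    """True if t(p) <= t(q) for every sampled boosted time function."""
    return all(boosted_time(v)(p) <= boosted_time(v)(q) for v in velocities)


velocities = [k / 100.0 for k in range(-99, 100)]  # sampled velocities in (-1, 1)

origin = (0.0, 0.0)            # events are pairs (x0, x1)
timelike_future = (2.0, 1.0)   # inside the future cone of the origin
null_future = (1.0, 1.0)       # on the future light cone of the origin
spacelike = (1.0, 2.0)         # not causally related to the origin
```

Because only finitely many velocities are sampled, a spacelike pair lying very close to the light cone would not be detected by this family; the exact statement requires the intersection over all time (or temporal) functions.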
5. Conclusions

The concept of causal influence is more primitive, and in fact more intuitive, than that of time. General relativistic spacetimes have by definition a causal structure but may lack a time function, namely a continuous function which respects the notion of causal precedence (i.e. if $a$ influences $b$ then the time of $a$ is less than that of $b$). In this work we have recognized the mathematical coincidence between the problem of the existence of a (semi-)time function on spacetime in the relativistic physics field and the problem of the existence of a utility function for an agent in microeconomics. From these problems two so far independent lines of research arose which, as we noted, often passed through the very same concepts. Remarkably, some results obtained in one field were not rediscovered in the other, a fact which has allowed us to use Peleg’s and Levin’s theorems to reach new results concerning the existence of (semi-)time functions in relativity. In particular, we have proved that a chronological spacetime in which $\bar{J}^+$ is transitive (for instance a reflective spacetime) admits a semi-time function. Also in a $K$-causal spacetime the existence of a time function follows solely from the closure and antisymmetry of the $K^+$ relation. In the other direction we have proved, without the help of smoothing techniques, that the existence of a time function implies $K$-causality. We have also given a new proof of the equivalence between $K$-causality and stable causality by using Levin’s theorem and smoothing techniques. Finally, we have shown in two different ways that in a $K$-causal (i.e. stably causal) spacetime the $K^+$ (i.e. Seifert) relation can be recovered from the set of time or temporal functions allowed by the spacetime. This result singles out the $K^+$ relation as one of the most important ones for the development of causality theory.

Acknowledgments. This work has been partially supported by GNFM of INDAM and by FQXi.
References
1. Andrikopoulos, A.: Szpilrajn-type theorems in economics (May 2009). Mimeo, Univ. of Ioannina. Available at http://ideas.repec.org/p/pra/mprap/14345.html
2. Aumann, R.J.: Utility theory without the completeness axiom. Econometrica 30, 445–462 (1962)
3. Beem, J.K.: Conformal changes and geodesic completeness. Commun. Math. Phys. 49, 179–186 (1976)
4. Bernal, A.N., Sánchez, M.: Smoothness of time functions and the metric splitting of globally hyperbolic spacetimes. Commun. Math. Phys. 257, 43–50 (2005)
5. Bernal, A.N., Sánchez, M.: Globally hyperbolic spacetimes can be defined as ‘causal’ instead of ‘strongly causal’. Class. Quant. Grav. 24, 745–749 (2007)
6. Bossert, W.: Intersection quasi-orderings: An alternative proof. Order 16, 221–225 (1999)
7. Bridges, D.S., Mehta, G.B.: Representations of preference orderings, Vol. 442 of Lecture Notes in Economics and Mathematical Systems. Berlin: Springer-Verlag, 1995
8. Candeal-Haro, J.C., Induráin-Eraso, E.: Utility representations from the concept of measure. Math. Soc. Sci. 26, 51–62 (1993)
9. Clarke, C.J.S., Joshi, P.S.: On reflecting spacetimes. Class. Quant. Grav. 5, 19–25 (1988)
10. Debreu, G.: Representation of preference ordering by a numerical function. In: Decision Processes, ed. Thrall, R.M., Coombs, C.H., Davis, R.L., New York: John Wiley, 1954, pp. 159–165
11. Debreu, G.: Continuity properties of Paretian utility. Int. Econ. Rev. 5, 285–293 (1964)
12. Dieckmann, J.: Volume functions in general relativity. Gen. Rel. Grav. 20, 859–867 (1988)
13. Donaldson, D., Weymark, J.A.: A quasiordering is the intersection of orderings. J. Econ. Theory 78, 328–387 (1998)
14. Dushnik, B., Miller, E.: Partially ordered sets. Amer. J. Math. 63, 600–610 (1941)
15. Eilenberg, S.: Ordered topological spaces. Amer. J. Math. 63, 39–45 (1941)
16. Evren, O., Ok, E.A.: On the multi-utility representation of preference relations. J. Econ. Theory (in press)
17. Geroch, R.: Domain of dependence. J. Math. Phys. 11, 437–449 (1970)
18. Hawking, S.W.: The existence of cosmic time functions. Proc. Roy. Soc. London, Series A 308, 433–435 (1968)
19. Hawking, S.W., Ellis, G.F.R.: The Large Scale Structure of Space-Time. Cambridge: Cambridge University Press, 1973
20. Hawking, S.W., Sachs, R.K.: Causally continuous spacetimes. Commun. Math. Phys. 35, 287–296 (1974)
21. Herden, G.: On the existence of utility functions. Math. Soc. Sci. 17, 297–313 (1989)
22. Herden, G.: On some equivalent approaches to mathematical utility theory. Math. Soc. Sci. 29, 19–31 (1995)
23. Herden, G., Pallack, A.: On the continuous analogue of the Szpilrajn theorem I. Math. Soc. Sci. 43, 115–134 (2002)
24. Kim, J.-C., Kim, J.-H.: Totally vicious spacetimes. J. Math. Phys. 34, 2435–2439 (1993)
25. Lee, L.-F.: The theorems of Debreu and Peleg for ordered topological spaces. Econometrica 40, 1151–1153 (1972)
26. Levin, V.L.: A continuous utility theorem for closed preorders on a σ-compact metrizable space. Sov. Math. Dokl. 28, 715–718 (1983)
27. Malament, D.B.: Causal theories of time and the conventionality of simultaneity. Noûs 11, 293–300 (1977)
28. Mehta, G.: Topological ordered spaces and utility functions. Int. Econ. Rev. 18, 779–782 (1977)
29. Mehta, G.: Ordered topological spaces and the theorems of Debreu and Peleg. Indian J. Pure Appl. Math. 14, 1174–1182 (1983)
30. Minguzzi, E.: Simultaneity and generalized connections in general relativity. Class. Quant. Grav. 20, 2443–2456 (2003)
31. Minguzzi, E.: The causal ladder and the strength of K-causality. I. Class. Quant. Grav. 25, 015009 (2008)
32. Minguzzi, E.: The causal ladder and the strength of K-causality. II. Class. Quant. Grav. 25, 015010 (2008)
33. Minguzzi, E.: Limit curve theorems in Lorentzian geometry. J. Math. Phys. 49, 092501 (2008)
34. Minguzzi, E.: Non-imprisonment conditions on spacetime. J. Math. Phys. 49, 062503 (2008)
35. Minguzzi, E.: K-causality coincides with stable causality. Commun. Math. Phys. 290, 239–248 (2009)
36. Minguzzi, E., Sánchez, M.: The causal hierarchy of spacetimes. In: Baum, H., Alekseevsky, D. (eds.), Recent developments in pseudo-Riemannian geometry of ESI Lect. Math. Phys., Zurich: Eur. Math. Soc. Publ. House, 2008, pp. 299–358
37. Nachbin, L.: Topology and order. Princeton: D. Van Nostrand Company, Inc., 1965
38. Nomizu, K., Ozeki, H.: The existence of complete Riemannian metrics. Proc. Amer. Math. Soc. 12, 889–891 (1961)
39. Peleg, B.: Utility functions for partially ordered topological spaces. Econometrica 38, 93–96 (1970)
40. Rader, T.: The existence of a utility function to represent preferences. Rev. Econ. Stud. 30, 229–232 (1963)
41. Robb, A.A.: A Theory of Time and Space. Cambridge: Cambridge University Press, 1914
42. Seifert, H.: The causal boundary of space-times. Gen. Rel. Grav. 1, 247–259 (1971)
43. Seifert, H.J.: Smoothing and extending cosmic time functions. Gen. Rel. Grav. 8, 815–831 (1977)
44. Sondermann, D.: Utility representations for partial orders. J. Econ. Theory 23, 183–188 (1980)
45. Sorkin, R.D., Woolgar, E.: A causal order for spacetimes with $C^0$ Lorentzian metrics: proof of compactness of the space of causal curves. Class. Quant. Grav. 13, 1971–1993 (1996)
46. Szpilrajn, E.: Sur l’extension de l’ordre partiel. Fund. Math. 16, 386–389 (1930)
47. Ward, L.E. Jr.: Partially ordered topological spaces. Proc. Am. Math. Soc. 5, 144–161 (1954)

Communicated by P.T. Chruściel
Commun. Math. Phys. 298, 869–878 (2010) Digital Object Identifier (DOI) 10.1007/s00220-010-1079-7
Communications in
Mathematical Physics
On the Best Constant in the Moser-Onofri-Aubin Inequality
Nassif Ghoussoub (1), Chang-Shou Lin (2)
(1) Department of Mathematics, University of British Columbia, Vancouver, BC V6T1Z2, Canada. E-mail: [email protected]
(2) Department of Mathematics, Taida Institute for Mathematical Sciences, National Taiwan University, Taipei, 106, Taiwan
Received: 29 September 2009 / Accepted: 24 February 2010
Published online: 27 June 2010 – © Springer-Verlag 2010
Abstract: Let $S^2$ be the 2-dimensional unit sphere and let $J_\alpha$ denote the nonlinear functional on the Sobolev space $H^1(S^2)$ defined by
$$J_\alpha(u) = \frac{\alpha}{16\pi}\int_{S^2}|\nabla u|^2\,d\mu_0 + \frac{1}{4\pi}\int_{S^2}u\,d\mu_0 - \ln\int_{S^2}e^u\,\frac{d\mu_0}{4\pi},$$
where $d\mu_0 = \sin\theta\,d\theta\wedge d\phi$. Onofri had established that $J_\alpha$ is non-negative on $H^1(S^2)$ provided $\alpha \ge 1$. In this note, we show that if $J_\alpha$ is restricted to those $u \in H^1(S^2)$ that satisfy the Aubin condition
$$\int_{S^2}e^u x_j\,d\mu_0 = 0 \quad\text{for all } 1 \le j \le 3,$$
then the same inequality continues to hold (i.e., $J_\alpha(u) \ge 0$) whenever $\alpha \ge \frac{2}{3} - \epsilon_0$ for some $\epsilon_0 > 0$. The question of Chang-Yang on whether this remains true for all $\alpha \ge \frac{1}{2}$ remains open.

1. Introduction

Let $S^2$ be the 2-dimensional unit sphere with the standard metric $g_0$ whose corresponding volume form $d\omega := \frac{d\mu_0}{4\pi}$ is normalized so that $\int_{S^2}d\omega = 1$. For $\alpha > 0$, we consider the following nonlinear functional on the Sobolev space $H^1(S^2)$:
$$J_\alpha(u) = \frac{\alpha}{16\pi}\int_{S^2}|\nabla_{g_0}u|^2\,d\omega + \int_{S^2}u\,d\omega - \ln\int_{S^2}e^u\,d\omega.$$
The classical Moser-Trudinger inequality [14] yields that $J_\alpha$ is bounded from below in $H^1(S^2)$ if and only if $\alpha \ge 1$. In [15], Onofri proved that the infimum is actually equal to zero for $\alpha = 1$, by using the conformal invariance of $J_1$ to show that
$$\inf_{u\in M}J_1(u) = \inf_{u\in H^1(S^2)}J_1(u) = 0, \tag{1.1}$$
where $M$ is the submanifold of $H^1(S^2)$ defined by
$$M := \Big\{u \in H^1(S^2);\ \int_{S^2}e^u x\,dw = 0\Big\}, \tag{1.2}$$
with $x = (x_1, x_2, x_3) \in S^2$, on which the infimum of $J_1$ is attained. Other proofs were also given by Osgood-Phillips-Sarnak [16] and by Hong [11]. Prior to that, Aubin [1] had shown that by restricting the functional $J_\alpha$ to $M$, it is then again bounded below by a (necessarily non-positive) constant $C_\alpha$, for any $\alpha \ge \frac{1}{2}$. In their work on Nirenberg's prescribing Gaussian curvature problem on $S^2$, Chang and Yang [5,6] showed that $C_\alpha$ can be taken to be equal to 0 for $\alpha \ge 1 - \epsilon_0$ for some small $\epsilon_0$. This led them to the following

Conjecture 1. If $\alpha \ge \frac{1}{2}$ then $\inf_{u\in M}J_\alpha(u) = 0$.

Note that this fails if $\alpha < \frac{1}{2}$, since the functional $J_\alpha$ is then unbounded from below (see [9]). In this article, we want to give a partial answer to this question by showing that this is indeed the case for $\alpha \ge \frac{2}{3}$ and slightly below that.

As mentioned above, Aubin had proved that for all $\alpha \ge \frac{1}{2}$, the functional $J_\alpha$ is coercive on $M$, and that it attains its infimum on some function $u \in M$. Accounting for the Lagrange multipliers, and setting $\rho = \frac{1}{\alpha}$, the Euler-Lagrange equation for $u$ is then
$$\Delta_{g_0}u + 8\pi\rho\Big(\frac{e^u}{\int_{S^2}e^u\,dw} - 1\Big) = \sum_{j=1}^{3}\alpha_j x_j e^u \quad\text{on } S^2.$$
In [6], Chang and Yang proved however that the $\alpha_j$, $j = 1, 2, 3$, necessarily vanish. Thus $u$ satisfies, up to an additive constant, the equation
$$\Delta_{g_0}u + 8\pi\rho(e^u - 1) = 0 \quad\text{on } S^2,$$
or equivalently
$$\Delta u + 2\rho(e^u - 1) = 0 \quad\text{on } S^2, \tag{1.3}$$
where now the Laplacian $\Delta := \frac{1}{4\pi}\Delta_{g_0}$ corresponds to the metric on the unit sphere whose volume form is $d\mu_0 = \sin\theta\,d\theta\wedge d\phi$. Here is the main result of this note.

Theorem 1.1. If $1 < \rho \le \frac{3}{2}$ and $u$ is a solution of (1.3), then $u \equiv 0$ on $S^2$.

This then clearly gives a positive answer to Conjecture 1 for $\alpha \ge \frac{2}{3}$.

2. The Axially Symmetric Case

The proof of Theorem 1.1 relies on the fact that the conjecture has been shown to be true in the axially symmetric case. In other words, the following result holds.

Theorem A. Let $u$ be a solution of (1.3) with $1 < \rho \le 2$. If $u$ is axially symmetric, then $u \equiv 0$ on $S^2$.
Theorem A was first established by Feldman, Froese, Ghoussoub and Gui [9] for $1 < \rho \le \frac{25}{16}$. It was eventually proved for all $1 < \rho \le 2$ by Gui and Wei [10], and independently by Lin [12]. Note that this means that the following one-dimensional inequality holds:
$$\frac{1}{2}\int_{-1}^{1}(1 - x^2)|g'(x)|^2\,dx + 2\int_{-1}^{1}g(x)\,dx - 2\ln\Big(\frac{1}{2}\int_{-1}^{1}e^{2g(x)}\,dx\Big) \ge 0,$$
for every function $g$ on $(-1, 1)$ satisfying $\int_{-1}^{1}(1 - x^2)|g'(x)|^2\,dx < \infty$ and $\int_{-1}^{1}e^{2g(x)}x\,dx = 0$.

We now give a sketch of the proof of Theorem A that connects the conjecture of Chang-Yang to an equally interesting Liouville type theorem on $\mathbb{R}^2$. For that, we let $\Pi$ denote the stereographic projection $S^2 \to \mathbb{R}^2$ with respect to the North pole $N = (0, 0, 1)$:
$$\Pi(x) := \Big(\frac{x_1}{1 - x_3}, \frac{x_2}{1 - x_3}\Big).$$
Suppose $u$ is a solution of (1.3), and set $\tilde{u}(y) := u(\Pi^{-1}(y))$ for $y \in \mathbb{R}^2$. Then $\tilde{u}$ satisfies
$$\Delta\tilde{u} + 2\rho J(y)\big(e^{\tilde{u}} - 1\big) = 0 \quad\text{in } \mathbb{R}^2,$$
where $J(y) := \big(\frac{2}{1 + |y|^2}\big)^2$ is the Jacobian of $\Pi$. By letting
$$v(y) := \tilde{u}(y) + \rho\log\big((1 + |y|^2)^{-2}\big) + \log(8\rho) \quad\text{for } y \in \mathbb{R}^2, \tag{2.1}$$
we have that $v$ satisfies
$$\Delta v + (1 + |y|^2)^l e^v = 0 \quad\text{in } \mathbb{R}^2, \tag{2.2}$$
where $l = 2(\rho - 1)$. Let $v$ be a solution of (2.2) and suppose $\beta_l(v)$ is finite, where
$$\beta_l(v) = \frac{1}{2\pi}\int_{\mathbb{R}^2}(1 + |y|^2)^l e^v\,dy. \tag{2.3}$$
Then $v(y)$ has the following asymptotic behavior at $\infty$:
$$v(y) = -\beta_l(v)\log|y| + O(1). \tag{2.4}$$
We refer to [7] for a proof of (2.4), which once combined with the Pohozaev identity yields the following result.

Lemma 2.1. Let $l > 0$ and $v$ be a solution of (2.2) such that $\beta_l(v) < +\infty$. Then $4 < \beta_l(v) < 4(l + 1)$.
Proof. Multiplying (2.2) by $y\cdot\nabla v$ and integrating by parts on $B_R = \{y \mid |y| < R\}$, we have
$$\int_{\partial B_R}(y\cdot\nabla v)\frac{\partial v}{\partial\nu}\,ds - \frac{1}{2}\int_{\partial B_R}(y\cdot\nu)|\nabla v|^2\,ds = -\int_{B_R}(1 + |y|^2)^l\,y\cdot\nabla e^v\,dy$$
$$= (2l + 2)\int_{B_R}(1 + |y|^2)^l e^v\,dy - 2l\int_{B_R}(1 + |y|^2)^{l-1}e^v\,dy - \int_{\partial B_R}(1 + |y|^2)^l(y\cdot\nu)e^v\,ds.$$
Here we used that the divergence of $(1 + |y|^2)^l y$ in $\mathbb{R}^2$ equals $(2l + 2)(1 + |y|^2)^l - 2l(1 + |y|^2)^{l-1}$. By letting $R \to +\infty$ in the above formula and by using (2.4), we obtain that
$$(2l + 2)\int_{\mathbb{R}^2}(1 + |y|^2)^l e^v\,dy - 2l\int_{\mathbb{R}^2}(1 + |y|^2)^{l-1}e^v\,dy = \pi\beta_l^2(v).$$
Note now that
$$4\pi\beta_l(v) = 2\int_{\mathbb{R}^2}(1 + |y|^2)^l e^v\,dy < (2l + 2)\int_{\mathbb{R}^2}(1 + |y|^2)^l e^v\,dy - 2l\int_{\mathbb{R}^2}(1 + |y|^2)^{l-1}e^v\,dy$$
$$< (2l + 2)\int_{\mathbb{R}^2}(1 + |y|^2)^l e^v\,dy = 2\pi(2l + 2)\beta_l(v),$$
which means that $4\pi\beta_l(v) < \pi\beta_l^2(v) < 2\pi(2l + 2)\beta_l(v)$, i.e., $4 < \beta_l(v) < 4(l + 1)$.

Note that by using (2.1) with $u \equiv 0$, Eq. (2.2) always has a special axially symmetric solution, namely
$$v^*(y) = -2\rho\log(1 + |y|^2) + \log(8\rho) \quad\text{for } y \in \mathbb{R}^2, \tag{2.5}$$
where
$$\beta_l(v^*) = 4\rho = 2(l + 2). \tag{2.6}$$
An open question that would clearly imply the conjecture of Chang and Yang is the following:

Conjecture 2. Is $v^*$ the only solution of (2.2) with $\beta_l = 2(l + 2)$?

Note that it is indeed the case if $l < 0$ (i.e., $\rho < 1$ and $\alpha > 1$), since then we can employ the method of moving planes to show that $v(y)$ is radially symmetric with respect to the origin, and then conclude that $u(x)$ is axially symmetric with respect to any line passing through the origin. Thus $u(x)$ must be a constant function on $S^2$. Equation (1.3) then yields $u = 0$, which implies $J_\alpha \ge 0$ on $M$. By passing to the limit as $\alpha \to 1$, we recover the Onofri inequality. When $l > 0$ (i.e., $\rho > 1$ and $\alpha < 1$), the method of moving planes fails and it is still an open problem whether any solution of (2.2) with $\beta_l = 2(l + 2)$ is equal to $v^*$ or not. The following uniqueness theorem reduces however the problem to whether any solution of (2.2) is radially symmetric.
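The special solution (2.5) and the value (2.6) can be checked numerically. The following is only an illustrative sketch and is not part of the original argument: the function names `v_star`, `residual`, `beta_star` and `lemma_2_1_bounds_hold` are hypothetical, and the finite-difference step and the midpoint quadrature are assumptions chosen for simplicity. It evaluates the radial form of (2.2) on $v^*$, approximates $\beta_l(v^*)$ from (2.3), and compares with the bounds of Lemma 2.1.

```python
import math


def v_star(r, rho):
    """Radial profile of (2.5): v*(r) = -2*rho*log(1 + r^2) + log(8*rho)."""
    return -2.0 * rho * math.log(1.0 + r * r) + math.log(8.0 * rho)


def residual(r, rho, h=1e-3):
    """Central finite-difference residual of (2.2) for the radial function v*:
    v'' + v'/r + (1 + r^2)^l * exp(v), with l = 2*(rho - 1). Should be near 0."""
    l = 2.0 * (rho - 1.0)
    v0, vp, vm = v_star(r, rho), v_star(r + h, rho), v_star(r - h, rho)
    second = (vp - 2.0 * v0 + vm) / (h * h)
    first = (vp - vm) / (2.0 * h)
    return second + first / r + (1.0 + r * r) ** l * math.exp(v0)


def beta_star(rho, n=20000):
    """Midpoint-rule approximation of beta_l(v*) from (2.3).
    In polar coordinates the factor 2*pi cancels, leaving the integral of
    (1 + r^2)^l * exp(v*) * r over r in (0, infinity); the substitution
    r = tan(t), t in (0, pi/2), makes the interval finite."""
    l = 2.0 * (rho - 1.0)
    dt = (math.pi / 2.0) / n
    total = 0.0
    for k in range(n):
        r = math.tan((k + 0.5) * dt)
        density = (1.0 + r * r) ** l * math.exp(v_star(r, rho)) * r
        total += density * (1.0 + r * r) * dt  # dr = (1 + r^2) dt
    return total


def lemma_2_1_bounds_hold(rho):
    """Check 4 < beta_l(v*) < 4*(l + 1) using the closed form beta_l(v*) = 4*rho."""
    l = 2.0 * (rho - 1.0)
    return 4.0 < 4.0 * rho < 4.0 * (l + 1.0)


if __name__ == "__main__":
    for rho in (1.25, 1.5, 2.0):
        print(rho, residual(1.0, rho), beta_star(rho), lemma_2_1_bounds_hold(rho))
```

For $\rho > 1$ the closed form $\beta_l(v^*) = 4\rho$ lies strictly between 4 and $4(l + 1) = 8\rho - 4$, which is consistent with Lemma 2.1.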
Theorem B. Suppose $l > 0$ and $v_i(y) = v_i(|y|)$, $i = 1, 2$, are two solutions of (2.2) satisfying
$$\beta_l(v_1) = \beta_l(v_2). \tag{2.7}$$
Then $v_1 = v_2$ under one of the following conditions: (i) $l \le 1$, or (ii) $l > 1$ and $4l < \beta_l(v_i) < 4(1 + l)$ for $i = 1, 2$.

See [12] for a proof of Theorem B. In order to show how Theorem B implies Theorem A, we suppose $u$ is a solution of (1.3) that is axially symmetric with respect to some direction. By rotating, the direction can be assumed to be $(0, 0, 1)$. By using the stereographic projection as above, and setting $v$ as in (2.1), we have
$$v(y) = -4\rho\log|y| + O(1), \qquad \frac{1}{2\pi}\int_{\mathbb{R}^2}(1 + |y|^2)^l e^v\,dy = 4\rho = 4 + 2l. \tag{2.8}$$
If $l \le 1$, i.e., $\rho \le \frac{3}{2}$, then $v = v^*$ by (i) of Theorem B, and then $u \equiv 0$. If $2 > l > 1$, then by noting that $4l < 4\rho = 4 + 2l = \beta_l(v) < 4 + 4l$, we deduce that $v = v^*$ by (ii) of Theorem B, which again means that $u \equiv 0$.

3. Proof of the Main Theorem

We shall prove Theorem 1.1 by showing that if $\rho \le \frac{3}{2}$, then any solution of (1.3) is necessarily axially symmetric. We can then conclude by using Theorem A. We shall need the following lemma.

Lemma 3.1. Let $\Omega$ be a simply connected domain in $\mathbb{R}^2$, and suppose $g \in C^2(\Omega)$ satisfies
$$\Delta g + e^g > 0 \text{ in } \Omega \quad\text{and}\quad \int_\Omega e^g\,dy \le 8\pi.$$
Consider an open set $\omega \subset \Omega$ such that $\lambda_{1,g}(\omega) \le 0$, where $\lambda_{1,g}(\omega)$ is the first eigenvalue of the operator $\Delta + e^g$ on $H_0^1(\omega)$. Then, we necessarily have that
$$\int_\omega e^g\,dy > 4\pi. \tag{3.1}$$
Lemma 3.1 was first proved in [2] by using the classical Bol inequality. The strict inequality of (3.1) is due to the fact that $\Delta g + e^g > 0$ in $\Omega$. See [3] and references therein.

Remark 3.2. We note that Lemma 3.1 can be applied even when $\omega$ is unbounded. Indeed, for simplicity, we shall assume (as will be the case in the application below to the proof of Theorem 1.1) that for some $\beta \ge 2$, we have $g(y) = -\beta\log|y| + O(1)$ at $\infty$.
We shall also assume that the corresponding null-eigenfunction $\varphi$ in $\omega$, i.e.,
$$\Delta\varphi + e^g\varphi = 0 \text{ in } \omega, \qquad \varphi|_{\partial\omega} = 0,$$
is bounded in $\omega$. Without loss of generality, we may also assume that $0 \in \omega$. Now set
$$\hat{g}(x) = g\Big(\frac{x}{|x|^2}\Big) - 2\log|x| \quad\text{and}\quad \hat{\varphi}(x) = \varphi\Big(\frac{x}{|x|^2}\Big) \quad\text{for } x \in \omega^* = \Big\{y = \frac{x}{|x|^2};\ x \in \omega\Big\}.$$
Since $\beta \ge 2$, $e^{\hat{g}}$ is a Hölder function at $0 \in \omega^*$, and $\hat{g}$ and $\hat{\varphi}$ satisfy
$$\Delta\hat{g} + e^{\hat{g}} > 0 \text{ in } \omega^*\setminus\{0\} \quad\text{and}\quad \Delta\hat{\varphi} + e^{\hat{g}}\hat{\varphi} = 0 \text{ in } \omega^*.$$
By the boundedness of $\hat{\varphi}$, $\hat{\varphi}$ is continuous on $\omega^*$. If $0 \in \omega^*$, then by noting that $\hat{g}$ satisfies
$$\Delta\hat{g} + e^{\hat{g}} \ge (\beta - 2)\delta_0,$$
where $\delta_0$ is the Dirac measure at 0 and $\beta - 2 \ge 0$, we can then apply a version of Lemma 3.1 where $\hat{g}$ can have a singularity (see [3]), to deduce that
$$\int_{\omega^*}e^{\hat{g}(x)}\,dx = \int_\omega e^{g(x)}\,dx \ge 4\pi.$$
We note that in the application of the lemma to the proof of Theorem 1.1, we have that $\varphi$ is bounded on all of $\mathbb{R}^2$.

Now we are in the position to prove the main theorem.

Proof of Theorem 1.1. Suppose $u(x)$ is a solution of (1.3). Let $\xi_0$ be a critical point of $u$. Without loss of generality, we may assume $\xi_0 = (0, 0, -1)$. By using the stereographic projection as before and letting
$$v(y) := u(\Pi^{-1}(y)) - 2\rho\log(1 + |y|^2) + \log(8\rho),$$
$v$ satisfies (2.2) and
$$\nabla v(0) = 0. \tag{3.2}$$
Set
$$\varphi(y) := y_2\frac{\partial v}{\partial y_1} - y_1\frac{\partial v}{\partial y_2}.$$
Then $\varphi$ satisfies
$$\Delta\varphi + (1 + |y|^2)^l e^v\varphi = 0 \quad\text{in } \mathbb{R}^2. \tag{3.3}$$
By (2.1), it is easy to see that $\varphi$ is bounded in $\mathbb{R}^2$. If $\varphi \not\equiv 0$, then by (3.2), $\varphi(y) = Q(y) + $ higher order terms for $|y| \ll 1$, where $Q(y)$ is a homogeneous polynomial of degree $m$ with $m \ge 2$ that is also a harmonic function, i.e., $\Delta Q = 0$. Thus, the nodal line $\{y \mid \varphi(y) = 0\}$ divides a small neighborhood of the origin into at least four regions. Let $\gamma_i$, $i = 1, 2, 3, 4$, be four branches of the nodal line of $\varphi$ issuing from the origin. If $\gamma_i$ does not intersect $\gamma_j$ for $i \ne j$, then $\mathbb{R}^2\setminus\bigcup_{i=1}^{4}\gamma_i$ contains at least four simply-connected components. See Fig. 1 below. If $\gamma_i$ intersects some $\gamma_j$, then $\mathbb{R}^2\setminus\bigcup_{i=1}^{4}\gamma_i$ contains at least three simply-connected components. See Fig. 2. If there are more branches of the nodal line of $\varphi$ issuing from the origin, then $\mathbb{R}^2\setminus\{\varphi = 0\}$ is divided into more simply-connected components. Therefore, we conclude that $\mathbb{R}^2$ is divided by the nodal line $\{y \mid \varphi(y) = 0\}$ into at least 3 regions, i.e.,
$$\mathbb{R}^2\setminus\{y \mid \varphi(y) = 0\} = \bigcup_{j=1}^{3}\Omega_j.$$

Fig. 1.

Fig. 2.

In each component $\Omega_j$, the first eigenvalue of $\Delta + (1 + |y|^2)^l e^v$ is equal to 0. Let now
$$g := \log\big((1 + |y|^2)^l e^v\big).$$
By noting that $\Delta g + e^g > 0$ in $\mathbb{R}^2$, Lemma 3.1 then implies that for each $j = 1, 2, 3$,
$$\int_{\Omega_j}e^g\,dy = \int_{\Omega_j}(1 + |y|^2)^l e^v\,dy > 4\pi.$$
It follows that
$$8\pi\rho = \sum_{j=1}^{3}\int_{\Omega_j}(1 + |y|^2)^l e^v\,dy > 12\pi,$$
which is a contradiction if we had assumed that $\rho \le \frac{3}{2}$. Thus we have $\varphi \equiv 0$, i.e., $v(y)$ is radially symmetric, so that $u$ is axially symmetric. By Theorem A, we can conclude $u \equiv 0$.

Remark 3.3. If we further assume that the antipodal point of $\xi_0$ is also a critical point of $u$, then $\mathbb{R}^2\setminus\{y \mid \varphi(y) = 0\} = \bigcup_{j=1}^{m}\Omega_j$, where $m \ge 4$. Lemma 3.1 then yields
$$8\pi\rho = \int_{\mathbb{R}^2}(1 + |y|^2)^l e^v\,dy \ge \sum_{j=1}^{m}\int_{\Omega_j}(1 + |y|^2)^l e^v\,dy > 4m\pi \ge 16\pi,$$
which is a contradiction whenever $\rho \le 2$. By Theorem A, we have again that $u \equiv 0$. For example, if $u$ is even on $S^2$ (i.e., $u(z) = u(-z)$ for all $z \in S^2$), then the main theorem holds for $\rho \le 2$.

Remark 3.4. If $v$ is a solution of (2.2) with $\beta_l(v) \le 6$, and 0 is a critical point of $v$, then by the same proof as for Theorem 1.1, we can conclude that $v$ is radially symmetric in $\mathbb{R}^2$. Furthermore, if $v(x_1, x_2)$ is even in both $x_1$ and $x_2$, then $v$ is radially symmetric if $\beta_l(v) \le 8$.

Remark 3.5. One can actually show that Conjecture 1 holds for $\rho \le \frac{3}{2} + \epsilon_0$ for some $\epsilon_0 > 0$. Indeed, it suffices to show that for $\alpha$ smaller than but close to $\frac{2}{3}$, the functional $J_\alpha$ is non-negative. Assuming not, there exists a sequence $\{\alpha_k\}_k$ such that $\frac{1}{2} < \alpha_k < \frac{2}{3}$, $\lim_k\alpha_k = \frac{2}{3}$ and $\inf_M J_{\alpha_k}(u) < 0$. Since $J_\alpha$ is coercive for each $\alpha > \frac{1}{2}$, a standard compactness argument yields the existence of a minimizer $u_k \in M$ for $J_{\alpha_k}$. Moreover, $\|u_k\|_{H^1} < C$ for some positive constant $C$ independent of $k$. Modulo extracting a subsequence, $u_k$ then converges weakly to some $u_0$ in $M$ as $k \to \infty$, and $u_0$ is necessarily a minimizer for $J_{2/3}$ in $M$. By our main result, $u_0 \equiv 0$. Now, we claim that $u_k$ actually converges strongly in $H^1$ to $u_0 \equiv 0$. This is because, as argued by Chang and Yang, the Euler-Lagrange equations are then
$$\frac{\alpha_k}{2}\Delta u_k - 1 + \frac{1}{\lambda_k}e^{u_k} = 0, \tag{3.20}$$
where $\lambda_k = \int_{S^2}e^{u_k}\,dw < C$ for some positive constant $C$. Multiplying (3.20) by $u_k$ and integrating over $S^2$, we obtain
$$\frac{\alpha_k}{2}\int_{S^2}|\nabla u_k|^2\,dw + \int_{S^2}u_k(x)\,dw = \frac{1}{\lambda_k}\int_{S^2}e^{u_k(x)}u_k(x)\,dw. \tag{3.21}$$
Applying Onofri's inequality to $u_k$ and using that $\|u_k\|_{H^1} < C$, we get that $\int_{S^2}e^{2u_k}\,dw$ is also uniformly bounded. This combined with Hölder's inequality and the fact that $u_k$ converges strongly to 0 in $L^2$ yields that $\int_{S^2}e^{u_k}u_k\,dw \to 0$. Use now (3.21) to conclude that $\|u_k\|_{H^1} \to 0$ as $k \to \infty$.
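The passage from (3.20) to (3.21) is a single integration by parts on the closed surface $S^2$. Spelled out as a sketch, using only the quantities already defined above and the identity $\int_{S^2}u\,\Delta u\,dw = -\int_{S^2}|\nabla u|^2\,dw$ (valid because $S^2$ has no boundary):

```latex
\begin{align*}
0 &= \int_{S^2} u_k \Big( \frac{\alpha_k}{2}\Delta u_k - 1 + \frac{1}{\lambda_k} e^{u_k} \Big)\, dw \\
  &= -\frac{\alpha_k}{2}\int_{S^2} |\nabla u_k|^2 \, dw
     - \int_{S^2} u_k \, dw
     + \frac{1}{\lambda_k}\int_{S^2} e^{u_k} u_k \, dw .
\end{align*}
```

Moving the first two terms to the other side gives (3.21).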
Now, write $u = v + o(\|u\|)$ for $\|u\|$ small, where $v$ belongs to the tangent space of the submanifold $M$ at $u_0 \equiv 0$ in $H^1(S^2)$. It is easy to see that $\int_{S^2}v\,x\,dw = 0$. We can calculate the second variation of $J_\alpha$ in $M$ at $u_0 \equiv 0$ and get the following estimate around 0:
$$J_\alpha(u) = \alpha\int_{S^2}|\nabla v|^2\,dw - 2\int_{S^2}|v|^2\,dw + o(\|u\|^2).$$
Note that the eigenvalues of the Laplacian on $S^2$ corresponding to the eigenspace generated by $x_1, x_2, x_3$ are $\lambda_2 = \lambda_3 = \lambda_4 = 2$, while $\lambda_5 = 6$. Since $v$ is orthogonal to $x$, we have
$$\int_{S^2}|\nabla v|^2\,dw \ge 6\int_{S^2}|v|^2\,dw,$$
and therefore
$$J_\alpha(u) \ge \Big(\alpha - \frac{1}{3}\Big)\|u\|^2 + o(\|u\|^2).$$
Taking $\alpha = \alpha_k$ and $u = u_k$ for $k$ large enough, we get that $J_{\alpha_k}(u_k) \ge 0$, which clearly contradicts our initial assumption on $u_k$.

Concluding remarks. (i) The question whether $J_\alpha(u) \ge 0$ for $\frac{1}{2} \le \alpha < \frac{2}{3}$ under the condition (1.2) is still open. However, in [13], it was proved that there is a constant $C \ge 0$ such that for any solution $u$ of (1.3) with $1 < \rho \le 2$ (i.e. $\frac{1}{2} \le \alpha < 1$), we have $|u(x)| \le C$ for all $x \in S^2$.

(ii) Recently, Liouville type equations with singular data have attracted a lot of attention among PDErs since they are closely related to vortex condensates which appear in many physics models. One of the challenges in this line of research is to understand bubbling phenomena arising from solutions of these equations, and the past twenty years have seen many works in this direction. The most delicate case in bubbling phenomena is when more than one vortex collapses into a single point. Equation (2.2) is one of the model equations that allows an accurate description of the bubbling behavior during such a collapse. See [4] and [8] for related details. Thus, understanding the structure of solutions to Eq. (2.2) is fundamentally important. As mentioned above, it is conjectured that for $l \le 2$, all solutions of (2.2) must be radially symmetric. This remains an open question, although a partial answer has been given recently in [4].

References
1. Aubin, T.: Meilleures constantes dans le théorème d’inclusion de Sobolev et un théorème de Fredholm non linéaire pour la transformation conforme de la courbure scalaire (French). J. Funct. Anal. 32(2), 148–174 (1979)
2. Bandle, C.: Isoperimetric inequalities and applications, Monographs and Studies in Mathematics, 7. Boston, MA, London: Pitman (Advanced Publishing Program), 1980
3. Bartolucci, D., Lin, C.S.: Uniqueness results for mean field equations with singular data. Comm. Part. Diff. Eqs. 34(3), 676–702 (2009)
4. Bartolucci, D., Lin, C.S., Tarantello, G.: Preprint, 2009
5. Chang, S.Y., Yang, P.: Conformal deformation of metrics on $S^2$. J. Diff. Geom. 27(2), 259–296 (1988)
6. Chang, S.Y., Yang, P.: Prescribing Gaussian curvature on $S^2$. Acta Math. 159(3–4), 215–259 (1987)
7. Cheng, K.S., Lin, C.S.: On the asymptotic behavior of solutions of the conformal Gaussian curvature equations in $\mathbb{R}^2$. Math. Ann. 308(1), 119–139 (1997)
8. Dolbeault, J., Esteban, M.J., Tarantello, G.: Multiplicity results for the assigned Gaussian curvature problem in $\mathbb{R}^2$. Nonlinear Anal. 70, 2870–2881 (2009)
9. Feldman, J., Froese, R., Ghoussoub, N., Gui, C.F.: An improved Moser-Aubin-Onofri inequality for axially symmetric functions on $S^2$. Calc. Var. Part. Diff. Eqs. 6(2), 95–104 (1998)
10. Gui, C.F., Wei, J.C.: On a sharp Moser-Aubin-Onofri inequality for functions on $S^2$ with symmetry. Pac. J. Math. 194(2), 349–358 (2000)
11. Hong, C.: A best constant and the Gaussian curvature. Proc. AMS 97, 737–747 (1986)
12. Lin, C.S.: Uniqueness of solutions to the mean field equations for the spherical Onsager vortex. Arch. Rat. Mech. Anal. 153(2), 153–176 (2000)
13. Lin, C.S.: Topological degree for mean field equations on $S^2$. Duke Math. J. 104(3), 501–536 (2000)
14. Moser, J.: A sharp form of an inequality by N. Trudinger. Indiana U. Math. J. 20, 1077–1091 (1971)
15. Onofri, E.: On the positivity of the effective action in a theory of random surfaces. Commun. Math. Phys. 86(3), 321–326 (1982)
16. Osgood, B., Phillips, R., Sarnak, P.: Extremals of determinants of Laplacians. J.F.A. 80, 148–211 (1988)

Communicated by M. Salmhofer