Commun. Math. Phys. 193, 1 – 46 (1998)
Communications in Mathematical Physics © Springer-Verlag 1998
Multi-Dimensional Homoclinic Jumping and the Discretized NLS Equation
G. Haller*
Division of Applied Mathematics, Box F, Brown University, Providence, RI 02912, USA. E-mail:
[email protected] Received: 1 September 1995 Accepted: 23 May 1997
Abstract: We consider a class of dynamical systems that arise frequently in multi-mode truncations and discretizations of partial differential equations, including the perturbed NLS. We develop a general method to detect the existence of multi-pulse solutions that are doubly asymptotic to an invariant manifold with two different time scales. We use our method together with some recent results of Li and McLaughlin to show the existence of several families of multi-pulse orbits for the Ablowitz-Ladik discretization of the perturbed NLS. These orbits include N-pulse heteroclinic orbits and N-pulse Šilnikov-type orbits for arbitrarily large N.
1. Introduction

In this paper we study a class of multi-degree-of-freedom dynamical systems which arise in modal truncations of partial differential equations on periodic domains. One usually arrives at these equations when looking for small amplitude solutions of a PDE with parametric forcing terms. An important prototype example is the damped-forced sine-Gordon equation, which we discuss briefly below for motivation. As shown in, e.g., Bishop et al. [5], a small amplitude approximation to the sine-Gordon equation leads to a perturbed nonlinear Schrödinger equation (NLS). For a range of parameters, the integrable limit of the NLS admits one linearly stable and one unstable mode together with infinitely many neutrally stable modes. These latter modes can be further decomposed into a mode of plane waves (i.e., solutions with no spatial structure) and an infinite number of neutrally stable, i.e., oscillatory modes. A finite dimensional approximation to the problem is a well-known discretization of the NLS that produces an integrable system in the unperturbed limit (see Ablowitz and Ladik [1], Bogolyubov and Prikarpatskii [7], and Miller et al. [36]).

* Partially supported by NSF Grant DMS-95011239 and AFOSR Grant F49620-95-1-0085
In the discretized NLS the plane of spatially independent solutions is invariant under both the perturbed and the unperturbed dynamics. For zero dissipation and forcing, the plane contains a circle of fixed points which is surrounded by a one-parameter family of periodic solutions. Furthermore, the invariant plane lies in a codimension two center manifold that accounts for the non-planar oscillatory modes. The center manifold is normally hyperbolic as it admits a one-dimensional stable and a one-dimensional unstable subspace at each of its points. This hyperbolicity is due to the presence of the stable and unstable modes mentioned above, and gives rise to codimension one stable and unstable manifolds to the center manifold. These invariant manifolds then coincide in two homoclinic manifolds in the integrable limit of zero forcing and damping. This phase space geometry is quite remarkable as it is a precise finite dimensional model of the phase space structure of the original PDE (see, e.g., Ercolani et al. [8], Ercolani and McLaughlin [9], and Li and McLaughlin [30] for details). A similar analogy exists between the phase space structure of the perturbed NLS equation and its two-mode approximation (see Bishop et al. [5, 6]). This fact inspired a great deal of work on modal truncations of the perturbed NLS, although all rigorous results so far are only concerned with the two-mode approximation that excludes the oscillatory modes (see Bishop et al. [5, 6], Kovačič and Wiggins [27], Haller and Wiggins [16], McLaughlin et al. [34], and Haller and Wiggins [19]). Other examples with modal truncations of the same class include parametrically forced surface wave problems (Holmes [22], and Kambe and Umeki [25]), the dynamics of forced and damped thin plates (Feng and Sethna [10]), inextensional beams (Nayfeh and Pai [37], Feng and Leal [12]), and resonantly driven coupled pendula (Miles [35], Becker and Miles [4], and Kovačič and Wettergren [28]).
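The invariance of the plane of spatially independent solutions can be checked directly on a lattice of Ablowitz-Ladik type. The sketch below is only an illustration: it assumes the standard unperturbed form i dq_n/dt = -(q_{n+1} - 2q_n + q_{n-1})/h^2 - |q_n|^2 (q_{n+1} + q_{n-1}) with periodic boundary conditions. This normalization, the function name `al_rhs`, and the values of `h` and of the amplitude are assumptions made for the example and are not taken from the text.

```python
def al_rhs(q, h):
    """Velocity dq_n/dt for an (assumed) unperturbed Ablowitz-Ladik lattice.

    Solves  i*dq_n/dt = -(q_{n+1} - 2 q_n + q_{n-1})/h**2
                        - |q_n|**2 * (q_{n+1} + q_{n-1})
    for dq_n/dt, with periodic boundary conditions.
    """
    n_sites = len(q)
    out = []
    for n in range(n_sites):
        left = q[(n - 1) % n_sites]
        right = q[(n + 1) % n_sites]
        laplacian = (right - 2 * q[n] + left) / h**2
        nonlinear = abs(q[n]) ** 2 * (right + left)
        # i*dq/dt = -(laplacian + nonlinear)  =>  dq/dt = i*(laplacian + nonlinear)
        out.append(1j * (laplacian + nonlinear))
    return out


# A spatially independent state: every lattice site carries the same amplitude.
uniform = [0.3 + 0.4j] * 8
velocity = al_rhs(uniform, h=0.5)

# All sites move with the same velocity, so the state stays spatially independent.
spread = max(abs(v - velocity[0]) for v in velocity)
print("largest velocity mismatch between sites:", spread)
```

On a uniform state the discrete Laplacian vanishes and every site evolves by the same phase rotation, which is the discrete counterpart of the plane of plane-wave solutions described above.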
All these problems can be recast in the form of Eq. (1) below. Our basic goal in this paper is to study the existence of nontrivial homoclinic and heteroclinic behavior in these systems by including an arbitrarily high but finite number of modes. The main result of the paper is the construction of a class of complicated solutions in multi-mode truncations or discretizations. These solutions admit three different time scales and correspond to irregular “jumping” around the plane Π of spatially independent modes. In our general formulation we in fact allow for the presence of a 2m-dimensional manifold Π which contains an m-torus of equilibria in the unperturbed limit. In backward time the solutions we construct asymptote to some set in Π which is born out of the perturbation of the torus of fixed points of the unperturbed limit. In forward time, after making several jumps away from Π, the solutions asymptote to other structures in the center manifold that lie in the vicinity of the manifold Π. We give a criterion for the existence of such solutions, which is a generalization of the energy-phase method developed in Haller [15] and Haller and Wiggins [19] for two-degree-of-freedom systems. Under certain conditions, the solutions we construct will ultimately asymptote to some invariant set within the manifold Π. If their ω and α-limit sets coincide, then we obtain a multi-pulse orbit homoclinic to this set. An important special case arises when this set is an equilibrium that is a sink for the dynamics on the codimension two center manifold. We call the resulting multi-pulse orbit an N-pulse Šilnikov-type orbit. Such orbits seem to have a prominent role in creating complicated or chaotic dynamics in modal equations.
While single-pulse Šilnikov orbits can also be obtained in these problems by applying a modified Melnikov method (see Kovačič and Wiggins [27], Feng and Sethna [10], Feng and Wiggins [11], Tien and Namachchivaya [38], Kovačič and Wettergren [28], and Li and McLaughlin [31]), such orbits generically exist for a single codimension one surface in the space of system parameters. In contrast, our methods typically yield multi-pulse Šilnikov-type orbits on an intricate web of the parameter space (see Haller and Wiggins [19] for a two-mode example).

The main techniques we use in this paper include the perturbation theory of normally hyperbolic invariant manifolds, their stable and unstable manifolds, and stable and unstable foliations. We do not explicitly assume that in the limit of zero forcing and damping the modal equations are integrable. We do, however, assume the presence of particular structures in this limiting geometry, which are not typical in nonintegrable cases. Our strategy is to follow trajectories in the unstable manifold of the manifold Π as they leave and repeatedly return to a neighborhood of the center manifold. The control over individual trajectories is achieved by obtaining estimates on their location as well as on their energies before and after their intermediate passages near the center manifold. This amounts to studying the properties of an appropriately defined local Poincaré map. The results of this study are summarized in the Passage Lemma (Lemma 7.1), which sets the stage for a final implicit function argument in Theorem 7.3 of Sect. 6. This argument is subtle since the equation satisfied by multi-pulse homoclinic orbits becomes undefined in the limit of the vanishing perturbation parameter. We circumvent this problem by defining an extension to the local map at this limit, and use the Passage Lemma to conclude that this extension is of class $C^1$.

We use the main result formulated in Theorem 7.3 on multi-pulse orbits to give conditions for the existence of multi-pulse orbits homoclinic to the manifold Π in Theorems 7.4-8.1 of Sect. 6. We study the “disintegration” of the unstable manifold of the plane Π via repeated jumping in Sect. 7. We give a useful reformulation of our method in Sect. 8 for the case when one of the invariants of the unperturbed limit is more convenient to use than the unperturbed Hamiltonian.
An application of the results to a near-integrable discretization of the perturbed NLS is given in Sect. 9. Finally, we present some conclusions in Sect. 10.
2. Setting and assumptions

The class of modal truncations listed in the Introduction can be written in the general form

$$\dot{x} = \omega^\sharp\,[DH_0(x) + \epsilon DH_1(x)] + \epsilon g(x), \tag{1}$$

where $x \in P \subset \mathbb{R}^{2(n+m+1)}$, with $n \ge 0$, $m \ge 1$, and $\epsilon \ge 0$ is a small parameter. The functions $H_0$ and $H_1$ are assumed to be of class $C^{r+1}$ in their arguments with $r \ge 5$ and they generate the Hamiltonian part of the vector field (1) through the symplectic form $\omega$ on the phase space $P$. The map $\omega^\sharp : T^*\mathbb{R}^{2(n+m+1)} \to \mathbb{R}^{2(n+m+1)}$ appearing in (1) is the inverse of the map $\xi \mapsto \{\omega[x](\xi, \cdot)\}$ with $x \in \mathbb{R}^{2(n+m+1)}$ and $\xi \in T_x\mathbb{R}^{2(n+m+1)}$. The function $g$ is of class $C^r$ and it corresponds to the dissipative part of the perturbation to the unperturbed limit $\epsilon = 0$. We make the following basic assumptions on system (1):

(H1) There exists a $2m$-dimensional manifold $\Pi \subset P$ which is invariant under the flow of (1) for $\epsilon = 0$. Furthermore, the manifold $\Pi$ is symplectic, i.e., the restricted two-form $\omega_\Pi = \omega|\Pi$ is nondegenerate.

(H2) For $\epsilon = 0$, system (1) restricted to $\Pi$ becomes an $m$-degree-of-freedom, completely integrable Hamiltonian system, i.e., it admits $m$ independent integrals which are in involution with respect to the Poisson bracket induced by the symplectic form $\omega_\Pi$.
By assumption (H2), the Liouville-Arnold-Jost theorem (see, e.g., Arnold [3]) guarantees the existence of an open set $N \subset \Pi$ on which we can introduce canonical action-angle variables $(I, \phi) \in \mathbb{R}^m \times \mathbb{T}^m$. (If the level surfaces of $H_0$ are not compact within the set $N$, then we have $\phi \in \mathbb{R}^m$, but all of our forthcoming results are still valid.) We assume that the frequency vector $\dot\phi$ vanishes on one of these tori, i.e.,

(H3) For $\epsilon = 0$ there exists an $m$-dimensional torus $C \subset N$ given by $I = I_0$ which is completely filled with equilibria of system (1). Furthermore, for any point $p \in \Pi$, the Jacobian $M = D\omega^\sharp H_0(x)|_{x=p}$ admits precisely $m$ pairs of zero eigenvalues, a pair $\pm\lambda_0$ of nonzero real eigenvalues, and $n$ pairs of simple, purely imaginary, nonzero eigenvalues $i\lambda_1, \ldots, i\lambda_n$.

This assumption implies the presence of a stable, an unstable, and $2n$ neutrally stable directions transverse to the manifold $\Pi$ in the unperturbed limit of system (1). We stress that in (H3) we assumed the eigenvalues and eigenvectors of $M$ to be independent of the point $p \in C$. Since the normal bundle of the torus $C$ is trivial within $\Pi$, the independence of stable, unstable and center subspaces of points on $C$ allows us to introduce local coordinates $y = (y_1, y_2) \in \mathbb{R}^2$ and $z \in \mathbb{R}^{2n}$ in a neighborhood $S_0 \subset P$ of the set $N$. The coordinates are such that Eq. (1) can be rewritten in the form

$$\begin{aligned}
\dot{y} &= \Lambda y + \bar{Y}(y, z, I, \phi; \epsilon),\\
\dot{z} &= A z + \bar{Z}(y, z, I, \phi; \epsilon),\\
\dot{I} &= \epsilon \bar{E}(y, z, I, \phi; \epsilon),\\
\dot{\phi} &= \bar{F}_0(y, z, I, \phi) + \epsilon \bar{F}(y, z, I, \phi; \epsilon).
\end{aligned} \tag{2}$$
Here $\Lambda$ is a diagonal matrix with eigenvalues $\pm\lambda$, and $A$ has the eigenvalues $i\lambda_1, \ldots, i\lambda_n$. Hence there exists a constant $C_A > 0$ such that

$$|e^{At} z| \le C_A |z|. \tag{3}$$
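The uniform bound (3) can be made concrete for a single oscillatory block. The sketch below assumes a hypothetical 2×2 block A = [[0, λk], [−λ/k, 0]] with eigenvalues ±iλ; the matrix, the parameter `k`, and the sample vectors are illustrative choices and do not come from the text. For this block exp(At) = D R(t) D⁻¹ with a rotation R(t) and D = diag(√k, 1/√k), so the constant max(k, 1/k) can serve as C_A.

```python
import math


def flow_matrix(t, lam, k):
    """exp(A t) for A = [[0, lam*k], [-lam/k, 0]], whose eigenvalues are +-i*lam."""
    c, s = math.cos(lam * t), math.sin(lam * t)
    return ((c, k * s), (-s / k, c))


def apply_matrix(m, z):
    """Multiply a 2x2 matrix (tuple of rows) by a 2-vector."""
    return (m[0][0] * z[0] + m[0][1] * z[1], m[1][0] * z[0] + m[1][1] * z[1])


def norm(z):
    return math.hypot(z[0], z[1])


lam, k = 2.0, 3.0
C_A = max(k, 1.0 / k)  # bound from exp(A t) = D R(t) D^{-1}, D = diag(sqrt(k), 1/sqrt(k))

# Largest amplification |exp(A t) z| / |z| seen over a time grid and a few test vectors.
worst_ratio = 0.0
for i in range(2000):
    t = 0.01 * i
    for z in ((1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, -2.0)):
        ratio = norm(apply_matrix(flow_matrix(t, lam, k), z)) / norm(z)
        worst_ratio = max(worst_ratio, ratio)

print("largest observed amplification:", worst_ratio, "bound C_A:", C_A)
```

The amplification never grows in time because the eigenvalues are purely imaginary and simple; only the non-orthogonality of the eigenbasis (here measured by `k`) makes C_A larger than one.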
Note that in the local coordinates we introduced, the manifold $\Pi$ satisfies the equations $y = 0$ and $z = 0$. Our next major assumption is that

(H4) For $\epsilon = 0$, the torus $C$ admits a unique, codimension two center manifold
$$M_0 = \{(y, z, I, \phi) \mid y = y^0(z, I, \phi),\ (z, I, \phi) \in V \subset \mathbb{R}^{2(n+m)}\},$$
where the function $y^0(z, I, \phi)$ is of class $C^r$.

By the uniqueness of this center manifold, $\Pi \subset M_0$ must hold (at least locally near $C$), which implies $y^0(0, I, \phi) = 0$. We note that the existence and uniqueness of $M_0$ is usually easy to verify if the unperturbed part of system (2) is integrable. In all applications we know of, this integrability is due to the fact that the system is invariant under rotations in $\phi$. In such cases the function $y^0$ has no explicit $\phi$-dependence. Taking $V$ small enough, we can ensure that $M_0$ is a normally hyperbolic invariant manifold which admits codimension one stable and unstable manifolds of class $C^r$, denoted $W^s(M_0)$ and $W^u(M_0)$, respectively. Our next assumption is the existence of a homoclinic structure in the unperturbed problem. In particular, we assume that
(H5) The manifolds $W^s(M_0)$ and $W^u(M_0)$ coincide and form two homoclinic manifolds $W_0^+(M_0)$ and $W_0^-(M_0)$. These homoclinic manifolds are foliated by orbits doubly asymptotic to the center manifold $M_0$.

Based on the applications we are interested in, our next main assumption is that

(H6) Each of the two homoclinic manifolds contains a one-parameter family of heteroclinic orbits that connect points on the torus $C$. In other words, the torus $C$ has its own $(m+1)$-dimensional stable and unstable manifolds that form two homoclinic manifolds $W_0^+(C)$ and $W_0^-(C)$. Furthermore, the heteroclinic orbits in both $W_0^+(C)$ and $W_0^-(C)$ connect the same pair of points, i.e., the phase shift vector

$$\Delta x = \lim_{t\to\infty}\left[x^h(t) - x^h(-t)\right] = \begin{pmatrix}\Delta y\\ \Delta z\\ \Delta I\\ \Delta\phi\end{pmatrix} = \begin{pmatrix}0\\ 0\\ 0\\ \lim_{t\to\infty}\left[\phi^h(t) - \phi^h(-t)\right]\end{pmatrix} \tag{4}$$
is the same for any solution $x^h(t)$ in $W_0^+(C) \cup W_0^-(C)$.

We would like to ensure that a manifold close to $\Pi$ survives the perturbation. If $n = 0$, i.e., there are no “oscillatory modes” for the linearized dynamics, then $\Pi \equiv M_0$ is normally hyperbolic, hence it smoothly perturbs to a nearby invariant manifold. For $n > 0$, however, $\Pi$ in general does not persist. Motivated by the examples listed in Sect. 1, we then require the perturbation to be such that it preserves $\Pi$:

(H7) If $n > 0$, then the manifold $\Pi$ remains invariant under the flow of system (1) for $\epsilon > 0$.

Based on assumptions (H1)-(H7), we can guarantee the persistence of certain invariant manifolds for $\epsilon > 0$ sufficiently small. The following theorem describes the properties of these manifolds.

Theorem 2.1. Suppose that assumptions (H1)-(H7) hold. Then there exists $\epsilon_0 > 0$ such that for $0 \le \epsilon < \epsilon_0$ the following are satisfied:

(i) There exists a unique, codimension-two, locally invariant manifold $M_\epsilon$ of class $C^r$ which depends on the parameter $\epsilon$ in a $C^r$ fashion. If $n > 0$, then the manifold $M_\epsilon$ contains the invariant manifold $\Pi$ which satisfies $y = 0$ and $z = 0$. If $n = 0$, then $M_0 \equiv \Pi$.

(ii) The manifold $M_\epsilon$ has codimension-one local stable and unstable manifolds $W^s_{\mathrm{loc}}(M_\epsilon)$ and $W^u_{\mathrm{loc}}(M_\epsilon)$ that are of class $C^r$ in the variables $(y, z, I, \phi)$ and $\epsilon$.

(iii) The local unstable manifold $W^u_{\mathrm{loc}}(M_\epsilon)$ is foliated by a negatively invariant family $F^u = \cup_{p \in M_\epsilon} f^u(p)$ of $C^r$ curves $f^u(p)$, i.e., $F^u = W^u_{\mathrm{loc}}(M_\epsilon)$ and $F^{-t}(f^u(p)) \subset f^u(F^{-t}(p))$ for any $t \ge 0$ and $p \in M_\epsilon$ (here $F^t$ denotes the flow generated by system (1)). Moreover, the fibers $f^u(p)$ are of class $C^r$ in $\epsilon$ and $p$, and $f^u(p) \cap f^u(p') = \emptyset$, unless $p = p'$. Finally, there exist $C_u, \lambda_u > 0$ such that if $q \in f^u(p)$ then
$$\| F^{-t}(q) - F^{-t}(p) \| < C_u e^{-\lambda_u t}, \quad \text{for any } t \ge 0.$$
(iv) The local stable manifold $W^s_{\mathrm{loc}}(M_\epsilon)$ admits a positively invariant foliation $F^s = \cup_{p \in M_\epsilon} f^s(p)$ with similar properties.
Proof. The statements of the theorem follow from a direct application of the invariant manifold results of Fenichel [13, 14]. We only note that the uniqueness of the perturbed manifold $M_\epsilon$ implies $\Pi \subset M_\epsilon$ in statement (i).

For simplicity, from now on we will not distinguish between the cases $n = 0$ (i.e., no oscillatory modes for the unperturbed linearized flow near the manifold $\Pi$) and $n > 0$. As a result, when we refer to the invariant manifold $\Pi$ for the perturbed system (1), we mean $\Pi \equiv M_\epsilon$ in the case of $n = 0$.

3. Fenichel Normal Form Near $M_\epsilon$

In this section we derive a normal form which describes the dynamics of system (1) near the normally hyperbolic invariant manifold $M_\epsilon$ which exists by Theorem 2.1. The normal form is a specific form of a result of Fenichel [14], or more precisely, of the normal form appearing in Tin [39] (see also Jones and Kopell [23]). Since this construction has appeared in several recent papers, we omit the details of the derivation of the normal form. For a detailed proof, the reader may consult Haller [21]. We first introduce the scaling

$$I = I_0 + \sqrt{\epsilon}\,\eta, \tag{5}$$
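to blow up a neighborhood of the torus of equilibria $C$. Using the coordinates $(y, z, \eta, \phi)$, we obtain the following result.

Lemma 3.1. There exists $\epsilon_0 > 0$ such that for $0 \le \epsilon < \epsilon_0$, there is a $C^r$ change of coordinates $T : (y, z, \eta, \phi) \mapsto (w, \zeta, \rho, \psi)$ (with a $C^r$ inverse) defined near the manifold $M_\epsilon$, which puts system (1) in the form

$$\begin{aligned}
\dot{w}_1 &= [-\lambda + \langle Y_1, w\rangle + \langle Y_2, \zeta\rangle + \sqrt{\epsilon}\,Y_3]\,w_1,\\
\dot{w}_2 &= [\lambda + \langle Y_4, w\rangle + \langle Y_5, \zeta\rangle + \sqrt{\epsilon}\,Y_6]\,w_2,\\
\dot{\zeta} &= A\zeta + (Z_1\zeta)\,\zeta + \sqrt{\epsilon}\,Z_2\zeta + Z_3 w_1 w_2,\\
\dot{\rho} &= \sqrt{\epsilon}\,E,\\
\dot{\psi} &= (F_1\zeta)\,\zeta + \sqrt{\epsilon}\,F_2 + F_3 w_1 w_2.
\end{aligned} \tag{6}$$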
Here the functions $Y_1, Y_4 : P \times [0, \epsilon_0] \to \mathbb{R}^2$, $Y_2, Y_5 : P \times [0, \epsilon_0] \to \mathbb{R}^{2n}$, $Y_3, Y_6 : P \times [0, \epsilon_0] \to \mathbb{R}$, $E, F_2, F_3 : P \times [0, \epsilon_0] \to \mathbb{R}^m$, $Z_3 : P \times [0, \epsilon_0] \to \mathbb{R}^{2n}$, $Z_2 : P \times [0, \epsilon_0] \to \mathbb{R}$, and the 3-tensors $Z_1 : P \times [0, \epsilon_0] \to \mathbb{R}^{2n\times 2n\times 2n}$ and $F_1 : P \times [0, \epsilon_0] \to \mathbb{R}^{m\times 2n\times 2n}$ are all of class $C^{r-4}$ in their arguments, and $\langle\cdot,\cdot\rangle$ denotes the usual Euclidean inner product. Moreover,

$$D_w Z_1 = 0, \quad D_w Z_2 = 0, \quad D_w F_1 = 0, \quad D_w F_2 = 0. \tag{7}$$

Proof. Based on the references cited above, the proof of this lemma is a routine exercise following the steps outlined in Fenichel [14]. These steps involve changes of coordinates that “straighten out” the manifolds $M_\epsilon$, $W^s_{\mathrm{loc}}(M_\epsilon)$, and $W^u_{\mathrm{loc}}(M_\epsilon)$, as well as their invariant foliations. For a detailed proof we refer the reader to Haller [21].
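To build intuition for passages near the manifold, one can drop every $Y_i$ term from the first two equations of (6). What remains is the linear saddle dw1/dt = −λ w1, dw2/dt = λ w2, for which the product w1 w2 is conserved exactly. The following sketch treats only this truncated system; the function name `linear_passage`, the box size `delta0`, and the numerical values are illustrative assumptions, not quantities fixed by the text.

```python
import math


def linear_passage(w20, lam, delta0):
    """Passage through the box |w_i| <= delta0 for the truncated linear saddle
    dw1/dt = -lam*w1, dw2/dt = lam*w2, entered at w1 = delta0, w2 = w20 > 0.

    Returns the exit time (when w2 reaches delta0) and the value of w1 at exit.
    """
    exit_time = math.log(delta0 / w20) / lam       # solves w20*exp(lam*T) = delta0
    w1_exit = delta0 * math.exp(-lam * exit_time)  # equals w20, since w1*w2 is conserved
    return exit_time, w1_exit


lam, delta0 = 1.5, 0.1
for eps in (1e-2, 1e-4, 1e-6):
    w20 = eps / delta0  # entry point at distance of order eps from the stable manifold
    T, w1_exit = linear_passage(w20, lam, delta0)
    print(f"eps={eps:.0e}  exit time={T:.3f}  w1 at exit={w1_exit:.3e}")
```

In this truncation the exit time grows like log(1/ε) while w1 at exit equals the entry value of w2. Any change of the product w1 w2 along solutions of the full normal form therefore comes only from the small coupling terms.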
4. Dynamics Near the Manifold $M_\epsilon$

In this section we use the normal form (6) to study trajectories in a neighborhood of the manifold $M_\epsilon$. The trajectories of interest lie in the unstable manifold $W^u(M_\epsilon)$ and do not intersect the local stable manifold $W^s_{\mathrm{loc}}(M_\epsilon)$ upon entering a small neighborhood of $M_\epsilon$. Since $M_\epsilon$ is of “saddle-type”, such trajectories pass near the manifold and leave its neighborhood. The question is how the coordinates $(w, \zeta, \rho, \psi)$ change during this passage and how the change depends on their initial values upon entry.

By Lemma 3.1, the flow of system (1) near the manifold $M_\epsilon$ is $C^r$-conjugate to the flow of the normal form (6) in a neighborhood of the set $w = 0$. In other words, for $\epsilon \le \epsilon_0$ the normal form is related to the original system within some fixed open set
$$S_0 = \{(w, \zeta, \rho, \psi) \mid |w| < K_w,\ |\zeta| < K_\zeta,\ |\rho| < K_I/\sqrt{\epsilon},\ \psi \in \mathbb{T}^m\},$$
where $K_w$, $K_\zeta$, and $K_I$ are fixed positive constants. We shall primarily be interested in solutions $x(t) = (w(t), \zeta(t), \rho(t), \psi(t))$ of the normal form which enter a small, fixed “box”
$$U_0 = \left\{(w, \zeta, \rho, \psi) \in S_0 \;\Big|\; |w_i| \le \delta_0 < \frac{K_w}{\sqrt{2}},\ |\zeta| \le \delta_0 < K_\zeta,\ |\rho| \le K_\rho < \frac{K_I}{\sqrt[4]{\epsilon}}\right\}$$
with positive constants $\delta_0$ and $K_\rho$. Since the functions on the right-hand side of (6) are of class $C^{r-4}$, on the closure of $S_0$ they obey the estimates

$$|Y_i|, |Z_j|, |E|, |F_k| < B_0, \qquad |DY_i|, |DZ_j|, |DE|, |DF_k| < B_0, \tag{8}$$
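for all $0 \le \epsilon \le \epsilon_0$ and for appropriate $B_0 > 0$. We want to follow a solution $x(t)$ which enters the set $U_0$ by intersecting its boundary $\partial U_0$ within the domain
$$\partial_1 U_0 = \{(w, \zeta, \rho, \psi) \in \partial U_0 \mid |\zeta| < \delta_0,\ |\rho| \le K_\rho\}$$
at time $t = 0$. For such a solution we have $w_1(0) = \delta_0$, and we assume that for $0 < \epsilon \le \epsilon_0$, the rest of the coordinates of the entry point $x(0)$ obey the entry conditions

$$|\zeta(0)| < c_1 \epsilon^\beta, \qquad \frac{c_2\epsilon}{\delta_0} < |w_2(0)| < \frac{c_3\epsilon}{\delta_0}, \qquad |\rho(0)| < c_4 < K_\rho \tag{9}$$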
for fixed positive constants $c_1, \ldots, c_4$ and for some power $\frac12 < \beta < 1$. The second inequality in (9) implies that the solution $x(t)$ enters $U_0$ close to the local stable manifold $W^s_{\mathrm{loc}}(M_\epsilon)$. Such solutions spend a long time within $U_0$, and hence their $\zeta(t)$ component does not necessarily remain under control on such time scales, i.e., $x(t)$ may not exit $U_0$ through the domain $\partial_1 U_0$ of its boundary. An exit through $\partial_1 U_0$ means that $|\zeta|$ remains bounded by $\delta_0$ while $x(t)$ is in $U_0$. Our first result shows that this is indeed the case.

Lemma 4.1. Suppose that for a solution $x(t)$, the entry conditions in (9) are satisfied. Then for any fixed constant $\beta$ with $\frac12 < \beta < 1$, there exist $\epsilon_1 > 0$ and $\delta_1 > 0$ such that for all $0 < \delta_0 < \delta_1$ and $0 < \epsilon_0 < \epsilon_1$ there exists $T^* > 0$ with $x(T^*) \in \partial_1 U_0$. Moreover, the minimal such time $T^*$ obeys the estimate

$$T^* < T_\epsilon = \frac{2}{\lambda} \log\frac{\delta_0^2}{c_2\epsilon}. \tag{10}$$
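The size of the bound (10) is easy to tabulate. The sketch below assumes that the lower entry bound in (9) reads |w2(0)| > c2 ε/δ0 and that w2 grows at least at the reduced rate λ/2 inside the box; under these assumptions the slowest admissible solution reaches |w2| = δ0 exactly at time T_ε. The function name and all numerical values are illustrative.

```python
import math


def passage_time_bound(lam, delta0, c2, eps):
    """Upper bound T_eps = (2/lam) * log(delta0**2 / (c2*eps)) for the passage time."""
    return (2.0 / lam) * math.log(delta0**2 / (c2 * eps))


lam, delta0, c2 = 1.0, 0.1, 0.5
rows = []
for eps in (1e-4, 1e-6, 1e-8, 1e-10):
    T = passage_time_bound(lam, delta0, c2, eps)
    # Growth at the reduced rate lam/2 from the smallest admissible |w2(0)| = c2*eps/delta0.
    w2_at_T = (c2 * eps / delta0) * math.exp(0.5 * lam * T)
    rows.append((eps, T, w2_at_T, math.sqrt(eps) * T))

for eps, T, w2_at_T, scaled in rows:
    print(f"eps={eps:.0e}  T_eps={T:7.3f}  w2(T_eps)={w2_at_T:.3f}  sqrt(eps)*T_eps={scaled:.2e}")
```

The last column shows why slow drifts of order √ε stay small over a passage: T_ε grows only logarithmically, so √ε T_ε tends to zero with ε.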
Proof. We start by picking constants $B_\zeta$ and $\alpha$ with $B_\zeta > c_1 > 0$ and $\beta < \alpha < 1$. Then, by the smoothness of the solution $x(t)$ with respect to $t$, (9) implies the existence of a time $\bar{T} > 0$ such that for all $t \in [0, \bar{T})$, we have

$$|\zeta(t)| \le B_\zeta \epsilon^\beta, \qquad |\rho(t)| \le K_\rho, \qquad |w_1(t) w_2(t)| \le \frac{c_3 \epsilon^\alpha}{\delta_0}. \tag{11}$$
Clearly, for $\epsilon$ small enough, (11) implies $x(t) \in S_0$. By the continuity of $x(t)$ in $t$, we also have $x(t) \in U_0$ for $t > 0$ small enough. It is also clear that $\bar{T}$ can be slightly increased so that the inequalities above still hold. Let $T^* > 0$ denote the time when $x(t)$ first intersects the boundary $\partial U_0$. One can easily check that $T^* < T_\epsilon$ must hold by assuming the contrary and observing that such an assumption would lead to $|w_2(T_\epsilon)| > |w_{20}| \exp(\lambda T_\epsilon/2) > \delta_0$, which is a contradiction.

We want to argue that $\bar{T}$ can in fact be increased up to $T^*$. Let us assume that for all fixed $B_\zeta$, $K_\rho$, and $\alpha$, there exists a time $T_0$ with $\bar{T} \le T_0 < T_\epsilon$ such that (11) holds for all $t < T_0$, but at least one of the inequalities is violated at $t = T_0$. We will consider these inequalities individually and argue that none of them can be violated at $t = T_0$ if we choose $\epsilon$ and $\delta_0$ small enough and select $B_\zeta$, $K_\rho$, and $\alpha$ properly. We note that $|w_2| < \sqrt{2}\,\delta_0$ will automatically hold in our argument since $T_0$ is smaller than the exit time $T^*$.

By assumption, the third equation of (6) yields the following estimate for all $0 \le t < T_0$ on the solution $x(t)$:
$$\begin{aligned}
|\zeta(t)| &\le |e^{At}\zeta(0)| + \int_0^t \left|e^{A(t-s)}\left[(Z_1\zeta)\,\zeta + \sqrt{\epsilon}\,Z_2\zeta + Z_3 w_1 w_2\right]\right| ds\\
&< C_A|\zeta(0)| + C_A B_0 \int_0^t \left(B_\zeta\epsilon^\beta|\zeta(s)| + \sqrt{\epsilon}\,|\zeta(s)| + \frac{c_3}{\delta_0}\epsilon^\alpha\right) ds\\
&< C_A\left[c_1\epsilon^\beta + B_0\frac{c_3}{\delta_0}\epsilon^\alpha T_\epsilon\right] + 2 C_A B_0 B_\zeta \sqrt{\epsilon}\int_0^t |\zeta(s)|\, ds,
\end{aligned}$$
where we used (3). By the Gronwall inequality, this implies

$$|\zeta(t)| \le C_A\left[c_1\epsilon^\beta + B_0\frac{c_3}{\delta_0}\epsilon^\alpha T_\epsilon\right] e^{2 C_A B_0 B_\zeta \sqrt{\epsilon}\, T_\epsilon} < 2e\,c_1 C_A \epsilon^\beta \tag{12}$$

for $\epsilon > 0$ small. Since (12) holds for all $0 \le t < T_0$, by the continuity of $|\zeta(t)|$, we obtain

$$|\zeta(T_0)| < 2e\,c_1 C_A \epsilon^\beta < B_\zeta \epsilon^\beta, \tag{13}$$

if we choose $B_\zeta = 7 c_1 C_A$. Therefore, the first inequality in (11) cannot be violated at $t = T_0$.

We now study the second inequality in (11). Using the fourth equation in (6), for $0 \le t < T_0$ we can estimate the $\rho$-component of the solution $x(t)$ as

$$|\rho(t)| < |\rho(0)| + \sqrt{\epsilon}\int_0^t |E|\, ds < |\rho(0)| + \sqrt{\epsilon}\, B_0 t < c_4 + \frac{2 B_0}{\lambda}\sqrt{\epsilon}\,\log\frac{\delta_0^2}{c_2\epsilon} < c_4 + 1, \tag{14}$$

for small $\epsilon$. Thus, selecting $K_\rho = c_4 + 2$ and using the continuity of the function $\rho(t)$, we obtain from (14) that the second inequality in (11) cannot be violated at $t = T_0$ either.

As far as the last inequality in (11) is concerned, the normal form (6) yields the differential equation

$$\frac{d}{dt}(w_1 w_2) = \left[\langle Y_1 + Y_4, w\rangle + \langle Y_2 + Y_5, \zeta\rangle + \sqrt{\epsilon}\,(Y_3 + Y_6)\right] w_1 w_2. \tag{15}$$
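The Gronwall step used in these estimates has the generic form: if u(t) ≤ a + b ∫₀ᵗ u(s) ds, then u(t) ≤ a exp(bt). As a small numerical illustration (not part of the proof), the sketch below integrates the extremal case u' = bu, u(0) = a with an explicit Euler scheme and compares it with the Gronwall bound. The function names and the values of `a`, `b`, and `t_end` are arbitrary assumptions.

```python
import math


def integrate_linear(a, b, t_end, steps=20000):
    """Explicit Euler solution of u' = b*u, u(0) = a, which is the extremal case of
    the integral inequality u(t) <= a + b * integral_0^t u(s) ds."""
    dt = t_end / steps
    u = a
    for _ in range(steps):
        u += dt * b * u
    return u


def gronwall_bound(a, b, t):
    """Gronwall: u(t) <= a + b*int_0^t u(s) ds  implies  u(t) <= a*exp(b*t)."""
    return a * math.exp(b * t)


a, b, t_end = 0.02, 0.3, 5.0
u_end = integrate_linear(a, b, t_end)
bound = gronwall_bound(a, b, t_end)
print("numerical u(t_end):", u_end, " Gronwall bound:", bound)
```

In the estimates of this section the role of `b` is played by a small quantity multiplied by a passage time of order log(1/ε), so the exponential factor stays bounded as ε tends to zero.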
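From this equation we obtain that for $0 \le t < T_0$, the product of the two $w$-components of the solution $x(t)$ admits the estimate
$$\begin{aligned}
|w_1(t) w_2(t)| &\le |w_1(0) w_2(0)| + \int_0^t \left|\langle Y_1 + Y_4, w\rangle + \langle Y_2 + Y_5, \zeta\rangle + \sqrt{\epsilon}\,(Y_3 + Y_6)\right| |w_1(s) w_2(s)|\, ds\\
&< c_3\epsilon + \int_0^t 2 B_0\left[\sqrt{2}\,\delta_0 + B_\zeta\epsilon^\beta + \sqrt{\epsilon}\right] |w_1(s) w_2(s)|\, ds.
\end{aligned}$$
Then a simple Gronwall estimate shows that
$$|w_1(t) w_2(t)| \le c_3\epsilon \exp\left\{2 B_0\left[\sqrt{2}\,\delta_0 + B_\zeta\epsilon^\beta + \sqrt{\epsilon}\right] T_\epsilon\right\} < c_3\epsilon \exp\left\{4 B_0\left[\sqrt{2} + 1\right]\delta_0 T_\epsilon\right\},$$
which implies that

$$|w_1(t) w_2(t)| < c_3\left(\frac{\delta_0^2}{c_2}\right)^{4 B_0[\sqrt{2}+1]\frac{\delta_0}{\lambda}} \epsilon^{\,1 - 8 B_0[\sqrt{2}+1]\frac{\delta_0}{\lambda}} < c_3\epsilon^\alpha, \tag{16}$$

if we choose $\delta_0$ small enough such that
$$\left(\frac{\delta_0^2}{c_2}\right)^{4 B_0[\sqrt{2}+1]\frac{\delta_0}{\lambda}} < 1, \qquad \delta_0 < \frac{\lambda(1-\alpha)}{4 B_0(1+\sqrt{2})}$$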
hold. Again, by continuity with respect to $t$, (16) implies $|w_1(T_0) w_2(T_0)| \le c_3\epsilon^\alpha/\delta_0$, hence the last inequality in (11) cannot be violated at $t = T_0$ either. But this contradicts our original assumption on the time $T_0$ and proves the statement of the lemma.

In the following lemma we describe how the coordinates of passing trajectories change and how this change depends on the initial values of these coordinates upon entry into the neighborhood $U_0$.

Lemma 4.2. Let us fix a constant $\frac12 < \beta < 1$ and assume that for $0 < \epsilon < \epsilon_0$ and $\delta_0 < \delta_1$, the entry conditions (9) hold for a solution $x(t)$ which enters the set $U_0$ at $t = 0$ and leaves it at $t = T^*$. Let us introduce the notation $a = (w_{20}, \zeta_0, \rho_0, \psi_0)$ and let $x_0 = (\delta_0, a)$ and $x^* = x(T^*) = (w_1^*, \delta_0, \zeta^*, \rho^*, \psi^*)$ define the coordinates of the solution at entry and departure, respectively. Then there exist constants $K > 0$, $0 < \mu < \frac12$, and $\delta_0^* > 0$, and for any $\delta_0 < \delta_0^*$ there exists $\epsilon_0^* > 0$ such that for all $0 < \epsilon < \epsilon_0^*$ the following estimates hold:

(i) $|w_1^*| < K\epsilon^\beta$, $\quad |\zeta^* - \zeta_0| < K\epsilon^\beta$, $\quad |\rho^* - \rho_0| < K\sqrt{\epsilon^\beta}$, $\quad |\psi^* - \psi_0| < K\sqrt{\epsilon^\beta}$.

(ii) $|D_a w_1^*| < K\epsilon^\beta$, $\quad |D_a\zeta^* - (0, 1, 0, 0)| < K\epsilon^\mu$, $\quad |D_a\rho^* - (0, 0, 1, 0)| < K\epsilon^\mu$, $\quad |D_a\psi^* - (0, 0, 0, 1)| < K\epsilon^\mu$.

(iii) $|D_\mu w_1^*| < K\epsilon^\beta$, $\quad |D_\mu\zeta^*| < K\epsilon^\mu$, $\quad |D_\mu\rho^*| < K\epsilon^\mu$, $\quad |D_\mu\psi^*| < K\epsilon^\mu$.
Proof. We start the proof by establishing a lower estimate and a refined upper estimate for the exit time $T^*$. From the normal form (6) we easily obtain that

$$|w_{20}|\, e^{(\lambda + 3\delta_0 B_0)t} > |w_2(t)| > |w_{20}|\, e^{(\lambda - 3\delta_0 B_0)t}, \tag{17}$$

which in turn gives

$$T_1 = \frac{1}{\lambda + 3\delta_0 B_0}\log\frac{\delta_0^2}{c_2\epsilon} < T^* < T_2 = \frac{1}{\lambda - 3\delta_0 B_0}\log\frac{\delta_0^2}{c_2\epsilon} \tag{18}$$

for any solution with initial conditions satisfying the estimates in (9). We now turn to the proof of statement (i). From (6) we obtain that

$$|w_1^*| = |w_1(T^*)| < |w_1(T_1)| < |w_{10}|\, e^{-(\lambda - 3\delta_0 B_0)T_1} \le \delta_0\left(\frac{\delta_0^2}{c_2\epsilon}\right)^{-\frac{\lambda - 3\delta_0 B_0}{\lambda + 3\delta_0 B_0}} < \epsilon^\beta \tag{19}$$

provided

$$\delta_0 < \frac{\lambda(1-\beta)}{3 B_0(1+\beta)}. \tag{20}$$

By Lemma 4.1, all inequalities in (11) hold for $t \in [0, T^*]$, thus selecting $B_\zeta = 7 c_1 C_A$ (as in the proof of that lemma) and setting $t = T^*$, we obtain $|\zeta^*| < B_\zeta\epsilon^\beta$. This inequality and (9) imply that

$$|\zeta^* - \zeta_0| \le |\zeta^*| + |\zeta_0| < (B_\zeta + 1)\epsilon^\beta. \tag{21}$$
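From the fourth equation in (6) we see that

$$|\rho^* - \rho_0| \le \sqrt{\epsilon}\int_0^{T^*} |E|_{x(t)}\, dt < \sqrt{\epsilon}\, B_0 T_\epsilon < \frac{2 B_0}{\lambda}\sqrt{\epsilon}\,\log\frac{\delta_0^2}{c_2\epsilon} < \frac{2 B_0}{\lambda}\sqrt{\epsilon^\beta}. \tag{22}$$

Finally, the last equation in (6) and (11) yield the estimate

$$\begin{aligned}
|\psi^* - \psi_0| &\le \int_0^{T^*}\left[\,|(F_1\zeta)\,\zeta| + \sqrt{\epsilon}\,|F_2| + |F_3|\,|w_1 w_2|\,\right]_{x(t)} dt\\
&< \left[B_\zeta^2 B_0\epsilon^{2\beta} + \sqrt{\epsilon}\,B_0 + \frac{B_0 c_3}{\delta_0}\epsilon^\alpha\right] T_\epsilon\\
&< \frac{2 B_0}{\lambda}\left[B_\zeta^2 + \frac{c_3}{\delta_0} + 1\right]\sqrt{\epsilon^\beta}.
\end{aligned} \tag{23}$$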
But then (19), (21), (22), and (23) show that statement (i) of the lemma is satisfied if we choose K > 0 big enough. To prove statement (ii), we first need the variational equation associated with the normal form (6). We shall only sketch the proof of the estimates in (ii) for the derivatives of x∗ with respect to ρ0 . To this end, we need the ρ0 -variational equation associated with the normal form (6):
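$$\begin{aligned}
\frac{d}{dt} D_{\rho_0} w_1 &= \left[-\lambda + \langle Y_1, w\rangle + \langle Y_2, \zeta\rangle + \sqrt{\epsilon}\,Y_3\right] D_{\rho_0} w_1\\
&\quad + \left[\langle DY_1 D_{\rho_0}x, w\rangle + \langle Y_1, D_{\rho_0}w\rangle + \langle DY_2 D_{\rho_0}x, \zeta\rangle + \langle Y_2, D_{\rho_0}\zeta\rangle + \sqrt{\epsilon}\,DY_3 D_{\rho_0}x\right] w_1,\\
\frac{d}{dt} D_{\rho_0} w_2 &= \left[\lambda + \langle Y_4, w\rangle + \langle Y_5, \zeta\rangle + \sqrt{\epsilon}\,Y_6\right] D_{\rho_0} w_2\\
&\quad + \left[\langle DY_4 D_{\rho_0}x, w\rangle + \langle Y_4, D_{\rho_0}w\rangle + \langle DY_5 D_{\rho_0}x, \zeta\rangle + \langle Y_5, D_{\rho_0}\zeta\rangle + \sqrt{\epsilon}\,DY_6 D_{\rho_0}x\right] w_2,\\
\frac{d}{dt} D_{\rho_0} \zeta &= A D_{\rho_0}\zeta + \left((DZ_1 D_{\rho_0}x)\,\zeta\right)\zeta + (Z_1\zeta)\,D_{\rho_0}\zeta + (Z_1 D_{\rho_0}\zeta)\,\zeta\\
&\quad + \sqrt{\epsilon}\,\langle DZ_2 D_{\rho_0}x, \zeta\rangle + \sqrt{\epsilon}\,\langle Z_2, D_{\rho_0}\zeta\rangle + DZ_3 D_{\rho_0}x\, w_1 w_2 + Z_3 D_{\rho_0}(w_1 w_2),\\
\frac{d}{dt} D_{\rho_0} \rho &= \sqrt{\epsilon}\,DE\, D_{\rho_0}x,\\
\frac{d}{dt} D_{\rho_0} \psi &= \left((DF_1 D_{\rho_0}x)\,\zeta\right)\zeta + (F_1 D_{\rho_0}\zeta)\,\zeta + (F_1\zeta)\,D_{\rho_0}\zeta\\
&\quad + \sqrt{\epsilon}\,\langle DF_2, D_{\rho_0}x\rangle + \langle DF_3, D_{\rho_0}x\rangle\, w_1 w_2 + F_3 D_{\rho_0}(w_1 w_2).
\end{aligned} \tag{24}$$

Let us select constants $\alpha$, $\gamma$, $\mu$, and $\nu$ with

$$0 < \mu < \nu < \frac12 < \gamma < \beta < \alpha < 1. \tag{25}$$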
Then, by the smoothness of the solution $x(t)$ with respect to $t$, there exists a time $T_0 \le T^*$ such that for all $t \in [0, T_0)$ and for $\epsilon > 0$ sufficiently small,

$$|D_{\rho_0}\zeta(t)| \le B_\zeta'\epsilon^\gamma, \quad |D_{\rho_0}\rho(t) - 1| \le K_\rho'\epsilon^\mu, \quad |D_{\rho_0}\psi(t)| \le K_\psi'\epsilon^\mu, \quad |D_{\rho_0}[w_1(t) w_2(t)]| \le K_0'\epsilon^\beta, \tag{26}$$

$$|D_{\rho_0}w_1(t)| \le K_w'\epsilon^\beta, \quad |D_{\rho_0}w_2(t)| \le K_w'\epsilon^{-\nu}, \quad \|D_{\rho_0}x(t)\| \le 2K_w'\epsilon^{-\nu}, \tag{27}$$

with appropriate positive constants $B_\zeta'$, $K_\rho'$, $K_\psi'$, $K_0'$, and $K_w'$. We also recall that for $t \in [0, T^*]$, the inequalities in (11) hold, and we have $T^* < T_\epsilon$ by Lemma 4.1. As in the proof of Lemma 4.1, we shall argue that none of the inequalities in (26) and (27) can be violated at $t = T_0$ if we choose the constants appearing in those inequalities properly. Thus we can select $T_0 = T^*$, i.e., we obtain estimates of the form (26) and (27) on the whole time interval while the solution $x(t)$ stays inside the set $U_0$. We shall use these estimates to prove statement (ii) of the lemma.

We start by considering the inequalities in (26). From the third equation in (24) we obtain that
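$$\begin{aligned}
|D_{\rho_0}\zeta(t)| &\le \int_0^t C_A\left[B_\zeta^2\epsilon^{2\beta} B_0\, 2K_w'\epsilon^{-\nu} + 2 B_\zeta B_0\epsilon^\beta |D_{\rho_0}\zeta(t)|\right] ds\\
&\quad + C_A\int_0^t \left[\sqrt{\epsilon}\,B_0\, 2K_w'\epsilon^{-\nu} B_\zeta\epsilon^\beta + \sqrt{\epsilon}\,B_0 |D_{\rho_0}\zeta(t)| + B_0\, 2K_w'\epsilon^{-\nu}\frac{c_3}{\delta_0}\epsilon^\alpha\right] ds\\
&< 2 C_A K_w' B_0\left[B_\zeta^2\epsilon^{2\beta-\nu} + B_\zeta\epsilon^{\frac12+\beta-\nu} + \frac{c_3}{\delta_0}\epsilon^{\alpha-\nu}\right] T_0 + \int_0^t 2 C_A B_\zeta B_0\epsilon^\beta |D_{\rho_0}\zeta(t)|\, ds\\
&< \frac{1}{e} B_\zeta'\epsilon^\gamma + \int_0^t 2 C_A B_\zeta B_0\epsilon^\beta |D_{\rho_0}\zeta(t)|\, ds,
\end{aligned} \tag{28}$$

provided we choose $\nu$ small enough so that

$$\alpha - \gamma > \nu, \tag{29}$$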
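and select $\epsilon$ small enough. Then the Gronwall inequality applied to (28) shows that for all $t \in [0, T_0]$,

$$|D_{\rho_0}\zeta(t)| \le \frac{1}{e} B_\zeta'\epsilon^\gamma \exp\left[\frac{4}{\lambda} C_A B_0 B_\zeta\epsilon^\beta \log\frac{\delta_0^2}{c_2\epsilon}\right] \le B_\zeta'\epsilon^\gamma, \tag{30}$$

for $\epsilon$ small. For all $t \in [0, T_0]$, from the fourth equation in (24) we obtain the estimate
$$|D_{\rho_0}\rho(t) - 1| \le K_\rho'\epsilon^\mu,$$
if we select $\mu$ small enough such that

$$\frac12 - \nu > \mu. \tag{31}$$

Using the last equation in (24), we see that for all $t \in [0, T_0]$,

$$\begin{aligned}
|D_{\rho_0}\psi(t)| &\le \int_0^t \Big(B_0\, 2K_w'\epsilon^{-\nu} B_\zeta^2\epsilon^{2\beta} + B_0 B_\zeta'\epsilon^\gamma B_\zeta\epsilon^\beta + B_0 B_\zeta\epsilon^\beta B_\zeta'\epsilon^\gamma + \sqrt{\epsilon}\,B_0\, 2K_w'\epsilon^{-\nu}\\
&\qquad\quad + B_0\, 2K_w'\epsilon^{-\nu}\frac{c_3}{\delta_0}\epsilon^\alpha + B_0 K_0'\epsilon^\beta\Big)\, ds\\
&< B_0\left[2 B_\zeta^2 K_w'\epsilon^{2\beta-\nu} + 2 B_\zeta B_\zeta'\epsilon^{\beta+\gamma} + 2K_w'\epsilon^{\frac12-\nu} + 2K_w'\frac{c_3}{\delta_0}\epsilon^{\alpha-\nu} + K_0'\epsilon^\beta\right] T_\epsilon\\
&< K_\psi'\epsilon^\mu,
\end{aligned} \tag{32}$$

provided (29) and (31) hold.

To estimate the time interval on which the last inequality in (26) holds, we note that the time evolution of the quantity $w_2 D_{\rho_0}w_1$ is given by the equation

$$\begin{aligned}
\frac{d}{dt}\left(w_2 D_{\rho_0}w_1\right) &= \left[\langle DY_1 D_{\rho_0}x, w\rangle + \langle Y_1, D_{\rho_0}w\rangle + \langle DY_2 D_{\rho_0}x, \zeta\rangle + \langle Y_2, D_{\rho_0}\zeta\rangle + \sqrt{\epsilon}\,\langle DY_3, D_{\rho_0}x\rangle\right] w_1 w_2\\
&\quad + \left[\langle Y_1 + Y_4, w\rangle + \langle Y_2 + Y_5, \zeta\rangle + \sqrt{\epsilon}\,(Y_3 + Y_6)\right] w_2 D_{\rho_0}w_1.
\end{aligned} \tag{33}$$

We now estimate the terms on the right-hand side of this expression individually on the time interval $[0, T_0)$. The first term can be estimated as

$$\begin{aligned}
|\langle DY_1 D_{\rho_0}x, w\rangle\, w_1 w_2| &< B_0\left(w_1^2|w_2| + w_2^2|w_1|\right)\left(|D_{\rho_0}w_1| + |D_{\rho_0}w_2| + |D_{\rho_0}\zeta| + |D_{\rho_0}\rho| + |D_{\rho_0}\psi|\right)\\
&< B_0\Big[|w_2 D_{\rho_0}w_1|\left(w_1^2 + |w_1 w_2|\right) + |w_1 D_{\rho_0}w_2|\left(w_2^2 + |w_1 w_2|\right)\\
&\qquad\quad + 3|D_{\rho_0}\rho|\,|w_1 w_2|\left(|w_1| + |w_2|\right)\Big]\\
&< 2\delta_0^2 B_0\left(|w_2 D_{\rho_0}w_1| + |w_1 D_{\rho_0}w_2|\right) + 12 B_0 c_3\epsilon^\alpha.
\end{aligned} \tag{34}$$

In a similar fashion, we can estimate the remaining terms to obtain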
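$$\begin{aligned}
|\langle Y_1, D_{\rho_0}w\rangle\, w_1 w_2| &< \delta_0 B_0\left(|w_2 D_{\rho_0}w_1| + |w_1 D_{\rho_0}w_2|\right),\\
|\langle DY_2 D_{\rho_0}x, \zeta\rangle\, w_1 w_2| &< \delta_0 B_\zeta B_0\epsilon^\beta\left(|w_2 D_{\rho_0}w_1| + |w_1 D_{\rho_0}w_2| + \frac{6 c_3\epsilon^\alpha}{\delta_0^2}\right),\\
|\langle Y_2, D_{\rho_0}\zeta\rangle\, w_1 w_2| &< B_0\frac{c_3 B_\zeta'}{\delta_0}\epsilon^{\alpha+\gamma},\\
\sqrt{\epsilon}\,|\langle DY_3, D_{\rho_0}x\rangle\, w_1 w_2| &< \sqrt{\epsilon}\,\delta_0 B_0\left(|w_2 D_{\rho_0}w_1| + |w_1 D_{\rho_0}w_2| + \frac{6 c_3\epsilon^\alpha}{\delta_0^2}\right),\\
|\langle Y_1 + Y_4, w\rangle\, w_2 D_{\rho_0}w_1| &< 2\delta_0 B_0 |w_2 D_{\rho_0}w_1|,\\
|\langle Y_2 + Y_5, \zeta\rangle\, w_2 D_{\rho_0}w_1| &< 2 B_\zeta B_0\epsilon^\beta |w_2 D_{\rho_0}w_1|,\\
\sqrt{\epsilon}\,|(Y_3 + Y_6)\, w_2 D_{\rho_0}w_1| &< 2\sqrt{\epsilon}\,B_0 |w_2 D_{\rho_0}w_1|.
\end{aligned} \tag{35}$$

Integrating (33) and using the estimates (34)–(35), we find that for all $t \in [0, T_0)$,
$$|w_2(t) D_{\rho_0}w_1(t)| < \int_0^t\left\{\delta_0 B_0\left[11|w_2 D_{\rho_0}w_1| + 5|w_1 D_{\rho_0}w_2|\right] + \frac{c_3 B_0}{\delta_0}\left[B_\zeta' + 12\delta_0 + 6(B_\zeta + 1)\right]\epsilon^\alpha\right\} ds,$$
which gives

$$|w_2(t) D_{\rho_0}w_1(t)| < \frac{2 c_3 B_0}{\delta_0\lambda}\left[B_\zeta' + 12\delta_0 + 6(B_\zeta + 1)\right]\epsilon^\alpha\log\frac{\delta_0^2}{c_2\epsilon} + \int_0^t \delta_0 B_0\left[11|w_2 D_{\rho_0}w_1| + 5|w_1 D_{\rho_0}w_2|\right] ds. \tag{36}$$

By the symmetry of the normal form (6), we immediately obtain

$$|w_1(t) D_{\rho_0}w_2(t)| < \frac{2 c_3 B_0}{\delta_0\lambda}\left[B_\zeta' + 12\delta_0 + 6(B_\zeta + 1)\right]\epsilon^\alpha\log\frac{\delta_0^2}{c_2\epsilon} + \int_0^t \delta_0 B_0\left[11|w_1 D_{\rho_0}w_2| + 5|w_2 D_{\rho_0}w_1|\right] ds. \tag{37}$$

Adding the two inequalities (36) and (37), then applying a Gronwall estimate to the resulting inequality, we obtain

$$\begin{aligned}
|w_2(t) D_{\rho_0}w_1(t)| + |w_1(t) D_{\rho_0}w_2(t)| &< \frac{4 c_3 B_0}{\delta_0\lambda}\left[B_\zeta' + 12\delta_0 + 6(B_\zeta + 1)\right]\epsilon^\alpha\log\frac{\delta_0^2}{c_2\epsilon}\\
&\quad\times\exp\left[\frac{32\delta_0 B_0}{\lambda}\log\frac{\delta_0^2}{c_2\epsilon}\right]\\
&< K_0'\epsilon^\beta,
\end{aligned} \tag{38}$$

where we selected

$$\alpha - \beta > \frac{32\delta_0 B_0}{\lambda}, \tag{39}$$

and assumed that $\epsilon$ is small enough. Then the inequality (38) implies that for all $t \in [0, T_0]$,

$$|D_{\rho_0}[w_1(t) w_2(t)]| \le K_0'\epsilon^\beta. \tag{40}$$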
It then remains to verify the last three inequalities in (26) for t ≤ T0 . Using the second inequality in (9) with (17) and (38) yields the estimate 0 0 δ 2 λ−3δ 16δ0 +B0 0 B0 4c3 B0 0 α+ 0 Bζ + 12δ0 + 6(Bζ + 1) λ−3δ0 B0 . |Dρ0 w1 (t)| < c2 λ c2 19δ B −λ
If we use (39) and select
(41)
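$$\delta_0 < \frac{\lambda}{6 B_0}, \tag{42}$$

then we obtain
$$|D_{\rho_0}w_1(t)| < K_w'\epsilon^\beta,$$
since $32\delta_0 B_0/\lambda > 16\delta_0 B_0/(\lambda - 3\delta_0 B_0)$. Furthermore, from (36)–(37) we obtain
$$|w_1(t) D_{\rho_0}w_2(t)| < \frac{\bar{K}\epsilon^\alpha}{\delta_0}\log\frac{\delta_0^2}{c_2\epsilon}\; e^{16\delta_0 B_0 t}$$
for an appropriate constant $\bar{K}$. This, combined with the easy estimate $|w_1(t)| > \delta_0\exp[-(\lambda + 3\delta_0 B_0)t]$ from the normal form (6), implies that

$$|D_{\rho_0}w_2(t)| < \frac{\bar{K}\epsilon^\alpha}{\delta_0}\log\frac{\delta_0^2}{c_2\epsilon}\; e^{(\lambda + 19\delta_0 B_0)T_2} < \bar{K}\epsilon^{-\nu}, \tag{43}$$

if we choose

$$\alpha + \nu > \frac{\lambda + 19\delta_0 B_0}{\lambda - 3\delta_0 B_0}, \tag{44}$$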
and let $\epsilon > 0$ be small enough. Since the last inequality in (27) trivially follows from (26), (41), (43), we conclude from (30)–(32) and (40)–(43) that the estimates in (27) hold for all $t \in [0, T^*]$, provided we satisfy (20), (29), (31), (39), (42), (44), and select $\epsilon$ small enough.

We now use (26) and (27) to prove statement (ii) of the lemma. First note that for any initial value $x_0 \in \partial_1 U_0$, the time $t = T^*$ that the corresponding solution $x(t; x_0)$ spends within $U_0$ is the solution of the equation
\[
w_2(t; x_0) = \delta_0, \qquad t \ge 0. \tag{45}
\]
From the second equation in (6) we can estimate the magnitude of $\dot w_2(T^*)$ as
\[
|\dot w_2(T^*)| \ge \frac{\lambda}{2} |w_2(T^*)| = \frac{\lambda}{2} \delta_0.
\]
This inequality shows that
\[
\frac{\partial}{\partial t} w_2(t; x_0)\Big|_{t = T^*} = \dot w_2(T^*) \ne 0,
\]
hence, by the implicit function theorem, we can solve (45) near $(T^*, x_0)$ to obtain a continuous function $T^*(x_0)$. Moreover, this function is in fact of class $C^r$, since the solution $w_2(t; x_0)$ is a $C^r$ function of the initial data and depends on $t$ in a $C^r$ fashion. Consequently, the function $x^*(x_0) = x(T^*(x_0); x_0)$ is of class $C^r$. Using this expression, the derivatives of the components of $x^*$ with respect to $\rho_0$ can be computed as
\[
\begin{aligned}
D_{\rho_0} w_1^*(x_0) &= -\frac{\dot w_1(T^*; x_0)}{\dot w_2(T^*; x_0)}\, D_{\rho_0} w_2(T^*; x_0) + D_{\rho_0} w_1(T^*; x_0), \\
D_{\rho_0} \zeta^*(x_0) &= -\frac{\dot \zeta(T^*; x_0)}{\dot w_2(T^*; x_0)}\, D_{\rho_0} w_2(T^*; x_0) + D_{\rho_0} \zeta(T^*; x_0), \\
D_{\rho_0} \rho^*(x_0) &= -\frac{\dot \rho(T^*; x_0)}{\dot w_2(T^*; x_0)}\, D_{\rho_0} w_2(T^*; x_0) + D_{\rho_0} \rho(T^*; x_0), \\
D_{\rho_0} \psi^*(x_0) &= -\frac{\dot \psi(T^*; x_0)}{\dot w_2(T^*; x_0)}\, D_{\rho_0} w_2(T^*; x_0) + D_{\rho_0} \psi(T^*; x_0),
\end{aligned} \tag{46}
\]
where we used (45). Then, using the normal form (6), the estimates in (26)-(27) with t = T ∗ , and the inequality (31), we obtain from (46) the following estimates: |Dρ0 w1∗ (x0 )| <
3δ02 −2(λ−3δ0 B0 )T1 0 −ν 0 β e Kw + K w c2 λ−9δ0 B0
−ν
0 β 0 < K1 λ+3δ0 B0 + Kw < (K1 + Kw )β , 0 0 2Bζ γ √ 2Kw c3 |Dρ0 ζ ∗ (x0 )| < kAkBζ β +Bζ2 B0 2β + B0 Bζ β +B0 α −ν + λδ0 δ0 λδ0 " # 0 0 2Kw c3 2 B 0 Bζ + B ζ + + kAkBζ µ , < Bζ + (47) λδ0 δ0 # " 0 0 √ 2B0 Kw −ν 0 0 2B0 Kw µ ∗ µ |Dρ0 ρ (x0 ) − 1| < , + K ρ < Kρ + λδ0 λδ0 √ 0 0 2 c3 |Dρ0 ψ ∗ (x0 )| < Bζ2 B0 2β + B0 + B0 α Kw −ν + Kψ µ λδ0 δ0 " # 0 0 2Kw B0 c3 2 < Kψ + Bζ + 1 + µ , λδ0 δ0
if we let
\[
\beta + \nu < \frac{\lambda - 9\delta_0 B_0}{\lambda + 3\delta_0 B_0}. \tag{48}
\]
But (47), together with identical estimates for the rest of the components of $D_a x^*$, implies the inequalities in statement (ii) of the lemma.

It remains to show that the constants we introduced in the proof of statements (i)–(ii) can indeed be chosen in a way so that all required relations are satisfied. To satisfy these relations, we pick
\[
\alpha = \frac{\beta + 1}{2}, \qquad \gamma = \frac{2\beta + 1}{4}, \qquad \nu = \beta(1 - \beta), \qquad \mu = \frac{1 - \beta}{2}. \tag{49}
\]
For this choice of parameters, the inequalities (25), (29), and (31) are satisfied. Furthermore, (39) and (42) are also satisfied if
\[
\delta_0 < \frac{1 - \beta}{64 B_0}, \tag{50}
\]
and (44) is satisfied if
\[
\delta_0 < \frac{\lambda}{B_0}\, \frac{3\beta - 2\beta^2 - 1}{9\beta - 6\beta^2 + 41}. \tag{51}
\]
Finally, condition (48) requires that
\[
\delta_0 < \frac{\lambda}{3 B_0}\, \frac{\beta^2 - 2\beta + 1}{-\beta^2 + 2\beta + 3}. \tag{52}
\]
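The exponent choice (49) and the resulting restrictions on $\delta_0$ can be checked with exact arithmetic. The following Python sketch is only an illustration: the function names are hypothetical, and each bound is obtained by solving the corresponding condition (39), (42), (44), (48) for $\delta_0$ directly, assuming $\lambda - 3\delta_0 B_0 > 0$ (which (42) guarantees).

```python
from fractions import Fraction


def exponents(beta):
    """Exponents alpha, gamma, nu, mu of (49) for a given 1/2 < beta < 1."""
    alpha = (beta + 1) / 2
    gamma = (2 * beta + 1) / 4
    nu = beta * (1 - beta)
    mu = (1 - beta) / 2
    return alpha, gamma, nu, mu


def delta0_bound(beta, lam, B0):
    """Upper bound on delta0 implied by (39), (42), (44), (48) under (49).

    Illustrative helper: every entry below is the corresponding inequality
    solved for delta0.
    """
    alpha, _, nu, _ = exponents(beta)
    a = alpha + nu   # left-hand side of (44)
    b = beta + nu    # left-hand side of (48)
    bounds = [
        lam * (alpha - beta) / (32 * B0),     # (39): alpha - beta > 32 d B0 / lam
        lam / (6 * B0),                       # (42): d < lam / (6 B0)
        lam * (a - 1) / (B0 * (19 + 3 * a)),  # (44): a > (lam + 19 d B0)/(lam - 3 d B0)
        lam * (1 - b) / (B0 * (9 + 3 * b)),   # (48): b < (lam - 9 d B0)/(lam + 3 d B0)
    ]
    return min(bounds)


if __name__ == "__main__":
    beta, lam, B0 = Fraction(3, 4), Fraction(1), Fraction(1)
    print(exponents(beta))
    print(delta0_bound(beta, lam, B0))
```

For the sample values $\beta = 3/4$, $\lambda = B_0 = 1$, the bounds coming from (44) and (48) agree with the closed forms (51) and (52), and the binding constraint is the one from (44).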
Therefore, δ0 > 0 must be smaller than the minimum of the right-hand-side of the inequalities in (20), (39), (50),(51) and (52). This completes the proof of (i)-(ii). The proof of statement (iii) is very similar to that of (ii), so we only outline the necessary steps. From the normal form (6) we see that the derivatives of the components of the solution x(t) with respect to ε ≡ µ satisfy the equations √ d Y3 Dε w1 + hDY1 Dε x, wi dt (Dε w1 ) = −λ + hY1 , wi + hY2 , ζi + 1−2µ √ 2µ + hY1 , Dε wi + hDY2 Dε x, ζi + hY2 , Dε ζi + DY3 Dε x + 2µ Y3 w1 , √ d λ + hY4 , wi + hY5 , ζi + Y6 Dε w2 + hDY4 Dε x, wi + hY4 , Dε wi dt (Dε w2 ) = 1−2µ √ 2µ + hDY5 Dε x, ζi + hY5 , Dε ζi + DY6 Dε x + 2µ Y6 w2 , √ d hDZ2 Dε x, ζi dt (Dε ζ) = ADε ζ + (DZ1 Dε x ζ) ζ + (Z1 ζ)Dε ζ + (Z1 Dε ζ) ζ + 1−2µ √ 2µ + hZ2 , Dε ζi + 2µ hZ2 , ζi + DZ3 Dε xw1 w2 + Z3 Dε (w1 w2 ), 1−2µ √ 2µ d DE3 Dε x + 2µ E3 , dt (Dε ρ) = 1−2µ √ d 2µ (F (D (DF ζ + (F ψ) = D x ζ) D ζ) ζ + ζ) D ζ + hDF , D xi + ε 1 ε 1 ε 1 ε 2 ε dt 2µ F2 + hDF3 , Dε xi w1 w2 + F3 Dε (w1 w2 ). (53) As in the proof of statement (i), we can assume that for t ∈ [0, T0 ) and > 0 sufficiently small, 0 |Dε ζ(t)| ≤ B¯ ζ γ ,
¯ ρ0 µ , |Dε ρ(t)| ≤ K
¯ ψ0 µ , |Dε ψ(t)| ≤ K
0 ¯ 00 β , |Dε w1 (t)| ≤ K ¯w |Dε [w1 (t)w2 (t)]| ≤ K β , |Dε w2 (t)| 0 0 ¯w ¯w −ν , kDε x(t)k ≤ 2K −ν . ≤K
(54) (55)
From (53), in the same way as in (29), (32), (38), (41), and (43), we obtain that the estimates in (54)–(55) continue to hold for $t \le T^*$. (To see this one only has to note that $\epsilon^{\frac12 - \mu + \beta} < \epsilon^{\beta} < \epsilon^{\gamma}$, and $\epsilon^{\frac12 - \mu} < \epsilon^{\nu}$.) Calculations similar to those leading to (46) now give
\[
\begin{aligned}
D_{\varepsilon} w_1^* &= -\frac{\dot w_1(T^*)}{\dot w_2(T^*)}\, D_{\varepsilon} w_2(T^*) + D_{\varepsilon} w_1(T^*), \\
D_{\varepsilon} \zeta^* &= -\frac{\dot \zeta(T^*)}{\dot w_2(T^*)}\, D_{\varepsilon} w_2(T^*) + D_{\varepsilon} \zeta(T^*), \\
D_{\varepsilon} \rho^* &= -\frac{\dot \rho(T^*)}{\dot w_2(T^*)}\, D_{\varepsilon} w_2(T^*) + D_{\varepsilon} \rho(T^*), \\
D_{\varepsilon} \psi^* &= -\frac{\dot \psi(T^*)}{\dot w_2(T^*)}\, D_{\varepsilon} w_2(T^*) + D_{\varepsilon} \psi(T^*).
\end{aligned} \tag{56}
\]
Then, just as in (47), we obtain from (54)–(56) the estimates listed in statement (iii) of the lemma.

It is important to note that in the proof of the above lemma we made no use of the fact that our original system (1) is $O(\epsilon)$-close to a Hamiltonian system. As it will turn out later, this fact enables us to refine some of the estimates in Lemma 4.2 for a special class of initial conditions.
5. Local and Global Maps

Lemma 4.2 shows that the “local map” $x_0 \mapsto x^*(x_0)$, as well as its partial derivatives, remain bounded in the limit $\epsilon \to 0$. This enables us to extend the local map to the limit $\epsilon = 0$ so that the extension is differentiable in $\epsilon^{\mu}$ at $\epsilon = 0$. To make this idea more precise, for $\epsilon \ge 0$ and fixed $\delta_0 > 0$ we introduce the set
\[
\mathcal{L}_\epsilon = \Big\{ (w, \zeta, \rho, \psi) \in \partial_1 U_0 \cap W^u(\Pi) \;\Big|\; |w_1| = \delta_0,\ \frac{c_2 \epsilon}{\delta_0} \le |w_2| \le \frac{c_3 \epsilon}{\delta_0},\ |\zeta| \le c_1 \epsilon^{\beta},\ |\rho| \le c_4 \Big\}. \tag{57}
\]
$\mathcal{L}_\epsilon$ is a subset of the unstable manifold of $\Pi$ whose points satisfy the entry conditions in (9). In general, $\mathcal{L}_\epsilon$ is the disjoint union of two-dimensional manifolds, and these manifolds collapse to the single two-dimensional manifold
\[
\mathcal{L}_0 = \partial_1 U_0 \cap W^s_{loc}(\Pi)
\]
for $\epsilon = 0$. For $\epsilon > 0$, we define the local map $L_\epsilon : \mathcal{L}_\epsilon \to \partial_1 U_0$ as
\[
L_\epsilon(\delta_0, w_{20}, \zeta_0, \rho_0, \psi_0) = (w_1^*, \delta_0, \zeta^*, \rho^*, \psi^*) \tag{58}
\]
with the coordinates defined as in Lemma 4.2. By the smoothness of the flow with respect to $t$, for $\epsilon > 0$ the map $L_\epsilon$ is of class $C^r$. For $\epsilon \ge 0$ we now define the map $L_0 : \mathcal{L}_\epsilon \to \partial_1 U_0$ as
\[
L_0(\delta_0, w_{20}, \zeta_0, \rho_0, \psi_0) = (0, \delta_0, \zeta_0, \rho_0, \psi_0).
\]
Note that this map simply projects any point to the local stable manifold $W^s_{loc}(M_\epsilon)$ and pushes the projection along an unstable fiber to the intersection of the fiber with $\partial_1 U_0$. Clearly, $L_0$ is a smooth map. Furthermore, a consequence of Lemma 4.2 is the following result.
Proposition 5.1. For $\epsilon > 0$ small enough and for $1/2 < \beta < 1$ in the entry conditions (9), there exists $0 < \mu < 1/2$ such that the local map can be written as
\[
L_\epsilon(x_0) = L_0(x_0) + \epsilon^{\mu} L_1(x_0, \epsilon^{\mu}),
\]
where $L_1$ is $C^1$ in its arguments and $L_1(x_0; 0) = 0$.

The statement of this proposition follows directly from Lemma 4.2, since the solution-dependent constants $K$ and $\mu$ appearing in the statement of the lemma can be chosen uniformly for $x_0 \in \mathcal{L}_\epsilon$ by the compactness of the closure of $\Pi$.
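To get a feeling for the local passage, one can look at the linearized flow only. The Python sketch below is a toy model under a strong assumption: all higher-order terms of the normal form (6) are dropped, so that $\dot w_1 = -\lambda w_1$ and $\dot w_2 = \lambda w_2$. The function name and the sample constants are hypothetical.

```python
import math


def linear_passage(w20, delta0, lam):
    """Passage through the box for the linearized flow
    dw1/dt = -lam*w1, dw2/dt = lam*w2 (nonlinear terms of (6) dropped).

    The solution enters at w1 = delta0, w2 = w20 > 0 and leaves when
    w2 reaches delta0. Returns the time of flight T* and the exit value w1*.
    """
    t_star = math.log(delta0 / w20) / lam       # solves w20*exp(lam*t) = delta0
    w1_star = delta0 * math.exp(-lam * t_star)  # w1 at the exit time
    return t_star, w1_star


if __name__ == "__main__":
    delta0, lam, c3 = 0.1, 1.0, 1.0   # sample constants, chosen for illustration
    for eps in (1e-3, 1e-5, 1e-7):
        w20 = c3 * eps / delta0       # upper end of the entry window in (57)
        t_star, w1_star = linear_passage(w20, delta0, lam)
        print(f"eps={eps:g}  T*={t_star:.3f}  w1*={w1_star:.3e}")
```

In this toy model the exit value satisfies $w_1^* = w_{20}$, so it shrinks with $\epsilon$ while the time of flight grows like $\lambda^{-1}\log(\delta_0^2/(c\,\epsilon))$, in line with the logarithmic factors in (36)–(38) and with the first component of $L_0$ being zero.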
Remark 5.1. It is also easy to see from (58) that the formal extension $L_0$ of the local map is $C^r$ in $\delta_0$ in a neighborhood of $\delta_0 = 0$. In this limit, the domain of $L_0$ becomes $\mathcal{L}_0 = \Pi$.

We now have a good approximation for the local map $L_\epsilon$ when restricted to initial conditions in the unstable manifold of the plane $\Pi$. We also want to follow initial conditions as they leave one of the faces $\{|w_2| = \delta_0\}$ of the box $U_0$ and return to some other face with $|w_1| = \delta_0$. Such a global excursion starts from the set
\[
\mathcal{G}_\epsilon = \big\{ (w, \zeta, \rho, \psi) \in \partial_1 U_0 \cap W^u(\Pi) \;\big|\; |w_2| = \delta_0,\ |w_1| < K \epsilon^{\beta},\ |\zeta| \le K \epsilon^{\beta} \big\}, \tag{59}
\]
and is described by the global map $G_\epsilon : \mathcal{G}_\epsilon \to \partial_1 U_0$ defined as
\[
G_\epsilon(w_1^*, \delta_0, \zeta^*, \rho^*, \psi^*) = (\delta_0, w_{20}, \zeta_0, \rho_0, \psi_0). \tag{60}
\]
The constant $K > 0$ appearing in the definition of $\mathcal{G}_\epsilon$ is the same as in statement (i) of Lemma 4.2. An approximation for the global map is given in the following lemma.

Lemma 5.2. For $\epsilon \ge 0$ and for all sufficiently small $\delta_0 \ge 0$, the global map can be written as
\[
G_\epsilon(x^*) = x^* + \Delta x + \delta_0 G_1(x^*, \delta_0) + \sqrt{\epsilon}\, G_2(x^*, \epsilon),
\]
where $G_j$ are $C^1$ in their arguments, and the vector $\Delta x$ is defined in (4).

Proof. We first observe that the map $G_0 : \mathcal{G}_0 \to \Pi$ remains well-defined in the limit $\delta_0 = 0$ with domain $\mathcal{G}_0 = \Pi$. The map $G_0$ simply relates the α-limit points of unperturbed heteroclinic orbits in $W^u(C) \equiv W^s(C)$ to their ω-limit points. Therefore, for $\delta_0 = 0$ we obtain $G_0(x^*) = x^* + \Delta x$ from assumption (H6). For nonzero $\delta_0 > 0$, $G_0$ maps the first intersections of solutions in the homoclinic manifolds $W_0^{\pm}(C)$ with $\partial U_0$ to their second intersections with $\partial U_0$. Since these solutions locally coincide with unperturbed fibers in $W^{s,u}_{loc}(C)$, and fibers depend smoothly on their basepoints, we obtain that $G_0(x^*) = x^* + \Delta x + \delta_0 G_1(x^*, \delta_0)$. Now by assumption (H2), for $x^* \in \mathcal{G}_\epsilon$, the global map $G_\epsilon(x^*)$ is smooth in the initial condition $x^*$ and the parameter $\sqrt{\epsilon}$. Initial conditions in the domain of $G_0$ are at most $O(\epsilon^{\beta})$ (with $\beta > 1/2$) away from $\mathcal{G}_\epsilon$, and the magnitude of the perturbation in the Fenichel normal form (6) is of order $O(\sqrt{\epsilon})$. This proves the statement of the lemma.
6. Energy Estimates

In this section we shall study how the conservation of the Hamiltonian $H = H_0 + \epsilon H_1$ is violated on solutions due to the presence of general dissipative terms in Eq. (1). The reason for this study is that we shall use the “energy” $H$ together with the normal form variables $(w_2, \zeta, \rho, \psi)$ as coordinates to identify solutions entering the set $U_0$ through its face $w_1 = \delta_0$. Similarly, we shall use the coordinates $(H, w_1, \zeta, \rho, \psi)$ to label solutions that leave $U_0$ through its face $w_2 = \delta_0$. We start with some preliminary estimates which will be needed in our main energy estimate.

Lemma 6.1. Let us fix a constant $\frac12 < \beta < 1$ and assume that for $0 < \epsilon < \epsilon_0$ and $\delta_0 < \delta_1$, the estimates (9) hold for a solution $x(t)$ which enters the set $U_0$ at $t = 0$ and leaves it at $t = T^*$. Then there exist constants $L > 0$ and $\delta_0^* > 0$, and for any $\delta_0 < \delta_0^*$ there exists $\epsilon_0^* > 0$ such that for all $0 < \epsilon < \epsilon_0^*$ we have
\[
\int_0^{T^*} |w_1(t)|\, dt < L \delta_0, \quad \int_0^{T^*} |\zeta(t)|\, dt < L \sqrt{\epsilon}, \quad \int_0^{T^*} |w_2(t)|\, dt < L \delta_0, \quad \int_0^{T^*} |\rho(t)|\, dt < L \epsilon^{\mu}, \tag{61}
\]
where $\mu = (1 - \beta)/2$ (see (49)).

Proof. The proof of this lemma is elementary, as it follows directly from the normal form (6) and the entry conditions (9). The reader may consult Haller [21] for details.

We now formulate our main energy estimate for solutions that lie in the unstable manifold of the invariant manifold $\Pi$ and make repeated passages near $\Pi$.

Lemma 6.2. Suppose that $x(t)$ is a solution of the normal form (6), which lies in the unstable manifold of the invariant manifold $\Pi$. Let $q_0$ be the first intersection of $x(t)$ with the surface $\partial_1 U_0$ and let $b_\epsilon = b_0 + \sqrt{\epsilon}\,(0, \eta) \in \Pi$ with $b_0 = (\phi_0, 0) \in C$ be the basepoint of the unstable fiber $f^u(b_\epsilon)$ which contains the point $q_0$. Let $x^i(t)$, $i = 1, \ldots, N$, be a chain of unperturbed heteroclinic orbits for the system (1) (see Fig. 1) such that
\[
\lim_{t \to -\infty} x^1(t) = b_0, \qquad \lim_{t \to +\infty} x^{i-1}(t) = \lim_{t \to -\infty} x^i(t), \quad i = 2, \ldots, N.
\]
Suppose that the solution returns to $\partial_1 U_0$ $N$ times to intersect it in the points $p_1, \ldots, p_N$, and to leave it again at the points $q_1, \ldots, q_{N-1}$. Assume further that, for some constants $\frac12 < \beta < 1$, $0 < \epsilon < \epsilon_0$, and $\delta_0 < \delta_1$, the entry conditions (9) hold for the solution $x(t)$ at each entry point $p_k$. (For $N = 1$, $c_2 = 0$ is allowed in (9).) Then, for $\delta_0, \epsilon > 0$ sufficiently small, we have
\[
H(p_N) = H_0|_C + \epsilon \Big[ \mathcal{H}(b_0) + \sum_{i=1}^{N} \int_{-\infty}^{\infty} \langle DH_0, g \rangle|_{x^i(t)}\, dt + O(\delta_0, \epsilon^{\mu}) \Big],
\]
where $0 < \mu < \frac12$, and the “slow” Hamiltonian $\mathcal{H}$ is the first order term in the expansion of $(H_0 + \epsilon H_1)|_\Pi$ near the torus $C$, i.e.,
\[
\mathcal{H} = \frac12 \big\langle \eta, D_I^2 H_0(\Pi)|_C\, \eta \big\rangle + H_1|_C. \tag{62}
\]
Proof. We start by writing $H(p_N)$ in the form
\[
H(p_N) = H(b_\epsilon) + [H(q_0) - H(b_\epsilon)] + \sum_{i=1}^{N-1} \big[ H(q_i) - H(p_i) \big] + \sum_{i=1}^{N} \big[ H(p_i) - H(q_{i-1}) \big]. \tag{63}
\]

Fig. 1. The chain of heteroclinic orbits $x^i(t)$

We shall estimate the four main terms of this expression separately. To estimate the first term, we note that
\[
H(b_\epsilon) = (H_0 + \epsilon H_1)|_{b_\epsilon} = H_0|_C + \epsilon \mathcal{H}(b_0) + O(\epsilon^{3/2}), \tag{64}
\]
where we used the fact that $DH_0|_C = 0$ since the torus $C$ is filled with equilibria for $\epsilon = 0$. To estimate the second term in (63), we consider the “Hamiltonian” unstable fiber $f^u_{g=0}(b_\epsilon)$, which intersects the surface $\partial_1 U_0$ at a point $\bar q_0$. Then, we have $H(\bar q_0) = H(b_\epsilon)$, and the mean value theorem implies that
\[
|H(q_0) - H(b_\epsilon)| = |H(q_0) - H(\bar q_0)| < |DH(\hat q) \cdot (q_0 - \bar q_0)|, \tag{65}
\]
where the point $\hat q$ lies on the line connecting $q_0$ and $\bar q_0$. Since the unstable fibers are of class $C^r$ in the parameter $\epsilon$, we have $|q_0 - \bar q_0| < K_1 \epsilon$, for some integer $K_1$. Furthermore, the gradient of $H$ at the point $\hat q$ satisfies the estimate $|DH(\hat q)| < K_2 \delta_0$. Therefore, the inequality in (65) can be rewritten as
\[
|H(q_0) - H(b_\epsilon)| < K_1 K_2\, \epsilon\, \delta_0. \tag{66}
\]
To estimate the third term in (63), we note that
\[
\begin{aligned}
\sum_{i=1}^{N-1} \big[ H(q_i) - H(p_i) \big] &= \sum_{i=1}^{N-1} \int_0^{T_i^*} \dot H(x(t))\, dt = \sum_{i=1}^{N-1} \int_0^{T_i^*} DH \cdot \big( \omega^\sharp(DH) + \epsilon g \big)\big|_{x(t)}\, dt \\
&= \epsilon \sum_{i=1}^{N-1} \int_0^{T_i^*} \langle DH_0, g \rangle|_{x(t)}\, dt + O\Big( \epsilon^2 \log \frac{1}{\epsilon} \Big),
\end{aligned} \tag{67}
\]
where we used the fact that, by definition,
\[
DH \cdot \omega^\sharp(DH) = \omega\big( \omega^\sharp(DH), \omega^\sharp(DH) \big) = 0.
\]
In (67) $T_i^*$ denotes the time of flight for the solution $x(t)$ from the point $p_i$ to $q_i$, and hence obeys the estimate (10). (Here $\nu > 0$ is the constant defined in (49) and $\epsilon$ is sufficiently small.) We shall now estimate the three terms in the integrand on the right-hand side of (67). Noting that $DH_0|_C = 0$, we obtain that if $(w, \zeta, \rho, \psi)$ are the coordinates of a point $p \in S_0$, then
\[
DH_0(p) = A_1(w, \zeta, \rho, \psi)\, w_1 + A_2(w, \zeta, \rho, \psi)\, w_2 + A_3(w, \zeta, \rho, \psi)\, \zeta + A_4(w, \zeta, \rho, \psi)\, \rho \tag{68}
\]
for appropriate $C^{r-1}$ functions $A_i$. Using Lemma 6.1 together with (68), we obtain
\[
\sum_{i=1}^{N-1} \int_0^{T_i^*} \langle DH_0, g \rangle|_{x(t)}\, dt = O(\delta_0) + O(\epsilon^{\mu}). \tag{69}
\]
But this last equation and the energy expression (67) show that
\[
\sum_{i=1}^{N-1} \big[ H(q_i) - H(p_i) \big] = O(\epsilon \delta_0, \epsilon^{1+\mu}), \tag{70}
\]
where we used the relation (31).

To complete the proof of the lemma, it remains to estimate the last sum in the expression (63). Standard “finite-time-of-flight” Gronwall estimates imply that the perturbed solutions remain close to the chain of unperturbed solutions $x^i(t)$ outside the fixed neighborhood $U_0$ of the manifold $M_\epsilon$. Combining this with the fact that the size of $U_0$ is of order $O(\delta_0)$, we can compute the change in energy between the points $q_{i-1}$ and $p_i$ in the same way as in the first line of Eq. (67). We then obtain
\[
\sum_{i=1}^{N} \big[ H(p_i) - H(q_{i-1}) \big] = \epsilon \sum_{i=1}^{N} \int_{-\infty}^{\infty} \langle DH_0, g \rangle|_{x^i(t)}\, dt + O(\epsilon \delta_0). \tag{71}
\]
But (63), (64), (70), and (71) together prove the statement of the lemma.
In the following lemma we estimate the energy of a point $s_N \in W^s_{loc}(M_\epsilon) \cap \partial_1 U_0$ which has the same $(z, \eta, \phi)$ coordinates as the point $p_N$ on the incoming solution $x(t)$. We will use this estimate to compute the energy difference between the point $p_N$ and its projection on the unstable manifold of $M_\epsilon$.

Lemma 6.3. Suppose that $x(t)$ is a solution of the normal form and let the points $p_1, \ldots, p_N$ and $q_0, \ldots, q_{N-1}$ be defined as in Lemma 6.2. Suppose that the assumptions of that lemma hold and $c_\epsilon \in M_\epsilon$ is the basepoint of a stable fiber $f^s(c_\epsilon)$ such that for the point $s_N = f^s(c_\epsilon) \cap \partial_1 U_0$,
\[
(\zeta_{p_N}, \rho_{p_N}, \psi_{p_N}) = (\zeta_{s_N}, \rho_{s_N}, \psi_{s_N}). \tag{72}
\]
Then, for the energy of the point $s_N$, we have the expression
\[
H(s_N) = H_0|_C + \epsilon \mathcal{H}(b_0 + N \Delta\phi) + O(\epsilon \delta_0, \epsilon^{1 + \beta/2}), \tag{73}
\]
where the phase shift vector $\Delta\phi$ is defined in (4) and the slow Hamiltonian $\mathcal{H}$ is defined in (62).
Proof. Since the entry estimates (9) are assumed to hold for the incoming solution $x(t)$, Eq. (72) implies that the stable fiber $f^s(c_\epsilon)$ containing $s_N$ is locally $O(\epsilon^{\beta})$-close to another stable fiber with basepoint on the invariant manifold $\Pi$. By the smoothness of fibers with respect to the parameter $\epsilon$, this implies that the basepoint $c_\epsilon$ is $O(\epsilon^{\beta})$ close to $\Pi$, i.e.,
\[
|z_{c_\epsilon}| < K_7 \epsilon^{\beta}. \tag{74}
\]
Now $s_N$ lies at a distance of order $O(\delta_0)$ from the invariant manifold $M_\epsilon$, so by the smoothness of individual stable fibers we have
\[
(\eta_{c_\epsilon}, \phi_{c_\epsilon}) = (\eta_{s_N}, \phi_{s_N}) + O(\delta_0). \tag{75}
\]
We now relate the energy of the basepoint $c_\epsilon$ to the energy of the point $s_N$. Let the point $s_h$ be the intersection of the “Hamiltonian” fiber $f^s_{g=0}(c_\epsilon)$ with the surface $\partial_1 U_0$. Then, applying the mean value inequality with some point $s^*$ lying on the line segment connecting $s_N$ and $s_h$, we can write
\[
|H(s_N) - H(c_\epsilon)| = |H(s_N) - H(s_h)| < |DH|_{s^*}\, |s_N - s_h| < |DH|_{s^*}\, K_8 \epsilon < K_8 K_9\, \epsilon\, \delta_0,
\]
which yields
\[
H(s_N) = H(c_\epsilon) + O(\epsilon \delta_0). \tag{76}
\]
Hence, to find an approximation for the energy of the point sN , we have to compute the energy of the fiber basepoint c . For this purpose, we have to find the restriction H of the Hamiltonian H to the manifold M . In the original x coordinate, the manifold M is given by x = f0 (x) + f1 (x, ). A standard Taylor expansion on M shows that H0 |M = H0 |M0 + DH0 |M0 · f1 + O(2 ) 3
= H0 |C + hη, DI2 H0 (Π)|C ηi + O(|z|2 , |z|, 2 ), H1 |M = H1 |M0 + O() √ = H1 |C + O(|z|, ). As a result, we have 3
H = H|M = H0 |C + H + O(|z|2 , |z|, 2 )
(77)
with the slow Hamiltonian $\mathcal{H}$ defined in (62).

Since the solution $x(t)$ travels for an $O(1)$ amount of time near a chain of unperturbed trajectories described in Lemma 6.2, we know that the point $q_0$ is $O(\sqrt{\epsilon})$-close to the unperturbed solution $x^1(t)$, and the point $p_N$ is $O(\sqrt{\epsilon}^{\,\beta})$-close to the unperturbed solution $x^N(t)$. Since $x^N(t)$ locally coincides with an unperturbed stable fiber, the smoothness of fibers implies that the basepoint $c_\epsilon$ of the fiber containing $s_N$ is $O(\sqrt{\epsilon}^{\,\beta})$-close to the unperturbed fiber basepoint $\lim_{t \to \infty} x^N(t)$. As a result, we obtain
\[
c_\epsilon = b_0 + N \Delta\phi + O(\sqrt{\epsilon}^{\,\beta}),
\]
where $\Delta\phi$ is defined in (4). But this last equation together with (74), (76), and (77) yields the statement of the lemma.
7. The Existence of Multi-Pulse Homoclinic Orbits

In this section we establish a criterion for the existence of multi-pulse homoclinic or heteroclinic orbits that are doubly asymptotic to the invariant manifold $M_\epsilon$. These orbits are contained in the unstable manifold of the invariant manifold $\Pi$, and in some cases they also lie in the stable manifold of $\Pi$.

We first give an easy improvement of the results listed in Lemma 4.2 on the coordinates of the solution $x(t)$ upon its exit from the set $U_0$. This improvement makes use of the energy estimates in Lemma 6.2. The result is that the change in the coordinates $w_1$ and $\zeta$ during local passages near $M_\epsilon$ is of the order $O(\epsilon)$ if the solution $x(t)$ satisfies the entry conditions (9) and lies in the unstable manifold of the manifold $\Pi$. This is due to the Hamiltonian nature of the unperturbed problem, which was not used in the derivation of the general normal form (6).

Lemma 7.1. Let us fix a constant $\frac12 < \beta < 1$ and assume that a solution $x(t)$ of the normal form (6) enters the set $U_0$ at $t = 0$ and leaves it at $t = T^*$. Assume further that $x(t)$ is contained in the manifold $W^u(\Pi)$ and satisfies the entry conditions (9). Let us introduce the notation $a = (w_{20}, \zeta_0, \rho_0, \psi_0)$, and let $x_0 = (\delta_0, a)$ and $x^* = x(T^*) = (w_1^*, \delta_0, \zeta^*, \rho^*, \psi^*)$ define the coordinates of the solution at entry and departure, respectively. Then there exist constants $K > 0$, $0 < \mu < \frac12$, and $\delta_0^* > 0$, and for any $\delta_0 < \delta_0^*$ there exists $\epsilon_0^* > 0$ such that for all $0 < \epsilon < \epsilon_0^*$ the following estimates hold:

(i) $|w_1^*| < K\epsilon$, $|\zeta^*| < K\epsilon$, $|\rho^* - \rho_0| < K \sqrt{\epsilon}^{\,\beta}$, $|\psi^* - \psi_0| < K \sqrt{\epsilon}^{\,\beta}$.

(ii) $|D_a \zeta^* - (0, 1, 0, 0)| < K\epsilon^{\mu}$, $|D_a w_1^*| < K\epsilon^{\beta}$, $|D_a \rho^* - (0, 0, 1, 0)| < K\epsilon^{\mu}$, $|D_a \psi^* - (0, 0, 0, 1)| < K\epsilon^{\mu}$.

(iii) $|D_{\epsilon^{\mu}} w_1^*| < K\epsilon^{\beta}$, $|D_{\epsilon^{\mu}} \zeta^*| < K\epsilon^{\mu}$, $|D_{\epsilon^{\mu}} \rho^*| < K\epsilon^{\mu}$, $|D_{\epsilon^{\mu}} \psi^*| < K\epsilon^{\mu}$.

Proof. Consider the point $q^* \in W^u_{loc}(\Pi)$ for which $w_{1q^*} = 0$, $\zeta_{q^*} = 0$, and $(\rho_{q^*}, \psi_{q^*}) = (\rho_{x^*}, \psi_{x^*})$ hold. By (i) of Lemma 4.2, the points $q^*$ and $x^*$ are $O(\epsilon^{\beta})$ close. To determine the energy of the point $q^*$, we consider the unstable fiber $f^u(b^*)$ which contains $q^*$. For zero dissipation ($g \equiv 0$), the energy of the basepoint $b^*$ of the fiber $f^u_{g=0}(b^*)$ can be written in the form $H(b^*) = H_0|_C + O(\epsilon)$, where we used (77). Since the energy is constant on fibers for $g \equiv 0$, we immediately obtain
\[
H(q^*) = H_0|_C + O(\epsilon). \tag{78}
\]
This equation remains valid for nonzero dissipation, since unstable fibers perturb by an $O(\epsilon)$ amount when we add the dissipative terms. Also, setting $q_1 = x^*$ in Lemma 6.2, we obtain that $H(x^*) = H_0|_C + O(\epsilon)$. This last equation together with (78) and the mean value inequality gives
\[
K_{10}\, \epsilon > |H(q^*) - H(x^*)| = \Big| DH(\hat q) \cdot \frac{q^* - x^*}{|q^* - x^*|} \Big|\, |q^* - x^*| > K_{11} \delta_0\, |q^* - x^*|, \tag{79}
\]
where $\hat q$ is an appropriate point on the line connecting the points $q^*$ and $x^*$. Here we made use of the facts that the diameter of the set $U_0$ is of the order $O(\delta_0)$ and the perturbed flow intersects the line between $q^*$ and $x^*$ with $O(1)$ transversality due to the geometry of the unperturbed Hamiltonian flow. We rewrite (79) in the form
\[
|q^* - x^*| < \frac{K_{10}\, \epsilon}{K_{11}\, \delta_0}. \tag{80}
\]
Since the transformation from the $(y, z, \eta, \phi)$ coordinates to the $(w, \zeta, \rho, \psi)$ coordinates is a diffeomorphism with norm of order $O(1)$, this last expression implies that
\[
|w_1^*| < K_{12}\, \epsilon, \tag{81}
\]
since $w_{1q^*} = 0$. Furthermore, as the unstable fibers are straight lines for the local normal form (6), $\zeta_{q^*} = 0$ must hold, since the basepoint of the unstable fiber containing $q^*$ lies in the invariant manifold $\Pi$ which obeys $\zeta = 0$. As a result, (81) implies $|\zeta^*| < K_{13}\, \epsilon$, which, together with the estimate (81) proves the first two inequalities in statement (i) of the lemma. The remaining inequalities are just restatements of the results listed in Lemma 4.2.

The following definition describes the types of orbits that we will be interested in finding.

Definition 7.1. Let us consider a point $b_0 \in C$ and let $j = \{j_i\}_{i=1}^{N}$ be a sequence of $+1$'s and $-1$'s. An orbit $x_\epsilon$ of system (1) is called an N-pulse homoclinic orbit with basepoint $b_0$ and jump sequence $j$, if for some $0 < \mu < \frac12$ and for $\epsilon > 0$ sufficiently small,

(i) $x_\epsilon$ intersects an unstable fiber $f^u(b_\epsilon)$ with basepoint $b_\epsilon = b_0 + O(\epsilon^{\mu}) \in \Pi$,

(ii) $x_\epsilon$ intersects a stable fiber $f^s(c_\epsilon)$ with basepoint $c_\epsilon = b_0 + N \Delta\phi + O(\epsilon^{\mu}) \in M_\epsilon$ such that $\mathrm{dist}(c_\epsilon, \Pi) = O(\epsilon)$.

(iii) Outside a small fixed neighborhood of the manifold $M_\epsilon$, the orbit $x_\epsilon$ is order $O(\epsilon^{\mu})$ close to a chain of unperturbed heteroclinic solutions $x^i(t)$, $i = 1, \ldots, N$, such that
\[
\lim_{t \to -\infty} x^1(t) = b_0, \qquad \lim_{t \to +\infty} x^{i-1}(t) = \lim_{t \to -\infty} x^i(t), \quad i = 2, \ldots, N.
\]
Furthermore, for $k = 1, \ldots, N$ and for all $t \in \mathbb{R}$ we have
\[
x^k(t) \in \begin{cases} W_0^{+}(C) & \text{if } j_k = +1, \\ W_0^{-}(C) & \text{if } j_k = -1. \end{cases}
\]

To illustrate the above definition, we show a three-pulse homoclinic orbit schematically in Fig. 2.

To find N-pulse orbits of the type described in Definition 7.1, it is clearly enough to find conditions under which the points $p_N \in W^u(\Pi)$ and $s_N \in W^s_{loc}(M_\epsilon)$ coincide. By construction, these points have the same $w_1$, $\zeta$, $\rho$, and $\psi$ coordinates, so they coincide if their $w_2$ coordinates are equal, i.e., the $w_2$ coordinate of $p_N$ is zero. However, instead of following the evolution of the $w_2$ coordinate along solutions, we will follow the change of “energy” $H$ along solutions. The following lemma shows that this is sufficient, since the $w_2$ coordinate of $p_N$ can uniquely be determined as a function of the other coordinates and $H(p_N)$. This result will enable us to detect N-pulse orbits by solving the equation $H(p_N) - H(s_N) = 0$.
Fig. 2. 3-pulse homoclinic orbit to the manifold $M_\epsilon$ with jump sequence $j = \{+1, -1, +1\}$ and with basepoint $b_\epsilon$
Lemma 7.2. Suppose that the conditions of Lemma 6.2 are satisfied. Then for $\epsilon > 0$ small enough there exists a $C^1$ function $f_\epsilon : P \to \mathbb{R}$, such that for any $l = 1, \ldots, N$,
\[
w_{2p_l} = f_\epsilon\big( \zeta_{p_l}, \rho_{p_l}, \psi_{p_l}, H(p_l) \big).
\]

Proof. We start by noting that, in terms of the original $x$ coordinate used in Eq. (1), the surface $\{w_1 = \delta_0\}$ is given in the form $x = s_\epsilon(w_2, \zeta, \rho, \psi)$, where $s_\epsilon$ is a $C^r$ embedding into the space $P$. Then the intersection of the energy surface $\{H(x) = h\}$ with $\{w_1 = \delta_0\}$ satisfies the equation
\[
H(s_\epsilon(w_2, \zeta, \rho, \psi)) - h = 0.
\]
By the implicit function theorem, on this intersection set the coordinate $w_2$ is a $C^1$ function of the rest of the coordinates and the energy $h$ provided
\[
\big\langle DH(s_\epsilon(w_2, \zeta, \rho, \psi)),\ D_{w_2} s_\epsilon(w_2, \zeta, \rho, \psi) \big\rangle \ne 0 \tag{82}
\]
holds in all points of the intersection. We want to see if this equation holds at the point $p_l$. Since $p_l \to s_l$ as $\epsilon \to 0$, and $p_l$ is contained in a compact subset of $W^u(\Pi)$, it is enough to verify that
\[
\big| \big\langle DH_0(s_l),\ D_{w_2} s_0(w_{2s_l}, \zeta_{s_l}, \rho_{s_l}, \psi_{s_l}) \big\rangle \big| > c_l \tag{83}
\]
for some constant $c_l > 0$. But the vector $D_{w_2} s_0(w_{2s_l}, \zeta_{s_l}, \rho_{s_l}, \psi_{s_l})$ lies in the tangent space of $\partial_1 U_0$, so this last inequality follows from the fact that the unperturbed flow intersects $\partial_1 U_0$ with $O(1)$ transversality. Thus the statement of the lemma follows by the implicit function theorem.
We are now in the position to prove our main result on the existence of solutions backward asymptotic to the invariant manifold $\Pi$ and forward asymptotic to the manifold $M_\epsilon$. The key ingredient we shall need is the N-th order energy-difference function $\Delta^N \mathcal{H}$. For any point $b_0 = (\eta, \phi) \in C$, this function is defined as
\[
\Delta^N \mathcal{H}(\phi) = \mathcal{H}(b_0 + N \Delta\phi) - \mathcal{H}(b_0) - \sum_{i=1}^{N} \int_{-\infty}^{\infty} \langle DH_0, g \rangle|_{x^i(t)}\, dt, \tag{84}
\]
where $N \ge 1$ is an integer, the slow Hamiltonian $\mathcal{H}$ is defined in (62), the phase shift vector $\Delta\phi$ is defined in (4), and $x^i(t)$, $i = 1, \ldots, N$, is a chain of unperturbed heteroclinic solutions as described in Lemma 6.2 with
\[
\lim_{t \to -\infty} x^1(t) = b_0.
\]
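The structure of (84) can be made concrete with a small numerical sketch. Everything specific in the Python code below is an assumption made for illustration: the angle $\phi$ is taken to be scalar ($m = 1$), the slow Hamiltonian `toy_slow_h` is a toy choice that does not come from the paper, and the integrals of $\langle DH_0, g \rangle$ along the heteroclinic chain are assumed to be supplied as precomputed numbers.

```python
import math


def energy_difference(slow_h, phi, eta, delta_phi, pulse_integrals):
    """Evaluate Delta^N H(phi) as in (84), with N = len(pulse_integrals).

    slow_h(phi, eta)  -- user-supplied slow Hamiltonian (hypothetical callable)
    delta_phi         -- phase shift per pulse
    pulse_integrals   -- precomputed values of the N integrals of <DH0, g>
    """
    n = len(pulse_integrals)
    shifted = slow_h(phi + n * delta_phi, eta)  # slow energy at b0 + N*delta_phi
    base = slow_h(phi, eta)                     # slow energy at b0
    return shifted - base - sum(pulse_integrals)


def toy_slow_h(phi, eta):
    # Toy slow Hamiltonian, chosen only to make the example runnable.
    return 0.5 * eta**2 - math.cos(phi)


if __name__ == "__main__":
    # One pulse with phase shift pi and a dissipative integral equal to 1.
    for phi in (0.5, math.pi / 3, 1.5):
        value = energy_difference(toy_slow_h, phi, 0.3, math.pi, [1.0])
        print(f"phi={phi:.4f}  Delta^1 H = {value:+.6f}")
```

For this toy choice $\Delta^1 \mathcal{H}(\phi) = 2\cos\phi - 1$, which changes sign at $\phi = \pi/3$ with nonzero slope, so $\pi/3$ is a transverse zero in the toy model.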
Finally, we introduce a definition which will be used to determine the jump sequences of multi-pulse homoclinic orbits. To this end, let us consider a point $p^+$ on the unperturbed homoclinic manifold $W_0^+ \equiv W_0^+(M_0)$. Since $W_0^+$ is a hypersurface in the phase space $P$, it makes sense to define the vector $n(p^+)$ as the unit normal to $W_0^+$ which points in the direction of the other unperturbed homoclinic manifold $W_0^- \equiv W_0^-(M_0)$. (See Fig. 3 for a schematic picture.)

Fig. 3. The definition of the vector $n(p^+)$ in two different cases

This allows us to introduce the number
\[
\sigma = \mathrm{sign}\big( DH_0 \cdot n(p^+) \big). \tag{85}
\]
Note that $\sigma$ is independent of the choice of the point $p^+$ by the normal hyperbolicity of the unperturbed manifold $M_0$. Furthermore, $\sigma$ remains the same if we interchange the roles of the homoclinic manifolds $W_0^+$ and $W_0^-$ in this construction. It is easy to see that $\sigma = -1$ ($\sigma = +1$) if the energy $H_0$ of the unperturbed solutions encircled by the homoclinic manifold is higher (lower) than the energy of those lying outside of $W_0^+$. This meaning of $\sigma$ is preserved under small perturbations.

Definition 7.2. For any value $\phi_0 \in \mathbb{T}^m$, the positive sign sequence $\chi^+(\phi_0) = \{\chi_k^+(\phi_0)\}_{k=1}^{N}$ is defined as
\[
\chi_1^+(\phi_0) = +1, \qquad \chi_{k+1}^+(\phi_0) = \sigma\, \mathrm{sign}\big( \Delta^k \mathcal{H}(\phi_0) \big)\, \chi_k^+(\phi_0), \quad k = 1, \ldots, N-1.
\]
The negative sign sequence $\chi^-(\phi_0) = \{\chi_k^-(\phi_0)\}_{k=1}^{N}$ is defined as
\[
\chi^-(\phi_0) = -\chi^+(\phi_0).
\]
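The recursion in Definition 7.2 is easy to trace by hand or in code. The Python sketch below is a direct transcription of the recursion; the function names and the sample input values are hypothetical and serve only as an illustration.

```python
def sign(x):
    """Return +1, -1 or 0 according to the sign of x."""
    return (x > 0) - (x < 0)


def sign_sequences(sigma, delta_h):
    """Positive and negative sign sequences of Definition 7.2.

    sigma    -- the number sigma = +1 or -1 from (85)
    delta_h  -- values [Delta^1 H(phi0), ..., Delta^(N-1) H(phi0)], all nonzero
    Returns (chi_plus, chi_minus), two lists of length N = len(delta_h) + 1.
    """
    chi_plus = [1]  # chi_1^+ = +1 by definition
    for k, value in enumerate(delta_h, start=1):
        if value == 0:
            raise ValueError(f"Delta^{k} H(phi0) must be nonzero")
        chi_plus.append(sigma * sign(value) * chi_plus[-1])
    chi_minus = [-c for c in chi_plus]
    return chi_plus, chi_minus


if __name__ == "__main__":
    # Sample input: sigma = +1 and two negative energy differences.
    print(sign_sequences(+1, [-0.4, -0.1]))
```

With $\sigma = +1$ and two negative values of $\Delta^k \mathcal{H}(\phi_0)$, the positive sequence alternates, which is the pattern $\{+1, -1, +1\}$ of the three-pulse orbit sketched in Fig. 2.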
We now formulate our main result on the existence of N-pulse homoclinic orbits for the perturbed system (1).

Theorem 7.3. Suppose that for some positive integer $N$, $\phi_0 \in \mathbb{T}^m$ is a transverse zero of the function $\Delta^N \mathcal{H}$, i.e., after a possible reindexing of the angular variables $\phi$ we have
\[
\Delta^N \mathcal{H}(\phi_0) = 0, \qquad D_{\phi_1} \Delta^N \mathcal{H}(\phi_0) \ne 0.
\]
Suppose further that $\Delta^k \mathcal{H}(\phi_0) \ne 0$ holds for all integers $k = 1, \ldots, N-1$, and let $\phi = (\phi_1, \tilde\phi)$ with $\tilde\phi \in \mathbb{T}^{m-1}$. Then there exist constants $0 < \mu < \frac12$ and $C_\eta > 0$, such that for any small enough $\epsilon > 0$, the system (2) admits two $(2m-1)$-parameter families of N-pulse homoclinic orbits $x_\epsilon^{\pm}(\tilde\phi, \eta_0)$ with basepoints $b_\epsilon^{\pm}(\tilde\phi, \eta_0) \in \Pi$ such that
\[
b_\epsilon^{\pm}(\tilde\phi, \eta_0) = \big( \phi_0 + O(\epsilon^{\mu}),\ I_0 + \sqrt{\epsilon}\, \eta_0 \big).
\]
Here $|\eta_0| < C_\eta$ is an arbitrary localized action value. The jump sequences of the orbits are given by $\chi^{\pm}(\phi_0)$, respectively. Furthermore, the basepoints $b_\epsilon^{\pm}$ depend on $\tilde\phi$ and $\epsilon^{\mu}$ in a $C^1$ fashion.

Proof. For $\epsilon > 0$ and $\delta_0 > 0$ sufficiently small, let us consider a solution $x(t)$ which lies in the component $W_0^{u+}(\Pi)$ of the unstable manifold of the invariant manifold $\Pi$. ($W_0^{u+}(\Pi)$ denotes the connected component of $W_0^{u}(\Pi)$ that perturbs from the homoclinic manifold $W_0^+$.) We follow $x(t)$ up to its first intersection with the surface $w_2 = \delta_0$. We denote this intersection point by $q_0$ and note that it lies on an unstable fiber $f^u(b_\epsilon)$ with some basepoint $b_\epsilon = (\phi_0, \sqrt{\epsilon}\, \eta_0) \in \Pi$ (see Fig. 4). We then follow the solution as it leaves the neighborhood $U_0$ of the manifold $M_\epsilon$ and, by standard Gronwall estimates, returns and intersects the subset $|w_1| = \delta_0$ of the surface $\partial_1 U_0$. We denote this second intersection point by $p_1$ (see Fig. 4). Since the unstable fibers are straight in the $(w, \zeta, \rho, \psi)$ coordinates, we have $|\zeta_{q_0}| = 0$ by construction. Then $q_0$ is clearly contained in the domain $\mathcal{G}_\epsilon$ of the global map $G_\epsilon$ (see (59) and (60)) and we can write $p_1 = G_\epsilon(q_0)$.

Fig. 4. The geometry of the proof of Theorem 6.2

Since the manifold $W^{s+}_{loc}(M_\epsilon)$ is a graph over the variables $(w_1, \zeta, \rho, \psi)$, there exists a unique point $s_1 \in W^{s+}_{loc}(M_\epsilon) \cap \partial_1 U_0$ as defined in Lemma 6.3. In particular, we have
\[
(\zeta_{s_1}, \rho_{s_1}, \psi_{s_1}) = (\zeta_{p_1}, \rho_{p_1}, \psi_{p_1}).
\]
According to Lemma 7.2, $p_1 \equiv s_1$ holds if and only if
\[
H(p_1) - H(s_1(p_1)) = 0, \tag{86}
\]
where we view $s_1$ as a function of $p_1$. Note that the right-hand-side of Eq. (86) is $C^r$ in the variable $p_1$.

By standard Gronwall estimates, the point $p_1$ of the solution $x(t)$ is $O(\epsilon)$-close to a stable fiber $f^s(b^1)$ with basepoint $b^1 = b_\epsilon + \Delta\phi \in \Pi$ (see Fig. 7). As a result, it satisfies the entry conditions listed in (9) with $\beta = 1$ and $c_2 = 0$. Consequently, Lemma 6.2 applies with $n = 1$ and gives
\[
H(p_1(b_\epsilon)) = H_0|_C + \epsilon \Big[ \mathcal{H}(b_0) + \int_{-\infty}^{\infty} \langle DH_0, g \rangle|_{x^1(t)}\, dt + O(\delta_0, \epsilon^{\mu}) \Big] \tag{87}
\]
for an appropriate constant $0 < \mu < \frac12$. Furthermore, Lemma 6.3 with $n = 1$ also applies and yields
\[
H(s_1(b_\epsilon)) = H_0|_C + \epsilon \mathcal{H}(b_0 + \Delta\phi) + O(\epsilon \delta_0, \epsilon^{3/2}). \tag{88}
\]
Since $b_\epsilon = b_0 + O(\sqrt{\epsilon}) = (\phi_0, \sqrt{\epsilon}\, \eta_0)$, for any $\epsilon > 0$ we can use (87), (88), and the definition of $\Delta^1 \mathcal{H}$ in (84) to rewrite the energy Eq. (86) as
\[
\Delta^1 \mathcal{H}(\phi_0) + \delta_0 F_1(p_1(b_\epsilon); \delta_0, \epsilon^{\mu}) + \epsilon^{\mu} G_1(p_1(b_\epsilon); \delta_0, \epsilon^{\mu}) = 0 \tag{89}
\]
with $p_1 = (0, w_{2p_1}, \zeta_{p_1}, \rho_{p_1}, \psi_{p_1}) = G_\epsilon(q_0)$. The points $b_0$ and $p_1$ are related by
\[
p_1(b_0) = G_\epsilon \circ P_\epsilon^u(b_0), \tag{90}
\]
where $P_\epsilon^u : W^{u+}_{loc}(\Pi) \cap \partial_1 U_0 \to \Pi$ is the fiber projection map that maps the intersection points of unstable fibers in $W^{u+}_{loc}(\Pi)$ with the surface $\partial_1 U_0$ to the basepoints of these fibers. By the smoothness of fibers in $W^{u+}_{loc}(\Pi)$, the function $P_\epsilon^u$ is a $C^r$ map. By Lemma 5.2, $G_\epsilon$ is a $C^1$ map from $\mathcal{G}_\epsilon$ to $P$. As a result, Eq. (90) shows that $p_1$ is a $C^1$ function of $b_0$. This in turn implies that the right-hand-side of the energy Eq. (89) is of class $C^1$ with respect to $b_0$, because the functions $F_1$ and $G_1$ are smooth in $p_1$, as we observed after formula (86), and $\Delta^1 \mathcal{H}$ is a $C^1$ function.

Assume now that $N = 1$ holds in the statement of the theorem. Then, by the assumptions of the theorem, $b_0 = (\phi_0, \sqrt{\epsilon}\, \eta_0)$ with any $0 < |\eta_0| \le C_\eta$ is a solution of Eq. (89) for $\delta_0 = \epsilon = 0$. We want to apply the implicit function theorem to argue that this solution can be continued for $\epsilon, \delta_0 > 0$. Setting $\epsilon = 0$, and differentiating (89) with respect to the $\phi_1$ coordinate of $b_0$ yields
\[
D_{\phi_1} \big[ \Delta^1 \mathcal{H}(\phi_0) + \delta_0 F_1(p_1(b_0); \delta_0, 0) \big] = D_{\phi_1} \Delta^1 \mathcal{H}(\phi_0) + \delta_0 \big\langle D_{p_1} F_1,\ DG_0\, DP_0^u\, D_{\phi_{01}} T_0^{-1} \big\rangle \big|_{b_0}. \tag{91}
\]
Here, $\phi_0 = (\phi_{01}, \tilde\phi_0)$ and $T_\epsilon$ is the normal form transformation constructed in Lemma 3.1. Now $D_{\phi_1} \Delta^1 \mathcal{H}$ is a continuous function, and we have $D_{\phi_1} \Delta^1 \mathcal{H}(\phi_0) \ne 0$ by assumption. Hence for sufficiently small $\delta_0 > 0$, (91) is nonzero. (This follows by recalling that the right-hand-side of (91) is continuous in $b_0$ and the term
\[
\big\langle D_{p_1} F_1,\ DG_0\, DP_0^u\, D_{\phi_{01}} T_0^{-1} \big\rangle \big|_{b_0}
\]
remains bounded as $\delta_0 \to 0$ by Lemma 5.2.) Thus (89) admits a solution $\bar\phi_1(\tilde\phi, \eta_0, \delta_0) = \phi_{01} + O(\delta_0)$ for $\delta_0 > 0$ small and $\epsilon = 0$. We fix $\delta_0$ sufficiently small and substitute the solution $\bar\phi_1$ back into Eq. (89). The derivative of the left-hand-side of the resulting equation with respect to $\phi_1$ is given by
\[
D_{\phi_1} \Delta^1 \mathcal{H}(\bar\phi_1, \tilde\phi) + \delta_0 \big\langle D_{p_1} F_1,\ DG_\epsilon\, DP_\epsilon^u\, D_{\phi_{01}} T_\epsilon^{-1} \big\rangle + \epsilon^{\mu} \big\langle \nabla_{p_1} G_1,\ DG_\epsilon\, DP_\epsilon^u\, D_{\phi_{01}} T_\epsilon^{-1} \big\rangle.
\]
By Lemma 5.2, this derivative is continuous at $\epsilon = 0$, and is also nonzero by assumption. Thus Eq. (89) admits a solution $\hat\phi_1(\tilde\phi, \eta_0, \delta_0, \epsilon) = \phi_{01} + O(\delta_0, \epsilon^{\mu})$ for $\epsilon > 0$ sufficiently small. For any fixed $\epsilon$, the solution should not depend on $\delta_0$, which is just an auxiliary parameter to measure the size of the neighborhood $U_0$ that we have worked in. Therefore, we have $d\hat\phi_1/d\delta_0 = 0$, implying $\hat\phi_1(\tilde\phi, \eta_0, \epsilon) = \phi_{01} + O(\epsilon^{\mu})$. This proves the existence of the orbit family $x_\epsilon^+(\tilde\phi)$ for $N = 1$. The smoothness of $x_\epsilon^+(\tilde\phi)$ with respect to $\epsilon^{\mu}$ follows from Lemma 5.2.

Assume now that $N > 1$ in the statement of the theorem. Then, by the conditions of the theorem, we see that for $\epsilon$ and $\delta_0$ sufficiently small the energy Eq. (89) cannot be satisfied, so the solution $x(t)$ does not intersect the local stable manifold of $M_\epsilon$ upon its first return to the neighborhood $U_0$. Using (87), (88), and the compactness of the solid m-torus $[-C_\eta, C_\eta]^m \times \mathbb{T}^m$, we conclude the existence of positive constants $K_1^{(1)}$ and $K_2^{(1)}$ such that
\[
K_1^{(1)} \epsilon < |H(p_1) - H(s_1)| < K_2^{(1)} \epsilon. \tag{92}
\]
Now the mean value theorem implies for any fixed $k \ge 1$,
\[
|H(p_1) - H(s_1)| = \Big| \Big\langle DH(p_1^*),\ \frac{p_1 - s_1}{|p_1 - s_1|} \Big\rangle \Big|\, |p_1 - s_1| > C_2^{(1)} |p_1 - s_1|, \tag{93}
\]
where p∗1 is a point on the line connecting p1 and s1 , and the existence of C2(1) > 0 follows from an argument similar to that leading to estimate (83). At the same time, the mean value theorem implies that |H(p1 ) − H(s1 )| < C1(1) |p1 − s1 |
(94)
for some constant C1(1) > 0, so it follows from (92)-(94) that K1(1) C1(1)
< |p1 − s1 | <
K2(1) C2(1)
.
(95)
The estimate (95) immediately shows that the coordinates $(w_{2p_1}, \zeta_{p_1}, \rho_{p_1}, \psi_{p_1})$ satisfy the entry conditions in (9) (because the normal form coordinates of the point $s_1$ satisfy $w_{1s_1} = \delta_0$, $w_{2s_1} = 0$, and $|\zeta_{s_1}| = O(\epsilon)$). Consequently, the point $p_1$ is contained in the domain of the local map $L_\epsilon$, and we can write $q_1 = L_\epsilon(p_1)$, where $q_1$ is the next intersection of the solution $x(t)$ with the surface $\partial_1 U_0$. Let $p_2$ denote the intersection of the solution $x(t)$ with the surface $\partial_1 U_0$ upon its second return to the neighborhood $U_0$. (The existence of $p_2$ is guaranteed by the usual Gronwall estimates for $\epsilon > 0$ small enough.) We again have a point $s_2 \in W^s_{loc}(M_\epsilon) \cap \partial_1 U_0$ such that
$$(\zeta_{s_2}, \rho_{s_2}, \psi_{s_2}) = (\zeta_{p_2}, \rho_{p_2}, \psi_{p_2}).$$
Again, the solution $x(t)$ gives rise to a 2-pulse homoclinic orbit if $H(p_2) - H(s_2(p_2)) = 0$, or, alternatively,
$$\Delta^2 H(\phi_0) + \delta_0 F_2(p_2(b_\epsilon); \delta_0, \epsilon^\mu) + \epsilon^\mu G_2(p_2(b_\epsilon); \delta_0, \epsilon^\mu) = 0, \tag{96}$$
where we used Lemmas 6.2 and 6.3. As in Eq. (89), the functions $F_2$ and $G_2$ are $C^1$ in their arguments. Since $p_2(b_\epsilon) = G_\epsilon \circ L_\epsilon \circ G_\epsilon \circ P^u_\epsilon(b_\epsilon)$, we see that for $\epsilon \ge 0$, $p_2$ is a $C^1$ function of $b_\epsilon$ and $\epsilon^\mu$ by Corollary 5.1 and Lemma 5.2. Then just as in the case of $N = 1$, the implicit function theorem applied to (96) implies the existence of the orbit family $x^+(\tilde\phi, \eta_0)$ for $N = 2$.

The proof for any $N > 2$ follows the same steps as that of the case $N = 2$. The existence of the other $N$-pulse homoclinic orbit family $x^-(\tilde\phi, \eta_0)$ for any $N \ge 1$ follows from the fact that an identical construction can be repeated for solutions contained in $W^{u-}(\Pi)$. Therefore, it remains to show that the jump sequences of the two families $x^\pm(\tilde\phi)$ are indeed given by the sign sequences $\chi^\pm(\phi_0)$, respectively. We sketch the argument for $x^+$ only since the argument for $x^-$ is identical.

Consider an $N$-pulse homoclinic orbit $x^+$. By construction it makes its first pulse in the vicinity of the unperturbed manifold $W_0^+(C)$, hence the first element of its jump sequence is indeed $\chi^+_1(\phi_0) = +1$. For small $\epsilon, \delta_0 > 0$, at the first re-entry point $p_1$ we have
$$\mathrm{sign}\,(H(s_1) - H(p_1)) = \mathrm{sign}\left[\Delta^1 H(\phi_0 + O(\delta_0, \epsilon^\mu)) + \delta_0 F_N(p_N(b^+_\epsilon); \delta_0, \epsilon^\mu) + \epsilon^\mu G_N(p_N(b^+_\epsilon); \delta_0, \epsilon^\mu)\right] = \mathrm{sign}\,\Delta^1 H(\phi_0). \tag{97}$$
If this quantity is positive, then at the point $p_1$ the solution $x(t)$ has higher energy than nearby points in the hypersurface $W^{s+}_{loc}(M_\epsilon)$. Recalling the meaning of the constant $\sigma$ (see (85)), we can conclude that $\sigma\,\mathrm{sign}\,\Delta^1 H(\phi_0) = +1$ implies that the solution $x(t)$ stays near the homoclinic manifold $W_0^+(C)$, whereas $\sigma\,\mathrm{sign}\,\Delta^1 H(\phi_0) = -1$ causes the solution to perform its second jump in the vicinity of the manifold $W_0^-(C)$. Therefore, the second element in the jump sequence of $x^+$ is given by $\chi^+_2(\phi_0)$ as defined in Definition 7.2. The remaining elements of the jump sequence of $x^+$ are constructed recursively in the same fashion, hence they coincide with the corresponding elements of the sign sequence $\chi^+(\phi_0)$ in Definition 7.2. This completes the proof of the theorem.

In the following we describe two situations in which the above theorem can be applied. For simplicity, we will consider the case $m = 1$, i.e., we assume that the manifold $\Pi$ is two-dimensional, hence the center manifold $M_0$ of the unperturbed system is $(2n+2)$-dimensional. To find the asymptotic behavior of multi-pulse orbits, one has to have some knowledge of the dynamics on the two-dimensional manifold $\Pi$. A straightforward Taylor expansion shows (see, e.g., Haller and Wiggins [15]) that near the resonant circle $C$ the flow on $\Pi$ satisfies the equations
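$$\dot\eta = \sqrt{\epsilon}\, D_\phi H_g(\eta, \phi) + O(\epsilon), \qquad \dot\phi = -\sqrt{\epsilon}\, D_\eta H_g(\eta, \phi) + O(\epsilon), \tag{98}$$
with
$$H_g(\eta, \phi) = H(\eta, \phi) - \int_0^\phi g_I|_C(u)\, du = \frac{1}{2} D_I^2 H_0(C)\,\eta^2 + H_1|_C(\phi) - \int_0^\phi g_I|_C(u)\, du, \tag{99}$$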
where the slow Hamiltonian $H$ is defined in (62) and $g_I$ is the $I$-component of the perturbation term $g$ in Eq. (1). As seen from (98), for finite times the solutions on the manifold $\Pi$ are approximated with an error of order $O(\sqrt{\epsilon})$ by the level curves of the function $H_g$. (We note that, in general, the flow generated by $H_g$ is only locally Hamiltonian, i.e., it does not admit a single-valued Hamiltonian on $\Pi$. We selected the Hamiltonian $H_g$ in such a way that it generates the leading order Hamiltonian terms through the canonical symplectic form $d\phi \wedge dI$.)

Theorem 7.4. Suppose that $m = 1$ and the conditions of Theorem 7.3 hold. Assume further that the curve $\{\phi = \phi_0\} \subset \Pi$ intersects transversely the unstable manifold of a hyperbolic fixed point $p_0 \in \Pi$ of the Hamiltonian $H_g$. Let $(0, 0, \eta_0, \phi_0)$ be the coordinates of the point $p_0$ and assume that for any small enough $|z| > 0$ and $\epsilon > 0$, the point $(y^0, z, \eta_0, \phi_0 + N\Delta\phi) \in M_\epsilon$ lies in the domain of attraction of an invariant set $S_\epsilon \subset \Pi$. Then, for $\epsilon > 0$ sufficiently small, there exists $0 < \mu < \frac{1}{2}$ such that system (2) admits two $N$-pulse homoclinic orbits $x^\pm$ with basepoints $b^\pm_\epsilon = p_0 + O(\epsilon^\mu) \in \Pi$ and with jump sequences $\chi^\pm(\phi_0)$, respectively. Both orbits are backward asymptotic to a hyperbolic fixed point $p_\epsilon = p_0 + O(\sqrt{\epsilon}) \subset \Pi$ and forward asymptotic to the invariant set $S_\epsilon$.

Proof. By Theorem 7.3 we immediately obtain the existence of a curve $B \subset \Pi$ which contains basepoints for $N$-pulse homoclinic orbits of the type of $x^\pm$. From the proof of that theorem it is also clear that the curve $B$ is $C^1$ $O(\epsilon^\mu)$-close to the line $\{\phi = \phi_0\}$. As a result, it will intersect the unstable manifold of the fixed point $p_\epsilon$, which perturbs from $p_0$ under the effect of dissipative and higher order Hamiltonian terms. Then, by the invariance properties of unstable fibers, this intersection point is a basepoint for an $N$-pulse homoclinic orbit that backward asymptotes to $p_\epsilon$.
Finally, the invariance properties of stable fibers imply that the $N$-pulse homoclinic orbit asymptotes to the attracting set $S_\epsilon$ in forward time.

In applications system (1) frequently depends on parameters. Varying these parameters on a codimension-one subset of the parameter space, it is possible to construct multi-pulse homoclinic orbits which have their basepoints precisely on an equilibrium point $p_\epsilon$ contained in the invariant manifold $\Pi$. If, in addition, the attracting set $S_\epsilon$ assumed in the previous theorem is just the fixed point $p_\epsilon$, then the multi-pulse homoclinic orbit obtained in this fashion is an orbit homoclinic to $p_\epsilon$ itself.

Theorem 7.5. Suppose that $m = 1$, system (1) depends on a vector $\lambda \in R^p$ of system parameters in a $C^r$ fashion, and $V \subset R^p$ is an open set. Assume further that
(i) For any $\lambda \in V$ the Hamiltonian $H_g$ has a nondegenerate equilibrium (i.e., no zero eigenvalues) $p_0(\lambda) = (\eta_0(\lambda), \phi_0(\lambda)) \in \Pi$.
(ii) For some positive integer $N$ and for some parameter value $\lambda_0 \in V$, $\phi_0(\lambda_0)$ satisfies the conditions of Theorem 7.3.
(iii) $D_\lambda \Delta^N H(\phi_0(\lambda), \lambda)|_{\lambda=\lambda_0} \neq 0$.
(iv) For small enough $|z|$ and $\epsilon > 0$, the point $(y^0, z, \eta_0, \phi_0 + N\Delta\phi) \in M_\epsilon$ lies in the domain of attraction of an asymptotically stable fixed point $p_\epsilon \subset \Pi$ of system (98) which perturbs from the fixed point $p_0$.
Then there exists a codimension-one set $M^+ \subset R^p \times R$ near the point $(\lambda_0, 0)$ such that for every parameter value $(\lambda, \epsilon) \in M^+$, the system (2) admits an $N$-pulse homoclinic orbit $x^+$ homoclinic to the point $p_\epsilon$. The basepoint for this orbit is $p_\epsilon$ and the jump sequence of the orbit is given by $\chi^+(\phi_0(\lambda_0))$. There also exists another codimension-one set $M^- \subset R^p \times R$ which yields similar homoclinic orbits with jump sequence $\chi^-(\phi_0(\lambda_0))$.

Proof. The main steps in the proof of this theorem are similar to those in the proof of Theorem 7.3. However, we now want to force the perturbed fixed point $p_\epsilon$ to be a solution of the equation
$$\Delta^N H(p_\epsilon(\lambda); \lambda) + \delta_0 F_N(p_N(p_\epsilon(\lambda)); \delta_0, \epsilon^\mu, \lambda) + \epsilon^\mu G_N(p_N(p_\epsilon(\lambda)); \delta_0, \epsilon^\mu, \lambda) = 0$$
with $p_\epsilon(\lambda) = \left(\phi_0(\lambda) + \sqrt{\epsilon}\, P_1(\lambda, \epsilon),\; \eta_0(\lambda) + \sqrt{\epsilon}\, P_2(\lambda, \epsilon)\right)$. Using (iv) and the implicit function theorem, we see that this equation can again be solved in two steps to obtain a solution $\bar\lambda(\epsilon) = \lambda_0 + O(\epsilon^\mu)$.

We note that in the case of $n = 0$ the above theorem is identical to the one obtained in Haller and Wiggins [18] for the existence of Šilnikov-type orbits in two-degree-of-freedom systems. Another situation in which multi-pulse Šilnikov-type orbits may occur is when an equilibrium $p_\epsilon \in \Pi$ of the perturbed system is a saddle restricted to the manifold $\Pi$, but when viewed within the center manifold $M_\epsilon$, it also admits $n$ pairs of complex eigenvalues with negative real parts.

Theorem 7.6. Suppose that $m = 1$ and system (1) depends on a parameter $\lambda \in R$ in a $C^r$ fashion. Let $V \subset R$ be an open set and assume that
(i) The Hamiltonian $H_g$ has a nondegenerate equilibrium (i.e., no zero eigenvalues) $p_0(\lambda) = (\eta_0(\lambda), \phi_0(\lambda)) \in \Pi$. If $p_\epsilon(\lambda) \in \Pi$ is the corresponding equilibrium of the perturbed system (1), then the manifold $W^s(p_\epsilon(\lambda)) \cap M_\epsilon$ is codimension one within the center manifold $M_\epsilon$.
(ii) The "size" of $W^s(p_\epsilon(\lambda))$ is of order $O(\epsilon^q)$ with $0 \le q < 1$, i.e., it intersects a surface $|z| = K\epsilon^q$ transversely.
(iii) For some positive integer $N$ and for all $\lambda \in V$, there exists a function $\phi_0(\lambda)$ which satisfies the conditions of Theorem 7.3.
(iv) The line $\{\phi = \phi_0(\lambda)\} \subset \Pi$ intersects transversely the unstable manifold of the fixed point $p_0 \in \Pi$ of the slow Hamiltonian.
(v) If $(0, 0, \eta_0(\lambda), \phi_0(\lambda))$ are the coordinates of this transverse intersection point, then the point $(0, 0, \eta_0, \phi_0(\lambda) + N\Delta\phi(\lambda))$ crosses the stable manifold of $p_0$ transversely as $\lambda$ is varied through $\lambda_0$.
Then there exists a codimension-one set $M^+ \subset R^2$ near the point $(\lambda_0, 0)$ such that for every parameter value $(\lambda, \epsilon) \in M^+$, the system (1) admits an $N$-pulse homoclinic orbit $x^+$ to the point $p_\epsilon(\lambda)$. The basepoint for this orbit lies in $W^u(p_\epsilon) \cap \Pi$ and the jump sequence of the orbit is given by $\chi^+(\phi_0(\lambda_0))$. There also exists another codimension-one set $M^- \subset R^2$ which yields similar homoclinic orbits with jump sequence $\chi^-(\phi_0(\lambda_0))$.
Proof. Again, the main steps in the proof of this theorem coincide with those in the proof of Theorem 7.3. The new element is that we want to force the stable fiber, which is intersected by the $N$-pulse homoclinic orbit, to lie on the stable manifold of the perturbed fixed point $p_\epsilon$. At the same time, we do not require the basepoint of the $N$-pulse orbit to coincide with $p_\epsilon$ as in the previous theorem, but rather we allow the basepoint to be any point in the set $W^u(p_\epsilon) \cap \Pi$. As in the proof of Theorem 7.3, we first solve the equation
$$\Delta^N H(p_\epsilon(\lambda); \lambda) + \delta_0 F_N(p_N(p_\epsilon(\lambda)); \delta_0, \epsilon^\mu, \lambda) + \epsilon^\mu G_N(p_N(p_\epsilon(\lambda)); \delta_0, \epsilon^\mu, \lambda) = 0$$
to obtain a solution $\bar\phi(\lambda, \epsilon) = \phi_0(\lambda) + O(\epsilon^\mu)$. By assumption (iv) and by the $C^1$ dependence of $\bar\phi$ on $\epsilon^\mu$ (cf. Theorem 7.3), the curve $\{\phi = \bar\phi(\lambda, \epsilon)\}$ intersects the unstable manifold of the fixed point $p_\epsilon(\lambda)$ transversely in a point $\bar p(\lambda, \epsilon) = (\eta_0(\lambda) + O(\epsilon^\mu), \phi_0(\lambda) + O(\epsilon^\mu)) \in \Pi$. We know (cf. Definition 7.1) that the $N$-pulse solution with basepoint $\bar p(\lambda, \epsilon)$ intersects a stable fiber $f^s(\hat p(\lambda, \epsilon))$ whose basepoint has the $(y, z, \eta, \phi)$ coordinates
$$\hat p(\lambda, \epsilon) = \left(0,\; O(\epsilon),\; \eta_0(\lambda) + O(\epsilon^\mu),\; \phi_0(\lambda) + \Delta\phi(\lambda) + O(\epsilon^\mu)\right) \in M_\epsilon. \tag{100}$$
Furthermore, by assumption (ii), in a vicinity of the manifold $\Pi$ the stable manifold of $p_\epsilon$ can be written as a graph over either the $(\phi, z)$ or the $(\eta, z)$ variables. Considering the former case (the latter can be dealt with in the same way), we obtain that near $\Pi$ a compact subset of $W^s(p_\epsilon(\lambda))$ satisfies an equation of the form
$$\eta = m_1(\phi, \lambda) + z\, m_2(\phi, z, \lambda, \epsilon), \tag{101}$$
where the $m_j$ are of class $C^r$ and $\eta = m_1(\phi, \lambda)$ is the local equation of the stable manifold of $p_0$ on the manifold $\Pi$. Our goal is to find parameter values for which the stable fiber basepoint $\hat p(\lambda, \epsilon)$ is contained in the stable manifold of the fixed point $p_\epsilon(\lambda)$. From (100) we see that $\mathrm{dist}(\hat p(\lambda, \epsilon), \Pi) = O(\epsilon)$, and hence by assumption (ii) of the theorem, $\hat p(\lambda, \epsilon)$ lies in the domain where $W^s(p_\epsilon(\lambda))$ satisfies (101). Then formulas (100) and (101) give the equation
$$\eta_0(\lambda) + \epsilon^\mu h_\eta(\lambda, \epsilon) - m_1\!\left(\phi_0(\lambda) + \epsilon^\mu h_\phi(\lambda, \epsilon), \lambda\right) - h_z(\lambda, \epsilon)\, m_2\!\left(\phi_0(\lambda) + \epsilon^\mu h_\phi(\lambda, \epsilon), h_z(\lambda, \epsilon), \lambda, \epsilon\right) = 0, \tag{102}$$
where the functions $h_\eta$, $h_\phi$, and $h_z$ are differentiable in $\lambda$ and $\epsilon^\mu$. Now by assumption (v), we know that
$$\eta_0(\lambda_0) - m_1(\phi_0(\lambda_0), \lambda_0) = 0, \qquad D_\lambda\left[\eta_0(\lambda) - m_1(\phi_0(\lambda), \lambda)\right]_{\lambda=\lambda_0} \neq 0,$$
thus the implicit function theorem guarantees a solution $\bar\lambda(\epsilon) = \lambda_0 + O(\epsilon^\mu)$ to Eq. (102). This completes the proof of the theorem.
8. Geometry of the Unstable Manifold of Π

Using the methods of the proof of Theorem 7.3, we can follow any particular solution in the unstable manifold of the manifold $\Pi$ on time scales of order $O(\log 1/\sqrt{\epsilon})$, while the unstable manifold makes a finite number of "jumps". The following definition will be used to distinguish between different types of jumping orbits within the unstable manifold of $\Pi$.

Definition 8.1. Let us consider a point $b_0 \in C$ and let $j = \{j_i\}_{i=1}^N$ be a sequence of $+1$'s and $-1$'s. An orbit $x$ of system (1) is called an $N$-pulse orbit with basepoint $b_0$ and jump sequence $j$, if for some $0 < \mu < \frac{1}{2}$ and for $\epsilon > 0$ sufficiently small,
(i) $x$ intersects an unstable fiber $f^u(b_\epsilon)$ with basepoint $b_\epsilon = b_0 + O(\epsilon^\mu) \in \Pi$.
(ii) Outside a small fixed neighborhood of the manifold $M_\epsilon$, the orbit $x$ is order $O(\sqrt{\epsilon})$ close to a chain of unperturbed heteroclinic solutions $x^i(t)$, $i = 1, \ldots, N$, such that
$$\lim_{t\to-\infty} x^1(t) = b_0, \qquad \lim_{t\to+\infty} x^{i-1}(t) = \lim_{t\to-\infty} x^i(t), \quad i = 2, \ldots, N.$$
Furthermore, for $k = 1, \ldots, N$ and for all $t \in R$ we have
$$x^k(t) \in \begin{cases} W_0^+(C) & \text{if } j_k = +1, \\ W_0^-(C) & \text{if } j_k = -1. \end{cases}$$
We have the following result for the existence of $N$-pulse orbits.

Theorem 8.1. Suppose that for some positive integer $N$ and for some $\phi_0 \in T^m$ we have $\Delta^k H(\phi_0) \neq 0$, $k = 1, \ldots, N - 1$. Then, for $\epsilon > 0$ sufficiently small there exist constants $0 < \mu < \frac{1}{2}$ and $C_\eta > 0$, such that for any $0 \le |\eta_0| < C_\eta$, the system (2) admits two $N$-pulse orbits $x^\pm$ with basepoint $b_\epsilon \in \Pi$ such that $\phi_{b_\epsilon} = \phi_0 + O(\epsilon^\mu)$ and $\eta_{b_\epsilon} = \eta_0$. The jump sequences of the orbits are given by $\chi^\pm(\phi_0)$, respectively.

Proof. Using the assumption of the theorem and the arguments from the proof of Theorem 7.3, we immediately conclude that for $\epsilon > 0$ small enough the inequalities
$$\Delta^k H(\phi_0) + \delta_0 F_k(p_k(b_\epsilon); \delta_0, \epsilon^\mu) + \epsilon^\mu G_k(p_k(b_\epsilon); \delta_0, \epsilon^\mu) \neq 0$$
hold for $k = 1, \ldots, N - 1$. As a result, the unstable manifold $W^u(\Pi)$ contains two $N$-pulse orbits with basepoint $(\eta_0, \phi_0)$. The jump sequences of these orbits can be found in exactly the same way as in the proof of Theorem 7.3.

The above result can be used in examples to study the "disintegration" of the unstable manifold of $\Pi$. In particular, in the process of its jumping around $\Pi$, the open sets in the manifold $W^u(\Pi)$ depart from each other and follow different jump sequences. This results in observable irregular transient behavior near the broken homoclinic structure, even if there are no chaotic invariant sets created by the perturbation. We will use this fact when we apply our results to a discretization of the forced NLS equation.
9. An Alternative Formulation of the Results

It may happen that the unperturbed limit of system (1) admits an invariant which offers a more convenient base for perturbation methods than the Hamiltonian $H_0$. For this reason, we also present an easy modification of our results that uses some other integral of the unperturbed limit. This alternative formulation will prove very useful in our study of the discretized NLS equation in the next section. We consider a modification of system (1) in the form
$$\dot x = \omega^\sharp(DH_0(x)) + \epsilon g(x), \tag{103}$$
and assume that for $\epsilon = 0$, there exists a $C^{r+1}$ function $K_0 : P \to R$, which is independent of the Hamiltonian $H_0$ and Poisson commutes with $H_0$, i.e.,
$$\{H_0, K_0\} = \omega(\omega^\sharp(DH_0), \omega^\sharp(DK_0)) = 0. \tag{104}$$
This last condition implies that the flows generated by $H_0$ and $K_0$ through the symplectic form $\omega$ commute. We also assume that on the circle of equilibria $C$,
$$DK_0|_C = 0. \tag{105}$$
Following the definition of the energy-difference functions in (84), we introduce the function
$$\Delta^N K(\phi) = -\sum_{i=1}^N \int_{-\infty}^{\infty} \langle DK_0, g\rangle|_{x^i(t)}\, dt. \tag{106}$$
We also redefine the number $\sigma$ in (85) as
$$\sigma = \mathrm{sign}\left(DK_0 \cdot n(p^+)\right), \tag{107}$$
as well as the sign sequences in Definition 7.2:

Definition 9.1. For any value $\phi_0 \in T^m$, the positive sign sequence $\chi^+(\phi_0) = \{\chi^+_k(\phi_0)\}_{k=1}^N$ is defined as
$$\chi^+_1(\phi_0) = +1, \qquad \chi^+_{k+1}(\phi_0) = \sigma\, \mathrm{sign}\left(\Delta^k K(\phi_0)\right) \chi^+_k(\phi_0), \quad k = 1, \ldots, N - 1.$$
The negative sign sequence $\chi^-(\phi_0) = \{\chi^-_k(\phi_0)\}_{k=1}^N$ is defined as
$$\chi^-(\phi_0) = -\chi^+(\phi_0).$$
We then have the following result.

Theorem 9.1. The statements of Theorems 7.3-7.6 also hold if we replace the energy-difference function $\Delta^N H$ with the function $\Delta^N K$ defined in (106) and we use the definition of sign sequences given in Definition 9.1.
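The recursion in Definition 9.1 is easy to evaluate once the signs of the energy-difference values are known. The following is a minimal sketch, not part of the original construction: the function names, the default `sigma = 1`, and the sample values of $\Delta^k K(\phi_0)$ are hypothetical and serve only to illustrate how the positive and negative sign sequences are generated.

```python
from typing import Callable, List


def sign(value: float) -> int:
    """Return +1 or -1. A zero value is rejected, since the recursion
    needs Delta^k K(phi_0) != 0 for k = 1, ..., N - 1."""
    if value == 0.0:
        raise ValueError("energy-difference value must be nonzero")
    return 1 if value > 0.0 else -1


def positive_sign_sequence(delta_k: Callable[[int], float],
                           n_pulses: int, sigma: int = 1) -> List[int]:
    """chi^+_1 = +1, chi^+_{k+1} = sigma * sign(Delta^k K(phi_0)) * chi^+_k."""
    chi = [1]
    for k in range(1, n_pulses):
        chi.append(sigma * sign(delta_k(k)) * chi[-1])
    return chi


def negative_sign_sequence(delta_k: Callable[[int], float],
                           n_pulses: int, sigma: int = 1) -> List[int]:
    """chi^- is the elementwise negative of chi^+."""
    return [-c for c in positive_sign_sequence(delta_k, n_pulses, sigma)]


# Hypothetical values of Delta^k K(phi_0) for k = 1, 2, 3 (illustration only).
sample_values = {1: 0.3, 2: -1.2, 3: 0.5}
chi_plus = positive_sign_sequence(sample_values.__getitem__, 4)
chi_minus = negative_sign_sequence(sample_values.__getitem__, 4)
print(chi_plus, chi_minus)
```

With these sample values a sign change of $\Delta^k K(\phi_0)$ between $k = 1$ and $k = 2$ makes the orbit switch from one homoclinic component to the other at the third pulse, which is the mechanism the definition encodes.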
Proof. Our estimates for the local dynamics near the manifold $M_0$ in Sect. 4 as well as Lemma 6.1 make no use of the Hamiltonian $H_1$, hence they hold without change. Lemma 6.2 can also be proved using the function $K_0$ instead of $H_0$, noting that $H_1 \equiv 0$. Indeed, Eq. (105) ensures that $K_0$ has the same type of Taylor expansion near the resonant circle $C$ as $H_0$ does. Furthermore, the change of $K_0$ along perturbed solutions during passages near the manifold $M_\epsilon$ can be computed as (cf. (67))
$$\begin{aligned}
\sum_{i=1}^{N-1}\left[K_0(q_i) - K_0(p_i)\right]
&= \sum_{i=1}^{N-1}\int_0^{T_i^*} \dot K_0(x(t))\, dt \\
&= \sum_{i=1}^{N-1}\int_0^{T_i^*} \left. DK_0 \cdot \left[\omega^\sharp(DH_0) + \epsilon g\right]\right|_{x(t)} dt \\
&= \sum_{i=1}^{N-1}\int_0^{T_i^*} \left[\{K_0, H_0\} + \epsilon\langle DK_0, g\rangle\right]_{x(t)} dt \\
&= \epsilon\sum_{i=1}^{N-1}\int_0^{T_i^*} \langle DK_0, g\rangle|_{x(t)}\, dt,
\end{aligned} \tag{108}$$
where we used (104). Moreover, this last integral can again be approximated (with error of order $O(\delta_0)$) by an improper integral as in (71), because by (105), $|DK_0|$ decreases exponentially on the unperturbed solutions $x^i(t)$, hence the improper integral converges absolutely. Lemma 6.3 can also be stated in terms of $DK_0$ based on (105). The statement of Lemma 7.1 does not involve $H_0$ explicitly, so its proof remains the same.

Based on all these lemmas, the main argument in the proof of Theorem 7.3 can be repeated using the invariant $K_0$ instead of $H_0$. In particular, one replaces the local coordinate $w_2$ in the representation of the global map $G_\epsilon(q_0)$ and the local map $L_\epsilon(p_1)$ with the value of $K_0$ at $q_0$ and $p_1$, respectively. This is possible because, in analogy with (82), we have
$$\left\langle DK_0(s_\epsilon(w_2, \zeta, \rho, \psi)),\; D_{w_2} s_\epsilon(w_2, \zeta, \rho, \psi)\right\rangle \neq 0,$$
since the vector $DK_0$ is perpendicular to perturbed trajectories up to an error of order $O(\epsilon)$, and $D_{w_2} s_\epsilon$ encloses an angle of order $O(1)$ with perturbed trajectories. As a result, we obtain the equation
$$\Delta^N K(\phi_0) + \delta_0 \tilde F_N(p_N(b_\epsilon); \delta_0, \epsilon^\mu) + \epsilon^\mu \tilde G_N(p_N(b_\epsilon); \delta_0, \epsilon^\mu) = 0$$
for the basepoint $b_\epsilon$ of an $N$-pulse homoclinic orbit. This equation can again be solved for $\epsilon > 0$, if we apply the implicit function theorem using the extension $L_0$ of the map $L_\epsilon$. Adapting the definition of sign sequences from Definition 9.1, the jump sequences of $N$-pulse orbits can be constructed in exactly the same way as in the proof of Theorem 7.3. Finally, we can repeat the proofs of Theorems 7.4-7.6 without any change using the function $\Delta^N K$ instead of $\Delta^N H$.

10. Jumping Homoclinic Orbits in a Discretization of the Perturbed NLS Equation

Let us consider the periodically forced and damped, focusing nonlinear Schrödinger equation
$$i u_t - u_{xx} - 2|u|^2 u = i\epsilon\left(\Gamma e^{i2\omega^2 t} - \alpha u + \beta u_{xx}\right), \tag{109}$$
with constants $\omega, \Gamma, \alpha, \beta > 0$, and with the small parameter $\epsilon > 0$. We assign even, periodic boundary conditions of the form
$$u(x, 0) = u(-x, 0), \qquad u(x + 1, t) = u(x, t).$$
Introducing the change of variable $u \to u e^{-i2\omega^2 t}$, we can rewrite (109) as
$$i u_t - u_{xx} - 2\left[|u|^2 - \omega^2\right] u = i\epsilon\left(\Gamma - \alpha u + \beta u_{xx}\right). \tag{110}$$
For $\beta = 0$, this equation agrees with the form of the perturbed NLS that was studied by Bishop et al. [5] as a small amplitude approximation to the parametrically forced sine-Gordon Eq. (see Sect. 1 for further references). For $\beta > 0$ we obtain a form of the perturbed NLS whose modal truncation and discretization was studied in the references listed in Sect. 1.1. The $\epsilon = 0$ limit of Eq. (110) admits a discretization which was pointed out to be integrable for arbitrary mesh size by Ablowitz and Ladik [1]. Applying this particular discretization with mesh size $h > 0$ to the $\epsilon > 0$ case yields the system of ordinary differential equations
$$\dot u_k = -i\,\frac{u_{k+1} - 2u_k + u_{k-1}}{h^2} - i|u_k|^2\left(u_{k-1} + u_{k+1}\right) + 2i\omega^2 u_k + \epsilon\left[\Gamma - \alpha u_k + \beta\,\frac{u_{k+1} - 2u_k + u_{k-1}}{h^2}\right], \tag{111}$$
where $u_k(t) = u(x_k, t)$, $k = 0, \ldots, K - 1$, $x_0 = -1$, $x_k = x_1 + kh$. The periodic boundary conditions for the PDE imply that $u_K(t) \equiv u_0(t)$. For $\epsilon = 0$, system (111) (together with the conjugate equations for $\dot{\bar u}_k$) is Hamiltonian with Hamiltonian
$$H_0 = \frac{1}{h^2}\sum_{k=0}^{K-1}\left[\bar u_k\left(u_{k+1} + u_{k-1}\right) - \frac{2}{h^2}\left(1 + \omega^2 h^2\right)\log\left(1 + h^2 |u_k|^2\right)\right], \tag{112}$$
and with the symplectic form
$$\omega = \sum_{k=0}^{K-1} \frac{i}{2(1 + h^2 |u_k|^2)}\, \mathrm{Im}(d\bar u_k \wedge du_k).$$
The discretization (111) gives a tool for approximating solutions of the partial differential Eq. (110), and also offers a finite dimensional model for the phase space structure of the perturbed NLS. In particular, for $\epsilon = 0$, (111) is integrable (see Ablowitz and Ladik [1] and Li and McLaughlin [31]). This is a special feature of this discretization which distinguishes it from the standard finite difference discretization of the NLS. (The usual finite difference scheme would have the same linear part but a nonlinear term of the form $u_k |u_k|^2$, which would only ensure integrability for $K = 2$.) System (111) also admits a two-dimensional invariant plane $\Pi$ given by
$$u_1 = u_2, \quad u_2 = u_3, \quad \ldots, \quad u_{K-1} = u_K. \tag{113}$$
This plane is the set of solutions with no spatial dependence, and it is easily seen to remain invariant for $\epsilon > 0$. Restricting the dynamics to $\Pi$ as in (113), one obtains the equation
$$\dot u_K = 2i\left[\omega^2 - |u_K|^2\right] u_K + \epsilon\left(\Gamma - \alpha u_K\right). \tag{114}$$
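The reduction from (111) to (114) can be checked numerically by evaluating the right-hand side of the discretized system on a spatially uniform state. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the parameter values are arbitrary, and the choice $h = 1/K$ (a unit period divided into $K$ cells) is an assumption made for the example.

```python
import cmath
from typing import List


def ablowitz_ladik_rhs(u: List[complex], omega: float, h: float,
                       eps: float = 0.0, gamma: float = 0.0,
                       alpha: float = 0.0, beta: float = 0.0) -> List[complex]:
    """Right-hand side of the perturbed Ablowitz-Ladik system (111),
    with periodic indexing u_K = u_0 handled by the modulo operation."""
    n = len(u)
    rhs = []
    for k in range(n):
        up, um = u[(k + 1) % n], u[(k - 1) % n]
        lap = (up - 2.0 * u[k] + um) / h**2  # discrete second difference
        rhs.append(-1j * lap
                   - 1j * abs(u[k])**2 * (um + up)
                   + 2j * omega**2 * u[k]
                   + eps * (gamma - alpha * u[k] + beta * lap))
    return rhs


def plane_rhs(u_plane: complex, omega: float, eps: float = 0.0,
              gamma: float = 0.0, alpha: float = 0.0) -> complex:
    """Right-hand side of the reduced equation (114) on the plane Pi."""
    return (2j * (omega**2 - abs(u_plane)**2) * u_plane
            + eps * (gamma - alpha * u_plane))


# Arbitrary illustrative parameters (not taken from the paper).
K, omega = 8, 1.3
h = 1.0 / K  # assumed mesh size for a unit period
u_plane = 0.7 * cmath.exp(0.4j)

full = ablowitz_ladik_rhs([u_plane] * K, omega, h,
                          eps=0.01, gamma=1.0, alpha=0.5, beta=0.2)
reduced = plane_rhs(u_plane, omega, eps=0.01, gamma=1.0, alpha=0.5)
max_gap = max(abs(f - reduced) for f in full)

# For eps = 0, uniform states with |u| = omega are equilibria.
on_circle = ablowitz_ladik_rhs([omega * cmath.exp(1.1j)] * K, omega, h)
print(max_gap, max(abs(v) for v in on_circle))
```

On a uniform state the discrete second difference vanishes, so the $\beta$ term drops out and every component of (111) reduces to the same expression, which is why the plane stays invariant for $\epsilon > 0$.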
This shows that for $\epsilon = 0$, $\Pi$ contains a circle of equilibria $C$ which is given by $|u_K| = \omega$. The circle $C$ is surrounded by periodic solutions in the plane $\Pi$. Introducing the action-angle variables $(I, \phi) \in R \times S^1$ by letting $u_K = I e^{i\phi}$, we can rewrite Eq. (114) in the form
$$\dot I = \epsilon\left(\Gamma\cos\phi - \alpha I\right), \qquad \dot\phi = 2\left(\omega^2 - I^2\right) - \epsilon\,\frac{\Gamma}{I}\sin\phi. \tag{115}$$
For $\epsilon = 0$, this system is a one-degree-of-freedom Hamiltonian system with Hamiltonian
$$H_\Pi \equiv H_0|_\Pi = \frac{2K}{h^2}\left[I^2 - \frac{1}{h^2}\left(1 + \omega^2 h^2\right)\log\left(1 + h^2 I^2\right)\right],$$
and with the symplectic form
$$\omega_\Pi \equiv \omega|_\Pi = \frac{-KI}{2(1 + h^2 I^2)}\, d\phi \wedge dI,$$
which is clearly nondegenerate. Linearizing (111) about any point of the circle $C$, one finds that for
$$\begin{cases} 3\tan\dfrac{\pi}{K} < \omega < \infty, & \text{if } K = 3, \\[2mm] K\tan\dfrac{\pi}{K} < \omega < K\tan\dfrac{2\pi}{K}, & \text{if } K > 3, \end{cases} \tag{116}$$
off the plane $\Pi$, the linearized system possesses one positive, one negative, and $K - 2$ pairs of pure imaginary eigenvalues (see Li and McLaughlin [31]). Furthermore, the circle $C$ admits a codimension two center manifold $M_0$ which contains the plane $\Pi$. For $\epsilon = 0$, Li [29] showed the existence of an $n - 2$ parameter family of orbits homoclinic to the center manifold $M_0$, which implies the existence of a codimension one homoclinic manifold $W^u(M_0) \equiv W^s(M_0)$. A three dimensional submanifold of this homoclinic structure carries motions that are doubly asymptotic to the plane $\Pi$ itself, hence we obtain that $W^u(C) \equiv W^s(C) = W_0^+ \cup W_0^-$. Here $W_0^\pm$ denote the two connected components of the manifold homoclinic to $\Pi$. This manifold is filled with heteroclinic orbits connecting points on the circle $C$. As shown in Li [29], the phase shift along all these heteroclinic connections is given by
$$\Delta\phi = -4\tan^{-1}\frac{\sqrt{\left[1 + \frac{\omega^2}{K^2}\right]\cos^2\frac{\pi}{K} - 1}}{\sqrt{1 + \frac{\omega^2}{K^2}}\,\sin\frac{\pi}{K}}. \tag{117}$$
If we pass to the real coordinates $(\phi_k, I_k) \in R \times S^1$ by letting $u_k = I_k e^{i\phi_k}$, then the discretized NLS equation is of the form (1) with
$$\begin{aligned}
H_0 &= \frac{1}{h^2}\sum_{k=0}^{K-1}\left[I_k e^{-i\phi_k}\left(I_{k+1} e^{i\phi_{k+1}} + I_{k-1} e^{i\phi_{k-1}}\right) - \frac{2}{h^2}\left(1 + \omega^2 h^2\right)\log\left(1 + h^2 I_k^2\right)\right], \qquad H_1 \equiv 0, \\
\omega &= \sum_{k=0}^{K-1} \frac{-I_k}{2(1 + h^2 I_k^2)}\, d\phi_k \wedge dI_k, \qquad g = G(I, \phi; \alpha, \beta, \Gamma).
\end{aligned} \tag{118}$$
Furthermore, based on the above description of system (111), the resulting real system of equations satisfies assumptions (H1)-(H7) of Sect. 2 with m = 1 and n = 2(K − 2). As a result, the theory we have developed in this paper can be used to investigate the existence of multi-pulse homoclinic orbits for the discretized NLS system (111). First, we will study the equations setting β = 0 which was the case in the study of Bishop et al. [5, 6]. Later, we will consider the case β > 0, which was studied first in Li and McLaughlin [31]. 10.1. The β = 0 limit. As described in Li and McLaughlin [31], the unperturbed integrable system admits an invariant denoted F˜1 such that F˜1 |C = 0.
(119)
The function $F_1$ is defined as a Floquet discriminant computed for a set of fundamental solutions to a discretized Lax pair for system (111). For brevity, we do not introduce here all the notation and terminology for the exact definition of $F_1$, but refer the reader to Li and McLaughlin [29]. All we need in our analysis is the existence of $F_1$ and the results of some involved calculations performed in [31]. In particular, using an implicit derivation, Li and McLaughlin [29] computed a Melnikov integral to study the existence of (single-pulse) homoclinic orbits for system (111). They obtain that, for $\beta = 0$, the Melnikov integral computed on unperturbed orbits homoclinic to the circle $C$ can be written as
$$\hat M_{F_1}(\phi) = \int_{-\infty}^{\infty} \langle D\tilde F_1, g\rangle|_{x^h(t)}\, dt = \Gamma\left[\hat M_0 \cos\left(\phi + \frac{\Delta\phi}{2}\right) - \chi_\alpha M_\alpha\right]. \tag{120}$$
Here the nonzero constants $\hat M_0$ and $M_\alpha$ depend only on the number $\omega$ and the mesh size $K$ of the discretization, $\chi_\alpha = \alpha/\Gamma$, and the phase shift $\Delta\phi$ is defined in (117). The heteroclinic solution $x^h(t)$ has the property that for its $\phi_k(t)$ component, $k = 0, \ldots, K - 1$,
$$\lim_{t\to-\infty} \phi_k(t) = \phi$$
holds, where $\phi \in S^1$ is the argument of $\hat M_{F_1}$. By (118), the real system corresponding to (111) can in fact be written in the form (103). This fact together with (119) implies that the alternative formulation of our main results in Sect. 9 applies to the discretized NLS system. To find multi-pulse homoclinic orbits, we have to study the zeros of the function $\Delta^N K$ defined in (106). Setting $K_0 = \tilde F_1$ and using (120), we obtain that
$$\Delta^N K(\phi) = -\sum_{i=1}^{N}\int_{-\infty}^{\infty} \langle D\tilde F_1, g\rangle|_{x^i(t)}\, dt = -\Gamma\left[\hat M_0 \sum_{k=0}^{N-1}\cos\left(\phi + \frac{2k+1}{2}\Delta\phi\right) - N\chi_\alpha M_\alpha\right]. \tag{121}$$
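Using the relation
$$\sum_{k=0}^{N-1}\cos\left(\phi + \frac{2k+1}{2}\Delta\phi\right) = \mathrm{Re}\sum_{k=0}^{N-1} e^{i[\phi + (2k+1)\Delta\phi/2]} = \mathrm{Re}\left[e^{i(\phi + \Delta\phi/2)}\,\frac{e^{iN\Delta\phi} - 1}{e^{i\Delta\phi} - 1}\right] = \frac{\sin\frac{N\Delta\phi}{2}}{\sin\frac{\Delta\phi}{2}}\cos\left(\phi + \frac{N\Delta\phi}{2}\right),$$
we obtain that
$$\Delta^N K(\phi) = -\Gamma\left[\hat M_0\,\frac{\sin\frac{N\Delta\phi}{2}}{\sin\frac{\Delta\phi}{2}}\cos\left(\phi + \frac{N\Delta\phi}{2}\right) - N\chi_\alpha M_\alpha\right]. \tag{122}$$
If
$$\Delta\phi \neq \frac{2j\pi}{N}, \qquad j \in Z, \tag{123}$$
and
$$\left|N\chi_\alpha M_\alpha \sin\frac{\Delta\phi}{2}\right| \le \left|\hat M_0 \sin\frac{N\Delta\phi}{2}\right|, \tag{124}$$
then $\Delta^N K$ admits two transverse zeros given by
$$\phi_1^N = \frac{\pi}{2} - \frac{N\Delta\phi}{2} - \cos^{-1}\frac{N\chi_\alpha M_\alpha \sin\frac{\Delta\phi}{2}}{\hat M_0 \sin\frac{N\Delta\phi}{2}}, \qquad \phi_2^N = \frac{3\pi}{2} - \frac{N\Delta\phi}{2} - \cos^{-1}\frac{N\chi_\alpha M_\alpha \sin\frac{\Delta\phi}{2}}{\hat M_0 \sin\frac{N\Delta\phi}{2}}. \tag{125}$$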
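The summation identity that takes (121) into (122) can be confirmed numerically. The following sketch is an illustration only: the function names and sample arguments are hypothetical, and the values of $\Delta\phi$ are arbitrary rather than computed from (117). It compares the direct cosine sum with the closed form and shows that the $N$-pulse amplitude factor vanishes exactly when condition (123) fails.

```python
import math


def cosine_sum(phi: float, dphi: float, n: int) -> float:
    """Direct sum of cos(phi + (2k+1) dphi / 2) for k = 0, ..., n-1."""
    return sum(math.cos(phi + (2 * k + 1) * dphi / 2.0) for k in range(n))


def pulse_amplitude(dphi: float, n: int) -> float:
    """Amplitude factor sin(n dphi / 2) / sin(dphi / 2) of the closed form."""
    return math.sin(n * dphi / 2.0) / math.sin(dphi / 2.0)


def closed_form(phi: float, dphi: float, n: int) -> float:
    """Closed form of the cosine sum used to pass from (121) to (122)."""
    return pulse_amplitude(dphi, n) * math.cos(phi + n * dphi / 2.0)


# Arbitrary (phi, dphi, N) triples chosen for illustration.
samples = [(0.3, -1.1, 1), (2.0, -0.7, 4), (-1.5, -2.9, 7)]
max_error = max(abs(cosine_sum(p, d, n) - closed_form(p, d, n))
                for p, d, n in samples)

# When dphi = 2*pi*j/N the amplitude vanishes (here j = 1, N = 3), so the
# phi-dependent part of the N-pulse function disappears.
degenerate = pulse_amplitude(2.0 * math.pi / 3.0, 3)
print(max_error, degenerate)
```

A vanishing amplitude leaves only the constant term in the bracket of (122), so no transverse zero in $\phi$ can exist; this is the role of condition (123).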
Using these zeros, we can obtain the following result.

Theorem 10.1. Consider any integer $N \ge 1$ and suppose that
$$\tan\left(-\frac{j\pi}{2N}\right)\sin\frac{\pi}{K} \neq \sqrt{\frac{\left[1 + \frac{\omega^2}{K^2}\right]\cos^2\frac{\pi}{K} - 1}{1 + \frac{\omega^2}{K^2}}}, \qquad j \in Z. \tag{126}$$
Assume further that conditions (123) and (124) hold. Then, for $\epsilon, \alpha > 0$ sufficiently small,
(i) The discretized NLS system (111) admits four, 1-parameter families of $N$-pulse homoclinic orbits, which are backward asymptotic to the invariant plane $\Pi$ and forward asymptotic to a codimension two invariant manifold $M_\epsilon$, which contains $\Pi$. The coordinates of the basepoints of the $N$-pulse homoclinic orbits are of the form
$$u_1^{l,\pm} = u_2^{l,\pm} = \ldots = u_K^{l,\pm} = \left(\omega + O(\sqrt{\epsilon})\right) e^{i\left(\phi_l^N + O(\sqrt{\epsilon})\right)}, \qquad l = 1, 2.$$
(ii) The jump sequences of the orbit families satisfy
$$j^{l,+} = -j^{l,-}, \qquad l = 1, 2. \tag{127}$$
Furthermore, every time the jump sequence $j^{l,\pm}$ changes sign, the jump sequence $j^{m,\pm}$ with $l \neq m$ will not change sign, and vice versa.
(iii) The unstable manifold of the invariant plane $\Pi$ contains $4N$ families of $N$-pulse orbits (see Definition 8.1) such that all these families have different jump sequences.

Proof. We first note that formula (126) ensures that the zeros of the function $\Delta^N K$ are transverse (see (117) and (123)). Next we observe that for $\alpha = 0$, condition (123) guarantees that $\phi_l^N \neq \phi_m^k$ with $l, m \in \{1, 2\}$ and $k = 1, \ldots, N - 1$. This implies that for any fixed $N$, $\Delta^k K(\phi_l^N) \neq 0$, $k = 1, \ldots, N - 1$. This property is clearly preserved for $\alpha > 0$ sufficiently small, hence Theorem 7.3 implies statement (i). By Theorem 7.3, the two families corresponding to the zero $\phi_l^N$ have opposite jump sequences, which is stated in Eq. (127). To prove the second statement in (ii) about sign changes in the jump sequences, we note that for $\alpha = 0$ and for any $k \in Z$, we have
$$\mathrm{sign}\,\Delta^k K(\phi_1^N) = -\mathrm{sign}\,\Delta^k K(\phi_2^N),$$
since the minimal period of $\Delta^k K$ is $2\pi$ and the difference between the zeros $\phi_1^N$ and $\phi_2^N$ is exactly equal to $\pi$. But for sufficiently small $\alpha > 0$, this last equation together with the definition of the sign sequence $\chi^\pm(\phi_l^N)$, and the fact that $j^{l,\pm} = \chi^\pm(\phi_l^N)$, implies the second statement in (ii). Statement (iii) follows directly from Theorem 7.6 for $\alpha > 0$ small, because the $2N$ disjoint lines $\{\phi = \phi_l^k\}_{k=1,\ldots,N,\; l=1,2}$ divide the plane $\Pi$ into $2N$ sectors, so that one of the functions $\Delta^k K$ always changes sign at the boundary of these sectors.
According to statement (ii) of the above theorem, if there are homoclinic orbits that, for at least some of their pulses, stay near one particular component of the unperturbed homoclinic structure $W_0(M_0)$, then there are other multi-pulse orbits that keep switching between different components of $W_0(M_0)$. Theorem 10.1 does not identify the exact asymptotics of the multi-pulse solutions. The asymptotic behavior of these orbits could be identified using Theorems 7.4 or 7.5, and a likely candidate for the attracting set $S_\epsilon$ is a sink created by the perturbation in the plane $\Pi$. The role of the hyperbolic fixed point $p_0 \in \Pi$ is then played by a saddle point on $\Pi$. (The existence of these fixed points is easy to verify from Eq. (114).) However, the identification of the domain of attraction of $S_\epsilon$ leads to extensive calculations in this example. Nevertheless, the results of Li and McLaughlin [31] indicate that the conditions of Theorem 7.4 are indeed satisfied, which suggests the existence of the same type of jumping heteroclinic orbits between the two equilibria as those described in Haller and Wiggins [19].

10.2. The case of β > 0. For the case of $\beta > 0$, the calculations of the previous subsection leading to the expressions (125) can be repeated. Using the formulas of Li and McLaughlin [31], one obtains in the same fashion that
$$\Delta^N K(\phi) = -\Gamma\left[\hat M_0\,\frac{\sin\frac{N\Delta\phi}{2}}{\sin\frac{\Delta\phi}{2}}\cos\left(\phi + \frac{N\Delta\phi}{2}\right) - N\chi_\alpha M_\alpha - \chi_\beta M_\beta\right], \tag{128}$$
where $\chi_\beta = \beta/\Gamma$ and $M_\beta$ is a nonzero constant that depends on the parameter $\omega$ and the mesh size $K$ only. It is easy to see that the roots of this equation are smooth in $\chi_\beta$, therefore the results listed in Theorem 10.1 remain valid for sufficiently small $\beta > 0$ (cf. the proof of that theorem). Instead of repeating these results, we will use Theorem 7.6 to construct multi-pulse homoclinic orbits to a fixed point $p_\epsilon \in \Pi$. These orbits will be the multi-pulse analogs of the single-pulse homoclinic orbits constructed by Li and McLaughlin in [31].
Theorem 10.2. Let N be an arbitrary but fixed positive integer, and let > 0 be a constant such that conditions (116) and (123) are satisfied. Let the mesh size K ≥ 3 be an integer for which the codimension one surface ( ! 1φ sin N 21φ α ˆ0 Mα − M , M0 = (α, β, 0, ) β = Mβ 2 sin2 1φ 2 (129) ˆ M0 sin N 21φ |χα Mα − χβ Mβ | < , < 0 N sin 1φ 2 of the (α, β, 0, ) parameter space is nonempty. Then there exists 0 > 0 and two codimension one surfaces M± ∈ R4+ with the following properties: (i) M± is O(q ) C 0 -close to the surface M0 in the (α, β, 0, ) parameter space. (ii) For every (α, β, 0, ) ∈ M± , system (111 admits an N -pulse homoclinic orbit which is doubly asymptotic to a fixed point √ p ∈ Π. The coordinates of p are given by (ηp , φp ) = 0, cos−1 (χα ) + O( ). (iii) The basepoint of the N -pulse homoclinic orbit lies on the unstable manifold of p , and the jump sequence of the orbit starts with ±1. Proof. We only have to verify conditions (i)-(v) of Theorem 7.6, from which the statements of the present theorem follow directly. We first recall that compact segments of the orbits on the invariant plane Π can be approximated by the level curves of the Hamiltonian Hg defined in (99), which in this case takes the form Hg (η, φ) = −22 η 2 − 0 sin φ + αφ,
(130)
as one obtains by Taylor expanding the right hand side of (115). The level curves of this Hamiltonian are shown in Fig. 5.
Fig. 5. The level curves of the Hamiltonian $H_g$ (horizontal axis $\phi$, vertical axis $\eta$, saddle points labeled $p_0$)
Note that $p_0(\chi_\alpha) = (0, \cos^{-1}(\chi_\alpha))$ is a saddle point with a homoclinic loop. As shown in Li and McLaughlin [31], for $\epsilon > 0$ the equilibrium $p_\epsilon \in \Pi$ perturbing from $p_0$ admits n pairs of complex eigenvalues with negative real parts, thus we obtain that $W^s(p_\epsilon(\chi_\alpha)) \cap M_\epsilon$ is a codimension one surface within the manifold $M_\epsilon$. Consequently, assumption (i) of Theorem 7.6 is satisfied. Condition (ii)
of Theorem 7.6 is established in Sect. 6 of Li and McLaughlin [31] with reference to Li et al. [33]. Condition (iii) of Theorem 7.6 is satisfied if
$$|\chi_\alpha M_\alpha - \chi_\beta M_\beta| < \frac{\hat{M}_0 \sin\left(N\frac{\Delta\phi}{2}\right)}{N \sin\frac{\Delta\phi}{2}}, \qquad (131)$$
in which case the function $\Delta^N K(\phi)$ defined in (128) has a zero $\phi_0$. Note that (131) is satisfied for parameter values taken from the set $M_0$, and the transversality of these zeros is guaranteed by condition (123). Therefore, assumption (iii) of Theorem 7.6 is satisfied. The validity of assumption (iv) can be seen from the phase portrait in Fig. 5. To prove the theorem, it remains to verify condition (v) of Theorem 7.6. This can be done by adapting the distance measurement used in McLaughlin et al. [34] and Li and McLaughlin [31] as follows. The point $\hat{p} = (0, 0, \eta_0(\chi_\alpha), \phi_0(\chi_\alpha) + N\Delta\phi)$ lies on the stable manifold of $p_0$ if the value of the Hamiltonian $H_g$ at $\hat{p}$ is the same as at some other point of the homoclinic loop attached to $p_0$. In particular, it suffices to require that $H_g(\eta_0(\chi_\alpha), \phi_0(\chi_\alpha)) = H_g(\eta_0(\chi_\alpha), \phi_0(\chi_\alpha) + N\Delta\phi)$. From (130) we obtain that this last equation can be written in the form
$$2\Gamma \cos\left(\phi_0(\chi_\alpha) + N\frac{\Delta\phi}{2}\right) \sin\left(N\frac{\Delta\phi}{2}\right) - \alpha N \Delta\phi = 0. \qquad (132)$$
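The step from the energy-matching condition to (132) is a sum-to-product identity, and it can be checked numerically. The sketch below is only an illustration: it reads (130) as a quadratic term in η minus Γ sin φ plus αφ, and the names `gamma`, `dphi`, `c_eta` and all sample values are assumptions introduced here, not quantities fixed by the text. The η² coefficient cancels in the difference, so its value does not matter.

```python
import math

def h_g(eta, phi, gamma, alpha, c_eta=-2.0):
    """Slow Hamiltonian of the form (130): c_eta*eta**2 - gamma*sin(phi) + alpha*phi.

    c_eta stands in for the eta**2 coefficient of (130); its value is an
    assumption and cancels in the energy difference computed below.
    """
    return c_eta * eta**2 - gamma * math.sin(phi) + alpha * phi

def lhs_132(phi0, gamma, alpha, n_pulses, dphi):
    """Left hand side of (132) for an N-pulse phase shift N*dphi."""
    half = n_pulses * dphi / 2.0
    return (2.0 * gamma * math.cos(phi0 + half) * math.sin(half)
            - alpha * n_pulses * dphi)

def energy_gap(eta0, phi0, gamma, alpha, n_pulses, dphi):
    """H_g at the landing point minus H_g at the take-off point."""
    return (h_g(eta0, phi0 + n_pulses * dphi, gamma, alpha)
            - h_g(eta0, phi0, gamma, alpha))

# Hypothetical sample values, chosen only for illustration.
gamma, alpha, n_pulses, dphi = 1.3, 0.4, 3, 0.7
eta0, phi0 = 0.25, 1.1

# The energy gap equals minus the left hand side of (132), so the two
# conditions vanish together.
residual = (energy_gap(eta0, phi0, gamma, alpha, n_pulses, dphi)
            + lhs_132(phi0, gamma, alpha, n_pulses, dphi))
print(abs(residual) < 1e-12)
```

Since the gap and the left hand side of (132) differ only by a sign, requiring equal energies at the two points is the same as requiring (132).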
Using the expression of 1N K(φ) and the fact that φ0 (χα ) is a zero of 1N K(φ), we can rewrite (132) as ! 1φ sin N 21φ χα ˆ Mα − M0 = 0. χβ − Mβ 2 sin2 1φ 2 The transverse crossing of the unstable manifold of p0 by the point pˆ is equivalent to the left hand side of this equation admitting a nonzero derivative with respect to, e.g., the parameter χα at a solution χα (β, 0, , N ). Since the equation is linear in χα , this transversality condition clearly holds, thus condition (v) of Theorem 7.6 is satisfied. This concludes the proof of the theorem. We remark that Li and McLaughlin [31] showed that the set M0 defined in the statement of the above theorem is nonempty for K > 7 and for N = 1 (i.e., for singlepulse homoclinic orbits). We also note that the multi-pulse homoclinic orbits obtained from the theorem have the same asymptotic behavior as the single-pulse homoclinic orbits, hence the construction of chaotic invariant sets in their vicinities can be directly adapted from Li and Wiggins [32]. 11. Conclusions In this paper we gave a general criterion for the existence of nontrivial homoclinic orbits in a large class of near-integrable, multi-dimensional systems that usually arise as modal truncations or discretizations of partial differential equations. The homoclinic orbits we constructed make repeated departures from, and returns to, a codimension two invariant manifold which carries solutions with a slow and a fast time scale. The shape of the
pulses (i.e., excursions of the homoclinic orbits) can be described by a sequence of +1s and −1s which we compute explicitly. Our results generalize the energy-phase method in Haller [16] and Haller and Wiggins [19] to arbitrarily high (but finite) dimensional systems.
We remark that if the perturbation in Eq. (1) is purely Hamiltonian (i.e., g ≡ 0), then the multi-pulse orbits generically undergo a sequence of universal bifurcations as the parameters of the system are varied. Such bifurcations were first described in an example in Haller [15] and were then shown to be generic near double resonances of near-integrable Hamiltonian systems in Haller [15]. Since for purely Hamiltonian perturbations the energy-difference function $\Delta^N H$ obtained in this paper is the same as in [15], the same universality holds for the bifurcations of multi-pulse orbits in system (1).
As an application of our results, we showed that the discretized, perturbed NLS equation admits multi-pulse solutions homoclinic to its center manifold. In fact, the pulse number of these orbits can be arbitrarily high if the dissipative and forcing terms are small enough. Statement (ii) of Theorem 10.1 also shows that N-pulse orbits with quite different shapes will coexist. Furthermore, statement (iii) describes how the unstable manifold of the plane Π disintegrates through multi-pulse jumping into components which display completely different jumping behaviors.
Since the multi-pulse orbits spend a time of order $O(\log(1/\sqrt{\epsilon}))$ (as opposed to $O(1/\sqrt{\epsilon})$ as in Kaper and Kovačič [26]) in the neighborhood of the manifold $M_\epsilon$, they have observable open neighborhoods in which solutions exhibit the same type of jumping behavior for finite times. Given the close coexistence of multi-pulse orbit families with different jump sequences, one expects to see a transient type of chaotic dynamics in numerical simulations. This agrees well with the irregular jumping behavior observed by Bishop et al. [5, 6] for β = 0.
Finally, we also considered the discretized NLS equation with a mode-dependent damping term ($\beta \neq 0$). Making use of the calculations of Li and McLaughlin [31], we showed the existence of multi-pulse Šilnikov-type homoclinic orbits for a codimension one set of parameter values. This provides a significant extension of the set of parameter values for which the discretized NLS equation admits chaotic invariant sets in its phase space.
Acknowledgement. I am grateful to Dave McLaughlin for several useful discussions on the subject of this paper and for making ref. [31] available to me before its publication.
References
1. Ablowitz, M.J. and Ladik, J.F.: Nonlinear differential-difference equations and Fourier analysis. J. Math. Phys. 17, 1011–1018 (1976)
2. Ablowitz, M.J. and Herbst, B.M.: On homoclinic structures and numerically induced chaos for the nonlinear Schrödinger equation. SIAM J. Appl. Math. 50, 339–351 (1990)
3. Arnold, V.I.: Mathematical Methods of Classical Mechanics. (2nd. ed.). New York: Springer-Verlag, 1978
4. Becker, J. and Miles, J. W.: Parametric excitation of an internally resonant double pendulum, II. Z. Angew. Math. Phys. 37, 641–650 (1986)
5. Bishop A.R., Forest, M.G., McLaughlin, D.W., and Overman, E.A.: A modal representation of chaotic attractors for the driven, damped pendulum chain. Phys. Lett. A 144, 17–25 (1990)
6. Bishop, A.R., Flesch, R., Forest, M.G., McLaughlin, D.W., and Overman, E.A.: Correlations between chaos in a perturbed Sine-Gordon equation and a truncated model system. SIAM J. Math. Anal. 21, 1511–1536 (1990)
7. Bogolyubov, N. N. and Prikarpatskii, A. K.: The inverse periodic problem for a discrete approximation of the nonlinear Schrödinger equation. Sov. Phys. Dokl. 27 2, 113–116 (1982)
8. Ercolani, N.M., Forest, M.G., and McLaughlin, D.W.: Geometry of the modulational instability, Part III: Homoclinic Orbits for the periodic sine-Gordon equation. Physica D 43, 349–384 (1990)
9. Ercolani, N.M. and McLaughlin, D.W.: Towards a topological classification of integrable PDE's. In The Geometry of Hamiltonian Systems, New York: Springer-Verlag, 1991
10. Feng, Z.C. and Sethna, P.R.: Global bifurcations in the motion of parametrically excited thin plates. Nonlinear Dynamics 4, 398–408 (1993)
11. Feng, Z.C. and Wiggins, S.: On the existence of chaos in a class of two-degree-of-freedom, damped, strongly parametrically forced mechanical systems with broken O(2) symmetry. ZAMP 44, 201–248 (1993)
12. Feng, Z.C. and Leal: Symmetries in the amplitude equations of an inextensional beam with internal resonance. J. Appl. Mech. 62 1, 235–238 (1995)
13. Fenichel, N.: Persistence and smoothness of invariant manifolds for flows. Ind. Univ. Math. J. 21, 193–225 (1971)
14. Fenichel, N.: Geometric singular perturbation theory for ordinary differential equations. J. Diff. Eqs. 31, 53–98 (1979)
15. Haller, G.: Multi-pulse Homoclinic Phenomena in Resonant Hamiltonian Systems. PhD. Thesis, Caltech (1993)
16. Haller, G. and Wiggins, S.: Orbits homoclinic to resonances: The Hamiltonian case. Physica D 66, 298–346 (1993)
17. Haller, G. and Wiggins, S.: Whiskered tori and chaos in resonant Hamiltonian normal forms. In: Normal Forms and Homoclinic Chaos, Fields Institute Communications, 4, 129–149 (1995)
18. Haller, G. and Wiggins, S.: N-pulse homoclinic orbits in perturbations of resonant Hamiltonian systems. Arch. Rat. Mech. and Anal. 130, 25–101 (1995)
19. Haller, G. and Wiggins, S.: Multi-pulse jumping orbits and homoclinic trees in a modal truncation of the damped, forced nonlinear Schrödinger equation. Physica D 85, 311–347 (1995)
20. Haller, G.: Universal homoclinic bifurcations and chaos near double resonances. J. Stat. Phys. 1011–1051 (1997)
21. Haller, G.: Homoclinic Chaos in Resonant Dynamical Systems, Berlin–Heidelberg–New York: Springer-Verlag (in preparation)
22. Holmes, P.: Chaotic motion in a weakly nonlinear model for surface waves. J. Fluid. Mech. 162, 365–388 (1986)
23. Jones, C.K.R.T. and Kopell, N.: Tracking invariant manifolds with differential forms in singularly perturbed systems. J. Diff. Eqs. 108, 64–88 (1994)
24. Jones, C.K.R.T., Kaper, J., and Kopell, N.: Tracking invariant manifolds up to exponentially small errors. SIAM J. Math. Anal. (to appear) (1995)
25. Kambe, T. and Umeki, M.: Nonlinear dynamics of two-mode interactions in parametric excitation of surface waves. J. Fluid. Mech. 212, 373–393 (1990)
26. Kaper, T.J. and Kovačič, G.: Multiple-pulse homoclinic orbits near resonance bands. Preprint (1993)
27. Kovačič, G. and Wiggins, S.: Orbits homoclinic to resonances, with an application to chaos in the damped and forced Sine-Gordon equation. Physica D 57, 185–225 (1992)
28. Kovačič, G. and Wettergren, T.A.: Homoclinic orbits in the dynamics of resonantly driven coupled pendula. Preprint (1994)
29. Li, Y.: Bäcklund transformations and homoclinic structures for the integrable discretization of the NLS equation. Phys. Lett. A 163, 181–187 (1992)
30. Li, Y. and McLaughlin, D.W.: Morse and Melnikov functions for NLS PDE's. Comm. Math. Phys. 162, 175–214 (1994)
31. Li, Y. and McLaughlin, D.W.: Homoclinic orbits and chaos in discretized perturbed NLS systems, Part I.: Homoclinic orbits. J. Nonlin. Sci. 7, 211–269 (1997)
32. Li, Y. and Wiggins, S.: Homoclinic orbits and chaos in discretized perturbed NLS systems, Part II.: Symbolic dynamics. J. Nonlin. Sci. 7, 315–370 (1997)
33. Li, Y., McLaughlin, D., Shatah, J., and Wiggins, S.: Persistent homoclinic orbits for a perturbed nonlinear Schrödinger equation. Comm. Pure Appl. Math 49, 1175–1255 (1996)
34. McLaughlin, D., Overman, E.A.II., Wiggins S., and Xiong, C.: Homoclinic orbits in a four dimensional model of a perturbed NLS equation: A geometric singular perturbation study. In: Dynamics Reported 5, Berlin: Springer-Verlag, 1993
35. Miles, J. W.: Parametric excitation of an internally resonant double pendulum. Z. Angew. Math. Phys. 36, 337–345 (1985)
36. Miller, P. D., Ercolani, N.M., Krichever, I.M., and Levermore, C.D.: Finite genus solutions to the Ablowitz-Ladik equations. Comm. Pure Appl. Math. (1995) (to appear)
37. Nayfeh, A.H. and Pai, P.F.: Non-linear, non-planar parametric responses of an inextensional beam. Int. J. Non-lin. Mech. 24, 139–158 (1989)
38. Tien, W.M. and Namachchivaya, N.S.: Nonlinear dynamics of a shallow arch under periodic excitation II: 1:1 internal resonance. Int. J. Non-Lin. Mech. 29 (3), 367–386 (1994)
39. Tin, S. K.: On the Dynamics of Tangent Spaces near a Normally Hyperbolic Invariant Manifold, Ph.D. Thesis, Brown University (1994)
40. Yang, X.L. and Sethna, P.R.: Non-linear phenomena in forced vibrations of a nearly square plate: Antisymmetric case. J. Sound and Vib. 155, 413–441 (1992)
41. Zakharov, V.E. and Shabat, A.B.: Exact theory of two dimensional self-focusing and one dimensional self-modulation of waves in nonlinear media. Sov. Phys. JETP 34, 62–69 (1972)
Communicated by J. L. Lebowitz
Commun. Math. Phys. 193, 47 – 67 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Symplectic Structures on Gauge Theory*
Nai-Chung Conan Leung
School of Mathematics, 127 Vincent Hall, University of Minnesota, Minneapolis, MN 55455, USA
Received: 27 February 1996 / Accepted: 7 July 1997
Abstract: We study certain natural differential forms $\Omega^{[*]}$ and their $\mathcal{G}$ equivariant extensions on the space of connections. These forms are defined using the family local index theorem. When the base manifold is symplectic, they define a family of symplectic forms on the space of connections. We will explain their relationships with the Einstein metric and the stability of vector bundles. These forms also determine primary and secondary characteristic forms (and their higher level generalizations).
Contents
1 Introduction
2 Higher Index of the Universal Family
2.1 Dirac operators
2.2 $\bar\partial$-operator
3 Moment Map and Stable Bundles
4 Space of Generalized Connections
5 Higher Level Chern–Simons Forms
6 Equivariant Extensions of $\Omega^{[*]}$
1. Introduction
In this paper, we study natural differential forms $\Omega^{[2k]}$ on the space of connections $\mathcal{A}$ on a vector bundle E over X,
$$\Omega^{[2k]}(A)(B_1, B_2, \dots, B_{2k}) = \int_X \mathrm{Tr}\left[e^{\frac{i}{2\pi}F_A} B_1 B_2 \cdots B_{2k}\right]_{sym} \hat{A}(X).$$
They are $\mathcal{G}$ invariant closed differential forms on $\mathcal{A}$. They are introduced as the family local index for the universal family of operators.
* This research is supported by NSF grant number: DMS-9114456
We use them to define higher Chern-Simons forms of E: ch(E; A0 , . . . , Al ) = [∗] l ◦ Ll (A0 , . . . , Al ), and discuss their properties. When l = 0 and 1, they are just the ordinary Chern character and Chern-Simons forms of E. The usual properties of Chern-Simons forms are now generalized to (1) ch(E) ◦ ∂A = d ◦ ch(E), (2) ch(E; s · α) = (−1)|s| ch(E; α). When we restrict our attention to the case when X is a symplectic manifold (or a Kahler manifold) with integral symplectic form ω, [ω] = c1 (L). We study [2] on A(E ⊗ Lk ) for large k. Asymptotically, they define the symplectic form on A(E ⊗ Lk ). Using a connection DL on L, we can identify all these A’s with A(E) and get a one parameter family of symplectic forms k on A(E), Z i ˆ k (DA )(B1 , B2 ) = Tr [e 2π FA +kωIE B1 B2 ]sym A(X). X
Their $\mathcal{G}$-moment maps are
$$\Phi_k(A) = \left[e^{\frac{i}{2\pi}F_A + k\omega I_E}\,\hat{A}(X)\right]^{(2n)}.$$
When we let the parameter k go to infinity, we will obtain the standard symplectic form $\Omega$ on $\mathcal{A}(E)$:
$$\Omega(A)(B_1, B_2) = \int_X \mathrm{Tr}\, B_1 B_2 \wedge \frac{\omega^{n-1}}{(n-1)!}.$$
Notice that $\Omega$ is an affine constant form in the sense that it is independent of the point $A \in \mathcal{A}$. The moment map equation associated to $\Omega$ is
$$F_A \wedge \frac{\omega^{n-1}}{(n-1)!} = \mu_E I_E \frac{\omega^n}{n!},$$
which is equivalent to the Hermitian Einstein equation (a semi-linear system of partial differential equations)
$$\Lambda F_A = \mu_E I_E.$$
Nevertheless, the moment map equation for $\Phi_k$ is a fully non-linear system of equations (of Monge-Ampère type) which are closely related to stability of E as will be explained in Sect. 4. Then we explain how to remove the assumption of integrality of ω in Sect. 5. Moreover, all $\mathcal{A}$'s will be canonically embedded inside a larger space $\tilde{\mathcal{A}}[E]$, the space of generalized connections. Generalized connections satisfy a weaker constraint:
$$A_j = g_{ij}^{-1} A_i g_{ij} + g_{ij}^{-1} dg_{ij} + d\theta_{ij}.$$
However, they share many properties of an ordinary connection. The first Chern class defines a surjective affine homomorphism
$$c_1 : \tilde{\mathcal{A}}[E] \to H^2(X, \mathbb{C}),$$
and $\Omega^{[2]}$ gives a relative symplectic form on it. We define the notion of extended moduli space and polarization which links to our previous picture on stability. In Sect. 6, we will generalize the moment map construction. We study equivariant extensions of all $\Omega^{[2k]}$ and their properties. They are $\Omega^{[2i,2j]} \in \Omega^{2i}(\mathcal{A}, \mathrm{Symm}^j(\mathrm{Lie}\,\mathcal{G})^*)^{\mathcal{G}}$ given by
$$\Omega^{[2i,2j]}(D_A)(B_1, \dots, B_{2i})(\phi_1, \dots, \phi_j) = \int_X \mathrm{Tr}\left[e^{\frac{i}{2\pi}F_A} B_1 \cdots B_{2i}\,\phi_1 \cdots \phi_j\right]_{sym} \hat{A}(X).$$
In the last section, we will briefly explain how these higher $\Omega^{[2i,2j]}$ are related to family stability. We need to look at all connections on the universal bundle which induce the canonical relative universal connection $\mathbb{D}^{rel}$. Let $\mathbb{D} + S$ be such a connection, then its curvature is given by:
$$\mathbb{F}^{2,0}(x, A) = F_A \text{ at } x, \qquad \mathbb{F}^{1,1}(x, A)(v, B) = B(v) + D_{A,v}(S(A)(B)) \text{ at } x, \qquad \mathbb{F}^{0,2} = d_{\mathcal{A}} S + S^2.$$
By twisted transgression of $\Omega^{[*]}$ (defined using $\mathbb{D} + S$), we get $Map(M, \mathcal{G})$-invariant closed forms on $Map(M, \mathcal{A})$, namely, $\Omega^{[*]}_M$. We will explain how these forms are related to stability for a family of bundles over X parametrized by M when both X and M are symplectic manifolds.
2. Higher Index of the Universal Family
In this section we consider the universal connection and the universal curvature. They induce canonical differential forms $\Omega^{[*]}$ on the space of connections which are important for our later discussions. These forms are closely tied with the local index theorem for the universal family of certain operators. We shall discuss the Dirac operator and the $\bar\partial$ operator in detail. Let X be a compact smooth manifold of dimension 2n and E be a rank r unitary vector bundle over X. We denote the space of connections on E by $\mathcal{A}$ and the space of automorphisms of E by $\mathcal{G}$, which is also called the group of Gauge transformations on E. The tangent space of $\mathcal{A}$ at a point $A \in \mathcal{A}$ is canonically isomorphic to the space of one forms on X with values in End(E), that is, $T_A\mathcal{A} = \Omega^1(X, \mathrm{End}(E))$. In particular, $\mathcal{A}$ is an affine space. (Later in this paper, we shall construct a natural affine extension of $\mathcal{A}$ by a one dimensional affine space.) $\mathcal{G}$ is an infinite dimensional Lie Group with Lie algebra being the space of sections of End(E) over X, that is, $\mathrm{Lie}\,\mathcal{G} = \Omega^0(X, \mathrm{End}\,E)$. The center of $\mathcal{G}$ can be identified as the space of non-vanishing functions on X and hence the center of its Lie algebra can be identified as the space of all functions on X, C(X), or the space of zero forms on X, $\Omega^0(X)$.
By integrating over X, we can identify the dual of $\mathrm{Lie}\,\mathcal{G}$ as the space of 2n forms on X with values in End(E) in the sense of distributions, $(\mathrm{Lie}\,\mathcal{G})^* = \Omega^{2n}(X, \mathrm{End}(E))_{dist}$.
It has a dense subspace $\Omega^{2n}(X, \mathrm{End}(E))$ consisting of smooth 2n forms which are enough for most purposes. The above identification follows from the fact that
$$\Omega^0(X, \mathrm{End}\,E) \otimes \Omega^{2n}(X, \mathrm{End}\,E)_{dist} \to \mathbb{C}, \qquad \phi \otimes \alpha \mapsto \int_X \mathrm{Tr}\,\phi\alpha$$
is a perfect pairing. By putting all connections together, one can form a universal connection as follows: Let $\mathbb{E} = \pi_1^* E$ be the universal bundle, where $\pi_1 : X \times \mathcal{A} \to X$ is the projection to the first factor. Then we can patch up all connections on E to form a partial connection on $\mathbb{E}$ and since the bundle $\mathbb{E}$ is trivial along $\mathcal{A}$, we get a (universal) connection $\mathbb{D}$ on $\mathbb{E}$. To be more precise, let us describe $\mathbb{D}$ in terms of horizontal subspaces. A connection A on E is given by a U(r) equivariant subbundle H of TE over E such that TE is isomorphic to $T_vE \oplus H$. Here $T_vE$ is the vertical tangent bundle of E with respect to $E \to X$. Now, let $e \in \mathbb{E}$ be a point on the fiber of $\mathbb{E}$ over $(x, A) \in X \times \mathcal{A}$. Then $T_e\mathbb{E}$ is canonically isomorphic to $T_eE \oplus T_A\mathcal{A}$ and we can choose $H \oplus T_A\mathcal{A}$ as a horizontal bundle of $T\mathbb{E}$ (where H is the horizontal subbundle of TE defined by A). This horizontal subspace defines our universal connection $\mathbb{D}$. The curvature $\mathbb{F}$ of $\mathbb{D}$ will be called the universal curvature; it is an $\mathrm{End}(\mathbb{E})$ valued two form on $X \times \mathcal{A}$. With respect to the natural decomposition of two forms on $X \times \mathcal{A}$:
$$\Omega^2(X \times \mathcal{A}) = \Omega^2(X) \oplus \Omega^1(X) \otimes \Omega^1(\mathcal{A}) \oplus \Omega^2(\mathcal{A}),$$
we decompose $\mathbb{F}$ into the corresponding three components, $\mathbb{F} = \mathbb{F}^{2,0} + \mathbb{F}^{1,1} + \mathbb{F}^{0,2}$. They are given explicitly by
$$\mathbb{F}^{2,0}(x, A) = F_A \text{ at } x, \qquad \mathbb{F}^{1,1}(x, A)(v, B) = B(v) \text{ at } x, \qquad \mathbb{F}^{0,2} = 0,$$
where $(x, A) \in X \times \mathcal{A}$, $v \in T_xX$ and $B \in \Omega^1(X, \mathrm{End}(E))$. Now, we want to use $\mathcal{A}$ to parametrize certain families of first order elliptic operators on X by coupling different connections with a fixed differential operator. The following three situations are of most interest to us.
2.1. Dirac operators. Let X be a spin manifold of even dimension m = 2n. For any Riemannian metric g on X, we get a Dirac operator D on X. By coupling D with a connection on E, $A \in \mathcal{A}$, we get a twisted Dirac operator $D_A$ on X. Varying these connections, we then get a family of Dirac operators parametrized by $\mathcal{A}$.
The index of $D_A$ (Index $D_A$ = dim Ker $D_A$ − dim Coker $D_A$) can be computed by the Atiyah-Singer Index theorem:
$$\mathrm{Index}\, D_A = \int_X \mathrm{ch}(E)\,\hat{A}(X),$$
which can be expressed in terms of curvature of E via the Chern-Weil theory:
$$\mathrm{Index}\, D_A = \int_X \mathrm{Tr}\, e^{\frac{i}{2\pi}F_A}\,\hat{A}(X).$$
The virtual vector space (the formal difference of two vector spaces) Ker $D_A$ − Coker $D_A$ varies continuously with respect to A and forms a virtual vector bundle over $\mathcal{A}$ (despite the fact that Ker $D_A$ or Coker $D_A$ might jump in dimensions separately). This is called the index bundle over $\mathcal{A}$, Ind(D). We could formally regard Ind(D) as an element in the K group of $\mathcal{A}$, $K(\mathcal{A})$, even though $\mathcal{A}$ is of infinite dimension. This can be interpreted as Ind(D) giving an element in K(Y) for any compact subfamily $Y \subset \mathcal{A}$. Although Ind(D) is only a virtual bundle, its determinant is an honest complex line bundle over $\mathcal{A}$, called the determinant line bundle for Dirac operators, Det(D). By the family index theorem of Atiyah and Singer, we have (for any compact subfamily $Y \subset \mathcal{A}$)
$$\mathrm{ch}(\mathrm{Ind}(D)) = \int_X \mathrm{Tr}\, e^{\frac{i}{2\pi}\mathbb{F}}\,\hat{A}(X)$$
as cohomology classes. Bismut and Freed [B+F] strengthen this result to the level of differential forms and give the local family index theorem. They introduce a superconnection $A_t$ on the infinite dimensional bundle $\pi_*\mathbb{E}$ and they prove the following equality on the level of differential forms:
$$\lim_{t \searrow 0} \mathrm{ch}(A_t) = \int_X \mathrm{Tr}\, e^{\frac{i}{2\pi}\mathbb{F}}\,\hat{A}(X).$$
We decompose this differential form into a sum according to degree, $\Omega^{[0]} + \Omega^{[2]} + \cdots$, where $\Omega^{[2k]} \in \Omega^{2k}(\mathcal{A})$.
Proposition 1.
$$\Omega^{[2k]}(A)(B_1, B_2, \dots, B_{2k}) = \left(\frac{i}{2\pi}\right)^{2k} \int_X \mathrm{Tr}\left[e^{\frac{i}{2\pi}F_A} B_1 B_2 \cdots B_{2k}\right]_{sym} \hat{A}(X),$$
where $A \in \mathcal{A}$ and $B_i \in T_A\mathcal{A} = \Omega^1(X, \mathrm{End}(E))$. The notation $[\cdots]_{sym}$ denotes the graded symmetric product of those elements inside.
Proof of Proposition.
$$\Omega^{[2k]}(A)(B_1, B_2, \dots, B_{2k}) = \int_X \mathrm{Tr}\left[e^{\frac{i}{2\pi}(\mathbb{F}^{2,0} + \mathbb{F}^{1,1})(x, A)}\right](B_1, B_2, \dots, B_{2k})\,\hat{A}(X)$$
$$= \int_X \mathrm{Tr}\left[e^{\frac{i}{2\pi}F_A} \left(\frac{i}{2\pi}\right)^{2k} \frac{(\mathbb{F}^{1,1})^{2k}}{(2k)!}(B_1, B_2, \dots, B_{2k})\right]_{sym} \hat{A}(X)$$
$$= \left(\frac{i}{2\pi}\right)^{2k} \int_X \mathrm{Tr}\left[e^{\frac{i}{2\pi}F_A} B_1 B_2 \cdots B_{2k}\right]_{sym} \hat{A}(X).$$
For the second equality, we used the fact that $\mathbb{F}^{2,0}$ has no effect on the $B_i$'s and the last equality holds since $\mathbb{F}^{1,1}$ is the tautological element which can be described as below. Under the following relations:
$$\Omega^1(X) \otimes \Omega^1(\mathcal{A}) \otimes \mathrm{End}(E) \cong \mathrm{Hom}(T\mathcal{A}, \Omega^1(X, \mathrm{End}(E))) \supset \mathrm{Hom}(T_A\mathcal{A}, \Omega^1(X, \mathrm{End}(E))) = \mathrm{Hom}(\Omega^1(X, \mathrm{End}(E)), \Omega^1(X, \mathrm{End}(E))),$$
$\mathbb{F}^{1,1}$ comes from the identity endomorphism of $\Omega^1(X, \mathrm{End}(E))$. Hence we have proved the proposition.
Corollary 1. $\Omega^{[2k]}$ is a $\mathcal{G}$ invariant closed form on $\mathcal{A}$ such that $\Omega^{[2k]} = 0$ if k > n.
Proof of Corollary. Both $\Omega^{[2k]} = 0$ for k > n and the invariance of $\Omega^{[2k]}$ with respect to the $\mathcal{G}$ action are clear from the above proposition. We will use the Bianchi identity and the simple fact that the antisymmetrization of a symmetric expression is always zero to show the closedness of $\Omega^{[2k]}$. Consider
$$d\Omega^{[2k]}(A)(B_0, \dots, B_{2k}) = \sum_{i=0}^{2k} (-1)^i B_i\,\Omega^{[2k]}(A)(B_0, \dots, \hat{B}_i, \dots, B_{2k}) + \sum_{i<j} (-1)^{i+j}\,\Omega^{[2k]}(A)([B_i, B_j], B_0, \dots, \hat{B}_i, \dots, \hat{B}_j, \dots, B_{2k}).$$
Here, we have extended each $B_i \in T_A\mathcal{A}$ to a vector field on $\mathcal{A}$ by transporting it using the affine structure on $\mathcal{A}$. To put it another way, $B_i$ is always the same element in $\Omega^1(X, \mathrm{End}(E))$ regardless of the point $A \in \mathcal{A}$. Hence, we have $[B_i, B_j] \equiv 0$ for any i and j. We shall show that $\sum_{i=0}^{2k} (-1)^i B_i\,\Omega^{[2k]}(A)(B_0, \dots, \hat{B}_i, \dots, B_{2k})$ is zero. In order to simplify the notation, let us assume temporarily that $\hat{A}(X) = 1$. Then
$$B_i\,\Omega^{[2k]}(A)(B_0, \dots, \hat{B}_i, \dots, B_{2k}) = \left(\frac{i}{2\pi}\right)^{2n} (n - k) \int_X \mathrm{Tr}\left[F_A^{n-k-1} (D_A B_i) B_0 \cdots \hat{B}_i \cdots B_{2k}\right]_{sym},$$
because all $B_i$'s are independent of the connection $A \in \mathcal{A}$. Hence,
$$\sum_{i=0}^{2k} (-1)^i B_i\,\Omega^{[2k]}(A)(B_0, \dots, \hat{B}_i, \dots, B_{2k}) = \left(\frac{i}{2\pi}\right)^{2n} (n - k) \int_X d\,\mathrm{Tr}\left[F_A^{n-k-1} B_0 \cdots B_{2k}\right]_{sym}.$$
Here, we have used the Bianchi identity $D_A F_A = 0$ and the fact that $d\,\mathrm{Tr} = \mathrm{Tr}\, D_A$. Now the vanishing of the above expression follows from Stokes's theorem. Hence we have shown that $\Omega^{[2k]}$ is closed. An alternative way to prove the closedness is by observing that these forms come from the push-forward of certain closed differential forms on $X \times \mathcal{A}$.
2.2. $\bar\partial$-operator. The second case is when X is a compact complex manifold. Then (complexified) differential forms on X can be decomposed into holomorphic and antiholomorphic components:
$$\Omega^l(X) \otimes \mathbb{C} = \sum_{i+j=l} \Omega^{i,j}(X).$$
As an End(E)-valued two form on X, the curvature form of a Hermitian connection can be decomposed accordingly: $F_A = F_A^{2,0} + F_A^{1,1} + F_A^{0,2}$. One should not confuse this decomposition of $F_A$ with the previous decomposition of $\mathbb{F}$. If $F_A^{0,2} = 0$, then the (0, 1) part of $D_A$ defines a differential complex on $\Omega^{0,*}(X, E)$. Let us denote $p \circ D_A$ by $\bar\partial_A$, where $p : \Omega^1(X, E) \otimes \mathbb{C} \to \Omega^{0,1}(X, E)$ is the projection to the anti-holomorphic part. $\bar\partial_A$ gives a complex because $F_A^{0,2} = \bar\partial_A^2$. Let $\mathcal{A}^{hol} = \{D_A \in \mathcal{A} \mid F_A^{0,2} = 0\}$, then $\mathcal{A}^{hol}$ parametrizes a family of elliptic complexes over X. Now, $\mathcal{A}$ is the space of connections on E, not necessarily Hermitian. However, in order for it to parametrize elliptic complexes, we need to choose a compatible Hermitian metric on E (which in fact exists and is unique up to unitary gauge transformations) and a Hermitian metric on the base manifold X. Such a metric induces a Todd form on X representing the Todd class of X. As in the case of Dirac operators, we get a sequence of differential forms on $\mathcal{A}^{hol}$:
$$\int_X \mathrm{Tr}\, e^{\frac{i}{2\pi}\mathbb{F}}\, Td_X = \Omega^{[0]} + \Omega^{[2]} + \Omega^{[4]} + \cdots.$$
Here the only difference is the replacement of $\hat{A}(X)$ by the Todd form on X, $Td_X$. Notice that the Todd class and the $\hat{A}$ class are closely related to each other: $Td = e^{c_1/2}\hat{A}$. If X is a Spin Kähler manifold then this equality holds on the level of differential forms. In this case, the $\bar\partial$ operator on X is equal to the Dirac operator on X twisted by $K_X^{-1/2}$, where the canonical line bundle $K_X$ is determined by the (almost) complex structure on X and its square root is given by the chosen spin structure on X. In this identification, the Kähler condition is important: it ensures that the complex structure and the metric structure on X are compatible with each other. Later on, we shall explain that asymptotic behaviors of $\Omega^{[2]}$ are closely related to stability in Mumford's Geometric Invariant theory [M+F]. The importance of higher $\Omega^{[2k]}$ will be briefly discussed in the last section.
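The relation between the Todd and $\hat{A}$ classes can be sanity-checked on their standard one-variable generating functions, $x/(1 - e^{-x})$ and $(x/2)/\sinh(x/2)$, which satisfy $Td(x) = e^{x/2}\hat{A}(x)$. The following is a minimal numerical sketch of that scalar identity only; the function names and sample points are assumptions chosen for illustration, and the check says nothing about the form-level statement on a Spin Kähler manifold.

```python
import math

def todd(x):
    """Generating function of the Todd class: x / (1 - exp(-x))."""
    return x / (1.0 - math.exp(-x))

def a_hat(x):
    """Generating function of the A-hat class: (x/2) / sinh(x/2)."""
    return (x / 2.0) / math.sinh(x / 2.0)

def max_relative_gap(samples):
    """Largest relative difference between todd(x) and exp(x/2) * a_hat(x)."""
    return max(
        abs(todd(x) - math.exp(x / 2.0) * a_hat(x)) / abs(todd(x))
        for x in samples
    )

# Nonzero sample points; both functions have a removable singularity at x = 0.
samples = [-2.0, -0.5, 0.1, 0.75, 3.0]
print(max_relative_gap(samples) < 1e-12)
```

By the splitting principle the same factor $e^{x/2}$ appears for each Chern root, which is where the exponent $c_1/2$ comes from.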
3. Moment Map and Stable Bundles
In this section, we will explain the relationship between the form $\Omega^{[2]}$ we found earlier and stability properties of the bundle E. This relation was originally discovered in the author's thesis [Le1] and explained in [Le2]. Here, we shall give a global picture in this section and the next one. Then we shall also discuss higher $\Omega^{[2k]}$ later in this paper. Now, we suppose X is a symplectic manifold with symplectic form ω. We also assume ω is integral, which means [ω] represents an integer cohomology class in X, $[\omega] \in H^2(X, \mathbb{Z})$. The general case will be treated in the next section. $H^2(X, \mathbb{Z})$ classifies complex line bundles over X up to isomorphisms, that is, $[\omega] = c_1(L)$ for some L over X. In fact, we can find a Hermitian connection $D_L$ on L such that the last equality holds on the level of forms, that is $\omega = \frac{i}{2\pi}F_L$, where $F_L$ is the curvature form of a certain connection $D_L$ on L. We are interested in the behaviors of $\Omega^{[2]}$ on the space of connections on $E \otimes L^k$ for large k. In order to distinguish connections on different bundles, we write $\mathcal{A}(E \otimes L^k)$ for the space of connections on $E \otimes L^k$. However, by tensoring with $D_L$, we can identify different $\mathcal{A}(E \otimes L^k)$,
$$\mathcal{A}(E) \to \mathcal{A}(E \otimes L^k), \qquad D_A \mapsto D_A \otimes D_{L^k}.$$
Under this identification, the curvature $\frac{i}{2\pi}F_A$ of $D_A$ will be sent to $\frac{i}{2\pi}F_A + k\omega I_E$. We shall denote the corresponding $\Omega^{[2]}$ by $\Omega_k$. We therefore have
$$\Omega_k(D_A)(B_1, B_2) = \int_X \mathrm{Tr}\left[e^{\frac{i}{2\pi}F_A + k\omega I_E} B_1 B_2\right]_{sym} \hat{A}(X)$$
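in the spin case (and similar formulas for other cases by replacing $\hat{A}$ by suitable characteristic classes). Notice that, at a general point $D_A \in \mathcal{A}$, $\Omega_k$ may be degenerate. However, it will become non-degenerate if k is chosen large enough and therefore, in a suitable sense, $\Omega_k$ is asymptotically a symplectic form on $\mathcal{A}$. (For the rest of this paper, a symplectic form may possibly degenerate but this deficiency can usually be rescued by asymptotic non-degeneracy.)
Moment maps. For any k, $\Omega_k$ is always a $\mathcal{G}$ invariant form and a moment map exists. That is, $\Omega_k$ can be extended to a $\mathcal{G}$ equivariant closed form on $\mathcal{A}$.
Lemma 1. The moment map for the $\mathcal{G}$ action on $\mathcal{A}$ with symplectic form $\Omega_k$ is given by
$$\Phi_k : \mathcal{A} \to \Omega^{2n}(X, \mathrm{End}(E)), \qquad \Phi_k(A) = \left[e^{\frac{i}{2\pi}F_A + k\omega I_E}\,\hat{A}(X)\right]^{(2n)}.$$
Proof of Lemma. First, we need to show that $\Phi_k$ is a $\mathcal{G}$ equivariant map, but this is clear from the definition of $\Phi_k$. Next, we want to show that $\Omega_k + \Phi_k$ is a $\mathcal{G}$ equivariant form. That is, for any $\phi \in \mathrm{Lie}\,\mathcal{G} = \Omega^0(X, \mathrm{End}(E))$, $\iota_\phi \Omega_k = d(\Phi_k(\phi))$ as an ordinary one form on $\mathcal{A}$, or,
$$\Omega_k(A)(D_A\phi, B) = d\left(\int_X \mathrm{Tr}\,\Phi_k(A) \cdot \phi\right)(B)$$
for any $A \in \mathcal{A}$ and $B \in T_A\mathcal{A} = \Omega^1(X, \mathrm{End}(E))$,
$$d\left(\int_X \mathrm{Tr}\,\Phi_k(D_A)\phi\right)(B) = \frac{d}{dt}\Big|_{t=0} \int_X \mathrm{Tr}\, e^{\frac{i}{2\pi}(F_A + tD_AB + t^2B^2) + k\omega I_E}\,\phi\,\hat{A}(X)$$
$$= \frac{i}{2\pi} \int_X \mathrm{Tr}\left[e^{\frac{i}{2\pi}F_A + k\omega I_E} D_AB\right]_{sym} \phi\,\hat{A}(X)$$
$$= \frac{i}{2\pi} \int_X \mathrm{Tr}\left[e^{\frac{i}{2\pi}F_A + k\omega I_E} D_AB\,\phi\right]_{sym} \hat{A}(X)$$
$$= \frac{i}{2\pi} \int_X \mathrm{Tr}\left[e^{\frac{i}{2\pi}F_A + k\omega I_E} D_A\phi\,B\right]_{sym} \hat{A}(X)$$
$$= \Omega_k(D_A)(D_A\phi, B).$$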
Hence, $\Omega_k + \Phi_k$ is a $\mathcal{G}$ equivariant closed form and $\Phi_k$ is the moment map for the gauge group action on $\mathcal{A}$ with the symplectic structure given by $\Omega_k$.
Remark. For each k, we obtain a $\mathcal{G}$ equivariant closed form on $\mathcal{A}$, namely $\Omega_k + \Phi_k$. It depends only on the choice of a connection $D_L$ on L. Nevertheless, the equivariant cohomology class it represents, $[\Omega_k + \Phi_k] \in H^2_{\mathcal{G}}(\mathcal{A})$, is independent of such a choice.
Proposition 2. $[\Omega_k + \Phi_k] \in H^2_{\mathcal{G}}(\mathcal{A})$ is independent of the choice of the Hermitian connection $D_L$ on L.
Proof of Proposition. If $D_L$ and $D_L'$ are two connections on L, then they differ by a one form on X, say α. They induce $\Omega_k + \Phi_k$ and $\Omega_k' + \Phi_k'$ on $\mathcal{A}$. Their difference is a $\mathcal{G}$ equivariant exact form on $\mathcal{A}$ which can be written down explicitly (like the Chern-Simons construction). More precisely,
$$(\Omega_k' + \Phi_k') - (\Omega_k + \Phi_k) = d_{\mathcal{G}}\,\Theta_{k, D_L, D_L'},$$
where
$$\Theta_{k, D_L, D_L'}(A)(B) = k \int_0^1 dt \int_X \alpha\,\mathrm{Tr}\, e^{\frac{i}{2\pi}F_A + k(\omega + t\,d\alpha)I_E}\, B\,\hat{A}(X).$$
Hence, the cohomology classes they represent are the same.
Large k limit of $\Omega_k + \Phi_k$. We now consider $\Omega_k$ and $\Phi_k$ for large k. If we expand $\Omega_k$ in powers of k, we have
$$\Omega_k(A)(B_1, B_2) = k^{n-1} \cdot \int_X \mathrm{Tr}\, B_1 B_2 \wedge \frac{\omega^{n-1}}{(n-1)!} + O(k^{n-2}).$$
The leading order term defines an (everywhere non-degenerate) symplectic form $\Omega$ on $\mathcal{A}$; moreover, it is a constant form on $\mathcal{A}$ in the sense that there is no dependence on A in its expression. The moment map defined by this constant symplectic form is
$$\Phi(A) = F_A \wedge \frac{\omega^{n-1}}{(n-1)!}.$$
If we expand $\Phi_k$ for large k, we have
$$\Phi_k(A) = k^n \frac{\omega^n}{n!} I_E + k^{n-1}\Phi(A) + O(k^{n-2}).$$
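The two leading terms of this expansion can be illustrated in a commutative toy model. The sketch below makes several simplifying assumptions that are not in the text: the curvature term and ω are replaced by commuting symbols `x` and `w` of form degree 2 (as for a line bundle), and the $\hat{A}(X)$ factor is set to 1. Under those assumptions the degree-2n part of $e^{x + kw}$ is $(x + kw)^n/n!$, and the function name is hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def top_degree_coefficients(n):
    """Coefficients of the degree-2n part of exp(x + k*w), indexed by powers of k.

    x and w are commuting stand-ins for the curvature term and omega (an
    assumption of this sketch); both have form degree 2, so the degree-2n
    part is (x + k*w)**n / n!.  Entry j is the rational coefficient that
    multiplies k**j * w**j * x**(n - j).
    """
    return [Fraction(comb(n, j), factorial(n)) for j in range(n + 1)]

n = 4  # hypothetical complex dimension, chosen only for illustration
coeffs = top_degree_coefficients(n)
leading = coeffs[n]         # multiplies k**n * w**n
subleading = coeffs[n - 1]  # multiplies k**(n-1) * x * w**(n-1)
print(leading == Fraction(1, factorial(n)))
print(subleading == Fraction(1, factorial(n - 1)))
```

The coefficients 1/n! and 1/(n−1)! are the ones appearing in front of $\omega^n$ and $\omega^{n-1}$ in the expansion of $\Phi_k$; the non-commutative, $\hat{A}$-twisted case adds only lower order corrections in k.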
The first term is a constant term and can be absorbed in the definition of 8k (since the moment map is uniquely determined only up to addition of any constant central element) and therefore, the leading (non-trivial) term for 8k becomes 8 for large k. We now look at the symplectic quotient of A by Hamiltonian G actions with respect to different k . First choose a coadjoint orbit O ⊂ (LieG)∗ , the inverse image 8−1 (O) of O under the moment map 8 is G invariant. Then one can show that 8−1 (O)/G inherits a natural symplectic form (on the smooth points) from the original symplectic space. We can now apply this procedure to any of these k or since they are all G invariant symplectic forms on A. The simplest choice of the coadjoint orbit would be 0, however, 8−1 (0) may not be non-empty in general. We would choose those coadjoint orbits which consist of a single
point only. The set of all these coadjoint orbits can be identified with the space of $2n$ forms on $X$ (in the sense of distributions).

Hermitian Einstein metrics. We first consider $\Phi$, which depends on $A$ in a (semi-)linear fashion. Let $\zeta \in \Omega^{2n}(X)$; then $D_A \in \Phi^{-1}(\zeta)$ if and only if
\[ F_A \wedge \frac{\omega^{n-1}}{(n-1)!} = \zeta I_E. \]
By taking the trace and integrating over $X$, we get a necessary condition for $\Phi^{-1}(\zeta)$ to be non-empty, namely
\[ \int_X \zeta = \mu_E, \]
where $\mu_E = \frac{1}{rk(E)} \langle c_1(E) \cup \frac{c_1(L)^{n-1}}{(n-1)!}, [X] \rangle$ is the slope of $E$ with respect to the polarization $L$. For simplicity, we normalize the (symplectic) volume of $X$ to be one. When we choose $\zeta$ to be a multiple of the symplectic volume form $\frac{\omega^n}{n!}$, the equation saying that the connection lies in the inverse image of $\zeta$ is
\[ F_A \wedge \frac{\omega^{n-1}}{(n-1)!} = \mu_E \frac{\omega^n}{n!} I_E. \]
Using $\frac{\omega^n}{n!}$ and a metric on $X$, we can convert this equation into an equation of zero forms. It reads
\[ \Lambda F_A = \mu_E I_E, \]
where $\Lambda$ is the adjoint to the multiplication operator $L = \omega \wedge (\,\cdot\,)$ (a metric on $X$ is used to define the adjoint of an operator). In local coordinates, the equation is
\[ g^{\alpha\bar\beta} F^i_{j\alpha\bar\beta} = \mu_E\, \delta^i_j, \]
where $\sqrt{-1}\, g_{\alpha\bar\beta}\, dz^\alpha \wedge d\bar z^\beta$ is the Kähler form $\omega$ on $X$ and $(g^{\alpha\bar\beta})$ is the inverse matrix to $(g_{\alpha\bar\beta})$, provided $X$ is Kähler and the $z$'s are local holomorphic coordinates on $X$. Suppose this is the case and $E$ is an irreducible holomorphic vector bundle over $X$; then this equation is called the Hermitian Einstein equation or the Hermitian Yang-Mills equation. It is called Einstein because $\Lambda F_A$ is a Ricci curvature of $E$. (On a vector bundle, there is another kind of Ricci curvature, namely $\mathrm{Tr}_E F_A \in \Omega^2(X)$. This latter Ricci curvature is a closed two form on $X$ and the cohomology class it defines is the first Chern class of $E$.)

If we consider only those connections whose curvature has vanishing $(0,2)$ part, that is, $A \in \mathcal{A}^{hol}$ as in the discussion of the last section, then we have
\[ \int_X |F_A|^2 \frac{\omega^n}{n!} - \int_X |\Lambda F_A|^2 \frac{\omega^n}{n!} = -8\pi^2 \int_X ch_2(E) \wedge \frac{\omega^{n-2}}{(n-2)!}. \]
Since $X$ is Kähler, $\frac{\omega^n}{n!}$ is the volume form of $X$ and the functional $\int_X |F_A|^2 \frac{\omega^n}{n!}$ is the standard Yang-Mills functional in gauge theory. By the above equality, it is equivalent to the functional $\int_X |\Lambda F_A|^2 \frac{\omega^n}{n!}$ up to a topological constant (when we restrict our attention to $\mathcal{A}^{hol}$). Now, the Euler-Lagrange equation for the functional $\int_X |\Lambda F_A|^2 \frac{\omega^n}{n!}$ is $D_A(\Lambda F_A) = 0$. It can be reduced to $\Lambda F_A = \mu_E I_E$ if $E$ is irreducible. It is because
the Euler-Lagrange equation implies that $\Lambda F_A$ is a parallel endomorphism and therefore its eigenspaces are all holomorphic subbundles of $E$. When $E$ is irreducible, $E$ itself is the only non-trivial subbundle. So the endomorphism $\Lambda F_A$ has to be a constant, which is fixed by the topology of the bundle.

From another point of view, the Yang-Mills equation on $E$ (that is, the Euler-Lagrange equation of the functional $\int_X |F_A|^2 \frac{\omega^n}{n!}$) is $D_A^* F_A = 0$. When $X$ is Kähler and $A \in \mathcal{A}^{hol}$, we have $D_A^* = [D_A, \Lambda]$. So,
\[ 0 = D_A^* F_A = D_A(\Lambda F_A) - \Lambda(D_A F_A) = D_A(\Lambda F_A) \]
by the Bianchi identity. Hence, the two equations are equivalent.

Stability. Existence of solutions to $\Lambda F_A = \mu I_E$ on $E$ can be rephrased in algebraic geometric language. By [N+S], [Do] and [U+Y], there exists a Hermitian Einstein metric on $E$ if and only if $E$ is a Mumford stable bundle. In such a case, the metric is also unique up to scaling by a global constant. (When $E$ is not irreducible, existence of the Hermitian Einstein metric on $E$ is equivalent to $E$ being a direct sum of irreducible Mumford stable bundles over $X$ with the same slope; we call such a bundle $E$ a Mumford poly-stable bundle.)

Mumford stability is only the "linearization" of stability in studying moduli spaces of vector bundles using Mumford's geometric invariant theory. In the geometric invariant theory of vector bundles, Gieseker [Gi] found the correct notion of stability and stated it in geometric terms. In his thesis, the author found that stability is in fact equivalent to existence of solutions to the moment map equations for $\Phi_k$ with $k$ large (and control of the curvature of the solutions as $k$ goes to infinity). To solve that (fully nonlinear) equation, the author used a very involved singular perturbation method which identified the obstruction to perturbation as precisely the instability [Le1, Le2]. The basic idea is that when $k$ goes to infinity, we expect the family of solutions (provided it exists) to blow up. Different directions will blow up at different rates. However, this information can be captured from algebraic geometry. Then we try to perturb the singular solution (whose existence can be proved by using a theorem of Uhlenbeck and Yau) to finite $k$; there will be numerous obstructions. We can identify all these obstructions. In fact, the obstructions vanish precisely when the bundle is Gieseker (poly-)stable. Instead of trying to solve the equations, we shall discuss the geometry underlying these $\Omega_k$'s, $\Phi_k$'s and their higher degree cousins.
First we want to set $\Phi_k$ equal to a constant multiple of $\frac{\omega^n}{n!} I_E$. The constant has to be $\frac{1}{rk(E)} \chi(X, E \otimes L^k)$, where $\chi$ is the index of the corresponding operator given by the Atiyah-Singer index theorem. Therefore, the equations defined by the moment map become
\[ \big[ e^{\frac{i}{2\pi}F_A + k\omega I_E}\, Td(X) \big]^{(2n)} = \frac{1}{rk(E)}\, \chi(X, E \otimes L^k)\, \frac{\omega^n}{n!}\, I_E. \]
These equations describe the stability properties of holomorphic bundles. Unlike the Hermitian Einstein equations, this system of equations is fully nonlinear unless $n = 1$. Instead of the $\bar\partial$ operator, we can use the Dirac operator and replace the $Td$ form by the $\hat{A}$ form.
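To see why the case $n = 1$ is special, one can write the equation out on a compact Riemann surface. The sketch below assumes that $X$ has genus $g$, that $\int_X \omega = 1$ (so $\deg L = 1$), and that the degree-two part of the Todd form is represented by $(1-g)\,\omega$. These normalizations are illustrative assumptions and are not fixed by the text.

```latex
% n = 1: only the degree-2 part of e^{(i/2\pi)F_A + k\omega I_E} Td(X) matters.
% ASSUMPTIONS (illustrative): \int_X \omega = 1,   Td(X)^{(2)} = (1-g)\,\omega .
\begin{align*}
\big[e^{\frac{i}{2\pi}F_A + k\omega I_E}\,Td(X)\big]^{(2)}
  &= \tfrac{i}{2\pi}F_A + (k + 1 - g)\,\omega\, I_E, \\
\frac{1}{rk(E)}\,\chi(X, E\otimes L^k)
  &= \mu_E + k + 1 - g
  \qquad \text{(Riemann--Roch, } \mu_E = \deg E / rk(E)\text{)}, \\
\text{hence the equation reads}\quad
\tfrac{i}{2\pi}F_A &= \mu_E\,\omega\, I_E .
\end{align*}
```

The $k$-dependence cancels and the result is linear in $F_A$. For $n \ge 2$ the powers $(\frac{i}{2\pi}F_A)^p$ with $p \ge 2$ survive in degree $2n$, which is where the nonlinearity comes from.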
4. Space of Generalized Connections

In this section, we shall explain a setting which is more suitable for studying the symplectic quotient of $\mathcal{A}$ (or $\mathcal{A}^{hol}$) by $\mathcal{G}$ with respect to $\Omega^{[2]}$. All the proofs here are elementary and sometimes we only sketch the reasoning. From the last section, we know that it is natural to put the $\mathcal{A}(E \otimes L^k)$'s together for large $k$. Stability depends only on the 'direction' $L$ and one should not distinguish $E$ among the $E \otimes L^k$'s. We are going to explain how to achieve this in a natural manner by allowing $k$ to be any complex number. Moreover, in this approach, we do not need to assume that the symplectic form $\omega$ is an integral form. To do this, we need to define generalized connections on $E$.

Review of connections. Let us first recall the definition of connections in Čech language. Let $\mathcal{U}$ be a good cover of $X$. That is, for any $U_i, U_j, \dots, U_k \in \mathcal{U}$, $U_{i,j,\dots,k} \equiv U_i \cap U_j \cap \cdots \cap U_k$ is a contractible set. Then the bundle $E$ is determined by a collection of gluing functions $\{g_{ij} : U_{ij} \to GL(r)\}$ which satisfy $g_{ij} = g_{ji}^{-1}$ and $g_{ij} g_{jk} g_{ki} = 1$. Two collections $g$ and $g'$ are equivalent if there exists $\{h_i : U_i \to GL(r)\}$ satisfying $g'_{ij} = h_i^{-1} g_{ij} h_j$. A connection $D_A$ is a collection of matrix-valued one forms on $\mathcal{U}$, $\{A_i \in \Omega^1(U_i, gl(r))\}$. On $U_{ij}$, they satisfy
\[ A_j = g_{ij}^{-1} A_i g_{ij} + g_{ij}^{-1} dg_{ij}. \]
Now, a generalized connection $D_A$ is essentially the same thing except that we relax the last equality to
\[ A_j = g_{ij}^{-1} A_i g_{ij} + g_{ij}^{-1} dg_{ij} + d\theta_{ij} \]
for some collection $\{\theta_{ij} : U_{ij} \to \mathbb{C}\}$ which satisfies $\theta_{ij} = -\theta_{ji}$ and $d(\theta_{ij} + \theta_{jk} + \theta_{ki}) = 0$. The definition of gauge equivalence for two generalized connections is the same as for ordinary connections: $A'_i = g_i^{-1} A_i g_i + g_i^{-1} dg_i$ for some automorphism $\{g_i : U_i \to GL(r)\}$. Let $F_i = dA_i + A_i \wedge A_i$; then one can prove that we still have $F_j = g_{ij}^{-1} F_i g_{ij}$. That is, $F \in \Omega^2(X, \mathrm{End}(E))$.
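The claim that the curvature still glues to a global $\mathrm{End}(E)$-valued two form can be verified directly. The following is a short sketch of that computation; it reads $d\theta_{ij}$ in the relaxed gluing law as $d\theta_{ij}$ times the identity matrix $I$, which is an interpretive assumption, and the abbreviation $a$ is introduced only for this calculation.

```latex
% Sketch: F_j = g_{ij}^{-1} F_i g_{ij} still holds for generalized connections.
% Let a := g_{ij}^{-1} A_i g_{ij} + g_{ij}^{-1} dg_{ij}  (the ordinary gauge transform),
% so that A_j = a + d\theta_{ij}\, I  with  d\theta_{ij}  a scalar one-form.
\begin{align*}
F_j &= dA_j + A_j\wedge A_j \\
    &= da + d(d\theta_{ij})\,I + a\wedge a
       + \big(a\wedge d\theta_{ij} + d\theta_{ij}\wedge a\big)
       + (d\theta_{ij}\wedge d\theta_{ij})\,I \\
    &= da + a\wedge a \\
    &= g_{ij}^{-1}\,(dA_i + A_i\wedge A_i)\,g_{ij}
     = g_{ij}^{-1} F_i\, g_{ij}.
\end{align*}
% d^2 = 0 removes d(d\theta_{ij}); a scalar one-form anticommutes with the
% matrix-valued one-form a, so the mixed terms cancel; and
% d\theta_{ij} \wedge d\theta_{ij} = 0.
```

The last step is the usual transformation law of curvature under an ordinary change of trivialization, so the extra term $d\theta_{ij}$ changes the connection but not how the curvature glues.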
One can also verify that the notion of generalized connections is well-defined and independent of the choice of good cover $\mathcal{U}$ or the gluing functions $\{g_{ij}\}$ of $E$.

Generalized connections.

Definition 1. A collection $D_A = \{A_i \in \Omega^1(U_i, gl(r))\}$ is called a generalized connection if
\[ A_j = g_{ij}^{-1} A_i g_{ij} + g_{ij}^{-1} dg_{ij} + d\theta_{ij} \]
for some collection $\{\theta_{ij}\}$ as before. The curvature $F_A \in \Omega^2(X, \mathrm{End}(E))$ of $D_A$ is defined to be $F_i = dA_i + A_i \wedge A_i$ on $U_i$. $\tilde{\mathcal{A}}_{[E]}$ denotes the space of all generalized connections on $E$.

We collect some basic facts about $\tilde{\mathcal{A}}_{[E]}$ in the following lemma.

Lemma 2.
(1) $\mathcal{A}(E) \subset \tilde{\mathcal{A}}_{[E]}$.
(2) $\tilde{\mathcal{A}}_{[E]} \equiv \tilde{\mathcal{A}}_{[E \otimes L]}$ for any line bundle $L$.
(3) $\tilde{\mathcal{A}}_{[E]}$ has a natural affine structure such that $\mathcal{A}(E)$ is an affine subspace of it of codimension $b_2(X)$.

In a sense, $\tilde{\mathcal{A}}_{[E]}$ should be regarded as $\bigcup_{L,k} \mathcal{A}(E \otimes L^k)$, where the $L$'s form a base of $H^2(X, \mathbb{Z})/\mathrm{Tor}$ and $k \in \mathbb{C}$. Previously, in order to identify $\mathcal{A}(E)$ and $\mathcal{A}(E \otimes L)$, we had to pick a connection on $L$. But now, $\tilde{\mathcal{A}}_{[E]}$ and $\tilde{\mathcal{A}}_{[E \otimes L]}$ are always canonically isomorphic to each other. So it is natural to consider equivalence classes of vector bundles which are isomorphic to each other modulo tensoring with line bundles. We denote the equivalence class of $E$ by $[E]$. That is, $[E \otimes L] = [E]$ for any line bundle $L$. Notice that $\mathrm{End}([E])$ is a well-defined bundle on $X$ and the group $\mathcal{G}$ of all gauge transformations of $[E]$ is also a well-defined notion for the equivalence class.

Let $c_1([E], D_A) = \frac{i}{2\pi} \mathrm{Tr}\, F_A$ be the first Chern form of any generalized connection $D_A \in \tilde{\mathcal{A}}_{[E]}$. It is always a closed two form on $X$. However, its cohomology class is not determined by $[E]$. In fact, it induces a surjective affine homomorphism $c_1 : \tilde{\mathcal{A}}_{[E]} \to H^2(X, \mathbb{C})$, and $\mathcal{G}$ acts on $\tilde{\mathcal{A}}_{[E]}$ preserving the fibers of $c_1$. Now, $\Omega \equiv \Omega^{[2]}$ can be naturally extended to a $\mathcal{G}$ invariant relative closed two form on $\tilde{\mathcal{A}}_{[E]}$.

Relative forms. Let us first define the relative notion, which will also be used in later sections. Suppose that $X \to P \xrightarrow{\pi} M$ is a fiber bundle over $M$. We have a canonical exact sequence of vector bundles over $P$:
\[ 0 \longrightarrow T_{vert}P \longrightarrow TP \longrightarrow \pi^* TM \longrightarrow 0, \]
where $T_{vert}P$ denotes the vertical tangent bundle (or the relative tangent bundle). In fact, we can regard this exact sequence as the definition of $T_{vert}P$. Recall that a $k$-form is a section of $\Lambda^k(T^*P)$ over $P$.

Definition 2. A relative $k$-form is a section of $\Lambda^k(T^*_{vert}P)$ over $P$. A relative connection $D_A^{rel}$ on a bundle $E$ over $P$ is a first order relative differential operator
\[ D_A^{rel} : \Gamma(P, E) \longrightarrow \Gamma(P, T^*_{vert}P \otimes E), \]
that is,
\[ D_A^{rel}(fs)(v) = v(f)s + f(D_A^{rel}s)(v) \]
for any $f \in C^\infty(P)$, $s \in \Gamma(P, E)$ and $v \in T_{vert}P$. The curvature of $D_A^{rel}$ is $F_A^{rel} = (D_A^{rel})^2$.

An ordinary $k$-form (resp. connection on $E$) on $P$ induces a relative $k$-form (resp. relative connection on $E$). Notice that the curvature of a relative connection is an $\mathrm{End}(E)$-valued relative two form on $P$. If we are given a splitting of $TP$ as a direct sum of $T_{vert}P$ and $\pi^* TM$, then a relative form can be extended to a form on $P$ by assigning zero to $\pi^* TM$. For example, when $P$ has a Riemannian metric or $P$ is a product of $X$ and $M$, then $TP \equiv T_{vert}P \oplus \pi^* TM$.

Extended moduli space. Now, $\Omega$ is a $\mathcal{G}$ invariant relative closed two form on $\tilde{\mathcal{A}}_{[E]}$. (The theory of the de Rham model of equivariant cohomology can be extended to the relative
case.) Then $\Omega$ admits a $\mathcal{G}$-equivariant closed extension $\Omega + \Phi$. Just as before, for any $D_A \in \tilde{\mathcal{A}}_{[E]}$ and $\phi \in \Omega^0(X, \mathrm{End}([E]))$, $\Phi$ is given by
\[ \Phi(D_A)(\phi) = \int_X \mathrm{Tr}\, e^{\frac{i}{2\pi}F_A}\, \phi\, \hat{A}(X). \]
For any non-vanishing top form $\mu$ on $X$, $\Phi^{-1}(\mathbb{C} \cdot \mu I_E)$ is preserved by $\mathcal{G}$.

Definition 3. $\tilde{\mathcal{M}}_\mu := \Phi^{-1}(\mathbb{C} \cdot \mu I_E)/\mathcal{G}$ is called the extended moduli space.

Now, $c_1$ descends to give a map $c_1 : \tilde{\mathcal{M}}_\mu \to H^2(X, \mathbb{C})$. To study the stability problem, we have to introduce a polarization to specify a particular direction in $\tilde{\mathcal{A}}_{[E]}$.

Definition 4. A polarization $\mathcal{P}$ is a line field in $\tilde{\mathcal{A}}_{[E]}$ which is invariant under affine translations. $\mathcal{P}$ is called regular if it is transverse to the fibers of $c_1$. Two regular polarizations are equivalent to each other if they induce the same line field in $H^2(X, \mathbb{C})$.

It is easy to see that a line bundle $L$ together with a connection $D_L$ on it induces a regular polarization in $\tilde{\mathcal{A}}_{[E]}$. Different choices of connections always give equivalent polarizations.

Remark. The word polarization as used here comes from algebraic geometry, where it means a choice of an ample line bundle up to numerical equivalence. This should not be confused with the notion of polarization used in geometric quantization.

We fix a line bundle $L$ on $X$ and choose a connection $D_L$ on $L$. Let $\omega = \frac{i}{2\pi} F_L$ be its curvature. Then the top form $\mu = \frac{\omega^n}{n!}$ is non-vanishing if and only if $\omega$ is a symplectic form. We are interested in the ends of $\mathcal{M}_\mu$. Its being non-empty near the $+\infty$ end is related to stability properties of $E$, as explained in the last section. However, the picture we have here is more geometric. All the symplectic quotients $\mathcal{A}(E \otimes L^k)//\mathcal{G}$ ($k \in \mathbb{Z}$) are embedded inside a bigger continuous family, which is the symplectic quotient of the space of generalized connections. Moreover, "$\mathcal{M}_\mu|_{c_1 = +\infty}$" should be regarded as the moduli space of Hermitian Einstein connections on $E$. Loosely speaking, asymptotically near $c_1 = +\infty$, $\Omega^{[2]}$ defines a Poisson structure on $\tilde{\mathcal{A}}_{[E]}$ and, restricted to each fiber of $c_1$, it is a symplectic form.

5. Higher Level Chern–Simons Forms

In this section, we shall show that these $\Omega^{[*]}$ determine the Chern character and Chern–Simons forms of $E$.
At the same time, we define higher level generalizations of Chern–Simons forms. In order to simplify the discussion below, we shall assume that the characteristic form of $X$ is one. The author would like to thank the referee who pointed out that this section is closely related to earlier work of Gelfand and Wang. In fact, the author found out later that this is also related to certain previous work of Bott.

Inside $\mathcal{A}$, there is a canonical affine foliation such that any two connections $D_A$ and $D_A'$ are in the same leaf if and only if $D_A' = D_A + \alpha I_E$ for some one form $\alpha$ on $X$. In terms of an (integrable) subbundle of $T\mathcal{A}$, it is given by $\Omega^1(X) \subset \Omega^1(X, \mathrm{End}(E)) = T_A\mathcal{A}$. We restrict $\Omega^{[*]}$ to a differential form along the foliation (see relative forms in Sect. 3). We shall still denote this relative differential form on $\mathcal{A}$ by $\Omega^{[*]}$. We are going to see that $\Omega^{[*]}$ is equivalent to the operation of taking the Chern character form of any connection.
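For example, $\Omega^{[2k]} \in \Gamma(\mathcal{A}, \Lambda^{2k} T^*\mathcal{A})$ now gives a homomorphism:
\[ \mathcal{A} \times \Omega^{2k}(X) \hookrightarrow \Gamma(\mathcal{A}, \Lambda^{2k}(T\mathcal{A})) \xrightarrow{\ \Omega^{[2k]}\ } \mathbb{C}. \]
By abuse of language, we also denote this homomorphism by $\Omega^{[2k]}$. Then we have
\[ \Omega^{[2k]}(D_A, \gamma) = \int_X \mathrm{Tr}\, e^{\frac{i}{2\pi}F_A}\, \gamma. \]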
If we regard $\Omega^{[2k]}$ as a map $\mathcal{A} \to \Omega^{2n-2k}(X)_{dist}$, then the image of $D_A$ is just given by the $(n-k)$th Chern character form of $D_A$. Therefore, taking the Chern character form of any connection is equivalent to the restriction of the differential form $\Omega^{[*]}$ to the above foliation of $\mathcal{A}$.

To recover secondary (or higher level generalized) characteristic forms on $E$, we look at the chain complex $S_*(\mathcal{A})$ of $\mathcal{A}$, and $\Omega^{[*]}$ defines a chain homomorphism $\Omega^{[*]} : S_*(\mathcal{A}) \to \Omega^{[*]}(X)$. Recall that an element in $S_l(\mathcal{A})$ (a singular chain) is a finite linear combination of singular $l$-simplexes $\sigma$ on $\mathcal{A}$ with complex coefficients, where a singular $l$-simplex is a smooth map from the standard $l$-simplex $\Delta^l$ to $\mathcal{A}$, $\sigma : \Delta^l \to \mathcal{A}$. There is a boundary homomorphism $\partial : S_l(\mathcal{A}) \to S_{l-1}(\mathcal{A})$ with $\partial^2 = 0$, and its homology is called the singular homology of $\mathcal{A}$, $H^{sin}_*(\mathcal{A})$. For completeness, let us recall the notation:
\[ \Delta^l = \Big\{ \sum_{j=0}^{l} t_j P_j \ \Big|\ \sum_{j=0}^{l} t_j = 1 \Big\}, \]
where $P_0$ is the origin of $\mathbb{R}^\infty$ and $P_i$ is the $i$th standard basis vector in $\mathbb{R}^\infty$. To define $\partial$, it is sufficient to know its effect on a singular $l$-simplex $\sigma : \Delta^l \to \mathcal{A}$. We have:
\[ \partial\sigma = \sum_{i=0}^{l} (-1)^i\, \sigma \circ \partial_l^i, \]
where $\partial_l^i : \Delta^{l-1} \to \Delta^l$ is the $i$th face map given by
\[ \partial_l^i \Big( \sum_{j=0}^{l-1} t_j P_j \Big) = \sum_{j=0}^{i-1} t_j P_j + \sum_{j=i+1}^{l} t_{j-1} P_j. \]
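As a concrete check of the face-map convention, the case $l = 2$ can be written out in full. This is only an illustrative sketch; the bracket notation $[Q_0, \dots, Q_m]$ for the restriction of $\sigma$ to the face with those vertices is introduced here and does not appear in the text.

```latex
% Worked instance of the face maps for l = 2 (vertices P_0, P_1, P_2).
\begin{align*}
\partial_2^0(t_0P_0 + t_1P_1) &= t_0P_1 + t_1P_2, \\
\partial_2^1(t_0P_0 + t_1P_1) &= t_0P_0 + t_1P_2, \\
\partial_2^2(t_0P_0 + t_1P_1) &= t_0P_0 + t_1P_1 .
\end{align*}
% With [Q_0,\dots,Q_m] denoting \sigma restricted to the face spanned by
% Q_0,\dots,Q_m (illustrative notation):
\begin{align*}
\partial\sigma   &= [P_1,P_2] - [P_0,P_2] + [P_0,P_1], \\
\partial^2\sigma &= \big([P_2]-[P_1]\big) - \big([P_2]-[P_0]\big)
                    + \big([P_1]-[P_0]\big) = 0 .
\end{align*}
```

Each vertex appears twice with opposite signs, which is the general mechanism behind $\partial^2 = 0$.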
Now we will imitate the previous constructions to obtain characteristic forms for any singular chain in $\mathcal{A}$, $\Omega_l^{[*]} : S_l(\mathcal{A}) \to \Omega^*(X)_{dist}$. We no longer need to assume that $X$ is even dimensional. It is enough for us to construct it for a single singular $l$-simplex $\sigma : \Delta^l \to \mathcal{A}$ and extend it by linearity. For any such $\sigma$, we get a connection $A_\sigma$ on $\pi_1^* E \to X \times \Delta^l$ by pulling back the universal connection $\mathbb{A}$ over $\mathbb{E} \to X \times \mathcal{A}$. We denote its curvature by $F_\sigma$.
Definition 5. We define $\Omega_l^{[*]} : S_l(\mathcal{A}) \to \Omega^*(X)_{dist}$ by
\[ \Omega_l^{[*]}(\sigma)(\gamma) = \int_{X \times \Delta^l} \mathrm{Tr}\, e^{\frac{i}{2\pi}F_\sigma}\, \gamma. \]

Lemma 3. $\Omega_*^{[*]}$ commutes with the (co-)boundary operators of both complexes. That is,
\[ d \circ \Omega_l^{[*]} = \Omega_{l-1}^{[*]} \circ \partial. \]

This lemma follows from Stokes' theorem. Next, we shall use $\Omega_l^{[*]}$ to define secondary characteristic forms (Chern–Simons forms) and their higher level generalizations. We first define a map $L$ which is similar to the homotopy operator $K : S_*(\mathcal{A}) \to S_{*+1}(\mathcal{A})$ defined by the cone construction. (Since $[K, \partial] = 1$ or $-1$, this was used to prove that the singular homology of an affine space is trivial.) Now,
\[ L_l : \prod^{l+1} \mathcal{A} \to S_l(\mathcal{A}). \]
Roughly speaking, $L_l$ is given by the convex polyhedron spanned by $l + 1$ points in $\mathcal{A}$. Notice that both $K$ and $L$ use strongly the affine structure on $\mathcal{A}$. Let us write $\sigma = L_l(D_{A_0}, \dots, D_{A_l})$; then $L$ is defined by
\[ \sigma\Big( \sum_{j=0}^{l} t_j P_j \Big) = \sum_{j=0}^{l} t_j D_{A_j}. \]
Here, $\sum_{j=0}^{l} t_j D_{A_j}$ is a well-defined element in $\mathcal{A}$ because $\sum_{j=0}^{l} t_j = 1$. We can extend $L_l$ by linearity to the free Abelian group $\mathcal{A}_l$ generated by elements in $\prod^{l+1} \mathcal{A}$, that is,
\[ L_l : \mathcal{A}_l := \mathbb{Z}\Big[ \prod^{l+1} \mathcal{A} \Big] \to S_l(\mathcal{A}). \]
On $\mathcal{A}_*$, there is also a boundary operator $\partial_{\mathcal{A}} : \mathcal{A}_l \to \mathcal{A}_{l-1}$ defined by
\[ \partial_{\mathcal{A}}(A_0, \dots, A_l) = \sum_{i=0}^{l} (-1)^i (A_0, \dots, \hat{A}_i, \dots, A_l) \]
on any generator $(A_0, \dots, A_l)$ in $\mathcal{A}_l$. The proof of the following lemma is straightforward.

Lemma 4. (1) $\partial_{\mathcal{A}}^2 = 0$. (2) $L \circ \partial_{\mathcal{A}} = \partial \circ L$.

Now we can define higher Chern–Simons forms as follows:

Definition 6. The higher Chern–Simons form is the linear homomorphism
\[ ch(E) = \Omega_*^{[*]} \circ L_* : \mathcal{A}_* \to \Omega^*(X)_{dist} \]
such that on any generator $(A_0, \dots, A_l)$ in $\mathcal{A}_l$, it is given by
\[ ch(E; A_0, \dots, A_l) = \Omega_l^{[*]} \circ L_l (A_0, \dots, A_l). \]
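For $l = 1$ the definition can be unwound explicitly, which shows how it reproduces the usual transgression formula. The following is a sketch only: it works with the form on $X$ before pairing with $\gamma$, the symbols $a$, $A_t$ and $F_t$ are introduced for illustration, and the overall sign depends on the orientation convention for integration over $\Delta^1$, which the text does not fix.

```latex
% Sketch for l = 1.  \sigma(t_0 P_0 + t_1 P_1) = t_0 A_0 + t_1 A_1.
% Put t = t_1,  a = A_1 - A_0,  A_t = A_0 + t\,a,  F_t = curvature of A_t.
% ASSUMPTION: orientation/sign convention for \int_{\Delta^1} (hence the \pm).
\begin{align*}
F_\sigma &= F_t + dt\wedge a
  \qquad\text{on } X\times\Delta^1, \\
ch(E; A_0, A_1)
  &= \int_{\Delta^1} \mathrm{Tr}\, e^{\frac{i}{2\pi}F_\sigma}
   = \pm\,\frac{i}{2\pi}\int_0^1
       \mathrm{Tr}\big(a\, e^{\frac{i}{2\pi}F_t}\big)\,dt, \\
\frac{d}{dt}\,\mathrm{Tr}\, e^{\frac{i}{2\pi}F_t}
  &= \frac{i}{2\pi}\,\mathrm{Tr}\big((D_{A_t}a)\, e^{\frac{i}{2\pi}F_t}\big)
   = d\Big(\frac{i}{2\pi}\,
       \mathrm{Tr}\big(a\, e^{\frac{i}{2\pi}F_t}\big)\Big).
\end{align*}
% The last equality uses the Bianchi identity D_{A_t} F_t = 0.
% Integrating over t in [0,1] gives, for the + sign convention,
%   ch(E; A_1) - ch(E; A_0) = d\, ch(E; A_0, A_1).
```

This is the standard Chern–Simons transgression along the straight line from $A_0$ to $A_1$, which is exactly the 1-simplex $L_1(D_{A_0}, D_{A_1})$.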
In fact, $ch(E; A_0, \dots, A_l)$ is always a smooth differential form on $X$. Notice that $ch(E; A_0)$ is just the ordinary Chern character form of $A_0$ and $ch(E; A_0, A_1)$ is the ordinary Chern–Simons form of the pair $A_0$ and $A_1$.

Proposition 3. $ch(E) \circ \partial_{\mathcal{A}} = d \circ ch(E)$.

This proposition follows from the previous two lemmas. When $l = 0$, this proposition implies that the ordinary Chern character form is always a closed differential form. When $l = 1$, the proposition says that
\[ ch(E; A_1) - ch(E; A_0) = d\, ch(E; A_0, A_1), \]
which is the most important property of Chern–Simons forms: the difference of two Chern character forms is the image under $d$ of a canonical differential form. By the proposition for $l = 2$:
\[ ch(E; A_1, A_2) - ch(E; A_0, A_2) + ch(E; A_0, A_1) = d\, ch(E; A_0, A_1, A_2). \]
Using Stokes' theorem we have
\[ \int_X ch(E; A_1, A_2) - \int_X ch(E; A_0, A_2) + \int_X ch(E; A_0, A_1) = 0, \]
which is useful when we use $\int_X ch(E; A_1, A_2)$ as an action functional. For instance, the Chern–Simons theory over a three manifold uses such a functional and has had an important impact on knot theory and three dimensional topology in recent years.

These $ch(E)$'s have another symmetry property. The symmetric group $S_{l+1}$ on $l + 1$ elements acts on $\mathcal{A}_l$ and we have the following proposition.

Proposition 4. For any $s \in S_{l+1}$ and $\alpha \in \mathcal{A}_l$, we have
\[ ch(E; s \cdot \alpha) = (-1)^{|s|}\, ch(E; \alpha). \]

The proof of this proposition is left to the reader. When we look at the case $l = 1$, we get $ch(E; A_0, A_1) = -ch(E; A_1, A_0)$, which is a well-known property of Chern–Simons forms. Hence, we have finished constructing higher Chern–Simons forms of $E$ using the $\Omega^{[*]}$'s.

6. Equivariant Extensions of $\Omega^{[*]}$

In the previous sections, we analyzed $\Omega^{[2]}$ ($= \Omega$) and its $\mathcal{G}$-moment map $\Phi$. We showed how they are related to the Hermitian Einstein metric and stability when $X$ is a symplectic manifold. In this section, we first generalize the moment map construction to all $\Omega^{[*]}$'s and study their equivariant extensions $\Omega^{[*,*]}$.

Let us first review some basic material about equivariant cohomology. (See, for example, [A+B], [M+Q] for details.) We shall describe the Cartan model here. When $M$ admits an action of a compact connected Lie group $G$, the equivariant cohomology of $M$ is defined as
\[ H_G^*(M) = H^*(EG \times_G M), \]
where $EG$ is the universal space for $G$. It can be computed as the cohomology of the following differential complex on $(\Omega^*(M) \otimes \mathrm{Symm}^*(\mathrm{Lie}\, G)^*)^G$. Choose any base $\phi_i$'s
of $\mathrm{Lie}\, G$ and let $\phi^i$'s be its dual base. For any $\alpha \otimes \psi \in \Omega^*(M) \otimes \mathrm{Symm}^*(\mathrm{Lie}\, G)^*$, we define the differential $D$ to be
\[ D(\alpha \otimes \psi) = d\alpha \otimes \psi - \sum_i \iota_{\phi_i}\alpha \otimes (\phi^i \psi). \]
It is not difficult to check that $D^2$ vanishes on $G$ invariant elements in $\Omega^*(M) \otimes \mathrm{Symm}^*(\mathrm{Lie}\, G)^*$. Therefore, it defines a differential complex on $(\Omega^*(M) \otimes \mathrm{Symm}^*(\mathrm{Lie}\, G)^*)^G$ whose cohomology can be proved to be $H_G^*(M)$. Although, in our situation, the group $\mathcal{G}$ is not a compact finite dimensional Lie group, we shall still consider the same complex and call it the equivariant complex. Elements that are $D$-closed (-exact) are called equivariant closed (exact) forms.
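The vanishing of $D^2$ on invariant elements can be sketched in a few lines. The computation below uses only the definition of $D$ given above; the Lie derivative notation $L_{\phi_i} = d\,\iota_{\phi_i} + \iota_{\phi_i}\, d$ (Cartan's formula) is introduced here for convenience.

```latex
% Sketch: D^2 on \Omega^*(M) \otimes Symm^*(Lie G)^*.
% \phi_i: basis of Lie G,  \phi^i: dual basis,
% L_{\phi_i} = d\,\iota_{\phi_i} + \iota_{\phi_i}\,d  (Cartan's formula).
\begin{align*}
D^2(\alpha\otimes\psi)
 &= -\sum_i \big(d\,\iota_{\phi_i} + \iota_{\phi_i}\,d\big)\alpha
      \otimes\phi^i\psi
    + \sum_{i,j}\iota_{\phi_j}\iota_{\phi_i}\alpha
      \otimes\phi^j\phi^i\psi \\
 &= -\sum_i L_{\phi_i}\alpha\otimes\phi^i\psi .
\end{align*}
% The double sum vanishes: \iota_{\phi_j}\iota_{\phi_i} is antisymmetric
% in (i,j), while the symmetric product \phi^j\phi^i is symmetric.
```

If an element of the complex is viewed as a polynomial map $\eta$ from $\mathrm{Lie}\, G$ to $\Omega^*(M)$, the remaining term evaluated at $Y \in \mathrm{Lie}\, G$ is $-L_Y(\eta(Y))$. For a $G$ invariant element this vanishes, because differentiating the invariance condition in the direction $Y$ at the point $Y$ only involves $[Y, Y] = 0$.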
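Equivariant extensions of $\Omega^{[*]}$. Now, we want to extend each $\Omega^{[2k]}$ as a $\mathcal{G}$ equivariant closed form on $\mathcal{A}$ for any $k$. That is, we should have $\sum_{i+j=k} \Omega^{[2i,2j]}$, where $\Omega^{[2i,2j]} \in \Omega^{2i}(\mathcal{A}, \mathrm{Symm}^j(\mathrm{Lie}\,\mathcal{G})^*)^{\mathcal{G}}$, such that $\Omega^{[2k,0]} = \Omega^{[2k]}$ and for any $\phi \in \mathrm{Lie}\,\mathcal{G} = \Omega^0(X, \mathrm{End}(E))$, we have $\iota_\phi \Omega^{[2i,2j]} = d(\Omega^{[2i-2,2j+2]}(\phi))$.

Theorem 1.
\[ \Omega^{[2i,2j]}(D_A)(B_1, \dots, B_{2i})(\phi_1, \dots, \phi_j) = \int_X \mathrm{Tr}\, \big[ e^{\frac{i}{2\pi}F_A} B_1 \cdots B_{2i}\, \phi_1 \cdots \phi_j \big]_{sym}\, \hat{A}(X), \]
where $D_A \in \mathcal{A}$, $B \in T_A\mathcal{A} = \Omega^1(X, \mathrm{End}(E))$, $\phi \in \mathrm{Lie}\,\mathcal{G} = \Omega^0(X, \mathrm{End}(E))$.

Proof of Theorem. Before proving the theorem, let us first explain some notation. $\Omega^{[2i,2j]}$ is a $2i$-form on $\mathcal{A}$ valued in the $j$th symmetric power of the dual of the Lie algebra of $\mathcal{G}$, and it is also $\mathcal{G}$-invariant. At a point $D_A \in \mathcal{A}$ and $2i$ tangent vectors $B$ at $D_A$, $\Omega^{[2i,2j]}$ gives an element in $\mathrm{Symm}^j(\mathrm{Lie}\,\mathcal{G})^*$. Evaluating it on $j$ elements of $\mathrm{Lie}\,\mathcal{G}$ then gives a number, which we claim is given by the above formula.

First, $\mathcal{G}$ invariance of $\Omega^{[2i,2j]}$ is clear from the formula. $\Omega^{[2k,0]} = \Omega^{[2k]}$ can also be seen directly. It remains to verify that
\[ \sum_{l=0}^{j} \iota_{\phi_l} \Omega^{[2i,2j]}(\phi_0, \dots, \hat{\phi}_l, \dots, \phi_j) = d(\Omega^{[2i-2,2j+2]}(\phi_0, \dots, \phi_j)), \]
where each $\phi_l \in \mathrm{Lie}\,\mathcal{G} = \Omega^0(X, \mathrm{End}(E))$. Now, we take any $B_1, \dots, B_{2i-1} \in T_A\mathcal{A} = \Omega^1(X, \mathrm{End}(E))$ and extend them to vector fields over all of $\mathcal{A}$ by parallel transport. Then $[B_i, B_j]$ is always the zero vector field on $\mathcal{A}$. (Here, $[\ ,\ ]$ means the bracket of two vector fields on the space $\mathcal{A}$.) Therefore,
\begin{align*}
d(\Omega^{[2i-2,2j+2]}(\phi_0, \dots, \phi_j))(B_1, \dots, B_{2i-1})
&= \sum_{l=1}^{2i-1} (-1)^{l+1} B_l\big(\Omega^{[2i-2,2j+2]}(\phi_0, \dots, \phi_j)(B_1, \dots, \hat{B}_l, \dots, B_{2i-1})\big) \\
&= \sum_{l=1}^{2i-1} (-1)^{l+1} \int_X \mathrm{Tr}\, \big[ e^{\frac{i}{2\pi}F_A} (D_A B_l) B_1 \cdots \hat{B}_l \cdots B_{2i-1}\, \phi_0 \cdots \phi_j \big]_{sym}\, \hat{A}(X).
\end{align*}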
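On the other hand,
\[ \iota_{\phi_l} \Omega^{[2i,2j]}(\phi_0, \dots, \hat{\phi}_l, \dots, \phi_j)(B_1, \dots, B_{2i-1}) = \int_X \mathrm{Tr}\, \big[ e^{\frac{i}{2\pi}F_A} (D_A \phi_l) B_1 \cdots B_{2i-1}\, \phi_0 \cdots \hat{\phi}_l \cdots \phi_j \big]_{sym}\, \hat{A}(X). \]
Hence,
\begin{align*}
&\Big( \sum_{l=0}^{j} \iota_{\phi_l} \Omega^{[2i,2j]}(\phi_0, \dots, \hat{\phi}_l, \dots, \phi_j) - d(\Omega^{[2i-2,2j+2]}(\phi_0, \dots, \phi_j)) \Big)(B_1, \dots, B_{2i-1}) \\
&\qquad = \int_X \mathrm{Tr}\, \big[ e^{\frac{i}{2\pi}F_A} D_A(B_1 \cdots B_{2i-1}\, \phi_0 \cdots \phi_j) \big]_{sym}\, \hat{A}(X) \\
&\qquad = \int_X d\, \mathrm{Tr}\, \big[ e^{\frac{i}{2\pi}F_A} B_1 \cdots B_{2i-1}\, \phi_0 \cdots \phi_j \big]_{sym}\, \hat{A}(X) \\
&\qquad = 0
\end{align*}
by Stokes' theorem. So we have proved the theorem.

In particular, when we look at $\Omega^{[2]} = \Omega_0$, we recover the moment map $\Phi_0$ as $\Omega^{[0,2]}$. The extension of the total $\Omega^{[*]}$ as a $\mathcal{G}$ equivariant closed form on $\mathcal{A}$ would be $\sum_{i+j \le n} \Omega^{[2i,2j]}$, since the highest degree form in $\Omega^{[*]}$ is of degree $2n$. We can relax the restriction $i + j \le n$ and obtain an equivariant closed form of arbitrarily large total degree; we shall denote it by $\Omega^{[*,*]}$.

$\Omega^{[*,*]}$ as equivariant pushforwards. From previous sections, we know that $\Omega^{[*]}$ occurs as the Chern character of the virtual bundle by the local family index theorem. Recall that
\[ \Omega^{[*]} = \int_X \mathrm{Tr}\, e^{\frac{i}{2\pi}\mathbb{F}}\, \hat{A}(X), \]
and we extended $\Omega^{[*]}$ as a $\mathcal{G}$ equivariant closed form on $\mathcal{A}$. Now, we are going to extend the results in section two. So we want to extend $\mathrm{Tr}\, e^{\frac{i}{2\pi}\mathbb{F}}\, \hat{A}(X)$ to a $\mathcal{G}$ equivariant closed form on $X \times \mathcal{A}$ such that we get $\Omega^{[*,*]}$ by integrating it over $X$.

Let us first consider an example where we want to find an element in $\Omega^{2n}(X \times \mathcal{A}, (\mathrm{Lie}\,\mathcal{G})^*)^{\mathcal{G}}$ such that it gives us $\Phi$ by integrating it over $X$. It is clear that this is given by the $2n$-form $\mathrm{Tr}\, [e^{\frac{i}{2\pi}F_A} \phi]^{(2n)}(x)$ at a point $(x, D_A) \in X \times \mathcal{A}$ when evaluating it on $\phi \in \mathrm{Lie}\,\mathcal{G} = \Omega^0(X, \mathrm{End}(E))$. To generalize it to other $\Omega^{[*,*]}$'s, we introduce $\Delta$ such that $\mathrm{Tr}\, e^{\frac{i}{2\pi}\mathbb{F} + \Delta}\, \hat{A}(X)$ would be our $\mathcal{G}$ equivariant form on $X \times \mathcal{A}$.

Definition 7. Let $\Delta \in \Omega^0(X \times \mathcal{A}, \mathrm{End}(\mathbb{E})) \otimes (\mathrm{Lie}\,\mathcal{G})^*$ be given by $\Delta(x, D_A)(\phi) = \phi_x$, where $(x, D_A) \in X \times \mathcal{A}$ and $\phi \in \mathrm{Lie}\,\mathcal{G} = \Omega^0(X, \mathrm{End}(E))$.

By evaluating $\Delta$ on $\phi$ at $(x, D_A)$, we should arrive at an element in $(\mathrm{End}\, \mathbb{E})_{(x,A)} = \mathrm{End}\, E_x$, and $\Delta$ is defined such that this element is just given by $\phi$ at $x$. Therefore, $\Delta$ is a tautological element and is independent of the variable $D_A \in \mathcal{A}$. Put another way, $\Delta$ is the pullback of the identity element in $\Omega^0(X, \mathrm{End}(E)) \otimes (\mathrm{Lie}\,\mathcal{G})^* \equiv \mathrm{Map}(\Omega^0(X, \mathrm{End}(E)), \Omega^0(X, \mathrm{End}(E)))$.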
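Now we consider $\mathrm{Tr}_E\, e^{\frac{i}{2\pi}\mathbb{F} + \Delta}\, \hat{A}(X)$. In the exponential sums, we multiply forms on $X \times \mathcal{A}$ by exterior product and elements in $(\mathrm{Lie}\,\mathcal{G})^*$ by symmetric product. Therefore, we have
\[ \mathrm{Tr}_E\, e^{\frac{i}{2\pi}\mathbb{F} + \Delta}\, \hat{A}(X) \in \Omega^*(X \times \mathcal{A}, \mathrm{End}(\mathbb{E})) \otimes \mathrm{Symm}^*(\mathrm{Lie}\,\mathcal{G})^*. \]

Theorem 2. $\mathrm{Tr}_E\, e^{\frac{i}{2\pi}\mathbb{F} + \Delta}\, \hat{A}(X)$ is a $\mathcal{G}$ equivariant closed form on $X \times \mathcal{A}$.

Proof of Theorem. To prove the theorem, we can forget the $\hat{A}(X)$ term. It is rather straightforward to check that $\mathrm{Tr}_E\, e^{\frac{i}{2\pi}\mathbb{F} + \Delta}$ is $\mathcal{G}$ equivariant. To prove that it is equivariantly closed, we choose any $\phi \in \mathrm{Lie}\,\mathcal{G} = \Omega^0(X, \mathrm{End}(E))$. Then, by the Bianchi identity $\mathbb{D}\mathbb{F} = 0$, we have
\[ d\big( \mathrm{Tr}\, e^{\frac{i}{2\pi}\mathbb{F} + \Delta}(\phi) \big) = \mathrm{Tr}\, \big[ e^{\frac{i}{2\pi}\mathbb{F} + \Delta}\, \mathbb{D}(\Delta\phi) \big]_{sym}, \]
where $\mathbb{D}$ is the universal connection on $\mathbb{E}$. On the other hand, we have
\[ \iota_\phi\, \mathrm{Tr}\, e^{\frac{i}{2\pi}\mathbb{F} + \Delta} = \mathrm{Tr}\, \big[ e^{\frac{i}{2\pi}\mathbb{F} + \Delta}\, \iota_\phi \mathbb{F} \big]_{sym}. \]
Therefore, in order to prove that $\mathrm{Tr}\, e^{\frac{i}{2\pi}\mathbb{F} + \Delta}$ is equivariantly closed, it is enough to check that $\iota_\phi \mathbb{F} = \mathbb{D}(\Delta\phi)$. At a point $(x, D_A) \in X \times \mathcal{A}$, $\phi$ induces the vector field $(0, D_A\phi) \in T_{(x,D_A)}(X \times \mathcal{A}) = T_xX \times T_{D_A}\mathcal{A} = T_xX \times \Omega^1(X, \mathrm{End}(E))$. If we choose any other vector $(v, B) \in T_{(x,D_A)}(X \times \mathcal{A})$, then
\begin{align*}
\iota_\phi \mathbb{F}(x, D_A)(v, B) &= \mathbb{F}(x, D_A)((v, B), (0, D_A\phi)) \\
&= (\mathbb{F}^{2,0} + \mathbb{F}^{1,1})(x, D_A)((v, B), (0, D_A\phi)) \\
&= F_A(x)(v, 0) + (D_A\phi)(x, v) \\
&= (D_A\phi)(x, v).
\end{align*}
For $\mathbb{D}(\Delta\phi)$, we have
\[ \mathbb{D}(\Delta\phi)(x, A)(v, B) = (D_{A,x}\phi)(x) = (D_A\phi)(x, v). \]
Therefore, we have $\iota_\phi \mathbb{F} = \mathbb{D}(\Delta\phi)$, and $\mathrm{Tr}_E\, e^{\frac{i}{2\pi}\mathbb{F} + \Delta}$ is a $\mathcal{G}$ equivariantly closed form on $X \times \mathcal{A}$.

Next, we shall show that the pushforward of this form gives $\Omega^{[*,*]}$ on $\mathcal{A}$, which generalizes $\int_X \mathrm{Tr}_E\, e^{\frac{i}{2\pi}\mathbb{F}}\, \hat{A}(X) = \Omega^{[*]}$.

Theorem 3.
\[ \int_X \mathrm{Tr}_E\, e^{\frac{i}{2\pi}\mathbb{F} + \Delta}\, \hat{A}(X) = \Omega^{[*,*]}. \]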
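Proof of Theorem. For simplicity, we shall omit some factors of powers of $\frac{i}{2\pi}$ in the calculation below. We consider the component of $\int_X \mathrm{Tr}_E\, e^{\frac{i}{2\pi}\mathbb{F} + \Delta}$ in $\Omega^{2i}(\mathcal{A}, \mathrm{Symm}^j(\mathrm{Lie}\,\mathcal{G})^*)$. We evaluate it on $2i$ tangent vectors $B$ on $\mathcal{A}$ and $j$ elements $\phi$ in $\mathrm{Lie}\,\mathcal{G}$ at a point $A \in \mathcal{A}$:
\begin{align*}
&\Big( \int_X \mathrm{Tr}_E\, e^{\frac{i}{2\pi}\mathbb{F} + \Delta}\, \hat{A}(X) \Big)(D_A)(B_1, \dots, B_{2i})(\phi_1, \dots, \phi_j) \\
&\qquad = \int_X \mathrm{Tr}\, e^{(\frac{i}{2\pi}(\mathbb{F}^{2,0} + \mathbb{F}^{1,1}) + \Delta)(x, D_A)}(B_1, \dots, B_{2i})(\phi_1, \dots, \phi_j)\, \hat{A}(X) \\
&\qquad = \int_X \mathrm{Tr}\, \Big[ e^{\frac{i}{2\pi}F_A}\, \frac{(\mathbb{F}^{1,1})^{2k}}{(2k)!}(B_1, \dots, B_{2i})\, \frac{\Delta^{2j}(x, D_A)}{(2j)!}(\phi_1, \dots, \phi_j) \Big]_{sym}\, \hat{A}(X) \\
&\qquad = \int_X \mathrm{Tr}\, \big[ e^{\frac{i}{2\pi}F_A} B_1 \cdots B_{2i}\, \phi_1 \cdots \phi_j \big]_{sym}\, \hat{A}(X) \\
&\qquad = \Omega^{[2i,2j]}(D_A)(B_1, \dots, B_{2i})(\phi_1, \dots, \phi_j).
\end{align*}
Hence the theorem.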
Acknowledgement. This work was done during my visit to the Courant Institute of Mathematical Sciences. Courant created an excellent research environment for me and my work. I want to thank Professors R. Bott, I. M. Singer and S. T. Yau, who taught me topology, gauge theory, geometry and many beautiful interactions among them.
References

[A+B] Atiyah, M. and Bott, R.: The moment map and equivariant cohomology. Topology 23, 1 (1984)
[B+F] Bismut, J.M. and Freed, D.S.: The analysis of elliptic families I, metrics and connections on determinant line bundles. Commun. Math. Phys. 106, 159–176 (1986)
[Do] Donaldson, S.K.: Anti self-dual Yang-Mills connections over complex algebraic surfaces and stable vector bundles. Proc. Lond. Math. Soc. 50, 1–26 (1985)
[Gi] Gieseker, D.: On moduli of vector bundles on an algebraic surface. Ann. of Math. 106, 45–60 (1977)
[Ko] Kobayashi, S.: Differential geometry of complex vector bundles. Princeton, NJ: Princeton University Press
[Le1] Leung, N.C.: Differential Geometric and Symplectic Interpretations of Stability in the sense of Gieseker. MIT Thesis, 1993
[Le2] Leung, N.C.: Einstein type metric. JDG 1997
[M+F] Mumford, D. and Fogarty, J.: Geometric invariant theory. Berlin–Heidelberg–New York: Springer-Verlag
[M+Q] Mathai, V. and Quillen, D.: Thom classes, superconnections and equivariant differential forms. Topology 25, 85 (1986)
[N+S] Narasimhan, M.S. and Seshadri, C.S.: Stable and unitary vector bundles on a compact Riemann surface. Ann. of Math. 82, 540–567 (1965)
[U+Y] Uhlenbeck, K. and Yau, S.T.: On the existence of Hermitian-Yang-Mills connections in stable vector bundles. Comm. Pure Appl. Math. 39, 257–293 (1986)

Communicated by S.-T. Yau
Commun. Math. Phys. 193, 69 – 104 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
The Scaling Limit of Lattice Trees in High Dimensions Eric Derbez? , Gordon Slade Department of Mathematics and Statistics, McMaster University, Hamilton, ON, Canada L8S 4K1. E-mail:
[email protected] Received: 27 September 1996 / Accepted: 28 July 1997
Abstract: We prove that above eight dimensions the scaling limit of sufficiently spread-out lattice trees is the variant of super-Brownian motion known as integrated super-Brownian excursion (ISE), as conjectured by Aldous. The same is true for nearest-neighbour lattice trees in sufficiently high dimensions. The proof uses the lace expansion.

1. Introduction

1.1. The scaling limit. Lattice trees arise in polymer physics as a model of branched polymers and in statistical mechanics as an example exhibiting the general features of critical phenomena. A lattice tree in the $d$-dimensional integer lattice $\mathbb{Z}^d$ is a finite connected set of lattice bonds containing no cycles. Thus any two sites in a lattice tree are connected by a unique path in the tree. For the nearest-neighbour model, the bonds are nearest-neighbour bonds $\{x, y\}$, $x, y \in \mathbb{Z}^d$, $|x - y| = 1$ (Euclidean distance), but we will also consider "spread-out" lattice trees constructed from bonds $\{x, y\}$ with $x \neq y$ and $0 \le |x^{(j)} - y^{(j)}| \le L$ for $j = 1, \dots, d$. Here $x^{(j)}$ denotes the $j$th component of $x \in \mathbb{Z}^d$, and $L$ is a parameter which will later be taken large. We associate the uniform probability measure to the set of all $n$-bond lattice trees which contain the origin.

We are interested in the existence of a scaling limit for lattice trees. This involves taking a continuum limit of lattice trees, in which the size of the trees increases simultaneously with a shrinking of the lattice spacing, in such a way as to produce a random fractal. The nature of the scaling limit is believed to depend in an essential way on the spatial dimension, but nothing close to existence of the scaling limit has been proven in low spatial dimensions.

The corresponding problem for simple random walk has the well-known solution that when space is scaled down by a factor $n^{1/2}$, as the length $n$ of the walk goes to infinity, there is convergence to Brownian motion in any dimension. For self-avoiding walks, it has been shown using the lace expansion that the scaling limit is also Brownian motion in dimensions $d \ge 5$ [8, 20, 18]. The same is believed to be true for $d = 4$ with a logarithmic adjustment to the spatial scaling, but in dimensions 2 and 3 a different limit, currently not understood, is expected.

Present address of E. Derbez: Department of Mathematics, University of British Columbia, Vancouver, BC, Canada V6T 1Z2. E-mail: [email protected].
... .. .. . . .. . ... . ... .......... ... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. .. .. ........... .... .. . . . ... ... .. .. . .. . ... .. . .. . . . . .. .. .. .... .. .. .... . . .. .. .. . . ... ... . .. .. .. ..... .. .. . . ... .. .. .. .. . ..... .. . . . ... ... . . . . . . . . . . . . . . . . . . . . . . . . . . .. . .. . . ... . . . ...... . ... . .... ........ ........ .... .. ..... .. .. .. .. . . ... . ... ... ... . .. ... ... .. ............. . ... . . ... . . .. .. . ... .. .. . .. .. .. . ... ... . . . .. . . ... . ... .. . . . ..... . . .. ... ..... . .. . .... . . . . .. . .... .. ... . ... . . . .. ... .. .. . . .. ..... .. .... .. .. .. ... ... .. .. .. ........... ..... ... ... .. . ..... ...... ..... ... . . . . . . .. .. ..... .. . .. ... .. .. ... ... ... ... . . ... ..... . .. . . . . . ... . .. . . ....... ... ... ... ... .. ...... .. ... ... .. .. ..... .. .. .... . . . .. .. .... . .. .. ... ..... .. .. . .... ... . . .. . . .. .. . ... . . . .. .. ... . .. . ... .... ... . .. . .. ... .. . ... ... ... . . . . .. .. .. .... ... ... .... .. .. ... .. .. .. .. ..... ..... .. . ... .. .. ... .. .. . . ... ... .. .. .. . . . . ... .. ... . ... .. .. .. ... ... . .. ... .... . . . .... .. . . . . . . . .... ... . .. . . .. .. .. .. .. .. .. .. .. .. .. ... ... .. .. ... . .. . .. . ... ... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ..... ... ... ..... ... . .. .. . ... ..... ..... .... ... .. . ... ... ... .. ....... ... ..... ..... ... .. .. .. .. .. .. ... .. ..... .... .. .. ... . ... .. ... ... .. . .. .. . . .. .. . . .. . .. .. ... . . ... . .. .. .. . ... .. ... . . . ... .. .. . .. . ... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . . . . . . . . . . . . ... ... .. .. ..... .. ......... . .. . .. .. . ... .. .. ... ... ... . ... .. .. ... ... .. .. ... .. . .. ... .. . ........... ....... .. .. .. .. .. .. ... ... ... . . . . . . . ... . . . .. .. .. .... . . . .. .... . ... ... ... ... . .. .. .. . .. .. . .. .. . ... . . . . . ..... . .. ........ ...... .. . . .. . . ..... . . ... ... . . . . ..... . .... . .. . . .. .. ..... . .. .. ..... ... .. ..... . . . ... ... ... . . . .. .. ... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ... .. .. ... ... . . ............ . . .. .. .. . .. .. .. .. .. .. .. .. . . . ... ....... . . . .. ... ..... . . . . . . . ... .. ... ..... . . .. ..... . . .. . .. ....... ... ..... . .. .. .. .. . ... ... . .. . .. .... . . . . .. ... ... ..... ... .. .. ..... . . . .. .. .. . . .... .. .... . . ......... .. . .. ... .. .. ... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . .. . ..... ........ . .. . ... . ..... .... . ... ... ... .. .. ..... .... .. .... ...... ....... ... ... .. .. . .. .... ....... ... .. .. .. ... .. . ... .. . ... .. ... .. . .. ... ... .. . .. .. . .. .. .. .. . .. .. ... .. .. ... .. .. .. ... .. .... . .. .. .. ... .. ... ... ... ..... . .......... ........ ... ...... .. .. .. .. . .. . .. . . . . ... .. .. . . . .. .. . ......... . .. .. . .. . ....... ... ... . . ... ... . ... . . . . .. . ... . . . . . . .. ... . . . .. .. .. .. . .. .. .. ... .. .. . . . ... . .. .. .. ... ... ... .. . .. .. ... ... .. .. . .. . .. .. ..... . .. .. .. ... . ... .. .. .. . .. .. ..... ... .. .. .. .. .. .. ... .. .. ... . .. . .. . ..... . .. .. ... . . . .. ... . .. . . . . . .. . . .. . ..... ..... ... .. .. .. . ... ..... .... .... ...... .. .. ... ... ... .. .. . . . .. .. .. ... .. . ... . ... .. .. . . .. . . . . . . . ... . .. ... . . . . .. .. 
... .. . ... . ... .. .. . . ... . . . .. .. .. ... . . . .. . .. . . . . . ..... . . . . . . . . . . .. . .. . . .... .. .. .. . . .. .. .. .. ..... .. .. . .. .. . .. . ..... ... . . . .. .. . . . .. .. . . ... ..... . .. .. ... .. .. ... . . . ... ... ... ... .. .. . ... ... ... .. . ....... ... . .. .. . . . .. . ... . . . ... .... ... .. . .. .. ... .. .. .... ... . . ... .. .. ... ... ... ... ..... ... ..... ... . . .. .... ... .. .. .. ... ..... ... ... ....... ... . ... ... .. .. ... ... ... . . . . ... ... . ..... ..... .... ...... .. ..... . .. .. ..... ..... ....... ... .. ... .... .... . ... . . . ... . .. .. ... ..... ... . . ........ .. .. .. ... . ... .. .. .. ... . . . ... .. . . .. . . .... .. .. ... ..... ........ ... . .. .... . . .. .. .. . .. .. . .... . .. .. .. .... ... . . . .. .. .... .. ... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ........... ... ... . . ....... ..... ... ... . ..... . ... . . .. ... .. . . . .. . ..... ..... . . ... . ... . .. .. . .. ... ... .. . .... . . . .. . .. .. ....... ... . .. . . ........ . ... .. .. .. .. . ... . . .. ... .. ... . ... .. .. . . .. .. . ........ . ... . .. . . .. .. ..... . .. .. ... . ...... .. .... ......... ..... . .. .. ....... ...... .. ... .. . .. ... .. .. .. .. ....... . .. .. .... ... .. .. ........ .. .. .. .. . . ... . . . . . .. .. .. .. .. .. ... ... .. ...... . ... .. .. . . . ... . ..... .. ....... ...... .... ... .. .. .. .. ... .. . ... . .. .. ... ... . .. . ..... ......... .. . ... ... .. .. .. .. ... . . . . .. .. .. .. .. . ... .. . . .. .. .. .. .. .. .. . . . . ... ......... ... . .. . .. . ....... .. .. . . . .... . . . . . .. . . . .. . . . . . . ....... . . . .. . . . . . . . ... . 
. .. .. ... . ... .. .. .. .. . ........ .. .. .. .. . . .. ... ... .. ... ... ... .. . .. .. .. . .. .. .. .. . ....... .. ... .. .. .. .. .. .. .. .. ....... . . . ..... . ... . . ... .. . . .. . ... . .. .. . ..... . . ... ... . . . . . . ... .. .. ..... . . . ... . . .. ... .. .. ... . . ... . .... . . . .. ... . .. . . .. .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. .. .. .. .. .. .. ... .......... .. . .... .. .. . . ..... .. ... .. .. .. .. ........... .. .... ..... ... ...... .. ... .. .. .. ... . .... ..... ... ..... ..... . . ..... . . ... ..... . ......... . . .. ... ...... . . .. .. ... .. ... .. .. ... .. .. ... . .. .. .... . . .. . . .. ... . .. .. ..... .. .. . ... .. .. .. .. .. ... ... . . .. . . ..... . . ..... . ... . ... ... . .. ... . . . . . . . .. .. .. ... .. ... ..... . . . . . . . . . . . . . . . .. .. ..... ... ... ... . ... ..... . ... . .... . . ... . .. . . . . . ... . . . . . .. .. .. .. ... . ... ... ... ......... ...... ... ... .. .. .. .. .. .. ..... ... .. . .. . .. ..... ... ..... . . ..... . . . . ... . . . . ... ... ... . . .. .. .... ..... .. ... ... .. .. .. .. ... .. . . . . . ... . ........... . . . .. . .. . . . .. ... .. .. .. .. ... . .. ... ... . . ... .... ... ... ... .. ... ....... .. .. . . . .. . . . . . . . . . ... . . . . . . . . . . . . . . . ... .. .. .... . .... .. .. .. .. . . . . . . ... ... ..... . .. . . .. .. . . ......... ..... . ... ... . ........ .
Fig. 1. A 2-dimensional lattice tree with 5000 vertices, created with the algorithm of [23]
Here we prove that under certain assumptions the scaling limit of lattice trees is ISE (integrated super-Brownian excursion) for d > 8. To be precise about the assumptions, the scaling limit has been shown to be ISE for the spread-out model if d > 8 and L is sufficiently large, and for the nearest-neighbour model if d is sufficiently large. The hypothesis of universality implies that the scaling limit should be the same for spread-out and nearest-neighbour lattice trees, and assuming this, our results provide evidence that the scaling limit of nearest-neighbour lattice trees is ISE for d > 8. That the scaling limit of lattice trees should be ISE for d > 8 was conjectured by Aldous, who has emphasized the role of ISE as a model for the random distribution of mass [6]. In particular, Aldous has shown that ISE arises in various situations where random trees are randomly embedded into $\mathbb{R}^d$ [3, 4, 5]. ISE is super-Brownian motion conditioned to have total mass 1 and is closely connected to the super-processes intensively studied in the probability literature.

It is typical of statistical mechanical models that there is an upper critical dimension above which a model’s scaling properties cease to depend on the dimension and become identical with those of a simpler so-called mean-field model. For lattice trees, the fact that ISE occurs as the scaling limit for d > 8 adds to the already considerable evidence that the upper critical dimension is 8 [7, 17, 19, 27, 34].

The results of this paper were announced and discussed in [11], and extend the results of [10]. An overview of our method, and a discussion of a relation between critical exponents and the occurrence of ISE as a scaling limit, is given in [11]. We expect that ISE is relevant also for lattice animals and for percolation, above their upper critical dimensions.
Since lattice animals are believed to have the same scaling properties as lattice trees, in all dimensions, it is natural to conjecture that ISE is the scaling limit of lattice animals for d > 8. In addition, as is described in more detail in [11], Hara and Slade have conjectured that ISE arises also as a scaling limit for percolation models
Scaling Limit of Lattice Trees in High Dimensions
at the critical point, for d > 6. More precisely, the conjecture is that for d > 6, at the critical point the connected cluster of the origin, conditioned to contain n sites, yields ISE in the limit n → ∞ when space is scaled down by a factor $n^{1/4}$. Perhaps our method can be combined with the method of [16] to prove the conjecture. The methods of [30, 31] could possibly serve as a starting point to study related questions for oriented percolation. Further discussion of the scaling limit for percolation at the critical point can be found in [1].

This paper is organized as follows. Sections 1.2 and 1.3 provide introductory material concerning ISE and lattice trees. Sect. 1.4 contains precise statements of results showing that the scaling limit of high-dimensional lattice trees is ISE. The proof is reduced to a statement about generating functions in Sect. 2. A method of fractional derivatives is outlined in Sect. 3, which will be used to analyze the generating functions. The relevant fractional derivatives are estimated using the lace expansion, which is reviewed and extended in Sect. 4. We take this opportunity to make minor corrections to [17, 19], in Sect. 4.4. The main results for the two-point function, the three-point function, and higher-point functions are then proved respectively in Sections 5, 6 and 7.

1.2. ISE. ISE can be considered as an abstract continuous random tree embedded in $\mathbb{R}^d$, rooted at the origin and having total mass 1 [6]. It is designed in such a way that if $0, x_1, \ldots, x_{m-1}$ are points in $\mathbb{R}^d$ contained in ISE then there is an underlying tree structure with branch points $b_1, \ldots, b_{m-2} \in \mathbb{R}^d$ and Brownian motion paths connecting the branch points and the points $0, x_1, \ldots, x_{m-1}$ according to an abstract skeleton (minimal spanning subtree); see Fig. 2. There are (2m − 5)!! distinct “shapes” for the skeleton, corresponding to the (2m − 5)!! possible labellings of its m external vertices.
See [15, (5.96)] for a proof of this elementary fact; here $N!!$ is defined recursively for $N = -1, 1, 3, 5, 7, 9, \ldots$ by $(-1)!! = 1$ and $N!! = N(N-2)!!$, $N \ge 1$. The shapes for m = 2, 3, 4 are illustrated in Fig. 3. The joint probability density function for the skeleton shape, the durations $t_1, \ldots, t_{2m-3}$ of each of the Brownian motion paths and the positions of points and branch points is given by the explicit formula
\[
\left( \sum_{i=1}^{2m-3} t_i \right) e^{-\left(\sum_{i=1}^{2m-3} t_i\right)^2/2} \prod_{i=1}^{2m-3} p_{t_i}(y_i), \tag{1.1}
\]
where $y_i \in \mathbb{R}^d$ are the vector displacements along the skeleton paths, and $p_t(y)$ is the Brownian transition function
\[
p_t(y) = \frac{1}{(2\pi t)^{d/2}} e^{-y^2/2t}. \tag{1.2}
\]
In Fig. 2, the vector displacements (in $\mathbb{R}^d$) along the skeleton paths are $y_1 = b_1$, $y_2 = b_2 - b_1$, $y_3 = x_1 - b_2$, $y_4 = x_2 - b_2$, and so on. The ordering of the labelling of the displacements is fixed according to some convention, for each skeleton shape σ. The density (1.1) is discussed in [4, 5, 6]; see also [26]. The joint probability density function for the skeleton shape and the positions of the points and branch points, with the time variables integrated out, is given by
\[
A^{(m)}(\sigma; y_1, \ldots, y_{2m-3}) = \int_0^\infty dt_1 \cdots \int_0^\infty dt_{2m-3} \left( \sum_{i=1}^{2m-3} t_i \right) e^{-\left(\sum_{i=1}^{2m-3} t_i\right)^2/2} \prod_{i=1}^{2m-3} p_{t_i}(y_i). \tag{1.3}
\]
E. Derbez, G. Slade
Fig. 2. The branch points b1 , . . . , b5 and abstract skeleton for a realization of ISE containing the sites 0, x1 , . . . , x6
Fig. 3. The unique shapes for m = 2, 3 and the three shapes for m = 4, joining points 0, x1 , . . . , xm−1
The right side is independent of the shape σ and depends only on the displacements $y_i$. This is indeed a probability measure, because integrating over the $y_i$’s simply removes the product over Brownian transition functions, and the remaining integral over the $t_i$’s equals $1/(2m-5)!!$, the reciprocal of the number of shapes. If we leave the $x_i$’s fixed and integrate out the positions of the branch points and sum over all the (2m − 5)!! possible shapes, the result is a measure $P^{(m)}$ on $\mathbb{R}^{d(m-1)}$. The measures $P^{(m)}$, for m = 2, 3, 4, . . ., represent the joint probability densities for ISE to contain the sites $0, x_1, \ldots, x_{m-1}$. In the simplest case m = 2, $P^{(2)}(x)$ represents the probability density function for a point chosen randomly from the distribution of ISE. Explicitly, for m = 2,
\[
A^{(2)}(x) = P^{(2)}(x) = (2\pi)^{-d/2} \int_0^\infty t^{1-d/2} e^{-t^2/2} e^{-x^2/2t} \, dt \tag{1.4}
\]
and
\[
\hat{A}^{(2)}(k) = \int_0^\infty t e^{-t^2/2} e^{-k^2 t/2} \, dt, \tag{1.5}
\]
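where our convention for the Fourier transform of a function $f : \mathbb{R}^{dn} \to \mathbb{C}$ is
\[
\hat{f}(k_1, \ldots, k_n) = \int_{\mathbb{R}^{dn}} f(y_1, \ldots, y_n) e^{ik_1 \cdot y_1 + \cdots + ik_n \cdot y_n} \, d^d y_1 \cdots d^d y_n, \quad k_i \in \mathbb{R}^d. \tag{1.6}
\]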
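The claim that the time integral in (1.3) equals $1/(2m-5)!!$ can be checked exactly. The sketch below is illustrative only: the function names are hypothetical, and it assumes the standard reduction to the single variable $S = \sum_i t_i$, on which the set $\{\sum_i t_i = S\}$ carries weight $S^{N-1}/(N-1)!$ with $N = 2m-3$.

```python
from fractions import Fraction
from math import factorial


def double_factorial(n: int) -> int:
    """N!! for odd N >= -1, with (-1)!! = 1 and N!! = N (N-2)!!."""
    if n < -1 or n % 2 == 0:
        raise ValueError("defined here only for odd N >= -1")
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def time_integral(m: int) -> Fraction:
    """Exact integral of (sum t_i) exp(-(sum t_i)^2 / 2) over t_1, ..., t_N >= 0, N = 2m - 3.

    In the variable S = sum t_i the integral becomes
    (1 / (N-1)!) * int_0^inf S^N exp(-S^2 / 2) dS,
    and for odd N = 2j + 1 the Gaussian moment equals 2^j * j!.
    """
    n_paths = 2 * m - 3
    j = (n_paths - 1) // 2  # j = m - 2
    gaussian_moment = 2 ** j * factorial(j)
    return Fraction(gaussian_moment, factorial(n_paths - 1))


if __name__ == "__main__":
    for m in range(2, 8):
        # number of shapes and the value of the time integral
        print(m, double_factorial(2 * m - 5), time_integral(m))
```

For each m the printed integral is the reciprocal of the printed number of shapes, which is why summing (1.3) over shapes and integrating over the displacements gives total mass 1.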
The integral (1.5) can be written in terms of the parabolic cylinder function $D_{-2}$ as $\hat{A}^{(2)}(k) = e^{k^4/16} D_{-2}(k^2/2)$ [14, 3.462.1]. For general m ≥ 2,
\[
\hat{A}^{(m)}(\sigma; k_1, \ldots, k_{2m-3}) = \int_0^\infty dt_1 \cdots \int_0^\infty dt_{2m-3} \left( \sum_{i=1}^{2m-3} t_i \right) e^{-\left(\sum_{i=1}^{2m-3} t_i\right)^2/2} e^{-\sum_{i=1}^{2m-3} k_i^2 t_i/2}. \tag{1.7}
\]
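As a numerical illustration of (1.5), the sketch below compares a Simpson quadrature of the integral with an elementary closed form obtained by integration by parts, $1 - b\sqrt{\pi/2}\, e^{b^2/2}\operatorname{erfc}(b/\sqrt{2})$ with $b = k^2/2$. The erfc expression is swapped in here for convenience in place of the parabolic cylinder function of the text; the function names, cutoff and step count are assumptions of the sketch.

```python
import math


def a_hat_2_quadrature(k_sq: float, upper: float = 12.0, steps: int = 4000) -> float:
    """Composite Simpson approximation of int_0^inf t exp(-t^2/2 - k^2 t/2) dt.

    The integrand is negligible beyond t = 12, so the range is truncated there.
    `steps` must be even.
    """
    h = upper / steps

    def g(t: float) -> float:
        return t * math.exp(-0.5 * t * t - 0.5 * k_sq * t)

    total = g(0.0) + g(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3.0


def a_hat_2_closed(k_sq: float) -> float:
    """Same integral in closed form, with b = k^2 / 2:
    1 - b * sqrt(pi/2) * exp(b^2/2) * erfc(b / sqrt(2)).
    """
    b = 0.5 * k_sq
    return 1.0 - b * math.sqrt(math.pi / 2.0) * math.exp(0.5 * b * b) * math.erfc(b / math.sqrt(2.0))


if __name__ == "__main__":
    for k_sq in (0.0, 0.5, 1.0, 4.0):
        print(k_sq, a_hat_2_quadrature(k_sq), a_hat_2_closed(k_sq))
```

At k = 0 both expressions reduce to 1, consistent with $P^{(2)}$ being a probability density.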
1.3. Lattice trees. In this section, we introduce some notation and recall some previous results concerning high-dimensional lattice trees.
Fig. 4. A lattice tree containing the sites $0, x_1, x_2, x_3, x_4$ with its corresponding skeleton and branch points $b_1, b_2, b_3$. The vector displacements along the skeleton paths are $y_1 = b_1$, $y_2 = x_1 - b_1$, $y_3 = b_2 - b_1$, $y_4 = x_3 - b_2$, and so on. The ordering of the labelling of the displacements is fixed according to some convention, for each skeleton shape σ

Let $t_n^{(1)}$ denote the number of n-bond lattice trees containing the origin, with $t_0^{(1)} = 1$. For m ≥ 2, let $t_n^{(m)}(\sigma; y, s)$ be the number of n-bond lattice trees with skeleton shape σ and skeleton displacements $y_1, \ldots, y_{2m-3}$ as in Fig. 4, with the skeleton path corresponding to $y_i$ consisting of $s_i$ steps (i = 1, . . . , 2m − 3). We also define
\[
t_n^{(m)}(\sigma; y) = \sum_{s} t_n^{(m)}(\sigma; y, s), \tag{1.8}
\]
\[
t_n^{(m)}(y) = \sum_{\sigma} t_n^{(m)}(\sigma; y). \tag{1.9}
\]
We will make use of Fourier transforms with respect to the y variables, for example,
\[
\hat{t}_n^{(m)}(\sigma; k) = \sum_{y} t_n^{(m)}(\sigma; y) e^{i(k_1 \cdot y_1 + \cdots + k_{2m-3} \cdot y_{2m-3})}, \quad k_i \in [-\pi, \pi]^d. \tag{1.10}
\]
Note that for m ≥ 2,
\[
\hat{t}_n^{(m)}(0) = \sum_{\sigma} \sum_{y} t_n^{(m)}(\sigma; y) = (n+1)^{m-1} t_n^{(1)}. \tag{1.11}
\]
To see this, perform the sums over σ and y by first fixing the values of x1 , . . . , xm−1 and then summing over all shapes and branch points compatible with x1 , . . . , xm−1 as in Fig. 4. This leaves the sum over x1 , . . . , xm−1 of the number of n-bond lattice trees containing the origin and x1 , . . . , xm−1 . Then (1.11) follows from the fact that an n-bond lattice tree contains n + 1 sites. In [17, 19], some critical exponents for lattice trees were proven to exist and to assume their mean-field values when d > 8. More precisely, the results were obtained for the nearest-neighbour model when d ≥ d0 for some undetermined dimension d0 > 8,
and for spread-out trees when d > 8 and L is sufficiently large depending on d. We will refer to these restrictions on the dimension and L as the “high-dimension condition.” In particular, it was shown in [19] that under the high-dimension condition there are positive constants $B_2$ and $z_c$ (depending on d and L; in the notation of [19], $B_2 = A\pi^{1/2}$) such that
\[
t_n^{(1)} \sim B_2 \pi^{-1/2} z_c^{-n} n^{-3/2}, \quad \text{as } n \to \infty. \tag{1.12}
\]
In terms of the critical exponent θ occurring in the conjectured asymptotic relation $t_n^{(1)} \sim \text{const.}\, z_c^{-n} n^{1-\theta}$ and depending, in general, on the dimension, this says that $\theta = \frac{5}{2}$ under the high-dimension condition. The bounds $c_1 n^{-c_2 \log n} z_c^{-n} \le t_n^{(1)} \le c_3 n^{1/d} z_c^{-n}$, proved respectively in [22] and [28] and believed not to be sharp, are the best general bounds known at present for $t_n^{(1)}$. With (1.12), (1.11) gives
\[
\hat{t}_n^{(m)}(0) \sim B_2 \pi^{-1/2} z_c^{-n} n^{m-5/2}. \tag{1.13}
\]
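The identity (1.11) can be illustrated for m = 2 by brute-force enumeration of small nearest-neighbour trees. The sketch below is not the method of the paper; it grows trees one leaf at a time, and all function names are hypothetical. For m = 2 there is a single skeleton shape, so $\hat{t}_n^{(2)}(0)$ is the sum over x of the number of n-bond trees containing both 0 and x.

```python
def neighbours(site):
    """Nearest neighbours of a site of Z^d, given as a tuple of integers."""
    for axis in range(len(site)):
        for step in (-1, 1):
            yield site[:axis] + (site[axis] + step,) + site[axis + 1:]


def trees_containing_origin(n_bonds, d=2):
    """All nearest-neighbour lattice trees in Z^d with n_bonds bonds that contain the origin.

    Each tree is a pair (frozenset of sites, frozenset of bonds). Every such tree can be
    built from the single site 0 by attaching one leaf at a time, because a tree with at
    least two sites always has a leaf different from the origin.
    """
    origin = (0,) * d
    current = {(frozenset([origin]), frozenset())}
    for _ in range(n_bonds):
        grown = set()
        for sites, bonds in current:
            for x in sites:
                for y in neighbours(x):
                    if y not in sites:  # attaching a new site never closes a cycle
                        grown.add((sites | {y}, bonds | {frozenset((x, y))}))
        current = grown
    return current


def t1(n, d=2):
    """t_n^(1): number of n-bond trees containing the origin."""
    return len(trees_containing_origin(n, d))


def t2_hat_zero(n, d=2):
    """Sum over x of the number of n-bond trees containing both 0 and x."""
    return sum(len(sites) for sites, _ in trees_containing_origin(n, d))


if __name__ == "__main__":
    for n in range(5):
        print(n, t1(n), t2_hat_zero(n), (n + 1) * t1(n))
```

The last two printed columns agree because every n-bond tree has n + 1 sites, which is exactly the counting step used to derive (1.11).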
A second critical exponent involves $R_n$, the average radius of gyration of n-bond trees. The squared average radius of gyration is defined by
\[
R_n^2 = \frac{1}{t_n^{(1)}} \sum_{T : |T| = n,\, T \ni 0} R(T)^2, \tag{1.14}
\]
where
\[
R(T)^2 = \frac{1}{|T| + 1} \sum_{x \in T} (x - \bar{x}_T)^2 \tag{1.15}
\]
is the squared radius of gyration of T. Here we write |T| to denote the number of bonds in a lattice tree T, $\bar{x}_T = (|T| + 1)^{-1} \sum_{x \in T} x$ to denote the centre of mass of T (considered as a set of equal masses at the sites of T), and we say that x ∈ T if x is an element of a bond in T. Equivalently,
\[
R_n^2 = \frac{1}{2 \hat{t}_n^{(2)}(0)} \sum_{x} |x|^2 t_n^{(2)}(x). \tag{1.16}
\]
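The passage from (1.14) and (1.15) to (1.16) rests on rewriting the squared radius of gyration as a sum over pairs of sites, $R(T)^2 = \frac{1}{2N^2}\sum_{x,y \in T} |x-y|^2$ with $N = |T| + 1$, and then using translation invariance. The following sketch checks the pairwise identity in exact arithmetic; the function names and the sample site set are hypothetical.

```python
from fractions import Fraction


def radius_of_gyration_sq(sites):
    """R(T)^2 as in (1.15): mean squared distance of the sites from their centre of mass."""
    n_sites = len(sites)
    d = len(sites[0])
    centre = [Fraction(sum(x[i] for x in sites), n_sites) for i in range(d)]
    total = sum(sum((x[i] - centre[i]) ** 2 for i in range(d)) for x in sites)
    return Fraction(total) / n_sites


def pairwise_form(sites):
    """(1 / (2 N^2)) * sum over ordered pairs (x, y) of |x - y|^2, with N the number of sites."""
    n_sites = len(sites)
    total = sum(sum((a - b) ** 2 for a, b in zip(x, y)) for x in sites for y in sites)
    return Fraction(total, 2 * n_sites ** 2)


if __name__ == "__main__":
    # Sites of a small tree in Z^2 (a hypothetical example).
    sample = [(0, 0), (1, 0), (1, 1), (2, 1), (1, 2)]
    print(radius_of_gyration_sq(sample), pairwise_form(sample))
```

Summing the pairwise form over trees containing the origin and shifting one endpoint of each pair to the origin produces the factor $n + 1$ that combines with (1.11) to give the normalization $2\hat{t}_n^{(2)}(0)$ in (1.16).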
It is believed that there is a critical exponent ν such that $R_n \sim D n^{\nu}$ as n → ∞, but very little has been proved rigorously about this in general dimensions. Under the high-dimension condition, it is proved in [19] that
\[
R_n \sim D n^{1/4}, \tag{1.17}
\]
so that $\nu = \frac{1}{4}$. The amplitude D of (1.17) is a positive constant which depends on d, and for the spread-out model, also on L. Asymptotically, for fixed d, D behaves like a multiple of L as L → ∞. For later use, we define
\[
D_1 = 2^{3/4} d^{-1/2} \pi^{-1/4} D. \tag{1.18}
\]
The fact that $\nu = \frac{1}{4}$ under the high-dimension condition can be interpreted as saying that the mass n of a tree grows on average like the fourth power of its radius, suggesting a 4-dimensional nature for high-dimensional lattice trees. This compares well with the fact that ISE has Hausdorff dimension 4 [9, Theorem 4.9], and also permits the upper critical dimension 8 to be interpreted as the dimension above which two 4-dimensional objects generically do not intersect.
1.4. The results. Using (1.11), we define
\[
p_n^{(m)}(\sigma; y) = \frac{t_n^{(m)}(\sigma; y)}{\hat{t}_n^{(m)}(0)} = \frac{1}{(n+1)^{m-1}} \frac{t_n^{(m)}(\sigma; y)}{t_n^{(1)}}, \tag{1.19}
\]
which is $(n+1)^{-(m-1)}$ times the probability that an n-bond lattice tree containing the origin has a skeleton of shape σ mediating displacements y. The following theorem shows that this distribution has the corresponding ISE distribution as its scaling limit, under the high-dimension condition.

Theorem 1.1. Let m ≥ 2 and $k_i \in \mathbb{R}^d$ (i = 1, . . . , 2m − 3). For nearest-neighbour trees in sufficiently high dimensions $d \ge d_0$, and for sufficiently spread-out trees above eight dimensions,
\[
\lim_{n \to \infty} \hat{p}_n^{(m)}(\sigma; k D_1^{-1} n^{-1/4}) = \hat{A}^{(m)}(\sigma; k),
\]
where $D_1$ is given by (1.18).

To prove Theorem 1.1, we will obtain the asymptotic behaviour, as n → ∞, of the Fourier transform of the numerator of (1.19). The asymptotic behaviour of the denominator is given already by (1.13). For a more refined statement than Theorem 1.1, we wish to see the integrand
\[
\hat{a}^{(m)}(\sigma; k, t) \equiv \left( \sum_{i=1}^{2m-3} t_i \right) e^{-\left(\sum_{i=1}^{2m-3} t_i\right)^2/2} e^{-\sum_{i=1}^{2m-3} k_i^2 t_i/2} \tag{1.20}
\]
of the integral representation (1.7) of $\hat{A}^{(m)}(\sigma; k)$ as corresponding to Brownian motion paths arising from the scaling limit of the skeleton. For this, we define
\[
p_n^{(m)}(\sigma; y, s) = \frac{t_n^{(m)}(\sigma; y, s)}{\hat{t}_n^{(m)}(0)} = \frac{1}{(n+1)^{m-1}} \frac{t_n^{(m)}(\sigma; y, s)}{t_n^{(1)}}, \tag{1.21}
\]
which is $(n+1)^{-(m-1)}$ times the probability that an n-bond lattice tree containing the origin has a skeleton of shape σ mediating displacements y with skeleton paths of respective lengths $s_1, \ldots, s_{2m-3}$. The following theorem shows the skeleton paths converging to Brownian motions, for m = 2 and m = 3.

Theorem 1.2. Let m = 2 or m = 3, $k_i \in \mathbb{R}^d$, and $t_i \in [0, \infty)$ (i = 1, . . . , 2m − 3). For nearest-neighbour trees in sufficiently high dimensions $d \ge d_0$, and for sufficiently spread-out trees above eight dimensions, there is a constant $T_1$ (defined below in (4.20)) depending on d and L such that
\[
\lim_{n \to \infty} (T_1 n^{1/2})^{2m-3} \hat{p}_n^{(m)}(\sigma; k D_1^{-1} n^{-1/4}, t T_1 n^{1/2}) = \hat{a}^{(m)}(\sigma; k, t). \tag{1.22}
\]
(As an argument of $\hat{p}_n^{(m)}$, $t_i T_1 n^{1/2}$ is to be interpreted as its integer part $\lfloor t_i T_1 n^{1/2} \rfloor$.)
We believe that Theorem 1.2 holds for all m ≥ 2, but we encounter technical difficulties for m ≥ 4 and have proved the theorem only for m = 2 and m = 3; we comment at the end of Sect. 7 on what goes wrong for m ≥ 4. It would be of interest to prove Theorem 1.2 for general m, and also to investigate tightness to obtain a stronger statement of convergence to ISE.
The factor $(T_1 n^{1/2})^{2m-3}$ on the left side of (1.22) has a natural interpretation. In fact, writing $t_i = s_i/(T_1 n^{1/2})$ in the right side of (1.22), and then multiplying by $(T_1 n^{1/2})^{-(2m-3)}$ and summing over the $s_i$, gives a Riemann sum approximation to (1.7). Theorem 1.2 indicates that, at least for m = 2 and m = 3, skeleton paths with length of order $\sqrt{n}$ are typical, and that the skeleton paths converge to Brownian motion paths in the scaling limit. It is natural that the skeleton paths should typically be of length $\sqrt{n} = (n^{1/4})^2$, since distance is scaled as $n^{1/4}$.
2. Generating Functions

The proofs of Theorems 1.1 and 1.2 use generating functions and contour integration, with the generating functions controlled using the lace expansion. In this section, we reduce the proofs of Theorems 1.1 and 1.2 to corresponding statements about generating functions. General remarks concerning the relationship between generating functions and ISE can be found in [11]. To define the generating functions, for m ≥ 2 let
\[
G^{(m)}_{z,\zeta}(\sigma; y) = \sum_{n=0}^{\infty} \sum_{s} t_n^{(m)}(\sigma; y, s) \zeta^s z^n, \quad \zeta_j, z \in \mathbb{C}, \tag{2.1}
\]
where $\zeta = (\zeta_1, \ldots, \zeta_{2m-3})$ and $\zeta^s = \zeta_1^{s_1} \cdots \zeta_{2m-3}^{s_{2m-3}}$. We adopt the convention that ζ is dropped from the notation if $\zeta_j = 1$ for all j. If $|\zeta_j| \le 1$ for all j, then
\[
\sum_{y} \left| G^{(m)}_{z,\zeta}(\sigma; y) \right| \le \sum_{n=0}^{\infty} \hat{t}_n^{(m)}(\sigma; 0) |z|^n. \tag{2.2}
\]
In view of (1.11) and the general fact [25] that $\lim_{n \to \infty} (t_n^{(1)})^{1/n} \equiv z_c^{-1} \in (0, \infty)$, it follows from the above bound that the Fourier transform of (2.1) exists for $|\zeta_j| \le 1$, $|z| < z_c$. In [19], it was shown under the high-dimension condition that
\[
\hat{G}^{(2)}_z(0) = \frac{B_2}{\sqrt{1 - z/z_c}} + E(z), \tag{2.3}
\]
with $B_2$ as in (1.12), and $E(z)$ an error term which obeys the bounds $|E(z)| \le O(|1 - z/z_c|^{\epsilon - 1/2})$ and $|E'(z)| \le O(|1 - z/z_c|^{\epsilon - 3/2})$ uniformly in the disk $|z| \le z_c$. Here $\epsilon$ is a positive constant, whose value may change from line to line in the sequel. The constant $B_2$ can be expressed in terms of the constants $B$, $B_1$ and $b$ occurring respectively in [19, (2.21)], [19, (2.14)] and [19, Lemma 2.1(iii)] as
1/2 B 1 1 1 = , B2 = √ = 3/2 zc zc B1 zc 2(1 + b)
(2.4)
Here the coordination number of the lattice is $2d$ for the nearest-neighbour model and $(2L+1)^d - 1$ for the spread-out model.

We shall generalize (2.3) to variable m, k, ζ. For this, we define $E^{(m)}_{z,\zeta}(\sigma; k)$ by
\[
\hat{G}^{(m)}_{z,\zeta}(\sigma; k) = B_2 2^{2m-5/2} \prod_{j=1}^{2m-3} \frac{1}{D_1^2 k_j^2 + 2^{3/2}(1 - z/z_c)^{1/2} + 2T_1(1 - \zeta_j)} + E^{(m)}_{z,\zeta}(\sigma; k), \tag{2.5}
\]
where $T_1$ is the constant occurring in Theorem 1.2, to be defined below in (4.20). Equation (2.5) reduces to (2.3), if we set m = 2, k = 0, ζ = 1. It will be shown in Theorem 2.1 and the proofs of Theorems 1.1 and 1.2 below that $E^{(m)}_{z,\zeta}(\sigma; k)$ is an error term, for all m ≥ 2 when ζ = 1 and for m = 2, 3 for general ζ. Explicit bounds on $E^{(m)}_{z,\zeta}(\sigma; k)$ will be given in Sects. 5, 6 and 7. With this in mind, we make three remarks.

Remarks.

1. For m = 2 and ζ = 1, the main term in (2.5) is proportional to $(D_1^2 k^2 + 2^{3/2}(1 - z/z_c)^{1/2})^{-1}$. This corresponds to the critical exponent values η = 0 and $\gamma = 3 - \theta = \frac{1}{2}$, where the exponents η and γ are given by $\hat{G}^{(2)}_{z_c}(k) \approx k^{\eta - 2}$ and $\hat{G}^{(2)}_z(0) \approx (1 - z/z_c)^{-\gamma}$.

2. Let m ≥ 3 and ζ = 1. If we define $v = 1/(2B_2^2)$, the main term in (2.5) can be written as
\[
v^{m-2} \prod_{j=1}^{2m-3} \frac{B_2 2^{3/2}}{D_1^2 k_j^2 + 2^{3/2}(1 - z/z_c)^{1/2}}. \tag{2.6}
\]
The quantity occurring in the product is the leading contribution to the two-point function $\hat{G}^{(2)}_z(k)$, and the renormalized vertex factor v accounts for the fact that, due to the self-avoidance interactions inherent in a lattice tree, the m-point function is not a product of independent two-point functions. However, apart from this renormalized vertex factor, the m-point function does behave, to leading order, as a product of two-point functions. A conjecture for percolation theory, in the spirit of (2.6), is stated in [2, Sect. 4.1].

3. Equation (2.5), with k = 0 and ζ = 1, is consistent with the following formal calculation. Define $Df(z) = \frac{d}{dz}(z f(z))$. In view of (1.11), $\hat{G}^{(m)}_z(0) = D^{m-2} \hat{G}^{(2)}_z(0)$, for m ≥ 2. If we ignore the error term in (2.3), then as $z \to z_c$ we obtain
\[
\hat{G}^{(m)}_z(0) = D^{m-2} \hat{G}^{(2)}_z(0) \sim B_2 (2m-5)!!\, 2^{-(m-2)} (1 - z/z_c)^{-(2m-3)/2}. \tag{2.7}
\]
The factor (2m − 5)!! is absent in (2.5) because the shape is fixed there.

We define coefficients $e_n^{(m)}(\sigma; k, s)$ and $e_n^{(m)}(\sigma; k)$ by
\[
E^{(m)}_{z,\zeta}(\sigma; k) = \sum_{n=0}^{\infty} \sum_{s} e_n^{(m)}(\sigma; k, s) \zeta^s z^n, \quad |z| < z_c, \ |\zeta_j| \le 1, \tag{2.8}
\]
and
\[
E^{(m)}_{z,1}(\sigma; k) = \sum_{n=0}^{\infty} e_n^{(m)}(\sigma; k) z^n, \quad |z| < z_c. \tag{2.9}
\]
To abbreviate the notation, we write
\[
\kappa = k D_1^{-1} n^{-1/4}. \tag{2.10}
\]
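The bookkeeping in Remark 3 can be checked in exact arithmetic. The sketch below, with hypothetical function names, compares the leading coefficient produced by applying D repeatedly to $(1 - z/z_c)^{-1/2}$ with the sum over the (2m − 5)!! shapes of the main term of (2.5) at k = 0, ζ = 1, both divided by $B_2$. It assumes only the leading-order rule that D sends $c(1 - z/z_c)^{-a}$ to $c\,a\,(1 - z/z_c)^{-a-1}$ as $z \to z_c$.

```python
from fractions import Fraction


def double_factorial(n):
    """N!! for odd N >= -1, with (-1)!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def leading_coefficient_by_differentiation(m):
    """Coefficient c in D^{m-2} (1 - z/z_c)^{-1/2} ~ c (1 - z/z_c)^{-(2m-3)/2}.

    Each application of D f = d/dz (z f) raises the exponent a to a + 1 and,
    to leading order at z = z_c, multiplies the coefficient by a.
    """
    coeff = Fraction(1)
    exponent = Fraction(1, 2)
    for _ in range(m - 2):
        coeff *= exponent
        exponent += 1
    return coeff


def leading_coefficient_from_shapes(m):
    """Sum over shapes of the main term of (2.5) at k = 0, zeta = 1, divided by B_2.

    Per shape the prefactor is 2^{2m-5/2} / (2^{3/2})^{2m-3} = 2^{2-m}.
    """
    return double_factorial(2 * m - 5) * Fraction(2) ** (2 - m)


if __name__ == "__main__":
    for m in range(2, 8):
        print(m, leading_coefficient_by_differentiation(m), leading_coefficient_from_shapes(m))
```

Both columns equal $(2m-5)!!\,2^{-(m-2)}$, the coefficient in (2.7).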
Theorem 2.1. Let m ≥ 2 and $k_i \in \mathbb{R}^d$ (i = 1, . . . , 2m − 3). For nearest-neighbour trees in sufficiently high dimensions $d \ge d_0$, and for sufficiently spread-out trees above eight dimensions, there is an $\epsilon > 0$ such that
\[
|e_n^{(m)}(\sigma; \kappa)| \le \text{const.}\, z_c^{-n} n^{-\epsilon + m - 5/2}. \tag{2.11}
\]
Let m = 2 or m = 3, $k_i \in \mathbb{R}^d$, and $t_i \in [0, \infty)$ (i = 1, . . . , 2m − 3). For nearest-neighbour trees in sufficiently high dimensions $d \ge d_0$, and for sufficiently spread-out trees above eight dimensions, there is an $\epsilon > 0$ such that
\[
|e_n^{(m)}(\sigma; \kappa, t T_1 n^{1/2})| \le \text{const.}\, z_c^{-n} n^{-1-\epsilon}. \tag{2.12}
\]
The value of $\epsilon$ may vary from line to line in the sequel. Before moving on to the proof of Theorem 2.1, we first show that it allows for proofs of Theorems 1.1 and 1.2.

Proof of Theorem 1.1. Let $c_n^{(m)}(\sigma; k)$ be defined by
\[
\prod_{j=1}^{2m-3} \frac{1}{D_1^2 k_j^2 + 2^{3/2}(1 - z/z_c)^{1/2}} = \sum_{n=0}^{\infty} c_n^{(m)}(\sigma; k) z^n, \quad |z| < z_c. \tag{2.13}
\]
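By (2.5) with ζ = 1,
\[
\hat{t}_n^{(m)}(\sigma; \kappa) = B_2 2^{2m-5/2} c_n^{(m)}(\sigma; \kappa) + e_n^{(m)}(\sigma; \kappa). \tag{2.14}
\]
In view of (2.11) and (1.13), it suffices to show that
\[
c_n^{(m)}(\sigma; \kappa) \sim n^{m-5/2} z_c^{-n} 2^{-(2m-5/2)} \pi^{-1/2} \hat{A}^{(m)}(\sigma; k) \quad \text{as } n \to \infty. \tag{2.15}
\]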
The remainder of the proof is concerned with establishing (2.15). By (2.13),
\[
c_n^{(m)}(\sigma; \kappa) = \frac{1}{2\pi i} \oint \frac{dz}{z^{n+1}} \prod_{j=1}^{2m-3} \frac{1}{k_j^2 n^{-1/2} + 2^{3/2}(1 - z/z_c)^{1/2}}, \tag{2.16}
\]
where the integral is taken around a small circle centred at the origin. For simplicity, for the remainder of the proof we suppose that $k_j \ne 0$ for all j. The case where one or more of the $k_j$ is zero can be treated using similar methods, together with a limiting argument. We make the change of variables $w = n(z/z_c - 1)$, and then deform the contour to the branch cut [0, ∞) in the w-plane. The resulting contour goes from right to left below the branch cut and from left to right above the cut. We then apply the identity
\[
\frac{1}{k_j^2 + 2^{3/2}(-w)^{1/2}} = \frac{1}{2} \int_0^\infty dt_j \, e^{-k_j^2 t_j/2} e^{-2^{1/2}(-w)^{1/2} t_j}. \tag{2.17}
\]
Taking into account the correct branch of the square root on either side of the branch cut, and applying Fubini’s theorem, this gives
\[
c_n^{(m)}(\sigma; \kappa) = n^{m-5/2} z_c^{-n} 2^{-(2m-3)} \int_0^\infty dt_1 \cdots \int_0^\infty dt_{2m-3} \, e^{-\left(\sum_{j=1}^{2m-3} k_j^2 t_j\right)/2} \times \frac{1}{\pi} \int_0^\infty \frac{dw}{(1 + w/n)^{n+1}} \sin\Big( \Big( \sum_{j=1}^{2m-3} t_j \Big) \sqrt{2w} \Big). \tag{2.18}
\]
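Since $(1 + \frac{w}{n})^{n+1} \ge 1 + \frac{(n+1)n}{2} (\frac{w}{n})^2 \ge 1 + \frac{w^2}{2}$ for all n ≥ 1, the dominated convergence theorem can be applied to conclude that as n → ∞ the w-integral converges to
\[
\int_0^\infty dw \, e^{-w} \sin\Big( \Big( \sum_{j=1}^{2m-3} t_j \Big) \sqrt{2w} \Big) = \left( \frac{\pi}{2} \right)^{1/2} \Big( \sum_{j=1}^{2m-3} t_j \Big) e^{-\left(\sum_{j=1}^{2m-3} t_j\right)^2/2}. \tag{2.19}
\]
Therefore, as required,
\[
c_n^{(m)}(\sigma; \kappa) \sim n^{m-5/2} z_c^{-n} 2^{-(2m-5/2)} \pi^{-1/2} \hat{A}^{(m)}(\sigma; k). \tag{2.20}
\]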
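The integral identity (2.19) and the domination bound used before it lend themselves to a quick numerical check. In the sketch below, a stands for $\sum_j t_j$; the function names, cutoff and step count are assumptions. The substitution $w = u^2/2$ is used only to make the integrand smooth for Simpson's rule.

```python
import math


def lhs_2_19(a: float, upper: float = 12.0, steps: int = 6000) -> float:
    """Simpson approximation of int_0^inf exp(-w) sin(a sqrt(2w)) dw.

    With w = u^2 / 2 the integral becomes int_0^inf u exp(-u^2/2) sin(a u) du,
    truncated at u = 12 where the integrand is negligible. `steps` must be even.
    """
    h = upper / steps

    def g(u: float) -> float:
        return u * math.exp(-0.5 * u * u) * math.sin(a * u)

    total = g(0.0) + g(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3.0


def rhs_2_19(a: float) -> float:
    """(pi/2)^{1/2} * a * exp(-a^2 / 2), the right side of (2.19)."""
    return math.sqrt(math.pi / 2.0) * a * math.exp(-0.5 * a * a)


def domination_holds(w: float, n: int) -> bool:
    """The bound (1 + w/n)^(n+1) >= 1 + w^2/2 used for dominated convergence."""
    return (1.0 + w / n) ** (n + 1) >= 1.0 + 0.5 * w * w


if __name__ == "__main__":
    for a in (0.3, 1.0, 2.5):
        print(a, lhs_2_19(a), rhs_2_19(a))
```

Multiplying the right side of (2.19) by the prefactor $2^{-(2m-3)}/\pi$ from (2.18) gives $2^{-(2m-5/2)}\pi^{-1/2}$, which is the constant appearing in (2.20).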
Proof of Theorem 1.2. In view of (2.12), recalling (1.13) and taking into account the factor $(T_1 n^{1/2})^{2m-3}$ in (1.22), it suffices to show that for k replaced by $k D_1^{-1} n^{-1/4}$, the coefficient of $\zeta^{t T_1 \sqrt{n}} z^n$ in the first term on the right side of (2.5) is asymptotic to
\[
\frac{B_2}{T_1^{2m-3}} \frac{1}{\sqrt{\pi}\, n z_c^n} \hat{a}^{(m)}(\sigma; k, t). \tag{2.21}
\]
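Since each factor in the product in (2.5) is the sum of a geometric series in $\zeta_j$, the desired coefficient is equal to
\[
B_2 2^{2m-5/2} \frac{1}{(2T_1)^{2m-3}} \times \frac{1}{2\pi i} \oint \frac{dz}{z^{n+1}} \prod_{j=1}^{2m-3} \left( 1 + \frac{k_j^2}{2T_1 \sqrt{n}} + \frac{\sqrt{2}}{T_1} (1 - z/z_c)^{1/2} \right)^{-(T_1 t_j \sqrt{n} + 1)}, \tag{2.22}
\]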
where the integration is performed around a small circle centred at the origin. Making the change of variables $w = n(z/z_c - 1)$ and deforming the contour of integration to the branch cut $[0, \infty)$ in the $w$-plane, the above is equal to
\[ \frac{\sqrt{2} B_2}{T_1^{2m-3} \pi n z_c^n}\, \mathrm{Im} \int_0^\infty \frac{dw}{(1+w/n)^{n+1}} \prod_{j=1}^{2m-3} \Bigl( 1 + \frac{k_j^2}{2T_1\sqrt{n}} - \frac{i\sqrt{2w}}{T_1\sqrt{n}} \Bigr)^{-(T_1 t_j \sqrt{n} + 1)}. \tag{2.23} \]
The integrand here is dominated by $(1 + w^2/2)^{-1} \cdot 1$, so by the dominated convergence theorem, this is asymptotic to
\[ \frac{\sqrt{2} B_2}{T_1^{2m-3} \pi n z_c^n}\, e^{-(\sum_{j=1}^{2m-3} k_j^2 t_j)/2} \int_0^\infty dw\, e^{-w} \sin\Bigl(\sqrt{2w}\, \bigl(\textstyle\sum_{j=1}^{2m-3} t_j\bigr)\Bigr). \tag{2.24} \]
By (2.19), this equals
\[ \frac{B_2}{T_1^{2m-3}} \frac{1}{\sqrt{\pi}\, n z_c^n}\, \hat{a}^{(m)}(\sigma; k, t). \tag{2.25} \]
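The passage from (2.23) to (2.24) rests on a pointwise limit of the integrand together with the domination by $(1 + w^2/2)^{-1}$. The sketch below illustrates both for a single factor of the product (the case $2m-3 = 1$). The function names, the default value `T1 = 1.0` and the sample arguments are assumptions chosen for illustration only; `k2` stands for $k_j^2$.

```python
import math


def finite_n_integrand(w, n, k2, t, T1=1.0):
    """Imaginary part of the integrand of (2.23) for one factor of the product."""
    root_n = math.sqrt(n)
    base = 1 + k2 / (2 * T1 * root_n) - 1j * math.sqrt(2 * w) / (T1 * root_n)
    factor = base ** (-(T1 * t * root_n + 1))
    return (factor / (1 + w / n) ** (n + 1)).imag


def limit_integrand(w, k2, t):
    """Integrand of (2.24) for one factor, including the Gaussian prefactor."""
    return math.exp(-w) * math.exp(-k2 * t / 2) * math.sin(math.sqrt(2 * w) * t)


def dominating_function(w):
    """The bound (1 + w^2/2)^(-1) used for dominated convergence."""
    return 1.0 / (1.0 + w * w / 2)


if __name__ == "__main__":
    w, k2, t = 1.3, 0.7, 0.9
    for n in (10**2, 10**4, 10**6, 10**8):
        print(n, finite_n_integrand(w, n, k2, t), limit_integrand(w, k2, t))
```

The printed values approach the limit at a rate of order $n^{-1/2}$, in line with the scaling $T_1 t_j \sqrt{n}$ of the exponent.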
E. Derbez, G. Slade
3. A Power Series Method
Our remaining task is to prove Theorem 2.1. For this, we shall obtain explicit formulas for the error term $E^{(m)}_{z,\zeta}(\sigma; k)$ and use contour integration to obtain the bounds (2.11) and (2.12) on its coefficients of $z^n$ and $z^n \zeta^s$. These bounds result from corresponding bounds on $E^{(m)}_{z,\zeta}(\sigma; k)$ itself. The purpose of this section is to review and extend the method of [20] (see also [29, Sect. 6.3]) that will be used to carry out this procedure.
Given a power series $f(z) = \sum_{n=0}^\infty a_n z^n$ and $\epsilon \ge 0$, we define the fractional derivative
\[ \delta_z^\epsilon f(z) = \sum_{n=0}^\infty n^\epsilon a_n z^n. \tag{3.1} \]
Note that for $\epsilon$ equal to a positive integer, $\delta_z^\epsilon$ does not give the usual derivative, but gives $(z \frac{d}{dz})^\epsilon$ instead. The series (3.1) converges absolutely at least strictly within the circle of convergence of $f(z)$. We also define
\[ \|f(z)\| = \sum_{n=0}^\infty |a_n| |z|^n. \tag{3.2} \]
We require the following lemma, which is a restatement of [29, Lemma 6.3.2]. This lemma provides an error estimate analogous to the error estimate in Taylor's theorem. In applications of the lemma, $R$ will be the radius of convergence of $f$. The norms appearing in the lemma are given, in view of (3.2), by $\|\delta_z^\epsilon f(R)\| = \sum_{n=0}^\infty n^\epsilon |a_n| R^n$ and $\|\delta_z^{1+\epsilon} f(R)\| = \sum_{n=1}^\infty n^{1+\epsilon} |a_n| R^n$. Lemma 3.1 will be used to help supply the hypotheses of Lemma 3.2 below.
Lemma 3.1. Let $\epsilon \in (0, 1)$, $f(z) = \sum_{n=0}^\infty a_n z^n$, and $R > 0$. If $\|\delta_z^\epsilon f(R)\| < \infty$ (so in particular $f(z)$ converges for $|z| \le R$), then for any $z$ with $|z| \le R$,
\[ |f(z) - f(R)| \le 2^{1-\epsilon} \|\delta_z^\epsilon f(R)\| \, |1 - z/R|^\epsilon. \tag{3.3} \]
If $\|\delta_z^{1+\epsilon} f(R)\| < \infty$ (so in particular $f'(z) = \sum_{n=1}^\infty n a_n z^{n-1}$ converges for $|z| \le R$), then for any $z$ with $|z| \le R$,
\[ |f(z) - f(R) - f'(R)(z - R)| \le \frac{2^{1-\epsilon}}{1+\epsilon} \|\delta_z^{1+\epsilon} f(R)\| \, |1 - z/R|^{1+\epsilon}. \tag{3.4} \]
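The first estimate of Lemma 3.1 can be illustrated on a concrete series. The sketch below is a numerical illustration rather than a proof: it takes the polynomial $f(z) = \sum_{n=1}^{N} z^n/n^2$ with $R = 1$ (an example chosen here, not taken from the paper) and reports the largest value of the left side of (3.3) minus the right side over a grid of points in the closed disk. A non-positive result is what (3.3) predicts.

```python
import cmath
import math


def worst_violation_of_3_3(eps=0.5, R=1.0, N=200, samples=60):
    """Largest value of LHS - RHS of (3.3) over sample points with |z| <= R."""
    # Coefficients a_0 = 0 and a_n = 1/n^2 for 1 <= n <= N (illustrative choice).
    coeffs = [0.0] + [1.0 / n**2 for n in range(1, N + 1)]

    def f(z):
        return sum(a * z**n for n, a in enumerate(coeffs))

    # Norm of the fractional derivative at R, as in (3.1) and (3.2).
    norm = sum(n**eps * abs(a) * R**n for n, a in enumerate(coeffs))

    worst = -math.inf
    for i in range(samples):
        for radius in (0.5 * R, 0.9 * R, R):
            z = radius * cmath.exp(2j * math.pi * i / samples)
            lhs = abs(f(z) - f(R))
            rhs = 2 ** (1 - eps) * norm * abs(1 - z / R) ** eps
            worst = max(worst, lhs - rhs)
    return worst


if __name__ == "__main__":
    print(worst_violation_of_3_3())
```

The bound holds term by term because $|1 - w^n| \le \min(2, n|1-w|) \le 2^{1-\epsilon} n^\epsilon |1-w|^\epsilon$ for $|w| \le 1$, which is why the factor $n^\epsilon$ appears in the norm.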
The next lemma will be our main tool in converting bounds on the error term $E^{(m)}_{z,\zeta}(\sigma; k)$, which will be obtained using the lace expansion, into the bounds on its coefficients stated in Theorem 2.1. The intuition behind the lemma is that if a power series $f(z) = \sum_{n=0}^\infty a_n z^n$ has radius of convergence $R > 0$ and if $|f(z)|$ is bounded above by a multiple of $|R - z|^{-b}$ on the disk of radius $R$, with $b \ge 1$, then $a_n$ should be not much worse than order $R^{-n} n^{b-1}$. The lemma incorporates improvements from [12, Theorem 4] to [29, Lemma 6.3.3]. To abbreviate the notation, we introduce
\[ U_p(w) = |1 - w|^{-b_p} (1 - |w|)^{-\bar{b}_p}. \tag{3.5} \]
Lemma 3.2. (i) Let $f(z) = \sum_{n=0}^\infty a_n z^n$ have radius of convergence at least $R$, where $R > 0$. Suppose that $|f(z)| \le \mathrm{const.}\, U_0(z/R)$ for $|z| < R$, for some $b_0 \ge 1$, $\bar{b}_0 \ge 0$. Then $|a_n| \le \mathrm{const.}\, R^{-n} n^{b_0 + \bar{b}_0 - 1}$ if $b_0 > 1$, and $|a_n| \le \mathrm{const.}\, R^{-n} n^{\bar{b}_0} \log n$ if $b_0 = 1$, with the constant independent of $n$.
(ii) Let $j$ be a positive integer. Suppose that $|\frac{d^j}{dz^j} f(z)| \le \mathrm{const.}\, U_0(z/R)$ for $|z| < R$, for some $b_0 \ge 1$, $\bar{b}_0 \ge 0$. Then $|a_n| \le \mathrm{const.}\, R^{-n} n^{b_0 + \bar{b}_0 - 1 - j}$ if $b_0 > 1$, and $|a_n| \le \mathrm{const.}\, R^{-n} n^{\bar{b}_0 - j} \log n$ if $b_0 = 1$.
(iii) Suppose that $f(z, \zeta_1, \ldots, \zeta_q) = \sum_{n, s_1, \ldots, s_q = 0}^\infty a_{n, s_1, \ldots, s_q} \zeta_1^{s_1} \cdots \zeta_q^{s_q} z^n$ converges absolutely if $|z| < R$ and $|\zeta_p| \le 1$ ($p = 1, \ldots, q$). Suppose in addition that in this region, $|f(z, \zeta_1, \ldots, \zeta_q)| \le \mathrm{const.}\, U_0(z/R) \prod_{p=1}^q U_p(\zeta_p)$, where $b_p \ge 1$, $\bar{b}_p \ge 0$ ($p = 0, 1, \ldots, q$). Then $|a_{n, s_1, \ldots, s_q}| \le \mathrm{const.}\, R^{-n} n^{b_0 + \bar{b}_0 - 1} \prod_{p=1}^q s_p^{b_p + \bar{b}_p - 1}$, where we interpret $n^{b_0 - 1}$ as $\log n$ if $b_0 = 1$, and $s_p^{b_p - 1}$ as $\log s_p$ if $b_p = 1$ ($p = 1, \ldots, q$).
Proof. (i) The coefficient $a_n$ is given by
\[ a_n = \frac{1}{2\pi i} \oint f(z) \frac{dz}{z^{n+1}}, \tag{3.6} \]
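where the integral is around the circle of radius $r_n = R(1 - \frac{1}{n})$ centred at the origin. Using $C$ to denote a constant whose value may vary from line to line, it follows that
\[ |a_n| \le \frac{C}{R^n} \int_{-\pi}^{\pi} |f(r_n e^{i\theta})| \, d\theta. \tag{3.7} \]
By the assumed bound on $f$,
\[ |a_n| \le \frac{C}{R^n} \int_{-\pi}^{\pi} U_0(r_n R^{-1} e^{i\theta}) \, d\theta \le \frac{C}{R^n} \int_0^{\pi} |1 - r_n R^{-1} e^{i\theta}|^{-b_0} (1 - r_n R^{-1})^{-\bar{b}_0} \, d\theta = \frac{C n^{\bar{b}_0}}{R^n} \int_0^{\pi} |1 - r_n R^{-1} e^{i\theta}|^{-b_0} \, d\theta. \tag{3.8} \]
The portion of the integral corresponding to $\theta \in [\frac{\pi}{2}, \pi]$ is bounded uniformly in $n$. To estimate the integral over $[0, \frac{\pi}{2}]$, we note that
\[ \mathrm{Re}[1 - (1 - n^{-1}) e^{i\theta}] = 1 - (1 - n^{-1}) \cos\theta \ge n^{-1} \tag{3.9} \]
and, for $\theta \in [0, \frac{\pi}{2}]$ and $n \ge 2$,
\[ |\mathrm{Im}[1 - (1 - n^{-1}) e^{i\theta}]| = (1 - n^{-1}) \sin\theta \ge (1 - n^{-1}) \frac{2\theta}{\pi} \ge \frac{\theta}{\pi}. \tag{3.10} \]
It follows that
\[ |1 - (1 - n^{-1}) e^{i\theta}| \ge \frac{1}{2} \Bigl( \frac{1}{n} + \frac{\theta}{\pi} \Bigr), \tag{3.11} \]
and hence
\[ \int_0^{\pi/2} |1 - r_n R^{-1} e^{i\theta}|^{-b_0} \, d\theta \le C \int_0^{\pi/2} \Bigl( \frac{1}{\frac{1}{n} + \frac{\theta}{\pi}} \Bigr)^{b_0} d\theta \le \begin{cases} C n^{b_0 - 1} & b_0 > 1 \\ C \log n & b_0 = 1. \end{cases} \tag{3.12} \]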
This completes the proof. (ii) This is an immediate consequence of (i). (iii) The proof involves an iteration of part (i) of the lemma, and we demonstrate this for a single iteration with the case p = 1. The general case is similar.
Let $p = 1$ and write $\zeta$ in place of $\zeta_1$. The proof of part (i), up to the first inequality of (3.8), can be carried out with $a_n = a_n(\zeta) = \sum_{s=0}^\infty a_{n,s} \zeta^s$ to give
\[ R^n n^{-\bar{b}_0} |a_n(\zeta)| \le C U_1(\zeta) \int_{-\pi}^{\pi} U_0(r_n R^{-1} e^{i\theta}) \, d\theta. \tag{3.13} \]
The constant $C$ in (3.13) is independent of both $\zeta$ and $n$. The integral on the right side was shown in the proof of part (i) to be bounded above by a multiple of $n^{b_0 - 1}$, where we interpret $n^0$ as $\log n$. Therefore
\[ n^{-b_0 - \bar{b}_0 + 1} R^n |a_n(\zeta)| \le C U_1(\zeta). \tag{3.14} \]
Writing $b_{n,s} = n^{-b_0 - \bar{b}_0 + 1} R^n a_{n,s}$ and $b_n(\zeta) = \sum_{s=0}^\infty b_{n,s} \zeta^s$, this means $|b_n(\zeta)| \le C U_1(\zeta)$, and part (i) can be applied (with $R = 1$) to give the desired result.
Our reliance on Lemma 3.2 to obtain bounds on coefficients of a power series from bounds on the series itself is something of a limitation. However, Lemma 3.2 cannot be significantly improved. It is clear from the example $f(z) = (1 - z)^{-b}$ that the power in the conclusion of the lemma is optimal (leaving aside logarithms, which are unimportant for our needs), and, as will be explained momentarily, the restriction $b_0 \ge 1$ cannot be weakened. It would be of interest to find a method of proving Theorems 1.1 and 1.2 that works directly with the sequence $t_n^{(m)}(\sigma; y, s)$ rather than with generating functions. Such an approach has been used for weakly self-avoiding walks in [13, 24].
The hypothesis $b_0 \ge 1$ in Lemma 3.2(i) cannot be discarded. For example, let $f(z) = \sum_{n=1}^\infty n^{-2} z^{2^n}$. Then $f(z)$ is finite for $|z| \le 1$, so in particular $|f(z)| \le \mathrm{const.} |1 - z|^{-b}$ for any $b \in [0, 1)$. However $a_N = (\log_2 N)^{-2}$ for $N = 2^n$, so $a_n \neq O(n^\alpha)$ for $\alpha \in (-1, 0)$.
The conclusion of Lemma 3.2 gives only positive power laws (or a logarithmic bound) as upper bounds. Thus, for example, to use the lemma for a function which is bounded by $O(|1 - z|^{-1/2})$ for $|z| \le 1$ (ideally corresponding to coefficients with power law $n^{-1/2}$), we must show that its derivative is bounded above by $O(|1 - z|^{-3/2})$ (yielding power law $n^{1/2}$ for the derivative and hence $n^{-1/2}$ for the original function). This need to take derivatives will make our calculations more laborious. Specifically, to prove (2.11) will require one derivative for $m = 2$ (none are necessary for $m \ge 3$), and (2.12) requires two derivatives for any $m$.
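The optimality example $f(z) = (1 - z)^{-b}$ can be made concrete. Its coefficients are $a_n = \Gamma(n + b)/(\Gamma(b)\, n!)$, which behave like $n^{b-1}/\Gamma(b)$ for large $n$, matching the power $n^{b_0 - 1}$ in Lemma 3.2(i) with $R = 1$ and $\bar{b}_0 = 0$. The following sketch, with function names chosen here for illustration, computes the coefficients by the standard recursion and compares them with that power law.

```python
import math


def binomial_series_coeffs(b, n_max):
    """Coefficients a_0, ..., a_{n_max} of (1 - z)^(-b).

    Uses the recursion a_n = a_{n-1} * (n - 1 + b) / n with a_0 = 1.
    """
    coeffs = [1.0]
    for n in range(1, n_max + 1):
        coeffs.append(coeffs[-1] * (n - 1 + b) / n)
    return coeffs


def ratio_to_power_law(b, n):
    """Ratio of a_n to the asymptotic form n^(b-1) / Gamma(b)."""
    a_n = binomial_series_coeffs(b, n)[n]
    return a_n * math.gamma(b) / n ** (b - 1)


if __name__ == "__main__":
    for b in (1.5, 2.0, 3.0):
        print(b, [ratio_to_power_law(b, n) for n in (10, 100, 1000)])
```

The ratios tend to 1 as $n$ grows, so no smaller power of $n$ could serve as an upper bound in the lemma.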
4. Lace Expansion Bounds
Theorem 2.1 will be proved using Lemma 3.2, and to employ the latter we require bounds on the error term $E^{(m)}_{z,\zeta}(\sigma; k)$ of (2.5). These bounds will be obtained using the lace expansion. This will require some modifications and extensions of the lace expansion methods developed at length for lattice trees in [17, 19] (see also [29, Sect. 5.5.1] and [21]). In particular, we extend the lace expansion methods to the context where there is a skeleton activity $\zeta$. Because the extensions required do not involve the development of new methodology, we will be brief with the proofs. All results requiring the lace expansion that will be used in later sections are stated here. Throughout this section, we assume familiarity with the methods of [17, 19] and will refer frequently to these references for calculations and notation. For simplicity, we work in the remainder of the paper with the spread-out model and assume that $d > 8$ and $L$ is sufficiently large. Similar results can be proven for the nearest-neighbour model in sufficiently high dimensions.
4.1. The lace expansion with backbone activity. With $\zeta = 1$, [17, (3.8)] gives
\[ \hat{G}^{(2)}_z(k) = \frac{\hat{h}_z(k)}{\hat{F}_z(k)}, \qquad \hat{F}_z(k) = 1 - z\Omega \hat{D}(k) \hat{h}_z(k). \tag{4.1} \]
Here we are using our convention that when $\zeta = 1$ it is omitted from the notation, and $\Omega = (2L + 1)^d - 1$ is the coordination number of the lattice. (We will abuse notation by also writing $\Omega$ for the set of all $x \in \mathbb{Z}^d$ with $x \neq 0$ and $0 \le |x^{(j)}| \le L$, $j = 1, \ldots, d$.) Also,
\[ \hat{D}(k) = \frac{1}{\Omega} \sum_{x \in \Omega} e^{ik \cdot x}, \qquad \hat{h}_z(k) = g_z + \hat{\Pi}_z(k), \qquad g_z = \sum_{n=0}^\infty t_n^{(1)} z^n, \tag{4.2} \]
and in the notation of [17, Sect. 2.1.1],
\[ \Pi_z(x) = \sum_{\substack{\omega: 0 \to x \\ |\omega| \ge 1}} z^{|\omega|} \prod_{i=0}^{|\omega|} \sum_{R_i \ni \omega(i)} z^{|R_i|} \, J[0, |\omega|]. \tag{4.3} \]
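The above expression for $\Pi_z(x)$ is produced using the lace expansion, and leads to a diagrammatic representation.
[Diagrammatic expansion (4.4) not reproduced: $\Pi_z(x)$ is written as a sum, with alternating signs, of diagrams with marked vertices $0$ and $x$; some lines are drawn heavy.]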
The diagrams on the right side represent terms which are given by series that are absolutely bounded by the corresponding Feynman diagram with propagator $G^{(2)}_{|z|}(x)$. This is discussed at length in [17, Sect. 2.2.1] (see also [29, Sect. 5.5.1]). The diagrams are convergent when $z = z_c$ for $d > 8$, even if an additional vertex is added to one of the lines. They have an ultra-violet cutoff due to the lattice, and $z = z_c$ corresponds to massless propagators. The heavy lines in the diagrams play no special role for $\zeta = 1$, but will be discussed below for general $\zeta$.
As discussed in [19, Sect. 2.1], $\hat{F}_{z_c}(0) = 0$. Therefore $\hat{h}_{z_c}(0) = 1/(z_c \Omega)$. By (1.11),
\[ \frac{d}{dz} \bigl[ z g_z \bigr] = \hat{G}^{(2)}_z(0) \equiv \chi(z), \tag{4.5} \]
where the last equality defines $\chi(z)$. By [19, Lemma 2.1(iii)],
\[ \frac{d}{dz} \bigl[ z \hat{\Pi}_z(k) \bigr] = [z\Omega\chi(z) + 1] \hat{\Psi}_z(k). \tag{4.6} \]
Properties of $\hat{\Psi}_z(k)$ from [19] will be generalized below. The function $\hat{\Psi}_z(k)$ is represented by a sum of diagrams which are convergent when $z = z_c$ for $d > 8$, but not for $d \le 8$. The diagrams include the diagrams obtained by adding an additional vertex to the diagrams in $\hat{\Pi}_z(k)$, as well as others.
For $\zeta \neq 1$, the derivation of (4.1) and (4.6) is identical, except for the fact that (in $x$-space) there is a factor of $\zeta$ associated to each bond in the backbone path joining $0$ to $x$. Keeping track of this for the two-point function leads to
\[ \hat{G}^{(2)}_{z,\zeta}(k) = \frac{\hat{h}_{z,\zeta}(k)}{\hat{F}_{z,\zeta}(k)}, \qquad \hat{F}_{z,\zeta}(k) = 1 - \zeta z\Omega \hat{D}(k) \hat{h}_{z,\zeta}(k), \tag{4.7} \]
where
\[ \hat{h}_{z,\zeta}(k) = g_z + \hat{\Pi}_{z,\zeta}(k) \tag{4.8} \]
and
\[ \Pi_{z,\zeta}(x) = \sum_{\substack{\omega: 0 \to x \\ |\omega| \ge 1}} \zeta^{|\omega|} z^{|\omega|} \prod_{i=0}^{|\omega|} \sum_{R_i \ni \omega(i)} z^{|R_i|} \, J[0, |\omega|]. \tag{4.9} \]
Thus the only change for $\zeta \neq 1$ occurs in the new factor $\zeta$ appearing explicitly in $\hat{F}_{z,\zeta}(k)$ and in the factor $\zeta^{|\omega|}$ in $\hat{\Pi}_{z,\zeta}(k)$. The heavy lines of (4.4) represent the backbone paths carrying factors $\zeta$. Keeping track of this extra factor $\zeta^{|\omega|}$ throughout the calculations of [19, Sect. 4.1] gives
\[ \frac{d}{dz} \bigl[ z \hat{\Pi}_{z,\zeta}(k) \bigr] = [z\Omega\chi(z) + 1] \hat{\Psi}_{z,\zeta}(k). \tag{4.10} \]
The quantity $\hat{\Psi}_{z,\zeta}(k)$ is defined in the same way as its $\zeta = 1$ counterpart, except that a factor $\zeta^{|\omega|}$ must be inserted into [19, (4.17)] and [19, (4.24)]. An example of a diagram contributing to $\Psi_{z,\zeta}(x)$, with the backbone carrying $\zeta$ drawn with a heavy line, is
[Diagram not reproduced: a contribution to $\Psi_{z,\zeta}(x)$ with marked vertices $0$ and $x$ and the backbone drawn as a heavy line.]
For a double power series $f(z, \zeta) = \sum_{n,s} a_{n,s} z^n \zeta^s$, we define
\[ \|f(z, \zeta)\| = \sum_{n,s} |a_{n,s}| |z|^n |\zeta|^s. \tag{4.11} \]
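Basic bounds on the lace expansion are provided by the following lemma.
Lemma 4.1. (i) There is a constant $K$ such that for large $L$, $k \in [-\pi, \pi]^d$, $|z| \le z_c$ and $|\zeta| \le 1$,
\[ \|\hat{\Pi}_{z,\zeta}(k)\| \le K L^{1-d}, \quad \|\nabla_k^2 \hat{\Pi}_{z,\zeta}(k)\| \le K L^{3-d}, \quad \|\hat{h}_{z,\zeta}(k)\| \le K, \quad \|\hat{\Psi}_{z,\zeta}(k)\| \le K. \tag{4.12} \]
The critical point $z_c$ is bounded above and below by a constant multiple of $\Omega^{-1}$. In addition, $\hat{h}_{z_c,1}(0)$ is bounded away from zero, and $\lim_{L \to \infty} \hat{\Psi}_{z,\zeta}(k) = 0$ uniformly in $z, \zeta, k$ as above.
(ii) There is an $\epsilon > 0$ (depending on $d > 8$) such that the following are bounded uniformly in $k \in [-\pi, \pi]^d$, $|z| \le z_c$, $|\zeta| \le 1$ and large $L$:
\[ \|\delta_z^a \hat{h}_{z,\zeta}(k)\| \ (0 \le a < \tfrac{1}{2}), \qquad L^{-2} \|\delta_z^\epsilon \nabla_k^2 \hat{h}_{z,\zeta}(0)\|, \qquad L^{-2} \|\delta_z^\epsilon \nabla_k^2 \hat{F}_{z,\zeta}(0)\|, \tag{4.13} \]
\[ \|\delta_\zeta^{1+\epsilon} \hat{h}_{z,\zeta}(k)\|, \qquad \|\delta_\zeta^{1+\epsilon} \hat{F}_{z,\zeta}(k)\|, \qquad \|\delta_z^\epsilon \tfrac{d}{d\zeta} \hat{F}_{z,\zeta}(k)\|. \tag{4.14} \]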
(iii) The constants
\[ B_3 = \frac{1}{2d} \nabla_k^2 \hat{F}_{z_c}(0), \qquad B_4 = \hat{h}_{z_c}(0) B_2^{-1}, \qquad B_5 = -\frac{d}{d\zeta} \hat{F}_{z_c,1}(0) \tag{4.15} \]
are finite and positive, with $B_4$ and $B_5$ bounded away from zero and infinity uniformly in large $L$, and $B_3$ asymptotic to a multiple of $L^2$ as $L \to \infty$.
Proof. (i) It suffices to prove the statements for $\zeta = 1$, and the latter follow readily from the bounds of [17, 19]. (More is known about $z_c$; see [32].)
(ii) The bounds (4.13) follow from [19, Lemma 2.1] for $\zeta = 1$, and this implies the bounds for all $|\zeta| \le 1$. For the first bound of (4.14), the $\zeta$-dependence resides in $\hat{\Pi}_{z,\zeta}(k)$. Writing $\hat{\Pi}_{z,\zeta}(k) = \sum_{n,s} \hat{\pi}_{n,s}(k) z^n \zeta^s$, taking $1 + \epsilon$ $\zeta$-derivatives introduces a factor $s^{1+\epsilon}$, which is bounded by $s n^\epsilon$. The factor $s$ can be replaced by a sum over backbone sites, which corresponds diagrammatically to adding a vertex along the backbone. This gives diagrams which are included in $\hat{\Psi}_{z,\zeta}(k)$. The factor $n^\epsilon$ corresponds to taking $\epsilon$ $z$-derivatives, and the result remains bounded, since $\|\delta_z^\epsilon \hat{\Psi}_{z_c}(0)\|$ is shown to be finite in the proof of [19, Lemma 2.1(iii)]. The other bounds in (4.14) are similar.
(iii) The statements for $B_3$ follow by direct calculation of the derivative and (4.12). The statement for $B_4$ follows from (2.4). The statement for $B_5$ follows since, by definition, $B_5 = 1 + z_c \Omega \frac{d}{d\zeta} \hat{\Pi}_{z_c,1}(0)$, and the derivative on the right side goes to zero as $L \to \infty$ for the same reason $\hat{\Psi}_{z,\zeta}(k)$ does in (i).
Lemma 4.1(ii) can be used in conjunction with Lemma 3.1, to conclude for example that $\hat{h}_{z,\zeta}(k) - \hat{h}_{z_c,\zeta}(k) = O(|1 - z/z_c|^a)$. These two lemmas can also be combined with the elementary facts that $|1 - \cos t| \le O(t^\epsilon)$ and $|\cos t - 1 + t^2/2| \le O(t^{2+\epsilon})$ for small positive $\epsilon$, and that $|x| \le n$ if $x$ is a site in an $n$-bond lattice tree, to conclude for example that
\[ \hat{h}_{z_c}(k) - \hat{h}_{z_c}(0) = O(k^\epsilon) \tag{4.16} \]
and
\[ \hat{F}_z(k) - \hat{F}_z(0) - \frac{k^2}{2d} \nabla_k^2 \hat{F}_z(0) = O(L^2 k^{2+\epsilon}). \tag{4.17} \]
Explicitly, the latter estimate follows by writing the left side (using symmetry) as
\[ \sum_x F_z(x) \Bigl[ \cos(k \cdot x) - 1 + \frac{(k \cdot x)^2}{2} \Bigr], \tag{4.18} \]
bounding $|x|^{2+\epsilon}$ by $n^\epsilon |x|^2$, and using the last bound of (4.13) together with Lemma 3.1.
Keeping track of the various constants entering into the definition of the amplitude $D$ in [19, Sect. 2.3], and using (1.18), we have
\[ D_1^2 = 2^{3/2} d^{-1} \pi^{-1/2} D^2 = 2^{3/2} B_3 B_4^{-1}. \tag{4.19} \]
By Lemma 4.1(iii), $D_1$ grows linearly in $L$ for large $L$ (to leading order), as is to be expected in view of its role as a scaling constant in Theorems 1.1 and 1.2. Also, we can now define the constant $T_1$ of Theorem 1.2, by
\[ T_1 = 2^{1/2} B_5 B_4^{-1}. \tag{4.20} \]
The constant $b$ occurring in (2.4) is given, according to [19, Lemma 2.1(iii)], by
\[ b = z_c \Omega \hat{\Psi}_{z_c,1}(0). \tag{4.21} \]
To prove the next lemma, we require a lower bound
\[ \hat{F}_{z,\zeta}(k) \ge \mathrm{const.} \bigl[ k^2 + (1 - \zeta) + (1 - z/z_c)^{1/2} \bigr] \tag{4.22} \]
valid uniformly in $k \in [-\pi, \pi]^d$, small $(1 - z/z_c)$ and small $(1 - \zeta)$. To prove (4.22), we write
\[ \hat{F}_{z,\zeta}(k) = \bigl[ \hat{F}_{z,\zeta}(k) - \hat{F}_{z,\zeta}(0) \bigr] + \hat{F}_{z,1}(0) + \bigl[ \hat{F}_{z,\zeta}(0) - \hat{F}_{z,1}(0) \bigr]. \tag{4.23} \]
The third term on the right is bounded below by a multiple of $(1 - \zeta)$, for small $(1 - \zeta)$, by the middle bound of (4.14) and the behaviour of $B_5$ expressed in Lemma 4.1(iii). The second term on the right is equal to $\hat{h}_{z,1}(0) \chi(z)^{-1}$ and is bounded below by a multiple of $(1 - z/z_c)^{1/2}$, for small $(1 - z/z_c)$, by Lemma 4.1 and (2.3). Finally, the first term on the right is bounded below by a multiple of $k^2$, using the method of [17, (3.10)–(3.12)].
Lemma 4.2. There is an $\epsilon > 0$ (depending on $d > 8$) such that the following bounds hold uniformly in $k \in [-\pi, \pi]^d$, $|z| \le z_c$, $|\zeta| \le 1$:
\[ \|\delta_z^\epsilon \hat{\Psi}_{z,\zeta}(k)\| \le K, \qquad \|\delta_\zeta^\epsilon \hat{\Psi}_{z,\zeta}(k)\| \le K, \tag{4.24} \]
\[ \Bigl| \frac{d}{d\zeta} \hat{\Psi}_{z,\zeta}(k) \Bigr| \le O((1 - |\zeta|)^{-1}), \tag{4.25} \]
\[ \Bigl| \frac{d}{dz} \hat{\Psi}_{z,\zeta}(k) \Bigr| \le O(|1 - z/z_c|^{-1/2} (1 - |z|/z_c)^{-1/2}), \tag{4.26} \]
\[ \Bigl| \frac{d^2}{dz^2} \hat{\Psi}_{z,\zeta}(k) \Bigr| \le O(|1 - z/z_c|^{-1} (1 - |z|/z_c)^{-1}). \tag{4.27} \]
Proof. The first bound of (4.24) was obtained in the proof of [19, Lemma 2.1(iii)] for $\zeta = 1$, and this implies the corresponding bound for general $\zeta$. The second bound of (4.24) then follows, by bounding $s^\epsilon$ by $n^\epsilon$ as in the proof of (4.14). For (4.25), we argue as in the proof of Lemma 4.1(ii) that taking a $\zeta$-derivative amounts to adding a vertex (along the backbone) to the diagrams representing $\hat{\Psi}_{z,\zeta}(k)$. This yields diagrams with the same power counting as the pentagon; see [19, Sect. 4.3.2] for more details. These diagrams can be bounded above by Feynman diagrams with propagator $\hat{G}_{z_c,|\zeta|}(k)$. (Independent propagators arise via overcounting to remove interactions between diagram lines, and this can be done only for nonnegative activities, hence the absolute values.) This propagator is bounded above by $O([k^2 + (1 - |\zeta|)]^{-1})$, by (4.22), and the desired result then follows by power counting as in [19, Sect. 4.3.2]. To estimate Feynman diagrams as the propagator's mass goes to zero, we use the power counting bounds of [33, Theorem 2], as adapted in [10, Appendix A].
The bounds (4.26) and (4.27) are more delicate, due to the presence of $|1 - z/z_c|$ and not merely $(1 - |z|/z_c)$, and require further application of the lace expansion. We begin with (4.26). Writing $\hat{\Psi}_{z,\zeta}(k) = \sum_{n,s=0}^\infty \hat{\psi}_{n,s}(k) \zeta^s z^n$, the quantity $\psi_{n,s}(x)$ counts tree-like geometric objects containing the origin and $x$ and consisting of $n$ bonds including a "backbone" with $s$ steps. These geometric objects are not trees, due to the various self-intersections required by the definition of $\Psi$. Differentiation with respect to $z$ replaces the factor $z^n$ by $n z^{n-1}$, and the factor of $n$ arising here can be written as a sum over sites (to be accurate, there are $n + 1$ sites, but this is unimportant). Diagrammatically, this corresponds to adding a vertex to a line in $\Psi_{z,\zeta}(x)$ and a new line emanating from that vertex and terminating at a new vertex of degree 1. An example is the following:
[Diagram (4.28) not reproduced: a diagram with marked vertices $0$ and $x$ and a new line ending at a vertex of degree 1.]
Taking absolute values, overcounting by ignoring avoidance constraints between different lines in the diagram, and bounding $|\zeta|$ by 1, leads to a bound by $\chi(|z|) = O((1 - |z|/z_c)^{-1/2})$ times a diagram with degree of divergence equal to that of the pentagon with propagator bounded above by $O([k^2 + (1 - |z|/z_c)^{1/2}]^{-1})$. This gives an overall bound $(1 - |z|/z_c)^{-1}$, which is insufficient for our needs because it fails to provide the factor $|1 - z/z_c|^{-1/2}$. Such factors are important, because of the restriction $b_0 \ge 1$ in Lemma 3.2(i).
To recover the desired factor $|1 - z/z_c|^{-1/2}$ in the upper bound, we proceed in the same fashion used to derive (4.6). This involves performing a lace expansion to remove the interaction between the new line in (4.28) and the remainder of the diagram. The result is a formula of the form
\[ \frac{d}{dz} \bigl[ z \hat{\Psi}_{z,\zeta}(k) \bigr] = \hat{\Phi}^{(1)}_{z,\zeta}(k) [z\Omega\chi(z) + 1], \tag{4.29} \]
where $\hat{\Phi}^{(1)}$ is bounded by a sum of diagrams having the same power counting as the pentagon. The desired factor $|1 - z/z_c|^{-1/2}$ then arises as an upper bound on $\chi(z)$, while $\hat{\Phi}^{(1)}$ is bounded by $O((1 - |z|/z_c)^{-1/2})$.
The situation is similar for (4.27), making use of an additional lace expansion to obtain an identity
\[ \frac{d}{dz} \bigl[ z \hat{\Phi}^{(1)}_{z,\zeta}(k) \bigr] = \hat{\Phi}^{(2)}_{z,\zeta}(k) [z\Omega\chi(z) + 1], \tag{4.30} \]
with $\hat{\Phi}^{(2)}$ bounded by a sum of diagrams having the same power counting as the hexagon.
4.2. The three-point function. In this section, we introduce a generalization $\hat{\Psi}_{z,\boldsymbol{\zeta}}(\mathbf{k})$ of the function $\hat{\Psi}_{z,\zeta}(k)$ occurring in (4.10). Here $\boldsymbol{\zeta} = (\zeta_1, \zeta_2, \zeta_3)$ with each $|\zeta_j| \le 1$, and $\mathbf{k} = (k_1, k_2, k_3)$ with each $k_j \in [-\pi, \pi]^d$. The generalization involves a modification of the definition of $\hat{\Psi}_z(k)$ given in [19, Sect. 4.1], as follows.
In [19, (4.17)], we multiply the right side by a factor $e^{ik_1 \cdot y} \zeta_1^j e^{ik_2 \cdot (x - y)} \zeta_2^{|\omega| - j}$, where $j$ is such that $\omega(j) = y$. Summing this over $x$ and $y$ gives a contribution to $\hat{\Psi}_{z,\boldsymbol{\zeta}}(\mathbf{k})$ corresponding to [19, (4.17)]. In [19, (4.28)], we modify the right side by changing $F$ to include a factor $e^{ik_1 \cdot \omega(j)} \zeta_1^j e^{ik_2 \cdot (x - \omega(j))} \zeta_2^{|\omega| - j} e^{ik_3 \cdot (y - \omega(j))} \zeta_3^{|\omega'|}$ in the right side of [19, (4.24)]. This gives the same diagrams as for $\hat{\Psi}_z(k)$, except that now the diagrams contain factors as above. For a specific example arising as in [19, (4.24)], in the diagrammatic contribution
[Diagram not reproduced: a diagram with marked vertices $0$, $\omega(j)$, $x$ and $y$, drawn with a dashed line, a heavy line and a dotted line.]
to $\hat{\Psi}_{z,\boldsymbol{\zeta}}(\mathbf{k})$, the dashed line carries a factor $e^{ik_1 \cdot \omega(j)} \zeta_1^j$, the heavy line carries a factor $e^{ik_2 \cdot (x - \omega(j))} \zeta_2^{|\omega| - j}$, and the dotted line carries a factor $e^{ik_3 \cdot (y - \omega(j))} \zeta_3^{|\omega'|}$. Here $\omega'$ is the skeleton path joining $\omega(j)$ to $y$. The method of proof of Lemma 4.2 can then be used to prove the following lemma.
Lemma 4.3. The bounds of Lemma 4.2 hold, uniformly in $k \in [-\pi, \pi]^d$, $|z| \le z_c$, $|\zeta_j| \le 1$, if $\hat{\Psi}_{z,\zeta}(k)$ is replaced by $\hat{\Psi}_{z,\boldsymbol{\zeta}}(\mathbf{k})$ and if $\zeta$ is replaced by $\zeta_j$ in $\zeta$-derivatives and in the upper bound of (4.25).
The function $\hat{\Psi}_{z,\boldsymbol{\zeta}}(\mathbf{k})$ arises as a kind of renormalized vertex in the following lemma.
Lemma 4.4. The three-point function obeys the identity
\[ \hat{G}^{(3)}_{z,\boldsymbol{\zeta}}(\mathbf{k}) = \Bigl[ \hat{G}^{(2)}_{z,\zeta_3}(k_3) + \bigl( 1 + \zeta_3 z\Omega \hat{D}(k_3) \hat{G}^{(2)}_{z,\zeta_3}(k_3) \bigr) \hat{\Psi}_{z,\boldsymbol{\zeta}}(\mathbf{k}) \Bigr] \times \prod_{p=1}^{2} \bigl( \zeta_p z\Omega \hat{D}(k_p) \hat{G}^{(2)}_{z,\zeta_p}(k_p) + 1 \bigr). \tag{4.31} \]
Proof. By definition,
\[ \hat{G}^{(3)}_{z,\boldsymbol{\zeta}}(\mathbf{k}) = \sum_{y_1, y_2, y_3} \; \sum_{\omega: 0 \to y_1 \to y_1 + y_2} z^{|\omega|} \sum_{j=0}^{|\omega|} \delta_{\omega(j), y_1} \prod_{i: i \neq j} \sum_{R_i \ni \omega(i)} z^{|R_i|} \times \sum_{R_j \ni y_1, y_1 + y_3} z^{|R_j|} \, K[0, |\omega|] \, e^{i(k_1 \cdot y_1 + k_2 \cdot y_2 + k_3 \cdot y_3)} \zeta_1^j \zeta_2^{|\omega| - j} \zeta_3^{|\omega_3|}, \tag{4.32} \]
where $\omega_3 : y_1 \to y_1 + y_3$ is the backbone in $R_j$ joining $y_1$ to $y_1 + y_3$, and $K[0, |\omega|]$ is defined in [17, (2.3)] and serves to keep the ribs $R_0, \ldots, R_{|\omega|}$ from intersecting one another. By a simple modification of the proof of [29, Lemma 5.2.5] to account for the different notion of graph connectivity used in [17], it follows that
\[ K[0, |\omega|] = \sum_{I: I \ni j} K[0, I_1 - 1] \, J[I_1, I_2] \, K[I_2 + 1, |\omega|], \tag{4.33} \]
where the sum is over intervals $I = [I_1, I_2]$ of integers such that $0 \le I_1 \le j \le I_2 \le |\omega|$. Here $K[a, b] = 1$ if $a \ge b$, $J[a, a] = 1$, and the degenerate case $I = [j, j]$ is included as a term in the sum.
We insert (4.33) in (4.32). The term with $I = [0, |\omega|]$ and $|\omega| \ge 1$ is the same as the expression for $\frac{d}{dz} [z \hat{\Pi}_z(k)]$ given in [19, (4.5)], except (4.32) is more detailed due to its dependence on $\mathbf{k}$, $\boldsymbol{\zeta}$. Carrying out the analysis of [19, Sect. 4.1] while taking these additional variables into account, this term is equal to
\[ \bigl[ 1 + \zeta_3 z\Omega \hat{D}(k_3) \hat{G}^{(2)}_{z,\zeta_3}(k_3) \bigr] \hat{\Psi}_{z,\boldsymbol{\zeta}}(\mathbf{k}). \tag{4.34} \]
The factors $\prod_{p=1}^{2} [\zeta_p z\Omega \hat{D}(k_p) \hat{G}^{(2)}_{z,\zeta_p}(k_p) + 1]$ arise from the factors $K[0, I_1 - 1]$ and $K[I_2 + 1, |\omega|]$, with the unit terms arising when $I_1 = 0$ or $I_2 = |\omega|$. The term $\hat{G}^{(2)}_{z,\zeta_3}(k_3)$ in (4.31) arises from the case $I = [j, j]$.
Define
\[ V(z, \boldsymbol{\zeta}, \mathbf{k}) = \bigl[ 1 + \zeta_3 z\Omega \hat{D}(k_3) \hat{\Psi}_{z,\boldsymbol{\zeta}}(\mathbf{k}) \bigr] \prod_{p=1}^{2} \zeta_p z\Omega \hat{D}(k_p) \tag{4.35} \]
and, recalling (4.21) and (2.4),
\[ v = V(z_c, 1, 0) = (1 + b)(z_c \Omega)^2 = \frac{1}{2 B_2^2}. \tag{4.36} \]
Abbreviating the notation, (4.31) can then be written
\[ G^{(3)} = V G_1 G_2 G_3 + \Psi (\zeta_2 z\Omega D_2 G_2 + 1)(\zeta_1 z\Omega D_1 G_1 + 1) + \bigl[ G_3 + (1 + \zeta_3 z\Omega D_3 G_3) \Psi \bigr] (\zeta_1 z\Omega D_1 G_1 + \zeta_2 z\Omega D_2 G_2 + 1). \tag{4.37} \]
4.3. The $m$-point function, $m \ge 4$. This section will be used only in Sect. 7, where the case $m \ge 4$ of Theorem 1.1 is proved, so we restrict attention to the case of $\zeta = 1$ and $m \ge 4$. Consider the $m$-point function $G^{(m)}_z(\sigma; y)$ for lattice trees containing the "external" vertices $0, x_1, \ldots, x_{m-1}$, as in Fig. 4. Focussing on the three points $0, x_1, x_2$, we apply the proof of Lemma 4.4 to obtain an identity like (4.31), except now in each term in the expansion the geometric objects involved must contain the additional points $x_3, \ldots, x_{m-1}$. In the skeleton for a lattice tree containing $0, x_1, \ldots, x_{m-1}$, we decompose (4.32) according to the possibilities for how the skeleton paths from $x_3, \ldots, x_{m-1}$ connect to the backbone path $\omega$. Given an interval $I$, let $p_1, \tilde{p}_3, p_2$ respectively be the number of these $m - 3$ points which connect to $\omega$ in the intervals $[0, I_1 - 1]$, $[I_1, I_2]$, $[I_2 + 1, |\omega|]$. (If $I_1 = 0$ we take $p_1 = 0$ and if $I_2 = |\omega|$ we take $p_2 = 0$.) Then $G_1$ and $G_2$ in (4.31) get replaced by $G_1^{(p_1 + 2)}$ and $G_2^{(p_2 + 2)}$, with a suitable allocation of momenta $k_j$. Of the $\tilde{p}_3$ points connected to $[I_1, I_2]$, let $p_3$ be the number connected to the portion of the skeleton for $0, x_1, x_2$ which enters into $G_3$ and let $p_4$ be the number connected to the portion of the skeleton which enters into $\Psi$. Schematically, this can be represented as
[Schematic diagram not reproduced; its labels are $0$, $x_1$, $x_2$, $\Psi$, $p_1$, $p_2$, $p_3$ and $p_4$.]
although this figure does not show the topological richness of possible connections of $x_3, \ldots, x_{m-1}$ to the backbone. This gives an equation like (4.37) for $G^{(m)}$, with each $G_j$ replaced by $G_j^{(p_j + 2)}$ and $\Psi$ replaced by $\Psi^{(p_4)}$ on the right side. Here $\Psi^{(p_4)}$ denotes $\Psi$ with $p_4$ additional external lines emanating, and we will also write $V^{(p)}$ to denote $V$ with $\Psi$ replaced by $\Psi^{(p)}$. Specification of a shape $\sigma$ for $G^{(m)}$ imposes corresponding shape restrictions on the $G^{(p_j + 2)}$, as well as on the possible values for the $p_j$. Explicitly, (4.37) becomes
\[ G^{(m)} = \sum_{p_j \ge 0: \, p_1 + p_2 + p_3 + p_4 = m - 3} \Bigl[ G_1^{(p_1+2)} G_2^{(p_2+2)} G_3^{(p_3+2)} V^{(p_4)} + \Psi^{(p_4)} \bigl( z\Omega D_2 G_2^{(p_2+2)} + 1 \bigr) \bigl( z\Omega D_1 G_1^{(p_1+2)} + 1 \bigr) + \bigl[ (1 + z\Omega D_3 G_3^{(p_3+2)}) \Psi^{(p_4)} + G_3^{(p_3+2)} \bigr] \times \bigl[ z\Omega D_1 G_1^{(p_1+2)} + z\Omega D_2 G_2^{(p_2+2)} + 1 \bigr] \Bigr], \tag{4.38} \]
where on the right side, in terms in which a superscript involving a $p_j$ does not appear explicitly, e.g., $\Psi^{(p_4)} z\Omega D_1 G_1^{(p_1+2)}$, the missing $p_j$ variables (in this case, $p_2$ and $p_3$) are set equal to zero. Also, the $p_j$ should obey constraints relevant to a shape specification for $G^{(m)}$. Finally, we observe that for any $p \ge 1$,
\[ |\Psi^{(p)}| \le \frac{1}{(1 - |z|/z_c)^{p - \epsilon}}. \tag{4.39} \]
This follows as in the proof of (4.26) (except now no additional lace expansions are required) by taking absolute values, overcounting by removing avoidance interactions between diagram lines, bounding each of the $p$ external lines by $(1 - |z|/z_c)^{-1/2}$, and using power counting [33, 10] to bound the truncated diagram (which has $p$ extra vertices) by $(1 - |z|/z_c)^{-p/2}$. (A similar argument applies to the case where the additional $p$ points attach in such a way as to produce less than $p$ extra vertices to the truncated diagram.)
4.4. Corrigenda. We take this opportunity to make corrections to each of [17] and [19]. The corrections affect details within proofs but do not affect results.
4.4.1. Proof of [17, Lemma 3.1]. The inequality [17, (3.5)] is incorrect, because the number of unlabelled abstract trees with $n$ edges is not $(n + 1)^{n-1}/(n + 1)!$. The proof of [17, Lemma 3.1] can be recovered by replacing its last paragraph by the following paragraph. The net effect is to change the definition of $z_L$ on [17, p.1499] to $z_L = 1/(2e|\Omega_L|)$ and to change [17, (3.2)] to $g_z \le 4$ for $z \le z_L$. These changes result in only cosmetic changes elsewhere in [17].
The number of lattice trees or animals containing the origin is less than the number of lattice trees on the Bethe lattice (of coordination number $|\Omega_L|$) which contain the origin. By subadditivity [25] and the fact that $e|\Omega_L|$ is the analogue of $\lambda = z_c^{-1}$ for the Bethe lattice [32], the number of lattice trees on the Bethe lattice which contain the origin is bounded above by $(n + 1)(e|\Omega_L|)^n$. Therefore, for trees or animals,
\[ g_z \le \sum_{n=0}^\infty (n + 1)(e|\Omega_L| z)^n = \frac{1}{(1 - e|\Omega_L| z)^2}. \tag{4.40} \]
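The arithmetic behind (4.40) and the choice $z_L = 1/(2e|\Omega_L|)$ is easy to confirm. In the sketch below, `x` stands for the product $e|\Omega_L| z$, so that $z = z_L$ corresponds to `x = 0.5`; the function names and the truncation length are assumptions made for illustration.

```python
import math


def bethe_bound_partial_sum(x, terms=2000):
    """Partial sum of sum_{n>=0} (n + 1) x^n, the series in (4.40)."""
    return sum((n + 1) * x**n for n in range(terms))


def bethe_bound_closed_form(x):
    """Closed form 1 / (1 - x)^2 of the series, valid for |x| < 1."""
    return 1.0 / (1.0 - x) ** 2


if __name__ == "__main__":
    x_at_z_L = 0.5  # value of e * |Omega_L| * z at z = z_L
    print(bethe_bound_partial_sum(x_at_z_L), bethe_bound_closed_form(x_at_z_L))
    # z_L * g_z <= 4 * z_L = 2 / (e |Omega_L|), and 2 / e < 1.
    print(2 / math.e)
```

With `x = 0.5` both expressions equal 4, which is the bound $g_z \le 4$ for $z \le z_L$; the final line reflects the inequality $2/e < 1$ used for $z g_z$.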
Taking $z_L = 1/(2e|\Omega_L|)$, this gives $g_z \le 4$ for $z \le z_L$. Also, $z g_z \le 2/(e|\Omega_L|) < 1/|\Omega_L|$ if $z \le z_L$. By definition of $z_L$ and the fact that $z_c$ is bounded below by its Bethe lattice counterpart, this gives $z_c \ge (e|\Omega_L|)^{-1} = 2 z_L > z_L$. This replaces the two sentences above the proof of [17, Lemma 3.2].
4.4.2. Proof of [19, Theorem 1.1(b)]. The upper bound of [19, Lemma 2.1(v)] should be $K L^{2+d} (1 - |z|/z_c)^{-1}$ rather than $K L^{2+d} |1 - z/z_c|^{-1}$, because its proof uses [19, (3.11)]. The latter is proved only for nonnegative $z$ because its proof uses an inequality, valid only for nonnegative $z$, to remove interactions between distinct lines of a Feynman diagram.
The proof of [19, (2.40)] in the paragraph under [19, (2.41)] is therefore incorrect. In detail, in the equation under [19, (2.41)], with the constant equal to $B_1^{-2} [\hat{h}_{z_c}(0) \nabla_k^2 \hat{D}(0) + \nabla_k^2 \hat{\Pi}_{z_c}(0)]$, there is a contribution to $E_2(z)$ given by $\hat{F}_z(0)^{-2} [\nabla_k^2 \hat{\Pi}_z(0) - \nabla_k^2 \hat{\Pi}_{z_c}(0)]$. The derivative of this term with respect to $z$ contains a contribution $\hat{F}_z(0)^{-2} \frac{d}{dz} \nabla_k^2 \hat{\Pi}_z(0)$, which is bounded above by $O(|1 - z/z_c|^{-1}) O((1 - |z|/z_c)^{-1})$. This is a weaker bound than the $O(|1 - z/z_c|^{-2})$ claimed in [19], but the weaker bound is sufficient for the proof of [19, (2.40)], by Lemma 3.2(ii).
5. The Two-Point Function
Define $E^{(2)}_{z,\zeta}(k)$, as in (2.5), by
\[ \hat{G}^{(2)}_{z,\zeta}(k) = \frac{B_2 2^{3/2}}{D_1^2 k^2 + 2^{3/2} (1 - z/z_c)^{1/2} + 2 T_1 (1 - \zeta)} + E^{(2)}_{z,\zeta}(k), \tag{5.1} \]
where $T_1$ is given by (4.20). This section is devoted to the proof of the following theorem, which gives the $m = 2$ case of Theorem 2.1. Throughout this and subsequent sections we work with the spread-out model and assume without further mention that $d > 8$ and $L$ is sufficiently large. Similar methods can be applied to the nearest-neighbour model in sufficiently high dimensions.
Theorem 5.1. Let $k \in \mathbb{R}^d$, $\kappa = k D_1^{-1} n^{-1/4}$ and $t \in [0, \infty)$. For nearest-neighbour trees in sufficiently high dimensions $d \ge d_0$, and for sufficiently spread-out trees above eight dimensions, there is an $\epsilon > 0$ such that the coefficients defined in (2.9) and (2.8) obey the bounds
\[ |e_n^{(2)}(\kappa)| \le \mathrm{const.}\, z_c^{-n} n^{-\epsilon - 1/2} \tag{5.2} \]
and
\[ |e_n^{(2)}(\kappa, t T_1 n^{1/2})| \le \mathrm{const.}\, z_c^{-n} n^{-1 - \epsilon}. \tag{5.3} \]
This theorem generalizes a result of [19], where (5.2) was obtained for k = 0. To prove the theorem, we begin by extracting the main term from Gˆ (2) z,ζ (k). This will lead (2) (2) (k) will then be to an explicit expression for the error term Ez,ζ (k). Estimates on Ez,ζ combined with Lemma 3.2 to obtain the desired bounds (5.2) and (5.3). The leading behaviour of Gˆ (2) z,ζ (k) will be extracted using the following lemma, which extracts the leading behaviour of hˆ z,ζ (k) and Fˆz,ζ (k). This lemma generalizes the statements from [19] that hˆ z (0) = hˆ zc (0) + O(zc − z) and Fˆz (0) = B4 (1 − z/zc )1/2 + O(|1 − z/zc |1/2+ ) with (recalling (2.4) and the fact that zc hˆ zc (0) = 1) √ B4 = hˆ zc (0)B2−1 = B1 zc .
(5.4)
(5.5)
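Define $E_h(z, \zeta, k)$ and $E_F(z, \zeta, k)$ by
\[ \hat{h}_{z,\zeta}(k) = \hat{h}_{z_c,1}(0) + E_h(z, \zeta, k) \tag{5.6} \]
and
\[ \hat{F}_{z,\zeta}(k) = B_3 k^2 + B_4 (1 - z/z_c)^{1/2} + B_5 (1 - \zeta) + E_F(z, \zeta, k), \tag{5.7} \]
with the constants given by (4.15).
Lemma 5.2. There is an $\epsilon > 0$ such that, uniformly in $|z| \le z_c$, $|\zeta| \le 1$, $k \in [-\pi, \pi]^d$,
\[ |E_h(z, \zeta, \kappa)| \le O(n^{-\epsilon}) + O(|1 - \zeta|^\epsilon) + O(|1 - z/z_c|^\epsilon) \tag{5.8} \]
and
\[ |E_F(z, \zeta, \kappa)| \le O(L^2 n^{-1/2 - \epsilon}) + O(|1 - \zeta|^{1+\epsilon}) + O(|1 - z/z_c|^{1/2 + \epsilon}) + O(n^{-\epsilon} |1 - \zeta|) + O(L^2 n^{-1/2} |1 - z/z_c|^\epsilon) + O(|1 - \zeta| |1 - z/z_c|^\epsilon). \tag{5.9} \]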
Proof. For $\hat{h}_{z,\zeta}(k)$, we write
\[ \hat{h}_{z,\zeta}(k) = \hat{h}_{z_c}(0) + [\hat{h}_{z_c}(k) - \hat{h}_{z_c}(0)] + [\hat{h}_z(k) - \hat{h}_{z_c}(k)] + [\hat{h}_{z,\zeta}(k) - \hat{h}_z(k)]. \tag{5.10} \]
The last three terms on the right side are error terms, and (5.8) is a consequence of Lemma 4.1(ii) and Lemma 3.1, using (4.16) for the first of the error terms. For $\hat{F}_{z,\zeta}(k)$, we begin with the decomposition
\[ \hat{F}_{z,\zeta}(k) = \hat{F}_z(0) + [\hat{F}_z(k) - \hat{F}_z(0)] + [\hat{F}_{z,\zeta}(k) - \hat{F}_z(k)]. \tag{5.11} \]
The first term on the right side is $B_4 (1 - z/z_c)^{1/2} + O(|1 - z/z_c|^{1/2 + \epsilon})$, by (5.4). Using (4.17), and arguing as above, the second term is
\[ \hat{F}_z(k) - \hat{F}_z(0) = \frac{k^2}{2d} \nabla_k^2 \hat{F}_z(0) + O(L^2 k^{2+\epsilon}) = \frac{k^2}{2d} \nabla_k^2 \hat{F}_{z_c}(0) + O(L^2 k^2 |1 - z/z_c|^\epsilon) + O(L^2 k^{2+\epsilon}). \tag{5.12} \]
By Lemma 3.1 and Lemma 4.1(ii), the third term is
\[ \hat{F}_{z,\zeta}(k) - \hat{F}_z(k) = -\frac{d}{d\zeta} \hat{F}_{z,1}(k) (1 - \zeta) + O(|1 - \zeta|^{1+\epsilon}) = -\frac{d}{d\zeta} \hat{F}_{z_c,1}(0) (1 - \zeta) + O(k^\epsilon |1 - \zeta|) + O(|1 - \zeta| |1 - z/z_c|^\epsilon) + O(|1 - \zeta|^{1+\epsilon}). \tag{5.13} \]
This completes the proof.
For the leading behaviour of (5.7), we define
\[ \hat{F}^{(0)}_{z,\zeta}(k) = B_3 k^2 + B_4 (1 - z/z_c)^{1/2} + B_5 (1 - \zeta). \tag{5.14} \]
The following lemma gives lower bounds on $|\hat{F}_{z,\zeta}(k)|$ and $|\hat{F}^{(0)}_{z,\zeta}(k)|$ that will be used repeatedly throughout the remainder of this paper.
Lemma 5.3. (i) For $k^2 \ge 0$, $|z| \le z_c$, and $|\zeta| \le 1$,
\[ |\hat{F}^{(0)}_{z,\zeta}(k)| \ge \mathrm{const.} \bigl[ L^2 k^2 + |1 - \zeta| + |1 - z/z_c|^{1/2} \bigr]. \tag{5.15} \]
(ii) For sufficiently small $k^2 \ge 0$, and for $|z| \le z_c$, $|\zeta| \le 1$,
\[ |\hat{F}_{z,\zeta}(k)| \ge \mathrm{const.} \bigl[ k^2 + |1 - \zeta| + |1 - z/z_c|^{1/2} \bigr]. \tag{5.16} \]
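The proof of part (i) reduces to the elementary inequality (5.17) for a real number $c \ge 0$ and complex numbers $\alpha$, $\beta$ with $\mathrm{Re}\,\alpha \ge 0$ and $|\mathrm{Im}\,\beta| \le \mathrm{Re}\,\beta$. The sketch below samples random triples satisfying these constraints and records the smallest observed ratio $|c + \alpha + \beta| / (c + |\alpha| + |\beta|)$. It is a randomized illustration only, and the sampling ranges, seed and function name are arbitrary choices made here.

```python
import math
import random


def smallest_observed_ratio(trials=20000, seed=0):
    """Smallest sampled value of |c + alpha + beta| / (c + |alpha| + |beta|)."""
    rng = random.Random(seed)
    worst = math.inf
    for _ in range(trials):
        c = rng.uniform(0.0, 5.0)                                   # c >= 0
        alpha = complex(rng.uniform(0.0, 5.0), rng.uniform(-5.0, 5.0))  # Re(alpha) >= 0
        b1 = rng.uniform(0.0, 5.0)
        beta = complex(b1, rng.uniform(-b1, b1))                    # |Im(beta)| <= Re(beta)
        total = c + abs(alpha) + abs(beta)
        if total == 0.0:
            continue  # the inequality is trivial when all three terms vanish
        worst = min(worst, abs(c + alpha + beta) / total)
    return worst


if __name__ == "__main__":
    print(smallest_observed_ratio(), 1 / (3 * math.sqrt(2)))
```

The observed minimum stays above $1/(3\sqrt{2}) \approx 0.2357$, as (5.17) requires; the constant in (5.17) is convenient rather than sharp.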
Proof. (i) If $|\zeta| \le 1$ and $|z| \le z_c$, then $\mathrm{Re}(1 - \zeta) \ge 0$ and $0 \le |\mathrm{Im}[(1 - z/z_c)^{1/2}]| \le \mathrm{Re}[(1 - z/z_c)^{1/2}]$. The lower bound on $|\hat{F}^{(0)}_{z,\zeta}(k)|$ follows immediately from the general fact that if $c \ge 0$, $\alpha = a_1 + i a_2$ with $a_1 \ge 0$, and $\beta = b_1 + i b_2$ with $0 \le |b_2| \le b_1$, then
\[ |c + \alpha + \beta| \ge \frac{1}{3\sqrt{2}} (c + |\alpha| + |\beta|), \tag{5.17} \]
together with the fact that $B_3$ scales like $L^2$. To prove (5.17), we first note that
\[ |c + \alpha + \beta| \ge \mathrm{Re}(c + \alpha + \beta) = c + a_1 + b_1 \ge c \tag{5.18} \]
and similarly
\[ |c + \alpha + \beta| \ge b_1 \ge \frac{1}{\sqrt{2}} |\beta|. \tag{5.19} \]
It suffices to show that for $\gamma = g_1 + i g_2$ with $g_1 \ge 0$,
\[ |\gamma + \beta| \ge \frac{1}{\sqrt{2}} |\gamma|, \tag{5.20} \]
where $\gamma$ represents $c + \alpha$ and clearly $|c + \alpha| \ge |\alpha|$. To prove the inequality (5.20), we assume without loss of generality that $g_2 \ge 0$, and use the cosine law to see that
\[ |\gamma + \beta|^2 = |\gamma|^2 + |\beta|^2 - 2 |\gamma| |\beta| \cos\theta, \tag{5.21} \]
where $\theta$ is an angle in the interval $[\frac{\pi}{4}, \frac{5\pi}{4}]$. Therefore
\[ |\gamma + \beta|^2 \ge |\gamma|^2 + |\beta|^2 - \sqrt{2} |\gamma| |\beta|, \tag{5.22} \]
and taking the minimum of the right side with respect to $|\beta|$ then gives (5.20).
(ii) We consider separately the cases where $|1 - z/z_c|$ is small, or not, beginning with the former. By Lemma 5.2 and part (i),
\[ |\hat{F}_{z,\zeta}(k)| \ge |\hat{F}^{(0)}_{z,\zeta}(k)| - |E_F(z, \zeta, k)| \ge \mathrm{const.} (L^2 k^2 + |1 - \zeta| + |1 - z/z_c|^{1/2}) - |E_F(z, \zeta, k)|, \tag{5.23} \]
and the error term can be absorbed into the main term on the right side for sufficiently small $k^2$, $|1 - z/z_c| \le \delta$ and $|1-\zeta| \le \delta_1$.

Now suppose that $|1 - z/z_c| \le \delta$ and $|1-\zeta| \ge \delta_1$. For this, we recall (4.7) and (4.8), and write
\[
\hat F_{z,\zeta}(k) = (1-\zeta) + \zeta\Big[ \big(1 - z\hat h_z(0)\big) + \big(z\hat h_z(0) - z\hat D(k)\hat h_z(k)\big) + z\hat D(k)\big(\hat\Pi_z(k) - \hat\Pi_{z,\zeta}(k)\big) \Big].
\tag{5.24}
\]
Using Lemma 4.1, the fact that $\hat F_{z_c}(0) = 0$, and Lemma 3.1, the three terms in the square brackets are small respectively if $|1 - z/z_c|$ is small, $k$ is small, and $L$ is large. Under these conditions, the right side can be bounded below in absolute value by $\delta_1/2$. Taking $\delta$ to be smaller if necessary, for $|1 - z/z_c| \le \delta$ and $|1-\zeta| \ge \delta_1$, we thus have
\[
|\hat F_{z,\zeta}(k)| \ge \frac{\delta_1}{2} = \frac{\delta_1}{2}\,\frac{k^2 + |1-\zeta| + |1-z/z_c|^{1/2}}{k^2 + |1-\zeta| + |1-z/z_c|^{1/2}} \ge \mathrm{const.}\,\big(k^2 + |1-\zeta| + |1-z/z_c|^{1/2}\big).
\tag{5.25}
\]
It remains to consider the case of $|1 - z/z_c| \ge \delta$. For this we apply the maximum modulus principle to the function $1/\hat F_{z,\zeta}(k)$ (which is analytic in $|z| < z_c$ for fixed $\zeta$ and $k$) in the region $W$ inside $|z| < z_c$ but outside $|1 - z/z_c| \le \delta$. For the portion of the boundary of $W$ on the circle $|1 - z/z_c| = \delta$, the preceding paragraph gives an upper bound on $|1/\hat F_{z,\zeta}(k)|$. For the remainder of the boundary of $W$, we note that
\[ |\hat F_{z,\zeta}(k)| \ge 1 - z_c|g_z| - z_c|\hat\Pi_{z,\zeta}(k)| \ge 1 - z_c|g_z| - O(L^{1-d}) \tag{5.26} \]
and argue exactly as in the last two paragraphs of [19, Lemma 2.2]. This gives a uniform lower bound on $|\hat F_{z,\zeta}(k)|$ for $z$ in $W$ and on its boundary, and we can argue as in (5.25) to complete the proof.

The proof of Theorem 5.1 will also make use of the following bounds.

Lemma 5.4. The following bounds hold uniformly in $k \in [-\pi,\pi]^d$, $|z| \le z_c$ and $|\zeta| \le 1$, as $n \to \infty$:
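\begin{align}
\Big|\frac{d}{d\zeta}E_F(z,\zeta,\kappa)\Big| &\le O(|1-z/z_c|^{\epsilon}) + O(n^{-\epsilon}) + O(|1-\zeta|^{\epsilon}), \tag{5.27}\\
\Big|\frac{d}{dz}E_F(z,\zeta,\kappa)\Big| &\le O(|1-z/z_c|^{\epsilon-1/2}) + O(n^{-\epsilon}|1-z/z_c|^{-1/2}) + O(|1-\zeta|^{\epsilon}|1-z/z_c|^{-1/2}), \tag{5.28}\\
\Big|\frac{d^2}{dz\,d\zeta}E_F(z,\zeta,\kappa)\Big| &\le O(|1-z/z_c|^{-1/2}(1-|\zeta|)^{\epsilon-1}), \tag{5.29}\\
\Big|\frac{d}{d\zeta}E_h(z,\zeta,\kappa)\Big| &\le O(1), \tag{5.30}\\
\Big|\frac{d}{dz}E_h(z,\zeta,\kappa)\Big| &\le O(|1-z/z_c|^{-1/2}), \tag{5.31}\\
\Big|\frac{d^2}{dz\,d\zeta}E_h(z,\zeta,\kappa)\Big| &\le O(|1-z/z_c|^{-1/2}(1-|\zeta|)^{\epsilon-1}). \tag{5.32}
\end{align}
The bounds (5.30)–(5.32) hold also when $E_h(z,\zeta,\kappa)$ is replaced by $\hat F_{z,\zeta}(\kappa)$ on the left sides.

Proof. We begin with the derivatives of
\[ E_F(z,\zeta,\kappa) = \hat F_{z,\zeta}(\kappa) - \hat F^{(0)}_{z,\zeta}(\kappa). \tag{5.33} \]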
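For (5.27), using (4.15), and adding and subtracting a pair of terms, gives
\[
\begin{aligned}
\frac{d}{d\zeta}E_F(z,\zeta,\kappa) ={}& \Big[\frac{d}{d\zeta}\hat F_{z,\zeta}(\kappa) - \frac{d}{d\zeta}\hat F_{z,1}(\kappa)\Big] + \Big[\frac{d}{d\zeta}\hat F_{z,1}(\kappa) - \frac{d}{d\zeta}\hat F_{z,1}(0)\Big] \\
&+ \Big[\frac{d}{d\zeta}\hat F_{z,1}(0) - \frac{d}{d\zeta}\hat F_{z_c,1}(0)\Big].
\end{aligned}
\tag{5.34}
\]
The desired estimate then follows from Lemma 4.1 and Lemma 3.1, arguing as in (4.16) for the middle term. For (5.28), by (4.5), (4.10), (2.4) and (4.15), we have
\[
\begin{aligned}
\frac{d}{dz}E_F(z,\zeta,\kappa) &= -\zeta\hat D(\kappa)\,\frac{d}{dz}\big[z g_z + z\hat\Pi_{z,\zeta}(\kappa)\big] + \frac{B_4}{2z_c}(1-z/z_c)^{-1/2} \\
&= O(1) - \zeta\hat D(\kappa)\chi(z)\big[1 + z\hat\Psi_{z,\zeta}(\kappa)\big] + \frac{B_1}{2\sqrt{z_c}}(1-z/z_c)^{-1/2}.
\end{aligned}
\tag{5.35}
\]
By (2.4) and (4.21), (2.3) can be rewritten as
\[
\chi(z)\big[1 + z_c\hat\Psi_{z_c,1}(0)\big] - \frac{B_1}{2\sqrt{z_c}}(1-z/z_c)^{-1/2} = O(|1-z/z_c|^{\epsilon-1/2}).
\tag{5.36}
\]
If we add the left side of (5.36) to (5.35) and subtract the right side, we are left with estimating
\[
\chi(z)\Big[\big(1 + z_c\hat\Psi_{z_c,1}(0)\big) - \zeta\hat D(\kappa)\big(1 + z\hat\Psi_{z,\zeta}(\kappa)\big)\Big].
\tag{5.37}
\]
With (2.3) and (4.24), this leads to (5.28). Turning now to (5.29), we have
\[
\frac{d^2}{dz\,d\zeta}E_F(z,\zeta,\kappa) = \frac{d^2}{dz\,d\zeta}\hat F_{z,\zeta}(\kappa).
\tag{5.38}
\]
This derivative is dominated by the behaviour of
\[
\frac{d^2}{dz\,d\zeta}\,z\hat\Pi_{z,\zeta}(\kappa) = (z\chi(z) + 1)\,\frac{d}{d\zeta}\hat\Psi_{z,\zeta}(\kappa).
\tag{5.39}
\]
The factor of $\chi(z)$ is responsible for the factor $|1-z/z_c|^{-1/2}$ in the upper bound, and the $\zeta$-derivative of $\hat\Psi_{z,\zeta}(\kappa)$ gives the other factor, by (4.25).

For the error term
\[ E_h(z,\zeta,\kappa) = \hat h_{z,\zeta}(\kappa) - \hat h_{z_c,1}(0), \tag{5.40} \]
the bound (5.30) follows from Lemma 4.1. For (5.31), we use
\[
\frac{d}{dz}\,z\hat h_{z,\zeta}(\kappa) = \chi(z) + \hat\Psi_{z,\zeta}(\kappa)(z\chi(z) + 1),
\tag{5.41}
\]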
and (5.32) is similar to (5.29). The fact that the bounds on derivatives of $E_h$ imply the corresponding bounds for $\hat F$ is evident from $\hat F_{z,\zeta}(k) = 1 - \zeta z\hat D(k)\hat h_{z,\zeta}(k)$.

Proof of Theorem 5.1. We have
\[
\hat G^{(2)}_{z,\zeta}(k) = \frac{\hat h_{z,\zeta}(k)}{\hat F_{z,\zeta}(k)} = \frac{\hat h_{z_c}(0)}{\hat F^{(0)}_{z,\zeta}(k)} + \frac{E_h(z,\zeta,k)}{\hat F_{z,\zeta}(k)} - \frac{\hat h_{z_c}(0)E_F(z,\zeta,k)}{\hat F_{z,\zeta}(k)\hat F^{(0)}_{z,\zeta}(k)}.
\tag{5.42}
\]
By (4.15), (4.19), (4.20) and (5.5), this gives (5.1) with
\[
E^{(2)}_{z,\zeta}(k) = \frac{E_h(z,\zeta,k)}{\hat F_{z,\zeta}(k)} - \frac{\hat h_{z_c}(0)E_F(z,\zeta,k)}{\hat F_{z,\zeta}(k)\hat F^{(0)}_{z,\zeta}(k)}.
\tag{5.43}
\]
We absorb the constant $\hat h_{z_c}(0)$ into $E_F$; this has no effect on bounds. For future reference, we note that by Lemmas 5.2 and 5.3,
\[
|E^{(2)}_{z,1}(\kappa)| \le O\Big( \frac{n^{-\epsilon}}{|1-z/z_c|^{1/2}} + \frac{1}{|1-z/z_c|^{1/2-\epsilon}} + \frac{n^{-1/2-\epsilon}}{|1-z/z_c|} + \frac{n^{-1/2}}{|1-z/z_c|^{1-\epsilon}} \Big).
\tag{5.44}
\]
Denoting derivatives with respect to $z$ by primes, and omitting arguments to simplify the notation, the derivative of (5.43) is
\[
E' = \frac{E_h'}{F} - \frac{E_h F'}{F^2} - \frac{E_F'}{F F^{(0)}} + \frac{E_F F'}{F^2 F^{(0)}} + \frac{E_F (F^{(0)})'}{F (F^{(0)})^2}.
\tag{5.45}
\]
Consider first the case of $\zeta = 1$. To prove (5.2), we estimate each of the terms on the right side of (5.45) using Lemma 5.2, Lemma 5.3, and (5.28) and (5.31) of Lemma 5.4, and apply Lemma 3.2(ii). The procedure is similar for each term and we illustrate it only for the third term on the right side of (5.45). This term is bounded above by $O(|1-z/z_c|^{\epsilon-3/2}) + O(n^{-\epsilon}|1-z/z_c|^{-3/2})$, and considering also the other terms, the coefficient of $z^n$ in $E^{(2)}_{z,\zeta}(k)$ is bounded above by $O(z_c^{-n} n^{-\epsilon-1/2})$, as required. This completes the proof of (5.2).
To prove (5.3), we begin by differentiating (5.45) with respect to $\zeta$. Denoting $\zeta$-derivatives with dots, this gives
\[
\begin{aligned}
\dot E' ={}& \frac{\dot E_h'}{F} - \frac{E_h'\dot F}{F^2} - \frac{\dot E_h F'}{F^2} - \frac{E_h \dot F'}{F^2} + 2\frac{E_h F'\dot F}{F^3} - \frac{\dot E_F'}{F F^{(0)}} + \frac{E_F'\dot F}{F^2 F^{(0)}} \\
&+ \frac{E_F'\dot F^{(0)}}{F (F^{(0)})^2} + \frac{\dot E_F F'}{F^2 F^{(0)}} + \frac{E_F \dot F'}{F^2 F^{(0)}} - 2\frac{E_F F'\dot F}{F^3 F^{(0)}} - \frac{E_F F'\dot F^{(0)}}{F^2 (F^{(0)})^2} \\
&+ \frac{\dot E_F (F^{(0)})'}{F (F^{(0)})^2} + \frac{E_F (\dot F^{(0)})'}{F (F^{(0)})^2} - \frac{E_F (F^{(0)})'\dot F}{F^2 (F^{(0)})^2} - 2\frac{E_F (F^{(0)})'\dot F^{(0)}}{F (F^{(0)})^3}.
\end{aligned}
\tag{5.46}
\]
It is routine, if tedious, to estimate each of the terms on the right side using Lemmas 5.3 and 5.4, and apply Lemma 3.2(iii). We illustrate this for just two of the terms, since the procedure is similar for all terms. For the first term, by (5.32), we have
\[
\Big|\frac{\dot E_h'}{F}\Big| \le \frac{O(|1-z/z_c|^{-1/2}(1-|\zeta|)^{\epsilon-1})}{|F|}.
\tag{5.47}
\]
Since $|F| \ge c|1-\zeta|$ by Lemma 5.3, and sacrificing a factor of $|1-z/z_c|^{-1/2}$, this is bounded above by a multiple of
\[
\frac{1}{|1-z/z_c|\,|1-\zeta|\,(1-|\zeta|)^{1-\epsilon}}.
\tag{5.48}
\]
Using Lemma 3.2(iii), and restoring factors of $n^{-1}$ and $s^{-1} = (tT_1 n^{1/2})^{-1}$ for the two derivatives, the contribution to $e^{(2)}_n(\kappa,s)$ due to this term is at most of order
\[
n^{-1} s^{-1} z_c^{-n} (\log n)\, s^{1-\epsilon} = z_c^{-n} n^{-1-\epsilon'},
\tag{5.49}
\]
for some $0 < \epsilon' < \epsilon$. For the eleventh term, by Lemmas 5.2 and 5.4,
\[
\begin{aligned}
\Big|\frac{E_F F'\dot F}{F^3 F^{(0)}}\Big| \le{}& |F^3 F^{(0)}|^{-1}\Big[ |1-z/z_c|^{1/2+\epsilon} + n^{-1/2}\big(|1-z/z_c|^{\epsilon} + n^{-\epsilon}\big) \\
&+ |1-\zeta|\big(|1-\zeta|^{\epsilon} + n^{-\epsilon} + |1-z/z_c|^{\epsilon}\big)\Big]\,|1-z/z_c|^{-1/2}\,O(1).
\end{aligned}
\tag{5.50}
\]
The various terms on the right side are all handled in essentially the same way. For example, using $|F^3 F^{(0)}| \ge c|1-\zeta|^2|1-z/z_c|$, the last term can be bounded above by a multiple of
\[
\frac{1}{|1-z/z_c|^{3/2-\epsilon}\,|1-\zeta|}.
\tag{5.51}
\]
Applying Lemma 3.2(iii) as above, the contribution to $e^{(2)}_n(\kappa,s)$ due to this term is at most of order
\[
n^{-1} s^{-1} z_c^{-n}\, n^{1/2-\epsilon} \log s = z_c^{-n} n^{-1-\epsilon'}.
\tag{5.52}
\]
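The exponent bookkeeping behind (5.49) and (5.52) can be spelled out as follows. This is a sketch under two assumptions: that $t > 0$ is fixed, so that $s = tT_1 n^{1/2}$ grows like $n^{1/2}$, and that the "=" in those displays is meant as "is of order at most", with logarithms absorbed by lowering the exponent slightly.

```latex
% (5.49): substitute s = t T_1 n^{1/2}.
\[
  n^{-1} s^{-1} z_c^{-n} (\log n)\, s^{1-\epsilon}
  = z_c^{-n}\, n^{-1} s^{-\epsilon} \log n
  = (tT_1)^{-\epsilon}\, z_c^{-n}\, n^{-1-\epsilon/2} \log n ,
\]
% which is O(z_c^{-n} n^{-1-\epsilon'}) for any 0 < \epsilon' < \epsilon/2.
%
% (5.52): the factor s^{-1} contributes n^{-1/2}, cancelling the n^{1/2}.
\[
  n^{-1} s^{-1} z_c^{-n}\, n^{1/2-\epsilon} \log s
  = (tT_1)^{-1}\, z_c^{-n}\, n^{-1-\epsilon} \log s ,
\]
% which is O(z_c^{-n} n^{-1-\epsilon'}) for any 0 < \epsilon' < \epsilon, since \log s = O(\log n).
```

In both cases some $\epsilon'$ in the range $0 < \epsilon' < \epsilon$ works, which is all that the statement following (5.49) asks for.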
6. The Three-Point Function

In this section, we prove the $m = 3$ case of Theorem 2.1. For this, we define $E^{(3)}_{z,\zeta}(k)$, as in (2.5), by
\[
\hat G^{(3)}_{z,\zeta}(k) = B_2 2^{7/2} \prod_{j=1}^{3} \frac{1}{D_1^2 k_j^2 + 2^{3/2}(1-z/z_c)^{1/2} + 2T_1(1-\zeta_j)} + E^{(3)}_{z,\zeta}(k),
\tag{6.1}
\]
and as usual, write
\begin{align}
E^{(3)}_{z}(k) &= \sum_{n=0}^{\infty} e^{(3)}_n(k)\, z^n, & |z| &< z_c, \tag{6.2}\\
E^{(3)}_{z,\zeta}(k) &= \sum_{n=0}^{\infty}\sum_{s} e^{(3)}_n(k,s)\, \zeta^s z^n, & |z| &< z_c,\ |\zeta_j| \le 1. \tag{6.3}
\end{align}
We begin with the following lemma, which follows in a straightforward manner from (4.35) and Lemma 4.3.

Lemma 6.1. Defining $E_V(z,\zeta,k)$ by
\[ V(z,\zeta,k) = v + E_V(z,\zeta,k), \tag{6.4} \]
the bound
\[
E_V(z,\zeta,\kappa) = O(|1-z/z_c|^{\epsilon}) + \sum_{j=1}^{3} O(|1-\zeta_j|^{\epsilon}) + O(n^{-\epsilon})
\tag{6.5}
\]
holds uniformly in $k \in [-\pi,\pi]^d$, $|z| \le z_c$, $|\zeta| \le 1$. In the same domain, the following bounds also hold:
\begin{align}
\Big|\frac{d}{d\zeta_j}E_V(z,\zeta,\kappa)\Big| &\le O((1-|\zeta_j|)^{\epsilon-1}), \tag{6.6}\\
\Big|\frac{d}{dz}E_V(z,\zeta,\kappa)\Big| &\le O(|1-z/z_c|^{-1/2}(1-|z|/z_c)^{\epsilon-1/2}), \tag{6.7}\\
\Big|\frac{d^2}{dz^2}E_V(z,\zeta,\kappa)\Big| &\le O(|1-z/z_c|^{-1}(1-|z|/z_c)^{\epsilon-1}). \tag{6.8}
\end{align}
We divide the $m = 3$ case of Theorem 2.1 into two parts, and first prove the easier part, corresponding to taking $\zeta = 1$.

Theorem 6.2. Let $k \in \mathbb{R}^{3d}$, $\kappa = kD_1^{-1}n^{-1/4}$ and $t \in [0,\infty)^3$. For nearest-neighbour trees in sufficiently high dimensions $d \ge d_0$, and for sufficiently spread-out trees above eight dimensions,
\[ |e^{(3)}_n(\kappa)| \le \mathrm{const.}\, z_c^{-n} n^{1/2-\epsilon}. \tag{6.9} \]
Proof. Equation (4.37) states that
\[
G^{(3)} = V G_1 G_2 G_3 + \Psi(\zeta_2 z D_2 G_2 + 1)(\zeta_1 z D_1 G_1 + 1) + [G_3 + (1 + \zeta_3 z D_3 G_3)\Psi]\,(\zeta_1 z D_1 G_1 + \zeta_2 z D_2 G_2 + 1).
\tag{6.10}
\]
The contribution from the right side, excluding the first term, is $O(|1-z/z_c|^{-1})$ and hence its coefficient of $z^n$ is bounded by $O(z_c^{-n}\log n)$. Hence these terms are error terms. The first term, abbreviating (5.1) to $G_j = G^{(0)}_j + E_j$, can be written as
\[
V G_1 G_2 G_3 = v G^{(0)}_1 G^{(0)}_2 G^{(0)}_3 + v G^{(0)}_1 G^{(0)}_2 E_3 + v G^{(0)}_1 E_2 G_3 + v E_1 G_2 G_3 + E_V G_1 G_2 G_3.
\tag{6.11}
\]
In view of (4.36), the first term is equal to the first term on the right side of (6.1) (with $\zeta = 1$). The other four terms, together with the terms in (6.10) other than the first, comprise $E^{(3)}$. Applying the estimates $G_j = O(|1-z/z_c|^{-1/2})$, $G^{(0)}_j = O(|1-z/z_c|^{-1/2})$, (5.44), and Lemma 6.1 gives
\[
E^{(3)}_z(\kappa) = O\Big( \frac{n^{-\epsilon}}{|1-z/z_c|^{3/2}} + \frac{1}{|1-z/z_c|^{3/2-\epsilon}} + \frac{n^{-1/2-\epsilon}}{|1-z/z_c|^{2}} + \frac{n^{-1/2}}{|1-z/z_c|^{2-\epsilon}} \Big).
\tag{6.12}
\]
Applying Lemma 3.2(i) then yields the desired result.
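To see why (6.12) yields (6.9), one can count powers of $n$ term by term. The sketch below assumes that Lemma 3.2(i), which is not restated in this section, is a transfer bound of the standard form: a function bounded by $|1-z/z_c|^{-b}$ with $b > 1$ has $z^n$-coefficient $O(z_c^{-n} n^{b-1})$. The four terms are paired numerator to denominator in the only way that makes each contribute the order claimed in (6.9).

```latex
% Assumed form of Lemma 3.2(i):
%   |f(z)| \le |1 - z/z_c|^{-b},\ b > 1  \Longrightarrow  [z^n] f = O(z_c^{-n} n^{b-1}).
% Power of n contributed by each term of (6.12), for small \epsilon > 0:
\begin{align*}
  n^{-\epsilon}\,|1-z/z_c|^{-3/2}
    &\;\longrightarrow\; n^{-\epsilon}\cdot n^{1/2} = n^{1/2-\epsilon},\\
  |1-z/z_c|^{-(3/2-\epsilon)}
    &\;\longrightarrow\; n^{1/2-\epsilon},\\
  n^{-1/2-\epsilon}\,|1-z/z_c|^{-2}
    &\;\longrightarrow\; n^{-1/2-\epsilon}\cdot n = n^{1/2-\epsilon},\\
  n^{-1/2}\,|1-z/z_c|^{-(2-\epsilon)}
    &\;\longrightarrow\; n^{-1/2}\cdot n^{1-\epsilon} = n^{1/2-\epsilon}.
\end{align*}
% Each contribution, multiplied by z_c^{-n}, is O(z_c^{-n} n^{1/2-\epsilon}), as in (6.9).
```

The same count applied to the terms of order $|1-z/z_c|^{-1}$ corresponds to the borderline case $b = 1$, where the text records the bound $O(z_c^{-n}\log n)$ instead.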
We now complete the proof of the $m = 3$ case of Theorem 2.1.

Theorem 6.3. Let $k \in \mathbb{R}^{3d}$, $\kappa = kD_1^{-1}n^{-1/4}$ and $t \in [0,\infty)^3$. For nearest-neighbour trees in sufficiently high dimensions $d \ge d_0$, and for sufficiently spread-out trees above eight dimensions,
\[ |e^{(3)}_n(\kappa, tT_1 n^{1/2})| \le \mathrm{const.}\, z_c^{-n} n^{-1-\epsilon}. \tag{6.13} \]
Proof. By (6.10) and (6.11), we have
\[ G^{(3)} = v G^{(0)}_1 G^{(0)}_2 G^{(0)}_3 + \mathcal{E}_1 + \mathcal{E}_2, \tag{6.14} \]
where $\mathcal{E}_1$ denotes the right side of (6.11) apart from the first term, and $\mathcal{E}_2$ denotes the right side of (6.10) apart from the first term. Then $\mathcal{E}_1'$ consists of terms of the form (it is immaterial for our bounds whether we have the superscript (0) or not)
\[ v G_1' G_2 E_3, \qquad v G_1 G_2 E_3', \qquad E_V' G_1 G_2 G_3, \qquad E_V G_1' G_2 G_3, \tag{6.15} \]
while $\mathcal{E}_2'$ contains terms of the form
\[ \Psi' G_1 G_2, \qquad \Psi G_1' G_2. \tag{6.16} \]
(There are also lower order terms, such as $\Psi' G_1$, that can be handled similarly.) The proof involves showing that there are bounds on (6.15) and (6.16) sufficient to apply Lemma 3.2(iii) to conclude that their coefficient of $z^n \zeta_1^{t_1 T_1\sqrt{n}} \zeta_2^{t_2 T_1\sqrt{n}} \zeta_3^{t_3 T_1\sqrt{n}}$ is at most $O(z_c^{-n} n^{-\epsilon})$, corresponding to a bound $O(z_c^{-n} n^{-1-\epsilon})$ for the corresponding coefficient of $\mathcal{E}_1 + \mathcal{E}_2$.

Beginning with the term $v G_1' G_2 E_3$ in (6.15), it suffices to show that taking a $\zeta_3$ derivative gives rise to a bound $O(z_c^{-n} n^{1/2-\epsilon})$. This derivative acts only on $E_3$, and, as in (5.45),
\[
\dot E_3 = \frac{\dot E_h}{F_3} - \frac{E_h \dot F_3}{F_3^2} - \frac{\dot E_F}{F_3^2} + \frac{2E_F \dot F_3}{F_3^3}.
\tag{6.17}
\]
Each of these four terms can be bounded above using Lemmas 5.2, 5.3 and 5.4. We bound $G_1' = h_1' F_1^{-1} - h_1 F_1' F_1^{-2}$ by $|1-z/z_c|^{-1}|1-\zeta_1|^{-1}$, and $G_2$ by $|1-\zeta_2|^{-1}$. Then we apply Lemma 3.2(iii). For $v G_1 G_2 \dot E_3'$, we bound the factors $G_i$ by $|1-\zeta_i|^{-1}$, which is neutral regarding powers of $n$, and bound $\dot E_3'$ exactly as in the proof of Theorem 5.1. For $E_V' G_1 G_2 G_3$ we take another $z$-derivative, obtaining terms of the form
\[ E_V'' G_1 G_2 G_3, \qquad E_V' G_1' G_2 G_3. \tag{6.18} \]
These terms give rise to at most $n^{1-\epsilon}$, since they are bounded above respectively by
\[
\frac{1}{|1-z/z_c|\,(1-|z|/z_c)^{1-\epsilon}}\;\frac{1}{|1-\zeta_1|}\;\frac{1}{|1-\zeta_2|}\;\frac{1}{|1-\zeta_3|}
\tag{6.19}
\]
and
\[
\frac{1}{|1-z/z_c|^{1/2}(1-|z|/z_c)^{1/2-\epsilon}}\;\frac{1}{|1-z/z_c|\,|1-\zeta_1|}\;\frac{1}{|1-\zeta_2|}\;\frac{1}{|1-\zeta_3|}.
\tag{6.20}
\]
For $E_V G_1' G_2 G_3$ we take a $\zeta_3$-derivative, obtaining
\[ \dot E_V G_1' G_2 G_3, \qquad E_V G_1' G_2 \dot G_3, \tag{6.21} \]
and show that these terms give rise to at most $n^{1/2-\epsilon}$. For $\dot E_V G_1' G_2 G_3$, this follows from the upper bound
\[
\frac{1}{(1-|\zeta_3|)^{1-\epsilon}}\;\frac{1}{|1-z/z_c|\,|1-\zeta_1|}\;\frac{1}{|1-\zeta_2|}\;\frac{1}{|1-\zeta_3|}.
\tag{6.22}
\]
For $E_V G_1' G_2 \dot G_3$, all contributions to $E_V$ can be handled in a similar manner, except for the contribution of order $|1-\zeta_2|^{\epsilon}$, since this leaves too few powers of $|1-\zeta_2|$ overall. For this, we take a further $\zeta_2$-derivative, and then proceed as above.

For (6.16), it suffices to show that the derivative with respect to $z$ of the first term gives $n^{1-\epsilon}$ and of the second with respect to $\zeta_3$ gives $n^{1/2-\epsilon}$. Beginning with the second, the derivative with respect to $\zeta_3$ acts only on the $\Psi$ factor. Sacrificing a factor of $|1-\zeta_3|^{-1}$, $\dot\Psi G_1' G_2$ is bounded above by
\[
\frac{1}{(1-|\zeta_3|)^{1-\epsilon}}\;\frac{1}{|1-\zeta_1|\,|1-z/z_c|}\;\frac{1}{|1-\zeta_2|}\;\frac{1}{|1-\zeta_3|}.
\tag{6.23}
\]
Similarly, $\Psi'' G_1 G_2 + \Psi' G_1' G_2$ is bounded above by
\[
\begin{aligned}
&\frac{1}{(1-|z|/z_c)^{1-\epsilon}\,|1-z/z_c|}\;\frac{1}{|1-\zeta_1|}\;\frac{1}{|1-\zeta_2|}\;\frac{1}{|1-\zeta_3|} \\
&\quad + \frac{1}{(1-|z|/z_c)^{1/2-\epsilon}\,|1-z/z_c|^{1/2}}\;\frac{1}{|1-\zeta_1|\,|1-z/z_c|}\;\frac{1}{|1-\zeta_2|}\;\frac{1}{|1-\zeta_3|}.
\end{aligned}
\tag{6.24}
\]
This is sufficient.
7. The m-Point Functions, m ≥ 4

In this section, we prove Theorem 2.1 for $m \ge 4$, and comment on difficulties encountered in attempting to extend the proof of (2.12) to $m \ge 4$.

Proof of Theorem 2.1 for $m \ge 4$. By Lemma 3.2(i), it suffices to show that
\[
|E^{(m)}_z(\sigma;\kappa)| \le \frac{1}{(1-|z|/z_c)^{m-3}}\;\frac{1}{|1-z/z_c|^{3/2}}\,\big(n^{-\epsilon} + |1-z/z_c|^{\epsilon}\big)\, S_m, \qquad (m \ge 4),
\tag{7.1}
\]
where $S_m$ is an expression of the form
\[
S_m = \sum_{j \in J_m} \Big( \frac{n^{-1/2}}{|1-z/z_c|^{1/2}} \Big)^{j},
\tag{7.2}
\]
with $J_m$ a finite subset of the natural numbers whose precise nature is immaterial. In view of Lemma 3.2(i), the factor $S_m$ has no effect on the resulting bound on $e^{(m)}_n(\sigma;\kappa)$.

We will prove (7.1) by induction on $m$. In (7.1) and throughout the proof, we will omit constant factors in upper bounds. For $m = 2, 3$, from (5.44) and (6.12) we already have the bounds
\[
|E^{(m)}| \le \frac{1}{|1-z/z_c|^{m-3/2}} \Big( n^{-\epsilon} + |1-z/z_c|^{\epsilon} + \frac{n^{-1/2-\epsilon}}{|1-z/z_c|^{1/2}} + \frac{n^{-1/2}}{|1-z/z_c|^{1/2-\epsilon}} \Big).
\tag{7.3}
\]
In particular, (7.1) holds for $m = 3$ with $J_3 = \{0, 1\}$. This begins the induction. We assume that (7.1) holds for $3 \le l < m$, and then show that it must also hold for $m$ itself. By Lemmas 4.1 and 5.3,
\[
|G^{(2)}| \le \frac{1}{|1-z/z_c|^{1/2}}.
\tag{7.4}
\]
Also, it follows from the induction hypothesis (7.1) that for $3 \le l < m$,
\[
|G^{(l)}| \le \frac{1}{|1-z/z_c|^{l-3/2}} + |E^{(l)}| \le \frac{1}{(1-|z|/z_c)^{l-3}}\;\frac{1}{|1-z/z_c|^{3/2}}\,(1 + S_l).
\tag{7.5}
\]
To advance the induction, we use (4.38). In (4.38), the first term on the right side is the main term and the remaining terms are error terms. Since these error terms can be treated by the same method as the error terms present within the main term itself, we discuss only the main term. Thus we wish to consider terms of the form
\[
G^{(p_1+2)} G^{(p_2+2)} G^{(p_3+2)} V^{(p_4)},
\tag{7.6}
\]
with $p_1 + p_2 + p_3 + p_4 = m - 3$. In particular, $p_j \le m - 3$ for each $j$. Writing $G^{(q)} = G^{(q)}_0 + E^{(q)}$, with $G^{(q)}_0$ the leading term for the $q$-point function, this can be written as
\[
\begin{aligned}
&G^{(p_1+2)}_0 G^{(p_2+2)}_0 G^{(p_3+2)}_0 V^{(p_4)} + E^{(p_1+2)} G^{(p_2+2)} G^{(p_3+2)} V^{(p_4)} \\
&\quad + G^{(p_1+2)}_0 E^{(p_2+2)} G^{(p_3+2)} V^{(p_4)} + G^{(p_1+2)}_0 G^{(p_2+2)}_0 E^{(p_3+2)} V^{(p_4)}.
\end{aligned}
\tag{7.7}
\]
Consider first the case of $p_4 = 0$. By Lemma 6.1, the first term in (7.7) can be written as
\[
G^{(p_1+2)}_0 G^{(p_2+2)}_0 G^{(p_3+2)}_0 \big(v + O(n^{-\epsilon}) + O(|1-z/z_c|^{\epsilon})\big).
\tag{7.8}
\]
The first term here gives the desired leading behaviour of $G^{(m)}$, since $p_1 + p_2 + p_3 = m - 3$, and hence by (4.36)
\[
v \prod_{j=1}^{3} \big(B_2 2^{2(p_j+2)-5/2}\big) = B_2 2^{2m-5/2}.
\tag{7.9}
\]
The remaining contribution from (7.8) is
\[
\frac{1}{|1-z/z_c|^{m-3/2}}\,\big(n^{-\epsilon} + |1-z/z_c|^{\epsilon}\big),
\tag{7.10}
\]
consistent with (7.1). Since $G^{(l)}_0$ also obeys the bounds (7.4) and (7.5), we can treat the last three terms of (7.7) in the same way by using (7.4) and (7.5) for both $G^{(l)}_0$ and $G^{(l)}$. We discuss only the first of these terms. Since $p_j + 2 \le m - 1$ for each $j$, the induction hypothesis can be applied to the first three factors. Treating separately the cases where any one of the $p_j$'s is zero, and using (7.4), (7.5), (7.3) and (7.1) gives a bound of the desired form in any case. As an illustration, consider the case where $p_1, p_2 \ge 1$ and $p_3 = 0$, for which $p_1 + p_2 = m - 3$. The resulting bound is
\[
|E^{(p_1+2)} G^{(p_2+2)} G^{(2)} V^{(0)}| \le \frac{1}{(1-|z|/z_c)^{p_1+p_2-2}}\;\frac{1}{|1-z/z_c|^{7/2}} \times \big(n^{-\epsilon} + |1-z/z_c|^{\epsilon}\big)\, S_{p_1+2}\,(1 + S_{p_2+2}),
\tag{7.11}
\]
consistent with (7.1). This concludes the discussion of the case $p_4 = 0$.

Now consider the case of $p_4 > 0$. Since $V^{(p_4)}$ can be expressed in terms of $\Psi^{(p_4)}$, by (4.39) the first term of (7.7) is bounded by
\[
\frac{1}{(1-|z|/z_c)^{p_4-\epsilon}}\;\frac{1}{|1-z/z_c|^{p_1+p_2+p_3+3/2}},
\tag{7.12}
\]
which is better than necessary for (7.1). The second term of (7.7) is bounded by
\[
E^{(p_1+2)} G^{(p_2+2)} G^{(p_3+2)}\,\frac{1}{(1-|z|/z_c)^{p_4-\epsilon}},
\tag{7.13}
\]
and we can proceed as in the second half of the previous paragraph.
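As a consistency check on the leading constant in (7.9) above, the powers of 2 can be added explicitly. The sketch below uses only $p_1 + p_2 + p_3 = m - 3$ (the case $p_4 = 0$). The resulting relation between $v$ and $B_2$ is what (7.9) implicitly requires; it is presumably the content of (4.36), which is not restated here, so it is labelled as an assumption.

```latex
% Sum of the exponents of 2 over the three factors:
\[
  \sum_{j=1}^{3}\Big(2(p_j+2)-\tfrac52\Big)
  = 2(p_1+p_2+p_3) + 12 - \tfrac{15}{2}
  = 2(m-3) + \tfrac92
  = 2m - \tfrac32 .
\]
% Hence the left side of (7.9) is
\[
  v\prod_{j=1}^{3}\Big(B_2\,2^{2(p_j+2)-5/2}\Big)
  = v\,B_2^{3}\,2^{2m-3/2}
  = \big(2vB_2^{2}\big)\,B_2\,2^{2m-5/2},
\]
% so (7.9) holds exactly when 2 v B_2^2 = 1 (assumed here to be what (4.36) provides).
% For m = 3 the constant is B_2 2^{7/2}, matching the prefactor in (6.1).
```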
We conclude with a brief discussion of what goes wrong in attempting to extend our methods to prove (2.12) for $m \ge 4$. To begin a proof, we would first require the straightforward extension of (4.38) to include $\zeta$ dependence, and proceed as above. The leading term in this case should again be $v G^{(p_1+2)}_0 G^{(p_2+2)}_0 G^{(p_3+2)}_0$, as in (7.8) and as in the proof of Theorem 1.2. Our difficulty arises in showing that some of the other terms are error terms. Consider, for example, the term
\[
G^{(2)}_0 G^{(2)}_0 G^{(2)}_0 V^{(1)}
\tag{7.14}
\]
arising in the case of $m = 4$ with $p_1 = p_2 = p_3 = 0$, $p_4 = 1$. To show that this is an error term, we expect to have to take two derivatives before applying Lemma 3.2, as was done for $m = 2, 3$. Our method requires an upper bound, after having taken appropriate derivatives, containing in particular the factors
\[
\frac{1}{|1-z/z_c|} \prod_{j=1}^{5} \frac{1}{|1-\zeta_j|}
\tag{7.15}
\]
corresponding to the restriction $b_p \ge 1$ in Lemma 3.2(iii), as well as additional factors leading to the desired $n^{-1-\epsilon}$ behaviour for (7.14), as in the proof of Theorem 6.3. However, the dependence of (7.14) on one of the $\zeta$'s, say $\zeta_3$, lies entirely on a portion of the backbone residing inside $\Psi$. To bound $\Psi$ (which inherits an extra vertex from the extra line due to $p_4 = 1$), we must take absolute values before bounding above by removing avoidance constraints between different lines in the Feynman diagram representing $\Psi$. This gives bounds in terms of an inverse power of $(1-|\zeta_3|)$ and cannot produce the essential factor $|1-\zeta_3|^{-1}$. The situation worsens as $m$ increases.

Acknowledgement. We are grateful to Takashi Hara for numerous useful and stimulating conversations. We thank David Aldous, Jean-François Le Gall and Ed Perkins for enlightening correspondence concerning ISE, and Buks van Rensburg for providing Fig. 1. E.D. thanks Sichun Wang for several helpful discussions. This work was supported in part by NSERC grant OGP0009351. The work of G.S. was also supported in part by an Invitation Fellowship of the Japan Society for the Promotion of Science.
References

1. Aizenman, M.: On the number of incipient spanning clusters. Nucl. Phys. B [FS] 485, 551–582 (1997)
2. Aizenman, M. and Newman, C.M.: Tree graph inequalities and critical behavior in percolation models. J. Stat. Phys. 36, 107–143 (1984)
3. Aldous, D.: The continuum random tree. I. Ann. Probab. 19, 1–28 (1991)
4. Aldous, D.: The continuum random tree II: An overview. In: M.T. Barlow and N.H. Bingham, editors, Stochastic Analysis. Cambridge: Cambridge University Press, 1991, pp. 23–70
5. Aldous, D.: The continuum random tree III. Ann. Probab. 21, 248–289 (1993)
6. Aldous, D.: Tree-based models for random distribution of mass. J. Stat. Phys. 73, 625–641 (1993)
7. Bovier, A., Fröhlich, J. and Glaus, U.: Branched polymers and dimensional reduction. In: K. Osterwalder and R. Stora, editors, Critical Phenomena, Random Systems, Gauge Theories, Les Houches 1984. Amsterdam: North-Holland, 1986
8. Brydges, D.C. and Spencer, T.: Self-avoiding walk in 5 or more dimensions. Commun. Math. Phys. 97, 125–148 (1985)
9. Dawson, D. and Perkins, E.: Measure-valued processes and renormalization of branching particle systems. In: R. Carmona and B. Rozovskii, editors, Stochastic Partial Differential Equations: Six Perspectives. Providence, RI: AMS Math. Surveys and Monographs, 1997
10. Derbez, E.: The scaling limit of lattice trees above eight dimensions. PhD thesis, McMaster University, 1996
11. Derbez, E. and Slade, G.: Lattice trees and super-Brownian motion. Canad. Math. Bull. 40, 19–38 (1997)
12. Flajolet, P. and Odlyzko, A.: Singularity analysis of generating functions. SIAM J. Disc. Math. 3, 216–240 (1990)
13. Golowich, S. and Imbrie, J.Z.: A new approach to the long-time behavior of self-avoiding random walks. Ann. Phys. 217, 142–169 (1992)
14. Gradshteyn, I.S. and Ryzhik, I.M.: Table of Integrals, Series and Products. New York: Academic Press, 4th edition, 1965
15. Grimmett, G.: Percolation. Berlin: Springer, 1989
16. Hara, T. and Slade, G.: Mean-field critical behaviour for percolation in high dimensions. Commun. Math. Phys. 128, 333–391 (1990)
17. Hara, T. and Slade, G.: On the upper critical dimension of lattice trees and lattice animals. J. Stat. Phys. 59, 1469–1510 (1990)
18. Hara, T. and Slade, G.: The lace expansion for self-avoiding walk in five or more dimensions. Reviews in Math. Phys. 4, 235–327 (1992)
19. Hara, T. and Slade, G.: The number and size of branched polymers in high dimensions. J. Stat. Phys. 67, 1009–1038 (1992)
20. Hara, T. and Slade, G.: Self-avoiding walk in five or more dimensions. I. The critical behaviour. Commun. Math. Phys. 147, 101–136 (1992)
21. Hara, T. and Slade, G.: Mean-field behaviour and the lace expansion. In: G. Grimmett, editor, Probability and Phase Transition. Dordrecht: Kluwer, 1994
22. Janse van Rensburg, E.J.: On the number of trees in $\mathbb{Z}^d$. J. Phys. A: Math. Gen. 25, 3523–3528 (1992)
23. Janse van Rensburg, E.J. and Madras, N.: A non-local Monte–Carlo algorithm for lattice trees. J. Phys. A: Math. Gen. 25, 303–333 (1992)
24. Khanin, K.M., Lebowitz, J.L., Mazel, A.E. and Sinai, Ya.G.: Self-avoiding walks in five or more dimensions: polymer expansion approach. Russian Math. Surveys 50, 403–434 (1995)
25. Klein, D.J.: Rigorous results for branched polymer models with excluded volume. J. Chem. Phys. 75, 5186–5189 (1981)
26. Le Gall, J.-F.: The uniform random tree in a Brownian excursion. Probab. Th. Rel. Fields 96, 369–383 (1993)
27. Lubensky, T.C. and Isaacson, J.: Statistics of lattice animals and dilute branched polymers. Phys. Rev. A20, 2130–2146 (1979)
28. Madras, N.: A rigorous bound on the critical exponent for the number of lattice trees, animals and polygons. J. Stat. Phys. 78, 681–699 (1995)
29. Madras, N. and Slade, G.: The Self-Avoiding Walk. Boston: Birkhäuser, 1993
30. Nguyen, B.G. and Yang, W-S.: Triangle condition for oriented percolation in high dimensions. Ann. Probab. 21, 1809–1844 (1993)
31. Nguyen, B.G. and Yang, W-S.: Gaussian limit for critical oriented percolation in high dimensions. J. Stat. Phys. 78, 841–876 (1995)
32. Penrose, M.D.: Self-avoiding walks and trees in spread-out lattices. J. Stat. Phys. 77, 3–15 (1994)
33. Reisz, T.: A convergence theorem for lattice Feynman integrals with massless propagators. Commun. Math. Phys. 116, 573–606 (1988)
34. Tasaki, H. and Hara, T.: Critical behaviour in a system of branched polymers. Prog. Theor. Phys. Suppl. 92, 14–25 (1987)

Communicated by D. Brydges
Commun. Math. Phys. 193, 105 – 124 (1998)
Communications in Mathematical Physics © Springer-Verlag 1998
Determinant Bundles, Manifolds with Boundary and Surgery II. Spectral Sections and Surgery Rules for Anomalies

Paolo Piazza

Istituto Matematico “G. Castelnuovo”, Università degli Studi di Roma “La Sapienza”, P.le Aldo Moro 2, 00185 Roma, Italy. E-mail: [email protected]

Received: 23 October 1996 / Accepted: 28 July 1997
Abstract: Let $\phi : M \to B$ be a closed fibration of Riemannian manifolds and let $\eth = (\eth_z)$, $z \in B$, be a family of generalized Dirac operators. Let $H \subset M$ be an embedded hypersurface fibering over $B$; $\chi : H \to B$. Let $\eth_H = (\eth_{H,z})$ be the Dirac family induced on $\chi : H \to B$. Each fiber $\phi^{-1}(z) = M_z$ in $\phi : M \to B$ is the union along $\chi^{-1}(z) = H_z$ of two manifolds with boundary $M_z^0$, $M_z^1$. In this paper, generalizing our previous work [16], we prove general surgery rules for the local and global anomalies of the Bismut–Freed connection on the determinant bundle associated to $\eth$. Our results depend heavily on the b-calculus [12], on the surgery calculus [11] and on the APS family index theory developed in [13], in particular on the notion of spectral section for the family $\eth_H$.

Introduction

Surgery rules for the local and global anomalies on determinant bundles were first considered in [16]. The main result there can be informally presented as follows. Consider $M$, a fibration of closed Riemannian manifolds of even dimension with base space $B$ and assume that $M$ is the union along a fibering hypersurface $H$ of two fibrations with boundary $M^0$, $M^1$. Thus each fiber $M_z$ of $M \to B$ is the union along $H_z$ of two manifolds with boundary, $M_z = M_z^0 \cup_{H_z} M_z^1$. If the fibres are spin manifolds there are well defined Dirac families associated to these fibrations $M$, $M^0$, $M^1$ and it is natural to try to relate the associated determinant line bundles. This is the program pursued in [16] but under the assumption that the Dirac family induced on the fibering hypersurface has null space of constant dimension. Under this additional hypothesis it is proved that the determinant bundle associated to the closed fibration is the tensor product of the determinant bundles associated to the two fibrations with boundary through the Atiyah–Patodi–Singer boundary value problem. Moreover the curvature (the holonomy) of the respective Bismut–Freed connections satisfy the natural additivity (multiplicative) formula. We refer the reader to [16] for the precise statements. In this paper we shall lift the
above assumption and give general surgery rules for the curvature and the holonomy of the Bismut–Freed connection.

The paper is organized as follows. In Sect. 1, following [13] and the beginning of [16], we describe the general definition of determinant bundle associated to a Dirac family on manifolds with boundary. Here the notion of spectral section $P$ for the boundary family, as introduced in [13], and of regularizing $P$-perturbation play a fundamental role. Quillen metrics and Bismut–Freed connections on these determinant line bundles are introduced in Sect. 2, whereas the curvature formula is proved in Sect. 3. In Sect. 4 we explain how to identify determinant bundles on manifolds with boundary associated to different regularizing $P$-perturbations. For the tensor product of the determinant bundles defined by the two halves $M^0$, $M^1$ of the closed fibration $M$ this identification turns out to be canonical. This is an important point since it allows us to work with a fixed regularizing $P$-perturbation, at least as far as surgery is concerned. The surgery set-up, as first introduced in [11], is recalled in Sect. 5. Finally general surgery rules for the curvature and the holonomy are proved in Sect. 6 (see Theorem 2 and the notation introduced at the end of Sect. 5). As a byproduct of the proof a splitting formula for the family index, as recently proved by Dai and Zhang [10], can be obtained.

1. Spectral Sections and Determinant Line Bundles

We briefly recall the basic definitions of [16]. We refer the reader to [13] and [14] for some of the notation used in this paper. Let $\phi : M \to B$ be a fibration of smooth compact manifolds with fibres diffeomorphic to a fixed even-dimensional manifold with boundary $X$. Let $g_{M/B}$ be a family of exact b-metrics and let $E$ be a Hermitian $\mathbb{Z}_2$-graded module for the Clifford algebra bundle associated to the vertical b-cotangent bundle ${}^bT^*(M/B)$. For Clifford algebras we shall use the convention of [12]. Thus for any two covectors $\alpha$ and $\beta$, $\mathrm{cl}(\alpha)\,\mathrm{cl}(\beta) + \mathrm{cl}(\beta)\,\mathrm{cl}(\alpha) = 2\langle\alpha,\beta\rangle$. Let $\nabla^E$ be a vertical (true) connection on $E$ which is assumed to be unitary and Clifford. Let $\eth = (\eth_z) \in \mathrm{Diff}^1_{b,\phi}(M;E)$ be the family of Dirac operators associated to these data. Let $\eth_0 = (\eth_{0,z}) \in \mathrm{Diff}^1_{\partial\phi}(\partial M; E_{|\partial M})$ be the family of Dirac operators induced on the boundary. According to Proposition 1 of [13] we can always choose a spectral section $P$ for the boundary family $\eth_0$. Thus $P$ is a smooth family of 0th-order pseudodifferential operators, $P \in \Psi^0_{\partial\phi}(\partial M; E_{|\partial M})$, $P_z : L^2(\partial M_z; E_{|\partial M_z}) \to L^2(\partial M_z; E_{|\partial M_z})$, which are self-adjoint projections and with the additional property that there exists a positive function $R \in C^\infty(B)$ such that
\[
\eth_{0,z}u = \lambda u \implies
\begin{cases}
P_z u = u & \text{if } \lambda > R(z), \\
P_z u = 0 & \text{if } \lambda < -R(z).
\end{cases}
\tag{1.1}
\]
The family $\eth$ and the choice of a specific spectral section $P$ for $\eth_0$ determine two distinct (but homotopic) families of Fredholm operators. The first one is a (local) family of Fredholm operators acting on Hilbert spaces which are finite dimensional extensions of weighted Sobolev spaces. As in [16] we denote the associated determinant line bundle by $\det_b(\eth, P)$. The second family is obtained by perturbing $\eth$ by an element $A_P \in \Psi^{-\infty}_{b,\phi}(M,E)$ (i.e. a smooth family of b-pseudodifferential operators of order $-\infty$) in such a way that the boundary operators of the perturbed family are all invertible; acting on unweighted Sobolev spaces the perturbed family $\eth + A_P \in \Psi^1_{b,\phi}(M,E)$ is Fredholm.
Thus there exists a well defined determinant bundle which we denote by $\det_b(\eth + A_P)$. Since by Proposition 6 of [13] these two families are homotopic (as Fredholm families) we certainly have
\[ \det{}_b(\eth, P) \cong \det{}_b(\eth + A_P). \tag{1.2} \]
We shall often refer to the perturbation $A_P$ as a regularizing $P$-perturbation. The family $A_P$ is defined in terms of a smooth family of smoothing operators $A^0_P \in \Psi^{-\infty}_{\partial\phi}(\partial M; E_{|\partial M})$ directly defined by the spectral section $P$. The latter family has the following three important properties:
\[ \forall z \in B \text{ the operator } \eth_{0,z} + A^0_{P,z} \text{ is invertible;} \tag{1.3} \]
\[ \forall z \in B \quad A^0_{P,z} \text{ has range in a finite sum of eigenspaces of } \eth_{0,z}; \tag{1.4} \]
\[ 2P_z - \mathrm{Id} = \frac{\eth_{0,z} + A^0_{P,z}}{|\eth_{0,z} + A^0_{P,z}|} \quad \forall z \in B \tag{1.5} \]
or, in words, $P$ is the spectral projection onto the positive spectrum of $\eth_0 + A^0_P$. It is important to point out that the APS family index theory developed in [13] is obtained by working almost exclusively with the perturbed family $\eth + A_P$. The definition of the perturbation $A_P \in \Psi^{-\infty}_{b,\phi}(M,E)$ is not unique; however, as explained below, different perturbations yield homotopic families and thus isomorphic determinant bundles. In order to explain this last point we need to recall in detail the definition of $A_P$ starting from the boundary family $A^0_P \in \Psi^{-\infty}_{\partial\phi}(\partial M; E_{|\partial M})$.

Thus let $\rho \in C^\infty_c(\mathbb{R})$ be non-negative, even and have integral 1, so $\rho_u(t) = u^{-1}\rho(t/u)$, $u > 0$, approximates $\delta(t)$ as $u \downarrow 0$. The Fourier–Laplace transform
\[ \widehat{\rho_u}(w) = \int e^{-itw}\rho_u(t)\,dt \]
is therefore entire, even, real for real $w$ and has $\widehat{\rho_u}(0) = 1$. We consider
\[ I(A^+_P, w) = (L^-)^{-1} \circ \widehat{\rho_u}(w) A^0_P \circ L^+, \tag{1.6} \]
where $u > 0$ will be chosen small and $L^\pm$ are the identifications $L^\pm : E^\pm_{|\partial M} \to E^0$ given in [13]. Thus, by definition, $E^0 = E^+_{|\partial M}$, $L^+ = \mathrm{Id}$ and $L^- = i\,\mathrm{cl}(dx/x)$ with $x$ equal to a boundary defining function for $\partial M$. Notice that the kernel of (1.6) is spanned by a fixed range of the eigenfunctions of $\eth_0$. Consider the operator $I(\eth^+, w) + I(A^+_P, w)$. Reducing it to an operator on $E^0$ this is equal to
\[ iw + \eth_0 + \widehat{\rho_u}(w) A^0_P; \tag{1.7} \]
property (1.3) of $A^0_P$ implies that this operator is invertible for $w \in \mathbb{R}$. To construct the full operator $A^+_P$ we simply choose a product decomposition near the boundary; if $(A^+_P)_0$ is the unique $\mathbb{R}^+$-invariant operator with indicial family $I(A^+_P, w)$ we define
\[ A^+_P = f(x)\,(A^+_P)_0\, f(x'), \tag{1.8} \]
where $f(x)$ localizes near the boundary. We then define $A^-_P$ as the adjoint of $A^+_P$ and $A_P$ as the odd family defined by $A^\pm_P$. The definition of $A^+_P$ involves the choice of the function $\rho$, of the parameter $u$, of the smooth boundary family $A^0_P$ and of the function $f$. Let $B^+_P \in \Psi^{-\infty}_{b,\varphi}$ be a perturbation corresponding to different choices $\rho'$, $u'$, $B^0_P$, $g$.
Lemma 1. The linear homotopy
\[ \eth + rA_P + (1-r)B_P, \qquad r \in [0,1] \]
gives a homotopy through Fredholm families connecting $\eth + A_P$ and $\eth + B_P$.
Proof. It suffices to show that the indicial family $I(\eth^+ + rA^+_P + (1-r)B^+_P, w)$ is invertible for each $w \in \mathbb{R}$ and for each $r \in [0,1]$. We can write explicitly
\[ I(\eth^+ + rA^+_P + (1-r)B^+_P, w) = iw + \eth_0 + r\,\hat\rho_u(w) A^0_P + (1-r)\,\hat\rho'_{u'}(w) B^0_P . \]
Since $\hat\rho_u(w)$, $\hat\rho'_{u'}(w)$ are real for each $w \in \mathbb{R}$ and $A^0_P$, $B^0_P$ are self-adjoint, we are reduced to checking the statement at $w = 0$. By (1.5) we can write
\[ \eth_0 + rA^0_P + (1-r)B^0_P = \bigl(r\,|\eth_0 + A^0_P| + (1-r)\,|\eth_0 + B^0_P|\bigr)(2P - \mathrm{Id}) \]
and the proof is complete using (1.3) and the fact that $(2P - \mathrm{Id})$ is invertible.
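The factorization used in the last step of the proof can be spelled out as follows. This is only a sketch of the algebra, assuming (as (1.5) states) that both boundary perturbations $A^0_P$ and $B^0_P$ have the same sign operator $2P - \mathrm{Id}$.

```latex
% Sketch of the final step in the proof of Lemma 1 (the case w = 0).
% Assumption: (1.5) holds for A^0_P and for B^0_P with the same spectral section P.
\begin{align*}
\eth_0 + rA^0_P + (1-r)B^0_P
  &= r\,(\eth_0 + A^0_P) + (1-r)\,(\eth_0 + B^0_P) \\
  &= r\,|\eth_0 + A^0_P|\,(2P-\mathrm{Id})
     + (1-r)\,|\eth_0 + B^0_P|\,(2P-\mathrm{Id}) \\
  &= \bigl(r\,|\eth_0 + A^0_P| + (1-r)\,|\eth_0 + B^0_P|\bigr)(2P-\mathrm{Id}).
\end{align*}
% By (1.3) both absolute values are strictly positive, so their convex
% combination is strictly positive and hence invertible; 2P - Id is an involution.
```

The point of the computation is that invertibility is preserved along the whole segment, not only at its endpoints, which is what makes the space of regularizing $P$-perturbations convex.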
Lemma 1 ensures that for a fixed spectral section $P$ the space of regularizing $P$-perturbations is simply connected, in fact convex. We shall use this result in Sect. 4 in order to give an explicit isomorphism between $\det_b(\eth + B_P)$ and $\det_b(\eth + A_P)$. The family index theorem proved in [13] implies the following formula for the Chern class of the determinant line bundle $\det_b(\eth + A_P)$:
\[ c_1(\det_b(\eth + A_P)) = \Bigl[ \frac{1}{(2\pi i)^{n/2}} \int_{M/B} \hat A(M/B)\,\mathrm{Ch}'(E) - \frac12\,\hat\eta_P \Bigr]_{[2]} \quad \text{in } H^2_{dR}(B) \qquad (1.9) \]
(see Sect. 3 for the definition of the $P$-eta form). We also recall that if $Q$ is a different spectral section for $\eth_0$, then the relative index theorem of [13] gives a way to relate the Chern classes of $\det_b(\eth, P)$ and $\det_b(\eth, Q)$. (As already remarked in [16], using this result and the fact that $K^0(B)$ is generated by formal differences of spectral sections ([14]), we readily obtain that there always exists a spectral section $P$ such that $\det_b(\eth, P)$ is trivial.)

2. Metrics and Connections on the Determinant Bundle $\det_b(\eth + A_P)$

Consider the determinant line bundle $\det_b(\eth + A_P)$. This is the determinant line bundle associated to the family of Fredholm operators
\[ \eth^+_z + A^+_{P,z} : H^1_b(M_z; E^+_z) \to L^2_b(M_z; E^-_z). \]
The family is Fredholm since, by construction, $A_P$ is such that 0 is never the imaginary part of an indicial root of $\eth^+_z + A^+_{P,z}$, $\forall z \in B$. Thus the $L^2_b$-spectrum of $(\eth_z + A_{P,z})^2$ is discrete near zero and, following [17] and [8], we can give the usual description of $\det_b(\eth + A_P)$ in terms of eigenfunctions of $(\eth_z + A_{P,z})^2$.
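Thus we denote by $\Pi^\lambda_P = (\Pi^\lambda_{P,z})_{z\in B}$ the family of spectral projections for $(\eth_z + A_{P,z})^2$ corresponding to the interval $[0,\lambda) \subset \mathbb{R}$. The family $\Pi^\lambda_P$ is $\mathbb{Z}_2$-graded. Also we denote by $\delta$ the largest real number with the property that $\mathrm{Im}(\mathrm{spec}_b((\eth_z + A_{P,z})^2)) \cap \{w \in \mathbb{R};\ |w| \le \delta\} = \emptyset$. Our $\lambda$'s will always be chosen in the interval $[0,\delta) \subset \mathbb{R}$. Let $U_\lambda = \{z \in B : \lambda \notin \mathrm{spec}((\eth_z + A_{P,z})^2)\}$ and let $H^\lambda_P$ be the $\mathbb{Z}_2$-graded bundle over $U_\lambda$ defined by the range of $\Pi^\lambda_P$. We consider $\det(H^\lambda_P) = (\Lambda^{\max} H^{\lambda,+}_P)^* \otimes (\Lambda^{\max} H^{\lambda,-}_P)$. If $\mu \in [0,\delta)$ and $\lambda < \mu$ then on $U_\lambda \cap U_\mu$ we certainly have $H^\mu_P = H^\lambda_P \oplus H^{[\lambda,\mu)}_P$ with $H^{[\lambda,\mu)}_P$ equal to the direct sum of the eigenspaces corresponding to the eigenvalues of $(\eth_z + A_{P,z})^2$ in the interval $[\lambda,\mu)$. Thus $\det(H^\mu_P) = \det(H^\lambda_P) \otimes \det(H^{[\lambda,\mu)}_P)$. We can define a line bundle on $B$ by gluing $\det(H^\lambda_P)$ and $\det(H^\mu_P)$ on $U_\lambda \cap U_\mu$ through the non-zero section $\det((\eth^+ + A^+_P)^{(\lambda,\mu)})$ induced by the isomorphism
\[ (\eth^+ + A^+_P)^{(\lambda,\mu)} = (\eth^+ + A^+_P)_{|H^{[\lambda,\mu),+}_P} : H^{[\lambda,\mu),+}_P \longleftrightarrow H^{[\lambda,\mu),-}_P . \]
As explained in [8] this line bundle and the one introduced by Quillen for the Fredholm family $\eth^+ + A^+_P$ are canonically identified; we shall keep the same notation for these two bundles. Using the results of [15] on the regularity at $s = 0$ of the b-zeta function of a positive elliptic b-pseudodifferential operator, we can consider for each $\lambda \in [0,\delta)$ and for each $z \in U_\lambda$ the derivative at $s = 0$ of the meromorphic extension of
\[ {}^b\zeta\bigl(s, (\eth^-_z + A^-_{P,z})(\eth^+_z + A^+_{P,z}), \lambda\bigr) = b\text{-}\mathrm{Tr}\Bigl(\Pi^{(\lambda,\infty),+}_{P,z}\bigl((\eth^-_z + A^-_{P,z})(\eth^+_z + A^+_{P,z})\bigr)^{-s}\Bigr). \]
Here $\Pi^{(\lambda,\infty)}_P = \mathrm{Id} - \Pi^\lambda_P$. Notice that the function
\[ U_\lambda \ni z \longmapsto {}^b\zeta'\bigl(0, (\eth^-_z + A^-_{P,z})(\eth^+_z + A^+_{P,z}), \lambda\bigr) \in \mathbb{R} \]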
(2.1)
is smooth.
Notation. To simplify the writing of our formulae we shall use the following notation:
\[ \Delta^\pm_{P,z} \equiv (\eth^\mp_z + A^\mp_{P,z})(\eth^\pm_z + A^\pm_{P,z}). \]
In this notation we shall write ${}^b\zeta'(0, \Delta^+_P, \lambda)$ for the $C^\infty(U_\lambda)$-function defined by (2.1). If necessary we shall specify the particular choice of perturbation $A_P$ by writing $\Delta^\pm_{A_P}$. It is important to note, as in [16], that if $z \in U_\lambda \cap U_\mu$, with $\lambda < \mu < \delta$, then
\[ {}^b\zeta(s, \Delta^+_{P,z}, \lambda) = {}^b\zeta(s, \Delta^+_{P,z}, \mu) + \sum_{i=1}^m \lambda_i^{-s}, \qquad \mathrm{Re}\, s \gg 0, \]
with $0 < \lambda < \lambda_1 \le \cdots \le \lambda_m \le \mu < \delta$ an enumeration of the eigenvalues of $\Delta^+_{P,z}$ between $\lambda$ and $\mu$. This implies directly, by meromorphic continuation, the key formula
\[ {}^b\zeta'(0, \Delta^+_{P,z}, \lambda) = {}^b\zeta'(0, \Delta^+_{P,z}, \mu) - \sum_{i=1}^m \log\lambda_i . \qquad (2.2) \]
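For the reader's convenience, here is the one-line computation behind the key formula: the two zeta functions differ by a finite sum, and only the derivative of that sum at $s = 0$ has to be evaluated.

```latex
% Each eigenvalue \lambda_i between \lambda and \mu contributes
% \lambda_i^{-s} = e^{-s\log\lambda_i} to the difference of the two zeta functions.
\[
\frac{d}{ds}\Big|_{s=0}\,\sum_{i=1}^m \lambda_i^{-s}
  = \sum_{i=1}^m \bigl(-\log\lambda_i\bigr)\,\lambda_i^{-s}\Big|_{s=0}
  = -\sum_{i=1}^m \log\lambda_i .
\]
% The finite sum is entire in s, so the identity valid for Re s >> 0
% persists after meromorphic continuation to s = 0.
```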
To define the Quillen metric we now proceed as in [17]: the $L^2_b$-metrics associated to the family of exact b-metrics $g_{M/B}$ induce a metric on $H^\lambda_P$ and thus a metric $|\cdot|_\lambda$ on $(\det_b(\eth + A_P))_{|U_\lambda} = \det(H^\lambda_P)$. We define the Quillen metric on $U_\lambda$ as
\[ \|\cdot\|_{b,Q} = e^{-{}^b\zeta'(0,\Delta^+_P,\lambda)/2}\,|\cdot|_\lambda . \]
(2.3)
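The compatibility of the local definitions (2.3) on an overlap $U_\lambda \cap U_\mu$ can be checked as follows. This is a sketch only: it assumes, as in the closed case, that under the gluing the $L^2_b$-norm of the section $\det((\eth^+ + A^+_P)^{(\lambda,\mu)})$ equals $\prod_i \lambda_i^{1/2}$, so that $|\cdot|_\mu = |\cdot|_\lambda \prod_i \lambda_i^{1/2}$.

```latex
% Assumption (standard in the closed case, taken over here for the b-setting):
%   |.|_mu = |.|_lambda * prod_i lambda_i^{1/2}  on  U_lambda \cap U_mu.
% Formula (2.2) gives  zeta'(0, mu) = zeta'(0, lambda) + sum_i log(lambda_i).
\begin{align*}
e^{-{}^b\zeta'(0,\Delta^+_P,\mu)/2}\,|\cdot|_\mu
 &= e^{-\bigl({}^b\zeta'(0,\Delta^+_P,\lambda) + \sum_i \log\lambda_i\bigr)/2}\;
    |\cdot|_\lambda \prod_{i=1}^m \lambda_i^{1/2} \\
 &= e^{-{}^b\zeta'(0,\Delta^+_P,\lambda)/2}\,
    \prod_{i=1}^m \lambda_i^{-1/2}\,\prod_{i=1}^m \lambda_i^{1/2}\;|\cdot|_\lambda \\
 &= e^{-{}^b\zeta'(0,\Delta^+_P,\lambda)/2}\,|\cdot|_\lambda .
\end{align*}
```

The zeta-regularized factor therefore absorbs exactly the jump of the $L^2_b$-metric across the overlap.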
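Formula (2.2) implies that $\|\cdot\|_{b,Q}$ is globally defined on $\det_b(\eth + A_P)$. As in [16] we need to compute the differential of the function ${}^b\zeta'(0, \Delta^+_P, \lambda)$. We will denote by $\partial/\partial z$ a generic partial derivative in a fixed coordinate patch contained in $U_\lambda$.
Lemma 2. Over the set $U_\lambda$ the following formula holds:
\[ \frac{\partial}{\partial z}\bigl({}^b\zeta(s, \Delta^+_{P,z}, \lambda)\bigr) = -s\Bigl(b\text{-}\mathrm{Tr}\Bigl(\Pi^{(\lambda,\infty),+}_{P,z}\Bigl(\frac{\partial}{\partial z}\Delta^+_{P,z}\Bigr)(\Delta^+_{P,z})^{-s-1}\Bigr)\Bigr). \]
Proof. Proceeding as in [2] it suffices to prove that
\[ b\text{-}\mathrm{Tr}\Bigl(\Pi^{(\lambda,\infty),+}_{P,z}\,\frac{\partial}{\partial z}\,e^{-t\Delta^+_{P,z}}\Bigr) = -t\; b\text{-}\mathrm{Tr}\Bigl(\Pi^{(\lambda,\infty),+}_{P,z}\Bigl(\frac{\partial}{\partial z}\Delta^+_{P,z}\Bigr)e^{-t\Delta^+_{P,z}}\Bigr). \]
We apply Duhamel's principle on the left-hand side and write the result as the sum of the right-hand side of this identity and a term which involves the b-trace of a commutator:
\[ -\,b\text{-}\mathrm{Tr}\Bigl(t\,\Pi^{(\lambda,\infty),+}_{P,z}\Bigl(\frac{\partial}{\partial z}\Delta^+_{P,z}\Bigr)e^{-t\Delta^+_{P,z}}\Bigr) - b\text{-}\mathrm{Tr}\Bigl(\int_0^t \Bigl[e^{-(t-s)\Delta^+_{P,z}},\ \Pi^{(\lambda,\infty),+}_{P,z}\Bigl(\frac{\partial}{\partial z}\Delta^+_{P,z}\Bigr)e^{-s\Delta^+_{P,z}}\Bigr]ds\Bigr). \]
We need to show that the latter term is equal to zero. Recall that the parametrix construction for elliptic b-pseudodifferential operators implies $I(\Pi^{(\lambda,\infty),+}_{P,z}, w) = \mathrm{Id}$ $\forall w \in \mathbb{R}$, $\forall z \in U_\lambda$. The b-trace identity [12], formula (1.7), and a further application of Duhamel's principle then give the following expression for the b-trace of the commutator:
\[ \frac{1}{2\pi i}\int_0^t\int_0^{t-s}\int_{\mathbb{R}_w} F_z(t,u,w)\,dw\,du\,ds \]
with
\[ F_z(t,u,w) = \mathrm{Tr}\Bigl(e^{-tw^2}\Bigl(\frac{\partial}{\partial w}(\eth + \hat\rho(w)A^0_P)^2 + 2w\Bigr)e^{-u(\eth+\hat\rho(w)A^0_P)^2}\cdot\Bigl(\frac{\partial}{\partial z}(\eth + \hat\rho(w)A^0_P)^2\Bigr)e^{-(t-u)(\eth+\hat\rho(w)A^0_P)^2}\Bigr). \]
Since $\hat\rho(w)$ is even in $w \in \mathbb{R}$ it follows that $F_z(t,u,w)$ is odd in $w$ for each $t$, $u$. This implies easily that the commutator term is equal to zero and the lemma follows.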
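The passage from the heat-kernel identity to the statement of Lemma 2 is the usual Mellin transform argument. The following sketch assumes the standard Mellin representation of the b-zeta function for $\mathrm{Re}\, s \gg 0$, that differentiation under the integral sign is justified, and that the variation of the spectral projection does not contribute (as in the closed case); none of these points is spelled out in the text.

```latex
% Assumed Mellin representation, valid for Re s >> 0:
\[
{}^b\zeta(s,\Delta^+_{P,z},\lambda)
 = \frac{1}{\Gamma(s)}\int_0^\infty t^{s-1}\;
   b\text{-}\mathrm{Tr}\bigl(\Pi^{(\lambda,\infty),+}_{P,z}\,e^{-t\Delta^+_{P,z}}\bigr)\,dt .
\]
% Differentiate in z, insert the heat identity proved above, and use
%   \int_0^\infty t^{s} e^{-t\Delta}\,dt = \Gamma(s+1)\,\Delta^{-s-1}
% on the range of the projection, where \Delta^+_{P,z} > \lambda \ge 0.
\begin{align*}
\frac{\partial}{\partial z}\,{}^b\zeta(s,\Delta^+_{P,z},\lambda)
 &= -\frac{1}{\Gamma(s)}\int_0^\infty t^{s}\;
    b\text{-}\mathrm{Tr}\Bigl(\Pi^{(\lambda,\infty),+}_{P,z}
    \bigl(\tfrac{\partial}{\partial z}\Delta^+_{P,z}\bigr)e^{-t\Delta^+_{P,z}}\Bigr)\,dt \\
 &= -\frac{\Gamma(s+1)}{\Gamma(s)}\;
    b\text{-}\mathrm{Tr}\Bigl(\Pi^{(\lambda,\infty),+}_{P,z}
    \bigl(\tfrac{\partial}{\partial z}\Delta^+_{P,z}\bigr)(\Delta^+_{P,z})^{-s-1}\Bigr) \\
 &= -s\;
    b\text{-}\mathrm{Tr}\Bigl(\Pi^{(\lambda,\infty),+}_{P,z}
    \bigl(\tfrac{\partial}{\partial z}\Delta^+_{P,z}\bigr)(\Delta^+_{P,z})^{-s-1}\Bigr).
\end{align*}
```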
Observe that
\[ \Delta_P = \eth^2 + (A_P\,\eth + \eth\,A_P + A_P^2). \]
(2.4)
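Using Duhamel's formula and the classical asymptotic expansion for $b\text{-}\mathrm{Tr}(\exp(-u\,\eth^-_z\eth^+_z))$ we see that $b\text{-}\mathrm{Tr}(\Pi^{(\lambda,\infty),+}_{P,z}\exp(-u\Delta^+_{P,z}))$ also has an asymptotic expansion for $u$ small. It follows that there exists an asymptotic expansion for
\[ b\text{-}\mathrm{Tr}\Bigl(\Pi^{(\lambda,\infty),+}_{P,z}\Bigl(\frac{\partial}{\partial z}\Delta^+_{P,z}\Bigr)(\Delta^+_{P,z})^{-1}\exp(-t\Delta^+_{P,z})\Bigr) \]
and Lemma 2 implies, as in [2], the following important result.
Proposition 1. On the open set $U_\lambda$ we have
\[ \frac{\partial}{\partial z}\,{}^b\zeta'(0, \Delta^+_{P,z}, \lambda) = -\,\mathrm{LIM}_{t\to 0}\Bigl(b\text{-}\mathrm{Tr}\Bigl(\Pi^{(\lambda,\infty),+}_{P,z}\Bigl(\frac{\partial}{\partial z}\Delta^+_{P,z}\Bigr)(\Delta^+_{P,z})^{-1}e^{-t\Delta^+_{P,z}}\Bigr)\Bigr). \qquad (2.5) \]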
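On $U_\lambda$ we consider the superconnections
\[ \hat{\mathbb{A}}^P_\lambda = (\eth + A_P)^{(\lambda,\infty)} + \mathbb{A}_{[1]}, \qquad \mathbb{A}^P_\lambda = (\eth + A_P)^{(\lambda,\infty)} + \mathbb{A}_{[1]} + \mathbb{A}_{[2]} \qquad (2.6) \]
with $\mathbb{A}_{[1]}$, $\mathbb{A}_{[2]}$ respectively the one-form and two-form piece of the Bismut superconnection. Let
\[ \hat{\mathbb{A}}^P_{\lambda,t} = t^{1/2}\bigl((\eth + A_P)^{(\lambda,\infty)}\bigr) + \mathbb{A}_{[1]} \qquad (2.7) \]
be the rescaled superconnection associated to $\hat{\mathbb{A}}^P_\lambda$. Define two differential 1-forms ${}^b\alpha^\pm_P(t,\lambda) \in C^\infty(U_\lambda, \Lambda^1)$ as follows:
\[ {}^b\alpha^\pm_P(t,\lambda) = \Bigl(b\text{-}\mathrm{Tr}_\pm\Bigl(\frac{\partial \hat{\mathbb{A}}^P_{\lambda,t}}{\partial t}\,e^{-(\hat{\mathbb{A}}^P_{\lambda,t})^2}\Bigr)\Bigr)_{[1]} . \qquad (2.8) \]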
Remark. In the APS family index theorem of [13] one considers the rescaled perturbed Bismut superconnection
\[ \tilde{\mathbb{A}}_t = t^{1/2}(\eth + \chi(t)A_P) + \mathbb{A}_{[1]} + t^{-1/2}\mathbb{A}_{[2]} \qquad (2.9) \]
with the function $\chi \in C^\infty(\mathbb{R}^+)$ equal to 0 for $t < 1$ and equal to 1 for $t > 2$. This cut-off function is inserted in order to ensure the convergence near $t = 0$ of the b-trace of the superconnection heat kernel $\exp(-(\tilde{\mathbb{A}}_t)^2)$. It also ensures the convergence near $t = 0$ of the $t$-integrals defining the exact form and the $P$-eta form appearing in the transgression formula giving the index theorem. We shall see that in the present context the function $\chi$ is unnecessary (and would in fact complicate our analysis).
The 1-forms (2.8) can also be expressed as
\[ {}^b\alpha^\pm_P(t,\lambda) = -\frac12\; b\text{-}\mathrm{Tr}_\pm\Bigl(\Pi^{(\lambda,\infty)}_P\,(\eth + A_P)^\mp\,[\mathbb{A}_{[1]}, (\eth + A_P)^\pm]\,e^{-t(\eth + A_P)^2}\Bigr). \qquad (2.10) \]
It follows from formula (2.4) that ${}^b\alpha^\pm_P(t,\lambda)$ has an asymptotic expansion for $t$ small: thus
\[ {}^b\beta^\pm_P(\lambda) = 2\,\mathrm{LIM}_{s\to 0}\int_s^\infty {}^b\alpha^\pm_P(t,\lambda)\,dt \qquad (2.11) \]
is well defined. The 1-form piece of the Bismut superconnection induces, through the spectral projection $\Pi^{[0,\lambda)}_P$, a $\mathbb{Z}_2$-graded connection on $H^\lambda_P$ which in turn induces a connection ${}^b\nabla^\lambda$ on $(\det_b(\eth + A_P))_{|U_\lambda}$. This connection is compatible with the metric $|\cdot|_\lambda$ induced by the $L^2_b$-inner product. To define a connection compatible with the Quillen metric we consider on $(\det_b(\eth + A_P))_{|U_\lambda}$ the connection
\[ {}^b\nabla^P = {}^b\nabla^\lambda + {}^b\beta^+_P(\lambda). \qquad (2.12) \]
Proposition 2. The connection ${}^b\nabla^P$ is globally defined on $\det_b(\eth + A_P)$.
The proof of the proposition proceeds as in the closed case; the absence of the cut-off function $\chi$ (see (2.6), (2.9)) does play a role in the computation.
Proposition 3. Over the set $U_\lambda$
\[ {}^b\alpha^-_P(t,\lambda) = \overline{{}^b\alpha^+_P(t,\lambda)}, \qquad d\bigl({}^b\zeta'(0, \Delta^+_P, \lambda)\bigr) = -\bigl({}^b\beta^+_P(\lambda) + {}^b\beta^-_P(\lambda)\bigr). \qquad (2.13) \]
Proof. Proceeding as in [16] it is necessary to prove that various commutator terms arising from the application of Duhamel's formula are identically zero. The computations are similar to the one given in the proof of Lemma 2; we leave the cumbersome details to the reader.
Proceeding as in [2] we obtain from Proposition 3 the following
Proposition 4. The connection ${}^b\nabla^P$ is compatible with the b-Quillen metric $\|\cdot\|_{b,Q}$ on $\det_b(\eth + A_P)$.

3. The Curvature Formula

In this section we give a formula for the curvature of the connection ${}^b\nabla^P$. The proof is based, as in [16], on the transgressed index formula of [13], which will now be recalled. Let
\[ \hat{\mathbb{A}}^P_t = t^{1/2}(\eth + A_P) + \mathbb{A}_{[1]}, \qquad \mathbb{A}^P_t = t^{1/2}(\eth + A_P) + \mathbb{A}_{[1]} + t^{-1/2}\mathbb{A}_{[2]} \]
be the analogues of the superconnections considered in (2.6). Let $\mathbb{B}_t$ be the rescaled Bismut superconnection of the boundary fibration $\partial\varphi : \partial M \to B$. $\mathbb{B}_t$ is a Cl(1)-superconnection on $E^0 \otimes \mathrm{Cl}(1) \equiv E^0 \oplus E^0$ (see [13], Sect. 10). It is explicitly given by
\[ \mathbb{B}_t = t^{1/2}\sigma\eth_0 + \mathbb{B}_{[1]} + t^{-1/2}\sigma\mathbb{B}_{[2]} \qquad (3.1) \]
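with $\sigma$ equal to the self-adjoint involution on $E^0 \oplus E^0$ given by
\[ \sigma = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \qquad (3.2) \]
Using the identifications $L^\pm$ to reduce indicial operators to operators acting on the sections of $E^0 \oplus E^0$, we have
\[ I(\mathbb{A}^P_t, w) = t^{1/2}\gamma w + t^{1/2}\sigma\hat\rho_u(w)A^0_P + \mathbb{B}_t \qquad \forall w \in \mathbb{R} \qquad (3.3) \]
with
\[ \gamma = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}. \qquad (3.4) \]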
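The matrices $\sigma$ and $\gamma$ are the first two Pauli matrices. A direct check that they are anticommuting involutions is recorded here, since this is presumably what produces the Gaussian factor in $w$ when the indicial family (3.3) is squared; the symbol $a$ below is a hypothetical placeholder for any operator commuting with both matrices.

```latex
% Direct 2x2 computation with sigma from (3.2) and gamma from (3.4).
\[
\sigma^2 = \gamma^2 = \mathrm{Id}, \qquad
\sigma\gamma = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} = -\,\gamma\sigma,
\qquad\text{so}\qquad \sigma\gamma + \gamma\sigma = 0 .
\]
% Consequence, for a hypothetical operator a commuting with sigma and gamma:
\[
(\gamma w + \sigma a)^2
 = \gamma^2 w^2 + (\gamma\sigma + \sigma\gamma)\,w\,a + \sigma^2 a^2
 = w^2 + a^2 .
\]
```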
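We define
\[ \mathbb{B}^P_t(w,u) = t^{1/2}(\sigma\hat\rho_u(w)A^0_P) + \mathbb{B}_t, \qquad \hat{\mathbb{B}}^P_t(w,u) = t^{1/2}(\eth_0 + \sigma\hat\rho_u(w)A^0_P) + \mathbb{B}_{[1]} \qquad (3.5) \]
(the $u$-dependence is in the function $\rho_u$) and rewrite
\[ I(\mathbb{A}^P_t, w) = t^{1/2}\gamma w + \mathbb{B}^P_t(w,u), \qquad I(\hat{\mathbb{A}}^P_t, w) = t^{1/2}\gamma w + \hat{\mathbb{B}}^P_t(w,u). \qquad (3.6) \]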
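On the set $U_\lambda \subset B$ we now consider the null bundle of the family $(\eth + A_P)^{(\lambda,\infty)}$. By definition this is precisely the smooth bundle $H^\lambda_P = \mathrm{Im}(\Pi^\lambda_P)$ considered in the previous section. Notice that since $I(\Pi^\lambda_P, w) = 0$, formula (3.6) holds for the indicial family of $\mathbb{A}^P_{\lambda,t}$ (see (2.6)). Formula (3.6) and the b-trace identity are key tools in the proof of the following transgressed family index formula [13]:
\[ \mathrm{Ch}(H^\lambda_P) + d\Bigl(\mathrm{LIM}_{t\downarrow 0}\int_t^\infty b\text{-}\mathrm{STr}\Bigl(\frac{d\mathbb{A}^P_{\lambda,s}}{ds}\,e^{-(\mathbb{A}^P_{\lambda,s})^2}\Bigr)ds\Bigr) = \mathrm{LIM}_{t\downarrow 0}\bigl(b\text{-}\mathrm{STr}(e^{-(\mathbb{A}^P_t)^2})\bigr) - \frac12\,\eta^P(u) \qquad (3.7) \]
with the boundary correction term
\[ \eta^P(u) = \mathrm{LIM}_{t\downarrow 0}\int_t^\infty \eta^P(s,u)\,ds \qquad (3.8) \]
given in terms of
\[ \eta^P(s,u) = \frac1\pi\int_{\mathbb{R}} s^{1/2}\,\mathrm{STr}_{\mathrm{Cl}(1)}\Bigl(\frac{d\mathbb{B}^P_s(w,u)}{ds}\,e^{-\mathbb{B}^P_s(w,u)^2}\Bigr)e^{-sw^2}\,dw - \frac1\pi\int_{\mathbb{R}} \mathrm{STr}_{\mathrm{Cl}(1)}\Bigl(\frac12\,\sigma w\,\frac{d\hat\rho_u}{dw}\,A^0_P\,e^{-\mathbb{B}^P_s(w,u)^2}\Bigr)e^{-sw^2}\,dw . \qquad (3.9) \]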
Remark. Since we are using the superconnections (2.6), (3.5), which do not involve the cut-off function $\chi$, we have followed a different notation with respect to that of [13]: we use $\eta^P(u)$ instead of $\hat\eta_P(u)$ for the boundary correction term appearing in (3.7). Notice also that the parameter $u$ plays the role here of the parameter $\epsilon$ in [13]. In this paper we keep the parameter $\epsilon$ for the surgery problem.
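We now consider the restriction of the determinant bundle $\det_b(\eth + A_P)$ to the open set $U_\lambda$; using (3.7), definition (2.8) and proceeding as in [8] and [2] we obtain
\[ ({}^b\nabla^\lambda)^2 + d\,\mathrm{LIM}_{t\downarrow 0}\int_t^\infty \bigl({}^b\alpha^+_P(s,\lambda) - {}^b\alpha^-_P(s,\lambda)\bigr)ds = \mathrm{LIM}_{t\downarrow 0}\bigl(b\text{-}\mathrm{STr}(e^{-(\mathbb{A}^P_t)^2})\bigr)_{[2]} - \frac12\,(\eta^P(u))_{[2]} . \]
Recalling the definition of the connection ${}^b\nabla^P$ we readily obtain the curvature formula:
\[ ({}^b\nabla^P)^2 = \mathrm{LIM}_{t\downarrow 0}\bigl(b\text{-}\mathrm{STr}(e^{-(\mathbb{A}^P_t)^2})\bigr)_{[2]} - \frac12\,(\eta^P(u))_{[2]} . \]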
(3.10)
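The step from the identity on $U_\lambda$ to the curvature formula can be made explicit. The sketch below uses only (2.11), (2.12) and (2.13), together with the fact that adding a 1-form $\beta$ to a connection on a line bundle changes its curvature by $d\beta$.

```latex
% Starting point, from (2.12):  {}^b\nabla^P = {}^b\nabla^\lambda + {}^b\beta^+_P(\lambda).
\begin{align*}
({}^b\nabla^P)^2
  &= ({}^b\nabla^\lambda)^2 + d\,{}^b\beta^+_P(\lambda), \\
0 = d\,d\bigl({}^b\zeta'(0,\Delta^+_P,\lambda)\bigr)
  &= -\,d\bigl({}^b\beta^+_P(\lambda) + {}^b\beta^-_P(\lambda)\bigr)
  && \text{by (2.13)}, \\
d\,{}^b\beta^+_P(\lambda)
  &= \tfrac12\,d\bigl({}^b\beta^+_P(\lambda) - {}^b\beta^-_P(\lambda)\bigr) \\
  &= d\,\mathop{\mathrm{LIM}}_{t\downarrow 0}\int_t^\infty
     \bigl({}^b\alpha^+_P(s,\lambda) - {}^b\alpha^-_P(s,\lambda)\bigr)\,ds
  && \text{by (2.11)}.
\end{align*}
% Substituting the last line into the first reproduces the left-hand side
% of the identity on U_\lambda, and hence the curvature formula.
```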
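To bring the right-hand side into a more computable form we proceed as follows. First we introduce the $P$-eta form
\[ \eta^P \equiv \mathrm{LIM}_{t\downarrow 0}\int_t^\infty \mathrm{STr}_{\mathrm{Cl}(1)}\Bigl(\frac{d\mathbb{B}^P_s}{ds}\,e^{-(\mathbb{B}^P_s)^2}\Bigr)ds \qquad \text{with } \mathbb{B}^P_t = t^{1/2}\sigma A^0_P + \mathbb{B}_t; \]
next we recall Proposition 13 of [13], which states the existence of a differential form $\gamma \in C^\infty(B, \Lambda^* B)$ such that $\eta^P(u) - \eta^P = d(\gamma(u))$. If we define the perturbed $P$-Bismut–Freed connection as
\[ {}^b\tilde\nabla^P = {}^b\nabla^P + \gamma(u)_{[1]}, \]
we obtain at once the formula
\[ ({}^b\tilde\nabla^P)^2 = \mathrm{LIM}_{t\downarrow 0}\Bigl(b\text{-}\mathrm{STr}(e^{-(\mathbb{A}^P_t)^2}) - \frac12\int_t^\infty \mathrm{STr}_{\mathrm{Cl}(1)}\Bigl(\frac{d\mathbb{B}^P_s}{ds}\,e^{-(\mathbb{B}^P_s)^2}\Bigr)ds\Bigr)_{[2]} = \mathrm{LIM}_{t\downarrow 0}\bigl(b\text{-}\mathrm{STr}(e^{-(\mathbb{A}^P_t)^2})\bigr)_{[2]} - \frac12\,(\eta^P)_{[2]} . \qquad (3.11) \]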
Remark. Proceeding as in [8] (Theorem 1.5) it is not difficult to show that both $(\eta^P(u))_{[2]}$ and $(\eta^P)_{[2]}$ are purely imaginary. Thus $\gamma(u)_{[1]}$ is also purely imaginary. This implies readily that the perturbed $P$-Bismut–Freed connection is still compatible with the b-Quillen metric on $\det_b(\eth + A_P)$.
We consider also the $P$-eta form $\hat\eta_P$ defined in terms of the Cl(1)-superconnection
\[ \tilde{\mathbb{B}}_t = t^{1/2}\sigma(\eth_0 + \chi A^0_P) + \mathbb{B}_{[1]} + t^{-1/2}\mathbb{B}_{[2]}; \]
thus
\[ \hat\eta_P = \int_0^\infty \mathrm{STr}_{\mathrm{Cl}(1)}\Bigl(\frac{d\tilde{\mathbb{B}}_s}{ds}\,e^{-(\tilde{\mathbb{B}}_s)^2}\Bigr)ds \]
as in [13]. We recall that the cut-off function $\chi \in C^\infty(\mathbb{R}^+)$ is equal to zero for $t < 1$ and equal to 1 for $t > 2$. It is introduced precisely to ensure the convergence of the integrand near $t = 0$.
Finally consider $b\text{-}\mathrm{STr}(\exp(-\tilde{\mathbb{A}}_t^2))$ with $\tilde{\mathbb{A}} = (\eth + \chi A_P) + \mathbb{A}_{[1]} + \mathbb{A}_{[2]}$. For $t$ small it follows that this term has the same asymptotics as the Bismut superconnection; in particular the limit as $t \downarrow 0$ is equal to the Atiyah-Singer integral. We claim that for $t$ small,
\[ b\text{-}\mathrm{STr}(e^{-(\mathbb{A}^P_t)^2})_{[2]} = b\text{-}\mathrm{STr}(e^{-(\tilde{\mathbb{A}}_t)^2})_{[2]} + O(t^{1/2}), \qquad (3.12) \]
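\[ \Bigl(\int_t^\infty \mathrm{STr}_{\mathrm{Cl}(1)}\Bigl(\frac{d\mathbb{B}^P_s}{ds}\,e^{-(\mathbb{B}^P_s)^2}\Bigr)ds\Bigr)_{[2]} = \Bigl(\int_t^\infty \mathrm{STr}_{\mathrm{Cl}(1)}\Bigl(\frac{d\tilde{\mathbb{B}}_s}{ds}\,e^{-(\tilde{\mathbb{B}}_s)^2}\Bigr)ds\Bigr)_{[2]} + O(t^{1/2}). \qquad (3.13) \]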
To prove (3.12) it is necessary to introduce the 1-parameter family of superconnections
\[ \mathbb{A}_t(r) = t^{1/2}\bigl(\eth + (\chi + r(1-\chi))A_P\bigr) + \mathbb{A}_{[1]} + t^{-1/2}\mathbb{A}_{[2]} \]
and compute the variation in $r$ of the b-supertrace of the corresponding superconnection heat kernel. Applying the Duhamel formula, the b-trace identity, the analogues of (3.3)–(3.6) and using the fact that $A^0_P$ and $A_P$ are of order $(-\infty)$ as pseudodifferential and b-pseudodifferential families, the first claim follows. The result in (3.13) is obtained with a similar strategy, introducing the 1-parameter family of Cl(1)-superconnections
\[ \mathbb{B}_t(r) = t^{1/2}\sigma\bigl(\eth_0 + (\chi + r(1-\chi))A^0_P\bigr) + \mathbb{B}_{[1]} + t^{-1/2}\sigma\mathbb{B}_{[2]}, \]
and computing the variation in $r$ of the 2-form
\[ \Bigl(\int_t^\infty \mathrm{STr}_{\mathrm{Cl}(1)}\Bigl(\frac{d\mathbb{B}_s(r)}{ds}\,e^{-(\mathbb{B}_s(r))^2}\Bigr)ds\Bigr)_{[2]} . \]
We leave the details of these (somewhat long) computations to the reader. Since, as already remarked, the right-hand sides of (3.12), (3.13) are well behaved at $t = 0$ we have completed the proof of the following:
Theorem 1. For the perturbed $P$-Bismut–Freed connection ${}^b\tilde\nabla^P$ on the determinant line bundle $\det_b(\eth + A_P)$ the following curvature formula holds:
\[ ({}^b\tilde\nabla^P)^2 = \Bigl(\frac{1}{(2\pi i)^{n/2}}\int_{M/B}\hat A(M/B)\,\mathrm{Ch}'(E) - \frac12\,\hat\eta_P\Bigr)_{[2]} . \qquad (3.14) \]
As usual the content of this theorem is that there exists an explicit metric-compatible connection on $\det_b(\eth + A_P)$ with curvature precisely equal to the differential 2-form obtained from the cohomological statement of the family index theorem (see (1.9)). The decomposition of the curvature of the Bismut–Freed connection into an interior geometric term and a boundary term, as shown by (3.10) (or (3.14) for the perturbed connection), will play an important role in the next two sections.

4. Identifications through Parallel Transport

In this section we shall discuss how to identify the determinant bundles $\det_b(\eth + A_P)$ and $\det_b(\eth + B_P)$ corresponding to two different choices of regularizing $P$-perturbations. First, following closely [8], we discuss the case of a closed fibration; this case will both serve as an introduction and be used in the next two sections, in conjunction with the surgery problem. Thus let $\psi : M \to B$ be a fibration of closed riemannian even dimensional manifolds with fibre diffeomorphic to a fixed compact closed manifold $X$. We denote by $g_{M/B}$ the metric on the vertical tangent bundle $T(M/B)$; thus for each $z \in B$, $M_z = \psi^{-1}(z)$ is endowed with the metric $g_z = g_{M/B}|_{TM_z}$. Let $E$ be a vertical Hermitian Clifford module endowed with a unitary Clifford connection as in [2]. Let $\eth \in \mathrm{Diff}^1_\psi(M; E)$ be the associated family of Dirac operators and let $\det(\eth)$ be the
corresponding determinant bundle with its Quillen metric and Bismut–Freed connection. Suppose now that $A = (A_z)_{z\in B} \in \Psi^{-\infty}_\psi(M, E)$ is a smooth family of smoothing operators and consider the family of pseudodifferential operators $(\eth + A) = (\eth_z + A_z)_{z\in B}$. Since $A_z$ is of order $(-\infty)$ for each $z \in B$, it follows that $(\eth + A)$ is a smooth family of Fredholm operators. We consider the associated determinant bundle $\det(\eth + A)$. Let $B = (B_z)_{z\in B}$ be a second family of smoothing operators and consider the determinant bundle $\det(\eth + B)$. As a particular choice we could consider the constant family of zero operators and reobtain the original family $\eth$. Using results presented in [9] (see III i)) it is possible to show that the two determinant bundles are canonically isomorphic, with the isomorphism preserving the curvature and the holonomy of the respective Bismut–Freed connections. To see this point consider $A(r)$, $r \in [0,1]$, an arbitrary path of smooth families of smoothing operators joining $A$ and $B$. Thus $[0,1] \ni r \mapsto A(r)$ is a path, joining $A$ and $B$, in the space $\mathcal{S}$ of all smoothing perturbations of the family $\eth$. Notice that $\mathcal{S}$ is simply connected. Let $D = (D_{(z,r)})$ be the family, parametrized by $B \times [0,1]$, defined by
\[ D_{(z,r)} = \eth_z + (A(r))_z . \]
(4.1)
Let $\det(D) \to B \times [0,1]$ be the associated determinant bundle. Thus if $i_r : B \times \{r\} \hookrightarrow B \times [0,1]$ is the obvious inclusion, then $i_0^*\det(D) = \det(\eth + A)$ and $i_1^*\det(D) = \det(\eth + B)$. Let $\|\cdot\|_Q$ be the Quillen metric on $\det(D)$ and let $\nabla^{\det(D)}$ be the Bismut–Freed connection defined in terms of the superconnection $D + \mathbb{A}_{[1]} + dr(\partial/\partial r)$. These data restrict at $r = 0$, $r = 1$ to the corresponding ones on $\det(\eth + A)$ and $\det(\eth + B)$ respectively. Let $R = (\nabla^{\det(D)})^2$. The curvature $R$, according to the curvature theorem in [8] (Theorem 1.18), is the sum of two terms; one is a 2-form on $B$ whereas the other, the term involving $dr$, is explicitly given by
\[ \mathrm{LIM}_{t\downarrow 0}\; -dr \wedge t^{1/2}\Bigl(\mathrm{STr}\Bigl(\frac{\partial A(r)}{\partial r}\,e^{-(t^{1/2}D + \mathbb{A}_{[1]})^2}\Bigr)\Bigr)_{[1]} . \qquad (4.2) \]
Since $\partial A(r)/\partial r$ is also a family of smoothing (hence trace class) operators it follows that (4.2) is identically zero. This vanishing implies that if $[0,1] \ni r \mapsto A'(r) \in \mathcal{S}$ is another arbitrary path in $\mathcal{S}$, joining $A$ and $B$, and if $C$ is the loop obtained by composing $A(r)$ with the inverse of $A'(r)$, then $\mathrm{hol}_C(\nabla^{\det(D)}) = 1$. Thus the parallel transport isomorphism
\[ \tau^1_0 : i_0^*\det(D) \equiv \det(\eth + A) \to i_1^*\det(D) \equiv \det(\eth + B) \]
(4.3)
defined by the Bismut–Freed connection on det(D) does not depend on the particular path chosen to join A and B. We have therefore obtained a canonical isomorphism which, by construction, identifies the two bundles metrically; moreover the curvature and the holonomy of the 2 Bismut–Freed connections on det(ð+ A), det(ð + B) are equal (this can be also checked directly as in [9]). Considering now B = 0 and summarizing our discussion so far, we have: Proposition 5. Let M −→ B be a fibration of closed even dimensional riemannian manifolds and let ð = (ðz )z∈B be a family of Dirac operators. If A = (Az )z∈B is a family of smoothing operators then there exists a canonical isomorphism τ : det(ð) −→ det(ð + A)
(4.4)
which preserves the curvature and the holonomy of the respective Bismut–Freed connections.
We now pass to manifolds with boundary. We adopt the notation given at the beginning of Sect. 1. We choose a spectral section $P$ for the boundary family $\eth_0$ and consider the space $\mathcal{P}$ of all regularizing $P$-perturbations. This space will play the role of the space $\mathcal{S}$ of all smoothing perturbations in the closed case. Consider the space $B \times \mathcal{P}$ and denote by $i_{A_P} : B \times \{A_P\} \hookrightarrow B \times \mathcal{P}$ the obvious inclusion. There is a natural family of Fredholm operators $D^P$ parametrized by $B \times \mathcal{P}$ (defined in a way similar to (4.1) above). Thus there is a well defined determinant bundle $\det_b(D^P)$ which, proceeding as in Sect. 2, can be given a Quillen metric and a $P$-Bismut–Freed connection. Clearly $i^*_{A_P}(\det_b(D^P)) = \det_b(\eth + A_P)$. Let $A_P$, $B_P$ be two points in $\mathcal{P}$. According to Lemma 1 the segment $rA_P + (1-r)B_P$ is entirely contained in $\mathcal{P}$. Thus parallel transport of the Bismut–Freed connection on $\det_b(D^P)$ along the path $rA_P + (1-r)B_P$ gives an explicit isomorphism between $i^*_{A_P}(\det_b(D^P))$ and $i^*_{B_P}(\det_b(D^P))$, i.e. between $\det_b(\eth + A_P)$ and $\det_b(\eth + B_P)$. We would like this isomorphism to be natural, i.e. independent of the path in $\mathcal{P}$ joining $A_P$ and $B_P$. Unfortunately this is not the case. Proceeding as in the closed case, thus introducing the bundle $\det_b(D^P) \to B \times [0,1]$, with $D^P_{(z,r)} = \eth_z + (A_P(r))_z$, and employing the curvature formula proved in the previous section, formula (3.11), we obtain the following expression for the $dr$-component of the curvature $R^P$ of the (for simplicity perturbed) Bismut–Freed connection on $\det_b(D^P)$:
\[ \mathrm{LIM}_{t\downarrow 0}\; -dr \wedge t^{1/2}\Bigl(b\text{-}\mathrm{STr}\Bigl(\int_0^1 e^{-u(t^{1/2}D^P + \mathbb{A}_{[1]})^2}\,\frac{\partial A_P(r)}{\partial r}\,e^{-(1-u)(t^{1/2}D^P + \mathbb{A}_{[1]})^2}\,du\Bigr)\Bigr)_{[1]} + dr \wedge \beta(t)_{[1]} \qquad (4.5) \]
with
\[ \beta(t) = -\frac{1}{2\sqrt\pi}\int_t^\infty \mathrm{STr}_{\mathrm{Cl}(1)}\Bigl(\sigma D^P_\partial \int_0^1 e^{-u(s^{1/2}D^P_\partial + \mathbb{B}_{[1]})^2}\,\sigma\,\frac{\partial A^0_P(r)}{\partial r}\,e^{-(1-u)(s^{1/2}D^P_\partial + \mathbb{B}_{[1]})^2}\,du\Bigr)_{[1]}ds \qquad (4.6) \]
with $D^P_\partial = \eth_0 + A^0_P(r)$. The first summand in the argument of the regularized limit in (4.5) is the sum of a term like (4.2) and another one produced by an application of the b-trace identity. Both terms vanish when we take $\mathrm{LIM}_{t\downarrow 0}$. Thus the $dr$ component of the curvature $R^P$ is equal to $dr \wedge (\mathrm{LIM}_{t\downarrow 0}\,\beta(t)_{[1]})$, which will in general be different from 0. This implies in particular that parallel transport along different paths in $\mathcal{P}$ joining $A_P$ and $B_P$ will in general produce different isomorphisms. In the sequel we shall always identify $\det_b(\eth + A_P)$ and $\det_b(\eth + B_P)$ using parallel transport along the linear path $rA_P + (1-r)B_P$. For later reference we explicitly single out the following crucial:
Remark. The $dr$ component of the curvature $R^P$ only depends on the boundary data of the family $D^P$.
5. Surgery

In this section we briefly recall the surgery set-up of [11] and [16]. Thus we consider $\psi : M \to B$, a fibration of closed riemannian even dimensional manifolds with fibre diffeomorphic to a fixed compact closed manifold $X$. For simplicity we assume that $X$ and $T(M/B)$ are oriented and spin with fixed spin structures. We denote by $g_{M/B}$ the metric on the vertical tangent bundle $T(M/B)$; thus for each $z \in B$, $M_z = \psi^{-1}(z)$ is endowed with the metric $g_z = g_{M/B}|_{TM_z}$. We let $E$ be the vertical spinor bundle. Let $H$ be a codimension one embedded submanifold of $M$. We assume that $H$ fibres over $B$; thus there exists a codimension one embedded submanifold $Y$ of $X$ and a fibre bundle $\chi : H \to B$ with fibres diffeomorphic to $Y$. We assume, only for the sake of simplicity, that $H$ separates $M$; thus $\psi : M \to B$ is the union of two fibrations $\psi_i : M^i \to B$, $i = 0, 1$, with common boundary equal to $H$: $M_z = M^0_z \cup_{H_z} M^1_z$ for each $z \in B$. Let $x \in C^\infty(M)$ be a defining function for $H$ (thus $H = \{x = 0\}$ and $dx \neq 0$ on $H$). It should be remarked that one of the two fibrations with boundary, say $M^0$, will have the normal vector field to its boundary, $\partial/\partial x$, oriented in the outward direction. We consider the fibration $\overline{M} = \overline{M}^0 \sqcup \overline{M}^1$ obtained by taking the disjoint union of the compactifications $\overline{M}^i$, $i = 0, 1$, of the fibrations obtained by attaching a cylindrical end to $M^i$. Consider the family of riemannian metrics
\[ g_{M/B}(\epsilon) = \frac{|dx|^2}{x^2 + \epsilon^2} + g_{M/B} . \]
(5.1)
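To see concretely how the first term of (5.1) stretches the neck around $H$ as $\epsilon \downarrow 0$, one can compute the length of the normal direction. The sketch below assumes, purely for illustration, that the collar is normalized to $|x| \le 1$; this normalization is not part of the set-up above.

```latex
% Substituting x = \epsilon\sinh s gives dx = \epsilon\cosh s\,ds and
% x^2 + \epsilon^2 = \epsilon^2\cosh^2 s, hence dx^2/(x^2+\epsilon^2) = ds^2:
% for \epsilon > 0 the normal direction is a cylinder of finite length.
% Illustrative assumption: the collar is |x| \le 1.
\[
\int_{-1}^{1}\frac{dx}{\sqrt{x^2+\epsilon^2}}
 = 2\,\operatorname{arcsinh}(1/\epsilon)
 = 2\log\Bigl(\frac{1}{\epsilon}+\sqrt{\frac{1}{\epsilon^2}+1}\,\Bigr)
 \sim 2\log(2/\epsilon) \qquad (\epsilon\downarrow 0).
\]
% At \epsilon = 0 the term becomes dx^2/x^2 = (d\log|x|)^2, i.e. two
% infinite cylindrical ends, one on each side of H.
```

The logarithmic growth of the neck length is what makes the limit metric an exact b-metric on each of the two pieces.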
We shall also use the shorter notation $g_z(\epsilon)$ for the metric (5.1) restricted to the fibre $M_z$. The limit metric $g_{M/B}(0) \equiv g(0)$ endows the fibration $\psi : \overline{M} = \overline{M}^0 \sqcup \overline{M}^1 \to B$ with a vertical family of exact b-metrics. Equivalently $g_z(0)$ defines on $M_z \setminus H_z$ the structure of a complete manifold with asymptotically cylindrical ends. The given data endow the spinor bundle $E$ on the single surgery space $(M_z)_s = [M_z \times [0,1]; H_z \times \{0\}]$ with the structure of a surgery Hermitian Clifford module with Hermitian Clifford connection. The associated Dirac operator
\[ \eth_z(\epsilon) = \frac{1}{i}\,\mathrm{cl}_s\,\nabla^S \]
is an element in $\mathrm{Diff}^1_s(M_z; E)$ (see [11] for the notation). The latter statement specifies how the $\epsilon$-family of differential operators $\eth_z(\epsilon)$ fixed by (5.1) degenerates as $\epsilon \downarrow 0$; in particular the limit operator $\eth_z(0)$ is, for each $z \in B$, an element in $\mathrm{Diff}^1_b(\overline{M}_z; E)$. We will also use the notation $\eth_{z,\overline{M}}$ for the limit operator: notice that $C^\infty(\overline{M}_z, E) = C^\infty(\overline{M}^0, E) \oplus C^\infty(\overline{M}^1, E)$ and that with respect to this decomposition
\[ \eth_{z,\overline{M}} = \begin{pmatrix} \eth_{z,M^0} & 0 \\ 0 & \eth_{z,M^1} \end{pmatrix}. \]
We shall also use the shorter notation $\eth_{z,\overline{M}} = \eth_{z,M^0} \sqcup \eth_{z,M^1}$.
Moreover we shall sometimes be very precise and write $\eth_M(\epsilon)$ instead of $\eth(\epsilon)$. Our main goal is to study the uniform behaviour, as $\epsilon \downarrow 0$, of the Quillen metric and of the Bismut–Freed connection on the determinant bundle associated to the family of Dirac operators fixed by the closed riemannian fibration $(\psi : M \to B, g_{M/B}(\epsilon))$. Results in this direction were first presented in [16], where the above problem was solved under the assumption that the Dirac family induced on the fibering hypersurface $H$ had null space of constant dimension. In this paper we want to deal with the general case; thus no assumptions are made on the Dirac family induced on $H$. In order to treat the surgery problem for the Bismut–Freed connection on $\det(\eth(\epsilon))$ we first need to describe the limit picture, namely the determinant bundle and its hermitian geometry at $\epsilon = 0$. Since we do not make any assumption on the Dirac family induced on $H$, it is clear that spectral sections must be employed in order to define determinant bundles associated to the Dirac families fixed by the two fibrations with boundary $\overline{M}^0$, $\overline{M}^1$. First, following [16], we describe the structure of the operators $\eth_{M^0}$, $\eth_{M^1}$ near the respective boundaries. We consider the bundle endomorphisms $L^+_\epsilon = \mathrm{Id}$ and $L^-_\epsilon = i\,\mathrm{cl}(dx/(x^2 + \epsilon^2)^{1/2})$. We also consider the chirality operator $\Gamma(\epsilon)$ associated to a $g(\epsilon)$-orthonormal frame. Thus $\Gamma^2 = \mathrm{Id}$ and the positive and negative spinors are by definition the eigenspaces associated to 1 and $-1$ respectively. Let $L^\pm_0$ and $\Gamma_0$ be the limit endomorphisms. Following the boundary identifications of [13], explained after (1.6), and using the grading induced by $\Gamma_0$ it is easy to see that near the boundary:
\[ L^\mp_0 \cdot \eth^\pm_{M^1} \cdot (L^\pm_0)^{-1} = \pm x\frac{\partial}{\partial x} + \eth_H \qquad (5.2) \]
with $\eth_H$ equal to the Dirac family on $\chi : H \to B$ induced by the boundary Clifford action
\[ \mathrm{cl}_\partial(\xi) = \mathrm{cl}\Bigl(i\,\frac{dx}{x}\Bigr)\mathrm{cl}(\xi) \qquad \text{with } \xi \in T^*(H/B) \equiv T^*(\partial\overline{M}/B). \]
Similarly
\[ L^\mp_0 \cdot \eth^\pm_{M^0} \cdot (L^\pm_0)^{-1} = \pm y\frac{\partial}{\partial y} + \eth_{(-H)} \qquad \text{with } y = -x. \qquad (5.3) \]
Notice that since $\eth_H$ arises as a boundary family there always exist infinitely many spectral sections associated to it. Let us fix a spectral section $P$ for the family $\eth_H$ which is, according to (5.2), the boundary family of $\eth_{M^1}$. Because of (5.3), $(\mathrm{Id} - P)$ will then be a spectral section for the boundary family of $\eth_{M^0}$. Using generalized APS boundary value problems as in [13] we therefore obtain two Fredholm families $(\eth_{M^1}, P)$, $(\eth_{M^0}, (\mathrm{Id} - P))$ and thus a determinant line bundle $\det_b(\eth_{M^0}, (\mathrm{Id} - P)) \otimes \det_b(\eth_{M^1}, P)$. Following our discussion in Sect. 1 this determinant bundle can also be realized in the b-calculus framework by choosing a regularizing $P$-perturbation $A^{M^1}_P$ for the family $\eth_{M^1}$ and a regularizing $(\mathrm{Id} - P)$-perturbation $A^{M^0}_{(\mathrm{Id}-P)}$ for the family $\eth_{M^0}$. We denote by $A^{\overline{M}}_P$ the regularizing perturbation induced on $\overline{M} = \overline{M}^0 \sqcup \overline{M}^1$; thus, by definition,
\[ A^{\overline{M}}_P = A^{M^0}_{(\mathrm{Id}-P)} \sqcup A^{M^1}_P . \]
The perturbation $A^{M^1}_P$ is constructed (as in (1.6)–(1.8)) using the identifications $L^\pm_0$, the boundary smoothing family $A^0_P$ and a cut-off function $f_{M^1}$ whereas, according to (5.3), the perturbation $A^{M^0}_{(\mathrm{Id}-P)}$ is constructed using the family $(-A^0_P)$, a cut-off function $f_{M^0}$ and of course the limit identifications $L^\pm_0$. In the sequel we shall use the notation $f_{\overline{M}} = f_{M^0} \sqcup f_{M^1}$. Using the results of Sect. 1 we can endow the determinant line bundles $\det_b(\eth_{M^0} + A^{M^0}_{(\mathrm{Id}-P)})$, $\det_b(\eth_{M^1} + A^{M^1}_P)$ with b-Quillen metrics and Bismut–Freed connections. Thus the tensor product $\det_b(\eth_{M^0} + A^{M^0}_{(\mathrm{Id}-P)}) \otimes \det_b(\eth_{M^1} + A^{M^1}_P)$, i.e. $\det_b(\eth_{\overline{M}} + A^{\overline{M}}_P)$, is also endowed with a b-Quillen metric and a Bismut–Freed connection. These data constitute the limit picture we were looking for.
Before tackling the surgery problem we make an important remark. Suppose we consider a different regularizing $P$-perturbation on $\overline{M} = \overline{M}^0 \sqcup \overline{M}^1$. We denote it by $B^{\overline{M}}_P$. Let $[0,1] \ni r \mapsto A^{\overline{M}}_P(r) \in \mathcal{P}$ be a path, in the space $\mathcal{P}$ of all regularizing $P$-perturbations, joining $A^{\overline{M}}_P$ and $B^{\overline{M}}_P$. Consider the family $D^P = \eth_{\overline{M}} + A^{\overline{M}}_P(r)$ and let $\det_b(D^P) \to B \times [0,1]$ be the associated determinant bundle with its Quillen metric and Bismut–Freed connection $\nabla^{D^P}$. Let
\[ \tau : \det_b(\eth_{\overline{M}} + A^{\overline{M}}_P) \to \det_b(\eth_{\overline{M}} + B^{\overline{M}}_P) \]
be the parallel transport map defined by $\nabla^{D^P}$.
Proposition 6. The bundle map $\tau$ gives a canonical isomorphism preserving the curvature and the holonomy of the $P$-Bismut–Freed connections.
Proof. We only need to show that the isomorphism $\tau$ does not depend on the particular path chosen to join $A^{\overline{M}}_P$ and $B^{\overline{M}}_P$. Reasoning as in the closed case (see the discussion leading to Proposition 5) it suffices to show that the $dr$ component of the curvature of $\nabla^{D^P}$ is identically equal to zero (recall that the space $\mathcal{P}$ of all regularizing $P$-perturbations is convex by Lemma 1). As remarked at the end of Sect. 4 this $dr$ component only depends on the boundary family $(D^P)_\partial$ associated to $D^P$. By definition
\[ D^P = D^{M^0}_{(\mathrm{Id}-P)} \sqcup D^{M^1}_P \]
with $D^{M^0}_{(\mathrm{Id}-P)} = \eth_{M^0} + A^{M^0}_{(\mathrm{Id}-P)}(r)$ and $D^{M^1}_P = \eth_{M^1} + A^{M^1}_P(r)$. Using the identifications $L^\pm_0$, as in (5.2), (5.3), we obtain
\[ (D^P)_\partial = (-\eth_H - A^0_P(r)) \sqcup (\eth_H + A^0_P(r)). \]
Thus the two contributions to the $dr$ component of the curvature coming from $\overline{M}^0$ and $\overline{M}^1$ cancel out and the proposition is proved.
Thus although the individual parallel transport maps for the two factors of
\[ \det_b(\eth_{M^0} + A^{M^0}_{(\mathrm{Id}-P)}) \otimes \det_b(\eth_{M^1} + A^{M^1}_P) \quad \text{and} \quad \det_b(\eth_{M^0} + B^{M^0}_{(\mathrm{Id}-P)}) \otimes \det_b(\eth_{M^1} + B^{M^1}_P) \]
are not canonical, their tensor product $\tau$ is.
Notation. According to the previous proposition we can think of $\det_b(\eth_{M^0} + A^{M^0}_{(\mathrm{Id}-P)}) \otimes \det_b(\eth_{M^1} + A^{M^1}_P)$ as something which depends only on the spectral section $P$ and not on the particular regularizing $P$-perturbation chosen. From now on we shall denote by
\[ L_{(\mathrm{Id}-P)}(\eth_{M^0}) \otimes L_P(\eth_{M^1}) \qquad (5.4) \]
the generic determinant bundle defined by a regularizing $P$-perturbation $A^{M^0}_{(\mathrm{Id}-P)} \sqcup A^{M^1}_P$.
Remark. Notice that the notation is justified only for the tensor product and not for the individual factors. With a slight abuse of notation we shall denote by ${}^b\nabla^{(\mathrm{Id}-P),0} \otimes \mathrm{Id} + \mathrm{Id} \otimes {}^b\nabla^{P,1}$ the $P$-Bismut–Freed connection introduced, as in Sect. 2, on a representative of (5.4). Proposition 6 shows that the curvature and the holonomy of this $P$-Bismut–Freed connection do not depend on the choice of the perturbation and are thus well defined on (5.4). Similarly, on a fibration without boundary we shall identify the determinant bundle $\det(\eth)$ with the determinant bundle $\det(\eth + A)$ for any smoothing perturbation $A = (A_z)_{z\in B}$ using the canonical isomorphism $\tau$ of Proposition 5. We shall denote by $L(\eth)$ the generic determinant bundle defined by a smoothing perturbation $A = (A_z)_{z\in B}$.

6. Surgery Rules for the Local and Global Anomalies

We are now in the position of stating and proving the main result of this paper. We adopt the notation of the previous section and let $\eth_M(\epsilon)$ be the family of Dirac operators associated to the fibration of closed riemannian manifolds $(\psi : M \to B, g_{M/B}(\epsilon))$ with
\[ g_{M/B}(\epsilon) = \frac{|dx|^2}{x^2 + \epsilon^2} + g_{M/B} \]
and $g_{M/B}$ equal to a vertical metric on $M \to B$. We assume the existence of a fibering hypersurface $H$ in $M$ such that $M = M^0 \cup_H M^1$ and consider the surgery problem for $\det(\eth_M(\epsilon))$ as explained in the previous section and in [16]. We fix $P$, a spectral section for the induced family $\eth_H$. Finally we adopt the notation explained at the end of the previous section, thus considering $L(\eth_M(\epsilon))$ and $L_{(\mathrm{Id}-P)}(\eth_{M^0}) \otimes L_P(\eth_{M^1})$ endowed with their respective Bismut–Freed connections $\nabla^{\epsilon,M}$, ${}^b\nabla^{(\mathrm{Id}-P),0} \otimes \mathrm{Id} + \mathrm{Id} \otimes {}^b\nabla^{P,1}$.
Theorem 2. For each $\epsilon > 0$, small enough, there exists a natural explicit isomorphism
\[ S_\epsilon : L(\eth_M(\epsilon)) \to L_{(\mathrm{Id}-P)}(\eth_{M^0}) \otimes L_P(\eth_{M^1}). \qquad (6.1) \]
As $\epsilon \downarrow 0$ the following (asymptotic) surgery formulae hold:
\[ (\nabla^{\epsilon,M})^2 \to ({}^b\nabla^{(\mathrm{Id}-P),0})^2 + ({}^b\nabla^{P,1})^2, \qquad (6.2) \]
\[ \mathrm{hol}_\gamma(\nabla^{\epsilon,M}) \to \mathrm{hol}_\gamma({}^b\nabla^{(\mathrm{Id}-P),0}) \cdot \mathrm{hol}_\gamma({}^b\nabla^{P,1}) \qquad \forall \gamma \in \mathrm{Map}(S^1, B). \qquad (6.3) \]
Proof. The main step is to show the existence of an $\epsilon$-family of perturbations $A(\epsilon, P) = (A(\epsilon, P)_z)_{z\in B}$ with the following two properties:
$$(A(\epsilon, P))_z \in \Psi_s^{-\infty}(M_z, E) \quad \forall z \in B, \tag{6.4}$$
with $\Psi_s^m$ denoting the small surgery calculus of Mazzeo and Melrose [11], and
$$N_b(A(\epsilon, P)_z) = (A^M_P)_z \equiv (A^{M^0}_{(\mathrm{Id}-P)})_z \sqcup (A^{M^1}_P)_z \quad \forall z \in B, \tag{6.5}$$
with $N_b$ equal to the $b$-normal homomorphism of [11]. Informally these two properties state the existence of an $\epsilon$-family of perturbations $A(\epsilon, P)$ which is smoothing for each $\epsilon > 0$ (and with a precise behaviour in $\epsilon$ for $\epsilon$ small) (6.4), and such that $A(\epsilon) \to A^M_P$ when $\epsilon \downarrow 0$ (6.5).

Assuming the existence of such a family for a moment, we now complete the proof of the theorem. For each $\epsilon > 0$ the family $(A(\epsilon, P)_z)_{z\in B}$ is a family of smoothing operators on the closed Riemannian fibration $(M \to B, g_{M/B}(\epsilon))$. Hence the determinant bundles $\det(\eth(\epsilon))$ and $\det(\eth(\epsilon) + A(\epsilon, P))$ are canonically identified through the parallel transport map $\tau$ of Proposition 5. In other words these two determinant bundles are representatives for $L(\eth(\epsilon))$. Thus it suffices to show that there exists a natural explicit isomorphism
$$S : \det(\eth(\epsilon) + A(\epsilon, P)) \to \det{}_b(\eth_{M^0} + A^{M^0}_{(\mathrm{Id}-P)}) \otimes \det{}_b(\eth_{M^1} + A^{M^1}_P) \tag{6.6}$$
and that the surgery rules (6.2), (6.3) hold for the Bismut–Freed connections of these two line bundles. To get these results we can now proceed as in [16], where the invertible case
$$\mathrm{null}(\eth_{H,z}) = 0 \ \ \forall z \in B; \qquad P = \Pi_{\geq}; \qquad A^M_P \equiv 0$$
is treated. The introduction of the spectral section $P$ and of the regularizing $P$-perturbation $A^M_P \in \Psi^{-\infty}_{b,\psi}(M, E)$, together with the degenerating family $A(\epsilon, P)$, means that we have reduced the general surgery problem to the invertible case but for a larger class of operators; it is very interesting that the pseudodifferential techniques encoded in the b-calculus and the surgery calculus are powerful enough to allow for the treatment of perturbed families and thus for the solution of the problem.

Thus the existence of the natural explicit isomorphism (6.6) follows from the analysis of the eigenfunctions associated to the small eigenvalues of $(\eth(\epsilon) + A(\epsilon, P))^2$, which is based in turn on the study of the uniform structure of the resolvent of $(\eth(\epsilon) + A(\epsilon, P))^2$ as $\epsilon \downarrow 0$. The latter analysis is ultimately done using the b-normal homomorphism and the surgery normal homomorphism (see the content of Sect. 5.3 in [11] and the proof of Proposition 2 there). For each $z \in B$ these homomorphisms have values in the (overblown) calculus with bounds of $\overline{M}_z$ and $\overline{H}_z$ respectively, with $\overline{H}_z$ equal to the surgery face of the single surgery space $(M_z)_s$. Because of the presence of the family $A(\epsilon, P)$ these normal homomorphisms will be pseudodifferential; however their resolvents can in any case be analyzed using the full force of the b-pseudodifferential calculus. Thus the proofs presented in [11] can be extended to the present pseudodifferential context. In particular the $g_z(\epsilon)$-orthogonal projection onto the eigenfunctions of $(\eth(\epsilon) + A(\epsilon, P))^2_z$ corresponding to the small eigenvalues is an element of $\Psi_s^{-\infty,\alpha}$, with $\alpha > 0$, of uniform finite rank. This property can be used as in [16] to define the map $S$ in (6.6) and show that it is indeed an isomorphism. We refer the reader to the proof of Proposition 4 of [16] for the details.
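We now pass to the surgery rules for the local and global anomalies ((6.2), (6.3)). Proceeding as in [16] it suffices to show that if $\nabla^{\epsilon,*}$ is the pushforward under the isomorphism $S$ of the Bismut–Freed connection on $\det(\eth(\epsilon) + A(\epsilon, P))$, then
$$\nabla^{\epsilon,*} + \log\epsilon \cdot d\zeta'(0, (\eth_0 + A^0_P)^2) \to {}^b\nabla^{(\mathrm{Id}-P),0} \otimes \mathrm{Id} + \mathrm{Id} \otimes {}^b\nabla^{P,1} \quad \text{as } \epsilon \downarrow 0. \tag{6.7}$$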
The proof presented in [16] carries over if we can show that the heat-surgery calculus can be extended to include the heat kernel of perturbed Laplacians such as $(\eth(\epsilon) + A(\epsilon, P))^2_z \in \Psi^2_s(M_z, E)$. Once again we remark that the heat-surgery calculus of [11] is ultimately based on the surgery and b-normal heat-homomorphisms, with values in $\Psi^k_\eta(\overline{H}_z, E)$ and $\Psi^k_\eta(\overline{M}_z, E)$ respectively. Propositions 8 and 20 of [13] prove the existence of the heat kernel for a perturbed b-Laplacian in an appropriate extension of the b-heat calculus. Extending the heat-surgery calculus accordingly (thus adding to $\Psi^k_{hs}$ the space $t^p C^\infty([0, \infty), \Psi_s^{-\infty})$) permits one to analyze the heat kernel of $(\eth(\epsilon) + A(\epsilon, P))^2_z$ uniformly as $\epsilon \downarrow 0$ and to extend the results of [11] (in particular Lemma 2 and part of Sect. 9) in such a way that the proof of (6.7) as given in [16] in the invertible case can be carried over. Once again we leave the details of this program to the reader.

We still have to prove the existence of an $\epsilon$-family $A(\epsilon, P)$ satisfying (6.4) and (6.5) above. To this end recall the definition of the regularizing $P$-perturbation, as explained in Sect. 1 before Lemma 1. We applied this definition in order to define $A^M_P$ (see Sect. 5, before Proposition 6). Consider the surgery double space of [11] for a fibre $M_z$. Following the notation of [11] we label the boundary faces of $(M_z)^2_s$ as $B_{ls}(z)$, $B_{rs}(z)$, $B_{db}(z)$, $B_{ds}(z)$. The operator $(A^M_P)_z$ defines a smooth kernel on $B_{db}(z)$ obtained by extending with the cut-off $f_M$ a Schwartz kernel $(A^M_P)'$ only defined on the front face of the overblown b-double product associated to $\overline{M}$. The definition of $(A^M_P)'$ only depends on $A^0_P \in \Psi^{-\infty}(H, E|_H)$ and the function $\rho_u \in C_c^\infty(\mathbb{R})$ since, up to identifications, it is given by
$$2\pi \rho_u(\log s)\, A^0_{P,z}(y, y') \tag{6.8}$$
with $x$, $s = x/x'$, $y$, $y'$ projective coordinates near the front face of $(M_z)^2_{ob}$. Of course starting with $A^0_{P,z}$ and the function $\rho_u$ we can analogously define an operator in $\Psi^{-\infty}_{ob}(\overline{H}_z, E)$, since (6.8) can also be interpreted as a kernel on the front face of the overblown b-double product $(\overline{H}_z)^2_s$. Since $\rho_u$ is of compact support it is obvious from (6.8) that both this operator and the original one $A^M_P$ can be extended as zero to the boundary faces $B_{ls}(z)$, $B_{rs}(z)$. Summarizing, we have constructed a kernel on all of the boundary faces over $\epsilon = 0$ which satisfies the compatibility properties of Sect. 4.8 in [11]. Such a kernel can always be extended to a smooth kernel on all the surgery double space. Since it vanishes of infinite order at $B_{ls}(z)$, $B_{rs}(z)$ this kernel belongs to $\Psi_s^{-\infty}$. By construction it fulfils property (6.5). This proves the claim and thus the theorem.

Remark. The proof of the existence of the natural isomorphism (6.6) can be easily adapted to show, as in [10], that
$$\mathrm{Ind}(\eth_M) = \mathrm{Ind}(\eth_{M^0}, (\mathrm{Id}-P)) + \mathrm{Ind}(\eth_{M^1}, P) \quad \text{in } K^0(B).$$
Of course Theorem 2, being a statement about the hermitian geometry of the determinant bundles, is much more refined than (6).
Remark. The proof of Theorem 2 and that of Proposition 5 in [16] also establish the following result: if $\|\cdot\|_{*,\epsilon}$ is the push-forward through (6.6) of the Quillen metric on $\det(\eth(\epsilon) + A(\epsilon))$, then, as $\epsilon \to 0$, we have
$$\epsilon^{\zeta'(0, (\eth_0 + A^0_P)^2)}\, \|\cdot\|_{*,\epsilon} \to \|\cdot\|_{b,Q} \tag{6.9}$$
with $\|\cdot\|_{b,Q}$ denoting the b-Quillen metric on the b-determinant bundle appearing on the right-hand side of (6.6). This result depends of course on the choice of the perturbation; however it is clear from the construction of the canonical isomorphisms, τ and τ , that different choices of perturbations will produce Quillen metrics and b-Quillen metrics that are simply one pull-back of the other through τ and τ respectively. In other words the convergence result (6.9) is also subject to a direct interpretation on $L(\eth(\epsilon))$ and $L_{(\mathrm{Id}-P)}(\eth_{M^0}) \otimes L_P(\eth_{M^1})$.

Acknowledgement. It is a pleasure to thank Jean-Michel Bismut and Richard Melrose for several interesting conversations.
References
1. Atiyah, M.F. and Singer, I.M.: Dirac operators coupled to vector potentials. Proc. Nat. Acad. Sci. 81, 2596–2600 (1984)
2. Berline, N., Getzler, E. and Vergne, M.: Heat kernels and Dirac operators. New York: Springer, 1992
3. Bismut, J.-M.: The index theorem for families of Dirac operators: two heat equation proofs. Invent. Math. 83, 91–151 (1986)
4. Bismut, J.-M. and Cheeger, J.: Families index for manifolds with boundary superconnections and cones I. J. Funct. Anal. 89, 313–363 (1990)
5. Bismut, J.-M. and Cheeger, J.: Families index for manifolds with boundary superconnections and cones II. J. Funct. Anal. 90, 306–354 (1990)
6. Bismut, J.-M. and Cheeger, J.: η-invariants and their adiabatic limits. JAMS 2, 33–70 (1989)
7. Bismut, J.-M. and Cheeger, J.: Remarks on the index theorem for families of Dirac operators on manifolds with boundary. In: Differential Geometry, B. Lawson and K. Teneblat (eds), Longman Scientific, 1992
8. Bismut, J.-M. and Freed, D.S.: The analysis of elliptic families: Metrics and connections on determinant bundles. Commun. Math. Phys. 106, 159–176 (1986)
9. Bismut, J.-M. and Freed, D.S.: The analysis of elliptic families: Dirac operators, eta invariants and the holonomy theorem of Witten. Commun. Math. Phys. 107, 103–163 (1986)
10. Dai, Z. and Zhang, W.: The splitting of family index. Commun. Math. Phys. 182, 303–318 (1996)
11. Mazzeo, R. and Melrose, R.: Analytic surgery and the eta invariant. Geom. Funct. Anal. 5, 14–75 (1995)
12. Melrose, R.B.: The Atiyah-Patodi-Singer Index Theorem. Boston: A. and K. Peters, 1993
13. Melrose, R.B. and Piazza, P.: Families of Dirac operators, boundaries and the b-calculus. J. Diff. Geom. 146, 99–180 (1997)
14. Melrose, R.B. and Piazza, P.: An index theorem for families of Dirac operators on odd-dimensional manifolds with boundary. J. Diff. Geom. 146, 287–334 (1997)
15. Piazza, P.: On the index of elliptic operators on manifolds with boundary. J. Funct. Anal. 117, 308–359 (1993)
16. Piazza, P.: Determinant bundles, manifolds with boundary and surgery. Commun. Math. Phys. 178, 597–626 (1996)
17. Quillen, D.: Determinants of Cauchy-Riemann operators over a Riemann surface. Funct. Anal. Appl. 14, 31–34 (1985)

Communicated by A. Jaffe
Commun. Math. Phys. 193, 125 – 150 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
An Inverse Problem Associated with Polynomials Orthogonal on the Unit Circle J. S. Geronimo1 , R. Johnson2,? 1 School of Mathematics, Georgia Institute of Technology, Atlanta, GA 30332-01660, USA. E-mail:
[email protected] 2 Dipartimento Sistemi e informatica, Universita di Firenze, Firenze, Italy 50139. E-mail:
[email protected]
Received: 21 May 1997 / Accepted: 28 July 1997
Abstract: Polynomials orthogonal on the unit circle with random recurrence coefficients and finite band spectrum are investigated. It is shown that the coefficients are in fact quasi-periodic. The measures associated with these quasi-periodic coefficients are exhibited and necessary and sufficient conditions relating quasi-periodicity and spectral measures of this type are given. Analogs for polynomials orthogonal on subsets of the real line are also presented.
1. Introduction

This paper is a companion to [GJ], in which we developed the essential theory of random orthogonal polynomials on the unit circle from a dynamical systems point of view. Our goal here is to formulate and solve an inverse problem for such random orthogonal polynomials. In addition to techniques of dynamical systems, we will use some elementary methods from the theory of algebraic curves, in particular the generalized Jacobian [F]. These methods permit us also to generalize some results obtained for polynomials orthogonal on the unit circle with periodic recurrence coefficients to the case when the recurrence coefficients are quasi-periodic.

To be more precise, let $T(z, n)$ be the following matrix:
$$T(z, n) = a_n \begin{pmatrix} z & \alpha_n \\ z\bar\alpha_n & 1 \end{pmatrix}, \tag{1.1}$$
where $\{\alpha_n\}_{n=1}^{\infty}$ is a sequence of complex numbers with values in the unit disc in the complex plane, and $z$ is a complex number. The normalizing constant $a_n$ is given by $a_n = (1 - |\alpha_n|^2)^{-1/2}$.
Research partially supported by N.S.F. (J. Geronimo) and M.U.R.S.T. (Italy; R. Johnson).
Orthonormal polynomials on the unit circle $K = \{z \in \mathbb{C} \mid |z| = 1\}$ can be generated in the following way. Write
$$\begin{pmatrix} \phi_n(z) \\ \phi_n^*(z) \end{pmatrix} = T(z, n) \cdots T(z, 1) \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \tag{1.2}$$
Then there is a unique probability measure $d\sigma$ on $K$ with the property that $\{\phi_n(\cdot) \mid n = 0, 1, 2, \ldots\}$ is orthonormal with respect to $d\sigma$. The polynomials $\phi_n^*(z) = z^n\, \overline{\phi_n(1/\bar z)}$ are called the dual polynomials. The relation between the quantities $\alpha_n$ and the polynomials $\phi_n$ is as follows:
$$\alpha_n = \frac{\phi_n(0)}{\phi_n^*(0)},$$
which easily follows from (1.1) and (1.2).

In Sect. 2 of [GJ] the Weyl $m$ functions, $m_\pm(z, n)$, were introduced ($m_+ = \tilde m_+$, $m_- = 1/\tilde m_-$ in [GJ, Sect. 2]). These were shown to be analytic for $|z| < 1$ with $|m_+| < 1$ and $|m_-| > 1$ in this region (in [GJ] these results were stated only for $m_\pm(z, 0)$; however, the proof carries over to $n \neq 0$). Furthermore it was shown (see [GJ], Eqs. (2.15) and (2.16)) that these functions satisfy the following relations:
$$m_+(z, n) = -z\, \frac{\bar\alpha_{n+1} - m_+(z, n+1)}{1 - \alpha_{n+1} m_+(z, n+1)}, \tag{1.3}$$
and
$$m_-(z, n+1) = \frac{z \bar\alpha_{n+1} + m_-(z, n)}{z + \alpha_{n+1} m_-(z, n)}. \tag{1.4}$$
These functions will be introduced in a more geometric fashion below. From (1.3) and (1.4) we can evaluate $\lim_{z\to 0} m_\pm(z, n)$ to find
$$m_+(0, n) = 0, \qquad m_-(0, n) = \frac{1}{\alpha_n}. \tag{1.5}$$
Likewise, if these functions can be analytically continued to $|z| > 1$ then we find
$$m_+(\infty, n) = \infty, \qquad m_-(\infty, n) = \bar\alpha_n. \tag{1.6}$$
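The recurrence (1.1)–(1.2) can be tried out numerically. The following is a minimal sketch, not taken from the paper: polynomials are stored as coefficient lists (lowest degree first), the sample values in `alphas` are hypothetical, and all helper names are illustrative. It builds $\phi_n$ and $\phi_n^*$ by applying the transfer matrices one at a time and then recovers each $\alpha_n$ as $\phi_n(0)/\phi_n^*(0)$.

```python
# Sketch: generate phi_n and phi_n^* from the transfer matrices (1.1)-(1.2).
# Polynomials are coefficient lists, lowest degree first.

def poly_add(p, q):
    """Add two polynomials, padding the shorter one with zeros."""
    n = max(len(p), len(q))
    p = p + [0j] * (n - len(p))
    q = q + [0j] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_scale(p, c):
    """Multiply a polynomial by the scalar c."""
    return [c * a for a in p]

def poly_shift(p):
    """Multiply a polynomial by z."""
    return [0j] + p

def step(phi, phi_star, alpha):
    """Apply T(z, n) = a_n [[z, alpha], [z conj(alpha), 1]] to (phi, phi_star)."""
    a = (1.0 - abs(alpha) ** 2) ** -0.5
    z_phi = poly_shift(phi)
    new_phi = poly_scale(poly_add(z_phi, poly_scale(phi_star, alpha)), a)
    new_star = poly_scale(
        poly_add(poly_scale(z_phi, alpha.conjugate()), phi_star), a
    )
    return new_phi, new_star

def orthonormal_polys(alphas):
    """Return [(phi_1, phi_1^*), ..., (phi_n, phi_n^*)] starting from (1, 1)."""
    phi, phi_star = [1 + 0j], [1 + 0j]
    out = []
    for alpha in alphas:
        phi, phi_star = step(phi, phi_star, alpha)
        out.append((phi, phi_star))
    return out

# Hypothetical sample coefficients in the open unit disc.
alphas = [0.3 + 0.1j, -0.2 + 0.4j, 0.5j, 0.1 - 0.6j]
polys = orthonormal_polys(alphas)

# alpha_n = phi_n(0) / phi_n^*(0): the constant terms of the two polynomials.
recovered = [phi[0] / phi_star[0] for phi, phi_star in polys]
```

In this representation the dual polynomial is simply the coefficient list of $\phi_n$ reversed and conjugated, which gives a quick consistency check on the recurrence.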
If we consider the functions
$$\tilde m_\pm(z) = \frac{1 + m_\pm(z, 0)}{1 - m_\pm(z, 0)}, \tag{1.7}$$
then (1.5) and (1.6) show that
$$\tilde m_+(0) = 1, \qquad \tilde m_-(0) = \frac{\alpha_0 + 1}{\alpha_0 - 1}, \qquad \tilde m_+(\infty) = -1, \qquad \tilde m_-(\infty) = \frac{1 + \bar\alpha_0}{1 - \bar\alpha_0}.$$
It now follows from the relation between Carathéodory functions and positive measures on the unit circle ([Akh]) that
$$\tilde m_+ = \int_K \frac{e^{i\theta} + z}{e^{i\theta} - z}\, d\sigma_+(\theta), \tag{1.8}$$
and
$$\tilde m_- = \frac{\bar\alpha_0 - \alpha_0}{|1 - \alpha_0|^2} - \int_K \frac{e^{i\theta} + z}{e^{i\theta} - z}\, d\sigma_-(\theta), \tag{1.9}$$
where $\sigma_+$ and $\sigma_-$ are positive measures supported on the circle such that $\int_K d\sigma_+ = 1$ and $\int_K d\sigma_- = \frac{1 - |\alpha_0|^2}{|1 - \alpha_0|^2}$ ($\sigma_+$ is the orthogonality measure $\sigma$).

The problem we pose and solve is the following. Let $\{\alpha_n\}$ be a stationary, ergodic sequence and consider the set $\Sigma$ of non-isolated points of the topological support of the measure $d\sigma$. Thus $\Sigma$ is a closed subset of $K$. We will characterize those stationary ergodic sequences $\{\alpha_n\}$ for which (i) $\Sigma$ is a finite union of arcs; (ii) the Lyapounov exponent $\gamma(z)$ of the family $\{\phi_n(z) \mid n \geq 1\}$ is zero for $z \in \Sigma$. Here $\gamma(z)$ measures the exponential rate of growth of the product in (1.1):
$$\gamma(z) = \lim_{n\to\infty} \frac{1}{n} \log \|T(z, n) \cdots T(z, 1)\|. \tag{1.10}$$
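The limit in (1.10) can be approximated numerically by multiplying the transfer matrices and renormalizing at each step so that the product neither overflows nor underflows. The sketch below is only an illustration under stated assumptions: the Frobenius norm stands in for the matrix norm (any norm gives the same limit), the quasi-periodic coefficient sequence and the rotation number `theta` are hypothetical choices, and the function names are not from the paper.

```python
import cmath
import math

def transfer(z, alpha):
    """T(z, n) of (1.1) for a single coefficient alpha in the unit disc."""
    a = (1.0 - abs(alpha) ** 2) ** -0.5
    return ((a * z, a * alpha), (a * z * alpha.conjugate(), a))

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def frobenius(A):
    """Frobenius norm of a 2x2 matrix."""
    return math.sqrt(sum(abs(x) ** 2 for row in A for x in row))

def lyapunov_estimate(z, alpha_of_n, n_steps):
    """Finite-n approximation of (1.10) with renormalization at each step."""
    M = ((1.0, 0.0), (0.0, 1.0))
    log_growth = 0.0
    for n in range(1, n_steps + 1):
        M = matmul(transfer(z, complex(alpha_of_n(n))), M)
        s = frobenius(M)
        log_growth += math.log(s)  # accumulate the log of the factor removed
        M = tuple(tuple(x / s for x in row) for row in M)
    return log_growth / n_steps

# Hypothetical quasi-periodic coefficients alpha_n = 0.5 exp(2 pi i n theta).
theta = (math.sqrt(5.0) - 1.0) / 2.0
quasi_periodic = lambda n: 0.5 * cmath.exp(2j * math.pi * n * theta)
gamma_inside = lyapunov_estimate(0.5 + 0.0j, quasi_periodic, 2000)

# Sanity case: alpha_n = 0 gives T = diag(z, 1), so the growth rate is log|z| for |z| > 1.
gamma_free = lyapunov_estimate(2.0 + 0.0j, lambda n: 0.0, 200)
```

The accumulated logarithms telescope to the logarithm of the norm of the full product, so the renormalization changes nothing mathematically; it only keeps the entries in floating-point range.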
In fact, we will show that the sequence $\alpha_1, \ldots, \alpha_n, \ldots$ can be obtained as the evaluation of an explicit rational function along a straight-line winding in a real subtorus of a generalized Jacobian variety. These results are rather similar to those holding for the AKNS operator whose inverse spectral theory was discussed by [AKNS] and then, in a context similar to the one considered in this paper, by [DC-J]. That paper was in turn inspired by the work in the 1970's on the inverse spectral theory of the periodic Schrödinger operator [DMN]. The idea of regarding the set $\Sigma$ as an "essential spectrum" and $z$ as a "spectral parameter" is quite natural. Indeed the suspension technique [W] can be used to write the matrix product in (1.2) as the discretization of the fundamental matrix solution of an Atkinson-type spectral problem [A]; the details are carried out in [GJ]. Results of this paper will be used below.

This paper is organized as follows. In Sect. 2 we formulate the inverse problem in a precise way and review the basic results we will need to solve it. In Sect. 3 we carry out the solution of the inverse problem using methods of algebraic curve theory. We work out explicit formulas for those stationary ergodic sequences satisfying conditions (i) and (ii). The solution of the inverse problem mentioned above rests upon the Weyl $m$ functions satisfying Eq. (2.5). A similar property plays an important role for Jacobi operators associated with orthogonal polynomials on the real line ([BGHT, GT]). Operators having this property have come to be called "reflectionless". In Sect. 4 we discuss the theory of polynomials orthogonal on the unit circle (which can also be associated with an operator [GT, T, GNV]) which have this reflectionless property. This allows us to make contact with the work in [PS]. Finally we utilize the Stahl-Totik theory of general orthogonal polynomials to obtain some stability results. Some analogs of these results for polynomials orthogonal on a finite number of segments of the real line are also given.

We finish this Introduction by discussing the concept of randomization as we use it and some basic definitions regarding random orthogonal polynomials. Let $(\Omega, \mu)$ be a probability space and let $A : \Omega \to D$ be a $\mu$-measurable map of $\Omega$ to the open unit disc $D = \{z \in \mathbb{C} \mid |z| < 1\}$. Let $\tau : \Omega \to \Omega$ be a bimeasurable bijection with respect to which $\mu$ is ergodic [W]. Suppose that there is a point $\omega \in \Omega$ such that $\alpha_n = A(\tau^{n-1}(\omega))$.
In this situation, we say that the matrices $T(z, n)$ (or the orthogonal polynomials $\{\phi_n(z)\}$) have been randomized. Note that $\alpha_1 \equiv \alpha_1(\omega) = A(\omega)$; to save notation we will write $\alpha_1$ in place of $A$ from now on. We have $\alpha_n(\omega) = \alpha_1(\tau^{n-1}(\omega))$ $(n \geq 1)$. Since $\tau$ is assumed bijective, $\alpha_n$ is also defined for negative $n$ and for $n = 0$. It will be convenient to replace the pair $(\Omega, \alpha_1)$ with the path space which it generates. Thus for each $\omega \in \Omega$, let
$$s_\omega = (\alpha_1(\tau^{n-1}(\omega)))_{n=-\infty}^{\infty} \qquad (1.11)$$
This will allow us to use the results of [GJ] and, as we will see, presents no loss of generality as far as the final solution of the inverse problem is concerned. An immediate consequence of (1.11) is that, for $\mu$-a.a. $\omega$, the sequence $s_\omega$ lies in the compact subset $X_1 = \prod_{-\infty}^{\infty} \{z \in D : |z| \leq \|\alpha_1\|_\infty\}$ of $X$. Let $\tilde\Omega = \mathrm{cls}\{s_\omega \mid \omega \in \Omega_0\}$, where $\Omega_0 \subset \Omega$ is the set of $\omega$ for which $s_\omega$ lies in $X_1$. Then $\tilde\Omega$ is invariant under the homeomorphism $\tilde\tau$ of $X_1$ defined by left translation. Moreover the image $\tilde\mu$ of the measure $\mu$ under the map $i : \Omega_0 \to \tilde\Omega : \omega \to s_\omega$ is ergodic with respect to the translation $\tilde\tau$. It is natural to identify $\alpha_1$ with the map $\tilde\alpha_1 : \tilde\Omega \to D : (\cdots s_{-1}, s_0, s_1, \ldots) \to s_0$. In what follows we will identify $(\Omega, \tau, \mu, \alpha_1)$ with $(\tilde\Omega, \tilde\tau, \tilde\mu, \tilde\alpha_1)$. Note that $\Omega$ is then a compact metric space and $\alpha_1$ is continuous.

Next let $0 \neq z \in \mathbb{C}$, and define
$$\begin{aligned} \Phi_z(\omega, n) &= T(z, n) \cdots T(z, 1) \quad (n \geq 1), \\ \Phi_z(\omega, 0) &= I, \\ \Phi_z(\omega, -n) &= T(z, -n+1)^{-1} \cdots T(z, 0)^{-1} \quad (n \geq 1). \end{aligned} \tag{1.12}$$
Then we can check that $\Phi_z$ is a cocycle in the sense that $\Phi_z(\tau^n(\omega), m)\, \Phi_z(\omega, n) = \Phi_z(\omega, m+n)$. We have
$$\begin{pmatrix} \phi_n(z; \omega) \\ \phi_n^*(z; \omega) \end{pmatrix} = \Phi_z(\omega, n) \begin{pmatrix} 1 \\ 1 \end{pmatrix} \quad (n \geq 1)$$
for each $\omega \in \Omega$. We now repeat two definitions of basic importance in the theory of random differential and difference equations.

Definition 1.1. Let $0 \neq z \in \mathbb{C}$, and define
$$\gamma(z) = \lim_{n\to\infty} \frac{1}{n} \log \|\Phi_z(\omega, n)\|.$$
The limit exists and is constant for $\mu$-a.a. $\omega \in \Omega$ [O]. This $\mu$-almost everywhere limit is the Lyapounov exponent of the cocycle $\Phi_z$.
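The cocycle identity stated with (1.12) can be checked numerically, including for negative indices. The sketch below uses assumptions that are not in the paper: the base space is taken to be the circle $\mathbb{R}/\mathbb{Z}$, $\tau$ is rotation by a hypothetical irrational number `THETA`, and $\alpha_1(\omega) = \tfrac12 e^{2\pi i \omega}$ is an arbitrary continuous map into the unit disc. All function names are illustrative.

```python
import cmath
import math

THETA = (math.sqrt(5.0) - 1.0) / 2.0  # hypothetical rotation number

def alpha1(omega):
    """Hypothetical continuous map of the circle R/Z into the open unit disc."""
    return 0.5 * cmath.exp(2j * math.pi * omega)

def tau_power(omega, n):
    """tau^n(omega) for the rotation; alpha1 is 1-periodic, so no reduction mod 1 is needed."""
    return omega + n * THETA

def T(z, omega, n):
    """Transfer matrix built from alpha_n(omega) = alpha1(tau^(n-1)(omega))."""
    alpha = alpha1(tau_power(omega, n - 1))
    a = (1.0 - abs(alpha) ** 2) ** -0.5
    return ((a * z, a * alpha), (a * z * alpha.conjugate(), a))

def matmul(A, B):
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def inverse(A):
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return ((A[1][1] / det, -A[0][1] / det), (-A[1][0] / det, A[0][0] / det))

def Phi(z, omega, n):
    """The cocycle of (1.12) for any integer n (z must be non-zero)."""
    M = ((1.0, 0.0), (0.0, 1.0))
    if n >= 0:
        for k in range(1, n + 1):      # T(z, n) ... T(z, 1)
            M = matmul(T(z, omega, k), M)
    else:
        for k in range(0, n, -1):      # T(z, n+1)^(-1) ... T(z, 0)^(-1)
            M = matmul(inverse(T(z, omega, k)), M)
    return M

def cocycle_defect(z, omega, m, n):
    """Largest entry of Phi(tau^n omega, m) Phi(omega, n) - Phi(omega, m + n)."""
    lhs = matmul(Phi(z, tau_power(omega, n), m), Phi(z, omega, n))
    rhs = Phi(z, omega, m + n)
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))

defects = [
    cocycle_defect(0.7 + 0.4j, 0.123, m, n)
    for m, n in [(3, 4), (-2, 5), (4, -3), (-2, -3), (0, 2)]
]
```

Up to rounding error every entry of `defects` should vanish, since the identity is purely algebraic once $\alpha_n(\omega) = \alpha_1(\tau^{n-1}(\omega))$ holds for all integers $n$.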
Definition 1.2. [C] Let $0 \neq z \in \mathbb{C}$. We say that the cocycle $\Phi_z(\omega, n)$ has an exponential dichotomy over $\Omega$ if there are constants $K > 0$, $\beta > 0$ and a continuous projection-valued function $P(\cdot) : \mathbb{C}^2 \to \mathbb{C}^2$ on $\Omega$ such that
$$\|\Phi_z(\omega, n) P(\omega) \Phi_z(\omega, m)^{-1}\| \leq K e^{-\beta(n-m)} \quad (n \geq m),$$
$$\|\Phi_z(\omega, n) (I - P(\omega)) \Phi_z(\omega, m)^{-1}\| \leq K e^{\beta(n-m)} \quad (n \leq m).$$
The range $\mathrm{Im}\, P(\omega)$ and the kernel $\mathrm{Ker}\, P(\omega)$ define a hyperbolic splitting of the cocycle $\Phi_z$ in the product space $\Omega \times \mathbb{C}^2$.

2. The Inverse Problem

We formulate the problem to be solved in this paper. Let $\Omega$ be a compact metric space, $\tau : \Omega \to \Omega$ a homeomorphism, $\alpha_1 : \Omega \to D$ a continuous map of $\Omega$ to the open unit disc $D$, and let $\mu$ be a Radon probability measure on $\Omega$ which is ergodic with respect to $\tau$. Let $\alpha_n(\omega) = \alpha_1(\tau^{n-1}(\omega))$ $(n \in \mathbb{Z})$, and let $\{\phi_n(z) \equiv \phi_n(z; \omega)\}_{n \geq 0}$ be the corresponding family of orthogonal polynomials. Thus
$$\begin{pmatrix} \phi_n(z; \omega) \\ \phi_n^*(z; \omega) \end{pmatrix} = T_z(\omega, n) \cdots T_z(\omega, 1) \begin{pmatrix} 1 \\ 1 \end{pmatrix},$$
where
$$T_z(\omega, n) = a_n(\omega) \begin{pmatrix} z & \alpha_n(\omega) \\ z \bar\alpha_n(\omega) & 1 \end{pmatrix} \tag{2.1}$$
and $a_n(\omega) = (1 - |\alpha_n(\omega)|^2)^{-1/2}$. Let $\gamma(z)$ be the Lyapounov exponent of the cocycle $\Phi_z(\omega, n)$ defined by $\{T_z\}$ (see (1.12)). For each $\omega \in \Omega$, let $\Sigma$ be the set of non-isolated points in the topological support of the orthogonality measure $d\sigma_\omega$ of the polynomials $\{\phi_n(z; \omega) \mid n \geq 1\}$ (see [A, G]). Then $\Sigma$ is a closed subset of the unit circle $K$. It is known (e.g., [GJ, Thm. 5.4]) that $\Sigma$ is independent of $\omega$ $\mu$-a.e. in $\Omega$. We will suppose that (i) $\Sigma = [z_0, z_1] \cup [z_2, z_3] \cup \cdots \cup [z_{2N-2}, z_{2N-1}]$ is a finite union of (necessarily nondegenerate) arcs in $K$; (ii) $\gamma(z) = 0$ for Lebesgue-a.a. $z \in \Sigma$. Here the points $z_0, \ldots, z_{2N-1}$ are ordered (say) counter-clockwise on $K$. Starting from assumptions (i) and (ii), we will devise an explicit description of $(\Omega, \tau, \mu)$ and a formula for $\alpha_1$. Note that with these assumptions $\gamma$ is the Green's function associated with the complex set $\Sigma$, i.e., $\gamma$ is harmonic on $\mathbb{C} \setminus \Sigma$, $\gamma - \log|z|$ is harmonic at infinity and $\gamma(z) = 0$ Lebesgue-a.a. on $\Sigma$.

Our starting point will be the Weyl $m$-functions, discussed in detail in [GJ]. First of all, it is proved in [GJ, Thm. 4.3] that, if $z \notin \Sigma$, then the cocycle $\Phi_z$ admits an exponential dichotomy over the topological support of the measure $\mu$ in $\Omega$. We henceforth assume that $\Omega$ is the topological support of $\mu$; thus $\Phi_z$ has an exponential dichotomy over $\Omega \times \mathbb{C}^2$ if $z \notin \Sigma$. We define stable and unstable bundles $W^\pm \subset \Omega \times \mathbb{C}^2$ as follows:
$$W^+ = \{(\omega, u) \mid u \in \mathrm{Im}\, P(\omega)\}, \qquad W^- = \{(\omega, u) \mid u \in \mathrm{Ker}\, P(\omega)\},$$
where $P(\omega)$ is the projection of Definition 1.2. As is shown in [GJ, Sect. 5], $W^+$ and $W^-$ are both one-dimensional (line) bundles over $\Omega$. For each $\omega \in \Omega$ and $0 \neq z$, $z \notin \Sigma$, let $m_\pm(z, \omega)$ be the unique extended complex numbers such that
$$W^\pm \cap (\{\omega\} \times \mathbb{C}^2) = \mathrm{Span} \begin{pmatrix} 1 \\ m_\pm(z, \omega) \end{pmatrix} = \left\{ c \begin{pmatrix} 1 \\ m_\pm(z, \omega) \end{pmatrix} : c \in \mathbb{C} \right\}.$$
Thus $m_\pm$ are just the projective coordinates of the complex lines $W^\pm \cap (\{\omega\} \times \mathbb{C}^2)$. We now have

Theorem 2.1. For each $\omega \in \Omega = \mathrm{Supp}\, \mu$, the functions $m_\pm(z) \equiv m_\pm(z, \omega)$ are branches of a single meromorphic function $M(z) \equiv M(z, \omega)$ on the Riemann surface $C$ of the algebraic relation $w^2 = K(z) = \prod_{j=1}^{2N} (z - z_j)$. Here we have written $z_{2N} = z_0$; thus the points $z_1, z_2, \ldots, z_{2N}$ coincide with the points $z_1, z_2, \ldots, z_0$. The intervals $(z_1, z_2), \ldots, (z_{2N-1}, z_{2N})$ determine the complement $K - \Sigma$.

Proof. We first note that in Sect. 4 of [GJ] it is shown that $m_\pm(z)$ are analytic for $|z| < 1$ and that $|m_+| < 1$, while $|m_-| > 1$ for all such $z$. If we write
$$\begin{pmatrix} a \\ b \end{pmatrix} = T_z(\omega, 1) \begin{pmatrix} 1 \\ m_-(z, \omega) \end{pmatrix}$$
and recognize that $b/a = m_-(z, \tau(\omega))$ (by the invariance of the bundle $W^-$), we obtain the equation of motion for $m_-$ (see also (1.3)),
$$m_-(z, \tau(\omega)) = \frac{z \bar\alpha_1(\omega) + m_-(z, \omega)}{z + \alpha_1(\omega) m_-(z, \omega)}. \tag{2.2}$$
Likewise
$$\begin{pmatrix} a \\ b \end{pmatrix} = T_z^{-1}(\omega, 1) \begin{pmatrix} 1 \\ m_+(z, \tau(\omega)) \end{pmatrix}$$
leads to the equation
$$m_+(z, \omega) = -z\, \frac{\bar\alpha_1(\omega) - m_+(z, \tau(\omega))}{1 - \alpha_1(\omega) m_+(z, \tau(\omega))}. \tag{2.3}$$
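Equations (2.2) and (2.3) are the same fractional linear map read in opposite directions: (2.2) pushes a projective coordinate forward through $T_z(\omega, 1)$ and (2.3) pulls it back. A minimal numerical sketch of this, with hypothetical sample values for $z$, $\alpha_1$ and $m$ and illustrative function names, is:

```python
def forward(m, z, alpha):
    """Right-hand side of (2.2): action of T_z on the projective coordinate m."""
    return (z * alpha.conjugate() + m) / (z + alpha * m)

def backward(m, z, alpha):
    """Right-hand side of (2.3): action of the inverse of T_z on m."""
    return -z * (alpha.conjugate() - m) / (1 - alpha * m)

# Hypothetical sample values; alpha lies in the open unit disc.
z = 0.5 + 0.1j
alpha = 0.4 + 0.3j
m = 0.3 - 0.2j

round_trip = backward(forward(m, z, alpha), z, alpha)

# At z = 0 the forward map sends every non-zero m to 1/alpha, which is the
# value recorded for m_- at z = 0 in (1.5).
value_at_zero = forward(m, 0.0, alpha)
```

Here `round_trip` should agree with `m` up to rounding, and `value_at_zero` with `1 / alpha`.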
The exponential dichotomy now allows an analytic continuation of $m_\pm$ to $K - \Sigma \cup \{z : |z| > 1\}$. Equations (2.2) and (2.3) again allow us to evaluate $\lim_{z\to\infty} m_\pm$ and $\lim_{z\to 0} m_\pm$. We collect the needed values for future reference:
$$m_-(0) = \frac{1}{\alpha_1(\tau^{-1}(\omega))}, \quad m_+(0) = 0, \qquad m_-(\infty) = \bar\alpha_1(\tau^{-1}(\omega)), \quad m_+(\infty) = \infty. \tag{2.4}$$
Next we use the Kotani-type result proved in [GT, Thm. 5.7] and [GJ, Cor. 5.12] (take careful note of the change in notation used in the proofs in those papers) to conclude that the functions $m_\pm(z)$ extend holomorphically through each open subinterval $(z_0, z_1), \ldots, (z_{2N-2}, z_{2N-1})$ of $\Sigma$. Moreover the extensions satisfy the following relations
$$\lim_{r\to 1^-} m_+(rz, \omega) = \lim_{r\to 1^-} \frac{1}{\overline{m_-(rz, \omega)}}, \tag{2.5}$$
for each $\omega \in \Omega$ and each $z$ in the interior of $\Sigma$.
At this point we see that $m_\pm$ are branches of a function $M = M(\cdot, \omega)$ defined and meromorphic on $C$ with the possible exception of the branch points on $C$, i.e., the inverse images in $C$ of the points $z_1, \ldots, z_{2N} \in \mathbb{C}$ under the natural projection $\pi$ mapping $C$ to the extended complex plane. However, the Riemann theorem on isolated singularities together with the fact [GJ, Eq. (5.1-5.2)] that $m_\pm$ take values on $K$ exactly when $z$ belongs to a "resolvent interval" $(z_{2i-1}, z_{2i})$ $(1 \leq i \leq N)$ shows that $M$ is meromorphic on the entire curve $C$, for all $\omega \in \Omega$. This also implies that $m_+(z_i, \omega) = 1/\overline{m_-(z_i, \omega)}$, $i = 1, \ldots, 2N$.

The Riemann surface $C$ contains two preimages under $\pi$ of $z = 0$ and of $z = \infty$ respectively. With an eye to (2.4), it is convenient to label these points as $0_\pm$ and $\infty_\pm$, where $M(0_+) = 0$, $M(\infty_+) = \infty$ and $M(0_-) = 1/\alpha_1(\tau^{-1}(\omega))$, $M(\infty_-) = \bar\alpha_1(\tau^{-1}(\omega))$. Note that this labelling does not determine the values of the meromorphic function $w = \sqrt{K(z)}$ on $C$. We agree to define $\sqrt{K(z)}$ so that, as $z \to \infty_+$, $\sqrt{K(z)} = z^N + \cdots$. Then, as $z \to \infty_-$, $\sqrt{K(z)} = -z^N + \cdots$. The function $w = \sqrt{K}$ is now well-defined on $C$; however, its value at $0_+$ may turn out to be either of the square roots of $\prod_{i=1}^{2N} z_i$.

We will now derive an explicit formula for the meromorphic function $M$. It is convenient to change coordinates. Define
$$A = \frac{1}{\sqrt 2} \begin{pmatrix} 1 & -1 \\ i & i \end{pmatrix},$$
and consider the transfer matrix $\hat T = A T A^{-1}$ in the coordinate $\hat u = A u$ defined by $A$. We obtain
$$\hat T(z, 1) = \frac{-i a_1}{2} \begin{pmatrix} i[z(1 - \bar\alpha_1) + 1 - \alpha_1] & z(1 - \bar\alpha_1) - (1 - \alpha_1) \\ -[z(1 + \bar\alpha_1) - (1 + \alpha_1)] & i[z(1 + \bar\alpha_1) + (1 + \alpha_1)] \end{pmatrix}, \tag{2.6}$$
where $\alpha_1$ and $a_1$ are evaluated at $\omega \in \Omega$. Let $\hat m$ be the complex projective coordinate in the $\hat u$-space. We have
$$\hat m = i\, \frac{1 + m}{1 - m}. \tag{2.7}$$
Using the transformation (2.7), we obtain the parameterizations $\hat m_\pm(z, \omega)$ of the bundles $W^\pm$ in the $\hat u$-coordinate. These satisfy (see (2.4)):
$$\hat m_+(0) = i, \quad \hat m_-(0) = i\, \frac{\alpha_1(\tau^{-1}(\omega)) + 1}{\alpha_1(\tau^{-1}(\omega)) - 1}, \qquad \hat m_+(\infty) = -i, \quad \hat m_-(\infty) = i\, \frac{1 + \bar\alpha_1(\tau^{-1}(\omega))}{1 - \bar\alpha_1(\tau^{-1}(\omega))}. \tag{2.8}$$
The functions $\hat m_\pm$ are branches of a single meromorphic function $\hat M$ on $C$, which we fix by requiring that $\hat M(\infty_+) = -i$. Note that (2.5) implies that
$$\hat m_+(z) = \overline{\hat m_-(z)}, \tag{2.9}$$
for $z$ in $\Sigma$.

Let us determine the function $\hat M$. In what follows, this function will be more convenient than the function $M$. First of all, since $|m_+| < 1$ while $|m_-| > 1$ for $|z| < 1$, we find that $\mathrm{Im}\, \hat m_+ > 0$ and $\mathrm{Im}\, \hat m_- < 0$ if $|z| < 1$, and by the relation of $\hat m_\pm$ to positive measures on the circle (see (1.8) and (1.9)) it follows that if these can be continued to $|z| > 1$ then $\mathrm{Im}\, \hat m_+ < 0$ and $\mathrm{Im}\, \hat m_- > 0$ in this region. From this plus the fact that
from (2.9), $\hat m_+(z_i) = \hat m_-(z_i)$, $i = 1, \ldots, 2N$, it follows without difficulty that for each $\omega \in \Omega$, the function $\hat M$ has exactly one simple pole $P_i(\omega)$ in each simple closed curve $\pi^{-1}([z_{2i-1}, z_{2i}]) \subset C$ $(1 \leq i \leq N)$. Here we use the fact that $\hat m_+$ has a real positive angular derivative and $\hat m_-$ has a real negative angular derivative at each point of a resolvent interval $(z_{2i-1}, z_{2i})$. Note that these curves determine exactly the set where $\mathrm{Im}\, \hat M = \infty$. In what follows, we will denote by $p_i$ the images of the poles $P_i$, i.e., $p_i = \pi(P_i) \in [z_{2i-1}, z_{2i}]$. Note that $K^{1/2}(P_i) = \pm K^{1/2}(\pi(P_i)) = \pm K^{1/2}(p_i)$, where the sign taken depends upon the sheet on which $P_i$ is located.

Next let $i : C \to C$ be the involution which interchanges the sheets of $C$: $i(z, w) = (z, -w)$. The branch points of $C$ are exactly $z_1, \ldots, z_{2N}$ (or rather their inverse images in $C$). These are the zeroes of the function $\hat M - \hat M \circ i$. Combining this fact with the previous paragraph, we get
$$\hat M - \hat M \circ i = c\, \frac{\sqrt{K(z)}}{\prod_{i=1}^{N} (z - p_i)}, \tag{2.10}$$
where $c$ is a constant to be determined. We use the fact that the zeroes and poles of $\hat M - \hat M \circ i$ are all simple. We further note that $\hat M + \hat M \circ i$ is a rational function of the complex variable $z$. Therefore
$$\hat M + \hat M \circ i = \frac{Q(z)}{\prod_{i=1}^{N} (z - p_i)}, \tag{2.11}$$
where $Q$ is a polynomial. Since $\hat M$ is finite and non-zero at $\infty_\pm$, the order of $Q$ must be $N$: $Q(z) = q_N z^N + \cdots + q_1 z + q_0$. Combining (2.10) and (2.11), we get
$$\hat M = \frac{Q(z) + c \sqrt{K(z)}}{2 \prod_{i=1}^{N} (z - p_i)}. \tag{2.12}$$
Lemma 2.2. Let $\hat M$ be given by Eq. (2.12); then $c$ and the coefficients of $Q$ are uniquely determined by the values of $\hat M$ at $0_+$, $\infty_+$, and the poles $P_i$, $i = 1, \ldots, N$. Furthermore they are meromorphic functions on $C$ of the poles $P_i$. The formula
$$q_N(\omega) = \frac{2i\, \bar\alpha_1(\tau^{-1}(\omega))}{1 - \bar\alpha_1(\tau^{-1}(\omega))}, \tag{2.13}$$
shows that the retarded coefficients $\alpha_1(\tau^{-1}\omega)$ are also a meromorphic function on $C$ of the poles $P_i$.

Proof. Note that
$$\hat M \circ i = \frac{Q - c \sqrt{K}}{2 \prod_{i=1}^{N} (z - p_i)}.$$
Since only one inverse image $P_i \in \pi^{-1}\pi(P_i)$ is a pole of $\hat M$, we must have
$$c \sqrt{K(P_i)} = Q(p_i) \quad (1 \leq i \leq N), \tag{2.14}$$
where the sign taken by the square root depends on the sheet on which $P_i$ is located. Note that Eq. (2.14) still holds if $P_i$ coincides with a ramification point. We also have $\hat M(0_+) = i$, which means that
$$2i \prod_{i=1}^{N} (-p_i) = c \sqrt{K(0_+)} + q_0. \tag{2.15}$$
In addition $\hat M(\infty_+) = -i$, hence
$$-2i = q_N + c. \tag{2.16}$$
Equations (2.14)–(2.16) uniquely determine $q_0, \ldots, q_N$ and $c$. Substituting $\infty_-$ into the argument of $\hat M$, we have
$$2i\, \frac{1 + \bar\alpha_1(\tau^{-1}(\omega))}{1 - \bar\alpha_1(\tau^{-1}(\omega))} = q_N(\omega) - c(\omega), \tag{2.17}$$
which when combined with (2.16) and (2.17), yields
$$c(\omega) = \frac{-2i}{1 - \bar\alpha_1(\tau^{-1}(\omega))}, \qquad q_N(\omega) = \frac{2i\, \bar\alpha_1(\tau^{-1}(\omega))}{1 - \bar\alpha_1(\tau^{-1}(\omega))}. \tag{2.18}$$
If Newton's interpolation formula is used with Eq. (2.14) we find
$$Q(z) = q_N \prod_{j=1}^{N} (z - p_j) + c \sum_{j=1}^{N} K^{1/2}(P_j) \prod_{k \neq j} \frac{z - p_k}{p_j - p_k}. \tag{2.19}$$
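The interpolation formula (2.19) can be checked directly: the product term vanishes at every $p_j$ and carries the leading coefficient $q_N$, while the sum is the Lagrange interpolant through the values $c\, K^{1/2}(P_j)$ required by (2.14). The sketch below is only an illustration; the points `p`, the values `s` standing in for $K^{1/2}(P_j)$, and the value of `q_N` are hypothetical, with `c` chosen so that (2.16) holds.

```python
import cmath

def Q(z, q_N, c, p, s):
    """Evaluate (2.19); p[j] plays the role of p_j and s[j] that of K^(1/2)(P_j)."""
    lead = q_N
    for pj in p:
        lead *= (z - pj)
    total = 0j
    for j, pj in enumerate(p):
        term = s[j]
        for k, pk in enumerate(p):
            if k != j:
                term *= (z - pk) / (pj - pk)
        total += term
    return lead + c * total

# Hypothetical data: three distinct points on the unit circle and arbitrary values.
p = [cmath.exp(1j * t) for t in (0.4, 2.1, 4.5)]
s = [0.7 - 0.2j, -1.1 + 0.3j, 0.25 + 0.9j]
q_N = 0.5 - 1.0j
c = -2j - q_N  # chosen so that (2.16), -2i = q_N + c, holds

# (2.14): Q(p_j) = c * K^(1/2)(P_j) at every interpolation point.
residuals = [abs(Q(pj, q_N, c, p, s) - c * sj) for pj, sj in zip(p, s)]

# The coefficient of z^N is q_N: compare Q(R) / R^N with q_N for large R.
R = 1.0e6
leading = Q(R, q_N, c, p, s) / R ** len(p)
```

Each entry of `residuals` should be at rounding level, and `leading` should approach `q_N` as `R` grows.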
If (2.17) and (2.18) are now used in the above equation and if D is defined by 1 D = −K0 ... 1
p1
...
pN
···
−1 pN p1 1 + (−1)N .. . −1 pN pN N
we can solve for qN in the form N −1 p1 1 p1 · · · p1 . . N 2i K0 .. + (−1) .. N −1 1 pN · · · pN pN qN = D
···
−1 pN 1
···
−1 pN N
pN 1 + K1 , N pN + K N
···
−1 pN 1
···
−1 pN N
K1 + pN 1 KN + p N N
. (2.20)
An analogous formula for c is N +1 4i (−1) c=
p1 .. . pN
···
−1 pN 1
−1 · · · pN N D
K1 + pN 1 KN + p N N
.
(2.21)
Since $z$ can be considered as a meromorphic function on $C$ of degree two we see that the coefficients of $Q$ are meromorphic functions of the pole divisor $\{P_1(\omega), \ldots, P_N(\omega)\}$ of $\hat M(\cdot, \omega)$. Equations (2.21) and (2.18) show that $c(\omega)$ and the retarded coefficient $\alpha_1(\tau^{-1}(\omega))$ (or rather its conjugate) are also meromorphic functions of $\{P_1(\omega), \ldots, P_N(\omega)\}$.

If we combine the formula (2.18) for $c$ with an explicit evaluation of $\hat M(0_+)$ using relations (2.8), we get
$$-\frac{1 - \bar\alpha_1(\tau^{-1}(\omega))}{1 - \alpha_1(\tau^{-1}(\omega))} = \frac{K_0}{\prod_{i=1}^{N} (-p_i(\omega))} = \frac{\sqrt{\prod_{i=1}^{2N} z_i}}{\prod_{i=1}^{N} (-p_i(\omega))}. \tag{2.22}$$
This simple relation determines the argument of $1 - \alpha_1(\tau^{-1}(\omega))$. The determination of the values of $\alpha_1$ itself appears to require the previous calculations. Finally, from $\hat M(0_-)$ and (2.15) it follows that $q_0 = \bar q_N \prod_{i=1}^{N} (-p_i)$.

3. Determination of the Pole Motion

As we have seen in Sect. 2, the (retarded) coefficient $\alpha_1$ is determined by the poles $P_1, \ldots, P_N$ of $\hat M$:
$$\bar\alpha_1(\tau^{-1}(\omega)) = \frac{q_N(\omega)}{2i + q_N(\omega)},$$
where $q_N$ is the function (2.20) of the poles $P_1(\omega), \ldots, P_N(\omega)$. We fix $\omega \in \Omega$ and write $P_i(n) = P_i(\tau^n(\omega))$ $(n \in \mathbb{Z})$. We will determine the pole motion $n \to \{P_1(n), \ldots, P_N(n)\}$.

The first step is to consider the behavior of the function $\hat m_+(z)$ near $z = 0$ and $z = \infty$. We already know that $\hat m_+(0) = i$ and $\hat m_+(\infty) = -i$ (see (2.8)). We also have
$$\hat m_+ = i\, \frac{1 + m_+}{1 - m_+}, \tag{2.5}$$
where $m_\pm(z)$ are the original $m$-functions considered in Theorem 2.1. Recall from Eq. (2.4) that $m_+(0) = 0$; hence we can write $m_+(z) = \beta z + O(z^2)$, where $\beta$ is to be determined. With
$$m_1(z) = m_+(z, \tau(\omega)), \qquad m_0(z) = m_+(z, \omega),$$
we find by inverting (2.3) that
$$m_1(z) = \frac{z \bar\alpha_1(\omega) + m_0(z)}{z + \alpha_1(\omega) m_0(z)}. \tag{3.1}$$
Substituting $m_0(z) = \beta z + O(z^2)$, we see that
$$m_1(z) = \frac{z[\bar\alpha_1(\omega) + \beta] + O(z^2)}{z[1 + \beta \alpha_1(\omega)] + O(z^2)}.$$
The requirement that $m_1(0) = 0$ forces $\beta = -\bar\alpha_1(\omega)$. Thus Eq. (2.7) gives
$$\hat m_+(z, \omega) = i\, \frac{1 - \bar\alpha_1(\omega) z + O(z^2)}{1 + \bar\alpha_1(\omega) z + O(z^2)} = i[1 - 2\bar\alpha_1(\omega) z + O(z^2)]. \tag{3.2}$$
α1 (ω)u + m ˜ 0 (u) . α1 (ω)m ˜ 0 (u) + u
˜ 0 (u) = βu + O(u2 ), and hence Since m ˜ 0 (0) = 0, we can write m m ˜ 1 (u) =
[β + α1 (ω)]u + O(u2 ) . [α1 (ω)β + 1]u + O(u2 )
Since m ˜ 1 (0) = 0 we get β = −α1 (ω), and so finally 1 − α1 (ω)u + O(u2 ) m ˆ + (u, ω) = i = −i[1 − 2αi (ω)u + O(u2 )]. −1 − α1 (ω)u + O(u2 )
(3.3)
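Armed with formulas (3.2) and (3.3), we study the Wronskian of (1.2). Fix $\omega \in \Omega$ and $0 \neq z \in \mathbb{C}$. Let $\begin{pmatrix} v_0 \\ w_0 \end{pmatrix}$ be a non-zero vector in the complex line in $\mathbb{C}^2$ determined by $\hat m_-(z)$. For example, we can choose $\begin{pmatrix} v_0 \\ w_0 \end{pmatrix} = \begin{pmatrix} 1 \\ \hat m_-(z) \end{pmatrix}$. Similarly, let $\begin{pmatrix} e_0 \\ f_0 \end{pmatrix}$ be a non-zero vector in the complex line determined by $\hat m_+(z)$.

Write $\begin{pmatrix} v_n \\ w_n \end{pmatrix}$, $\begin{pmatrix} e_n \\ f_n \end{pmatrix}$ for the images of these vectors under the map $\hat T(z,n)\cdots\hat T(z,1) = \hat T_z(\omega,n)\cdots\hat T_z(\omega,1)$, thus
$$\frac{e_n}{e_0} = G_n(z) + H_n(z)\hat m_+(z,0), \qquad \frac{v_n}{v_0} = G_n(z) + H_n(z)\hat m_-(z,0), \tag{3.4}$$
where $G_n$ and $H_n$ are polynomials of degree $n$ in $z$. The Wronskian relation gives
$$v_n f_n - e_n w_n = W\!\left(\begin{pmatrix} v_n \\ w_n \end{pmatrix}, \begin{pmatrix} e_n \\ f_n \end{pmatrix}\right) = z^n\, W\!\left(\begin{pmatrix} v_0 \\ w_0 \end{pmatrix}, \begin{pmatrix} e_0 \\ f_0 \end{pmatrix}\right) = z^n[v_0 f_0 - e_0 w_0] = z^n v_0 e_0\left[\frac{f_0}{e_0} - \frac{w_0}{v_0}\right] = z^n v_0 e_0[\hat m_+(z) - \hat m_-(z)].$$
Thus
$$v_n e_n\left[\frac{f_n}{e_n} - \frac{w_n}{v_n}\right] = z^n v_0 e_0[\hat m_+(z,0) - \hat m_-(z,0)],$$
and so
$$\frac{e_n v_n}{e_0 v_0} = z^n\,\frac{\hat m_+(z,0) - \hat m_-(z,0)}{\hat m_+(z,n) - \hat m_-(z,n)}, \tag{3.5}$$
where we have written $\hat m_\pm(z,0) = \hat m_\pm(z,\omega)$, $\hat m_\pm(z,n) = \hat m_\pm(z,\tau^n(\omega))$. Writing $c_0 = \frac{-2i}{1-\alpha_1(\tau^{-1}(\omega))}$, $c_n = \frac{-2i}{1-\alpha_1(\tau^{n-1}(\omega))}$ for the constants appearing in the expressions (2.12) for $\hat M(\cdot,\omega)$ and $\hat M(\cdot,\tau^n(\omega))$, we get
$$\frac{e_n v_n}{e_0 v_0} = z^n\,\frac{c_0\prod_{i=1}^N (z - P_i(n))}{c_n\prod_{i=1}^N (z - P_i(0))}. \tag{3.6}$$
We claim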
Lemma 3.1. $\frac{e_n}{e_0}$ and $\frac{v_n}{v_0}$ are branches of a single meromorphic function $h_n$ on the algebraic curve $C$.
Proof. Choose $e_0 = v_0 = 1$. Using formula (2.6) for $\hat T(z,1)$, we have
$$e_1 = A_1(z) + B_1(z)\hat m_+(z,0), \qquad v_1 = A_1(z) + B_1(z)\hat m_-(z,0),$$
where
$$A_1(z) = \frac{a_1}{2}\cdot[z(1-\alpha_1(\omega)) + (1-\alpha_1(\omega))], \qquad B_1(z) = \frac{-ia_1}{2}\cdot[z(1-\alpha_1(\omega)) - (1-\alpha_1(\omega))].$$
Thus we see that $h_1 = A_1(z) + B_1(z)\hat M$, and so the claim follows for $n = 1$. Next note that
$$\frac{e_n}{e_0} = \frac{e_n}{e_{n-1}}\,\frac{e_{n-1}}{e_{n-2}}\cdots\frac{e_1}{e_0}, \qquad \frac{v_n}{v_0} = \frac{v_n}{v_{n-1}}\,\frac{v_{n-1}}{v_{n-2}}\cdots\frac{v_1}{v_0}.$$
Since
$$\frac{e_k}{e_{k-1}} = A_k(z) + B_k(z)\hat m_+(z,k-1), \qquad \frac{v_k}{v_{k-1}} = A_k(z) + B_k(z)\hat m_-(z,k-1), \tag{3.7}$$
we see that
$$h_n = g_n\cdot g_{n-1}\cdots g_1, \tag{3.8}$$
where $g_k$ is the meromorphic function on $C$ determined by (3.7) ($1 \le k \le n$). In particular $g_1 = h_1$.

We consider more closely the meromorphic function $h_1$ on $C$. From the relation (3.6):
$$\frac{e_1 v_1}{e_0 v_0} = z\cdot\frac{c_0\prod_{i=1}^N (z - P_i(1))}{c_1\prod_{i=1}^N (z - P_i(0))}, \tag{3.9}$$
we see that the poles of $h_1$ on $C$ are contained among the points $\infty_+$, $\infty_-$, and the inverse images $\pi^{-1}\pi(P_i(0))$ ($1 \le i \le N$) of the projections $\pi(P_i(0))$ of the points $P_i(0)$ to $\mathbb{C}$.

Lemma 3.2. The poles of $h_1$ are $\infty_-$ and the points $P_1(0), \dots, P_N(0)$, and all these poles are simple.

Proof. To verify this claim, note first that the simplicity follows immediately from (3.9) and the fact that the $P_i(0)$ are distinct. Second, (3.9) together with the formula $h_1 = A_1 + B_1\hat M$ shows that the "finite" poles of $h_1$ are exactly $P_1(0), \dots, P_N(0)$. To show that $\infty_-$ rather than $\infty_+$ is a pole of $h_1$, we use the equation of motion
$$\begin{pmatrix} e_1 \\ f_1 \end{pmatrix} = \frac{-ia_1}{2}\begin{pmatrix} i[z(1-\alpha_1) + (1-\alpha_1)] & z(1-\alpha_1) - (1-\alpha_1) \\ * & * \end{pmatrix}\begin{pmatrix} 1 \\ \hat m_+(z,0) \end{pmatrix}$$
together with (3.3). Writing $u = 1/z$, we have
$$e_1 = \frac{-ia_1}{2u}\Big\{u[i(1-\alpha_1) + i(1-\alpha_1) + 2i\alpha_1(1-\alpha_1)] + O(u^2)\Big\} = \frac{-ia_1}{2}\cdot 2i[1 - |\alpha_1|^2] + O(u) = (1 - |\alpha_1(\omega)|^2)^{1/2} + O(u).$$
This means that, at $\infty_+$, $h_1$ has a finite value which can be evaluated from
$$h_1(u) = (1 - |\alpha_1(\omega)|^2)^{1/2} + O(u); \tag{3.10}$$
this result will be useful later. Combining (3.10) with (3.9), we see that indeed $h_1$ has a simple pole at $\infty_- \in C$. For later reference, we note that for $z$ near zero one finds from (3.2) and the equation of motion for $\begin{pmatrix} e_1 \\ f_1 \end{pmatrix}$,
$$e_1(z) = (1 - |\alpha_1(\omega)|^2)^{1/2} z + O(z^2), \tag{3.11}$$
which means that $h_1$ has a simple zero at $0_+$ with principal part $(1 - |\alpha_1(\omega)|^2)^{1/2}$. The fact that the principal parts in (3.10) and (3.11) are equal is not an accident and will be important later on.

Having shown that $h_1$ has simple poles in $P_1(0), \dots, P_N(0)$ and in $\infty_-$, and no others, we now determine the zeroes of $h_1$ on $C$. We use again the equation of motion:
$$\begin{pmatrix} e_1 \\ f_1 \end{pmatrix} = \hat T(z,1)\begin{pmatrix} 1 \\ \hat m_+(z,0) \end{pmatrix} = \begin{pmatrix} A_1 + B_1\hat m_+(z,0) \\ C_1 + D_1\hat m_+(z,0) \end{pmatrix},$$
where $A_1$, $B_1$ are as above and
$$C_1 = \frac{ia_1}{2}[z(1+\alpha_1) - (1+\alpha_1)], \qquad D_1 = \frac{a_1}{2}\cdot[z(1+\alpha_1) + (1+\alpha_1)].$$
The quantity $\alpha_1$ is evaluated at $\omega$: $\alpha_1 = \alpha_1(\omega)$. It follows that
$$\hat m_+(z,1) = \frac{C_1(z) + D_1(z)\hat m_+(z,0)}{A_1(z) + B_1(z)\hat m_+(z,0)},$$
and hence the poles of $\hat M(\tau(\omega),\cdot)$ are either zeroes of $A_1 + B_1\hat M(\omega,\cdot)$ or poles of $C_1 + D_1\hat M(\omega,\cdot)$. But the finite poles of $C_1 + D_1\hat M(\omega,\cdot)$ are among the finite poles of $A_1 + B_1\hat M(\omega,\cdot)$, hence the poles of $\hat M(\tau(\omega),\cdot)$ are among the zeroes of $A_1 + B_1\hat M(\omega,\cdot)$. Now, by (3.11), $h_1 = A_1 + B_1\hat M(\omega,\cdot)$ has a simple zero in $0_+$, and from (3.9) the set of zeroes of $h_1$ includes the set of poles $P_1(1), \dots, P_N(1)$ of $\hat M(\tau(\omega),\cdot)$. Since the zeroes and poles of a meromorphic function are equal in number, we conclude that the zeroes of $h_1$ are exactly $0_+, P_1(1), \dots, P_N(1)$ and that these are all simple. The reasoning just used extends, using (3.4), (3.6), and (3.8), to show

Theorem 3.3. For each $n \ge 1$, $h_n$ has simple poles at $P_1(0), \dots, P_N(0)$ and a pole of order $n$ at $\infty_-$, while $h_n$ has simple zeroes at $P_1(n), \dots, P_N(n)$ and a zero of order $n$ at $0_+$.
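The counting behind Theorem 3.3 can be sketched as simple bookkeeping: the divisor of $h_n$ must have degree zero, so the $N + n$ poles are balanced by $N + n$ zeroes. The point labels and function names below are hypothetical and serve only to illustrate the count; nothing here computes the actual points on the curve.

```python
from collections import Counter


def divisor_of_h(n: int, N: int) -> Counter:
    """Divisor of h_n as stated in Theorem 3.3 (n >= 1).

    Positive multiplicities are zero orders, negative ones are pole orders.
    """
    div = Counter()
    for i in range(1, N + 1):
        div[f"P{i}(0)"] -= 1      # simple pole at P_i(0)
        div[f"P{i}({n})"] += 1    # simple zero at P_i(n)
    div["inf-"] -= n              # pole of order n at infinity_-
    div["0+"] += n                # zero of order n at 0_+
    return div


def degree(div: Counter) -> int:
    """Sum of multiplicities; zero for the divisor of a meromorphic function."""
    return sum(div.values())


def pole_count(div: Counter) -> int:
    """Number of poles counted with multiplicity."""
    return -sum(m for m in div.values() if m < 0)


if __name__ == "__main__":
    d = divisor_of_h(n=3, N=4)
    print(degree(d), pole_count(d))
```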
The idea now is to use Abel's theorem and the Jacobi Inversion theorem to show that the motion $n \to \{P_1(n), \dots, P_N(n)\}$ is quasi-periodic, which will imply from (2.13) and (2.20) that $\alpha_n$ is quasi-periodic. Since the poles of $h_n$ include $P_1(0), \dots, P_N(0)$, and the zeroes of $h_n$ include $P_1(n), \dots, P_N(n)$, it is clear that this idea is reasonable. There are two difficulties in carrying it to completion that do not arise in the case of the one-dimensional Schrödinger operator ([DMN, MM]; historically speaking the first to which the inverse method we are now using was applied). First of all, the poles are $N$ in number while the genus of $C$ is $N - 1$. This means that the standard version of the Jacobi Inversion theorem [S] cannot be applied. This difficulty arises also in the case of the AKNS operator [P, DC-J]. It is circumvented in those papers by singularizing the curve $C$. Namely, one identifies two points on $C$; intuitively this raises the genus by one and makes it possible to apply a generalized version of the Jacobi Inversion theorem [F] to the poles. In our case we wish to identify the points $0_+$ and $\infty_+$, which can be accomplished (see (3.9) and (3.10)) by "dividing by $\sqrt{z}$". This amounts to working on a curve $\tilde C$ which is a four-times ramified double cover of $C$. It turns out that a generalized Abel's theorem can be applied when we work on $\tilde C$. Now for the details.

We begin by introducing the Riemann surface $\tilde C$ of the algebraic relation $w^2 = K(u^2)$ (thus $u = \sqrt{z}$). The following facts may be verified:

(1) The genus of $\tilde C$ is $2N - 1$. There is a projection $\tilde\pi : \tilde C \to C$ which is 2-1 except at the points $0_\pm, \infty_\pm \in C$, where $\tilde\pi$ is 1-1. We will label the inverse images in $\tilde C$ of $0_\pm, \infty_\pm$ with the same symbols $0_\pm, \infty_\pm$.

(2) There is a homology basis $\tilde a_1, \dots, \tilde a_{2N-1}; \tilde b_1, \dots, \tilde b_{2N-1}$ of $\tilde C$ with the following properties. First of all, the basis is normalized: $\tilde a_i \circ \tilde a_j = 0 = \tilde b_i \circ \tilde b_j$, $\tilde a_i \circ \tilde b_j = \delta_{ij}$, where $\circ$ denotes intersection number. Second, the family of projected curves $\{\tilde\pi \circ \tilde a_i, \tilde\pi \circ \tilde b_i \mid 1 \le i \le 2N - 1\}$ contains a normalized homology basis $\{a_1, \dots, a_{N-1}, b_1, \dots, b_{N-1}\}$ of $C$. The reader is advised to draw pictures to verify these facts.

(3) A basis for the holomorphic differentials on $\tilde C$ is given by
$$\eta_i = \frac{u^{i-1}\,du}{\sqrt{K(u^2)}} \qquad (1 \le i \le 2N - 1).$$
Observe that
$$2\eta_{2r} = \tilde\pi^*\omega_r \qquad (1 \le r \le N - 1),$$
where
$$\omega_r = \frac{z^{r-1}\,dz}{\sqrt{K(z)}} \qquad (1 \le r \le N - 1)$$
is a holomorphic differential on $C$. The forms $\{\omega_r \mid 1 \le r \le N - 1\}$ form a basis for the holomorphic differentials on $C$.

Let us now fix a differential of the third kind $\omega_N$ on $C$ with simple poles at $0_+$ and $\infty_+$, with residues $-1$ and $+1$ respectively, and no other poles. Such a differential can be uniquely determined by requiring that $\int_{a_i}\omega_N = 0$ ($1 \le i \le N - 1$), but there seems no particular point in doing so here. Define
$$\eta_{2N} = \frac{1}{2}\tilde\pi^*\omega_N.$$
Then $\eta_{2N}$ has simple poles exactly at $0_+$ and $\infty_+ \in \tilde C$, with residues $-1$ and $1$ respectively. (There is no guarantee that $\int_{\tilde a_i}\eta_{2N} = 0$ even if $\int_{a_i}\omega_N = 0$ for all $i$.)
We are now ready to use the theory of generalized Jacobians as developed in Fay [F, pp. 50-60]. Let $\tilde\Lambda$ be the lattice in $\mathbb{C}^{2N}$ generated by the $4N - 1$ vectors
$$\left(\int_c \eta_1, \int_c \eta_2, \dots, \int_c \eta_{2N}\right).$$
Here $c$ ranges over the curves $\tilde a_1, \dots, \tilde a_{2N-1}, \tilde b_1, \dots, \tilde b_{2N-1}$ together with a small circle $\tilde\alpha$ in $\tilde C$ centered at $\infty_+$. Then [F, Cor. 3.8] the rank of this lattice is $4N - 1$. (In [F] one works with a normalized basis $\tilde\eta_1, \dots, \tilde\eta_{2N-1}$ of holomorphic differentials together with a normalized differential $\tilde\eta_{2N}$ of the third kind, but one has
$$\begin{pmatrix} \tilde\eta_1 \\ \vdots \\ \tilde\eta_{2N} \end{pmatrix} = \tilde A\begin{pmatrix} \eta_1 \\ \vdots \\ \eta_{2N} \end{pmatrix}$$
for a nonsingular $2N \times 2N$ matrix $\tilde A$.) Furthermore, if $f$ is a meromorphic function on $\tilde C$ such that $f(0_+) = f(\infty_+)$, and if $A$ resp. $B$ is the pole resp. zero divisor of $f$, then by the Theorem of Abel [F, p. 55]:
$$\left(\int_A^B \eta_1, \int_A^B \eta_2, \dots, \int_A^B \eta_{2N}\right) \in \tilde\Lambda.$$
It is permitted that $\{0_+, \infty_+\} \subset A$ or $\{0_+, \infty_+\} \subset B$ as long as the leading-order coefficients of $f$ at $0_+$ and $\infty_+$ are equal.

Let $\Lambda \subset \mathbb{C}^N$ be the lattice generated by the vectors
$$\left(\frac{1}{2}\int_c \omega_1, \frac{1}{2}\int_c \omega_2, \dots, \frac{1}{2}\int_c \omega_N\right),$$
where $c$ ranges over the curves $a_1, \dots, a_{N-1}; b_1, \dots, b_{N-1}, \alpha$. Note that the image $\alpha = \tilde\pi \circ \tilde\alpha$ of the circle $\tilde\alpha$ centered at $\infty_+$ is a small circle in $C$ centered at $\infty_+$, but traversed twice. This is because $\tilde C$ ramifies over $C$ at $\infty_+$. This lattice has rank $2N - 1$ in $\mathbb{C}^N$. This can be proved as follows. We have
$$\begin{pmatrix} \omega_1' \\ \vdots \\ \omega_N' \end{pmatrix} = A_0\begin{pmatrix} \omega_1 \\ \vdots \\ \omega_N \end{pmatrix},$$
where $(\omega_1', \dots, \omega_{N-1}')$ is a basis of the holomorphic differentials which is normalized with respect to $\{a_i, b_i \mid 1 \le i \le N - 1\}$, $\omega_N'$ is the differential of the third kind with residues $-1$ and $+1$ at $0_+$ and $\infty_+$ which is normalized, and $A_0$ is a nonsingular $N \times N$ matrix. Hence the lattice $\Lambda = A_0^{-1}\Lambda'$, where $\Lambda'$ is the lattice constructed in [F, pp. 50-60] except that (due to the double traversal of $\alpha$) the last vector in $\Lambda'$ is twice the last vector in the lattice of [F].

Define a meromorphic function $\tilde h_1$ on $\tilde C$ by
$$\tilde h_1(u) = \frac{h_1(z)}{u} = \frac{h_1(u^2)}{u},$$
where we make the obvious abuse of notation.
Lemma 3.4. The function $\tilde h_1$ has simple poles at the inverse images $P_1^{(1)}(0), P_1^{(2)}(0), \dots, P_N^{(1)}(0), P_N^{(2)}(0)$ of the points $P_1(0), \dots, P_N(0)$ with respect to $\tilde\pi$. It has simple zeroes at the inverse images $P_1^{(1)}(1), P_1^{(2)}(1), \dots, P_N^{(1)}(1), P_N^{(2)}(1)$ of the points $P_1(1), \dots, P_N(1)$. In addition, $\tilde h_1$ has simple poles at $0_-$ and $\infty_-$, and simple zeroes at $0_+$ and $\infty_+$. The first-order terms in the Taylor expansions of $\tilde h_1$ at $0_+$ and $\infty_+$ are equal, i.e., both are $(1 - |\alpha_1(\omega)|^2)^{1/2}$. Finally
$$\sum_{\substack{1 \le i \le N \\ j = 1,2}}\left(\int_{P_i^{(j)}(0)}^{P_i^{(j)}(1)}\eta_1, \dots, \int_{P_i^{(j)}(0)}^{P_i^{(j)}(1)}\eta_{2N}\right) = \left(\int_{0_+}^{0_-}\eta_1 + \int_{\infty_+}^{\infty_-}\eta_1, \dots, \int_{0_+}^{0_-}\eta_{2N} + \int_{\infty_+}^{\infty_-}\eta_{2N}\right) \bmod \tilde\Lambda, \tag{3.12}$$
which implies
$$\sum_{i=1}^N\left(\int_{P_i(0)}^{P_i(1)}\omega_1, \dots, \int_{P_i(0)}^{P_i(1)}\omega_N\right) = v \bmod \Lambda, \tag{3.13}$$
where
$$v = \left(\frac{1}{2}\int_{0_+}^{0_-}\omega_1 + \frac{1}{2}\int_{\infty_+}^{\infty_-}\omega_1, \dots, \frac{1}{2}\int_{0_+}^{0_-}\omega_N + \frac{1}{2}\int_{\infty_+}^{\infty_-}\omega_N\right). \tag{3.14}$$
The integrals involving $\eta_{2N}$ are interpreted so that the logarithmic terms cancel, leaving a finite result (recall that $\eta_{2N}$ has residues $-1$ and $1$ at $0_+$ and $\infty_+$).

Proof. The first two parts of the lemma follow from the properties of $h_1$ and Eqs. (3.9) and (3.10). To obtain (3.12) we apply Abel's theorem to $\tilde h_1$. Here
$$A = \left\{P_1^{(1)}(0), \dots, P_N^{(1)}(0), P_1^{(2)}(0), \dots, P_N^{(2)}(0), 0_-, \infty_-\right\},$$
$$B = \left\{P_1^{(1)}(1), \dots, P_N^{(1)}(1), P_1^{(2)}(1), \dots, P_N^{(2)}(1), 0_+, \infty_+\right\}.$$
We now project (3.12) to the curve $C$. Consider the projection $P : \mathbb{C}^{2N} \to \mathbb{C}^N : (z_1, z_2, \dots, z_{2N}) \to (z_2, z_4, \dots, z_{2N})$. Recall that $\eta_{2r} = \frac{1}{2}\tilde\pi^*\omega_r$, and that the images under $\tilde\pi$ of the curves $\tilde a_1, \dots, \tilde a_{2N-1}, \tilde b_1, \dots, \tilde b_{2N-1}$ contain a normalized homology basis of $C$. Next let $v \in \mathbb{C}^N$ be the vector
$$v = \left(\frac{1}{2}\int_{0_+}^{0_-}\omega_1 + \frac{1}{2}\int_{\infty_+}^{\infty_-}\omega_1, \dots, \frac{1}{2}\int_{0_+}^{0_-}\omega_N + \frac{1}{2}\int_{\infty_+}^{\infty_-}\omega_N\right),$$
where we choose fixed paths on $C$ joining $0_+$ with $0_-$ and $\infty_+$ with $\infty_-$. Using (3.12), we conclude that
$$\sum_{i=1}^N\left(\int_{P_i(0)}^{P_i(1)}\omega_1, \dots, \int_{P_i(0)}^{P_i(1)}\omega_N\right) = v \bmod \Lambda.$$
The factor of $1/2$ in front of the $\omega_i$'s is canceled by a factor of 2 due to the repetition of the poles on $\tilde C$.

Now, we can apply the above argument to $h_n$ for each $n \ge 1$ (or, alternatively, work with $h_1(\tau(\omega),\cdot), \dots, h_1(\tau^{n-1}(\omega),\cdot)$) to find
$$\sum_{i=1}^N\left(\int_{P_i(0)}^{P_i(n)}\omega_1, \dots, \int_{P_i(0)}^{P_i(n)}\omega_N\right) = nv \bmod \Lambda. \tag{3.15}$$
To finish the discussion, let $\Lambda_0$ be the lattice in $\mathbb{C}^N$ generated by the vectors $\left\{\left(\int_c \omega_1, \dots, \int_c \omega_N\right) \mid c = a_1, \dots, a_{N-1}; b_1, \dots, b_{N-1}\right\}$ together with the vector $\left(\frac{1}{2}\int_\alpha \omega_1, \dots, \frac{1}{2}\int_\alpha \omega_N\right)$. This just means that we traverse once the small circle centered at $\infty_+$. Then by [F, Prop. 3.11], the generalized Jacobian,
$$J_0(C) \stackrel{\text{def}}{=} \frac{\mathbb{C}^N}{\Lambda_0},$$
is birationally isomorphic to the set of unordered $N$-tuples $\{P_1, \dots, P_N\}$ from $C$ which contain neither $0_+$ nor $\infty_+$. A birational isomorphism is given by the Abel map $A$ constructed as follows: let $\{P_1^*, \dots, P_N^*\}$ be the fixed divisor not containing $0_+$ and $\infty_-$, and define
$$A(P_1, \dots, P_N) = \sum_{i=1}^N\left(\int_{P_i^*}^{P_i}\omega_1, \dots, \int_{P_i^*}^{P_i}\omega_N\right).$$
Let $c_i$ be the inverse image in $C$ of the resolvent interval $[z_{2i-1}, z_{2i}]$ ($1 \le i \le N$). It can be checked directly that the restriction of $A$ to the real $N$-torus $T^N = c_1 \times \cdots \times c_N$ is an analytic diffeomorphism onto its image. We now prove

Theorem 3.5. The sequence $\{q_N(n)\}_{n=-\infty}^{\infty}$, and hence $\{\alpha(n)\}_{n=-\infty}^{\infty}$, is a quasi-periodic sequence. That is, $q_N(n)$ is determined by the evolution of the meromorphic function $F$ along a quasi-periodic winding in the generalized Jacobian $J_0(C)$.
Proof. We begin by noting that the function which defines $q_N$ in (2.20) is symmetric and rational in the poles $P_1, \dots, P_N \in C$. Therefore it induces a meromorphic function $F$ on the complex variety $J_0(C)$ via the formula $F \circ A(P_1, \dots, P_N) = q_N(P_1, \dots, P_N)$. Observe that the denominator $D$ in (2.20) does not vanish on $T^N$, hence $F$ is a real-analytic function on the image $A(T^N) \subset J_0(C)$.

We now prove the result by making a connection between the formula (3.15) and the function $F$ via the Abel map $A$. For this, note that the lattice $\Lambda$ contains $\Lambda_0$, and indeed the quotient group $\Lambda/\Lambda_0$ contains $2^{N-1}$ elements. We identify $\Lambda/\Lambda_0$ with a set $\{v_i \mid 1 \le i \le 2^{N-1}\}$ of fixed vectors in $\mathbb{C}^N$. The complex manifold
$$J(\Lambda) \stackrel{\text{def}}{=} \frac{\mathbb{C}^N}{\Lambda}$$
is a finite quotient of $J_0(C)$. In fact the canonical projection
$$\eta : J_0(C) \to J(\Lambda) : (z_1, \dots, z_N) + \Lambda_0 \to (z_1, \dots, z_N) + \Lambda$$
is a holomorphic group homomorphism. Consider the motion $(P_1(0), \dots, P_N(0)) \to (P_1(n), \dots, P_N(n))$ defined by a fixed sequence $s_\omega = \{\alpha_k\}_{k=-\infty}^{\infty} \in \Omega$. The formula (3.15) shows that
$$\eta \circ A(P_1(n), \dots, P_N(n)) = \eta \circ A(P_1(0), \dots, P_N(0)) + nv \bmod \Lambda \qquad (-\infty < n < \infty).$$
That is, if the pole motion is transferred to $J_0(C)$ via $A$, then it descends to a linear motion on $J(\Lambda)$. From the discussion above one can show that the pole motion is also linear on $J_0(C)$; in fact from (1.11), (2.13), and (2.20) one has
$$A(P_1(n), \dots, P_N(n)) = A(P_1(0), \dots, P_N(0)) + n(v + v_i) \bmod \Lambda_0$$
for a unique choice of $v_i$ independent of $s_\omega$ and $n$. The choice is unique because the flow on $\mathbb{C}^{2N}/\tilde\Lambda$ is completely defined by the spectral intervals and because $\mathbb{C}^{2N}/\tilde\Lambda$ contains the real part $A(T^N)$ of the generalized Jacobian $J_0(C)$ as an embedded subvariety. We can now combine (2.20) and (3.15) as follows:
$$q_N(n) \equiv q_N(\tau^n(\omega)) = F \circ A(P_1(n), \dots, P_N(n)) = F(A(P_1(0), \dots, P_N(0)) + nv + nv_i), \tag{3.16}$$
which gives the result.
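The mechanism in (3.16), a fixed function sampled along a linear winding on a torus, can be sketched with a toy model. Everything below is an assumption made for illustration: the integer lattice $\mathbb{Z}^d$ stands in for the period lattice, `toy_F` stands in for the function $F$ induced by $q_N$, and the winding vector is an arbitrary choice with rationally independent entries.

```python
import math


def winding_point(x0, v, n):
    """Return x0 + n*v reduced modulo the integer lattice Z^d."""
    return tuple((a + n * b) % 1.0 for a, b in zip(x0, v))


def toy_F(theta):
    """A lattice-periodic toy observable (not the paper's F)."""
    return sum(math.cos(2 * math.pi * t) / (k + 1) for k, t in enumerate(theta))


def sequence_value(x0, v, n):
    """n-th term of the quasi-periodic sequence F(x0 + n*v)."""
    return toy_F(winding_point(x0, v, n))


if __name__ == "__main__":
    x0 = (0.1, 0.7)
    v = (math.sqrt(2) - 1, math.sqrt(3) - 1)  # hypothetical winding vector
    print([round(sequence_value(x0, v, n), 4) for n in range(6)])
```

Because `toy_F` is periodic with respect to the lattice, the value at step $n$ does not depend on which representative of $x_0 + nv$ is used; that is the sense in which the motion descends to the quotient torus.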
4. Quasi-periodic Recurrence Coefficients

In this section we discuss the theory of "reflectionless" quasi-periodic recurrence coefficients for polynomials orthogonal on the unit circle. By reflectionless recurrence coefficients (see [BGHT] for a more general formulation) we mean that $\{\alpha_n\}_{-\infty}^{\infty}$, $|\alpha_n| < 1$, are such that the topological support $\Sigma$ of the orthogonality measure $\sigma$ associated with the coefficients $\{\alpha_n\}_1^{\infty}$ is a finite number of disjoint intervals and that for almost every point in these intervals $m_\pm$ satisfy Eq. (2.5). These form a special class of those studied by [PS].

A polynomial $w(z)$ of degree $N$ is said to be self-reciprocal if $z^N\bar w(\frac{1}{z}) = w(z)$. Note that if $w(z)$ is a self-reciprocal polynomial of degree $N$, then $\overline{e^{-iN\theta/2}w(e^{i\theta})} = e^{-iN\theta/2}w(e^{i\theta})$, hence on the unit circle $e^{-iN\theta/2}w(e^{i\theta})$ can be represented as a real trigonometric polynomial (in $\frac{\theta}{2}$). We now examine the measures that are associated with $\hat m_\pm$.

Lemma 4.1. Suppose that $\{\alpha_n\}_{-\infty}^{\infty}$, $|\alpha_n| < 1$ are a sequence of recurrence coefficients such that $\tilde m_\pm$ are branches of a meromorphic function $\tilde M$ of the form given by Eq. (2.12) with $\tilde M(0_+) = i$, $\tilde M(\infty_+) = -i$, $\tilde M(0_-) = i\frac{\alpha_0 + 1}{\alpha_0 - 1}$, and $\tilde M(\infty_-) = i\frac{1 + \alpha_0}{1 - \alpha_0}$. Set $P(z) = \prod_{i=1}^N (z - p_i)$, $Q$ given above and
$$\tilde P(e^{i\theta}) = \frac{K_0^{1/2}P(e^{i\theta})}{cP(0)}, \qquad \tilde Q(e^{i\theta}) = \frac{K_0^{1/2}Q(e^{i\theta})}{cP(0)}, \qquad \tilde K(e^{i\theta}) = \frac{K(e^{i\theta})}{K_0},$$
where $K_0 = \sqrt{K(0_+)}$ and $K_0^{1/2}$ indicates a fixed square root. Then $\tilde P$, $\tilde Q$, and $\tilde K$ are self-reciprocal polynomials.

Proof. That $\tilde K$ is self-reciprocal follows since all its zeros are on the unit circle. That $\tilde P$ is self-reciprocal follows from the formula for $c$ (see (2.18)), Eq. (2.22) and the fact that all the zeros of $P$ are on the unit circle. In order to see that $\tilde Q$ is self-reciprocal we use the fact that $\tilde m_+$ and $\tilde m_-$ have integral representations given by (1.8) and (1.9) respectively multiplied by the complex number $i$. These representations and (2.12) imply that $\Im\frac{Q(e^{i\theta})}{P(e^{i\theta})} = 0$ for almost every $e^{i\theta} \in K\setminus\Sigma$. Note that $Q$ and $P$ have the same degree. The result now follows using the relation between $P$ and $\tilde P$, the fact that $\tilde P$ is self-reciprocal, and the linear independence of the set $\{e^{ij\theta/2}\}_{-N}^{N}$ on $K\setminus\Sigma$.

From the above lemma it is not difficult to show using the boundary properties of the Poisson kernel that the measure associated with $\tilde m_+$ is (see [PS], Thm. 2.1),
$$d\sigma_+(\theta) = \frac{1}{2\pi}f(e^{i\theta})\,d\theta + \sum_{i=1}^n \sigma_{k_i}\delta(\theta - \theta_{k_i})\,d\theta, \tag{4.1}$$
where
$$f(e^{i\theta}) = \begin{cases} \dfrac{|c|\sqrt{|\tilde K(e^{i\theta})|}}{2|\tilde P(e^{i\theta})|} & \text{if } e^{i\theta} \in \Sigma, \\[2ex] 0 & \text{otherwise,} \end{cases} \tag{4.2}$$
and
$$\sigma_{k_i} = \frac{1}{2i}\,\frac{-c\sqrt{\tilde K(p_{k_i})}}{p_{k_i}\dfrac{d\tilde P(p_{k_i})}{dz}} = \frac{|c|}{2}\,\frac{\sqrt{|\tilde K(e^{i\theta_{k_i}})|}}{\left|\dfrac{d\tilde P(e^{i\theta_{k_i}})}{d\theta}\right|}, \tag{4.3}$$
where pj = eiθj and eiθki , i = 1, . . . , n ≤ N are the zeros of P which are the images of ˜ on the first sheet. Note that (2.14) has been used to compute the residues, the poles of M σki , given by (4.3). From the relation between m ˜ + and m ˜ − we find that the measure associated with m ˜ − , σ− , (see (1.9)), has the same density as σ+ , i.e., that given by (4.2) while the ˜ located on the masses are given by (4.3) evaluated at the images of the poles of M second sheet. Before stating the next theorem we use (2.13) and (2.20) to obtain the following formula for α0 : K0 1 1 . 1+ α0 = (4.4) P Ki 2 P (0) (1 + (−1)N N ) dP (p ) i=1 pi
dz
i
We now prove

Theorem 4.2. Given any set of distinct arcs of the circle $[z_0, z_1], \dots, [z_{2N-2}, z_{2N-1}]$, points $p_i$, $i = 1, \dots, N$, $p_i \in [z_{2i-1}, z_{2i}]$, $i = 1, \dots, 2N$, $z_0 = z_{2N}$, and integers $0 \le k_1 < \cdots < k_n \le N$, there exists a unique probability measure $\sigma_+$ given by (4.1) - (4.3) with
$$\tilde K(z) = \frac{\prod_{i=1}^{2N}(z - z_i)}{\left(\prod_{i=1}^{2N} z_i\right)^{1/2}} \qquad\text{and}\qquad \tilde P(z) = \frac{\prod_{i=1}^{N}(z - p_i)}{\left(\prod_{i=1}^{N} -p_i\right)^{1/2}}$$
and a unique sequence of reflectionless quasi-periodic recurrence coefficients $\{\alpha_n\}_{n=-\infty}^{\infty}$, $|\alpha_n| < 1$. Furthermore $\limsup |\alpha_n| < 1$.
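The normalization of $\tilde P$ in Theorem 4.2 can be checked numerically: a monic polynomial with all zeros on the unit circle becomes self-reciprocal once it is divided by a fixed square root of $\prod(-p_i)$. The sketch below tests the coefficient form of the definition, $c_k = \overline{c_{N-k}}$; the sample zeros and the function names are assumptions chosen for illustration.

```python
import cmath


def poly_from_roots(roots):
    """Coefficients c[0..N] (ascending powers of z) of prod (z - r)."""
    coeffs = [1 + 0j]
    for r in roots:
        nxt = [0j] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            nxt[k + 1] += c       # contribution of z * (c z^k)
            nxt[k] -= r * c       # contribution of -r * (c z^k)
        coeffs = nxt
    return coeffs


def normalized(roots):
    """Divide prod (z - r) by a fixed square root of prod (-r)."""
    prod = 1 + 0j
    for r in roots:
        prod *= -r
    s = cmath.sqrt(prod)
    return [c / s for c in poly_from_roots(roots)]


def is_self_reciprocal(coeffs, tol=1e-12):
    """Check c_k == conj(c_{N-k}) for all k, i.e. z^N * wbar(1/z) == w(z)."""
    n = len(coeffs) - 1
    return all(abs(coeffs[k] - coeffs[n - k].conjugate()) < tol
               for k in range(n + 1))


if __name__ == "__main__":
    zeros = [cmath.exp(1j * t) for t in (0.3, 1.1, 2.5)]  # hypothetical p_i
    print(is_self_reciprocal(poly_from_roots(zeros)))  # monic, unnormalized
    print(is_self_reciprocal(normalized(zeros)))       # after normalization
```

The same check applies to $\tilde K$, with the $z_i$ in place of the $p_i$ and the sign absorbed because the degree $2N$ is even.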
Proof. The uniqueness of the measure follows from (4.2), (4.3), (2.21), and the fact that $D$ is not equal to zero (since the points $p_1, \dots, p_N$, $0_+$, and $\infty_+$ are linearly independent). From this measure we construct $\tilde m_+(z) = \tilde m_+(z,0)$, and $\tilde M$. This gives $\tilde m_-(z) = \tilde m_-(z,0)$ and the reflectionless property follows. Since $\sigma$ is a positive measure with an infinite number of points in its support there is a unique set of recurrence coefficients $\{\alpha_n\}_{n=1}^{\infty}$, $|\alpha_n| < 1$ associated with it. We now compute $m_\pm$ from $\tilde m_\pm$ inverting (1.7). Use (1.5) to compute $\alpha_0$, invert (1.4) to compute $m_-(\cdot,-1)$, and then $\alpha_{-1}$ from (1.5). Repeating this procedure gives $\alpha_n$, $n \le 0$. The fact that $|\alpha_n| < 1$ follows since $|m_-(z,n)| > 1$ for $|z| < 1$.

Equation (1.3) shows that each $\tilde m_+(\cdot,n)$ ($n \in \mathbb{Z}$) is a meromorphic function similar to $\tilde m_+(\cdot,0)$; consequently there is a measure $\sigma_+^n$ associated with $\tilde m_+(\cdot,n)$ which will have the form given by (4.1) - (4.3) except with different poles $p_1(n), \dots, p_N(n)$ and $c_n$. Equation (1.4) indicates that there is an analogous measure $\sigma_-^n$ associated with $\tilde m_-(\cdot,n)$. From the relation between $\tilde m_+$ and $\tilde m_-$ both $\sigma_+^n$ and $\sigma_-^n$ have the same absolutely continuous part. Since all the zeros of $P_n(z)$ lie on the unit circle for all $n$ there exists a constant $d$ such that $|P_n(z)| < d$ for all $n$ and all $z \in K$. Consequently,
$$|1 - \alpha_n| = \frac{1}{|c_n|} \ge \frac{1}{4\pi d}\int_\Sigma \sqrt{|K(e^{i\theta})|}\,d\theta,$$
and
$$\frac{1 - |\alpha_n|^2}{|1 - \alpha_n|^2} \ge \frac{1}{2\pi}\int_\Sigma f_n(e^{i\theta})\,d\theta > 0.$$
The first of the above inequalities implies that $\alpha_n$ stays uniformly away from one, which when coupled with the second implies that $\limsup |\alpha_n| < 1$. The quasi-periodicity of the $\{\alpha_n\}$ follows from the previous section.

Note that a formula similar to (4.4) holds for each $\alpha_n$ if $P$ and its zeros are replaced by $P_n$ and its zeros.

We now examine what happens if a particular sequence of recurrence coefficients is almost periodic. Again we assume that the sequence of recurrence coefficients $\alpha = \{\alpha_n\}_{n=-\infty}^{\infty}$ is an almost periodic sequence that is not identically zero with $\limsup |\alpha_n| < 1$. In this case we can take the hull associated with the above sequence as our probability space and $\mu$ is the Haar measure (see [JM, AS]). Note that if $\{\alpha_n\}$ is given only for $n \ge 1$, $\{\alpha_n\}$ for $n < 1$ can be obtained from the Fourier representation for $\alpha$. As above $\Sigma_\omega$, $\omega \in \Omega$, is the set of non-isolated points of the topological support of the orthogonality measure $\sigma_\omega$ associated with polynomials $\{\phi_n(z;\omega) \mid n \ge 1\}$. For an $\omega \in \Omega$, let $H_\omega$ be the unitary operator associated with the sequence $\alpha_\omega$ (see [GT], Eq. (2.11)).

Lemma 4.3. Let $\{\alpha_n\}_{n=-\infty}^{\infty}$, $|\alpha_n| < 1$ be an almost periodic sequence. Then the Weyl $m$ functions $\{m_\pm(z,n)\}_{n=-\infty}^{\infty}$ are almost periodic for $|z| < 1$, as is
$$g(z,n) = i\,\frac{m_-(z,n) + m_+(z,n)}{m_-(z,n) - m_+(z,n)}.$$

Proof. Since $\{\alpha_n\}_{n=-\infty}^{\infty}$ is almost periodic we need to show ([JM]) that if $\alpha_{n+n_i}$ converges uniformly in $n$ as $i$ tends to $\infty$ then $m_\pm(\cdot, n+n_i)$ also converge uniformly as $i$ tends to $\infty$. But this follows from the fact that for $|z| < 1$, $\{m_+(\cdot,n)\}$ and $\{\frac{1}{m_-(\cdot,n)}\}$ are uniformly bounded and are the unique solutions to (1.3) and (1.4) respectively satisfying (1.5).

Remark 4.4. The results to follow can also be proved in the more general case when $\Omega$ admits a unique $\tau$ invariant measure.
Lemma 4.5. With the hypotheses of Lemma 4.3 there is a set $\Sigma \subset K$ such that $\Sigma_\omega = \Sigma$ for all $\omega \in \Omega$. Furthermore there exists a unique measure $k$ with $\operatorname{supp} k = \Sigma$ such that for any $f \in C(K)$, $C(K)$ the continuous functions on $K$,
$$\int_\Sigma f\,dk = \lim_{N\to\infty}\frac{1}{2N+1}\sum_{|n|\le N}\langle f(H_\omega)e_n, e_n\rangle \tag{4.5}$$
for all $\omega \in \Omega$. Here $\{e_n\}$ is the standard basis in $l^2$.

Proof. Since $g_\omega(z,n) = \langle (H_\omega + zI)(H_\omega - zI)^{-1}e_n, e_n\rangle$ (see [GT, Eq. (5.11)]), we see that the above sum with $f = \frac{e^{i\theta} + z}{e^{i\theta} - z}$ is the average of an almost periodic function which is independent of $\omega \in \Omega$. That the integral holds for all $f \in C(K)$ follows from the Stone-Weierstrass theorem. The rest of the theorem follows from [GT, Thm. 3.6] or [GJ, Thm. 5.1] which have their roots in [AS] and [JM]. Note that (4.5) also follows from [GT, Thm. 3.6] and the topology of almost periodic functions.

Lemma 4.6. Let $\{\alpha_n\}_{n=-\infty}^{\infty}$, $|\alpha_n| < 1$ be an almost periodic sequence as described above. Then
$$\lim_{n\to\infty}\frac{1}{n}\ln|\phi_n^*(z) + z\phi_n(z)| = \int_\Sigma \ln|z - e^{i\theta}|\,dk(\theta) + R, \tag{4.6}$$
uniformly on compact subsets of $z \in \mathbb{C}\setminus K$. Furthermore,
$$\lim_{n\to\infty}\frac{1}{n}\ln|\phi_n^*(z)| = \int_\Sigma \ln|z - e^{i\theta}|\,dk(\theta) + R, \tag{4.7}$$
uniformly on compact subsets of $|z| < 1$, while
$$\lim_{n\to\infty}\frac{1}{n}\ln|\phi_n(z)| = \int_\Sigma \ln|z - e^{i\theta}|\,dk(\theta) + R, \tag{4.8}$$
uniformly on compact subsets of $|z| > 1$. Here $R = \lim\frac{1}{N}\sum_{n=1}^N \ln a_n$.

Proof. If in Lemma 4.5 we take $f(e^{i\theta}) = \ln|z - e^{i\theta}|$ then (4.6) follows from the above lemma and Lemmas 2.1 and 4.3 in [GT]. Since $\phi_n^*$ has all its zeros outside the unit circle ([Sz]), Eq. (4.7) follows from (4.6) and the fact that for $|z| < 1$, $\left|\frac{\phi_n(z)}{\phi_n^*(z)}\right| < 1$. Eq. (4.8) follows from a similar analysis.

Let $\Sigma_0$ be a compact subset of the unit circle and let $g_{\Sigma_0}$ be the Green's function associated with the set $\Sigma_0$ (see Sect. 2). We denote the capacity of $\Sigma_0$ by $\operatorname{cap}(\Sigma_0)$ and we assume that $\operatorname{cap}(\Sigma_0) > 0$. Note ([Tu]) that $g_{\Sigma_0}$ has the representation
$$g_{\Sigma_0}(z) = -\log\operatorname{cap}(\Sigma_0) + \int_{\Sigma_0}\ln|z - e^{i\theta}|\,d\nu,$$
where $\nu$ is the equilibrium measure associated with $\Sigma_0$. We say that
$$\lim_{n\to\infty} f_n(z) = h(z)$$
locally uniformly in an open set $D$ if for every $z \in D$ and $z_n \to z$ as $n \to \infty$ we have $\lim_{n\to\infty} f_n(z_n) = h(z)$ (see [ST]). Let
$$\gamma_n(\omega,z) = \frac{1}{n}\log\|T(z,n)\cdots T(z,1)\|,$$
and
$$\gamma_{-n}(\omega,z) = \frac{1}{n}\log\|T(z,0)\cdots T(z,-n+1)\|.$$
Which norm is used is not important, but for convenience we will use the Hilbert-Schmidt norm. We now show

Lemma 4.7. Let $\{\alpha_n\}_{n=-\infty}^{\infty}$, $|\alpha_n| < 1$ be an almost periodic sequence as described above. Let $\{\phi_n\}_{n=0}^{\infty}$, $\phi_0 = 1$ be the sequence of orthonormal polynomials satisfying (1.1). If $\lim_{n\to\infty}|\phi_n(z)|^{1/n} = e^{g_{\Sigma_0}(z)}$ locally uniformly on $|z| > 1$, then $\Sigma = \Sigma_0$, $k = \nu$, and $R = -\log\operatorname{cap}(\Sigma_0)$. Furthermore $\lim_{n\to\infty}\gamma_n(\omega,z) = g_{\Sigma_0}(z) = \lim_{n\to\infty}\gamma_{-n}(\omega,z)$ for all $z \in \mathbb{C}\setminus K$ and $\omega \in \Omega$, and $\limsup\gamma_{\pm n}(\omega,z) \le \gamma(z)$ for all $z \in \mathbb{C}$ and $\omega \in \Omega$.

Proof. That $\Sigma = \Sigma_0$, $k = \nu$, and $R = -\log\operatorname{cap}(\Sigma_0)$ follows from Lemmas 4.5 and 4.6. That $\lim_{n\to\infty}\gamma_n(\omega,z) = g_{\Sigma_0}(z)$ for all $\omega \in \Omega$ and all $z \in \mathbb{C}\setminus K$ follows from Lemma 4.4 in [GT], and analogous arguments using Lemma 2.1 in [GT] show $\lim_{n\to\infty}\gamma_{-n}(\omega,z) = g_{\Sigma_0}(z)$. That $\limsup\gamma_{\pm n}(\omega,z) \le \gamma(z)$ for all $\omega \in \Omega$ and all $z \in \mathbb{C}\setminus K$ now follows since $\limsup\gamma_{\pm n}(\omega,z)$ is submean while $\gamma$ is subharmonic (see Theorem 2.3 in [CS] or [J]).

Systems of orthogonal polynomials having the behavior described in Lemma 4.7 are said to have regular $n$th root asymptotic behavior (see [ST], p. 60). Let $\{\Phi_\pm(z,n)\}_{-\infty}^{\infty}$ be solutions of the difference system determined by (1.1) such that $\lim_{n\to\infty}\|\Phi_+\|$ and $\lim_{n\to-\infty}\|\Phi_-\|$ are bounded. The existence of such solutions follows from exponential dichotomy results ([GJ]) or from the fact that for $z \in \mathbb{C}\setminus K$, $(zI - H)^{-1}$ is a bounded operator in $l^2$. It now follows from the "deterministic" Oseledec Theorem ([CL]) and $\det T(z,n) = z$ that for all $|z| < 1$ and all $\omega \in \Omega$ ([GT]),
$$\lim\frac{1}{n}\log\|\Phi_+(z,n)\| = \log|z| - \gamma(z), \tag{4.9}$$
and
$$\lim\frac{1}{n}\log\|\Phi_+(z,-n)\| = -\gamma(z). \tag{4.10}$$
We now prove

Theorem 4.8. Let $\{\alpha_n\}_{n=-\infty}^{\infty}$, $|\alpha_n| < 1$ be an almost periodic sequence as above. Let $\{\phi_n\}_{n=0}^{\infty}$, $\phi_0 = 1$ be the sequence of orthonormal polynomials satisfying (1.1) with orthogonality measure $\sigma_{\omega_0}$. If $\lim_{n\to\infty}|\phi_n(z)|^{1/n} = e^{g_{\Sigma_0}(z)}$ locally uniformly on $|z| > 1$, with $\Sigma_0$ a finite union of nondegenerate arcs of the unit circle, then $\sigma_\omega$ has the form given by (4.1) - (4.3) for all $\omega \in \Omega$. Furthermore $\{\alpha_n\}$ is a quasi periodic sequence.

Proof. From its definition $g_{\Sigma_0}(z) = 0$ for almost all $z \in \Sigma_0$. Equations (4.9) and (4.10) allow us to apply the Kotani theorem for these systems for almost all $\omega \in \Omega$ ([GJ, Thm. 5.9, GT, Thm. 5.6]). The $\omega$ continuity of the $\tilde m$ functions allows the result to be extended to all $\omega$ ([GJ, Cor. 5.12]). Furthermore since $\Sigma_0$ is comprised of a finite number of arcs we can conclude that $\sigma_\omega$ has an analytic density for all $\omega \in \Omega$ and that $m_\pm$ are
reflectionless. Hence the results from the previous section show that $\{\alpha_n\}$ is in fact a quasi-periodic sequence.

We now describe a result from orthogonal polynomials and potential theory.

Lemma 4.9. Let $\{\alpha_n\}_{n=1}^{\infty}$, $|\alpha_n| < 1$ be a sequence of recurrence coefficients for (1.1). Let $\Sigma$ be the non-isolated points in the topological support of $\sigma$. If $\lim\frac{1}{N}\sum_{n=1}^N\log(1 - |\alpha_n|^2)^{1/2} = \log\operatorname{cap}(\Sigma)$, then $\lim_{n\to\infty}|\phi_n(z)|^{1/n} = e^{g_\Sigma(z)}$ locally uniformly for $|z| > 1$.

Remark 4.10. This is a special case of a much more general theorem that has been proved by Stahl and Totik ([ST, Thm. 3.1.1]) which also applies to polynomials orthogonal on bounded subsets of the real line. In the case when $\{\alpha_n\}_{-\infty}^{\infty}$, $|\alpha_n| < 1$ is an almost periodic sequence as described above, the result follows since $\gamma(z) \ge 0$ for all $|z| = 1$ (in particular for $z \in \Sigma$) while the equation below (1.1) and Lemma 4.6 show $\lim_{z\to\infty}(\gamma(z) - g_\Sigma(z)) = 0$.

With this we now show

Theorem 4.11. Let $\{\alpha_n\}_{-\infty}^{\infty}$, $|\alpha_n| < 1$ be a sequence of recurrence coefficients for (1.1). Let $\Sigma$ (non-isolated points in the topological support of $\sigma$) be a finite set of nondegenerate arcs of the circle. Then $\{\alpha_n\}$ is a quasi periodic sequence with $\limsup|\alpha_n| < 1$ and $\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^N\log(1 - |\alpha_n|^2)^{1/2} = \log\operatorname{cap}(\Sigma)$ if and only if $\sigma$ has the form given by (4.1) - (4.3).

Proof. Since $\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^N\log(1 - |\alpha_n|^2)^{1/2} = \log\operatorname{cap}(\Sigma)$ we see from the above lemma that the orthonormal polynomials $\phi_n$ associated with $\sigma$ have regular $n$th root asymptotics, i.e., $\lim_{n\to\infty}|\phi_n(z)|^{1/n} = e^{g_\Sigma(z)}$ locally uniformly for $|z| > 1$. Thus Theorem 4.8 implies that $\sigma_\omega$ (for any $\omega \in \Omega$) has the form given by (4.1) - (4.3). From Theorem 4.2 we see that (4.1) - (4.3) imply that $\{\alpha_n\}$ is quasiperiodic with $\limsup|\alpha_n| < 1$. Since all the measures are absolutely continuous on the same segments, the analog of the Pastur–Ishii Theorem for these systems ([GT, Thm. 5.7, GJ, Thm. 5.8]) shows that $\gamma = g_\Sigma$. Comparison of these functions near infinity shows that $\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^N\log(1 - |\alpha_n|^2)^{1/2} = \log\operatorname{cap}(\Sigma)$.

Remark 4.12. In the case that $\Sigma = \Sigma_0$ the example of Chulaevsky and Sinai [ChS] indicates that $\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^N\log(1 - |\alpha_n|^2)^{1/2} = \log\operatorname{cap}(\Sigma_0)$ is needed.

Finally, using the theory developed in Stahl and Totik, we give the following stability result.

Theorem 4.13. Let $\{\alpha_n\}_{-\infty}^{\infty}$, $|\alpha_n| < 1$ be an almost periodic sequence as described above. Let $\sigma = \sigma_{\omega_0}$ be the orthogonality measure associated with $\{\alpha_n\}_{n=0}^{\infty}$ and suppose there is an infinite set $\Sigma_{ac}$ such that $\frac{d\sigma}{d\theta} > 0$ for almost every point in $\Sigma_{ac}$. Then for almost every $\omega \in \Omega$, $\Sigma_{ac}^{\omega} = \Sigma_{ac}$.

Proof. Let $\hat\sigma = \sigma|_{\Sigma_{ac}}$. Note that $\Sigma_{ac}$ is a compact subset of $K$. Then for almost every $\theta \in \Sigma_{ac}$,
$$\lim_{r\to 0}\frac{\log\hat\sigma(\Delta_r(e^{i\theta}))}{\ln r} < \infty,$$
where $\Delta_r(e^{i\theta})$ is the disk of radius $r$ centered at $e^{i\theta} \in K$. Therefore $\hat\sigma$ is a regular measure by the $\Lambda$ criterion of Stahl and Totik (see [ST, p. 108]) and from [ST, Theorem 5.3] $\limsup_{n\to\infty}|\phi_n(z)|^{1/n} \le e^{g_{\Sigma_{ac}}(z)}$ locally uniformly for $|z| > 1$ and $\limsup_{n\to\infty}|\phi_n^*(z)|^{1/n} \le e^{g_{\Sigma_{ac}}(z)}$ locally uniformly for $|z| < 1$. Since $\lim_{n\to\infty}\frac{1}{n}\log|\phi_n(z)| = \gamma(z)$ for $|z| > 1$ and $\lim_{n\to\infty}\frac{1}{n}\log|\phi_n^*(z)| = \gamma(z)$ for $|z| < 1$, we see $\gamma(z) = 0$ for almost all $\theta \in \Sigma_{ac}$ since $\gamma$ is subharmonic. The result now follows from the Kotani theorem for these systems ([GT, Thm. 5.7, GJ, Thm. 5.9]) for almost all $\omega \in \Omega$.

We note that the above theorems have real line analogs. In this case the orthogonal polynomials $\{p_n(x)\}_{n=0}^{\infty}$ satisfy the following three term recurrence formula
$$a_{n+1}p_{n+1}(x) + b_n p_n(x) + a_n p_{n-1}(x) = x p_n(x) \tag{4.11}$$
with initial condition $p_0 = 1$ and $p_{-1} = 0$. Here $a_n > 0$ and $b_n \in \mathbb{R}$, and it is assumed that the sequences $\{a_n\}_{-\infty}^{\infty}$ and $\{b_n\}_{-\infty}^{\infty}$ are almost periodic (or quasi periodic where appropriate) with bounds $0 < r \le a_n \le M < \infty$ and $|b_n| \le M$ $\forall n \in \mathbb{Z}$. In this case there is a unique measure $\sigma$ such that $\int p_n(x)p_m(x)\,d\sigma(x) = \delta_{m,n}$ and the above results carry over with little modification to the real line case (see [AK, AS, BGHT, M, and D]). Let $\Sigma = [x_0, x_1] \cup [x_2, x_3] \cup \cdots \cup [x_{2N-2}, x_{2N-1}]$ be a finite union of finite non-degenerate intervals of the real line, and let $C$ be the Riemann surface of the algebraic relation $w^2 = K(x) = \prod_{j=0}^{2N-1}(x - x_j)$. We will use the same branch of the square root as earlier.

Theorem 4.14. Let $\{a_n\}_{-\infty}^{\infty}$ and $\{b_n\}_{-\infty}^{\infty}$ be a sequence of recurrence coefficients for (4.11). Let $\Sigma$ (the non isolated points in the topological support of $\sigma$) be a finite union of non-degenerate intervals as given above. Then $\{a_n\}$ and $\{b_n\}$ are quasiperiodic sequences with $0 < r \le a_n \le M < \infty$, $|b_n| \le M$ $\forall n \in \mathbb{Z}$, and $\lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^N\log a_n = \log\operatorname{cap}(\Sigma)$ if and only if $\sigma$ has the form
\[ d\sigma(x) = f(x)\,dx + \sum_{i_n} \sigma_{i_n}\,\delta(x - p_{i_n})\,dx, \]
where
p |K(x)| f (x) = , π|P (x)| c
and, if the above sum is not empty, then 1 ≤ i1 < i2 . . . < im ≤ N − 1 with p 2c |K(pin )| . σ in = dP (p ) | dxin | Here pi ∈ [x2i−1 , x2i ], i = 1, . . . N − 1, P (x) = Eq. (3.11)]) c
−1
=−
N −1 X j=1
p
K(pj )
2
dP (pj ) dx
QN −1 i=1
(x − pi ), and ([BGHT,
N −1 2N −1 N −1 2N −1 X X 1 X 1 X − {( pj − pj )2 + p2j − 1/2 p2j }. 4 2 j=1
j=0
j=1
j=0
Remark 4.15. Since the number of points pi is the same as the genus of the surface, the standard forms of Abel’s theorem and the Jacobi Inversion theorem can be used ([AK]).
Acknowledgement. The first author would like to thank the members of the Theoretical Physics Division (SPHT) at Saclay and the members of the Numerical Analysis Laboratory at Paris VI for their hospitality and support while this work was being carried out. JG would also like to thank Gerald Teschl for several discussions.
References

[AKNS] Ablowitz, M., Kaup, D., Newell, A. and Segur, H.: The inverse scattering transform: Fourier analysis for non-linear systems. Studies in Appl. Math. 53, 249–315 (1974)
[Akh] Akhiezer, N. I.: The Classical Moment Problem. Edinburgh: Oliver and Boyd, 1965
[AK] Anand, A. and Krishna, M.: Almost periodicity of some Jacobi matrices. Proc. Indian Acad. Sci. (Math. Sci.) 102, 175–188 (1992)
[A] Atkinson, F.: Discrete and Continuous Boundary Value Problems. New York–London: Academic Press, 1964
[AS] Avron, J. and Simon, B.: Almost periodic Schrödinger Operators II. The integrated density of States. Duke Math. 50, 369–391 (1983)
[BGHT] Bulla, W., Gesztesy, F., Holden, H. and Teschl, G.: Algebro-Geometric Quasi-periodic Finite-Gap Solutions of the Toda and Kac-van Moerbeke Hierarchies. Memoirs of the Am. Math. Soc. (to appear)
[CL] Carmona, R. and Lacroix, J.: Spectral Theory of Random Schrödinger Operators. Boston: Birkhäuser, 1990
[C] Coppel, A.: Dichotomies in Stability Theory. Lecture Notes in Mathematics 629, New York–Berlin: Springer-Verlag, 1978
[CS] Craig, W. and Simon, B.: Subharmonicity of the Lyapunov Index. Duke Math. 50, 551–560 (1983)
[ChS] Chulaevsky, V. and Sinai, Y.: Anderson localization for the 1-D discrete Schrödinger operator with two-frequency potential. Comm. Math. Phys. 125, 91–112 (1989)
[DC-J] DeConcini, C. and Johnson, R.: The algebraic-geometric AKNS potentials. Ergodic Theory and Dyn. Sys. 7, 1–24 (1987)
[DMN] Dubrovin, B., Matveev, V. and Novikov, S.: Non-linear equations of Korteweg-de Vries type, finite-zone linear operators, and Abelian varieties. Russ. Math. Surveys 31, 59–146 (1976)
[D] Dombrowski, J.M.: Quasitriangular matrices. Proc. Am. Math. Soc. 69, 95–96 (1978)
[F] Fay, J.: Theta Functions on Riemann Surfaces. Lecture Notes in Mathematics 352, New York–Berlin: Springer-Verlag, 1973
[GJ] Geronimo, J. and Johnson, R.: Rotation numbers associated with difference equations satisfied by polynomials orthogonal on the unit circle. JDE 132, 140–178 (1996)
[GNV] Golinskii, L., Nevai, P. and Van Assche, W.: Perturbation of Orthogonal Polynomials on an Arc of the Circle. JAT 83, 392–422 (1996)
[GT] Geronimo, J. and Teplaev, A.: A difference equation arising from the trigonometric moment problem having random reflection coefficients. J. Funct. Anal. 123, 12–45 (1994)
[G] Geronimus, Y.: Polynomials Orthogonal on a Circle and Interval. New York: Pergamon Press, 1960
[J] Johnson, R.: Lyapunov numbers for the almost periodic Schrödinger equation. Ill. J. Math. 28, 387–419 (1984)
[JM] Johnson, R. and Moser, J.: The rotation number for almost periodic potentials. Commun. Math. Phys. 84, 403–438 (1982)
[M] Minami, N.: An extension of Kotani's theorem to random generalized Sturm Liouville operators. Commun. Math. Phys. 103, 387–402 (1986)
[MM] McKean, H. and van Moerbeke, P.: The spectrum of Hill's equation. Invent. Math. 30, 217–274 (1975)
[O] Oseledec, V.: A multiplicative ergodic theorem. Lyapounov characteristic numbers for dynamical systems. Trans. Moscow Math. Soc. 19, 197–231 (1968)
[PS] Peherstorfer, F. and Steinbauer, R.: Orthogonal Polynomials on arcs of the unit circle I. Accepted JAT
[P] Previato, E.: Hyperelliptic generalized Jacobians and the non-linear Schrödinger equation. Ph.D. thesis, Harvard University, 1984
[S] Siegel, C.: Topics in Complex Functions Theory. Vol. II, New York: Wiley-Interscience, 1971
[Sz] Szegő, G.: Orthogonal Polynomials. Coll. Pub. vol. 23, Providence, RI: Am. Math. Soc., 1975
[ST] Stahl, H. and Totik, V.: General Orthogonal Polynomials. Cambridge: Cambridge University Press, 1992
[Tu] Tsuji, M.: Potential Theory in Modern Function Theory. Tokyo: Maruzen, 1959
[W] Walters, P.: An Introduction to Ergodic Theory. Graduate Texts in Math. 79, New York–Berlin: Springer-Verlag, 1982
Communicated by B. Simon
Commun. Math. Phys. 193, 151 – 170 (1998)
Communications in Mathematical Physics © Springer-Verlag 1998
The Absolutely Continuous Spectrum of One-Dimensional Schrödinger Operators with Decaying Potentials
Christian Remling
Universität Osnabrück, Fachbereich Mathematik/Informatik, 49069 Osnabrück, Germany. E-mail:
[email protected] Received: 25 June 1997 / Accepted: 29 July 1997
Abstract: We investigate one-dimensional Schrödinger operators with asymptotically small potentials. It will follow from our results that if $|V(x)| \le C(1+x)^{-\alpha}$ with $\alpha > 1/2$, then $\Sigma_{ac} = (0, \infty)$ is an essential support of the absolutely continuous part of the spectral measure. We also prove that if $C := \limsup_{x\to\infty} x|V(x)| < \infty$, then the spectrum is purely absolutely continuous on $((2C/\pi)^2, \infty)$. These results are optimal.

1. Introduction

In this paper, I am interested in one-dimensional Schrödinger equations,
\[ -y''(x) + V(x)y(x) = Ey(x), \tag{1} \]
with asymptotically small potentials $V(x)$. We will treat only the half-line problem $x \in [0, \infty)$ explicitly (of course, the results below extend easily to whole-line problems). So, we are interested in the spectral properties of the operators $H_\beta = -\frac{d^2}{dx^2} + V(x)$ on $L^2(0, \infty)$. The index $\beta \in [0, \pi)$ refers to the boundary condition $y(0)\cos\beta + y'(0)\sin\beta = 0$. Although the emphasis will be on the continuous case, we will also occasionally discuss the discrete analogue of (1), $y(n-1) + y(n+1) + V(n)y(n) = Ey(n)$.

The properties of the corresponding classical system very naturally lead to the question of whether suitable smallness assumptions on $V(x)$ at large $x$ imply absence of singular spectrum on $(0, \infty)$ or, at least, existence of absolutely continuous spectrum. Indeed, one can make the elementary remark that the spectrum is purely absolutely continuous on $(0, \infty)$ if $V \in L^1$. On the other hand, the classical von Neumann-Wigner example [32] shows that potentials $V(x) = O(1/x)$ can have embedded eigenvalues.
Moreover, the constructions of [22, 30] show that there are potentials with decay arbitrarily close to $O(1/x)$ and dense point spectrum in $(0, \infty)$.

These phenomena cannot occur if one prevents the potential from oscillating too heavily. In this case, one can use WKB methods to prove absence of singular spectrum. The first result in this spirit was obtained in [33]; for generalizations, see [1, 2].

Recently, Kiselev proved that $\Sigma_{ac} = (0, \infty)$ is an essential support of the absolutely continuous part of the spectral measure if $V(x) = O(x^{-\alpha})$ with $\alpha > 3/4$ [14]. (Recall that, by definition, $S$ is called an essential support of the measure $\mu$ if $\mu(\mathbb{R} \setminus S) = 0$ and $\mu(T) > 0$ for every subset $T \subset S$ of positive Lebesgue measure.) He subsequently weakened this assumption to $\alpha > 2/3$ [15]. Later, Molchanov found an elegant alternative proof of the same result [21]. These statements are remarkable in that, in striking contrast to the WKB methods, no conditions other than smallness conditions are imposed on the potential.

However, the question of the optimal exponent on the power scale was left open. The work on decaying random potentials [9, 10, 16, 18, 28] has shown that there are potentials $V(x) = O(x^{-1/2})$ with purely singular spectrum. (A method of constructing non-random power-decaying potentials with pure point spectrum was recently discovered by myself [25].) So the critical exponent has to lie in the range $[1/2, 2/3]$. Our first aim in this paper is to solve this problem. We will prove:

Theorem 1.1. Suppose there exists an increasing sequence $x_n \to \infty$ such that
\[ \sum_{n=1}^{\infty} \|V\chi_{(x_{n-1},x_n)}\|_2 < \infty \quad\text{and}\quad \sup_n \|V\chi_{(x_{n-1},x_n)}\|_1 \, \|V\chi_{(x_{n-1},x_n)}\|_2^N < \infty \]
for some $N \in \mathbb{N}$. Then $\Sigma_{ac} = (0, \infty)$ is an essential support of the absolutely continuous part of the spectral measure.

As a corollary, we get the following sharp result:

Theorem 1.2. If $|V(x)| \le C(1+x)^{-\alpha}$ with $\alpha > 1/2$, then $\Sigma_{ac} = (0, \infty)$ is an essential support of the absolutely continuous part of the spectral measure.

Proof. Let $x_n = n^{2/(2\alpha-1)}$ (say). Theorem 1.1 applies.
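The exponent bookkeeping behind this choice of $x_n$ can be sketched numerically. The following Python snippet is an illustration only and not part of the original argument: it assumes the model potential $|V(x)| = (1+x)^{-\alpha}$ and the heuristic that, for $x_n = n^{\beta}$ with $\beta = 2/(2\alpha-1)$, a norm of $V$ over $(x_{n-1}, x_n)$ scales like the size of the integrand times the interval length. The function name `theorem_1_1_exponents` is hypothetical.

```python
import math
from fractions import Fraction


def theorem_1_1_exponents(alpha):
    """Heuristic power-law exponents for |V(x)| = (1 + x)**(-alpha).

    With x_n = n**beta, beta = 2 / (2*alpha - 1), the interval (x_{n-1}, x_n)
    has length ~ n**(beta - 1) and |V| ~ n**(-alpha*beta) on it, so that
        ||V_n||_1 ~ n**e1   and   ||V_n||_2 ~ n**e2.
    Returns (beta, e1, e2, n_min), where n_min is the smallest integer N >= 1
    with e1 + N*e2 <= 0, i.e. ||V_n||_1 * ||V_n||_2**N stays bounded.
    """
    alpha = Fraction(alpha)
    if alpha <= Fraction(1, 2):
        raise ValueError("the construction needs alpha > 1/2")
    beta = 2 / (2 * alpha - 1)
    e1 = beta - 1 - alpha * beta
    e2 = (beta - 1 - 2 * alpha * beta) / 2  # always equals -3/2
    n_min = max(1, math.ceil(e1 / -e2))
    return beta, e1, e2, n_min
```

Under these assumptions, $\alpha = 3/5$ gives $\beta = 10$, $\|V_n\|_1 \sim n^{3}$ and $\|V_n\|_2 \sim n^{-3/2}$, so $\sum_n \|V_n\|_2$ converges and $N = 2$ is enough for the second hypothesis of Theorem 1.1.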
Note that the conclusion $\Sigma_{ac} = (0, \infty)$ is equivalent to each of the following two statements:
1. The absolutely continuous spectrum satisfies $\sigma_{ac} = [0, \infty)$, and, moreover, $H_\beta$ is purely absolutely continuous on $(0, \infty)$ for almost every boundary condition $\beta \in [0, \pi)$.
2. The absolutely continuous parts of $H_\beta^{(0)} = -d^2/dx^2$ and $H_{\beta'} = -d^2/dx^2 + V(x)$ are unitarily equivalent.
For the first statement, this follows from [8].

Clearly, Theorem 1.1 is considerably more general than its corollary, Theorem 1.2. In fact, the first condition is always satisfied for sufficiently large $x_n$ if $V \in L^2$. Then the second condition imposes a certain asymptotic relationship between $L^1$ and $L^2$ norms. On the other hand, Theorem 1.1 is not strong enough to deal with general $L^p$ potentials. That is to say, for any $p > 1$, there are functions $V \in L^p \cap L^2$ which do not satisfy the hypotheses of Theorem 1.1.

The proof of Theorem 1.1 will be given in the next section. We will use ideas from Kiselev's and Molchanov's proofs of the 2/3 result as well as a number of new ideas. Actually, our methods will yield several extensions and generalizations of Theorem 1.1 with relatively little extra effort. In fact, we will prove in Sect. 3 that the assumptions of Theorem 1.1 also imply that for almost every $E \in (0, \infty)$, the solutions satisfy the
WKB asymptotic formulae (= Theorem 3.1). In Sect. 4 we will establish an analogue of Theorem 1.1 for the discrete Schrödinger equation (= Theorem 4.1). Finally, in Sect. 5, we will use the generalized Prüfer transformation introduced in [17] to prove a result on perturbations of general Schrödinger operators $H_0 = -d^2/dx^2 + U$ (where not necessarily $U = 0$). The proofs of all these extensions are relatively minor variations on the proof of Theorem 1.1. Therefore, we will give a thorough discussion of this proof in Sect. 2, in order to present the ideas as clearly as possible. The presentation in Sects. 3–5 will then be rather sketchy.

Christ and Kiselev [5] have independently developed a completely different approach to these problems, which gives very similar results. In particular, their results also imply Theorem 1.2.

We will also prove a new result on the absence of singular spectrum, assuming only decay conditions. Namely, on the power scale, we can improve the elementary result on $L^1$ potentials to

Theorem 1.3. If $C := \limsup_{x\to\infty} x|V(x)| < \infty$, then $H_\beta$ is purely absolutely continuous on $((2C/\pi)^2, \infty)$. In particular, if $V(x) = o(1/x)$, then $H_\beta$ is purely absolutely continuous on $(0, \infty)$.

The point of this result is the absence of singular continuous spectrum. That $E = (2C/\pi)^2$ is a (sharp) bound for possible positive eigenvalues appears already in [11, Sect. 3.2]. See also [16, Theorem 4.1] for further information on the point spectrum. We will prove Theorem 1.3 in Sect. 6. This proof is not hard; it relies on a classical result on the divergence set of a Fourier series. Molchanov has informed me [21] that he can prove absence of embedded singular spectrum under the stronger assumption $V(x) = O(x^{-1}\ln^{-\epsilon} x)$. His proof is based on ideas developed by himself in [20].

The proof of Theorem 1.3 will automatically yield the following result, which seems to be of independent interest.

Theorem 1.4. If $|V(x)| \le Cx^{-\alpha}$ for all large enough $x$, then there is a set $S \subset (0, \infty)$ of Hausdorff dimension $\dim S \le 4(1-\alpha)$, such that for every boundary condition $\beta \in [0, \pi)$, the singular part of the spectral measure (restricted to $(0, \infty)$) is supported on the set $S$: $\rho_s^{\beta}((0, \infty) \setminus S) = 0$.

We will also briefly mention discrete versions of these results at the end of Sect. 6.

Unfortunately, we do not know of any potentials $V(x) = O(x^{-\alpha})$, $\alpha > 1/2$, with embedded singular continuous spectrum. While potentials of this type presumably exist, an explicit construction seems to be extremely difficult. It appears, however, that such examples are needed to gain further insight into the issues dealt with in Theorems 1.3, 1.4. Note also that in recently constructed examples with embedded singular continuous spectrum [20, 26], $\Sigma_{ac}$ does not have full measure in the absolutely continuous spectrum $\sigma_{ac}$.

Some of the results of this paper (along with some of the results of [5]) have been announced in [6].

2. Proof of Theorem 1.1

Fix once and for all a boundary condition $\beta \in [0, \pi)$ at $x = 0$. For $E > 0$, let $y(x, E)$ be the solution of (1) with the initial values $y(0, E) = -\sin\beta$, $y'(0, E) = \cos\beta$. It will be
convenient to work with $k = \sqrt{E}$; using the physicist's notation, we will denote the solution $y$ from above by $y(x, k)$ (instead of the more careful, but too clumsy notation $\tilde y(x, k) = y(x, k^2)$). Similar conventions apply to other quantities. The modified Prüfer variables $R(x, k)$, $\psi(x, k)$ are defined by the formulae $y = R\sin(\psi/2)$, $y' = kR\cos(\psi/2)$ and by requiring that $R > 0$ and $\psi$ be continuous in $x$. $R$, $\psi$ obey the equations
\[ (\ln R)' = \frac{V}{2k}\sin\psi, \qquad \psi' = 2k - \frac{V}{k} + \frac{V}{k}\cos\psi. \]
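As a sanity check on the modified Prüfer equations, one can integrate the Schrödinger equation (1) and the Prüfer system side by side and compare $y$ with $R\sin(\psi/2)$ and $y'$ with $kR\cos(\psi/2)$. The sketch below is only an illustration: the sample potential `sample_potential`, the helper names, the parameter values, and the classical Runge-Kutta integrator are all assumptions and do not come from the paper.

```python
import math


def rk4(rhs, state, x0, x1, steps):
    """Classical fourth-order Runge-Kutta for a first-order system."""
    h = (x1 - x0) / steps
    x, s = x0, list(state)
    for _ in range(steps):
        k1 = rhs(x, s)
        k2 = rhs(x + h / 2, [a + h / 2 * b for a, b in zip(s, k1)])
        k3 = rhs(x + h / 2, [a + h / 2 * b for a, b in zip(s, k2)])
        k4 = rhs(x + h, [a + h * b for a, b in zip(s, k3)])
        s = [a + h / 6 * (p + 2 * q + 2 * r + t)
             for a, p, q, r, t in zip(s, k1, k2, k3, k4)]
        x += h
    return s


def sample_potential(x):
    # Hypothetical decaying, oscillating potential used only for this test.
    return math.cos(3.0 * x) / (1.0 + x)


def prufer_mismatch(k=1.3, beta=0.4, x_end=10.0, steps=10000):
    """Return (|y - R sin(psi/2)|, |y' - k R cos(psi/2)|) at x = x_end."""
    V = sample_potential
    y0, dy0 = -math.sin(beta), math.cos(beta)

    # Schroedinger equation -y'' + V y = k^2 y as a system for (y, y').
    y, dy = rk4(lambda x, s: [s[1], (V(x) - k * k) * s[0]],
                [y0, dy0], 0.0, x_end, steps)

    # Pruefer system for (ln R, psi); initial data from y = R sin(psi/2),
    # y' = k R cos(psi/2).
    ln_r0 = math.log(math.hypot(y0, dy0 / k))
    psi0 = 2.0 * math.atan2(y0, dy0 / k)
    ln_r, psi = rk4(
        lambda x, s: [V(x) * math.sin(s[1]) / (2 * k),
                      2 * k - V(x) / k + V(x) * math.cos(s[1]) / k],
        [ln_r0, psi0], 0.0, x_end, steps)

    r = math.exp(ln_r)
    return (abs(y - r * math.sin(psi / 2)),
            abs(dy - k * r * math.cos(psi / 2)))
```

Both mismatches should be at the level of the integrator's discretization error, which is what one expects if the two formulations describe the same solution.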
The main part of the proof of Theorem 1.1 will consist of a careful analysis of the integrated form of these equations:
\[ 2k\bigl(\ln R(y, k) - \ln R(x, k)\bigr) = \int_x^y V(t)\sin\psi(t, k)\,dt, \tag{2} \]
\[ \psi(x, k) = \omega(x, k) + \theta(x, k). \tag{3} \]
Here
\[ \omega(x, k) := 2kx - \frac{1}{k}\int_0^x V(t)\,dt \tag{4} \]
is the “good” part of $\psi$ (this will be made precise in Proposition 2.2 below), and $\theta(x, k) := \psi(0, k) + k^{-1}\int_0^x V(t)\cos\psi(t, k)\,dt$ will be treated as a perturbation. The crucial property of $\theta$ is that its derivative is a small, oscillatory function:
\[ \frac{d\theta}{dx} = \frac{1}{k}V(x)\cos\psi(x, k). \tag{5} \]
We will use the following relation between $\Sigma_{ac}$ and the asymptotic behavior of the solutions of (1). This result is not new (see also [19] for related results). However, the proof is easy, so we give it.

Proposition 2.1 ([20]). Let $x_n \to \infty$ be a fixed sequence. Then
\[ S := \{E = k^2 : \limsup_{n\to\infty} R(x_n, k) < \infty\} \subset \Sigma_{ac}. \]

Proof. Let $r^2(x, E) = y^2(x, E) + y'^2(x, E)$. The measures $d\rho_x(E) = (\pi r^2(x, E))^{-1}\,dE$ converge weakly (i.e., when smeared with continuous functions of compact support) to the spectral measure $d\rho(E)$ as $x \to \infty$ [4, 23]. Let $f_n(E) := \min\{1, (\pi r^2(x_n, E))^{-1}\}$. For fixed $[a, b] \subset (0, \infty)$, the sequence $f_n(\cdot)$ is bounded in $L^2(a, b)$, so there exists a weakly convergent subsequence (which we denote again by $f_n$). That is, there exists $f \in L^2(a, b)$ so that $\int_a^b f_n(E)g(E)\,dE \to \int_a^b f(E)g(E)\,dE$ for all $g \in L^2(a, b)$. Thus, if $\rho(\{E_0 \pm \epsilon\}) = 0$,
\[ \rho(E_0 - \epsilon, E_0 + \epsilon) = \frac{1}{\pi}\lim_{n\to\infty}\int_{E_0-\epsilon}^{E_0+\epsilon} \frac{dE}{r^2(x_n, E)} \ge \lim_{n\to\infty}\int_{E_0-\epsilon}^{E_0+\epsilon} f_n(E)\,dE = \int_{E_0-\epsilon}^{E_0+\epsilon} f(E)\,dE. \]
Dividing by $2\epsilon$ and letting $\epsilon \to 0$ shows that $d\rho(E_0)/dE \ge f(E_0)$ for almost every $E_0 \in (a, b)$. Notice that $r(x, E) \le \max\{E^{1/2}, 1\}R(x, E)$. Hence $\liminf_{n\to\infty} f_n(E) > 0$ for all $E \in S$, and thus also $f > 0$ almost everywhere on $S \cap (a, b)$.
Proposition 2.2 ([15]). Assume that $V \in L^p$ for some $p \in [1, \infty)$, and fix $[a, b] \subset (0, \infty)$. Then there is a constant $C = C(a, b)$ so that
\[ \int_a^b \left|\int_0^\infty f(x)e^{i\omega(x,k)}\,dx\right|^2 dk \le C\int_0^\infty |f(x)|^2\,dx \]
for all $f \in L^1 \cap L^2$.

Proof. Pick $g \in C_0^\infty$ with $0 \le g \le 1$ and $g = 1$ on $(a, b)$. We want to estimate the quantity
\[ \int dk\, g(k)\int_0^\infty dx\, f(x)\int_0^\infty dy\, \overline{f(y)}\, e^{2ik(x-y)} e^{\frac{i}{k}\int_x^y V(t)\,dt} = \int_0^\infty dx\, f(x)\int_0^\infty dy\, \overline{f(y)}\int dk\, e^{2ik(x-y)} g(k)\, e^{\frac{i}{k}\int_x^y V(t)\,dt}. \]
In the $k$-integral, we integrate by parts $[2p] + 1$ times ($[a]$ is the largest integer $\le a$), integrating the factor $e^{2ik(x-y)}$ and differentiating the rest. Since, by Hölder's inequality,
\[ \left|\int_x^y V(t)\,dt\right| \le C|x - y|^{1-1/p}, \]
we gain a factor $|x - y|^{-1/p}$ with each integration by parts. So, it suffices to estimate integrals of the form
\[ \int_0^\infty dx\int_0^\infty dy\, |f(x)f(y)|\,K(x, y), \tag{6} \]
where the kernel $K$ is non-negative and can be bounded by
\[ K(x, y) \le \min\left\{C_0, \frac{C_1}{(x-y)^2}\right\} \le \frac{C}{1 + (x-y)^2}. \]
Now, the Cauchy-Schwarz inequality (in $L^2(\mathbb{R}^2)$) yields the desired bound on (6):
\[ \left(\int_0^\infty dx\int_0^\infty dy\, |f(x)|^2 K(x, y)\right)^{1/2}\left(\int_0^\infty dx\int_0^\infty dy\, |f(y)|^2 K(x, y)\right)^{1/2} \le C\pi\|f\|_2^2. \]
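The last step is a Schur-type estimate: since $\int_{\mathbb{R}} du/(1+u^2) = \pi$, the bilinear form in (6) with kernel $1/(1+(x-y)^2)$ is bounded by $\pi\|f\|_2^2$. A minimal numerical sketch of this inequality follows, using midpoint Riemann sums on a truncated interval; the function name `kernel_bound_check`, the grid parameters, and the test function are assumptions chosen for illustration.

```python
import math


def kernel_bound_check(f, x_max=20.0, n=400):
    """Compare the double integral of |f(x) f(y)| / (1 + (x - y)^2) over
    [0, x_max]^2 with pi * ||f||_2^2, both approximated by midpoint sums.

    Returns (lhs, rhs); the Cauchy-Schwarz argument predicts lhs <= rhs.
    """
    h = x_max / n
    xs = [(i + 0.5) * h for i in range(n)]
    fs = [abs(f(x)) for x in xs]
    lhs = sum(fs[i] * fs[j] / (1.0 + (xs[i] - xs[j]) ** 2)
              for i in range(n) for j in range(n)) * h * h
    rhs = math.pi * sum(v * v for v in fs) * h
    return lhs, rhs
```

For instance, calling `kernel_bound_check(lambda x: math.exp(-x) * math.cos(5 * x))` returns a pair whose first entry does not exceed the second, in line with the bound used in the proof.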
We now have all the tools for the proof of Theorem 1.1. First of all, note that we have $\sigma_{ess} = [0, \infty)$ (see, e.g., [34, Theorem 15.1]). So we only need to show $(0, \infty) \subset \Sigma_{ac}$. Let $x_n$ be as in the hypothesis. We will show that
\[ \sum_{n=1}^\infty \bigl|\ln R(x_n, k) - \ln R(x_{n-1}, k)\bigr| < \infty \quad\text{for a.e. } k > 0. \]
Then the assertion will follow by Proposition 2.1. Of course, we may restrict our attention to $k \in (a, b)$, where $[a, b] \subset (0, \infty)$ is fixed, but otherwise arbitrary. According to the Prüfer Eqs. (2), (3), it thus suffices to show that
\[ \sum_{n=1}^\infty \left|\int_{x_{n-1}}^{x_n} V(x)e^{i\omega(x,k)}e^{i\theta(x,k)}\,dx\right| < \infty \quad\text{for a.e. } k \in (a, b). \tag{7} \]
We integrate by parts; also, since $k$ is fixed in this and in subsequent calculations, we will usually drop this argument. Using (3) and (5), we get
\[ \int_{x_{n-1}}^{x_n} V(x)e^{i\omega(x)}e^{i\theta(x)}\,dx = e^{i\theta(x_n)}\int_{x_{n-1}}^{x_n} V(x)e^{i\omega(x)}\,dx - \frac{i}{k}\int_{x_{n-1}}^{x_n} dx\, V(x)\cos(\omega(x) + \theta(x))\,e^{i\theta(x)}\int_{x_{n-1}}^{x} dt\, V(t)e^{i\omega(t)}. \]
The first summand on the right-hand side (call it $S$) has already the desired properties. In fact, the Cauchy-Schwarz inequality together with Proposition 2.2 imply
\[ \int_a^b dk\, |S(n, k)| \le (b - a)^{1/2}\left(\int_a^b dk\left|\int_{x_{n-1}}^{x_n} dx\, V(x)e^{i\omega(x,k)}\right|^2\right)^{1/2} \le C\|V_n\|_2 \]
(using the notation $V_n = V\chi_{(x_{n-1},x_n)}$). By hypothesis, this is summable, hence indeed $\sum_n |S(n, k)| < \infty$ for almost every $k \in (a, b)$ by monotone convergence. Using $\cos\psi = (e^{i\psi} + e^{-i\psi})/2$, we thus see that it suffices to prove that sums of the form
\[ \sum_{n=1}^\infty \left|\int_{x_{n-1}}^{x_n} dx\, V(x)e^{i\sigma\omega(x)}e^{im\theta(x)}\int_{x_{n-1}}^{x} dt\, V(t)e^{i\omega(t)}\right|, \tag{8} \]
with $\sigma \in \{-1, 1\}$, $m \in \mathbb{N}_0$, converge for almost every $k \in (a, b)$. To handle these terms, we will use an iterative procedure. Rather than formulate immediately the basic step in a general setting, we will discuss the first step in detail and only then give the general result (see Lemma 2.3 below). We may assume that $\|V_n\|_2 > 0$ for all $n$. Define $N_n \in \mathbb{N}$ by $N_n = \max\{1, [1/\|V_n\|_2]\}$. Since $\|V_n\|_2 \to 0$, there is an $n_0 \in \mathbb{N}$ so that
\[ \frac{1}{2} \le \|V_n\|_2 N_n \le 1 \quad \forall n \ge n_0. \tag{9} \]
We subdivide the interval $[x_{n-1}, x_n]$ into $N_n$ subintervals: For $n \in \mathbb{N}$, pick numbers $y_1(n, l)$ such that $x_{n-1} = y_1(n, 0) < y_1(n, 1) < \ldots < y_1(n, N_n) = x_n$. Later, we will impose additional conditions on the $y_1$'s. However, the following argument is completely general. The integral from (8) can now be decomposed as follows:
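\[ \begin{aligned} &\int_{x_{n-1}}^{x_n} dx\, V(x)e^{i\sigma\omega(x)}e^{im\theta(x)}\int_{x_{n-1}}^{x} dt\, V(t)e^{i\omega(t)} \\ &\quad = \sum_{l=1}^{N_n}\int_{y_1(n,l-1)}^{y_1(n,l)} dx\, V(x)e^{i\sigma\omega(x)}e^{im\theta(x)} \times \left(\int_{y_1(n,l-1)}^{x} dt\, V(t)e^{i\omega(t)} + \int_{x_{n-1}}^{y_1(n,l-1)} dt\, V(t)e^{i\omega(t)}\right) \\ &\quad = \sum_{l=1}^{N_n}\int_{y_1(n,l-1)}^{y_1(n,l)} dx\, V(x)e^{i\sigma\omega(x)}e^{im\theta(x)}\int_{y_1(n,l-1)}^{x} dt\, V(t)e^{i\omega(t)} \\ &\qquad + \sum_{l=1}^{N_n}\int_{x_{n-1}}^{y_1(n,l-1)} dx\, V(x)e^{i\omega(x)}\int_{y_1(n,l-1)}^{y_1(n,l)} dx\, V(x)e^{i\sigma\omega(x)}e^{im\theta(x)}. \end{aligned} \tag{10} \]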
The first term in the last expression has again the form of the integrals of (8), but the integration is now over a smaller region. The second term can be treated with the integration by parts argument from above. In fact, we have
\[ \int_{y_1(n,l-1)}^{y_1(n,l)} dx\, V(x)e^{i\sigma\omega(x)}e^{im\theta(x)} = e^{im\theta(y_1(n,l))}\int_{y_1(n,l-1)}^{y_1(n,l)} dx\, V(x)e^{i\sigma\omega(x)} - \frac{im}{k}\int_{y_1(n,l-1)}^{y_1(n,l)} dx\, V(x)\cos\psi(x)\,e^{im\theta(x)}\int_{y_1(n,l-1)}^{x} dt\, V(t)e^{i\sigma\omega(t)}. \tag{11} \]
Plugging the first term on the right-hand side of this formula into (10) (and, again, calling the resulting term $S(n, k)$), we obtain from Proposition 2.2, (9), and the Cauchy-Schwarz inequality
\[ \begin{aligned} \int_a^b dk\, |S(n, k)| &\le \sum_{l=1}^{N_n}\int_a^b dk \left|\int_{x_{n-1}}^{y_1(n,l-1)} dx\, V(x)e^{i\omega(x)}\right| \left|\int_{y_1(n,l-1)}^{y_1(n,l)} dx\, V(x)e^{i\omega(x)}\right| \\ &\le C\sum_{l=1}^{N_n} \|V\chi_{(x_{n-1},y_1(n,l-1))}\|_2\, \|V\chi_{(y_1(n,l-1),y_1(n,l))}\|_2 \\ &\le C\|V_n\|_2\sum_{l=1}^{N_n} \|V\chi_{(y_1(n,l-1),y_1(n,l))}\|_2 \\ &\le C\|V_n\|_2^2 N_n^{1/2} \le C\|V_n\|_2^{3/2}. \end{aligned} \]
By hypothesis, this bound is summable. The other term obtained from inserting (11) into (10) can obviously be bounded by a constant times the sum of two terms (corresponding to the two summands in $\cos\psi = (e^{i\psi} + e^{-i\psi})/2$) of the form
\[ \sum_{l=1}^{N_n} \left|\int_{x_{n-1}}^{y_1(n,l-1)} dx\, V(x)e^{i\omega(x)}\right| \times \left|\int_{y_1(n,l-1)}^{y_1(n,l)} dx\, V(x)e^{i\sigma_1\omega(x)}e^{im\theta(x)}\int_{y_1(n,l-1)}^{x} dt\, V(t)e^{i\sigma_2\omega(t)}\right|, \tag{12} \]
with $\sigma_i \in \{-1, 1\}$, $m \in \mathbb{N}_0$. Next, we claim that
\[ M_n(k) := \max_{1 \le l \le N_n} \left|\int_{x_{n-1}}^{y_1(n,l-1)} dx\, V(x)e^{i\omega(x,k)}\right| \tag{13} \]
remains bounded as $n \to \infty$ for almost every $k \in (a, b)$. In fact, much more is true: Obviously,
\[ M_n(k)^2 \le \sum_{l=1}^{N_n} \left|\int_{x_{n-1}}^{y_1(n,l-1)} dx\, V(x)e^{i\omega(x)}\right|^2, \]
and integrating this over $(a, b)$ yields with the aid of Proposition 2.2,
\[ \int_a^b dk\, M_n(k)^2 \le C\sum_{l=1}^{N_n} \|V\chi_{(x_{n-1},y_1(n,l-1))}\|_2^2 \le CN_n\|V_n\|_2^2 \le C\|V_n\|_2 \in l^1(\mathbb{N}), \]
so we even have that $\sum_n M_n(k)^2 < \infty$ for almost every $k$. So, for almost every $k$, we may estimate the first integral in (12) by a constant that depends on $k$, but not on $n$ or $l$. Therefore, putting everything together, we see that in order to prove almost everywhere convergence of (8), it suffices to prove that sums of the form
\[ \sum_{n=1}^\infty \sum_{l=1}^{N_n} \left|\int_{y_1(n,l-1)}^{y_1(n,l)} dx\, V(x)e^{i\sigma_1\omega(x)}e^{im\theta(x)}\int_{y_1(n,l-1)}^{x} dt\, V(t)e^{i\sigma_2\omega(t)}\right|, \tag{14} \]
with $\sigma_i \in \{-1, 1\}$, $m \in \mathbb{N}_0$, converge for almost all $k \in (a, b)$. Note that this sum has the same form as the one in (8), but we have succeeded in passing from $\{x_n\}$ to the finer partition $\{y_1(n, l)\}$. The above steps can now be applied to (14), in turn, to obtain again a sum of this form, but corresponding to an even finer partition, etc.

Here are the formal details of this inductive procedure: Suppose the $y_i(n, l_1, l_2, \ldots, l_i)$ have been defined for $i = 1, 2, \ldots, j-1$. Then, for fixed $n \in \mathbb{N}$, $l_i \in \{1, \ldots, N_n\}$ ($i \le j-1$), pick numbers $y_j(n, l_1, \ldots, l_{j-1}, l_j)$ ($l_j = 0, 1, \ldots, N_n$) satisfying
\[ y_{j-1}(n, l_1, \ldots, l_{j-1} - 1) = y_j(n, l_1, \ldots, l_{j-1}, 0) < y_j(n, l_1, \ldots, l_{j-1}, 1) < \ldots < y_j(n, l_1, \ldots, l_{j-1}, N_n) = y_{j-1}(n, l_1, \ldots, l_{j-1}). \]
In what follows, we will usually simplify the notation by making only the last argument of $y_i(\cdots)$ explicit.

Lemma 2.3. In order for
\[ \sum_{n=1}^\infty \sum_{l_1,\ldots,l_{j-1}=1}^{N_n} \left|\int_{y_{j-1}(l_{j-1}-1)}^{y_{j-1}(l_{j-1})} dx\, V(x)e^{i\sigma_1\omega(x)}e^{im\theta(x)}\int_{y_{j-1}(l_{j-1}-1)}^{x} dt\, V(t)e^{i\sigma_2\omega(t)}\right| \]
(with $\sigma_i \in \{-1, 1\}$, $m \in \mathbb{N}_0$) to converge for almost every $k \in (a, b)$, it is sufficient that sums of the form
\[ \sum_{n=1}^\infty \sum_{l_1,\ldots,l_j=1}^{N_n} \left|\int_{y_j(l_j-1)}^{y_j(l_j)} dx\, V(x)e^{i\sigma_1'\omega(x)}e^{im'\theta(x)}\int_{y_j(l_j-1)}^{x} dt\, V(t)e^{i\sigma_2'\omega(t)}\right| \tag{15} \]
(with $\sigma_i' \in \{-1, 1\}$, $m' \in \mathbb{N}_0$) converge for almost every $k \in (a, b)$.
Proof. Repeat the arguments leading from (8) to (14); the necessary adjustments are minor and concern mainly the notation. For instance, instead of (13), we now have to show that
\[ M_n(k) := \max_{1 \le l_1,\ldots,l_j \le N_n} \left|\int_{y_{j-1}(n,l_1,\ldots,l_{j-1}-1)}^{y_j(n,l_1,\ldots,l_{j-1},l_j-1)} dx\, V(x)e^{i\omega(x)}\right| \]
is bounded in $n$ for almost all $k$. As above, this follows from
\[ \begin{aligned} \int_a^b dk\, M_n(k)^2 &\le \sum_{l_1,\ldots,l_j=1}^{N_n} \int_a^b dk \left|\int_{y_{j-1}(n,l_1,\ldots,l_{j-1}-1)}^{y_j(n,l_1,\ldots,l_{j-1},l_j-1)} dx\, V(x)e^{i\omega(x)}\right|^2 \\ &\le C\sum_{l_1,\ldots,l_j=1}^{N_n} \|V\chi_{(y_{j-1}(l_{j-1}-1),\,y_j(l_j-1))}\|_2^2 \\ &\le C\sum_{l_1,\ldots,l_j=1}^{N_n} \|V\chi_{(y_{j-1}(l_{j-1}-1),\,y_{j-1}(l_{j-1}))}\|_2^2 \\ &= C\|V_n\|_2^2 N_n \le C\|V_n\|_2 \in l^1(\mathbb{N}). \end{aligned} \]

So, in order to complete the proof of Theorem 1.1, it suffices to show that (15) converges almost everywhere for some $j \in \mathbb{N}$. We now specialize to particular subdivisions $\{y_i\}$. Namely, we distribute the $L^1$ norm of $V$ evenly among the subintervals of $[x_{n-1}, x_n]$, i.e., we choose the $y_i$ such that for $i = 1, \ldots, j$,
\[ \|V\chi_{(y_i(n,l_1,\ldots,l_{i-1},l_i-1),\,y_i(n,l_1,\ldots,l_{i-1},l_i))}\|_1 = \frac{\|V_n\|_1}{N_n^i}. \tag{16} \]
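One way to picture the subdivision rule (16) is to compute breakpoints that split a sampled $|V|$ into pieces of equal $L^1$ mass. The sketch below is a hypothetical illustration, not the paper's construction: it assumes $|V|$ is given on a grid, accumulates the mass with the trapezoidal rule, and locates each breakpoint by linear interpolation of the cumulative mass, so the masses are equal only up to discretization error. Taking `pieces` equal to $N_n^i$ for $i = 1, \ldots, j$ gives nested partitions, as required by the inductive procedure.

```python
import bisect


def equal_l1_breakpoints(xs, abs_v, pieces):
    """Split [xs[0], xs[-1]] into `pieces` subintervals carrying
    (approximately) equal shares of the L1 mass of the sampled |V|.

    xs     -- increasing grid points
    abs_v  -- values of |V| at the grid points (total mass must be > 0)
    Returns a list of pieces + 1 breakpoints, including both endpoints.
    """
    # Cumulative trapezoidal mass up to each grid point.
    cum = [0.0]
    for i in range(1, len(xs)):
        cum.append(cum[-1]
                   + 0.5 * (abs_v[i - 1] + abs_v[i]) * (xs[i] - xs[i - 1]))
    total = cum[-1]

    points = [xs[0]]
    for l in range(1, pieces):
        target = total * l / pieces
        i = bisect.bisect_left(cum, target)  # first i with cum[i] >= target
        frac = (target - cum[i - 1]) / (cum[i] - cum[i - 1])
        points.append(xs[i - 1] + frac * (xs[i] - xs[i - 1]))
    points.append(xs[-1])
    return points
```

For a constant $|V|$ the breakpoints are equally spaced; for a decaying $|V|$ the subintervals get longer where the potential is small, which is the effect exploited in the estimate that follows.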
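We integrate the summands from (15) with respect to $dk$:
\[ \begin{aligned} &\int_a^b dk \sum_{l_1,\ldots,l_j=1}^{N_n} \left|\int_{y_j(l_j-1)}^{y_j(l_j)} dx\, V(x)e^{i\sigma_1'\omega(x)}e^{im'\theta(x)}\int_{y_j(l_j-1)}^{x} dt\, V(t)e^{i\sigma_2'\omega(t)}\right| \\ &\quad \le (b - a)^{1/2}\sum_{l_1,\ldots,l_j=1}^{N_n}\int_{y_j(l_j-1)}^{y_j(l_j)} dx\, |V(x)| \left(\int_a^b dk \left|\int_{y_j(l_j-1)}^{x} dt\, V(t)e^{i\omega(t)}\right|^2\right)^{1/2} \\ &\quad \le C\sum_{l_1,\ldots,l_j=1}^{N_n}\int_{y_j(l_j-1)}^{y_j(l_j)} dx\, |V(x)|\, \|V\chi_{(y_j(l_j-1),x)}\|_2 \\ &\quad \le C\|V_n\|_1 N_n^{-j}\sum_{l_1,\ldots,l_j=1}^{N_n} \|V\chi_{(y_j(l_j-1),y_j(l_j))}\|_2 \\ &\quad \le C\|V_n\|_1 N_n^{-j/2}\|V_n\|_2 \le C\|V_n\|_1\|V_n\|_2^{j/2+1}. \end{aligned} \]
By assumption, the last expression is summable, provided $j \ge 2N$.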
3. WKB Asymptotics

Theorem 3.1. Assume that $V$ satisfies the assumptions of Theorem 1.1. Then, for almost every $E > 0$, there are solutions $y$, $\bar y$ of the Schrödinger equation (1) with the asymptotic form
\[ \begin{pmatrix} y(x, E) \\ y'(x, E) \end{pmatrix} = \left(\begin{pmatrix} 1 \\ ik \end{pmatrix} + o(1)\right)e^{i\omega(x,E)/2} \quad (x \to \infty). \]

Remark. The term “WKB” is justified because if $V \to 0$, then $\omega(x)/2 = kx - (2k)^{-1}\int_0^x V(t)\,dt = \int_0^x \sqrt{E - V(t)}\,dt + c + o(1)$.

Proof. First of all, we observe that the assumptions of Theorem 1.1 imply that $V \in L^p$ for some $p < 2$. To prove this, let $p_0 = 1$, $p_{n+1} = 1 + (p_n/2)$, and note that by the Cauchy-Schwarz inequality
\[ \|f\|_{p_{n+1}}^{2p_{n+1}} \le \|f\|_2^2\, \|f\|_{p_n}^{p_n}. \tag{17} \]
Moreover, we have that $p_n \in (1, 2)$ for all $n \in \mathbb{N}$. By hypothesis, there exists $r \in \mathbb{N}$ so that
\[ \|V_m\|_1\, \|V_m\|_2^{2+2^2+\ldots+2^r} \le C \quad \forall m. \]
Now, repeatedly applying (17) with $f = V_m$, we get
\[ \begin{aligned} C &\ge \|V_m\|_1\, \|V_m\|_2^{2+2^2+\ldots+2^r} \ge \|V_m\|_{p_1}^{2p_1}\, \|V_m\|_2^{2^2+2^3+\ldots+2^r} \\ &\ge \ldots \ge \|V_m\|_{p_r}^{2^r p_r} \ge \|V_m\|_{p_{r+1}}^{2^{r+1}p_{r+1}}\, \|V_m\|_2^{-2^{r+1}}, \end{aligned} \]
hence $\|V_m\|_{p_{r+1}}^{p_{r+1}} \le \tilde C\|V_m\|_2$. This shows $V \in L^{p_{r+1}}$, as claimed.

It follows from (7) that $\lim_{n\to\infty} R(x_n, k)$ and $\lim_{n\to\infty} \theta(x_n, k)$ already exist for almost every $k > 0$. We will show that the limits $\lim_{x\to\infty} R(x, k)$, $\lim_{x\to\infty} \theta(x, k)$ exist, too. It is easy to see that this implies the claimed asymptotic formulae. Thus it suffices to show that
\[ \lim_{n\to\infty} \max_{\xi \in [x_{n-1},x_n]} \left|\int_{x_{n-1}}^{\xi} dx\, V(x)e^{i\omega(x,k)}e^{i\theta(x,k)}\right| = 0 \tag{18} \]
for almost every $k$. We need one additional tool, namely a norm estimate on the maximal function
\[ M_n(k) = \max_{\xi \in [x_{n-1},x_n]} \left|\int_{x_{n-1}}^{\xi} dx\, V(x)e^{i\omega(x,k)}\right|. \]
Fix $p < 2$ such that $V \in L^p$, and let $q$ be the conjugate index (i.e. $1/p + 1/q = 1$). Then it follows by interpolation (see [15, Theorem 2.1]) that
\[ \left(\int_a^b M_n(k)^q\,dk\right)^{1/q} \le C\left(\int_{x_{n-1}}^{x_n} |V(x)|^p\,dx\right)^{1/p}. \tag{19} \]
Now Theorem 3.1 is proved by basically repeating the steps of the proof of Theorem 1.1. Pick a function $\xi(k)$, so that the maximum in (18) is attained for $\xi = \xi(k)$. Integration by parts shows that we can estimate (18) by
\[ \left|\int_{x_{n-1}}^{\xi(k)} dx\, V(x)e^{i\omega(x)}\right| + \frac{1}{k}\left|\int_{x_{n-1}}^{\xi(k)} dx\, V(x)\cos\psi(x)\,e^{i\theta(x)}\int_{x_{n-1}}^{x} dt\, V(t)e^{i\omega(t)}\right|. \tag{20} \]
Call the first term $S(n, k)$. Then, by (19), $\int_a^b S(n, k)^q\,dk \le C\|V_n\|_p^q$ which is summable because of $q > p$. Thus $S(n, k) \in l^q$ for almost every $k \in (a, b)$, and, in particular, $\lim_{n\to\infty} S(n, k) = 0$ for these $k$, as desired.

We now study the second term of (20). The fact that the upper limit is $\xi(k)$ instead of $x_n$ forces a slight modification of the reasoning of Sect. 2. However, since we only need to show convergence to zero (rather than summability), the argument becomes, if anything, easier. Let $y_1(n, l)$ be as in Sect. 2, and define $L_1 = L_1(n, k)$ by $y_1(n, L_1 - 1) < \xi(k) \le y_1(n, L_1)$. Then we have to analyze expressions of the form
\[ \sum_{l=1}^{L_1-1} \left|\int_{y_1(n,l-1)}^{y_1(n,l)} dx\, V(x)e^{i\sigma\omega(x)}e^{im\theta(x)}\int_{x_{n-1}}^{x} dt\, V(t)e^{i\omega(t)}\right| + \left|\int_{y_1(n,L_1-1)}^{\xi(k)} dx\, V(x)e^{i\sigma\omega(x)}e^{im\theta(x)}\int_{x_{n-1}}^{x} dt\, V(t)e^{i\omega(t)}\right|. \]
The first contribution can obviously be estimated by the corresponding sum over the whole range $l = 1, \ldots, N_n$, and this term has already been dealt with in Sect. 2. It remains to investigate the second term. Proceed as in Sect. 2; again, we need to control terms of three different types:
\[ \left|\int_{y_1(n,L_1-1)}^{\xi(k)} dx\, V(x)e^{i\omega(x)}\right| \left|\int_{x_{n-1}}^{y_1(n,L_1-1)} dx\, V(x)e^{i\omega(x)}\right|, \tag{21} \]
\[ \left|\int_{x_{n-1}}^{y_1(n,L_1-1)} dx\, V(x)e^{i\omega(x)}\right| \times \left|\int_{y_1(n,L_1-1)}^{\xi(k)} dx\, V(x)e^{i\sigma_1\omega(x)}e^{im'\theta(x)}\int_{y_1(n,L_1-1)}^{x} dt\, V(t)e^{i\sigma_2\omega(t)}\right|, \tag{22} \]
\[ \left|\int_{y_1(n,L_1-1)}^{\xi(k)} dx\, V(x)e^{i\sigma\omega(x)}e^{im\theta(x)}\int_{y_1(n,L_1-1)}^{x} dt\, V(t)e^{i\omega(t)}\right|. \tag{23} \]
As above, it is easy to see, using (19), that (21) and the first integral of (22) tend to zero for almost every $k \in (a, b)$. Thus it suffices to show that expressions of the form of (23) go to zero for almost every $k$. Applying this whole argument $j$ times, we see that it is enough to prove that
\[ \left|\int_{y_j(n,L_1,\ldots,L_{j-1},L_j-1)}^{\xi(k)} dx\, V(x)e^{i\sigma_1\omega(x)}e^{im\theta(x)}\int_{y_j(n,L_1,\ldots,L_{j-1},L_j-1)}^{x} dt\, V(t)e^{i\sigma_2\omega(t)}\right| \tag{24} \]
tends to zero for almost every $k \in (a, b)$. Here, the $L_i = L_i(n, k)$ are defined in a way analogous to the definition of $L_1$. To conclude the proof, we note that (24) can be estimated by
\[ \int_{y_j(n,L_1,\ldots,L_{j-1},L_j-1)}^{\xi(k)} dx\, |V(x)| \left|\int_{y_j(n,L_1,\ldots,L_{j-1},L_j-1)}^{x} dt\, V(t)e^{i\omega(t)}\right| \le 2M_n(k)\max_{l_1,\ldots,l_j}\int_{y_j(n,l_1,\ldots,l_{j-1},l_j-1)}^{y_j(n,l_1,\ldots,l_{j-1},l_j)} dx\, |V(x)|. \]
We know already that $M_n$ tends to zero for almost every $k$, and the second factor is equal to $\|V_n\|_1 N_n^{-j}$ by (16). This expression is bounded, provided $j \ge N$.

Remarks. 1. Note that we could afford using several crude estimates in this proof. This is due to the fact that we needed to establish only convergence to zero, while in the preceding section, we proved that a closely related quantity is actually summable.
2. In particular, Theorem 3.1 implies that for almost every $E > 0$, the Schrödinger Eq. (1) has only bounded solutions. This provides a way of proving Theorem 1.1 without using Proposition 2.1 [29, 31].
3. The maximal function estimate (19) can also be used to control what we called $M_n$ in Sect. 2.

4. The Discrete Case

In this section, we want to obtain analogues of Theorems 1.1, 3.1 for the discrete Schrödinger equation
\[ y(n-1) + y(n+1) + V(n)y(n) = Ey(n) \quad (n \ge 1). \tag{25} \]
The corresponding operator on $l^2(\mathbb{N})$ is given by
\[ (Hy)(n) = \begin{cases} y(2) + V(1)y(1) & (n = 1) \\ y(n-1) + y(n+1) + V(n)y(n) & (n \ge 2). \end{cases} \]
In this formulation, the role of the boundary condition is now taken by the parameter $V(1)$. We will again work with Prüfer type variables (cf. [16, 17]): So, let $y(n, E)$ be the solution of (25) with $y(0, E) = 0$, $y(1, E) = 1$. For $E \in (-2, 2)$, write $E = 2\cos k$ with $k \in (0, \pi)$. Then, if $V \equiv 0$, the motion of the vector
\[ Y(n, k) = \frac{1}{\sin k}\begin{pmatrix} \sin k & 0 \\ -\cos k & 1 \end{pmatrix}\begin{pmatrix} y(n-1, k) \\ y(n, k) \end{pmatrix} \]
is simply rotation by $k$. Thus it is natural to introduce Prüfer variables $R$, $\psi$ by writing
\[ Y(n, k) = R(n, k)\begin{pmatrix} \sin((\psi(n, k)/2) - k) \\ \cos((\psi(n, k)/2) - k) \end{pmatrix}. \]
We need not worry about the non-uniqueness of $\psi$, since only the value modulo $2\pi$ will matter in the sequel. $R$, $\psi$ obey the equations
\[ \frac{R(n+1, k)^2}{R(n, k)^2} = 1 - \frac{V(n)}{\sin k}\sin\psi(n, k) + \frac{V(n)^2}{\sin^2 k}\sin^2\frac{\psi(n, k)}{2}, \tag{26} \]
\[ \cot\left(\frac{\psi(n+1, k)}{2} - k\right) = \cot\frac{\psi(n, k)}{2} - \frac{V(n)}{\sin k}. \tag{27} \]
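The amplitude relation (26) can be checked directly against the recurrence (25). The following sketch is illustrative only; the function name `discrete_prufer_defect`, the sample potential, and the parameter values are assumptions. It iterates (25) with $y(0) = 0$, $y(1) = 1$, reads off $R(n, k)^2$ and $\psi(n, k)/2$ from the vector $Y(n, k)$, and compares the actual ratio $R(n+1, k)^2 / R(n, k)^2$ with the right-hand side of (26).

```python
import math


def discrete_prufer_defect(potential, k, steps):
    """Largest deviation between R(n+1)^2 / R(n)^2 computed from the
    recurrence (25) and the right-hand side of (26), for n = 1..steps."""
    energy = 2.0 * math.cos(k)
    s, c = math.sin(k), math.cos(k)
    y_prev, y_cur = 0.0, 1.0  # y(0), y(1)
    worst = 0.0
    for n in range(1, steps + 1):
        # Y(n) = (y(n-1), (y(n) - cos(k) y(n-1)) / sin(k))
        top, bottom = y_prev, (y_cur - c * y_prev) / s
        r2 = top * top + bottom * bottom
        # Y(n) = R (sin(psi/2 - k), cos(psi/2 - k)), so psi/2 = angle + k.
        half_psi = math.atan2(top, bottom) + k
        v = potential(n)
        predicted = (1.0 - v / s * math.sin(2.0 * half_psi)
                     + (v / s) ** 2 * math.sin(half_psi) ** 2)
        y_next = (energy - v) * y_cur - y_prev
        top_next, bottom_next = y_cur, (y_next - c * y_cur) / s
        actual = (top_next * top_next + bottom_next * bottom_next) / r2
        worst = max(worst, abs(actual - predicted))
        y_prev, y_cur = y_cur, y_next
    return worst
```

With, say, `potential = lambda n: n ** -0.75` and `k = 1.0`, the returned defect should be of the order of floating-point rounding, which is consistent with (26) being an exact identity rather than an approximation.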
There is no problem with the singularities of the cot, because we may as well use a similar equation with tan instead of cot. Theorem P∞ 4.1. Suppose there exists an increasing sequence xn ∈ N, xn → ∞, such that n=1 kVn k2 < ∞ and supn∈N kVn k1 kVn kN 2 < ∞ for some N ∈ N (writing Pxn −1 p kVn kpp = m=x |V (m)| ). Then 6 = (−2, 2) is an essential support of the absoac n−1 lutely continuous part of the spectral measure. Moreover, for almost every E ∈ (−2, 2), there are solutions y, y of the Schr¨odinger equation (25) with the asymptotic form y(n, E) = (1 + o(1))eiω(n,k)/2 Pn−1 (where ω(n, k) = 2kn + sin1 k s=1 V (s)).
(n → ∞)
Sketch of the proof. We will discuss only the proof of 6ac = (−2, 2). This proof is, of course, modelled on the proof of Theorem 1.1. The assertion on the WKB asymptotics is established by a similar adaption of the proof of Theorem 3.1. A Taylor expansion of the basic Eqs. (26), (27) shows xX xX n −1 n −1 R(xn , k) =− V (s) sin ψ(s, k) + O V (s)2 , (28) 2 sin k ln R(xn−1 , k) s=x s=x n−1
n−1
ψ(m, k) = ω(m, k) + θ(m, k),
(29)
where ω has been defined above, and θ satisfies θ(m + 1, k) − θ(m, k) = −
V (m) cos ψ(m, k) + O V (m)2 . sin k
(30)
The constants hidden in the $O(\ldots)$ terms of (28), (30) are uniform in $k$, provided $k$ is restricted to a compact subset of $(0, \pi)$. Analogues of Propositions 2.1, 2.2 remain true. The proofs are completely analogous and will thus be left to the reader. Now fix $\delta > 0$. We will show that
\[ \sum_{n=1}^{\infty} \left| \sum_{s=x_{n-1}}^{x_n - 1} V(s) e^{i\omega(s, k)} e^{i\theta(s, k)} \right| < \infty \tag{31} \]
for almost every $k \in (\delta, \pi - \delta)$. This will imply the assertion by the analogue of Proposition 2.1. Summation by parts lets us rewrite the sum over $s$ from (31) as
\[ e^{i\theta(x_n)} \sum_{s=x_{n-1}}^{x_n - 1} V(s) e^{i\omega(s)} + \sum_{s=x_{n-1}}^{x_n - 1} \left( e^{i\theta(s)} - e^{i\theta(s+1)} \right) \sum_{t=x_{n-1}}^{s} V(t) e^{i\omega(t)}. \tag{32} \]
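The summation-by-parts identity behind (32) can be checked numerically. The sketch below compares the inner sum from (31) over one segment $[a, b)$ (standing for $[x_{n-1}, x_n)$) with the two-term expression (32); the function names and the sample sequences used for $V$, $\omega$, $\theta$ are arbitrary illustrative choices, not data from the paper.

```python
import cmath

def direct_sum(V, omega, theta, a, b):
    """Sum over s = a, ..., b-1 of V(s) e^{i omega(s)} e^{i theta(s)}."""
    return sum(V[s] * cmath.exp(1j * (omega[s] + theta[s])) for s in range(a, b))

def by_parts(V, omega, theta, a, b):
    """Expression (32) with x_{n-1} = a and x_n = b (theta must be defined at b)."""
    partial = 0j      # running sum of V(t) e^{i omega(t)} for t = a, ..., s
    correction = 0j   # second term of (32)
    for s in range(a, b):
        partial += V[s] * cmath.exp(1j * omega[s])
        correction += (cmath.exp(1j * theta[s]) - cmath.exp(1j * theta[s + 1])) * partial
    return cmath.exp(1j * theta[b]) * partial + correction
```

The two functions agree up to rounding for any input sequences, which is the content of the identity; only the subsequent estimates use the special structure of $\theta$.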
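As in Sect. 2, we see from the analogue of Proposition 2.2 that the first term is already summable for almost every $k$. Namely, we have that
\[ \int_{\delta}^{\pi - \delta} dk \left| \sum_{s=x_{n-1}}^{x_n - 1} V(s) e^{i\omega(s, k)} \right| \le \pi^{1/2} \left( \int_{\delta}^{\pi - \delta} dk \left| \sum_{s=x_{n-1}}^{x_n - 1} V(s) e^{i\omega(s, k)} \right|^2 \right)^{1/2} \le C \|V_n\|_2 \in l_1(\mathbb{N}). \]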
Taylor's theorem and (30) imply that
\[ e^{i\theta(s)} - e^{i\theta(s+1)} = \frac{iV(s)}{\sin k} \cos\psi(s) \, e^{i\theta(s)} + O\left( V(s)^2 \right). \]
Plug this into (32). The contribution coming from the remainder $O(V(s)^2)$ is summable for almost every $k$. In fact,
\[ \int_{\delta}^{\pi - \delta} dk \sum_{s=x_{n-1}}^{x_n - 1} V(s)^2 \left| \sum_{t=x_{n-1}}^{s} V(t) e^{i\omega(t)} \right| \le C \|V_n\|_2^3 \in l_1(\mathbb{N}). \]
So, it suffices to deal with terms of the form
\[ \sum_{s=x_{n-1}}^{x_n - 1} V(s) e^{i\sigma\omega(s)} e^{im\theta(s)} \sum_{t=x_{n-1}}^{s} V(t) e^{i\omega(t)}. \]
Now the machine based on Lemma 2.3 can be used to pass to increasingly finer subdivisions of the segments $[x_{n-1}, x_n)$. As above, additional terms coming from the $O(V^2)$ remainders are easily seen to be summable. We keep the notations of Sect. 2. In particular, we let again $N_n = \max\{1, [1/\|V_n\|_2]\}$. So, our final task is to show that for appropriately chosen $y_i$'s and for some $j$,
\[ \sum_{n=1}^{\infty} \sum_{l_1, \ldots, l_j = 1}^{N_n} \sum_{s=y_j(l_j - 1)}^{y_j(l_j) - 1} |V(s)| \left| \sum_{t=y_j(l_j - 1)}^{s} V(t) e^{i\omega(t)} \right| < \infty \tag{33} \]
for almost every $k$. The problem is that we can no longer pick the $y_i$'s according to (16). This difficulty is overcome as follows. Fix $n$; then, we have to pick $N_n^j + 1$ numbers $y_j(n, l_1, \ldots, l_j) \in \{x_{n-1}, \ldots, x_n\}$. It will be convenient to relabel these numbers as $y_0, y_1, \ldots, y_{N_n^j}$. The restriction of $V$ to $\{y_{m-1}, y_{m-1} + 1, \ldots, y_m - 1\}$ will be denoted by $V_n^m$. Now let $y_0 = x_{n-1}$, and if $y_0, \ldots, y_{m-1}$ have been chosen, let $y_m$ be the smallest integer $> y_{m-1}$ such that $\|V_n^m\|_1 > \|V_n\|_1 N_n^{-3j/4}$. If no such number which is less than $x_n$ exists, set $y_m = x_n$. It is clear that this final index (let us call it $M$) satisfies $M \le N_n^{3j/4}$. Since $N_n^j - (N_n^{3j/4} + 1) > N_n^{3j/4}$ for large enough $n$, we can add the numbers $y_m - 1$ ($m = 1, \ldots, M$) to this collection of $y_i$'s. The remaining $y_i$'s can now be chosen arbitrarily. By construction, this subdivision has the following property: Renumber the $y_i$'s so that $x_{n-1} = y_0 \le y_1 \le \ldots \le y_{N_n^j} = x_n$. Then $\{y_{m-1}, \ldots, y_m - 1\}$ is either a single point, or $\|V_n^m\|_1 \le \|V_n\|_1 N_n^{-3j/4}$. (In fact, some of these segments may even be empty, in which case the second alternative holds.) It is also clear how the numbers with the original labeling $y_i(n, l_1, \ldots, l_i)$ are obtained from this “linear” arrangement $y_0, \ldots, y_{N_n^j}$. Namely, a little thought shows that we have to set $y_i(n, l_1, \ldots, l_i) = y_m$, where $m = \sum_{r=1}^{i-1} (l_r - 1) N_n^{j-r} + l_i N_n^{j-i}$. Now integrate the summands of (33) with respect to $dk$. The result can be bounded by (a constant times)
\[ \sum_{m=1}^{N_n^j} \|V_n^m\|_1 \|V_n^m\|_2 . \]
Consider separately the sum over those $m$ for which $y_{m-1} = y_m - 1$ and the sum over the remaining indices. The first contribution obviously does not exceed $\|V_n\|_2^2$, and the second one can be estimated by
\[ N_n^{-3j/4} \|V_n\|_1 \sum_{m=1}^{N_n^j} \|V_n^m\|_2 \le N_n^{-j/4} \|V_n\|_1 \|V_n\|_2 \le C \|V_n\|_1 \|V_n\|_2^{j/4 + 1}. \]
This is also summable, provided $j \ge 4N$.
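The greedy choice of the breakpoints $y_m$ used in this proof can be sketched as follows. This is only an illustration under simplifying assumptions: the segment $\{x_{n-1}, \ldots, x_n - 1\}$ is re-indexed as $\{0, \ldots, \mathrm{len}(v) - 1\}$, `threshold` plays the role of $\|V_n\|_1 N_n^{-3j/4}$, and the function names are hypothetical. The padding by arbitrary additional points up to $N_n^j + 1$ breakpoints is omitted.

```python
def greedy_breakpoints(v, threshold):
    """Greedy choice of y_0 < y_1 < ... < y_M on the segment {0, ..., len(v) - 1}.

    y_m is the smallest integer > y_{m-1} for which the l^1 norm of v on
    {y_{m-1}, ..., y_m - 1} exceeds `threshold`; if there is none, y_m = len(v).
    """
    n = len(v)
    ys = [0]
    while ys[-1] < n:
        total = 0
        y_next = n
        for s in range(ys[-1], n):
            total += abs(v[s])
            if total > threshold:
                y_next = s + 1
                break
        ys.append(y_next)
    return ys

def refined_subdivision(v, threshold):
    """Add the points y_m - 1 (m = 1, ..., M); return the sorted, distinct breakpoints."""
    ys = greedy_breakpoints(v, threshold)
    return sorted(set(ys) | {y - 1 for y in ys[1:]})

def has_property(v, threshold, points):
    """True if every segment {a, ..., b - 1} is a single point or has l^1 norm <= threshold."""
    return all(
        b - a == 1 or sum(abs(x) for x in v[a:b]) <= threshold
        for a, b in zip(points, points[1:])
    )
```

Each completed greedy segment carries $l^1$ mass larger than the threshold, which is what bounds the number of breakpoints in terms of $\|V_n\|_1$ divided by the threshold.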
5. Perturbations of General Schrödinger Operators

Our methods also allow us to treat the following, more general situation. Consider the Schrödinger operator
\[ H = -\frac{d^2}{dx^2} + U(x) + V(x) \]
on $L_2(0, \infty)$, viewed as a perturbation of $H_0 = -d^2/dx^2 + U(x)$. In order to be able to apply [31, Theorem 5], we will need a mild assumption on the negative part of $U$, namely $\sup_{n \in \mathbb{N}} \int_n^{n+1} U_-(x) \, dx < \infty$. We are interested in relations between $\Sigma_{ac}(H_0)$ and $\Sigma_{ac}(H)$. Results of this flavor were first proved in [15]. Here, this extension relies on the generalized Prüfer transformation, as introduced in [17]. We recall briefly the basic properties of this transform. Let $f(x, E)$ be a complex solution of the reference equation
\[ -f''(x) + U(x) f(x) = E f(x) \tag{34} \]
with the additional property that $f$ and $\bar{f}$ are linearly independent. Define a differentiable function $\gamma$ by writing $f = |f| e^{i\gamma}$. Constancy of the Wronskian $W = \bar{f} f' - f \bar{f}'$ shows that $\gamma'(x) |f(x)|^2 = c$. By replacing $f$ by $\bar{f}$ if necessary, we may assume that $c > 0$. Then we also have that $\gamma'(x, E) > 0$ for all $x$, $E$. Now any real-valued solution $y$ of the full equation
\[ -y''(x) + (U(x) + V(x)) y(x) = E y(x) \tag{35} \]
can be expressed in terms of $f$ and generalized Prüfer variables $R(x, E) > 0$, $\psi(x, E)$ by
\[ \begin{pmatrix} y \\ y' \end{pmatrix} = \operatorname{Im} \left[ R \, e^{i(\psi/2 - \gamma)} \begin{pmatrix} f \\ f' \end{pmatrix} \right]. \]
It is shown in [17] that $R$, $\psi$ are well-defined and obey the equations
\[ (\ln R)' = \frac{V}{2\gamma'} \sin\psi, \tag{36} \]
\[ \psi(x, E) = \omega(x, E) + \theta(x, E), \tag{37} \]
where
\[ \omega(x, E) = 2\gamma(x, E) - \int_0^x \frac{V(t)}{\gamma'(t, E)} \, dt \]
and $\theta' = (V/\gamma') \cos\psi$. Note the complete analogy of these equations to (2), (3). The main new feature is that we no longer have Proposition 2.2. The property stated there now becomes an assumption.
Theorem 5.1. Let $S \subset \mathbb{R}$ be a Borel set with the following properties:
1. For all $E \in S$, all solutions of (34) are bounded.
2. For some choice of $f(x, E)$ as above, the integral operator $T : L_2(0, \infty) \to L_2(S)$, defined for bounded functions $g$ of compact support by
\[ (Tg)(E) = \int_0^{\infty} \frac{e^{i\omega(x, E)}}{\gamma'(x, E)} \, g(x) \, dx, \]
is norm bounded.
Furthermore, assume that $V$ satisfies the hypotheses of Theorem 1.1. Then $\Sigma_{ac}(H) \supset S$. Moreover, for almost every $E \in S$, there are solutions $y$, $\bar{y}$ of (35) with the asymptotic form
\[ \begin{pmatrix} y(x, E) \\ y'(x, E) \end{pmatrix} = \left[ \begin{pmatrix} f(x, E) \\ f'(x, E) \end{pmatrix} + o(1) \right] e^{i(\omega(x, E)/2 - \gamma(x, E))} \qquad (x \to \infty). \]
Sketch of the proof. Let $S_n = \{E \in S : \sup_{x \ge 0} (1/\gamma'(x, E)) \le n\}$. Then, by assumption, $S = \bigcup_{n \in \mathbb{N}} S_n$, and it thus suffices to prove the assertion for $E \in S_n$, where $n$ is fixed but arbitrary. Now we can repeat the proofs of Theorems 1.1, 3.1 step by step, starting from the generalized Prüfer equations (36), (37). We cannot use Proposition 2.1, but the asymptotic formulae for $y$, $\bar{y}$ imply that all solutions are bounded at these energies (compare Remark 2 at the end of Sect. 3).
Remarks.
1. By [31] (see also [29]), assumption 1 implies that $\Sigma_{ac}(H_0) \supset S$.
2. In [15], assumption 2 is established for $U = 0$ and for periodic $U$. It would be interesting to investigate this condition in more general contexts. Note also that we do not really need the full force of the norm bound, but only the special case $g = V \chi_{(s,t)}$.
3. The fact that the proof of Theorem 5.1 is almost identical with the proofs of Theorems 1.1, 3.1 may be viewed as an argument against this result. I prefer to think of this as an illustration of the power of the generalized Prüfer transformation.
6. Proof of Theorems 1.3, 1.4

The first ingredient to the proof is the following consequence of a classical result.

Lemma 6.1. If $|V(x)| \le C x^{-\alpha}$ for all large $x$, then the set
\[ S_\epsilon = \left\{ k : \lim_{N \to \infty} \int_0^N x^{\epsilon} V(x) e^{ikx} \, dx \text{ does not exist} \right\} \]
has Hausdorff dimension $\dim S_\epsilon \le 2(1 + \epsilon - \alpha)$.

Proof. A classical result on the set $S_\epsilon$ says that $S_\epsilon$ is of $\gamma$-capacity zero for every $\gamma > 2(1 + \epsilon - \alpha)$ (see, e.g., [3, Sect. V] or [35, Sect. XIII.11]). In the literature, this result is usually formulated for Fourier series (rather than integrals), but the proof extends to the continuous case without any difficulties. $S_\epsilon$ is a Borel set, so the usual connections between capacities and Hausdorff measures (see, e.g., [3, Sect. IV]) imply $\dim S_\epsilon \le 2(1 + \epsilon - \alpha)$.
Now let $\epsilon_n = 1 - \alpha + n^{-1}$ and consider the sets $S_n \equiv S_{\epsilon_n} \cap (0, \infty)$. The integration by parts argument from [14] shows that if $k > 0$ and $2k \notin S_n$, then $\lim_{N \to \infty} \int_0^N V(t) e^{2ikt} \, dt$ exists and, moreover, $\int_x^{\infty} V(t) e^{2ikt} \, dt = O(x^{-\epsilon_n})$. Hence, again by [14] (or the alternative version of this argument in [16, Sect. 3]), all solutions of (1) (with $E = k^2$) are bounded. So, for every boundary condition $\beta$, the singular part of the spectral measure (restricted to $(0, \infty)$) is supported by every $\tilde{S}_n := \{E = k^2/4 : k \in S_n\}$ and hence also by $\bigcap_{n \in \mathbb{N}} \tilde{S}_n$. By Lemma 6.1, this intersection has dimension $\le 4(1 - \alpha)$. This proves Theorem 1.4.

The second ingredient to the proof of Theorem 1.3 is an estimate on the solutions of (1). To simplify the notation, we will subsequently assume that $|V(x)| \le C/x$ for all $x \ge 1$. The general case (i.e. $C = \limsup x |V(x)|$) is dealt with by first increasing $C$ slightly and then considering only big enough $x$. We will use the notation $\|y\|_x^2 = \int_0^x |y(t)|^2 \, dt$ (cf. [12]). The following lemma is based on a precise analysis of the Prüfer equations (compare also [11, Sect. 3.2]).

Lemma 6.2. For every $E > (2C/\pi)^2$, there exist $\alpha, \beta > 0$, such that every nontrivial solution $y$ of the Schrödinger equation (1) satisfies estimates of the form $A_1 x^{\alpha} \le \|y\|_x^2 \le A_2 x^{\beta}$ (where $A_1$, $A_2$ are positive constants, and, say, $x \ge 1$).

Proof. Fix $E > (2C/\pi)^2$, and let $y$ be a solution of (1). Introduce the corresponding Prüfer variables $R$, $\psi$ (compare Sect. 2). It is easy to see that we have
\[ A \int_0^x R^2(t) \, dt \le \|y\|_x^2 \le \int_0^x R^2(t) \, dt \qquad (x \ge 1, \ A > 0), \]
so it suffices to analyze $R$. Let $x_n$ ($n \in \mathbb{N}_0$) satisfy $1 = x_0 < x_1 < \ldots < x_n \to \infty$ and $x_n/x_{n-1} \to 1$. Then
\[ k \ln R^2(x) = \sum_{n=1}^{N} \int_{x_{n-1}}^{x_n} V(t) \sin(2kt + \vartheta_{n-1}) \, dt + o(\ln x) \qquad (x \to \infty), \tag{38} \]
where $\vartheta_n = \psi(x_n) - 2k x_n$, and $N = N(x)$ is determined by $x_N \le x < x_{N+1}$. This follows from a routine Taylor expansion of the basic equations. The details can be left to the reader. Now, we can specialize to, say, $x_n = n^2 + \xi_n$, where the $\xi_n$ are picked such that $N_n \equiv (2k/\pi)(x_n - x_{n-1}) \in \mathbb{N}$. A simple argument shows that it is possible to take the $\xi_n$ bounded. Then (38) gives
\[ |\ln R^2(x)| \le \frac{C}{k} \sum_{n=1}^{N} \frac{1}{x_{n-1}} \int_{x_{n-1}}^{x_n} |\sin(2kt + \vartheta_{n-1})| \, dt + o(\ln x) = \frac{2C}{\pi k} \sum_{n=1}^{N} \left( \frac{x_n}{x_{n-1}} - 1 \right) + o(\ln x) \]
\[ = \frac{2C}{\pi k} \sum_{n=1}^{N} \ln \frac{x_n}{x_{n-1}} + o(\ln x) = \left( \frac{2C}{\pi k} + o(1) \right) \ln x. \]
Taking exponentials and then integrating completes the proof.
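The step from the first to the second expression in the last display rests on the fact that $|\sin(2kt + \vartheta)|$ has period $\pi/(2k)$ and mean value $2/\pi$, so that its integral over an interval containing exactly $N_n$ periods equals $(2/\pi)(x_n - x_{n-1})$, independently of the phase. A small numerical check of this fact is sketched below; the midpoint rule, the step count and the function names are illustrative assumptions.

```python
import math

def integral_abs_sin(k, theta, a, b, steps=200_000):
    """Midpoint-rule approximation of the integral of |sin(2 k t + theta)| over [a, b]."""
    h = (b - a) / steps
    return h * sum(
        abs(math.sin(2.0 * k * (a + (i + 0.5) * h) + theta)) for i in range(steps)
    )

def interval_length(k, n_periods):
    """Length x_n - x_{n-1} for which (2k/pi)(x_n - x_{n-1}) equals n_periods."""
    return n_periods * math.pi / (2.0 * k)
```

For instance, with `length = interval_length(k, 3)` the value `integral_abs_sin(k, theta, a, a + length)` should agree with `2 / math.pi * length` up to the quadrature error, for any choice of `a` and `theta`.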
Remark. The same reasoning shows that $(2C/\pi)^2$ is optimal. Namely, if $k < 2C/\pi$, we can define $V$ for $x \in (x_{n-1}, x_n)$ by
\[ V(x) = \begin{cases} C/x & \text{if } \sin(2kx + \vartheta_{n-1}) \le 0, \\ -C/x & \text{if } \sin(2kx + \vartheta_{n-1}) > 0. \end{cases} \]
This definition makes sense, because $\vartheta_{n-1}$ depends only on the values of $V(x)$ for $x < x_{n-1}$. The above arguments prove that (1), with this potential and $E = k^2$, has an $L_2$-solution.

Clearly, Lemma 6.2 guarantees that for $E > (2C/\pi)^2$, we can find $\delta = \delta(E) > 0$ so that $\|u\|_x \|v\|_x^{-\delta} \to \infty$ for any two solutions $u$, $v$ of the Schrödinger equation (1). By the refined subordinacy theory [13] (see also [24]), this implies that there is a $\gamma = \gamma(E) > 0$ so that the $\gamma$-derivative of the spectral measure $\rho$ vanishes at $E$:
\[ (D^{\gamma}\rho)(E) = \limsup_{\epsilon \to 0+} \frac{\rho(E - \epsilon, E + \epsilon)}{(2\epsilon)^{\gamma}} = 0. \]
By general properties of Hausdorff measures [27], this means that $\rho$ gives zero weight to every $S \subset ((2C/\pi)^2, \infty)$ with $\dim S = 0$. However, Theorem 1.4 with $\alpha = 1$ says that $\rho_s$ is supported on a set of zero Hausdorff dimension. This completes the proof of Theorem 1.3.

We conclude this paper with a few remarks on how to obtain analogues of Theorems 1.3, 1.4 for the discrete Schrödinger Eq. (25). First of all, generalizing Theorem 1.4 is straightforward. In fact, the proof becomes somewhat easier. Now we try to prove an analogue of Lemma 6.2. We use the notations of Sect. 4. Instead of (38), we now have to analyze the following equation:
\[ \sin k \, \ln R^2(x) = -\sum_{n=1}^{N} \sum_{s=x_{n-1}}^{x_n - 1} V(s) \sin(2ks + \vartheta_{n-1}) + o(\ln x) \qquad (x \to \infty). \]
As above, we have $\vartheta_n = \psi(x_n) - 2k x_n$, and $N$ satisfies $x_N \le x < x_{N+1}$. Also, we assumed that $x_n \in \mathbb{N}$, $x_n \to \infty$, and $x_n/x_{n-1} \to 1$. Proceeding as in the proof of Lemma 6.2, we get
\[ |\ln R^2(x)| \le \frac{C}{\sin k} \sum_{n=1}^{N} \ln \frac{x_n}{x_{n-1}} \cdot \frac{1}{x_n - x_{n-1}} \sum_{s=x_{n-1}}^{x_n - 1} |\sin(2ks + \vartheta_{n-1})| + o(\ln x), \]
provided $|V(n)| \le C/n$. We now suppose that, in addition, $x_n - x_{n-1} \to \infty$. Then, using the unique ergodicity of the rotation on the torus with irrational rotation number (see, e.g., [7]), we deduce that if $k/\pi \notin \mathbb{Q}$, then
\[ \frac{1}{x_n - x_{n-1}} \sum_{s=x_{n-1}}^{x_n - 1} |\sin(2ks + \vartheta_{n-1})| = \frac{1}{\pi} \int_0^{\pi} |\sin\varphi| \, d\varphi + o(1) \qquad (n \to \infty). \]
Hence, for these $k$ we obtain
\[ |\ln R^2(x)| \le \left( \frac{2C}{\pi \sin k} + o(1) \right) \ln x \qquad (x \to \infty). \]
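The ergodic average entering this bound can be observed numerically. The sketch below computes the mean of $|\sin(2ks + \vartheta)|$ over a block of consecutive integers $s$; for $k/\pi$ irrational it should approach $(1/\pi)\int_0^{\pi} |\sin\varphi| \, d\varphi = 2/\pi$ as the block length grows, while for rational $k/\pi$ it need not. The function name and the sample parameters are illustrative assumptions.

```python
import math

def average_abs_sin(k, theta, start, length):
    """Mean of |sin(2 k s + theta)| over s = start, ..., start + length - 1."""
    total = sum(
        abs(math.sin(2.0 * k * s + theta)) for s in range(start, start + length)
    )
    return total / length
```

With $k = \pi/2$ (so $k/\pi \in \mathbb{Q}$) every summand equals $|\sin\vartheta|$, which shows why the irrationality assumption cannot simply be dropped at this step.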
Since a countable set cannot support continuous measures, we can now conclude the proof of the analogue of Theorem 1.3 as in the continuous case. Recall also that E = 2 cos k. We have thus shown
Theorem 6.3. If $C := \limsup_{n \to \infty} n |V(n)| < \pi/2$, then $\sigma_{sc}(H) \cap (-E_0, E_0) = \emptyset$, where $E_0 = 2\sqrt{1 - (2C/\pi)^2}$. In particular, if $V(n) = o(1/n)$, then $H$ is purely absolutely continuous on $(-2, 2)$.

The second part uses the trivial remark that $o(1/n)$ potentials do not have embedded eigenvalues. However, the question of sharp bounds for eigenvalues of $O(1/n)$ potentials is much more subtle than in the continuous case and will not be discussed here.

Acknowledgement. I would like to thank A. Kiselev for stimulating discussions on most topics of this paper, S. Molchanov for showing me his alternate proof of the result of [15], and B. Simon for useful comments on an earlier version of this work. I appreciated the hospitality of Caltech, where most of this work was done. I would like to thank the Deutsche Forschungsgemeinschaft for financial support.
References
1. Behncke, H.: Absolute continuity of Hamiltonians with von Neumann Wigner potentials. Proc. Amer. Math. Soc. 111, 373–384 (1991)
2. Behncke, H.: Absolute continuity of Hamiltonians with von Neumann Wigner potentials II. Manuscr. Math. 71, 163–181 (1991)
3. Carleson, L.: Selected Problems on Exceptional Sets. Princeton: Van Nostrand, 1967
4. Carmona, R.: One-dimensional Schrödinger operators with random or deterministic potentials: New spectral types. J. Funct. Anal. 51, 229–258 (1983)
5. Christ, M., Kiselev, A.: Absolutely continuous spectrum for one-dimensional Schrödinger operators with slowly decaying potentials: Some optimal results. Preprint
6. Christ, M., Kiselev, A., Remling, C.: The absolutely continuous spectrum of one-dimensional Schrödinger operators with decaying potentials. Math. Research Letters (to appear)
7. Cornfeld, I.P., Fomin, S.V., Sinai, Y.G.: Ergodic Theory. New York: Springer-Verlag, 1982
8. del Rio, R., Simon, B., Stolz, G.: Stability of spectral types for Sturm-Liouville operators. Math. Res. Lett. 1, 437–450 (1994)
9. Delyon, F.: Apparition of purely singular continuous spectrum in a class of random Schrödinger operators. J. Stat. Phys. 40, 621–630 (1985)
10. Delyon, F., Simon, B., Souillard, B.: From power pure point to continuous spectrum in disordered systems. Ann. Inst. H. Poincaré 42, 283–309 (1985)
11. Eastham, M.S.P., Kalf, H.: Schrödinger type operators with continuous spectra. Res. Notes in Math, vol. 65, London: Pitman, 1982
12. Gilbert, D., Pearson, D.: On subordinacy and analysis of the spectrum of one-dimensional Schrödinger operators. J. Math. Anal. Appl. 128, 30–56 (1987)
13. Jitomirskaya, S., Last, Y.: Dimensional Hausdorff properties of singular continuous spectra. Phys. Rev. Lett. 76, 1765–1769 (1996)
14. Kiselev, A.: Absolutely continuous spectrum of one-dimensional Schrödinger operators and Jacobi matrices with slowly decreasing potentials. Commun. Math. Phys. 179, 377–400 (1996)
15. Kiselev, A.: Preservation of the absolutely continuous spectrum of Schrödinger equation under perturbations by slowly decreasing potentials and a.e. convergence of integral operators. Duke Math. J. (to appear)
16. Kiselev, A., Last, Y., Simon, B.: Modified Prüfer and EFGP transforms and the spectral analysis of one-dimensional Schrödinger operators. Commun. Math. Phys. (to appear)
17. Kiselev, A., Remling, C., Simon, B.: Effective perturbation methods for one-dimensional Schrödinger operators. Preprint
18. Kotani, S., Ushiroya, N.: One-dimensional Schrödinger operators with random decaying potentials. Commun. Math. Phys. 115, 247–266 (1988)
19. Last, Y., Simon, B.: Eigenfunctions, transfer matrices, and absolutely continuous spectrum of one-dimensional Schrödinger operators. Preprint
20. Molchanov, S.: One-dimensional Schrödinger operators with sparse potentials. Preprint
21. Molchanov, S.: Private communication and in preparation
22. Naboko, S.N.: Dense point spectra of Schrödinger and Dirac operators. Theor. and Math. Phys. 68, 646–653 (1986)
23. Pearson, D.B.: Value distribution and spectral analysis of differential operators. J. Phys. A 26, 4067–4080 (1993)
24. Remling, C.: Relationships between the m-function and subordinate solutions of second order differential operators. J. Math. Anal. Appl. 206, 352–363 (1997)
25. Remling, C.: Some Schrödinger operators with power-decaying potentials and pure point spectrum. Commun. Math. Phys. 186, 481–493 (1997)
26. Remling, C.: Embedded singular continuous spectrum for one-dimensional Schrödinger operators. Trans. Am. Math. Soc. (to appear)
27. Rogers, C.A.: Hausdorff Measures. London: Cambridge University Press, 1970
28. Simon, B.: Some Jacobi matrices with decaying potentials and dense point spectrum. Commun. Math. Phys. 87, 253–258 (1982)
29. Simon, B.: Bounded eigenfunctions and absolutely continuous spectra for one-dimensional Schrödinger operators. Proc. Am. Math. Soc. 124, 3361–3369 (1996)
30. Simon, B.: Some Schrödinger operators with dense point spectrum. Proc. Am. Math. Soc. 125, 203–208 (1997)
31. Stolz, G.: Bounded solutions and absolute continuity of Sturm-Liouville operators. J. Math. Anal. Appl. 169, 210–228 (1992)
32. von Neumann, J., Wigner, E.: Über merkwürdige diskrete Eigenwerte. Z. Phys. 30, 465–467 (1929)
33. Weidmann, J.: Zur Spektraltheorie von Sturm-Liouville Operatoren. Math. Z. 98, 268–302 (1967)
34. Weidmann, J.: Spectral Theory of Ordinary Differential Operators. Springer Lect. Notes, Vol. 1258, Berlin: Springer-Verlag, 1987
35. Zygmund, A.: Trigonometric Series, Vol. I, II. Cambridge: Cambridge University Press, 1959

Communicated by B. Simon
Commun. Math. Phys. 193, 171 – 196 (1998)
Communications in Mathematical Physics © Springer-Verlag 1998
Principal Bundles and the Dixmier Douady Class
Alan L. Carey¹, Diarmuid Crowley², Michael K. Murray¹
¹ Department of Pure Mathematics, University of Adelaide, Adelaide, South Australia 5005, Australia. E-mail: [email protected], [email protected]
² Department of Mathematics, Indiana University, Bloomington, IN 47405-4301, USA. E-mail: [email protected]
Received: 26 February 1997 / Accepted: 5 August 1997
Abstract: A systematic consideration of the problem of the reduction and “lifting” of the structure group of a principal bundle is made and a variety of techniques in each case are explored and related to one another. We apply these to the study of the Dixmier-Douady class in various contexts including string structures, $U_{res}$ bundles and other examples motivated by considerations from quantum field theory.
1. Introduction

This paper develops the theory of principal bundles with the aim of studying various manifestations of the Dixmier-Douady class. The motivating example is that of principal bundles whose structure group is an infinite dimensional Lie group much studied by many authors in connection with string theory (it is the restricted unitary group in the terminology of Pressley and Segal [22]). Our results include a demonstration that there are interesting examples of such bundles and we relate them to string structures. We also discuss obstructions (or characteristic classes) arising from them. Finally we connect our work to the notion of bundle gerbe, bundles with structure group the projective unitaries and to infinite dimensional Clifford bundles. A statement of the main results of the paper is given later in this introduction. We now digress a little to explain the history which led to this paper. Some time ago Gross [13] suggested that quantum electrodynamics lends itself to a formulation in terms of infinite dimensional Clifford bundles. It was Segal [23] who showed that using bundles with fibre the projective space of a Fock space (carrying a representation of the Clifford algebra) in non-abelian gauge theories one could explain the origin of Hamiltonian anomalies (in particular that discovered by Faddeev and Mickelsson [11]). Mickelsson, in his study of anomalies and gauge theories, found it useful to introduce the idea of a Fock bundle. These are also related to bundles whose fibre is an infinite dimensional Clifford algebra. In this paper we approach the study of these bundles
through the theory of infinite dimensional principal bundles whose structure group is the restricted unitary group. In [6] two of the authors began a related study: that of string structures. Our ideas were partly influenced by the history above. An abstract version of the problem discussed in [6] is to start with a principal bundle $P$ over a manifold $M$ with structure group $G$. Let $\hat{G}$ be a central extension of $G$ by $U(1)$. Then one can ask when there exists a principal bundle $\hat{P}$ with structure group $\hat{G}$ such that $\hat{P}/U(1) = P$ (we call this the lifting problem). J-L. Brylinski [2] observed that the obstruction $D(P)$ to the existence of $\hat{P}$ may be identified with a class in $H^3(M, \mathbb{Z})$ (Čech cohomology) studied in a different context by Dixmier and Douady [10]. Recently in [7] and [8] the connection between Hamiltonian anomalies, the characteristic classes arising in the Atiyah-Singer families index theorem and the Dixmier-Douady class was established. In this paper we attempt to unify some of these various manifestations of the Dixmier-Douady class. We start by showing (in Theorem 4.1) that if $G$ is simply connected the Dixmier-Douady class of a principal bundle $P$ is the transgression of the Chern class of the $U(1)$ principal bundle $\hat{G} \to G$. Next we develop an obstruction theory for the extension problem showing (Sect. 5) that it too leads to the Dixmier-Douady class. We find that the physical examples discussed above can all be related to principal bundles with structure group the restricted unitary group. To explain this let $H$ be a complex Hilbert space and $P_+$ an orthogonal projection on $H$ with infinite dimensional kernel and co-kernel. Denote by $U_{res}$ the group of unitary operators $U$ on $H$ such that $U P_+ - P_+ U$ is Hilbert-Schmidt and by $PU$ the projective unitary group. The existence of $U_{res}$ bundles with non-trivial Dixmier-Douady class and their relation to the work of Brylinski et al. is covered in Sects. 7 and 8.
This is handled by exploiting the existence of an embedding of the smooth loop group $L_d G$ of a compact Lie group $G$ into $U_{res}$. There are canonical central extensions of both $L_d G$ ([22]) and $U_{res}$ which are compatible with the embedding of the former in the latter. Now Killingback [15] argued that the obstruction to extending a principal $L_d G$ bundle over the space of smooth loops in $M$ to a principal bundle having fibre equal to this extension transgresses to half the Pontrjagin class of $M$. On the other hand it was shown by Brylinski [2] that the obstruction is the Dixmier-Douady class. Following McLaughlin [17] and [6] we can prove equality of these establishing as a corollary the existence of principal $U_{res}$ bundles with non-trivial Dixmier-Douady class. Our next result concerns the connection between $U_{res}$ bundles and $PU$ bundles. There is a standard inclusion of $U_{res}$ into $PU$ which we review in Sect. 9. Let $P(M, U_{res})$ be a principal $U_{res}$-bundle over $M$. We use the prefix $\Sigma$ to denote the reduced suspension of a space and $\Sigma^q$ to denote the suspension isomorphism on cohomology $\Sigma^q : H^q(M, \mathbb{Z}) \cong H^{q+1}(\Sigma M, \mathbb{Z})$. One of our main results (Sect. 11) is that for $M$ compact, there is an associated $U(\infty)$-bundle, $\Sigma P(\Sigma M, U(\infty))$ (an element of $\tilde{K}^1(M)$) over $\Sigma M$ with $\Sigma^3(D(P)) = c_2(\Sigma P)$ (the right hand side being the second Chern class of $\Sigma P$). We deduce from this that the structure group of a $PU$-bundle, $Q$, reduces to $U_{res}$ if and only if there is a $U_{res}$ bundle, $P$, whose Dixmier-Douady class coincides with that of $Q$. This happens if and only if there is a $U(\infty)$-bundle, $\Sigma P(\Sigma M, U(\infty))$ over $\Sigma M$ such that $c_2(\Sigma P) = \Sigma^3(D(Q))$. There are interesting connections between this paper and a number of other recent results. For example another way of viewing the lifting of a principal $G$ bundle to a
principal $\hat{G}$ bundle is to use the recently introduced notion of a bundle gerbe [20]. In this exposition we have avoided use of that viewpoint although it has partly motivated our arguments in Sect. 4 and we discuss it briefly in Sect. 12. The original construction of the Dixmier-Douady class [10] was in connection with bundles of $C^*$-algebras with fibre the compact operators and hence with principal bundles whose fibre is $PU$. In the case of principal $U_{res}$ bundles the associated $C^*$-algebra bundles have fibre the infinite dimensional Clifford algebra. Specifically, in Sect. 12 we associate to any principal $U_{res}$ bundle over $M$ a bundle whose fibre is the $C^*$-algebra of the canonical anticommutation relations (CAR) over $H$ (an algebra isomorphic to the infinite dimensional Clifford algebra). The vanishing of the Dixmier-Douady class allows us to construct an associated Hilbert bundle over $M$ whose fibre is a representation space for the CAR-algebra (in fact it is a Fock space) such that the sections of the CAR-bundle over $M$ act on sections of the Fock bundle in the obvious way. Finally, one of the most interesting by-products of our investigation is the explicit construction in Sect. 6 of the classifying space of $PU$.

2. Preliminary Material on Principal G-Bundles

We recall some facts about principal $G$ bundles starting with the definition. A (topological) principal $G$ bundle over a topological space $M$ is a triple $P(M, G)$, where $G$ is a topological group (the structure group) and $P$ (the total space) and the base $M$ are topological spaces with a continuous surjection $\pi : P \to M$. The group $G$ acts continuously and freely on the right of $P$ and the orbits of this action are precisely the fibres of the map $\pi$.
We require that the bundle is locally trivial in the sense that there is a locally finite cover $\{U_\alpha \mid \alpha \in A\}$ of $M$ with the property that if $P_\alpha = \pi^{-1}(U_\alpha)$ then there are homeomorphisms $P_\alpha \to U_\alpha \times G$ which send $p$ to $(\pi(p), s_\alpha(p))$ and which commute with the action of $G$ so that $s_\alpha(pg) = s_\alpha(p)g$. Note that the trivial bundle $M \times G$ is naturally a principal $G$ bundle over $M$ if we define the obvious right action $(m, h)g = (m, hg)$. Two principal bundles $P(M, G)$ and $Q(M, G)$ are said to be isomorphic if there is a homeomorphism $f : P \to Q$ commuting with the $G$ action and the projection map so that the induced action on $M$ is the identity. We will be interested in isomorphism classes of principal bundles which may be classified in two ways that we detail in the next sections. All that we have said so far holds also in the category of manifolds and smooth maps with the corresponding modifications to the definitions. In particular many of the principal bundles that we discuss below arise as the quotient of a Lie group $G$ by a closed subgroup $H$. To show that $G \to G/H$ is a principal $H$ bundle over $G/H$ one needs to demonstrate that this fibration is locally trivial in the topological sense. In all the cases which arise in this paper both $G$ and $H$ are Banach Lie groups and the result follows by a theorem of E. Michael ([18]) on the existence of local continuous sections for the fibration $G \to G/H$.

2.1. Principal bundles and non-abelian cohomology. Notice that the function $s_\alpha s_\beta^{-1} : P_\alpha \cap P_\beta \to G$ is constant on fibres and hence descends to define the transition functions of $P$ with respect to the cover by $g_{\alpha\beta} : U_\alpha \cap U_\beta \to G$. It is straightforward to check that the transition functions $g_{\alpha\beta}$ form a Čech cocycle for the sheaf $G$ of continuous $G$-valued functions on $M$. It is also straightforward to check
that if the trivialisations are changed then the cocycle changes by a coboundary. Hence a principal bundle defines a class in $H^1(M, G)$. Moreover it is possible to show by the standard “clutching construction” (see for example [14]) that every cohomology class arises in this way. We have:

Proposition 2.1. The isomorphism classes of principal $G$ bundles over $M$ are in bijective correspondence with the elements of $H^1(M, G)$.

It is important to note that the cohomology space $H^1(M, G)$ is not a group. It is a pointed set, pointed by the equivalence class of the identity cocycle which corresponds under the isomorphism from Proposition 2.1 to the trivial $G$ bundle.

2.2. Classifying spaces for principal G bundles. Another way of describing the isomorphism classes of principal bundles is to use classifying spaces. If $f : N \to M$ is a map and $P(M, G)$ is a principal bundle then there is a pull-back bundle $f^*(P)(N, G)$ defined by $f^*(P) = \{(p, n) : \pi(p) = f(n)\} \subset P \times N$. We make $f^*(P)$ a topological space or manifold by its definition as a subspace or submanifold of $P \times N$. The action of $G$ is $(p, n)g = (pg, n)$. A principal $G$ bundle $EG(BG, G)$ is called a classifying space for principal $G$ bundles if it has the property that for any principal bundle $P(M, G)$ there is a map $f$, unique up to homotopy, such that $f^*(EG)$ is isomorphic to $P$. The map $f$ is called a classifying map for $P$. A standard fact, see for example [14], is that classifying spaces exist and are unique up to homotopy equivalence. It is sufficient for our purposes to work in the category of spaces with the homotopy type of a CW-complex, denoted CW (see, for example, [25] p. 400). Any map between two connected CW-complexes whose associated maps on the homotopy groups are all isomorphisms (a weak homotopy equivalence) is, in fact, a homotopy equivalence ([25] p. 405). For example, differentiable manifolds have the homotopy type of a CW-complex and CW is closed under the operation of forming loop spaces.
An extremely useful characterisation of classifying spaces within the category CW is the fact that a principal G-bundle, P (M, G) is a classifying space if and only if P is weakly contractible (i.e. πq (P ) = 0 for all q). Recall Kuiper’s Theorem [16] which states that U (H), the full unitary group of a separable Hilbert space is contractible in the uniform topology. This makes U (H) a candidate for the total space of CW –universal bundles. If EG(BG, G) is a classifying space for G-bundles we may summarise our discussion as: Proposition 2.2. The set of isomorphism classes of principal G-bundles over M is in bijective correspondence with the set of homotopy classes of maps from M to BG. 2.3. Characteristic classes of principal G bundles. A characteristic class, c, for principal G bundles assigns to any principal G bundle P (M, G) an element c(P ) in H ∗ (M ), the cohomology of M . This assignment is required to be natural in the sense that if f : N → M and P is a G bundle over M then c(f ∗ (P )) = f ∗ (c(P )). Note that, among other things, this implies that c(P ) depends only on the isomorphism class of P . The results above on classifying spaces give us a complete characterisation of all characteristic classes. If c is a characteristic class we can apply it to EG and obtain
an element $\xi = c(EG) \in H^*(BG)$. Conversely if $\xi \in H^*(BG)$ then we can define a characteristic class by defining $c(P) = f^*(\xi)$ where $f$ is a classifying map for $P$. So characteristic classes are in bijective correspondence with the cohomology of $BG$.

2.4. Associated fibrations. We shall need to consider other fibrations that arise as associated fibrations to a principal bundle. If $P(M, G)$ is a principal bundle and $G$ acts on the left of a space $X$ then $G$ acts on $P \times X$ by $(p, x)g = (pg, g^{-1}x)$ and the quotient $(P \times X)/G$ is a fibration over $M$ with fibre isomorphic to $X$.

3. Changing the Structure Group

Let $\phi : H \to G$ be a topological group homomorphism. If $Q(M, H)$ is an $H$ bundle consider the problem of finding a $G$ bundle $P$ and a map $\tilde{\phi} : Q \to P$ such that
1. $\tilde{\phi}(Q_m) \subset P_m$ for all $m$ in $M$, and
2. $\tilde{\phi}(qh) = \tilde{\phi}(q)\phi(h)$ for all $q$ in $Q$ and $h$ in $H$.
This problem can always be solved in a canonical way. To define $P$ we let $H$ act on the left of $G$ by $hg = \phi(h)g$ and define $P$ to be the associated fibration to this action. The group $G$ acts on $Q \times G$ by $(p, g)g' = (p, gg')$. The action of $G$ commutes with the action of $H$ and makes $P$ into a principal $G$ bundle. We denote it by $\phi_*(Q)$. It is straightforward to show that if we choose local trivialisations of $Q$ with transition functions $h_{\alpha\beta}$ they define local trivialisations of $P$ with transition functions $\phi \circ h_{\alpha\beta}$. In other words $P$ is the image of $Q$ under the induced map $\phi : H^1(M, H) \to H^1(M, G)$. In terms of classifying spaces we have the following theorem:

Theorem 3.1. Let $\phi : H \to G$ be a group homomorphism. Then there is a map $B\phi : BH \to BG$ with the property that if $f : M \to BH$ is a classifying map for an $H$ bundle $Q$ then $B\phi \circ f : M \to BG$ is a classifying map for the $G$ bundle $\phi_*(Q)$.

Proof. This follows from the standard constructions of the classifying map and the classifying space (see for example [14]).

More interesting is the “inverse” problem to this.
If P(M, G) is a principal G bundle can we find a principal H bundle Q such that φ_*(Q) is isomorphic to P? A number of ways of deciding when this is possible are known. First, in terms of Čech cohomology: a bundle Q exists if the bundle P(M, G) lies in the image of φ : H^1(M, H) → H^1(M, G). Second, in terms of classifying spaces we have

Theorem 3.2. Let φ : H → G be a group homomorphism. If f : M → BG is a classifying map for P then an H bundle Q exists with φ_*(Q) ≅ P if and only if f lifts to a map f̂ : M → BH such that Bφ ∘ f̂ = f.

Proof. This follows from Theorem 3.1.
A. L. Carey, D. Crowley, M. K. Murray
The third method, which will be explained in the examples below, is to formulate the problem as that of finding a section of a fibration and to employ obstruction theory. We are interested in two particular cases of this general problem:
1. H is a closed Lie subgroup of G.
2. Ĝ → G is a central extension with kernel U(1).
In the first of these cases we say that the structure group G reduces to H and in the second that it lifts to Ĝ.

3.1. Reducing the structure group. Let H be a closed Banach Lie subgroup of a Banach Lie group G. If Q(M, H) is a principal bundle with a bundle map from Q(M, H) to P(M, G) then this map identifies Q with its image inside P. This image is a reduction of P to H. That is, it is a submanifold of P which is stable under H and forms, with this H action, a principal H bundle over M. It is clear that the problem of reducing the structure group of P to H is equivalent to the problem of finding a reduction of P to H. Given a bundle P(M, G), consider a fibre P_m. A reduction of P involves selecting an H orbit in P_m for each m. The set of all H orbits in P_m is P_m/H and a reduction of P therefore corresponds to a section of the fibering P/H → M whose fibre at m is P_m/H. Applying this to the classifying space of G we see that EG → EG/H is a principal H bundle with contractible total space and hence a classifying space for H. The map H ⊂ G induces a map BH → BG which under these identifications is the map EG/H → BG. It is now straightforward to show that the following theorem holds.

Theorem 3.3. Let P(M, G) be a principal G-bundle with classifying map f : M → BG. Then the following conditions are equivalent to the structure group of P reducing to H:
1. The fibration P/H → M has a global section.
2. The classifying map, f, has a lift, f̂, to BH = EG/H.
If, in addition, H is normal in G, then a final equivalent condition is ρ[P] = 0, where ρ is the map in first cohomology induced by the canonical projection G → G/H, ρ : H^1(M, G) → H^1(M, G/H).

Proof.
(1) Defining a reduction of P to H means picking out, for each m in M, an orbit of H inside P_m or equivalently an element of P_m/H. But the latter defines a section of P/H. (2) This is Theorem 3.2. (3) If P has a reduction to H then we can always choose our local trivialisations so that the transition functions take values in H. Hence ρ(P) = 0. Conversely if the transition functions are g_{αβ} and ρ(P) = 0 then we must have g_{αβ} = g_β h_{αβ} g_α^{-1}, where g_α : U_α → G and h_{αβ} : U_α ∩ U_β → H. Let the transition functions be defined by local trivialisations p ↦ (π(p), s_α(p)) so that g_{αβ} ∘ π = s_β s_α^{-1}. If we modify these by letting s′_α = s_α g_α and s′_β = s_β g_β then we find that the new transition functions are h_{αβ} as required.
Before we can apply Theorem 3.3 usefully we need the obstructions to lifting maps from the base space of a fibre bundle to the total space (loc cit Steenrod, pp. 177–181). Briefly, assume that M is a CW complex and that we are trying to lift a map f : M → B to the total space of the fibre bundle π_E : E → B with fibre F such that the lift, f̂, satisfies f = π_E ∘ f̂. We define f̂ over the zero skeleton of M by lifting f arbitrarily. Extending over the 1-skeleton of M is only a problem if the fibre, F, is not connected. In general, there is no difficulty in extending a map from the n-skeleton to the (n + 1)-skeleton of M if π_n(F) is zero. We will be interested in the case that F has non-vanishing homotopy only in one dimension, that is, it is an Eilenberg-MacLane space. Recall that if A is a group and n > 0 then we denote by K(A, n) the Eilenberg-MacLane space whose only non-vanishing homotopy occurs in dimension n, where π_n(K(A, n)) = A. In this case the general theorem from [28], page 302, becomes:

Theorem 3.4. Let f : M → B be a continuous map, where M is a CW complex, and let π_E : E → B be a fibration over B with fibre F = K(A, n) (i.e. an Eilenberg-MacLane space) with n > 0 and A abelian. Then there exists a cohomology class, o(f, E) ∈ H^{n+1}(M, A), which depends only on the homotopy class of f and which has the property that f has a lift f̂ : M → E if and only if o(f, E) = 0. Moreover if g : M′ → M is continuous, then o(f ∘ g, E) = g^*(o(f, E)) ∈ H^{n+1}(M′, A).

Note 3.1. Notice that it suffices to define o(id, E), where id : B → B is the identity map. Then o(f, E) = f^*(o(id, E)).

Note 3.2. We use the notation H^{n+1}(M, A) to denote the fact that the cohomology may take values, not simply in π_n(F) = A but in a possibly twisted A bundle over B. However, when this bundle is trivial we recover standard cohomology and this is the case precisely when the action of π_1(B) on the fibre is trivial.
Fibrations of this sort may be called principal K(A, n)-fibrations and it is easy to check that the pull-back of a principal K(A, n)-fibration is itself a principal K(A, n)-fibration. It follows that when K(A, n) is realised as a topological group, principal K(A, n)-bundles are principal K(A, n)-fibrations (since π_1(BK(A, n)) = 0). The following lemma allows one to compute the homotopy groups of the fibre of the map Bφ : BH → BG in the case that φ is an inclusion.

Lemma 3.1. Let i : H ↪ G be an inclusion of topological groups. Then there is a commutative diagram of homotopy groups for all q ≥ 0.
\[
\begin{array}{ccccccc}
1 & \longrightarrow & \pi_q(BH) & \xrightarrow{\ \delta\ } & \pi_{q-1}(H) & \longrightarrow & 1 \\
\downarrow & & \downarrow{\scriptstyle Bi_*} & & \downarrow{\scriptstyle i_*} & & \downarrow \\
1 & \longrightarrow & \pi_q(BG) & \xrightarrow{\ \delta\ } & \pi_{q-1}(G) & \longrightarrow & 1
\end{array}
\]
Proof. Setting B := (EH × G)/H = B(BH, G, Bi) let Bi′ be the bundle morphism Bi′ : B → EG covering Bi and let I be the obvious bundle morphism I : EH → B covering id_{BH}. Then Bi′ ∘ I : EH → EG is a bundle morphism covering Bi. The commutative diagram above is just the commutative diagram of the long exact sequences of the fibrations EG(BG, G) and EH(BH, H) with the map of fibre bundles Bi′ ∘ I (including the weak contractibility of EG and EH).
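As an illustration of how Lemma 3.1 is used later, the following is a short worked computation for the group U(1). It is a sketch that assumes only the isomorphism δ of the lemma (for q ≥ 1) and the standard homotopy groups of the circle; it is not taken from the original text.

```latex
% Worked example (illustrative): homotopy groups of BU(1) via Lemma 3.1.
% Assumes \pi_1(U(1)) = \mathbb{Z} and \pi_k(U(1)) = 0 for k \neq 1.
\begin{align*}
\pi_q(BU(1)) &\cong \pi_{q-1}(U(1))
   && \text{(the isomorphism } \delta \text{, } q \ge 1\text{)} \\
 &\cong
 \begin{cases}
   \mathbb{Z}, & q = 2, \\
   0,          & q \neq 2 .
 \end{cases}
\end{align*}
% Hence the only non-vanishing homotopy group of BU(1) is \pi_2 = \mathbb{Z}.
```

This is consistent with the statement used in Section 4.1 that BU(1) is a K(Z, 2).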
3.2. Obstruction and transgression. Recall the spectral sequence of a fibration [19]. We are interested in the case that F is a K(Z, 2). For convenience we state it in terms of cohomology with values in a ring R although in the examples we want R will be π_2(F) = Z. If π : E → B is a fibration with fibre F there is a spectral sequence with E_2^{p,q} = H^p(B, H^q(F, R)) converging to a grading of the total cohomology of E. As H^1(F, R) = 0 the differential d_3 of this spectral sequence defines a map τ = d_3 : H^2(F, R) → H^3(B, R) called the transgression [19]. Note that this is a different transgression from that mentioned in the introduction. The following alternative definition of transgression from [19] will also be necessary to prove Theorem 5.1. We start with π : E → B as above and let F = π^{-1}(b_0) be the fibre above some fixed b_0. Then there are homomorphisms:
\[
H^2(F, R) \xrightarrow{\ \delta\ } H^3(E, F, R) \xleftarrow{\ \pi^*\ } H^3(B, \{b_0\}, R) \xrightarrow{\ j^*\ } H^3(B, R).
\]
In our case π^* is injective on H^3(B, {b_0}, R) and the image of δ is inside the image of π^*. We then have that τ(u) = j^*(π^*)^{-1}δ(u), where u ∈ H^2(F, R) is such that δ(u) ∈ π^*(H^3(B, {b_0}, R)). Because F is a K(Z, 2) it has a fundamental class η ∈ H^2(F, π_2(F)) [19]. We then have

Theorem 3.5. Let π_E : E → B be a fibre bundle with fibre F a K(Z, 2) and let η be the canonical generator of H^2(F, π_2(F)) = Z. Then o(id, E) ∈ H^3(B, π_2(F)) is the transgression of η.

Proof. This is claimed in [19], pages 103 and 109, although a proof is not given. We want to be sure that o(id, E) is the transgression of η and not −η, so we give a sketch proof here that will be sufficient to check the sign. We will consider the case that the Hurewicz homomorphism π_3(B) → H_3(B, Z) is onto. This will suffice for the cases we are interested in. To make sure we have the correct sign we need to be careful about various identifications. We are assuming that π_2(F) = Z. Then H^2(F, π_2(F)) has a canonical generator which is the cohomology class η with the property that, if applied to a two-sphere in F, it returns the homotopy class of that two-sphere. So let us choose a generator ρ for π_2(F). Let τ(η) = x; then as we have seen above π^*(x) = δ(η). Choose a three-sphere Σ ⊂ B with b_0 ∈ Σ and a three-ball Σ̂ ⊂ E with π(Σ̂) = Σ and boundary, ∂Σ̂, in F. Then by the definition of the obstruction class [28] we have
\[
o(\mathrm{id}, E)(\Sigma) = [\partial \hat{\Sigma}],
\]
where square brackets denote the homotopy class of ∂Σ̂ in π_2(F). On the other hand we have π(Σ̂) = Σ so that
\[
x(\Sigma) = \delta(\eta)(\hat{\Sigma}) = \eta(\partial \hat{\Sigma}),
\]
and hence o(id, E)(Σ) = x(Σ). Thus we have chosen the correct sign.
4. Extending the Structure Group

Let
\[
1 \longrightarrow U(1) \longrightarrow \hat{G} \xrightarrow{\ \rho\ } G \longrightarrow 1 \tag{4.1}
\]
be a short exact sequence of Lie groups with U(1) central. If P(M, G) is a principal bundle we are interested in the problem of finding a lift of P to a Ĝ bundle P̂ over M. We shall present two methods of defining a characteristic class: the Dixmier-Douady class and the obstruction class, both of which are obstructions to finding such a lift. We then show that they are, in fact, equal.

4.1. The obstruction class.

Proposition 4.1 ([9]). We can realise BĜ as a principal BU(1)-bundle over BG.

Proof. Steenrod [27] showed that Milgram's realisation of the classifying space makes E a functor from the category of topological groups and continuous homomorphisms to itself. In fact we have the following commutative diagram where the vertical arrows are the inclusion of a fibre.
\[
\begin{array}{ccccccccc}
1 & \longrightarrow & U(1) & \xrightarrow{\ i\ } & \hat{G} & \xrightarrow{\ \rho\ } & G & \longrightarrow & 1 \\
\downarrow & & \downarrow & & \downarrow & & \downarrow & & \downarrow \\
1 & \longrightarrow & EU(1) & \xrightarrow{\ Bi\ } & E\hat{G} & \xrightarrow{\ B\rho\ } & EG & \longrightarrow & 1
\end{array}
\]
Functoriality allows us to move from U(1) central in Ĝ to EU(1) central in EĜ and thus U(1) is normal in EĜ. Since we have a closed inclusion, U(1) ↪ Ĝ, EĜ/U(1) is a realisation of BU(1) as a topological group. Moreover, EĜ/U(1) is a principal G-bundle over BĜ with G canonically identified as a subgroup. We may form the associated bundle B := (EG × EĜ/U(1))/G → BG which is a principal EĜ/U(1)-bundle over BG. However, we may also project B onto BĜ and the fibre is EG which is contractible. So B has the homotopy type of BĜ and hence is another realisation of BĜ, proving the result.

From Theorem 3.2 we see that if P(M, G) is a principal G-bundle with classifying map f : M → BG then P lifts to Ĝ if and only if there is a lift of f to BĜ. To find when such a lift occurs we can use Theorem 3.4 from obstruction theory. The classifying space BU(1) is a K(Z, 2). Since we have realised BĜ as a principal BU(1)-bundle it follows from Note 3.2 that there is no twisting in the coefficient group and that the obstruction to lifting f is a class o(f, BĜ) ∈ H^3(M, π_2(BU(1))). The results of Theorem 3.4 imply that this is a characteristic class. We normalise it by choosing a generator for π_2(BU(1)) so that the Chern class on this generator is equal to one. Let µ be the resulting element of H^2(BU(1), Z) and define O(P) = f^*(τ(µ)) ∈ H^3(M, Z). So O(P) is just o(f, BĜ) ∈ H^3(M, π_2(BU(1))) evaluated with respect to our particular choice of generator for π_2(BU(1)).

4.2. The Dixmier-Douady class. Because U(1) is central in Ĝ it is possible to show that there is a short exact sequence of pointed sets [12]
\[
H^1(M, U(1)) \longrightarrow H^1(M, \hat{G}) \longrightarrow H^1(M, G) \xrightarrow{\ \delta\ } H^2(M, U(1)). \tag{4.2}
\]
The definition of an exact sequence of pointed sets is that if X, Y and Z are sets with points x, y and z and
\[
X \xrightarrow{\ f\ } Y \xrightarrow{\ g\ } Z \tag{4.3}
\]
is a sequence of pointed maps (that is f(x) = y and g(y) = z) then this sequence is exact at Y if f(X) = g^{-1}(z). This clearly agrees with the definition for groups if the point of a group is the identity. The map δ is defined as follows. Choose a Leray cover {U_α} and local sections s_α : U_α → P. Then the transition functions of the bundle are defined by s_β = s_α g_{αβ}. We can lift these to maps ĝ_{αβ} : U_α ∩ U_β → Ĝ. Of course these may not be transition functions for a Ĝ bundle. Their failure to be so is measured by the cocycle
\[
e_{\alpha\beta\gamma} = \hat{g}_{\beta\gamma}\, \hat{g}_{\alpha\gamma}^{-1}\, \hat{g}_{\alpha\beta}
\]
which takes values in U(1). Because U(1) is central it can be shown that e_{αβγ} defines a class in H^2(M, U(1)) which vanishes precisely when we can lift the bundle P to Ĝ. We can use the short exact sequence of groups 0 → Z → R → U(1) → 0 to define an isomorphism H^2(M, U(1)) ≅ H^3(M, Z). The result of applying this isomorphism to e_{αβγ} defines a characteristic class D(P) ∈ H^3(M, Z) called the Dixmier-Douady class. Explicitly if we choose w_{αβγ} so that e_{αβγ} = exp(2πi w_{αβγ}) then the Dixmier-Douady class has a representative
\[
d_{\alpha\beta\gamma\delta} = w_{\beta\gamma\delta} - w_{\alpha\gamma\delta} + w_{\alpha\beta\delta} - w_{\alpha\beta\gamma}. \tag{4.4}
\]
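The following short computation indicates why the representative (4.4) is integer valued. It is a sketch, assuming only that the U(1)-valued cochain e satisfies the multiplicative Čech cocycle condition on quadruple overlaps and that the real-valued functions w are chosen with e = exp(2πi w) as in the text.

```latex
% Sketch: integrality of the Dixmier-Douady representative (4.4).
% Assumption: e is a Cech 2-cocycle with values in U(1).
\begin{align*}
1 &= e_{\beta\gamma\delta}\, e_{\alpha\gamma\delta}^{-1}\,
     e_{\alpha\beta\delta}\, e_{\alpha\beta\gamma}^{-1}
   && \text{(cocycle condition for } e\text{)} \\
  &= \exp\!\bigl(2\pi i\,(w_{\beta\gamma\delta} - w_{\alpha\gamma\delta}
       + w_{\alpha\beta\delta} - w_{\alpha\beta\gamma})\bigr)
   && \text{(since } e_{\alpha\beta\gamma} = \exp(2\pi i\, w_{\alpha\beta\gamma})\text{)} \\
  &= \exp\!\bigl(2\pi i\, d_{\alpha\beta\gamma\delta}\bigr).
\end{align*}
% Hence d_{\alpha\beta\gamma\delta} takes values in \mathbb{Z} on each
% quadruple overlap, and being continuous it is locally constant there.
```

Since d is the Čech coboundary of the real cochain w, it is automatically closed, so it represents a class in H^3(M, Z).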
Note that if p is a point in the fibre P_m above m then there is a homeomorphism G → P_m defined by g ↦ pg. If G is connected then changing p gives a homotopic homeomorphism and hence there is a unique identification of the cohomology of G with the cohomology of P_m. We want to prove

Theorem 4.1. Let P → M be a principal G bundle with G one-connected. Let Ĝ → G be a central extension of G by U(1). Let [µ] be the cohomology class (the Chern class of Ĝ → G) that the central extension defines on G and hence also on any fibre of P → M. Then the transgression of [−µ] is the Dixmier-Douady class of the bundle P → M.

Proof. To do this we need to use the second definition of transgression in Subsect. 3.2 in terms of relative cohomology. For our purposes it is most useful to realise the cohomology here using Čech cocycles. Recall that if X is a topological space and A is a subspace we can define the relative integral cohomology H(X, A, Z) as follows. We take a cover U(T) of X and a subcover U(T)′ ⊂ U(T) of A and consider the induced map of complexes of Čech cocycles for the group Z: C^p(X, U(T)) → C^p(A, U(T)′).
We define C^p((X, A), (U(T), U(T)′)) to be the kernel of this map and define H^p((X, A), (U(T), U(T)′), Z) to be the cohomology of the complex C^p((X, A), (U(T), U(T)′)). The cohomology group H^p(X, A, Z) is now defined in the usual way by taking the direct limit as the covers are refined. In the particular case of interest choose a cover U(T) of M with respect to which the Dixmier-Douady class can be represented by a cocycle d_{αβγδ} ∈ C^3(M, U(T)) as in (4.4). Choose α_0 so that m_0 ∈ U_{α_0} and let U(T)′ = {U_{α_0}}. Then the restriction of any cocycle in C^3(M, U(T)) to C^3(m_0, U(T)′) is automatically zero. Consider the transition functions g_{αβ}. The pullback of g_{αβ} to P is trivial because it satisfies π^* g_{αβ} = σ_α σ_β^{-1}, where σ_α(p) is defined by σ_α(s_α(x)g) = g. Restricted to any fibre the maps σ_α : P_m → G are homeomorphisms. Cover G by open sets V_a over which Ĝ → G has transition functions h_{ab} relative to local sections r_a : V_a → Ĝ, that is r_b = r_a h_{ab}. Then we can use the maps
\[
\pi^{-1}(U_\alpha) \to U_\alpha \times G, \qquad p \mapsto (\pi(p), \sigma_\alpha(p))
\]
to pull the V_a back to P to define open sets W_{(α,a)} ⊂ π^{-1}(U_α). The cover W = {W_{(α,a)}} is a refinement of the cover {π^{-1}(U_α)}. If ρ_{α_1,α_2,...,α_d} is a cocycle for {π^{-1}(U_α)} we denote by ρ_{(α_1,a_1),(α_2,a_2),...,(α_d,a_d)} its restriction to {W_{(α,a)}}. In particular consider the G valued cocycle σ_{(α,a)}. This can be lifted to Ĝ by defining σ̂_{(α,a)} = r_a ∘ σ_{(α,a)}. Then π^* ĝ_{(α,a)(β,b)} and σ̂_{(α,a)} σ̂_{(β,b)}^{-1} are both lifts of π^* g_{(α,a)(β,b)} so that we must have
\[
f_{(\alpha,a)(\beta,b)} = \pi^* \hat{g}_{(\alpha,a)(\beta,b)}\, \hat{\sigma}_{(\alpha,a)}^{-1}\, \hat{\sigma}_{(\beta,b)} \tag{4.5}
\]
for a cocycle f_{(α,a)(β,b)} : W_{(α,a)} ∩ W_{(β,b)} → U(1). Hence we have
\[
\pi^* e_{(\alpha,a)(\beta,b)(\gamma,c)} = f_{(\beta,b)(\gamma,c)}\, f_{(\alpha,a)(\gamma,c)}^{-1}\, f_{(\alpha,a)(\beta,b)}.
\]
Letting f_{(α,a)(β,b)} = exp(2πi v_{(α,a)(β,b)}) gives
\[
\pi^* w_{(\alpha,a)(\beta,b)(\gamma,c)} = v_{(\beta,b)(\gamma,c)} - v_{(\alpha,a)(\gamma,c)} + v_{(\alpha,a)(\beta,b)} + n_{(\alpha,a)(\beta,b)(\gamma,c)} \tag{4.6}
\]
for n_{(α,a)(β,b)(γ,c)} some integer valued cocycle. Finally we deduce that
\[
\pi^* d_{(\alpha,a)(\beta,b)(\gamma,c)(\delta,d)} = n_{(\beta,b)(\gamma,c)(\delta,d)} - n_{(\alpha,a)(\gamma,c)(\delta,d)} + n_{(\alpha,a)(\beta,b)(\delta,d)} - n_{(\alpha,a)(\beta,b)(\gamma,c)}. \tag{4.7}
\]
Consider now the cohomology on the fibre P_{m_0}. We define a cover W′ which covers P_{m_0} by
\[
W' = \{ W_a = W_{(\alpha_0, a)} \},
\]
where m_0 ∈ U_{α_0}. We make corresponding notational changes to indicate restriction of cocycles from W to W′. For example the restriction of n_{(α,a)(β,b)(γ,c)} is n_{abc} = n_{(α_0,a)(α_0,b)(α_0,c)}. We then have from Eq. (4.6)
\[
0 = \pi^* w_{(\alpha_0,a)(\alpha_0,b)(\alpha_0,c)} = v_{bc} - v_{ac} + v_{ab} + n_{abc}
\]
so that
\[
n_{abc} = -v_{bc} + v_{ac} - v_{ab}. \tag{4.8}
\]
Using Eq. (4.5) we see that
\[
\hat{\sigma}_a \hat{\sigma}_b^{-1} = \hat{\sigma}_{(\alpha_0,a)}\, \hat{\sigma}_{(\alpha_0,b)}^{-1} = \exp(-2\pi i\, v_{ab}).
\]
Finally note that σ̂ is defined by σ̂_{(α,a)} = r_a ∘ σ_α|_{U_{(α,a)}} so that
\[
\exp(-2\pi i\, v_{ab}) = (r_a r_b^{-1}) \circ \sigma_{\alpha_0} = h_{ab}^{-1} \circ \sigma_{\alpha_0},
\]
where h_{ab} are the transition functions of Ĝ → G. Finally we can calculate the transgression of the Chern class. It follows from (4.8) that the Chern class in H^2(P_{m_0}, Z) is represented by the cocycle −n_{abc}. We want to apply the coboundary map in relative cohomology to this to obtain a class in H^3(P, P_{m_0}, Z). We do this by first extending n_{abc} to a class on all of P and then applying the Čech coboundary to it. But we obtained n_{abc} by restricting n_{(α,a)(β,b)(γ,c)} so this is an obvious extension and then (4.7) shows that if we apply the Čech coboundary to n_{(α,a)(β,b)(γ,c)} we obtain the class π^* d_{(α,a)(β,b)(γ,c)(δ,d)} which is the pullback of the Dixmier-Douady class as required.

5. The Relation Between the Two Classes

This section is devoted to the proof of the following fact.

Theorem 5.1. The obstruction class is the negative of the Dixmier-Douady class if G is one-connected.

Proof. Notice first that the universal bundle for U(1), EU(1) → BU(1), can be realised as EĜ → EĜ/U(1). Also we have that G acts on EĜ/U(1) and hence we can form the associated fibration
\[
(EG \times E\hat{G}/U(1))/G \to BG.
\]
The fibres of this are therefore BU(1). Notice also that if we project onto BĜ that fibering has contractible fibres and (EG × EĜ/U(1))/G is homotopy equivalent to BĜ. Consider the diagram
\[
\begin{array}{ccc}
\hat{G} & \longrightarrow & E\hat{G} \\
\downarrow & & \downarrow \\
G & \longrightarrow & E\hat{G}/U(1).
\end{array}
\]
It follows that the bottom arrow must be the classifying map. Let µ be the generator of H^2(BU(1), Z). Let f be the classifying map. Then f^*(µ) is the class of the bundle Ĝ → G.
We now have a commuting diagram of fibrations:
\[
\begin{array}{ccc}
EG & \xrightarrow{\ \tilde{f}\ } & B\hat{G} \\
 & \searrow \quad \swarrow & \\
 & BG &
\end{array} \tag{5.1}
\]
where the map f̃ restricted to fibres is the classifying map f. Let us denote by [µ] the class on a fibre of BĜ → BG which is the fundamental class in H^2(BU(1), Z) = Z. Then by Theorem 3.5 we have that this transgresses to the obstruction class if the Hurewicz homomorphism π_3(BG) → H_3(BG, Z) is onto. From [19], page 3, a sufficient condition for this is that BG be two-connected. But, by assumption, G is one-connected so BG is two-connected. Also by Theorem 4.1 the class f̃^*([µ]) restricted to a fibre transgresses to the negative of the Dixmier-Douady class. But from the commuting diagram (5.1) of fibrations the transgression maps will commute with f̃^* and hence the obstruction is the negative of the Dixmier-Douady class.

6. The Classifying Space of the Projective Unitaries

Given the importance of PU and principal PU-bundles in the following theory we remark that there is a simple construction of a BPU which is a homogeneous, infinite dimensional smooth manifold and will allow us to obtain a BG when G ↪ PU is a closed embedding of Banach Lie groups. Throughout this section all groups are equipped with their natural Banach Lie group topologies (in the case of PU this arises from the norm topology on the unitary group).

Proposition 6.1. There exists a closed inclusion of PU = U(H)/U(1) in U(T), the unitary group of the Hilbert space of Hilbert-Schmidt operators T on H (U(T) is equipped with the norm topology).

Proof. Given [a] ∈ PU, choose a representative a ∈ U. Then define
\[
i : PU \to U(T), \qquad [a] \mapsto \mathrm{Ad}(a),
\]
where
\[
\mathrm{Ad}(a) : T \to T, \qquad t \mapsto a\, t\, a^{*}.
\]
Clearly i is well defined and is injective. To prove the continuity of i consider any convergent sequence ([a_n])_{n=1}^∞ → [1] in PU. By taking n large enough we may assume that the [a_n] lie in a neighbourhood over which U(PU, S^1) is locally trivial. Hence we may assume that there is a sequence (a_n)_{n=N}^∞ → 1 in U. Then it is straightforward to see that ‖Ad(a_n) − Ad(1)‖_{B(T)} → 0 as n → ∞. To see that the image of i is closed consider a sequence i([a_n]) → b, where b ∈ U(T)_u. Define a ∗-automorphism of T by
\[
b'(t) = \lim_{n \to \infty} \mathrm{Ad}(a_n)\, t.
\]
One can verify that b′ is a ∗-automorphism of T. Since T is uniformly dense in K(H), the compact operators on H, b′ defines a ∗-automorphism of K(H) and is thus of the form Ad(a) for some a ∈ U(H)_u. Hence b = Ad(a) = i([a]) and the image of i is closed. Finally, to see that i defines a homeomorphism we begin with the metric, ρ, which defines the topology on PU(H),
\[
\rho([a], [b]) = \inf_{\lambda \in S^1} \| a - \lambda b \|_{B(H)}.
\]
Now let Θ_{u,v} be the rank one operator Θ_{u,v} : H → H given by w ↦ (v, w)u. Then the map u ⊗ v ↦ Θ_{u,v} extends to an isomorphism of H̄ ⊗ H with T. Here the bar denotes the complex conjugate Hilbert space. The operator Ad(a) becomes ā ⊗ a, where ā denotes the action of a on the conjugate space. To prove our result it suffices to work in a neighbourhood of the identity in U(T). Now for ā ⊗ a to be close to the identity operator the spectrum of a must contain a gap. That being the case we can assume −1 is not in the spectrum of a by multiplying by a phase if necessary. Assume we have a sequence a_n ∈ U(H) with ‖Ad(1) − Ad(a_n)‖_{B(T)} → 0. Then there is a sequence of self adjoint operators K_n on T with a_n = exp(iK_n) and the spectrum of K_n is [γ_n, δ_n] ⊂ [−π, π]. In fact we may assume
\[
\gamma_n = \inf_{\|u\|_H = 1} \{ (u, K_n u) \}, \qquad
\delta_n = \sup_{\|u\|_H = 1} \{ (u, K_n u) \}.
\]
Then
\[
\| \mathrm{Ad}(a_n) - \mathrm{Ad}(1) \| = \sup \{ | \exp i(\lambda - \mu) - 1 | \; : \; \lambda, \mu \in [\gamma_n, \delta_n] \} = | \exp i(\delta_n - \gamma_n) - 1 |.
\]
On the other hand
\[
\inf_{\lambda} \| a_n - \lambda 1 \| = | \exp[ i(\delta_n - \gamma_n)/2 ] - 1 | \le \| \mathrm{Ad}(a_n) - \mathrm{Ad}(1) \|.
\]
Thus, if ‖Ad(1) − Ad(a_n)‖_{B(T)} → 0 as n → ∞ then ρ([1], [a_n]) → 0. Hence i^{-1} : i(PU(H)) → PU(H) is continuous and thus i is a homeomorphism.

This result shows that PU is a Banach Lie subgroup of U(T). The contractibility of U(T) (Kuiper's theorem) means that (after identifying i(PU) and PU), we have that U(T)(U(T)/PU, PU) is a locally trivial (by [18]) universal PU-bundle and that U(T)/PU is a BPU. More generally, if G is a closed sub-Banach Lie group of PU, then U(T)(U(T)/G, G) is a universal G-bundle.
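The claims at the start of the proof of Proposition 6.1, that i is well defined and that Ad(a) is unitary on T, can be checked directly. The computation below is a sketch; it assumes the usual Hilbert-Schmidt inner product ⟨s, t⟩ = tr(s^* t) on T, which the text does not write out explicitly.

```latex
% Sketch: Ad is well defined on PU and acts unitarily on Hilbert-Schmidt operators.
% Assumption: <s,t> = tr(s^* t) is the inner product on T.
\begin{align*}
\mathrm{Ad}(\lambda a)(t)
  &= (\lambda a)\, t\, (\lambda a)^{*}
   = |\lambda|^{2}\, a\, t\, a^{*}
   = \mathrm{Ad}(a)(t),
  \qquad \lambda \in U(1), \\[4pt]
\langle \mathrm{Ad}(a)s,\ \mathrm{Ad}(a)t \rangle
  &= \operatorname{tr}\bigl( (a s a^{*})^{*}\, a t a^{*} \bigr)
   = \operatorname{tr}\bigl( a s^{*} a^{*} a\, t\, a^{*} \bigr) \\
  &= \operatorname{tr}\bigl( a\, s^{*} t\, a^{*} \bigr)
   = \operatorname{tr}( s^{*} t )
   = \langle s, t \rangle .
\end{align*}
% The first line shows Ad(a) depends only on the class [a] in PU.
% The last step uses a^{*}a = 1 and cyclicity of the trace; since
% Ad(a^{*}) inverts Ad(a), the map Ad(a) is a unitary operator on T.
```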
7. String Structures

We start with a principal G-bundle, P(M, G), where G is a compact, simple, simply connected Lie group and form the bundle L_d P(L_d M, L_d G, Lf) where, in general, L_d M denotes the space of differentiable loops into a finite dimensional manifold M. It is well known ([22] Ch 6) that L_d G has a canonical central extension by S^1, $\widehat{L_d G}$, induced from an embedding of L_d G in the restricted unitary group which in turn embeds in the projective unitaries of a second Hilbert space H_π (the last inclusion is described in Sec. 9). Henceforth U and PU will refer respectively to the unitaries and projective unitaries over H_π,
\[
L_d G \hookrightarrow U_{\mathrm{res}} \hookrightarrow PU(H_\pi),
\]
\[
\widehat{L_d G}(L_d G, S^1) = i^{*}\, U(H_\pi)(PU(H_\pi), S^1),
\]
\[
[\widehat{L_d G}] \ \text{generates} \ H^2(L_d G, Z).
\]
The idea of a string structure arises as follows. Starting with a principal SO(n)-bundle, P(M, SO(n), f) (n > 2), which is usually the frame bundle of a tangent bundle, TM, and which has a Spin(n) structure Q(M, Spin(n), f̂) with classifying map f̂ one forms the loop bundle L_d Q(L_d M, L_d Spin(n), L_d f̂). The bundle P is said to have a string structure if and only if the structure group of L_d Q lifts to $\widehat{L_d \mathrm{Spin}(n)}$. Of course, the Dixmier-Douady class of L_d Q, D[L_d Q], is the obstruction to the existence of a string structure. Killingback proposed that twice D[L_d Q] was in fact the transgression of the Pontryagin class of P. Since then McLaughlin [17] and Carey and Murray [6] have produced rigorous proofs of Killingback's result.

7.1. Loop spaces, groups and bundles. Henceforth, let X be a topological space, H a topological group, M a finite dimensional manifold and G a compact Lie group. By (Ω_d M, m_0) we denote the based, differentiable loops into M,
\[
\Omega_d(M, m_0) := \{ \gamma \in L_d(M) : \gamma(0) = \gamma(1) = m_0 \}.
\]
When the base point is unimportant we shall suppress it. L_c X and Ω_c X shall denote the spaces of continuous loops and continuous based loops respectively, both with the compact open topology, whereas L_s M and Ω_s M shall denote the loop spaces used by Carey and Murray [6] consisting of free or based continuous loops differentiable except perhaps at m_0. L_d X and L_s X have the structure of differentiable Fréchet manifolds when given the Fréchet topology (see [6]). Moreover, in the case where the spaces are groups, L_c H and L_{s,d} G have, respectively, the structure of a topological group or a Lie group under pointwise multiplication of loops. (When considering based loops into a group, the base point is taken to be the identity of the group.) We shall next show that all three loop spaces are homotopy equivalent and hence they share many properties.
When dealing with facts and properties equally applicable to either the differentiable, piecewise differentiable or continuous loops we shall drop the subscripts and use LX and ΩX where it is understood that X is a manifold if the loop functor in question is any of L_d, L_s, Ω_d or Ω_s.
Proposition 7.1. Let M be a differentiable manifold of finite or infinite dimension, then Ω_c M, Ω_s M and Ω_d M have the same homotopy type.

Proof. We shall show that the obvious inclusions i : Ω_d M ↪ Ω_s M, j : Ω_s M ↪ Ω_c M and j ∘ i are weak homotopy equivalences. Then, since Ω_c M, Ω_s M and Ω_d M ∈ CW, it will follow that they are of the same homotopy type. Firstly, we start with some standard notation and the case of j ∘ i:
\[
I^n := \{ (y_0, \dots, y_{n-1}) \in R^n : 0 \le y_i \le 1 \},
\]
\[
dI^n := \{ (y_0, \dots, y_{n-1}) \in R^n : y_i = 0 \text{ or } 1 \text{ for some } i \},
\]
\[
C((X, A), (Y, B)) = \{ f \in C(X, Y) : f(A) \subset B \}.
\]
Then π_q(M) = [(I^q, dI^q), (M, m_0)]. Recall the 1-1 correspondence between the sets of maps
\[
\varphi : C((I^n, dI^n), (\Omega_c M, m_0)) \to C((I^{n+1}, dI^{n+1}), (M, m_0)), \qquad
\varphi(f)(y_0, y_1, \dots, y_n) = f(y_1, \dots, y_n)(y_0).
\]
(Here m_0 denotes both the base point of M and the constant loop onto it.) It is well known that φ descends to an isomorphism on the homotopy groups φ_* : π_n(Ω_c M) ≅ π_{n+1}(M). Observe also that if g ∈ C((I^{q+1}, dI^{q+1}), (M, m_0)) is differentiable then φ^{-1}(g) ∈ C((I^q, dI^q), (Ω_d M, m_0)). So now we can show that (j ∘ i)_* : π_q(Ω_d M) → π_q(Ω_c M) is bijective. From 17.8 and 17.8.1 of Bott and Tu [1] it follows that there is a differentiable map, g, in the homotopy class of φ(f) (surjectivity of (j ∘ i)_*) and that any two differentiable maps, φ(f_0) and φ(f_1), which are continuously homotopic are homotopic via a path of differentiable maps (injectivity of (j ∘ i)_*). This argument also shows that j is a weak homotopy equivalence and thus so too is i.

7.2. The loop map. If X and Y are two spaces (manifolds) and f is a continuous (differentiable) map f : X → Y then there is a continuous (differentiable) map, the loop of f, denoted Lf : LX → LY, where γ ↦ f ∘ γ. If P(M, G, f) is a locally trivial principal G-bundle then LP(LM, LG) is a locally trivial principal LG-bundle. Now, we may realise EG(BG, G) as a smooth principal G-bundle via the inclusion of G in O(n) for some n and the realisation of the classifying space of O(n) as the infinite dimensional Stiefel manifold (see [28]). It follows that LEG makes sense for differentiable loops and since LEG is also a contractible space that BLG = LBG. Since the homotopy class of a continuous map between manifolds always contains a differentiable map we may take the classifying map of any principal G-bundle to be differentiable and hence LP(LM, LG) has classifying map Lf. All of this holds mutatis mutandis for the based loops.
7.3. Transgression. Given two topological spaces, X and Y, the slant product (see [25], p. 287) is the product in general (co)homology theories defined via the pairing between homology and cohomology as follows. Let ω ∈ H^q(X × Y, Z), a ∈ H_p(Y, Z) and b ∈ H_{q−p}(X, Z); then the slant product
\[
/ : H^q(X \times Y, Z) \times H_p(Y, Z) \to H^{q-p}(X, Z)
\]
is given by
\[
(\omega / a)(b) = \omega(b \otimes a).
\]
We shall need the following functorial property. Given f : X → X′, g : Y → Y′ and ω′ ∈ H^q(X′ × Y′, Z) then
\[
[(f \times g)^{*} \omega'] / a = f^{*}(\omega' / g_{*} a). \tag{7.1}
\]
Let ev : ΩX × S^1 → X be the evaluation map and let i be the fundamental class of H_1(S^1, Z). Then the transgression homomorphism between the cohomologies of a space and its loop space is defined as follows:
\[
t^q : H^{q+1}(X, Z) \to H^q(\Omega X, Z), \qquad \omega \mapsto ev^{*}(\omega) / i.
\]
One can easily check that the following diagram commutes.
\[
\begin{array}{ccc}
\Omega X \times S^1 & \xrightarrow{\ ev\ } & X \\
\downarrow{\scriptstyle \Omega f \times \mathrm{Id}} & & \downarrow{\scriptstyle f} \\
\Omega X' \times S^1 & \xrightarrow{\ ev\ } & X'
\end{array}
\]
By applying (7.1) to Ωf × Id and Id one sees that
\[
t^q(f^{*}\omega) = (ev^{*}(f^{*}\omega))/i = (\Omega f \times \mathrm{Id})^{*}(ev^{*})(\omega)/i = (\Omega f)^{*}((ev^{*})(\omega)/\mathrm{Id}_{*} i) = (\Omega f)^{*} t^q(\omega). \tag{7.2}
\]
In simple cases, McLaughlin ([17], p 147) has noted that transgressions can be computed using the Hurewicz homomorphism as follows. Given any spaces X and Y, let [X, Y]_0 denote the set of based homotopy classes of continuous based maps from X to Y; then there is a well known bijective, adjoint correspondence (closely related to the correspondence mentioned in Proposition 7.1)
\[
\Delta : [\Sigma X, Y]_0 \longrightarrow [X, \Omega_c Y]_0 \tag{7.3}
\]
which descends in the case that X = S^{q−1} to the isomorphism between the homotopy groups of a space and its loop space, δ_q : π_q(Y) ≅ π_{q−1}(Ω_c Y). In fact, δ_q = ∂_q, the boundary map in the long exact sequence of the continuous path fibration, P_c Y → Y. Now let π : S^{q−1} × S^1 → ΣS^{q−1} be the projection defined by the equivalence relation (θ, 1) ∼ (θ′, 1) and (θ_0, t) ∼ (θ_0, t′) for all θ, θ′ ∈ S^{q−1} and
for all t, t′ ∈ S^1, where θ_0 is the base point of S^{q−1}. Then, by the definition of Δ, the following diagram commutes:
\[
\begin{array}{ccccc}
S^{q-1} \times S^1 & \xrightarrow{\ \Delta f \times \mathrm{Id}\ } & \Omega_c X \times S^1 & \xrightarrow{\ ev\ } & X \\
 & \searrow{\scriptstyle \pi} & & \nearrow{\scriptstyle f} & \\
 & & \Sigma S^{q-1} & &
\end{array}
\]
If j ∈ H_{q−1}(S^{q−1}, Z) is a generator then π_*(j ⊗ i) := k generates H_q(S^q, Z). Thus for ω ∈ H^q(X, Z),
\[
\omega(f_{*}(k)) = \omega(f_{*}\pi_{*}(j \otimes i)) = \omega(ev_{*}((\Delta f)_{*} j \otimes i)) = ev^{*}(\omega)((\Delta f)_{*} j \otimes i) = t^q(\omega)((\Delta f)_{*} j). \tag{7.4}
\]
In cases where the Hurewicz homomorphism, φ : π_{q−1}(Ω_c X) → H_{q−1}(Ω_c X, Z), is surjective and H^{q−1}(Ω_c X, Z) is torsion free, (7.4) will allow us to compute t^q since in this case a cohomology class ω′ ∈ H^{q−1}(Ω_c X, Z) is determined by the value it takes on (Δf)_* j as Δf runs through π_{q−1}(Ω_c X). We can also use the fact that the continuous and differentiable loop spaces are homotopy equivalent (Proposition 7.1) to gain the same result when X = M is a manifold and we consider differentiable loops (now we must consider a differentiable map, g : S^{q−1} → Ω_d M, which is homotopic to Δf). We can apply this to interpret the transgression homomorphism as the looping of maps when we regard H^q(X, Z) as [X, K(Z, q)]. In this case the Hurewicz homomorphism is an isomorphism and if 1 ∈ H^q(K(Z, q), Z) is a generator then τ^q(1) := 1′ generates H^{q−1}(ΩK(Z, q), Z) = H^{q−1}(K(Z, q − 1), Z) and
\[
\tau^q(f^{*}(1)) = (\Omega f)^{*}(1'). \tag{7.5}
\]
8. Killingback's Result

In this section we confine our attention to cases where G is a compact, connected and simply connected Lie group and we consider string structures for smooth bundles with fibre Ω_s G. We can consider Ω_s G and Ω_d G interchangeably since the obvious inclusion Ω_d G ↪ Ω_s G is a homotopy equivalence. This means that, for a Lie group G, isomorphism classes of Ω_d G-bundles, Ω_s G-bundles and Ω_c G-bundles are in 1-1 correspondence via the obvious bundle inclusions. Thus, the problem of finding a string structure is identical in the case of Ω_d G and Ω_s G as the following commutative diagram makes clear:
\[
\begin{array}{ccc}
H^1(M, \Omega_s G) & \cong & H^1(M, \Omega_d G) \\
\downarrow{\scriptstyle D} & & \downarrow{\scriptstyle D} \\
H^2(M, S^1) & \cong & H^2(M, S^1)
\end{array}
\]
We see that for a principal $G$-bundle, $P(M, G)$, over a manifold, $M$, $D[\Omega_s P] = 0$ if and only if $D[\Omega_d P] = 0$. This links the work of Carey and Murray [6] and McLaughlin [17]. Moreover, since $LG$ is homeomorphic to $\Omega G \times G$, we need only consider based loops when $G$ is simply connected, for then $H^i(G, \mathbb{Z}) = 0$ for $i = 1, 2$ and the canonical projection $\phi : LG = \Omega G \times G \to \Omega G$ induces an isomorphism $\phi^* : H^2(\Omega G, \mathbb{Z}) \cong H^2(LG, \mathbb{Z})$. The correspondence between circle bundles and second integral cohomology entails
$$
\widehat{L_s G}(L_s G, S^1) = \phi^*\, \widehat{\Omega_s G}(\Omega_s G, S^1) = \widehat{\Omega_s G}(\Omega_s G, S^1) \times G.
$$
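Now note that for any topological groups $G$ and $H$, $H^1(M, G \times H) = H^1(M, G) \times H^1(M, H)$. The following commutative diagram shows that $L_s P$ has a string structure if and only if $\Omega_s P$ has one (the first two vertical arrows are the obvious projections and $\rho$ is the map induced on cohomology from the projection $\rho : \widehat{\Omega_s G} \to \Omega_s G$).
$$
\begin{array}{ccccc}
H^1(M, \widehat{\Omega_s G}) \times H^1(M, G) & \xrightarrow{\ \rho \times \mathrm{Id}\ } & H^1(M, \Omega_s G) \times H^1(M, G) & \xrightarrow{\ D\ } & H^2(M, S^1) \\
\downarrow & & \downarrow & & \downarrow{\scriptstyle \mathrm{Id}} \\
H^1(M, \widehat{\Omega_s G}) & \xrightarrow{\ \rho\ } & H^1(M, \Omega_s G) & \xrightarrow{\ D\ } & H^2(M, S^1)
\end{array}
$$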
Let us now turn to the general situation for s G. Start with a principal SO(n)– bundle, P (M, SO(n), f ) (n > 4), (typically P is the frame bundle of the tangent bundle of a Spin manifold M ) that has a Spin(n)–structure Q(M, Spin(n), fˆ) and form the loop bundle s Q(s M, s Spin(n), s fˆ). Now, realise Bs Spin(n) as s BSpin(n). Since Spin(n) is two-connected with π3 (Spin(n)) ∼ = Z, BSpin(n) is three-connected and H 4 (BSpin(n), Z) ∼ = Z. Thus (7.4) gives us that t3 : H 4 (BSpin(n), Z) → H 3 (s BSpin(n), Z) is an isomorphism, so choose ω ∈ H 4 (BSpin(n), Z), a generator, so that t4 (ω) = µ, the universal Dixmier-Douady class. So, D[Ls Q] = (s fˆ)∗ µ = (s fˆ)∗ t4 (ω) = t3 (fˆ∗ ω) by (7.2). McLaughlin [17] in his Lemma 2.2 shows by analysing the spectral sequence of the bundle BSpin(n)(BSO(n), BZ2 ) that for n > 4, 2.fˆ∗ (ω) = P1 (P ), where P1 (P ) is the first Pontryagin class of P . Thus 2D[Ls Q] = t4 (P1 (P )) which is Killingback’s result. Now (7.4) entails that tq is injective for M (q − 2)connected and hence the vanishing of (1/2)P1 (P ) is necessary and sufficient for the existence of a string structure if M is two-connected, and merely a sufficient condition in general.
9. The Restricted Unitary Group
We follow the construction in Pressley and Segal's book [22]. Let $H = H_+ \oplus H_-$ be a separable Hilbert space decomposed by infinite dimensional subspaces $H_+$ and $H_-$ which are the ranges of the self-adjoint projections $P_+$ and $P_-$ respectively, $\mathrm{Id}_H = P_+ + P_-$. The restricted unitary group relative to a polarisation is defined by
$$
U_{res}(H, P_+) = \{ u \in U(H) : P_\pm u P_\mp \text{ is Hilbert–Schmidt} \}.
$$
Now $U_{res}$ is not equipped with the subspace topology from $U(H)$ but with its own topology coming from the metric $\rho$,
$$
\rho(u_1, u_2) = \|P_+(u_1 - u_2)P_+\| + \|P_-(u_1 - u_2)P_-\| + |P_+(u_1 - u_2)P_-|_{HS} + |P_-(u_1 - u_2)P_+|_{HS},
$$
where $|\ |_{HS}$ denotes the symmetric norm on the Hilbert–Schmidts. Typically the Hilbert space and polarisation are understood and omitted from the notation. If $(\ ,\ )$ denotes the inner product on $H$, then the CAR (canonical anti-commutation relations) algebra over $H$, $CAR(H)$, is the $C^*$-algebra generated by the set $\{a(f), a^*(f) : f \in H\}$ whose elements satisfy the canonical anti-commutation relations
$$
a(f)a(g) + a(g)a(f) = 0, \qquad a(f)a^*(g) + a^*(g)a(f) = (f, g).
$$
Any unitary $u \in U(H)$ allows one to define an automorphism of $CAR(H)$ (called a Bogoliubov transformation) by
$$
\alpha_u(a(f)) = a(u.f), \qquad \alpha_u(a^*(f)) = a^*(u.f).
$$
An irreducible (Fock) representation $\pi$ of $CAR(H)$ is determined via the GNS construction from the state $\omega$ defined by
$$
\omega(a^*(f_1) \cdots a^*(f_M)\, a(g_N) \cdots a(g_1)) = \delta_{MN} \det(g_i, P_- f_j).
$$
The result we need is the theorem (see [24]) that, given a Bogoliubov transformation $\alpha_u$, there exists a unitary $W(u) \in U(H_\pi)$ such that
$$
\pi(\alpha_u(a(f))) = \pi(a(u.f)) = \mathrm{Ad}(W(u))(\pi(a(f))) = W(u)\pi(a(f))W(u)^*
$$
if and only if $u \in U_{res}(H)$. Since $\pi$ is irreducible, $W(u)$ is uniquely defined up to a scalar, which is killed by the adjoint. Hence the above defines an embedding $i : U_{res} \hookrightarrow PU(H_\pi)$ of the restricted unitaries of $H$ in the projective unitaries on $H_\pi$. It is a corollary of a proof in Carey (1984) (Lemma 2.10) that this embedding is closed in $PU(H_\pi)$. Furthermore we shall see below that $H^2(U_{res}, \mathbb{Z}) = \mathbb{Z}$ and the canonical central extension of $U_{res}$, $\widehat{U}_{res}$, defined by the generator of $H^2(U_{res}, \mathbb{Z})$, is given by
$$
\widehat{U}_{res}(U_{res}, U(1)) = i^*\, U(H_\pi)(PU(H_\pi), U(1)).
$$
Hence the assumptions of Sect. 3 are fulfilled. Finally, note that $U_{res}$ is a disconnected group with connected components labelled by the Fredholm index of $P_+ U P_+$. We denote the connected component of the identity by $U^0_{res}$. Henceforth we drop reference to the different Hilbert spaces over which $U_{res}$ and $PU$ are defined and it shall be understood that $PU$ refers to the projective unitaries on $H_\pi$ and not $H$. We now summarise the homotopy properties of $U_{res}$, its role as a classifying space for $U(\infty)$ and the relation between $U_{res}$ and $PU$ bundles.
10. $U_{res}$ as a Classifying Space
The group of unitaries with determinant, say $T$, consists of those unitary operators of the form 1 + trace class. By considering $T(H_+)$, Pressley and Segal [22] (see Ch. 6) show that there is a principal $T$-bundle over $U^0_{res}$, the connected component of $U_{res}$, with contractible total space and hence $U^0_{res}$ is a $BT$. It is known that $T$ has the homotopy type of the direct limit of the finite unitaries,
$$
T \simeq U(\infty) = \lim_{n \to \infty} U(n).
$$
So $U^0_{res}$ is a CW-classifying space for $T$ and thus $U(\infty)$. So we have $U^0_{res} \simeq BT$. Since the homotopy groups of $U(\infty)$ are well known by Bott periodicity we have that
$$
\pi_q(U_{res}) = \begin{cases} \mathbb{Z}, & q \text{ even}, \\ 0, & q \text{ odd}. \end{cases}
$$
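The step from Bott periodicity to the formula above can be spelled out as follows. This is only a sketch, assuming the standard facts that a principal $T$-bundle with contractible total space gives $\pi_q(BT) \cong \pi_{q-1}(T)$ for $q \ge 1$, that $\pi_j(U(\infty))$ is $\mathbb{Z}$ for $j$ odd and $0$ for $j$ even, and that all components of the topological group $U_{res}$ are homeomorphic to $U^0_{res}$.

```latex
% Sketch: homotopy groups of U_res from those of U(infinity), for q >= 1.
\[
\pi_q(U_{res}) \;=\; \pi_q(U^0_{res}) \;\cong\; \pi_q(BT) \;\cong\; \pi_{q-1}(T)
\;\cong\; \pi_{q-1}(U(\infty)) \;=\;
\begin{cases}
\mathbb{Z}, & q-1 \text{ odd, i.e. } q \text{ even},\\
0, & q-1 \text{ even, i.e. } q \text{ odd}.
\end{cases}
\]
% For q = 0 the components are labelled by the Fredholm index, so
\[
\pi_0(U_{res}) \;\cong\; \mathbb{Z}.
\]
```

Both cases agree with the stated result, with $q = 0$ counted as even.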
(This result has elsewhere been proven via methods more closely tied to Ures ’s structure as a group of unitary operators, see Carey (1983).) Now U (∞) and BU (∞) are classifying spaces for reduced K–theory and we have: Proposition 10.1.
$$
BU_{res} \simeq U(\infty), \qquad U_{res} \simeq \Omega_c U(\infty).
$$
Proof. It is known that the embedding $\Omega_d U(n) \hookrightarrow U_{res}$ extends to a map $i : \Omega_d U(\infty) \hookrightarrow U_{res}$ and one can check that this is a weak homotopy equivalence and hence a homotopy equivalence. By Proposition 7.1, $\Omega_d U(\infty) \simeq \Omega_c U(\infty)$ and thus, remembering that via the path fibration $B\Omega_c G \simeq G_0$, $BU_{res} \simeq B\Omega_c U(\infty) = U(\infty)$. If we loop this equation we find $\Omega_c U(\infty) \simeq \Omega_c BU_{res} \simeq B\Omega_c U_{res} \simeq U_{res}$.
Note 10.1. Over the category of CW-complexes of dimension less than a given integer, $CW_n$, and over the category of finite CW-complexes, $CW_{fin}$, the functors of reduced K-theory have $BU(\infty)$ as a classifying space (see [14], p. 118). It follows that isomorphism classes of $U_{res}$-bundles correspond bijectively with elements of reduced K-theory. Specifically, $\tilde{K}^1(X)$ of a space is defined to be the stable isomorphism classes of vector bundles over the reduced suspension of $X$, $\Sigma X$. For $X \in CW_n$ or $CW_{fin}$,
$$
\begin{aligned}
\tilde{K}^1(X) &= [\Sigma X, BU(\infty)] \\
&= [X, \Omega_c BU(\infty)] \quad (\text{apply } \Delta) \\
&= [X, B\Omega_c U(\infty)] \\
&= [X, BU_{res}] \\
&= \mathrm{Bun}_X(U_{res}),
\end{aligned}
$$
where $\mathrm{Bun}_X(U_{res})$ denotes the set of all isomorphism classes of $U_{res}$-bundles over $X$. Now elements of $\tilde{K}$ correspond bijectively with $U(\infty)$-bundles, and $U_{res}$-bundles correspond with $\Omega_c U(\infty)$-bundles. So our correspondence can be seen as a mapping between $\Omega_c U(\infty)$-bundles over a space and $U(\infty)$-bundles over the reduced suspension of that space, which is attained by applying $\Delta$ or $\Delta^{-1}$ to the classifying maps of the bundles. We exploit this in the next section.
11. The Dixmier-Douady Class and the Second Chern Class
Regarding $U_{res}$ as a subgroup of $PU$ via the inclusion mentioned in Sect. 7, we may ask when can we reduce the structure group of a $PU$-bundle, $P(PU, M, f)$, to $U_{res}$? By Theorem 3.4 we translate this question into a search for maps $\hat{f}$ such that $f = g \circ \hat{f}$, where we take $g : BU_{res} \to BPU$ to be a fibration with fibre $F$,
$$
\begin{array}{ccc}
 & & BU_{res} \simeq U(\infty) \\
 & & \downarrow{\scriptstyle g} \\
X & \xrightarrow{\ f\ } & BPU \simeq K(\mathbb{Z}, 3).
\end{array}
$$
In general we know that if there were a section of $g$, say $s$, then this would entail the existence of group homomorphisms $g^* : H^*(BPU, \mathbb{Z}) \to H^*(BU_{res}, \mathbb{Z})$, $s^* : H^*(BU_{res}, \mathbb{Z}) \to H^*(BPU, \mathbb{Z})$, such that
$$
s^* \circ g^* = (g \circ s)^* = \mathrm{id}.
$$
It is a group theoretic result that this implies that $H^*(BPU, \mathbb{Z})$ would be a direct summand of $H^*(BU_{res}, \mathbb{Z})$. But we know (see Bott and Tu, pp. 245–246) that $H^*(BPU, \mathbb{Z})$ has torsion whereas $H^*(BU_{res}, \mathbb{Z})$ is a free group. Therefore the sought after section cannot exist and the structure group of some $PU$-bundles does not reduce to $U_{res}$. The situation in specific instances depends in part on the homotopy groups of the fiber, which we can compute in this case by noting that $i_* : \pi_q(U_{res}) \to \pi_q(PU)$ is an isomorphism for $q = 2$ and null otherwise. It follows by Lemma 3.1 that $g_{*,q} : \pi_q(BU_{res}) \to \pi_q(BPU)$ is an isomorphism for $q = 3$ and null otherwise. By considering the long exact homotopy sequence of the fibration
$$
F \hookrightarrow BU_{res} \xrightarrow{\ g\ } BPU
$$
we see that
$$
\pi_q(F) = \begin{cases} \mathbb{Z}, & q \text{ odd}, \ q \neq 3, \\ 0, & q \text{ even or } q = 3. \end{cases}
$$
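The exact-sequence argument behind this formula can be written out as below. It is a sketch only, assuming $\pi_q(BPU) = \mathbb{Z}$ for $q = 3$ and $0$ otherwise (since $BPU \simeq K(\mathbb{Z}, 3)$), and $\pi_q(BU_{res}) \cong \pi_q(U(\infty))$, which is $\mathbb{Z}$ for $q$ odd and $0$ for $q$ even.

```latex
% Long exact homotopy sequence of F -> BU_res -> BPU:
\[
\cdots \to \pi_{q+1}(BPU) \to \pi_q(F) \to \pi_q(BU_{res})
\xrightarrow{\ g_{*,q}\ } \pi_q(BPU) \to \pi_{q-1}(F) \to \cdots
\]
% Case q = 3: pi_4(BPU) = 0 and g_{*,3} is injective, so
\[
0 \to \pi_3(F) \to \mathbb{Z} \xrightarrow{\ \cong\ } \mathbb{Z}
\quad\Longrightarrow\quad \pi_3(F) = 0 .
\]
% Case q = 2: g_{*,3} is surjective and pi_2(BU_res) = 0, so
\[
\mathbb{Z} \xrightarrow{\ \cong\ } \mathbb{Z} \to \pi_2(F) \to 0
\quad\Longrightarrow\quad \pi_2(F) = 0 .
\]
% Case q >= 1, q not in {2, 3}: both neighbouring groups of BPU vanish, so
\[
\pi_q(F) \cong \pi_q(BU_{res}) \cong \pi_q(U(\infty)) =
\begin{cases} \mathbb{Z}, & q \text{ odd},\\ 0, & q \text{ even}. \end{cases}
\]
```

Collecting the three cases gives $\mathbb{Z}$ for odd $q \neq 3$ and $0$ for $q$ even or $q = 3$.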
Now the cohomology, $H^n(K(\mathbb{Z}, 3))$, of $K(\mathbb{Z}, 3)$ is zero for $n = 1$ and torsion for $n > 3$ (see Bott and Tu, pp. 245–246). Hence obstructions to lifting $f$ can lie only in $H^{2n+4}(M, \mathbb{Z})$ ($n \ge 1$). So the structure group of any $PU$-bundle over a space with free, even (greater than fourth) cohomology groups reduces to $U_{res}$. We recast this problem in a more general setting by exploiting the correspondence between $U_{res}$-bundles over a space $X$ and $\tilde{K}^1(X)$. There is a suspension isomorphism on cohomology,
$$
\Sigma^q : H^q(X, \mathbb{Z}) \cong H^{q+1}(\Sigma X, \mathbb{Z}),
$$
which one can obtain from the Mayer-Vietoris sequence for $(\Sigma X, CX, CX)$ (where "$CX$" denotes the reduced cone of $X$) or by using the adjoint relation, $\Delta$ (see 7.3), between $\Sigma$ and $\Omega_c$ considered as functors on $CW$:
$$
X \xrightarrow{\ \Delta f\ } \Omega_c K(\mathbb{Z}, q+1) = K(\mathbb{Z}, q) \quad \longleftrightarrow \quad \Sigma X \xrightarrow{\ f\ } K(\mathbb{Z}, q+1).
$$
If $1$ and $1'$ are as in (7.5) then
$$
\Sigma^q((\Delta f)^*(1')) = f^*(1). \tag{11.1}
$$
The next proposition uses the suspension isomorphism and the transgression homomorphism to link characteristic classes of principal $U_{res}$-bundles over $X$ with the characteristic classes of the associated $U(\infty)$-bundles over $\Sigma X$.
Proposition 11.1. Let $c$ be a characteristic class for principal $U(\infty)$-bundles defined by its universal class $c^* \in H^{q+1}(BU(\infty), \mathbb{Z})$ and let $t^q(c)$ be the characteristic class for principal $U_{res}$-bundles with universal class $t^q(c^*)$. If $P(X, U_{res}, \Delta f)$ is a principal $U_{res}$-bundle over $X$ and $\Sigma P(\Sigma X, U(\infty), f)$ is the associated $U(\infty)$-bundle over $\Sigma X$ (an element of $\tilde{K}^1(X)$), then
$$
c(\Sigma P) = \Sigma^q(t^q(c)(P)).
$$
Proof. Let $c$ also denote a map $c : BU(\infty) \to K(\mathbb{Z}, q+1)$ which pulls back $1 \in K(\mathbb{Z}, q+1)$ to $c^*$. We must show that $f^*(c^*) = \Sigma^q((\Delta f)^*(t^q(c^*)))$. By applying Proposition 10.1, realise $BU_{res}$ as $\Omega_c BU(\infty)$. Thus we may exploit the adjoint pairing $\Delta$ between the functors $\Omega_c$ and $\Sigma$. We have the following maps:
$$
\Sigma X \xrightarrow{\ f\ } BU(\infty) \xrightarrow{\ c\ } K(\mathbb{Z}, q+1), \qquad X \xrightarrow{\ \Delta f\ } \Omega_c BU(\infty) \xrightarrow{\ \Omega_c c\ } \Omega_c K(\mathbb{Z}, q+1)
$$
with $\Delta(c \circ f) = \Omega_c c \circ \Delta f$. Now by (7.5), $(\Omega_c c)^*(1') = t^q(c^*)$ and thus
$$
\begin{aligned}
\Sigma^q((\Delta f)^*(t^q(c^*))) &= \Sigma^q((\Delta f)^*(\Omega_c c)^*(1')) \\
&= \Sigma^q(\Delta(c \circ f)^*(1')) \\
&= (c \circ f)^*(1) \quad \text{by (11.1)} \\
&= f^*(c^*),
\end{aligned}
$$
and the proposition is proved.
Proposition 11.2. Let D be the Dixmier-Douady class for principal Ures -bundles, let c2 be the second Chern class for U (∞)-bundles and let P and 6P be as above. Then 63 (D(P )) = c2 (6P ). Proof. Let D∗ ∈ H 3 (BUres , Z) and c∗2 ∈ H 4 (BU (∞, Z)) denote the universal classes of D and c2 respectively, then by Proposition 11.1 it suffices to show that t3 (c∗2 ) = D∗ . Using [14] (Ch 20, Corollary 9.8) one deduces that there is a U (∞))-bundle over S 4 , P (S 4 , U (∞), f ), with c2 (P ) a generator of H 4 (S 4 , Z). Let k denote the generator of H4 (S 4 , Z) such that 1 = c2 (P )(k) = f ∗ (c∗2 )(k) and let j be the corresponding generator of H3 (S 3 , Z) (in the sense of 7.4). Then by (7.4) t3 (c∗2 )((1f )∗ j) = f ∗ (c∗2 )(k) = 1. But one can show by considering long exact sequence of the fibration U (n)(S 2n−1 , U (n − 1)) (n large) that the Hurewicz homomorphism is an isomorphism on π3 (BUres ) ∼ = π3 (U (∞)) ∼ = Z. Hence, t3 (c∗2 ) generates H 3 (BUres , Z) (as it evaluates to 1 on the generator of H3 (BUres , Z)) and so t3 (c∗2 ) = D∗ as required. In summary, the structure group of a PU-bundle, Q(X, PU ) reduces to Ures if and only if there is a Ures bundle, P (X, Ures ) whose Dixmier-Douady class coincides with that of Q. This, we have just seen, happens if and only if there is a U (∞)–bundle, 6P (6X, U (∞)) over 6X such that c2 (6P ) = 63 (D(P )). We know from above that one cannot, in general, construct a U (∞)-bundle with an arbitrary second Chern class on any given space. This differs from the case for the first Chern class where one can always find a line bundle, and hence a U (∞)–bundle, for any given element of H 2 (M, Z). 12. Connections with Other Viewpoints 12.1. Bundle gerbes. An alternative method of defining the obstruction to lifting a bundle to a central extension is to use the notion of bundle gerbes [20]. We will sketch the theory here and refer the reader to [20] for details. 
If $Y \to M$ is a fibration, define $Y^{[p]}$ to be the $p$th fibre product of $Y$ with itself. Then a bundle gerbe over $M$ is a pair $(J, Y)$, where $\pi : Y \to M$ is a fibration and $J \to Y^{[2]}$ is a $U(1)$ bundle. Furthermore, for any $x$, $y$ and $z$ in $Y$ lying in the same fibre, we require the existence of a bundle morphism, called the bundle gerbe product, $J_{(x,y)} \otimes J_{(y,z)} \to J_{(x,z)}$, depending continuously or smoothly on $x$, $y$ and $z$. Moreover this composition is required to be associative. Note that for $U(1)$ principal bundles there is a natural notion of tensor product and dual, see [20] for details. If $L \to Y$ is a $U(1)$ bundle then we can define a bundle gerbe $(\delta(L), Y)$ by $\delta(L)_{(x,y)} = L_x \otimes L_y^*$.
A bundle gerbe is called trivial if it is isomorphic to a bundle gerbe of the form $\delta(L)$. The obstruction to a bundle gerbe $(J, Y)$ over $M$ being trivial is a class in $H^3(M, \mathbb{Z})$ called the Dixmier-Douady class of the bundle gerbe. Its definition can be found in [20]. The connection with our work is the bundle gerbe arising as the obstruction to extending the structure group of a $G$ bundle $P$ to $\hat{G}$, where $0 \to U(1) \to \hat{G} \to G \to 0$ is a central extension. Note that if we form the fibre product $P^{[2]}$ there is a map $\sigma : P^{[2]} \to G$ defined by $p = q\sigma(p, q)$. We define $J = \sigma^*(\hat{G})$, where here we think of $\hat{G}$ as a $U(1)$ bundle over $G$. It is easy to check that the group multiplication in $\hat{G}$ defines the required bundle gerbe product. It is shown in [20] that
Theorem 12.1 ([20]). The bundle gerbe $L$ is trivial if and only if the bundle $P$ lifts to $\hat{G}$. The Dixmier-Douady class of $L$ is the same Dixmier-Douady class which is the obstruction to the bundle $P$ lifting to $\hat{G}$.
12.2. The Dixmier-Douady class and Clifford bundles. We now interpret the Dixmier-Douady class as an obstruction in a different setting which is closer in spirit to that of the original (cf. [10]). Suppose we have a principal fibre bundle $P(M, U_{res})$ and a locally finite cover $\{U_\beta \mid \beta \in A\}$ of $M$. The transition functions $g_{\beta\gamma}$ may be used to define the transition functions for a locally trivial bundle over $M$ with fibre the CAR algebra. This is achieved by defining automorphisms of the CAR algebra by $u_{\beta\gamma}(a(v)) = a(g_{\beta\gamma} v)$ ($v \in H$) and using the $u_{\beta\gamma}$ as transition functions for a fibre bundle $C(M, CAR(H))$. If the Dixmier-Douady class of $P(M, U_{res})$ is trivial then one can find unitaries $\{W(u_{\beta\gamma}) \mid \beta, \gamma \in A\}$ acting on the Hilbert space $H_\pi$ of $\pi$ which form a Čech 2-cocycle with values in the unitaries on $H_\pi$. Using these as transition functions one defines a "Fock bundle" over $M$, with fibre the Fock space $H_\pi$.
Thus the Dixmier-Douady class of P (M, Ures ) is an obstruction to the existence of a locally trivial bundle over M with fibre the Fock space and on sections of which the Clifford bundle (as a field of C∗ -algebras) acts. This is analogous to the original introduction of the Dixmier-Douady class as an obstruction to the triviality of a bundle of C∗ -algebras with fibre the compact operators. Acknowledgement. ALC and MKM acknowledge the support of the Australian Research Council. We thank Varghese Mathai and Jim Davis for valuable assistance and John Phillips for suggesting many improvements to the arguments. Finally we thank the referee for a number of useful comments.
References
1. Bott, R. and Tu, L.: Differential Forms in Algebraic Topology. New York: Springer-Verlag, 1982
2. Brylinski, J.-L.: Loop spaces, Characteristic classes and Geometric Quantization. Berlin: Birkhäuser, 1992
3. Carey, A.L.: Infinite Dimensional Groups and Quantum Field Theory. Acta Appl. Math. 1, 321–332 (1983)
4. Carey, A.L. and O'Brien, D.M.: Automorphisms of the Infinite Dimensional Clifford Algebra and The Atiyah-Singer Mod 2 Index. Topology 22, 437–448 (1983)
5. Carey, A.L.: Some Infinite Dimensional Groups and Bundles. Publications of the Research Institute for Mathematical Sciences Kyoto University 20, 1103–1117 (1984)
6. Carey, A.L. and Murray, M.K.: String Structures and the Path Fibration of a Group. Commun. Math. Phys. 141, 441–452 (1991)
7. Carey, A.L. and Murray, M.K.: Faddeev's anomaly and bundle gerbes. Lett. Math. Phys. 37, 29–36 (1996)
8. Carey, A.L., Mickelsson, J. and Murray, M.K.: Index theory, Gerbes and Hamiltonian quantisation. Commun. Math. Phys. to appear (1997)
9. Coquereaux, R. and Pilch, K.: String Structures on Loop Bundles. Commun. Math. Phys. 120, 353–378 (1989)
10. Dixmier, J.: C*-algebras. Amsterdam: North Holland, 1977
11. Faddeev, L. and Shatashvili, S.: Algebraic and hamiltonian methods in the theory of nonabelian anomalies. Theoret. Math. Phys. 60, 770 (1985) and Mickelsson, J.: Chiral anomalies in even and odd dimensions. Commun. Math. Phys. 97, 361 (1985)
12. Frenkel, J.: Cohomologie non Abeliene et Espaces Fibres. Bulletin de la Soc. de Math. Francais 85, 135–220 (1957)
13. Gross, L.: On the formula of Mathews and Salam. J. Funct. Analysis 25, 162–209 (1977)
14. Husemoller, D.: Fibre Bundles. New York: McGraw-Hill, 1966
15. Killingback, T.P.: World-sheet Anomalies and Loop Geometry. Nuc. Phys. B288, 578–588 (1987)
16. Kuiper, N.: Contractibility of the Unitary Group in Hilbert Space. Topology 3, 19–30 (1964)
17. McLaughlin, D.A.: Orientation and String Structures on Loop Space. Pac. J. Math. 155, 1–31 (1992)
18. Michael, E.: In: Set-valued mappings, selections and topological properties of $2^X$. Proceedings of the Conference held at the State University of New York at Buffalo, May 8-10, 1969. Edited by W. M. Fleischman, Berlin–Heidelberg–New York: Springer-Verlag, 1970
19. Mosher, R.E. and Tangora, M.C.: Cohomology operations and applications in homotopy theory. New York: Harper & Row, 1968
20. Murray, M.K.: Bundle Gerbes. J. London Math. Soc. (2) 54, 403–416 (1996)
21. Murray, M.K. and Stevenson, D.: A universal bundle gerbe. Preprint
22. Pressley, A. and Segal, G.: Loop Groups. Oxford: Clarendon Press, 1986
23. Segal, G.: Faddeev's anomaly in Gauss' law. Unpublished preprint (1985)
24. Shale, D. and Stinespring, W.F.: Spinor Representations of Infinite Orthogonal Groups. J. Math. Mech. 14, 315–322 (1965)
25. Spanier, E.H.: Algebraic Topology. New York: McGraw-Hill, 1966
26. Steenrod, N.: The Topology of Fibre Bundles. Princeton, NJ: Princeton University Press, 1951
27. Steenrod, N.: Milgram's Classifying Space of a Topological Group. Topology 7, 349–68 (1968)
28. Whitehead, G.W.: Elements of Homotopy Theory. Berlin–Heidelberg–New York: Springer-Verlag, 1978
Communicated by H. Araki
Commun. Math. Phys. 193, 197 – 208 (1998)
Communications in Mathematical Physics © Springer-Verlag 1998
Recovering Asymptotics of Short Range Potentials
Mark S. Joshi (1), Antônio Sá Barreto (2)
(1) Department of Pure Mathematics and Mathematical Statistics, University of Cambridge, 16 Mill Lane, Cambridge CB2 1SB, England, UK. E-mail: [email protected]
(2) Department of Mathematics, Purdue University, West Lafayette IN 47907, Indiana, USA. E-mail: [email protected]
Received: 3 January 1997 / Accepted: 15 August 1997
Abstract: Any compact smooth manifold with boundary admits a Riemannian metric of the form $x^{-4}dx^2 + x^{-2}h'$ near the boundary, where $x$ is a boundary defining function and $h'$ restricts to a Riemannian metric, $h$, on the boundary. Melrose has associated a scattering matrix to such a metric, which was shown by him and Zworski to be a Fourier integral operator. It is shown here that the principal symbol of the difference of the scattering matrices for two potentials $V_1, V_2 \in x^2 C^\infty(X)$ at fixed energy determines a weighted integral of the lead term of $V_1 - V_2$ over all geodesics on the boundary. This is used to prove that the entire Taylor series of the potential at the boundary is determined by the scattering matrix at a non-zero fixed energy for certain manifolds including Euclidean space.
1. Introduction
In this paper, we show how the asymptotics of a short range potential on various asymptotically Euclidean spaces including $\mathbb{R}^n$ can be recovered from the singularities of the scattering matrix at any given fixed energy. We proceed by using the calculus developed by Melrose and Zworski in [8] to show that the total symbol of the scattering matrix, which they showed is a Fourier integral operator, determines the Taylor series of the potential at infinity. We also achieve some partial results for metric scattering. A scattering manifold, or asymptotically Euclidean manifold, was introduced by Melrose in [7] and is defined to be a smooth manifold $X$ with boundary $\partial X$ which is equipped with a metric $g$ which can be written in the form
$$
g = \frac{dx^2}{x^4} + \frac{h'}{x^2} \tag{1.1}
$$
for some boundary defining function $x$ with $h'$ restricting to a smooth positive definite metric, $h$, on the boundary. The important point here is that $\mathbb{R}^n$ with the Euclidean metric
is an example of such a space; to see this, take a stereographic projection to the upper unit hemisphere in $\mathbb{R}^{n+1}$, as in Eq. (0.5) of [8]. We will not need the special properties of $\mathbb{R}^n$ for most of our arguments and we will work in the general case. Given $f \in C^\infty(\partial X)$, Melrose has shown in [7] that for any non-zero $\lambda$ there is a unique $u \in C^\infty(X^\circ)$ such that $(\Delta - \lambda^2)u = 0$ and
$$
u = e^{i\lambda/x} x^{\frac{n-1}{2}} f' + e^{-i\lambda/x} x^{\frac{n-1}{2}} f''
$$
with $f'$, $f''$ smooth on $X$ up to the boundary and $f$ equal to the restriction of $f'$ to the boundary. The scattering matrix at energy $\lambda \neq 0$ is then the map on $C^\infty(\partial X)$,
$$
S(\lambda) : f \longmapsto f''|_{\partial X}.
$$
Melrose and Zworski showed in [8] that $S(\lambda)$ is a classical Fourier integral operator of order zero associated to the geodesic flow at time $\pi$ given by the metric $h$ on the boundary. We write this as $S(\lambda) \in I^0_{phg}(G_\pi)$. This means that its kernel can be written locally as
$$
\int e^{i\phi(x, y, \theta)} a(x, y, \theta)\, d\theta,
$$
where $a$ is a classical symbol and $\phi$ is a generating function for the Lagrangian submanifold which is the graph of geodesic flow at time $\pi$ on the cosphere bundle over the boundary (see [6], Vol. 4 for a comprehensive account of Fourier integral operators). They proceeded by developing a calculus of Legendrian distributions associated to intersecting Legendrian submanifolds (with conic points) of the scattering cotangent bundle over the boundary. This calculus was then used to construct the Poisson operator for the problem; that is, an operator $P(\lambda)$ from $C^\infty(\partial X) \to C^\infty(X^\circ)$ which maps $P(\lambda) : f \longmapsto u$, where $f$, $u$ are as above. It is clear from their paper that their methods also apply to construct the scattering matrix of certain perturbations of the Laplacian. It also follows that the total symbol of $S(\lambda)$ is determined by the asymptotics of the metric and the perturbation at $\partial X$ and that the symbol is in principle computable in terms of transport equations along geodesics on the boundary. It is by judiciously solving such transport equations that we proceed. We refer the reader to [8] for the definition of scattering pseudo-differential operators on a manifold with scattering metric, denoted by $\Psi^{m,k}_{sc}(X, {}^{sc}\Omega^{\frac{1}{2}})$.
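Returning to the remark that $\mathbb{R}^n$ with the Euclidean metric is a scattering manifold, the form (1.1) can be checked directly near infinity. The computation below is only a sketch: it uses the inverted radial coordinate $x = 1/r$, which is an assumption made here for simplicity and is a boundary defining function only away from the origin, rather than the stereographic projection of [8]; $d\omega^2$ denotes the round metric on the unit sphere $S^{n-1}$.

```latex
% Euclidean metric in polar coordinates (r, omega) on R^n \ {0}:
\[
g_{\mathrm{Eucl}} = dr^2 + r^2\, d\omega^2 .
\]
% Substitute x = 1/r, so that r = 1/x and dr = -dx / x^2:
\[
dr^2 = \frac{dx^2}{x^4}, \qquad r^2\, d\omega^2 = \frac{d\omega^2}{x^2},
\]
% hence
\[
g_{\mathrm{Eucl}} = \frac{dx^2}{x^4} + \frac{d\omega^2}{x^2},
\]
% which is of the form (1.1) with h' = d\omega^2.
```

In this model the boundary is the sphere at infinity, $\partial X = S^{n-1}$, and the induced boundary metric $h$ is the round metric, so the geodesic flow at time $\pi$ is the antipodal relation on great circles.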
Definition 1.1. We define a short range perturbation of a scattering metric $g$ on a manifold $X$ to be a differential operator $F \in \Psi^{2,2}_{sc}(X, {}^{sc}\Omega^{\frac{1}{2}})$ such that $\Delta_g + F$ is equal to the Laplacian induced by some metric on $X$ plus a smooth potential.
Note that the metric associated to $\Delta_g + F$ will necessarily be a scattering metric from our condition on the order of the perturbation. Note also that a function which is smooth up to the boundary and vanishes to second order there will define a short range potential perturbation.
Theorem 1.1. Let $(X, \partial X, g, x)$ be a smooth manifold with boundary with scattering metric $g$ and boundary defining function $x$. Suppose $F_1$, $F_2$ are short range perturbations and $F = F_1 - F_2 \in \Psi^{2,k}_{sc}(X, {}^{sc}\Omega^{\frac{1}{2}})$, where $k \in \mathbb{N}$ is greater than or equal to 2. Let $f(\tau, |\mu|, y, \hat{\mu})$ be the principal symbol of $F$ at the boundary. Let $S_j(\lambda)$ be the scattering matrix associated to $\Delta + F_j - \lambda^2$. Then $S_1(\lambda) - S_2(\lambda) \in I^{-k+1}(G_\pi)$, and its principal symbol determines and is determined by
$$
I_{k-1}(f, \gamma) = \int_0^\pi f(|\lambda| \cos s, |\lambda| \sin s, \gamma(s)) (\sin s)^{k-1}\, ds
$$
for every integral curve, $\gamma$, of the geodesic flow on the unit cosphere bundle over the boundary.
From this we deduce
Corollary 1.1. Let $V_1, V_2 \in S^{-2}_{cl}(\mathbb{R}^n)$, $n \ge 3$. If the associated scattering matrices at some non-zero fixed energy are equal up to smooth terms then $V_1 - V_2 \in S^{-\infty}(\mathbb{R}^n)$.
In fact, all that is important here is the geometry of the boundary and we discuss recovery of potentials for various different boundaries. We also sketch a constructive procedure for recovering the asymptotics of the potential from the scattering matrix. We also note that as rational potentials are determined by their asymptotics, we have that Corollary 1.2. Let V1 , V2 ∈ C ∞ (Rn ), n ≥ 3, be rational potentials of some order less than or equal to −2. If their scattering matrices at some non-zero fixed energy are equal up to smooth terms then V1 = V2 . The problem of recovering a potential from the scattering matrix has a long tradition in mathematical physics. It is known that potentials with suitable decay at infinity can be recovered from the scattering matrix S(λ) for high energies, see for example [1] and [15]. The analogous question for a fixed non-zero energy was first considered by T. Regge [13], R. Newton [14], P. Sabatier [16] and others. They showed that in general the scattering matrix at fixed energy does not determine the potential. They constructed examples of smooth potentials with zero scattering amplitude at a fixed energy (transparent potentials). However their examples do not contradict Corollary 1.2 because these potentials decay at infinity as |x|−3/2 . In the two dimensional case, P. Grinevich [2] constructed examples of smooth rational potentials, decaying at infinity as |x|−2 , with zero scattering amplitude at a fixed energy. So Corollary 1.2 is not true in general for n = 2. Related examples, which are not shown to be rational, also decaying |x|−2 at infinity, were constructed by P. Grinevich and S.Manakov [3] and by V. Zakharov [18]. For potentials that decay rapidly at infinity, Grinevich and Novikov [4] have shown that in dimension two, the scattering matrix at a fixed energy does not determine potentials in S −∞ (Rn ). They construct potentials in the Schwartz class, with a suitable small norm, which are transparent. 
They also show that potentials which do not satisfy the smallness of the norm and decay exponentially at infinity cannot be transparent. Novikov [12] shows that in dimension three exponentially decaying potentials are determined by their scattering matrices at a fixed energy. If $V_j \in L^\infty_{comp}(\mathbb{R}^n)$, $j = 1, 2$, have the same scattering matrix at a fixed energy, then it has been proved by Sylvester and Uhlmann [17] and by Novikov [11], for $n \ge 3$, that $V_1 = V_2$. For $n = 2$, Nachman [9] has shown the unique determination for the conductivity problem.
2. Review
In this section, we review and rephrase the material we need from [7] and [8]. Throughout, $X$ is a compact manifold with boundary $\partial X$ and $g$ is a scattering metric on $X$ with $x$ a boundary defining function for $\partial X$ such that $g$ takes the form (1.1). Our account is necessarily brief and we refer the reader to [7] and [8] for more details. There is a natural bundle over $X$ called the scattering cotangent bundle, which is denoted ${}^{sc}T^*(X)$. This is the dual to the bundle of vector fields of bounded length with respect to some (and hence all) scattering metrics on $X$. The restriction of ${}^{sc}T^*(X)$ to $\partial X$ is denoted ${}^{sc}T^*(X)|_{\partial X}$ and carries a natural contact structure. If $y$ are local coordinates on $\partial X$ and $\mu$ are the corresponding dual coordinates, then $(y, \mu, \tau)$ form local coordinates on ${}^{sc}T^*(X)|_{\partial X}$, where $\tau$ is the coefficient of $\frac{dx}{x^2}$. We omit the definition of scattering pseudo-differential operators as we need only consider differential operators, but note that a differential operator $P(x, D)$ will be in $\Psi^{m,k}_{sc}(X, {}^{sc}\Omega^{1/2})$ if it is of order $m$ and its total symbol, a polynomial in $x^2 D_x$, $x D_y$ with smooth coefficients, vanishes to order $k$ at the boundary. It then has a well-defined symbol at the boundary
$$
j(P) = x^k p_k + x^{k+1} p_{k+1}, \qquad p_k, p_{k+1} \in C^\infty(\mathbb{R} \times T^*\partial X).
$$
Definition 2.1. An intersecting pair with conic points is a subset, $\widetilde{W}$, of ${}^{sc}T^*(X)|_{\partial X}$ which is a union of the closure of a smooth Legendrian submanifold, $W$, and $W^\#$, which is a finite union of global sections of the form $W^\#(\lambda_j) = \{(y, 0, \lambda_j)\}$, and contains $\overline{W} \setminus W$. We also require $W$ to have an at most conic singularity at $\mu = 0$; that is, it is smooth if polar coordinates are introduced along $\overline{W} \setminus W$.
The process of introducing polar coordinates along $\overline{W} \setminus W$ can be given an invariant meaning and is then called blow-up. We denote the blown-up manifold $\widehat{W}$. The metric $g$ induces a metric $h$ on the boundary as it is, near the boundary, of the form $\tau^2 + h'(y, \mu) + x g'$ as a function on ${}^{sc}T^*X$; we obtain $h'$ from $h$ via the isomorphism
$$
\mu \cdot dy \longmapsto \mu \cdot \frac{dy}{x}.
$$
(2.1)
(y, µ) ˆ = exp(sH 1 h )(y 0 , µˆ 0 ), 2
where s ∈ (0, π), (y 0 , µˆ 0 ) ∈ T ∗ ∂X, and h(y 0 , µˆ 0 ) = 1. Then Gy0 (λ) ∪ {(λ, y, 0)} is an intersecting pair with conic points. We also have that on ∂X × ∂X, the boundary of X × ∂X, G(λ) = ∪y0 ∈∂X Gy0 (λ), together with {(λ, y, 0, y 0 , 0)} is an intersecting pair e with conic points. We denote this pair G(λ). This is the pair we are interested in this paper.
Associated with these intersecting pairs at each conic point is a unique homogeneous Lagrangian submanifold $\Lambda(\widetilde{W}, \lambda_i)$ of $T^*(\partial X)$. For the pair, $\widetilde{G}(\lambda)$, we are interested in, this is precisely the relation of being $\pi$ apart along a lifted geodesic (see Proposition 3 of [8]). For simplicity we shall henceforth take $\lambda$ to be positive. Melrose and Zworski associated to any such intersecting pair a class of smooth functions whose asymptotics on approach to the boundary are determined by symbols on the Legendrians. A symbol bundle over the smooth Legendrian $W(\lambda)$ in the pair $\widetilde{W}$ can be defined and is denoted $\hat{E}^{m,p}$. The sections of this bundle are of the form
$$
a\, S^{p-m} |dx|^{m - n/4}
$$
with $a$ a smooth section of $C^\infty(\widehat{W}; \Omega_b^{\frac{1}{2}} \otimes M_H)$, where $S$ is a defining function of the boundary of $\widehat{W}$, $M_H$ is the Maslov bundle and $\Omega_b^{\frac{1}{2}}$ is the b-half density bundle. For $G$ above, one could take $S = \sin s$. Melrose and Zworski remove this singularity at the endpoints by rescaling but for us it will be easier not to do so.
Proposition 2.1. If $\widetilde{W}(\lambda)$ is an intersecting pair with conic points then there is a class of smooth half-densities on $X^\circ$, denoted $I^{m,p}_{sc}(X, \widetilde{W})$, such that $\bigcap_{m,p} I^{m,p}_{sc}(X, \widetilde{W})$ is equal to the class of half-densities vanishing to infinite order at the boundary. There exists a symbol map
$$
\hat{\sigma}_{sc,m,p} : I^{m,p}_{sc}(X, \widetilde{W}, {}^{sc}\Omega^{1/2}) \to C^\infty(\widehat{W}; \hat{E}^{m,p})
$$
which gives a short exact sequence
$$
0 \to I^{m+1,p}_{sc}(X, \widetilde{W}, {}^{sc}\Omega^{1/2}) \to I^{m,p}_{sc}(X, \widetilde{W}, {}^{sc}\Omega^{1/2}) \to C^\infty(\widehat{W}; \hat{E}^{m,p}) \to 0.
$$
This is Proposition 12 from [8]. An important related fact we need to know is how Legendrian distributions map under scattering pseudo-differential operators. We recall Proposition 13 from [8].

Proposition 2.2. Suppose $P \in \Psi_{sc}^{l,k}(X, {}^{sc}\Omega^{1/2})$ has symbol $x^k p_k + x^{k+1} p_{k+1}$ with respect to a product decomposition of $X$ near $\partial X$, and suppose that $W \subset {}^{sc}T^*_{\partial X}(X)$ is a smooth Legendre submanifold. Then for any $m \in \mathbb{R}$,
$$P : I_{sc}^{m}(X, W; {}^{sc}\Omega^{1/2}) \to I_{sc}^{m+k}(X, W; {}^{sc}\Omega^{1/2}), \tag{2.2}$$
$$\sigma_{sc,m+k}(Pu) = (p_k|_W)\,\sigma_{sc,m}(u) \otimes |dx|^k. \tag{2.3}$$
Furthermore, if $p_k$ vanishes identically on $W$ then
$$P : I_{sc}^{m}(X, W; {}^{sc}\Omega^{1/2}) \to I_{sc}^{m+k+1}(X, W; {}^{sc}\Omega^{1/2})$$
and
$$\sigma_{sc,m+k+1}(Pu) = \left(\frac{1}{i}\left(L_V + \left(\frac12(k+1) + m - \frac n4\right)\frac{\partial p_k}{\partial\tau}\right) + p_{k+1}|_W\right) a \otimes |dx|^{m+k+1-\frac n4},$$
where $\sigma_{sc,m}(u) = a \otimes |dx|^{m-\frac n4}$ and $V$ is the rescaled Hamiltonian vector field associated to $p_k$.
M. S. Joshi, A. S´a Barreto
We omit the definition of the rescaled Hamiltonian vector field but recall that for the pair $G$ we are studying it is equal to
$$2\lambda \sin s\,\frac{\partial}{\partial s}$$
in the semi-global coordinates given by (2.1).

The other main fact we need is the push-forward theorem, Proposition 16, from [8]. This relates the singularities of the scattering matrix to the asymptotics in small $x$ of the Poisson operator - it is at the crux of our argument. Given a product decomposition near the boundary, there is a natural pairing
$$B : C^{-\infty}(X, {}^{sc}\Omega^{1/2}) \times C^\infty(\partial X; {}^{sc}\Omega^{1/2}) \to C^{-\infty}([0,\epsilon), {}^{sc}\Omega^{1/2}), \tag{2.4}$$
$$B(u, f) = x^{\frac{n-1}{2}} \int_{\partial X} u(x, y) f(y). \tag{2.5}$$
Proposition 2.3. For any intersecting pair of Legendre submanifolds with conic points, $\tilde W$, the partial pairing (2.5) gives a map
$$B : I_{sc}^{m,p}(X, \tilde W; {}^{sc}\Omega^{1/2}) \times C^\infty(\partial X; {}^{sc}\Omega^{1/2}) \longrightarrow \sum_j I^{p+\frac{n-1}{4}}([0,\epsilon), W_0(\bar\tau_j); {}^{sc}\Omega^{1/2}),$$
where the $W_0(\bar\tau_j) = \{(0, -\bar\tau_j\,dx/x^2)\}$ are the Legendre submanifolds corresponding to the components of $W^\#$ and
$$B(u, f) = \sum_j e^{-i\bar\tau_j/x}\, x^{p+n/4}\, Q^0_{\bar\tau_j}(u, f)\left|\frac{dx}{x^2}\right|^{1/2} + O(x^{p+n/4+1})$$
with
$$Q^0_{\bar\tau}(u) \in I_{phg}^{p-m-\frac{n-1}{4}}(\partial X, \Lambda(\tilde W, \bar\tau)),$$
and the principal symbol determines and is determined by the lead singularity of the principal symbol of $u$ on $W$ on approach to $W_0(\bar\tau_j)$.

3. Proof of Theorem 1.1

Proof. Let $F_j$, $j = 1, 2$, be as in the hypotheses of Theorem 1.1. We know that if $P_j$ is the Poisson operator of $\Delta - \lambda^2 + F_j$ then
$$(\Delta - \lambda^2 + F_j)P_j = 0, \tag{3.1}$$
$$Q_\lambda(P_j) = \mathrm{Id}, \tag{3.2}$$
$$Q_{-\lambda}(P_j) = S_j(\lambda), \tag{3.3}$$
with $S_j(\lambda)$ being the scattering matrix and
$$P_j(\lambda) \in I_{sc}^{-\frac{2n-1}{4},-\frac14}(X \times \partial X, \tilde G(\lambda); {}^{sc}\Omega^{1/2}).$$
Our hypothesis that $F_j$ is of order at least 2 at the boundary means that the principal and sub-principal symbols of $\Delta - \lambda^2 + F_j$ at the boundary are independent of $j$. This will imply that the principal symbol of $P_j$ on $G$ is independent of $j$. Suppose $F_1$, $F_2$ are of order $r \geq 2$ at the boundary. We also have that
$$(\Delta - \lambda^2)(P_1 - P_2) = (F_2 - F_1)P_1 + F_2(P_2 - P_1)$$
and so, as $F_2$ is of at least second order on the boundary, that
$$\sigma_{sc,-\frac{2n-1}{4}+k-2}((\Delta - \lambda^2)(P_1 - P_2)) = \sigma_{sc,-\frac{2n-1}{4}+k-2}((F_2 - F_1)P_1) \tag{3.4}$$
$$= \sigma_{sc,-\frac{2n-1}{4}}(P_1)\,\sigma_{sc,k}(F_2 - F_1) \otimes |dx|^k. \tag{3.5}$$
Our purpose in this section is to use this to compute $\sigma_{sc,-\frac{2n-1}{4}+r-1}(P_1 - P_2)$, and thus deduce Theorem 1.1.

First, we must compute $\sigma_{sc,-\frac{2n-1}{4}}(P_1)$ on $G$; we can parametrize the Legendrian $G$ by $(y, s, \hat\mu)$ with $(y, \hat\mu)$ a point of the cosphere bundle of the boundary and $s$ the geodesic distance on the boundary from the initial point (as in Sect. 2). In these coordinates the rescaled Hamilton vector field of $\Delta$ on $G$ takes the form
$$V = 2\lambda \sin s\,\frac{\partial}{\partial s},$$
where $s$ runs from 0 to $\pi$ on $G$, with $G$ as in Sect. 2. Now $\sin s$ is a boundary defining function for the Legendrian, $G$, and so
$$\frac{|ds|^{1/2}|dy|^{1/2}|d\hat\mu|^{1/2}}{(\sin s)^{1/2}}$$
trivializes the b-half density bundle over $\hat G$, the blown-up Legendrian. This is a good choice as
$$L_V\left(\frac{|ds|^{1/2}|dy|^{1/2}|d\hat\mu|^{1/2}}{(\sin s)^{1/2}}\right) = 0.$$
We now use Proposition 2.2 to compute the principal symbol of $P$ away from the endpoints. We have
$$k = 0, \qquad m = -\frac{2n-1}{4}$$
and $\frac{\partial p_k}{\partial\tau} = 2\tau = 2\lambda\cos(s)$. Using the argument of Lemma 14 of [8], we have that
$$p_{k+1} = -i(n-1)\tau + c = -i\lambda(n-1)\cos(s) + c(\alpha(s)),$$
where $c$ is $\frac{dh}{dx}$ on the boundary and is of the form
$$c(y, \mu, \tau) = \tau f(y, \mu) + g(y, \mu)$$
with $f$ linear in $\mu$ and $g$ quadratic in $\mu$, independent of $F_j$, $j = 1, 2$, and $\alpha(s)$ denotes the point where $\tau = \lambda\cos s$, $|\mu| = \lambda\sin s$ and $y$, $\hat\mu$ are given by the time $s$ flow from the initial point. Since $f$ and $g$ both vanish with $\mu$, $c$ will be equal to $2\lambda\sin s$ times a smooth function $d(\alpha(s))$.
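So if our symbol is
$$b(s, y, \hat\mu)\,\frac{|ds|^{1/2}|dy|^{1/2}|d\hat\mu|^{1/2}}{(\sin s)^{1/2}}\,|dx|^{m-1/4},$$
we have that $b$ satisfies
$$\left(2\sin(s)\frac{\partial}{\partial s} + \left(\frac12 - \frac{2n-1}{2}\right)2\cos(s) + (n-1)\cos(s) + 2i\sin(s)\,d(\alpha(s))\right) b = 0. \tag{3.6}$$
This implies that
$$\left(\sin(s)\frac{\partial}{\partial s} + \frac{1-n}{2}\cos(s) + i\sin(s)\,d(\alpha(s))\right) b = 0. \tag{3.7}$$
We can rewrite this as
$$\frac{\partial}{\partial s}\left(\sin^{\frac{1-n}{2}}(s)\,b(s)\right) + i\,d(\alpha(s))\left(\sin^{\frac{1-n}{2}}(s)\,b(s)\right) = 0. \tag{3.8}$$
This is easily solved to give
$$b(s) = C\,\sin^{\frac{n-1}{2}}(s)\,e^{-i\int_0^s d(\alpha(s'))\,ds'}. \tag{3.9}$$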
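As a quick sanity check on (3.9), the following Python sketch verifies numerically that $b(s) = C\sin^{(n-1)/2}(s)\,e^{-i\int_0^s d}$ satisfies (3.7). The function $d$ is not specified in the text, so the code substitutes a hypothetical stand-in $d(\alpha(s)) = \cos s$ (whose integral is $\sin s$); the values $n = 3$ and $C = 1$ are likewise arbitrary choices made only for illustration.

```python
import cmath
import math

N_DIM = 3      # illustrative dimension n (an assumption)
C = 1.0        # arbitrary non-zero constant

def d(s):
    # Hypothetical stand-in for d(alpha(s)); the text leaves d unspecified.
    return math.cos(s)

def d_integral(s):
    # Exact integral of the stand-in d from 0 to s.
    return math.sin(s)

def b(s):
    # Candidate solution (3.9).
    return C * math.sin(s) ** ((N_DIM - 1) / 2) * cmath.exp(-1j * d_integral(s))

def residual(s, h=1e-5):
    # Left-hand side of (3.7), with db/ds taken from a central difference.
    db = (b(s + h) - b(s - h)) / (2 * h)
    return (math.sin(s) * db
            + (1 - N_DIM) / 2 * math.cos(s) * b(s)
            + 1j * math.sin(s) * d(s) * b(s))

# Sample points strictly inside (0, pi).
max_residual = max(abs(residual(0.1 + 0.1 * k)) for k in range(30))
print(max_residual)
```

The residual should be at the level of the finite-difference error only; other choices of the stand-in `d` can be tested by changing `d` and `d_integral` together.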
This has a singularity of order $(n-1)/2$ at the endpoints which is equal to $p - m$ in Proposition 2.1 and so is what we would expect as the symbols on $\hat G$ are defined to have singularities of order $p - m$. It is clear that $C$ is non-zero.

We now have enough data to use (3.5) to compute the principal symbol of $P_1 - P_2$. Taking the same trivializing section of the b-half density bundle as above we obtain
$$\left(\sin(s)\frac{\partial}{\partial s} + \frac{1-n}{2}\cos(s) + l\cos(s) + i\sin(s)\,d(\alpha(s))\right) a = \frac{iC}{2\lambda}\,\sin(s)^{n/2-1/2}\,e^{-i\int_0^s d(\alpha(s'))\,ds'}\,f(\alpha(s)), \tag{3.10}$$
where $a$ times the trivializing section times $|dx|^{-\frac{2n-1}{4}+r-1}$ is the principal symbol and the initial data is zero because $Q^0_\lambda(P_1 - P_2) = 0$. Writing $a = (\sin s)^{\frac{n-1}{2}-l}\,b'$ this is equivalent to
$$\frac{\partial b'}{\partial s} + i\,d(\alpha(s))\,b' = \frac{iC}{2\lambda}\,\sin^{l-1}(s)\,e^{-i\int_0^s d(\alpha(s'))\,ds'}\,f(\alpha(s)), \tag{3.11}$$
and so we have
$$b'(s) = \frac{iC}{2\lambda}\,e^{-i\int_0^s d(\alpha(s'))\,ds'}\int_0^s f(\alpha(s'))\sin(s')^{l-1}\,ds', \tag{3.12}$$
and thus that
$$a(s) = \frac{iC}{2\lambda}\,(\sin s)^{\frac{n-1}{2}-l}\,e^{-i\int_0^s d(\alpha(s'))\,ds'}\int_0^s f(\alpha(s'))\sin(s')^{l-1}\,ds'. \tag{3.13}$$
The principal symbol of the scattering matrices will determine the lead singularity of $a$ at $s = \pi$ using Proposition 2.3. The lead singularity is
$$\frac{iC}{2\lambda}\,(\pi - s)^{n/2-1/2-l}\,e^{-i\int_0^\pi d(\alpha(s'))\,ds'}\int_0^\pi f(\alpha(s'))\sin(s')^{l-1}\,ds'.$$
We have not considered Maslov factors but these will depend purely on the geometry around the curve and not on the potentials and so can be absorbed into the constant $C$. So the principal symbol of the difference of the scattering matrices determines
$$\int_0^\pi f(\alpha(s))\sin(s)^{l-1}\,ds.$$
This ends the proof of Theorem 1.1.
Note that α(s) depends on |λ| but that in the case of potential scattering the perturbation will in fact be independent of the variables which are affected by λ.
4. Recovering Potentials on Manifolds

The next question then becomes the injectivity of the transformations. We restrict ourselves to studying the case of potential perturbations. The potential is only a function of the base variable and so, for short range potentials, from Theorem 1.1, we can recover weighted integrals of the potential along geodesics. We first consider the behaviour along a single geodesic, so define $I_j : C^\infty(\mathbb{R}) \to C^\infty(\mathbb{R})$ by
$$I_j f(\alpha) = \int_\alpha^{\alpha+\pi} (\sin(s - \alpha))^j f(s)\,ds.$$
For $j > 1$ the boundary terms vanish, so if we differentiate with respect to $\alpha$ twice, we obtain
$$\int_\alpha^{\alpha+\pi} \left[j(j-1)(\sin(s-\alpha))^{j-2}\cos^2(s-\alpha) - j(\sin(s-\alpha))^j\right] f(s)\,ds = j(j-1)\,I_{j-2}f(\alpha) - j^2\,I_j f(\alpha),$$
and so we can obtain an expression for $I_{j-2}$ in terms of $I_j$ and its second derivative. This means that it is sufficient to show that each of $I_0$ and $I_1$ determines the data. It is immediate from the fundamental theorem of calculus that if one differentiates $I_0 f$ with respect to $\alpha$ one obtains $f(\alpha + \pi) - f(\alpha)$, and by differentiating twice one can recover $f(\alpha + \pi) + f(\alpha)$ from $I_1 f$, since $(I_1 f)'' = f(\alpha+\pi) + f(\alpha) - I_1 f$. So we have proven
Theorem 4.1. If $\gamma$ is a geodesic on the boundary then $I_j(f)(\gamma)$ determines $f(\gamma(\alpha + \pi)) - (-1)^j f(\gamma(\alpha))$, $\forall\alpha$, and this quantity is equal to zero if $I_j(f)(\gamma)$ is equal to zero.

Now if $\gamma$ is a closed geodesic of period $T$ then we can recover
$$\int_0^T\int_\alpha^{\alpha+\pi} V(\gamma(s))(\sin(s-\alpha))^{l-1}\,ds\,d\alpha = C_T\int_0^T V(\gamma(s))\,ds, \tag{4.1}$$
that is, we can recover the X-ray transform of $V$. We can thus prove

Theorem 4.2. If $V_1$, $V_2$ are short range potentials on a scattering manifold $X$ of dimension $\geq 3$, with boundary equal to a sphere of radius not equal to $\frac1k$, $k \in \mathbb{N}$, and $S_1(\lambda) - S_2(\lambda)$ is smooth, then $V_1 - V_2$ vanishes to order infinity at the boundary. This result also holds for any $X$ such that the X-ray transform on $\partial X$ is injective or if the injectivity radius on $\partial X$ is bigger than $\pi$.

Proof. Suppose
$$V_1 - V_2 = x^l V + O(x^{l+1})$$
with $V$ a function on the boundary. We show that, in fact, $V$ must be zero. The result then follows by induction. If the X-ray transform on $\partial X$ is injective, the result follows from (4.1). In general, we have that
$$\int_0^\pi V(\gamma(s))(\sin s)^{l-1}\,ds = 0$$
for all geodesics $\gamma$, so $V$ must be zero at some point $p$ in each component of $\partial X$. Then from Theorem 4.1, $V(\gamma(s))$ is periodic of period $\pi$: it is then zero also at all points of distance $\pi$ from $p$. This set, which we denote by $S_p$, is a submanifold of one lower dimension than the boundary, provided the injectivity radius of the boundary is bigger than $\pi$ or the boundary is a sphere of radius not equal to $1/k$ (after possibly wrapping around). Again by Theorem 4.1, we find that $V$ must be zero on the set of points whose distance to a point in $S_p$ is $\pi$. This contains an open neighbourhood of $p$. Thus the set of zeros of $V$ is an open set. Thus $V$ vanishes everywhere and the result follows.

If the boundary is a sphere of radius 1 then the sphere at distance $\pi$ is precisely the antipodal point and so we deduce from Theorem 4.1 that in fact $V$ is an even function if $l$ is odd and odd if $l$ is even. It is a result due to Helgason, see for example [5], Theorem 4.7, that the kernel of the X-ray transform on the sphere is precisely the odd functions. Thus, since we know $V$ is in the kernel, we deduce that $V$ is identically zero if $l$ is odd. Now to recover $V$ when $l$ is even, we know that $V$ is odd and thus if we can recover the X-ray transform of $V$ times an odd function, which is supported everywhere, we are done. So consider the unit sphere in $\mathbb{R}^{n+1}$; we can recover the X-ray transform of $x_1 V$ simply by observing that if one takes a geodesic starting in the plane $x_1 = 0$ then its $x_1$ coordinate at time $s$ will be a fixed multiple of $\sin s$. Thus
$$\int_0^\pi (x_1 f)(\gamma(s))\,ds = C\int_0^\pi \sin(s) f(\gamma(s))\,ds.$$
Adding the two halves of the great circle together we obtain the X-ray transform.
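The reduction from $I_j$ to $I_{j-2}$ used at the start of this section can be checked numerically. The sketch below is only an illustration under stated assumptions: the test function `f`, the base point `alpha`, and the quadrature and finite-difference step sizes are arbitrary choices, not taken from the paper. It checks the identity $(I_j f)'' = j(j-1) I_{j-2} f - j^2 I_j f$ for $j = 3$, and the endpoint identity $(I_1 f)'' + I_1 f = f(\alpha + \pi) + f(\alpha)$.

```python
import math

def simpson(g, a, b, n=2000):
    # Composite Simpson rule with an even number n of subintervals.
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def f(s):
    # Arbitrary smooth test function (an assumption for illustration).
    return math.exp(math.sin(s)) + 0.3 * math.cos(2 * s)

def I(j, alpha):
    # I_j f(alpha): integral over [alpha, alpha + pi] of sin(s - alpha)^j f(s) ds.
    return simpson(lambda s: math.sin(s - alpha) ** j * f(s), alpha, alpha + math.pi)

def second_derivative(g, x, h=1e-3):
    # Central second difference.
    return (g(x + h) - 2 * g(x) + g(x - h)) / h ** 2

alpha = 0.7

# For j > 1: (I_j f)'' = j(j-1) I_{j-2} f - j^2 I_j f.
j = 3
lhs = second_derivative(lambda a: I(j, a), alpha)
rhs = j * (j - 1) * I(j - 2, alpha) - j ** 2 * I(j, alpha)
recurrence_error = abs(lhs - rhs)

# For j = 1: (I_1 f)'' + I_1 f = f(alpha + pi) + f(alpha).
lhs1 = second_derivative(lambda a: I(1, a), alpha) + I(1, alpha)
endpoint_error = abs(lhs1 - (f(alpha + math.pi) + f(alpha)))

print(recurrence_error, endpoint_error)
```

Both errors should be small compared with the size of the quantities involved, limited only by the finite-difference step.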
Note that as the boundary of the compactification of the Euclidean $\mathbb{R}^n$ given by Eq. (0.5) of [8] is the unit sphere, we have the result, in particular, for Euclidean space, which is Corollary 1.1. If $X$ has dimension two, then the argument used above does not work. In fact the result is not true, as is shown in the examples of [2].

As well as proving that the scattering matrix, as a map from asymptotics of potentials to Fourier integral operators, is injective, our results actually give a technique for reconstructing the asymptotics from the scattering matrix. We sketch the constructive procedure. Given a scattering matrix, $S(\lambda)$, for some unknown potential, first compute the scattering matrix, $S_0(\lambda)$, up to smooth terms, for the zero potential. This can be done constructively and explicitly (in principle at least!) by using the proofs in [8]. One then computes the principal symbol of the difference $S(\lambda) - S_0(\lambda)$ and inverts the transforms as above. This enables one to compute the lead term of the potential, $x^2 V_2$. One then repeats the process, computing $S_1(\lambda)$, the scattering matrix associated to $x^2 V_2$, and using $S_1(\lambda) - S(\lambda)$ to get the next term $x^3 V_3$. Repeating, one can find all terms in the expansion. Of course, to carry out the procedure computationally it would be necessary to solve the eikonal equation and work with total symbols, which is something we have avoided in this note.

Acknowledgement. We express our thanks to Maciej Zworski for suggesting the connection between the question studied here and the results of [8] and for his help in understanding the intricacies of that paper. The work of the first author was partly supported by the Newton Trust. The work of the second author was supported by the National Science Foundation Grant DMS-9623175 and a Sloan Fellowship. This research was also supported by an L.M.S. grant.
References
1. Faddeev, L.D.: The uniqueness of solutions for the inverse scattering problem. Vestnik Leningrad Univ. 7, 126–130 (1956)
2. Grinevich, P.G.: Rational solutions of the Veselov–Novikov equations are reflectionless two-dimensional potentials at fixed energy. Theor. and Math. Phys. 69 (2), 307–310 (1986)
3. Grinevich, P.G., Manakov, S.V.: The inverse scattering problem for the two dimensional Schrödinger operator, the ∂̄-method and non-linear equations. Funkt. Anal. i Pril. 20 (2), 14–24 (1986), translation in Funct. Anal. and Appl. 20, 94–103 (1986)
4. Grinevich, P.G., Novikov, R.G.: Transparent potentials at fixed energy in dimension two. Fixed energy dispersion relations for the fast decaying potentials. Commun. Math. Phys. 174, 409–446
5. Helgason, S.: Groups and Geometric Analysis. New York–London: Academic Press, 1984
6. Hörmander, L.: The Analysis of Linear Partial Differential Operators, Volumes 1 to 4. Berlin–Heidelberg: Springer-Verlag, 1983
7. Melrose, R.B.: Spectral and Scattering Theory for the Laplacian on Asymptotically Euclidean spaces. (M. Ikawa, ed.), New York: Marcel Dekker, 1994
8. Melrose, R.B., Zworski, M.: Scattering Metrics and Geodesic Flow at Infinity. Invent. Math. 124, 389–436 (1996)
9. Nachman, A.I.: Global uniqueness for a two-dimensional inverse boundary problem. Ann. of Math. 143 no. 1, 71–96 (1996)
10. Newton, R.: Construction of potentials from the phase shifts at fixed energy. J. Math. Phys. 3, 75–82 (1962)
11. Novikov, R.G.: Multidimensional inverse spectral problems for the equation −Δψ + (v(x) − E)ψ = 0. Funct. Anal. and Appl. 22, 263–272 (1988)
12. Novikov, R.G.: The inverse scattering problem at fixed energy for the three-dimensional Schrödinger equation with an exponentially decreasing potential. Commun. Math. Phys. 161, 569–595 (1994)
13. Regge, T.: Introduction to complex orbital moments. Nuovo Cimento 14, 951–976 (1959)
14. Newton, R.: Construction of potentials from the phase shifts at fixed energy. J. Math. Phys. 3, 75 (1962)
15. Saito, Y.: Some properties of the scattering amplitude. Osaka J. of Math. 19, 527–547 (1982)
16. Sabatier, P.: Asymptotic properties of the potentials in the inverse scattering problem at fixed energy. J. Math. Phys. 7, 1515–1531 (1966)
17. Sylvester, J., Uhlmann, G.: A global uniqueness theorem for an inverse boundary value problem. Ann. of Math. 125, 154–169 (1987)
18. Zakharov, V.E.: Shock waves spreading along solitons on the surface of liquid. Izv. Vuzov Radiofiz. 29 (9), 1073–1079 (1986)

Communicated by B. Simon
Commun. Math. Phys. 193, 209 – 218 (1998)
Communications in Mathematical Physics © Springer-Verlag 1998

Gauge Dependence in the Theory of Non-Linear Spacetime Perturbations
Sebastiano Sonego*, Marco Bruni**
International School for Advanced Studies, Via Beirut 2-4, 34014 Trieste, Italy.
Received: 21 November 1996 / Accepted: 20 August 1997
Abstract: Diffeomorphism freedom induces a gauge dependence in the theory of spacetime perturbations. We derive a compact formula for gauge transformations of perturbations of arbitrary order. To this end, we develop the theory of Taylor expansions for one-parameter families (not necessarily groups) of diffeomorphisms. First, we introduce the notion of knight diffeomorphism, that generalises the usual concept of flow, and prove a Taylor’s formula for the action of a knight on a general tensor field. Then, we show that any one-parameter family of diffeomorphisms can be approximated by a family of suitable knights. Since in perturbation theory the gauge freedom is given by a one-parameter family of diffeomorphisms, the expansion of knights is used to derive our transformation formula. The problem of gauge dependence is a purely kinematical one, therefore our treatment is valid not only in general relativity, but in any spacetime theory.
* Present address: Università degli Studi di Udine, DIC – Via delle Scienze 208, 33100 Udine, Italy. E-mail: [email protected]
** E-mail: [email protected]

1. Introduction

In the theory of spacetime perturbations [1–3], one usually deals with a family of spacetime models $\mathcal{M}_\lambda := (M, \{T_\lambda\})$, where $M$ is a manifold that accounts for the topological and differential properties of spacetime, and $\{T_\lambda\}$ is a set of fields on $M$, representing its geometrical and physical content. The numerical parameter $\lambda$ that labels the various members of the family gives an indication of the “size” of the perturbations, regarded as deviations of $\mathcal{M}_\lambda$ from a background model $\mathcal{M}_0$. Perturbations are described as additional fields in the background, defined as $\Delta T^\varphi_\lambda := \varphi^*_\lambda T_\lambda - T_0$, where $\varphi_\lambda : M \to M$ is a diffeomorphism that provides a pairwise identification between points of the perturbed spacetime and of the background, and $\varphi^*_\lambda$ denotes the pull-back. Of course, such an identification is arbitrary, and this leads to a gauge freedom in the definition of perturbations. Under a change $\varphi_\lambda \to \psi_\lambda$ of the point identification mapping, a perturbation transforms as $\Delta T^\varphi_\lambda \to \Delta T^\psi_\lambda$, with
$$\Delta T^\psi_\lambda = \Phi^*_\lambda\,\Delta T^\varphi_\lambda + \Phi^*_\lambda T_0 - T_0, \tag{1.1}$$
where $\Phi_\lambda := \varphi^{-1}_\lambda \circ \psi_\lambda$ is a diffeomorphism on $M$.

In the perturbative approach, one tries to approximate $T_\lambda$ expressing $\Delta T^\varphi_\lambda$ as a series,
$$\Delta T^\varphi_\lambda = \sum_{k=1}^{n-1}\frac{\lambda^k}{k!}\,\delta^k T^\varphi + O(\lambda^n), \tag{1.2}$$
where $n$ is the order of differentiability with respect to $\lambda$ of $\Delta T^\varphi_\lambda$, and then solving iteratively the field equations for the various terms $\delta^k T^\varphi$. It is then important to know how the latter transform under a change of gauge. Until very recently, only the first order terms, $\delta^1 T^\varphi$, have been considered; in this case, it is well-known that the representations of a perturbation in two different gauges differ just by a Lie derivative of the background quantity $T_0$ [1]. However, non-linear perturbations are now becoming a valuable tool of investigation in black hole and gravitational wave physics [4], as well as in cosmology [5]. Their behaviour under gauge transformations can be derived by Taylor-expanding (1.1) with respect to $\lambda$. This apparently straightforward procedure presents a difficulty, though. Even if one chooses, as usual, point identification maps that are one-parameter groups with respect to $\lambda$, the family of diffeomorphisms $\Phi_\lambda$ is not a one-parameter group [3], i.e., it does not correspond to a flow on $M$. While flows on manifolds are well understood and widely discussed in the literature, more general one-parameter families of diffeomorphisms are not (see however references [6–9]). Therefore, in order to extract from (1.1) the relationship between $\delta^k T^\varphi$ and $\delta^k T^\psi$, one must first develop the theory of Taylor expansions for general one-parameter families of diffeomorphisms, not necessarily forming a local group. The purpose of the present article is to provide the mathematical framework needed for this purpose. Roughly, the discussion generalises Sect. 2 of reference [3] from the analytic to the $C^n$ case, but we also derive here a compact formula that gives directly the gauge transformation to an arbitrary order $k$.

The paper is organised as follows. In the next section we define particular combinations of flows that we dub knight diffeomorphisms, and present our main result (Theorem 1). This establishes that arbitrary one-parameter families of diffeomorphisms can be approximated by families of knights, so that all one needs is a suitable expression for the Taylor expansion of knights, which is derived in Sect. 3. Then, Theorem 1 is proved in Sect. 4. Section 5 contains the application to (1.1), i.e., our formula (5.1) and some concluding remarks.

In the following, we shall work on a finite-dimensional manifold $M$, smooth enough for all the statements below to make sense. In order to avoid cumbersome talking about neighbourhoods, we shall often suppose that maps are globally defined. This assumption simplifies the discussion, without altering the results significantly. Also, we specify the class of differentiability of an object only when it is really needed. Finally, let us recall that a one-parameter family of diffeomorphisms of $M$ is a differentiable mapping $\Phi : D \to M$, with $D$ an open subset of $\mathbb{R} \times M$ containing $\{0\} \times M$, and $\Phi(0, p) = p$, $\forall p \in M$. As we have already been doing, we shall write, following the common usage, $\Phi_\lambda(p) := \Phi(\lambda, p)$, for any $(\lambda, p) \in D$.
2. Knight Diffeomorphisms

Let $\phi^{(1)} : D_1 \to M, \ldots, \phi^{(k)} : D_k \to M$ be flows on $M$, generated by the vector fields $\xi_1, \ldots, \xi_k$, respectively. We can combine $\phi^{(1)}, \ldots, \phi^{(k)}$ to define a new one-parameter family of diffeomorphisms $\Psi : D \to M$, with $D$ a suitable open subset of $\mathbb{R} \times M$ containing $\{0\} \times M$, whose action is given by
$$\Psi_\lambda := \phi^{(k)}_{\lambda^k/k!} \circ \cdots \circ \phi^{(2)}_{\lambda^2/2} \circ \phi^{(1)}_{\lambda}. \tag{2.1}$$
Thus, $\Psi_\lambda$ displaces a point of $M$ a parameter interval $\lambda$ along the integral curve of $\xi_1$, then an interval $\lambda^2/2$ along the integral curve of $\xi_2$, and so on (see Fig. 1 for the case $k = 2$). For this reason, we shall call $\Psi_\lambda$, with a chess-inspired terminology, a knight diffeomorphism of rank $k$ or, more shortly, a knight. The vector fields $\xi_1, \ldots, \xi_k$ will be called the generators of $\Psi$.
Fig. 1. The action of a knight diffeomorphism $\Psi_\lambda$ of rank 2 generated by $\xi_1$ and $\xi_2$. Solid lines: integral curves of $\xi_1$. Dashed lines: integral curves of $\xi_2$. The parameter lapse between $p$ and $\phi^{(1)}_\lambda(p)$ is $\lambda$, and that from $\phi^{(1)}_\lambda(p)$ to $\Psi_\lambda(p)$ is $\lambda^2/2$
The utility of knights stems from the fact that any $C^n$ one-parameter family $\Phi$ of diffeomorphisms can always be approximated by a family $\Psi$ of knights of rank $n - 1$, as shown by the following

Theorem 1. Let $\Phi : D \to M$ be a $C^n$ one-parameter family of diffeomorphisms. Then $\exists\,\phi^{(1)}, \ldots, \phi^{(n-1)}$, flows on $M$ such that, up to the order $\lambda^n$, the action of $\Phi_\lambda$ is equivalent to the one of the $C^n$ knight
$$\Psi_\lambda = \phi^{(n-1)}_{\lambda^{n-1}/(n-1)!} \circ \cdots \circ \phi^{(2)}_{\lambda^2/2} \circ \phi^{(1)}_{\lambda}. \tag{2.2}$$
This result allows one to use knights in order to investigate many properties of arbitrary diffeomorphisms. In a sense, knights play among the one-parameter families of diffeomorphisms of M the same crucial role that polynomials play for functions of a real variable. We postpone the proof of Theorem 1 to Sect. 4, after we have established some preliminary results.
3. Taylor Expansion of Flows and Knights

It is easy to generalise the usual Taylor’s expansions on $\mathbb{R}^m$ [10] to the case of a flow acting on a manifold:

Proposition 2. Let $\phi : D \to M$ be a flow generated by the vector field $\xi$, and $T$ a tensor field such that $\phi^*_\lambda T$ is a (tensor-valued) function of $\lambda$ of class $C^n$. Then, $\phi^*_\lambda T$ can be expanded around $\lambda = 0$ as
$$\phi^*_\lambda T = \sum_{l=0}^{n-1}\frac{\lambda^l}{l!}\,\pounds^l_\xi T + \lambda^n R^{(n)}_\lambda T, \tag{3.1}$$
where $\pounds_\xi$ is the Lie derivative along the flow $\phi$, and $R^{(n)}_\lambda$ is a linear map whose action on $T$ is given by
$$R^{(n)}_\lambda T = \frac{1}{(n-1)!}\int_0^1 dt\,(1-t)^{n-1}\,\pounds^n_\xi\,\phi^*_{t\lambda} T. \tag{3.2}$$
This proposition has the important consequence that, for a tensor field $T$ and a flow $\phi$ such that $\phi^*_\lambda T$ is $C^n$, one can approximate $\phi^*_\lambda T$, to order $n - 1$, by the polynomial
$$\sum_{l=0}^{n-1}\frac{\lambda^l}{l!}\,\pounds^l_\xi T.$$
λ→0
1 n £ T , n! ξ
(3.3)
which implies that, for λ → 0, the remainder λn Rλ(n) T is O(λn ).1 The proof of Proposition 2 is rather straightforward and can be omitted. We only wish to point out that it relies heavily on the property that φλ forms a one-parameter group: φσ+λ = φσ ◦ φλ . It is evident from (2.1) that for knights one has, in general, 9σ ◦ 9λ 6= 9σ+λ , and 9−1 λ 6= 9−λ . Thus, Eq. (3.1) cannot be applied if we want to expand in λ the pull-back 9∗λ T of a tensor field T defined on M. The ultimate reason for this, is that a family of knights does not form a group, except under very special conditions, as shown by the following 1 Actually, this result holds also for the weaker case in which φ∗ T is C n− (i.e., it is of class C n−1 with λ a locally Lipschitzian (n − 1)th derivative). However, under these conditions one does not have an explicit expression, like (3.2), for the remainder.
Theorem 3. Let $\Psi : D \to M$ be a family of knight diffeomorphisms of rank $k$, with generators $\xi_1, \ldots, \xi_k$. $\Psi$ forms a group iff there exists a vector field $\xi$, and numerical coefficients $\alpha_l$, with $1 \leq l \leq k$, such that $\xi_l = \alpha_l \xi$, $\forall l$. In this case, under the reparametrisation $\lambda \to \bar\lambda := f(\lambda)$, with
$$f(\lambda) := \sum_{l=1}^{k}\alpha_l\,\lambda^l/l!, \tag{3.4}$$
$\Psi$ reduces to a flow in the canonical form.

Proof. Let us first show that $\xi_l = \alpha_l \xi$ is a sufficient condition for $\Psi$ to form a group. Let $\phi$ be the flow generated by $\xi$. Then $\phi^{(l)}_\sigma = \phi_{\alpha_l \sigma}$, and we have $\Psi_\lambda = \phi_{\alpha_k \lambda^k/k!} \circ \cdots \circ \phi_{\alpha_1 \lambda} = \phi_{\bar\lambda}$. Thus, (i) $\Psi_\sigma \circ \Psi_\lambda = \phi_{\bar\sigma} \circ \phi_{\bar\lambda} = \phi_{\bar\sigma + \bar\lambda} = \phi_{\bar\tau} = \Psi_\tau$, with $\tau = f^{-1}(\bar\sigma + \bar\lambda)$, and (ii) $\Psi^{-1}_\lambda = \phi^{-1}_{\bar\lambda} = \phi_{-\bar\lambda} = \Psi_\rho$, with $\rho = f^{-1}(-\bar\lambda)$.

To prove the reverse implication, let us suppose that $\Psi$ forms a group. Let $p$ be an arbitrary point of $M$, and define the set $C_p := \{\Psi_\lambda(p)\,|\,\lambda \in I_p\} \subset M$, where $I_p \ni 0$ is an open interval of $\mathbb{R}$ such that $I_p \times \{p\} \subset D$. Obviously, $C_p$ is a one-dimensional submanifold of $M$ (to see this, it is sufficient to consider a chart on $C_p$, where $\lambda$ itself is the coordinate). Let us now consider another arbitrary point $q \in C_p$, and ask whether it is possible that $C_q \neq C_p$. If it were so, there would be some $\sigma \in I_q$ such that $\Psi_\sigma(q) \neq \Psi_\tau(p)$, $\forall\tau \in I_p$. But since $q = \Psi_\lambda(p)$, for some $\lambda \in I_p$, and $p$ is arbitrary, this would mean that, for some $\lambda$ and $\sigma$, one cannot find a $\tau$ such that $\Psi_\sigma \circ \Psi_\lambda = \Psi_\tau$, which would contradict the hypothesis that $\Psi$ forms a group. Thus, each point of $M$ belongs to one, and only one, one-dimensional submanifold constructed using $\Psi$ as above. The set of these submanifolds becomes a congruence of curves simply by suitably parametrising them; this, in turn, defines a flow $\phi$ and a vector field $\xi$. Thus, if $\Psi$ forms a group, it can be written as $\Psi_\lambda = \phi_{\bar\lambda}$, for some suitable parameter $\bar\lambda$. In the particular case of a knight, this condition can be rewritten, using (2.1), as
$$\phi^{(1)}_\lambda \circ \phi_{-\bar\lambda} = \phi^{(2)}_{-\lambda^2/2} \circ \cdots \circ \phi^{(k)}_{-\lambda^k/k!}. \tag{3.5}$$
Assuming $\phi$ and the various $\phi^{(l)}$ to be at least of class $C^{2-}$ (which is a natural requirement, if one wants them to be uniquely determined by the respective vector fields), we can apply (3.1) to (3.5) and get, for an arbitrary tensor field $T$,
$$\left(\pounds_{\xi_1} - \frac{\bar\lambda}{\lambda}\,\pounds_\xi\right) T = O(\lambda). \tag{3.6}$$
This implies that $\exists\,\alpha_1 \in \mathbb{R}$ such that $\bar\lambda = \alpha_1\lambda + f_2(\lambda)$, with $f_2(\lambda) = O(\lambda^2)$, together with $\xi_1 = \alpha_1\xi$. Substituting into (3.5) and applying again (3.1), we find
$$\left(\pounds_{\xi_2} - \frac{f_2(\lambda)}{\lambda^2/2}\,\pounds_\xi\right) T = O(\lambda). \tag{3.7}$$
Thus, we have also that $\exists\,\alpha_2 \in \mathbb{R}$ such that $f_2(\lambda) = \alpha_2\lambda^2/2 + f_3(\lambda)$, with $f_3(\lambda) = O(\lambda^3)$, and $\xi_2 = \alpha_2\xi$. Iterating this procedure, one shows that $\xi_l = \alpha_l\xi$, $\forall l \leq k$.

It is clear from the proof given above that the failure of $\Psi$ to form a group is also related to the following circumstance. For any $p \in M$, one can define a curve $u_p : I_p \to M$ by $u_p(\lambda) := \Psi_\lambda(p)$. However, these curves do not form a congruence on $M$. For the point $u_p(\lambda)$, say, belongs not only to the image of the curve $u_p$, but also to the one of $u_{u_p(\lambda)}$, which differs from $u_p$ when at least one of the $\xi_l$ is not collinear with $\xi_1$, since $u_{u_p(\lambda)}(\sigma) = \Psi_\sigma \circ \Psi_\lambda(p) \neq \Psi_{\lambda+\sigma}(p) = u_p(\lambda + \sigma)$. Thus, the fundamental property of a congruence, that each point of $M$ lies on the image of one, and only one, curve, is violated.

Let us now turn to the problem of Taylor-expanding $\Psi^*_\lambda T$. Although (3.1) cannot be used straightforwardly for this purpose, one can apply it repeatedly to $\Psi^*_\lambda T = \phi^{(1)*}_\lambda \phi^{(2)*}_{\lambda^2/2} \cdots \phi^{(k)*}_{\lambda^k/k!} T$, and get the following

Proposition 4. Let $\Psi$ be a one-parameter family of knight diffeomorphisms of rank $k$, and $T$ a tensor field such that $\Psi^*_\lambda T$ is of class $C^n$. Then $\Psi^*_\lambda T$ can be expanded around $\lambda = 0$ as
$$\Psi^*_\lambda T = \sum_{l=0}^{n-1}\frac{\lambda^l}{l!}\sum_{J_l}\frac{l!}{2^{j_2}\cdots(n!)^{j_n}\,j_1!\,j_2!\cdots j_n!}\,\pounds^{j_1}_{\xi_1}\cdots\pounds^{j_n}_{\xi_n} T + \lambda^n R^{(n)}_\lambda T, \tag{3.8}$$
where $J_l := \{(j_1, \ldots, j_n) \in \mathbb{N}^n \,|\, \sum_{i=1}^{n} i\,j_i = l\}$ defines the set of indices over which one has to sum in order to obtain the $l$th order term, and $R^{(n)}_\lambda T$ is a remainder with a finite limit as $\lambda \to 0$.
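The combinatorics of (3.8) can be made concrete with a short Python sketch that enumerates the index set $J_l$ and the weight $l!/\big(\prod_i (i!)^{j_i}\, j_i!\big)$ attached to each element. The function names below are hypothetical helpers introduced only for this illustration.

```python
from math import factorial

def index_sets(l, n):
    # All tuples (j_1, ..., j_n) of non-negative integers with sum_i i * j_i == l.
    def rec(i, remaining):
        if i > n:
            if remaining == 0:
                yield ()
            return
        for j in range(remaining // i + 1):
            for tail in rec(i + 1, remaining - i * j):
                yield (j,) + tail
    return list(rec(1, l))

def coefficient(l, js):
    # Weight l! / ( prod_i (i!)^{j_i} * j_i! ) multiplying the Lie derivatives in (3.8).
    denom = 1
    for i, j in enumerate(js, start=1):
        denom *= factorial(i) ** j * factorial(j)
    return factorial(l) // denom

def order_terms(l, n):
    # Map each index tuple in J_l to its weight.
    return {js: coefficient(l, js) for js in index_sets(l, n)}

print(order_terms(2, 3))
print(order_terms(3, 3))
```

For $l = 2$ the two index tuples $(2,0,0)$ and $(0,1,0)$ both carry weight 1, so the second-order term is $\frac{\lambda^2}{2}\,(\pounds^2_{\xi_1} + \pounds_{\xi_2})\,T$; for $l = 3$ the weights give the combination $\pounds^3_{\xi_1} + 3\,\pounds_{\xi_1}\pounds_{\xi_2} + \pounds_{\xi_3}$.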
Fig. 2. The action of a knight diffeomorphism of rank two, represented in a chart to order $\lambda^2$
The geometrical meaning of (3.8) is particularly clear in a chart. Let us consider the special case in which the tensor $T$ is just one of the coordinate functions on $M$, $x^\mu$. We have then, since $\Psi^*_\lambda x^\mu(p) = x^\mu(\Psi_\lambda(p))$, the action of an “infinitesimal point transformation”, that reads, to second order in $\lambda$,
$$\tilde x^\mu = x^\mu + \lambda\,\xi_1^\mu + \frac{\lambda^2}{2}\left(\xi^\mu_{1,\nu}\,\xi_1^\nu + \xi_2^\mu\right) + O(\lambda^3), \tag{3.9}$$
where we have denoted $x^\mu(p)$ simply by $x^\mu$, and $x^\mu(\Psi_\lambda(p))$ by $\tilde x^\mu$. Equation (3.9) is represented pictorially in Fig. 2. The effect of $\phi^{(2)}$ (and of higher order $\phi$’s) is to correct the action of the simple flow $\phi^{(1)}$.
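Equation (3.9) and the failure of the group property can both be seen in a small explicit example. The Python sketch below builds a rank-2 knight on the plane from two flows that are known in closed form: a rotation, generated by $\xi_1 = (-y, x)$, and a translation, generated by a constant field $\xi_2 = (A, B)$. These generators, the constants `A` and `B`, and the base point are illustrative assumptions, not taken from the paper.

```python
import math

A, B = 0.5, -0.2   # components of the constant field xi_2 (illustrative values)

def flow1(t, p):
    # Flow of xi_1 = (-y, x): rotation by the angle t.
    x, y = p
    return (x * math.cos(t) - y * math.sin(t), x * math.sin(t) + y * math.cos(t))

def flow2(t, p):
    # Flow of the constant field xi_2 = (A, B): translation by t * (A, B).
    x, y = p
    return (x + t * A, y + t * B)

def knight(lam, p):
    # Rank-2 knight as in (2.1): flow1 for lam, followed by flow2 for lam^2 / 2.
    return flow2(lam ** 2 / 2, flow1(lam, p))

def second_order(lam, p):
    # Right-hand side of (3.9) without the O(lam^3) term.
    x, y = p
    xi1 = (-y, x)
    dxi1_xi1 = (-x, -y)   # components of (xi_1^mu)_{,nu} xi_1^nu for this xi_1
    return (x + lam * xi1[0] + lam ** 2 / 2 * (dxi1_xi1[0] + A),
            y + lam * xi1[1] + lam ** 2 / 2 * (dxi1_xi1[1] + B))

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

p0 = (1.0, 2.0)
err_small = dist(knight(0.01, p0), second_order(0.01, p0))
err_large = dist(knight(0.02, p0), second_order(0.02, p0))
ratio = err_large / err_small   # close to 8 when the error is O(lam^3)

# Failure of the group property: Psi_sigma o Psi_lam differs from Psi_{sigma + lam}.
group_defect = dist(knight(0.3, knight(0.4, p0)), knight(0.7, p0))
print(err_small, ratio, group_defect)
```

Doubling $\lambda$ multiplies the discrepancy with (3.9) by about 8, as expected for an $O(\lambda^3)$ error, while the composition of two knights visibly differs from the knight with the summed parameter because $\xi_2$ is not collinear with $\xi_1$.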
Finally, let us notice that since each element of $J_l$ has $j_i \equiv 0$, $\forall i > l$, the sum on the right-hand side of (3.8) only involves the Lie derivatives along the vectors $\xi_l$ with $l \leq n - 1$. Thus, as far as Taylor expansions are concerned, only knights of rank lower than their degree of differentiability are really relevant.

4. Proof of Theorem 1

If $\varphi$ and $\psi$ are two diffeomorphisms of $M$ such that $\varphi^* f = \psi^* f$ for every function $f$, it follows that $\varphi \equiv \psi$, as it is easy to see in a chart. Thus, in order to show that a family of knights $\Psi$ approximates any one-parameter family of diffeomorphisms $\Phi$ up to the $n$th order, it is sufficient to prove that $\Psi^*_\lambda f$ and $\Phi^*_\lambda f$ differ by a function that is $O(\lambda^n)$, $\forall f$.

Let us therefore consider the action of $\Phi_\lambda$ on an arbitrary sufficiently smooth function $f : M \to \mathbb{R}$. The Taylor expansion of $\Phi^*_\lambda f$ gives [10]
$$\Phi^*_\lambda f = \sum_{l=0}^{n-1}\frac{\lambda^l}{l!}\left.\frac{d^l}{d\lambda^l}\right|_0 \Phi^*_\lambda f + \lambda^n R^{(n)}_\lambda f, \tag{4.1}$$
with
$$R^{(n)}_\lambda f = \frac{1}{(n-1)!}\int_0^1 dt\,(1-t)^{n-1}\left.\frac{d^n}{d\lambda'^n}\right|_{t\lambda}\Phi^*_{\lambda'} f. \tag{4.2}$$
Let us define $n - 1$ linear differential operators $L_1, \ldots, L_{n-1}$ through the recursive formula
$$L_l f := \left.\frac{d^l}{d\lambda^l}\right|_0 \Phi^*_\lambda f - \sum_{J'_l}\frac{l!}{2^{j_2}\cdots((l-1)!)^{j_{l-1}}\,j_1!\,j_2!\cdots j_{l-1}!}\,L_1^{j_1} L_2^{j_2}\cdots L_{l-1}^{j_{l-1}} f, \tag{4.3}$$
where $J'_1 \equiv \emptyset$ and, for $l > 1$, $J'_l := \{(j_1, \ldots, j_{l-1}) \in \mathbb{N}^{l-1} \,|\, \sum_{i=1}^{l-1} i\,j_i = l\}$. Since $L_1, \ldots, L_{n-1}$ satisfy Leibniz’s rule (see Appendix), they are derivatives, and we can thus define $n - 1$ vector fields $\xi_1, \ldots, \xi_{n-1}$ by requiring that, for any $C^1$ function $f$, $\pounds_{\xi_l} f := L_l f$. Now, if $\Psi_\lambda$ is the knight of rank $n - 1$ generated by $\xi_1, \ldots, \xi_{n-1}$ as in (2.2), we can combine (4.1), (4.3), and (3.8) to get
$$\Phi^*_\lambda f = \Psi^*_\lambda f + \lambda^n \Delta^{(n)}_\lambda f, \tag{4.4}$$
where $\Delta^{(n)}_\lambda f$ is $O(\lambda^0)$. This completes the proof.
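The construction in this proof can be followed by hand in a one-dimensional example, sketched below in Python. The family $\Phi_\lambda(x) = x + \lambda + \lambda^2 x$ is a hypothetical choice made for illustration (it is not a flow). Applying (4.3) by hand gives $L_1 f = f'$, so $\xi_1 = d/dx$, and $L_2 f = (f'' + 2x f') - f'' = 2x f'$, so $\xi_2 = 2x\,d/dx$; the flows of both fields are known in closed form, and the resulting rank-2 knight should agree with $\Phi_\lambda$ up to terms of order $\lambda^3$.

```python
import math

def family(lam, x):
    # Hypothetical one-parameter family of diffeomorphisms of the real line (not a flow).
    return x + lam + lam ** 2 * x

def flow1(t, x):
    # Flow of xi_1 = d/dx, read off from L_1 f = d/dlam f(Phi_lam) at lam = 0.
    return x + t

def flow2(t, x):
    # Flow of xi_2 = 2x d/dx, read off from L_2 f = d^2/dlam^2 f(Phi_lam)|_0 - L_1^2 f.
    return math.exp(2 * t) * x

def knight(lam, x):
    # Rank-2 knight as in (2.2) with n = 3.
    return flow2(lam ** 2 / 2, flow1(lam, x))

x0 = 1.5

def defect(lam):
    # Distance between the family and the approximating knight.
    return abs(family(lam, x0) - knight(lam, x0))

def defect_rank1(lam):
    # Distance between the family and the flow of xi_1 alone.
    return abs(family(lam, x0) - flow1(lam, x0))

ratio_knight = defect(0.02) / defect(0.01)              # near 8: agreement up to O(lam^3)
ratio_rank1 = defect_rank1(0.02) / defect_rank1(0.01)   # near 4: the flow alone is only O(lam^2)
print(ratio_knight, ratio_rank1)
```

The two ratios show the gain from adding the second generator: the flow of $\xi_1$ alone misses $\Phi_\lambda$ at order $\lambda^2$, whereas the knight built from $\xi_1$ and $\xi_2$ misses it only at order $\lambda^3$.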
5. Gauge Transformation and Conclusions

In the previous sections we have presented the theory of Taylor’s expansions for one-parameter families of diffeomorphisms on a manifold $M$. Taking the simple case of a flow as our basic element, we have first defined the notion of knights, and then shown that an arbitrary one-parameter family of diffeomorphisms can always be approximated by a family of knights of a suitable rank.

We can now return to the problem stated in the introduction, of finding the relationship between the $k$th order perturbations of a tensor $T_\lambda$ in two gauges $\varphi_\lambda$ and $\psi_\lambda$. Let $n$ be the lowest order of differentiability of the objects contained in (1.1). It follows from Theorem 1 that the action of $\Phi_\lambda$ is equivalent, up to the order $\lambda^n$, with the one of a knight $\Psi_\lambda$, constructed as in (2.2). Therefore, we can expand (1.1) using (3.8), and find, $\forall k < n$,
$$\delta^k T^\psi = \sum_{l=0}^{k}\frac{k!}{(k-l)!}\sum_{J_l}\frac{1}{2^{j_2}\cdots(k!)^{j_k}\,j_1!\cdots j_k!}\,\pounds^{j_1}_{\xi_1}\cdots\pounds^{j_k}_{\xi_k}\,\delta^{k-l} T^\varphi, \tag{5.1}$$
where the various quantities are defined according to (1.2), and $\delta^0 T^\varphi := T_0$. Equation (5.1) gives a complete description of the gauge behaviour of perturbations at an arbitrary order. Among other applications, it allows one to obtain easily the conditions for the gauge invariance of perturbations to $k$th order; this problem has been discussed in some detail in reference [3]. Since the problem of gauge dependence is purely kinematical, (5.1) is valid not only in general relativity, but in any geometrical theory of spacetime.

Of course, our treatment can be easily generalised in several ways. For instance, it may happen that the perturbations are characterised by several parameters $\lambda_1, \ldots, \lambda_N$ [2], so that one is dealing with an $N$-parameter family of spacetime models $M_{(\lambda_1, \ldots, \lambda_N)}$ that differ from the background $M_{(0, \ldots, 0)}$. Correspondingly, gauge transformations are associated with the action of an $N$-parameter family of diffeomorphisms $\Phi: D \to M$, where $D$ is an open subset of $\mathbb{R}^N \times M$ containing $\{(0, \ldots, 0)\} \times M$, and $\Phi((0, \ldots, 0), p) = p$, $\forall p \in M$. One can then ask several questions about such an extension of the theory discussed in the present paper. However, we leave this topic for future investigations.

Appendix: Proof that the Operators $\mathcal{L}_l$ Satisfy the Leibniz Rule

Since the operators $\mathcal{L}_l$ are linear, the Leibniz rule is equivalent to the condition $\mathcal{L}_l f^2 = 2f \mathcal{L}_l f$, for any $C^1$ function $f$. This property can be established for any $l$ by induction. It trivially holds for $l = 1$, so there exists a vector field $\xi_1$ such that $\mathcal{L}_1 f = \pounds_{\xi_1} f$, $\forall f$. Let us suppose that this is true up to $l-1$, so that there are $l-1$ vector fields $\xi_1, \ldots, \xi_{l-1}$ such that $\mathcal{L}_k f = \pounds_{\xi_k} f$, $\forall k \le l-1$ and $\forall f$. Then we must prove that
$$2f \mathcal{L}_l f = \left.\frac{d^l}{d\lambda^l}\right|_0 (\Phi^*_\lambda f)^2 - \sum_{J'_l} \frac{l!}{2^{j_2} \cdots ((l-1)!)^{j_{l-1}}\, j_1!\, j_2! \cdots j_{l-1}!}\, \pounds^{j_1}_{\xi_1} \pounds^{j_2}_{\xi_2} \cdots \pounds^{j_{l-1}}_{\xi_{l-1}} f^2. \tag{A.1}$$
Recalling (4.3), we have
$$
\begin{aligned}
\left.\frac{d^l}{d\lambda^l}\right|_0 (\Phi^*_\lambda f)^2
&= \sum_{k=0}^{l} \binom{l}{k} \left.\frac{d^{l-k}}{d\lambda^{l-k}}\right|_0 (\Phi^*_\lambda f)\, \left.\frac{d^k}{d\lambda^k}\right|_0 (\Phi^*_\lambda f) \\
&= 2f \mathcal{L}_l f + 2f \sum_{J'_l} \frac{l!}{2^{j_2} \cdots ((l-1)!)^{j_{l-1}}\, j_1!\, j_2! \cdots j_{l-1}!}\, \pounds^{j_1}_{\xi_1} \cdots \pounds^{j_{l-1}}_{\xi_{l-1}} f \\
&\quad + \sum_{k=1}^{l-1} \binom{l}{k} \left( \sum_{J_k} \frac{k!}{2^{j_2} \cdots (k!)^{j_k}\, j_1!\, j_2! \cdots j_k!}\, \pounds^{j_1}_{\xi_1} \cdots \pounds^{j_k}_{\xi_k} f \right) \\
&\qquad \times \left( \sum_{J_{l-k}} \frac{(l-k)!}{2^{j_2} \cdots ((l-k)!)^{j_{l-k}}\, j_1!\, j_2! \cdots j_{l-k}!}\, \pounds^{j_1}_{\xi_1} \cdots \pounds^{j_{l-k}}_{\xi_{l-k}} f \right);
\end{aligned}
\tag{A.2}
$$
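therefore, (A.1) is satisfied iff
$$
\begin{aligned}
\sum_{J'_l} \frac{l!}{2^{j_2} \cdots ((l-1)!)^{j_{l-1}}\, j_1!\, j_2! \cdots j_{l-1}!}\, \pounds^{j_1}_{\xi_1} \cdots \pounds^{j_{l-1}}_{\xi_{l-1}} f^2
&= 2f \sum_{J'_l} \frac{l!}{2^{j_2} \cdots ((l-1)!)^{j_{l-1}}\, j_1!\, j_2! \cdots j_{l-1}!}\, \pounds^{j_1}_{\xi_1} \cdots \pounds^{j_{l-1}}_{\xi_{l-1}} f \\
&\quad + \sum_{k=1}^{l-1} \binom{l}{k} \left( \sum_{J_k} \frac{k!}{2^{j_2} \cdots (k!)^{j_k}\, j_1!\, j_2! \cdots j_k!}\, \pounds^{j_1}_{\xi_1} \cdots \pounds^{j_k}_{\xi_k} f \right) \\
&\qquad \times \left( \sum_{J_{l-k}} \frac{(l-k)!}{2^{j_2} \cdots ((l-k)!)^{j_{l-k}}\, j_1!\, j_2! \cdots j_{l-k}!}\, \pounds^{j_1}_{\xi_1} \cdots \pounds^{j_{l-k}}_{\xi_{l-k}} f \right),
\end{aligned}
\tag{A.3}
$$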
for any $f$ and for any choice of the vector fields $\xi_1, \ldots, \xi_{l-1}$. This relationship could be proved by brute force. However, it is easier to follow an alternative path. Let us consider a knight $\Psi_\lambda$ of rank $l$, generated by the vectors $\xi_1, \ldots, \xi_{l-1}$, and by a new arbitrary vector $\xi_l$. Then one can compute
$$\left.\frac{d^l}{d\lambda^l}\right|_0 (\Psi^*_\lambda f)^2$$
using (3.8), from which (A.3) follows straightforwardly.

Acknowledgement. We are grateful to Professor Dennis W. Sciama for hospitality at the Astrophysics Sector of SISSA, and to an anonymous referee for stimulating several improvements in the presentation. MB thanks INFN for financial support.
References
1. Stewart, J.M., Walker, M.: Perturbations of space-time in general relativity. Proc. R. Soc. London A 341, 49–74 (1974)
2. Wald, R.M.: General Relativity. Chicago: University of Chicago Press, 1984, pp. 183–184
3. Bruni, M., Matarrese, S., Mollerach, S., Sonego, S.: Perturbations of spacetime: Gauge transformations and gauge invariance at second order and beyond. Class. Quantum Grav. 14, 2585–2606 (1997)
4. Tomita, K.: On the non-linear behavior of nonspherical perturbations in relativistic gravitational collapse. Prog. Theor. Phys. 52, 1188–1204 (1974); Tomita, K., Tajima, N.: Nonlinear behavior of nonspherical perturbations of the Schwarzschild metric. Prog. Theor. Phys. 56, 551–560 (1974); Gleiser, R.J., Nicasio, C.O., Price, R.H., Pullin, J.: Second-order perturbations of a Schwarzschild black hole. Class. Quantum Grav. 13, L117–L124 (1996); Gleiser, R.J., Nicasio, C.O., Price, R.H., Pullin, J.: Colliding black holes: How far can the close approximation go? Phys. Rev. Lett. 77, 4483–4486 (1996)
5. Tomita, K.: Non-linear theory of gravitational instability in the expanding universe. Prog. Theor. Phys. 37, 831–846 (1967); Matarrese, S., Pantano, O., Saez, D.: General-relativistic dynamics of irrotational dust: Cosmological implications. Phys. Rev. Lett. 72, 320–323 (1994); Matarrese, S., Pantano, O., Saez, D.: A relativistic approach to gravitational-instability in the expanding universe: 2nd-order Lagrangian solutions. Mon. Not. R. Astron. Soc. 271, 513–522 (1994); Salopek, D.S., Stewart, J.M., Croudace, K.M.: The Zel’dovich approximation and the relativistic Hamilton-Jacobi equation. Mon. Not. R. Astron. Soc. 271, 1005–1016 (1994); Russ, H., Morita, M., Kasai, M., Börner, G.: The Zel’dovich-type approximation for an inhomogeneous universe in general relativity: Second-order solutions. Phys. Rev. D 53, 6881–6888 (1996)
6. Taub, A.H.: Approximate stress energy tensor for gravitational fields. J. Math. Phys. 2, 787–793 (1961)
7. Schutz, B.F.: The use of perturbation and approximation methods in general relativity. In: Fustero, X., Verdaguer, E. (eds.) Relativistic Astrophysics and Cosmology. Singapore: World Scientific, 1984, pp. 35–97; Schutz, B.F.: Motion and radiation in general relativity. In: Bressan, O., Castagnino, M., Hamity, V. (eds.) Relativity, Supersymmetry and Cosmology. Singapore: World Scientific, 1985, pp. 3–80
8. Geroch, R., Lindblom, L.: Is perturbation theory misleading in general relativity? J. Math. Phys. 26, 2581–2588 (1985)
9. Flanagan, É.É., Wald, R.M.: Does backreaction enforce the averaged null energy condition in semiclassical gravity? Phys. Rev. D 54, 6233–6283 (1996)
10. Choquet-Bruhat, Y., DeWitt-Morette, C., Dillard-Bleick, M.: Analysis, Manifolds and Physics. Amsterdam: North-Holland, 1977; Avez, A.: Differential Calculus. Chichester: Wiley, 1986

Communicated by A. Jaffe
Commun. Math. Phys. 193, 219 – 231 (1998)
Communications in Mathematical Physics © Springer-Verlag 1998
Degenerate Solutions of General Relativity from Topological Field Theory

John C. Baez

Department of Mathematics, University of California, Riverside, California 92521, USA. E-mail: [email protected]

Received: 21 April 1997 / Accepted: 22 August 1997
Abstract: Working in the Palatini formalism, we describe a procedure for constructing degenerate solutions of general relativity on a 4-manifold $M$ from certain solutions of 2-dimensional BF theory on any framed surface $\Sigma$ embedded in $M$. In these solutions the cotetrad field $e$ (and thus the metric) vanishes outside a neighborhood of $\Sigma$, while inside this neighborhood the connection $A$ and the field $E = e \wedge e$ satisfy the equations of 4-dimensional BF theory. Our construction works in any signature and with any value of the cosmological constant. If $M = \mathbb{R} \times S$ for some 3-manifold $S$, at fixed time our solutions typically describe “flux tubes of area”: the 3-metric vanishes outside a collection of thickened links embedded in $S$, while inside these thickened links it is nondegenerate only in the two transverse directions. We comment on the quantization of the theory of solutions of this form and its relation to the loop representation of quantum gravity.

1. Introduction

Since the introduction by Rovelli and Smolin of spin networks into what used to be called the “loop representation” of quantum gravity, progress on understanding the kinematical aspects of the theory has been swift [1, 7, 15]. There is a basis of kinematical states given by spin networks, i.e., graphs embedded in space with edges labelled by irreducible representations of SU(2), or spins, and with vertices labelled by intertwining operators. Geometrical observables have been quantized and their matrix elements computed in the spin network basis, giving a concrete – though of course still tentative – picture of “quantum 3-geometries”. In this picture, the edges of spin networks play the role of quantized flux tubes of area: surfaces acquire area through their intersection with these edges, each edge labelled with spin $j$ contributing an area proportional to $\sqrt{j(j+1)}$ for each transverse intersection point [2, 16]. Unsurprisingly, the dynamical aspects remain more obscure.
Thiemann has proposed an explicit formula for the Hamiltonian constraint [17], but this remains controversial,
in part because of the difficulty in extracting physical predictions that would test this proposal, or any competing proposals. A crucial problem is the lack of a clear picture of the 4-dimensional, or spacetime, aspects of the theory. For example, diffeomorphism equivalence classes of states annihilated by the Hamiltonian constraint should presumably represent physical states of quantum gravity. Many explicit solutions are known, perhaps the simplest being the “loop states”, corresponding to spin networks with no vertices, or in other words, links with components labelled by spins. However, the interpretation of these states as “quantum 4-geometries” is still unclear.

In search of some insight into the 4-dimensional interpretation of these loop states, we turn here to classical general relativity. We construct degenerate solutions of the equations of general relativity which at a typical fixed time describe “flux tubes of area” reminiscent of the loop states of quantum gravity. More precisely, the 3-metric vanishes outside a collection of embedded solid tori, while inside any of these solid tori the metric is degenerate in the longitudinal direction, but nondegenerate in the two transverse directions. The 4-dimensional picture is as follows: given any surface $\Sigma$ embedded in spacetime, we obtain solutions for which the metric vanishes outside a tubular neighborhood of $\Sigma$. Inside this neighborhood, which we may think of as $\Sigma \times D^2$, the 4-metric is degenerate in the two directions tangent to $\Sigma$ but nondegenerate in the two transverse directions. In the 4-geometry defined by one of these solutions, the area of a typical surface $\Sigma'$ intersecting $\Sigma$ transversally in isolated points is determined by a sum of contributions from the points in the intersection $\Sigma \cap \Sigma'$. The solutions we consider are inspired by the work of Reisenberger [13], who studied a solution for which the metric vanishes outside a neighborhood of a 2-sphere.
Since various formulations of Einstein’s equation become inequivalent when extended to the case of degenerate metrics, it is important to say which formulation is being used in any work of this kind. Reisenberger worked in the Plebanski (or self-dual) formalism, and concentrated on the initial-value problem for general relativity extended to possibly degenerate complex metrics, comparing Ashtekar’s version of the constraints and a modified version which differs only when the densitized triad field has rank less than 2. Only the latter version admits his “2-sphere solution”. We work 4-dimensionally for the most part, using the classical equations of motion coming from the Palatini formalism, in which the basic fields are a cotetrad field $e$ and a metric-preserving connection $A$ on the “internal space” bundle. Our work treats arbitrary signatures and arbitrary values of the cosmological constant. It would be worthwhile to translate our results into the self-dual formalism, but we do not attempt this here.

An interesting aspect of our solutions is that they arise from solutions of 2-dimensional BF theory, a topological field theory, on the surface $\Sigma$. This takes advantage of the relation between general relativity and BF theory in 4 dimensions, together with the fact that BF theory behaves in a simple manner under dimensional reduction. It is also interesting that our construction requires us to choose a thickening of $\Sigma$, that is, to extend the embedding of $\Sigma$ in spacetime to an embedding of $\Sigma \times D^2$. The topological data needed to specify a thickening of $\Sigma$ up to diffeomorphism is known as a “framing” of $\Sigma$. This is precisely the 4-dimensional analog of a framing of a knot or link in a 3-manifold. The possible need for framings in the loop representation of canonical quantum gravity has been widely discussed [5, 9, 12], but here we see it arising quite naturally in the classical context, where the spacetime aspects of framing-dependence are easier to understand.
In the Conclusions we speculate on some of the possible implications of our ideas for the quantum theory. We discuss a “minisuperspace” model in which one quantizes only
the degenerate solutions of general relativity associated to a fixed surface $\Sigma$ in spacetime. We treat this model in detail in another paper [8]; here we only sketch the main ideas. For simplicity we restrict our attention to the Lorentzian signature and certain nonzero values of the cosmological constant $\Lambda$. Our main result is that this minisuperspace model is equivalent to a G/G gauged Wess-Zumino-Witten model on the surface $\Sigma$, with gauge group SO(3). It follows that if $\Sigma$ intersects space at a given time in some link $L$, states at this time correspond to labellings of the components of $L$ with irreducible representations of quantum SO(3) with nonzero quantum dimension, or in other words, integral spins $j = 0, 1, 2, \ldots, k/2$ where the “level” $k$ is an even integer depending on $\Lambda$. In other words, our states are described by a special class of the “q-deformed spin networks” intensively studied by knot theorists [11]. Major and Smolin [12] have suggested using q-deformed spin networks to describe states of quantum gravity with nonzero cosmological constant; here we see quite concretely how q-deformed spin networks arise in a toy model.

2. General Relativity and BF Theory

Understanding the degenerate solutions of general relativity obtained from topological field theory is perhaps simplest in the Palatini formalism. We begin by recalling some notation introduced in our previous papers [4, 7]. In the Palatini formalism our spacetime $M$ can be any oriented smooth 4-manifold equipped with an oriented vector bundle $T$ that is isomorphic to $TM$ and equipped with a (nondegenerate) metric $\eta$. In the rest of this section $\eta$ can have any fixed signature. The fiber of $T$ is called the “internal space”, and $\eta$ is the “internal metric”. The basic fields are a $T$-valued 1-form $e$ on $M$, often called the “soldering form” or “cotetrad field”, and a metric-preserving connection $A$ on $T$.
Pulling back the internal metric along $e: TM \to T$ we obtain a – possibly degenerate – metric $g$ on $M$, given explicitly by
$$g(v, w) = \eta(e(v), e(w)).$$
The metric $g$ is nondegenerate precisely when $e$ is an isomorphism. To write the Palatini action in index-free notation, it is handy to work with differential forms on $M$ taking values in the exterior algebra bundle $\Lambda T$. In these terms, the Palatini action with cosmological constant $\Lambda$ is given by
$$S_{Pal}(A, e) = \int_M \mathrm{tr}\Bigl(e \wedge e \wedge F + \frac{\Lambda}{12}\, e \wedge e \wedge e \wedge e\Bigr). \tag{1}$$
Here we use the internal metric to regard the curvature $F$ of $A$ as a $\Lambda^2 T$-valued 2-form on $M$, and use the symbol “$\wedge$” to denote the wedge product of differential forms tensored with the wedge product in $\Lambda T$. Also, the orientation and internal metric on $T$ give an “internal volume form”, i.e., a section of $\Lambda^4 T$, and thus a map from $\Lambda^4 T$-valued forms to ordinary differential forms, which we denote above by “tr”. If we set the variation of this action equal to zero, ignoring boundary terms, we obtain the field equations
$$d_A(e \wedge e) = 0, \qquad e \wedge \Bigl(F + \frac{\Lambda}{6}\, e \wedge e\Bigr) = 0, \tag{2}$$
where $d_A$ denotes the covariant exterior derivative. If $e$ is an isomorphism, the first equation implies that $d_A e = 0$, which means that the connection on $TM$ corresponding to $A$ via $e$ is torsion-free, and thus equal to the Levi-Civita connection of $g$. The latter equation is then equivalent to Einstein’s equation for $g$. Therefore these equations describe an extension of general relativity to the case of degenerate metrics.

Now compare BF theory. This makes sense in any dimension, but has special features in dimension 4. First let $M$ be any oriented smooth $n$-manifold and let $P$ be a $G$-bundle over $M$, where $G$ is a connected semisimple Lie group. Then the basic fields of the theory are a connection $A$ on $P$ and an $\mathrm{Ad}P$-valued $(n-2)$-form, often called $B$, which we shall call $E$ because of its relation to the gravitational electric field. The action of the theory is
$$\int_M \mathrm{tr}(E \wedge F),$$
where $F$ is the curvature of $A$ and “tr” is defined using the Killing form on $\mathfrak{g}$. The corresponding field equations are
$$d_A E = 0, \qquad F = 0.$$
In dimension 4 we can also add a cosmological constant term to the action, obtaining
$$S_{BF}(A, E) = \int_M \mathrm{tr}\Bigl(E \wedge F + \frac{\Lambda}{12}\, E \wedge E\Bigr), \tag{3}$$
which gives the field equations
$$d_A E = 0, \qquad F + \frac{\Lambda}{6}\, E = 0. \tag{4}$$
The case $\Lambda \neq 0$ is very different from the case $\Lambda = 0$. When $\Lambda \neq 0$, the first equation follows from the second one and the Bianchi identity, so $A$ is arbitrary and it determines $E$. When $\Lambda = 0$, $A$ is flat and $E$ is any solution of $d_A E = 0$.

We can relate BF theory in 4 dimensions to general relativity in the Palatini formalism if we take $G = SO(4)$, $SO(3,1)$, or $SO(2,2)$, depending on the signature of the internal metric, and we let $P$ be the special orthogonal frame bundle of $T$. This lets us think of $E$ as a $\Lambda^2 T$-valued 2-form, just as $e \wedge e$ is in the Palatini formalism. Comparing the BF theory equations with those of general relativity, it is clear that $e \wedge e$ and $E$ play analogous roles. The easiest way to exploit this is to note that if $E = e \wedge e$, then the equations of BF theory, (4), imply those of general relativity, (2). An analogous observation has been widely remarked on in the context of the Ashtekar formalism (see [4] and the many references therein). Our new observation here is that for (2) to hold, it suffices for (4) to hold where $e$ is nonzero. In what follows we will be interested in degenerate solutions of general relativity where $e$ vanishes everywhere except in a neighborhood of a 2-dimensional surface $\Sigma$, and the equations of 4-dimensional BF theory hold in this neighborhood. We construct these solutions from certain solutions of 2-dimensional BF theory living on $\Sigma$.
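
The two implications used in this section can be written out in one line each. The block below is a short worked derivation, assuming only Eqs. (2) and (4), the substitution $E = e \wedge e$, and the Bianchi identity $d_A F = 0$; it is meant as a reading aid rather than as part of the original argument.

```latex
% (i) For \Lambda \neq 0, the first equation of (4) follows from the second:
E = -\frac{6}{\Lambda}\, F
\quad\Longrightarrow\quad
d_A E = -\frac{6}{\Lambda}\, d_A F = 0 .

% (ii) With E = e \wedge e, equations (4) imply equations (2):
d_A (e \wedge e) = d_A E = 0 ,
\qquad
e \wedge \Bigl( F + \frac{\Lambda}{6}\, e \wedge e \Bigr)
  = e \wedge \Bigl( F + \frac{\Lambda}{6}\, E \Bigr) = 0 .
```

On any open set where $e$ vanishes identically, both left-hand sides of (2) vanish as well, which is why it is enough to impose (4) where $e$ is nonzero.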
3. Local Analysis

We begin by working locally in coordinates $(t, x, y, z)$ and fixing a local trivialization of $T$. We consider the surface
$$\Sigma = \{(t, x, y, z) : y = z = 0\}$$
in spacetime, and let
$$D^2 = \{(y, z) : y^2 + z^2 \le r^2\}$$
be the disc of radius $r$ in the $yz$ plane. We describe some degenerate solutions of general relativity for which $e$ vanishes outside the neighborhood $\Sigma \times D^2$ of the surface $\Sigma$, and the BF theory equations hold inside this neighborhood.

For simplicity, we first consider the case of vanishing cosmological constant. Assume that $A$ is of the form
$$A = A_t\, dt + A_x\, dx$$
inside $\Sigma \times D^2$, with $A_t, A_x$ depending only on $t$ and $x$. Let $A$ be arbitrary outside $\Sigma \times D^2$. Assume also that $e$ is of the form
$$e = f(y, z)\, [e_y\, dy + e_z\, dz],$$
where $f$ is a smooth function vanishing outside $D^2$ and $e_y, e_z$ depend only on $t$ and $x$. Setting $E = e \wedge e$, note that
$$E = \epsilon f^2\, dy \wedge dz,$$
where
$$\epsilon = e_y \wedge e_z$$
is a $\mathfrak{g}$-valued function on the surface $\Sigma$. (Recall that $\mathfrak{g}$ is the Lie algebra of $G = SO(4)$, $SO(3,1)$, or $SO(2,2)$, depending on the signature we are considering.) Let $\alpha$ denote the restriction of the connection $A$ to $\Sigma$. A calculation then shows that $e$ and $A$ satisfy the equations of general relativity with vanishing cosmological constant:
$$d_A(e \wedge e) = 0, \qquad e \wedge F = 0,$$
if the following equations hold:
$$d_\alpha \epsilon = 0, \qquad \phi = 0, \tag{5}$$
where $d_\alpha \epsilon$ is the exterior covariant derivative of $\epsilon$ with respect to the connection $\alpha$, and $\phi$ is the curvature of $\alpha$. These are precisely the equations for 2-dimensional BF theory on the surface $\Sigma$.

However, not every solution of 2-dimensional BF theory on $\Sigma$ gives rise to a degenerate solution of general relativity this way. Starting from any $\alpha, \epsilon$ satisfying (5), we may define the fields $A$ and $E$ as above. These fields satisfy
$$d_A E = 0, \qquad F = 0$$
inside $\Sigma \times D^2$, and $E$ vanishes outside $\Sigma \times D^2$. To obtain a solution of general relativity, however, we need a way to write $E$ as $e \wedge e$. There may be no way to do this, or many ways, but we can certainly do it if $\epsilon$ is “decomposable”, that is, if $\epsilon = e_y \wedge e_z$ for some
sections $e_y, e_z$ of $T$. Note that since $\epsilon$ is covariantly constant, it will be decomposable if it is of the form $e_y \wedge e_z$ at any single point of $\Sigma$. We discuss the issue of decomposability further in Sect. 5.

If the cosmological constant is nonzero things are a bit more complicated. Suppose again that $\alpha, \epsilon$ satisfy the 2-dimensional BF theory Eqs. (5). We set
$$E = \epsilon f^2\, dy \wedge dz$$
as before, but assume
$$A = \alpha_t\, dt + \alpha_x\, dx + A_\perp$$
inside $\Sigma \times D^2$, with
$$A_\perp = A_y\, dy + A_z\, dz.$$
Let us show that with an appropriate choice of $A_\perp$, the fields $A$ and $E$ satisfy the equations of 4-dimensional BF theory, (4), inside $\Sigma \times D^2$. It follows that if $\epsilon$ is decomposable, we can write $E = e \wedge e$ and obtain a degenerate solution of the equations of general relativity with cosmological constant term, (2). To solve Eqs. (4), it suffices to choose $A_\perp$ such that
$$F + \frac{\Lambda}{6}\, e \wedge e = 0$$
in $\Sigma \times D^2$, or in other words,
$$F_{tx} = 0, \tag{6}$$
$$F_{ty}, F_{xy}, F_{tz}, F_{xz} = 0, \tag{7}$$
$$F_{yz} = -\frac{\Lambda}{6}\, \epsilon f^2. \tag{8}$$
Equation (6) holds automatically because $\alpha$ is flat. Equations (7) say precisely that the $\mathfrak{g}$-valued functions $A_y$ and $A_z$ are covariantly constant in the $t$ and $x$ directions. Since $\alpha$ is flat, these equations have a unique solution given any choice of $A_y$ and $A_z$ on the disc $\{t = x = 0\} \times D^2$. We can find $A_y, A_z$ solving (8) on this disc; the solution is not unique, but it is unique up to gauge transformations. Choosing any solution and solving Eqs. (7) for $A_y$ and $A_z$ on the rest of $\Sigma \times D^2$, Eq. (8) then holds throughout $\Sigma \times D^2$, because $F_{yz}$ and $\epsilon$ are both covariantly constant in the $t$ and $x$ directions.

Fig. 1. Flux tube of gravitational electric field (figure not reproduced; labels: axes $x$, $y$, $z$ and field $E$)
In the next section we discuss the global aspects of these solutions of general relativity coming from 2-dimensional BF theory, but first let us examine their physical significance. If we restrict one of our solutions to the slice t = 0 we obtain the picture
shown in Fig. 1. Here we have drawn the “gravitational electric and magnetic fields” $E$ and $B$, by which we mean the restriction of the $E$ and $F$ fields, respectively, to the slice $t = 0$. The field $E$ vanishes except in a tube of radius $r$ running along the $x$ axis, so we may think of our solution as describing a “flux tube” of the gravitational electric field. Since $E$ is a $\mathfrak{g}$-valued 2-form on space, in Fig. 1 we have drawn it as a small “surface element” transverse to the $x$ axis. Outside the flux tube the gravitational magnetic field is arbitrary, but inside it we have
$$B = -\frac{\Lambda}{6}\, E.$$
The 3-metric is zero outside this flux tube, while inside, the 3-metric is degenerate in the $x$ direction but nondegenerate in the two transverse directions. In the 3-geometry defined by this solution, the area of any surface is zero unless it intersects the flux tube. We see therefore that this solution is a kind of classical analog of the “loop states” of quantum gravity, which have been shown to represent quantized flux tubes of area [2, 16]. We make this analogy more precise in the next section, where we consider flux tube solutions associated to arbitrary thickened links in space.
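
The degeneracy pattern described above can be illustrated numerically. The sketch below builds the cotetrad $e = f(y, z)\,[e_y\, dy + e_z\, dz]$ of this section and the pulled-back metric $g(v, w) = \eta(e(v), e(w))$ in the coordinates $(t, x, y, z)$. The particular bump function, the signature $+ + + -$, the constant orthonormal internal vectors $e_y, e_z$, and all function names are assumptions chosen for illustration only.

```python
import numpy as np

# Internal metric of signature + + + -, an assumption made only for this example.
ETA = np.diag([1.0, 1.0, 1.0, -1.0])

# Hypothetical constant internal vectors e_y, e_z (spacelike and orthonormal).
E_Y = np.array([1.0, 0.0, 0.0, 0.0])
E_Z = np.array([0.0, 1.0, 0.0, 0.0])


def bump(y, z, r=1.0):
    """Smooth function of (y, z) that vanishes for y^2 + z^2 >= r^2."""
    s = (y * y + z * z) / (r * r)
    return float(np.exp(-1.0 / (1.0 - s))) if s < 1.0 else 0.0


def cotetrad(y, z, r=1.0):
    """Matrix e[a, mu] of e = f(y, z) (e_y dy + e_z dz) in coordinates (t, x, y, z)."""
    e = np.zeros((4, 4))
    e[:, 2] = bump(y, z, r) * E_Y  # dy component
    e[:, 3] = bump(y, z, r) * E_Z  # dz component
    return e


def metric(y, z, r=1.0):
    """g(v, w) = eta(e(v), e(w)) as a 4x4 matrix in coordinates (t, x, y, z)."""
    e = cotetrad(y, z, r)
    return e.T @ ETA @ e


if __name__ == "__main__":
    print(np.linalg.matrix_rank(metric(0.0, 0.0)))  # rank inside the tube
    print(np.linalg.matrix_rank(metric(1.5, 0.0)))  # rank outside the tube
```

With these choices $g = f^2\,(dy^2 + dz^2)$, so the metric has rank 2 inside the tube (degenerate along $t$ and $x$, nondegenerate in the two transverse directions) and vanishes outside it.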
Fig. 2. Holonomy of connection around flux tube (figure not reproduced; labels: $x$, $\gamma_x$)
The case of the nonzero cosmological constant is particularly interesting. Then $B_{yz}$ is typically nonzero in the flux tube, so the connection $A$ will typically have nontrivial holonomy around a small loop wrapping around the tube. Fig. 2 shows a circular loop $\gamma_x$ of radius $r$ centered at the point $x$ on the $x$ axis. To define the holonomy around such a loop as an element of the gauge group $G$, rather than a mere conjugacy class, we need to choose a basepoint for the loop. We can do this using a “framing” of the flux tube, or more precisely, a curve running along the surface of the tube that intersects each loop $\gamma_x$ at one point, which we take as the basepoint. In Fig. 2 we have chosen as our framing the line $z = 0$, $y = -r$. Using this framing we may define the holonomy
$$g(x) = T \exp \oint_{\gamma_x} A.$$
Using the definition of parallel transport we easily see that $g(x)$ is covariantly constant along the flux tube:
$$\partial_x g + [A_x, g] = 0.$$
In the spacetime picture, $g$ becomes a function of $t$ and $x$, and one can also check that
$$\partial_t g + [A_t, g] = 0.$$
In the next section we show that in general, we need to equip a surface $\Sigma$ with the 4-dimensional analogue of a framing in order to obtain solutions of general relativity from solutions of 2-dimensional BF theory on $\Sigma$. This extra structure also lets us define a field $g$ on $\Sigma$ that measures the holonomy of $A$ around small loops around $\Sigma$. As above, this field $g$ will be covariantly constant on $\Sigma$. In fact, this field is closely related to the group-valued field in the G/G gauged WZW model on $\Sigma$ that we describe in the Conclusions.

4. Global Analysis

In this section we extend the local analysis of the previous section to show that for any framed surface embedded in spacetime, we can obtain degenerate solutions of general relativity from certain solutions of 2-dimensional BF theory on this surface.

Let $M$ be an oriented 4-manifold, and let $\Sigma$ be a 2-manifold embedded in $M$. As before let $T$ be a vector bundle isomorphic to $TM$ equipped with a nondegenerate metric of some fixed signature, and let $P \to M$ be the special orthogonal frame bundle of $T$. By a “framing” of $\Sigma$ we mean a homotopy class of trivializations of its normal bundle. We call an extension of the embedding $i: \Sigma \hookrightarrow M$ to an embedding
$$j: \Sigma \times D^2 \hookrightarrow M$$
a “thickening” of $\Sigma$. Any thickening determines a framing in a natural way. In fact, any framing comes from some thickening in this way. Moreover, two thickenings $j, j'$ determine the same framing if and only if they are ambient isotopic, that is, if $j' = f \circ j$ for some diffeomorphism $f$ of $M$ connected to the identity. These results can be proved just as the analogous results were proved for thickenings of framed links [5].

In what follows we assume $\Sigma$ is equipped with a framing, and we pick an arbitrary thickening compatible with this framing. Using this thickening we think of $\Sigma \times D^2$ as a submanifold of $M$. We assume $\alpha$ is a connection on $P|_\Sigma$ and $\epsilon$ is a section of $\mathrm{Ad}P$ over $\Sigma$, and we assume that these fields satisfy the equations of 2-dimensional BF theory,
$$d_\alpha \epsilon = 0, \qquad \phi = 0$$
where $\phi$ is the curvature of $\alpha$. From this data we shall construct a connection $A$ on $P$ and an $\mathrm{Ad}P$-valued 2-form $E$ on $M$ such that $E$ is zero outside the thickening $\Sigma \times D^2 \subseteq M$, and the equations of 4-dimensional BF theory, (4), hold inside $\Sigma \times D^2$.

First we construct the field $E$ as follows. Using notation similar to that of the previous section, let $dy \wedge dz$ denote the standard volume form on $D^2$ and let $f$ be a smooth function on $D^2$ vanishing near the boundary. Pulling back the 2-form $f^2\, dy \wedge dz$ from $D^2$ to $\Sigma \times D^2$ along the projection $p_2: \Sigma \times D^2 \to D^2$, we obtain a 2-form on $\Sigma \times D^2$ which by abuse of language we again denote by $f^2\, dy \wedge dz$. We can choose a metric-preserving isomorphism between the bundle $T|_{\Sigma \times D^2}$ and the bundle $p_1^* T|_\Sigma$, where $p_1: \Sigma \times D^2 \to \Sigma$ is the projection onto the first factor. Requiring that this isomorphism be the identity for points in $\Sigma$, it is then unique up to gauge transformation. We fix such an isomorphism. Using this, the pullback $p_1^* \epsilon$ can be thought of as a section of $\mathrm{Ad}P$ on $\Sigma \times D^2$. We then define the $\mathrm{Ad}P$-valued 2-form $E$ by
$$E = \epsilon f^2\, dy \wedge dz$$
on $\Sigma \times D^2$, and define $E$ to be zero on the rest of $M$.

Next we construct the connection $A$ on $P$. We pull back $\alpha$ to a connection on $p_1^* P|_\Sigma$, and using the isomorphism chosen in the previous paragraph we think of this as a connection on $P|_{\Sigma \times D^2}$, which we extend arbitrarily to a connection $A_\parallel$ on all of $P$. We define $A$ by
$$A = A_\parallel + A_\perp,$$
where $A_\perp$ is an $\mathrm{Ad}P$-valued 1-form given as follows. Inside $\Sigma \times D^2$ we set
$$A_\perp = A_y\, dy + A_z\, dz,$$
where $A_z = 0$ and for any point $(p, y, z) \in \Sigma \times D^2$,
$$A_y(p, y, z) = \frac{\Lambda}{6} \int_0^z \epsilon(p, y, z')\, f^2(y, z')\, dz'.$$
Outside $\Sigma \times D^2$, we let $A_\perp$ be arbitrary.

Finally, we check that the equations of 4-dimensional BF theory hold in $\Sigma \times D^2$. When $\Lambda \neq 0$ we only need to check that $F = -\frac{\Lambda}{6} E$. We work using local coordinates $t, x$ on $\Sigma$. Our choice of $A_\perp$ ensures that
$$F_{yz} = -\frac{\Lambda}{6}\, \epsilon f^2,$$
so $F_{yz} = -\Lambda E_{yz}/6$. We have $E_{tx} = 0$ by construction and $F_{tx} = 0$ since $A_\parallel$ is flat. We have $E_{ty} = 0$ by construction, and at any point $(p, y, z) \in \Sigma \times D^2$ we have
$$
\begin{aligned}
F_{ty} &= \partial_t A_y + [A_t, A_y] \\
&= \frac{\Lambda}{6} \int_0^z \bigl( \partial_t \epsilon(p, y, z') + [\alpha_t(p), \epsilon(p, y, z')] \bigr)\, f^2(y, z')\, dz' \\
&= 0,
\end{aligned}
$$
since $d_\alpha \epsilon = 0$. Similarly we have $E_{tz} = E_{xy} = E_{xz} = 0$ and $F_{tz} = F_{xy} = F_{xz} = 0$. When $\Lambda = 0$ we also need to check that $d_A E = 0$ in $\Sigma \times D^2$. This follows from $d_\alpha \epsilon = 0$. Summarizing, we have:

Theorem 1. Let $\Sigma$ be a framed 2-manifold embedded in an oriented 4-manifold $M$, and let $T$ be a vector bundle over $M$ isomorphic to the tangent bundle and equipped with a metric $\eta$ of arbitrary fixed signature. Let $P$ be the special orthogonal frame bundle of $T$. For any value of $\Lambda$, there is a natural map from solutions $(\alpha, \epsilon)$ of 2-dimensional BF theory on $\Sigma$ (where $\alpha$ is a connection on $P|_\Sigma$ and $\epsilon$ is a section of $\mathrm{Ad}P|_\Sigma$) to gauge and diffeomorphism equivalence classes of fields $(A, E)$ (where $A$ is a connection on $P$ and $E$ is an $\mathrm{Ad}P$-valued 2-form on $M$) satisfying the equations of 4-dimensional BF theory in a neighborhood of $\Sigma$. Given any of these solutions, we may extend $A$ and $E$ smoothly to all of $M$ with $E$ vanishing outside the given neighborhood of $\Sigma$.

Here the term “natural” refers to the fact that given a diffeomorphism $F: M \to M'$ mapping the framed surface $\Sigma \subseteq M$ to the framed surface $\Sigma' \subseteq M'$ and carrying the solution $(\alpha, \epsilon)$ of 2-dimensional BF theory on $\Sigma$ to the solution $(\alpha', \epsilon')$ on $\Sigma'$, the corresponding solution $(A, E)$ of 4-dimensional BF theory on a neighborhood of $\Sigma$ is mapped to the corresponding solution $(A', E')$ on a neighborhood of $\Sigma'$, up to diffeomorphism and gauge transformation (or more precisely, up to a bundle automorphism).
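
The choice of $A_\perp$ made in this section can be checked numerically in the simplest situation, where $\epsilon$ is constant, so that only its scalar coefficient matters and all commutators drop out. The sketch below is hedged accordingly: the bump function `f`, the value of $\Lambda$, the radii, and the function names are hypothetical choices, and the quadrature is a plain trapezoidal rule. It compares the line integral of $A = A_y\, dy$ around the boundary circle of $D^2$ with the integral of $F_{yz} = -\frac{\Lambda}{6} f^2$ over the disc; the two must agree by Stokes' theorem.

```python
import numpy as np

LAMBDA = 3.0  # hypothetical value of the cosmological constant
R_DISC = 1.0  # radius r of the disc D^2
R_SUPP = 0.8  # f vanishes for y^2 + z^2 >= R_SUPP^2, i.e. near the boundary


def f(y, z):
    """Smooth bump function on D^2 (a hypothetical choice); y and z are arrays."""
    s = (y * y + z * z) / R_SUPP**2
    out = np.zeros_like(s, dtype=float)
    inside = s < 1.0
    out[inside] = np.exp(-1.0 / (1.0 - s[inside]))
    return out


def trapezoid(values, grid):
    """Trapezoidal rule for samples `values` on the 1-d array `grid`."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid)))


def a_y(y, z, n=2001):
    """Coefficient of epsilon in A_y = (Lambda/6) * int_0^z f(y, z')^2 dz'."""
    zs = np.linspace(0.0, z, n)
    return LAMBDA / 6.0 * trapezoid(f(np.full(n, y), zs) ** 2, zs)


def loop_integral(n=2001):
    """Line integral of A = A_y dy (A_z = 0) around the circle of radius R_DISC."""
    thetas = np.linspace(0.0, 2.0 * np.pi, n)
    ys, zs = R_DISC * np.cos(thetas), R_DISC * np.sin(thetas)
    a_vals = np.array([a_y(y, z) for y, z in zip(ys, zs)])
    # dy = -R_DISC * sin(theta) d(theta) along the counterclockwise circle.
    return trapezoid(a_vals * (-R_DISC * np.sin(thetas)), thetas)


def flux(n=801):
    """Integral of F_yz = -(Lambda/6) f^2 over the disc (coefficient of epsilon)."""
    grid = np.linspace(-R_DISC, R_DISC, n)
    yy, zz = np.meshgrid(grid, grid, indexing="ij")
    vals = -LAMBDA / 6.0 * f(yy, zz) ** 2
    inner = np.array([trapezoid(row, grid) for row in vals])
    return trapezoid(inner, grid)


if __name__ == "__main__":
    print(loop_integral(), flux())
```

In this abelian setting the holonomy around the loop is the exponential of the line integral times $\epsilon$, up to sign and orientation conventions, so a nonzero flux corresponds to the nontrivial holonomy discussed in Sect. 3.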
5. Decomposability Given the hypotheses of Theorem 1, if is decomposable at one point of each component of the surface 6, it is automatically decomposable throughout 6. We may thus write E = e ∧ e and obtain a degenerate solution of the equations of general relativity, (2), on M. One way to guarantee decomposability is to require that (α, ) is actually a solution of SO(3) BF theory on 6. Here we restrict attention to the signatures + + ++ and + + +−, so that G is either SO(4) or SO(3, 1), and we use the fact that SO(3) is a subgroup of these groups. Suppose we split T |6 into a line bundle and a 3-dimensional vector bundle on which the internal metric η is positive definite. Using a metric-preserving isomorphism between T |6×D2 and p∗1 T |6 , we obtain a splitting T6×D2 = L ⊕ 3 T , where η is positive definite on 3 T . Let 3 P denote the special orthogonal frame bundle of 3 T . This is a principal SO(3)-bundle, which we may think of as a sub-bundle of the principal G-bundle P |6×D2 . Similarly, the vector bundle Ad 3 P is a sub-bundle of the vector bundle AdP |6×D2 . Locally, we may think of sections of the former bundle as so(3)-valued functions, and sections of the latter as g-valued functions, where g is either so(4) or so(3, 1). Suppose we have a solution of 2-dimensional BF theory on 6 for which the connection α reduces to a connection on 3 P and the section of AdP is actually a section of Ad 3 P . Then is automatically decomposable, since every vector u ∈ R3 is the cross product of two other vectors. Moreover, there is a canonical way to decompose , since we can always write u = v × w, where v and w have the same length and u, v, w form a right-handed triple. The map u 7→ (v, w) is not smooth at u = 0, but if is zero at some point of 6 it is zero throughout that component of 6, so we indeed have a canonical way to write = ey ∧ ez for smooth sections ey , ez of 3 T . We thus obtain the following: Theorem 2. 
Given the hypotheses of Theorem 1, assume that the signature of η is either + + ++ or + + +−, and assume T |6 is equipped with a 3-dimensional sub-bundle on which η is positive definite. For any value of 3, there is a natural map from solutions (α, ) of 2-dimensional SO(3) BF theory on 6 (where α is a connection on 3 P |6 and is a section of Ad 3 P |6 ) to gauge and diffeomorphism equivalence classes of fields (A, e) (where A is a connection on P and e is an T -valued 1-form on M ) satisfying the equations of general relativity on M . One may wonder to what extent these solutions of general relativity depend on our choice of 3-dimensional sub-bundle of T |6 , and whether such a choice always exists. At least in the case of signature + + ++, these issues work out nicely. In this case, we can always choose a splitting of T6 into a line bundle L and a 3-dimensional vector bundle K, and this splitting is unique up to a small gauge transformation. It follows that our map from solutions (α, ) to gauge and diffeomorphism equivalence classes of solutions (A, e) is independent of the choice of 3-dimensional sub-bundle of T |6 , as long as we choose it so that its orthogonal complement is trivial.
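The decomposition u = v × w used before Theorem 2 can be checked numerically. The following is only a minimal sketch: the helper names `cross`, `norm` and `decompose` are hypothetical, and the particular v chosen here (built from a coordinate axis) is an assumption of the sketch. Any rotation of the pair (v, w) about u gives another valid pair, while the wedge product v ∧ w is unchanged.

```python
import math

def cross(a, b):
    """Cross product of two vectors in R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in a))

def decompose(u):
    """Return (v, w) with v x w = u, |v| = |w| and u, v, w right-handed.

    Not defined at u = 0, matching the remark that the map is not smooth there.
    """
    n = norm(u)
    if n == 0.0:
        raise ValueError("the decomposition is not defined at u = 0")
    # Use the coordinate axis least aligned with u to get a vector orthogonal to u.
    k = min(range(3), key=lambda i: abs(u[i]))
    axis = tuple(1.0 if i == k else 0.0 for i in range(3))
    t = cross(u, axis)
    scale = math.sqrt(n) / norm(t)
    v = tuple(scale * x for x in t)        # |v| = sqrt(|u|), v orthogonal to u
    u_hat = tuple(x / n for x in u)
    w = cross(u_hat, v)                    # |w| = |v|, w orthogonal to u and v
    return v, w

u = (1.0, 2.0, 2.0)
v, w = decompose(u)
```

Here v × w = v × (û × v) = û |v|² = u, and the triple product u · (v × w) = |u|² is positive, so the triple is right-handed.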
Degenerate Solutions of General Relativity from Topological Field Theory
To see these facts we need a little homotopy theory. Note that a splitting $T|_\Sigma = L \oplus K$ is equivalent to a reduction of structure group from SO(4) to SO(3). This is also equivalent to a lifting

$$\begin{array}{ccc} & & BSO(3) \\ & \overset{\ell}{\nearrow} & \downarrow j \\ \Sigma & \xrightarrow{\;q\;} & BSO(4) \end{array}$$

where $q$ is the classifying map of the bundle $P|_\Sigma$, and $j$ is the map between classifying spaces induced by the inclusion $SO(3) \hookrightarrow SO(4)$. Homotopy classes of such liftings correspond to homotopy classes of maps from $\Sigma$ to $SO(4)/SO(3) \cong S^3$, which is the homotopy fiber of the map $j$. However, all maps from $\Sigma$ to $S^3$ are homotopic, so any two liftings are homotopic. Thus, suppose we are given two splittings, $T|_\Sigma = L \oplus K$ and $T|_\Sigma = L' \oplus K'$. These correspond to two liftings $\ell$ and $\ell'$, and since there exists a homotopy between $\ell$ and $\ell'$, there exists a continuous one-parameter family of splittings interpolating between the two given ones. It follows that there is a small gauge transformation carrying $L$ to $L'$ and $K$ to $K'$.

6. Conclusions

Despite many parallels, the relevance – if any – of our “flux tube” solutions to the loop representation of quantum gravity is not yet clear. Perhaps, however, we can begin to understand this by quantizing the theory of solutions of this sort associated to a fixed surface in spacetime. We can think of this as an unusual sort of “minisuperspace model”. In such a model all but finitely many degrees of freedom of the classical gravitational field are ignored, and then the rest are quantized. Presumably any such model is only a caricature of quantum gravity. At best we can hope that, like a good caricature, it reveals certain interesting features of the subject. Here we give a rough outline of how this might proceed, leaving a more careful treatment to another paper [8]. We follow a path-integral approach, considering only the case of nonzero cosmological constant. Let $M$ be an oriented 4-manifold and let $T$ be a bundle over $M$ isomorphic to the tangent bundle and equipped with a Lorentzian metric $\eta$. Fix a compact framed surface $\Sigma$ in $M$, a thickening of $\Sigma$, and a 3-dimensional sub-bundle ${}^3T$ of $T|_{\Sigma \times D^2}$ as in the previous section. Working in the Palatini formalism, we integrate only over field configurations $(A, e)$ for which:

1. Inside $\Sigma \times D^2$, $A$ reduces to a connection on ${}^3T$ and $F + \frac{\Lambda}{6}\, e \wedge e = 0$.
2. Outside $\Sigma \times D^2$, $A$ is arbitrary and $e = 0$.
For such field configurations the action can be expressed purely in terms of $A$:

$$S_{Pal}(A, e) = \int_M \mathrm{tr}\Big(e \wedge e \wedge F + \frac{\Lambda}{12}\, e \wedge e \wedge e \wedge e\Big) = -\frac{3}{\Lambda} \int_{\Sigma \times D^2} \mathrm{tr}(F \wedge F).$$

If we fix a connection $A_0$ of the above form that is flat on the boundary $\Sigma \times S^1$ of the thickening, and use this to think of $A$ as an $\mathrm{Ad}P$-valued 1-form, we obtain by Stokes’ theorem

$$S_{Pal}(A, e) = -\frac{3}{\Lambda}\, S_{CS}(A) + c,$$

where $c$ depends only on the fixed connection $A_0$ and $S_{CS}(A)$ is the Chern-Simons action

$$S_{CS}(A) = \int_{\Sigma \times S^1} \mathrm{tr}\Big(A \wedge dA + \frac{2}{3}\, A \wedge A \wedge A\Big).$$
Now we are in a position to take full advantage of the following “ladder of field theories” [4] in dimensions 2, 3, and 4:

4d:   General relativity  →  BF theory  →  Chern theory
                                                ↓
3d:                                     Chern-Simons theory
                                                ↓
2d:                   WZW model  →  G/G gauged WZW model
As we have seen, for any framed surface $\Sigma$ in spacetime, general relativity has a class of degenerate solutions coming from 4-dimensional BF theory. Our minisuperspace model amounts to 4-dimensional BF theory on the thickened surface $\Sigma \times D^2$. Using the equations of motion of BF theory to write the action purely as a function of $F$, we find that it is proportional to the second Chern number

$$c_2 = \frac{1}{8\pi^2} \int_M \mathrm{tr}(F \wedge F).$$

This action gives a theory involving only the $A$ field, which we may call “Chern theory”. The Lagrangian of Chern theory is a closed 4-form, so it has no bulk degrees of freedom, and on $\Sigma \times D^2$ it reduces to Chern-Simons theory on $\Sigma \times S^1$. The final step down the dimensional ladder uses the fact that Chern-Simons theory on a 3-manifold of the form $\Sigma \times S^1$ is equivalent to the G/G-gauged Wess-Zumino-Witten model on $\Sigma$. This theory is a 2-dimensional topological field theory having a basis of states corresponding to irreducible representations of the quantum group associated to the gauge group $G$, which in our case is SO(3). We see therefore that in the canonical picture, when $\Sigma$ intersects space at a given time in some link $L$, states of our minisuperspace model are described by labellings of the components of $L$ with irreducible representations of quantum SO(3), or in other words, integral spins $j = 0, 1, 2, \ldots, k/2$, where the “level” $k$ is an even integer depending on $\Lambda$. A link labelled this way is simply a q-deformed spin network with no vertices. While we have included the Wess-Zumino-Witten model in our picture of the ladder of field theories, it has only an indirect relevance to our minisuperspace model. It is a conformal field theory whose conformal blocks form a basis of states of the G/G gauged Wess-Zumino-Witten model. In string theory, the string worldsheet is equipped with a conformal structure, and some of the transverse vibrational modes of the string are described by Wess-Zumino-Witten models living on this surface.
In the minisuperspace model sketched above, we see instead the G/G gauged Wess-Zumino-Witten model living on the surface $\Sigma$. There is, of course, no opportunity for us to see a conformal field theory on $\Sigma$, because this surface is not equipped with a metric; even in any particular solution the metric is only nondegenerate in the two directions transverse to $\Sigma$.
This begins to make more precise our previous speculation that the loop representation of quantum gravity is related to background-free string theory, with the loops arising as t = 0 slices of string worldsheets [3]. It is intriguing that in a somewhat different framework, Jacobson [10] has constructed degenerate solutions of general relativity in which the metric is nondegenerate in the directions tangent to a surface, and by this means obtains a conformal field theory. Comparing his solutions and ours may shed more light on the relation between the loop representation of quantum gravity and string theory – or at least 2-dimensional field theory.

Acknowledgement. I would like to thank Ted Jacobson for getting me interested in degenerate solutions of general relativity coming from field theories on surfaces, Mike Reisenberger for pointing out that his “2-sphere solution” of general relativity should be related to BF theory, Jorge Pullin for emphasizing the importance of decomposability of the E field, and W. Dale Hall and Clarence Wilkerson for helping me with the homotopy theory. I would also like to thank the Erwin Schrödinger Institute in Vienna, the Theoretical Physics Group at Imperial College, and the Center for Gravitational Physics and Geometry at Pennsylvania State University for their hospitality while this work was being done. This work was supported in part by NSF grant PHY95-14240.
References
1. Ashtekar, A., Lewandowski, J., Marolf, D., Mourão, J. and Thiemann, T.: Quantization of diffeomorphism invariant theories of connections with local degrees of freedom. J. Math. Phys. 36, 6456–6493 (1995)
2. Ashtekar, A. and Lewandowski, J.: Quantum Theory of Geometry I: Area Operators. Class. Quantum Grav. 14, A55–A81 (1997)
3. Baez, J.: Strings, loops, knots and gauge fields. In: Knots and Quantum Gravity. ed. J. Baez, Oxford: Oxford U. Press, 1994
4. Baez, J.: Knots and quantum gravity: progress and prospects. In: Proceedings of the Seventh Marcel Grossmann Meeting on General Relativity. ed. Robert T. Jantzen and G. Mac Keiser, Singapore: World Scientific Press, 1996, pp. 779–797
5. Baez, J.: Link invariants, holonomy algebras and functional integration. J. Funct. Anal. 127, 108–131 (1995)
6. Baez, J.: Four-dimensional BF theory as a topological quantum field theory. Lett. Math. Phys. 38, 129–143 (1996)
7. Baez, J.: Spin networks in nonperturbative quantum gravity. In: The Interface of Knots and Physics, ed. Louis Kauffman, Providence, R. I.: American Mathematical Society, 1996, pp. 167–203
8. Baez, J.: BF theory and quantum gravity in four dimensions. Manuscript in preparation
9. Di Bartolo, C., Gambini, R., Griego, J., Pullin, J.: The space of states of quantum gravity in terms of loops and extended loops: Some remarks. J. Math. Phys. 36, 6511–6528 (1995)
10. Jacobson, T.: 1+1 sector of 3+1 gravity. Class. Quant. Grav. 13, L1–L6 (1996)
11. Kauffman, L. and Lins, S.: Temperley-Lieb Recoupling Theory and Invariants of 3-Manifolds. Princeton, NJ: Princeton U. Press, 1994
12. Major, S., Smolin, L.: Quantum deformation of quantum gravity. Nucl. Phys. B473, 267–290 (1996)
13. Reisenberger, M.: New constraints for canonical general relativity. Nucl. Phys. B457, 643–687 (1995)
14. Reisenberger, M. and Rovelli, C.: “Sum over surfaces” form of loop quantum gravity. Preprint available as gr-qc/9612035.
15. Rovelli, C. and Smolin, L.: Spin networks in quantum gravity. Phys. Rev. D52, 5743–5759 (1995)
16. Rovelli, C. and Smolin, L.: Discreteness of area and volume in quantum gravity. Nucl. Phys. B442, 593–622 (1995); Erratum, ibid. B456, 753 (1995)
17. Thiemann, T.: Anomaly-free formulation of non-perturbative, four-dimensional Lorentzian quantum gravity. Phys. Lett. B380, 257–264 (1996)

Communicated by G. Felder
Commun. Math. Phys. 193, 233 – 243 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
The Discrete Spectrum in the Gaps of the Continuous One for Non-Signdefinite Perturbations with a Large Coupling Constant
O.L. Safronov
Department of Mathematics, Royal Institute of Technology, S-100 44 Stockholm, Sweden. E-mail: [email protected]
Received: 9 April 1997 / Accepted: 26 August 1997
Abstract: Given two selfadjoint operators $A$ and $V = V_+ - V_-$, we study the motion of the eigenvalues of the operator $A(t) = A - tV$ as $t$ increases. Let $\alpha > 0$ and let $\lambda$ be a regular point for $A$. We consider the quantities $N_+(\lambda, \alpha)$, $N_-(\lambda, \alpha)$, $N_0(\lambda, \alpha)$ defined as the number of eigenvalues of the operator $A(t)$ that pass the point $\lambda$ from the right to the left, from the left to the right, or change the direction of their motion exactly at the point $\lambda$, respectively, as $t$ increases from 0 to $\alpha$. An abstract theorem on the asymptotics of these quantities is presented. Applications to Schrödinger operators and their generalizations are given.
0. Introduction

1. Let $A$ be a selfadjoint operator, semibounded from below, in a Hilbert space $H$, and let $V$ be a selfadjoint non-signdefinite perturbation. Assume that the quadratic form of the operator $V$ is compact with respect to the form of $A$. Denote $A(\alpha) = A - \alpha V$, $\alpha > 0$. Let the interval $\Lambda = (\lambda_-, \lambda_+)$ be a gap in the spectrum of $A$. Then the spectrum of $A(\alpha)$ is discrete in $\Lambda$. The eigenvalues of $A(\alpha)$ in $\Lambda$ are continuous functions of $\alpha$. As $\alpha$ grows, they may appear or disappear only at the endpoints $\lambda_-$, $\lambda_+$. Fix a number $\lambda$ such that $\lambda_- < \lambda < \lambda_+$. As $\alpha$ increases, the eigenvalues of $A(\alpha)$ may move in $\Lambda$ crossing the point $\lambda$ from the right to the left, from the left to the right, or “being reflected” by the point $\lambda$. Let $N_+(\lambda, \alpha)$ be the number of eigenvalues of the operator $A(t)$ that pass the point $\lambda$ from the right to the left as $t$ increases from 0 to $\alpha > 0$. Let $N_-(\lambda, \alpha)$ be the same quantity corresponding to the motion of the eigenvalues from the left to the right. Finally, denote by $N_0(\lambda, \alpha)$ the number of eigenvalues of $A(t)$ which change the direction of their motion exactly at the point $\lambda$ as $t$ increases from 0 to $\alpha$. Our aim is to investigate the asymptotic behavior of the quantities $N_+(\lambda, \alpha)$, $N_-(\lambda, \alpha)$, $N_0(\lambda, \alpha)$ as $\alpha \to \infty$. Put $N(\lambda, \alpha) = N_+(\lambda, \alpha) - N_-(\lambda, \alpha)$. In the present paper we obtain abstract conditions guaranteeing that

$$N_+(\lambda, \alpha) \sim N(\lambda, \alpha), \quad \alpha \to \infty, \tag{1}$$

and

$$N_-(\lambda, \alpha) + N_0(\lambda, \alpha) = o(N(\lambda, \alpha)), \quad \alpha \to \infty. \tag{2}$$

This allows one to apply the results already obtained in [16] about the asymptotics of $N(\lambda, \alpha)$.

2. Note that if $V \ge 0$ or $V \le 0$, then the eigenvalues of $A(\alpha)$ move monotonically to the left or to the right, respectively. In this case only $N_+(\lambda, \alpha)$ or $N_-(\lambda, \alpha)$ is nonzero. The problem can be reduced (in view of the Birman-Schwinger principle) to the study of the asymptotics of the spectrum of a certain compact selfadjoint operator. In particular, this idea was used by Klaus [12, 13] and Hempel [9]. The paper [12] deals with a Dirac operator; in [13] the case of relatively compact perturbations was studied in an abstract setting. In [9] a $d$-dimensional Schrödinger operator

$$A = -\Delta + f(x), \quad f \in L_\infty(\mathbb{R}^d), \tag{3}$$

perturbed by a potential

$$V(x) = O(|x|^{-q}), \quad |x| \to \infty, \quad q > 2, \tag{4}$$

was investigated. Namely, for the corresponding operator $A(\alpha)$ and $V \ge 0$ the Weyl asymptotics

$$N_+(\lambda, \alpha) \sim (2\pi)^{-d} \omega_d\, \alpha^{d/2} \int V^{d/2}\, dx, \quad \alpha \to \infty, \tag{5}$$

was obtained, where $\omega_d = \mathrm{vol}\{x \in \mathbb{R}^d : |x| < 1\}$. Later the asymptotics (5) was proved in [2] for a wider class of potentials $V$. For example, the asymptotics (5) holds for an arbitrary $V \ge 0$ such that

$$V \in L_{d/2}(\mathbb{R}^d), \quad d \ge 3. \tag{6}$$

Notice that the asymptotics of $N_+(\lambda, \alpha)$ in (5) does not depend on $\lambda$. According to [2] and [4], this class of $V$ is called the class of “regular” potentials. In [2] a general operator scheme for such perturbations was suggested. Conditions were obtained under which the asymptotics of $N_+(\lambda, \alpha)$ does not change under sufficiently strong variations of $A$ and under a change of the point $\lambda$. This allows one to apply the variational technique and to extend and refine the results in applications. Let us now discuss the case $V \le 0$, where one should study the quantity $N_-(\lambda, \alpha)$. For the operator (3), which possesses the so-called density of states, and for the perturbation $V(x) \sim c|x|^{-q}$, $|x| \to \infty$, $c > 0$, $q > 0$, the asymptotics $N_-(\lambda, \alpha) \sim C(\lambda)\alpha^{d/q}$, $\alpha \to \infty$, was obtained in [1]. The asymptotic coefficient $C(\lambda)$ was expressed in terms of the density of states of $A$. In the case $d \ge 3$, for an arbitrary operator (3) and perturbation (6), we have the “poorer” relation $N_-(\lambda, \alpha) = o(\alpha^{d/2})$, $\alpha \to \infty$ (see [2]).
3. If $V$ is a non-signdefinite perturbation, then the problem becomes much more difficult. The auxiliary compact operator in the Birman-Schwinger principle turns out to be non-selfadjoint. A way of avoiding non-selfadjoint operators (despite the non-signdefiniteness of $V$) was suggested by Simon [17] (see also [8]). The work [5] renewed interest in the case of non-signdefinite $V$. That paper also gives detailed information for one-dimensional Schrödinger operators (the case excluded from our considerations). The results obtained in [5] were then improved by different authors (see, for example, [1] and [8]). For a perturbed Schrödinger operator, lower estimates (but not asymptotics) of the sum $N_+(\lambda, \alpha) + N_-(\lambda, \alpha) + N_0(\lambda, \alpha)$ were obtained in [10, 15]. Somewhat different questions for a 1-dimensional Schrödinger operator were analyzed in [6]. It was found that for large $\alpha$ the eigenvalues in a gap have a “trapping and cascading” structure. In [6] matrix differential operators of first order (Dirac operators on the half-line $\mathbb{R}_+$) were also considered. The investigation of the case of variable sign perturbations was continued in [16]. It turned out that it is convenient to study the difference $N(\lambda, \alpha) = N_+(\lambda, \alpha) - N_-(\lambda, \alpha)$. In [16] an abstract theorem (of the same type as in [2]) on the stability of the asymptotics of $N(\lambda, \alpha)$ was proved. This proof is technically more complicated than in [2]. In applications, however, the same generality as in [2] was achieved. In particular, in the case $d \ge 3$ for the operator (3) perturbed by a non-signdefinite potential (6), the asymptotic formula

$$N(\lambda, \alpha) \sim (2\pi)^{-d} \omega_d\, \alpha^{d/2} \int V_+^{d/2}\, dx, \quad \alpha \to \infty, \quad 2V_+ = |V| + V,$$

was obtained. There were some reasons to believe that under quite broad conditions on $V$ the relations (1), (2) hold. This conjecture was confirmed partially in the paper [11], where the following statement was proved for the Schrödinger operator.
Let $A$ be the operator (3), let $V \in L_\infty(\mathbb{R}^d)$ and let $V$ be of compact support. Let $G \subset \mathbb{R}^d$ be an open set such that $\{x \in \mathbb{R}^d : V(x) \ne 0\}$ coincides with $G$ up to a set of measure zero. Suppose that $\lambda$ is not an eigenvalue of the operator $-\Delta + f(x)$ in $L_2(\mathbb{R}^d \setminus G)$ with the Dirichlet boundary conditions (see [11]). Then there exists $\alpha_0 > 0$ such that for $\alpha > \alpha_0$ the eigenvalues of the operator $A(\alpha)$ may pass the point $\lambda$ only from the right to the left. In the present paper abstract conditions guaranteeing the relations (1), (2) are obtained. We are forced, however, to use an argument of “Tauberian character” in the proof of the main Theorem 1.1. This argument allows us to study only asymptotics of order $\alpha^p$ with $p > 1$. Therefore, in applications to Schrödinger operators the cases of dimensions $d = 1, 2$ are not covered. As a non-perturbed operator we also consider (Theorem 4.1) a polyharmonic operator with additional terms of smaller differential orders. However, as a perturbation $V$ we take only the operator of multiplication by a function. Perturbations in the form of differential operators, considered in [2, 16], do not fit the conditions of Theorem 1.1.
1. Statement of the Main Result

1. Let $H$ be a separable Hilbert space. Below $(\cdot, \cdot)$, $\|\cdot\|$, $I$ are the scalar product, the norm and the identity operator in $H$. We denote by $\mathbf{R}$ the space of continuous linear operators and by $S_\infty$ the space of compact operators acting from $H$ to $H$. For a densely defined linear operator $M$ we denote its domain, range, kernel, adjoint operator, resolvent set and spectrum by the symbols $D(M)$, $\mathrm{Ran}\,M$, $\mathrm{Ker}\,M$, $M^*$, $\rho(M)$, $\sigma(M)$ respectively. If $M = M^*$, we put $2M_\pm = |M| \pm M$. For $T \in S_\infty$ we put $2\,\mathrm{Re}\,T = T + T^*$ and denote by $s_k(T)$, $k \in \mathbb{N}$, the singular numbers of $T$, i.e., the consecutive eigenvalues of the operator $(T^*T)^{1/2}$. We introduce the counting function

$$n(s, T) = \mathrm{card}\{k : s_k(T) > s\}, \quad s > 0,$$

of the s-numbers. If $T = T^* \in S_\infty(H)$, we denote

$$n_\pm(\cdot, T) = n(\cdot, T_\pm). \tag{7}$$

For $0 < p < \infty$ we consider the class (ideal) $\Sigma_p \subset S_\infty$ defined by the condition

$$\Sigma_p = \{T \in S_\infty : \sup_{s > 0} s^p n(s, T) < \infty\}.$$

Finally, $S_p$, $0 < p < \infty$, is the class of all compact operators $T$ satisfying the condition

$$\|T\|_{S_p}^p := \sum_k s_k(T)^p < \infty.$$
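The counting function and the quantity defining the class Σp are easy to evaluate for a finite list of singular numbers. The following is a minimal sketch only: the function names `counting_function` and `weak_supremum` are hypothetical, and a finite list stands in for the singular numbers of a compact operator. The sample sequence s_k = k^(-1/p) illustrates why Σp is larger than Sp: the supremum of s^p n(s, T) stays equal to 1, while the sum of s_k^p is a harmonic sum that grows without bound as more terms are included.

```python
def counting_function(singular_numbers, s):
    """n(s, T) = card{k : s_k(T) > s} for a finite list of singular numbers."""
    return sum(1 for sk in singular_numbers if sk > s)

def weak_supremum(singular_numbers, p):
    """sup over s > 0 of s^p n(s, T) for a finite list.

    n(s, T) is a step function, so the supremum equals max_k k * s_k^p
    with the singular numbers sorted in nonincreasing order.
    """
    ordered = sorted(singular_numbers, reverse=True)
    return max((k + 1) * sk ** p for k, sk in enumerate(ordered))

def schatten_power_sum(singular_numbers, p):
    """Sum of s_k^p, i.e. the p-th power of the S_p quasinorm."""
    return sum(sk ** p for sk in singular_numbers)

p = 2.0
sample = [k ** (-1.0 / p) for k in range(1, 1001)]   # s_k = k^(-1/p)
weak_value = weak_supremum(sample, p)
power_sum = schatten_power_sum(sample, p)
```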
2. Let $a[\cdot, \cdot]$ be a closed sesquilinear form in $H$, semibounded from below. We assume that its domain $d[a]$ is a dense set in $H$. The form $a$ induces in $H$ the self-adjoint operator $A$. Fix a value of $\gamma \in \mathbb{R}$ such that $a_\gamma := a + \gamma \ge 1$, i.e. $a_\gamma[x, x] = a[x, x] + \gamma\|x\|^2 \ge \|x\|^2$, $x \in d[a]$. Denote by $H_\gamma[a]$ the (complete) Hilbert space $d[a]$ with the metric form $a_\gamma[x, x] = \|(A + \gamma I)^{1/2}x\|^2$, $x \in d[a]$. Let $V$ be a self-adjoint operator in $H$ satisfying $D(|V|^{1/2}) \supset d[a]$. Put $W_\pm = (\tfrac{1}{2}(|V| \pm V))^{1/2}$. Then the closures of $\mathrm{Ran}\,W_+$ and $\mathrm{Ran}\,W_-$ are orthogonal subspaces in $H$. Obviously, $D(W_\pm) \supset d[a]$ and

$$W_\pm (A + \gamma I)^{-1/2} \in \mathbf{R}.$$

We impose on $W_\pm$ the more restrictive conditions: there exists $p \in (1, \infty)$ such that

$$W_\pm (A + \gamma I)^{-1/2} \in \Sigma_{2p}. \tag{8}$$

Let us introduce the form

$$v[x, y] = (W_+ x, W_+ y) - (W_- x, W_- y). \tag{9}$$

The condition (8) implies that $v$ is compact on $d[a]$. This means that the form $v$ is continuous on $H_\gamma[a]$ and the corresponding operator $Q$, determined by the equation $a_\gamma[Qx, y] = v[x, y]$, $x, y \in d[a]$, is compact on $H_\gamma[a]$. Hence, the form $v$ satisfies the estimate $v[x, x] \le \varepsilon a_\gamma[x, x] + C(\varepsilon)\|x\|^2$, $x \in d[a]$, $\forall \varepsilon > 0$. Therefore, we can introduce a family of closed forms $a(\alpha)$ on $d[a]$, semibounded from below:

$$a(\alpha) = a - \alpha v, \quad \alpha > 0. \tag{10}$$
Denote by $A(\alpha)$ the self-adjoint operator in $H$ which corresponds to the form (10). Since the difference of the resolvents of $A$ and $A(\alpha)$ is compact, the spectrum of $A(\alpha)$ in the gaps of $\sigma(A)$ is discrete. Let the interval $\Lambda = (\lambda_-, \lambda_+)$ be a gap in the spectrum $\sigma(A)$. We fix an “observation point” $\lambda$, $\lambda_- < \lambda < \lambda_+$. Assume that $\lambda$ is an eigenvalue of multiplicity $k$ of the operator $A(t)$ for some $t > 0$. There is a neighborhood of $t$ where we can choose real-analytic functions $\lambda_j(s)$, $j = 1, \ldots, k$, whose values are eigenvalues of $A(s)$ and $\lambda_j(t) = \lambda$, $j = 1, \ldots, k$. We count the eigenvalues $\{\lambda_j(s)\}_{j=1}^k$ with their multiplicities. Let us choose a suitable neighborhood of the point $t$. Since none of the functions $\lambda_j$ is constant, the zeros of the derivatives $d\lambda_j/ds$ are isolated. Hence we can choose a neighborhood of $t$ such that $d\lambda_j(s)/ds \ne 0$ for $s \ne t$, $j = 1, \ldots, k$. Let $k_+(t)$, $k_-(t)$, $k_0(t)$ be the number of the functions $\lambda_j$ which are monotone decreasing, monotone increasing and non-monotone (respectively) in the chosen neighborhood. If $\lambda \in \rho(A(t))$, then we assume that $k_+(t) = k_-(t) = k_0(t) = 0$. Introduce the following values:

$$N_\pm(\lambda, \alpha) = \sum_{0 < t < \alpha} k_\pm(t).$$

$$\mathrm{Re}\,X(\lambda) = X_+(\lambda) - X_-(\lambda), \tag{12}$$

where $X_\pm(\lambda)$ are the compact selfadjoint operators on $H$ defined by the formulae

$$X_\pm(\lambda) g = W_\pm (A - \lambda I)^{-1} W_\pm g, \quad g \in D(W_\pm).$$

Obviously, the closure of $\mathrm{Ran}\,X_+(\lambda)$ is orthogonal to the closure of $\mathrm{Ran}\,X_-(\lambda)$. Therefore, it follows from (12) that

$$n_+(s, \mathrm{Re}\,X(\lambda)) = n_+(s, X_+(\lambda)) + n_-(s, X_-(\lambda)), \quad s > 0. \tag{13}$$
It is well known that the description of the discrete spectrum of A(α) can be reduced to the study of the spectrum of the compact operator X(λ) (the Birman–Schwinger principle). We present here a version of this reduction suitable for perturbations of forms.
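Before the precise statement, the reduction can be illustrated in finite dimensions. The sketch below is not taken from the paper: the 2×2 matrices are made-up sample data, and the formula X(λ) = (W₊ + W₋)(A − λI)⁻¹(W₊ − W₋) is an assumption, chosen because it is a standard Birman–Schwinger operator for a sign-indefinite V and is consistent with relation (12). The check is that a positive eigenvalue μ of X(λ) gives a coupling constant α = 1/μ at which λ becomes an eigenvalue of A − αV.

```python
import math

def determinant(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inverse(m):
    """Inverse of an invertible 2x2 matrix."""
    det = determinant(m)
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def real_eigenvalues(m):
    """Real eigenvalues of a real 2x2 matrix (empty list if they are complex)."""
    tr = m[0][0] + m[1][1]
    disc = tr * tr - 4.0 * determinant(m)
    if disc < 0:
        return []
    root = math.sqrt(disc)
    return [(tr + root) / 2.0, (tr - root) / 2.0]

# Sample data (assumed for illustration): A is positive definite, so lam = 0
# is a regular point of A; V = W_plus^2 - W_minus^2 is sign-indefinite.
A = [[2.0, 1.0], [1.0, 3.0]]
lam = 0.0
W_plus = [[1.0, 0.0], [0.0, 0.0]]
W_minus = [[0.0, 0.0], [0.0, 1.0]]
V = [[1.0, 0.0], [0.0, -1.0]]

shifted = [[A[i][j] - (lam if i == j else 0.0) for j in range(2)] for i in range(2)]
left = [[W_plus[i][j] + W_minus[i][j] for j in range(2)] for i in range(2)]
right = [[W_plus[i][j] - W_minus[i][j] for j in range(2)] for i in range(2)]
X = matmul(matmul(left, inverse(shifted)), right)   # assumed form of X(lam)

# Positive eigenvalues mu of X correspond to couplings alpha = 1 / mu.
couplings = [1.0 / mu for mu in real_eigenvalues(X) if mu > 0]
alpha = couplings[0]
perturbed = [[A[i][j] - alpha * V[i][j] - (lam if i == j else 0.0)
              for j in range(2)] for i in range(2)]
residual = determinant(perturbed)   # vanishes when lam is an eigenvalue of A - alpha V
```

In this example det(A − αV) = 5 − α − α², so the only positive coupling is α = (√21 − 1)/2, which agrees with the reciprocal of the positive eigenvalue of X(0).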
Proposition 1.1. Let $\lambda = \bar\lambda \in \rho(A)$. The following two statements are equivalent: 1) the point $\lambda$ is an eigenvalue of multiplicity $k$ for the operator $A(\alpha)$; 2) the point $\alpha^{-1}$ is an eigenvalue of geometric multiplicity $k$ for the operator $X(\lambda)$.

Proposition 1.1 is a direct consequence of Lemma 1 from [14]. For additive operator perturbations a simple proof can be found in [13]. Note that in the case $V \ge 0$ or $V \le 0$ this statement was later proved in [2], and this proof can also be applied to non-signdefinite perturbations. Now we are ready to formulate the main result of the paper.

Theorem 1.1. Let the form $v$ in (10) be defined by (9) and let the condition (8) be fulfilled for some $p \in (1, \infty)$. Assume that for $\lambda = \bar\lambda \in \rho(A)$ the following limits exist and coincide:

$$\lim_{\alpha \to \infty} \alpha^{-p} N(\lambda, \alpha) = \lim_{s \to 0} s^p n_+(s, \mathrm{Re}\,X(\lambda)) =: J. \tag{14}$$

Then

$$\lim_{\alpha \to \infty} \alpha^{-p} N_+(\lambda, \alpha) = J$$

and

$$\lim_{\alpha \to \infty} \alpha^{-p} (N_0(\lambda, \alpha) + N_-(\lambda, \alpha)) = 0.$$

2. Additional Information Concerning Compact Operators

1. Let $T \in S_\infty$ and let $\{\lambda_k(T)\}_{k=1}^{\nu}$ be the sequence of its eigenvalues lying in the right half-plane $\mathbb{C}_+ = \{z \in \mathbb{C} : \mathrm{Re}\,z > 0\}$. Here $\nu$ is a positive integer or $\nu = \infty$. We arrange the eigenvalues so that $\mathrm{Re}\,\lambda_{k+1}(T) \le \mathrm{Re}\,\lambda_k(T)$ and count them with their algebraic multiplicities. If the sequence $\{\lambda_k(T)\}_{k=1}^{\nu}$ is finite, we continue it by zeros. After this we can assume that $\nu = \infty$. Note that $T$ may be a non-selfadjoint operator. We introduce the distribution function $n_+(s, T) = \mathrm{card}\{k : \mathrm{Re}\,\lambda_k(T) > s\}$, $s > 0$, of the eigenvalues of $T$. Since this function is a generalization of $n_+$ from (7), we keep the previous notation for it. There is a relation between the eigenvalues of $T$ and the eigenvalues of the selfadjoint operator

$$R := \mathrm{Re}\,T = \tfrac{1}{2}(T^* + T).$$

Proposition 2.1. Let $T \in S_\infty$ and $\eta > 0$. Then

$$\int_\eta^\infty n_+(s, T)\, ds \le \int_\eta^\infty n_+(s, R)\, ds. \tag{15}$$
Proof. Consider the subspace

$$G = \sum_{\mathrm{Re}\,\lambda_k(T) > \eta} G_k,$$

where $G_k$ are the root subspaces corresponding to the eigenvalues $\lambda_k(T)$. Then $G$ is a $T$-invariant subspace such that $\dim G = n_+(\eta, T)$, and the restriction $T_G$ of the operator $T$ to $G$ satisfies $\lambda_k(T_G) = \lambda_k(T)$, $k = 1, \ldots, n_+(\eta, T)$. Let $P = P_G$ be the orthogonal projection onto $G$. Obviously $(T - \eta I)P = (T_G - \eta I_G) \oplus 0$. Hence

$$\mathrm{tr}((T - \eta I)P) = \sum_{\mathrm{Re}\,\lambda_k(T) > \eta} (\lambda_k(T) - \eta). \tag{16}$$

Put $2Y = i(T^* - T)$. Then $\mathrm{tr}((T - \eta I)P) = \mathrm{tr}((R - \eta I)P) + i\,\mathrm{tr}(YP)$. Since $\mathrm{tr}(YP) = \mathrm{tr}(PYP)$ is a real number, we have

$$\mathrm{Re}\,\mathrm{tr}((T - \eta I)P) = \mathrm{tr}((R - \eta I)P) = \mathrm{tr}(P(R - \eta I)P) \le \mathrm{tr}(P(R - \eta I)_+ P) \le \|(R - \eta I)_+\|_{S_1} = \sum_{\lambda_k(R) > \eta} (\lambda_k(R) - \eta). \tag{17}$$

It follows from (16) and (17) that

$$\sum_{\mathrm{Re}\,\lambda_k(T) > \eta} (\mathrm{Re}\,\lambda_k(T) - \eta) \le \sum_{\lambda_k(R) > \eta} (\lambda_k(R) - \eta). \tag{18}$$

It remains to note that the left- and right-hand sides in (15) are the same as in (18) respectively.

Corollary 2.1. Let $T \in S_\infty$. Then

$$\sum_{k=1}^{l} \mathrm{Re}\,\lambda_k(T) \le \sum_{k=1}^{l} \lambda_k(R), \quad l = 1, 2, \ldots. \tag{19}$$
Proof. Let us choose the number $\eta$ in (18) such that $\eta = \lambda_l(R)$. Then we can replace the sum on the right-hand side of (18) by

$$\sum_{k=1}^{l} (\lambda_k(R) - \eta).$$

It remains to note that

$$\mathrm{Re} \sum_{k=1}^{l} (\lambda_k(T) - \eta) \le \mathrm{Re} \sum_{\mathrm{Re}\,\lambda_k(T) > \eta} (\lambda_k(T) - \eta).$$

The proof is completed.
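Inequality (19) can be checked by hand on a small non-selfadjoint matrix. The following sketch is an illustration under assumptions: the 2×2 matrix T is made-up sample data, the helper names are hypothetical, and a matrix stands in for a compact operator. It compares the partial sums of Re λ_k(T) with those of λ_k(R), R = Re T, using the ordering and zero-padding convention introduced at the beginning of this section.

```python
import cmath

def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 matrix, computed from its trace and determinant."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return [(tr + root) / 2.0, (tr - root) / 2.0]

def ordered_real_parts(m):
    """Real parts of the eigenvalues in the open right half-plane,
    largest first, continued by zeros as in the text."""
    parts = sorted((z.real for z in eigenvalues_2x2(m) if z.real > 0), reverse=True)
    return parts + [0.0] * (2 - len(parts))

def partial_sums(values):
    """Running sums of a list of numbers."""
    total, sums = 0.0, []
    for value in values:
        total += value
        sums.append(total)
    return sums

# Sample data (assumed): real, non-selfadjoint, eigenvalues 1 and 0.5.
T = [[1.0, 4.0], [0.0, 0.5]]
# R = Re T = (T* + T) / 2; for a real matrix T* is the transpose.
R = [[(T[i][j] + T[j][i]) / 2.0 for j in range(2)] for i in range(2)]

sums_T = partial_sums(ordered_real_parts(T))
sums_R = partial_sums(ordered_real_parts(R))
holds = all(a <= b + 1e-12 for a, b in zip(sums_T, sums_R))
```

For this T the partial sums are 1 and 1.5, while R has a single positive eigenvalue (3 + √65)/4 ≈ 2.77, so both inequalities in (19) hold with room to spare.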
Inequalities similar to (15), (19) can be found in [7], Ch. 2, §6.

2. Here we consider the functionals

$$\Delta_p^{(+)}(T) = \limsup_{s \to 0} s^p n_+(s, T), \qquad \delta_p^{(+)}(T) = \liminf_{s \to 0} s^p n_+(s, T),$$

which are finite on $\Sigma_p$ (see [3]). Let the operators $T$, $R$ be the same as in Subsect. 1. Assume that $T \in \Sigma_p$, $1 < p < \infty$, and for every $s > 0$,

$$n_+(s, T) \ge n_+(s, R) - \varphi(s), \tag{20}$$

where $\varphi \in L_{\infty,\mathrm{loc}}(0, \infty)$ is a real-valued function such that

$$\lim_{s \to 0} s^p \varphi(s) = 0. \tag{21}$$

As an application of the inequality (15) we present the following result.

Proposition 2.2. Let $T \in \Sigma_p$, $1 < p < \infty$. Assume that the conditions (20) and (21) are fulfilled. Then

$$\Delta_p^{(+)}(T) = \Delta_p^{(+)}(R), \tag{22}$$

$$\delta_p^{(+)}(T) = \delta_p^{(+)}(R). \tag{23}$$

Proof. To be definite we consider the functional $\Delta_p^{(+)}(T)$. The equality (23) can be checked similarly. It follows from (20), (21) that

$$\Delta_p^{(+)}(T) \ge \Delta_p^{(+)}(R).$$
Let us now prove the inequality $\le$ in (22). There exists $s_0 > 0$ such that $n_+(s, T) = n_+(s, R) = 0$ for $s > s_0$. Therefore, multiplying $\varphi$ by the indicator function of $(0, s_0]$, we can assume that $\varphi(s) = 0$, $\forall s > s_0$. Then for every $t > 0$, according to (20), we have

$$\int_t^\infty n_+(s, T)\, ds \ge \int_t^\infty n_+(s, R)\, ds - \int_t^\infty \varphi(s)\, ds. \tag{24}$$

Note that

$$\int_t^\infty \varphi(s)\, ds = o(t^{1-p}), \quad t \to 0. \tag{25}$$

Subtracting (24) from (15), we obtain

$$\int_\eta^t n_+(s, T)\, ds \le \int_\eta^t n_+(s, R)\, ds + \int_t^\infty \varphi(s)\, ds.$$

Putting $\eta = (1 - \varepsilon)t$, $0 < \varepsilon < 1$, and estimating the left- and right-hand sides, we find that

$$\varepsilon t\, n_+(t, T) \le \varepsilon t\, n_+((1 - \varepsilon)t, R) + \int_t^\infty \varphi(s)\, ds, \quad t > 0.$$

Multiplying the last inequality by $t^{p-1}/\varepsilon$ and taking upper limits as $t \to 0$, in view of (25) we obtain

$$\Delta_p^{(+)}(T) \le (1 - \varepsilon)^{-p} \Delta_p^{(+)}(R).$$

The passage to the limit as $\varepsilon \to 0$ gives the estimate

$$\Delta_p^{(+)}(T) \le \Delta_p^{(+)}(R).$$

The proof is completed.
3. The Proof of the Main Result (Theorem 1.1)

It follows from Proposition 1.1 that

$$N_+(\lambda, \alpha) + N_-(\lambda, \alpha) + N_0(\lambda, \alpha) = \sum_{0 < t < \alpha} \dim \mathrm{Ker}(X(\lambda) - t^{-1} I).$$

Hence,

$$N_+(\lambda, \alpha) + N_-(\lambda, \alpha) + N_0(\lambda, \alpha) \le n_+(\alpha^{-1}, X(\lambda)). \tag{26}$$

According to (14) we have

$$N_+(\lambda, \alpha) - N_-(\lambda, \alpha) \sim \alpha^p J, \quad \alpha \to \infty. \tag{27}$$

Comparing (26) and (27), we see that it is sufficient to prove that

$$\lim_{s \to 0} s^p n_+(s, X(\lambda)) = J. \tag{28}$$

We apply Proposition 2.2, putting $T = X(\lambda)$ and $\varphi(s) = n_+(s, \mathrm{Re}\,X(\lambda)) - N(\lambda, s^{-1})$. Then condition (20) follows from the inequality $n_+(\alpha^{-1}, X(\lambda)) \ge N(\lambda, \alpha)$. Equality (14) coincides with assumption (21). Thus we obtain (28).
4. Application to Differential Operators

We give here an example of how Theorem 1.1 can be applied to the study of spectral properties of differential operators. Below we use the following notations: $\int = \int_{\mathbb{R}^d}$, $D_j = -i\partial/\partial x_j$, $D = (D_1, \ldots, D_d)$ and $\omega_d = \mathrm{vol}\{x \in \mathbb{R}^d : |x| < 1\}$. By $H^\ell(\mathbb{R}^d)$ we denote the Sobolev classes on $\mathbb{R}^d$ of order $\ell > 0$. Let $\Phi$ be the Fourier transform on $\mathbb{R}^d$ defined on the Schwartz class by the formula

$$\Phi u(\xi) = (2\pi)^{-d/2} \int e^{-i\langle \xi, x \rangle} u(x)\, dx.$$

We write $\hat u := \Phi u$. In our example $H = L_2(\mathbb{R}^d)$, $d[a] = H^\ell(\mathbb{R}^d)$, $\ell \in \mathbb{N}$, $0 < \ell < d/2$ and

$$a[u, u] = \int |\xi|^{2\ell} |\hat u(\xi)|^2\, d\xi + \sum_{|\beta| \le m} \int f_\beta(x) |D^\beta u(x)|^2\, dx, \tag{29}$$

where $m \in \mathbb{N}$, $0 \le m < \ell$. As $V$ we take an operator of multiplication by a measurable function $V : \mathbb{R}^d \to \mathbb{R}$. We borrow from [2] the following assumptions:

$$f_\beta : \mathbb{R}^d \to \mathbb{R}, \quad f_\beta \in L_\infty(\mathbb{R}^d), \quad |\beta| \le m, \tag{30}$$

and

$$V \in L_p(\mathbb{R}^d), \quad p = d/2\ell > 1. \tag{31}$$
The form $a$ is semibounded from below and closed on $d[a]$ (see, for example, [2]). Obviously, $W_\pm$ are the operators of multiplication by the functions $V_\pm^{1/2}$, $2V_\pm = |V| \pm V$, and

$$v[u, u] = \int V(x) |u|^2\, dx. \tag{32}$$

The condition (31) ensures (8) (see [2]). Finally, for the operator $A(\alpha)$ generated by the form $a - \alpha v$ we have (see [16], §6)

$$\lim_{\alpha \to \infty} \alpha^{-p} N(\lambda, \alpha) = (2\pi)^{-d} \omega_d \int V_+^p\, dx. \tag{33}$$

Theorem 4.1. Let $a$, $v$ be the forms defined as in (29) and (32), and let the conditions (30) and (31) be fulfilled. Then for all $\lambda = \bar\lambda \in \rho(A)$ the following formulae hold:

$$\lim_{\alpha \to \infty} \alpha^{-p} N_+(\lambda, \alpha) = (2\pi)^{-d} \omega_d \int V_+^p\, dx,$$

$$\lim_{\alpha \to \infty} \alpha^{-p} (N_-(\lambda, \alpha) + N_0(\lambda, \alpha)) = 0.$$

Proof. It is sufficient to verify the conditions of Theorem 1.1 with $J = (2\pi)^{-d} \omega_d \int V_+^p\, dx$. As was noted, condition (31) implies (8). The asymptotics (33) means $N(\lambda, \alpha) \sim \alpha^p J$, $\alpha \to \infty$. Taking into account (13), it remains to note that (see [2])

$$\lim_{s \to 0} s^p n_+(s, X_+(\lambda)) = J, \qquad \lim_{s \to 0} s^p n_-(s, X_-(\lambda)) = 0.$$

The theorem is proved.
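The asymptotic coefficient in Theorem 4.1 is explicit, so it can be evaluated for a concrete potential. The sketch below is an illustration under assumptions: the potential is a made-up example (V = 1 on the unit ball, V = −1 on the shell 1 < |x| < 2, V = 0 elsewhere) in dimension d = 3 with ℓ = 1, and the function names are hypothetical. Only the positive part of V contributes, so the integral of V₊^p equals the volume of the unit ball.

```python
import math

def unit_ball_volume(d):
    """omega_d = vol{x in R^d : |x| < 1} = pi^(d/2) / Gamma(d/2 + 1)."""
    return math.pi ** (d / 2.0) / math.gamma(d / 2.0 + 1.0)

def weyl_coefficient(d, integral_of_v_plus_power_p):
    """J = (2 pi)^(-d) * omega_d * (integral of V_+^p over R^d)."""
    return (2.0 * math.pi) ** (-d) * unit_ball_volume(d) * integral_of_v_plus_power_p

d, ell = 3, 1
p = d / (2.0 * ell)          # p = d / (2 ell) must exceed 1, see (31)

# Assumed sample potential: V_+ is the indicator of the unit ball, so the
# integral of V_+^p is omega_3 regardless of the negative part of V.
integral_v_plus = unit_ball_volume(d)
J = weyl_coefficient(d, integral_v_plus)
```

In this example J = ω₃² / (2π)³ = 2/(9π), so Theorem 4.1 predicts N₊(λ, α) ≈ (2/(9π)) α^{3/2} for large α, independently of the choice of the regular point λ and of the negative part of V.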
Acknowledgement. It is a pleasure for the author to express his deep gratitude to Professor M. Sh. Birman for suggesting the problem. The author is also grateful to M. Sh. Birman, A. A. Laptev and T. Weidl for their attention to the work. I am grateful to the Swedish Institute for financial support which allowed me to spend the 1996–1997 academic year at the Department of Mathematics of the Royal Institute of Technology in Sweden.
References
1. Alama, S., Deift, P.A., Hempel, R.: Eigenvalue branches of the Schrödinger operator H − λW in a gap of σ(H). Commun. Math. Phys. 121, 291–321 (1989)
2. Birman, M.Sh.: Discrete spectrum in the gaps of a continuous one for perturbations with large coupling constants. Adv. in Soviet Math. 7, 57–73 (1991)
3. Birman, M.S., Solomjak, M.Z.: Spectral theory of self-adjoint operators in Hilbert space. Mathematics and Its Applications. Amsterdam: D. Reidel Publishing Company, 1987
4. Birman, M.Sh., Solomyak, M.Z.: Estimates for the number of negative eigenvalues of the Schrödinger operator and its generalizations. Adv. in Soviet Math. 7, 1–55 (1991)
5. Deift, P.A., Hempel, R.: On the existence of eigenvalues of the Schrödinger operator H − λW in a gap of σ(H). Commun. Math. Phys. 103, 461–490 (1986)
6. Gesztesy, F., Gurarie, D., Holden, H., Klaus, M., Sadun, L., Simon, B., Vogl, P.: Trapping and cascading of eigenvalues in the large coupling limit. Commun. Math. Phys. 118, 597–634 (1988)
7. Gohberg, I.C., Kreĭn, M.G.: Introduction to the theory of linear nonselfadjoint operators. Transl. of Math. Monographs 18, Providence, RI: AMS, 1969
8. Gesztesy, F., Simon, B.: On a theorem of Deift and Hempel. Commun. Math. Phys. 116, 503–505 (1988)
9. Hempel, R.: On the asymptotic distribution of the eigenvalue branches of a Schrödinger operator H − λW in a spectral gap of H. J. Reine Angew. Math. 399, 38–59 (1989)
10. Hempel, R.: Eigenvalues in gaps and decoupling by Neumann boundary conditions. J. Math. Anal. Appl. 169, 229–259 (1992)
11. Hempel, R.: On the asymptotic distribution of eigenvalues in gaps. In: Proceedings of the IMA-workshop on quasiclassical methods, May 1995. IMA Vol. in Math. and its Appl. Berlin–Heidelberg–New York: Springer-Verlag, to appear
12. Klaus, M.: On the point spectrum of Dirac operators. Helv. Phys. Acta 53, 453–462 (1980)
13. Klaus, M.: Some applications of the Birman–Schwinger principle. Helv. Phys. Acta 55, 49–68 (1980)
14. Konno, R., Kuroda, S.T.: On the finiteness of perturbed eigenvalues. J. Fac. Sci. Uni. Tokyo Sect. I, 13, 55–63 (1966)
15. Levendorskiĭ, S.Z.: Lower bound for the number of eigenvalue branches for the Schrödinger operator H − λW in a gap of H: The case of indefinite W. Commun. P.D.E. 20, 827–854 (1995)
16. Safronov, O.L.: Discrete spectrum in a gap of the continuous spectrum for variable sign perturbation with large coupling constant. St. Petersburg Math. J. 8, No. 2 (1997)
17. Simon, B.: Brownian motion, Lp properties of Schrödinger operators and the localization of binding. J. Funct. Anal. 35, 215–229 (1980)

Communicated by B. Simon
Commun. Math. Phys. 193, 245 – 268 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Integrable Evolution Equations on Associative Algebras
Peter J. Olver¹,*, Vladimir V. Sokolov²,**
¹ School of Mathematics, University of Minnesota, Minneapolis, MN 55455, USA. E-mail: [email protected], http://www.math.umn.edu/~olver
² Ufa Mathematical Institute, Russian Academy of Sciences, Chernyshevski str. 112, 450000, Ufa, Russia. E-mail: [email protected]
Received: 12 March 1997 / Accepted: 27 August 1997
Summary. This paper surveys the classification of integrable evolution equations whose field variables take values in an associative algebra, which includes matrix, Clifford, and group algebra valued systems. A variety of new examples of integrable systems possessing higher order symmetries are presented. Symmetry reductions lead to an associative algebra-valued version of the Painlevé transcendent equations. The basic theory of Hamiltonian structures for associative algebra-valued systems is developed and the biHamiltonian structures for several examples are found.
* Supported in part by NSF Grant DMS 95–00931.
** Supported in part by the Russian Foundation for Fundamental Research, Grant #96–01–00382 and INTAS Grant #93–166–EXT.

1. Introduction
This paper is devoted to the classification of certain non-commutative generalizations of classical integrable soliton equations. In the literature, integrable multi-component equations have been considered by Svinolupov, [45, 46, 47, 48], Fordy, [2, 3, 14, 15], Marchenko, [26], and many others. Svinolupov started the systematic classification of such systems, and observed that many arise from certain special types of nonassociative structures, such as Jordan algebras, Jordan triple systems, and so on. This approach leads to many new examples of integrable matrix and vector equations, [42, 44], which are becoming the focus of significant research activity, [16, 17].

Our starting point is the fact that most of the interesting examples appearing in these papers arise when the field variables take their value in an associative algebra. Particularly important cases are matrix or operator algebras, Clifford algebras, including the algebra of quaternions, and group algebras. Indeed, since the field variables can be regarded as taking values in an operator algebra, our classification approach can be considered as a constructive procedure for quantizing classical integrable systems. The associative algebra systems are the simplest, meaning that the algebraic theory and the computations are most easily understood. Svinolupov investigated integrable equations associated with special kinds of non-associative algebras, although he did not consider integrable equations arising from the simpler associative algebras in any depth.

The existence of higher order symmetries has been effectively used to classify integrable evolution equations with commutative field variables, [33, 50, 13, 35]. Higher order symmetries typically occur in infinite hierarchies, obtained by successively applying a recursion operator to a trivial “seed” symmetry. (Bakirov, [4], proposes an example of a fourth order system with a single sixth order symmetry and no higher symmetries of order ≤ 60; recently, Beukers, Sanders and Wang (personal communication) have announced a rigorous proof that Bakirov’s system has no additional higher order symmetries.) Ibragimov, Shabat, Sokolov, and Mikhailov, [18, 19, 28, 41], introduced the concept of a formal symmetry. Their method was then successfully used to classify several basic types of integrable evolutionary systems. We further refer the reader to the extensive tables in [28, 29, 30], for a summary of known examples. Due to the complexity of the computation, some of these classification tables relied on the existence of higher order conserved densities, and hence may not be complete. Indeed, as we demonstrate in our analysis of two-component systems of nonlinear Schrödinger (NLS) type, there are two integrable commutative two-component systems of NLS type which do not admit conserved densities and thus do not appear in the tables.

In this paper, we develop the symmetry classification of integrable systems arising from associative algebras in a systematic fashion. We have been able to find quite a few new and interesting examples of integrable systems using these methods.
Many of our examples do not come from Jordan algebras, and thus lie outside the class of equations considered by Svinolupov, [48]. To make the classification procedure tractable, it is important that the dependent variable is not viewed as a matrix or vector with entries that can be combined arbitrarily, but only appears using intrinsic algebra operations of multiplication, addition, scalar multiplication, and, possibly, trace and transpose. One of our main points is that the higher order symmetries of the integrable matrix equations should themselves be local matrix equations. Although this is perhaps not surprising, it is not obvious from the work of Svinolupov and Fordy. Although we are as yet unable to conclusively demonstrate the locality of all the higher order symmetries, all known examples admit at least one higher order local matrix symmetry, and we choose this property to classify the equations. A key question is how many of the known commutative examples extend to noncommutative integrable equations.

We now list some known examples of integrable equations in which the field variable takes its values in an associative algebra. (For simplicity, the reader can view $u$ as a matrix-valued function.) The simplest case is, of course, a linear equation:
$$u_t = u_n = D_x^n(u). \tag{1.1}$$
The matrix-valued Burgers’ equation comes in a right and left handed version,
$$u_t = u_{xx} + u u_x, \qquad\text{or}\qquad u_t = u_{xx} + u_x u. \tag{1.2}$$
Note that reversing the order of multiplication (or, in the matrix case, taking the transpose $u \mapsto u^*$) interchanges the two versions of Burgers’ equation. The matrix Korteweg–deVries equation
$$u_t = u_{xxx} + 3uu_x + 3u_xu = u_{xxx} + 6\{u, u_x\} \tag{1.3}$$
is invariant under transpose. Here $\{u, v\} = \tfrac12(uv + vu)$ is the standard anti-commutator between fields $u$ and $v$. There are, remarkably, two different matrix versions of the modified Korteweg–deVries equation. The first is
$$u_t = u_{xxx} + 3u^2u_x + 3u_xu^2 = u_{xxx} + 6\{u^2, u_x\}, \tag{1.4}$$
and is invariant under both $u \mapsto u^*$ and $u \mapsto -u$. The matrix Miura map
$$u = \pm v_x + v^2 \tag{1.5}$$
maps this mKdV equation (for $v$) to the matrix KdV Eq. (1.3) for $u$. The second version is
$$u_t = u_{xxx} + 3uu_{xx} - 3u_{xx}u - 6uu_xu = u_{xxx} + 3[u, u_{xx}] - 6uu_xu, \tag{1.6}$$
which is invariant under only $u \mapsto -u^*$. This equation was first described in [23], where the Lax pair formulation and the inverse scattering problem were studied. Equation (1.6) does not admit a Miura transformation. Finally, the nonlinear Schrödinger equation has a noncommutative version
$$u_t = u_{xx} + 2uvu, \qquad v_t = -v_{xx} - 2vuv, \tag{1.7}$$
which is invariant under simultaneous transposes $u \mapsto u^*$, $v \mapsto v^*$. The system (1.7) can be identified with the usual (noncommutative) form $\psi_t = i\psi_{xx} + 2\psi\bar\psi\psi$ of the nonlinear Schrödinger equation if we identify $u = \psi$, $v = \bar\psi$ and perform a complex scaling of the $x$ coordinate. Such complex transformations do not affect the algebraic integrability of the equation, but do, of course, have significant effects on its analytical properties and (real) solutions.

Not every scalar integrable system has a noncommutative counterpart. There are no purely matrix analogues of the fifth order Sawada–Kotera, [39], and Kaup, [22], equations when the right hand side of the evolution equation only involves the field variable $u$ and its derivatives, but we suspect there may be versions that also depend on the transpose $u^*$ and its derivatives. However, the complexity of the calculations has so far prevented us from finding the desired equations.

Symmetry reductions of classical soliton equations lead to ordinary differential equations of Painlevé type, meaning those whose movable singularities are only poles, [1]. Analogous reductions of our noncommutative integrable systems will therefore lead to new associative algebra-valued ordinary differential equations of Painlevé type. In particular, we exhibit analogues of some of the classical Painlevé transcendents. However, it remains to determine the types of singularities and perform a Painlevé classification of such systems in a direct fashion. We refer the reader to [38, 49], for additional applications of associative, Jordan, and other types of algebras to ordinary differential equations.

A particularly powerful approach to integrability was discovered by Magri, [25], who showed how equations possessing two distinct compatible Hamiltonian structures have a recursion operator, and corresponding infinite hierarchy of commuting biHamiltonian flows and associated conservation laws.
As in the commutative case, some of the integrable equations over associative algebras are also biHamiltonian systems. In order to understand this additional structure, we shall need to develop the theory of functionals and Hamiltonian operators over associative algebras having an additional trace
operation. The verification of the all-important Jacobi identity requires us to develop the associated theory of noncommutative functional multi-vectors, in direct analogy with the commutative version developed in [35, Chapter 7]. The biHamiltonian approach provides a simplified proof of the existence of suitable recursion operators. Remarkably, one consequence of our studies is that the first order Hamiltonian operators associated with systems of hydrodynamic type, as considered by Dubrovin and Novikov, [9, 10, 11], do not naturally generalize to local Hamiltonian operators in noncommutative variables: one is required to append certain nonlocal terms in order to satisfy the Jacobi identity. The resulting noncommutative operators have some similarities with the non-local (but commutative) Hamiltonian operators of hydrodynamic type introduced by Mokhov and Ferapontov, [12, 31, 32]. Given the beautiful and deep connections between such Hamiltonian operators and classical Riemannian geometry, this result invites an interesting speculation on the proper form of a noncommutative Riemannian geometry, [6], that will produce such nonlocal Hamiltonian operators.

We begin our paper with a review of the basic facts from the theory of associative algebras. In Sect. 3 we present classification results for one-component evolution equations, of orders 2, 3, and 5. Section 4 discusses the classification of two-component systems of nonlinear Schrödinger type. Section 5 provides a brief discussion of the theory of noncommutative Painlevé equations. The last section develops the noncommutative theory of Hamiltonian and biHamiltonian systems, with emphasis on the KdV and mKdV systems. The main classification theorems all rely on extensive computer algebra computations.
To help convince the reader of the correctness of our conclusions, we have included an appendix that outlines the computational strategies our programs employ, as well as a discussion of how the results were arrived at. The Mathematica programs for effecting such computations are available at the web site http://www.math.umn.edu/~olver.

2. Associative Algebras
In this paper, the field variables in our systems will take their values in a linear associative algebra $\mathcal{A}$. Three important examples are the algebras of $n \times n$ matrices, Clifford algebras, [37], and the group algebras appearing in the representation theory of finite-dimensional groups, [8]. Our results, though, do not depend on $\mathcal{A}$ being finite-dimensional, and so we could also view $\mathcal{A}$ as a suitable operator algebra over, say, Hilbert space. For brevity, the noncommutative multiplication $uv$ for $u, v \in \mathcal{A}$ will be just denoted by juxtaposition. We use the notation
$$L_u(v) = uv, \qquad R_u(v) = vu,$$
for the operators of left and right multiplication. The commutator and anti-commutator on $\mathcal{A}$ are denoted by the standard bracket notations
$$[u, v] = uv - vu, \qquad \{u, v\} = \tfrac12(uv + vu). \tag{2.1}$$
We also introduce the notation
$$C_u(v) = [u, v], \qquad A_u(v) = 2\{u, v\}, \tag{2.2}$$
so that
$$C_u = L_u - R_u, \qquad A_u = L_u + R_u. \tag{2.3}$$
Note that the anti-commutator defines a Jordan algebra structure on $\mathcal{A}$, [21]. Finally, for notational convenience, we introduce the “triple anti-commutator”
$$\{u, v, w\} = \tfrac12(uvw + wvu). \tag{2.4}$$
Note that $\{u, v\} = \{u, e, v\}$, where $e$ is the identity element of $\mathcal{A}$. Each of our associative algebras comes with an involution $u \mapsto u^*$, called the transpose, that interchanges the operations of left and right multiplication, so $(uv)^* = v^*u^*$. In certain instances, we will also require the existence of a trace form on the algebra, by which we mean a scalar-valued invariant bilinear form $\operatorname{tr}\colon \mathcal{A} \times \mathcal{A} \to \mathbb{R}$, that satisfies
$$\operatorname{tr} uv = \operatorname{tr} vu, \qquad\text{so that}\qquad \operatorname{tr}\,[u, v] = 0. \tag{2.5}$$

The simplest example of an associative algebra is, as mentioned above, the algebra of real $n \times n$ matrices, which we denote by $\mathfrak{gl}(n, \mathbb{R})$. More generally, if $\mathcal{A}$ is any associative algebra, then the space $\mathfrak{gl}(n, \mathcal{A})$ of $n \times n$ matrices whose entries belong to $\mathcal{A}$ also forms an associative algebra. Since $\mathfrak{gl}(m, \mathfrak{gl}(n, \mathcal{A})) \simeq \mathfrak{gl}(mn, \mathcal{A})$, iterating this procedure does not produce anything new. Similarly, one can form “vector” associative algebras by taking the direct sum of two or more associative algebras, e.g. $\mathcal{A} \oplus \mathcal{B}$, and using the component-wise multiplication $(a, b) \cdot (c, d) = (ac, bd)$.

Clifford algebras form a second important class of associative algebras, cf. [7, 37]. The simplest non-commutative Clifford algebra is the algebra $\mathbb{H}$ of quaternions. We write $u = v + w$, where $v \in \mathbb{R}$ is the scalar and $w \in \mathbb{R}^3$ the vector part of $u$. Note that the associative quaternion multiplication is
$$u\hat{u} = (v\hat{v} - w \cdot \hat{w}) + v\hat{w} + \hat{v}w + w \times \hat{w}, \tag{2.6}$$
where $\hat{u} = \hat{v} + \hat{w}$ is a second quaternion, $\cdot$ is the ordinary dot product and $\times$ the ordinary cross product in three-dimensional space. The general classification of Clifford algebras can be found, for instance, in [37].

Theorem 2.1. Every Clifford algebra over $\mathbb{R}$ is isomorphic to one of the following matrix algebras:
$$\mathfrak{gl}(2^k, \mathbb{R}), \quad \mathfrak{gl}(2^k, \mathbb{C}), \quad \mathfrak{gl}(2^k, \mathbb{H}), \quad \mathfrak{gl}(2^k, \mathbb{R} \oplus \mathbb{R}), \quad \mathfrak{gl}(2^k, \mathbb{H} \oplus \mathbb{H}). \tag{2.7}$$

Therefore, every Clifford algebra is constructed as a matrix algebra (of a particular size) over the basic algebras $\mathbb{R}$, $\mathbb{C}$, $\mathbb{H}$, and the two vector algebras $\mathbb{R} \oplus \mathbb{R}$, and $\mathbb{H} \oplus \mathbb{H}$.

If $G$ is a finite group containing $r$ elements, then the (real) group algebra $\mathcal{A}_G = \mathbb{R}[G]$ is an $r$-dimensional vector space, whose basis elements are identified with the elements of $G$. Thus, the elements of $\mathbb{R}[G]$ have the form $u = \sum_{g \in G} u^g \cdot g$, with coefficients $u^g \in \mathbb{R}$ indexed by the group elements. The group algebra multiplication is induced from that of the group itself. Thus, if $w = uv$, then
$$w^g = \sum_{h \in G} u^h\, v^{h^{-1}g}, \tag{2.8}$$
where $h^{-1}g$ denotes the group product, for $h, g \in G$. Note that the group algebra is commutative if and only if $G$ is abelian. Thus, the simplest noncommutative group algebra is the six-dimensional algebra $\mathbb{R}[S_3]$ associated with the symmetric group on three letters. See [8] for details and applications to representation theory.
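The quaternion product rule (2.6) is easy to check numerically. The following is a minimal Python sketch, not taken from the paper: the function names `dot`, `cross` and `qmul`, and the encoding of a quaternion as a pair (scalar, 3-vector), are illustrative assumptions.

```python
def dot(a, b):
    """Ordinary dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Ordinary cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def qmul(p, q):
    """Quaternion product following (2.6); p = (v, w), q = (vh, wh)."""
    v, w = p
    vh, wh = q
    c = cross(w, wh)
    scalar = v * vh - dot(w, wh)
    vector = tuple(v * wh[n] + vh * w[n] + c[n] for n in range(3))
    return (scalar, vector)

# The three imaginary units, written as pure-vector quaternions.
i = (0, (1, 0, 0))
j = (0, (0, 1, 0))
k = (0, (0, 0, 1))

ij = qmul(i, j)   # equals k
ji = qmul(j, i)   # equals -k: the product is noncommutative

# Associativity on three integer quaternions chosen arbitrarily.
a = (1, (2, -1, 3))
b = (-2, (0, 4, 1))
c = (3, (1, 1, -5))
assoc_ok = qmul(qmul(a, b), c) == qmul(a, qmul(b, c))
```

Integer entries are used so that the comparisons are exact rather than subject to floating-point rounding.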
3. Classification of One-Component Evolution Equations
Let $\mathcal{A}$ be an associative algebra. We can use the linear structure on $\mathcal{A}$ to differentiate maps $u\colon \mathbb{R} \to \mathcal{A}$, leading to an algebra of associative, but not necessarily commutative differential polynomials, which we denote by $\mathcal{D} = \mathcal{D}[x, u]$. We use subscript notation for derivatives, so that $u_x$, $u_{xx}$, etc. denote the first, second, etc. derivatives of $u = u(x)$ with respect to its scalar argument $x$. For simplicity, we only consider $x$-independent differential polynomials in this paper, although the general methods readily extend to differential polynomials that have $x$-dependent coefficients. A one-component¹ $\mathcal{A}$-valued evolution equation will then be of the general form
$$u_t = K[u], \qquad\text{where}\qquad K \in \mathcal{D}. \tag{3.1}$$
In the following section, we consider multi-component equations. Each evolution equation describes the flow associated with an evolutionary vector field
$$\mathbf{v}_K = \sum_{n=0}^{\infty} D_x^n K \,\frac{\partial}{\partial u_n}, \tag{3.2}$$
where $u_n = D_x^n u$, which acts on the space of differential polynomials, cf. [35]. Two such equations are called equivalent if they can be mapped to each other by a scaling transformation
$$x \longmapsto \lambda x, \qquad t \longmapsto \mu t, \qquad u \longmapsto \nu u, \qquad \lambda\mu\nu \neq 0. \tag{3.3}$$
The evolution Eq. (3.1) is said to be integrable if it admits a higher order differential polynomial symmetry, again described by an evolution equation
$$u_t = S[u], \qquad\text{where}\qquad S \in \mathcal{D}. \tag{3.4}$$
The symmetry condition that the two evolutionary flows (3.1), (3.4) commute implies that the two evolutionary vector fields commute: $[\mathbf{v}_K, \mathbf{v}_S] = 0$. Define the Fréchet derivative of the differential polynomial $K$ by adapting the basic formula
$$D_K[w] = \frac{d}{d\varepsilon}\, K[u + \varepsilon w]\,\Big|_{\varepsilon = 0}, \tag{3.5}$$
from the commutative case, cf. [35]. Since $\mathbf{v}_K[S] = D_S[K]$, the commutativity of the flows (3.1), (3.4), can then be written in the usual form
$$D_K[S] - D_S[K] = 0. \tag{3.6}$$
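The symmetry condition (3.6) can be tested mechanically on small examples. The sketch below is an illustration in plain Python, not the authors' Mathematica package; the dictionary encoding and the names `frechet` and `commutator` are assumptions made here. A noncommutative differential polynomial is stored as a map from words to integer coefficients, where a word is a tuple of derivative orders read left to right, so (0, 1) stands for $uu_x$ and (1, 0) for $u_xu$. The example checks that the fifth order flow stated later in this section as (3.9), with its anti-commutators expanded, commutes with the KdV equation (1.3).

```python
def clean(p):
    """Drop words whose coefficient has cancelled to zero."""
    return {w: c for w, c in p.items() if c != 0}

def add(p, q, sign=1):
    r = dict(p)
    for w, c in q.items():
        r[w] = r.get(w, 0) + sign * c
    return clean(r)

def mul(p, q):
    """Noncommutative product: words are concatenated in order."""
    r = {}
    for w1, c1 in p.items():
        for w2, c2 in q.items():
            r[w1 + w2] = r.get(w1 + w2, 0) + c1 * c2
    return clean(r)

def dx(p):
    """Total x-derivative, using the Leibniz rule on each word."""
    r = {}
    for w, c in p.items():
        for i in range(len(w)):
            w2 = w[:i] + (w[i] + 1,) + w[i + 1:]
            r[w2] = r.get(w2, 0) + c
    return clean(r)

def frechet(P, Q):
    """D_P[Q] as in (3.5): replace each factor u_n of P, in turn, by D_x^n Q."""
    r = {}
    for w, c in P.items():
        for i, n in enumerate(w):
            dq = Q
            for _ in range(n):
                dq = dx(dq)
            r = add(r, mul(mul({w[:i]: c}, dq), {w[i + 1:]: 1}))
    return r

def commutator(K, S):
    """Left-hand side of the symmetry condition (3.6)."""
    return add(frechet(K, S), frechet(S, K), sign=-1)

# KdV (1.3): u_t = u_xxx + 3 u u_x + 3 u_x u
K = {(3,): 1, (0, 1): 3, (1, 0): 3}
# Fifth order flow (3.9) with the anti-commutators written out
S = {(5,): 1, (0, 3): 5, (3, 0): 5, (1, 2): 10, (2, 1): 10,
     (0, 0, 1): 10, (0, 1, 0): 10, (1, 0, 0): 10}
# A one-sided variant u_t = u_xxx + 6 u u_x, used only for contrast
K_one_sided = {(3,): 1, (0, 1): 6}

is_symmetry = commutator(K, S) == {}
one_sided_fails = commutator(K_one_sided, S) != {}
```

The one-sided variant has the same commutative limit as (1.3), yet it does not commute with the flow S, which shows that the ordering of the factors matters.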
Given an evolution Eq. (3.1), the determination of all symmetries (3.4) of a given order $m$ is a straightforward, but computationally intensive, calculation. We have implemented a Mathematica package which will automatically do these computations. Even so, due to limited time and memory resources, only a limited number of such computations can be completed.

¹ In the commutative case, a one-component equation is a scalar equation, but the use of the term “scalar” here would be confusing.

In all interesting examples, the right-hand side of an integrable evolution equation (3.1) turns out to be a homogeneous differential polynomial with respect to some weighting of its constituent monomials. We introduce a weighting scheme on the algebra $\mathcal{D}$ by assigning a weight $m = \deg u$ to the dependent variable and $n = \deg x$ to the independent variable, so that the $k$th order derivative of $u$ with respect to $x$ has weight $m + kn$. We shall, without loss of generality, assume that $\deg x = 1$. The weight of a monomial is the sum of the weights of its multiplicands. We let $\mathcal{D}^{(n)}$ denote the space of differential polynomials all of whose monomials have weight $n$. We set $\deg K = n$ if $K \in \mathcal{D}^{(n)}$. For example, the KdV weighting sets
$$\deg u = 2, \qquad \deg x = 1. \tag{3.7}$$
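Under a given weighting, the noncommutative monomials of a fixed weight can be listed mechanically. A minimal Python sketch follows; the function name `monomials` and the tuple encoding of words are illustrative assumptions. A word is a tuple of derivative orders, so (0, 1) stands for $uu_x$ and (3,) for $u_{xxx}$.

```python
def monomials(weight, deg_u=2, deg_x=1):
    """All words whose total weight equals `weight`.

    The factor u_k (k-th x-derivative of u) has weight deg_u + k*deg_x;
    both deg_u and deg_x are assumed to be positive integers.
    """
    if weight == 0:
        return [()]            # the empty word
    words = []
    # k runs over every derivative order whose weight still fits;
    # the range is empty when even u itself is too heavy.
    for k in range((weight - deg_u) // deg_x + 1):
        rest = weight - (deg_u + k * deg_x)
        words += [(k,) + w for w in monomials(rest, deg_u, deg_x)]
    return words

weight5 = sorted(monomials(5))   # [(0, 1), (1, 0), (3,)]
weight7 = monomials(7)           # eight words
```

With the KdV weighting (3.7) this gives three words of weight 5 and eight words of weight 7.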
In the noncommutative case, there are three monomials of weight 5, namely $u_{xxx}$, $uu_x$ and $u_xu$. Every differential polynomial $P \in \mathcal{D}^{(5)}$ of weight 5 is thus a linear combination of these three monomials. Clearly, if (3.1) is a homogeneous evolution equation, then the homogeneous summands of any symmetry are automatically symmetries, and hence we can, without loss of generality, restrict our attention to symmetries $S \in \mathcal{D}^{(m)}$ that are homogeneous in the prescribed weighting.

Definition 3.1. Under a given weighting scheme, the weight of an evolution Eq. (3.1) is equal to the difference $\deg K - \deg u$.

For example, under the KdV weighting (3.7), the most general evolution equation of weight 3 has the form
$$u_t = a\,u_{xxx} + b\,uu_x + c\,u_xu, \tag{3.8}$$
where $a, b, c \in \mathbb{R}$. The reason for the definition of weighting is so that, when $x$ has degree 1, the linear terms in an evolution equation of weight $k$ are of order $k$, and so, in most cases, the weight of the evolution equation equals its order. More specifically:

Definition 3.2. A homogeneous evolution equation is nondegenerate if it contains a linear term.

Lemma 3.3. If $\deg x = 1$, then a nondegenerate homogeneous evolution equation of weight $k$ is a $k$th order differential equation.

As a consequence of Definition 3.1, the commutator equation $u_t = D_K[S] - D_S[K]$ of a weight $k$ equation and weight $l$ symmetry has weight $k + l$. (However, in the one component case, it is never nondegenerate!)

We now summarize the results obtained to date through the symmetry classification of nondegenerate homogeneous polynomial evolution Eqs. (3.1). The following results are the consequences of extensive computer algebra computations using our Mathematica package. See the Appendix for details of the proofs of the following classification theorems.

Theorem 3.4. For the KdV weighting (3.7), every nondegenerate polynomial equation of weight 3 over a noncommutative associative algebra having a weight 5 symmetry is either linear or equivalent to the Korteweg–deVries Eq. (1.3).

For example, the fifth order symmetry of the Korteweg–deVries Eq. (1.3) is
$$u_t = u_{xxxxx} + 10\{u, u_{xxx}\} + 20\{u_x, u_{xx}\} + 20\{u^2, u_x\} + 10\,uu_xu. \tag{3.9}$$
Note that, in the commutative case, this reduces to the standard fifth order symmetry of the ordinary Korteweg–deVries equation, [35]. In the commutative case, there are three different fifth order integrable polynomial equations, [28, 24]; besides the fifth order KdV equation, these are the Sawada–Kotera
equation, [39], and the Kaup–Kupershmidt equation, [22]. We have shown that neither of these has an associative counterpart where the right hand side is a differential polynomial in $u$ alone, although we believe that there do exist integrable versions involving $u$ and its transpose $u^*$.

Theorem 3.5. For the KdV weighting, every nondegenerate polynomial equation of weight 5 over a noncommutative associative algebra having a weight 7 symmetry is either linear or equivalent to the fifth order Korteweg–deVries Eq. (3.9).

Using the triple bracket notation (2.4), we can write the seventh order symmetry of the fifth order Korteweg–deVries Eq. (1.3) in the form
$$\begin{gathered} u_t = u_{xxxxxxx} + 14\{u, u_{xxxxx}\} + 42\{u_x, u_{xxxx}\} + 70\{u_{xx}, u_{xxx}\} + 42\{u^2, u_{xxx}\} + 28\,uu_{xxx}u \\ {} + 98\{u, u_x, u_{xx}\} + 112\{u, u_{xx}, u_x\} + 70\{u_x, u, u_{xx}\} + 70\,u_x^3 + 70\{u^3, u_x\} + 70\{u^2, u_x, u\}. \end{gathered} \tag{3.10}$$

The second interesting weighting is that associated with both Burgers’ equation and the modified Korteweg–deVries (mKdV) equation, which assigns equal weighting to $x$ and $u$:
$$\deg u = 1, \qquad \deg x = 1. \tag{3.11}$$

Theorem 3.6. For the equal weighting (3.11), every nondegenerate polynomial equation of weight 2 over a noncommutative associative algebra having a weight 3 symmetry is either linear or equivalent to either the right or left handed Burgers’ Eq. (1.2). Every polynomial equation of weight 3 having a weight 5 symmetry is either linear or equivalent to one of the following five equations:
$$\begin{gathered} u_t = u_{xxx} + 6\{u^2, u_x\}, \\ u_t = u_{xxx} + 3[u, u_{xx}] - 6\,uu_xu, \\ u_t = u_{xxx} + 3u_x^2, \\ u_t = u_{xxx} + 3uu_{xx} + 3u_x^2 + 3u^2u_x, \\ u_t = u_{xxx} + 3u_{xx}u + 3u_x^2 + 3u_xu^2. \end{gathered} \tag{3.12}$$

The first and second of these integrable equations are the two versions of the mKdV equation discussed above. The third is the potential KdV equation. The fourth and fifth are the third order symmetries of the right and left handed Burgers’ Eqs. (1.2), respectively; these systems also admit fourth order symmetries.

Each of the preceding one-component noncommutative integrable equations can be made into an integrable system by identifying the dependent variable as a matrix-valued function $u(x, t) = \bigl(u^\alpha_\beta(x, t)\bigr)$. A second interesting class of systems comes from the case when $u = v + w \in \mathbb{H}$ takes its values in the quaternions, where $v \in \mathbb{R}$, $w \in \mathbb{R}^3$, as in (2.6). The quaternion versions of our integrable equations are as follows:

(1) Quaternion Burgers’ equation:
$$v_t = v_{xx} + vv_x - w \cdot w_x, \qquad w_t = w_{xx} + (vw)_x + w \times w_x. \tag{3.13}$$
(2) Quaternion Korteweg–deVries equation:
$$v_t = v_{xxx} + 3\bigl(v^2 - |w|^2\bigr)_x, \qquad w_t = w_{xxx} + 6(vw)_x. \tag{3.14}$$
This coincides with the well-known vector KdV equation given in [42].

(3) The first quaternion modified Korteweg–deVries equation is
$$v_t = v_{xxx} + \bigl(2v^3 - 6v|w|^2\bigr)_x, \qquad w_t = w_{xxx} + 6(v^2w)_x - 6|w|^2w_x. \tag{3.15}$$
This is also a well-known vector mKdV equation.

(4) The second quaternion mKdV is more interesting:
$$v_t = v_{xxx} + \bigl(6|w|^2v - 2v^3\bigr)_x, \qquad w_t = w_{xxx} + 6\bigl[\,w \times w_x - (v^2 + |w|^2)\,w\,\bigr]_x + 18|w|^2w_x. \tag{3.16}$$
Note that the reduction $v = 0$ leads to the three-dimensional vector equation
$$w_t = w_{xxx} + 6\,w \times w_{xx} - 6(w \cdot w_x)\,w + 6|w|^2w_x.$$

Finally, we can also consider cases in which the field variable $u$ takes its values in a group algebra $\mathbb{R}[G]$. The multiplication rule (2.8) allows us to write out the component form of the equation. For example, the Korteweg–deVries Eq. (1.3) on a group algebra has the component form
$$u^g_t = u^g_{xxx} + 3\sum_{h \in G}\bigl[u^{h^{-1}g} + u^{gh^{-1}}\bigr]\,u^h_x, \qquad g \in G. \tag{3.17}$$
The indices in (3.17) are governed by the group multiplication. The G-KdV system (3.17) is integrable for any finite group $G$. In particular, if $G$ is abelian, then the two summands in (3.17) coincide.

Because of the variety of isomorphisms between matrix, Clifford, and group algebras, our three classes of associative algebras very often lead to different forms of the same equation. For example, in view of Theorem 2.1, the associative equations on higher dimensional Clifford algebras are straightforward matrix versions of either the real, the complex, or the quaternionic equations. The vector integrable systems in [42] arise from Clifford algebras. The evolution equations associated with the last two matrix Clifford algebras $\mathbb{R} \oplus \mathbb{R}$ and $\mathbb{H} \oplus \mathbb{H}$ automatically decouple into two independent evolution equations associated with the real or quaternionic matrix algebra, and are thus not particularly interesting.
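The index bookkeeping in (3.17) can be checked against the multiplication rule (2.8) on the smallest noncommutative case, $G = S_3$. The following Python sketch is only an illustration under stated assumptions: permutations are encoded as tuples, and the helper names `compose`, `inverse`, `gmul` and `kdv_sum` are hypothetical. It verifies that the sum in (3.17) is the $g$-component of $uu_x + u_xu$.

```python
from itertools import permutations

G = list(permutations(range(3)))      # the six elements of S_3

def compose(g, h):
    """Group product gh of two permutations (apply h first, then g)."""
    return tuple(g[h[n]] for n in range(3))

def inverse(g):
    inv = [0] * 3
    for n, gn in enumerate(g):
        inv[gn] = n
    return tuple(inv)

def gmul(u, v):
    """Group algebra product (2.8): w^g = sum over h of u^h v^(h^-1 g)."""
    return {g: sum(u[h] * v[compose(inverse(h), g)] for h in G) for g in G}

def kdv_sum(u, ux):
    """The sum over h in (3.17), without the overall factor 3."""
    return {g: sum((u[compose(inverse(h), g)] + u[compose(g, inverse(h))]) * ux[h]
                   for h in G) for g in G}

# Arbitrary integer coefficients standing in for u and u_x at one point x.
u = {g: n + 1 for n, g in enumerate(G)}
ux = {g: (n * n) % 5 - 2 for n, g in enumerate(G)}

uux = gmul(u, ux)
uxu = gmul(ux, u)
anticomm = {g: uux[g] + uxu[g] for g in G}
matches = kdv_sum(u, ux) == anticomm

# Two distinct transpositions do not commute, so R[S_3] is noncommutative.
ea = {g: int(g == (1, 0, 2)) for g in G}
eb = {g: int(g == (0, 2, 1)) for g in G}
noncommutative = gmul(ea, eb) != gmul(eb, ea)
```

The same check works for any finite group once `G`, `compose` and `inverse` are replaced accordingly.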
4. Integrable Two-Component Systems on Associative Algebras
The classification of associative algebra valued systems having two or more field variables is even less well developed. We shall only consider second order systems of the form
$$\mathbf{u}_t = A\,\mathbf{u}_{xx} + F(\mathbf{u}, \mathbf{u}_x), \tag{4.1}$$
where $\mathbf{u} = (u^1, \ldots, u^n)$ and $A$ is a constant $n \times n$ matrix. Note that, by a (complex) linear transformation $\mathbf{u} \mapsto B\mathbf{u}$, we can place $A$ into Jordan canonical form. Moreover, scaling $x$ will allow us to make at least one of the diagonal entries of $A$ equal to 1. Mikhailov, Shabat and Yamilov, [29, 30], investigated the two component systems, $n = 2$, in detail, and used the existence of higher order conservation laws to characterize integrability. Here we continue their work in both the commutative and noncommutative cases. For simplicity, we assume that the linear part of the system (4.1) is prescribed by the simple diagonal matrix $A = \operatorname{diag}(1, -1)$, leaving other types of two component systems to future investigations. We will consider a second order system to be integrable if it possesses a fourth order symmetry. All known examples of integrable second order systems satisfy this condition.

Definition 4.1. A two component system is called triangular if it decouples into the form
$$u_t = F(u, u_x, u_{xx}), \qquad v_t = G(u, v, u_x, v_x, u_{xx}, v_{xx}). \tag{4.2}$$

Note that a triangular system can be solved by first solving the one-component equation for $u$ and then using that to reduce the $v$ equation to a second $x, t$ dependent one-component equation. In our classification procedure, for brevity we have chosen not to consider systems which are equivalent, under a point transformation, to triangular systems, so we can concentrate on “genuinely” two-component systems.

The first class of systems are the noncommutative versions of the nonlinear Schrödinger equation, which corresponds to the equal weighting
$$\deg u = 1, \qquad \deg v = 1, \qquad \deg x = 1. \tag{4.3}$$
The general form of such a second order system in the commutative case is
$$\begin{gathered} u_t = u_{xx} + a_1uu_x + a_2vu_x + a_3uv_x + a_4vv_x + b_1u^3 + b_2u^2v + b_3uv^2 + b_4v^3, \\ v_t = -v_{xx} - c_4uu_x - c_3vu_x - c_2uv_x - c_1vv_x - d_4u^3 - d_3u^2v - d_2uv^2 - d_1v^3. \end{gathered} \tag{4.4}$$
The corresponding fourth order symmetry has the form
$$\begin{gathered} u_\tau = u_{xxxx} + f(u, v, u_x, v_x, u_{xx}, v_{xx}, u_{xxx}, v_{xxx}), \\ v_\tau = -v_{xxxx} + g(u, v, u_x, v_x, u_{xx}, v_{xx}, u_{xxx}, v_{xxx}). \end{gathered} \tag{4.5}$$
All known integrable commutative Eqs. (4.4) have a polynomial symmetry of the form (4.5). The following theorem extends the classification of Mikhailov, Shabat and Yamilov.

Theorem 4.2. Up to scaling of $t, x, u, v$, and the discrete transformation
$$u \longleftrightarrow v, \qquad t \longleftrightarrow -t, \tag{4.6}$$
every nonlinear, commutative nontriangular system of the form (4.4), possessing a polynomial symmetry of the form (4.5) is equivalent to one of the following seven integrable systems:
$$u_t = u_{xx} + 2u^2v, \qquad v_t = -v_{xx} - 2v^2u, \tag{4.7}$$
$$u_t = u_{xx} + 2vv_x, \qquad v_t = -v_{xx} + 2uu_x, \tag{4.8}$$
$$u_t = u_{xx} + 2uu_x + 2vu_x, \qquad v_t = -v_{xx} + 2uv_x + 2vv_x, \tag{4.9}$$
$$u_t = u_{xx} + 2uu_x + 2vu_x + 2uv_x, \qquad v_t = -v_{xx} + 2vu_x + 2uv_x + 2vv_x, \tag{4.10}$$
$$u_t = u_{xx} + 2uv_x + 2vv_x - \tfrac23(u + v)^3, \qquad v_t = -v_{xx} + 2uu_x + 2vu_x + \tfrac23(u + v)^3, \tag{4.11}$$
$$\begin{gathered} u_t = u_{xx} + 2uu_x - 2vu_x - 2uv_x - 2u^2v + 2uv^2, \\ v_t = -v_{xx} - 2vu_x - 2uv_x + 2vv_x - 2u^2v + 2uv^2, \end{gathered} \tag{4.12}$$
$$u_t = u_{xx} + 2uu_x + 2vv_x, \qquad v_t = -v_{xx} - vu_x - uv_x - \tfrac34u^2v - \tfrac12v^3. \tag{4.13}$$

The last two systems do not appear in the lists of Mikhailov, Shabat and Yamilov, [29, 30], because they do not have higher order conserved densities. System (4.12) was first obtained by A.E. Borovik, V.Yu. Popkov and V.N. Robuk, cf. [40, (4.5)], and can be linearized by a differential substitution. System (4.13) appears to be new.

In the noncommutative case, we are classifying systems of the general form
$$\begin{gathered} u_t = u_{xx} + a_1uu_x + a_2uv_x + a_3vu_x + a_4vv_x + a_5u_xu + a_6u_xv + a_7v_xu + a_8v_xv \\ {} + b_1u^3 + b_2u^2v + b_3uvu + b_4uv^2 + b_5vu^2 + b_6vuv + b_7v^2u + b_8v^3, \\ v_t = -v_{xx} - c_1vv_x - c_2vu_x - c_3uv_x - c_4uu_x - c_5v_xv - c_6v_xu - c_7u_xv - c_8u_xu \\ {} - d_1v^3 - d_2v^2u - d_3vuv - d_4vu^2 - d_5uv^2 - d_6uvu - d_7u^2v - d_8u^3 \end{gathered} \tag{4.14}$$
up to equivalence, where we allow scaling, the discrete transformation (4.6), as well as the formal involution $(u, v) \mapsto (u^*, v^*)$ that interchanges the order of multiplication. The complete list of the nonlinear, nontriangular noncommutative systems up to equivalence is as follows. Interestingly, each of the first six commutative nontriangular integrable systems has either one or two noncommutative counterparts; however, the final new system (4.13) does not generalize to the noncommutative regime. These integrable noncommutative systems naturally split into three classes. The first group contains the six systems
$$u_t = u_{xx} + 2uvu, \qquad v_t = -v_{xx} - 2vuv, \tag{4.15}$$
$$\begin{gathered} u_t = u_{xx} + 2uu_x - 2uv_x + 2vu_x + 2v_xu - 2u^2v + 4uvu - 2vu^2, \\ v_t = -v_{xx} + 2vu_x - 2u_xv + 2v_xu + 2v_xv + 2uv^2 - 4vuv + 2v^2u, \end{gathered} \tag{4.16}$$
$$u_t = u_{xx} + 2uu_x + 2vu_x, \qquad v_t = -v_{xx} + 2v_xu + 2v_xv, \tag{4.17}$$
$$\begin{gathered} u_t = u_{xx} + 2uu_x + 2vu_x + 2v_xu + 2uvu - 2vu^2, \\ v_t = -v_{xx} + 2vu_x + 2v_xu + 2v_xv - 2vuv + 2v^2u, \end{gathered} \tag{4.18}$$
$$\begin{gathered} u_t = u_{xx} + 2uu_x + 2uv_x + 2u_xv + 2u^2v - 2uvu, \\ v_t = -v_{xx} + 2vu_x + 2vv_x + 2v_xu + 2vuv - 2v^2u, \end{gathered} \tag{4.19}$$
$$\begin{gathered} u_t = u_{xx} + 2uu_x - 2vu_x - 2v_xu - 2uvu + 2v^2u, \\ v_t = -v_{xx} - 2uv_x + 2vv_x - 2u_xv - 2u^2v + 2vuv. \end{gathered} \tag{4.20}$$
The commutative versions of these are the following: (4.15) is a noncommutative generalization of (4.7); both (4.16) and (4.17) reduce to (4.9); (4.18) and (4.19) reduce to (4.10), while (4.20) reduces to (4.12). In several cases, one must perform a rescaling of the form
$$x \longmapsto \lambda x, \qquad u \longmapsto \lambda^2u, \qquad v \longmapsto \lambda^2v, \tag{4.21}$$
in order to make the precise identification.

The second group involves a primitive cube root of unity $\epsilon$, where $\epsilon^2 + \epsilon + 1 = 0$. We can set
$$\epsilon = -\frac12 + \frac{i\sqrt3}{2} = \exp\frac{2\pi i}{3}$$
without loss of generality. The following four systems lie in this class:
ut = uxx − 2vvx + 2vx v − 2(1 + )u2 v + 2uvu + 2vu2 (4.22) vt = −vxx + 2(1 + )uux + 2ux u + 2(1 + )uv 2 − 2vuv − 2v 2 u u = uxx + 2uux + 2uvx − 2vux + 2(1 + )vvx − 2ux u + 2ux v − 2vx u + 2vx v t +2(1 − )u2 v − 6uvu + 2(1 + 2)uv 2 + 2(2 + )vu2 − 2(1 + 2)v 2 u, (4.23) x + 2vux + 2vvx − 2(1 + )ux u − 2ux v + 2vx u vt = −vxx − 2uux − 2uv −2vx v − 2(1 + 2)u2 v − 2(1 − )uv 2 + 2(1 + 2)vu2 + 6vuv − 2(2 + )v 2 u, u = u + 2uu + 2uv + 2vu + 2(1 + )vv − 2u u − 2u v t xx x x x x x x +2(1 + )vx u + 2vx v + 2u3 + 2(1 + 2)u2 v + 6uvu +4(1 + )uv 2 − 2(1 + 2)vu2 + 2vuv − 4v 2 u + 2v 3 , (4.24) vt = −vxx + 2uux − 2uvx + 2(1 + )vux − 2vvx + 2(1 + )ux u 3 2 +2ux v + 2vx u + 2vx v − 2u − 4(1 + )u v − 2uvu −2(1 + 2)uv 2 + 4vu2 − 6vuv + 2(1 + 2)v 2 u − 2v 3 , x + 2vx u + 2vx v ut = uxx 3+ 2(1 2+ )uvx + 2(1 + )vv +2u + 2u v + 2uvu + 2uv 2 + 2vu2 + 2vuv + 2v 2 u + 2v 3 , (4.25) x + 2vux + 2(1 + )ux u + 2(1 + )ux v vt = −vxx3 + 2uu 2 2 2 2 3 −2u − 2u v − 2uvu − 2uv − 2vu − 2vuv − 2v u − 2v .
Again, under a suitable rescaling (4.21), both (4.22) and (4.23) reduce to (4.8) in the commutative case, while (4.24) and (4.25) reduce to (4.11). The final noncommutative nontriangular equation is
$$\begin{aligned} u_t &= u_{xx} + 2uu_x - 2uv_x + 2vu_x - 2u_x v + 2v_x u - 2u^2 v + 4uvu + 2uv^2 - 2vu^2 - 2vuv, \\ v_t &= -v_{xx} + 2vu_x + 2v_x u + 2v_x v - 2vuv + 2v^2 u. \end{aligned} \tag{4.26}$$
This case reduces to a commutative triangular system
$$u_t = u_{xx} + 2uu_x, \qquad v_t = -v_{xx} + 2vu_x + 2v_x u + 2v_x v.$$
Interestingly, most of these integrable systems are not associated with a Jordan algebra, and thus lie outside the class of equations considered by Svinolupov, [47, 48].

Theorem 4.3. Every noncommutative associative algebra valued, two-component nontriangular second order system of nonlinear Schrödinger form (4.14) possessing a fourth order symmetry is equivalent to one of the systems (4.15–4.26).
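The commutative reductions quoted above can be checked mechanically by forgetting the order of the factors in each term. The following is a minimal sketch of such a check; the encoding of terms as (coefficient, word) pairs and the names `abelianize`, `u_426`, etc. are illustrative assumptions, not part of the computer packages used for the classification.

```python
from collections import defaultdict


def abelianize(terms):
    """Collapse a noncommutative polynomial to its commutative image.

    `terms` is a list of (coefficient, word) pairs; a word is a
    space-separated product of letters, e.g. "u vx u" for u v_x u.
    """
    out = defaultdict(int)
    for coeff, word in terms:
        key = tuple(sorted(word.split()))  # forget the order of factors
        out[key] += coeff
    return {k: c for k, c in out.items() if c != 0}


# Right-hand sides of (4.26); "ux" stands for u_x, "vxx" for v_xx, etc.
u_426 = [(1, "uxx"), (2, "u ux"), (-2, "u vx"), (2, "v ux"), (-2, "ux v"),
         (2, "vx u"), (-2, "u u v"), (4, "u v u"), (2, "u v v"),
         (-2, "v u u"), (-2, "v u v")]
v_426 = [(-1, "vxx"), (2, "v ux"), (2, "vx u"), (2, "vx v"),
         (-2, "v u v"), (2, "v v u")]

# The commutative triangular system stated after (4.26).
u_tri = [(1, "uxx"), (2, "u ux")]
v_tri = [(-1, "vxx"), (2, "v ux"), (2, "vx u"), (2, "vx v")]

# Right-hand sides of (4.18) and (4.19).
u_418 = [(1, "uxx"), (2, "u ux"), (2, "v ux"), (2, "vx u"),
         (2, "u v u"), (-2, "v u u")]
v_418 = [(-1, "vxx"), (2, "v ux"), (2, "vx u"), (2, "vx v"),
         (-2, "v u v"), (2, "v v u")]
u_419 = [(1, "uxx"), (2, "u ux"), (2, "u vx"), (2, "ux v"),
         (2, "u u v"), (-2, "u v u")]
v_419 = [(-1, "vxx"), (2, "v ux"), (2, "v vx"), (2, "vx u"),
         (2, "v u v"), (-2, "v v u")]

# (4.26) collapses to the triangular system.
print(abelianize(u_426) == abelianize(u_tri))
print(abelianize(v_426) == abelianize(v_tri))
# (4.18) and (4.19) share one commutative limit, with no rescaling needed.
print(abelianize(u_418) == abelianize(u_419))
print(abelianize(v_418) == abelianize(v_419))
```

Sorting the letters of each word is enough here because only the commutative image is compared; the check says nothing about integrability of the systems themselves.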
Integrable Evolution Equations on Associative Algebras
Next, we consider the weighting
$$\deg u = 1, \qquad \deg v = 1, \qquad \deg x = 2, \tag{4.27}$$
which governs the derivative nonlinear Schrödinger equation. We have not performed a complete classification in this case. We did find two nontriangular integrable systems of second order with the given diagonal matrix as linear part:
$$\begin{aligned} u_t &= u_{xx} + 2(uvu)_x, & v_t &= -v_{xx} - 2(vuv)_x, \\ u_t &= u_{xx} + 2u_x vu, & v_t &= -v_{xx} + 2vuv_x. \end{aligned} \tag{4.28}$$
The first system is a noncommutative generalization of the derivative nonlinear Schrödinger equation. The second system is interesting, since it gives another example of an integrable equation not associated with a Jordan algebra. The corresponding third order symmetry is
$$\begin{aligned} u_t &= u_{xxx} + 3u_{xx}vu + 3u_x vu_x + 3u_x vuvu, \\ v_t &= v_{xxx} - 3vuv_{xx} - 3v_x uv_x + 3vuvuv_x. \end{aligned} \tag{4.29}$$

5. Matrix Painlevé Equations

It is well known that the symmetry reductions of integrable systems are ordinary differential equations of Painlevé type, [1, 27]. Thus, we can seek symmetry reductions of our noncommutative integrable equations, leading to associative algebra-valued counterparts of the classical Painlevé transcendents P-I, . . . , P-VI, [20]. First of all, the first Painlevé transcendent P-I appears as a symmetry reduction of the matrix KdV Eq. (1.3). Namely, the classical Galilean symmetry
$$\frac{\partial}{\partial t} - t\,\frac{\partial}{\partial x} + \frac{1}{6}\,\frac{\partial}{\partial u}$$
gives us the group-invariant solution
$$u = \tfrac{1}{6}\, t\, e + v\left(x + \tfrac{1}{2} t^2\right),$$
where $e$ is the identity (or any other constant) element of the algebra, and $v(z)$ satisfies the matrix ordinary differential equation $v''' + 3vv' + 3v'v = \tfrac{1}{6} e$. By integrating we obtain the first matrix Painlevé equation
$$v'' + 3v^2 = \tfrac{1}{6}\, z\, e + a \tag{5.1}$$
with arbitrary constant matrix $a$. Of course, we can use the scaling to normalize the constants $3$ and $\tfrac{1}{6}$. By a suitable shift of $z$ and a conjugation of the form $v \mapsto mvm^{-1}$, we can make $a$ into a matrix having zero trace in Jordan canonical form. Note that the Painlevé Eq. (5.1) admits a symmetric reduction $v^* = v$. The two matrix mKdV equations lead to two different matrix versions of the second Painlevé equation P-II. Namely, the simple scaling gives us the substitution $u = t^{-1/3}\, v(x t^{-1/3})$.
Starting with the standard matrix mKdV Eq. (1.4), we obtain
$$v''' + 3v^2 v' + 3v' v^2 + \tfrac{1}{3} v + \tfrac{1}{3} z v' = 0. \tag{5.2}$$
The second mKdV Eq. (1.6) (and the same reduction) gives rise to a different third order ODE
$$v''' + 3vv'' - 3v''v - 6vv'v + \tfrac{1}{3} v + \tfrac{1}{3} z v' = 0. \tag{5.3}$$
In contrast with the scalar case, we are unable to lower the order of either version of the second Painlevé transcendent, which remain third order equations. In the scalar case the fourth Painlevé equation P-IV is equivalent to the following system of Riccati type:
$$\begin{aligned} y' &= c_1 - \tfrac{1}{2} zy - y^2 - 2yw, \\ w' &= c_2 + \tfrac{1}{2} zw + w^2 + 2yw, \end{aligned} \tag{5.4}$$
where $c_1, c_2$ are arbitrary constants. The system (5.4) can be most simply obtained from the commutative nonlinear Schrödinger type system (4.10) via the similarity reduction $u = t^{-1/2} y(z)$, $v = t^{-1/2} w(z)$, with $z = t^{-1/2} x$. There are two different noncommutative counterparts to this equation, leading to two noncommutative versions of P-IV. Performing the same similarity reduction on (4.18) leads to
$$\begin{aligned} y'' + 2yy' + \left(2wy + \tfrac{1}{2} zy\right)' + 2ywy - 2wy^2 &= 0, \\ w'' - 2w'w - \left(2wy + \tfrac{1}{2} zw\right)' + 2wyw - 2w^2 y &= 0, \end{aligned} \tag{5.5}$$
whereas (4.19) yields the alternative system
$$\begin{aligned} y'' + 2yy' + \left(2yw + \tfrac{1}{2} zy\right)' + 2y^2 w - 2ywy &= 0, \\ w'' - 2ww' - \left(2wy + \tfrac{1}{2} zw\right)' - 2wyw + 2w^2 y &= 0. \end{aligned} \tag{5.6}$$
Unlike the commutative version, neither of these can be reduced to a first order system. The classical matrix chiral model
$$u_{xy} = \{u_x u^{-1} u_y\} = \tfrac{1}{2}\left(u_x u^{-1} u_y + u_y u^{-1} u_x\right), \tag{5.7}$$
is one of the most important integrable matrix equations. In [42, 43, 44], this equation was generalized to the case of arbitrary Jordan triple systems. The general scaling symmetry reduction of (5.7) leads to the substitution
$$u = x^p y^q\, v(x^\alpha y^\beta), \qquad \alpha\beta \neq 0.$$
Without loss of generality we can put $q = 0$, but the resulting reduced matrix ordinary differential equation does not depend on $p, q, \alpha, \beta$ and is as follows:
$$v_{zz} + \frac{v_z}{z} = v_z v^{-1} v_z, \qquad z = x^\alpha y^\beta. \tag{5.8}$$
In the standard case, this equation is a special case of P-III. It would be interesting to investigate the analytical properties, Bäcklund transformations and the corresponding isomonodromic problems for these matrix Painlevé equations. We do not know if there exist other matrix Painlevé equations (for instance P-VI). It seems possible to transfer our “associative approach” from symmetry analysis to the Painlevé analysis. This could allow one to classify matrix ordinary differential equations of Painlevé type. For example, recently Balandin and the second author, [5], have shown that, although it does not arise as a reduction of either of the matrix mKdV equations, the matrix P-II equation $v'' = 2v^3 + zv + \alpha e$, where $\alpha$ is a scalar parameter, passes the Painlevé–Kovalevskaya test.
6. Hamiltonian Structures

A scalar evolution Eq. (3.1) is said to be Hamiltonian if it can be written in the form
$$u_t = \mathcal{D}\, \delta\mathcal{H}. \tag{6.1}$$
Here $\mathcal{H} = \int H(u, u_x, \ldots)\, dx$ is the Hamiltonian functional, $\delta$ is the variational derivative, and the Hamiltonian operator $\mathcal{D}$ determines a Poisson bracket
$$\{\mathcal{F}, \mathcal{H}\} = \int \delta\mathcal{F} \cdot \mathcal{D}\, \delta\mathcal{H}\, dx \tag{6.2}$$
on the space of functionals. The Hamiltonian operator $\mathcal{D}$ is required to be a skew-adjoint differential operator, $\mathcal{D}^* = -\mathcal{D}$, in order that the Poisson bracket (6.2) be skew-symmetric, $\{\mathcal{F}, \mathcal{H}\} = -\{\mathcal{H}, \mathcal{F}\}$. In addition, we require that (6.2) satisfy the Jacobi identity
$$\{\mathcal{F}, \{\mathcal{G}, \mathcal{H}\}\} + \{\mathcal{G}, \{\mathcal{H}, \mathcal{F}\}\} + \{\mathcal{H}, \{\mathcal{F}, \mathcal{G}\}\} = 0. \tag{6.3}$$
Any skew-adjoint operator which does not explicitly depend on the field variable $u$ automatically satisfies the Jacobi identity. For field dependent operators, (6.3) is a nontrivial condition which we discuss in more detail below. We refer the reader to [35, Chapter 7], for details on the general theory of (commutative) Hamiltonian systems.

Generalizing the calculus of Poisson brackets and variational derivatives to the noncommutative case is relatively straightforward. The basic functionals must still be scalar-valued, and thus must be expressed as integrals of certain trace forms. For example, the functional
$$\mathcal{H}_1 = \int \mathrm{tr}\left(-\tfrac{1}{2} u_x^2 + u^3\right) dx \tag{6.4}$$
turns out to be a conservation law for the non-commutative Korteweg–deVries Eq. (1.3). We can compute its variational derivative as follows:
$$\begin{aligned} \langle \delta\mathcal{H}_1 ; v \rangle = \frac{d}{dt}\bigg|_{t=0} \mathcal{H}_1[u + tv] &= \int \mathrm{tr}\left(-\tfrac{1}{2} u_x v_x - \tfrac{1}{2} v_x u_x + vu^2 + uvu + u^2 v\right) dx \\ &= \int \mathrm{tr}\left(-u_x v_x + 3u^2 v\right) dx = \int \mathrm{tr}\left((u_{xx} + 3u^2)\, v\right) dx. \end{aligned}$$
Here we are using the fundamental property (2.5) of the trace operation as well as integration by parts. Therefore, the variational derivative of the functional (6.4) is $\delta\mathcal{H}_1 = u_{xx} + 3u^2$. The associative algebra-valued Korteweg–deVries equation (1.3) can thus be written in the Hamiltonian form $u_t = D_x\, \delta\mathcal{H}_1$. The total derivative operator $D_x$, being skew-adjoint and independent of the field variable, is automatically Hamiltonian.
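The algebraic step in the computation of the variational derivative, namely that the three orderings $vu^2$, $uvu$, $u^2 v$ have the same trace, can be illustrated on explicit matrices. The sketch below is only a numerical sanity check on one arbitrarily chosen pair of $3 \times 3$ integer matrices (the helper names and the sample matrices are assumptions made for illustration); it extracts the coefficient of $t$ in the cubic polynomial $\mathrm{tr}\,(u + tv)^3$ exactly and compares it with $3\,\mathrm{tr}(u^2 v)$.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b, s=1):
    """Entrywise a + s*b."""
    return [[x + s * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def cube_trace(a):
    return trace(matmul(a, matmul(a, a)))


# Arbitrary sample matrices (illustrative only); they do not commute.
u = [[1, 2, 0], [3, -1, 4], [0, 5, 2]]
v = [[2, -1, 1], [0, 3, -2], [4, 1, 0]]

# f(t) = tr((u + t v)^3) = c0 + c1 t + c2 t^2 + c3 t^3 with c3 = tr(v^3),
# so (f(1) - f(-1)) / 2 = c1 + c3 and c1 is the derivative at t = 0.
c1 = (cube_trace(add(u, v)) - cube_trace(add(u, v, -1))) // 2 - cube_trace(v)

u2 = matmul(u, u)
three_orderings = (trace(matmul(v, u2))
                   + trace(matmul(u, matmul(v, u)))
                   + trace(matmul(u2, v)))

print(c1 == three_orderings == 3 * trace(matmul(u2, v)))
print(matmul(u, v) != matmul(v, u))
```

Integer entries keep the arithmetic exact, so the comparison does not depend on floating point tolerances.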
For general skew-adjoint differential operators, the simplest method for verifying the Jacobi identity (6.3) is based on the calculus of “functional multi-vectors” developed in [35, Chapter 7]. We assume that the reader is familiar with the method for the commutative case as explained there. The non-commutative version is very similar; in the one component case, one introduces the basic uni-vectors $\theta_k \equiv D_x^k \theta$, which are, in a certain
sense, “duals” to the one-forms $du_k = D_x^k\, du$. The main difference is that, since ordinary multiplication is no longer commutative, the wedge product of such multi-vectors is no longer skew symmetric. Thus, in order to avoid confusion, we shall drop the explicit wedge product symbol entirely, denoting $\theta \wedge \theta_x$, say, by just concatenation $\theta\, \theta_x$. The only identity that is retained in the case of an associative algebra is the skew analogue of the trace formula (2.5), which requires
$$\mathrm{tr}\, \xi\, \eta = (-1)^{mn}\, \mathrm{tr}\, \eta\, \xi, \tag{6.5}$$
for any $m$-vector $\xi$ and $n$-vector $\eta$. An associative algebra-valued functional multi-vector has the form $\Theta = \int (\mathrm{tr}\, \xi)\, dx$, where $\xi$ is a multi-vector built on the underlying associative algebra. We note that, besides the trace identity (6.5), one can also integrate functional multi-vectors by parts:
$$\int \mathrm{tr}\, (\xi\, D_x \eta)\, dx = -\int \mathrm{tr}\, ((D_x \xi)\, \eta)\, dx = (-1)^{1+mn} \int \mathrm{tr}\, (\eta\, D_x \xi)\, dx. \tag{6.6}$$
The most important functional multi-vector is the bivector
$$\Theta = \int \mathrm{tr}\, (\theta\, \mathcal{D}\, \theta)\, dx \tag{6.7}$$
associated with a skew-adjoint linear operator $\mathcal{D}$. The operator $\mathcal{D}$ defines a noncommutative Poisson bracket
$$\{\mathcal{F}, \mathcal{H}\} = \int \mathrm{tr}\, [\delta\mathcal{F} \cdot \mathcal{D}\, \delta\mathcal{H}]\, dx, \tag{6.8}$$
if and only if the bivector $\Theta$ satisfies the quadratic bracket condition
$$[\Theta, \Theta] = 2\, v_{\mathcal{D}\theta}(\Theta) = 0. \tag{6.9}$$
Here $[\,\cdot\,, \cdot\,]$ denotes the noncommutative generalization of the Schouten bracket, cf. [34], between functional multi-vectors. The functional trivector (6.9) can be effectively computed via the noncommutative version of a standard formula based on the formal evolutionary vector field $v_{\mathcal{D}\theta}$ which has multi-vector characteristic $\mathcal{D}\theta$. In other words, we replace $K$ in (3.2) by the uni-vector $\mathcal{D}\theta$, and allow $v_{\mathcal{D}\theta}$ to act as an anti-derivation on the space of multi-vectors, meaning that
$$v_{\mathcal{D}\theta}(\xi\, \eta) = v_{\mathcal{D}\theta}(\xi)\, \eta + (-1)^m\, \xi\, v_{\mathcal{D}\theta}(\eta) \tag{6.10}$$
whenever $\xi$ is an $m$-vector. Note that $v_{\mathcal{D}\theta}$ picks up a minus sign each time it “moves past” another uni-vector. See [35] for details, which are straightforwardly adapted here to the non-commutative case.

Two Hamiltonian operators $\mathcal{D}$ and $\mathcal{E}$ are said to form a Hamiltonian pair provided any linear combination $a\mathcal{D} + b\mathcal{E}$, $a, b \in \mathbb{R}$, is also Hamiltonian. This is equivalent to requiring that $\mathcal{D}$ and $\mathcal{E}$ are individually Hamiltonian, meaning that their associated bivectors
$$\Theta = \int \mathrm{tr}\, (\theta\, \mathcal{D}\, \theta)\, dx, \qquad \Xi = \int \mathrm{tr}\, (\theta\, \mathcal{E}\, \theta)\, dx,$$
satisfy the bracket condition (6.9). Moreover, they must satisfy a compatibility condition that the Schouten bracket between the bivectors vanishes, leading to the complete system of bracket conditions
$$[\Theta, \Theta] = 0, \qquad [\Theta, \Xi] = 0, \qquad [\Xi, \Xi] = 0. \tag{6.11}$$
Magri’s theorem, [25, 35], demonstrates the integrability of any biHamiltonian system
$$u_t = \mathcal{D}\, \delta\mathcal{H}_1 = \mathcal{E}\, \delta\mathcal{H}_0. \tag{6.12}$$
Indeed, the operator
$$\mathcal{R} = \mathcal{E} \cdot \mathcal{D}^{-1}, \tag{6.13}$$
forms a recursion operator for the system (6.12), leading to the hierarchy of commuting higher order flows
$$u_t = \mathcal{D}\, \delta\mathcal{H}_{n+1} = \mathcal{E}\, \delta\mathcal{H}_n, \tag{6.14}$$
associated with the hierarchy $\mathcal{H}_n$ of higher order conservation laws. The main technical issue is whether the higher order flows generated by recursion are local. We now discuss what is known concerning the Hamiltonian and biHamiltonian structures of noncommutative integrable systems.

Theorem 6.1. The noncommutative Korteweg–deVries Eq. (1.3) is a biHamiltonian system for the compatible Hamiltonian pair
$$\mathcal{D} = D_x, \qquad \mathcal{E} = D_x^3 + A_u D_x + D_x A_u + C_u D_x^{-1} C_u. \tag{6.15}$$
Here $A_u$ and $C_u$ are the anti-commutator and commutator maps as in (2.3).

Proof. Note that
$$\mathcal{D} v = v_x, \qquad \mathcal{E} v = v_{xxx} + 4\{u, v_x\} + 2\{u_x, v\} + [D_x^{-1}[v, u], u]. \tag{6.16}$$
Therefore, it suffices to prove that $\mathcal{E}$ is Hamiltonian, since the linear combination $\mathcal{E} + c\mathcal{D}$ can be simply obtained from $\mathcal{E}$ via translation $u \mapsto u + c$. We write $\mathcal{E} = \mathcal{E}_0 + \mathcal{E}_1 + \mathcal{E}_2$, where
$$\mathcal{E}_0 = D_x^3, \qquad \mathcal{E}_1 = 2A_u D_x + A_{u_x}, \qquad \mathcal{E}_2 = C_u D_x^{-1} C_u. \tag{6.17}$$
Note that the subscript on the $\mathcal{E}$’s denotes their degrees (weights) under scaling in $u$. We similarly decompose the associated functional bivector: $\Theta = \Theta_0 + \Theta_1 + \Theta_2$. Let us introduce the (nonlocal) noncommutative uni-vectors
$$\lambda = D_x^{-1}(u\theta), \qquad \rho = D_x^{-1}(\theta u). \tag{6.18}$$
Then using integration by parts (6.6) and the trace identity (6.5), we find the following simplified formulae:
$$\begin{aligned} \Theta_0 &= \int \mathrm{tr}\, [\theta\, \theta_{xxx}]\, dx, \\ \Theta_1 &= 2\int \mathrm{tr}\, [u\, \theta\, \theta_x - u\, \theta_x\, \theta]\, dx = 2\int \mathrm{tr}\, [(\lambda_x + \rho_x)\, \theta_x]\, dx, \\ \Theta_2 &= \int \mathrm{tr}\, [(\lambda - \rho)(\lambda_x - \rho_x)]\, dx. \end{aligned} \tag{6.19}$$
Since the Schouten bracket condition (6.9) is quadratic in $\Theta$, we can split the trivector $\Upsilon = [\Theta, \Theta]$ into homogeneous components $\Upsilon = \Upsilon_0 + \Upsilon_1 + \Upsilon_2 + \Upsilon_3$, where
$$\begin{aligned} \Upsilon_0 &= v_0(\Theta_1), & \Upsilon_2 &= v_1(\Theta_2) + v_2(\Theta_1), \\ \Upsilon_1 &= v_0(\Theta_2) + v_1(\Theta_1), & \Upsilon_3 &= v_2(\Theta_2). \end{aligned} \tag{6.20}$$
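Here $v_k = v_{\mathcal{E}_k \theta}$, and we are using the fact that $v_k(\Theta_0) = 0$ since $\mathcal{E}_0$ is a constant coefficient operator. When evaluating (6.20), it is important to remember the anti-derivational rules (6.10) for the formal vector fields $v_k = v_{\mathcal{E}_k \theta}$. In particular,
$$v_0(\lambda_x) = \theta_{xxx}\, \theta, \qquad \text{whereas} \qquad v_0(\rho_x) = -\theta\, \theta_{xxx}. \tag{6.21}$$
Similarly, we find
$$\begin{aligned} v_1(\lambda_x) &= 2u\, \theta_x\, \theta + 2\, \theta_x\, \lambda_x + u_x\, \theta\, \theta + \theta\, u_x\, \theta, & v_1(\rho_x) &= -2\theta\, \theta_x\, u - 2\rho_x\, \theta_x - \theta\, u_x\, \theta - \theta\, \theta\, u_x, \\ v_2(\lambda_x) &= (\rho - \lambda)\, \lambda_x - u(\rho - \lambda)\, \theta, & v_2(\rho_x) &= \rho_x\, (\rho - \lambda) - \theta\, (\rho - \lambda)\, u. \end{aligned} \tag{6.22}$$
We can now compute the relevant brackets. First, applying (6.21) and integrating the result by parts, we find
$$\begin{aligned} v_0(\Theta_1) &= \int \mathrm{tr}\, [(v_0(\lambda_x) + v_0(\rho_x))\, \theta_x]\, dx = \int \mathrm{tr}\, [\theta_{xxx}\, \theta\, \theta_x - \theta_{xxx}\, \theta_x\, \theta]\, dx \\ &= \int \mathrm{tr}\, [-\theta_{xx}\, \theta_x\, \theta_x - \theta_{xx}\, \theta\, \theta_{xx} + \theta_{xx}\, \theta_{xx}\, \theta + \theta_{xx}\, \theta_x\, \theta_x]\, dx = 0, \end{aligned}$$
the final equality relying on (2.5). Similarly,
$$v_2(\Theta_2) = \int \mathrm{tr}\, [(v_2(\lambda_x) - v_2(\rho_x))(\rho - \lambda)]\, dx = \int \mathrm{tr}\, [(\lambda_x - \rho_x)(\lambda\lambda - \lambda\rho - \rho\lambda + \rho\rho)]\, dx = 0.$$
The last equality follows by noting that the various terms combine to form total derivatives and hence integrate to zero; for example, using (6.5),
$$D_x\, \mathrm{tr}\, \lambda\lambda\lambda = \mathrm{tr}\, (\lambda_x \lambda\lambda + \lambda\lambda_x \lambda + \lambda\lambda\lambda_x) = 3\, \mathrm{tr}\, \lambda_x \lambda\lambda,$$
and hence $\int \mathrm{tr}\, (\lambda_x \lambda\lambda)\, dx = 0$. The verification that $\Upsilon_1 = 0 = \Upsilon_2$ is similar, although more tedious.

Corollary 6.2. The operator
$$\mathcal{R} = D_x^2 + 2A_u + A_{u_x} D_x^{-1} + C_u D_x^{-1} C_u D_x^{-1} \tag{6.23}$$
is a recursion operator for the non-commutative Korteweg–deVries equation (1.3).

Remark. The operator (6.23) is a special case of a formula due to Svinolupov, [46], for the recursion operator for multi-component Jordan systems. The proof that $\mathcal{E}$ satisfies the Jacobi identity would be almost impossible to do using the component form of the operator.

Remark. It is an open problem to prove that, as with the commutative KdV recursion operator, applying (6.23) recursively to the elementary symmetry $K_0 = u_x$ produces a hierarchy of local, mutually commuting, higher order flows. We have verified this up to order 11, but do not have a general proof.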
There are several interesting points associated with this result, indicating that the noncommutative Hamiltonian theory is more complicated than the well-studied commutative theory. First, and most noticeable, is the fact that the operator $\mathcal{E}$ is non-local, requiring the formal integral operator $D_x^{-1}$. This is in contrast with the scalar Korteweg–deVries equation, whose second Hamiltonian structure is a local operator. Second, and even more surprising, is that, except for the constant coefficient operator $D_x^3$, the other two homogeneous summands of the second Hamiltonian operator $\mathcal{E}$ are not individual Hamiltonian operators: only when they appear in the particular linear combination (6.15) does the operator define a genuine Poisson bracket! In particular, the combination $D_x^3 + A_u D_x + D_x A_u$, which would appear to be a more natural noncommutative generalization of the second KdV Hamiltonian operator, is not Hamiltonian. Indeed, $[\Theta_1, \Theta_1] = v_1(\Theta_1) \neq 0$, and hence $\mathcal{E}_1(v) = 4\{u, v_x\} + 2\{u_x, v\}$ does not define a Hamiltonian operator! This is remarkable, since this appears to be the direct analogue of the commutative Hamiltonian operator $2uD_x + u_x$, which is the simplest Hamiltonian operator appearing in commutative Hamiltonian systems of “hydrodynamic type”, [9, 10, 11, 36], which are first order quasilinear systems of partial differential equations. The operator $2uD_x + u_x$ defines the first of four known Hamiltonian structures for the inviscid Burgers’ equation $u_t = uu_x$, [36]. Our result indicates that the noncommutative theory of systems of hydrodynamic type is considerably more complicated than the commutative theory. However, this theory is clearly worth developing, since, in view of the connections between such Hamiltonian operators and Riemannian geometry in the commutative case, the resulting theory may help shed new light on how to construct a “noncommutative Riemannian geometry”, cf. [6].
The general form for a first order Hamiltonian operator for the mKdV-type equations is
$$\mathcal{D} = D_x + \lambda C_u + \mu C_u D_x^{-1} C_u. \tag{6.24}$$
In other words,
$$\mathcal{D}(v) = v_x + \lambda [u, v] + \mu [u, D_x^{-1}[u, v]\,]. \tag{6.25}$$
Therefore, the operators $D_x$, $C_u$, and $C_u D_x^{-1} C_u$, form a compatible Hamiltonian triple, although the latter is an immediate consequence of the compatibility of the first two operators. Interestingly, in contrast with the scalar modified Korteweg–deVries equation, whose first order Hamiltonian structure is the local operator $D_x$, the structure here is non-local, requiring the formal integral operator $D_x^{-1}$.

Theorem 6.3. For any scalar constants $\lambda, \mu$, the operator (6.24) is Hamiltonian.

The proof is similar to Theorem 6.1 and is omitted. Choosing $\lambda = 0$, $\mu = 1$, so
$$\mathcal{D}_1 = D_x + C_u D_x^{-1} C_u, \tag{6.26}$$
we find that the first matrix mKdV Eq. (1.4) can be written in Hamiltonian form (6.1), with Hamiltonian
$$\mathcal{H}_1 = \int \mathrm{tr}\left(-\tfrac{1}{2} u_x^2 + \tfrac{1}{2} u^4\right) dx. \tag{6.27}$$
Choosing $\lambda = 3$, $\mu = 2$, so
$$\mathcal{D}_2 = D_x + 3C_u + 2C_u D_x^{-1} C_u, \tag{6.28}$$
we find that the second matrix mKdV Eq. (1.6) can be written in Hamiltonian form (6.1), with Hamiltonian
$$\mathcal{H}_2 = \int \mathrm{tr}\left(-\tfrac{1}{2} u_x^2 - \tfrac{1}{2} u^4\right) dx. \tag{6.29}$$
For general $\lambda, \mu$, and Hamiltonian
$$\mathcal{H} = \int \mathrm{tr}\left(-\tfrac{1}{2} u_x^2 + \tfrac{1}{2} \varepsilon u^4\right) dx, \tag{6.30}$$
we obtain the family of matrix Hamiltonian equations
$$u_t = u_{xxx} + \lambda [u, u_{xx}] + (\mu + \varepsilon)\{u^2, u_x\} + (\varepsilon - 2\mu)\, uu_x u. \tag{6.31}$$
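Although the operator (6.24) is nonlocal, the flows it produces here are local because $[u, u_{xx}] = D_x [u, u_x]$, so that $D_x^{-1}$ can be applied explicitly and the nonlocal term contributes the double commutator $[u, [u, u_x]] = u^2 u_x - 2uu_x u + u_x u^2$. A minimal sketch checking these two identities with noncommutative words follows; the dictionary representation and the helper names (`mul`, `comm`, `dx`) are illustrative assumptions, and the commutator map is taken to be $C_u v = [u, v]$ as in (6.25).

```python
from collections import defaultdict

# A polynomial is a dict {word: coefficient}; a word is a tuple of ints,
# where the letter k stands for the k-th x-derivative of u
# (0 -> u, 1 -> u_x, 2 -> u_xx, ...).


def clean(p):
    """Drop terms with zero coefficient and return a plain dict."""
    return {w: c for w, c in p.items() if c != 0}


def add(p, q, s=1):
    """Return p + s*q."""
    r = defaultdict(int, p)
    for w, c in q.items():
        r[w] += s * c
    return clean(r)


def mul(p, q):
    """Noncommutative product: words are concatenated in order."""
    r = defaultdict(int)
    for w1, c1 in p.items():
        for w2, c2 in q.items():
            r[w1 + w2] += c1 * c2
    return clean(r)


def comm(p, q):
    """Commutator [p, q] = pq - qp."""
    return add(mul(p, q), mul(q, p), -1)


def dx(p):
    """Total x-derivative, applied letter by letter (Leibniz rule)."""
    r = defaultdict(int)
    for w, c in p.items():
        for i, k in enumerate(w):
            r[w[:i] + (k + 1,) + w[i + 1:]] += c
    return clean(r)


u, ux, uxx = {(0,): 1}, {(1,): 1}, {(2,): 1}

# D_x [u, u_x] = [u, u_xx], hence D_x^{-1} [u, u_xx] = [u, u_x] is local.
print(dx(comm(u, ux)) == comm(u, uxx))

# [u, [u, u_x]] = u^2 u_x - 2 u u_x u + u_x u^2.
double = comm(u, comm(u, ux))
expected = {(0, 0, 1): 1, (0, 1, 0): -2, (1, 0, 0): 1}
print(double == expected)
```

The coefficient $-2$ of $uu_x u$ in the double commutator is the source of the $-2\mu$ contribution in (6.31); the sketch does not attempt to reproduce the remaining coefficients.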
However, Theorem 3.6 indicates that not all of these are integrable. More specifically, only the first and second matrix mKdV equations (1.4), (1.6), admit a fifth order symmetry.

7. Conclusions and Further Research

In this paper, we have initiated the systematic classification of integrable evolution equations whose field variables take values in an associative algebra. Complete classifications are found for equations of KdV type, where the only integrable noncommutative examples found so far are the noncommutative KdV and its higher order symmetries. A similar result holds for noncommutative generalizations of the mKdV equation. The classification of two-component systems generalizing the nonlinear Schrödinger equation has been completed. In both the commutative and noncommutative cases, new examples of integrable systems were discovered. In some cases, commutative equations have one or two noncommutative counterparts, whereas some integrable commutative equations cannot be extended to the noncommutative regime. The computer packages used to effect these computations can be readily applied to other types of systems, although the required amount of computing power and memory increases rapidly, leaving many interesting cases unexplored. Explicit solutions, including solitons, as well as the possible linearizations and integration via a noncommutative inverse scattering method will be the subject of future research. Symmetry reduction of the noncommutative integrable equations leads to a variety of associative algebra valued ordinary differential equations of Painlevé type, whose properties await a more detailed development. A systematic study of both symmetry reductions, and integrable reductions through specialization of the field variables, will be discussed in a future work.

Noncommutative Hamiltonian structures for some of the integrable equations were found, although the second Hamiltonian structures for the two noncommutative variants of the mKdV equation have not yet been determined. Another important problem is to prove that all the higher order symmetries of our noncommutative systems obtained through application of the nonlocal recursion operator are local equations. Since the field variables in our systems take their values in an arbitrary associative algebra, our results can be equally well applied to algebras of operators over Hilbert spaces and other infinite dimensional spaces. This leads one to speculate on a possible model of nonlinear quantum mechanics, governed by such an operator evolution equation. Our approach to the classification of nonabelian equations can therefore be regarded as a constructive procedure of “nonlinear” quantization that can be applied to classical integrable systems. The effect of such evolutions on such basic physical assumptions as the superposition principle remains to be explored.
Appendix A. Discussion of Proofs

The full proofs of our main theorems involve extensive computer calculations using Mathematica, and are not possible to provide here in full. However, for the reader’s convenience, we outline the basic steps, beginning with the first case: the noncommutative Korteweg–deVries equation. To prove Theorem 3.4, we start with a general nondegenerate noncommutative weight 3 Eq. (3.8). Rescaling the time variable $t$ allows us to set the linear coefficient to be unity. Therefore, our basic equation has the form
$$u_t = u_{xxx} + a_1 uu_x + a_2 u_x u. \tag{A.1}$$
Up to an inessential scalar multiple, the most general nondegenerate weight 5 symmetry has the form
$$u_t = u_{xxxxx} + b_1 uu_{xxx} + b_2 u_x u_{xx} + b_3 u_{xx} u_x + b_4 u_{xxx} u + b_5 u^2 u_x + b_6 uu_x u + b_7 u_x u^2. \tag{A.2}$$
(There are no symmetries if we omit the linear term.) We successively analyze the terms in the commutator according to their degree in $u$. The linear terms automatically vanish; this is a general fact for scalar equations. The quadratic terms $uu_{7x}$, $u_x u_{6x}$, etc. lead to the following normalizations for the symmetry coefficients:
$$b_1 = \tfrac{5}{3} a_1, \qquad b_2 = b_3 = \tfrac{5}{3}(a_1 + a_2), \qquad b_4 = \tfrac{5}{3} a_2.$$
The cubic terms lead to further normalizations of the symmetry coefficients:
$$b_5 = \tfrac{5}{6} a_1^2 + \tfrac{5}{9} a_1 a_2 - \tfrac{5}{18} a_2^2, \qquad b_6 = \tfrac{5}{18} a_1^2 + \tfrac{5}{9} a_1 a_2 + \tfrac{5}{18} a_2^2, \qquad b_7 = -\tfrac{5}{18} a_1^2 + \tfrac{5}{9} a_1 a_2 + \tfrac{5}{6} a_2^2. \tag{A.3}$$
In addition, there are three nonlinear conditions on the coefficients of the equation itself, which reduce to the simple quadratic equations
$$a_1^2 = a_1 a_2 = a_2^2.$$
This implies that $a_1 = a_2$ is necessary for integrability, and hence (A.1) must agree with the noncommutative Korteweg–deVries Eq. (1.3) up to rescaling of $u$. Substituting these values into the symmetry conditions (A.3), (A.4) reduces (A.2) to the corresponding fifth order symmetry. There are no further commutator conditions to satisfy, and so this completes the proof of Theorem 3.4.

This general pattern holds for the other classifications. We always work with nondegenerate equations and symmetries, and so can always set the linear coefficients to unity. (Leaving them arbitrary leads to significant complications in the computational algorithm since the symmetry conditions all become nonlinear.) The symmetry conditions are analyzed in order corresponding to the degree of the terms in $u$ and its derivatives. The linear terms automatically vanish. Low degree terms will produce linear equations for the coefficients of the symmetry. Eventually, however, nonlinear algebraic conditions on the coefficients of the equation itself arise. Although this did not happen in the simple KdV example, typically, there are several different solutions to the nonlinear symmetry conditions, and this leads to branching of the possible classes of integrable systems.

Two general computational strategies may be followed. One is to solve the nonlinear symmetry conditions as they arise, and then continue on to the higher degree conditions in each branch separately. This may lead to additional branching, and, often, numerous redundancies. An alternative approach is to retain the nonlinear equations in unsolved form, and proceed to the higher degree conditions, appending any additional nonlinear equations to the list as required. Finally, one can solve the full (large) nonlinear system all at once, if possible. In Mathematica 2.2, the standard Solve routine misses branches and cannot be trusted. However, the Gröbner basis routine Reduce is reliable, surprisingly powerful, and produces complete lists of solutions and branches. (We have not yet tried our programs out in version 3.0 of Mathematica.) Each method has advantages and disadvantages, which become apparent once one tries to analyze more complicated examples, which easily can tax the memory and execution time limits of even a sophisticated workstation implementation of the code. The most extensive example we have successfully analyzed to date is the nonlinear Schrödinger system discussed in Sect. 4. In this case, we were forced to use a hybrid strategy in order to stay within the memory limitations of the Iris workstation utilized for these computations.

Let us outline the steps in the computation in the other cases. For the Burgers’-mKdV classification Theorem 3.6, we begin with the general form
$$u_t = u_{xxx} + a_1 uu_{xx} + a_2 u_x^2 + a_3 u_{xx} u + a_4 u^2 u_x + a_5 uu_x u + a_6 u_x u^2. \tag{A.4}$$
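The ansätze used in this appendix are linear combinations of noncommutative differential monomials of a fixed scaling weight. A minimal sketch of how such monomials can be enumerated is given below. It assumes that the $k$-th $x$-derivative of $u$ carries weight $\deg u + k$, with $\deg u = 2$ in the KdV case and $\deg u = 1$ in the mKdV-type case, and that an equation of weight $w$ has right-hand side of total weight $w + \deg u$; these conventions are inferred from the terms listed in (A.1) and (A.2), and the function names are hypothetical.

```python
def monomials(total, deg_u):
    """All noncommutative words in u, u_x, u_xx, ... of total weight `total`.

    The letter u_k (k-th x-derivative of u) has weight deg_u + k.
    A word is returned as a tuple of derivative orders, e.g. (0, 1) = u u_x.
    """
    if total == 0:
        return [()]  # the empty word closes off a completed monomial
    words = []
    k = 0
    while deg_u + k <= total:
        for rest in monomials(total - deg_u - k, deg_u):
            words.append((k,) + rest)
        k += 1
    return words


def ansatz(weight, deg_u):
    """Monomials on the right-hand side of a weight-`weight` equation."""
    return monomials(weight + deg_u, deg_u)


# KdV weighting (deg u = 2): the equation (A.1) and the symmetry (A.2).
print(ansatz(3, 2))
print(len(ansatz(5, 2)))

# mKdV-type weighting (deg u = 1): a general fifth order symmetry.
print(len(ansatz(5, 1)))
```

Under these assumptions the enumeration returns 3 monomials for (A.1), 8 for (A.2), and 32 for a fifth order symmetry in the weighting $\deg u = 1$, in agreement with the number of terms quoted for the Burgers’-mKdV case.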
A general fifth order symmetry has 32 terms, beginning with
$$u_t = u_{xxxxx} + b_1 uu_{xxxx} + b_2 u_x u_{xxx} + \cdots + b_{31} u^6.$$
The quadratic commutator conditions serve to normalize the coefficients of the quadratic terms in the symmetry. The cubic terms produce additional symmetry normalizations, as well as a pair of nonlinear equations for the coefficients of the quadratic terms in the equation:
$$a_1 (a_2 - a_1 - a_3) = a_3 (a_2 - a_1 - a_3) = 0. \tag{A.5}$$
There are, consequently, 3 distinct branches at this stage:
$$a_1 = a_3 = 0, \qquad \text{or} \qquad a_1 = 0,\ a_2 = a_3, \qquad \text{or} \qquad a_3 = 0,\ a_1 = a_2.$$
Of course, the latter two branches are related to each other via the transpose operation, and so only one needs to be treated directly. However, we find it a bit easier to just continue the commutator computations without reducing to particular branches at this point, but rather retaining the nonlinear conditions (A.6) in unsolved form. For the degree 4 terms, we obtain 44 additional nonlinear equations (although this count includes many redundancies which we did not try to resolve), and the degree 5 terms lead to 106 nonlinear equations. Finally, the degree 6 terms have completely determined all the coefficients of the symmetry, and leave a residual system of 158 nonlinear equations in the coefficients $a_1, \ldots, a_6$ of (A.5). There are precisely 6 different branches of solutions to this final system, and these produce the 6 types of integrable equations mentioned in Theorem 3.6.

A similar strategy works for the fifth order KdV equation considered in Theorem 3.5. The interesting twist here is that in the commutative case, there is a single condition on the coefficients of the equation arising from the cubic commutator terms. The quartic terms then lead to 3 branches: the fifth order KdV, the Kaup–Kupershmidt, and the Sawada–Kotera cases. However, in the noncommutative case, there are significantly more nonlinear conditions, and only the analogue of the fifth order KdV equation survives the computation.

In the nonlinear Schrödinger case, we employed a hybrid strategy as follows. We first computed the nonlinear equations for the coefficients of the equation corresponding to the cubic symmetry conditions. Solutions of these equations resulted in 70 different branches, of which 20 were not triangular. Of these, several additional cases could be
eliminated by using the transpose involution. The surviving systems were then substituted into the remaining symmetry conditions, and the final nontriangular cases were tallied, as in the text. In all cases, an ab initio computation of the fourth order symmetry was conducted to reconfirm that each system obtained is integrable. A complete list of symmetries of the integrable cases can be obtained from the first author. Acknowledgement. We would like to thank the Ordway endowment for supporting V. Sokolov’s visit to Minnesota, during which this project was initiated. We would also like to thank V.E. Adler, I.Z. Golubchik and R.I. Yamilov for helpful discussions. We also thank an anonymous referee for useful remarks.
References
1. Ablowitz, M.J., Ramani, A., Segur, H.: A connection between nonlinear evolution equations and ordinary differential equations of P-type. J. Math. Phys. 21, 715–721 (1980)
2. Antonowicz, M., Fordy, A.P.: Coupled KdV equations with multi-Hamiltonian structures. Physica D 28, 345–357 (1987)
3. Athorne, C., Fordy, A.P.: Generalised KdV and MKdV equations associated with symmetric spaces. J. Phys. A 20, 1377–1386 (1987)
4. Bakirov, I.M.: On the symmetries of some system of evolution equations. 1991
5. Balandin, S.P., Sokolov, V.V.: On the Painlevé test for nonabelian equations. Ufa, 1996
6. Connes, A.: Noncommutative Geometry. San Diego: Academic Press, 1994
7. Crumeyrolle, A.: Orthogonal and Symplectic Clifford Algebras. Boston: Kluwer, 1990
8. Curtis, C.W., Reiner, I.: Representation Theory of Finite Groups and Associative Algebras. New York: Interscience, 1962
9. Dubrovin, B.A., Novikov, S.P.: Hamiltonian formalism of one-dimensional systems of hydrodynamic type and the Bogolyubov-Whitham averaging method. Sov. Math. Dokl. 27, 665–669 (1983)
10. Dubrovin, B.A., Novikov, S.P.: On Poisson brackets of hydrodynamic type. Sov. Math. Dokl. 30, 651–654 (1984)
11. Dubrovin, B.A., Novikov, S.P.: Hydrodynamics of weakly deformed soliton lattices. Differential geometry and Hamiltonian theory. Russian Math. Surveys 44:6, 35–124 (1989)
12. Ferapontov, E.V.: Differential geometry of nonlocal Hamiltonian operators of hydrodynamic type. Func. Anal. Appl. 25, 195–204 (1991)
13. Fokas, A.S.: A symmetry approach to exactly solvable evolution equations. J. Math. Phys. 2, 1318–1325 (1980)
14. Fordy, A.P.: Derivative nonlinear Schrödinger equations and Hermitian symmetric spaces. J. Phys. A 17, 1235–1245 (1984)
15. Fordy, A.P., Kulish, P.P.: Nonlinear Schrödinger equations and simple Lie algebras. Commun. Math. Phys. 89, 427–443 (1983)
16. Gürses, M., Karasu, A.: Degenerate Svinolupov KdV systems. Phys. Lett. A 214, 21–26 (1996)
17. Habibullin, I.T., Sokolov, V.V., Yamilov, R.I.: Multi-component integrable systems and nonassociative structures. In: Nonlinear Physics: Theory and Experiment, Singapore: World Scientific, 1996, pp. 139–168
18. Ibragimov, N.H., Shabat, A.B.: Evolutionary equations with nontrivial Lie–Bäcklund group. Func. Anal. Appl. 14, 19–28 (1980)
19. Ibragimov, N.H., Shabat, A.B.: Infinite Lie–Bäcklund algebras. Func. Anal. Appl. 14, 313–315 (1980)
20. Ince, E.L.: Ordinary Differential Equations. New York: Dover, 1956
21. Jacobson, N.: Structure and Representations of Jordan Algebras. Providence, R.I.: American Math. Soc. Colloquium Publ., vol. 39, 1968
22. Kaup, D.J.: On the inverse scattering problem for cubic eigenvalue problems of the class $\psi_{xxx} + 6Q\psi_x + 6R\psi = \lambda\psi$. Stud. Appl. Math. 62, 189–216 (1980)
23. Khalilov, F.A., Khruslov, E.Ya.: Matrix generalisation of the modified Korteweg-deVries equation. Inv. Prob. 6, 193–204 (1990)
24. Kichenassamy, S., Olver, P.J.: Existence and non–existence of solitary wave solutions to higher order model evolution equations. SIAM J. Math. Anal. 23, 1141–1166 (1992)
25. Magri, F.: A simple model of the integrable Hamiltonian equation. J. Math. Phys. 19, 1156–1162 (1978)
26. Marchenko, V.A.: Nonlinear Equations and Operator Algebras. Boston: D. Reidel Pub. Co., 1988
27. McLeod, J.B., Olver, P.J.: The connection between partial differential equations soluble by inverse scattering and ordinary differential equations of Painlevé type. SIAM J. Math. Anal. 14, 488–506 (1983)
28. Mikhailov, A.V., Shabat, A.B., Sokolov, V.V.: The symmetry approach to classification of integrable equations. In: What is Integrability?, V.E. Zakharov, ed., New York: Springer Verlag, 1990, pp. 115–184
29. Mikhailov, A.V., Shabat, A.B., Yamilov, R.I.: The symmetry approach to classification of nonlinear equations. Complete lists of integrable systems. Russian Math. Surveys 42:4, 1–63 (1987)
30. Mikhailov, A.V., Shabat, A.B., Yamilov, R.I.: Extension of the module of invertible transformations. Classification of integrable systems. Commun. Math. Phys. 115, 1–19 (1988)
31. Mokhov, O.I.: Hamiltonian systems of hydrodynamic type and constant curvature metrics. Phys. Lett. A 166, 215–216 (1992)
32. Mokhov, O.I., Ferapontov, E.V.: Non-local Hamiltonian operators of hydrodynamic type related to metrics of constant curvature. Russian Math. Surveys 45:3, 218–219 (1990)
33. Olver, P.J.: Evolution equations possessing infinitely many symmetries. J. Math. Phys. 18, 1212–1215 (1977)
34. Olver, P.J.: Hamiltonian perturbation theory and water waves. Contemp. Math. 28, 231–249 (1984)
35. Olver, P.J.: Applications of Lie Groups to Differential Equations, Second Edition, Graduate Texts in Mathematics, vol. 107, New York: Springer–Verlag, 1993
36. Olver, P.J., Nutku, Y.: Hamiltonian structures for systems of hyperbolic conservation laws. J. Math. Phys. 29, 1610–1619 (1988)
37. Porteous, I.: Clifford Algebras. Cambridge Stud. Adv. Math., vol. 50, Cambridge: Cambridge University Press, 1995
38. Röhrl, H.: Algebras and differential equations. Nagoya Math. J. 68, 59–122 (1977)
39. Sawada, K., Kotera, T.: A method for finding N-soliton solutions of the K.d.V. equation and K.d.V.-like equation. Prog. Theor. Physics 51, 1355–1367 (1974)
40. Sokolov, V.V.: On the symmetries of evolution equations. Russian Math. Surveys 43:5, 165–204 (1988)
41. Sokolov, V.V., Shabat, A.B.: Classification of integrable evolution equations. Soviet Math. Phys. Reviews C 4, 221–280 (1984)
42. Sokolov, V.V., Svinolupov, S.I.: Vector–matrix generalizations of classical integrable equations. Theor. Math. Phys. 100, 959–962 (1994)
43. Sokolov, V.V., Svinolupov, S.I.: Deformations of nonassociative algebras and integrable differential equations. Acta Appl. Math. 41, 323–339 (1995)
44. Sokolov, V.V., Svinolupov, S.I.: Deformations of Jordan triple systems and integrable equations. Theor. Math. Phys. 108, 388–392 (1996)
45. Svinolupov, S.I.: On the analogues of the Burgers equation. Phys. Lett. A 135, 32–36 (1989)
46. Svinolupov, S.I.: Jordan algebras and generalized KdV equations. Theor. Math. Phys. 87, 611–620 (1991)
47. Svinolupov, S.I.: Generalized Schrödinger equations and Jordan pairs. Commun. Math. Phys. 143, 559–575 (1992)
48. Svinolupov, S.I.: Jordan algebras and integrable systems. Func. Anal. Appl. 27, 257–265 (1993)
49. Walcher, S.: Algebras and Differential Equations. Hadronic Press, Palm Harbor, Fla., 1991
50. Zhiber, A.V., Shabat, A.B.: Klein–Gordon equations with a nontrivial group. Sov. Phys. Dokl. 24, 607–609 (1979)

Communicated by T. Miwa
Commun. Math. Phys. 193, 269 – 285 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Modular Intersections of von Neumann Algebras in Quantum Field Theory

Hans-Werner Wiesbrock*

Institut für Theoretische Physik, FU Berlin, D-14195 Berlin, Germany. E-mail: [email protected]

* Supported by the DFG, SFB 288 “Differentialgeometrie und Quantenphysik”.

Received: 26 February 1996 / Accepted: 30 August 1997

Abstract: We show that modular intersections of von Neumann algebras occur naturally in quantum field theory. Examples are the local observable algebras associated with wedge regions that have a light ray in common, see also [Bo 2, Wi 3]. Conversely, starting from a set of four algebras lying in a specified modular position relative to each other, we construct a net of local observables of a 2+1 dimensional quantum field theory.

1. Preliminaries

In this article we investigate the intimate relationship between modular theory and quantum field theory. A fundamental feature of quantum theories is the Reeh-Schlieder property of the vacuum. Intuitively speaking, this property reflects the vacuum fluctuations of local quantum field theories. Mathematically, this feature enables one to apply the modular theory of operator algebras as developed by M. Tomita and M. Takesaki [Ta] to quantum field theory. It has been realized by J. Bisognano and E. Wichmann [Bi-Wich] that this rather general structure gets a very beautiful interpretation in terms of space-time symmetries of the underlying quantum field theory. We will review some of these results in the second part of this work.

In 1992 H.-J. Borchers [Bo 1] gave an abstract version of the Bisognano/Wichmann result which led to the introduction of a special type of inclusion of algebras, the so-called modular inclusions [Wi 1, Ar-Zs]. It turned out that inclusions of this type carry a strikingly rich symmetry structure, hidden in their modular theory [Wi 2]. They led to a new characterization of chiral quantum field theories by one special pair of algebras lying in an appropriate relative position to each other w.r.t. their modular data [Wi 4, Gu-Lo-Wi]. In order to generalize these results to higher dimensional theories we introduced the notion of modular intersections of algebras in [Wi 3]. Within this setting one recovers
almost all features of modular inclusions except for the spectrum condition, which reflected the inclusion order in the former case.

In Sect. 3 we will show how one can use these structures to construct a (2+1)-dimensional quantum field theory starting from a set of four algebras lying in an appropriate relative position to each other w.r.t. their modular data.

As mentioned already, our results are mainly based on two structures, namely modular inclusions and modular intersections of algebras. Therefore we first recall the basic notions of half-sided modular inclusions and modular intersections and review the important results needed in the following. We will often have several von Neumann algebras $\mathcal{M}, \mathcal{N}, \ldots$ together with a common cyclic and separating vector $\Omega$. As a notational convention we will index the associated modular objects by the related algebra, e.g. $\Delta_{\mathcal{M}}$, resp. $J_{\mathcal{M}}$, will denote the modular operator, resp. conjugation, associated with $(\mathcal{M}, \Omega)$, etc.

Definition 1.
a) Let $\mathcal{N} \subset \mathcal{M}$ be von Neumann algebras acting on a Hilbert space $\mathcal{H}$. Let $\Omega$ be a common cyclic and separating vector. Denote by $\sigma^t_{\mathcal{M}}$ the modular group associated with $(\mathcal{M}, \Omega)$. If $\sigma^t_{\mathcal{M}}(\mathcal{N}) \subset \mathcal{N}$ for all $t \ge 0$ (or $t \le 0$), we call $(\mathcal{N} \subset \mathcal{M}, \Omega)$ a $\pm$-half-sided modular inclusion, respectively, abbreviated by $\pm$-hsm inclusion.
b) Let $\mathcal{N}, \mathcal{M}$ be von Neumann algebras acting on a Hilbert space $\mathcal{H}$ and let $\Omega \in \mathcal{H}$ be a common cyclic and separating vector which is also cyclic for $\mathcal{N} \cap \mathcal{M}$. If
I) $((\mathcal{N} \cap \mathcal{M}) \subset \mathcal{N}, \Omega)$ and $((\mathcal{N} \cap \mathcal{M}) \subset \mathcal{M}, \Omega)$ are $\pm$-hsm inclusions,
II) $J_{\mathcal{N}} \bigl( \text{s-}\lim_{t \to \mp\infty} \Delta^{it}_{\mathcal{N}} \Delta^{-it}_{\mathcal{M}} \bigr) J_{\mathcal{N}} = \text{s-}\lim_{t \to \mp\infty} \Delta^{it}_{\mathcal{M}} \Delta^{-it}_{\mathcal{N}}$,
then we say that such a triple $(\mathcal{N}, \mathcal{M}, \Omega)$ has the $(\pm)$ modular intersection property, abbreviated by $\pm$ mi property.

Note that condition I implies that the strong limit in II exists. The main theorem for such pairs of algebras is the following.¹

Theorem 2 (Wi 1, Ar-Zs, Wi 3). Let $\mathcal{N}$ and $\mathcal{M}$ fulfill a) or b) of the above definition with $\pm$-sign. Then
a) $\frac{1}{2\pi}\bigl(\ln(\Delta_{\mathcal{N}}) - \ln(\Delta_{\mathcal{M}})\bigr)$ is essentially selfadjoint (in the case of modular inclusions even positive).
Denote by $U(a)$, $a \in \mathbb{R}$, the unitary group on $\mathcal{H}$ whose generator is the closure $\overline{\frac{1}{2\pi}\bigl(\ln(\Delta_{\mathcal{N}}) - \ln(\Delta_{\mathcal{M}})\bigr)}$. Then
b) $\Delta^{it}_{\mathcal{M}} U(a) \Delta^{-it}_{\mathcal{M}} = \Delta^{it}_{\mathcal{N}} U(a) \Delta^{-it}_{\mathcal{N}} = U(e^{\pm 2\pi t} a)$ for all $t, a \in \mathbb{R}$,
c) $J_{\mathcal{M}} U(a) J_{\mathcal{M}} = J_{\mathcal{N}} U(a) J_{\mathcal{N}} = U(-a)$ for all $a \in \mathbb{R}$,
d) $\mathcal{N} = U(1) \mathcal{M} U(-1)$, especially $J_{\mathcal{N}} = U(1) J_{\mathcal{M}} U(-1)$.

¹ There was an error in the original proof of statement a) for the case of hsm inclusions in [Wi 1], as was noticed by H. Araki and L. Zsido. They also provided a way of filling the gap [Ar-Zs].

From c) and d) we immediately conclude $U(2) = J_{\mathcal{N}} J_{\mathcal{M}}$ and, with b), $\Delta^{it}_{\mathcal{M}} \Delta^{-it}_{\mathcal{N}} = U(1 - e^{\pm 2\pi t})$.

Let us introduce a convenient notation for the various unitary groups arising in Theorem 2. In the case of a $\pm$-hsm inclusion $(\mathcal{N} \subset \mathcal{M}, \Omega)$, we denote the associated unitary group by $U_{\mathcal{N} \subset_\pm \mathcal{M}}$, in the case of modular intersections by $U_{\mathcal{N} \cap_\pm \mathcal{M}}$. Using this notation, statement d) of Theorem 2 reads as $\mathcal{N} = \mathrm{Ad}\, U_{\mathcal{N} \subset_\pm \mathcal{M}}(1)(\mathcal{M})$, etc.

For the construction of a quantum field theory from a finite set of algebras we will use space-time symmetry representations generated by modular operators. In 2+1 dimensions the homogeneous Lorentz group $SO^\uparrow(1,2)$ is isomorphic to $SL(2,\mathbb{R})/\mathbb{Z}_2$ and we will apply
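Theorem 3 (Wi 3, Theorem 6). Let $\mathcal{N}, \mathcal{M}, \mathcal{L}$ be von Neumann algebras acting on a Hilbert space $\mathcal{H}$ and let $\Omega \in \mathcal{H}$ be a common cyclic and separating vector. Assume the following modular intersection properties:
1. $(\mathcal{N}, \mathcal{M}, \Omega)$ has $-$mi,
2. $(\mathcal{L}, \mathcal{M}, \Omega)$ has $+$mi,
3. $(\mathcal{L}, \mathcal{N}', \Omega)$ has $-$mi,
where the prime indicates the commutant. Then for $t, r, s \in \mathbb{R}$ the one-parameter groups
\[ \Delta^{it}_{\mathcal{M}}, \quad \Delta^{ir}_{\mathcal{N}}, \quad \Delta^{is}_{\mathcal{L}} \]
generate a representation of the group $SL(2,\mathbb{R})/\mathbb{Z}_2 \cong SO^\uparrow(1,2)$.

In the proof of Proposition 11 below we will give a detailed description of that representation. We remark that Theorem 3 holds equally well if we replace the mi properties by hsm inclusions [Wi 2, Theorem 5]. We need a further relation which was obtained in the proof of this theorem.

Lemma 4 (Wi 3, Lemma 7).
\[ U_{\mathcal{M} \cap_- \mathcal{N}}(1) J_{\mathcal{N}} J_{\mathcal{L}} = U_{\mathcal{L} \cap_+ \mathcal{M}}(1)\, U_{\mathcal{M} \cap_- \mathcal{N}}(1)\, U_{\mathcal{N}' \cap_- \mathcal{L}}(1), \qquad \mathrm{Ad}\, U_{\mathcal{M} \cap_- \mathcal{N}}(1) J_{\mathcal{N}} (\mathcal{L}) = \mathcal{L}, \]
and by modular theory
\[ [\, U_{\mathcal{M} \cap_- \mathcal{N}}(1) J_{\mathcal{N}},\ J_{\mathcal{L}} \,] = 0. \]
The same statement holds if we replace $\mathcal{N}$ by $\mathcal{L}$, $\mathcal{M}$ by $\mathcal{N}'$ and $\mathcal{L}$ by $\mathcal{M}'$, respectively $\mathcal{N}$ by $\mathcal{M}'$, $\mathcal{M}$ by $\mathcal{L}'$ and $\mathcal{L}$ by $\mathcal{N}$.

After these preparations let us show that these abstract structures occur naturally in the context of (2+1)-dimensional quantum field theories.

2. Modular Intersections of von Neumann Algebras in 2+1 Dimensional Quantum Field Theory

Our framework will be the description of quantum field theories in terms of nets of local algebras, see [Haa]. Instead of working with unbounded field operators one uses algebras of bounded operators which might be viewed as bounded functions of the observables. Let us recall some basic assumptions on the set of algebras, which replace the familiar assumptions on the observable fields.

Let $\mathcal{A}(\mathcal{O}) \subset \mathcal{B}(\mathcal{H})$ be a net of von Neumann algebras indexed by the closed double cones $\mathcal{O} \subset \mathbb{R}^{1,2}$ which satisfies the following properties:
(i) $\mathcal{A}(\mathcal{O}_1) \subset \mathcal{A}(\mathcal{O}_2)$ if $\mathcal{O}_1 \subset \mathcal{O}_2$ (isotony),
(ii) $\mathcal{A}(\mathcal{O}_1) \subset \mathcal{A}(\mathcal{O}_2)'$ if $\mathcal{O}_1 \subset \mathcal{O}_2'$ (locality), where $\mathcal{A}(\mathcal{O})'$ denotes the commutant of $\mathcal{A}(\mathcal{O})$ in $\mathcal{B}(\mathcal{H})$ and $\mathcal{O}'$ is the causal complement of $\mathcal{O} \subset \mathbb{R}^{1,2}$,
(iii) $\exists\, \mathcal{U} : SO^\uparrow(1,2) \ltimes \mathbb{R}^{1,2} \to \mathcal{U}(\mathcal{H})$ (Poincaré covariance)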
is a unitary representation of the Poincaré group with positive energy, and
(iv) $\exists$ a unique $\mathcal{U}$-invariant vector $\Omega \in \mathcal{H}$ (vacuum).

The algebra $\mathcal{A}(\mathcal{O})$ is called the algebra of observables localized in $\mathcal{O} \subset \mathbb{R}^{1,2}$. Such a net describes a (2+1)-dimensional quantum field theory in the algebraic approach.

The famous Reeh-Schlieder property of the vacuum states that $\Omega$ is cyclic and separating for a local algebra $\mathcal{A}(\mathcal{O})$ whenever the causal complement of the localization region $\mathcal{O}$ contains an open subset. Hence, we can apply the modular theory of Tomita-Takesaki to $(\mathcal{A}(\mathcal{O}), \Omega)$. For some special regions, the so-called wedges, these abstract algebraic structures have nice physical and geometrical interpretations. A wedge $W[l_1, l_2]$ is determined by two linearly independent lightlike vectors $l_1, l_2 \in \mathbb{R}^{1,2}$ belonging to the forward lightcone. Using the convenient notation introduced in [Bo 2], it is defined by
\[ W[l_1, l_2] := \{ \alpha l_1 + \beta l_2 + l^\perp \in \mathbb{R}^{1,2} \mid \alpha > 0,\ \beta < 0,\ (l^\perp, l_i) = 0,\ i = 1, 2 \}, \]
where $(\cdot, \cdot)$ denotes the Minkowski product.

If the local net is generated by Wightman fields, J. Bisognano and E. Wichmann [Bi-Wich] showed that the Tomita-Takesaki modular groups associated with algebras of observables localized in such wedges act as Lorentz boosts. More specifically, one gets the one-parameter group of Lorentz boosts $\Lambda_{l_1,l_2}(t)$, $t \in \mathbb{R}$, leaving the set $W[l_1, l_2]$ invariant. For example, if $l_1 = (1, 1, 0)$ and $l_2 = (1, -1, 0)$ then
\[ \Lambda_{l_1,l_2}(t) = \begin{pmatrix} \cosh 2\pi t & \sinh 2\pi t & 0 \\ \sinh 2\pi t & \cosh 2\pi t & 0 \\ 0 & 0 & 1 \end{pmatrix}. \]
The precise relation is given by
\[ \Delta^{it}_{\mathcal{A}(W[l_1,l_2])} = \mathcal{U}(\Lambda_{l_1,l_2}(-2\pi t)), \]
where $\Delta_{\mathcal{A}(W[l_1,l_2])}$ denotes the modular operator associated with $(\mathcal{A}(W[l_1, l_2]), \Omega)$.

In [Bo 2] H.-J. Borchers looked at various relations between the Lorentz boosts in the directions of different wedges. In his investigation he made the following observation.

Lemma 5. For all $l_1$, $l_2$ and $l_3$ as above
a) $W[l_1, l_2] \cap W[l_1, l_3]$ contains an open neighborhood,
b) $\Lambda_{l_1,l_2}(t)\bigl(W[l_1, l_2] \cap W[l_1, l_3]\bigr) \subset \bigl(W[l_1, l_2] \cap W[l_1, l_3]\bigr)$ for $t \ge 0$, and similarly if we replace $\Lambda_{l_1,l_2}$ by $\Lambda_{l_1,l_3}$.

In the case where the Bisognano-Wichmann result holds, the modular conjugations of the wedges also act geometrically, namely as reflections. More precisely, the defining lightlike vectors are reflected whereas the orthogonal spacelike line is kept fixed. Denote this reflection by $\hat{J}_{l_1,l_2}$. Then an exercise shows

Lemma 6.
\[ \hat{U} := \lim_{t \to \infty} \Lambda_{l_1,l_2}(t) \Lambda_{l_1,l_3}(-t) \]
exists and
\[ \hat{J}_{l_1,l_2}\, \hat{U}\, \hat{J}_{l_1,l_2} = \hat{U}^{-1} \equiv \lim_{t \to \infty} \Lambda_{l_1,l_3}(t) \Lambda_{l_1,l_2}(-t). \]
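Proof. These statements are Lorentz-invariant. Let us take $l_1 = (1, 1, 0)$, $l_2 = (1, -1, 0)$, $l_3 = (2, 0, 2)$. With this choice the computations in [Bo 2, (2.5) and Corollary 4.3] show
\[ \hat{U} = \begin{pmatrix} \tfrac{3}{2} & -\tfrac{1}{2} & -1 \\ \tfrac{1}{2} & \tfrac{1}{2} & -1 \\ -1 & 1 & 1 \end{pmatrix}. \]
Since
\[ \hat{J}_{l_1,l_2} = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \]
the second statement reduces to a simple matrix calculation.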
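The matrix calculation mentioned in the proof can be sketched in a few lines of exact rational arithmetic. The helper names below (`matmul`, `transpose`, `apply`) are illustrative and not taken from the paper; the entries of $\hat{U}$ are the ones read off from the proof of Lemma 6. The sketch checks that $\hat{U}$ preserves the Minkowski metric, fixes $l_1$, and satisfies $\hat{J}_{l_1,l_2} \hat{U} \hat{J}_{l_1,l_2} = \hat{U}^{-1}$; it also records the additional observation that $\hat{U}$ maps $l_3$ to $l_2$.

```python
from fractions import Fraction as F

ETA = [[1, 0, 0], [0, -1, 0], [0, 0, -1]]      # Minkowski metric, signature (+, -, -)
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    """Transpose of a 3x3 matrix."""
    return [[a[j][i] for j in range(3)] for i in range(3)]

def apply(a, v):
    """Apply a 3x3 matrix to a vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

# Matrix U-hat as read off from the proof of Lemma 6 (exact rational entries).
U_HAT = [[F(3, 2), F(-1, 2), -1],
         [F(1, 2), F(1, 2), -1],
         [-1, 1, 1]]
# Reflection J-hat_{l1,l2}: reverses l1 and l2, fixes the orthogonal spacelike axis.
J_HAT = [[-1, 0, 0], [0, -1, 0], [0, 0, 1]]

L1, L2, L3 = [1, 1, 0], [1, -1, 0], [2, 0, 2]

# U-hat is a Lorentz transformation: U^T eta U = eta.
is_lorentz = matmul(transpose(U_HAT), matmul(ETA, U_HAT)) == ETA
# U-hat fixes the common light ray l1 and maps l3 to l2.
fixes_l1 = apply(U_HAT, L1) == L1
maps_l3_to_l2 = apply(U_HAT, L3) == L2
# J U J is the inverse of U, i.e. (J U J) U = identity.
conjugate = matmul(J_HAT, matmul(U_HAT, J_HAT))
reflection_inverts = matmul(conjugate, U_HAT) == IDENTITY

print(is_lorentz, fixes_l1, maps_l3_to_l2, reflection_inverts)
```

Exact fractions are used so that the identities are checked without rounding; this only tests the stated matrices and does not replace the limit computation in [Bo 2].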
Putting these observations together we obtain the following important result.

Proposition 7. Let $\mathcal{A}(\mathcal{O})$, $\mathcal{O} \subset \mathbb{R}^{1,2}$, be the local observable algebras of a 2+1 dim. quantum field theory for which the Bisognano/Wichmann property holds. Then $(\mathcal{A}(W[l_1, l_2]), \mathcal{A}(W[l_1, l_3]), \Omega)$ has the $-$modular intersection property.

Proof. See [Bo 2, Lemma 6]. The assumptions state that the modular groups and conjugations act geometrically. Therefore it is enough to look at the geometrical picture. The statement is then a reformulation of Lemma 5 and Lemma 6.

If instead of $l_1$ the above wedges have the second lightlike vector in common, then we similarly have the $+$modular intersection property. As an immediate corollary of Theorem 3 we get

Theorem 8. Let $\mathcal{A}(\mathcal{O})$, $\mathcal{O} \subset \mathbb{R}^{1,2}$, be a local net fulfilling the Bisognano/Wichmann property for wedges and put $\mathcal{N} := \mathcal{A}(W[l_1, l_2])$, $\mathcal{M} := \mathcal{A}(W[l_1, l_3])$, $\mathcal{L} := \mathcal{A}(W[l_2, l_3])$. Then we have the following intersection properties:
1. $(\mathcal{N}, \mathcal{M}, \Omega)$ has $-$mi,
2. $(\mathcal{L}, \mathcal{M}, \Omega)$ has $+$mi,
3. $(\mathcal{L}, \mathcal{N}', \Omega)$ has $-$mi,
where the prime indicates the commutant. Especially
\[ \Delta^{it}_{\mathcal{M}}, \quad \Delta^{ir}_{\mathcal{N}}, \quad \Delta^{is}_{\mathcal{L}}, \qquad t, r, s \in \mathbb{R}, \]
generate a representation of the 2+1 dim. Lorentz group $SO^\uparrow(1,2) \cong SL(2,\mathbb{R})/\mathbb{Z}_2$.

Proof. Due to the Bisognano/Wichmann property we have $\mathcal{A}(W[l_1, l_2])' = \mathcal{A}(W[l_2, l_1])$. The theorem now follows from Theorem 3 and Proposition 7 above.

To see how space-time translations arise by modular theory we look at translated wedges. Denote by
\[ W[l_1, l_2, a] := \{ x \in \mathbb{R}^{1,2} \mid x - a \in W[l_1, l_2] \} \]
the wedge translated by $a \in \mathbb{R}^{1,2}$. Then we have
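Lemma 9. $(\mathcal{A}(W[l_1, l_2, \alpha l_1]) \subset \mathcal{A}(W[l_1, l_2]), \Omega)$ is a $-$hsm inclusion for $\alpha > 0$.

Proof (see [Bo 2]). Denote by $\mathcal{U}(\alpha l_1)$, $\alpha \in \mathbb{R}$, the translations in $l_1$-direction. Due to the spectrum condition the generator is positive. Moreover,
\[ \mathrm{Ad}\, \mathcal{U}(\alpha l_1)\bigl(\mathcal{A}(W[l_1, l_2])\bigr) = \mathcal{A}(W[l_1, l_2, \alpha l_1]) \subset \mathcal{A}(W[l_1, l_2]) \quad \text{for } \alpha > 0, \]
and $\mathcal{U}(\alpha l_1)\Omega = \Omega$. Therefore we can apply H.-J. Borchers' result [Bo 1] to get the commutation relation
\[ \mathrm{Ad}\, \Delta^{it}_{l_1,l_2}\bigl(\mathcal{U}(\alpha l_1)\bigr) = \mathcal{U}(e^{-2\pi t} \alpha l_1), \quad \forall t \in \mathbb{R}, \]
where $\Delta_{l_1,l_2}$ denotes the modular operator associated with $(\mathcal{A}(W[l_1, l_2]), \Omega)$. This implies
\[ \mathrm{Ad}\, \Delta^{it}_{l_1,l_2}\bigl(\mathcal{A}(W[l_1, l_2, \alpha l_1])\bigr) = \mathcal{A}(W[l_1, l_2, e^{-2\pi t} \alpha l_1]), \]
and therefore the statement.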
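The geometry behind Lemma 9 can be illustrated on sample points. The sketch below assumes the frame $l_1 = (1,1,0)$, $l_2 = (1,-1,0)$ and parametrizes the boost of $W[l_1, l_2]$ by its scale factor $s$ on $l_1$ (so $s$ plays the role of $e^{-2\pi t}$ in the proof); the function names and the sample grid are illustrative choices, not part of the paper. It checks that $W[l_1, l_2, l_1] \subset W[l_1, l_2]$ on the grid, and that the boost with scale factor $s$ maps $W[l_1, l_2, \alpha l_1]$ onto $W[l_1, l_2, s\alpha l_1]$.

```python
from fractions import Fraction as F

L1, L2 = [1, 1, 0], [1, -1, 0]

def mink(x, y):
    """Minkowski product with signature (+, -, -)."""
    return x[0] * y[0] - x[1] * y[1] - x[2] * y[2]

def coefficients(x):
    """Coefficients (alpha, beta) of l1 and l2 in x = alpha*l1 + beta*l2 + l_perp."""
    n = F(mink(L1, L2))
    return mink(x, L2) / n, mink(x, L1) / n

def in_wedge(x, shift=0):
    """True if x lies in W[l1, l2, shift*l1], i.e. x - shift*l1 lies in W[l1, l2]."""
    alpha, beta = coefficients([x[k] - shift * L1[k] for k in range(3)])
    return alpha > 0 and beta < 0

def boost(x, s):
    """Boost of W[l1, l2] with scale factor s: l1 -> s*l1, l2 -> l2/s, l_perp fixed."""
    alpha, beta = coefficients(x)
    return [x[k] + (s - 1) * alpha * L1[k] + (F(1) / s - 1) * beta * L2[k]
            for k in range(3)]

# Sample grid of rational points (an arbitrary illustrative choice).
points = [[F(t, 2), F(x, 2), F(y)] for t in range(-6, 7)
          for x in range(-6, 7) for y in (-1, 0, 1)]

# Points of the translated wedge W[l1, l2, l1] on the grid.
shifted = [p for p in points if in_wedge(p, 1)]
inclusion_holds = all(in_wedge(p) for p in shifted)

# The boost with scale factor 2 maps W[l1, l2, l1] onto W[l1, l2, 2*l1].
covariance_holds = all(in_wedge(boost(p, F(2)), 2) == in_wedge(p, 1) for p in points)

print(len(shifted), inclusion_holds, covariance_holds)
```

For scale factors $s \ge 1$ the image wedge is contained in the original translated wedge, which is the geometric content of the half-sided invariance.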
Notice that Lemma 9 holds independently of the Bisognano/Wichmann property. Proposition 7 and Lemma 9 show that modular inclusions and modular intersections of von Neumann algebras occur naturally in (2+1)-dimensional quantum field theories, and similarly in (3+1)-dimensional theories.

3. On the Reconstruction of a 2 + 1-Dimensional Quantum Field Theory

In the following we conversely start from a finite set of algebras lying in a specific relative position w.r.t. their modular data. From these data we then construct a unitary representation of the 2+1 dimensional Poincaré group fulfilling the spectrum condition. To get the Abelian translation group we need the following result.

Lemma 10 (Wi 2, Theorem 2). Assume
1. $(\mathcal{M}_1 \subset \mathcal{M}, \Omega)$ is a $-$hsm inclusion,
2. $(\mathcal{M}_2 \subset \mathcal{M}, \Omega)$ is a $+$hsm inclusion,
3. $[\, J_{\mathcal{M}_1} J_{\mathcal{M}},\ J_{\mathcal{M}} J_{\mathcal{M}_2} \,] = 0$.
Then the two unitary groups associated with the hsm inclusions 1 and 2 commute. Furthermore these “translations” fulfill the spectrum condition.

Applying $\mathrm{Ad}\, \Delta^{it}_{\mathcal{M}}$ to the equation in Assumption 3 and taking products we see that only
\[ [\, U_{\mathcal{M}_1 \subset_- \mathcal{M}}(a),\ U_{\mathcal{M}_2 \subset_+ \mathcal{M}}(b) \,] = 0 \quad \text{for some } a, b \in \mathbb{R} \setminus \{0\} \]
is needed, where $U_{\mathcal{M}_i \subset_\pm \mathcal{M}}$ denote the unitary groups associated with the hsm inclusions $(\mathcal{M}_i \subset \mathcal{M}, \Omega)$. With these preparations we can show

Proposition 11. Let $\mathcal{N}$, $\mathcal{M}$, $\mathcal{L}$ and $\hat{\mathcal{N}}$ be von Neumann algebras acting on a Hilbert space $\mathcal{H}$ with a common cyclic and separating vector $\Omega$. Assume
I. 1. $(\mathcal{N}, \mathcal{M}, \Omega)$ has the $-$mi property,
   2. $(\mathcal{L}, \mathcal{M}, \Omega)$ has the $+$mi property,
   3. $(\mathcal{L}, \mathcal{N}', \Omega)$ has the $-$mi property.
II. 1. $(\hat{\mathcal{N}} \subset \mathcal{N}, \Omega)$ is a $-$hsm inclusion,
   2. $\mathrm{Ad}\, J_{\mathcal{M}}(J_{\hat{\mathcal{N}}} J_{\mathcal{N}}) = J_{\mathcal{N}} J_{\hat{\mathcal{N}}}$,
   3. $[\, \mathrm{Ad}\, J_{\mathcal{L}}(J_{\hat{\mathcal{N}}} J_{\mathcal{N}}),\ J_{\hat{\mathcal{N}}} J_{\mathcal{N}} \,] = 0$.
Then we get a unitary representation of the (2+1)-dimensional Lorentz group $SO^\uparrow(1,2)$ and of the translation group $\mathbb{R}^{1,2}$ by modular operators. (For a detailed description of these representations see the proof below.)

Before coming to the proof let us comment on the various conditions. First, according to Theorem 3, Conditions I give us a unitary representation of the (2+1)-dimensional homogeneous Lorentz group. Comparison with the example of (2+1)-dimensional quantum field theory suggests that the hsm inclusion in Condition II could be thought of as giving us a representation of translations along some light ray, see Lemma 9. Then the product of the two modular conjugations in Assumption II.2 would be a finite translation of this kind, see the remarks after Theorem 2. Due to the result of Bisognano/Wichmann [Bi-Wich] the modular conjugations of the wedge algebras act as reflections. At least they map translations to translations. Conditions II.2 and II.3 should be interpreted as reminiscent of these expected properties.

Proof. Due to Assumptions I.1-3, Theorem 3 provides us with a unitary representation of $SL(2,\mathbb{R})/\mathbb{Z}_2 \cong SO^\uparrow(1,2)$. We fix the representation of the Lorentz group by choosing a Lorentz frame: Let $l_1 = (1, 1, 0)$, $l_2 = (1, -1, 0)$, $l_3 = (2, 0, 2)$. These are three linearly independent lightlike vectors in $\mathbb{R}^{1,2}$, pointing to the future. Denote by $\Lambda_{l_i,l_j}(t) \in SO^\uparrow(1,2)$ the Lorentz boosts associated with the wedges $W[l_i, l_j]$ as described above. By definition, these boosts leave the sets $W[l_i, l_j]$ invariant and they generate the whole Lorentz group. We define
\[ U_{\mathrm{Lor}}(\Lambda_{l_1,l_2}(t)) := \Delta_{\mathcal{N}}^{-i\frac{t}{2\pi}}, \qquad U_{\mathrm{Lor}}(\Lambda_{l_1,l_3}(t)) := \Delta_{\mathcal{M}}^{-i\frac{t}{2\pi}}, \qquad U_{\mathrm{Lor}}(\Lambda_{l_2,l_3}(t)) := \Delta_{\mathcal{L}}^{-i\frac{t}{2\pi}}. \]
In this way we fix the unitary representation of $SO^\uparrow(1,2)$ on $\mathcal{H}$ [Wi 3] so that we get a unitary representation $U_{\mathrm{Lor}} : SO^\uparrow(1,2) \to \mathcal{U}(\mathcal{H})$.

In the framework of Bisognano and Wichmann, the algebras $\mathcal{N}$, $\mathcal{M}$, and $\mathcal{L}$ could be thought of as being associated with the localization regions $W[l_1, l_2]$, $W[l_1, l_3]$ and $W[l_2, l_3]$. We will use this picture as a guiding line through the various arguments during the proof.

Before coming to the translations let us note the following. Since the $-$mi property for $(\mathcal{N}, \mathcal{M}, \Omega)$ is equivalent to the $+$mi property for $(\mathcal{N}', \mathcal{M}', \Omega)$ [Wi 3], Assumption I is symmetric in $(\mathcal{N}, \mathcal{M}, \mathcal{L})$; more precisely, with $(\mathcal{N}, \mathcal{M}, \mathcal{L})$ also $(\mathcal{L}, \mathcal{N}', \mathcal{M}')$ and $(\mathcal{M}', \mathcal{L}', \mathcal{N})$ fulfill Assumption I. We will exhibit this symmetry by introducing
\[ \Gamma_{\mathcal{N}} := U_{\mathcal{L} \cap_+ \mathcal{M}}(1) J_{\mathcal{M}} = U_{\mathcal{M} \cap_+ \mathcal{L}}(1) J_{\mathcal{L}}, \]
\[ \Gamma_{\mathcal{M}} := U_{\mathcal{N}' \cap_- \mathcal{L}}(1) J_{\mathcal{L}} = U_{\mathcal{L} \cap_- \mathcal{N}'}(1) J_{\mathcal{N}}, \]
\[ \Gamma_{\mathcal{L}} := U_{\mathcal{N} \cap_- \mathcal{M}}(1) J_{\mathcal{M}} = U_{\mathcal{M} \cap_- \mathcal{N}}(1) J_{\mathcal{N}}. \]
Then $\Gamma_{\mathcal{N}}$ is involutive, since
\[ \Gamma_{\mathcal{N}}^2 = U_{\mathcal{L} \cap_+ \mathcal{M}}(1) J_{\mathcal{M}} U_{\mathcal{L} \cap_+ \mathcal{M}}(1) J_{\mathcal{M}} = U_{\mathcal{L} \cap_+ \mathcal{M}}(1) U_{\mathcal{L} \cap_+ \mathcal{M}}(-1) = 1, \]
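where we have used Theorem 2 c). This obviously also holds for $\Gamma_{\mathcal{M}}$ and $\Gamma_{\mathcal{L}}$. Moreover, Lemma 4 states
\[ \mathrm{Ad}\, \Gamma_{\mathcal{N}}(\mathcal{N}) = \mathcal{N}, \qquad \mathrm{Ad}\, \Gamma_{\mathcal{N}}(\Delta^{it}_{\mathcal{N}}) = \Delta^{-it}_{\mathcal{N}}, \quad t \in \mathbb{R}, \]
and looking at the definitions and Theorem 2 d) we see
\[ \mathrm{Ad}\, \Gamma_{\mathcal{N}}(\mathcal{L}) = \mathcal{M}', \qquad \mathrm{Ad}\, \Gamma_{\mathcal{N}}(\mathcal{M}) = \mathcal{L}'. \]
Modular theory tells us that in this case
\[ \mathrm{Ad}\, \Gamma_{\mathcal{N}}(\Delta^{it}_{\mathcal{M}}) = \Delta^{it}_{\mathcal{L}}, \qquad \mathrm{Ad}\, \Gamma_{\mathcal{N}}(\Delta^{it}_{\mathcal{L}}) = \Delta^{it}_{\mathcal{M}}, \quad t \in \mathbb{R} \tag{1} \]
holds. These facts motivated our index notation. Similar results follow for $\Gamma_{\mathcal{M}}$, $\Gamma_{\mathcal{L}}$.

After these preparations let us come to the translations. Due to Lemma 9 we might interpret the unitary group associated with the hsm inclusion $(\hat{\mathcal{N}} \subset \mathcal{N}, \Omega)$ of Condition II as translations in the $l_1$-direction. We therefore introduce the notation
\[ U_{\mathrm{trans},l_1}(a) := U_{\hat{\mathcal{N}} \subset_- \mathcal{N}}(a). \]
Note that $U_{\mathrm{trans},l_1}$ fulfills the spectrum condition, see Theorem 2. To get the translations in the $l_2$ and $l_3$ directions we will use the Lorentz boost representation. An easy group theoretical computation shows
\[ \Lambda_{l_1,l_2}(-\ln 2)\, \Lambda_{l_2,l_3}(-\ln 2)\, l_1 = l_3, \]
and with $t_0 = \frac{1}{2\pi} \ln 2$ we get as a candidate
\[ U_{\mathrm{trans},l_3}(a) := \mathrm{Ad}\, U_{\mathrm{Lor}}\bigl(\Lambda_{l_1,l_2}(-\ln 2) \Lambda_{l_2,l_3}(-\ln 2)\bigr)\bigl(U_{\mathrm{trans},l_1}(a)\bigr) = \mathrm{Ad}\, \Delta^{-it_0}_{\mathcal{N}'} \Delta^{it_0}_{\mathcal{L}}\bigl(U_{\mathrm{trans},l_1}(a)\bigr) = \mathrm{Ad}\, U_{\mathcal{L} \cap_- \mathcal{N}'}(1)\bigl(U_{\mathrm{trans},l_1}(a)\bigr), \]
where in the last line we used again the remarks following Theorem 2. Similarly we make the Ansatz
\[ U_{\mathrm{trans},l_2}(a) := \mathrm{Ad}\, U_{\mathcal{L} \cap_+ \mathcal{M}}(1)\bigl(U_{\mathrm{trans},l_1}(a)\bigr). \]
We will show that these one-parameter unitary groups $U_{\mathrm{trans},l_i}$ indeed commute, such that they provide us with a unitary representation $U_{\mathrm{trans}}$ of $\mathbb{R}^{1,2}$. First we need

Lemma 12.
\[ \mathrm{Ad}\, U_{\mathrm{Lor}}(\Lambda_{l_i,l_j}(2\pi t))\bigl(U_{\mathrm{trans},l_i}(a)\bigr) = U_{\mathrm{trans},l_i}(e^{2\pi t} a) \quad \forall a, t \in \mathbb{R},\ i, j = 1, 2, 3. \tag{2} \]

Proof of Lemma 12. Let us begin with the translations in $l_1$-direction. Due to Theorem 2 we have
\[ U_{\mathrm{trans},l_1}(2) = J_{\hat{\mathcal{N}}} J_{\mathcal{N}} = U_{\hat{\mathcal{N}} \subset_- \mathcal{N}}(2), \]
and Assumption II.2 states that $\mathrm{Ad}\, J_{\mathcal{M}}$ acts as a reflection on this discrete translation. We get:
\[ \mathrm{Ad}\, U_{\mathcal{N} \cap_- \mathcal{M}}(-2)\bigl(U_{\mathrm{trans},l_1}(2)\bigr) = \mathrm{Ad}\, U_{\mathcal{N} \cap_- \mathcal{M}}(-2)\bigl(U_{\hat{\mathcal{N}} \subset_- \mathcal{N}}(2)\bigr) \]
\[ = \mathrm{Ad}\, J_{\mathcal{M}} J_{\mathcal{N}}(J_{\hat{\mathcal{N}}} J_{\mathcal{N}}) \]
\[ = \mathrm{Ad}\, J_{\mathcal{M}}(J_{\mathcal{N}} J_{\hat{\mathcal{N}}}) = J_{\hat{\mathcal{N}}} J_{\mathcal{N}} = U_{\mathrm{trans},l_1}(2), \]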
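where we used Theorem 2 c) in the second line and II.2 in the third line. Scaling this equation by applying $\mathrm{Ad}\, \Delta^{it}_{\mathcal{N}}$ to it and using the transformation rule Theorem 2 b) gives
\[ U_{\mathcal{N} \cap_- \mathcal{M}}(-2e^{-2\pi t})\, U_{\mathrm{trans},l_1}(2e^{-2\pi t})\, U_{\mathcal{N} \cap_- \mathcal{M}}(2e^{-2\pi t}) = \mathrm{Ad}\, \Delta^{it}_{\mathcal{N}}\bigl(U_{\mathcal{M} \cap_- \mathcal{N}}(2)\, U_{\mathrm{trans},l_1}(2)\, U_{\mathcal{M} \cap_- \mathcal{N}}(-2)\bigr) \]
\[ = \mathrm{Ad}\, \Delta^{it}_{\mathcal{N}}\bigl(U_{\mathrm{trans},l_1}(2)\bigr) = U_{\mathrm{trans},l_1}(2e^{-2\pi t}), \quad t \in \mathbb{R}. \]
Taking products and adjoints of this equation for arbitrary $t$ easily leads to
\[ \mathrm{Ad}\, U_{\mathcal{N} \cap_- \mathcal{M}}(-a)\bigl(U_{\mathrm{trans},l_1}(b)\bigr) = U_{\mathrm{trans},l_1}(b), \quad a, b \in \mathbb{R} \tag{3} \]
(see the proof of Th. 2 in [Wi 3] for example). But this implies
\[ \mathrm{Ad}\, U_{\mathrm{Lor}}(\Lambda_{l_1,l_3}(-2\pi t))\bigl(U_{\mathrm{trans},l_1}(a)\bigr) = \mathrm{Ad}\, \Delta^{it}_{\mathcal{M}}\bigl(U_{\mathrm{trans},l_1}(a)\bigr) \]
\[ = \mathrm{Ad}\, \Delta^{it}_{\mathcal{M}} \Delta^{-it}_{\mathcal{N}}\bigl(U_{\mathrm{trans},l_1}(e^{-2\pi t} a)\bigr) \]
\[ = \mathrm{Ad}\, U_{\mathcal{N} \cap_- \mathcal{M}}(1 - e^{-2\pi t})\bigl(U_{\mathrm{trans},l_1}(e^{-2\pi t} a)\bigr) = U_{\mathrm{trans},l_1}(e^{-2\pi t} a), \quad t, a \in \mathbb{R}, \]
where in the second line we have again used Theorem 2 b). The similar result for $U_{\mathrm{Lor}}(\Lambda_{l_1,l_2}(-2\pi t)) = \Delta^{it}_{\mathcal{N}}$ follows directly from the definition of $U_{\mathrm{trans},l_1}$ and Theorem 2 b), so that we have proven Lemma 12, (2), for $i = 1$.

To obtain the remaining translations we will exploit the symmetry in $(\mathcal{N}, \mathcal{M}, \mathcal{L})$. For example, by the definition of $U_{\mathrm{trans},l_2}$ we have
\[ \mathrm{Ad}\, \Gamma_{\mathcal{N}}\bigl(U_{\mathrm{trans},l_1}(a)\bigr) = \mathrm{Ad}\, U_{\mathcal{L} \cap_+ \mathcal{M}}(1)\bigl(U_{\mathrm{trans},l_1}(-a)\bigr) = U_{\mathrm{trans},l_2}(-a), \quad a \in \mathbb{R}. \tag{4} \]
Furthermore, as shown above, see (1), we know
\[ \mathrm{Ad}\, \Gamma_{\mathcal{N}}(\Delta^{it}_{\mathcal{M}}) = \Delta^{it}_{\mathcal{L}}, \qquad \mathrm{Ad}\, \Gamma_{\mathcal{N}}(\Delta^{it}_{\mathcal{L}}) = \Delta^{it}_{\mathcal{M}}, \quad t \in \mathbb{R}. \]
Using this, the symmetry interchanges $(l_1 \mapsto l_2,\ l_2 \mapsto l_1)$ and from the proven statement for $i = 1$ we get Lemma 12 (2) in the case $i = 2$. Next we have
\[ \mathrm{Ad}\, \Gamma_{\mathcal{M}}\bigl(U_{\mathrm{trans},l_1}(a)\bigr) = \mathrm{Ad}\, U_{\mathcal{L} \cap_- \mathcal{N}'}(1)\bigl(U_{\mathrm{trans},l_1}(-a)\bigr) = U_{\mathrm{trans},l_3}(-a), \quad a \in \mathbb{R}, \]
and with the analogue of (1) for $\Gamma_{\mathcal{M}}$ we can finish the proof of Lemma 12.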
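The geometric counterpart of Lemma 12 and of the "easy group theoretical computation" $\Lambda_{l_1,l_2}(-\ln 2)\Lambda_{l_2,l_3}(-\ln 2)\, l_1 = l_3$ can be checked directly. The sketch below is only an illustration under stated assumptions: the boost of the wedge $W[l_i, l_j]$ is parametrized by its scale factor $s = e^{t}$ on $l_i$ (so $\Lambda_{l_i,l_j}(-\ln 2)$ corresponds to $s = 1/2$), and the helper `boost` is a hypothetical name built from the characterization "scales $l_i$ by $s$, $l_j$ by $1/s$, fixes the orthogonal direction".

```python
from fractions import Fraction as F

L1, L2, L3 = [1, 1, 0], [1, -1, 0], [2, 0, 2]
HALF = F(1, 2)   # scale factor e^{-ln 2}

def mink(x, y):
    """Minkowski product with signature (+, -, -)."""
    return x[0] * y[0] - x[1] * y[1] - x[2] * y[2]

def boost(li, lj, s, v):
    """Boost of the wedge W[li, lj] with scale factor s (a Fraction), applied to v.

    It maps li -> s*li and lj -> lj/s and fixes the direction orthogonal to both.
    """
    n = F(mink(li, lj))
    a = (s - 1) * mink(v, lj) / n        # change of the li-component of v
    b = (1 / s - 1) * mink(v, li) / n    # change of the lj-component of v
    return [v[k] + a * li[k] + b * lj[k] for k in range(3)]

# Counterpart of Lemma 12: the boost of W[li, lj] rescales li.
scaled = boost(L1, L3, F(3), L1)

# Counterpart of the frame computation: apply the (l2, l3)-boost first, then the (l1, l2)-boost.
image = boost(L1, L2, HALF, boost(L2, L3, HALF, L1))

# Boosts preserve the Minkowski product.
preserved = mink(boost(L2, L3, HALF, L1), boost(L2, L3, HALF, L2)) == mink(L1, L2)

print(scaled, image, preserved)
```

Working with the scale factor instead of $t$ keeps all quantities rational, so the identity $l_1 \mapsto l_3$ is verified exactly rather than up to rounding.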
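Note that (3) shows
\[ \mathrm{Ad}\, \Gamma_{\mathcal{L}}\bigl(U_{\mathrm{trans},l_1}(a)\bigr) = U_{\mathrm{trans},l_1}(-a), \quad a \in \mathbb{R}, \]
and similarly one gets for $a \in \mathbb{R}$,
\[ \mathrm{Ad}\, \Gamma_{\mathcal{N}}\bigl(U_{\mathrm{trans},l_3}(a)\bigr) = U_{\mathrm{trans},l_3}(-a), \qquad \mathrm{Ad}\, \Gamma_{\mathcal{M}}\bigl(U_{\mathrm{trans},l_2}(a)\bigr) = U_{\mathrm{trans},l_2}(-a). \tag{5} \]
We will use another simple observation. Due to Theorem 2 c) we have for $a \in \mathbb{R}$,
\[ J_{\mathcal{N}} U_{\mathrm{trans},l_1}(a) J_{\mathcal{N}} = J_{\mathcal{N}} U_{\hat{\mathcal{N}} \subset_- \mathcal{N}}(a) J_{\mathcal{N}} = U_{\mathrm{trans},l_1}(-a), \]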
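and
\[ \mathrm{Ad}\, J_{\mathcal{N}}\bigl(U_{\mathrm{trans},l_2}(a)\bigr) = \mathrm{Ad}\, J_{\mathcal{N}} \Gamma_{\mathcal{N}}\bigl(U_{\mathrm{trans},l_1}(-a)\bigr) = \mathrm{Ad}\, \Gamma_{\mathcal{N}} J_{\mathcal{N}}\bigl(U_{\mathrm{trans},l_1}(-a)\bigr) = U_{\mathrm{trans},l_2}(-a), \]
where we used Eq. (4) and Lemma 4. Applying $\mathrm{Ad}\, \Gamma_{\mathcal{M}}$, resp. $\mathrm{Ad}\, \Gamma_{\mathcal{L}}$, to it we get similar results for $J_{\mathcal{M}}$, resp. $J_{\mathcal{L}}$.

Proof of Proposition 11. Hence, we end up with three unitary one-parameter groups fulfilling the standard commutation relations with the $SO^\uparrow(1,2)$-representation. To show that they fit together to give a unitary representation of the translations $\mathbb{R}^{1,2}$ we are left to prove that these one-parameter groups pairwise commute. Due to unitary equivalence it is enough to check this for $U_{\mathrm{trans},l_2}(a)$, $U_{\mathrm{trans},l_3}(b)$, $a, b \in \mathbb{R}$. Exploiting the fact that $(\Gamma_{\mathcal{N}} \hat{\mathcal{N}} \Gamma_{\mathcal{N}} \subset \mathcal{N}, \Omega)$ is a $+$hsm inclusion, see (1), we can apply Lemma 10, according to which it is enough to prove this for one pair of nonzero values. We compute for $a = b = 4$, with $t_0 := \frac{1}{2\pi} \ln 2$,
\[ [\, U_{\mathrm{trans},l_3}(4),\ U_{\mathrm{trans},l_2}(4) \,] = \mathrm{Ad}\, U_{\mathcal{L} \cap_- \mathcal{N}'}(1)\bigl([\, U_{\mathrm{trans},l_1}(4),\ U_{\mathrm{trans},l_2}(4) \,]\bigr) \]
\[ = \mathrm{Ad}\, U_{\mathcal{L} \cap_- \mathcal{N}'}(1)\bigl([\, U_{\mathrm{trans},l_1}(4),\ \mathrm{Ad}\, U_{\mathcal{L} \cap_+ \mathcal{M}}(1)(U_{\mathrm{trans},l_1}(4)) \,]\bigr) \]
\[ = \mathrm{Ad}\, U_{\mathcal{L} \cap_- \mathcal{N}'}(1) \Delta^{-it_0}_{\mathcal{M}}\bigl([\, U_{\mathrm{trans},l_1}(2),\ \mathrm{Ad}\, U_{\mathcal{L} \cap_+ \mathcal{M}}(2)(U_{\mathrm{trans},l_1}(2)) \,]\bigr), \]
where we used the definitions and (3) in the first two lines, and the scaling property of $U_{\mathcal{L} \cap_+ \mathcal{M}}$, see Theorem 2 b), in the last one. Assumptions II.2 and II.3 imply
\[ [\, U_{\mathrm{trans},l_1}(2),\ \mathrm{Ad}\, U_{\mathcal{L} \cap_+ \mathcal{M}}(2)(U_{\mathrm{trans},l_1}(2)) \,] = [\, U_{\hat{\mathcal{N}} \subset_- \mathcal{N}}(2),\ \mathrm{Ad}\, U_{\mathcal{L} \cap_+ \mathcal{M}}(2)(U_{\hat{\mathcal{N}} \subset_- \mathcal{N}}(2)) \,] \]
\[ = [\, J_{\hat{\mathcal{N}}} J_{\mathcal{N}},\ \mathrm{Ad}\, J_{\mathcal{L}} J_{\mathcal{M}}(J_{\hat{\mathcal{N}}} J_{\mathcal{N}}) \,] = [\, J_{\hat{\mathcal{N}}} J_{\mathcal{N}},\ \mathrm{Ad}\, J_{\mathcal{L}}(J_{\mathcal{N}} J_{\hat{\mathcal{N}}}) \,] = 0, \]
i.e.
\[ U_{\mathrm{trans}} : \mathbb{R}^{1,2} \to \mathcal{U}(\mathcal{H}), \qquad (\alpha l_1 + \beta l_2 + \gamma l_3) \mapsto U_{\mathrm{trans},l_1}(\alpha)\, U_{\mathrm{trans},l_2}(\beta)\, U_{\mathrm{trans},l_3}(\gamma) \tag{6} \]
defines a unitary representation. This finishes the proof of Proposition 11.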
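Before coming to the Poincaré group let us remark

Lemma 13. $U_{\mathrm{trans}} : \mathbb{R}^{1,2} \to \mathcal{U}(\mathcal{H})$ is faithful.

Proof. First we show
\[ U_{\mathrm{trans}}|_{\mathbb{R} l_1 + \mathbb{R} l_2} \text{ is faithful.} \tag{7} \]
For this assume
\[ U_{\mathrm{trans},l_1}(a) = U_{\mathrm{trans},l_2}(b), \quad a, b \in \mathbb{R}. \]
Taking products and adjoints gives
\[ U_{\mathrm{trans},l_1}(na) = U_{\mathrm{trans},l_2}(nb), \quad \forall n \in \mathbb{Z}, \]
and scaling this relation by applying $\mathrm{Ad}\, \Delta^{it_m}_{\mathcal{N}}$ with $t_m = \frac{1}{2\pi} \ln m$ to it implies
\[ U_{\mathrm{trans},l_1}\bigl(\tfrac{n}{m} a\bigr) = U_{\mathrm{trans},l_2}(mnb), \quad \forall n, m \in \mathbb{Z}, \]
especially
\[ U_{\mathrm{trans},l_1}\bigl(\tfrac{2}{3} a\bigr) = U_{\mathrm{trans},l_2}(6b) = U_{\mathrm{trans},l_1}\bigl(\tfrac{3}{2} a\bigr). \]
Due to the nontriviality of $U_{\mathrm{trans},l_1} = U_{\hat{\mathcal{N}} \subset_- \mathcal{N}}$ and scaling invariance, see Theorem 2 b), we get $a = 0 = b$, i.e. statement (7). Using $\Gamma_{\mathcal{N}}$, $\Gamma_{\mathcal{M}}$, and $\Gamma_{\mathcal{L}}$ we easily see that $U_{\mathrm{trans}}$ is faithful on any $\mathbb{R} l_i + \mathbb{R} l_j$, $i, j = 1, 2, 3$.

Now assume
\[ U_{\mathrm{trans},l_1}(a)\, U_{\mathrm{trans},l_2}(b)\, U_{\mathrm{trans},l_3}(c) = 1 \]
with some $a, b, c \in \mathbb{R}$. Applying $\mathrm{Ad}\, \Gamma_{\mathcal{N}}$ to it, combined with the properties shown in (4) and (5), gives
\[ U_{\mathrm{trans},l_2}(-a)\, U_{\mathrm{trans},l_1}(-b)\, U_{\mathrm{trans},l_3}(-c) = 1, \]
and multiplying with the first equation implies
\[ U_{\mathrm{trans},l_1}(a - b)\, U_{\mathrm{trans},l_2}(b - a) = 1, \]
i.e. $a = b$. Using $\mathrm{Ad}\, \Gamma_{\mathcal{L}}$ shows similarly $b = c$, so that the assumption states
\[ U_{\mathrm{trans},l_1}(a) = U_{\mathrm{trans},l_2}(-a)\, U_{\mathrm{trans},l_3}(-a). \]
But this implies
\[ J_{\mathcal{L}} U_{\mathrm{trans},l_1}(a) J_{\mathcal{L}} = U_{\mathrm{trans},l_2}(a)\, U_{\mathrm{trans},l_3}(a) = U_{\mathrm{trans},l_1}(-a). \tag{8} \]
Now we observe
\[ J_{\mathcal{L}} U_{\mathrm{trans},l_1}(a) J_{\mathcal{L}} = J_{\mathcal{L}} J_{\mathcal{M}} U_{\mathrm{trans},l_1}(-a) J_{\mathcal{M}} J_{\mathcal{L}} = U_{\mathcal{L} \cap_+ \mathcal{M}}(2)\, U_{\mathrm{trans},l_1}(-a)\, U_{\mathcal{L} \cap_+ \mathcal{M}}(-2) \]
\[ = \Delta^{it_0}_{\mathcal{M}}\, U_{\mathcal{L} \cap_+ \mathcal{M}}(1)\, U_{\mathrm{trans},l_1}(-2a)\, U_{\mathcal{L} \cap_+ \mathcal{M}}(-1)\, \Delta^{-it_0}_{\mathcal{M}} \]
\[ = \Delta^{it_0}_{\mathcal{M}}\, U_{\mathrm{trans},l_2}(-2a)\, \Delta^{-it_0}_{\mathcal{M}}, \tag{9} \]
and combined with Eq. (8) this gives
\[ \mathrm{Ad}\, \Delta^{it_0}_{\mathcal{M}}\bigl(U_{\mathrm{trans},l_2}(-2a)\bigr) = J_{\mathcal{L}} U_{\mathrm{trans},l_1}(a) J_{\mathcal{L}} = U_{\mathrm{trans},l_1}(-a), \]
\[ U_{\mathrm{trans},l_2}(-2a) = \mathrm{Ad}\, \Delta^{-it_0}_{\mathcal{M}}\bigl(U_{\mathrm{trans},l_1}(-a)\bigr) = U_{\mathrm{trans},l_1}(-2a), \]
which implies $a = 0$ and finishes the proof.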
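Note that similarly as in (9) we have
\[ J_{\mathcal{L}} U_{\mathrm{trans},l_1}(a) J_{\mathcal{L}} = J_{\mathcal{L}} J_{\mathcal{N}} U_{\mathrm{trans},l_1}(-a) J_{\mathcal{N}} J_{\mathcal{L}} = \Delta^{it_0}_{\mathcal{N}}\, U_{\mathrm{trans},l_3}(-2a)\, \Delta^{-it_0}_{\mathcal{N}}. \tag{10} \]
Moreover, we have
\[ J_{\mathcal{N}} U_{\mathrm{trans},l_3}(a) J_{\mathcal{N}} = \Delta^{-it_0}_{\mathcal{M}}\, U_{\mathrm{trans},l_2}(-2a)\, \Delta^{it_0}_{\mathcal{M}} = \Delta^{-it_0}_{\mathcal{L}}\, U_{\mathrm{trans},l_1}(-2a)\, \Delta^{it_0}_{\mathcal{L}}, \]
\[ J_{\mathcal{M}} U_{\mathrm{trans},l_2}(a) J_{\mathcal{M}} = \Delta^{-it_0}_{\mathcal{N}}\, U_{\mathrm{trans},l_3}(-2a)\, \Delta^{it_0}_{\mathcal{N}} = \Delta^{it_0}_{\mathcal{L}}\, U_{\mathrm{trans},l_1}(-2a)\, \Delta^{-it_0}_{\mathcal{L}}. \tag{11} \]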
The next step in constructing a (2+1)-dimensional quantum field theory out of a set of 4 algebras is to show that the Lorentz group representation of Proposition 11 and the representation of the translations properly combine to give a representation of the Poincaré group of $\mathbb{R}^{1,2}$. For this we need as a further assumption Property III below.
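Relations (9)-(11) have a simple geometric counterpart that can be checked by hand or by machine. The sketch below rests on assumptions that go beyond what is proven at this point: it works purely in the Bisognano/Wichmann picture, where $J_{\mathcal{N}}, J_{\mathcal{M}}, J_{\mathcal{L}}$ act as the reflections of the wedges $W[l_1,l_2]$, $W[l_1,l_3]$, $W[l_2,l_3]$, where $\Delta^{\mp it_0}$ acts as the boost with scale factor $2^{\pm 1}$ (following the definition $U_{\mathrm{Lor}}(\Lambda(t)) := \Delta^{-it/2\pi}$), and where $U_{\mathrm{trans},l_i}(a)$ corresponds to the vector $a\, l_i$. The helpers `boost` and `reflect` are illustrative names.

```python
from fractions import Fraction as F

L1, L2, L3 = [1, 1, 0], [1, -1, 0], [2, 0, 2]
HALF, TWO = F(1, 2), F(2)

def mink(x, y):
    """Minkowski product with signature (+, -, -)."""
    return x[0] * y[0] - x[1] * y[1] - x[2] * y[2]

def boost(li, lj, s, v):
    """Boost of W[li, lj] with scale factor s: li -> s*li, lj -> lj/s."""
    n = F(mink(li, lj))
    a = (s - 1) * mink(v, lj) / n
    b = (1 / s - 1) * mink(v, li) / n
    return [v[k] + a * li[k] + b * lj[k] for k in range(3)]

def reflect(li, lj, v):
    """Reflection of W[li, lj]: li -> -li, lj -> -lj, orthogonal direction fixed."""
    n = F(mink(li, lj))
    return [v[k] - 2 * (mink(v, lj) / n * li[k] + mink(v, li) / n * lj[k])
            for k in range(3)]

def times(c, v):
    """Scalar multiple of a vector."""
    return [c * x for x in v]

# Counterpart of (9) and (10): J_L applied to the l1-translation.
jl_l1 = reflect(L2, L3, L1)
rel_9 = jl_l1 == times(-2, boost(L1, L3, HALF, L2))    # Delta_M^{it0} side
rel_10 = jl_l1 == times(-2, boost(L1, L2, HALF, L3))   # Delta_N^{it0} side

# Counterpart of the first line of (11): J_N applied to the l3-translation.
jn_l3 = reflect(L1, L2, L3)
rel_11a = (jn_l3 == times(-2, boost(L1, L3, TWO, L2))
           and jn_l3 == times(-2, boost(L2, L3, TWO, L1)))

# Counterpart of the second line of (11): J_M applied to the l2-translation.
jm_l2 = reflect(L1, L3, L2)
rel_11b = (jm_l2 == times(-2, boost(L1, L2, TWO, L3))
           and jm_l2 == times(-2, boost(L2, L3, HALF, L1)))

print(rel_9, rel_10, rel_11a, rel_11b)
```

The check only confirms that the signs of $t_0$ in (9)-(11) are consistent with the geometric picture for the chosen frame; the operator identities themselves follow from the modular arguments in the text.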
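Theorem 14. Let $\mathcal{N}$, $\mathcal{M}$, $\mathcal{L}$ and $\hat{\mathcal{N}}$ be von Neumann algebras acting on a Hilbert space $\mathcal{H}$ with a common cyclic and separating vector $\Omega$ so that Assumptions I and II of Proposition 11 are fulfilled. Denote by $U_{\mathrm{trans}}$ the representation of the translations $\mathbb{R}^{1,2}$ and assume further
III. To all $a \in \mathbb{R}^{1,2}$ there exists $b \in \mathbb{R}^{1,2}$ with $\mathrm{Ad}\, J_{\mathcal{N}}(U_{\mathrm{trans}}(a)) = U_{\mathrm{trans}}(b)$, i.e. $\mathrm{Ad}\, J_{\mathcal{N}}$ maps translations to translations.
Then the modular groups
\[ \Delta^{it}_{\mathcal{N}}, \quad \Delta^{ir}_{\mathcal{M}}, \quad \Delta^{is}_{\mathcal{L}} \quad \text{and} \quad \Delta^{iv}_{\hat{\mathcal{N}}}, \qquad t, r, s, v \in \mathbb{R}, \]
generate a unitary representation of the 2+1 dimensional Poincaré group with spectrum condition.

We will use the shortcut notation $\mathrm{Ad}\, J_{\mathcal{N}}(U_{\mathrm{trans}}(\mathbb{R}^{1,2})) = U_{\mathrm{trans}}(\mathbb{R}^{1,2})$ for the assumption made in III.

Proof. The idea of the proof goes as follows. First we will show that Assumption III implies for all $a \in \mathbb{R}$,
\[ J_{\mathcal{N}} U_{\mathrm{trans},l_3}(a) J_{\mathcal{N}} = U_{\mathrm{trans},l_1}(-2a)\, U_{\mathrm{trans},l_2}(-2a)\, U_{\mathrm{trans},l_3}(a). \]
Due to Eq. (11) together with Lemma 12 we thereby get the correct action (spin 1 action of $SO^\uparrow(1,2)$ on $\mathbb{R}^3$) of
\[ \mathrm{Ad}\, \Delta^{-it_0}_{\mathcal{M}} = \mathrm{Ad}\, U_{\mathrm{Lor}}(\Lambda_{l_1,l_3}(\ln 2)) \]
on the translations, where $t_0 = \frac{1}{2\pi} \ln 2$. Using the symmetry in the algebras we get this result for the other boosts, and taking products of these elements yields the correct action for a dense set of $SO^\uparrow(1,2)$. Strong continuity of the representation will finish the proof.

We use the notations introduced in the proof of Proposition 11. First one observes
\[ J_{\mathcal{N}} U_{\mathrm{trans},l_3}(a) J_{\mathcal{N}} = U_{\mathrm{trans},l_1}(\lambda_1 a)\, U_{\mathrm{trans},l_2}(\lambda_2 a)\, U_{\mathrm{trans},l_3}(\lambda_3 a), \quad \forall a \in \mathbb{R}, \tag{12} \]
with
\[ J_{\mathcal{N}} U_{\mathrm{trans},l_3}(1) J_{\mathcal{N}} = U_{\mathrm{trans},l_1}(\lambda_1)\, U_{\mathrm{trans},l_2}(\lambda_2)\, U_{\mathrm{trans},l_3}(\lambda_3). \]
(Note that the $\lambda_i$ are uniquely defined due to Lemma 13.) For this let $a_n, b_n, c_n$, $n \in \mathbb{N}$, be given by
\[ J_{\mathcal{N}} U_{\mathrm{trans},l_3}\bigl(\tfrac{1}{n}\bigr) J_{\mathcal{N}} = U_{\mathrm{trans},l_1}(a_n)\, U_{\mathrm{trans},l_2}(b_n)\, U_{\mathrm{trans},l_3}(c_n). \]
Then taking products of this relation yields
\[ U_{\mathrm{trans},l_1}(n a_n)\, U_{\mathrm{trans},l_2}(n b_n)\, U_{\mathrm{trans},l_3}(n c_n) = \bigl(J_{\mathcal{N}} U_{\mathrm{trans},l_3}\bigl(\tfrac{1}{n}\bigr) J_{\mathcal{N}}\bigr)^n = J_{\mathcal{N}} U_{\mathrm{trans},l_3}(1) J_{\mathcal{N}} = U_{\mathrm{trans},l_1}(\lambda_1)\, U_{\mathrm{trans},l_2}(\lambda_2)\, U_{\mathrm{trans},l_3}(\lambda_3), \]
which shows $a_n = \frac{\lambda_1}{n}$, $b_n = \frac{\lambda_2}{n}$, $c_n = \frac{\lambda_3}{n}$. Taking products and adjoints gives
\[ J_{\mathcal{N}} U_{\mathrm{trans},l_3}\bigl(\tfrac{m}{n}\bigr) J_{\mathcal{N}} = U_{\mathrm{trans},l_1}\bigl(\tfrac{m}{n} \lambda_1\bigr)\, U_{\mathrm{trans},l_2}\bigl(\tfrac{m}{n} \lambda_2\bigr)\, U_{\mathrm{trans},l_3}\bigl(\tfrac{m}{n} \lambda_3\bigr), \quad n, m \in \mathbb{Z}, \]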
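and strong continuity of both sides proves (12). Now apply $\mathrm{Ad}\, J_{\mathcal{N}}$ to Eq. (12). This gives
\[ U_{\mathrm{trans},l_3}(a) = U_{\mathrm{trans},l_1}(-\lambda_1 a)\, U_{\mathrm{trans},l_2}(-\lambda_2 a)\, J_{\mathcal{N}} U_{\mathrm{trans},l_3}(\lambda_3 a) J_{\mathcal{N}}, \]
and multiplying both equations implies
\[ J_{\mathcal{N}} U_{\mathrm{trans},l_3}((\lambda_3 - 1)a) J_{\mathcal{N}} = U_{\mathrm{trans},l_3}(-(\lambda_3 - 1)a). \]
As in (8) we conclude $\lambda_3 = 1$. Using the symmetry $\Gamma_{\mathcal{N}}$ gives
\[ J_{\mathcal{N}} U_{\mathrm{trans},l_3}(a) J_{\mathcal{N}} = \mathrm{Ad}\, \Gamma_{\mathcal{N}}\bigl(J_{\mathcal{N}} U_{\mathrm{trans},l_3}(-a) J_{\mathcal{N}}\bigr) \]
\[ = \mathrm{Ad}\, \Gamma_{\mathcal{N}}\bigl(U_{\mathrm{trans},l_1}(-\lambda_1 a)\, U_{\mathrm{trans},l_2}(-\lambda_2 a)\, U_{\mathrm{trans},l_3}(-a)\bigr) \tag{13} \]
\[ = U_{\mathrm{trans},l_2}(\lambda_1 a)\, U_{\mathrm{trans},l_1}(\lambda_2 a)\, U_{\mathrm{trans},l_3}(a), \]
which compared with (12) and due to Lemma 13 shows $\lambda_1 = \lambda_2 =: \lambda$. For determining $\lambda$ we first use again the symmetry in the algebras, namely
\[ \mathrm{Ad}\, \Gamma_{\mathcal{M}}\bigl(J_{\mathcal{N}} U_{\mathrm{trans},l_3}(a) J_{\mathcal{N}}\bigr) = \mathrm{Ad}\, U_{\mathcal{L} \cap_- \mathcal{N}'}(1)\bigl(U_{\mathrm{trans},l_3}(a)\bigr) = \mathrm{Ad}\, J_{\mathcal{L}} U_{\mathcal{L} \cap_- \mathcal{N}'}(-1)\bigl(U_{\mathrm{trans},l_3}(-a)\bigr) = J_{\mathcal{L}} U_{\mathrm{trans},l_1}(-a) J_{\mathcal{L}}, \]
which gives as in (13), see (10),
\[ J_{\mathcal{L}} U_{\mathrm{trans},l_1}(a) J_{\mathcal{L}} = \Delta^{it_0}_{\mathcal{N}}\, U_{\mathrm{trans},l_3}(-2a)\, \Delta^{-it_0}_{\mathcal{N}} = U_{\mathrm{trans},l_3}(\lambda a)\, U_{\mathrm{trans},l_2}(\lambda a)\, U_{\mathrm{trans},l_1}(a), \]
where we used also (4) and (5). Therefore we can apply $\mathrm{Ad}\, \Delta^{it_0}_{\mathcal{N}}$ to Eq. (12), multiplied by $U_{\mathrm{trans},l_3}(a)$:
\[ \Delta^{it_0}_{\mathcal{N}}\, J_{\mathcal{N}} U_{\mathrm{trans},l_3}(a) J_{\mathcal{N}}\, U_{\mathrm{trans},l_3}(a)\, \Delta^{-it_0}_{\mathcal{N}} = \Delta^{it_0}_{\mathcal{N}}\, U_{\mathrm{trans},l_1}(\lambda a)\, U_{\mathrm{trans},l_2}(\lambda a)\, U_{\mathrm{trans},l_3}(2a)\, \Delta^{-it_0}_{\mathcal{N}} \]
\[ = U_{\mathrm{trans},l_1}\bigl(\tfrac{\lambda}{2} a\bigr)\, U_{\mathrm{trans},l_2}(2\lambda a)\, \Delta^{it_0}_{\mathcal{N}}\, U_{\mathrm{trans},l_3}(2a)\, \Delta^{-it_0}_{\mathcal{N}} \]
\[ = U_{\mathrm{trans},l_1}\bigl((\tfrac{\lambda}{2} - 1) a\bigr)\, U_{\mathrm{trans},l_2}(\lambda a)\, U_{\mathrm{trans},l_3}(-\lambda a). \]
Now this expression has to be $\mathrm{Ad}\, J_{\mathcal{N}}$-invariant. On the other hand we compute
\[ J_{\mathcal{N}}\, U_{\mathrm{trans},l_1}\bigl((\tfrac{\lambda}{2} - 1) a\bigr)\, U_{\mathrm{trans},l_2}(\lambda a)\, U_{\mathrm{trans},l_3}(-\lambda a)\, J_{\mathcal{N}} \]
\[ = U_{\mathrm{trans},l_1}\bigl(-(\tfrac{\lambda}{2} - 1) a\bigr)\, U_{\mathrm{trans},l_2}(-\lambda a)\, J_{\mathcal{N}} U_{\mathrm{trans},l_3}(-\lambda a) J_{\mathcal{N}} \]
\[ = U_{\mathrm{trans},l_1}\bigl(-(\tfrac{\lambda}{2} - 1 + \lambda^2) a\bigr)\, U_{\mathrm{trans},l_2}\bigl((-\lambda - \lambda^2) a\bigr)\, U_{\mathrm{trans},l_3}(-\lambda a), \]
which by comparison leads to
\[ -\bigl(\tfrac{\lambda}{2} - 1 + \lambda^2\bigr) = \tfrac{\lambda}{2} - 1, \qquad -\lambda - \lambda^2 = \lambda, \]
whose unique solution is $\lambda = -2$. Therefore we finally conclude
\[ J_{\mathcal{N}} U_{\mathrm{trans},l_3}(a) J_{\mathcal{N}} = U_{\mathrm{trans},l_1}(-2a)\, U_{\mathrm{trans},l_2}(-2a)\, U_{\mathrm{trans},l_3}(a). \]
This implies, using (10),
\[ \mathrm{Ad}\, U_{\mathrm{Lor}}(\Lambda_{l_1,l_3}(\ln 2))\bigl(U_{\mathrm{trans},l_2}(-2a)\bigr) = \Delta^{-it_0}_{\mathcal{M}}\, U_{\mathrm{trans},l_2}(-2a)\, \Delta^{it_0}_{\mathcal{M}} = J_{\mathcal{N}} U_{\mathrm{trans},l_3}(a) J_{\mathcal{N}} = U_{\mathrm{trans}}\bigl(\Lambda_{l_1,l_3}(\ln 2)(-2a\, l_2)\bigr), \]
showing according to Lemma 12
\[ \mathrm{Ad}\, U_{\mathrm{Lor}}(\Lambda_{l_1,l_3}(n \ln 2))\bigl(U_{\mathrm{trans}}(a)\bigr) = U_{\mathrm{trans}}(\Lambda_{l_1,l_3}(n \ln 2) a) \tag{14} \]
for all $a \in \mathbb{R}^{1,2}$, $n \in \mathbb{Z}$. Similarly we get
\[ \mathrm{Ad}\, U_{\mathrm{Lor}}(\Lambda_{l_1,l_2}(n \ln 2))\bigl(U_{\mathrm{trans}}(a)\bigr) = U_{\mathrm{trans}}(\Lambda_{l_1,l_2}(n \ln 2) a) \]
and
\[ \mathrm{Ad}\, U_{\mathrm{Lor}}(\Lambda_{l_2,l_3}(n \ln 2))\bigl(U_{\mathrm{trans}}(a)\bigr) = U_{\mathrm{trans}}(\Lambda_{l_2,l_3}(n \ln 2) a) \]
for all $a \in \mathbb{R}^{1,2}$, $n \in \mathbb{Z}$.

Next we will show that products of such elements form a dense subset of $SO^\uparrow(1,2)$. By continuity we then get the correct action of the representation $U_{\mathrm{Lor}}$ on the translations, and therefore a unitary representation of the semidirect product $SO^\uparrow(1,2) \ltimes \mathbb{R}^{1,2}$ (the Poincaré group in (2+1) dimensions). Let
\[ U_{LM}(a) := \Lambda_{l_2,l_3}(-\ln(a + 1))\, \Lambda_{l_1,l_3}(+\ln(a + 1)), \quad a \ge 0, \qquad U_{LM}(-a) := (U_{LM}(a))^{-1}. \]
(Notice that $U_{\mathrm{Lor}}(U_{LM}(n)) = U_{\mathcal{L} \cap_+ \mathcal{M}}(n)$, $n \in \mathbb{Z}$.) This defines a one-parameter subgroup of $SO^\uparrow(1,2)$ and some group relations are
\[ \mathrm{Ad}\, \Lambda_{l_1,l_3}(t)\bigl(U_{LM}(a)\bigr) = U_{LM}(e^{-t} a), \quad \forall t, a \in \mathbb{R}. \]
The relations (14) show that the representatives of $U_{LM}(\pm 1)$ and $\Lambda_{l_1,l_3}(n \ln 2)$, $n \in \mathbb{Z}$, act in the right way on the translations. Taking products, this is true for
\[ \mathrm{Ad}\, \Lambda_{l_1,l_3}(n \ln 2)\bigl(U_{LM}(m)\bigr) = U_{LM}(2^{-n} m), \quad \forall n, m \in \mathbb{Z}, \]
and by strong continuity we get it for all $U_{LM}(a)$, $a \in \mathbb{R}$. Similarly we define
\[ U_{NM}(a) := \Lambda_{l_1,l_3}(+\ln(a + 1))\, \Lambda_{l_1,l_2}(-\ln(a + 1)), \quad a \ge 0, \qquad U_{NM}(-a) := (U_{NM}(a))^{-1}, \]
and
\[ U_{NL}(a) := \Lambda_{l_1,l_2}(-\ln(a + 1))\, \Lambda_{l_1,l_3}(+\ln(a + 1)), \quad a \ge 0, \qquad U_{NL}(-a) := (U_{NL}(a))^{-1}, \]
and as above we get that the representatives of $U_{NM}(a)$, $U_{NL}(a)$, $a \in \mathbb{R}$, act in the right way on the translations. But these three 1-dimensional subgroups generate the whole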
Lorentz group $SO^\uparrow(1,2)$ and therefore we finally get a representation of the Poincaré group of dimension (2+1) by modular theory. The spectrum of the representation of the translations is Lorentz invariant. Since the translations in the $l_1$-direction are represented by $U_{\hat{\mathcal{N}} \subset_- \mathcal{N}}$, the generator is positive according to Theorem 2 a). From these remarks we conclude that we arrived at a positive energy representation of the Poincaré group.

As in the case of chiral quantum field theories [Wi 4] we can use Theorem 14 in order to construct a 2+1 dim. quantum field theory starting from a set of 4 algebras $\mathcal{N}$, $\mathcal{M}$, $\mathcal{L}$, $\hat{\mathcal{N}}$. Let us assume Conditions I-III for this set. We define a local net of observables in the following way: Let $l_1 = (1, 1, 0)$, $l_2 = (1, -1, 0)$ and $l_3 = (2, 0, 2) \in \mathbb{R}^{1,2}$ be as in the proof of Proposition 11. We define the local algebras of observables associated with wedges by
\[ \mathcal{A}(W[l_1, l_2]) := \mathcal{N}, \qquad \mathcal{A}(W[l_1, l_3]) := \mathcal{M}, \qquad \mathcal{A}(W[l_2, l_3]) := \mathcal{L}. \tag{15} \]
For arbitrary linearly independent light rays $l_i, l_j \in \mathbb{R}^{1,2}$ pointing to the future we choose a $g_{l_i,l_j} \in SO^\uparrow(1,2)$ with $l_i = g_{l_i,l_j} l_1$, $l_j = g_{l_i,l_j} l_2$. Such a group element always exists in $SO^\uparrow(1,2)$ and is unique up to multiplication by a boost of type $\Lambda_{l_1,l_2}(t)$, $t \in \mathbb{R}$. (These are the only elements in $SO^\uparrow(1,2)$ which leave the wedge region $W[l_1, l_2]$ fixed.)$^2$ We will use this group element in order to define the local observable algebra associated with $W[l_i, l_j]$. Let $U$ denote the unitary representation of the Poincaré group according to Theorem 14, i.e. let
$$U(\Lambda_{l_1,l_2}(t)) := \Delta_N^{-i\frac{t}{2\pi}}$$
etc. with $t \in \mathbb{R}$.
Then we define the observable algebra associated with $W[l_i, l_j]$ by $A(W[l_i, l_j]) := \mathrm{Ad}\,U(g_{l_i,l_j})(N)$. For this to be a good definition we need $\mathrm{Ad}\,U(\Lambda_{l_1,l_2}(t))(N) = N$, which trivially holds by modular theory. A look at the proof of Proposition 11 shows that this definition is consistent with the choice in (15). Similarly we define for translated wedges $A(W[l_i, l_j, a]) := \mathrm{Ad}\,U(a)U(\Lambda_{l_i,l_j})(N)$, $a \in \mathbb{R}^{1,2}$. In this way we associate with any wedge region in $\mathbb{R}^{1,2}$ a unique von Neumann algebra. Obviously this map respects isotony, i.e.
$$W[l_i, l_j, a] \subset W[l_i, l_j, \hat a] \;\Rightarrow\; A(W[l_i, l_j, a]) \subset A(W[l_i, l_j, \hat a]) \qquad (\text{isotony}),$$
which follows from the properties of modular inclusions. We have: $W[l_i, l_j, a]$ and $W[\hat l_i, \hat l_j, \hat a]$ spacelike separated $\Rightarrow$ $l_i = \hat l_j$, $l_j = \hat l_i$, and $a - \hat a$ spacelike.
$^2$ Note that the reflection across the plane orthogonal to the basis line of $W[l_1, l_2]$ is not in the group $SO^\uparrow(1,2)$. It has determinant $-1$.
The definition tells us now
$$A(W[l_2, l_1]) = \mathrm{Ad}\,U_{N'\cap_{-}L}(1)\,U_{L\cap_{+}M}(1)\,U_{M\cap_{-}N}(1)(N) = N',$$
see Lemma 4, and using the Lorentz invariance of the construction we immediately get
Lemma 15. If $W[l_i, l_j, a]$ and $W[\hat l_i, \hat l_j, \hat a]$ are spacelike separated then $A(W[l_i, l_j, a]) \subset A(W[\hat l_i, \hat l_j, \hat a])'$.
Now we can define local observable algebras belonging to double cones by taking intersections of appropriate observable algebras associated with wedges. Due to Lorentz covariance, isotony and locality for the wedge algebras, the double cone algebras will inherit these structures. For details one may look up [Bi-Wich], where the sketched construction, starting from a proper set of wedge algebras, is carried out. We should mention here that this construction does not necessarily imply the Reeh-Schlieder property for observables localized in double cones. In summary we arrive at
Theorem 16. Let $A(O)$, $O \subset \mathbb{R}^{1,2}$, be a local net fulfilling the Bisognano/Wichmann property for wedges, and let $l_1, l_2, l_3$ be linearly independent lightlike vectors pointing to the future. Let $N := A(W[l_1, l_2])$, $M := A(W[l_1, l_3])$, $L := A(W[l_2, l_3])$ and $\hat N := A(W[l_1, l_2, l_1])$. Then this set of algebras together with the vacuum vector fulfills the assumptions of Theorem 14. Conversely, given a set of 4 algebras $N$, $M$, $L$ and $\hat N$ acting on a Hilbert space $H$ together with a common cyclic and separating vector $\Omega \in H$, which fulfill the assumptions in Theorem 14, these data determine a local (Bisognano-Wichmann) net $A(O) \subset B(H)$, $O \subset \mathbb{R}^{1,2}$, such that the algebras one started with become wedge algebras of the constructed net as in the first part.
The generalization of these results to the 1+3 dimensional quantum field theory is in progress.
Acknowledgement. I would like to thank H.-J. Borchers for pointing out to me Lemma 5 above, see [Bo2]. This observation initiated the present discussion. Furthermore I would like to thank B. Schroer for his constant, stimulating interest, F. Nill and C. Binnenhei for reading the manuscript, and M. Florig, who pointed out an error in an earlier version of this work. I would also like to thank the referee for the suggestion to introduce a more convenient notation. Hopefully this will clarify the line of arguments. He also prompted me to work a little more on condition III in Theorem 14, leading to the present revised version.
References
[Ar-Zs] Araki, H., Zsido, L.: An extension of the structure theorem of Borchers with application to half sided modular inclusions. In preparation
[Bi-Wich] Bisognano, J., Wichmann, E.: On the duality condition for a Hermitean scalar field. J. Math. Physics 16, 985 (1975)
[Bo1] Borchers, H.-J.: The CPT-Theorem in two-dimensional theories of local observables. Commun. Math. Phys. 143, 315 (1992)
[Bo2] Borchers, H.-J.: Half-Sided Modular Inclusion and the Construction of the Poincaré Group. Commun. Math. Phys. 179, 703 (1996)
[Gu-Lo-Wi] Guido, D., Longo, R., Wiesbrock, H.-W.: Extensions of Conformal Nets and Superselection Structures. Preprint, to be published in Commun. Math. Phys.
[Haa] Haag, R.: Local Quantum Physics. Berlin–Heidelberg–New York: Springer, 1992
[Ta] Takesaki, M.: Tomita's Theory of Modular Hilbert Algebras and its Applications. Berlin–Heidelberg–New York: Springer, LNM 128, 1970
[Wi1] Wiesbrock, H.-W.: Half-Sided Modular Inclusions of von-Neumann-algebras. Commun. Math. Phys. 157, 83 (1993)
[Wi2] Wiesbrock, H.-W.: Symmetries and Half-Sided Modular Inclusions of von-Neumann-algebras. Lett. Math. Phys. 29, 107 (1993)
[Wi3] Wiesbrock, H.-W.: Symmetries and Modular Intersections of von-Neumann Algebras. Lett. Math. Phys. 39, 203 (1997)
[Wi4] Wiesbrock, H.-W.: Conformal Quantum Field Theory and Half-Sided Modular Inclusions of von-Neumann-algebras. Commun. Math. Phys. 158, 537 (1993)
Communicated by H. Araki
Commun. Math. Phys. 193, 287 – 316 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Global Solutions to a Reactive Boussinesq System with Front Data on an Infinite Domain
Simon Malham$^1$, Jack X. Xin$^2$
$^1$ Department of Theoretical Mechanics, University of Nottingham, Nottingham, UK. E-mail: [email protected]
$^2$ Department of Mathematics, University of Arizona, Tucson AZ 85721, USA. E-mail: [email protected]
Received: 15 January 1996 / Accepted: 2 September 1997
Abstract: We prove the existence of global solutions to a coupled system of Navier–Stokes and reaction-diffusion equations (for temperature and mass fraction) with prescribed front data on an infinite vertical strip or tube. This system models a one-step exothermic chemical reaction. The volume expansion induced by heat release is accounted for via the Boussinesq approximation. The solutions are time dependent moving fronts in the presence of fluid convection. In the general setting, the fronts are subject to intensive Rayleigh-Taylor and thermal-diffusive instabilities. Various physical quantities, such as fluid velocity, temperature, and front speed, can grow in time. We show that the growth is at most $e^{O(t)}$ for large time $t$ by constructing a nonlinear functional on the temperature and mass fraction components. These results hold for arbitrary order reactions in two space dimensions and for quadratic and cubic reactions in three space dimensions. In the absence of any thermal-diffusive instability (unit Lewis number), and with weak fluid coupling, we construct a class of fronts moving through shear flows. Although the front speeds may oscillate in time, we show that they are uniformly bounded for large $t$. The front equation shows the generic time-dependent nature of the front speeds and the straining effect of the flow field.
1. Introduction
We study the existence of global solutions to the following Boussinesq combustion system on the infinite tube $\Omega := \{(x, y) \in \Sigma \times \mathbb{R}\}$, where $\Sigma \subset \mathbb{R}^{d-1}$ is an open, bounded, simply connected domain with smooth boundary $\partial\Sigma = \bar\Sigma \setminus \Sigma$, outward normal $\hat n$, and $d = 2, 3$ is our spatial dimension:
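\begin{align}
\partial_t \psi + u \cdot \nabla\psi &= \Delta\psi - \psi f(\theta), \tag{1.1a}\\
\partial_t \theta + u \cdot \nabla\theta &= \ell\Delta\theta + \psi f(\theta), \tag{1.1b}\\
\partial_t u + u \cdot \nabla u &= \nu\Delta u - \nabla p + \sigma\theta e, \tag{1.1c}\\
\nabla \cdot u &= 0. \tag{1.1d}
\end{align}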
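Physically we interpret: $u(x, y, t) : \Omega \times \mathbb{R} \to \mathbb{R}^d$ as the fluid velocity; $p(x, y, t) : \Omega \times \mathbb{R} \to \mathbb{R}$ the pressure; $\psi(x, y, t) : \Omega \times \mathbb{R} \to \mathbb{R}$ the concentration of the reactant in a one-step irreversible exothermic reaction; $\theta(x, y, t) : \Omega \times \mathbb{R} \to \mathbb{R}$ the temperature of the reactant-product mixture; $\nu$ the normalised fluid viscosity or Prandtl number; $\ell$ the Lewis number; $\sigma$ the Rayleigh number; $e$ denotes the unit vertical direction opposite to the propagation direction of the flame (aligned with the $y$ direction). For convenience we shall write $\boldsymbol{\theta} := (\psi, \theta) : \Omega \times \mathbb{R} \to \mathbb{R}^2$, i.e. the 2-tuple of reactant concentration and temperature. We assume non-homogeneous boundary conditions which allow front type initial data to be prescribed (for our main results $b_* \equiv 0$):
$$\begin{aligned}
&\partial_{\hat n}\psi = 0, \quad \partial_{\hat n}\theta = 0, \quad u = 0, && \text{on } \partial\Sigma \times \mathbb{R} \times \mathbb{R}_+,\\
&\psi \to 0, \quad \theta \to 1, \quad u \to b_*, && \text{as } y \to +\infty,\\
&\psi \to 1, \quad \theta \to 0, \quad u \to b_*, && \text{as } y \to -\infty.
\end{aligned} \tag{1.2}$$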
We will suppose for some $m \in \mathbb{N}$,
$$f(\theta) = \begin{cases} \theta^m, & \theta > 0,\\ 0, & \theta \le 0. \end{cases} \tag{1.3}$$
System (1.1) models the vertical movement of flame fronts. Thermal volume expansion of the fluid due to the irreversible exothermic combustion reaction is accounted for by the Overbeck-Boussinesq approximation [10, 28]. The nonlinear chemical reaction term $\psi f(\theta)$ usually takes the Arrhenius form $\psi f(\theta)\exp\{-E/R\theta\}$, or its normalised version $\psi f(\theta)\exp\{(\theta - 1)/\epsilon(1 + \chi(\theta - 1))\}$, where $f(\theta)$ is usually of the form (1.3), though in general $m$ is not always a positive integer. The constant $E$ is the activation energy, $R$ is the universal gas constant, $\chi \in (0, 1)$ is the thermal expansion coefficient and $\epsilon > 0$ is the reciprocal of the Zel'dovich number (see Buckmaster and Ludford [8] or Berestycki and Larrouturou [2]). Since the supremum of $\exp\{-E/R\theta\}$ over $\Omega \times \mathbb{R}$ is always bounded, the inclusion of this factor in the chemical nonlinearity would not affect any of our proofs. Consequently we neglect this exponential factor and, for simplicity, choose $f$ to be a power nonlinearity. This simple chemical nonlinearity also arises in isothermal autocatalytic chemical reactions of the form $A + mB \to (m + 1)B$, where $m$ is the order of reaction, and $\psi$, $\theta$ are the concentrations of reactant $A$ and autocatalyst $B$ respectively. The rate of reaction is thus given by $k\psi\theta^m$, where the constant enthalpy $k$ can be scaled out of the system. In this case the temperature remains fixed, yet the density of the $A$ and $B$ mixture increases with the reaction, resulting in a change of fluid velocity. Thus chemical feedback plays the role of thermal feedback, and system (1.1) then governs the dynamics of the moving concentration fronts in the presence of fluid convection. Billingham, King, Merkin, Metcalf, Needham, and Scott [6, 39, 40] studied the autocatalytic reaction-diffusion system (1.1a), (1.1b), neglecting hydrodynamical effects. They proved existence and uniqueness results for an associated boundary value problem and studied the development of travelling fronts of chemical reaction.
See also Focant and Gallay [18] for a recent study of existence of traveling fronts in the quadratic-cubic
case and their stability when $\ell$ is near one. The passively convected version of the nonisothermal system was considered by Berestycki, Larrouturou and Roquejoffre [3, 48] in an infinite tube, and they proved the linear and nonlinear stability of travelling front solutions. Manley, Marion and Temam [34, 36] examined system (1.1) on a finite tube in the case of a multi-component reaction and with slightly different boundary conditions. They proved the global existence ($d = 2, 3$) of suitable weak solutions uniformly bounded ($d = 2$) in time. Further, their estimates for the Hausdorff dimension of the universal attractor indicated that for long tubes, hydrodynamical effects make a significant contribution to the complexity of the flow. Crucial to their proofs was the assumption of a bounded nonlinear reaction term. We are interested in studying the full system (1.1) on unbounded domains while allowing for unbounded chemical nonlinearities. The attractor dimension results of Manley, Marion and Temam [34, 36] indicate the importance of studying this system when the vertical domain size is much larger than the typical length scale associated with the front width. The infinite cylindrical domain is the natural setting for examining the long time behaviour of travelling front solutions, especially for irreversible reactions. Numerical simulations (Patnaik and Kailasanath [45], Zhu and Xin [62]) have shown that moving fronts of system (1.1) are subject to both Rayleigh-Taylor (upward fronts) and thermal-diffusive instabilities. The Rayleigh-Taylor instability from the $\sigma\theta e$ term is due to heavier (cold) fluid lying above the lighter (hot) fluid. It leads to bubble formation on the front and growth of fluid energy and vorticity. The thermal-diffusive instability due to $\ell \neq 1$ can cause chaotic front oscillation ($\ell > 1$) or formation of cellular front structures ($\ell < 1$) [51, 39]. As a result, the maximum temperature can grow in time.
Last, but not least, for high Reynolds numbers (small $\nu$) the fluid flow can become highly irregular, which in turn wrinkles the front and may induce front acceleration. In [62], power growth in time of maximum vorticity and temperature is numerically observed ($d = 2$, $\nu = 0.005$, $\ell = 0.1$ or $10$). Majda and Souganidis [33] studied front acceleration (front speed of $O(t^p)$, $p > 0$) in a prescribed (passive) random shear flow of Hölder regularity. All this evidence suggests that in general one should not expect the front speed to be uniformly bounded in time; instead a power growth may well happen. Our first result implies an exponential bound $e^{O(t)}$ on the front speeds for fronts in a two dimensional infinite vertical strip (for all orders of reaction $m$) or in a three dimensional infinite vertical tube (for $m \in \{1, 2, 3\}$). We treat only the one-step reaction case, as the analysis of the multi-component case is practically identical. We prove the existence of global weak solutions ($d = 2, 3$) to (1.1). In the two dimensional case we prove uniqueness for a class of slightly more regular weak solutions as well as the existence of strong, smooth solutions for smooth initial data. Our sharpest norm upper bounds of solutions grow with time, so we are unable to discuss attractors. The growing bounds may be interpreted as the enhancement of instabilities in the system due to the unbounded chemical nonlinearities and unbounded domains. If $\Omega = \Sigma \times \Lambda$, $|\Lambda| < \infty$, i.e. a bounded strip or tube, we can considerably improve our growth estimates. The details however will be presented elsewhere. Our second result concerns fronts in a reduced system when $\ell = 1$, $\sigma$ is small, $\nu > 2\pi$. The fluid flows are laminar, the Rayleigh-Taylor effect is minimal, and the thermal-diffusive effect is absent. We construct a class of front solutions near the known passive fronts in smooth shear flows. The time dependent front speeds are proved to be uniformly bounded in time.
Passive fronts in shear flows on infinite cylindrical domains have been studied at length by Berestycki, Larrouturou, Lions, Nirenberg and Roquejoffre [2, 3, 48, 4, 5], regarding the existence and stability of travelling front solutions. Similar issues on passive fronts in periodic flow fields have also been well studied by Papanicolaou,
Xin, and Zhu [43, 56–59]. The passive fronts in these cases all propagate with constant speeds. However, with fluid coupling turned on, the front speed is no longer constant, as we will see from the front equation arising in the course of the proof.
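2. Global Weak Solutions and Growth Estimates
2.1. Notation and Statement of Main Result. We denote $\langle\varphi\rangle := \int_\Omega \varphi\, dxdy$ for Lebesgue measurable functions $\varphi : \Omega \to \mathbb{R}$. For $q \in [1, \infty)$, $n \in \mathbb{N}$, $L^q(\Omega; \mathbb{R}^d)$ and $H^n(\Omega; \mathbb{R}^d)$ are the usual Lebesgue and Sobolev spaces of $\mathbb{R}^d$-valued functions, equipped with the norm and inner product
$$\|\varphi\|^q_{L^q(\Omega;\mathbb{R}^d)} := \sum_{i=1}^{d} \langle |\varphi_i|^q \rangle, \qquad (\varphi, \phi)_{H^n(\Omega;\mathbb{R}^d)} := \sum_{i=1}^{d} \sum_{|\alpha| \le n} \langle D^\alpha\varphi_i\, D^\alpha\phi_i \rangle.$$
Since $\Omega$ has a smooth boundary, an equivalent norm on $H^n(\Omega; \mathbb{R}^d)$ is
$$\|\varphi\|^2_{H^n(\Omega;\mathbb{R}^d)} = \|\varphi\|^2_{L^2(\Omega;\mathbb{R}^d)} + \sum_{|\alpha| = n} \|D^\alpha\varphi\|^2_{L^2(\Omega;\mathbb{R}^d)}.$$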
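The non-reflexive space $L^\infty(\Omega; \mathbb{R}^d)$ is equipped with the usual sup-norm. We adopt the notation $L^q := L^q(\Omega; \mathbb{R}^2)$ for $\mathbb{R}^2$-valued functions defined on $\Omega \subset \mathbb{R}^d$. We will often use the Gagliardo–Nirenberg inequality: for all $\varphi \in H^1(\Omega)$, $q \in [2, \infty)$ when $d = 2$ and $q \in [2, 6]$ when $d = 3$,
$$\|\varphi\|_{L^q} \le c\, \|\nabla\varphi\|^{\frac{d}{2} - \frac{d}{q}}_{L^2(\Omega;\mathbb{R}^d)}\, \|\varphi\|^{1 - \frac{d}{2} + \frac{d}{q}}_{L^2} + c(\Omega)\|\varphi\|_{L^2}. \tag{2.1}$$
The last term on the right-hand side of (2.1) is zero when $\varphi \in H^1_0(\Omega)$. Also, we shall use the interpolation inequality [19, 31]: for $\varphi \in L^r(\Omega) \cap L^p(\Omega)$, $1 \le p \le q \le r \le \infty$, $\mu \in [0, 1]$ and $1/q = \mu/p + (1 - \mu)/r$,
$$\|\varphi\|_{L^q} \le \|\varphi\|^{\mu}_{L^p}\, \|\varphi\|^{1-\mu}_{L^r}. \tag{2.2}$$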
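As a quick check on the exponents in (2.1) and (2.2), the cases that the later $L^4$ and $L^6$ estimates rely on can be written out explicitly. The following is only a sketch obtained by substituting particular values of $d$, $q$, $p$, $r$ into the two inequalities as stated; the constants $c$ and $c(\Omega)$ are left unspecified.

```latex
% (2.1) with d = 2, q = 4:  d/2 - d/q = 1/2  and  1 - d/2 + d/q = 1/2
\|\varphi\|_{L^4} \le c\,\|\nabla\varphi\|_{L^2}^{1/2}\,\|\varphi\|_{L^2}^{1/2} + c(\Omega)\|\varphi\|_{L^2}
% (2.1) with d = 3, q = 4:  d/2 - d/q = 3/4  and  1 - d/2 + d/q = 1/4
\|\varphi\|_{L^4} \le c\,\|\nabla\varphi\|_{L^2}^{3/4}\,\|\varphi\|_{L^2}^{1/4} + c(\Omega)\|\varphi\|_{L^2}
% (2.1) with d = 3, q = 6 (the endpoint):  d/2 - d/q = 1  and  1 - d/2 + d/q = 0
\|\varphi\|_{L^6} \le c\,\|\nabla\varphi\|_{L^2} + c(\Omega)\|\varphi\|_{L^2}
% (2.2) with p = 2, r = 6, q = 4:  1/4 = \mu/2 + (1-\mu)/6  gives  \mu = 1/4
\|\varphi\|_{L^4} \le \|\varphi\|_{L^2}^{1/4}\,\|\varphi\|_{L^6}^{3/4}
```

For $\varphi \in H^1_0(\Omega)$, combining the last line with the $d = 3$, $q = 6$ case reproduces the $d = 3$, $q = 4$ exponents, which is a consistency check on the reconstruction of (2.1).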
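The Poincaré inequality establishes the equivalence of the norm $\|\varphi\|_{H^1(\Omega)}$ and semi-norm $\|\nabla\varphi\|_{L^2(\Omega;\mathbb{R}^d)}$ on $H^1_0(\Omega)$: for $\varphi \in H^1_0(\Omega)$,
$$\|\varphi\|_{L^2} \le c(\Sigma)\|\nabla\varphi\|_{L^2(\Omega;\mathbb{R}^d)}. \tag{2.3}$$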
For a given Hilbert space $X$ with inner product $(\cdot, \cdot)_X$, we shall use $\langle\cdot, \cdot\rangle_{X \times X'}$ to denote the bilinear form establishing the duality between $X$ and its dual $X'$. For two topological vector spaces $X$ and $Y$, the notation $X \hookrightarrow Y$ shall indicate that a continuous embedding exists from $X$ into $Y$, and we shall use $X \hookrightarrow\hookrightarrow Y$ when the embedding is compact. Given a Banach space $Y$, we shall use $L^p_{\mathrm{loc}}([0, \infty); Y)$ to denote the space of measurable functions from $[0, \infty)$ to $Y$ such that $\|\cdot\|_Y \in L^p_{\mathrm{loc}}([0, \infty))$. The notations $w$-$Y$ and $w^*$-$Y$ are used to denote $Y$ endowed with its weak and weak-star topologies respectively. By $C([0, \infty); w\text{-}X)$ we indicate the space of functions continuous from $[0, \infty)$ into $w$-$X$.
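With $D := \{v \in C_0^\infty(\Omega; \mathbb{R}^d) : \nabla \cdot v = 0\}$, we specify $H$ to be the closure of $D$ in $L^2(\Omega; \mathbb{R}^d)$ and $V$ to be the closure of $D$ in $H^1(\Omega; \mathbb{R}^d)$. Since there exists $T_C : \{v \in L^2(\Omega; \mathbb{R}^d) : \nabla \cdot v \in L^2(\Omega)\} \to \{v|_{\partial\Omega} \in H^{-\frac12}(\partial\Omega; \mathbb{R}^d)\}$, a continuous linear trace operator such that $T_C(v) = v \cdot \hat n|_{\partial\Omega}$ for smooth $v$, we have the standard characterisations [14, 32]
$$H = \{v \in L^2(\Omega; \mathbb{R}^d) : \nabla \cdot v = 0,\ T_C(v) = 0\}, \qquad V = \{v \in H^1_0(\Omega; \mathbb{R}^d) : \nabla \cdot v = 0\}.$$
By the Riesz representation theorem we can identify $H \equiv H'$, which is our pivot space, and then, using the inequalities above, we establish the Gelfand triple [56] $V \hookrightarrow H \hookrightarrow V'$, where each space is dense in the one which follows. We use $P$ to represent the Leray-Hodge orthogonal projection onto divergence free functions, $P : L^2(\Omega; \mathbb{R}^d) \to H$. In the standard fashion we define the linear Stokes operator $A = -P\Delta : D(A) \to H$, where $D(A) = H^2(\Omega; \mathbb{R}^d) \cap V$. Further, with $E := \{\varphi \in C_0^\infty(\mathbb{R}^d) \text{ restricted to } \Omega : \partial_{\hat n}\varphi = 0 \text{ on } \partial\Sigma \times \mathbb{R}\}$ and with the space of all $\varphi \in H^1(\Omega)$ for which $\|\varphi\|^2_{H^1(\Omega)} + \|\Delta\varphi\|^2_{L^2(\Omega)}$ is finite, we specify $W$ to be the closure of $E$ in that space. Since there exist $T_D : H^1(\Omega) \to H^{\frac12}(\partial\Omega) \subset L^2(\partial\Omega)$ and $T_N : W \to H^{-\frac12}(\partial\Omega)$, continuous linear trace operators such that $T_D(\varphi) = \varphi|_{\partial\Omega}$ and $T_N(\varphi) = \partial_{\hat n}\varphi|_{\partial\Omega}$ for smooth $\varphi$, we can characterise [15, 56]
$$W = \{\varphi \in H^1(\Omega);\ \Delta\varphi \in L^2(\Omega) : \partial_{\hat n}\varphi = 0 \text{ on } \partial\Sigma \times \mathbb{R}\}.$$
We will also need Green's theorem, which by density arguments holds for $\varphi \in W$ and $v \in H^1(\Omega)$:
$$\langle\nabla v \cdot \nabla\varphi\rangle + \langle v\Delta\varphi\rangle = \langle T_D(v), T_N(\varphi)\rangle_{H^{\frac12}(\partial\Omega) \times H^{-\frac12}(\partial\Omega)}. \tag{2.4}$$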
Suppose $\phi \in C_0^\infty(\mathbb{R}; [0, \infty))$ satisfies $\int_{\mathbb{R}} \phi(y)\, dy = 1$. Set $\tilde\psi(y) = \int_{y - y_s}^{\infty} \phi(s)\, ds$ and $\tilde\theta(y) = \int_{-\infty}^{y} \phi(s)\, ds$, where $y_s$ is a finite constant which we can choose to be zero. Thus $\tilde{\boldsymbol{\theta}} = (\tilde\psi, \tilde\theta) \in [0, 1]^2$ is smooth, satisfies the boundary conditions (1.2), and $\tilde\psi \cdot \tilde\theta$ has compact support in $\mathbb{R}$. As in Heywood [22], we assume that $b_*$ can be continued as a function into $\Omega$, $b \in H^2_{\mathrm{loc}}(\Omega; \mathbb{R}^d)$, for which there exists a scalar distribution $q(x, y)$ such that $f = \nu\Delta b - b \cdot \nabla b - \nabla q \in L^2(\Omega; \mathbb{R}^d)$. This is trivial when $b_* \equiv 0$, which we assume throughout the rest of this section. (However, with slight modifications to our proofs equivalent to those in Heywood [22], our results can include the case when $f$ has finite Dirichlet integral. For example, when $\Sigma$ is a disk of radius $r_0$, a natural choice for $b_*$ would be a Hagen–Poiseuille flow, $b_*(x) = \partial_y\tilde p \cdot (|x|^2 - r_0^2)e/4\nu$ for some prescribed pressure gradient $\partial_y\tilde p$.) We linearly decompose our solutions into $\boldsymbol{\theta} = \tilde{\boldsymbol{\theta}} + \hat{\boldsymbol{\theta}}$. For initial data $(\boldsymbol{\theta}^{in}, u^{in})$ satisfying the boundary conditions (1.2), our initial boundary value problem now takes the form:
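\begin{align}
\partial_t\hat\psi + u \cdot \nabla\psi &= \Delta\psi - \psi f(\theta), && \text{in } \Omega \times \mathbb{R}_+, \tag{2.5a}\\
\partial_t\hat\theta + u \cdot \nabla\theta &= \ell\Delta\theta + \psi f(\theta), && \text{in } \Omega \times \mathbb{R}_+, \tag{2.5b}\\
\partial_t u + u \cdot \nabla u &= \nu\Delta u - \nabla p + \sigma\hat\theta e, && \text{in } \Omega \times \mathbb{R}_+, \tag{2.5c}\\
\nabla \cdot u &= 0, && \text{in } \Omega \times \mathbb{R}_+, \tag{2.5d}\\
(\partial_{\hat n}\hat{\boldsymbol{\theta}}, u) &= 0, && \text{on } \partial\Sigma \times \mathbb{R} \times \mathbb{R}_+, \tag{2.5e}\\
(\hat{\boldsymbol{\theta}}, u) &\to 0, && \text{as } |y| \to \infty, \tag{2.5f}\\
\hat{\boldsymbol{\theta}}(x, y, 0) &= \hat{\boldsymbol{\theta}}^{in}(x, y) = \boldsymbol{\theta}^{in} - \tilde{\boldsymbol{\theta}}, && \text{in } \Omega, \tag{2.5g}\\
u(x, y, 0) &= u^{in}(x, y). \tag{2.5h}
\end{align}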
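The reference profiles $\tilde\psi$ and $\tilde\theta$ used in the decomposition $\boldsymbol{\theta} = \tilde{\boldsymbol{\theta}} + \hat{\boldsymbol{\theta}}$ can be illustrated numerically. The sketch below is not part of the original argument: it assumes one particular mollifier, the standard bump $\exp(-1/(1 - s^2))$ on $(-1, 1)$, and uses a simple trapezoidal rule. The function names are hypothetical. It only checks the limits in (1.2) and the compact support of $\tilde\psi \cdot \tilde\theta$.

```python
import math


def bump(s):
    """Unnormalised smooth bump supported on (-1, 1) (an assumed choice of phi)."""
    if abs(s) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - s * s))


def integrate(f, a, b, n=2000):
    """Composite trapezoidal rule on [a, b]; returns 0 for an empty interval."""
    if b <= a:
        return 0.0
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


MASS = integrate(bump, -1.0, 1.0)


def phi(s):
    """Bump normalised so that its integral over R is (numerically) one."""
    return bump(s) / MASS


def theta_tilde(y):
    """Integral of phi over (-inf, y]; phi vanishes outside (-1, 1)."""
    return integrate(phi, -1.0, min(max(y, -1.0), 1.0))


def psi_tilde(y, y_s=0.0):
    """Integral of phi over [y - y_s, inf)."""
    lower = min(max(y - y_s, -1.0), 1.0)
    return integrate(phi, lower, 1.0)


if __name__ == "__main__":
    for y in (-3.0, -0.5, 0.0, 0.5, 3.0):
        print(y, psi_tilde(y), theta_tilde(y), psi_tilde(y) * theta_tilde(y))
```

With $y_s = 0$ the two profiles sum to one, so $\tilde\psi$ decreases from 1 to 0 exactly where $\tilde\theta$ increases from 0 to 1, and their product vanishes outside the support of the bump.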
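The term $\tilde\theta e$ is the gradient of a scalar function, so $p$ is now the modified pressure.
Definition 2.1. For given initial data $(\hat{\boldsymbol{\theta}}^{in}, u^{in}) \in L^q(\Omega) \times H$, for all $q \in [1, \infty)$, which satisfy $\hat\psi^{in} \in [-1, 1]$ and $\hat\theta^{in} \in [-1, \infty)$ for a.e. $(x, y) \in \Omega$, a global weak solution of (2.5) indicates measurable functions
\begin{align}
\hat{\boldsymbol{\theta}} &\in L^\infty_{\mathrm{loc}}([0, \infty); L^q) \cap L^2_{\mathrm{loc}}([0, \infty); H^1(\Omega; \mathbb{R}^2)) \cap C([0, \infty); L^2), \tag{2.6a}\\
u &\in L^\infty_{\mathrm{loc}}([0, \infty); H) \cap L^2_{\mathrm{loc}}([0, \infty); V) \cap C([0, \infty); w\text{-}H), \tag{2.6b}
\end{align}
for every $q \in [1, \infty)$ when $d = 2$ and every $q \in [1, 6]$ when $d = 3$, such that $\hat\psi \in [-1, 1]$ and $\hat\theta \ge -1$, a.e. in $\Omega$, for every $t \in [0, \infty)$, and which for all $(\varphi, v) \in C^\infty([0, \infty); E \times D)$ and $[t_0, t_1] \subset [0, \infty)$ satisfy
\begin{align}
\int_{t_0}^{t_1} \langle\hat\psi\,\partial_t\varphi + \psi\Delta\varphi + \psi u \cdot \nabla\varphi - \psi f(\theta)\varphi\rangle\, dt &= \langle\hat\psi(t_1)\varphi(t_1)\rangle - \langle\hat\psi(t_0)\varphi(t_0)\rangle, \tag{2.7a}\\
\int_{t_0}^{t_1} \langle\hat\theta\,\partial_t\varphi + \ell\theta\Delta\varphi + \theta u \cdot \nabla\varphi + \psi f(\theta)\varphi\rangle\, dt &= \langle\hat\theta(t_1)\varphi(t_1)\rangle - \langle\hat\theta(t_0)\varphi(t_0)\rangle, \tag{2.7b}\\
\int_{t_0}^{t_1} \langle u \cdot \partial_t v + \nu u \cdot \Delta v + u \otimes u : \nabla v + \sigma\hat\theta v e\rangle\, dt &= \langle u(t_1) v(t_1)\rangle - \langle u(t_0) v(t_0)\rangle, \tag{2.7c}
\end{align}
and
$$(\hat{\boldsymbol{\theta}}(0), u(0)) = (\hat{\boldsymbol{\theta}}^{in}, u^{in}). \tag{2.8}$$
Remark 2.1. Since for these weak solutions $(\hat{\boldsymbol{\theta}}, u) \in C([0, \infty); L^2 \times w\text{-}H)$, the initial condition (2.8) is satisfied in this sense.
Theorem 2.1. Suppose $\psi^{in} \in [0, 1]$, $\theta^{in} \in [0, \infty)$ for a.e. $(x, y) \in \Omega$. If $(\hat{\boldsymbol{\theta}}^{in}, u^{in}) \in L^q \times H$ for every $q \in [1, \infty)$, then there exists a global weak solution to (2.5) for all $m \in \mathbb{N}$ when $d = 2$ and for $m = 1, 2$ or $3$ when $d = 3$. Moreover, as $t \to \infty$ there is a positive constant $c$ such that
$$\|\hat{\boldsymbol{\theta}}(t)\|_{L^q},\ \|u(t)\|_H = O(e^{ct}) \tag{2.9}$$
for all $q \in [1, \infty)$ if $d = 2$ and $q \in [1, 6]$ if $d = 3$. When $d = 2$, if $(\hat{\boldsymbol{\theta}}^{in}, u^{in}) \in H^n(\Omega; \mathbb{R}^2) \times (V \cap H^n(\Omega; \mathbb{R}^2))$ for every $n \in \mathbb{N}$, then there exists a unique global classical solution to (2.5).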
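2.2. Existence of Weak Solutions ($d = 2, 3$). We shall prove this result through a series of lemmas. We provide a-priori estimates for an associated system, the Leray regularised form of (2.5), with renormalised chemical nonlinearities ($\boldsymbol{\theta}_\delta = \tilde{\boldsymbol{\theta}} + \hat{\boldsymbol{\theta}}_\delta$):
\begin{align}
\partial_t\hat\psi_\delta + u_\delta \cdot \nabla\psi_\delta &= \Delta\psi_\delta - \psi_\delta f(\theta_\delta)e^{-\delta\theta_\delta}, && \text{in } \Omega \times \mathbb{R}_+, \tag{2.10a}\\
\partial_t\hat\theta_\delta + u_\delta \cdot \nabla\theta_\delta &= \ell\Delta\theta_\delta + \psi_\delta f(\theta_\delta)e^{-\delta\theta_\delta}, && \text{in } \Omega \times \mathbb{R}_+, \tag{2.10b}\\
\partial_t u_\delta + w_\delta \cdot \nabla u_\delta &= \nu\Delta u_\delta - \nabla p_\delta + \sigma\hat\theta_\delta e, && \text{in } \Omega \times \mathbb{R}_+, \tag{2.10c}\\
\nabla \cdot u_\delta &= 0, && \text{in } \Omega \times \mathbb{R}_+, \tag{2.10d}\\
(\partial_{\hat n}\hat{\boldsymbol{\theta}}_\delta, u_\delta) &= 0, && \text{on } \partial\Sigma \times \mathbb{R} \times \mathbb{R}_+, \tag{2.10e}\\
(\hat{\boldsymbol{\theta}}_\delta, u_\delta) &\to 0, && \text{as } |y| \to \infty, \tag{2.10f}\\
\hat{\boldsymbol{\theta}}_\delta(0) &= \hat{\boldsymbol{\theta}}^{in}_\delta = J_\delta * \hat{\boldsymbol{\theta}}^{in}, && \text{in } \Omega, \tag{2.10g}\\
u_\delta(0) &= u^{in}_\delta = J_\delta * u^{in}, && \text{in } \Omega. \tag{2.10h}
\end{align}
For $\delta > 0$, let $J_\delta \in C_0^\infty(\mathbb{R}^d)$ be a Friedrichs mollifier [1] with support on $B_\delta(x, y)$, the ball of radius $\delta$ centred at $(x, y)$. We define $w_\delta(\xi) = (J_\delta * u_\delta)(\xi) = \int_{\mathbb{R}^d} J_\delta(\xi - \eta)\bar u_\delta(\eta)\, d\eta$, i.e. the mollification of $u_\delta$, where $\bar u_\delta$ is the zero extension of $u_\delta$ outside $\Omega$. We recall the usual mollifier properties: if $v \in L^q(\Omega; \mathbb{R}^d)$, $q \in [1, \infty)$, then $\|J_\delta * v\|_{L^q} \le \|v\|_{L^q}$ and $\lim_{\delta \to 0^+} \|J_\delta * v - v\|_{L^q} = 0$. For every $\delta > 0$, we know that a unique classical solution exists to (2.10) for the given initial data, at least on some interval $[0, T_\delta]$, $T_\delta > 0$. We remark that since the polynomial function $f(\cdot)$ is Lipschitz continuous and the nonlinear reaction terms are also bounded for this approximate system, such an existence result on a finite domain follows classically via a Faedo-Galerkin method, projecting initially onto the first $N$ eigenfunctions of the appropriate elliptic operators on a smooth bounded domain, say of vertical diameter $N$ (see for example Heywood [22]). Norm estimates can be shown to be independent of the domain size considered, and the result follows by considering the limit $N \to \infty$. Our a-priori Lebesgue and Sobolev norm estimates on $(\hat{\boldsymbol{\theta}}_\delta, u_\delta)$ will verify that such a solution exists on $[0, \infty)$, i.e. the interval of existence is independent of $\delta$. We will then eventually consider the limit $\delta \to 0^+$. The following preliminary lemma is an extension of the usual parabolic maximum principles (see Smoller [52] and Marion [36]).
Lemma 2.1. If $\psi^{in} \in [0, 1]$, $\theta^{in} \in [0, \infty)$ a.e. in $\Omega$, then $\psi_\delta \in [0, 1]$, $\theta_\delta \ge 0$, everywhere in $\Omega \times [0, T_\delta]$.
Proof. Consider the inner product of (2.10a) with $\psi^- := \max\{0, -\psi_\delta\}$ in $L^2(\Omega)$,
$$\|\psi^-(t)\|^2_{L^2} + 2\int_0^t \|\nabla\psi^-\|^2_{L^2(\Omega;\mathbb{R}^d)}\, ds \le \|\psi^-(0)\|^2_{L^2}.$$
Analogous estimates follow for $\theta^- := \max\{0, -\theta_\delta\}$ and $\psi^+ := \max\{0, \psi_\delta - 1\}$.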
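The computation behind the estimate for $\psi^-$ in the proof of Lemma 2.1 can be sketched as follows. This is a reconstruction, not text from the paper: it assumes that boundary terms vanish by (2.10e) and (2.10f), that $u_\delta$ is divergence free with $u_\delta = 0$ on the wall, that $\tilde\psi$ is independent of $t$ so that $\partial_t\hat\psi_\delta = \partial_t\psi_\delta$, and that $f \ge 0$ as in (1.3).

```latex
% On the set where \psi_\delta < 0 we have \psi^- = -\psi_\delta; elsewhere \psi^- = 0.
\begin{align*}
\langle\psi^-\,\partial_t\psi_\delta\rangle &= -\tfrac12\frac{d}{dt}\|\psi^-\|_{L^2}^2,
&
\langle\psi^-\,u_\delta\cdot\nabla\psi_\delta\rangle &= -\tfrac12\langle u_\delta\cdot\nabla(\psi^-)^2\rangle = 0,\\
\langle\psi^-\,\Delta\psi_\delta\rangle &= -\langle\nabla\psi^-\cdot\nabla\psi_\delta\rangle = \|\nabla\psi^-\|_{L^2}^2,
&
-\langle\psi^-\,\psi_\delta f(\theta_\delta)e^{-\delta\theta_\delta}\rangle &= \langle(\psi^-)^2 f(\theta_\delta)e^{-\delta\theta_\delta}\rangle \ge 0.
\end{align*}
% Inserting these four identities into (2.10a) tested against \psi^-:
\[
\frac{d}{dt}\|\psi^-\|_{L^2}^2 + 2\|\nabla\psi^-\|_{L^2}^2
= -2\langle(\psi^-)^2 f(\theta_\delta)e^{-\delta\theta_\delta}\rangle \le 0 ,
\]
% and integration over [0, t] gives the inequality stated in the proof.
```

In particular, if $\psi^-(0) = 0$ then $\psi^-$ vanishes for all later times, which is the lower bound $\psi_\delta \ge 0$.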
We now establish the main estimates we require to prove the existence of weak solutions for d = 2, 3. The phrase “uniformly bounded” is considered to be with respect to our regularising parameter δ > 0. We shall use c and c(·) to denote a generic finite positive constant which might depend on the argument indicated, but which does not depend on the regularisation parameter.
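Lemma 2.2. If $(\hat{\boldsymbol{\theta}}^{in}, u^{in}) \in L^2 \times H$, then the set $\{\hat{\boldsymbol{\theta}}_\delta, u_\delta\}$ is uniformly bounded in $L^\infty_{\mathrm{loc}}([0, \infty); L^2 \times H) \cap L^2_{\mathrm{loc}}([0, \infty); H^1(\Omega; \mathbb{R}^2) \times V)$, and in fact
$$\|\hat{\boldsymbol{\theta}}_\delta(t)\|_{L^2},\quad \|u_\delta(t)\|_H,\quad \int_0^t \|\hat{\boldsymbol{\theta}}_\delta(s)\|_{H^1(\Omega;\mathbb{R}^2)}\, ds = O(e^{ct}).$$
Proof. Motivated by the work of Masuda [37], Haraux and Youkana [20], Collet and Xin [12] and Bricmont, Kupiainen and Xin [7] on reaction-diffusion systems, we consider a simple nonlinear functional that allows us to take advantage of the interaction of the chemical nonlinearities. As in Collet and Xin [12], for $F \in C^2(\mathbb{R}^2; [0, \infty))$, we use (2.5a)–(2.5b) to derive
\begin{align}
\frac{d}{dt}\langle F(\hat\psi_\delta, \hat\theta_\delta)\rangle + \langle Q(\nabla\hat{\boldsymbol{\theta}}_\delta)\rangle
&= \langle\nabla \cdot (F_1\nabla\hat\psi_\delta + \ell F_2\nabla\hat\theta_\delta - u_\delta F)\rangle - \langle F_1 u_\delta \cdot \nabla\tilde\psi + F_2 u_\delta \cdot \nabla\tilde\theta\rangle \notag\\
&\quad + \langle F_1\Delta\tilde\psi + \ell F_2\Delta\tilde\theta\rangle - \langle(F_1 - F_2)\psi_\delta f(\theta_\delta)e^{-\delta\theta_\delta}\rangle, \tag{2.11}
\end{align}
where
$$Q(\nabla\hat{\boldsymbol{\theta}}_\delta) = F_{11}|\nabla\hat\psi_\delta|^2 + (1 + \ell)F_{12}\nabla\hat\psi_\delta \cdot \nabla\hat\theta_\delta + \ell F_{22}|\nabla\hat\theta_\delta|^2. \tag{2.12}$$
Here $F_1$ and $F_2$ are the partial derivatives of $F$ with respect to its first and second arguments. We would like to choose an $F(\hat\psi_\delta, \hat\theta_\delta)$ which satisfies:
\begin{align}
F_{11}F_{22} &> (1 + \ell)^2 F_{12}^2/4\ell, && \text{for all } \hat\psi_\delta \in [-1, 1],\ \hat\theta_\delta \in [-1, \infty), \tag{2.13a}\\
F_1 - F_2 &> 0, && \text{for all } \hat\psi_\delta \in [0, 1],\ \hat\theta_\delta \in [0, \infty). \tag{2.13b}
\end{align}
We impose (2.13a) to ensure the quadratic form (2.12) is non-negative and condition (2.13b) to partially help control the nonlinear terms.
Step 1. We show that the following inequality holds for the mean-square reactant concentration and temperature:
$$\frac{d}{dt}\langle F(\hat\psi_\delta, \hat\theta_\delta)\rangle + c(\ell)\left(\|\nabla\hat\psi_\delta\|^2_{L^2(\Omega;\mathbb{R}^d)} + \|\nabla\hat\theta_\delta\|^2_{L^2(\Omega;\mathbb{R}^d)}\right) \le c\left(\|\hat{\boldsymbol{\theta}}_\delta\|^2_{L^2} + \|u_\delta\|^2_H + \ell^2 + 1\right). \tag{2.14}$$
We assume $F$ to be the quadratic form $F(\hat\psi_\delta, \hat\theta_\delta) := \alpha\hat\psi_\delta^2 + \beta\hat\psi_\delta\hat\theta_\delta + \gamma\hat\theta_\delta^2$. Note that for arbitrary $(\alpha, \beta, \gamma) \in \mathbb{R}^3_+$ and $(\varepsilon_1, \varepsilon_2) \in \mathbb{R}^2_+$ we can estimate
\begin{align*}
F(\hat{\boldsymbol{\theta}}_\delta) &\ge \left(\alpha - \frac{\beta\varepsilon_1}{2}\right)\hat\psi_\delta^2 + \left(\gamma - \frac{\beta}{2\varepsilon_1}\right)\hat\theta_\delta^2,\\
Q(\nabla\hat{\boldsymbol{\theta}}_\delta) &\ge \left(2\alpha - \frac{(1 + \ell)\beta\varepsilon_2}{2}\right)|\nabla\hat\psi_\delta|^2 + \left(2\gamma\ell - \frac{(1 + \ell)\beta}{2\varepsilon_2}\right)|\nabla\hat\theta_\delta|^2.
\end{align*}
Condition (2.13b) is satisfied provided we choose
$$2\alpha - \beta = k_1 > 0 \quad\text{and}\quad \beta - 2\gamma = k_2 > 0. \tag{2.15}$$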
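The positivity requirements on $(\alpha, \beta, \gamma, \varepsilon_1, \varepsilon_2)$ that appear in the two lower bounds and in (2.15) are mutually compatible. A minimal sketch of one admissible choice is given below. It fixes $\gamma$, then $\beta$, then $\varepsilon_1, \varepsilon_2$, then $\alpha$; the function names and the safety factor `margin` are hypothetical and introduced here only for illustration.

```python
def choose_constants(ell, gamma=1.0, k2=1.0, margin=2.0):
    """Pick (alpha, beta, gamma, eps1, eps2) in the order gamma, beta, eps, alpha.

    `margin` > 1 is an illustrative safety factor, not a quantity from the paper.
    """
    if ell <= 0 or gamma <= 0 or k2 <= 0 or margin <= 1:
        raise ValueError("need ell, gamma, k2 > 0 and margin > 1")
    beta = 2.0 * gamma + k2                                    # beta - 2*gamma = k2 > 0
    eps1 = margin * beta / (2.0 * gamma)                       # gamma > beta / (2*eps1)
    eps2 = margin * (1.0 + ell) * beta / (4.0 * gamma * ell)   # 2*gamma*ell > (1+ell)*beta/(2*eps2)
    alpha = margin * max(beta / 2.0,                           # k1 = 2*alpha - beta > 0
                         beta * eps1 / 2.0,                    # alpha > beta*eps1/2
                         (1.0 + ell) * beta * eps2 / 4.0)      # 2*alpha > (1+ell)*beta*eps2/2
    return alpha, beta, gamma, eps1, eps2


def check_constants(ell, alpha, beta, gamma, eps1, eps2):
    """Evaluate each requirement for the quadratic F = a*psi^2 + b*psi*theta + g*theta^2."""
    return {
        "k1_positive": 2.0 * alpha - beta > 0,
        "k2_positive": beta - 2.0 * gamma > 0,
        "F_lower_bound": alpha > beta * eps1 / 2.0 and gamma > beta / (2.0 * eps1),
        "Q_lower_bound": (2.0 * alpha > (1.0 + ell) * beta * eps2 / 2.0
                          and 2.0 * gamma * ell > (1.0 + ell) * beta / (2.0 * eps2)),
        # (2.13a) with F11 = 2*alpha, F12 = beta, F22 = 2*gamma
        "condition_2_13a": 4.0 * alpha * gamma > (1.0 + ell) ** 2 * beta ** 2 / (4.0 * ell),
    }


if __name__ == "__main__":
    for ell in (0.1, 1.0, 10.0):
        constants = choose_constants(ell)
        print(ell, constants, check_constants(ell, *constants))
```

The last entry reflects that multiplying the two inequalities in the lower bound for $Q$ gives $4\alpha\gamma\ell > (1 + \ell)^2\beta^2/4$, which is (2.13a) for the quadratic $F$.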
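For a given $\gamma > 0$, we can always choose $\beta$ large enough so that $k_2 > 0$ and then $\varepsilon_1$ and $\varepsilon_2$ large enough so that $\gamma > \beta/2\varepsilon_1$ and $2\gamma\ell > (1 + \ell)\beta/2\varepsilon_2$. Finally we can choose $\alpha$ sufficiently large (but finite) such that $k_1 > 0$, $\alpha > \beta\varepsilon_1/2$ and $2\alpha > (1 + \ell)\beta\varepsilon_2/2$, and thus we guarantee (2.15) and hence (2.13b) as well as (2.13a). In (2.11) we bound the linear convection and diffusion terms using the Hölder and Young inequalities:
$$|\langle F_1\Delta\tilde\psi + \ell F_2\Delta\tilde\theta\rangle| + |\langle F_1 u_\delta \cdot \nabla\tilde\psi + F_2 u_\delta \cdot \nabla\tilde\theta\rangle| \le c\left(\int_{\Sigma \times \Theta} |u_\delta \cdot e|^2\rho(y)\, dxdy + \int_{\Sigma \times \Theta} |\hat{\boldsymbol{\theta}}_\delta|^2\rho(y)\, dxdy + \ell^2 + 1\right),$$
where $\Theta = \mathrm{supp}\{\partial_y\tilde\psi\} = \mathrm{supp}\{\partial_y\tilde\theta\}$ and $\rho(y) = \max\{|\partial_y\tilde\psi|, |\partial_y\tilde\theta|, |\partial_y^2\tilde\psi|, |\partial_y^2\tilde\theta|\}$. Using (2.13b) we estimate the nonlinear term
\begin{align}
\langle(k_1\hat\psi_\delta + k_2\hat\theta_\delta)\psi_\delta f(\theta_\delta)e^{-\delta\theta_\delta}\rangle
&\ge -\int_{\Omega_K} \big|{-k_1}|\hat\psi_\delta| + k_2\hat\theta_\delta\big|\,\psi_\delta f(\theta_\delta)\, dxdy - \int_{\Omega_-} \big|k_1\hat\psi_\delta - k_2|\hat\theta_\delta|\big|\,\psi_\delta f(\theta_\delta)\, dxdy - c \notag\\
&\ge -c\left(\|\hat{\boldsymbol{\theta}}_\delta\|^2_{L^2} + 1\right), \tag{2.16}
\end{align}
where $\Omega_K = \{(x, y) \in \Omega : \hat\psi_\delta \in [-1, 0],\ \hat\theta_\delta \in [0, K = 2k_1/k_2]\}$ and $\Omega_- = \{(x, y) \in \Omega : \hat\psi_\delta \in [0, 1],\ \hat\theta_\delta \in [-1, 0]\}$, and we have made use of the fact that $\tilde\psi \cdot \tilde\theta^m$ has compact support. Combining these last two inequalities yields (2.14).
Step 2. We establish the following energy inequality:
$$\frac{d}{dt}\|u_\delta\|^2_H + 2\nu\|u_\delta\|^2_V \le \|u_\delta\|^2_H + \sigma^2\|\hat\theta_\delta\|^2_{L^2}. \tag{2.17}$$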
This inequality follows by considering the inner product of (2.10c) with $u_\delta$ in $H$,
$$\frac12\frac{d}{dt}\|u_\delta\|^2_H + \nu\|u_\delta\|^2_V = \sigma(u_\delta, \hat\theta_\delta e)_H,$$
and then using the Hölder and Young inequalities to estimate
$$\sigma(u_\delta, \hat\theta_\delta e)_H \le \sigma\|u_\delta\|_H\|\hat\theta_\delta\|_{L^2} \le \frac12\|u_\delta\|^2_H + \frac{\sigma^2}{2}\|\hat\theta_\delta\|^2_{L^2}.$$
Combine the inequalities in Steps 1 and 2 and apply the Grönwall inequality, noting that by the usual mollifier properties $\|u^{in}_\delta\|_H \le \|u^{in}\|_H$ and $\|\hat{\boldsymbol{\theta}}^{in}_\delta\|_{L^2} \le \|\hat{\boldsymbol{\theta}}^{in}\|_{L^2}$, to establish the norm growth estimates.
We are now in a position to estimate the $L^q(\Omega)$-norms of the mass-fraction and temperature fields, and in particular $L^1$ estimates which indicate the growth rates of the reacting fronts.
Lemma 2.3. If, in addition to the assumptions of Lemma 2.2, we assume $\hat{\boldsymbol{\theta}}^{in}_\delta \in L^q$ for all $q \in [1, \infty)$, then $\{\hat{\boldsymbol{\theta}}_\delta\}$ is uniformly bounded in $L^\infty_{\mathrm{loc}}([0, \infty); L^q)$ for every $q \in [1, \infty)$ when $d = 2$ and $q \in [2, 6]$ when $d = 3$, and $\|\hat{\boldsymbol{\theta}}(t)\|_{L^q} = O(e^{ct})$. We can extend this estimate to include $q \in [1, 2)$ when $d = 3$, provided we additionally assume $m \in \{1, \ldots, 6\}$.
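The Grönwall step that closes the proof of Lemma 2.2 can be spelled out as follows. This is a sketch under stated assumptions: the combined quantity $E_\delta$ and the constants $c_0$, $C$ are labels introduced here for illustration, and the non-negative gradient terms on the left of (2.14) and (2.17) are simply discarded.

```latex
% Combined energy and the lower bound on F from Step 1:
\[
E_\delta(t) := \langle F(\hat\psi_\delta,\hat\theta_\delta)\rangle + \|u_\delta\|_H^2,
\qquad
\langle F\rangle \ge c_0\,\|\hat{\boldsymbol{\theta}}_\delta\|_{L^2}^2,
\quad
c_0 := \min\Big\{\alpha - \tfrac{\beta\varepsilon_1}{2},\ \gamma - \tfrac{\beta}{2\varepsilon_1}\Big\} > 0 .
\]
% Adding (2.14) and (2.17) and using \|\hat\theta_\delta\|_{L^2} \le \|\hat{\boldsymbol{\theta}}_\delta\|_{L^2}:
\[
\frac{dE_\delta}{dt}
\le c\big(\|\hat{\boldsymbol{\theta}}_\delta\|_{L^2}^2 + \|u_\delta\|_H^2 + \ell^2 + 1\big)
  + \|u_\delta\|_H^2 + \sigma^2\|\hat\theta_\delta\|_{L^2}^2
\le C\,(E_\delta + 1),
\]
% with C depending on c, c_0, \sigma and \ell but not on \delta. Hence
\[
\frac{d}{dt}\Big[(E_\delta + 1)\,e^{-Ct}\Big] \le 0
\quad\Longrightarrow\quad
E_\delta(t) \le \big(E_\delta(0) + 1\big)e^{Ct} - 1 .
\]
```

Since $E_\delta(0)$ is controlled by $\|\hat{\boldsymbol{\theta}}^{in}\|_{L^2}$ and $\|u^{in}\|_H$ through the mollifier bounds, this gives $\|\hat{\boldsymbol{\theta}}_\delta(t)\|_{L^2}, \|u_\delta(t)\|_H = O(e^{Ct/2})$ uniformly in $\delta$.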
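Proof. Step 1. We establish that if $\hat{\boldsymbol{\theta}}^{in}_\delta \in L^q$ for some $q \in [2, \infty)$ when $d = 2$ or some $q \in [2, 6]$ when $d = 3$, then $\hat{\boldsymbol{\theta}}_\delta$ lies in $L^\infty_{\mathrm{loc}}([0, \infty); L^q)$, which follows from the inequality: for any $n \in \mathbb{N}$,
$$\|\hat{\boldsymbol{\theta}}_\delta(t)\|_{L^{2n}} \le \|\hat{\boldsymbol{\theta}}_\delta(0)\|_{L^{2n}} + ct + c\int_0^t \left(\|\hat{\boldsymbol{\theta}}_\delta(s)\|^{1/n}_{L^2} + \|u_\delta(s)\|_{L^{2n}(\Omega;\mathbb{R}^d)}\right) ds. \tag{2.18}$$
For $n \in \mathbb{N}$, let $F$ be the polynomial $F(\hat\psi_\delta, \hat\theta_\delta) := \sum_{k=0}^{2n} \alpha_k\hat\psi_\delta^{2n-k}\hat\theta_\delta^k$, where $\alpha_k$, $k = 0, 1, \ldots, 2n$, are positive constants. We can choose the $\alpha_k$ so that $F$ satisfies the conditions (2.13) and also that $F \ge c\hat\psi_\delta^{2n} + c\hat\theta_\delta^{2n}$ for all $\hat\psi_\delta \in [-1, 1]$, $\hat\theta_\delta \in [-1, \infty)$, to ensure $\langle F\rangle$ is a positive definite functional; we provide a proof in the appendix (analogous to the quadratic case). Using (2.11), since there exists $A \in \mathbb{R}$ such that $F_1 - F_2 > 0$ for all $\hat\theta_\delta > A$, then with $\Omega_A = \{(x, y) \in \Omega : \hat\psi_\delta \in [-1, 1],\ \hat\theta_\delta \in [-1, A]\}$:
\begin{align*}
\frac{d}{dt}\langle F\rangle &\le c\int_{\Omega_A} |\hat{\boldsymbol{\theta}}_\delta|^{2n-1}\psi_\delta f(\theta_\delta)\, dxdy + c\left(\|u_\delta\|_{L^{2n}(\Omega;\mathbb{R}^d)} + 1\right)\sum_{i=1,2}\|F_i\|_{L^{\frac{2n}{2n-1}}(\Omega)}\\
&\le c\int_{\Omega_A} |\hat{\boldsymbol{\theta}}_\delta|^{2n-1}\,\tilde\psi \cdot \tilde\theta^m\, dxdy + c(A)\left(\int_{\Omega_A} |\hat{\boldsymbol{\theta}}_\delta|^2\, dxdy\right)^{\frac{1}{2n}}\langle F\rangle^{1 - \frac{1}{2n}} + c\left(\|u_\delta\|_{L^{2n}(\Omega;\mathbb{R}^d)} + 1\right)\langle F\rangle^{1 - \frac{1}{2n}}.
\end{align*}
We have used that $\|\nabla\tilde{\boldsymbol{\theta}}\|_{L^\infty} \le c$, $\|\nabla\tilde{\boldsymbol{\theta}}\|_{L^{2n}(\Omega;\mathbb{R}^{d+2})} \le c$ and $\|\Delta\tilde{\boldsymbol{\theta}}\|_{L^{2n}} \le c$. Now using that $\tilde\psi \cdot \tilde\theta^m$ has compact support, the last inequality becomes
$$\frac{d}{dt}\langle F\rangle^{\frac{1}{2n}} \le c(A)\left(\int_{\Omega_A} |\hat{\boldsymbol{\theta}}_\delta|^2\, dxdy\right)^{\frac{1}{2n}} + c\left(\|u_\delta\|_{L^{2n}(\Omega;\mathbb{R}^d)} + 1\right),$$
which yields (2.18) after integration in time. If we use the Gagliardo–Nirenberg inequality (2.1) to estimate the $L^{2n}$-norm of the velocity field in (2.18), we must restrict ourselves to $n = 1, 2$ or $3$ when $d = 3$. Intermediate $L^q$-estimates follow from (2.2).
Step 2. If $\hat{\boldsymbol{\theta}}^{in}_\delta \in L^1$ and provided $\psi_\delta f(\theta_\delta) \in L^1(\Omega)$, then $\{\hat{\boldsymbol{\theta}}_\delta\}$ is uniformly bounded in $L^\infty_{\mathrm{loc}}([0, \infty); L^q(\Omega))$ for every $q \in [1, 2)$. In particular, by Step 1, the latter assumption here is true for all $m \in \mathbb{N}$ when $d = 2$, but when $d = 3$ we are restricted to reactions for which $m \in \{1, \ldots, 6\}$. As in Constantin and Fefferman [13], consider $F(\xi) = \int_0^\xi (\xi - \eta)\phi(\eta)\, d\eta \in C^2(\mathbb{R})$, where: $\phi(\xi) \ge 0$ for all $\xi \in \mathbb{R}$; $\phi \to 0$ as $\xi \to 0$; $\phi(\xi) = 0$ for all $\xi > \rho_0$; and $\int_0^{\rho_0} \phi(\xi)\, d\xi = 1$. Hence $F'(\xi) = \int_0^\xi \phi(\eta)\, d\eta \in [0, 1]$ and $F''(\xi) = \phi(\xi) \ge 0$. We form the estimate ($\Psi := \{(x, y) \in \Omega : |\hat\psi_\delta| > \rho_0\}$)
$$\frac{d}{dt}\int_\Psi F(|\hat\psi_\delta|)\, dxdy + \int_\Psi F''(|\hat\psi_\delta|)\,|\nabla|\hat\psi_\delta||^2\, dxdy \le \int_\Omega \left(|\psi_\delta f(\theta_\delta)| + |u_\delta \cdot \nabla\tilde\psi| + |\Delta\tilde\psi|\right) dxdy.$$
Taking the limit as $\rho_0 \to 0$, we get
$$\frac{d}{dt}\|\hat\psi_\delta\|_{L^1(\Omega)} \le \|\psi_\delta f(\theta_\delta)\|_{L^1(\Omega)} + c\|u_\delta\|_H + c.$$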
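Now use that
\[ \|\psi_\delta f(\theta_\delta)\|_{L^1(\Omega)} \le c\Bigl(1 + \|\hat\psi_\delta\|_{L^1(\Omega)} + \|\hat\theta_\delta\|_{L^1(\Omega)} + \sum_{k=2}^m \|\hat\theta_\delta\|^k_{L^k(\Omega)}\Bigr), \]
where, by Step 1, the last term on the right-hand side is uniformly bounded for every $m \in \mathbb{N}$ when $d = 2$, but only for $m \in \{1, \ldots, 6\}$ when $d = 3$. We can derive an analogous estimate for $\|\hat\theta_\delta\|_{L^1(\Omega)}$. Adding the two estimates together, using Lemma 2.2 and the interpolation inequality (2.2), the result follows.

Consider the following weak form of our regularised system (2.10), for all $(\varphi, v) \in C^\infty([0,\infty); E \times D)$ and $[t_0, t_1] \subset [0, \infty)$,
\[ \int_{t_0}^{t_1} \langle \hat\psi_\delta\,\partial_t\varphi + \psi_\delta\Delta\varphi + \psi_\delta u_\delta\cdot\nabla\varphi - \psi_\delta f(\theta_\delta)e^{-\delta\theta_\delta}\varphi\rangle\,dt = \langle\hat\psi_\delta(t)\varphi(t)\rangle\big|_{t_0}^{t_1}, \tag{2.19a} \]
\[ \int_{t_0}^{t_1} \langle \hat\theta_\delta\,\partial_t\varphi + \ell\,\theta_\delta\Delta\varphi + \theta_\delta u_\delta\cdot\nabla\varphi + \psi_\delta f(\theta_\delta)e^{-\delta\theta_\delta}\varphi\rangle\,dt = \langle\hat\theta_\delta(t)\varphi(t)\rangle\big|_{t_0}^{t_1}, \tag{2.19b} \]
\[ \int_{t_0}^{t_1} \langle u_\delta\cdot\partial_t v + \nu u_\delta\cdot\Delta v + w_\delta\otimes u_\delta : \nabla v + \sigma\hat\theta_\delta\, v\cdot e\rangle\,dt = \langle u_\delta(t)\, v(t)\rangle\big|_{t_0}^{t_1}. \tag{2.19c} \]
From Lemmas 2.2 and 2.3, there exists a subsequence $\{\hat{\boldsymbol\theta}_\delta, u_\delta\}$ (which we label by the same subscript) such that for all $q \in [1, \infty)$ when $d = 2$ and $q \in [2, 6]$ when $d = 3$, as $\delta \to 0^+$,
\[ \begin{aligned} \hat{\boldsymbol\theta}_\delta &\to \hat{\boldsymbol\theta} \ \text{ in } w^*\text{-}L^\infty_{\rm loc}([0,\infty); w\text{-}L^q) \cap w\text{-}L^2_{\rm loc}([0,\infty); w\text{-}H^1(\Omega; \mathbb{R}^2)), \\ u_\delta &\to u \ \text{ in } w^*\text{-}L^\infty_{\rm loc}([0,\infty); w\text{-}H) \cap w\text{-}L^2_{\rm loc}([0,\infty); w\text{-}V). \end{aligned} \tag{2.20} \]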
Convergence of the linear terms in (2.19) to the appropriate terms in (2.7) follows from (2.20). By standard results we can deduce from (2.19c) that $\{\partial_t u_\delta\}$ is uniformly bounded in $L^{4/d}_{\rm loc}([0,\infty); V')$. The Aubin-Lions theorem [14, 54] for Bochner spaces (based on the Rellich-Kondrachov compact imbedding) implies
\[ \{\phi \in L^2_{\rm loc}([0,\infty); H^1(\Omega)) : \partial_t\phi \in L^{4/d}_{\rm loc}([0,\infty); H^1(\Omega)')\} \hookrightarrow\hookrightarrow L^2_{\rm loc}([0,\infty); L^2_{\rm loc}(\Omega)), \tag{2.21} \]
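where the target space is equipped with the strong topology. Hence $\{u_\delta\}$ is relatively compact in $L^2_{\rm loc}([0,\infty); L^2(B_R))$ for any $R \in (0, \infty)$. So $u_\delta \to u$ strongly in both $L^2_{\rm loc}([0,\infty); L^2({\rm supp}\{v\}))$ and $L^2_{\rm loc}([0,\infty); L^2({\rm supp}\{\varphi\}))$, which together with $\hat{\boldsymbol\theta}_\delta \to \hat{\boldsymbol\theta}$ weakly in $L^2_{\rm loc}([0,\infty); L^2({\rm supp}\{\varphi\}))$, is sufficient to establish the convergence of the nonlinear convective terms in our weak formulation since
\[ \Bigl|\int_{t_0}^{t_1} \langle\nabla\varphi\cdot(u_\delta\hat{\boldsymbol\theta}_\delta - u\hat{\boldsymbol\theta})\rangle\,dt\Bigr| \le \int_{t_0}^{t_1} \|u_\delta - u\|_H\,\|\hat{\boldsymbol\theta}_\delta\|_{L^2}\,\|\nabla\varphi\|_{L^\infty(\Omega;\mathbb{R}^d)}\,dt + \Bigl|\int_{t_0}^{t_1} \langle u\cdot\nabla\varphi\,(\hat{\boldsymbol\theta}_\delta - \hat{\boldsymbol\theta})\rangle\,dt\Bigr|. \]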
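The bound above rests on adding and subtracting $u\,\hat\theta_\delta$. A short sketch of that splitting, written for a single scalar component and assuming nothing beyond Hölder's inequality for the first piece:

```latex
\begin{align*}
u_\delta\hat\theta_\delta - u\hat\theta
  &= (u_\delta-u)\,\hat\theta_\delta + u\,(\hat\theta_\delta-\hat\theta),\\
\bigl|\langle\nabla\varphi\cdot(u_\delta-u)\,\hat\theta_\delta\rangle\bigr|
  &\le \|\nabla\varphi\|_{L^\infty}\,
       \|u_\delta-u\|_{L^2(\mathrm{supp}\,\varphi)}\,
       \|\hat\theta_\delta\|_{L^2}.
\end{align*}
```

The first piece tends to zero by the strong convergence of $u_\delta$ together with the uniform bound on $\hat\theta_\delta$; the second tends to zero by weak convergence of $\hat\theta_\delta$ tested against the fixed function $u\cdot\nabla\varphi$.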
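By applying Green's theorem we can establish that
\[ \begin{aligned} \|\ell\Delta\hat\theta_\delta + u_\delta\cdot\nabla\hat\theta_\delta\|_{H^1(\Omega)'} &\le \sup_{\|\varphi\|_{H^1}=1}\bigl\{\ell|\langle\varphi\Delta\hat\theta_\delta\rangle| + |\langle\varphi\,u_\delta\cdot\nabla\hat\theta_\delta\rangle|\bigr\} \\ &= \sup_{\|\varphi\|_{H^1}=1}\bigl\{\ell|\langle\nabla\varphi\cdot\nabla\hat\theta_\delta\rangle| + |\langle\nabla\varphi\cdot u_\delta\hat\theta_\delta\rangle|\bigr\} \\ &\le \ell\|\nabla\hat\theta_\delta\|_{L^2(\Omega)} + \|u_\delta\|_{L^4(\Omega;\mathbb{R}^d)}\|\hat\theta_\delta\|_{L^4(\Omega)}. \end{aligned} \]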
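Using the conclusions of Lemma 2.3 and recalling that $\tilde\psi\cdot\tilde\theta$ has compact support, we know that $\{\psi_\delta f(\theta_\delta)e^{-\delta\theta_\delta}\}$ is uniformly bounded in $L^\infty_{\rm loc}([0,\infty); L^2(\Omega))$ provided we assume $m = 1, 2$ or $3$ when $d = 3$ (no assumption when $d = 2$). Hence
\[ \partial_t\hat\theta_\delta = \ell\Delta\theta_\delta - u_\delta\cdot\nabla\theta_\delta + \psi_\delta f(\theta_\delta)e^{-\delta\theta_\delta} \in L^2_{\rm loc}([0,\infty); H^1(\Omega)') \]
for $d = 2, 3$. Analogous estimates hold for $\partial_t\hat\psi_\delta$. By employing the compact embedding (2.21) we can establish that $\hat{\boldsymbol\theta}_\delta \to \hat{\boldsymbol\theta}$ strongly in $L^2_{\rm loc}([0,\infty); L^2({\rm supp}\{\varphi\}))$. We can now guarantee the convergence of the nonlinear reaction terms in (2.19) to the appropriate terms in (2.7) by, for example, using the identity
\[ \int_{t_0}^{t_1} \bigl\langle\varphi\bigl(\psi_\delta f(\theta_\delta)e^{-\delta\theta_\delta} - \psi f(\theta)\bigr)\bigr\rangle\,dt \equiv \int_{t_0}^{t_1} \bigl\langle\varphi\bigl((\psi_\delta - \psi)f(\theta_\delta)e^{-\delta\theta_\delta} + \psi\bigl(f(\theta_\delta)e^{-\delta\theta_\delta} - f(\theta)\bigr)\bigr)\bigr\rangle\,dt, \]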
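and noting that strong convergence of $\hat\psi_\delta \to \hat\psi$ in $L^2_{\rm loc}([0,\infty); L^2({\rm supp}\{\varphi\}))$ and weak convergence of $f(\theta_\delta) \to f(\theta)$ in $L^2_{\rm loc}([0,\infty); L^2({\rm supp}\{\varphi\}))$ are sufficient to ensure the right-hand side converges to zero.

Choosing $v \in D$ independent of time, we deduce from (2.7) and the density of $D$ in $H$ that $u \in C([0,\infty); w\text{-}H)$. Further, from the embedding
\[ \{\phi \in L^2_{\rm loc}([0,\infty); H^1(\Omega)) : \partial_t\phi \in L^2_{\rm loc}([0,\infty); H^1(\Omega)')\} \hookrightarrow C([0,\infty); L^2(\Omega)), \]
we deduce that $\hat{\boldsymbol\theta} \in C([0,\infty); L^2)$ after taking the limit $\delta \to 0^+$. This last embedding also implies that when $d = 2$, we have $u \in C([0,\infty); H)$.

2.3. Stronger Solutions (d = 2). We prove uniqueness for slightly more regular solutions and then show that classical solutions exist provided the initial data are smooth.

Lemma 2.4. When $d = 2$, for initial data $\hat{\boldsymbol\theta}^{\rm in} \in H^1(\Omega; \mathbb{R}^2)$ and $u^{\rm in} \in H$, weak solutions also satisfy $\hat{\boldsymbol\theta} \in L^\infty_{\rm loc}([0,\infty); H^1(\Omega; \mathbb{R}^2)) \cap L^2_{\rm loc}([0,\infty); W^2)$ and this additional regularity is sufficient to establish uniqueness.

Proof. Consider the inner product of (2.10b) with $-\Delta\hat\theta_\delta$ in $L^2(\Omega)$,
\[ \frac{d}{dt}\|\nabla\hat\theta_\delta\|^2_{L^2} + 2\ell\|\Delta\hat\theta_\delta\|^2_{L^2} \le c\|\Delta\hat\theta_\delta\|_{L^2}\Bigl(\|\nabla\hat\theta_\delta\|_{L^4}\|u_\delta\|_{L^4} + \|\psi_\delta f(\theta_\delta)e^{-\delta\theta_\delta}\|_{L^2} + \|u\|_H + \|\nabla\hat\theta_\delta\|_{L^2} + \ell + 1\Bigr). \]
Using the Gagliardo–Nirenberg and Young inequalities, for arbitrary $\sigma > 0$,
\[ \|\Delta\hat\theta_\delta\|_{L^2}\|\nabla\hat\theta_\delta\|_{L^4}\|u_\delta\|_{L^4} \le \sigma\|\Delta\hat\theta_\delta\|^2_{L^2} + \frac{c}{\sigma}\|\nabla\hat\theta_\delta\|^2_{L^2}\|u_\delta\|^{8/(4-d)}_{L^4}. \]
Also note from the Gagliardo–Nirenberg inequality and Lemma 2.2 that $\|u_\delta\|^{8/(4-d)}_{L^4}$ lies in $L^1_{\rm loc}([0,\infty))$ when $d = 2$. Hence, noting the regularity already established, forming an analogous estimate for $\|\nabla\psi_\delta(t)\|_{L^2}$, integrating with respect to time and considering the limit $\delta \to 0^+$, the regularity result follows.

We know [54] that for $\varphi \in C([0,\infty); L^2(\Omega))$, $\frac{d}{dt}\|\varphi\|^2_{L^2} = 2(\varphi, \partial_t\varphi)_{L^2}$ in the scalar distribution sense on compact subsets of $[0, \infty)$. Let $(\boldsymbol\theta_1, u_1)$ and $(\boldsymbol\theta_2, u_2)$ be two weak solutions to our system (2.7) with the same initial data $(\hat{\boldsymbol\theta}^{\rm in}, u^{\rm in}) \in H^1(\Omega; \mathbb{R}^2) \times H$. Set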
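$\bar{\boldsymbol\theta} = \boldsymbol\theta_2 - \boldsymbol\theta_1$ and $\bar u = u_2 - u_1$. Since $E$ is dense in $W$ we can choose $\varphi = \hat\theta$ in (2.7b) and use Green's theorem to form
\[ \frac12\|\bar\theta(t_1)\|^2_{L^2} - \frac12\|\bar\theta(t_0)\|^2_{L^2} + \ell\int_{t_0}^{t_1}\|\nabla\bar\theta\|^2_{L^2}\,dt = \int_{t_0}^{t_1}\langle\nabla\bar\theta\cdot(u_2\theta_2 - u_1\theta_1)\rangle\,dt + \int_{t_0}^{t_1}\langle\bar\theta(\psi_2 f(\theta_2) - \psi_1 f(\theta_1))\rangle\,dt, \]
where
\[ \langle\nabla\bar\theta\cdot(u_2\theta_2 - u_1\theta_1)\rangle \le \|\nabla\bar\theta\|_{L^2}\bigl(\|\bar u\|_{L^4}\|\hat\theta_2\|_{L^4} + \|\bar\theta\|_{L^4}\|u_1\|_{L^4}\bigr) + c\|\nabla\bar\theta\|_{L^2}\bigl(\|\bar u\|_H + \|\bar\theta\|_{L^2}\bigr). \]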
We can estimate $\langle\bar\theta(\psi_2 f(\theta_2) - \psi_1 f(\theta_1))\rangle$ in a similar fashion. Using the Gagliardo–Nirenberg and Young inequalities, forming analogous estimates for $\bar\psi$ and $\bar u$, adding them together and choosing $t_0 = 0$, the result follows.

Lemma 2.5. If $(\hat{\boldsymbol\theta}^{\rm in}, u^{\rm in}) \in W^2 \times (H^2(\Omega; \mathbb{R}^2) \cap V)$, then $\hat{\boldsymbol\theta}$ lies in $L^\infty_{\rm loc}([0,\infty); W^2)$.
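Proof. Consider the $L^2(\Omega)$-inner product of the time derivative of (2.10b) with $\partial_t\hat\theta_\delta$,
\[ \begin{aligned} \frac12\frac{d}{dt}\|\partial_t\hat\theta_\delta\|^2_{L^2} + \ell\|\nabla(\partial_t\hat\theta_\delta)\|^2_{L^2} \le{}& \|\partial_t\hat\theta_\delta\|_{L^4}\|\partial_t\hat\psi_\delta\|_{L^2}\|\hat\theta_\delta\|^m_{L^{4m}} + \|\partial_t\hat\theta_\delta\|_{L^2}\|\partial_t\hat\psi_\delta\|_{L^2} \\ &+ c\|\partial_t\hat\theta_\delta\|^2_{L^4}\|\psi_\delta f'(\theta_\delta)\|_{L^2} + \delta c\|\partial_t\hat\theta_\delta\|^2_{L^4}\|\psi_\delta f(\theta_\delta)\|_{L^2} \\ &+ \|\partial_t\hat\theta_\delta\|_{L^4}\|\nabla\hat\theta_\delta\|_{L^4}\|\partial_t u_\delta\|_H + c\|\partial_t\hat\theta_\delta\|_{L^2}\|\partial_t u_\delta\|_H. \end{aligned} \]
We can obtain analogous estimates for $\partial_t\hat\psi_\delta$ and $\partial_t u_\delta$. Using the Gagliardo–Nirenberg and Young inequalities and the regularity already established, add the inequalities for $\partial_t\hat\psi_\delta$, $\partial_t\hat\theta_\delta$ and $\partial_t u_\delta$, integrate with respect to time, and note from (2.10) that
\[ \lim_{t_1\to0^+}\|(\partial_t\hat{\boldsymbol\theta}_\delta, \partial_t u_\delta)(t_1)\|_{L^2\times H} \le c\,\|(\hat{\boldsymbol\theta}^{\rm in}_\delta, u^{\rm in}_\delta)\|_{W^2\times(H^2(\Omega;\mathbb{R}^2)\cap V)}. \]
From (2.10b) and the regularity already established, it follows that $\ell\Delta\hat\theta_\delta = g$, where
\[ g = \partial_t\hat\theta_\delta + u_\delta\cdot\nabla\theta_\delta - \psi_\delta f(\theta_\delta)e^{-\delta\theta_\delta} - \ell\Delta\tilde\theta \in L^\infty_{\rm loc}([0,\infty); L^2(\Omega)). \]
An analogous estimate for $\hat\psi_\delta$ follows. Then take the limit $\delta \to 0^+$.

Finally, we deduce from standard estimates that $u$ lies in $L^\infty_{\rm loc}([0,\infty); V \cap H^2(\Omega; \mathbb{R}^2))$. That all components of $\hat{\boldsymbol\theta}$ and $u$ lie in $C^\infty(\Omega\times\mathbb{R}^+)$, when the initial data is smooth, follows from the standard elliptic estimates in Constantin and Foias [14], p. 26, and Temam [54], pp. 302–3, as well as estimates for passively convected reaction-diffusion systems found in Smoller [52] or Gilbarg and Trudinger [19]. The proof of Theorem 2.1 is complete.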
3. Front Solutions to a System with Unit Lewis Number (d = 2)

3.1. Orientation and Statement of Results. Let us consider the front solutions of the following simpler system:
\[ u_t + u\cdot\nabla_x u = -\nabla_x p + \nu\Delta_x u + \varepsilon T\hat x_2 + f(x), \qquad \nabla_x\cdot u = 0, \tag{3.1} \]
\[ T_t + u\cdot\nabla_x T = \Delta_x T + g(T), \tag{3.2} \]
where we denote by $T$ the temperature; $x = (x_1, x_2) \in \Omega \equiv (0, 2\pi)\times\mathbb{R}^1$; $f(x) = (0, \sin x_1)$, a shearing force; $g(T) = T(1-T)(T-\mu)$, $\mu \in (0, \frac12)$, the bistable nonlinearity. The other notations are the same as before except that $\varepsilon$ is now the Rayleigh number. System (3.1)-(3.2) is the unit Lewis number case of the original system studied in the earlier sections. The identity $\psi + \theta = 1$ makes possible the reduction of the chemistry part to the temperature equation. Our goal is to find an asymptotic regime where the structures of front solutions can be explicitly demonstrated, and the front speeds are uniformly bounded in time. Since the main issue is to deal with the fluid coupling, we choose the technically simpler bistable nonlinearity for the reaction term $g$ in (3.2). Taking $g$ as the combustion type nonlinearity of Arrhenius form with ignition temperature cutoff would require working in weighted function spaces, and only produce similar results. In the same spirit, we choose a special shear forcing function for $f(x)$, which was proposed by Kolmogorov for studying two dimensional turbulence, see [38, 50] and references therein. The boundary conditions for system (3.1)-(3.2) are:
\[ u|_{x_1=0,2\pi} = 0, \quad \forall\, x_2, \tag{3.3} \]
\[ T_{x_1}|_{x_1=0,2\pi} = 0, \quad \forall\, x_2, \tag{3.4} \]
respectively the no slip boundary condition for velocity $u$ and the adiabatic boundary condition for temperature $T$. When $\varepsilon = 0$, Eq. (3.1) decouples from (3.2), and a simple stationary solution is:
\[ u_0 = (0, \nu^{-1}\sin x_1), \tag{3.5} \]
with pressure $p_0 = 0$. For $u = u_0$, Eq. (3.2) admits traveling front solutions of the form $T = T_0(x_1, x_2 - c_0 t) \equiv T_0(y, s)$, $y = x_1$, $s = x_2 - c_0 t$, and $T_0$ satisfies the elliptic equation:
\[ \Delta_{y,s}T_0 + (\nu^{-1}\sin y + c_0)T_{0,s} + g(T_0) = 0, \tag{3.6} \]
for $(y, s) \in (0, 2\pi)\times\mathbb{R}^1$, with the boundary conditions:
\[ T_0(y, -\infty) = 0, \quad T_0(y, +\infty) = 1, \quad \max_{y\in[0,2\pi]} T_0(y, 0) = \frac12, \quad T_{0,y}(y, s)|_{y=0,2\pi} = 0. \tag{3.7} \]
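As a quick check that the shear flow (3.5) is indeed stationary once the buoyancy (Rayleigh) term is switched off, the following sketch uses only (3.1), the forcing $f(x) = (0, \sin x_1)$ and the no slip condition (3.3):

```latex
\begin{align*}
u_0\cdot\nabla u_0 &= \nu^{-1}\sin x_1\,\partial_{x_2}\bigl(0,\ \nu^{-1}\sin x_1\bigr) = 0,\\
\nabla\cdot u_0 &= \partial_{x_2}\bigl(\nu^{-1}\sin x_1\bigr) = 0,\\
\nu\Delta u_0 + f &= \bigl(0,\ -\sin x_1\bigr) + \bigl(0,\ \sin x_1\bigr) = 0,\\
u_0\big|_{x_1=0} &= u_0\big|_{x_1=2\pi} = 0 .
\end{align*}
```

All terms therefore balance with $p_0 = 0$. The same computation shows $|u_0|_\infty = |\nabla u_0|_\infty = \nu^{-1}$, which is the size that enters the later energy estimates.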
Existence, uniqueness, and asymptotic stability of such traveling fronts are studied at length in a series of papers by Berestycki, Larrouturou, Nirenberg, and Roquejoffre, see [4, 5, 3], and [48]. The linearised operator around $T_0$ (see Subsect. 3.2 for details) has an eigenfunction $T_{0,s}(x_1, x_2)$ for the simple eigenvalue zero on $L^2(\Omega)$, and its $L^2$ adjoint operator also has an eigenfunction corresponding to the simple eigenvalue zero, which we denote by $T^*_{0,s}$. The $L^2$ orthogonal complement of $T^*_{0,s}$ is denoted by $W$. Now let us consider front solutions to system (3.1)-(3.2) ($\varepsilon \ne 0$) of the form:
\[ u = u_0 + u_1(t, x, \varepsilon), \tag{3.8} \]
\[ p = p_1(t, x, \varepsilon), \tag{3.9} \]
\[ T = T_0(x_1, x_2 - c(t, \varepsilon)) + T_1(t, x, \varepsilon), \tag{3.10} \]
\[ c(t, \varepsilon) = c_0 t + c_1(t, \varepsilon). \tag{3.11} \]
We will show that for $\varepsilon$ small enough, (3.8)-(3.11) are valid for all $t \ge 0$, with $u_1$, $p_1$, $T_1$ uniformly bounded in proper norms, and $|c_1| \le O(t)$, as $t \to +\infty$. Substituting (3.8)–(3.11) into (3.1)–(3.2), and using Eq. (3.6), we have:
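\[ u_{1,t} + u_0\cdot\nabla u_1 + u_1\cdot\nabla u_0 + u_1\cdot\nabla u_1 = -\nabla p_1 + \nu\Delta u_1 + \varepsilon T_0\hat x_2 + \varepsilon T_1\hat x_2, \tag{3.12} \]
\[ \nabla\cdot u_1 = 0, \tag{3.13} \]
\[ T_{1,t} + u_0\cdot\nabla T_1 + u_1\cdot\nabla T_0 + u_1\cdot\nabla T_1 - c_1'(t, \varepsilon)T_{0,s} = \Delta T_1 + g'(T_0)T_1 + N(T_1, \varepsilon), \tag{3.14} \]
where
\[ N(T_1, \varepsilon) = g(T_0 + T_1) - g(T_0) - g'(T_0)T_1, \tag{3.15} \]
and so $N(T_1, \varepsilon)$ contains quadratic and cubic terms in $T_1$. The prime on $c$ denotes the time derivative. The boundary conditions for $u_1$ and $T_1$ are:
\[ u_1|_{x_1=0,2\pi} = 0, \qquad T_{1,x_1}|_{x_1=0,2\pi} = 0. \tag{3.16} \]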
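Because $g$ is a cubic, the remainder $N$ in (3.15) can be written out exactly. The sketch below checks numerically that $N = (1 + \mu - 3T_0)T_1^2 - T_1^3$; the function names are illustrative only and are not taken from the paper.

```python
def g(T, mu):
    """Bistable nonlinearity g(T) = T(1 - T)(T - mu)."""
    return T * (1.0 - T) * (T - mu)


def g_prime(T, mu):
    """Derivative of g, from g(T) = -T^3 + (1 + mu) T^2 - mu T."""
    return -3.0 * T**2 + 2.0 * (1.0 + mu) * T - mu


def remainder(T0, T1, mu):
    """N = g(T0 + T1) - g(T0) - g'(T0) T1, as defined in (3.15)."""
    return g(T0 + T1, mu) - g(T0, mu) - g_prime(T0, mu) * T1


def remainder_closed_form(T0, T1, mu):
    """Taylor expansion of the cubic: g''(T0)/2 * T1^2 + g'''/6 * T1^3."""
    return (1.0 + mu - 3.0 * T0) * T1**2 - T1**3
```

For $0 \le T_0 \le 1$ and $\mu \in (0, \frac12)$ the quadratic coefficient satisfies $|1 + \mu - 3T_0| \le 2$, which is consistent with the pointwise bound $|N| \le C(T_1^2 + |T_1|^3)$ used later in the section.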
Let us introduce some notation for this section. For any open subset $G \subset \mathbb{R}^2$ and measurable vector function $u(x) = (u_1(x), u_2(x))$, define:
\[ \|u\|^2_G = \sum_{i=1,2}\int_G u_i^2(x)\,dx, \qquad L^2(G) = \{u : \|u\|_G < +\infty\}, \]
\[ \|\nabla u\|^2_G = \sum_{i,j=1,2}\int_G |u_{i,x_j}|^2\,dx, \qquad H^1(G) = \{u : \|u\|^2_G + \|\nabla u\|^2_G < +\infty\}, \]
\[ \|D^2 u\|^2_G = \sum_{i,j,k=1,2}\int_G |u_{i,x_j,x_k}|^2\,dx, \qquad H^2(G) = \{u : \|D^2 u\|_G < +\infty\}, \]
\[ D(G) = \{u \in C_0^\infty(G) : \nabla\cdot u = 0\}, \]
$J(G)$ = closure of $D(G)$ in the norm $\|u\|_G$, $J_0(G)$ = closure of $D(G)$ in the norm $H^1(G)$, $H_0^1(G)$ = closure of $C_0^\infty(G)$ in the norm $H^1(G)$. It is well known that the orthogonal complement of $J(G)$ in $L^2(G)$ is:
\[ J^\perp \equiv \{u : u = \nabla p, \text{ for some } p \in H^1_{\rm loc}(G), \text{ with } \nabla p \in L^2_{\rm loc}(G)\}. \]
Let $P$ be the orthogonal projection from $L^2(G)$ to $J(G)$, then the Stokes operator is:
\[ -A = P\Delta, \tag{3.17} \]
with domain of definition $D(A) = H^2(G) \cap J_0$. If $G$ is bounded and $\partial G$ is $C^3$, then $A : D(A) \to J(G)$ is one to one and onto. Moreover, $A^{-1}$ exists, is completely continuous, and symmetric. The eigenfunctions of $A$, denoted by $a_l(x)$, $l = 1, 2, \cdots$, with eigenvalues $\lambda_l$, i.e., $Aa_l = \lambda_l a_l$, are orthogonal and complete in $J_0(G)$. The main result of the section is:
Theorem 3.1. Let $u_1(0, x) \in J_0(\Omega)$, $T_1(0, x) \in H^1(\Omega) \cap W$, $c_1(0, \varepsilon) = 0$. Let $\langle T_0\rangle$ be the integral average of $T_0(x_1, x_2)$ over $x_1 \in [0, 2\pi]$, so that $\langle T_0\rangle$ is a bounded smooth function of $x_2$. Then there exists a positive number $\varepsilon_0$ depending on the $H^1$ norms of $u_1(0, x)$ and $T_1(0, x)$, and on $\nu$, such that if $\nu > 2\pi$ and $\varepsilon \in (0, \varepsilon_0)$, system (3.12)-(3.14) admits unique solutions $u_1 = u_1(t, x, \varepsilon)$, $T_1 = T_1(t, x, \varepsilon)$, $c_1 = c_1(t, \varepsilon)$, $p_1 = p_1(t, x, \varepsilon)$ on $(0, +\infty)\times\Omega$, which satisfy:
\[ u_1 \in L^\infty((0, +\infty); J_0) \cap C([0, +\infty); J_0) \cap C^1((0, +\infty); J_0), \tag{3.18} \]
\[ u_{1,t},\ D_x^2 u_1,\ \nabla p_1 - \varepsilon\langle T_0\rangle\hat x_2 \in L^2_{\rm loc}((0, +\infty); L^2(\Omega)), \tag{3.19} \]
\[ T_1 \in L^\infty((0, +\infty); H^1 \cap W) \cap C([0, +\infty); H^1 \cap W) \cap C^1((0, +\infty); H^1 \cap W), \tag{3.20} \]
\[ c_1 \in C^1[0, +\infty), \tag{3.21} \]
\[ \lim_{t\to0}\|u_1 - u_1(0, x)\|_{H^1} = \lim_{t\to0}\|T_1 - T_1(0, x)\|_{H^1} = 0. \tag{3.22} \]
Moreover, the following estimates hold:
\[ \|u_1\|_{H^1} + \|T_1\|_{H^1} + |c_1'| \le C, \quad \forall\, t \ge 0, \tag{3.23} \]
for some positive constant $C$, depending on the $H^1$ norm of initial data $u_1(0, x)$, $T_1(0, x)$, and $\nu$. The front solutions to system (3.1)-(3.2) are then given by (3.8)-(3.11).

Remark 3.1. The solutions satisfying (3.18)–(3.22) are strong solutions. Front structures are seen from (3.8)-(3.11). The theorem does not specify the asymptotic behaviour of $c_1'$ as $t \to +\infty$, whether it converges to a constant or is oscillatory in time. Numerical simulations ([45, 62]) indicate that for downward propagating fronts, the front speeds tend to nearly constant values, while for upward propagating fronts, front speeds tend to oscillate in time due to the Rayleigh-Taylor instability induced by $\varepsilon$. Without the constraint on $\nu$ and $\varepsilon$, we do not expect that front solutions and their speeds will remain bounded in time. Power growth in $t$ is observed for the vorticity field, and the front shape can evolve into a bubble like structure for an upward moving front, [62].

Remark 3.2. The condition that $T_1 \in W$ is not a restriction on the initial data. If it is not in $W$, one can always shift $T_0$ in $x_2$ by a suitable constant, or change the initial position for $c$, so that the new $T_1$ belongs to $W$.

The idea of the proof is to seek energy estimates on $u_1$, and spectral-semigroup type estimates for $T_1$ as often used in stability analysis for traveling waves in reaction-diffusion equations. Combining the two types of estimates, we show that the $H^1$ norms of both $u_1$ and $T_1$ are bounded for all time. The condition $\nu > 2\pi$ comes up in the energy inequality for controlling the convective terms with the dissipative term $\nu\Delta u_1$. The basic ingredient for the energy estimate is the Poincaré inequality, available when the no slip boundary condition is imposed on $u_1$. If we impose a periodic boundary condition instead, then due to the unboundedness of our domain $\Omega$, the Poincaré inequality no longer holds. It seems that one has to come up with a different approach for analyzing solutions. The proof of the theorem is organised as follows.
In Subsect. 3.2, we consider solutions to Eqs. (3.12)-(3.13) with a given forcing term in $L^\infty((0, t_0); L^2(\Omega))$, for any $t_0 > 0$, and derive energy inequalities. These inequalities are along the lines of those in Constantin and Foias [14]. To handle unbounded domains, we estimate the nonlinear terms differently using inequalities on the Stokes operator as given in Heywood [22]. In Subsect. 3.3, we present estimates based on the analytical semigroup generated by the linearised reaction-diffusion operator around the basic traveling front $T_0$. That the linearised operator is sectorial and so a generator of the analytical semigroup follows
from the works of Berestycki, Larrouturou, and Roquejoffre, see [3, 48], and references therein. The nonlinear term $u_1\cdot\nabla T_1$ is handled with an inequality in Kozono and Ogawa [27] for bounding convective terms in unbounded domains with fractional powers of differential operators. In Subsect. 3.4, we complete the proof by combining the above estimates and show the existence of global strong solutions uniformly bounded in $H^1$ norms for all time.

3.2. Solutions to a Forced Navier–Stokes Equation. We discuss the strong solutions to the following forced Navier–Stokes equation:
\[ v_t + u_0\cdot\nabla v + v\cdot\nabla u_0 + v\cdot\nabla v = -\nabla p + \nu\Delta v + F(t, x), \tag{3.24} \]
\[ \nabla\cdot v = 0, \quad x \in \Omega = (0, 2\pi)\times\mathbb{R}^1, \tag{3.25} \]
\[ v|_{t=0} = v_0(x) \in J_0(\Omega), \qquad v|_{\partial\Omega} = 0, \tag{3.26} \]
where the forcing function $F(t, x) \in L^\infty((0, t_0); L^2(\Omega))$, for any $t_0 > 0$. Following Heywood [22], we first consider (3.24)-(3.26) on any bounded domain with at least $C^3$ smooth boundary, then approximate $\Omega$ with an enlarging sequence of such domains. They can be domains enclosed by two parallel straight lines, a distance $2\pi$ apart, on the left and right, and two $C^\infty$ curves on the top and bottom that connect to the straight lines with $C^\infty$ smoothness. We will derive estimates on solutions that are independent of the $x_2$ diameter of these approximate domains, then pass to the limit. Since the approximate domains and $\Omega$ itself have width $2\pi$ in the $x_1$ direction, the Poincaré inequality:
\[ \|u\|_{L^2} \le 2\pi\|\nabla u\|_{L^2}, \quad \forall\, u \in H_0^1, \tag{3.27} \]
holds. For any bounded domain, still denoted by $\Omega$, the $n$th Galerkin approximate solution is:
\[ v^n(x) = \sum_{k=1}^n c_{kn}a_k(x), \tag{3.28} \]
with $c_{kn} = c_{kn}(t)$, and $v^n$ satisfies:
\[ \int_\Omega (v^n_t + v^n\cdot\nabla v^n + u_0\cdot\nabla v^n + v^n\cdot\nabla u_0 - \nu\Delta v^n)\cdot a_l(x)\,dx = \int_\Omega F\cdot a_l\,dx, \tag{3.29} \]
or
\[ (v^n, a_l)_t + (v^n\cdot\nabla v^n, a_l) + (u_0\cdot\nabla v^n, a_l) + (v^n\cdot\nabla u_0, a_l) - \nu(\Delta v^n, a_l) = (F, a_l), \tag{3.30} \]
where $l = 1, 2, \cdots, n$, and $(\cdot, \cdot)$ is the usual $L^2$ inner product. System (3.29) or (3.30) is an ODE system for $c_{kn}(t)$, $k = 1, 2, \cdots, n$, with quadratic nonlinearities. In the following, we skip the superscript $n$ on $v$. Multiplying (3.29) by $c_{ln}$ and summing over $l$ gives the identity:
\[ \frac12\frac{d}{dt}\|v\|_2^2 + (v\cdot\nabla u_0, v) + \nu\|\nabla v\|_2^2 = (F, v). \tag{3.31} \]
The Poincaré inequality (3.27) implies that:
\[ (v\cdot\nabla u_0, v) + \nu\|\nabla v\|_2^2 \ge -\nu^{-1}\|v\|_2^2 + \frac{\nu}{(2\pi)^2}\|v\|_2^2 = \delta\|v\|_2^2, \]
with $\delta \equiv (2\pi)^{-2}\nu - \nu^{-1} > 0$. It follows that
\[ \|v\|_{2,t} \le -\delta\|v\|_2 + \|F\|_2, \tag{3.32} \]
or
\[ \sup_{t\in[0,t_0]}\|v\|_2(t) \le \|v(0)\|_2 + C(t)\sup_{t\in[0,t_0]}\|F\|_2, \quad \forall\, t \ge 0, \tag{3.33} \]
where $C(t)$ is a bounded smooth function in $t \ge 0$, $C(0) = 0$, $C(t) \le \frac{1}{\delta}$. Since
\[ \|v^n\|_2^2 = \sum_{k=1}^n c_{kn}^2(t), \]
(3.33) implies the existence of smooth solutions of the ODE system (3.30) for all time. Multiplying both sides of (3.30) by $\lambda_l c_{ln}$ and summing over $l$ gives:
\[ \frac12\frac{d}{dt}\|\nabla v\|_2^2 + \nu\|P\Delta v\|_2^2 = (v\cdot\nabla v, P\Delta v) + (u_0\cdot\nabla v, P\Delta v) + (v\cdot\nabla u_0, P\Delta v) - (F, P\Delta v). \tag{3.34} \]
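Looking back at (3.32)–(3.33), the condition $\nu > 2\pi$ is exactly what makes $\delta$ positive, and Gronwall's inequality suggests the admissible choice $C(t) = (1 - e^{-\delta t})/\delta$. The sketch below illustrates both points numerically; the function names and the explicit Euler integrator are illustrative assumptions, not part of the paper.

```python
import math


def delta(nu):
    """delta = nu/(2*pi)^2 - 1/nu, the decay rate appearing in (3.32)."""
    return nu / (2.0 * math.pi) ** 2 - 1.0 / nu


def gronwall_factor(t, d):
    """One admissible C(t) in (3.33): (1 - exp(-d t)) / d, bounded by 1/d."""
    return (1.0 - math.exp(-d * t)) / d


def euler_worst_case(y0, F, d, t_end, steps=20000):
    """Integrate the equality case y' = -d*y + F of (3.32) by explicit Euler."""
    h = t_end / steps
    y = y0
    for _ in range(steps):
        y += h * (-d * y + F)
    return y
```

For instance, `delta(6.0)` is negative while `delta(7.0)` is positive, since $2\pi \approx 6.283$; and for $\nu = 10$ the integrated worst case stays below $\|v(0)\|_2 + C(t)\|F\|_2$ with the $C(t)$ above.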
The terms on the right hand side of (3.34) are estimated below. By the Cauchy-Schwarz and Gagliardo–Nirenberg inequalities, we have:
\[ R_1 \equiv |(v\cdot\nabla v, P\Delta v)| \le \|v\|_4\|\nabla v\|_4\|P\Delta v\|_2 \le \|v\|_2^{1/2}\|\nabla v\|_2^{1/2}\|\nabla v\|_2^{1/2}\|D^2 v\|_2^{1/2}\|P\Delta v\|_2. \]
Recall that (see Lemma 1 of [22] for the three dimensional case):
\[ \|D^2 u\|_2 \le C(\|P\Delta u\|_2 + \|\nabla u\|_2), \tag{3.35} \]
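where $C$ depends only on smoothness (at least $C^3$) of the boundary. It follows that:
\[ \begin{aligned} R_1 &\le C\|v\|_2^{1/2}\|\nabla v\|_2(\|Av\|_2 + \|\nabla v\|_2)^{1/2}\|Av\|_2 \\ &\le C\|v\|_2^{1/2}\|\nabla v\|_2\|Av\|_2^{3/2} + C\|v\|_2^{1/2}\|\nabla v\|_2^{3/2}\|Av\|_2, \end{aligned} \]
and by Young's inequality:
\[ \begin{aligned} R_1 &\le \tfrac14 C\alpha^{-4}\|v\|_2^2\|\nabla v\|_2^4 + \tfrac34 C\alpha^{4/3}\|Av\|_2^2 + 2\nu^{-1}C^2\|v\|_2\|\nabla v\|_2^3 + \tfrac{\nu}{8}\|Av\|_2^2 \\ &\le \tfrac14\alpha^{-4}C\|v\|_2^2\|\nabla v\|_2^4 + 4\pi\nu^{-1}C^2\|\nabla v\|_2^4 + \bigl(\tfrac34 C\alpha^{4/3} + \tfrac{\nu}{8}\bigr)\|Av\|_2^2 \quad \text{(using (3.27))} \\ &= C\bigl(4^{-1}\alpha^{-4}\|v\|_2^2 + 4\pi\nu^{-1}C\bigr)\|\nabla v\|_2^4 + \bigl(\tfrac34 C\alpha^{4/3} + \tfrac{\nu}{8}\bigr)\|Av\|_2^2, \end{aligned} \tag{3.36} \]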
for some constant $\alpha$ to be chosen. The other three terms are bounded as:
\[ R_2 \equiv |(u_0\cdot\nabla v, P\Delta v)| \le |u_0|_\infty\|\nabla v\|_2\|Av\|_2 \le 2\nu^{-1}|u_0|_\infty^2\|\nabla v\|_2^2 + \frac{\nu}{8}\|Av\|_2^2, \tag{3.37} \]
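\[ R_3 \equiv |(v\cdot\nabla u_0, P\Delta v)| \le |\nabla u_0|_\infty\|v\|_2\|Av\|_2 \le 2\nu^{-1}|\nabla u_0|_\infty^2\|v\|_2^2 + \frac{\nu}{8}\|Av\|_2^2, \tag{3.38} \]
\[ |(F, P\Delta v)| \le \|F\|_2\|Av\|_2 \le \nu^{-1}\|F\|_2^2 + \frac{\nu}{4}\|Av\|_2^2. \tag{3.39} \]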
Here and in the rest of this section, $|\cdot|_\infty$ denotes the $L^\infty$ norm.

Combining (3.34)-(3.39) and choosing $6C\alpha^{4/3} = \nu$, we have (with $C$ denoting a generic constant independent of $\nu$, and skipping the subscript 2 for the $L^2$ norm):
\[ \frac12\frac{d}{dt}\|\nabla v\|^2 + \frac{\nu}{4}\|Av\|^2 \le C(\nu^{-3}\|v\|^2 + \nu^{-1})\|\nabla v\|^4 + 2\nu^{-1}|u_0|_\infty^2\|\nabla v\|^2 + 2\nu^{-1}|\nabla u_0|_\infty^2\|v\|^2 + \nu^{-1}\|F\|^2. \]
Substituting (3.33) and (3.27), and denoting $\sup_{t\in[0,t_0]}\|\cdot\|_2$ by $\|\cdot\|_\infty$, we continue:
\[ \frac12\frac{d}{dt}\|\nabla v\|^2 + \frac{\nu}{4}\|Av\|^2 \le C(\nu^{-3}\|v(0)\|^2 + \nu^{-3}\delta^{-2}\|F\|_\infty^2 + \nu^{-1})\|\nabla v\|^4 + C\nu^{-3}\|\nabla v\|^2 + \nu^{-1}\|F\|_\infty^2. \tag{3.40} \]
It follows from (3.31) and (3.27) that:
\[ \frac12\frac{d}{dt}\|v\|^2 + (2\pi)^2\delta\|\nabla v\|^2 \le \|F\|\cdot\|v\|, \tag{3.41} \]
or
\[ \frac12\frac{d}{dt}\|v\|^2 + 2\pi^2\delta\|\nabla v\|^2 + \frac12\delta\|v\|^2 \le \frac{1}{2\delta}\|F\|^2 + \frac{\delta}{2}\|v\|^2, \]
or
\[ \frac12\frac{d}{dt}\|v\|^2 + 2\pi^2\delta\|\nabla v\|^2 \le \frac{1}{2\delta}\|F\|^2. \tag{3.42} \]
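The passage from (3.41) to (3.42) uses only the Poincaré inequality (3.27) and Young's inequality. A sketch of the bookkeeping:

```latex
\begin{align*}
(2\pi)^2\delta\,\|\nabla v\|^2
  &= 2\pi^2\delta\,\|\nabla v\|^2 + 2\pi^2\delta\,\|\nabla v\|^2
   \;\ge\; 2\pi^2\delta\,\|\nabla v\|^2 + \tfrac12\delta\,\|v\|^2
   && \text{by (3.27)},\\
\|F\|\,\|v\|
  &\le \frac{1}{2\delta}\|F\|^2 + \frac{\delta}{2}\|v\|^2
   && \text{(Young)}.
\end{align*}
```

The two terms $\frac{\delta}{2}\|v\|^2$ then cancel, which leaves (3.42).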
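Integrating (3.42) and using (3.33), we get:
\[ 2\pi^2\delta\int_t^{t+\tau}\|\nabla v\|^2(s)\,ds \le \|v(0)\|^2 + \frac{1}{\delta^2}\|F\|_\infty^2 + \frac{1}{2\delta}\|F\|_\infty^2\,\tau, \]
assuming $t_0 \ge t + \tau$. Thus,
\[ \int_t^{t+\tau}\|\nabla v\|^2(s)\,ds \le C\delta^{-1}\bigl(\|v(0)\|^2 + \delta^{-1}\|F\|_\infty^2(\tau + \delta^{-1})\bigr), \tag{3.43} \]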
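with constant $C$ independent of $\nu$ and $\delta$. It follows from (3.43) that the Lebesgue measure
\[ |\{s \in [t, t+\tau] : \|\nabla v\| \ge \rho\}| \le C\rho^{-2}\delta^{-1}\bigl(\|v(0)\|^2 + \delta^{-1}\|F\|_\infty^2(\tau + \delta^{-1})\bigr). \]
Choose $\rho = \sqrt{2}\sqrt{C}\,\delta^{-1}\tau^{-1/2}\bigl(\delta\|v(0)\|^2 + \|F\|_\infty^2(\delta^{-1} + \tau)\bigr)^{1/2}$, then
\[ |\{s \in [t, t+\tau] : \|\nabla v\| \ge \rho\}| \le \frac{\tau}{2}, \tag{3.44} \]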
and thus $\exists\, t_1 \in [t, t+\tau]$ such that
\[ \|\nabla v\|^2(t_1) \le C\tau^{-1}\delta^{-2}\bigl(\delta\|v(0)\|^2 + \|F\|_\infty^2(\delta^{-1} + \tau)\bigr). \tag{3.45} \]
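The step from (3.43) to (3.45) is a Chebyshev-type argument. A sketch, where the set name $S_\rho$ is introduced only for this illustration and $\tau > 0$ is assumed:

```latex
\begin{align*}
S_\rho &:= \{\,s\in[t,t+\tau] : \|\nabla v\|(s)\ge\rho\,\},\\
\rho^2\,|S_\rho| &\le \int_{S_\rho}\|\nabla v\|^2(s)\,ds
                  \le \int_t^{t+\tau}\|\nabla v\|^2(s)\,ds,\\
|S_\rho| \le \tfrac{\tau}{2}
  &\;\Longrightarrow\; \bigl|[t,t+\tau]\setminus S_\rho\bigr| \ge \tfrac{\tau}{2} > 0
  \;\Longrightarrow\; \exists\, t_1\in[t,t+\tau]:\ \|\nabla v\|(t_1) < \rho .
\end{align*}
```

With $\rho$ chosen as before (3.44), $\rho^2 = 2C\tau^{-1}\delta^{-2}\bigl(\delta\|v(0)\|^2 + \|F\|_\infty^2(\delta^{-1}+\tau)\bigr)$, which is the right hand side of (3.45) after renaming $2C$ as $C$.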
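Multiplying (3.40) by:
\[ I \equiv \exp\Bigl\{-\int_{t_1}^t C(\nu^{-3}\|v(0)\|^2 + \nu^{-3}\delta^{-2}\|F\|_\infty^2 + \nu^{-1})\|\nabla v\|^2\,ds - (t - t_1)C\nu^{-3}\Bigr\}, \]
where the constant $C$ is twice that in (3.40), we obtain:
\[ \frac{d}{dt}\bigl[\|\nabla v\|^2 I\bigr] \le \frac{2}{\nu}\|F\|_\infty^2\,I. \tag{3.46} \]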
Integrating (3.46) on $[t_1, t]$, we get:
\[ \|\nabla v\|^2(t) \le \|\nabla v(t_1)\|^2 I^{-1} + \frac{2}{\nu}\|F\|_\infty^2(t - t_1)I^{-1}. \tag{3.47} \]
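The integrating factor argument behind (3.46)–(3.47) can be sketched as follows. The shorthand $y$, $a$, $b$ is introduced here only for illustration: $y(t) = \|\nabla v\|^2(t)$, $a = C(\nu^{-3}\|v(0)\|^2 + \nu^{-3}\delta^{-2}\|F\|_\infty^2 + \nu^{-1})$ and $b = C\nu^{-3}$, with $C$ twice the constant in (3.40).

```latex
\begin{align*}
y' &\le (a\,y + b)\,y + \tfrac{2}{\nu}\|F\|_\infty^2
   && \text{(from (3.40), dropping } \tfrac{\nu}{4}\|Av\|^2 \ge 0\text{)},\\
I(t) &= \exp\Bigl\{-\int_{t_1}^{t}\bigl(a\,y(s)+b\bigr)\,ds\Bigr\},
   \qquad 0 < I \le 1,\quad I(t_1)=1,\\
\frac{d}{dt}\bigl[y\,I\bigr] &= \bigl(y' - (a\,y+b)\,y\bigr)I \le \tfrac{2}{\nu}\|F\|_\infty^2\, I
   \le \tfrac{2}{\nu}\|F\|_\infty^2,\\
y(t)\,I(t) &\le y(t_1) + \tfrac{2}{\nu}\|F\|_\infty^2\,(t-t_1).
\end{align*}
```

Dividing by $I(t)$ gives (3.47). The exponent in $I^{-1}$ is then controlled by the time integral of $\|\nabla v\|^2$ from (3.43), which is how the exponential factors in the subsequent bounds arise.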
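Let $t_1 \in [t - \tau, t]$, $t \ge \tau \ge 0$, be such that (3.45) holds. In view of (3.43) and (3.45), we have from (3.47):
\[ \begin{aligned} \|\nabla v\|^2(t) \le{}& \bigl[C\tau^{-1}\delta^{-2}(\delta\|v(0)\|^2 + \|F\|_\infty^2(\delta^{-1} + \tau)) + 2\nu^{-1}\|F\|_\infty^2\tau\bigr] \\ &\times\exp\bigl\{C(\nu^{-3}\|v(0)\|^2 + \nu^{-3}\delta^{-2}\|F\|_\infty^2 + \nu^{-1})\,\delta^{-1}(\|v(0)\|^2 + \delta^{-1}\|F\|_\infty^2(\tau + \delta^{-1})) + C\tau\nu^{-3}\bigr\}. \end{aligned} \tag{3.48} \]
Fix $\tau = \delta^{-2}$, then for $t \ge \tau$, we have:
\[ \begin{aligned} \|\nabla v\|^2(t) \le{}& C\bigl(\delta\|v(0)\|^2 + \|F\|_\infty^2(\delta^{-1} + \delta^{-2}) + \nu^{-1}\delta^{-2}\|F\|_\infty^2\bigr) \\ &\times\exp\bigl\{C\delta^{-1}(\nu^{-3}\|v(0)\|^2 + \nu^{-3}\delta^{-2}\|F\|_\infty^2 + \nu^{-1})(\|v(0)\|^2 + \delta^{-2}(1 + \delta^{-1})\|F\|_\infty^2) + C\delta^{-2}\nu^{-3}\bigr\}, \end{aligned} \tag{3.49} \]
while for $t \in [0, \tau]$, we set $t_1 = 0$ in (3.47) to have:
\[ \begin{aligned} \|\nabla v\|^2(t) \le{}& \bigl(\|\nabla v(0)\|^2 + 2\delta^{-2}\nu^{-1}\|F\|_\infty^2\bigr) \\ &\times\exp\bigl\{C\delta^{-1}(\nu^{-3}\|v(0)\|^2 + \nu^{-3}\delta^{-2}\|F\|_\infty^2 + \nu^{-1})(\|v(0)\|^2 + \delta^{-1}\|F\|_\infty^2(\delta^{-1} + \delta^{-2})) + C\delta^{-2}\nu^{-3}\bigr\}. \end{aligned} \tag{3.50} \]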
Combining (3.49) and (3.50), we obtain the estimates on $\|\nabla v\|(t)$ for $t \in [0, t_0]$, uniformly in $t_0 > 0$:
\[ \|\nabla v\|^2(t) \le B_1\cdot B_2, \tag{3.51} \]
where
\[ B_1 = C\bigl(\|v(0)\|^2\delta + \|F\|_\infty^2(\delta^{-1} + \delta^{-2} + \nu^{-1}\delta^{-2}) + \|\nabla v(0)\|^2\bigr), \]
and $B_2$ is:
\[ \exp\bigl\{C\delta^{-1}(\nu^{-3}\|v(0)\|^2 + \nu^{-3}\delta^{-2}\|F\|_\infty^2 + \nu^{-1})(\|v(0)\|^2 + \delta^{-1}(\delta^{-1} + \delta^{-2})\|F\|_\infty^2) + C\delta^{-2}\nu^{-3}\bigr\}, \]
where $C$ is a positive constant independent of $\nu$, $\delta$, and $t_0$. Notice that estimates (3.33) and (3.51) hold for the approximate Galerkin solutions independent of the $x_2$ diameters of the approximate domains. Using (3.33), (3.51) and (3.35) in (3.40), it is straightforward to obtain the bounds:
\[ \int_0^t\|D^2 v\|^2(s)\,ds \le L(t), \qquad \int_0^t\|v_t\|^2(s)\,ds \le L(t), \tag{3.52} \]
for some continuous function $L$ of $t$, independent of the approximate solutions and domains. Passing to $n \to \infty$ in (3.29) in a standard way (see [22]), we see that the limiting vector function $v$ is a unique strong solution to the system (3.24) and (3.25). Summarizing, we have:

Proposition 3.1. Let $v(0) \in J_0(\Omega)$, and $F \in L^\infty([0, t_0]; L^2(\Omega))$, for some $t_0 > 0$. Then if $\nu > 2\pi$, there is a unique solution $v(t, x)$, $p(t, x)$ to system (3.24) and (3.25) such that:
\[ v \in C((0, t_0); J_0(\Omega)); \qquad v_t,\ D^2 v \in L^2((0, t_0); L^2(\Omega)). \tag{3.53} \]
Moreover, $v$ attains initial data continuously in $L^2$, and $v$ satisfies the estimates (3.33), (3.51), (3.52). If $t_0 = +\infty$, then (3.33) and (3.51) hold uniformly in $t \ge 0$.

3.3. Estimates on a Reaction-Diffusion Equation. We consider the reaction-diffusion equation (3.14) for $T_1$ with velocity $u_1$ a given function as described in Proposition 3.1. As usual, we introduce the moving frame coordinate $\xi = x_2 - c(t, \varepsilon)$, $x_1 = x_1$, $t = t$. Equation (3.14) becomes:
\[ T_{1,t} - c_0 T_{1,\xi} - \Delta T_1 - g'(T_0(x_1, \xi))T_1 + u_0\cdot\nabla T_1 = c_1'(t, \varepsilon)T_{1,\xi} - u_1\cdot\nabla T_0 + c_1'T_{0,s} - u_1\cdot\nabla T_1 + N(T_0, T_1, \varepsilon) \equiv F_1. \tag{3.54} \]
Define the operator $L$:
\[ (-L)T_1 = \Delta T_1 + c_0 T_{1,\xi} - u_0\cdot\nabla T_1 + g'(T_0(x_1, \xi))T_1, \tag{3.55} \]
with domain of definition $D(L) = \{T_1 \in H^2(\Omega) : T_{1,x_1}|_{\partial\Omega} = 0\}$. The operator $L$ has a simple eigenvalue zero corresponding to the positive eigenfunction $T_{0,s}$, thanks to the monotonicity of the wave profile, $T_{0,s} > 0$. For the bistable nonlinearity $g$, the arguments in [3] and [48] apply without using weighted spaces, and we have the following:
Lemma 3.1. The operator $L$ is sectorial [21] on $L^2(\Omega)$ with zero Neumann boundary condition; the spectrum of $L$ stays inside a sector strictly in the right half plane except for a simple eigenvalue at zero corresponding to the eigenfunction $T_{0,s}(x_1, \xi)$. Operator $L$ is invertible on the subspace:
\[ W \equiv \{u \in L^2(\Omega) : (u, T^*_{0,s}) = 0,\ u_{x_1}|_{\partial\Omega} = 0\}, \tag{3.56} \]
where $T^*_{0,s}$ is the positive null function of the adjoint operator $L^*$ in $L^2(\Omega)$ such that $(T_{0,s}, T^*_{0,s}) = 1$. Moreover, the estimate:
\[ \|L^{-1}u\|_{H^2} \le C_\gamma\|u\|, \quad \forall\, u \in W, \tag{3.57} \]
holds for a constant $C_\gamma$ depending only on $\gamma$, the distance from the sector to the left half plane.
Lemma 3.1 implies that $L$ is a generator of an analytical semigroup on $W$, and the usual fractional powers of $(-L)$ are well-defined, [21]. Equation (3.54) can be expressed as:
\[ T_{1,t} = -LT_1 + F_1(T_1, c_1, \varepsilon), \tag{3.58} \]
and will be solved for $T_1 \in W$ for all $t \ge 0$. For any given initial data $T_1(0) \in W$, let us write (3.58) as the related integral equation:
\[ T_1(t) = e^{-Lt}T_1(0) + \int_0^t e^{-(t-s)L}F_1(T_1, c_1, \varepsilon)(s)\,ds, \tag{3.59} \]
where we impose the condition:
\[ (F_1, T^*_{0,s}) = 0, \quad \forall\, s \ge 0, \tag{3.60} \]
so that formula (3.59) provides bounded solutions for all time. In view of (3.54), we have:
\[ c_1' = (u_1\cdot\nabla T_1 + u_1\cdot\nabla T_0 - c_1'T_{1,\xi} - N(T_0, T_1, \varepsilon),\ T^*_{0,s}). \tag{3.61} \]
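Equation (3.61) is implicit in $c_1'$. A sketch of how it arises from (3.60) and how it can be solved for $c_1'$, using the normalisation $(T_{0,s}, T^*_{0,s}) = 1$ and assuming $|(T_{1,\xi}, T^*_{0,s})| < 1$:

```latex
\begin{align*}
0 = (F_1, T^*_{0,s})
  &= c_1'\,(T_{1,\xi},T^*_{0,s}) + c_1'\,(T_{0,s},T^*_{0,s})
     - (u_1\cdot\nabla T_0 + u_1\cdot\nabla T_1 - N,\ T^*_{0,s}),\\
c_1'\,\bigl[\,1+(T_{1,\xi},T^*_{0,s})\,\bigr]
  &= (u_1\cdot\nabla T_0 + u_1\cdot\nabla T_1 - N,\ T^*_{0,s}).
\end{align*}
```

This suggests why the later bound on $|c_1'|$ carries a term proportional to $|c_1'|$ times a norm of $T_1$ on its right hand side: that term can be absorbed once $T_1$ is small.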
It follows from (3.58)-(3.61) that $T_1 \in W$ for all $t \ge 0$. The front equation (3.61) for $c_1$ is a nonlocal equation and the terms $u_1\cdot\nabla T_1$ and $u_1\cdot\nabla T_0$ reflect the strain effects of the fluid flows. In the passive case, $\varepsilon = 0$, $u_1$ as a perturbation of the steady state $u_0$ will decay to zero when $t \to \infty$ if $\nu > 2\pi$ (see (3.32) with $F = 0$). Hence (3.61) implies that the front speed approaches an asymptotic constant value. Now let us make a-priori estimates on solutions of the integral equation (3.59) in the space $L^\infty((0, t_0); W \cap H^1(\Omega))$ along with (3.61). Define:
\[ m_\alpha(t_0) = \sup_{t\in(0,t_0)}\|L^\alpha T_1(t)\|, \tag{3.62} \]
for all $T_1 \in W$, $\alpha \in [0, \frac12]$, $t_0 > 0$. First we note that
\[ \sup_{t\in[0,t_0]}\|e^{-Lt}L^\alpha T_1(0)\| \le \|L^\alpha T_1(0)\| \le C\|T_1(0)\|_{H^1}, \]
where we use the fact that if $T_1 \in W$, then $L^\alpha T_1 \in W$. By (3.60) and (3.61), we rewrite $F_1$ as:
\[ F_1 = P_2(c_1'T_{1,\xi}) + P_2(-u_1\cdot\nabla T_0) + P_2(-u_1\cdot\nabla T_1) + P_2 N, \tag{3.63} \]
where $P_2 \equiv {\rm Id} - P_1$, and $P_1 u \equiv (u, T^*_{0,s})T_{0,s}$, i.e. $P_2$ is the projection from $L^2(\Omega)$ to $W$. Applying $L^\alpha$, $\alpha \in [0, \frac12]$, to (3.59), we have:
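\[ \begin{aligned} \|L^\alpha T_1\|_2 &\le \|L^\alpha e^{-Lt}T_1(0)\|_2 + \int_0^t\|L^{\alpha+\delta'}e^{-L(t-s)}L^{-\delta'}F_1\|_2(s)\,ds \\ &\le C\|T_1(0)\|_{H^1} + \int_0^t (t-s)^{-(\alpha+\delta')}e^{-\gamma(t-s)}\|L^{-\delta'}F_1\|(s)\,ds, \end{aligned} \tag{3.64} \]
where $\delta' \in (0, \frac12)$, $F_1 = F_1(T_1, c_1', \varepsilon)$. By (3.63), we have: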
\[ \|L^{-\delta'}F_1\| \le |c_1'|\,C_{\gamma,\delta'}\|T_{1,\xi}\| + \|\nabla T_0\|_\infty C_{\gamma,\delta'}\|u_1\| + \|L^{-\delta'}P_2(u_1\cdot\nabla T_1)\| + C_{\gamma,\delta'}\|N\|, \tag{3.65} \]
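for some positive constant $C_{\gamma,\delta'}$ depending only on $\gamma$ and $\delta'$. We estimate:
\[ \begin{aligned} \|L^{-\delta'}P_2(u_1\cdot\nabla T_1)\| &= \|L^{-\delta'}(L+\gamma)^{\delta'}(L+\gamma)^{-\delta'}P_2(u_1\cdot\nabla T_1)\| \\ &\le \|L^{-\delta'}(L+\gamma)^{\delta'}\|_{(L^2(W)\to L^2(W))}\,\|(L+\gamma)^{-\delta'}P_2(u_1\cdot\nabla T_1)\| \\ &\le C_{\gamma,\delta'}\|(L+\gamma)^{-\delta'}P_2(u_1\cdot\nabla T_1)\| \\ &\le C_{\gamma,\delta'}\|(L+\gamma)^{-\delta'}(-\Delta+\gamma)^{\delta'}\|_{(L^2(\Omega)\to L^2(\Omega))}\,\|(-\Delta+\gamma)^{-\delta'}P_2(u_1\cdot\nabla T_1)\| \\ &\le C_{\gamma,\delta'}\|(-\Delta+\gamma)^{-\delta'}P_2(u_1\cdot\nabla T_1)\| \\ &\le C_{\gamma,\delta'}\|(-\Delta+\gamma)^{-\delta'}u_1\cdot\nabla T_1 - (u_1\cdot\nabla T_1, T^*_{0,s})(-\Delta+\gamma)^{-\delta'}T_{0,s}\| \\ &\le C_{\gamma,\delta'}\|(-\Delta+\gamma)^{-\delta'}(u_1\cdot\nabla T_1)\| + C_{\gamma,\delta'}\|u_1\|\cdot\|\nabla T_1\|. \end{aligned} \tag{3.66} \]
By Lemma 2.1 of Kozono and Ogawa [27], for $\delta' \in (0, \frac12)$, we have:
\[ \begin{aligned} \|(-\Delta+\gamma)^{-\delta'}(u_1\cdot\nabla T_1)\| &\le C_{\delta'}\|(-\Delta)^{\frac12-\delta'}u_1\|\cdot\|(-\Delta)^{\frac12}T_1\| \\ &\le C_{\delta'}\|\nabla T_1\|\bigl(\|(-\Delta)^{\frac12}u_1\| + \|u_1\|\bigr) \\ &\le C_{\delta'}\|\nabla T_1\|\bigl(\|\nabla u_1\| + \|u_1\|\bigr), \end{aligned} \tag{3.67} \]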
for some constant $C_{\delta'}$ depending on $\delta'$. It follows from (3.66) and (3.67) that:
\[ \|L^{-\delta'}P_2(u_1\cdot\nabla T_1)\| \le C_{\gamma,\delta'}\|u_1\|_{H^1}\cdot\|\nabla T_1\|. \tag{3.68} \]
Notice that $|N| \le C((T_1)^2 + |T_1|^3)$, for $C$ independent of $T_1$. Then the Gagliardo–Nirenberg inequality shows that:
\[ \|T_1^2\|_2 = \|T_1\|_4^2 \le C\|T_1\|_2\|\nabla T_1\|_2, \qquad \|T_1^3\| = \|T_1\|_6^3 \le C\|T_1\|_2\|\nabla T_1\|_2^2. \tag{3.69} \]
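Inequality (3.65) implies that
\[ \|L^{-\delta'}F_1\| \le |c_1'|\,C_{\gamma,\delta'}\|T_{1,\xi}\| + \|\nabla T_0\|_\infty C_{\gamma,\delta'}\|u_1\| + C_{\gamma,\delta'}\|u_1\|_{H^1}\cdot\|\nabla T_1\| + C_{\gamma,\delta'}\bigl(\|T_1\|\cdot\|\nabla T_1\| + \|\nabla T_1\|^2\|T_1\|\bigr). \tag{3.70} \]
Now, choosing $\alpha = 0, \frac12$ and $\delta' \in (0, \frac12)$, we have from (3.64), (3.65) and (3.70) that: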
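\[ \begin{aligned} m_{\frac12} + m_0 \le{}& C\|T_1(0)\|_{H^1} + C_{\gamma,\delta'}\bigl[|c_1'|_\infty m_{\frac12} + \|\nabla T_0\|_\infty\|u_1\|_\infty + (\|u_1\|_\infty m_{\frac12} + \|\nabla u_1\|_\infty m_{\frac12} + m_{\frac12}m_0 + m_{\frac12}^2 m_0)\bigr] \\ &\times\sup_{t\in[0,t_0]}\int_0^t\bigl((t-s)^{-\delta'} + (t-s)^{-\delta'-\frac12}\bigr)e^{-\gamma(t-s)}\,ds, \end{aligned} \tag{3.71} \]
where $\|\cdot\|_\infty = \sup_{t\in[0,t_0]}\|\cdot\|$, $|c'|_\infty = \sup_{t\in[0,t_0]}|c'|$; constants $C$ and $C_{\gamma,\delta'}$ are independent of $t_0$. Letting $M = m_{\frac12} + m_0$, and lumping all the constants depending on $\gamma$, $\delta'$, we get: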
\[ M \le C\|T_1(0)\|_{H^1} + C\bigl(|c_1'|M + (\|u_1\|_\infty + \|\nabla u_1\|_\infty)M + M^2 + M^3\bigr) + C\|u_1\|_\infty, \tag{3.72} \]
where $C$ depends on $\gamma$, $\delta'$ only. We get from (3.61) that
\[ |c_1'| \le \|T^*_{0,s}\|_\infty\|u_1\|\cdot\|\nabla T_1\| + C\|u_1\| + |c_1'|\cdot\|T_{1,\xi}\| + C\bigl(\|T_1\|^2 + \|T_1\|\cdot\|T_1\|_4^2\bigr), \]
or
\[ |c_1'|_\infty \le C\|u_1\|_\infty M + C\|u_1\| + |c_1'|_\infty M + C(M^2 + M^3). \tag{3.73} \]
It is straightforward to verify that for small time $t_0$, the mapping defined by the right hand side of (3.59) on $T_1$ is a contraction in $L^\infty((0, t_0); W \cap H^1)$, which yields a unique mild solution. Parabolic regularity [46] then shows that it is a strong solution. We will consider long time solutions to (3.59) along with the Navier–Stokes equation in the next subsection. We summarise the above into:

Proposition 3.2. The integral equation (3.59) along with (3.61) has a strong solution for $t \in [0, t_0]$, if $t_0$ is small enough. Moreover, the estimates (3.72) and (3.73) hold for the solution.

3.4. Uniformly Bounded Solutions of the System. We turn to the solutions of system (3.12)-(3.15). Equation (3.12) can be written as:
\[ u_{1,t} + u_0\cdot\nabla u_1 + u_1\cdot\nabla u_0 + u_1\cdot\nabla u_1 = -\nabla\tilde p_1 + \nu\Delta u_1 + \varepsilon(T_0 - \langle T_0\rangle)\hat x_2 + \varepsilon T_1\hat x_2, \tag{3.74} \]
where
\[ \langle T_0\rangle = \langle T_0\rangle(x_2) = \frac{1}{2\pi}\int_0^{2\pi}T_0(x_1, x_2)\,dx_1, \qquad \tilde p_1 = p_1 - \varepsilon\int_0^{x_2}\langle T_0\rangle\,dx_2. \]
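It is obvious that $T_0 - \langle T_0\rangle \in L^\infty((0, \infty), L^2(\Omega))$. In the moving frame coordinate, $(x_1, \xi, t)$, system (3.12)-(3.14) becomes:
\[ u_{1,t} - c_0 u_{1,\xi} + u_0\cdot\nabla u_1 + u_1\cdot\nabla u_0 + u_1\cdot\nabla u_1 = -\nabla\tilde p_1 + \nu\Delta u_1 + \varepsilon(T_0 - \langle T_0\rangle)\hat x_2 + \varepsilon T_1\hat x_2, \tag{3.75} \]
\[ \nabla\cdot u_1 = 0, \tag{3.76} \]
\[ T_{1,t} - c_0 T_{1,\xi} - \Delta T_1 - g'(T_0(x_1, \xi))T_1 + u_0\cdot\nabla T_1 = F_1. \tag{3.77} \]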
Notice that the estimates (3.33) and (3.51) in Subsect. 3.2 remain the same in the moving frame coordinate for (3.75). Let us make an a-priori estimate of solutions to system (3.75)-(3.77) with initial data $u_1(0) \in J_0(\Omega)$, $T_1(0) \in W \cap H^1(\Omega)$. Define:
\[ |u|_\infty \equiv \sup_{0<t<t_0}\bigl(\|u\|_2 + \|\nabla u\|_2\bigr), \]
for any $t_0 > 0$, and $|c'|_\infty = \sup_{0<t<t_0}|c'|$
|∞ + |T1 |∞ )2 +C[||u1 (0)||2 + (|T0 − < T0 > |∞ + |T1 |∞ )2 + ||∇u1 (0)||2 ]× exp{C(||u1 (0)||2 + (|T0 − < T0 > |∞ + |T1 |∞ )2 + 1)2 },
(3.78)
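and
\[ |T_1|_\infty \le C|T_1(0)|_{H^1} + C\bigl(|c_1'|_\infty|T_1|_\infty + |u_1|_\infty|T_1|_\infty + |T_1|_\infty^2 + |T_1|_\infty^3\bigr) + C|u_1|_\infty, \tag{3.79} \]
and
\[ |c_1'|_\infty \le C|u_1|_\infty|T_1|_\infty + C|u_1|_\infty + |c_1'|_\infty|T_1|_\infty + C\bigl(|T_1|_\infty^2 + |T_1|_\infty^3\bigr). \tag{3.80} \]
Taking the square root of (3.78) yields:
\[ \begin{aligned} |u_1|_\infty \le{}& \|u_1(0)\|_{H^1} + C\varepsilon|T_0 - \langle T_0\rangle|_\infty + C\varepsilon|T_1|_\infty \\ &+ C\bigl(\|u_1(0)\|_{H^1} + \varepsilon|T_0 - \langle T_0\rangle|_\infty + \varepsilon|T_1|_\infty\bigr) \\ &\quad\times\exp\bigl\{C\bigl(1 + \|u_1(0)\|^2 + \varepsilon^2(|T_0 - \langle T_0\rangle|_\infty + |T_1|_\infty)^2\bigr)^2\bigr\}. \end{aligned} \tag{3.81} \]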
Set $K \equiv |u_1|_\infty + |T_1|_\infty$. To get rid of the last term on the right hand side of (3.79), let us multiply (3.81) by $C + 1$ and add the resulting inequality to (3.79) to find:
\[ \begin{aligned} K \le{}& C\|u_1(0)\|_{H^1} + C\varepsilon|T_0 - \langle T_0\rangle|_\infty + C\varepsilon K \\ &+ C\bigl(\|u_1(0)\|_{H^1} + \varepsilon|T_0 - \langle T_0\rangle|_\infty + \varepsilon K\bigr)\exp\bigl\{C\bigl(1 + \|u_1(0)\|^2 + \varepsilon^2(|T_0 - \langle T_0\rangle|_\infty + K)^2\bigr)^2\bigr\} \\ &+ C\|T_1(0)\|_{H^1} + C\bigl(K^2 + K^3 + |c_1'|_\infty K\bigr), \end{aligned} \tag{3.82} \]
where
\[ |c_1'|_\infty \le CK^2 + CK + |c_1'|_\infty K + C(K^2 + K^3). \tag{3.83} \]
The above estimates on $K$ remain the same for small time, and we can use the contraction mapping principle to construct local in time mild solutions $u_1$, $T_1$, $c_1$ in the space $V$ for some pressure $\tilde p_1$. Standard regularity results for Navier–Stokes equations ([22])
and parabolic equations ([46]) then imply that the mild solutions are actually strong solutions. In particular, if in the definition of $K$ or the norm $|\cdot|_\infty$ we replace $t_0$ by $t$ and $t$ by $\tau$, then $K$ as a function of $t$ is continuous for $t \in [0, t_0)$.

Proof of Theorem 3.1. Assume that $K = K(t) \le A_0$, for $t \in [0, t_0)$, where $t_0$ is a positive time ensured by local existence, and $A_0$ is a constant to be properly chosen below (3.90). In particular, $A_0$ is independent of $\varepsilon$, $A_0 > 0$, $A_0 < \frac12$. We will show by a continuity argument that $K(t) \le A_0$ for all $t \ge 0$ if $\varepsilon$ is small enough. We have from such a choice of $A_0$ and $\varepsilon$ that:
\[ |c_1'|_\infty \le C[K^2 + K + K^3], \tag{3.84} \]
which implies via (3.82) that K ≤ C||u1 (0)||H 1 + C|T0 − < T0 > |∞ + CK + C||T1 (0)||H 1 + C(||u1 (0)||H 1 + |T0 − < T0 > |∞ + K) exp{C(1 + ||u1 (0)||2 + |T0 − < T0 > |2∞ + 2 K 2 )2 } + C(K 2 + K 4 ) or
C 2 1 K K ≤ C||u1 (0)||H 1 + C|T0 − < T0 > |∞ + C||T1 (0)||H 1 + C + 2 2 +C(||u1 (0)||H 1 + |T0 − < T0 > |∞ + K) exp{C(1 + ||u1 (0)||2 +|T0 − < T0 > |2∞ + 2 K 2 )2 } + C(K 2 + K 4 ),
(3.85)
1 K ≤ C||u1 (0)||H 1 + C|T0 − < T0 > |∞ + C||T1 (0)||H 1 + C + C(||u1 (0)||H 1 + 2 |T0 − < T0 > |∞ + A0 ) exp{C(1 + ||u1 (0)||2 + |T0 − < T0 > |2∞ + 2 A20 )2 } +C(1 + A20 )K 2 ≡ K0 + C(1 + A20 )K 2 .
(3.86)
Since $K$ is continuous in $t$, it follows from (3.86) that if
\[ 4K_0 C(1 + A_0^2) < 1, \tag{3.87} \]
then
\[ K \le 2K_0, \quad \text{for } t \in (0, t_0), \tag{3.88} \]
if $K(0) \le 2K_0$, which is true by our choice of $K_0$ with $C \ge 1$. To be consistent with our assumption on $A_0$, we also require: $2K_0 \le A_0$.
(3.89)
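The continuity step behind (3.86)–(3.88) can be illustrated numerically. The sketch below is only an illustration under stated assumptions: `c` stands for the combined constant $C(1 + A_0^2)$, the function names are hypothetical, and the sample values are arbitrary. For an inequality of the form $K \le K_0 + cK^2$ with $4K_0 c < 1$, the admissible values of $K$ form two intervals separated by a gap; the lower interval ends at the smaller root of $cK^2 - K + K_0 = 0$, which never exceeds $2K_0$. A continuous $K(t)$ starting in the lower interval therefore cannot cross the gap.

```python
import math


def smaller_root(k0: float, c: float) -> float:
    """Smaller root of c*K**2 - K + k0 = 0 (needs 4*k0*c < 1, cf. (3.87))."""
    disc = 1.0 - 4.0 * k0 * c
    if disc <= 0.0:
        raise ValueError("need 4*K0*c < 1")
    # Algebraically equal to (1 - sqrt(disc)) / (2c), but numerically stable.
    return 2.0 * k0 / (1.0 + math.sqrt(disc))


def larger_root(k0: float, c: float) -> float:
    """Larger root of the same quadratic."""
    disc = 1.0 - 4.0 * k0 * c
    if disc <= 0.0:
        raise ValueError("need 4*K0*c < 1")
    return (1.0 + math.sqrt(disc)) / (2.0 * c)


def admissible(k: float, k0: float, c: float) -> bool:
    """True if k satisfies the a priori inequality k <= k0 + c*k**2."""
    return k <= k0 + c * k * k


if __name__ == "__main__":
    k0, c = 0.1, 2.0  # arbitrary sample values with 4*k0*c = 0.8 < 1
    print(smaller_root(k0, c), larger_root(k0, c), admissible(2 * k0, k0, c))
```

With these sample values the point $K = 2K_0$ lies strictly inside the forbidden gap, which is the mechanism that keeps $K(t)$ below $2K_0$.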
Now we choose: A0 = 4[C||u1 (0)||H 1 + C|T0 − < T0 > |∞ + C||T1 (0)||H 1 + C(||u0 ||H 1 + |T0 − < T0 > |∞ )].
(3.90)
Then there exists $\varepsilon_0$, depending only on $C = C(\nu, \gamma, \delta')$, $\|u_1(0)\|_{H^1}$, $T_0$, $\|T_1(0)\|_{H^1}$, such that if $\varepsilon \in (0, \varepsilon_0)$:
\[ K_0 \le \frac{A_0}{2}, \qquad 4K_0 C(1 + A_0^2) \le 2A_0 C(1 + A_0^2) < 1, \]
with $C \ge 1$, which implies that $A_0 < \frac{1}{2}$.
Inequality (3.84) implies that |c01 |∞ ≤ CA0 + 0 CA20 + 0 A30 .
(3.91)
Since the above bounds on K and c0 are independent of t0, they are valid for all t ≥ 0 by continuity. The rest of the theorem follows from standard regularity results for Navier–Stokes equations [22] and semilinear parabolic equations [46]. This completes the proof.
Appendix
We would like to choose the $\alpha_k$ so that $F(\xi, \eta)$ is a positive function satisfying (2.13). Condition (2.13b) will be satisfied provided
\[ (2n - k)\,\alpha_k > (k + 1)\,\alpha_{k+1}, \quad \forall\, k = 0, 1, \ldots, 2n - 1. \tag{3.92} \]
Further, let us suppose $\alpha_{2n-2}$ is large enough so that (with $\lambda = (1 + \ell)^2/4\ell$)
\[ 4n\,\alpha_{2n-2}\,\alpha_{2n} > \lambda(2n - 1)\,\alpha_{2n-1}^2. \tag{3.93} \]
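Conditions (3.92) and (3.93) are simple to test for a candidate set of coefficients. The following sketch is an illustration only: the function names and the sample coefficients are hypothetical, exact rational arithmetic is used to avoid rounding, and the further requirement that $\alpha_0$ and $\alpha_2$ be large enough is not checked.

```python
from fractions import Fraction


def satisfies_3_92(alpha):
    """Check (2n-k)*alpha[k] > (k+1)*alpha[k+1] for k = 0, ..., 2n-1."""
    two_n = len(alpha) - 1
    if two_n < 2 or two_n % 2:
        raise ValueError("alpha must hold alpha_0, ..., alpha_{2n} with n >= 1")
    return all((two_n - k) * alpha[k] > (k + 1) * alpha[k + 1]
               for k in range(two_n))


def satisfies_3_93(alpha, ell):
    """Check 4n*alpha_{2n-2}*alpha_{2n} > lambda*(2n-1)*alpha_{2n-1}**2."""
    two_n = len(alpha) - 1
    if two_n < 2 or two_n % 2:
        raise ValueError("alpha must hold alpha_0, ..., alpha_{2n} with n >= 1")
    n = two_n // 2
    ell = Fraction(ell)
    lam = (1 + ell) ** 2 / (4 * ell)  # lambda = (1 + ell)^2 / (4 ell)
    lhs = 4 * n * alpha[two_n - 2] * alpha[two_n]
    rhs = lam * (two_n - 1) * alpha[two_n - 1] ** 2
    return lhs > rhs


if __name__ == "__main__":
    # Hypothetical sample for n = 2 (alpha_0, ..., alpha_4) and ell = 1.
    sample = [2, 7, 10, 5, 1]
    print(satisfies_3_92(sample), satisfies_3_93(sample, 1))
```

Lowering $\alpha_2$ from 10 to 9 in the sample keeps (3.92) valid but breaks (3.93), which shows why $\alpha_{2n-2}$ has to be taken large enough.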
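Now consider condition (2.13a):
\[
\begin{aligned}
& 2n(2n-1)\,\alpha_0\,\xi^{2n-2}\left(\sum_{k=0}^{2n-2}(k+1)(k+2)\,\alpha_{k+2}\,\xi^{2n-2-k}\eta^{k}\right) \\
& \quad + \left(\sum_{k=1}^{2n-2}(2n-k)(2n-k-1)\,\alpha_{k}\,\xi^{2n-2-k}\eta^{k}\right)\left(\sum_{k=0}^{2n-2}(k+1)(k+2)\,\alpha_{k+2}\,\xi^{2n-2-k}\eta^{k}\right) \\
& \quad > \lambda(2n-1)\left(\sum_{k=0}^{2n-2}(2n-k-1)(k+1)\,\alpha_{k+1}\,\xi^{2n-2-k}\eta^{k}\right)^{2}.
\end{aligned}
\]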
We can write this in the shorthand form
\[ \alpha_0 B_1\big(\alpha_2, \ldots, \alpha_{2n},\, 2n(2n-1)\alpha_{2n} r^{2n-2}\big) + B_2\big(\alpha_1, \ldots, \alpha_{2n},\, 4n(2n-1)\alpha_{2n-2}\alpha_{2n} r^{4n-4}\big) > \lambda(2n-1)\, B_3\big(\alpha_1, \ldots, \alpha_{2n-1},\, (2n-1)^2\alpha_{2n-1}^2 r^{4n-4}\big), \]
where each $B_i$ ($i = 1, 2, 3$) is the obvious polynomial whose last argument indicates its highest-order term in $r \equiv \eta/\xi$. For the moment let us assume $B_1 \ge 0$ for all $\xi \in [-1, 1]$, $\eta \ge -1$; we show below that this is true. Then, there exists $R \in \mathbb{R}^+$
independent of α0 , such that for all |r| > R, B2 > λ(2n−1)B3 (using condition (3.93)), i.e. such that (2.13a) is satisfied. Now suppose |r| ≤ R, then we can clearly choose α0 large enough to guarantee condition (2.13a). Let k 0 denote all odd integers 0 < k 0 < 2n. Then # " " # X X k 0 2n k 0 αk0 2n −1 ξ + α2n − k0 αk0 η 2n 1− F (ξ, η) ≥ α0 − 2n 2n k0 k0 2n X 0 0 X 0 0 k 2n0 −1 2n k ξ + k η + αk 0 αk0 ξ 2n−k η k 1− + 2n 2n k0 k0 # " " # X X k 0 2n k 0 αk0 2n −1 0 ξ + α2n − k αk0 η 2n ≥ α0 − 1− 2n 2n 0 0 k
\[ = c_1 \xi^{2n} + c_2 \eta^{2n}, \tag{3.94} \]
where we choose $\varepsilon$ small enough such that $c_2 > 0$ and then choose $\alpha_0$ large enough so that $c_1 > 0$. Now recall that we need to demonstrate that for all $\xi \in [-1, 1]$, $\eta \ge -1$, $B_1 = \sum_{k=0}^{2n-2}(k+1)(k+2)\,\alpha_{k+2}\,\xi^{2n-2-k}\eta^{k} \ge 0$. This is clear from an argument identical to that in (3.94), choosing $\alpha_2$ large enough. We can easily choose the $\alpha_k$ ($k = 0, 1, \ldots, 2n$) such that the conditions (3.92), (3.93) and $\alpha_0$, $\alpha_2$ large enough are met.

Acknowledgement. We would like to thank U. Abdullaev, B. Bayly, P. Constantin, C. Doering, J. King, E. Leveque, D. Levermore, S. Nazarenko, M. Oliver and E. Titi for all their help and suggestions. SM would especially like to acknowledge M. Oliver for checking the manuscript and for his creative suggestions for improvement. J. Xin would like to thank T. Ogawa for helpful email discussions on [27]. During the preparation of the main part of the work, SM was visiting the Department of Mathematics, University of Arizona, and the Centre for Nonlinear Studies, Los Alamos National Laboratory; and J. Xin was visiting the Institut Mittag-Leffler. The work of J. Xin was partially supported by NSF grants DMS-9302830, DMS-9625680, and the Swedish Natural Science Research Council (NFR) Grant F-GF 10448-301 at the Institut Mittag-Leffler. Last but not least, we sincerely thank the anonymous referee for providing constructive comments and thorough critiques on the early manuscript.
References
1. Adams, R.A.: Sobolev Spaces. NY: Academic Press, 1975
2. Berestycki, H., Larrouturou, B.: Quelques aspects mathématiques de la propagation des flammes prémélangées. Pitman Research Notes in Math Series 220, eds. H. Brezis and J.L. Lions, Nonlinear PDE’s and their applications, Collège de France, 10, 1991
3. Berestycki, H., Larrouturou, B., Roquejoffre, J.-M.: Stability of Traveling Fronts in a Model for Flame Propagation, Part I, Linear Analysis. Arch. Rat. Mech. and Analysis, 117, 97–117 (1992)
4. Berestycki, H., Nirenberg, L.: Some qualitative properties of solutions of semilinear elliptic equations in cylindrical domains. In: Analysis etc., eds., P. Rabinowitz et al., Academic Proceedings, 1990, pp. 115–164
5. Berestycki, H., Nirenberg, L.: Travelling Fronts in Cylinders. Annales de l’IHP, Analyse Nonlinéaire, 9 No. 5, 497–572 (1992)
6. Billingham, J., Needham, D.: The development of traveling waves in quadratic and cubic autocatalysis with unequal diffusion rates, I and II. Phil. Trans. R. Soc. Lond. A 334, 1–124 and 336, 497–539 (1991)
7. Bricmont, J., Kupiainen, A., Xin, J.: Global large time self-similarity of a thermal diffusive combustion system with critical nonlinearity. J. Diff. Eq. 130, No. 1, 9–35 (1996)
8. Buckmaster, J.D., Ludford, G.S.S.: Theory of Laminar Flames. Cambridge: Cambridge University Press, 1982
9. Caffarelli, L., Kohn, R., Nirenberg, L.: Partial Regularity of Suitable Weak Solutions of the Navier–Stokes Equations. Comm. on Pure Appl. Math. 35, 771–831 (1982)
10. Chandrasekhar, S.: Hydrodynamic and Hydromagnetic Stability. Oxford: Clarendon Press, 1961
11. Clavin, P.: Dynamical behavior of premixed flame fronts in laminar and turbulent flows. Prog. Energ. Comb. Sci. 11, 1169–1190 (1975)
12. Collet, P., Xin, J.: Global Existence and Large Time Asymptotic Bounds of $L^\infty$ Solutions of Thermal Diffusive Combustion Systems on $\mathbb{R}^n$. Ann. Scu. Norm. Sup. Pisa Sci. Fis. Mat, Serie IV, Vol. XXIII, Fasc. 4 (1996), pp. 625–642
13. Constantin, P., Fefferman, C.: Direction of Vorticity and the Problem of Global Regularity for the Navier–Stokes Equations. Indiana Univ. Math. J. 42, No. 3, 775–789 (1993)
14. Constantin, P., Foias, C.: Navier–Stokes Equations. Chicago: Univ. of Chicago Press, 1988
15. Dautray, R., Lions, J-L.: Mathematical Analysis and Numerical Methods for Science and Technology. Vols. 2&5, Berlin–Heidelberg–New York: Springer-Verlag, 1984
16. Doering, C.R., Constantin, P.: Variational bounds on energy dissipation in incompressible flows: Shear flow. Phys. Rev. E, 49, No. 5, 4087–4099 (1994)
17. Duan, J., Holmes, P., Titi, E.S.: Global existence theory for a generalised Ginzburg–Landau equation. Nonlinearity 5, 1303–1314 (1992)
18. Focant, S., Gallay, Th.: Existence and Stability of Propagating Fronts for an Autocatalytic Reaction-Diffusion System. Univ. de Paris-Sud, preprint 97–38, 1997
19. Gilbarg, D., Trudinger, N.S.: Elliptic Partial Differential Equations of Second Order. Second Edition, Berlin–Heidelberg: Springer-Verlag, 1977
20. Haraux, A., Youkana, A.: On a Result of K. Masuda Concerning Reaction-Diffusion Equations. Tohoku Math J. 40, 159–163 (1988)
21. Henry, D.: Geometric theory of semilinear parabolic equations. Lecture Notes in Math, 840, Berlin–Heidelberg–New York: Springer-Verlag, 1981
22. Heywood, J.G.: The Navier–Stokes Equations: On the Existence, Regularity, and Decay of Solutions. Indiana Univ. Math. J. 29, No. 5, 639–681 (1980)
23. Kato, T.: Perturbation theory for linear operators. Berlin–Heidelberg–New York: Springer-Verlag, 1966
24. Kato, T.: Strong $L^p$-Solutions of the Navier–Stokes Equation in $\mathbb{R}^m$, with Applications to Weak Solutions. Math. Z. 187, 471–480 (1984)
25. Kato, T.: Nonstationary Flows of Viscous, Ideal Fluids in $\mathbb{R}^3$. J. Funct. Anal. 9, 296–305 (1972)
26. King, A.C., Needham, D.J.: On a singular initial-boundary-value problem for a reaction-diffusion equation arising from a simple model of isothermal chemical autocatalysis. Proc. R. Soc. Lond. A 437, 657–671 (1992)
27. Kozono, H., Ogawa, T.: Two Dimensional Navier–Stokes Flow in Unbounded Domains. Math. Annalen 297, 1–31 (1993)
28. Landau, L., Lifshitz, E.: Fluid Mechanics. 2nd edition, London: Pergamon Press, 1987
29. Leray, J.: Essai sur le mouvement d’un liquide visqueux emplissant l’espace. Acta Math. 63, 193–248 (1934)
30. Leray, J.: Étude de diverses équations intégrales non linéaires et de quelques problèmes que pose l’hydrodynamique. J. Math. Pures Appl. 12, 1–82 (1933)
31. Lions, J.-L., Magenes, E.: Non-homogeneous Boundary Value Problems and Applications. Vol. I, Berlin–Heidelberg: Springer-Verlag, 1972
32. Lions, P.-L.: Mathematical Topics in Fluid Mechanics. Volume I, Oxford: Oxford University Press, 1996
33. Majda, A.J., Souganidis, P.E.: Bounds on Enhanced Turbulent Flame Speeds for Combustion with Fractal Velocity Fields. J. Stat. Phys. 83, Nos. 5/6 (1996)
34. Manley, O., Marion, M.: Attractor Dimension for a Simple Premixed Flame Propagation Model. Combus. Sci. and Tech. 88, 15–32 (1992)
35. Manley, O., Marion, M., Temam, R.: Equations of Combustion in the Presence of Complex Chemistry. Indiana Univ. Math. J. 42, No. 3, 941–965 (1993)
36. Marion, M.: Attractors and Turbulence for some Combustion Models. I.M.A. Vol. in Maths and its Appl. 35, 213–228 (1991)
37. Masuda, K.: On the global existence and asymptotic behavior of solutions of reaction-diffusion equations. Hokkaido Math. J. 12, 360–370 (1983)
38. Meshalkin, L.D., Sinai, Ya.G.: J. Appl. Math. 25, 1700 (1961)
39. Metcalf, M.J., Merkin, J.H., Scott, S.K.: Oscillating wave fronts in isothermal chemical systems with arbitrary powers of autocatalysis. Proc. R. Soc. Lond. A 447, 155–174 (1994)
40. Needham, D.J.: On the global existence of solutions to a singular semilinear parabolic equation arising from the study of autocatalytic chemical kinetics. Z. angew. Math. Phys. 43, (1992)
41. Needham, D.J., King, A.C.: On the Existence, Uniqueness of Solutions to a Singular Nonlinear Boundary Value Problem Arising in Isothermal Autocatalytic Chemical Kinetics. Proceedings of the Edinburgh Mathematical Society 36, 479–500 (1993)
42. Needham, D.J., Merkin, J.H.: The development of traveling waves in a simple isothermal chemical system with general orders of autocatalysis and decay. Phil. Trans. R. Soc. Lond. A 337, 261–274 (1991)
43. Oliver, M.: A Mathematical Investigation of Models of Shallow Water with a Varying Bottom. PhD Thesis, University of Arizona, 1996
44. Papanicolaou, G., Xin, X.: Reaction-diffusion fronts in periodically layered media. J. Stat. Phys. 63, 915–931 (1991)
45. Patnaik, G., Kailasanath, K.: Effect of gravity on the stability and structure of lean hydrogen-air flames. 23rd International Symposium on Combustion, Combustion Inst., 1641–1647 (1990)
46. Pazy, A.: Semigroups of Linear Operators and Applications to Partial Differential Equations. Appl. Math. Sci., 44, Berlin–Heidelberg–New York: Springer-Verlag, 1983
47. Riesz, F., Sz.-Nagy, B.: Functional Analysis. New York: Dover, 1955
48. Roquejoffre, J-M.: Stability of Traveling Fronts in a Model for Flame Propagation, Part II: Nonlinear Stability. Arch. Rat. Mech. Analysis, 117, 119–153 (1992)
49. Sattinger, D.H.: On the Stability of Waves of Nonlinear Parabolic Systems. Adv. in Math 22, 312–355 (1976)
50. She, Z.: Instabilité et Dynamique à Grande Échelle en Turbulence. Thesis, Univ. of Paris VII, 1987
51. Sivashinsky, G.I.: Instabilities, pattern formation and turbulence in flames. Ann. Rev. Fluid Mech, 15, 179–199 (1983)
52. Smoller, J.: Shock Waves, Reaction-Diffusion Equations. New York: Springer, 1982
53. Stein, E.M.: Singular Integrals, Differentiability Properties of Functions. Princeton, NJ: Princeton University Press, 1970
54. Temam, R.: Navier–Stokes Equations, Theory, Numerical Analysis. Second Edition, Amsterdam: North-Holland, 1984
55. Williams, F.A.: Combustion Theory. 2nd edition, Menlo Park, CA: Benjamin Cummings, 1985
56. Wolka, J.: Partial differential equations. Cambridge: Cambridge University Press, 1987
57. Xin, X.: Existence and Stability of Traveling Waves in Periodic Media Governed by a Bistable Nonlinearity. J. Dynamics Diff. Eqs. 3, 541–573 (1991)
58. Xin, J.X.: Existence of Planar Flame Fronts in Convective-Diffusive Periodic Media. Arch. Rat. Mech. and Analysis 121, 205–233 (1992)
59. Xin, J.X.: Existence and Nonexistence of Travelling Waves and Reaction-Diffusion Front Propagation in Periodic Media. J. Stat. Phys. 73, 893–926 (1993)
60. Xin, J.X., Zhu, J.: Quenching and propagation of bistable reaction-diffusion fronts in multidimensional periodic media. Phys. D 81, 94–110 (1995)
61. Yosida, K.: Functional Analysis. Berlin–Heidelberg: Springer-Verlag, 1965
62. Zhu, J.Y., Xin, J.: A Numerical Study of Propagating Fronts in a Reactive Boussinesq Flow System. Preprint, 1996

Communicated by A. Kupiainen
Commun. Math. Phys. 193, 317 – 336 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Influence of a Magnetic Fluxon on the Vacuum Energy of Quantum Fields Confined by a Bag
S. Leseduarte¹, August Romeo²
¹ Dept ECM and IFAE, Faculty of Physics, University of Barcelona, Diagonal 647, 08028 Barcelona, Spain. E-mail: [email protected]
² Institut d’Estudis Espacials de Catalunya (IEEC), Edifici Nexus, c. Gran Capità 2–4, 08034 Barcelona, Spain. E-mail: [email protected]
Received: 25 March 1997 / Accepted: 3 September 1997
Abstract: We study the simultaneous influence of boundary conditions and external fields on quantum fluctuations by considering vacuum zero-point energies for quantum fields in the presence of a magnetic fluxon confined by a bag, circular and spherical for bosons and circular for fermions. The Casimir effect is calculated in a generalized cut-off regularization after applying zeta-function techniques to eigenmode sums and using recent techniques about Bessel zeta functions at negative arguments.

1. Introduction
Aharonov–Bohm [1] settings may be regarded as one of the possible ways in which an external field modifies some observables of a given quantum system. In this specific phenomenon, the presence of an infinitely thin tube of magnetic flux alters the energy spectrum and brings about a modification of the vacuum energy, giving rise to a form of Casimir effect. Initially, Aharonov–Bohm fields acted on free particles (for Casimir interactions between two solenoids see [5]). The purpose of the present paper is to study the influence of the same type of fields on systems which are already constrained by boundary conditions (b.c.), working out the combined net effect of both on what would otherwise be a free system.
The relevance of the Aharonov–Bohm scenario to some cosmic string models including particles in the gravitational field of a spinning source is discussed in [2, 3]. A similar mathematical procedure is also applied to the description of Dirac fermions on black-hole backgrounds as shown in [4].
The quantum mechanical problem of a scalar particle inside a circular Aharonov–Bohm quantum billiard [6–9] (of radius $a$) bears a great resemblance, from the mathematical point of view, to the ones we set out to consider. In our case we have a classical magnetic fluxon which is coupled to a quantum field. We take this object to be an idealization of a vortex with a radially symmetric distribution of magnetic field in the limit where its characteristic thickness is vanishingly small. With this model in mind one has
the physical basis to fix the boundary conditions at the origin to be imposed on the modes of the matter field (a detailed account of this kind of analysis is given in [10]). We start with a complex, massless Klein–Gordon field (the additional analytic effort required to treat a massive field by zeta function techniques is explained in [11]). We call $\phi$ the space-dependent part of the eigenmodes, which satisfies the equation (in units such that $\hbar = c = 1$)
\[ (-i\nabla - e\mathbf{A})^2 \phi = \omega^2 \phi, \tag{1.1} \]
where the vector potential $\mathbf{A}$ is given by
\[ e\mathbf{A}(\mathbf{r}) = \frac{\alpha}{r}\,\hat{e}_\varphi, \qquad \alpha = \frac{e\Phi}{2\pi}. \tag{1.2} \]
$\alpha$ is called the reduced flux, $\Phi$ being the flux of the magnetic field. Since a billiard is a domain with perfectly reflecting walls, and we imagine an infinitely thin solenoid at the origin (reduced, in $D = 2$, to an unreachable point), the b.c. are $\phi = 0$ at $r = 0$ and $r = a$.
Zero-point energies emerge from mode-sums of the type $\frac{1}{2}\sum_n \omega_n$, and give rise to the Casimir effect [12–13]. Since the summation extends over all the $\omega_n$'s in the set of eigenmodes, such quantities usually diverge and need some regularization to make sense of them. To this end, we introduce spectral zeta functions as mere auxiliary tools, which will be denoted by
\[ \zeta_M(s) = \sum_n \omega_n^{-s}, \qquad \zeta_{\frac{M}{\mu}}(s) = \sum_n \left(\frac{\omega_n}{\mu}\right)^{-s}. \tag{1.3} \]
$\mu$ is an arbitrary scale with mass dimensions, used to work with dimensionless objects. This is a regularization of analytical nature (see also [14]); other examples in this same category are the techniques in refs. [15] and refs. [16, 17]. When we discuss the results, we shall comment on their physical significance from the perspective of cut-off regularization. In this sense, our standpoint in the present case is that the zeta function is a purely mathematical object which affords a convenient method for the calculation of observables inasmuch as it may be connected with other, more physical, regularizations. Let us assume that we are using a general cut-off regularization for the vacuum energy which is given by
\[ E_{\rm reg} = \frac{1}{2}\sum_n \omega_n\, g\!\left(\frac{\omega_n}{\Lambda}\right), \tag{1.4} \]
where $g$ is a well-behaved function which satisfies asymptotic expansions near the origin and at infinity of the following kind:
\[ g(t) \sim 1 + \sum_{k=1}^{\infty} a_k t^k \quad (t \to 0), \qquad g(t) \sim \frac{1}{t^{M_s}}\sum_{k=0}^{\infty} b_k t^{-k} \quad (t \to \infty). \tag{1.5} \]
If we restrict this analysis for the sake of simplicity to a (2 + 1)-D case, then g should be such that Ms > 3. These conditions guarantee that the Mellin transform of g has no poles in the strip 0 <
the following reasoning makes sense. We now explain how the connection between the $\zeta_M$-function (1.3) and expression (1.4) may be accomplished (for other discussions on how the results from different regulators may be related, see [21, 22]). The Parseval formula for Mellin transforms allows us to write
\[ E_{\rm reg}(\Lambda) = \frac{1}{2}\,\frac{1}{2\pi i}\int_{r-i\infty}^{r+i\infty} dz\, \Lambda^{1+z}\, M[g, z+1]\, \zeta_M(z), \tag{1.6} \]
where $r$ is any real number such that $2 < r < M_s - 1$. The structure of the poles of the integrand in expression (1.6) is closely related to the asymptotic expansion of $E_{\rm reg}(\Lambda)$ for large values of the cut-off $\Lambda$. In the case at issue the relevant poles (the rightmost ones) are found at $z = 2$, $z = 1$ and $z = -1$. The first one is a simple pole due to the divergence of $\zeta_M$ at $z = 2$. This induces the strongest divergence in $E_{\rm reg}$ (the one that is related to the volume (surface) and which in our two-dimensional case goes as $\Lambda^3$). The next divergence arises from the pole of $\zeta_M$ at $z = 1$ and goes as $\Lambda^2$. The last divergence stems from a pole at $z = -1$. This pole is in general a double one because both $M[g, z+1]$ and $\zeta_M$ have poles at that point. This fact means that by properly applying the Cauchy theorem, the integrand at $z = -1$ determines the finite part of $E_{\rm reg}$, and also a divergent piece which goes as the logarithm of $\Lambda$. Let us give a more detailed account of these remarks. From the hypotheses we have stated, we may write
\[
\begin{aligned}
M[g, z] &= \frac{1}{z} + {\rm Fin}\,[M[g, z = 0]] + O(z), \\
\zeta_M(z) &= \frac{{\rm Res}\,[\zeta_M, 2]}{z - 2} + O((z-2)^0), \\
\zeta_M(z) &= \frac{{\rm Res}\,[\zeta_M, 1]}{z - 1} + O((z-1)^0), \\
\zeta_M(z) &= \frac{{\rm Res}\,[\zeta_M, -1]}{z + 1} + O((z+1)^0),
\end{aligned}
\tag{1.7}
\]
where the symbol Fin means the extraction of the finite coefficient from the Laurent expansion of a function. Now we may use these expressions in (1.6) to get that, apart from terms which go to zero for large values of the cut-off $\Lambda$, the regularized expression for the vacuum energy is
\[
E_{\rm reg}(\Lambda) = \frac{1}{2}\Big( \Lambda^3 M[g, 3]\,{\rm Res}\,[\zeta_M, 2] + \Lambda^2 M[g, 2]\,{\rm Res}\,[\zeta_M, 1] + \ln\Lambda\;{\rm Res}\,[\zeta_M, -1] + {\rm Fin}\,[M[g], z = 0]\;{\rm Res}\,[\zeta_M, -1] + {\rm Fin}\,[\zeta_M, z = -1] \Big). \tag{1.8}
\]
The conclusion of this analysis is that to prove that the divergent terms in $E_{\rm reg}(\Lambda)$ are independent of the magnetic flux, it suffices to show that the residues of $\zeta_M$ at $z = 2$, $z = 1$ and $z = -1$ do not depend on this physical parameter. To put it another way, if we label the different heat-kernel coefficients $B_0, B_{1/2}, B_1, B_{3/2}, \ldots$, this independence boils down to saying that $B_0$, $B_1$ and $B_{3/2}$ do not depend on the magnetic flux. As for the finite part of expression (1.4) and the quantity ${\rm Fin}\,[\zeta_M, z = -1]$, their relationship is quite direct, as it appears explicitly in expression (1.8). We shall systematically discard such divergences, which do not depend on $\alpha$. These divergences would be relevant in a study of the bag dynamics (see [20, 11]), that is, in situations where the bag walls are liable to deformation. In our case, we take the bag to be a perfectly rigid object.
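As a concrete illustration of the first line of (1.7), one may take the exponential cut-off $g(t) = e^{-t}$. This choice is an assumption made here only for the example and is not singled out in the text; it decays faster than any power, so the large-$t$ requirement is met trivially. Its Mellin transform is $M[g, z] = \Gamma(z)$, whose Laurent expansion at $z = 0$ is $1/z - \gamma + O(z)$, with $\gamma$ the Euler–Mascheroni constant. Hence ${\rm Fin}\,[M[g, z = 0]] = -\gamma$, while $M[g, 3] = 2$ and $M[g, 2] = 1$ fix the coefficients of the $\Lambda^3$ and $\Lambda^2$ terms in (1.8). The sketch below, with hypothetical function names, checks these values using only the Python standard library.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def mellin_exp_cutoff(z: float) -> float:
    """Mellin transform of g(t) = exp(-t), i.e. Gamma(z), for z > 0."""
    return math.gamma(z)


def finite_part_at_zero(z: float = 1e-6) -> float:
    """Estimate Fin[M[g, z=0]] by removing the 1/z pole at a small z."""
    return mellin_exp_cutoff(z) - 1.0 / z


if __name__ == "__main__":
    print(mellin_exp_cutoff(3.0), mellin_exp_cutoff(2.0))
    print(finite_part_at_zero(), -EULER_GAMMA)
```

The residual difference between the two printed numbers in the last line is of order $z$, as expected from the $O(z)$ term in (1.7).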
In the free case (i.e. without flux) the eigenfrequencies under our conditions are zeros of $J_\nu$ Bessel functions with integer indices $\nu$ coming from angular momentum. The solutions for nonzero $\alpha$ have been found in [6, 9], and basically correspond to an index shift with respect to the free case, $|l| \to |l - \alpha|$. Since in both cases the eigenmodes are zeros of the same type of functions, we shall introduce the following “partial-wave” zeta functions for fixed values of $\nu$:
\[ \zeta_\nu(s) = \sum_{n=1}^{\infty} j_{\nu,n}^{-s}, \quad \text{for } {\rm Re}\, s > 1, \tag{1.9} \]
where $j_{\nu,n}$ denotes the $n$th nonvanishing zero of $J_\nu$ (see also [26, 27]; discrete versions of the Bessel problem, their solutions and associated zeta functions have also been studied in [32]). When considering the whole problem in a $D$-dimensional space, one must take into account the degeneracy $d(D, l)$ of each angular mode in $D$ dimensions. Therefore, we define the “complete” spherical zeta function
\[ \zeta_M(s) = a^s \sum_{l=l_{\min}}^{\infty} d(D, l) \sum_{n=1}^{\infty} j_{\nu(D,l),n}^{-s} = a^s \sum_{l=l_{\min}}^{\infty} d(D, l)\,\zeta_{\nu(D,l)}(s), \tag{1.10} \]
where $l_{\min}$ is the minimum value (if any) of $l$. In the free case $\nu(D, l) = l + D/2 - 1$ and the general form of $d(D, l)$ (see e.g. [33]) is
\[ d(D, l) = (2l + D - 2)\,\frac{(l + D - 3)!}{l!\,(D - 2)!}, \]
but this will change when a flux is present.
Following the programme we have just put forward, the partial wave zeta function for scalars is obtained in Sect. 2. From this starting point, we construct the complete zeta functions for $D = 2$ and $D = 3$ complex scalar fields in Sects. 3 and 4, respectively, finding their analytic continuations to $s = -1$. Numerical results for the zero-point energy are then discussed. Afterwards, in Sect. 5 we study the Dirac field in $D = 2$, and Sect. 6 is devoted to the conclusions.
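The free-case degeneracy and Bessel index entering (1.10) are easy to tabulate. The sketch below is an illustration with hypothetical function names. One assumption is made explicit in the code: for $D = 2$ and $l = 0$ the factorial $(l + D - 3)!$ is undefined, and the single $l = 0$ mode is assigned degeneracy 1, consistent with the sum over all integer $l$ used for the two-dimensional problem.

```python
from math import factorial


def degeneracy(D: int, l: int) -> int:
    """Free-case degeneracy d(D, l) of the angular mode l in D dimensions."""
    if D < 2 or l < 0:
        raise ValueError("need D >= 2 and l >= 0")
    if D == 2 and l == 0:
        # Assumption for this sketch: (l + D - 3)! is undefined here;
        # the l = 0 mode on the circle is taken to be non-degenerate.
        return 1
    return (2 * l + D - 2) * factorial(l + D - 3) // (factorial(l) * factorial(D - 2))


def bessel_index(D: int, l: int) -> float:
    """Free-case Bessel index nu(D, l) = l + D/2 - 1."""
    return l + D / 2 - 1


if __name__ == "__main__":
    for l in range(4):
        print(l, degeneracy(2, l), degeneracy(3, l), bessel_index(3, l))
```

For $D = 3$ this reproduces the familiar $d(3, l) = 2l + 1 = 2\nu(3, l)$.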
2. “Partial-Wave” Zeta Function
The calculation of the Casimir energy by the complete zeta function requires knowledge of the Bessel zeta functions (1.9) at $s = -1$, but the complex domain where (1.9) holds is bounded by ${\rm Re}\, s = 1$. However, $\zeta_\nu(s)$ admits an analytic continuation to other values of $s$. What is more, we showed in refs. [26] and [27] how to find an integral representation of this continuation to the domain $-1 < {\rm Re}\, s < 0$, which reads
\[ \zeta_\nu(s) = \frac{s}{\pi}\sin\frac{\pi s}{2}\int_0^\infty dx\, x^{-s-1}\ln\left[\sqrt{2\pi x}\,e^{-x} I_\nu(x)\right], \quad \text{for } -1 < {\rm Re}\, s < 0. \tag{2.1} \]
Whenever $\nu \ne 0$ we can work out (2.1) by the method explained in [29, 30] (see also [31] and [24]), and arrive at
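\[
\begin{aligned}
\zeta_\nu(s) = {}& \frac{1}{4}\sigma_1 \nu^{-s} + \nu^{-s}\,\frac{s}{\pi}\sin\frac{\pi s}{2}\,\Bigg\{ \sigma_2\left[\frac{1}{2s}B\!\left(\frac{s+1}{2}, -\frac{s}{2}\right) + 2^{s-1}B\!\left(\frac{s+1}{2}, -s\right) + 2^{s-1}B\!\left(\frac{s+3}{2}, -s\right)\right]\nu \\
& + S_N(s, \nu) + \rho\,\frac{1}{2}B\!\left(\frac{s+1}{2}, -\frac{s}{2}\right)\frac{1}{\nu} + \bar{J}_1(s)\,\frac{1}{\nu} + \sum_{n=2}^{N} J_n(s)\,\frac{1}{\nu^n} \Bigg\},
\end{aligned}
\tag{2.2}
\]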
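where
\[ \sigma_1 = -1, \qquad \sigma_2 = 1, \qquad \rho = \frac{1}{8}, \tag{2.3} \]
and
\[
\begin{aligned}
S_N(s, \nu) &\equiv \int_0^\infty dx\, x^{-s-1}\left\{ \ln\left[L(\nu, x)\right] - \sum_{n=1}^{N}\frac{U_n(t(x))}{\nu^n} \right\}, \\
L(\nu, x) &= \sqrt{2\pi\nu}\,(1 + x^2)^{1/4}\, e^{-\nu\eta(x)}\, I_\nu(\nu x), \qquad \eta(x) = \sqrt{1 + x^2} + \ln\frac{x}{1 + \sqrt{1 + x^2}},
\end{aligned}
\tag{2.4}
\]
with
\[
\begin{aligned}
U_1(t) &= \frac{t}{8} - \frac{5t^3}{24}, \\
U_2(t) &= \frac{t^2}{16} - \frac{3t^4}{8} + \frac{5t^6}{16}, \\
U_3(t) &= \frac{25t^3}{384} - \frac{531t^5}{640} + \frac{221t^7}{128} - \frac{1105t^9}{1152}, \\
U_4(t) &= \frac{13t^4}{128} - \frac{71t^6}{32} + \frac{531t^8}{64} - \frac{339t^{10}}{32} + \frac{565t^{12}}{128}, \\
&\;\;\vdots
\end{aligned}
\tag{2.5}
\]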
(note that for $n \ge 2$ the $U_n(t)$'s no longer contain linear terms in $t$). The key point is that, calculating in this way, $S_N(s, \nu)$ is a finite integral at $s = -1$. Further,
\[ \bar{J}_1(s) = -\frac{5}{48}B\!\left(\frac{s+3}{2}, -\frac{s}{2}\right), \qquad J_n(s) = \int_0^\infty dx\, x^{-s-1}\, U_n(t(x)), \quad t(x) = \frac{1}{\sqrt{1 + x^2}}. \tag{2.6} \]
Thus, the $J_n(s)$'s are found from the $U_n(t)$'s. Since
\[ \int_0^\infty dx\, x^{-s-1}\,[t(x)]^m = \frac{1}{2}B\!\left(\frac{s+m}{2}, -\frac{s}{2}\right), \tag{2.7} \]
the result of the $x$-integration is just like replacing
\[ U_n(t) \to J_n(s), \qquad t^m \to \frac{1}{2}B\!\left(\frac{s+m}{2}, -\frac{s}{2}\right). \tag{2.8} \]
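The replacement rule (2.8) can be tried out directly at $s = -1$, where the $J_n$ with $n \ge 2$ are finite. The sketch below is an illustration only (the dictionary `U` and the function names are introduced here): it encodes the coefficients of $U_2$, $U_3$, $U_4$ from (2.5), applies $t^m \to \frac{1}{2}B\big(\frac{s+m}{2}, -\frac{s}{2}\big)$ with the Beta function built from `math.gamma`, and evaluates $J_n(-1)$.

```python
import math
from fractions import Fraction

# Coefficients of U_n(t) from (2.5), stored as {power m: coefficient}.
U = {
    2: {2: Fraction(1, 16), 4: Fraction(-3, 8), 6: Fraction(5, 16)},
    3: {3: Fraction(25, 384), 5: Fraction(-531, 640),
        7: Fraction(221, 128), 9: Fraction(-1105, 1152)},
    4: {4: Fraction(13, 128), 6: Fraction(-71, 32), 8: Fraction(531, 64),
        10: Fraction(-339, 32), 12: Fraction(565, 128)},
}


def beta(x: float, y: float) -> float:
    """Euler Beta function B(x, y) for positive arguments."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)


def J(n: int, s: float) -> float:
    """J_n(s) obtained from U_n(t) through the replacement rule (2.8)."""
    return sum(float(c) * 0.5 * beta((s + m) / 2, -s / 2)
               for m, c in U[n].items())


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, J(n, -1.0))
```

Under these definitions one finds $J_2(-1) = -\pi/256$, $J_3(-1) = -229/40320$ and $J_4(-1) = 35\pi/65536$, the same rational numbers that appear as coefficients in the Laurent expansions at $s = -1$.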
Expression (2.2) is not valid for $\nu = 0$, because it was obtained from a rescaling $x \to \nu x$ and subsequent application of uniform asymptotic expansions in $\nu x$. Moreover, numerically speaking it is not convenient if $\nu$ is very small. An alternative representation valid in these conditions is needed. Starting from (2.1), we subtract and add the asymptotic behaviour of the integrand, which produces a logarithmic divergence on integration. When doing this, we shall write the large-$x$ expansion of $\ln[\ldots]$ as follows:
\[ \ln\left[\sqrt{2\pi x}\,e^{-x} I_\nu(x)\right] = -\frac{4\nu^2 - 1}{8x} + O\!\left(\frac{1}{x^2}\right) = -\frac{4\nu^2 - 1}{8\sqrt{x^2 + 1}} + O\!\left(\frac{1}{x^2 + 1}\right). \tag{2.9} \]
The piece we separate is later integrated with the help of (2.7) ($m = 1$ case) and we obtain
\[
\begin{aligned}
\zeta_\nu(s) &= \frac{s}{\pi}\sin\frac{\pi s}{2}\left[ R_\nu(s) - \frac{4\nu^2 - 1}{16}B\!\left(\frac{s+1}{2}, -\frac{s}{2}\right) \right], \\
R_\nu(s) &= \int_0^\infty dx\, x^{-s-1}\left\{ \ln\left[\sqrt{2\pi x}\,e^{-x} I_\nu(x)\right] + \frac{4\nu^2 - 1}{8\sqrt{x^2 + 1}} \right\}.
\end{aligned}
\tag{2.10}
\]
Given that the above integral is now finite at $s = -1$, we can Laurent-expand without problems around $s = -1$, arriving at
\[ \zeta_\nu(s) = \frac{1 - 4\nu^2}{8\pi}\,\frac{1}{s+1} + \frac{1 - 4\nu^2}{8\pi}(-1 + \ln 2) + \frac{1}{\pi}R_\nu(-1) + O(s+1). \tag{2.11} \]
In particular, for $\nu = 0$, $R_0(-1) = -0.00723$ and, as a result,
\[ \zeta_0(s) = \frac{1}{8\pi}\,\frac{1}{s+1} - 0.01451 + O(s+1). \tag{2.12} \]
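The arithmetic leading from (2.11) to (2.12) can be reproduced in a few lines. The sketch below is an illustration with hypothetical function names; the value $R_0(-1) = -0.00723$ is the one quoted in the text and is taken as input, not recomputed.

```python
import math


def pole_coefficient(nu: float) -> float:
    """Coefficient of 1/(s+1) in (2.11): (1 - 4 nu^2) / (8 pi)."""
    return (1.0 - 4.0 * nu ** 2) / (8.0 * math.pi)


def finite_part(nu: float, r_at_minus_one: float) -> float:
    """Finite part of zeta_nu(s) at s = -1 according to (2.11)."""
    return pole_coefficient(nu) * (math.log(2.0) - 1.0) + r_at_minus_one / math.pi


if __name__ == "__main__":
    # R_0(-1) = -0.00723 is the numerical value quoted in the text.
    print(pole_coefficient(0.0), finite_part(0.0, -0.00723))
```

Rounded to five decimals, the finite part for $\nu = 0$ agrees with the constant in (2.12).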
Note also the vanishing of the $s = -1$ pole when $\nu = \pm 1/2$, as already explained in ref. [27].

3. D = 2 Bosons
3.1. “Complete” zeta function. Now, we consider the two-dimensional problem. Following ref. [9], one realizes that the eigenmode sum for this case gives rise to the following complete spectral zeta function:
\[ \zeta_M(s; \alpha) = a^s\sum_{l=-\infty}^{\infty}\zeta_{|l-\alpha|}(s). \tag{3.1} \]
As this function has the properties
\[ \zeta_M(s; \alpha + k) = \zeta_M(s; \alpha),\ k \in \mathbb{Z}, \qquad \zeta_M(s; -\alpha) = \zeta_M(s; \alpha), \tag{3.2} \]
it will suffice to study it for $0 \le \alpha \le 1/2$. Introducing
\[ \bar{\zeta}_M(s; \beta) \equiv a^s\sum_{l=0}^{\infty}\zeta_{l+\beta}(s), \tag{3.3} \]
we can decompose
\[ \zeta_M(s; \alpha) = \bar{\zeta}_M(s; \alpha) + \bar{\zeta}_M(s; 1 - \alpha) = a^s\zeta_{|\alpha|}(s) + \bar{\zeta}_M(s; 1 + \alpha) + \bar{\zeta}_M(s; 1 - \alpha). \tag{3.4} \]
Next we insert expression (2.2) into (3.3) and, realizing that
\[ \sum_{l=0}^{\infty}(l + \beta)^{-s} = \zeta_H(s, \beta), \]
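where $\zeta_H$ stands for the Hurwitz zeta function, we find
\[
\begin{aligned}
\bar{\zeta}_M(s; \beta) = {}& \frac{1}{4}\sigma_1 a^s\zeta_H(s, \beta) + a^s\,\frac{s}{\pi}\sin\frac{\pi s}{2}\,\Bigg\{ \sigma_2\left[\frac{1}{2s}B\!\left(\frac{s+1}{2}, -\frac{s}{2}\right) + 2^{s-1}B\!\left(\frac{s+1}{2}, -s\right) + 2^{s-1}B\!\left(\frac{s+3}{2}, -s\right)\right]\zeta_H(s - 1, \beta) \\
& + \sum_{l=0}^{\infty}S_N(s, l + \beta)(l + \beta)^{-s} + \rho\,\frac{1}{2}B\!\left(\frac{s+1}{2}, -\frac{s}{2}\right)\zeta_H(s + 1, \beta) \\
& + \bar{J}_1(s)\,\zeta_H(s + 1, \beta) + \sum_{n=2}^{N}J_n(s)\,\zeta_H(s + n, \beta) \Bigg\},
\end{aligned}
\tag{3.5}
\]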
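with the values of $\sigma_1$, $\sigma_2$ and $\rho$ in (2.3). Taking $N = 4$ and Laurent-expanding near $s = -1$, this is written
\[
\begin{aligned}
\bar{\zeta}_M(s; \beta) = \frac{1}{a}\Bigg\{ & -\frac{1}{4}\zeta_H(-1, \beta) + \frac{1}{\pi}\left[\frac{1}{4}\zeta_H(-2, \beta) - \frac{5}{24}\zeta_H(0, \beta) - \frac{229}{40320}\zeta_H(2, \beta)\right] + \frac{35}{65536}\zeta_H(3, \beta) \\
& + \frac{1}{\pi}\sum_{l=0}^{\infty}S_4(-1, l + \beta)(l + \beta) + \frac{1}{\pi}\left(-\frac{\pi}{256} - \frac{1}{2}\zeta_H(-2, \beta) + \frac{1}{8}\zeta_H(0, \beta)\right)\left(\frac{1}{s+1} + \ln a - 1\right) \\
& + \frac{1}{\pi}\left[-\frac{\pi}{64} + \frac{\ln 2}{16} - \beta\,\frac{\ln 2}{8} + \frac{\pi\psi(\beta)}{256} - \left(1 + \frac{1}{2}\ln 2\right)\zeta_H(-2, \beta) - \frac{1}{2}\zeta_H'(-2, \beta) + \frac{1}{8}\zeta_H'(0, \beta)\right] + O(s+1) \Bigg\}.
\end{aligned}
\tag{3.6}
\]
As for the pole at $s = -1$ of the complete zeta function, by (3.4), (2.11) and (3.6), and observing that $\zeta_H(-2, 1 + \alpha) + \zeta_H(-2, 1 - \alpha) = -\alpha^2$, one arrives at
\[ \zeta_M(s; \alpha) = \frac{1}{a}\left[-\frac{1}{128}\,\frac{1}{s+1} + O((s+1)^0)\right], \tag{3.7} \]
i.e. the residue is independent of $\alpha$. The reader may check by using the method explained in ref. [24] that this independence applies not only to $B_{3/2}$, but also to $B_0$ and $B_{1/2}$. As we have explained in the introduction, this property allows us to state that in cut-off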
regularization the dependence of the vacuum energy on the magnetic flux does not appear in the divergent terms. The dependence of the vacuum energy on the magnetic flux is completely contained in the finite part of $\zeta_M$. Since we plan to use the same formulas for calculating the finite parts, it will be necessary to obtain $\zeta_H'(0, \beta)$ and $\zeta_H'(-2, \beta)$ around $\beta = 1$. The first is known (see e.g. [34]) and has the value
\[ \zeta_H'(0, \beta) = \ln\Gamma(\beta) - \frac{1}{2}\ln(2\pi), \tag{3.8} \]
while the second is calculated by numerical evaluation of $\zeta_H'(-n, \beta)$ from an integral representation of the derivative of $\zeta_H$ valid for negative first arguments.
3.2. Numerical results. First, we take the $l = 0$ partial-wave zeta functions obtained from (2.10)–(2.11). We are supposing $\alpha \ge 0$ and the results will be denoted by
\[ a^s\zeta_\alpha(s) = \frac{1}{a}\left[ r_\alpha\left(\frac{1}{s+1} + \ln a\right) + p_\alpha + O(s+1) \right], \qquad r_\alpha = \frac{1 - 4\alpha^2}{8\pi}, \tag{3.9} \]
where the finite parts $p_\alpha$ are listed in the second column of Table 1.

Table 1. Finite parts at $s = -1$ of the zeta functions involved (for $a = 1$). Column 2: $l = 0$ partial-wave zeta function $\zeta_\alpha(s)$. Columns 3 and 4: $\bar{\zeta}_M(s; \beta)$ with $\beta = 1 + \alpha$ and $\beta = 1 - \alpha$. Column 5: finite part of the complete zeta function $\zeta_M(s; \alpha)$.

α      p_α                   p̄_{1+α}     p̄_{1−α}     q_α
0      −0.01451              +0.01174     +0.01174     +0.00899
0.1    −0.05971              +0.04062     −0.01172     −0.03081
0.2    −0.10771              +0.07491     −0.02987     −0.06266
0.3    −0.15778              +0.11462     −0.04285     −0.08601
0.4    −0.20932              +0.15968     −0.05095     −0.10060
1/2    −π/12 = −0.26180      +0.21001     −0.05471     −0.10650
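The entry $p_{1/2} = -\pi/12$ in the last row of Table 1 can be cross-checked independently. For $\nu = 1/2$ one has $I_{1/2}(x) = \sqrt{2/(\pi x)}\,\sinh x$, so the logarithm in (2.10) reduces to $\ln(1 - e^{-2x})$ while the subtraction term vanishes, and (2.10) gives $\zeta_{1/2}(-1) = R_{1/2}(-1)/\pi$. The sketch below is an illustration only: the function name, the substitution $x = u^2$ (used to tame the logarithmic endpoint singularity) and the quadrature parameters are choices made here, not taken from the text.

```python
import math


def r_half_at_minus_one(u_max: float = 5.0, steps: int = 200_000) -> float:
    """Midpoint estimate of R_{1/2}(-1) = int_0^inf ln(1 - exp(-2x)) dx.

    Uses the substitution x = u**2, so dx = 2u du; the integrand is then
    bounded at u = 0 and negligible beyond u_max.
    """
    h = u_max / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        # 1 - exp(-2 u^2) computed accurately as -expm1(-2 u^2)
        total += math.log(-math.expm1(-2.0 * u * u)) * 2.0 * u
    return total * h


if __name__ == "__main__":
    r = r_half_at_minus_one()
    print(r, -math.pi ** 2 / 12)      # R_{1/2}(-1) versus -pi^2/12
    print(r / math.pi, -math.pi / 12)  # zeta_{1/2}(-1) versus -pi/12
```

The quadrature reproduces $R_{1/2}(-1) = -\pi^2/12$, hence $\zeta_{1/2}(-1) = -\pi/12 \approx -0.26180$.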
The absence of a pole for $\alpha = 1/2$ may be viewed as a consequence of the fact that $J_{1/2}(x) \propto \sin x$, and therefore $\zeta_{1/2}(s) = \pi^{-s}\zeta_R(s)$ ($\zeta_R$ meaning the Riemann zeta function), which is finite at $s = -1$ because $\zeta_R(-1) = -1/12$. Next, we find $\bar{\zeta}_M(s; \beta)$ from (3.6) for the corresponding $\beta = 1 \pm \alpha$'s. We will write it in the way
\[ \bar{\zeta}_M(s; \beta) = \frac{1}{a}\left[ \bar{r}_\beta\left(\frac{1}{s+1} + \ln a\right) + \bar{p}_\beta \right] + O(s+1). \tag{3.10} \]
According to (3.6)
\[ \bar{r}_\beta = \frac{1}{\pi}\left( -\frac{\pi}{256} - \frac{1}{2}\zeta_H(-2, \beta) + \frac{1}{8}\zeta_H(0, \beta) \right) \tag{3.11} \]
(note that in terms of Bernoulli polynomials $\zeta_H(-n, x) = -\frac{1}{n+1}B_{n+1}(x)$). As for $\bar{p}_\beta$, we list some of its values in columns 3 and 4 of Table 1. Using now (3.4) and the above results we obtain
Fig. 1. Bosonic zero-point energy $E_c$ in $D = 2$ for $a = 1$ (then $E_c = q_\alpha$) as a function of the reduced flux $\alpha$
\[ \zeta_M(s; \alpha) = \frac{1}{a}\left[ -\frac{1}{128}\left(\frac{1}{s+1} + \ln a\right) + q_\alpha \right] + O(s+1). \tag{3.12} \]
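The pole coefficient $-1/128$ in (3.12) follows from (3.9) and (3.11) once the Hurwitz zeta function at non-positive integers is expressed through Bernoulli polynomials, $\zeta_H(-n, x) = -B_{n+1}(x)/(n+1)$. The sketch below is an illustration with hypothetical function names; it uses exact rationals for the Bernoulli part and floating point only where $\pi$ enters.

```python
import math
from fractions import Fraction


def hurwitz_at_minus_two(x: Fraction) -> Fraction:
    """zeta_H(-2, x) = -B_3(x)/3 with B_3(x) = x^3 - (3/2) x^2 + (1/2) x."""
    b3 = x ** 3 - Fraction(3, 2) * x ** 2 + Fraction(1, 2) * x
    return -b3 / 3


def hurwitz_at_zero(x: Fraction) -> Fraction:
    """zeta_H(0, x) = -B_1(x) = 1/2 - x."""
    return Fraction(1, 2) - x


def r_partial(alpha: Fraction) -> float:
    """r_alpha from (3.9)."""
    return (1.0 - 4.0 * float(alpha) ** 2) / (8.0 * math.pi)


def r_bar(beta: Fraction) -> float:
    """r-bar_beta from (3.11)."""
    return (-math.pi / 256
            - 0.5 * float(hurwitz_at_minus_two(beta))
            + float(hurwitz_at_zero(beta)) / 8) / math.pi


def total_pole_coefficient(alpha: Fraction) -> float:
    """r_alpha + r-bar_{1+alpha} + r-bar_{1-alpha}."""
    return r_partial(alpha) + r_bar(1 + alpha) + r_bar(1 - alpha)


if __name__ == "__main__":
    for a in (Fraction(0), Fraction(1, 10), Fraction(3, 10), Fraction(1, 2)):
        print(a, total_pole_coefficient(a))
```

For every sampled flux the sum equals $-1/128 = -0.0078125$ to machine precision, because the $\alpha^2$ terms from $r_\alpha$ and from $\zeta_H(-2, 1 \pm \alpha)$ cancel exactly.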
The already remarked $\alpha$-independence of the residue is exhibited by the fact that $r_\alpha + \bar{r}_{1+\alpha} + \bar{r}_{1-\alpha} = -\frac{1}{128}$. Values of $q_\alpha = p_\alpha + \bar{p}_{1+\alpha} + \bar{p}_{1-\alpha}$ for different $\alpha$'s between 0 and 1/2 are given in the fifth column of Table 1 (see also Fig. 1). Now it would be incorrect to say that the dependence of $E_{\rm reg}(\Lambda)$ on $\alpha$ is exactly given by $\frac{1}{2a}q_\alpha$. We should take into account a factor 2 which stems from the complex nature of the scalar field. In other words, the dependence of $E_{\rm reg}(\Lambda)$ on $\alpha$ is given by $\frac{1}{a}q_\alpha$, apart from terms which vanish when $\Lambda$ goes to infinity.

4. D = 3 Bosons
Equation (1.1) is again considered, but now in $D = 3$ and with a magnetic flux line diametrically threading a sphere of radius $a$. We choose a gauge such that the vector potential in spherical coordinates reads
\[ e\mathbf{A}(\mathbf{r}) = \frac{\alpha}{r\sin\theta}\,\hat{e}_\varphi. \tag{4.1} \]
The spectrum and eigenfunctions for the associated quantum-mechanical problem have been written down by the authors in ref. [26]. The proof that one must impose regularity at the origin may be carried out as described in ref. [10]. After studying their associated degeneracies, we are able to write the complete zeta function as follows:
\[ \zeta_M(s; \alpha) = a^s\sum_{p=0}^{\infty}\sum_{m=-\infty}^{\infty}\zeta_{|m-\alpha|+p+1/2}(s). \tag{4.2} \]
Again, it is apparent that
\[ \zeta_M(s; \alpha + k) = \zeta_M(s; \alpha) \tag{4.3} \]
for any integer $k$, and
\[ \zeta_M(s; 1 - \alpha) = \zeta_M(s; \alpha). \tag{4.4} \]
It is now an immediate result that
\[ \zeta_M(s; \alpha) = \zeta_M(s; -\alpha). \tag{4.5} \]
From (4.3) and (4.4) we may also restrict our study to the domain $0 \le \alpha \le \frac{1}{2}$; this is a property which we proceed to take advantage of in the sequel. Under this restriction we may give an alternative representation for (4.2):
\[ \zeta_M(s; \alpha) = a^s\sum_{l=-\infty}^{\infty}|l|\,\zeta_{|l-\alpha+1/2|}(s). \tag{4.6} \]
In terms of
\[ \widetilde{\zeta}_M(s; \beta) \equiv a^s\sum_{l=0}^{\infty}l\,\zeta_{l+\beta}(s) \tag{4.7} \]
and of the $\bar{\zeta}_M$ function defined in (3.3), $\zeta_M(s; \alpha)$ reads
\[ \zeta_M(s; \alpha) = \bar{\zeta}_M\!\left(s; \frac{1}{2} + \alpha\right) + \widetilde{\zeta}_M\!\left(s; \frac{1}{2} + \alpha\right) + \widetilde{\zeta}_M\!\left(s; \frac{1}{2} - \alpha\right). \tag{4.8} \]
Next, let us consider the relation between them and the new zeta function
$$\hat{\zeta}_M(s;\beta) \equiv a^s \sum_{l=0}^{\infty} (l+\beta)\,\zeta_{l+\beta}(s). \tag{4.9}$$
This has the advantage that $\hat{\zeta}_M(s;\beta)$ can be immediately found from known material. The case without magnetic flux has already been studied for $D = 3$ in [29, 30]. Since $d(3,l) = 2l+1 = 2\nu(3,l)$, formula (1.10) is now rewritten as $\zeta_M(s;\alpha=0) = 2a^s \sum_{l=0}^{\infty} \nu(3,l)\,\zeta_{\nu(3,l)}(s)$. Therefore, the expression for $\hat{\zeta}_M(s;\beta)$ is the one for the $\zeta_M(s)$ in those works but for the simple replacement
$$\hat{\zeta}_M(s;\beta) = \frac{1}{2}\,\zeta_M(s;\alpha=0)\,\left\{\nu(l) = (l+1/2) \longrightarrow (l+\beta)\right\}. \tag{4.10}$$
Thus, for $N = 4$ subtractions we find
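$$\begin{aligned}
\hat{\zeta}_M(s;\beta) = \frac{1}{a}\Bigg\{ & -\frac{1}{256}\zeta_H(0,\beta) + \frac{1}{4\pi}\zeta_H(-3,\beta) - \frac{1}{4}\zeta_H(-2,\beta) - \frac{5}{24\pi}\zeta_H(-1,\beta) + \frac{35}{65536}\zeta_H(2,\beta) \\
& + \frac{1}{\pi}\Bigg[ \left(-\frac{229}{40320} - \frac{1}{2}\zeta_H(-3,\beta) + \frac{1}{8}\zeta_H(-1,\beta)\right)\left(\ln a - 1 + \frac{1}{s+1}\right) \\
& \qquad + \sum_{l=0}^{\infty} S_4(-1,l+\beta)\,(l+\beta)^2 + \frac{293}{24192} - \frac{229}{40320}\left(\ln 2 - \psi(\beta)\right) \\
& \qquad + \left(-1 - \frac{1}{2}\ln 2\right)\zeta_H(-3,\beta) - \frac{1}{2}\zeta_H'(-3,\beta) + \frac{1}{8}\ln 2\,\zeta_H(-1,\beta) + \frac{1}{8}\zeta_H'(-1,\beta) \Bigg] + O(s+1) \Bigg\}.
\end{aligned} \tag{4.11}$$
As a result of the previous definitions,
$$\tilde{\zeta}_M(s;\beta) = \hat{\zeta}_M(s;\beta) - \beta\,\bar{\zeta}_M(s;\beta). \tag{4.12}$$
Hence, the complete zeta function (4.8) is conveniently written
$$\zeta_M(s;\alpha) = \left(\frac{1}{2}-\alpha\right)\left[\bar{\zeta}_M\left(s;\tfrac{1}{2}+\alpha\right) - \bar{\zeta}_M\left(s;\tfrac{1}{2}-\alpha\right)\right] + \hat{\zeta}_M\left(s;\tfrac{1}{2}+\alpha\right) + \hat{\zeta}_M\left(s;\tfrac{1}{2}-\alpha\right), \tag{4.13}$$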
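and the necessary knowledge about the objects on the r.h.s. is available. We have already found the residue of $\bar{\zeta}_M(s;\beta)$ at $s = -1$, namely
$$\mathrm{Res}\left[\bar{\zeta}_M(s;\beta);\, s=-1\right] = \frac{1}{a}\,\bar{r}_\beta, \tag{4.14}$$
where $\bar{r}_\beta$ is the one in (3.11). Similarly, from (4.11),
$$\mathrm{Res}\left[\hat{\zeta}_M(s;\beta);\, s=-1\right] = \frac{1}{a\pi}\left[-\frac{229}{40320} - \frac{1}{2}\zeta_H(-3,\beta) + \frac{1}{8}\zeta_H(-1,\beta)\right]. \tag{4.15}$$
(At this point, one can check that
$$\mathrm{Res}\left[\hat{\zeta}_M(s;\beta=1/2);\, s=-1\right] = \frac{1}{a}\,\frac{1}{315\pi} = \frac{1}{2}\,\mathrm{Res}\left[\zeta_M(s;\alpha=0);\, s=-1\right],$$
as it should be.) Then, (4.13) yields
$$\mathrm{Res}\left[\zeta_M(s;\alpha);\, s=-1\right] = \frac{1}{a\pi}\left[\frac{2}{315} - \frac{1}{6}\,\alpha(1-\alpha^2)\left(1-\frac{\alpha}{2}\right)\right]. \tag{4.16}$$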
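The consistency check quoted in parentheses above, and the way (4.16) behaves when it is continued with the symmetries (4.3) and (4.4), can be reproduced with exact rational arithmetic. The following is only a sketch: the function names are ours, the factor $1/(a\pi)$ is left out of both residues, and it relies on the Bernoulli-polynomial form of $\zeta_H(-n,x)$ recalled in Sect. 3.

```python
from fractions import Fraction as F

def bernoulli_poly(n, x):
    """Bernoulli polynomials B_2, B_3, B_4 (the only ones needed here)."""
    if n == 2:
        return x**2 - x + F(1, 6)
    if n == 3:
        return x**3 - F(3, 2) * x**2 + F(1, 2) * x
    if n == 4:
        return x**4 - 2 * x**3 + x**2 - F(1, 30)
    raise ValueError("only n = 2, 3, 4 are implemented")

def hurwitz_neg(n, x):
    """zeta_H(-n, x) = -B_{n+1}(x) / (n + 1), for n = 1, 2, 3."""
    return -bernoulli_poly(n + 1, x) / (n + 1)

def res_hat_bracket(beta):
    """Square bracket of (4.15); the residue itself is this times 1/(a*pi)."""
    return -F(229, 40320) - hurwitz_neg(3, beta) / 2 + hurwitz_neg(1, beta) / 8

def res_complete_bracket(alpha):
    """Square bracket of (4.16), valid for 0 <= alpha <= 1/2."""
    return F(2, 315) - alpha * (1 - alpha**2) * (1 - alpha / 2) / 6

def reduce_flux(alpha):
    """Map a rational alpha into [0, 1/2] using (4.3) and (4.4)."""
    alpha = alpha % 1              # (4.3): period 1 in alpha
    return min(alpha, 1 - alpha)   # (4.4): alpha -> 1 - alpha

def res_extended_bracket(alpha):
    """(4.16) continued to any rational alpha through the two symmetries."""
    return res_complete_bracket(reduce_flux(F(alpha)))

# One-sided slopes at alpha = 0: they have opposite signs, so the continued
# residue has a kink there.
h = F(1, 1000)
right_slope = (res_extended_bracket(h) - res_extended_bracket(0)) / h
left_slope = (res_extended_bracket(0) - res_extended_bracket(-h)) / h
print(res_hat_bracket(F(1, 2)), float(right_slope), float(left_slope))
```

The first printed number is the bracket of (4.15) at $\beta = 1/2$, and the two slopes illustrate that the continued residue is not a smooth function of $\alpha$ at integer flux.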
We recall that this is valid for $0 \le \alpha \le \frac{1}{2}$, and one has to make use of (4.3) and (4.4) to extend it to any value. In particular, when we extend expression (4.16) to any real $\alpha$, we have a non-analytic function of $\alpha$. The theory cannot be renormalized.

Fig. 2. Description of $\zeta_M(s;\alpha)$, for the $D = 3$ internal modes only, at $s = -1$, as a function of the reduced flux $\alpha$: a) residue $r_\alpha$, b) finite part $p_\alpha$, c) comparison of $p_\alpha$ for internal and external modes, together with the Casimir energy $E_C$ ($aE_C = p_\alpha^{\rm int} + p_\alpha^{\rm ext}$), in $D = 3$, where $p_\alpha^{\rm int}$ is the same as in b)

The situation is different if we give up the idea of a purely confining enclosure and allow the presence of external modes, but satisfying the same b.c. as the internal ones. Paralleling the steps in ref. [30] for the $\alpha = 0$ case, we construct the $\alpha$-dependent complete zeta function for these external Dirichlet modes, say $\zeta_M^{\rm ext}(s;\alpha)$. With respect to the internal case, we have the following modifications:
$$L(\nu,x) \to \sqrt{\frac{2\nu}{\pi}}\,(1+x^2)^{1/4}\,e^{\nu\eta(x)}\,K_\nu(\nu x), \tag{4.17}$$
and the $\nu$-series undergoes a $\nu$-parity change, which brings about the transformations
$$\sigma_2 \to -\sigma_2, \qquad \rho \to -\rho, \qquad U_n(t) \to (-1)^n U_n(t), \qquad \mathcal{J}_1(s),\ J_n(s) \to -\mathcal{J}_1(s),\ (-1)^n J_n(s). \tag{4.18}$$
The external ζ-function ensuing from these transformations takes into account an overall subtraction from the Minkowski space (for instance, the residue of the rightmost pole is negative). From the construction of this external ζ-function, all the terms contributing to the $s = -1$ pole, including a piece proportional to $J_3(s)$, reverse their sign with respect to their internal counterparts and
$$\mathrm{Res}\left[\zeta_M^{\rm ext}(s;\alpha);\, s=-1\right] = -\mathrm{Res}\left[\zeta_M(s;\alpha);\, s=-1\right], \tag{4.19}$$
as a result of which the net zeta function $\zeta_M(s;\alpha) + \zeta_M^{\rm ext}(s;\alpha)$ is finite at $s = -1$ regardless of the $\alpha$ value. The same cancellation applies for the residues at $s = 1$ and $s = 3$. Such a cancellation is typical of odd $D$'s, and does not happen in $D = 2$ because the residue then receives a contribution from $J_2(s)$, which maintains its sign. To be brief, we only have to worry about the residue at $s = 2$. It is quite immediate that it is $\alpha$-independent. The conclusion is that for a 3-D Klein-Gordon field defined in both the exterior and the interior region, the whole dependence of the vacuum energy on the $\alpha$ parameter is contained in the finite part of $\zeta_M(s;\alpha) + \zeta_M^{\rm ext}(s;\alpha)$.

The $\alpha$-dependences of the residue $r_\alpha$ and of the finite part $p_\alpha$ of $\zeta_M(s;\alpha)$ at $s = -1$ are depicted in Figs. 2a and 2b for the internal modes only. The residue is simply formula (4.16), while the finite parts have been obtained through numerical evaluation of (4.13) by the methods described in refs. [29, 30]. Figure 2c shows the inclusion of the external modes, and the net dependence on $\alpha$ of the vacuum energy. Though we do not pay too much attention to the absolute figures, but only to the dependence on $\alpha$ (in other words, to the derivative of the finite part with respect to $\alpha$), it is worthwhile noting that the value at $\alpha = 0$ furnishes us with an opportunity to verify our results. We have obtained that the value of the graph at $\alpha = 0$ is $\frac{1}{a}\,0.005634\ldots = 2\cdot\frac{1}{a}\,0.002817\ldots$, i.e.
twice the figure found in [23] for an ordinary free field, as had to be expected.

5. D = 2 Fermions

For $D = 2$ massless Dirac particles under the influence of the same magnetic field as in Sects. 1 and 3, the Dirac equation reads
$$(i\slashed{\partial} + e\slashed{A})\Psi = 0, \tag{5.1}$$
with $\slashed{v} \equiv \gamma^\mu v_\mu$ ($\gamma^0 = \sigma^3$, $\gamma^1 = i\sigma^2$, $\gamma^2 = -i\sigma^1$), $A_0(r) = 0$ and $\mathbf{A}(r)$ as in (1.2) (for previous works where ζ-function techniques are studied in fermionic systems see [35–37]). The boundary conditions that we choose on the bag, given by the circle $r = a$, are those of the M.I.T. bag model, $-i\slashed{n}\Psi = \Psi$, where $n$ stands for the normal vector. It has been remarked in refs. [3, 2] that in this problem it would be too restrictive to impose regularity at the origin for the modes. If one imposes regularity the result is that the domain of the operator is not dense and, consequently, one loses self-adjointness. Making $\Psi(x,t) = \psi(x)e^{-iEt}$, let us denote the space-dependent part of a particular mode by
$$\psi(x(r,\varphi)) = \begin{pmatrix} \chi^1(r) \\ \chi^2(r)\,e^{i\varphi} \end{pmatrix} e^{im\varphi}.$$
It is easily seen that if one demands regularity for the modes characterized by $m = -[\alpha] - 1$, then one is left only with the trivial solution $\Psi = 0$. As we advanced in the introduction, we have followed the analysis which was set forth in [10]; the outcome is that for the particular value of $m = -[\alpha] - 1$, one should choose the solution with regular $\chi^1$ for positive $\alpha$, and the one with regular $\chi^2$ when $\alpha$ is negative. In any case this amounts to picking an element among a family of possible self-adjoint extensions for the Hamiltonian under the b.c. in question. Then, the whole set of Hamiltonian eigenfrequencies consists of:
1. the $k$'s satisfying $f_\nu(ka) \equiv J_\nu^2(ka) - J_{\nu+1}^2(ka) = 0$ with $\nu = \{\alpha\} - 1$ if $\alpha > 0$, and with $\nu = -\{\alpha\}$ otherwise.
2. a) the $k$'s satisfying $f_{l+\{\alpha\}}(ka) = 0$, for $l = 0, 1, 2, \ldots$,
   b) the $k$'s satisfying $f_{l+1-\{\alpha\}}(ka) = 0$, for $l = 0, 1, 2, \ldots$,
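where $\{\alpha\}$ denotes the fractional part of $\alpha$. It is then adequate to define
$$\zeta^f_\nu(s) = \sum_{n=1}^{\infty} \lambda_{\nu,n}^{-s}, \quad \text{for } \mathrm{Re}\, s > 1, \tag{5.2}$$
where $\lambda_{\nu,n}$ means the $n$th nonvanishing zero of $f_\nu(\lambda)$. Now we may write down the fermionic zeta function as
$$\zeta^f_M(s) \equiv \zeta^f_{1M}(s) + \zeta^f_{2M}(s), \tag{5.3}$$
where
$$\begin{aligned}
\zeta^f_{1M}(s) &\equiv \theta(\alpha)\,\zeta^f_{\{\alpha\}-1}(s) + \theta(-\alpha)\,\zeta^f_{-\{\alpha\}}(s), \\
\zeta^f_{2M}(s) &\equiv \sum_{n=0}^{\infty} \zeta^f_{\{\alpha\}+n}(s) + \sum_{n=0}^{\infty} \zeta^f_{1-\{\alpha\}+n}(s).
\end{aligned} \tag{5.4}$$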
It follows that the Casimir energy will have to fulfill the equalities
$$\begin{aligned}
E_C(-\alpha) &= E_C(\alpha), \\
E_C(\alpha+k) &= E_C(\alpha), \quad \text{for } \alpha > 0,\ k = 1, 2, 3, \ldots, \\
E_C(\alpha-k) &= E_C(\alpha), \quad \text{for } \alpha < 0,\ k = 1, 2, 3, \ldots
\end{aligned} \tag{5.5}$$
(compare with relations (3.2)). Since we have now this sort of periodicity when shifting $\alpha$ by integer values, it suffices for our study to take $0 \le \alpha < 1$. With the help of the auxiliary object
$$\bar{\zeta}^f_M(s;\beta) \equiv a^s \sum_{l=0}^{\infty} \zeta^f_{l+\beta}(s) \tag{5.6}$$
(analogous to (3.3) for bosons) we are able to express the complete zeta function as
$$\begin{aligned}
\zeta^f_M(s;\alpha) &= a^s \zeta^f_{\alpha-1}(s) + \bar{\zeta}^f_M(s;\alpha) + \bar{\zeta}^f_M(s;1-\alpha) \\
&= a^s\left[\zeta^f_{\alpha-1}(s) + \zeta^f_\alpha(s)\right] + \bar{\zeta}^f_M(s;1+\alpha) + \bar{\zeta}^f_M(s;1-\alpha).
\end{aligned} \tag{5.7}$$
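(For the numerical methods to be applied below, the second form proves to be more suitable around $\alpha = 0$.) Starting from the partial wave zeta function (5.2), we make use of the technique described in [26, 27] and find an analytic continuation to the domain $-1 < \mathrm{Re}\, s < 0$ given by the integral representation
$$\zeta^f_\nu(s) = \frac{s}{\pi}\sin\frac{\pi s}{2}\int_0^\infty dx\, x^{-s-1}\ln\left[\pi x\, e^{-2x}\left(I_\nu^2(x) + I_{\nu+1}^2(x)\right)\right], \quad \text{for } -1 < \mathrm{Re}\, s < 0. \tag{5.8}$$
For $\nu \neq 0$, and by a subtraction method similar to the one applied in the bosonic case, we obtain the more convenient form
$$\begin{aligned}
\zeta^f_\nu(s) = \nu^{-s}\,\frac{s}{\pi}\sin\frac{\pi s}{2}\Bigg\{ & \left[\frac{1}{s}B\left(\frac{s+1}{2},-\frac{s}{2}\right) + 2^s B\left(\frac{s+1}{2},-s\right) + 2^s B\left(\frac{s+3}{2},-s\right)\right]\nu + S^f_N(s,\nu) \\
& + \frac{1}{2s}B\left(\frac{s+1}{2},-\frac{s}{2}\right) - \frac{1}{8}B\left(\frac{s+1}{2},-\frac{s}{2}\right)\frac{1}{\nu} + \mathcal{J}^f_1(s)\,\frac{1}{\nu} + \sum_{n=2}^{N} J^f_n(s)\,\frac{1}{\nu^n} \Bigg\},
\end{aligned} \tag{5.9}$$
with
$$\begin{aligned}
S^f_N(s,\nu) &\equiv \int_0^\infty dx\, x^{-s-1}\left\{\ln L^f(\nu,x) - \sum_{n=0}^{N} \frac{U^f_n(t(x))}{\nu^n}\right\}, \\
L^f(\nu,x) &= \frac{1}{2}\left[\sqrt{2\pi\nu}\,(1+x^2)^{1/4}\,e^{-\nu\eta(x)}\right]^2\left[I_\nu^2(\nu x) + I_{\nu+1}^2(\nu x)\right],
\end{aligned} \tag{5.10}$$
where
$$\begin{aligned}
U^f_0(t) &= -\ln(1+t), \\
U^f_1(t) &= -\frac{t}{4} + \frac{t^3}{12}, \\
U^f_2(t) &= \frac{t^3}{8} + \frac{t^4}{8} - \frac{t^5}{8} - \frac{t^6}{8}, \\
U^f_3(t) &= \frac{5t^3}{192} + \frac{t^4}{8} + \frac{9t^5}{320} - \frac{t^6}{2} - \frac{23t^7}{64} + \frac{3t^8}{8} + \frac{179t^9}{576}, \\
U^f_4(t) &= \frac{t^4}{32} + \frac{17t^5}{128} - \frac{t^6}{8} - \frac{165t^7}{128} - \frac{37t^8}{64} + \frac{327t^9}{128} + \frac{57t^{10}}{32} - \frac{179t^{11}}{128} - \frac{71t^{12}}{64}, \\
&\ \ \vdots
\end{aligned} \tag{5.11}$$
and
$$\begin{aligned}
\mathcal{J}^f_1(s) &= \frac{1}{24}\,B\left(\frac{s+3}{2},-\frac{s}{2}\right), \\
J^f_n(s) &= \int_0^\infty dx\, x^{-s-1}\,U^f_n(t(x)).
\end{aligned} \tag{5.12}$$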
In order to include $\nu = 0$ or close values, we also find the alternative representation
$$\begin{aligned}
\zeta^f_\nu(s) &= -\frac{1}{\pi}\left(\nu+\frac{1}{2}\right)^2\frac{1}{s+1} - \frac{1}{\pi}\left(\nu+\frac{1}{2}\right)^2(-1+\ln 2) + \frac{1}{\pi}R^f_\nu(-1) + O(s+1), \\
R^f_\nu(s) &= \int_0^\infty dx\, x^{-s-1}\left\{\ln\left[\pi x\, e^{-2x}\left(I_\nu^2(x) + I_{\nu+1}^2(x)\right)\right] + \left(\nu+\frac{1}{2}\right)^2\frac{1}{\sqrt{x^2+1}}\right\},
\end{aligned} \tag{5.13}$$
which is like (2.11), for bosons. Note the vanishing of the $s = -1$ pole for $\nu = -1/2$, which happens because $\zeta^f_{\nu=-1/2}(s) = (\pi/2)^{-s}(2^s-1)\zeta_R(s)$ is finite at $s = -1$.

Table 2. Finite part $q^f_\alpha$ of the fermionic complete zeta function for varying reduced flux $\alpha$

  α      q^f_α
  0      −0.00583
  0.1    +0.05603
  0.2    +0.07735
  0.3    +0.07753
  0.4    +0.06656
  0.5    +0.05045
  0.6    +0.03314
  0.7    +0.01733
  0.8    +0.00484
  0.9    −0.00312
  1      −0.00583
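The remark after (5.13) on $\nu = -1/2$ can be checked directly. As a sketch under our own assumptions (not a derivation taken from the text): with $J_{-1/2}(x) = \sqrt{2/(\pi x)}\cos x$ and $J_{1/2}(x) = \sqrt{2/(\pi x)}\sin x$ one has $f_{-1/2}(x) = \frac{2}{\pi x}\cos 2x$, whose positive zeros are $(2n-1)\pi/4$. Summing these zeros as in (5.2) should then reproduce the closed form quoted above. The function names are illustrative.

```python
import math

def zeta_f_half_closed(s, zeta_riemann_value):
    """Closed form (pi/2)^(-s) * (2^s - 1) * zeta_R(s); zeta_R(s) is supplied by the caller."""
    return (math.pi / 2) ** (-s) * (2 ** s - 1) * zeta_riemann_value

def zeta_f_half_partial_sum(s, n_terms):
    """Partial sum of (5.2) over the assumed zeros (2n - 1) * pi / 4 of f_{-1/2}."""
    return sum(((2 * n - 1) * math.pi / 4) ** (-s) for n in range(1, n_terms + 1))

# At s = 2 the series converges and zeta_R(2) = pi^2 / 6, so the closed form is 2.
closed_at_2 = zeta_f_half_closed(2, math.pi ** 2 / 6)
summed_at_2 = zeta_f_half_partial_sum(2, 20000)

# At s = -1 the closed form is finite: zeta_R(-1) = -1/12 gives pi / 48.
value_at_minus_1 = zeta_f_half_closed(-1, -1.0 / 12.0)
print(closed_at_2, summed_at_2, value_at_minus_1)
```

The finite value at $s = -1$ is consistent with the pole coefficient $-(\nu+\frac{1}{2})^2/\pi$ of (5.13) vanishing at $\nu = -1/2$.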
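The analogue of expression (3.6) for the fermionic case is
$$\begin{aligned}
\bar{\zeta}^f_M(s;\beta) = \frac{1}{\pi a}\Bigg\{ & \frac{1}{2}\zeta_H(-2,\beta) + \frac{1}{12}\zeta_H(0,\beta) - \left(\frac{97}{20160} + \frac{\pi}{256}\right)\zeta_H(2,\beta) + \left(\frac{13}{20160} + \frac{35\pi}{32768}\right)\zeta_H(3,\beta) \\
& + \sum_{l=0}^{\infty} S^f_4(-1,l+\beta)\,(l+\beta) \\
& + \left[-\frac{1}{12} + \frac{\beta}{4} + \frac{\pi}{128} - \zeta_H(-2,\beta) - \zeta_H(-1,\beta)\right]\left(\ln a - 1 + \frac{1}{s+1}\right) \\
& - \frac{1}{24} - \frac{\ln 2}{12} + \beta\,\frac{\ln 2}{4} - \frac{\psi(\beta)}{24} - \frac{\pi\psi(\beta)}{128} - (2+\ln 2)\,\zeta_H(-2,\beta) - (1+\ln 2)\,\zeta_H(-1,\beta) \\
& - \zeta_H'(-2,\beta) - \zeta_H'(-1,\beta) - \frac{1}{4}\zeta_H'(0,\beta) + O(s+1) \Bigg\}.
\end{aligned} \tag{5.14}$$

Fig. 3. a) Finite part of the fermionic zero-point energy $E_C = -\frac{1}{2a}q^f_\alpha$ in $D = 2$ for $a = 1$ as a function of the reduced flux $\alpha$. b) Same as in a) after enlarging the $\alpha$-domain considered: the energy minima appear to be at $\alpha = \pm(m + 0.25)$, $m = 0, 1, 2, \ldots$

Actually, with this plus (5.7) and (5.13) we realize that
$$\zeta^f_M(s;\alpha) = \frac{1}{a}\left[\frac{1}{64}\left(\frac{1}{s+1} + \ln a\right) + q^f_\alpha + O(s+1)\right], \tag{5.15}$$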
i.e. after adding up all the contributions, the residue of the resulting pole at $s = -1$ is independent of $\alpha$, like the residues at $s = 1$ and $s = 2$, and, in consequence, the dependence of $E_{\rm vac}(\Lambda)$ on $\alpha$ is given by $-\frac{1}{2a}q^f_\alpha$ (remember the minus sign associated to Dirac particles, see for instance ref. [19]), where $q^f_\alpha$ is the remaining finite part of the ζ-function once the pole has been removed and does, predictably, depend on the magnetic field. Numerical values are given in Table 2.

The fact that $q^f_1 = q^f_0$ comes from the equality $\zeta^f_M(s;0) = \zeta^f_M(s;1)$, which is in the end a consequence of the modified Bessel function identity $I_{-n} = I_n$, $n = 1, 2, 3, \ldots$. The values of $q^f_\alpha$ are shown in Fig. 3a. For the sake of clarity we also give a plot for an extended domain of $\alpha$ which illustrates the particular periodicities of the fermionic case (5.5), see Fig. 3b.
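Two statements above lend themselves to a quick numerical cross-check, sketched here with illustrative names of our own. First, the pole coefficients at $s = -1$ read off (5.13) and (5.14) are added according to the second form of (5.7); each is multiplied by $\pi a$ and stored as a pair (rational part, coefficient of $\pi$), so that the $\alpha$-independent residue $1/(64a)$ of (5.15) corresponds to the pair $(0, 1/64)$. Second, a parabola through the three Table 2 entries around the largest $q^f_\alpha$ estimates where $E_C = -q^f_\alpha/(2a)$ is lowest; the three-point fit is our choice of interpolation, not a method used in the text.

```python
from fractions import Fraction as F

def hurwitz_neg(n, x):
    """zeta_H(-n, x) = -B_{n+1}(x) / (n + 1), for n = 1, 2."""
    if n == 1:
        return -(x**2 - x + F(1, 6)) / 2
    if n == 2:
        return -(x**3 - F(3, 2) * x**2 + x / 2) / 3
    raise ValueError("only n = 1, 2 are implemented")

def aux_pole(beta):
    """pi*a times the s = -1 residue of the auxiliary function (5.6), from (5.14)."""
    rational = -hurwitz_neg(2, beta) - hurwitz_neg(1, beta) - F(1, 12) + beta / 4
    return rational, F(1, 128)

def partial_wave_pole(nu):
    """pi times the s = -1 residue of a single partial wave, from (5.13)."""
    return -(nu + F(1, 2)) ** 2

def complete_pole(alpha):
    """pi*a times the s = -1 residue of the complete function, second form of (5.7)."""
    r_plus, p_plus = aux_pole(1 + alpha)
    r_minus, p_minus = aux_pole(1 - alpha)
    rational = partial_wave_pole(alpha - 1) + partial_wave_pole(alpha) + r_plus + r_minus
    return rational, p_plus + p_minus

# Table 2: q^f_alpha for alpha = 0, 0.1, ..., 1.
q_f = [-0.00583, 0.05603, 0.07735, 0.07753, 0.06656, 0.05045,
       0.03314, 0.01733, 0.00484, -0.00312, -0.00583]
step = 0.1

def parabola_vertex(x_mid, h, y_left, y_mid, y_right):
    """Abscissa of the extremum of the parabola through three equally spaced points."""
    return x_mid - h * (y_right - y_left) / (2 * (y_right - 2 * y_mid + y_left))

k_max = max(range(len(q_f)), key=q_f.__getitem__)
alpha_min_energy = parabola_vertex(k_max * step, step,
                                   q_f[k_max - 1], q_f[k_max], q_f[k_max + 1])
print(complete_pole(F(1, 4)), round(alpha_min_energy, 3))
```

With the tabulated values the fitted extremum lands close to $\alpha = 0.25$, in line with the caption of Fig. 3.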
6. Ending Comments
A scalar Klein-Gordon field subject to Dirichlet boundary conditions and under the influence of an external magnetic field producing a single flux line has been studied in two- and three-dimensional spaces. We have obtained a nontrivial effect in the dependence of the vacuum energy on the flux which would be invisible (within the order of our approximation) without the presence of a finite-sized bag. Considering only the internal field modes in the $D = 2$ case, we arrive at the conclusion that the vacuum energy undergoes a finite variation when the magnetic flux is changed. It is interesting to note that around $\alpha = 0$, the vacuum energy decreases when the flux grows. In this sense, the system would seem to energetically favour the presence of such fluxons.

In $D = 3$ the flux line is diametrically threading a sphere and we have quite a different resulting picture. Taking just the internal modes, we see that it is not true that the divergent pieces are independent of the magnetic flux. In fact we have seen that the coefficient giving the logarithmic divergence in $\Lambda$ (or if you prefer, the residue of the ζ-function at $s = -1$) depends non-analytically on $\alpha$. This is also the case for the coefficient giving a $\Lambda^2$ divergence, associated to the residue at $s = 1$. The inclusion of external modes dramatically modifies the situation. Their associated divergences exactly cancel those from the internal part, except the one arising from the pole at $s = 2$, but this piece does not contain any dependence on $\alpha$. At $\alpha = 0$, our result agrees with the one found in ref. [23]. Around this point the system seems to oppose the growth of the kind of fluxons we have pictured, in the sense that some amount of energy must be provided.

Fermions in $D = 2$ have also been considered. While the bosonic energy was periodic in $\alpha$ with period 1, the fermionic one is, not too surprisingly, periodic with the same period only on each real semiaxis separately as shown in Fig. 3b, where $q^f_\alpha$ is represented. The divergences of $E_{\rm reg}(\Lambda)$ are independent of $\alpha$, with the same transparency in the physical interpretation of the result as in the $D = 2$ bosonic case. For values in a neighbourhood of $\alpha = 0$, the energy is seen to decrease as the absolute value of the flux grows. So we have that the fermionic case in $D = 2$ shares this property with the scalar one.

References [25] include a study of the three-dimensional bag involving gauge (bosonic) and fermionic massless fields without external flux. By way of rough comparison with some of the figures obtained in these works we may evaluate the ratio between the maximum variation of the vacuum energy for the Dirac field and the same quantity for a complex Klein-Gordon field. Taking into account that this variation is given by $0.0397/a$ for the fermionic case, and $0.0975/a$ for the scalar one, we have that the ratio is 0.41. In other words, the energy of a Klein-Gordon field is in this sense more sensitive to flux changes.

To finish this work we shall briefly comment on how the analysis that we have performed in this article with models without a mass term carries over to cases where this term is present. Of course, had we incorporated a mass term, the result for the finite contributions to the zeta functions would be different and would call for some extra, though feasible, effort (see [11]). The divergent pieces would also change, but in a way which is quite trivial. In general, the residue of a pole at a point $s$ would be transformed into itself plus a linear combination of the residues of the poles at $s + 2k$ for positive integer $k$'s, with coefficients given by even powers of the mass. In other words, if one finds that in the massless case divergences are $\alpha$-independent, this property carries over to the massive case.
Acknowledgement. I. Brevik, E. N. Bukina, E. Elizalde, A. Yu. Kamenshchik, A. A. Kvitsinsky and K. A. Milton are thanked for comments and discussions. S. L. gratefully acknowledges an FI grant from Generalitat de Catalunya. A. R. thanks Commissionat per a Universitats i Recerca (Generalitat de Catalunya) for financial support.
References
1. Aharonov, Y. and Bohm, D.: Phys. Rev. 115, 485 (1959); 123, 1511 (1961); 125, 2192 (1962); 130, 1625 (1963); Morandi, G. and Menossi, E.: Eur. J. Phys. 5, 49 (1984)
2. de Sousa Gerbert, Ph. and Jackiw, R.: Commun. Math. Phys. 124, 229 (1989)
3. de Sousa Gerbert, Ph.: Phys. Rev. D 40, 1346 (1989)
4. Kabat, D.: Nucl. Phys. B 453, 281 (1995)
5. Duru, L.H.: Foundations of Physics 23, 809–818 (1993)
6. Berry, M.V.: J. Phys. A 19, 2281 (1986); J. Phys. A 20, 2389 (1987)
7. Berry, M.V. and Robnik, M.: J. Phys. A 19, 649 (1986)
8. Itzykson, C., Moussa, P. and Luck, J.M.: J. Phys. A 19, L111 (1986); Ziff, R.M.: J. Phys. A 19, 3923 (1986)
9. Steiner, F.: Fortschr. Phys. 35, 87 (1987)
10. Alford, M.G., March-Russell, J. and Wilczek, F.: Nucl. Phys. B 328, 140 (1989)
11. Bordag, M., Elizalde, E., Kirsten, K. and Leseduarte, S.: Phys. Rev. D 56, 4904 (1997)
12. Casimir, H.B.G.: Proc. Kon. Ned. Akad. Wetenschap 51, 793 (1948)
13. Ambjørn, J. and Wolfram, S.: Ann. Phys. 147, 1 (1983)
14. Salam, A. and Strathdee, J.: Nucl. Phys. B 90, 203 (1975); Brown, L.S. and MacLay, G.J.: Phys. Rev. 184, 1272 (1969); Dowker, J.S. and Critchley, R.: Phys. Rev. D 13, 3224 (1976); Hawking, S.W.: Commun. Math. Phys. 55, 133 (1978)
15. McKeon, D.G.C. and Sherry, T.N.: Phys. Rev. Lett. 59, 532 (1987); Phys. Rev. D 35, 3854 (1987); Rebhan, A.: Phys. Rev. D 39, 3101 (1989)
16. Speer, E.R.: J. Math. Phys. 15, 1 (1974)
17. De Francia, M.: Phys. Rev. D 50, 2908 (1994); De Francia, M., Falomir, H., Santangelo, E.M.: Phys. Rev. D 45, 2129 (1992)
18. Plunien, G., Müller, B. and Greiner, W.: Phys. Rep. 134, 87 (1986)
19. Greiner, W., Müller, B. and Rafelski, J.: Quantum electrodynamics of strong fields. Berlin–Heidelberg–New York: Springer, 1985
20. Blau, S.K., Visser, M. and Wipf, A.: Nucl. Phys. B 310, 163 (1988)
21. Cognola, G., Vanzo, L. and Zerbini, S.: J. Math. Phys. 33, 222 (1992)
22. Beneventano, C.G. and Santangelo, E.M.: Int. J. Mod. Phys. A 11, 2871 (1996)
23. Bender, C.M. and Milton, K.A.: Phys. Rev. D 50, 6547 (1994)
24. Bordag, M. and Kirsten, K.: Heat-kernel coefficients of the Laplace operator on the 3-dimensional ball. Preprint UB-ECM-PF 95/1, hep-th/9501064; Bordag, M., Elizalde, E. and Kirsten, K.: J. Math. Phys. 37, 895 (1996)
25. Milton, K.A.: Phys. Rev. D 22, 1441 (1980); Phys. Rev. D 27, 439 (1983); Ann. Phys. (N.Y.) 150, 432 (1983)
26. Elizalde, E., Leseduarte, S. and Romeo, A.: J. Phys. A 26, 2409 (1993)
27. Leseduarte, S. and Romeo, A.: J. Phys. A 27, 2483 (1994)
28. Watson, G.N.: A treatise on the Theory of Bessel Functions, 2nd edition. Cambridge: Cambridge University Press, 1944; Kishore, N.: Proc. Amer. Math. Soc. 14, 527 (1963); Obi, E.C.: J. Math. Anal. Appl. 52, 648 (1975); Hawkins, J.: On a zeta function associated with Bessel's equation. PhD Thesis, University of Illinois (1983); Stolarsky, K.B.: Mathematika 32, 96 (1985)
29. Romeo, A.: Phys. Rev. D 52, 7308 (1995)
30. Leseduarte, S. and Romeo, A.: Ann. Phys. 250, 448 (1996)
31. Barvinsky, A.O., Kamenshchik, A.Yu. and Karmazin, I.P.: Ann. Phys. 219, 201 (1992)
32. Kvitsinsky, A.A.: J. Phys. A 28, 1753 (1995); J. Math. Anal. Appl. 196, 947 (1995)
33. Vilenkin, N.Ja.: Fonctions spéciales et théorie de la représentation des groupes. Paris: Dunod, 1969
34. Bateman Manuscript Project, A. Erdélyi et al.: Higher Transcendental Functions. New York: McGraw-Hill, 1953
35. D'Eath, P.D. and Esposito, G.V.M.: Phys. Rev. D 43, 3234 (1991)
36. Kirsten, K. and Cognola, G.: Class. Quantum Grav. 13, 633 (1996)
37. Apps, J.S., Bordag, M., Dowker, J.S. and Kirsten, K.: Class. Quantum Grav. 13, 2911 (1996)
Communicated by G. Felder
Commun. Math. Phys. 193, 337 – 371 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Homotopy Classes for Stable Connections between Hamiltonian Saddle-Focus Equilibria

W. D. Kalies⋆, J. Kwapisz⋆,⋆⋆, R. C. A. M. VanderVorst⋆

Center for Dynamical Systems and Nonlinear Studies, Georgia Institute of Technology, Atlanta, GA 30332, USA

Received: 10 December 1996 / Accepted: 5 September 1997
⋆ Partially supported by grants ARO DAAH-0493G0199 and NIST G-06-605.
⋆⋆ Partially supported by Polish KBN Grant 2 P301 01307 Iteracje i Fraktale, II.

Abstract: For a class of Hamiltonian systems in $\mathbb{R}^4$ the set of homoclinic and heteroclinic orbits which connect saddle-focus equilibria is studied using a variational approach. The oscillatory properties of a saddle-focus equilibrium and the variational nature of the problem give rise to connections in many homotopy classes of the configuration plane punctured at the saddle-foci. This variational approach does not require any assumptions on the intersections of stable and unstable manifolds, such as transversality. Moreover, these connections are shown to be local minimizers of an associated action functional. This result has applications to spatial pattern formation in a class of fourth-order bistable evolution equations.

1. Introduction

Hamiltonian systems obtained from second-order Lagrangian densities of the form $L = L(u, u', u'')$ are used in many physical models including problems in nonlinear optics, nonlinear elasticity, and mechanics. The Euler-Lagrange equations associated with such densities are fourth-order differential equations, and their solutions are critical points of the Lagrangian functional $J[u] = \int_{\mathbb{R}} L(u, u', u'')\,dt$. In these systems stationary solutions $\hat{u}$, which satisfy $\partial L/\partial u\,(\hat{u}, 0, 0) = 0$, can have four hyperbolic complex eigenvalues, two with positive real part and two with negative real part. This type of equilibrium solution is called a saddle-focus, and it is well-known that systems with homoclinic or heteroclinic connections between such points can exhibit complicated (chaotic) behavior, cf. [12, 20, 26, 33]. The purpose of this paper is to investigate the structure of the homoclinic and heteroclinic orbits connecting saddle-focus equilibria in a class of fourth-order equations. The specific Lagrangians that we study are given by
$$J[u] = \int_{\mathbb{R}} \left[\frac{\gamma}{2}|u''|^2 + \frac{\beta}{2}|u'|^2 + F(u)\right]dt \quad \text{with } \gamma, \beta > 0. \tag{1.1}$$
The nonlinearity $F$ is assumed to be a double-well potential with two nondegenerate global minima at $\pm 1$ of which the prototypical example is $F(u) = (u^2-1)^2/4$. The number of global minima is not crucial, and our analysis extends to potentials with arbitrarily many wells. The Euler-Lagrange equation for (1.1) is
$$\gamma u'''' - \beta u'' + F'(u) = 0 \quad \text{with } \gamma, \beta > 0. \tag{1.2}$$
This equation has been proposed as a generalization of the second-order stationary Allen-Cahn or Fisher-Kolmogorov equation ($\gamma = 0$) and arises in the study of phase transitions in the neighborhood of Lifshitz points [18, 19, 24, 50], see Sect. 8. This extended Fisher-Kolmogorov equation is valid near parameter values where the Ginzburg-Landau formulation becomes degenerate [40]. When $\gamma > \beta^2/(4F''(\pm 1))$, the equilibrium points $u = \pm 1$ are saddle-foci, and we are interested in the heteroclinic and homoclinic orbits connecting these two points in the four-dimensional flow generated by (1.2). Note that this ODE generates only a local flow on $\mathbb{R}^4$, and there are solutions which blow up in finite time.

Before describing the history of this problem, we would like to state three characteristics of our results which differ from much of the previous work. First, the assumptions on the nonlinearity are very mild. In particular we do not require symmetry or analyticity of $F$, nor do we place any transversality or nondegeneracy conditions on the intersections of the stable and unstable manifolds of $\pm 1$. Second, we produce multitransition solutions of (1.2) with any number of transitions, all of which are local minimizers of the action functional (1.1). Finally, these multitransition solutions do not all lie in some small neighborhood of the principal loop in the phase space. In particular the distance between transitions is not required to be large. Our results imply that the dynamics of equations of the form (1.2) with a double-well potential and saddle-focus equilibria are always chaotic and hence never completely integrable.

The methods used in this paper also seem to be applicable to mechanical systems with two degrees of freedom. As for fourth-order problems, this would require a nonnegative Lagrangian density and saddle-foci which are global minima. These examples may be the subject of future work.
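The saddle-focus condition quoted above follows from linearizing (1.2) at $u = \pm 1$: the characteristic equation $\gamma\lambda^4 - \beta\lambda^2 + F''(\pm 1) = 0$ is a quadratic in $\lambda^2$, and its roots are non-real exactly when $\beta^2 < 4\gamma F''(\pm 1)$. A minimal numerical sketch, with function names of our own choosing:

```python
import cmath

def linearization_eigenvalues(gamma, beta, f_second):
    """Roots of gamma*lam^4 - beta*lam^2 + f_second = 0, via the quadratic in mu = lam^2."""
    disc = cmath.sqrt(beta ** 2 - 4 * gamma * f_second)
    mus = [(beta + disc) / (2 * gamma), (beta - disc) / (2 * gamma)]
    return [sign * cmath.sqrt(mu) for mu in mus for sign in (1, -1)]

def is_saddle_focus(gamma, beta, f_second, tol=1e-12):
    """True when all four eigenvalues have nonzero real and imaginary parts."""
    eigs = linearization_eigenvalues(gamma, beta, f_second)
    return all(abs(lam.real) > tol and abs(lam.imag) > tol for lam in eigs)

# Prototype F(u) = (u^2 - 1)^2 / 4 has F''(+-1) = 2, so the threshold is gamma = beta^2 / 8.
eigs_focus = linearization_eigenvalues(1.0, 1.0, 2.0)
print(is_saddle_focus(1.0, 1.0, 2.0), is_saddle_focus(0.1, 1.0, 2.0))
```

For the prototype potential the two calls fall on either side of $\gamma = \beta^2/8$: above it the eigenvalues form a complex quadruple, below it they are real.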
Finding multitransition and multibump solutions for Hamiltonian systems has become an active field of study in recent years. In this context we mention the work of Buffoni, Coti-Zelati, Ekeland, Rabinowitz, and Séré [4, 10, 16, 17, 21, 38, 39, 43, 44]. The initial work is due to Séré [43] who finds infinitely many two-bump homoclinic orbits for a general class of nonautonomous, periodically-forced Hamiltonian systems with a subsequent generalization to multibump homoclinics [44]. Coti-Zelati and Rabinowitz [17, 39] consider the problem of finding multibump homoclinic connections for mechanical systems with Lagrangians of the form $L(t, q, q') = \frac{1}{2}|q'|^2 - V(t, q)$, where $q : \mathbb{R} \to \mathbb{R}^n$ and $V$ is a periodically-forced potential. Nondegeneracy conditions are imposed on the primary homoclinic connections in order to construct multibump solutions. These variational results are analogous to those obtained from the study of Poincaré maps of time-periodic systems via Melnikov theory which detects transverse intersections, cf. [23, 31].

Using such techniques from dynamical systems theory, Devaney [20] has shown that autonomous Hamiltonian systems in $\mathbb{R}^4$ display horseshoe-like dynamics in a neighborhood of a transverse homoclinic or heteroclinic loop connecting saddle-foci. This principal loop is the four-dimensional equivalent of a Shil'nikov orbit [45, 46]. In particular this implies that a countable family of multibump or multitransition solutions exists near the primary loop in the phase space. In general the existence of a primary loop composed of transverse intersections of stable and unstable manifolds is difficult to verify in fourth-order problems. In this paper we prove that for a significant class of autonomous Hamiltonian systems no transversality condition is required to obtain multitransition solutions of (1.2). Multibump or multiple-pulse solutions and their stability have been investigated using dynamical systems techniques in a wider variety of contexts with varying hypotheses on the primary solutions. For references to this literature, the reader is referred to Alexander, Gardner, and Jones [3], Lin [30], Nishiura [32], and Sandstede [41, 42].

Multibump homoclinic connections near a Shil'nikov orbit in conservative, autonomous systems in $\mathbb{R}^4$ have been studied variationally by Buffoni and Séré [12]. Their results require an intersection condition on the stable and unstable manifolds which is weaker than transversality and is simpler to check for certain examples. For the system (1.2) this condition has been verified for the specific potential $F = (u^2-1)^2/4$ and is used in [26] to construct multitransition solutions of
$$\gamma u'''' - \beta u'' + u^3 - u = 0, \tag{1.3}$$
which is often referred to as the (stationary) extended Fisher-Kolmogorov equation. This approach yields solutions which are close to a primary heteroclinic loop, composed of single-transition solutions, by gluing together well-separated copies of these primary solutions. However, checking the intersection condition of Buffoni and Séré [12] can be involved, and it has not been verified for the general problem (1.2).

The EFK equation (1.3) has been extensively studied by Peletier and Troy [33, 34, 35, 36, 37] who show that heteroclinic connections between $\pm 1$ exist for all $\beta, \gamma > 0$. These primary heteroclinic connections have exactly one monotone transition and minimize the action $J$ in a suitable class of functions [37]. Using topological shooting methods they explore the set of bounded solutions of (1.3) when $\gamma > \beta^2/8$ (the saddle-focus case). They prove the existence of a countable family of heteroclinic connections which are qualitatively different from those found in this paper and in [26], as well as various types of periodic and chaotic solutions. For $\gamma < \beta^2/8$ the points $\pm 1$ are saddles (two negative and two positive real eigenvalues), and the primary heteroclinic connection is monotone and unique (up to translations) within the class of odd monotone functions. A similar approach in [28], using a global Poincaré section, extends this result to show that the primary heteroclinic is unique without restriction (see also [48]). The approach taken in this paper will be completely different. Also the method of Gardner and Jones [22] proves that for small values of $\gamma/\beta^2$ the heteroclinics are the result of transverse intersections of the stable and unstable manifolds. Transversality for all $\gamma/\beta^2 \le \frac{1}{8}$ was proved by van den Berg [49]. This is still an open question for large values of $\gamma/\beta^2$.
A related equation $u'''' + Pu'' + u - u^2 = 0$, which arises in nonlinear elasticity and the theory of shallow water waves, has been extensively studied by Amick, Buffoni, Champneys, and Toland [5, 10, 11, 13, 14] who also develop a shooting method suitable for fourth-order problems of this type. Note that the Lagrangian density is given by $L(u, u', u'') = \frac{1}{2}|u''|^2 - \frac{P}{2}|u'|^2 + \frac{1}{2}u^2 - \frac{1}{3}u^3$ and is not bounded from below as is (1.1). The primary homoclinic connection occurs as a mountain pass critical point, and our methods are not directly applicable. However we believe that many of the same ideas are important for both classes of problems. The parameter $P$ plays the same role as the ratio $\gamma/\beta^2$ in Eq. (1.2), and for $-2 < P < 2$ the stationary point $u = 0$ is a saddle-focus, which leads to a complicated set of multibump homoclinic connections [10, 12, 15]. This equation with $u^2$ replaced by $u^3$ is also used in certain optical models [2].

Since the points $u = \pm 1$ are hyperbolic equilibria of (1.2), homoclinic connections are contained in the affine Sobolev spaces $\pm 1 + H^2(\mathbb{R})$. Heteroclinic connections lie in the spaces $\pm\hat{\chi} + H^2(\mathbb{R})$, where $\hat{\chi} \in C^\infty(\mathbb{R})$ is a fixed function such that $\hat{\chi}(t) = -1$ for $t \le -1$ and $\hat{\chi}(t) = 1$ for $t \ge 1$. In the sequel, $\chi + H^2(\mathbb{R})$ will denote the affine space in which $\chi$ is chosen appropriately to be $1$, $-1$, $\hat{\chi}$, or $-\hat{\chi}$. Functions in all of these spaces have infinite tails which are bi-asymptotic to $\pm 1$ as $t \to \pm\infty$. Disregarding these tails, the solution can make a finite number of transitions between the values $-1$ and $+1$ with oscillations around $\pm 1$ between transitions, see Fig. 1.1. Our approach to finding multitransition solutions is to define open subclasses of the above spaces in which functions make a specific number of transitions and oscillations. We then minimize the functional $J$ in these classes. When the minimum is attained in the interior of a class, a local minimizer of $J$ is found which is a smooth solution to the Euler-Lagrange Eq. (1.2) with the corresponding properties. The precise definitions of the subclasses are given in Sect. 2, but we present here a brief, informal description in terms of homotopy classes of curves in the plane.
Fig. 1.1. A typical heteroclinic orbit with homotopy type $e_1 e_2^2$
Viewed in the configuration plane $(u, v)$ where $v = u'$, a heteroclinic or homoclinic orbit is a curve connecting the points $(\pm 1, 0)$, and the transitions and oscillations record the homotopy type relative to these points. All homotopy types arising in this way can be represented by a free semigroup generated by two (clockwise) oriented loops, $e_1$ and $e_2$ around $(\pm 1, 0)$, see Fig. 1.1. Note that the winding in the tails is disregarded in this representation, and the orientation of the loops is due to the relation $v = u'$. Thus for every heteroclinic and homoclinic orbit $u$ there exists a representative word of the form
$$e_{i_m}^{\theta_m} \cdot e_{i_{m-1}}^{\theta_{m-1}} \cdot \ldots \cdot e_{i_2}^{\theta_2} \cdot e_{i_1}^{\theta_1},$$
where $\theta(u) = (\theta_1, \ldots, \theta_m) \in \mathbb{N}^m$ and $i_{k+1} - 1 = i_k \bmod 2$. In the sequel it will be more convenient to consider $g(u) = 2\theta(u)$ rather than the winding vector $\theta$. The vector $g \in 2\mathbb{N}^m$ specifies the number of crossings of $\pm 1$ which $u$ makes between transitions, and $g = 0$ for functions with only one transition. For each vector $g \in G = 2\mathbb{N}^m \cup \{0\}$ there are two distinct homotopy classes which correspond to words beginning with $e_1$ and words beginning with $e_2$. These two classes contain functions for which $\lim_{t\to-\infty} u(t) = +1$ and $\lim_{t\to-\infty} u(t) = -1$ respectively. Let $M^\pm(g)$ denote these homotopy classes of functions with winding vector $g/2$, and define the numbers
$$\mathcal{J}^\pm(g) = \inf_{M^\pm(g)} J[u].$$
Homotopy Classes for Stable Connections
The above infima are well-defined since $J$ is bounded from below, although it is not clear whether the infima are attained in each of the homotopy classes. Our goal is to prove that $\mathcal{J}^\pm(g)$ is attained in many classes $M^\pm(g)$. Before stating our main result, we introduce an order relation $\prec$ on $G$ defined by the following rule:
$$(g_1, \ldots, g_m) \prec (g_1, \ldots, g_{k-1}, g_k^-, 2, g_k^+, g_{k+1}, \ldots, g_m)$$
for any $k \in \{1, \ldots, m\}$ and $g_k^\pm \in 2\mathbb{N}$ such that $g_k^- + g_k^+ = g_k$, which is extended on $G$ by transitivity. In terms of words representing homotopy types,
$$e_1^2 \prec e_1 \cdot e_2 \cdot e_1 \quad\text{and}\quad e_2^2 \prec e_2 \cdot e_1 \cdot e_2.$$
This order relation determines the classes in which we can find minimizers as stated in the following theorem.

Theorem 1.1. Suppose that $F \in C^2(\mathbb{R})$ has exactly two nondegenerate global minima at $u = \pm 1$, and $F$ grows superquadratically as $u \to \pm\infty$. If $\beta, \gamma > 0$ are chosen such that $\pm 1$ are saddle-focus equilibria of (1.2), then for any $g \in G$ there exist $h^\pm \succeq g$ such that $\mathcal{J}^\pm(h^\pm)$ are attained by functions $\hat u^\pm \in C^4(\mathbb{R}) \cap M^\pm(h^\pm)$.

Remark. Since there are only finitely many $h \in G$ with $h \succ g$, the following alternative holds for either $+$ or $-$ throughout: either $\mathcal{J}^\pm(g) < \mathcal{J}^\pm(h)$ for all $h \succ g$ and $\mathcal{J}^\pm(g)$ is attained in $M^\pm(g)$, or there are finitely many vectors $h_i^\pm \succ g$, $i = 1, \ldots, n$, such that $\mathcal{J}^\pm(h_1^\pm) = \ldots = \mathcal{J}^\pm(h_n^\pm) \le \mathcal{J}^\pm(g)$, and $\mathcal{J}^\pm(h_i^\pm)$ is attained in $M^\pm(h_i^\pm)$ for each $1 \le i \le n$.

The minima obtained in Theorem 1.1 are local minima of $J$ in the appropriate function spaces $\chi + H^2(\mathbb{R})$. The theorem does not imply that the infimum is attained in every homotopy class, but it can be shown that there are certain classes in which a local minimizer must exist. In particular, if the winding numbers $g_i$ are all small enough or large enough, then the infimum $\mathcal{J}^\pm(g)$ is attained in $M^\pm(g)$. In the latter case, the resulting local minimizers are multitransition solutions whose transitions are close to the single transition minimizers in $\mathcal{J}^\pm(0)$ and are separated by large distances. These solutions are analogous to those found in typical multibump constructions, cf. [26].

Theorem 1.2. Let $F$, $\beta$, $\gamma$ be as in Theorem 1.1. If $g = 0$ or $g \in 2\mathbb{N}^m$ with $g_i = 2$ for all $i \le m$, then $\mathcal{J}^\pm(g)$ are attained by minimizers in $M^\pm(g)$. Furthermore, there exists an $N > 0$ such that, if $g \in 2\mathbb{N}^m$ and $g_i > N$ for all $i \le m$, then $\mathcal{J}^\pm(g)$ are attained by minimizers in $M^\pm(g)$.

In Theorems 1.1 and 1.2 the hypotheses on $F$ are fairly mild. If additional symmetry for $F$ is assumed, then local minima exist in every homotopy class.

Theorem 1.3.
Let $F$, $\beta$, $\gamma$ be as in Theorem 1.1, and assume in addition that $F(u) = F(-u)$ for all $u \in \mathbb{R}$. Then for any $g \in G$ the infima $\mathcal{J}^-(g) = \mathcal{J}^+(g)$ are attained by minimizers in the associated homotopy classes.

Footnote 1. Theorem 1.3 was originally proved with the restriction that $g_i = 2$ or $g_i \ge 4$ for all $i$. It was demonstrated to us by J.B. VandenBerg [47] and the referee that this restriction is unnecessary, see Sect. 7.
W.D. Kalies, J. Kwapisz, R.C.A.M. VanderVorst
The proofs of these theorems are based on a constrained minimization principle in which J is minimized on open sets M ± (g) in χ + H 2 (R). The main difficulty is to show that there exist minimizing sequences which are bounded with respect to the appropriate norm and whose weak limits are contained in the interior of the class M ± (g). The oscillatory nature of solutions which lie in a neighborhood of a saddle-focus equilibrium is crucial to the control of minimizing sequences and is described in Sect. 4. In Sect. 3 we develop tools for removing spurious oscillations from minimizing sequences. The results of these two sections complement each other, and combined with fairly simple a priori estimates, they comprise the essential ingredients of the proofs of Theorems 1.1, 1.2, and 1.3 in Sects. 5, 6, and 7 respectively. Minimizing sequences in a class M ± (g) can lose complexity in the limit, i.e. approach the boundary of M ± (g), in two ways: crossings of ±1 can coalesce, or the distance between crossings can grow to infinity. In both cases minimizing sequences can be adjusted by replacing pieces of the functions in the sequence with pieces of orbits near a saddle-focus equilibrium. The oscillatory properties of such orbits ensure that the limits of these specially-constructed minimizing sequences remain in the interior of the class M ± (g). Intuitively this is the main idea of this paper, but the implementation requires some technical adjustments, see Sect. 5. We begin in Sect. 2 with a precise description of the functional analytic framework for these minimization problems, and in Sect. 8 we briefly describe other problems in which these techniques might be useful. Throughout this paper C will denote an arbitrary constant which may change from line to line. Generally C will depend on the parameters γ, β, and the nonlinearity F . Any other important dependence will be explicitly specified. 2. 
Preliminaries
The Hamiltonian of (1.2) is given by
$$H(u, u', u'', u''') = -\gamma u''' u' + \frac{\gamma}{2}|u''|^2 + \frac{\beta}{2}|u'|^2 - F(u) = p_1 q_2 + \frac{1}{2\gamma} p_2^2 - \frac{\beta}{2} q_2^2 - F(q_1)$$
in the symplectic coordinates $(q, p) = (q_1, q_2, p_1, p_2)$ defined by $(q_1, q_2) = (u, u')$ and $(p_1, p_2) = (\beta u' - \gamma u''', \gamma u'')$. The canonical Lagrangian then has the form $I[q, p] = \int_{\mathbb{R}} \{\langle q', p\rangle - H(q, p)\}\,dt$. Since this Lagrangian is strongly indefinite, it is more convenient to study homoclinic and heteroclinic orbits of (1.2) as critical points of the action functional
$$J[u] = \int_{\mathbb{R}} \left( \frac{\gamma}{2}|u''|^2 + \frac{\beta}{2}|u'|^2 + F(u) \right) dt$$
with $\gamma, \beta > 0$, which is obtained from $I[q, p]$ by substituting the above definition of $(q, p)$. Recall that the function $F \in C^2(\mathbb{R})$ is a nondegenerate double-well potential which grows superquadratically as $|u| \to \infty$. The specific hypothesis is
(H1) $F(\pm 1) = F'(\pm 1) = 0$, $F''(\pm 1) > 0$, and $F(u) > 0$ for $u \ne \pm 1$. Moreover there are constants $c_1$ and $c_2$ such that $F(u) \ge -c_1 + c_2 u^2$.
This implies the following property which will be used in the sequel:
(H2) for every $\alpha \in (-1, 1)$ there exists $\eta(\alpha) > 0$ such that
$$F(u) \ge \begin{cases} \eta(\alpha)^2 (u - 1)^2 & \text{for } u \in (\alpha, \infty), \\ \eta(\alpha)^2 (u + 1)^2 & \text{for } u \in (-\infty, \alpha). \end{cases}$$
As described in the introduction, we will consider classes of functions in the affine spaces $\chi + H^2(\mathbb{R})$. We will restrict attention to functions for which $\lim_{t\to-\infty} u(t) = -1$, in which case $\chi$ is either $\chi_{-1} \equiv -1$ or $\chi_1 = \hat\chi$, a fixed smooth function with $\hat\chi(t) = -1$ for $t \le -1$ and $\hat\chi(t) = 1$ for $t \ge 1$. The other cases are completely analogous, and thus we will drop the superscripts $\pm$ from the notation for the classes $M(g)$. For $m \ge 1$ and $g \in 2\mathbb{N}^m$ we define the subclass $M(g)$ of $\chi_{(-1)^m} + H^2(\mathbb{R})$ as follows.

Definition 2.1. A function $u$ is in $M(g)$ if there are nonempty sets $\{A_i\}_{i=0}^{m+1}$ such that
i) $u^{-1}(\pm 1) = \bigcup_{i=0}^{m+1} A_i$,
ii) $\# A_i = g_i$ for $i = 1, \ldots, m$,
iii) $\max A_i < \min A_{i+1}$ for $i = 0, \ldots, m$,
iv) $u(A_i) = (-1)^{i+1}$, and
v) $\{\max A_0\} \cup \bigcup_{i=1}^{m} A_i \cup \{\min A_{m+1}\}$ consists of transverse crossings of $\pm 1$.
Under these conditions $M(g)$ is an open subset of $\chi_{(-1)^m} + H^2(\mathbb{R})$. For $m = 0$ define $M(0)$ as above with two sets $A_0$ and $A_1$ each with at least one transverse crossing. For convenience we will suppress the dependence of $\chi$ on $m$ and use the notation $|g| = m$ if $g \in 2\mathbb{N}^m$ and $|0| = 0$.

For $g \in G = \bigcup_{m=1}^{\infty} 2\mathbb{N}^m \cup \{0\}$ functions in $M(g)$ make $|g| + 1$ transitions between $\pm 1$, and these occur on the intervals from $\max A_i$ to $\min A_{i+1}$ for $i = 0, \ldots, m$. The numbers $g_i$ count the crossings of either $-1$ or $+1$ between consecutive transitions, and these crossings are transverse as well as the crossings at the beginning of the first transition and at the end of the last transition. Functions in $\chi + H^2(\mathbb{R})$ can make infinitely many crossings of $\pm 1$ as $t \to \pm\infty$, but we do not make any assumptions about their transversality. For $u \in M(g)$ we will call the interval from the beginning of the first transition to the end of the last transition, i.e. from $\max A_0$ to $\min A_{m+1}$, the core interval of $u$, see Fig. 2.1.
These classes have been defined so that between any two crossings in $A_i$ functions in $M(g)$ stay strictly above/below $(-1)^i$. Hence a function which crosses $+1$, then has a tangency at $-1$, and subsequently crosses $+1$ again is on the boundary between two distinct classes. Initially we will allow minimizing sequences to move from one class to another in this manner. To formalize this idea we define the following partial ordering $\prec$ on the set $G = \bigcup_{m=1}^{\infty} 2\mathbb{N}^m \cup \{0\}$. Let $g = (g_1, \ldots, g_m) \in G$. For any $k \in \{1, \ldots, m\}$ and any $g_k^\pm \in 2\mathbb{N}$ such that $g_k^- + g_k^+ = g_k$, we declare
$$(g_1, \ldots, g_m) \prec (g_1, \ldots, g_{k-1}, g_k^-, 2, g_k^+, g_{k+1}, \ldots, g_m).$$
Also if $g, h, k \in G$ with $g \prec h$ and $h \prec k$, then $g \prec k$. These relations define a partial ordering on the set $G$. For any $g \in G$ define $\mathcal{M}(g) = \bigcup_{h \succeq g} M(h)$. The functions in $\mathcal{M}(g)$ are at least as topologically complex as those in $M(g)$, i.e. they have at least $|g| + 1$ transitions and at least $\sum_i g_i$ crossings of $\pm 1$ between the first and last transitions. Let $\mathcal{J}(g) = \inf_{M(g)} J$. We will prove a more detailed version of Theorem 1.1 which states that given any $g \in G$ there is a local minimizer of $J$ which is at least as topologically complex as functions in $M(g)$ in the sense of the above ordering, see Fig. 2.1.
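[Figure: graphs of two functions between the levels $-1$ and $+1$, one in the class $g = (6, 4)$ and one in the class $h = (2, 2, 4, 4)$.]
Fig. 2.1. $(6, 4) \prec (2, 2, 4, 4)$. The intervals pictured are the core intervals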
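The splitting rule that generates the ordering can be explored mechanically. The following is a minimal sketch, not part of the original argument: it represents a winding vector as a Python tuple of positive even integers and assumes that $2\mathbb{N}$ excludes $0$, so that both pieces $g_k^-$ and $g_k^+$ are at least $2$. The function names `successors` and `upper_set` are illustrative only.

```python
def successors(g):
    """Return all vectors obtained from g by one splitting step
    (g_1,...,g_m) -> (g_1,...,g_k^-, 2, g_k^+,...,g_m), g_k^- + g_k^+ = g_k.

    Assumption: entries are positive even integers, so g_k^-, g_k^+ >= 2.
    """
    out = set()
    for k, gk in enumerate(g):
        # left plays the role of g_k^-, right the role of g_k^+
        for left in range(2, gk - 1, 2):
            right = gk - left
            out.add(g[:k] + (left, 2, right) + g[k + 1:])
    return out


def upper_set(g):
    """Return {h : h = g or g precedes h}, i.e. the transitive closure of splitting."""
    seen = {g}
    stack = [g]
    while stack:
        current = stack.pop()
        for h in successors(current):
            if h not in seen:
                seen.add(h)
                stack.append(h)
    return seen


if __name__ == "__main__":
    up = upper_set((6, 4))
    print((2, 2, 4, 4) in up)          # the relation shown in Fig. 2.1
    print(max(len(h) for h in up))     # length of the longest vector above (6, 4)
```

Each splitting step raises both the length and the sum of the vector by 2 while keeping every entry at least 2, so the search terminates; this is the same counting that makes the set of vectors above a given $g$ finite.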
Theorem 2.2. Suppose $F$ satisfies the hypothesis (H1) and $\gamma, \beta > 0$ are chosen such that $\pm 1$ are saddle-foci. Then for any $g \in G$ there exists $\hat u \in \mathcal{M}(g)$ which is a local minimizer of $J$ in $\chi + H^2(\mathbb{R})$ with the following properties:
i) $\hat u$ has strictly monotone transitions,
ii) $\hat u$ has only one local extremum between consecutive crossings, and
iii) the tails of $\hat u$ have a countable infinity of crossings of $\pm 1$, all of which are transverse, and between consecutive crossings $\hat u$ has one local extremum. In each tail, there are two sequences $(\hat u(t_n^{\max}))$ and $(\hat u(t_n^{\min}))$ consisting of all local maxima and minima. These sequences are strictly monotone, and $\hat u(t_n^{\max}) \searrow \pm 1$ and $\hat u(t_n^{\min}) \nearrow \pm 1$ as $n \to \infty$.
Moreover, either $\mathcal{J}(g) < \mathcal{J}(h)$ for all $h \succ g$ and $\hat u \in M(g)$, or there are finitely many $h_i \succ g$, $i = 1, \ldots, n$, such that $\mathcal{J}(h_1) = \ldots = \mathcal{J}(h_n) \le \mathcal{J}(g)$, and there exist local minimizers $\hat u_i$ in each $M(h_i)$, $i = 1, \ldots, n$.

Remark. This theorem establishes the existence of locally minimizing heteroclinic and homoclinic solutions emanating from $-1$. Obviously the same result holds for solutions starting at $+1$. For certain vectors $g \in G$ the theorem immediately implies that there is a local minimizer in the class $M(g)$.

Corollary 2.3. There are local minimizers of $J$ in any class $M(g)$ for which $g_i = 2$ for all $i \le |g|$. In particular there exist minimizers in the classes $M(0)$ and $M((2))$ which correspond to a single transition heteroclinic orbit and a single pulse homoclinic orbit with no oscillations between two transitions.

Remark. The minimizers in the class $M(0)$ are global minimizers in $\chi_1 + H^2(\mathbb{R})$. Global minimizers in these affine spaces can be found without assuming $\pm 1$ are saddle-foci. In the saddle-focus case these global minimizers must a priori be in $M(0)$, i.e. have oscillations in the tails, but this is not necessary for other types of equilibria.

3. Clipping
Let $u \in C^1[a, b]$.
Suppose there is a subinterval $I = [\alpha, \beta]$ of $[a, b]$ such that $u(\alpha) = u(\beta)$ and $u'(\alpha) = u'(\beta)$. Then we can clip out the interval $I$ from $[a, b]$ by collapsing it to a point (cf. Fig. 3.1) to obtain a function $u^* \in C^1[a, b - |I|]$ which is formally defined by
$$u^*|_{[a, \alpha]} \equiv u|_{[a, \alpha]} \quad\text{and}\quad u^*|_{[\alpha, b - |I|]} \equiv u|_{[\beta, b]}.$$
Here $|I|$ denotes the length of the interval $I$. More generally, a function $u^*$ will be called a clip of $u$ if it is obtained by clipping out a finite number of intervals from $u$. Note that the values of the functions $u$ and $u^*$ along with their derivatives coincide at the corresponding endpoints of their domains. Clipping is a well-defined operation on $H^2(\mathbb{R})$ functions. Since the integrand of $J$ is nonnegative, it also has the fundamental property that $J[u^*] \le J[u]$ for any clip $u^*$ of $u$. The next three lemmas will be tools for clipping functions, and these ideas are best understood by examining intersections of the corresponding curves in the configuration plane $(u, u')$, cf. Fig. 1.1.

Lemma 3.1. Let $a_1 < b_1 \le a_2 < b_2$, and $I_j = [a_j, b_j]$, $j = 1$ or $2$. Suppose a function $u \in C^1(I_1) \cap C^1(I_2)$ is increasing on both $I_1$ and $I_2$ with $u(I_1) \cap u(I_2) \ne \emptyset$ and satisfies one of the following two properties:
i) $u(a_1) = u(a_2)$, $u(b_1) = u(b_2)$, and $(u'(a_1) - u'(a_2)) \cdot (u'(b_1) - u'(b_2)) \le 0$, or
ii) $u'(a_1) = u'(a_2) = u'(b_1) = u'(b_2) = 0$ and $(u(a_1) - u(a_2)) \cdot (u(b_1) - u(b_2)) \ge 0$.
Then there exist $c_j \in I_j$ such that $u(c_1) = u(c_2)$ and $u'(c_1) = u'(c_2)$. Hence the interval $(c_1, c_2)$ can be clipped out of $u$ to produce an increasing function. The same result holds for functions which decrease on $I_1$ and $I_2$, and if $u$ is strictly monotone over these intervals, then the clip of $u$ is also strictly monotone.

Proof. First consider the case in which the hypothesis (i) is satisfied. Assume $u(a_1) < u(b_1)$, and let $I = [u(a_1), u(b_1)]$, as the other case is similar. Since $u$ is $C^1$ and monotone on the intervals $I_1$ and $I_2$, the function
$$\varphi(s) = u'\big(u|_{I_1}^{-1}(s)\big) - u'\big(u|_{I_2}^{-1}(s)\big)$$
is well-defined and continuous for $s \in I$. By hypothesis $\varphi(u(a_1)) \cdot \varphi(u(b_1)) = (u'(a_1) - u'(a_2)) \cdot (u'(b_1) - u'(b_2)) \le 0$. Therefore $\varphi(u(a_1))$ and $\varphi(u(b_1))$ have opposite signs or one of them vanishes, and $\varphi(s_0) = 0$ for some $s_0 \in I$. Let $c_j \in u|_{I_j}^{-1}(s_0)$, $j = 1$ or $2$. Then $u(c_1) = u(c_2)$ and $u'(c_1) = u'(c_2)$ by construction. The clip of $u$ inherits the monotonicity properties of $u$ on the intervals $[a_1, c_1]$ and $[c_2, b_2]$.

Now suppose $u$ satisfies hypothesis (ii) and $u$ is increasing on $I_1$ and $I_2$. Then there are two cases, and we consider $u(a_1) \le u(a_2)$ and $u(b_1) \le u(b_2)$, as the other case is similar. Since $u(I_1) \cap u(I_2) \ne \emptyset$, there are points $\hat a_1 \in I_1$ and $\hat b_2 \in I_2$ such that $u(\hat a_1) = u(a_2)$ and $u(\hat b_2) = u(b_1)$. Then $u$ satisfies hypothesis (i) on the intervals $[\hat a_1, b_1]$ and $[a_2, \hat b_2]$.

Typically, we will apply this lemma to functions defined on all of $\mathbb{R}$. In this case, the clipping operation is localized and removes a finite interval so that the resulting function is again defined on all of $\mathbb{R}$, see Fig. 3.1. The next two lemmas apply to Morse functions, i.e. $C^2$ functions whose critical points are all nondegenerate. Morse functions have finitely many critical points on a compact interval, all of which are local maxima or minima. We will denote the closed convex hull of a set $A$ by $\mathrm{conv}(A)$.

Lemma 3.2. Let $u \in C^2[a, b]$ be a Morse function with $u(a) \ne u(b)$. Suppose $u([a, b]) \subset \mathrm{conv}(\{u(a), u(b)\})$. Then $u$ can be clipped to a strictly monotone function.
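[Figure: a function before and after clipping, with the intervals $I_1$, $I_2$, $J_1$, $J_2$ marked between the levels $-1$ and $+1$, and the concatenated domain $J_1 \cup \hat J_2$.]
Fig. 3.1. A typical example of the clipping operation. As in Lemma 3.1, the intervals $J_1 = [a_1, c_1]$ and $J_2 = [c_2, b_2]$ are concatenated to $[a_1, b_2 - c_2 + c_1] = [a_1, c_1] \cup [c_1, b_2 - c_2 + c_1] = J_1 \cup \hat J_2$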
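The clip points of Lemma 3.1 under hypothesis (i) can be located numerically as a zero of the function $\varphi$ used in its proof. The sketch below is only an illustration under assumptions of our own: the two monotone branches are supplied through their inverses, a plain bisection is used to find the zero, and the example function ($u(t) = t$ on $[0, 1]$ and $u(t) = (t - 2)^2$ on $[2, 3]$) is hypothetical and chosen so that the answer is known in closed form.

```python
import math


def find_clip_points(inv1, inv2, du, s_lo, s_hi, tol=1e-12):
    """Find c1 in I1 and c2 in I2 with u(c1) = u(c2) and u'(c1) = u'(c2).

    inv1, inv2: inverses of u restricted to I1 and I2 (u is monotone there).
    du:         the derivative u'.
    s_lo, s_hi: the common range of values [u(a1), u(b1)] = [u(a2), u(b2)].
    """
    def phi(s):
        # phi(s) = u'(u|_{I1}^{-1}(s)) - u'(u|_{I2}^{-1}(s))
        return du(inv1(s)) - du(inv2(s))

    f_lo, f_hi = phi(s_lo), phi(s_hi)
    if f_lo * f_hi > 0:
        raise ValueError("phi does not change sign: hypothesis (i) fails")

    lo, hi = s_lo, s_hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = phi(mid)
        if f_lo * f_mid <= 0:
            hi = mid                 # a zero lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid    # a zero lies in [mid, hi]
    s0 = 0.5 * (lo + hi)
    return inv1(s0), inv2(s0)


# Hypothetical example: u(t) = t on [0, 1] and u(t) = (t - 2)^2 on [2, 3].
def example_inv1(s):
    return s


def example_inv2(s):
    return 2.0 + math.sqrt(s)


def example_du(t):
    return 1.0 if t <= 1.0 else 2.0 * (t - 2.0)


if __name__ == "__main__":
    c1, c2 = find_clip_points(example_inv1, example_inv2, example_du, 0.0, 1.0)
    print(c1, c2)  # the interval (c1, c2) is the one to clip out
```

In this example $\varphi(s) = 1 - 2\sqrt{s}$, so the zero is at $s_0 = 1/4$ and the clip removes the interval between $t = 1/4$ and $t = 5/2$.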
Proof. If $u$ has no critical points, then $u$ is monotone, and there is nothing to prove. Otherwise we will show that $u$ can be clipped to remove at least two critical points. Let $b_1 > a$ be the first critical point of $u$. Consider the case $u(a) < u(b)$ as the other case is similar. Since $u(t) \ge u(a)$ and $u$ is Morse, $u$ is strictly increasing on $[a, b_1]$, and $u$ has a local maximum at $b_1$. Let $b_2 = \sup\{t : u(t) = u(b_1) \text{ and } u \text{ is increasing at } t\}$ and $a_2 = \sup\{t < b_2 : u'(t) = 0\}$, the location of the local minimum to the left of $b_2$. Finally, since $u$ is increasing on $(a, b_1)$, let $a_1$ be the unique point in $[a, b_1)$ with $u(a_1) = u(a_2)$. Note that $u$ is strictly increasing on $[a_2, b_2]$ and on $[a_1, b_1]$. Applying Lemma 3.1 (hypothesis (i)) to $u$ with subintervals $I_j = [a_j, b_j]$, $j = 1, 2$, we obtain points $c_j \in I_j$ for $j = 1, 2$ at which $u$ can be clipped to a function $u^*$. Since $b_1$ and $a_2$ are in the interval which is clipped out, $u^*$ has at least two fewer critical points than $u$. An even number of critical points are removed. Since there are only finitely many critical points, this process can be repeated to obtain a strictly monotone function.

Lemma 3.3. Let $u \in C^2[a, b]$ be Morse with $u(a) = u(b)$, $u'(a) \cdot u'(b) < 0$, and $u([a, b]) \subset [u(a), \infty)$ or $(-\infty, u(b)]$. Then there exists a clip of $u$ which has exactly one critical point in $(a, b)$.

Proof. Assume $u'(b) < 0 < u'(a)$, the other case is similar. Let $M = \max u(t)$ and $S = \{t : u(t) = M\}$. Let $c_1 = \min S$ and $c_2 = \max S$. Then $u(c_1) = u(c_2) = M$ and $u'(c_1) = u'(c_2) = 0$. Apply Lemma 3.2 to the intervals $[a, c_1]$ and $[c_2, b]$ to obtain two clips, $u_1^*$ which is increasing from $u(a)$ to $M$ and $u_2^*$ which is decreasing from $M$ to $u(b)$. These two clips can be glued together at $c_1$ and $c_2$ to get a clip of $u$ with one critical point.

Definition 3.4. A function $u \in M(g)$ is normalized if $u$ is monotone on each transition, and there are points $-\infty \le a < b \le \infty$ such that
i) all crossings of $\pm 1$ are transverse in $(a, b)$,
ii) $u$ contains exactly one local extremum between each pair of consecutive crossings of either $+1$ or $-1$ in $(a, b)$,
iii) $a$ and $b$ are accumulation points of crossings of $\pm 1$, and
iv) $u$ contains no intervals of critical points except $(-\infty, a]$ and $[b, \infty)$ on which $u$ is identically $\pm 1$.

Several comments about this definition are in order. First, the basic property of normalized functions, which will be used extensively throughout the sequel, is that all local extrema are isolated except possibly on infinite intervals at the ends of the tails where the function is identically $\pm 1$. Moreover, each maximal monotonicity interval either contains exactly one crossing of $\pm 1$ or exactly one crossing of both $-1$ and $+1$ if it is a transition. These monotonicity intervals will be used to identify places where a function can be clipped using Lemma 3.1 with hypothesis (ii). Note that the crossings are transverse in the core interval by the definition of the classes $M(g)$, but for normalized functions all crossings are transverse except possibly at the ends of the tails. So the core interval is contained in $(a, b)$. Finally, the normalized functions are prevalent in the classes $M(g)$ as the following key lemma indicates.

Lemma 3.5. Let $u \in M(g)$. For every $\varepsilon > 0$ there exists a normalized $u^* \in M(g)$ such that $J[u^*] \le J[u] + \varepsilon$.

Proof. The first step in the proof is to perturb $u$ so that it possesses infinitely many transverse crossings. Choose $\mu > 0$ and a fixed cutoff function $\omega \in C^\infty(\mathbb{R})$ with $\mathrm{supp}\,\omega = [-1, 1]$ and $\omega(0) = 1$ and $\omega'(0) = 0$. Since $u \in \chi + H^2(\mathbb{R})$, there is a bi-infinite sequence $\ldots t_{-1} < t_0 < t_1 \ldots$, which can be chosen such that $|t_{i+1} - t_i| > 4$ and $|(u(t_i), u'(t_i)) - (\chi(t_i), \chi'(t_i))| < \mu \cdot 2^{-|i|-1}$. Clearly the sequence $(t_i)$ can also be chosen so that each $t_i$ is at least distance four from the core interval and $\chi(t_i) = \pm 1$ for all $i \in \mathbb{Z}$ by the definition of $\chi$. The function
$$v(t) = u(t) + \sum_{i=-\infty}^{\infty} \big(\mu \cdot 2^{-|i|}(t - t_i) - u(t_i) + \chi(t_i)\big)\,\omega(t - t_i)$$
has transverse crossings with positive derivative at the points $t_i \to \pm\infty$ as $i \to \pm\infty$, and $\|v - u\|_{H^2} \le C\mu\|\omega\|_{H^2}$. Thus for $\mu$ sufficiently small $J[v] \le J[u] + \varepsilon/2$. Let $(z_i)_{i \in \mathbb{Z}}$ denote the sequence of all the crossings $t_i$ created above along with all the crossings in the core interval of $v$. On the interior of each interval $[z_i, z_{i+1}]$ the function $v$ can be perturbed to a Morse function in the usual way without altering the crossings $z_i$. This perturbation can be made arbitrarily small so that the resulting function $w$ satisfies $\|w - u\|_{H^2} \le C\mu$. For sufficiently small $\mu$, we have $J[w] \le J[u] + \varepsilon$. Furthermore, the nontransverse crossings must be local extrema of $w$. Now let $(z_i)_{i \in \mathbb{Z}}$ be the sequence of all transverse crossings of $w$. On each interval $[z_i, z_{i+1}]$ either $w(z_i) = w(z_{i+1}) = \pm 1$ or $w(z_i) = -w(z_{i+1}) = \pm 1$ in which case $w$ makes a transition on this interval. By the choice of $(z_i)$ the hypotheses of Lemmas 3.3 and 3.2 hold, and we can apply them to the two cases respectively to "normalize" $w$ on $[z_i, z_{i+1}]$. More precisely, for each $i \in \mathbb{Z}$ we obtain an interval $I_i$ and $w^* : I_i \to \mathbb{R}$ which is a clip of $w|_{[z_i, z_{i+1}]}$. Note that the values of $w^*$ and its derivative match at the right endpoint of $I_i$ and the left endpoint of $I_{i+1}$. Therefore, by concatenating the intervals $I_i$ to $I$, we construct a $C^1$ and piecewise $C^2$ function $w^* : I \to \mathbb{R}$ with $|I| = \sum |I_i|$. Since clipping reduces the action, $J[w^*] \le J[w] \le J[u] + \varepsilon$. Now $w^*$ has all the properties of a normalized function in Definition 3.4 except that $I$ need not be all of $\mathbb{R}$ (note that possibly infinitely many intervals were clipped from $w$). However
$$\lim_{t \to \sup I} (w^*(t), (w^*)'(t)) = \lim_{t \to \inf I} (w^*(t), (w^*)'(t)) = (\pm 1, 0).$$
Indeed any sequence $t_n \to \inf I$ or $\sup I$ can be associated with a sequence $\tau_n \to \pm\infty$ such that $(w^*(t_n), (w^*)'(t_n)) = (w(\tau_n), w'(\tau_n))$ which tends to $(\pm 1, 0)$ since $w \in \chi + H^2(\mathbb{R})$. In this way
$$u^*(t) = \begin{cases} w^*(t) & \text{for } t \in I, \\ \pm 1 & \text{for } t \notin I, \end{cases}$$
is a normalized function in $\chi + H^2(\mathbb{R})$ as in Definition 3.4 with $J[u^*] \le J[u] + \varepsilon$.
The previous lemma allows us to consider minimizing sequences which are normalized functions. The clipping lemmas also imply that adding more crossings in the core interval can only increase the action $J$.

Lemma 3.6. Suppose $g, h \in 2\mathbb{N}^m$ with $g_i \le h_i$ for all $i \le m$. Then $\mathcal{J}(g) \le \mathcal{J}(h)$.

Proof. We will consider the case where $g_i = h_i$ for all $i \ge 2$ and $g_1 = h_1 - 2$. The case $i \ne 1$ is similar, and the general case follows by induction. For any $u \in M(h)$ which is normalized we construct $v \in M(g)$ with strictly lower action. Let $\tau_1 = \max A_0$ and $\tau_2 = \min A_2$, and consider $u$ restricted to $[\tau_1, \tau_2]$. Let $s_1$ be such that $u$ has a local minimum at $s_1$ and no other local minimum of $u|_{(\tau_1, \tau_2)}$ is larger than $u(s_1)$. Let $s_2 < s_1 < s_3$ be such that $u$ is strictly monotone on $[s_2, s_1]$ and $[s_1, s_3]$, and $u$ has a local maximum at $s_2$ and $s_3$. Assume $u(s_2) \le u(s_3)$, as the other case is similar. Define $a = \sup\{t < s_1 : u(t) = u(s_1)\}$ and $b = \inf\{t > s_1 : u(t) = u(s_2)\}$. By construction $u(a) = u(s_1) < 1 < u(s_2) = u(b)$. Also $u'(a) \ge 0$, $u'(b) \ge 0$, and $u'(s_1) = u'(s_2) = 0$. Since $u$ is normalized, the hypothesis (i) of Lemma 3.1 holds on the intervals $I_1 = [a, s_2]$ and $I_2 = [s_1, b]$. Therefore we can clip $u$ over $I_1$ and $I_2$ to a monotone function $v$. Exactly two crossings are removed because $u$ has one crossing in each interval $I_1 = [a, s_2]$, $[s_2, s_1]$, and $I_2 = [s_1, b]$. Let $\varepsilon > 0$ and $u \in M(h)$ be such that $J[u] \le \mathcal{J}(h) + \varepsilon/2$. By Lemma 3.5 there is a normalized $u^* \in M(h)$ sufficiently close to $u$ such that $|J[u] - J[u^*]| < \varepsilon/2$. By the above argument we can find $v^* \in M(g)$ such that $J[v^*] < J[u^*] < \mathcal{J}(h) + \varepsilon$. Therefore $\mathcal{J}(g) \le \mathcal{J}(h)$.

4. Saddle-Focus Equilibria
In this section we analyze the minimizers of $J$ on a finite interval $[0, T]$ which satisfy the following boundary value problem,
$$\gamma v'''' - \beta v'' + G'(v) = 0, \qquad (v(0), v'(0)) = x \ \text{ and } \ (v(T), v'(T)) = y, \qquad (4.1)$$
where $x, y \in \mathbb{R}^2$.
For notational convenience we consider a potential $G \in C^2(\mathbb{R})$ for which $G(0) = 0$ is a nondegenerate global minimum. Since the wells at $\pm 1$ in the original potential $F$ can be translated to the origin, the analysis of this section will apply to both cases, i.e. $G(v) = F(v \pm 1)$. Hence we want to minimize
$$J = \int_0^T \left( \frac{\gamma}{2}|v''|^2 + \frac{\beta}{2}|v'|^2 + G(v) \right) dt$$
over the space $X_T = \{v \in H^2[0, T] : (v(0), v'(0)) = x, \ (v(T), v'(T)) = y\}$, and we will be interested in the properties of the minimizer with small boundary data, $\|x\|, \|y\| \ll 1$. In particular we will show that if the boundary data are sufficiently small, then the minimizer in $X_T$ is unique and small along with all its derivatives up to third-order, and the minimizer oscillates around the origin.

Theorem 4.1. There exists $\delta_0(G) > 0$ such that if $\|x\|, \|y\| \le \delta \le \delta_0$ and $T \ge 1$, then there exists a unique global minimizer $\hat v$ of $J$ in $X_T$ satisfying the boundary value problem (4.1). Furthermore, $\|\hat v\|_{W^{3,\infty}} \le C\delta$ and $J[\hat v] \le C\delta^2$, where $C$ is independent of $T \ge 1$.

Proof. We will separate the proof into several steps. Since $G$ has a nondegenerate global minimum at the origin, there are constants $\delta_1 > 0$ and $\eta > 0$ such that $G(v) \ge \eta^2 v^2$ for $|v| \le \delta_1$.

Step 1. There exists $C_1(G, \delta_1) > 0$ such that, if $\|x\|, \|y\| \le \delta \le \delta_1$, then $\inf_{X_T} J \le C_1\delta^2$. Choose any functions $\varphi_0, \varphi_1 \in C^\infty[0, 1]$ such that $\mathrm{supp}(\varphi_j) \subset [0, 1/2]$ with $\varphi_0(0) = 1$, $\varphi_0'(0) = 0$, $\varphi_1(0) = 0$, and $\varphi_1'(0) = 1$, and define $\psi_j(t) = (-1)^j \varphi_j(T - t)$. Consider the function $\phi \in X_T$ defined by $\phi = x_0\varphi_0 + x_1\varphi_1 + y_0\psi_0 + y_1\psi_1$. Note that there is a constant $\xi > 0$ such that $G(v) \le \xi v^2$, and hence $\inf_{X_T} J \le J[\phi] \le C_1\delta^2$.

Step 2. There exists $C_2(\eta) > 0$ such that for every $v \in X_T$ with $\|x\|, \|y\| \le \delta \le \delta_1/2$ we have $J[v] \ge C_2 \min\{\|v\|_\infty^2, \delta_1^2\}$. First suppose $\|v\|_\infty \le \delta_1$. If $|v(t)| \ge \|v\|_\infty/2$ for all $t \in [0, T]$, then
$$J[v] \ge \int_0^T G(v)\,dt \ge \int_0^T \eta^2 v^2\,dt \ge \frac{1}{4}\eta^2\|v\|_\infty^2 \ge C\|v\|_\infty^2.$$
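Otherwise there are points $t_0$ and $t_1 \in [0, T]$ such that $|v(t_0)| = \|v\|_\infty/2$ and $|v(t_1)| = \|v\|_\infty$. Then
$$J[v] \ge C(\beta)\left|\int_{t_0}^{t_1} |v'|\sqrt{G(v)}\,dt\right| \ge C\left|\int_{t_0}^{t_1} \eta\, v v'\,dt\right| = C\,|v(t_1)^2 - v(t_0)^2| = C\Big(\|v\|_\infty^2 - \frac{1}{4}\|v\|_\infty^2\Big) \ge C\|v\|_\infty^2. \qquad (4.2)$$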
Now, if $\|v\|_\infty > \delta_1$, then there are points $t_0$ and $t_1$ such that $|v(t_0)| = \delta_1/2$ and $|v(t_1)| = \delta_1$ because the boundary conditions are smaller than $\delta_1/2$. Hence $J[v] \ge C\delta_1^2$ by (4.2).

Step 3. There exists a $\delta_0 < \delta_1/2$ and $C(\delta_0) > 0$ such that, if $\|x\|, \|y\| \le \delta < \delta_0$ and $v \in X_T$ with $J[v] \le 2\inf_{X_T} J$, then $\|v\|_\infty, \|v\|_{H^2} \le C\delta$. If $\|v\|_\infty \ge \delta_1$, then $J[v] \ge C_2\delta_1^2$ by Step 2. From Step 1, $J[v] \le 2C_1\delta_0^2$. Thus $\delta_0$ can be chosen small enough so that $\|v\|_\infty < \delta_1$. Again by Steps 1 and 2, $C_2\|v\|_\infty^2 \le J[v] \le C_1\delta^2$, which implies $\|v\|_\infty \le C\delta$. Since $G(v) \ge \eta^2 v^2$, we have that $C(\gamma, \beta, \eta)\|v\|_{H^2}^2 \le J[v] \le C_1\delta^2$.

Step 4. For $\delta_0$ sufficiently small $J$ has a unique minimizer $\hat v \in X_T$ such that $\|\hat v\|_{W^{3,\infty}} \le C\delta$, where $C$ is independent of $T \ge 1$.
Using the a priori estimates in Step 3 and the weak lower semicontinuity of $J$ on $X_T$, a minimizer $\hat v \in X_T$ of $J$ can be found by the standard theory, and $\hat v$ is a solution to (4.1). From the differential equation $\|\hat v''''\|_{L^2} \le C(\|\hat v\|_{L^2} + \|\hat v''\|_{L^2}) \le C\delta$. A straightforward interpolation inequality yields
$$\|v^{(k)}\|_{L^2[0,T]} \le C\big(\|v^{(k-1)}\|_{L^2[0,T]} + \|v^{(k+1)}\|_{L^2[0,T]}\big)$$
with a constant $C$ independent of $T \ge 1$.

Remark. Using Fourier transforms, the above inequality is easily established for functions defined on all of $\mathbb{R}$. To obtain the desired estimate for functions on the finite interval $[0, T]$, extension operators are used, and a description of their properties can be found in [9]. The independence of $C$ on $T$ follows from combining properties of extension operators and the estimate on $\mathbb{R}$. A complete proof of a more general result is contained in the appendix to Kalies, VanderVorst, and Wanner [27]. Similar estimates are also performed in [29].

Combining these estimates we obtain $\|\hat v\|_{H^4} \le C\delta$ which gives the bound in $W^{3,\infty}$, cf. [29]. From the assumptions on $G$ near the origin, $\delta_0$ can further be chosen small enough so that the standard estimates on the difference of two solutions yield the uniqueness of the minimizer. This completes the proof of Theorem 4.1.
In the construction of convergent minimizing sequences of $J$ we will need to know that the minimizer $\hat v$ of $J$ found in the previous theorem has many oscillations.

Theorem 4.2. Suppose $\gamma > \beta^2/(4G''(0))$ so that the origin is a saddle-focus equilibrium in the four-dimensional flow. Then there exist $\delta_0(G) > 0$ and $\tau_0(G) > 0$ such that if $\|x\|, \|y\| \le \delta_0$, the unique global minimizer $\hat v$ of $J$ in $X_T$ satisfying (4.1) changes sign in any subinterval of length $\tau_0$ in $[0, T]$ for $T \ge 1$.

Proof. First we consider solutions to the linear differential equation
$$\gamma w'''' - \beta w'' + G''(0)\,w = 0. \qquad (4.3)$$
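The saddle-focus condition in Theorem 4.2 can be checked directly from the characteristic polynomial $\gamma r^4 - \beta r^2 + G''(0) = 0$, obtained by substituting $w = e^{rt}$ into the linear equation (4.3). The following sketch solves this biquadratic with the standard library only; the function names and the tolerance are illustrative choices, and `g2` stands for the value $G''(0)$.

```python
import cmath


def linearized_eigenvalues(gamma, beta, g2):
    """Roots r of gamma*r^4 - beta*r^2 + g2 = 0, where g2 plays the role of G''(0)."""
    disc = cmath.sqrt(beta * beta - 4.0 * gamma * g2)
    roots = []
    for sign in (1.0, -1.0):
        r_squared = (beta + sign * disc) / (2.0 * gamma)  # solve for r^2 first
        r = cmath.sqrt(r_squared)
        roots.extend([r, -r])
    return roots


def is_saddle_focus(gamma, beta, g2, tol=1e-12):
    """True if all four eigenvalues have nonzero real and imaginary parts."""
    return all(abs(r.real) > tol and abs(r.imag) > tol
               for r in linearized_eigenvalues(gamma, beta, g2))


if __name__ == "__main__":
    # gamma > beta^2 / (4 G''(0)): complex quadruple of eigenvalues
    print(is_saddle_focus(1.0, 1.0, 1.0))
    # gamma < beta^2 / (4 G''(0)): four real eigenvalues
    print(is_saddle_focus(0.1, 1.0, 1.0))
```

When the discriminant $\beta^2 - 4\gamma G''(0)$ is negative, $r^2$ is genuinely complex and the four roots form a quadruple $\pm\lambda \pm \mu i$ with $\lambda, \mu \ne 0$, which is the configuration used in the rest of the proof.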
Since the origin is a saddle-focus, it has complex eigenvalues $\pm\lambda \pm \mu i$. By rescaling time we can assume without loss of generality that $\mu = 1$ and $\lambda > 0$. Therefore all solutions to (4.3) have the form $w(t) = Ae^{-\lambda t}\sin(t + \varphi) + Be^{\lambda t}\sin(t + \psi)$ for some $A$, $B$, $\varphi$, and $\psi$.

Step 1. There exists $\tau_0 > 0$ depending only on $\lambda$ such that for every $A$, $B$, $\varphi$, and $\psi$ there are points $\tau_\pm \in [0, \tau_0]$ such that
$$\pm w(\tau_\pm) \ge \frac{1}{\tau_0}\|w\|_{L^\infty[0, \tau_\pm]}. \qquad (4.4)$$
We prove only the existence of τ = τ+ , as the other case is similar. The calculation is separated into two cases. First suppose
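$$|B|e^{2\pi\lambda} \le \frac{1}{2}|A|e^{-2\pi\lambda}.$$
Choose $\tau \in [0, 2\pi]$ such that $\sin(\tau + \varphi) = \operatorname{sgn} A$. Then we can estimate
$$w(\tau) \ge |A|e^{-2\pi\lambda} - |B|e^{2\pi\lambda} \ge \frac{1}{2}|A|e^{-2\pi\lambda}, \quad\text{and}$$
$$\|w\|_{L^\infty[0, \tau]} \le |A| + |B|e^{2\pi\lambda} \le |A| + \frac{1}{2}|A|e^{-2\pi\lambda} \le 2|A|.$$
Otherwise
$$|B|e^{2\pi\lambda} \ge \frac{1}{2}|A|e^{-2\pi\lambda}.$$
Choose $\tau \in [2\pi + \lambda^{-1}\ln 4, \ 4\pi + \lambda^{-1}\ln 4]$ such that $\sin(\tau + \psi) = \operatorname{sgn} B$. For this choice of $\tau$ we have
$$\frac{1}{2}|B|e^{\lambda\tau} \ge 2|B|e^{2\pi\lambda} \ge |A|e^{-2\pi\lambda} \ge |A|e^{-\lambda\tau}.$$
Thus we can estimate
$$w(\tau) \ge |B|e^{\lambda\tau} - |A|e^{-\lambda\tau} \ge \frac{1}{2}|B|e^{\lambda\tau} \ge \frac{1}{2}|B|e^{2\pi\lambda}, \quad\text{and}$$
$$\|w\|_{L^\infty[0, \tau]} \le |A| + |B|e^{\lambda\tau} \le 2|B|e^{4\pi\lambda} + |B|e^{\lambda\tau} \le |B|\left[2e^{4\pi\lambda} + e^{4\pi\lambda + \ln 4}\right],$$
where the last step uses $\lambda\tau \le 4\pi\lambda + \ln 4$. If $\tau_0$ is chosen larger than
$$\max\left\{4\pi + \frac{\ln 4}{\lambda}, \ 4e^{2\pi\lambda} + 2e^{2\pi\lambda + \ln 4}\right\} > 1,$$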
then for every $w$ there is a $\tau_+ \in [0, \tau_0]$ such that (4.4) holds.

Step 2. There exists $\delta_1 > 0$ such that if $v$ is the solution to the nonlinear differential equation $\gamma v'''' - \beta v'' + G'(v) = 0$ with initial conditions $v_0 = (v(0), v'(0), v''(0), v'''(0))$ and $\|v_0\| < \delta_1$, then $v$ changes sign in $[0, \tau_0]$. First note that if $w$ is the solution to the linear Eq. (4.3) with the same initial conditions, and $\delta_1$ is small enough so that $\|v\|_{L^\infty[0, \tau_0]}, \|w\|_{L^\infty[0, \tau_0]} \le 1$, then there is a constant $C = C(\tau_0)$ such that
$$\|v - w\|_{L^\infty[0, t]} \le C(\tau_0)\,\|\rho(v)\|_{L^\infty[0, t]} \cdot \|v\|_{L^\infty[0, t]} \quad\text{for all } t \in [0, \tau_0], \qquad (4.5)$$
where $\rho(v) = (G'(v) - G''(0)v)/v$. This estimate is obtained from the variation of constants formula
$$\mathbf{v}(t) = \mathbf{w}(t) + \int_0^t e^{L(t - s)}\,N(\mathbf{v}(s))\,ds,$$
where $\mathbf{v} = (v, v', v'', v''')$, $\mathbf{w} = (w, w', w'', w''')$, $N(\mathbf{v}) = (0, 0, 0, G'(v) - G''(0)v)$, and $L$ is the $4 \times 4$ matrix of the linear vector field obtained by writing (4.3) as a first-order system. The estimate (4.5) follows from the fact that $|G'(v) - G''(0)v| = o(|v|)$
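as $|v| \to 0$. Now choose $\kappa < 1$ such that $0 < C\kappa/(1 - C\kappa) \le 1/(2\tau_0)$ and $\delta_1$ such that $\|\rho(v)\|_{L^\infty[0, t]}, \|v\|_{L^\infty[0, \tau_0]}, \|w\|_{L^\infty[0, \tau_0]} \le \kappa$. We now estimate as follows,
$$\|v\|_{L^\infty[0, t]} \le \|w\|_{L^\infty[0, t]} + \|v - w\|_{L^\infty[0, t]} \le \|w\|_{L^\infty[0, t]} + C\|\rho(v)\|_{L^\infty[0, t]} \cdot \|v\|_{L^\infty[0, t]},$$
and hence
$$(1 - C\kappa)\|v\|_{L^\infty[0, t]} \le \|w\|_{L^\infty[0, t]}.$$
This implies that
$$\|v - w\|_{L^\infty[0, t]} \le C\|\rho(v)\|_{L^\infty[0, t]} \cdot \|v\|_{L^\infty[0, t]} \le \frac{C\kappa}{1 - C\kappa}\|w\|_{L^\infty[0, t]} \le \frac{1}{2\tau_0}\|w\|_{L^\infty[0, t]}.$$
Now take $t = \tau = \tau_+$ as in Step 1. Then
$$v(\tau) \ge w(\tau) - \|v - w\|_{L^\infty[0, \tau]} \ge \frac{1}{\tau_0}\|w\|_{L^\infty[0, \tau]} - \frac{1}{2\tau_0}\|w\|_{L^\infty[0, \tau]} > 0.$$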
So $v(\tau_+) > 0$ and similarly $v(\tau_-) < 0$. Finally let $T \ge 1$ and $\hat v$ be the minimizer from Theorem 4.1 on the interval $[0, T]$, and choose $\delta_0$ sufficiently small such that $\|\hat v\|_{W^{3,\infty}} < \delta_1$. Note that in the above analysis we rescaled time, and hence redefine the constant $\tau_0$ to be $\tau_0/\mu$. Then either $T < \tau_0$ and the theorem is vacuously satisfied, or $T \ge \tau_0$. In the latter case, Step 2 above implies that $\hat v$ changes sign on every subinterval of length $\tau_0$ in $[0, T]$. This completes the proof of Theorem 4.2.

5. Minimization
In this section we minimize $J$ in the classes $\mathcal{M}(g)$ defined in Sect. 2 and prove Theorem 2.2 and Corollary 2.3. The main idea in this minimization problem is to use the clipping lemmas and local theory of the previous sections to construct minimizing sequences which have a weak limit in the class $\mathcal{M}(g)$. The limiting function is then a local minimizer of $J$ in $\chi + H^2(\mathbb{R})$. First we obtain estimates for functions which stay away from a neighborhood of $\pm 1$.

Lemma 5.1. Let $u \in H^2[a, b]$ and $\delta > 0$. Then there exists a constant $C(\beta, \delta)$ such that
$$J[u] \ge C|u(b) - u(a)| \quad\text{and}\quad J[u] \ge C(b - a) \qquad (5.1)$$
whenever $|u \pm 1| > \delta$ on $[a, b]$. Moreover, $J$ is uniformly continuous on the sublevel set $J^c = \{u \in H^2[a, b] : J[u] \le c\}$.

Proof. We estimate
$$J[u] \ge \int_a^b \left( \frac{\beta}{2}(u')^2 + F(u) \right) dt \ge C\left( \frac{(u(b) - u(a))^2}{b - a} + b - a \right) \qquad (5.2)$$
using the Schwarz inequality and the fact that $F(u) \ge C(\delta)$ for $|u \pm 1| > \delta$. The first estimate in (5.1) follows from the arithmetic-geometric mean (Young's) inequality and the second estimate is clear.
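The two elementary steps behind (5.2) and the first estimate in (5.1) can be spelled out as follows. This is only a sketch: the symbol $c_\delta$ is introduced here for the lower bound $C(\delta)$ on $F$, and the constants are not optimized.

```latex
% Schwarz inequality applied to u' on [a,b]:
\[
|u(b)-u(a)|^2 = \Bigl|\int_a^b u'\,dt\Bigr|^2 \le (b-a)\int_a^b (u')^2\,dt
\quad\Longrightarrow\quad
\int_a^b \frac{\beta}{2}(u')^2\,dt \ge \frac{\beta}{2}\,\frac{(u(b)-u(a))^2}{b-a}.
\]
% Lower bound on the potential away from the wells:
\[
\int_a^b F(u)\,dt \ge c_\delta\,(b-a).
\]
% Arithmetic-geometric mean inequality x^2/y + y >= 2x for x >= 0, y > 0:
\[
\frac{(u(b)-u(a))^2}{b-a} + (b-a) \ge 2\,|u(b)-u(a)|.
\]
```

With $C = \min\{\beta/2, c_\delta\}$ the first two lines give (5.2), and the third line then yields $J[u] \ge 2C|u(b) - u(a)|$. Note that this choice of $C$ degenerates as $\beta \to 0$.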
Let $u \in J^c$. Since $J[u] \le c$ and $F(u) \ge -C_1 + C_2 u^2$, we have that $\|u\|_{H^2} \le C$, which implies a uniform bound on $u$ in $L^\infty$ for all $u \in J^c$. Therefore, using the local Lipschitz continuity of $F$, we have
$$|J[u + \varphi] - J[u]| \le C\int_a^b |u''\varphi''| + (\varphi'')^2 + |u'\varphi'| + (\varphi')^2 + |F(u + \varphi) - F(u)|\,dt \le C(\|\varphi\|_{H^2}, c, b - a) \cdot \|\varphi\|_{H^2}$$
for any $u \in J^c$ and $\varphi \in H^2[a, b]$, which establishes uniform continuity.
Corollary 5.2. There is a constant $\kappa(\beta, F)$ so that, whenever $u \in \chi + H^2(\mathbb{R})$ makes a transition on an interval $I$, we have $J[u|_I] \ge \kappa(\beta, F)$. Therefore $J[u] \ge \kappa(\beta, F) \cdot (|g| + 1)$ for all $u \in M(g)$.

Proof. In each transition there must be an interval $[a, b] \subset I$ such that $u([a, b]) = [1/4, 3/4]$. The previous lemma implies that $J[u|_I] \ge C(\beta, \delta)(3/4 - 1/4) = \kappa(\beta, F)$. Since $u \in M(g)$ has $|g| + 1$ transitions, the result follows.

Remark. The above estimates require $\beta > 0$, but if $\beta = 0$, similar but more delicate estimates can be obtained.

Now consider the minimization problem
$$\inf\Bigl\{J[u] : u \in \mathcal{M}(g) = \bigcup_{h \succeq g} M(h)\Bigr\} = \min\{\mathcal{J}(h) : h \succeq g\}. \tag{5.3}$$
The last equality holds because $\{h : h \succeq g\}$ is a finite set, and also note that there is a maximal element of the form $g_{\max} = (2, 2, \ldots, 2)$ with $|g_{\max}| = \sum g_i - |g|$. Within this set it is convenient to consider only those vectors for which $\mathcal{J}(h) = \inf_{\mathcal{M}(g)} J$ and which are maximal with respect to $\prec$, and we define
$$E(g) = \Bigl\{h \in G : h \succeq g,\ \mathcal{J}(h) = \inf_{\mathcal{M}(g)} J,\ \text{and } \mathcal{J}(h) < \mathcal{J}(k) \text{ for all } k \succ h\Bigr\}.$$
Since no two elements in $E(g)$ can be related by $\prec$, either $E(g) = \{g\}$ or $E(g) = \{h_1, \ldots, h_n : h_i \succ g\}$. In the first case $\mathcal{J}(g) < \mathcal{J}(h)$ for all $h \succ g$, and otherwise $\mathcal{J}(h_1) = \ldots = \mathcal{J}(h_n) \le \mathcal{J}(g)$. Also let $I_i = \mathrm{conv}(A_i)$ for $i = 1, \ldots, m$, $I_0 = (-\infty, \max A_0)$, and $I_{m+1} = (\min A_{m+1}, \infty)$, where the sets $A_i$ are those used in the definition of the classes in Sect. 2, i.e. the intervals $I_i$ for $i = 1, \ldots, m$ are simply the intervals between transitions. We will minimize $J$ in a fixed class $M(h)$. This minimization can only be accomplished in those classes which have the following property:

Uniform Separation Property. There exist $\epsilon_0 > 0$ and $0 < \delta(h) < 1/4$ such that for any normalized $u \in M(h)$ with $J[u] \le \mathcal{J}(h) + \epsilon_0$ we have $|u(t) - (-1)^i| > \delta(h)$ for all $t \in I_i$, $i = 0, \ldots, |h| + 1$.

This property asserts that minimizing sequences in the class $M(h)$ cannot gain complexity by forming new transitions, i.e. between two crossings of $\pm 1$, functions with small enough action in $M(h)$ are uniformly bounded away from $\mp 1$. The next lemma states that any class $M(h)$ with $h \in E(g)$ satisfies this property. In Sect. 7 we will prove that if $F$ is even, then all classes have this property.
Lemma 5.3. Let $g \in G$. Then there exists a $\kappa > 0$ such that for every $h \in E(g)$ we have $\mathcal{J}(k) \ge \mathcal{J}(h) + \kappa$ for all $k \succ h$. Moreover, the class $M(h)$ satisfies the uniform separation property.

Proof. Since $E(g)$ and $\{k : k \succ h\}$ are both finite sets, there is $\kappa > 0$ such that $\mathcal{J}(k) \ge \mathcal{J}(h) + \kappa$ for all $k \succ h$ and all $h \in E(g)$. Before proving the uniform separation property, we define a family of functions $\psi_n(t) \in H^2(\mathbb{R})$ by
$$\psi_n(t) = \begin{cases} (t^2 - 1)^2 \cos(n\pi t) & \text{for } t \in [-1, 1], \\ 0 & \text{for } t \notin [-1, 1]. \end{cases} \tag{5.4}$$
The support and range of $\psi_n$ are contained in $[-1, 1]$, and by a suitable scaling, the $H^2$ norm, the domain and range of $\psi_n$ can all be made arbitrarily small. We will use various scalings of these functions in this proof and subsequent proofs to perturb functions in $\chi + H^2(\mathbb{R})$. In most cases there will be many different ways to accomplish such perturbations, but we will use this family of functions for specificity.

Now let $\epsilon_0 = \kappa/2$. We will show that the uniform separation property holds on the core interval with this $\epsilon_0$ and some small $\delta > 0$. Suppose to the contrary that for any $\delta < 1/4$ there exists a normalized $u \in M(h)$ with $J[u] \le \mathcal{J}(h) + \epsilon_0$ and a point $t_0 \in I_i$ with $1 \le i \le m$ such that $|u(t_0) - (-1)^i| < \delta$. Let $v = u + 2\delta\psi_0((t - t_0)/\delta^{1/2})(-1)^i$. Then $v = u$ outside the interval of size $\delta^{1/2}$ around $t_0$. To specify $\delta$, let $I = [a, b]$ be the largest interval containing $t_0$ for which $(-1)^i u \ge 0$, and choose $t_1 \in I$ such that $|u(t_1)| = 1/2$. The size of the interval $I$ is uniformly bounded from below because
$$C\,\frac{|u^2(t_1) - u^2(a)|}{b - a} \le C\,\frac{(1/4 - 0)}{t_1 - a} \le J[u] \le \mathcal{J}(h) + \epsilon_0,$$
using the estimate (5.2). Now choose $\delta$ small enough so that the perturbation is restricted to the interval $I$. Therefore the function $v$ has exactly two more transitions than $u$, since $|v(t_0) - u(t_0)| = 2\delta$. Furthermore $\|v - u\|_{H^2} < C\delta^{1/4}$. For $\delta$ sufficiently small (independently of $u$) we have $J[v] < J[u] + \epsilon_0/2$, because $J$ is uniformly continuous with respect to perturbations with support on the finite interval $[t_0 - 1/2, t_0 + 1/2]$ by Lemma 5.1. Since $u$ is normalized in $M(h)$, the point $t_0$ can be chosen to be the unique local extremum between two crossings. In this case the two new crossings created in $v$ are transverse because of the form of the perturbation. Thus $v \in M(k)$ for some $k \succ h$ and $J[v] < J[u] + \epsilon_0/2$. Hence
$$\mathcal{J}(k) \le J[v] \le J[u] + \frac{\epsilon_0}{2} \le \mathcal{J}(h) + \frac{3\epsilon_0}{2} \le \mathcal{J}(h) + \frac{3\kappa}{4},$$
which contradicts the choice of $\epsilon_0$, since $\mathcal{J}(k) \ge \mathcal{J}(h) + \kappa$. This establishes the uniform separation property in the core interval.

Finally we will show that the uniform separation property holds in the tails. We begin with the following claim.

Claim. There exists $\kappa > 0$ such that if $k = (2, 2, h)$ or $(h, 2, 2)$ for any $h \in G$, then $\mathcal{J}(k) \ge \mathcal{J}(h) + \kappa$.

Let $u$ be any normalized function in $M(k)$, where $k = (2, 2, h)$ as the other case is similar. Then the core interval of $u$ begins with an interval $I$ containing some number of transitions (at least three) with no extra oscillations around $\pm 1$ between them, i.e. there
are (at least) three maximal monotonicity intervals each contributing a transition. Let $I$ be the union of the maximal monotonicity intervals of all these transitions. Then $u$ does not have transitions on either side of $I$ without first oscillating around $\pm 1$. Let $t_{\max}$ be the location of the maximum value of $u$ on $I$, and let $s_1 < t_{\max} < s_2$ be the locations of the adjacent local minima. Let $u(a_1)$ and $u(a_2)$ be the adjacent local maxima located to the left of $s_1$ and to the right of $s_2$ respectively. If $u(s_1) = u(s_2)$, then the interval $[s_1, s_2]$ can be clipped out of $u$, removing two transitions. If $u(s_1) < u(s_2)$, then the intervals $[a_1, s_1]$ and $[t_{\max}, s_2]$ satisfy the hypothesis (ii) of Lemma 3.1. Note that one of the intervals $[a_1, s_1]$ and $[s_1, t_{\max}]$ need not be a transition, but in any case two transitions can be clipped from $u$. Note that $s_1$ must be at the bottom of a transition so that $u(s_1) < -1$. If $u(s_1) > u(s_2)$, then $u(s_2) < -1$, and hence $a_2 \in I$ so that $u(a_2) < u(t_{\max})$. Therefore $u$ can be clipped using the intervals $[s_1, t_{\max}]$ and $[s_2, a_2]$. This clipping yields a $u^* \in M(h)$ such that $J[u^*] + \kappa \le J[u]$ for some $\kappa > 0$ independent of $u$ by Corollary 5.2. Therefore $\mathcal{J}(h) + \kappa \le J[u]$ for all $u \in M(k)$, which implies that $\mathcal{J}(k) \ge \mathcal{J}(h) + \kappa$. This completes the proof of the claim.

Now assume the uniform separation property fails in the left tail of $u$. Let $t_0$ be the largest point such that $u(t_0)$ is the global maximum of $u$ on $I_0$. Arguing as in the proof of Lemma 3.6, each of the oscillations around $-1$ can be removed one-by-one so that there are exactly two crossings of $-1$ between $t_0$ and $\max A_0$. As above assume that $|u(t_0) - 1| < \delta < 1/4$. For $\delta$ sufficiently small depending only on $\max\{\mathcal{J}(k) : k = (2, 2, h) \text{ or } k = (h, 2, 2) \text{ for some } h \in E(g)\}$, we can perturb $u$ near $t_0$ exactly as above to obtain $v \in M(k)$ where $k = (2, 2, h)$ with $J[v] \le \mathcal{J}(h) + 3\kappa/4$. This contradicts the claim.
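The bound $\|v - u\|_{H^2} < C\delta^{1/4}$ used in the proof of Lemma 5.3 can be checked by a direct scaling computation. The following is a sketch; the symbol $\varphi$ for the perturbation $v - u$ (up to sign) is introduced here only for illustration, and $0 < \delta \le 1$ is assumed.

```latex
\[
\varphi(t) = 2\delta\,\psi_0\!\Bigl(\frac{t-t_0}{\delta^{1/2}}\Bigr),
\qquad
\varphi^{(k)}(t) = 2\,\delta^{\,1-k/2}\,\psi_0^{(k)}\!\Bigl(\frac{t-t_0}{\delta^{1/2}}\Bigr),
\quad k = 0,1,2.
\]
% Substituting s = (t - t_0)/\delta^{1/2} contributes a factor \delta^{1/2}:
\[
\|\varphi^{(k)}\|_{L^2}^2
 = 4\,\delta^{\,2-k}\cdot\delta^{1/2}\,\|\psi_0^{(k)}\|_{L^2}^2
 = 4\,\delta^{\,5/2-k}\,\|\psi_0^{(k)}\|_{L^2}^2
 \le 4\,\delta^{1/2}\,\|\psi_0^{(k)}\|_{L^2}^2 .
\]
\[
\|\varphi\|_{H^2} \le 2\,\delta^{1/4}\,\|\psi_0\|_{H^2},
\qquad
|\varphi(t_0)| = 2\delta\,\psi_0(0) = 2\delta,
\qquad
\operatorname{supp}\varphi \subset [\,t_0-\delta^{1/2},\; t_0+\delta^{1/2}\,].
\]
```

The second derivative is the limiting term: it scales like $\delta^{1/4}$ in $L^2$, which is why the exponent $1/4$ appears, while the amplitude $2\delta$ is exactly what is needed to push $u$ across $(-1)^i$ at $t_0$.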
Hence the uniform separation property holds in $M(h)$ with $\epsilon_0$ chosen above and some $\delta(h) < 1/4$.

The above results show that to minimize in the extended class $\mathcal{M}(g)$ one can minimize in a fixed class $M(h)$ for some $h \in E(g)$, and that minimizing sequences in this class never gain complexity. New transitions cannot form in the limit of a minimizing sequence by Lemma 5.3, and the number of crossings between transitions is fixed by the class $M(h)$. The remainder of this section is primarily devoted to constructing minimizing sequences which do not lose complexity in the limit and which are bounded in $\chi + H^2$ to exploit the weak lower semicontinuity of $J$. As we will show, this amounts to controlling how much time functions spend near $\pm 1$ in their core interval, and hence we now need to examine what happens when functions are close to $\pm 1$.

For the remainder of this section the constants $h \in G$, $\epsilon_0 > 0$, $\delta > 0$, and $\tau_0 = \max\{\tau_0(F(u - 1)), \tau_0(F(u + 1))\} > 0$ will be fixed so that the uniform separation property holds in $M(h)$ with $\epsilon_0$, $\delta$ and so that the results of the local theory in Sect. 4 apply near both wells. In particular we choose $\delta < \min\{\delta_0(F(u + 1)), \delta_0(F(u - 1)), \delta(h), 1/4\}$. Let $u \in M(h)$, and consider the interval $C(u) = (c_1, c_2)$, where $c_1 = \inf\{t : |u(t) - 1| < \delta\}$ and $c_2 = \sup\{t : |u(t) + (-1)^m| < \delta\}$. We will call $C(u)$ the $\delta$-core interval of $u$. Observe that if $u$ has the properties stated in the uniform separation condition, then its $\delta$-core interval is contained in its core interval. Now let $S(u) = \{t : |u(t) \pm 1| < \delta\}$ be the set of all points for which $u$ is in the strips of width $\delta$ around $\pm 1$, and define $B[u] = |C(u) \cap S(u)|$, where $|\cdot|$ denotes the Lebesgue measure. To control minimizing sequences for $J$ in $M(h)$ we first analyze the minimization problem
$$B_\epsilon = \inf\{B[u] : u \in M(h),\ u \text{ is normalized, and } J[u] \le \mathcal{J}(h) + \epsilon\}, \tag{5.5}$$
for $\epsilon < \epsilon_0$. The purpose of this auxiliary minimization is contained in the following lemma.
Lemma 5.4. For any normalized $u \in M(h)$ with $J[u] \le \mathcal{J}(h) + \epsilon_0$,
$$\|u - \chi\|_{H^2} \le C(h)\,(1 + J[u] + B[u]), \quad\text{and}\quad |C(u)| \le C(h) \cdot J[u] + B[u].$$

Before proving this lemma we introduce some notation. Due to the translation invariance of both $J$ and $B$, we will always translate normalized functions in $M(h)$ so that $c_1 = 0$, and the $\delta$-core interval is $C(u) = (0, c_2)$. Note that $\|u'\|^2_{H^1} \le CJ[u]$, and hence we must control $\|u - \chi\|_{L^2}$ to prove Lemma 5.4. Define $H_{-1} = \chi_{-1} \equiv -1$ and $H_1$ to be the Heaviside function, i.e. $H_1(t) = -1$ for $t < 0$ and $H_1(t) = 1$ for $t > 0$. Then $(H_{\pm 1} - \chi_{\pm 1}) \in L^2$, and it suffices to estimate $\|u - H_{(-1)^m}\|_{L^2}$, where $m = |h|$. Since $|h|$ determines $H_{\pm 1}$ for the entire class $M(h)$, we will drop the subscript on $H$ as well as on $\chi$.

Proof of Lemma 5.4. Using the uniform separation property, we can estimate
$$\|u - H\|^2_{L^2} \le \int_{-\infty}^0 |u - (-1)|^2\,dt + \int_{C(u)} |u - (-1)^m|^2\,dt + \int_{c_2}^\infty |u - (-1)^m|^2\,dt$$
$$\le \frac{1}{\eta^2}\int_{-\infty}^0 F(u)\,dt + \int_{C(u)} |u - (-1)^m|^2\,dt + \frac{1}{\eta^2}\int_{c_2}^\infty F(u)\,dt,$$
where $\eta$ satisfies property (H2) in Sect. 2 for both $\alpha = \delta - 1$ and $\alpha = (-1)^m(1 - \delta)$. Since $\delta$ is fixed, $\eta$ can also be chosen small enough so that $F(u) \ge \eta^2(u \pm 1)^2$ for $|u \pm 1| > \delta$ to obtain
$$\int_{C(u)} |u - (-1)^m|^2\,dt = \int_{C(u) \cap S(u)} |u - (-1)^m|^2\,dt + \int_{C(u) \setminus S(u)} |u - (-1)^m|^2\,dt \le (2 + \delta)^2 B[u] + \frac{1}{\eta^2}\int_{C(u) \setminus S(u)} F(u)\,dt.$$
Therefore combining these two estimates yields
$$\|u - H\|^2_{L^2} \le \frac{1}{\eta^2}\int_{\mathbb{R}} F(u)\,dt + (2 + \delta)^2 B[u] \le C(J[u] + B[u]),$$
and the estimate of $\|u - \chi\|_{H^2}$ follows. The bound on the length of the $\delta$-core interval follows immediately from the second inequality in Lemma 5.1.

Theorem 5.5. There exists a constant $K = K(h, \tau_0)$ such that for every $\epsilon < \epsilon_0$ there is a normalized $u \in M(h)$ with $J[u] \le \mathcal{J}(h) + 2\epsilon$ and $B[u] \le K$.
Proof. Let $m = |h|$. Choose a minimizing sequence $u_n \in M(h)$ for $B$ in the minimization problem (5.5) with a fixed $\epsilon < \epsilon_0$. Then $\|u_n - \chi\|_{H^2}$ is bounded by Lemma 5.4. Hence there exists $\hat u \in \chi + H^2(\mathbb{R})$ such that $u_n$ approaches $\hat u$ weakly in $\chi + H^2(\mathbb{R})$, denoted $u_n \rightharpoonup \hat u$, which implies in particular that $u_n \to \hat u$ uniformly in $C^1$ on any compact interval. Since $J[u_n]$ and $B[u_n]$ are uniformly bounded, the points $c_1, c_2$ in the definition of the $\delta$-core interval are finite for $\hat u$, so $C(\hat u)$ is a compact interval. Since the set $S(u)$ is defined to be the points where $u$ is in the open strips around $\pm 1$, it follows from Fatou's lemma that
$$|C(\hat u) \cap S(\hat u)| \le \liminf_{n \to \infty} |C(u_n) \cap S(u_n)|,$$
and hence $B[\hat u] \le \liminf_{n \to \infty} B[u_n]$. So $\hat u$ satisfies the estimate in (5.5), and $B[\hat u] \le B_\epsilon$. Let
$$K(h, \tau_0) = 2\Bigl(\tau_0\bigl(\max_{1 \le i \le m} h_i + 2\bigr) + 2\Bigr)\sum_{i=1}^m h_i.$$
Note that the number of intervals of maximal monotonicity in the $\delta$-core interval of any normalized function in $M(h)$ is exactly $\sum_i h_i - m + 1$. Since the sequence $u_n$ is normalized and converges strongly in $C^1_{loc}$, the number of maximal monotonicity intervals in the $\delta$-core of $\hat u$ is bounded by $\sum_i h_i$. Likewise, $\hat u$ has finitely many local extrema except for possibly intervals of local extrema. However, all intervals of critical points can be clipped to a point, which reduces both $J$ and $B$. Therefore we can assume that $\hat u$ has finitely many local extrema in its $\delta$-core interval. The number of transitions is preserved in the limit at $m + 1$.

Now there are two possibilities: either $B[\hat u] > K$ or $B[\hat u] \le K$. In the first case, we will construct a normalized function $\hat v \in M(h)$ such that
$$J[\hat v] < J[\hat u] \le \mathcal{J}(h) + \epsilon, \quad\text{and}\quad B[\hat v] < B[\hat u] \le B_\epsilon,$$
which contradicts the definition of $B_\epsilon$ in (5.5) and therefore excludes this case. In the second case, we will construct a normalized function $\hat v \in M(h)$ such that
$$J[\hat v] < J[\hat u] + \epsilon \le \mathcal{J}(h) + 2\epsilon, \quad\text{and}\quad B[\hat v] < B[\hat u] \le K,$$
which means that $\hat u$ can be pushed back into the class $M(h)$ at the expense of slightly increasing the action only. Then $\hat v$ satisfies the requirements of the theorem. The construction of $\hat v$ in both cases is very similar, and we now implement it in detail for the first case.

Step 1. Modify $\hat u$ on an interval where $J$ is not optimal, see Fig. 5.1. First, note that the number of components of $C(\hat u) \cap S(\hat u)$ is at most twice the number of maximal monotonicity intervals in $C(\hat u)$, which is therefore at most $2\sum_i h_i$. Hence there must be a component $I$ with $|I|$ larger than $K/(2\sum_i h_i) = \tau_0(\max h_i + 2) + 2$. This implies that $I$ contains a subinterval $[a_1, a_2]$ with small boundary data $|\hat u(a_i) \pm 1| < \delta$, $|\hat u'(a_i)| < \delta$, and yet large length $|a_2 - a_1| > \tau_0(\max h_i + 2)$. Note that $\hat u$ has at most $\max h_i$ crossings of $\pm 1$ in $[a_1, a_2]$, and by Theorem 4.2 the global minimizer of $J$ over the finite interval $[a_1, a_2]$ with these small boundary conditions has at least $\max h_i + 2$ crossings. Hence $\hat u$ does not minimize $J$ on the interval $[a_1, a_2]$. Therefore we can replace $\hat u$ by the minimizer on $[a_1, a_2]$ to construct a new function $\hat v \in \chi + H^2(\mathbb{R})$ with $J[\hat v] < J[\hat u] \le \mathcal{J}(h) + \epsilon$. Also $B[\hat v] \le B[\hat u]$ since $[a_1, a_2] \subset C(\hat u) \cap S(\hat u)$.
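The choice of $K$ in Step 1 is a pigeonhole count, which can be sketched as follows. The symbol $n_c$ for the number of components of $C(\hat u) \cap S(\hat u)$ is introduced here only for illustration, and the vector $h = (2, 4)$ in the second display is a hypothetical example, not one taken from the text.

```latex
\[
n_c \le 2\sum_i h_i, \qquad
B[\hat u] = \sum_{\text{components } I} |I| > K
\;\Longrightarrow\;
\max_I |I| \;\ge\; \frac{B[\hat u]}{n_c} \;>\; \frac{K}{2\sum_i h_i}
 \;=\; \tau_0\bigl(\max_i h_i + 2\bigr) + 2 .
\]
% Hypothetical illustration with h = (2,4):
\[
\sum_i h_i = 6,\qquad \max_i h_i = 4,\qquad
K = 2\,(6\tau_0 + 2)\cdot 6 = 72\,\tau_0 + 24,\qquad
\frac{K}{2\cdot 6} = 6\,\tau_0 + 2 .
\]
```

The surplus length $2$ beyond $\tau_0(\max h_i + 2)$ is what leaves room to select the subinterval $[a_1, a_2]$ inside $I$ with the stated boundary data.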
Step 2. Modify $\hat v$ so that $B[\hat v] < B[\hat u]$. Since $I$ is a component of $C(\hat u) \cap S(\hat u)$, we know that $|\hat v \pm 1| = \delta$ at the endpoints of $I$. Also $\hat v$ is strictly monotone at the endpoints of $I$, since intervals of critical points have been clipped out. One can easily perturb $\hat v$ near the endpoint of $I$ so that $B[\hat v] < B[\hat u]$. Specifically, suppose $b = \inf I$ and $\hat v$ is increasing at $b$, but possibly $\hat v'(b) = 0$. Choose $b_1 < b < b_2$ such that $\hat v'(b_i) > 0$ and $\hat v$ is increasing on $[b_1, b_2]$. Define
$$\hat\psi_\nu(t) = \begin{cases} \psi_0\bigl(\frac{t - b_1}{\nu}\bigr) & \text{for } t \le b_1, \\ 1 & \text{for } b_1 \le t \le b_2, \\ \psi_0\bigl(\frac{t - b_2}{\nu}\bigr) & \text{for } t \ge b_2, \end{cases}$$
where $\psi_0$ is defined by formula (5.4). The support of $\hat\psi_\nu$ is $[-\nu + b_1, b_2 + \nu]$, and note that $b_1, b_2$ can be chosen arbitrarily close to $b$. Let $\hat w = \hat v \pm \nu^3\hat\psi_\nu(t)$. For $\nu > 0$ sufficiently small, this perturbation shrinks the component $I$ of $C(\hat v) \cap S(\hat v)$ to the right of $b$ so that $B[\hat w] < B[\hat v] \le B[\hat u]$. Also $J[\hat w] < \mathcal{J}(h) + \epsilon$ since $J$ is uniformly continuous over the support of $\hat\psi_\nu$ by Lemma 5.1, and $\hat w$ has exactly the same number of local extrema as $\hat v$ by the choice of $b_i$. Set $\hat v = \hat w$, and $\rho = B[\hat u] - B[\hat v] > 0$.

Step 3. Eliminate tangencies in $\hat v$ at $\pm 1$. The function $\hat v$ is possibly not in any class $M(h)$ as defined in Sect. 2 because crossings of $\pm 1$ could have coalesced into tangencies at $\pm 1$ in the limit $u_n \rightharpoonup \hat u$. No crossings are added due to the postulated uniform separation property. However, as will be described below, near each place where such a tangency occurs $\hat v$ can be perturbed to add as many tiny oscillations as are necessary to put $\hat v$ back into a class $M(k)$ with $k_i \ge h_i$ for $i \le m$. Suppose $\hat v$ has $N$ tangencies at $\pm 1$ in its core interval. Each point of tangency, $t_0$, can be widened to an interval of length $\rho/2N$ on which $\hat v$ is identically $\pm 1$. Then $B[\hat v] = B[\hat u] - \rho/2$, but $J$ is unchanged.
For $0 < \nu < \rho/4N$ sufficiently small and $n = (\max h_i + 2)/2$, the scaled function $\nu^3\psi_n((t - t_0)/\nu)$, defined by formula (5.4), can be added to $\hat v$ on this interval to create at least $\max h_i + 2$ crossings of $\pm 1$ in a neighborhood of $t_0$. This procedure can be done at each point of tangency. For $\nu$ small enough, we still have $J[\hat v] < \mathcal{J}(h) + \epsilon$ and $B[\hat v] = B[\hat u] - \rho/2$.

Step 4. Adjust crossings in the tails. Still the function $\hat v$ is possibly not in any class as defined in Sect. 2 because all of the crossings in the tails may have been lost at $\pm\infty$ in the limit $u_n \to \hat u$. The definition of the classes $M(h)$ requires at least one crossing in each of the tails. However, since $\hat v \in \chi + H^2(\mathbb{R})$, $\|(\hat v(t), \hat v'(t)) - (\pm 1, 0)\| \to 0$ as $|t| \to \infty$. We perturb $\hat v$ by adding functions of the form $\pm\nu^3\psi_0((t \pm t_0)/\nu)$ to each tail. For $\nu$ sufficiently small, there is a point $t_0$ sufficiently large such that two transverse crossings are added to each tail in $\hat v$, and $J[\hat v] < \mathcal{J}(h) + \epsilon$ and $B[\hat v] = B[\hat u] - \rho/2$.

Step 5. Clip $\hat v$ so that $\hat v$ is normalized in $M(h)$, see Fig. 5.1. Finally we can perturb $\hat v$ to make it a Morse function on its $\delta$-core interval. Outside of $I$, the function $\hat v$ has finitely many local extrema, and hence, for a small enough perturbation, $B$ increases only slightly. Since $I$ is a component of $C(\hat u) \cap S(\hat u)$, no perturbation in $I$ can increase $B$ beyond $B[\hat u]$. Thus, for a small enough perturbation, $B[\hat v] < B[\hat u]$, and also $J[\hat v] < \mathcal{J}(h) + \epsilon$. Now $\hat v$ can be normalized by clipping, which possibly reduces but does not increase either $B$ or $J$. Thus $\hat v$ is a normalized function in $M(k)$ with $k_i \ge h_i$ for $i \le m$. As in Lemma 3.6, crossings can be removed by clipping to put $\hat v$ into $M(h)$. Again note that clipping can only reduce both $B$ and $J$. Therefore we have constructed a normalized $\hat v \in M(h)$ with $J[\hat v] < \mathcal{J}(h) + \epsilon$ and $B[\hat v] < B[\hat u]$, which completes the first case described above. In the second case, we
skip Step 1 and begin with $\hat v = \hat u$ and $I$ any component of $C(\hat u) \cap S(\hat u)$ in Step 2. Since the action $J$ has not been decreased (as in Step 1 of the previous case), Step 2 possibly results in a small increase in $J$ and yields a function $\hat v$ with
$$J[\hat v] < \mathcal{J}(h) + 2\epsilon \quad\text{and}\quad B[\hat v] < B[\hat u].$$
Then Steps 3–5 further modify $\hat v$ to a normalized function in $M(h)$ conserving the above inequalities which are required by the theorem. This completes the proof of the theorem.
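The smallness of the perturbations $\nu^3\psi_n((t - t_0)/\nu)$ used in Steps 3 and 4 can be verified by a short scaling computation. This is a sketch only; the name $\varphi_\nu$ is introduced here for illustration, $n$ is held fixed, and $0 < \nu \le 1$ is assumed.

```latex
\[
\varphi_\nu(t) = \nu^3\,\psi_n\!\Bigl(\frac{t-t_0}{\nu}\Bigr),
\qquad
\|\varphi_\nu^{(k)}\|_{L^2}^2
 = \nu^{2(3-k)}\cdot\nu\,\|\psi_n^{(k)}\|_{L^2}^2
 = \nu^{\,7-2k}\,\|\psi_n^{(k)}\|_{L^2}^2,
\quad k = 0,1,2,
\]
\[
\|\varphi_\nu\|_{H^2} \le \nu^{3/2}\,\|\psi_n\|_{H^2},
\qquad
\|\varphi_\nu\|_{L^\infty} \le \nu^3,
\qquad
\operatorname{supp}\varphi_\nu \subset [\,t_0-\nu,\; t_0+\nu\,].
\]
% Sign changes: (s^2-1)^2 > 0 on (-1,1), and cos(n\pi s) vanishes at
% s = (2j+1)/(2n), j = -n, ..., n-1, i.e. at 2n interior points.
\[
\#\{\,s\in(-1,1) : \psi_n(s) = 0\,\} = 2n = \max_i h_i + 2
\quad\text{for } n = \tfrac12\bigl(\max_i h_i + 2\bigr).
\]
```

Both the $H^2$ norm and the support shrink as $\nu \to 0$, which is the setting in which the uniform continuity of $J$ from Lemma 5.1 applies, while the number of sign changes depends only on $n$.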
Fig. 5.1. The dashed curve represents the minimizer of $J$ on the interval $[0, T]$ from Theorem 4.1, which oscillates by Theorem 4.2. After pasting in the dashed curve, the function is clipped to restore the correct number of crossings. See Steps 1 and 5 in the proof of Theorem 5.5. A similar technique is used to prove Theorem 2.2.
Now we are in a position to prove Theorem 2.2 as stated in Sect. 2.

Proof of Theorem 2.2. Given $g \in G$, consider any $h \in E(g)$. By Lemma 5.3, the class $M(h)$ has the uniform separation property. Hence $\epsilon_0$ and $K$ are chosen so that Theorem 5.5 holds, and minimizing sequences for $J$ in the minimization problem (5.3) can be chosen in the fixed class $M(h)$. Therefore we finish the proof with the following general statement.

Claim 5.6. If the uniform separation property holds in $M(h)$, then there exists $\hat u \in M(h)$ which is a local minimizer of $J$ and satisfies all of the properties listed in Theorem 2.2.

Theorem 5.5 implies that a minimizing sequence of normalized functions $u_n \in M(h)$ exists with the property that $B[u_n]$ is uniformly bounded. Hence $\|u_n - \chi\|_{H^2}$ is uniformly bounded by Lemma 5.4, and there exists $\hat u \in \chi + H^2(\mathbb{R})$ such that $u_n \rightharpoonup \hat u$. Since $J$ is sequentially weakly lower semicontinuous, $J[\hat u] \le \mathcal{J}(h)$. We will now show that $\hat u$ is in the class $M(h)$ and satisfies the desired properties. Since $B[u_n]$ is uniformly bounded, the $\delta$-core interval is uniformly bounded by Lemma 5.4, and hence $\hat u$ has $|h| + 1$ transitions. Recall that we have factored out translations by pinning down the left endpoint of the $\delta$-core interval at the origin.

First we would like to conclude that $\hat u \in M(h)$, which would require that $\hat u$ have the correct number $\sum_i h_i + 2$ of transverse crossings in its core interval. Note that $\hat u$ cannot gain crossings due to the uniform separation property. Since $u_n \in M(h)$ are normalized and converge in $C^1_{loc}$ to $\hat u$, the only way crossings can be lost is if some number of them coalesce into a tangency at $\pm 1$ or the crossings at the endpoints of the core interval approach $\pm\infty$. If neither of these situations occurs, then the intervals of maximal monotonicity in the core interval are preserved in $\hat u$, and hence $\hat u \in M(h)$.

First suppose that one of the endpoints of the core interval approaches $\pm\infty$ as $n \to \infty$. Then $\hat u$ has no crossings in one or both of its tails, but $\lim_{t \to \pm\infty}(\hat u(t), \hat u'(t)) = (\pm 1, 0)$
since $\hat u \in \chi + H^2(\mathbb{R})$. Therefore there is an interval $I$ in the tail with length $|I| > \tau_0$ and $\|(\hat u, \hat u') - (\pm 1, 0)\| < \delta$ at the endpoints of $I$. As in Step 1 of the proof of Theorem 5.5, we can replace $\hat u$ by the local minimizer of $J$ on $I$ with the appropriate boundary conditions (Theorem 4.1). By Theorem 4.2 this yields a function $\hat w$ with at least one crossing in both tails and $J[\hat w] < J[\hat u]$.

Now suppose $\hat w$ has a certain number of tangencies at $\pm 1$. Each point of tangency $p$ can be stretched to an arbitrarily long interval $K_p$ by gluing in a long segment on which the function is identically $\pm 1$. The resulting function has the same action as $\hat w$ and will still be denoted by $\hat w$. On a slightly larger interval $L_p$ containing $K_p$ the function $\hat w$ has boundary data close but not equal to $(\pm 1, 0)$. Again as in Step 1 of the proof of Theorem 5.5, we can replace $\hat w$ on $L_p$ with the minimizer of the boundary value problem to obtain $\hat v$ with reduced action $J[\hat v] < J[\hat u]$, see Fig. 5.1. By Theorem 4.2, we can assume each interval $L_p$ contains at least $\max h_i$ crossings by choosing $|K_p| \ge \tau_0 \max h_i$. A small perturbation will make $\hat v$ a Morse function with transverse crossings in the core interval. So $\hat v \in M(k)$ for some $k \in G$ such that $|k| = |h| = m$ and $k_i \ge h_i$ for all $i \le m$. However $\mathcal{J}(k) \le J[\hat v] < J[\hat u] \le \mathcal{J}(h)$, which contradicts Lemma 3.6. This proves that $\hat u \in M(h)$, and hence $J[\hat u] \ge \mathcal{J}(h)$. Therefore $\hat u$ is a minimizer of $J$ over $M(h)$.

Now that the existence of a $\hat u \in M(h)$ which minimizes over the class $\mathcal{M}(g)$ has been established, we can characterize the behavior of $\hat u$ in the tails. First, all crossings of $\pm 1$ must be transverse by the above argument. Therefore the number of crossings is at most countable. As $t \to \pm\infty$, $\hat u$ must have infinitely many crossings; otherwise there would be an arbitrarily long interval on which $\hat u$ could be replaced by the local solution from Theorems 4.1 and 4.2, which reduces $J$ and contradicts the fact that $\hat u$ is a minimizer.
Since $\hat u$ is the weak $H^2$ limit (strong $C^1_{loc}$ limit) of normalized functions $u_n \in M(h)$, between each pair of consecutive crossings there is exactly one local extremum, and the transitions are monotone. Recall that intervals of critical points are removed. Finally, without loss of generality assume $\hat u \to +1$ as $t \to \infty$ and consider the right tail; the other cases are similar. Let $t_1$ be the location of the first maximum to the right of the core interval, and suppose the absolute maximum in the right tail is located at $t_2 > t_1$. Let $s_1$ and $s_2$ be the locations of the local minima immediately preceding $t_1$ and $t_2$ respectively. Then $s_1 < t_1 < s_2 < t_2$ and Lemma 3.1 can be applied to the intervals $[s_1, t_1]$ and $[s_2, t_2]$ to clip $\hat u$, which is a contradiction. A similar argument shows that the first minimum to the right of the core interval is the absolute minimum in the right tail. Repeating an analogous argument shows that the maxima are strictly decreasing and the minima are strictly increasing as $t \to \infty$. This completes the proof of the claim and Theorem 2.2.

Proof of Corollary 2.3. The corollary follows immediately from the definition of the partial order $\prec$ since there is no $h \in G$ such that $h \succ g$ when $g_i = 2$ for all $i \ge 1$.
6. Well-Separated Transitions

Now we would like to investigate in more detail the relationship between the infima over the various classes $M(g)$, and prove Theorem 1.2. In particular, we are interested in finding conditions either on the class or on the nonlinearity $F$ which imply that a local minimizer over $\mathcal{M}(g)$ exists in the class $M(g)$ without increasing the complexity to some $M(h)$ with $h \succ g$. As will be proven in the next section, this can be done for all $g \in G$ if $F$ is even, because the symmetry provides an extra tool for clipping. Also this
automatically holds for all $g$ with $g_i = 2$ for all $i = 1, \ldots, |g|$, since there is no $h \in G$ with $h \succ g$ (Corollary 2.3). In this section we provide additional arguments which yield results similar to those in more conventional multibump constructions, cf. [26]. Intuitively the basic idea is that the interaction between transitions should be weak if they are all separated by large distances. Thus, if the numbers of oscillations between all the transitions are large enough, i.e. $g_i \gg 1$ for all $i \ge 1$, one would expect that a local minimum should be attained in the class $M(g)$. We begin with the following lemma.

Lemma 6.1. Suppose that $g_i > 2$ for all $i = 1, \ldots, m$. There exists a universal constant $C_0$ such that if $u \in M(g)$ with $J[u] \le \mathcal{J}(g) + 1$, then $J[u|_{I_i}] \le C_0$.

Proof. Fix $i \in \{1, \ldots, m\}$, and for specificity assume that it is odd. Let $u(a)$ be the last local maximum before $I_i = \mathrm{conv}(A_i)$ and $u(b)$ be the first local maximum after $I_i$. Since $g_{i \pm 1} > 2$, we have $-1 < u(a), u(b) < 1$. Note that in the case $i = 1$, it is possible that there is no local maximum in the left tail with $-1 < u(a)$. However, in this case a small perturbation of $u$ can be made to create such a point with $J[u] \le \mathcal{J}(g) + 1 + \epsilon$, which does not substantially affect the following argument. A similar situation can occur for $i = m + 1$. Let $\xi_u(t)$ be defined by
$$\xi_u(t) = \begin{cases} (u(a) + 1)\psi_0(t + 3) + 2\psi_0(t + 1) - 1 & \text{for } t \in [-3, -1], \\ 1 & \text{for } t \in [-1, 1], \\ (u(b) + 1)\psi_0(t - 3) + 2\psi_0(t - 1) - 1 & \text{for } t \in [1, 3], \end{cases} \tag{6.1}$$
where $\psi_0$ is defined in (5.4). Clearly $J[\xi_u] < 8J[\psi_0]$. Note that $\xi_u(-3) = u(a)$, $\xi_u(3) = u(b)$, and $\xi_u'(\pm 3) = 0$. Also $\xi_u$ has a "W" shape which can easily be adjusted via an arbitrarily small $C^\infty$ perturbation to obtain $\hat\xi_u$ with two transversal crossings of $-1$ followed by $g_i$ transversal crossings of $1$ and another two transversal crossings of $-1$. This perturbation can be accomplished exactly as in Step 3 in the proof of Theorem 5.5.
Now, replace $u$ over the interval $[a, b]$ by $\hat\xi_u$ (after adjusting the domain) to obtain a function $v \in M(g)$. Using $\mathcal{J}(g) \le J[v]$ and $J[u] \le \mathcal{J}(g) + 1$, we can estimate $-1 \le J[v] - J[u] = J[\hat\xi_u] - J[u|_{[a,b]}]$. Hence
$$J[u|_{I_i}] \le J[u|_{[a,b]}] \le J[\hat\xi_u] + 1 \le 8J[\psi_0] + 1 = C_0,$$
which completes the proof.
Lemma 6.2. There exists $N > 0$ such that whenever $g \in G$ with $g_i > N$ for all $i \le |g|$, the uniform separation property holds in $M(g)$.

Proof. Let $\kappa$ be the lower bound on the action of any transition from Corollary 5.2, and fix $\rho \le \min\{\delta_0(F(u + 1)), \delta_0(F(u - 1)), \sqrt{\kappa/3C}, 1/4\}$, where $\delta_0$ and $C$ are as in Theorem 4.1.

Claim. There exists $N \ge 4$ such that for any $g \in G$ satisfying the above conditions the following property holds: given any normalized $u \in M(g)$ with $J[u] \le \mathcal{J}(g) + 1$ and any interval $I_i$, $i \le m$, which necessarily contains $N$ consecutive crossings of $(-1)^{i+1}$, then $|u(t_0) - (-1)^{i+1}| < \rho$ for some $t_0 \in I_i$ at which $u$ has a local extremum.
From Lemma 6.1, we have $J[u|_{I_i}] \le C_0$. The set $A$ of locations of local extrema of $u$ in $I_i$ has cardinality at least $N - 1$. From Lemma 5.1 we obtain $J[u|_{I_i}] \ge \#(A \cap \{t \in I_i : |u(t) \mp 1| > \rho\})\,C_\rho$. Combining these two estimates, $A \cap \{t \in I_i : |u(t) \mp 1| \le \rho\}$ is nonempty for $N$ sufficiently large. Choosing $t_0$ in this set proves the claim.

Assume the uniform separation property fails in $M(g)$ for $\epsilon_0 = \kappa/2$ and all $\delta > 0$. Then there exists a normalized $u \in M(g)$ with $J[u] \le \mathcal{J}(g) + \kappa/2$ and $s_0 \in I_i$ such that $|u(s_0) - (-1)^i| \le \delta$ for some $i \le m$. For specificity, we can assume $i = 1$, and without loss of generality we can take $u(s_0)$ to be the smallest local minimum on the interval between the zeroes of the first two transitions. Since $g_1 > N$, the interval $I_1$ contains a point $t_0$ satisfying the properties of the claim. First we will clip out the large oscillation near $s_0$, which lowers $J$ by at least $\kappa$, and then we will insert crossings at $t_0$, which increases $J$ by at most $C\rho^2$. Thus, for any sufficiently small $\delta > 0$, we will construct $u^* \in M(g)$ such that
$$\mathcal{J}(g) \le J[u^*] \le \mathcal{J}(g) - \frac{1}{12}\kappa,$$
which will yield a contradiction. The insertion of crossings at $t_0$ is straightforward. First cut the function $u$ at $t_0$. Then insert the minimizer of $J$ on $[t_0, t_0 + T]$ with boundary conditions $(u(t_0), u'(t_0))$ at both ends. For $T$ sufficiently large any number of crossings are added to $u$ by Theorem 4.2 with an increase in the action of less than $C\rho^2$ by Theorem 4.1.

Clipping out the large oscillation near $s_0$ requires a more technical argument which is similar to the proof of Lemma 3.6. Let $s_1 < s_0 < s_2$ be the locations of the neighboring local maxima immediately adjacent to $s_0$. First assume that neither $s_1$ nor $s_2$ is the endpoint of a transition, i.e. if $a$ and $b$ are the left/right endpoints of the maximal interval of monotonicity ending/beginning at $s_1$ and $s_2$ respectively, then $u(a), u(b) \ge u(s_0) > -1$. In this case the oscillation can be removed by clipping over the intervals $[a, s_1]$ and $[s_0, s_2]$ or $[s_1, s_0]$ and $[s_2, b]$ if $u(s_1) > u(s_2)$ or $u(s_1) < u(s_2)$ respectively. There is the trivial clipping if $u(s_1) = u(s_2)$.

A case where the above argument does not work is when $u(s_2) > u(s_1)$ and $[s_2, b]$ is a transition, i.e. $u(b) < -1$. The other case involving a transition is similar. Immediately to the right of $b$ is either an interval with at least $N$ crossings or the tail of $u$, which has infinitely many transverse crossings because $u$ is normalized. Let $c$ and $d$ be the locations of the first local maximum to the right of $b$ and the adjacent local minimum to its right. Since there cannot be another transition immediately to the right of $b$, we know that $u(c) < 1$ and $u(d) < -1$. To clip we need $u(s_0) < u(c)$. If $u(s_0) > u(c)$, then since $|u(s_0) + 1| < \delta$, we can perturb $u$ in a neighborhood of $s_0$ so that $-1 < u(s_0) < u(c)$. For $\delta$ small enough, this can be accomplished with at most $\kappa/12$ increase in the action.
Now the large oscillation can be removed by clipping over the intervals $[s_1, s_0]$ and $[c, d]$, which yields
$$\mathcal{J}(g) \le J[u^*] \le J[u] - \kappa + C\rho^2 + \frac{1}{12}\kappa \le \mathcal{J}(g) - \frac{1}{12}\kappa.$$
This estimate provides the desired contradiction.
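The constants in this final estimate can be traced as follows. This is a sketch that uses only the assumption $J[u] \le \mathcal{J}(g) + \kappa/2$ and the choice $\rho \le \sqrt{\kappa/3C}$ from the start of the proof, on the understanding that the constant $C$ in $C\rho^2$ is the same constant from Theorem 4.1.

```latex
\[
C\rho^2 \;\le\; C\cdot\frac{\kappa}{3C} \;=\; \frac{\kappa}{3},
\]
\[
J[u^*] \;\le\; J[u] - \kappa + C\rho^2 + \frac{\kappa}{12}
\;\le\; \mathcal{J}(g) + \Bigl(\frac{1}{2} - 1 + \frac{1}{3} + \frac{1}{12}\Bigr)\kappa
\;=\; \mathcal{J}(g) + \frac{6 - 12 + 4 + 1}{12}\,\kappa
\;=\; \mathcal{J}(g) - \frac{\kappa}{12}.
\]
```

Since $u^* \in M(g)$ forces $J[u^*] \ge \mathcal{J}(g)$, the strict gap of $\kappa/12$ gives the contradiction.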
Lemma 6.2 states that the uniform separation property holds in M (g), and hence Claim 5.6 in the proof of Theorem 2.2 immediately yields the following theorem.
Theorem 6.3. Let $g \in G$. If $g_i$ is sufficiently large for all $i \le |g|$, then there exists a solution $\hat u \in M(g)$ of (1.2) which is a local minimizer of $J$ restricted to $M(g)$.
7. Symmetric Case

In Sect. 5 we found a local minimizer $\hat u$ of $J$ in every class M(g) which is a union of classes M (h) with h g. A priori there is no guarantee that $\hat u$ lies in the primary subset $M(g)$ unless $g_i = 2$ for all $i \le |g|$ (Corollary 2.3), or the $g_i$ are all large (Theorem 6.3). The purpose of this section is to prove that if $F$ is even, then a local minimizer exists in every class $M(g)$. For example the result will apply to the EFK Eq. (1.3) in which the potential is $F(u) = (u^2 - 1)^2/4$.

Theorem 7.1. Suppose that $F$ satisfies the hypothesis (H1) and $\pm 1$ are saddle-foci. If $F$ is even, then for each $g = (g_1, \dots, g_m) \in G$ there is a local minimizer $\hat u \in M(g)$ which satisfies all the properties listed in Theorem 2.2.

Remark. The original proof of this theorem contained the restriction that $g_i > 2$ for all $i$. An argument by J. VandenBerg [47], given as part of the proof of Lemma 7.4 below, allows this restriction to be removed.

The existence of $\hat u$ will follow from Claim 5.6 in the proof of Theorem 2.2, after we have established that the uniform separation property holds in $M(g)$.

Proposition 7.2. If $F$ is even, then the uniform separation property holds in $M(g)$ for any $g = (g_1, \dots, g_m) \in G$.

In this section we will assume that $F$ is even, and we will need to consider all classes $M^{\pm}(g)$ as defined in the introduction, where $\lim_{t\to-\infty} u(t) = \pm 1$ for $u \in M^{\pm}(g)$. In the preceding sections we used only the classes $M(g) = M^{-}(g)$, but the classes $M^{+}(g)$ are simply their reflections, $M^{+}(g) = \{-u : u \in M(g)\}$. Since $F$ is even, the infima in these classes are the same, $J(g) = J^{-}(g) = J^{+}(g)$. Proposition 7.2 relies on two lemmas that allow for comparison of the infima $J(g)$ for various $g$ in a way similar to Lemma 3.6.

Lemma 7.3. For any $g = (g_1, \dots, g_m) \in G$,
$$J(g) \le J((g_1, \dots, g_{i-1})) + J((g_{i+1}, \dots, g_m)) \quad \text{for } 1 < i < m,$$
$$J(g) \le J(0) + J((g_2, \dots, g_m)), \quad \text{and} \quad J(g) \le J((g_1, \dots, g_{m-1})) + J(0).$$

Remark. As will be evident in the proof, estimates similar to those in Lemma 7.3 using $J^{\pm}$ hold without requiring symmetry. However, the next lemma does assume $F$ to be even. An example of a non-even $F$ and a class $M(g)$ without a local minimizer is not known.

Lemma 7.4. There is a universal constant $\kappa = \kappa(\beta, F) > 0$ such that
i) $J(0) < J((g_1)) - \kappa$ for any $g_1 \in 2\mathbb{N}$,
ii) $J((g_2, \dots, g_m)) < J(g) - \kappa$ for $g = (g_1, \dots, g_m) \in G$,
iii) $J((g_1, \dots, g_{m-1})) < J(g) - \kappa$ for $g = (g_1, \dots, g_m) \in G$.
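The saddle-focus hypothesis of Theorem 7.1 can be checked directly for the EFK potential. The following sketch is illustrative only. It assumes that the stationary equation (1.2) has the form $\gamma u'''' - \beta u'' + F'(u) = 0$ (the time-independent version of (8.1)), so that the linearization at $u = \pm 1$ has characteristic polynomial $\gamma\lambda^4 - \beta\lambda^2 + F''(\pm 1) = 0$. The function names and parameter values are hypothetical and do not come from the paper.

```python
import cmath

def linearization_eigenvalues(gamma, beta, f2):
    """Roots of gamma*lam**4 - beta*lam**2 + f2 = 0, with f2 = F''(equilibrium).

    Assumed form of the linearized stationary equation; see the text above.
    """
    disc = cmath.sqrt(beta * beta - 4.0 * gamma * f2)
    # Solve the quadratic in mu = lam**2 first.
    mus = [(beta + disc) / (2.0 * gamma), (beta - disc) / (2.0 * gamma)]
    eigs = []
    for mu in mus:
        root = cmath.sqrt(mu)
        eigs.extend([root, -root])
    return eigs

def is_saddle_focus(gamma, beta, f2, tol=1e-12):
    """True when all four eigenvalues have nonzero real and imaginary parts."""
    return all(abs(lam.real) > tol and abs(lam.imag) > tol
               for lam in linearization_eigenvalues(gamma, beta, f2))

# EFK potential F(u) = (u**2 - 1)**2 / 4 gives F''(u) = 3*u**2 - 1, so F''(+-1) = 2.
F2_AT_WELLS = 3 * 1**2 - 1

print(is_saddle_focus(gamma=1.0, beta=1.0, f2=F2_AT_WELLS))  # case beta**2 < 8*gamma
print(is_saddle_focus(gamma=1.0, beta=3.0, f2=F2_AT_WELLS))  # case beta**2 > 8*gamma
```

Under the stated assumption, the eigenvalues form a complex quadruple exactly when $\beta^2 < 4\gamma F''(\pm 1)$, which for the EFK potential reads $\beta^2 < 8\gamma$.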
W.D. Kalies, J. Kwapisz, R.C.A.M. VanderVorst
First we will assume these lemmas to prove Proposition 7.2, which implies Theorem 7.1.

Proof of Proposition 7.2. Set $\epsilon_0 = \kappa$, where $\kappa$ is as in Lemma 7.4 and Corollary 5.2. Take an arbitrary $u \in M(g)$ such that $J[u] \le J(g) + \epsilon_0/2$, and assume, contrary to the assertion of the proposition, that $|u(t_0) - (-1)^i| < \delta$ for some $t_0 \in I_i$, $i \in \{0, 1, \dots, m, m+1\}$, and small $\delta$ to be specified later. We will consider the case of $i$ odd, as the other case is completely analogous. Let $g_i' := \#(A_i \cap (-\infty, t_0])$ and $g_i'' := \#(A_i \cap [t_0, \infty))$. Note that if $i = m+1$, then $g_i''$ is infinite. From Lemma 7.4, for $1 < i < m$ we have
$$J((g_1, \dots, g_{i-1})) + J((g_{i+1}, \dots, g_m)) \le J((g_1, \dots, g_{i-1}, g_i')) + J((g_i'', g_{i+1}, \dots, g_m)) - 2\epsilon_0,$$
which has a natural counterpart for $i = m$,
$$J((g_1, \dots, g_{m-1})) + J(0) \le J((g_1, \dots, g_{m-1}, g_m')) + J((g_m'')) - 2\epsilon_0,$$
and for $i = m+1$ we have
$$J(g) \le J((g_1, \dots, g_m, g_{m+1}')) - \epsilon_0.$$
This will immediately contradict Lemma 7.3 if we prove the following inequalities:
$$J((g_1, \dots, g_{i-1}, g_i')) + J((g_i'', g_{i+1}, \dots, g_m)) - \epsilon_0 \le J(g) \tag{7.1}$$
for $1 \le i \le m$, and
$$J((g_1, \dots, g_m, g_{m+1}')) - \epsilon_0 < J(g) \tag{7.2}$$
for $i = m+1$. The same simple geometric construction leads to both these inequalities, and we regret that our notation forced these rather cumbersome formulations.

First we will perturb $u$ to a function $v$ which has a tangency with $(-1)^i$. This perturbation will cause a small increase in $J$, which is uniform in $u$ and dependent on $\delta$, and is accomplished as follows. Let $I$ be the largest interval around $t_0$ such that $(-1)^i u|_I \ge 0$. Exactly as in the proof of Lemma 5.4, $|I|$ is uniformly bounded from below because $u$ cannot grow too sharply without increasing $J$. Therefore, there is a uniform $\delta > 0$ for which the perturbations $v := u + \tau\psi_0((t - t_0)/\sqrt{\delta})$, $\tau \in [0, \delta]$, are localized to $I$, and hence all the crossings in $u$ are preserved. Choose $\tau \in [0, \delta]$ so that $v$ is tangent to $(-1)^i$ at some $t_* \in I$. For $\delta$ sufficiently small $J[v] < J(g) + \epsilon_0$. Again, as in the proof of Lemma 5.4, uniformity in $u$ comes from Lemma 5.1. Cutting $u$ at $t_*$ and extending the two pieces by a constant $\pm 1$ we obtain $C^1$ functions $u_-$ and $u_+$ defined by
$$u_-(t) = \begin{cases} u(t) & \text{for } t \le t_*, \\ (-1)^{i-1} & \text{for } t \ge t_*, \end{cases} \qquad \text{and} \qquad u_+(t) = \begin{cases} (-1)^{i-1} & \text{for } t \le t_*, \\ u(t) & \text{for } t \ge t_*. \end{cases}$$
Note that $J[u_+] + J[u_-] = J[v]$, hence $J[u_+] + J[u_-] < J(g) + \epsilon_0$. The two functions account for all the crossings in $u$. Therefore, if not for the tangency that $u_-$ and $u_+$ have in their tails, we would have $u_- \in M((g_1, \dots, g_i'))$ and $u_+ \in M((g_i'', g_{i+1}, \dots, g_m))$. Then the last inequality immediately yields the desired inequalities (7.1) and (7.2). The tangencies can be easily removed with arbitrarily small perturbations that leave the inequality intact.

Now we proceed with the proofs of Lemmas 7.3 and 7.4.
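The additivity $J[u_+] + J[u_-] = J[v]$ used in the proof of Proposition 7.2 can be spelled out in one line. The computation below is a sketch under assumptions that are not restated in this section: the action is taken to be $J[u] = \int_{\mathbb{R}} L(u, u', u'')\,dt$ with $L = \tfrac{\gamma}{2}u''^2 + \tfrac{\beta}{2}u'^2 + F(u)$, the wells are normalized so that $F(\pm 1) = 0$, and the cut is read as being made on the tangent function $v$, whose value at $t_*$ is an equilibrium $e \in \{\pm 1\}$ and whose derivative vanishes there.

```latex
\begin{align*}
J[u_-] + J[u_+]
 &= \int_{-\infty}^{t_*} L(v, v', v'')\,dt + \int_{t_*}^{\infty} L(e, 0, 0)\,dt
  + \int_{-\infty}^{t_*} L(e, 0, 0)\,dt + \int_{t_*}^{\infty} L(v, v', v'')\,dt \\
 &= \int_{-\infty}^{\infty} L(v, v', v'')\,dt = J[v],
 \qquad \text{since } L(e, 0, 0) = F(e) = 0 .
\end{align*}
```

Under these assumptions the two pieces match the constant $e$ to first order at $t_*$, which is why they are $C^1$ and why no extra action is created at the cut.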
Homotopy Classes for Stable Connections
Proof of Lemma 7.3. Fix $\epsilon > 0$ arbitrarily. We will argue for $0 < i < m$. Let $u_- \in M((g_1, \dots, g_{i-1}))$ and $u_+ \in M((g_{i+1}, \dots, g_m))$ be normalized such that $J[u_-] + J[u_+] \le J((g_1, \dots, g_{i-1})) + J((g_{i+1}, \dots, g_m)) + \epsilon/2$. We will paste $u_-$ and $u_+$ together to get $u \in M(g)$. For $s_1, s_2 > 0$ consider the function $v(t)$ defined on $\mathbb{R} \setminus [-1, 1]$ by
$$v(t) = \begin{cases} u_-(t + s_1) & \text{for } t < -1, \\ u_+(t - s_2) & \text{for } t > 1. \end{cases}$$
Since normalized functions have infinitely many transverse crossings which accumulate in the tails at points $-\infty \le a < b \le \infty$, we can choose $s_1$ and $s_2$ so that $v(\pm 1)$ is close to $(-1)^{i+1}$, and $v'(\pm 1)$ is very small. In particular $v$ can be extended across $[-1, 1]$ (using the local solutions of Sect. 4 for example) to construct a $C^1$-function $w$ with $J[w|_{[-1,1]}] < \epsilon/2$. Then $J[w] < J((g_1, \dots, g_{i-1})) + J((g_{i+1}, \dots, g_m)) + \epsilon$, and the construction can be accomplished so that $w$ has transverse crossings of $(-1)^{i+1}$, i.e. $w \in M((g_1, \dots, l, \dots, g_m))$ for some $l > 0$. A number of crossings $l$ is inherited by $w$ from the tails of $u_-$ and $u_+$. Thus $l$ can be arbitrarily large since normalized functions have infinitely many oscillations in their tails. Assume then that $l \ge g_i$. The above estimate together with Lemma 3.6 yield
$$J(g) \le J((g_1, \dots, l, \dots, g_m)) \le J((g_1, \dots, g_{i-1})) + J((g_{i+1}, \dots, g_m)) + \epsilon.$$
Since $\epsilon$ was arbitrary, this completes the proof of the first inequality; the other two are proved analogously.

Proof of Lemma 7.4. (i) From Lemma 3.6 we can assume that $g_1 = 2$. Let $u \in M((2))$ be normalized and translated so that $u(0)$ is the global maximum. Also assume that $J[u|_{(-\infty,0]}] \le J[u|_{[0,+\infty)}]$; the other case is similar. Let $u(a)$ be the local minimum preceding $u(0)$. Define $v$ by $v(t) = u(t)$ for $t < 0$, and $v(t) = -u(-t)$ for $t > 0$. At $t = 0$ there is a jump with the one-sided limits $v(0^-) = u(0)$ and $v(0^+) = -u(0)$. Clearly, $(v(0^+) - v(a))(v(-a) - v(0^-)) = (u(0) + u(a))^2 \ge 0$, so $v|_{[a,0]}$ and $v|_{[0,-a]}$ are latched, i.e. the hypothesis (ii) of Lemma 3.1 is satisfied.
Clipping $v$ yields a function $w$ which is in $M(0)$. Indeed, clipping merges the two transitions of $v$; one is included in $w$ and the other is removed. In particular $J[w] \le J[u] - \kappa$, where $\kappa$ is the uniform (a priori) lower bound on the action of any transition given by Corollary 5.2 (Lemma 5.1). Since $u$ is arbitrary, $J((g_1)) \ge J[w] + \kappa \ge J(0)$.

Statements (ii) and (iii) are completely analogous by considering the map $t \mapsto -t$, and therefore we will only prove (iii).

(iii) As before it is enough to consider the case when $g_m = 2$. Take an arbitrary normalized $u \in M(g)$. For specificity let us assume that $m$ is even so that $u(+\infty) = 1$. Let $u(d)$ be the first local maximum in the tail, $u(b)$ be the last local maximum before the tail, and $u(c)$ be the unique local minimum in $[b, d]$, see Fig. 7.1. In this way $u|_{[b,c]}$ and $u|_{[c,d]}$ are the two last transitions. There are three possibilities.

Case 1a: $u(b) > -u(c)$ with $g_{m-1} > 2$. Let $u(a)$ be the local minimum preceding $u(b)$. Since $g_{m-1} > 2$, then $u(a) > -1$, and so $(-u(a) - u(b)) < 0$. Cutting $u$ at $t = b$ and flipping the tail part leads to $v(t) := u(t)$ for $t \le b$ and $v(t) := -u(t)$ for $t \ge b$. The above inequality and the assumption guarantee that $v|_{[a,b]}$ and $v|_{[b,c]}$ are latched. Thus we can clip $v$, and the resulting function $w$, having lost one transition and none of the $g_{m-1}$ crossings, belongs to $M((g_1, \dots, g_{m-1}))$, see Fig. 7.1. Since a full transition was clipped out, $J[w] \le J[u] - \kappa$, where $\kappa$ is as in (i). Since $u$ is arbitrary, we conclude that $J((g_1, \dots, g_{m-1})) \le J(g) - \kappa$.

Fig. 7.1. An example of flipping and clipping in the proof of Lemma 7.4 (iii), Case 1 (the panels mark the points a, b, c, d)

Case 1b [J. VandenBerg]: $u(b) > -u(c)$ with $g_{m-1} = 2$. If $u(b) \ge -u(a)$, the proof is similar to Case 1a, but notice that $a \in I_{m-2}$ and $u(a) < -1$. If $u(b) < -u(a)$ and $g_{m-2} > 2$, then let $u(e)$ be the local maximum immediately preceding $u(a)$. Cut and flip $u$ at $t = a$ (as described above) to obtain $v(t)$ for which $v|_{[e,a]}$ and $v|_{[a,b]}$ are latched. Clipping $v$ then yields the desired function $w \in M((g_1, \dots, g_{m-2}, 2)) = M((g_1, \dots, g_{m-1}))$. If $u(b) < -u(a)$ and $g_{m-2} = 2$, we define new points $\bar a$, $\bar b$, $\bar c$, and $\bar d$ as follows. Let $\bar d = b$ and $\bar c = a$, and let $u(\bar b)$ be the local maximum preceding $u(\bar c)$ and $u(\bar a)$ be the local minimum preceding $u(\bar b)$. Now $u(\bar d) = u(b) < -u(a) = -u(\bar c)$. So either $-u(\bar c) > u(\bar b)$, in which case we can use Case 3 below to obtain the desired result, or $-u(\bar c) < u(\bar b)$. In the latter case, if $g_{m-3} > 2$, then applying Case 1a with the newly defined points yields the result, and if $g_{m-3} = 2$, we repeat the argument in Case 1b. This process terminates after finitely many steps.

Case 2: $u(d) > -u(c)$. Let $u(e)$ be the local minimum succeeding $u(d)$. Since $u(e) > -1$, we have $-u(e) - u(d) < 0$. As before, cut and flip $u$ at $d$ to get $v(t) = u(t)$ for $t \le d$ and $v(t) = -u(t)$ for $t \ge d$. The above inequality and the assumption imply that $v|_{[c,d]}$ and $v|_{[d,e]}$ are latched, so clip $v$ to $w$ which has one less transition. Actually $w \in M((g_1, \dots, g_{m-1}))$, and again $J[w] \le J[u] - \kappa$. Consequently, $J((g_1, \dots, g_{m-1})) \le J(g) - \kappa$.

Case 3: $-u(c) \ge \max\{u(b), u(d)\}$. Cut and flip the tail at $c$ to obtain $v(t) = u(t)$ for $t \le c$ and $v(t) = -u(t)$ for $t \ge c$. The functions $v|_{[b,c]}$ and $v|_{[c,d]}$ are latched; clip out one transition and finish the argument exactly as in the previous case.
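The "cut and flip" step used repeatedly in the proof of Lemma 7.4 is easy to visualize on sampled data. The following sketch is only an illustration: the grid, the test profile, and the helper name `cut_and_flip` are hypothetical, and the subsequent clipping step (Lemma 3.1) is not implemented. It shows that flipping the tail creates a jump from $u(b)$ to $-u(b)$ at the cut, while the potential term $F(u)$ is unchanged at every sample when $F$ is even.

```python
import math

def F(u):
    """EFK double-well potential, even in u."""
    return (u * u - 1.0) ** 2 / 4.0

def cut_and_flip(ts, us, b):
    """Samples of v with v(t) = u(t) for t <= b and v(t) = -u(t) for t > b."""
    return [u if t <= b else -u for t, u in zip(ts, us)]

# Hypothetical test profile: a kink plus a decaying oscillation, sampled on a grid.
ts = [-10.0 + 0.01 * k for k in range(2001)]
us = [math.tanh(t) + 0.3 * math.exp(-0.2 * t * t) * math.cos(3.0 * t) for t in ts]

b = 0.5
vs = cut_and_flip(ts, us, b)

# Index of the last sample at or before the cut; the jump sits between k and k + 1.
k = max(i for i, t in enumerate(ts) if t <= b)
jump = vs[k + 1] - vs[k]  # approximately -2 * u(b)

# For even F the potential energy density agrees at every sample.
same_potential = all(abs(F(v) - F(u)) < 1e-15 for u, v in zip(us, vs))
print(jump, same_potential)
```

The derivative terms of the action are likewise unchanged away from the cut, since they depend on $u'$ and $u''$ only through their squares; the jump itself is what the clipping step then removes.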
8. Concluding Remarks

In this section we will discuss some related observations and possible directions for future work which are worthy of mention. However, we will be brief and omit details.

8.1. Stable states for fourth-order evolution equations. In relation to the study of phase transitions in the neighborhood of Lifshitz points the following partial differential equation has been proposed in [18, 19, 50]:
$$u_t = -\gamma u_{xxxx} + \beta u_{xx} - F'(u), \tag{8.1}$$
where $F$ is a double-well potential. In the case that the spatial domain is all of $\mathbb{R}$, the local minima of $J$ are weakly stable states of (8.1) in the following sense. Let $\hat u$ be a local minimum of $J$, and define $\hat u_s$ to be the curve $\hat u(\cdot - s)$, $s \in \mathbb{R}$. From [12, 26] it follows (at least in the analytic case) that these curves are isolated in $\chi + H^2(\mathbb{R})$. Now let $u_0 \in B_\epsilon(\hat u_s)$ for $\epsilon$ sufficiently small. Then, since $\hat u$ is an isolated local minimum of $J$, the flow $u(t, \cdot)$ generated by (8.1) remains in some tubular neighborhood $B_{\epsilon'}(\hat u_s)$, with $\epsilon' \ge \epsilon$. Furthermore, for arbitrary analytic $F$, the kernel of $J''(\hat u)$ satisfies $1 \le \dim N(J''(\hat u)) \le 2$. When $\dim N(J''(\hat u)) = 1$, which corresponds to the transverse intersection of stable and unstable manifolds, it follows from [6] that $u(t, x; u_0)$ approaches an element $\hat u(\cdot - s_0)$ on the curve $\hat u_s$ for some $s_0$. Otherwise it is not clear whether the $\omega$-limit set of the solution $u(t, x; u_0)$ is a singleton.

In the case of a finite domain, the local minimizers found in this paper provide stable equilibrium solutions with complex spatial patterns to initial boundary value problems of the form
$$\begin{cases} u_t = -\gamma\epsilon^4 u_{xxxx} + \beta\epsilon^2 u_{xx} - F'(u) & \text{for } x \in (0, 1), \\ (u(t, 0), u'(t, 0)) = A \ \text{and}\ (u(t, 1), u'(t, 1)) = B, \quad u(0, x) = u_0(x). \end{cases} \tag{8.2}$$
Indeed, let $\hat v \in M(g)$ be a local minimizer of $J$, and let $[0, L]$ be the core interval of $\hat v$. Truncating $\hat v$ to this interval and rescaling by $\epsilon = 1/L$ yields a function $v_\epsilon : [0, 1] \to \mathbb{R}$ which is a local minimizer of
$$J_\epsilon[u] = \int_0^1 \left( \frac{\gamma\epsilon^4}{2} u_{xx}^2 + \frac{\beta\epsilon^2}{2} u_x^2 + F(u) \right) dx$$
over the space $X = \{u \in H^2[0, 1] : (u(0), u'(0)) = A \text{ and } (u(1), u'(1)) = B\}$, where $A$ and $B$ are the values of $(\hat v, \hat v')$ at the points $0$, $L$. It is easily verified that $J_\epsilon$ is a Lyapunov function for the flow generated on $X$ by the initial boundary value problem, and hence $v_\epsilon$ is an asymptotically stable equilibrium. The function $v_\epsilon$ has the complexity specified by $g$. Note that the values of $A$, $B$, and $\epsilon$ are determined by the particular minimizer on $\mathbb{R}$, and $\epsilon \to 0$ as $\sum g_i \to \infty$.

This is reminiscent of a result due to Afraimovich, Babin and Chow [1] who study second-order parabolic systems of the form
$$q_t = q_{xx} - \nabla V(q), \qquad q : [0, T] \to \mathbb{R}^2,$$
derived from potentials $V$ with two infinite spike-like singularities at $z_1, z_2 \in \mathbb{R}^2$. This makes the configuration space nontrivial homotopically (unlike our case with wells at $(\pm 1, 0)$). They prove that homotopy classes of initial data are invariant under this gradient flow with periodic boundary conditions, which yields stable equilibria in every homotopy class of $\mathbb{R}^2 \setminus \{z_1, z_2\}$. This invariance of classes will not occur for the fourth-order PDE (8.2), but we do obtain spatially complex stable equilibria. The dynamics near the attractor is governed by the motion of transition layers in solutions to (8.2), see Kalies, VanderVorst, and Wanner [27]. Sandstede [41] has developed a general theory for studying the dynamics of higher-order parabolic PDEs in one space dimension. If the single pulse heteroclinics in $M(0)$ are transverse, then his results
apply to (8.2) and describe the dynamics near solutions with well-separated transitions such as those in Theorem 1.2 (or 6.3). The influence of saddle-focus equilibria in stationary or traveling-wave equations for a variety of fourth-order PDEs has been discussed by many authors, cf. [7, 8] and the references therein, see also Remark 7 below.

8.2. Multiple-well potentials. In the previous sections we discussed only equal depth double-well potentials $F(u)$. However the theory developed there easily extends to the case when $F$ has arbitrarily many wells all of equal depth and all global minima. As in the case of two wells, topological classes of functions can be defined with limits at $t = \pm\infty$ in one of the wells. Naturally all of these equilibrium solutions are required to be of saddle-focus type. The necessary clipping procedures are identical to those in this paper, and the analysis proceeds in a completely analogous way. Again by comparing infima one can conclude that there are local minimizers in many of these topological classes. Potentials $F$ with infinitely many wells can also be considered, cf. [26].

8.3. Critical levels for large winding vectors. For vectors $g \in G$ for which the components $g_i$ are large, a result more detailed than Theorem 6.3 can be obtained. In particular precise estimates on the critical levels of the minimizers can be found. There exist positive constants $k$ and $N = N(|g|)$ such that if $\min g_i \ge N$, then
$$|J(g) - nJ^-(0) - mJ^+(0)| = O(e^{-k \min g_i}), \qquad n + m = |g|.$$
As $\min g_i \to \infty$, the critical levels cluster around sums of the global minima $J^-(0)$ and $J^+(0)$. Moreover the minimizers are $W^{1,\infty}$-close to $\sum_{i \le |g|+1} \hat u_i(\cdot - t_i)$, where the $\hat u_i$ are minimizers of $J^-(0)$ for $i$ odd and $J^+(0)$ for $i$ even. The points $t_i$ satisfy $t_1 < t_2 < \dots < t_{|g|+1}$ and $|t_i - t_{i-1}| \to \infty$ as $\min g_i \to \infty$. The proof is based on the following observation. Recall $I_i = \operatorname{conv} A_i$ are the intervals between transitions. Let $u \in M(g)$ be a minimizer, then $|I_i| \to \infty$ as $g_i \to \infty$ for all $i$, which follows from the proof of Theorem 2.8 in [26]. The exponential bound in the above estimate follows from the fact that $\pm 1$ are hyperbolic equilibria. A detailed proof of this estimate from below is completely analogous to arguments found in [27]. Note that the solutions constructed in [26], which lie in a $W^{1,\infty}$-neighborhood of $\sum_i [u_-(\cdot - s_{2i+1}) + u_+(\cdot - s_{2i})]$, $s \to \infty$, also have the property that the critical value goes to $nJ^- + mJ^+$. Also the uniform a priori bound on all bounded solutions established in the appendix of [28] allows the constant $N$ in the above estimate to be chosen independently of $|g|$.

8.4. Non-integrability and positive entropy. The multitransition solutions found in this paper can be used to estimate the topological entropy of the local flow generated by (1.2). Such a computation is performed in [26] for the EFK Eq. (1.3), where $F(u) = \frac{1}{4}(u^2 - 1)^2$. A similar argument can be made for a general double-well potential using the solutions in Theorem 1.2 provided one can find estimates on the distances between transitions. These estimates can be done, but we will not furnish them here. Since the topological entropy is positive, the flow is chaotic. Consequently Eq. (1.2), with any $C^2$ double-well potential $F$, is not a completely integrable Hamiltonian system when $u = \pm 1$ are saddle-focus equilibrium points.
In fact, the above results could possibly be extended to produce chaotic solutions with infinitely many transitions directly by a minimization procedure (see [25]). 8.5. Connections of higher index. The homoclinic and heteroclinic solutions discussed here are local minimizers of J. Therefore one would expect to also find critical points
of higher index such as mountain pass critical points. If the minimizers were known to be nondegenerate, i.e. the translation eigenvalue is simple (which is difficult to verify), then many of the critical points of arbitrarily high index can be derived from work by Sandstede [41]. Recent numerical results concerning such solutions can be found in VandenBerg [48]. A possible variational approach can be illustrated by the following example. The shooting methods of Peletier and Troy [36] yield heteroclinics which oscillate around 0 on an interval $[-T, T]$ with amplitude less than 1. Such functions are in the class $M(0)$ but are not the local minimizers from Theorem 2.2, because their transitions are not monotone. This does not preclude them from being local minimizers of $J$, but they cannot be global minimizers in $\chi + H^2(\mathbb{R})$, since they can be clipped. However we conjecture that there are solutions of this type which are mountain pass critical points between local minima. Let $u_0 \in M(0)$ and $u_1 \in M((2, 2))$ and consider the connection $u_s = (1 - s)u_0 + s u_1$, $s \in [0, 1]$. Then for some $s_* \in [0, 1]$, the function $u_{s_*}$ has this property. It is possible that the techniques used in this paper can be extended to find a minimax of mountain pass type in the class of odd functions between the solutions $u_0$ and $u_1$ in this way.

8.6. Different types of Lagrangians. The Lagrangians that we consider in this paper are bounded from below by zero which allows for minimization. There are related problems for which such a formulation is not immediately applicable. However, the clipping and pasting techniques developed here could be useful in the application of other variational methods to the study of fourth-order equations with saddle-foci. In this context we mention two equations. One class of problems comes from nonlinear optics [2], where the following equation arises:
$$u'''' - \alpha u'' - u^3 + u = 0.$$
In this case the Lagrangian density is not bounded from below, and one can find a homoclinic connection to the origin as a mountain pass, and the origin is a saddle-focus for $-2\sqrt{2} < \alpha < 2\sqrt{2}$. A similar situation occurs in (8.1) when the wells of $F$ do not have equal depth. One form of the stationary Swift-Hohenberg or Mizel equation is $u'''' + \alpha u'' + u^3 - u = 0$, $\alpha > 0$ (see e.g. [29]), and the points $u = \pm 1$ are saddle-foci when $0 < \alpha < 2\sqrt{2}$. Again the action functional is not bounded below which causes problems in minimization. The techniques of this paper might be extendable to find higher types of minimax critical points in these types of equations.

Acknowledgement. The authors thank their colleagues at the Center for Dynamical Systems and Nonlinear Studies for their support and encouragement. In particular, we also thank J.B. VandenBerg and the referee for the improvement of Theorem 7.1 and the referee for his careful reading of the manuscript.
References

1. Afraimovich, V., Babin, A. and Chow, S.-N.: Spatial chaotic structure of attractors of reaction-diffusion systems. Trans. Am. Math. Soc. 348, 5031–5063 (1996)
2. Akhmediev, N.N., Buryak, A.V., and Karlsson, M.: Radiationless optical solutions with oscillating tails. Optics Comm. 110, 540–544 (1994)
3. Alexander, J.C., Gardner, R.A., and Jones, C.K.R.T.: Existence and stability of asymptotically oscillatory double pulses. J. Reine Angew. Math. 446, 49–79 (1994)
4. Ambrosetti, A. and Coti-Zelati, V.: Multiplicité des orbites homoclines pour des systèmes conservatifs. C. R. Acad. Sci. Paris 314 II, 601–604 (1992)
5. Amick, C.J. and Toland, J.F.: Homoclinic orbits in the dynamic phase-space analogy of an elastic strut. Eur. J. Appl. Math. 3, 97–114 (1992)
6. Bates, P. and Jones, C.K.R.T.: Invariant manifolds for semilinear partial differential equations. Dynamics Reported 2, 1–38 (1989)
7. Belyakov, L.Y. and Shil'nikov, L.P.: On the complex stationary nearly solitary waves. In: Self-Organisation, Autowaves and Structures Far from Equilibrium. Berlin–Heidelberg–New York: Springer-Verlag, 1984, pp. 106–110
8. Belyakov, L.Y. and Shil'nikov, L.P.: Homoclinic curves and complex waves. Selecta Math. Sov. 9, 214–228 (1990)
9. Brezis, H.: Analyse Functionnelle, Theorie et Applications. Paris: Masson, 1988
10. Buffoni, B.: Multiplicity of homoclinic orbits in fourth-order conservative systems. In: Variational and Local Methods in the Study of Hamiltonian Systems. River Edge, NJ: World Sci Publishing, 1995, pp. 129–136
11. Buffoni, B., Champneys, A.R. and Toland, J.F.: Bifurcation and coalescence of a plethora of homoclinic orbits. J. Dyn. Diff. Eq. 8, 221–279 (1996)
12. Buffoni, B. and Séré, E.: A global condition for quasi-random behavior in a class of conservative systems. Comm. Pure Appl. Math. 49, 285–305 (1996)
13. Buffoni, B. and Toland, J.F.: Global existence of homoclinic and periodic orbits for a class of autonomous Hamiltonian systems. J. Diff. Eq. 118, 104–120 (1995)
14. Champneys, A.R.: Subsidiary homoclinic orbits to a saddle-focus for reversible Hamiltonian systems. Intl. J. Bifurcation and Chaos 4, 1447–1482 (1994)
15. Champneys, A.R. and Toland, J.F.: Bifurcation of a plethora of multi-modal homoclinic orbits for autonomous Hamiltonian systems. Nonlinearity 6, 665–772 (1993)
16. Coti-Zelati, V., Ekeland, I. and Séré, E.: A variational approach to homoclinic orbits in Hamiltonian systems. Math. Ann. 288, 133–160 (1990)
17. Coti-Zelati, V. and Rabinowitz, P.H.: Homoclinic orbits for second order Hamiltonian systems possessing superquadratic potentials. J. AMS 4, 693–727 (1991)
18. Coullet, P., Elphick, C., and Repaux, D.: The nature of spatial chaos. Phys. Rev. Lett. 58, 431–434 (1987)
19. Dee, G.T. and van Saarloos, W.: Bistable systems with propagating front leading to pattern formation. Phys. Rev. Lett. 60, 2641–2644 (1988)
20. Devaney, R.L.: Homoclinic orbits in Hamiltonian systems. J. Diff. Eq. 21, 431–438 (1976)
21. Felmer, P.: Heteroclinic orbits for spatially periodic Hamiltonian systems. Ann. Inst. H. Poincaré – Anal. Nonl. 8, 477–497 (1991)
22. Gardner, R.A. and Jones, C.K.R.T.: Traveling waves of a perturbed diffusion equation arising in a phase field model. Indiana Univ. Math. Journ. 38, 1197–1222 (1989)
23. Guckenheimer, J. and Holmes, P.: Nonlinear Oscillations, Dynamical Systems, and Bifurcation of Vector Fields. Number 42 in Appl. Math. Sciences. Berlin–Heidelberg–New York: Springer-Verlag, 1982
24. Hornreich, R.M., Luban, M. and Shtrikman, S.: Critical behavior at the onset of k-space instability at the λ line. Phys. Rev. Lett. 35, 1678 (1975)
25. Kalies, W.D., Kwapisz, J., VandenBerg, J.B., VanderVorst, R.C.A.M.: Homotopy classes for stable periodic and random patterns in fourth order Hamiltonian systems. In preparation
26. Kalies, W.D. and VanderVorst, R.C.A.M.: Multitransition homoclinic and heteroclinic solutions of the extended Fisher-Kolmogorov equation. J. Diff. Eq. 131, 209–228 (1996)
27. Kalies, W.D., VanderVorst, R.C.A.M. and Wanner, T.: Slow motion in higher-order systems and Γ-convergence in one space dimension. CDSNS Report Series 97–259 (1997)
28. Kwapisz, J.: Uniqueness of the stationary wave for the extended Fisher–Kolmogorov equation. Preprint, 1997
29. Leizarowitz, A. and Mizel, V.: One dimensional infinite-horizon variational problems arising in continuum mechanics. Arch. Rational Mech. Anal. 106, 161–194 (1989)
30. Lin, X.-B.: Using Melnikov's method to solve Silnikov's problems. Proc. Royal Soc. Edin. 116A, 295–325 (1990)
31. Melnikov, V.K.: On the stability of the center for time periodic perturbations. Trans. Moscow Math. Soc. 12, 1–57 (1963)
32. Nishiura, Y.: Coexistence of infinitely many stable solutions to reaction-diffusion systems in the singular limit. Dynamics Reported 3, 25–103 (1994)
33. Peletier, L.A. and Troy, W.C.: Spatial patterns described by the extended Fisher-Kolmogorov equation: Kinks. Differential Integral Equations 8, 1279–1304 (1995)
34. Peletier, L.A. and Troy, W.C.: Spatial patterns described by the extended Fisher-Kolmogorov equation: Periodic solutions. Preprint 1995
35. Peletier, L.A. and Troy, W.C.: Chaotic spatial patterns described by the extended Fisher-Kolmogorov equation. J. Diff. Eq. 129, 458–508 (1996)
36. Peletier, L.A. and Troy, W.C.: A topological shooting method and the existence of kinks of the extended Fisher-Kolmogorov equation. Topol. Methods Nonlinear Anal. 6, 331–355 (1997)
37. Peletier, L.A., Troy, W.C. and VanderVorst, R.C.A.M.: Stationary solutions of a fourth-order nonlinear diffusion equation. Differentsialnye Uravneniya 31, 327–337 (1995)
38. Rabinowitz, P.H.: Periodic and heteroclinic solutions for a periodic Hamiltonian system. Ann. Inst. H. Poincaré – Anal. Nonl. 6, 331–346 (1989)
39. Rabinowitz, P.H.: Homoclinic and heteroclinic orbits for a class of Hamiltonian systems. Calc. Var. 1, 1–36 (1993)
40. Rottschäfer, V. and Doelman, A.: On the transition from the Ginzburg-Landau equation to the extended Fisher-Kolmogorov equation. Preprint (1997)
41. Sandstede, B.: Interaction of pulses in dissipative systems. In preparation (1997)
42. Sandstede, B.: Stability of multiple-pulse solutions. To appear in Trans. Am. Math. Soc. 1997
43. Séré, E.: Existence of infinitely many homoclinic orbits in Hamiltonian systems. Math. Ann. 209, 27–42 (1992)
44. Séré, E.: Looking for the Bernoulli shift. Ann. Inst. H. Poincaré – Anal. Nonl. 10, 561–590 (1993)
45. Shil'nikov, L.P.: The existence of a denumerable set of periodic motions in four-dimensional space in an extended neighborhood of a saddle-focus. Soviet Math. Dokl. 8, 54–58 (1967)
46. Turaev, D.V. and Shil'nikov, L.P.: On Hamiltonian systems with homoclinic saddle curves. Soviet Math. Dokl. 39, 165–168 (1989)
47. VandenBerg, J.B.: Private communication 1997
48. VandenBerg, J.B.: Branches of heteroclinic solutions in fourth-order bi-stable systems. In preparation (1997)
49. VandenBerg, J.B.: Uniqueness for solutions of the extended Fisher–Kolmogorov equation. To appear in Comptes Rendus
50. Zimmerman, W.: Propagating fronts near a Lifshitz point. Phys. Rev. Lett. 66, 1546 (1991)

Communicated by J. L. Lebowitz
Commun. Math. Phys. 193, 373 – 396 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Elliptic Solutions to Difference Non-Linear Equations and Related Many-Body Problems

I. Krichever (1), P. Wiegmann (2), A. Zabrodin (3)

(1) Department of Mathematics of Columbia University and Landau Institute for Theoretical Physics, Kosygina str. 2, 117940 Moscow, Russia
(2) James Franck Institute and Enrico Fermi Institute of the University of Chicago, 5640 S. Ellis Avenue, Chicago, IL 60637, USA and Landau Institute for Theoretical Physics
(3) Joint Institute of Chemical Physics, Kosygina str. 4, 117334, Moscow, Russia and ITEP, 117259, Moscow, Russia
Received: 15 May 1997 / Accepted: 7 September 1997
Abstract: We study algebro-geometric (finite-gap) and elliptic solutions of fully discretized KP or 2D Toda equations. In bilinear form they are Hirota’s difference equation for τ -functions. Starting from a given algebraic curve, we express the τ -function and the Baker–Akhiezer function in terms of the Riemann theta function. We show that the elliptic solutions, when the τ -function is an elliptic polynomial, form a subclass of the general algebro-geometric solutions. We construct the algebraic curves of the elliptic solutions. The evolution of zeros of the elliptic solutions is governed by the discrete time generalization of the Ruijsenaars-Schneider many body system. The zeros obey equations which have the form of nested Bethe-ansatz equations, known from integrable quantum field theories. We discuss the Lax representation and the action-angle-type variables for the many body system. We also discuss elliptic solutions to discrete analogues of KdV, sine-Gordon and 1D Toda equations and describe the loci of the zeros.
1. Introduction Among a vast class of solutions to classical non-linear integrable equations elliptic solutions play a special role. First, these solutions occupy a distinguished place among all algebro-geometric (also called finite-gap) solutions, i.e. solutions constructed out of a given algebraic curve. The general formulas in terms of Riemann theta-functions become much more effective – in this case the Riemann theta-function splits into a product of Weierstrass σ-functions associated to an elliptic curve. Second, there exists a remarkable connection between the motion of poles (zeros) of the elliptic solutions and certain integrable many body systems. The pole dynamics of elliptic solutions to the Korteweg–de Vries (KdV) equation and the Calogero–Moser system of particles were linked together in the paper [1] (see also [2]). It has been shown in [3],[4] that this relation becomes an isomorphism if one considers elliptic solutions of the Kadomtsev–Petviashvili (KP) equation. More recently,
these results were generalized to elliptic solutions of the matrix KP and the matrix 2D Toda lattice equations (see [5] and [6], respectively). The dynamics of their poles obeys the spin generalization of the Ruijsenaars–Schneider (RS) model [7].

Let us recall some elements of the elliptic solutions for the standard example of the KP equation $3u_{yy} = (4u_t + 6uu_x - u_{xxx})_x$ for a function $u = u(x, y, t)$. An elliptic solution in the variable $x$ is given by
$$u(x, y, t) = \mathrm{const} + 2\sum_{i=1}^{N} \wp(x - x_i(y, t)), \tag{1.1}$$
where $\wp(x)$ is the Weierstrass $\wp$-function. The self-consistency of this ansatz is a manifestation of integrability. It has been shown in [3],[4] that the dynamics of poles as functions of $y$ obeys the Calogero-Moser many body system with the Hamiltonian
$$H = \frac{1}{2}\sum_{i=1}^{N} p_i^2 - 2\sum_{i \ne j} \wp(x_i - x_j). \tag{1.2}$$
This system in its turn is known to be integrable. There is an involutive set of conserved quantities $H^{(j)}$; the Hamiltonian (1.2) and the total momentum are $H^{(2)}$ and $H^{(1)}$, respectively. The equations of motion are
$$\partial_y^2 x_i = 4\sum_{j=1,\, j \ne i}^{N} \wp'(x_i - x_j). \tag{1.3}$$
The $t$-dynamics is described by $H^{(3)}$. The reduction to the KdV equation restricts the particles to the locus $L_N$ in the phase space:
$$L_N = \Big\{ (p_i, x_i) \;\Big|\; p_i = 0, \ \sum_{j \ne i} \wp'(x_i - x_j) = 0 \Big\} \tag{1.4}$$
(here $p_i = \partial_y x_i$). In spite of interesting developments, an analysis of the locus structure is far from completed.

In this paper we extend these results to the fully discretized version of the KP equation or 2D Toda lattice. Being fully discretized they become the same equation. In bilinear form they are known as Hirota's bilinear difference equation (HBDE) [8], [9] (see [10] for a review). This is a bilinear equation for a function $\tau(l, m, n)$ (called the $\tau$-function) of three variables:
$$\lambda\,\tau(l+1, m, n)\,\tau(l, m+1, n+1) + \mu\,\tau(l, m+1, n)\,\tau(l+1, m, n+1) + \nu\,\tau(l, m, n+1)\,\tau(l+1, m+1, n) = 0, \tag{1.5}$$
where $\lambda$, $\mu$, $\nu$ are complex parameters and the three variables are not necessarily integer. In what follows we call them discrete times, stressing the difference with continuous KP flows. Let us introduce a lattice spacing $\eta$ for one of the variables, say, $l$ and denote $x \equiv \eta l$. By elliptic solutions (in the variable $x$) to this equation we mean the following ansatz for the $\tau$-function:
$$\tau(l, m, n) \equiv \tau^{m,n}(x) = \prod_{j=1}^{N} \sigma(x - x_j^{m,n}), \tag{1.6}$$
where $\sigma(x)$ is the Weierstrass $\sigma$-function. We refer to the r.h.s. of (1.6) as elliptic polynomials in $x$. For brevity, we call solutions of this type elliptic though the $\tau$-function itself is not double-periodic. However, suitable ratios of these $\tau$-functions, for instance,
$$A^{m,n}(x) = \frac{\tau^{m,n}(x)\,\tau^{m+1,n}(x + \eta)}{\tau^{m+1,n}(x)\,\tau^{m,n}(x + \eta)}, \tag{1.7}$$
are already elliptic functions.

In this paper we derive equations of motion for poles of $A^{m,n}(x)$ or zeros of $\tau^{m,n}(x)$ for discrete times $m$, $n$ and thus obtain a fully discretized Calogero-Moser many body problem. This appears to be the discrete time version of the Ruijsenaars-Schneider (RS) model proposed in the seminal paper [11]. Remarkably, the discrete equations of motion have the form of Bethe equations of the hierarchical (nested) Bethe ansatz. The discrete time runs over "levels" of the nested Bethe ansatz.

We also consider stationary reductions of HBDE. In this case the initial configuration of poles (zeros) is not arbitrary but constrained to a stable locus as in the continuous case (1.4). For the most important examples we give equations defining the loci.

A renewed interest in soliton difference equations, and especially in their elliptic solutions, is caused by the discovery of classical integrable structures present in integrable models of quantum field theory. It turns out that Hirota's Eq. (1.5) is the universal fusion rule for a family of quantum transfer matrices. Their eigenvalues (as functions of spectral parameters) obey a set of functional equations [12] which can be recast into the bilinear Hirota equation [13] (see also [14] and [15] for less technical reviews). Furthermore, it turned out that most of the ingredients of the Bethe ansatz and the quantum inverse scattering method are hidden in the elliptic solutions of the entirely classical discrete time soliton equations [13]. In particular, the discrete dynamics of poles of $A^{m,n}(x)$ or zeros of (1.6) has the form of Bethe ansatz equations, where the discrete time runs over "nested" levels. The theory of elliptic solutions has direct applications to the algebraic Bethe ansatz and to Baxter's $T$-$Q$-relation, which we plan to discuss elsewhere.

Here we attempt to develop a systematic approach to the elliptic solutions of the integrable difference equations.
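As a quick sanity check on the form of Hirota's equation (1.5), one can evaluate its left-hand side numerically for a trial $\tau$-function. The sketch below is not taken from the paper: the helper `hirota_residual` and the test family $\tau(l, m, n) = \exp(a\,lm + b\,mn + c\,ln)$ are illustrative assumptions. For this family a direct computation shows that the three products in (1.5) share a common exponential factor, so (1.5) holds exactly when $\lambda e^{b} + \mu e^{c} + \nu e^{a} = 0$.

```python
import cmath

def hirota_residual(tau, l, m, n, lam, mu, nu):
    """Left-hand side of Hirota's bilinear difference equation (1.5) at (l, m, n)."""
    return (lam * tau(l + 1, m, n) * tau(l, m + 1, n + 1)
            + mu * tau(l, m + 1, n) * tau(l + 1, m, n + 1)
            + nu * tau(l, m, n + 1) * tau(l + 1, m + 1, n))

# Hypothetical trial family: tau = exp(a*l*m + b*m*n + c*l*n).
a, b, c = 0.3, -0.2, 0.1

def tau(l, m, n):
    return cmath.exp(a * l * m + b * m * n + c * l * n)

# Choose lam and mu freely, then fix nu from lam*e^b + mu*e^c + nu*e^a = 0.
lam, mu = 1.0, 2.0 - 0.5j
nu = -(lam * cmath.exp(b) + mu * cmath.exp(c)) / cmath.exp(a)

# The three variables need not be integers.
points = [(0, 0, 0), (1, -2, 3), (0.5, 1.25, -0.75)]
residuals = [abs(hirota_residual(tau, l, m, n, lam, mu, nu)) for l, m, n in points]
scales = [abs(lam * tau(l + 1, m, n) * tau(l, m + 1, n + 1)) for l, m, n in points]

# Residuals relative to the size of one term; these should be at rounding level.
print([r / s for r, s in zip(residuals, scales)])
```

The same residual function can be applied to any candidate $\tau$, for instance a numerically evaluated elliptic polynomial of the form (1.6), once an implementation of the $\sigma$-function is available.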
The basic concept of the approach is that of Baker–Akhiezer functions on algebraic curves. We prove that all solutions to HBDE of the form (1.6) are of the algebro-geometric type and present them in terms of Riemann theta functions.

The plan of the paper is as follows. In Sect. 2 we describe general algebro-geometric (finite-gap) solutions to HBDE. We start from the Baker–Akhiezer function constructed from a complex algebraic curve of genus $g$ with marked points. This function satisfies an overcomplete set of linear difference equations. Their consistency is equivalent to Hirota’s equation. In this way, one obtains a $(4g+1)$-parametric family of quasiperiodic solutions to HBDE in terms of the Riemann theta-functions. Solitonic degenerations of these solutions are discussed in Sect. 2.4. Section 3 is devoted to elliptic solutions. They are shown to be a particular subclass of the algebro-geometric family of solutions of Sect. 2. We derive equations of motion for the zeros of the $\tau$-function (the Bethe ansatz equations) and their Lax representation. We discuss variables of the action-angle type and write down equations for the stable loci for the most important reductions of Hirota’s equation.
I. Krichever, P. Wiegmann, A. Zabrodin
2. Algebro-Geometric Solutions to Hirota’s Equation

In this section we construct algebro-geometric solutions of Hirota’s equation out of a given algebraic curve. The general method of constructing such solutions of integrable equations is standard. As soon as the bilinear equation can be represented as a compatibility condition for an overdetermined system of linear problems, the first step is to pass to common solutions $\Psi$ to the linear problems. Given a linear multi-dimensional difference operator with quasiperiodic coefficients, one associates with it a spectral curve defined by the generalized dispersion relations for quasimomenta of Bloch eigenfunctions of the linear operator. The Bloch solutions $\Psi$ are parametrized by points of this curve. Solutions to the initial non-linear equation are encoded in the analytical properties of $\Psi$ as a function on the curve.

Spectral curves of general linear operators with quasiperiodic coefficients are transcendental and, therefore, intractable. However, soliton theory mostly deals with the inverse problem: to characterize specific operators whose spectral curves are algebraic curves of finite genus. Such operators are called algebro-geometric or finite-gap. Their coefficients yield the solutions to HBDE that we are going to study. Finite-gap multi-dimensional linear difference operators were constructed in the paper [16] by one of the authors. We present the corresponding construction in a form adequate for our purposes.

2.1. The Baker–Akhiezer function. As usual, we begin with the axiomatization of analytical properties of Bloch solutions $\Psi$. The Baker–Akhiezer function is an abstract version of the Bloch function. Since we solve the inverse problem, the primary objects are $\Psi$-functions rather than linear operators.

Let $\Gamma$ be a smooth algebraic curve of genus $g$. We fix the following data related to the curve:
– A finite set of marked points (punctures) $P_\alpha \in \Gamma$, $\alpha = 0, 1, \ldots, M$;
– Local parameters $w_\alpha$ in neighbourhoods of $P_\alpha$: $w_\alpha(P_\alpha) = 0$;
– A set of cuts $C_{\alpha\beta}$ between the points $P_\alpha$, $P_\beta$ for some pairs $\alpha, \beta$ (it is implied that different cuts do not have common points other than their endpoints at the punctures);
– A set $D$ of $g$ (distinct) points $\gamma_1, \ldots, \gamma_g \in \Gamma$.

Further, we introduce the following complex parameters $l_{\alpha\beta}$ (times or flows):
– To each cut $C_{\alpha\beta}$ is associated a complex number $l_{\alpha\beta}$ (it is convenient to assume that $l_{\beta\alpha} = -l_{\alpha\beta}$).

Consider a linear space $\mathcal{F}(l; D)$ of functions $\Psi(l; P)$, $P \in \Gamma$, such that:
1. The function $\Psi(l; P)$ as a function of the variable $P \in \Gamma$ is meromorphic outside the cuts and has at most simple poles at the points $\gamma_s$;
2. The boundary values $\Psi^{\pm,(\alpha\beta)}$ of this function at opposite sides of the cut $C_{\alpha\beta}$ satisfy the relation
\[ \Psi^{+,(\alpha\beta)}(l; P) = \Psi^{-,(\alpha\beta)}(l; P)\, e^{2\pi i l_{\alpha\beta}}; \tag{2.1} \]
3. In a neighbourhood of the point $P_\alpha$ it has the form
\[ \Psi(l; P) = w_\alpha^{-L_\alpha} \Bigl( \xi_0^{(\alpha)}(l) + \sum_{s=1}^{\infty} \xi_s^{(\alpha)}(l)\, w_\alpha^{s} \Bigr), \qquad L_\alpha = \sum_\beta l_{\alpha\beta}. \tag{2.2} \]
Elliptic Solutions to Difference Non-Linear Equations
Note that if $l_{\alpha\beta}$ are integers, then $\Psi$ is a meromorphic function having simple poles at $\gamma_s$ and having zeros or poles of orders $|L_\alpha|$ at the points $P_\alpha$.

Any function $\Psi$ obeying conditions 1–3 is called a Baker–Akhiezer function. In our approach, these functions are central objects of the theory. In some cases (especially in the matrix generalizations of the theory) the notion of the dual Baker–Akhiezer function $\Psi^\dagger$ is also important (see e.g. [5], [6]). We omit its definition because it can be easily restored using [5], [6].

Let us prepare some notation. Fix a canonical basis of cycles $a_i$, $b_i$ on $\Gamma$ and denote the canonically normalized holomorphic differentials by $d\omega_i$, $i = 1, 2, \ldots, g$. We have
\[ \oint_{a_i} d\omega_j = \delta_{ij}, \qquad \oint_{b_i} d\omega_j = B_{ij}, \]
where $B$ is the period matrix. Given a period matrix $B$, the $g$-dimensional Riemann theta-function is defined by
\[ \Theta(\vec X) = \Theta(\vec X | B) = \sum_{\vec n \in \mathbb{Z}^g} \exp\bigl[ \pi i (B\vec n, \vec n) + 2\pi i (\vec n, \vec X) \bigr]. \]
Here $\vec X = (X_1, \ldots, X_g)$ is a $g$-component vector.

For each pair of points $P_\alpha, P_\beta \in \Gamma$, let $d\Omega^{(\alpha\beta)}$ be the unique differential of the third kind holomorphic on $\Gamma$ but having simple poles at the points $P_\alpha$, $P_\beta$ with residues $-1$ and $1$ and zero $a$-periods.

To write an explicit form of Baker–Akhiezer functions, let us choose one of the marked points, say $P_0$, and by $\vec A(P) = (A_1(P), \ldots, A_g(P))$,
\[ A_i(P) = \int_{P_0}^{P} d\omega_i, \]
denote the Abel map. The Baker–Akhiezer function is given by the following theorem.

Theorem 2.1. If the points $\gamma_1, \ldots, \gamma_g$ are in a general position (i.e. $D$ is a non-special divisor), then $\mathcal{F}(l; D)$ is a one-dimensional space generated by the function $\Psi(l; P) \in \mathcal{F}(l; D)$,
\[ \Psi(l; P) = \frac{\Theta(\vec A(P) + \vec X(l) + \vec Z | B)\, \Theta(\vec Z | B)}{\Theta(\vec A(P) + \vec Z | B)\, \Theta(\vec X(l) + \vec Z | B)} \exp\Bigl( \sum_{(\alpha\beta)} l_{\alpha\beta} \int_{Q_0}^{P} d\Omega^{(\alpha\beta)} \Bigr). \tag{2.3} \]
Here $Q_0 \in \Gamma$ is an arbitrary point in the vicinity of $P_0$, $Q_0 \neq P_0$, belonging to the integration path $P_0 \to P$ in the Abel map. Further,
\[ \vec Z = -\vec K - \sum_{i=1}^{g} \vec A(\gamma_i), \qquad \vec X(l) = \sum_{(\alpha\beta)} \vec U^{(\alpha\beta)} l_{\alpha\beta}, \tag{2.4} \]
where $\vec K$ is the vector of Riemann’s constants and the components of the vectors $\vec U^{(\alpha\beta)}$ are
\[ (\vec U^{(\alpha\beta)})_j = \frac{1}{2\pi i} \oint_{b_j} d\Omega^{(\alpha\beta)} = A_j(P_\beta) - A_j(P_\alpha). \tag{2.5} \]
The proof of theorems of this kind as well as the explicit formula for $\Psi$ in terms of Riemann theta-functions are standard in finite-gap theory (see e.g. [17]). The last equality in (2.5) follows from Riemann’s relations.

Remark 2.1. Although the explicit formula (2.3) requires a fixed basis of cycles, the Baker–Akhiezer function is modular invariant.

Remark 2.2. Since abelian integrals in (2.3) have logarithmic singularities at the punctures, one can define a single valued branch of $\Psi$ only after cutting the curve along $C_{\alpha\beta}$.

Remark 2.3. The choice of the initial point of the Abel map is in fact not essential. It can be chosen not to be one of the punctures. In particular, it may be $Q_0$, which slightly simplifies the theorem. However, our choice simplifies the linear equations for $\Psi$ below. Moreover, below we assume that the integration paths in the Abel maps $\vec A(P_\alpha)$ go along the cuts $C_{0\alpha}$.

Remark 2.4. The theorem implies that any function from $\mathcal{F}(l; D)$ has the form $r(l)\Psi(l; P)$, where $r(l)$ is an arbitrary function of $l$ but does not depend on $P$. It is convenient to choose it such that at the point $P_0$ the first regular term $\xi_0^{(0)}(l)$ in (2.2) equals 1.

Coefficients $\xi_s^{(\alpha)}$ of the asymptotical behaviour (2.2) of $\Psi$ can be expressed through the $\tau$-function
\[ \tau(l) \equiv \tau(\{l_{\alpha\beta}\}) = \Theta(\vec X(l) + \vec Z). \tag{2.6} \]
In particular,
\[ \frac{\xi_0^{(\alpha)}(l)}{\xi_0^{(\beta)}(l)} = \chi_{\alpha\beta}\, \frac{\tau(l_{0\alpha} + 1, l_{0\beta})}{\tau(l_{0\alpha}, l_{0\beta} + 1)}, \qquad \alpha, \beta \neq 0, \tag{2.7} \]
where $\chi_{\alpha\beta}$ are $l$-independent constants. Here and thereafter we skip unshifted arguments.

Remark 2.5. If the graph of cuts includes a closed cycle, then a shift of variables $l_{\alpha\beta} \to l_{\alpha\beta} + 1$ does not change the $\tau$-function but multiplies the $\Psi$-function by a cycle dependent constant. For instance, if the cycle consists of three links $C_{\alpha\beta}$, $C_{\beta\gamma}$, $C_{\gamma\alpha}$, then
\[ \begin{aligned} \tau(l_{\alpha\beta} + 1, l_{\beta\gamma} + 1, l_{\gamma\alpha} + 1) &= \tau(l_{\alpha\beta}, l_{\beta\gamma}, l_{\gamma\alpha}), \\ \Psi(l_{\alpha\beta} + 1, l_{\beta\gamma} + 1, l_{\gamma\alpha} + 1; P) &= \mathrm{const}\; \Psi(l_{\alpha\beta}, l_{\beta\gamma}, l_{\gamma\alpha}; P). \end{aligned} \tag{2.8} \]
This follows from (2.1), (2.3).

In the sequel, we do not need the above construction in its full generality. For our purposes it is enough to consider the case of four punctures $P_0, \ldots, P_3$ and a general graph of cuts as is in the figure. Cuts connect each pair of points. Any three links (not forming a cycle) give rise to a bilinear equation of the Hirota type. They have different forms, but are in fact equivalent due to (2.8). For further convenience we specify
\[ l_{01} = l_1, \qquad l_{02} = l_2, \qquad l_{03} = l_3, \qquad l_{12} = \bar l_3. \tag{2.9} \]
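[Figure: the graph of cuts on the four punctures $P_0$, $P_1$, $P_2$, $P_3$; the cuts $C_{01}$, $C_{02}$, $C_{03}$, $C_{12}$ carry the variables $l_1$, $l_2$, $l_3$, $\bar l_3$, respectively.]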
The general case of more punctures yields higher Hirota equations (i.e. the discretized KP or 2D Toda lattice hierarchies).

2.2. Difference equations for the Baker–Akhiezer function. The Baker–Akhiezer function $\Psi(l; P)$ obeys certain linear difference equations with respect to the variables $l_{\alpha\beta}$. Coefficients of the equations are fixed by the analytical properties of $\Psi(l; P)$ as a function of $P \in \Gamma$. We restrict ourselves to the case of four punctures and use the notation introduced at the end of the previous subsection. The general case of more punctures can be treated in a similar way.

Theorem 2.2. Let $\Psi(l; P)$ be the Baker–Akhiezer function normalized so that $\xi_0^{(0)} = 1$. Then it satisfies the following linear difference equations:
\[ \Psi(l_\alpha + 1, l_\beta; P) - \Psi(l_\alpha, l_\beta + 1; P) + A_{\alpha\beta}(l_\alpha, l_\beta)\, \Psi(l_\alpha, l_\beta; P) = 0 \tag{2.10} \]
with
\[ A_{\alpha\beta}(l_\alpha, l_\beta) = \frac{\xi_0^{(\alpha)}(l_\alpha, l_\beta + 1)}{\xi_0^{(\alpha)}(l_\alpha, l_\beta)} \tag{2.11} \]
for any $\alpha, \beta = 1, 2, 3$, $\alpha \neq \beta$.

The proof is standard in finite gap theory. Denote the l.h.s. of Eq. (2.10) by $\tilde\Psi$. This function has the same analytical properties as $\Psi$. At the same time the leading term at the point $P_1$ is zero: $\tilde\xi_0^{(1)} = 0$ for any $l_\alpha$, $l_\beta$. From the uniqueness of the Baker–Akhiezer function it follows that $\tilde\Psi = 0$. Equations (2.18), (2.20) are proved in the same way.

Remark 2.6. The dual Baker–Akhiezer function $\Psi^\dagger$ obeys difference equations obtained from Eqs. (2.10), (2.18) by conjugating the difference operators in the right hand sides.

The coefficient functions of Eq. (2.10) are given by the leading nonsingular term $\xi_0^{(\alpha)}$ of the Baker–Akhiezer function at the punctures. They can be found from Eq. (2.3) and are expressed through the $\tau$-function (2.6):
\[ A_{\alpha\beta}(l_\alpha, l_\beta) = -\lambda_{\alpha\beta}\, \frac{\tau(l_\alpha, l_\beta)\, \tau(l_\alpha + 1, l_\beta + 1)}{\tau(l_\alpha, l_\beta + 1)\, \tau(l_\alpha + 1, l_\beta)}. \tag{2.12} \]
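The ratio (2.11) can be read off by comparing leading orders at the puncture $P_\alpha$. The following is only a sketch of that standard step, written for general $\alpha$ and assuming (as in the four-puncture setup) that $l_{\alpha 0} = -l_\alpha$, so that the exponent $L_\alpha$ in (2.2) contains $-l_\alpha$ and does not involve $l_\beta$:

```latex
% Behaviour of the three terms of (2.10) near P_\alpha (L_\alpha taken at the unshifted l):
\begin{align*}
\Psi(l_\alpha+1,l_\beta;P) &= O\bigl(w_\alpha^{-L_\alpha+1}\bigr),\\
\Psi(l_\alpha,l_\beta+1;P) &= w_\alpha^{-L_\alpha}\bigl(\xi_0^{(\alpha)}(l_\alpha,l_\beta+1)+O(w_\alpha)\bigr),\\
\Psi(l_\alpha,l_\beta;P)   &= w_\alpha^{-L_\alpha}\bigl(\xi_0^{(\alpha)}(l_\alpha,l_\beta)+O(w_\alpha)\bigr).
\end{align*}
% Hence the coefficient of w_\alpha^{-L_\alpha} in the l.h.s. of (2.10) is
\[
\tilde\xi_0^{(\alpha)}
 = -\,\xi_0^{(\alpha)}(l_\alpha,l_\beta+1)
   + A_{\alpha\beta}(l_\alpha,l_\beta)\,\xi_0^{(\alpha)}(l_\alpha,l_\beta),
\]
% which vanishes precisely when A_{\alpha\beta} is the ratio (2.11).
```

With this choice the l.h.s. of (2.10) lies in $\mathcal{F}(l; D)$ and has a vanishing leading term at $P_\alpha$, which is the situation to which the uniqueness statement of Theorem 2.1 is applied.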
The constants $\lambda_{\alpha\beta}$ are expressed through the constant terms $r_\gamma^{(\alpha\beta)}$ in the expansion of the abelian integrals
\[ \int_{Q_0}^{P} d\Omega^{(\alpha\beta)} \underset{P \to P_\gamma}{=} (\delta_{\gamma\beta} - \delta_{\gamma\alpha}) \log w_\gamma + r_\gamma^{(\alpha\beta)} + O(w_\gamma), \tag{2.13} \]
as follows:
\[ \lambda_{\alpha\beta} = -\exp\bigl( r_\alpha^{(0\beta)} - r_0^{(0\beta)} \bigr). \tag{2.14} \]
It can be shown that $\lambda_{\beta\alpha} = -\lambda_{\alpha\beta}$ for $\alpha \neq \beta$ and
\[ \frac{\lambda_{\alpha\beta}}{\lambda_{\beta\gamma}} = -\exp\Bigl( \int_{P_\gamma}^{P_\alpha} d\Omega^{(0\beta)} \Bigr) \tag{2.15} \]
for any cyclic permutation of $\{\alpha\beta\gamma\} = \{123\}$. The integration path goes from $P_\gamma$ to the neighbourhood of $P_0$ along the cut $C_{0\gamma}$ (in the opposite direction), passes through the point $Q_0$ and then goes along the cut $C_{0\alpha}$.

Equations (2.10), (2.11) can be viewed as linear problems for the discretized KP equation. Choosing another triplet of variables, say $l_1$, $l_3$, $\bar l_3$, and using Eq. (2.8), i.e.
\[ \tau(l_1 + 1, l_2, l_3; \bar l_3 + 1) = \tau(l_1, l_2 + 1, l_3; \bar l_3), \tag{2.16} \]
\[ \tau(l_1, l_2, l_3; \bar l_3) = \Theta\Bigl( \sum_{\alpha=1}^{3} \vec A(P_\alpha)\, l_\alpha + \bigl( \vec A(P_2) - \vec A(P_1) \bigr) \bar l_3 + \vec Z \Bigr), \tag{2.17} \]
one may rewrite Eqs. (2.10), (2.11) in a form suitable for discretization of the 2D Toda lattice. In this case Eq. (2.10) for $\alpha, \beta = 1, 3$ remains the same. Another linear equation,
\[ \Psi(l_1, \bar l_3; P) - \Psi(l_1, \bar l_3 + 1; P) + B(l_1, \bar l_3)\, \Psi(l_1 - 1, \bar l_3; P) = 0, \tag{2.18} \]
with
\[ B(l_1, \bar l_3) = \frac{\xi_0^{(1)}(l_1, \bar l_3 + 1)}{\xi_0^{(1)}(l_1 - 1, \bar l_3)} = -\lambda_{12}\, \frac{\tau(l_1 + 1; \bar l_3 + 1)\, \tau(l_1 - 1; \bar l_3)}{\tau(l_1; \bar l_3 + 1)\, \tau(l_1; \bar l_3)}, \tag{2.19} \]
follows from (2.10) for $\alpha, \beta = 1, 2$ as a result of the change of variables. The third equation,
\[ \Psi(l_3 + 1, \bar l_3 + 1; P) - \tilde A(l_3, \bar l_3)\, \Psi(l_3, \bar l_3 + 1; P) = \Psi(l_3 + 1, \bar l_3; P) - \tilde B(l_3, \bar l_3)\, \Psi(l_3, \bar l_3; P), \tag{2.20} \]
with
\[ \tilde A(l_3, \bar l_3) = \frac{\xi_0^{(1)}(l_3 + 1, \bar l_3 + 1)}{\xi_0^{(1)}(l_3, \bar l_3 + 1)} = -\lambda_{12}\, \frac{\tau(l_1 + 1, l_3 + 1; \bar l_3 + 1)\, \tau(l_1, l_3; \bar l_3 + 1)}{\tau(l_1, l_3 + 1; \bar l_3 + 1)\, \tau(l_1 + 1, l_3; \bar l_3 + 1)}, \tag{2.21} \]
\[ \tilde B(l_3, \bar l_3) = \frac{\xi_0^{(2)}(l_3 + 1, \bar l_3)}{\xi_0^{(2)}(l_3, \bar l_3)} = \frac{\lambda_{12} \lambda_{32}}{\lambda_{13}}\, \frac{\tau(l_1 + 1, l_3 + 1; \bar l_3 + 1)\, \tau(l_1, l_3; \bar l_3)}{\tau(l_1, l_3 + 1; \bar l_3)\, \tau(l_1 + 1, l_3; \bar l_3 + 1)}, \tag{2.22} \]
is a linear combination of two other equations (2.10) written in terms of the new variables. The constant prefactors in (2.21), (2.22) are derived using the reciprocity law for differentials of the third kind [18]. Alternatively, Eqs. (2.18), (2.20) can be proved in the same way as in Theorem 2.2.
2.3. Bilinear equations for the $\tau$-function. We have shown that the Baker–Akhiezer function satisfies an overdetermined system of linear equations. For compatibility of this system the coefficient functions must obey certain non-linear relations. In terms of the $\tau$-function all these relations have a bilinear form.

Theorem 2.3. The $\tau$-function obeys the Hirota bilinear difference equation
\[ \begin{aligned} &\lambda_{23}\, \tau(l_1 + 1, l_2, l_3)\, \tau(l_1, l_2 + 1, l_3 + 1) - \lambda_{13}\, \tau(l_1, l_2 + 1, l_3)\, \tau(l_1 + 1, l_2, l_3 + 1) \\ &\quad + \lambda_{12}\, \tau(l_1, l_2, l_3 + 1)\, \tau(l_1 + 1, l_2 + 1, l_3) = 0 \end{aligned} \tag{2.23} \]
with the constants $\lambda_{\alpha\beta}$ defined in (2.14).

Proof. First we examine the compatibility of Eqs. (2.10) for $(\alpha\beta) = (12)$ and $(\alpha\beta) = (13)$. The variable $l_1$ is common for both linear problems. Equations (2.10) can be rewritten as
\[ \Psi(l_\alpha + 1; P) = M_1^{(\alpha)} \Psi(l_\alpha; P), \qquad \alpha = 2, 3, \tag{2.24} \]
where $M_1^{(\alpha)}$ is the difference operator in $l_1$:
\[ M_1^{(\alpha)} = e^{\partial_{l_1}} - \lambda_{1\alpha}\, \frac{\tau(l_1, l_\alpha)\, \tau(l_1 + 1, l_\alpha + 1)}{\tau(l_1, l_\alpha + 1)\, \tau(l_1 + 1, l_\alpha)}. \tag{2.25} \]
Equation (2.24) has a family of linearly independent solutions parametrized by points of the curve $\Gamma$. Whence the compatibility is equivalent to commutativity of the operators
\[ \hat M_1^{(\alpha)} = e^{-\partial_{l_\alpha}} M_1^{(\alpha)}, \tag{2.26} \]
i.e.,
\[ [\hat M_1^{(2)}, \hat M_1^{(3)}] = 0, \tag{2.27} \]
which is a discrete zero curvature condition. Commuting the operators (2.25), we find after some algebra that this condition is equivalent to the relation
\[ \begin{aligned} &\lambda_{13}\, \tau(l_1, l_2 + 1, l_3)\, \tau(l_1 + 1, l_2, l_3 + 1) - \lambda_{12}\, \tau(l_1, l_2, l_3 + 1)\, \tau(l_1 + 1, l_2 + 1, l_3) \\ &\quad + H_1(l_1; l_2, l_3)\, \tau(l_1 + 1, l_2, l_3)\, \tau(l_1, l_2 + 1, l_3 + 1) = 0, \end{aligned} \tag{2.28} \]
where $H_1$ is an arbitrary function such that $H_1(l_1 + 1; l_2, l_3) = H_1(l_1; l_2, l_3)$. All this can be repeated for the other two pairs of linear problems, i.e., for $(\alpha\beta) = (21), (23)$ and $(\alpha\beta) = (31), (32)$ in (2.10). This leads to bilinear relations similar to (2.28) for the same $\tau$-function but with different functions $H_2$ and $H_3$. To be consistent, all three bilinear relations must be identical. This determines $H_1 = -\lambda_{23}$, etc., which proves the theorem.

Remark 2.7. In the class of the algebro-geometric solutions the Hirota Eq. (2.23) is equivalent to Fay’s trisecant identity [18].

The coefficients in Eq. (2.23) may be hidden by the transformation
\[ \tau(l_1, l_2, l_3) \to \Bigl( \frac{\lambda_{13}}{\lambda_{12}} \Bigr)^{l_1 l_2} \Bigl( \frac{\lambda_{13}}{\lambda_{23}} \Bigr)^{l_2 l_3} \tau(l_1, l_2, l_3) \tag{2.29} \]
bringing the Hirota equation into its canonical form:
\[ \begin{aligned} &\tau(l_1 + 1, l_2, l_3)\, \tau(l_1, l_2 + 1, l_3 + 1) - \tau(l_1, l_2 + 1, l_3)\, \tau(l_1 + 1, l_2, l_3 + 1) \\ &\quad + \tau(l_1, l_2, l_3 + 1)\, \tau(l_1 + 1, l_2 + 1, l_3) = 0. \end{aligned} \tag{2.30} \]
The formula
\[ \tau(l_1, l_2, l_3) = \exp\Bigl( l_1 l_2 \int_{P_3}^{P_2} d\Omega^{(01)} + l_2 l_3 \int_{P_1}^{P_2} d\Omega^{(03)} \Bigr)\, \Theta\Bigl( \sum_{\alpha=1}^{3} \vec A(P_\alpha)\, l_\alpha + \vec Z \Bigr) \tag{2.31} \]
thus provides the family of algebro-geometric solutions to Eq. (2.30). This family has $4g + 1$ continuous parameters. Indeed, for $g > 1$ the solution depends on $3g - 3$ moduli of the curve, on $g$ points $\gamma_i$ and on the 4 marked points $P_\alpha$. The dependence on the choice of local parameters is not essential.

In terms of the variables $l_1$, $l_3$, $\bar l_3$ the bilinear equation has the form
\[ \begin{aligned} &\lambda_{13}\, \tau(l_1, l_3; \bar l_3 + 1)\, \tau(l_1, l_3 + 1; \bar l_3) - \lambda_{23}\, \tau(l_1, l_3; \bar l_3)\, \tau(l_1, l_3 + 1; \bar l_3 + 1) \\ &\quad = \lambda_{12}\, \tau(l_1 + 1, l_3; \bar l_3 + 1)\, \tau(l_1 - 1, l_3 + 1; \bar l_3). \end{aligned} \tag{2.32} \]
(Alternatively, this equation is a result of the compatibility of Eqs. (2.10), (2.18) and (2.20).) In contrast to Eq. (2.23), the variable $l_2$ is not shifted here and is therefore omitted. The Hirota equation in the form (2.23) can be considered as a discrete KP equation, whereas the form (2.32) is a fully discretized 2D Toda lattice. Let us stress again that in the fully discretized setup the KP equation and the 2D Toda lattice become equivalent.

2.4. Degenerate cases. Degenerations of the curve $\Gamma$ lead to important classes of solutions. Among them are multi-soliton and rational solutions. Some examples of soliton solutions to Eqs. (2.30), (2.32) were found by R. Hirota [8]. Here we outline the algebro-geometric construction of the general soliton solutions.

Let us concentrate on multi-soliton solutions. In this case all $N$ “handles” of the Riemann surface of genus $N$ become infinitely thin. In other words, the algebraic curve of genus $N$ degenerates into the complex plane with a set of $2N$ marked points $p_i$, $q_i$:
\[ \Gamma \to \{ p_i, q_i,\; i = 1, \ldots, N \}, \]
where $p_i$, $q_i$ are the ends of the $i$th handle. The Baker–Akhiezer function has the same value at each pair $p_i$, $q_i$. The punctures $P_\alpha$ are replaced by points $z_\alpha$ with local parameters $w_\alpha = z - z_\alpha$. In this case the meromorphic differentials of the third kind have the form:
\[ d\Omega^{(\alpha\beta)} = \frac{(z_\beta - z_\alpha)\, dz}{(z - z_\beta)(z - z_\alpha)}. \]
Let $F_0(z)$ be a polynomial of degree $N$:
\[ F_0(z) = z^N + \sum_{j=1}^{N} b_j z^{N-j} = \prod_{i=1}^{N} (z - \gamma_i). \]
Its zeros $\gamma_i$ will stand for poles of the Baker–Akhiezer function.

Let us concentrate on the particular case when there are four punctures and three cuts from $z_0$ to $z_\alpha$ ($\alpha = 1, 2, 3$) on the complex $z$-plane. Here $z_\alpha$ are the $z$-coordinates of the marked points $P_\alpha$: $z_\alpha = z(P_\alpha)$. In this case the general definition of the Baker–Akhiezer function suggests the ansatz:
\[ \Psi(l; z) = \frac{F(l; z)}{F_0(z)}\, \Psi_0(l; z), \qquad l \equiv (l_1, l_2, l_3), \tag{2.33} \]
with
\[ \Psi_0(l; z) = \prod_{\alpha=1}^{3} \Bigl( \frac{z - z_\alpha}{z - z_0} \Bigr)^{l_\alpha} \tag{2.34} \]
and the polynomial
\[ F(l; z) = \sum_{j=0}^{N} a_j(l)\, z^{N-j} \tag{2.35} \]
with yet undetermined $l$-dependent coefficients. These coefficients are determined by the $N$ conditions
\[ \Psi(l; p_i) = \Psi(l; q_i), \qquad i = 1, \ldots, N. \tag{2.36} \]
They are equivalent to the system of $N$ linear equations for the $N + 1$ unknown coefficients $a_j$ in Eq. (2.35):
\[ \sum_{j=0}^{N} K_{ij} a_j = 0, \qquad i = 1, \ldots, N, \quad j = 0, 1, \ldots, N, \]
where
\[ K_{ij} = \frac{p_i^{N-j}}{F_0(p_i)}\, \Psi_0(l; p_i) - \frac{q_i^{N-j}}{F_0(q_i)}\, \Psi_0(l; q_i). \tag{2.37} \]
Solving the system of linear equations we represent the Baker–Akhiezer function in the form
\[ \Psi(l; z) = c(l)\, \frac{\Delta(l; z)}{\Delta(0; z)}\, \Psi_0(l; z), \tag{2.38} \]
where
\[ \Delta(l; z) = \det_{ij} (\hat K_{ij}) \]
and $\hat K$ is the $(N + 1) \times (N + 1)$-matrix with entries
\[ \hat K_{0j} = z^{N-j}, \qquad \hat K_{ij} = K_{ij}, \quad i = 1, \ldots, N. \]
The normalization factor $c(l)$ is fixed by the asymptotics $\Psi(l; z) = (z - z_0)^{-l_1 - l_2 - l_3} (1 + O(z - z_0))$ near the point $z_0$. It gives
\[ c(l) = \frac{\Delta(0; z_0)}{\Delta(l; z_0)} \prod_{\alpha=1}^{3} (z_0 - z_\alpha)^{-l_\alpha}. \]
We also point out the identity
\[ \Delta(l_\alpha; z_\alpha) = \Delta(l_\alpha + 1; z_0), \qquad \alpha = 1, 2, 3. \tag{2.39} \]
The degeneration of the $\tau$-function (2.17) is
\[ \tau(l) = \Delta(l; z_0). \tag{2.40} \]
It satisfies the bilinear Eq. (2.23) with
\[ \lambda_{\alpha\beta} = \frac{z_\alpha - z_\beta}{(z_0 - z_\alpha)(z_0 - z_\beta)}. \]
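As an illustration only, the one-soliton case $N = 1$ of this statement can be checked in exact rational arithmetic. The sketch below builds $\tau(l) = \Delta(l; z_0)$ from (2.34), (2.37), (2.40) and evaluates the left hand side of (2.23) with the constants $\lambda_{\alpha\beta}$ above. The numerical values of $z_\alpha$, $p$, $q$, $\gamma$ and all function names are hypothetical choices made for this example and are not taken from the paper.

```python
from fractions import Fraction as Fr
from itertools import product

# Hypothetical sample data (not from the paper): punctures z_0..z_3,
# the two ends p, q of the single handle, and the pole gamma of Psi.
z = [Fr(0), Fr(1), Fr(2), Fr(5)]
p, q, gamma = Fr(7), Fr(-3), Fr(11)

def lam(a, b):
    """lambda_{ab} = (z_a - z_b) / ((z_0 - z_a)(z_0 - z_b))."""
    return (z[a] - z[b]) / ((z[0] - z[a]) * (z[0] - z[b]))

def psi0(l, x):
    """Psi_0(l; x) = prod_{a=1..3} ((x - z_a)/(x - z_0))^{l_a}, Eq. (2.34)."""
    out = Fr(1)
    for a in (1, 2, 3):
        out *= ((x - z[a]) / (x - z[0])) ** l[a - 1]
    return out

def K(l, j):
    """Entry K_{1j} of Eq. (2.37) for N = 1, with F_0(z) = z - gamma."""
    return (p ** (1 - j) / (p - gamma) * psi0(l, p)
            - q ** (1 - j) / (q - gamma) * psi0(l, q))

def tau(l):
    """tau(l) = Delta(l; z_0) = det [[z_0, 1], [K_10, K_11]], Eq. (2.40)."""
    return z[0] * K(l, 1) - K(l, 0)

def hirota_residual(l1, l2, l3):
    """Left hand side of the bilinear equation (2.23) at the point (l1, l2, l3)."""
    t = lambda a, b, c: tau((l1 + a, l2 + b, l3 + c))
    return (lam(2, 3) * t(1, 0, 0) * t(0, 1, 1)
            - lam(1, 3) * t(0, 1, 0) * t(1, 0, 1)
            + lam(1, 2) * t(0, 0, 1) * t(1, 1, 0))

# Usage: the residual vanishes exactly on a small grid of integer times.
grid_ok = all(hirota_residual(*l) == 0 for l in product(range(-2, 3), repeat=3))
print(grid_ok)
```

Exact fractions are used instead of floating point so that the vanishing of the residual is an identity rather than a numerical coincidence; the check covers only $N = 1$ and the chosen sample points.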
The continuous parameters of the solution (2.40) are $2N$ points $p_i$, $q_i$, $N$ points $\gamma_i$ and 4 points $z_\alpha$. However, the $\tau$-function is invariant (up to an irrelevant constant) under the simultaneous fractional-linear transformation $z \to (az + b)/(cz + d)$, $ad - bc = 1$, of all these parameters, so we are left with $3N + 1$ parameters.

In the case of rational degeneration the points $p_i$ and $q_i$ merge and the condition (2.36) becomes
\[ (\hat D_i \Psi)(p_i) = 0, \qquad i = 1, \ldots, N, \tag{2.41} \]
where $\hat D_i$ are some differential operators in $z$ with constant coefficients. This changes the form of the matrix $K$ but Eqs. (2.38), (2.39) remain the same.

3. Elliptic Solutions

General finite-gap solutions of Hirota’s equation for an arbitrary algebraic curve $\Gamma$ with punctures $P_\alpha$ are quasi-periodic functions of all variables $l_\alpha$. Below we construct a special important class of solutions for which the quantity (1.7) is doubly periodic in one of the variables. The $\tau$-function in this case is an elliptic polynomial (1.6). We call them elliptic solutions. We show that the elliptic solutions also imply a spectral algebraic curve and are therefore a subclass of the algebro-geometric solutions of Sect. 2.

Among all algebro-geometric solutions described in the previous section the elliptic solutions in one of the variables or their linear combination are characterized as follows. Let
\[ \partial_x = \sum_{\alpha\beta} X_{\alpha\beta}\, \partial_{l_{\alpha\beta}} \tag{3.1} \]
be a vector field in the space of variables $l_{\alpha\beta}$ and let $\vec U = \sum_{\alpha\beta} X_{\alpha\beta} \vec U^{(\alpha\beta)}$. Let us transport the $\tau$-function (2.6) along the vector field $\partial_x$ and denote it by
\[ \tau(x) = \Theta\bigl( \vec U x + \vec X(l) + \vec Z \bigr). \tag{3.2} \]
Consider a set of algebraic curves with punctures $P_\alpha$ and cuts such that the vector $\vec U$ has the property:

• There exist two constants $\omega_1$, $\omega_2$, $\mathrm{Im}(\omega_2/\omega_1) \neq 0$, such that $2\omega_a \vec U$, $a = 1, 2$, belong to the lattice of periods of the holomorphic differentials on $\Gamma$:
\[ \tau(x + 2\omega_a) = e^{r_a x + s_a}\, \tau(x), \]
where $r_a$, $s_a$ are constants.

The $\tau$-function is then an elliptic polynomial in the variable $x$.
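The quasi-periodicity stated above can be illustrated numerically in the simplest case $g = 1$, where the theta-function of Sect. 2.1 is a single sum over integers. The sketch below is only an illustration under that assumption; the truncation level `n_max`, the function name `theta` and the sample values of $B$ and $X$ are choices made here, not data from the paper.

```python
import cmath

def theta(X, B, n_max=12):
    """Truncated genus-1 theta function: sum_n exp(pi*i*B*n^2 + 2*pi*i*n*X).

    Requires Im(B) > 0; n_max = 12 is an ad hoc cutoff that is ample for
    the sample value of B used below.
    """
    return sum(cmath.exp(cmath.pi * 1j * (B * n * n + 2 * n * X))
               for n in range(-n_max, n_max + 1))

# Hypothetical sample values: a period ratio B and a generic argument X.
B = 0.3 + 1.1j
X = 0.2 + 0.1j

# Shift by the a-period: Theta(X + 1) = Theta(X).
err_a = abs(theta(X + 1, B) - theta(X, B))

# Shift by the b-period: Theta(X + B) = exp(-pi*i*B - 2*pi*i*X) * Theta(X).
factor = cmath.exp(-1j * cmath.pi * (B + 2 * X))
err_b = abs(theta(X + B, B) - factor * theta(X, B))

print(err_a, err_b)  # both should be at the level of rounding error
```

With $\tau(x) = \Theta(x/(2\omega_1) \,|\, B)$ and $B = \omega_2/\omega_1$ these two relations take the form $\tau(x + 2\omega_a) = e^{r_a x + s_a} \tau(x)$ with $r_1 = s_1 = 0$, $r_2 = -\pi i/\omega_1$, $s_2 = -\pi i B$.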
Due to the commensurability of $\vec U$ and the lattice of periods, solutions to the equation
\[ \Theta(\vec U x + \vec Z) = 0 \tag{3.3} \]
are $x_i(\vec Z) + 2J_1 \omega_1 + 2J_2 \omega_2$, where $x_i(\vec Z)$ belong to the fundamental domain of the lattice generated by $2\omega_1$, $2\omega_2$ and $J_1$, $J_2$ run over all integers. Therefore,
\[ \Theta(\vec U x + \vec Z) = e^{a_1 x + a_2 x^2} \prod_{i=1}^{N} \sigma\bigl( x - x_i(\vec Z) \bigr) \tag{3.4} \]
with $x$-independent $a_1$, $a_2$ and $N \geq g$. The requirement of ellipticity imposes $2g$ constraints on the $4g + 1$ parameters. Therefore the dimension of the family of elliptic solutions is $2g + 1$. Below we concentrate on a specific case where the ellipticity is imposed along $l_{01} \equiv l_1$ by setting $\vec U = \eta^{-1} \vec U^{(01)}$ and $x = \eta l_1$, where $\eta$ is a complex constant.

The logic of this section is opposite to the one of Sect. 2. Here we solve the direct problem and show that all solutions which are elliptic in any “direction” $l_{\alpha\beta}$, of the form (1.6) with a time-independent degree $N$, are of the finite gap type (3.2).

3.1. Equations of motion for zeros of the $\tau$-function. Let us show how to obtain the equations of motion for zeros of elliptic solutions by elementary methods. First of all, let us rename variables to emphasize the “direction” of ellipticity:
\[ l_1 \equiv x/\eta, \qquad l_2 \equiv n, \qquad l_3 \equiv m, \qquad \bar l_3 \equiv \bar m, \tag{3.5} \]
so that the $\tau$-function has the form (1.6). Let us now consider one of Eqs. (2.10), say for $\alpha, \beta = 1, 3$. In this equation all variables except $x$ and $m$ are parameters and we skip them wherever it does not cause confusion: $x_i^{m,n} \to x_i^m$, $\tau^{m,n}(x) \to \tau^m(x)$,
\[ \psi^m(x + \eta) - \lambda_{13}\, \frac{\tau^m(x)\, \tau^{m+1}(x + \eta)}{\tau^{m+1}(x)\, \tau^m(x + \eta)}\, \psi^m(x) = \psi^{m+1}(x). \tag{3.6} \]
Let us look for solutions of the form
\[ \psi^m(x) = \frac{\rho^m(x)}{\tau^m(x)} \]
with some function $\rho^m(x)$. Eq. (3.6) then reads
\[ \tau^{m+1}(x)\, \rho^m(x + \eta) - \lambda_{13}\, \tau^{m+1}(x + \eta)\, \rho^m(x) = \tau^m(x + \eta)\, \rho^{m+1}(x). \tag{3.7} \]
We are interested in the case when $\tau^m(x)$ is an elliptic polynomial in $x$ for any $m$:
\[ \tau^m(x) = \prod_{j=1}^{N} \sigma(x - x_j^m). \tag{3.8} \]
The “equations of motion” for its roots $x_j^m$ in the “discrete time” $m$ can be easily obtained from Eq. (3.7) in the following way. Substituting $x = x_j^{m+1}$, $x = x_j^{m+1} - \eta$, $x = x_j^m - \eta$, we get the relations
\[ \begin{aligned} -\lambda_{13}\, \tau^{m+1}(x_j^{m+1} + \eta)\, \rho^m(x_j^{m+1}) &= \tau^m(x_j^{m+1} + \eta)\, \rho^{m+1}(x_j^{m+1}), \\ \tau^{m+1}(x_j^{m+1} - \eta)\, \rho^m(x_j^{m+1}) &= \tau^m(x_j^{m+1})\, \rho^{m+1}(x_j^{m+1} - \eta), \\ \tau^{m+1}(x_j^m - \eta)\, \rho^m(x_j^m) &= \lambda_{13}\, \tau^{m+1}(x_j^m)\, \rho^m(x_j^m - \eta), \end{aligned} \tag{3.9} \]
respectively. Combining these relations, we eliminate $\rho$ and obtain a system of equations for the roots $x_i^m$:
\[ \prod_{k=1}^{N} \frac{\sigma(x_j^m - x_k^{m-1})\, \sigma(x_j^m - x_k^m + \eta)\, \sigma(x_j^m - x_k^{m+1} - \eta)}{\sigma(x_j^m - x_k^{m-1} + \eta)\, \sigma(x_j^m - x_k^m - \eta)\, \sigma(x_j^m - x_k^{m+1})} = -1. \tag{3.10} \]
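For the reader's convenience, the elimination of $\rho$ leading to (3.10) can be sketched as follows. This is only one way of organizing the step, under the genericity assumption that $\rho^{m-1}(x_j^m) \neq 0$; the shorthand $x_j \equiv x_j^m$ is introduced here for brevity.

```latex
% First two relations of (3.9) with m -> m-1, third relation unchanged (x_j \equiv x_j^m):
\begin{align*}
\rho^{m}(x_j) &= -\lambda_{13}\,\frac{\tau^{m}(x_j+\eta)}{\tau^{m-1}(x_j+\eta)}\,\rho^{m-1}(x_j),\\
\rho^{m}(x_j-\eta) &= \frac{\tau^{m}(x_j-\eta)}{\tau^{m-1}(x_j)}\,\rho^{m-1}(x_j),\\
\tau^{m+1}(x_j-\eta)\,\rho^{m}(x_j) &= \lambda_{13}\,\tau^{m+1}(x_j)\,\rho^{m}(x_j-\eta).
\end{align*}
% Substituting the first two lines into the third and cancelling \lambda_{13}\,\rho^{m-1}(x_j):
\[
\frac{\tau^{m-1}(x_j)\,\tau^{m}(x_j+\eta)\,\tau^{m+1}(x_j-\eta)}
     {\tau^{m-1}(x_j+\eta)\,\tau^{m}(x_j-\eta)\,\tau^{m+1}(x_j)} = -1 .
\]
```

Writing each $\tau$ as the product (3.8) turns the last relation into the system (3.10), factor by factor.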
Note that these equations involve only one discrete variable $m$ while the direct substitution of the ansatz (3.8) into the non-linear Eq. (2.30) would give a system of equations involving two discrete variables. It is not easy to see that they in fact decouple. The decoupling becomes transparent if one starts with the auxiliary linear problem (3.6).

Similar equations hold for the discrete $\bar l_3 \equiv \bar m$-dynamics. From the linear problem (2.18) we obtain the equations for the $\bar m$ dependence of $x_i$:
\[ \prod_{k=1}^{N} \frac{\sigma(x_j^{\bar m} - x_k^{\bar m - 1} - \eta)\, \sigma(x_j^{\bar m} - x_k^{\bar m} + \eta)\, \sigma(x_j^{\bar m} - x_k^{\bar m + 1})}{\sigma(x_j^{\bar m} - x_k^{\bar m - 1})\, \sigma(x_j^{\bar m} - x_k^{\bar m} - \eta)\, \sigma(x_j^{\bar m} - x_k^{\bar m + 1} + \eta)} = -1. \tag{3.11} \]
To avoid confusion, let us stress that the $x_i$ depend on all discrete times. However, the variables are separated in the equations of motion.

Remark 3.1. Another choice of the “direction” of ellipticity gives rise to different equations of motion. To illustrate this, let us require $\tau(l)$ to be an elliptic polynomial in the direction orthogonal to $s = l_3 + \bar l_3$ and $a = l_1 + l_3 - \bar l_3$, so that the zeros of the $\tau$-function depend on $a$ and $s$. Then the vector field in (3.1) is $\partial_x = \eta\, \partial_{l_1} - \frac{1}{2} \partial_{l_3} + \frac{1}{2} \partial_{\bar l_3}$. In terms of $T^{a,s}(x) \equiv \tau(l_1, l_3, \bar l_3)$ the bilinear Eq. (2.32) reads:
\[ \lambda_{12}\, T^{a,s}(x + \eta)\, T^{a,s}(x - \eta) + \lambda_{23}\, T^{a,s+1}(x)\, T^{a,s-1}(x) = \lambda_{13}\, T^{a+1,s}(x)\, T^{a-1,s}(x), \]
and the zeros $x_i^{a,s}$ of $T^{a,s}(x)$ obey the system of coupled equations:
\[ \frac{T^{a+1,s+1}(x_i^{a,s})\, T^{a-1,s}(x_i^{a,s} - \eta)\, T^{a,s-1}(x_i^{a,s} + \eta)}{T^{a-1,s-1}(x_i^{a,s})\, T^{a+1,s}(x_i^{a,s} + \eta)\, T^{a,s+1}(x_i^{a,s} - \eta)} = -1, \]
\[ \frac{T^{a+1,s-1}(x_i^{a,s})\, T^{a-1,s}(x_i^{a,s} - \eta)\, T^{a,s+1}(x_i^{a,s} + \eta)}{T^{a-1,s+1}(x_i^{a,s})\, T^{a+1,s}(x_i^{a,s} + \eta)\, T^{a,s-1}(x_i^{a,s} - \eta)} = -1. \]
These equations are more complicated than (3.10). In contrast to the previous case, evolutions in $a$ and $s$ are not separated.

3.2. Double-Bloch solutions to the linear problems. In order to further examine the elliptic solutions, we need the notion of double-Bloch functions. A meromorphic function $f(x)$ is said to be double-Bloch if it enjoys the following monodromy properties:
\[ f(x + 2\omega_a) = B_a f(x), \qquad a = 1, 2. \tag{3.12} \]
The complex numbers $B_a$ are called Bloch multipliers. A non-trivial double-Bloch function can be represented as a linear combination of elementary ones:
\[ f(x) = \sum_{i=1}^{N} c_i\, \Phi(x - x_i, \zeta)\, k^{x/\eta}, \tag{3.13} \]
where [6]
\[ \Phi(x, \zeta) = \frac{\sigma(\zeta + x + \eta)}{\sigma(\zeta + \eta)\, \sigma(x)} \Bigl[ \frac{\sigma(\zeta - \eta)}{\sigma(\zeta + \eta)} \Bigr]^{x/(2\eta)} \tag{3.14} \]
and complex parameters $\zeta$ and $k$ are related to the Bloch multipliers by the formulas
\[ B_a = k^{2\omega_a/\eta} \exp\bigl( 2\zeta(\omega_a)(\zeta + \eta) \bigr) \Bigl( \frac{\sigma(\zeta - \eta)}{\sigma(\zeta + \eta)} \Bigr)^{\omega_a/\eta} \tag{3.15} \]
($\zeta(x) = \sigma'(x)/\sigma(x)$ is the Weierstrass $\zeta$-function).

Let us point out some properties of the function $\Phi(x, \zeta)$. Considered as a function of $\zeta$, $\Phi(x, \zeta)$ is double-periodic:
\[ \Phi(x, \zeta + 2\omega_a) = \Phi(x, \zeta). \]
For general values of $x$ one can define a single-valued branch of $\Phi(x, \zeta)$ by cutting the elliptic curve between the points $\zeta = \pm\eta$. In the fundamental domain of the lattice generated by $2\omega_a$ the function $\Phi(x, \zeta)$ has a unique pole at the point $x = 0$:
\[ \Phi(x, \zeta) = \frac{1}{x} + O(1). \]
In the next subsection we need the identity
\[ \Phi(x, z)\, \Phi(y, z) = \Phi(x + y, z) \bigl( \zeta(x) + \zeta(y) + \zeta(z + \eta) - \zeta(x + y + z + \eta) \bigr), \tag{3.16} \]
which is equivalent to the well known 3-term bilinear functional equation for the $\sigma$-function.

Recall the notion of equivalent Bloch multipliers [6]. The “gauge transformation” $f(x) \to \tilde f(x) = f(x) e^{bx}$ ($b$ is an arbitrary constant) does not change the poles of any function and transforms a double-Bloch function into another double-Bloch function. If $B_a$ are Bloch multipliers for $f$, then the Bloch multipliers for $\tilde f$ are $\tilde B_1 = B_1 e^{2b\omega_1}$, $\tilde B_2 = B_2 e^{2b\omega_2}$. Two such pairs of Bloch multipliers $B_a$ and $\tilde B_a$ are said to be equivalent. (In other words, they are equivalent if the product $B_1^{\omega_2} B_2^{-\omega_1}$ is the same for both pairs.) This definition implies that any double-Bloch function can be represented as a ratio of two elliptic polynomials of the same degree multiplied by an exponential function and a constant:
\[ f(x) = c_0 (k')^{x/\eta} \prod_{i=1}^{N} \frac{\sigma(x - y_i)}{\sigma(x - x_i)}. \tag{3.17} \]
The Bloch multipliers are
\[ B_a = (k')^{2\omega_a/\eta} \exp\Bigl( 2\zeta(\omega_a) \sum_{j=1}^{N} (x_j - y_j) \Bigr). \]
Equation (3.13) represents a Bloch function by its poles and residues, whereas Eq. (3.17) represents a Bloch function by its poles and zeros.

3.3. The Lax representation. The coefficients in Eq. (3.6) are elliptic functions, i.e. double-periodic with periods $2\omega_a$. Therefore, the equation has double-Bloch solutions. Similarly to the case of the Calogero-Moser model and its spin generalizations, the dynamics of poles of the elliptic coefficient in the linear problem is determined by the fact that Eq. (3.6) has an infinite number of double-Bloch solutions.

In what follows we always assume that the poles are in a generic position, i.e. $x_i^m - x_j^m \neq 0, \pm\eta$ and $x_i^m - x_j^{m \pm 1} \neq 0, \pm\eta$ for any pair $i \neq j$. Exceptional cases are also of interest but must be treated separately.
Theorem 3.1. Let $\tau^m(x)$ be an elliptic polynomial of degree $N$. Equation (3.6) has $N$ linearly independent double-Bloch solutions with simple poles at the points $x_i^m$ and equivalent Bloch multipliers if and only if the zeros $x_i^m$ of the $\tau$-function satisfy the “equations of motion” (3.10).

Theorem 3.2. If Eq. (3.6) has $N$ linearly independent double-Bloch solutions with equivalent Bloch multipliers, then it has an infinite number of them. All these solutions have the form
\[ \psi^m(x) = \sum_{i=1}^{N} c_i(m, \zeta, k)\, \Phi(x - x_i^m, \zeta)\, k^{x/\eta} \tag{3.18} \]
($\Phi(x, \zeta)$ is defined in (3.14)). The set of corresponding pairs $(\zeta, k)$ is parametrized by points of an algebraic curve.

These theorems are proved by the same arguments as in [6]. Here we present the main steps. $N$ linearly independent double-Bloch solutions with equivalent Bloch multipliers may be written in the form (3.18) with some values of the parameters $\zeta_r$, $k_r$, $r = 1, \ldots, N$. Equivalence of the multipliers implies that the $\zeta_r$ can be chosen to be equal: $\zeta_r = \zeta$. Let us substitute the function $\psi^m(x)$ of the form (3.18) with this particular value of $\zeta$ into Eq. (3.6). Since any double-Bloch function (except those equivalent to a constant) has at least one pole, it follows that the equation is satisfied if its left hand side has zero residues at the points $x = x_i^m - \eta$ and $x = x_i^{m+1}$. The cancellation of poles at these points gives the conditions
\[ k\, c_i(m, \zeta, k) - f_i(m) \sum_{j=1}^{N} c_j(m, \zeta, k)\, \Phi(x_i^m - x_j^m - \eta, \zeta) = 0, \tag{3.19} \]
\[ c_i(m + 1, \zeta, k) = g_i(m) \sum_{j=1}^{N} c_j(m, \zeta, k)\, \Phi(x_i^{m+1} - x_j^m, \zeta), \tag{3.20} \]
where
\[ f_i(m) = \lambda_{13}\, \frac{\prod_{s=1}^{N} \sigma(x_i^m - x_s^m - \eta)\, \sigma(x_i^m - x_s^{m+1})}{\prod_{s=1, \neq i}^{N} \sigma(x_i^m - x_s^m) \prod_{s=1}^{N} \sigma(x_i^m - x_s^{m+1} - \eta)}, \tag{3.21} \]
\[ g_i(m) = -\lambda_{13}\, \frac{\prod_{s=1}^{N} \sigma(x_i^{m+1} - x_s^{m+1} + \eta)\, \sigma(x_i^{m+1} - x_s^m)}{\prod_{s=1}^{N} \sigma(x_i^{m+1} - x_s^m + \eta) \prod_{s=1, \neq i}^{N} \sigma(x_i^{m+1} - x_s^{m+1})}. \tag{3.22} \]
Introducing a vector $C(m)$ with components $c_i(m, \zeta, k)$ we can rewrite these conditions in the form
\[ (L(m) - kI)\, C(m) = 0, \tag{3.23} \]
\[ C(m + 1) = M(m)\, C(m), \tag{3.24} \]
where $I$ is the unit matrix. The matrix elements of $L(m)$ and $M(m)$ are:
\[ L_{ij}(m) = f_i(m)\, \Phi(x_i^m - x_j^m - \eta, \zeta), \tag{3.25} \]
\[ M_{ij}(m) = g_i(m)\, \Phi(x_i^{m+1} - x_j^m, \zeta). \tag{3.26} \]
The compatibility condition of (3.19) and (3.20), L(m + 1)M(m) = M(m)L(m)
(3.27)
has a form of the discrete Lax equation. The Lax Eq. (3.27) appeared in ref. [11], where Eqs. (3.10) were proposed as a time discretezation of the RS model. Lemma 3.1. For matrices L and M defined by (3.25), (3.26) the discrete Lax Eq. (3.27) is equivalent to the equations (3.10). The proof is along the lines of ref. [11]. We have Fij ≡ (x, ζ)(M(m)L(m) − L(m + 1)M(m))ij X = fi (m + 1) gs (m)8(xm+1 − xm+1 − η, ζ)8(xm+1 − xm i s s j , ζ) + gi (m)
X
s m m fs (m)8(xm+1 − xm i s , ζ)8(xs − xj − η, ζ).
(3.28)
s
The coefficient in front of the leading singularity at ζ = −η is proportional to X X fi (m + 1) gs (m) + gi (m) fs (m). s
On the other hand,
X
s
fs (m) − gs (m) = 0
(3.29)
s
(because this is the sum of residues of the elliptic coefficient in Eq. (3.6) ). Therefore, fi (m + 1) = gi (m),
i = 1, . . . , N.
(3.30)
These equations are equivalent to (3.10). To show that (3.28) is identically zero provided Eqs. (3.30) hold, we use the identity (3.16):
\[ F_{ij}(x, \zeta) = -g_i(m)\,\Phi(x_i^{m+1} - x_j^m - \eta, \zeta)\, G, \]
\[ G = -\sum_s f_s(m) \bigl( \zeta(x_i^{m+1} - x_s^m) + \zeta(x_s^m - x_j^m - \eta) \bigr) + \sum_s g_s(m) \bigl( \zeta(x_i^{m+1} - x_s^{m+1} - \eta) + \zeta(x_s^{m+1} - x_j^m) \bigr). \tag{3.31} \]
Noting that $G$ is proportional to the sum of residues of the elliptic function
\[ \bigl( \zeta(x_i^{m+1} - \eta - x) + \zeta(x - x_j^m) \bigr) \prod_{s=1}^{N} \frac{\sigma(x - x_s^m)\,\sigma(x - x_s^{m+1} + \eta)}{\sigma(x - x_s^m + \eta)\,\sigma(x - x_s^{m+1})} \]
at the points $x = x_s^{m+1}$ and $x = x_s^m - \eta$, we conclude that $G = 0$.
It was already proved that Eq. (3.6) has $N$ linearly independent solutions if Eqs. (3.10) or the Lax Eq. (3.27) hold for some value of the spectral parameter $\zeta$. It then follows from Lemma 3.1 that the Lax equation holds for any value of $\zeta$. Therefore, for each $\zeta$ there exists a double-Bloch solution given by (3.18), where the $c_i$ are components of the common solution to (3.23), (3.24).
Theorem 3.3. All elliptic solutions of Eq. (2.23) of the form (3.8) are of the algebro-geometric type and $x_i^m$ are given implicitly by the equation
\[ \Theta\bigl( \eta^{-1} \vec{A}(P_1)\, x_i^m + m\, \vec{A}(P_3) + \vec{Z} \bigr) = 0, \tag{3.32} \]
where the Riemann theta-function corresponds to the algebraic curve $\Gamma$ defined by the characteristic equation
\[ R(k, \zeta) \equiv \det(L(m) + kI) = k^N + \sum_{i=1}^{N} r_i(\zeta)\, k^{N-i} = 0. \tag{3.33} \]
The matrix $L$ is given by Eqs. (3.25), (3.21) and the coefficients $r_i(\zeta)$ have the form
\[ r_i(\zeta) = I_i\, \frac{\sigma(\zeta + \eta)^{(i-2)/2}\, \sigma(\zeta - (i-1)\eta)}{\sigma(\zeta - \eta)^{i/2}}, \]
where $I_i$ are integrals of motion.
The characteristic Eq. (3.33) and $I_i$ are functions of $x_i^m$ and $x_i^{m+1}$ but stay the same for all $m$. The spectral curve $\Gamma$ determined by Eq. (3.33) is an algebraic curve realized as a ramified covering of the elliptic curve. The function (3.18) is the Baker–Akhiezer function on $\Gamma$. We call the spectral curve $\Gamma$ defined in Theorem 3.3 the Ruijsenaars-Schneider (RS) curve. The RS curve is identical to the spectral curve for the continuous time RS model studied in ref. [6]. The proof of the theorem is omitted. It is as in ref. [6].
The matrix $L$ is defined by fixing $x_i^{m_0}$ and $x_i^{m_0+1}$. These Cauchy data uniquely define the RS curve $\Gamma$, the vectors $\vec{A}(P_\alpha)$, $\alpha = 1, 2$ and $\vec{Z}$ in Eq. (3.32). The curve and the vectors $\vec{A}(P_\alpha)$, $\alpha = 1, 2$ do not depend on the choice of $m_0$. They are action-type variables. The vector $\vec{Z}$ depends linearly on this choice and its components are thus angle-type variables.
Remark 3.2. The discrete time dynamics defined in Theorem 3.3 is time-reversible, i.e. the Cauchy data $x_i^{m_0}$, $x_i^{m_0+1}$ completely determine the dynamics for both time directions up to permutation of the “particles”. The L-M-pair for the backward time motion is obtained from the difference equations for the dual Baker–Akhiezer function (see Remark 2.6) with an ansatz similar to (3.18). An alternative way to derive equations of motion (3.10) is to require the spectral Eq. (3.23) to be identical to the similar equation for the backward time motion.
Remark 3.3. The form of equations for the dynamics in $l_2 \equiv n$ is identical to the equations (3.10) of the dynamics in $m \equiv l_3$. The Cauchy data for the $m$-dynamics, i.e., the values of $x_i^m$ at $m = 0$ and $m = 1$, completely determine the evolution and the Cauchy data in $n$ (as well as all other flows). Comparing L-operators for each flow, one finds relations between the Cauchy data:
\[ \prod_{s=1}^{N} \frac{\sigma(x_i^{0,0} - x_s^{1,0})\,\sigma(x_i^{0,0} - x_s^{0,1} - \eta)}{\sigma(x_i^{0,0} - x_s^{0,1})\,\sigma(x_i^{0,0} - x_s^{1,0} - \eta)} = \frac{\lambda_{12}}{\lambda_{13}}, \qquad i = 1, 2, \ldots, N. \tag{3.34} \]
Similar connections exist for the initial data of the $\bar{m}$-flow.
3.4. Loci equations. The values of $x_i^0$, $x_i^1$ may be arbitrary if no other reduction (apart from the elliptic one) is imposed. If there is an additional reduction, then the $x_i^0$, $x_i^1$ are constrained to belong to a submanifold of $\mathbb{C}^{2N}$, the reduction locus. An example of such a locus in the continuous setup is the KP → KdV locus (1.4). Here we present equations defining the loci for three important reductions of Hirota’s difference equation. In these cases spectral curves of algebro-geometric solutions are hyperelliptic. As before, $x_i^0$, $x_i^1$ are assumed to be in generic position.
A) Discrete KdV equation [18]. The discrete KdV equation appears as the reduction
\[ \tau(l_1, l_2 + 1, l_3 + 1) = \tau(l_1, l_2, l_3), \qquad x \equiv \eta l_1 \]
of the general 3-dimensional Hirota Eq. (2.23). In the notation (3.5) the equation is
\[ \lambda_{23}\, \tau^m(x + \eta)\,\tau^m(x) - \lambda_{13}\, \tau^{m-1}(x)\,\tau^{m+1}(x + \eta) + \lambda_{12}\, \tau^{m+1}(x)\,\tau^{m-1}(x + \eta) = 0. \tag{3.35} \]
For algebro-geometric solutions the reduction means that $\vec{U}^{(02)} + \vec{U}^{(03)}$ belongs to the lattice of periods. Therefore, the function
\[ \varepsilon(P) = \exp\left( \int_{Q_0}^{P} \bigl( d\Omega^{(02)} + d\Omega^{(03)} \bigr) \right) \tag{3.36} \]
is meromorphic on the curve $\Gamma$. From the definition of the abelian integrals it follows that this function has a double pole at $P_0$ and simple zeros at $P_2$ and $P_3$. Outside these points $\varepsilon(P)$ is holomorphic and is not zero. The existence of such a function means that the spectral curve is hyperelliptic. A hyperelliptic curve of genus $g$ can be defined by the equation
\[ y^2 = \prod_{i=1}^{2g+1} (\varepsilon - \varepsilon_i). \tag{3.37} \]
This is a two-fold covering of the complex plane of the variable $\varepsilon$. The projection of $\Gamma$ onto the $\varepsilon$-plane defines $\varepsilon$ as a meromorphic function on $\Gamma$. This function has a double pole on $\Gamma$ at the branch point $P_\infty$ (above $\varepsilon = \infty$) and two simple zeros at the points $P_0^{(\pm)}$ (above $\varepsilon = 0$). The identification of this notation for the punctures with our previous ones is $P_\infty = P_0$, $P_0^{(-)} = P_2$, $P_0^{(+)} = P_3$.
The branch points $\varepsilon_i$ in (3.37) may not be arbitrary since the curve should simultaneously be of the RS type. Correspondingly, the Cauchy data $x_i^0$, $x_i^1$ with respect to the $l_3$-flow obey certain constraints. Using the equations of motion (3.10) and (3.34), we obtain a system of $2N$ coupled equations on allowed values of $x_i^0$, $x_i^1$ (“equilibrium locus”):
\[ \prod_{s=1}^{N} \frac{\sigma(x_i^1 - x_s^1 + \eta)\,\sigma(x_i^1 - x_s^0 - \eta)}{\sigma(x_i^1 - x_s^1 - \eta)\,\sigma(x_i^1 - x_s^0 + \eta)} = -\frac{\lambda_{13}}{\lambda_{12}}, \tag{3.38} \]
\[ \prod_{s=1}^{N} \frac{\sigma(x_i^0 - x_s^0 + \eta)\,\sigma(x_i^0 - x_s^1 - \eta)}{\sigma(x_i^0 - x_s^0 - \eta)\,\sigma(x_i^0 - x_s^1 + \eta)} = -\frac{\lambda_{12}}{\lambda_{13}} \tag{3.39} \]
for $i = 1, 2, \ldots, N$. With the help of Eq. (2.15) the right hand sides can be expressed through abelian integrals. The relation between the number of zeros $N$ and genus $g$ of the spectral curve is a subtle question. We do not discuss it here.
Each of the systems (3.38), (3.39) has the form of Bethe equations for an $N$-site spin chain of spin 1 at each site. One may treat $x_s^0$ (for instance) as arbitrary input parameters while Bethe’s quasimomenta $x_s^1$ are to be determined by Eqs. (3.38). However, the system of locus equations (3.38), (3.39) determines $x_i^0$ and $x_i^1$ simultaneously.
In the continuous time limit we set $x_i^0 = x_i$, $x_i^1 = x_i + \epsilon \dot{x}_i + \frac{1}{2} \epsilon^2 \ddot{x}_i$, $\dot{x}_i = \partial_t x_i$, $\epsilon \to 0$. Assuming $\lambda_{13}/\lambda_{12} \to -(1 + \epsilon C)$ with a constant $C$, we get from (3.38), (3.39):
\[ \sum_{k=1}^{N} \dot{x}_k \bigl[ \zeta(x_i - x_k - \eta) - \zeta(x_i - x_k + \eta) \bigr] = C, \tag{3.40} \]
\[ \sum_{k=1}^{N} \dot{x}_k \bigl[ \wp(x_i - x_k - \eta) - \wp(x_i - x_k + \eta) \bigr] = 0. \tag{3.41} \]
In the leading order in $\epsilon$ the systems (3.38), (3.39) coincide with each other and yield the first system (3.40), while the second one (3.41) follows from the higher order terms. These equations define the equilibrium locus for the Ruijsenaars-Schneider system of particles.
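The leading-order statement can be checked by a short expansion. The following is only a sketch: it writes $\epsilon$ for the small time step, takes $x_i^0 = x_i$, $x_i^1 = x_i + \epsilon \dot{x}_i + O(\epsilon^2)$ and $-\lambda_{13}/\lambda_{12} = 1 + \epsilon C + O(\epsilon^2)$, and uses only $\frac{d}{dx} \log \sigma(x) = \zeta(x)$, where $\zeta$ is the Weierstrass zeta function. The abbreviation $\zeta_\pm$ is introduced here for convenience and is not notation from the text.

```latex
% Abbreviation (assumption of this sketch): \zeta_\pm := \zeta(x_i - x_s \pm \eta).
% At \epsilon = 0 every factor of the product in (3.38) equals 1, so only first-order terms survive.
\begin{align*}
\log \prod_{s=1}^{N}
  \frac{\sigma(x_i^1 - x_s^1 + \eta)\,\sigma(x_i^1 - x_s^0 - \eta)}
       {\sigma(x_i^1 - x_s^1 - \eta)\,\sigma(x_i^1 - x_s^0 + \eta)}
 &= \epsilon \sum_{s=1}^{N} \Bigl[ (\dot{x}_i - \dot{x}_s)(\zeta_+ - \zeta_-)
      + \dot{x}_i (\zeta_- - \zeta_+) \Bigr] + O(\epsilon^2) \\
 &= \epsilon \sum_{s=1}^{N} \dot{x}_s \bigl[ \zeta(x_i - x_s - \eta) - \zeta(x_i - x_s + \eta) \bigr]
      + O(\epsilon^2), \\
\log\Bigl( -\frac{\lambda_{13}}{\lambda_{12}} \Bigr) &= \epsilon C + O(\epsilon^2).
\end{align*}
% Equating the O(\epsilon) terms gives (3.40). The same expansion of (3.39),
% whose right-hand side is 1/(1 + \epsilon C), gives (3.40) with both sides multiplied by -1.
```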
B) 1D Toda chain in discrete time [8]. The reduction condition in this case is:
\[ \tau(l_1, l_2, l_3 + 1, \bar{l}_3 + 1) = \tau(l_1, l_2, l_3, \bar{l}_3). \]
The discrete time 1D Toda chain in the bilinear form reads
\[ \lambda_{13}\, \tau^{m-1}(x)\,\tau^{m+1}(x) - \lambda_{12}\, \tau^{m-1}(x + \eta)\,\tau^{m+1}(x - \eta) = \lambda_{23}\, (\tau^m(x))^2, \tag{3.42} \]
where we have excluded $\bar{l}_3$ and have passed to the notation of Example A). In this case $\vec{U}^{(03)} + \vec{U}^{(12)}$ belongs to the lattice of periods. The corresponding curve is given by Eq. (3.37) with a polynomial of even degree in the r.h.s. The Cauchy data $x_i^0$, $x_i^1$ with respect to the $l_3$-flow satisfy the locus equations:
\[ \prod_{s=1}^{N} \frac{\sigma(x_i^1 - x_s^1 + \eta)\,\sigma^2(x_i^1 - x_s^0)}{\sigma(x_i^1 - x_s^1 - \eta)\,\sigma^2(x_i^1 - x_s^0 + \eta)} = -\frac{\lambda_{12}}{\lambda_{13}}, \tag{3.43} \]
\[ \prod_{s=1}^{N} \frac{\sigma(x_i^0 - x_s^0 + \eta)\,\sigma^2(x_i^0 - x_s^1 - \eta)}{\sigma(x_i^0 - x_s^0 - \eta)\,\sigma^2(x_i^0 - x_s^1)} = -\frac{\lambda_{13}}{\lambda_{12}}. \tag{3.44} \]
The continuous time limit can be taken in the same way as in the previous example. In this case, however, we have to assume $\lambda_{13}/\lambda_{12} \to \tilde{C} \epsilon^{-2}$ as $\epsilon \to 0$ with a constant $\tilde{C}$. We get
\[ \sigma^2(\eta) \prod_{k=1,\neq i}^{N} \frac{\sigma(x_i - x_k + \eta)\,\sigma(x_i - x_k - \eta)}{\sigma^2(x_i - x_k)} = \tilde{C}\, \dot{x}_i^2, \tag{3.45} \]
\[ \sum_{k=1,\neq i}^{N} (\dot{x}_i + \dot{x}_k) \bigl[ \zeta(x_i - x_k - \eta) + \zeta(x_i - x_k + \eta) - 2\zeta(x_i - x_k) \bigr] = 0. \tag{3.46} \]
These equations follow also from Eqs. (4.58), (4.59) of the paper [6]. They define the stable locus for the RS system with respect to another flow than in Eqs. (3.40), (3.41).
C) Discrete sine-Gordon equation¹. The reduction condition is
\[ \tau(l_1, l_2, l_3 + 1, \bar{l}_3) = \tau(l_1 + 2, l_2, l_3, \bar{l}_3 + 1), \]
so, passing to the same independent variables as in the previous examples, we get the equation
\[ \lambda_{13}\, \tau^m(x - \eta)\,\tau^m(x + \eta) - \lambda_{23}\, \tau^{m-1}(x + \eta)\,\tau^{m+1}(x - \eta) = \lambda_{12}\, (\tau^m(x))^2. \tag{3.47} \]
Now it is the vector $\vec{U}^{(01)} + \vec{U}^{(32)}$ that belongs to the lattice of periods. The continuous SG equation is reproduced in the limit $P_3 \to P_0$, $P_2 \to P_1$. The spectral curves are again hyperelliptic. The locus equations for the Cauchy data $x_i^0$, $x_i^1$ with respect to the $m$-flow are
\[ \prod_{s=1}^{N} \frac{\sigma(x_i^0 - x_s^1)\,\sigma(x_i^0 - x_s^1 - 2\eta)}{\sigma^2(x_i^0 - x_s^1 - \eta)} = \frac{\lambda_{12}}{\lambda_{13}}, \tag{3.48} \]
\[ \prod_{s=1}^{N} \frac{\sigma(x_i^1 - x_s^0)\,\sigma(x_i^1 - x_s^0 + 2\eta)}{\sigma^2(x_i^1 - x_s^0 + \eta)} = \frac{\lambda_{12}}{\lambda_{13}}. \tag{3.49} \]
Note that the structure of these equations is different compared to the previous examples.
3.5. Remarks on trigonometric solutions. Trigonometric solutions are degenerations of the elliptic solutions when one of the periods tends to infinity. They form a particular subfamily in the variety of soliton solutions described in Sect. 2.4. The trigonometric solutions admit a very explicit description in terms of the data defining the singular curve.
Let us set the period to be $2\pi$: $\tau^m(x + 2\pi) = \tau^m(x)$, so an elliptic polynomial becomes a Laurent polynomial in $e^{ix}$. The Bethe-like equations of motion (3.10) preserve their form, but the Weierstrass function $\sigma(x)$ is replaced by $\sin x$. It follows from the periodicity that the function $\Psi_0$ (2.34) obeys
\[ \frac{\Psi_0(\eta x + 2\pi\eta, m; p_j)}{\Psi_0(\eta x + 2\pi\eta, m; q_j)} = \frac{\Psi_0(\eta x, m; p_j)}{\Psi_0(\eta x, m; q_j)}, \]
or, explicitly,
\[ \frac{p_j - z_1}{p_j - z_0} = e^{-i\eta J_j}\, \frac{q_j - z_1}{q_j - z_0}, \qquad j = 1, 2, \ldots, N, \tag{3.50} \]
1 This version of the discrete SG equation is different from the one considered in [20], [21]. The latter is closer to a special degeneracy of the discrete KdV at λ12 → λ13 .
where $J_j$ are integer numbers. This condition restricts admissible singular curves (Riemann spheres with $N$ double points). Minimal Laurent polynomials correspond to the choice $J_j = \pm 1$. This gives $N$ conditions for $p_j$, $q_j$, so the number of continuous parameters in the trigonometric case is $2N + 1$, the same as for the non-degenerate curves of genus $N$. Here we do not discuss the trigonometric degeneration of the L-M-pair and the dependence of $p_j$ and $q_j$ on initial data for Ruijsenaars-Schneider particles. This can be done along the lines of the paper [5].
The trigonometric loci can be characterized alternatively by imposing a relation on $p_j$ and $q_j$ in addition to (3.50):
\[ E(p_i) = E(q_i), \qquad p_i \neq q_i, \qquad i = 1, 2, \ldots, N. \tag{3.51} \]
For the examples A)-C) of Sect. 3.4 the functions $E(z)$ are
\[ \text{A)}\ \ E(z) = \frac{(z - z_2)(z - z_3)}{(z - z_0)^2}, \qquad \text{B)}\ \ E(z) = \frac{(z - z_2)(z - z_3)}{(z - z_1)(z - z_0)}, \qquad \text{C)}\ \ E(z) = \frac{(z - z_1)(z - z_2)}{(z - z_3)(z - z_0)}. \tag{3.52} \]
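Since each $E(z)$ in (3.52) is a ratio of two quadratics, the relation $E(q) = E(p)$ is a quadratic equation in $q$ with the known root $q = p$, so the second root follows from the sum of the roots. The sketch below illustrates this under stated assumptions: the function names `rational_e` and `partner`, and the sample values of $z_0, \ldots, z_3$ and $p$, are hypothetical choices made only for illustration, and the generic case $E(p) \neq 1$ is assumed.

```python
def rational_e(z, zeros, poles):
    """E(z) = (z - a)(z - b) / ((z - c)(z - d)) with zeros (a, b) and poles (c, d)."""
    (a, b), (c, d) = zeros, poles
    return (z - a) * (z - b) / ((z - c) * (z - d))


def partner(p, zeros, poles):
    """Return the second solution q of E(q) = E(p).

    With e = E(p), the equation E(z) = e reads
        (1 - e) z^2 - (a + b - e (c + d)) z + (a b - e c d) = 0,
    and one root is z = p, so the other is (sum of roots) - p.
    Assumes e != 1 (otherwise the equation is not quadratic).
    """
    (a, b), (c, d) = zeros, poles
    e = rational_e(p, zeros, poles)
    return (a + b - e * (c + d)) / (1.0 - e) - p


# Case A) of (3.52) with illustrative values z0 = 0, z2 = 1, z3 = 2.
q_a = partner(3.0, zeros=(1.0, 2.0), poles=(0.0, 0.0))

# Case B) of (3.52) with illustrative values z0 = 0, z1 = -1, z2 = 1, z3 = 2.
q_b = partner(3.0, zeros=(1.0, 2.0), poles=(-1.0, 0.0))
```

Condition (3.51) additionally requires $p_i \neq q_i$, which excludes the points where the two roots coincide; the same arithmetic works for complex arguments.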
Conditions (3.50), (3.51) leave us with a discrete set of admissible pairs pi , qi . The continuous parameters γs give then an implicit parametrization of the loci.
4. Conclusion
We have shown that the main body of finite-gap theory and the theory of elliptic solutions to nonlinear integrable equations is also applicable to finite-difference (discrete) integrable equations. The discrete theory includes the continuous theory as the result of a limiting procedure. Furthermore, discrete equations reveal some symmetries lost in the continuum limit. We have shown that all elliptic solutions with a constant number of zeros in the evolution (compare to [13]) are of the algebro-geometric type. Moreover, their algebraic curves are spectral curves for L-operators of the Ruijsenaars-Schneider model. Each point of this curve gives rise to discrete time dynamics of zeros of the $\tau$-function.
The structure of equilibrium loci equations of reductions of Hirota’s equation (analogues of the known KdV-locus (1.4)) is expected to be richer than in the continuous case and requires further study. It would be very interesting to extend the algebro-geometric approach to elliptic solitons of KdV of ref. [22] to the difference case, as well as to understand difference elliptic solitons in terms of the Weierstrass reduction theory [23].
To the two main motivations pointed out at the beginning of the paper we can now add yet another one: an intriguing, intimate connection between the elliptic solutions to soliton equations and quantum integrable models solved by the Bethe ansatz. In our opinion, the very fact that the dynamics of zeros and the equilibrium loci are described by Bethe-like equations is remarkable and suggests hidden parallels between quantum and classical integrable equations.
Acknowledgement. We thank Ovidiu Lipan for discussions and for his interest in the subject and J. Talstra for help. The work of I.K. was supported by RFBR grant 95-01-00751. The work of A.Z. was supported in part by RFBR grant 97-02-19085, by ISTC grant 015 and also by NSF grant DMR-9509533. A.Z. is grateful to Professor R. Seiler for the kind hospitality at the Technische Universität Berlin where part of this work was done. P.W. was supported by NSF grant DMR-9509533. P.W. and A.Z. thank the Institute for Theoretical Physics at Santa Barbara for its hospitality in April 1997 where this paper was completed.
References
1. Airault, H., McKean, H., and Moser, J.: Rational and elliptic solutions of the KdV equation and related many-body problem. Comm. Pure and Appl. Math. 30, 95–125 (1977)
2. Chudnovsky, D.V. and Chudnovsky, G.V.: Pole expansions of non-linear partial differential equations. Nuovo Cimento 40B, 339–350 (1977)
3. Krichever, I.M.: On rational solutions of Kadomtsev-Petviashvilii equation and integrable systems of N particles on line. Funct. Anal i Pril. 12:1, 76–78 (1978)
4. Krichever, I.M.: Elliptic solutions of Kadomtsev–Petviashvilii equation and integrable systems of particles. Func. Anal. Appl. 14:4, 282–290 (1980)
5. Krichever, I., Babelon, O., Billey, E. and Talon, M.: Spin generalization of the Calogero-Moser system and the Matrix KP equation. Preprint LPTHE 94/42
6. Krichever, I.M. and Zabrodin, A.V.: Spin generalization of the Ruijsenaars–Schneider model, nonabelian 2D Toda chain and representations of Sklyanin algebra. Uspekhi Mat. Nauk, 50:6, 3–56 (1995), hep-th/9505039
7. Ruijsenaars, S.N.M. and Schneider, H.: A new class of integrable systems and its relation to solitons. Ann. Phys. (NY) 170, 370–405 (1986)
8. Hirota, R.: Nonlinear partial difference equations II; Discrete time Toda equations. J. Phys. Soc. Japan 43, 2074–2078 (1977); Discrete analogue of a generalized Toda equation. J. Phys. Soc. Japan 50, 3785–3791 (1981)
9. Miwa, T.: On Hirota’s difference equation. Proc. Japan Acad. 58 Ser. A, 9–12 (1982); Date, E., Jimbo, M. and Miwa, T.: Method for generating discrete soliton equations I, II. J. Phys. Soc. Japan 51, 4116–4131 (1982)
10. Zabrodin, A.: A survey of Hirota’s difference equations. Preprint ITEP-TH-10/97, solv-int/9704001
11. Nijhoff, F., Ragnisco, O. and Kuznetsov, V.: Integrable time-discretization of the Ruijsenaars–Schneider model. Commun. Math. Phys. 176, 681–700 (1996)
12. Kulish, P.P. and Reshetikhin, N.Yu.: On GL3-invariant solutions of the Yang–Baxter equation and associated quantum systems. Zap. Nauchn. Sem. LOMI 120, 92–121 (1982), Engl. transl.: J. Soviet Math. 34, 1948–1971 (1986); Klumper, A. and Pearce, P.: Conformal weights of RSOS lattice models and their fusion hierarchies. Physica A183, 304–350 (1992); Kuniba, A., Nakanishi, T. and Suzuki, J.: Functional relations in solvable lattice models, I: Functional relations and representation theory, II: Applications. Int. J. Mod. Phys. A9, 5215–5312 (1994)
13. Krichever, I., Lipan, O., Wiegmann, P. and Zabrodin, A.: Quantum integrable models and discrete classical Hirota equations. Preprint ESI 330 (1996), Commun. Math. Phys. 188, 267–304 (1997)
14. Zabrodin, A.: Discrete Hirota’s equation in quantum integrable models. Preprint ITEP-TH-44/96 (1996), hep-th/9610039
15. Wiegmann, P.: Bethe ansatz and Classical Hirota Equation. Int. J. Mod. Phys. B, 11, 75–89 (1997)
16. Krichever, I.M.: Algebraic curves and non-linear difference equation. Uspekhi Mat. Nauk 33 n 4, 215–216 (1978)
17. Dubrovin, B., Matveev, V. and Novikov, S.: Non-linear equations of Korteweg–de Vries type, finite zone linear operators and Abelian varieties. Uspekhi Mat. Nauk 31:1, 55–136 (1976); Krichever, I.M.: Nonlinear equations and elliptic curves. Modern problems in mathematics, Itogi nauki i tekhniki, VINITI AN USSR 23, (1983)
18. Fay, J.: Theta functions on Riemann surfaces. Springer Lecture Notes in Math., Berlin–Heidelberg–New York: Springer-Verlag, 352, 1973
19. Hirota, R.: Nonlinear partial difference equations I. J. Phys. Soc. Japan 43, 1424–1433 (1977)
20. Hirota, R.: Nonlinear partial difference equations III; Discrete sine-Gordon equation. J. Phys. Soc. Japan 43, 2079–2086 (1977)
21. Faddeev, L.D. and Volkov, A.Yu.: Quantum inverse scattering method on a spacetime lattice. Teor. Mat. Fiz. 92, 207–214 (1992) (in Russian)
22. Treibich, A. and Verdier, J.-L.: Solitons elliptiques. Grothendieck Festschrift, ed.: P. Cartier et al, Progress in Math. 88, Boston: Birkhäuser, 1990; Treibich, A.: Tangential polynomials and elliptic solitons. Duke Math. J. 59, 611–627 (1989)
23. Belokolos, E. and Enolskii, V.: Verdier elliptic solitons and Weierstrass reduction theory. Funct. Anal. i Pril. 23:1, 57–58 (1989); Belokolos, E., Bobenko, A., Enolskii, V., Its, A. and Matveev, V.: Algebraic-geometrical approach to nonlinear integrable equations, Berlin: Springer, 1994
Communicated by T. Miwa
Commun. Math. Phys. 193, 397 – 406 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
The Inviscid Burgers Equation with Brownian Initial Velocity
Jean Bertoin
Laboratoire de Probabilités, Université Pierre et Marie Curie, 4, Place Jussieu, F-75252 Paris Cedex 05, France. E-mail: [email protected]
Received: 24 February 1997 / Accepted: 8 September 1997
Abstract: The law of the (Hopf-Cole) solution of the inviscid Burgers equation with Brownian initial velocity is made explicit. As examples of applications, we investigate the smoothness of the solution, the statistical distribution of the shocks, we determine the exact Hausdorff function of the Lagrangian regular points and investigate the existence of Lagrangian regular points in a fixed Borel set.
1. Introduction and Main Results
Burgers equation with viscosity parameter $\varepsilon > 0$,
\[ \partial_t u + \partial_x (u^2/2) = \varepsilon\, \partial^2_{xx} u, \tag{1} \]
appears in several classical physical problems. It was initially introduced by Burgers as a simple model of hydrodynamic turbulence, where the solution $u_\varepsilon(x, t)$ may be viewed as the velocity of a fluid particle located at $x$ at time $t$ (however, it is now commonly admitted that it is not a good model for turbulence). See Burgers [5], and also [18] and [2] for discussions of further physical motivations. The initial datum $u(\cdot, 0)$ is referred to as the initial velocity; we suppose here that $u(\cdot, 0) = 0$ on $(-\infty, 0)$.
Burgers [5] and Hopf [13] have considered the asymptotic behaviour of the solution $u_\varepsilon$ of (1) as the viscosity parameter $\varepsilon$ tends to 0. Roughly, $u_\varepsilon$ converges to a certain function $u$, which will be referred to as the Hopf-Cole solution of the inviscid limit equation
\[ \partial_t u + \partial_x (u^2/2) = 0. \tag{2} \]
The Hopf-Cole solution can be expressed implicitly in terms of the initial velocity as follows (cf. [13], and also [18] and [19] for a brief account): under simple conditions such as $\liminf_{x \to \infty} u(x, 0)/x \ge 0$, the function
\[ s \to \int_0^s \bigl( u(r, 0) + (r - x) t^{-1} \bigr)\, dr \tag{3} \]
tends to $\infty$ as $s \to \infty$ for every $x \ge 0$ and $t > 0$. We then denote by $a(x, t)$ the largest location of the overall minimum of (3). Alternatively, $a(x, t)$ can be expressed as
\[ a(x, t) = \max\{ s \ge 0 : C'(s) \le x/t \}, \tag{4} \]
where $s \to C(s)$ denotes the convex hull (i.e. the largest convex lower function) of the function $s \to \int_0^s (u(r, 0) + r t^{-1})\, dr$, and $C'$ the right-derivative of $C$. The mapping $x \to a(x, t)$ is right-continuous increasing; it is known as the inverse Lagrangian function. The Hopf-Cole solution to (2) is given by
\[ u(x, t) = \frac{x - a(x, t)}{t}. \tag{5} \]
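The recipe (3)–(5) can be illustrated on a grid. The following is a minimal sketch, not part of the original argument: the function names, the uniform grid of step `ds`, and the trapezoid rule are assumptions made for illustration, and the initial velocity is passed as a list of sampled values `u0[k]` approximating $u(k\,ds, 0)$. The search is restricted to the sampled range, so the grid must extend past the true minimiser.

```python
def inverse_lagrangian(u0, ds, x, t):
    """Largest grid location of the minimum of s -> int_0^s (u0(r) + (r - x)/t) dr.

    u0[k] is the initial velocity sampled at r = k * ds; the integral is
    accumulated with the trapezoid rule.
    """
    best_index = 0
    best_value = 0.0          # the integral vanishes at s = 0
    value = 0.0
    previous = u0[0] + (0.0 - x) / t
    for k in range(1, len(u0)):
        current = u0[k] + (k * ds - x) / t
        value += 0.5 * (previous + current) * ds
        if value <= best_value:   # "<=" keeps the largest minimiser on ties
            best_value = value
            best_index = k
        previous = current
    return best_index * ds


def hopf_cole_velocity(u0, ds, x, t):
    """Formula (5): u(x, t) = (x - a(x, t)) / t."""
    return (x - inverse_lagrangian(u0, ds, x, t)) / t


# Sanity check on a constant initial velocity c = 0.5, for which the
# function (3) is c*s + (s**2/2 - x*s)/t, minimised at s = x - c*t.
constant_profile = [0.5] * 1001      # grid on [0, 10] with ds = 0.01
a_example = inverse_lagrangian(constant_profile, 0.01, 2.0, 1.0)
u_example = hopf_cole_velocity(constant_profile, 0.01, 2.0, 1.0)
```

For a constant initial velocity the velocity is simply transported unchanged, which is what the check reproduces; for a sampled Brownian path the same routine would be applied to the simulated values.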
Since the early 90’s, there has been a lot of interest in the inviscid Burgers equation with random initial velocity. In particular, Sinai [19] and She-Aurell-Frisch [18] have considered the case when the initial velocity is given by a Brownian motion and a fractional Brownian motion respectively, and Avellaneda and E [2, 3] the case when the initial velocity is given by a white noise. See also Carraro-Duchon [6] where (2) is understood in some weak statistical sense.
Our main result specifies the distribution of the inverse Lagrangian function (and hence, thanks to (5), that of the Hopf-Cole solution) in the case of Brownian initial velocity. More precisely, the condition
\[ (u(x, 0),\ x \ge 0) \ \text{is a Brownian motion started at } u(0, 0) = 0 \tag{6} \]
is enforced from now on, except in Sect. 3 where more general initial velocities are discussed. As an easy scaling argument (cf. Eq. (21) in [18]) shows that the distribution of the solution $u(\cdot, t)$ at time $t > 0$ is just a linear transform of that at time 1, we may focus on the case $t = 1$ with no loss of generality.
Theorem 1. The process $(a(x, 1) - a(0, 1) : x \ge 0)$ is independent of $a(0, 1)$ and has the same distribution as the first passage process of a Brownian motion with unit drift, that is that of $(T_x : x \ge 0)$ where
\[ T_x = \inf\{ s \ge 0 : u(s, 0) + s > x \}. \]
In other words, $(a(x, 1) : x \ge 0)$ is a right-continuous increasing process having independent and homogeneous increments. The distribution of the increments is specified by
\[ E \exp\{ -q\,(a(x, 1) - a(0, 1)) \} = \exp\{ -x \Phi(q) \}, \qquad x, q \ge 0, \]
where the Laplace exponent $\Phi$ is given by
\[ \Phi(q) = \sqrt{2q + 1} - 1. \]
Finally, the variable $a(0, 1)$ has a gamma (1/4,1/2)-law, that is the probability measure on $[0, \infty)$ with density $2^{-1/4} y^{-3/4} e^{-y/2} / \Gamma(1/4)$.
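The two explicit laws in Theorem 1 can be checked numerically. The sketch below is an illustration under stated assumptions: it uses the classical first-passage (inverse Gaussian) density $x (2\pi s^3)^{-1/2} \exp\{-(x - s)^2 / 2s\}$ of a Brownian motion with unit drift, which is a standard fact not quoted in the text, and a plain midpoint rule; all function names, truncation points and step counts are arbitrary choices.

```python
import math


def midpoint_integral(f, lower, upper, steps):
    """Midpoint-rule approximation of the integral of f over [lower, upper]."""
    h = (upper - lower) / steps
    return h * sum(f(lower + (k + 0.5) * h) for k in range(steps))


def first_passage_density(s, x):
    """Density of T_x = inf{s : B_s + s > x} for a standard Brownian motion B."""
    return x / math.sqrt(2.0 * math.pi * s ** 3) * math.exp(-(x - s) ** 2 / (2.0 * s))


def laplace_of_first_passage(q, x, upper=80.0, steps=200000):
    """Numerical value of E exp(-q T_x), truncating the integral at `upper`."""
    return midpoint_integral(
        lambda s: math.exp(-q * s) * first_passage_density(s, x), 0.0, upper, steps
    )


def laplace_exponent(q):
    """Phi(q) = sqrt(2q + 1) - 1 from Theorem 1."""
    return math.sqrt(2.0 * q + 1.0) - 1.0


def gamma_quarter_half_mass(upper=6.0, steps=20000):
    """Total mass of the density 2**(-1/4) y**(-3/4) exp(-y/2) / Gamma(1/4).

    The substitution y = t**4 turns the integrand into 4 * exp(-t**4 / 2),
    which removes the singularity at the origin.
    """
    constant = 2.0 ** -0.25 / math.gamma(0.25)
    return constant * midpoint_integral(
        lambda t: 4.0 * math.exp(-0.5 * t ** 4), 0.0, upper, steps
    )


numeric = laplace_of_first_passage(q=1.5, x=2.0)
exact = math.exp(-2.0 * laplace_exponent(1.5))   # Phi(1.5) = 1, so this is exp(-2)
mass = gamma_quarter_half_mass()
```

The Laplace transform of the gamma (1/4,1/2)-law is $(1 + 2q)^{-1/4}$, whose square is the transform $(1 + 2q)^{-1/2}$ of the gamma (1/2,1/2)-law; this is the convolution identity used in the proof.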
Remark. It is important at this point to discuss the connection with a recent work of Carraro-Duchon [6]. These authors have introduced the notion of intrinsic statistical solutions of the inviscid Burgers equation, which can be thought of as weak solutions in a statistical sense where the statistical correlation in time is not relevant. In particular, the Hopf-Cole solution yields an intrinsic statistical solution, but the converse fails: an intrinsic statistical solution does not necessarily correspond to the distribution of the Hopf-Cole solution. The main result of Carraro-Duchon is that there is a unique flow of laws of processes with independent and homogeneous increments which forms an intrinsic statistical solution of the inviscid Burgers equation with Brownian initial velocity, and more precisely, this solution at time $t = 1$ corresponds to the distribution of $(x - T_x : x \ge 0)$, where $T_x$ is as in Theorem 1. In particular, our description of the distribution of the increments of the inverse Lagrangian function will follow from the result of Carraro-Duchon, provided that one is able to prove that the Hopf-Cole solution has independent and homogeneous increments. This is precisely the main technical issue of the proof of Theorem 1; the derivation of the law of the increments being then essentially straightforward. Although we will not rely on [6], it is clear that the paper by Carraro-Duchon has provided a most important hint in this work.
The very simple probabilistic structure that is described in Theorem 1 has several interesting consequences. We first consider the statistical behaviour of the increments of the inverse Lagrangian function. It is immediately seen that
\[ \lim_{\varepsilon \to 0+} E \exp\bigl\{ -q \varepsilon^{-2} (a(x + \varepsilon, 1) - a(x, 1)) \bigr\} = e^{-\sqrt{2q}}, \qquad q \ge 0, \]
so that by Laplace inversion
\[ \lim_{\varepsilon \to 0+} P\bigl( \varepsilon^{-2} (a(x + \varepsilon, 1) - a(x, 1)) \in ds \bigr) = \frac{1}{\sqrt{2\pi}}\, s^{-3/2} \exp\{ -1/(2s) \}\, ds, \qquad s > 0. \]
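The first limit is a one-line consequence of Theorem 1. As a sketch, using only the homogeneity of the increments and the formula $\Phi(q) = \sqrt{2q + 1} - 1$:

```latex
\begin{align*}
E \exp\bigl\{ -q \varepsilon^{-2} \bigl( a(x + \varepsilon, 1) - a(x, 1) \bigr) \bigr\}
  &= \exp\bigl\{ -\varepsilon\, \Phi(q \varepsilon^{-2}) \bigr\}, \\
\varepsilon\, \Phi(q \varepsilon^{-2})
  &= \varepsilon \Bigl( \sqrt{2 q \varepsilon^{-2} + 1} - 1 \Bigr)
   = \sqrt{2q + \varepsilon^2} - \varepsilon
   \;\longrightarrow\; \sqrt{2q} \qquad (\varepsilon \to 0+).
\end{align*}
% The limit e^{-\sqrt{2q}} is the Laplace transform of the first passage time
% of a standard Brownian motion at level 1, a stable law of index 1/2.
```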
Roughly speaking, the increments of $a(\cdot, 1)$ on an interval of length $\varepsilon$ are statistically of order $\varepsilon^2$. Precise information on the almost-sure smoothness of $a(\cdot, 1)$ is given by the following result.
Corollary 1. For every fixed $x \ge 0$, we have with probability one
\[ \liminf_{\varepsilon \to 0+} \frac{a(x + \varepsilon, 1) - a(x, 1)}{\varepsilon^2}\, \log\log(1/\varepsilon) = 1/8; \]
and for every increasing function $h : (0, \infty) \to (0, \infty)$
\[ \limsup_{\varepsilon \to 0+} \frac{a(x + \varepsilon, 1) - a(x, 1)}{h(\varepsilon)} = 0 \ \text{or} \ \infty \]
according as the integral $\int_0^1 \sqrt{1/h(s)}\, ds$ converges or diverges.
This entails that $a(\cdot, 1)$ is Hölder-continuous with exponent $(2 - \eta)$ for every $\eta > 0$ with probability one at each fixed point. By (5), we have in particular that $\partial u(x, 1)/\partial x = 1$ almost everywhere with probability one; a fact that has been proven first by Sinai [19]. In the same vein, one can make use of a recent result of Jaffard [14] to determine the Hausdorff dimension of the set of points where $a(\cdot, 1)$ is Hölder regular with given exponent.
The discontinuities (or shocks) of the Eulerian velocity $u$ are another major object of interest (cf. [2, 3, 18, 19]). From the viewpoint of hydrodynamic turbulence, the amplitude of the jump $u(x, t) - u(x-, t) < 0$ at an Eulerian shock point $x$ corresponds to the velocity absorbed into the shock. To give a complete statistical description of the shocks, let us first recall the notion of Poisson point process due to K. Itô. Let $\mu$ be a sigma-finite measure on $(0, \infty)$. A Poisson point process with characteristic measure $\mu$ is a point process $(\Delta_x : x \ge 0)$ with values in $(0, \infty)$ such that for every measurable set $E \subseteq (0, \infty)$ with $\mu(E) < \infty$, the counting process
\[ \mathrm{Card}\{ r \in [0, s] : \Delta_r \in E \}, \qquad s \ge 0, \]
is a Poisson process with parameter $\mu(E)$, and to disjoint sets correspond independent counting processes.
Corollary 2. The velocity discontinuities $(-\Delta u(x, 1) = u(x-, 1) - u(x, 1) : x \ge 0)$ form a Poisson point process valued in $(0, \infty)$ with characteristic measure
\[ \frac{1}{\sqrt{2\pi}}\, y^{-3/2} \exp\{ -y/2 \}\, dy \qquad (y > 0). \]
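The characteristic measure of Corollary 2 can be cross-checked against the Laplace exponent of Theorem 1 and used to quantify the number of large and small shocks. The sketch below rests on assumptions made for illustration: the function names are hypothetical, the Lévy–Khintchine integral $\int_0^\infty (1 - e^{-qy})\, \mu(dy)$ with zero drift is evaluated by a midpoint rule after the substitution $y = t^2$, and the closed form used for the tail $\mu((y, \infty))$ comes from one integration by parts rather than from the text.

```python
import math

SQRT_2PI = math.sqrt(2.0 * math.pi)


def levy_density(y):
    """Density of the characteristic measure in Corollary 2."""
    return y ** -1.5 * math.exp(-0.5 * y) / SQRT_2PI


def laplace_exponent_from_measure(q, upper=12.0, steps=100000):
    """Midpoint-rule value of int_0^inf (1 - exp(-q y)) mu(dy).

    With y = t**2 the integrand becomes smooth at the origin; the integral
    in t is truncated at `upper`, where exp(-t**2 / 2) is negligible.
    """
    h = upper / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        y = t * t
        # (1 - exp(-q y)) * density(y) * dy/dt, with dy/dt = 2 t
        total += -math.expm1(-q * y) * levy_density(y) * 2.0 * t
    return h * total


def tail_mass(y):
    """mu((y, inf)) = sqrt(2 / (pi y)) exp(-y/2) - erfc(sqrt(y/2))."""
    return math.sqrt(2.0 / (math.pi * y)) * math.exp(-0.5 * y) - math.erfc(
        math.sqrt(0.5 * y)
    )


phi_numeric = laplace_exponent_from_measure(1.5)     # should be close to sqrt(4) - 1 = 1
small_ratio = tail_mass(1e-4) / math.sqrt(2.0 / (math.pi * 1e-4))
large_ratio = tail_mass(200.0) / (
    math.sqrt(2.0 / math.pi) * 200.0 ** -1.5 * math.exp(-100.0)
)
```

The number of shocks of amplitude greater than $y$ on a unit interval is Poisson with mean `tail_mass(y)`, so both ratios above tend to 1 in the respective limits $y \to 0+$ and $y \to \infty$.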
For instance, one deduces from Corollary 2 that the number $N(y)$ of shocks of amplitude greater than $y > 0$ on an interval of unit length has
\[ P(N(y) > 0) \sim \sqrt{\frac{2}{\pi}}\, y^{-3/2} e^{-y/2} \qquad \text{as } y \to \infty. \]
In a different direction, a standard consequence of the strong law of large numbers for Poisson point processes entails
\[ N(y) \sim \sqrt{\frac{2}{\pi y}} \qquad \text{as } y \to 0+ \]
with probability one.
Next, we turn our interest to the fractal properties of the set $\mathcal{R}$ of the so-called Lagrangian regular points, that are the points $y \ge 0$ for which there exists some $x \ge 0$ such that the function (3) reaches its overall minimum at $y = a(x, 1)$ and nowhere else.
Corollary 3. With probability one, the function $s \to \sqrt{s \log\log 1/s}$ is an exact Hausdorff function for $\mathcal{R}$.
In particular, the Hausdorff dimension of $\mathcal{R}$ is 1/2 with probability one, a result that has been proven first by Sinai [19]; see also Aspandiiarov-Le Gall [1]. In the same vein, one can also calculate the packing measure of $\mathcal{R}$ by making use of results in Fristedt-Taylor [9]. In particular, one finds that the packing dimension of $\mathcal{R}$ is again 1/2, so $\mathcal{R}$ is a fractal set in the sense of Taylor [20], with probability one.
Finally, we discuss the existence of regular Lagrangian points in a fixed Borel set $E \subseteq (0, \infty)$.
Corollary 4. The probability that $E$ contains at least one regular Lagrangian point is positive if and only if the (1/2)-capacity of $E$ is positive, that is if and only if there exists a non-trivial measure $\mu$ with support in $E$ such that $\int_E |s - r|^{-1/2} \mu(dr) \le 1$ for all $s \ge 0$.
As a consequence of the celebrated Frostman’s lemma, we see in particular that $E$ has no regular Lagrangian points with probability one whenever its Hausdorff dimension is strictly less than 1/2; whereas it contains regular Lagrangian points with positive probability if its Hausdorff dimension is strictly greater than 1/2 (of course, the case of dimension 1/2 is critical). More precisely, information on the Hausdorff dimension of the set of regular Lagrangian points in $E$ derives immediately from results in Hawkes [11].
The rest of this paper is organized as follows. Theorem 1 and its corollaries are established in Sect. 2. The proof of Theorem 1 relies crucially on the so-called last-exit decomposition for Markov processes; we stress that this technique already had a key role in [2, 3]. The corollaries are then deduced from standard results for subordinators. Section 3 is devoted to an extension of Theorem 1 to the situation where the initial velocity is given by a Lévy process with no positive jumps. This case was first investigated by Carraro-Duchon [6] in the framework of intrinsic statistical solutions (to this end, the remark after Theorem 1 is also relevant here); a related result is proven here for the Hopf-Cole solution.
2. Proof of the Statements
It will be convenient to use the so-called canonical notation for random processes. Specifically, let $\Omega$ denote the set of càdlàg¹ paths $\omega : [0, \infty) \to \mathbb{R} \cup \{\infty\}$ such that $\lim_{s \to \infty} \omega(s) = \infty$. This space is endowed with the shift operators $(\theta_s : s \ge 0)$ and the killing operators $(k_s : s \ge 0)$, which are defined for $r, s \ge 0$ by
\[ \theta_s \omega(r) = \omega(s + r), \qquad k_s \omega(r) = \begin{cases} \omega(r) & \text{if } r < s, \\ \infty & \text{otherwise.} \end{cases} \]
We write $X_s : \omega \to \omega(s)$ for the canonical projections. For every $x \in \mathbb{R}$, let $P_x$ stand for the law of the Brownian motion with unit drift started at $x$, which is viewed as a probability measure on $\Omega$. We also introduce the indefinite integral of $X$
\[ I_s = \int_0^s X_r\, dr, \qquad s \ge 0, \]
its past-minimum function
\[ m_s = \min_{0 \le r \le s} I_r, \qquad s \ge 0, \]
and the largest location of the overall minimum of $I$
\[ a = \max\{ s \ge 0 : I_s = m_\infty \}. \]
We now arrive at the key point in the proof of Theorem 1.
Lemma 1. For every $x \ge 0$, the processes $X \circ k_a$ and $X \circ \theta_a$ are independent under $P_{-x}$, and the law of $X \circ \theta_a$ does not depend on $x$.
Proof. The proof is based on the fact that, loosely speaking, splitting the path of a Markov process at its last passage time at a given point produces two independent processes; and more precisely, the law of the part after the last passage time does not depend on the initial distribution of the Markov process. In other words, there is a weak analogue of the Markov property at last passage times. We refer e.g. to Theorem 2.12 in Getoor [10] and the related references for a precise statement which is more general than we actually need.
¹ This is the abbreviation for the French continu à droite et pourvu de limites à gauche, that is right-continuous with limits to the left. The reason why we do not only consider continuous paths is that the notion of killed path will be useful in the sequel.
Consider the integral process reflected at its past minimum, $I - m$. Making use of the identity $I_{s+r} = I_s + I_r \circ \theta_s$ and the strong Markov property of Brownian motion, it is readily seen that the pair $(X, I - m)$ is a strong Markov process. See e.g. the proof of Proposition VI.1 in [4] for a closely related argument. On the other hand, it is plain that $P_{-x}$-a.s. for every $x \ge 0$, we have $a < \infty$ and $X_a = 0$, and thus
$$a = \sup\{s \ge 0 : (X, I - m)_s = (0, 0)\}.$$
In other words, the largest location of the minimum of $I$ coincides with the last passage time of $(X, I - m)$ at $(0, 0)$. It thus follows that the processes $(X, I - m) \circ k_a$ and $(X, I - m) \circ \theta_a$ are independent, and that the law of the latter does not depend on $x$. Our claim follows.

We are now able to prove Theorem 1.

Proof. Fix $x \ge 0$ and note that $(u(s, 0) + s - x : s \ge 0)$ can be identified as $X = (X_s : s \ge 0)$ under the law $P_{-x}$. In this framework, the function (3) for $t = 1$ coincides with $s \to I_s$, and the inverse Lagrangian function evaluated at $x$ is simply $a(x, 1) = a$. It should be clear from (4) that for every $0 \le z \le x$, $a(z, 1)$ is measurable with respect to $X \circ k_a$. Write $X' = X \circ \theta_a$, $I'_s = \int_0^s X'_r \, dr$, and for $y \ge 0$, $a'(y, 1)$ for the largest location of the overall minimum of $s \to I'_s - ys$. We then observe the identity
$$a(x + y, 1) - a = a'(y, 1). \tag{7}$$
More precisely, $a(x + y, 1)$ is the largest location of the overall minimum of $s \to I_s - sy$. This location is bounded from below by $a(x, 1) = a$, so that $a(x + y, 1) - a$ is the largest location of the overall minimum of $s \to I_{a+s} - (a + s)y$. Because $I_{a+s} = I_a + I'_s$, (7) follows. According to Lemma 1, $X'$ and $X \circ k_a$ are independent. We deduce from (7) that the increment $a(x + y, 1) - a(x, 1)$ is independent of $(a(z, 1) : 0 \le z \le x)$. Because the law of $X'$ does not depend on $x$, the same holds for $a'(y, 1) = a(x + y, 1) - a(x, 1)$. We have thus proven the independence and homogeneity of the increments of the inverse Lagrangian function.

Next, introduce $T = \min\{s \ge 0 : X_s = 0\}$, the first hitting time of 0 by $X$. By the strong Markov property, $\tilde X = X \circ \theta_T$ is independent of $X \circ k_T$ and has the law $P_0$. The very same argument as above shows that
$$a(x, 1) = T + \tilde a(0, 1), \tag{8}$$
where $\tilde a(0, 1)$ stands for the largest location of the minimum of $s \to \tilde I_s = \int_0^s \tilde X_r \, dr$. Because $\tilde a(0, 1)$ is independent of $T$ and has the same law as $a(0, 1)$, the decomposition $a(x, 1) = (a(x, 1) - a(0, 1)) + a(0, 1)$, the identity (8), and the independence of the increments show that $T$ and $a(x, 1) - a(0, 1)$ have the same law. In other words, the process $(a(x, 1) - a(0, 1) : x \ge 0)$ has the same one-dimensional distributions as the first passage process $(T_x : x \ge 0)$ of a Brownian motion with unit drift started at zero. Because both have independent and homogeneous increments, we conclude that these two processes have the same law.

We finally determine the distribution of $a(0, 1)$, that is, that of $a$ under $P_0$. We implicitly work with the probability measure $P_0$ in the sequel, and use the easy fact that $I$ reaches its overall minimum at a unique location (viz. $a$) almost surely, i.e. $a(0, 1)$ is a regular Lagrangian point with probability one. Introduce the last-passage time of $X$ at 0
$$g = \max\{s \ge 0 : X_s = 0\}.$$
It is known that $g$ has a gamma$(1/2, 1/2)$-distribution, see e.g. Exercise 4.16 in [17]. On the one hand, it follows from Lemma 1 that the variables $a$ and $g - a$ are independent. On the other hand, it can be checked that they have the same distribution. More precisely, the processes $X \circ k_g$ and $-X \circ k_g$ have the same law (this follows again e.g. from Exercise 4.16 in [17]). A standard argument of excursion theory then shows that the latter process reversed at the last-passage time $g$,
$$\hat X_s = -X_{g-s}, \qquad 0 \le s < g,$$
has also the same law as $X \circ k_g$. In this framework, it is immediately seen that, in the obvious notation, $\hat I_s = I_{g-s} - I_g$ for $0 \le s < g$, so $g - a$ is the (a.s. unique) location $\hat a$ of the overall minimum of the integral process $\hat I$ of $\hat X$, which implies that $a$ and $g - a$ have the same law. Using the well-known convolution identity
$$\text{gamma}(1/4, 1/2) * \text{gamma}(1/4, 1/2) = \text{gamma}(1/2, 1/2),$$
we conclude that $a$ has a gamma$(1/4, 1/2)$-distribution. The proof of Theorem 1 is now complete.

We refer to Lachal [15] for further results on the distribution of random variables related to the location of the overall minimum of the indefinite integral of a Brownian motion with drift.

Theorem 1 reduces the corollaries to standard results on subordinators (i.e. increasing processes with independent and homogeneous increments).

Proof. 1. Note that the Laplace exponent $\Phi$ in Theorem 1 has $\Phi(q) \sim \sqrt{2q}$ as $q \to \infty$. Thus the first assertion stems from the Fristedt-Pruitt law of the iterated logarithm (cf. Theorem III.11 in [4]), and the second from Proposition III.10 in [4].

2. We have $-\Delta u(\cdot, 1) = \Delta a(\cdot, 1)$ in the obvious notation. According to the celebrated Lévy-Itô decomposition of Lévy processes (cf. Theorem I.1 and Eq. (III.3) in [4]), the jump process $(\Delta a(x, 1) : x \ge 0)$ is a Poisson point process whose characteristic measure is the Lévy measure $\mu$ which appears in the Lévy-Khintchine formula for $\Phi$,
$$\Phi(q) = dq + \int_{(0,\infty)} \left(1 - e^{-qy}\right) \mu(dy),$$
where $d \ge 0$ is the drift coefficient. It is immediately checked that the drift coefficient corresponding to $\Phi(q) = \sqrt{2q + 1} - 1$ is $d = 0$ and the Lévy measure is
$$\frac{1}{\sqrt{2\pi}} \, y^{-3/2} \exp\{-y/2\} \, dy \qquad (y > 0).$$

3. An instant of reflection shows that the set $R$ of Lagrangian regular points can be viewed as the range of the inverse Lagrangian function on its continuity set, i.e.
$$R = \{a(x, 1) : x \ge 0 \text{ and } a(x, 1) = a(x-, 1)\}.$$
Because $a(\cdot, 1)$ has only countably many discontinuity points, the exact Hausdorff functions of $R$ coincide with the exact Hausdorff functions of the closed range of $a(\cdot, 1)$ shifted at the origin, $a(\cdot, 1) - a(0, 1)$. Recall that $\Phi(q) \sim \sqrt{2q}$ as $q \to \infty$. A general result of Fristedt-Pruitt [8] implies that an exact Hausdorff function of $R$ is $s \to \sqrt{s \log \log 1/s}$. See also Taylor-Wendel [21].

4. Theorem 1 and the preceding observation show that a given set $E \subseteq (0, \infty)$ is disjoint from $R$ if and only if it is polar for the first-passage process $(T_x : x \ge 0)$ of a
Brownian motion with unit drift. Because the law of the Brownian motion with unit drift is locally equivalent to that of the standard Brownian motion (with zero drift), the latter holds if and only if $E$ is polar for the first-passage process of a Brownian motion, that is, for a stable subordinator with index 1/2. By the potential theory for subordinators (see Orey [16] and Hawkes [12]) we conclude that $E \cap R = \emptyset$ with probability one if and only if the (1/2)-capacity of $E$ is zero.

3. Extension to More General Initial Velocities

The argument developed in the preceding section applies as well when the initial velocity is given by a Lévy process with no positive jumps. To this end, we first recall some basic facts in that field, referring to chapter VII in [4] for details. A Lévy process $X = (X_s : s \ge 0)$ is a random process started at $X_0 = 0$, with independent and homogeneous increments and càdlàg paths (for instance, a subordinator is a Lévy process with nonnegative increments). When a Lévy process $X$ has no positive jumps, one has
$$E\left(e^{qX_s}\right) = \exp\{s\psi(q)\}, \qquad s, q \ge 0,$$
where $\psi : [0, \infty) \to \mathbb{R}$ is called the Laplace exponent of $X$. It is a convex function that increases ultimately, except in the case when $-X$ is a subordinator, which will be implicitly excluded in the sequel. The first passage process
$$T_x = \inf\{s \ge 0 : X_s > x\}, \qquad x \ge 0,$$
is then a subordinator whose Laplace exponent $\Phi : [0, \infty) \to [0, \infty)$ is the right-inverse of $\psi$, viz. $\psi \circ \Phi = \mathrm{Id}$. (This property is essential in this section.) For instance, the case when $X$ is a Brownian motion with constant drift $d \in \mathbb{R}$ corresponds to $\psi(q) = \frac{1}{2} q^2 + dq$ and $\Phi(q) = \sqrt{2q + d^2} - d$.

We now replace the assumption (6) by the weaker

The initial velocity $(u(x, 0) : x \ge 0)$ is a Lévy process with no positive jumps; its Laplace exponent $\psi$ has $\psi'(0) \ge 0$. (9)

The last condition on the derivative of $\psi$ is equivalent to $E(u(1, 0)) \ge 0$, which is necessary and sufficient for $\liminf_{x \to \infty} u(x, 0)/x \ge 0$. We now state the generalized version of Theorem 1.

Theorem 2. Suppose (9) holds. Then for each fixed $t > 0$, the process $(a(x, t) - a(0, t) : x \ge 0)$ is independent of $a(0, t)$ and has the same distribution as
$$x \to \inf\{s \ge 0 : tu(s, 0) + s > x\}.$$
In other words, $(a(x, t) : x \ge 0)$ is a subordinator started at $a(0, t)$, with Laplace exponent $\Phi_t$ which is specified by the identity
$$\psi(t\Phi_t(q)) + \Phi_t(q) = q, \qquad q \ge 0.$$
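The identity in Theorem 2 defines the Laplace exponent Φ_t only implicitly. As a minimal numerical sketch (not part of the paper), one can solve ψ(tΦ) + Φ = q by bisection. The function names below are hypothetical, and the code assumes ψ is continuous with ψ(0) = 0 and ψ ≥ 0 on [0, ∞), which is the case when ψ is convex with ψ'(0) ≥ 0 as in (9). The Brownian case ψ(q) = q²/2 is used as a check, since the quadratic equation then gives the closed form Φ_t(q) = (√(1 + 2t²q) − 1)/t².

```python
import math


def laplace_exponent_phi_t(psi, t, q, tol=1e-12):
    """Solve psi(t * phi) + phi = q for phi >= 0 by bisection.

    Assumption (not from the paper): psi is continuous, psi(0) = 0 and
    psi >= 0 on [0, inf). Then phi -> psi(t * phi) + phi is strictly
    increasing, vanishes at 0 and is >= q at phi = q, so the root is in [0, q].
    """
    lo, hi = 0.0, q
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if psi(t * mid) + mid < q:
            lo = mid  # value too small: root lies to the right
        else:
            hi = mid  # value large enough: root lies to the left
    return 0.5 * (lo + hi)


def brownian_psi(q):
    """Laplace exponent of standard Brownian motion (no drift)."""
    return 0.5 * q * q


def brownian_phi_t_closed_form(t, q):
    """Positive root of t^2 * phi^2 / 2 + phi = q."""
    return (math.sqrt(1.0 + 2.0 * t * t * q) - 1.0) / (t * t)


if __name__ == "__main__":
    t, q = 2.0, 3.0
    print(laplace_exponent_phi_t(brownian_psi, t, q))
    print(brownian_phi_t_closed_form(t, q))
```

For t = 1 the closed form reduces to √(2q + 1) − 1, the exponent that appears in the Brownian case of Sect. 2. The same solver can be called with other exponents, for example `lambda q: q ** 1.5` for a stable initial velocity, where no closed form is used.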
Proof. For $t > 0$ fixed, consider the Lévy process with no positive jumps
$$tu(s, 0) + s, \qquad s \ge 0.$$
Its Laplace exponent $\psi_t$ is given by $\psi_t(q) = \psi(tq) + q$, so the inverse $\Phi_t$ of $\psi_t$, which is the Laplace exponent associated with the first-passage process
$$T_x = \inf\{s \ge 0 : tu(s, 0) + s > x\}, \qquad x \ge 0,$$
solves the equation
$$\psi(t\Phi_t(q)) + \Phi_t(q) = q.$$
Write $P_{-x}$ for the law of $(tu(s, 0) + s - x : s \ge 0)$ on the canonical space $\Omega$. The notation $X, I, m, a, \ldots$ being the same as in Sect. 2, Lemma 1 still holds in the present framework. More precisely, the argument used in the Brownian case applies just as well, provided that we check that $a < \infty$ and $X_a = 0$ $P_{-x}$-a.s. The finiteness of $a$ is plain as $X_s$ drifts to $\infty$ as $s \to \infty$, $P_{-x}$-a.s. On the other hand, $I$ has a right and a left derivative at time $a$, which are $X_a$ and $X_{a-}$, respectively. Because $a$ is a location of the minimum of $I$, we must have $X_{a-} \le 0$ and $X_a \ge 0$. Since $X$ has no positive jumps, this forces $X_{a-} = X_a = 0$. We then identify the function (3) as $s \to t^{-1} I_s$ under the law $P_{-x}$. In particular, $a$ can be viewed as the inverse Lagrangian function evaluated at $(x, t)$. We now simply repeat the argument of the proof of Theorem 1 to establish the independence and homogeneity of the increments of the inverse Lagrangian function, and the fact that $a(x, t) - a(0, t)$ has the same distribution as $T_x$ under $P_0$.

We emphasize that the above argument collapses when the initial velocity has positive jumps. Carraro-Duchon [6] have shown the existence of intrinsic statistical solutions to the inviscid Burgers equation whenever the initial velocity is distributed according to a Lévy process, even with positive jumps. Stephane Jaffard (private communication) has pointed out that when the initial velocity has positive jumps, the distribution of the Hopf-Cole solution cannot have independent and homogeneous increments, which stresses the difference between intrinsic statistical solutions and the Hopf-Cole solution. The condition of absence of positive jumps also appears in [6], in connection with the Lax entropy condition.

Analogues of Corollaries 1-3 can be derived from Theorem 2 by the very same arguments as in Sect. 2. For instance, in the case when the initial velocity is stable with index $\alpha$ (i.e. $\psi(q) = q^\alpha$) for some $\alpha \in (1, 2]$, one gets that both the Hausdorff and packing dimensions of the regular Lagrangian points are $1/\alpha$ a.s. The interested reader is referred to the survey by Fristedt [7], chapters III and VII in Bertoin [4], and the references therein for a number of properties of subordinators and Lévy processes with no positive jumps that can be shifted via Theorem 2 to the Hopf-Cole solution of the inviscid Burgers equation with initial velocity fulfilling (9).

Acknowledgement. I am very grateful to Stephane Jaffard for drawing my attention to the work [6] by Carraro-Duchon, and for several stimulating discussions on the inviscid Burgers equation, the Hopf-Cole solution and the intrinsic statistical solutions.
References
1. Aspandiiarov, S. and Le Gall, J.F.: Some new classes of exceptional times of linear Brownian motion. Ann. Probab. 23, 1605–1626 (1995)
2. Avellaneda, M. and E, W.: Statistical properties of shocks in Burgers turbulence. Commun. Math. Phys. 172, 13–38 (1995)
3. Avellaneda, M.: Statistical properties of shocks in Burgers turbulence, II: Tail probabilities for velocities, shock-strengths and rarefaction intervals. Commun. Math. Phys. 169, 45–59 (1995)
4. Bertoin, J.: Lévy processes. Cambridge: Cambridge University Press, 1996
5. Burgers, J.M.: The nonlinear diffusion equation. Dordrecht: Reidel, 1974
6. Carraro, L. and Duchon, J.: Equation de Burgers avec conditions initiales à accroissements indépendants et homogènes. Ann. Inst. Henri Poincaré: Analyse non-linéaire (to appear). Results announced in C. R. Acad. Sc. Paris Math. 319, 855–858 (1994)
7. Fristedt, B.E.: Sample functions of stochastic processes with stationary, independent increments. In: Advances in Probability 3, New-York: Dekker, 1974, pp. 241–396
8. Fristedt, B.E. and Pruitt, W.E.: Lower functions for increasing random walks and subordinators. Z. Wahrscheinlichkeitstheorie verw. Gebiete 18, 167–182 (1971)
9. Fristedt, B.E. and Taylor, S.J.: The packing measure of a general subordinator. Probab. Theory Relat. Fields 92, 493–510 (1992)
10. Getoor, R.K.: Splitting times and shift functionals. Z. Wahrscheinlichkeitstheorie verw. Gebiete 47, 69–81 (1979)
11. Hawkes, J.: On the Hausdorff dimension of the intersection of the range of a stable process with a Borel set. Z. Wahrscheinlichkeitstheorie verw. Gebiete 19, 90–102 (1971)
12. Hawkes, J.: On the potential theory of subordinators. Z. Wahrscheinlichkeitstheorie verw. Gebiete 33, 113–132 (1975)
13. Hopf, E.: The partial differential equation $u_t + u u_x = \mu u_{xx}$. Comm. Pure Appl. Math. 3, 201–230 (1950)
14. Jaffard, S.: The multifractal nature of Lévy processes. Preprint
15. Lachal, A.: Sur la distribution de certaines fonctionnelles de l'intégrale du mouvement brownien avec dérive parabolique et cubique. Comm. Pure Appl. Math. XLIX-12, 1299–1338 (1997)
16. Orey, S.: Polar sets for processes with stationary independent increments. Markov processes and potential theory. New-York: Wiley, 1967, pp. 117–126
17. Revuz, D. and Yor, M.: Continuous martingales and Brownian motion. 2nd edn. Berlin: Springer, 1994
18. She, Z.S., Aurell, E. and Frisch, U.: The inviscid Burgers equation with initial data of Brownian type. Commun. Math. Phys. 148, 623–641 (1992)
19. Sinai, Ya.: Statistics of shocks in solution of inviscid Burgers equation. Commun. Math. Phys. 148, 601–621 (1992)
20. Taylor, S.J.: The measure theory of random fractals. Math. Proc. Camb. philos. Soc. 100, 383–406 (1986)
21. Taylor, S.J. and Wendel, J.G.: The exact Hausdorff measure of the zero set of a stable process. Z. Wahrscheinlichkeitstheorie verw. Gebiete 6, 170–180 (1966)

Communicated by Ya. G. Sinai
Commun. Math. Phys. 193, 407 – 448 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Framed Vertex Operator Algebras, Codes and the Moonshine Module
Chongying Dong 1,*, Robert L. Griess Jr. 2,**, Gerald Höhn 3,***
1 Department of Mathematics, University of California, Santa Cruz, CA 95064, USA
2 Department of Mathematics, University of Michigan, Ann Arbor, MI 48109-1109, USA
3 Mathematisches Institut, Universität Freiburg, Eckerstr. 1, D-79104 Freiburg, Germany
Received: 14 July 1997 / Accepted: 8 September 1997
Abstract: For a simple vertex operator algebra whose Virasoro element is a sum of commutative Virasoro elements of central charge $\frac{1}{2}$, two codes are introduced and studied. It is proved that such vertex operator algebras are rational. For lattice vertex operator algebras and related ones, decompositions into direct sums of irreducible modules for the product of the Virasoro algebras of central charge $\frac{1}{2}$ are explicitly described. As an application, the decomposition of the moonshine vertex operator algebra is obtained for a distinguished system of 48 Virasoro algebras.

1. Introduction

Vertex operator algebras (VOAs) have been studied by mathematicians for more than a decade, but still very little is known about the general structure of VOAs. Most of the examples so far come from an auxiliary mathematical structure like affine Kac-Moody algebras, Virasoro algebras, integral lattices or are modifications of these (like orbifolds and simple current extensions). We use the definition of VOA as in [FLM], Sect. 8.10. In this paper we develop a general structure theory for a class of VOAs containing a subVOA of the same rank and relatively simple form, namely a tensor product of simple Virasoro VOAs of central charge $\frac{1}{2}$. We call this the class of framed VOAs, abbreviated FVOAs. It contains important examples of VOAs. We show how VOAs constructed from certain integral lattices can be described as framed VOAs. In the case that the lattice itself comes from a binary code, this can be done even more explicitly. As an application of the general structure theory we describe VOAs of small central charge as FVOAs, especially the moonshine VOA $V^\natural$ of central charge 24.

* The first author is supported by NSF grants DMS-9303374, DMS-9700923 and a research grant from the Committee on Research, UC Santa Cruz.
** The second author is supported by NSF grant DMS-9623038 and the University of Michigan Faculty Recognition Grant (1993–96).
*** The third author is supported by a research fellowship of the DFG, grant Ho 1842/1-1.
The modules of a VOA together with the intertwining operators can be put together into a larger structure which is called an intertwining algebra [Hu1, Hu2]. In the case where the fusion algebra of the VOA is the group algebra of an abelian group $G$, like for lattice VOAs, this specializes to an abelian intertwining algebra [DL1]; also see [Mo]. The description of VOAs containing a fixed VOA with abelian intertwining algebra is relatively simple: They correspond to the subgroups $H \le G$ such that all the conformal weights of the VOA-modules indexed by $H$ are integral [H3]. The Virasoro VOA of rank $\frac{1}{2}$ gives one of the easiest examples of nonabelian intertwining algebras. Sect. 2 can be considered as a study of the extension problem for tensor products of these Virasoro VOAs. It is our hope that the ideas used in this work can be extended to structure theories for VOAs based on other classes of rational subVOAs with nonabelian intertwining algebras, like the VOAs belonging to the discrete series representations of the Virasoro algebra [W].

We continue with a more detailed description of the results in this paper. The Virasoro algebra of central charge $\frac{1}{2}$ has just three irreducible unitary highest weight representations, with highest weights $h = 0, \frac{1}{2}, \frac{1}{16}$, and the one with $h = 0$ carries the structure of a simple VOA whose irreducible modules are exactly these irreducible unitary highest weight representations. The relevant fusion rules here (Theorem 2.3) are relatively simple-looking. A tensor product of $r$ such VOAs, denoted $T_r$, has irreducible representations in bijection with $r$-tuples $(h_1, \ldots, h_r)$ such that each $h_i \in \{0, \frac{1}{2}, \frac{1}{16}\}$. We are interested in the case of a VOA $V$ containing a subVOA isomorphic to $T_r$. Such a subVOA arises from a Virasoro frame, a set of elements $\omega_1, \ldots, \omega_r$ such that for each $i$, the vertex operator components of $\omega_i$ along with the vacuum element span a copy of the simple Virasoro VOA of central charge $\frac{1}{2}$ and such that these subVOAs are mutually commutative and $\omega_1 + \cdots + \omega_r$ is the Virasoro element of $V$. We abbreviate VF for Virasoro frame. Such elements may be characterized internally up to a factor 2 as the unique indecomposable idempotents in the weight 2 subalgebra of $T_r$ with respect to the algebra product $u_1 v$ induced from the VOA structure on $T_r$.

It was shown in [DMZ] that the moonshine VOA $V^\natural$ is a FVOA with $r = 48$. Partial results on decompositions of $V^\natural$ into a direct sum of irreducible $T_{48}$-modules were obtained in [DMZ] and [H1]. These results were fundamental in proving that $V^\natural$ is holomorphic [D3]. In fact, the desire to understand $V^\natural$ was one of the original motivations for us to study FVOAs.

In Sect. 2, we describe how the set of $r$-tuples which occur leads to two linear codes $C, D \le \mathbb{F}_2^r$, where $D$ is contained in the annihilator code $C^\perp$. For self-dual (also called holomorphic) FVOAs we give a proof that they are equal: $C = D^\perp$. Associated to these codes are normal 2-subgroups $G_D \le G_C$ of the subgroup $G$ of the automorphism group $\mathrm{Aut}(V)$ of $V$ which stabilizes the VF (as a set). The group $G$ is finite. We get an accounting of all subVOAs of $V$ which contain $V^0$, the subVOA of $G_D$-invariants. We obtain a general result (Theorem 2.12) that FVOAs are rational, establishing the existence of a new broad class of rational VOAs. The rationality of FVOAs is a very important aspect of their representation theory. In particular, a FVOA has only finitely many irreducible modules.

In Sect. 3, we describe the Virasoro decompositions of the lattice VOAs $V_{D_1^d}$, and closely related VOAs, with respect to a natural subVOA $T_{2d}$.

In Sect. 4, we study the familiar situation of the “twisted” or untwisted lattice associated to a binary doubly-even code of length $d \in 8\mathbb{Z}$ and the twisted and untwisted VOA associated to a lattice. A marking of the code is a partition of its coordinates into
2-sets. A marking determines a $D_1^d$ sublattice in the associated lattices and a VF in the associated VOAs. We give an explicit description of the coset decomposition of the lattices under the $D_1^d$ sublattice, a $\mathbb{Z}_4$-code, and the decomposition of the VOA as a module for the subVOA generated by the VF. As a corollary, we give information about various multiplicities of the decompositions under this subVOA using the symmetrized marked weight enumerator of the marked code or the symmetrized weight enumerator of the $\mathbb{Z}_4$-code.

Finally, Sect. 5 is devoted to applications. Two examples are discussed in detail. Example I is about the Hamming code of length 8, the root lattice $E_8$ and the VOA $V_{E_8}$. Here, $r = 16$, and we find at least 5 different VFs. Example II is about the Golay code, the Leech lattice and the moonshine module, $V^\natural$, where $r = 48$. For every VF inside $V^\natural$, the code $C$ has dimension at most 41. There is a special marking of the Golay code for which this bound of 41 is achieved, and for this marking the complete decomposition polynomial is explicitly given. The $D_1$-frames inside the Leech lattice which arise from a marking of the Golay code are characterized by properties of the corresponding $\mathbb{Z}_4$-codes.

Appendix A contains a few special results about orbits on markings of the length 8 Hamming code, Appendix B the stabilizer in $M_{24}$ of the above special marking for the Golay code, and Appendix C the structure of the automorphism group of the above code of dimension 41. Appendix D shows that all automorphisms of a lattice VOA which correspond to $-1$ on the lattice are conjugate.

In [M1–M3], there is a new treatment of the moonshine VOA and there is some overlap with results of this article. In particular, the vertex operator subalgebra similar to our $V^0$ (see Sect. 2) and its representation theory have been independently investigated in [M3].

Notation and terminology.

$a_n b$    The value of the endomorphism $a_n$ on $b$ (see $Y(v, z)$)
$\mathbf{1}$    The vacuum element of a VOA
$\mathrm{Aut}(V)$    The automorphism group of the VOA $V$
binary composition on $V$    see: $n$th binary composition
$B_V$    The conformal block on the torus of the VOA $V$
$B_2^n$    The FVOA $(M(0, 0) \oplus M(\frac{1}{2}, \frac{1}{2}))^{\otimes n}$ with binary code $C(B_2^n) = \{(0, 0), (1, 1)\}^n$ of length $2n$
$(B_2^n)_0$    The subVOA of $B_2^n$ belonging to the subcode of $C(B_2^n)$ consisting of codewords of weights divisible by 4
$c$    An element of $\mathbb{F}_2^n$
$C$    A linear binary code, often self-annihilating and doubly-even
$C^\perp$    The annihilator code of $C$
$C = C(V)$    The binary code determined by the $T_r$-module structure of $V^0$
$\mathbb{C}[L]$    The complex group algebra of the group $L$
$\mathbb{C}\{L\}$    The twisted complex group algebra of the lattice $L$; it is the group algebra $\mathbb{C}[\hat L]$ modulo the ideal generated by $\kappa + 1$
$Co_0$    The Conway group which is $\mathrm{Aut}(\Lambda)$, a finite group of order $2^{22} 3^9 5^4 7^2 \, 11 \cdot 13 \cdot 23$; its quotient by the center $\{\pm 1\}$ is a finite simple group
$d$    The length of a binary code $C$, usually divisible by 8
$d_4^n$    The marked binary code $\{(0, 0, 0, 0), (1, 1, 1, 1)\}^n$ of length $4n$
$(d_4^n)_0$    The subcode of $d_4^n$ consisting of codewords of weights divisible by 8
$D_n$    The index 2 sublattice of $\mathbb{Z}^n$ consisting of vectors whose coordinate sum is even (the “checkerboard lattice”)
$D = D(V)$    The binary code of the $I \subseteq \{1, \ldots, r\}$ with $V^I \ne 0$
$\delta_2^n$    The marked Kleinian or $\mathbb{F}_4$-code $\{(0, 0), (1, 1)\}^n$ of length $2n$
$(\delta_2^n)_0$    The subcode of $\delta_2^n$ consisting of codewords of weights divisible by 4
$\delta(c)$    The number of $k$ with $c(k) = (c_{2k-1}, c_{2k}) \in \{(0, 1), (1, 0)\}$
$\Delta(L)$    The $\mathbb{Z}_4$-code associated to a lattice $L$ with fixed $D_1$-frame
$E_8$    The root lattice of the Lie group $E_8(\mathbb{C})$
(symbol lost in extraction)    A vector with components $+$ or $-$
FVOA    Abbreviation for framed vertex operator algebra
$F$    The set $\{M(0), M(\frac{1}{2}), M(\frac{1}{16})\}$
$G$    The subgroup of $\mathrm{Aut}(V)$ fixing a VF of $V$
$G_C$    The normal subgroup of $G$ acting trivially on $T_r$
$G_D$    The normal subgroup of $G$ acting trivially on $V^0$
$\mathcal{G} = \mathcal{G}_{24}$    The Golay code of length 24
$\gamma$    An element of $\mathbb{Z}_4^n$
γak (as extracted)    A map $\mathbb{F}_2^2 \to \mathbb{Z}_4^2$
Γa (as extracted)    A map $\mathbb{F}_2^{2n} \to \mathbb{Z}_4^{2n}$
$h$, $h_i$    Weights of elements or modules of a VOA, usually $h_i \in \{0, \frac{1}{2}, \frac{1}{16}\}$
$H_8$    The Hamming code of length 8
$\mathcal{H} = \mathcal{H}_6$    The hexacode of length 6, a code over $\mathbb{F}_4 = \{0, 1, \omega, \bar\omega\}$ or over the Kleinian fourgroup $\mathbb{Z}_2 \times \mathbb{Z}_2 = \{0, a, b, c\}$
$I$    A subset of $\{1, \ldots, r\}$
$I + J$    The symmetric difference, for subsets of $\{1, \ldots, r\}$
$\kappa$    A central element of order 2 in the group $\hat L$
$L$    An integral lattice, often self-dual and even
$\hat L$    A central extension of $L$ by a central subgroup $\langle \kappa \rangle$
$L^*$    The dual lattice of $L$
$L_C$    The even lattice constructed from a doubly-even code $C$
$\tilde L_C$    The “twisted” even lattice constructed from a doubly-even code $C$
$L(n)$, $L^i(n)$    The generators of a Virasoro algebra given by the expansion $Y(\omega, z) = \sum_{n \in \mathbb{Z}} L(n) z^{-n-2}$, resp. $Y(\omega_i, z) = \sum_{n \in \mathbb{Z}} L^i(n) z^{-n-2}$
$\Lambda$    The Leech lattice
$\mathbb{M}$    The Monster simple group
$M_{24}$    The simple Mathieu group of order $2^{10} \cdot 3^3 \cdot 5 \cdot 7 \cdot 11 \cdot 23 = 244{,}823{,}040$
$M(0)$, $M(\frac{1}{2})$, $M(\frac{1}{16})$    The irreducible modules for the Virasoro algebra with central charge $\frac{1}{2}$
$M(1)$    The canonical irreducible module for Heisenberg algebras
$M(h_1, \ldots, h_r)$    The irreducible $T_r$-module of highest weight $(h_1, \ldots, h_r)$
$m_h(V) = m_{h_1, \ldots, h_r}$    The multiplicity of the $T_r$-module $M(h_1, \ldots, h_r)$ in the FVOA $V$; we think of this as a function of $(h_1, \ldots, h_r) \in \{0, \frac{1}{2}, \frac{1}{16}\}^r$
$\mathcal{M}$    A marking of a binary code
$n$    A natural number
$n$th binary composition    The map $V \times V \to V$ which takes the pair $(a, b)$ to $a_n b$
Nµabk k (as extracted)    A map $\mathbb{F}_2^2 \to \mathbb{C}[F^4]$
Nµab (as extracted)    A map $\mathbb{F}_2^{2n} \to \mathbb{C}[F^{4n}]$
$\mu$    A vector with components $+$ or $-$
$P_V(a, b, c)$    The decomposition polynomial of a FVOA $V$
$r$    The number of elements in a VF
Rµa k (as extracted)    A map $\mathbb{Z}_4 \to \mathbb{C}[F^2]$
Rµa (as extracted)    A map $\mathbb{Z}_4^n \to \mathbb{C}[F^{2n}]$
$\mathrm{smwe}_C(x, y, z)$    The symmetrized marked weight enumerator of a binary code $C$ with marking $\mathcal{M}$
$\mathrm{swe}_\Delta(A, B, C)$    The symmetrized weight enumerator of a $\mathbb{Z}_4$-code $\Delta$
$\mathrm{Sym}_r$, $\mathrm{Sym}_\Omega$    The symmetric group on a set of $r$ objects, usually the index set $\{1, \ldots, r\}$, resp. the symmetric group on the set $\Omega$
$\Sigma_2^n$    The $\mathbb{Z}_4$-code $\{(0, 0), (2, 2)\}^n$ of length $2n$
$(\Sigma_2^n)_0$    The subcode of $\Sigma_2^n$ consisting of codewords of weights divisible by 4
$T$    A faithful module of dimension $2^m$ for an extraspecial group of order $2^{1+2m}$, for some $m$, or for a finite quotient of some $\hat L$
$T_r = M(0)^{\otimes r}$    The tensor product of $r$ simple Virasoro VOAs of rank $\frac{1}{2}$
$V$    An arbitrary VOA, often holomorphic = self-dual
$V(c)$    The submodule of the FVOA $V$ isomorphic to $M(\frac{c}{2})$
$V_L$    The VOA constructed from an even lattice $L$
$V_L^T$    The $\mathbb{Z}_2$-twisted module of the lattice VOA $V_L$
$\tilde V_L$    The “twisted” VOA constructed from an even lattice $L$
VF    Abbreviation for Virasoro frame
VOA    Abbreviation for vertex operator algebra
$V^I$    The sum of irreducible $T_r$-submodules of $V$ isomorphic to $M(h_1, \ldots, h_r)$ with $h_i = \frac{1}{16}$ if and only if $i \in I$
$V^0 = V^\emptyset$    This is $V^I$, for $I = 0$
$V^\natural$    The moonshine VOA, or moonshine module
$W(R)$    The Weyl group of type $R$, a root system
$Y(v, z) = \sum_{n \in \mathbb{Z}} v_n z^{-n-1}$    The vertex operator associated to a vector $v$
$\Xi_1$, $\Xi_3$    Two $D_8^*/D_8$-codes of length 1 and 3
$\omega$, $\omega_i$    Virasoro elements of rank $r$, $\frac{1}{2}$, respectively
$\Omega$    The “all ones vector” $(1, 1, \ldots, 1)$ in $\mathbb{F}_2^n$
2. Framed Vertex Operator Algebras

Recall that the Virasoro algebra of central charge $\frac{1}{2}$ has three irreducible unitary representations $M(h)$ of highest weights $h = 0, \frac{1}{2}, \frac{1}{16}$ (cf. [FQS, GKO, KR]). Moreover, $M(0)$ can be made into a simple vertex operator algebra with central charge $\frac{1}{2}$ (cf. [FZ]). In [DMZ], a class of simple vertex operator algebras $(V, Y, \mathbf{1}, \omega)$ containing an even number of commuting Virasoro algebras of rank $\frac{1}{2}$ was defined.

Definition 2.1. Let $r$ be any natural number. A simple vertex operator algebra $V$ is called a framed vertex operator algebra (FVOA) if the following conditions are satisfied: There exist $\omega_i \in V$ for $i = 1, \ldots, r$ such that (a) each $\omega_i$ generates a copy of the simple Virasoro vertex operator algebra of central charge $\frac{1}{2}$ and the component operators $L^i(n)$ of $Y(\omega_i, z) = \sum_{n \in \mathbb{Z}} L^i(n) z^{-n-2}$ satisfy
$$[L^i(m), L^i(n)] = (m - n) L^i(m + n) + \frac{m^3 - m}{24} \delta_{m,-n},$$
(b) the $r$ Virasoro algebras are mutually commutative, and (c) $\omega = \omega_1 + \cdots + \omega_r$. The set $\{\omega_1, \ldots, \omega_r\}$ is called a Virasoro frame (VF).

In this paper we assume that $V$ is a FVOA. It follows that $V$ is a unitary representation for each of the $r$ Virasoro algebras of central charge $\frac{1}{2}$. In [DMZ] it is also assumed that $V_0$ is one-dimensional. This assumption is now a consequence of the simplicity of $V$:
Lemma 2.2. A FVOA is truncated below from zero: $V = \bigoplus_{n \ge 0} V_n$ and $V_0$ is one dimensional: $V_0 = \mathbb{C}\mathbf{1}$.

Proof. Let $Y(\omega_i, z) = \sum_{n \in \mathbb{Z}} L^i(n) z^{-n-2}$. Since $V$ is a unitary representation for the Virasoro algebra generated by the components of $Y(\omega, z) = \sum_{n \in \mathbb{Z}} L(n) z^{-n-2}$, and $L(n) = \sum_{i=1}^r L^i(n)$, all weights of $V$ are nonnegative, that is, $V = \bigoplus_{n \ge 0} V_n$. Then each nonzero vector $v \in V_0$ is a highest weight vector for the $r$ Virasoro algebras with highest weight $(0, \ldots, 0)$. The highest weight module for the $i$th Virasoro algebra generated by $v$ is necessarily isomorphic to $M(0)$. From the construction of $M(0)$ we see immediately that $L^i(-1)v = 0$. So $L(-1)v = \sum_i L^i(-1)v = 0$, i.e. $v$ is a vacuum-like vector (see [L1]). It is proved in [L1] that a simple vertex operator algebra has at most one vacuum-like vector up to a scalar. Since $\mathbf{1}$ is a vacuum-like vector, we conclude that $V_0 = \mathbb{C}\mathbf{1}$.

The following theorem can be found in [DMZ]:

Theorem 2.3. (1) The VOA $M(0)$ has exactly three irreducible $M(0)$-modules, $M(h)$, with $h = 0, \frac{1}{2}, \frac{1}{16}$, and any module is completely reducible.
(2) The nontrivial fusion rules among these modules are given by: $M(\frac{1}{2}) \times M(\frac{1}{2}) = M(0)$, $M(\frac{1}{2}) \times M(\frac{1}{16}) = M(\frac{1}{16})$ and $M(\frac{1}{16}) \times M(\frac{1}{16}) = M(0) + M(\frac{1}{2})$.
(3) Any module for the tensor product vertex operator algebra $T_r = M(0)^{\otimes r}$, where $r$ is a positive integer, is a direct sum of irreducible modules $M(h_1, \ldots, h_r) := M(h_1) \otimes \cdots \otimes M(h_r)$ with $h_i \in \{0, \frac{1}{2}, \frac{1}{16}\}$.
(4) As $T_r$-modules,
$$V = \bigoplus_{h_i \in \{0, \frac{1}{2}, \frac{1}{16}\}} m_{h_1, \ldots, h_r} M(h_1, \ldots, h_r),$$
where the nonnegative integer $m_{h_1, \ldots, h_r}$ is the multiplicity of $M(h_1, \ldots, h_r)$ in $V$. In particular, all the multiplicities are finite and $m_{h_1, \ldots, h_r}$ is at most 1 if all $h_i$ are different from $\frac{1}{16}$.

Let $I$ be a subset of $\{1, \ldots, r\}$. Define $V^I$ as the sum of all irreducible submodules isomorphic to $M(h_1, \ldots, h_r)$ such that $h_i = \frac{1}{16}$ if and only if $i \in I$. Then
$$V = \bigoplus_{I \subseteq \{1, \ldots, r\}} V^I.$$
Here and elsewhere we identify a subset of $\{1, 2, \ldots, r\}$ with its characteristic function, an integer vector of zeros and ones. We further identify such vectors with their image under the reduction modulo 2, i.e. we consider them as binary codewords in $\mathbb{F}_2^r$. Interpretation should be clear from the context, e.g. we think of the codeword $c$ as an $r$-tuple of integers in the expression $\frac{1}{2}c$.

For each $c \in \mathbb{F}_2^r$ let $V(c)$ be the sum of the irreducible submodules isomorphic to $M(\frac{1}{2}c_1, \ldots, \frac{1}{2}c_r)$. Then $V^0 = \bigoplus_{c \in \mathbb{F}_2^r} V(c)$. Recall the important fact mentioned in Theorem 2.3 (4) that for $c \in C$ the $T_r$-module $M(\frac{1}{2}c_1, \ldots, \frac{1}{2}c_r)$ has multiplicity 1 in $V$. So, $V(c) = 0$ or is isomorphic to $M(\frac{1}{2}c_1, \ldots, \frac{1}{2}c_r)$. We can now define two important binary codes $C = C(V)$ and $D = D(V)$.

Definition 2.4. For every FVOA $V$, let
$$C = C(V) = \{c \in \mathbb{F}_2^r \mid V(c) \ne 0\}, \qquad D = D(V) = \{I \in \mathbb{F}_2^r \mid V^I \ne 0\}. \tag{2.1}$$
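The fusion rules of Theorem 2.3 (2) and the index sets I used in Definition 2.4 can be explored with a small sketch. The code below is illustrative only and not taken from the paper: the names `FUSION`, `fuse`, `fuse_tuples` and `sixteenth_support` are hypothetical, M(0) is assumed to act as the identity under fusion, and fusion of T_r-modules is assumed to be computed componentwise. Under these assumptions it shows that the set of positions carrying weight 1/16 behaves like a symmetric difference, which is the mechanism behind the codes C and D.

```python
from fractions import Fraction
from itertools import product

ZERO, HALF, SIXTEENTH = Fraction(0), Fraction(1, 2), Fraction(1, 16)

# Fusion rules of Theorem 2.3 (2); rows with M(0) are an added assumption
# (M(0) is taken to act as the identity).
FUSION = {
    (ZERO, ZERO): {ZERO},
    (ZERO, HALF): {HALF},
    (ZERO, SIXTEENTH): {SIXTEENTH},
    (HALF, HALF): {ZERO},
    (HALF, SIXTEENTH): {SIXTEENTH},
    (SIXTEENTH, SIXTEENTH): {ZERO, HALF},
}


def fuse(h1, h2):
    """Highest weights occurring in M(h1) x M(h2); the rules are symmetric."""
    key = (h1, h2) if (h1, h2) in FUSION else (h2, h1)
    return FUSION[key]


def fuse_tuples(hs, ks):
    """All r-tuples occurring in M(hs) x M(ks), fusing componentwise."""
    return set(product(*(sorted(fuse(h, k)) for h, k in zip(hs, ks))))


def sixteenth_support(hs):
    """The set I of positions i (1-based) with h_i = 1/16."""
    return frozenset(i for i, h in enumerate(hs, start=1) if h == SIXTEENTH)


if __name__ == "__main__":
    hs = (SIXTEENTH, SIXTEENTH, ZERO, HALF)   # support I = {1, 2}
    ks = (SIXTEENTH, ZERO, SIXTEENTH, HALF)   # support J = {1, 3}
    for summand in sorted(fuse_tuples(hs, ks)):
        print(summand, sorted(sixteenth_support(summand)))
```

In this example every summand has 1/16-support {2, 3}, the symmetric difference of {1, 2} and {1, 3}. This is only a combinatorial illustration of why the index sets I with V^I ≠ 0 are closed under I + J; the actual statement for V is a separate result and needs the simplicity of V.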
The vector of all multiplicities mh1 ,...,hr will be denoted by mh (V ). Note that the codes C and D are completely determined by mh (V ). The following proposition generalizes Proposition 5.1 of [DMZ] and Theorem 4.2.1 of [H1]. In particular it shows C and DP are linear binary codes. As usual we use un for the component operators of Y (u, z) = n∈Z un z −n−1 . Proposition 2.5. (1) V 0 = V ∅ is a simple vertex operator algebra and the V I are irreducible V 0 -modules. Moreover V I and V J are inequivalent if I 6= J. (2) For any I and J and 0 6= v ∈ V J , span{un v | u ∈ V I } = V I+J , where I + J is the symmetric difference of I and J. Moreover, D is an abelian group under the symmetric difference. (3) There is one to one correspondence between the subgroups D0 of D and the vertex operator subalgebras which contain V 0 via D0 7→ V D0 , where we define V S := ⊕I∈S V I for any subset S of D0 . Moreover V I+D0 is an irreducible V D0 -module for I ∈ D and V I+D0 and V J+D0 are nonisomorphic if the two cosets are different. (4) Let I ⊆ {1, . . . , r} be given and suppose that (h1 , . . . , hr ) and (h01 , . . . , h0r ) are 1 1 1 } such that hi = 16 (resp. h0i = 16 ) if and only if r-tuples with hi , h0i ∈ {0, 21 , 16 i ∈ I. If both mh1 ,...,hr and mh01 ,...,h0r are nonzero then mh1 ,...,hr = mh01 ,...,h0r . That is, all irreducible modules inside V I for Tr have the same multiplicities. (5) The binary code C is linear and span{un v | u ∈ V (c)} = V (c + d) for any c, d ∈ C and 0 6= v ∈ V (d). (6) Moreover, there is a one to one correspondence between vertex operator subalgebras of V 0 which contain Tr and the subgroups of C, and V is completely reducible for such vertex operator subalgebras whose irreducible modules in V 0 are indexed by the corresponding cosets in C. Proof. Let v ∈ V J be nonzero. It follows from Proposition 2.4 of [DM] or Lemma 6.1.1 of [L2] and the simplicity of V that V = span{un v | u ∈ V, n ∈ Z}. 
From the fusion rules given in Theorem 2.3 (2) and Proposition 2.10 of [DMZ] we see that $u_n v \in V^{I+J}$ exactly for $u \in V^I$. In particular, $\mathrm{span}\{u_n v \mid u \in V^0, n \in \mathbb{Z}\} = V^J$. So $V^J$ can be generated by any nonzero vector and $V^J$ is an irreducible $V^0$-module. Since $V^I$ and $V^J$ are inequivalent $T_r$-modules if $I \neq J$, they are certainly inequivalent $V^0$-modules. By Proposition 11.9 of [DL1], we know that $Y(u,z)v \neq 0$ if $u$ and $v$ are not $0$. Thus $V^{I+J} \neq 0$ if neither $V^I$ nor $V^J$ is $0$. This shows that $D$ is a group. So we finish the proof of (1) and (2).

For (3), we first observe that for a subgroup $D_0$ of $D$, (2) implies that $V^{D_0}$ is a subVOA which contains $V^0$. On the other hand, since $V = V^D$, $V$ is a completely reducible $V^0$-module. Also $V^I$ and $V^J$ are inequivalent $V^0$-modules if $I$ and $J$ are different. Let $U$ be any vertex operator subalgebra of $V$ which contains $V^0$. Then $U$ is a direct sum of certain $V^I$. Let $D_0$ be the set of $I \in D$ such that $V^I \leq U$. Then $0 \in D_0$. Also from (2), if $I, J \in D_0$ then $I + J \in D_0$. Thus $D_0$ is a subgroup of $D$. In order to see the simplicity of $U$, we take a nonzero vector $v \in V^I$ for some $I \in D_0$. Then $\mathrm{span}\{u_n v \mid u \in V^J, n \in \mathbb{Z}\} = V^{I+J}$ for any $J \in D_0$. It is obvious that $\{I + J \mid J \in D_0\} = D_0$. Thus $U$ is simple. The proof of the irreducibility of $V^{I+D_0}$ is similar to that of the simplicity of $V^{D_0}$. Inequivalence of $V^{I+D_0}$ and $V^{J+D_0}$ is clear as they are inequivalent $T_r$-modules.

The proofs of (5) and (6) are similar to those of (2) and (3).

For (4) we set $p = m_{h_1,\ldots,h_r}$ and $q = m_{h_1',\ldots,h_r'}$. Let $W_1, \ldots, W_p$ be submodules of $V$ isomorphic to $M(h_1,\ldots,h_r)$ such that $\sum_{i=1}^p W_i$ is a direct sum. Let $d = (d_1,\ldots,d_r) \in C$ be such that $V(d) \times M(h_1,\ldots,h_r) = M(h_1',\ldots,h_r')$. Set $W_i' = \mathrm{span}\{u_n W_i \mid u \in V(d), n \in \mathbb{Z}\}$ for $i = 1, \ldots, p$. Then $W_i'$ is isomorphic to $M(h_1',\ldots,h_r')$ for all $i$. Note that
$$\mathrm{span}\{u_n W_i' \mid u \in V(d), n \in \mathbb{Z}\} = \mathrm{span}\{u_n v_m W_i \mid u, v \in V(d), m, n \in \mathbb{Z}\} = \mathrm{span}\{u_n W_i \mid u \in T_r, n \in \mathbb{Z}\} = W_i$$
(cf. Proposition 4.1 of [DM]). Thus $\sum_{i=1}^p W_i'$ must be a direct sum in $V$. This shows that $p \leq q$. Similarly, $p \geq q$.

Remark 2.6. We can also define framed vertex operator superalgebras. The analogue of Proposition 2.5 still holds. In particular we have the binary codes $C$ and $D$.

Definition 2.7. Let $G$ be the subgroup of $\mathrm{Aut}(V)$ consisting of automorphisms which stabilize the Virasoro frame $\{\omega_i\}$. Namely,
$$G = \{g \in \mathrm{Aut}(V) \mid g\{\omega_1,\ldots,\omega_r\} = \{\omega_1,\ldots,\omega_r\}\}. \tag{2.2}$$
The two subgroups $G_C$ and $G_D$ are defined by: $G_C = \{g \in G \mid g|_{T_r} = 1\}$, $G_D = \{g \in G \mid g|_{V^0} = 1\}$. Finally, we define the automorphism group $\mathrm{Aut}(m_h(V))$ as the subgroup of the group $\mathrm{Sym}_r$ of permutations of $\{1,\ldots,r\}$ which fixes the multiplicity function $m_h(V)$, i.e. which consists of the permutations $\sigma \in \mathrm{Sym}_r$ such that $m_{h_1,\ldots,h_r} = m_{h_{\sigma(1)},\ldots,h_{\sigma(r)}}$.

It is easy to see that both $G_D$ and $G_C$ are normal subgroups of $G$ and that $G_D$ is a subgroup of $G_C$. Following Miyamoto [M1], we define for $i = 1, \ldots, r$ an involution $\tau_i$ on $V$ which acts on $V^I$ as $-1$ if $i \in I$ and as $1$ otherwise. The group generated by all $\tau_i$ is a subgroup of the group of all automorphisms of $V$ and is isomorphic to the dual group $\hat D$ of $D$. We define another group $F_C$ which is a subgroup of $\mathrm{Aut}(V^0)$ and is generated by the $\sigma_i$, where $\sigma_i$ acts on $M(h_1,\ldots,h_r)$ by $-1$ if $h_i = \frac12$ and by $1$ otherwise. The group $F_C$ is isomorphic to the dual group $\hat C$ of $C$.

Theorem 2.8.
(1) The subgroup $G_D$ is isomorphic to the dual group $\hat D$ of $D$.
(2) $G_C/G_D$ is isomorphic to a subgroup of the dual group $\hat C$ of $C$.
(3) $G/G_C$ is isomorphic to a subgroup of $\mathrm{Aut}(m_h(V)) \leq \mathrm{Sym}_r$. In particular, $G$ is a finite group.
(4) For any $g \in G$ and any $T_r$-submodule $W$ of $V$ isomorphic to $M(h_1,\ldots,h_r)$, $gW$ is isomorphic to $M(h_{\mu_g^{-1}(1)},\ldots,h_{\mu_g^{-1}(r)})$, where $\mu_g \in \mathrm{Sym}_r$ is such that $g\omega_i = \omega_{\mu_g(i)}$ for all $i$.
(5) If the eigenvalues of $g \in G_C$ on $V^I$ are $i$ and $-i$, then $i$ and $-i$ have the same multiplicity.

Proof. (1) Let $g \in G$ be such that $g|_{V^0} = 1$. Recall from Proposition 2.5 that $V = \bigoplus_{I\in D} V^I$. Since each $V^I$ is an irreducible $V^0$-module we have $V^I = \mathrm{span}\{v_n u \mid v \in V^0, n \in \mathbb{Z}\}$ for any nonzero vector $u \in V^I$. Note that $g$ preserves each homogeneous subspace $V^I_n$, which is finite-dimensional. Take $u \in V^I$ to be an eigenvector of $g$ with eigenvalue $x_I$ and let $v \in V^0$. Then $g(v_n u) = v_n g u = x_I v_n u$. Thus $g$ acts on $V^I$ as the constant $x_I$. For any $0 \neq u \in V^I$ and $0 \neq v \in V^J$ we have
$0 \neq Y(u,z)v \in V^{I+J}[[z, z^{-1}]]$. Since $x_{I+J} Y(u,z)v = gY(u,z)v = x_I x_J Y(u,z)v$, we see that $x_I x_J = x_{I+J}$. In particular $x \in \hat D$ and $x$ takes values in $\{\pm 1\}$. Clearly, each $g \in \hat D$ acts on $V^0$ trivially since $\hat D$ is generated by the $\tau_i$. This proves (1).

For (2) we take $g \in G_C$. A similar argument as in the first paragraph shows that $g|_{V(c)}$ is a constant $y_c = \pm 1$ and $y_{c+d} = y_c y_d$. In other words we have defined an element $y$ of $\hat C$ which maps $c \in C$ to $y_c$. One can easily see that this gives a group homomorphism from $G_C$ to $\hat C$ with kernel $G_D$.

For (3) let $g \in G$. Then there exists a unique $\mu_g \in \mathrm{Sym}_r$ such that $g\omega_i = \omega_{\mu_g(i)}$. Clearly we have $\mu_{g_1 g_2} = \mu_{g_1}\mu_{g_2}$ for $g_1, g_2 \in G$. It is obvious that the kernel of the map $g \mapsto \mu_g$ is $G_C$.

In order to prove (4), we take a highest weight vector $v$ of $W$. Then $L^i(0)v = h_i v$ for $i = 1, \ldots, r$. So $L^i(0)gv = gL^{\mu_g^{-1}(i)}(0)v = h_{\mu_g^{-1}(i)}\, gv$ and $gv$ is a highest weight vector with highest weight $(h_{\mu_g^{-1}(1)},\ldots,h_{\mu_g^{-1}(r)})$. That is, $gW$ is isomorphic to $M(h_{\mu_g^{-1}(1)},\ldots,h_{\mu_g^{-1}(r)})$.

Finally, we turn to (5). We first mention how a general $g \in G_C$ acts on $V^I$ for $I \in D$. Note that $g^2 = 1$ on $V^0$ by the proof of (2), that is, $g^2 \in G_D$. So $g^2 = \pm 1$ on each $V^I$. This implies that $g$ is diagonalizable on $V^I$, with eigenvalues $\pm 1$ if $g^2 = 1$ on $V^I$ and $\pm i$ if $g^2 = -1$ on $V^I$. In the second case, let $V^I = W_1 \oplus \cdots \oplus W_p \oplus M_1 \oplus \cdots \oplus M_q$, where all $W_j$, $M_k$ are irreducible $T_r$-modules, $g = i$ on each $W_j$ and $g = -i$ on each $M_k$. On $V^0$, $g$ is not $1$, since otherwise $g$ is in $G_D$ and $g$ would have only $\pm 1$ for eigenvalues, by (1). Take an irreducible $T_r$-submodule $U$ of $V^0$ so that $g|_U = -1$. Set $W_j' = \mathrm{span}\{u_n W_j \mid u \in U, n \in \mathbb{Z}\}$. Then $g = -i$ on each $W_j'$.

Claim. $\sum_{j=1}^p W_j'$ is a direct sum.

Using associativity, we see that
$$\mathrm{span}\{u_n W_j' \mid u \in U, n \in \mathbb{Z}\} = \mathrm{span}\{u_m v_n W_j \mid u, v \in U, m, n \in \mathbb{Z}\} = \mathrm{span}\{v_n W_j \mid v \in T_r, n \in \mathbb{Z}\} = W_j.$$
This proves the claim. Thus $p \leq q$. Similarly, $q \leq p$. So they must be equal. This finishes the proof.
The results in Proposition 2.5 (2) and (3) resp. (5) and (6) can be interpreted by the “quantum Galois theory” developed in [DM] and [DLM2]. For example, Proposition 2.5 (2) and (3) is now a special case of Theorems 1 and 3 of [DM] applied for the group $G_D$:

Remark 2.9. Note that $V^0$ is the space of $G_D$-invariants. There is a one-to-one correspondence between the subgroups of $G_D$ and vertex operator subalgebras of $V$ containing $V^0$ via $H \mapsto V^H$. In fact, $V^H = \bigoplus_{I \in H^0} V^I$, where $H^0 = \{I \in D \mid H|_{V^I} = 1\}$. Under the identification of $G_D$ with $\hat D$, the subcode $H^0$ of $D$ corresponds to the common kernel of the functionals in $H$.

Next we prove that a FVOA is always rational. Recall the definition of rationality and regularity as defined in [DLM1]. A vertex operator algebra is called rational if any admissible module is a direct sum of irreducible admissible modules, and a rational vertex operator algebra is regular if any weak module is a direct sum of ordinary irreducible modules. (The reader is referred to [DLM3] for the definitions of weak module, admissible module, and ordinary module.)
C. Dong, R. L. Griess Jr., G. Höhn
It is proved in [DLM3] that if $V$ is a rational vertex operator algebra then $V$ has only finitely many irreducible admissible modules and each is an ordinary irreducible module. We need two lemmas.

Lemma 2.10. Let $V$ be a FVOA such that $D(V) = 0$. Then any nonzero weak $V$-module $W$ contains an ordinary irreducible module.

Proof. Since $D(V) = 0$ we have the decomposition $V = \bigoplus_{c\in C} V(c)$. Since $T_r$ is regular (see Proposition 3.3 of [DLM1]), $W$ is a direct sum of ordinary irreducible $T_r$-modules. Let $M$ be an irreducible $T_r$-submodule of $W$. Then $N := \mathrm{span}\{u_n M \mid u \in V, n \in \mathbb{Z}\}$ is an ordinary $V$-module, as each $\mathrm{span}\{u_n M \mid u \in V(c), n \in \mathbb{Z}\}$ is an ordinary irreducible $T_r$-module and $C$ is a finite set. For an ordinary $V$-module $X$ we define $m(X)$ to be the sum of the multiplicities $m_{h_1,\ldots,h_r}$ of all modules $M(h_1,\ldots,h_r)$ in $X$, i.e., the $T_r$-composition length. Let $K$ be a $V$-submodule of $N$ such that $m(K)$ is the smallest among all nonzero $V$-submodules of $N$. Then $K$ is an irreducible ordinary $V$-submodule of $N$ and of $W$.

Lemma 2.11. Any FVOA $V$ with $D(V) = 0$ is rational.

Proof. We must show that any admissible $V$-module is a direct sum of irreducible ones. Let $W$ be an admissible $V$-module and $M$ the sum of all irreducible $V$-submodules. We prove that $W = M$. Otherwise by Lemma 2.10 the quotient module $W/M$ has an irreducible submodule $W'/M$, where $W'$ is a submodule of $W$ which contains $M$. Let $U$ be an irreducible $T_r$-submodule of $W'$ such that $U \cap M = 0$ and set $X := \mathrm{span}\{v_n U \mid v \in V, n \in \mathbb{Z}\}$. Then $X$ is a submodule of $W'$ and $W' = M + X$. Note that $U[c] := \mathrm{span}\{v_n U \mid v \in V(c), n \in \mathbb{Z}\}$ for each $c \in C$ is an irreducible $T_r$-module. Then either $U[c] \cap M = 0$ or $U[c] \cap M = U[c]$. If the latter happens, then $Y(v,z)(U + M/M) = 0$ in the quotient module $W/M$, which is impossible by Proposition 11.9 of [DL]. Thus $U[c] \cap M = 0$ for all $c \in C$ and $W' = M \oplus X$. By Lemma 2.10, $X$ has an irreducible $V$-submodule $Y$, and certainly $M \oplus Y$ strictly contains $M$. This is a contradiction.
Theorem 2.12. Any FVOA $V$ is rational.

Proof. Let $W$ be an admissible $V$-module. Then $W$ is a direct sum of irreducible $V^0$-modules by Lemma 2.11. Let $M$ be an irreducible $V^0$-submodule of $W$. It is enough to show that $M$ is contained in an irreducible $V$-submodule of $W$. First note that there exists a subset $I$ of $\{1,\ldots,r\}$ such that for every irreducible $T_r$-module $M(h_1,\ldots,h_r)$ inside $M$ we have $h_i = \frac1{16}$ if and only if $i \in I$. Let $X$ be the $V$-submodule generated by $M$. Then $X = \sum_{J\in D} X[J] \leq W$, where $X[J] = \mathrm{span}\{u_n M \mid u \in V^J, n \in \mathbb{Z}\}$ is a $V^0$-module. We will show that $X$ is an irreducible $V$-module.

By the fusion rules, we know that every irreducible $T_r$-submodule of $X[J]$ which is isomorphic to $M(h_1,\ldots,h_r)$ has $h_k = \frac1{16}$ if and only if $k \in I + J$. The $X[J]$ for $J \in D$ are nonisomorphic $V^0$-modules as they are nonisomorphic $T_r$-modules. Thus $X = \bigoplus_{J\in D} X[J]$.

Let $Y$ be a nonzero $V$-submodule of $X$. Then $Y = \bigoplus_{J\in D} Y[J]$, where $Y[J] = Y \cap X[J]$ is a $V^0$-module. If $Y[J] \neq 0$ then $\mathrm{span}\{v_n Y[J] \mid v \in V^J, n \in \mathbb{Z}\} \neq 0$. Otherwise use the associativity of vertex operators to obtain
$0 = \mathrm{span}\{u_m v_n Y[J] \mid u, v \in V^J, m, n \in \mathbb{Z}\} = \mathrm{span}\{v_n Y[J] \mid v \in V^0, n \in \mathbb{Z}\} = Y[J]$. By associativity again we see that $\mathrm{span}\{v_n Y[J] \mid v \in V^J, n \in \mathbb{Z}\}$ is a nonzero $V^0$-submodule of $M$. Since $M$ is irreducible it follows immediately that $\mathrm{span}\{v_n Y[J] \mid v \in V^J, n \in \mathbb{Z}\} = M$. So $M$ is a subspace of $Y$. Since $X$ is generated by $M$ as a $V$-module we immediately have $X = Y$. This shows that $X$ is indeed an irreducible $V$-module.

It should be pointed out that each $X[J]$ is in fact an irreducible $V^0$-module. Let $0 \neq u \in X[J]$. Since $X = \mathrm{span}\{v_n u \mid v \in V, n \in \mathbb{Z}\}$ we see that $\mathrm{span}\{v_n u \mid v \in V^0, n \in \mathbb{Z}\} = X[J]$.

Corollary 2.13. Let $V$ be a FVOA. Then
(1) $V$ has only finitely many irreducible admissible modules and every irreducible admissible $V$-module is an ordinary irreducible $V$-module.
(2) $V$ is regular, that is, any weak $V$-module is a direct sum of ordinary irreducible $V$-modules.

Proof. We have already mentioned that (1) is true for all rational vertex operator algebras (see [DLM3]). So (1) is an immediate consequence of Theorem 2.12. In [DLM1] we proved that (2) is true for any rational vertex operator algebra which has a regular vertex operator subalgebra with the same Virasoro element. Note that $T_r$ is such a vertex operator subalgebra of $V$.

Theorem 2.12 is very useful. We will see in the later sections that the FVOAs $V_\Lambda^+$ and $V^\natural$ are rational vertex operator algebras. Theorem 2.12 simplifies the original proofs of the rationality of $V_\Lambda^+$ in [D3] and of $V^\natural$ in [DLM1]. Most importantly, we do not use the self-dual property of $V^\natural$ (i.e., $V^\natural$ is the only irreducible module for itself) as proved in [D3].

It is an interesting problem to find suitable invariants for a FVOA $V$. Two invariants of $V$ are the binary codes $C$ and $D$ of length $r$ as defined before. They cannot be arbitrary but must satisfy the following conditions:

Proposition 2.14.
(1) The code $C$ is even, i.e. the weight $\mathrm{wt}(c) = \sum_{i=1}^r c_i \in \mathbb{Z}_+$ of every codeword $c \in C$ is divisible by 2.
(2) The weights of all codewords $d \in D$ are divisible by 8.
(3) The binary code $D$ is a subcode of the annihilator code $C^\perp = \{d = (d_i) \in \mathbb{F}_2^r \mid (d,c) = \sum_i d_i c_i = 0 \text{ for all } c = (c_i) \in C\}$.

Proof. Let $W$ be a $T_r$-submodule isomorphic to $M(h_1,\ldots,h_r)$. Then the weight of a highest weight vector of $W$ is $h_1 + h_2 + \cdots + h_r$, which is necessarily an integer as $V$ is $\mathbb{Z}$-graded. Parts (1) and (2) now follow immediately.

To see (3), note that for $c \in C$ and $M \leq V^I$ isomorphic to $M(g_1,\ldots,g_r)$ one has from the fusion rules given in Theorem 2.3 (2) that $M' = \mathrm{span}\{u_n M(g_1,\ldots,g_r) \mid u \in V(c), n \in \mathbb{Z}\} \leq V^I$ is isomorphic to $M(h_1,\ldots,h_r)$ with $h_i = g_i = \frac1{16}$ if $i \in I$ and $h_i = 0$ (resp. $h_i = \frac12$) if $c_i + 2g_i = 0$ in $\mathbb{F}_2$ (resp. $c_i + 2g_i = 1$). Since the conformal weights $g_1 + \cdots + g_r$ and $h_1 + \cdots + h_r$ of $M(g_1,\ldots,g_r)$ and $M(h_1,\ldots,h_r)$ are both integral we see that $\#(\{i \in \{1,\ldots,r\} \mid c_i = 1\} \setminus I)$ is an even integer. Thus $\#\{i \in I \mid c_i = 1\}$ is also even, as $\mathrm{wt}(c)$ is even. This implies that $(d,c) = 0$, as required, where $d \in D$ is the codeword belonging to $I \subseteq \{1,\ldots,r\}$.
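The three conditions of Proposition 2.14 can be tested by brute force for small codes. The following is only a minimal sketch under stated assumptions: the helper names (`weight`, `span`, `annihilator`, `satisfies_prop_2_14`) are hypothetical, codewords are represented as 0/1 tuples, and the sample pair is the one that appears to follow from Corollary 3.3 (1) for $V_{D_1^d}$ with $d = 2$, namely $C = \{(c_1,c_1,c_2,c_2)\}$ and $D = 0$.

```python
from itertools import product


def weight(word):
    """Hamming weight of a 0/1 tuple."""
    return sum(word)


def span(generators, n):
    """All F_2-linear combinations of the generators (0/1 tuples of length n)."""
    code = set()
    for coeffs in product((0, 1), repeat=len(generators)):
        word = tuple(
            sum(a * g[i] for a, g in zip(coeffs, generators)) % 2
            for i in range(n)
        )
        code.add(word)
    return code


def annihilator(code, n):
    """Brute-force annihilator: all d with (d, c) = 0 in F_2 for every c in code."""
    return {
        d
        for d in product((0, 1), repeat=n)
        if all(sum(x * y for x, y in zip(d, c)) % 2 == 0 for c in code)
    }


def satisfies_prop_2_14(C, D, n):
    """Check conditions (1)-(3) of Proposition 2.14 for a pair of codes."""
    c_is_even = all(weight(c) % 2 == 0 for c in C)      # condition (1)
    d_weights_ok = all(weight(d) % 8 == 0 for d in D)   # condition (2)
    d_in_c_perp = D <= annihilator(C, n)                # condition (3)
    return c_is_even and d_weights_ok and d_in_c_perp


# Assumed sample data: the codes of V_{D_1^2} (r = 4) as read off from
# Corollary 3.3 (1); this reading is an assumption of the sketch.
C = span([(1, 1, 0, 0), (0, 0, 1, 1)], 4)
D = {(0, 0, 0, 0)}
print(satisfies_prop_2_14(C, D, 4))
```

In this sample $C^\perp = C$ while $D = 0$, so the inclusion in (3) is strict; the brute-force search is exponential in $r$ and is meant only for small illustrative cases.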
Here are a few remarks on the action of $\mathrm{Aut}(V)$ on $V_2$, which is an action preserving the algebra product $a_1 b$ coming from the VOA structure.

Remark 2.15. (1) If $V$ is a VOA and is generated as a VOA by $V_2$, then $\mathrm{Aut}(V)$ acts faithfully on $V_2$. This happens in the case $V = V_L^+$, where $L$ is a lattice spanned by its vectors $x$ such that $(x,x) = 4$.
(2) If $V$ is a FVOA, the kernel of the action of $\mathrm{Aut}(V)$ on $V_2$ is contained in the intersection of the groups $G_C$, as we vary over all frames. Hence, this kernel is a finite 2-group, of nilpotence class at most two and order dividing $2^r$, where $r = \mathrm{rank}(V)$.

The framed vertex operator algebras with $D = 0$ can be completely understood in an easy way.

Proposition 2.16. For every even linear code $C \leq \mathbb{F}_2^r$ there is up to isomorphism exactly one FVOA $V_C$ such that the associated binary codes are $C(V_C) = C$ and $D(V_C) = 0$.

Proof. Let $V_{\mathrm{Fermi}} = M(0) \oplus M(\frac12)$ be the super vertex operator algebra as described in [KW]. The (graded) tensor product $V_{\mathrm{Fermi}}^{\otimes r}$ is a super vertex operator algebra whose code $C$ is the complete code $\mathbb{F}_2^r$ (see Remark 2.6). It has the property that the even vertex operator subalgebra is the vertex operator algebra associated to the level 1 irreducible highest weight representation for the affine Kac-Moody algebra $D_{r/2}$ if $r$ is even and $B_{(r-1)/2}$ if $r$ is odd (see [H1], chapter 2). The code $C$ for this vertex operator algebra is the even subcode of $\mathbb{F}_2^r$. Proposition 2.5 (6) gives, for every even code $C \leq \mathbb{F}_2^r$, a FVOA $V$ such that $C(V) = C$ and $D(V) = 0$. The uniqueness of the FVOA with code $C(V) = C$ up to isomorphism follows from a general result on the uniqueness of simple current extensions of vertex operator algebras [H3].

This proposition is also proved in a different way by Miyamoto in [M2, M3].

Recall that a holomorphic (or self-dual) VOA is a VOA $V$ whose only irreducible module is $V$ itself. In the case of holomorphic FVOAs, we can show that the subcode $D \leq C^\perp$ is in fact equal to $C^\perp$.
We need some basic facts from [Z] and [DLM4] about the “conformal block on the torus” $B_V$ [Z] of a VOA $V$. To apply Zhu’s modular invariance theorems one has to assume that $V$ is rational and satisfies the $C_2$ condition. (The condition that $V$ be a direct sum of highest weight representations for the Virasoro algebra was also required in [Z], but was removed in [DLM4].) It was proved in [DLM4] that the moonshine VOA satisfies the $C_2$ condition. The same proof in fact works for any FVOA. We also know from Theorem 2.12 that a FVOA is rational. Applying Zhu’s result to a FVOA $V$ yields that $B_V$ is a finite dimensional complex vector space with a canonical base $T_{M_i}$ indexed by the inequivalent irreducible $V$-modules $M_i$ and that $B_V$ carries a natural $SL_2(\mathbb{Z})$-module structure $\rho_V : SL_2(\mathbb{Z}) \to GL(B_V)$.

Let $V$ and $W$ be two rational VOAs satisfying the $C_2$ condition. The following two properties of the conformal block follow directly from the definition:

(B1) $B_{V\otimes W} = B_V \otimes B_W$ as $SL_2(\mathbb{Z})$-modules and $T_{M_i \otimes M_j} = T_{M_i} \otimes T_{M_j}$.
(B2) If $W$ is a subVOA of $V$ with the same Virasoro element then there is a natural $SL_2(\mathbb{Z})$-module map $\iota^* : B_V \to B_W$.

We also need the following well-known result:
(B3) For the vertex operator algebra $M(0)$, the action of $S = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \in SL_2(\mathbb{Z})$ on $B_{M(0)}$ in the canonical basis $\{T_{M(0)}, T_{M(\frac12)}, T_{M(\frac1{16})}\}$ is given by the matrix
$$\begin{pmatrix} 1/2 & 1/2 & 1/\sqrt2 \\ 1/2 & 1/2 & -1/\sqrt2 \\ 1/\sqrt2 & -1/\sqrt2 & 0 \end{pmatrix}. \tag{2.3}$$

Here is a result about binary codes used in the proof of Theorem 2.19 below:

Lemma 2.17. Let $\mu^{\otimes n}$ be the $n$-fold tensor product of the matrix $\mu = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$ considered as a linear endomorphism of the vector space $\mathbb{C}[\mathbb{F}_2^n] \cong \mathbb{C}[\mathbb{F}_2]^{\otimes n}$ on the canonical base $\{e_v \mid v \in \mathbb{F}_2^n\}$. For a subset $X \subseteq \mathbb{F}_2^n$ denote by $\chi_X = \sum_{v\in X} e_v$ the characteristic function of $X$. Then the following relation between a linear code $C$ and its annihilator $C^\perp$ holds:
$$\chi_{C^\perp} = \frac{1}{|C|} \cdot \mu^{\otimes n}(\chi_C).$$

Remark 2.18. $\mu^{\otimes n}$ is a Hadamard matrix of size $2^n$ and the corresponding linear map is called the Hadamard transform.

Proof. For every $\mathbb{Z}$-module $R$ and function $f : \mathbb{F}_2^n \to R$ the following relation holds (cf. Ch. 5, after Lemma 2 of [MaS]):
$$|C| \sum_{v\in C^\perp} f(v) = \sum_{u\in C} \sum_{v\in \mathbb{F}_2^n} (-1)^{(u,v)} f(v). \tag{2.4}$$
Now let $R$ be the abelian group $\mathbb{C}[\mathbb{F}_2^n]$ and define $f$ by $f(v) = e_v$ for all $v \in \mathbb{F}_2^n$. The left hand side of (2.4) is $|C| \cdot \chi_{C^\perp}$. Expansion of the right side gives:
$$\sum_{u\in C} \sum_{v_1,\ldots,v_n \in \mathbb{F}_2} \prod_{i=1}^n (-1)^{u_i v_i}\, e_{v_1} \otimes \cdots \otimes e_{v_n} = \sum_{u\in C} \mu^{\otimes n}(e_u) = \mu^{\otimes n}(\chi_C).$$
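The relation of Lemma 2.17 can be checked numerically on small codes. The sketch below is illustrative only: the function names are hypothetical, vectors of $\mathbb{F}_2^n$ are 0/1 tuples, and the transform is computed directly from $\mu^{\otimes n}(e_u) = \sum_v (-1)^{(u,v)} e_v$ rather than by a fast algorithm.

```python
from itertools import product


def dot(u, v):
    """Standard bilinear form on F_2^n, returned as 0 or 1."""
    return sum(a * b for a, b in zip(u, v)) % 2


def span(generators, n):
    """All F_2-linear combinations of the generators (0/1 tuples of length n)."""
    code = set()
    for coeffs in product((0, 1), repeat=len(generators)):
        code.add(tuple(
            sum(a * g[i] for a, g in zip(coeffs, generators)) % 2
            for i in range(n)
        ))
    return code


def hadamard_transform(f, n):
    """Coefficients of mu^{tensor n} applied to f = sum_u f[u] e_u."""
    vectors = list(product((0, 1), repeat=n))
    return {
        v: sum((-1) ** dot(u, v) * f.get(u, 0) for u in vectors)
        for v in vectors
    }


def annihilator_via_transform(C, n):
    """Read C^perp off the transform of chi_C, as in Lemma 2.17."""
    chi_C = {c: 1 for c in C}
    transformed = hadamard_transform(chi_C, n)
    # By the lemma the transform takes the value |C| on C^perp and 0 elsewhere.
    assert all(value in (0, len(C)) for value in transformed.values())
    return {v for v, value in transformed.items() if value == len(C)}


# Repetition code of length 3: its annihilator is the even-weight code.
repetition = span([(1, 1, 1)], 3)
print(sorted(annihilator_via_transform(repetition, 3)))
```

The internal assertion encodes the statement of the lemma itself, so a failure would indicate that the input set is not a linear code.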
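Theorem 2.19. For a holomorphic FVOA the binary codes $C$ and $D$ satisfy $D = C^\perp$.

Proof. The vector of multiplicities $m_h(V)$ can be regarded as an element in the vector space $\mathbb{C}[\mathcal{F}^r] \cong \mathbb{C}[\mathcal{F}]^{\otimes r}$, where $\mathcal{F} = \{M(0), M(\frac12), M(\frac1{16})\}$. Define two linear maps $\pi, \theta : \mathbb{C}[\mathcal{F}] \to \mathbb{C}[\mathbb{F}_2] = \mathbb{C} e_0 \oplus \mathbb{C} e_1$ by
$$\pi(M(0)) = e_0, \qquad \pi(M(\tfrac12)) = e_0, \qquad \pi(M(\tfrac1{16})) = \sqrt2\, e_1,$$
and
$$\theta(M(0)) = e_0, \qquad \theta(M(\tfrac12)) = e_1, \qquad \theta(M(\tfrac1{16})) = 0.$$
Finally let $\sigma : \mathbb{C}[\mathcal{F}] \to \mathbb{C}[\mathcal{F}]$ be the linear map given by the matrix (2.3) relative to the basis $\mathcal{F}$. Now one has $\pi \circ \sigma = \mu \circ \theta$ and thus the following diagram commutes:
$$\begin{array}{ccc} \mathbb{C}[\mathcal{F}^r] & \xrightarrow{\ \sigma^{\otimes r}\ } & \mathbb{C}[\mathcal{F}^r] \\ \downarrow \theta^{\otimes r} & & \downarrow \pi^{\otimes r} \\ \mathbb{C}[\mathbb{F}_2^r] & \xrightarrow{\ \mu^{\otimes r}\ } & \mathbb{C}[\mathbb{F}_2^r]. \end{array} \tag{2.5}$$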
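The single-factor identity $\pi \circ \sigma = \mu \circ \theta$ behind the commuting diagram is a statement about small matrices and can be confirmed numerically. The following is a minimal sketch: the matrix encodings and the helper names `matmul` and `close` are choices made for this illustration, with columns ordered as $M(0)$, $M(\frac12)$, $M(\frac1{16})$ and rows of the $2 \times 3$ matrices ordered as $e_0$, $e_1$.

```python
import math

SQRT2 = math.sqrt(2.0)

# Matrix (2.3); it is symmetric, so row and column conventions agree.
SIGMA = [
    [0.5, 0.5, 1 / SQRT2],
    [0.5, 0.5, -1 / SQRT2],
    [1 / SQRT2, -1 / SQRT2, 0.0],
]
# Column j holds the image of the j-th module in the basis (e_0, e_1).
PI = [
    [1.0, 1.0, 0.0],
    [0.0, 0.0, SQRT2],
]
THETA = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
]
MU = [
    [1.0, 1.0],
    [1.0, -1.0],
]


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def close(a, b, tol=1e-12):
    """Entrywise comparison up to a floating point tolerance."""
    return all(
        abs(x - y) <= tol
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
    )


lhs = matmul(PI, SIGMA)   # pi composed with sigma
rhs = matmul(MU, THETA)   # mu composed with theta
print(close(lhs, rhs))
```

Since tensor powers of equal maps are equal, the one-factor check is what makes the diagram (2.5) commute for every $r$.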
By definition, the support of $\pi^{\otimes r}(m_h(V))$ is $D \leq \mathbb{F}_2^r$. From Lemma 2.17, $\mu^{\otimes r} \circ \theta^{\otimes r}(m_h(V)) = |C| \cdot \chi_{C^\perp} \in \mathbb{C}[\mathbb{F}_2^r]$. Note that the support of $\chi_{C^\perp}$ is $C^\perp$. These facts together with (2.5) imply the theorem if we can show that $\sigma^{\otimes r}(m_h(V)) = m_h(V)$.

We identify $\mathbb{C}[\mathcal{F}^r]$ with the conformal block on the torus of the VOA $T_r$ by identifying the canonical bases: $M = T_M$. Using (B1) and (B3) we observe that $\sigma^{\otimes r} = \rho_{T_r}(S)$, where $\rho_{T_r}$ is the representation $\rho_{T_r} : SL_2(\mathbb{Z}) \to GL(B_{T_r})$ of degree $3^r$.

Define the shifted graded character $\mathrm{ch}_V(\tau) := q^{-c/24} \sum_{n\geq 0} (\dim V_n) q^n$, where $q = e^{2\pi i\tau}$ and $c$ is the central charge of $V$. Since $V$ is holomorphic, the conformal block $B_V$ is one dimensional. Then $\rho_V(S) = 1$ (the case $\rho_V(S) = -1$ is impossible since $\mathrm{ch}_V(i) > 0$, where $i$ is the square root of $-1$ in the upper half plane; cf. [H1], proof of Cor. 2.1.3). Now we use (B2). The generator $T_V$ of $B_V$ is mapped by $\iota^*$ to $\sum m_{h_1,\ldots,h_r} T_{M(h_1,\ldots,h_r)} = m_h(V)$. Since $\iota^*$ is $SL_2(\mathbb{Z})$-equivariant we get
$$\sigma^{\otimes r}(m_h(V)) = \rho_{T_r}(S)(m_h(V)) = \iota^*(\rho_V(S)(T_V)) = m_h(V).$$

The same kind of argument was used in the proof of Theorem 4.1.5 in [H1].

3. Vertex Operator Algebras $V_{D_1^d}$
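Let $D_n = \{(x_1,\ldots,x_n) \in \mathbb{Z}^n \mid \sum_{i=1}^n x_i \text{ even}\} \leq \mathbb{R}^n$, $n \geq 1$, be the root lattice of type $D_n$, the “checkerboard lattice”. In this section, we describe the Virasoro decomposition of modules and twisted modules for the vertex operator algebra $V_{D_1^d}$.

We work in the setting of [FLM] and [DMZ]. In particular $L$ is an even lattice with nondegenerate symmetric $\mathbb{Z}$-bilinear form $\langle\cdot,\cdot\rangle$; $\mathfrak{h} = L \otimes_{\mathbb{Z}} \mathbb{C}$; $\hat{\mathfrak{h}}_{\mathbb{Z}}$ is the corresponding Heisenberg algebra; $M(1)$ is the associated irreducible induced module for $\hat{\mathfrak{h}}_{\mathbb{Z}}$ such that the canonical central element of $\hat{\mathfrak{h}}_{\mathbb{Z}}$ acts as $1$; $(\hat L, \bar{\ })$ is the central extension of $L$ by $\langle\kappa \mid \kappa^2 = 1\rangle$, a group of order 2, with commutator map $c_0(\alpha,\beta) = \langle\alpha,\beta\rangle + 2\mathbb{Z}$; $c(\cdot,\cdot)$ is the alternating bilinear form given by $c(\alpha,\beta) = (-1)^{c_0(\alpha,\beta)}$ for $\alpha, \beta \in L$; $\chi$ is a faithful linear character of $\langle\kappa\rangle$ such that $\chi(\kappa) = -1$; $\mathbb{C}\{L\} = \mathrm{Ind}_{\langle\kappa\rangle}^{\hat L} \mathbb{C}_\chi$ ($\simeq \mathbb{C}[L]$, linearly), where $\mathbb{C}_\chi$ is the one-dimensional $\langle\kappa\rangle$-module defined by $\chi$; $\iota(a) = a \otimes 1 \in \mathbb{C}\{L\}$ for $a \in \hat L$; $V_L = M(1) \otimes \mathbb{C}\{L\}$; $\mathbf{1} = \iota(1)$; $\omega = \frac12 \sum_{r=1}^d \beta_r(-1)^2$, where $\{\beta_1,\ldots,\beta_d\}$ is an orthonormal basis of $\mathfrak{h}$. It was proved in [B] and [FLM] that there is a linear map
$$V_L \to (\mathrm{End}\, V_L)[[z, z^{-1}]], \qquad v \mapsto Y(v,z) = \sum_{n\in\mathbb{Z}} v_n z^{-n-1} \quad (v_n \in \mathrm{End}\, V_L)$$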
such that $V_L = (V_L, Y, \mathbf{1}, \omega)$ is a simple vertex operator algebra.

Let $L^* = \{x \in \mathfrak{h} \mid \langle x, L\rangle \leq \mathbb{Z}\}$ be the dual lattice of $L$. Then the irreducible modules of $V_L$ are the $V_{L+\gamma}$ (which are defined in [D1]) indexed by the elements of the quotient group $L^*/L$ (see [D1]). In fact, $V_L$ is a rational vertex operator algebra (see [DLM1]).

Let $\theta$ be the automorphism of $\hat L$ such that $\theta(a) = a^{-1}\kappa^{\langle\bar a,\bar a\rangle/2}$. Then $\theta$ is a lift of the $-1$ automorphism of $L$. We have an automorphism of $V_L$, denoted again by $\theta$, such that $\theta(u \otimes \iota(a)) = \theta(u) \otimes \iota(\theta a)$ for $u \in M(1)$ and $a \in \hat L$. (See Appendix D for a fuller discussion.) Here the action of $\theta$ on $M(1)$ is given by $\theta(\alpha_1(n_1)\cdots\alpha_k(n_k)) = (-1)^k \alpha_1(n_1)\cdots\alpha_k(n_k)$. The $\theta$-invariants $V_L^+$ of $V_L$ form a simple vertex operator subalgebra and the $-1$-eigenspace $V_L^-$ is an irreducible $V_L^+$-module (see Theorem 2 of [DM]). Clearly $V_L = V_L^+ \oplus V_L^-$.
Now we take for $L$ the lattice
$$D_1^d = \bigoplus_{i=1}^d \mathbb{Z}\alpha_i, \qquad \langle\alpha_i,\alpha_j\rangle = 4\delta_{i,j}.$$
Then $L$ is an even lattice, the central extension $\hat L$ is a direct product of $D_1^d$ with $\langle\kappa\rangle$, and $\mathbb{C}\{L\}$ is simply the group algebra $\mathbb{C}[L]$ with basis $e^\alpha$ for $\alpha \in L$. It is clear that $\theta(e^\alpha) = e^{-\alpha}$ for $\alpha \in D_1^d$. We extend the action of $\theta$ from $V_{D_1^d}$ to $V_{(D_1^*)^d} = M(1) \otimes \mathbb{C}[L^*]$ such that $\theta(u \otimes e^\alpha) = (\theta u) \otimes e^{-\alpha}$ for $u \in M(1)$ and $\alpha \in L^*$. One can easily verify that $\theta$ has order 2 and $\theta Y(u,z)\theta^{-1} = Y(\theta u, z)$ for $u \in V_{D_1^d}$, where $Y(v,z)$ ($v \in V_{D_1^d}$) are the vertex operators on $V_{(D_1^*)^d}$. For any $\theta$-invariant subspace $V$ of $V_{L^*}$ we use $V^\pm$ to denote the $\pm$-eigenspaces.

First we turn our attention to the case that $d = 1$. Then $L = \mathbb{Z}\alpha \cong 2\mathbb{Z} = D_1$, where $\langle\alpha,\alpha\rangle = 4$. Note that the dual lattice $D_1^*$ is $\frac14 D_1$ and $\{0, 1, \frac12, -\frac12\}$ is a system of coset representatives of $D_1^*/D_1$. Set
$$\omega_1 = \frac1{16}\alpha(-1)^2 + \frac14(e^\alpha + e^{-\alpha}), \qquad \omega_2 = \frac1{16}\alpha(-1)^2 - \frac14(e^\alpha + e^{-\alpha}). \tag{3.1}$$
Then $\omega_i \in V_{D_1}^+$.

Lemma 3.1. For $D_1 \cong L = \mathbb{Z}\alpha$, $\langle\alpha,\alpha\rangle = 4$, we have:
(1) $V_{D_1}$ is a FVOA with $r = 2$.
(2) We have the following Virasoro decompositions of $V_{D_1}^+$ and $V_{D_1}^-$:
$$V_{D_1}^+ \cong M(0,0), \qquad V_{D_1}^- \cong M(\tfrac12,\tfrac12)$$
with highest weight vectors $\mathbf{1}$ and $\alpha(-1)$, respectively.
(3) The decompositions for $V_{D_1+1}^\pm$ are:
$$V_{D_1+1}^+ \cong M(\tfrac12, 0), \qquad V_{D_1+1}^- \cong M(0, \tfrac12)$$
with highest weight vectors $(e^{\frac12\alpha} - e^{-\frac12\alpha})$ and $(e^{\frac12\alpha} + e^{-\frac12\alpha})$, respectively.
(4) For $V_{D_1+\frac12} \oplus V_{D_1-\frac12}$ we get, in both cases,
$$(V_{D_1+\frac12} \oplus V_{D_1-\frac12})^\pm \cong M(\tfrac1{16},\tfrac1{16})$$
with highest weight vectors $e^{\frac14\alpha} \pm e^{-\frac14\alpha}$. In fact, both $V_{D_1+\frac12}$ and $V_{D_1-\frac12}$ are irreducible $V_{D_1}^+$-modules.
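As a consistency check on the highest weights in Lemma 3.1, one can compare the total conformal weight of each listed vector with $h_1 + h_2$. The computation below is a sketch that assumes the standard fact (not derived in the text above) that $e^\lambda$ has conformal weight $\frac12\langle\lambda,\lambda\rangle$ in a lattice vertex operator algebra module, and it uses only $\langle\alpha,\alpha\rangle = 4$ and (3.1).

```latex
\begin{align*}
\omega_1 + \omega_2 &= \tfrac18\,\alpha(-1)^2 = \tfrac12\,\beta(-1)^2 = \omega,
  \qquad \beta = \tfrac12\alpha,\ \langle\beta,\beta\rangle = 1,\\
\operatorname{wt}\bigl(\alpha(-1)\bigr) &= 1 = \tfrac12 + \tfrac12,\\
\operatorname{wt}\bigl(e^{\pm\frac12\alpha}\bigr)
  &= \tfrac12\bigl\langle\tfrac12\alpha,\tfrac12\alpha\bigr\rangle = \tfrac12 = \tfrac12 + 0,\\
\operatorname{wt}\bigl(e^{\pm\frac14\alpha}\bigr)
  &= \tfrac12\bigl\langle\tfrac14\alpha,\tfrac14\alpha\bigr\rangle = \tfrac18 = \tfrac1{16} + \tfrac1{16}.
\end{align*}
```

These totals agree with the pairs $(\frac12,\frac12)$, $(\frac12,0)$ or $(0,\frac12)$, and $(\frac1{16},\frac1{16})$ in parts (2), (3) and (4); the check does not by itself determine how the total splits between $L^1(0)$ and $L^2(0)$.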
Proof. It was proved in [DMZ] (see Theorem 6.3 there) that $Y(\omega_1, z) = \sum_{n\in\mathbb{Z}} L^1(n) z^{-n-2}$ and $Y(\omega_2, z) = \sum_{n\in\mathbb{Z}} L^2(n) z^{-n-2}$ give two commuting Virasoro algebras with central charge $\frac12$. We first show that the highest weight of $\alpha(-1)$ is $(\frac12,\frac12)$. Since $\alpha(-1) \in V_{D_1}^-$ has the smallest weight in $V_{D_1}^-$, it is immediate that $L^i(n)\alpha(-1) = 0$ if $n > 0$. It is a straightforward computation using the definition of vertex operators to show that $L^1(0)\alpha(-1) = L^2(0)\alpha(-1) = \frac12\alpha(-1)$.

Clearly, $\mathbf{1} \in (V_{D_1})^+$ is a highest weight vector for the Virasoro algebras with highest weight $(0,0)$. So $V_{D_1}$ contains two highest weight modules for the two Virasoro algebras with highest weights $(0,0)$ and $(\frac12,\frac12)$. Since $M(0,0) \oplus M(\frac12,\frac12)$ and $V_{D_1}$ have the same graded dimension we conclude that $V_{D_1} \cong M(0,0) \oplus M(\frac12,\frac12)$ and $V_{D_1}^+ \cong M(0,0)$, $V_{D_1}^- \cong M(\frac12,\frac12)$. This proves (2) and also shows (1): $V_{D_1}$ is a FVOA with $r = 2$.

Additionally we see that $V_{D_1}$ is a unitary representation of the two Virasoro algebras. By Theorem 2.3 (3) we know that $V_{D_1+\lambda}$, for $\lambda = 0, \pm\frac12, 1$, is a direct sum of irreducible modules $M(h_1,h_2)$ with $h_i \in \{0, \frac12, \frac1{16}\}$. It is easy to find all highest weight vectors in $V_{D_1+\lambda}$. Parts (3) and (4) then follow immediately.

We return to the lattice $L = \bigoplus_{i=1}^d \mathbb{Z}\alpha_i$, $\langle\alpha_i,\alpha_j\rangle = 4\delta_{i,j}$, $L \cong D_1^d \cong (2\mathbb{Z})^d$. We sometimes identify $L$ with $(2\mathbb{Z})^d$. The component $\mathbb{Z}\alpha_i$ gives two Virasoro elements $\omega_{2i-1}$ and $\omega_{2i}$, as in (3.1) above.

Definition 3.2. The VF associated to the FVOAs derived from the $D_1^d$-lattice is the set $\{\omega_1,\ldots,\omega_{2d}\}$.

Corollary 3.3. (1) The decomposition of $V_{D_1^d}^\pm$ into irreducible modules for $T_{2d}$ is given by
$$V_{D_1^d}^\pm \cong \bigoplus_{\substack{(h_{2i-1},h_{2i}) \in \{(0,0),(\frac12,\frac12)\} \\ (-1)^{\#\{i \mid h_{2i} = 0\}} = \pm 1}} M(h_1,\ldots,h_{2d}).$$
In particular, $V_{D_1^d}^\pm$ is a direct sum of $2^{d-1}$ irreducible modules for $T_{2d}$.
(2) Let $\gamma = (\gamma_i) \in (D_1^*)^d$ be such that $\gamma_i \in \{0,1\}$. Then we get the decomposition
$$(V_{D_1^d+\gamma})^\pm \cong \bigoplus_{\substack{(h_{2i-1},h_{2i}) \in \{(0,0),(\frac12,\frac12)\} \text{ if } \gamma_i = 0,\ \{(0,\frac12),(\frac12,0)\} \text{ if } \gamma_i = 1 \\ (-1)^{\#\{i \mid h_{2i} = 0\}} = \pm 1}} M(h_1,\ldots,h_{2d}).$$
(3) Let $\gamma = (\gamma_i) \in (D_1^*)^d$ be such that $2\gamma \notin D_1^d$, i.e. there is at least one $i$ such that $\gamma_i = \pm\frac12$. Then $(V_{D_1^d+\gamma} \oplus V_{D_1^d-\gamma})^\pm$ and $V_{D_1^d\pm\gamma}$ have the same decomposition:
$$\bigoplus_{(h_{2i-1},h_{2i}) \in S_i} M(h_1,\ldots,h_{2d}), \qquad S_i = \begin{cases} \{(0,0),(\frac12,\frac12)\} & \text{if } \gamma_i = 0, \\ \{(\frac12,0),(0,\frac12)\} & \text{if } \gamma_i = 1, \\ \{(\frac1{16},\frac1{16})\} & \text{if } \gamma_i = \pm\frac12. \end{cases}$$

Proof. Note that $V_{D_1^d}$ is isomorphic to the tensor product vertex operator algebra $V_{D_1} \otimes \cdots \otimes V_{D_1}$ ($d$ factors) and that $V_{D_1^d+\gamma}$ is isomorphic to the tensor product module $V_{\mathbb{Z}\alpha_1+\gamma_1} \otimes \cdots \otimes V_{\mathbb{Z}\alpha_d+\gamma_d}$. Thus
$$(V_{D_1^d+\gamma} \oplus V_{D_1^d-\gamma})^\pm = \bigoplus_{\substack{\mu \in \{+,-\}^d \\ \prod_i \mu_i = \pm}} V_{D_1+\gamma_1}^{\mu_1} \otimes \cdots \otimes V_{D_1+\gamma_d}^{\mu_d}.$$

The results (1) and (2) now follow from Lemma 3.1 immediately.

For (3) it is clear that the decompositions for $V_{D_1^d\pm\gamma}$ hold by Lemma 3.1. It remains to show that $V_{D_1^d\pm\gamma}$ and $(V_{D_1^d+\gamma} \oplus V_{D_1^d-\gamma})^\pm$ are all isomorphic $T_{2d}$-modules. Note from Lemma 3.1 that $V_{D_1+h}$ and $V_{D_1-h}$ are isomorphic $T_2$-modules for any $h \in \{0, 1, \pm\frac12\}$. Thus $V_{D_1^d+\gamma}$ and $V_{D_1^d-\gamma}$ are isomorphic $T_{2d}$-modules. In fact, $\theta : V_{D_1^d+\gamma} \to V_{D_1^d-\gamma}$ is such an isomorphism. Thus $(V_{D_1^d+\gamma} \oplus V_{D_1^d-\gamma})^\pm = \{v \pm \theta v \mid v \in V_{D_1^d+\gamma}\}$ are isomorphic to $V_{D_1^d+\gamma}$ as $T_{2d}$-modules.
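The count of $2^{d-1}$ summands in Corollary 3.3 (1) can be reproduced by enumerating the tensor factors used in the proof. The sketch below is an illustration only: the name `summands` and the dictionary `FACTOR` are hypothetical, and the sign of a summand is taken to be the product of the signs of its tensor factors, with $V_{D_1}^+ \cong M(0,0)$ and $V_{D_1}^- \cong M(\frac12,\frac12)$ from Lemma 3.1 (2).

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)

# Lemma 3.1 (2): highest weights (h_1, h_2) of the two eigenspaces of V_{D_1}.
FACTOR = {
    +1: (Fraction(0), Fraction(0)),   # V_{D_1}^+ = M(0, 0)
    -1: (HALF, HALF),                 # V_{D_1}^- = M(1/2, 1/2)
}


def summands(d, sign):
    """Highest weights (h_1, ..., h_{2d}) of the T_{2d}-summands of V_{D_1^d}^{sign}.

    Assumption of this sketch: a tensor product of eigenspaces lies in the
    +/- part according to the product of the signs of its factors.
    """
    result = []
    for mu in product((+1, -1), repeat=d):
        sign_product = 1
        for m in mu:
            sign_product *= m
        if sign_product == sign:
            result.append(tuple(h for m in mu for h in FACTOR[m]))
    return result


for d in range(1, 5):
    print(d, len(summands(d, +1)), len(summands(d, -1)))
```

Each listed tuple has integral total weight, as it must for a submodule of the $\mathbb{Z}$-graded algebra $V_{D_1^d}$.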
Next we discuss the twisted modules of $V_L$ for an arbitrary $d$-dimensional positive definite even lattice $L$. Recall from [FLM] the definition of the twisted sectors associated to an even lattice $L$. Let $K = \{\theta(a)a^{-1} \mid a \in \hat L\}$. Then $\bar K = 2L$ (bar is the quotient map $\hat L \to L$). Also set $R := \{\alpha \in L \mid \langle\alpha, L\rangle \leq 2\mathbb{Z}\}$; then $R \geq 2L$. Then the inverse image $\hat R$ of $R$ in $\hat L$ is the center of $\hat L$ and $K$ is a subgroup of $\hat R$. It was proved in [FLM] (Proposition 7.4.8) that there are exactly $|R/2L|$ central characters $\chi : \hat R/K \to \mathbb{C}^\times$ of $\hat L/K$ such that $\chi(\kappa K) = -1$. For each such $\chi$, there is a unique (up to equivalence) irreducible $\hat L/K$-module $T_\chi$ with central character $\chi$, and every irreducible $\hat L/K$-module on which $\kappa K$ acts as $-1$ is equivalent to one of these. In particular, viewing $T_\chi$ as an $\hat L$-module, $\theta a$ and $a$ agree as operators on $T_\chi$ for $a \in \hat L$. Let $\hat{\mathfrak{h}}[-1]$ be the twisted Heisenberg algebra. As in Sect. 1.7 of [FLM] we also denote by $M(1)$ the unique irreducible $\hat{\mathfrak{h}}[-1]$-module with the canonical central element acting by $1$. Define the twisted space $V_L^{T_\chi} = M(1) \otimes T_\chi$. It was shown in [FLM] and [DL2] that there is a linear map
$$V_L \to (\mathrm{End}\, V_L^{T_\chi})[[z^{1/2}, z^{-1/2}]], \qquad v \mapsto Y(v,z) = \sum_{n\in\frac12\mathbb{Z}} v_n z^{-n-1}$$
such that $V_L^{T_\chi}$ is an irreducible $\theta$-twisted module for $V_L$. Moreover, every irreducible $\theta$-twisted $V_L$-module is isomorphic to $V_L^{T_\chi}$ for some $\chi$.

Define a linear operator $\hat\theta_d$ on $V_L^{T_\chi}$ such that
$$\hat\theta_d(\alpha_1(-n_1)\cdots\alpha_k(-n_k) \otimes t) = (-1)^k e^{d\pi i/8}\, \alpha_1(-n_1)\cdots\alpha_k(-n_k) \otimes t$$
for $\alpha_i \in \mathfrak{h}$, $n_i \in \frac12 + \mathbb{Z}$ and $t \in T_\chi$. Then $\hat\theta_d Y(u,z)(\hat\theta_d)^{-1} = Y(\theta u, z)$ for $u \in V_L$ (cf. [FLM]). We have the decomposition $V_L^{T_\chi} = (V_L^{T_\chi})^+ \oplus (V_L^{T_\chi})^-$, where $(V_L^{T_\chi})^+$ and $(V_L^{T_\chi})^-$ are the $\hat\theta_d$-eigenspaces with eigenvalues $-e^{d\pi i/8}$ and $e^{d\pi i/8}$ respectively. Then both $(V_L^{T_\chi})^+$ and $(V_L^{T_\chi})^-$ are irreducible $V_L^+$-modules (cf. Theorem 5.5 of [DLi]).

As before, we now take $L = \mathbb{Z}\alpha \cong D_1$ with $\langle\alpha,\alpha\rangle = 4$. Then $K = 2L$, $R = L$ and $R/K \cong \mathbb{Z}_2$. Let $\chi_1$ be the trivial character of $R/K$ and $\chi_{-1}$ the nontrivial character. Then both $T_{\chi_1}$ and $T_{\chi_{-1}}$ are one-dimensional $L$-modules and $\alpha$ acts on $T_{\chi_{\pm1}}$ as $\pm 1$.

Lemma 3.4. We have the Virasoro decompositions:
(1) $(V_{D_1}^{T_{\chi_1}})^+ \cong M(\frac1{16}, \frac12)$, $\quad (V_{D_1}^{T_{\chi_1}})^- \cong M(\frac1{16}, 0)$.
(2) $(V_{D_1}^{T_{\chi_{-1}}})^+ \cong M(\frac12, \frac1{16})$, $\quad (V_{D_1}^{T_{\chi_{-1}}})^- \cong M(0, \frac1{16})$.
Proof. Recall from [DL2] that
$$V_L^{T_{\chi_1}} = \sum_{n\in\frac12\mathbb{Z},\ n\geq 0} (V_L^{T_{\chi_1}})_{\frac1{16}+n}$$
(see Proposition 6.3 and formula (6.28) of [DL2]). Note that
$$(V_L^{T_{\chi_1}})^+ = \sum_{n\in\mathbb{Z},\ n\geq 0} (V_L^{T_{\chi_1}})_{\frac1{16}+\frac12+n}$$
and that
$$(V_L^{T_{\chi_1}})^- = \sum_{n\in\mathbb{Z},\ n\geq 0} (V_L^{T_{\chi_1}})_{\frac1{16}+n}.$$
Since both $(V_L^{T_{\chi_1}})^+$ and $(V_L^{T_{\chi_1}})^-$ are irreducible $V_L^+$-modules we only need to calculate highest weights for nonzero highest weight vectors in these spaces. Note that $T_{\chi_1}$ is a space of highest weight vectors of $(V_L^{T_{\chi_1}})^-$. One can easily verify that $L^1(0) = \frac1{16}$ and $L^2(0) = 0$ on $T_{\chi_1}$. Thus $(V_L^{T_{\chi_1}})^- \cong M(\frac1{16}, 0)$.
Also observe that $\alpha(-1/2) \otimes T_{\chi_1}$ is a space of highest weight vectors of $(V_L^{T_{\chi_1}})^+$. From Lemma 3.1 we know that $\alpha(-1) \in V_L^- \cong M(\frac12,\frac12)$. Now use the fusion rule given in Theorem 2.3 to conclude that $(V_L^{T_{\chi_1}})^+ \cong M(\frac1{16},\frac12)$. Part (2) is proved similarly.

As we did in the untwisted case, we now consider the twisted modules for the lattice $L = \bigoplus_{i=1}^d \mathbb{Z}\alpha_i \cong D_1^d$, $\langle\alpha_i,\alpha_j\rangle = 4\delta_{i,j}$, where $d$ is now a positive integer divisible by 8. Then $K = 2L$, $R = L$ and $R/2L \cong \mathbb{Z}_2^d$. Thus there are $2^d$ irreducible characters for $R/2L$, which are denoted by $\chi_J$ (where $J$ is a subset of $\{1,\ldots,d\}$), sending $\alpha_j$ to $-1$ if $j \in J$ and to $1$ otherwise. Then we have $\chi_J = \prod_j \chi_{x_j}$, where $\chi_{x_j}$ is a character of $\mathbb{Z}\alpha_j/\mathbb{Z}2\alpha_j$ and $x_j = \chi_J(\alpha_j)$. Moreover, $T_{\chi_J} \cong T_{\chi_{x_1}} \otimes \cdots \otimes T_{\chi_{x_d}}$. In particular, each $T_{\chi_J}$ is one dimensional.

Corollary 3.5. We have the Virasoro decompositions:
$$(V_{D_1^d}^{T_{\chi_J}})^\pm = \bigoplus_{\substack{(h_{2i-1},h_{2i}) \in \{(\frac1{16},0),(\frac1{16},\frac12)\} \text{ if } i \notin J,\ \{(0,\frac1{16}),(\frac12,\frac1{16})\} \text{ if } i \in J \\ (-1)^{\#\{j \mid h_j = \frac12\}} = \pm(-1)^{d/8}}} M(h_1,\ldots,h_{2d}).$$
Proof. Recall from the proof of Corollary 3.3 that $V_{D_1^d}$ is isomorphic to the tensor product vertex operator algebra $V_{D_1} \otimes \cdots \otimes V_{D_1}$. Note that $V_{D_1^d}^{T_{\chi_J}}$ is isomorphic to the tensor product $V_{D_1}^{T_{\chi_{x_1}}} \otimes \cdots \otimes V_{D_1}^{T_{\chi_{x_d}}}$ and $\hat\theta_d$ is also a tensor product $\hat\theta_1 \otimes \cdots \otimes \hat\theta_d$. By Lemma 3.4,
$$V_{D_1^d}^{T_{\chi_J}} = \bigoplus_{(h_{2i-1},h_{2i}) \in \{(\frac1{16},0),(\frac1{16},\frac12)\} \text{ if } i \notin J,\ \{(0,\frac1{16}),(\frac12,\frac1{16})\} \text{ if } i \in J} M(h_1,\ldots,h_{2d}).$$
Since $\hat\theta_d = (-1)^{\#\{j \mid h_j = \frac12\}}(-1)^{d/8} = (-1)^{\#\{j \mid h_j = 0\}}(-1)^{d/8}$ on $M(h_1,\ldots,h_{2d})$ we see that $M(h_1,\ldots,h_{2d})$ embeds in $(V_L^{T_{\chi_J}})^\pm$ if and only if $(-1)^{\#\{j \mid h_j = \frac12\}}(-1)^{d/8} = \pm 1$. The proof is complete.
Remark 3.6. Note that $V_{D_1^d}^{T_{\chi_J}}$ is $\frac12\mathbb{Z}$-graded if $d$ is divisible by 8 (cf. [DL2]). In fact $(V_{D_1^d}^{T_{\chi_J}})^+$ is then the subspace of $V_{D_1^d}^{T_{\chi_J}}$ consisting of vectors of integral weights, while $(V_{D_1^d}^{T_{\chi_J}})^-$ is the subspace of $V_{D_1^d}^{T_{\chi_J}}$ consisting of vectors of non-integral weights.
4. Vertex Operator Algebras Associated to Binary Codes

Let $C$ be a doubly-even linear binary code of length $d \in 8\mathbb{Z}$ containing the all ones vector $(1, \dots, 1)$. As mentioned in Sect. 2, we can regard a vector of $\mathbb{F}_2^d$ as an element in $\mathbb{Z}^d$ in an obvious way. One can associate (cf. [CS1]) to such a code the two even lattices
\[
L_C = \{ \tfrac{1}{\sqrt2}(c + x) \mid c \in C,\ x \in (2\mathbb{Z})^d \}
\]
and
\[
\tilde L_C = \{ \tfrac{1}{\sqrt2}(c + y) \mid c \in C,\ y \in (2\mathbb{Z})^d,\ 4 \mid \textstyle\sum y_i \} \cup \{ \tfrac{1}{\sqrt2}(c + y + (\tfrac12, \dots, \tfrac12)) \mid c \in C,\ y \in (2\mathbb{Z})^d,\ 4 \mid (1 - (-1)^{d/8} + \textstyle\sum y_i) \},
\]
and for every self-dual even lattice there are two vertex operator algebras $V_L$ and $\tilde V_L = V_L^+ \oplus (V_L^T)^+$ (see [FLM, DGM1]).

Definition 4.1. A marking for the code $C$ is a partition $M = \{(i_1, i_2), \dots, (i_{d-1}, i_d)\}$ of the positions $1, 2, \dots, d$ into $\frac d2$ pairs.

A marking $M = \{(i_1, i_2), \dots, (i_{d-1}, i_d)\}$ determines the $D_1^d$ sublattice $\bigoplus_{l=1}^{d} \mathbb{Z}\alpha_l$ inside $L_C$ and $\tilde L_C$, where $\alpha_{2k-1} = \sqrt2 (e_{i_{2k-1}} + e_{i_{2k}})$ and $\alpha_{2k} = \sqrt2 (e_{i_{2k-1}} - e_{i_{2k}})$ for $k = 1, \dots, \frac d2$, using $\{e_i\}$ as the standard base of $L_C \otimes \mathbb{Q} = \mathbb{Q}^d$. Let us simplify notation and arrange for the marking to be $M = \{(1, 2), (3, 4), \dots, (d-1, d)\}$.

From Definition 3.2, we see that every such marking defines a system of $2d$ commuting Virasoro algebras inside the vertex operator algebras $V_{L_C}$, $V_{\tilde L_C} \cong \tilde V_{L_C}$ and $\tilde V_{\tilde L_C}$. As the main theorem we describe explicitly the decomposition into irreducible $T_{2d}$-modules in terms of the marked code. The triality symmetry of $\tilde V_{\tilde L_C}$ given in [FLM] and [DGM1] is directly visible in this decomposition. (See also [G1].)

In order to give the Virasoro decompositions in a readable way, we need the next lemma, which describes $L_C$ and $\tilde L_C$ in the coordinate system spanned by the $\alpha_i$. We use the following notation. Let $\gamma_+^0$, $\gamma_-^0$, $\gamma_+^1$ and $\gamma_-^1$ be the maps $\mathbb{F}_2^2 \to (D_1^*/D_1)^2 = \{0, \frac12, -\frac12, 1\}^2$ defined by the table
\[
\begin{array}{c|cccc}
 & (0,0) & (1,1) & (1,0) & (0,1) \\ \hline
\gamma_+^0 & (0,0) & (1,0) & (\frac12,\frac12) & (\frac12,-\frac12) \\
\gamma_-^0 & (1,1) & (0,1) & (-\frac12,-\frac12) & (-\frac12,\frac12) \\
\gamma_+^1 & (\frac12,0) & (-\frac12,0) & (1,\frac12) & (1,-\frac12) \\
\gamma_-^1 & (-\frac12,1) & (\frac12,1) & (0,-\frac12) & (0,\frac12)
\end{array}
\]
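and write $c(k)$ ($k = 1, \dots, \frac d2$) for the pair $(c_{2k-1}, c_{2k})$ of coordinates of a codeword $c \in C$. Finally, for $b = 0$ or $1$ and $\epsilon = (\epsilon_1, \dots, \epsilon_{d/2}) \in \{+, -\}^{d/2}$, let
\[
\Gamma_\epsilon^b(c) = \bigoplus_{k=1}^{d/2} \bigl( D_1^2 + \gamma^b_{\epsilon_k}(c(k)) \bigr) \tag{4.1}
\]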
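be a coset of $D_1^d$.

Lemma 4.2. We have the following coset decomposition of $L_C$ and $\tilde L_C$ under the above $D_1^d$ sublattice:
\[
L_C = \bigcup_{c \in C}\ \bigcup_{\epsilon \in \{+,-\}^{d/2}} \Gamma_\epsilon^0(c),
\]
\[
\tilde L_C = \bigcup_{c \in C} \Bigl(\ \bigcup_{\substack{\epsilon \in \{+,-\}^{d/2} \\ \prod \epsilon_i = +}} \Gamma_\epsilon^0(c)\ \cup \bigcup_{\substack{\epsilon \in \{+,-\}^{d/2} \\ \prod \epsilon_i = (-)^{d/8}}} \Gamma_\epsilon^1(c) \Bigr).
\]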
Proof. The result follows from the definition of these two lattices.
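As a cross-check of the table of $\gamma^b_\pm$ and of Lemma 4.2, the coset representatives can be recomputed from the definitions of $L_C$ and $\tilde L_C$. The following is only a sketch under stated assumptions: the function names are ours, a pair of coordinates $(u, v)/\sqrt2$ is rewritten in the basis $\alpha_{2k-1}/2, \alpha_{2k}/2$, and the sign $\epsilon_k = +$ (resp. $-$) is assumed to correspond to $(y_{2k-1} + y_{2k})/2$ being even (resp. odd).

```python
from fractions import Fraction
from itertools import product

def reduce_mod_2(v):
    """Representative of v modulo 2 in {0, 1/2, 1, -1/2} (our convention)."""
    v = v % 2                      # Fraction in [0, 2)
    return v - 2 if v == Fraction(3, 2) else v

def gamma(b, eps, c):
    """Hypothetical helper: coset of D_1^2 determined by the pair c = (c1, c2).

    b = 0 is the untwisted part, b = 1 the part shifted by (1/2, 1/2).
    Assumption: eps = '+' uses y = (0, 0) and eps = '-' uses y = (2, 0).
    """
    y1 = 0 if eps == '+' else 2
    u = Fraction(c[0] + y1) + Fraction(b, 2)
    v = Fraction(c[1]) + Fraction(b, 2)
    # (u, v)/sqrt(2) = s*alpha_1 + t*alpha_2 with alpha_1 = sqrt(2)(e1 + e2),
    # alpha_2 = sqrt(2)(e1 - e2); the coordinates w.r.t. alpha/2 are 2s and 2t.
    return reduce_mod_2((u + v) / 2), reduce_mod_2((u - v) / 2)

table = {(b, eps, c): gamma(b, eps, c)
         for b in (0, 1) for eps in '+-' for c in product((0, 1), repeat=2)}

for (b, eps, c), value in sorted(table.items()):
    print(f"gamma^{b}_{eps}{c} = ({value[0]}, {value[1]})")
```

With these conventions the sixteen printed values can be compared entry by entry with the table; changing the representative $y$ within its parity class does not change the coset.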
We next interpret the decompositions in terms of codes over $\mathbb{Z}_4 = \{0, 1, 2, 3\}$ associated to $L_C$ and $\tilde L_C$. See [CS2] for the relevant definitions for $\mathbb{Z}_4$-codes. Let $L$ be a positive definite even lattice of rank $d$ which contains a $D_1^d$ as a sublattice. We call such a sublattice a $D_1$-frame. Note that $(D_1^*/D_1)^d$ is isomorphic to $\mathbb{Z}_4^d$. Then $\Delta(L) := L/D_1^d \le (D_1^*/D_1)^d$ is a code over $\mathbb{Z}_4$, and $\Delta(L)$ is self-annihilating if and only if $L$ is self-dual. For the lattices $L_C$ and $\tilde L_C$ we give the following explicit description of the corresponding codes $\Delta$: Let $\hat{\ }$ be the map from $\mathbb{F}_2^d$ to $\mathbb{Z}_4^d$ induced from
\[
\hat{\ } : \mathbb{F}_2^2 \cong D_2^*/D_2 \longrightarrow (D_1^*/D_1)^2 \cong \mathbb{Z}_4^2, \qquad 00 \mapsto 00,\ 11 \mapsto 20,\ 10 \mapsto 11 \text{ and } 01 \mapsto 31.
\]
Let $(\Sigma_2^n)_0$ be the subcode of the $\mathbb{Z}_4$-code $\Sigma_2^n = \{(00), (22)\}^n$ of length $2n$ consisting of codewords of weights divisible by 4. Then we have
\[
\Gamma = \Delta(L_C) = \hat C + \Sigma_2^{d/2}, \tag{4.2}
\]
\[
\tilde\Gamma = \Delta(\tilde L_C) = \bigl(\hat C + (\Sigma_2^{d/2})_0\bigr) \cup \Bigl(\hat C + (\Sigma_2^{d/2})_0 +
\begin{cases}
(1, 0, \dots, 1, 0, 1, 0), & \text{if } d \equiv 0 \pmod{16}, \\
(1, 0, \dots, 1, 0, 3, 2), & \text{if } d \equiv 8 \pmod{16}
\end{cases}
\Bigr).
\]
An important invariant of a $\mathbb{Z}_4$-code $\Delta$ is the symmetrized weight enumerator, as defined in [CS2].

Definition 4.3. The symmetrized weight enumerator of a $\mathbb{Z}_4$-code $\Delta$ of length $d$ is given by
\[
\mathrm{swe}_\Delta(A, B, C) = \sum_{0 \le r, s \le d} U_{r,s}\, A^{d-r-s} B^r C^s,
\]
where $U_{r,s}$ is the number of codewords $\gamma \in \Delta$ having at $r$ positions the value $\pm 1$ and at $s$ positions the value 2.

To describe the symmetrized weight enumerator for our codes $\Gamma$ and $\tilde\Gamma$ in terms of the marked binary code $C$, we introduce the analogous invariant for marked binary codes.
Definition 4.4. The symmetrized marked weight enumerator of a binary code $C$ of length $d$ with a marking $M$ is the homogeneous polynomial
\[
\mathrm{smwe}_C(x, y, z) = \sum_{k=0}^{d} \sum_{l=0}^{[k/2]} W_{k,l}\, x^{d/2-k+l}\, y^{k-2l}\, z^{l},
\]
where $W_{k,l}$ is the number of codewords $c \in C$ of Hamming weight $k$ having the value $(c_{i_{2m-1}}, c_{i_{2m}}) = (1, 1)$ for exactly $l$ of the $d/2$ pairs $(i_{2m-1}, i_{2m})$ of the marking $M$.

Remark 4.5. The concept of marked binary codes can be considered as the third step in the sequence $D_8^*/D_8$-codes, Kleinian codes, marked binary codes, $\mathbb{Z}_4$-codes and VFOAs (cf. Sect. 5 and [H2], last section). It is very useful, and one easily obtains the usual code-theoretic type of results. For example, the following two hold for doubly-even self-annihilating codes of a fixed rank $d \equiv 0 \pmod 8$:

(1) Mass formula.
\[
\sum_{[(C, M)]} \frac{2^{d/2}\,(d/2)!}{|\mathrm{Aut}_M(C)|} = 2 \cdot 3 \cdot 5 \cdots (2^{(d/2)-2} + 1),
\]
where the sum runs over equivalence classes of pairs of codes $C$ with marking $M$ and $\mathrm{Aut}_M(C)$ denotes the group of automorphisms of $C$ that fix $M$. For an application, see Appendix A.

(2) Ring of invariants. The symmetrized marked weight enumerator belongs to a ring $\mathbb{C}[u_4, v_4, u_8] \oplus \mathbb{C}[u_4, v_4, u_8] \cdot u_{12}$ generated by $u_4 = x^4 + 6x^2z^2 + z^4 + 8y^4$, $v_4 = x^4 + z^4 + 12xzy^2 + 2y^4$ and two polynomials $u_8$ and $u_{12}$ of degree 8 resp. 12, subject to one relation for $u_8^2$.

From Lemma 4.2 we get:

Corollary 4.6. The symmetrized weight enumerators of the $\mathbb{Z}_4$-codes $\Gamma$ and $\tilde\Gamma$ are given by
\[
\mathrm{swe}_\Gamma(A, B, C) = \mathrm{smwe}_C(A^2 + C^2,\ 2B^2,\ 2AC), \tag{4.3}
\]
\[
\mathrm{swe}_{\tilde\Gamma}(A, B, C) = \frac12 \cdot \mathrm{smwe}_C(A^2 + C^2,\ 2B^2,\ 2AC) + \frac12 (A^2 - C^2)^{d/2} + \frac12 \cdot 2^{d/2} \bigl( (A + C)^{d/2} + (-1)^{d/8} (A - C)^{d/2} \bigr) B^{d/2}. \tag{4.4}
\]

Motivated by Lemmas 3.1 and 3.4, define for $a \in \{0, 1\}$, $\alpha \in \{+, -\}$ and $x \in \{0, \pm\frac12, 1\}$ the 16 formal linear combinations of $T_2$-modules $R_\alpha^a(x)$ by the following table:
\[
\begin{array}{c|ccc}
x & 0 & 1 & \frac12,\ -\frac12 \\ \hline
R_+^0 & M(0, 0) & M(\frac12, 0) & \frac12 M(\frac1{16}, \frac1{16}) \\
R_-^0 & M(\frac12, \frac12) & M(0, \frac12) & \frac12 M(\frac1{16}, \frac1{16}) \\
R_+^1 & \frac{1}{\sqrt2} M(\frac1{16}, \frac12) & \frac{1}{\sqrt2} M(\frac1{16}, \frac12) & \frac{1}{\sqrt2} M(\frac12, \frac1{16}) \\
R_-^1 & \frac{1}{\sqrt2} M(\frac1{16}, 0) & \frac{1}{\sqrt2} M(\frac1{16}, 0) & \frac{1}{\sqrt2} M(0, \frac1{16})
\end{array}
\]
For $\mu \in \{+, -\}^n$ and an element $\gamma = (\gamma_k) \in \mathbb{Z}_4^n$, where $\mathbb{Z}_4$ is identified with $\{0, 1, \pm\frac12\}$, we write for short
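\[
R_\mu^a(\gamma) = \bigotimes_{k=1}^{n} R_{\mu_k}^a(\gamma_k). \tag{4.5}
\]
We see from Lemmas 3.1 and 3.4 that $R_\pm^0 = V_{D_1 + x}^{\pm}$ and $R_\pm^1 = \frac{1}{\sqrt2} (V_{D_1}^{T_{\chi_{(-1)^{2x}}}})^{\pm}$. The introduction of the extra factor $\frac{1}{\sqrt2}$ in the twisted case enables us to write the Virasoro decompositions for the twisted sector $V_L^T$ in a neat way. The index 0 in $R_\pm^0$ refers to the untwisted case, while the index 1 in $R_\pm^1$ refers to the twisted case.

Let $L$ be a self-dual even lattice of rank $d$ containing a $D_1$-frame. So, $L$ is defined by the self-annihilating $\mathbb{Z}_4$-code $\Delta = L/D_1^d \le (D_1^*/D_1)^d$ of length $d$, which is now even in the sense that $\mathrm{swe}_\Delta(1, x^{1/2}, x)$ is a polynomial in $x^2$ (cf. [BS]).

Theorem 4.7. The vertex operator algebras $V_L$ and $\tilde V_L$ have the following decompositions as modules for $T_{2d}$:
\[
V_L = \bigoplus_{\gamma \in \Delta}\ \bigoplus_{\mu \in \{+,-\}^d} R_\mu^0(\gamma),
\]
\[
\tilde V_L = \bigoplus_{\gamma \in \Delta}\ \bigoplus_{\substack{\mu \in \{+,-\}^d \\ \prod \mu_k = +}} R_\mu^0(\gamma)\ \oplus\ \bigoplus_{\gamma \in \Delta}\ \bigoplus_{\substack{\mu \in \{+,-\}^d \\ \prod \mu_k = (-)^{d/8}}} R_\mu^1(\gamma).
\]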
In order to determine the decomposition for $\tilde V_L$, we first study the decomposition of $V_L^T$.
Since $L$ is self-dual, $V_L$ has a unique irreducible $\theta$-twisted module $V_L^T$ [D2]. In this case $T$ can be constructed in the following way. Let $Q$ be a subgroup of $L$ containing the $D_1^d$ which is maximal such that $\langle \alpha, \beta \rangle \in 2\mathbb{Z}$ for $\alpha, \beta \in Q$ (it exists since $L$ has ascending chain conditions on subgroups). Let $\hat Q$ be the inverse image of $Q$ in $\hat L$. Note that $|L/Q| = 2^{d/2}$. Then $\hat Q$ is a maximal abelian subgroup of $\hat L$ which contains $\hat D_1^d \cong D_1^d \times \langle \kappa \rangle$ and which contains $K$. Let $\psi : \hat Q \to \langle \pm 1 \rangle$ be a character of $\hat Q$ such that $\psi(\kappa K) = -1$. Then $T$ can be realized as the induced $\hat L$-module $T = \mathbb{C}[\hat L] \otimes_{\mathbb{C}[\hat Q]} \mathbb{C}_\psi$, where $\mathbb{C}_\psi$ is a one dimensional $\hat Q$-module defined by the character $\psi$. For $a \in \hat L$, set $t(a) = a \otimes 1 \in T$. It is easy to see that we can choose $a_i \in \hat D_1^d$ such that $\bar a_i = \alpha_i$ for all $i$ and $\psi(a_i) = 1$. Then for $a \in \hat L$,
\[
a_i\, t(a) = a_i a \otimes 1 = (-1)^{\langle \alpha_i, \bar a \rangle} a a_i \otimes 1 = (-1)^{\langle \alpha_i, \bar a \rangle} a \otimes 1. \tag{4.6}
\]
Thus, $\mathbb{C}\, t(a)$ is a one-dimensional representation for $\hat D_1^d$, with character $\chi$ given by $\chi(a_i) = (-1)^{\langle \alpha_i, \bar a \rangle}$. In fact $\mathbb{C}\, t(a)$ is isomorphic to $T_\chi$ as $\hat D_1^d$-modules. Let $\beta_l \in L$ for $l = 1, \dots, 2^{d/2}$ represent the distinct cosets of $Q$ in $L$. Choose $b_l \in \hat L$ with $\bar b_l = \beta_l$ for all $l$. Then $\{t(b_l) \mid l = 1, \dots, 2^{d/2}\}$ forms a basis of $T$ and each $t(b_l)$ spans a one-dimensional module for $\hat D_1^d$. Denote the character of $\hat D_1^d$ on $\mathbb{C}\, t(b_l)$ by $\chi_l$. Then $M(1) \otimes t(b_l)$ is isomorphic to $V_{D_1^d}^{T_{\chi_l}}$ as $\theta$-twisted $V_{D_1^d}$-modules and as $T_{2d}$-modules. Thus, $V_L^T \cong \bigoplus_{l=1}^{2^{d/2}} V_{D_1^d}^{T_{\chi_l}}$.
Proposition 4.8. Let $\Theta \subseteq \Delta$ be a complete coset system for the induced $\mathbb{Z}_4$-subcode $Q/D_1^d$ of $\Delta$. We have the Virasoro decomposition
\[
V_L^T = \bigoplus_{\gamma \in \Theta} V_{D_1^d}^{T_{\varphi_\gamma}} = \bigoplus_{\gamma \in \Delta}\ \bigoplus_{\mu \in \{+,-\}^d} R_\mu^1(\gamma),
\]
where the character $\varphi_\gamma$ is determined by $\varphi_\gamma(a_i) = (-1)^{2\gamma_i}$ if we identify $D_1^*/D_1 \cong \mathbb{Z}_4$ with $\{0, 1, \pm\frac12\}$.

Proof. The first equality has been proven in the previous discussion. In order to see the second equality, note that by Lemma 3.4 we have
\[
R_\mu^1(\gamma) = 2^{-d/2} \bigotimes_{k=1}^{d} (V_{D_1}^{T_{\chi_{x_k}}})^{\mu_k}, \tag{4.7}
\]
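where $x_k = (-1)^{2\gamma_k}$. Observe that $\langle \alpha, \beta \rangle \in 2\mathbb{Z}$ for any $\alpha, \beta \in Q$. Let $a, b \in \hat L$ such that $\bar a + Q = \bar b + Q$. Then from (4.6), $M(1) \otimes t(a)$ and $M(1) \otimes t(b)$ are isomorphic $\theta$-twisted $V_{D_1^d}$-modules and are isomorphic $T_{2d}$-modules. Thus $R_\mu^1(\gamma) = R_\mu^1(\gamma')$ if $\gamma$ and $\gamma'$ are in the same coset of $Q/D_1^d$ in $\Delta$. Since the coset of $Q/D_1^d$ in $\Delta$ has exactly $2^{d/2}$ elements, we immediately see from (4.7) that for $\gamma \in \Delta$,
\[
V_{D_1^d}^{T_{\varphi_\gamma}} = \bigoplus_{\mu \in \{+,-\}^d}\ \bigoplus_{\sigma \in Q/D_1^d} R_\mu^1(\gamma + \sigma).
\]
This proves the second equality.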
Proof of Theorem 4.7. For a subset $N$ of $L$ we denote the inverse image of $N$ in $\hat L$ by $\hat N$ and set $V[N] = M(1) \otimes \mathbb{C}\{N\}$. Then $V_L = \bigoplus_{\gamma \in \Delta} V[D_1^d + \gamma]$, and $V[D_1^d + \gamma]$ is isomorphic to $V_{D_1^d + \gamma}$ (defined in Sect. 3) as $V_{D_1^d}$-modules. The decomposition for $V_L$ follows immediately from Corollary 3.3 and Lemma 4.2.

Now we study the decomposition of $V_L^+$. If $\gamma_j = \pm\frac12$ for some $j$, then $(V[D_1^d + \gamma] \oplus V[D_1^d - \gamma])^+$ is isomorphic to $V[D_1^d + \gamma]$ as $T_{2d}$-modules and has the desired decomposition. So we can assume that all $\gamma_j = 0, 1$. In Lemma 4.9 below we will prove that the $\theta$ defined on $V_{D_1^d + \gamma}$ in Sect. 3 coincides with the $\theta$ on $V[D_1^d + \gamma]$. We again use Corollary 3.3 to see that $V[D_1^d + \gamma]^+$ has the desired decomposition.

For the twisted part, we use Proposition 4.8 and Corollary 3.5 to obtain
\[
(V_L^T)^+ = \Bigl( \bigoplus_{\gamma \in \Delta}\ \bigoplus_{\mu \in \{+,-\}^d} R_\mu^1(\gamma) \Bigr)^{+} = \bigoplus_{\gamma \in \Delta}\ \bigoplus_{\substack{\mu \in \{+,-\}^d \\ \prod \mu_k = (-)^{d/8}}} R_\mu^1(\gamma).
\]
Lemma 4.9. With the same notation as in the proof of Theorem 4.7, the $\theta$ defined on $V_{D_1^d + \gamma}$ in Sect. 3 coincides with the $\theta$ on $V[D_1^d + \gamma]$ if all $\gamma_j = 0, 1$.
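Proof. Let $X$ be a sublattice of $L$ containing $D_1^d$ such that $\langle x, y \rangle \in 2\mathbb{Z}$ for $x, y \in X$. Then the inverse image $\hat X$ of $X$ in $\hat L$ is an abelian group. We can choose a section $e : X \to \hat X$ such that $e_x e_y = \kappa^{\langle x, y \rangle/2} e_{x+y}$. Then $e_x^{-1} = \kappa^{\langle x, x \rangle/2} e_{-x}$ for $x \in X$. Thus $\theta\iota(e_x) = \iota(e_{-x})$.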
Take $X$ to be the sublattice generated by $D_1^d$ and $\gamma$. Then $V[D_1^d + \gamma]$ is generated by $\iota(e_\gamma)$ as a $V_{D_1^d}$-module, and $V[D_1^d + \gamma]^+$ is generated by $\iota(e_\gamma) + \iota(e_{-\gamma})$ as a $V_{D_1^d}^+$-module. In fact $V[D_1^d + \gamma]^+$ is an irreducible $V_{D_1^d}^+$-module. Let $\mu : V_{D_1^d + \gamma} \to V[D_1^d + \gamma]$ be a $V_{D_1^d}$-module isomorphism such that $\mu e_\gamma = \iota(e_\gamma)$. We must prove that $\mu$ maps $V_{D_1^d + \gamma}^+$ to $(V[D_1^d + \gamma])^+$. As both $V_{D_1^d + \gamma}^+$ and $(V[D_1^d + \gamma])^+$ are irreducible $V_{D_1^d}^+$-modules, it is enough to show that $\mu(e_\gamma + e_{-\gamma}) = \iota(e_\gamma) + \iota(e_{-\gamma})$, or equivalently $\mu e_{-\gamma} = \iota(e_{-\gamma})$. Let $J$ be the subset of $\{1, \dots, d\}$ consisting of $j$ such that $\gamma_j = 1$. Note that $e_{-\gamma}$ is the coefficient of $z^{-2|J|}$ in $Y(e_{-2\gamma}, z) e_\gamma$ and $\iota(e_{-\gamma})$ is the coefficient of $z^{-2|J|}$ in $Y(\iota(e_{-2\gamma}), z)\iota(e_\gamma)$, as $(-1)^{\langle 2\gamma, \gamma \rangle/2}$ is even. Also note that $2\gamma \in D_1^d$. Thus $\mu e_{-\gamma} = \iota(e_{-\gamma})$.
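Now we return to our lattices $L_C$ and $\tilde L_C$ associated to the code $C$. We assume that $C$ is a self-annihilating (i.e., $C = C^\perp$) doubly-even binary code. Then the $\mathbb{Z}_4$-codes $\Gamma$ and $\tilde\Gamma$ are self-annihilating and even, or equivalently, as mentioned above, the lattices $L_C$ and $\tilde L_C$ are self-dual and even.

Combining Lemma 4.2 and Theorem 4.7 we will see how to read off the Virasoro decomposition directly from the marked code $C$. For $a, b \in \{0, 1\}$, $\alpha, \beta \in \{+, -\}$ and $x, y \in \{0, 1\}$, define the formal linear combinations of $T_4$-modules $N_{\alpha\beta}^{ab}((x, y))$ by
\[
N_{\alpha\beta}^{ab}((x, y)) = \bigoplus_{\substack{\alpha', \alpha'' \in \{+,-\} \\ \alpha'\alpha'' = \alpha}} R_{(\alpha', \alpha'')}^{a}\bigl(\gamma_\beta^b((x, y))\bigr),
\]
where $R_{(\alpha', \alpha'')}^{a}$ was defined in (4.5). Explicitly, we get the 64 formal linear combinations as shown in the following table. An entry placed between the columns $(0,0)$ and $(1,1)$ is the common value for both arguments.
\[
\begin{array}{l|ccc}
 & (0,0) & (1,1) & (0,1),\ (1,0) \\ \hline
N_{++}^{00} & M(0,0,0,0) \oplus M(\frac12,\frac12,\frac12,\frac12) & M(\frac12,0,0,0) \oplus M(0,\frac12,\frac12,\frac12) & \frac12 M(\frac1{16},\frac1{16},\frac1{16},\frac1{16}) \\
N_{-+}^{00} & M(0,0,\frac12,\frac12) \oplus M(\frac12,\frac12,0,0) & M(\frac12,0,\frac12,\frac12) \oplus M(0,\frac12,0,0) & \frac12 M(\frac1{16},\frac1{16},\frac1{16},\frac1{16}) \\
N_{+-}^{00} & M(\frac12,0,\frac12,0) \oplus M(0,\frac12,0,\frac12) & M(0,0,\frac12,0) \oplus M(\frac12,\frac12,0,\frac12) & \frac12 M(\frac1{16},\frac1{16},\frac1{16},\frac1{16}) \\
N_{--}^{00} & M(\frac12,0,0,\frac12) \oplus M(0,\frac12,\frac12,0) & M(0,0,0,\frac12) \oplus M(\frac12,\frac12,\frac12,0) & \frac12 M(\frac1{16},\frac1{16},\frac1{16},\frac1{16}) \\
N_{++}^{01},\ N_{-+}^{01} & \multicolumn{2}{c}{\frac12 M(\frac1{16},\frac1{16},0,0) \oplus \frac12 M(\frac1{16},\frac1{16},\frac12,\frac12)} & \frac12 M(0,0,\frac1{16},\frac1{16}) \oplus \frac12 M(\frac12,\frac12,\frac1{16},\frac1{16}) \\
N_{+-}^{01},\ N_{--}^{01} & \multicolumn{2}{c}{\frac12 M(\frac1{16},\frac1{16},\frac12,0) \oplus \frac12 M(\frac1{16},\frac1{16},0,\frac12)} & \frac12 M(\frac12,0,\frac1{16},\frac1{16}) \oplus \frac12 M(0,\frac12,\frac1{16},\frac1{16}) \\
N_{++}^{10},\ N_{+-}^{10} & \multicolumn{2}{c}{\frac12 M(\frac1{16},0,\frac1{16},0) \oplus \frac12 M(\frac1{16},\frac12,\frac1{16},\frac12)} & \frac12 M(0,\frac1{16},0,\frac1{16}) \oplus \frac12 M(\frac12,\frac1{16},\frac12,\frac1{16}) \\
N_{-+}^{10},\ N_{--}^{10} & \multicolumn{2}{c}{\frac12 M(\frac1{16},\frac12,\frac1{16},0) \oplus \frac12 M(\frac1{16},0,\frac1{16},\frac12)} & \frac12 M(\frac12,\frac1{16},0,\frac1{16}) \oplus \frac12 M(0,\frac1{16},\frac12,\frac1{16}) \\
N_{++}^{11},\ N_{+-}^{11} & \multicolumn{2}{c}{\frac12 M(0,\frac1{16},\frac1{16},0) \oplus \frac12 M(\frac12,\frac1{16},\frac1{16},\frac12)} & \frac12 M(\frac1{16},0,0,\frac1{16}) \oplus \frac12 M(\frac1{16},\frac12,\frac12,\frac1{16}) \\
N_{-+}^{11},\ N_{--}^{11} & \multicolumn{2}{c}{\frac12 M(\frac12,\frac1{16},\frac1{16},0) \oplus \frac12 M(0,\frac1{16},\frac1{16},\frac12)} & \frac12 M(\frac1{16},\frac12,0,\frac1{16}) \oplus \frac12 M(\frac1{16},0,\frac12,\frac1{16})
\end{array}
\]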
For $\mu, \epsilon \in \{+, -\}^{n/2}$ and an element $c \in \mathbb{F}_2^n$ we write
\[
N_{\mu,\epsilon}^{ab}(c) = \bigotimes_{k=1}^{n/2} N_{\mu_k \epsilon_k}^{ab}(c(k)),
\]
where $c(k) = (c_{2k-1}, c_{2k})$ as before. Let $\delta(c)$ be the number of $k$ with $c(k) \in \{(0, 1), (1, 0)\}$. Recall that the lattice $D_1^d$ determines $2d$ commuting Virasoro algebras inside the four vertex operator algebras $V_{L_C}$, $V_{\tilde L_C}$, $\tilde V_{L_C}$ and $\tilde V_{\tilde L_C}$. The following is the main theorem of this paper.
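Theorem 4.10. For the Virasoro frame coming from the marked code $C$, we have the following decompositions:
\[
V_{L_C} = \bigoplus_{c \in C}\ \bigoplus_{\mu, \epsilon \in \{+,-\}^{d/2}} N_{\mu,\epsilon}^{00}(c),
\]
\[
V_{\tilde L_C} = \bigoplus_{\substack{c \in C \\ \delta(c) = 0}}\ \bigoplus_{\substack{\mu, \epsilon \in \{+,-\}^{d/2} \\ \prod \epsilon_k = +}} N_{\mu,\epsilon}^{00}(c)\ \oplus\ \frac12 \cdot \bigoplus_{\substack{c \in C \\ \delta(c) > 0}}\ \bigoplus_{\mu, \epsilon \in \{+,-\}^{d/2}} N_{\mu,\epsilon}^{00}(c)\ \oplus\ \bigoplus_{c \in C}\ \bigoplus_{\substack{\mu, \epsilon \in \{+,-\}^{d/2} \\ \prod \epsilon_k = (-)^{d/8}}} N_{\mu,\epsilon}^{01}(c),
\]
\[
\tilde V_{L_C} = \bigoplus_{\substack{c \in C \\ \delta(c) = 0}}\ \bigoplus_{\substack{\mu, \epsilon \in \{+,-\}^{d/2} \\ \prod \mu_k = +}} N_{\mu,\epsilon}^{00}(c)\ \oplus\ \frac12 \cdot \bigoplus_{\substack{c \in C \\ \delta(c) > 0}}\ \bigoplus_{\mu, \epsilon \in \{+,-\}^{d/2}} N_{\mu,\epsilon}^{00}(c)\ \oplus\ \bigoplus_{c \in C}\ \bigoplus_{\substack{\mu, \epsilon \in \{+,-\}^{d/2} \\ \prod \mu_k = (-)^{d/8}}} N_{\mu,\epsilon}^{10}(c),
\]
\[
\tilde V_{\tilde L_C} = \bigoplus_{\substack{c \in C \\ \delta(c) = 0}}\ \bigoplus_{\substack{\mu, \epsilon \in \{+,-\}^{d/2} \\ \prod \mu_k = \prod \epsilon_k = +}} N_{\mu,\epsilon}^{00}(c)\ \oplus\ \frac14 \cdot \bigoplus_{\substack{c \in C \\ \delta(c) > 0}}\ \bigoplus_{\mu, \epsilon \in \{+,-\}^{d/2}} N_{\mu,\epsilon}^{00}(c)
\]
\[
\qquad \oplus\ \frac12 \cdot \bigoplus_{c \in C} \Bigl(\ \bigoplus_{\substack{\mu, \epsilon \in \{+,-\}^{d/2} \\ \prod \epsilon_k = (-)^{d/8}}} N_{\mu,\epsilon}^{01}(c)\ \oplus \bigoplus_{\substack{\mu, \epsilon \in \{+,-\}^{d/2} \\ \prod \mu_k = (-)^{d/8}}} N_{\mu,\epsilon}^{10}(c)\ \oplus \bigoplus_{\substack{\mu, \epsilon \in \{+,-\}^{d/2} \\ \prod \mu_k = (-)^{d/8}}} N_{\mu,\epsilon}^{11}(c) \Bigr).
\]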
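Proof. Recall (4.1). For a codeword $c \in C$ and $\epsilon \in \{+, -\}^{d/2}$, let $\gamma = \Gamma_\epsilon^b(c)$ and fix $s \in \{+, -\}$. Using the definition of $N_{\mu,\epsilon}^{ab}(c)$ we get
\[
\bigoplus_{\substack{\mu \in \{+,-\}^{d} \\ \prod \mu_k = s}} R_\mu^a(\gamma)
= \bigoplus_{\substack{\mu \in \{+,-\}^{d/2} \\ \prod \mu_k = s}}\ \bigoplus_{\substack{\mu', \mu'' \in \{+,-\}^{d/2} \\ \mu'_k \mu''_k = \mu_k}}\ \bigotimes_{i=1}^{d/2} R_{(\mu'_i, \mu''_i)}^{a}((\gamma_{2i-1}, \gamma_{2i}))
\]
\[
= \bigoplus_{\substack{\mu \in \{+,-\}^{d/2} \\ \prod \mu_k = s}}\ \bigotimes_{i=1}^{d/2}\ \bigoplus_{\substack{\mu'_i, \mu''_i \in \{+,-\} \\ \mu'_i \mu''_i = \mu_i}} R_{(\mu'_i, \mu''_i)}^{a}((\gamma_{2i-1}, \gamma_{2i}))
= \bigoplus_{\substack{\mu \in \{+,-\}^{d/2} \\ \prod \mu_k = s}}\ \bigotimes_{i=1}^{d/2} N_{\mu_i \epsilon_i}^{ab}(c(i))
= \bigoplus_{\substack{\mu \in \{+,-\}^{d/2} \\ \prod \mu_k = s}} N_{\mu,\epsilon}^{ab}(c).
\]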
By using the identification of $\Gamma_\epsilon^a(c)$ with codewords in $\Gamma$ and $\tilde\Gamma$, the decomposition follows from Lemma 4.2 and Theorem 4.7 if we use the following two observations:
First note that $N_{\mu_k \epsilon_k}^{00}(c(k)) = N_{\pm\mu_k\, \pm\epsilon_k}^{00}(c(k))$ for $c(k) \in \{(0, 1), (1, 0)\}$. So for $\delta(c) > 0$ we can suppress the distinction between $\pm\epsilon_k$ (resp. $\pm\mu_k$) in the decomposition and compensate it with one factor $\frac12$.

Second, the value of $N_{\mu,\epsilon}^{01}(c)$ (resp. $N_{\mu,\epsilon}^{10}(c)$ and $N_{\mu,\epsilon}^{11}(c)$) depends for fixed $c$ only on $\epsilon$ (resp. $\mu$).
Now we discuss an action of $\mathrm{Sym}_3$ (the permutation group on three letters) defined in [FLM] and [DGM2] on $V_{L_C}$ and $\tilde V_{\tilde L_C}$ in terms of our decompositions. The resulting group of automorphisms is sometimes called the triality group. Recall from Chapters 11 and 12 of [FLM] and Sects. 7 and 8 of [DGM2] that the triality group is generated by distinct involutions $\sigma$ and $\tau$. Also recall from Sect. 4 that $\alpha_1, \dots, \alpha_d$ form a $D_1$-frame in $L$. A straightforward computation shows that $\sigma\omega_{4i-3} = \omega_{4i-3}$, $\sigma\omega_{4i} = \omega_{4i}$ and that $\sigma$ interchanges $\omega_{4i-2}$ and $\omega_{4i-1}$ for all $i = 1, \dots, \frac d2$. Similarly, $\tau$ interchanges $\omega_{4i-3}$ and $\omega_{4i-2}$ and fixes $\omega_{4i-1}$ and $\omega_{4i}$. Thus the triality group is a subgroup of $G$ defined in (2.2) for both $V_{L_C}$ and $\tilde V_{\tilde L_C}$. Its image in $G/G_C \le \mathrm{Sym}_{2d}$ is the above described permutation of the elements of the VF $\{\omega_\nu\}$.

Additionally, the involution $\sigma$ defines an isomorphism between $V_{\tilde L_C}$ and $\tilde V_{L_C}$.
Definition 4.11. Following [DMZ], the decomposition polynomial of a FVOA $V = \bigoplus m_{h_1, \dots, h_r} M(h_1, \dots, h_r)$ is defined as
\[
P_V(a, b, c) = \sum_{i,j,k} A_{i,j,k}\, a^i b^j c^k,
\]
where $A_{i,j,k}$ is the number of $T_r$-modules $M(h_1, \dots, h_r)$ in a $T_r$ composition series of $V$ for which the number of coordinates in $(h_1, \dots, h_r)$ equal to $0$, $\frac12$, $\frac1{16}$ is $i$, $j$, $k$, respectively. The polynomial is homogeneous of degree $r$ and, in general, depends on the chosen Virasoro frame $\{\omega_1, \dots, \omega_r\}$ inside of $V$.

The following corollary is an immediate consequence of Theorem 4.10.

Corollary 4.12. Using the symmetrized marked weight enumerator $\mathrm{smwe}_C(x, y, z)$ one has
\[
P_{V_{L_C}}(a, b, c) = \mathrm{smwe}_C(a^4 + 6a^2b^2 + b^4,\ 2c^4,\ 4a^3b + 4ab^3),
\]
\[
P_{V_{\tilde L_C}}(a, b, c) = \frac12 \bigl(a^4 - 2a^2b^2 + b^4\bigr)^{d/2} + \frac12\, \mathrm{smwe}_C(a^4 + 6a^2b^2 + b^4,\ 2c^4,\ 4a^3b + 4ab^3) + \frac12 \cdot 2^{d/2} \bigl( (a + b)^d + (-1)^{d/8} (a - b)^d \bigr) c^d,
\]
\[
P_{\tilde V_{L_C}}(a, b, c) = P_{V_{\tilde L_C}}(a, b, c),
\]
\[
P_{\tilde V_{\tilde L_C}}(a, b, c) = \frac14 \cdot 3 \bigl(a^4 - 2a^2b^2 + b^4\bigr)^{d/2} + \frac14\, \mathrm{smwe}_C(a^4 + 6a^2b^2 + b^4,\ 2c^4,\ 4a^3b + 4ab^3) + 3 \cdot \frac14 \cdot 2^{d/2} \bigl( (a + b)^d + (-1)^{d/8} (a - b)^d \bigr) c^d.
\]
Remark 4.13. From Theorem 4.10 we can deduce that $\tilde V_{\tilde L_C}$ is a self-dual rational vertex operator algebra. The proof for the special case of $V^\natural$ given in [D3] works in general, since the Virasoro decompositions were the only information needed.
5. Applications

In this section we discuss some important applications of Theorem 4.10. The simplest example is for the Hamming code $H_8$ of length 8. When $C$ is the Golay code $G_{24}$ of length 24 there is a special marking, and we obtain a particularly interesting decomposition of the moonshine module $V^\natural = \tilde V_{\tilde L_{G_{24}}}$ under 48 Virasoro algebras.
Example I. The Hamming code $H_8$, the root lattice $E_8$, and the lattice vertex operator algebra $V_{E_8}$.

The Hamming code $H_8$ is the unique self-annihilating doubly-even binary code of length 8. Its automorphism group is isomorphic to $\mathrm{AGL}(\mathbb{F}_2^3)$. The root lattice $E_8$ of the exceptional Lie group $E_8(\mathbb{C})$ is the unique even unimodular lattice of rank 8. It has the Weyl group $W(E_8)$ as its automorphism group. The lattice vertex operator algebra $V_{E_8}$, whose underlying vector space is the irreducible level 1 highest weight representation of the affine Kac-Moody algebra $E_8^{(1)}$, is a self-dual vertex operator algebra of rank 8 whose automorphism group is the Lie group $E_8(\mathbb{C})$. One can show, under some additional conditions on the vertex operator algebra, that $V_{E_8}$ is the unique self-dual VOA of rank 8 (cf. [H1], Ch. 2).

The uniqueness of the lattice $E_8$ implies $E_8 \cong L_{H_8} \cong \tilde L_{H_8}$ and $V_{E_8} \cong V_{L_{H_8}} \cong V_{\tilde L_{H_8}} \cong \tilde V_{L_{H_8}} \cong \tilde V_{\tilde L_{H_8}}$ for the vertex operator algebras, since one has $V_{\tilde L_C} \cong \tilde V_{L_C}$ in general (see [DGM1, DGM2] and the remark after Theorem 4.10).

We will determine up to automorphism all markings for the Hamming code, all $D_1$-frames of the $E_8$ lattice, and five Virasoro frames inside $V_{E_8}$, and describe the corresponding decompositions. They all come from markings of the Hamming code. To fix notation we choose $\{(00001111), (00110011), (11000011), (01010101)\}$ as a set of base vectors for the Hamming code.

Theorem 5.1. There are 3 orbits of markings for the Hamming code $H_8$ under $\mathrm{Aut}(H_8)$. Their main properties can be found in the next table. The last column shows the symmetrized marked weight enumerator.
\[
\begin{array}{c|c|c|c|l}
\text{orbit} & \text{orbit representative} & \text{stabilizer} & \text{orbit size} & \mathrm{smwe}_{H_8}(x, y, z) \\ \hline
\alpha & \{(1,2), (3,4), (5,6), (7,8)\} & 2^3{:}\mathrm{Sym}_4 & 7 & x^4 + 6x^2z^2 + z^4 + 8y^4 \\
\beta & \{(1,2), (3,4), (5,7), (6,8)\} & 2^2.\mathrm{Dih}_8 & 42 & x^4 + 2x^2z^2 + z^4 + 8xzy^2 + 4y^4 \\
\gamma & \{(1,2), (3,5), (4,7), (6,8)\} & \mathrm{Sym}_4 & 56 & x^4 + z^4 + 12xzy^2 + 2y^4
\end{array}
\]
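The last two columns of the table can be re-derived by brute force from Definition 4.4. The sketch below is illustrative only: it uses the basis of $H_8$ fixed above and the three orbit representatives, the helper names are ours, and the stabilizer orders are taken as $|2^3{:}\mathrm{Sym}_4| = 192$, $|2^2.\mathrm{Dih}_8| = 32$ (assuming $\mathrm{Dih}_8$ denotes the dihedral group of order 8) and $|\mathrm{Sym}_4| = 24$, with $|\mathrm{Aut}(H_8)| = 1344$. It also evaluates the left-hand side of the mass formula of Remark 4.5 for $d = 8$, whose right-hand side is $2 \cdot 3 \cdot 5 = 30$.

```python
from collections import Counter
from itertools import product
from math import comb, factorial

# Basis of H_8 as fixed in the text (positions 1..8, read left to right).
BASIS = ["00001111", "00110011", "11000011", "01010101"]

MARKINGS = {
    "alpha": [(1, 2), (3, 4), (5, 6), (7, 8)],
    "beta":  [(1, 2), (3, 4), (5, 7), (6, 8)],
    "gamma": [(1, 2), (3, 5), (4, 7), (6, 8)],
}

def codewords(basis):
    """All F_2-linear combinations of the basis vectors, as bit tuples."""
    vectors = [tuple(int(ch) for ch in row) for row in basis]
    words = set()
    for coeffs in product((0, 1), repeat=len(vectors)):
        word = tuple(sum(a * v[i] for a, v in zip(coeffs, vectors)) % 2
                     for i in range(len(vectors[0])))
        words.add(word)
    return words

def smwe(code, marking):
    """Dictionary {(deg_x, deg_y, deg_z): W_{k,l}} of Definition 4.4."""
    poly = Counter()
    for word in code:
        pairs = [(word[i - 1], word[j - 1]) for i, j in marking]
        l = sum(1 for p in pairs if p == (1, 1))   # pairs equal to (1, 1)
        k = sum(word)                               # Hamming weight
        poly[(len(marking) - k + l, k - 2 * l, l)] += 1
    return dict(poly)

H8 = codewords(BASIS)
for name, marking in MARKINGS.items():
    print(name, smwe(H8, marking))

# Orbit sizes |Aut(H_8)| / |stabilizer| and the mass formula for d = 8.
stabilizer_orders = {"alpha": 8 * 24, "beta": 4 * 8, "gamma": 24}
orbit_sizes = {name: 1344 // order for name, order in stabilizer_orders.items()}
number_of_markings = factorial(8) // (2**4 * factorial(4))
mass = sum(2**4 * factorial(4) // order for order in stabilizer_orders.values())
print(orbit_sizes, number_of_markings, mass, comb(8, 2))
```

A monomial $x^i y^j z^l$ is stored under the key `(i, j, l)`, so for example the entry `(1, 2, 1): 8` for the marking of type $\beta$ stands for the term $8xzy^2$.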
The proof is an easy counting exercise (see Appendix A). We remark that every pair $(i, j)$ of the eight positions is the component of exactly one of the seven markings of type $\alpha$: every marking contains 4 pairs, so we cover $7 \cdot 4 = 28$ pairs. There are $\binom{8}{2} = 28$ different such pairs, on which $\mathrm{Aut}(H_8)$ acts transitively.

As explained in general in the last section before Lemma 4.2, every marking of $H_8$ determines a $D_1^8$ sublattice inside $L_{H_8} \cong E_8$ and $\tilde L_{H_8} \cong E_8$. The following theorem shows that all possible $D_1$-frames in $E_8$ are obtained in this way.
Theorem 5.2 (Conway-Sloane [CS2]). There are 4 orbits of $D_1^8$ sublattices inside $E_8$ under the action of $W(E_8)$. Their main properties are shown in the next table. The column "origin" lists the corresponding (untwisted, resp. twisted lattice) Hamming code marking, and $\mathrm{swe}_\Delta(A, B, C)$ is the symmetrized weight enumerator of the decomposition code $\Delta = E_8/D_1^8 \le (D_1^*/D_1)^8 \cong \mathbb{Z}_4^8$.
\[
\begin{array}{c|c|c|c|l}
\text{orbit} & \text{origin} & \text{stabilizer} & \text{orbit size} & \mathrm{swe}_\Delta(A, B, C) \\ \hline
K_8 & \alpha & \mathbb{Z}_2^7 \cdot \mathrm{Sym}_8 & 135 & A^8 + 28A^2C^6 + 70A^4C^4 + 28A^6C^2 + C^8 + 128B^8 \\
K_8' & \beta,\ \tilde\alpha & 2^7 (4!)^2 & 9450 & A^8 + C^8 + 12A^2C^2(A^4 + C^4) + 38A^4C^4 + 64AC(A^2 + C^2)B^4 + 64B^8 \\
L_8 & \gamma,\ \tilde\beta & 2^8 \cdot 4! & 113400 & A^8 + C^8 + 4A^2C^2(A^4 + C^4) + 22A^4C^4 + 96AC(A^2 + C^2)B^4 + 32B^8 \\
O_8 & \tilde\gamma & 2.\mathrm{AGL}(3, 2) & 259200 & A^8 + C^8 + 14A^4C^4 + 112AC(A^2 + C^2)B^4 + 16B^8
\end{array}
\]
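The "origin" column can be checked against Corollary 4.6 for $d = 8$. The following sketch, with names of our own choosing, compares formulas (4.3) and (4.4) applied to the enumerators of Theorem 5.1 with the polynomials of the table on the integer grid $\{0, \dots, 8\}^3$; since every polynomial involved has degree at most 8 in each variable, agreement on this grid means equality as polynomials.

```python
from fractions import Fraction
from itertools import product

D = 8  # length of the Hamming code

# Symmetrized marked weight enumerators of H_8 (Theorem 5.1).
SMWE = {
    "alpha": lambda x, y, z: x**4 + 6*x**2*z**2 + z**4 + 8*y**4,
    "beta":  lambda x, y, z: x**4 + 2*x**2*z**2 + z**4 + 8*x*z*y**2 + 4*y**4,
    "gamma": lambda x, y, z: x**4 + z**4 + 12*x*z*y**2 + 2*y**4,
}

# Symmetrized weight enumerators of the Z_4-codes (Theorem 5.2).
SWE = {
    "K8":  lambda A, B, C: (A**8 + 28*A**2*C**6 + 70*A**4*C**4 + 28*A**6*C**2
                            + C**8 + 128*B**8),
    "K8'": lambda A, B, C: (A**8 + C**8 + 12*A**2*C**2*(A**4 + C**4) + 38*A**4*C**4
                            + 64*A*C*(A**2 + C**2)*B**4 + 64*B**8),
    "L8":  lambda A, B, C: (A**8 + C**8 + 4*A**2*C**2*(A**4 + C**4) + 22*A**4*C**4
                            + 96*A*C*(A**2 + C**2)*B**4 + 32*B**8),
    "O8":  lambda A, B, C: (A**8 + C**8 + 14*A**4*C**4
                            + 112*A*C*(A**2 + C**2)*B**4 + 16*B**8),
}

def swe_untwisted(smwe, A, B, C):
    """Formula (4.3)."""
    return Fraction(smwe(A*A + C*C, 2*B*B, 2*A*C))

def swe_twisted(smwe, A, B, C, d=D):
    """Formula (4.4)."""
    half = Fraction(1, 2)
    h = d // 2
    return (half * smwe(A*A + C*C, 2*B*B, 2*A*C)
            + half * (A*A - C*C)**h
            + half * 2**h * ((A + C)**h + (-1)**(d // 8) * (A - C)**h) * B**h)

# (marking, construction, expected code) as listed in the "origin" column.
ORIGIN = [
    ("alpha", swe_untwisted, "K8"),
    ("beta",  swe_untwisted, "K8'"), ("alpha", swe_twisted, "K8'"),
    ("gamma", swe_untwisted, "L8"),  ("beta",  swe_twisted, "L8"),
    ("gamma", swe_twisted,   "O8"),
]

GRID = list(product(range(9), repeat=3))

def agrees(marking, formula, code):
    return all(formula(SMWE[marking], A, B, C) == SWE[code](A, B, C)
               for A, B, C in GRID)

results = {(m, f.__name__, c): agrees(m, f, c) for m, f, c in ORIGIN}
for key, ok in results.items():
    print(key, ok)
```

Exact rational arithmetic is used so that the factors $\frac12$ in (4.4) introduce no rounding.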
Proof. It was also explained in the last section that every $D_1^8$ sublattice inside $E_8$ defines a $\mathbb{Z}_4$-code $\Delta \le (D_1^*/D_1)^8 \cong \mathbb{Z}_4^8$. Since $E_8$ is self-dual and even, $\Delta$ is self-annihilating and even as a code over $\mathbb{Z}_4$. All self-annihilating $\mathbb{Z}_4$-codes of length 8 are classified in [CS2], Theorem 2. Only $K_8$, $K_8'$, $L_8$ and $O_8$ are even (see also [BS]). The order of $\mathrm{Aut}(\Delta)$ and $\mathrm{swe}_\Delta(A, B, C)$ are also described in [CS2]. To show that these codes arise from the markings of the Hamming code as in the table, we apply Corollary 4.6.

The remark after Theorem 5.1 about the Hamming code has an analogue here: every vector of squared length 4 inside $E_8$ is contained in exactly one $D_1$-frame belonging to the orbit of type $K_8$, since $135 \cdot 16 = 2160$ is the number of vectors of squared length 4 and $W(E_8)$ acts transitively on such vectors. These 135 $D_1^8$-sublattices are in bijection with cosets of $2E_8$ in $E_8$ which have coset representatives of norm 4.

Every $D_1$-frame inside $E_8$ determines 16 commuting Virasoro vertex operator algebras of rank $\frac12$ inside $V_{E_8}$ and $\tilde V_{E_8} \cong V_{E_8}$. Altogether, one gets at least five different systems of commuting Virasoro subVOAs:

Theorem 5.3. Let $\{\omega_1, \dots, \omega_{16}\}$ be a Virasoro frame inside $V_{E_8}$. The possible decomposition polynomials are displayed in the next table. They correspond by the untwisted or twisted lattice construction to the $D_1$-frames inside $E_8$ as indicated in the column origin. Furthermore, the first four cases belong to four distinct orbits of Virasoro frames under the action of the Lie group $E_8(\mathbb{C})$. In the fifth case, $\Omega$, at least the $T_{16}$-module structure is unique.
\[
\begin{array}{c|c|l}
\text{case} & \text{origin} & P_{V_{E_8}}(a, b, c) \\ \hline
\Gamma & K_8 & \frac12 \bigl( (a + b)^{16} + (a - b)^{16} \bigr) + 128\, c^{16} \\[4pt]
\Sigma & K_8',\ \tilde K_8 & a^{16} + b^{16} + 56\,(a^{14}b^2 + a^2b^{14}) + 924\,(a^{12}b^4 + a^4b^{12}) + 3976\,(a^{10}b^6 + a^6b^{10}) + 6470\, a^8b^8 \\
 & & {} + \bigl( 128\,(a^7b + ab^7) + 896\,(a^5b^3 + a^3b^5) \bigr) c^8 + 64\, c^{16} \\[4pt]
\Psi & L_8,\ \tilde K_8' & a^{16} + b^{16} + 24\,(a^{14}b^2 + a^2b^{14}) + 476\,(a^{12}b^4 + a^4b^{12}) + 1960\,(a^{10}b^6 + a^6b^{10}) + 3270\, a^8b^8 \\
 & & {} + \bigl( 192\,(a^7b + ab^7) + 1344\,(a^5b^3 + a^3b^5) \bigr) c^8 + 32\, c^{16} \\[4pt]
\Theta & O_8,\ \tilde L_8 & a^{16} + b^{16} + 8\,(a^{14}b^2 + a^2b^{14}) + 252\,(a^{12}b^4 + a^4b^{12}) + 952\,(a^{10}b^6 + a^6b^{10}) + 1670\, a^8b^8 \\
 & & {} + \bigl( 224\,(a^7b + ab^7) + 1568\,(a^5b^3 + a^3b^5) \bigr) c^8 + 16\, c^{16} \\[4pt]
\Omega & \tilde O_8 & a^{16} + b^{16} + 140\,(a^{12}b^4 + a^4b^{12}) + 448\,(a^{10}b^6 + a^6b^{10}) + 870\, a^8b^8 \\
 & & {} + \bigl( 240\,(a^7b + ab^7) + 1680\,(a^5b^3 + a^3b^5) \bigr) c^8 + 8\, c^{16}
\end{array}
\]
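The five polynomials of the table can be cross-checked against Corollary 4.12 with $d = 8$. The sketch below is only an illustration: the labels `"untwisted"`, `"twisted"` and `"both"` are our own names for the formulas for $V_{L_C}$, for $V_{\tilde L_C} \cong \tilde V_{L_C}$, and for $\tilde V_{\tilde L_C}$, and the list `ORIGIN` encodes our reading of the origin columns of Theorems 5.2 and 5.3. All polynomials have degree at most 16 in each variable, so agreement on the grid $\{0, \dots, 16\}^3$ amounts to equality.

```python
from fractions import Fraction
from itertools import product

D = 8

SMWE = {  # Theorem 5.1
    "alpha": lambda x, y, z: x**4 + 6*x**2*z**2 + z**4 + 8*y**4,
    "beta":  lambda x, y, z: x**4 + 2*x**2*z**2 + z**4 + 8*x*z*y**2 + 4*y**4,
    "gamma": lambda x, y, z: x**4 + z**4 + 12*x*z*y**2 + 2*y**4,
}

def sym_row(c2, c4, c6, c8, m1, m3, c16):
    """Polynomial of the shape used in rows Sigma, Psi, Theta, Omega."""
    def poly(a, b, c):
        return (a**16 + b**16
                + c2 * (a**14 * b**2 + a**2 * b**14)
                + c4 * (a**12 * b**4 + a**4 * b**12)
                + c6 * (a**10 * b**6 + a**6 * b**10)
                + c8 * a**8 * b**8
                + (m1 * (a**7 * b + a * b**7) + m3 * (a**5 * b**3 + a**3 * b**5)) * c**8
                + c16 * c**16)
    return poly

TABLE = {
    "Gamma": lambda a, b, c: Fraction((a + b)**16 + (a - b)**16, 2) + 128 * c**16,
    "Sigma": sym_row(56, 924, 3976, 6470, 128, 896, 64),
    "Psi":   sym_row(24, 476, 1960, 3270, 192, 1344, 32),
    "Theta": sym_row(8, 252, 952, 1670, 224, 1568, 16),
    "Omega": sym_row(0, 140, 448, 870, 240, 1680, 8),
}

# Coefficients of the three summands in Corollary 4.12.
WEIGHTS = {
    "untwisted": (Fraction(0), Fraction(1), Fraction(0)),
    "twisted":   (Fraction(1, 2), Fraction(1, 2), Fraction(1, 2)),
    "both":      (Fraction(3, 4), Fraction(1, 4), Fraction(3, 4)),
}

def decomposition_polynomial(kind, smwe, a, b, c, d=D):
    u, v, t = WEIGHTS[kind]
    first = (a**4 - 2*a**2*b**2 + b**4)**(d // 2)
    second = smwe(a**4 + 6*a**2*b**2 + b**4, 2*c**4, 4*a**3*b + 4*a*b**3)
    third = 2**(d // 2) * ((a + b)**d + (-1)**(d // 8) * (a - b)**d) * c**d
    return u * first + v * second + t * third

ORIGIN = [
    ("alpha", "untwisted", "Gamma"),
    ("beta", "untwisted", "Sigma"), ("alpha", "twisted", "Sigma"),
    ("gamma", "untwisted", "Psi"), ("beta", "twisted", "Psi"),
    ("gamma", "twisted", "Theta"),
    ("gamma", "both", "Omega"),
]

GRID = list(product(range(17), repeat=3))
checks = {(m, k, case): all(decomposition_polynomial(k, SMWE[m], a, b, c)
                            == TABLE[case](a, b, c) for a, b, c in GRID)
          for m, k, case in ORIGIN}
print(checks)

# Normalisations P(1,0,0) = 1 and P(1,1,0) = 2^l for the five cases.
for case, poly in TABLE.items():
    print(case, poly(1, 0, 0), poly(1, 1, 0))
```

The printed values of $P(1, 1, 0)$ are powers of two, and in the case $\Omega$ the coefficient of $c^8$ in $P(1, 1, c)$ can be read off as $2 \cdot 240 + 2 \cdot 1680 = 30 \cdot 2^7$.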
Proof. Using Corollary 4.12 to compute $P_{V_{E_8}}(a, b, c)$ for the different Virasoro subVOAs $T_{16}$ coming from $V_{E_8}$ and $\tilde V_{E_8}$ and a given $D_1^8$ sublattice in $E_8$, one checks that the polynomials for $\Gamma$, $\Sigma$, $\Psi$, $\Theta$ and $\Omega$ correspond to the $D_1$-frames of $E_8$ as indicated.

We show that there are no other possibilities for the decomposition polynomial $P_{V_{E_8}}(a, b, c)$, and we will see directly that there is a unique $\mathrm{Aut}(V_{E_8})$-orbit of $T_{16}$ subVOAs corresponding to each of the cases $\Gamma$, $\Sigma$, $\Psi$ and $\Theta$.

Assume a vertex operator subalgebra $T_{16}$ in $V_{E_8}$ is given. First we determine the possible decomposition polynomials. As described in the proof of Theorem 4.1.5 in [H1], $\mathrm{SL}_2(\mathbb{Z}) = \langle S, T \rangle$ with $S = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ and $T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ acts on $\mathbb{C}[a, b, c]$ by
\[
\rho(S) = \begin{pmatrix} 1/2 & 1/2 & 1/\sqrt2 \\ 1/2 & 1/2 & -1/\sqrt2 \\ 1/\sqrt2 & -1/\sqrt2 & 0 \end{pmatrix},
\qquad
\rho(T) = e^{-2\pi i/48} \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & e^{2\pi i/16} \end{pmatrix}
\]
via substitution. Since $V_{E_8}$ is a self-dual VOA of rank 8, the decomposition polynomial must be invariant under the action of $\rho(S)$ and $\rho(T^3)$ (cf. proof of Theorem 2.19 or Th. 2.1.2 and Th. 4.1.5 in [H1]). They generate a matrix group $G = \langle \rho(S), \rho(T)^3 \rangle$ of order 384, as can easily be seen with the help of the program Gap [Sgap]. The dimension of the space of invariant polynomials of degree $n$ is the multiplicity of the trivial representation in the $n$th symmetric power of $\rho$. This multiplicity is given by the coefficient of $t^n$ in the expression
\[
\rho_G(t) = \frac{1}{|G|} \sum_{g \in G} \frac{1}{\det(\mathrm{id} - g t)}.
\]
For degree 16 we obtain that the space of invariant polynomials is two dimensional; a possible base is given by $P_{V_{E_8}}^{\Gamma}(a, b, c)$ and $P_{V_{E_8}}^{\Omega}(a, b, c)$. The only polynomials $P(a, b, c)$ inside this space having positive coefficients and satisfying the necessary conditions $P(1, 0, 0) = 1$ and $P(1, 1, 0) = |C| = 2^l$, $0 \le l \le 15$ with integral $l$, are the five polynomials given in the theorem.

Next we claim that the code $C$ is uniquely determined from its weight enumerator $P(a, b, 0)$: The weight enumerator of its annihilator code $C^\perp$ is $a^{16} + (2^k - 2) a^8 b^8 + b^{16}$, with $k = 16 - l = 1, \dots, 5$. For $k = 5$ the uniqueness of $C^\perp$, and so of $C$, is the uniqueness of the simplex code (see Theorem C.3). For smaller $k$, it can also be seen from a proof of Theorem C.3.

The code $C$ contains for $k = 1, 2, 3$ and $4$ the subcode $C_0 = \{(00), (11)\}^8$. By Corollary 3.3 and the uniqueness statement of Proposition 2.16 the corresponding subVOA must be $V_{D_1^8}$. Recall that $V_{E_8} = M(1) \otimes \mathbb{C}\{E_8\}$. The weight one subspace $(V_{E_8})_1$ is a Lie algebra under $[u, v] = u_0 v$, which is isomorphic to the Lie algebra of type $E_8$, and $(V_{D_1^8})_1$ is a Cartan subalgebra of $(V_{E_8})_1$. From the construction of $V_{E_8} = M(1) \otimes \mathbb{C}\{E_8\}$ we have a canonical Cartan subalgebra $M(1)_1$ of $(V_{E_8})_1$, which is identified with $\mathfrak{h} = \mathbb{C} \otimes_{\mathbb{Z}} E_8$. Since all Cartan subalgebras are conjugate under the adjoint action of the Lie group $E_8(\mathbb{C})$, we can assume that $(V_{D_1^8})_1 = M(1)_1 \le (V_{E_8})_1$.

It is well-known that $\mathbb{C}\{E_8\} = 1 \otimes \mathbb{C}\{E_8\} = \{u \in V_{E_8} \mid h_n u = 0,\ h \in \mathfrak{h},\ n > 0\}$, which is the vacuum space for the Heisenberg algebra $\hat{\mathfrak{h}}_{\mathbb{Z}}$ (see Sect. 3). Similarly, $\mathbb{C}\{D_1^8\} = \{u \in V_{D_1^8} \mid h_n u = 0,\ h \in \mathfrak{h},\ n > 0\}$. Thus, $\mathbb{C}\{D_1^8\}$ is a subspace of $\mathbb{C}\{E_8\}$. Note that $\mathbb{C}\{E_8\}$ and $\mathbb{C}\{D_1^8\}$ are direct sums of weight spaces for the Cartan algebra $\mathfrak{h}$, and the corresponding weight lattices are exactly $E_8$ and $D_1^8$. This determines a $D_1^8$
sublattice of $E_8$, unique up to the action of $W(E_8)$. That is, for $k = 1, 2, 3, 4$ the Virasoro frames $\{\omega_1, \dots, \omega_{16}\}$ come from one of the four $D_1$-frames inside $E_8$ by the untwisted lattice-VOA construction.

It remains to show that for $k = 5$, i.e. in the case $\Omega$, the obtained Virasoro decomposition is unique. As stated before, the code $C^\perp$, which is equal to $D$ by Theorem 2.19, is the simplex code, and so $C$ is the extended Hamming code of length 16. The uniqueness of the code $C$ implies by Theorem 2.3 (4) that $V_{E_8}^0$ is unique as a $T_{16}$-module. Let $I \in D$ such that $|I| = 8$. Take an irreducible $T_{16}$-module $W$ in $V^I$. Using the action of $V^0$ on $W$ we see that all such $M(h_1, \dots, h_{16})$ occur in $V^I$, where $h_i = \frac1{16}$ if $i \in I$ and $h_i \in \{0, \frac12\}$ if $i \notin I$ and the number of $i$ with $h_i = \frac12$ is odd. So, there are $2^7$ nonisomorphic $T_{16}$-modules inside $V^I$. Since $D$ has 30 codewords of weight 8, we get at least $30 \cdot 2^7$ such nonisomorphic $T_{16}$-modules. But $30 \cdot 2^7$ is exactly the coefficient of $c^8$ in $P(1, 1, c)$. This shows that all these modules have multiplicity one. Finally the multiplicity of $M(\frac1{16}, \dots, \frac1{16})$ is 8. Therefore the decomposition in the last case is unique.

Remark 5.4. (1) We expect that also in the fifth case the vertex operator algebra structure is unique, i.e. corresponds to a unique $E_8(\mathbb{C})$-orbit of Virasoro frames.

(2) A different proof would follow if the list of the 71 unitary self-dual VOA candidates of rank 24 given by Schellekens [Sch] is complete: The fusion algebras for $M(0)$ and the Kac-Moody VOA $V_{B_{1,1}}$ are isomorphic and one can identify the corresponding intertwiner spaces (cf. [MoS], Appendix D). From this, one can define for every VOA $V$ of rank $c$ containing $M(0)^{\otimes 2c}$ a VOA $W$ of rank $3c$ containing $V_{B_{1,1}}^{\otimes 2c}$. There are five candidates $W$ on Schellekens' list containing $V_{B_{1,1}}^{\otimes 16}$, namely $W_{D_{24,1}}$, $W_{D_{12,1}^2}$, $W_{D_{6,1}^4}$, $W_{A_{3,1}^8}$ and $W_{A_{1,2}^{16}}$. They correspond to $\Gamma$, $\Sigma$, $\Psi$, $\Theta$ and $\Omega$ in this order. Again the uniqueness of $W_{A_{1,2}^{16}}$ is unknown. The decomposition of $W_{A_{1,2}^{16}}$ as a $V_{B_{1,1}}^{\otimes 16}$-module obtained in [Sch] by a computer calculation follows from our analysis of the case $\Omega$.

The next table summarizes the relation between the markings for $H_8$, the $D_1$-frames inside $E_8$ and the Virasoro frames $\{\omega_1, \dots, \omega_{16}\}$ inside $V_{E_8}$ as obtained in the last three theorems. The arrow $\swarrow$ (resp. $\searrow$) denotes the untwisted (resp. twisted) construction. For a detailed explanation of the second row of the table see [H2]. Self-dual Kleinian codes are a generalization of the so-called type IV codes over $\mathbb{F}_4$. In particular, the notion of a marking of a Kleinian code is defined in [H2]. Finally, $\Xi_1$ is the $D_8^*/D_8$-code $\{0, s\}$ of length 1, where the $D_8$-coset $s \in D_8^*/D_8$ has minimal squared length 2.
\[
\begin{array}{l|c|l}
\text{type} & \text{object} & \text{marking/frame} \\ \hline
D_8^*/D_8\text{-code:} & \Xi_1 & A \\
\text{Kleinian codes:} & \epsilon_2 & a,\ b \\
\text{binary codes:} & H_8 & \alpha,\ \beta,\ \gamma \\
\text{lattices:} & E_8 & K_8,\ K_8',\ L_8,\ O_8 \\
\text{VOAs:} & V_{E_8} & \Gamma,\ \Sigma,\ \Psi,\ \Theta,\ \Omega
\end{array}
\]
In each row, the $i$-th entry yields the $i$-th entry of the next row by the untwisted construction ($\swarrow$) and the $(i+1)$-st entry by the twisted construction ($\searrow$).
Example II. The Golay code G24 , the Leech lattice 3, and the moonshine module V \ The moonshine module V \ is the Z2 -orbifold vertex operator algebra of V3 associated
Framed Vertex Operator Algebras, Codes and Moonshine Module
437
e G coming from the Golay to the Leech lattice 3 which is itself the twisted lattice L 24 \ e . code G24 (cf. [B, FLM]). That is, V = VL e G24
To describe Virasoro decompositions of the moonshine module coming from markings of the Golay code, we must study these markings first. For the decomposition polynomial $P_{V^\natural}(a, b, c)$ only, it is enough to compute the coefficients $W_{k,l}$ of the symmetrized marked weight enumerator. The possible values for $W_{8,l}$ (and so for $W_{16,l}$) for the Golay code were computed by Conder and McKay in [CM]. They found 90 possibilities. It is not clear if the numbers $W_{12,l}$, which are also needed, can be determined from the $W_{8,l}$ alone.

The markings for the Golay code are classified by the double cosets $\mathbb{Z}_2^{12}.\mathrm{Sym}_{12} \backslash \mathrm{Sym}_{24} / M_{24}$. (The first subgroup is the stabilizer of a partition of the 24-set into 2-sets; the second is $M_{24}$, the automorphism group of $G_{24}$.) In fact there are 1858 different classes of markings [Be].

The binary linear code $C \le \mathbb{F}_2^{48}$ as defined in Sect. 2 depends also on the chosen marking. Since for the moonshine module we have $\dim V^\natural_1 = 0$, the minimal weight of $C$ is at least four. The following easy result gives a restriction on the dimension of $C$.

Lemma 5.5. For every frame of 48 Virasoro vertex operator algebras of rank $\tfrac{1}{2}$ inside the moonshine module the dimension of $C$ is smaller than or equal to 41.
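The bound in Lemma 5.5 comes from a sphere-packing count for a binary code of length 47 and minimal weight at least 3. As a quick numerical check, the following Python sketch computes the largest dimension $k$ allowed by the inequality $2^k \cdot (1 + 47) \le 2^{47}$. The function name is illustrative and not taken from the paper.

```python
from math import comb


def max_dimension_sphere_packing(n: int, t: int) -> int:
    """Largest k with 2**k * V(n, t) <= 2**n, where V(n, t) is the
    number of binary words within Hamming distance t of a fixed word."""
    ball = sum(comb(n, i) for i in range(t + 1))
    k = 0
    while 2 ** (k + 1) * ball <= 2 ** n:
        k += 1
    return k


# Length 47 after deleting one coordinate; radius-1 spheres are disjoint
# because the shortened code has minimal weight at least 3.
bound = max_dimension_sphere_packing(47, 1)
print(bound)
```

The ball of radius one has $1 + 47 = 48$ points, so the loop stops at the largest $k$ with $3 \cdot 2^{k+4} \le 2^{47}$.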
Proof. Deleting one coordinate of the codewords of a $k$-dimensional code $C$ of minimal weight 4 leads to a code of length 47, dimension $k$ and minimal weight at least 3. Minimal weight 3 implies that the spheres of radius one around the codewords of this code are all disjoint, i.e. we have the sphere packing condition $2^k \cdot (1 + 47) \le 2^{47}$, or $k \le 41$.

There is indeed a special marking $\mathcal{M}^*$ where $C$ meets this bound. A good way to define it is to describe the Golay code itself by a “double twist” construction. Starting from the glue code $\Xi_3$ of the Niemeier lattice with root sublattice $D_8^3$ one gets first the hexacode $\mathcal{H}_6$, a code over the Kleinian fourgroup, and from the hexacode one obtains the Golay code $G_{24}$: As a code over $D_8^*/D_8 = \mathbb{Z}_2 \times \mathbb{Z}_2 = \{0, 1, s, \bar{s}\}$ (where $1$, $s$, $\bar{s}$ are the $D_8$-cosets represented by $(0^7, 1)$, $((\tfrac12)^7, \pm\tfrac12)$, respectively) one has (cf. [V])

$$\Xi_3 = \{(000), (s11), (1s1), (11s), (0\bar{s}\bar{s}), (\bar{s}0\bar{s}), (\bar{s}\bar{s}0), (sss)\}.$$

The hexacode as a code over $D_4^*/D_4 = \mathbb{Z}_2 \times \mathbb{Z}_2 = \{0, a, b, c\}$ (where $a = [(0, 0, 0, 1)]$, $b = [(\tfrac12, \tfrac12, \tfrac12, \tfrac12)]$ and $c = [(\tfrac12, \tfrac12, \tfrac12, -\tfrac12)]$) can be defined by

$$\mathcal{H}_6 = \tilde{\Xi}_3 := \hat{\Xi}_3 + (\delta_2^3)_0 \;\cup\; \hat{\Xi}_3 + (\delta_2^3)_0 + (b0b0ca).$$

Here $\hat{\ }$ is the map induced from $\hat{\ } : D_8^*/D_8 \to (D_4^*/D_4)^2$, $0 \mapsto 00$, $1 \mapsto a0$, $s \mapsto bb$ and $\bar{s} \mapsto cb$, and $(\delta_2^n)_0$ is the subcode of the Kleinian code $\delta_2^n := \{(00), (aa)\}^n$ of length $2n$ consisting of codewords of weights divisible by 4. In a similar way one gets

$$G_{24} = \tilde{\mathcal{H}}_6 := \hat{\mathcal{H}}_6 + (d_4^6)_0 \;\cup\; \hat{\mathcal{H}}_6 + (d_4^6)_0 + (1000\,1000\,\ldots\,1000\,0111),$$

where $\hat{\ }$ is the map induced from $\hat{\ } : D_4^*/D_4 \to (D_2^*/D_2)^2 \cong \mathbb{F}_2^4$, $0 \mapsto 0000$, $a \mapsto 1100$, $b \mapsto 1010$, $c \mapsto 0110$, and $(d_4^n)_0$ is the subcode of the binary code $d_4^n := \{(0000), (1111)\}^n$ of length $4n$ consisting of codewords of weights divisible by 8. This is the usual MOG or hexacode construction of the Golay code and is a special case of the twisted construction of binary codes from Kleinian codes (cf. [H2], last section).

In this description of the Golay code we let $\mathcal{M}^* = \{(1, 2), \ldots, (23, 24)\}$ be the special marking mentioned above. The marking used in [DMZ] and [H1] arose from the way the Golay code was written there as a cyclic code. The symmetrized marked weight enumerator for the marking $\mathcal{M}^*$ of the Golay code is easily computed (using for example the above description) and one gets

$$\begin{aligned}
\mathrm{smwe}_{G_{24}}(x, y, z) ={}& x^{12} + z^{12} + 39\,(x^4 z^8 + x^8 z^4) + 48\,x^6 z^6 \\
&+ \bigl[96\,(x^6 z^2 + x^2 z^6) + 192\,x^4 z^4\bigr]\, y^4 \\
&+ \bigl[576\,(x^5 z + x z^5) + 1920\,x^3 z^3\bigr]\, y^6 \\
&+ \bigl[48\,(x^4 + z^4) + 288\,x^2 z^2\bigr]\, y^8 + 128\, y^{12}.
\end{aligned} \tag{5.1}$$
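Equation (5.1) can be cross-checked against the known weight distribution of the Golay code. The sketch below assumes the convention (inferred here from the degrees, not quoted from the definition in Sect. 2) that $x$, $z$ and $y$ count the marked pairs on which a codeword takes the values $00$, $11$, and $01$ or $10$, so that a monomial $x^i y^j z^k$ has weight $j + 2k$. All names are illustrative.

```python
from collections import Counter

# Terms of (5.1), keyed by (deg_x, deg_y, deg_z).
SMWE = {
    (12, 0, 0): 1, (0, 0, 12): 1,
    (4, 0, 8): 39, (8, 0, 4): 39, (6, 0, 6): 48,
    (6, 4, 2): 96, (2, 4, 6): 96, (4, 4, 4): 192,
    (5, 6, 1): 576, (1, 6, 5): 576, (3, 6, 3): 1920,
    (4, 8, 0): 48, (0, 8, 4): 48, (2, 8, 2): 288,
    (0, 12, 0): 128,
}


def weight_distribution(smwe):
    """Collapse the marked enumerator to an ordinary weight distribution."""
    dist = Counter()
    for (dx, dy, dz), coeff in smwe.items():
        assert dx + dy + dz == 12  # twelve marked pairs in total
        dist[dy + 2 * dz] += coeff
    return dict(sorted(dist.items()))


def octad_parameters(smwe):
    """Number of weight-8 words containing exactly j marked pairs, j = 0..4."""
    params = [0] * 5
    for (dx, dy, dz), coeff in smwe.items():
        if dy + 2 * dz == 8:
            params[dz] += coeff
    return params


dist = weight_distribution(SMWE)
params = octad_parameters(SMWE)
print(dist, params)
```

Under this assumed convention the octad counts reproduce the parameter list (48, 576, 96, 0, 39) quoted from [CM] in Appendix B.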
Another property of the marking $\mathcal{M}^*$ is that it has the largest stabilizer inside $M_{24}$ among all the different markings, namely $2^6 : [\mathrm{Sym}_4 \times \mathrm{Sym}_3]$ of order $2^{10} 3^2 = 9216$ (see Appendix B), as was noted in [CM].

Remark 5.6. Assume that a marking is represented by the standard partition $\{(1, 2), (3, 4), \ldots, (23, 24)\}$. The markings of the Golay code that arise from markings of the hexacode in the sense of Kleinian codes (cf. end of last subsection) are exactly the ones for which the code $(d_4^6)_0$ is a subcode of $G_{24}'$, a code equivalent to $G_{24}$.

From Lemma 4.2, we get the decomposition of the Leech lattice $\Lambda \cong \tilde{L}_{G_{24}}$ under the $D_1$-frame belonging to the marking $\mathcal{M}^*$. For the symmetrized weight enumerator of the corresponding code $\tilde{\Gamma} \le \mathbb{Z}_4^{24}$ (see (4.2)), Corollary 4.2 gives:

$$\begin{aligned}
\mathrm{swe}_{\tilde{\Gamma}}(A, B, C) ={}& A^{24} + C^{24} + 23439\,(A^{16} C^8 + A^8 C^{16}) + 4032\,(A^6 C^{18} + A^{18} C^6) \\
&+ 378\,(A^4 C^{20} + A^{20} C^4) + 60480\,(A^{10} C^{14} + A^{14} C^{10}) + 85484\,A^{12} C^{12} \\
&+ \bigl[3072\,(A^2 C^{14} + A^{14} C^2) + 43008\,(A^{12} C^4 + A^4 C^{12}) \\
&\qquad + 193536\,(A^{10} C^6 + A^6 C^{10}) + 307200\,A^8 C^8\bigr]\, B^8 \\
&+ \bigl[86016\,(A^{11} C + A C^{11}) + 1576960\,(A^9 C^3 + A^3 C^9) \\
&\qquad + 5677056\,(A^7 C^5 + A^5 C^7)\bigr]\, B^{12} \\
&+ \bigl[6144\,(A^8 + C^8) + 172032\,(A^6 C^2 + A^2 C^6) + 430080\,A^4 C^4\bigr]\, B^{16} \\
&+ 262144\, B^{24}.
\end{aligned}$$
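A simple consistency check on this enumerator: every monomial should have total degree 24, and the coefficients should add up to $4^{12}$, the number of codewords of a $\mathbb{Z}_4$-code of length 24 whose associated lattice is self-dual. The Python sketch below performs this count; the helper `sym` and the term list are an illustrative encoding of the displayed formula, not notation from the paper.

```python
def sym(coeff, a, c, b):
    """Terms of coeff * (A^a C^c + A^c C^a) * B^b, keyed by (degA, degB, degC)."""
    if a == c:
        return {(a, b, c): coeff}
    return {(a, b, c): coeff, (c, b, a): coeff}


# (coefficient, exponent of A, exponent of C, exponent of B)
TERMS = [
    (1, 24, 0, 0), (23439, 16, 8, 0), (4032, 6, 18, 0), (378, 4, 20, 0),
    (60480, 10, 14, 0), (85484, 12, 12, 0),
    (3072, 2, 14, 8), (43008, 12, 4, 8), (193536, 10, 6, 8), (307200, 8, 8, 8),
    (86016, 11, 1, 12), (1576960, 9, 3, 12), (5677056, 7, 5, 12),
    (6144, 8, 0, 16), (172032, 6, 2, 16), (430080, 4, 4, 16),
    (262144, 0, 0, 24),
]

swe = {}
for coeff, a, c, b in TERMS:
    swe.update(sym(coeff, a, c, b))

homogeneous = all(sum(key) == 24 for key in swe)
total = sum(swe.values())
print(homogeneous, total)
```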
As stated before, the markings for the Golay code are classified by the double cosets $\mathbb{Z}_2^{12}.\mathrm{Sym}_{12} \backslash \mathrm{Sym}_{24} / M_{24}$. The classification of all $D_1$-frames in the Leech lattice would seem to be more complicated. From Eq. (4.2), we see that in the case where the $D_1$-frame comes from a marking of the Golay code the corresponding $\mathbb{Z}_4$-code $\tilde{\Gamma}$ contains the subcode $(\Sigma_2^{12})_0$. The following result gives the converse. Recall that the Euclidean weight of a codeword is the minimal Euclidean squared norm of a coset representative in $(D_1^*)^{24}$.

Lemma 5.7. Every self-annihilating even $\mathbb{Z}_4$-code $\Delta$ of length 24 and minimal Euclidean weight 4 containing the subcode $(\Sigma_2^{12})_0$ can be obtained from a marking of the Golay code as in Eq. (4.2).
L24 Proof. Let K = i=1 Z ai a lattice of type A24 in R24 , i.e. the ai are pairwise orthogonal L24 1 vectors of squared length 2. Set L = i=1 Z bi , with b2i−1 = a2i−1 + a2i and b2i = a2i−1 − a2i for i = 1, . . ., 12, i.e. L is a lattice of type D124 . Finally let M = 2K. On K, the group Z24 2 : Sym24 acts by monomial matrices with entries ±1 with respect to the basis {ai | i = 1, . . . , 24}. The lattice L is fixed at least by the group of sign ∗ ∗ ∼ 24 ∼ 24 changes. Clearly K ∗ /K ∼ = Z24 2 , L /L = Z4 , M /M = Z8 , with the induced action 24 24 24 24 of Z2 : Sym24 on Z2 and Z8 and of Z2 on K. The code 1 ≤ L∗ /L determines a self-dual even lattice 3 of rank 24 and minimal length 4. (This must be the Leech lattice since it is the unique self-dual even rank 24 lattice of minimal length 4.) To prove the lemma we have to find a doubly-even self-annihilating binary code 0 0 e as determines 1 = 0 G24 ≤ K ∗ /K equivalent to the Golay code G24 such that G24 in (4.2). (Instead of changing the marking M = {(1, 2), . . . , (23, 24)}, the choice which is determined by the relation between K and L, we are permuting the code G24 ; these procedures are equivalent.) The lattice 3 defines a self-annihilating even Z8 -code = 3/M ≤ M ∗ /M of minimal Euclidean weight 4. If we start with our standard copy of the Golay code G24 e a Z4 -code 1 e ⊂ L∗ /L, and a Z8 -code e ⊂ M ∗ /M . we get a lattice 3, 12 Since (62 )0 is contained in 1, we see easily that the code contains all 24 2 vectors of type (42 022 ). As a main step in the uniqueness proof of 3 in [Co], it was shown that such a code is unique up to the action of Z24 2 : Sym24 , i.e. we have a π in this group such 0 e that π(3)/M = 6. The copy G24 = π(G24 ) of the Golay code gives the code 1 in L∗ /L. Finally we come to the Virasoro decomposition of the moonshine module V \ = VeL eG24 . The following theorem gives a precise description of the codes C and D as defined in Sect. 2. Theorem 5.8. 
The code $C$ associated to the special marking $\mathcal{M}^*$ of the Golay code has length 48 and dimension 41. Its annihilator code $C^\perp = \{d \in \mathbb{F}_2^{48} \mid (d, c) = 0 \text{ for all } c \in C\}$ is of dimension 7 and equals the code $D$ which has generator matrix

1111111111111111 0000000000000000 0000000000000000
0000000000000000 1111111111111111 0000000000000000
0000000000000000 0000000000000000 1111111111111111
0000000011111111 0000000011111111 0000000011111111
0000111100001111 0000111100001111 0000111100001111
0011001100110011 0011001100110011 0011001100110011
0101010101010101 0101010101010101 0101010101010101

Proof. Recall the description of the Golay code given above. The codes $\mathcal{H}_6$ and $G_{24}$ are unions of two parts. The first part we call the untwisted part and the second is called the twisted part.

First we show that the above matrix is a parity check matrix for $C$. From Theorem 4.10 we see that a codeword $c \in G_{24}$ gives us an irreducible $T_{48}$-module $M(h_1, \ldots, h_{48})$ with all $h_i$ different from $\tfrac{1}{16}$ if and only if $c(k) \in \{(0, 0), (1, 1)\}$ for all $k$. The codewords with this property are exactly the ones that are coming from the codeword $(000) \in \Xi_3$. This gives the first three rows of the parity check matrix. The next two rows correspond to the selection of the subcodes $(\delta_2^3)_0 \subset \delta_2^3$ and $(d_4^6)_0 \subset d_4^6$. Let $B_2^n$ be the FVOA $(M(0, 0) \oplus M(\tfrac12, \tfrac12))^{\otimes n}$ with binary code $C(B_2^n) = \{(0, 0), (1, 1)\}^n$ of length $2n$. The
subVOA $(B_2^n)_0$ is the FVOA belonging to the subcode of $C(B_2^n)$ consisting of codewords of weights divisible by 4 (cf. Proposition 2.16). Then the last two rows of the parity check matrix correspond to the selection of the subcodes $(\Sigma_2^{12})_0 \subset \Sigma_2^{12}$ and $C((B_2^{24})_0) \subset C(B_2^{24})$: these are the conditions $\prod {}_k = +$ and $\prod \mu_k = +$. There are no further conditions.

To determine $D$ note first that the inclusion $D \le C^\perp$ is Proposition 2.14 (3). To see $C^\perp \le D$ observe that the codewords $\{(s11), (1s1), (11s)\} \subset \Xi_3$ correspond to the first three lines of the generator matrix, the twisted parts of $\mathcal{H}_6$ and $G_{24}$ to the next two, and two of the last three summands of $V^\natural = \tilde{V}_{\tilde{L}_{G_{24}}}$ in Theorem 4.10 correspond to the last two lines of the generator matrix. Alternatively, one can compute $D$ by using the self-duality of the moonshine VOA [D3] and applying Theorem 2.19.

The code $C$ is also the lexicographic code of length 48 and minimal weight 4 (see [CS3], Th. 6). As mentioned there, it is a “shortened extended Hamming code” of length 64 in the following sense: If we extend the generator matrix of $D$ by the block

1111111111111111
1111111111111111
1111111111111111
0000000011111111
0000111100001111
0011001100110011
0101010101010101

we obtain a parity check matrix for the extended Hamming code $H_{64}$ of length 64. The vectors $c \in \mathbb{F}_2^{64}$ with 0's in the last 16 coordinates belong to $H_{64}$ if and only if the vector of the first 48 coordinates belongs to $C$. The automorphism group of this code is of type $2^{12}[\mathrm{GL}(4, 2) \times \mathrm{Sym}_3]$ and has order 495452160 (see Appendix C for a proof).

For future reference we give the decomposition polynomial as obtained from Corollary 4.12 in full. Remember that $a$, $b$ and $c$ count the modules of conformal weight $0$, $\tfrac12$, resp. $\tfrac{1}{16}$ (see Definition 4.11).

Corollary 5.9. The complete decomposition polynomial for the moonshine module belonging to the special marking $\mathcal{M}^*$ is given by
$$\begin{aligned}
P^{\mathcal{M}^*}_{V^\natural}(a, b, c) ={}& a^{48} + b^{48} + 3300\,(a^{44} b^4 + a^4 b^{44}) + 189504\,(a^{42} b^6 + a^6 b^{42}) \\
&+ 5907810\,(a^{40} b^8 + a^8 b^{40}) + 102156864\,(a^{38} b^{10} + a^{10} b^{38}) \\
&+ 1088684372\,(a^{36} b^{12} + a^{12} b^{36}) + 7535996160\,(a^{34} b^{14} + a^{14} b^{34}) \\
&+ 35232581487\,(a^{32} b^{16} + a^{16} b^{32}) + 114215080192\,(a^{30} b^{18} + a^{18} b^{30}) \\
&+ 261496913352\,(a^{28} b^{20} + a^{20} b^{28}) + 427898196864\,(a^{26} b^{22} + a^{22} b^{26}) \\
&+ 503871835740\,a^{24} b^{24} \\
&+ \bigl[6144\,(a^{30} b^2 + a^2 b^{30}) + 430080\,(a^{28} b^4 + a^4 b^{28}) \\
&\qquad + 10881024\,(a^{26} b^6 + a^6 b^{26}) + 126197760\,(a^{24} b^8 + a^8 b^{24}) \\
&\qquad + 774199296\,(a^{22} b^{10} + a^{10} b^{22}) + 2709417984\,(a^{20} b^{12} + a^{12} b^{20}) \\
&\qquad + 5657364480\,(a^{18} b^{14} + a^{14} b^{18}) + 7212810240\,a^{16} b^{16}\bigr]\, c^{16} \\
&+ \bigl[184320\,(a^{23} b + a b^{23}) + 15544320\,(a^{21} b^3 + a^3 b^{21}) \\
&\qquad + 326430720\,(a^{19} b^5 + a^5 b^{19}) + 2658078720\,(a^{17} b^7 + a^7 b^{17}) \\
&\qquad + 10041630720\,(a^{15} b^9 + a^9 b^{15}) + 19170385920\,(a^{13} b^{11} + a^{11} b^{13})\bigr]\, c^{24} \\
&+ \bigl[3072\,(a^{16} + b^{16}) + 368640\,(a^{14} b^2 + a^2 b^{14}) + 5591040\,(a^{12} b^4 + a^4 b^{12}) \\
&\qquad + 24600576\,(a^{10} b^6 + a^6 b^{10}) + 39536640\,a^8 b^8\bigr]\, c^{32} + 131072\, c^{48}.
\end{aligned}$$

It was shown in Chapter 4 of [H1] that for a self-dual vertex operator algebra $V$ the decomposition polynomial belongs to the ring $\mathbb{C}[a, b, c]^G$ of invariants for some $3 \times 3$-matrix group $G$ of order 1152. The space of invariant homogeneous polynomials of degree 48 is 7-dimensional and it can be checked that the above polynomial indeed belongs to this space by using the explicit basis given in [H1].

We expect that the analog of Remark 5.6 and Lemma 5.7 holds: Every self-dual FVOA of central charge 24 and minimal weight 2 (i.e. $\dim V_1 = 0$) containing the subVOA $(B_2^{24})_0$ can be obtained from a $D_1$-frame of the Leech lattice as in the second equation of Theorem 4.7.

Appendix A. Orbits on Markings of a Hamming Code

Notation A.1. Let $H$ be the unique binary code with parameters $[8, 4, 4]$, the Hamming code. We take it to be the span of (00001111), (00110011), (01010101), (11111111). Let $A := \mathrm{Aut}(H) \cong \mathrm{AGL}(3, 2)$ (see Theorem C.3). A marking is a partition of the index set into 2-sets. The number of markings is $\binom{8}{2}\binom{6}{2}\binom{4}{2}\binom{2}{2}/4! = 2520/24 = 105$. We show that there are three orbits of $A$ on the set of markings and determine the stabilizers. This group is triply but not quadruply transitive on the eight indices.

Notation A.2. It helps to interpret the index set as $V \cong \mathbb{F}_2^3$ with the obvious action of $A$. So, 2-sets correspond to affine subspaces of dimension 1. Take a linear subspace $U \le V$ of dimension 1. Let $T$ be the translation subgroup of $A$ and let $L := \mathrm{Stab}_A(0) \cong \mathrm{GL}(3, 2)$. Let $M$ be a marking, $S := \mathrm{Stab}_A(M)$. By double transitivity, we may assume $U \in M$. Let $R \le T$ be the group of order 2 corresponding to $U$.

Case α. We assume that all four parts of the marking $M$ are cosets of $U$.
Then $S = T\,\mathrm{Stab}_L(U) \cong 2^3 : \mathrm{Sym}_4$, a group of index 7 in $A$.

Case β. We assume that exactly two parts of the marking are cosets of $U$, say $U$ and $W$. Let $P$ and $Q$ be the other two parts. Then $X := U \cup W$ is a dimension 2 linear subspace of $V$ and $Y := P \cup Q$ is its complement. Both $P$ and $Q$ are cosets of a common linear 1-dimensional subspace $U^* \ne U$ of $X$. Let $R^*$ be the fours group in $T$ which corresponds to $X$; $R^* > R$. Then, $R^*$ stabilizes both $\{U, W\}$ and $\{P, Q\}$, whence $R^* \le S$; in fact, $R^* = T \cap S$. Since $A$ acts transitively on pairs of parallel affine 1-spaces, $S$ acts transitively on $\{X, Y\}$; let $S_0 := \mathrm{Stab}_S(X) = \mathrm{Stab}_S(Y)$. Then $S_0$ has index 2 in $S$ and acts transitively on $\{U, W\}$; let $S_1$ be the common stabilizer, index 2 in $S_0$. Also, $S_0$ acts transitively on $\{P, Q\}$; let $S_2$ be the common stabilizer, index 2 in $S_0$. Then $S_1 \ne S_2$ (since $R$ stabilizes $U$ and $W$ but interchanges $P$ and $Q$), and $S_4 := S_1 \cap S_2 \triangleleft S$ and $S/S_4 \cong \mathrm{Dih}_8$, a Sylow 2-group of $\mathrm{Sym}_4$ (via its action on the marking).
It suffices to show that $|S_4| = 4$. Clearly, elements of $S_4$ have square 1. The involution which is trivial on $X$ and interchanges the points within each of $P$ and $Q$ is in $L$. The same idea, with $U$, $W$ replaced by $P$, $Q$, gives an involution which is in a conjugate of $L$, say in $L^g$, where $g \in A$ interchanges $X$ and $Y$. Since these involutions are different, $|S_4| \ge 3$. If $1 \ne u \in S_4$ has a fixed point, say $v \in V$, it may be interpreted as a linear transformation by taking $v$ as the origin; since $u$ is an involution, its fixed point subspace has dimension 2, and is a union of members of $M$, so is one of $X$ or $Y$; this means $u$ is one of the two involutions already defined. Therefore, $|S_4| \le 4$, whence equality.

Case γ. We assume that all parts of the marking besides $U$ are not cosets of $U$. It follows that $S \cap T = 1$, so $S$ embeds in $L$. Clearly, 7 does not divide $|S|$, so $S$ embeds as a proper subgroup of order dividing 24. Thus, the orbit here has length divisible by $8 \cdot 7 = 56$. By our above count of the number of markings, this must be the exact number. We conclude that $S \cong \mathrm{Sym}_4$, since the only subgroups of odd index in $\mathrm{GL}(3, 2)$ are parabolic subgroups [Ca], 8.3.2.
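The orbit count of Appendix A is small enough to confirm by brute force. The following Python sketch, which is an independent check and not the argument of the paper, enumerates all coordinate permutations preserving the Hamming code of Notation A.1, lists the 105 markings, and computes the orbit lengths. All function and variable names are illustrative.

```python
from itertools import permutations, product

GENS = ["00001111", "00110011", "01010101", "11111111"]
GEN_WORDS = [tuple(int(b) for b in g) for g in GENS]


def span(words):
    """All F_2-linear combinations of the given words."""
    out = set()
    for coeffs in product((0, 1), repeat=len(words)):
        w = [0] * len(words[0])
        for c, g in zip(coeffs, words):
            if c:
                w = [a ^ b for a, b in zip(w, g)]
        out.add(tuple(w))
    return out


CODE = span(GEN_WORDS)


def act(perm, word):
    """Move the entry at position i to position perm[i]."""
    out = [0] * len(word)
    for i, bit in enumerate(word):
        out[perm[i]] = bit
    return tuple(out)


# A coordinate permutation is linear, so it preserves the code as soon
# as it maps the four generators into the code.
AUT = [p for p in permutations(range(8))
       if all(act(p, g) in CODE for g in GEN_WORDS)]


def markings(points):
    """All partitions of the tuple `points` into 2-sets."""
    if not points:
        yield frozenset()
        return
    first, rest = points[0], points[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for m in markings(remaining):
            yield m | {frozenset((first, partner))}


ALL = list(markings(tuple(range(8))))


def orbit_sizes():
    seen, sizes = set(), []
    for m in ALL:
        if m in seen:
            continue
        orbit = {frozenset(frozenset(p[i] for i in pair) for pair in m)
                 for p in AUT}
        seen |= orbit
        sizes.append(len(orbit))
    return sorted(sizes)


sizes = orbit_sizes()
print(len(AUT), len(ALL), sizes)
```

The three orbit lengths correspond to Cases α, β and γ, with stabilizers of index 7, 42 and 56 in $A$.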
B. Automorphisms of a Marked Golay Code

We settle the stabilizer in $M_{24}$ of the special marking $\mathcal{M}^*$ we obtained in our description of the Golay code and identify $\mathcal{M}^*$ with the exceptional marking of Blackburn, Conder and McKay [CM] with parameters (48, 576, 96, 0, 39). As noted in Sect. 5 our construction of $G$ is equivalent to the usual hexacode construction, as in [G2] (5.25). The marking $\mathcal{M}^*$ in this notation is gotten from the usual sextet partition of the 24-set

0    •  •  •  •  •  •
1    •  •  •  •  •  •
ω    •  •  •  •  •  •
ω̄    •  •  •  •  •  •

by intersecting the columns with the unions $\mathrm{Row}_0 \cup \mathrm{Row}_1$ and $\mathrm{Row}_\omega \cup \mathrm{Row}_{\bar\omega}$:

0    •  •  •  •  •  •
1    •  •  •  •  •  •

ω    •  •  •  •  •  •
ω̄    •  •  •  •  •  •
The set of twelve resulting 2-sets form $\mathcal{M}^*$. In [G2], the action of the associated sextet group on $\Omega$ is described. The group has shape $H := 2^6\, 3 \cdot \mathrm{Sym}_6$ and may be thought of as $\mathcal{H} : \mathrm{Aut}^*(\mathcal{H})$, the affine hexacode group ($\mathcal{H}$ denotes the hexacode) (5.25). As in [G2], Chapters 5 and 6, we use the notation $K_i$ for the 4-set in $\Omega$ occurring as the $i$th column above and $K_{ij\ldots}$ denotes the union $K_i \cup K_j \cup \cdots$. The obvious subgroup of $H$ which preserves $\mathcal{M}$ is $\mathcal{H} : P$, where $P = S \times \langle t \rangle \cong \mathrm{Sym}_4 \times 2$, where $S$ is generated by the groups of permutations

(1) the four-group of row-respecting column permutations which interchange columns within evenly many coordinate blocks $K_{12}$, $K_{34}$, $K_{56}$;
(2) the copy of $\mathrm{Sym}_3$ obtained by permuting the three coordinate blocks (respecting the order within the blocks);
(3) the permutation $t$ is given by the following diagram

←→   ←→   ←→
←→   ←→   ←→
↖↗   ↖↗   ↖↗
↙↘   ↙↘   ↙↘
[G2] (5.38), UP2. The corresponding subgroup $\mathrm{Sym}_4 \times 2$ of $\mathrm{Sym}_6$ is maximal (since it is the stabilizer of a 2-set in a sextuply transitive action). Since the “scalar” transformation UP9 (5.38) [G2] (fixes top row elementwise, cycles rows 2, 3, 4 downward)

•  •  •  •  •  •
↓  ↓  ↓  ↓  ↓  ↓

does not stabilize $\mathcal{M}$, it follows that $\mathcal{H} : P$ is the stabilizer of $\mathcal{M}$ in $H$.

Notation B.1. $R := \mathrm{Stab}_G(\mathcal{M})$, $G := M_{24}$.

The next step is to determine $R$; we know that $R \cap H = \mathcal{H} : P$. We take a clue from the symmetrized marked weight enumerator $\mathrm{smwe}_G(x, y, z)$ of the Golay code as given in (5.1) and see that the parameters in the sense of [CM] are (48, 576, 96, 0, 39). The next result is an exercise.

Lemma B.2. (i) The octads which contribute to $c_0 = 48$ are those with even parity and which are labeled by a hexacode word of the form $(00xxxx)$, where $x = \omega$ or $\bar\omega$ and where the zeroes occur in any of the three coordinate blocks; these octads are unions of 2-sets which are subsets of columns labeled by $x$.
(ii) The octads which contribute to $c_4 = 39$ have even parity and are one of $K_{ij}$ (15 of these) or are octads labeled by hexacode words $(001111)$, with the zeroes occurring in any of the three coordinate blocks; these octads are unions of parts of $\mathcal{M}$ which occur in columns labeled by 1 (24 such octads).

Clearly, $R$ permutes the sets of octads (i) and (ii).

Lemma B.3. The orbits of $\mathcal{H} : P$ on $X$, the set of octads in (ii), are the following:
(a) $K_{12}$, $K_{34}$, $K_{56}$ (length 3);
(b) $K_{ij}$, for all $\{ij\} \ne \{12\}, \{34\}, \{56\}$ (length 12);
(c) octads labeled by some $(001111)$ (length 24).

Theorem B.4. $R$ is a subgroup of index 7 in the stabilizer of the trio (a), whence $|R : R \cap H| = 3$ and $R \cong 2^6 : [\mathrm{Sym}_4 \times \mathrm{Sym}_3]$.

Proof. We consider the action of $R$ on $X$. The octads in $X$ which have only a 0- or a 4-set as intersection with all members of $X$ are the three in (a). So, $R$ preserves this trio and so is in the trio group, $J$, of the form $2^6[\mathrm{GL}(3, 2) \times \mathrm{Sym}_3]$. The group $\mathcal{H} : P$ is a subgroup of $J$ of index 21.

We consider the possibility that 7 divides $|R|$. Let $g \in R$ be an element of order 7. Then $g$ fixes at least 1 of the remaining 36 members of $X$. An element of order 7 in $G$ fixes exactly three octads and clearly these are just the octads of our trio (a), a contradiction. So, $R$ has order $2^{10} 3$ or $2^{10} 3^2$. We eliminate the former by exhibiting a permutation in $R \setminus H$; UP13 from (5.38) [G2] does the job.

↕  ↕  ↕  ↕  ↕  ↕
↙↘↖↗   ↙↘↖↗   ↙↘↖↗
Finally we can identify our marking $\mathcal{M}^*$ with the one in [CM]. This is not completely obvious since the labelings chosen in [CM] are different from the standard ones, e.g. in [Atlas] or [G2]. There are $|G|/|R| = 26565$ markings equivalent to $\mathcal{M}^*$, but this is exactly the number of markings obtained in [CM] with parameters (48, 576, 96, 0, 39), i.e. there is only one orbit of markings with these parameters.

C. Automorphism Group of Certain Codes C and D of length $3 \cdot 2^d$

We are studying binary codes $C \le \mathbb{F}_2^\Omega$, where $|\Omega| = 3 \cdot 2^d$ and $D := C^\perp$ is spanned by the $d + 3$ rows of the matrix

M :=
1111......1111   0000......0000   0000......0000
0000......0000   1111......1111   0000......0000
0000......0000   0000......0000   1111......1111
00...0011...11   00...0011...11   00...0011...11
      ⋮                ⋮                ⋮
0011......0011   0011......0011   0011......0011
0101......0101   0101......0101   0101......0101

Our problem is to find $F := \mathrm{Aut}(C) = \mathrm{Aut}(D) \le \mathrm{Sym}_\Omega$.

Notation C.1. We partition $\Omega$ into three coordinate blocks $\Gamma_1 := \{1, 2, \ldots, 2^d\}$, $\Gamma_2 := \{2^d + 1, \ldots, 2 \cdot 2^d\}$ and $\Gamma_3 := \{2 \cdot 2^d + 1, \ldots, 3 \cdot 2^d\}$.

Here is our main result; it was referred to after Theorem 5.8, for $d = 4$.

Theorem C.2. $F \cong 2^{3d}[\mathrm{GL}(d, 2) \times \mathrm{Sym}_3]$, where the $2^{3d}$ may be interpreted as a tensor product of a $d$ and 3 dimensional module for the factors of $\mathrm{GL}(d, 2) \times \mathrm{Sym}_3$.

The two main parts of the proof consist of showing that $F$ preserves the partition $\{\Gamma_i\}$ and the description of the automorphism groups of the related length $2^d$ codes.

Theorem C.3. (i)
The code $J$ spanned by the $d$ vectors

0000...0000 1111...1111
0000...1111 0000...1111
          ⋮
0011...0011 0011...0011
0101...0101 0101...0101

has automorphism group $\mathrm{GL}(d, 2)$.
(ii) The code spanned by $J$ and the all ones vector has automorphism group $\mathrm{AGL}(d, 2)$; the normal translation subgroup is the group of automorphisms which are trivial modulo the span of the all ones vector.
(iii) There is a unique binary code of length $2^d$ and dimension $d$ in which all nonzero weights are $2^{d-1}$. It is equivalent to the code of (i).

Proof. See [AK], Chapter 5, for example.
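The weight property in part (iii) is easy to confirm for the code $J$ of part (i) in small cases. The Python sketch below builds the $d$ generators as the bit patterns of the positions $0, \ldots, 2^d - 1$ (an encoding chosen here to match the displayed rows), lists the weights of the nonzero codewords, and evaluates the standard order formula for $\mathrm{GL}(d, 2)$. The names are illustrative.

```python
from itertools import product


def bit_row(d, i):
    """Row of length 2**d whose j-th entry is bit i (most significant first) of j."""
    return [(j >> (d - 1 - i)) & 1 for j in range(2 ** d)]


def code_J(d):
    """All codewords of the code spanned by the d bit rows."""
    gens = [bit_row(d, i) for i in range(d)]
    words = []
    for coeffs in product((0, 1), repeat=d):
        w = [0] * 2 ** d
        for c, g in zip(coeffs, gens):
            if c:
                w = [a ^ b for a, b in zip(w, g)]
        words.append(tuple(w))
    return words


def order_GL(d):
    """|GL(d, 2)| = prod_{i<d} (2**d - 2**i)."""
    n = 1
    for i in range(d):
        n *= 2 ** d - 2 ** i
    return n


d = 4
nonzero_weights = {sum(w) for w in code_J(d) if any(w)}
print(nonzero_weights, order_GL(d), 2 ** d * order_GL(d))
```

For $d = 4$ the last two printed numbers are the orders of $\mathrm{GL}(4, 2)$ and $\mathrm{AGL}(4, 2)$.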
Notation C.4. Let $R$ be the span of the first three rows of $M$ and let $S$ be the span of the last $d$. Note that the projections of $S$ or $D$ to any $\Gamma_i$ block is a code as described by (C.3). We observe that every element of $R$ has cardinality $0$, $2^d$, $2 \cdot 2^d$ or $3 \cdot 2^d$ and that every element of $D \setminus R$ has cardinality $3 \cdot 2^{d-1}$. To check this, just verify it for elements of $S$, a triply thickened [G2] (3.19) extended Hamming code, and note that the effect of adding an element of $R$ to an element $d \in D$ is, for each $i$, to take the $i$th projection of $d$ to itself or its complement with respect to $\Gamma_i$. So, $F$ fixes $R$.

Lemma C.5. $F := \mathrm{Aut}(D)$ permutes the partition $\Gamma_i$, $i = 1, 2, 3$, as $\mathrm{Sym}_3$.

Proof. Since $F$ preserves $R$, we deduce that $F$ preserves the partition by examining the three minimal weight elements of $R$. On the other hand, any blockwise permutation fixes the set of rows of $M$ (permutes the first three, fixes the rest).

Notation C.6. Let $H$ be the subgroup of $F$ which fixes each $\Gamma_i$; the code $S$ (C.4) is a triply thickened version of the $d$-dimensional length $2^d$ code associated to $\mathrm{GL}(d, 2)$, as in (C.3). It is clear that the natural action of a group $F_0 \cong \mathrm{GL}(d, 2) \times \mathrm{Sym}_3$ (first factor $F_1$ acting diagonally and the second $F_2$ as block permutations) is in $F$ and stabilizes $S$. Note that the second factor acts trivially on $S$.

Proof of Theorem C.2. There is a group $T_i$ acting as translations on $\Gamma_i$, identified with $\mathbb{F}_2^d$ as in (C.3), and trivially on $\Gamma_j$, for $j \ne i$; we choose these identifications to be compatible with the action of $F_2$. The direct product $T := T_1 \times T_2 \times T_3$ is in $F$. Since $H$ fixes $R$, we consider the action of $H$ on $D/R$. The kernel of this action corresponds naturally to a subgroup $\mathrm{Hom}(D/R, R)$, order $2^{3d}$, and may be interpreted as an element of $T$ as in the above paragraph. Since $T \le H$ and $|T| = 2^{3d}$, this kernel is $T$. Since $F_1 \le H$ induces $\mathrm{GL}(D/R)$ on $D/R$, $H = T F_1$ and $F = T F_0$.
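For $d = 4$ the statements of this appendix can be checked numerically. The Python sketch below builds the $d + 3$ rows of $M$, verifies the cardinalities claimed in Notation C.4, computes the minimal weight of $C = D^\perp$ from linear dependencies among the columns of $M$, and evaluates the order of $F$ from Theorem C.2. It is a verification sketch with illustrative names, not a construction of the automorphism group.

```python
from itertools import combinations, product


def generator_rows(d):
    """The 3 block-indicator rows and the d bit rows of the matrix M."""
    n = 2 ** d
    block_rows = [[1 if j // n == b else 0 for j in range(3 * n)]
                  for b in range(3)]
    bit_rows = [[((j % n) >> (d - 1 - i)) & 1 for j in range(3 * n)]
                for i in range(d)]
    return block_rows, bit_rows


def span(rows):
    """All F_2-linear combinations of the given rows."""
    words = set()
    for coeffs in product((0, 1), repeat=len(rows)):
        w = [0] * len(rows[0])
        for c, r in zip(coeffs, rows):
            if c:
                w = [a ^ b for a, b in zip(w, r)]
        words.add(tuple(w))
    return words


def min_weight_of_dual(rows, max_weight=4):
    """Smallest number of columns of `rows` that sum to zero over F_2."""
    cols = [sum(r[j] << k for k, r in enumerate(rows))
            for j in range(len(rows[0]))]
    for w in range(1, max_weight + 1):
        for combo in combinations(cols, w):
            x = 0
            for c in combo:
                x ^= c
            if x == 0:
                return w
    return None


def order_F(d):
    """Order of 2^(3d) [GL(d, 2) x Sym_3] as in Theorem C.2."""
    gl = 1
    for i in range(d):
        gl *= 2 ** d - 2 ** i
    return 2 ** (3 * d) * gl * 6


d = 4
block_rows, bit_rows = generator_rows(d)
R = span(block_rows)
D = span(block_rows + bit_rows)
weights_R = sorted({sum(w) for w in R})
weights_D_minus_R = sorted({sum(w) for w in D - R})
dim_C = 3 * 2 ** d - (d + 3)
min_wt_C = min_weight_of_dual(block_rows + bit_rows)
print(weights_R, weights_D_minus_R, dim_C, min_wt_C, order_F(d))
```

The resulting dimension, minimal weight and group order agree with the values stated for the code $C$ after Theorem 5.8.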
D. Lifting Minus the Identity

Definition D.1. Let $L$ be an even integral lattice. A lift of $-1$ is an automorphism $\theta$ of the lattice VOA $V_L$ such that for all $x \in L$, there is a scalar $c_x$ so that $\theta : e^x \mapsto c_x e^{-x}$. (Here, $e^x$ means $1 \otimes e^x$, where 1 is the constant polynomial.)

As usual, there is an epsilon function in the description of the lattice VOA $V_L$, $\varepsilon : L \times L \to \mathbb{C}^\times$, which is bimultiplicative and satisfies $\varepsilon(x, y)\varepsilon(y, x)^{-1} = (-1)^{(x, y)}$.

Lemma D.2. Let $x, y \in L$. For some integer $k$ and scalar $c$, $e^x_k e^y = c\, e^{x+y}$ ($a_k b$ means the value of the $k$th binary composition on $a$, $b$). In fact, we take $k = -1 - (x, y)$ and $c = \varepsilon(x, y)$, which is always nonzero.

Proof. This is obvious from the form of the vertex operator representing $e^x$.

Lemma D.3. If the set $S = -S$ spans $L$, then the set of all $e^x$, for $x \in S$, generates the associated lattice VOA $V_L$.

Proof. By Lemma D.2, we may assume that $S = L$. Let $V'$ be the subVOA so generated. Note that for any $\alpha \in L$, $\alpha(-1) = e^\alpha_{(\alpha,\alpha)-2}\, e^{-\alpha}$. Thus $V'$ contains all $e^\alpha$ and $\alpha(-1)$ for $\alpha \in S$. It is clear that $V_L$ is irreducible under the component operators of $Y(e^\alpha, z)$ and $Y(\alpha(-1), z)$ for $\alpha \in S$, hence $V'$ contains all $p \otimes e^\alpha$, where $p$ is a polynomial expression in $\alpha(n)$, for $n < 0$. It follows immediately that $V' = V_L$.
Notation D.4. Let $M$ be the set of lifts of $-1$ and $T$ the rank $\ell$ torus of automorphisms of $V_L$ associated to $L$. There is an identification $T = \mathbb{C}^\ell / L^*$ so that $t = v + L^* \in T$ sends $e^x$ to $e^{2\pi i (v, x)} e^x$.

Lemma D.5. Let $A$ be an abelian group, $\langle u \rangle$ be a group of order 2 which acts on $A$ by letting $u$ invert every element of $A$. Set $B := A\langle u \rangle$, the semidirect product. Every element of the coset $Au$ is an involution, and two such involutions $cu$ and $du$ are conjugate in $B$ (equivalently, by an element of $A$) iff $cd^{-1}$ is the square of an element of $A$. This last condition follows if $A$ is divisible, e.g. a torus.

Theorem D.6. $M$ forms an orbit under conjugation by $T$ in $\mathrm{Aut}(V_L)$.

Proof. Let $x_1, \ldots, x_\ell$ form a basis of $L$. Given an element of $M$, we may compose it with an element $r \in T$ to assume it satisfies $e^{\pm x_i} \mapsto e^{\mp x_i}$, for all $i$. The conditions $e^{\pm x_i} \mapsto e^{\mp x_i}$ characterize an automorphism, since these $2\ell$ elements generate the VOA, by Lemma D.3. This composition is the same as conjugation by $s \in T$ such that $s^2 = r$ or $r^{-1}$. So, we are done if we prove that $e^x \mapsto e^{-x}$ for all $x \in L$. But this is clear from Lemma D.2 since $\varepsilon(-x, -y) = \varepsilon(x, y)$.

Corollary D.7. Given two lifts of $-1$ on $V_L$, their fixed point subVOAs are isomorphic. In fact, these subVOAs are in the same orbit of $\mathrm{Aut}(V_L)$.

Acknowledgement. The authors are grateful to E. Betten for discussions on markings of the Golay code. They also thank A. Feingold and G. Mason for discussions.
References

[AK] Assmus, E.F. Jr., and Key, J.D.: Designs and Their Codes. Cambridge: Cambridge Univ. Press, 1993
[Atlas] Conway, J.H., Curtis, R.T., Norton, S.P., Parker R.P. and Wilson, R.A.: Atlas of Finite Groups. Oxford: Clarendon Press, 1985
[BS] Bonnecaze, A. and Solé, P.: Quaternary constructions of formally self-dual binary codes and unimodular lattices. Algebraic coding (Paris, 1993), Lecture Notes in Comput. Sci., 781, Berlin: Springer, 1994, pp. 194–205
[B] Borcherds, R.E.: Vertex algebras, Kac-Moody algebras, and the Monster. Proc. Natl. Acad. Sci. USA 83, 3068–3071 (1986)
[Be] Betten, A.: Personal communication
[CM] Conder, M. and McKay, J.: Markings of the Golay code. New Zealand J. Mathematics 25, 133–139 (1996)
[Ca] Carter, R.: Simple Groups of Lie Type. Wiley Classics Library, London, New York, Sydney, Toronto: John Wiley and Sons, 1989
[Co] Conway, J.: A characterization of Leech’s lattice. Invent. Math. 7, 137–142 (1969)
[CS1] Conway, J. and Sloane, N.J.A.: Sphere packings, Lattices and Groups. Berlin–Heidelberg–New York: Springer, 1989
[CS2] Conway, J. and Sloane, N.J.A.: Self-dual Codes over the Integers Modulo 4. J. of Combinatorial Th., Series A 62, 30–45 (1993)
[CS3] Conway, J. and Sloane, N.J.A.: Lexicographic codes: error-correcting codes from game theory. Trans. on Information theory 32, 337–348 (1986)
[DGM1] Dolan, L., Goddard, P. and Montague, P.: Conformal field theory of twisted vertex operators. Nucl. Phys. B338, 529 (1990)
[DGM2] Dolan, L., Goddard, P. and Montague, P.: Conformal field theory, triality and the Monster group. Phys. Lett. B236, 165 (1990)
[D1] Dong, C.: Vertex algebras associated with even lattices. J. Algebra 160, 245–265 (1993)
[D2] Dong, C.: Twisted modules for vertex algebras associated with even lattices. J. Algebra 165, 90–112 (1994)
[D3] Dong, C.: Representations of the moonshine module vertex operator algebra. Contemporary Math. 175, 27–36 (1994)
[DL1] Dong, C. and Lepowsky, J.: Generalized Vertex Algebras and Relative Vertex Operators. Progress in Math. Vol. 112, Boston: Birkhäuser, 1993
[DL2] Dong, C. and Lepowsky, J.: The algebraic structure of relative twisted vertex operators. J. Pure and Applied Algebra 110, 259–295 (1996)
[DLM1] Dong, C., Li, H. and Mason, G.G.: Regularity of rational vertex operator algebras. Advances in Math. 32, 148–166 (1997)
[DLM2] Dong, C., Li, H. and Mason, G.: Compact automorphism groups of vertex operator algebras. International Math. Research Notices 18, 913–921 (1996)
[DLM3] Dong, C., Li, H. and Mason, G.: Twisted representations of vertex operator algebras. Math. Ann. to appear, q-alg/9509005
[DLM4] Dong, C., Li, H. and Mason, G.: Modular invariance of trace functions in orbifold theory. q-alg/9703016
[DLi] Dong, C. and Lin, Z.: Induced modules for vertex operator algebras. Commun. Math. Phys. 179, 157–184 (1996)
[DM] Dong, C. and Mason, G.: On quantum Galois theory. Duke Math. J. 86, 305–321 (1997)
[DMZ] Dong, C., Mason, G. and Zhu, Y.: Discrete series of the Virasoro algebra and the moonshine module. Proc. Symp. Pure. Math., American Math. Soc. 56 II, 295–316 (1994)
[FLM] Frenkel, I.B., Lepowsky, J. and Meurman, A.: Vertex Operator Algebras and the Monster. Pure and Applied Math. Vol. 134, New York: Academic Press, 1988
[FZ] Frenkel, I. and Zhu, Y.: Vertex operator algebras associated to representations of affine and Virasoro algebras. Duke Math. J. 66, 123–168 (1992)
[FQS] Friedan, D., Qiu, Z. and Shenker, S.: Details of the non-unitarity proof for highest weight representations of Virasoro Algebra. Commun. Math. Phys. 107, 535–542 (1986)
[G1] Griess, R.L., Jr.: The Friendly Giant. Invent. Math. 69, 1–102 (1982)
[G2] Griess, R.L., Jr.: Twelve Sporadic Groups. Book to be published by Springer-Verlag
[GKO] Goddard, P., Kent, A. and Olive, D.: Unitary representations of the Virasoro Algebra and super-Virasoro algebras. Commun. Math. Phys. 103, 105–119 (1986)
[H1] Höhn, G.: Selbstduale Vertexoperator-Superalgebren und das Babymonster. Ph.D. thesis, University of Bonn, 1995. See: Bonner mathematische Schriften 286, (1996)
[H2] Höhn, G.: Self-dual codes over the Kleinian fourgroup. Preprint 1996
[H3] Höhn, G.: Simple current extensions of vertex operator algebras and the cohomology of K(π, 2). In preparation
[Hu1] Huang, Yi-Zhi: Intertwining Operator Algebras, genus-zero Modular Functors and genus-zero Conformal Field Theories. In: Operads, Proceedings of Renaissance Conferences. ed. J.-L. Loday, J. Stasheff, and A. A. Voronov, Contemporary Math., Vol. 202, Providence, RI: Amer. Math. Soc., 1997, pp. 335–355
[Hu2] Huang, Yi-Zhi: A Jacobi identity intertwiner algebras. q-alg/9704008
[KR] Kac, V. and Raina, A.: Bombay Lectures on Highest Weight Representations. Singapore: World Sci., 1987
[KW] Kac, V. and Wang, W.: Vertex operator superalgebras and representations. Contemporary Math. Vol. 175, 161–191 (1994)
[L1] Li, H.: Symmetric invariant bilinear forms on vertex operator algebras. J. Pure and Appl. Alg. 96, 279–297 (1994)
[L2] Li, H.: Representation theory and tensor product theory for vertex operator algebras. Ph.D. thesis, Rutgers University, 1994
[MaS] MacWilliams, F.J. and Sloane, N.J.A.: The Theory of Error-Correcting Codes. Amsterdam: Elsevier Science Publishers B.V., 1977; The Theory of Error-Correcting Codes. Amsterdam: North Holland, second reprint 1983
[M1] Miyamoto, M.: Griess algebras and conformal vectors in vertex operator algebras. J. Algebra 179, 523–548 (1996)
[M2] Miyamoto, M.: Binary codes and vertex operator superalgebras. J. Algebra 181, 207–222 (1996)
[M3] Miyamoto, M.: Representation theory of Code VOA and construction of VOAs. Preprint (1996)
[MoS] Moore, G. and Seiberg, N.: Classical and quantum conformal field theory. Commun. Math. Phys. 123, 177–254 (1989)
[Mo] Mossberg, G.: Axiomatic vertex algebras and the Jacobi identity. J. Algebra, 956–1010 (1994)
[Sch] Schellekens, A.N.: Meromorphic c = 24 Conformal Field Theories. Commun. Math. Phys. 153, 159 (1993)
[Sgap] Schönert, M. et al.: GAP: Groups, Algorithms and Programming. Lehrstuhl D für Mathematik, Aachen, 1994
[V] Venkov, B.B.: The classification of integral even unimodular 24-dimensional quadratic forms. Trudy Matem. Inst. i. V. A. Steklova 148, 65–76 (1978)
[W] Wang, W.: Rationality of Virasoro vertex operator algebras. Duke Math. J. IMRN 71, 197–211 (1993)
Communicated by G. Felder
Commun. Math. Phys. 193, 449 – 470 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Horizons Non-Differentiable on a Dense Set
Piotr T. Chruściel1,⋆, Gregory J. Galloway2,⋆⋆
1 Département de Mathématiques, Faculté des Sciences, Parc de Grandmont, F37200 Tours, France. E-mail: [email protected]
2 Department of Mathematics and Computer Science, University of Miami, Coral Gables FL 33124, USA. E-mail: [email protected]
Received: 17 April 1997 / Accepted: 9 September 1997
Abstract: It is folklore knowledge amongst general relativists that horizons are well behaved, continuously differentiable hypersurfaces except perhaps on a negligible subset one need not bother with. We show that this is not the case, by constructing a Cauchy horizon, as well as a black hole event horizon, which contain no open subset on which they are differentiable.
⋆ On leave of absence from the Institute of Mathematics, Polish Academy of Sciences, Warsaw. Supported in part by KBN grant # 2P302 095 06.
⋆⋆ Supported in part by NSF grant # DMS-9204372.
1. Introduction
In various works where Cauchy horizons or event horizons occur it is assumed that those objects have a fair amount of differentiability, except perhaps for a well behaved lower dimensional subset thereof. A standard example is the proof of the area theorem for black holes [7, Proposition 9.2.7], where something close to $C^2$ differentiability “almost everywhere” of the event horizon seems to have been assumed. An extreme example is the rigidity theorem for black holes [5, 7], where it is assumed that the event horizon is an analytic submanifold of space–time. Other examples include discussions of the structure of Cauchy horizons (e.g., [7, p. 295–298], [6, 13, 4, 10, 1]). It seems reasonable to ask whether there are any reasons to expect that the horizons considered in the papers quoted above should have the degree of differentiability assumed there. While we do not have an answer to that question, in this paper we construct examples of horizons whose degree of differentiability is certainly not compatible with the considerations in those papers. Our examples are admittedly very artificial and display various pathological features (e.g., an incomplete null infinity). One would hope that the behavior described here cannot occur in the case of event horizons in “reasonable” asymptotically flat space–times (e.g., in vacuum and maximal globally hyperbolic space–times), or for compact Cauchy horizons in spatially compact space–times. While this expectation might well turn out to be true, the examples here show that we have no a priori reasons to expect that to be correct, and some new insights are necessary in those cases to substantiate such hopes. It is, moreover, clear from our results below that the kind of behaviour we obtain is a generic property of horizons as considered here, in an appropriate $C^{0,1}$ topology on the collection of sets. We shall, however, not attempt to formalize this statement. We note that a criterion for stability of differentiability, for a rather special class of compact Cauchy horizons, has been given in [2].
Let us recall some facts about, say, Cauchy horizons in globally hyperbolic space–times $(M, g)$. Let $\Omega$ be an open subset of a Cauchy surface $\Sigma \subset M$. Then [7, Prop. 6.3.1], [12, Lemma 3.17], $\mathcal{H}^+(\Omega)$ is a Lipschitz topological submanifold of $M$. It then follows from a theorem of Rademacher [3, Theorem 3.1.6] that $\mathcal{H}^+(\Omega)$ is differentiable almost everywhere. (Recall that in local coordinates $x^\mu$ in which $x^0$ is a time function, $\mathcal{H}^+(\Omega)$ can be written as a graph $x^0 = f(x^i)$, and then the “almost everywhere” above refers to the Lebesgue measure $dx^1 \wedge \ldots \wedge dx^n$.) This shows that there is some smoothing associated with Cauchy horizons: consider a set $K$ such that $\partial K$ is a nowhere differentiable curve. (Take, e.g., a Mandelbrot set in $\mathbb{R}^2$, considered as the Cauchy surface for the three-dimensional Minkowski space–time $\mathbb{R}^{2,1}$.) Even in such a “bad” situation, for almost all (in the sense of Lebesgue measure on $\mathbb{R}$) later times $t$ the boundary of $D^+(K)$ intersected with the plane $t = \mathrm{const}$ will be differentiable almost everywhere. It could be the case that some more smoothing occurs, so that perhaps $\mathcal{H}^+(K) \cap \{t = \mathrm{const}\}$ will have nicer differentiability properties than what follows from the above considerations.
The example we construct here shows that this does not happen. More precisely, we show the following:
Theorem 1.1. There exists a connected set $K \subset \mathbb{R}^2 = \{t = 0\} \subset \mathbb{R}^{2,1}$, where $\mathbb{R}^{2,1}$ is the 2 + 1 dimensional Minkowski space–time, with the following properties:
1. The boundary $\partial K = \bar{K} \setminus \operatorname{int} K$ of $K$ is a connected, compact, Lipschitz topological submanifold of $\mathbb{R}^2$. $K$ is the complement of a compact set of $\mathbb{R}^2$.
2. There exists no open set $\Omega \subset \mathbb{R}^{2,1}$ such that $\Omega \cap \mathcal{H}^+(K) \cap \{0 < t < 1\}$ is a differentiable submanifold of $\mathbb{R}^{2,1}$.
Let us show that Theorem 1.1 implies the existence of “nowhere” differentiable (in the sense of point 2 of Theorem 1.1) Cauchy horizons $\mathcal{H}^+(\Sigma)$ in stably causal asymptotically flat space–times, with well behaved space–like surfaces $\Sigma$. (Our example satisfies even the vacuum Einstein equations, as the space–time is flat.) To do that, consider the complement $\mathcal{C}K$ of $K$ in $\mathbb{R}^2$, and consider the set $J^+(\mathcal{C}K)$. As $\mathcal{C}K$ is compact, it follows from global hyperbolicity of $\mathbb{R}^{2,1}$ that $K' \equiv J^+(\mathcal{C}K) \cap \{t = 1\}$ is compact, hence there exists $R$ such that $K'$ is a subset of the open ball $B(0, R)$ of radius $R$ centered at the origin of $\mathbb{R}^2$. Let $N = J^+(\{1\} \times B(0, R))$ and set $\hat{M} = \mathbb{R}^{2,1} \setminus (N \cup \mathcal{C}K)$, equipped with the obvious flat metric coming from $\mathbb{R}^{2,1}$ (cf. Fig. 1). Consider the space–like surface $\Sigma = \{t = -1\} \subset \hat{M}$. It is easily seen that the Cauchy horizon $\mathcal{H}^+(\Sigma; \hat{M})$ of $\Sigma$ in $\hat{M}$ coincides with the intersection of the Cauchy horizon $\mathcal{H}^+(K)$ in $\mathbb{R}^{2,1}$ with the “slab” $\{0 < t < 1\} \subset \mathbb{R}^{2,1}$. Hence there exists no open subset of $\mathcal{H}^+(\Sigma; \hat{M})$ such that this subset is a differentiable submanifold of $\hat{M}$. We note that the construction guarantees global hyperbolicity of $M$.
[Figure 1, image not reproduced. Labels in the figure: “Remove”, $\mathcal{H}^+(\Sigma)$, $N = J^+(\{1\} \times B(0, R))$, $\mathcal{C}K$, $\Sigma = \{t = -1\}$.]
Fig. 1. The space–time $\hat{M}$
Let us show that Theorem 1.1 also implies the existence of black hole space–times with “nowhere” differentiable event horizons (“nowhere” in the sense of point 2 of Theorem 1.1). The construction should be clear from Fig. 2, and can be formally described as follows: Construct first a compact set $K' \subset \mathbb{R}^2 = \{t = 1\} \subset \mathbb{R}^{2,1}$ with the property that the past Cauchy horizon $\partial D^-(K')$ of $K'$ is “nowhere differentiable” for $0 \le t \le 1$. This is easily achieved using the same construction as below, by adding exterior ripples to the disc $B(0, 2)$ of radius 2 centered at 0. Consider the space–time $\tilde{M}$ obtained by removing from the three-dimensional Minkowski space–time the set $K'$ together with $J^-(D^-(K') \cap \{t = 0\})$. Clearly those points in $\tilde{M}$ which are in $D^-(K')$ cannot be seen from the usual future null infinity $\mathscr{I}^+(\tilde{M})$ of $\tilde{M}$ (which coincides with that of $\mathbb{R}^{2,1}$), so that $D^-(K'; \tilde{M})$ lies in the black hole region of $\tilde{M}$. It is also easily seen that the black hole horizon $B$ in $\tilde{M}$ coincides with $\mathcal{H}^-(K') \cap \{0 < t < 1\}$. It follows again that there exists no open subset of $B$ which is a differentiable submanifold of $\tilde{M}$.
[Figure 2, image not reproduced. Labels in the figure: “Black hole region”, $K'$, “Event horizon”, $\{t = 0\}$, $D^-(K') \cap \{t = 0\}$, “Remove”.]
Fig. 2. A flat space–time with a black hole region.
We note that the black–hole space–time $\tilde{M}$ constructed in the last paragraph has a complete $\mathscr{I}^+$ and is stably causal, but is not globally hyperbolic; moreover, the domain of outer communications is not globally hyperbolic either. We could obtain a globally hyperbolic space–time and a globally hyperbolic domain of outer communications and a flat metric by further removing the set $J^+(K') \cap \{t > 1\}$ from $\tilde{M}$, but this would lead to an incomplete $\mathscr{I}^+$. A way of improving global causal properties of the space–time so constructed is to consider the manifold $\tilde{M}$ defined as the interior of $J^-(K'; \mathbb{R}^{1+2})$ with the metric being the flat metric multiplied by a conformal factor which is strictly positive on the interior of $J^-(K'; \mathbb{R}^{1+2})$ and which vanishes on its boundary, in a way such that $\partial J^-(K'; \mathbb{R}^{1+2}) \setminus K'$ becomes the null conformal infinity. Such a construction does not, however, guarantee that the Ricci tensor of the new metric will satisfy any positivity conditions.
This paper is organized as follows. In Sect. 5 we present the construction of $K$. While all the assertions made there can be seen, with some effort, by inspection of the pictures, it seems difficult to give a rigorous proof based on that. For that reason we develop, in the following sections, a framework which allows us to handle the various problems one needs to address in the construction presented in Sect. 5. We believe that the results obtained in those sections are of independent interest, and shed light on various properties of Cauchy horizons. In particular the notion of the distribution of semi-tangents introduced below seems to be rather useful in understanding the structure of horizons. In Sect. 2 we analyze continuity properties of the map which to a set $K \subset \Sigma$, where $\Sigma$ is a Cauchy surface in a globally hyperbolic space–time, assigns its Cauchy horizon $\mathcal{H}^+(K)$, in various topologies. Some related results can be found in [1]. In Sect. 3 we introduce the notion of the distribution of semi-tangents to a Cauchy horizon, which plays a crucial role in our analysis. We also study some properties of that distribution. In Sect. 4 we analyze continuity properties of the map which to a set $K \subset \Sigma$ assigns its distribution of semi-tangents $\mathcal{N}^+(K)$. We show that this map is continuous both in a “clustering” topology and in a Hausdorff-distance topology.
In the concluding section we briefly discuss how our construction generalizes to higher dimensional Minkowski space–times, or, in fact, to any globally hyperbolic space–times.
2. Convergence Properties of Cauchy Horizons
In this section, and subsequent sections, we establish some convergence properties of Cauchy horizons. For this purpose we make use of the notion of Hausdorff convergence. Let $X$ be a metric space with distance function $d$. For any subset $A \subset X$, let $U_\varepsilon(A)$ be the $\varepsilon$-neighbourhood of $A$,
$$U_\varepsilon(A) = \{x \in X : d(x, A) < \varepsilon\}, \qquad \text{where } d(x, A) = \inf_{y \in A} d(x, y).$$
For subsets $A, B \subset X$, the Hausdorff distance between $A$ and $B$, denoted $D(A, B)$, is defined as follows,
$$D(A, B) = \inf\{\varepsilon > 0 : A \subset U_\varepsilon(B) \text{ and } B \subset U_\varepsilon(A)\}.$$
Then, a sequence of subsets $A_n \subset X$ is said to (Hausdorff) converge to a subset $A \subset X$ provided $D(A_n, A) \to 0$ as $n \to \infty$. Note that $D(A_n, A) \to 0$ if and only if
$$\sup_{x \in A_n} d(x, A) \to 0 \quad \text{and} \quad \sup_{y \in A} d(y, A_n) \to 0.$$
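As an illustrative aside that is not part of the original argument, the Hausdorff distance can be evaluated directly for finite point sets, where the infimum and the suprema above become minima and maxima. The Python sketch below assumes finite subsets of the Euclidean plane; the helper names `dist_to_set`, `hausdorff_distance` and `segment_grid` are ours and purely hypothetical.

```python
import math

def dist_to_set(x, A):
    """d(x, A) = min over y in A of d(x, y), for a finite non-empty set A in the plane."""
    return min(math.dist(x, y) for y in A)

def hausdorff_distance(A, B):
    """D(A, B) for finite non-empty sets: the larger of the two one-sided sup-distances."""
    sup_a = max(dist_to_set(x, B) for x in A)   # sup over x in A of d(x, B)
    sup_b = max(dist_to_set(y, A) for y in B)   # sup over y in B of d(y, A)
    return max(sup_a, sup_b)

def segment_grid(n):
    """n + 1 equally spaced points on the segment [0, 1] x {0}."""
    return [(k / n, 0.0) for k in range(n + 1)]

# A fine grid stands in for the limit set A (an assumption made for illustration only).
reference = segment_grid(1000)
for n in (1, 2, 4, 8):
    print(n, hausdorff_distance(segment_grid(n), reference))
```

In this toy setting the printed distances shrink as the grids are refined, which is the finite analogue of $D(A_n, A) \to 0$.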
There is an auxiliary convergence notion that turns out to be useful in our study, as well. Let $\{A_n\}$ be a sequence of subsets of $X$. Then the cluster limit of $A_n$, denoted $\operatorname{c-lim} A_n$, is defined as follows,
$$\operatorname{c-lim} A_n = \{x \in X : \exists \text{ a sequence } x_j \in A_{n_j} \text{ such that } n_j \to \infty \text{ and } \lim_{j \to \infty} x_j = x\}.$$
Proposition 2.1. $\operatorname{c-lim} A_n$ is closed.
Proof. Assume $x_j \in \operatorname{c-lim} A_n$ and $x_j \to x$. Now, $x_j \in \operatorname{c-lim} A_n$ implies that there exist $x_{jk} \to x_j$ (as $k \to \infty$), $x_{jk} \in A_{n(j,k)}$, where, for each $j$, $n(j, k) \to \infty$ as $k \to \infty$. For each $j$ there exists $K(j)$ such that
$$d(x_{jk}, x_j) < \frac{1}{j} \quad \text{for all } k \ge K(j).$$
Hence, we can construct a sequence $\{K(j)\}$ so that $n(j, K(j)) \to \infty$ as $j \to \infty$ and
$$d(y_j, x_j) < \frac{1}{j},$$
where $y_j = x_{jK(j)} \in A_{n(j,K(j))}$. Hence, $\lim_{j \to \infty} y_j = \lim_{j \to \infty} x_j = x$. Thus, $x \in \operatorname{c-lim} A_n$.
Under fairly general circumstances, Hausdorff convergence implies cluster limit convergence.
Proposition 2.2. If $A \subset X$ is closed and if $D(A_n, A) \to 0$, then $\operatorname{c-lim} A_n = A$.
Proof. $\operatorname{c-lim} A_n \subset A$: Let $x \in \operatorname{c-lim} A_n$. Then there exist $x_j \in A_{n_j}$ such that $x = \lim x_j$. We have that $\sup_{y \in A_n} d(y, A) \to 0$. Hence, $d(x_j, A) \to 0$. Since $z \mapsto d(z, A)$ is continuous, $d(x, A) = \lim_{j \to \infty} d(x_j, A) = 0$. Thus, since $A$ is closed, $x \in A$.
$A \subset \operatorname{c-lim} A_n$: $D(A_n, A) \to 0$ implies that $\sup_{y \in A} d(y, A_n) \to 0$. Hence for any $x \in A$, $d(x, A_n) \to 0$. It follows that there exists a sequence $y_n \in A_n$ such that $d(x, y_n) \to 0$, i.e., $x = \lim_{n \to \infty} y_n \in \operatorname{c-lim} A_n$.
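The converse direction is not claimed, and a simple example of our own (not taken from the paper) suggests why: for the one-point sets $A_n = \{(-1)^n\} \subset \mathbb{R}$ the cluster limit is $\{-1, 1\}$, yet $D(A_n, \{-1, 1\}) = 2$ for every $n$. A minimal Python sketch, with hypothetical helper names, checks the second statement numerically.

```python
def hausdorff_1d(A, B):
    """Hausdorff distance between finite non-empty subsets of the real line."""
    def d(x, S):
        return min(abs(x - y) for y in S)
    return max(max(d(x, B) for x in A), max(d(y, A) for y in B))

def A(n):
    """The one-point set A_n = {(-1)^n}."""
    return [(-1.0) ** n]

# Both -1 and +1 are limits along subsequences (odd n, even n), so both lie in c-lim A_n.
cluster_limit = [-1.0, 1.0]
gaps = [hausdorff_1d(A(n), cluster_limit) for n in range(1, 9)]
print(gaps)   # the Hausdorff gap to the cluster limit does not shrink
```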
We will have occasion to make use of the following simple fact concerning Hausdorff distance.
Proposition 2.3. Let $(X, d)$ be a locally compact metric space such that all distance balls $\{y \in X : d(x, y) < r\}$ are connected (e.g. a connected Riemannian manifold). Suppose we are given subsets $A_n, A \subset X$ such that $D(A_n, A) \to 0$ and $D(\partial A_n, \partial A) \to 0$. Let $C$ be a compact subset of $X$.
(a) If $C \subset \operatorname{int} A$ then $C \subset \operatorname{int} A_n$ for all $n$ sufficiently large.
(b) If $C \cap A = \emptyset$ then $C \cap A_n = \emptyset$ for all $n$ sufficiently large.
Proof. We prove only (a), as the proof of (b) is similar. To prove (a), it is sufficient to consider the case in which $C$ is a closed metric ball, $C = \{x : d(x_0, x) \le r\}$. The proof for general $C$ follows by taking an appropriate finite cover of $C$. Hence, in particular, we may assume $C$ is connected. Since $C$ is compact and contained in the interior of $A$, we have that $\inf_{x \in C} d(x, \partial A) = \delta > 0$. The convergence $D(\partial A_n, \partial A) \to 0$ then implies that $\inf_{x \in C} d(x, \partial A_n) \ge \delta/2$ for all $n$ sufficiently large. Since we are assuming $C$ is connected, it follows that for all $n$ sufficiently large, $C \subset \operatorname{int} A_n$ or $C \subset X \setminus A_n$. If $C \subset X \setminus A_n$ for some $n$ then $\inf_{x \in C} d(x, A_n) \ge \delta/2$. Since $D(A_n, A) \to 0$, this cannot happen for infinitely many $n$. Hence, $C \subset \operatorname{int} A_n$ for all $n$ sufficiently large.
We now pass to the space–time setting. Let $(M, g)$ be a globally hyperbolic space–time, and $\Sigma$ be a smooth space–like Cauchy surface for $M$. We wish to emphasize that we shall assume global hyperbolicity throughout. Fix a future directed time–like vector field $T$ on $M$. Introduce a complete Riemannian metric $h$ on $M$. Let $d$ be the associated distance function, and let $D$ be the associated Hausdorff distance on subsets of $M$. Normalize $T$ so that $T$ has unit length with respect to $h$. Each integral curve of $T$ meets $\Sigma$ once and only once. Let $\Pi : M \to \Sigma$ be the projection onto $\Sigma$ along the integral curves of $T$, i.e., $\Pi(p)$ is the point in $\Sigma$ where the integral curve of $T$ through $p$ meets $\Sigma$. Let $\Psi : M \to \mathbb{R} \times \Sigma$ be the diffeomorphism determined by the integral curves of $T$, i.e., $\Psi(p) = (t, x)$, where $x = \Pi(p)$ and $t = t(p)$ is the signed $h$-length of the integral curve of $T$ from $x$ to $p$. When convenient, we will suppress the map $\Psi$ and simply identify $M$ with $\mathbb{R} \times \Sigma$. Thus, for example, by the graph of a function $u : \Sigma \to \mathbb{R}$ we mean the following subset of $M$,
$$\operatorname{graph} u = \{(u(x), x) \in M : x \in \Sigma\}.$$
We now consider the Cauchy horizons of certain subsets of $\Sigma$. For any subset $A \subset \Sigma$, we let $\partial A = A \setminus \operatorname{int} A$ denote the boundary of $A$ in $\Sigma$. Let $\hat{\mathcal{X}}$ be the following collection of subsets of $\Sigma$,
$$\hat{\mathcal{X}} = \{A \subset \Sigma : A \text{ is the closure of an open set in } \Sigma,\ A \text{ is connected and } \partial A \text{ is a topological submanifold of co-dimension one in } \Sigma\}.$$
For any $A \in \hat{\mathcal{X}}$, the future Cauchy horizon $\mathcal{H}^+(A)$ is an achronal Lipschitz hypersurface in $M$ such that $\operatorname{edge} \mathcal{H}^+(A) = \operatorname{edge} A = \partial A$ [12, Propositions 5.8 and 5.11]. Each integral curve of $T$ meets $\mathcal{H}^+(A)$ at most once. Henceforth it will be convenient to define a slightly more restricted class of sets $\mathcal{X}$ by
$$\mathcal{X} = \{A \in \hat{\mathcal{X}} : \text{each integral curve of } T \text{ that meets } A \text{ also meets } \mathcal{H}^+(A)\}. \qquad (2.1)$$
In this case $\mathcal{H}^+(A)$ may be viewed as a graph over $A$. More precisely, there exists a locally Lipschitz function $u : A \to \mathbb{R}$ such that
$$\mathcal{H}^+(A) = \{(u(x), x) \in M : x \in A\}.$$
For the proof of the following lemma, let $\Sigma_A$ denote the achronal surface obtained by “gluing together” $\mathcal{H}^+(A)$ and $\Sigma \setminus A$ along $\partial A$. Then by setting $u(x) = 0$ for all $x \in \Sigma \setminus A$, we obtain a locally Lipschitz function $u : \Sigma \to \mathbb{R}$ such that
$$\Sigma_A = \{(u(x), x) \in M : x \in \Sigma\}.$$
Lemma 2.4. Let $A_n, A \in \mathcal{X}$ be such that $D(A_n, A) \to 0$ and $D(\partial A_n, \partial A) \to 0$. Then for each compact set $C \subset A$,
$$D(\mathcal{H}^+(A_n) \cap \Pi^{-1}(C),\ \mathcal{H}^+(A) \cap \Pi^{-1}(C)) \to 0.$$
Proof. There exist locally Lipschitz functions $u_n : \Sigma \to \mathbb{R}$, $n = 1, 2, \ldots$, and $u : \Sigma \to \mathbb{R}$ such that $\Sigma_{A_n} = \operatorname{graph} u_n$ and $\Sigma_A = \operatorname{graph} u$. To prove the lemma, it is sufficient to show that $u_n \to u$ pointwise, for then $u_n \to u$ uniformly on $C$, and the lemma easily follows.
First suppose $x \in \Sigma \setminus A$, so that $u(x) = 0$. Then, by Proposition 2.3, $x \in \Sigma \setminus A_n$ for all $n$ sufficiently large, and hence $u_n(x) = 0$ for all such $n$. Thus, $\lim_{n \to \infty} u_n(x) = 0 = u(x)$.
Now suppose $x \in \operatorname{int} A$, so that $(u(x), x) \in \mathcal{H}^+(A) \setminus \operatorname{edge} A$ and $u(x) > 0$. For any $\varepsilon > 0$ sufficiently small, $q = (u(x) - \varepsilon, x) \in \operatorname{int} D^+(A)$. Hence $J^-(q) \cap \Sigma$ is compact and contained in $\operatorname{int} A$. Thus, by Proposition 2.3, $J^-(q) \cap \Sigma$ is contained in $A_n$ for all $n$ sufficiently large. It follows that $q \in D^+(A_n)$ for all such $n$. Hence, $(u(x) - \varepsilon, x)$ lies to the past of $(u_n(x), x) \in \mathcal{H}^+(A_n)$ along the integral curve of $T$ through $x$. This implies that $u(x) - \varepsilon \le u_n(x)$. A similar argument shows that $u_n(x) \le u(x) + \varepsilon$. Combining these inequalities we obtain $|u(x) - u_n(x)| \le \varepsilon$ for all $n$ sufficiently large. Hence, $\lim_{n \to \infty} u_n(x) = u(x)$.
Finally, suppose $x \in \partial A$. Since each hypersurface $\Sigma_{A_n}$ is achronal, there exists a neighbourhood $W$ of $x$ in $\Sigma$ and a uniform Lipschitz constant $K$ such that for all $n$ and all $x_1, x_2 \in W$,
$$|u_n(x_2) - u_n(x_1)| \le K \rho(x_1, x_2), \qquad (2.2)$$
where $\rho$ is the distance function in $\Sigma$ determined by the induced metric on $\Sigma$. The condition $D(\partial A_n, \partial A) \to 0$ implies that there exists a sequence of points $y_n \in \partial A_n$ such that $y_n \to x$. Then (2.2) implies
$$|u_n(x)| = |u_n(x) - u_n(y_n)| \le K \rho(y_n, x).$$
Since $\rho(y_n, x) \to 0$, it follows that $\lim_{n \to \infty} u_n(x) = 0 = u(x)$.
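The convergence $u_n \to u$ can be illustrated in the simplest flat setting. The sketch below is our own toy check, not the authors' construction: it assumes 1 + 1 dimensional Minkowski space–time with null lines of slope $\pm 1$, where for an interval $A = [c - 2, c + 2]$ the glued surface $\Sigma_A$ is the graph of the “tent” $u(x) = \max(0, 2 - |x - c|)$. The function names `tent` and `sup_gap` are hypothetical.

```python
def tent(x, centre, half_width=2.0):
    """Height of Sigma_A over x for A = [centre - half_width, centre + half_width] (flat 1+1 case)."""
    return max(0.0, half_width - abs(x - centre))

def sup_gap(n, grid):
    """max over the grid of |u_n(x) - u(x)|, with A = [-2, 2] and A_n = [-2 - 1/n, 2 - 1/n]."""
    return max(abs(tent(x, -1.0 / n) - tent(x, 0.0)) for x in grid)

# Grid on the compact set C = [-1, 1], which is contained in A.
C = [-1.0 + k / 500 for k in range(1001)]
for n in (1, 10, 100):
    print(n, sup_gap(n, C))
```

Under these assumptions the maximal gap over $C$ equals $1/n$ up to rounding, so the graphs over $C$ approach each other uniformly, as in the lemma.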
Remarks. We note that every subset $B$ of $\mathcal{H}^+(A)$ is of the form $\mathcal{H}^+(A) \cap \Pi^{-1}(C)$, where $C = \Pi(B)$. In particular, if $B$ is compact, $C$ is compact. We note also that Lemma 2.4 remains valid if $C$ is merely assumed to be a subset of $A$ with compact closure.
The following corollary is an immediate consequence of Proposition 2.2 and Lemma 2.4.
Corollary 2.5. Consider $A_n, A \in \mathcal{X}$. Let $C \subset A$ be compact. If $D(A_n, A) \to 0$ and $D(\partial A_n, \partial A) \to 0$ then
$$\operatorname{c-lim}[\mathcal{H}^+(A_n) \cap \Pi^{-1}(C)] = \mathcal{H}^+(A) \cap \Pi^{-1}(C).$$
The next lemma to be presented, which we refer to as the localization lemma, shows that, under suitable circumstances, “localized” changes in the boundary $\partial A$ produce only “localized” changes in $\mathcal{H}^+(A)$. The proof is a consequence of the following simple fact which will arise in other situations, as well. We use the convention that $D^+(A)$, for an achronal set $A$, is the collection of all points $p$ such that every past inextendible time–like curve through $p$ intersects $A$.
Proposition 2.6. Let $A$ be a closed achronal subset of a space–time $M$. Let $\eta$ be a future directed causal curve from $p \in \operatorname{edge} A$ to $q \in D^+(A) \setminus \operatorname{edge} A$. Then $\eta$ is contained in $\mathcal{H}^+(A)$ and hence is a null geodesic generator (or segment thereof) of $\mathcal{H}^+(A)$.
Proof. $q \in D^+(A)$ forces $\eta$ to be contained in $D^+(A)$. If $\eta$ were to enter the interior of $D^+(A)$ then $p$ would have to be a point in $A \setminus \operatorname{edge} A$ (see, e.g. [12, Prop. 5.16, p. 45]). It follows that $\eta \subset \mathcal{H}^+(A)$. Moreover, to avoid an achronality violation of $\mathcal{H}^+(A)$, $\eta$ must be a null geodesic segment.
Lemma 2.7 (The Localization Lemma). Consider $A, \hat{A} \in \mathcal{X}$, with $A \subset \hat{A}$. For any subset $\partial_0 A$ of $\partial A$, let
$$\mathcal{H}_0^+(A) = \{x \in \mathcal{H}^+(A) : x \text{ lies on a null geodesic generator with past end point on } \partial_0 A\}.$$
If $\partial_0 A \subset \partial \hat{A}$ then $\mathcal{H}_0^+(A) \subset \mathcal{H}^+(\hat{A})$.
Proof. Let $y \in \mathcal{H}_0^+(A)$. If $y \in \partial_0 A$, there is nothing to show. Hence, we may assume $y \in \mathcal{H}_0^+(A) \setminus \partial A$. Let $\eta$ be a null geodesic generator of $\mathcal{H}^+(A)$ from $x \in \partial_0 A$ to $y$. Since $A \subset \hat{A}$, $D^+(A) \subset D^+(\hat{A})$. Hence, $\eta$ is a null geodesic segment from $x \in \partial \hat{A} = \operatorname{edge} \hat{A}$ to $y \in D^+(\hat{A})$. By Proposition 2.6, $\eta$ is a null geodesic generator of $\mathcal{H}^+(\hat{A})$, and hence $y \in \mathcal{H}^+(\hat{A})$.
3. Semi-tangents
We introduce the notion of semi-tangents. Let $A \in \mathcal{X}$. A semi-tangent of $\mathcal{H}^+(A)$ is a vector $X \in TM$ based at a point $p \in \mathcal{H}^+(A) \setminus \partial A$, which is tangent to a null generator $\eta$ of $\mathcal{H}^+(A)$, is past-directed, and is normalized so that $h(X, X) = 1$.¹ Let $\mathcal{N}_p^+(A)$ denote the collection of semi-tangents of $\mathcal{H}^+(A)$ based at $p$, and let $\mathcal{N}^+(A)$ denote the collection of all semi-tangents of $\mathcal{H}^+(A)$. Note that $\mathcal{N}^+(A) \subset UM$, where $UM$ is the unit tangent bundle of $M$ with respect to the Riemannian metric $h$. If we let $M_A = M \setminus \partial A$, then $M_A$ is an open submanifold of $M$ and $UM_A$ is an open sub–bundle of $UM$. Since, in the definition of $\mathcal{N}^+(A)$, vectors based at $\partial A$ are excluded, we have that $\mathcal{N}^+(A) \subset UM_A$.
Lemma 3.1. $\mathcal{N}^+(A)$ is closed in $UM_A$.
Proof. Let $X_n \in \mathcal{N}^+(A)$ such that $X_n \to X$ in $UM_A$. Let $X_n$ be based at the point $p_n \in \mathcal{H}^+(A)$ and $X$ be based at $p$. Since $p_n \to p$, $p \in \mathcal{H}^+(A) \setminus \partial A$. Let $\eta_n$ be the past directed null generator of $\mathcal{H}^+(A)$ starting at $p_n$ with initial tangent $X_n$. Let $q_n \in \partial A$ be the past end point of $\eta_n$.
The $\eta_n$’s converge, as geodesics, to a past directed null geodesic $\eta$ which starts at $p$ with initial tangent $X$. $\eta$ will meet $\Sigma$ at a point $q \in D^+(A)$. Convergence properties of geodesics guarantee that $q_n \to q$. Hence, $q \in \partial A$. Thus $\eta$ is a null geodesic segment from $p \in \mathcal{H}^+(A)$ to $q \in \partial A$. By Proposition 2.6, this forces $\eta$ to be a null geodesic generator of $\mathcal{H}^+(A)$. Thus, $X \in \mathcal{N}^+(A)$.
¹ A more elegant and essentially equivalent approach would be not to impose the condition $h(X, X) = 1$. That would avoid the usage of an auxiliary Riemannian metric $h$. The current definition is more convenient for technical reasons, as it fixes the parametrisation of the null geodesics under consideration.
We now establish some connections between the distribution of semi-tangents and the regularity of Cauchy horizons. Lemma 3.1 may be used to show that the distribution of semi-tangents is continuous at interior points (non-end points) of null generators. For $A \in \mathcal{X}$, consider the set,
$$S = \{x \in \mathcal{H}^+(A) : x \text{ is an interior point of a null generator of } \mathcal{H}^+(A)\},$$
endowed with the subspace topology. We note the following well known result:
Lemma 3.2. At each point $p \in S$ there exists a unique semi-tangent $X(p) \in \mathcal{N}^+(A)$.
Proof. Indeed, if this were not the case, there would be one null generator meeting another at an interior point. Then, traveling around the corner at which they meet would produce an achronality violation.
Lemma 3.2 shows that there is a well-defined map $\Psi : S \to \mathcal{N}^+(A)$ defined by $\Psi(p) = X(p)$. If we endow $\mathcal{N}^+(A) \subset TM$ with the subspace topology, we obtain:
Proposition 3.3. $\Psi : S \to \mathcal{N}^+(A)$ is continuous.
Proof. Given $p \in S$, consider any sequence $\{p_n\} \subset S$ such that $p_n \to p$. Let $X_n = \Psi(p_n)$ and $X = \Psi(p)$. We want to show that $X_n \to X$. Indeed, since the $X_n$’s are $h$-unit vectors, there exists a subsequence, still denoted $X_n$, that converges to an $h$-unit vector $Y$. By Lemma 3.1, $Y$ is a semi-tangent based at the interior point $p$. But since semi-tangents at interior points are unique, we must have $Y = X$. As the same argument applies to any subsequence of $\{X_n\}$, the whole sequence converges to $X$, which had to be established.
The interest in the distribution of semi-tangents lies in the fact that it encodes most, if not all, information about differentiability of $\mathcal{H}^+(A)$. Indeed, at any point of $\mathcal{H}^+(A)$ where there is more than one semi-tangent, the Cauchy horizon must fail to be differentiable.
Proposition 3.4. Consider the Cauchy horizon $\mathcal{H}^+(A)$, $A \in \mathcal{X}$. If there exist two distinct semi-tangents at $p \in \mathcal{H}^+(A) \setminus \partial A$ then $\mathcal{H}^+(A)$ is not differentiable at $p$.
The heuristic idea is clear. If $\mathcal{H}^+(A)$ were differentiable at $p$ it would have a tangent hyper-plane $\Pi_p$ at $p$. Since $\mathcal{H}^+(A)$ is achronal and generated by null geodesics, $\Pi_p$ should be a null hyper-plane. On the other hand, $\Pi_p$ should contain the two distinct semi-tangents at $p$, forcing $\Pi_p$ to be a time–like hyper-plane. The proof presented below carries this argument out rigorously.
Proof. Suppose to the contrary that $\mathcal{H}^+(A)$ is differentiable at $p$. Introduce geodesic normal coordinates $(x^0, x^1, \ldots, x^n) = (x^0, \mathbf{x})$ centered at $p$, defined in a neighborhood $\mathcal{U}$ of $p$ such that $\left.\frac{\partial}{\partial x^0}\right|_p$ is future pointing time–like. Choose $\mathcal{U}$ sufficiently small so that $\frac{\partial}{\partial x^0}$ is time–like on $\mathcal{U}$. Any geodesic $\gamma$ through $p$, when expressed in these coordinates, takes the form
$$\gamma : x^\mu(s) = \lambda^\mu s, \qquad \lambda^\mu \in \mathbb{R}, \quad \mu = 0, \ldots, n.$$
$\gamma$ is time–like, space–like, null if and only if $-|\lambda^0|^2 + |\boldsymbol{\lambda}|^2 < 0$, $> 0$, $= 0$, respectively. It follows that the future (past) null cone at $p$ is described by the graph $x^0 = |\mathbf{x}|$ ($x^0 = -|\mathbf{x}|$). Since $\mathcal{H}^+(A)$ is achronal, it can be expressed in $\mathcal{U}$ as a graph over the slice $\mathcal{V} = \{x^0 = 0\}$,
$$\mathcal{H}^+(A) \cap \mathcal{U} = \{x^0 = f(\mathbf{x}),\ \mathbf{x} \in \mathcal{V}\}.$$
$\mathcal{H}^+(A)$ is differentiable at $p$ if and only if the function $f(\mathbf{x})$ is differentiable at $0$, i.e. iff there exists a vector $C = (C^1, \ldots, C^n)$ such that
$$\lim_{\mathbf{x} \to 0} \frac{f(\mathbf{x}) - C \cdot \mathbf{x}}{|\mathbf{x}|} = 0. \qquad (3.1)$$
Claim. $|C| \le 1$. Otherwise, as we show, the achronality of $\mathcal{H}^+(A)$ is violated. Suppose, then, that $|C| > 1$. By (3.1), with $\mathbf{x} = tC$, we have that for any $\varepsilon > 0$ there exists $t_0 > 0$ such that for all $t \in (0, t_0)$,
$$\frac{|f(tC) - |C|^2 t|}{|C| t} < \varepsilon,$$
which implies
$$f(tC) > t|C|(|C| - \varepsilon) > t|C|,$$
provided $\varepsilon$ is sufficiently small. But this implies that $\mathcal{H}^+(A)$ enters into the future null cone at $p$, which violates the achronality of $\mathcal{H}^+(A)$.
Let $A = (A^0, \mathbf{A})$ and $B = (B^0, \mathbf{B})$ be any two semi-tangents at $p$, scaled so that $|\mathbf{A}| = |\mathbf{B}| = 1$. Since $A$ and $B$ are past directed null, $A^0 = B^0 = -1$. Then $s \to (sA^0, s\mathbf{A})$ and $s \to (sB^0, s\mathbf{B})$ are past directed null geodesic generators of $\mathcal{H}^+(A)$. It follows that $f(s\mathbf{A}) = sA^0 = -s$, and $f(s\mathbf{B}) = -s$. Setting $\mathbf{x} = s\mathbf{A}$ in (3.1) gives
$$\lim_{s \to 0} \frac{s + s\, C \cdot \mathbf{A}}{s} = 0,$$
which implies that
$$-1 = C \cdot \mathbf{A} = |C| \cos\theta.$$
Hence $\cos\theta = -\frac{1}{|C|} \le -1$, forcing $\theta = \pi$ and $|C| = 1$. Thus, $\mathbf{A} = -C$. But the same argument also shows that $\mathbf{B} = -C$. Hence, $\mathbf{A} = \mathbf{B}$, which contradicts the assumption that there are distinct semi-tangents at $p$.
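A concrete instance of Proposition 3.4, offered as an illustration of our own rather than an example from the paper, is the square $K = [-1, 1]^2 \subset \{t = 0\}$ in 2 + 1 dimensional Minkowski space–time. We assume the standard flat-space description of the horizon over $K$ as the graph of the distance to $\partial K$, that is $t = u(x, y) = 1 - \max(|x|, |y|)$, with generators running from a horizon point to the nearest boundary points. On the diagonals two sides are equally near, so two semi-tangents meet and the one-sided slopes of $u$ disagree. The helper names below are hypothetical.

```python
def u(x, y):
    """Horizon height over (x, y) for K = [-1, 1]^2: distance from (x, y) to the boundary of K."""
    return 1.0 - max(abs(x), abs(y))

def nearest_sides(x, y, tol=1e-12):
    """Sides of the square realising the distance to the boundary (one generator per side)."""
    gaps = {"right": 1.0 - x, "left": 1.0 + x, "top": 1.0 - y, "bottom": 1.0 + y}
    best = min(gaps.values())
    return sorted(side for side, g in gaps.items() if g - best <= tol)

def one_sided_slopes(x, y, h=1e-6):
    """Left and right difference quotients of u in the x-direction."""
    left = (u(x, y) - u(x - h, y)) / h
    right = (u(x + h, y) - u(x, y)) / h
    return left, right

print(nearest_sides(0.5, 0.5), one_sided_slopes(0.5, 0.5))   # two generators, slopes differ
print(nearest_sides(0.5, 0.0), one_sided_slopes(0.5, 0.0))   # one generator, slopes agree
```

At $(0.5, 0.5)$ the left and right slopes are approximately $0$ and $-1$, so no vector $C$ can satisfy (3.1) there, whereas at $(0.5, 0)$ both slopes are approximately $-1$.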
Proposition 3.4 shows what happens at those points at which $\mathcal{N}^+(A)$ consists of more than one element. We do not know what happens at all points at which $\mathcal{N}^+(A)$ contains only one vector (and it would be of interest to settle that question).² However, when those points are interior points of generators, we have the following result.
² Added in proof: This has been recently settled by Beem and Królak (preprint), who show that the horizon has a tangent plane at such points.
Proposition 3.5. At interior points of null generators the Cauchy horizon $\mathcal{H}^+(A)$ is differentiable.
Related results concerning the regularity of Cauchy horizons (or, more generally, achronal sets) at interior points of generators have been obtained by I. Major [8] and K. Newman [9].
Proof. Let $\gamma$ be a null generator of $\mathcal{H}^+(A)$ and let $p$ be any of its interior points. Let $\mathcal{U}$, $x^\mu$, $f$, etc., be as in the proof of Proposition 3.4. Note that any point $p = (t, \mathbf{x}) \in \mathcal{U} \cap D^+(A)$ satisfies $t \le f(\mathbf{x})$. Similarly if $p = (t, \mathbf{x}) \in \mathcal{U}$ is not in $D^+(A)$ then $t \ge f(\mathbf{x})$. We can find $q^+ \in \gamma \cap J^+(p)$ such that, passing to a subset of $\mathcal{U}$ if necessary, $J^-(q^+)$ is a smooth submanifold of $\mathcal{U}$, given as a graph $x^0 = f_+(\mathbf{x})$. Similarly we can find $q^- \in \gamma \cap J^-(p)$ such that $J^+(q^-)$ is a smooth submanifold of $\mathcal{U}$, given as a graph $x^0 = f_-(\mathbf{x})$. As every point to the past of $q^+$ lies inside of $D^+(A)$ we have $f_+(\mathbf{x}) \le f(\mathbf{x})$. Similarly every point to the future of $q^-$ lies outside of $D^+(A)$, so that we have $f(\mathbf{x}) \le f_-(\mathbf{x})$.
At the origin we have $f(0) = f_+(0) = f_-(0)$. Moreover $J^+(q^-)$ and $J^-(q^+)$ are tangent there, so that there exists a vector $C$ such that $C = df_+(0) = df_-(0)$. It follows that
$$C \cdot \mathbf{x} + o(|\mathbf{x}|) = f_+(\mathbf{x}) \le f(\mathbf{x}) \le f_-(\mathbf{x}) = C \cdot \mathbf{x} + o(|\mathbf{x}|),$$
which establishes (3.1) and shows differentiability of $\mathcal{H}^+(A)$ at $p$.
4. Convergence in the Distribution of Semi-tangents
In this section we establish some further properties of the distribution of semi-tangents. We retain all the notational conventions from previous sections. In particular, here as elsewhere, global hyperbolicity of the space–time is assumed. Let $\hat{\Pi} = \Pi \circ P : UM \to \Sigma$, where $P : UM \to M$ is the natural projection map. Thus, if $C \subset \Sigma$, $\hat{\Pi}^{-1}(C)$ is the set of all $h$-unit vectors over $\Pi^{-1}(C) \subset M$.
Lemma 4.1. Consider $A_n, A \in \mathcal{X}$ such that $D(A_n, A)$ and $D(\partial A_n, \partial A)$ tend to zero as $n$ tends to infinity.
(a) If $C$ is compact and $C \subset \operatorname{int} A$ then
$$\operatorname{c-lim}[\mathcal{N}^+(A_n) \cap \hat{\Pi}^{-1}(C)] \subset \mathcal{N}^+(A) \cap \hat{\Pi}^{-1}(C). \qquad (4.1)$$
(b) If $C$ is open (as a subset of $\Sigma$) and $C \subset A$ then
$$\mathcal{N}^+(A) \cap \hat{\Pi}^{-1}(C) \subset \operatorname{c-lim}[\mathcal{N}^+(A_n) \cap \hat{\Pi}^{-1}(C)]. \qquad (4.2)$$
Remark. Following the proof of Lemma 4.1 we present an example which shows that the inclusion in (4.1) (resp. (4.2)) can be strict (even if, in (4.2), one takes the closure of $\mathcal{N}^+(A) \cap \hat{\Pi}^{-1}(C)$).
Proof. (a): Let $X_j \in \mathcal{N}^+(A_{n_j}) \cap \hat{\Pi}^{-1}(C)$ such that $X_j \to X$ in $UM$. Since $\hat{\Pi}^{-1}(C)$ is closed, $X \in \hat{\Pi}^{-1}(C)$. Let $X_j$ be based at $p_j \in \mathcal{H}^+(A_{n_j})$, and let $X$ be based at $p$. Let $\eta_j$ be the past directed null geodesic generator of $\mathcal{H}^+(A_{n_j})$ with initial point $p_j$ and initial tangent $X_j$, and let $q_j \in \partial A_{n_j}$ be its past end point. Let $\eta$ be the past directed null geodesic with initial point $p$ and initial tangent $X$. Since $p_j \to p$, Corollary 2.5 implies that $p \in \mathcal{H}^+(A) \setminus \partial A$. Then $\eta$ meets $\Sigma$ at some point $q \in A$. Since $\eta_j \to \eta$ in the sense of geodesics it follows that $q_j \to q$. The assumption $D(\partial A_n, \partial A) \to 0$ implies that $q \in \partial A$. Hence, by Proposition 2.6, $\eta$ is a null geodesic generator of $\mathcal{H}^+(A)$, and thus $X \in \mathcal{N}^+(A) \cap \hat{\Pi}^{-1}(C)$.
(b): Consider $X \in \mathcal{N}^+(A) \cap \hat{\Pi}^{-1}(C)$ based at $p \in \mathcal{H}^+(A) \cap \Pi^{-1}(C)$ (in particular, $p \notin \partial A$). Let $\eta$ be the past directed null generator of $\mathcal{H}^+(A)$ starting at $p$ with initial tangent $X$, and let $q \in \partial A$ be the past end point of $\eta$. Let $y \in \mathcal{H}^+(A) \cap \Pi^{-1}(C)$ be an interior point of $\eta$. Since $C$ is open, $y$ can be chosen arbitrarily close to $p$. By Corollary 2.5 (with $C = \{\Pi(y)\}$) there exists $y_j \in \mathcal{H}^+(A_{n_j}) \cap \Pi^{-1}(C)$ such that $y_j \to y$. Let $\mu_j$ be a past directed null generator of $\mathcal{H}^+(A_{n_j})$ starting at $y_j$ with initial tangent $Y_j \in \mathcal{N}^+(A_{n_j})$. Assume $\mu_j$ meets $\partial A_{n_j}$ at $z_j$. For any $p' \in I^+(p)$, the sequence $\{z_j\}$ eventually enters the set $J^-(p') \cap \Sigma$. Since, by the global hyperbolicity of $M$, this set is compact, $\{z_j\}$ has an accumulation point $z$. Proposition 2.2 implies that $z \in \partial A$. Thus, by passing to a subsequence if necessary, we may assume the $\mu_j$ converge to a null geodesic $\mu$
with future end point $y$ and past end point $z$. Proposition 2.6 implies that $\mu$ is a null geodesic generator of $\mathcal{H}^+(A)$. In order to avoid there being two distinct semi-tangents at the interior point $y$, $\mu$ must coincide with $\eta$, cf. Lemma 3.2. Hence, the tangent of $\eta$ at $y$ coincides with the tangent of $\mu$ at $y$, which belongs to $\operatorname{c-lim} \mathcal{N}^+(A_n)$. Since $y$ can be chosen arbitrarily close to $p$, Proposition 2.1 implies that $X \in \operatorname{c-lim} \mathcal{N}^+(A_n)$.
Example 4.2. Consider Minkowski 2-space, $M = \mathbb{R}^{1,1}$, with metric $ds^2 = dx^2 - dy^2$. Let $\Sigma$ be the slice $y = 0$. Set
$$A = \{(x, 0) : x \in (-2, 2)\}, \qquad A_n = \{(x, 0) : x \in (-2 - \tfrac{1}{n}, 2 - \tfrac{1}{n})\}.$$
If $C = \{(x, 0) : x \in [0, 1]\}$, the inclusion (4.1) is strict. If $C = \{(x, 0) : x \in (-1, 0)\}$, the inclusion (4.2) is strict (even if one takes the closure of $\mathcal{N}^+(A) \cap \hat{\Pi}^{-1}(C)$).
We now wish to consider the Hausdorff convergence of certain subsets of the distribution of semi-tangents. For this purpose, introduce a complete Riemannian metric on the $h$-unit tangent bundle $UM$, and let $\rho$ denote the associated distance function. With respect to subsets $V, W \subset UM$, let $D(V, W)$ denote the Hausdorff distance between $V$ and $W$ with respect to the distance function $\rho$.
Given $A_n, A \in \mathcal{X}$ such that $D(A_n, A) \to 0$ and $D(\partial A_n, \partial A) \to 0$, one might have hoped that, for reasonable subsets $C \subset \operatorname{int} A$ (e.g. $C$ compact or open with compact closure), $\mathcal{N}^+(A_n) \cap \hat{\Pi}^{-1}(C)$ Hausdorff converges to $\mathcal{N}^+(A) \cap \hat{\Pi}^{-1}(C)$, i.e., $D(\mathcal{N}^+(A_n) \cap \hat{\Pi}^{-1}(C), \mathcal{N}^+(A) \cap \hat{\Pi}^{-1}(C)) \to 0$ as $n \to \infty$. But Example 4.2 shows that this will not be the case in general. However, it is possible to establish a slightly weaker result that is sufficient for our purposes.
Theorem 4.3. Consider $A_n, A \in \mathcal{X}$ such that $D(A_n, A)$ and $D(\partial A_n, \partial A)$ tend to zero as $n$ tends to infinity.
(a) If $C$ is compact and $C \subset \operatorname{int} A$ then
$$\sup_{X \in \mathcal{N}^+(A_n) \cap \hat{\Pi}^{-1}(C)} \rho(X, \mathcal{N}^+(A) \cap \hat{\Pi}^{-1}(C)) \to 0 \quad \text{as} \quad n \to \infty.$$
(b) If $C$ is open with compact closure and $C \subset \operatorname{int} A$ then
$$\sup_{Y \in \mathcal{N}^+(A) \cap \hat{\Pi}^{-1}(C)} \rho(Y, \mathcal{N}^+(A_n) \cap \hat{\Pi}^{-1}(C)) \to 0 \quad \text{as} \quad n \to \infty.$$
Part (a) is a consequence of Lemma 3.1, part (a) of Lemma 4.1 and the following simple fact.

Lemma 4.4. Let $(X, d)$ be a metric space. Consider subsets $A_n, A \subset X$. Assume the $A_n$'s are closed. Suppose there exists a compact set $B \subset X$ such that $A_n \subset B$ for all $n$. If $\mathrm{c\text{-}lim}\, A_n \subset A$ then
\[ \sup_{x \in A_n} d(x, A) \to 0 \quad \text{as } n \to \infty. \]
Proof of Lemma 4.4. Note that the $A_n$'s are compact. If the conclusion does not hold then there exist $\varepsilon > 0$ and a subsequence $A_{n_k}$ such that
\[ \sup_{x \in A_{n_k}} d(x, A) \ge \varepsilon \quad \text{for all } k. \]
By the compactness of $A_{n_k}$, there exists $x_k \in A_{n_k}$ such that $d(x_k, A) \ge \varepsilon$. Since $\{x_k\} \subset B$, by passing to a subsequence if necessary, we may assume $x_k \to y$. Since $\mathrm{c\text{-}lim}\, A_n \subset A$, $y \in A$. On the other hand,
\[ d(y, A) = \lim_{k \to \infty} d(x_k, A) \ge \varepsilon, \]
and hence $y \notin A$, a contradiction.
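The quantity $\sup_{x \in A_n} d(x, A)$ in Lemma 4.4 is the one-sided part of a Hausdorff distance. As a purely illustrative sketch (it is not part of the argument), the following Python code evaluates it exactly for the closures of the intervals $A$ and $A_n$ of Example 4.2, viewed as subsets of the real line. The function names `dist_to_interval`, `excess`, `hausdorff` and `A_n` are hypothetical helpers introduced here, and closed intervals are used on the assumption that the distance to a set equals the distance to its closure.

```python
from fractions import Fraction


def dist_to_interval(x, lo, hi):
    """Distance from the point x to the closed interval [lo, hi]."""
    return max(lo - x, x - hi, 0)


def excess(src, dst):
    """sup of d(x, dst) over x in the closed interval src.

    The distance to an interval is a convex function of x, so the
    supremum over src is attained at one of the end points of src.
    """
    return max(dist_to_interval(src[0], *dst), dist_to_interval(src[1], *dst))


def hausdorff(u, v):
    """Hausdorff distance between two closed intervals."""
    return max(excess(u, v), excess(v, u))


# Closure of A from Example 4.2 (assumption: we work with closures).
A = (Fraction(-2), Fraction(2))


def A_n(n):
    """Closure of A_n = (-2 - 1/n, 2 - 1/n) from Example 4.2."""
    shift = Fraction(1, n)
    return (-2 - shift, 2 - shift)


for n in (1, 10, 100):
    print(n, excess(A_n(n), A), hausdorff(A_n(n), A))
```

Both the one-sided excess and the Hausdorff distance shrink with $n$ here, so the hypothesis $D(A_n, A) \to 0$ of Theorem 4.3 holds for these sets even though the inclusions (4.1) and (4.2) can be strict.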
Proof of part (a) of Theorem 4.3. The aim is to apply Lemma 4.4. The set $B := H^+(A) \cap \Pi^{-1}(C)$ is compact and does not meet $\partial A$. Note that $N^+(A) \cap \hat\Pi^{-1}(C) = N^+(A) \cap P^{-1}(B)$, where, recall, $P : UM \to M$ is the natural projection map. Since $P^{-1}(B)$ is compact and $N^+(A)$ is closed in $UM_A$ (recall that $M_A = M \setminus \partial A$) by Lemma 3.1, $N^+(A) \cap \hat\Pi^{-1}(C)$ is compact. Similarly, using Proposition 2.3, for sufficiently large $n$, $N^+(A_n) \cap \hat\Pi^{-1}(C)$ is compact. Since, by Lemma 2.4, $D(H^+(A_n) \cap \Pi^{-1}(C), H^+(A) \cap \Pi^{-1}(C)) \to 0$, it follows that $H^+(A_n) \cap \Pi^{-1}(C)$ ($n$ sufficiently large) is contained in a compact set $K \subset M$. (One may take $K$ to be the closure of an $\varepsilon$-neighborhood of $H^+(A) \cap \Pi^{-1}(C)$.) It follows that $N^+(A_n) \cap \hat\Pi^{-1}(C)$ (for $n$ sufficiently large) is contained in the compact set $P^{-1}(K) \subset UM$. Also, part (a) of Lemma 4.1 implies that $\mathrm{c\text{-}lim}[N^+(A_n) \cap \hat\Pi^{-1}(C)] \subset N^+(A) \cap \hat\Pi^{-1}(C)$. Part (a) of Theorem 4.3 now follows directly from Lemma 4.4.

Proof of part (b) of Theorem 4.3. Suppose our claim does not hold. Then there exist $\varepsilon > 0$ and a sequence $n_j$ such that
\[ \sup_{Y \in N^+(A) \cap \hat\Pi^{-1}(C)} \rho\bigl(Y, N^+(A_{n_j}) \cap \hat\Pi^{-1}(C)\bigr) \ge 4\varepsilon. \tag{4.3} \]
Let $Y_j \in U_{p_j} M$ be such that the sup in (4.3) is obtained at $Y = Y_j$. By the compactness of $N^+(A) \cap \hat\Pi^{-1}(C)$, there exists $Y \in N^+(A) \cap \hat\Pi^{-1}(C)$ such that, passing to a subsequence if necessary, we have $Y_j \to Y$. It follows that for $j$ large enough we shall have
\[ \rho\bigl(Y, N^+(A_{n_j}) \cap \hat\Pi^{-1}(C)\bigr) \ge 2\varepsilon. \tag{4.4} \]
Since every neighborhood of $Y$ meets $N^+(A) \cap \hat\Pi^{-1}(C)$, we can choose $\tilde Y \in N^+(A) \cap \hat\Pi^{-1}(C)$ such that $\rho(Y, \tilde Y) \le \varepsilon/2$. Since $C$ is open, there will exist a semi-tangent $\hat Y \in N^+(A) \cap \hat\Pi^{-1}(C)$ at an interior point $\hat p$ of the null geodesic generator of $H^+(A)$ determined by $\tilde Y$, such that $\rho(\hat Y, \tilde Y) \le \varepsilon/2$. It follows that for all $j$ large enough,
\[ \rho\bigl(\hat Y, N^+(A_{n_j}) \cap \hat\Pi^{-1}(C)\bigr) \ge \varepsilon. \tag{4.5} \]
By Corollary 2.5 there exists a sequence $q_j \in H^+(A_{n_j}) \cap \Pi^{-1}(C)$ such that $q_j \to \hat p$. Equation (4.5) implies there exists a sequence $Z_j \in N^+(A_{n_j}) \cap U_{p_j} M$ such that $\rho(\hat Y, Z_j) \ge \varepsilon$ for all $j$ sufficiently large. By passing to a subsequence if necessary, we have that $Z_j \to Z \in U_{\hat p} M$. Proposition 2.1 and part (b) of Lemma 4.1 imply that $Z \in N^+(A)$. Clearly $Z \ne \hat Y$ by (4.5), which contradicts the fact that there can be only one semi-tangent at an interior point of a null geodesic generator.
5. The Proof of Theorem 1.1

Let us start by giving an outline of the construction of the set $K$, the existence of which is asserted in Theorem 1.1. The idea is to construct a sequence of sets $K_i$, converging in a precise sense to the desired set $K$, such that

1. $K_0 = \mathbb{R}^2 \setminus B(0, 1)$, where $B(x_0, r)$ denotes an open ball of radius $r$ centered at $x_0$.
2. $\partial K_{i+1}$ is obtained by adding a certain number of "ripples" to $\partial K_i$. The process of adding ripples consists of adding a certain number of circular arcs to $\partial K_i$. It follows that $\partial K_i$ is the union of a finite number of circular arcs.
3. The horizon $H^+(K_i)$ has several "creases", that is, curves along which $H^+(K_i)$ is not differentiable. The addition of ripples to $\partial K_i$ does not affect the existence of the "old" creases, and leads to the occurrence of new ones. The construction guarantees that the set of points which lie on creases of $H^+(K)$ is dense.
4. The boundaries of the $K_i$'s are radial graphs, that is, $\partial K_i$ is described by an equation $r = f_i(\theta)$, $\theta \in [0, 2\pi]$, for some Lipschitz continuous function $f_i(\theta)$. The construction guarantees that $\partial K$ will also have that property, hence will be a Lipschitz continuous topological manifold.

Fig. 3. The process of adding a ripple: the thin solid line represents the original arc, the thick solid line corresponds to "an arc with a ripple added". (Labels in the figure: $y_0$, $y_1$, $y_2$, $x_1$, $x_2$, $r - r_1$, $r_1$, $\varphi_1$.)
Before we proceed further, let us recall some elementary facts from Lorentzian geometry. Consider a globally hyperbolic space–time $(M, g)$ with Cauchy surface $\Sigma$. Let $\Omega$ be an open subset of $\Sigma$, then $H^+(\Omega)$ is a union of null geodesic segments $\gamma$ ([12, Theorem 5.12], [11, Prop. 53, p. 430]), called generators of $H^+(\Omega)$. Each such generator has a past end point $p \in \partial\Omega = \bar\Omega \setminus \Omega$, and either no future end point, or a future end point $q \in H^+(\Omega)$. If $\partial\Omega$ is $C^2$ near $p$, then $\gamma$ is the unique null geodesic orthogonal to $\partial\Omega$ pointing towards $\Omega$ [11, Lemma 50, p. 298].

Let us now pass to the description of the process that we call "adding a ripple". More precisely, we will be adding a ripple at a point $y_0$ lying in the "middle" of a circular arc $A$. Consider thus an arc $A$ of a circle $S(x_0, r_0)$ of radius $r_0$ centered at $x_0$, thus $A$ is given by the equation $A = \{x : x - x_0 = r_0 e^{i\varphi}, \varphi \in [\varphi_-, \varphi_+]\}$. We will be interested in the structure of a Cauchy horizon of a set which has first $A$, and then "$A$ plus a ripple" as part of its boundary. Now for any set $\Omega$ and any conformal isometry $\Psi$ we have $H^+(\Psi(\Omega)) = \Psi(H^+(\Omega))$. We can always find a conformal isometry $\Psi$ of the three-dimensional Minkowski space–time $\mathbb{R}^{2,1}$ so that $x_0 = 0$, $r_0 = 1$, and $\varphi \in [\frac{\pi}{2} - \varphi_0, \frac{\pi}{2} + \varphi_0]$ on $\Psi(A)$, and it follows that for our purposes it is sufficient to consider this case. Then the addition of a ripple to $\Psi(A)$ at $y_0 = e^{i\pi/2}$ proceeds as follows. Let $0 < r_1 < 1$, $0 < \varphi_1 < \varphi_0$, and let $x_1 = (1 - r_1)e^{i(\frac{\pi}{2} - \varphi_1)}$, $x_2 = (1 - r_1)e^{i(\frac{\pi}{2} + \varphi_1)}$. Consider the circles $S(x_1, r_1)$ and $S(x_2, r_1)$; they are tangent to $A$ at the points $y_1 = e^{i(\frac{\pi}{2} - \varphi_1)}$ and $y_2 = e^{i(\frac{\pi}{2} + \varphi_1)}$. If $r_1$ is large enough compared with $|x_1 - x_2|$, $r_1 \ge a = \frac{|x_1 - x_2|}{2}$, then $S(x_1, r_1)$ will intersect $S(x_2, r_1)$, and henceforth we will assume that this is the case. The process of adding a ripple to $A$ is made clear by Fig. 3, and can formally be described as follows: "$\Psi(A$ with a ripple added$)$" will be the curve obtained by following $\Psi(A)$ from $e^{i(\frac{\pi}{2} - \varphi_0)}$ to $e^{i(\frac{\pi}{2} - \varphi_1)}$, then following $S(x_1, r_1)$ from $e^{i(\frac{\pi}{2} - \varphi_1)}$ to the $y$-axis where $S(x_1, r_1)$ intersects $S(x_2, r_1)$, then following $S(x_2, r_1)$ to $e^{i(\frac{\pi}{2} + \varphi_1)}$, and finally following $\Psi(A)$ to its end point. It should be seen that this construction is invariant under reflections across the $y$-axis. The parameter $r_1$ in the construction will be called the radius of the ripple. The distance $a = \frac{1}{2}|x_1 - x_2|$ will be called the diameter of the ripple. It should be noted that "$A$ with a ripple" can be described as a radial graph $|x - x_0| = g(\varphi)$, $\varphi \in [\varphi_-, \varphi_+]$. $g(\varphi)$ is Lipschitz continuous and piecewise smooth. Let us for further use note that the modulus of Lipschitz continuity $L = L(r_1, a)$ of $g(\varphi)$ satisfies
\[ L \le \lim_{\varphi \to \frac{\varphi_- + \varphi_+}{2}} |g'(\varphi)|. \]
$L$ is a strictly increasing function of $a$, with $\lim_{a \to 0} L(r_1, a) = 0$.

Let us now address the question, how does the addition of a ripple affect the Cauchy horizon of $K_0 = \mathbb{R}^2 \setminus B(0, 1)$. As discussed at the beginning of this section, generators of the horizon of any set are null geodesics, which in our case are null straight lines in $\mathbb{R}^{2,1}$. Every such line is determined uniquely by its projection on $\mathbb{R}^2$ under the projection map $\Pi : \mathbb{R}^{2,1} \to \mathbb{R}^2$, $\Pi(t, x) = x$, and this projection is itself a straight line segment in $\mathbb{R}^2$. The generators emanating from points $p$ lying on a smooth piece of the boundary of the set under consideration are thus straight lines orthogonal to the boundary. It follows that the generators of $H^+(K_0)$ project down to the radial lines $\{x = re^{i\varphi}, r \in [1, \infty)\}$.

Consider now a set $K(a, r_1)$, defined by the addition of a ripple to $S(0, 1)$ at $y_0 = e^{i\pi/2}$, of radius $r_1$ and diameter $a$, with $a < r_1 < 1/2$. (By that we mean that $K(a, r_1)$ is that set which has as boundary $S(0, 1)$ with a ripple added, and $\mathbb{R}^2 \setminus B(0, 1) \subset K(a, r_1)$.) From what has been said so far it follows that those generators of $H^+(K(a, r_1))$ which emanate from points lying on $S(0, 1)$ project on the radial half lines $\{re^{i\varphi}, r \in [1, \infty)\}$. Those generators of $H^+(K(a, r_1))$ which emanate from points $x$ lying on $S(x_1, r_1) \setminus \{re^{i\pi/2}, r \in \mathbb{R}\}$ project to line segments contained in the lines $\{r(x - x_1), r \in \mathbb{R}\}$. Note that each such segment, except for the point $y_1$ where $S(x_1, r_1)$ meets $S(0, 1)$, intersects the axis $\{re^{i\pi/2}, r \in \mathbb{R}\}$ precisely at one point. By invariance of $K(a, r_1)$ under reflections across that axis it follows that each such generator must have an end point precisely on the plane $\{t \in \mathbb{R}, x = re^{i\pi/2}, r \in \mathbb{R}\}$. The projection of some of the generators of $H^+(K(a, r_1))$ can be found in Fig. 4. We wish to point out the following:

1. The generators can always be parameterized so that they are given by an equation of the form $\gamma = \{x^\mu(s), \dot x^0 = 1, |\dot{\vec x}| = 1\}$, with $x^\mu(0) \in \partial K(a, r_1)$. In everything that follows all generators will always be parameterized in that manner.
2. Let $C = H^+(K(a, r_1)) \cap \{(t, x), t \in \mathbb{R}, x = re^{i\pi/2}, r \in \mathbb{R}\}$. Every point $p \in C$ is the future end point of two generators of $H^+(K(a, r_1))$, while every point $p \in H^+(K(a, r_1)) \setminus C$ is an interior point of precisely one generator of $H^+(K(a, r_1))$. $C$ will be called the crease set of $H^+(K(a, r_1))$.
Fig. 4. The portion of the solid segments that lies outside of the shaded area is the projection of the generators of $H^+(K(a, r_1))$ on the plane $t = 0$
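The planar geometry of a single ripple on the unit circle can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the function names `ripple` and `radial_graph` and the sample values $r_1 = 0.3$, $\varphi_1 = 0.2$ are hypothetical choices made here, points of $\mathbb{R}^2$ are represented as complex numbers, and the formula for the radial graph is obtained by intersecting a ray from the origin with the circle $S(x_1, r_1)$ (or, by reflection symmetry, $S(x_2, r_1)$).

```python
import cmath
import math


def ripple(r1, phi1):
    """Return (x1, x2, y1, y2, a, cusp) for a ripple on the unit circle at e^{i*pi/2}."""
    x1 = (1 - r1) * cmath.exp(1j * (math.pi / 2 - phi1))  # centre of the right circle
    x2 = (1 - r1) * cmath.exp(1j * (math.pi / 2 + phi1))  # centre of the left circle
    y1 = cmath.exp(1j * (math.pi / 2 - phi1))             # tangency point of S(x1, r1)
    y2 = cmath.exp(1j * (math.pi / 2 + phi1))             # tangency point of S(x2, r1)
    a = abs(x1 - x2) / 2                                  # equals (1 - r1) * sin(phi1)
    if r1 < a:
        raise ValueError("the circles do not intersect: need r1 >= a")
    # Both circles cross the y-axis at heights x1.imag +/- sqrt(r1^2 - a^2);
    # the arc that starts at y1 reaches the axis at the upper of the two.
    cusp = x1.imag + math.sqrt(r1 ** 2 - a ** 2)
    return x1, x2, y1, y2, a, cusp


def radial_graph(phi, r1, phi1):
    """Radius g(phi) of the rippled unit circle along the ray of angle phi."""
    psi = abs(phi - math.pi / 2)   # angular distance from the y-axis (reflection symmetry)
    if psi >= phi1:
        return 1.0                 # the ray still meets the original arc
    beta = phi1 - psi              # angle between the ray and the nearer circle centre
    d = 1 - r1                     # common modulus of x1 and x2
    # Outer root of rho^2 - 2*rho*d*cos(beta) + d^2 - r1^2 = 0.
    return d * math.cos(beta) + math.sqrt(r1 ** 2 - (d * math.sin(beta)) ** 2)


# Sample parameters (arbitrary, chosen so that a < r1 < 1/2).
r1, phi1 = 0.3, 0.2
x1, x2, y1, y2, a, cusp = ripple(r1, phi1)
print("diameter a =", a)
print("height of the corner on the y-axis =", cusp)
print("g at the corner =", radial_graph(math.pi / 2, r1, phi1))
```

In this sketch the corner of the ripple lies on the $y$-axis strictly inside the unit disc, which is consistent with the crease set projecting onto that axis as described above.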
We wish now to set up an iterative scheme for constructing a sequence $(K_i, \varepsilon_i, \delta_i)$, where $\varepsilon_i$ and $\delta_i$ are positive numbers and $K_i$ is an increasing sequence of sets. The reader is referred to Sect. 3 for the definition of semi-tangents and of the distribution of semi-tangents. For any set $A \in \mathcal{X}$, with $\mathcal{X}$ as in (2.1), we define the crease set of $H^+(A)$ as the set of points of $H^+(A)$ at which there is more than one semi-tangent. We set $K_0 = \mathbb{R}^2 \setminus B(0, 1)$, $\varepsilon_0 = \infty$, $\delta_0 = \infty$. The future Cauchy horizon $H^+(K_0)$ of $K_0$ is shown in Fig. 5. The number $\varepsilon_i$ will be a measure of "how close to each other are distinct semi-tangents on the crease set of $H^+(K_i)$". The number $\delta_i$ will be a measure of "how far apart are the distributions of semi-tangents $N^+ K_{i-1}$ and $N^+ K_i$". Clearly the crease set $C_0$ of $K_0$ is empty. Let $\{y_n = 2e^{i\pi n/2}, n = 1, \ldots, 4\}$, and let $x_n$ be the starting points of those generators of $H^+(K_0)$ which pass through $y_n$, thus $x_n = e^{i\pi n/2}$. We define $K_1$ by adding a ripple of radius $r_1$ and diameter $a_1$ at each $x_n$. $r_1$ and $a_1$ are arbitrary except for the requirement that the ripples do not intersect, and that $\partial K_1$ be the radial graph of a function $f_1$, $\partial K_1 = \{r = f_1(\theta), \theta \in [0, 2\pi]\}$, such that the modulus of Lipschitz continuity of $f_1$ is smaller than $1/2$. A possible set $K_1$ is shown in Fig. 6. By the localization principle, Lemma 2.7, the horizon $H^+(K_1)$ will coincide with $H^+(K_0)$ away from those circular arcs at which ripples have been added. Again by the localization principle, in a small neighborhood of each ripple the horizon will look like the horizon of the set $K(a, r_1)$ as previously considered, namely $\mathbb{R}^2 \setminus B(0, 1)$ with a single ripple added. Thus $H^+(K_1)$ will look as shown in Fig. 7.
Fig. 5. The future Cauchy horizon $H^+(K_0)$ of $K_0$

Fig. 6. The dotted line is the boundary of $K_0$, the solid line that of $K_1$. For graphical clarity the parameters describing the ripples have not been chosen so that the modulus of Lipschitz continuity of the function representing $K_1$ be smaller than one half, as required in the construction described below
Fig. 7. The future Cauchy horizon $H^+(K_1)$ of $K_1$. ($K_1$ here has been rotated by 45 degrees, as compared to Fig. 6)
It follows that the crease set $C_1$ of $H^+(K_1)$ projects down on $\mathbb{R}^2$ under $\Pi$ to half-infinite intervals included in the four half-lines $\{re^{i\pi n/2}, r \in [1, \infty)\}$, $n = 1, \ldots, 4$. It should be noted that every point in the annulus $B(0, 2) \setminus B(0, 1)$ is a distance not further than $\frac{\pi}{2}$ from $\Pi C_1$. Letting $\rho$ be the distance on $T\mathbb{R}^{2,1}$ as defined after Example 4.2 in Sect. 4, we set
\[ \varepsilon_1 = \inf\{\rho(X, Y) \mid X, Y \in N^+_p K_1,\ X \ne Y,\ p \in C_1\}, \qquad \delta_1 = \varepsilon_1/4. \]
This finishes our first inductive step.

The remainder of the construction consists in adding ripples in a controlled way. The second step is shown in Fig. 8, and that figure makes it also clear how the induction proceeds; the formal description goes as follows: Suppose that $(K_j, \varepsilon_j, \delta_j)$ have been constructed for all $j \le i$, with the property that $\Pi C_j \subset \Pi C_{j+1}$, and
\[ \delta_{j+1} \le \tfrac{1}{4} \min(\varepsilon_j, \delta_j), \]
and with the property that for $j \ge 1$ every point $p \in B(0, 2) \setminus B(0, 1)$ is a distance not further than $2^{-j}\pi$ from $\Pi C_j$. Let us further assume that for $j = 1, \ldots, i$ the $K_j$'s have the property that $\partial K_j$ is a radial graph $\partial K_j = \{r = f_j(\theta), \theta \in [0, 2\pi]\}$ of a Lipschitz continuous function, the modulus of continuity of which is smaller than $\frac{1}{2} + \frac{1}{2^2} + \ldots + \frac{1}{2^j}$.

Fig. 8. The dotted line represents the intersection of the boundary of $K_0$ with the first quadrant, the dashed one that of $K_1$, and the solid one that of $K_2$. The dots are the centers of the corresponding circles. $K_0$ and $K_1$ have been rotated by 45°, as compared to Fig. 6. For graphical clarity the radii of the "second-generation-ripples" have not been chosen to be equal, as done in the construction below

We construct $K_{i+1}$ as follows: we can find points $z_k \in C_i \cup (H^+(K_i) \cap \Pi^{-1}(\partial B(0, 2)))$, $k = 1, \ldots, I$, such that every point $p \in H^+(K_i) \cap \Pi^{-1}(B(0, 2))$ is a distance not larger than $2^{-i-1}\pi$ from a generator of $H^+(K_i)$ passing through one of the $z_k$'s, or having its end point there. Let $\{y_n\}_{n=1}^{I}$ denote the set of the starting points of those generators. We may choose the $z_k$'s so that none of the $y_n$'s lies on the end point of a circular arc on $\partial K_i$. We can also require that $\# N_{z_k} K_i \le 2$. Define $K_{i+1}(a)$ by adding ripples of radius $r_{i+1}$ and diameter $a$, $a \le a_{i+1} \le r_{i+1}$, to $\partial K_i$ at all the $y_n$'s. The radius $r_{i+1}$ and threshold diameter $a_{i+1}$ are chosen small enough so that $\partial K_{i+1}$ is a radial graph of a function $f_{i+1}$, the modulus of Lipschitz continuity of which is smaller than or equal to $\frac{1}{2} + \frac{1}{2^2} + \ldots + \frac{1}{2^{i+1}}$. Moreover $a_{i+1}$ is chosen small enough so that the new ripples do not intersect each other. Consider now a point $z_k \in C_i$, such that $z_k \in C_j$ and $z_k \notin C_{j-1}$ for some $j$. There exist precisely two points, say $x_1$ and $x_2$, on $\partial K_i$, which are initial points of generators of $H^+(K_i)$, say $\gamma_1$ and $\gamma_2$, with end point at $z_k$. Moreover $x_1$ and $x_2$ are symmetric images of each other under the reflection across the line which contains the projection of that generator $r_k$ of $H^+(K_{j-1})$ which passes through $z_k$. By the localization principle, Lemma 2.7, the addition of a sufficiently small ripple at $x_1$ and $x_2$ will only affect $H^+(K_i)$ in a small neighborhood of $\gamma_1$ and $\gamma_2$. Because the ripples are added symmetrically with respect to $r_k$, the generators emerging from points on the ripples can only meet on a set, the projection of which contains the projection of $r_k$. It follows that for $a_{i+1}$ small enough the projection on $\mathbb{R}^2$ of the crease set $C_{i+1}(a)$ of $H^+(K_{i+1}(a))$, $a \le a_{i+1}$, will contain the projection of the crease set $C_i$.
Note that the above discussion also shows that there exists a compact neighborhood $D_i$ of $\{z_k\}$, with $D_i \cap \partial K_i = \emptyset$, such that the crease set of $H^+(K_i)$ is contained in that of $H^+(K_{i+1}(a))$, except for points in $D_i$, for all $0 \le a < a_{i+1}$. Moreover, by choosing $a_{i+1}$ sufficiently small, $D_i$ may be expressed as a disjoint union, $D_i = \cup_k D_{i,k}$, where $D_{i,k}$ is an arbitrarily small compact neighborhood of $z_k$. Then, continuity properties of geodesics, as applied to the null geodesics orthogonal to $\partial K_{i+1}$, and the structure of the crease sets imply the following: For any $\varepsilon > 0$, $a_{i+1}$ (and, in turn, the $D_{i,k}$'s) may be chosen sufficiently small so that for any $Z \in N^+ K_{i+1}(a) \cap \hat\Pi^{-1}(\Pi D_{i,k})$ and for any $q \in H^+(K_{i+1}(a)) \cap \Pi^{-1}(\Pi(C_i \cap D_{i,k}))$ there exists $Y \in N^+_q K_{i+1}(a)$ such that $\rho(Z, Y) < \varepsilon$. Define
\[ \delta_{i+1}(a) = \sup\{\inf\{\rho(X, Y) \mid Y \in N^+_{\Pi^{-1}p} K_{i+1}(a)\} \mid p \in \Pi C_i,\ X \in N^+_{\Pi^{-1}p} K_i\}. \tag{5.1} \]
Here $\Pi^{-1}p$ denotes that point which is the lift of $p$ to the appropriate Cauchy horizon. By our previous remark, in (5.1) we can replace the condition $p \in \Pi C_i$ by: $p \in \Pi(C_i \cap D_i)$, where $D_i$ is as above. Hence, from our previous remark and part (b) of Theorem 4.3 it follows that $\delta_{i+1}(a) \to 0$ as $a \to 0$. (This should anyway be clear by inspecting what happens in Figs. 4 or 5 when the diameter of the ripples tends to zero.) We can thus choose $a_{i+1}$ small enough so that
\[ \delta_{i+1}(a_{i+1}) \le \tfrac{1}{4} \min(\delta_i, \varepsilon_i). \]
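The factor $\frac{1}{4}$ in this requirement makes the successive displacements of semi-tangents summable: a displacement of at most $\varepsilon_i/4$ at the first step, $\varepsilon_i/16$ at the next, and so on, adds up to less than $\varepsilon_i/3$, so two semi-tangents that start at distance at least $\varepsilon_i$ stay more than $\varepsilon_i/3$ apart. The Lipschitz bounds $\frac{1}{2} + \ldots + \frac{1}{2^j}$ stay below $1$ for the same reason. The following Python sketch checks this arithmetic with exact rationals; the function names `drift_bound` and `lipschitz_bound` are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction


def drift_bound(steps, eps=Fraction(1)):
    """Upper bound eps/4 + eps/16 + ... + eps/4**steps on the total displacement."""
    return sum((eps / 4 ** k for k in range(1, steps + 1)), Fraction(0))


def lipschitz_bound(j):
    """The bound 1/2 + 1/4 + ... + 1/2**j on the Lipschitz modulus of f_j."""
    return sum((Fraction(1, 2 ** k) for k in range(1, j + 1)), Fraction(0))


eps = Fraction(1)  # normalised value of epsilon_i
for steps in (1, 2, 5, 20):
    d = drift_bound(steps, eps)
    # Each of the two semi-tangents moves by at most d, so their distance
    # stays at least eps - 2*d, which exceeds eps/3 because d < eps/3.
    assert d < eps / 3
    assert eps - 2 * d > eps / 3
    assert lipschitz_bound(steps) < 1
    print(steps, d, eps - 2 * d, lipschitz_bound(steps))
```

Any contraction factor smaller than $\frac{1}{3}$ would serve the same purpose in this arithmetic; the value $\frac{1}{4}$ is the one used in the construction.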
Set $K_{i+1} = K_{i+1}(a_{i+1})$, and define
\[ \varepsilon_{i+1} = \inf\{\inf\{\rho(X, Y) \mid X, Y \in N^+_{\Pi^{-1}p} K_{i+1},\ X \ne Y\} \mid p \in C_{i+1}\} > 0, \]
where $C_{i+1}$ is the crease set of $H^+(K_{i+1})$. This completes our inductive step.

Consider the family of functions $f_i(\theta)$ which represent $\partial K_i$ as radial graphs. For any $\theta$ we have $f_{i+1}(\theta) \le f_i(\theta)$, so that $f(\theta) = \lim_{i \to \infty} f_i(\theta)$ exists. Set
\[ K = \{re^{i\theta} \mid r \ge f(\theta)\}. \]
As the modulus of Lipschitz continuity of all the $f_i$'s is less than $\frac{1}{2} + \frac{1}{4} + \ldots + \frac{1}{2^i} < 1$, $f$ is a Lipschitz continuous function and $\partial K$ is thus a Lipschitz continuous topological manifold. Let $C$ be the crease set of $H^+(K)$; we wish to show that $\Pi C$ is dense in the annulus $B(0, 2) \setminus B(0, 1)$. This will follow if we show that $\Pi C$ contains $\Pi C_i$ for each $i$. Consider then a point $p \in C_i$; there exist at least two vectors $X$ and $Y$ in $TM$ which are semi-tangent to $H^+(K_i)$, with $\rho(X, Y) \ge \varepsilon_i$. Let $q = \Pi p$. By construction there exist $X_1, Y_1 \in N^+_{p_1} K_{i+1}$, $\Pi p_1 = \Pi p$, such that $\rho(X, X_1) \le \delta_{i+1} \le \frac{\varepsilon_i}{4}$, $\rho(Y, Y_1) \le \frac{\varepsilon_i}{4}$. Similarly there exist $X_2, Y_2 \in N^+_{p_2} K_{i+2}$ such that $\rho(X_1, X_2) \le \frac{\delta_{i+1}}{4} \le \frac{\varepsilon_i}{16}$, $\rho(Y_1, Y_2) \le \frac{\varepsilon_i}{16}$. This gives $\rho(X, X_2) \le \frac{\varepsilon_i}{4} + \frac{\varepsilon_i}{16} < \frac{\varepsilon_i}{3}$, $\rho(Y, Y_2) < \frac{\varepsilon_i}{3}$. Continuing this way one obtains a sequence of vectors $X_k, Y_k$ such that $\rho(X, X_k) < \frac{\varepsilon_i}{3}$, $\rho(Y, Y_k) < \frac{\varepsilon_i}{3}$. By compactness a converging sequence can be chosen, converging to vectors $X_\infty, Y_\infty$. By the triangle inequality we have $\rho(X_\infty, Y_\infty) \ge \frac{\varepsilon_i}{3}$, so that $X_\infty \ne Y_\infty$. By the cluster-limit Lemma 4.1, $X_\infty, Y_\infty \in N^+_{\Pi^{-1}q} K$. By Proposition 3.4 $H^+(K)$ cannot be smooth at $\Pi^{-1}q$, and the result follows.

6. Higher Dimensional Examples

Let us now address the question of existence of "nowhere" differentiable horizons in more than 2+1 space–time dimensions. First, consider the space–times $M^{p,q} = \mathbb{R}^{2,1} \times \mathbb{R}^p \times (S^1)^q$, where $S^1$ denotes a circle, and $p, q$ are non-negative integers. Here $\mathbb{R}^p$ and $(S^1)^q$ are equipped with the obvious metric, and $M^{p,q}$ is equipped with the product metric. We claim that
\[ H^+(K \times \mathbb{R}^p \times (S^1)^q; M^{p,q}) = H^+(K; \mathbb{R}^{2,1}) \times \mathbb{R}^p \times (S^1)^q. \tag{6.1} \]
Here $K$ is the set constructed in Theorem 1.1, and $H^+(\Omega; M)$ denotes the Cauchy horizon of an acausal set $\Omega$ in a space–time $M$. It follows that $H^+(K \times \mathbb{R}^p \times (S^1)^q; M^{p,q})$ provides us with a "nowhere" differentiable Cauchy horizon, via the constructions discussed in Section 1. To prove (6.1) it is sufficient to note that $\mathbb{R}^p \times (S^1)^q$ acts transitively on itself by isometries, which consist of translations in the $\mathbb{R}^p$ factor and rotations of each $S^1$ factor. Since $H^+(\Psi(\Omega); M) = \Psi(H^+(\Omega; M))$ for any isometry $\Psi$, it follows that $H^+(K \times \mathbb{R}^p \times (S^1)^q; M^{p,q})$ is of the form $U \times \mathbb{R}^p \times (S^1)^q$ for some set $U$. Clearly $U = H^+(K; \mathbb{R}^{2,1})$ and (6.1) follows.

Note that if $p > 0$ then $K \times \mathbb{R}^p \times (S^1)^q$ does not have a compact boundary, while if $q > 0$ then $M^{p,q}$ is not asymptotically flat in the usual sense. It is therefore of interest to try to mimic our construction in higher dimension, to obtain a set $K \subset \mathbb{R}^n$, with compact boundary $\partial K$, having the properties listed in Theorem 1.1, when $\mathbb{R}^n$ is considered as the $t = 0$ hyper-surface in $n + 1$-dimensional Minkowski space–time $\mathbb{R}^{n,1}$. Now the results proved in Sections 2 and 3 are dimension-independent; let us then isolate the essential elements of our construction. The key element was the procedure of adding a ripple to the boundary of a set, which then produced a crease on the event horizon. Thus we need a building block which will have an effect similar to the one that our two-dimensional ripple had. Such a higher dimensional ripple can be easily constructed as follows: let us start with 3 + 1 dimensions. Let $K(a, r)$ be the complement of a disc with a ripple added, as described in Sect. 5 and shown in Fig. 4, lying in the plane $z = 0$ in $\mathbb{R}^3$. Let $\hat K(a, r) \subset \mathbb{R}^3$ be the set obtained by rotating $K(a, r)$ around the axis $z = x = 0$. ($\hat K(a, r)$ is clearly diffeomorphic to an apple without its stalk.) $\hat K(a, r)$ is invariant under rotations around that axis, so must therefore be its Cauchy horizon in $\mathbb{R}^{3,1}$.
It is then easily seen that $H^+ \equiv H^+(\hat K(a, r); \mathbb{R}^{3,1})$ is obtained by rotating $H^+(K(a, r); \mathbb{R}^{2,1})$ around the $z = x = 0$ axis. This shows that the crease set of $H^+$ is non-trivial and projects on $\mathbb{R}^3$ under the canonical projection $\Pi : \mathbb{R}^{3,1} \to \mathbb{R}^3$ to a subset of the $z = x = 0$ axis. Clearly one can now start an iterative scheme of adding new ripples, and obtain a desired set $K$ by passing to a limit. Let us mention that a useful bookkeeping procedure in the two-dimensional case was to ensure that the projections $\Pi C_i$ of the crease sets $C_i$ formed an increasing sequence of sets. Presumably a way of ensuring that could also be conceived in the three-dimensional case. Rather than doing that let us note that this aspect of the construction is not necessary, and can be bypassed as follows: At the $i$th step of the construction add the new "end points" $\{z_j\}_{j=1}^{N_i}$ of the generators $\gamma_j$, $j = 1, \ldots, N_i$, so that every point of $B(0, 2) \setminus B(0, 1) \subset \mathbb{R}^3$ be a distance not larger than $2^{-i-2}$ from the projection of some generator with future end point in the set $\{z_j\}_{j=1}^{N_i}$, or passing through a point in $\{z_j\}_{j=1}^{N_i}$. Choose then the size of the ripples added at the $z_j$'s, $j = 1, \ldots, N_i$, so small that $\Pi C_{i+1}$ lies a distance not larger than $2^{-i-2}$ from one of the distinguished generators $\gamma_i$ or from $\Pi C_i$. The other arguments go through with some rather obvious modifications.

Let us finally describe one of the many possible processes of "adding a ripple" in $n$ dimensions. Let $f_{a,r}$ be the function which describes the "rippled circle" $\partial K(a, r)$ as a radial graph, $\partial K(a, r) = \{f_{a,r}(\varphi)e^{i\varphi}, \varphi \in [0, 2\pi]\}$; $K(a, r)$ is as described in Sect. 5. Then a "rippled $n - 1$-dimensional sphere" $\partial \tilde K(a, r)$ is given by the equation
\[ \partial \tilde K(a, r) = \{x_n = f_{a,r}(\theta) \cos\theta,\ \ x_1^2 + x_2^2 + \ldots + x_{n-1}^2 = f_{a,r}(\theta) \sin\theta,\ \ \theta \in [-\tfrac{\pi}{2}, \tfrac{\pi}{2}]\}. \]
The reader should be able to check, by isometry considerations, that for all $t \ge 0$ the cross-sections of $H^+ \equiv H^+(\tilde K(a, r); \mathbb{R}^{n,1})$ with the hyper-planes $t = \mathrm{const}$ are sets the boundaries of which are again rippled $n - 1$ spheres. The crease set of $H^+$ projects down to a subset of the axis $x_1 = x_2 = \ldots = x_{n-1} = 0$.

Acknowledgement. P.T.C. acknowledges useful discussions with R. Geroch, K. Newman, P. Tod and R. Wald. He also wishes to thank the Department of Mathematics of the University of Miami for hospitality during part of the work on this paper. We are grateful to V.E. Dubau and C. Georgelin for producing the Mathematica figures.
References

1. Beem, J.: Causality and Cauchy horizons. Class. Quantum Grav. 27, 93–108 (1995)
2. Chruściel, P.T. and Isenberg, J.: On the stability of differentiability of Cauchy horizons. Commun. Anal. Geom. 5, 249–277 (1997)
3. Federer, H.: Geometric measure theory. New York: Springer Verlag, 1969 (Die Grundlehren der mathematischen Wissenschaften, Vol. 153)
4. Galloway, G.J.: Some global aspects of compact space-times. Arch. Math. 42, 168–172 (1984)
5. Hawking, S.W.: Black holes in general relativity. Commun. Math. Phys. 25, 152–166 (1972)
6. Hawking, S.W.: Chronology protection conjecture. Phys. Rev. D46, 603–611 (1992)
7. Hawking, S.W. and Ellis, G.F.R.: The large scale structure of space-time. Cambridge: Cambridge University Press, 1973
8. Major, I.: On differentiability of achronal sets. Budapest preprint, 1987
9. Newman, K.: Private communication
10. Newman, R.P.A.C.: Compact space-times and the no-return theorem. Gen. Rel. Grav. 18, 1181–1186 (1986)
11. O'Neill, B.: Semi-Riemannian geometry. New York: Academic Press, 1983
12. Penrose, R.: Techniques of differential topology in relativity. Philadelphia, PA: SIAM, 1972 (Regional Conf. Series in Appl. Math., vol. 7)
13. Tipler, F.J.: Singularities and causality violation. Ann. Phys. 108, 1–36 (1977)

Communicated by H. Nicolai
Commun. Math. Phys. 193, 471 – 492 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Covariant Sectors with Infinite Dimension and Positivity of the Energy⋆

Paolo Bertozzini⋆⋆, Roberto Conti⋆⋆⋆, Roberto Longo

Dipartimento di Matematica, Università di Roma "Tor Vergata", Via della Ricerca Scientifica, I-00133 Roma, Italy. E-mail: [email protected], [email protected], [email protected]

Received: 21 April 1997 / Accepted: 23 September 1997
Abstract: Let $\mathcal{A}$ be a local conformal net of von Neumann algebras on $S^1$ and $\rho$ a Möbius covariant representation of $\mathcal{A}$, possibly with infinite dimension. If $\rho$ has finite index, $\rho$ has automatically positive energy. If $\rho$ has infinite index, we show the spectrum of the energy always to contain the positive real line, but, as seen by an example, it may contain negative values. We then consider nets with Haag duality on $\mathbb{R}$, or equivalently sectors with non-solitonic extension to the dual net; we give a criterion for irreducible sectors to have positive energy, namely this is the case iff there exists an unbounded Möbius covariant left inverse. As a consequence the class of sectors with positive energy is stable under composition, conjugation and direct integral decomposition.
1. Introduction

In the past there has been a certain belief that all irreducible superselection sectors have finite statistics. Indeed if $\mathcal{A}$ is a translation covariant net on the Minkowski spacetime and the energy-momentum spectrum has an isolated mass shell, then, by a theorem of Buchholz and Fredenhagen [8], every positive-energy irreducible representation is localizable in a space-like cone and has finite statistics. Nevertheless an analysis of sectors with infinite dimension is possible, in the context of a modular covariant net, as outlined in [18], and this indicated such quantum charges to have a natural occurrence. First examples of irreducible superselection sectors with infinite dimension have been constructed, in a simple way, by Fredenhagen [14], associated with conformal nets on $S^1$, and moreover there are arguments that in this context a large natural family of sectors should have infinite dimension [22], thus providing the feeling that infinite statistics might be the generic or prevailing situation in low spacetime dimension.

⋆ Research supported in part by MPI, CNR-GNAFA and INDAM.
⋆⋆ Present address: Department of Mathematics and Statistics, Thammasat University, Klongluang Pathumthani 12121, Thailand.
⋆⋆⋆ Present address: Department of Mathematics, University of California, Berkeley, CA 94720, USA.

At this point it is natural to begin with a general study of superselection sectors with infinite dimension. However the extension from the finite-dimensional to the infinite-dimensional case is certainly far from being straightforward and requires new methods and insight; it is analogous to the passage, in the study of group representations, from compact groups to locally compact groups.

In order to understand the structure of infinite-dimensional sectors, we shall study here the positive energy property. We start with a study of the finite index case in the one-dimensional conformal case, with a point of view suitable for generalization. Classical arguments, see [11], are replaced also due to the failure of Haag duality on the real line and the occurrence of soliton sectors. Based on modular theory methods, we show however that the positivity of the energy holds automatically in the finite index case.

In the context of infinite-dimensional sectors we shall then show that the spectrum of the energy always contains $\mathbb{R}^+$. But, as we shall illustrate by a (reducible) example, negative energy values may occur in general. We are thus led to characterize the sectors with positive energy. To this end we study the basic question whether we may associate an unbounded left inverse with every covariant superselection sector with positive energy.

We start by considering a translation-dilation covariant net $\mathcal{A}$ of von Neumann algebras on $\mathbb{R}$ obtained by a Möbius covariant precosheaf on $S^1$ by cutting the circle at one point. Such translation-dilation covariant nets are characterized by the Bisognano-Wichmann geometric action of the modular group of the von Neumann algebras of half-lines, see [19].
Assuming Haag duality for $\mathcal{A}$ on $\mathbb{R}$ to hold (for bounded intervals), we give a positive answer to the above question and in fact we show that an irreducible sector has positive energy if and only if it admits a Möbius covariant unbounded left inverse. As a consequence we show the class of superselection sectors with positive energy is closed under composition, conjugation and direct integral decomposition.

Notice now that the Bisognano-Wichmann dual net $\mathcal{A}^d$ of $\mathcal{A}$,
\[ \mathcal{A}^d(a, b) = \mathcal{A}(-\infty, b) \cap \mathcal{A}(a, +\infty), \]
always satisfies Haag duality, is conformal thus strongly additive [19], and a covariant representation $\rho$ of $\mathcal{A}$ localized in a bounded interval $I$ extends to a covariant representation of $\mathcal{A}^d$, but it may become localized in a half-line, i.e. a soliton sector [24]. However one may construct an extension $\rho_R$ of $\rho$ to $\mathcal{A}^d$ localized in a right half-line and an extension $\rho_L$ localized in a left half-line. The extensions are easily obtained: if $I \subset (a, +\infty)$, the restriction of $\rho$ to the C∗-algebra generated by the von Neumann algebras of bounded intervals contained in $(a, +\infty)$ extends to a normal endomorphism of its weak closure $\mathcal{A}(a, +\infty)$, thus, as $a \in \mathbb{R}$ is arbitrary, it gives rise to an endomorphism of the C∗-algebra $\bigl(\cup_{a \in \mathbb{R}} \mathcal{A}(a, +\infty)\bigr)^-$ that restricts to a representation of the quasi-local C∗-algebra of $\mathcal{A}^d$ generated by the local von Neumann algebras $\mathcal{A}^d(I)$, $I$ a bounded interval. This representation is $\rho_R$, and $\rho_L$ is obtained similarly. In general $\rho_R$ and $\rho_L$ are inequivalent representations. They are equivalent in particular if they are still localized in a bounded interval, namely the extension of $\rho$ is not a soliton. Our results thus apply to general (not necessarily strongly additive) Möbius covariant nets, provided we consider "truly non-solitonic" covariant sectors.

The idea behind our analysis is to explore the equivalence between the positivity of the energy and the KMS condition for the dilation automorphisms of $\mathcal{A}(\mathbb{R}^+)$, that holds in the vacuum sector. It is rather easy to see that the latter KMS property passes to charged sectors, of arbitrary dimension, enabling us to make an analysis by the Tomita–Takesaki theory. Our methods rely on an analysis of the unitary representation of the translation-dilation group, where we give a characterization of the positive energy representations in terms of domain conditions for certain associated operators. We then identify these operators with modular objects furnished by the Tomita–Takesaki theory and use the domain conditions. In the infinite index case we make use of Haagerup operator-valued weights, Connes spatial derivatives and Araki modular operators in particular. For the convenience of the reader we use [25] as a reference for the modular theory.

This paper leaves open the problem whether every irreducible covariant sector has automatically positive energy. Our work however indicates that, at least in the strongly additive case, the answer is likely to be affirmative.

2. Representations of the Translation-Dilation Group: a Criterion for Positive Energy

In this section we analyze the unitary positive-energy representations of the dilation-translation group from a point of view of later use. In particular we are interested in characterizing the positive energy representations in terms of domain conditions. To begin with, we relax the hypothesis of positivity of the energy in the proof of [21], Cor. 2.8.

Lemma 2.1. Let $T(a) := e^{iHa}$, $U(t)$ be two 1-parameter groups on a Hilbert space $\mathcal{H}$ such that $U(t)T(a)U(t)^* = T(e^{-2\pi t}a)$ for every $t, a \in \mathbb{R}$. Then the spectral projection $P_1$ (resp. $P_2$, $P_3$), relative to the positive (resp. negative, 0) part of the spectrum of $H$ commutes with $T$, $U$, and thus reduces the representation on globally invariant subspaces.

Proof. From the commutation relation $U(t)T(a)U(t)^* = T(e^{-2\pi t}a)$, by differentiation with respect to $a$, it follows that $U(t)HU(t)^* = e^{-2\pi t}H$.
For $i = 1, 2, 3$ let $\chi_i$ be the characteristic function of the positive (resp. negative, zero) part of $\mathbb{R}$. By Borel functional calculus we get: $\chi_i(U(t)HU(t)^*) = U(t)\chi_i(H)U(t)^* = \chi_i(e^{-2\pi t}H) = \chi_i(H)$. As $\chi_i(H) = P_i$, we have the proof.
Lemma 2.2. Given two 1-parameter unitary groups $T(a)$, $U(t) = e^{-2\pi i t D}$ on the Hilbert space $\mathcal{H}$ such that $U(t)T(a)U(t)^* = T(e^{-2\pi t}a)$ for every $t, a \in \mathbb{R}$, then for every $a \in \mathbb{R}$ and $\zeta$ in the domain of both $e^{-\pi D}$, $e^{-\pi D}T(a)$, we have $\|e^{-\pi D}\zeta\| = \|e^{-\pi D}T(a)\zeta\|$. The subspace $D(e^{-\pi D}) \cap D(e^{-\pi D}T(a))$ of such $\zeta$ is dense for every fixed $a \in \mathbb{R}$.
Proof. By symmetry considerations it is sufficient to consider the case $a \geq 0$. If the generator $-i\frac{d}{da}T(a)|_{a=0}$ of $T$ is non-negative then $D(e^{-\pi D}) \subset D(e^{-\pi D}T(a))$, $a \geq 0$, and the thesis is known [21] (see also Prop. 3.4). We just outline how to change the arguments in this reference in order to have our general statement. Let us decompose the Hilbert space $\mathcal{H}$ in the direct sum of the two spectral subspaces $\mathcal{H}_+ := (P_1 + P_3)\mathcal{H}$, $\mathcal{H}_- := P_2\mathcal{H}$. By Lemma 2.1 these subspaces are invariant for both the 1-parameter groups $T(a)$ and $U(t)$, so that it is possible to define the restrictions $U_+(t)$ (resp. $U_-(t)$) and $T_+(a)$ (resp. $T_-(a)$) of the 1-parameter groups to these subspaces. Now $T_+(a)$ (resp. $T_-^{-1}(a)$) is a 1-parameter group with positive
generator satisfying the commutation relation $U_+(t)T_+(a)U_+(t)^* = T_+(e^{-2\pi t}a)$ (resp. $U_-(t)T_-^{-1}(a)U_-(t)^* = T_-^{-1}(e^{-2\pi t}a)$), therefore using [21], Corollary 2.8, we obtain: $\|e^{-\pi D_+}\zeta\| = \|e^{-\pi D_+}T_+(a)\zeta\|$ (resp. $\|e^{-\pi D_-}\zeta\| = \|e^{-\pi D_-}T_-^{-1}(a)\zeta\|$) for every $\zeta$ in the domain of $e^{-\pi D_+}$ (resp. for every $\zeta$ in the domain of $e^{-\pi D_-}$), for every $a \geq 0$, where $D_+$ and $D_-$ are the generators of the 1-parameter groups $U_+(t)$, $U_-(t)$. Let us take $\zeta = \zeta_+ + \zeta_-$ in the common domain of $e^{-\pi D}$ and $e^{-\pi D}T(a)$, $a \geq 0$; then we have $\zeta_+$ in the domain of $e^{-\pi D_+}$ and $\zeta_-$ in the domain of $e^{-\pi D_-}$, so that we obtain $\|e^{-\pi D_+}\zeta_+\| = \|e^{-\pi D_+}T_+(a)\zeta_+\|$ and $\|e^{-\pi D_-}\zeta_-\| = \|e^{-\pi D_-}T_-^{-1}(a)\zeta_-\|$. From the second equation (using the fact that by hypothesis $T_-(a)\zeta_-$ is in the domain of $e^{-\pi D_-}$), we get $\|e^{-\pi D_-}T_-(a)\zeta_-\| = \|e^{-\pi D_-}T_-^{-1}(a)T_-(a)\zeta_-\| = \|e^{-\pi D_-}\zeta_-\|$. Now, summing the results for the two components $\zeta_+$, $\zeta_-$, from Pythagoras' Theorem it is possible to deduce: $\|e^{-\pi D}\zeta\| = \|e^{-\pi D}T(a)\zeta\|$, $a \geq 0$. If $a > 0$ and $\zeta \in D(e^{-\pi D}) \cap D(e^{-\pi D}T(-a))$, then $T(-a)\zeta \in D(e^{-\pi D}) \cap D(e^{-\pi D}T(a))$, thus $\|e^{-\pi D}\zeta\| = \|e^{-\pi D}T(a)T(-a)\zeta\| = \|e^{-\pi D}T(-a)\zeta\|$. The last part of the statement is now clear.
Corollary 2.3. Assume that $T(a)$ and $U(t) = e^{-2\pi i t D}$ are two 1-parameter groups as in Lemma 2.2, and let $\zeta \in \mathcal{H}$ be a vector such that $\|e^{-\pi D}\zeta\| = \|e^{-\pi D}T(a)\zeta\| < \infty$ for some $a \geq 0$; then $\|e^{-\pi D}\zeta\| = \|e^{-\pi D}T(b)\zeta\|$ for every $0 < b < a$.
Proof. With the same notation as above we know that $\zeta_+ \in D(e^{-\pi D_+})$ and $T_-(a)\zeta_- \in D(e^{-\pi D_-})$, thus $\zeta_+ \in D(e^{-\pi D_+}T_+(b))$ and $T_-(b)\zeta_- = T_-(a-b)^*T_-(a)\zeta_- \in D(e^{-\pi D_-})$.
We also have the following criterion for the positivity of the energy.
Proposition 2.4. Given two 1-parameter groups $T(a)$, $U(t) = e^{-2\pi i t D}$ as in Lemma 2.2, the following are equivalent:
a) $e^{-\pi D}T(a) \supset T(-a)e^{-\pi D}$, i.e. $T(-a)e^{-\pi D}$ is hermitian, for some (⇔ for all) $a > 0$;
b) there exists a core $D_1$ for $e^{-\pi D}$ contained in $D(e^{-\pi D}T(a))$, for some (⇔ for all) $a > 0$;
c) the generator $H$ of $T(a)$ is positive.
Proof. The equivalence a) ⇔ c) is proved in [10], Th. 1. The implication c) ⇒ a) is also contained in [21], (proof of) Cor. 2.8 by a different method of proof. Clearly a) ⇒ b), thus we have to show that b) ⇒ c). $D_2 := e^{-\pi D}D_1$ is a core for $e^{\pi D}$, and, for some $a > 0$, $e^{-\pi D}T(a)e^{\pi D}$ is isometric on $D_2$ by Lemma 2.2, thus by [26], 9.24 the function $t \mapsto U(t)T(a)U(t)^*$ admits an analytic continuation inside the strip $\{t \in \mathbb{C} \mid -\frac{1}{2} < \operatorname{Im} t < 0\}$ bounded in norm by 1, see also Th. 4.3. The conclusion may be easily obtained as in [5], (proof of) Prop. 2.7; in fact putting $t = -\frac{i}{4}$ we get $\|T(ia)\| = \|e^{-aH}\| \leq 1$.
3. Preliminaries on Local Conformal Precosheaves
Our analysis will concern nets of von Neumann algebras on the real line. More precisely $\mathcal{A}$ will be a map $I \mapsto \mathcal{A}(I)$ from the bounded open intervals $I$ of $\mathbb{R}$ to von Neumann algebras on a fixed Hilbert space $\mathcal{H}$. For this net we require the following properties:
1) Isotony: $I_1 \subset I_2 \Rightarrow \mathcal{A}(I_1) \subset \mathcal{A}(I_2)$.
2) Locality: $I_1 \cap I_2 = \emptyset \Rightarrow [\mathcal{A}(I_1), \mathcal{A}(I_2)] = \{0\}$.
3) Covariance: There exists a strongly continuous unitary representation $V$ of the translation-dilation group $P$, namely a semidirect product of $\mathbb{R}$ with $\mathbb{R}$, on $\mathcal{H}$ such that: $V(g)\mathcal{A}(I)V(g)^{-1} = \mathcal{A}(gI)$, $g \in P$. Here $P$ acts on $\mathbb{R}$ by $(a,t)x = a + e^{t}x$, and we will denote by $a, b, \dots \in \mathbb{R}$ elements of the translation one-parameter subgroup and by $t, s, \dots \in \mathbb{R}$ elements of the dilation one-parameter subgroup. We shall frequently denote the one-parameter translation (resp. dilation) group simply by $T(a)$ (resp. $U(t)$).
4) Existence of the vacuum: There exists a unique (up to a phase) unit $V$-invariant vector $\Omega \in \mathcal{H}$.
5) Reeh–Schlieder Property: The vacuum vector $\Omega$ is cyclic and separating for the von Neumann algebras $\mathcal{A}(I)$.
6) Bisognano–Wichmann Property: The modular unitary one-parameter group associated (by the Reeh–Schlieder Property and the Tomita–Takesaki Theorem) with $(\mathcal{A}(\mathbb{R}_+), \Omega)$ coincides with the rescaled dilation one-parameter unitary group: $\Delta^{it} = U(-2\pi t)$, $t \in \mathbb{R}$.
Here and in the following, given $S \subset \mathbb{R}$, we indicate by $\mathcal{A}_0(S)$ the $C^*$-algebra generated by all the $\mathcal{A}(I)$, with $I \subset S$, and by $\mathcal{A}(S) = \mathcal{A}_0(S)''$ its weak closure. In the literature one considers more often Möbius covariant precosheaves (also named nets) of von Neumann algebras on the proper intervals of $S^1$. If one cuts $S^1$ and restricts such a precosheaf to $\mathbb{R} = S^1 \setminus \{\text{point}\}$ one obtains a net on the real line satisfying the above properties 1 to 6 [5, 2], see also [15]. Conversely any net on $\mathbb{R}$ with the above properties extends uniquely to a Möbius covariant precosheaf on $S^1$ [19]. In particular the modular conjugation $J_R$ associated with $(\mathcal{A}(\mathbb{R}_+), \Omega)$ corresponds to the reflection in $\mathbb{R}$ with respect to 0, thus "wedge duality" holds: $\mathcal{A}(a, \infty)' = \mathcal{A}(-\infty, a)$. Moreover the generator $H := -i\frac{d}{da}T(a)|_{a=0}$ of the translation one-parameter group $T$ is positive [27, 6]. In other words positivity of the energy in the vacuum sector is a consequence of the KMS property for the dilation automorphism group of $\mathcal{A}(\mathbb{R}_+)$.
We will see how this implication works in different representations. Notice that the uniqueness of the vacuum and the positivity of the energy entail the factoriality of the von Neumann algebras $\mathcal{A}(I)$, if $I$ is a half-line, thus the irreducibility of the quasi-local $C^*$-algebra $\mathcal{A}_0(\mathbb{R})$; indeed every net satisfying the above properties decomposes uniquely into a direct sum of irreducible nets, and irreducibility, uniqueness of the vacuum and factoriality of the von Neumann algebras of half-lines are equivalent properties. We shall now consider a morphism $\rho$ of the quasi-local $C^*$-algebra $A = \mathcal{A}_0(\mathbb{R})$ localized in a half-line, namely $\rho$ is a representation of $A$ on $\mathcal{H}$ such that $\rho(X) = X$ for every $X \in \mathcal{A}_0(-\infty, a)$. Two such morphisms $\rho$, $\rho'$ are said to be equivalent if they are equivalent as representations; thus, by wedge duality, there exists a unitary $T \in \mathcal{A}(a, \infty)$ such that $T\rho(X) = \rho'(X)T$ for every $X \in A$. An endomorphism $\rho$ is covariant if there exists a unitary strongly continuous representation $V_\rho : P \to B(\mathcal{H})$ such that:
$$\rho(\alpha_g(X)) = V_\rho(g)\,\rho(X)\,V_\rho(g)^{-1}, \qquad X \in A,\ g \in P,$$
where $\alpha_g := \mathrm{Ad}(V(g))$. As long as we consider a covariant irreducible morphism $\rho$ or, more generally, a finite direct sum of irreducibles (in particular finite index endomorphisms), the representation $V_\rho$ providing the covariance is unique, due to the fact that there are no non-trivial finite-dimensional representations of $P$. In the reducible case, different representations are related by a cocycle in $\rho(A)'$. We shall say that $\rho$ has positive energy if we can choose $V_\rho$ so that the generator of the translation group is positive. We shall only consider covariant morphisms $\rho$ which are transportable, i.e. localizable in any half-line $(-\infty, a)$ or $(b, \infty)$. This is in particular the case of a covariant morphism localizable in an interval. Thus $\rho$ is normal on $\mathcal{A}(a, \infty)$ and, if localized in $(a, \infty)$, extends to a normal endomorphism of $\mathcal{A}(a, \infty)$ denoted $\rho_{(a,\infty)}$ or simply by $\rho$, if no confusion arises. By introducing the notations
$$\beta_g^\rho := \mathrm{Ad}(V_\rho(g)), \qquad z_\rho(g) := V_\rho(g)V(g)^*, \qquad g \in P,$$
the covariance condition takes the form
$$\alpha_g\,\rho\,\alpha_g^{-1} = \mathrm{Ad}(z_\rho(g)^*) \circ \rho \simeq \rho, \qquad g \in P, \tag{3.1}$$
and $z_\rho(g)$ satisfies the following $\alpha$-cocycle identity:
$$z_\rho(gg') = z_\rho(g)\,\alpha_g(z_\rho(g')), \qquad g, g' \in P.$$
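The cocycle identity follows directly from the definition $z_\rho(g) = V_\rho(g)V(g)^*$ together with the fact that both $V$ and $V_\rho$ are representations of $P$. A short verification using only these two inputs:

```latex
\begin{align*}
z_\rho(gg') &= V_\rho(gg')\,V(gg')^* \\
            &= V_\rho(g)\,V_\rho(g')\,V(g')^*\,V(g)^* \\
            &= V_\rho(g)\,z_\rho(g')\,V(g)^* \\
            &= \bigl(V_\rho(g)\,V(g)^*\bigr)\,\bigl(V(g)\,z_\rho(g')\,V(g)^*\bigr) \\
            &= z_\rho(g)\,\alpha_g\bigl(z_\rho(g')\bigr).
\end{align*}
```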
Let $\rho$ be localized in the half-line $I = (a, \infty)$; then $\alpha_g\rho\alpha_g^{-1}$ is localized in $gI$, and from formula (3.1) we obtain $z_\rho(g) \in \mathcal{A}(\tilde{I}')' = \mathcal{A}(\tilde{I})$, where $\tilde{I}$ is the largest half-line between $I$ and $gI$. For every dilation $(0, t) \in P$, we set $U(t) := V((0, t))$ as before, and $U_\rho(t) := V_\rho((0, t))$, $\alpha_t := \mathrm{Ad}\,V((0, t))$, $\beta_t^\rho := \mathrm{Ad}\,U_\rho(t)$, $z_\rho(t) := U_\rho(t)U(t)^*$. We will also use the following notations: $M := \mathcal{A}(0, +\infty)$, $M_b := \alpha_b(M) = \mathcal{A}(b, +\infty)$, $M_b^\rho := \beta_b^\rho(M)$, $b \in \mathbb{R}_+$. As $z_\rho(b) \in M_a$, $b > 0$, it follows that
$$M_b = M_b^\rho, \qquad b < a.$$
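The equality of the two algebras can be checked in one line. The sketch below assumes $0 < b < a$ and reads $\alpha_b$, $\beta_b^\rho$ as the translation automorphisms, so that $\beta_b^\rho = \mathrm{Ad}(z_\rho(b)) \circ \alpha_b$ by the definition of $z_\rho$.

```latex
\begin{align*}
M_b^\rho = \beta_b^\rho(M)
  &= z_\rho(b)\,\alpha_b(M)\,z_\rho(b)^*
   && \text{(since } V_\rho(g) = z_\rho(g)V(g)\text{)}\\
  &= z_\rho(b)\,M_b\,z_\rho(b)^* \\
  &= M_b
   && \text{(} z_\rho(b) \text{ is a unitary in } M_a \subset M_b \text{ by isotony, as } (a,\infty)\subset(b,\infty)\text{)}.
\end{align*}
```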
The Bisognano–Wichmann Property states that the one-parameter group $t \mapsto \alpha_{-2\pi t}$, $t \in \mathbb{R}$, coincides on $M$ with the modular group $t \mapsto \sigma_t := \mathrm{Ad}(\Delta^{it})$, $t \in \mathbb{R}$, and, since the cocycle $z_\rho(-2\pi t)$ is localized in $M$, by Connes' Theorem there exists a unique semifinite normal faithful (s.n.f.) weight $\psi_\rho$ on $M$ whose Radon–Nikodym derivative with respect to the vacuum state $\omega := (\Omega, \cdot\,\Omega)$ is given by $(D\psi_\rho : D\omega)_t = z_\rho(-2\pi t)$, see [25], Sect. 11. Then $t \mapsto \beta^\rho_{-2\pi t} = \mathrm{Ad}(z_\rho(-2\pi t)) \circ \alpha_{-2\pi t}$, $t \in \mathbb{R}$, is the modular group associated to the weight $\psi_\rho$ on $M$.
4. Automatic Positivity of the Energy in the Finite Index Case
Although we shall be mainly interested in sectors with infinite dimension, our proof will be more transparent after a preliminary analysis of the finite index case (recall that the index is the square of the dimension, see [20]). Let $\rho$ be an endomorphism of $A$ localized in $I \subset \mathbb{R}_+$. In the finite index case the following analog of the Kac–Wakimoto formula holds [21]:
$$(D\varphi_\rho : D\omega)_t = d(\rho)^{-it}\,z_\rho(-2\pi t),$$
where $\omega = (\Omega, \cdot\,\Omega)$ is the restriction of the vacuum state to $M = \mathcal{A}(\mathbb{R}_+)$, $\varphi_\rho$ is the state $\omega \circ \phi_\rho$ on $M$ and $\phi_\rho = \rho^{-1} \circ E_\rho$ is the minimal left inverse of $\rho : M \to M$, with $E_\rho : M \to \rho(M)$ the minimal conditional expectation. Let $D_\rho$ be the generator of the one-parameter group $t \mapsto U_\rho(t) =: e^{itD_\rho}$, $t \in \mathbb{R}$; we recall the following:
Proposition 4.1 ([21]). Let us assume that $\rho$ is covariant with finite index as above, and let $\psi_\rho$ be a positive linear functional on $M$; then the following are equivalent:
a) $\psi_\rho$ is normal, faithful and $(D\psi_\rho : D\omega)_t = z_\rho(-2\pi t)$, $t \in \mathbb{R}$,
b) $\psi_\rho = d(\rho)\,\omega \circ \phi_\rho$,
c) $\psi_\rho(XY^*) = (e^{-\pi D_\rho}X\Omega, e^{-\pi D_\rho}Y\Omega)$, $X, Y \in M$,
d) $\psi_\rho$ is normal faithful, $\sigma_t^{\psi_\rho} \circ \rho = \rho \circ \sigma_t^{\omega}$, $t \in \mathbb{R}$, and $\psi_\rho|_{\rho(M)' \cap M}$ is a trace whose value on a central projection $p$ is $\psi_\rho(p) = d(\rho_p)$, where $\rho_p$ is the subrepresentation associated to $p$. (In particular, if $\rho$ is irreducible, this last condition reduces to $\psi_\rho(I) = d(\rho)$.)
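Proof. We sketch the first part of the proof. We assume $\rho$ to be irreducible. We consider the states $\omega$, $\varphi_\rho := \omega \circ \phi_\rho$ on the von Neumann algebra $M$. Note that $\varphi_\rho \circ \rho = \omega$, i.e. $\varphi_\rho|_{\rho(M)} = \omega \circ \rho^{-1}$. From $\varphi_\rho = \varphi_\rho \circ E_\rho$ it follows that $\rho(M)$ is $\sigma^{\varphi_\rho}$-stable by Takesaki's theorem, therefore
$$\sigma_t^{\varphi_\rho}|_{\rho(M)} = \sigma_t^{\varphi_\rho|_{\rho(M)}} = \rho \circ \sigma_t^{\omega} \circ \rho^{-1}.$$
Now defining $v_t := (D\varphi_\rho : D\omega)_t \in M$ we have $v_t\,\sigma_t^{\omega}(X)\,v_t^* = \sigma_t^{\varphi_\rho}(X)$, $X \in M$, thus
$$v_t\,\sigma_t^{\omega}(\rho(X))\,v_t^* = \sigma_t^{\varphi_\rho}(\rho(X)) = \rho \circ \sigma_t^{\omega} \circ \rho^{-1}(\rho(X)) = \rho(\sigma_t^{\omega}(X)) = \beta^\rho_{-2\pi t}(\rho(X)) = z_\rho(-2\pi t)\,\sigma_t^{\omega}(\rho(X))\,z_\rho(-2\pi t)^*, \qquad X \in M.$$
Hence
$$z_\rho(-2\pi t)^*\,v_t \in \sigma_t^{\omega}(\rho(M))' \cap M = \sigma_t^{\omega}(\rho(M)' \cap M) = \mathbb{C}.$$
Now to complete the argument with regard to the phase $d(\rho)$ in b), we refer to [21] part 1. The proof of the point c) will be easily obtained by polarization of the first formula contained in Proposition 4.2. Assuming d), to obtain a), notice that this condition determines $(D\psi_\rho : D\omega)_t$ up to the multiplication by a cocycle in $\rho(M)' \cap M$, hence $\psi_\rho$ is determined by the specification of $\psi_\rho|_{\rho(M)' \cap M}$, which we require to be $d(\rho)$ times the restriction to $\rho(M)' \cap M$ of the minimal expectation of $M$ onto $\rho(M)$.
Now we have $\beta^\rho_{-2\pi t} = \sigma_t^{\psi_\rho}\ (= \sigma_t^{\varphi_\rho}) = \mathrm{Ad}(\Delta_\xi^{it}) = \mathrm{Ad}(\Delta_{\xi,\Omega}^{it})$ on $M$, where $\varphi_\rho = (\xi, \cdot\,\xi)$, $\|\xi\| = 1$ and $\xi$ is cyclic for $M$ (e.g. $\xi$ is the vector representative of $\varphi_\rho$ in the natural cone of $M$ given by $\Omega$). Clearly $\psi_\rho \circ \beta_t^\rho = \psi_\rho$.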
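Using the fact that $\rho$ is localized in $\mathbb{R}_+$ and recalling the definition of $\psi_\rho$, it is easy to check that $U_\rho(-2\pi t) = z_\rho(-2\pi t)U(-2\pi t) = z_\rho(-2\pi t)\Delta^{it}$ coincides up to the phase $d(\rho)^{it}$ with $\Delta_{\xi,\Omega}^{it} = \Delta_{\varphi_\rho,\omega}^{it}$, where $\Delta_{\xi,\Omega}$ is the Araki relative modular operator, see [4], namely $S_{\xi,\Omega} = J_{\xi,\Omega}\Delta_{\xi,\Omega}^{1/2}$ is the polar decomposition of the closure of $X\Omega \mapsto X^*\xi$, $X \in M$. In fact we have
$$2\pi D_\rho = -\log \Delta_{\xi,\Omega} - \log d(\rho),$$
see [21]. Hence by the commutation relations of the group $P$ we obtain
$$\Delta_{\xi,\Omega}^{it}\,T_\rho(a)\,\Delta_{\xi,\Omega}^{-it} = T_\rho(e^{-2\pi t}a), \qquad t, a \in \mathbb{R}.$$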
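This relation is the image under $V_\rho$ of a group identity in $P$. A sketch of the underlying computation, using the action $(a,t)x = a + e^{t}x$ from Sect. 3; the letter $s$ for a generic dilation parameter is introduced here for illustration only.

```latex
\begin{align*}
(0,s)\,(a,0)\,(0,-s)\,x
  &= (0,s)\,(a,0)\,\bigl(e^{-s}x\bigr)
   = (0,s)\,\bigl(a + e^{-s}x\bigr)
   = e^{s}a + x, \\
\text{hence}\qquad U_\rho(s)\,T_\rho(a)\,U_\rho(s)^* &= T_\rho(e^{s}a).
\end{align*}
% Take s = -2\pi t. Since U_\rho(-2\pi t) differs from \Delta_{\xi,\Omega}^{it}
% only by a scalar phase, the phase cancels under conjugation:
\[
\Delta_{\xi,\Omega}^{it}\,T_\rho(a)\,\Delta_{\xi,\Omega}^{-it} = T_\rho(e^{-2\pi t}a).
\]
```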
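The argument given in Lemma 2.2 amounts to a proof that the invariance condition holds if the dimension of $\rho$ is finite, cf. [21], Prop. 2.11.
Proposition 4.2. Let $\rho$ be a covariant endomorphism with finite dimension localized in $I \subset \mathbb{R}_+$, and let $M$, $T_\rho(a)$, $a \in \mathbb{R}$, $\varphi_\rho$ be as above. Then $\varphi_\rho$ is $\beta_a^\rho$-invariant on $M$, for every $a \geq 0$.
Proof. If $X \in M$ we have
$$\varphi_\rho(X^*X) = (\xi, X^*X\xi) = \|X\xi\|^2 = \|S_{\xi,\Omega}X^*\Omega\|^2 = \|J_{\xi,\Omega}\Delta_{\xi,\Omega}^{1/2}X^*\Omega\|^2 = \|\Delta_{\xi,\Omega}^{1/2}X^*\Omega\|^2$$
(see formula (3.3) in [21]), therefore if $a > 0$ and $z_\rho(a)$ denotes the cocycle with respect to the translation by $a$,
$$\begin{aligned}
\varphi_\rho(T_\rho(a)X^*XT_\rho(a)^*) &= \|\Delta_{\xi,\Omega}^{1/2}\,T_\rho(a)X^*T_\rho(a)^*\Omega\|^2 \\
&= \|\Delta_{\xi,\Omega}^{1/2}\,T_\rho(a)X^*z_\rho(-a)T(a)^*\Omega\|^2 \\
&= \|\Delta_{\xi,\Omega}^{1/2}\,T_\rho(a)X^*z_\rho(-a)\Omega\|^2 \\
&= \|T_\rho(a)\Delta_{\xi,\Omega}^{1/2}\,X^*z_\rho(-a)\Omega\|^2 \quad \text{(by Lemma 2.2)} \\
&= \|\Delta_{\xi,\Omega}^{1/2}\,(X^*z_\rho(-a))\Omega\|^2 = \|z_\rho(-a)^*X\xi\|^2 = (X\xi, X\xi).
\end{aligned}$$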
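A few elementary identities are used without comment in the chain above. They are collected here as a reading aid; $\Omega$ denotes the vacuum vector, and the third line assumes, as the computation itself does, that $Y = z_\rho(-a)^*X$ may be treated as an element of $M$.

```latex
\begin{align*}
z_\rho(-a) &= T_\rho(-a)\,T(-a)^* = T_\rho(a)^*\,T(a)
  && \Longrightarrow\quad T_\rho(a)^* = z_\rho(-a)\,T(a)^*, \\
T(a)^*\Omega &= \Omega
  && \text{(translation invariance of the vacuum)}, \\
\|\Delta_{\xi,\Omega}^{1/2}\,Y^*\Omega\| &= \|S_{\xi,\Omega}\,Y^*\Omega\| = \|Y\xi\|
  && \text{(} J_{\xi,\Omega} \text{ is antiunitary; here } Y = z_\rho(-a)^*X\text{)}, \\
\|z_\rho(-a)^*X\xi\| &= \|X\xi\|
  && \text{(} z_\rho(-a) \text{ is unitary)}.
\end{align*}
```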
In the sequel we write $T\,\eta\,M$ to denote that the (generally unbounded) linear operator $T$ on $\mathcal{H}$ is affiliated to $M \subset B(\mathcal{H})$, see [26], 9.7.
Proposition 4.3. Let $\rho$ be as above. For any given $a \geq 0$, the function $t \mapsto \Delta_{\xi,\Omega}^{it}\,T_\rho(a)\,\Delta_{\xi,\Omega}^{-it} = T_\rho(e^{-2\pi t}a)$ admits an analytic continuation inside the strip $\{z \in \mathbb{C} \mid -\frac{1}{2} < \operatorname{Im} z < 0\}$ which is bounded in norm by 1.
Proof. The existence of the analytic continuation inside the strip follows from general arguments, see e.g. [7] p. 241. The bound 1 is thus a consequence of the Hadamard three line theorem, once we check that the norm of the function is bounded by 1 on the lines $\operatorname{Im} z = 0$ (this is obvious) and $z = -\frac{i}{2} + t$, $t \in \mathbb{R}$, and has an a priori global bound on the entire strip. We now check that $\Delta_{\xi,\Omega}^{1/2}\,T_\rho(a)\,\Delta_{\xi,\Omega}^{-1/2}$, or equivalently $S_{\xi,\Omega}\,T_\rho(a)\,S_{\xi,\Omega}^{-1}$, is extended by an isometric operator. We write for short $S = S_{\xi,\Omega}$.
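For reference, the three line theorem invoked here can be stated in the following scalar form, adapted to the strip used above; it is meant to be applied to matrix coefficients of the operator-valued function. The symbols $f$, $M_0$, $M_1$ and $\theta$ are notation introduced for this statement only.

```latex
Let $f$ be bounded and continuous on the closed strip
$\{z \in \mathbb{C} : -\tfrac{1}{2} \le \operatorname{Im} z \le 0\}$
and holomorphic in its interior. If, for constants $M_0, M_1 > 0$,
\[
  |f(t)| \le M_0
  \quad\text{and}\quad
  |f(t - \tfrac{i}{2})| \le M_1
  \qquad (t \in \mathbb{R}),
\]
then for every $0 \le \theta \le 1$
\[
  |f(t - \tfrac{i}{2}\theta)| \le M_0^{\,1-\theta}\, M_1^{\,\theta}
  \qquad (t \in \mathbb{R}).
\]
In particular $M_0 = M_1 = C$ gives $|f| \le C$ on the whole strip.
```

The boundedness hypothesis on the closed strip is what the "a priori global bound" in the proof is needed for.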
Note that $S\,T_\rho(a)\,S^{-1}$ is isometric on the dense subspace $M\xi$: given $X \in M$, we have
$$S\,T_\rho(a)\,S^{-1}X^*\xi = S\,T_\rho(a)X\Omega = S\,z_\rho(a)T(a)XT(a)^*\Omega = T(a)X^*T(a)^*z_\rho(a)^*\xi = T(a)X^*T_\rho(a)^*\xi,$$
therefore $M\xi \subset D(S\,T_\rho(a)\,S^{-1}) \subset D(S^{-1})$ and $\|S\,T_\rho(a)\,S^{-1}X^*\xi\| = \|X^*\xi\|$, $X \in M$, by the $\beta_a^\rho$-invariance of $\varphi_\rho$ shown in Prop. 4.2. As is known $D(S) = \{T\Omega \mid T\,\eta\,M,\ \Omega \in D(T),\ \xi \in D(T^*)\}$, and using Lemma 2.2 it is direct to verify that $\|S\,T_\rho(a)\,S^{-1}\zeta\| = \|\zeta\|$ for every $\zeta \in D(S^{-1})$ for which the l.h.s. is well defined. It remains to check the a priori bound on the strip. This may be derived from the bound on matrix coefficients $|(\Delta_{\xi,\Omega}^{iz}\,T_\rho(a)\,\Delta_{\xi,\Omega}^{-iz}\zeta_1, \zeta_2)| \leq \|\zeta_1\|\,\|\zeta_2\|$, with $-\frac{1}{2} < \operatorname{Im} z < 0$, and $\zeta_1$, $\zeta_2$ in spectral subspaces for $\log \Delta_{\xi,\Omega}$ with respect to bounded intervals. The bound follows by the three line theorem, because the involved functions are bounded on the strip. Notice that the same result may be obtained using Prop. 2.4 or the proposition stated in [26], p. 219.
Corollary 4.4. Let $\rho$ be a covariant endomorphism with finite dimension localized in $I \subset \mathbb{R}_+$, and $M$, $T_\rho(a)$ as above. Then in the sector $\rho$ the energy (the generator of the 1-parameter group $T_\rho(a)$) is positive.
Proof. It is immediate from (the proofs of) Prop. 2.4 (the core is $M\Omega$) and Prop. 4.2, cf. Prop. 4.3.
5. Infinite Index (Weight) Case
5.1. General considerations. We shall now begin an analysis of sectors with infinite dimension, by an extension of the previous methods. In the following $\mathcal{A}$ is a net on $\mathbb{R}$ as in the previous section, namely $\mathcal{A}$ is obtained by restricting a local conformal precosheaf of von Neumann algebras on $S^1$. Let $\rho$ be a covariant endomorphism localized in a half-line strictly contained in $\mathbb{R}_+$ as before, but not necessarily with finite index. We know from [21], Sect. 2, that
$$U_\rho(-2\pi t) =: e^{-2\pi i t D_\rho} = \Delta(\psi_\rho/\omega')^{it} \tag{5.1}$$
(cf. the paragraphs following Prop. 4.1 and Lemma A.2) where $\omega'$ is the restriction of the vacuum state to $M'$, and $\Delta(\psi_\rho/\omega')$ is the Connes spatial derivative, with $\psi_\rho$ the weight on $M$ defined at the end of Sect. 3. Although $\psi_\rho$ is unbounded in general, $\omega'$ is a state, represented by the vector $\Omega$. This suggests that the formula $\psi_\rho(XX^*) = \|e^{-\pi D_\rho}X\Omega\|^2$ still expresses $\psi_\rho$ and is useful in proving its invariance properties, cf. [21] (3.3). We shall show this is in fact the case. To this end we need to discuss certain aspects of the spatial theory of von Neumann algebras, that may be of independent interest, and
we collect them in Appendix 5. We refer to this appendix for notations and notions used here below. Notice now that Eq. (5.1) gives the dilation-translation commutation relations
$$\Delta(\psi_\rho/\omega')^{it}\,T_\rho(a)\,\Delta(\psi_\rho/\omega')^{-it} = T_\rho(e^{-2\pi t}a), \qquad a, t \in \mathbb{R},$$
so we may still apply the analysis made in Sect. 2. We use the following notations:
$$\psi_a := \psi_\rho \circ \mathrm{Ad}(T_\rho(a)), \qquad a > 0,$$
is a faithful normal weight on $M$;
$$\mathfrak{N} := \mathfrak{N}_{\psi_\rho}, \qquad \mathfrak{N}_a := \{X \in M \mid \psi_a(X^*X) < \infty\}$$
are left ideals in $M$ for any $a > 0$. Note that $\sigma_t^{\psi_\rho}(\mathfrak{N}_a) = \mathfrak{N}_{e^{2\pi t}a}$, $a > 0$, $t \in \mathbb{R}$, and that if $X \in M$ and $a > 0$ then, cf. Prop. 4.2 and Lemma A.1,
$$X^*\Omega \in D\bigl(\Delta(\psi_\rho/\omega')^{1/2}\,T_\rho(a)\bigr) \iff X \in \mathfrak{N}_a.$$
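As a first result we show that the spectrum of the generator $H_\rho$ of the translation group relative to the sector $\rho$ always contains the positive real line.
Proposition 5.1. Let $\mathcal{A}$ be a local net of von Neumann algebras on $\mathbb{R}$ as in Sect. 3 and $\rho$ a covariant morphism localized in $(1, +\infty)$. Then $[0, +\infty) \subset \mathrm{Sp}(H_\rho)$.
Proof. We already know that $\mathfrak{N}$ is dense in $M$ by the semifiniteness of $\psi_\rho$. By dilation covariance $\mathrm{Sp}(H_\rho)$ is either $\mathbb{R}_+ \cup \{0\}$, $\mathbb{R}_- \cup \{0\}$, or $\mathbb{R}$, thus we assume it to be $\mathbb{R}_- \cup \{0\}$ to find a contradiction. We then have $D(\Delta(\psi_\rho/\omega')^{1/2}) \subset D(\Delta(\psi_\rho/\omega')^{1/2}\,T_\rho(a)^*)$, $a > 0$, see Sect. 2. We take $X \in \mathfrak{N}^*$ and define $\zeta = T_\rho(a)^*XT(a)\Omega$, thus we have $\zeta \in D(\Delta(\psi_\rho/\omega')^{1/2})$. Now we observe that the closed operator $T_\zeta\,\eta\,M$ as defined in Lemma A.3 coincides with the bounded operator $T_\rho(a)^*XT(a)$ which a priori is in $M_{-a}$; in fact the two operators coincide when restricted to the dense vector space $M_{-a}'\Omega$. Therefore $T_\zeta$ is bounded and $T_\zeta = T_\rho(a)^*XT(a) \in M$. It follows that $X \in M_a$ whenever $a > 0$ is small enough. Therefore we have $\mathfrak{N} \subset M_a$, thus $M \subset M_a$ which is not possible.
Lemma 5.2. $0 < b < a \Rightarrow \mathfrak{N} \cap \mathfrak{N}_a \subseteq \mathfrak{N} \cap \mathfrak{N}_b$.
Proof. By Lemma A.1 if $X^* \in \mathfrak{N} \cap \mathfrak{N}_a$, then $X\Omega \in D(\Delta(\psi/\omega')^{1/2}) \cap D(\Delta(\psi/\omega')^{1/2}\,T_\rho(a))$, thus by Corollary 2.3, $X\Omega \in D(\Delta(\psi/\omega')^{1/2}) \cap D(\Delta(\psi/\omega')^{1/2}\,T_\rho(b))$, hence $X^* \in \mathfrak{N} \cap \mathfrak{N}_b$ by Lemma A.1.
Proposition 5.3. $\mathfrak{N} \subset \mathfrak{N}_a$ for any $a > 0$ if and only if $\mathfrak{N} = \cup_{a>0}\,\mathfrak{N} \cap \mathfrak{N}_a$.
Proof. One implication is obvious. To show the converse assume $\mathfrak{N} = \cup_{a>0}\,\mathfrak{N} \cap \mathfrak{N}_a$ and let $X \in \mathfrak{N}$ be fixed. Then there exists $a = a_1 > 0$ such that $X \in \mathfrak{N} \cap \mathfrak{N}_a$, i.e. $\psi_\rho(X^*X) = \psi_\rho \circ \beta_a^\rho(X^*X) < \infty$, thus $\beta_a^\rho(X) \in \mathfrak{N}$; by iteration we find an increasing sequence $a_n > 0$ such that $\psi_\rho \circ \beta_{a_n}^\rho(X^*X) < \infty$. Let us consider $a^* := \sup\{a \in \mathbb{R}_+ \mid X \in \mathfrak{N}_a\}$, $a_n \nearrow a^*$. But $\psi_\rho \circ \beta_{a^*}^\rho(X^*X) \leq \lim_n \psi_\rho \circ \beta_{a_n}^\rho(X^*X) = \psi_\rho(X^*X)$ by lower semicontinuity, therefore $a^* = \infty$ and we are done.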
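Proposition 5.4. We have $\cap_{a>0}\,\mathfrak{N}_a \subseteq \mathfrak{N}$.
Proof. If $X \in \cap_{a>0}\,\mathfrak{N}_a$, we have $\psi_\rho \circ \beta_a^\rho(X^*X) = \psi_\rho \circ \beta_b^\rho(X^*X)$ for every $a, b > 0$.
We summarize the last results in the following proposition, although only the equivalence between the first two points will be needed.
Proposition 5.5. The following assertions are equivalent:
1) $\rho$ has positive energy;
2) $\mathfrak{N} \subset \mathfrak{N}_a$ for some (⇔ for all) $a > 0$;
3) $\mathfrak{N} = \cup_{a>0}\,(\mathfrak{N} \cap \mathfrak{N}_a)$;
4) $\mathfrak{N} = \cap_{a>0}\,\mathfrak{N}_a$.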
Proof. We shall show the equivalence 1) ⇔ 2); the other results are immediate from Lemmata 5.2, 5.3, 5.4.
1) ⇒ 2) From Lemma A.1 we have: $\mathfrak{N}^* = \{X \in M \mid X\Omega \in D(\Delta(\psi/\omega')^{1/2})\}$ and $\mathfrak{N}_a^* = \{X \in M \mid X\Omega \in D(\Delta(\psi/\omega')^{1/2}\,T_\rho(a))\}$, cf. Prop. 4.2. Now, as already recalled in the proof of Lemma 2.2 (from [21], Corollary 2.8), if $\rho$ has positive energy, $D(\Delta(\psi/\omega')^{1/2}) \subset D(\Delta(\psi/\omega')^{1/2}\,T_\rho(a))$ for any $a > 0$ and so: $\mathfrak{N}^* \subset \mathfrak{N}_a^*$.
2) ⇒ 1) From [25], pp. 94–95, we have that $\mathfrak{N}^*\Omega$ is a core for $\Delta(\psi/\omega')^{1/2}$. If $\mathfrak{N}^* \subset \mathfrak{N}_a^*$, by Lemma A.1 $\mathfrak{N}^*\Omega \subset D(\Delta(\psi/\omega')^{1/2}\,T_\rho(a))$, and so by Proposition 2.4 b), c), we get the positivity of the energy.
5.2. Sectors of Haag dual nets on $\mathbb{R}$. In the following $\mathcal{A}$ is a net of von Neumann algebras on $\mathbb{R}$ satisfying the six properties listed in Sect. 3. Furthermore we require Haag duality on $\mathbb{R}$. Equivalently we assume our net to be strongly additive, meaning that $\mathcal{A}(a, b) \vee \mathcal{A}(b, c) = \mathcal{A}(a, c)$ for every $a < b < c$, $a, b, c \in \mathbb{R}$. It follows that $\mathcal{A}(a, b)' \cap \mathcal{A}(a, c) = \mathcal{A}(b, c)$ [19]. Let $\rho$ be a covariant morphism of $\mathcal{A}$ localized in $(b, +\infty)$, $0 < b$. As always we assume that $\rho$ is localizable in every half-line, thus it extends to a normal endomorphism of $\mathcal{A}(b, \infty)$. As recalled in Sect. 3, we obtain a localized cocycle $t \mapsto z_\rho(-2\pi t) \in M := \mathcal{A}(0, \infty)$, thus a s.n.f. weight $\psi_\rho$ on $M$ such that $(D\psi_\rho : D\omega)_t = z_\rho(-2\pi t)$, $t \in \mathbb{R}$. We have $U_\rho(-2\pi t) = \Delta(\psi_\rho/\omega')^{it}$ (see formula (5.1)). For each $X \in M$ we have
$$\begin{aligned}
\sigma_t^{\psi_\rho}(\rho(X)) &= z_\rho(-2\pi t)\,\alpha_{-2\pi t}(\rho(X))\,z_\rho(-2\pi t)^* \\
&= z_\rho(-2\pi t)\,\bigl(\alpha_{-2\pi t}\,\rho\,\alpha_{2\pi t}(\alpha_{-2\pi t}(X))\bigr)\,z_\rho(-2\pi t)^* = \rho(\alpha_{-2\pi t}(X)) \\
&= \rho\,\alpha_{-2\pi t}\,\rho^{-1}(\rho(X)) = \sigma_t^{\omega\rho^{-1}}(\rho(X)), \qquad t \in \mathbb{R},
\end{aligned}$$
therefore by Haagerup's theorem [25], p. 164, there exists a unique s.n.f. operator valued weight $E := E_\rho : M^+ \to \overline{\rho(M)}^+$ (where the symbol $\overline{\rho(M)}^+$ denotes the extended positive part of the von Neumann algebra $\rho(M)$, see Appendix 5) such that $\psi_\rho = \omega\rho^{-1}E$. Here $\omega\rho^{-1}$ has to be thought of as the unique extension to $\overline{\rho(M)}^+$ such that $\omega\rho^{-1}(n) = n(\omega\rho^{-1})$, $n \in \overline{\rho(M)}^+$. In particular $E(X^*YX) = X^*E(Y)X$ for every $X \in \rho(M)$, $Y \in M^+$. $E$ may be uniquely "extended" to a (densely defined) linear mapping (also denoted by $E$) $M_E := \mathrm{lin}\{X \in M^+ ;\ E(X) \in \rho(M)^+\} \to \rho(M)$ with image an ultraweakly dense two-sided ideal in $\rho(M)$, such that $E(\rho(Y)X\rho(Z)) = \rho(Y)E(X)\rho(Z) \in \rho(M)$ for every $Y, Z \in M$, $X \in M_E$. Clearly $M_E \subseteq M_{\psi_\rho}$. We consider the unbounded left inverse of
$\rho$ defined by $\phi_\rho := \rho^{-1} \circ E : M^+ \to \rho^{-1}(\overline{\rho(M)}^+) = \overline{M}^+$; by linearity $\phi_\rho : M_E \to M$. Clearly $\rho\phi_\rho = E$, $\psi_\rho = \omega\phi_\rho$ both on $M^+$, $M_E$, and $\phi_\rho(\rho(Y)X\rho(Z)) = Y\phi_\rho(X)Z$ for every $Y, Z \in M$, $X \in M_E$.
Lemma 5.6. With the notations above we have $E(X) \in \overline{M_a}^+$ for every $X \in M_a^+$ with $0 < a$ sufficiently small.
Proof. Let $N := M_a' \cap M$; making use of the strong additivity property and duality we obtain that $N = \mathcal{A}(0, a)$ (it is immediate to see that $\mathcal{A}(0, a) \subset N$; on the other hand if $x \in N$ then $x \in \mathcal{A}(0, a) = \mathcal{A}(0, a)'' = (M' \vee M_a)'$). If $a$ is sufficiently small, so that $\rho$ is localized in $(a, \infty)$, then $\rho(u) = u$ for every $u \in N$. For every $X \in M_a^+$ and unitary $u \in N$, we have $X = uXu^*$, so that $E(X) = E(uXu^*)$. Now using the fact that $u = \rho(u)$ (because $u \in N$), and that $E$ is an operator valued weight, it follows that $E(X) = uE(X)u^*$ (in fact $E(uXu^*) = E(\rho(u)X\rho(u)^*) = \rho(u)E(X)\rho(u)^* = uE(X)u^*$). We know that $E(X)$ can be uniquely written as $he + \infty(I - e)$, with $e \in P(\rho(M)) \subset P(M)$ ($P(M)$ is the set of all the projections of $M$) and $h\,\eta\,e\rho(M)e$ positive. On the other hand $E(X) = uE(X)u^* = uhu^*\,ueu^* + \infty(I - ueu^*)$, and using the uniqueness of the decomposition we obtain: $e = ueu^*$ and $h = uhu^*$ for every $u \in N$. From this, with the help of strong additivity ($N' \cap M = M_a$), we obtain $e \in M \cap N' = M_a$ (because $e \in P(M) \subset M$ and $e$ commutes with every $u \in N$). In the same way we can say that $h\,\eta\,eM_ae$. In fact, the bounded parts $h_n$ of $h$ are in $N'$ (because $h$ commutes with every unitary in $N$) and in $eMe$ (because $h\,\eta\,eMe$); using the fact that $eMe \cap N' = eM_ae$ (clearly $eM_ae \subset eMe \cap N'$, since $e \in M_a \subset N'$, and if $x \in eMe \cap N'$, $x = eme = e(eme)e$ with $eme \in M \cap N' = M_a$), we get $h_n \in eM_ae$ and thus $h\,\eta\,eM_ae$. From the fact that $E(X) = he + \infty(I - e)$ with $h\,\eta\,eM_ae$ and $e \in M_a$ the result follows: $E(X) \in \overline{M_a}^+$.
Lemma 5.7. Let $M$, $N$ be von Neumann algebras on the Hilbert space $\mathcal{H}$. We have
$$\overline{M}^+ \cap \overline{N}^+ = \overline{M \cap N}^+.$$
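Proof. Both $\overline{M}^+$, $\overline{N}^+$ can be embedded in $\overline{B(\mathcal{H})}^+$. $m \in \overline{M}^+ \cap \overline{N}^+$ can be uniquely written as $he + \infty(I - e)$, with $e \in P(M)$, $h\,\eta\,eMe$, and as $h'e' + \infty(I - e')$, with $e' \in P(N)$, $h'\,\eta\,e'Ne'$. It follows that $e = e' \in P(M \cap N)$, and $h = h'\,\eta\,M \cap N$.
Lemma 5.8. Let $M = \mathcal{A}(0, \infty)$, and $\rho$ localized in $(b, +\infty)$, $0 < b$. Then we have $\beta_a^\rho(M) \cap \rho(M) = \rho(\beta_a^\rho(M))$ whenever $0 < a < b$.
Proof. The inclusion $\beta_a^\rho(M) \cap \rho(M) \supset \rho(\beta_a^\rho(M))$ is obvious. We choose unitaries $u_n \in (\rho, \rho_n)$ with $\rho_n$ localized in $(-\infty, c_n)$ with $c_n \to -\infty$. Any weak limit point of the sequence $\mathrm{Ad}(u_n)$ is a map $\tilde\phi : B(\mathcal{H}) \to B(\mathcal{H})$ such that $\tilde\phi\rho = \mathrm{id}$ on $\cup_{l \in \mathbb{R}}\,\mathcal{A}(l, \infty)$. $\tilde\phi$ is a non normal left-inverse of $\rho$, cf. [11]. Let $X \in \beta_a^\rho(M) \cap \rho(M)$. Then $X = \rho(Y)$ for some $Y \in M$, and $Y = \tilde\phi(X)$. For every $Z \in \mathcal{A}(I)$, $I \subset (-\infty, a)$ we have $ZY = Z\tilde\phi(X) = \tilde\phi(\rho(Z)X) = \tilde\phi(ZX) = \tilde\phi(XZ) = YZ$, therefore $Y$ commutes with $\cup_{I \subset (-\infty, a)}\,\mathcal{A}(I)$. By duality $Y \in \beta_a^\rho(M) = \alpha_a(M)$ and we are done.
Proposition 5.9. Let $\rho$ be a covariant (transportable) morphism of $\mathcal{A}$ localized in $(b, +\infty)$, $b > 0$, $M := \mathcal{A}(0, \infty)$, $E : M^+ \to \overline{\rho(M)}^+$ defined as above, and $E_a := \rho\,\alpha_{-a}\,\rho^{-1}E\beta_a^\rho$. Then $E_a : M^+ \to \overline{\rho(M)}^+$ is a s.n.f. operator valued weight, for $0 < a < b$. If $\rho$ is irreducible and $\psi_a$ is semifinite, we have $E = E_a$, thus $\psi_\rho = \psi_a$.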
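Proof. We have
$$\begin{aligned}
X \in M^+ \ \Rightarrow\ \beta_a^\rho(X) \in M_a^+ \ \Rightarrow\ E\beta_a^\rho(X) &\in \overline{\rho(M)}^+ \cap \overline{M_a}^+ && \text{(by Lemma 5.6)} \\
&= \overline{\rho(M) \cap M_a}^+ && \text{(by Lemma 5.7)} \\
&= \overline{\rho(M_a)}^+ && \text{(by Lemma 5.8)}
\end{aligned}$$
$$\Rightarrow\ \rho^{-1}E\beta_a^\rho(X) \in \overline{M_a}^+ \ \Rightarrow\ \alpha_{-a}\rho^{-1}E\beta_a^\rho(X) \in \overline{M}^+ \ \Rightarrow\ \rho\,\alpha_{-a}\rho^{-1}E\beta_a^\rho(X) \in \overline{\rho(M)}^+.$$
$E_a$ is an operator valued weight. We check that for every $X \in M^+$ and $Y \in M$,
$$\begin{aligned}
E_a(\rho(Y)^*X\rho(Y)) &= \rho\,\alpha_{-a}\rho^{-1}E\bigl(\rho(\alpha_a(Y^*))\,\beta_a^\rho(X)\,\rho(\alpha_a(Y))\bigr) \\
&= \rho\,\alpha_{-a}\bigl(\alpha_a(Y^*)\,\rho^{-1}(E(\beta_a^\rho(X)))\,\alpha_a(Y)\bigr) \\
&= \rho\bigl(Y^*\,\alpha_{-a}\rho^{-1}(E(\beta_a^\rho(X)))\,Y\bigr) \\
&= \rho(Y^*)\,\rho\bigl(\alpha_{-a}(\rho^{-1}(E(\beta_a^\rho(X))))\bigr)\,\rho(Y) \\
&= \rho(Y^*)\,E_a(X)\,\rho(Y).
\end{aligned}$$
$E_a$ is normal, thus also faithful and semifinite (see [25], p. 155) since $\omega\rho^{-1} \circ E_a = \omega\rho^{-1}E\beta_a^\rho = \psi_\rho\beta_a^\rho$, which is faithful (since $\psi_\rho$ is) and semifinite by hypothesis. In the irreducible case it follows by uniqueness that $E = \lambda_aE_a$, $\lambda_a \in \mathbb{R}_+$, see e.g. [25], p. 174, Corollary 12.13 (see also [12], Prop. 11.1), thus $\psi_\rho = \lambda_a\psi_a$ and $\lambda_a = 1$ by Sect. 2, 5.
Corollary 5.10. Let $\rho$ be a covariant irreducible morphism of $\mathcal{A}$ localized in $(b, +\infty)$, $b > 0$. Then the following are equivalent:
1) $\psi_a$ is semifinite for some (⇔ for all) $a < b$;
2) $\rho$ has positive energy;
3) $\psi_\rho = \psi_a$ for some (⇔ for all) $a > 0$.
Proof. It is immediate from Prop. 2.4, cf. Prop. 5.5, and Prop. 5.9.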
Our result entails the following one for sectors on non strongly additive nets.
Corollary 5.11. Let $\mathcal{A}$ be a net on $\mathbb{R}$ satisfying the properties of Sect. 3 (but not necessarily strongly additive). Let $\rho$ be an irreducible covariant morphism of $\mathcal{A}$ localized in a bounded interval $I$. If $\rho$ acts identically on $\mathcal{A}(1, \infty)' \cap \mathcal{A}(0, \infty)$, then $\rho$ has positive energy iff $\psi_a$ is semifinite.
Proof. We may assume $I = (0, 1)$. $\rho$ extends to a morphism $\tilde\rho$ of the dual net $\mathcal{A}^d$, covariant with respect to the same representation of the translation-dilation group. Since $\mathcal{A}^d(a, 0) = \mathcal{A}(0, \infty)' \cap \mathcal{A}(a, \infty)$, $a < 0$, and $\mathcal{A}^d(1, b) = \mathcal{A}(b, \infty)' \cap \mathcal{A}(1, \infty)$, $b > 1$, it follows that $\tilde\rho$ is still localized in $(0, 1)$. Now $\mathcal{A}^d$ is strongly additive [19], thus $\tilde\rho$, hence $\rho$, has positive energy by Prop. 5.9.
If $(\rho, V_\rho)$ and $(\sigma, V_\sigma)$ are two covariant morphisms, then $\rho\sigma$ is covariant with respect to the representation of $P$ given by
$$V_{\rho\sigma}(g) := \rho(z_\sigma(g))\,V_\rho(g), \qquad g \in P, \tag{5.2}$$
as shown by the following computation ($X \in M$, $g \in P$):
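$$\begin{aligned}
V_{\rho\sigma}(g)\,(\rho \circ \sigma(X))\,V_{\rho\sigma}(g)^* &= \rho(z_\sigma(g))\,V_\rho(g)\,\rho \circ \sigma(X)\,V_\rho(g)^*\,\rho(z_\sigma(g))^* \\
&= \rho(z_\sigma(g))\,\rho\bigl(V(g)\sigma(X)V(g)^*\bigr)\,\rho(z_\sigma(g)^*) \\
&= \rho\bigl(z_\sigma(g)V(g)\sigma(X)V(g)^*z_\sigma(g)^*\bigr) \\
&= \rho\bigl(V_\sigma(g)\sigma(X)V_\sigma(g)^*\bigr) = \rho \circ \sigma\bigl(V(g)XV(g)^*\bigr).
\end{aligned}$$
In the following we will show that if two irreducible sectors $\rho$, $\sigma$ are covariant with positive energy with respect to the representations $V_\rho$, $V_\sigma$, then $V_{\rho\sigma}$ defined above is a positive energy representation.
Proposition 5.12. Let $\mathcal{A}$ be a strongly additive net on $\mathbb{R}$, and $\rho$, $\sigma$ be two covariant irreducible morphisms with positive energy. Then the representation $V_{\rho\sigma}$ defined by Eq. (5.2) has positive energy.
Proof. To prove this we show the translational invariance of the unbounded left inverse $\phi_{\rho\sigma}$ (cf. the last subsection), namely $\alpha_a^{-1}\phi_{\rho\sigma}\beta_a^{\rho\sigma} = \phi_{\rho\sigma}$, $a > 0$. As a preliminary result we shall now prove that:
$$\phi_{\rho\sigma} = \phi_\sigma\phi_\rho. \tag{5.3}$$
Let $\psi$ be a s.n.f. weight on the von Neumann algebra $M$ and $\rho$ be an isomorphism of $M$ onto its subalgebra $\rho(M)$. Then we have
$$\sigma_t^{\psi\rho^{-1}}(Y) = \rho \circ \sigma_t^{\psi} \circ \rho^{-1}(Y), \qquad Y \in \rho(M),$$
as one sees by checking the invariance and the KMS property. From this, given two s.n.f. weights $\psi_i$, $i = 1, 2$, on $M$, we obtain by direct computation
$$\rho\bigl((D\psi_1 : D\psi_2)_t\bigr) = (D\psi_1\rho^{-1} : D\psi_2\rho^{-1})_t. \tag{5.4}$$
Now we have
$$z_{\rho\sigma}(t) = (D\psi_\sigma\rho^{-1}E_\rho : D\omega)_t \tag{5.5}$$
as shown by the following computation:
$$\begin{aligned}
z_{\rho\sigma}(t) = \rho(z_\sigma(t))\,z_\rho(t) &= \rho\bigl((D\psi_\sigma : D\omega)_t\bigr)\,(D\psi_\rho : D\omega)_t \\
&= (D\psi_\sigma\rho^{-1} : D\omega\rho^{-1})_t\,(D\psi_\rho : D\omega)_t && \text{(by Eq. (5.4))} \\
&= (D\psi_\sigma\rho^{-1}E_\rho : D\omega\rho^{-1}E_\rho)_t\,(D\psi_\rho : D\omega)_t && \text{(by [25] Theorem 11.9)} \\
&= (D\psi_\sigma\rho^{-1}E_\rho : D\psi_\rho)_t\,(D\psi_\rho : D\omega)_t && \text{(since } \psi_\rho = \omega\rho^{-1}E_\rho\text{)} \\
&= (D\psi_\sigma\rho^{-1}E_\rho : D\omega)_t. && \text{(by [25] Corollary 3.5)}
\end{aligned}$$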
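The last equality in this computation is an instance of the chain rule for Connes cocycle derivatives, which we take to be the content of the corollary cited from [25]. As a reminder, with generic s.n.f. weights $\psi_1, \psi_2, \psi_3$ on one von Neumann algebra (symbols used only in this note):

```latex
\[
  (D\psi_1 : D\psi_2)_t \,(D\psi_2 : D\psi_3)_t = (D\psi_1 : D\psi_3)_t ,
  \qquad t \in \mathbb{R}.
\]
% Applied above with
%   \psi_1 = \psi_\sigma \rho^{-1} E_\rho, \quad \psi_2 = \psi_\rho, \quad \psi_3 = \omega .
```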
From Formula (5.5), by a theorem of Connes (see [25] Corollary 3.6) we deduce: $\psi_\sigma\rho^{-1}E_\rho = \psi_{\rho\sigma}$, and from a theorem of Haagerup (see [25] Theorem 11.9): $E_{\rho\sigma} = \rho E_\sigma\rho^{-1}E_\rho$, and finally, applying $(\rho\sigma)^{-1}$ to both sides we get Eq. (5.3). The invariance of $\phi_{\rho\sigma}$ is readily obtained using Eq. (5.3) ($X \in M^+$):
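$$\begin{aligned}
\alpha_a^{-1} \circ \phi_{\rho\sigma} \circ \beta_a^{\rho\sigma}(X) &= \alpha_a^{-1} \circ \phi_\sigma \circ \phi_\rho \circ \beta_a^{\rho\sigma}(X) \\
&= \alpha_a^{-1}\bigl(\phi_\sigma \circ \phi_\rho \circ \mathrm{Ad}(V_{\rho\sigma}(a))(X)\bigr) \\
&= \alpha_a^{-1}\bigl(\phi_\sigma \circ \phi_\rho \circ \mathrm{Ad}(\rho(z_\sigma(a))V_\rho(a))(X)\bigr) && \text{(by Eq. (5.2))} \\
&= \alpha_a^{-1}\bigl(\phi_\sigma(z_\sigma(a)\,\phi_\rho(\beta_a^\rho(X))\,z_\sigma(a)^*)\bigr) \\
&= \alpha_a^{-1}\bigl(\phi_\sigma(z_\sigma(a)\,(\alpha_a \circ \phi_\rho(X))\,z_\sigma(a)^*)\bigr) \\
&= \alpha_a^{-1}\bigl(\phi_\sigma(\beta_a^\sigma \circ \phi_\rho(X))\bigr) = \phi_\sigma \circ \phi_\rho(X).
\end{aligned}$$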
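Three facts carry the middle of this chain. They are listed here as a reading aid; each is either stated earlier in the text or follows from Prop. 5.9 for irreducible sectors with positive energy. The letters $Y$, $Z$ are generic and used only in this note.

```latex
\begin{align*}
\phi_\rho\bigl(\rho(Y)\,Z\,\rho(Y)^*\bigr) &= Y\,\phi_\rho(Z)\,Y^*
  && \text{(bimodule property of } \phi_\rho\text{, with } Y = z_\sigma(a),\ Z = \beta_a^\rho(X)\text{)}, \\
\phi_\rho \circ \beta_a^\rho &= \alpha_a \circ \phi_\rho
  && \text{(from } E = E_a = \rho\,\alpha_{-a}\rho^{-1}E\beta_a^\rho \text{ and } \phi_\rho = \rho^{-1}\circ E\text{)}, \\
\mathrm{Ad}(z_\sigma(a)) \circ \alpha_a &= \beta_a^\sigma
  && \text{(definition of } z_\sigma\text{)}.
\end{align*}
% The final equality of the chain is the second identity applied to \sigma:
%   \alpha_a^{-1} \circ \phi_\sigma \circ \beta_a^\sigma = \phi_\sigma .
```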
The conclusion follows using Prop. 5.5, by the invariance of ψρσ .
As already mentioned, the net on $\mathbb{R}$ we are considering is obtained from a local conformal precosheaf by removing a point from the circle [19]. Therefore our results have a version for Möbius covariant sectors.
Theorem 5.13. Let $\mathcal{A}$ be a strongly additive local conformal precosheaf on $S^1$. The class of Möbius covariant (resp. translation covariant, with respect to a given $\infty$ point) sectors with positive energy is stable under composition, conjugation and direct integral decomposition.
Proof. The stability of the covariance with positive energy under direct integral decomposition is shown explicitly in Lemma 5.14 of the next subsection. Let us thus assume that $\rho$, $\sigma$ are covariant sectors with positive energy. Let $\rho = \int^\oplus \rho_\lambda\,d\mu(\lambda)$ and $\sigma = \int^\oplus \sigma_{\lambda'}\,d\nu(\lambda')$ be two direct integral decompositions into irreducible sectors. By the previous statement, the irreducible components of $\rho$ (resp. $\sigma$) are $\mu$ (resp. $\nu$) almost everywhere covariant with positive energy. Then:
$$\rho\sigma = \int^\oplus \rho_\lambda\sigma_{\lambda'}\,d(\mu \times \nu)(\lambda, \lambda'),$$
therefore by Proposition 5.12 $\rho_\lambda\sigma_{\lambda'}$ is covariant with positive energy almost everywhere, and the same is true for $\rho\sigma$ by Lemma 5.14. It remains to show that if $\rho$ is covariant with positive energy, the same is true for its conjugate $\bar\rho = j \circ \rho \circ j$, where $j = \mathrm{Ad}J$ and $J$ is the modular conjugation of $(M, \Omega)$. But this immediately follows by setting $V_{\bar\rho}(g) = JV_\rho(rgr)J$, where $r$ is the change of sign on $\mathbb{R}$, see [16].
5.3. An example of sector with infinite dimension and negative energy levels. We show now that there exist translation-dilation covariant sectors whose energy spectrum is the real line. Our example, concerning a non strongly additive net, will be a reducible sector, but it nevertheless illuminates the structure of the involved objects and the limits of the arguments.
Lemma 5.14. Let $\rho = \int^\oplus \rho_\lambda\,d\mu(\lambda)$ be a (non unique) direct integral decomposition of a sector $\rho$ of a net of von Neumann algebras $\mathcal{A}$ on $\mathbb{R}$ as in Sect. 3 (resp. on $S^1$). Then $\rho$ is translation (resp. Möbius) covariant with positive energy iff $\rho_\lambda$ is translation (resp. Möbius) covariant with positive energy $\mu$-almost everywhere.
Proof. Clearly if $\rho_\lambda$ is translation (resp. Möbius) covariant with positive energy for $\mu$-almost all $\lambda$, the same is true for $\rho$. Conversely if $\rho$ is translation covariant with positive energy, there exists by Borchers theorem [1] a unitary one-parameter group $T_\rho \in \rho(A)''$,
with positive generator, implementing the translations $\rho \circ \mathrm{Ad}\,T(\cdot)$. Therefore $T_\rho$ has a decomposition $T_\rho = \int^{\oplus} T_\rho^{(\lambda)} \, d\mu(\lambda)$, where $T_\rho^{(\lambda)}$ has positive generator for almost all λ, and implements the translations on $\rho_\lambda$. If moreover ρ is covariant with respect to (the universal covering of the) Möbius group, we may repeat the argument with the translations associated with different intervals of $S^1$ and find implementations in the weak closure of $\rho_\lambda(A)$. In this way we construct, for almost all λ, a unitary representation $V_\lambda$ of the universal covering of the Möbius group up to a cocycle in the center of $\rho_\lambda(A)'$. Since the cohomology of the universal covering of the Möbius group is trivial, we may replace $V_\lambda(g)$ with $z(g)V_\lambda(g)$, where $z(g)$ is in the center of $\rho_\lambda(A)'$, and get a true unitary representation providing the Möbius covariance with positive energy of $\rho_\lambda$.

Proposition 5.15. Let A be an irreducible net of von Neumann algebras on R, covariant under translations and dilations as in Sect. 3. Let γ be a morphism of A localized in the interval $I \subset R$, and assume that $\gamma_a := \alpha_{-a}\gamma\alpha_a$ is disjoint from γ for every $a \neq 0$ (thus γ is not translationally covariant). Then $\rho := \int^{\oplus} \gamma_a \, da$ is a translationally covariant (reducible) endomorphism with infinite statistics whose energy spectrum is R.

Proof. ρ acts on vectors ξ in the Hilbert space $L^2(R, H)$ (separable if H is separable) via $(\rho(X)\xi)(a) = \gamma_{-a}(X)\xi(a)$, $X \in A$, $a \in R$, and covariance under translations is implemented by the unitary one-parameter group $(T(b)\xi)(a) = \xi(b - a)$. However ρ does not have positive energy since otherwise we would infer from Lemma 5.14 that (for almost every $a \in R$) $\gamma_a$ is covariant and this is not possible by hypothesis.

To give an explicit example, we recall that although the free scalar massless field $\varphi = \varphi(t, x)$ in two dimensions does not exist, its derivative $j = \partial_0\varphi - \partial_1\varphi$ makes sense and depends on $t - x$, thus defines a one dimensional current j. Every $f \in S(R)$ such that $\int_R f(t)\,dt = 0$ determines a unitary operator $W(f) = e^{ij(f)}$ and the Weyl relations
$$W(f + g) = e^{i\int f g'\,dt}\, W(f) W(g)$$
are satisfied. Let A be the (strongly additive) net on R defined by $A(I) := \{W(f) \mid f \in S, \int f\,dt = 0, \mathrm{supp}(f) \subset I\}$. As is known [9], this net has localized automorphisms $\gamma_q$, with $q \in S$ real valued with compact support, given by $\gamma_q(W(f)) = e^{2i\int qf}\, W(f)$ [9]. If $q = Q'$ with $Q \in S$, that is $\int q(t)\,dt = 0$, then $\gamma_q$ is inner, indeed $\gamma_q = \mathrm{Ad}(W(Q))$. The equivalence classes of these automorphisms are labeled by the real numbers $q_0 := \int q(t)\,dt$. As noticed in [19], it is possible to generalize this construction in order to obtain non-covariant automorphisms, thus sectors with the properties needed in Prop. 5.15. For the sake of completeness we briefly recall this construction.

Let $B \subset A$ be the net generated by the derivative of j: $B(I) := \{W(f) \mid f = F', F \in S, \int F\,dt = 0, \mathrm{supp}(F) \subset I\}$. Then B is a proper subnet of A which is not strongly additive, but $A(I) = B(I)$ if I is a half-line. If q is a smooth function on R such that $q'$ has compact support, then $\gamma_q$ makes sense as an automorphism of B and its equivalence class is labeled by the charges $\int q'(t)\,dt$ and $\int t\,q'(t)\,dt$. As a consequence, if $q(+\infty) \neq q(-\infty)$, then $\gamma_q$ is a transportable localized automorphism of B such that $\alpha_{-a}\gamma_q\alpha_a$ is disjoint from $\gamma_q$ for each non trivial translation $\alpha_a$.

5.4. Construction of the Möbius covariant unbounded left inverse. Let A be a local conformal precosheaf on $S^1$, namely a map $I \to A(I)$
that associates to each (proper) interval of $S^1$ a von Neumann algebra, satisfying isotony, locality, Möbius covariance with positive energy, uniqueness of the vacuum, see e.g. [18]. A morphism ρ localized in the interval $I_0$ is a map $I \to \rho_I$ which associates to every interval I of $S^1$ a normal representation of $A(I)$ on H such that $\rho_{\tilde I}|_{A(I)} = \rho_I$, $I \subset \tilde I$, and $\rho_{I_0'} = \mathrm{id}$. By Haag duality $\rho_I \in \mathrm{End}(A(I))$ if $I \supset I_0$. We now assume that A is strongly additive. Then ρ is automatically covariant with positive energy if $d(\rho) < \infty$. In the following $d(\rho)$ is infinite and we assume Möbius covariance with positive energy.

By an unbounded left inverse φ of ρ we shall mean a map $I \to \phi_I$ that associates with any interval $I \supset I_0$ a map $\phi_I : A(I)^+ \to \overline{A(I)}^{+}$ such that
$$\phi_I(\rho_I(u) X \rho_I(u^*)) = u\,\phi_I(X)\,u^*, \quad u \in A(I),\ X \in A(I)^+,$$
i.e. $\rho_I\phi_I$ is a $\rho(A(I))$-valued weight on $A(I)$, and $\phi_{\tilde I}|_{A(I)^+} = \phi_I$ if $I \subset \tilde I$ are intervals containing $I_0$. In the following $z_{\rho_I}(t)$, $t \in R$, will denote the cocycle in $A(I)$ given by the covariance, namely $z_{\rho_I}(t) = V_\rho(g(t)) V(g(t))^*$, where $g(\cdot)$ is the one-parameter subgroup of special conformal transformations associated with the interval I (the dilation subgroup if $I = R_+$, and conjugate to this by Möbius covariance for a general I).

Proposition 5.16. Let A be strongly additive and ρ irreducible, covariant with positive energy and localized in $I_0$. There exists an unbounded left inverse φ of ρ, covariant with respect to the Möbius group, namely such that $\alpha_g^{-1} \circ \phi_{gI} \circ \beta_g^{\rho} = \phi_I$ whenever $I \cap gI \supset I_0$, $g \in P$. φ is given by the formula
$$(D\,\omega_I \circ \phi_I : D\,\omega_I)_t = z_{\rho_I}(-2\pi t), \quad t \in R,\ I \supset I_0.$$
If ρ is irreducible, there exists only one unbounded left inverse, up to renormalization.

Proof. Let $\psi_I = \psi_{\rho,I}$ be the s.n.f. weight on $A(I)$, $I \supset I_0$, defined by $(D\psi_I : D\omega_I)_t = z_{\rho_I}(-2\pi t)$, $t \in R$. Then
$$\psi_{\tilde I}|_{A(I)} = \psi_I, \quad I \subset \tilde I. \qquad (5.6)$$
To check this notice that (5.6) is true if I and $\tilde I$ have a common boundary point, as by cutting the circle we may assume $I = (1, \infty)$, $\tilde I = (0, \infty)$, and we may apply the results so far obtained for the real line. Thus (5.6) is true in general as we may check it in two steps by considering an intermediate interval $I \subset I_1 \subset \tilde I$ such that both $I \subset I_1$ and $I_1 \subset \tilde I$ have a common boundary point. Concerning the invariance, let $I_1, I_2 \subset \tilde I$ be intervals containing $I_0$ and $g \in PSL(2, R)$ with $gI_1 = I_2$. If $X_2 \in A(I_2)$, then $X_2 = \beta_g^{\rho}(X_1)$ with $X_1 \in A(I_1)$.
In fact $z_\rho(g) \in A(gI_1)$ since $I_0$ and $gI_0$ are both contained in $gI_1$. Then
$$\psi_{gI_1}(\beta_g^{\rho}(X_1)) = \psi_{I_2}(X_2) = \psi_{\tilde I}(X_2) = \psi_{\tilde I}(\beta_g^{\rho}(X_1)) = \psi_{\tilde I}(X_1) = \psi_{I_1}(X_1), \qquad (5.7)$$
provided g is a translation or dilation of $\tilde I$. It follows that (5.7) holds for all g such that $gI, I \supset I_0$ by a simple repetition of the arguments. The reason is as follows: given two such intervals $I$, $gI$ we can always find sequences $I_1 = I, \ldots, I_n = gI$ and $\{\tilde I_i\}_{1}^{n-1}$ (actually $n = 3$) such that $I_i \cap I_{i+1} \supset I_0$, $I_i \cup I_{i+1} \subset \tilde I_i$, $I_{i+1} = g_i I_i$ with $g_i$ a translation of $\tilde I_i$. Namely, if one of the two intervals $I_1 = I$, $I_3 = gI$ is contained in the other, let us take as $I_2$ an intermediate interval containing the smaller of the two and one of the endpoints of the bigger. Otherwise take as $I_2$ the connected component of $I_0$ in $I \cap gI$. In both situations it is easy to see that $I_2$ is an interval included in one of the two intervals $I_1$, $I_3$, containing the other one and having the two endpoints in common with $I_1$ and $I_3$ respectively. It is now possible to apply the previous results in two successive steps to the pairs $I_1, I_2$ and $I_2, I_3$, taking as $\tilde I_1$ and $\tilde I_2$ the bigger intervals and considering the unique translation $g_1$ of $\tilde I_1$ such that $g_1 I_1 = I_2$ (resp. $g_2$ of $\tilde I_2$ such that $g_2 I_2 = I_3$) with fixed point the common extreme of $I_1$ and $I_2$ (resp. $I_2$ and $I_3$). Then $g = g_1 g_2 h$, where h is a dilation of I, and we have (by (5.7)):
$$\psi_{g_1 g_2 hI}(\beta_{g_1}^{\rho}\beta_{g_2}^{\rho}\beta_{h}^{\rho}(X)) = \psi_I(X), \quad X \in A(I)^+.$$
Let $E_I : A(I)^+ \to \rho_I(A(I))$ be Haagerup's operator valued weight; then $\phi_I := \rho_I^{-1} E_I$ is the desired unbounded left inverse. If ρ is irreducible, then $\rho_I$ is an irreducible endomorphism of $A(I)$ by the strong additivity assumption. As $A(I)$ is a factor, the uniqueness of the unbounded left inverse then follows from the uniqueness of the operator valued weight $E_I$ by the Haagerup theorem.

Corollary 5.17. Let A be strongly additive and ρ irreducible, covariant and localized in $I_0$. Then ρ has positive energy if and only if there exists an unbounded left inverse φ of ρ, covariant with respect to the Möbius group.

Proof. The existence of the Möbius covariant unbounded left inverse has been shown in the preceding Proposition 5.16. The reverse implication follows from the invariance of ψ as in Corollary 5.10.

A Appendix. Araki Relative Modular Operators and Connes Spatial Derivatives

We use the notation in [25], Ch. I, II. In particular given a von Neumann algebra $M \subset B(H)$ and a semifinite normal faithful (s.n.f.) weight ψ on M, $N_\psi$ will denote the dense left ideal $\{X \in M \mid \psi(X^*X) < \infty\}$, $H_\psi$ the GNS Hilbert space of ψ. $N_\psi$ is embedded as a dense linear subspace of $H_\psi$, denoted by $X \to (X)_\psi$, and the GNS representation $\pi_\psi$ is given by $\pi_\psi(X)(Y)_\psi = (XY)_\psi$, $X \in M$, $Y \in N_\psi$. $D(H, \psi)$ will denote the dense linear subspace of all $\zeta \in H$ that are ψ-bounded, namely the linear operator $R_\zeta^{\psi} : H_\psi \to H$ defined by $(X)_\psi \to X\zeta$, $X \in N_\psi$, is bounded. If $\chi'$ is a s.n.f. weight on $M'$, $\Delta(\psi/\chi')$ denotes the Connes spatial derivative.$^{1}$

1 If A is a positive linear operator, we set $\|A\zeta\| = +\infty$ for all vectors $\zeta \notin D(A)$.
Lemma A.1. Let $M \subset B(H)$ be a von Neumann algebra, $\omega' = (\Omega, \cdot\,\Omega)$ a vector state on $M'$ given by a cyclic and separating vector Ω, ψ a normal faithful semifinite weight on M. Then $\|\Delta(\psi/\omega')^{1/2} X\Omega\|^2 = \psi(XX^*)$, $X \in M$.

Proof. As $\omega'$ is the state on $M'$ given by a vector Ω, one checks immediately that $M\Omega = D(H, \omega')$ and $R^{\omega'}_{X\Omega} = X$. The lemma is thus obtained by taking $\zeta = X\Omega$ in the formula $\|\Delta(\psi/\omega')^{1/2}\zeta\|^2 = \psi(R_\zeta^{\omega'}(R_\zeta^{\omega'})^*)$ for $\zeta \in D(H, \omega')$ that defines the spatial derivative [25] p. 95.

By polarization we also deduce that
$$(\Delta(\psi/\omega')^{1/2} Y\Omega,\ \Delta(\psi/\omega')^{1/2} X\Omega) = \psi(XY^*), \quad X, Y \in N_\psi^{*}.$$
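The polarization step can be spelled out as follows. This is only a sketch: the symbols $B$ and $Q$ are labels introduced here for illustration (they are not used in the paper), the inner product is assumed antilinear in its first argument, and $X, Y$ are assumed to lie in $N_\psi^{*}$ so that every term is finite.

```latex
% B is antilinear in Y and linear in X; Q is its diagonal (Lemma A.1).
B(Y,X) := \bigl(\Delta(\psi/\omega')^{1/2} Y\Omega,\ \Delta(\psi/\omega')^{1/2} X\Omega\bigr),
\qquad
Q(Z) := B(Z,Z) = \psi(ZZ^*).

% Expand the diagonal on X + i^k Y, k = 0,1,2,3:
Q(X + i^k Y) = B(X,X) + B(Y,Y) + i^{k} B(X,Y) + i^{-k} B(Y,X).

% Weighted sum: \sum_k i^k = 0 and \sum_k i^{2k} = 0, so only B(Y,X) survives.
\frac{1}{4} \sum_{k=0}^{3} i^{k}\, Q(X + i^{k} Y) = B(Y,X).

% The same expansion holds for the right-hand side, because
% (X + i^k Y)(X + i^k Y)^* = XX^* + YY^* + i^{k}\, YX^* + i^{-k}\, XY^*,
% hence
\frac{1}{4} \sum_{k=0}^{3} i^{k}\, \psi\bigl((X + i^{k}Y)(X + i^{k}Y)^*\bigr) = \psi(XY^*).
```

Since the two diagonals agree by Lemma A.1, the two polarized expressions agree as well, which is the stated identity.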
We also need two facts. Recall (see [25] Chapters I, II) that if ψ and ω are s.n.f. weights on M then the Tomita operator $S_{\psi,\omega} : H_\omega \to H_\psi$ is the closure of the densely defined anti-linear operator $(X)_\omega \to (X^*)_\psi$, $X \in N_\omega \cap N_\psi^{*}$. The polar decomposition
$$S_{\psi,\omega} = J_{\psi,\omega}\,\Delta_{\psi,\omega}^{1/2}, \qquad \Delta_{\psi,\omega}^{1/2} = (S_{\psi,\omega}^{*} S_{\psi,\omega})^{1/2},$$
defines the relative modular conjugation and the relative modular operator. Let $V_{\psi,\omega} : H_\omega \to H_\psi$ be the uniquely determined unitary operator satisfying $\pi_\psi(X) = V_{\psi,\omega}\,\pi_\omega(X)\,V_{\psi,\omega}^{*}$, preserving the natural cones, see [25], 3.16. We shall identify $H_\omega$ and $H_\psi$ via $V_{\psi,\omega}$, cf. [25], 3.17, therefore $V_{\psi,\omega} = 1$ and $J_{\psi,\omega} = J_{\omega,\omega}$, that we simply denote by J in the following. We will assume M to act standardly on $H_\psi = H_\omega$, thus we suppress the symbols $\pi_\psi$ and $\pi_\omega$, and shall consider the s.n.f. weight $\omega' = \omega(J \cdot J)$ on $M'$.

Lemma A.2. Let M, ψ, $\omega'$ be as above; then
$$\Delta(\psi/\omega')^{1/2} = \Delta_{\psi,\omega}^{1/2}.$$
Proof. Let us first assume that M is a factor. Then $\Delta(\psi/\omega')^{it}\,\Delta_{\psi,\omega}^{-it} \in M \cap M' = C$, $t \in R$, because both $\Delta(\psi/\omega')^{it}$ and $\Delta_{\psi,\omega}^{it}$ implement the same modular groups of $\omega'$ on $M'$ and of ψ on M. It follows that $\Delta(\psi/\omega') = \lambda\,\Delta_{\psi,\omega}$ for some $\lambda > 0$. Now, for every $X \in N_\omega \cap N_\psi^{*}$, we have (cf. the proof of Lemma A.1)
$$\|\Delta_{\psi,\omega}^{1/2}(X)_\omega\|^2 = \|J_{\psi,\omega}\Delta_{\psi,\omega}^{1/2}(X)_\omega\|^2 = \|(X^*)_\psi\|^2 = \psi(XX^*) = \psi\bigl(R^{\omega'}_{(X)_\omega}(R^{\omega'}_{(X)_\omega})^{*}\bigr) = \|\Delta(\psi/\omega')^{1/2}(X)_\omega\|^2. \qquad (A.1)$$
In fact from the relation
$$JYJ(X)_\omega = XJ(Y)_\omega, \quad Y \in N_\omega, \qquad (A.2)$$
see [25], p. 26, it follows that $(X)_\omega \in D(H_\omega, \omega')$ since
$$(JYJ(X)_\omega, JYJ(X)_\omega)_\omega = (XJ(Y)_\omega, XJ(Y)_\omega)_\omega = (J(Y)_\omega, X^*XJ(Y)_\omega)_\omega \le \|X\|^2 ((Y)_\omega, (Y)_\omega)_\omega = \|X\|^2\,\omega(Y^*Y) = \|X\|^2\,\omega'((JYJ)^*(JYJ));$$
furthermore $R^{\omega'}_{(X)_\omega} = X$ as results from:
$$R^{\omega'}_{(X)_\omega}(JYJ)_{\omega'} = JYJ(X)_\omega = XJ(Y)_\omega = X(JYJ)_{\omega'} \quad \text{(by equation (A.2))},$$
where, as before, $Y \in N_\omega$ and we have identified $(JYJ)_{\omega'}$ and $J(Y)_\omega$. Thus by equation (A.1), we obtain $\lambda = 1$. The conclusion follows by direct integral decomposition.

If A is a positive operator on H affiliated to M we define $\psi(A) := \sup_n \psi(A E_A([0, n)))$, where $E_A$ is the projection-valued spectral measure of A.

Lemma A.3. Let $M \subset B(H)$, ψ, $\omega'$ be as above; then
$$D(\Delta(\psi/\omega')^{1/2}) = \{T\Omega \mid T \text{ closed},\ T\,\eta\,M,\ \Omega \in D(T),\ \psi(TT^*) < \infty\}.$$
Proof. ⊂ : given $\zeta \in D(\Delta(\psi/\omega')^{1/2})$, consider the densely defined operator $T_\zeta^{0} : X'\Omega \to X'\zeta$, $X' \in M'$. Then $T_\zeta^{0}$ is closable since its adjoint is densely defined; indeed $J(Y)_\psi \in D(T_\zeta^{0*})$, $Y \in N_\psi$, and $T_\zeta^{0*} J(Y)_\psi = JY\Delta_{\psi,\omega}^{1/2}\zeta$, since
$$(T_\zeta^{0} JX\Omega, J(Y)_\psi) = (JXJ\zeta, J(Y)_\psi) = (\zeta, JX^*JJ(Y)_\psi) = (\zeta, JX^*(Y)_\psi) = (\zeta, J(X^*Y)_\psi)$$
$$= (\zeta, JS_{\psi,\omega}Y^*X\Omega) = (\zeta, \Delta_{\psi,\omega}^{1/2}Y^*X\Omega) = (\Delta_{\psi,\omega}^{1/2}\zeta, Y^*X\Omega) = (Y\Delta_{\psi,\omega}^{1/2}\zeta, X\Omega) = (JX\Omega, JY\Delta_{\psi,\omega}^{1/2}\zeta).$$
Furthermore $T_\zeta^{0}$ commutes with unitaries in $M'$, thus its closure $T_\zeta$ is affiliated to M. Let $T_\zeta = VH$ be the polar decomposition, and $e_n := E_H([0, n))$, so that $T_n := T_\zeta e_n \in M$, and $T_n\Omega \to T_\zeta\Omega = \zeta$. Finally for each $X \in N_\psi^{*}$ we have
$$(T_n\Omega, S_{\psi,\omega}^{*}JX\Omega) = (JX\Omega, e_n J\Delta_{\psi,\omega}^{1/2}\zeta).$$
This is shown, using the fact that
$$S_{\psi,\omega}^{*}(JX\Omega) = JS_{\psi,\omega}J\,JX\Omega = JS_{\psi,\omega}X\Omega = J(X^*)_\psi,$$
by the following computation:
$$(T_\zeta e_n\Omega, S_{\psi,\omega}^{*}JX\Omega) = (T_\zeta e_n\Omega, J(X^*)_\psi) = (e_n\Omega, T_\zeta^{*}J(X^*)_\psi) = (e_n\Omega, JX^*\Delta_{\psi,\omega}^{1/2}\zeta) = (\Omega, e_nJX^*\Delta_{\psi,\omega}^{1/2}\zeta)$$
$$= (Je_nJX^*\Delta_{\psi,\omega}^{1/2}\zeta, \Omega) = (Je_nJ\Delta_{\psi,\omega}^{1/2}\zeta, X\Omega) = (JX\Omega, e_nJ\Delta_{\psi,\omega}^{1/2}\zeta),$$
therefore $T_n\Omega \in D(\Delta(\psi/\omega')^{1/2})$ and
$$S_{\psi,\omega}T_\zeta e_n\Omega = e_nJ\Delta_{\psi,\omega}^{1/2}\zeta.$$
Moreover
$$\psi(T_nT_n^*) = \|\Delta(\psi/\omega')^{1/2}T_n\Omega\|^2 \le \|\Delta(\psi/\omega')^{1/2}\zeta\|^2.$$
Finally $\psi(T_nT_n^*) \nearrow \psi(T_\zeta T_\zeta^*) < \infty$.

⊃ : let $T_n := e_nT \in M$, where $e_n$ is defined using the spectral family of $TT^*$; then $T_nT_n^* = e_nTT^*$ is an increasing sequence of (bounded) positive operators, $T_nT_n^* \le TT^*$, $\psi(T_nT_n^*) = \|\Delta(\psi/\omega')^{1/2}T_n\Omega\|^2 \le \psi(TT^*) < \infty$. Therefore $T_n\Omega = e_nT\Omega \to \zeta := T\Omega$, and $\Delta(\psi/\omega')^{1/2}T_n\Omega$ is convergent, i.e. it is a Cauchy sequence since $\|\Delta(\psi/\omega')^{1/2}(T_n - T_m)\Omega\|^2 = \psi((T_n - T_m)(T_n - T_m)^*) = \psi((e_n - e_m)TT^*) = \psi(e_nTT^*) - \psi(e_mTT^*) \to 0$ when $n, m \to \infty$. In particular it follows that $\zeta \in D(\Delta(\psi/\omega')^{1/2})$.

Before concluding this appendix, we recall the notion of the extended positive part of a von Neumann algebra, needed in Sect. 6. Given a von Neumann algebra M, its extended positive part $\overline{M}^{+}$ is defined as the family of all additive, positively homogeneous and lower semicontinuous functions $m : M_*^{+} \to [0, \infty]$ [25], 11.1. $\overline{M}^{+}$ has the following characterization (see [25], 11.3): any element $m \in \overline{M}^{+}$ can be uniquely represented as $m = he + (1 - e)\infty$, where e is a projection in M and h is a positive self-adjoint operator affiliated to $eMe$. Every normal weight on M has a canonical extension to $\overline{M}^{+}$ such that $\varphi(m) = m(\varphi)$, $\varphi \in M_*^{+}$ (see [25], 11.4).
References
1. Borchers, H.J.: Energy and momentum as observables in quantum field theory. Commun. Math. Phys. 2, 49–54 (1966)
2. Borchers, H.J.: The CPT-Theorem in two-dimensional theories of local observables. Commun. Math. Phys. 143, 315–332 (1992)
3. Borchers, H.J.: QFT and modular structures. Preprint
4. Bratteli, O., Robinson, D.W.: Operator algebras and quantum statistical mechanics, I. Berlin–Heidelberg–New York: Springer-Verlag, 1979
5. Brunetti, R., Guido, D., Longo, R.: Modular structure and duality in conformal quantum field theory. Commun. Math. Phys. 156, 201–219 (1993)
6. Brunetti, R., Guido, D., Longo, R.: Group cohomology, modular theory and spacetime symmetries. Rev. Math. Phys. 7, 57–71 (1995)
7. Buchholz, D., D'Antoni, C., Longo, R.: Nuclear maps and modular structures I. J. Funct. Anal. 88, 223–250 (1990)
8. Buchholz, D., Fredenhagen, K.: Locality and the structure of particle states. Commun. Math. Phys. 84, 1–54 (1982)
9. Buchholz, D., Mack, G., Todorov, I.: The current algebra on the circle as a germ for local field theories. In: Conformal field theories and related topics, eds. P. Binetruy et al., Nucl. Phys. B (Proc. Suppl.) 5B, 20 (1988)
10. Davidson, D.R.: Endomorphism semigroups and lightlike translations. Lett. Math. Phys. 38, 77–90 (1996)
11. Doplicher, S., Haag, R., Roberts, J.E.: Local observables and particle statistics I. Commun. Math. Phys. 23, 199–230 (1971); II, Commun. Math. Phys. 35, 49–85 (1974)
12. Enoch, M., Nest, R.: Irreducible inclusions of factors, multiplicative unitaries and Kac algebras. J. Funct. Anal. 137, 466–543 (1996)
13. Fredenhagen, K.: On the existence of anti-particles. Commun. Math. Phys. 79, 141–151 (1981)
14. Fredenhagen, K.: Superselection sectors with infinite statistical dimension. In: Subfactors (H. Araki et al. eds.), Singapore: World Scientific, 1995, pp. 242–258
15. Fredenhagen, K., Jörss, M.: Conformal Haag–Kastler nets, pointlike localizable fields and the existence of operator product expansion. Commun. Math. Phys. 176, 541–554 (1996)
16. Guido, D., Longo, R.: Relativistic invariance and charge conjugation in quantum field theory. Commun. Math. Phys. 148, 521–551 (1992)
17. Guido, D., Longo, R.: An algebraic spin and statistics theorem. Commun. Math. Phys. 172, 517–533 (1995)
18. Guido, D., Longo, R.: The conformal spin and statistics theorem. Commun. Math. Phys. 181, 11–35 (1996)
19. Guido, D., Longo, R., Wiesbrock, H.-W.: Extension of conformal nets and superselection structure. Commun. Math. Phys. 192, 217–244 (1998)
20. Longo, R.: Index of subfactors and statistics of quantum fields. I. Commun. Math. Phys. 126, 217–247 (1989); II, Commun. Math. Phys. 130, 285–309 (1990)
21. Longo, R.: An analogue of the Kac–Wakimoto formula and black hole conditional entropy. Commun. Math. Phys. 186, 461–480 (1997)
22. Rehren, H.K.: News from the Virasoro algebra. Lett. Math. Phys. 30, 125–130 (1994)
23. Reed, M., Simon, B.: Methods of modern mathematical physics, II. Fourier analysis, self-adjointness. New York: Academic Press, 1975
24. Roberts, J.E.: Spontaneously broken gauge symmetries and superselection rules. In: Proceedings of the International School of Mathematical Physics, Camerino 1974, G. Gallavotti ed., Università di Camerino, 1976
25. Strătilă, S.: Modular theory in operator algebras. Tunbridge Wells, Kent: Abacus Press, 1981
26. Strătilă, S., Zsidó, L.: Lectures on von Neumann algebras. Tunbridge Wells, Kent: Abacus Press, 1979
27. Wiesbrock, H.-W.: A comment on a recent work of Borchers. Lett. Math. Phys. 25, 157–159 (1992)

Communicated by A. Connes
Commun. Math. Phys. 193, 493 – 525 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Method of Squared Eigenfunction Potentials in Integrable Hierarchies of KP Type
H. Aratyn 1,*, E. Nissimov 2,3,**, S. Pacheva 2,3,**
1 Department of Physics, University of Illinois at Chicago, 845 W. Taylor St., Chicago, IL 60607-7059, USA. E-mail: [email protected]
2 Institute of Nuclear Research and Nuclear Energy, Boul. Tsarigradsko Chausee 72, BG-1784 Sofia, Bulgaria. E-mail: [email protected], [email protected]
3 Department of Physics, Ben-Gurion University of the Negev, Box 653, IL-84105 Beer Sheva, Israel. E-mail: [email protected], [email protected]
* Work supported in part by the U.S. Department of Energy under contract DE-FG02-84ER40173
** Supported in part by Bulgarian NSF grant Ph-401

Received: 19 March 1997 / Accepted: 12 September 1997

Abstract: The method of squared eigenfunction potentials (SEP) is developed systematically to describe and gain new information about the Kadomtsev–Petviashvili (KP) hierarchy and its reductions. Interrelation to the τ-function method is discussed in detail. The principal result, which forms the basis of our SEP method, is the proof that any eigenfunction of the general KP hierarchy can be represented as a spectral integral over the Baker–Akhiezer (BA) wave function with a spectral density expressed in terms of SEP. In fact, the spectral representations of the (adjoint) BA functions can, in turn, be considered as defining equations for the KP hierarchy. The SEP method is subsequently used to show how the reduction of the full KP hierarchy to the constrained KP ($\mathrm{cKP}_{r,m}$) hierarchies can be given entirely in terms of linear constraint equations on the pertinent τ-functions. The concept of SEP turns out to be crucial in providing a description of $\mathrm{cKP}_{r,m}$ hierarchies in the language of the universal Sato Grassmannian and finding the non-isospectral Virasoro symmetry generators acting on the underlying τ-functions. The SEP method is used to write down generalized binary Darboux-Bäcklund transformations for constrained KP hierarchies whose orbits are shown to correspond to a new Toda model on a square lattice. As a result, we obtain a series of new determinant solutions for the τ-functions generalizing the known Wronskian (multi-soliton) solutions. Finally, applications to random matrix models in condensed matter physics are briefly discussed.

1. Introduction

The primary object of this paper is the Kadomtsev–Petviashvili (KP) integrable hierarchy (for comprehensive reviews, see e.g. [1, 2]) and its nontrivial reductions generalizing the familiar r-reduction to the SL(r) Korteweg-de Vries (KdV) hierarchy. The KP hierarchy is an infinite-dimensional system which admits different alternative formulations
and exhibits many types of symmetries. Here we are interested in a formulation based on the notion of squared eigenfunction potential and the spectral representations of the underlying eigenfunctions it gives rise to. Because of its connection to vertex operators, many aspects of this theory are algebraic in nature. This allows us to discuss in a systematic manner various symmetries of the hierarchy and applications to a large class of soliton systems obtained from it via symmetry reduction.

The KP hierarchy arises as a set of compatibility conditions for the linear spectral problem involving the pseudo-differential Lax operator L and the Baker–Akhiezer (BA) wave function $\psi_{BA}(t, \lambda)$. In recent years, the study of integrable systems of KP type has undergone rapid growth due to the applications of the tau-function technique invented by the Kyoto School [3, 4, 5]. The underlying principle of this method is to represent the relevant soliton potentials and Hamiltonian densities in terms of isospectral flows (with evolution parameters $(t) \equiv (t_1 \equiv x, t_2, \ldots)$) of one single function $\tau(t)$ in such a way that $\partial^2 \ln\tau(t)/\partial t_1 \partial t_n$ becomes equal to the coefficient in front of $D^{-1}$ in the pseudo-differential operator expansion of $L^n$. In terms of the τ-function, viewed as a function of the infinitely many KP "time" variables $(t_1 \equiv x, t_2, \ldots)$, the whole KP hierarchy is contained in Hirota's fundamental bilinear identity [5] instead of the infinite system of non-linear partial differential equations derived from the Sato-Wilson Lax operator approach. The τ-function approach bridges the way to several physical applications in view of its direct connection to physical objects, such as correlation and partition functions. Moreover, it allows a coherent treatment of multi-soliton solutions. These solutions of the nonlinear differential equations are generated by the action of the Miwa-Jimbo vertex operator $\hat{X}(\lambda, \mu)$ [4] (cf. Eq. (3.1) below) on the τ-function.
This vertex operator generates an infinitesimal Bäcklund transformation of the KP hierarchy. The family of all vertex operators constitutes a Lie algebra isomorphic to GL(∞). The transformation $\tau(t) \to (\exp(a\hat{X}(\lambda, \mu)))\tau(t)$ sends a solution of the KP hierarchy into another solution. In this way, the action of the infinite-dimensional Lie algebra GL(∞) on the solution space of the KP equation is made explicit via the Bäcklund transformation of "adding one soliton". This powerful formal machinery embeds many other concrete and useful structures relevant for physical models. Recently a special class of solutions encountered in the matrix models of discrete two-dimensional gravity was realized via the imposition of the Virasoro type of constraints on the underlying τ-function [6, 7] (see also [8]).

A remarkable feature of the KP hierarchy is the existence of the so called additional non-isospectral symmetries which, within the Lax operator formalism, are generated by Orlov-Schulman pseudo-differential operators [9]. The latter are defined as purely pseudo-differential parts of products of powers of the Lax operator L and its "conjugate" M-operator (cf. Eqs. (5.1),(5.2) below) and their respective flows form the infinite-dimensional Lie algebra $W_{1+\infty}$.$^{1}$ In an important recent development Adler, Shiota, and van Moerbeke [14, 15] (see also [16, 17]) obtained a formula for the KP hierarchy which relates the action of the vertex operator $\hat{X}(\lambda, \mu)$ on the τ-function to Orlov-Schulman non-isospectral additional symmetry flows on the BA wave function. The coefficients in the spectral expansion of $\hat{X}(\lambda, \mu)$ act as vector fields on the space of τ-functions generating $W_{1+\infty}$ algebra as well. Hence, the above result relates the $W_{1+\infty}$ algebra acting on $\tau(t)$ to the centerless $W_{1+\infty}$ algebra of non-isospectral symmetry flows acting on the BA function $\psi_{BA}(t, \lambda)$.

1 The $W_{1+\infty}$ algebra was originally introduced in physics literature [11] as a nontrivial "large N" limit of the associative, but non-Lie, conformal $W_N$ algebra [12]. It turns out to be isomorphic to the (centrally extended) algebra of differential operators on the circle [13], i.e., the Lie algebra generated by $z^k(\partial/\partial z)^n$ for $k \in Z$, $n \ge 0$. Let us also recall that the "semiclassical" limit (contraction) $w_{1+\infty}$ of $W_{1+\infty}$ is the algebra of area-preserving diffeomorphisms on the cylinder [11].

There exists an alternative to the τ-function method characterization of the KP hierarchy evolution equations in terms of (adjoint) eigenfunctions, i.e., functions whose KP multi-time flows are governed by an infinite set of purely differential operators $\{B_k\}_{k=1}^{\infty}$ (cf. Def. 2.1 below). The latter, by virtue of compatibility of the multi-time flows, have to satisfy the so called "zero-curvature" Zakharov–Shabat equations (cf. Eq. (3.44) below). One can then show [18] that all $B_k$ are obtained as purely differential projections of $k$th powers of a single pseudo-differential operator L, thus leading to the standard Lax formulation of the KP hierarchy. Overcoming the formal obstacle of having to define a function via an inverse derivative $\partial_x^{-1}$, Oevel succeeded in [19] in associating a well-defined (up to a constant) function, the squared eigenfunction potential (SEP), to a pair of arbitrary eigenfunction and adjoint eigenfunction such that the x-derivative of SEP coincides with the product of the latter eigenfunctions. Consequently, a systematic formalism emerged in [19] for the study of symmetries generated within the KP hierarchy via SEP [20]. In a particular example, when both eigenfunctions defining the SEP are BA functions, the SEP becomes a generating function for the above mentioned additional non-isospectral symmetries of the KP hierarchy [9, 10, 14, 15, 16, 17]. In the SEP framework, the product of any pair of eigenfunction and adjoint eigenfunction, being an x-derivative of SEP, can be viewed as a conserved density within the hierarchy. The transition to the important class of constrained KP hierarchies $\mathrm{cKP}_{r,m}$,$^{2}$ which are Hamiltonian reductions of the general KP hierarchy and whose Lax operators are given in Eq.
(2.20) below, can be effectuated by imposing equality between a linear combination of m (m ≥ 1) conserved densities of the above mentioned type and the rth (r ≥ 1) fundamental Hamiltonian density of the KP hierarchy. In such a case, the symmetry generated by SEP (called “ghost” flow) is identified with the rth isospectral flow of the original KP hierarchy. The principal merit of our work is to reformulate the eigenfunction formalism of the KP hierarchy in a new form called the squared eigenfunction potential (SEP) method, namely, to employ SEP as a basic building block in defining the KP hierarchy. The main ingredient of the SEP method is the proof of existence of spectral representation for any eigenfunction involving SEP as an integration kernel (spectral density). A link is then provided between the two alternative formulations of the KP hierarchy: one based on the τ -function and another one based on the SEP method. Furthermore, we apply the SEP method to solve various issues in integrable models of KP type and their applications in physics, among them, deriving new determinant solutions for the τ -function containing the familiar Wronskian (multi-soliton) solutions as simple particular cases, as well as identifying them as possible novel types of joint distribution functions in random matrix models of condensed matter physics. The plan of the paper is as follows. After reviewing the background material in Sect. 2, we first prove in Sect. 
3 that any eigenfunction of the general KP hierarchy can be represented as a spectral integral over the BA wave function with a spectral density expressed in terms of SEP. When at least one of the eigenfunctions is a BA function, we obtain a closed expression for the SEP. When both of the eigenfunctions are BA functions, the resulting SEPs are connected straightforwardly to the bilocal vertex operator $\hat{X}(\lambda, \mu)$ acting on the τ-function. This association leads to a simple alternative proof for the Adler, Shiota, and van Moerbeke result [14, 15, 16, 17] mentioned above. A further important observation in Sect. 3 is that the spectral representation equations for the (adjoint) BA functions themselves can be considered as defining equations for the KP hierarchy. In other words, our formalism of spectral representations of KP eigenfunctions can be viewed as an equivalent alternative characterization of the KP hierarchy parallel to Hirota's bilinear identity or Fay's trisecant identity [14].

2 The $\mathrm{cKP}_{r,m}$ integrable hierarchies appeared in different disguises from various parallel developments: (a) symmetry reductions [23, 19, 24] of the full KP hierarchy; (b) abelianization, i.e., free-field realizations, in terms of finite number of fields, of both compatible first and second KP Hamiltonian structures [25, 26]; (c) a method of extracting continuum integrable hierarchies from the generalized Toda-like lattice hierarchies [27] underlying (multi-)matrix models; (d) purely algebraic approach in terms of the zero-curvature equations for the affine Kac-Moody algebras with non-standard gradations [28].

Our results in the constrained $\mathrm{cKP}_{r,m}$ hierarchy case are as follows. In Sect. 4, using the SEP framework we obtain an equivalent description (parallel to the one within the Lax pseudo-differential operator approach) of $\mathrm{cKP}_{r,m}$ hierarchies entirely in terms of τ-functions only. Namely, we first derive a linear equation for the τ-function (Eq. (4.9) below), involving the bilocal vertex operator $\hat{X}(\lambda, \mu)$ and a set of spectral densities, which uniquely constrains the τ-function to belong to the $\mathrm{cKP}_{r,m}$ hierarchy. Furthermore, we provide in Sect. 4 an alternative description of $\mathrm{cKP}_{r,m}$ hierarchies in the language of the universal Sato Grassmannian. One of the advantages of the SEP approach lies in the fact that it allows for a coherent treatment of the non-isospectral symmetries for KP-type hierarchies. We use this feature in Sect.
5 to demonstrate how the combination of the familiar Orlov-Schulman nonisospectral symmetry flows, operating in the full unconstrained KP hierarchy, together with certain appropriately chosen additional SEP-generated “ghost” symmetry flows [29, 8] gives rise to the correct non-isospectral Virasoro symmetry generators acting on the space of cKPr,m τ -functions3 . The SEP method is applied further in Sect. 6 to formulate generalized multi-step binary Darboux-B¨acklund (DB) transformation rules of (constrained) KP hierarchies (one-step binary DB transformations with SEP have been introduced previously in ref.[30]). Based on these transformation rules and using the fundamental Fay identities, we derive a series of new determinant solutions for the τ -functions generalizing the known Wronskian (multi-soliton) solutions. The binary DB orbits define a discrete symmetry structure for cKPr,m hierarchies corresponding to a square lattice. We exhibit the equivalence of these binary DB orbits with a generalized Toda model on a square lattice. Our formalism offers applications to the study of random matrix models in condensed matter physics, which we briefly discuss in Sect. 7, where we also present some discussion of the results and an outlook. 2. Background on General and Constrained KP Hierarchies The calculus of the pseudo-differential operators (see e.g. [3, 2]) is one of the principal approaches to describe the KP hierarchy of integrable nonlinear evolution equations. In what follows the operator D is such that [ D , f ] = ∂f = ∂f /∂x and the generalized Leibniz rule holds: ∞ X n (∂ j f )Dn−j , n ∈ Z. (2.1) Dn f = j j=0
3 The standard Orlov-Schulman non-isospectral symmetry flows do not preserve the constrained form (2.20) of the $\mathrm{cKP}_{r,m}$ hierarchy.
Squared Eigenfunction Potentials in Integrable Hierarchies of KP Type
In order to avoid confusion we shall employ the following notations: for any (pseudo-)differential operator $A$ and a function $f$, the symbol $A(f)$ will indicate application (action) of $A$ on $f$, whereas the symbol $Af$ will denote just the operator product of $A$ with the zero-order (multiplication) operator $f$. In this approach the main object is the pseudo-differential Lax operator:

$$
L = D^r + \sum_{j=0}^{r-2} v_j D^j + \sum_{i=1}^{\infty} u_i D^{-i}. \tag{2.2}
$$

The Lax equations of motion:

$$
\frac{\partial L}{\partial t_n} = \bigl[ L^{n/r}_+ , L \bigr], \qquad n = 1, 2, \ldots \tag{2.3}
$$

describe isospectral deformations of $L$. In (2.3) and in what follows the subscripts $(\pm)$ of any pseudo-differential operator $A = \sum_j a_j D^j$ denote its purely differential part ($A_+ = \sum_{j \ge 0} a_j D^j$) or its purely pseudo-differential part ($A_- = \sum_{j \ge 1} a_{-j} D^{-j}$), respectively. Further, $(t) \equiv (t_1 \equiv x, t_2, \ldots)$ collectively denotes the infinite set of evolution parameters (KP “multi-time”) from (2.3). The Lax operator (2.2) can be represented equivalently in terms of the so-called dressing operator $W$:

$$
W = 1 + \sum_{n=1}^{\infty} w_n D^{-n}; \qquad L = W D^r W^{-1}, \tag{2.4}
$$

whereupon the Lax Eqs. (2.3) become equivalent to the so-called Wilson-Sato equations:

$$
\frac{\partial W}{\partial t_n} = L^{n/r}_+ W - W D^n = - L^{n/r}_- W. \tag{2.5}
$$
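A further important object is the Baker–Akhiezer (BA) “wave” function defined via:

$$
\psi_{BA}(t,\lambda) = W\bigl(e^{\xi(t,\lambda)}\bigr) = w(t,\lambda)\, e^{\xi(t,\lambda)}; \qquad
w(t,\lambda) = 1 + \sum_{i=1}^{\infty} w_i(t)\, \lambda^{-i}, \tag{2.6}
$$

where

$$
\xi(t,\lambda) \equiv \sum_{n=1}^{\infty} t_n \lambda^n; \qquad t_1 = x. \tag{2.7}
$$

Accordingly, there is also an adjoint BA function:

$$
\psi^*_{BA}(t,\lambda) = W^{*-1}\bigl(e^{-\xi(t,\lambda)}\bigr) = w^*(t,\lambda)\, e^{-\xi(t,\lambda)}; \qquad
w^*(t,\lambda) = 1 + \sum_{i=1}^{\infty} w_i^*(t)\, \lambda^{-i}. \tag{2.8}
$$

The (adjoint) BA functions obey the following linear system:

$$
\begin{aligned}
L\bigl(\psi_{BA}(t,\lambda)\bigr) &= \lambda^r \psi_{BA}(t,\lambda), &
\frac{\partial}{\partial t_n}\psi_{BA}(t,\lambda) &= L^{n/r}_+\bigl(\psi_{BA}(t,\lambda)\bigr), \\
L^*\bigl(\psi^*_{BA}(t,\lambda)\bigr) &= \lambda^r \psi^*_{BA}(t,\lambda), &
\frac{\partial}{\partial t_n}\psi^*_{BA}(t,\lambda) &= -\bigl(L^*\bigr)^{n/r}_+\bigl(\psi^*_{BA}(t,\lambda)\bigr).
\end{aligned} \tag{2.9}
$$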
H. Aratyn, E. Nissimov, S. Pacheva
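Note that Eqs. (2.3) for the KP hierarchy flows can be regarded as compatibility conditions for the system (2.9). There exists another equivalent and powerful approach to the KP hierarchy based on one single function of all evolution parameters: the so-called tau-function $\tau(t)$ [3]. It is an alternative to using the Lax operator and the calculus of pseudo-differential operators. The $\tau$-function is related to the BA functions (2.6)–(2.9) via:

\begin{align}
\psi_{BA}(t,\lambda) &= \frac{\tau\bigl(t - [\lambda^{-1}]\bigr)}{\tau(t)}\, e^{\xi(t,\lambda)}
 = e^{\xi(t,\lambda)} \sum_{n=0}^{\infty} \frac{p_n(-[\partial])\,\tau(t)}{\tau(t)}\, \lambda^{-n}, \tag{2.10} \\
\psi^*_{BA}(t,\lambda) &= \frac{\tau\bigl(t + [\lambda^{-1}]\bigr)}{\tau(t)}\, e^{-\xi(t,\lambda)}
 = e^{-\xi(t,\lambda)} \sum_{n=0}^{\infty} \frac{p_n([\partial])\,\tau(t)}{\tau(t)}\, \lambda^{-n}, \tag{2.11}
\end{align}

where

$$
[\lambda^{-1}] \equiv \Bigl( \lambda^{-1}, \tfrac{1}{2}\lambda^{-2}, \tfrac{1}{3}\lambda^{-3}, \ldots \Bigr); \qquad
[\partial] \equiv \Bigl( \frac{\partial}{\partial t_1}, \frac{1}{2}\frac{\partial}{\partial t_2}, \frac{1}{3}\frac{\partial}{\partial t_3}, \ldots \Bigr), \tag{2.12}
$$

and the Schur polynomials $p_n(t)$ are defined through

$$
e^{\sum_{l \ge 1} \lambda^l t_l} = \sum_{n=0}^{\infty} \lambda^n\, p_n(t_1, t_2, \ldots). \tag{2.13}
$$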
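The generating function (2.13) yields a simple recursion for the Schur polynomials: differentiating both sides with respect to $\lambda$ and comparing powers gives $n\,p_n = \sum_{k=1}^{n} k\,t_k\,p_{n-k}$ with $p_0 = 1$. The following is a minimal sketch that evaluates $p_n$ at numeric times; the function name `schur_polynomials` and the use of exact fractions are illustrative choices, not part of the source.

```python
from fractions import Fraction


def schur_polynomials(t, n_max):
    """Return [p_0, ..., p_{n_max}] evaluated at numeric times t = [t_1, t_2, ...].

    Uses the recursion n * p_n = sum_{k=1}^{n} k * t_k * p_{n-k}, obtained by
    differentiating the generating function (2.13) with respect to lambda.
    Times beyond the end of the list are treated as zero.
    """
    p = [Fraction(1)]
    for n in range(1, n_max + 1):
        total = Fraction(0)
        for k in range(1, n + 1):
            t_k = Fraction(t[k - 1]) if len(t) >= k else Fraction(0)
            total += k * t_k * p[n - k]
        p.append(total / n)
    return p


if __name__ == "__main__":
    # p_1 = t_1, p_2 = t_1^2/2 + t_2, p_3 = t_1^3/6 + t_1 t_2 + t_3
    print(schur_polynomials([2, 3, 5], 3))
```

For times of the form $t_l = -z^l/l$ (the pattern of the shift in (2.12) with $z = \lambda^{-1}$) the generating function collapses to $1 - \lambda z$, so every $p_n$ with $n \ge 2$ vanishes; this gives a convenient sanity check on the recursion.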
Taking into account (2.10) and (2.6), the expansion for the dressing operator (2.4) becomes:

$$
W = \sum_{n=0}^{\infty} \frac{p_n(-[\partial])\,\tau(t)}{\tau(t)}\, D^{-n}, \qquad
\text{i.e.} \quad w_1(t) = \mathrm{Res}\, W = -\partial_x \ln \tau(t). \tag{2.14}
$$

The (adjoint) BA functions enter the fundamental Hirota bilinear identity:

$$
\int d\lambda\; \psi_{BA}(t,\lambda)\, \psi^*_{BA}(t',\lambda) = 0, \tag{2.15}
$$

which generates the entire KP hierarchy. Here and in what follows integrals over spectral parameters are understood as: $\int d\lambda \equiv \oint_0 \frac{d\lambda}{2i\pi} = \mathrm{Res}_{\lambda=0}$.

Let us also recall that the KP hierarchy possesses an infinite set of commuting integrals of motion w.r.t. the compatible first and second fundamental Poisson-bracket structures, whose densities $h_{l-1} = \frac{1}{l}\,\mathrm{Res}\, L^{l/r}$ are expressed in terms of the $\tau$-function (2.10) as

$$
\partial_x \frac{\partial}{\partial t_l} \ln \tau(t) = \mathrm{Res}\, L^{l/r}. \tag{2.16}
$$

Below we shall be particularly interested in reductions of the full (unconstrained) KP hierarchy (2.2). In this respect, it turns out that a crucial rôle is played by the notions of (adjoint) eigenfunctions of the KP hierarchy.

Definition 2.1. The function $\Phi$ ($\Psi$) is called an (adjoint) eigenfunction of the Lax operator $L$ satisfying Sato's flow equation (2.3) if its flows are given by the expressions:

$$
\frac{\partial \Phi}{\partial t_k} = L^{k/r}_+(\Phi); \qquad
\frac{\partial \Psi}{\partial t_k} = -\bigl(L^*\bigr)^{k/r}_+(\Psi) \tag{2.17}
$$

for the infinitely many times $t_k$.
Of course, according to (2.9) the (adjoint) BA functions are particular examples of (adjoint) eigenfunctions which, however, also satisfy the corresponding spectral equations. In what follows a very important rôle will be played by the notion of the so-called squared eigenfunction potential (SEP). As shown by Oevel [19], for an arbitrary pair of (adjoint) eigenfunctions $\Phi(t)$, $\Psi(t)$ there exists a function $S(\Phi(t),\Psi(t))$, called SEP, which possesses the following characteristics:

$$
\frac{\partial}{\partial t_n} S\bigl(\Phi(t),\Psi(t)\bigr) = \mathrm{Res}\Bigl( D^{-1}\, \Psi\, (L^{n/r})_+\, \Phi\, D^{-1} \Bigr). \tag{2.18}
$$

The argument in [19], proving the existence of $S(\Phi(t),\Psi(t))$, was built on compatibility between isospectral flows as defined in Eqs. (2.18) and (2.17). In particular, for $n = 1$, Eq. (2.18) implies that the space derivative (recall $\partial_x \equiv \partial/\partial t_1$) of $S(\Phi(t),\Psi(t))$ is equal to the product of the underlying eigenfunctions:

$$
\partial_x S\bigl(\Phi(t),\Psi(t)\bigr) = \Phi(t)\, \Psi(t). \tag{2.19}
$$
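The $n = 1$ case can be checked by hand. The sketch below uses $(L^{1/r})_+ = D$, which holds because $L^{1/r} = W D W^{-1} = D + O(D^{-1})$, together with the $n = -1$ instance of the Leibniz rule (2.1); here $\mathrm{Res}$ denotes the coefficient of $D^{-1}$:

```latex
\begin{align*}
D^{-1}\Psi\, D\, \Phi\, D^{-1}
  &= D^{-1}\Psi\bigl(\Phi D + (\partial\Phi)\bigr)D^{-1}
   = D^{-1}\,\Psi\Phi \;+\; D^{-1}\,\Psi(\partial\Phi)\,D^{-1}, \\
D^{-1}\,\Psi\Phi &= \Psi\Phi\, D^{-1} - \partial(\Psi\Phi)\, D^{-2} + \cdots, \\
\mathrm{Res}\bigl(D^{-1}\Psi\, D\, \Phi\, D^{-1}\bigr) &= \Phi\,\Psi .
\end{align*}
```

The second term in the first line is of order $D^{-2}$ and therefore does not contribute to the residue.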
Remark. Equation (2.18) determines $S(\Phi(t),\Psi(t))$ up to a shift by a trivial constant. From Eqs. (2.18)–(2.19) one sees that $\Phi(t)\Psi(t)$ is a conserved density of the KP hierarchy. This fact has a special significance for the reduction of the general KP hierarchy to the constrained $\mathrm{cKP}_{r,m}$ models (see below).

Definition 2.2. The constrained KP hierarchy (denoted as $\mathrm{cKP}_{r,m}$) is given by a Lax operator of the following special form:

$$
L \equiv L_{r,m} = D^r + \sum_{l=0}^{r-2} u_l D^l + \sum_{a=1}^{m} \Phi_a D^{-1} \Psi_a. \tag{2.20}
$$
One can easily check that the functions $\Phi_a$, $\Psi_a$, entering the purely pseudo-differential part of $L_{r,m}$ (2.20), are (adjoint) eigenfunctions of $L_{r,m}$ according to Def. 2.1. The purely pseudo-differential part of an arbitrary power of the $\mathrm{cKP}_{r,m}$ Lax operator (2.20) has the following explicit form [31]:

$$
\bigl(L^k\bigr)_- = \sum_{a=1}^{m} \sum_{j=0}^{k-1} L^{k-j-1}(\Phi_a)\, D^{-1}\, \bigl(L^*\bigr)^{j}(\Psi_a). \tag{2.21}
$$

Formula (2.21) can easily be derived from the simple technical identity involving a product of two pseudo-differential operators of the form $f_i D^{-1} g_i$, $i = 1, 2$:

$$
f_1 D^{-1} g_1\, f_2 D^{-1} g_2 = f_1 S(f_2, g_1) D^{-1} g_2 - f_1 D^{-1} S(f_2, g_1)\, g_2, \tag{2.22}
$$

where $f_i$, $g_i$ are pairs of (adjoint) eigenfunctions of some KP Lax operator, with $S(\cdot,\cdot)$ being the corresponding SEP. Note that, in agreement with Eq. (2.22), the expression $L(\Phi_a)$ in (2.21) with $L$ as in (2.20) explicitly reads: $L(\Phi_a) = L_+(\Phi_a) + \sum_{b=1}^{m} \Phi_b\, S(\Phi_a, \Psi_b)$. This definition extends naturally to higher powers of $L$ acting on $\Phi_a$ as well as higher powers of $L^*$ acting on $\Psi_a$. Moreover, one can easily show that $L^l(\Phi_a)$ and $(L^*)^k(\Psi_a)$ are (adjoint) eigenfunctions of $L$ (2.20) as well. For three pseudo-differential operators $X_i \equiv f_i D^{-1} g_i$, $i = 1, 2, 3$, the associativity law $(X_1 X_2) X_3 = X_1 (X_2 X_3)$ implies the following technical lemma:
Lemma 2.1. The squared eigenfunction potential S(·, ·) satisfies: S(f, g)S(h, k) = S (h, kS(f, g)) + S (f S(h, k), g)
(2.23)
for (adjoint) eigenfunctions f, g, h, k.
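3. Spectral Representation of KP Eigenfunctions

Consider the bilocal vertex operator [4]:

$$
\begin{aligned}
\hat{X}(\lambda,\mu) &\equiv \frac{1}{\lambda} : e^{\hat\theta(\lambda)} : \, : e^{-\hat\theta(\mu)} :
 \;=\; \frac{1}{\lambda}\, e^{\xi(t+[\lambda^{-1}],\mu) - \xi(t,\lambda)}\,
 e^{\sum_{l=1}^{\infty} \frac{1}{l}\left(\lambda^{-l} - \mu^{-l}\right) \frac{\partial}{\partial t_l}} \\
 &= -\frac{1}{\mu}\, e^{\xi(t,\mu) - \xi(t-[\mu^{-1}],\lambda)}\,
 e^{\sum_{l=1}^{\infty} \frac{1}{l}\left(\lambda^{-l} - \mu^{-l}\right) \frac{\partial}{\partial t_l}} + \delta(\lambda,\mu),
\end{aligned} \tag{3.1}
$$

where

$$
\hat\theta(\lambda) \equiv -\sum_{l=1}^{\infty} \lambda^l t_l + \sum_{l=1}^{\infty} \frac{1}{l}\, \lambda^{-l}\, \frac{\partial}{\partial t_l}. \tag{3.2}
$$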
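Here $\xi(t,\lambda)$ is as in (2.7), the colons $: \ldots :$ indicate Wick normal ordering w.r.t. the creation/annihilation “modes” $t_l$ and $\partial/\partial t_l$, respectively, and the delta-function is defined as

$$
\delta(\lambda,\mu) = \frac{1}{\lambda}\, \frac{1}{1 - \frac{\mu}{\lambda}} + \frac{1}{\mu}\, \frac{1}{1 - \frac{\lambda}{\mu}}. \tag{3.3}
$$

An equivalent representation for $\hat{X}(\lambda,\mu)$, using the Wick theorem, reads

$$
\hat{X}(\lambda,\mu) = \frac{1}{\lambda-\mu} : e^{\hat\theta(\lambda) - \hat\theta(\mu)} :
 \;=\; \frac{1}{\lambda-\mu}\, e^{\xi(t,\mu) - \xi(t,\lambda)}\,
 e^{\sum_{l=1}^{\infty} \frac{1}{l}\left(\lambda^{-l} - \mu^{-l}\right) \frac{\partial}{\partial t_l}}
 \qquad \text{for } |\mu| \le |\lambda|. \tag{3.4}
$$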
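The passage from the first form of (3.1) to (3.4) rests on a one-line computation with the shifted exponent. The following sketch spells it out; the geometric-type series is understood as an expansion for $|\mu| < |\lambda|$:

```latex
\begin{align*}
\xi\bigl(t + [\lambda^{-1}], \mu\bigr)
  &= \sum_{n=1}^{\infty} \Bigl(t_n + \frac{\lambda^{-n}}{n}\Bigr)\mu^n
   = \xi(t,\mu) + \sum_{n=1}^{\infty} \frac{1}{n}\Bigl(\frac{\mu}{\lambda}\Bigr)^{n}
   = \xi(t,\mu) - \ln\Bigl(1 - \frac{\mu}{\lambda}\Bigr), \\
\frac{1}{\lambda}\, e^{\xi(t+[\lambda^{-1}],\mu) - \xi(t,\lambda)}
  &= \frac{1}{\lambda}\,\frac{1}{1 - \mu/\lambda}\, e^{\xi(t,\mu) - \xi(t,\lambda)}
   = \frac{1}{\lambda - \mu}\, e^{\xi(t,\mu) - \xi(t,\lambda)} .
\end{align*}
```

The same computation applied to $\xi(t - [\mu^{-1}], \lambda)$, now expanded for $|\lambda| < |\mu|$, gives the prefactor of the second form in (3.1); the two expansions differ by the delta-function (3.3).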
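The vertex operator $\hat{X}(\lambda,\mu)$ can be expanded as follows:

$$
\hat{X}(\lambda,\mu) = \frac{1}{\lambda-\mu} \sum_{l=0}^{\infty} \frac{(\mu-\lambda)^l}{l!}
 \sum_{s=-\infty}^{\infty} \lambda^{-s-l-1}\, \frac{1}{l+1}\, \hat{W}^{(l+1)}_s, \tag{3.5}
$$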
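where the operators $\hat{W}^{(l)}_s$ span the $W_{1+\infty}$ algebra. From the standard representation for the (adjoint) Baker–Akhiezer wave functions (2.10), (2.11) in terms of the $\tau$-function we deduce the identity:

\begin{align}
\frac{\hat{X}(\lambda,\mu)\,\tau(t)}{\tau(t)}
 &= \frac{1}{\lambda}\, \psi^*_{BA}(t,\lambda)\, \psi_{BA}\bigl(t + [\lambda^{-1}], \mu\bigr) \tag{3.6} \\
 &= -\frac{1}{\mu}\, \psi_{BA}(t,\mu)\, \psi^*_{BA}\bigl(t - [\mu^{-1}], \lambda\bigr) + \delta(\lambda,\mu). \tag{3.7}
\end{align}

Now, recall the Fay identity [14] for $\tau$-functions:

$$
(s_0 - s_1)(s_2 - s_3)\, \tau(t + [s_0] + [s_1])\, \tau(t + [s_2] + [s_3]) + \mathrm{cyclic}(1, 2, 3) = 0 \tag{3.8}
$$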
which, in fact, is equivalent to the Hirota bilinear identity (2.15). In what follows, we shall often make use of a particular form of (3.8) obtained upon setting $s_0 = 0$, dividing by $s_1 s_2 s_3$ and shifting the KP times $(t) \to (t - [s_2] - [s_3])$:

$$
\begin{aligned}
&\bigl(s_2^{-1} - s_3^{-1}\bigr)\, \tau(t + [s_1] - [s_2] - [s_3])\, \tau(t)
 + \bigl(s_1^{-1} - s_2^{-1}\bigr)\, \tau(t - [s_2])\, \tau(t + [s_1] - [s_3]) \\
&\qquad + \bigl(s_3^{-1} - s_1^{-1}\bigr)\, \tau(t - [s_3])\, \tau(t + [s_1] - [s_2]) = 0.
\end{aligned} \tag{3.9}
$$
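The steps leading from the Fay identity (3.8) to its truncated form (3.9) can be sketched explicitly. Writing out the three cyclic terms at $s_0 = 0$ (where $[s_0] = 0$) and dividing by $s_1 s_2 s_3$ gives:

```latex
\begin{align*}
0 &= -s_1(s_2 - s_3)\,\tau(t+[s_1])\,\tau(t+[s_2]+[s_3])
     - s_2(s_3 - s_1)\,\tau(t+[s_2])\,\tau(t+[s_3]+[s_1]) \\
  &\quad - s_3(s_1 - s_2)\,\tau(t+[s_3])\,\tau(t+[s_1]+[s_2]), \\[4pt]
0 &= \bigl(s_2^{-1} - s_3^{-1}\bigr)\,\tau(t+[s_1])\,\tau(t+[s_2]+[s_3])
     + \bigl(s_3^{-1} - s_1^{-1}\bigr)\,\tau(t+[s_2])\,\tau(t+[s_3]+[s_1]) \\
  &\quad + \bigl(s_1^{-1} - s_2^{-1}\bigr)\,\tau(t+[s_3])\,\tau(t+[s_1]+[s_2]).
\end{align*}
```

Replacing $t$ by $t - [s_2] - [s_3]$ in the last line then reproduces the three terms of (3.9), in the order first, third, second.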
In particular, making the identification $s_1 = \mu^{-1}$, $s_2 = z^{-1}$ and $s_3 = \lambda^{-1}$ in (3.9) and using (2.10)–(2.11), we arrive at the following useful lemma:

Lemma 3.1. The truncated Fay identity (3.9) is equivalent to the following bilinear identity for (adjoint) BA functions:

$$
\hat\Delta_z \Bigl( \frac{1}{\lambda}\, \psi_{BA}(t,\lambda)\, \psi^*_{BA}\bigl(t - [\lambda^{-1}], \mu\bigr) \Bigr)
 = -\frac{1}{z}\, \psi_{BA}(t,\lambda)\, \psi^*_{BA}\bigl(t - [z^{-1}], \mu\bigr), \tag{3.10}
$$

where $\hat\Delta_z$ is the shift-difference operator acting on functions depending on the variables $t = (t_1, t_2, \ldots)$ as follows:

$$
\hat\Delta_z \equiv e^{-\sum_{l=1}^{\infty} \frac{1}{l} z^{-l}\, \partial/\partial t_l} - 1, \qquad
\hat\Delta_z f(t) = f\bigl(t - [z^{-1}]\bigr) - f(t). \tag{3.11}
$$
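The Fay identity (3.8) is also known in its differential version:

$$
\partial_x \Bigl( \frac{\tau\bigl(t + [\lambda^{-1}] - [\mu^{-1}]\bigr)}{\tau(t)} \Bigr)
 = (\lambda - \mu) \Bigl( \frac{\tau\bigl(t + [\lambda^{-1}] - [\mu^{-1}]\bigr)}{\tau(t)}
 - \frac{\tau\bigl(t + [\lambda^{-1}]\bigr)}{\tau(t)}\, \frac{\tau\bigl(t - [\mu^{-1}]\bigr)}{\tau(t)} \Bigr). \tag{3.12}
$$

Using (3.4) and multiplying both sides of (3.12) by $\exp\{-\xi(t,\lambda) + \xi(t,\mu)\}$ we can rewrite it as

$$
\partial_x \Bigl( \frac{\hat{X}(\lambda,\mu)\,\tau(t)}{\tau(t)} \Bigr) = -\psi^*_{BA}(t,\lambda)\, \psi_{BA}(t,\mu) \tag{3.13}
$$

or, equivalently, using (3.6) and (3.7):

$$
\begin{aligned}
\partial_x \Bigl( -\frac{1}{\lambda}\, \psi^*_{BA}(t,\lambda)\, \psi_{BA}\bigl(t + [\lambda^{-1}], \mu\bigr) \Bigr) &= \psi^*_{BA}(t,\lambda)\, \psi_{BA}(t,\mu), \\
\partial_x \Bigl( \frac{1}{\mu}\, \psi_{BA}(t,\mu)\, \psi^*_{BA}\bigl(t - [\mu^{-1}], \lambda\bigr) \Bigr) &= \psi^*_{BA}(t,\lambda)\, \psi_{BA}(t,\mu).
\end{aligned} \tag{3.14}
$$

Let $\Phi$, $\Psi$ be a pair of an eigenfunction and an adjoint eigenfunction of the general KP hierarchy. Our main statement in this section is: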
Proposition 3.1. Any (adjoint) eigenfunction of the general KP hierarchy possesses a spectral representation of the form:

$$
\Phi(t) = \int d\lambda\; \varphi(\lambda)\, \psi_{BA}(t,\lambda); \qquad
\Psi(t) = \int d\lambda\; \varphi^*(\lambda)\, \psi^*_{BA}(t,\lambda) \tag{3.15}
$$

with spectral densities given by

$$
\varphi(\lambda) = \frac{1}{\lambda}\, \psi^*_{BA}(t',\lambda)\, \Phi\bigl(t' + [\lambda^{-1}]\bigr); \qquad
\varphi^*(\lambda) = \frac{1}{\lambda}\, \psi_{BA}(t',\lambda)\, \Psi\bigl(t' - [\lambda^{-1}]\bigr), \tag{3.16}
$$

where the multi-time $t' = (t'_1, t'_2, \ldots)$ is taken at some arbitrary fixed value. In other words:

\begin{align}
\Phi(t) &= \int d\lambda\; \psi_{BA}(t,\lambda)\, \frac{1}{\lambda}\, \psi^*_{BA}(t',\lambda)\, \Phi\bigl(t' + [\lambda^{-1}]\bigr), \tag{3.17} \\
\Psi(t) &= \int d\lambda\; \psi^*_{BA}(t,\lambda)\, \frac{1}{\lambda}\, \psi_{BA}(t',\lambda)\, \Psi\bigl(t' - [\lambda^{-1}]\bigr) \tag{3.18}
\end{align}

are valid for arbitrary KP (adjoint) eigenfunctions $\Phi$, $\Psi$ and for an arbitrary fixed multi-time $t'$. Furthermore, the r.h.s. of (3.17) and (3.18) do not depend on $t'$.

We will prove the above proposition in two steps. First, let us assume that the (adjoint) eigenfunctions indeed possess a spectral representation of the form (3.15) with some spectral densities $\varphi^{(*)}(\lambda)$. In such a case the statement of the proposition is contained in a much simpler lemma:

Lemma 3.2. For (adjoint) eigenfunctions possessing the spectral representation (3.15) their respective spectral densities are given by (3.16). Consequently, in this case Eqs. (3.17) and (3.18) are valid too.

Proof. Using the spectral representation (3.15) for $\Phi\bigl(t' + [\lambda^{-1}]\bigr)$ and substituting it into the right hand side of (3.17), we get:

$$
\int d\lambda \int d\mu\; \varphi(\mu)\, \psi_{BA}(t,\lambda)\, \frac{1}{\lambda}\, \psi^*_{BA}(t',\lambda)\, \psi_{BA}\bigl(t' + [\lambda^{-1}], \mu\bigr). \tag{3.19}
$$

Recalling (3.7) we can rewrite (3.19) as:

$$
\begin{aligned}
&\int d\lambda \int d\mu\; \varphi(\mu)\, \psi_{BA}(t,\lambda) \Bigl[ -\frac{1}{\mu}\, \psi_{BA}(t',\mu)\, \psi^*_{BA}\bigl(t' - [\mu^{-1}], \lambda\bigr) + \delta(\lambda,\mu) \Bigr] \\
&\qquad = \int d\lambda\; \varphi(\lambda)\, \psi_{BA}(t,\lambda) = \Phi(t),
\end{aligned} \tag{3.20}
$$

where use was made of the fundamental Hirota bilinear identity (2.15). The $t'$-independence of the r.h.s. of (3.17) and (3.18) will be demonstrated explicitly in the course of the proof of Prop. 3.1 given below.

We are now ready to take the final step of the proof of Prop. 3.1 and extend the result of Lemma 3.2 to arbitrary KP (adjoint) eigenfunctions without assuming existence of a spectral representation (3.15). To this end we need to recall the notion of SEP (2.18)–(2.19). Let $S\bigl(\Phi(t), \psi^*_{BA}(t,\lambda)\bigr)$ be the SEP associated with a pair of eigenfunctions $\Phi(t)$ and $\psi^*_{BA}(t,\lambda)$, i.e. $\partial_x S\bigl(\Phi(t), \psi^*_{BA}(t,\lambda)\bigr) = \Phi(t)\, \psi^*_{BA}(t,\lambda)$. Before proceeding let us
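note the following simple property of $S\bigl(\Phi(t), \psi^*_{BA}(t,\lambda)\bigr)$, namely, it is an “oscillatory” type of function w.r.t. $\lambda$ of the form:

\begin{align}
S\bigl(\Phi(t), \psi^*_{BA}(t,\lambda)\bigr) &= e^{-\xi(t,\lambda)} \sum_{j=1}^{\infty} s_j(t)\, \lambda^{-j}
 = e^{-\xi(t,\lambda)} \bigl[ -\Phi(t)\, \lambda^{-1} + O(\lambda^{-2}) \bigr], \tag{3.21} \\
s_{j+1}(t) &= \partial_x s_j(t) - \Phi(t)\, w_j^*(t), \quad j = 0, 1, \ldots; \qquad s_0 \equiv 0, \tag{3.22}
\end{align}

where $w_j^*(t) = (\tau(t))^{-1}\, p_j([\partial])\, \tau(t)$ are the coefficients in the $\lambda$-expansion of $\psi^*_{BA}(t,\lambda)$ (2.11). Indeed, the defining relation (2.18) for the SEP in question:

$$
\frac{\partial}{\partial t_n} S\bigl(\Phi(t), \psi^*_{BA}(t,\lambda)\bigr) = \mathrm{Res}\Bigl( D^{-1}\, \psi^*_{BA}(t,\lambda)\, (L^{n/r})_+\, \Phi(t)\, D^{-1} \Bigr) \tag{3.23}
$$

implies the oscillatory form:

$$
e^{-\xi(t,\lambda)} \sum_{j=0}^{\infty} s_j(t)\, \lambda^{-j} \tag{3.24}
$$

of the latter up to an additive constant (recall that any SEP is defined up to the addition of a constant). Further, taking into account:

$$
\partial_x S\bigl(\Phi(t), \psi^*_{BA}(t,\lambda)\bigr) = \Phi(t)\, \psi^*_{BA}(t,\lambda) = e^{-\xi(t,\lambda)} \sum_{j=0}^{\infty} \Phi(t)\, w_j^*(t)\, \lambda^{-j} \tag{3.25}
$$

and comparing the series in the last equality (3.25) with the $\partial_x$ derivative of the series (3.24), one obtains the recurrence relations (3.22). Define now:

$$
\hat\Phi(t, t') = -\int d\lambda\; \psi_{BA}(t,\lambda)\, S\bigl(\Phi(t'), \psi^*_{BA}(t',\lambda)\bigr). \tag{3.26}
$$

We first observe that $\partial \hat\Phi(t,t')/\partial t'_n = 0$ due to Eqs. (2.18) and (2.15). Hence $\hat\Phi(t,t') = \hat\Phi(t)$ does not depend on the multi-time $t'$. Moreover, it is obvious from the definition (3.26) that $\hat\Phi(t)$ is an eigenfunction possessing a spectral representation of the form (3.15) and, therefore, satisfying the conditions of Lemma 3.2. Consequently, according to (3.17), we have:

$$
\hat\Phi(t) = \int d\lambda\; \psi_{BA}(t,\lambda)\, \frac{1}{\lambda}\, \psi^*_{BA}(t',\lambda)\, \hat\Phi\bigl(t' + [\lambda^{-1}]\bigr). \tag{3.27}
$$

Comparing integrands in Eqs. (3.26) and (3.27) we find:

$$
\frac{1}{\lambda}\, \psi^*_{BA}(t,\lambda)\, \hat\Phi\bigl(t + [\lambda^{-1}]\bigr) = -S\bigl(\Phi(t), \psi^*_{BA}(t,\lambda)\bigr) + X(t,\lambda), \tag{3.28}
$$

where $X(t,\lambda)$ is a function of the “oscillatory” type (3.24) (taking into account (3.21)–(3.22)):

$$
X(t,\lambda) = e^{-\xi(t,\lambda)} \Bigl[ \frac{1}{\lambda} \bigl( \hat\Phi(t) - \Phi(t) \bigr) + O(\lambda^{-2}) \Bigr]. \tag{3.29}
$$

It satisfies:

$$
\mathrm{Res}\bigl( \psi_{BA}(t,\lambda)\, X(t',\lambda) \bigr) = 0 \tag{3.30}
$$

for arbitrary $t'$. Inserting (3.29) in (3.30) and taking $t = t'$ we get $\hat\Phi(t) - \Phi(t) = 0$, i.e. $\hat\Phi(t) = \Phi(t)$. This concludes the proof of Prop. 3.1.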
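To make the recurrence (3.22) used in the proof above concrete, its first two steps can be written out. This is only a sketch; it uses $w_0^* = 1$ and $w_1^*(t) = p_1([\partial])\tau(t)/\tau(t) = \partial_x \ln \tau(t)$, both read off from (2.11):

```latex
\begin{align*}
s_1(t) &= \partial_x s_0(t) - \Phi(t)\, w_0^*(t) = -\Phi(t), \\
s_2(t) &= \partial_x s_1(t) - \Phi(t)\, w_1^*(t)
        = -\partial_x \Phi(t) - \Phi(t)\, \partial_x \ln \tau(t).
\end{align*}
```

The first line is the leading coefficient $-\Phi(t)\lambda^{-1}$ displayed in (3.21).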
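Corollary 3.1. Taking into account Prop. 3.1, Eqs. (3.13)–(3.14) imply the following relations:

\begin{align}
S\bigl(\psi_{BA}(t,\mu), \psi^*_{BA}(t,\lambda)\bigr) &= -\frac{\hat{X}(\lambda,\mu)\,\tau(t)}{\tau(t)}
 = -\frac{1}{\lambda}\, \psi_{BA}\bigl(t + [\lambda^{-1}], \mu\bigr)\, \psi^*_{BA}(t,\lambda), \tag{3.31} \\
S\bigl(\psi_{BA}(t,\lambda), \psi^*_{BA}(t,\mu)\bigr) &= -\frac{\hat{X}(\mu,\lambda)\,\tau(t)}{\tau(t)} + \delta(\mu,\lambda)
 = \frac{1}{\lambda}\, \psi_{BA}(t,\lambda)\, \psi^*_{BA}\bigl(t - [\lambda^{-1}], \mu\bigr), \tag{3.32} \\
S\bigl(\Phi(t), \psi^*_{BA}(t,\lambda)\bigr) &= -\frac{1}{\lambda}\, \psi^*_{BA}(t,\lambda)\, \Phi\bigl(t + [\lambda^{-1}]\bigr), \tag{3.33} \\
S\bigl(\psi_{BA}(t,\lambda), \Psi(t)\bigr) &= \frac{1}{\lambda}\, \psi_{BA}(t,\lambda)\, \Psi\bigl(t - [\lambda^{-1}]\bigr), \tag{3.34}
\end{align}

where $\Phi$, $\Psi$ are arbitrary (adjoint) eigenfunctions and $S(\cdot,\cdot)$ are the corresponding squared eigenfunction potentials. Moreover, we also have the following double spectral density representation for the SEP $S(\Phi(t),\Psi(t))$:

$$
\begin{aligned}
S\bigl(\Phi(t),\Psi(t)\bigr) &= -\int d\lambda \int d\mu\; \varphi^*(\lambda)\, \varphi(\mu)\, \frac{1}{\lambda}\, \psi^*_{BA}(t,\lambda)\, \psi_{BA}\bigl(t + [\lambda^{-1}], \mu\bigr) \\
 &= -\int d\lambda \int d\mu\; \varphi^*(\lambda)\, \varphi(\mu)\, \frac{\hat{X}(\lambda,\mu)\,\tau(t)}{\tau(t)}.
\end{aligned} \tag{3.35}
$$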
Taking into account (3.33)–(3.34), the spectral representations (3.17)–(3.18) become:

\begin{align}
\Phi(t) &= -\int d\lambda\; \psi_{BA}(t,\lambda)\, S\bigl(\Phi(t'), \psi^*_{BA}(t',\lambda)\bigr), \tag{3.36} \\
\Psi(t) &= \int d\lambda\; \psi^*_{BA}(t,\lambda)\, S\bigl(\psi_{BA}(t',\lambda), \Psi(t')\bigr). \tag{3.37}
\end{align}

Remark. Note that the expressions (3.36)–(3.37) applied to (adjoint) BA functions yield:

$$
\begin{aligned}
\psi_{BA}(t,\lambda) &= -\int d\mu\; \psi_{BA}(t,\mu)\, S\bigl(\psi_{BA}(t',\lambda), \psi^*_{BA}(t',\mu)\bigr), \\
\psi^*_{BA}(t,\lambda) &= \int d\mu\; \psi^*_{BA}(t,\mu)\, S\bigl(\psi_{BA}(t',\mu), \psi^*_{BA}(t',\lambda)\bigr),
\end{aligned} \tag{3.38}
$$

which shows that the SEP $S\bigl(\psi_{BA}(t',\lambda), \psi^*_{BA}(t',\mu)\bigr)$ can be identified with the Cauchy kernel for each fixed KP multi-time $t'$ (cf. also [21] and references therein, where the above SEP was previously introduced in the context of the Riemann factorization problem, as well as [22] for related discussion within the dispersionless KP hierarchy).
Remark. Going back to the spectral representation Eqs. (3.17)–(3.18), valid for any eigenfunction of the general KP hierarchy, we observe that they can be rewritten as evolution equations w.r.t. the KP multi-time of the following form:

\begin{align}
\Phi(t) &= \hat{U}(t,t')\, \Phi(t'), &
\hat{U}(t,t') &\equiv \int d\lambda\; \psi_{BA}(t,\lambda)\, \frac{1}{\lambda}\, \psi^*_{BA}(t',\lambda)\, e^{\sum_{l=1}^{\infty} \frac{1}{l} \lambda^{-l}\, \partial/\partial t'_l}, \tag{3.39} \\
\Psi(t) &= \hat{U}^*(t,t')\, \Psi(t'), &
\hat{U}^*(t,t') &\equiv \int d\lambda\; \psi^*_{BA}(t,\lambda)\, \frac{1}{\lambda}\, \psi_{BA}(t',\lambda)\, e^{-\sum_{l=1}^{\infty} \frac{1}{l} \lambda^{-l}\, \partial/\partial t'_l}. \tag{3.40}
\end{align}

One can readily verify that:

\begin{align}
&\hat{U}(t,t) = \mathbb{1}, \qquad \hat{U}^{-1}(t,t') = \hat{U}(t',t), \qquad \hat{U}(t,t') = \hat{U}(t,t'')\, \hat{U}(t'',t'), \tag{3.41} \\
&\frac{\partial}{\partial t_l} \hat{U}(t,t') = L^{l/r}_+\, \hat{U}(t,t'), \qquad
 \frac{\partial}{\partial t'_l} \hat{U}(t,t') = -L'^{\,l/r}_+\, \hat{U}(t,t'). \tag{3.42}
\end{align}

From (3.41)–(3.42) we deduce that the evolution operator $\hat{U}(t,t')$ (3.39) can be formally written as a path-ordered exponential:

$$
\hat{U}(t,t') = P \exp \Bigl\{ \int_0^1 ds \sum_{l=1}^{\infty} \frac{dt_l}{ds}\, L^{l/r}_+\bigl(t(s)\bigr) \Bigr\}; \qquad
t_k(0) = t'_k, \quad t_k(1) = t_k, \quad k = 1, 2, \ldots \tag{3.43}
$$

which precisely agrees with the formal solution of the differential evolution Eqs. (2.17) for the KP eigenfunctions. The r.h.s. of (3.43) is independent of the particular path $\{t_k(s)\}$ connecting the points $t'$ and $t$ in the space of KP multi-times due to the “zero-curvature” Zakharov–Shabat equations:

$$
\frac{\partial}{\partial t_k} L^{l/r}_+ - \frac{\partial}{\partial t_l} L^{k/r}_+ - \bigl[ L^{k/r}_+, L^{l/r}_+ \bigr] = 0. \tag{3.44}
$$

Thus, our SEP method allowed us to find the explicit expression (r.h.s. of the second Eq. (3.39)) for the formal path-ordered exponential (3.43).

Now, it is worthwhile to observe that we can reverse the logic of our procedure above, i.e., instead of starting with the Hirota bilinear identity (2.15) (or, equivalently, with the Fay identity (3.8)) as defining the KP hierarchy and deriving from them the spectral representation formalism (3.17)–(3.18) (or (3.36)–(3.37)) for KP eigenfunctions, we can take the spectral representation Eqs. (3.36)–(3.37) as the basic equations defining the KP hierarchy. Namely, we have the following simple:

Proposition 3.2. Consider a pair of functions $\psi(t,\lambda)$, $\psi^*(t,\lambda)$ of the multi-time $(t_1, t_2, \ldots)$ and the spectral parameter $\lambda$ of the form $\psi^{(*)}(t,\lambda) = e^{\pm\xi(t,\lambda)} \sum_{j=0}^{\infty} w_j^{(*)}(t)\, \lambda^{-j}$ with $w_0^{(*)} = 1$ and $\xi(t,\lambda)$ as in (2.7). Let us assume that $\psi^{(*)}(t,\lambda)$ obey the following spectral identities:

$$
\psi(t,\lambda) = -\int d\mu\; \psi(t,\mu)\, S(t'; \lambda, \mu), \qquad
\psi^*(t,\lambda) = \int d\mu\; \psi^*(t,\mu)\, S(t'; \mu, \lambda) \tag{3.45}
$$

for two arbitrary multi-times $t$ and $t'$, where by definition the function $S(t; \lambda, \mu)$ is such that $\frac{\partial}{\partial t_1} S(t; \lambda, \mu) = \psi(t,\lambda)\, \psi^*(t,\mu)$. Then, Eqs. (3.45) are equivalent to the Hirota bilinear identity (2.15) and, accordingly, $\psi^{(*)}(t,\lambda)$ become (adjoint) BA functions of the associated KP hierarchy.
To see that Eqs. (3.45) imply the Hirota identity (2.15), it is enough to differentiate both sides of (3.45) w.r.t. $t'_1$: $0 = \partial\psi(t,\lambda)/\partial t'_1 = -\psi(t',\lambda) \int d\mu\; \psi(t,\mu)\, \psi^*(t',\mu)$. The proof of the inverse statement of the equivalence, namely, that the Hirota bilinear identity (2.15) implies the spectral representation Eqs. (3.45), is contained in the proof of Prop. 3.1 above.

Using (2.10)–(2.11), Eqs. (3.33)–(3.34) can be rewritten as:

\begin{align}
\frac{\tau\bigl(t + [\lambda^{-1}]\bigr)\, \Phi\bigl(t + [\lambda^{-1}]\bigr)}{\lambda\, \tau(t)}\, e^{-\xi(t,\lambda)} &= -S\bigl(\Phi(t), \psi^*_{BA}(t,\lambda)\bigr), \tag{3.46} \\
\frac{\tau\bigl(t - [\lambda^{-1}]\bigr)\, \Psi\bigl(t - [\lambda^{-1}]\bigr)}{\lambda\, \tau(t)}\, e^{\xi(t,\lambda)} &= S\bigl(\psi_{BA}(t,\lambda), \Psi(t)\bigr). \tag{3.47}
\end{align}

Remark. Spectral representations for eigenfunctions (3.36)–(3.37) as well as identities (3.46)–(3.47) were obtained in a similar form in [32] for the particular case of constrained $\mathrm{cKP}_{r,m}$ hierarchies. Let us specifically emphasize that all main equations of the present SEP method, (3.15)–(3.18), (3.31)–(3.37) and (3.46)–(3.47), derived above, are valid within the general unconstrained KP hierarchy.

Acting with the space derivative $\partial_x$ on both sides of (3.46)–(3.47) and shifting the KP time arguments, we get:

\begin{align}
\frac{\Phi\bigl(t - [\lambda^{-1}]\bigr)}{\Phi(t)} - 1 + \lambda^{-1}\, \partial \ln \Phi(t) &= \lambda^{-1}\, \partial \ln \frac{\tau\bigl(t - [\lambda^{-1}]\bigr)}{\tau(t)}, \tag{3.48} \\
\frac{\Psi\bigl(t + [\lambda^{-1}]\bigr)}{\Psi(t)} - 1 - \lambda^{-1}\, \partial \ln \Psi(t) &= -\lambda^{-1}\, \partial \ln \frac{\tau\bigl(t + [\lambda^{-1}]\bigr)}{\tau(t)}, \tag{3.49}
\end{align}

which were obtained in [8] by studying the way the $\tau$-function transforms under Darboux-Bäcklund transformations. Taking into consideration that:

$$
-\lambda + \lambda\, \frac{\Phi\bigl(t - [\lambda^{-1}]\bigr)}{\Phi(t)} + \partial \ln \Phi(t) = \sum_{n=2}^{\infty} \frac{p_n(-[\partial])\, \Phi(t)}{\lambda^{n-1}\, \Phi(t)} \tag{3.50}
$$

with $p_n(\cdot)$ being the Schur polynomials (2.13), we find that Eq. (3.48) is a generating equation for the following set of equations upon expanding in powers of $\lambda^{-1}$:

$$
p_n(-[\partial])\, \Phi(t) = v_n(t)\, \Phi(t), \quad n \ge 2; \qquad v_n(t) \equiv p_{n-1}(-[\partial])\, \partial \ln \tau(t). \tag{3.51}
$$
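For orientation, the lowest case $n = 2$ of (3.51) can be written out using $p_1(t) = t_1$ and $p_2(t) = \tfrac{1}{2}t_1^2 + t_2$ from (2.13), with $\partial \equiv \partial_x$. The following is a sketch of that special case only:

```latex
\begin{align*}
p_2(-[\partial]) &= \tfrac{1}{2}\,\partial_x^2 - \tfrac{1}{2}\,\frac{\partial}{\partial t_2}, \qquad
v_2(t) = p_1(-[\partial])\,\partial_x \ln \tau(t) = -\partial_x^2 \ln \tau(t), \\
&\Longrightarrow\quad
\frac{\partial \Phi}{\partial t_2} = \partial_x^2 \Phi + 2\bigl(\partial_x^2 \ln \tau\bigr)\,\Phi .
\end{align*}
```

For $r = 1$ this agrees with the $t_2$ flow in (2.17): there $(L^2)_+ = D^2 + 2u_1$, and $u_1 = \mathrm{Res}\,L = \partial_x^2 \ln \tau$ by (2.16).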
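Note that $v_n(t)$ are coefficients in the $\lambda$-expansion of the generating function $v(t,\lambda)$ [33]:

$$
v(t,\lambda) = \sum_{n=1}^{\infty} v_{n+1}\, \lambda^{-n} \equiv \partial_x \ln \psi_{BA}(t,\lambda) - \lambda = \hat\Delta_\lambda\, \partial_x \ln \tau(t), \tag{3.52}
$$

where in obtaining the last equality we again used Eqs. (2.10)–(2.11) and notation (3.11). We will later need a slight generalization of (3.52):

$$
v^{(l)}(t,\lambda) = \sum_{n=1}^{\infty} \sigma_n^{(l)}(t)\, \lambda^{-n} \equiv \frac{\partial}{\partial t_l} \ln \psi_{BA}(t,\lambda) - \lambda^l = \hat\Delta_\lambda\, \frac{\partial}{\partial t_l} \ln \tau(t); \qquad l \ge 1. \tag{3.53}
$$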
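Clearly $\sigma_n^{(l)}(t) = p_n(-[\partial])\, \partial/\partial t_l \ln \tau$ and $v_n(t) = \sigma^{(1)}_{n-1}(t)$, $n \ge 2$. The coefficients $\sigma_n^{(l)}$ enter the basic identity for the KP Lax operator (2.2):

$$
L^{l/r}_+ = L^{l/r} + \sum_{n=1}^{\infty} \sigma_n^{(l)}\, L^{-n/r}. \tag{3.54}
$$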
Remark. Eqs. (3.51) are, clearly, valid for an arbitrary eigenfunction $\Phi$ of the full KP hierarchy. On the other hand, in ref. [3] (see also [24]) Eqs. (3.51) were presented for the special case of $\Phi = \psi_{BA}(t,\lambda)$ as relations equivalent to the standard KP evolution equations $\partial\psi_{BA}(t,\lambda)/\partial t_n = (L^{n/r})_+\bigl(\psi_{BA}(t,\lambda)\bigr)$. In fact, as shown in [33], plugging the BA wave function $\Phi(t) = \psi_{BA}(t,\mu)$ into Eq. (3.48) one easily recovers the differential Fay identity (3.12).

We now define the “ghost” symmetry flows generated by the SEP [9, 19, 24, 8]. Let $\partial_\alpha$ be a vector field, whose action on the Lax operator $L$ and, accordingly, on the dressing operator $W$, is induced by a set of (adjoint) eigenfunctions $\Phi_a$, $\Psi_a$, $a \in \{\alpha\}$ through:

$$
\partial_\alpha L \equiv \Bigl[ \sum_{a \in \{\alpha\}} \Phi_a D^{-1} \Psi_a,\; L \Bigr]; \qquad
\partial_\alpha W \equiv \sum_{a \in \{\alpha\}} \Phi_a D^{-1} \Psi_a\, W. \tag{3.55}
$$

As shown in [19], the corresponding action of the above “ghost” flows on the (adjoint) eigenfunctions $\Phi$, $\Psi$:

$$
\partial_\alpha \Phi = \sum_{a \in \{\alpha\}} \Phi_a\, S(\Phi, \Psi_a); \qquad
\partial_\alpha \Psi = \sum_{a \in \{\alpha\}} S(\Phi_a, \Psi)\, \Psi_a \tag{3.56}
$$

is compatible with the isospectral evolutions of $\Phi$, $\Psi$. Furthermore, it is easy to see that

$$
\partial_\alpha S(\Phi, \Psi) = \sum_{a \in \{\alpha\}} S(\Phi, \Psi_a)\, S(\Phi_a, \Psi) \tag{3.57}
$$

is compatible with Eq. (3.56). If $\partial_\beta W \equiv \sum_{b \in \{\beta\}} \Phi_b D^{-1} \Psi_b\, W$ defines some other “ghost” flow and both flows $\partial_\alpha$ and $\partial_\beta$ satisfy (3.56), then:

$$
\bigl[ \partial_\alpha, \partial_\beta \bigr] W = 0, \tag{3.58}
$$

as follows from the technical identity (2.22). Equations (3.56) and (3.58) can be compactly expressed by an identity

$$
\Bigl[ \partial_\alpha - \sum_{a \in \{\alpha\}} \Phi_a D^{-1} \Psi_a,\; \partial_\beta - \sum_{b \in \{\beta\}} \Phi_b D^{-1} \Psi_b \Bigr] = 0
$$

[20, 19].

Define now $Y(\lambda,\mu) \equiv \psi_{BA}(t,\mu)\, D^{-1}\, \psi^*_{BA}(t,\lambda)$ (cf. ref. [16]) to be the pseudo-differential operator inducing a ghost flow $\partial_{(\lambda,\mu)} W \equiv Y(\lambda,\mu)\, W$ according to (3.55). In this case the “SEP” symmetry flow is generated by an infinite combination of $W_{1+\infty}$ algebra generators [16]. Then, according to Eq. (3.56) the action of this flow on the BA wave function is given by:
$$
\hat{Y}(\lambda,\mu)\bigl(\psi_{BA}(t,z)\bigr) \equiv \partial_{(\lambda,\mu)}\, \psi_{BA}(t,z) = \psi_{BA}(t,\mu)\, S\bigl(\psi_{BA}(t,z), \psi^*_{BA}(t,\lambda)\bigr). \tag{3.59}
$$

Further, let us also define the action of the vertex operator $\hat{X}(\lambda,\mu)$ on the BA function $\psi_{BA}(t,z)$ as generated by its action (as a vector field) on the ratio of $\tau$-functions entering (2.10):

$$
\hat{X}(\lambda,\mu)\bigl(\psi_{BA}(t,z)\bigr) = e^{\xi(t,z)}\,
 \frac{\tau(t)\, \hat{X}(\lambda,\mu)\, \tau\bigl(t - [z^{-1}]\bigr) - \tau\bigl(t - [z^{-1}]\bigr)\, \hat{X}(\lambda,\mu)\, \tau(t)}{\tau^2(t)}. \tag{3.60}
$$

The latter, upon using the shift-difference operator (3.11), can be written as

$$
\hat{X}(\lambda,\mu)\bigl(\psi_{BA}(t,z)\bigr) = \psi_{BA}(t,z)\; \hat\Delta_z \Bigl( \frac{\hat{X}(\lambda,\mu)\, \tau(t)}{\tau(t)} \Bigr). \tag{3.61}
$$

Let us stress that, according to (3.59)–(3.61), $\hat{Y}(\lambda,\mu)$ acts on the BA function as a standard pseudo-differential operator, whereas $\hat{X}(\lambda,\mu)$ acts on it as a shift-difference operator. Now, the above results allow us to give a simple straightforward proof of the following version of the Adler-Shiota-van Moerbeke proposition [15, 16]. It provides the connection between the form of the non-isospectral (“additional”) symmetries of KP hierarchies acting on the Lax operators and BA functions [9, 10], on one hand, and their respective form when acting on KP $\tau$-functions, on the other hand.

Corollary 3.2. With definitions (3.59) and (3.60) it holds:

$$
\hat{X}(\lambda,\mu)\bigl(\psi_{BA}(t,z)\bigr) = \hat{Y}(\lambda,\mu)\bigl(\psi_{BA}(t,z)\bigr). \tag{3.62}
$$

Proof. Indeed, applying (3.7) and Lemma 3.1 to the r.h.s. of (3.61), the latter equation can be rewritten as:

$$
\hat{X}(\lambda,\mu)\bigl(\psi_{BA}(t,z)\bigr) = \psi_{BA}(t,z)\, \frac{1}{z}\, \psi_{BA}(t,\mu)\, \psi^*_{BA}\bigl(t - [z^{-1}], \lambda\bigr) = \hat{Y}(\lambda,\mu)\bigl(\psi_{BA}(t,z)\bigr), \tag{3.63}
$$

where in order to arrive at the last equality use was made of (3.32).

In the literature one often comes across the vertex operator defined as $\hat{\mathcal{X}}(\lambda,\mu) \equiv\; : \exp\bigl\{ \hat\theta(\lambda) - \hat\theta(\mu) \bigr\} : \; = (\lambda - \mu)\, \hat{X}(\lambda,\mu)$. In such a notation the expression (3.62) becomes $\hat{\mathcal{X}}(\lambda,\mu) = (\lambda - \mu)\, \hat{Y}(\lambda,\mu)$ as in [15, 16].

We conclude this section by proving the following important property of SEP:

Lemma 3.3. Under shift of the KP times, the squared eigenfunction potential obeys:

\begin{align}
S\bigl(\Phi(t - [\lambda^{-1}]), \Psi(t - [\lambda^{-1}])\bigr) - S\bigl(\Phi(t), \Psi(t)\bigr) &= -\frac{1}{\lambda}\, \Phi(t)\, \Psi\bigl(t - [\lambda^{-1}]\bigr), \tag{3.64} \\
S\bigl(\Phi(t + [\lambda^{-1}]), \Psi(t + [\lambda^{-1}])\bigr) - S\bigl(\Phi(t), \Psi(t)\bigr) &= \frac{1}{\lambda}\, \Phi\bigl(t + [\lambda^{-1}]\bigr)\, \Psi(t). \tag{3.65}
\end{align}
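Proof. According to (3.35) and (3.31):

$$
\hat\Delta_z S\bigl(\Phi(t), \Psi(t)\bigr) = \int d\lambda \int d\mu\; \varphi(\lambda)\, \varphi^*(\mu)\; \hat\Delta_z S\bigl(\psi_{BA}(t,\lambda), \psi^*_{BA}(t,\mu)\bigr), \tag{3.66}
$$

while from Eq. (3.10) we find that:

$$
\hat\Delta_z S\bigl(\psi_{BA}(t,\lambda), \psi^*_{BA}(t,\mu)\bigr) = -\frac{1}{z}\, \psi_{BA}(t,\lambda)\, \psi^*_{BA}\bigl(t - [z^{-1}], \mu\bigr). \tag{3.67}
$$

Inserting the above identity back in (3.66) gives (3.64).

After expanding identities (3.64) and (3.65) in power series w.r.t. $\lambda$ we obtain:

$$
\begin{aligned}
p_s(-[\partial])\, S\bigl(\Phi(t), \Psi(t)\bigr) &= -\Phi(t)\, p_{s-1}(-[\partial])\, \Psi(t), \\
p_s([\partial])\, S\bigl(\Phi(t), \Psi(t)\bigr) &= \Psi(t)\, p_{s-1}([\partial])\, \Phi(t), \qquad s = 1, 2, \ldots,
\end{aligned} \tag{3.68}
$$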
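where $p_s(\cdot)$ are the standard Schur polynomials (2.13).

4. Constraints on $\mathrm{cKP}_{r,m}$ Tau-Functions. Grassmannian Interpretation

From now on we concentrate on studying the class of constrained $\mathrm{cKP}_{r,m}$ hierarchies for which we have:

$$
\bigl(L_{r,m}\bigr)_- = \sum_{a=1}^{m} \Phi_a D^{-1} \Psi_a \tag{4.1}
$$

according to Eq. (2.20). We first note that the $\mathrm{cKP}_{r,m}$ BA function satisfies, according to (4.1), the following spectral equation:

$$
L_{r,m}\bigl(\psi_{BA}(t,\lambda)\bigr) = \lambda^r \psi_{BA}(t,\lambda) = \bigl(L_{r,m}\bigr)_+\bigl(\psi_{BA}(t,\lambda)\bigr) + \sum_{a=1}^{m} \Phi_a(t)\, S\bigl(\psi_{BA}(t,\lambda), \Psi_a(t)\bigr). \tag{4.2}
$$

Due to Eq. (3.34), the latter can be cast in the following form:

$$
\begin{aligned}
\lambda^r \psi_{BA}(t,\lambda) &= \bigl(L_{r,m}\bigr)_+\bigl(\psi_{BA}(t,\lambda)\bigr) + \frac{1}{\lambda} \sum_{a=1}^{m} \Phi_a(t)\, \Psi_a\bigl(t - [\lambda^{-1}]\bigr)\, \psi_{BA}(t,\lambda) \\
 &= \bigl(L_{r,m}\bigr)_+\bigl(\psi_{BA}(t,\lambda)\bigr) - \sum_{a=1}^{m} \Bigl[ S\bigl(\Phi_a(t - [\lambda^{-1}]), \Psi_a(t - [\lambda^{-1}])\bigr) - S\bigl(\Phi_a(t), \Psi_a(t)\bigr) \Bigr] \psi_{BA}(t,\lambda),
\end{aligned} \tag{4.3}
$$

where the second equality in (4.3) follows from (3.64). Recalling relation (3.53) we find that

$$
\frac{\partial \tau(t)}{\partial t_r} = \sum_{a=1}^{m} S\bigl(\Phi_a(t), \Psi_a(t)\bigr)\, \tau(t). \tag{4.4}
$$

Similarly, using the spectral identity $L^n_{r,m}\bigl(\psi_{BA}(t,\lambda)\bigr) = \lambda^{rn} \psi_{BA}(t,\lambda)$ and taking into account relation (2.21) we obtain the following set of differential equations for the $\mathrm{cKP}_{r,m}$ $\tau$-function: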
$$
\frac{\partial}{\partial t_{rn}} \tau(t) = \sum_{a=1}^{m} \sum_{i=0}^{n-1} S\Bigl( L^{n-1-i}(\Phi_a),\, \bigl(L^*\bigr)^{i}(\Psi_a) \Bigr)\, \tau(t). \tag{4.5}
$$
The point we want to stress is that the constraint (4.4) (or, the equivalent relation (4.3)) contains all the information about the cKP system in addition to the regular Hirota bilinear expression for the KP $\tau$-function. This constraint can be given a bilinear form as in [32]. Consider, namely, the expression:

$$
\int d\lambda\; \lambda^r\, \psi_{BA}(t,\lambda)\, \psi^*_{BA}(t',\lambda). \tag{4.6}
$$

Using (4.3) and Hirota's equation one gets for (4.6):

$$
\sum_{a=1}^{m} \Phi_a(t) \int d\lambda\; \psi^*_{BA}(t',\lambda)\, \frac{1}{\lambda}\, \Psi_a\bigl(t - [\lambda^{-1}]\bigr)\, \psi_{BA}(t,\lambda) = \sum_{a=1}^{m} \Phi_a(t)\, \Psi_a(t'), \tag{4.7}
$$

where we also used the spectral representation (3.18) with $t$ and $t'$ interchanged. Hence we proved that the constraint (4.4) implies:

$$
\sum_{a=1}^{m} \Phi_a(t)\, \Psi_a(t') = \int d\lambda\; \lambda^r\, \psi_{BA}(t,\lambda)\, \psi^*_{BA}(t',\lambda), \tag{4.8}
$$

which is the well-known bilinear expression for the cKP hierarchy [32] derived here from the simple fundamental $\tau$-function constraint (4.4). Using the differential Fay identity (3.12), Eqs. (4.5) can be equivalently written in the form:

$$
\Bigl\{ \frac{\partial}{\partial t_{rn}} - \Bigl[ \frac{\partial}{\partial t_{rn}},\; \sum_{a=1}^{m} \int d\lambda \int d\mu\; \bigl(\lambda^r - \mu^r\bigr)^{-1}\, \varphi^*_a(\lambda)\, \varphi_a(\mu)\, \hat{X}(\lambda,\mu) \Bigr] \Bigr\}\, \tau(t) = 0, \tag{4.9}
$$

where $\varphi^{(*)}_a(\lambda)$ are the “spectral densities” of the (adjoint) eigenfunctions $\Phi_a(t)$, $\Psi_a(t)$ entering the pseudo-differential part of the $\mathrm{cKP}_{r,m}$ Lax operator (2.20), and also we have used the identity:

$$
\Bigl[ \frac{\partial}{\partial t_l},\; \hat{X}(\lambda,\mu) \Bigr] = \bigl(\mu^l - \lambda^l\bigr)\, \hat{X}(\lambda,\mu). \tag{4.10}
$$

Thus we arrive at the following statement providing an alternative definition of $\mathrm{cKP}_{r,m}$ hierarchies intrinsically in terms of $\tau$-functions:

Proposition 4.1. Reduction of the full KP hierarchy (2.2) to the $\mathrm{cKP}_{r,m}$ hierarchy in terms of Lax operators (2.20) is equivalent to imposing Eqs. (4.9) as constraints on the pertinent $\tau$-functions, where $\varphi^{(*)}_a(\lambda)$ are “spectral densities” of KP (adjoint) eigenfunctions given as in Eqs. (3.16).

Let us now translate Eq. (4.9) into the language of the universal Sato Grassmannian $Gr$ [3, 34]. Consider the hyperplane $\mathcal{W} \in Gr$ defined through a linear basis of Laurent series $\{f_k(\lambda)\}$ in $\lambda$ in terms of the BA function as generating function $F(t,\lambda)$:

$$
\mathcal{W} \equiv \mathrm{span}\langle f_1(\lambda), f_2(\lambda), \ldots \rangle, \qquad
f_k(\lambda) = \frac{\partial^k}{\partial x^k} F(t,\lambda) \Big|_{x = t_2 = t_3 = \ldots = 0}, \qquad
F(t,\lambda) = \psi_{BA}(t,\lambda). \tag{4.11}
$$
In the case of the standard $r$-th KdV reduction, where the corresponding Lax operator $L = D + \sum_{i=1}^{\infty} u_i D^{-i}$ satisfies $L^r = L^r_+$, the latter constraint translates to the Grassmannian language as $\lambda^r \mathcal{W} \subset \mathcal{W}$ [4]. Our aim now is to express the $\mathrm{cKP}_{r,m}$ constraint (4.1) (cf. (2.20)) in the Grassmannian setting. We find from (4.3) that the generating function $F'(t,\lambda)$:

$$
F'(t,\lambda) \equiv \Bigl[ \lambda^r + \sum_{a=1}^{m} \sum_{n=1}^{\infty} \frac{p_n(-[\partial])\, S\bigl(\Phi_a(t), \Psi_a(t)\bigr)}{\lambda^n} \Bigr] \psi_{BA}(t,\lambda) = \bigl(L_{r,m}\bigr)_+\bigl(\psi_{BA}(t,\lambda)\bigr) \tag{4.12}
$$

defines via (4.11) a point $\mathcal{W}'$ of the Sato Grassmannian $Gr$:

$$
\mathcal{W}' = \mathrm{span}\langle F'(0,\lambda),\, \partial_x F'(0,\lambda),\, \partial_x^2 F'(0,\lambda), \ldots \rangle \tag{4.13}
$$

which coincides, because of the second equality in (4.12), with the original point $\mathcal{W}$ defined through $F(t,\lambda) = \psi_{BA}(t,\lambda)$ (4.11). Thus, we have:⁴

Proposition 4.2. Let $S\bigl(\Phi_a(t), \Psi_a(t)\bigr)$, $a = 1, \ldots, m$, be $m$ squared eigenfunction potentials (2.18), where $\Phi_a$, $\Psi_a$ are (adjoint) eigenfunctions of the general KP hierarchy (2.2). Then, the reduction of (2.2) to the $\mathrm{cKP}_{r,m}$ hierarchy (2.20) can be equivalently expressed as a restriction of $Gr$ to a subset whose points (hyperplanes) $\mathcal{W}$ (4.11) are subject to the following constraint:

$$
\Bigl[ \lambda^r + \sum_{a=1}^{m} \hat\Delta_\lambda\, S\bigl(\Phi_a(t), \Psi_a(t)\bigr) \Bigr]\, \mathcal{W} \subset \mathcal{W} \tag{4.14}
$$

with $\hat\Delta_\lambda$ as in (3.11) and $S\bigl(\Phi_a(t), \Psi_a(t)\bigr)$ being given by (3.35) in terms of the generating function (4.11) of $\mathcal{W}$.

5. Non-Isospectral Virasoro Symmetry for $\mathrm{cKP}_{r,1}$ $\tau$-Functions

The conventional formulation of additional non-isospectral symmetries for the full KP integrable hierarchy [9, 17] is not compatible with the reduction of the latter to the important class of constrained $\mathrm{cKP}_{r,m}$ integrable models. In refs. [29, 8] we solved explicitly the problem of compatibility of the Virasoro part of non-isospectral symmetries with the underlying constraints of $\mathrm{cKP}_{r,m}$ hierarchies within the pseudo-differential Lax operator framework. Our construction in [29, 8] involves an appropriate modification of the standard non-isospectral symmetry flows, acting on the space of $\mathrm{cKP}_{r,m}$ Lax operators, by adding a set of additional “ghost symmetry” flows (of the type appearing in Eq. (3.55)). In this section, we derive the explicit form of the action of the correct modified Virasoro non-isospectral symmetries as flows on the space of $\mathrm{cKP}_{r,m}$ $\tau$-functions. Note that the corresponding result for the full unconstrained KP hierarchy has been previously obtained in [15, 16, 17]. To this end, let us first recall that the standard additional (non-isospectral) symmetries [9, 17] are defined as vector fields on the space of general KP Lax operators (2.2) or, alternatively, on the dressing operators (2.4), through their flows as follows:

4 For different criteria characterizing $\mathrm{cKP}_{r,m}$ hierarchies within the Sato Grassmannian framework, see refs. [35, 36].
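$$
\bar\partial_{k,n} L = -\bigl[ \bigl(M^n L^k\bigr)_-,\, L \bigr] = \bigl[ \bigl(M^n L^k\bigr)_+,\, L \bigr] + n M^{n-1} L^k; \qquad
\bar\partial_{k,n} W = -\bigl(M^n L^k\bigr)_-\, W. \tag{5.1}
$$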
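The second equality in (5.1) can be traced in a few lines. The sketch below assumes only the canonical commutation relation $[L, M] = \mathbb{1}$ stated in (5.2), which gives $[M^n, L] = -n M^{n-1}$, and the fact that $L^k$ commutes with $L$:

```latex
\begin{align*}
-\bigl[(M^n L^k)_-,\, L\bigr]
  &= -\bigl[M^n L^k,\, L\bigr] + \bigl[(M^n L^k)_+,\, L\bigr] \\
  &= -\bigl[M^n,\, L\bigr] L^k + \bigl[(M^n L^k)_+,\, L\bigr] \\
  &= n M^{n-1} L^k + \bigl[(M^n L^k)_+,\, L\bigr].
\end{align*}
```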
Here $M$ is a pseudo-differential operator “canonically conjugated” to $L$ such that:
\[
[L, M] = \mathbb{1} , \qquad \frac{\partial}{\partial t_l} M = \big[ L^{l/r}_+ ,\, M \big] . \tag{5.2}
\]
Within the Sato-Wilson dressing operator formalism, the $M$-operator can be expressed in terms of dressing of the “bare” $M^{(0)}$ operator:
\[
M^{(0)} = \sum_{l \ge 1} \frac{l}{r}\, t_l D^{l-r} = X_{(r)} + \sum_{l \ge 1} \frac{l+r}{r}\, t_{r+l} D^{l} ; \qquad
X_{(r)} \equiv \sum_{l=1}^{r} \frac{l}{r}\, t_l D^{l-r} \tag{5.3}
\]
conjugated to the “bare” Lax operator $L^{(0)} = D^r$. The additional symmetry flows (5.1) commute with the usual KP hierarchy isospectral flows given in (2.3). However, they do not commute among themselves; instead they form a centerless $W_{1+\infty}$ algebra (see e.g. [17]). One finds that the Lie algebra of operators $\bar{\partial}_{k,n}$ is isomorphic to the Lie algebra generated by $-z^k (\partial/\partial z)^n$. In particular, for $n = 1$ this becomes an isomorphism to the centerless Virasoro algebra $\bar{\partial}_{k,1} \sim -\mathcal{L}_{k-1}$, with $[\mathcal{L}_l, \mathcal{L}_k] = (l-k)\mathcal{L}_{l+k}$.

As demonstrated in [29, 8], the conventional non-isospectral flows (5.1) do not preserve the space of cKP$_{r,m}$ Lax operators given by (2.20). In particular, for the Virasoro non-isospectral symmetry algebra the transformed Lax operator $\bar{\partial}_{k,1} L$ belongs to a different class of constrained KP hierarchies, cKP$_{r,m(k-1)}$ (when $k \ge 3$). The solution to this problem is provided by the following [29, 8]:

Proposition 5.1. The correct non-isospectral symmetry flows for the cKP$_{r,m}$ hierarchies (2.20), spanning the Virasoro algebra, are given by:
\[
\partial^*_k L \equiv \big[ -(M L^k)_- + X^{(1)}_{k-1} ,\, L \big] , \tag{5.4}
\]
i.e., with the following isomorphism $\mathcal{L}_{k-1} \sim -(M L^k)_- + X^{(1)}_{k-1}$, where $X^{(1)}_{k-1}$ are ghost-symmetry generating operators (cf. (3.55)) defined as:
\[
X^{(1)}_k = \sum_{i=1}^{m} \sum_{j=0}^{k-1} \Big( j - \frac{1}{2}(k-1) \Big) L^{k-1-j}(\Phi_i)\, D^{-1} (L^*)^{j}(\Psi_i) ; \qquad k \ge 1 . \tag{5.5}
\]
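As a quick sanity check of the Virasoro relation quoted above, one can take the first-order differential operators $\ell_n \equiv -z^{n+1}\,\partial/\partial z$ (so that $-z^k\,\partial/\partial z = \ell_{k-1}$) and compute the commutator directly. The symbol $\ell_n$ is introduced here only for illustration and is not part of the original notation; the overall sign in the identification with $\bar{\partial}_{k,1}$ depends on conventions that this sketch does not attempt to fix.

```latex
\begin{align*}
[\ell_l, \ell_k]
 &= z^{l+1}\partial_z\, z^{k+1}\partial_z \;-\; z^{k+1}\partial_z\, z^{l+1}\partial_z \\
 &= \big( (k+1) - (l+1) \big)\, z^{l+k+1}\partial_z \\
 &= (l-k)\,\big( -z^{l+k+1}\partial_z \big) \;=\; (l-k)\,\ell_{l+k} .
\end{align*}
```

The second-derivative terms $z^{l+k+2}\partial_z^2$ cancel between the two orderings, which is why only the first-order term survives.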
Since (auto-)Darboux-Bäcklund transformations of cKP$_{r,m}$ hierarchies (see next section) play a fundamental rôle for finding exact solutions, as well as in establishing the link between cKP$_{r,m}$ integrable models and (multi-)matrix models, it is natural to impose the additional condition of commutativity of the non-isospectral symmetries with the Darboux-Bäcklund transformations. The latter condition was shown in refs. [29, 8] to be satisfied only by the subclass cKP$_{r,1}$ of constrained KP hierarchies (it is precisely cKP$_{r,1}$ hierarchies which provide the integrability structure of discrete multi-matrix models [8]). Therefore, in the rest of this section we restrict our attention to cKP$_{r,1}$ models.
Squared Eigenfunction Potentials in Integrable Hierarchies of KP Type
Consider the modified non-isospectral Virasoro symmetry flows (5.4) acting on the dressing operator of the cKP$_{r,1}$ hierarchy:
\[
\partial^*_k W = -(M L^k)_- W + X^{(1)}_{k-1} W . \tag{5.6}
\]
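Taking the operator residuum on both sides of (5.6) we obtain:
\[
\partial^*_k \tau(t) = \frac{1}{2r}\, \hat{W}^{(2)}_{r(k-1)} \tau(t)
 + \sum_{j=0}^{k-2} \Big( \frac{1}{2}(k-2) - j \Big) S\big( L^{k-2-j}(\Phi), (L^*)^{j}(\Psi) \big)\, \tau(t) . \tag{5.7}
\]
In deriving Eq. (5.7) we used the expression for $X^{(1)}_{k-1}$ (5.5) together with the differential Fay identity (3.12) as well as:
\begin{align*}
\mathrm{Res}\, M^l L^k
 &= \mathrm{Res}_\lambda \Big( M^l L^k \big(\psi_{BA}(t,\lambda)\big)\, \psi^*_{BA}(t,\lambda) \Big) \\
 &= \frac{1}{r^l}\, \mathrm{Res}_\lambda \Big( \lambda^{kr - l(r-1)}\, \frac{\partial^l}{\partial\lambda^l} \psi_{BA}(t,\lambda)\; \psi^*_{BA}(t,\lambda) \Big) \\
 &= -\partial_x \Big( \frac{1}{\tau(t)}\, \frac{1}{r^l}\, \mathrm{Res}_\lambda \Big( \mu^{kr - l(r-1)}\, \frac{\partial^l}{\partial\mu^l} \hat{\mathcal{X}}(\lambda,\mu) \Big|_{\mu=\lambda} \Big) \tau(t) \Big) \\
 &= \partial_x \Big( \frac{1}{r^l (l+1)}\, \frac{\hat{W}^{(l+1)}_{(k-l)r}\, \tau(t)}{\tau(t)} \Big) . \tag{5.8}
\end{align*}
In the chain of the identities in (5.8) we took into account Dickey's formula for $(M^l L^k)_-$ [16] (first equality in (5.8)), Eq. (3.13) (third equality in (5.8)), and formula (3.5) for $\hat{\mathcal{X}}(\lambda,\mu)$ to arrive at the last equality above. The Virasoro operator in the first term on the r.h.s. of (5.7) comes from the standard Orlov-Schulman non-isospectral symmetry flow and reads explicitly (for $k \ge -1$):
\[
\hat{W}^{(2)}_k = 2 \sum_{l \ge 1} l\, t_l \frac{\partial}{\partial t_{l+k}} - (k+1) \frac{\partial}{\partial t_k} + \sum_{l=1}^{k-1} \frac{\partial^2}{\partial t_l\, \partial t_{k-l}} . \tag{5.9}
\]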
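We now express the second additional “ghost-flow” term on the r.h.s. of (5.7) as a differential operator acting on $\tau(t)$ of a form similar to (5.9). The starting point are the differential Eqs. (4.5) obeyed by the cKP$_{r,1}$ $\tau$-function, wherefrom we get for the second-order derivatives:
\begin{align*}
\frac{1}{\tau(t)} \frac{\partial^2 \tau(t)}{\partial t_{rl}\, \partial t_{rn}}
 &= \sum_{i=0}^{n-1} \Big[ S\big( L^{n+l-1-i}(\Phi), (L^*)^{i}(\Psi) \big) - S\big( L^{n-1-i}(\Phi), (L^*)^{i+l}(\Psi) \big) \Big] \\
 &\quad + \sum_{i=0}^{n-1} \sum_{j=0}^{l-1} \Big[ S\big( L^{n-1-i}(\Phi), (L^*)^{i}(\Psi) \big)\, S\big( L^{l-1-j}(\Phi), (L^*)^{j}(\Psi) \big) \\
 &\qquad\qquad\quad - S\big( L^{n-1-i}(\Phi), (L^*)^{j}(\Psi) \big)\, S\big( L^{l-1-j}(\Phi), (L^*)^{i}(\Psi) \big) \Big] . \tag{5.10}
\end{align*}
In obtaining relation (5.10) we made use of the following lemma: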
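Lemma 5.1. The relation:
\[
\frac{\partial}{\partial t_{nr}} S(f, g) = S\big( L^n(f), g \big) - S\big( f, (L^*)^{n}(g) \big)
 - \sum_{i=0}^{n-1} S\big( L^{n-1-i}(\Phi), g \big)\, S\big( f, (L^*)^{i}(\Psi) \big) \tag{5.11}
\]
holds for $f$ an eigenfunction and $g$ an adjoint eigenfunction of the Lax operator $L \equiv L_{r,1} = L_+ + \Phi D^{-1} \Psi$ belonging to the cKP$_{r,1}$ hierarchy.

Proof. We are going to show that $\partial/\partial t_{nr}\, S(f,g) = \mathrm{Res}\big( D^{-1} g\, (L^n)_+ f\, D^{-1} \big)$ is equal to the right-hand side of Eq. (5.11) (up to a constant). We first apply $\partial/\partial t_{mr}$ on the left-hand side of Eq. (5.11). This yields
\begin{align*}
\frac{\partial}{\partial t_{mr}} \frac{\partial}{\partial t_{nr}} S(f,g) = \frac{\partial}{\partial t_{nr}} \frac{\partial}{\partial t_{mr}} S(f,g)
 &= -\mathrm{Res}\Big( D^{-1} (L^*)^{n}_{+}(g)\, L^m f\, D^{-1} \Big) \\
 &\quad + \mathrm{Res}\Big( D^{-1} g\, L^m (L)^{n}_{+}(f)\, D^{-1} \Big) . \tag{5.12}
\end{align*}
After making the substitutions:
\begin{align*}
(L)^{n}_{+}(f) &= L^n(f) - \sum_{i=0}^{n-1} L^{n-1-i}(\Phi)\, S\big( f, (L^*)^{i}(\Psi) \big) , \tag{5.13} \\
(L^*)^{n}_{+}(g) &= (L^*)^{n}(g) + \sum_{i=0}^{n-1} (L^*)^{i}(\Psi)\, S\big( L^{n-1-i}(\Phi), g \big) , \tag{5.14}
\end{align*}
where use was made of (2.21), we obtain agreement with the result of applying $\partial/\partial t_{mr}$ on the right-hand side of Eq. (5.11) and using Eq. (2.18) as well as Lemma 2.1.

Using Eqs. (4.5), (5.10) we obtain:
\[
\sum_{j=0}^{k-2} \Big( \frac{1}{2}(k-2) - j \Big) S\big( L^{k-2-j}(\Phi), (L^*)^{j}(\Psi) \big)
 = \frac{1}{2\tau(t)} \sum_{l=1}^{k-2} \frac{\partial^2 \tau}{\partial t_{rl}\, \partial t_{r(k-1-l)}} . \tag{5.15}
\]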
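The step from (5.10) to (5.15) can be made explicit as follows. This is only a sketch; the shorthand $s_{p,q} \equiv S\big(L^{p}(\Phi), (L^*)^{q}(\Psi)\big)$ is our own abbreviation and not notation from the text. Set $n = k-1-l$ in (5.10) and sum over $l = 1, \ldots, k-2$:

```latex
\begin{align*}
\frac{1}{\tau}\sum_{l=1}^{k-2}\frac{\partial^2\tau}{\partial t_{rl}\,\partial t_{r(k-1-l)}}
 &= \sum_{l=1}^{k-2}\Big[\sum_{j=0}^{k-2-l} s_{k-2-j,\,j} \;-\; \sum_{j=l}^{k-2} s_{k-2-j,\,j}\Big] \\
 &\quad + \sum_{\substack{p,q,p',q'\ge 0\\ p+q+p'+q'=k-3}}
     \big( s_{p,q}\,s_{p',q'} - s_{p,q'}\,s_{p',q} \big) \\
 &= \sum_{j=0}^{k-2}\big[(k-2-j) - j\big]\, s_{k-2-j,\,j} \;+\; 0 .
\end{align*}
```

In the linear part, $s_{k-2-j,j}$ is counted once for each $l \le k-2-j$ in the first sum and once for each $l \le j$ in the second. The quadratic part vanishes because the summation range is symmetric under $q \leftrightarrow q'$, which maps the second product onto the first. Dividing by 2 gives (5.15).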
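Collecting (5.9) and (5.15), the final form of the cKP$_{r,1}$ non-isospectral Virasoro symmetry flows reads:
\begin{align*}
\partial^*_k \tau(t) = \Big[ &\frac{1}{r} \sum_{l \ge 1} l\, t_l \frac{\partial}{\partial t_{l+r(k-1)}} - \frac{r(k-1)+1}{2r} \frac{\partial}{\partial t_{r(k-1)}} \\
 &+ \frac{1}{2r} \sum_{l=1}^{r(k-1)-1} \frac{\partial^2}{\partial t_l\, \partial t_{r(k-1)-l}}
 + \frac{1}{2} \sum_{l=1}^{k-2} \frac{\partial^2}{\partial t_{rl}\, \partial t_{r(k-1-l)}} \Big] \tau(t) . \tag{5.16}
\end{align*}
In particular, for cKP$_{1,1}$ hierarchies the Virasoro non-isospectral symmetry takes the form (Eq. (5.16) for $r = 1$):
\[
\partial^*_k \tau(t) = \Big[ \sum_{l \ge 1} l\, t_l \frac{\partial}{\partial t_{l+k-1}} - \frac{k}{2} \frac{\partial}{\partial t_{k-1}} + \sum_{l=1}^{k-2} \frac{\partial^2}{\partial t_l\, \partial t_{k-1-l}} \Big] \tau(t) . \tag{5.17}
\]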
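For $r = 1$ the two double-derivative sums in (5.16) run over the same range and combine into a single sum. The following worked step spells out the two coefficients that change:

```latex
\begin{align*}
\frac{r(k-1)+1}{2r}\,\Big|_{r=1} &= \frac{k}{2} , \\
\Big[\frac{1}{2r}\sum_{l=1}^{r(k-1)-1}\frac{\partial^2}{\partial t_l\,\partial t_{r(k-1)-l}}
 + \frac12\sum_{l=1}^{k-2}\frac{\partial^2}{\partial t_{rl}\,\partial t_{r(k-1-l)}}\Big]_{r=1}
 &= \sum_{l=1}^{k-2}\frac{\partial^2}{\partial t_l\,\partial t_{k-1-l}} .
\end{align*}
```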
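Concluding this section it is instructive to point out the relation of (5.17) with the so-called Virasoro constraints in conventional discrete matrix models [37, 7] spanning the Borel subalgebra of the Virasoro algebra:
\begin{align*}
L^{(N)}_s Z_N &= 0 , \qquad s \ge -1 , \tag{5.18} \\
L^{(N)}_s &= \sum_{k=1}^{\infty} k\, t_k \frac{\partial}{\partial t_{k+s}} + 2N \frac{\partial}{\partial t_s} + \sum_{k=1}^{s-1} \frac{\partial}{\partial t_k} \frac{\partial}{\partial t_{s-k}} , \qquad s \ge 1 , \tag{5.19} \\
L^{(N)}_0 &= \sum_{k=1}^{\infty} k\, t_k \frac{\partial}{\partial t_k} + N^2 ; \qquad
L^{(N)}_{-1} = \sum_{k=2}^{\infty} k\, t_k \frac{\partial}{\partial t_{k-1}} + N t_1 . \tag{5.20}
\end{align*}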
Here $Z_N$ denotes the one-matrix model partition function with $N$ indicating the size of the corresponding (Hermitian) random matrix. On one hand, it can be identified as a $\tau$-function of the semi-infinite one-dimensional Toda lattice model [37]. On the other hand, from the point of view of continuum integrable systems it was shown in [38, 8] to be $Z_N = \tau^{(N,0)}$, i.e., the $N$th member of the Darboux-Bäcklund orbit on the subspace of cKP$_{1,1}$ $\tau$-functions starting from the “free” initial $\tau^{(0,0)} = 1$ (see the next section for more details about Darboux-Bäcklund orbits on constrained KP hierarchies). Comparing (5.19)–(5.20) with (5.17) one finds:
\[
L^{(N)}_{-1} = \partial^*_0 + N t_1 ; \qquad
L^{(N)}_0 = \partial^*_1 + N^2 , \qquad
L^{(N)}_{k-1} = \partial^*_k + (2N + k/2)\, \partial/\partial t_{k-1} , \quad k \ge 2 . \tag{5.21}
\]
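The last identification in (5.21) can be checked by writing (5.19) at $s = k-1$ next to the operator in (5.17). In the sketch below the summation index of (5.19) is renamed $l$ for easier comparison:

```latex
\begin{align*}
L^{(N)}_{k-1} &= \sum_{l\ge 1} l\,t_l \frac{\partial}{\partial t_{l+k-1}}
   + 2N\frac{\partial}{\partial t_{k-1}}
   + \sum_{l=1}^{k-2}\frac{\partial}{\partial t_l}\frac{\partial}{\partial t_{k-1-l}} , \\
\partial^*_k &= \sum_{l\ge 1} l\,t_l \frac{\partial}{\partial t_{l+k-1}}
   - \frac{k}{2}\frac{\partial}{\partial t_{k-1}}
   + \sum_{l=1}^{k-2}\frac{\partial^2}{\partial t_l\,\partial t_{k-1-l}} , \\
L^{(N)}_{k-1} - \partial^*_k &= \Big(2N + \frac{k}{2}\Big)\frac{\partial}{\partial t_{k-1}} , \qquad k\ge 2 .
\end{align*}
```

The $t$-dependent sums and the second-order terms coincide, so only the coefficient of $\partial/\partial t_{k-1}$ differs.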
6. Binary Darboux-Bäcklund Orbits on cKP$_{r,m}$ Hierarchies. Toda Square-Lattice Model

Let us recall the form of the Darboux-Bäcklund and adjoint-Darboux-Bäcklund transformations which preserve the constrained form of the cKP$_{r,m}$ hierarchy Lax operator (2.20) [38, 8], i.e., we shall discuss auto-Darboux-Bäcklund transformations for cKP$_{r,m}$ hierarchies (for a general discussion of DB transformations of generic KP hierarchies without the requirement of preserving specific classes of constrained KP hierarchies, see refs. [39, 19]):
\begin{align*}
L \to \tilde{L} &= \tilde{L}_+ + \sum_{i=1}^{m} \tilde{\Phi}_i D^{-1} \tilde{\Psi}_i = T_a L T_a^{-1} , \qquad T_a = \Phi_a D \Phi_a^{-1} , \tag{6.1} \\
L \to \bar{L} &= \bar{L}_+ + \sum_{i=1}^{m} \bar{\Phi}_i D^{-1} \bar{\Psi}_i = \bar{T}_b^{*\,-1} L \bar{T}_b^{*} , \qquad \bar{T}_b = \Psi_b D \Psi_b^{-1} . \tag{6.2}
\end{align*}
Here $\Phi_a$, $\Psi_b$ are (adjoint) eigenfunctions entering $L_-$ (2.20) with fixed indices $a$, $b$ which henceforth will be assumed $a \ne b$. Accordingly, for BA functions, the $\tau$-function, eigenfunctions and their respective spectral “densities”, the DB transformations (6.1) imply:
\begin{align*}
\tilde{\Phi}_a &= T_a L(\Phi_a) , & \tilde{\Psi}_a &= \frac{1}{\Phi_a} , \\
\tilde{\Phi}_i &= T_a(\Phi_i) , & \tilde{\Psi}_i &= T_a^{*\,-1}(\Psi_i) = -\frac{1}{\Phi_a} S(\Phi_a, \Psi_i) , \quad i \ne a , \tag{6.3}
\end{align*}
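\begin{align*}
\tilde{\psi}_{BA}(t,\lambda) &= \frac{1}{\lambda} T_a\big( \psi_{BA}(t,\lambda) \big) ; \qquad \tilde{\tau}(t) = \Phi_a(t)\, \tau(t) , \tag{6.4} \\
\tilde{\psi}^*_{BA}(t,\lambda) &= \lambda\, T_a^{*\,-1}\big( \psi^*_{BA}(t,\lambda) \big) = -\lambda \frac{1}{\Phi_a(t)} S\big( \Phi_a(t), \psi^*_{BA}(t,\lambda) \big) , \tag{6.5} \\
\tilde{\varphi}_a(\lambda) &= \lambda^{1+r} \varphi_a(\lambda) ; \qquad \tilde{\varphi}_i(\lambda) = \lambda \varphi_i(\lambda) , \quad \tilde{\varphi}^*_i(\lambda) = \lambda^{-1} \varphi^*_i(\lambda) , \quad i \ne a . \tag{6.6}
\end{align*}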
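For the adjoint DB transformations (6.2) we have:
\begin{align*}
\bar{\Phi}_b &= -\frac{1}{\Psi_b} , \qquad \bar{\Psi}_b = \bar{T}_b L^*(\Psi_b) , \\
\bar{\Phi}_i &= \bar{T}_b^{*\,-1}(\Phi_i) = -\frac{1}{\Psi_b} S(\Phi_i, \Psi_b) , \qquad \bar{\Psi}_i = \bar{T}_b(\Psi_i) , \quad i \ne b , \tag{6.7} \\
\bar{\psi}_{BA}(t,\lambda) &= -\lambda\, \bar{T}_b^{*\,-1}\big( \psi_{BA}(t,\lambda) \big) = \lambda \frac{1}{\Psi_b(t)} S\big( \psi_{BA}(t,\lambda), \Psi_b(t) \big) , \tag{6.8} \\
\bar{\psi}^*_{BA}(t,\lambda) &= -\frac{1}{\lambda} \bar{T}_b\big( \psi^*_{BA}(t,\lambda) \big) ; \qquad \bar{\tau}(t) = \Psi_b(t)\, \tau(t) , \tag{6.9} \\
\bar{\varphi}^*_b(\lambda) &= -\lambda^{1+r} \varphi^*_b(\lambda) ; \qquad \bar{\varphi}_i(\lambda) = -\frac{1}{\lambda} \varphi_i(\lambda) , \quad \bar{\varphi}^*_i(\lambda) = -\lambda \varphi^*_i(\lambda) , \quad i \ne b . \tag{6.10}
\end{align*}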
We shall use the double superscript $(n,k)$ to indicate the iteration of $n$ successive Darboux-Bäcklund transformations (6.3) plus $k$ successive adjoint-Darboux-Bäcklund transformations (6.7). One can easily show that the result does not depend on the particular order in which these transformations are performed. Therefore, the set of all $(n,k)$ DB transformations, called generalized binary DB transformations in what follows, defines a discrete symmetry structure on the space of cKP$_{r,m}$ hierarchies corresponding to a two-dimensional lattice. Let us note that the $(1,1)$ binary DB transformations were previously introduced in ref. [30]. For one-step binary-DB transformed $\tau$- and BA functions we get:
\begin{align*}
\tau^{(1,1)}(t) &= -S\big( \Phi_a^{(0,0)}(t), \Psi_b^{(0,0)}(t) \big)\, \tau^{(0,0)}(t) , \tag{6.11} \\
\psi^{(1,1)}_{BA}(t,\lambda) &= \Big[ 1 - \frac{1}{\lambda} \frac{\Phi_a^{(0,0)}(t)\, \Psi_b^{(0,0)}(t - [\lambda^{-1}])}{S\big( \Phi_a^{(0,0)}(t), \Psi_b^{(0,0)}(t) \big)} \Big] \psi^{(0,0)}_{BA}(t,\lambda) \\
 &= \frac{S\big( \Phi_a^{(0,0)}(t - [\lambda^{-1}]), \Psi_b^{(0,0)}(t - [\lambda^{-1}]) \big)}{S\big( \Phi_a^{(0,0)}(t), \Psi_b^{(0,0)}(t) \big)}\, \psi^{(0,0)}_{BA}(t,\lambda) , \tag{6.12} \\
\psi^{*\,(1,1)}_{BA}(t,\lambda) &= \Big[ 1 + \frac{1}{\lambda} \frac{\Psi_b^{(0,0)}(t)\, \Phi_a^{(0,0)}(t + [\lambda^{-1}])}{S\big( \Phi_a^{(0,0)}(t), \Psi_b^{(0,0)}(t) \big)} \Big] \psi^{*\,(0,0)}_{BA}(t,\lambda) \\
 &= \frac{S\big( \Phi_a^{(0,0)}(t + [\lambda^{-1}]), \Psi_b^{(0,0)}(t + [\lambda^{-1}]) \big)}{S\big( \Phi_a^{(0,0)}(t), \Psi_b^{(0,0)}(t) \big)}\, \psi^{*\,(0,0)}_{BA}(t,\lambda) , \tag{6.13}
\end{align*}
where in the second equalities in (6.12) and (6.13) we used again (3.33)–(3.34) and (3.64)–(3.65). Combining Eq. (6.11) with Eq. (4.5) for $n = 1$ we find the following transformation formula for the squared eigenfunction potentials:
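\[
\sum_{i=1}^{m} S\big( \Phi_i^{(1,1)}, \Psi_i^{(1,1)} \big) - \sum_{i=1}^{m} S\big( \Phi_i^{(0,0)}, \Psi_i^{(0,0)} \big)
 = \frac{\partial}{\partial t_r} \ln\Big( -S\big( \Phi_a^{(0,0)}(t), \Psi_b^{(0,0)}(t) \big) \Big) . \tag{6.14}
\]
Let us recall again that here and below $a \ne b$ are fixed indices of Lax (adjoint) eigenfunctions.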
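Remark. Comparing formula (6.11) with (3.35) we see that one-step binary-DB transformations on $\tau$-functions are nothing but the well-known Sato Bäcklund transformations [4] (i.e., those generated by the bilocal vertex operator $\hat{\mathcal{X}}(\lambda,\mu)$ (3.1)) “averaged” with appropriate spectral densities.

Introducing short-hand notations:
\begin{align*}
\chi_a^{(i)}(t) &\equiv \big( L^{(0,0)} \big)^{i} \big( \Phi_a^{(0,0)}(t) \big) , \qquad
\chi_b^{*\,(i)}(t) \equiv \big( L^{(0,0)\,*} \big)^{i} \big( \Psi_b^{(0,0)}(t) \big) , \\
S_{ab}^{(i,j)}(t) &\equiv S\Big( \big( L^{(0,0)} \big)^{i} \big( \Phi_a^{(0,0)}(t) \big) , \big( L^{(0,0)\,*} \big)^{j} \big( \Psi_b^{(0,0)}(t) \big) \Big) \tag{6.15}
\end{align*}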
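we have:

Proposition 6.1. The following determinant formulae hold for the $n$-step binary DB transformed quantities:
\begin{align*}
\frac{\tau^{(n,n)}(t)}{\tau^{(0,0)}(t)} &= (-1)^n \prod_{j=0}^{n-1} S\big( \Phi_a^{(j,j)}(t), \Psi_b^{(j,j)}(t) \big) = (-1)^n \det\nolimits_n \big\| S_{ab}^{(i-1,j-1)}(t) \big\| , \tag{6.16} \\
\Phi_a^{(n,n)}(t) &= (-1)^n\, \frac{\det_{n+1} \left\| \begin{array}{cc} S_{ab}^{(i-1,j-1)}(t) & \chi_a^{(i-1)}(t) \\ S_{ab}^{(n,j-1)}(t) & \chi_a^{(n)}(t) \end{array} \right\|}{\det_n \big\| S_{ab}^{(i-1,j-1)}(t) \big\|} , \tag{6.17} \\
\Psi_b^{(n,n)}(t) &= (-1)^n\, \frac{\det_{n+1} \left\| \begin{array}{cc} S_{ab}^{(i-1,j-1)}(t) & S_{ab}^{(i-1,n)}(t) \\ \chi_b^{*\,(j-1)}(t) & \chi_b^{*\,(n)}(t) \end{array} \right\|}{\det_n \big\| S_{ab}^{(i-1,j-1)}(t) \big\|} , \tag{6.18} \\
S\big( \Phi_a^{(n,n)}(t), \Psi_b^{(n,n)}(t) \big) &= \frac{\det_{n+1} \big\| S_{ab}^{(i-1,j-1)}(t) \big\|}{\det_n \big\| S_{ab}^{(i-1,j-1)}(t) \big\|} , \tag{6.19}
\end{align*}
where $\chi_a^{(i)}$, $\chi_b^{*\,(i)}$ and $S_{ab}^{(i,j)}$ are defined in (6.15), and the matrix indices $i, j$ run from 1 to $n$ or $n+1$ according to the indicated sizes of the determinants.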
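Formulae (6.17)–(6.19) can be further generalized to:
\begin{align*}
\big( L^{(n,n)} \big)^{s} \big( \Phi_a^{(n,n)}(t) \big) &= (-1)^n\, \frac{\det_{n+1} \left\| \begin{array}{cc} S_{ab}^{(i-1,j-1)}(t) & \chi_a^{(i-1)}(t) \\ S_{ab}^{(s+n,j-1)}(t) & \chi_a^{(s+n)}(t) \end{array} \right\|}{\det_n \big\| S_{ab}^{(i-1,j-1)}(t) \big\|} , \tag{6.20} \\
\big( L^{(n,n)\,*} \big)^{s} \big( \Psi_b^{(n,n)}(t) \big) &= (-1)^n\, \frac{\det_{n+1} \left\| \begin{array}{cc} S_{ab}^{(i-1,j-1)}(t) & S_{ab}^{(i-1,s+n)}(t) \\ \chi_b^{*\,(j-1)}(t) & \chi_b^{*\,(s+n)}(t) \end{array} \right\|}{\det_n \big\| S_{ab}^{(i-1,j-1)}(t) \big\|} , \tag{6.21} \\
S\Big( \big( L^{(n,n)} \big)^{l} \big( \Phi_a^{(n,n)} \big) , \big( L^{(n,n)\,*} \big)^{s} \big( \Psi_b^{(n,n)} \big) \Big) &= \frac{\det_{n+1} \left\| \begin{array}{cc} S_{ab}^{(i-1,j-1)}(t) & S_{ab}^{(i-1,s+n)}(t) \\ S_{ab}^{(l+n,j-1)}(t) & S_{ab}^{(l+n,s+n)}(t) \end{array} \right\|}{\det_n \big\| S_{ab}^{(i-1,j-1)}(t) \big\|} . \tag{6.22}
\end{align*}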
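We are now ready, with the help of (6.20)–(6.22), to write down the generalization of Eq. (6.16) for the cKP $\tau$-function subject to an arbitrary $(n,k)$ binary DB transformation:

Proposition 6.2. The general discrete binary Darboux-Bäcklund orbit on the space of cKP$_{r,m}$ $\tau$-functions, generated by a fixed pair of (adjoint) eigenfunctions $\Phi_a$, $\Psi_b$ and starting from arbitrary “initial” $\tau^{(0,0)}$, consists of the following elements $\tau^{(n,k)}$:
\begin{align*}
\frac{\tau^{(n,k)}}{\tau^{(0,0)}} &= (-1)^k \Big( \det\nolimits_k \big\| S_{ab}^{(i-1,j-1)} \big\| \Big)^{-(n-k-1)} \times \\
 &\quad W_{n-k} \left[ \det\nolimits_{k+1} \left\| \begin{array}{cc} S_{ab}^{(i-1,j-1)} & \chi_a^{(i-1)} \\ S_{ab}^{(k,j-1)} & \chi_a^{(k)} \end{array} \right\| , \ldots , \det\nolimits_{k+1} \left\| \begin{array}{cc} S_{ab}^{(i-1,j-1)} & \chi_a^{(i-1)} \\ S_{ab}^{(n-1,j-1)} & \chi_a^{(n-1)} \end{array} \right\| \right] , \\
 &\qquad \text{for } n \ge k , \quad i, j = 1, \ldots, k \tag{6.23} \\
\frac{\tau^{(n,k)}(t)}{\tau^{(0,0)}(t)} &= (-1)^n \Big( \det\nolimits_n \big\| S_{ab}^{(i-1,j-1)} \big\| \Big)^{-(k-n-1)} \times \\
 &\quad W_{k-n} \left[ \det\nolimits_{n+1} \left\| \begin{array}{cc} S_{ab}^{(i-1,j-1)} & S_{ab}^{(i-1,n)} \\ \chi_b^{*\,(j-1)} & \chi_b^{*\,(n)} \end{array} \right\| , \ldots , \det\nolimits_{n+1} \left\| \begin{array}{cc} S_{ab}^{(i-1,j-1)} & S_{ab}^{(i-1,k-1)} \\ \chi_b^{*\,(j-1)} & \chi_b^{*\,(k-1)} \end{array} \right\| \right] , \\
 &\qquad \text{for } n \le k , \quad i, j = 1, \ldots, n \tag{6.24}
\end{align*}
where $\chi_a^{(i)}$, $\chi_b^{*\,(i)}$ and $S_{ab}^{(i,j)}$ are as in (6.15), and $W_k[f_1, \ldots, f_k] = \det \big\| \partial^{\alpha-1} f_\beta \big\|_{\alpha,\beta = 1, \ldots, k}$ indicates the Wronskian determinant of a set of functions $\{f_1, \ldots, f_k\}$.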
Remark. Note that the entries in the Wronskians in Eqs. (6.23)–(6.24) are determinants themselves.

Remark. In the Appendix we write down the explicit expressions for the cKP$_{r,m}$ $\tau$-functions on the most general discrete binary Darboux-Bäcklund orbit generated via successive (adjoint) DB transformations (6.3)–(6.10) w.r.t. an arbitrary set of (adjoint) eigenfunctions.

In the simple case of a “free” initial system with $L^{(0,0)} = D$ which, accordingly, is characterized by:
\[
\tau^{(0,0)}(t) = 1 ; \qquad \frac{\partial}{\partial t_n} \chi_a^{(i)} = \partial_x^n \chi_a^{(i)} , \qquad \frac{\partial}{\partial t_n} \chi_b^{*\,(j)} = -(-\partial_x)^n \chi_b^{*\,(j)} , \tag{6.25}
\]
formula (6.16) reproduces the Fredholm determinant expression $\tau = \det \| \delta_{ij} + a_{ij} \|$ for the $\tau$-function with $a_{ij} \equiv \int_{-\infty}^{x} \chi_a^{(i)} \chi_b^{*\,(j)}\, dy$ [40, 41]. Namely, it follows that:
\[
\frac{\partial}{\partial t_n} \big( \delta_{ij} + a_{ij} \big) = \sum_{l=1}^{n} \partial_x^{l-1} \chi_a^{(i)}\, (-\partial_x)^{n-l} \chi_b^{*\,(j)} = \mathrm{Res}\Big( D^{-1} \chi_b^{*\,(j)} D^n \chi_a^{(i)} D^{-1} \Big) . \tag{6.26}
\]
The latter allows us to identify $\delta_{ij} + a_{ij}$ with $S\big( \chi_a^{(i)}, \chi_b^{*\,(j)} \big)$ and establishes connection between the above special case of (6.16) and the Fredholm determinant expressions for the $\tau$-functions of refs. [40, 41].

Now, we shall show that the $(n,k)$ binary Darboux-Bäcklund orbit of the cKP$_{r,2}$ hierarchy defines a two-dimensional Toda square-lattice system which describes two coupled ordinary two-dimensional Toda-lattice models corresponding to the horizontal $(n,0)$ and the vertical $(0,k)$ one-dimensional sublattices of the $(n,k)$ binary DB square-lattice. Namely, consider:
\[
\partial_x \frac{\partial}{\partial t_r} \ln \tau^{(n,k)} = \mathrm{Res}\, L^{(n,k)} = \Phi_1^{(n,k)} \Psi_1^{(n,k)} + \Phi_2^{(n,k)} \Psi_2^{(n,k)} = \frac{\Phi_1^{(n,k)}}{\Phi_1^{(n-1,k)}} - \frac{\Psi_2^{(n,k)}}{\Psi_2^{(n,k-1)}} , \tag{6.27}
\]
where we used (6.3) and (6.7). Taking into account the expressions for the (adjoint-)DB transformed $\tau$-functions (6.4) and (6.9), i.e.
\[
\Phi_1^{(n,k)} = \frac{\tau^{(n+1,k)}}{\tau^{(n,k)}} , \qquad \Psi_2^{(n,k)} = \frac{\tau^{(n,k+1)}}{\tau^{(n,k)}} , \tag{6.28}
\]
Eq. (6.27) can be rewritten in the form:
\[
\partial_x \frac{\partial}{\partial t_r} \ln \tau^{(n,k)} = \frac{\tau^{(n+1,k)} \tau^{(n-1,k)} - \tau^{(n,k+1)} \tau^{(n,k-1)}}{\big( \tau^{(n,k)} \big)^2} , \tag{6.29}
\]
or, equivalently:
\[
\tau^{(n,k)}\, \partial_x \frac{\partial}{\partial t_r} \tau^{(n,k)} - \partial_x \tau^{(n,k)}\, \frac{\partial}{\partial t_r} \tau^{(n,k)} = \tau^{(n+1,k)} \tau^{(n-1,k)} - \tau^{(n,k+1)} \tau^{(n,k-1)} . \tag{6.30}
\]
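Similarly, Eq. (6.27) can be represented as a system of coupled equations of motion for $\Phi_1^{(n,k)}$ and $\Psi_2^{(n,k)}$ using again (6.28):
\begin{align*}
\partial_x \frac{\partial}{\partial t_r} \ln \Phi_1^{(n,k)} &= \left( \frac{\Phi_1^{(n+1,k)}}{\Phi_1^{(n,k)}} - \frac{\Phi_1^{(n,k)}}{\Phi_1^{(n-1,k)}} \right) - \left( \frac{\Psi_2^{(n+1,k)}}{\Psi_2^{(n+1,k-1)}} - \frac{\Psi_2^{(n,k)}}{\Psi_2^{(n,k-1)}} \right) , \tag{6.31} \\
\partial_x \frac{\partial}{\partial t_r} \ln \Psi_2^{(n,k)} &= -\left( \frac{\Psi_2^{(n,k+1)}}{\Psi_2^{(n,k)}} - \frac{\Psi_2^{(n,k)}}{\Psi_2^{(n,k-1)}} \right) + \left( \frac{\Phi_1^{(n,k+1)}}{\Phi_1^{(n-1,k+1)}} - \frac{\Phi_1^{(n,k)}}{\Phi_1^{(n-1,k)}} \right) . \tag{6.32}
\end{align*}
When $\Psi_2^{(n,k)}$ vanishes the remaining equations for $\Phi_1^{(n,k)}$ reduce for a fixed $k$ to the equations of motion for the well-known Toda model on a one-dimensional lattice w.r.t. $n$ (and vice versa if $\Phi_1^{(n,k)} = 0$).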
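Equation (6.31) can be recovered from (6.27) and (6.28) in two lines. The sketch below only uses $\ln \Phi_1^{(n,k)} = \ln \tau^{(n+1,k)} - \ln \tau^{(n,k)}$, which follows from (6.28), and then applies (6.27) at the lattice sites $(n+1,k)$ and $(n,k)$:

```latex
\begin{align*}
\partial_x\frac{\partial}{\partial t_r}\ln\Phi_1^{(n,k)}
 &= \partial_x\frac{\partial}{\partial t_r}\ln\tau^{(n+1,k)}
  - \partial_x\frac{\partial}{\partial t_r}\ln\tau^{(n,k)} \\
 &= \Big(\frac{\Phi_1^{(n+1,k)}}{\Phi_1^{(n,k)}} - \frac{\Psi_2^{(n+1,k)}}{\Psi_2^{(n+1,k-1)}}\Big)
  - \Big(\frac{\Phi_1^{(n,k)}}{\Phi_1^{(n-1,k)}} - \frac{\Psi_2^{(n,k)}}{\Psi_2^{(n,k-1)}}\Big) .
\end{align*}
```

Regrouping the $\Phi_1$ and $\Psi_2$ ratios gives (6.31). The same argument with $\ln \Psi_2^{(n,k)} = \ln \tau^{(n,k+1)} - \ln \tau^{(n,k)}$ gives (6.32).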
7. Discussion and Outlook. Relation to Random Matrix Models

In this paper we provided a new version of the eigenfunction formulation of the KP hierarchy, called the squared eigenfunction potential (SEP) method, where the SEP plays a rôle of a basic building block. The principal ingredient of the SEP method is the proof of existence of spectral representation for any KP eigenfunction as a spectral integral over the (adjoint) BA function with spectral density explicitly given in terms of a SEP. It was pointed out that the spectral representations of the (adjoint) BA functions themselves (being particular examples of KP eigenfunctions) can, in turn, serve as defining relations for the whole KP hierarchy parallel to Hirota fundamental bilinear identity or Fay identity.

The SEP method was subsequently employed to solve various issues in integrable hierarchies of KP type both of conceptual, as well as more applied character. Many, previously unrelated, recent developments in the theory of the $\tau$-function of the KP hierarchy gained from being described by the present formalism. As one of the important illustrations of how our method works, we have shown how the SEP, acting on the manifold of wave functions $\psi_{BA}(t,\lambda)$ by generating non-isospectral symmetry algebra, lifts to a vertex operator acting on $\tau$-functions. This reproduced in the SEP setting the results of [14, 15, 16, 17]. We have also employed the SEP construction in the context of Hamiltonian reductions of KP hierarchy providing:
– description of the reductions of the general KP hierarchy to the constrained cKP$_{r,m}$ hierarchies entirely in terms of linear constraint equations on the pertinent $\tau$-functions;
– description of constrained cKP$_{r,m}$ hierarchies in the language of the universal Sato Grassmannian;
– obtaining the explicit form of the non-isospectral Virasoro symmetry generators acting on the cKP$_{r,m}$ $\tau$-functions.
The achieved progress should result in further clarification of the status of the cKP$_{r,m}$ hierarchies and their connection to the underlying fermionic field language. It would also be interesting to look for the signs of the affine $\widehat{sl}(r+m+1)$ symmetry encountered in construction of the cKP$_{r,m}$ models by the generalized Drinfeld-Sokolov method associated to affine Kac-Moody algebras [28].

Furthermore, as a principal application, the SEP method was used to derive a series of new determinant solutions for the $\tau$-functions of (constrained) KP hierarchies which generalize the familiar Wronskian (multi-soliton) solutions. These new solutions belong to a new type of generalized binary Darboux-Bäcklund orbits which, in turn, were shown to correspond to a novel Toda model on a square lattice. An important task for future study is to find a closed Lagrangian description of this new Toda square-lattice model.

Finally, let us briefly describe another potential physical application of the present approach. Using the spectral representation for (adjoint) eigenfunctions (3.15) together with (2.10)–(2.11), as well as the following form of the Fay identity for $\tau$-functions [14]:
\[
\det\nolimits_n \left\| \frac{\tau\big( t + [\lambda_i^{-1}] - [\mu_j^{-1}] \big)}{(\lambda_i - \mu_j)\, \tau(t)} \right\|
 = (-1)^{\frac{n(n-1)}{2}}\, \frac{\prod_{i>j} (\lambda_i - \lambda_j)(\mu_i - \mu_j)}{\prod_{i,j} (\lambda_i - \mu_j)}\, \frac{\tau\big( t + \sum_l [\lambda_l^{-1}] - \sum_l [\mu_l^{-1}] \big)}{\tau(t)} , \tag{7.1}
\]
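we obtain an equivalent “spectral” representation for $\tau^{(n,n)}(t)$ (6.16):
\begin{align*}
\tau^{(n,n)}(t) = \frac{(-1)^{\frac{n(n-1)}{2}}}{(n!)^2} \int \prod_{j=0}^{n-1} d\lambda_j\, d\mu_j\;
 &\prod_{i,j} \frac{1}{\lambda_i - \mu_j}\, \prod_{i>j} (\lambda_i - \lambda_j)\big( \lambda_i^r - \lambda_j^r \big)\, \prod_{i>j} (\mu_i - \mu_j)\big( \mu_i^r - \mu_j^r \big) \times \\
 &\prod_{j=0}^{n-1} \varphi_b^{*\,(0,0)}(\lambda_j)\, e^{-\xi(t,\lambda_j)}\, \prod_{j=0}^{n-1} \varphi_a^{(0,0)}(\mu_j)\, e^{\xi(t,\mu_j)}\;
 \tau^{(0,0)}\Big( t + \sum_l [\lambda_l^{-1}] - \sum_l [\mu_l^{-1}] \Big) . \tag{7.2}
\end{align*}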
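Following [42], we can interpret the $\tau$-function (7.2) as a partition function of a certain random multi-matrix ensemble with the following joint distribution function of eigenvalues:
\begin{align*}
Z_n[\{t\}] \equiv \mathrm{const}\; \tau^{(n,n)}(t) &= \int \prod_{j=0}^{n-1} d\lambda_j\, d\mu_j\, \exp\big\{ -H(t; \{\lambda\}, \{\mu\}) \big\} , \tag{7.3} \\
H(t; \{\lambda\}, \{\mu\}) &\equiv \sum_j \big( \bar{H}_1(\lambda_j) + H_1(\mu_j) \big) + \sum_{i>j} \big( H_2(\lambda_i, \lambda_j) + H_2(\mu_i, \mu_j) \big) \\
 &\quad + \sum_{i,j} \tilde{H}_2(\lambda_i, \mu_j) + H_n(\{\lambda\}, \{\mu\}) , \tag{7.4}
\end{align*}
where the one-, two- and many-body Hamiltonians read, respectively:
\begin{align*}
H_1(\lambda) &= -\ln \varphi^{(0,0)}(\lambda) - \xi(t,\lambda) , \qquad \bar{H}_1(\lambda) = -\ln \varphi^{*\,(0,0)}(\lambda) + \xi(t,\lambda) , \tag{7.5} \\
H_2(\lambda_i, \lambda_j) &= -\ln (\lambda_i - \lambda_j)^2 - \ln \sum_{s=0}^{r-1} \lambda_i^{s} \lambda_j^{r-1-s} , \qquad \tilde{H}_2(\lambda, \mu) = \ln(\lambda - \mu) , \tag{7.6} \\
H_n(\{\lambda\}, \{\mu\}) &= -\ln \tau^{(0,0)}\Big( t + \sum_l [\lambda_l^{-1}] - \sum_l [\mu_l^{-1}] \Big) . \tag{7.7}
\end{align*}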
The physical implications of the above new type of joint distribution function (7.3)–(7.7) deserve further study, especially regarding critical behavior of correlations. The emerging new interesting features of (7.3)–(7.7), absent in the joint distribution function derived from the conventional two-matrix model [42], are as follows:
(a) the second attractive term in the two-body potential $H_2$ (7.6) (both for $\lambda$- and $\mu$-“particles”) dominating at very long distances over the customary repulsive first term;
(b) an additional two-body attractive potential $\tilde{H}_2$ (7.6) between each pair of $\lambda$- and $\mu$-“particles”;
(c) a genuine many-body potential $H_n$ (7.7).
One of the most important issues here is to exhibit the explicit form of the generalized multi-matrix model behind (7.3)–(7.7).
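A. Appendix: The Most General cKP$_{r,m}$ Binary Darboux-Bäcklund Orbit

Let us first introduce a few convenient compact notations for Wronskian and related Wronskian-like determinants:
\begin{align*}
W_k &\equiv W_k[f_1, \ldots, f_k] = \det \big\| \partial^{\alpha-1} f_\beta \big\| , \qquad W_k(f) \equiv W_{k+1}[f_1, \ldots, f_k, f] , \tag{A.1} \\
\widetilde{W}_{k+1}(f) &\equiv \widetilde{W}[f_1, \ldots, f_{k+1}; f] \equiv \det\nolimits_{k+1} \left\| \begin{array}{cc} \partial^{\alpha-1} f_\beta & \partial^{\alpha-1} f_{k+1} \\ S(f_\beta, f) & S(f_{k+1}, f) \end{array} \right\| , \tag{A.2}
\end{align*}
where the matrix indices $\alpha, \beta = 1, \ldots, k$ and, as above, $\partial_x S(f_\beta, f) = f_\beta f$. The Wronskian(-like) determinants (A.1)–(A.2) obey the following useful identities:
\[
\partial \left( \frac{W_{k-1}(f)}{W_k} \right) = \frac{W_k(f)\, W_{k-1}}{W_k^2} ; \qquad
\partial \left( \frac{\widetilde{W}_{k+1}(f)}{W_k} \right) = -\frac{\widetilde{W}_k(f)\, W_{k+1}}{W_k^2} , \tag{A.3}
\]
where the first one is known as Jacobi expansion theorem (see, e.g. [43]), whereas the second identity in (A.3) can be easily verified via induction. Equations (A.3) imply in turn the identities:
\[
T_k \cdots T_1 (f) = \frac{W_k(f)}{W_k} , \qquad T_k^{*\,-1} \cdots T_1^{*\,-1} (f) = -\frac{\widetilde{W}_k(f)}{W_k} \tag{A.4}
\]
with
\[
T_j = \frac{W_j}{W_{j-1}}\, D\, \frac{W_{j-1}}{W_j} , \qquad T_j^{*\,-1} = -\frac{W_{j-1}}{W_j}\, D^{-1}\, \frac{W_j}{W_{j-1}} . \tag{A.5}
\]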
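The induction step behind (A.4) can be sketched as follows. It uses only the operators (A.5) and the two identities of (A.3), read as $\partial\big(W_{k-1}(f)/W_k\big) = W_k(f) W_{k-1}/W_k^2$ and $\partial\big(\widetilde{W}_{k+1}(f)/W_k\big) = -\widetilde{W}_k(f) W_{k+1}/W_k^2$, with $W_0 \equiv 1$ assumed as the starting convention:

```latex
\begin{align*}
T_k\Big(\frac{W_{k-1}(f)}{W_{k-1}}\Big)
 &= \frac{W_k}{W_{k-1}}\,\partial\Big(\frac{W_{k-1}(f)}{W_k}\Big)
  = \frac{W_k}{W_{k-1}}\,\frac{W_k(f)\,W_{k-1}}{W_k^{2}}
  = \frac{W_k(f)}{W_k} , \\
T^{*\,-1}_{k+1}\Big(-\frac{\widetilde W_k(f)}{W_k}\Big)
 &= \frac{W_k}{W_{k+1}}\,\partial^{-1}\Big(\frac{W_{k+1}\,\widetilde W_k(f)}{W_k^{2}}\Big)
  = \frac{W_k}{W_{k+1}}\,\Big(-\frac{\widetilde W_{k+1}(f)}{W_k}\Big)
  = -\frac{\widetilde W_{k+1}(f)}{W_{k+1}} .
\end{align*}
```

For the base case, $T_1^{*\,-1}(f) = -f_1^{-1}\,\partial^{-1}(f_1 f) = -S(f_1, f)/f_1 = -\widetilde{W}_1(f)/W_1$, where $\partial^{-1}$ is understood with the same integration constant as in the definition of $S$.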
Now we can use Eqs. (A.4) to derive explicit expressions for the (adjoint) eigenfunctions and $\tau$-functions of cKP$_{r,m}$ hierarchies generated via successive (adjoint) DB transformations (6.3)–(6.10) w.r.t. an arbitrary set of (adjoint) eigenfunctions of the “initial” cKP$_{r,m}$ Lax operator $L = L_{r,m}$ (2.20). We shall denote the latter arbitrary successive (adjoint) DB transformations by the following double-vector superscript:
\[
\big( \vec{n}, \vec{k} \big) \equiv \big( (n_1, \ldots, n_m), (k_1, \ldots, k_m) \big) \tag{A.6}
\]
indicating $n_1$ successive DB transformations w.r.t. $\Phi_1$ etc., until $n_m$ DB transformations w.r.t. $\Phi_m$ and, similarly, $k_1$ successive adjoint-DB transformations w.r.t. $\Psi_1$ etc., until $k_m$ adjoint-DB transformations w.r.t. $\Psi_m$. Specifically, we have:
\begin{align*}
\Phi_a^{(\vec{n},\vec{0})} &= (-1)^{\sum_{j=a+1}^{m} n_j}\, \frac{W\big[ \chi_1^{(0)}, \ldots, \chi_1^{(n_1-1)}, \ldots, \chi_a^{(0)}, \ldots, \chi_a^{(n_a-1)}, \chi_a^{(n_a)}, \ldots, \chi_m^{(0)}, \ldots, \chi_m^{(n_m-1)} \big]}{W\big[ \chi_1^{(0)}, \ldots, \chi_1^{(n_1-1)}, \ldots, \chi_a^{(0)}, \ldots, \chi_a^{(n_a-1)}, \ldots, \chi_m^{(0)}, \ldots, \chi_m^{(n_m-1)} \big]} , \tag{A.7} \\
\frac{\tau^{(\vec{n},\vec{0})}}{\tau^{(\vec{0},\vec{0})}} &= W\big[ \chi_1^{(0)}, \ldots, \chi_1^{(n_1-1)}, \ldots, \chi_m^{(0)}, \ldots, \chi_m^{(n_m-1)} \big] , \tag{A.8}
\end{align*}
\[
\Psi_a^{(\vec{n},\vec{0})} =
\begin{cases}
(-1)^{\sum_{j=a+1}^{m} n_j}\, \dfrac{W\big[ \chi_1^{(0)}, \ldots, \chi_1^{(n_1-1)}, \ldots, \chi_a^{(0)}, \ldots, \chi_a^{(n_a-2)}, \ldots, \chi_m^{(0)}, \ldots, \chi_m^{(n_m-1)} \big]}{W\big[ \chi_1^{(0)}, \ldots, \chi_1^{(n_1-1)}, \ldots, \chi_a^{(0)}, \ldots, \chi_a^{(n_a-1)}, \ldots, \chi_m^{(0)}, \ldots, \chi_m^{(n_m-1)} \big]} , & \text{for } n_a \ge 1 \\[3ex]
-\dfrac{\widetilde{W}\big[ \chi_1^{(0)}, \ldots, \chi_1^{(n_1-1)}, \ldots, \chi_m^{(0)}, \ldots, \chi_m^{(n_m-1)} ; \Psi_a^{(\vec{0},\vec{0})} \big]}{W\big[ \chi_1^{(0)}, \ldots, \chi_1^{(n_1-1)}, \ldots, \chi_m^{(0)}, \ldots, \chi_m^{(n_m-1)} \big]} , & \text{for } n_a = 0
\end{cases} \tag{A.9}
\]
where the functions $\chi_a^{(s)}$ are the same as in (6.15) with the superscripts $(0,0)$ replaced with the corresponding double-vector ones $(\vec{0},\vec{0})$. Equations (A.7)–(A.8) already appeared in [29] (see also refs. [44]), whereas Eq. (A.9) is derived via iterative application of the second identity in (A.4) and taking into account (A.7).

Now, performing arbitrary successive adjoint-DB transformations on $\tau^{(\vec{n},\vec{0})}$ (A.8) according to the second Eq. (6.9) upon using the first identity in (A.4) and inserting there the explicit expressions (A.9), we arrive at the following:

Proposition A.1. The most general discrete binary Darboux-Bäcklund orbit on the space of cKP$_{r,m}$ $\tau$-functions is built up of the following elements:
\begin{align*}
\frac{\tau^{(\vec{n},\vec{k})}}{\tau^{(\vec{0},\vec{0})}} &= \Big[ -W\big[ \chi_1^{(0)}, \ldots, \chi_1^{(n_1-1)}, \ldots, \chi_a^{(0)}, \ldots, \chi_a^{(n_a-1)} \big] \Big]^{-\sum_{j=a+1}^{m} k_j} \times \\
 &\quad W\big[ \Delta_{\vec{n}}^{(0,a+1)}, \ldots, \Delta_{\vec{n}}^{(k_{a+1}-1,\,a+1)}, \ldots, \Delta_{\vec{n}}^{(0,m)}, \ldots, \Delta_{\vec{n}}^{(k_m-1,\,m)} \big] \tag{A.10} \\
\Delta_{\vec{n}}^{(l,s)} &\equiv \widetilde{W}\big[ \chi_1^{(0)}, \ldots, \chi_1^{(n_1-1)}, \ldots, \chi_a^{(0)}, \ldots, \chi_a^{(n_a-1)} ; \chi_s^{*\,(l)} \big] \tag{A.11}
\end{align*}
where
\[
\big( \vec{n}, \vec{k} \big) = \big( (n_1, \ldots, n_a, 0, \ldots, 0), (0, \ldots, 0, k_{a+1}, \ldots, k_m) \big) ; \qquad a = 0, 1, \ldots, m \tag{A.12}
\]
and, furthermore, notations (6.15) and (A.2) are employed.

Remark. The reason for the zero entries in the labels (A.12) of the most general binary DB transformations, preserving the spaces of cKP$_{r,m}$ hierarchies (2.20), lies in the fact that any pair of two successive (adjoint-)DB transformations w.r.t. $\Phi_a$, $\Psi_a$, i.e. both with the same index, is equivalent to an identity transformation as one can easily conclude by combining the second equation in (6.3) with the second equations in (6.4) and (6.9).

Acknowledgement. It is our pleasure to thank Leonid Dickey for interest in this work, his very useful comments and encouragement.
References

1. Manakov, S., Novikov, S., Pitaevski, L. and Zakharov, V.: Soliton Theory: The Inverse Problem. Moscow: Nauka, 1980
2. Dickey, L.: Soliton Equations and Hamiltonian Systems. Singapore: World Scientific, 1991
3. Sato, M. and Sato, Y.: In: Nonlinear Partial Differential Equations in Applied Science. Lect. Notes in Num. Anal. 5, P.D. Lax, H. Fujita, and G. Strang (eds.), Amsterdam: North-Holland, 1982, pp. 259–271
4. Date, E., Kashiwara, M., Jimbo, M. and Miwa, T.: Transformation groups for soliton equations. In: Nonlinear Integrable Systems – Classical Theory and Quantum Theory, Singapore: World Scientific 1983; Jimbo, M. and Miwa, T.: Solitons and infinite dimensional Lie algebras. Publ. RIMS, Kyoto Univ., 19, 943–1001 (1983)
5. Hirota, R.: Direct methods in soliton theory. In: Solitons, Topics in Current Physics. 17, R.K. Bullough and P.J. Chaudrey (eds.), Berlin–Heidelberg–New York: Springer, 1980
6. Dijkgraaf, R., Verlinde, E. and Verlinde, H.: Nucl. Phys. B348, 435 (1991); Goeree, J.: Nucl. Phys. B358, 737 (1991); La, H.: Mod. Phys. Lett. A6, 573 (1991); Bonora, L., Martellini, M. and Xiong, C.S.: Nucl. Phys. B375, 453 (1992); Adler, M. and van Moerbeke, P.: Commun. Math. Phys. 147, 25 (1992); Panda, S. and Roy, S.: Int. J. Mod. Phys. A8, 3457 (1993) (hep-th/9208065); van de Leur, J.: J. Geom. Phys. 17, 95 (1995) (hep-th/9403080)
7. Morozov, A.: Phys. Usp. 37, 1 (1994) (hep-th/9303139); hep-th/9502091
8. Aratyn, H., Nissimov, E. and Pacheva, S.: Int. J. Mod. Phys. A12, 1265 (1997) (hep-th/9607234)
9. Orlov, A. and Schulman, E.: Theor. Mat. Phys. 64, 323 (1985); Letters in Math. Phys. 12, 171 (1986); Orlov, A.: In: Plasma Theory and Nonlinear and Turbulent Processes in Physics. Singapore: World Scientific, 1988
10. Dickey, L.: Mod. Phys. Lett. A8, 1357 (1993) (hep-th/9210155); Haine, L. and Horozov, E.: Bull. Sc. Math., 2-e serie, 117, 485 (1993)
11. Bakas, I.: Commun. Math. Phys. 134, 487 (1990); Pope, C., Romans, L. and Shen, X.: Nucl. Phys. B339, 191 (1990)
12. Zamolodchikov, A.: Theor. Mat. Phys. 65, 1205 (1985); Fateev, V. and Lukianov, S.: Int. J. Mod. Phys. A7, 853 (1992); Bouwknegt, P. and Schoutens, K.: Phys. Reports 223, 183 (1993)
13. Kac, V. and Peterson, D.: Proc. Natl. Acad. Sci. USA 78, 3308 (1981); Radul, A.: Functional Analysis and Its Application 25, 33 (1991); Radul, A. and Vaysburd, I.: Phys. Lett. 274B, 317 (1992); Bakas, I., Khesin, B. and Kiritsis, E.: Commun. Math. Phys. 151, 233 (1993)
14. van Moerbeke, P.: In: Lectures on Integrable Systems, O. Babelon et al. (eds.), Singapore: World Scientific, 1994
15. Adler, M., Shiota, T. and van Moerbeke, P.: Commun. Math. Phys. 171, 547 (1995); Phys. Lett. 194A, 33 (1994)
16. Dickey, L.A.: Commun. Math. Phys. 167, 227 (1995) (hep-th/9312015)
17. Dickey, L.: Lectures on classical W-algebras. Cortona Lectures, 1993
18. Takebe, T.: Int. J. Mod. Phys. A7, 923 (1992) (Suppl. 1B)
19. Oevel, W.: Physica A195, 533 (1993); Oevel, W. and Schief, W.: Rev. Math. Phys. 6, 1301 (1994)
20. Orlov, A.: In: Nonlinear Processes in Physics: Proceedings of the III Potsdam-V Kiev Workshop. A.S. Fokas et al. (eds.), Berlin–Heidelberg–New York: Springer-Verlag, 1993
21. Grinevich, P., Orlov, A. and Schulman, E.: In: Important Developments in Soliton Theory. A. Fokas and V. Zakharov (eds.), Berlin–Heidelberg–New York: Springer-Verlag, 1993; Grinevich, P.: In: Singular Limits of Dispersive Waves. N. Ercolani et al. (eds.), London–New York: Plenum Press, 1994 (solv-int/9509004)
22. Carroll, R.: solv-int/9612011, Applicable Anal. To appear
23. Konopelchenko, B. and Strampp, W.: J. Math. Phys. 33, 3676 (1992); Cheng, Y.: J. Math. Phys. 33, 3774 (1992); Xu, B. and Li, Y.: J. Physics A25, 2957 (1992); Sidorenko, J. and Strampp, W.: J. Math. Phys. 34, 1429 (1993); Oevel, W. and Strampp, W.: Commun. Math. Phys. 157, 1 (1993)
24. Cheng, Y., Strampp, W. and Zhang, B.: Commun. Math. Phys. 168, 117 (1995)
25. Aratyn, H., Nissimov, E. and Pacheva, S.: Phys. Lett. 314B, 41 (1993) (hep-th/9306035); Aratyn, H., Gomes, J. and Zimerman, A.: J. Math. Phys. 36, 3419 (1995) (hep-th/9408104)
26. Yu, F.: Letters in Math. Phys. 29, 175 (1993) (hep-th/9301053); Aratyn, H., Nissimov, E. and Pacheva, S.: Phys. Lett. 331B, 82 (1994) (hep-th/9401058); Aratyn, H., Nissimov, E., Pacheva, S. and Zimerman, A.H.: Int. J. Mod. Phys. A10, 2537 (1995) (hep-th/9407117)
27. Bonora, L. and Xiong, C.S.: Int. J. Mod. Phys. A8, 2973 (1993) (hep-th/9209041); Nucl. Phys. B405, 191 (1993) (hep-th/9212070); J. Math. Phys. 35, 5781 (1994) (hep-th/9311070)
28. Aratyn, H., Ferreira, L., Gomes, J. and Zimerman, A.: J. Math. Phys. 38, 1559 (1997) (hep-th/9509096)
29. Aratyn, H., Nissimov, E. and Pacheva, S.: Phys. Lett. 228A, 164 (1997) (hep-th/9602068)
30. Oevel, W. and Schief, W.: In: Applications of Analytic and Geometric Methods in Nonlinear Differential Equations. P. Clarkson (ed.), Amsterdam: Kluwer, 1993
31. Enriquez, B., Orlov, A.Yu. and Rubtsov, V.N.: Inverse Problems, 12, 241 (1996) (solv-int/9510002)
32. Cheng, Y. and Zhang, Y.J.: J. Math. Phys. 35, 5869 (1994)
33. Takasaki, K. and Takebe, T.: Rev. Math. Phys. 7, 743 (1995)
34. Ohta, Y., Satsuma, J., Takahashi, D., Tokihiro, T.: Suppl. Prog. Theor. Phys., 94, 210 (1988)
35. Zhang, Y.-J.: Letters in Math. Phys. 36, 1 (1996)
36. van de Leur, J.: J. Geom. Phys. 23, 83 (1997) (q-alg/9609001)
37. Gerasimov, A., Marshakov, A., Mironov, A., Morozov, A. and Orlov, A.: Nucl. Phys. B357, 565 (1991); Kharchev, S., Marshakov, A., Mironov, A., Orlov, A. and Zabrodin, A.: Nucl. Phys. B366, 569 (1991)
38. Aratyn, H., Nissimov, E. and Pacheva, S.: Phys. Lett. 201A, 293 (1995) (hep-th/9501018)
39. Chau, L.-L., Shaw, J.C. and Yen, H.C.: Commun. Math. Phys. 149, 263 (1992)
40. Schmelzer, I.: Seminar Analysis 1984/1985. (1985) 163
41. Pöppe, C.: Inverse Problems. 5, 613 (1989); Pöppe, C. and Sattinger, D.H.: Publ. RIMS 24, 505 (1988)
42. Avishai, Y., Hatsugai, Y. and Kamoto, M.: Phys. Rev. B53, 8369 (1996)
43. Ince, E.L.: Ordinary Differential Equations. Chap. V, London, 1926; Crum, M.M.: Quart. J. Math. Oxford 6, 121 (1955); Adler, M. and Moser, J.: Commun. Math. Phys. 61, 1 (1978)
44. Aratyn, H., Nissimov, E. and Pacheva, S.: solv-int/9512008, In: New Trends in Quantum Field Theory. A. Ganchev, R. Kerner and I. Todorov (eds.), Heron Press (Proceedings of the second Summer Workshop, Razlog/Bulgaria, Aug-Sept 1995); Oevel, W. and Strampp, W.: J. Math. Phys. 37, 6213 (1996); Loris, I. and Willox, R.: J. Math. Phys. 38, 283 (1997)

Communicated by T. Miwa
Commun. Math. Phys. 193, 527 – 594 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Supersymmetric Quantum Theory and Differential Geometry

J. Fröhlich, O. Grandjean*, A. Recknagel**

Institut für Theoretische Physik, ETH-Hönggerberg, CH-8093 Zürich, Switzerland. E-mail: [email protected], [email protected], [email protected]

Received: 1 April 1997 / Accepted: 13 September 1997
Abstract: We reconsider differential geometry from the point of view of the quantum theory of non-relativistic spinning particles, which provides examples of supersymmetric quantum mechanics. This enables us to encode geometrical structure in algebraic data consisting of an algebra of functions on a manifold and a family of supersymmetry generators represented on a Hilbert space. We show that known types of differential geometry can be classified in terms of the supersymmetries they exhibit. Our formulation is tailor-made for a generalization to non-commutative geometry, which will be presented in a separate paper.

Contents
1 Introduction and Summary of Results
  1.1 Geometry and quantum physics
  1.2 Pauli's electron
  1.3 Supersymmetric quantum theory and geometry put into perspective
2 Algebraic Formulation of Differential Geometry
  2.1 The N = 1 formulation of Riemannian geometry
  2.2 The N = (1, 1) formulation of Riemannian geometry
  2.3 Lie groups
  2.4 Algebraic formulation of complex geometry
  2.5 Hyperkähler manifolds
  2.6 Symplectic geometry
  2.7 Symmetries of spectral data
3 Supersymmetry Classification of Geometries
Appendix: Clifford Algebras and Spin^c-groups
  A.1 Clifford algebras
  A.2 The group Spin^c(V)
  A.3 Exterior algebra and spin representations
References

* Work supported in part by the Swiss National Foundation
** Work supported in part by a European Union HCM Fellowship
1. Introduction and Summary of Results

In this paper we describe an approach to differential topology and geometry rooted in supersymmetric quantum theory. We show how the basic concepts and notions of differential geometry emerge from concepts and notions of the (supersymmetric) quantum theory of non-relativistic particles with spin, and how the classification of different types of differential geometry follows the classification of supersymmetries.

Historically, the development of geometry has been closely linked to that of classical physical theory. Geometry was born, in ancient Greece, out of concrete problems in geodesy. Much later, the invention and development of differential geometry took place parallel to the development of classical mechanics and electromagnetism. A basic notion in differential geometry is that of parameterized curves on a manifold. This notion is intimately related to the one of trajectories of point particles central in classical mechanics: Tangent vectors correspond to velocities, vector fields to force laws. There are many further examples of such interrelations. To mention one, the work of de Rham and Chern was inspired and influenced by Maxwell's theory of electromagnetism, a classical field theory.

In this century, the foundations of physics have changed radically with the discovery of quantum mechanics. Quantum theory is more fundamental than classical physics. It is therefore natural to ask how differential geometry can be rediscovered, starting from the basic notions of quantum theory. In this work, we outline a plausible answer to this question. As a payoff, we find natural generalizations of differential geometry to non-commutative differential geometry, in the sense of Connes, which will be presented in a companion paper [FGR]. Connes' theory naturally leads to a far-reaching generalization of geometry which makes it possible to study highly singular and genuinely non-commutative spaces as geometric spaces. Such generalizations are called for by deep problems in quantum mechanics, quantum field theory and quantum gravity. Quantum mechanics entails that phase space must be deformed to a non-commutative space; moreover, attempts to reconcile quantum theory with general relativity naturally lead to the idea that space-time is non-commutative.

The idea to reconsider and generalize algebraic topology and differential geometry from the point of view of operator theory and quantum theory is a guiding principle in Connes' work on non-commutative geometry, the development and results of which are summarized in detail in [Co1–4] and the references therein. Connes' work is basic for the approach presented in this paper. That supersymmetry and supersymmetric quantum field theory provide natural formulations of problems in differential topology and geometry and a variety of novel tools was recognized by Witten, see e.g. [Wi1,2]. In this paper, we attempt to combine the two threads coming from the work of Connes and of Witten: Notions and concepts from quantum mechanics and operator theory and from supersymmetry play fundamental roles in the approach to classical (and, see [FGR], to non-commutative) geometry described here. Supersymmetry has previously been exploited in the context of non-commutative geometry by Jaffe and collaborators, see [Ja1,2,3].
A key example of supersymmetric quantum mechanics is Pauli's theory of non-relativistic spinning electrons, which will be described later in this introduction. It forms the basis for an analysis of classical differential geometry starting from notions of quantum theory. Although we shall not solve any difficult problem in algebraic topology or differential geometry in this paper, we expect that the perspective of topology and geometry offered by supersymmetric quantum mechanics and the generalizations it suggests will turn out to be useful and productive.

In this first section, we give a short sketch of our approach and of the main results. A detailed account of the algebraic formulation of classical geometry is contained in Sect. 2, with some preparatory material collected in the appendix. The third section contains an overview of the various types of differential geometry treated before, but now classified according to the supersymmetry encoded in the algebraic data. This provides us with a better understanding of the hierarchy of geometries, and further facilitates the task to generalize our classical notions to the non-commutative case. A formulation of non-commutative geometry and some examples of non-commutative geometrical spaces will be presented separately in [FGR].

1.1. Geometry and quantum physics. In quantum mechanics, one is used to identify pure states of the physical system under consideration with unit rays in a separable Hilbert space. This Hilbert space carries a representation of an "observable algebra", in the following denoted by $\mathcal{F}_\hbar$. Imagine that the system consists of a quantum mechanical particle propagating on a manifold $M$. Then $\mathcal{F}_\hbar$ can be identified with a deformation of the algebra of functions over the phase space $T^*M$. The non-commutativity of $\mathcal{F}_\hbar$ can be traced back to the Heisenberg uncertainty relations
\[ \Delta p_j \, \Delta q^j \ \ge\ \frac{\hbar}{2} \tag{1.1} \]
for each $j = 1, \dots, \dim M$, where $p_j, q^j$ are Darboux coordinates of $T^*M$, and $\hbar$ is Planck's constant divided by $2\pi$. In the Schrödinger picture, one could identify $\mathcal{F}_\hbar$ with an algebra of pseudo-differential operators. We will always choose it in such a way that it contains the algebra $\mathcal{A}$ of smooth functions over the configuration space $M$ as a maximal abelian sub-algebra.

Starting from the algebra $\mathcal{F}_\hbar$ of observables, viewed as an abstract $*$-algebra, one is thus interested in constructing $*$-representations of $\mathcal{F}_\hbar$ on separable Hilbert spaces. Traditionally, this is accomplished as follows: Given $(\mathcal{F}_\hbar, \mathcal{A})$, or equivalently $(\mathcal{F}_\hbar, M)$, and assuming e.g. that $M$ is smooth, one chooses a Riemannian metric $g$ on $M$. If the quantum mechanical particle is a "neutral scalar particle" one would take as a Hilbert space
\[ \mathcal{H}_{sc} = L^2(M, d\mathrm{vol}_g), \tag{1.2} \]
where integration is defined with the help of the Riemannian volume form $d\mathrm{vol}_g$ on $M$. The algebra $\mathcal{F}_\hbar$ has a natural $*$-representation on $\mathcal{H}_{sc}$. The generator of the quantum mechanical time evolution, i.e. the Hamiltonian, of a system describing a neutral scalar particle is chosen to be
\[ H = -\frac{\hbar^2}{2m}\,\Delta_g + v, \tag{1.3} \]
where $\Delta_g$ is the Laplace-Beltrami operator on smooth functions on $M$ (in physics defined to be negative definite), and the "potential" $v$ is some function on $M$, i.e., $v \in \mathcal{A}$. Finally, $m$ is the mass of the particle.
Using ideas of Connes, see e.g. [Co1], it has been verified in [FG], and is recalled in Sect. 2.1.3, that the manifold $M$ and the geodesic distance on $M$ can be reconstructed from the abstract "spectral triple" $(\mathcal{A}, \mathcal{H}_{sc}, H)$. Hence the Riemannian geometry of the configuration space $M$ is encoded into the following data of quantum mechanics: i) the abelian algebra $\mathcal{A}$ of position measurements; ii) the Hilbert space $\mathcal{H}_{sc}$ of pure state vectors; iii) the Hamiltonian $H$ generating the time evolution of the system. Of course, as physicists, we would prefer exploring the Riemannian geometry of $M$ starting from the data $(\mathcal{F}_\hbar, \mathcal{H}_{sc}, H)$. This poses some interesting mathematical questions which are only partially answered. Since, ultimately, momentum measurements are reduced to position measurements, it is tolerable to take the spectral triple $(\mathcal{A}, \mathcal{H}_{sc}, H)$ as a starting point. But even then the reconstruction of the differential topology and geometry of $M$ from the quantum mechanics of a scalar particle, as encoded in the latter triple, is quite cumbersome. Fortunately, nature has invented particles with spin which are much better suited for an exploration of the geometry of $M$, as is shown in the work of Connes.

1.2. Pauli's electron. Low-energy electrons can be treated as non-relativistic quantum mechanical point particles with spin. Here is a sketch of Pauli's quantum mechanics of non-relativistic electrons and positrons. Physical space is described in terms of a smooth, orientable Riemannian (spin$^c$) manifold $(M, g)$ of dimension $n$. Let $T^*M$ be the cotangent bundle of $M$ and denote by $\Lambda^\bullet M$ the bundle of completely anti-symmetric covariant tensors over $M$. Let $\Omega^\bullet(M)$ be the space of smooth sections of $\Lambda^\bullet M$, i.e., of smooth differential forms on $M$, and $\Omega^\bullet_{\mathbb{C}}(M) = \Omega^\bullet(M) \otimes \mathbb{C}$ its complexification.
Since we are given a Riemannian metric on $M$, $\Lambda^\bullet M$ is equipped with a Hermitian structure which, together with the Riemannian volume element $d\mathrm{vol}_g$, determines a scalar product $(\cdot,\cdot)_g$ on $\Omega^\bullet_{\mathbb{C}}(M)$. Let $\mathcal{H}_{e-p}$ denote the Hilbert space completion of $\Omega^\bullet_{\mathbb{C}}(M)$ in the norm given by the scalar product $(\cdot,\cdot)_g$. Hence $\mathcal{H}_{e-p}$ is the space of complex-valued square-integrable differential forms on $M$. The meaning of the subscript "e-p" will become clear soon.

Given a 1-form $\xi \in \Omega^1(M)$, let $X$ be the vector field corresponding to $\xi$ by the equation $\xi(Y) = g(X, Y)$ for any smooth vector field $Y$. For every $\xi \in \Omega^1_{\mathbb{C}}(M)$, we define two operators on $\mathcal{H}_{e-p}$:
\[ a^*(\xi)\psi = \xi \wedge \psi \tag{1.4} \]
and
\[ a(\xi)\psi = X \,\lrcorner\, \psi \tag{1.5} \]
for all $\psi \in \mathcal{H}_{e-p}$. In (1.5), $\lrcorner$ denotes interior multiplication. Thus $a^*(\xi)$ is the adjoint of the operator $a(\bar{\xi})$ on $\mathcal{H}_{e-p}$. One verifies that, for arbitrary $\xi, \eta \in \Omega^1_{\mathbb{C}}(M)$ (corresponding to vector fields $X, Y$ via $g$),
\[ \{ a(\xi), a(\eta) \} = \{ a^*(\xi), a^*(\eta) \} = 0 \tag{1.6} \]
and
\[ \{ a(\xi), a^*(\eta) \} = g(X, Y) \cdot 1 \equiv g(\xi, \eta) \cdot 1; \tag{1.7} \]
here $\{ A, B \} := AB + BA$ denotes the anti-commutator of $A$ and $B$, and we have used the same symbol for the metric on vector fields and 1-forms. Equations (1.6) and
(1.7) are called canonical anti-commutation relations and are basic in the description of fermions in physics. Next, for every real $\xi \in \Omega^1(M)$, we define two anti-commuting anti-selfadjoint operators $\Gamma(\xi)$ and $\bar\Gamma(\xi)$ on $\mathcal{H}_{e-p}$ by
\[ \Gamma(\xi) = a^*(\xi) - a(\xi), \tag{1.8} \]
\[ \bar\Gamma(\xi) = i\,\big( a^*(\xi) + a(\xi) \big). \tag{1.9} \]
One checks that
\[ \{ \Gamma(\xi), \Gamma(\eta) \} = \{ \bar\Gamma(\xi), \bar\Gamma(\eta) \} = -2\,g(\xi, \eta), \tag{1.10} \]
\[ \{ \Gamma(\xi), \bar\Gamma(\eta) \} = 0, \tag{1.11} \]
for arbitrary $\xi$ and $\eta$ in $\Omega^1(M)$. Thus $\Gamma(\xi)$ and $\bar\Gamma(\xi)$, $\xi \in \Omega^1(M)$, are anti-commuting sections of two isomorphic Clifford bundles $\mathrm{Cl}(M)$ over $M$.

An $n$-dimensional Riemannian manifold $(M, g)$ is a spin$^c$ manifold if and only if $M$ is oriented and there exists a complex Hermitian vector bundle $S$ of rank $2^{[n/2]}$ over $M$, where $[k]$ denotes the integer part of $k$, together with a bundle homomorphism $c : T^*M \to \mathrm{End}(S)$ such that
\[ c(\xi) + c(\xi)^* = 0, \qquad c(\xi)^* c(\xi) = g(\xi, \xi) \cdot 1 \tag{1.12} \]
for all cotangent vectors $\xi \in T^*M$. Above, the adjoint is defined with respect to the Hermitian structure $(\cdot,\cdot)_S$ on $S$. The completion of the space of sections $\Gamma(S)$ in the norm induced by $(\cdot,\cdot)_S$ is a Hilbert space which we denote by $\mathcal{H}_e$, the Hilbert space of square-integrable Pauli-Dirac spinors. The homomorphism $c$ extends uniquely to an irreducible unitary Clifford action of $\mathrm{Cl}(M)$ on $S$.

If $M$ is an even-dimensional spin$^c$ manifold then there is an element $\gamma$ in the Clifford bundle generated by the operators $\Gamma(\xi)$, $\xi \in \Omega^1(M)$, which anti-commutes with every $\Gamma(\xi)$ and satisfies $\gamma^2 = 1$; $\gamma$ corresponds to the Riemannian volume form on $M$. We conclude that there exists an isomorphism
\[ i : \Omega^\bullet_{\mathbb{C}}(M) \longrightarrow \Gamma(\bar S) \otimes_{\mathcal{A}} \Gamma(S), \tag{1.13} \]
where $\mathcal{A} = C^\infty(M)$ if $M$ is smooth (and $\mathcal{A} = C(M)$ for topological manifolds), and where $\bar S$ is the "charge-conjugate" bundle of $S$, constructed from $S$ by complex conjugation of the transition functions, see Sect. 2.2.2. Upon "transporting" $c$ to $\bar S$, this bundle receives a natural Clifford action $\bar c$, and the map $i$ is an intertwiner satisfying
\[ i \circ \Gamma(\xi) = (1 \otimes c(\xi)) \circ i, \qquad i \circ \bar\Gamma(\xi) = (\bar c(\xi) \otimes \gamma) \circ i \tag{1.14} \]
for all $\xi \in \Omega^1(M)$. The "volume element" $\gamma$ has been inserted so as to ensure that the Clifford action $1 \otimes c$ on $\bar S \otimes S$ anti-commutes with the second action $\bar c \otimes \gamma$.

If $M$ is an odd-dimensional spin$^c$ manifold then $\gamma$ is central, and we use the Pauli matrices
\[ \tau_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \quad\text{and}\quad \tau_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \]
in order to obtain anti-commuting Clifford actions $1 \otimes c \otimes \tau_3$ and $\bar c \otimes 1 \otimes \tau_1$ on the bundle $\bar S \otimes S \otimes \mathbb{C}^2$; as before, there is an isomorphism
\[ i : \Omega^\bullet_{\mathbb{C}}(M) \longrightarrow \Gamma(\bar S) \otimes_{\mathcal{A}} \Gamma(S) \otimes \mathbb{C}^2 \tag{1.15} \]
which intertwines the Clifford actions:
\[ i \circ \Gamma(\xi) = (1 \otimes c(\xi) \otimes \tau_3) \circ i, \qquad i \circ \bar\Gamma(\xi) = (\bar c(\xi) \otimes 1 \otimes \tau_1) \circ i. \tag{1.16} \]
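The canonical anti-commutation relations (1.6), (1.7) and the Clifford relations (1.10), (1.11) can be sanity-checked numerically in the simplest setting. The following is only a sketch under stated assumptions, not part of the original argument: it takes the exterior algebra of flat $\mathbb{R}^3$ with an orthonormal coframe $e_0, e_1, e_2$, so that $g(e_i, e_j) = \delta_{ij}$, and represents $a^*(e_i)$ and $a(e_i)$ as $8 \times 8$ matrices. All function names (`creation`, `lin`, `anti`, ...) are hypothetical helpers introduced for this illustration.

```python
# Toy check of (1.6), (1.7), (1.10), (1.11) on the exterior algebra of R^3.
# Assumption: flat metric and an orthonormal coframe e_0, e_1, e_2.
N = 3
DIM = 1 << N  # one basis monomial per subset of {0, ..., N-1}, stored as a bitmask


def creation(i):
    """Matrix of a*(e_i), i.e. wedging with e_i, on bitmask-labelled monomials."""
    m = [[0] * DIM for _ in range(DIM)]
    for mask in range(DIM):
        if not (mask >> i) & 1:
            # sign from moving e_i past the factors e_j with j < i
            sign = (-1) ** bin(mask & ((1 << i) - 1)).count("1")
            m[mask | (1 << i)][mask] = sign
    return m


def mul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(DIM)) for c in range(DIM)]
            for r in range(DIM)]


def lin(*terms):
    """Linear combination lin((c1, M1), (c2, M2), ...)."""
    return [[sum(c * m[r][s] for c, m in terms) for s in range(DIM)]
            for r in range(DIM)]


def dagger(a):
    return [[complex(a[c][r]).conjugate() for c in range(DIM)] for r in range(DIM)]


def anti(a, b):
    return lin((1, mul(a, b)), (1, mul(b, a)))


def equal(a, b):
    return all(abs(a[r][c] - b[r][c]) < 1e-12 for r in range(DIM) for c in range(DIM))


IDENTITY = [[1 if r == c else 0 for c in range(DIM)] for r in range(DIM)]
A_STAR = [creation(i) for i in range(N)]          # a*(e_i), Eq. (1.4)
A = [dagger(m) for m in A_STAR]                   # a(e_i) as the adjoint, Eq. (1.5)
GAMMA = [lin((1, A_STAR[i]), (-1, A[i])) for i in range(N)]        # Eq. (1.8)
GAMMA_BAR = [lin((1j, A_STAR[i]), (1j, A[i])) for i in range(N)]   # Eq. (1.9)


def scalar(c):
    return lin((c, IDENTITY))


checks = {"(1.6)": True, "(1.7)": True, "(1.10)": True, "(1.11)": True}
for i in range(N):
    for j in range(N):
        delta = 1 if i == j else 0
        checks["(1.6)"] &= equal(anti(A[i], A[j]), scalar(0))
        checks["(1.6)"] &= equal(anti(A_STAR[i], A_STAR[j]), scalar(0))
        checks["(1.7)"] &= equal(anti(A[i], A_STAR[j]), scalar(delta))
        checks["(1.10)"] &= equal(anti(GAMMA[i], GAMMA[j]), scalar(-2 * delta))
        checks["(1.10)"] &= equal(anti(GAMMA_BAR[i], GAMMA_BAR[j]), scalar(-2 * delta))
        checks["(1.11)"] &= equal(anti(GAMMA[i], GAMMA_BAR[j]), scalar(0))

for name, ok in checks.items():
    print(name, "holds" if ok else "FAILS")
```

The bitmask encoding is merely a convenient design choice; any faithful matrix representation of the exterior algebra would serve the same purpose.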
A connection $\nabla^S$ on $S$ is called a spin$^c$ connection iff it satisfies
\[ \nabla^S_X \big( c(\eta)\psi \big) = c(\nabla_X \eta)\,\psi + c(\eta)\,\nabla^S_X \psi \tag{1.17} \]
for any vector field $X$, any 1-form $\eta$ and any section $\psi \in \Gamma(S)$, where $\nabla$ is a connection on $T^*M$. We say that $\nabla^S$ is compatible with the Levi-Civita connection iff, in (1.17), $\nabla = \nabla^{L.C.}$. If $\nabla^S_1$ and $\nabla^S_2$ are two spin$^c$ connections compatible with $\nabla^{L.C.}$ then
\[ \big( \nabla^S_1 - \nabla^S_2 \big)\psi = i\,\alpha \otimes \psi \tag{1.18} \]
for some real 1-form $\alpha \in \Omega^1(M)$. In physics, $\alpha$ is the difference of two electromagnetic vector potentials. If $R_{\nabla^S}$ denotes the curvature of a spin$^c$ connection $\nabla^S$ then
\[ 2^{-[n/2]}\, \mathrm{tr}\big( R_{\nabla^S}(X, Y) \big) = F_A(X, Y) \tag{1.19} \]
for arbitrary vector fields $X, Y$, where $F_A$ is the curvature ("the electromagnetic field tensor") of a virtual U(1)-connection $A$ ("electromagnetic vector potential") on a virtual line bundle canonically associated to $S$; see Sect. 2.2.2 and the appendix. The (Pauli-)Dirac operator associated with a spin$^c$ connection $\nabla^S$ is defined by
\[ D_A = c \circ \nabla^S. \tag{1.20} \]
We are now prepared to say what is meant by Pauli's quantum mechanics of non-relativistic electrons. As a Hilbert space of pure state vectors one chooses $\mathcal{H}_e$, the space of square-integrable Pauli-Dirac spinors. The dynamics of an electron (with the gyromagnetic factor $g$, measuring the strength of the magnetic moment of the electron, set equal to 2) is generated by the Hamiltonian
\[ H_A = \frac{\hbar^2}{2m}\, D_A^2 + v = \frac{\hbar^2}{2m} \Big( -\Delta^S_A + \frac{r}{4} + c(F_A) \Big) + v, \tag{1.21} \]
where the electrostatic potential $v$ is a function on $M$ and $r$ denotes the scalar curvature. There are well-known sufficient conditions on $M$ and $v$ which guarantee that $H_A$ is self-adjoint and bounded from below on $\mathcal{H}_e$, and there are less well-known ones, see [FLL,LL,LY]. As an algebra of "observables" associated with a quantum mechanical electron, one chooses the algebra $\mathcal{A} = C^\infty(M)$, possibly enlarged to $\mathcal{A} = C(M)$.

One may ask how far the geometry of $M$ is encoded in the spectral triple $(\mathcal{A}, \mathcal{H}_e, D_A)$ associated with a non-relativistic electron. The answer given by Connes, see [Co1], is that $(\mathcal{A}, \mathcal{H}_e, D_A)$ encodes the differential topology and Riemannian geometry of $M$ completely. From a physics point of view, it is more natural to work with an algebra $\mathcal{F}_\hbar$ of pseudo-differential operators that acts on $\mathcal{H}_e$ and is invariant under the Heisenberg picture dynamics generated by $H_A$. Thus one should consider the triple $(\mathcal{F}_\hbar, \mathcal{H}_e, D_A)$. When $\mathcal{F}_\hbar$, $\mathcal{H}_e$ and $D_A$ are given as abstract data, some interesting mathematical problems connected with the reconstruction of $M$ remain to be solved. If $v = 0$ then
\[ H_A = \frac{\hbar^2}{2m}\, D_A^2, \tag{1.22} \]
i.e., $H_A$ is the square of a "supercharge" $Q = \sqrt{\frac{\hbar^2}{2m}}\, D_A$. If $M$ is even-dimensional then, as discussed above, the Riemannian volume form determines a section $\gamma$ of the Clifford bundle which is a unitary involution with the property that
\[ [\, \gamma, a \,] = 0 \ \ \text{for all } a \in \mathcal{A} \ (\text{or for all } a \in \mathcal{F}_\hbar) \quad\text{but}\quad \{ \gamma, Q \} = 0. \tag{1.23} \]
Thus $\gamma$ defines a $\mathbb{Z}_2$-grading. The data $(\mathcal{A}, \mathcal{H}_e, Q, \gamma)$ yield an example of what physicists call N = 1 supersymmetric quantum mechanics.

In order to describe the twin of Pauli's electron, the non-relativistic positron, we have to pass from $S$ to the charge-conjugate spinor bundle $\bar S$. The latter inherits a spin$^c$ connection $\nabla^{\bar S}$ from $S$, which can be defined, locally, by using the (local) isomorphism $S \cong \bar S$ and setting $\big( \nabla^S - \nabla^{\bar S} \big)\psi = 2i\, A \otimes \psi$ for $\psi \in S \cong \bar S$; precise formulas are given in Sect. 2.2.2. The space $\mathcal{H}_p$ of square integrable sections of $\bar S$ is, however, canonically isomorphic to $\mathcal{H}_e$, and thus the description of the positron involves the same algebra of observables, $\mathcal{A}$ or $\mathcal{F}_\hbar$, and the same Hilbert space, now denoted by $\mathcal{H}_p$; we only replace the operator $D_A$ by
\[ \bar D_A = \bar c \circ \nabla^{\bar S} \tag{1.24} \]
and set
\[ \bar H_A = \frac{\hbar^2}{2m}\, \bar D_A^2 - v. \tag{1.25} \]
The physical interpretation of these changes is simply that we have reversed the sign of the electric charge of the particle, keeping everything else, such as its mass $m$, unchanged.

The third character of the play is the (non-relativistic) positronium, the ground state of a bound pair of an electron and a positron. Here, "ground state" means that we ignore the relative motion of electron and positron. As an algebra of "observables", we continue to use $\mathcal{A}$. The Hilbert space of pure state vectors of the positronium ground state is
\[ \mathcal{H}_{e-p} = \mathcal{H}_p \otimes_{\mathcal{A}} \mathcal{H}_e \quad \text{if } \dim M \text{ is even}, \tag{1.26} \]
\[ \mathcal{H}_{e-p} = \big( \mathcal{H}_p \otimes_{\mathcal{A}} \mathcal{H}_e \big)_+ \oplus \big( \mathcal{H}_p \otimes_{\mathcal{A}} \mathcal{H}_e \big)_- \cong \mathcal{H}_p \otimes_{\mathcal{A}} \mathcal{H}_e \otimes \mathbb{C}^2 \quad \text{if } \dim M \text{ is odd}. \tag{1.27} \]
Elements in $( \mathcal{H}_p \otimes_{\mathcal{A}} \mathcal{H}_e )_+$ are even, elements in $( \mathcal{H}_p \otimes_{\mathcal{A}} \mathcal{H}_e )_-$ are odd under reversing the orientation of $M$, i.e. under space reflection.

We can define a connection $\tilde\nabla$ on $\mathcal{H}_{e-p}$ as follows: If $\phi \in \mathcal{H}_{e-p}$ is given by $\phi = \psi_1 \otimes \psi_2 \,(\otimes u)$, $\psi_1 \in \mathcal{H}_p$, $\psi_2 \in \mathcal{H}_e$, ($u \in \mathbb{C}^2$), we set
\[ \tilde\nabla \phi = \big( \nabla^{\bar S} \psi_1 \big) \otimes \psi_2 \,(\otimes u) + \psi_1 \otimes \big( \nabla^S \psi_2 \big) \,(\otimes u). \tag{1.28} \]
Given $\nabla^S$, this defines $\tilde\nabla$ uniquely, and using the intertwiners (1.13,15) it turns out that in fact $\tilde\nabla = \nabla^{L.C.}$, see Lemma 2.11 below. Observe that $\tilde\nabla$ is independent of the virtual U(1)-connection $A$, which, physically, is related to the fact that the electric charge of positronium is zero.
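We can now introduce two first order differential operators on $\mathcal{H}_{e-p}$,
\[ \mathcal{D} = \Gamma \circ \tilde\nabla, \qquad \overline{\mathcal{D}} = \bar\Gamma \circ \tilde\nabla; \tag{1.29} \]
here we have identified the Clifford actions $1 \otimes c$ and $\bar c \otimes \gamma$ (if $M$ is odd-dimensional: the Clifford actions $1 \otimes c \otimes \tau_3$ and $\bar c \otimes 1 \otimes \tau_1$) with $\Gamma$ and $\bar\Gamma$, respectively, by virtue of Eqs. (1.14,16). The details of the construction of $\mathcal{D}$ are explained in Sect. 2.2.2; see also [FGR] for an extension to the non-commutative case. If $\nabla^S$ is compatible with the Levi-Civita connection then $\mathcal{D}$ and $\overline{\mathcal{D}}$ satisfy the algebra
\[ \{ \mathcal{D}, \overline{\mathcal{D}} \} = 0, \qquad \mathcal{D}^2 = \overline{\mathcal{D}}^2 \tag{1.30} \]
and are (formally) self-adjoint on $\mathcal{H}_{e-p}$. The quantum theory of positronium is formulated in terms of the algebra of "observables" $\mathcal{A}$, the Hilbert space $\mathcal{H}_{e-p}$ and the Hamiltonian
\[ H = \frac{\hbar^2}{2\mu}\, \mathcal{D}^2 = \frac{\hbar^2}{2\mu}\, \overline{\mathcal{D}}^2, \tag{1.31} \]
assuming (1.30), i.e., that $\nabla^S$ is compatible with the Levi-Civita connection; $\mu = 2m$ is the mass of the positronium. The Weitzenböck formula says that
\[ H = \frac{\hbar^2}{2\mu} \Big( -\Delta + \frac{r}{4} - \frac{1}{8}\, R_{ijkl}\, \Gamma^i \Gamma^j \bar\Gamma^k \bar\Gamma^l \Big), \tag{1.32} \]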
where $\Delta = g^{ij} \big( \nabla_i \nabla_j - \Gamma^k_{ij} \nabla_k \big)$ in terms of the Christoffel symbols $\Gamma^k_{ij}$ of the Levi-Civita connection, where $r$ is the scalar curvature and $R_{ijkl}$ are the components of the Riemann curvature tensor in local coordinates $q^j$; finally, $\Gamma^j = \Gamma(dq^j)$, and analogously for $\bar\Gamma^j$. In (1.32), the summation convention has been used.

Of course, we have seen in (1.13–16) that $\mathcal{H}_{e-p}$ is the Hilbert space of square-integrable (complexified) differential forms; hence it is no surprise that the Hamiltonian $H$ of positronium is proportional to the usual Laplacian on differential forms. It is convenient to introduce operators $\mathrm{d}$ and $\mathrm{d}^*$ given by
\[ \mathrm{d} = \frac{1}{2} \big( \mathcal{D} - i\, \overline{\mathcal{D}} \big), \qquad \mathrm{d}^* = \frac{1}{2} \big( \mathcal{D} + i\, \overline{\mathcal{D}} \big). \tag{1.33} \]
Then the relations (1.30) show that
\[ \mathrm{d}^2 = (\mathrm{d}^*)^2 = 0 \tag{1.34} \]
and
\[ \frac{\hbar^2}{2\mu} \big( \mathrm{d}\, \mathrm{d}^* + \mathrm{d}^* \mathrm{d} \big) = H. \tag{1.35} \]
Using (1.8,9,29,33), one sees that
\[ \mathrm{d} = a^* \circ \tilde\nabla = \mathsf{A} \circ \tilde\nabla, \tag{1.36} \]
where $a^*$ is defined in (1.4) and $\mathsf{A}$ denotes anti-symmetrization. In local coordinates, $\mathrm{d} = a^*(dq^j)\, \tilde\nabla_j$. Since the torsion $T(\tilde\nabla)$ of a connection $\tilde\nabla$ on $\Omega^\bullet(M)$ is defined by
\[ T(\tilde\nabla) = d - \mathsf{A} \circ \tilde\nabla, \tag{1.37} \]
where $d$ denotes exterior differentiation (to be distinguished from the operator $\mathrm{d}$ of (1.33)), we conclude that
\[ \mathrm{d} = d \iff T(\tilde\nabla) = 0 \implies \text{relations (1.30) hold} \]
and that vice versa Eqs. (1.30) imply $\mathrm{d} = d$ if we additionally assume that $\tilde\nabla$ is a metric connection, which guarantees that $\mathcal{D}$ and $\overline{\mathcal{D}}$ are symmetric operators on $\mathcal{H}_{e-p}$. Thus, $\mathrm{d} = d$ is exterior differentiation precisely if $\tilde\nabla$ is the Levi-Civita connection $\nabla^{L.C.}$ on $\mathcal{H}_{e-p}$. It follows that the quantum mechanics of positronium can be formulated on general orientable Riemannian manifolds $(M, g)$ which need not be spin$^c$.

One easily verifies that, no matter whether the dimension of $M$ is even or odd, there always exists a $\mathbb{Z}_2$-grading $\gamma$ on $\mathcal{H}_{e-p}$ such that
\[ \{ \gamma, \mathrm{d} \} = \{ \gamma, \mathrm{d}^* \} = 0, \qquad [\, \gamma, a \,] = 0 \tag{1.38} \]
for all $a \in \mathcal{A}$. The operator $\gamma$ has eigenvalue $+1$ on even-degree and $-1$ on odd-degree differential forms. The spectral data $(\mathcal{A}, \mathcal{H}_{e-p}, \mathrm{d}, \mathrm{d}^*, \gamma)$ define an example of what physicists call N = (1, 1) supersymmetric quantum mechanics: There are two supercharges $\mathrm{d}$ and $\mathrm{d}^*$ (or $\mathcal{D}$ and $\overline{\mathcal{D}}$) satisfying the algebra (1.34,35) (or (1.30), respectively).

When $\mathrm{d} = d$ (exterior differentiation) the $\mathbb{Z}_2$-grading $\gamma$ can be replaced by a $\mathbb{Z}$-grading $T_{tot}$ counting the total degree of differential forms, and one can add to the spectral data described so far a unitary Hodge operator $*$ such that $[\, *, a \,] = 0$ for all $a \in \mathcal{A}$ and $* \, \mathrm{d} = \zeta\, \mathrm{d}^* *$, where $\zeta$ is a phase factor. The spectral data
\[ (\mathcal{A}, \mathcal{H}_{e-p}, \mathrm{d}, \mathrm{d}^*, T_{tot}, *) \tag{1.39} \]
are said to define a model of N = 2 supersymmetric quantum mechanics. It is important to distinguish N = (1, 1) from N = 2 supersymmetry: Every N = 2 supersymmetry is an N = (1, 1) supersymmetry, but the converse does not usually hold, even in the context of classical geometry.

This is seen by considering geometry with torsion: Let $\nabla^S$ be a spin connection with torsion, i.e. the connection $\nabla$ on $T^*M$ determined by $\nabla^S$ as in Eq. (1.17) has non-vanishing torsion $T(\nabla)$. We then redefine what we mean by the (charge-conjugate) connection $\nabla^{\bar S}$, namely (locally) $\big( \nabla^S - \nabla^{\bar S} \big)\psi = 2i A \otimes \psi + c(T(\nabla)) \otimes \psi$ for all $\psi \in \mathcal{H}_e$; in local coordinates,
\[ c(T(\nabla)) \otimes \psi = \frac{1}{2}\, dq^i \otimes T_{ijk}\, c(dq^j)\, c(dq^k)\, \psi. \]
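Returning briefly to the torsion-free case, the algebra (1.30) and its consequences (1.33) to (1.36) can be sanity-checked in a flat toy model. The sketch below is an illustration under assumptions that are not made in the text: $M$ is taken to be a flat 3-torus, the operators act on a single plane-wave mode $e^{ik\cdot q}$ so that $\tilde\nabla_j$ becomes multiplication by $i k_j$, and $\hbar^2/2\mu$ is set to 1. The helper names and the sample momentum `K` are hypothetical.

```python
# Toy check of (1.30), (1.33)-(1.36) on one Fourier mode of a flat 3-torus.
# Assumptions: flat metric, nabla_j -> i * k_j on the mode exp(i k.q), hbar^2 / (2 mu) = 1.
N = 3
DIM = 1 << N
K = (1, 2, -1)  # sample momentum, chosen arbitrarily


def creation(i):
    """Matrix of a*(e_i) (wedge with e_i) on bitmask-labelled monomials."""
    m = [[0] * DIM for _ in range(DIM)]
    for mask in range(DIM):
        if not (mask >> i) & 1:
            sign = (-1) ** bin(mask & ((1 << i) - 1)).count("1")
            m[mask | (1 << i)][mask] = sign
    return m


def mul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(DIM)) for c in range(DIM)]
            for r in range(DIM)]


def lin(*terms):
    """Linear combination lin((c1, M1), (c2, M2), ...)."""
    return [[sum(c * m[r][s] for c, m in terms) for s in range(DIM)]
            for r in range(DIM)]


def dagger(a):
    return [[complex(a[c][r]).conjugate() for c in range(DIM)] for r in range(DIM)]


def anti(a, b):
    return lin((1, mul(a, b)), (1, mul(b, a)))


def equal(a, b):
    return all(abs(a[r][c] - b[r][c]) < 1e-12 for r in range(DIM) for c in range(DIM))


IDENTITY = [[1 if r == c else 0 for c in range(DIM)] for r in range(DIM)]
ZERO = lin((0, IDENTITY))

a_star_k = lin(*[(K[i], creation(i)) for i in range(N)])   # a*(k)
a_k = dagger(a_star_k)                                      # a(k)
gamma_k = lin((1, a_star_k), (-1, a_k))                     # Gamma(k), Eq. (1.8)
gamma_bar_k = lin((1j, a_star_k), (1j, a_k))                # Gamma-bar(k), Eq. (1.9)

D = lin((1j, gamma_k))          # Gamma o nabla on the mode, Eq. (1.29)
D_bar = lin((1j, gamma_bar_k))  # Gamma-bar o nabla on the mode
d = lin((0.5, D), (-0.5j, D_bar))      # Eq. (1.33)
d_star = lin((0.5, D), (0.5j, D_bar))  # Eq. (1.33)
k_squared = sum(x * x for x in K)

checks = {
    "D, D_bar self-adjoint": equal(D, dagger(D)) and equal(D_bar, dagger(D_bar)),
    "(1.30) {D, D_bar} = 0": equal(anti(D, D_bar), ZERO),
    "(1.30) D^2 = D_bar^2": equal(mul(D, D), mul(D_bar, D_bar)),
    "(1.33) d* is the adjoint of d": equal(d_star, dagger(d)),
    "(1.34) d^2 = (d*)^2 = 0": equal(mul(d, d), ZERO) and equal(mul(d_star, d_star), ZERO),
    "(1.35) d d* + d* d = D^2": equal(anti(d, d_star), mul(D, D)),
    "(1.36) d = i a*(k)": equal(d, lin((1j, a_star_k))),
    "D^2 = |k|^2": equal(mul(D, D), lin((k_squared, IDENTITY))),
}
for name, ok in checks.items():
    print(name, "holds" if ok else "FAILS")
```

In this flat setting $\mathrm{d}$ reduces to $i\, a^*(k)$, the Fourier symbol of exterior differentiation, which is the torsion-free statement $\mathrm{d} = d$ in its simplest form.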
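The connection $\tilde\nabla$ on $\mathcal{H}_{e-p}$ is defined in terms of $\nabla^S$ and $\nabla^{\bar S}$ as in (1.28). We assume that $d\, T(\nabla) = 0$ and introduce two Dirac operators $\mathcal{D}$ and $\overline{\mathcal{D}}$ by
\[ \mathcal{D} = \Gamma \circ \tilde\nabla - \frac{1}{6}\, \Gamma(T(\nabla)) \tag{1.40} \]
and
\[ \overline{\mathcal{D}} = \bar\Gamma \circ \tilde\nabla + \frac{1}{6}\, \bar\Gamma(T(\nabla)). \tag{1.41} \]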
Then the N = (1, 1) algebra (1.30) holds, but there is no natural $\mathbb{Z}$-grading $T_{tot}$ with the property that $[\, T_{tot}, \mathrm{d} \,] = \mathrm{d}$ and $[\, T_{tot}, \mathrm{d}^* \,] = -\mathrm{d}^*$, for $\mathrm{d}$ and $\mathrm{d}^*$ as in (1.33). One can again derive a Weitzenböck formula; it can be used to show that, in various examples, the Hamiltonian $H = \frac{\hbar^2}{2\mu} \mathcal{D}^2$ is strictly positive, in which case one says that the supersymmetry is spontaneously broken [Wi1]. Then the indices of $\mathcal{D}$ and $\overline{\mathcal{D}}$ vanish, which has implications for the topology of the manifold $M$. The present example is discussed in some detail in [FG].

One can verify that the de Rham-Hodge theory and the differential geometry of a Riemannian manifold $(M, g)$ are completely encoded in the N = 2 set of spectral data $(\mathcal{A}, \mathcal{H}_{e-p}, \mathrm{d}, \mathrm{d}^*, T_{tot}, *)$. This theme is developed in great detail in Sect. 2.2. For purposes of physics (in particular, in analyzing the geometry of quantum field theory and string theory), it would be desirable to replace $\mathcal{A}$ by a suitable algebra $\mathcal{F}_\hbar$ of pseudo-differential operators on $M$. The resulting change in perspective will be discussed in future work.

Mathematicians not interested in quantum physics may ask what one gains by reformulating differential topology and geometry in terms of spectral data, such as those provided by N = 1 (electron) or N = (1, 1) (positronium) supersymmetric quantum mechanics, beyond a slick algebraic reformulation. The answer, as emphasized by Connes, is: generality! Supersymmetric quantum mechanics enables us to study highly singular spaces or discrete objects, like graphs, lattices and aperiodic tilings (see e.g. [Co1]), and also non-commutative spaces, like quantum groups, as geometric spaces, and to extend standard constructions and tools of algebraic topology or of differential geometry to this more general context, so as to yield non-trivial results. The general theory is developed and exemplified in [FGR].

Below, we shall argue that quantum physics actually forces us to generalize the basic notions and concepts of geometry. The principle that the time evolution of a quantum mechanical system is a one-parameter unitary group on a Hilbert space, whose generator is the Hamiltonian of the system (a self-adjoint operator), entails that the study of supersymmetric quantum mechanics is the study of metric geometry. Let us ask then how we would study manifolds like symplectic manifolds that are, a priori, not endowed with a metric. The example of symplectic manifolds is instructive, so we sketch what one does (see also Sect. 2.6 and [FGR]).

Let $(M, \omega)$ be a symplectic manifold. The symplectic form $\omega$ is a globally defined closed 2-form. It is known that every symplectic manifold can be equipped with an almost complex structure $J$ such that the tensor $g$ defined by
\[ g(X, Y) = -\omega(JX, Y), \tag{1.42} \]
for all vector fields $X, Y$, is a Riemannian metric on $M$. Thus, we can study the Riemannian manifold $(M, g)$, with $g$ from (1.42), by exploring the quantum mechanical propagation of e.g. positronium on $M$, using the spectral data $(\mathcal{A}, \mathcal{H}_{e-p}, \mathrm{d}, \mathrm{d}^*, T_{tot}, *)$ of N = 2 supersymmetric quantum mechanics, with $\mathcal{A} = C^\infty(M)$ (or $\mathcal{A} = C(M)$, depending on the smoothness of $M$). We must ask how these data "know" that $M$ is symplectic. The answer, developed in Sect. 2.6, is as follows: We can view the $\mathbb{Z}$-grading $T_{tot}$ as the generator of a U(1)-symmetry (a "global gauge symmetry") of the system. It may happen that this symmetry can be enlarged to an SU(2)-symmetry, with generators $L^1, L^2, L^3$ acting on $\mathcal{H}_{e-p}$ which commute with all elements of $\mathcal{A}$ and have the following additional properties:

i) $L^3 = T_{tot} - \frac{n}{2}$ with $n = \dim M$.

Defining $L^\pm = L^1 \pm i L^2$, the structure equations of su(2) = Lie(SU(2)) imply that

ii) $[\, L^3, L^\pm \,] = \pm 2 L^\pm$, $[\, L^+, L^- \,] = L^3$,

and, since in quantum mechanics symmetries are represented unitarily,

iii) $(L^3)^* = L^3$, $(L^\pm)^* = L^\mp$.

We also assume that

iv) $[\, L^+, \mathrm{d} \,] = 0$,

hence $L^-$ commutes with $\mathrm{d}^*$ by property iii). Next we define an operator $\tilde{\mathrm{d}}^*$ by
\[ \tilde{\mathrm{d}}^* = [\, L^-, \mathrm{d} \,]; \tag{1.43} \]
it satisfies $[\, L^+, \tilde{\mathrm{d}}^* \,] = \mathrm{d}$ because of ii) and iv), and also $\{ \tilde{\mathrm{d}}^*, \mathrm{d} \} = 0$, since $\mathrm{d}$ is nilpotent. Assuming, moreover, that

v) $[\, L^-, \tilde{\mathrm{d}}^* \,] = 0$

we find that $(\mathrm{d}, \tilde{\mathrm{d}}^*)$ transforms as a doublet under the adjoint action of $L^3, L^+, L^-$ and that $\tilde{\mathrm{d}}^*$ is nilpotent. Thus, $(\tilde{\mathrm{d}}, -\mathrm{d}^*)$ with $\tilde{\mathrm{d}} = (\tilde{\mathrm{d}}^*)^*$ is an SU(2)-doublet, too, and $\tilde{\mathrm{d}}^2 = 0$. The theorem is that the spectral data
\[ (\mathcal{A}, \mathcal{H}_{e-p}, \mathrm{d}, \mathrm{d}^*, \{ L^3, L^+, L^- \}, *) \tag{1.44} \]
with properties i)–v) assumed to be valid, encode the geometry of a symplectic manifold $(M, \omega)$ equipped with the metric $g$ defined in (1.42). The identifications are as follows:
\[ L^3 = T_{tot} - \frac{n}{2}, \qquad L^+ = \omega \wedge = \frac{1}{2}\, \omega_{ij}\, a^*(dq^i)\, a^*(dq^j), \qquad L^- = (L^+)^* = \frac{1}{2}\, (\omega^{-1})^{ij}\, a(\partial_i)\, a(\partial_j). \]
Assumption iv) is equivalent to $d\omega = 0$. Further details can be found in Sect. 2.6. We say that the spectral data (1.44) define N = 4$^s$ supersymmetric quantum mechanics, because there are four "supersymmetry generators" $\mathrm{d}, \tilde{\mathrm{d}}^*, \tilde{\mathrm{d}}, \mathrm{d}^*$; the superscript $s$ stands for "symplectic". Note that we are not claiming that
\[ \{ \mathrm{d}, \tilde{\mathrm{d}} \} = 0 \tag{1.45} \]
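because this equation does, in general, not hold. However, if it holds then $(M, \omega)$ is in fact a Kähler manifold, with the $J$ from Eq. (1.42) as its complex structure and $\omega$ as its Kähler form. Defining
\[ \partial = \frac{1}{2} \big( \mathrm{d} - i\, \tilde{\mathrm{d}} \big), \qquad \bar\partial = \frac{1}{2} \big( \mathrm{d} + i\, \tilde{\mathrm{d}} \big), \tag{1.46} \]
one finds that, thanks to Eqs. (1.43, 45) and because $\mathrm{d}$ and $\tilde{\mathrm{d}}$ are nilpotent,
\[ \partial^2 = \bar\partial^2 = 0, \qquad \{ \partial, \partial^* \} = \{ \bar\partial, \bar\partial^* \}. \tag{1.47} \]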
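Properties ii), iv), v) and the relations (1.43), (1.45), (1.47) can be illustrated in the simplest Kähler example. The sketch below is a toy computation under assumptions that go beyond the text: $M$ is a flat torus of dimension 4 with $\omega = dq^0 \wedge dq^1 + dq^2 \wedge dq^3$, the operators act on one plane-wave mode so that $\mathrm{d}$ becomes $i\, a^*(k)$, and $L^-$ is computed as the adjoint of $L^+ = \omega \wedge$. The helper names, the pairing `PAIRS` and the sample momentum `K` are hypothetical.

```python
# Toy check of ii), iv), v), (1.43), (1.45), (1.47) on one Fourier mode of a flat 4-torus.
# Assumptions: omega = dq^0 ∧ dq^1 + dq^2 ∧ dq^3 (constant), d -> i a*(k).
N = 4
DIM = 1 << N
PAIRS = [(0, 1), (2, 3)]  # conjugate coordinate pairs entering omega
K = (1, 2, -1, 3)         # sample momentum, chosen arbitrarily


def creation(i):
    """Matrix of a*(e_i) (wedge with e_i) on bitmask-labelled monomials."""
    m = [[0] * DIM for _ in range(DIM)]
    for mask in range(DIM):
        if not (mask >> i) & 1:
            sign = (-1) ** bin(mask & ((1 << i) - 1)).count("1")
            m[mask | (1 << i)][mask] = sign
    return m


def mul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(DIM)) for c in range(DIM)]
            for r in range(DIM)]


def lin(*terms):
    """Linear combination lin((c1, M1), (c2, M2), ...)."""
    return [[sum(c * m[r][s] for c, m in terms) for s in range(DIM)]
            for r in range(DIM)]


def dagger(a):
    return [[complex(a[c][r]).conjugate() for c in range(DIM)] for r in range(DIM)]


def comm(a, b):
    return lin((1, mul(a, b)), (-1, mul(b, a)))


def anti(a, b):
    return lin((1, mul(a, b)), (1, mul(b, a)))


def equal(a, b):
    return all(abs(a[r][c] - b[r][c]) < 1e-12 for r in range(DIM) for c in range(DIM))


IDENTITY = [[1 if r == c else 0 for c in range(DIM)] for r in range(DIM)]
ZERO = lin((0, IDENTITY))
A_STAR = [creation(i) for i in range(N)]
A = [dagger(m) for m in A_STAR]

T_tot = lin(*[(1, mul(A_STAR[i], A[i])) for i in range(N)])        # form degree
L3 = lin((1, T_tot), (-N / 2, IDENTITY))                            # property i)
L_plus = lin(*[(1, mul(A_STAR[x], A_STAR[y])) for x, y in PAIRS])   # omega ∧
L_minus = dagger(L_plus)                                            # property iii)

d = lin(*[(1j * K[i], A_STAR[i]) for i in range(N)])  # exterior derivative on the mode
d_tilde_star = comm(L_minus, d)                        # Eq. (1.43)
d_tilde = dagger(d_tilde_star)
partial = lin((0.5, d), (-0.5j, d_tilde))              # Eq. (1.46)
partial_bar = lin((0.5, d), (0.5j, d_tilde))

checks = {
    "ii) [L3, L+] = 2 L+": equal(comm(L3, L_plus), lin((2, L_plus))),
    "ii) [L3, L-] = -2 L-": equal(comm(L3, L_minus), lin((-2, L_minus))),
    "ii) [L+, L-] = L3": equal(comm(L_plus, L_minus), L3),
    "iv) [L+, d] = 0": equal(comm(L_plus, d), ZERO),
    "[L+, d~*] = d": equal(comm(L_plus, d_tilde_star), d),
    "{d~*, d} = 0": equal(anti(d_tilde_star, d), ZERO),
    "v) [L-, d~*] = 0": equal(comm(L_minus, d_tilde_star), ZERO),
    "(1.45) {d, d~} = 0": equal(anti(d, d_tilde), ZERO),
    "(1.47) nilpotency": equal(mul(partial, partial), ZERO)
                         and equal(mul(partial_bar, partial_bar), ZERO),
    "(1.47) equal Laplacians": equal(anti(partial, dagger(partial)),
                                     anti(partial_bar, dagger(partial_bar))),
}
for name, ok in checks.items():
    print(name, "holds" if ok else "FAILS")
```

Since the flat torus with constant $\omega$ is Kähler, (1.45) holds here; the example therefore illustrates the Kähler case rather than a generic symplectic manifold.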
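There is a useful alternative way of saying what it is that identifies a symplectic manifold $(M, \omega)$ as a Kähler manifold: Eq. (1.45) is a consequence of the assumption that an N = 4$^s$ supersymmetric quantum mechanical model has an additional U(1)-symmetry, which, in physics jargon, one is tempted to call a "global chiral U(1)-gauge symmetry": We define
\[ \mathrm{d}_\theta = \cos\theta\; \mathrm{d} + \sin\theta\; \tilde{\mathrm{d}}, \qquad \tilde{\mathrm{d}}_\theta = -\sin\theta\; \mathrm{d} + \cos\theta\; \tilde{\mathrm{d}}, \tag{1.48} \]
and assume that $(\mathrm{d}_\theta, \tilde{\mathrm{d}}^*_\theta)$ and $(\tilde{\mathrm{d}}_\theta, -\mathrm{d}^*_\theta)$ are again SU(2)-doublets with the same properties as $(\mathrm{d}, \tilde{\mathrm{d}}^*)$ and $(\tilde{\mathrm{d}}, -\mathrm{d}^*)$, for all real angles $\theta$. Then the nilpotency of $\mathrm{d}$, $\tilde{\mathrm{d}}$ and of $\mathrm{d}_\theta$ for all $\theta$ implies Eq. (1.45). Furthermore
\[ \partial_\theta = \frac{1}{2} \big( \mathrm{d}_\theta - i\, \tilde{\mathrm{d}}_\theta \big) = e^{i\theta}\, \partial, \qquad \bar\partial_\theta = \frac{1}{2} \big( \mathrm{d}_\theta + i\, \tilde{\mathrm{d}}_\theta \big) = e^{-i\theta}\, \bar\partial. \]
Assuming that the symmetry (1.48) is implemented by a one-parameter unitary group on $\mathcal{H}_{e-p}$ with an infinitesimal generator denoted by $J_0$, we find that
\[ [\, J_0, \mathrm{d} \,] = -i\, \tilde{\mathrm{d}}, \qquad [\, J_0, \tilde{\mathrm{d}} \,] = i\, \mathrm{d}. \tag{1.49} \]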
Geometrically, $J_0$ can be expressed in terms of the complex structure $J$ on a Kähler manifold: it is bilinear in $a^*$ and $a$ with coefficients given by $J$. Defining
\[ T := \frac{1}{2} \big( L^3 + J_0 \big), \qquad \bar T := \frac{1}{2} \big( L^3 - J_0 \big), \tag{1.50} \]
one checks that
\[ [\, T, \partial \,] = \partial, \qquad [\, T, \bar\partial \,] = 0, \qquad [\, \bar T, \partial \,] = 0, \qquad [\, \bar T, \bar\partial \,] = \bar\partial. \tag{1.51} \]
Thus $T$ is the holomorphic and $\bar T$ the anti-holomorphic $\mathbb{Z}$-grading of complex differential forms, see Sect. 2.4. The spectral data
\[ (\mathcal{A}, \mathcal{H}_{e-p}, \mathrm{d}, \mathrm{d}^*, \{ L^3, L^+, L^- \}, J_0, *) \tag{1.52} \]
belong to N = 4$^+$ supersymmetric quantum mechanics. We have seen that they contain the spectral data
\[ (\mathcal{A}, \mathcal{H}_{e-p}, \partial, \partial^*, \bar\partial, \bar\partial^*, T, \bar T, *) \tag{1.53} \]
characterizing Kähler manifolds. We say that these define N = (2, 2) supersymmetric quantum mechanics. If one drops the requirement that $\partial$ anti-commutes with $\bar\partial^*$ (amounting to the breaking of the SU(2) symmetry generated by $L^3, L^+, L^-$) the data (1.53) characterize complex Hermitian manifolds, see Sect. 2.4.2.

Alternatively, complex Hermitian manifolds can be described by N = 2 spectral data as in Eq. (1.39) with an additional U(1) symmetry generated by a self-adjoint operator $J_0$ with the property that $\tilde{\mathrm{d}} := i\, [\, J_0, \mathrm{d} \,]$ is nilpotent, and different from $\mathrm{d}$. Then $\tilde{\mathrm{d}}$ and $\mathrm{d}$ anti-commute, and one may define $\partial$ and $\bar\partial$ through Eqs. (1.46). One verifies that
\[ \partial^2 = \bar\partial^2 = 0 \quad\text{and}\quad \{ \partial, \bar\partial \} = 0. \]
Having proceeded thus far, one might think that on certain Kähler manifolds with special properties the U(1) symmetries generated by T and $\bar T$ are embedded into SU(2) symmetries with generators $T^3 = T, T^+, T^-$ (analogously for the anti-holomorphic generators) which satisfy properties i) through v) from above, with d and $d^*$ replaced by $\partial$ and $\partial^*$, and such that $\tilde\partial^* = [\, T^-, \partial \,]$ – as well as analogous relations for the anti-holomorphic generators.
Alternatively, one might assume that, besides the SU(2) symmetry generated by $L^3, L^+, L^-$ there are actually two “chiral” U(1) symmetries with generators $I_0$ and $J_0$, enlarging the original U(1) symmetry. Indeed, this kind of symmetry enhancement can happen, and what one finds are spectral data characterizing Hyperkähler manifolds, see Sect. 2.5. The two ways of extending the SU(2) × U(1) symmetry of Kähler manifolds to larger symmetry groups characteristic of Hyperkähler manifolds are equivalent by a theorem of Beauville, see [Bes] and Sect. 2.5. The resulting spectral data define what is called N = (4, 4) supersymmetric quantum mechanics, having two sets of four supercharges, $\{ \partial, \tilde\partial^*, \partial^*, \tilde\partial \}$ and $\{ \bar\partial, \tilde{\bar\partial}^*, \bar\partial^*, \tilde{\bar\partial} \}$, with the property that each set transforms in the fundamental representation of Sp(4) – see Sect. 3 for the details. This yields the data of N = 8 supersymmetric quantum mechanics – from which we can climb on to N = (8, 8) or N = 16 supersymmetric quantum mechanics and enter the realm of very rigid geometries of symmetric spaces with special holonomy groups [Joy, Bes]. Of course, the operators
\[ I := \exp(-i\pi I_0) , \qquad J := \exp(-i\pi J_0) , \qquad K := IJ \tag{1.54} \]
in the group of “chiral symmetries” of the spectral data of N = (4, 4) supersymmetric quantum mechanics correspond to the three complex structures of Hyperkähler geometry. One may then try to go ahead and enlarge these “chiral” symmetries by adding further complex structures, see e.g. [WTN] and references therein for some formal considerations in this direction.
We could now make our journey through the land of geometry and supersymmetric quantum mechanics in reverse and pass from special (rigid) geometries, i.e., supersymmetric quantum mechanics with high symmetry, to more general ones by reducing the supersymmetry algebra. The passage from special to more general geometries then appears in the form of supersymmetry breaking in supersymmetric quantum mechanics (in a way that is apparent from our previous discussion). The symmetry generators in the formulation of geometry as supersymmetric quantum mechanics are bilinear expressions in the creation and annihilation operators $a^*$ and $a$ from Eqs. (1.4,5), with coefficients that are tensors of rank two. It is quite straightforward to find conditions that guarantee that such tensors generate symmetries and hence to understand what kind of deformations of geometry preserve or break the symmetries. Furthermore, the general transformation theory of quantum mechanics enables us to describe the deformation theory of the supersymmetry generators ($D_A$; $D, \bar D$; or $d, d^*$) including isospectral deformations (as unitary transformations). Deformations of d and $d^*$ played an important role in Witten’s proof of the Morse inequalities [Wi2] and in exploring geometries involving anti-symmetric tensor fields such as torsion – see the example described in Eqs. (1.40,41) – which are important in string theory.
We hope we have made our main point clear: Pauli’s quantum mechanics of the non-relativistic electron and of positronium on a general manifold (along with its internal symmetries) neatly encodes and classifies all types of differential geometry.
1.3. Supersymmetric quantum theory and geometry put into perspective.
A non-linear σ-model is a field theory of maps from a parameter “space-time” Σ to a target space M. Under suitable conditions on M, a non-linear σ-model can be extended to a supersymmetric theory. (See e.g. [FG] for a brief introduction to this subject.) One tends to imagine that such models can be quantized. When Σ is the real line R, this is indeed
possible, and one recovers supersymmetric quantum mechanics in the sense explained in the previous subsection. When $\Sigma = S^1 \times \mathbb R$, there is hope that quantization is possible, and one obtains an analytic tool to explore the infinite-dimensional geometry of loop space $M^{S^1}$. When Σ = L × R with dim L ≥ 2, the situation is far less clear, but a supersymmetric non-linear σ-model with parameter space-time L × R could be used to explore the geometry of $M^L$ – i.e., of poorly understood infinite-dimensional manifolds. A (quantized) supersymmetric σ-model with n global supersymmetries and with parameter space $\Sigma = T^d \times \mathbb R$, where $T^d$ is a d-dimensional torus, formally determines a model of supersymmetric quantum mechanics by dimensional reduction: The supersymmetry algebra of the non-linear σ-model with this Σ contains the algebra of infinitesimal translations on Σ; by restricting the theory to the Hilbert sub-space that carries the trivial representation of the group of translations of $T^d$ one obtains a model of supersymmetric quantum mechanics. The supersymmetry algebra can be reduced to this “zero-momentum” subspace, and the restricted algebra is of the form discussed in Sect. 1.2 (see also Sect. 3). Starting from $\hat n$ supersymmetries and a (d + 1)-dimensional parameter space-time Σ, one ends up with a model of N = (n, n) supersymmetric quantum mechanics where $n = \hat n$ for d = 1, 2, $n = 2\hat n$ for d = 3, 4, $n = 4\hat n$ for d = 5, 6 and $n = 8\hat n$ for d = 7, 8 (see e.g. [Au]). The resulting supersymmetric quantum mechanics (when restricted to an even smaller subspace of “zero modes”) is expected to encode the geometry of the target space M – in, roughly speaking, the sense outlined in Sect. 1.2. From what we have learned there, it follows at once that target spaces of σ-models with many supersymmetries or with a high-dimensional parameter space-time must have very special geometries. This insight is not new.
It has been gained in a number of papers, starting in the early eighties with work of Alvarez-Gaumé and Freedman [AGF], see also [HKLR, BSN, Ni, WN]. Supersymmetric quantum mechanics is a rather old idea, too, beginning from papers by Witten [Wi1, Wi2] – which, as is well known, had a lot of impact on mathematics. In later works, “supersymmetry proofs” of the index theorem were given [AG, FW, Ge]. The reader may find a few comments on the history of global supersymmetry in Chapter 3; for many more details see e.g. [WeB, Wes].
Our discussion in Sect. 1.2 has made it clear that, in the context of finite-dimensional classical manifolds, (globally) supersymmetric quantum mechanics – as it emerges from the study of non-relativistic electrons and positrons – is just another name for classical differential topology and geometry. Actually, this is a general fact: Global supersymmetry, whether in quantum mechanics or in quantum field theory, is just another name for the differential topology and geometry of (certain) spaces. In the following, we indicate why globally supersymmetric quantum field theory, too, is nothing more than geometry of infinite-dimensional spaces. However, once we pass from supersymmetric quantum mechanics to quantum field theory, there are surprises.
A (quantum) field theory of Bose fields can always be thought of as a σ-model (linear or non-linear), i.e., as a theory of maps from a parameter “space-time” Σ = L × R to a target space M. At the level of classical field theory, we may attempt to render such a model globally supersymmetric, and then to quantize it. As we have discussed above, the resulting quantum field theory – if it exists – provides us with the spectral data to explore the geometry of what one might conjecture to be some version of the formal infinite-dimensional manifold $M^L$.
It may happen that the quantum field theory exhibits some form of invariance under re-parametrizations of parameter space-time Σ (though such an invariance is often destroyed by anomalies, even if present at the classical level). However, when Σ = R, re-parametrization invariance can be imposed; when e.g.
$\Sigma = S^1 \times \mathbb R$, it leads us to the tree-level formulation of first-quantized string theory. Let us be content with “conformal invariance” and study (supersymmetric) conformal σ-models of maps from parameter space-time $\Sigma = S^1 \times \mathbb R$ to a target space M which, for concreteness, we choose to be a smooth, compact Riemannian manifold without boundary, at the classical level. An example would be M = G, some compact (simply laced) Lie group. The corresponding field theory is the supersymmetric Wess–Zumino–Witten model [Wi3], which is surprisingly well understood. Thanks to the theory of Kac–Moody algebras and centrally extended loop groups, see e.g. [GeW, FGK, PS], its quantum theory is under fairly complete mathematical control and provides us with the spectral data of some N = (1, 1) supersymmetric quantum theory (related to the example discussed in Eqs. (1.42,43) with “broken supersymmetry”; see e.g. [FG]). We might expect that these spectral data encode the geometry of the loop space $G^{S^1}$ over G. The surprise is, though, that they encode the geometry of loop space over a quantum deformation of G (where the deformation parameter depends on the level k of the Wess–Zumino–Witten model in such a way that, formally, k → ∞ corresponds to the classical limit). We expect that this is an example of a general phenomenon: Quantized supersymmetric σ-models with parameter space-time Σ of dimension d ≥ 2 – assuming that they exist – tend to provide us with the spectral data of a supersymmetric quantum theory which encodes the geometry of a “quantum deformation” of the target space of the underlying classical σ-model. This is one reason why quantum physics forces us to go beyond classical differential geometry.
A second, more fundamental subject which calls for “quantum geometry” is the outstanding problem of unifying the quantum theory of matter with the theory of gravity within a theory of quantum gravity.
It has been argued in [DFR] that in such a theory space-time coordinates should no longer commute with each other. In [DFR], it was shown how quantum field theory on a “non-commutative Minkowski space-time” can be formulated technically, in particular how to construct representations of the commutation relations which guarantee the uncertainty relations for space-time coordinates. Since the final theory of quantum gravity is unknown as yet, arguments for non-commutativity like the ones given in [DFR] (see also [F] for a somewhat different presentation) are necessarily heuristic. Nevertheless, they indicate that in order to formulate a fundamental theory of “space-time-matter” it is necessary to go beyond classical geometry. These remarks motivate the attempts presented in [FGR].
2. Algebraic Formulation of Differential Geometry
Historically, the first step towards an algebraic formulation of geometry was Gelfand’s Theorem showing that every unital abelian $C^*$-algebra A is isomorphic to the algebra C(X) of complex-valued continuous functions over some compact Hausdorff space X. The points of X can be identified with the maximal ideals of C(X). Swan’s Lemma, stating that every finite dimensional vector bundle E over a compact topological space X can equivalently be viewed as a finitely generated projective module over C(X), provided a further link between the theory of operator algebras and topology of spaces. Connes’ work made it possible to treat geometrical aspects by means of operator algebraic techniques and, more importantly, led to proposals of how to describe (compact) non-commutative spaces and their geometry by means of (unital) non-commutative *-algebras.
In this chapter, we shall give algebraic formulations of classical Riemannian, symplectic, Hermitian, Kähler, and Hyperkähler geometry which are tailor-made for generalization to the non-commutative setting. Compact Lie groups are discussed as a special example in section 2.3. Extensions to non-commutative geometry and examples will appear in [FGR].
2.1. The N = 1 formulation of Riemannian geometry.
In this section, we explain how to encode a Riemannian manifold into what will be called N = 1 spectral data. The special case of spin manifolds, which is more familiar to physicists, will be mentioned in section 2.1.2. Following the work of Connes, see e.g. [Co1], we briefly recall in section 2.1.3 how to reconstruct the manifold from the spectral data.
2.1.1. The N = 1 spectral data associated to a Riemannian manifold.
Let M be a smooth compact real manifold of dimension n with Riemannian metric g. (Throughout this paper, we restrict ourselves to compact manifolds, even when not mentioned explicitly.) The complexified cotangent bundle $T^*M^{\mathbb C}$ carries a symmetric C-bilinear form induced by g, so we can write $Cl_p(M)$ as a shorthand for the Clifford algebra $Cl(T_p^*M) = Cl_{\mathbb R}(T_p^*M) \otimes \mathbb C$ at p ∈ M, where the Riemannian metric enters the relations,
\[ \{ \omega, \eta \} := \omega\eta + \eta\omega = -2\, g(\omega, \eta) . \tag{2.1} \]
Some important facts about Clifford algebras and $\mathrm{spin}^c$ groups are collected in the appendix. The Clifford bundle Cl(M) is the bundle whose fibre at p is $Cl_p(M)$. The space of smooth sections Γ(Cl(M)) carries an algebra structure given by pointwise Clifford multiplication. The Levi–Civita connection ∇ on $T^*M^{\mathbb C}$ induces a connection on Cl(M), also denoted by ∇, such that
\[ \nabla_X (\sigma_1 \sigma_2) = (\nabla_X \sigma_1)\,\sigma_2 + \sigma_1 \nabla_X \sigma_2 \tag{2.2} \]
for all $\sigma_1, \sigma_2 \in \Gamma(Cl(M))$ and $X \in \Gamma(TM)$. Let $C^\infty(M)$ denote the algebra of smooth complex valued functions on M. A complex vector bundle S over M with Hermitian structure $\langle \cdot, \cdot \rangle_S$ is said to carry a unitary Clifford action if there is a $C^\infty(M)$-linear map $c : \Gamma(Cl(M)) \longrightarrow \Gamma(\mathrm{End}\, S)$ such that
\[ \{ c(\sigma), c(\sigma') \} = -2\, g(\sigma, \sigma') \cdot \mathrm{id}_S \quad \text{and} \quad \langle c(\sigma) s_1, s_2 \rangle_S + \langle s_1, c(\sigma) s_2 \rangle_S = 0 \tag{2.3} \]
for all $s_1, s_2 \in \Gamma(S)$ and all $\sigma, \sigma' \in \Gamma(T^*M)$. Then, a connection $\nabla^S$ on S is called a Clifford connection iff it is Hermitian and
\[ \nabla^S (c(\sigma) s) = c(\nabla\sigma)\, s + c(\sigma) \nabla^S s \tag{2.4} \]
for all $\sigma \in \Gamma(Cl(M))$, $s \in \Gamma(S)$, where ∇ denotes the Levi–Civita connection.
Definition 2.1. A Dirac bundle is a Hermitian vector bundle S together with a unitary Clifford action and a Clifford connection.
If S is a Dirac bundle then the Dirac operator on Γ(S) is defined by the composition of maps
\[ D : \Gamma(S) \xrightarrow{\ \nabla^S\ } \Gamma(T^*M^{\mathbb C} \otimes S) \xrightarrow{\ c\ } \Gamma(S) . \tag{2.5} \]
In local coordinates $x^\mu$ on M, and setting
\[ \gamma^\mu := c(dx^\mu) , \tag{2.6} \]
the Dirac operator reads, as usual,
\[ D = \gamma^\mu \nabla^S_\mu . \tag{2.7} \]
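To make the Clifford relations and the local form of the Dirac operator concrete, here is a minimal worked example. It is an illustration, not part of the original text: it assumes flat $\mathbb R^2$ (ignoring compactness), the trivial bundle $S = \mathbb R^2 \times \mathbb C^2$ with the trivial connection, and the hypothetical choice $\gamma^\mu = i\sigma_\mu$ in terms of the Pauli matrices $\sigma_1, \sigma_2$.

```latex
% Assumed data: g = \delta, \nabla^S_\mu = \partial_\mu, \gamma^\mu := c(dx^\mu) = i\sigma_\mu, \mu = 1,2.
\begin{align*}
(\gamma^\mu)^* &= (i\sigma_\mu)^* = -\,i\sigma_\mu = -\gamma^\mu
   && \text{(the action is anti-Hermitian)} \\
\{\gamma^\mu, \gamma^\nu\} &= -\{\sigma_\mu, \sigma_\nu\} = -2\,\delta^{\mu\nu}\,\mathbf 1
   && \text{(Clifford relation with } g = \delta\text{)} \\
D &= i\sigma_1\,\partial_1 + i\sigma_2\,\partial_2 \\
D^2 &= -\bigl(\sigma_1^2\,\partial_1^2 + \sigma_2^2\,\partial_2^2
        + \{\sigma_1,\sigma_2\}\,\partial_1\partial_2\bigr)
     = -(\partial_1^2 + \partial_2^2) \\
[\,D, f\,] &= \gamma^\mu\,(\partial_\mu f) = c(df)
   && \text{for a smooth function } f .
\end{align*}
```

With the sign convention $\{c(\sigma), c(\sigma')\} = -2g(\sigma,\sigma')$, the square of D is thus a non-negative Laplacian, and the commutator of D with a function is Clifford multiplication by its differential.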
Since the manifold M is Riemannian it carries a canonical 1-density locally given by $\sqrt{g}\, d^n x$, and functions can be integrated over M with this 1-density. Therefore, the space Γ(S) is supplied with a canonical scalar product
\[ (s_1, s_2) = \int_M \langle s_1, s_2 \rangle_S \, \sqrt{g}\, d^n x , \tag{2.8} \]
and we denote the Hilbert space completion of Γ(S) with respect to this scalar product by $L^2(S)$. The Dirac operator on Γ(S) extends to a self-adjoint operator on $L^2(S)$ [Su, LaM].
Proposition 2.2. On a Riemannian manifold (M, g), the complexified bundle of differential forms $\Lambda^\bullet M^{\mathbb C} := \Lambda^\bullet T^*M \otimes \mathbb C$ carries a canonical Dirac bundle structure.
Notice that the canonical Dirac operator on $\Lambda^\bullet M^{\mathbb C}$ is given by the signature operator $D = d + d^*$. The proposition shows that Dirac bundles always exist; thus, given a compact Riemannian manifold M – which need not be spin – we can choose a Dirac bundle S over M and obtain what we will call an N = 1 spectral triple $(C^\infty(M), L^2(S), D)$. In the following sections we shall prove that the Riemannian manifold M can be recovered from the spectral triple associated to any Dirac bundle. For detailed proofs and a survey of applications of the theory of Dirac operators see [BGV, LaM].
2.1.2. The special case of spin manifolds.
Let M be a spin manifold, i.e., the first and the second Stiefel–Whitney class of M vanish. Then, in particular, M is oriented, and there is a canonical volume form determined by the Riemannian metric, which allows one to define integration (of functions) over M. Since M is spin, it furthermore carries a Hermitian bundle S of irreducible representations of the Clifford algebra, i.e. of Dirac spinors. We associate to the manifold M the N = 1 spectral triple $(C^\infty(M), L^2(S), D)$, where $C^\infty(M)$ is the algebra of smooth complex valued functions on M, $L^2(S)$ is the Hilbert space of square integrable spinors, and D is the Dirac operator associated to the spin-connection on S. The same data can be used if the manifold is only $\mathrm{spin}^c$. (The Dirac operator $D_A$ then depends on a U(1) gauge field A, but we will see later that $D_A$ enters e.g. the differential forms only through commutators, therefore A drops out again.)
In cohomological terms, a manifold is $\mathrm{spin}^c$ if its first Stiefel–Whitney class vanishes, and its second one is the mod-2 reduction of an integral class. One can show that, in particular, every smooth compact oriented 4-manifold has a $\mathrm{spin}^c$-structure, see e.g. [Sa].
2.1.3. Reconstructing the geometry of M from its N = 1 spectral data.
The algebra $C^\infty(M)$ is a *-algebra with involution given by complex conjugation, and it acts on the
Hilbert space $L^2(S)$ by bounded (multiplication) operators. Thus, $C^\infty(M)$ is equipped with a $C^*$-norm, and taking its $C^*$-closure yields the algebra of continuous functions on M: With the help of Gelfand’s Theorem, we can obtain M as a compact Hausdorff space from the spectral triple.
In his fundamental work, A. Connes has shown how one can reconstruct the de Rham–Hodge theory and the Riemannian geometry of a Riemannian manifold (M, g) from the spectral triple $(C^\infty(M), L^2(S), D)$. All the details of this construction can be found in [Co1], but because of their importance, and also for the sake of introducing some notation, we list a few of the main results in the following. While the applications in [Co1] focus on the case of a spin manifold, the constructions clearly work for any Riemannian manifold M (cf. also Prop. 2.2). First, Connes shows that one can recover the geodesic distance on M from the Dirac operator D:
Proposition 2.3 ([Co1]). The geodesic distance between two points p, q ∈ M is given by
\[ d(p, q) = \sup_f \{\, |f(p) - f(q)| \;:\; \| [\, D, f \,] \|_{L^2(S)} \le 1 \,\} . \]
Note that the commutator [D, f] acts on $L^2(S)$ simply as Clifford multiplication by df, which can be seen e.g. when writing D in local coordinates. We note that the geodesic distance of Proposition 2.3 can also be recovered from the algebra $C^\infty(M)$ and the Laplace operator $\Delta = \Delta_g$ associated with a Riemannian metric g on M, see [FG]. More precisely, we have
\[ d(p, q) = \sup_f \{\, |f(p) - f(q)| \;:\; \| \tfrac12 (\Delta f^2 + f^2 \Delta) - f \Delta f \|_{L^2(S)} \le 1 \,\} . \]
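The link between the two distance formulas can be seen in a short computation. The sketch below is an illustration under stated assumptions, not the argument of [FG]: it takes f real, lets Δ act on scalar functions ψ as $\Delta = g^{\mu\nu}\nabla_\mu\nabla_\nu$ (with the opposite sign convention the result changes sign but not norm), and uses only the product rule $\Delta(f\psi) = (\Delta f)\psi + 2g(df, d\psi) + f\Delta\psi$.

```latex
% Assumptions: f real, \Delta = g^{\mu\nu}\nabla_\mu\nabla_\nu on functions, |df|^2 := g(df,df).
\begin{align*}
\Delta(f^2\psi) &= \bigl(2f\,\Delta f + 2|df|^2\bigr)\psi + 4f\,g(df,d\psi) + f^2\,\Delta\psi , \\
\tfrac12\bigl(\Delta f^2 + f^2\Delta\bigr)\psi
   &= \bigl(f\,\Delta f + |df|^2\bigr)\psi + 2f\,g(df,d\psi) + f^2\,\Delta\psi , \\
f\,\Delta(f\psi) &= f\,(\Delta f)\,\psi + 2f\,g(df,d\psi) + f^2\,\Delta\psi , \\[4pt]
\Bigl(\tfrac12(\Delta f^2 + f^2\Delta) - f\Delta f\Bigr)\psi &= |df|^2\,\psi .
\end{align*}
```

The operator in the second formula is therefore multiplication by $|df|^2$, whose norm is at most 1 exactly when $|df| \le 1$ everywhere. This is the same constraint as $\|[D,f]\| \le 1$, because $c(df)^2 = -|df|^2$ for real f.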
In fact, much of classical differential geometry can also be extracted from a triple $(C^\infty(M), L^2(\Lambda^\bullet M^{\mathbb C}), \Delta)$. Next, we turn to the reconstruction of differential forms over M from the spectral triple $(A, L^2(S), D)$, where $A := C^\infty(M)$. We first introduce an abstract object, the graded differential algebra of universal forms $\Omega^\bullet(A)$, see [CoK], which also plays an important role in the non-commutative setting:
\[ \Omega^\bullet(A) = \bigoplus_{k=0}^{\infty} \Omega^k(A) , \tag{2.9} \]
\[ \Omega^k(A) = \Bigl\{\, \sum_{i=1}^{N} f_0^i\, \delta f_1^i \cdots \delta f_k^i \;\Big|\; N \in \mathbb N ,\ f_j^i \in A \,\Bigr\} . \tag{2.10} \]
Here, $\delta : \Omega^k(A) \longrightarrow \Omega^{k+1}(A)$ is a linear operator satisfying $\delta^2 = 0$ and the Leibniz rule. The algebra $\Omega^\bullet(A)$ is endowed with an involution given by
\[ f^* = \bar f , \qquad (\delta f)^* = -\delta \bar f \tag{2.11} \]
for each f ∈ A. There is a *-representation π of $\Omega^\bullet(A)$ on $L^2(S)$ by bounded operators defined via
\[ \pi(f) = f , \qquad \pi(\delta f) = [\, D, f \,] = c(df) . \tag{2.12} \]
We use π to define – see [Co1] –
\[ J^k := \ker \pi|_{\Omega^k(A)} . \tag{2.13} \]
The sum $\bigoplus_k J^k$ is not a differential ideal in $\Omega^\bullet(A)$, since there are $\omega \in \Omega^k(A)$ such that π(ω) = 0 but π(δω) ≠ 0. Connes shows that
\[ \pi(\Omega^k(A)) = \Gamma(Cl^{(k)}(M)) := \mathrm{span}\{ \omega_1 \omega_2 \cdots \omega_k \mid \omega_i \in \Gamma(T^*M^{\mathbb C}) \} , \]
and that the representation π maps the elements of $\delta J^k$ onto sections of $Cl^{(k-1)}(M)$, i.e.
\[ \pi(\delta J^k) = \Gamma(Cl^{(k-1)}(M)) . \]
This leads to the following characterization of differential forms:
Proposition 2.4 ([Co1]). For each $k \in \mathbb Z_+$, there is an isomorphism
\[ \Omega^k_D(A) := \Omega^k(A) / (J^k + \delta J^{k-1}) \cong \Gamma(\Lambda^k M^{\mathbb C}) . \]
The integration theory on M may be recovered from the N = 1 spectral data in two different ways, namely 1) using the Dixmier trace $\mathrm{Tr}_\omega(\,\cdot\,|D|^{-n})$ and Connes’ trace theorem, see [Co3], or 2) through the operator $\exp(-\varepsilon D^2)$ and the heat kernel expansion. The first approach is described in [Co1,3]; we shall follow the second one. In both cases, mild restrictions have to be placed on the class of admissible Dirac bundles:
Definition 2.5. A $\mathbb Z_2$-graded Dirac bundle is a Dirac bundle together with a section $\gamma \in \Gamma(\mathrm{End}\, S)$ such that $\gamma^2 = 1$, $\{\gamma, c(\omega)\} = 0$ for each $\omega \in \Gamma(T^*M^{\mathbb C})$, and $\langle \gamma s_1, \gamma s_2 \rangle_S = \langle s_1, s_2 \rangle_S$ for all $s_1, s_2 \in \Gamma(S)$.
Note that the spinor bundle of an even dimensional spin manifold is $\mathbb Z_2$-graded by the chirality element γ. Let $\theta^i$ be a local orthonormal basis of $\Gamma(T^*M)$ over U ⊂ M. Then any section $\sigma \in \Gamma(Cl(M))$ can locally be written as $\sigma = \sigma_0 + \sigma_i \theta^i + \sigma_{i_1 i_2} \theta^{i_1} \theta^{i_2} + \ldots + \sigma_{i_1 \ldots i_n} \theta^{i_1} \cdots \theta^{i_n}$ with totally anti-symmetric functions $\sigma_{i_1 \ldots i_k}$. Using the heat kernel expansion of the operator $\exp(-\varepsilon D^2)$, ε > 0, see e.g. [BGV], we arrive at
\[ \fint c(\sigma) := \lim_{\varepsilon \to 0^+} \frac{\mathrm{Tr}\, \bigl( c(\sigma)\, e^{-\varepsilon D^2} \bigr)}{\mathrm{Tr}\, \bigl( e^{-\varepsilon D^2} \bigr)} = \frac{1}{\mathrm{Vol}(M)} \int_M \sigma_0(x)\, \sqrt{g}\, d^n x . \tag{2.14} \]
The integral $\fint$ induces a scalar product on the space Γ(Cl(M)),
\[ (\sigma_1, \sigma_2) := \fint c(\sigma_1)\, c(\sigma_2)^* \tag{2.15} \]
and the induced scalar product on differential forms is the usual one.
2.1.4. Vector bundles and Hermitian structures.
As is well-known [Sw, Co1], one can also give an algebraic description of (complex) vector bundles E over the manifold M: The space of sections $\mathcal E = \Gamma(E)$ is a finitely generated, projective left- (and right-) module over $C^\infty(M)$.
It turns out that the converse statement is also true, i.e. every module $\mathcal E$ of the above type is isomorphic to the space of sections of a complex vector bundle E over M, which is unique up to isomorphism [Sw].
A Hermitian structure on a bundle (a module) E is a non-degenerate positive $C^\infty(M)$-sesqui-linear map from E × E to $C^\infty(M)$. The extension of the Riemannian metric to a Hermitian structure on differential forms is recovered by defining $g(\phi, \bar\psi)$, $\phi, \psi \in \Gamma(\Lambda^k M^{\mathbb C})$, to be the unique function on M such that
\[ \int_M g(\phi, \bar\psi)\, f = \fint c(\phi)\, c(\psi)^* f \]
for all $f \in C^\infty(M)$. To conclude, we note that global definitions of connection, metric connection, curvature, torsion, etc. will fit into the present algebraic formalism.
2.2. The N = (1, 1) formulation of Riemannian geometry.
In this section, we describe an algebraic formulation of Riemannian geometry that apparently requires slightly more structure on the manifold than the N = 1 setting. The advantage is that the reconstruction procedure and the non-commutative generalization are easier from the N = (1, 1) point of view, see Subsect. 2.2.4. In addition, in 2.2.2, we will in fact see that every Riemannian manifold provides this larger set of data.
2.2.1. The N = (1, 1) spectral data.
Definition 2.6. Let M be a Riemannian manifold. An N = (1, 1) Dirac bundle over M is a $\mathbb Z_2$-graded Dirac bundle S with Clifford action c and connection $\nabla^S$, see Definition 2.5, together with a second Clifford action $\bar c$ such that
1) $(S, \bar c, \nabla^S)$ is a $\mathbb Z_2$-graded Dirac bundle;
2) $\{ c(\sigma_1), \bar c(\sigma_2) \} = 0$ for all $\sigma_1, \sigma_2 \in \Gamma(T^*M^{\mathbb C})$;
3) the Dirac operators $D = c \circ \nabla^S$ and $\bar D = \bar c \circ \nabla^S$ satisfy the relations
\[ \{ D, \bar D \} = 0 , \qquad D^2 = \bar D^2 . \]
Given an N = (1, 1) Dirac bundle S over M we can introduce an operator d on Γ(S) and its adjoint by
\[ d := \tfrac12 (D - i \bar D) , \qquad d^* := \tfrac12 (D + i \bar D) ; \tag{2.16} \]
both of these operators are nilpotent: $d^2 = 0$.
2.2.2. $\mathrm{Spin}^c$-manifolds and the canonical N = (1, 1) Dirac bundle.
In this section we prove that $\mathrm{spin}^c$-manifolds carry a canonical N = (1, 1) Dirac bundle which can be constructed in terms of the spinor bundle S. Moreover, it turns out that the Dirac bundle is isomorphic to the bundle of differential forms, which means that to any compact Riemannian manifold one can associate an N = (1, 1) Dirac bundle, as mentioned at the beginning of Sect. 2.2. In the present subsection, we systematically avoid the language of principal bundles, since it is difficult to generalize to the non-commutative setting. Instead, we make use of the material presented in the appendix and of some basic knowledge of vector bundles to convert those “local” notions into global ones. The definition of a $\mathrm{spin}^c$-manifold we use is a direct generalization of a $\mathrm{spin}^c$ structure on an oriented vector space recalled in Definition A.1 of the appendix:
Definition 2.7 ([Sa]). A $\mathrm{spin}^c$-manifold is an oriented (compact) Riemannian manifold (M, g) of dimension n together with a complex Hermitian vector bundle S of rank $2^{[\frac n2]}$, where [k] denotes the integer part of k, and a bundle homomorphism $c : T^*M \longrightarrow \mathrm{End}(S)$ such that
\[ c(\omega) + c(\omega)^* = 0 , \qquad c(\omega)^* c(\omega) = g(\omega, \omega) \tag{2.17} \]
for all $\omega \in T^*M$.
Note that c extends uniquely to an irreducible unitary Clifford action of Cl(M) on S. Recall that a Clifford connection $\nabla^S$ on S is a Hermitian connection compatible with the Levi–Civita covariant derivative ∇ on $T^*M$, see (2.4). Here we follow the terminology of [BGV]; in the context of $\mathrm{spin}^c$-geometry, what we call a Clifford connection on the $\mathrm{Spin}^c$-bundle is referred to as a $\mathrm{Spin}^c$-connection compatible with the Levi–Civita connection.
Proposition 2.8 ([Sa]). If $\nabla^S$ and $\tilde\nabla^S$ are Clifford connections on S, then their difference is a 1-form with values in the purely imaginary numbers, i.e.
\[ \nabla^S - \tilde\nabla^S \in \Gamma(T^*M \otimes i\mathbb R) . \]
Sketch of Proof. The compatibility with ∇ implies that
\[ (\nabla^S_X - \tilde\nabla^S_X)\, c(\omega) = c(\omega)\, (\nabla^S_X - \tilde\nabla^S_X) \]
for all $\omega \in \Gamma(Cl(M))$ and $X \in \Gamma(TM)$. Since the commutant of Γ(Cl(M)) in End S equals $C^\infty(M)$, the difference $\nabla^S_X - \tilde\nabla^S_X$ must be given by a function over the manifold. Because of the Hermiticity condition, it takes purely imaginary values.
Note that the 1-form taking purely imaginary values may be interpreted from the physical point of view as an electromagnetic vector potential; thus, it is charged spinning particles that can propagate on a $\mathrm{spin}^c$-manifold. By virtue of Proposition 2.8, we can describe all Clifford connections on S if we only present a construction for one of them, which will be done in the following. The main idea is to glue together the local data described in the appendix with the help of suitably chosen local trivializations of $T^*M$ and S: Choose a real oriented n-dimensional vector space V with scalar product (·, ·) and a $\mathrm{spin}^c$-structure (W, Γ) on V; see the appendix.
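Let U ⊂ M be a contractible open set and $h : S|_U \longrightarrow U \times W$ be a local trivialization of S with the property
\[ \langle \sigma_1, \sigma_2 \rangle_S = \langle h(\sigma_1), h(\sigma_2) \rangle_W . \tag{2.18} \]
We define a local trivialization of $T^*M$, $\varphi : T^*M|_U \longrightarrow U \times V$, by requiring
\[ \Gamma(\varphi(\omega))\, w = h\, c(\omega)\, h^{-1}(w) \tag{2.19} \]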
for all $\omega \in T^*M|_U$ and w ∈ W. (Such a ϕ exists because c and Γ both implement irreducible representations of algebras that are isomorphic to $Cl(M)|_U$.) The map ϕ satisfies (ϕ(ω), ϕ(η)) = g(ω, η) for all $\omega, \eta \in T^*M|_U$. A triple (U, ϕ, h) as above is called an admissible trivialization of $T^*M$ and S with respect to (V, W, Γ).
Lemma 2.9. The structure group of S is given by $\mathrm{Spin}^c(V)$.
Sketch of Proof. Choose a finite covering $\{U_\alpha\}$ of M by contractible open sets such that $(U_\alpha, \varphi_\alpha, h_\alpha)$ are admissible trivializations of $T^*M$ and S. Then the joint transition functions
\[ (A_{\alpha\beta}, \Phi_{\alpha\beta}) := (\varphi_\alpha \circ \varphi_\beta^{-1},\ h_\alpha \circ h_\beta^{-1}) \]
of $T^*M$ and S are easily seen to take values in $\mathrm{Hom}_{\mathrm{spin}^c}(W, W) \cong \mathrm{Spin}^c(V)$ by Proposition A.2 of the appendix.
With the help of this lemma, one may also show that Definition 2.7 of a $\mathrm{spin}^c$-manifold is equivalent to the usual definition of $\mathrm{Spin}^c$-structures via principal bundles. Let $\delta : \mathrm{Spin}^c(n) \longrightarrow \mathrm{U}(1)$ be the group homomorphism defined by
\[ \delta(e^{i\alpha} x) = e^{2i\alpha} , \tag{2.20} \]
x ∈ Spin(n), and $(U_\alpha, \varphi_\alpha, h_\alpha)$ be a set of admissible trivializations of $T^*M$ and S. The U(1)-valued functions $\delta \circ ((A_{\alpha\beta}, \Phi_{\alpha\beta}))$ – see the proof of Lemma 2.9 – define a Hermitian line bundle L. The bundle S can be written locally as $S_{\mathbb R} \otimes L^{1/2}$, where $S_{\mathbb R}$ is a bundle of irreducible representations of the real Clifford algebra and $L^{1/2}$ is the square root of L. It may happen that neither $S_{\mathbb R}$ nor $L^{1/2}$ are globally defined. The “bundle” $L^{1/2}$ is therefore called a virtual bundle. Notice that an admissible trivialization (U, ϕ, h) of $T^*M$ and S also determines a trivialization of L.
Let $\nabla^L$ be a Hermitian connection on L and 2A its connection 1-form in the trivialization (U, ϕ, h). We choose orthonormal bases $e^i$ and $\epsilon^p$ of V and W, respectively; then $\theta^i = \varphi^{-1}(e^i)$ and $\sigma^p = h^{-1}(\epsilon^p)$ are orthonormal frames of $\Gamma(T^*M|_U)$ and $\Gamma(S|_U)$, respectively, and we denote by $\vartheta_i$ the frame of $\Gamma(TM|_U)$ dual to $\theta^i$.
Lemma 2.10. Let $\omega^j{}_k(\vartheta_i)$ denote the coefficients of the Levi–Civita connection in the basis $\vartheta_i$ of $\Gamma(TM|_U)$, and let 2A be the connection 1-form of a Hermitian connection on L; then the formula
\[ \nabla^S \sigma^p = \theta^i \otimes \Bigl( -\tfrac14\, \omega^j{}_k(\vartheta_i)\, c(\theta_j)\, c(\theta^k) + i A(\vartheta_i) \Bigr) \sigma^p \]
defines a Clifford connection on S.
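Sketch of Proof. The expression for $\nabla^S \sigma^p$ does not depend on the choice of bases $e^i$ and $\epsilon^p$ since any two such choices are related by a global transformation over U. If $(\tilde U, \tilde\varphi, \tilde h)$ is another admissible local trivialization of $T^*M$ and S with the same V and (W, Γ), there is a $\mathrm{Hom}_{\mathrm{spin}^c}(W, W)$-valued transition function $(A, \Phi)_x$, $x \in U \cap \tilde U$, such that $\tilde\sigma^p = \Phi^p{}_q\, \sigma^q$ and $\tilde\theta^i = A^i{}_j\, \theta^j$. Then we have
\begin{align*}
\nabla^S \tilde\sigma^p &= d\Phi^p{}_q\, (\Phi^{-1})^q{}_r \otimes \tilde\sigma^r + \Phi^p{}_q\, \nabla^S \sigma^q \\
&= d\Phi^p{}_q\, (\Phi^{-1})^q{}_r \otimes \tilde\sigma^r + \theta^i \otimes \Bigl( -\tfrac14\, \omega^j{}_k(\vartheta_i)\, c(\theta_j)\, c(\theta^k) + i A(\vartheta_i) \Bigr) \tilde\sigma^p .
\end{align*}
According to Proposition A.2, we may define
\[ e^{i\alpha_x} \xi_x = \Xi_0^{-1}(A^i{}_j, \Phi^p{}_q)_x \in C^\infty(U \cap \tilde U, \mathrm{Spin}^c(V)) \]
where $\xi_x \in C^\infty(U \cap \tilde U, \mathrm{Spin}(V))$, and from Eqs. (A.21,22) in the appendix and the compatibility properties of admissible trivializations we obtain
\[ d\Phi^p{}_q\, (\Phi^{-1})^q{}_r \otimes \tilde\sigma^r = \tfrac14\, dA^j{}_l\, (A^{-1})^l{}_k \otimes c(\tilde\theta_j)\, c(\tilde\theta^k)\, \tilde\sigma^p + i\, d\alpha . \]
The transformation formula for the Levi–Civita connection is as usual:
\[ \tfrac14\, \omega^j{}_k(\vartheta_i)\, c(\theta_j)\, c(\theta^k) = \tfrac14\, \tilde\omega^j{}_k(\vartheta_i)\, c(\tilde\theta_j)\, c(\tilde\theta^k) + \tfrac14\, dA^j{}_l\, (A^{-1})^l{}_k \otimes c(\tilde\theta_j)\, c(\tilde\theta^k) . \]
Finally, using the transformation of the U(1)-field, $\tilde A = A + d\alpha$, and putting all the terms together, we see that $\nabla^S$ is indeed well-defined. A direct computation shows that it also satisfies the properties of a Clifford connection.
Note that iA transforms like a connection on $L^{1/2}$. Therefore it is called a virtual connection.
We will now use the results of Appendix A.3 to establish a relation to differential forms. Let $\{\varphi_{\alpha\beta}\}$ be a family of $\mathrm{Spin}^c(n)$-valued transition functions for S. Using the automorphism
\[ e^{i\alpha} \xi \mapsto \overline{e^{i\alpha} \xi} = e^{-i\alpha} \xi \tag{2.21} \]
of $\mathrm{Spin}^c(n)$, where ξ ∈ Spin(n), we define $\bar S$ to be the vector bundle of rank $2^{[\frac n2]}$ defined by the transition functions $\{\bar\varphi_{\alpha\beta}\}$. Then $\bar S$ is also a Dirac bundle with Clifford action $\bar c$ – which is just c transferred to the “new” bundle – and its Clifford connection $\nabla^{\bar S}$ reads
\[ \nabla^{\bar S} \bar\sigma^p = \theta^i \otimes \Bigl( -\tfrac14\, \omega^j{}_k(\vartheta_i)\, c(\theta_j)\, c(\theta^k) - i A(\vartheta_i) \Bigr) \bar\sigma^p \tag{2.22} \]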
in an admissible trivialization. The vector bundle $S \otimes \bar S$ has SO(n) as its structure group, and there is a vector bundle isomorphism
\[ \psi : \Lambda^\bullet M^{\mathbb C} \longrightarrow \mathcal S \tag{2.23} \]
with $\mathcal S = S \otimes \bar S$ if n is even and $\mathcal S = S \otimes \mathbb C^2 \otimes \bar S$ if n is odd. Furthermore, this isomorphism can be chosen in such a way that the correspondences (A.24–30) given in the appendix hold, and therefore we can use ψ to push the two anti-commuting Clifford
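actions Γ and $\bar\Gamma$ (A.24) to $\mathcal S$, where we call them c and $\bar c$, respectively. (Of course, we could have immediately identified c with c ⊗ 1 and $\bar c$ with $\gamma \otimes \bar c$.) Let $\nabla^{\mathbb C^2}$ be the canonical flat connection on $M \times \mathbb C^2$ and $\nabla^{\mathrm{tot}}$ be the tensor product connection on $\mathcal S$, i.e.
\[ \nabla^{\mathrm{tot}} = \nabla^S + \nabla^{\mathbb C^2} + \nabla^{\bar S} , \tag{2.24} \]
where of course the middle term only appears if n is odd.
Lemma 2.11. The isomorphism ψ in (2.23) maps the Levi–Civita connection to $\nabla^{\mathrm{tot}}$.
Proof. Using Lemma 2.10 and Eq. (2.22), we see that in an admissible trivialization the U(1)-field 2A does not contribute to the connection coefficients of $\nabla^{\mathrm{tot}}$, i.e.
\[ \nabla^{\mathrm{tot}} = \theta^i \otimes \Bigl( -\tfrac14\, \omega^j{}_k(\vartheta_i) \bigl( c(\theta_j)\, c(\theta^k) + \bar c(\theta_j)\, \bar c(\theta^k) \bigr) \Bigr) . \]
By (A.24–30) we get
\[ \psi^{-1} \circ \nabla^{\mathrm{tot}} \circ \psi = \theta^i \otimes \bigl( -\omega^j{}_k(\vartheta_i)\, a^{k*} a_j \bigr) , \]
where $a^{k*} = \theta^k \wedge$ and $a_k$ is its adjoint. But the last expression is just the Levi–Civita connection in the basis $\theta^i$.
We use the isomorphism ψ of Eq. (2.23) to define two Dirac operators on $\mathcal S$,
\[ \mathcal D = c \circ \nabla^{\mathrm{tot}} , \qquad \bar{\mathcal D} = \bar c \circ \nabla^{\mathrm{tot}} . \tag{2.25} \]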
Lemma 2.12. The operators D and D satisfy ψ −1 Dψ = d + d∗ ,
(2.26)
ψ −1 Dψ = i(d − d∗ ) . Proof. This follows from (A.24) and the equations d = ai∗ ◦ ∇ϑi ,
d∗ = −ai ◦ ∇ϑi ,
see e.g. [BGV ]; ∇ denotes the (torsion-free) Levi–Civita connection.
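As a short worked check (a sketch that assumes only the nilpotency relations $d^2 = 0$ and $(d^*)^2 = 0$ on differential forms), the identities (2.26) already give the two algebraic relations between $D$ and $\bar D$ required of an N = (1, 1) Dirac bundle:

```latex
\begin{align*}
\psi^{-1}\{D,\bar D\}\,\psi
  &= i\,\{\,d+d^*,\ d-d^*\,\}
   = i\,\bigl(2d^2 - \{d,d^*\} + \{d^*,d\} - 2(d^*)^2\bigr) = 0\,,\\
\psi^{-1}D^2\,\psi
  &= (d+d^*)^2 = \{d,d^*\}\,,\\
\psi^{-1}\bar D^2\,\psi
  &= -\,(d-d^*)^2 = \{d,d^*\}\,.
\end{align*}
```

Both squares therefore coincide with the Laplacian $\{d, d^*\}$ on forms.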
With Lemma 2.12 it is easy to see that $D$ and $\bar D$ fulfill the relations
$$\{D,\bar D\} = 0 \qquad\text{and}\qquad D^2 = \bar D^2$$
of Definition 2.6. Thus, the bundle $\mathcal S = S\otimes\bar S$ – or $\mathcal S = S\otimes\mathbb C^2\otimes\bar S$ for $n$ odd – is an N = (1, 1) Dirac bundle over $M$. Furthermore, $\mathcal S$ is isomorphic to the bundle of differential forms, and this proves that any Riemannian manifold carries an N = (1, 1) structure.

2.2.3. Structure theorem for N = (1, 1) Dirac bundles. In the previous section we have seen that any Riemannian manifold carries a canonical N = (1, 1) Dirac bundle, see Definitions 2.1, 2.5, 2.6, namely the bundle of differential forms together with the Levi–Civita connection and the Clifford actions $\Gamma$ and $\bar\Gamma$ as in (A.24). In this section, we prove that any N = (1, 1) Dirac bundle is obtained by tensoring this canonical one with a flat Hermitian vector bundle.
Supersymmetric Quantum Theory and Differential Geometry
551
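Let $S$ be an N = (1, 1) Dirac bundle with Clifford actions $c$ and $\bar c$ and Clifford connection $\nabla^S$. We define
$$E := \mathrm{Hom}_{Cl}(\Lambda^\bullet M^{\mathbb C}, S) \tag{2.27}$$
to be the vector bundle of bundle maps $\varphi:\Lambda^\bullet M^{\mathbb C}\longrightarrow S$ which intertwine the Clifford actions on $\Lambda^\bullet M^{\mathbb C}$ and $S$, i.e.
$$\varphi\circ\Gamma(\omega) = c(\omega)\circ\varphi\,,\qquad \varphi\circ\bar\Gamma(\omega) = \bar c(\omega)\circ\varphi\,,$$
for all $\omega\in\Lambda^\bullet M^{\mathbb C}$.

Lemma 2.13. The bundle map
$$\psi:\ \Lambda^\bullet M^{\mathbb C}\otimes E\longrightarrow S\,,\qquad \omega\otimes\varphi\longmapsto\varphi(\omega)$$
is an isomorphism, and the Clifford algebra acts trivially on $E$, i.e., for all $\omega\in Cl(M)$, we have that
$$c(\omega)\circ\psi = \psi\circ(\Gamma(\omega)\otimes 1)\,,\qquad \bar c(\omega)\circ\psi = \psi\circ(\bar\Gamma(\omega)\otimes 1)\,.$$

Proof. For each $x\in M$, the two anti-commuting unitary representations $c$ and $\bar c$ of $Cl(M)|_x$ on $S_x$ may be viewed as one unitary representation of $Cl(M\times M)|_{(x,x)}$. Since $M\times M$ is even-dimensional, $S_x$ is a multiple of the unique irreducible unitary representation of $Cl(M\times M)|_{(x,x)}$, which is precisely $\Lambda^\bullet T_x^*M^{\mathbb C}$ with actions $\Gamma$ and $\bar\Gamma$. This proves that $\psi$ is an isomorphism, and since $E$ is the “multiplicity space” it is clear that the Clifford algebra acts trivially on it.

In the following we will identify the bundles $\Lambda^\bullet M^{\mathbb C}\otimes E$ and $S$ with the help of $\psi$.

Lemma 2.14. The sesqui-linear map
$$\langle\cdot,\cdot\rangle_E:\ \Gamma(E)\times\Gamma(E)\longrightarrow C^\infty(M)\,,\qquad (\varphi,\varphi')\longmapsto\langle\varphi(1),\varphi'(1)\rangle_S\,,$$
where $1\in\Gamma(\Lambda^\bullet M^{\mathbb C})$ denotes the identity function, defines a Hermitian structure on $E$. Furthermore, the Hermitian structure on $S$ factorizes as
$$\langle\cdot,\cdot\rangle_S = \langle\cdot,\cdot\rangle\otimes\langle\cdot,\cdot\rangle_E\,,$$
where $\langle\cdot,\cdot\rangle$ denotes the canonical Hermitian structure on $\Lambda^\bullet M^{\mathbb C}$ given by the Riemannian metric.

Proof. Let $\theta^i$ be an orthonormal basis of $\Gamma(\Lambda^\bullet M^{\mathbb C})$ over a coordinate chart $U\subset M$ and $\varphi,\varphi'\in\Gamma(E|_U)$. We denote by $a^{i*} = a^*(\theta^i)$ the wedging operator by $\theta^i$ on $\Gamma(\Lambda^\bullet M^{\mathbb C}|_U)$. Since $a^{i*}$ is a linear combination of $\Gamma(\theta^i)$ and $\bar\Gamma(\theta^i)$, it commutes with $\varphi$. If $\varphi(1) = 0$ then we have
$$\varphi(\theta^{i_1}\wedge\ldots\wedge\theta^{i_p}) = \varphi(a^{i_1*}\cdots a^{i_p*}\cdot 1) = a^{i_1*}\cdots a^{i_p*}\cdot\varphi(1) = 0$$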
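for all $1\le i_1<\ldots<i_p\le n = \dim M$, and therefore $\varphi\equiv 0$; this proves that $\langle\cdot,\cdot\rangle_E$ is a Hermitian structure on $E$. The contraction operators $a_i = a(\theta^i)$ also commute with $\varphi$ and $\varphi'$, and it follows that $a_i\varphi(1) = a_i\varphi'(1) = 0$ for all $i$. Using the CAR we get
$$\begin{aligned}
\langle\varphi(\theta^{i_1}\wedge\ldots\wedge\theta^{i_p}),\,\varphi'(\theta^{j_1}\wedge\ldots\wedge\theta^{j_q})\rangle_S
&= \langle a^{i_1*}\cdots a^{i_p*}\cdot\varphi(1),\ a^{j_1*}\cdots a^{j_q*}\cdot\varphi'(1)\rangle_S\\
&= \langle\varphi(1),\ a_{i_p}\cdots a_{i_1}a^{j_1*}\cdots a^{j_q*}\cdot\varphi'(1)\rangle_S\\
&= \delta_{p,q}\ \mathrm{sgn}\binom{j_1\ldots j_q}{i_1\ldots i_p}\ \langle\varphi(1),\varphi'(1)\rangle_S\\
&= \langle\theta^{i_1}\wedge\ldots\wedge\theta^{i_p},\ \theta^{j_1}\wedge\ldots\wedge\theta^{j_q}\rangle\,\langle\varphi,\varphi'\rangle_E\,,
\end{aligned}$$
where in the second equality we have used the unitarity of the Clifford actions. The result follows by linearity.

Lemma 2.15. The connection $\nabla^S$ on $S$ is a tensor product connection, i.e.
$$\nabla^S = \nabla\otimes 1 + 1\otimes\nabla^E$$
for some covariant derivative $\nabla^E$ on $E$; as usual, $\nabla$ denotes the Levi–Civita connection.

Proof. We define
$$(\nabla^E\varphi)(\omega) = \nabla^S(\varphi(\omega)) - \varphi(\nabla\omega)$$
for all $\varphi\in\Gamma(E)$ and $\omega\in\Gamma(\Lambda^\bullet M^{\mathbb C})$. It is readily seen that $\nabla^E\varphi$ is tensorial, i.e. $C^\infty(M)$-linear, in $\omega$. Furthermore, since $\nabla^S$ and $\nabla$ are Clifford connections compatible with the Levi–Civita connection, we have
$$(\nabla^E\varphi)(\Gamma(\eta)\,\omega) = c(\eta)\,(\nabla^E\varphi)(\omega)$$
for all $\omega,\eta\in\Gamma(\Lambda^\bullet M^{\mathbb C})$ and $\varphi\in\Gamma(E)$, and the analogous relation with the “barred” Clifford actions $\bar\Gamma$ and $\bar c$ holds as well. This proves that $\nabla^E$ maps $\Gamma(E)$ into $\Gamma(T^*M\otimes E)$, in other words, $\nabla^E$ is a connection on $E$. Finally, the equation
$$\nabla^S(\omega\otimes\varphi) = \nabla\omega\otimes\varphi + \omega\otimes\nabla^E\varphi$$
is obvious from the definition of $\nabla^E$.

Lemma 2.16. Let $D = c\circ\nabla^S$ and $\bar D = \bar c\circ\nabla^S$ be the two Dirac operators on $S$ as in Definition 2.6, and let $R(\nabla^E)$ be the curvature tensor of $\nabla^E$. Then the following statements are equivalent:
1) $\{D,\bar D\} = 0$,
2) $D^2 = \bar D^2$,
3) $R(\nabla^E) = 0$.

Proof. First of all, we use Lemma 2.15 and the isomorphism $\psi$ of Lemma 2.13 to write
$$D\,(\omega\otimes\xi) = \Gamma(dx^\mu)\bigl(\nabla_\mu\omega\otimes\xi + \omega\otimes\nabla^E_\mu\xi\bigr)\,.$$
From the previous section, we know that the operators $D_\wedge := \Gamma(dx^\mu)\nabla_\mu$ and $\bar D_\wedge := \bar\Gamma(dx^\mu)\nabla_\mu$ on the space $\Gamma(\Lambda^\bullet M^{\mathbb C})$ satisfy the first two conditions. In addition, we exploit the fact that $\nabla$ is a Clifford connection, i.e.
$$\nabla_\mu\,\Gamma(dx^\nu)\,\omega = -\Gamma^\nu_{\mu\sigma}\,\Gamma(dx^\sigma)\,\omega + \Gamma(dx^\nu)\,\nabla_\mu\omega$$
– and similarly with $\bar\Gamma$ instead of $\Gamma$ – for all $\omega\in\Gamma(\Lambda^\bullet M^{\mathbb C})$; here $\Gamma^\nu_{\mu\sigma}$ are the Christoffel symbols. With this, it is straightforward, if slightly lengthy, to calculate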
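$$\begin{aligned}
\{D,\bar D\}\,(\omega\otimes\xi) &= \{D_\wedge,\bar D_\wedge\}\,(\omega\otimes\xi)\\
&\quad + \{\Gamma(dx^\mu),\bar\Gamma(dx^\nu)\}\,\nabla_\nu\omega\otimes\nabla^E_\mu\xi + \{\Gamma(dx^\mu),\bar\Gamma(dx^\nu)\}\,\nabla_\mu\omega\otimes\nabla^E_\nu\xi\\
&\quad - \{\Gamma(dx^\mu),\bar\Gamma(dx^\nu)\}\,\Gamma^\lambda_{\mu\nu}\,\omega\otimes\nabla^E_\lambda\xi + \Gamma(dx^\mu)\bar\Gamma(dx^\nu)\,\omega\otimes[\nabla^E_\mu,\nabla^E_\nu]\,\xi\\
&= \Gamma(dx^\mu)\bar\Gamma(dx^\nu)\,\omega\otimes R^E(\partial_\mu,\partial_\nu)\,\xi\,,
\end{aligned}$$
which shows that condition 3 implies 1. On the other hand, if 1 holds then, taking the commutator of the last expression with $\bar\Gamma(dx^\sigma)$ and then the anti-commutator with $\Gamma(dx^\lambda)$, we obtain 3. Next, a similar computation as above gives
$$\begin{aligned}
D^2(\omega\otimes\xi) &= D_\wedge^2\,\omega\otimes\xi - 2g^{\mu\nu}\,\nabla_\mu\omega\otimes\nabla^E_\nu\xi + g^{\mu\nu}\Gamma^\lambda_{\mu\nu}\,\omega\otimes\nabla^E_\lambda\xi\\
&\quad - g^{\mu\nu}\,\omega\otimes\nabla^E_\mu\nabla^E_\nu\xi + \tfrac12\,\Gamma(dx^\mu)\Gamma(dx^\nu)\,\omega\otimes R^E(\partial_\mu,\partial_\nu)\,\xi\,,
\end{aligned}$$
where the Riemannian metric arises from the basic relation (2.1) of the Clifford algebra; for $\bar D^2$, one obtains the same formula with $\Gamma$ replaced by $\bar\Gamma$. Therefore, the difference of the squares of the Dirac operators is
$$D^2 - \bar D^2 = \tfrac12\bigl(\Gamma(dx^\mu)\Gamma(dx^\nu) - \bar\Gamma(dx^\mu)\bar\Gamma(dx^\nu)\bigr)\,\omega\otimes R^E(\partial_\mu,\partial_\nu)\,,$$
which proves the equivalence of the last two statements of the lemma.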
Altogether, Lemmata 2.13 – 2.16 imply the following structure theorem for N = (1, 1) Dirac bundles:

Theorem 2.17. Any N = (1, 1) Dirac bundle $S$ on $M$, see Definition 2.6, is of the form $S = \Lambda^\bullet M^{\mathbb C}\otimes E$ where $E$ is a Hermitian vector bundle over $M$ endowed with a flat connection. Conversely, any flat Hermitian vector bundle defines an N = (1, 1) Dirac bundle.

In view of the ideas on the quantum mechanical picture of differential geometry outlined in the introduction, it is natural to interpret the typical fibre of the bundle $E$ as a space of internal degrees of freedom, on which some gauge group may act. If this gauge group is of the second kind, dynamical gauge fields with an associated field strength appear. For this situation, Theorem 2.17 states that the algebraic structure of the N = (1, 1) Dirac bundle is spoiled by the presence of gauge fields with non-vanishing field strength tensors.

2.2.4. Reconstruction of differential forms. Let $S$ be an N = (1, 1) Dirac bundle over a Riemannian manifold $M$. We associate to $M$ the spectral data $(C^\infty(M), L^2(S), D, \bar D, \gamma)$, where $\gamma$ denotes the $\mathbb Z_2$-grading operator on $S$. As in Subsect. 2.1.3, we see that linear combinations of operators
$$f_0\,[D,f_1]\cdots[D,f_k] = f_0\,c(df_1)\cdots c(df_k)\in c(\Gamma(Cl^{(k)}(M)))\,,$$
with $k\in\mathbb N$ and $f_i\in C^\infty(M)$, generate the algebra $c(\Gamma(Cl(M)))$ – and that $\bar c(\Gamma(Cl(M)))$ is obtained in the same way using $\bar D$; see [Co1] for the details. Thus, we recover the two anti-commuting Clifford actions $c$ and $\bar c$ on $S$, which will turn out to simplify the reconstruction of differential forms from the N = (1, 1) spectral data. First of all, the ideals $J^k$ introduced in Eq. (2.13) become differential ideals in the N = (1, 1) case, even in the non-commutative generalization, see [FGR], such that working with the “reduced” algebra of forms $\Omega^\bullet_d(A)$ technically becomes much easier – the Dirac operator $D$ from the N = 1 framework is now replaced by a nilpotent operator $d$, see
Eq. (2.16). Moreover, in the classical case we may even disregard the algebra of universal forms completely and define $\Omega^\bullet_d(A)$ directly without worrying about quotients. In order to show this, we invoke the following result:

Lemma 2.18. Let $Cl(V)$ be the (complexified) Clifford algebra over a Euclidean vector space $V$ of dimension $n$, and $c$, $\bar c$ be two anti-commuting unitary representations of $Cl(V)$ on a vector space $W$. Then the map
$$\pi(x) := \tfrac12\bigl(c(x) - i\,\bar c(x)\bigr)\in\mathrm{End}\,W\,,\qquad x\in V^{\mathbb C}\,,$$
defines a faithful representation of the exterior algebra $\Lambda^\bullet V^{\mathbb C}$ on $W$ by
$$\pi(x_1\wedge\ldots\wedge x_k) := \pi(x_1)\cdots\pi(x_k)\,,\qquad x_i\in V^{\mathbb C}\,.$$

Proof. Let $e_i$ be an orthonormal basis of $V$ and define $a_i^* := \pi(e_i)$ as well as $a_i := \pi(e_i)^*$. It is easy to verify that the operators $a_i$ satisfy the canonical anti-commutation relations (CAR). The result follows since a representation of CAR is necessarily faithful and the algebra generated by the $a_i^*$ is isomorphic to $\Lambda^\bullet V^{\mathbb C}$.

The result of this lemma trivially extends to coordinate neighborhoods on $M$ and, using a partition of unity, to $M$ itself. Recalling the form of the operator $d$ defined in Eq. (2.16), we see that the algebra of differential forms over $M$ is linearly generated by products of commutators of functions with $d$, i.e.,
$$\Omega^k_d(A) = \Bigl\{\ \sum_{i=1}^N f_0^i\,[d,f_1^i]\cdots[d,f_k^i]\ \Big|\ f_j^i\in C^\infty(M)\ \Bigr\}\,.$$
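The algebra behind Lemma 2.18 can be checked by hand on a small exterior algebra. The sketch below is illustrative only: it models $W = \Lambda^\bullet\mathbb C^n$ with basis monomials stored as bitmasks, and it assumes the concrete conventions $c(e_i) = a_i^* - a_i$ and $\bar c(e_i) = i\,(a_i^* + a_i)$, which are consistent with Lemma 2.12 but are not quoted from (A.24). All function names are hypothetical.

```python
def wedge(i, v):
    """a_i^*: exterior multiplication by the basis covector e_i."""
    out = {}
    for mask, c in v.items():
        if (mask >> i) & 1:
            continue  # e_i ^ e_i = 0
        sign = (-1) ** bin(mask & ((1 << i) - 1)).count("1")
        out[mask | (1 << i)] = sign * c
    return out


def contract(i, v):
    """a_i: contraction with e_i, the adjoint of wedge(i, .)."""
    out = {}
    for mask, c in v.items():
        if not (mask >> i) & 1:
            continue
        sign = (-1) ** bin(mask & ((1 << i) - 1)).count("1")
        out[mask ^ (1 << i)] = sign * c
    return out


def lin(*terms):
    """Linear combination of (coefficient, vector) pairs; zero entries are dropped."""
    out = {}
    for coef, vec in terms:
        for mask, c in vec.items():
            out[mask] = out.get(mask, 0) + coef * c
    return {m: c for m, c in out.items() if abs(c) > 1e-12}


def gamma(i, v):
    # Assumed convention for c(e_i): a_i^* - a_i
    return lin((1, wedge(i, v)), (-1, contract(i, v)))


def gamma_bar(i, v):
    # Assumed convention for c-bar(e_i): i (a_i^* + a_i)
    return lin((1j, wedge(i, v)), (1j, contract(i, v)))


def pi(i, v):
    """pi(e_i) = (c(e_i) - i c-bar(e_i)) / 2, as in Lemma 2.18."""
    return lin((0.5, gamma(i, v)), (-0.5j, gamma_bar(i, v)))


def anticomm(f, i, g, j, v):
    """Apply the anti-commutator {f_i, g_j} to the vector v."""
    return lin((1, f(i, g(j, v))), (1, g(j, f(i, v))))


def check_lemma_2_18(n):
    """Verify the Clifford relations and pi(e_i) = a_i^* on every basis monomial."""
    for mask in range(1 << n):
        v = {mask: 1}
        for i in range(n):
            if lin((1, pi(i, v)), (-1, wedge(i, v))):
                return False  # pi(e_i) must act as wedging by e_i
            for j in range(n):
                target = lin((-2 if i == j else 0, v))
                if lin((1, anticomm(gamma, i, gamma, j, v)), (-1, target)):
                    return False
                if lin((1, anticomm(gamma_bar, i, gamma_bar, j, v)), (-1, target)):
                    return False
                if anticomm(gamma, i, gamma_bar, j, v):
                    return False  # the two Clifford actions anti-commute
                if anticomm(pi, i, pi, j, v):
                    return False  # exterior algebra relation
    return True
```

Calling `check_lemma_2_18(3)` runs through all eight basis monomials of $\Lambda^\bullet\mathbb C^3$; under the stated conventions the map $\pi$ reproduces the wedging operators exactly.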
Note that forms of even degree commute with $\gamma$, whereas forms of odd degree anticommute with $\gamma$. Since the operator $d$ is nilpotent, the exterior derivative of a form $\omega$ is directly given by
$$d\,\omega = [\,d,\omega\,]_g\,, \tag{2.28}$$
where $[\cdot,\cdot]_g$ is the graded commutator. It is obvious that, in the classical case, the counting of the degree of differential forms provides a $\mathbb Z$-grading $T$ on the set of spectral data $(C^\infty(M), L^2(S), d, d^*, \gamma)$ – where we have passed from $D$, $\bar D$ to nilpotent differentials using (2.16); thus we automatically obtain a description of classical Riemannian manifolds by a set of N = 2 spectral data $(C^\infty(M), L^2(S), d, d^*, T)$ as introduced in Sect. 1.2.

2.2.5. Integration. We define the integral and the scalar product on differential forms as in the N = 1 setting:
$$\frac{1}{\mathrm{Rk}(S)\,\mathrm{Vol}(M)}\int_M \mathrm{tr}(\omega(x))\,\sqrt g\,d^nx = {\int\!\!\!\!\!-}\ \omega := \lim_{\varepsilon\to 0^+}\frac{\mathrm{Tr}\bigl(\omega\,e^{-\varepsilon D^2}\bigr)}{\mathrm{Tr}\bigl(e^{-\varepsilon D^2}\bigr)} \tag{2.29}$$
and
$$(\omega,\eta) := {\int\!\!\!\!\!-}\ \omega\,\eta^* \tag{2.30}$$
for all differential forms $\omega$ and $\eta$ acting on $S$. The fact that this gives the correct result follows from
Lemma 2.19. Let $\theta^i$ be an orthonormal basis of $T_p^*M$ and let $a_i^*$ denote the action of $\theta^i$ on $S$. Then
$$\mathrm{tr}\bigl(a_{i_1}\cdots a_{i_k}\,a^*_{j_1}\cdots a^*_{j_l}\bigr) = \frac{\mathrm{Rk}(S)}{2^k}\ \mathrm{sgn}\binom{j_1\ldots j_l}{i_k\ldots i_1}\,,$$
where $\mathrm{sgn}\binom{j_1\ldots j_l}{i_k\ldots i_1}$ denotes the sign of the permutation $\binom{j_1\ldots j_l}{i_k\ldots i_1}$.

2.3. Lie groups. Before we turn to the various types of complex geometries, we briefly recall some facts on this important special case of real Riemannian manifolds. Here, all the geometrical objects like the metric or the Levi–Civita connection can be expressed through Lie algebraic data.

Let $G$ be a compact connected semi-simple Lie group. For each $g\in G$, we denote by $L_g$ the natural left action of $g$ on $G$. The Lie algebra $\mathfrak g$ of $G$ is the space of left-invariant vector fields, i.e.
$$\mathfrak g = \{X\in\Gamma(TG)\ |\ L_{g*}X = X\circ L_g\ \ \forall g\in G\}\,,$$
which is canonically isomorphic to the tangent space $T_eG$ at the unit of $G$: To $X\in T_eG$ one associates the left-invariant vector field $\tilde X$ with values $\tilde X(h) = L_{h*}X$ for all $h\in G$. The Lie algebra $\mathfrak g$ acts on itself by the adjoint representation
$$\mathrm{ad}_X(Y) = [\,X,Y\,]\,,\qquad X,Y\in\mathfrak g\,,$$
and we can introduce the symmetric $\mathfrak g$-invariant Killing form on $\mathfrak g$,
$$\langle X,Y\rangle = \mathrm{Tr}(\mathrm{ad}_X\circ\mathrm{ad}_Y)\,,\qquad X,Y\in\mathfrak g\,,$$
which is non-degenerate (since $G$ is semi-simple) and negative definite. The Killing form provides a Riemannian metric on all of $TG$ by putting
$$g(X,Y) = -\langle L_{h^{-1}*}X,\ L_{h^{-1}*}Y\rangle\,,\qquad X,Y\in T_hG\,.$$
Thus, $G$ carries a canonical Riemannian structure and the general N = 1 or N = (1, 1) formalisms can be applied. But let us instead give more concrete expressions for the metric and, in particular, for the Levi–Civita connection as well as the exterior derivative which are special to Lie groups.

To this end, we choose a basis $\{\vartheta_i\}$, $i = 1,\ldots,n$, of left-invariant vector fields with dual basis $\{\theta^i\}$ of $\Gamma(T^*G)$. The structure constants $f^k_{ij}$ of $G$ with respect to this basis are defined by
$$[\,\vartheta_i,\vartheta_j\,] = f^k_{ij}\,\vartheta_k\,.$$
First note that the metric is given in terms of the structure constants as
$$g(\vartheta_i,\vartheta_j) = -f^k_{il}\,f^l_{jk}\,.$$
Next, using the $\mathfrak g$-invariance of the Killing form, $\langle[X,Y],Z\rangle + \langle Y,[X,Z]\rangle = 0$ for all $X,Y,Z\in\mathfrak g$, the conditions of metricity and vanishing torsion immediately yield the following formula for the Levi–Civita connection on $G$:
$$\nabla\vartheta_i = \tfrac12\,f^j_{ki}\,\theta^k\otimes\vartheta_j\,.$$
If we define, as previously, two sets of operators on $\Lambda^\bullet T^*G$,
$$a^{i*} = \theta^i\wedge\,,\qquad a_i = \vartheta_i\,\lrcorner\,,\qquad i = 1,\ldots,n\,,$$
– which satisfy the CAR – then we can write the Levi–Civita covariant derivative on differential forms as
$$\nabla = \theta^i\otimes\bigl(\vartheta_i - \tfrac12\,f^k_{ij}\,a^{j*}a_k\bigr)\,; \tag{2.31}$$
here $\vartheta_i$ acts as a derivation of the coefficient functions: $\vartheta_i(\omega_{j_1\ldots j_k}\theta^{j_1}\wedge\ldots\wedge\theta^{j_k}) = \vartheta_i(\omega_{j_1\ldots j_k})\,\theta^{j_1}\wedge\ldots\wedge\theta^{j_k}$. Since $\nabla$ has vanishing torsion, we find a simple expression for the exterior derivative on $G$ in terms of the fermionic operators above and the structure constants of the Lie algebra:
$$d = a^{i*}\nabla_i = a^{i*}\vartheta_i - \tfrac12\,f^k_{ij}\,a^{i*}a^{j*}a_k\,. \tag{2.32}$$
Actually, this formula holds for all finite-dimensional Lie groups independently of the additional requirements listed at the beginning of this subsection. The operator on the rhs of (2.32) is known to mathematicians as the coboundary operator in Lie algebra cohomology, and to physicists as the BRST charge. Infinite-dimensional generalizations of (2.32) play an important role in string theory.

For a compact connected semi-simple Lie group $G$, the following identities hold for the lowest cohomology groups – see e.g. [CE]:
$$H^1(G) = H^2(G) = H^4(G) = 0\,,\qquad H^3(G)\ne 0\,.$$
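The formulas of this subsection can be tried out on the smallest example. The following sketch is an illustration under stated assumptions, not part of the original argument: it restricts (2.32) to left-invariant forms (constant coefficients, so the term $a^{i*}\vartheta_i$ drops out), takes the structure constants of so(3) to be the Levi-Civita symbol, and stores basis monomials of $\Lambda^\bullet\mathfrak g^*$ as bitmasks. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def wedge(i, v):
    """a^{i*}: exterior multiplication by theta^i."""
    out = {}
    for mask, c in v.items():
        if (mask >> i) & 1:
            continue
        sign = (-1) ** bin(mask & ((1 << i) - 1)).count("1")
        out[mask | (1 << i)] = sign * c
    return out


def contract(i, v):
    """a_i: contraction with the left-invariant vector field number i."""
    out = {}
    for mask, c in v.items():
        if not (mask >> i) & 1:
            continue
        sign = (-1) ** bin(mask & ((1 << i) - 1)).count("1")
        out[mask ^ (1 << i)] = sign * c
    return out


def eps(i, j, k):
    """Levi-Civita symbol on indices 0, 1, 2, used as f^k_{ij} for so(3)."""
    return (i - j) * (j - k) * (k - i) // 2


def d(v, f, n):
    """d = -1/2 f^k_{ij} a^{i*} a^{j*} a_k, i.e. (2.32) on left-invariant forms."""
    out = {}
    for i, j, k in product(range(n), repeat=3):
        coeff = f(i, j, k)
        if coeff:
            for mask, x in wedge(i, wedge(j, contract(k, v))).items():
                out[mask] = out.get(mask, 0) - Fraction(coeff, 2) * x
    return {m: x for m, x in out.items() if x != 0}


def killing_metric(f, n):
    """g_ij = -f^k_{il} f^l_{jk}, with f(i, j, k) standing for f^k_{ij}."""
    return [[-sum(f(i, l, k) * f(j, k, l) for k in range(n) for l in range(n))
             for j in range(n)] for i in range(n)]


def rank(rows):
    """Rank of a matrix of Fractions by Gaussian elimination."""
    rows = [list(r) for r in rows]
    rk = 0
    for col in range(len(rows[0]) if rows else 0):
        piv = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if piv is None:
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def betti(f, n):
    """Dimensions of the cohomology of d on left-invariant forms, degree by degree."""
    size = 1 << n
    deg = [bin(m).count("1") for m in range(size)]
    ranks, dims = [], []
    for p in range(n + 1):
        images = [d({m: Fraction(1)}, f, n) for m in range(size) if deg[m] == p]
        ranks.append(rank([[img.get(t, Fraction(0)) for t in range(size)]
                           for img in images]))
        dims.append(len(images))
    return [dims[p] - ranks[p] - (ranks[p - 1] if p else 0) for p in range(n + 1)]
```

For so(3), `killing_metric(eps, 3)` gives twice the identity matrix, `d` squares to zero on every basis monomial, and `betti(eps, 3)` reproduces the pattern $H^1 = H^2 = 0$, $H^3\ne 0$ stated above (degree 4 is empty for a three-dimensional group).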
2.4. Algebraic formulation of complex geometry. In this section, we discuss the spectral data associated to complex, Hermitian and Kähler manifolds. Since these data always contain an N = (1, 1) substructure, the reconstruction of the Riemannian aspects proceeds along the lines of the previous sections, and therefore we only show how the additional features are recovered. First, algebraic conditions that tell if a Riemannian manifold admits a complex structure are introduced. Afterwards, we turn to Hermitian manifolds and to the special case of Kähler manifolds (which play an important role in string theory). This will lead to what we will call N = (2, 2) data. In addition, we present an algebraic characterization of holomorphic vector bundles and connections in Subsect. 2.4.4. Although they are complex manifolds, too, we discuss Hyperkähler manifolds only in Sect. 2.5, since their definition in terms of algebraic relations displays a higher type of supersymmetry, called N = (4, 4). For more definitions and further details on complex differential geometry, the reader is referred to the literature, e.g. to [Ko, KN, Wel].

2.4.1. Complex manifolds. Let $(C^\infty(M), L^2(S), D, \bar D, \gamma)$ be N = (1, 1) spectral data associated to a compact Riemannian manifold $M$. In Sect. 2.2 we have seen that using the operator $d = \tfrac12(D - i\bar D)$ we can recover the graded algebra of differential forms on $M$ as an operator algebra $\Omega^\bullet_d(M)$ on $L^2(S)$.

Definition 2.20. The N = (1, 1) spectral data $(C^\infty(M), L^2(S), D, \bar D, \gamma)$ are called complex if there is an operator $T$ on $L^2(S)$ with the following properties:
a) $G := [\,T,d\,]$ satisfies $G^2 = 0$; (2.33)
b) $[\,T,G\,] = G$; (2.34)
c) $[\,T,f\,] = 0$ for all $f\in C^\infty(M)$; (2.35)
d) $[\,T,\omega\,]\in\Omega^1(M)$ for all 1-forms $\omega\in\Omega^1(M)$ acting on $L^2(S)$; (2.36)
e) put $\bar G := d - G$ and define
$$\Omega^{(1,0)}(M) = \Bigl\{\ \sum_{i=1}^N f_0^i\,[\,G,f_1^i\,]\ \Big|\ f_j^i\in C^\infty(M),\ N\in\mathbb N\ \Bigr\}\,,$$
$$\Omega^{(0,1)}(M) = \Bigl\{\ \sum_{i=1}^N f_0^i\,[\,\bar G,f_1^i\,]\ \Big|\ f_j^i\in C^\infty(M),\ N\in\mathbb N\ \Bigr\}\,,$$
then the anti-linear map
$$\natural:\ \Omega^{(1,0)}(M)\longrightarrow\Omega^{(0,1)}(M)\,,\qquad f_0\,[\,G,f_1\,]\longmapsto\bar f_0\,[\,\bar G,\bar f_1\,] \tag{2.37}$$
is an isomorphism; here, complex conjugation of functions is the $*$-operation in $C^\infty(M)$.

Comparing with the usual notions of complex geometry, one recognizes that $T$ counts the holomorphic degree $p$ of a $(p,q)$ form in the Dolbeault complex. On the whole, Definition 2.20 of a “complex manifold” looks very different from the ones given usually which in particular involve the existence of an almost complex structure on the tangent bundle. However, the latter notion cannot be used conveniently in noncommutative geometry so that we are forced to replace it by an algebraic characterization which is natural from a physical point of view and coincides with the usual definition in the classical case as shown below. It is not clear whether the five conditions listed above are independent from each other, though there are examples of Riemannian manifolds without complex structure that satisfy a) through d) but not e).

Lemma 2.21. The following identities hold for complex spectral data:
1) $[\,T,\bar G\,] = 0$,
2) $\{G,d\} = 0$,
3) $\bar G^2 = 0$,
4) $\{G,\bar G\} = 0$.

Proof. 1) $[\,T,\bar G\,] = [\,T,d-G\,] = 0$ by (2.33) and (2.34). 2) $\{G,d\} = \{[\,T,d\,],d\} = 0$ by the Jacobi identity and since $d$ is nilpotent. 3) $\bar G^2 = (d-G)^2 = 0$ by (2.33) and part 2) of the lemma. 4) $\{G,\bar G\} = \{G,d-G\} = 0$ by (2.33) and 2).

For each $f\in C^\infty(M)$ we define
$$D^{(1,0)}f = [\,G,f\,]\,,\qquad D^{(1,1)}f = \{G,[\,\bar G,f\,]\}\,,\qquad D^{(0,1)}f = [\,\bar G,f\,]\,, \tag{2.38}$$
and for $r,s\in\mathbb Z_+$ we set
$$\Omega^{(r,s)}(M) = \Bigl\{\ \sum_{i=1}^N f_0^i\,D^{(\alpha_1,\beta_1)}f_1^i\cdots D^{(\alpha_k,\beta_k)}f_k^i\ \Big|\ f_j^i\in C^\infty(M),\ N\in\mathbb N,\ \sum_n\alpha_n = r,\ \sum_n\beta_n = s\ \Bigr\}\,. \tag{2.39}$$
Note that this definition of the vector space of differential forms is symmetric in $G$ and $\bar G$ in spite of the choice made in defining $D^{(1,1)}$ above, since the Jacobi identity yields $\{G,[\,\bar G,f\,]\} = -\{\bar G,[\,G,f\,]\}$ for all $f\in C^\infty(M)$.

Proposition 2.22. The space of $p$-forms decomposes into a direct sum
$$\Omega^p(M) = \bigoplus_{r+s=p}\Omega^{(r,s)}(M)\,, \tag{2.40}$$
and the induced bi-grading on $\Omega^\bullet(M)$ is compatible with the operators $G$ and $\bar G$, i.e.
$$[\,G,\omega\,]_g\in\Omega^{(r+1,s)}(M)\,, \tag{2.41}$$
$$[\,\bar G,\omega\,]_g\in\Omega^{(r,s+1)}(M)\,, \tag{2.42}$$
for any $\omega\in\Omega^{(r,s)}(M)$; as before, $[\cdot,\cdot]_g$ denotes the graded commutator.

Proof. The conditions (2.34,35) on the operator $T$ and part 1) of Lemma 2.21 imply that
$$[\,T,D^{(\alpha,\beta)}f\,] = \alpha\,D^{(\alpha,\beta)}f\,,$$
and therefore we have $[\,T,\omega\,] = r\,\omega$ for any $\omega\in\Omega^{(r,s)}(M)$, which proves that the summands in the rhs of (2.40) have trivial intersection. Obviously, $d = G + \bar G$ implies the inclusion
$$\Omega^p(M)\subset\bigoplus_{r+s=p}\Omega^{(r,s)}(M)\,;$$
moreover, in the case $p = 1$ the conditions (2.35) and (2.36) give, for any function $f\in C^\infty(M)$,
$$[\,G,f\,] = \bigl[[\,T,d\,],f\bigr] = \bigl[T,[\,d,f\,]\bigr]\in\Omega^1(M)$$
and
$$[\,\bar G,f\,] = [\,d,f\,] - [\,G,f\,]\in\Omega^1(M)\,,$$
which shows that $\Omega^1(M) = \Omega^{(1,0)}(M)\oplus\Omega^{(0,1)}(M)$. For an arbitrary $\omega\in\Omega^{(0,1)}(M)$ we have
$$[\,G,\omega\,]_g = \bigl[[\,T,d\,],\omega\bigr]_g = -\bigl[[\,d,\omega\,]_g,T\bigr] + \bigl[[\,\omega,T\,],d\bigr]_g\in\Omega^2(M)$$
and this gives, in particular, that $\{G,[\,\bar G,f\,]\}\in\Omega^2(M)$ for all $f\in C^\infty(M)$. Since $\Omega^{(1,0)}(M)$, $\Omega^{(0,1)}(M)$ and $\Omega^{(1,1)}(M)$ generate all of $\bigoplus_{r,s}\Omega^{(r,s)}(M)$, equality (2.40) follows. The next statement, Eq. (2.41), is established by the computation
$$\bigl[T,[\,G,\omega\,]_g\bigr] = -\bigl[G,[\,\omega,T\,]\bigr]_g + (-1)^{r+s+1}\bigl[\omega,[\,T,G\,]\bigr]_g = (r+1)\,[\,G,\omega\,]_g\,.$$
Finally, (2.42) follows from
$$[\,\bar G,\omega\,]_g = [\,d,\omega\,]_g - [\,G,\omega\,]_g\in\Omega^{r+s+1}(M)$$
and
$$\bigl[T,[\,\bar G,\omega\,]_g\bigr] = r\,[\,\bar G,\omega\,]_g$$
for $\omega\in\Omega^{(r,s)}(M)$.
To make contact with the usual definition of complex manifolds, let us define a $\mathbb C$-linear operator $J$ on $\Omega^1(M)$ by
$$J|_{\Omega^{(1,0)}(M)} = i\,,\qquad J|_{\Omega^{(0,1)}(M)} = -i\,.$$

Lemma 2.23. The operator $J$ on $\Omega^1(M)$ is an almost complex structure on $M$, i.e. it is a real operator with square $-1$:
1) $(J\omega)^\natural = J(\omega^\natural)$,
2) $J^2 = -1$.

Proof. Property 2) is trivial. Let $\omega\in\Omega^1(M)$, then by Proposition 2.22 we can write
$$\omega = \sum_{i=1}^N f_0^i\,[\,G,f_1^i\,] + \sum_{j=1}^{N'} g_0^j\,[\,\bar G,g_1^j\,]$$
for some functions $f_k^i, g_k^j\in C^\infty(M)$; now we simply insert the definition (2.37) of the anti-linear map $\natural$:
$$\begin{aligned}
(J\omega)^\natural &= \Bigl(\,i\sum_{i=1}^N f_0^i\,[\,G,f_1^i\,] - i\sum_{j=1}^{N'} g_0^j\,[\,\bar G,g_1^j\,]\Bigr)^\natural\\
&= -i\sum_{i=1}^N\bar f_0^i\,[\,\bar G,\bar f_1^i\,] + i\sum_{j=1}^{N'}\bar g_0^j\,[\,G,\bar g_1^j\,] = J(\omega^\natural)\,.
\end{aligned}$$
Theorem 2.24. The operator $J$ on $\Omega^1(M)$ is a complex structure, i.e. $M$ is a complex manifold.

Proof. It only remains to show that the almost complex structure $J$ is integrable: Since $d = G + \bar G$, we have $[\,d,\omega\,]_g\in\Omega^{(r+1,s)}(M)\oplus\Omega^{(r,s+1)}(M)$ for $\omega\in\Omega^{(r,s)}(M)$, and this is equivalent to the required property of $J$; see e.g. [KN].

Theorem 2.24 shows that if complex N = (1, 1) spectral data as defined in Definition 2.20 are associated to a compact Riemannian manifold $M$, then the latter is indeed a complex manifold. We remark that we could have given an alternative algebraic definition of complex manifolds by postulating the existence of two commuting operators $T$ and $\bar T$ on the space of forms which satisfy conditions c) and d) of Definition 2.20 and such that, for $G := [\,T,d\,]$ and $\bar G := [\,\bar T,d\,]$, conditions a) and b) hold. The fifth requirement of Definition 2.20 can be replaced by the condition that $T$ and $-\bar T$ are unitarily equivalent by the Hodge operator.

2.4.2. Hermitian manifolds. Here, we describe the additional structure needed on a set of complex N = (1, 1) spectral data in order to identify the underlying manifold as a Hermitian manifold. Not too surprisingly, the relevant condition is one of Hermiticity.

Definition 2.25. The complex N = (1, 1) spectral data $(C^\infty(M), L^2(S), G, \bar G, T, \gamma)$ are called Hermitian if the operator $T$ is self-adjoint.
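The algebraic conditions of Definition 2.20 and the identities of Lemma 2.21 can be tested in a toy model. The sketch below is only an illustration under explicit assumptions: it works on a single fibre $\Lambda^\bullet\mathbb C^{2n}$ with $n$ "holomorphic" and $n$ "anti-holomorphic" basis covectors, it replaces $d$ by exterior multiplication with a fixed covector `xi` (a stand-in for the principal symbol of $d$, not the differential operator itself), and it takes $T$ to be the operator counting the holomorphic degree. All names are hypothetical.

```python
def wedge(i, v):
    """Exterior multiplication by basis covector number i (bitmask basis)."""
    out = {}
    for mask, c in v.items():
        if (mask >> i) & 1:
            continue
        sign = (-1) ** bin(mask & ((1 << i) - 1)).count("1")
        out[mask | (1 << i)] = sign * c
    return out


def lin(*terms):
    """Linear combination of (coefficient, vector) pairs; zero entries are dropped."""
    out = {}
    for coef, vec in terms:
        for mask, c in vec.items():
            out[mask] = out.get(mask, 0) + coef * c
    return {m: c for m, c in out.items() if c != 0}


def d_symbol(v, xi):
    """Toy replacement for d: wedge with the fixed covector xi."""
    return lin(*[(x, wedge(mu, v)) for mu, x in enumerate(xi)])


def T(v, n):
    """Multiply each monomial by its holomorphic degree p (bits 0..n-1)."""
    out = {}
    for mask, c in v.items():
        p = bin(mask & ((1 << n) - 1)).count("1")
        if p:
            out[mask] = p * c
    return out


def commutator_T(op, v, n):
    """Apply [T, op] to v."""
    return lin((1, T(op(v), n)), (-1, op(T(v, n))))


def check_complex_data(n, xi):
    """Check a), b) of Definition 2.20 and Lemma 2.21 on all basis monomials."""
    d = lambda v: d_symbol(v, xi)
    G = lambda v: commutator_T(d, v, n)           # G := [T, d]
    Gbar = lambda v: lin((1, d(v)), (-1, G(v)))   # G-bar := d - G
    hol = lambda v: d_symbol(v, xi[:n] + [0] * n)  # wedge with the (1,0) part of xi
    for mask in range(1 << (2 * n)):
        v = {mask: 1}
        residues = [
            lin((1, G(v)), (-1, hol(v))),                 # G is the (1,0) part of d
            G(G(v)),                                      # G^2 = 0
            lin((1, commutator_T(G, v, n)), (-1, G(v))),  # [T, G] = G
            commutator_T(Gbar, v, n),                     # [T, G-bar] = 0
            Gbar(Gbar(v)),                                # G-bar^2 = 0
            lin((1, G(Gbar(v))), (1, Gbar(G(v)))),        # {G, G-bar} = 0
        ]
        if any(residues):
            return False
    return True
```

In this model $T$ is diagonal with real entries in the monomial basis, so it is self-adjoint in the sense of Definition 2.25; `check_complex_data(2, [2, -1, 3, 5])` exercises all sixteen monomials of the fibre.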
In the following, we prove that a manifold giving rise to Hermitian spectral data is indeed Hermitian. Recall that by Theorem 2.17, any N = (1, 1) bundle $S$ is of the form $S = \Lambda^\bullet M^{\mathbb C}\otimes E$, where $E$ is a Hermitian vector bundle equipped with a flat Hermitian connection $\nabla^E$; the Clifford actions $c$ and $\bar c$ act trivially on $E$.

Lemma 2.26. The operator $T$ on $\Gamma(S)$ has degree zero with respect to the natural grading on $S$, i.e.
$$T:\ \Gamma(\Lambda^pM^{\mathbb C}\otimes E)\longrightarrow\Gamma(\Lambda^pM^{\mathbb C}\otimes E)\,.$$

Proof. For any $\omega\in\Gamma(T^*M^{\mathbb C})$, denote by $a^*(\omega)$ the wedging operator by $\omega$, and set $a(\omega) := \bigl(a^*(\omega)\bigr)^*$, i.e. $a(\omega)$ acts as contraction by $\omega$. Since $a^*(\omega)\in\Omega^1(M)$, viewed as the space of 1-forms acting on $\Gamma(S)$, condition (2.36) yields
$$[\,T,a^*(\omega)\,] = a^*(T\omega)$$
for some 1-form $T\omega\in\Gamma(T^*M^{\mathbb C})$, and since by definition $T$ is self-adjoint, we get
$$[\,T,a(\omega)\,] = -a(T\omega)\,.$$
This is already sufficient to prove the lemma for $p = 0$, since it implies that for any $\omega\in\Gamma(T^*M)$ and $\xi\in\Gamma(E)$ – 1 is the constant function on $M$ –,
$$a(\omega)\,T(1\otimes\xi) = [\,a(\omega),T\,](1\otimes\xi) + T(a(\omega)1\otimes\xi) = a(T\omega)1\otimes\xi = 0\,,$$
i.e. $T(1\otimes\xi)$ is annihilated by all contraction operators with 1-forms and therefore $T(1\otimes\xi)\in\Gamma(\Lambda^0M^{\mathbb C}\otimes E)$. Now take $\omega\in\Gamma(\Lambda^pM^{\mathbb C})$ and $\xi\in\Gamma(E)$, then
$$T(\omega\otimes\xi) = [\,T,a^*(\omega)\,](1\otimes\xi) + a^*(\omega)\,T(1\otimes\xi)\,,$$
and since $[\,T,a^*(\omega)\,]\in\Omega^p(M)$ by (2.36), we obtain the desired result $T(\omega\otimes\xi)\in\Gamma(\Lambda^pM^{\mathbb C}\otimes E)$.

Theorem 2.27. A compact Riemannian manifold $M$ corresponding to Hermitian spectral data is Hermitian.

Proof. Since Hermitian spectral data are in particular complex N = (1, 1) data, the manifold is complex by the results of the last subsection. Hermiticity of $M$ is equivalent to the orthogonality of the spaces $\Lambda^{(1,0)}M^{\mathbb C}$ and $\Lambda^{(0,1)}M^{\mathbb C}$ with respect to the sesqui-linear form induced by the Riemannian metric. By (2.35), the operator $T$ acts on the fibres, and since it is self-adjoint we can choose an eigenvector $\xi\in\Gamma(\Lambda^\bullet M^{\mathbb C}\otimes E)_x$ of $T$ at $x\in M$ with eigenvalue $\lambda\in\mathbb R$, i.e. $T\xi = \lambda\xi$. Let $\omega\in\Lambda^{(1,0)}T_x^*M$ and $\eta\in\Lambda^{(0,1)}T_x^*M$; then (2.34) and Lemma 2.21, statement 1), imply that
$$T(\omega\otimes\xi) = (\lambda+1)\,\omega\otimes\xi\,,\qquad T(\eta\otimes\xi) = \lambda\,\eta\otimes\xi\,.$$
Since $T$ is self-adjoint, $\omega\otimes\xi$ and $\eta\otimes\xi$ are orthogonal to each other, and with Lemma 2.14 we obtain $\langle\omega,\eta\rangle = 0$.
As was remarked at the end of Sect. 2.4.1, we could also require the existence of two commuting and, in the Hermitian case, self-adjoint operators $T$ and $\bar T$ on $\mathcal H$. Equivalently, the approach of Sect. 1.2 could have been chosen, involving U(1) generators $T_{tot}$ and $J_0$, the second operator being directly related to the complex structure $J$.

2.4.3. Kähler manifolds and Dolbeault cohomology. Let $M$ be a Hermitian manifold with Riemannian metric $g$ and complex structure $J$. We define the fundamental 2-form $\Omega$ by
$$\Omega(X,Y) = g(JX,Y)$$
for all $X,Y\in TM$. It is readily verified that $\Omega$ is a (1,1)-form on $M$.

Definition 2.28. A Hermitian manifold $M$ is a Kähler manifold if its fundamental 2-form is closed, i.e. $d\Omega = 0$. In this case the fundamental form is called the Kähler form.

The following alternative characterization of Kähler manifolds is very useful for computations (for a proof see e.g. [KN, Wel]).

Theorem 2.29. A Hermitian manifold is Kähler if and only if for each $p\in M$ there exists a system $\{z^\mu\}$ of holomorphic coordinates around $p$, also called complex geodesic system, such that
$$g_{\mu\bar\nu}(p) = \delta_{\mu\bar\nu}\,,\qquad \partial_\lambda g_{\mu\bar\nu}(p) = 0\,,\qquad \partial_{\bar\lambda}g_{\mu\bar\nu}(p) = 0\,.$$

To a compact Hermitian manifold of complex dimension $n$ we can associate the canonical Hermitian spectral data $S = \Lambda^\bullet M^{\mathbb C}$, $G = \partial$, $\bar G = \bar\partial$, and the operator $T$ acting on $\Lambda^{(p,q)}M^{\mathbb C}$ by multiplication with $p - \frac n2$. Whether such a manifold – i.e. such a set of spectral data – is Kähler, can also be answered in an algebraic fashion:

Theorem and Definition 2.30. The following statements are equivalent:
1) The manifold $M$ is Kähler.
2) $\{\partial,\bar\partial^*\} = 0$.
3) The holomorphic and anti-holomorphic Laplacians $\Box = \{\partial,\partial^*\}$ resp. $\bar\Box = \{\bar\partial,\bar\partial^*\}$ coincide: $\Box = \bar\Box$.
Hermitian spectral data with differential operators $G\equiv\partial$ and $\bar G\equiv\bar\partial$ that satisfy conditions 2) and 3) will be called N = (2, 2) or Kähler spectral data.

Proof. It is a well-known fact that 1) implies 2) and 3) – see e.g. [Wel]. In order to show the converse, we choose local coordinates $z^\mu$, $\bar z^\mu$, and use the operators
$$a^{\mu*} = dz^\mu\wedge\,,\qquad a^{\bar\mu*} = d\bar z^\mu\wedge$$
as before; together with their adjoints, they satisfy the relations
$$\{a^{\mu*},a_\nu\} = \{a^{\bar\mu*},a_{\bar\nu}\} = \delta^\mu_\nu\,,$$
while all other anti-commutators vanish; of course, $a^\mu = g^{\mu\bar\nu}a_{\bar\nu}$ etc. From the decomposition of $d^* = -a^\mu\nabla_\mu - a^{\bar\mu}\nabla_{\bar\mu}$ – here, $\nabla$ is the Levi–Civita connection, therefore no torsion terms appear in this formula – into $d^* = \partial^* + \bar\partial^*$ we obtain the expressions
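$$\partial^* = -a^{\bar\mu}\partial_{\bar\mu} + \Gamma^{\bar\sigma}_{\bar\mu\bar\nu}\,g^{\bar\mu\rho}\,a_\rho a^{\bar\nu*}a_{\bar\sigma} + \Gamma^{\sigma}_{\mu\bar\nu}\,g^{\mu\bar\rho}\,a_{\bar\rho}a^{\bar\nu*}a_\sigma + \Gamma^{\sigma}_{\bar\mu\nu}\,g^{\bar\mu\rho}\,a_\rho a^{\nu*}a_\sigma\,,$$
$$\bar\partial^* = -a^{\mu}\partial_{\mu} + \Gamma^{\sigma}_{\mu\nu}\,g^{\mu\bar\rho}\,a_{\bar\rho}a^{\nu*}a_\sigma + \Gamma^{\bar\sigma}_{\bar\mu\nu}\,g^{\bar\mu\rho}\,a_\rho a^{\nu*}a_{\bar\sigma} + \Gamma^{\bar\sigma}_{\mu\bar\nu}\,g^{\mu\bar\rho}\,a_{\bar\rho}a^{\bar\nu*}a_{\bar\sigma}\,.$$
The $\Gamma$'s are the Christoffel symbols of the Levi–Civita connection; since $g_{\mu\nu} = g_{\bar\mu\bar\nu} = 0$, they satisfy $\Gamma^{\bar\sigma}_{\mu\nu} = \Gamma^{\sigma}_{\bar\mu\bar\nu} = 0$.

Next, consider the anti-commutator $\{\partial,\bar\partial^*\}$, which, locally, is a sum of differential operators of order 0 and 1. If condition 2) of the theorem holds, these must vanish separately, and this leads to equations on the Christoffel symbols. With the help of the operators $a^{\mu*}$ etc., we compute for the first order part (with $\partial = a^{\mu*}\partial_\mu$)
$$\{\partial,\bar\partial^*\} = \bigl(-\partial_\mu g^{\sigma\bar\nu} - g^{\rho\bar\nu}\,\Gamma^{\sigma}_{\rho\mu} + g^{\bar\rho\sigma}\,\Gamma^{\bar\nu}_{\bar\rho\mu}\bigr)\,a^{\mu*}a_{\bar\nu}\,\partial_\sigma + \ldots\,,$$
where the dots substitute for the zeroth order terms that were omitted. Thus, we arrive at the implication
$$\{\partial,\bar\partial^*\} = 0\ \Longrightarrow\ \partial_\mu g^{\sigma\bar\nu} = -g^{\rho\bar\nu}\,\Gamma^{\sigma}_{\rho\mu} + g^{\bar\rho\sigma}\,\Gamma^{\bar\nu}_{\bar\rho\mu}\,.$$
But since the Levi–Civita connection is metric, i.e.
$$\nabla g = 0\ \Longleftrightarrow\ \partial_\mu g^{\sigma\bar\nu} = -g^{\rho\bar\nu}\,\Gamma^{\sigma}_{\rho\mu} - g^{\bar\rho\sigma}\,\Gamma^{\bar\nu}_{\bar\rho\mu}\,,$$
we further conclude that
$$\Gamma^{\nu}_{\bar\mu\rho} = \Gamma^{\bar\nu}_{\mu\bar\rho} = 0$$
so that the only non-vanishing Christoffel symbols are $\Gamma^{\sigma}_{\mu\nu}$ and $\Gamma^{\bar\sigma}_{\bar\mu\bar\nu}$. This is precisely the condition for the complex structure to be covariantly constant, thus the manifold $M$ is Kähler.

Let us now show that condition 3) also leads to this consequence. In the same fashion as before, we calculate, locally, the first order parts of the operators $\Box$ and $\bar\Box$. We find
$$\begin{aligned}
\{\partial,\partial^*\} &= -\partial_\mu g^{\bar\sigma\nu}\,a^{\mu*}a_\nu\,\partial_{\bar\sigma} + \bigl(g^{\bar\rho\sigma}\,\Gamma^{\bar\nu}_{\bar\rho\bar\mu} - g^{\rho\bar\nu}\,\Gamma^{\sigma}_{\rho\bar\mu}\bigr)\,a^{\bar\mu*}a_{\bar\nu}\,\partial_\sigma\\
&\quad + \bigl(g^{\bar\rho\sigma}\,\Gamma^{\nu}_{\bar\rho\mu} - g^{\bar\rho\nu}\,\Gamma^{\sigma}_{\bar\rho\mu}\bigr)\,a^{\mu*}a_\nu\,\partial_\sigma + 2g^{\mu\bar\nu}\,\Gamma^{\sigma}_{\mu\bar\nu}\,\partial_\sigma + \ldots\,,
\end{aligned}$$
where the dots indicate zeroth and second order differential operators which we do not need to know explicitly. The expression for $\bar\Box$ is obtained by complex conjugation of the above. Assume that condition 3) holds. Then, $\Box - \bar\Box = 0$, and, in particular, the term in the difference which is proportional to the operator $a^{\bar\mu*}a_{\bar\nu}\,\partial_\sigma$ has to vanish:
$$\bigl(g^{\bar\rho\sigma}\,\Gamma^{\bar\nu}_{\bar\rho\bar\mu} - g^{\rho\bar\nu}\,\Gamma^{\sigma}_{\rho\bar\mu} + \partial_{\bar\mu}g^{\sigma\bar\nu}\bigr)\,a^{\bar\mu*}a_{\bar\nu}\,\partial_\sigma = 0\,.$$
But this coefficient is just the complex conjugate of the first order part of $\{\partial,\bar\partial^*\}$ computed above. Thus, in the same way, we conclude that $M$ is Kähler if $\Box = \bar\Box$.

For a Hermitian manifold, we can study not only ordinary de Rham theory of differential forms, but also the cohomology of the complex
$$\Gamma(\Lambda^{(p,0)}M^{\mathbb C})\xrightarrow{\ \bar\partial\ }\Gamma(\Lambda^{(p,1)}M^{\mathbb C})\xrightarrow{\ \bar\partial\ }\ldots\xrightarrow{\ \bar\partial\ }\Gamma(\Lambda^{(p,q)}M^{\mathbb C})\xrightarrow{\ \bar\partial\ }\Gamma(\Lambda^{(p,q+1)}M^{\mathbb C})\xrightarrow{\ \bar\partial\ }\ldots\,,$$
which is called Dolbeault cohomology and denoted by $H^{(p,\bullet)}(M)$; the dimensions $h^{p,q}$ of the spaces $H^{(p,q)}(M)$ are called Hodge numbers. It is customary to present the Hodge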
numbers of a Hermitian manifold of complex dimension $n$ in the form of the so-called Hodge diamond:
$$\begin{array}{ccccccc}
 & & & h^{n,n} & & & \\
 & & h^{n,n-1} & & h^{n-1,n} & & \\
 & \cdot^{\,\cdot^{\,\cdot}} & & & & \ddots & \\
h^{n,0} & & h^{n-1,1} & \cdots & h^{1,n-1} & & h^{0,n} \\
 & \ddots & & & & \cdot^{\,\cdot^{\,\cdot}} & \\
 & & h^{1,0} & & h^{0,1} & & \\
 & & & h^{0,0} & & &
\end{array}$$
If the manifold is Kähler there are several equations between these numbers as well as the Betti numbers $b_p = \dim H^p_{dR}(M)$:

Theorem 2.31. Let $M$ be a Kähler manifold of complex dimension $n$; then
1) $h^{p,q} = h^{q,p}$;
2) $h^{p,q} = h^{n-p,n-q}$;
3) $b_p = \sum_{r+s=p}h^{r,s}$;
4) $b_{2p-1}$ is even;
5) $b_{2p}\ge 1$ for all $1\le p\le n$.

Proof. From the Hodge decomposition theorem, see e.g. [Wel],
$$\Gamma(\Lambda^{(p,q)}M^{\mathbb C}) = \mathcal H^{(p,q)}(M^{\mathbb C})\oplus\bar\partial\,\Gamma(\Lambda^{(p,q-1)}M^{\mathbb C})\oplus\bar\partial^*\,\Gamma(\Lambda^{(p,q+1)}M^{\mathbb C})\,,$$
and the fact that the Laplace operators satisfy
$$\Box = \bar\Box = \tfrac12\,\triangle$$
we obtain statement 1) using complex conjugation, statement 2) with the help of the Hodge $*$-operator, and statement 3) using $\Box = \tfrac12\triangle$; fact 4) is a consequence of 1) and 3), and 5) follows from the existence of the Kähler form $\Omega$ along with its powers up to $\Omega^{\wedge n}$, which are closed but not exact since the latter is the volume form on $M$.

A subclass of Kähler manifolds which is of particular interest for superstring theory is given by the so-called Calabi–Yau manifolds.

Definition 2.32. A Calabi–Yau manifold is a compact Kähler manifold whose first Chern class vanishes.

It is clear that a Ricci flat Kähler manifold has vanishing first Chern class; that the converse is also true was conjectured by Calabi [Cal] and proven by Yau [Y]:

Theorem 2.33. Let $M$ be a compact Kähler manifold with vanishing first Chern class and let $\Omega$ be its Kähler form. Then there exists a unique Ricci flat metric whose Kähler form is in the same cohomology class as $\Omega$.

Calabi–Yau manifolds have a number of striking properties. We list two of them in the following theorem, the proof of which can be found e.g. in [Can].
Theorem 2.34. Let $M$ be a Calabi–Yau manifold of complex dimension $n$ such that the Euler characteristic $\chi(M)\ne 0$. Then the first Betti number of $M$ vanishes,
$$b_1 = 2h^{1,0} = 0\,.$$
Furthermore, there exists a harmonic $(n,0)$-form on $M$ which is covariantly constant with respect to the Ricci flat metric. In particular, for $n = 3$, one has $h^{3,0} = 1$.

The existence of a covariantly constant $(n,0)$-form supplies a further symmetry of the Hodge numbers, $h^{p,0} = h^{n-p,0}$, which – after some additional work – allows one to fix the Hodge diamond of a Calabi–Yau three-fold with $\chi(M)\ne 0$ up to two numbers:
$$\begin{array}{ccccccc}
 & & & 1 & & & \\
 & & 0 & & 0 & & \\
 & 0 & & h^{1,1} & & 0 & \\
1 & & h^{2,1} & & h^{2,1} & & 1 \\
 & 0 & & h^{1,1} & & 0 & \\
 & & 0 & & 0 & & \\
 & & & 1 & & &
\end{array}$$
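The constraints of Theorem 2.31 and the shape of the Calabi–Yau diamond are easy to tabulate. The following sketch stores the Hodge numbers of a three-fold as a 4×4 table `h[p][q]` and checks the listed relations; the function names are hypothetical, and the sample values $h^{1,1} = 1$, $h^{2,1} = 101$ (those of the quintic three-fold) are an illustrative choice that does not appear in the text above.

```python
def cy3_hodge_diamond(h11, h21):
    """Hodge numbers h[p][q] of a Calabi-Yau three-fold with the two free entries given."""
    h = [[0] * 4 for _ in range(4)]
    h[0][0] = h[3][3] = 1          # constants and the volume class
    h[3][0] = h[0][3] = 1          # the covariantly constant (3,0)-form and its conjugate
    h[1][1] = h[2][2] = h11
    h[2][1] = h[1][2] = h21
    return h


def betti_numbers(h):
    """b_p = sum of h^{r,s} over r + s = p (Theorem 2.31, statement 3)."""
    n = len(h) - 1
    return [sum(h[r][p - r] for r in range(n + 1) if 0 <= p - r <= n)
            for p in range(2 * n + 1)]


def euler_characteristic(h):
    """Alternating sum of the Betti numbers."""
    return sum((-1) ** p * b for p, b in enumerate(betti_numbers(h)))


def satisfies_theorem_2_31(h):
    """Check statements 1), 2), 4) and 5) of Theorem 2.31 for a table of Hodge numbers."""
    n = len(h) - 1
    b = betti_numbers(h)
    indices = [(p, q) for p in range(n + 1) for q in range(n + 1)]
    conjugation = all(h[p][q] == h[q][p] for p, q in indices)
    hodge_star = all(h[p][q] == h[n - p][n - q] for p, q in indices)
    odd_betti_even = all(b[k] % 2 == 0 for k in range(1, 2 * n, 2))
    even_betti_positive = all(b[2 * p] >= 1 for p in range(1, n + 1))
    return conjugation and hodge_star and odd_betti_even and even_betti_positive
```

For a diamond of this shape the Euler characteristic reduces to $2\,(h^{1,1} - h^{2,1})$, so the hypothesis $\chi(M)\ne 0$ of Theorem 2.34 amounts to $h^{1,1}\ne h^{2,1}$.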
It can be shown that, for an arbitrary Kähler manifold $M$, deformations of the complex and Kähler structures are parameterized by elements of $H^{(2,1)}(M)$ and $H^{(1,1)}(M)$, respectively, which gives a geometrical meaning to the two free Hodge numbers above. But let us remark that the latter also have a deep interpretation within string theory, see e.g. [GSW]. Investigations of Calabi–Yau manifolds within this context have led to the conjecture of a new symmetry among Calabi–Yau manifolds which has become important for purely mathematical considerations, too: The idea of mirror symmetry suggests that to an $n$-dimensional Calabi–Yau manifold $M$ one can associate another Calabi–Yau $n$-fold $\widetilde M$ such that the Hodge numbers satisfy
$$h^{p,q}_M = h^{n-p,q}_{\widetilde M}\,;$$
for $n = 3$, this means e.g. that deformations of the complex structure on $M$ correspond to deformations of the Kähler structure on $\widetilde M$ and vice versa. From the point of view of superconformal quantum field theory, it is almost trivial to predict the existence of “mirrors” of Calabi–Yau manifolds; their explicit construction is, however, quite involved, even within a heuristic context. The general investigation of mirror symmetry in mathematically rigorous terms is far from being complete.

2.4.4. Holomorphic vector bundles and connections. Let $(C^\infty(M), L^2(S), G, \bar G, T, \gamma)$ be the canonical spectral data associated to a Hermitian manifold $M$, see the paragraph preceding Theorem 2.30. We have seen in the previous sections how to recover the complex geometry from these. Now we will characterize holomorphic bundles over $M$ as well as holomorphic connections on such bundles. The bundle of 1-forms on $M$ decomposes into the direct sum
$$T^*M^{\mathbb C} = \Lambda^{(1,0)}M^{\mathbb C}\oplus\Lambda^{(0,1)}M^{\mathbb C}$$
Supersymmetric Quantum Theory and Differential Geometry
of the bundles of holomorphic and anti-holomorphic 1-forms. Let $E$ be a complex vector bundle over $M$ with a connection $\nabla$,

$$ \nabla : \Gamma(E) \longrightarrow \Gamma(T^*M^{\mathbb C} \otimes E) . $$

We can decompose $\nabla$ according to its range, i.e. we write $\nabla = \nabla^{(1,0)} + \nabla^{(0,1)}$ with

$$ \nabla^{(\alpha,\beta)} : \Gamma(E) \longrightarrow \Gamma(\Lambda^{(\alpha,\beta)} M^{\mathbb C} \otimes E) , \qquad \alpha, \beta = 0, 1 . $$

The Leibniz rule for $\nabla$ is refined as follows:

$$
\begin{aligned}
\nabla^{(1,0)} (f \xi) &= \partial f \otimes \xi + f\, \nabla^{(1,0)} \xi , \\
\nabla^{(0,1)} (f \xi) &= \bar\partial f \otimes \xi + f\, \nabla^{(0,1)} \xi ,
\end{aligned}
$$

for all $f \in C^\infty(M)$ and $\xi \in \Gamma(E)$. The characterization of holomorphic vector bundles over $M$ is based on the following theorem, the proof of which can be found e.g. in [Ko]:

Theorem 2.35. Let $E$ be a complex vector bundle over a complex manifold $M$ and $\nabla$ be a connection on $E$ such that the map

$$ \nabla^{(0,1)} \circ \nabla^{(0,1)} : \Gamma(E) \longrightarrow \Gamma(\Lambda^{(0,2)} M^{\mathbb C} \otimes E) $$

vanishes. Then there is a unique holomorphic vector bundle structure on $E$ such that $\nabla^{(0,1)} \xi = 0$ for any local holomorphic section $\xi$ of $E$.

Recall that complex vector bundles over $M$ are in one-to-one correspondence with finitely generated projective modules over $C^\infty(M)$. Altogether, this leads us to the following definitions:

Definition 2.36. A complex structure on a finitely generated projective left $C^\infty(M)$-module $E$ is a connection $\nabla$ on $E$ such that $\nabla^{(0,1)} \circ \nabla^{(0,1)} = 0$. The pair $(E, \nabla)$ is then called a holomorphic vector bundle. Any other connection $\widetilde\nabla$ on $(E, \nabla)$ is a holomorphic connection if

$$ \nabla - \widetilde\nabla \in \mathrm{Hom}_{C^\infty(M)} \bigl( E,\ \Omega^{(1,0)}(M) \otimes_{C^\infty(M)} E \bigr) . \tag{2.43} $$

The complex structures $\nabla$ and $\widetilde\nabla$ are called equivalent if (2.43) holds.

2.5. Hyperkähler manifolds.

After having discussed the case of Kähler manifolds at some length, we will now focus on an even more special type of complex geometry whose algebraic characterization will involve an N = (4, 4) supersymmetry algebra. Hyperkähler manifolds are interesting from the mathematical point of view, since they admit a metric with vanishing Einstein tensor. They also were discussed in the physical literature in the context of N = 4 supersymmetric non-linear sigma models [AGF, HKLR] with a classical target. But it appears that the possibility of a direct algebraic interpretation of the Hyperkähler axioms has been overlooked until now. This is also the reason why we will be more explicit in the proofs contained in this section.
2.5.1. Definition and basic properties of Hyperkähler manifolds.

Definition 2.37. A Riemannian manifold $(M, g)$ is a Hyperkähler manifold if it carries three complex structures $I$, $J$ and $K$ satisfying the quaternion algebra

$$ IJ = -JI = K , \qquad JK = -KJ = I , \qquad KI = -IK = J , \tag{2.44} $$

and such that $g$ is a Kähler metric for $I$, $J$ and $K$.

From the representation theory of the quaternion algebra we conclude that a Hyperkähler manifold must satisfy

$$ \dim_{\mathbb R} M \equiv 0 \mod 4 . \tag{2.45} $$
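The simplest instance of the relations (2.44) is the flat model $\mathbb{R}^4 \cong \mathbb{H}$, where $I$, $J$, $K$ act as left multiplication by the unit quaternions. The Python sketch below builds the three $4 \times 4$ matrices from the quaternion product and checks (2.44) together with compatibility with the Euclidean metric. This flat model and all function names are assumptions made for illustration; they are not taken from the text.

```python
def qmul(x, y):
    """Product of two quaternions given as (a, b, c, d) = a + b i + c j + d k."""
    a1, b1, c1, d1 = x
    a2, b2, c2, d2 = y
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)


BASIS = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]


def left_mult(q):
    """4x4 matrix of x -> q * x in the basis (1, i, j, k)."""
    cols = [qmul(q, e) for e in BASIS]
    return [[cols[c][r] for c in range(4)] for r in range(4)]


def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def neg(A):
    return [[-x for x in row] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


IDENTITY = [[1 if r == c else 0 for c in range(4)] for r in range(4)]
I, J, K = (left_mult(BASIS[m]) for m in (1, 2, 3))

# Quaternion relations (2.44) and I^2 = J^2 = K^2 = -1.
print(matmul(I, J) == K, matmul(J, I) == neg(K))
print(matmul(J, K) == I, matmul(K, I) == J)
print(all(matmul(S, S) == neg(IDENTITY) for S in (I, J, K)))
# Each structure is orthogonal for the flat metric: S^T S = 1.
print(all(matmul(transpose(S), S) == IDENTITY for S in (I, J, K)))
```

Because the three matrices are real and act on a real vector space carrying a representation of the quaternions, the real dimension is forced to be a multiple of four, in line with (2.45).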
Let $(M, g, I, J, K)$ be a Hyperkähler manifold. Then we can consider $(M, g, I)$ as a Kähler manifold and decompose the complexified tangent bundle $TM$ into its holomorphic and anti-holomorphic parts

$$ TM = T^+M \oplus T^-M , \qquad I|_{T^\pm M} = \pm i . \tag{2.46} $$

Let $P^\pm$ denote the projections $TM \longrightarrow T^\pm M$; then we have from (2.44),

$$ 0 = P^\pm \{ I, J \} P^\pm = \pm 2i\, P^\pm J P^\pm , \qquad K P^\pm = IJ P^\pm = \mp i\, P^\mp J P^\pm , $$

which implies

$$ J, K : T^\pm M \longrightarrow T^\mp M \tag{2.47} $$

and

$$ K|_{T^\pm M} = \mp i\, J|_{T^\pm M} . \tag{2.48} $$
We define the holomorphic symplectic form on $M$ by

$$ \omega(\cdot, \cdot) = \tfrac{1}{2}\, g\bigl( (J + iK)\,\cdot , \cdot \bigr) ; \tag{2.49} $$

if we denote by $\Omega_J$ and $\Omega_K$ the Kähler forms associated with the complex structures $J$ and $K$, resp., then we have the relation

$$ \omega = \tfrac{1}{2} ( \Omega_J + i\, \Omega_K ) , \tag{2.50} $$

which shows that the form $\omega$ is indeed closed and – with Eqs. (2.47, 2.48) – is a holomorphic 2-form (equivalently, $\bar\omega$ is an anti-holomorphic symplectic form). Moreover, (2.48) allows us to write

$$ \omega(\cdot, \cdot) = g(J \cdot, \cdot) \quad \text{on } T^+M , \tag{2.51} $$

which shows that $\omega$ is non-degenerate and, therefore, that, on a Hyperkähler manifold, there exists a holomorphic volume form $\mu = \omega \wedge \ldots \wedge \omega$ ($\dim_{\mathbb R} M / 4$ factors). Let $U$ be a coordinate neighborhood with holomorphic coordinates $z^1, \ldots, z^n$. From Eqs. (2.47, 2.48, 2.51), we can obtain the following local expressions which will be useful later on:
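$$ J_\mu{}^{\nu} = J_{\bar\mu}{}^{\bar\nu} = K_\mu{}^{\nu} = K_{\bar\mu}{}^{\bar\nu} = 0 , \tag{2.52} $$

$$ K_\mu{}^{\bar\nu} = -i\, J_\mu{}^{\bar\nu} , \tag{2.53} $$

$$ K_{\bar\mu}{}^{\nu} = i\, J_{\bar\mu}{}^{\nu} , \tag{2.54} $$

$$ \omega_{\mu\nu} = J_\mu{}^{\bar\lambda}\, g_{\bar\lambda \nu} = J_{\mu\nu} , \tag{2.55} $$

$$ \omega = \tfrac{1}{2}\, J_{\mu\nu}\, dz^\mu \wedge dz^\nu . \tag{2.56} $$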
In local coordinates, the complex conjugate of the holomorphic symplectic form is given by $\bar\omega = \tfrac{1}{2} J_{\bar\mu \bar\nu}\, dz^{\bar\mu} \wedge dz^{\bar\nu}$.

2.5.2. The N = (4, 4) data of a Hyperkähler manifold.

In this section we prove that the canonical set of N = (2, 2) data on a Hyperkähler manifold extends to what we will call N = (4, 4) spectral data. The canonical N = (2, 2) data on a Kähler manifold are given by the tuple $(C^\infty(M), L^2(\Lambda^\bullet M^{\mathbb C}), \partial, \bar\partial, T)$, where

$$ T|_{\Lambda^{(p,q)} M} = p - \tfrac{1}{2} \dim_{\mathbb C} M \tag{2.57} $$

counts the holomorphic degree of differential forms – including a “normalization term” which makes the spectrum of $T$ symmetric around zero. Let now $(M, g, I, J, K)$ be a Hyperkähler manifold of complex dimension $n$, and let $(M, g, I)$ be the underlying Kähler manifold. We define operators

$$ \square = \{ \partial, \partial^* \} , \tag{2.58} $$

$$ G^{1+} = \partial , \tag{2.59} $$

$$ G^{2+} = [\, \iota(\bar\omega), \partial \,] , \tag{2.60} $$

$$ T^1 = \tfrac{1}{2} \bigl( \iota(\bar\omega) + \epsilon(\omega) \bigr) , \tag{2.61} $$

$$ T^2 = \tfrac{i}{2} \bigl( \iota(\bar\omega) - \epsilon(\omega) \bigr) , \tag{2.62} $$

$$ T^3 = \tfrac{1}{2} \bigl( p - \tfrac{n}{2} \bigr) \quad \text{on } \Lambda^{(p,q)} M , \tag{2.63} $$
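where $\omega$ denotes the holomorphic symplectic form on $M$, $\epsilon(\omega)$ is the wedging operator by $\omega$ and $\iota(\bar\omega)$ its adjoint, i.e. contraction by $\bar\omega$. In addition, for $a = 1, 2$ we introduce

$$ G^{a-} = \bigl( G^{a+} \bigr)^* . \tag{2.64} $$

Theorem 2.38. The operators defined by Eqs. (2.58–2.64) satisfy the (anti-)commutation relations

$$ [\, \square, G^{a+} \,] = 0 , \quad a = 1, 2 , \tag{2.65} $$

$$ [\, \square, T^i \,] = 0 , \quad i = 1, 2, 3 , \tag{2.66} $$

$$ \{ G^{a+}, G^{b+} \} = 0 , \quad a, b = 1, 2 , \tag{2.67} $$

$$ \{ G^{a-}, G^{b+} \} = \delta^{ab}\, \square , \quad a, b = 1, 2 , \tag{2.68} $$

$$ [\, T^i, G^{a+} \,] = \tfrac{1}{2}\, \tau^i_{ab}\, G^{b+} , \quad i = 1, 2, 3 , \ a = 1, 2 , \tag{2.69} $$

$$ [\, T^i, T^j \,] = i\, \epsilon^{ijk}\, T^k , \quad i, j = 1, 2, 3 ; \tag{2.70} $$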
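$\tau^i$ are the Pauli matrices defined in the appendix. In addition, the following Hermiticity conditions hold:

$$ \square^* = \square , \qquad \bigl( G^{a\pm} \bigr)^* = G^{a\mp} , \qquad \bigl( T^i \bigr)^* = T^i . \tag{2.71} $$

In particular, these operators generate a finite-dimensional $\mathbb{Z}_2$-graded (or super) Lie algebra.

Proof. We begin with the SU(2) commutation relations, Eq. (2.70). Since $\omega$ is a (2,0)-form, the definitions immediately give

$$
\begin{aligned}
[\, T^3, T^1 \,] &= \tfrac{1}{2} \bigl( -\iota(\bar\omega) + \epsilon(\omega) \bigr) = i\, T^2 , \\
[\, T^3, T^2 \,] &= \tfrac{i}{2} \bigl( -\iota(\bar\omega) - \epsilon(\omega) \bigr) = -i\, T^1 .
\end{aligned}
$$

To obtain the last SU(2) commutator, we need the following

Lemma 2.39. $[\, \epsilon(\omega), \iota(\bar\omega) \,] = p - \tfrac{n}{2}$

Proof. Let $z^\mu$ be holomorphic local coordinates on $M$ with the properties of Theorem 2.29; set $a^{\mu *} = \epsilon(dz^\mu)$ and $a^{\bar\mu} = \iota(d\bar z^{\mu})$. These operators satisfy the anti-commutation relations

$$ \{ a^{\mu *}, a^{\nu *} \} = \{ a^{\bar\mu}, a^{\bar\nu} \} = 0 , \qquad \{ a^{\mu *}, a^{\bar\nu} \} = g^{\mu \bar\nu} . \tag{2.72} $$

Then we can write $\epsilon(\omega) = \tfrac{1}{2} \omega_{\mu\nu}\, a^{\mu *} a^{\nu *}$ and $\iota(\bar\omega) = (\epsilon(\omega))^* = -\tfrac{1}{2} \omega_{\bar\rho \bar\sigma}\, a^{\bar\rho} a^{\bar\sigma}$. Now the calculation is straightforward:

$$
\begin{aligned}
[\, \epsilon(\omega), \iota(\bar\omega) \,]
&= -\tfrac{1}{4}\, \omega_{\mu\nu}\, \omega_{\bar\rho \bar\sigma}\, [\, a^{\mu *} a^{\nu *}, a^{\bar\rho} a^{\bar\sigma} \,] \\
&= -\tfrac{1}{4}\, \omega_{\mu\nu}\, \omega_{\bar\rho \bar\sigma} \bigl( g^{\nu \bar\rho} a^{\mu *} a^{\bar\sigma} - g^{\nu \bar\sigma} a^{\mu *} a^{\bar\rho} + g^{\mu \bar\rho} a^{\bar\sigma} a^{\nu *} - g^{\mu \bar\sigma} a^{\bar\rho} a^{\nu *} \bigr) \\
&= -\tfrac{1}{4}\, J_{\mu\nu} \bigl( J^{\nu}{}_{\bar\sigma}\, a^{\mu *} a^{\bar\sigma} - J_{\bar\rho}{}^{\nu}\, a^{\mu *} a^{\bar\rho} + J^{\mu}{}_{\bar\sigma}\, a^{\bar\sigma} a^{\nu *} - J_{\bar\rho}{}^{\mu}\, a^{\bar\rho} a^{\nu *} \bigr) \\
&= -\tfrac{1}{4} \bigl( -g_{\mu \bar\sigma}\, a^{\mu *} a^{\bar\sigma} - g_{\mu \bar\rho}\, a^{\mu *} a^{\bar\rho} + g_{\nu \bar\sigma}\, a^{\bar\sigma} a^{\nu *} + g_{\nu \bar\rho}\, a^{\bar\rho} a^{\nu *} \bigr) \\
&= \tfrac{1}{2}\, g_{\mu \bar\nu} \bigl( a^{\mu *} a^{\bar\nu} - a^{\bar\nu} a^{\mu *} \bigr) = g_{\mu \bar\nu}\, a^{\mu *} a^{\bar\nu} - \tfrac{1}{2}\, g_{\mu \bar\nu}\, g^{\mu \bar\nu} \\
&= p - \tfrac{n}{2}
\end{aligned}
$$

on $\Lambda^{(p,q)} M^{\mathbb C}$, since $g_{\mu \bar\nu}\, a^{\mu *} a^{\bar\nu}$ is the “number operator”.

From the preceding lemma, it follows easily that

$$ [\, T^1, T^2 \,] = \tfrac{i}{2} \bigl( p - \tfrac{n}{2} \bigr) = i\, T^3 $$

on $\Lambda^{(p,q)} M^{\mathbb C}$, which proves (2.70).

To derive Eq. (2.69), we recall that the symplectic form $\omega$ is closed, $\partial \omega = \bar\partial \omega = 0$; since $[\, \epsilon(\omega), \partial \,] = -\epsilon(\partial \omega) = 0$, we obtain

$$ [\, T^i, G^{1+} \,] = \tfrac{1}{2}\, \tau^i_{1a}\, G^{a+} . $$

The commutation relations $[\, T^i, G^{2+} \,]$ will be computed with the help of another lemma: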
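Lemma 2.40. $\bigl[\, \iota(\bar\omega), [\, \iota(\bar\omega), \partial \,] \,\bigr] = 0$

Proof. Using the same coordinates and notations as in the proof of Lemma 2.39, we have

$$
\begin{aligned}
[\, \iota(\bar\omega), \partial \,]
&= -\tfrac{1}{2}\, \omega_{\bar\mu \bar\nu}\, [\, a^{\bar\mu} a^{\bar\nu}, \partial \,]
 = -\tfrac{1}{2}\, \omega_{\bar\mu \bar\nu} \bigl( a^{\bar\mu} \{ a^{\bar\nu}, \partial \} - \{ a^{\bar\mu}, \partial \}\, a^{\bar\nu} \bigr) \\
&= -\tfrac{1}{2}\, \omega_{\bar\mu \bar\nu} \bigl( a^{\bar\mu}\, \delta^{\bar\nu \lambda} - a^{\bar\nu}\, \delta^{\bar\mu \lambda} \bigr)\, \partial_\lambda
\end{aligned}
$$

at the center of the holomorphic geodesic coordinate system. This implies that

$$ \bigl[\, \iota(\bar\omega), [\, \iota(\bar\omega), \partial \,] \,\bigr] = -\tfrac{1}{2}\, \omega_{\bar\rho \bar\sigma}\, \omega_{\bar\mu \bar\nu}\, [\, a^{\bar\rho} a^{\bar\sigma}, a^{\bar\nu} \,]\, \delta^{\bar\mu \lambda}\, \partial_\lambda = 0 . $$

Together with the Jacobi identity, the last lemma yields

$$ [\, T^1, G^{2+} \,] = \tfrac{1}{2} \bigl[\, \epsilon(\omega), [\, \iota(\bar\omega), \partial \,] \,\bigr] = \tfrac{1}{2} \bigl[\, \partial, [\, \iota(\bar\omega), \epsilon(\omega) \,] \,\bigr] = \tfrac{1}{2}\, \partial = \tfrac{1}{2}\, \tau^1_{21}\, G^{1+} $$

and analogously

$$
\begin{aligned}
[\, T^2, G^{2+} \,] &= -\tfrac{i}{2} \bigl[\, \epsilon(\omega), [\, \iota(\bar\omega), \partial \,] \,\bigr] = -\tfrac{i}{2}\, \partial = \tfrac{1}{2}\, \tau^2_{21}\, G^{1+} , \\
[\, T^3, G^{2+} \,] &= -\tfrac{1}{2}\, G^{2+} = \tfrac{1}{2}\, \tau^3_{22}\, G^{2+} ,
\end{aligned}
$$

which proves (2.69). We proceed with Eq. (2.68) in Theorem 2.38,

$$ \{ G^{1-}, G^{2+} \} = \{ \partial^*, [\, \iota(\bar\omega), \partial \,] \} = \bigl[\, \iota(\bar\omega), \{ \partial, \partial^* \} \,\bigr] + \{ \partial, [\, \partial^*, \iota(\bar\omega) \,] \} = [\, \iota(\bar\omega), \square \,] $$

where we have used the Jacobi identity and $[\, \partial^*, \iota(\bar\omega) \,] = [\, \epsilon(\omega), \partial \,]^* = 0$. On a Kähler manifold, we have $2 \square = \triangle$, where $\triangle$ is the Laplace–Beltrami operator. One of the Hodge identities, see e.g. [Wel], reads $[\, \triangle, \iota(\Omega) \,] = 0$, where $\Omega$ denotes the Kähler form. Since $\omega$ is a linear combination of the Kähler forms associated to the complex structures $J$ and $K$ on the Hyperkähler manifold $M$, we have

$$ \{ G^{1-}, G^{2+} \} = [\, \iota(\bar\omega), \square \,] = 0 . \tag{2.73} $$

The other commutation relations of the $G^{a\pm}$ are

$$ \{ G^{1-}, G^{1+} \} = \{ \partial^*, \partial \} = \square \tag{2.74} $$

by definition, and using Eqs. (2.69, 2.71, 2.73, 2.74) we obtain

$$
\begin{aligned}
\{ G^{2-}, G^{1+} \} &= \{ G^{2+}, G^{1-} \}^* = 0 , \\
\{ G^{2-}, G^{2+} \} &= 2 \{ G^{2-}, [\, T^1, G^{1+} \,] \} = 2 \bigl[\, T^1, \{ G^{1+}, G^{2-} \} \,\bigr] + 2 \{ G^{1+}, [\, G^{2-}, T^1 \,] \} = \{ G^{1+}, G^{1-} \} = \square ,
\end{aligned}
$$

which proves (2.68). The remaining equations are much simpler: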
$$ \{ G^{1+}, G^{1+} \} = \{ \partial, \partial \} = 0 , \tag{2.75} $$

$$ \{ G^{1+}, G^{2+} \} = 2 \{ G^{1+}, [\, T^1, G^{1+} \,] \} = 0 \tag{2.76} $$

by (2.69) and the Jacobi identity;

$$ \{ G^{2+}, G^{2+} \} = 2 \{ G^{2+}, [\, T^1, G^{1+} \,] \} = 2 \bigl[\, T^1, \{ G^{1+}, G^{2+} \} \,\bigr] + 2 \{ G^{1+}, [\, G^{2+}, T^1 \,] \} = - \{ G^{1+}, G^{1+} \} = 0 $$

with the help of (2.69, 2.75, 2.76) and again the Jacobi identity; this proves (2.67). It only remains to show that the Laplace operator commutes with all other operators introduced at the beginning of this section. The Hodge identity

$$ [\, \square, \epsilon(\omega) \,] = [\, \square, \iota(\bar\omega) \,] = 0 $$

yields $[\, \square, T^1 \,] = [\, \square, T^2 \,] = 0$, and $[\, \square, T^3 \,] = 0$ follows from the fact that $\square$ has bi-degree (0, 0) – proving (2.66). Finally, we have

$$ [\, \square, G^{a+} \,] = \bigl[\, \{ G^{a+}, G^{a-} \}, G^{a+} \,\bigr] = 0 $$

by (2.67, 2.68) and the Jacobi identity. This completes the proof of (2.65) and of Theorem 2.38.

If we define the anti-holomorphic analogues $\bar\square$, $\bar G^{a\pm}$ and $\bar T^i$ of the previous generators $\square$, $G^{a\pm}$, $T^i$, we get a second copy of the algebra established in Theorem 2.38. These two copies (anti-)commute, as is easily verified: It suffices to show that $[\, \partial, \iota(\omega) \,] = 0$ and $[\, \partial, \epsilon(\bar\omega) \,] = 0$; the first equation follows from

$$ [\, \partial, \iota(\omega) \,] = -\tfrac{1}{2}\, [\, a^{\mu *} \partial_\mu , \omega_{\nu\rho}\, a^{\nu} a^{\rho} \,] = 0 $$

at the center of a holomorphic geodesic system, and the second equation simply means $\partial \bar\omega = 0$. This leads us to the following

Definition 2.41. A tuple $(C^\infty(M), L^2(\Lambda^\bullet M^{\mathbb C}), \partial, \bar\partial, T^i, \bar T^i, i = 1, 2, 3)$ with operators subject to the relations of Theorem 2.38 and the corresponding relations for the anti-holomorphic analogues – which (anti-)commute with the holomorphic operators – is called a set of N = (4, 4) spectral data. It characterizes $M$ as a Hyperkähler manifold.

2.5.3. Characterization of Hyperkähler manifolds.

In Sect. 2.4, we have seen that Kähler manifolds can be characterized algebraically, i.e. that one can identify the operators $\partial = G^{1+}$ and $T^3 = \tfrac{1}{2}(p - n/2)$ on $\Lambda^{(p,q)} M^{\mathbb C}$ along with their anti-holomorphic partners, as soon as an N = (2, 2) structure on the bundle of differential forms is given. In this section, we will prove that if the N = (2, 2) extends to an N = (4, 4) structure, the underlying manifold is Hyperkähler.

Theorem 2.42. Let $M$ be a Kähler manifold and suppose that its canonical N = (2, 2) bundle carries a realization of two (anti-)commuting N = 4 algebras, Eqs. (2.65–2.71), where $G^{1+}$, $\bar G^{1+}$ and $T^3$, $\bar T^3$ have their usual meaning. Then $M$ is a Hyperkähler manifold.
Proof. We have to show that there exists a holomorphic symplectic form $\omega$ on $M$, which by a theorem of A. Beauville [Beau, Bes] implies that $M$ is Hyperkähler. The idea is to identify $T^+ := T^1 + i T^2$ with the wedging operator by $\omega$. From the N = 4 commutation relations

$$ [\, T^3, T^+ \,] = T^+ , \qquad [\, \bar T^3, T^+ \,] = 0 , $$

we see that $T^+$ has bi-degree (2,0), i.e. it maps $\Gamma(\Lambda^{(p,q)} M^{\mathbb C})$ to $\Gamma(\Lambda^{(p+2,q)} M^{\mathbb C})$. Furthermore,

$$ [\, G^{1+}, T^+ \,] = 0 , \qquad [\, \bar G^{1+}, T^+ \,] = 0 $$

show that $[\, d, T^+ \,] = [\, G^{1+} + \bar G^{1+}, T^+ \,] = 0$. In order to fully identify the operator in geometrical terms, we shall need the following

Lemma 2.43. Let $M$ be a compact Riemannian manifold and $W : \Gamma(\Lambda^\bullet M^{\mathbb C}) \longrightarrow \Gamma(\Lambda^\bullet M^{\mathbb C})$ be a $C^\infty(M)$-linear operator of degree $d$, i.e. $W$ maps $\Gamma(\Lambda^p M^{\mathbb C})$ to $\Gamma(\Lambda^{p+d} M^{\mathbb C})$, such that in addition $[\, W, d \,]_g = 0$. Then $W$ is a wedging operator by a closed form of degree $d$ uniquely determined by $W(1)$, where $1 \in C^\infty(M)$ denotes the constant function equal to 1.

Proof. Fix a $p$-form $\eta$; then there is a finite set of functions $a^i_j \in C^\infty(M)$ such that

$$ \eta = \sum_i a^i_0\, [\, d, a^i_1 \,] \cdots [\, d, a^i_p \,] \cdot 1 . $$

Using the $C^\infty(M)$-linearity of $W$ and $[\, W, d \,]_g = 0$, we get

$$
\begin{aligned}
W \eta &= (-1)^{pd} \sum_i a^i_0\, [\, d, a^i_1 \,] \cdots [\, d, a^i_p \,] \cdot W(1) \\
&= (-1)^{pd}\, \eta \wedge W(1) = (-1)^{pd}\, \eta \wedge \bigl( \epsilon(W(1)) \cdot 1 \bigr) = W(1) \wedge \eta ,
\end{aligned}
$$

which proves that $W$ is a wedging operator by a $d$-form. That $W(1)$ is closed follows from $d\, W(1) = [\, d, W \,]_g \cdot 1 = 0$.

Since the operator $T^+$ satisfies the assumptions of Lemma 2.43, there exists a closed (2,0)-form $\omega$ such that $T^+ = \epsilon(\omega)$. We can identify this form with a holomorphic symplectic form on $M$ if we can show that it is non-degenerate: From the N = 4 commutation relations, we have

$$ [\, T^+, T^- \,] = 2 T^3 = p - \tfrac{n}{2} $$

on $\Lambda^{(p,q)} M^{\mathbb C}$, where $n$ is the complex dimension of the manifold, and since $T^- = (T^+)^*$, we get $T^- = \iota(\bar\omega)$. For any $f \in C^\infty(M)$, we have

$$ -\tfrac{n}{2}\, f = 2 T^3 f = [\, T^+, T^- \,]\, f = -\iota(\bar\omega)\, \epsilon(\omega)\, f = -\tfrac{1}{2}\, \omega^{\mu\nu} \omega_{\mu\nu}\, f , $$

and therefore $\omega^{\mu\nu} \omega_{\mu\nu} = n$. Using the last equation, we have for any (1,0)-form $\eta = \eta_\lambda\, dz^\lambda$,
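$$
\begin{aligned}
\bigl( 1 - \tfrac{n}{2} \bigr)\, \eta &= -\iota(\bar\omega)\, \epsilon(\omega)\, \eta = -\tfrac{1}{2}\, \omega^{\mu\nu} \omega_{\mu\nu}\, \eta + \omega^{\mu\lambda} \omega_{\mu\nu}\, \eta_\lambda\, dz^\nu \\
&= -\tfrac{n}{2}\, \eta + \omega^{\mu\lambda} \omega_{\mu\nu}\, \eta_\lambda\, dz^\nu ,
\end{aligned}
$$

and it follows that $\eta = \omega^{\mu\lambda} \omega_{\mu\nu}\, \eta_\lambda\, dz^\nu$; in other words, the matrix $\omega_{\mu\nu}$ is invertible, independent of the coordinate system, and thus $\omega$ is non-degenerate. This concludes the proof of Theorem 2.42.

2.6. Symplectic geometry.

This section is devoted to the discussion of symplectic manifolds, which constitute an important class of manifolds where the central role of the Riemannian metric is taken over by the symplectic form. Again, there is a description of symplectic geometry in terms of a certain set of algebraic data, which turn out to be “close” to the ones characterizing a Kähler manifold. More precisely, one can always choose a Riemannian metric $g$ on a manifold $M$ with given symplectic form $\omega$ such that $g$ and $\omega$ define an almost complex structure $J$ on the bundle of differential forms. The obstruction against $M$ being a complex Kähler manifold (with Kähler form $\omega$) can also be stated as an algebraic relation.

Let $(M, \omega)$ be a compact symplectic manifold, i.e., $M$ is a smooth manifold endowed with a non-degenerate closed two-form $\omega$. Using local coordinates $x^\mu$ in a neighbourhood $U \subset M$, we define the following three operators on $\Gamma(\Lambda^p M^{\mathbb C})$ for $0 \le p \le n$, $n := \dim M$:

$$
\begin{aligned}
L^3 &= p - \tfrac{n}{2} , \\
L^+ &= \tfrac{1}{2}\, \omega_{\mu\nu}\, a^{\mu *} a^{\nu *} , \\
L^- &= \tfrac{1}{2}\, (\omega^{-1})^{\mu\nu}\, a_\mu a_\nu ,
\end{aligned}
\tag{2.77}
$$

where $a_\mu = \iota(\partial_\mu)$ is contraction with the basis vector field $\partial_\mu$, thus, $\{ a_\mu, a^{\nu *} \} = \delta_\mu^\nu$, and $(\omega^{-1})^{\mu\nu}$ is the inverse matrix of $\omega_{\mu\nu}$: $\omega_{\mu\nu} (\omega^{-1})^{\nu\lambda} = \delta_\mu^\lambda$. Obviously, $L^3$, $L^\pm$ are globally defined.

Proposition 2.44. The operators $L^3$ and $L^\pm$ satisfy the su(2) commutation relations, i.e.,

$$ [\, L^3, L^\pm \,] = \pm 2 L^\pm , \qquad [\, L^+, L^- \,] = L^3 . $$

Proof. The first commutator is clear since $L^+$ resp. $L^-$ increases resp. decreases the degree of a form by two. The second relation can be calculated in local coordinates:

$$
\begin{aligned}
[\, L^+, L^- \,] &= \tfrac{1}{4}\, \omega_{\mu\nu}\, (\omega^{-1})^{\sigma\lambda}\, [\, a^{\mu *} a^{\nu *}, a_\sigma a_\lambda \,] \\
&= \tfrac{1}{4}\, \omega_{\mu\nu}\, (\omega^{-1})^{\sigma\lambda} \bigl( a^{\mu *} ( \delta^\nu_\sigma a_\lambda - \delta^\nu_\lambda a_\sigma ) + ( \delta^\mu_\sigma a_\lambda - \delta^\mu_\lambda a_\sigma )\, a^{\nu *} \bigr) \\
&= \tfrac{1}{2}\, \delta_\mu^\lambda \bigl( a^{\mu *} a_\lambda - a_\lambda a^{\mu *} \bigr) = a^{\mu *} a_\mu - \tfrac{n}{2} .
\end{aligned}
$$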
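The relations of Proposition 2.44 can be checked numerically in the simplest setting. The following Python sketch represents $a^{\mu *}$ and $a_\mu$ as matrices on the exterior algebra of $\mathbb{R}^4$ with the constant symplectic form $\omega = dx^1 \wedge dx^2 + dx^3 \wedge dx^4$ and verifies the three commutators. The flat model, the choice of $\omega$ and all names are assumptions made for this illustration only.

```python
from itertools import combinations

n = 4  # real dimension of the flat model space (assumption of this sketch)
# Basis of the exterior algebra: increasing index tuples (dx^{i1} ^ ... ^ dx^{ip}).
basis = [c for p in range(n + 1) for c in combinations(range(n), p)]
index = {b: i for i, b in enumerate(basis)}
N = len(basis)


def zeros():
    return [[0.0] * N for _ in range(N)]


def creation(mu):
    """Matrix of a^{mu*}: wedge with dx^mu from the left."""
    m = zeros()
    for b in basis:
        if mu not in b:
            sign = (-1) ** sum(1 for k in b if k < mu)
            m[index[tuple(sorted(b + (mu,)))]][index[b]] = sign
    return m


def annihilation(mu):
    """Matrix of a_mu: contraction with the basis vector field d/dx^mu."""
    m = zeros()
    for b in basis:
        if mu in b:
            sign = (-1) ** b.index(mu)
            m[index[tuple(k for k in b if k != mu)]][index[b]] = sign
    return m


def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def lin(*terms):
    """Linear combination of (coefficient, matrix) pairs."""
    out = zeros()
    for c, m in terms:
        for i in range(N):
            for j in range(N):
                out[i][j] += c * m[i][j]
    return out


def comm(a, b):
    return lin((1, mul(a, b)), (-1, mul(b, a)))


def same(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(N) for j in range(N))


omega = [[0, 1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, 1], [0, 0, -1, 0]]
omega_inv = [[0, -1, 0, 0], [1, 0, 0, 0], [0, 0, 0, -1], [0, 0, 1, 0]]

cr = [creation(m) for m in range(n)]
an = [annihilation(m) for m in range(n)]
pairs = [(m, v) for m in range(n) for v in range(n)]
L_plus = lin(*[(0.5 * omega[m][v], mul(cr[m], cr[v])) for m, v in pairs])
L_minus = lin(*[(0.5 * omega_inv[m][v], mul(an[m], an[v])) for m, v in pairs])
L3 = zeros()
for b in basis:
    L3[index[b]][index[b]] = len(b) - n / 2  # p - n/2 on p-forms

print(same(comm(L_plus, L_minus), L3))
print(same(comm(L3, L_plus), lin((2, L_plus))))
print(same(comm(L3, L_minus), lin((-2, L_minus))))
```

Working with explicit $16 \times 16$ matrices keeps the sketch free of dependencies; for larger $n$ one would want a sparse representation instead.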
The operators above allow us to introduce a second differential operator on $\Gamma(\Lambda^\bullet M^{\mathbb C})$ in addition to the exterior differential $d$, namely

$$ \tilde d^* := [\, L^-, d \,] ; \tag{2.78} $$

since $d$ satisfies $d^2 = 0$, we conclude that $\tilde d^*$ anti-commutes with $d$.

Proposition 2.45. The operators $d$ and $\tilde d^*$ form a two-dimensional representation of su(2) under the adjoint action of $L^3$ and $L^\pm$:

$$
\begin{aligned}
[\, L^3, d \,] &= d , & [\, L^3, \tilde d^* \,] &= -\tilde d^* , \\
[\, L^+, d \,] &= 0 , & [\, L^+, \tilde d^* \,] &= d , \\
[\, L^-, d \,] &= \tilde d^* , & [\, L^-, \tilde d^* \,] &= 0 .
\end{aligned}
$$

Proof. The commutators of $L$'s with $d$ are direct consequences of $d\omega = 0$ and of the definition of $\tilde d^*$. Furthermore, we have

$$ [\, L^+, \tilde d^* \,] = \bigl[\, L^+, [\, L^-, d \,] \,\bigr] = \bigl[\, d, [\, L^-, L^+ \,] \,\bigr] + \bigl[\, L^-, [\, L^+, d \,] \,\bigr] = -[\, d, L^3 \,] = d . $$

In order to derive the last commutator, we choose a Darboux coordinate system, i.e., one in which the components $\omega_{\mu\nu}$ of the symplectic form are constant. Then the operator $\tilde d^*$ has the explicit form

$$ \tilde d^* = [\, L^-, d \,] = \tfrac{1}{2}\, (\omega^{-1})^{\mu\nu}\, [\, a_\mu a_\nu, a^{\lambda *} \,]\, \partial_\lambda = (\omega^{-1})^{\mu\lambda}\, a_\mu\, \partial_\lambda , $$

and it follows that

$$ [\, L^-, \tilde d^* \,] = \tfrac{1}{2}\, (\omega^{-1})^{\kappa\rho}\, (\omega^{-1})^{\mu\lambda}\, [\, a_\kappa a_\rho, a_\mu \partial_\lambda \,] = 0 . $$

Corollary 2.46. The operator $\tilde d^*$ is nilpotent:

$$ \{ \tilde d^*, \tilde d^* \} = 0 . $$

Proof. This is a direct consequence of the transformation properties of $d$ and $\tilde d^*$ under the su(2) generated by $L^3$, $L^\pm$, the Jacobi identity, and the fact that $d$ and $\tilde d^*$ anti-commute.

In summary, we see that on a symplectic manifold $M$ the space $\Gamma(\Lambda^\bullet M^{\mathbb C})$ carries a representation of the Lie algebra su(2) and that there are two anti-commuting and nilpotent operators spanning the spin $\tfrac{1}{2}$ representation. However, as long as no Riemannian metric is given, the space $\Gamma(\Lambda^\bullet M^{\mathbb C})$ has no scalar product, and therefore cannot be completed to a Hilbert space occurring in the set of algebraic data, as introduced in the previous sections. Once we introduce a Riemannian metric, we have a notion of adjoint for operators on $\Gamma(\Lambda^\bullet M^{\mathbb C})$, and we may ask whether the representation of su(2) is unitary, i.e., whether $L^+$ is the adjoint of $L^-$.

Proposition 2.47. Let $g$ be a Riemannian metric on the symplectic manifold $(M, \omega)$. Then the following statements are equivalent:

1) $L^+ = (L^-)^*$.

2) The (1,1) tensor field $J$ defined by $\omega(X, Y) = g(JX, Y)$ for all $X, Y \in \Gamma(TM)$ is an almost complex structure.

Furthermore, if the above conditions are satisfied then the manifold $(M, g, J)$ is almost Kählerian with almost Kähler form $\omega$.
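To see that condition 2) of Proposition 2.47 really constrains the metric, one can test it on $\mathbb{R}^4$ with the constant form $\omega = dx^1 \wedge dx^2 + dx^3 \wedge dx^4$ and a few constant diagonal metrics. The Python sketch below forms $J_\mu{}^\kappa = \omega_{\mu\nu}\, g^{\nu\kappa}$ and checks whether $J^2 = -\mathrm{id}$ and whether $g(JX, JY) = g(X, Y)$. The restriction to diagonal metrics (so that the inverse is trivial) and all names are assumptions of this illustration.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def transpose(A):
    return [list(col) for col in zip(*A)]


def diag(values):
    n = len(values)
    return [[float(values[r]) if r == c else 0.0 for c in range(n)]
            for r in range(n)]


def close(A, B, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(A, B) for x, y in zip(ra, rb))


# Constant symplectic form on R^4 (assumption of this sketch).
OMEGA = [[0.0, 1.0, 0.0, 0.0], [-1.0, 0.0, 0.0, 0.0],
         [0.0, 0.0, 0.0, 1.0], [0.0, 0.0, -1.0, 0.0]]


def candidate_J(g_diag):
    """J_mu^kappa = omega_{mu nu} g^{nu kappa} for a diagonal metric."""
    return matmul(OMEGA, diag([1.0 / x for x in g_diag]))


def is_almost_complex(J):
    return close(matmul(J, J), diag([-1.0] * len(J)))


def is_hermitian(J, g_diag):
    """Check J g J^T = g, i.e. g(JX, JY) = g(X, Y) for all X, Y."""
    g = diag(g_diag)
    return close(matmul(matmul(J, g), transpose(J)), g)


for g_diag in ([1, 1, 1, 1], [2, 0.5, 1, 1], [2, 1, 1, 1]):
    J = candidate_J(g_diag)
    print(g_diag, is_almost_complex(J), is_hermitian(J, g_diag))
```

The metric `[2, 0.5, 1, 1]` is compatible with $\omega$ although it is not the Euclidean one, whereas `[2, 1, 1, 1]` fails both tests; this is the sense in which the metric has to be chosen to match the symplectic form.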
Proof. Assume 1) holds; then we have

$$ (L^+)^* = -\tfrac{1}{2}\, \omega_{\mu\nu}\, a^\mu a^\nu = -\tfrac{1}{2}\, \omega_{\mu\nu}\, g^{\mu\lambda} g^{\nu\sigma}\, a_\lambda a_\sigma = L^- = \tfrac{1}{2}\, (\omega^{-1})^{\lambda\sigma}\, a_\lambda a_\sigma , $$

which implies

$$ \omega_{\mu\nu}\, \omega_{\kappa\lambda}\, g^{\nu\sigma} g^{\lambda\mu} = -\delta_\kappa^\sigma . $$

By definition, the components of the tensor $J$ are given by $J_\mu{}^\kappa = \omega_{\mu\nu}\, g^{\nu\kappa}$, and the previous equation implies that $J$ is indeed an almost complex structure:

$$ J_\mu{}^\sigma J_\kappa{}^\mu = -\delta_\kappa^\sigma . $$

On the other hand, if 2) holds, then $J^2 = -\mathrm{id}$ and the above relation between the components of $J$ and $\omega$ yield

$$ -\omega_{\mu\nu}\, g^{\mu\lambda} g^{\nu\sigma} = (\omega^{-1})^{\lambda\sigma} , $$

and thus $(L^+)^* = L^-$. If either of the equivalent conditions 1) or 2) holds, we conclude that

$$ g(JX, JY) = \omega(X, JY) = -\omega(JY, X) = -g(JJY, X) = g(Y, X) = g(X, Y) $$

for all $X, Y \in \Gamma(TM)$, and this means that $J$ is an almost Hermitian structure. The associated fundamental 2-form is clearly $\omega$, and since it is closed, the manifold $(M, g, J)$ is in fact almost Kählerian.

The next (well-known) proposition ensures that on a symplectic manifold there is always a Riemannian metric with the properties of the previous proposition. For a proof, we refer the reader e.g. to [LiM].

Proposition 2.48. If $(M, \omega)$ is a symplectic manifold then there exists an almost complex structure $J$ on $M$ such that the tensor $g$ defined by

$$ g(X, Y) = -\omega(JX, Y) $$

for all $X, Y \in \Gamma(TM)$ is a Riemannian metric on $M$.

Because of these results, we will consider almost Kähler manifolds in the following. Since $g$ induces a scalar product on $\Gamma(\Lambda^\bullet M^{\mathbb C})$ with respect to which the su(2) representation is unitary, we can introduce a second spin $\tfrac{1}{2}$ doublet for the same su(2) simply by taking the adjoints of the operators $d$ and $\tilde d^*$:

Proposition 2.49. The operators $d^*$ and $\tilde d := (\tilde d^*)^*$ satisfy the following relations:

$$
\begin{aligned}
\{ d^*, d^* \} &= 0 , & \{ \tilde d, \tilde d \} &= 0 , & \{ d^*, \tilde d \} &= 0 , \\
[\, L^3, d^* \,] &= -d^* , & [\, L^3, \tilde d \,] &= \tilde d , \\
[\, L^+, d^* \,] &= -\tilde d , & [\, L^+, \tilde d \,] &= 0 , \\
[\, L^-, d^* \,] &= 0 , & [\, L^-, \tilde d \,] &= -d^* .
\end{aligned}
$$
Proof. All equations are obtained by taking the adjoints of the corresponding relations in Proposition 2.45.

Notice that the operators $L^3$, $L^\pm$ and $d$, $\tilde d^*$ together with their adjoints, satisfy all the commutation relations of the N = 4 supersymmetry algebra of Theorem 2.38, except for $\{ d, \tilde d \} = 0$ and all relations involving the Laplace operators. We shall see in section 4 that on a classical Kähler manifold there is always a realization of the full N = 4 algebra – which also explains the appearance of the SU(2) generators that may look somewhat surprising at the moment. The condition that $d$ anti-commutes with $\tilde d$ is precisely the requirement for the almost complex structure to be integrable.

Proposition 2.50. For an almost Kähler manifold $M$, the following two statements are equivalent:

1) The operators $d$ and $\tilde d$ anti-commute, $\{ d, \tilde d \} = 0$.

2) The manifold $M$ is Kähler.

In particular, if relation 1) holds, all the other equations in Theorem 2.38 are true as well.

Proof. The Jacobi identity and the su(2) transformation properties of $d$ imply

$$ \{ d, \tilde d \} = - \{ d, [\, L^+, d^* \,] \} = - \{ d^*, [\, d, L^+ \,] \} - \bigl[\, L^+, \{ d^*, d \} \,\bigr] = -[\, L^+, \triangle \,] $$

where $\triangle$ denotes the Laplace-Beltrami operator on $M$. Thus condition 1) holds if and only if $L^+$ commutes with $\triangle$. In Sect. 4 we will show that this is true on any Kähler manifold, and therefore 2) implies 1). Let us now assume that 1) holds. Choose a point $p \in M$ and normal coordinates $x^\mu$ around $p$; then at the center of this system, we can write

$$ \triangle = -g^{\mu\nu}\, \partial_\mu \partial_\nu + \ldots , \qquad L^+ = \tfrac{1}{2}\, \omega_{\rho\sigma}\, a^{\rho *} a^{\sigma *} , $$

where the dots substitute for terms that commute with functions, in other words, for zeroth order differential operators. We can now compute the commutator of $L^+$ with $\triangle$:

$$ [\, L^+, \triangle \,] = -\tfrac{1}{2}\, g^{\mu\nu}\, a^{\rho *} a^{\sigma *}\, [\, \omega_{\rho\sigma}, \partial_\mu \partial_\nu \,] + \ldots = g^{\mu\nu}\, ( \partial_\mu \omega_{\rho\sigma} )\, a^{\rho *} a^{\sigma *}\, \partial_\nu + \ldots . $$

If $[\, L^+, \triangle \,] = 0$, then the zeroth and first order terms should vanish separately, and we get the following series of implications, which taken together prove the claim:

$$ [\, L^+, \triangle \,] = 0 \ \Longrightarrow\ \partial_\mu \omega_{\rho\sigma} = 0 \text{ in normal coordinates at } p \ \Longleftrightarrow\ \nabla \omega = 0 \ \Longleftrightarrow\ \nabla J = 0 \ \Longleftrightarrow\ J \text{ is integrable} \ \Longleftrightarrow\ M \text{ is Kähler} \ \Longrightarrow\ [\, L^+, \triangle \,] = 0 , $$

where $\nabla$ denotes the Levi–Civita connection; the last implication is derived explicitly in Sect. 3.
If the symplectic manifold satisfies the conditions of this proposition, we choose $\partial = \tfrac{1}{2} ( d - i \tilde d )$ and $\bar\partial = \tfrac{1}{2} ( d + i \tilde d )$ as holomorphic resp. anti-holomorphic differential.

This concludes our derivation of the algebraic data associated to a symplectic manifold. Because of Lemma 2.43 already used in Subsect. 2.5.3, we can also formulate a converse statement: Suppose we have an algebraic description of a classical manifold $M$ in terms of its canonical N = (1, 1) bundle with the usual operators $d$ and $L^3$ – the exterior differential and the total degree of forms – then $M$ is a symplectic manifold if we can identify further operators $L^\pm$ and $\tilde d^*$, $\tilde d$ acting on differential forms and satisfying the relations listed above. Moreover, the symplectic form $\omega$ of $M$ is determined by $L^+$.

2.7. Symmetries of spectral data.

In this section, we describe some observations on “symmetries” of classical manifolds and their formulation in the algebraic framework. Whenever a manifold admits some additional structure, one is interested in the specific class of diffeomorphisms (“symmetries”) which respect that structure. On a general Riemannian manifold, for example, one may want to look for diffeomorphisms that preserve the metric, i.e., for isometries; in the case of complex (or Hermitian) manifolds, the relevant diffeomorphisms (resp. isometries) are those which do not mix holomorphic and anti-holomorphic coordinates, i.e., holomorphic diffeomorphisms (resp. isometries); similarly, on a symplectic manifold, it is the symplectic form which is preserved under symplectomorphisms. In classical geometry, the conditions which guarantee preservation of the extra structure of a manifold are well-known, but their formulation relies on objects not present in the spectral data which, as was shown in the previous sections, yield an algebraic characterization of these spaces. Our task is, therefore, to find suitable algebraic conditions for the relevant symmetries.

Let $M$ be a Riemannian manifold with metric $g$, $A = C^\infty(M)$ the algebra of smooth functions over $M$ and $(A, H, d, d^*, \ldots)$ the canonical set of spectral data associated to $M$ as before; the dots indicate possible additional structures (i.e., differential operators and Lie algebra generators) beyond the Riemannian N = (1, 1) data specified by the exterior differential $d$ and its adjoint. Clearly, Aut($A$) is the group of all diffeomorphisms of $M$. It will turn out that the special subgroups of Aut($A$) we are interested in are simply determined by demanding that their elements commute with certain differential operators contained in the spectral data. We will indicate sketches of proofs only for one-parameter subgroups of diffeomorphisms in the connected component of the identity, but covariance under general pullbacks strongly suggests that the results in fact hold for arbitrary elements of Aut($A$). A one-parameter subgroup of diffeomorphisms in Aut($A$) is generated by the Lie derivative $L_X$ associated to a vector field $X \in \Gamma(TM)$ over $M$; we have

$$ L_X = \{ d, \iota(X) \} . \tag{2.79} $$

Using local coordinates and the expressions $d = a^{\mu *} \nabla_\mu$ and $\iota(X) = X^\nu a_\nu$ with “creation” and “annihilation” operators $a^{\mu *}$ and $a_\nu$ as before, (2.79) can be rewritten as

$$ L_X = ( \partial_\mu X^\nu )\, a^{\mu *} a_\nu + X^\mu \partial_\mu . \tag{2.80} $$
Equation (2.79) shows that any $L_X$, and therefore the associated diffeomorphism, commutes with the exterior differential,

$$ [\, d, L_X \,] = 0 , \tag{2.81} $$

as expected: $d$ commutes with pullbacks. It is natural to ask which properties the Lie derivative $L_X$ must satisfy if we require that it also commutes with the adjoint of $d$,

$$ [\, d^*, L_X \,] = 0 . \tag{2.82} $$

In local coordinates, $d^*$ is given by

$$ d^* = -a^\mu \partial_\mu + \Gamma^\lambda_{\mu\nu}\, a^\mu a^{\nu *} a_\lambda , $$

where $\Gamma^\lambda_{\mu\nu}$ are the Christoffel symbols associated to the Riemannian metric $g$ on $M$. Inserting this expression and (2.80) into Eq. (2.82), one finds – after a lengthy, but straightforward computation using the canonical anti-commutation relations of the $a^{\mu *}$ and $a_\nu$ – that a necessary condition for (2.82) to hold is that

$$ L_X g \equiv \bigl( X^\mu \partial_\mu g_{\nu\rho} + g_{\alpha\nu}\, \partial_\rho X^\alpha + g_{\beta\rho}\, \partial_\nu X^\beta \bigr)\, a^{\nu *} a^{\rho *} = 0 . \tag{2.83} $$

This means that $X$ is a Killing vector field, i.e., it generates an isometry of $M$. Since for Killing vector fields one has that

$$ L_X^{\,*} = -L_X , $$

relation (2.83) is also sufficient to ensure that $L_X$ commutes with $d^*$. We conclude that the isometries of a Riemannian manifold $M$ described by the N = (1, 1) spectral data $(A, H, d, d^*)$ are precisely those elements of Aut($A$) which commute with both $d$ and $d^*$.

Next, we consider a symplectic manifold given in terms of spectral data $(A, H, d, \tilde d^*, L^\pm, L^3)$ as in the previous section. Recall that the symplectic 2-form $\omega$ is encoded in the SU(2) generators $L^\pm$. In particular, we have $L^- = \tfrac{1}{2} (\omega^{-1})^{\mu\nu} a_\mu a_\nu$ and $\tilde d^* = [\, L^-, d \,]$. For symplectic spectral data, it is natural to study diffeomorphisms associated to Lie derivatives $L_X$ which commute with $d$ and $\tilde d^*$. Inserting the explicit local expressions into

$$ [\, \tilde d^*, L_X \,] = 0 , \tag{2.84} $$

we obtain, in a similar way as before, the necessary condition that

$$ L_X \omega \equiv \bigl( X^\mu \partial_\mu \omega_{\nu\rho} + \omega_{\nu\alpha}\, \partial_\rho X^\alpha + \omega_{\beta\rho}\, \partial_\nu X^\beta \bigr)\, a^{\nu *} a^{\rho *} = 0 ; \tag{2.85} $$

a short calculation, using the non-degeneracy of the symplectic form, shows that this condition is also sufficient for (2.84) to hold. Thus, the symplectomorphisms of a symplectic manifold – the diffeomorphisms which leave $\omega$ invariant as in (2.85) – are precisely given by those elements of Aut($A$) that commute with $d$ and $\tilde d^*$.

Complex manifolds, given by spectral data $(A, H, \partial, \partial^*, \bar\partial, \bar\partial^*)$ as in section 2.4, can be discussed in complete analogy. The relevant class of diffeomorphisms consists of those that preserve the complex structure $J$ on $M$, i.e., one is interested in vector fields whose Lie derivatives satisfy

$$ L_X J = 0 . \tag{2.86} $$

Not surprisingly, it turns out that this condition is equivalent to

$$ [\, \partial, L_X \,] = 0 . \tag{2.87} $$

Holomorphic diffeomorphisms correspond to those elements of Aut($A$) which commute with the holomorphic and anti-holomorphic differentials $\partial$ and $\bar\partial$. Combining this with the results on isometries from above, we find that the class of diffeomorphisms which respects the structure of a Hermitian complex manifold consists of those that commute with $\partial$, $\bar\partial$ and their adjoints.

We have seen that the algebraic description of classical manifolds in terms of spectral data with N = (n, n) symmetry allows for a direct characterization of the subclass of diffeomorphisms which leaves the relevant structure on a given manifold invariant. Obviously, it is possible to carry over the commutation relations (2.82, 2.84, 2.87) to the non-commutative setting without difficulties. However, it remains to be investigated whether the special subgroups of Aut($A$) defined in this way bear particular geometrical relevance in the non-classical case as well.
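For linear vector fields $X^\alpha = A^\alpha{}_\sigma\, x^\sigma$ on $\mathbb{R}^2$ with a constant metric or symplectic form $B$, the coefficient expressions in (2.83) and (2.85) reduce to the matrix $B A + A^{T} B$, so the Killing and symplectic conditions become simple matrix identities. The Python sketch below evaluates them for a rotation, a shear and a dilation; the flat model, the three sample fields and all names are assumptions chosen for illustration.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def transpose(A):
    return [list(col) for col in zip(*A)]


def lie_derivative(A, B):
    """Components of L_X B for X = A x and a constant bilinear form B.

    With d_rho X^alpha = A[alpha][rho], the coefficient
    B_{nu alpha} d_rho X^alpha + B_{beta rho} d_nu X^beta equals (B A + A^T B)_{nu rho}.
    """
    BA = matmul(B, A)
    AtB = matmul(transpose(A), B)
    n = len(A)
    return [[BA[r][c] + AtB[r][c] for c in range(n)] for r in range(n)]


def vanishes(M):
    return all(x == 0 for row in M for x in row)


G = [[1, 0], [0, 1]]        # flat metric on R^2
OMEGA = [[0, 1], [-1, 0]]   # standard symplectic form on R^2

FIELDS = {
    "rotation": [[0, -1], [1, 0]],   # X = (-y, x)
    "shear": [[0, 1], [0, 0]],       # X = (y, 0)
    "dilation": [[1, 0], [0, 1]],    # X = (x, y)
}

for name, A in FIELDS.items():
    killing = vanishes(lie_derivative(A, G))
    symplectic = vanishes(lie_derivative(A, OMEGA))
    print(f"{name:9s} Killing: {killing}  symplectic: {symplectic}")
```

In this toy setting the rotation preserves both structures, the shear preserves only $\omega$, and the dilation preserves neither, which mirrors the statement that isometries and symplectomorphisms are singled out by different commutation conditions.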
3. Supersymmetry Classification of Geometries In the preceding section, we have achieved an algebraic formulation of classical differential geometry. In each case, the basic data contain a commutative algebra of functions on the manifold M and an appropriate Hilbert space of square-integrable sections of a spinor bundle S over M . We always required a representation of the algebra of sections of the Clifford bundle on S in order to define a Dirac operator. While these building blocks of “spectral data”, the algebra of functions and the Hilbert space, are basic for all geometries considered, it is the presence of additional operators and their algebraic relations with the Dirac operators that allows us to distinguish merely Riemannian from, say, K¨ahler geometry. In [FGR ], where we will describe natural non-commutative generalizations of classical geometry, we will be able to take full advantage of the results of Sect. 2. Here, we summarize and interpret our algebraic characterizations of classical differential geometry from the point of view of supersymmetry, which has been our – up to now hidden – guiding principle throughout the last section. In particular, notions like N = (1, 1) spectral data etc. will become transparent in the following. Supersymmetry was invented, at the beginning of the 1970’s, in the form of the “spinning string” [R,NS ] and in the context of relativistic quantum field theory (QFT) on 3 + 1-dimensional Minkowski space-time [GL,WZ ]. Within QFT, supersymmetry provided a surprising way to circumvent “no-go theorems” [CM ] stating that in realistic relativistic QFTs – i.e., disregarding 1+1 dimensions, which at that time were thought to be irrelevant – the space-time symmetries and the “internal” symmetries (gauge groups of the first kind) always show a direct product structure and cannot be non-trivially embedded into a common simple symmetry group. 
It turned out that, at least at the level of infinitesimal symmetry operations, one can install further symmetries by adding new generators, Q, to those of the Poincaré algebra, as long as they behave like fermions, i.e., they obey anti-commutation relations among each other (and commutation relations with the old “bosonic” generators). The main consequence of the new symmetry generated by the Q’s is to set up a correspondence between the bosonic and the fermionic fields occurring in a supersymmetric QFT – with implications reaching from abstract concepts down to computational details. Although nature obviously is not strictly supersymmetric – as this would e.g. imply that to each boson one can associate a fermion with the same mass – physicists nevertheless hope that the fundamental theory of elementary particles might display supersymmetry which is however “broken” at the low energy scales accessible to experiment. In a variety of physical contexts (critical phenomena, string theory) one also encounters supersymmetric quantum field theories on a 1 + 1-dimensional space-time, in particular 1 + 1-dimensional superconformal field theories. Many of our arguments are inspired by features of these theories.
The 1 + 1-dimensional supersymmetry algebra with N supersymmetry charges is generated by fermionic self-adjoint operators $Q_L^{(i)}$, $Q_R^{(i)}$, $i = 1, \dots, N$, that are subject to the anti-commutation relations
\[
\{Q_L^{(i)}, Q_L^{(j)}\} = \delta^{ij}\,(H + P)\,, \qquad
\{Q_R^{(i)}, Q_R^{(j)}\} = \delta^{ij}\,(H - P)\,, \qquad
\{Q_L^{(i)}, Q_R^{(j)}\} = 0\,. \tag{3.1}
\]
Here, H and P denote the generators of time and space translations in the Poincaré group. On the rhs of (3.1), only the combinations H + P and H − P occur because we display the relations in light-cone coordinates: $Q_L^{(i)}$ behave like the left-moving, $Q_R^{(i)}$ like the right-moving chiral components of a fermion field in a 1 + 1-dimensional QFT. To make this so-called “chiral splitting” explicit, we will often use notations like e.g. N = (2, 2) supersymmetry algebra for the relations (3.1) with, in this case, two left-moving and two right-moving supercharges. The operators $Q_{L,R}^{(i)}$ commute with H and P, and, on a higher-dimensional spacetime, we would additionally require that the supercharges transform as a spin 1/2 spinor under rotations. The full algebra of (super) space-time transformations is then spanned by the fermionic supercharges together with the Poincaré generators – and, for N ≥ 2, some additional Lie group generators which will be introduced when necessary; they arise from chiral symmetries of the relations (3.1) and act only on the supercharges, not on the Poincaré generators. As was done in Sect. 1.2, one could in fact always start from one supersymmetry charge and characterize the higher-N superalgebras by the additional Lie group generators that act on this charge to create further fermionic operators. Although the algebra introduced in (3.1) makes perfect sense for any number N of supercharges, it turns out that within the area of QFT the special values N = 1, 2, 4 and 8 are the most interesting. Since, remarkably, for our classification of geometries we also primarily encounter these types of supersymmetry, other supersymmetry algebras will not be considered.

We first discuss the N = (1, 1) superalgebra: After renaming the generators as $Q := Q_L^{(1)}$, $\bar Q := Q_R^{(1)}$, $\Delta := \frac12 (H + P)$ and $\bar\Delta := \frac12 (H - P)$, the relations for the supercharges read
\[
\{Q, \bar Q\} = 0\,, \qquad Q^2 = \Delta\,, \qquad \bar Q^2 = \bar\Delta\,. \tag{3.2}
\]
If we enforce the additional relation
\[
\Delta = \bar\Delta\,, \tag{3.3}
\]
i.e. P = 0, we precisely find the relations of N = (1, 1) spectral data, see Definition 2.6: $Q$ and $\bar Q$ are to be identified with the anti-commuting Dirac operators $D$ and $\bar D$ on a Riemannian manifold M, and condition (3.3) expresses the fact that both these operators square to the Laplacian of M. Recall that the property $D^2 = \bar D^2$ made it possible to define a nilpotent operator d – the exterior differential on M – by setting
\[
d = \tfrac12 \left( D - i \bar D \right) .
\]
This enabled us to derive de Rham–Hodge theory directly from the N = (1, 1) spectral data of Riemannian geometry.
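The nilpotency of d follows from {D, D̄} = 0 and D² = D̄² by a one-line computation, and it can also be checked numerically in a finite-dimensional toy model. The sketch below is only an illustration under stated assumptions: the matrices `s1`, `s2`, `a` and the function name `build_toy_model` are introduced here and do not appear in the text. It takes D = s1 ⊗ a and D̄ = s2 ⊗ a, with s1, s2 two anti-commuting Hermitian 2 × 2 matrices that square to the identity and a real symmetric, so that the N = (1, 1) relations hold by construction.

```python
# Toy model (an assumption for illustration, not the Dirac operators of the text):
# D = s1 (x) a and Dbar = s2 (x) a, so that {D, Dbar} = 0 and D^2 = Dbar^2.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b, sign=1):
    """Entrywise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def kron(a, b):
    """Kronecker (tensor) product of two matrices."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def anticomm(a, b):
    return add(matmul(a, b), matmul(b, a))

def is_close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def build_toy_model():
    s1 = [[0, 1], [1, 0]]
    s2 = [[0, -1j], [1j, 0]]
    a = [[2, 1], [1, 3]]                 # any real symmetric matrix would do
    dirac = kron(s1, a)                  # plays the role of D
    dirac_bar = kron(s2, a)              # plays the role of Dbar
    d = scale(0.5, add(dirac, scale(1j, dirac_bar), -1))  # d = (D - i Dbar)/2
    laplacian = matmul(dirac, dirac)     # D^2
    return dirac, dirac_bar, d, dagger(d), laplacian

D, D_BAR, d, d_star, LAPLACIAN = build_toy_model()
ZERO = [[0] * 4 for _ in range(4)]

assert is_close(anticomm(D, D_BAR), ZERO)          # {D, Dbar} = 0
assert is_close(matmul(D_BAR, D_BAR), LAPLACIAN)   # Dbar^2 = D^2
assert is_close(matmul(d, d), ZERO)                # d is nilpotent
assert is_close(anticomm(d, d_star), LAPLACIAN)    # {d, d*} = D^2
```

The last assertion also illustrates that d and its adjoint anti-commute to the common square of the two Dirac operators in this model.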
In the N = (1, 1) data $(C^\infty(M), L^2(S), D, \bar D)$, two chiral “halves” of the N = 1 supersymmetry algebra (3.2,3) are present. If we consider a theory with only one supercharge Q (e.g. obtained from the N = (1, 1) operators as $Q = Q_L$ or $Q = Q_L + Q_R$), the remaining data may be identified with those of a spectral triple $(C^\infty(M), L^2(S), D := Q)$, as introduced by Connes. We have called such a triple a set of N = 1 spectral data. From the data $(C^\infty(M), L^2(S), D)$, one can again recover a complete geometric description of M, as was discovered by Connes; the natural algebra of forms arising here is $\Omega_D$, see Proposition 2.4. Since D acts on a spinor bundle S over M, N = 1 spectral data are the natural algebraic setting for spin geometry. Note, however, that the constructions of Sect. 2.1 show that a Dirac operator can be defined on any Riemannian manifold. Therefore N = 1 spectral data in fact provide a description of general Riemannian geometry, not just of spin geometry.

The algebra $\Omega_D$ is more difficult to handle than the usual algebra of forms defined with the help of the nilpotent differential $d = \frac12 (D - i\bar D)$. Therefore, it would be desirable to have a general construction enabling us to pass from an N = 1 spectral triple to a set of N = (1, 1) spectral data. In the classical case, this is of course always possible; the concrete procedure follows the lines of Sect. 2.2. Within the non-commutative context, however, the situation is more complicated – see the end of this section and [FGR] for further remarks.

All the other special types of manifolds that were studied in Sect. 3, Hermitian, Kähler, etc., are special cases of Riemannian manifolds; thus they will all be described by N = (1, 1) spectral data, but with additional operators and relations – which, in fact, correspond to higher supersymmetry involving n chiral supercharges and are denoted by N = (n, n) with n = 1, 2 and 4.
It turns out that, in classical geometry, there is a kind of “staircase” leading from N = (1, 1) to N = 8 supersymmetric data. The supercharges of an N = (n, n) set of data can always be re-interpreted as those of an N = 2n superalgebra. (More precisely, all the left- and right-moving N = (n, n) charges together form an algebra with the relations of the left- or the right-moving half of the N = 2n algebra, as long as the constraint P = 0 is imposed in (3.1), cf. relation (3.3) on the two Laplace operators.) As was mentioned after Eq. (3.1), for N ≥ 2 the full supersymmetry algebra contains also certain Lie group generators; the dimension of these groups grows as N increases, and thus the non-trivial task in enlarging N = (n, n) data to N = 2n data is to uncover the additional symmetries. In the classical situation, this is always possible, as will be shown in this section, case by case. Through this procedure, we can pass from N = (n, n) to N = 2n data – without gaining new structure in classical geometry. In a further step, upon “doubling the algebra”, we arrive at a truly richer N = (2n, 2n) geometry.

In the simplest, and most general, case of N = (1, 1) data, we have the operator $d = \frac12 (D - i\bar D)$ and its adjoint $d^* = \frac12 (D + i\bar D)$; but in addition there is a Z-grading given by the operator $T_{\rm tot}$ counting the total degree of a differential form. Altogether, these operators satisfy the algebra
\[
d^2 = (d^*)^2 = 0\,, \qquad \{d, d^*\} = \Delta\,, \qquad [\,T_{\rm tot}, d\,] = d\,, \qquad [\,T_{\rm tot}, d^*\,] = -d^*\,. \tag{3.4}
\]
These are just the (anti-)commutation relations of the (non-chiral) N = 2 supersymmetry algebra in 1 + 1 dimensions. In the classical case we can always enlarge N = (1, 1) spectral data to (non-chiral) N = 2 data.

In view of the relations (3.4), which can be realized in terms of operators defined on any Riemannian manifold M, the natural question to ask is when one can find a “geometrical representation” of the full chiral N = (2, 2) supersymmetry algebra including left- and right-moving supercharges. It turns out that this really requires additional structure on M. We discuss the N = (2, 2) supersymmetry algebra in terms of the supercharges
\[
Q_\pm := \tfrac12 \left( Q_L^{(1)} \pm i\, Q_L^{(2)} \right) , \qquad
\bar Q_\pm := \tfrac12 \left( Q_R^{(1)} \pm i\, Q_R^{(2)} \right) , \tag{3.5}
\]
which satisfy $Q_\pm^* = Q_\mp$ and $\bar Q_\pm^* = \bar Q_\mp$ as well as the anti-commutation relations
\[
Q_\pm^2 = \bar Q_\pm^2 = 0\,, \qquad \{Q_\sharp, \bar Q_\sharp\} = 0\,, \qquad
\{Q_+, Q_-\} = \tfrac12 (H + P)\,, \qquad \{\bar Q_+, \bar Q_-\} = \tfrac12 (H - P)\,. \tag{3.6}
\]
In addition, there are two chiral self-adjoint operators $T$ and $\bar T$, which behave like bosons; $T$ and $\bar T$ commute with each other and with H and P, and they act on the Q’s as “counting” operators:
\[
[\,T, Q_\pm\,] = \pm Q_\pm\,, \qquad [\,T, \bar Q_\pm\,] = 0\,, \qquad
[\,\bar T, \bar Q_\pm\,] = \pm \bar Q_\pm\,, \qquad [\,\bar T, Q_\pm\,] = 0\,. \tag{3.7}
\]
Note that $T$ and $\bar T$ generate the chiral U(1) × U(1) symmetry of the relations (3.6), which do not change if $Q_+$ and $\bar Q_+$ are separately multiplied by phases.

Let us now show that the N = (2, 2) algebra (3.6,7) is realized by the operators included in Kähler complex data, Definitions 2.20, 2.25 and 2.30. We first use the (self-adjoint) operator $T$ and the counting operator $T_{\rm tot}$ as above to define another (self-adjoint) operator $\bar T$ by $T_{\rm tot} = T + \bar T$. The relation $[\,T, \bar T\,] = 0$ follows since $T_{\rm tot}$ and $T$ commute. The operators $G$, defined as $G = [\,T, d\,]$, d being the nilpotent differential as before, and $\bar G = d - G$ are then easily seen to satisfy $\bar G = [\,\bar T, d\,]$ as well as the algebraic relations (3.7) – if we identify $G$ with $Q_+$ and $\bar G$ with $\bar Q_+$. Taking adjoints we also obtain the supercharges $Q_-$ and $\bar Q_-$. The remaining relations (3.6) of the N = (2, 2) supersymmetry algebra are proven in Lemma 2.21 and Theorem 2.30. Note that, via the identifications above, the charges $Q_+$ and $\bar Q_+$ simply correspond to the holomorphic and anti-holomorphic differentials $\partial$ and $\bar\partial$, respectively, on a Hermitian complex manifold if we work with the canonical Hermitian data of a Hermitian manifold (with untwisted spinor bundle). On a Kähler manifold, there is the additional requirement that the holomorphic and anti-holomorphic Laplace operators coincide:
\[
\square = \overline{\square}\,. \tag{3.8}
\]
They are to be identified with operators occurring in (3.6) as $\square = \frac12 (H + P)$ and $\overline{\square} = \frac12 (H - P)$; relation (3.8) is analogous to condition (3.3) in the N = (1, 1) case. Taking everything together, we have identified N = (2, 2) spectral data with Kähler geometry.

In the more general case of Hermitian complex geometry, condition (3.8) and the basic relation $\{Q_+, \bar Q_-\} = 0$ can be violated; see Theorem 2.30. Therefore, we cannot identify Hermitian geometry with a realization of an ordinary supersymmetry algebra.
Note, however, that, in any case, the combination $J_0 = T - \bar T$ of U(1) generators can be understood as the complex structure of the manifold; see Eqs. (1.49) and (1.54) in Sect. 1.2.

Just as we could define an additional operator $T_{\rm tot}$ on a Riemannian manifold which enabled us to introduce a non-chiral N = 2 structure on N = (1, 1) spectral data, we can construct a non-chiral N = 4 supersymmetry algebra from the data of an N = (2, 2) classical Kähler manifold M. The main task is to find operators that generate the SU(2)-symmetry of the N = 4 anti-commutation relations – whereas there is a simple choice for the representation of the supercharges on the algebra of forms over M: We can use the complex structure given on the Kähler manifold and put
\[
G_1 = \partial\,, \qquad G_2 = i\,\bar\partial^{\,*}\,, \qquad G_1^* = \partial^*\,, \qquad G_2^* = -i\,\bar\partial\,. \tag{3.9}
\]
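M is also equipped with the Kähler form $\Omega \in \Lambda^{(1,1)} M$, which is closed and real, i.e. $d\Omega = 0$, $\bar\Omega = \Omega$. This permits us to introduce operators on $\Gamma(\Lambda^{(p,q)} T^*M)$ by setting
\[
T^3 = \tfrac12\,(p + q - n)\,, \qquad T^+ = \epsilon(\Omega)\,, \qquad T^- = \iota(\Omega)\,; \tag{3.10}
\]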
here, $n = \dim_{\mathbb C} M$, $\epsilon(\Omega)$ is wedging by the Kähler form, and $\iota(\Omega) = \epsilon(\Omega)^*$. The anti-commutation relations of the operators $G_a$ follow directly from the Kähler conditions in Theorem 2.30 and the relations in Definition 2.20 and Lemma 2.21:
\[
\{G_a, G_b\} = \{G_a^*, G_b^*\} = 0\,, \qquad \{G_a, G_b^*\} = \delta_{ab}\,\square \tag{3.11}
\]
for a, b = 1, 2. The operators in (3.10), on the other hand, satisfy the commutation relations of the Lie algebra of SU(2), namely
\[
[\,T^3, T^\pm\,] = \pm T^\pm\,, \qquad [\,T^+, T^-\,] = 2\,T^3\,. \tag{3.12}
\]
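The relations (3.12) can be cross-checked numerically at a single point, where only the fermionic wedging and contraction operators matter. The following sketch is an illustration under stated assumptions, not the construction of the text: it uses a flat metric (the metric components are replaced by a Kronecker delta), complex dimension n = 2, and a Jordan–Wigner matrix realization of four fermionic modes (two for the holomorphic and two for the anti-holomorphic one-forms). The names `annihilator`, `T_PLUS` etc. are hypothetical, and T⁻ is simply defined as the adjoint of T⁺.

```python
from functools import reduce

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b, sign=1):
    """Entrywise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def kron(a, b):
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def comm(a, b):
    return add(matmul(a, b), matmul(b, a), -1)

def is_close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

Z = [[1, 0], [0, -1]]
LOWER = [[0, 1], [0, 0]]      # maps the occupied state (0, 1) to the empty state (1, 0)
ID2 = identity(2)
N_COMPLEX = 2                 # assumed complex dimension n
MODES = 2 * N_COMPLEX         # modes 0,1: holomorphic; modes 2,3: anti-holomorphic
DIM = 2 ** MODES

def annihilator(j):
    """Jordan-Wigner annihilation operator for mode j (0-based)."""
    return reduce(kron, [Z] * j + [LOWER] + [ID2] * (MODES - j - 1))

ANNIHILATORS = [annihilator(j) for j in range(MODES)]
CREATORS = [dagger(c) for c in ANNIHILATORS]

# Total form degree p + q is the fermion number.
NUMBER = reduce(add, [matmul(cd, c) for cd, c in zip(CREATORS, ANNIHILATORS)])
T3 = scale(0.5, add(NUMBER, identity(DIM), -N_COMPLEX))

# Wedging by the flat Kaehler form: i * sum_mu a^{mu *} a^{mu-bar *}.
T_PLUS = scale(1j, reduce(add, [matmul(CREATORS[m], CREATORS[m + N_COMPLEX])
                                for m in range(N_COMPLEX)]))
T_MINUS = dagger(T_PLUS)      # contraction, defined here as the adjoint

assert is_close(comm(T3, T_PLUS), T_PLUS)
assert is_close(comm(T3, T_MINUS), scale(-1, T_MINUS))
assert is_close(comm(T_PLUS, T_MINUS), scale(2, T3))
```

Because only the algebra of the fermionic operators enters, the check is insensitive to the particular matrix realization chosen for them.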
Only the last of these equations is not immediately obvious; to prove it, we locally choose holomorphic normal coordinates $z^\mu$, as in Theorem 2.29, and introduce the operators $a^{\mu *} = \epsilon(dz^\mu)$, $a^\mu = \iota(dz^\mu)$. Then we can write
\[
\epsilon(\Omega) = i\, g_{\mu\bar\nu}\, a^{\mu *} a^{\bar\nu *}\,, \qquad
\iota(\Omega) = -i\, g_{\rho\bar\sigma}\, a^{\rho} a^{\bar\sigma}\,,
\]
since $\Omega = i\, g_{\mu\bar\nu}\, dz^\mu \wedge d\bar z^\nu$. The simple anti-commutation rules (2.72) satisfied by the $a^{\mu *}$, $a^{\bar\nu}$ allow for an easy verification of (3.12), much along the lines of Lemma 2.39. Similarly, we can show that the SU(2)-generators act on the supercharges as follows:
\[
\begin{aligned}
[\,T^3, G_1\,] &= \tfrac12\, G_1\,, & [\,T^3, G_2\,] &= -\tfrac12\, G_2\,, \\
[\,T^+, G_1\,] &= 0\,, & [\,T^+, G_2\,] &= G_1\,, \\
[\,T^-, G_1\,] &= G_2\,, & [\,T^-, G_2\,] &= 0\,.
\end{aligned} \tag{3.13}
\]
To verify (3.13), we write $\partial = a^{\mu *} \partial_\mu$ and $\partial^* = -a^\mu \partial_\mu$, which is true at the center of a complex geodesic coordinate system. Using $T^{3\,*} = T^3$, $T^{\pm\,*} = T^\mp$, we can also deduce the SU(2)-action on $G_a^*$ from (3.13). To complete the commutation relations of the N = 4 superalgebra, we use (3.11) and (3.13) to show that the $T$’s commute with the operator $\square$, e.g.
\[
[\,T^+, \square\,] = [\,T^+, \{G_1, G_1^*\}\,] = \{G_1, [\,T^+, G_1^*\,]\} = -\{G_1, G_2^*\} = 0\,.
\]
The relations (3.11–13) – which we have derived for operators on a classical Kähler manifold originally associated to chiral N = (2, 2) spectral data – are indeed the relations of an N = 4 supersymmetry algebra. The non-chiral character of this N = 4 algebra is not only obvious from the choice of $G_a$ in (3.9) – note that we could also have started with the de Rham differential d as one of the supercharges –, but is also reflected in the fact that the SU(2)-generators are constructed in terms of the “mixed” (1,1)-form $\Omega$.

We have seen in Sect. 2.6 that there is a variant of the N = 4 supersymmetry algebra which provides an algebraic description of symplectic geometry. Here the SU(2) generators are constructed with the symplectic form instead of the Kähler form used in (3.10), and the two SU(2) doublets are generated from $d$ and $d^*$, see Propositions 2.45 and 2.49. The algebraic relations satisfied by these operators in general do not coincide with the N = 4 relations on a Kähler manifold, in particular the SU(2) generators need not commute with the Laplacian $\Delta$. Therefore, we cannot simply identify operators on a general symplectic manifold with generators of some supersymmetry algebra. As was shown at the end of Sect. 2.6, the symplectic relations deviate from the N = 4 algebra if and only if the symplectic manifold has no integrable complex structure that is compatible with the symplectic structure.

In analogy to the step from N = 2 to N = (2, 2) data, the presence of N = (4, 4) spectral data on a Kähler manifold M contains new information: In this case, M is in fact a Hyperkähler manifold equipped with three complex structures and a holomorphic symplectic form $\omega \in \Lambda^{(2,0)} M$. We first present the complete commutation relations of the supersymmetry algebra with N = 4 supercharges of each chirality in abstract terms.
The left-moving sector contains the fermionic operators
\[
G_1 := \tfrac{1}{\sqrt2} \left( Q_L^{(1)} + i\, Q_L^{(2)} \right) , \qquad
G_2 := \tfrac{1}{\sqrt2} \left( Q_L^{(3)} - i\, Q_L^{(4)} \right) \tag{3.14}
\]
and their adjoints, and from the $Q_R^{(i)}$ we form $\bar G_a$, $\bar G_a^*$, a = 1, 2, analogously. The anti-commutators within chiral sectors read
\[
\begin{aligned}
\{G_a, G_b\} &= \{G_a^*, G_b^*\} = 0\,, & \{G_a, G_b^*\} &= \delta_{ab}\,(H + P)\,, \\
\{\bar G_a, \bar G_b\} &= \{\bar G_a^*, \bar G_b^*\} = 0\,, & \{\bar G_a, \bar G_b^*\} &= \delta_{ab}\,(H - P)\,,
\end{aligned} \tag{3.15}
\]
and left-moving charges $G_a$, $G_a^*$ anti-commute with right-movers $\bar G_b$, $\bar G_b^*$. It is easy to see that (3.15) is left invariant by SU(2) × SU(2) transformations
\[
\begin{pmatrix} G_1 \\ G_2 \end{pmatrix} \longmapsto A \begin{pmatrix} G_1 \\ G_2 \end{pmatrix} , \qquad
\begin{pmatrix} \bar G_1 \\ \bar G_2 \end{pmatrix} \longmapsto \bar A \begin{pmatrix} \bar G_1 \\ \bar G_2 \end{pmatrix} , \tag{3.16}
\]
where $A, \bar A \in$ SU(2) are two independent matrices. In addition, the relations within each of the two supercharge doublets are left invariant by rescaling with a phase, so the full symmetry Lie algebra consists of two commuting copies of su(2) ⊕ u(1). For the moment, we ignore the u(1), which can be associated with the generator $J_0$ of Eq. (1.49) in Sect. 1.2, but we will come to it at the end of this section. As a consequence of (3.16), the complete N = 4 supersymmetry algebra involves generators $T^\alpha$, $\bar T^\alpha$, α = 1, 2, 3, of two commuting SU(2) actions, with the $G_a$ transforming according to the fundamental representation:
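\[
[\,T^\alpha, T^\beta\,] = i\,\epsilon^{\alpha\beta\gamma}\, T^\gamma\,, \qquad
[\,T^\alpha, G_a\,] = \tfrac12\, \overline{\tau^\alpha_{ab}}\; G_b\,. \tag{3.17}
\]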
The bar in the second equation indicates complex conjugation of the Pauli matrices. $T^\alpha$ commutes with all generators in the right-moving sector; analogous statements hold for $\bar T^\alpha$ and $\bar G_a$. The operators and relations in the left-moving sector correspond to the operators on (p, q)-forms over a Hyperkähler manifold which were introduced in Subsect. 2.5.2 and satisfy the relations of Theorem 2.38. The right-moving generators, on the other hand, may be identified with their anti-holomorphic partners. In addition, since every Hyperkähler manifold is automatically Kähler, the holomorphic and anti-holomorphic Laplacians $\square = H + P$ resp. $\overline{\square} = H - P$ have to coincide (P = 0). All in all, we may say that N = (4, 4) supersymmetry corresponds to Hyperkähler geometry.

In view of the examples of Riemannian and Kähler manifolds, it is natural to ask whether the N = (4, 4) structure of Hyperkähler data can be rewritten as an N = 8 algebra involving eight supersymmetry charges as well as the generators of a larger Lie group acting on the fermions. This is indeed possible: In order to simplify the presentation of the commutation relations later on, we introduce new notations for the fermionic generators of Sect. 2.5.2:
\[
\begin{aligned}
G^{(1,0)} &= G^{1+} = \partial\,, & G^{(-1,0)} &= G^{2+} = [\,\iota(\omega), \partial\,]\,, \\
G^{(0,1)} &= -\bar G^{2-} = [\,\epsilon(\bar\omega), \bar\partial^{\,*}\,]\,, & G^{(0,-1)} &= -\bar G^{1-} = -\bar\partial^{\,*}\,,
\end{aligned} \tag{3.18}
\]
and
\[
\begin{aligned}
\bar G^{(1,0)} &= G^{2-} = -[\,\epsilon(\omega), \partial^*\,]\,, & \bar G^{(-1,0)} &= -G^{1-} = -\partial^*\,, \\
\bar G^{(0,1)} &= \bar G^{1+} = \bar\partial\,, & \bar G^{(0,-1)} &= -\bar G^{2+} = -[\,\iota(\bar\omega), \bar\partial\,]\,.
\end{aligned} \tag{3.19}
\]
Note that the $G^{(\eta_1,\eta_2)}$ defined in (3.18) include both holomorphic and anti-holomorphic operators on M, and analogously for the $\bar G^{(\eta_1,\eta_2)}$ in (3.19); this “mixing” already occurred in the passage from N = (2, 2) to N = 4. We also introduce new normalizations and notations for the generators of the two commuting copies of su(2) acting on $\Gamma(\Lambda^{(p,q)} T^*M)$ over a Hyperkähler manifold of complex dimension n; $\omega$ and $\bar\omega$ are the holomorphic resp. anti-holomorphic symplectic forms:
\[
H^1 = 2\,T^3 = p - \frac n2\,, \qquad E^{(2,0)} = \sqrt2\, T^+ = \sqrt2\, \epsilon(\omega)\,, \qquad E^{(-2,0)} = \sqrt2\, T^- = \sqrt2\, \iota(\omega) \tag{3.20}
\]
and
\[
H^2 = 2\,\bar T^3 = q - \frac n2\,, \qquad E^{(0,2)} = \sqrt2\, \bar T^+ = \sqrt2\, \epsilon(\bar\omega)\,, \qquad E^{(0,-2)} = \sqrt2\, \bar T^- = \sqrt2\, \iota(\bar\omega)\,. \tag{3.21}
\]
Next, we want to extend the bosonic algebra su(2) ⊕ su(2), i.e., we want to define further generators (“currents”) of a larger Lie algebra in terms of the geometric operations on $\Gamma(\Lambda^{(p,q)} T^*M)$, which are related to the Riemannian metric g and to wedging and contraction operators built from $\omega$ and $\bar\omega$ – or, equivalently, from the three complex structures I, J and K. The new bosonic generators should act within the fibres of the complexified bundle of differential forms. We must therefore construct local expressions for the currents – which are bi-linear in the fermionic operators $a^\mu$ and $a^{\mu *}$, contracted with one of the rank two tensors listed above. We find four more bosonic operators:
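\[
\begin{aligned}
E^{(1,1)} &= \epsilon(-i\,\Omega_I) = g_{\mu\bar\nu}\, a^{\mu *} a^{\bar\nu *}\,, & E^{(-1,-1)} &= \iota(i\,\Omega_I) = g_{\mu\bar\nu}\, a^{\mu} a^{\bar\nu}\,, \\
E^{(1,-1)} &= J_{\mu\nu}\, a^{\mu *} a^{\nu}\,, & E^{(-1,1)} &= -J_{\bar\mu\bar\nu}\, a^{\bar\mu *} a^{\bar\nu}\,,
\end{aligned} \tag{3.22}
\]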
where $\Omega_I = i\, g_{\mu\bar\nu}\, dz^\mu \wedge d\bar z^\nu$ is the Kähler form associated to the complex structure I, and $J_{\mu\nu}$ are the components of the holomorphic symplectic form, see Eqs. (2.52–56); again, $a^{\mu *}$ and $a^{\nu}$ belong to some local coordinates, e.g. geodesic ones, as in Theorem 2.29. Since the first derivatives of the metric and of the symplectic form vanish at the center of such a coordinate system, it is not difficult to show that the operators introduced in (3.20–22) obey the following commutation relations:
\[
\begin{aligned}
&[\,H^i, E^{(n_1,n_2)}\,] = n_i\, E^{(n_1,n_2)}\,, \qquad
[\,E^{(n_1,n_2)}, E^{(-n_1,-n_2)}\,] = n_1 H^1 + n_2 H^2\,, \\
&[\,E^{(n_1,n_2)}, E^{(m_1,m_2)}\,] = 0 \quad \text{if} \quad
\begin{pmatrix} n_1 + m_1 \\ n_2 + m_2 \end{pmatrix} \notin
\left\{ \begin{pmatrix} \pm 2 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ \pm 2 \end{pmatrix}, \pm \begin{pmatrix} 1 \\ \pm 1 \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \end{pmatrix} \right\} , \\
&[\,E^{(2,0)}, E^{(-1,1)}\,] = \sqrt2\, E^{(1,1)}\,, \qquad [\,E^{(2,0)}, E^{(-1,-1)}\,] = -\sqrt2\, E^{(1,-1)}\,, \\
&[\,E^{(0,2)}, E^{(-1,-1)}\,] = -\sqrt2\, E^{(-1,1)}\,, \qquad [\,E^{(0,2)}, E^{(1,-1)}\,] = \sqrt2\, E^{(1,1)}\,, \\
&[\,E^{(1,1)}, E^{(1,-1)}\,] = \sqrt2\, E^{(2,0)}\,, \qquad [\,E^{(1,1)}, E^{(-1,1)}\,] = \sqrt2\, E^{(0,2)}\,;
\end{aligned} \tag{3.23}
\]
further commutation relations follow from the Hermiticity properties
\[
H^{i\,*} = H^i\,, \qquad E^{(n_1,n_2)\,*} = E^{(-n_1,-n_2)}\,. \tag{3.24}
\]
Equations (3.23,24) can be shown to be equivalent to the commutation relations of the Lie algebra of Sp(4), e.g. by computing the Dynkin diagram. With the help of the explicit realization of the currents in terms of fermionic operators $a^{\mu *}$, $a^{\mu}$, etc., it is also straightforward to determine the transformation properties of the fermionic charges given in (3.18,19) under the Sp(4) generators. One finds that the $G^{(\eta_1,\eta_2)}$ transform under the fundamental representation of Sp(4):
\[
\begin{aligned}
&[\,H^i, G^{(\eta_1,\eta_2)}\,] = \eta_i\, G^{(\eta_1,\eta_2)}\,, \\
&[\,E^{(n_1,n_2)}, G^{(\eta_1,\eta_2)}\,] = 0 \quad \text{if} \quad
\begin{pmatrix} n_1 + \eta_1 \\ n_2 + \eta_2 \end{pmatrix} \notin
\left\{ \begin{pmatrix} \pm 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ \pm 1 \end{pmatrix} \right\} , \\
&[\,E^{(-1,-1)}, G^{(1,0)}\,] = G^{(0,-1)}\,, \quad [\,E^{(-2,0)}, G^{(1,0)}\,] = \sqrt2\, G^{(-1,0)}\,, \quad [\,E^{(-1,1)}, G^{(1,0)}\,] = G^{(0,1)}\,, \\
&[\,E^{(2,0)}, G^{(-1,0)}\,] = \sqrt2\, G^{(1,0)}\,, \quad [\,E^{(1,1)}, G^{(-1,0)}\,] = -G^{(0,1)}\,, \quad [\,E^{(1,-1)}, G^{(-1,0)}\,] = G^{(0,-1)}\,, \\
&[\,E^{(0,-2)}, G^{(0,1)}\,] = -\sqrt2\, G^{(0,-1)}\,, \quad [\,E^{(-1,-1)}, G^{(0,1)}\,] = -G^{(-1,0)}\,, \quad [\,E^{(1,-1)}, G^{(0,1)}\,] = G^{(1,0)}\,, \\
&[\,E^{(0,2)}, G^{(0,-1)}\,] = -\sqrt2\, G^{(0,1)}\,, \quad [\,E^{(1,1)}, G^{(0,-1)}\,] = G^{(1,0)}\,, \quad [\,E^{(-1,1)}, G^{(0,-1)}\,] = G^{(-1,0)}\,.
\end{aligned} \tag{3.25}
\]
The supercharges $\bar G^{(\eta_1,\eta_2)}$ span a second fundamental representation of Sp(4) with similar relations. Note that $G$’s and $\bar G$’s are related by the following Hermiticity properties:
\[
\bar G^{(1,0)} = G^{(-1,0)\,*}\,, \qquad \bar G^{(-1,0)} = -G^{(1,0)\,*}\,, \qquad
\bar G^{(0,1)} = -G^{(0,-1)\,*}\,, \qquad \bar G^{(0,-1)} = G^{(0,1)\,*}\,. \tag{3.26}
\]
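The labels of the currents and supercharges can be used for a quick consistency check of the Sp(4) identification. The sketch below is an illustration under stated assumptions: it treats the labels (n1, n2) of the E's as root vectors and the labels (η1, η2) of the G's as weights, uses the Euclidean inner product on the label plane, and picks (1, −1) and (0, 2) as simple roots; these choices and all names in the code are introduced here and are not taken from the text. It computes the Cartan matrix, the dimension of the bosonic algebra, and the number of commutators of an E with a G that the selection rule allows to be non-zero.

```python
# Labels of the currents E^(n1,n2), read as roots, and of the charges
# G^(eta1,eta2), read as weights of the four-dimensional representation.
ROOTS = [(2, 0), (-2, 0), (0, 2), (0, -2), (1, 1), (1, -1), (-1, 1), (-1, -1)]
WEIGHTS = [(1, 0), (-1, 0), (0, 1), (0, -1)]
RANK = 2                                  # Cartan generators H^1, H^2
SIMPLE_ROOTS = [(1, -1), (0, 2)]          # assumed choice of simple roots

def dot(u, v):
    """Euclidean inner product on the label plane (an assumption)."""
    return u[0] * v[0] + u[1] * v[1]

def vec_add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def cartan_matrix(simple):
    # A_ij = 2 (alpha_i, alpha_j) / (alpha_j, alpha_j); all entries are integers here.
    return [[2 * dot(a, b) // dot(b, b) for b in simple] for a in simple]

CARTAN = cartan_matrix(SIMPLE_ROOTS)

# Every positive root should be a non-negative integer combination of the simple roots.
POSITIVE = {vec_add(tuple(c1 * x for x in SIMPLE_ROOTS[0]),
                    tuple(c2 * x for x in SIMPLE_ROOTS[1]))
            for c1 in range(3) for c2 in range(2)} & set(ROOTS)

# Dimension of the bosonic Lie algebra: Cartan generators plus one current per root.
DIMENSION = RANK + len(ROOTS)

# Pairs (root, weight) for which the selection rule allows [E, G] to be non-zero.
ALLOWED_EG = [(n, eta) for n in ROOTS for eta in WEIGHTS
              if vec_add(n, eta) in WEIGHTS]

assert all(tuple(-x for x in r) in ROOTS for r in ROOTS)   # closed under negation
assert CARTAN == [[2, -1], [-2, 2]]      # double bond: the rank-two diagram B2 = C2
assert len(POSITIVE) == 4
assert DIMENSION == 10                   # dim sp(2r) = r(2r + 1) with r = 2
assert len(ALLOWED_EG) == 12
```

The count of twelve allowed pairs agrees with the number of non-vanishing commutators between currents and supercharges that are listed explicitly in the text.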
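Finally, the anti-commutators of the fermionic generators with each other are a direct consequence of the Hyperkähler relations of Theorem 2.38. They read
\[
\begin{aligned}
&\{ G^{(\eta_1,\eta_2)}, G^{(\eta_1',\eta_2')} \} = \{ \bar G^{(\eta_1,\eta_2)}, \bar G^{(\eta_1',\eta_2')} \} = 0\,, \\
&\{ G^{(\eta_1,\eta_2)}, \bar G^{(\eta_1',\eta_2')} \} = 0 \quad \text{if} \quad (\eta_1 + \eta_1', \eta_2 + \eta_2') \neq (0, 0)\,, \\
&\{ G^{(1,0)}, \bar G^{(-1,0)} \} = \{ G^{(0,-1)}, \bar G^{(0,1)} \} = -\tfrac12\, \Delta\,, \\
&\{ G^{(-1,0)}, \bar G^{(1,0)} \} = \{ G^{(0,1)}, \bar G^{(0,-1)} \} = \tfrac12\, \Delta\,.
\end{aligned} \tag{3.27}
\]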
As before, it would be straightforward to proceed towards sets of N = (8, 8) supersymmetric data with 8 + 8 supercharges and Sp(4) × Sp(4) symmetry. Such spectral data describe a very rigid type of complex geometry, related to manifolds discussed in [Joy]. We will not describe these geometries here.

Let us summarize the correspondence between different supersymmetry algebras and different types of classical differential geometry in the following table:

N = 1                           : Riemannian (spin) geometry, Connes’ complex $\Omega_D$
  ↑| ↓
N = 2    ←− −→    N = (1, 1)    : Riemannian geometry, de Rham complex
  ↑|                 ↑|
N = 4+   ←− −→    N = (2, 2)    : complex (Kähler) geometry, Dolbeault complex
  ↑|                 ↑|
N = 8    ←− −→    N = (4, 4)    : Hyperkähler geometry
In addition, we have described symplectic geometry and complex Hermitian geometry, which are not included in the above scheme since they correspond to broken N = 4 (or N = (2, 2) ) supersymmetry; but see [FGR ]. In the table, fat arrows pointing e.g. from N = (2n, 2n) to N = (n, n) data indicate that the algebraic relations of the latter are included within those of the former. The necessary re-interpretations occur at a purely algebraic level. Therefore the corresponding “transitions” will be possible in the non-commutative case, too. The same is true for the arrows from N = 2n to N = n in the left row. Here, the notation N = 4+ means that the corresponding data possess N = 4 supersymmetry with enlarged symmetry group SU(2) × U(1). The extra U(1) factor was mentioned already after Eq. (3.16), as well as in Sect. 1.2: It is always present if the N = 4 data arise from a reduction of N = 8 data (whose symmetry group has rank 2), and is in fact necessary if one wants to recover N = (2, 2) data (containing two commuting abelian Lie groups) from the N = 4 data. Up to this subtlety, all the fat arrows pointing to the right merely involve re-interpretations of the N = 2n generators.
In Connes’ original context, N = 1 data always correspond to a spin manifold, and thus the passage from N = (1, 1) data associated with an arbitrary Riemannian manifold to an N = 1 description is in general obstructed. In the formulation of Sect. 2.1, however, there is an N = 1 set of data for every Riemannian manifold, which is why the table also includes a fat arrow from N = (1, 1) to N = 1. The thin arrows mark special extensions of a set of spectral data by constructing additional operators, either further symmetry generators for N = (n, n) −→ N = 2n, or a second Dirac operator for N = 1 −→ N = (1, 1). As we have seen, these transitions are always possible for spectral data describing a classical manifold. In the non-commutative case, it seems that this is not generally true – see the remarks below and in [FGR ]. In both cases, steps against the upward arrows correspond to true extensions of the spectral data with passage to higher supersymmetry content and to a more rigid geometrical structure. Whether this is possible depends, of course, on the properties of the specific manifold or non-commutative space. One could also say it depends on the symmetries “hidden” in the spectral data of the (non-commutative) space: In the introduction, we have seen that different types of spectral data associated with different types of classical manifolds can be written in the alternative form (A, H, d, d∗ , g), where d is the nilpotent de Rham differential on the manifold, and g denotes some Lie algebra which acts on H – and on d, d∗ , thereby generating further supersymmetry charges and determining the supersymmetry content of the spectral data. To be precise, g = u(1), su(2) ⊕ u(1) or sp(4) for the N = 2, N = 4+ or N = 8 geometries, respectively. This concise classification of geometries according to their supersymmetry content – or simply: according to their symmetry – opens up various new possibilities. 
First of all, we gain a clear picture of what the natural non-commutative generalizations of Riemannian and complex geometry should look like: The algebra of functions $C^\infty(M)$ is to be replaced by a non-commutative ∗-algebra A now acting on some appropriate Hilbert space H, but the algebraic relations for the collection of generalized Dirac operators on H are just the same as in the classical cases. We observe that, within the non-commutative context, the transition from N = (n, n) spectral data to an N = 2n structure will not always be possible, because the additional bosonic generators will not, in general, commute with the elements of the algebra A. Likewise, it appears that one cannot always pass from N = 1 to N = (1, 1) spectral data, i.e. introduce supersymmetry into the data. Comparing to the classical situation, where one essentially passes from the sections Γ(S) of a spinor bundle S to sections Γ(S) = Γ(S) ⊗ Γ(S) – with the tensor product taken over the algebra of smooth functions – one might try to introduce two “Dirac operators” on a suitable tensor product space. This question will be discussed in more detail in [FGR], where we will find that, given some extra structure on the N = 1 spectral data, there is indeed a natural definition of the operators $D$ and $\bar D$ on the new Hilbert space, but in general they need not obey the N = (1, 1) algebra.

Another comment is in order, concerning the apparent identification of commutative algebras with classical geometry on the one hand, and non-commutative algebras with non-commutative geometry on the other. Recall the idea that was outlined in the introduction, namely that our spectral data describe how a quantum mechanical spinning particle (or a bound state of two such particles) would “see” the manifold it propagates on.
From the point of view of quantum mechanics, the natural object to study is quantized phase space rather than configuration space, with the – non-commutative – algebra of pseudo-differential operators substituting for C ∞ (M ). More precisely, it is natural to require that the algebra A used in the spectral data is invariant under time evolution generated by the quantum mechanical Hamiltonian. From the physical point of view, there
are reasons to believe that the non-commutative spectral data containing the algebra of “functions over quantized phase space” contain all essential information on the classical background manifold. However, at present we lack a complete understanding of how to extract it explicitly. We expect that at least the cohomology of M can be obtained from the differential forms over phase space in a rather straightforward way. These questions will be dealt with in a future publication.

An important pay-off of our identification of the relations occurring in spectral data with the defining (anti-)commutators of supersymmetry algebras is that it immediately provides us with a large “pool” of physical examples for non-commutative geometries: In any supersymmetric QFT on a cylindrical space-time $S^1 \times \mathbb{R}$, the full algebra formed by the charges $Q$, $\bar Q$ together with the Poincaré generators H and P and additional bosonic symmetry generators $T$, $\bar T$ is represented on the Hilbert space H of states, as well as on the algebra A of fields. Therefore we can simply start from the tuple $(A, H, Q, \bar Q, T, \bar T)$ in order to produce admissible spectral data. Disregarding complex Hermitian and symplectic data, we encountered the additional requirement that the left- and right-moving “Laplace” operators coincide. In terms of the above identification, this simply means that we have to project onto the zero-momentum subspace, i.e. set P = 0. At the level of differential forms, this restriction amounts to studying $S^1$-equivariant instead of ordinary cohomology.

In our view, the most important examples of supersymmetric QFTs are the N = 2 superconformal field theories associated to perturbative vacua of string theory. General considerations suggest that the structure of these theories in principle also determines the features of space-time.
The methods used in the literature are, however, essentially limited to uncover classical (and mainly topological) aspects of space-time from N = 2 superconformal field theories, while our approach should make it possible to study the full non-commutative geometry of the underlying “quantum space-time”.

Appendix: Clifford Algebras and Spin^c-Groups

In this appendix, we review some standard material about Clifford algebras and the groups Spin^c which will be needed in the text. In Subsect. A.2, we follow [Sa], where the reader may find detailed proofs and further references.

A.1. Clifford algebras. Let $F_k$ be the unital complex algebra generated by elements $b^i$ and their adjoints $b^{i*}$, $i = 1, \dots, k$, satisfying the canonical anti-commutation relations (CAR)
\[
\{b^i, b^j\} = \{b^{i*}, b^{j*}\} = 0\,, \qquad \{b^i, b^{j*}\} = \delta^{ij}\,. \tag{A.1}
\]
Note that here we regard the $b^i$ as abstract generators of a ∗-algebra; we can alternatively identify them with basis elements of a real Euclidean vector space. Then the Kronecker symbol $\delta^{ij}$ would have to be replaced with the inner product of $b^i$ and $b^j$. A representation $\pi : F_k \longrightarrow \mathrm{End}(W)$ of $F_k$ on a Hermitian vector space W is called unitary if $\pi(b^{i*}) = \pi(b^i)^*$. In fact, the algebra $F_k$ has a unique irreducible unitary representation which is explicitly realized on $(\mathbb{C}^2)^{\otimes k}$ by setting
\[
b^j = \tau_3 \otimes \dots \otimes \tau_3 \otimes \tau_- \otimes 1 \otimes \dots \otimes 1\,, \tag{A.2}
\]
\[
b^{j*} = \tau_3 \otimes \dots \otimes \tau_3 \otimes \tau_+ \otimes 1 \otimes \dots \otimes 1\,, \tag{A.3}
\]
with $\tau_\pm := \frac12 (\tau_1 \pm i \tau_2)$ in the $j$-th position; $\tau_a$ denote the Pauli matrices
\[
\tau_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \qquad
\tau_2 = \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix} , \qquad
\tau_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} .
\]
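The tensor-product realization (A.2), (A.3) can be checked directly for small k. The sketch below is an illustration, with all function names (`generator`, `check_car`) introduced here as assumptions. The entries of the matrix called `TAU2` follow the reading of the printed matrix adopted here; flipping the sign of `TAU2` merely exchanges τ₊ and τ₋ and leaves the outcome of the check unchanged.

```python
from functools import reduce

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b, sign=1):
    """Entrywise a + sign * b."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def kron(a, b):
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def anticomm(a, b):
    return add(matmul(a, b), matmul(b, a))

def is_close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

ID2 = identity(2)
TAU1 = [[0, 1], [1, 0]]
TAU2 = [[0, 1j], [-1j, 0]]    # sign convention does not affect the CAR check
TAU3 = [[1, 0], [0, -1]]
TAU_MINUS = scale(0.5, add(TAU1, scale(1j, TAU2), -1))
TAU_PLUS = scale(0.5, add(TAU1, scale(1j, TAU2)))

def generator(j, k, ladder):
    """tau_3 in positions 1..j-1, `ladder` in position j, identity afterwards."""
    return reduce(kron, [TAU3] * (j - 1) + [ladder] + [ID2] * (k - j))

def check_car(k):
    """Return True if the matrices of (A.2), (A.3) satisfy the CAR (A.1)."""
    b = [generator(j, k, TAU_MINUS) for j in range(1, k + 1)]
    b_star = [generator(j, k, TAU_PLUS) for j in range(1, k + 1)]
    dim = 2 ** k
    zero = [[0] * dim for _ in range(dim)]
    one = identity(dim)
    for i in range(k):
        if not is_close(b_star[i], dagger(b[i])):      # unitarity of the representation
            return False
        for j in range(k):
            if not is_close(anticomm(b[i], b[j]), zero):
                return False
            if not is_close(anticomm(b_star[i], b_star[j]), zero):
                return False
            if not is_close(anticomm(b[i], b_star[j]), one if i == j else zero):
                return False
    return True

assert check_car(1)
assert check_car(3)
```

The factors of τ₃ in front of the ladder matrix are what turns commuting operators on different tensor factors into anti-commuting ones.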
This representation is faithful, and therefore $F_k$ is isomorphic to the full matrix algebra $M(2^k, \mathbb{C})$.

Let V be a real n-dimensional vector space with scalar product (·, ·). The complexified Clifford algebra Cl(V) associated to V is the unital complex algebra generated by vectors $v, w \in V$ subject to the relations
\[
\{v, w\} = -2\,(v, w)\,. \tag{A.4}
\]
If $e^1, \dots, e^n$ is an orthonormal basis of V, the relations (A.4) are equivalent to
\[
\{e^i, e^j\} = -2\,\delta^{ij}\,. \tag{A.5}
\]
A ∗-operation in Cl(V) is induced by anti-linear and involutive extension of $v^* := -v$ for all $v \in V$, and, accordingly, a representation $\pi : \mathrm{Cl}(V) \to \mathrm{End}(W)$ of Cl(V) on a Hermitian vector space W is called unitary if $\pi(v)^* = -\pi(v)$ for all $v \in V$.

Let n = 2k + p, where p = 0, 1 is the parity of n. For $j = 1, \dots, k$, consider the following generators of the Clifford algebra:

$$b_j = \tfrac{1}{2}\left(e^{2j-1} - i e^{2j}\right), \tag{A.6}$$
$$b_j^* = -\tfrac{1}{2}\left(e^{2j-1} + i e^{2j}\right), \tag{A.7}$$

and, for p = 1,

$$\nu = i^{k+1} e^1 e^2 \cdots e^n. \tag{A.8}$$

It is easy to check that the elements $b_j, b_j^*$ satisfy the CAR, and therefore, the Clifford algebra is isomorphic to the full matrix algebra $M(2^k, \mathbb{C})$ for p = 0. If p = 1, the relations

$$[b_j, \nu] = [b_j^*, \nu] = 0, \qquad \nu^2 = 1 \tag{A.9}$$

show that Cl(V) decomposes into a direct sum

$$\mathrm{Cl}(V) = \tfrac{1}{2}(1 - \nu) \cdot \mathrm{Cl}(V) \oplus \tfrac{1}{2}(1 + \nu) \cdot \mathrm{Cl}(V), \tag{A.10}$$

and each factor is again isomorphic to $M(2^k, \mathbb{C})$. Thus, for an odd-dimensional vector space V, the Clifford algebra Cl(V) has two unitary irreducible representations, related to each other by space reflection. Note also that the trace on Cl(V), inherited from the isomorphism to the matrix algebra, after appropriate normalization defines a scalar product on Cl(V),

$$(a, b) = 2^{-\left[\frac{n+1}{2}\right]}\, \mathrm{tr}(a^* b), \qquad a, b \in \mathrm{Cl}(V),$$

which extends the scalar product on V.
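The statement above that the elements of (A.6), (A.7) satisfy the CAR can be checked directly from (A.5). A short worked computation follows, where the shorthand $e := e^{2j-1}$, $e' := e^{2j}$ is introduced here only for this check:

```latex
\begin{align*}
\{b_j, b_j^*\}
  &= -\tfrac14 \{e - i e',\, e + i e'\}
   = -\tfrac14 \bigl( \{e,e\} + i\{e,e'\} - i\{e',e\} + \{e',e'\} \bigr)
   = -\tfrac14 (-2 - 2) = 1, \\
\{b_j, b_j\}
  &= \tfrac14 \{e - i e',\, e - i e'\}
   = \tfrac14 \bigl( \{e,e\} - 2i\{e,e'\} - \{e',e'\} \bigr)
   = \tfrac14 (-2 + 2) = 0 .
\end{align*}
```

The relation $\{b_j^*, b_j^*\} = 0$ follows in the same way, and for $j \neq l$ the generators $b_j$ and $b_l$ involve disjoint pairs of orthonormal basis vectors, so all their mutual anti-commutators vanish by (A.5).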
A.2. The group $\mathrm{Spin}^c(V)$. We now assume that the real vector space V is oriented, and as before we denote by $x \mapsto x^*$, $x \in \mathrm{Cl}(V)$, the unique anti-linear anti-automorphism obtained by extending $v \mapsto -v$, $v \in V$. Then we can define the group Spin(V) as

$$\mathrm{Spin}(V) := \{x \in \mathrm{Cl}^{\mathrm{ev}}_{\mathbb{R}}(V) \mid x x^* = 1,\ x V x^* \subset V\}, \tag{A.11}$$

where $\mathrm{Cl}^{\mathrm{ev}}_{\mathbb{R}}(V)$ denotes the real sub-algebra of Cl(V) generated by products of an even number of elements of V. We will sometimes use the notation Spin(n) as an abbreviation for $\mathrm{Spin}(\mathbb{R}^n)$. The group $\mathrm{Spin}^c(V)$ is then given by

$$\mathrm{Spin}^c(V) := \{e^{i\alpha} x \mid \alpha \in \mathbb{R},\ x \in \mathrm{Spin}(V)\}. \tag{A.12}$$

For each $x \in \mathrm{Spin}^c(V)$, let

$$\mathrm{Ad}(x)v := x v x^*$$

denote the adjoint action on $v \in V$. There is a short exact sequence – see [Sa] –

$$1 \longrightarrow U(1) \longrightarrow \mathrm{Spin}^c(V) \xrightarrow{\ \mathrm{Ad}\ } SO(V) \longrightarrow 1. \tag{A.13}$$
Definition A.1 ([Sa]). A $\mathrm{spin}^c$-structure on V is a pair $(W, \Gamma)$ of a $2^k$-dimensional Hermitian vector space W and a linear map $\Gamma : V \to \mathrm{End}(W)$ such that

$$\Gamma(v)^* + \Gamma(v) = 0, \tag{A.14}$$
$$\Gamma(v)^* \Gamma(v) = (v, v) \cdot 1. \tag{A.15}$$

A $\mathrm{spin}^c$-isomorphism from $(V_0, W_0, \Gamma_0)$ to $(V_1, W_1, \Gamma_1)$ is a pair $(A, \Phi)$, where A is an orientation preserving orthogonal map from $V_0$ to $V_1$ and $\Phi$ is a unitary operator from $W_0$ to $W_1$ intertwining $\Gamma_0$ and $\Gamma_1$ “up to A”, more precisely

$$\Phi\, \Gamma_0(v)\, \Phi^{-1} = \Gamma_1(Av) \tag{A.16}$$

for all $v \in V_0$. The set of all such $\mathrm{spin}^c$-isomorphisms from $W_0$ to $W_1$ is denoted by $\mathrm{Hom}_{\mathrm{spin}^c}(W_0, W_1)$. For $W_0 = W_1$ this is a group; in fact:

Proposition A.2 ([Sa]). Let $(W, \Gamma)$ be a $\mathrm{spin}^c$-structure on V; then the map

$$\Xi_\Gamma : \mathrm{Spin}^c(V) \longrightarrow \mathrm{Hom}_{\mathrm{spin}^c}(W, W), \qquad x \longmapsto (\mathrm{Ad}(x), \Gamma(x)),$$

where $\Gamma$ now denotes the extension to $\mathrm{Spin}^c(V)$ of $\Gamma$ in Definition A.1, is an isomorphism.

Finally, we describe the Lie algebra, $\mathrm{spin}^c(V)$, of $\mathrm{Spin}^c(V)$. From the definitions of Spin(V) and $\mathrm{Spin}^c(V)$, it follows immediately that

$$\mathrm{spin}(V) = \{\xi \in \mathrm{Cl}^{\mathrm{ev}}(V) \mid \xi + \xi^* = 0,\ [\xi, V] \subset V\}$$

and

$$\mathrm{spin}^c(V) = \mathrm{spin}(V) \oplus u(1). \tag{A.17}$$

Furthermore, a direct computation proves that

$$\mathrm{spin}(V) = \mathrm{Cl}^2(V) := \Bigl\{ \sum_{i,j} a_{ij}\, e^i e^j \Bigm| a_{ij} = -a_{ji} \in \mathbb{R} \Bigr\} \tag{A.18}$$

as expected: Spin(V) is locally isomorphic to SO(V).
For the description of connections on $\mathrm{Spin}^c$-manifolds in Subsect. 2.2.2, we needed some further properties of $\mathrm{spin}^c$-isomorphisms. Let $(A_t, \Phi_t)$, $t \in [-1, 1]$, be a smooth 1-parameter family in $\mathrm{Hom}_{\mathrm{spin}^c}(W, W)$ with $(A_0, \Phi_0) = (1, 1)$, and set $dA = \frac{d}{dt} A_t|_{t=0}$ and $d\Phi = \frac{d}{dt} \Phi_t|_{t=0}$. Then, differentiating (A.16) at t = 0 yields

$$[\, d\Phi, \Gamma(e^i)\,] = \Gamma(dA\, e^i) = dA_{ij}\, \Gamma(e^j). \tag{A.19}$$

Since $d\Phi$ is an element of the Lie algebra $\mathrm{spin}^c(V)$ acting on W, it is of the form – see (A.17,18) –

$$d\Phi = \sum_{i,j} a_{ij}\, \Gamma(e^i)\Gamma(e^j) + i\,\delta_A \tag{A.20}$$

for some anti-symmetric matrix $a_{ij}$ and some $\delta_A \in \mathbb{R}$. It follows that $[\, d\Phi, \Gamma(e^i)\,] = 4 a_{ij}\, \Gamma(e^j)$, and with (A.19) and (A.20) we get

$$d\Phi = \tfrac{1}{4}\, dA_{ij}\, \Gamma(e^i)\Gamma(e^j) + i\,\delta_A. \tag{A.21}$$

Let $e^{i\alpha_t} x_t$, $x_t \in \mathrm{Spin}(V)$, be the path in $\mathrm{Spin}^c(V)$ corresponding to $(A_t, \Phi_t)$ by the isomorphism of Proposition A.2. Then the scalar term in (A.21) is given by

$$\delta_A = \frac{d}{dt}\, \alpha_t \Big|_{t=0}. \tag{A.22}$$
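The commutator used between (A.20) and (A.21) can be spelled out as follows. This is a worked sketch that uses only (A.14), (A.15) and the antisymmetry of the coefficient matrix; the abbreviation $\Gamma_i$ for $\Gamma$ evaluated on the $i$th orthonormal basis vector is introduced here for brevity. Polarizing $\Gamma(v)^2 = -\Gamma(v)^*\Gamma(v) = -(v,v)$ gives $\{\Gamma_k, \Gamma_l\} = -2\delta_{kl}$, and then:

```latex
\begin{align*}
[\Gamma_k \Gamma_l, \Gamma_i]
  &= \Gamma_k \{\Gamma_l, \Gamma_i\} - \{\Gamma_k, \Gamma_i\} \Gamma_l
   = -2\delta_{li}\, \Gamma_k + 2\delta_{ki}\, \Gamma_l , \\
\Bigl[ \sum_{k,l} a_{kl}\, \Gamma_k \Gamma_l ,\ \Gamma_i \Bigr]
  &= -2 \sum_k a_{ki}\, \Gamma_k + 2 \sum_l a_{il}\, \Gamma_l
   = 4 \sum_j a_{ij}\, \Gamma_j
  \qquad (a_{ki} = -a_{ik}) .
\end{align*}
```

The scalar term $i\delta_A$ is central and drops out of the commutator, so comparison with (A.19) gives $a_{ij} = \frac14\, dA_{ij}$, which is (A.21).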
A.3. Exterior algebra and spin representations. Let $V^{\mathbb{C}}$ be the complexification of V, with a $\mathbb{C}$-bilinear inner product induced by the scalar product on V, and let $\Lambda^\bullet V^{\mathbb{C}}$ be the associated exterior algebra. On the latter space, there exists a canonical scalar product with respect to which the set of elements

$$e^{i_1} \wedge \dots \wedge e^{i_\nu}, \qquad 1 \le i_1 < \dots < i_\nu \le n$$

is an orthonormal basis. For each $v \in V$, we define

$$a^*(v) : \Lambda^\bullet V^{\mathbb{C}} \longrightarrow \Lambda^\bullet V^{\mathbb{C}}, \qquad \omega \longmapsto a^*(v)\,\omega := v \wedge \omega,$$

and we denote by a(v) the adjoint of $a^*(v)$. This operator is contraction with v, $a(v) = v \lrcorner\,$; more explicitly,

$$a(v)\, e^{i_1} \wedge \dots \wedge e^{i_\nu} = \sum_{\alpha=1}^{\nu} (-1)^{\alpha+1} (e^{i_\alpha}, v)\, e^{i_1} \wedge \dots \wedge \widehat{e^{i_\alpha}} \wedge \dots \wedge e^{i_\nu}, \tag{A.23}$$

where $\widehat{\,\cdot\,}$ denotes omission of the term. $a^*(v)$ and $a(v)$ are extended to $v \in V^{\mathbb{C}}$ by $\mathbb{C}$-linearity; note that, for $w \in V^{\mathbb{C}}$, taking adjoints involves complex conjugation, i.e., $(a^*(w))^* = a(\bar{w})$. The operators $a_i^* := a^*(e^i)$ and $a_i := a(e^i)$ satisfy the CAR (A.1), and thus

$$\Gamma(v) := a^*(v) - a(v), \qquad \overline{\Gamma}(v) := i\,(a^*(v) + a(v)) \tag{A.24}$$

define two anti-commuting unitary representations of the Clifford algebra on $\Lambda^\bullet V^{\mathbb{C}}$. We study the even and the odd dimensional cases separately:
Let n = dim V be even (p = 0). We introduce the element

$$\gamma = i^k\, \Gamma(e^1)\Gamma(e^2) \cdots \Gamma(e^n) \tag{A.25}$$

which anti-commutes with all $\Gamma(v)$ and satisfies $\gamma^2 = 1$. Since the Clifford algebra has only one irreducible representation, say c on S, we can write

$$\Lambda^\bullet V^{\mathbb{C}} \cong S \otimes \overline{S}, \qquad \Gamma(v) \cong c(v) \otimes 1, \tag{A.26}$$

where, at this point, $\overline{S}$ is just some multiplicity space. Counting dimensions of course gives $\overline{S} \cong S$ as vector spaces. Furthermore, the equation $[\, \Gamma(v), \gamma \overline{\Gamma}(w)\,] = 0$ for all $v, w \in V$ implies

$$\overline{\Gamma}(v) = \gamma \otimes \overline{c}(v) \tag{A.27}$$

so $\overline{S}$ is also isomorphic to the unique irreducible representation of Cl(V). (The $\gamma$ in the first tensor factor of (A.27) ensures that the two Clifford actions anti-commute.)

Let n = dim V be odd (p = 1). We now define

$$\gamma = i^{k+1}\, \Gamma(e^1)\Gamma(e^2) \cdots \Gamma(e^n) = \Gamma(\nu), \tag{A.28}$$

see (A.8); this element commutes with all $\Gamma(v)$ and satisfies $\gamma^2 = 1$. The Clifford algebra now has two irreducible representations corresponding to $\gamma = \pm 1$, see (A.10), and one can show that both eigenvalues of $\gamma$ come with multiplicity $2^{n-1}$. Thus we have

$$\Lambda^\bullet V^{\mathbb{C}} \cong S \otimes \mathbb{C}^2 \otimes \overline{S}. \tag{A.29}$$

Explicit expressions for $\Gamma(v)$ and $\overline{\Gamma}(v)$ are given by

$$\Gamma(v) = c(v) \otimes \tau_3 \otimes 1, \qquad \overline{\Gamma}(v) = 1 \otimes \tau_1 \otimes \overline{c}(v). \tag{A.30}$$

Note that Eq. (A.30) implies anti-commutativity of $\Gamma$ and $\overline{\Gamma}$, as well as $\{\overline{\Gamma}(v), \gamma\} = 0$. Any other expression for $\Gamma$ and $\overline{\Gamma}$ is related to (A.30) by a unitary conjugation.

Acknowledgement. We thank A. Alekseev, O. Augenstein, J.-B. Bourgignon, A. Chamseddine, G. Felder, C. Voisin and M. Walze for interesting discussions, comments and collaborations. We are particularly grateful to A. Connes and K. Gawędzki for their invaluable help and inspiration and to A. Connes for advance copies of some of his works. J.F. thanks the I.H.É.S. for hospitality during various stages of work presented in this paper.
References

[AG] Alvarez-Gaumé, L.: Supersymmetry and the Atiyah–Singer index theorem. Commun. Math. Phys. 90, 161–173 (1983)
[AGF] Alvarez-Gaumé, L., Freedman, D.Z.: Geometrical structure and ultraviolet finiteness in the supersymmetric σ-model. Commun. Math. Phys. 80, 443–451 (1981)
[Au] Augenstein, O.: Supersymmetry, non-commutative geometry and model building. Diploma Thesis ETH Zürich, March 1996
[Beau] Beauville, A.: Varietés Kähleriennes dont la 1ère classe de Chern est nulle. J. Diff. Geom. 18, 755–782 (1983)
[Bes] Besse, A.L.: Einstein Manifolds. Berlin–Heidelberg–New York: Springer Verlag, 1987
[BGV] Berline, N., Getzler, E., Vergne, M.: Heat Kernels and Dirac Operators. Berlin–Heidelberg–New York: Springer Verlag, 1992
[BSN] Bergshoeff, E., Szegin, E., Nishino, H.: (8,0) Locally supersymmetric σ-models with conformal invariance in two dimensions. Phys. Lett. B 186, 167–172 (1987)
[Cal] Calabi, E.: On Kähler manifolds with vanishing canonical class. In: Algebraic geometry and topology, a symposium in honor of S. Lefschetz, Princeton: Princeton University Press 1955, pp. 78–89
[Can] Candelas, P.: Lectures on complex manifolds. In: Superstrings ’87, Proceedings of the Trieste Summer School, Eds. L. Alvarez-Gaumé et al., pp. 1–88
[CE] Chevalley, C., Eilenberg, S.: Cohomology of Lie groups and Lie algebras. Trans. Am. Math. Soc. 63, 85–124 (1948)
[CM] Coleman, S., Mandula, J.: All possible symmetries of the S-matrix. Phys. Rev. 159, 1251–1256 (1967)
[Co1] Connes, A.: Noncommutative Geometry. New York: Academic Press, 1994
[Co2] Connes, A.: Noncommutative differential geometry. Inst. Hautes Études Sci. Publ. Math. 62, 257–360 (1985)
[Co3] Connes, A.: The action functional in noncommutative geometry. Commun. Math. Phys. 117, 673–683 (1988)
[Co4] Connes, A.: Reality and noncommutative geometry. J. Math. Phys. 36, 6194–6231 (1995)
[CoK] Connes, A., Karoubi, M.: Caractère multiplicatif d’un module de Fredholm. K-Theory 2, 431–463 (1988)
[DFR] Doplicher, S., Fredenhagen, K., Roberts, J.E.: The quantum structure of space-time at the Planck scale and quantum fields. Commun. Math. Phys. 172, 187–220 (1995)
[F] Fröhlich, J.: The non-commutative geometry of two-dimensional supersymmetric conformal field theory. In: PASCOS, Proc. of the Fourth Intl. Symp. on Particles, Strings and Cosmology. K.C. Wali (ed.), Singapore: World Scientific 1995
[FG] Fröhlich, J., Gawędzki, K.: Conformal field theory and the geometry of strings. CRM Proceedings and Lecture Notes Vol. 7, 57–97 (1994)
[FGK] Felder, G., Gawędzki, K., Kupiainen, A.: Spectra of Wess–Zumino–Witten models with arbitrary simple groups. Commun. Math. Phys. 117, 127–158 (1988)
[FGR] Fröhlich, J., Grandjean, O., Recknagel, A.: Supersymmetry and non-commutative geometry. In preparation. Supersymmetric quantum theory, non-commutative geometry, and gravitation. In: Quantum Symmetry, A. Connes, K. Gawędzki (eds.), Lecture notes for the Les Houches 1995 Summer School
[FLL] Fröhlich, J., Lieb, E.H., Loss, M.: Stability of Coulomb systems with magnetic fields I. Commun. Math. Phys. 104, 251–270 (1986)
[FW] Friedan, D., Windey, P.: Supersymmetric derivation of the Atiyah–Singer index theorem and the chiral anomaly. Nucl. Phys. B 235, 395–416 (1984)
[Ge] Getzler, E.: Pseudo-differential operators on supermanifolds and the Atiyah–Singer index theorem. Commun. Math. Phys. 92, 163–178 (1983); A short proof of the Atiyah–Singer index theorem. Topology 25, 111–117 (1986)
[GeW] Gepner, D., Witten, E.: String theory on group manifolds. Nucl. Phys. B 278, 493–549 (1986)
[GL] Gol’fand, Y.A., Likhtman, E.P.: Extension of the algebra of Poincaré group generators and violation of P-invariance. JETP Lett. 13, 323–326 (1971)
[GSW] Green, M.B., Schwarz, J.H., Witten, E.: Superstring Theory I, II. Cambridge: Cambridge University Press, 1987
[HKLR] Hitchin, N.J., Karlhede, A., Lindstrom, U., Rocek, M.: Hyperkähler metrics and supersymmetry. Commun. Math. Phys. 108, 535–589 (1987)
[Ja1] Jaffe, A., Lesniewski, A.: Geometry of supersymmetry. Lectures given at the Erice Summer School, July 1988; Jaffe, A., Lesniewski, A., Weitsman, J.: The two-dimensional N = 2 Wess–Zumino–Witten model on a cylinder. Commun. Math. Phys. 114, 147–185 (1988)
[Ja2] Jaffe, A., Osterwalder, K.: Ward identities for non-commutative geometry. Commun. Math. Phys. 132, 119–130 (1990)
[Ja3] Jaffe, A., Lesniewski, A., Osterwalder, K.: On super-KMS functionals and entire cyclic cohomology. K-Theory 2, 675–682 (1989)
[Joy] Joyce, D.D.: Compact hypercomplex and quaternionic manifolds. J. Differ. Geom. 35, 743–762 (1992); Manifolds with many complex structures. Q. J. Math. Oxf. II. Ser. 46, 169–184 (1995)
[KN] Kobayashi, S., Nomizu, K.: Foundations of differential geometry I, II. New York: Interscience Publishers, 1963
[Ko] Kobayashi, S.: Differential geometry of complex vector bundles. Iwanami Shoten Publishers and Princeton University Press, 1987
[LaM] Lawson, H.B., Michelsohn, M.-L.: Spin Geometry. Princeton: Princeton University Press, 1989
[LiM] Libermann, P., Marle, C.-M.: Symplectic geometry and analytical mechanics. Dordrecht: D. Reidel Publishing Company, 1987
[LL] Lieb, E.H., Loss, M.: Stability of Coulomb systems with magnetic fields II. Commun. Math. Phys. 104, 271–282 (1986)
[LY] Loss, M., Yau, H.-T.: Stability of Coulomb systems with magnetic fields III. Commun. Math. Phys. 104, 283–290 (1986)
[Ni] Nicolai, H.: The integrability of N = 16 supergravity. Phys. Lett. B 194, 402–407 (1987)
[NS] Neveu, A., Schwarz, J.H.: Quark model of dual pions. Phys. Rev. D 4, 1109–1111 (1971)
[PS] Pressley, A., Segal, G.: Loop groups. Oxford: Clarendon Press, 1986
[R] Ramond, R.: Dual theory for free fermions. Phys. Rev. D 3, 2415–2418 (1971)
[Sa] Salamon, D.: Spin geometry and Seiberg–Witten invariants. Preprint 1995, to be published by Birkhäuser Verlag
[Su] Sullivan, D.: Exterior d, the local degree, and smoothability. IHES preprint 1995
[Sw] Swan, R.G.: Vector bundles and projective modules. Trans. Am. Math. Soc. 105, 264–277 (1962)
[VGB] Várilly, J.C., Gracia-Bondia, J.M.: Connes’ non-commutative differential geometry and the standard model. J. Geom. Phys. 12, 223–301 (1993)
[WeB] Wess, J., Bagger, J.: Supersymmetry and supergravity. Princeton: Princeton University Press, 1983
[Wel] Wells, R.O.: Differential analysis on complex manifolds. Berlin–Heidelberg–New York: Springer Verlag, 1979
[Wes] West, P.: Introduction to supersymmetry and supergravity. Singapore: World Scientific, 1986
[Wi1] Witten, E.: Constraints on supersymmetry breaking. Nucl. Phys. B 202, 253–316 (1982)
[Wi2] Witten, E.: Supersymmetry and Morse theory. J. Diff. Geom. 17, 661–692 (1982)
[Wi3] Witten, E.: Non-abelian bosonization in two dimensions. Commun. Math. Phys. 92, 455–472 (1984)
[WN] de Wit, B., van Nieuwenhuizen, P.: Rigidly and locally supersymmetric two-dimensional nonlinear σ-models with torsion. Nucl. Phys. B 312, 58–94 (1989)
[WTN] de Wit, B., Tollstén, A.K., Nicolai, H.: Locally supersymmetric D = 3 non-linear sigma models. Nucl. Phys. B 392, 3–38 (1993)
[WZ] Wess, J., Zumino, B.: Supergauge transformations in four dimensions. Nucl. Phys. B 70, 39–50 (1974)
[Y] Yau, S.T.: On the Ricci curvature of a compact Kähler manifold and the complex Monge–Ampère equation I. Comm. Pure and Appl. Math. 31, 339–411 (1978)

Communicated by A. Connes
Commun. Math. Phys. 193, 595 – 626 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
On the Limit Distributions of Random Matrices with Independent or Free Entries

Øyvind Ryan*

Department of Mathematics, University of Oslo, P.O.B. 1053 Blindern, 0316 Oslo, Norway. E-mail: [email protected]

Received: 27 June 1997 / Accepted: 17 September 1997

* Sponsored by the Norwegian Research Council, Project nr. 100693/410

Abstract: We state limit distribution results for random matrices with independent or free entries, also addressing when we get freeness in the limit and semicircular and circular limits. The results generalize some already known results about asymptotic freeness of large random matrices, but our goal is to give these results a more optimal flavour. For matrices with identically distributed entries, we show that freeness in the limit is natural when we have free entries, but unnatural when the entries are independent, and restricted to the case of circular limits.

1. Introduction

The theory of random matrices has been extensively used for quite some time in connection with nuclear physics. On the mathematical side, the theory of limit distributions of random matrices has recently (see [18, 3]) found applications in the form of free group factors in von Neumann algebra theory. The clue to this application is the multimatrix version of Wigner’s semicircle law [20, 21], discovered by Voiculescu [18], which says that independent Gaussian random matrices in the limit become free semicircular random variables, free from sets of constant block diagonal matrices. Although Gaussian random matrices have been sufficient in obtaining the applications to free group factors, one can ask how general the random matrices may be if one is still to obtain the same limit distribution results. This question is for instance addressed in [5], where an optimal result was obtained for when one gets a semicircular limit distribution for the eigenvalues of certain random matrices. The task of this paper is to work towards a similar optimal-flavoured result for convergence in distribution, i.e. that of convergence of moments of the distributions. We will also consider the multimatrix problem, and we will be especially interested in how the limit distributions relate: When do we have asymptotic freeness in the limit, as in the application to free group factors? The author hopes this paper will shed some more light on certain results of [5] in the case of matrices with entries having moments of all orders. Further connections between this paper and the works of Girko [5] will possibly be pursued in a forthcoming paper.

We will start by looking at selfadjoint random matrices with (up to symmetry) independent entries, and prove a generalized version (Theorem 1) of the asymptotic freeness results for random matrices and constant block diagonal matrices appearing in [18, 3]. The proof is rather different from that in [18, 3], and hinges on a characterization of freeness due to Nica and Speicher [9, 10]. This new proof and the combinatorics appearing there are an essential part of the paper. In the proof, we keep at the same time track of how the moments of the entries can be allowed to grow as n → ∞ in order to get a limit distribution. There is a similar result, Theorem 2, without the symmetry condition on the matrices, i.e. all entries are assumed independent. In this direction we also find an “optimal” result, Theorem 3, for what growth conditions we can put on the moments of the entries in order to obtain convergence in distribution for random matrices with identically distributed entries. For what optimal should mean in this case, see Definition 22 and the comments before it. It can also be seen from this when one obtains free distributions in the limit, and that the circular limit distributions have a unique role in connection with this. This is a statement which will be made precise and proved in the forthcoming paper [12]. This shows that the results in [18, 3] and the matrices considered there are in a certain way representative for the possibilities in obtaining free limit distributions.
When working towards the “optimal” result in Theorem 3, we get a whole class of limit distribution laws, and we will show that the structure of this class is governed by a set of partitions, called clickable partitions, in the same way free random variables are governed by the set of noncrossing partitions, see [14]. We characterize these partitions at the end of the paper. There is an obstruction to attaining free random variables in the limit: roughly, the set of clickable partitions is somewhat larger than the set of noncrossing partitions. We also phrase a “symmetric version” of Theorem 3; this is Theorem 4. It will be clear from the context of symmetric random matrices how we can construct a well of limit distributions by using entries with infinitely divisible distributions in the matrices.

If one replaces the word independent with ∗-free, the situation becomes different. In this case there is a result, Theorem 5, similar to Theorem 3, but, vaguely speaking, the obstruction mentioned above has disappeared in the calculations. Freeness in the limit will be a consequence of this. Also, the R-transform of the joint limit ∗-distribution of the matrices can be described in terms of properties of the entries in a nice way, and one can see from this that in the limit we typically get (free) random variables giving rise to R-diagonal pairs (see Definition 6). This is perhaps the most striking result of the paper. R-diagonal pairs have been studied already in the literature [10]. For the limit distribution we obtain a nice interpretation of the cumulants and the noncrossing partitions in terms of the properties of the entries and matrix multiplication, when writing out the moments of the matrices as sums of products of the entries. We also show that in the symmetric version of Theorem 5 we can get all even, infinitely divisible probability measures with compact support as limits in a particularly nice way.
Random matrices with free entries have also been studied by Shlyakhtenko [13]. The conclusion is that asymptotic freeness of random matrices, at least when one has identically distributed entries, becomes natural when the entries are ∗-free. In the other direction, freeness in the limit is unnatural when the entries are independent, and is then in a certain way restricted to the free circular limit distributions.

2. Combinatorial Preliminaries

We will occupy ourselves with certain noncommutative probability spaces. A noncommutative probability space is a pair $(A, \phi)$, where A is a unital ∗-algebra and $\phi$ is a normalized (i.e. $\phi(1) = 1$) linear functional on A. The elements of A are called random variables. An important particular case is the case where $A = L = \bigcap_{1 \le p < \infty} L^p(\sigma)$ with $\sigma$ some probability measure on a measure space, i.e. the algebra of complex valued random variables having bounded moments of all orders. The state on L, given by integration with respect to $\sigma$, is denoted E (serving as $\phi$).

Definition 1. A family of unital ∗-subalgebras $(A_i)_{i \in I}$ will be called a free family if

$$\left. \begin{array}{l} a_j \in A_{i_j}, \\ i_1 \neq i_2,\ i_2 \neq i_3,\ \dots,\ i_{n-1} \neq i_n, \\ \phi(a_1) = \phi(a_2) = \cdots = \phi(a_n) = 0 \end{array} \right\} \;\Rightarrow\; \phi(a_1 \cdots a_n) = 0. \tag{1}$$

The family $(\{a_{11}, \dots, a_{1k_1}\}, \dots, \{a_{n1}, \dots, a_{nk_n}\})$ will be called a ∗-free family if the ∗-algebras $A_i = {*}\text{-alg}(a_{i1}, \dots, a_{ik_i})$ form a free family (we sometimes write free for ∗-free if the sets $\{a_{i1}, \dots, a_{ik_i}\}$ are selfadjoint).

Note that Definition 1 enables one to calculate mixed moments from individual moments. $\mathbb{C}\langle X_1, \dots, X_n \rangle$ will be the unital algebra of complex polynomials in n noncommuting variables. Unital complex linear functionals on $\mathbb{C}\langle X_1, \dots, X_n \rangle$ will be called distributions. The set of all distributions will be denoted $\Sigma_n$ (or simply $\Sigma_I$ for a general index set I). If $a_1, \dots, a_n$ are elements in some noncommutative probability space $(A, \phi)$, their joint distribution $\mu_{a_1, \dots, a_n} \in \Sigma_n$ is defined by having mixed moments $\mu_{a_1, \dots, a_n}(X_{i_1} \cdots X_{i_m}) = \phi(a_{i_1} \cdots a_{i_m})$.

Definition 2. We will say that random variables $(a_n(1), a_n(2), \dots) \subset (A_n, \phi_n)$ converge in ∗-distribution (to random variables $(a(1), a(2), \dots) \subset (A, \phi)$) if

$$\lim_{n \to \infty} \phi_n\bigl(a_n(i_1)^{g(1)} \cdots a_n(i_k)^{g(k)}\bigr) \ \text{exists} \ \bigl(\text{is equal to } \phi\bigl(a(i_1)^{g(1)} \cdots a(i_k)^{g(k)}\bigr)\bigr) \tag{2}$$

for all choices of $k, i_1, \dots, i_k$ and functions $g : \{1, \dots, k\} \to \{\cdot, *\}$. If this is the case, and the $(a(1), a(2), \dots)$ are ∗-free in $(A, \phi)$, we will say that $(a_n(1), a_n(2), \dots)_n$ is an asymptotically ∗-free family.

If $(a_n(1), a_n(2), \dots)$ converge in ∗-distribution as above and there is no mention of $(A, \phi)$ and $(a(1), a(2), \dots)$ in the limit, we will think of $a(i), a(i)^*$ as the random variables $X_i, X_i^*$ in $(\mathbb{C}\langle X_1, X_1^*, X_2, X_2^*, \dots \rangle, \mu)$; here $\mu$ is the unital linear functional defined by $\mu(X_{i_1}^{g(1)} \cdots X_{i_k}^{g(k)}) = \lim_{n \to \infty} \phi_n(a_n(i_1)^{g(1)} \cdots a_n(i_k)^{g(k)})$, i.e. the limit distribution is denoted by $\mu$.

Given a sequence $\{\{a_n(i, j; k)\}_{1 \le i,j \le n}\}_n$ of random variables from $(A, \phi)$, possibly subject to the symmetry condition $a_n(i, j; k) = a_n(j, i; k)^*$, we will consider the matrices $A_n(k) = \sum_{1 \le i,j \le n} a_n(i, j; k)\, e_n(i, j)$ in $(M_n(A), \phi_n)$. Here $\{e_n(i, j)\}_{i,j}$ is the canonical system of matrix units for $M_n(A)$, and $\phi_n$ is $\phi$ tensored with the normalized trace on the n × n-matrices. We will occupy ourselves with the limit distribution of the $A_n(k)$, together with sets of constant block diagonal matrices. This will be specified later. Mostly we will try to conclude the mere existence of a joint limit ∗-distribution or asymptotic ∗-freeness of such random matrices subject to a freeness or independence condition on the entries.

It turns out that all limit distributions we encounter naturally can be expressed in terms of cumulants, either the free cumulants or some cumulants coming from a different setting. To be able to express the joint limit ∗-distributions of our matrices, we will need the following definitions and results.

2.1. Preliminaries on noncrossing partitions and the R-transform. The set of all partitions of $\{1, \dots, m\}$ will be denoted P(m). A partition $\pi$ will have block structure $\{B_1, \dots, B_k\}$, $|\pi| = k$ will be the number of blocks and $|B_i|$ will denote the number of elements in each block. Also, $|\pi|_k$ will be the number of blocks of cardinality k. We will also write $B_i = \{v_{i1}, \dots, v_{i|B_i|}\}$, with the v’s written in increasing order, and write $i \sim j$ when i and j are in the same block. We will also write $i \sim_\pi j$ for this when we need to specify the partition itself. A partition will be called even if all blocks have even cardinality, and $P(m)_{\mathrm{even}}$ denotes the set of even partitions. The following class of partitions will be important:

Definition 3. A partition $\pi$ is called noncrossing if whenever we have $i < j < k < l$ with $i \sim k$, $j \sim l$ we also have $i \sim j \sim k \sim l$ (i.e. i, j, k, l are all in the same block). The set of all noncrossing partitions is denoted NC(n). We will also write $NC(n)_2$ for the noncrossing partitions with all blocks of cardinality two.
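Definition 3 can be explored on small cases with the following Python sketch, which enumerates all partitions of {1, ..., n}, filters the noncrossing ones by testing the condition of Definition 3 literally, and counts them. The function names are illustrative assumptions, and the brute-force enumeration is only meant for small n.

```python
from itertools import combinations


def set_partitions(n):
    """Yield all partitions of {1, ..., n} as lists of increasing blocks."""
    if n == 0:
        yield []
        return
    for p in set_partitions(n - 1):
        # put n into one of the existing blocks ...
        for idx in range(len(p)):
            yield p[:idx] + [p[idx] + [n]] + p[idx + 1:]
        # ... or into a new singleton block
        yield p + [[n]]


def is_noncrossing(p):
    """Definition 3: no i < j < k < l with i ~ k, j ~ l in two different blocks."""
    block_of = {x: idx for idx, blk in enumerate(p) for x in blk}
    for i, j, k, l in combinations(sorted(block_of), 4):
        if (block_of[i] == block_of[k] and block_of[j] == block_of[l]
                and block_of[i] != block_of[j]):
            return False
    return True


def noncrossing_partitions(n):
    """All elements of NC(n)."""
    return [p for p in set_partitions(n) if is_noncrossing(p)]


def noncrossing_pairings(n):
    """All elements of NC(n)_2 (every block has cardinality two)."""
    return [p for p in noncrossing_partitions(n) if all(len(b) == 2 for b in p)]


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, len(noncrossing_partitions(n)), len(noncrossing_pairings(n)))
```

For n = 4 the only partition of the 15 in P(4) that fails the test is {{1, 3}, {2, 4}}, which is why NC(4) has 14 elements.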
The condition $i < j < k < l$ could actually be taken in the general sense that i, j, k and l lie in clockwise order, or more precisely, $i < j < k < l < i$ on the circle when one identifies $\{1, \dots, n\}$ with points on the circle as one does in the circular representation of a partition. See Sect. 2.2 for the definition of the circular representation. This also gives meaning to the notion of successors in blocks, by addressing the next element of the block in the clockwise direction.

NC(n) becomes a lattice with the refinement order on the set of partitions, i.e. the partial ordering given by refinement of partitions. The maximal and minimal elements are denoted $1_n$ and $0_n$, which are the partitions with 1 block and n blocks, respectively. We will have use for the complementation map of Kreweras, a lattice anti-isomorphism $NC(n) \to NC(n)$. We denote it by K. It is usually defined in terms of a circular representation of the partition [6, 9]. In this paper we will not have use for K defined on the set of all noncrossing partitions, but rather on a smaller set of partitions, for which we will define it through a different circular representation. See Sect. 2.2 for this.

The multidimensional R-transform, also defined in [7], is an important transform $\Sigma_n \to \Theta_n$. Here $\Theta_n$ denotes the set of all power series with vanishing constant term in n noncommuting variables. In referring to the coefficients of a power series $f = \sum a_{i_1, \dots, i_m} z_{i_1} \cdots z_{i_m}$ we will write $[\mathrm{coef}(i_1, \dots, i_m)](f) = a_{i_1, \dots, i_m}$, and if $\pi = \{B_1, \dots, B_k\} \in P(m)$,

$$[\mathrm{coef}(i_1, \dots, i_m) | B_i](f) = a_{(i_j)_{j \in B_i}},$$
$$[\mathrm{coef}(i_1, \dots, i_m); \pi](f) = \prod_i [\mathrm{coef}(i_1, \dots, i_m) | B_i](f).$$

We will define the R-transform in the following way, which is not the way it was defined first in the literature. The characterization below can be derived from the real definition, given by Voiculescu [16] for single distributions (n = 1).

Definition 4. If $\mu \in \Sigma_n$ then $R(\mu) \in \Theta_n$ is the unique power series such that

$$\mu(X_{i_1} \cdots X_{i_m}) = \sum_{\pi \in NC(m)} [\mathrm{coef}(i_1, \dots, i_m); \pi](R(\mu)) \tag{3}$$
for all monomials $X_{i_1} \cdots X_{i_m}$.

One can show, by induction on m, that (3) provides a bijection from $\Sigma_n$ to $\Theta_n$. (3) is also called the moment-cumulant formula; the R-transform coefficients are sometimes referred to as cumulants. Note that the odd moments of $\mu_a$ are all zero if and only if all the odd R-transform coefficients are zero. Again, this is easy to show by induction from the moment-cumulant formula (3). Such random variables a are called even. Having the R-transform, one can define semicircular and circular random variables in the following way. Again, this is not the way these concepts were defined first, but they can be derived from the real definition.

Definition 5. A random variable a is called
1. (Centered) semicircular (of radius r > 0) if it is selfadjoint and its R-transform is given by $R(\mu_a)(z) = \frac{r^2}{4} z^2$. A semicircular family is a family of free semicircular random variables.
2. (Centered) circular (of radius r > 0) if the R-transform is given by $R(\mu_{a,a^*})(z, z^*) = \frac{r^2}{4} z z^* + \frac{r^2}{4} z^* z$.
3. A creation operator (on the full Fock space) if its R-transform is given by $R(\mu_{a,a^*})(z, z^*) = z^* z$.

Creation operators should really be defined through the vacuum expectation on the full Fock space, but we will not have use for this characterization. The quantity $\alpha = \frac{r^2}{4}$ in the above could also be called variance, since we assume centeredness (i.e. the first moment is zero). We will also need the following definition related to the R-transform:

Definition 6. ([10]) $\{a, b\}$ is called an R-diagonal pair if

$$R(\mu_{a,b})(z_1, z_2) = \sum_{k=1}^{\infty} b_k (z_1 z_2)^k + b_k (z_2 z_1)^k \tag{4}$$

for some sequence of complex numbers $\{b_k\}$. We will say that a random variable a gives rise to an R-diagonal pair if $\{a, a^*\}$ is an R-diagonal pair. The sequence $\{b_n\}_n$ is called the defining sequence (or determining series) of the R-diagonal pair.

The simplest example of a random variable giving rise to an R-diagonal pair is the circular random variable, as can easily be seen from 2 of Definition 5. The R-series is sometimes used in connection with the moment series of a distribution:
Definition 7. The moment series of the distribution $\mu \in \Sigma_n$ is the power series $M(\mu)$ in $\Theta_n$ given by

$$M(\mu)(z_1, \dots, z_n) = \sum_{m \ge 1} \sum_{i_1, \dots, i_m} \mu(X_{i_1} \cdots X_{i_m})\, z_{i_1} \cdots z_{i_m}.$$

The Kreweras complementation map enters the picture due to a convolution product which is an important tool in recognizing freeness of random variables:

Definition 8. The boxed convolution $f \mathbin{\boxed{\star}} g$ (see [8, 9, 10]) of two power series f and g is the power series defined by

$$[\mathrm{coef}(i_1, \dots, i_m)](f \mathbin{\boxed{\star}} g) = \sum_{\pi \in NC(m)} [\mathrm{coef}(i_1, \dots, i_m); \pi](f)\, [\mathrm{coef}(i_1, \dots, i_m); K(\pi)](g). \tag{5}$$

With this definition at hand, we can state the characterization of freeness which we will use [9]:

Lemma 9. If $\phi$ is a trace, the following are equivalent:
1. $\{a_1, \dots, a_n\}$ and the unital algebra D are free in $(A, \phi)$.
2. $\phi(a_{i_1} d_1 \cdots a_{i_k} d_k) = [\mathrm{coef}(1, \dots, k)]\bigl(R(\mu_{a_{i_1}, \dots, a_{i_k}}) \mathbin{\boxed{\star}} M(\mu_{d_1, \dots, d_k})\bigr)$ for all choices of k, $1 \le i_1, \dots, i_k \le n$ and $d_1, \dots, d_k \in D$.

This characterization of freeness can be formulated also without referring to the coefficients of the series (see [10, 11]), but it is the coefficients themselves which will appear naturally in our calculations. This says nothing about mutual freeness of the $a_i$. For this we will use the following “no mixed terms” characterization of freeness [7]:

Lemma 10. The following are equivalent:
1. $(\{a_{1,1}, \dots, a_{1,m_1}\}, \dots, \{a_{n,1}, \dots, a_{n,m_n}\})$ is a free family in $(A, \phi)$.
2. The coefficient of $z_{i_1,j_1} \cdots z_{i_k,j_k}$ in

$$R(\mu_{a_{1,1}, \dots, a_{1,m_1}, \dots, a_{n,1}, \dots, a_{n,m_n}})(z_{1,1}, \dots, z_{1,m_1}, \dots, z_{n,1}, \dots, z_{n,m_n}) \tag{6}$$

vanishes whenever we don’t have $i_1 = i_2 = \cdots = i_k$.

When referring to monomials in $\mathbb{C}\langle X_1, \dots, X_n \rangle$ we will use the following terminology:

Definition 11. The signed partition $\sigma$ of a monomial $X_{i_1} \cdots X_{i_m}$ is the partition obtained by saying that j and k are in the same block, say $\sigma_l$, if and only if $i_j = i_k = l$. $\sigma$ gives rise to the sign map, defined by $\sigma(k) = i_k$ for $k \in \{1, \dots, m\}$. If $\pi = \{A_1, \dots, A_h\} \le \sigma$ we will also write $\sigma(A_i) = r$ if $\sigma(k) = r$ for all $k \in A_i$.

Note from Lemma 10 that ∗-freeness of $(a_1, a_2, \dots)$, which is the same as freeness of $(\{a_1, a_1^*\}, \{a_2, a_2^*\}, \dots)$, is the same as having only to sum over $\pi \le \sigma$ in (3), i.e.

$$\phi\bigl(a_{i_1}^{g(1)} \cdots a_{i_m}^{g(m)}\bigr) = \sum_{\substack{\pi \in NC(m) \\ \pi \le \sigma}} [\mathrm{coef}((i_1, g(1)), \dots, (i_m, g(m))); \pi]\, R(\mu_{a_1, a_1^*, a_2, a_2^*, \dots}) \tag{7}$$

for all $m, i_1, \dots, i_m, g(1), \dots, g(m)$ with $\sigma$ the signed partition of the monomial $a_{i_1} \cdots a_{i_m}$ as in Definition 11.
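The boxed convolution of Definition 8 can be tried out in the one-variable case with the following Python sketch. It is written under explicit assumptions: the Kreweras complement is computed through the standard permutation description $K(\pi) = P_\pi^{-1} \circ c$ (with $P_\pi$ the permutation whose cycles are the blocks of $\pi$ in increasing order and $c = (1\, 2 \cdots n)$), which is a well-known equivalent of the circular definition rather than the construction used in this paper, and power series are stored as dictionaries mapping a degree to its coefficient. All function names are illustrative.

```python
from itertools import combinations


def set_partitions(n):
    """Yield all partitions of {1, ..., n} as lists of increasing blocks."""
    if n == 0:
        yield []
        return
    for p in set_partitions(n - 1):
        for idx in range(len(p)):
            yield p[:idx] + [p[idx] + [n]] + p[idx + 1:]
        yield p + [[n]]


def is_noncrossing(p):
    """True if no two distinct blocks contain i < j < k < l with i ~ k and j ~ l."""
    for b1, b2 in combinations(p, 2):
        for i, k in combinations(b1, 2):
            for j, l in combinations(b2, 2):
                if i < j < k < l or j < i < l < k:
                    return False
    return True


def nc(n):
    """All noncrossing partitions of {1, ..., n}."""
    return [p for p in set_partitions(n) if is_noncrossing(p)]


def kreweras(p, n):
    """Kreweras complement via permutations: i -> P^{-1}(c(i)), c = (1 2 ... n)."""
    perm = {}
    for blk in p:
        for idx, x in enumerate(blk):
            perm[x] = blk[(idx + 1) % len(blk)]  # successor inside the block
    inv = {v: u for u, v in perm.items()}
    comp = {i: inv[i % n + 1] for i in range(1, n + 1)}
    seen, blocks = set(), []
    for start in range(1, n + 1):
        blk, x = [], start
        while x not in seen:
            seen.add(x)
            blk.append(x)
            x = comp[x]
        if blk:
            blocks.append(sorted(blk))
    return blocks


def boxed_coef(f, g, m):
    """Coefficient of z^m in the one-variable boxed convolution, following (5)."""
    total = 0
    for p in nc(m):
        term = 1
        for blk in p:
            term *= f.get(len(blk), 0)
        for blk in kreweras(p, m):
            term *= g.get(len(blk), 0)
        total += term
    return total
```

As a usage example, taking f to be the R-transform $z^2$ of a semicircular variable of variance 1 and g the series with all coefficients equal to 1 reduces (5) to the moment-cumulant formula (3), so `boxed_coef({2: 1}, {m: 1 for m in range(1, 7)}, 4)` counts the noncrossing pairings of four points.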
Using (7), we will need the following combinatorial descriptions of the distributions appearing in Definitions 5 and 6, as this is the form in which they will appear in the limit distributions of our matrices. Roughly speaking, the partitions π appear as ways to identify dependent (or non-free) entries from the matrices when they are multiplied, while the cumulants appear as (scaled) moments of the individual entries.

1. The R-transform of a semicircular family $(a_i)_i$ is (due to Lemma 10) $\sum_k \alpha_k z_k^2$ with $\alpha_k$ their variances. This means that due to (7), with σ meaning the same,
\[ \phi(a_{i_1}\cdots a_{i_m}) = \sum_{\substack{\pi\in NC(m)_2\\ \pi\le\sigma}} [\mathrm{coef}(i_1,\dots,i_m);\pi]\,R(\mu_{a_1,a_2,\dots}) = \sum_{\substack{\pi\in NC(m)_2,\ \pi\le\sigma\\ \pi=\{A_1,\dots,A_h\}}} \prod_{i=1}^{h} \alpha_{\sigma(A_i)}. \tag{8} \]

2. If $(a_i)_i$ is a circular family with variances $\alpha_k$, the R-transform is $\sum_k \alpha_k(z_k z_k^* + z_k^* z_k)$, and we have
\[ \phi\bigl(a_{i_1}^{g(1)}\cdots a_{i_m}^{g(m)}\bigr) = \sum_{\substack{\pi\in NC(m)\\ \pi\le\sigma}} [\mathrm{coef}((i_1,g(1)),\dots,(i_m,g(m)));\pi]\,R(\mu_{a_1,a_1^*,a_2,a_2^*,\dots}) = \sum_{\substack{\pi\in NC(m)_2,\ \pi\le\sigma\\ \pi=\{A_1,\dots,A_h\}=\{\{v_{i1},v_{i2}\}\}_i\\ (g(v_{i1}),g(v_{i2}))=(\cdot,*)\text{ or }(*,\cdot)}} \prod_{i} \alpha_{\sigma(A_i)}. \tag{9} \]

3. To make the picture complete for Definition 5, for ∗-free creation operators $(a_i)_i$ we have
\[ \phi\bigl(a_{i_1}^{g(1)}\cdots a_{i_m}^{g(m)}\bigr) = \sum_{\substack{\pi\in NC(m)_2,\ \pi\le\sigma\\ \pi=\{\{v_{i1},v_{i2}\}\}_i\\ (g(v_{i1}),g(v_{i2}))=(*,\cdot)}} 1, \tag{10} \]
and it is not difficult to see that only one π can appear in the sum. This formula was used by Shlyakhtenko in his paper [13].

Finally, if $\{a_k, a_k^*\}$ are free R-diagonal pairs with defining sequences $\{\alpha_{k,2m}\}_{m\ge1}$, then we have, due to the alternation of z and $z^*$ in their R-series,
\[ \phi\bigl(a_{i_1}^{g(1)}\cdots a_{i_m}^{g(m)}\bigr) = \sum_{\substack{\pi\in NC(m)_{\mathrm{even}},\ \pi\le\sigma\\ \pi=\{A_1,\dots,A_h\}=\{\{v_{ij}\}_j\}_i\\ (g(v_{i1}),g(v_{i2}),\dots)=(\cdot,*,\cdot,*,\dots)\text{ or }(*,\cdot,*,\cdot,\dots)}} \prod_{i} \alpha_{\sigma(A_i),|A_i|}. \tag{11} \]
The expression on the right will also come out of our calculations.

2.2. Preliminaries on Oriented Partitions. We will use the following circular representation of a partition, which is suited for what we will call oriented partitions. All elements i of {1, ..., n} will be represented as edges in the inscribed n-gon of the circle, the labelling of the edges being done clockwise. Each block of the partition will be the corresponding assembly of edges in the n-gon; we indicate this by drawing connecting lines between the midpoints of successive edges in the blocks, see Fig. 1. The notion of successors is the same as before.
Fig. 1. The circular representation of an oriented partition
Definition 12. A partition π ∈ P(n), where each block of π has an equivalence relation with at most two equivalence classes, is called an oriented partition. Each such equivalence class is called an orientation class, and the assembly of all orientation classes is called an orientation. In the circular representation of the partition, the orientation class on each block will be described by a direction for each edge, symbolized by an arrow pointing clockwise or anti-clockwise. Orientation classes with arrows in the clockwise direction will also be called positive and denoted by +; those with arrows in the anti-clockwise direction are called negative and denoted by −. The set of all oriented partitions will be denoted OP(n). This concept will be important to the combinatorial calculations in our matrix multiplications.

Definition 13. We equip the set of oriented partitions with a partial ordering by saying that $\pi_1 \le \pi_2$ if both $\pi_1 \le \pi_2$ as ordinary partitions, and any orientation class of $\pi_1$ is contained in some orientation class of $\pi_2$. The positive and negative orientation classes of a block B are denoted $B^+$ and $B^-$, their cardinalities $|B^+|$, $|B^-|$.

Note that the signed partition $\sigma = \{\sigma_1,\dots,\sigma_r\}$ of the ∗-monomial $X_{i_1}^{g(1)}\cdots X_{i_m}^{g(m)}$ can be viewed as an oriented partition by saying that $k \in \sigma_j^+$ (the positive orientation class of $\sigma_j$) if and only if $i_k = j$ and g(k) = · ($\sigma_j^-$ defined similarly with g(k) = ∗ instead). We will follow this convention when we assign positive or negative sign to an orientation class.
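The ordering of Definition 13 and the oriented version of the signed partition can be sketched in a few lines of Python. This is only an illustration under assumed conventions: an oriented partition is represented as a list of blocks, each block being a list of at most two sets (its orientation classes), g is a list of the characters '.' and '*', and the function names are hypothetical.

```python
def signed_partition(indices, g):
    """Oriented signed partition of X_{i_1}^{g(1)} ... X_{i_m}^{g(m)}.

    Block j collects the positions k with i_k = j; its positive class
    holds those with g(k) = '.', its negative class those with g(k) = '*'.
    """
    blocks = {}
    for k, (i, star) in enumerate(zip(indices, g), start=1):
        plus, minus = blocks.setdefault(i, (set(), set()))
        (plus if star == '.' else minus).add(k)
    # drop empty orientation classes
    return [[c for c in classes if c] for classes in blocks.values()]

def oriented_leq(pi1, pi2):
    """pi1 <= pi2: blocks refine blocks, and classes refine classes."""
    blocks2 = [set().union(*classes) for classes in pi2]
    classes2 = [set(c) for classes in pi2 for c in classes]
    for classes in pi1:
        block = set().union(*classes)
        if not any(block <= b for b in blocks2):
            return False  # not a refinement as ordinary partitions
        if not all(any(set(c) <= c2 for c2 in classes2) for c in classes):
            return False  # some orientation class is split by pi2
    return True

sigma = signed_partition([1, 1, 2, 2], ['.', '*', '.', '*'])
```

Here `sigma` has the blocks {1, 2} and {3, 4}, each split into a positive and a negative class. A partition whose block {1, 2} carries a single orientation class is below σ as an ordinary partition but not as an oriented one.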
Definition 14. The quotient graph π of an oriented partition π ∈ OP(n) is the graph that appears when the set of edges {1, ..., n} in the circular representation are ’glued together’ with the other edges in the same block, so that they are glued with arrows pointing in the same direction. The block structure of {1, ..., n}, the connecting vertices between the edges, that appears when we do the identifications of the edges, will produce a partition, denoted $K'(\pi)$, its blocks consisting of vertices that are identified (we do not give $K'(\pi)$ an orientation).

As an example, let us apply $K'$ to the partition $\pi = \{B_1, B_2\} = \{\{1, 2, 5, 6\}, \{3, 4, 7, 8\}\}$ with orientation classes given by $B_1^+ = \{1, 5\}$, $B_2^+ = \{3, 7\}$. Doing the identifications of edges, we obtain the picture in Fig. 2 for the quotient graph. Keeping track of the corresponding identifications of edges, we obtain $K'(\pi) = \{\{1, 5\}, \{3, 7\}, \{2, 4, 6, 8\}\}$. We will use this partition as a main example throughout the paper, as it is the simplest partition which has a crossing but is also clickable, see Definition 17 below.

Fig. 2. Doing the identifications of edges of π
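The gluing in Definition 14 can be carried out mechanically with a union-find structure on the vertices. The sketch below is an illustration under assumed conventions rather than the paper's own procedure: edge i is taken to join vertex i − 1 to vertex i (vertices mod n, so vertex 0 is vertex n), a block is given as a dictionary mapping each of its edges to +1 or −1, and the names `k_prime` and `is_clickable` are hypothetical. Edges with the same orientation are glued tail to tail and head to head, edges with opposite orientation tail to head.

```python
def k_prime(n, blocks):
    """Vertex partition K'(pi) of an oriented partition pi of {1, ..., n}.

    `blocks` is a list of dicts mapping each edge of a block to +1 or -1.
    Edge i joins vertex i-1 to vertex i, vertices being taken mod n.
    """
    parent = list(range(n))

    def find(v):
        v %= n
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    def union(u, v):
        parent[find(u)] = find(v)

    for block in blocks:
        edges = sorted(block)
        ref = edges[0]  # gluing every edge to one reference edge suffices
        for e in edges[1:]:
            if block[e] == block[ref]:   # same orientation
                union(ref - 1, e - 1)
                union(ref, e)
            else:                        # opposite orientation
                union(ref, e - 1)
                union(ref - 1, e)

    classes = {}
    for v in range(1, n + 1):
        classes.setdefault(find(v), []).append(v)
    return sorted(classes.values())


def is_clickable(n, blocks):
    """Quotient graph has |pi| + 1 vertices (it is then a tree)."""
    return len(k_prime(n, blocks)) == len(blocks) + 1


# The main example: B1 = {1, 2, 5, 6}, B2 = {3, 4, 7, 8}, B1+ = {1, 5}, B2+ = {3, 7}.
fig2 = [{1: 1, 2: -1, 5: 1, 6: -1}, {3: 1, 4: -1, 7: 1, 8: -1}]
print(k_prime(8, fig2))       # [[1, 5], [2, 4, 6, 8], [3, 7]]
print(is_clickable(8, fig2))  # True
```

The output reproduces the partition {{1, 5}, {3, 7}, {2, 4, 6, 8}} stated in the example. Gluing each edge only to one reference edge of its block is a shortcut; it generates the same equivalence relation as gluing all pairs.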
Remark. This representation of a partition was used (implicitly) by Dykema in his paper [3]. The map $K'$ is not to be confused with the complementation map of Kreweras, denoted K. They are denoted by the same letter since they coincide in the important special case of partitions from $NC(n)_2$ given a certain orientation, which is the class of partitions we will encounter most often. More precisely:

Lemma 15. The following hold:
1. $K'(\pi) = K(\pi)$ for any $\pi \in NC(m)_2$ so that the two members of any block have opposite orientation,
2. $K'(\pi) > K(\pi)$ for any noncrossing $\pi \notin NC(m)_2$ so that two successors in a block always have opposite orientation,
3. $K'(\pi') \ge K'(\pi)$ for any oriented partitions $\pi' \ge \pi$. $K'$ is not injective.

Proof. 1 is not too hard to infer from the circular definition of K, in which {1, ..., n} are vertices on the circle, and {1, ..., n} are the midpoints on the circular arcs connecting these vertices, connecting lines drawn as before. 2 will also just be partially explained: Roughly, this is due to the fact that the mapping $K'$ produces many more identifications of vertices than the mapping K does. This follows from the connection between the two circular representations. When we identify
two successive edges which have opposite orientation, we usually get two identifications of vertices. For instance, identifying the edges 1 and 3 in Fig. 1 leads to an identification of the vertices 1 and 2, and also the vertices 3 and 8. In the definition of K applied to the same partition one would only get one identification (1 and 2). This explains why $K'$ in general produces more identifications, hence $K'(\pi) > K(\pi)$ when orientations are chosen as indicated in 2. 3: It is not too hard to find different π' and π so that $K'(\pi') = K'(\pi)$. So $K'$ is not injective like K is. The order preserving property in 3 follows directly from the definition of $K'$, since larger blocks in a partition give rise to more identifications of vertices.

1 above may sound strange, as K is order reversing and $K'$ is order preserving. But note that no two different partitions from $NC(m)_2$ are comparable with respect to the refinement order. Throughout the paper we will mostly concentrate on (oriented) noncrossing partitions which have alternating orientations within their blocks as in 2. The following might be taken as an alternative definition of $K'$:

Corollary 16 (of definition). The partition $K'(\pi)$ is the equivalence relation on {1, ..., n} generated by the relations (running through all pairs of edges (i, j) in the same block):
\[ \begin{aligned} i \sim_\pi j \text{ with opposite orientation} &\Rightarrow\ i \sim_{K'(\pi)} j-1,\quad i-1 \sim_{K'(\pi)} j,\\ i \sim_\pi j \text{ with same orientation} &\Rightarrow\ i-1 \sim_{K'(\pi)} j-1,\quad i \sim_{K'(\pi)} j \end{aligned} \tag{12} \]
(here numbers are taken mod n).

Actually, it is not too hard to see that if $k \sim_{K'(\pi)} l$, then there exist bordering edges r, s of k, l, respectively (the bordering edges of k are k and k + 1) with $r \sim_\pi s$ so that k and l get identified when r and s get identified. This is a direct consequence of transitivity of $\sim_\pi$. The following set of partitions will appear in the combinatorics when we describe the limit distributions of our random matrices. Note that the number of vertices in the quotient graph, $|K'(\pi)|$, is at most |π| + 1, as |π| is the number of edges in the graph. Equality can hold only if the quotient graph is a tree (see also Lemma 32).

Definition 17. An oriented partition π ∈ OP(n) is said to be clickable if the number of vertices in the quotient graph π (which is $|K'(\pi)|$) is |π| + 1. The set of all such is denoted C(n).

As an example, the partition π of Fig. 2 is clickable, since its quotient graph is seen to be a tree. Thus, being clickable and being noncrossing are two different concepts. The name clickable stems from [3], where a click was defined as a certain identification of edges, namely two edges lying next to each other being identified with opposite orientation. The importance of the clickable partitions here is due to the fact that we will run into calculations where a degree of freedom is assigned to each vertex in the quotient graph of the partition, and it is only partitions with enough such degrees of freedom (i.e. the clickable ones) which can give a contribution in the limit of our calculations. The clickable partitions do not become a lattice under the refinement order. We will describe the structure of these partitions at the end of the paper. There we will also prove a fact we will use, namely that $C(n)_2 = NC(n)_2$ when orientation is chosen as in 1 of Lemma 15. Actually, we will show that any clickable π has alternating orientation
within its blocks as in 2 of Lemma 15, and that all even noncrossing partitions become clickable with this choice of orientation. Note that if we in the index of the summand in (9) (or in (11)) assign the orientation to $\pi \in NC(m)_2$ (or $\pi \in NC(m)_{\mathrm{even}}$) by letting $v_{ij}$ have positive orientation if $g(v_{ij}) = \cdot$ and negative orientation if $g(v_{ij}) = *$, we get that we sum over all partitions $\pi \in C(m)_2$ with π ≤ σ (or $\pi \in NC(m)_{\mathrm{even}}\ (\subset C(m))$ with π ≤ σ).

3. Random Matrices with Independent Entries

We shall need the terminology about oriented partitions when we look at limit distributions of random matrices $A_n(1), A_n(2), \dots$ as in the section on combinatorial preliminaries, i.e. $A_n(r) = \sum_{i,j} a_n(i,j;r)e_n(i,j)$ with the $e_n(i,j)$ the canonical system of matrix units. We will first do the computations for selfadjoint (i.e. $a_n(i,j;r) = \overline{a_n(j,i;r)}$) random matrices with independent entries from L satisfying the following criteria ($\alpha_1, \alpha_2, \dots$ are arbitrary positive numbers):

1. All entries $a_n(i,j;k)$ satisfy $E(|a_n(i,j;k)|^2) = \alpha_k/n$.
2. All $E(a_n(i,j;k)) = 0$, and $\sup_{i,j} E(|a_n(i,j;k)|^m) = o(n^{-1})$ for m > 2 and every k.    (13)
3. All $\{a_n(i,j;k)\}_{k,\,1\le i\le j\le n}$ are independent.

This will be referred to as the symmetric case of our random matrices. We will also replace condition 3 with

3’. All $\{a_n(i,j;k)\}_{k,\,1\le i,j\le n}$ are independent,

so that no symmetry condition is imposed on our matrices. This will be referred to as the nonsymmetric case. The conditions above are analogous to Dykema’s random matrix conditions in [3], but are somewhat weaker. As the paper proceeds, we will try to make the conditions we need on the entries as weak as possible. We will obtain a certain optimal result, Theorem 3, in this direction. In the above, $o(n^\alpha)$ denotes any sequence $\gamma = \{\gamma_n\}_n$ such that $\lim_{n\to\infty} \gamma_n n^{-\alpha} = 0$. Of course, we have that $o(n^\alpha)o(n^\beta) = o(n^{\alpha+\beta})$, and also that γ is $o(n^\alpha)$ ⇒ γ is $o(n^\beta)$ whenever α < β. A sequence converges to 0 if and only if it is $o(n^0) = o(1)$.
α = −1 turns out to be a “critical value” for the existence of a limit distribution, and is thus related to our optimal result. In addition to random matrices as above, consider constant block diagonal matrices (with n a multiple of N)
\[ D_n(t) = \sum_{\substack{0\le b\le \frac{n}{N}-1\\ 1\le i,j\le N}} d_n(Nb+i,Nb+j;t)\,e(Nb+i,Nb+j;n), \]
for t in some index set T. We will consider N as a fixed integer throughout the paper. Thus N specifies the size of the block structure in $D_n(t)$.

Definition 18. We call the set $\{D_n(t)\}_{t\in T}$ as above (with n running through multiples of N) a set of constant block diagonal matrices if
1. $D_n(t)$ has a limit distribution as n → ∞ for any t ∈ T.
2. $\sup_{i,j,n} |d_n(i,j;t)| < \infty$ for any t ∈ T.
3. For any $t_1, t_2 \in T$ there exists $t_3 \in T$ such that $D_n(t_1)D_n(t_2) = D_n(t_3)$ for all n.

We have that the term $n\phi_n(A_n(i_1)\cdots A_n(i_m))$, or more generally the term $n\phi_n(A_n(i_1)^{g(1)}\cdots A_n(i_m)^{g(m)})$ in the nonsymmetric case, is a sum of terms of the form
\[ E\bigl(a_n(j_m,j_1;i_1)\,a_n(j_1,j_2;i_2)\cdots a_n(j_{m-2},j_{m-1};i_{m-1})\,a_n(j_{m-1},j_m;i_m)\bigr), \tag{14} \]
with $j_1,\dots,j_m \in \{1,\dots,n\}$, $i_k \in I$. The following definition is crucial, and explains the connection between random matrix multiplication and oriented partitions.

Definition 19. For a term as in (14) we will associate the oriented partition π of {1, ..., m} given by dependence of the random variables involved, i.e. $k \sim_\pi l$ if and only if the k-th and l-th factor, $a_n(j_{k-1},j_k;i_k)$ and $a_n(j_{l-1},j_l;i_l)$, are not independent due to the conditions imposed. In particular, if $k \sim_\pi l$ we must have that $i_k = i_l$ due to 3 of (13), i.e. π ≤ σ with σ the signed partition of $X_{i_1}\cdots X_{i_m}$. The orientation of π is defined in the following way in the nonsymmetric case (i.e. 1, 2 and 3’ hold): k has positive orientation if g(k) above equals ·, negative orientation if g(k) = ∗ (i.e. the k-th factor comes from some adjoint matrix). In the symmetric case (i.e. 1, 2 and 3 hold), the orientation of π is defined by saying that k and l have the same orientation if and only if the corresponding random variables are placed in the same position in the matrices. Opposite orientation is the case if the two entries are placed symmetrically about the diagonal. We say that the $j_1,\dots,j_m$ and $i_1,\dots,i_m$ give rise to π if π is defined in this way, or simply that the j’s give rise to π when no confusion can arise (i.e. when σ is known).

In the symmetric case, same orientation thus means $i_k = i_l$ and $j_{k-1} = j_{l-1}$, $j_k = j_l$, while opposite orientation means $i_k = i_l$ and $j_{k-1} = j_l$, $j_{l-1} = j_k$ with $j_k \ne j_l$.
$k \sim_\pi l$ with same orientation in the nonsymmetric case means that $i_k = i_l$ with g(k) = g(l), and $j_{k-1} = j_{l-1}$, $j_k = j_l$. Opposite orientation means $i_k = i_l$ with g(k) ≠ g(l), and $j_{k-1} = j_l$, $j_{l-1} = j_k$. Note that π ≤ σ as oriented partitions also, σ meaning the oriented signed partition of $X_{i_1}^{g(1)}\cdots X_{i_m}^{g(m)}$ as defined before.

The following lemma will be useful at certain instances in some proofs later on. Let us fix σ, i.e. we only look at terms as in (14) coming from the same, given mixed moment.

Lemma 20. In the nonsymmetric case with some π ≤ σ given, the following are equivalent:
1. $j_1,\dots,j_m$ give rise to some π' with π ≤ π' ≤ σ,
2. $j_k = j_l$ whenever $k \sim_{K'(\pi)} l$.

Proof. We show 2 ⇒ 1 first. Assume that $r \sim_\pi s$. We need to show that $r \sim_{\pi'} s$ also, with π' the partition the j’s give rise to. If r and s have equal orientation, we have that $r-1 \sim_{K'(\pi)} s-1$ and $r \sim_{K'(\pi)} s$ from the definition of $K'$. But then $j_{r-1} = j_{s-1}$ and $j_r = j_s$. As the $j_1,\dots,j_m$ give rise to π' and π' ≤ σ, it follows that $r \sim_{\pi'} s$, also with equal orientation. So π' ≥ π, which was what we had to show. The case with opposite orientation follows in the same fashion. 1 ⇒ 2: Assume $k \sim_{K'(\pi)} l$. From the comments following the definition of $K'(\pi)$, it follows that there exist edges r and s with $r \sim_\pi s$, bordering the vertices k and l, so that k and l get identified when r and s are glued together. As π' ≥ π by assumption, the same identifications of k and l happen when gluing edges of π'. We must then have that $j_k = j_l$ as the j’s give rise to π'. This finishes the proof.
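Lemma 20 can be checked by brute force for small n. The sketch below is a sanity check under assumed conventions, not part of the proof: all factors in a block of π are assumed to carry the same matrix index $i_k$ (so that π ≤ σ), the orientation is read off from g (a list of '.' and '*'), the k-th factor is the $(j_{k-1}, j_k)$ entry of $A^{g(k)}$ with $j_0 = j_m$, and an adjoint factor is taken to depend on the transposed position. The function names are hypothetical. The number of index tuples giving rise to some π' ≥ π should then equal n raised to the number of blocks of $K'(\pi)$.

```python
from itertools import product

def k_prime_size(m, blocks, g):
    """Number of blocks of K'(pi); edge k joins vertex k-1 to vertex k (mod m)."""
    parent = list(range(m))

    def find(v):
        v %= m
        while parent[v] != v:
            v = parent[v]
        return v

    for block in blocks:
        ref = block[0]
        for e in block[1:]:
            if g[e - 1] == g[ref - 1]:   # same orientation
                pairs = [(ref - 1, e - 1), (ref, e)]
            else:                        # opposite orientation
                pairs = [(ref, e - 1), (ref - 1, e)]
            for u, v in pairs:
                parent[find(u)] = find(v)
    return len({find(v) for v in range(m)})

def count_dominating(n, blocks, g):
    """Count tuples j in {0..n-1}^m giving rise to some pi' >= pi."""
    m = len(g)

    def position(j, k):
        a, b = j[(k - 2) % m], j[k - 1]  # (j_{k-1}, j_k), with j_0 = j_m
        # an adjoint factor uses the transposed matrix position
        return (a, b) if g[k - 1] == '.' else (b, a)

    return sum(
        all(len({position(j, k) for k in block}) == 1 for block in blocks)
        for j in product(range(n), repeat=m)
    )

pairing = [[1, 2], [3, 4]]
g4 = ['.', '*', '.', '*']
print(count_dominating(3, pairing, g4), 3 ** k_prime_size(4, pairing, g4))  # 27 27
```

For the noncrossing pairing {{1, 2}, {3, 4}} with alternating orientation the only constraint is $j_2 = j_4$, which leaves three free indices and hence $3^3 = 27$ tuples for n = 3.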
From Lemma 20 one can prove the following:

Corollary 21. The following hold:
1. Exactly $n^{|K'(\pi)|}$ choices of j’s give rise to some π' with π ≤ π' ≤ σ.
2. The number of choices of $j_1,\dots,j_m$ giving rise to π is either 0 (this can occur), or given by a polynomial in n of degree $|K'(\pi)|$ with leading coefficient 1, the other coefficients being integers not depending on n.
3. $K'$ is injective on the set of partitions ≤ σ which are given rise to by some choice of j’s.

Proof. 1 is a direct consequence of Lemma 20. 2: It is not too hard to convince oneself of the fact that some partitions π may not be given rise to by any choice of j’s. But if π is given rise to, one can use the recursive formula
\[ \#(j_i \text{ giving rise to } \pi) = n^{|K'(\pi)|} - \sum_{\substack{\pi'\\ \pi<\pi'\le\sigma}} \#(j_i \text{ giving rise to } \pi'). \tag{15} \]
If we had shown the statement for π' with $|K'(\pi')| < |K'(\pi)|$, then the result follows from induction using this formula, when we also use part 3 proved below, and also noting that π' = σ gives a contribution $n^{|K'(\sigma)|}$. 3: Note first that if $j_1,\dots,j_m$ give rise to π and $p_1,\dots,p_m$ give rise to π', then $j_k = j_l$ whenever $k \sim_{K'(\pi)} l$, and $p_k = p_l$ whenever $k \sim_{K'(\pi')} l$. If $K'(\pi) = K'(\pi')$, then we can use Lemma 20 twice to obtain both π ≤ π' ≤ σ and π' ≤ π ≤ σ, hence π = π', proving that $K'$ is injective on this set. A consequence of this which we used when proving 2 was that $|K'(\pi')| < |K'(\pi)|$ whenever π' > π (recall that we know from Lemma 15 that π' ≥ π ⇒ $K'(\pi') \ge K'(\pi)$).

When using the recursive formula (15), we see that the coefficients in the polynomial in 2 must be related to multi-chains of partitions in a certain way. We will not go into the symmetric case of Lemma 20 and Corollary 21, even if the first theorem below deals with symmetric matrices. We will go through its proof in detail, and then discuss the generalizations (Theorems 2, 3, 4 and 5) which follow.

Theorem 1. Under the conditions 1, 2 and 3 of (13) and Definition 18 we get that $(A_n(1), A_n(2), \dots, \{D_n(t)\}_t)$ is an asymptotically free family as n → ∞ (through multiples of N). Moreover, the $A_n(k)$ converge in distribution to centered semicircular random variables of variance $\alpha_k$.

Proof. Let σ be the signed partition of the monomial $X_{i_1}\cdots X_{i_m}$. We write
\[ \phi_n\bigl(A_n(i_1)D_n(t_1)\cdots A_n(i_m)D_n(t_m)\bigr) = \sum_{\substack{\pi\le\sigma,\\ \pi\in OP(m)}}\ \sum_{\substack{j_1,\dots,j_m,\\ k_1,\dots,k_m\\ \text{giving } \pi}} \frac{1}{n}\,E\bigl(a_n(j_1,k_1;i_1)d_n(k_1,j_2;t_1)\cdots a_n(j_m,k_m;i_m)d_n(k_m,j_1;t_m)\bigr), \tag{16} \]
where the oriented partition π is defined by the term involved as in Definition 19, with the obvious meaning when additional matrix entries $d_n(k,j;t)$ are put in between. If
$\pi = \{A_1,\dots,A_h\} = \{\{w_{ij}\}_j\}_i$ and $K'(\pi) = \{B_1,\dots,B_k\} = \{\{v_{ij}\}_j\}_i$ we get from independence that this equals
\[ \sum_{\substack{\pi\le\sigma,\\ \pi\in OP(m)}}\ \sum_{\substack{j_1,\dots,j_m,\\ k_1,\dots,k_m\\ \text{giving } \pi}} \frac{1}{n}\prod_{i=1}^{h} E\Bigl(\prod_{r=1}^{|A_i|} a_n(j_{w_{ir}},k_{w_{ir}};i_{w_{ir}})\Bigr) \times d_n(k_1,j_2;t_1)\cdots d_n(k_m,j_1;t_m). \tag{17} \]
Note that the number of choices of $j_1,\dots,j_m,k_1,\dots,k_m$ giving π is at most $n^{|K'(\pi)|}N^{2|\pi|-|K'(\pi)|}$, since the j’s and the k’s are attached to the vertices in the quotient graph of π: Recall that the d’s are block diagonal; this is where the powers of N come from. We need not have $k_r = j_{r+1}$ to get a nonzero term, but rather $|k_r - j_{r+1}| \le N$ as $D_n(t)$ has no nonzero entries outside block diagonals. Using this one sees that the exponent $2|\pi| - |K'(\pi)|$ comes out by counting in the quotient graph, which is a graph with $|K'(\pi)|$ vertices and |π| edges. Set $d := \sup_{k\in\{1,\dots,m\},n,i,j} |d_n(i,j;t_k)|^m$, which is < ∞ from condition 2 of Definition 18. If there are blocks of cardinality ≠ 2, we get from the conditions on the $a_n(i,j;k)$ and the fact that |E(f)| ≤ E(|f|) (and also that we get 0 if one of the blocks has cardinality one, due to centeredness), that the π-term in (17) is dominated by
\[ o(n^{-|\pi|-1}) \sum_{j_r,k_r \text{ giving } \pi} \bigl|d_n(k_1,j_2;t_1)\cdots d_n(k_m,j_1;t_m)\bigr|, \]
since some factor in (17) in this case is dominated by $E(|a_n(i,j;k)|^r)$ with r > 2 or is zero, which is $o(n^{-1})$ by our assumptions. The π-term is then also dominated by
\[ o(n^{-|\pi|-1})\,n^{|K'(\pi)|}N^{2|\pi|-|K'(\pi)|}d = r\,o(n^{|K'(\pi)|-|\pi|-1}) \tag{18} \]
(with $r = N^{2|\pi|-|K'(\pi)|}d$), which is o(1) since $|K'(\pi)| - |\pi| - 1 \le 0$. Therefore, we may assume that all blocks have cardinality two in order to get a contribution in the limit. The estimate (18) shows that for such π the quantity is dominated by (a constant times) $n^{|K'(\pi)|-|\pi|-1}$. This means that π must be clickable, i.e. $|K'(\pi)| - |\pi| - 1 = 0$, in order to give a contribution in the limit. But then $\pi \in NC(m)_2$ with alternating orientation by our analysis of C(m) in the last section. For such partitions K and $K'$ coincide, so that we can simply write K for them from now on. We add to (17) for each π,
\[ \sum_{\substack{j_1,\dots,j_m,\ k_1,\dots,k_m \text{ so that}\\ j_r=k_s,\ k_r=j_s \text{ whenever } r\sim_\pi s,\\ j\text{'s and } k\text{'s not giving rise to } \pi}} \frac{1}{n}\prod_{i=1}^{h} E\Bigl(\prod_{r=1}^{|A_i|} a_n(j_{w_{ir}},k_{w_{ir}};i_{w_{ir}})\Bigr) \times d_n(k_1,j_2;t_1)\cdots d_n(k_m,j_1;t_m), \tag{19} \]
which is o(1): This follows since a choice of j’s and k’s as in the summand gives rise to some π' with π' > π as partitions (> not necessarily as oriented partitions; the orientation of π' may not be compatible with that of π). Hence at most $n^{|K'(\pi')|}N^{2|\pi'|-|K'(\pi')|}$ choices of j’s and k’s can give π' as quotient graph exactly as above, so that estimating as in (18) we get (for some constant r')
\[ r'\,n^{|K'(\pi')|}\,n^{-|\pi|-1} \le r'\,n^{|\pi'|+1}\,n^{-|\pi|-1} = r'\,n^{|\pi'|-|\pi|} \le r'\,n^{-1} = o(1), \]
since |π'| < |π|. Summing up for different π' yields the desired result. Modulo terms that are o(1), (17) thus gets to be
\[ \sum_{\substack{\pi\le\sigma,\\ \pi\in NC(m)_2}}\ \sum_{\substack{j_1,\dots,j_m,\ k_1,\dots,k_m \text{ so that}\\ j_r=k_s,\ k_r=j_s \text{ whenever } r\sim_\pi s}} \frac{1}{n}\prod_{i=1}^{h} E\Bigl(\prod_{r=1}^{|A_i|} a_n(j_{w_{ir}},k_{w_{ir}};i_{w_{ir}})\Bigr) \times d_n(k_1,j_2;t_1)\cdots d_n(k_m,j_1;t_m). \tag{20} \]
Replacing the expectations in (20) by $\alpha_{i_{w_{ir}}}/n$, we see that this is
\[ \sum_{\substack{\pi\le\sigma,\\ \pi\in NC(m)_2}} n^{-|\pi|-1}\Bigl(\prod_{i=1}^{h}\alpha_{\sigma(A_i)}\Bigr) \sum_{\substack{j_p,k_p,\\ j_r=k_s,\ k_r=j_s\\ \text{whenever } r\sim_\pi s}}\ \prod_{i=1}^{k}\prod_{t=1}^{|B_i|} d_n(k_{v_{it}},j_{(v_{it}+1)};t_{v_{it}}), \tag{21} \]
where we have split up the product as dictated by the partition K(π). Consider the term corresponding to some π. I claim that
\[ j_{(v_{it}+1)} = k_{v_{i(t+1)}},\quad i = 1,\dots,k,\ t = 1,\dots,|B_i| \pmod{|B_i|} \tag{22} \]
and that these relations run through the same relations as
\[ j_r = k_s,\ k_r = j_s \text{ with } r \sim s \text{ in } \pi. \tag{23} \]
The number of relations is m for both sets of relations, as is easily checked. If r ∼ s in π, then $r-1 = v_{it}$, $s = v_{i(t+1)}$ and $r = v_{j(s+1)}$, $s-1 = v_{js}$ for suitable choices of i, j, s and t, as can easily be seen from the circular representation. $j_r = k_s$ then says that $j_{(v_{it}+1)} = k_{v_{i(t+1)}}$, while $k_r = j_s$ says that $k_{v_{j(s+1)}} = j_{(v_{js}+1)}$. These are all relations from (22), so that (23) ⊂ (22). As all relations from (23) are distinct and the numbers of relations are the same, we also have equality here so that the relations are the same. All this means that the product $\prod_{t=1}^{|B_i|}$ inside the summand of (21) can be written
\[ \prod_{t=1}^{|B_i|} d_n(k_{v_{it}},k_{v_{i(t+1)}};t_{v_{it}}), \]
and summing over all $k_{v_{it}}$ gives that this equals
\[ n\,\phi_n\bigl(D_n(t_{v_{i1}})\cdots D_n(t_{v_{i|B_i|}})\bigr). \tag{24} \]
The entire product $\prod_{i=1}^{k}\prod_{t=1}^{|B_i|}$ in (21) thus equals
\[ n^{|K(\pi)|}\prod_{i=1}^{k}\phi_n\Bigl(\prod_{t=1}^{|B_i|} D_n(t_{v_{it}})\Bigr). \]
Since |K(π)| − |π| − 1 = 0 and since all $\prod_t D_n(t_{v_{it}}) = D_n(t_i')$ have limit distributions as n → ∞ due to 1 of Definition 18 (denote the limit variables by $D(t_i)$), we get that the limit contribution for π in (21) exists and is
\[ \Bigl(\prod_{i=1}^{h}\alpha_{\sigma(A_i)}\Bigr)\prod_{j=1}^{k}\phi_n\Bigl(\prod_{r\in B_j} D(t_r)\Bigr). \tag{25} \]
If we first choose all the $D_n(t_k)$’s to be the identity (the assumption $I \in \{D_n(t)\}_{t\in T}$ is irrelevant), we get by summing over all $\pi \in NC(m)_2$ that the limit of (16) is
\[ \lim_{n\to\infty}\phi\bigl(A_n(i_1)\cdots A_n(i_m)\bigr) = \sum_{\substack{\pi\in NC(m)_2,\ \pi\le\sigma\\ \pi=\{A_1,\dots,A_h\}}}\ \prod_{i=1}^{h}\alpha_{\sigma(A_i)}. \tag{26} \]
We see thus from Definition 2 and Eq. (8) that $(A_n(1), A_n(2), \dots)$ converge in distribution to a semicircular family $(X_1, X_2, \dots)$ of variances $\alpha_i$. As $[\mathrm{coef}(i_1,\dots,i_m);\pi]R(\mu_{a_1,a_2,\dots})$ equals $[\mathrm{coef}(1,\dots,m);\pi]R(\mu_{a_{i_1},\dots,a_{i_m}})$ (this is shown in [11]), the limit quantity (25) is seen to be
\[ [\mathrm{coef}(1,\dots,m);\pi]\,R(\mu_{X_{i_1},\dots,X_{i_m}})\ [\mathrm{coef}(1,\dots,m);K(\pi)]\,M(\mu_{D(t_1),\dots,D(t_m)}). \tag{27} \]
By the definition of boxed convolution, for general D’s the limit quantity is (by summing over $\pi \in NC(m)_2$)
\[ [\mathrm{coef}(1,\dots,m)]\bigl(R(\mu_{X_{i_1},\dots,X_{i_m}}) \boxed{\star} M(\mu_{D(t_1),\dots,D(t_m)})\bigr). \tag{28} \]
This implies asymptotic freeness with $(\{D_n(t)\}_{t\in T})$ by Nica and Speicher’s characterization of freeness in Lemma 9.

This reproves Voiculescu’s results on limit distributions of random matrices [16, 19], and simplifies Dykema’s proof of the same statement in [3]. We have also improved the conditions they needed on the moments of the entries. Dykema used the trace-0 definition (Definition 1) of freeness directly in order to show asymptotic freeness. This meant that for every power of a random matrix or constant block diagonal matrix he had to subtract its trace (times the identity) in order to get centered random variables. Then he had to multiply them together to find that the product has trace 0, and many combinatorial issues had to be resolved in this direction. The boxed convolution characterization of freeness in Lemma 9 is nicer in this respect, since the assumption of zero trace on the random variables involved is irrelevant. One need not modify (i.e. subtract the trace times the identity) the random variables to work with them, and this leads to a more direct proof of the asymptotic freeness result. One can say that freeness from constant block diagonal matrices comes out when one factors out the trace of the block diagonal matrices in (25). Note the following nonsymmetric version of Theorem 1:
Theorem 2. If the matrices $(A_n(1), A_n(2), \dots)$ are nonsymmetric, i.e. they have entries satisfying 1, 2 and 3’ of (13) instead, then the statement of Theorem 1 holds with semicircular replaced by circular.

Proof. To see this one goes through the proof of Theorem 1 again, starting by replacing (16) by $\phi_n(A_n(i_1)^{g(1)}D_n(t_1)\cdots A_n(i_m)^{g(m)}D_n(t_m))$, and $a_n(j_r,k_r;i_r)$ by the $(j_r,k_r)$-entry of $A_n(i_r)^{g(r)}$. σ is as before, but now it can be thought of as an oriented partition, as we are dealing with a ∗-monomial. Summing over π ≤ σ (≤ taken in the oriented sense) in (17), we get because of the same estimates as before that clickable π with blocks only of cardinality two are the only ones giving a contribution in the limit. This means that $\pi \in NC(m)_2$ with alternating orientation in the blocks due to Lemma 32. This says that if $\pi = \{A_1,\dots,A_h\} = \{\{v_{i1},v_{i2}\}\}_i$, then $v_{i1}$ and $v_{i2}$ have opposite orientation, i.e. $(g(v_{i1}),g(v_{i2})) = (\cdot,*)$ or $(*,\cdot)$. All the calculations go exactly as before. We could however rephrase the addition of quantities for different π as in (19) to saying that we add a quantity with index of the summand equal to $j_1,\dots,j_m,k_1,\dots,k_m$ giving any π' with σ ≥ π' > π. These are o(1), as follows easily from the same estimates as before, or using Lemma 20. Thus we get in the limit
\[ \sum_{\substack{\pi\in NC(m)_2,\ \pi\le\sigma\\ \pi=\{A_1,\dots,A_h\}=\{\{v_{i1},v_{i2}\}\}_i\\ (g(v_{i1}),g(v_{i2}))=(\cdot,*)\text{ or }(*,\cdot)}} \Bigl(\prod_{i=1}^{h}\alpha_{\sigma(A_i)}\Bigr)\prod_{j=1}^{k}\phi_n\Bigl(\prod_{r\in B_j} D(t_r)\Bigr). \tag{29} \]
The π appearing in Fig. 2 still does not give a contribution, though. Choosing the $D_n(t)$’s to be the identity first we arrive at exactly the same expression as in (9) for the joint limit ∗-distribution, and this is the same as saying that $(A_n(1), A_n(2), \dots)$ converge in distribution to ∗-free circular random variables. Exactly as in Theorem 1 we then also get asymptotic freeness with constant block diagonal matrices.

The result above is actually not far from giving an optimal result for when one can hope for a limit distribution in the case of each random matrix consisting of identically distributed random variables. We can’t state an optimal result for convergence in distribution itself, but this is possible for the following type of convergence, which is to convergence in distribution what absolute convergence is to convergence.

Definition 22. We say that the random matrices $A_n(1), A_n(2), \dots$ we are considering converge absolutely in distribution if the limits of
\[ \sum_{\substack{\pi\le\sigma,\\ \pi\in OP(m)}}\ \sum_{\substack{j_1,\dots,j_m\\ \text{giving } \pi}} \frac{1}{n}\,\bigl|E\bigl(a_n(j_m,j_1;i_1)\cdots a_n(j_{m-1},j_m;i_m)\bigr)\bigr| \]
exist as n → ∞ (i.e. the absolute sums of the sums in (16) converge).

We see that convergence and absolute convergence coincide in the case of positive valued random variables in the matrices. With this kind of convergence in distribution in mind we will show that we can characterize exactly when we get convergence, and describe the entire class of limit distribution laws. The clickable partitions will play an important role in this description. Freeness of the limit distributions we get will be very
rare, as will be shown more precisely in the paper [12]. The reason for this is roughly that the clickable and noncrossing partitions do not coincide: First of all, only even partitions arise in the combinatorics of our calculations. Secondly, the set of clickable partitions consists of all the even noncrossing partitions, plus a large class of partitions having crossings, the easiest of which is given by Fig. 2. Only in the case of partitions with all blocks of cardinality two is there a correspondence between the clickable and the even noncrossing partitions. This suggests why the circular limit distribution should be the only one appearing in the case of freeness in the limit: Circular random variables have cumulants only of order two.

Theorem 3. Assume the matrices $A_n(k)$ consist of independent, identically distributed entries from L. Then we have absolute convergence in distribution if and only if the following conditions hold:
1. $\lim_{n\to\infty} nE(|a_n(i,j;k)|^{2m})$ exists for all integers k and m ≥ 1.
2. $\lim_{n\to\infty} n^{\alpha}E\bigl(a_n(i,j;k)^p\,\overline{a_n(i,j;k)}^{\,q}\bigr) = 0$ for all α < 1 and all integers k, p, q ≥ 1 (i.e. the expectations are $o(n^\alpha)$ for all α > −1).

Moreover, the joint limit ∗-distributions we then get are in one to one correspondence with the sequences of limits
\[ \Bigl(\lim_{n\to\infty} nE(|a_n(i,j;k)|^2),\ \lim_{n\to\infty} nE(|a_n(i,j;k)|^4),\ \lim_{n\to\infty} nE(|a_n(i,j;k)|^6),\ \dots\Bigr)_k \]
in such a way that if $(\alpha_{k,2}, \alpha_{k,4}, \dots)$ is this sequence of limits for the random matrices $A_n(k)$, the joint limit ∗-distribution of the matrices $A_n(k)$ is given by
\[ \lim_{n\to\infty}\phi_n\bigl(A_n(k_1)^{g(1)}\cdots A_n(k_m)^{g(m)}\bigr) = \sum_{\pi\in C(m)} [\mathrm{coef}((k_1,g(1)),\dots,(k_m,g(m)));\pi](\alpha), \tag{30} \]
where α is the power series without mixed terms in the variables $(z_i, z_i^*)_i$,
\[ \alpha(z_1,z_1^*,z_2,z_2^*,\dots) = \sum_{k}\sum_{m}\alpha_{k,2m}\bigl((z_k z_k^*)^m + (z_k^* z_k)^m\bigr), \tag{31} \]
with $[\mathrm{coef}((k_1,g(1)),\dots,(k_m,g(m)))](\alpha)$ the coefficient of $z_{k_1}^{g(1)}\cdots z_{k_m}^{g(m)}$ in α. In particular, we get ∗-free circular limit distributions if and only if $\alpha_{k,2m} = 0$ for m ≠ 1. In this case we get also asymptotic freeness with constant block diagonal matrices.

Proof. We assume first that we have absolute convergence in distribution. This means that the quantity for each π in (16) stays bounded as n → ∞. Put $\pi = 1_{2m}$ with orientation of π chosen so that π is clickable. The corresponding quantity in (16) coming from π for the mixed moment $\phi_n(A_n(k)A_n(k)^*\cdots A_n(k)A_n(k)^*)$ is easily seen to be
\[ n^{|K'(\pi)|-1}E(|a_n(i,j;k)|^{2m}) = nE(|a_n(i,j;k)|^{2m}) \]
as $|K'(\pi)| = 2$ (it is easy to calculate the exact number of j’s giving π in this case). Thus the quantities in 1 stay bounded as n → ∞ (convergence of these quantities will be proved later). The fact that the quantities in 2 stay bounded as n → ∞ is a bit harder to show. We will need the following lemma for this.
Lemma 23. For all $s > 0$ and $p \ne q$ there exists an oriented partition $\pi = \{B_1, \ldots, B_s\}$ such that
1. all $|B_i|_+ = p$, $|B_i|_- = q$ (so that $|B_i| = p + q$),
2. $|K'(\pi)| = |\pi|$, i.e. the number of vertices is one short of being maximal.

Proof. Assume $p > q$. The π we want to construct is in $OP(s(p+q))$. We will construct π so that the orientation of edges in each block is given by (in increasing order, · meaning positive orientation, ∗ meaning negative orientation)
$$\cdot, *, \cdot, *, \ldots, \cdot, *, \cdot, \cdot, \ldots, \cdot, \tag{32}$$
i.e. the first $2q$ elements are alternating ·'s and ∗'s until the $q$ ∗'s (i.e. the edges having negative orientation) are used up, and the rest are ·'s. We will construct the circular representation of π by adding edges with orientation, so that the end product is an oriented partition whose quotient graph has one loop. It is easy to see that this implies $|K'(\pi)| = |\pi|$. First place $m = p + q$ edges on the circle; they are to make up the first block $B_1$ of π, with their orientation determined by (32). The first $2q + 1$ of the edges should be connected; this we call the largest segment of $B_1$. The rest ($p - q - 1$ edges) should not be connected, as we will place the remaining $(p+q)(s-1)$ edges in the $p - q$ intervals we now have. To see how the rest of the edges (with orientation) should be placed, first do the identifications in $B_1$. Then we obtain the loops in phase 1 of Fig. 3, each loop corresponding to one of the $p - q$ intervals mentioned. There the clockwise direction in the circular representation is indicated, the innermost loops appear first in the circular representation, and the $B_1$-block is indicated in bold. It is not hard to see from this how one can add the remaining edges so
[Fig. 3. Doing the identifications of edges in obtaining the quotient graph of π. Panels: Phase 1, Phase 2, Phase 3.]
that the loops actually collapse to one loop when one does the remaining identifications: The inner loop in the above could be made up of s − 1 consecutive copies of the largest segment of B1 , the other loops s − 1 consecutive copies of a single edge with positive orientation, i.e. copies of the other segments of B1 . Doing identifications here we arrive at phase two of Fig. 3, dotted lines drawn to indicate what edges are identified. When one does the identifications from phase two to phase three of Fig. 3, we obtain in the end one loop. Obviously, this partition also satisfies the conditions in 1. For p < q we interchange the roles of · and ∗ in the above argument to come to the desired conclusion.
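The vertex counts $|K'(\pi)|$ used in Lemma 23 and in the proof of Theorem 3 can be explored numerically. The sketch below is not taken from the paper. It assumes that the quotient graph is obtained from a cycle of m edges (edge r running from vertex r−1 to vertex r, a ∗-edge being read backwards) by gluing the edges of each block tail to tail and head to head, and that $|K'(\pi)|$ is the number of vertices left after gluing, so that π is clickable exactly when $|K'(\pi)| = |\pi| + 1$. The function names and the string encoding of orientations ('.' for ·, '*' for ∗) are illustrative choices.

```python
def quotient_vertex_count(blocks, orientation):
    """Number of vertices of the quotient graph under the assumed gluing model.

    blocks      -- list of blocks, each a list of edge labels in 1..m
    orientation -- string of length m, '.' = positive, '*' = negative
    """
    m = len(orientation)
    parent = list(range(m))  # union-find over the m cycle vertices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    def endpoints(e):
        # Edge e joins vertex e-1 to vertex e (mod m); '*' reverses it.
        tail, head = (e - 1) % m, e % m
        return (tail, head) if orientation[e - 1] == '.' else (head, tail)

    for block in blocks:
        tail0, head0 = endpoints(block[0])
        for e in block[1:]:
            tail, head = endpoints(e)
            union(tail0, tail)  # glue tails of edges in the same block
            union(head0, head)  # glue heads of edges in the same block
    return len({find(v) for v in range(m)})


def is_clickable(blocks, orientation):
    """Assumed criterion: the quotient graph has |pi| + 1 vertices (a tree)."""
    return quotient_vertex_count(blocks, orientation) == len(blocks) + 1


def lemma23_orientation(p, q):
    """Orientation pattern (32) for one block with p '.' and q '*' edges, p > q."""
    return ".*" * q + "." * (p - q)


if __name__ == "__main__":
    # One-block case s = 1 of Lemma 23: a single vertex, i.e. |K'| = |pi| = 1.
    for p in range(2, 6):
        for q in range(1, p):
            pattern = lemma23_orientation(p, q)
            block = [list(range(1, p + q + 1))]
            print(p, q, pattern, quotient_vertex_count(block, pattern))
    # A two-block partition with a crossing that is clickable in this model.
    print(is_clickable([[1, 2, 5, 6], [3, 4, 7, 8]], ".*.*.*.*"))
```

Under these assumptions the one-block partition with orientation (32) has a single vertex whenever $p \ne q$, in line with Lemma 23 for $s = 1$, while the alternating block $1_{2m}$ has two vertices. The partition {1,2,5,6}, {3,4,7,8} with alternating orientation has a crossing but is still clickable in this model; it is not claimed to be the partition of Fig. 2.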
Ø. Ryan
Say that the order in which the ·'s and ∗'s appeared above is given by the function $g : \{1, \ldots, (p+q)s\} \to \{\cdot, *\}$. Then the term coming from the above π in (16) is, for the mixed moment $\phi_n(A_n(k)^{g(1)} \cdots A_n(k)^{g((p+q)s)})$, equal to (as $|K'(\pi)| = |\pi| = s$)
$$\frac{1}{n}\,(n^{s} + P(n))\, E\big(a_n(i,j;k)^{p}\, \overline{a_n(i,j;k)}^{\,q}\big)^{s},$$
where $P$ is a polynomial of degree $< s$, coming from Corollary 21. This quantity must stay bounded as $n \to \infty$, so that $n^{\frac{s-1}{s}} E(a_n(i,j;k)^{p}\, \overline{a_n(i,j;k)}^{\,q})$ stays bounded as $n \to \infty$. As $s$ was arbitrary we obtain that $\lim_{n\to\infty} n^{\alpha} E(a_n(i,j;k)^{p}\, \overline{a_n(i,j;k)}^{\,q}) = 0$ for all $p, q$ and $\alpha < 1$.

Knowing this, we see that no nonclickable π can give a contribution in (16), as for any such $\pi = \{B_1, \ldots, B_s\}$ ($|K'(\pi)| \le |\pi|$ for such π) its contribution would be dominated by (a constant times)
$$n^{|\pi|-1} \prod_{l=1}^{s} E\big(a_n(i,j;k)^{|B_l|_+}\, \overline{a_n(i,j;k)}^{\,|B_l|_-}\big),$$
which is, after distributing the powers of $n$ among the factors,
$$\prod_{l=1}^{s} n^{\frac{|\pi|-1}{|\pi|}} E\big(a_n(i,j;k)^{|B_l|_+}\, \overline{a_n(i,j;k)}^{\,|B_l|_-}\big), \tag{33}$$
which converges to zero from what we have shown.

Convergence of the quantities $n E(|a_n(i,j;k)|^{2m})$ follows by induction: if this is shown for $m' < m$, convergence of the mixed moments of order $2m$ is equivalent to convergence of $n E(|a_n(i,j;k)|^{2m})$, as every $\pi \ne 1_{2m}$ gives convergent quantities in (16) by induction. This completes the proof that 1 and 2 are fulfilled when we assume absolute convergence in distribution.

The expression for the limit distribution is obtained in the same way as in the proof of Theorem 1, but we now have to sum over all clickable partitions. This means that in (25) we sum over all $\pi \in C(m)$, $\pi \le \sigma$, with $\pi = \{A_1, \ldots, A_h\} = \{\{v_{ij}\}_j\}_i$ so that all $(g(v_{i1}), g(v_{i2}), \ldots)$ are alternating sequences of ·'s and ∗'s. Choosing all $D_n(t) = I$ we arrive at (after replacing expectations)
$$\sum_{\substack{\pi \in C(m),\ \pi \le \sigma \\ \pi = \{A_1, \ldots, A_h\} = \{\{v_{ij}\}_j\}_i \\ (g(v_{i1}), g(v_{i2}), \ldots) = (\cdot,*,\cdot,*,\ldots) \text{ or } (*,\cdot,*,\cdot,\ldots)}} \ \prod_i \alpha_{\sigma(A_i),|A_i|}. \tag{34}$$
It is easy to see that the summand here is the same as
$$[\mathrm{coef}((i_1,g(1)),\ldots,(i_m,g(m)));\pi]\Big(\sum_k \sum_m \alpha_{k,2m}\big((z_k z_k^*)^m + (z_k^* z_k)^m\big)\Big),$$
where the power series is recognized as the α in (31). Since such a coefficient is zero unless $\pi \le \sigma$ and all $(g(v_{i1}), g(v_{i2}), \ldots)$ give alternating sequences, we see that the sum for the limit distribution is the same if we sum over all $\pi \in C(m)$, i.e.
$$\sum_{\pi \in C(m)} [\mathrm{coef}((i_1,g(1)),\ldots,(i_m,g(m)));\pi](\alpha),$$
which is what we wanted to show. For instance, the partition appearing in Fig. 2 now gives a contribution. Conversely, if 1 and 2 are fulfilled, we see from our arguments above that all terms from any π converge, so that we have absolute convergence in distribution. The sequences of limits are now easily seen to be in bijection with the possible limit distributions, just as one shows that the R-transform is a bijection, namely by determining the cumulants recursively in terms of the moments. Therefore we get a circular limit distribution if and only if $\alpha_{k,2m} = 0$ for $m \ne 1$. These distributions are free and we get freeness with constant block diagonal matrices since the situation is the same as the one we have seen before in Theorem 2.

We see from the above that absolute convergence of each $A_n(i)$ separately implies absolute convergence of $A_n(1), A_n(2), \ldots$ altogether. In this case one can also show that one obtains limit distributions with constant diagonal matrices as in Definition 18 ($N = 1$). For larger $N$ it seems to be difficult to obtain this. One can in fact show that the only possibility for free limit distributions as in Theorem 3 is if all, except possibly one of the matrices, give circular limit distributions. This is shown in the upcoming paper [12]. Even with $N = 1$ we can't expect freeness with our block diagonal matrices, as summation in (27) goes over all clickable partitions, so that Lemma 9 for proving freeness does not apply. These things are discussed further in Appendix 3 of the author's PhD thesis.

Note that, for a sequence $k_n$, the fact that $\lim_{n\to\infty} n^{\alpha} k_n = 0$ for all $\alpha < 1$ need not imply that $\lim_{n\to\infty} n k_n$ exists (the converse is of course true). The sequence $k_n = \ln(n)/n$ provides an example of this: $n^{\alpha} k_n = n^{\alpha-1}\ln(n) \to 0$ for every $\alpha < 1$, while $n k_n = \ln(n)$ diverges. Note also the following symmetric version of Theorem 3.

Theorem 4. Symmetric random matrices $A_n(1), A_n(2), \ldots$
with real valued independent, identically distributed entries satisfying conditions 1 and 2 of Theorem 3 converge absolutely in distribution, the limit given by
$$\lim_{n\to\infty} \phi_n(A_n(k_1) \cdots A_n(k_m)) = \sum_{\pi \in C(m)} [\mathrm{coef}(k_1,\ldots,k_m);\pi]\Big(\sum_k \sum_m \alpha_{k,2m} z_k^{2m}\Big)$$
(the sequence of limits $\alpha_{k,2m}$ defined as before).

Proof. This can be shown by looking at matrices $A_n(i) + A_n(i)^*$ with $A_n(i)$ as in Theorem 3, or by going through the same calculations again, adapting the orientation to the symmetric case as in Definition 19.

3.1. Obtainable limit distributions. We will not go into the special limit distributions we get from Theorem 4, except for the semicircular one. But we will make some remarks about a connection between the limit distributions we get and the infinitely divisible distributions. More precisely:

Proposition 24. Any sequence of limits $\{\alpha_2, \alpha_4, \ldots\}$ equal to the even (classical) cumulant sequence of some even, infinitely divisible probability measure with compact support is obtainable by using symmetric matrices as in Theorem 4.

Proof. Let ν be a probability measure as above, and let $A_n$ be matrices as in Theorem 4, with all entries $a_n$ having distribution $\nu_{1/n}$ (with $\nu = \nu_{1/n} * \cdots * \nu_{1/n}$). We will need the following lemma.
Lemma 25. Let the measures $\nu_n$ be compactly supported (such measures are determined by their moments) with moments $\nu_n^{(m)}$. Let also $\alpha_{n,m}$ and $\beta_{n,m}$ be the classical and free cumulants of $\nu_n$, respectively. Then all limits $\lim_{n\to\infty} n\nu_n^{(m)}$ exist if and only if all limits $\lim_{n\to\infty} n\alpha_{n,m}$ exist if and only if all limits $\lim_{n\to\infty} n\beta_{n,m}$ exist. If all these limits exist, then
$$\lim_{n\to\infty} n\nu_n^{(m)} = \lim_{n\to\infty} n\alpha_{n,m} = \lim_{n\to\infty} n\beta_{n,m}$$
for all $m$.

Proof. From the moment-cumulant formula in classical probability,
$$\lim_{n\to\infty} n\nu_n^{(m)} = \lim_{n\to\infty} n \sum_{\pi \in P(m)} [\mathrm{coef}(m);\pi](\alpha_{n,1} z + \alpha_{n,2} z^2 + \cdots).$$
Existence of the second type of limits is easily seen to imply existence of the first type of limits due to this formula, with equality of the limits themselves; only $\pi = 1_m$ can give a contribution in this formula in the limit if the second type of limits exist. Similarly, it is not too hard to see by induction on $m$ that existence of the first type of limits implies existence of the second type of limits. The third type of limits is handled similarly by using the moment-cumulant formula in free probability. The set of all partitions is here replaced by the set of noncrossing partitions.

Say ν has classical cumulant sequence $0, \alpha_2, 0, \alpha_4, \ldots$. Then $\nu_{1/n}$ has cumulants $0, \frac{\alpha_2}{n}, 0, \frac{\alpha_4}{n}, \ldots$. We see from the lemma that $\lim_{n\to\infty} n E(a_n(i,j)^{2m}) = \lim_{n\to\infty} n \frac{\alpha_{2m}}{n} = \alpha_{2m}$, so that the sequence of limits coincides with the desired cumulant sequence.

In a certain sense, the sequences of limits must be of the form above. To formulate this more precisely, let $a_{\mathrm{sym}}$ be a symmetrization of $a$, i.e. any random variable with the same even moments as $a$, but with zero odd moments.

Proposition 26. Let the sequence of limits in Theorem 4 exist with $a_n$ the entries in the random matrix $A_n$, and assume also that the n-fold convolutions $\mu((a_n)_{\mathrm{sym}})^{*n}$ are all supported on the same interval. Then the sequence of limits is the even cumulant sequence of some even, infinitely divisible and compactly supported probability measure.

Proof. One assumption is that all limits $\lim_{n\to\infty} n\phi(|a_n(i,j)|^{2m})$ exist, say with a sequence of limits $\alpha_2, \alpha_4, \ldots$. Then all limits $\lim_{n\to\infty} n\phi(((a_n)_{\mathrm{sym}})^m)$ exist with limits $0, \alpha_2, 0, \alpha_4, \ldots$. If we denote the cumulants of $\mu((a_n)_{\mathrm{sym}})$ by $\alpha_{n,m}$, then the limits $\lim_{n\to\infty} n\alpha_{n,m}$ must also exist, with values $0, \alpha_2, 0, \alpha_4, \ldots$. But these limits are then also the limits of the cumulants of $\mu((a_n)_{\mathrm{sym}})^{*n}$. As these are all supported on the same interval by assumption, and their moments must also converge, the $\mu((a_n)_{\mathrm{sym}})^{*n}$ converge weakly as well.
As the random variables $(a_n)_{\mathrm{sym}}$ are also infinitesimal (this follows from Chebyshev's inequality [2]), it follows (see [1]) that the limit is infinitely divisible. Of course, it is also even and compactly supported. As the cumulants of the limit are the limits of the cumulants, it follows that our sequence of limits coincides with such a cumulant sequence.

The limit distributions from the propositions above may not themselves be infinitely divisible, but they are infinitely divisible in a "clickable" sense:

Proposition 27. If $A_n$ are random matrices with limit distribution as in the situation in the propositions above, then this limit distribution can for any $m$ be written as the limit distribution of $A_n(1) + \cdots + A_n(m)$, with the $A_n(i)$ identically distributed and also as in the propositions above.
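The scaling behind Lemma 25 and the proof of Proposition 24 can be checked on formal cumulant sequences. The sketch below is an illustration only: it treats $(0, \alpha_2, 0, \alpha_4)$ as a formal list of classical cumulants (no measure is constructed), divides each cumulant by $n$, computes the moments with the standard recursion $m_k = \sum_{j=1}^{k} \binom{k-1}{j-1} \kappa_j m_{k-j}$, and multiplies them by $n$. The function names and the sample values are hypothetical.

```python
from fractions import Fraction
from math import comb


def moments_from_cumulants(kappa):
    """Moments m_1..m_K from classical cumulants kappa[0..K-1] = kappa_1..kappa_K."""
    m = [Fraction(1)]  # m_0 = 1
    for k in range(1, len(kappa) + 1):
        m.append(sum(comb(k - 1, j - 1) * kappa[j - 1] * m[k - j]
                     for j in range(1, k + 1)))
    return m[1:]


def scaled_moments(alpha, n):
    """n times the moments of the formal law with cumulants alpha_j / n."""
    kappa = [Fraction(a, n) for a in alpha]  # alpha must hold integers here
    return [n * mk for mk in moments_from_cumulants(kappa)]


if __name__ == "__main__":
    alpha = [0, 1, 0, 2]  # formal cumulants: alpha_2 = 1, alpha_4 = 2
    for n in (10, 100, 1000):
        print(n, scaled_moments(alpha, n))
```

For this input the fourth entry equals $\alpha_4 + 3\alpha_2^2/n$, so the scaled moments approach the cumulants $\alpha_{2m}$ as $n$ grows, which is the mechanism used in the proof of Proposition 24.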
Proof. The proof of this is easy once we have the following result, which is proved in Appendix 2 of the author's PhD thesis:

Lemma 28. If $A_n(1), \ldots, A_n(m)$ are as in Theorem 4, then $A_n(1) + \cdots + A_n(m)$ is also as in Theorem 4, and its sequence of limits is given by the componentwise sum of the sequences of limits of the $A_n(i)$.

So, if the $A_n$ are as in the propositions, their sequence of limits is some even cumulant sequence $\alpha_2, \alpha_4, \ldots$. But then $\frac{\alpha_2}{m}, \frac{\alpha_4}{m}, \ldots$ is also some even cumulant sequence, and is thus a sequence of limits for random matrices $A_n(i)$, $1 \le i \le m$, as in the propositions. The lemma above then says that $A_n(1) + \cdots + A_n(m)$ has the same sequence of limits as $A_n$ has, and we are through since the sequence of limits determines the limit distribution.

4. Random Matrices with Free Entries

Versions of Theorems 1, 2, 3 and 4 can also be stated for matrices with ∗-free assemblies of random variables (see Theorems 5, 6, 7). We will see that the freeness assumption on the entries implies a certain dominance of the even noncrossing partitions inside the clickable partitions in our calculations, so that the even noncrossing partitions, instead of the clickable partitions, govern the structure of the joint limit ∗-distributions here. This means that we must get free random variables in the limit. We also get that these give rise to R-diagonal pairs; this is due to the alternating property for the blocks of a clickable partition, see 3b) of Lemma 32.

Theorem 5. Let the $\{A_n(k)\}_k$ be random matrices with entries in each matrix being identically distributed and ∗-free (entries in separate matrices also being free) in $(A, \phi)$ (for each $n$). If
1. $\alpha_{k,2m} = \lim_{n\to\infty} n\phi_n((a_n(i,j;k)^* a_n(i,j;k))^m)$ and $\beta_{k,2m} = \lim_{n\to\infty} n\phi_n((a_n(i,j;k) a_n(i,j;k)^*)^m)$ exist for $m \ge 1$,
2. $\lim_{n\to\infty} n^{\alpha} \phi_n\big(\prod_{r=1}^{m} a_n(i,j;k)^{g(r)}\big) = 0$ for all $\alpha < 1$, $m$ and $g$,
then the joint limit ∗-distribution of the matrices $A_n(k)$ exists. Moreover, the $(A_n(1), A_n(2), \ldots)$ are asymptotically ∗-free, and the R-transform coefficients of the limit distribution of the $A_n(k)$ are related to the limits in 1 by
$$R(\mu_{X_k, X_k^*})(z, z^*) = \sum_{m=1}^{\infty} \big(\alpha_{k,2m} (z^* z)^m + \beta_{k,2m} (z z^*)^m\big). \tag{35}$$
In particular, if the $\phi_n$ are traces ($\Rightarrow \alpha_{k,2m} = \beta_{k,2m}$), we get in the limit random variables giving rise to R-diagonal pairs with the sequences of limits in 1 as defining sequences (if the entries of $A_n(k)$ are normal, $\phi_n$ is automatically a trace on the ∗-algebra the entries generate). Freeness holds with sets of constant block diagonal matrices if the limit ∗-distributions in the above have cumulants only of order two.

Proof. Look at the situation without constant block diagonal matrices, i.e. the terms in (16) are instead $\frac{1}{n}\phi_n(a_n(j_m, j_1; i_1) \cdots a_n(j_{m-1}, j_m; i_m))$. For convenience we drop the adjoints of the matrices in the first part of the calculations. The oriented partition π appearing in (16) should now be defined by replacing independent with ∗-free in
Definition 19. We need only sum over $\pi \le \sigma$ due to the freeness condition for separate matrices. The calculations go as in Theorem 1. A difference here is that we can no longer replace the term above with
$$\frac{1}{n} \prod_{i=1}^{h} \phi_n\Big(\prod_{r=1}^{|A_i|} a_n(j_{w_{ir}-1}, j_{w_{ir}}; i_{w_{ir}})\Big)$$
(with notation for π and its blocks as in Theorem 1) as in (17), since independence has been replaced by freeness, and the expectation $E$ has been replaced by the unital linear functional $\phi_n$. Instead we have to split the mixed moment $\phi_n(a_n(j_m, j_1; i_1) \cdots a_n(j_{m-1}, j_m; i_m))$ into sums of products of the individual moments using Definition 1. For any (oriented) $\sigma' = \{\sigma'_1, \ldots, \sigma'_r\} \le \pi$ we get a 'submoment' $m_{\sigma'} = \prod_i \phi_n\big(\prod_{r \in \sigma'_i} a_n(j_{r-1}, j_r; i_r)\big)$, and we can write
$$\phi_n(a_n(j_m, j_1; i_1) \cdots a_n(j_{m-1}, j_m; i_m)) = \sum_{\sigma' \le \pi} t(\pi; \sigma')\, m_{\sigma'} \tag{36}$$
for some constants $t(\pi; \sigma')$. Our notation for $t(\pi; \sigma')$ is the same as that appearing in [15]. These constants depend only on the partitions $\pi, \sigma'$, not on the particular random variables involved. In particular, the constants are the same if some of the random variables are replaced by their adjoints, explaining in a certain sense why we could drop the adjoints at the beginning of the proof. A result in [15] says that $t(\pi; \pi) \ne 0$ if and only if π is noncrossing and that $t(\pi; \pi) = 1$ for such π. Any $\sigma' < \pi$ has more than $|\pi|$ blocks, so that putting the summands $t(\pi; \sigma') m_{\sigma'}$ into (17) we get terms that are negligible for large $n$: to see this, distribute powers of $n$ as in (33), note that the maximum possible power of $n$ is $n^{\frac{|K'(\pi)|-1}{|\sigma'|}} \le n^{\frac{|\pi|}{|\sigma'|}} < n^{1}$ and use condition 2. For π which are crossing we get no terms which contribute in the limit since $t(\pi; \pi) = 0$ for such π. Therefore, in the limit we get a contribution only from $t(\pi; \pi)$ with π noncrossing. As these $t$'s are 1, we end up with the same type of situation as in (11) for the limit distribution, i.e.
$$\sum_{\substack{\pi \in NC(m)_{\mathrm{even}},\ \pi \le \sigma \\ \pi = \{A_1, \ldots, A_h\} = \{\{v_{ij}\}_j\}_i \\ (g(v_{i1}), g(v_{i2}), \ldots) = (\cdot,*,\cdot,*,\ldots) \text{ or } (*,\cdot,*,\cdot,\ldots)}} \ \prod_i (\alpha \text{ or } \beta)_{\sigma(A_i),|A_i|}. \tag{37}$$
Here we choose α or β for $i$ depending on whether $(g(v_{i1}), g(v_{i2}), \ldots) = (*, \cdot, *, \cdot, \ldots)$ or $(\cdot, *, \cdot, *, \ldots)$, respectively. If $\alpha_{k,2m} = \beta_{k,2m}$, we get from comparison with (11), in the limit, (free) R-diagonal pairs with the sequences of limits as defining sequences. If the α's and the β's are different, it is not too hard to conclude that the R-transform is as in (35).

Freeness with constant block diagonal matrices holds in the case of circular limit distributions because we end up with the same situation as in Theorem 2, as $C(m)_2 = NC(m)_2$ are the only partitions appearing in this case. If $\alpha_{k,2m} \ne \beta_{k,2m}$ it is not too hard to convince oneself that one also gets freeness with constant block diagonal matrices, even if we do not have circular limits in this case.

We can also obtain a symmetric version of Theorem 5 in the following way:
Theorem 6. If the symmetric matrices $A_n(1), A_n(2), \ldots$ consist of free, identically distributed, selfadjoint entries satisfying conditions 1 and 2 of Theorem 5, then the matrices converge absolutely in distribution to a limit satisfying
$$\lim_{n\to\infty} \phi_n(A_n(k_1) \cdots A_n(k_m)) = \sum_{\pi \in NC(m)_{\mathrm{even}}} [\mathrm{coef}(k_1,\ldots,k_m);\pi]\Big(\sum_k \sum_m \alpha_{k,2m} z_k^{2m}\Big)$$
(the α's and β's coincide in this case). The proof of this statement follows exactly as before.

The identically distributed condition on the entries is not really needed in the proofs above, and could be replaced by requiring that the sup of the mixed moments of the entries appearing in condition 2 be of order $o(n^{\alpha})$ for any $\alpha > -1$, with exact values for the moments in condition 1. As for Theorem 3, it is only in the case of constant diagonal matrices that we can say that limit distributions exist with all the matrices in Theorem 5. We cannot conclude freeness with these even if we now have reduced to summation over noncrossing π, as $K'(\pi) \ne K(\pi)$ for noncrossing π if $\pi \notin NC(m)_2$, so that (27) is still different from an application of Lemma 9 (see also the comments following Theorem 7).

In the general situation above, it is only in the limit that we retrieve freeness; there is no reason why we should have a similar result for the finite dimensional matrices. It is not difficult to construct matrices with free identically distributed entries that are not free. Examples with freeness for the finite dimensional matrices seem to be limited, but there is an important special case if we choose the $a_n(i,j;k)$ to give rise to R-diagonal pairs:

Theorem 7. If the $a_n(i,j;k)$ above give rise to R-diagonal pairs with defining sequence $\{\frac{\alpha_{k,2m}}{n}\}_m$, then the matrices $A_n(k)$ give rise to R-diagonal pairs with defining sequences $\{\alpha_{k,2m}\}_m$. Moreover, $(A_n(1), A_n(2), \ldots)$ is a ∗-free family. In particular, there is no need to take the limit as in Theorem 5.

Proof. For an arbitrary product of the matrix entries, we use the moment-cumulant formula (3) in (36) instead. The result is that we obtain for a mixed moment with oriented signed partition σ,
$$\frac{1}{n} \sum_{\pi \le \sigma} \#(j_i \text{ giving rise to } \pi) \times \sum_{\substack{\sigma' \le \pi \\ \sigma' \in NC(m)}} [\mathrm{coef}((i_1,g(1)),\ldots,(i_m,g(m)));\sigma']\, R(\mu_{a_n(i,j;1),a_n(i,j;1)^*,a_n(i,j;2),a_n(i,j;2)^*,\ldots}), \tag{38}$$
when we also bring the adjoints into the picture. Changing the order of summation and noting by Lemma 20 that $n^{|K'(\sigma')|}$ choices of j's give rise to some π with $\pi \ge \sigma'$, we obtain
$$\sum_{\sigma' \in NC(m),\ \sigma' \le \sigma} n^{|K'(\sigma')|-1} \times [\mathrm{coef}((i_1,g(1)),\ldots,(i_m,g(m)));\sigma']\, R(\mu_{a_n(i,j;1),a_n(i,j;1)^*,a_n(i,j;2),a_n(i,j;2)^*,\ldots}). \tag{39}$$
From R-diagonality of the $a_n$'s we see that the R-transform coefficient above is $n^{-|\sigma'|}$ times the same coefficient of $R(\mu_{a(1),a(1)^*,a(2),a(2)^*,\ldots})$ with $a(k)$ giving rise to (free) R-diagonal pairs with defining sequence $\alpha_{k,2m}$ (the powers of $n$ enter since the defining
sequences of the $a_n$'s were scaled by $\frac{1}{n}$). As R-diagonality implies that only $\sigma'$ with alternating orientations within the blocks enter in the sum, the $\sigma'$ we work with are clickable (as $\sigma'$ is noncrossing), so that $|K'(\sigma')| - |\sigma'| - 1 = 0$ and the powers of $n$ cancel. The result is that the matrices $A_n(k)$ have the same distribution as the $a(k)$ above, since we have exhibited the cumulants of our distribution. The result follows.

In particular, all R-diagonal limit distributions arise, and in such a way that we need not take the limit. If in the above we also wanted distributions with sets of diagonal matrices $D_n(t_k)$, then we would obtain the quantity $\phi_n(A_n(i_1)^{g(1)} D_n(t_1) \cdots A_n(i_m)^{g(m)} D_n(t_m))$ simply by replacing $n^{|K'(\sigma')|}$ in (39) by $d_n(j_1, j_1; t_1) \cdots d_n(j_m, j_m; t_m)$ and in addition summing over the j's giving rise to some $\pi \ge \sigma'$. Bringing the powers of $n$ from the R-transform coefficient into play, we can sum over the j's and factor out traces as in (24) to see that our mixed moment equals
$$\sum_{\sigma' \le \sigma} [\mathrm{coef}((i_1,g(1)),\ldots,(i_m,g(m)));\sigma']\, R(\mu_{a_n(i,j;1),a_n(i,j;1)^*,a_n(i,j;2),a_n(i,j;2)^*,\ldots})\ [\mathrm{coef}(1,\ldots,m);K'(\sigma')]\, M(\mu_{D_n(t_1),\ldots,D_n(t_m)}). \tag{40}$$
Although we get an exact expression for the joint ∗-distribution with diagonal matrices, we do not obtain freeness with these in general (except in the case with only second order cumulants), as the partition $K'(\sigma')$ appears instead of $K(\sigma')$, so that we can't use Lemma 9 to conclude freeness. Note that if the R-diagonal pairs $a_n(i,j;k)$ in the above are circular, then we can actually find the joint distribution with sets of arbitrary matrices, not just the diagonal ones. Summing over $j_1, \ldots, j_m$ giving rise to some $\pi \ge \sigma$, we obtain as above
$$\sum_{\substack{\sigma' \le \sigma \\ \sigma' \in NC(m)_2}} [\mathrm{coef}((i_1,g(1)),\ldots,(i_m,g(m)));\sigma']\, R(\mu_{a_n(i,j;1),a_n(i,j;1)^*,a_n(i,j;2),a_n(i,j;2)^*,\ldots})\ [\mathrm{coef}(1,\ldots,m);K(\sigma')]\, M(\mu_{D_n(t_1),\ldots,D_n(t_m)}) \tag{41}$$
as $K' = K$ for $\sigma' \in NC(m)_2$, and since we now could factor out traces as in (24). This proves freeness with all $n \times n$ matrices. In [4] this was proved in a different fashion, and under more general assumptions on the $D_n(t_i)$.

If some of the entries $a_n(i,j;k)$ are non-R-diagonal, we see that the proof above breaks down, as we can then have $\sigma'$ without alternating orientation in the blocks, hence nonclickable $\sigma'$. But then formula (39) can hold with no possibility of cancelling powers of $n$ as above for all $\sigma'$. It then seems difficult to produce any matrices which possess freeness except in the limit.

4.1. Random matrices with the same kind of entries in the matrices for all n. Theorem 5 shows how to model many free families by using matrices of free assemblies of identically distributed random variables. We were allowed to choose different types of entries for each $n$. One can ask what one can model if the entries in all the matrices have the same ∗-distribution, subject to some normalization condition. An example of such matrices is given by Shlyakhtenko [13], who used free creation operators within the matrices, and also obtained free creation operators for each $n$ and so also in the limit. Roughly speaking, he showed this by recognizing formula (10) in the
calculations. Shlyakhtenko's example is related to Voiculescu's Proposition 2.8 of [17] for obtaining the semicircular distribution by putting circular and semicircular entries into the matrices. The combinatorics in this case is nicer than in the general case with an arbitrary entry. We will show that this example is actually close to being exhaustive for the possible limit distributions in this setting. We will also show that it is close to being the only example in which the matrices produced give the same distributions for all $n$.

More precisely, given a random variable $a$, we will consider for each $n$ the corresponding matrices $A_n$ with $(i,j)$ entry equal to $\frac{a_{ij}}{\sqrt{n}}$, with $\{a_{ij}\}_{1 \le i,j \le n}$ a ∗-free family of random variables, all having the same ∗-distribution as $a$. We use the normalization factor $\frac{1}{\sqrt{n}}$ for the matrices. The effect of this is, among other things, that the Hilbert-Schmidt norms of the matrices stay bounded.

Corollary 29. Let $A_n(i)$ be random matrices constructed from some random variable $a_i$ as above, the entries of the $A_n(i)$ assumed ∗-free for separate $i$. Then the matrices converge in ∗-distribution if and only if the $a_i$ are centered (i.e. $\phi(a_i) = \phi(a_i^*) = 0$). In this case the R-transform of the limit ∗-distribution is given by $\sum_i \phi(a_i^* a_i) z_i^* z_i + \phi(a_i a_i^*) z_i z_i^*$. In particular, the higher moments of $a_i$ have no influence.

Proof. If the limit distribution exists, $a_i$ must be centered, as $\phi(A_n(i)^2) = \phi(a_i^2) + (n^2 - n)\frac{\phi(a_i)^2}{n}$: here it is easy to calculate the exact number of choices giving $\pi = 1_2$ and $0_2$ in (17); these give rise to the first and second term, respectively. This diverges if $\phi(a_i) \ne 0$. When $a_i$ is centered, it is easy to calculate the ∗-moments of the entries $\frac{a_i}{\sqrt{n}}$ in 1 and 2 of Theorem 5. For instance, we get $\lim_{n\to\infty} n\phi\big((\frac{a_i}{\sqrt{n}})^* (\frac{a_i}{\sqrt{n}})\big) = \phi(a_i^* a_i)$ and
$$\lim_{n\to\infty} n\phi\Big(\Big(\frac{a_i^*}{\sqrt{n}} \frac{a_i}{\sqrt{n}}\Big)^m\Big) = \lim_{n\to\infty} n^{1-m} \phi((a_i^* a_i)^m) = 0$$
for $m > 1$, as $a_i$ has moments of all orders. Thus 1 and 2 are fulfilled, so that the limit distribution exists. As all limits in the sequence of limits are zero except the first one, we see that the R-transform of the limit ∗-distribution is $\phi(a_i^* a_i) z_i^* z_i + \phi(a_i a_i^*) z_i z_i^*$. This is circular if and only if $\phi(a_i^* a_i) = \phi(a_i a_i^*)$.

We see from this that the only way to reproduce in the limit the same distribution as the one we started with is to let $R(\mu_{a_i, a_i^*})(z, z^*) = \phi(a_i^* a_i) z^* z + \phi(a_i a_i^*) z z^*$. If this is the case, one can show, just as Shlyakhtenko did when $a_i$ was a creation operator, that there is no need to take the limit: one obtains the same distribution, and freeness, for all the finite dimensional matrices. This follows from Theorem 7, because if $a_i$ is as above then $\{\frac{a_i}{\sqrt{n}}, \frac{a_i^*}{\sqrt{n}}\}$ has R-transform $\frac{1}{n}\phi(a_i^* a_i) z^* z + \frac{1}{n}\phi(a_i a_i^*) z z^*$ by the properties of the R-transform of dilations of random variables. The entries of the matrices thus have R-transforms with coefficients scaled by $\frac{1}{n}$, which makes Theorem 7 apply. With other choices of $a_i$ one can actually show that there always is a need to take the limit, i.e. the convergence to the limit does not terminate after a finite number of steps. Other choices of $a_i$ are also likely, vaguely speaking, never to produce freeness for all the finite dimensional matrices.

4.2. Infinitely divisible limit distributions. As in the section on matrices with independent entries, there is a connection here between infinite divisibility and the obtainable limit distributions for symmetric matrices. Here the connection is stronger, though:
Proposition 30. The class of limit distributions we can get for symmetric matrices with free entries contains the class of even ⊞-infinitely divisible probability measures with compact support. Moreover, these limits can be obtained in such a way that all the finite dimensional matrices have the same distribution; that is, $\mu_{A_n} = \nu$ for all $n$, with ν the limit distribution and $A_n = \sum_{i,j} a_n(i,j) e(i,j)$ the random matrices. Moreover, the limiting procedure can be carried through with exact realizations of the identity $\nu = \nu_{1/n} ⊞ \cdots ⊞ \nu_{1/n}$ for each $n$ in the following way: Let
$$A_n(l) = \sum_{i,j,\ i+j = l \bmod n} a_n(i,j) e(i,j)$$
with $1 \le l \le n$, so that $A_n = A_n(1) + \cdots + A_n(n)$. Then $\mu_{A_n(l)} = \nu_{1/n}$ for all $n, l$, with the $(A_n(1), \ldots, A_n(n))$ being free for all $n$.

Proof. There is a result similar to Theorem 7 for symmetric matrices, but the entries can't be selfadjoint anymore as in Theorem 6. More precisely, if $R(\nu)(z) = \sum_m \alpha_{2m} z^{2m}$, then this is obtained by choosing $A_n$ with selfadjoint entries $a_n(i,i)$ on the diagonal with R-transform $R(\mu_{a_n(i,i)})(z) = \sum_m \frac{\alpha_{2m}}{n} z^{2m} = R(\nu_{1/n})(z)$, and the upper off-diagonal elements $a_n(i,j)$ as R-diagonal pairs with $\{\frac{\alpha_{2m}}{n}\}_m$ as defining sequences. To see this, note that if we use Theorem 7 with the random matrix $B_n$ having entries $b_n(i,j)$ being free, identically distributed and giving rise to R-diagonal pairs with defining sequences $\{\frac{\alpha_{2m}}{2n}\}_m$, then $B_n + B_n^*$ is selfadjoint with R-transform $\sum_m \alpha_{2m} z^{2m} = R(\nu)(z)$, as can easily be checked. The diagonal entries of $B_n + B_n^*$ are selfadjoint with R-transform $R(\mu_{b_n(i,i)+b_n(i,i)^*})(z) = \sum_{m \ge 1} \frac{\alpha_{2m}}{n} z^{2m}$, and the off-diagonal entries have ∗-distributions with R-transforms
$$R(\mu_{b_n(i,j)+b_n(j,i)^*,\ b_n(j,i)+b_n(i,j)^*})(z, z^*) = R(\mu_{b_n(i,j),b_n(i,j)^*})(z, z^*) + R(\mu_{b_n(j,i)^*,b_n(j,i)})(z, z^*) = \sum_{m \ge 1} \frac{\alpha_{2m}}{n}\big((z^* z)^m + (z z^*)^m\big).$$
Thus, the matrices $A_n = \sum_{i,j} a_n(i,j) e_n(i,j) = B_n + B_n^*$ provide us with symmetric matrices such that $\mu_{A_n} = \nu$ for all $n$.

So, let us concentrate on the last part of the statement, regarding the symmetric matrices $A_n(l)$, $1 \le l \le n$. To see that the $A_n(l)$ are free and that all have the same distribution $\nu_{1/n}$, we will use the following argument, which is due to Alexandru Nica. Nica dealt with more general assumptions on the entries, and I am indebted to Roland Speicher for communicating the argument to me. An odd power of $A_n(l)$ is a matrix with just $n$ nonzero entries, due to the permutation nature of our matrices, with an odd monomial in the $a_n(i,j), a_n(i,j)^*$ in each entry. These all have expectation zero since the entries have even distributions and are free, so that the odd moments of $A_n(l)$ are also 0. Also, $A_n(l)^2$ is diagonal with selfadjoint entries of the form $a_n(i,j) a_n(i,j)^*$ (or possibly $a_n(i,i)^2$) on the diagonal, and these all have the same moments. Thus we see that the even moments of the matrices all coincide with the moments of $a_n(i,i)^2$. This shows that the $A_n(l)$ all have the same distribution $\nu_{1/n}$.
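The index pattern behind this argument can be checked on ordinary matrices. The sketch below is only an illustration of the support structure: it uses a symmetric matrix of positive integers in place of the free random variables $a_n(i,j)$ (an assumption made purely so that the code can run), indexes rows and columns from 0, and uses hypothetical helper names.

```python
def example_matrix(n):
    """Symmetric n x n matrix with distinct positive integer entries (stand-in data)."""
    return [[1 + min(i, j) + n * max(i, j) for j in range(n)] for i in range(n)]


def slice_matrix(a, l):
    """Keep the entries with i + j = l (mod n) and set all others to zero."""
    n = len(a)
    return [[a[i][j] if (i + j) % n == l % n else 0 for j in range(n)]
            for i in range(n)]


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def support(x):
    """Set of index pairs carrying a nonzero entry."""
    n = len(x)
    return {(i, j) for i in range(n) for j in range(n) if x[i][j] != 0}


def is_diagonal(x):
    return all(i == j for (i, j) in support(x))


if __name__ == "__main__":
    n = 5
    a = example_matrix(n)
    for l in range(n):
        s = slice_matrix(a, l)
        square = matmul(s, s)
        cube = matmul(square, s)
        # n nonzero entries, diagonal square, cube supported like the slice
        print(l, len(support(s)), is_diagonal(square), support(cube) == support(s))
```

Each slice has exactly $n$ nonzero entries, its square is diagonal, and its cube is supported on the same positions as the slice, which is the permutation structure used above for the odd and even powers of $A_n(l)$.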
To show that the $A_n(l)$ are also free, we need only take polynomials $C_i = A_n(l_i)^{k_i} - \beta_i I_n$ from alternating subalgebras (i.e. $l_1 \ne l_2$, $l_2 \ne l_3$, ...) with all $C_i$ having expectation zero, and show that $C_1 \cdots C_n$ has expectation zero. But it is easily seen from the above that all entries of the $C_i$ have expectation zero, and that the nonzero terms in products of the entries of the $C_i$ have expectation zero due to the freeness assumption on the entries. Adding things up ends the proof.

Above, we sliced the $A_n$-matrices into $n$ parts by
$$A_n = \sum_l A_n(l) = \sum_l \ \sum_{i,j,\ i+j = l \bmod n} a_n(i,j) e_n(i,j).$$
We could instead try to slice the original matrices $B_n$ by
$$B_n = \sum_l B_n(l) = \sum_l \ \sum_{i,j,\ i-j = l \bmod n} b_n(i,j) e_n(i,j),$$
which means that each slice is parallel to the diagonal instead of perpendicular to it. One can show without much more difficulty that this slicing writes the $B_n$-matrices as sums of free identically distributed R-diagonal pairs.

As in the case with independent entries, the obtainable limit distributions in terms of symmetric matrices must to a certain extent be even and ⊞-infinitely divisible. As before we can make the additional assumption that the $\mu((a_n)_{\mathrm{sym}})^{⊞ n}$ all have supports within the same interval, with $a_n$ denoting the selfadjoint matrix entries. The proof is completely parallel to the proof of Proposition 26, but we now use the characterization of ⊞-infinite divisibility appearing in [1] to deduce that the free cumulant sequence of the limit is the cumulant sequence of an even, infinitely divisible, compactly supported probability measure.

5. The Structure of the Clickable Partitions

To be able to describe the clickable partitions, we will need a result on the spotting of loops in the quotient graph. Let π be an oriented partition, and let π be its quotient graph.

Lemma 31. If, after having done some identifications of edges in obtaining the quotient graph, we have obtained a loop where one edge in the loop is not in the same block as any of the other edges in the loop, then this really gives rise to a loop in the quotient graph when we do the rest of the identifications as well.

Proof. First do the rest of the identifications inside the loop. This may give us shorter loops, but the existence of the edge not being identified with any other assures us that we must end up with at least one loop. Doing the identifications outside the loop may also give shorter loops, but there is no way to "break up" the loop we already have so that all loops disappear in the end. All in all, the quotient graph must contain at least one loop.

This lemma is crucial in obtaining the following properties and the recursive characterization of the clickable partitions:
Lemma 32. The following hold: 1. C(n) = ∅ if n is odd. 2. An oriented partition is clickable if and only if its quotient graph has no loops, i.e. it is a tree. 3. If π = {B1 , ..., Bk } = {{vij }j }i ∈ C(n) with v’s in increasing order, then: a) All |Bi | are even, that is π is an even partition. b) (Alternating property of clickable partitions) The orientation classes of each block Bi are {vi1 , vi3 , ..., vi|Bi |−1 } and {vi2 , vi4 , ..., vi|Bi | }, i.e. we have alternating orientation within the blocks. 4. For an oriented partition π with alternating orientation within its blocks (as in 3b)), the following are equivalent: a) π is clickable. b) For any i, j we have that π| ∪r {vi(2r) + 1, ..., vi(2r+1) − 1} and π| ∪r {vi(2r+1) + 1, ..., vi(2r+2) − 1} are clickable, with no interidentifications between the two collections of edges listed, and all π|{vij + 1, ..., vi(j+1) − 1} also are clickable partitions (here π| means the partition with induced block structure and orientation). Proof. 2): We will not prove this, as it is a well known fact about graphs and trees: the number of vertices in a graph is at most the number of edges +1, and is so only if it is a tree. 3): Assume that π is clickable. We first show b): If vik and vi(k+1) (k + 1 taken mod |Bi |) have the same orientation, we get by identifying these edges a loop. Since none of the v, vik < v < vik+1 are in Bi , this gives us, by Lemma 31, a real loop after having done all the identifications. So the partition is not clickable by part 2, contra assumption. Therefore, the vik must have alternating orientations. a): If one of the |Bi | is odd, then there must be consecutive edges in Bi with the same orientation, which again is contra assumption due to 3b). Therefore a) also follows. This also implies part 1. 4): Let π be oriented with alternating orientation within its blocks. Assume first that π is clickable. First do the identifications within the block Bi . 
This gives us (many) separate loops, one loop for each collection {vij + 1, ..., vi(j+1) − 1} of edges, and each of these loops has to be clickable in order to get no loops in the end (else Lemma 31 would imply that π is not clickable). So we have shown that the π|{vij + 1, ..., vi(j+1) − 1} have to be clickable. The identifications within Bi have led to two connected collections of edges, namely ∪r {vi(2r) + 1, ..., vi(2r+1) − 1} and ∪r {vi(2r+1) + 1, ..., vi(2r+2) − 1}. These two collections cannot have any interidentifications, else we would again have a contradiction to Lemma 31. Each of these collections has to be clickable in order to obtain no loops, as is easily inferred from the quotient graph, so that the two other restrictions of π as in the statement also have to be clickable. Conversely, it is not hard to convince oneself by looking at the quotient graph that all these conditions are enough to get a quotient graph without loops, so that π is clickable.
The condition 3b) indicates a canonical orientation for any even partition. A partition from NC(m)even will automatically be given this orientation in the following. Note the similarity between the recursive characterization 4 of the clickable partitions and the following recursive characterization of noncrossing partitions:
Lemma 33. A partition π is noncrossing if and only if all π|{vij + 1, ..., vi(j+1) − 1} are noncrossing and there are no interidentifications of edges in different such segments.
An easy corollary of the characterizations in Lemma 32 and Lemma 33 is the following:
Corollary 34. The following hold:
1. NC(n)_2 and C(n)_2 coincide.
2. NC(n)even ⊆ C(n), and this inclusion is strict if and only if n ≥ 8.
3. A clickable partition with all blocks, except possibly one, of cardinality two, is noncrossing.
4. In the class of partitions with prescribed even cardinalities for all the blocks (to some values), one can always find clickable and crossing partitions as long as at least two of the prescribed values for the cardinalities are > 2.
Proof. 1 follows from the recursive characterizations of noncrossing and clickable partitions: when all blocks have cardinality two, all identifications get used up within each loop, so we can have no interidentifications between different segments as in the proof of 4 above. 2 is also easily seen from these characterizations. To see that the third statement is correct, do the identifications within the block of cardinality ≠ 2 first. In the loops we then obtain, we can have no interidentifications, as all blocks have cardinality 2 in these loops. Therefore, the partition is noncrossing by Lemma 33. The fourth statement is seen from the simplest possible example in this direction, namely the one appearing in Fig. 2. By building on this example, it is not hard to construct crossing and clickable partitions with prescribed cardinalities of the blocks as above.
References
1. Bercovici, H., Pata, V.: A free analogue of Hincin’s characterization of infinite divisibility.
2. Billingsley, P.: Probability and Measure. New York: Wiley, 1986
3. Dykema, K.: On certain free product factors via an extended matrix model. J. Funct. Anal. 112, 31–60 (1993)
4. Dykema, K., Rørdam, M.: Projections in free product C*-algebras. To appear in Geom. Funct. Anal.
5. Girko, V.L.: Theory of Random Determinants. Amsterdam: Kluwer Academic Publishers, 1990
6. Kreweras, G.: Sur les partitions non-croisées d’un cycle. Discrete Math. 1 (4), 333–350 (1972)
7. Nica, A.: R-transforms of free joint distributions, and non-crossing partitions. J. Funct. Anal. 135 (2), 271–297 (1996)
8. Nica, A., Speicher, R.: A ’Fourier Transform’ for multiplicative functions on non-crossing partitions. Preprint (1995)
9. Nica, A., Speicher, R.: On the multiplication of free n-tuples of noncommutative random variables. Am. J. Math. 118 (4), 799–837 (1996)
10. Nica, A., Speicher, R.: R-diagonal pairs – a common approach to Haar unitaries and circular elements. In: D. V. Voiculescu, editor, Free Probability Theory, Providence, RI: American Mathematical Society, 1997, pp. 149–188
11. Ryan, Ø.: On the construction of free random variables. To appear in J. Funct. Anal.
12. Ryan, Ø.: Asymptotic freeness and Limit Distributions of Random Matrices. 1997 Preprint
13. Shlyakhtenko, D.: Limit Distributions of Matrices with bosonic and fermionic entries. In: D. V. Voiculescu, editor, Free Probability Theory, Providence, RI: American Mathematical Society, 1997, pp. 241–253
14. Speicher, R.: Multiplicative functions on the lattice of noncrossing partitions and free convolution. Math. Ann. 298, 611–628 (1994)
15. Speicher, R.: On universal products. In: D. V. Voiculescu, editor, Free Probability Theory, Providence, RI: American Mathematical Society, 1997, pp. 257–267
16. Voiculescu, D.V.: Addition of certain non-commuting random variables. J. Funct. Anal. 66, 323–335 (1986)
17. Voiculescu, D.V.: Circular and semicircular systems and free product factors. Operator algebras, unitary representations, enveloping algebras and invariant theory. In: Progress in Mathematics 92, Boston: Birkhäuser, 1990
18. Voiculescu, D.V.: Limit laws for random matrices and free products. Inv. Math. 104, 201–220 (1991)
19. Voiculescu, D.V., Dykema, K., Nica, A.: Free random variables. CRM monograph series V. 1, Providence, RI: The American Mathematical Society, 1992
20. Wigner, E.: Characteristic vectors of bordered matrices with infinite dimensions. Ann. Math. 62 (3), 548–564 (1955)
21. Wigner, E.: On the distribution of the roots of certain symmetric matrices. Ann. Math. 67 (2), 325–327 (1958)
Communicated by H. Araki
Commun. Math. Phys. 193, 627 – 641 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
An Analytic Description of the Vector Constrained KP Hierarchy
G.F. Helminck, J.W. van de Leur*
Faculty of Applied Mathematics, University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands. E-mail: [email protected], [email protected]
Received: 9 June 1997 / Accepted: 23 September 1997
Abstract: In this paper we give a geometric description, in terms of the Grassmann manifold of Segal and Wilson, of the reduction of the KP hierarchy known as the vector k-constrained KP hierarchy. We also show in a geometric way that these hierarchies are equivalent to Krichever’s general rational reductions of the KP hierarchy.
1. Introduction
In recent years (vector) constrained KP hierarchies have attracted considerable attention both from the mathematical and the physical community [2–27, 29, 31, 32]. Many interesting integrable systems like the AKNS, Yajima–Oikawa and Melnikov hierarchies appear amongst these constrained families. In the physics literature they are studied in connection with multi-matrix models. The (vector) constrained KP hierarchies were introduced as reductions of the KP hierarchy
$$\frac{\partial L}{\partial t_n} = [(L^n)_+, L], \qquad n \ge 1,$$
for the first order pseudodifferential operator $L = \partial + \sum_{j<0} \ell_j \partial^j$. This reduction consists of assuming that
$$(L^k)_- = \sum_{j=1}^{m} q_j \partial^{-1} r_j,$$
such that the following conditions on the functions $q_j$ and $r_j$ hold:
$$\frac{\partial q_j}{\partial t_n} = (L^n)_+(q_j) \quad\text{and}\quad \frac{\partial r_j}{\partial t_n} = -(L^n)_+^*(r_j) \qquad \text{for all } n \ge 1.$$
* JvdL is financially supported by the Netherlands Organization for Scientific Research (NWO).
In this way it generalizes the well-known Gelfand–Dickey hierarchies ($(L^k)_- = 0$). Much is known about these constrained hierarchies and many of their features have been investigated; e.g., it was shown that they possess a bi-Hamiltonian structure [9, 20, 24, 29, 32], a bilinear representation [13], [21], [22], [32] and Bäcklund–Darboux and Miura transformations [2, 4–7, 10, 23]. However, until recently, the geometry remained unclear. It is well-known that one can associate to a point in an infinite Grassmannian a solution L of the KP hierarchy [28, 30]. In this paper we consider the Segal–Wilson Grassmannian. Let H be the Hilbert space of all square integrable functions on the circle $S^1 = \{z \in \mathbb{C} \mid |z| = 1\}$, which decomposes in a natural way as the direct sum of two infinite dimensional orthogonal closed subspaces $H_+ = \{\sum_{n\ge 0} a_n z^n \in H\}$ and $H_- = \{\sum_{n<0} a_n z^n \in H\}$. The Segal–Wilson Grassmannian Gr(H) consists of all closed subspaces $W \subset H$ such that the orthogonal projection on $H_-$ is a Hilbert–Schmidt operator. In this setting, the k-th Gelfand–Dickey hierarchy has the following simple geometrical interpretation. The KP operator L belongs to the k-th Gelfand–Dickey hierarchy if and only if the corresponding $W \in Gr(H)$ satisfies $z^k W \subset W$. One of the authors gave in [19] (see also [18]) a simple interpretation of the constrained KP hierarchy for the case of polynomial tau-functions, viz. L belongs to the m-vector k-constrained KP hierarchy if and only if the corresponding $W \in Gr(H)$ has a subspace $W'$ of codimension m such that $z^k(W') \subset W$. We show in this paper that the same interpretation also holds in the Segal–Wilson case. Using this geometrical interpretation, we prove in Sect. 5 that the vector constrained KP hierarchy describes the same reduction of KP as the general rational reductions of Krichever [17] (see also [15]). Our geometrical interpretation is also useful for constructing solutions of these hierarchies (see e.g. [19]).
2. The KP Hierarchy Revisited
In this section we recall some results for the KP-hierarchy that we will need in this paper. The KP hierarchy starts with a commutative ring R and a privileged derivation ∂ of R. In order to be able to take roots of differential operators in ∂ with coefficients from R, one extends the ring $R[\partial]$ to the ring $R[\partial, \partial^{-1})$ of pseudodifferential operators with coefficients in R. It consists of all expressions
$$\sum_{i=-\infty}^{N} a_i \partial^i, \qquad a_i \in R \ \text{for all } i,$$
that are added in an obvious way and multiplied according to
$$\partial^j \circ a\,\partial^i = \sum_{k=0}^{\infty} \binom{j}{k}\, \partial^k(a)\, \partial^{i+j-k}.$$
Each operator $P = \sum_j p_j \partial^j$ decomposes as $P = P_+ + P_-$ with $P_+ = \sum_{j\ge 0} p_j \partial^j$ its differential operator part and $P_- = \sum_{j<0} p_j \partial^j$ its integral operator part. We denote by $\mathrm{Res}_\partial P = p_{-1}$ the residue of P. On $R[\partial, \partial^{-1})$ we have an anti-algebra morphism called taking the adjoint. The adjoint of $P = \sum_i p_i \partial^i$ is given by
$$P^* = \sum_i (-\partial)^i p_i.$$
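As a small illustration of the multiplication rule and of the adjoint, the following worked computation takes a single coefficient $a \in R$ and writes out the cases $j = 1$ and $j = -1$. It is only a sketch: the prime notation $a' = \partial(a)$ is introduced here for readability and is not used in the paper.

```latex
\begin{align*}
% j = 1: only k = 0, 1 contribute, since \binom{1}{k} = 0 for k \ge 2
\partial \circ a &= a\,\partial + a', \\
% j = -1: \binom{-1}{k} = (-1)^k, so the series does not terminate
\partial^{-1} \circ a &= a\,\partial^{-1} - a'\,\partial^{-2} + a''\,\partial^{-3} - \cdots, \\
% hence the residue of \partial^{-1} a is the coefficient of \partial^{-1}
\mathrm{Res}_\partial(\partial^{-1} a) &= a, \\
% adjoint of a first order operator
(a\,\partial)^* &= (-\partial)\,a = -a\,\partial - a'.
\end{align*}
```

The second line shows why one needs formal series in negative powers of $\partial$: moving a coefficient to the left of $\partial^{-1}$ produces infinitely many lower order terms.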
Further one has a set of derivations $\{\partial_n \mid n \ge 1\}$ of R that commute with ∂. The equations of the hierarchy can be formulated in a compact way in a set of relations for a so-called Lax operator in $R[\partial, \partial^{-1})$, i.e. an operator of the form
$$L = \partial + \sum_{j<0} \ell_j \partial^j, \qquad \ell_j \in R \ \text{for all } j < 0. \tag{2.1}$$
These equations are
$$\partial_n(L) = \sum_{j<0} \partial_n(\ell_j)\,\partial^j = [(L^n)_+, L], \qquad n \ge 1. \tag{2.2}$$
Since this equation for n = 1 boils down to $\partial_1(\ell_j) = \partial(\ell_j)$ for all j, we assume from now on that $\partial = \partial_1$. Equation (2.2) has at least the trivial solution $L = \partial$ and can be seen as the compatibility equation of the linear system
$$L\psi = z\psi \quad\text{and}\quad \partial_n(\psi) = (L^n)_+(\psi). \tag{2.3}$$
One needs a context in which the actions of (2.3) make sense and that allows one to derive (2.2) from (2.3). For the trivial solution (2.3) becomes
$$\partial\psi = z\psi \quad\text{and}\quad \partial_n\psi = z^n\psi \qquad \text{for all } n \ge 1.$$
Hence if one takes $\partial_n = \frac{\partial}{\partial t_n}$, then the function $\gamma(z) = \exp(\sum_{i\ge 1} t_i z^i)$ is a solution. The space M of the so-called oscillating functions for which we make sense of (2.3) can be seen as a collection of perturbations of this solution. It is defined as
$$M = \Big\{ \Big(\sum_{i\le N} a_i z^i\Big) e^{\sum t_i z^i} \;\Big|\; a_i \in R \ \text{for all } i \Big\}.$$
The space M becomes an $R[\partial, \partial^{-1})$-module by the natural extension of the actions
$$b\Big\{\Big(\sum_j a_j z^j\Big) e^{\sum t_i z^i}\Big\} = \Big(\sum_j b\,a_j z^j\Big) e^{\sum t_i z^i},$$
$$\partial\Big\{\Big(\sum_j a_j z^j\Big) e^{\sum t_i z^i}\Big\} = \Big(\sum_j \partial(a_j) z^j + \sum_j a_j z^{j+1}\Big) e^{\sum t_i z^i}.$$
It is even a free $R[\partial, \partial^{-1})$-module, since we have
$$\Big(\sum_j p_j \partial^j\Big) e^{\sum t_i z^i} = \Big(\sum_j p_j z^j\Big) e^{\sum t_i z^i}.$$
An element ψ in M is called an oscillating function of type $z^\ell$ if it has the form
$$\psi(z) = \Big\{ z^\ell + \sum_{j<\ell} \alpha_j z^j \Big\} e^{\sum t_i z^i}.$$
The fact that M is a free $R[\partial, \partial^{-1})$-module permits one to show that each oscillating function of type $z^\ell$ that satisfies (2.3) gives a solution of (2.2). This function is then called a wavefunction of the KP-hierarchy. Segal and Wilson give in [30] an analytic approach to construct wavefunctions of the KP-hierarchy. They considered the Hilbert space
$$H = \Big\{ \sum_{n\in\mathbb{Z}} a_n z^n \;\Big|\; a_n \in \mathbb{C},\ \sum_{n\in\mathbb{Z}} |a_n|^2 < \infty \Big\},$$
with decomposition $H = H_+ \oplus H_-$, where
$$H_+ = \Big\{ \sum_{n\ge 0} a_n z^n \in H \Big\} \quad\text{and}\quad H_- = \Big\{ \sum_{n<0} a_n z^n \in H \Big\}$$
and inner product $\langle \cdot \mid \cdot \rangle$ given by
$$\Big\langle \sum_{n\in\mathbb{Z}} a_n z^n \;\Big|\; \sum_{m\in\mathbb{Z}} b_m z^m \Big\rangle = \sum_{n\in\mathbb{Z}} a_n \overline{b_n}.$$
To this decomposition is associated the Grassmannian Gr(H) consisting of all closed subspaces W of H such that the orthogonal projection $p_+ : W \to H_+$ is Fredholm and the orthogonal projection $p_- : W \to H_-$ is Hilbert–Schmidt. The connected components of Gr(H) are given by
$$Gr^{(\ell)}(H) = \{ W \in Gr(H) \mid p_+ : z^\ell W \to H_+ \ \text{has index zero} \}.$$
On each of these components we have a natural action by multiplication of the group of commuting flows
$$\Gamma_+ = \Big\{ \exp\Big(\sum_{i\ge 1} t_i z^i\Big) \;\Big|\; t_i \in \mathbb{C},\ \sum |t_i| (1+\epsilon)^i < \infty \ \text{for some } \epsilon > 0 \Big\}.$$
Now we take for R the ring of meromorphic functions on $\Gamma_+$ and for $\partial_n$ the partial derivative w.r.t. $t_n$. Then there exists for each W in $Gr^{(-\ell)}(H)$ a wavefunction $\psi_W$ of type $z^\ell$ that is defined on a dense open subset of $\Gamma_+$ and that takes values in W. Moreover, it is known that the range of $\psi_W$ spans a dense subspace of W. Hence, if we write $\psi_W = P_W \cdot e^{\sum t_i z^i}$ with $P_W \in R[\partial, \partial^{-1})$, then $L_W = P_W \partial P_W^{-1}$ is a solution of the KP-hierarchy. Each component of Gr(H) generates in this way the same set of solutions of the KP-hierarchy, so it would suffice, as is done in [30], to consider only $Gr^{(0)}(H)$. However, it is more convenient here to consider all components. A subsystem of the KP-hierarchy consists of all solutions L that are the k-th root of a differential operator. This gives solutions of the KP-hierarchy that do not depend on the $\{t_{kn}, \text{ with } n \ge 1\}$. Those operators satisfy the condition $L^k = (L^k)_+$. The set of equations corresponding to this condition is called the k-th Gelfand–Dickey hierarchy. Now it has been shown that, among the solutions coming from the Segal–Wilson Grassmannian, the ones that satisfy the k-th Gelfand–Dickey hierarchy are exactly characterized by $z^k W \subset W$. In the next section we consider a generalization of this condition.
3. An Extension of the Condition $z^k W \subset W$
In this section we consider, for each k and m in $\mathbb{N} = \{0, 1, 2, \dots\}$, $k \ne 0$, subspaces W in Gr(H) that possess the m-Vector k-Constrained (mVkC)-Condition:
$$\text{There is a subspace } W' \text{ of } W \text{ of codimension } m \text{ such that } z^k(W') \subset W. \tag{3.1}$$
This is a natural generalization of the condition that describes inside Gr(H) the solutions of the k-th Gelfand–Dickey hierarchy. We will show here in a geometric way how you can
associate to each W, satisfying the mVkC-condition, 2m functions $\{q_j \mid 1 \le j \le m\}$ and $\{r_j \mid 1 \le j \le m\}$ for which the following equations hold:
$$\partial_n(q_j) = (L_W^n)_+(q_j) \qquad \text{for all } n \ge 1, \tag{3.2}$$
$$\partial_n(r_j) = -(L_W^n)_+^*(r_j) \qquad \text{for all } n \ge 1. \tag{3.3}$$
Here $A^*$ denotes the adjoint of A in $R[\partial, \partial^{-1})$. Moreover $L_W$ satisfies
$$L_W^k = (L_W^k)_+ + \sum_{j=1}^{m} q_j \partial^{-1} r_j. \tag{3.4}$$
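To make (3.2)–(3.4) concrete, the following worked computation takes the smallest case m = 1, k = 1, so that $L = \partial + q\,\partial^{-1} r$, and writes out the flow n = 2. It is a sketch under the assumption that q and r are functions of $x = t_1$ and $t_2$; the subscript notation $q_x$, $q_{xx}$ is introduced here for illustration and does not appear in the paper.

```latex
\begin{align*}
% square of L = \partial + q \partial^{-1} r
L^2 &= \partial^2 + \partial\, q\,\partial^{-1} r + q\,\partial^{-1} r\,\partial
       + q\,\partial^{-1} r\, q\,\partial^{-1} r, \\
% use \partial q = q\partial + q_x and r\partial = \partial r - r_x
\partial\, q\,\partial^{-1} r &= q r + q_x\,\partial^{-1} r, \qquad
q\,\partial^{-1} r\,\partial = q r - q\,\partial^{-1} r_x, \\
% only the differential operator part enters the flow
(L^2)_+ &= \partial^2 + 2 q r, \qquad (L^2)_+^{*} = \partial^2 + 2 q r, \\
% equations (3.2) and (3.3) for n = 2
\partial_2 q &= q_{xx} + 2 q^2 r, \qquad
\partial_2 r = -\,r_{xx} - 2 q r^2 .
\end{align*}
```

The resulting pair of equations is a coupled system of AKNS type, which is consistent with the remark in the Introduction that the AKNS hierarchy appears among the constrained families.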
At the same time we will give links with the paper of Zhang [31]. Take any W in $Gr^{(-\ell)}(H)$ that satisfies the mVkC-condition. It is no restriction to assume that the m occurring in (3.1) is optimal, i.e. there is an orthonormal basis $\{u_1, \dots, u_m\}$ of the orthocomplement of $W'$ in W such that $(\mathrm{Span}\{z^k u_1, \dots, z^k u_m\}) \cap W = \{0\}$. Since multiplication with z is unitary, the vectors $\{z^k(u_1), \dots, z^k(u_m)\}$ are an orthonormal basis of the orthocomplement of W in $z^k W + W$. To the space W we associate the subspaces $W_j = W \oplus \mathbb{C} z^k u_j$, $1 \le j \le m$. Clearly the $W_j$ all belong to $Gr^{(-\ell+1)}(H)$ and hence they have wavefunctions $\psi_{W_j}$ of type $z^{\ell-1}$, i.e.
$$\psi_{W_j} = \psi_{W_j}(t, z) = \Big\{ z^{\ell-1} + \sum_{s\ge 1} a_{js}(t) z^{\ell-1-s} \Big\} e^{\sum t_i z^i}. \tag{3.5}$$
Recall that $\psi_{W_j}(t, z)$ is well-defined for all t belonging to the open dense subset
$$\Gamma_+^{W_j} = \Big\{ \gamma(z) = \exp\Big(\sum t_i z^i\Big) \in \Gamma_+ \;\Big|\; \gamma^{-1} W_j \ \text{is transverse to } z^{\ell-1} H_+ \Big\}.$$
On $\Gamma_+^{W_j}$ we consider the function
$$s_j(t) = \langle \psi_{W_j}(t, z) \mid z^k u_j \rangle. \tag{3.6}$$
Since the vectors $\{\psi_{W_j}(t, z) \mid t \in \Gamma_+^{W_j}\}$ are lying dense in $W_j$ and m was assumed to be optimal, the functions $\{s_j\}$ do not vanish. Hence, on a dense open subset of $\Gamma_+$, there is defined the function
$$\varphi_j = \frac{1}{s_j}\,\psi_{W_j} := r_j\,\psi_{W_j}. \tag{3.7}$$
It takes values in $W_j$ and has moreover the following useful property:
$$\varphi_j(t) - z^k u_j \in W, \tag{3.8}$$
for all t in a dense open subset of $\Gamma_+$. This property is a consequence of the facts that $\varphi_j(t) - z^k u_j$ is by construction orthogonal to $z^k u_j$ and that W is the orthocomplement of $\mathbb{C} z^k u_j$ inside $W_j$. In [31], similar functions $\{\varphi_j\}$ are introduced, only not using the
geometry, but as solutions of a certain system of differential equations. In particular, we can dispose of the condition (a) in the proposition of [31]. Thus we have obtained m functions $\{r_j\}$. To define the $\{q_j\}$ we consider
$$z^k \psi_W - (L_W^k)_+(\psi_W) = (L_W^k)_-(\psi_W) = \Big\{ \sum_{s\ge 0} b_s(t) z^{\ell-1-s} \Big\} e^{\sum t_i z^i}. \tag{3.9}$$
For each j, $1 \le j \le m$, we have a function $q_j$ on $\Gamma_+^{W_j}$,
$$q_j(t) = \langle z^k \psi_W(t, z) - (L_W^k)_+ \psi_W(t, z) \mid z^k u_j \rangle = \langle z^k \psi_W(t, z) \mid z^k u_j \rangle = \langle \psi_W(t, z) \mid u_j \rangle.$$
Because m is optimal, the functions $\{q_j\}$ are non-zero on an open dense subset of $\Gamma_+$. Since $u_j$ does not depend on t and since $\frac{\partial}{\partial t_n}\psi_W = (L_W^n)_+(\psi_W)$, we get directly for $q_j$,
$$\frac{\partial q_j}{\partial t_n} = \Big\langle \frac{\partial}{\partial t_n}(\psi_W)(t, z) \;\Big|\; u_j \Big\rangle = \langle (L_W^n)_+(\psi_W(t, z)) \mid u_j \rangle = (L_W^n)_+(\langle \psi_W \mid u_j \rangle) = (L_W^n)_+(q_j). \tag{3.10}$$
Thus Eqs. (3.2) for the derivatives of the $\{q_j\}$ are clear. Those for the $\{r_j\}$ require more work. First we derive an expression for $(L_W^k)_-(\psi_W)$. To this end we consider
$$\Phi(t) = z^k \psi_W - (L_W^k)_+(\psi_W) - \sum_{j=1}^{m} q_j \varphi_j. \tag{3.11}$$
The function $\varphi_j$ takes values in $W_j$, the function $(L_W^k)_+(\psi_W)$ takes values in the space W, and $z^k \psi_W$ takes values in $z^k W$. Hence we have that $\Phi(t)$ belongs to $W + z^k W$ for all relevant t. By construction we have that for all j, $1 \le j \le m$, $\Phi(t)$ is orthogonal to $z^k u_j$, hence $\Phi(t)$ even belongs to W. From the form of the $\varphi_j$, we see that on an open dense set of $\Gamma_+$ one has
$$\Phi(t) = \Big\{ \sum_{s\ge 0} c_s z^{\ell-1-s} \Big\} e^{\sum t_i z^i}.$$
By construction, there holds $W \cap (z^\ell H_+)^\perp \gamma(z) = \{0\}$, so that we arrive at
$$z^k \psi_W - (L_W^k)_+(\psi_W) = \sum_{j=1}^{m} q_j \varphi_j. \tag{3.12}$$
This equation is part of the system of differential equations for the $\varphi_j$ as used in [31]. Recall that $\varphi_j$ has the form
$$\varphi_j = \{ r_j z^{\ell-1} + \text{lower order terms in } z \} e^{\sum t_i z^i}.$$
Hence,
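$$\frac{\partial \varphi_j}{\partial x} = \frac{\partial \varphi_j}{\partial t_1} = \{ r_j z^{\ell} + \text{lower order terms} \} e^{\sum t_i z^i}.$$
On the other hand we know that $\varphi_j(t) - z^k u_j$ belongs to W for all t. Thus $\frac{\partial \varphi_j}{\partial x}(t)$ also belongs to W. In W we have that
$$\frac{\partial \varphi_j}{\partial x} - r_j \psi_W = \Big\{ \sum_{s\ge 0} \alpha_s z^{\ell-1-s} \Big\} e^{\sum t_i z^i} \in (z^\ell H_+)^\perp \gamma,$$
and this has to be zero. By definition we have $\varphi_j = r_j \psi_{W_j}$ and differentiation w.r.t. x gives
$$\psi_W = \frac{1}{r_j}\,\partial(r_j \psi_{W_j}) = (r_j^{-1} \partial\, r_j)(\psi_{W_j}). \tag{3.13}$$
Consequently, we have for $\varphi_j$,
$$\varphi_j = r_j \psi_{W_j} = r_j (r_j^{-1} \partial^{-1} r_j)\psi_W = \partial^{-1} r_j \psi_W.$$
Now we substitute this in Eq. (3.12) and obtain
$$(L_W^k)_-(\psi_W) = \Big\{ \sum_{j=1}^{m} q_j \partial^{-1} r_j \Big\} \psi_W. \tag{3.14}$$
Since the pseudodifferential operators act freely on wavefunctions, we see that $L_W$ and the functions $\{q_j\}$ and $\{r_j\}$ are exactly connected by Eq. (3.4):
$$(L_W^k)_- = \sum_{j=1}^{m} q_j \partial^{-1} r_j.$$
What remains to be shown is the differential Eq. (3.3) for the $r_j$. As $\varphi_j(t) - z^k u_j$ belongs to W, it follows that for all $n \ge 1$, $\frac{\partial \varphi_j}{\partial t_n}(t)$ lies in W. Recall that
$$\varphi_j = \{ r_j z^{\ell-1} + \text{lower order terms in } z \} e^{\sum t_i z^i}.$$
Then we have
$$\frac{\partial \varphi_j}{\partial t_n} = \{ r_j z^{n+\ell-1} + \text{lower order terms} \} e^{\sum t_i z^i} = \{ r_j \partial^{n-1} \}\psi_W + \Big\{ \sum_{s\ge 0} \alpha_s z^{n-1+\ell-s} \Big\} e^{\sum t_i z^i} = A_{nj}(\psi_W) + \Big\{ \sum_{s\ge 0} \beta_s z^{\ell-1-s} \Big\} e^{\sum t_i z^i},$$
with $A_{nj}$ a uniquely determined differential operator in ∂ of order n − 1 and with leading coefficient $r_j$. Since both $\frac{\partial \varphi_j}{\partial t_n}$ and $A_{nj}(\psi_W)$ are lying in W, we get
$$\frac{\partial \varphi_j}{\partial t_n} - A_{nj}(\psi_W) = 0 = W \cap (z^\ell H_+)^\perp \gamma(z).$$
On the other hand we know that $\varphi_j = \partial^{-1} r_j \psi_W$ and this leads to
$$A_{nj}(\psi_W) = \partial^{-1} \frac{\partial r_j}{\partial t_n} \psi_W + \partial^{-1} r_j (L_W^n)_+(\psi_W). \tag{3.15}$$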
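This gives an expression for $A_{nj}$ in $L_W$ and $r_j$,
$$A_{nj} = \partial^{-1} \Big( \frac{\partial r_j}{\partial t_n} + r_j (L_W^n)_+ \Big).$$
By taking the residue in ∂ of the operators in this equation, we see that
$$\mathrm{Res}_\partial(A_{nj}) = 0 = \frac{\partial r_j}{\partial t_n} + \mathrm{Res}_\partial(\partial^{-1} r_j (L_W^n)_+) = \frac{\partial r_j}{\partial t_n} + (L_W^n)_+^*(r_j).$$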
The last equality is a direct consequence of the following property of residues of pseudodifferential operators.
Lemma 3.1. In the ring $R[\partial, \partial^{-1})$ of pseudodifferential operators with coefficients in R, we have for each f in R and $P = \sum_{j\le N} p_j \partial^j$ in $R[\partial, \partial^{-1})$,
$$\mathrm{Res}_\partial(\partial^{-1} f P) = (P^*)_+(f),$$
where $(P^*)_+ = \sum_{0\le j\le N} (-\partial)^j p_j$ is the differential operator part of the adjoint of P.
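Before the proof, a quick check of Lemma 3.1 on the simplest non-trivial case may be helpful. The computation below takes the first order operator $P = p\,\partial$ with $p \in R$; the abbreviation $g = f p$ and the prime notation $g' = \partial(g)$ are introduced here only for this illustration.

```latex
\begin{align*}
% left-hand side: expand \partial^{-1} g with g = f p, then multiply by \partial
\partial^{-1} f P &= \partial^{-1} g\,\partial
   = \bigl(g\,\partial^{-1} - g'\,\partial^{-2} + \cdots\bigr)\partial
   = g - g'\,\partial^{-1} + \cdots, \\
\mathrm{Res}_\partial(\partial^{-1} f P) &= -g' = -(f p)', \\
% right-hand side: P^* = (-\partial) p is already a differential operator
(P^*)_+(f) &= (-\partial)(p f) = -(p f)' .
\end{align*}
```

Both sides agree, as the lemma asserts.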
Proof. First we recall that $\mathrm{Res}_\partial$ behaves as follows w.r.t. taking the adjoint $P^* = \sum_{j\le N} (-\partial)^j p_j$ of P:
$$\mathrm{Res}_\partial(P^*) = -\mathrm{Res}_\partial P.$$
This is easily reduced to operators of the form $a\partial^n$, $n \in \mathbb{Z}$. Next one notices that it suffices to prove the equality in the lemma for differential operators. The left-hand side for such a P transforms as
$$\mathrm{Res}_\partial(\partial^{-1} f P) = -\mathrm{Res}_\partial(P^* f (-\partial)^{-1}) = \mathrm{Res}_\partial(P^* f \partial^{-1}).$$
As $P^* f$ is a differential operator with constant term $P^*(f)$, this gives the proof of the lemma.
So we have shown that each $r_j$ satisfies Eq. (3.3):
$$\frac{\partial r_j}{\partial t_n} = -(L_W^n)_+^*(r_j),$$
and we can conclude that $L_W$, the $\{q_j\}$ and the $\{r_j\}$ form a solution of the m-vector k-constrained KP-hierarchy.
4. The Main Theorem
In this section we will prove the converse of the result from the foregoing section and thus come to the main theorem. So we start with a W in $Gr^{(-\ell)}(H)$ and functions $\{q_j\}$ and $\{r_j\}$, all defined on a dense open subset of $\Gamma_+$, such that the Eqs. (3.2), (3.3) and (3.4) are satisfied. We will show that such a W fulfills the mVkC-condition from Sect. 3. Recall that there is a unique pseudodifferential operator $P_W$ such that $\psi_W = P_W(e^{\sum t_i z^i})$. It has the form
$$P_W = \partial^\ell + \sum_{j<\ell} p_j \partial^j = \Big\{ 1 + \sum_{s<0} p_{\ell+s} \partial^s \Big\} \partial^\ell. \tag{4.1}$$
It is not difficult to see that the fact that $\psi_W$ is a wavefunction is equivalent to $P_W$ satisfying the Sato-Wilson equations
$$\frac{\partial P_W}{\partial t_n} P_W^{-1} = -(P_W \partial^n P_W^{-1})_-, \tag{4.2}$$
where $P_-$ denotes the integral operator part $\sum_{i<0} p_i \partial^i$ of the element $P = \sum p_j \partial^j$ in $R[\partial, \partial^{-1})$. Next we consider for each j, $1 \le j \le m$, the operators $Q_j$ and $R_j$ defined by
$$Q_j := q_j \partial q_j^{-1} P_W \quad\text{and}\quad R_j := r_j^{-1} \partial^{-1} r_j P_W. \tag{4.3}$$
We want to show that the $Q_j$ and the $R_j$ also satisfy the Sato-Wilson equations. To do so, we need some properties of the ring $R[\partial, \partial^{-1})$ of pseudodifferential operators with coefficients from R. We summarize them in a lemma.
Lemma 4.1. If f belongs to R and Q to $R[\partial, \partial^{-1})$, then the following identities hold:
(a) $(Qf)_- = Q_- f$,
(b) $(fQ)_- = f Q_-$,
(c) $\mathrm{Res}_\partial(Qf) = \mathrm{Res}_\partial(fQ) = f\,\mathrm{Res}_\partial(Q)$,
(d) $(\partial Q)_- = \partial Q_- - \mathrm{Res}_\partial(Q)$,
(e) $(Q\partial)_- = Q_- \partial - \mathrm{Res}_\partial(Q)$,
(f) $(Q\partial^{-1})_- = Q_- \partial^{-1} + \mathrm{Res}_\partial(Q\partial^{-1})\partial^{-1}$,
(g) $(\partial^{-1} Q)_- = \partial^{-1} Q_- + \partial^{-1} \mathrm{Res}_\partial(Q^* \partial^{-1})$.
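As an illustration of identities (d) and (e), the following check uses the sample operator $Q = a\,\partial^{-1} + b$ with $a, b \in R$. This operator, and the prime notation $a' = \partial(a)$, are chosen here purely for illustration and are not taken from the paper.

```latex
\begin{align*}
% data of the sample operator
Q_- &= a\,\partial^{-1}, \qquad \mathrm{Res}_\partial(Q) = a, \\
% identity (d): compute both sides
\partial Q &= a + a'\,\partial^{-1} + b\,\partial + b'
  \;\Longrightarrow\; (\partial Q)_- = a'\,\partial^{-1}, \\
\partial Q_- - \mathrm{Res}_\partial(Q) &= \bigl(a + a'\,\partial^{-1}\bigr) - a = a'\,\partial^{-1}, \\
% identity (e): compute both sides
Q\,\partial &= a + b\,\partial \;\Longrightarrow\; (Q\partial)_- = 0, \\
Q_-\,\partial - \mathrm{Res}_\partial(Q) &= a - a = 0 .
\end{align*}
```

In both cases the residue term removes exactly the order zero contribution that $\partial Q_-$ or $Q_- \partial$ picks up from the $\partial^{-1}$ coefficient.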
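Since the proof of this lemma consists of straightforward calculations, we leave this to the reader. Now we can show
Proposition 4.1. The operators $Q_j$ and $R_j$, $1 \le j \le m$, satisfy the Sato-Wilson equations.
Proof. If we denote $\frac{\partial}{\partial t_n}$ by $\partial_n$, then we get for $Q_j = q_j \partial q_j^{-1} P_W$ that
$$\partial_n(Q_j) Q_j^{-1} = \partial_n(q_j \partial q_j^{-1})\, q_j \partial^{-1} q_j^{-1} + q_j \partial q_j^{-1}\, \partial_n(P_W) P_W^{-1}\, q_j \partial^{-1} q_j^{-1}$$
$$= -q_j \partial q_j^{-1} (L_W^n)_- q_j \partial^{-1} q_j^{-1} + \partial_n(q_j \partial q_j^{-1})\, q_j \partial^{-1} q_j^{-1}.$$
Now we apply successively the identities from Lemma 4.1 to the first operator of the right-hand side:
$$q_j \partial q_j^{-1} (L_W^n)_- q_j \partial^{-1} q_j^{-1} = q_j \partial (q_j^{-1} L_W^n q_j)_- \partial^{-1} q_j^{-1}$$
$$= q_j \partial (q_j^{-1} L_W^n q_j \partial^{-1})_- q_j^{-1} - q_j \partial\, \mathrm{Res}_\partial(q_j^{-1} L_W^n q_j \partial^{-1})\, \partial^{-1} q_j^{-1}$$
$$= q_j (\partial q_j^{-1} L_W^n q_j \partial^{-1})_- q_j^{-1} + q_j\, \mathrm{Res}_\partial(q_j^{-1} L_W^n q_j \partial^{-1})\, q_j^{-1} - q_j \partial\, \mathrm{Res}_\partial(q_j^{-1} L_W^n q_j \partial^{-1})\, \partial^{-1} q_j^{-1}$$
$$= (q_j \partial q_j^{-1} L_W^n q_j \partial^{-1} q_j^{-1})_- + q_j^{-1}\, \mathrm{Res}_\partial(L_W^n q_j \partial^{-1}) - q_j \partial q_j^{-1}\, \mathrm{Res}_\partial(L_W^n q_j \partial^{-1})\, \partial^{-1} q_j^{-1}.$$
By applying Lemma 3.1 to these last two residues we get
$$(q_j \partial q_j^{-1} L_W^n q_j \partial^{-1} q_j^{-1})_- + (L_W^n)_+(q_j)\, q_j^{-1} - q_j \partial q_j^{-1} (L_W^n)_+(q_j)\, \partial^{-1} q_j^{-1}.$$
On the other hand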
$$\partial_n(q_j \partial q_j^{-1})\, q_j \partial^{-1} q_j^{-1} = \partial_n(q_j)\, q_j^{-1} - q_j \partial q_j^{-2} \partial_n(q_j)\, q_j \partial^{-1} q_j^{-1}.$$
Thus we see that, if $\partial_n(q_j) = (L_W^n)_+(q_j)$, the operator $Q_j$ satisfies the Sato-Wilson equation
$$\partial_n(Q_j) Q_j^{-1} = -(Q_j \partial^n Q_j^{-1})_-. \tag{4.4}$$
For $R_j$, we proceed in a similar fashion:
$$\partial_n(R_j) R_j^{-1} = -r_j^{-1} \partial^{-1} r_j (L_W^n)_- r_j^{-1} \partial r_j + \partial_n(r_j^{-1} \partial^{-1} r_j)\, r_j^{-1} \partial r_j$$
$$= -r_j^{-1} \partial^{-1} (r_j L_W^n r_j^{-1})_- \partial r_j - \partial_n(r_j)\, r_j^{-1} + r_j^{-1} \partial^{-1} (\partial_n(r_j)\, r_j^{-1})\, \partial r_j.$$
Now we successively apply Lemma 4.1 (g) and (c) and (4.2) to the first term of the right-hand side of this equation:
$$-r_j^{-1} \partial^{-1} (r_j L_W^n r_j^{-1})_- \partial r_j = -r_j^{-1} \{ (\partial^{-1} r_j L_W^n r_j^{-1})_- - \partial^{-1} \mathrm{Res}_\partial(r_j^{-1} (L_W^n)_+^* r_j \partial^{-1}) \} \partial r_j$$
$$= -r_j^{-1} (\partial^{-1} r_j L_W^n r_j^{-1})_- \partial r_j + r_j^{-1} \partial^{-1} r_j^{-1} (L_W^n)_+^*(r_j)\, \partial r_j$$
$$= -r_j^{-1} \{ (\partial^{-1} r_j L_W^n r_j^{-1} \partial)_- + \mathrm{Res}_\partial(\partial^{-1} r_j L_W^n r_j^{-1}) \} r_j + r_j^{-1} \partial^{-1} r_j^{-1} (L_W^n)_+^*(r_j)\, \partial r_j$$
$$= -(r_j^{-1} \partial^{-1} r_j L_W^n r_j^{-1} \partial r_j)_- - r_j^{-1} (L_W^n)_+^*(r_j) + r_j^{-1} \partial^{-1} r_j^{-1} (L_W^n)_+^*(r_j)\, \partial r_j.$$
Since $\partial_n(r_j) = -(L_W^n)_+^*(r_j)$, we see that the last two terms cancel $\partial_n(r_j^{-1} \partial^{-1} r_j)\, r_j^{-1} \partial r_j$ and thus we have obtained the Sato-Wilson equation for $R_j$,
$$\partial_n(R_j) R_j^{-1} = -(R_j \partial^n R_j^{-1})_-. \tag{4.5}$$
This concludes the proof of Proposition 4.1.
This proposition has some important consequences. Since the $\{r_j\}$ and the $\{q_j\}$ are non-zero on a dense open subset of $\Gamma_+$, we define on such a subset of $\Gamma_+$ oscillating functions $\psi_{Q_j}$ and $\psi_{R_j}$ of type $z^{\ell+1}$ resp. $z^{\ell-1}$ by
$$\psi_{Q_j} = q_j \partial q_j^{-1} \cdot \psi_W \quad\text{and}\quad \psi_{R_j} = r_j^{-1} \partial^{-1} r_j \cdot \psi_W. \tag{4.6}$$
In fact $Q_j$ and $R_j$ are Bäcklund–Darboux transformations of the KP hierarchy. To be more precise, we conclude from Proposition 4.1:
Corollary 4.1. The functions $\psi_{Q_j}$ and $\psi_{R_j}$ are wavefunctions of planes $W_{Q_j}$ and $W_{R_j}$. Moreover we have the following codimension 1 inclusions:
$$W_{Q_j} \subset W \quad\text{and}\quad W \subset W_{R_j}.$$
Proof. From the Sato-Wilson equations one deduces directly that for all $n \ge 1$,
$$\partial_n \psi_{Q_j} = (Q_j \partial^n Q_j^{-1})_+ \psi_{Q_j} \quad\text{and}\quad \partial_n \psi_{R_j} = (R_j \partial^n R_j^{-1})_+ \psi_{R_j}.$$
This shows the first part of the claim. Consider the following subspace in Gr(H): $W_{Q_j}$ = the closure of $\mathrm{Span}\{\psi_{Q_j}(t, z)\}$. The inclusions between the spaces W and $W_{Q_j}$ follow from the first relation of (4.6) and the fact that the values of a wavefunction corresponding to an element of Gr(H)
are lying dense in that space. Since for a suitable γ in $\Gamma_+$ the orthogonal projections of $\gamma^{-1} W_{R_j}$ on $z^\ell H_+$ resp. $\gamma^{-1} W$ on $z^{\ell+1} H_+$ have a one dimensional kernel, one obtains the codimension one result. For the inclusions between the spaces W and $W_{R_j}$ we consider the adjoint wavefunctions
$$\psi_W^* = P_W^{*-1} e^{-\sum t_i z^i} \quad\text{and}\quad \psi_{R_j}^* = -r_j \partial r_j^{-1} \psi_W^*.$$
Since the complex conjugate $\overline{z \psi_W^*(t, z)}$ of $z \psi_W^*(t, z)$ corresponds to the space $W^\perp$, the same argument as before shows the codimension 1 inclusion:
$$W_j := \text{the closure of } \mathrm{Span}\{\overline{z \psi_{R_j}^*(t, z)}\} \subset W^\perp.$$
Hence $\psi_{R_j}(t, z)$ corresponds to $W_j^\perp$, which must be $W_{R_j}$ = the closure of $\mathrm{Span}\{\psi_{R_j}(t, z)\}$. This concludes the proof of the corollary.
Now we can formulate the main results of this paper.
Theorem 4.1. Let W be a plane in Gr(H) and let $L_W$ be the corresponding solution of the KP-hierarchy. Then for $m, k \in \mathbb{N}$, $k \ne 0$, the following 2 conditions are equivalent:
(a) The space W satisfies the mVkC-condition.
(b) There exist functions $\{q_j \mid 1 \le j \le m\}$ and $\{r_j \mid 1 \le j \le m\}$ defined on an open dense subset of $\Gamma_+$ such that the following conditions are fulfilled:
(i) $\partial_n(q_j) = (L_W^n)_+(q_j)$ for all $n \ge 1$,
(ii) $\partial_n(r_j) = -(L_W^n)_+^*(r_j)$ for all $n \ge 1$,
(iii) $L_W^k = (L_W^k)_+ + \sum_{j=1}^{m} q_j \partial^{-1} r_j$.
Proof. In Sect. 3 it has been shown that (a) implies (b). So we assume from now on (b). The relation (b) (iii) leads to
$$L_W^k(\psi_W) = z^k \psi_W = (L_W^k)_+(\psi_W) + \sum_{j=1}^{m} q_j \partial^{-1} r_j \psi_W = (L_W^k)_+(\psi_W) + \sum_{j,\ r_j \ne 0} q_j r_j\, r_j^{-1} \partial^{-1} r_j \psi_W = (L_W^k)_+(\psi_W) + \sum_{j,\ r_j \ne 0} q_j r_j \psi_{R_j}.$$
Thus we see with the usual density argument that
$$z^k W \subset W + \sum_j W_{R_j} = \sum_j W_{R_j} = \tilde{W}.$$
Since W has codimension one in each $W_{R_j}$, we see that the codimension of W in $\tilde{W}$ is ≤ m. Let $W_1$ be the orthocomplement of W in $\tilde{W}$ and $p_1 : H \to W_1$ the orthogonal projection on $W_1$. Inside W we consider
$$W^1 = \{ w \in W \mid p_1(z^k w) = 0 \}.$$
Since $\dim(W_1) \le m$, we see that $W^1$ is a subspace of W of codimension ≤ m and by construction $z^k W^1 \subset W$. This completes the proof of the theorem.
5. General Rational Reductions of the KP Hierarchy
We are now going to connect the vector constrained KP hierarchy to reductions of the KP hierarchy introduced by Krichever [17]. For that purpose we assume that W is a plane in Gr(H) that satisfies the mVkC-condition, where we choose m to be as small as possible for that plane. Let $L_W = P_W \partial P_W^{-1}$, with $P_W$ of the form (4.1), be the corresponding solution of the KP hierarchy and let $W^1 \subset W$ be the subspace of codimension m such that $W_1 = z^k W^1 \subset W$. Notice first that $W_1$ is a subspace of W and $z^k W$ of codimension k + m and m, respectively. Hence there exist differential operators $L_1$ and $L_2$ of order k + m and m, respectively, such that
$$L_1 \psi_W = \psi_{W_1}, \qquad L_2 z^k \psi_W = \psi_{W_1} \tag{5.1}$$
and that $\psi_{W_1}$ is again a wavefunction. From (5.1) one immediately deduces that
$$L_W^k = L_2^{-1} L_1. \tag{5.2}$$
We first prove the following lemma.
Lemma 5.1. Let $L = P \partial^k P^{-1}$ be a pseudodifferential operator of order k and let $L_1$ and $L_2$ be differential operators of order k + m and m, respectively, such that $L = L_2^{-1} L_1$. Then one has the following identities:
$$L_1 (L_2^{-1} L_1)^{i/k} = (L_1 L_2^{-1})^{i/k} L_1, \qquad L_2 (L_2^{-1} L_1)^{i/k} = (L_1 L_2^{-1})^{i/k} L_2.$$
Proof. Since $L_1 P = L_2 P \partial^k$, one can find a pseudodifferential operator Q of the same order as P such that $L_1 = Q \partial^{k+m} P^{-1}$, $L_2 = Q \partial^m P^{-1}$, and thus $L_1 L_2^{-1} = Q \partial^k Q^{-1}$. Since also $L_2^{-1} L_1 = P \partial^k P^{-1}$, one finds that their k-th roots satisfy
$$(L_2^{-1} L_1)^{1/k} = P \partial P^{-1}, \qquad (L_1 L_2^{-1})^{1/k} = Q \partial Q^{-1}.$$
Using this, one easily verifies the identities of the Lemma.
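The verification left to the reader at the end of the proof can be written out as follows. This is a sketch that uses only the factorizations $L_1 = Q\,\partial^{k+m} P^{-1}$ and $L_2 = Q\,\partial^{m} P^{-1}$ from the proof, together with the fact that powers of $\partial$ commute with each other.

```latex
\begin{align*}
% i-th powers of the k-th roots
(L_2^{-1}L_1)^{i/k} &= P\,\partial^{i} P^{-1}, \qquad
(L_1L_2^{-1})^{i/k} = Q\,\partial^{i} Q^{-1}, \\
% first identity
L_1 (L_2^{-1}L_1)^{i/k}
  &= Q\,\partial^{k+m} P^{-1}\, P\,\partial^{i} P^{-1}
   = Q\,\partial^{i}\,\partial^{k+m} P^{-1}
   = Q\,\partial^{i} Q^{-1}\, Q\,\partial^{k+m} P^{-1}
   = (L_1L_2^{-1})^{i/k} L_1, \\
% second identity, same argument with \partial^{m} in place of \partial^{k+m}
L_2 (L_2^{-1}L_1)^{i/k}
  &= Q\,\partial^{m} P^{-1}\, P\,\partial^{i} P^{-1}
   = Q\,\partial^{i} Q^{-1}\, Q\,\partial^{m} P^{-1}
   = (L_1L_2^{-1})^{i/k} L_2 .
\end{align*}
```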
Since both $\psi_W$ and $\psi_{W_1}$ are wavefunctions that are connected by Eqs. (5.1), we find, using (5.2) and Lemma 5.1, that
$$L_W = (L_2^{-1} L_1)^{1/k} \quad\text{and}\quad L_{W_1} = L_1 (L_2^{-1} L_1)^{1/k} L_1^{-1} = (L_1 L_2^{-1})^{1/k}. \tag{5.3}$$
Hence
$$\partial_i \psi_{W_1} = ((L_1 L_2^{-1})^{i/k})_+ \psi_{W_1} = ((L_1 L_2^{-1})^{i/k})_+ L_1 \psi_W,$$
and on the other hand this is also equal to
$$\partial_i (L_1 \psi_W) = \partial_i(L_1) \psi_W + L_1 ((L_2^{-1} L_1)^{i/k})_+ \psi_W,$$
from which one deduces that
$$\partial_i L_1 = ((L_1 L_2^{-1})^{i/k})_+ L_1 - L_1 ((L_2^{-1} L_1)^{i/k})_+. \tag{5.4}$$
In a similar way one obtains from the other identity of (5.1) that
$$\partial_i L_2 = ((L_1 L_2^{-1})^{i/k})_+ L_2 - L_2 ((L_2^{-1} L_1)^{i/k})_+. \tag{5.5}$$
Notice that in this way we have exactly obtained Krichever’s general rational reductions of the KP hierarchy [17]. Krichever considers KP pseudodifferential operators L of the
Analytic Description of Vector Constrained KP Hierarchy
form (2.1), such that $L^k = L_2^{-1} L_1$, where $L_1$ and $L_2$ are coprime differential operators of order k + m and m, respectively. It can be shown that Eqs. (5.4) and (5.5) for $L_1$ and $L_2$ are equivalent to the KP Lax equations for L. It is not difficult to see that our operators must be coprime, since we have chosen our m to be minimal. We will now prove that the converse also holds, i.e., that the following theorem holds.

Theorem 5.1. Let W be a plane in Gr(H) and let $L_W$ be the corresponding solution of the KP hierarchy. Then for $m, k \in \mathbb{N}$, $k \ne 0$, the following two conditions are equivalent:
(a) The space W satisfies the mVkC-condition, with m as minimal as possible.
(b) There exist coprime differential operators $L_1$ and $L_2$ of order k + m and m, respectively, such that the following conditions are fulfilled:
(i) $L_W^k = L_2^{-1} L_1$,
(ii) $\partial_i L_1 = ((L_1 L_2^{-1})^{i/k})_+ L_1 - L_1 ((L_2^{-1} L_1)^{i/k})_+$,
(iii) $\partial_i L_2 = ((L_1 L_2^{-1})^{i/k})_+ L_2 - L_2 ((L_2^{-1} L_1)^{i/k})_+$.

Proof. We have already shown that (a) implies (b). So we assume from now on (b). Let $\psi_1$ be the oscillating function $L_1 \psi_W$; then by using Lemma 5.1:
\[ (L_1 L_2^{-1})^{1/k} \psi_1 = (L_1 L_2^{-1})^{1/k} L_1 \psi_W = L_1 (L_2^{-1} L_1)^{1/k} \psi_W = z L_1 \psi_W = z \psi_1. \]
Now consider
\[ \partial_i \psi_1 = \partial_i(L_1) \psi_W + L_1 \partial_i \psi_W \]
\[ \phantom{\partial_i \psi_1} = \big( ((L_1 L_2^{-1})^{i/k})_+ L_1 - L_1 ((L_2^{-1} L_1)^{i/k})_+ + L_1 ((L_2^{-1} L_1)^{i/k})_+ \big) \psi_W \]
\[ \phantom{\partial_i \psi_1} = ((L_1 L_2^{-1})^{i/k})_+ L_1 \psi_W = ((L_1 L_2^{-1})^{i/k})_+ \psi_1. \]
Hence $\psi_1$ is again a wavefunction of the KP hierarchy. If we let $W_1$ be the closure of the span of the $\psi_1(t, z)$, then $\psi_{W_1} = \psi_1$. Since $z^k \psi_W$ is also a wavefunction, $L_2 z^k \psi_W = \psi_{W_1}$. Thus we see with the usual density argument that
\[ W_1 \subset z^k W \text{ of codimension } m, \qquad W_1 \subset W \text{ of codimension } k + m. \tag{5.6} \]
Hence $W^1 = z^{-k} W_1$ is a subset of W of codimension m such that $z^k W^1 \subset W$. Since our differential operators are coprime, one cannot find lower order operators $M_1$ and $M_2$ such that $L_W^k = M_2^{-1} M_1$. Hence there is no smaller subspace $W_1$ and no smaller m such that (5.6) is satisfied.

As a consequence of this, we obtain that in the Segal–Wilson setting, the vector constrained KP hierarchy and Krichever's general rational reduction define the same reduction of the KP hierarchy.

Acknowledgement. We would like to thank Igor Krichever for sending us an early preprint version of his paper [17], and especially Henrik Aratyn, for sending us [1]. In this paper he presents his proof that the vector constrained KP and Krichever's general rational reductions of KP describe the same hierarchies. His proof is based on kernels of differential operators and properties of Wronskians and is quite different from the proof given in this paper.
References
1. Aratyn, H.: The constrained KP hierarchy as ratio of differential operators. Preprint to appear in the Proceedings of the 1997 UIC workshop on Supersymmetry and Integrable Models, Springer Lecture Notes in Physics 16 (1998)
2. Aratyn, H.: Integrable Lax hierarchies, their symmetry reductions and multi-matrix models. hep-th 9503211
3. Aratyn, H., Ferreira, L., Gomes, J.F., Zimerman, A.H.: Constrained KP models as integrable matrix hierarchies. J. Math. Phys. 38, 1559 (1997), (hep-th 9509096)
4. Aratyn, H., Gomes, J.F., Zimerman, J.F.: Affine Lie algebraic origin of constrained KP hierarchies. J. Math. Phys. 36, 3419 (1995), (hep-th 9408104)
5. Aratyn, H., Nissimov, E. and Pacheva, S.: Virasoro symmetry of constrained KP hierarchies. Phys. Letters 228A, 164 (1997), (hep-th 9602068)
6. Aratyn, H., Nissimov, E. and Pacheva, S.: Constrained KP hierarchies: Additional symmetries, Darboux–Bäcklund solutions and relations to multi-matrix models. Int. J. Mod. Phys. A12, 1265–1340 (1997), (hep-th 9607234)
7. Aratyn, H., Nissimov, E. and Pacheva, S.: Method of squared eigenfunction potentials in integrable hierarchies of KP type. solv-int 9701017
8. Aratyn, H., Nissimov, E., Pacheva, S. and Zimerman, A.H.: Two-matrix string model as constrained (2+1)-dimensional integrable system. Int. J. Mod. Phys. A10, 2537 (1995), (hep-th 9407017)
9. Cheng, Yi: Constraints of the Kadomtsev–Petviashvili hierarchy. J. Math. Phys. 33, 3747–3782 (1992)
10. Cheng, Yi: Modifying the KP, the nth constrained KP hierarchies and their Hamiltonian structures. Commun. Math. Phys. 171, 661–682 (1995)
11. Cheng, Yi, Strampp, W. and Zhang, You-Jin: Bilinear Bäcklund transformations for the KP and k-constrained KP hierarchy. Phys. Lett. A182, 71–76 (1993)
12. Cheng, Yi, Strampp, W. and Zhang, Bin: Constraints of the KP hierarchy and multilinear forms. Commun. Math. Phys. 168, 117–135 (1995)
13. Cheng, Yi, Zhang, You-Jin: Bilinear equations for the constrained KP hierarchy. Inverse Problems 10, L11–L17 (1994)
14. Dickey, L.A.: On the constrained KP. Letters Math. Phys. 34, 379–384 (1995)
15. Dickey, L.A.: On the constrained KP hierarchy II. Letters Math. Phys. 35, 229–236 (1995)
16. Dickey, L., Strampp, W.: On new identities for KP Baker functions and their application to constrained hierarchies. Preprint
17. Krichever, I.: General rational reductions of the KP hierarchy and their symmetries. Funct. Anal. Appl. 29, 75–80 (1995)
18. van de Leur, J.: A geometrical interpretation of the constrained KP hierarchy. Preprint
19. van de Leur, J.: The vector constrained KP hierarchy and Sato's Grassmannian. J. of Geom. and Phys. 23, 83–96 (1997), (q-alg 9609001)
20. Liu, Q.P.: Bi-hamiltonian structures of coupled AKNS hierarchy and coupled Yajima–Oikawa hierarchy. J. Math. Phys. 37, 2307–2314 (1996)
21. Loris, I., Willox, R.: Bilinear form and solutions of the k-constrained Kadomtsev–Petviashvili hierarchy. Inverse Problems 13, 411–420 (1997)
22. Loris, I., Willox, R.: On solutions of constrained KP equations. J. Math. Phys. 38, 283–291 (1997)
23. Mas, J., Ramos, E.: The constrained KP hierarchy and the generalised Miura transformation. Phys. Letters B351, 194–199 (1995), (q-alg 9501009)
24. Oevel, W., Strampp, W.: Constrained KP hierarchies and bi-hamiltonian structures. Commun. Math. Phys. 157, 51–81 (1993)
25. Oevel, W., Strampp, W.: Wronskian solutions of the constrained Kadomtsev–Petviashvili hierarchy. J. Math. Phys. 37, 6213–6219 (1996)
26. Orlov, A.Yu.: Symmetries for unifying different soliton systems into a single integrable hierarchy. Preprint IINS/Oce04/03
27. Orlov, A.Yu.: Volterra operator algebra Zero curvature representation. Universality of KP. In: Nonlinear Processes in Physics, Proceeding of the III Potsdam-V Kiev Workshop at Clarkson Univ., Potsdam, N.Y., USA, eds. A.S. Fokas, D.J. Kaup, A.C. Newell and V.E. Zakharov, Springer series in Nonlinear Dynamics, Berlin: Springer Verlag, 1991, pp. 126–131
28. Sato, M.: Soliton equations as dynamical systems on infinite dimensional Grassmann manifolds. Res. Inst. Math. Sci. Kokyuroku 439, 30–46 (1981)
29. Sidorenko, J. and Strampp, W.: Multicomponent integrable reductions in the Kadomtsev–Petviashvilli hierarchy. J. Math. Phys. 34, 1429–1446 (1993)
30. Segal, G. and Wilson, G.: Loop groups and equations of KdV type. Publ. Math. IHES 63, 1–64 (1985)
31. Zhang, Y.-J.: On Segal–Wilson's construction for the τ-functions of the constrained KP hierarchies. Letters in Math. Physics 36, 1–15 (1996)
32. Zhang, You-jin and Cheng, Yi: Solutions for the vector k-constrained KP hierarchy. J. Math. Phys. 35, 5869–5884 (1994)

Communicated by T. Miwa
Commun. Math. Phys. 193, 643 – 660 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
On the Determinant of One-Dimensional Elliptic Boundary Value Problems*
Matthias Lesch, Jürgen Tolksdorf
Institut für Mathematik, Humboldt-Universität zu Berlin, Unter den Linden 6, D-10099 Berlin, Germany. E-mail: [email protected], [email protected]
Received: 12 August 1997 / Accepted: 23 September 1997
Abstract: We discuss the ζ-regularized determinant of elliptic boundary value problems on a line segment. Our framework is applicable for separated and non-separated boundary conditions.

1. Introduction

In [BFK1, BFK2, BFK3], Burghelea, Friedlander and Kappeler calculated the ζ-regularized determinant of elliptic differential operators on a line segment with periodic and separated boundary conditions. In [BFK2] they also discussed pseudodifferential operators over $S^1$, e.g. on [0, 1] with periodic boundary conditions. In [L2], the first named author of this paper studied the ζ-regularized determinant of second order Sturm–Liouville operators with regular singularities at the boundary. The common phenomenon of [BFK1, BFK2, BFK3] and of [L2] is that the ζ-regularized determinant is expressed in terms of a determinant (in the sense of linear algebra) of an endomorphism of a finite-dimensional vector space of solutions of the corresponding homogeneous differential equation. In this paper we want to show that this phenomenon remains valid for arbitrary (e.g. non-separated, non-periodic) boundary conditions and that there exists a simple proof which works for all types of boundary conditions simultaneously. However, our result is less explicit than the results of [BFK1, BFK3] for periodic and separated boundary conditions, respectively. The reason is that for arbitrary boundary conditions we could not prove a general deformation result for the variation of the leading coefficient. On the other hand, while [BFK3] is limited to even order operators, we deal with operators of arbitrary order (see also the discussion at the end of Sect. 3.1). The main feature of our approach is the new proof of the variation formula, Proposition 3.1 below, which uses the explicit formulas for the resolvent kernel. This together
This work was supported by Deutsche Forschungsgemeinschaft.
with some general considerations about ζ-regularized determinants and regularized limits (Sect. 2.3) easily give the main results, Theorem 3.2, Theorem 3.3 and Theorem 3.7. Since this paper may be viewed as the second part of [L2], we refer the reader to the end of Sect. 1 of loc. cit. for a more detailed historical discussion of ζ-regularized determinants for one-dimensional operators. Nevertheless, we try to keep this paper notationally as self-contained as possible. The paper is organized as follows: In Sect. 2 we introduce some notation and review the basic facts about the ζ-regularized determinant of an operator. In Sect. 3 we state and prove our main results. 2. Generalities
2.1. Regularized integrals. First we briefly recall regularized limits and integrals (cf. [L1, (1.8)-(1.13c)]). Let $f : \mathbb{R}_+ \to \mathbb{C}$ be a function having an asymptotic expansion
\[ f(x) \sim_{x \to 0+} \sum_{\Re\alpha \le 0} x^{\alpha} P_\alpha(\log x) + o(1), \tag{2.1} \]
where $P_\alpha \in \mathbb{C}[t]$ are polynomials and $P_\alpha = 0$ for all but finitely many α. Then we put
\[ \mathrm{LIM}_{x \to 0} f(x) := P_0(0). \tag{2.2} \]
If f has an expansion like (2.1) as $x \to \infty$ then $\mathrm{LIM}_{x\to\infty}$ is defined likewise. Next let $f : \mathbb{R}_+ \to \mathbb{C}$ be such that
\[ f(x) = \sum_{\alpha} x^{\alpha} P_\alpha(\log x) + f_1(x) = \sum_{\beta} x^{\beta} Q_\beta(\log x) + f_2(x), \tag{2.3} \]
with $P_\alpha, Q_\beta \in \mathbb{C}[t]$, $f_1 \in L^1_{\mathrm{loc}}([0, \infty))$, $f_2 \in L^1([1, \infty))$. Then, we define
\[ \fint_0^\infty f(x)\,dx := \mathrm{LIM}_{a \to 0} \int_a^1 f(x)\,dx + \mathrm{LIM}_{b \to \infty} \int_1^b f(x)\,dx. \tag{2.4} \]
We note that for a slightly more restricted class of functions, this regularized integral can also be defined by the Mellin transform ([BS], [L1, Sect. 2.1], [L2, (1.12)]). Note that
\[ \fint_0^\infty x^{\alpha} \log^k x\,dx = 0 \tag{2.5} \]
for $\alpha \in \mathbb{C}$, $k \in \mathbb{Z}_+$ (cf. [L2, (1.13 a-c)]).

2.2. Boundary value problems on a line segment. We consider a linear differential operator
\[ l := \sum_{k=0}^{n} a_k(x)\,D^k, \qquad D := -i\,\frac{d}{dx}, \tag{2.6} \]
defined on the bounded interval I := [a, b] with matrix coefficients $a_k \in C^\infty(I, M(m, \mathbb{C}))$. We assume (2.6) to be elliptic, i.e. $\det a_n(x) \ne 0$, $x \in I$. A priori, the differential operator l acts on $C^\infty(I, \mathbb{C}^m)$. We consider the following boundary condition:
\[ B(f) := R_a \begin{pmatrix} f(a) \\ f'(a) \\ \vdots \\ f^{(n-1)}(a) \end{pmatrix} + R_b \begin{pmatrix} f(b) \\ f'(b) \\ \vdots \\ f^{(n-1)}(b) \end{pmatrix} = 0, \tag{2.7} \]
where $R_a, R_b \in M(nm, \mathbb{C})$ are matrices of size nm × nm. We denote by $L := l_B$ the differential operator (2.6), acting on the domain
\[ \mathcal{D}(L) := \{ f \in H^n(I, \mathbb{C}^m) \mid B(f) = 0 \}. \tag{2.8} \]
Let $\phi : I \to M(nm, \mathbb{C})$ be the fundamental matrix of l, which means that φ is the solution of the initial value problem
\[ \phi'(x) = A\,\phi(x), \qquad \phi(a) = 1_{nm}. \tag{2.9} \]
Here, $A \in C^\infty(I, M(nm, \mathbb{C}))$ is the matrix
\[ A := \begin{pmatrix} 0 & 1_m & 0 & \cdots & 0 \\ 0 & 0 & 1_m & \cdots & 0 \\ \vdots & \vdots & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1_m \\ \beta_0 & \beta_1 & \beta_2 & \cdots & \beta_{n-1} \end{pmatrix}, \tag{2.10} \]
where $\beta_k \equiv -\alpha_n^{-1} \alpha_k := -(-i)^{-n} a_n^{-1} (-i)^k a_k$, $k = 0, \dots, n-1$, i.e. $\alpha_k := (-i)^k a_k$, $k = 0, \dots, n$, and $1_m \in M(m, \mathbb{C})$ denotes the m × m unit matrix. Sometimes we also write φ(x; l) to make the dependence on the operator (2.6) explicit. We introduce the matrices
\[ R := R(l, R_a, R_b) := R_a + R_b\,\phi(b; l), \tag{2.11} \]
\[ R(z) := R(l + z, R_a, R_b) := R_a + R_b\,\phi(b; l + z). \tag{2.12} \]
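As a concrete illustration of (2.11) and (2.12), the following sketch (not taken from the paper) evaluates R(z) for the scalar operator $l = -d^2/dx^2$ on an interval of length c with Dirichlet conditions f(a) = f(b) = 0. The names `fundamental_matrix`, `det_R`, `RA_DIRICHLET` and `RB_DIRICHLET` are assumptions made for this example, and the closed form used for φ(b; l + z) is the one that appears in (3.16). In this case det R(z) = sinh(c√z)/√z, which vanishes exactly at z = −(πk/c)², k = 1, 2, ..., that is, precisely where L + z fails to be invertible.

```python
import cmath
import math


def fundamental_matrix(z, c):
    """phi(b; l + z) for l = -d^2/dx^2 with m = 1 on an interval of length c."""
    w = cmath.sqrt(z)
    if w == 0:
        # Limit w -> 0 of the matrix below.
        return [[1.0, c], [0.0, 1.0]]
    return [
        [cmath.cosh(c * w), cmath.sinh(c * w) / w],
        [w * cmath.sinh(c * w), cmath.cosh(c * w)],
    ]


def matmul2(x, y):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


def det2(x):
    """Determinant of a 2 x 2 matrix."""
    return x[0][0] * x[1][1] - x[0][1] * x[1][0]


def det_R(z, ra, rb, c):
    """det R(z) with R(z) = R_a + R_b * phi(b; l + z), cf. (2.12)."""
    prod = matmul2(rb, fundamental_matrix(z, c))
    r = [[ra[i][j] + prod[i][j] for j in range(2)] for i in range(2)]
    return det2(r)


# Dirichlet conditions: B(f) = (f(a), f(b))^t in the notation of (2.7).
RA_DIRICHLET = [[1, 0], [0, 0]]
RB_DIRICHLET = [[0, 0], [1, 0]]

if __name__ == "__main__":
    c = 1.0
    # At minus the first Dirichlet eigenvalue, R(z) is singular up to rounding.
    print(abs(det_R(-(math.pi / c) ** 2, RA_DIRICHLET, RB_DIRICHLET, c)))
    # Away from the spectrum the determinant is nonzero.
    print(det_R(1.0, RA_DIRICHLET, RB_DIRICHLET, c))
```

Other boundary conditions can be explored by changing the two 2 × 2 matrices passed as `ra` and `rb`.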
It is a well-known fact that the operator L is invertible if and only if the matrix R is invertible. In this case the inverse operator $L^{-1}$ is an integral operator with kernel
\[ K(x, y) = \begin{cases} -\big[\phi(x)\,R^{-1} R_b\,\phi(b)\,\phi^{-1}(y)\big]_{1n}\,\alpha_n(y)^{-1} & \text{if } y > x, \\ -\big[\phi(x)\,(R^{-1} R_b\,\phi(b) - 1)\,\phi^{-1}(y)\big]_{1n}\,\alpha_n(y)^{-1} & \text{if } y < x. \end{cases} \tag{2.13} \]
Here, $[\ \cdot\ ]_{1n}$ means the upper right entry of an n × n block matrix. Note that $K(x, y) \in M(m, \mathbb{C})$.

2.3. The ζ-regularized determinant. We briefly discuss ζ-regularized determinants in an abstract setting. Let H be a Hilbert space and let L be an (unbounded) operator in H. For α < β we denote by
\[ C_{\alpha,\beta} := \{ z \in \mathbb{C} \setminus \{0\} \mid \alpha \le \arg z \le \beta \} \tag{2.14} \]
a sector in the complex plane. We assume that the operator L has θ as a principal angle. By this we mean that there exists an ε > 0 such that
\[ \operatorname{spec} L \cap C_{\theta-\varepsilon,\theta+\varepsilon} = \emptyset. \tag{2.15} \]
Furthermore, we assume that
\[ \|(L - z)^{-1}\|_{\mathcal{L}(H)} \le c\,|z|^{-1}, \qquad z \in C_{\theta-\varepsilon,\theta+\varepsilon}, \quad |z| \ge R, \tag{2.16} \]
where $(L - z)^{-1}$ is trace class and there is an asymptotic expansion in $C_{\theta-\varepsilon,\theta+\varepsilon}$ as $z \to \infty$,
\[ \mathrm{Tr}(L - z)^{-1} \sim_{z \to \infty} \sum_{\Re\alpha \ge -1-\delta} z^{\alpha} P_\alpha(\log z) + o(|z|^{-1-\delta}), \tag{2.17} \]
where, again, $P_\alpha \in \mathbb{C}[t]$ are polynomials and $P_\alpha \ne 0$ for at most finitely many α. Moreover, we assume that
\[ \deg P_{-1} = 0, \tag{2.18} \]
i.e., there are no terms like $z^{-1} \log^k(z)$, $k \ge 1$. The trace class property of $(L - z)^{-1}$ implies that
\[ \lim_{\substack{z \to \infty \\ z \in C_{\theta-\delta,\theta+\delta}}} \mathrm{Tr}(L - z)^{-1} = 0 \tag{2.19} \]
for any δ < ε. Thus, $P_\alpha = 0$ if $\Re\alpha \ge 0$. In view of (2.16) we can construct the complex powers of the operator L as follows (cf. [Se2, Sect. 1], [Sh, Sect. 10.1]): let $\Gamma = \Gamma_1 \cup \Gamma_2 \cup \Gamma_3$ be the contour in $\mathbb{C}$ with
\[ \Gamma_1 := \{ r\,e^{i(\theta+2\pi)} \mid \rho < r < \infty \}, \quad \Gamma_2 := \{ \rho\,e^{i(\theta+\varphi)} \mid 0 < \varphi < 2\pi \}, \quad \Gamma_3 := \{ r\,e^{i\theta} \mid \rho < r < \infty \}. \tag{2.20} \]
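Here, the contour Γ is traversed such that the set $\mathbb{C} \setminus \{ r\,e^{i\theta} \mid r > \rho \}$ lies "inside" Γ. Moreover, ρ is chosen so small that $\operatorname{spec} L \cap \{ z \in \mathbb{C} \mid |z| \le \rho \} \subset \{0\}$. Then, put for $\Re z < 0$
\[ L_z := \frac{i}{2\pi} \int_\Gamma \lambda^{z}\,(L - \lambda)^{-1}\,d\lambda. \tag{2.21} \]
Here, the complex powers $\lambda^z$ are defined by $(r\,e^{i(\theta+\varphi)})^z := r^z e^{iz(\theta+\varphi)}$, $0 \le \varphi \le 2\pi$. The same proof as in [Se1, Thm.1] (cf. also [Sh, Prop. 10.1]) shows that $z \mapsto L_z$ is a holomorphic semigroup of bounded operators in the Hilbert space H. For $k \in \mathbb{Z}$, k < 0, we have $L_k = (L_{-1})^{-k}$. Moreover, if $0 \notin \operatorname{spec} L$, then $L_{-1} = L^{-1}$. If $0 \in \operatorname{spec} L$ then $L L_{-1}$ is a projection onto a complementary subspace of ker L. Therefore, we shall write $L^z$ instead of $L_z$. By assumption $(L - z)^{-1}$ is trace class and in view of (2.16) we can estimate the trace norm
\[ \|(L - z)^{-1}\|_{\mathrm{tr}} \le \|(L - z_0)^{-1}\|_{\mathrm{tr}}\,\|(L - z_0)(L - z)^{-1}\| \le \|(L - z_0)^{-1}\|_{\mathrm{tr}}\,\big(1 + |z - z_0|\,\|(L - z)^{-1}\|\big) \le C, \qquad |z| \ge R. \tag{2.22} \]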
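Therefore, the ζ-function of L,
\[ \zeta_{L,\theta}(s) := \sum_{\lambda \in \operatorname{spec}(L) \setminus \{0\}} \lambda^{-s} = \frac{i}{2\pi} \int_\Gamma z^{-s}\,\mathrm{Tr}(L - z)^{-1}\,dz, \tag{2.23} \]
is a holomorphic function for $\Re s > 1$. Furthermore, the asymptotic expansion (2.17) implies that $\zeta_{L,\theta}(s)$ has a meromorphic continuation to $\Re s > -\delta$ with poles in the set $\{\alpha + 1 \mid P_\alpha \ne 0\}$. The order of the pole α + 1 is either $\deg P_\alpha$ if $\alpha + 1 \in \mathbb{Z}$ or $\deg P_\alpha + 1$ if $\alpha + 1 \notin \mathbb{Z}$ (see for instance [BL, Lemma 2.1]). Because of the assumption (2.18), $\zeta_{L,\theta}(s)$ is regular at 0. Following Ray and Singer [RS], we put $\det_\theta L = 0$ if $0 \in \operatorname{spec} L$, and otherwise
\[ \det\nolimits_\theta L := \exp(-\zeta'_{L,\theta}(0)). \tag{2.24} \]
It is convenient to deal with the principal angle θ = π. We therefore consider the operator
\[ \tilde{L} := e^{i(\pi-\theta)} L. \tag{2.25} \]
Obviously, this operator has θ = π as a principal angle and it satisfies (2.16)-(2.18), too. Furthermore,
\[ \tilde{L}^{-s} = e^{is(\theta-\pi)} L^{-s}, \tag{2.26} \]
and thus
\[ \zeta_{\tilde{L},\pi}(s) = e^{is(\theta-\pi)}\,\zeta_{L,\theta}(s). \tag{2.27} \]
Consequently,
\[ \zeta_{L,\theta}(0) = \zeta_{\tilde{L},\pi}(0), \qquad \zeta'_{L,\theta}(0) = \zeta'_{\tilde{L},\pi}(0) + i(\pi - \theta)\,\zeta_{\tilde{L},\pi}(0), \tag{2.28} \]
and therefore
\[ \det\nolimits_\theta L = e^{i(\theta-\pi)\,\zeta_{\tilde{L},\pi}(0)}\,\det\nolimits_\pi \tilde{L}. \tag{2.29} \]
In the sequel we thus assume θ = π. We then write the expansion (2.17) in the form
\[ \mathrm{Tr}(L + x)^{-1} \sim_{x \to \infty} \sum_{\Re\alpha \ge -1-\delta} x^{\alpha} P_\alpha(\log x) + o(x^{-1-\delta}), \qquad x \ge 0. \tag{2.17'} \]
Of course, there exist formulas relating the $P_\alpha$ in (2.17) and the corresponding $P_\alpha$ in (2.17').

Lemma 2.1. Let the operator L be given as above with principal angle θ = π. Then,
\[ \zeta_{L,\pi}(s) = \frac{\sin \pi s}{\pi} \fint_0^\infty x^{-s}\,\mathrm{Tr}(L + x)^{-1}\,dx, \tag{2.30} \]
\[ \zeta'_{L,\pi}(0) = \fint_0^\infty \mathrm{Tr}(L + x)^{-1}\,dx. \tag{2.31} \]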
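Proof. With respect to the decomposition $H = (\ker L) \oplus (\ker L)^{\perp}$, the operator L reads
\[ L = \begin{pmatrix} 0 & T \\ 0 & L_1 \end{pmatrix}, \tag{2.32} \]
where $L_1$ is invertible. In view of (2.5) we have
\[ \fint_0^\infty x^{-s}\,\mathrm{Tr}(L + x)^{-1}\,dx = \fint_0^\infty x^{-s}\,\mathrm{Tr}(L_1 + x)^{-1}\,dx, \tag{2.33} \]
and thus we may assume L to be invertible. From the estimate (2.22) we conclude that the following integral is absolutely convergent for $1 < \Re s < 2$:
\[ \int_0^\infty x^{-s}\,[\mathrm{Tr}(L + x)^{-1} - \mathrm{Tr}(L^{-1})]\,dx = \sum_{\lambda \in \operatorname{spec}(L) \setminus \{0\}} \int_0^\infty x^{-s}\,[(\lambda + x)^{-1} - \lambda^{-1}]\,dx \]
\[ = \sum_{\lambda \in \operatorname{spec}(L) \setminus \{0\}} \fint_0^\infty x^{-s}\,(\lambda + x)^{-1}\,dx = \frac{\pi}{\sin \pi s} \sum_{\lambda \in \operatorname{spec}(L) \setminus \{0\}} \lambda^{-s}. \tag{2.34} \]
Here, we have used (2.5) again. Hence, the first formula is proved. Since (2.17'), (2.18) and [L1, (1.12)] imply
\[ \fint_0^\infty x^{-s}\,\mathrm{Tr}(L + x)^{-1}\,dx = \frac{P_{-1}(0)}{s} + \fint_0^\infty \mathrm{Tr}(L + x)^{-1}\,dx + O(s), \qquad s \to 0, \tag{2.35} \]
we reach the conclusion by noting that $\frac{\sin \pi s}{\pi} = s + O(s^3)$, $s \to 0$.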
Lemma 2.2. Let L be as before, θ = π. Then, we have the asymptotic expansion
\[ \log \det\nolimits_\pi(L + x) \sim_{x \to \infty} \sum_{\Re\alpha \ge -1-\delta} x^{\alpha+1} Q_\alpha(\log x) + O(x^{-\delta}), \tag{2.36} \]
where $P_\alpha = (\alpha + 1)\,Q_\alpha + Q'_\alpha$. Furthermore, $Q_{-1}(\log x) = P_{-1}(0) \log x$. In particular
\[ \mathrm{LIM}_{x \to \infty} \log \det\nolimits_\pi(L + x) = 0. \tag{2.37} \]

Proof. Since $L^{-1}$ is trace class, it follows that $\log \det_\pi(L + x)$ is differentiable and
\[ \frac{d}{dx} \log \det\nolimits_\pi(L + x) = \mathrm{Tr}(L + x)^{-1}. \tag{2.38} \]
Hence,
\[ \mathrm{LIM}_{y \to \infty} \log \det\nolimits_\pi(L + y) - \log \det\nolimits_\pi(L + x) = \fint_x^\infty \mathrm{Tr}(L + y)^{-1}\,dy. \tag{2.39} \]
Comparing this equation for x = 0 with the preceding lemma yields (2.37). Hence,
\[ \log \det\nolimits_\pi(L + x) = -\fint_x^\infty \mathrm{Tr}(L + y)^{-1}\,dy \sim_{x \to \infty} -\sum_{\Re\alpha \ge -1-\delta} \fint_x^\infty y^{\alpha} P_\alpha(\log y)\,dy + O(x^{-\delta}), \tag{2.40} \]
and we reach the conclusion.
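The reader might ask why we made the argument so complicated in order to get the first equality of (2.40). It appears to be a direct consequence of (2.31) via the apparently "trivial" calculation
\[ \log \det\nolimits_\pi(L + x) = -\fint_0^\infty \mathrm{Tr}(L + x + y)^{-1}\,dy = -\fint_x^\infty \mathrm{Tr}(L + y)^{-1}\,dy. \tag{2.41} \]
However, in general for functions f like (2.3) we have
\[ \fint_0^\infty f(x + y)\,dy \ne \fint_x^\infty f(y)\,dy. \tag{2.42} \]
Consequently, some care is in order. Since the operator $L^{-1}$ is trace class, the phenomenon (2.42) does not occur for $\mathrm{Tr}(L + x)^{-1}$. More precisely, if $f \in L^1_{\mathrm{loc}}([0, \infty))$ and
\[ f(x) \sim_{x \to \infty} \sum_{\alpha} x^{\alpha} P_\alpha(\log x), \tag{2.43} \]
then
\[ \fint_0^\infty f(x + y)\,dy - \fint_x^\infty f(y)\,dy = \mathrm{LIM}_{b \to \infty} \int_b^{b+x} f(y)\,dy, \tag{2.44} \]
and in general this vanishes only if $P_\alpha = 0$ for $\alpha \in \mathbb{Z}_+$. As an illustrative example we consider $f(x) := x^{\alpha}$, $\alpha \in \mathbb{Z}$. Then, we get
\[ \fint_0^\infty (x + y)^{\alpha}\,dy = \begin{cases} -\dfrac{x^{\alpha+1}}{\alpha+1} & \text{if } \alpha \in \mathbb{Z}_- \setminus \{-1\}, \\ -\ln x & \text{if } \alpha = -1, \\ 0 & \text{if } \alpha \in \mathbb{Z}_+; \end{cases} \tag{2.45} \]
\[ \fint_x^\infty f(y)\,dy = \begin{cases} -\dfrac{x^{\alpha+1}}{\alpha+1} & \text{if } \alpha \ne -1, \\ -\ln x & \text{if } \alpha = -1. \end{cases} \tag{2.46} \]
Hence,
\[ \fint_0^\infty f(x + y)\,dy - \fint_x^\infty f(y)\,dy = \begin{cases} \dfrac{x^{\alpha+1}}{\alpha+1} & \text{if } \alpha \in \mathbb{Z}_+, \\ 0 & \text{if } \alpha \notin \mathbb{Z}_+. \end{cases} \tag{2.47} \]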
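To see the mechanism in the simplest non-trivial case, take f(y) = y, i.e. α = 1. The following worked computation is a sketch based only on the definitions (2.2) and (2.4), reading the regularized integral from x to ∞ as $\mathrm{LIM}_{b\to\infty}\int_x^b$:
\[ \fint_0^\infty (x + y)\,dy = \mathrm{LIM}_{a \to 0} \Big[ x(1 - a) + \tfrac{1 - a^2}{2} \Big] + \mathrm{LIM}_{b \to \infty} \Big[ x b + \tfrac{b^2}{2} - x - \tfrac12 \Big] = \big( x + \tfrac12 \big) + \big( -x - \tfrac12 \big) = 0, \]
\[ \fint_x^\infty y\,dy = \mathrm{LIM}_{b \to \infty} \Big[ \tfrac{b^2}{2} - \tfrac{x^2}{2} \Big] = -\tfrac{x^2}{2}, \qquad \mathrm{LIM}_{b \to \infty} \int_b^{b+x} y\,dy = \mathrm{LIM}_{b \to \infty} \Big[ b x + \tfrac{x^2}{2} \Big] = \tfrac{x^2}{2}. \]
The difference of the first two quantities is $x^2/2$, which agrees with the right-hand side of (2.44) and with the case α = 1 of (2.47). The regularized limit discards the divergent terms $xb$ and $b^2/2$ but keeps the constant term, and that constant term is exactly what the naive shift of the integration variable misses.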
3. Main Results

From now on we restrict ourselves to boundary value problems on a line segment as introduced in Sect. 2.2. Let (l, B) be an elliptic boundary value problem, $L := l_B$. More precisely, we assume that (l, B) is elliptic in the sense of [Se1, Def.1] and that it satisfies Agmon's condition [Se1, Def.2]. Agmon's condition assures that the coefficient $a_n(x)$ has a certain principal angle, θ. Then we can find an angle θ', arbitrarily close to θ, such that θ' is a principal angle for L and $a_n(x)$. Henceforth we shall write θ for a common principal angle of $a_n(x)$ and L. In short: we will refer to an operator $L = l_B$, defining an elliptic boundary value problem (l, B), as an admissible operator.

3.1. Operators of order ≥ 2. If in addition n ≥ 2 then the conditions (2.16)-(2.18) are fulfilled by the work of Seeley [Se1, Se2]. Namely, (2.16) follows from [Se1, Lemma 15] and by [Se2, Thm.2] we have an asymptotic expansion as $z \to \infty$ in $C_{\theta-\varepsilon,\theta+\varepsilon}$,
\[ \mathrm{Tr}(L - z)^{-1} \sim_{z \to \infty} \sum_{k=0}^{\infty} c_k\,z^{\frac{1-k}{n} - 1}. \tag{3.1} \]
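Equation (2.18) is automatically fulfilled since there are no log-terms in (3.1). Summing up, we see that $\det_\theta L$ is well-defined for n ≥ 2. First order operators are slightly more complicated since in this case $(L - z)^{-1}$ is not of trace class. This problem will be treated separately in Subsect. 3.2. First, we study the behavior of $\det_\theta L$ under deformations of the coefficients of l.

Proposition 3.1. Assume that the coefficients $a_0, \dots, a_{n-2}$ depend smoothly on a parameter t. Let $L_t$ be the corresponding family of operators. If $L_t$ is invertible then we have
\[ \partial_t \log \det\nolimits_\theta L_t = \partial_t \log \det R_t, \tag{3.2} \]
where $R_t := R(l_t, R_a, R_b)$, cf. (2.11).

Proof. The inclusion $H^2([a, b], \mathbb{C}^m) \hookrightarrow L^2([a, b], \mathbb{C}^m)$ is trace class and $\mathcal{D}(L) \subset H^n([a, b], \mathbb{C}^m)$. Hence the operators $D^k L^{-1}$ are trace class, as well, for k = 0, . . . , n − 2. Hence,
\[ \partial_t \log \det\nolimits_\theta L_t = \mathrm{Tr}\big((\partial_t L_t) L_t^{-1}\big) = \sum_{j=0}^{n-2} \int_a^b \mathrm{tr}_{\mathbb{C}^m}\big[\partial_t a_j(t; x)\,(D^j K_t)(x, x)\big]\,dx \]
\[ = \sum_{j=0}^{n-2} \int_a^b \mathrm{tr}_{\mathbb{C}^m}\big[\partial_t \alpha_j(t; x)\,K_t^{(j)}(x, x)\big]\,dx \]
\[ = \sum_{j=0}^{n-2} \int_a^b \mathrm{tr}_{\mathbb{C}^m}\big[-\alpha_n^{-1}(x)\,\partial_t \alpha_j(t; x)\,[\tilde{K}_t^{(j)}(x, x)]_{1n}\big]\,dx \]
\[ = \sum_{j=0}^{n-2} \int_a^b \mathrm{tr}_{\mathbb{C}^m}\big[\partial_t \beta_j(t; x)\,[\tilde{K}_t^{(j)}(x, x)]_{1n}\big]\,dx, \tag{3.3} \]
where
\[ \tilde{K}_t(x, y) := \begin{cases} \phi_t(x)\,R_t^{-1} R_b\,\phi_t(b)\,\phi_t^{-1}(y) & \text{if } y > x, \\ \phi_t(x)\,(R_t^{-1} R_b\,\phi_t(b) - 1)\,\phi_t^{-1}(y) & \text{if } y < x. \end{cases} \tag{3.4} \]
Note that $\tilde{K}_t^{(j)}$ is continuous on the diagonal for j = 0, . . . , n − 2, but $\tilde{K}_t^{(n-1)}$ has a jump. This is one of the reasons that this proposition is limited to the case of constant $a_{n-1}$. Thus we have
\[ \partial_t \log \det\nolimits_\theta L_t = \mathrm{Tr}\big((\partial_t L_t) L_t^{-1}\big) = \int_a^b \mathrm{tr}_{\mathbb{C}^{nm}}\big[(\partial_t A_t)(x)\,\tilde{K}_t(x, x)\big]\,dx. \tag{3.5} \]
By use of (2.9) we then calculate:
\[ \mathrm{tr}_{\mathbb{C}^{nm}}\big[(\partial_t A_t)(x)\,\tilde{K}_t(x, x)\big] = \mathrm{tr}_{\mathbb{C}^{nm}}\big[(\partial_t A_t)(x)\,\phi_t(x)\,R_t^{-1} R_b\,\phi_t(b)\,\phi_t^{-1}(x)\big] \]
\[ = \mathrm{tr}_{\mathbb{C}^{nm}}\big[(\partial_x \partial_t \phi_t)(x)\,R_t^{-1} R_b\,\phi_t(b)\,\phi_t^{-1}(x) - (\partial_x \phi_t)(x)\,\phi_t^{-1}(x)\,(\partial_t \phi_t)(x)\,R_t^{-1} R_b\,\phi_t(b)\,\phi_t^{-1}(x)\big] \]
\[ = \partial_x\,\mathrm{tr}_{\mathbb{C}^{nm}}\big[(\partial_t \phi_t)(x)\,R_t^{-1} R_b\,\phi_t(b)\,\phi_t^{-1}(x)\big], \tag{3.6} \]
to obtain
\[ \partial_t \log \det\nolimits_\theta L_t = \mathrm{tr}_{\mathbb{C}^{nm}}\big[(\partial_t \phi_t)(b)\,R_t^{-1} R_b\big] = \partial_t \log \det R_t, \tag{3.7} \]
which proves the statement.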
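The steps leading from (3.5) to (3.7) can be made explicit as follows. This is a sketch of the omitted reasoning; the shorthand $N_t := R_t^{-1} R_b\,\phi_t(b)$, which does not depend on x, is introduced here only for readability. Differentiating the defining equation $\partial_x \phi_t = A_t \phi_t$ of (2.9) with respect to t gives
\[ (\partial_t A_t)\,\phi_t = \partial_x \partial_t \phi_t - A_t\,\partial_t \phi_t = \partial_x \partial_t \phi_t - (\partial_x \phi_t)\,\phi_t^{-1}\,\partial_t \phi_t, \]
which is the second equality in (3.6). For the third, note that $\partial_x \phi_t^{-1} = -\phi_t^{-1} (\partial_x \phi_t)\,\phi_t^{-1}$, so by cyclicity of the trace
\[ \partial_x\,\mathrm{tr}\big[(\partial_t \phi_t)\,N_t\,\phi_t^{-1}\big] = \mathrm{tr}\big[(\partial_x \partial_t \phi_t)\,N_t\,\phi_t^{-1}\big] - \mathrm{tr}\big[(\partial_x \phi_t)\,\phi_t^{-1}\,(\partial_t \phi_t)\,N_t\,\phi_t^{-1}\big]. \]
Integrating over [a, b], the boundary term at x = a vanishes because $\phi_t(a) = 1_{nm}$ for all t, and at x = b one has $N_t\,\phi_t^{-1}(b) = R_t^{-1} R_b$. Finally, $\partial_t \log \det R_t = \mathrm{tr}\big[R_t^{-1}\,\partial_t R_t\big] = \mathrm{tr}\big[R_t^{-1} R_b\,(\partial_t \phi_t)(b)\big]$ by (2.11), which is (3.7).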
Theorem 3.2. Let (l, B) be an admissible operator of order n ≥ 2, $L := l_B$. Assume the principal angle θ equals π. For R(z) := R(l + z, B) we obtain an asymptotic expansion
\[ \log \det R(z) \sim_{z \to \infty} \sum_{\substack{k=0 \\ k \ne 1}}^{\infty} b_k\,z^{\frac{1-k}{n}} + b_1 + \zeta_{L,\pi}(0) \log z, \tag{3.8} \]
in a conic neighborhood of $\mathbb{R}_+$. Furthermore,
\[ \log \det\nolimits_\pi(L + z) = \log \det R(z) - \mathrm{LIM}_{w \to \infty} \log \det R(w). \tag{3.9} \]

Proof. In view of (3.1) we have an asymptotic expansion
\[ \mathrm{Tr}(L + z)^{-1} \sim_{z \to \infty} \sum_{k=0}^{\infty} c_k\,z^{\frac{1-k}{n} - 1}. \tag{3.10} \]
We apply the preceding proposition with $a_0(z; x) = a_0(x) + z$ and $a_k(z; x) = a_k(x)$, k ≥ 1. Then
\[ \partial_z \log \det R(z) = \partial_z \log \det\nolimits_\pi(L + z) = \mathrm{Tr}(L + z)^{-1} \sim_{z \to \infty} \sum_{k=0}^{\infty} c_k\,z^{\frac{1-k}{n} - 1}. \tag{3.11} \]
This proves the first assertion. Note that from Lemma 2.1 one easily concludes $c_1 = \zeta_{L,\pi}(0)$. By (3.11), $\log \det R(z) - \log \det_\pi(L + z)$ is a constant. Then the second assertion follows from (2.37).

Theorem 3.3. Let again (l, B) be an admissible operator of order n ≥ 2, $L := l_B$, with principal angle θ = π. We put
\[ \log C(l, B) := -\mathrm{LIM}_{z \to \infty} \log \det R(z). \tag{3.12} \]
Then,
\[ \det\nolimits_\pi L = C(l, B)\,\det R, \qquad R := R(0). \tag{3.13} \]
Furthermore, C(l, B) depends only on $a_n$, $a_{n-1}$ and on the boundary operator B, i.e. $C(l, B) = C_1(a_n, a_{n-1}, R_a, R_b)$.

Proof. Equations (3.12) and (3.13) are immediate consequences of the preceding theorem. To prove the last statement we consider two admissible operators
\[ l_j := \sum_{k=0}^{n} a_{k,j}(x)\,D^k, \qquad j = 0, 1, \tag{3.14} \]
where $a_{n,0} = a_{n,1}$, $a_{n-1,0} = a_{n-1,1}$ and the boundary condition B is fixed. We put $l_t := t\,l_1 + (1 - t)\,l_0$, 0 ≤ t ≤ 1. We would like to apply Proposition 3.1. However, it may happen that $\operatorname{spec} l_t \cap \{ z \in \mathbb{C} \mid z \le 0 \} \ne \emptyset$ for some t. But since π is a principal angle for the leading symbol of $l_t$ there exists a $z_0 > 0$ such that $L_t + z$ is invertible for all $z \ge z_0$. By Proposition 3.1 we then have $C(l_0 + z, B) = C(l_1 + z, B)$ for $z > z_0$. Since both functions are holomorphic we are done.

Note that formulas (3.12) and (3.13) express the ζ-regularized determinant of L completely in terms of the solutions of the homogeneous differential equation (L + z)u = 0. It seems impossible, however, to find an explicit formula for the coefficient C(l, B) in full generality. But in cases where the fundamental matrix R(z) can be calculated explicitly one can also find an expression for C(l, B). Now we are going to discuss in detail non-separated boundary conditions for second order operators. We therefore consider the following

Example. Let $A, B, C, D \in M(m, \mathbb{C})$ and consider the operator
\[ l := -\frac{d^2}{dx^2} + q(x), \]
with boundary operator $B = (R_a, R_b)$, where
\[ R_a := \begin{pmatrix} A & B \\ C & D \end{pmatrix}, \qquad R_b := 1_{2m}. \]
It turns out that the operator $L = l_B$ is admissible iff the meromorphic function
\[ M(z) := \det\Big( -z\,\frac{B}{2} + \frac{A + D}{2} - \frac{1}{z}\,\frac{C}{2} \Big) \tag{3.15} \]
does not vanish identically. Hence, let us assume $M(z) \not\equiv 0$. Note that M is a Laurent polynomial. By the preceding proposition $C(l, B) =: C_2(A, B, C, D)$ is independent of q. More precisely,

Proposition 3.4. $C(l, B)^{-1}$ is equal to the leading coefficient of the Laurent polynomial M(z); $\zeta_{L,\pi}(0)$ equals $\tfrac12$ the degree of M(z).

Proof. As remarked before it suffices to consider the case q = 0. Then the fundamental solution φ(x, z) := φ(x, l + z) reads
\[ \phi(x, z) = \begin{pmatrix} \cosh[(x - a)\sqrt{z}]\,1_m & \dfrac{\sinh[(x - a)\sqrt{z}]}{\sqrt{z}}\,1_m \\ \sqrt{z}\,\sinh[(x - a)\sqrt{z}]\,1_m & \cosh[(x - a)\sqrt{z}]\,1_m \end{pmatrix}, \tag{3.16} \]
where, again, $1_m$ denotes the m × m unit matrix. For the rest of the proof all matrices will be 2 × 2 block matrices with m × m block entries and for simplicity we will omit $1_m$. Abbreviating c := b − a, $w := \sqrt{z}$ we find
\[ \phi(b, z) = \begin{pmatrix} \cosh cw & \dfrac{\sinh cw}{w} \\ w \sinh cw & \cosh cw \end{pmatrix} = W \begin{pmatrix} e^{cw} & 0 \\ 0 & e^{-cw} \end{pmatrix} W^{-1}, \qquad \text{where } W = \begin{pmatrix} 1 & 1 \\ w & -w \end{pmatrix}. \]
Thus,
\[ \det R(z) = \det(R_a + R_b\,\phi(b, z)) = \det\Big( R_a + W \begin{pmatrix} e^{cw} & 0 \\ 0 & e^{-cw} \end{pmatrix} W^{-1} \Big) = \det\Big( W^{-1} R_a W + \begin{pmatrix} e^{cw} & 0 \\ 0 & e^{-cw} \end{pmatrix} \Big) \]
\[ = e^{mcw}\,\det\nolimits_{\mathbb{C}^m} [W^{-1} R_a W]_{22} + O(w^{m+1} e^{(m-1)cw}), \tag{3.17} \]
since
\[ W^{-1} R_a W =: w\,X_1 + X_0 + w^{-1} X_{-1}, \qquad X_i \in M(2m, \mathbb{C}), \quad i = -1, 0, 1. \tag{3.18} \]
Here,
\[ [W^{-1} R_a W]_{22} = -\frac{w}{2}\,B + \frac{A + D}{2} - \frac{1}{2w}\,C \tag{3.19} \]
denotes the lower right entry of the 2 × 2 block matrix $W^{-1} R_a W$. This implies
\[ \log \det R(z) = mc\sqrt{z} + \log\Big[ \det\nolimits_{\mathbb{C}^m} [W^{-1} R_a W]_{22} + O\big(z^{\frac{m+1}{2}} e^{-c\sqrt{z}}\big) \Big] = mc\sqrt{z} + \log M(\sqrt{z}) + O\Big( \frac{z^{\frac{m+1}{2}} e^{-c\sqrt{z}}}{M(\sqrt{z})} \Big). \]
Since M(w) is a Laurent polynomial we may write
\[ M(\sqrt{z}) = \lambda\,z^{k/2} + O(z^{\frac{k-1}{2}}), \qquad z \to \infty, \]
and thus
\[ \log M(\sqrt{z}) = \frac{k}{2} \log z + \log \lambda + O(z^{-\frac12}), \qquad z \to \infty, \]
and we reach the conclusion.
The leading coefficient of M(z) is in general difficult to describe. Thus, it seems hard to find a more explicit formula for C(l, B) than given in the preceding Proposition. We discuss some special cases:
1. B invertible:
\[ M(z) = (-1)^m \det\Big(\frac{B}{2}\Big)\,z^m \det(1_m + O(z^{-1})) = (-1)^m \det\Big(\frac{B}{2}\Big)\,z^m + O(z^{m-1}); \]
2. B = 0, A + D invertible:
\[ M(z) = \det\Big(\frac{A + D}{2}\Big) + O(z^{-1}); \]
3. B = 0, A + D = 0:
\[ M(z) = \frac{(-1)^m}{2^m} \det C\;z^{-m}. \]
Hence,
\[ C_2(A, B, C, D) = \begin{cases} \dfrac{(-1)^m 2^m}{\det B} & \text{if } \det B \ne 0, \\[2mm] \dfrac{2^m}{\det(A + D)} & \text{if } B = 0,\ \det(A + D) \ne 0, \\[2mm] \dfrac{(-1)^m 2^m}{\det C} & \text{if } B = 0,\ A + D = 0. \end{cases} \tag{3.20} \]
Of course, this does not cover all possible cases. The periodic boundary conditions are given by $R_a = -1_{2m}$, and thus $C_2(A, B, C, D) = (-1)^m$, which is consistent with [BFK1, Thm.1].
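The leading behaviour $\det R(z) \approx e^{mcw} M(w)$, $w = \sqrt{z}$, used in the proof of Proposition 3.4 and the case distinction (3.20) can be checked numerically in the scalar case m = 1, q = 0. The sketch below is an illustration only: the function names `det_R`, `M` and `C2` and the sample values of A, B, C, D are assumptions chosen for this example, and $R_b = 1_2$ as in the Example above.

```python
import math


def det_R(w, A, B, C, D, c):
    """det(R_a + phi(b, z)) for m = 1, q = 0, z = w**2 and R_b the identity."""
    ch, sh = math.cosh(c * w), math.sinh(c * w)
    return (A + ch) * (D + ch) - (B + sh / w) * (C + w * sh)


def M(w, A, B, C, D):
    """The Laurent polynomial (3.15) for m = 1, evaluated at w."""
    return -w * B / 2 + (A + D) / 2 - C / (2 * w)


def C2(A, B, C, D):
    """The constant of (3.20) for m = 1: inverse leading coefficient of M."""
    if B != 0:
        return -2 / B
    if A + D != 0:
        return 2 / (A + D)
    if C != 0:
        return -2 / C
    raise ValueError("M vanishes identically; the operator is not admissible")


if __name__ == "__main__":
    c, w = 1.0, 15.0
    A, B, C, D = 1.0, 2.0, 3.0, 4.0
    # The ratio tends to 1 as w grows, reflecting det R(z) ~ e^{cw} M(w).
    print(det_R(w, A, B, C, D, c) / (math.exp(c * w) * M(w, A, B, C, D)))
    # Periodic conditions R_a = -1: C2 * det R(z) reduces to 4 sinh(cw/2)^2.
    w = 1.0
    print(C2(-1, 0, 0, -1) * det_R(w, -1, 0, 0, -1, c),
          4 * math.sinh(c * w / 2) ** 2)
```

For the periodic conditions the product $C_2 \det R(z) = 2\cosh(c\sqrt{z}) - 2 = 4\sinh^2(c\sqrt{z}/2)$ is, by (3.13) applied to L + z, the value the theorem assigns to $\det_\pi(L + z)$ when q = 0.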
Next, we discuss how $C_1(a_n, a_{n-1}, B)$ depends on the coefficients $a_n$ and $a_{n-1}$. We start with the dependence on the subleading coefficient $a_{n-1}$ and use the standard trick to eliminate it (cf. also [BFK3, Prop.2.2]). For this let again $L = l_B$ be an admissible operator, $B = (R_a, R_b)$. Let also $U : I \to M(m, \mathbb{C})$ be the unique solution of the initial value problem
\[ U'(x) = -\frac{i}{n}\,(a_n^{-1} a_{n-1})(x)\,U(x), \qquad U(a) = 1_m. \tag{3.21} \]
The determinant of U(x) is given by
\[ \det U(x) = \exp\Big( -\frac{i}{n} \int_a^x \mathrm{tr}(a_n^{-1} a_{n-1})(y)\,dy \Big) \ne 0. \tag{3.22} \]
By conjugation of l with U we find
\[ l^u := U^{-1}\,l\,U = \sum_{j=0}^{n} \tilde{a}_j(x)\,D^j, \tag{3.23} \]
with $\tilde{a}_n(x) = (U^{-1} a_n U)(x)$ and $\tilde{a}_{n-1} = 0$. Since $\operatorname{spec} \tilde{a}_n = \operatorname{spec} a_n$, $\tilde{a}_n$ has the same principal angle as $a_n$. Furthermore, for $L^u := U^{-1} L U$ we have
\[ \operatorname{spec} L^u = \operatorname{spec} L, \tag{3.24} \]
and hence $\det_\theta L^u = \det_\theta L$. Next, we determine the transformed boundary operator $B^u := (R_a^u, R_b^u)$. If
\[ \phi = \begin{pmatrix} \varphi_1 & \cdots & \varphi_n \\ \vdots & & \vdots \\ \varphi_1^{(n-1)} & \cdots & \varphi_n^{(n-1)} \end{pmatrix} \tag{3.25} \]
denotes a fundamental matrix of l then the corresponding fundamental matrix of $l^u$ reads
\[ \tilde{\phi}^u = \begin{pmatrix} U^{-1} \varphi_1 & \cdots & U^{-1} \varphi_n \\ \vdots & & \vdots \\ (U^{-1} \varphi_1)^{(n-1)} & \cdots & (U^{-1} \varphi_n)^{(n-1)} \end{pmatrix} =: T(U)\,\phi, \tag{3.26} \]
where
\[ T(U)_{ij}(x) := \begin{cases} \dbinom{i}{j}\,(\partial_x^{(i-j)} U^{-1})(x) & \text{if } 0 \le j \le i \le n - 1, \\ 0 & \text{if } j > i. \end{cases} \tag{3.27} \]
Note that $\det T(U) = (\det U)^{-n}$. However, the fundamental matrix $\tilde{\phi}^u$ is not normalized. We therefore put
\[ \phi^u(x) := \tilde{\phi}^u(x)\,(\tilde{\phi}^u)^{-1}(a) = T(U)(x)\,\phi(x)\,T(U)^{-1}(a). \tag{3.28} \]
We now determine the boundary conditions for $L^u$. Let $g \in \mathcal{D}(L^u)$. Then, $g = U^{-1} f$ with $f \in \mathcal{D}(L)$. With $F := (f, \dots, f^{(n-1)})^t$, $G := (g, \dots, g^{(n-1)})^t$ we have
\[ G = T(U)\,F, \tag{3.29} \]
and thus
\[ 0 = R_a F(a) + R_b F(b) = R_a\,T(U)^{-1}(a)\,G(a) + R_b\,T(U)^{-1}(b)\,G(b) =: \tilde{R}_a^u\,G(a) + \tilde{R}_b^u\,G(b). \tag{3.30} \]
We put
\[ R_a^u := T(U)(b)\,\tilde{R}_a^u, \qquad R_b^u := T(U)(b)\,\tilde{R}_b^u. \tag{3.31} \]
Note that $R_a^u, R_b^u$ define the same boundary condition as $\tilde{R}_a^u, \tilde{R}_b^u$. Then,
\[ R(l^u + z, R_a^u, R_b^u) = R_a^u + R_b^u\,\phi^u(b, l^u + z) = T(U)(b)\,R(l + z, R_a, R_b)\,T(U)^{-1}(a), \tag{3.32} \]
and thus
\[ \det R(l^u + z, R_a^u, R_b^u) = (\det U(b))^{-n}\,\det R(l + z, R_a, R_b). \tag{3.33} \]
Consequently,
\[ \det\nolimits_\theta L^u = \det\nolimits_\theta L \tag{3.34} \]
implies
\[ C(a_n, a_{n-1}, R_a, R_b) = (\det U(b))^{-n}\,C(U^{-1} a_n U, 0, R_a^u, R_b^u). \tag{3.35} \]
We thus have proved the

Proposition 3.5. Let $L = l_B$ be an admissible operator and let U(x) be the unique solution of the initial value problem (3.21). Then, the operator $L^u = l^u_{B^u}$ is also admissible. It has the same principal angle θ as L and
\[ C(a_n, a_{n-1}, R_a, R_b) = \exp\Big( i \int_a^b \mathrm{tr}(a_n^{-1} a_{n-1})(y)\,dy \Big)\,C(U^{-1} a_n U, 0, R_a^u, R_b^u), \tag{3.36} \]
with
\[ R_a^u := T(U)(b)\,R_a\,T(U)^{-1}(a), \tag{3.37} \]
\[ R_b^u := T(U)(b)\,R_b\,T(U)^{-1}(b), \tag{3.38} \]
and T(U) is defined by formula (3.27).
As an application, we consider the operator l := −
d d2 + q(x), + p(x) dx2 dx
with the same boundary operator $B = (R_a, R_b)$ as given in the preceding example. Again, by Proposition 3.3 it is sufficient to consider $q = 0$. Notice that the leading coefficient $a_2$ of the operator $l$ is invariant with respect to conjugation with $U$. Hence, in the two specific cases where, respectively, $B$ is invertible, or $B = 0$ and $A + D$ is invertible, we can simply make use of (3.20) to obtain
\[
C_2(p, A, B, C, D) = e^{\frac12\int_a^b \operatorname{tr} p(x)\,dx}
\begin{cases}
\dfrac{(-1)^m 2^m}{\det B} & \text{if } \det B \neq 0,\\[2mm]
\dfrac{2^m}{\det(A+D)} & \text{if } B = 0,\ \det(A+D) \neq 0.
\end{cases}
\tag{3.39}
\]
We now turn to the dependence of $C(l, B)$ on the leading symbol of the differential operator $l$. The aim is to get an explicit formula analogous to (3.35) in the case of $a_{n-1}$. Unfortunately, this is much more involved than in the case of separated boundary conditions. Following [BFK3] one considers the family of operators $l_t := \alpha_t(D^n + l_0)$, where $l_0$ denotes a differential operator of order $n-1$ and $\alpha_t(x)$, $t \in [0,1]$, is a smooth variation of $a_n(x)$ such that $\alpha_0 = \mathrm{Id}$ and $\alpha_1 = a_n$. Then the question arises whether the corresponding operators $L_t := (l_t, B)$ are admissible for all $t \in [0,1]$. Answering this question seems to be hopeless in the general situation discussed in this paper. Note, however, that for a given admissible operator $L = (l, B)$ the constant $C(l, B)$ can be calculated if the fundamental solution of the corresponding homogeneous equation is known.

3.2. Operators of order 1. In this subsection we briefly indicate how Theorems 3.2 and 3.3 generalize to operators of order one. Let $L = l_B$ be an admissible operator of order one with principal angle $\pi$. A priori $(L+x)^{-1}$ is not of trace class. However, the trace of $(L+x)^{-1}$ can be regularized. In the sequel we use the notation $\operatorname{Res}_k f(z_0)$ for the coefficient of $(z-z_0)^k$ in the Laurent expansion of the meromorphic function $f$. The function $\operatorname{Tr}(L+z)^{-s}$ is meromorphic with simple poles in $1, 0, -1, \ldots$, which follows from (3.43) below, and we put
\[
\operatorname{Tr}(L+z)^{-1} := \operatorname{Res}_0 \operatorname{Tr}(L+z)^{-s}\big|_{s=1}.
\tag{3.40}
\]
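Then,
\[
\frac{d}{dz}\log{\det}_\pi(L+z) = -\frac{d}{ds}\Big|_{s=0}\frac{d}{dz}\operatorname{Tr}(L+z)^{-s}
= \frac{d}{ds}\Big|_{s=0}\, s\operatorname{Tr}(L+z)^{-s-1}
= \operatorname{Tr}(L+z)^{-1}
= \operatorname{Tr}\bigl((L+z)^{-1} - L^{-1}\bigr) + \operatorname{Tr}(L^{-1}),
\tag{3.41}
\]
since $(L+z)^{-1} - L^{-1} = -z(L+z)^{-1}L^{-1}$ is of trace class. The same calculation as (2.34) shows that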
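\[
\zeta_{L,\pi}(s) = \frac{\sin \pi s}{\pi}\; -\!\!\!\!\!\!\int_0^\infty x^{-s}\operatorname{Tr}\bigl[(L+x)^{-1} - L^{-1}\bigr]\,dx
= \frac{\sin \pi(s-1)}{\pi(s-1)}\; -\!\!\!\!\!\!\int_0^\infty x^{1-s}\operatorname{Tr}(L+x)^{-2}\,dx.
\tag{3.42}
\]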
By the work of Seeley we have an asymptotic expansion
\[
\operatorname{Tr}(L+x)^{-2} \sim_{x\to\infty} \sum_{k=0}^{\infty} c_k\, x^{-k-1},
\tag{3.43}
\]
which implies in view of $\sin \pi(s-1) = \pi(s-1) + O((s-1)^3)$ that
\[
\operatorname{Tr}(L^{-1}) = \operatorname{Res}_0 \zeta_{L,\pi}(1) = -\!\!\!\!\!\!\int_0^\infty \operatorname{Tr}(L+x)^{-2}\,dx.
\tag{3.44}
\]
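On the other hand, we have
\[
\operatorname{Tr}(L^{-1}) = -\!\!\!\!\!\!\int_0^\infty \operatorname{Tr}(L+x)^{-2}\,dx
= \operatorname*{LIM}_{R\to\infty}\Bigl( -\int_0^R \frac{d}{dx}\operatorname{Tr}\bigl[(L+x)^{-1} - L^{-1}\bigr]\,dx \Bigr)
= -\operatorname*{LIM}_{R\to\infty}\operatorname{Tr}\bigl[(L+R)^{-1} - L^{-1}\bigr].
\tag{3.45}
\]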
Comparing this with (3.41) gives
\[
\operatorname*{LIM}_{z\to\infty}\frac{d}{dz}\log{\det}_\pi(L+z) = \operatorname*{LIM}_{z\to\infty}\operatorname{Tr}(L+z)^{-1} = 0.
\tag{3.46}
\]
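Summing up we can state the analogue of Lemmas 2.1 and 2.2 for operators of first order:

Lemma 3.6. Let $L$ be as before. Then we have
\[
\zeta_{L,\pi}(s) = \frac{\sin \pi s}{\pi}\; -\!\!\!\!\!\!\int_0^\infty x^{-s}\operatorname{Tr}(L+x)^{-1}\,dx,
\tag{3.47}
\]
\[
\zeta'_{L,\pi}(0) = -\!\!\!\!\!\!\int_0^\infty \operatorname{Tr}(L+x)^{-1}\,dx,
\tag{3.48}
\]
and the asymptotic expansions in a conic neighborhood of $\mathbb{R}_+$,
\[
\operatorname{Tr}(L+x)^{-1} \sim_{x\to\infty} \sum_{k=1}^{\infty} \frac{c_k}{k}\, x^{-k} + a_0 \log x,
\tag{3.49}
\]
\[
\log{\det}_\pi(L+x) \sim_{x\to\infty} \sum_{k=2}^{\infty} \frac{c_k}{k(k-1)}\, x^{1-k} + \zeta_{L,\pi}(0)\log x + c_0\, x\log x + c_0\, x,
\tag{3.50}
\]
in particular
\[
\operatorname*{LIM}_{x\to\infty}\operatorname{Tr}(L+x)^{-1} = 0,
\tag{3.51}
\]
\[
\operatorname*{LIM}_{x\to\infty}\log{\det}_\pi(L+x) = 0.
\tag{3.52}
\]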
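As a sanity check on (3.47) and (3.48), one can look at the scalar model in which $L$ is replaced by a single number $\lambda > 0$. This toy case is an illustration added here and is not part of the original argument; it assumes that the regularized integral keeps only the constant term of the asymptotic expansion as the upper limit tends to infinity.

```latex
% Scalar model: L is replaced by a number \lambda > 0 (illustrative assumption).
% For 0 < s < 1 the classical Beta-type integral gives the scalar form of (3.47):
\[
  \frac{\sin \pi s}{\pi}\int_0^\infty \frac{x^{-s}}{\lambda + x}\,dx = \lambda^{-s}.
\]
% Differentiating the right-hand side at s = 0:
\[
  \frac{d}{ds}\Big|_{s=0}\lambda^{-s} = -\log\lambda .
\]
% On the other hand,
\[
  \int_0^R \frac{dx}{\lambda + x} = \log(\lambda + R) - \log\lambda
  = \log R - \log\lambda + O(1/R), \qquad R \to \infty,
\]
% so discarding the divergent \log R term leaves the constant -\log\lambda,
% in agreement with the scalar form of (3.48).
```

In this model $\log\det = -\zeta'(0) = \log\lambda$, as one would expect for a one-dimensional operator.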
Proof. Equation (3.47) follows from (3.42) and (3.48) follows from (3.47), similar to (2.35); (3.49), (3.50) follow from integrating the expansion (3.43); (3.51) follows from (3.46) and finally, (3.52) is proved exactly as (2.37).

Theorem 3.7. Let $L$ be as before. For $R(x) = R(l+x, B)$ we have an asymptotic expansion
\[
\log\det R(x) \sim_{x\to\infty} \sum_{k=2}^{\infty} \frac{c_k}{k(k-1)}\, x^{1-k} + \zeta_{L,\pi}(0)\log x + b + c_0\, x\log x + d\, x.
\tag{3.53}
\]
Furthermore,
\[
\log{\det}_\pi(L+x) - \log\det R(x) = -b - (d - c_0)\,x.
\tag{3.54}
\]
Proof. We cannot apply Proposition 3.1 to $L + z$ since the operator $L$ is of order one. But the same computation as in the proof of Proposition 3.1 shows that
\[
\frac{d}{dx}\operatorname{Tr}(L+x)^{-1} = -\operatorname{Tr}(L+x)^{-2}
= -\int_a^b \operatorname{tr}\,(L+x)^{-2}(t,t)\,dt
= \frac{d}{dx}\int_a^b \operatorname{tr}\bigl[(L+x)^{-1}(t,t+0)\bigr]\,dt
= \frac{d^2}{dx^2}\log\det R(x),
\tag{3.55}
\]
hence,
\[
\log{\det}_\pi(L+x) - \log\det R(x)
\tag{3.56}
\]
is a polynomial of degree one and we are done.
As a consequence, we end up with the formula
\[
{\det}_\pi L = e^{-b}\det R,
\tag{3.57}
\]
where $b := \operatorname{LIM}_{x\to\infty}\log\det R(x)$.

References

[BFK1] Burghelea, D., Friedlander, L., Kappeler, T.: On the determinant of elliptic differential and finite difference operators in vector bundles over $S^1$. Commun. Math. Phys. 138, 1–18 (1991)
[BFK2] Burghelea, D., Friedlander, L., Kappeler, T.: Regularized determinants for pseudodifferential operators in vector bundles over $S^1$. Int. Equ. Op. Th. 16, 496–513 (1993)
[BFK3] Burghelea, D., Friedlander, L., Kappeler, T.: On the determinant of elliptic boundary value problems on a line segment. Proc. Am. Math. Soc. 123, No. 10, 3027–3038 (1995)
[BL] Brüning, J., Lesch, M.: On the spectral geometry of algebraic curves. J. reine angew. Math. 474, 25–66 (1996)
[BS] Brüning, J., Seeley, R.: Regular Singular Asymptotics. Adv. Math. 58, 133–148 (1985)
[L1] Lesch, M.: Operators of Fuchs type, conical singularities and asymptotic methods. Teubner Texte zur Mathematik, vol. 136, Leipzig: B. G. Teubner, 1997
[L2] Lesch, M.: Determinants of regular singular Sturm-Liouville operators. To appear in Math. Nachr.
[RS] Ray, D.B., Singer, I.M.: R-torsion and the Laplacian on Riemannian manifolds. Adv. Math. 7, 145–210 (1971)
[Se1] Seeley, R.: The resolvent of an elliptic boundary problem. Amer. J. Math. 91, 889–920 (1969)
[Se2] Seeley, R.: Analytic extension of the trace associated with elliptic boundary problems. Amer. J. Math. 91, 963–983 (1969)
[Sh] Shubin, M.A.: Pseudodifferential operators and spectral theory. Berlin–Heidelberg–New York: Springer, 1987
Communicated by B. Simon
Commun. Math. Phys. 193, 661 – 674 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Spin^c Manifolds and Complex Contact Structures

Andrei Moroianu*

Institut für Reine Mathematik, Humboldt-Universität zu Berlin, Ziegelstr. 13a, D–10099 Berlin, Germany

Received: 17 July 1997 / Accepted: 23 September 1997

Abstract: In this paper we extend our notion of projectable spinors ([9], Ch.1) to the framework of Spin^c manifolds and deduce the basic formulas relating spinors on the base and the total space of Riemannian submersions with totally geodesic one-dimensional fibres. Some geometric applications concerning positive Kähler–Einstein complex contact manifolds (e.g. their characterisation as twistor spaces over positive quaternionic Kähler manifolds) are also given.
1. Introduction

Projectable spinors for Riemannian submersions of spin manifolds arose in a quite natural way ([9], Ch.1) and have led to important geometric applications, such as the classification of Kähler manifolds admitting Kählerian Killing spinors ([8]) or results on the spectrum of the Dirac operator for certain classes of Riemannian manifolds ([10]). In this paper we introduce projectable spinors for Riemannian submersions of Spin^c manifolds, motivated by the following facts: K.-D. Kirchberg and U. Semmelmann discovered that every complex contact manifold of complex dimension $4l+3$ admitting a Kähler–Einstein metric of positive scalar curvature carries a canonical spin structure with Kählerian Killing spinors [4]. Using this together with the results in [8], we were able to prove the following characterisation of twistor spaces over positive quaternionic Kähler manifolds in half of the possible dimensions:

Theorem A (cf. [12]). Let $M$ be a compact spin Kähler manifold of positive scalar curvature and complex dimension $4l+3$. Then the following statements are equivalent:
(i) $M$ is the twistor space of some quaternionic Kähler manifold;
(ii) $M$ is Kähler–Einstein and admits a complex contact structure;
(iii) $M$ admits Kählerian Killing spinors.

By different methods, C. LeBrun independently obtained the following

Theorem B (cf. [7]). Let $Z$ be a Fano contact manifold. Then $Z$ is a twistor space iff it admits a Kähler–Einstein metric.

In complex dimensions $4l+3$, this is a direct corollary of Theorem A. The reasons for which our Theorem A fails to hold in complex dimensions $4l+1$ are of a topological nature: neither the twistor spaces, nor the complex contact manifolds of complex dimensions $4l+1$ are spin (with one exception: the complex projective space). On the other hand, each Kähler manifold admits a natural Spin^c structure; it is thus natural to try to extend the above notions to the framework of Spin^c structures, and to generalise the results in [12] to this case. In order to keep the computations as simple as possible, we do not construct here the whole theory of projectable spinors on Spin^c manifolds, and restrict ourselves to a particular situation which is of special interest to us. Generalisations of the constructions described below can be easily obtained.

The author is very indebted to Jean Pierre Bourguignon for the careful reading of a preliminary version of this paper, and to Uwe Semmelmann for many helpful discussions.

* Supported by the SFB 288 "Differentialgeometrie und Quantenphysik" of the DFG.
2. Preliminaries

Definition 2.1. A Spin^c structure on an oriented Riemannian manifold $(M^n, g)$ is given by a U(1) principal bundle $P_{U(1)}M$ and a $\mathrm{Spin}^c_n$ principal bundle $P_{\mathrm{Spin}^c_n}M$ together with a projection $\theta : P_{\mathrm{Spin}^c_n}M \to P_{SO(n)}M \times P_{U(1)}M$ ($P_{SO(n)}M$ means the SO(n) principal bundle of oriented orthonormal frames on $M$), satisfying
\[
\theta(\tilde u \tilde a) = \theta(\tilde u)\,\xi(\tilde a),
\]
for every $\tilde u \in P_{\mathrm{Spin}^c_n}M$ and $\tilde a \in \mathrm{Spin}^c_n$, where $\xi$ is the canonical 2-fold covering of $\mathrm{Spin}^c_n$ over $SO(n) \times U(1)$.

Recall that $\mathrm{Spin}^c_n = \mathrm{Spin}_n \times_{\mathbb{Z}_2} U(1)$, and that $\xi$ is given by $\xi([u, a]) = (\phi(u), a^2)$, where $\phi : \mathrm{Spin}_n \to SO(n)$ is the canonical 2-fold covering. If $M$ has a Spin^c structure, we denote by $\Sigma M$ the associated complex spinor bundle and by $L$ the complex line bundle associated to $P_{U(1)}M$, which is called the auxiliary bundle. On $\Sigma M$ there is a canonical Hermitian product $(\cdot,\cdot)$, with respect to which the Clifford multiplication by vectors is skew-Hermitian:
\[
(X\cdot\psi, \varphi) = -(\psi, X\cdot\varphi), \qquad \forall X \in TM,\ \psi, \varphi \in \Sigma M.
\tag{1}
\]
Every connection form $A$ on $P_{U(1)}M$ defines, together with the Levi-Civita connection of $M$, a covariant derivative on $\Sigma M$ denoted by $\nabla^A$. Correspondingly, we define the Dirac operator as the composition $\gamma \circ \nabla^A$, where $\gamma$ denotes the Clifford contraction. The Dirac operator can be expressed using a local orthonormal frame $\{e_1, \ldots, e_n\}$ as
\[
D = \sum_{i=1}^{n} e_i \cdot \nabla^A_{e_i}.
\]
Suppose now that $(M^{2m}, g, J)$ is a Kähler manifold. We define the twisted Dirac operator $\tilde D$ by
\[
\tilde D = \sum_{i=1}^{2m} J(e_i)\cdot\nabla_{e_i} = -\sum_{i=1}^{2m} e_i\cdot\nabla_{J(e_i)},
\]
which satisfies
\[
\tilde D^2 = D^2 \quad\text{and}\quad D\tilde D + \tilde D D = 0.
\tag{2}
\]
We also define the complex Dirac operators $D_\pm := \frac12(D \mp i\tilde D)$ and (2) becomes
\[
D_+^2 = D_-^2 = 0 \quad\text{and}\quad D^2 = D_+D_- + D_-D_+.
\tag{3}
\]
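Consider a local orthonormal frame $\{X_\alpha, Y_\alpha\}$ such that $Y_\alpha = J(X_\alpha)$. Then $Z_\alpha = \frac12(X_\alpha - iY_\alpha)$ and $Z_{\bar\alpha} = \frac12(X_\alpha + iY_\alpha)$ are local frames of $T^{1,0}(M)$ and $T^{0,1}(M)$, and $D_\pm$ can be expressed as
\[
D_+ = 2\sum_{\alpha=1}^{m} Z_\alpha\cdot\nabla^A_{Z_{\bar\alpha}}, \qquad
D_- = 2\sum_{\alpha=1}^{m} Z_{\bar\alpha}\cdot\nabla^A_{Z_\alpha}.
\tag{4}
\]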
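A $k$-form $\omega$ acts on $\Sigma M$ by
\[
\omega\cdot\Psi := \sum_{i_1<\cdots<i_k} \omega(e_{i_1}, \ldots, e_{i_k})\, e_{i_1}\cdot\ldots\cdot e_{i_k}\cdot\Psi,
\]
where $\{e_i\}$ is a local orthonormal frame on $M$. With respect to this action, the Kähler form $\Omega$ (defined by $\Omega(X, Y) = g(X, JY)$) satisfies
\[
\Omega = \frac12\sum_{i=1}^{2m} J(e_i)\cdot e_i = -\frac12\sum_{i=1}^{2m} e_i\cdot J(e_i).
\tag{5}
\]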
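For later use let us note that
\[
\sum_{\alpha=1}^{m} Z_\alpha\cdot Z_{\bar\alpha} = -\frac{i}{2}\,\Omega - \frac{m}{2}, \qquad
\sum_{\alpha=1}^{m} Z_{\bar\alpha}\cdot Z_\alpha = \frac{i}{2}\,\Omega - \frac{m}{2},
\tag{6}
\]
where $Z_\alpha$ and $Z_{\bar\alpha}$ are local frames of $T^{1,0}(M)$ and $T^{0,1}(M)$ as before. The action of $\Omega$ on $\Sigma M$ yields an orthogonal decomposition $\Sigma M = \oplus_{r=0}^{m}\Sigma_r M$, where $\Sigma_r M$ is the eigenbundle associated to the eigenvalue $i\mu_r = i(m - 2r)$ of $\Omega$. If we define $\Sigma_{-1}M = \Sigma_{m+1}M = \{0\}$, then
\[
D_\pm\,\Gamma(\Sigma_r M) \subset \Gamma(\Sigma_{r\pm1}M).
\tag{7}
\]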
The complex volume element $\omega_{\mathbb{C}} := i^m e_1\cdot\ldots\cdot e_{2m}$ acts on $\Sigma M$ by Clifford multiplication and its square is the identity. We denote by $\Sigma_\pm M$ the eigenbundles corresponding to the eigenvalues $\pm1$, and it is easy to see that $\Sigma_r M \subset \Sigma_+ M$ ($\Sigma_r M \subset \Sigma_- M$) iff $r$ is even (respectively odd). If, with respect to the decomposition $\Sigma M = \Sigma_+ M \oplus \Sigma_- M$, a spinor $\psi$ is written as $\psi = \psi_+ + \psi_-$, then we define its conjugate $\bar\psi := \psi_+ - \psi_-$.
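Identity (6) can be checked directly from the Clifford relations. The short computation below is an added illustration; it assumes the sign convention $X\cdot X = -|X|^2$, which is the one compatible with the skew-Hermitian property (1), and it writes $X, Y$ for $X_\alpha, Y_\alpha$ inside a single summand.

```latex
% X = X_\alpha and Y = Y_\alpha = J(X_\alpha) are orthonormal, so
% X.X = Y.Y = -1 and X.Y = -Y.X (assumed Clifford convention).
\begin{align*}
 Z_\alpha\cdot Z_{\bar\alpha}
   &= \tfrac14\,(X - iY)\cdot(X + iY)
    = \tfrac14\bigl(X\cdot X + Y\cdot Y + i\,(X\cdot Y - Y\cdot X)\bigr)
    = -\tfrac12 + \tfrac{i}{2}\,X\cdot Y,\\
 \Omega
   &= \tfrac12\sum_{\alpha}\bigl(J(X_\alpha)\cdot X_\alpha + J(Y_\alpha)\cdot Y_\alpha\bigr)
    = \tfrac12\sum_{\alpha}\bigl(Y_\alpha\cdot X_\alpha - X_\alpha\cdot Y_\alpha\bigr)
    = -\sum_{\alpha} X_\alpha\cdot Y_\alpha,\\
 \sum_{\alpha=1}^{m} Z_\alpha\cdot Z_{\bar\alpha}
   &= -\tfrac{m}{2} - \tfrac{i}{2}\,\Omega .
\end{align*}
```

The second identity in (6) follows in the same way after exchanging the roles of $Z_\alpha$ and $Z_{\bar\alpha}$, which flips the sign of the $X\cdot Y$ term.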
3. Projectable Spinors to Spin^c-Manifolds

As in the case of spin manifolds, projectable spinors may be defined for arbitrary Riemannian submersions of Spin^c manifolds with 1-dimensional totally geodesic fibres, but for the sake of simplicity we treat only a particular case here.

Let $P_{U(1)}M$ be the principal U(1) bundle associated to a Spin^c structure on a Riemannian manifold $(M^n, g)$ of even dimension and suppose that on $P_{U(1)}M$ a connection form $A$ is given. Denote by $\bar M := P_{U(1)}M$ and by $\pi$ the canonical bundle projection. It is well-known that there exists a unique 2-form $\alpha$ on $M$ whose pull-back is just $i$ times the curvature form $dA$ of the connection $A$ (note that $A$ and $dA$ are imaginary-valued forms on $\bar M$). Let $T$ be the (1,1) tensor on $M$ associated to $\alpha$ (defined by $\alpha(X, Y) = g(X, TY)$).

The manifold $\bar M$ carries a canonical 1-parameter family of Riemannian metrics $g_t$ which make the bundle projection $\pi : \bar M \to M$ into a Riemannian submersion with totally geodesic fibres. These metrics are given by
\[
g_t(X, Y) = g(\pi_*X, \pi_*Y) - t^2 A(X)A(Y), \qquad \forall x \in \bar M,\ X, Y \in T_x\bar M,
\]
and we denote by $\nabla^t$ the covariant derivative of the Levi-Civita connection of $g_t$ and by $V$ the unit vertical vector field on $(\bar M, g_t)$ satisfying $A(V) = i/t$. This choice of $V$ fixes an orientation for $\bar M$.

Before proceeding, we mention here a simple result relating spin and Spin^c structures, that will be used in a moment.

Lemma 3.1. A Spin^c structure with trivial auxiliary bundle is canonically identified with a spin structure. Moreover, if the connection $A$ of the auxiliary bundle $L$ is flat, then by this identification $\nabla^A$ corresponds to $\nabla$ on the spinor bundles.

Proof. One first remarks that since the U(1) bundle associated to $L$ is trivial, we can exhibit a global section of it, that we will call $\sigma$. Denote by $P_{\mathrm{Spin}_n}M$ the inverse image by $\theta$ of $P_{SO(n)}M \times \sigma$. It is straightforward to check that this defines a spin structure on $M$, and that the connection on $P_{\mathrm{Spin}^c_n}M$ restricts to the Levi-Civita connection on $P_{\mathrm{Spin}_n}M$ if $\sigma$ can be chosen to be parallel, i.e. if $A$ defines a flat connection.

Proposition 3.1. Every Spin^c structure on $M$ induces a canonical spin structure on $\bar M$.

Proof. By enlargement of the structure groups, the two-fold covering
\[
\theta : P_{\mathrm{Spin}^c_n}M \to P_{SO(n)}M \times P_{U(1)}M
\]
gives a two-fold covering
\[
\theta : P_{\mathrm{Spin}^c_{n+1}}M \to P_{SO(n+1)}M \times P_{U(1)}M,
\]
which, by pull-back through $\pi$, gives rise to a Spin^c structure on $\bar M$,
\[
\begin{array}{ccc}
P_{\mathrm{Spin}^c_{n+1}}\bar M & \xrightarrow{\ \pi\ } & P_{\mathrm{Spin}^c_{n+1}}M\\
{\scriptstyle\pi^*\theta}\downarrow & & \downarrow{\scriptstyle\theta}\\
P_{SO(n+1)}\bar M \times P_{U(1)}\bar M & \xrightarrow{\ \pi\ } & P_{SO(n+1)}M \times P_{U(1)}M\\
\downarrow & & \downarrow\\
\bar M & \xrightarrow{\ \pi\ } & M.
\end{array}
\]
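Using Lemma 3.1 we see that this construction actually yields a spin structure on $\bar M$. Indeed, the pull back of a $G$ principal bundle ($P_{U(1)}M$, in our situation) with respect to its own projection map is always trivial:
\[
\begin{array}{ccc}
\pi^*P \simeq P \times G & \xrightarrow{\ \pi\ } & P\\
{\scriptstyle\pi^*\pi}\downarrow & & \downarrow{\scriptstyle\pi}\\
P & \xrightarrow{\ \pi\ } & M.
\end{array}
\]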
Nevertheless, we will continue to call this spin structure the induced Spin^c structure on $\bar M$.

The next step is to relate the covariant derivatives of spinors on $M$ and $\bar M$. We point out an important detail here: since we are actually interested in $\bar M$ as spin manifold, the connection on $P_{U(1)}\bar M$ (which defines the covariant derivative of spinors on $\bar M$) that we consider will not be the pull-back connection, but the flat connection on the canonically trivial bundle $P_{U(1)}\bar M$. The following result relates an arbitrary connection on a principal bundle $\pi : P \to M$ and the flat connection on $\pi^*P \to P$.

Lemma 3.2. The connection form $A_0$ of the flat connection on $\pi^*P$ can be related to an arbitrary connection $A$ on $P$ by
\[
A_0((\pi^*s)_*(U)) = -A(U),
\tag{8}
\]
\[
A_0((\pi^*s)_*(X^*)) = A(s_*X),
\tag{9}
\]
where $U$ is a vertical vector field on $P$, $X^*$ is the horizontal lift (with respect to $A$) of a vector field $X$ on $M$, and $s$ is a local section of $P \to M$.

Proof. The identification $P \times U(1) \simeq \pi^*P$ is given by $(u, a) \mapsto (u, ua)$, for all $(u, a) \in P \times U(1)$. For some fixed $u \in P$, take a path $u_t$ in the fiber over $x := \pi(u)$ such that $u_0 = u$ and $\dot u_0 = U$. We define $a_t \in U(1)$ by $u_t = s(x)a_t$, so via the above identification we have $(\pi^*s)(u_t) = (u_t, s(x)) = (u_t, (a_t)^{-1})$, and thus
\[
A_0((\pi^*s)_*(U)) = -a_0^{-1}\dot a_0 = -A(\dot u_0) = -A(U).
\]
Similarly, for $x \in M$ and $X \in T_xM$, take a path $x_t$ in $M$ such that $x_0 = x$ and $\dot x_0 = X$. Let $u \in \pi^{-1}(x)$ and $u_t$ the horizontal lift of $x_t$ such that $u_0 = u$. We define $a_t \in U(1)$ by $s(x_t) = u_ta_t$, which by derivation gives $s_*(X) = R_{a_0}\dot u_0 + u_0\dot a_0$. Then $(\pi^*s)(u_t) = (u_t, s(x_t)) = (u_t, a_t)$, and thus, using the fact that $\dot u_0$ is horizontal,
\[
A_0((\pi^*s)_*(X^*)) = a_0^{-1}\dot a_0 = A(s_*(X)).
\]
Recall that the complex Clifford representation $\Sigma_n$ can be made into a $Cl(n+1)$-representation by defining
\[
\mu(e_j)\cdot\psi =
\begin{cases}
e_j\cdot\psi & \text{for } j \le n,\\
i\bar\psi & \text{for } j = n+1.
\end{cases}
\]
Corresponding to this, we obtain an identification of the pull back $\pi^*\Sigma M$ with $\Sigma\bar M$, and with respect to this identification, if $X$ is a vector and $\psi$ a spinor on $M$, then
\[
X^*\cdot\pi^*\psi = \pi^*(X\cdot\psi),
\tag{10}
\]
\[
V\cdot\pi^*\psi = \pi^*(i\bar\psi),
\tag{11}
\]
where $V$ is the unit vertical vector field defined at the beginning of this section.

Definition 3.1. The sections of $\Sigma\bar M$ which can be written as pull-back of sections of $\Sigma M$ are called projectable spinors.

We now compute the covariant derivative of projectable spinors on $\bar M$ in terms of the covariant derivative of spinors on $M$.

Proposition 3.2. The covariant derivative $\nabla^t$ on $\Sigma\bar M$ induced by the Levi-Civita connection on $(\bar M, g_t)$ and the flat connection on $\pi^*P_{U(1)}M$ satisfies
\[
\nabla^t_{X^*}(\pi^*\psi) = \pi^*\Bigl(\nabla^A_X\psi - i\,\frac{t}{4}\,T(X)\cdot\bar\psi\Bigr) \qquad \forall X \in TM,
\tag{12}
\]
\[
\nabla^t_V\pi^*\psi = -\pi^*\Bigl(\frac{t}{4}\,\alpha\cdot\psi + \frac{i}{2t}\,\psi\Bigr).
\tag{13}
\]
Proof. Recall that the curvature form $dA$ of the principal U(1) bundle $\bar M \to M$ satisfies
\[
dA = -i\pi^*\alpha.
\tag{14}
\]
The metric $g_t$ is given by
\[
g_t(X, Y) = g(\pi_*X, \pi_*Y) - t^2A(X)A(Y), \qquad \forall X, Y \in T\bar M.
\tag{15}
\]
If $V$ denotes as before the unit vertical vector field, then $A(V) = i/t$, and we obtain
\begin{align*}
g_t(\nabla^t_{X^*}V, Y^*) &= -\frac12\,g_t(V, [X^*, Y^*]) = \frac{t^2}{2}\,A(V)A([X^*, Y^*])\\
&= \frac{it}{2}\,A([X^*, Y^*]) = -\frac{it}{2}\,dA(X^*, Y^*)\\
&= -\frac{t}{2}\,\alpha(X, Y) = \frac{t}{2}\,g_t(T(X)^*, Y^*),
\end{align*}
so finally
\[
\nabla^t_{X^*}V = \frac{t}{2}\,T(X)^*.
\tag{16}
\]
Consider the pull-back $\pi^*\psi$ of a spinor field $\psi = [\sigma, \xi]$, where $\xi : U \subset M \to \Sigma_n$ is a vector valued function, and $\sigma$ is a local section of $P_{\mathrm{Spin}^c_n}M$ whose projection onto $P_{SO(n)}M$ is a local orthonormal frame $(X_1, \ldots, X_n)$ and whose projection onto $P_{U(1)}M$ is a local section $s$. Then $\pi^*\psi$ can be expressed as $\pi^*\psi = [\pi^*\sigma, \pi^*\xi]$, and it is easy to see that the projection of $\pi^*\sigma$ onto $P_{SO(n+1)}\bar M$ is the local orthonormal frame $(X_1^*, \ldots, X_n^*, V)$ and its projection onto $P_{U(1)}\bar M$ is just $\pi^*s$.
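Using Lemma 3.2 and (16) we obtain
\begin{align*}
\nabla^t_{X^*}\pi^*\psi &= [\pi^*\sigma, X^*(\pi^*\xi)] + \frac12\sum_{j<k} g_t(\nabla^t_{X^*}X_j^*, X_k^*)\,X_j^*\cdot X_k^*\cdot\pi^*\psi\\
&\quad + \frac12\sum_j g_t(\nabla^t_{X^*}X_j^*, V)\,X_j^*\cdot V\cdot\pi^*\psi + \frac12\,A_0((\pi^*s)_*X^*)\,\pi^*\psi\\
&= [\pi^*\sigma, \pi^*(X(\xi))] + \frac12\sum_{j<k} g(\nabla_XX_j, X_k)\,\pi^*(X_j\cdot X_k\cdot\psi)\\
&\quad - i\,\frac12\,\frac{t}{2}\sum_j g(T(X), X_j)\,\pi^*(X_j\cdot\bar\psi) + \frac12\,A(s_*X)\,\pi^*\psi\\
&= \pi^*\Bigl([\sigma, X(\xi)] + \frac12\sum_{j<k} g(\nabla_XX_j, X_k)\,X_j\cdot X_k\cdot\psi + \frac12\,A(s_*X)\,\psi - i\,\frac{t}{4}\,T(X)\cdot\bar\psi\Bigr)\\
&= \pi^*\Bigl(\nabla^A_X\psi - i\,\frac{t}{4}\,T(X)\cdot\bar\psi\Bigr),
\end{align*}
and, since $\pi^*\xi$ is constant along the fibres and the fibres are totally geodesic,
\begin{align*}
\nabla^t_V\pi^*\psi &= [\pi^*\sigma, V(\pi^*\xi)] + \frac12\sum_{j<k} g_t(\nabla^t_VX_j^*, X_k^*)\,X_j^*\cdot X_k^*\cdot\pi^*\psi\\
&\quad + \frac12\sum_j g_t(\nabla^t_VX_j^*, V)\,X_j^*\cdot V\cdot\pi^*\psi + \frac12\,A_0((\pi^*s)_*V)\,\pi^*\psi\\
&= \frac12\,\frac{t}{2}\sum_{j<k} g(T(X_j), X_k)\,\pi^*(X_j\cdot X_k\cdot\psi) - \frac12\,A(V)\,\pi^*\psi\\
&= -\frac{t}{4}\,\pi^*(\alpha\cdot\psi) - \frac{i}{2t}\,\pi^*\psi\\
&= -\pi^*\Bigl(\frac{t}{4}\,\alpha\cdot\psi + \frac{i}{2t}\,\psi\Bigr).
\end{align*}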
We now particularise the above results to the case where $M$ is a Kähler–Einstein manifold $(M^n, g, J)$ of positive scalar curvature, and the auxiliary bundle $L$ of the Spin^c structure on $M$ is a root of the canonical bundle $K$, i.e. $L^{\otimes r} = K$ for some $r \in \mathbb{N}^*$. The canonical connection on $K$, whose curvature form is just $-i\rho$ ($\rho$ is the Ricci form), then induces a connection $A$ on $L$, whose curvature form $\omega$ satisfies $\omega = -i\rho/r$. As before, we denote by $\bar M$ the U(1) principal bundle associated to $L$. By rescaling the metric on $M$ if necessary, we can suppose that the scalar curvature of $M$ is equal to $n(n+2)$, and thus $\rho = (n+2)\Omega$. The 2-form $\alpha$ on $M$ defined at the beginning of this section is given in this case by
\[
\alpha = \frac{n+2}{r}\,\Omega,
\tag{17}
\]
so the above proposition becomes
Proposition 3.3. If $M$ is a Spin^c Kähler–Einstein manifold of positive scalar curvature and the auxiliary bundle $L$ of the Spin^c structure on $M$ is an $r$-th root of the canonical bundle $K$, then the covariant derivative $\nabla^t$ on $\Sigma\bar M$ induced by the Levi-Civita connection on $(\bar M, g_t)$ and the flat connection on $\pi^*P_{U(1)}M$ satisfies
\[
\nabla^t_{X^*}(\pi^*\psi) = \pi^*\Bigl(\nabla^A_X\psi - i\,\frac{t(n+2)}{4r}\,J(X)\cdot\bar\psi\Bigr) \qquad \forall X \in TM,
\tag{18}
\]
\[
\nabla^t_V\pi^*\psi = -\pi^*\Bigl(\frac{t(n+2)}{4r}\,\Omega\cdot\psi + \frac{i}{2t}\,\psi\Bigr).
\tag{19}
\]
The formula (16) allows us to compute the Ricci tensor $\mathrm{Ric}^t$ of the Riemannian manifold $(\bar M, g_t)$. If we denote by $a := \frac{t(n+2)}{2r}$, then
\[
\mathrm{Ric}^t(V, V) = na^2, \qquad \mathrm{Ric}^t(X^*, V) = 0,
\tag{20}
\]
\[
\mathrm{Ric}^t(X^*, Y^*) = (n + 2 - 2a^2)\,g(X, Y).
\tag{21}
\]
Let us take $t_0 = \frac{2r}{n+2}$ and denote $\bar g := g_{t_0}$, $\bar\nabla := \nabla^{t_0}$. The vertical vector field $V$ then defines an Einstein–Sasakian structure on the manifold $(\bar M, \bar g)$ (cf. [2]). We can summarise the above results in the following
Theorem 3.1. Let $(M^n, g, J)$ be a Kähler–Einstein manifold with scalar curvature $R = n(n+2)$, $L := K^{1/r}$ a root of the canonical bundle $K$ and $\bar M$ the associated U(1) principal bundle with connection form $A$, induced by the Levi-Civita connection on $K$. Then the following hold:
(i) There is a canonical metric $\bar g$ on $\bar M$ making the bundle projection $\pi : \bar M \to M$ into a Riemannian submersion with totally geodesic fibres, and satisfying $\bar\nabla_{X^*}V = J(X)^*$.
(ii) With respect to the metric $\bar g$, $V$ defines a regular Einstein–Sasakian structure on $\bar M$. The length of the fibres of the corresponding $S^1$-action is constant and equal to $\frac{4\pi r}{n+2}$.
(iii) The Spin^c structure on $M$ defined by $(L, A)$ induces a canonical spin structure on $\bar M$ and every spinor field $\psi$ on $M$ induces a projectable spinor field $\pi^*\psi$ on $\bar M$, satisfying
\[
\bar\nabla_{X^*}(\pi^*\psi) = \pi^*\Bigl(\nabla^A_X\psi - \frac{i}{2}\,J(X)\cdot\bar\psi\Bigr) \qquad \forall X \in TM,
\tag{22}
\]
\[
\bar\nabla_V\pi^*\psi = -\frac12\,\pi^*\Bigl(\Omega\cdot\psi + \frac{i(n+2)}{2r}\,\psi\Bigr).
\tag{23}
\]
4. Complex Contact Structures

Definition 4.1 (cf. [5]). Let $M^{2m}$ be a complex manifold of complex dimension $m = 2k+1$. A complex contact structure is a family $\mathcal{C} = \{(U_i, \omega_i)\}$ satisfying the following conditions:
(i) $\{U_i\}$ is an open covering of $M$.
(ii) $\omega_i$ is a holomorphic 1-form on $U_i$.
(iii) $\omega_i \wedge (\partial\omega_i)^k \in \Gamma(\Lambda^{m,0}M)$ is different from zero at every point of $U_i$.
(iv) $\omega_i = f_{ij}\,\omega_j$ in $U_i \cap U_j$, where $f_{ij}$ is a holomorphic function on $U_i \cap U_j$.
Let $\mathcal{C} = \{(U_i, \omega_i)\}$ be a complex contact structure. Then there exists an associated holomorphic line sub-bundle $L_{\mathcal{C}} \subset \Lambda^{1,0}(M)$ with transition functions $\{f_{ij}^{-1}\}$ and local sections $\omega_i$. It is easy to see that $\mathcal{D} := \{Z \in T^{1,0}M \mid \omega(Z) = 0,\ \forall\omega \in L_{\mathcal{C}}\}$ is a codimension 1 maximally non-integrable holomorphic sub-bundle of $T^{1,0}M$, and conversely, every such bundle defines a complex contact structure. From condition (iii) immediately follows the isomorphism $L_{\mathcal{C}}^{k+1} \cong K$, where $K = \Lambda^{m,0}(M)$ denotes the canonical bundle of $M$.

From now on, $M$ will denote a Kähler–Einstein manifold of odd complex dimension $m = 4l+1$ with positive scalar curvature, admitting a complex contact structure $\mathcal{C}$. The manifold $M$ is compact, by Myers' Theorem. By rescaling the metric on $M$ if necessary, we can suppose that the scalar curvature of $M$ is equal to $2m(2m+2)$, and thus the Ricci form $\rho$ and the Kähler form $\Omega$ are related by $\rho = (2m+2)\Omega$. The main objective of this section is to construct the analogues of Kählerian Killing spinors ([3, 4, 8]) for a certain Spin^c structure on $M$, determined by $\mathcal{C}$. This is done just as in [4]. The collection $(U_i, \omega_i \wedge (\partial\omega_i)^l)$ defines a holomorphic line bundle $L_l \subset \Lambda^{2l+1,0}M$, and from the definition of $\mathcal{C}$ we easily obtain
\[
L_l \cong L_{\mathcal{C}}^{l+1}.
\tag{24}
\]
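We now fix some $(U, \omega) \in \mathcal{C}$ and define a local section $\psi_{\mathcal{C}}$ of $\Lambda^{0,2l+1}M \otimes L_{\mathcal{C}}^{l+1}$ by
\[
\psi_{\mathcal{C}}|_U := |\xi_\tau|^{-2}\,\bar\tau \otimes \xi_\tau,
\tag{25}
\]
where $\tau := \omega \wedge (\partial\omega)^l$ and $\xi_\tau$ is the element corresponding to $\tau$ through the isomorphism (24). The fact that $\psi_{\mathcal{C}}$ does not depend on the element $(U, \omega) \in \mathcal{C}$ shows that it actually defines a global section $\psi_{\mathcal{C}}$ of $\Lambda^{0,2l+1}M \otimes L_{\mathcal{C}}^{l+1}$.

We now recall ([6], Appendix D) that $\Lambda^{0,*}M$ is just the spinor bundle associated to the canonical Spin^c structure on $M$, whose auxiliary line bundle is $K^{-1}$, so that $\Lambda^{0,*}M \otimes L_{\mathcal{C}}^{l+1}$ is actually the spinor bundle associated to the Spin^c structure on $M$ with auxiliary bundle
\[
L = K^{-1} \otimes L_{\mathcal{C}}^{2(l+1)} \cong L_{\mathcal{C}}^{-(2l+1)} \otimes L_{\mathcal{C}}^{2(l+1)} \cong L_{\mathcal{C}}.
\]
The section $\psi_{\mathcal{C}}$ is thus a spinor lying in $\Lambda^{0,2l+1}M \otimes L_{\mathcal{C}}^{l+1} \cong \Sigma_{2l+1}M$, so
\[
\Omega\cdot\psi_{\mathcal{C}} = -i\psi_{\mathcal{C}}.
\tag{26}
\]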
Proposition 4.1. The spinor field $\psi_{\mathcal{C}}$ satisfies $\nabla_Z\psi_{\mathcal{C}} = 0$, $\forall Z \in T^{1,0}M$ (in particular $D_-\psi_{\mathcal{C}} = 0$), and
\[
D^2\psi_{\mathcal{C}} = D_-D_+\psi_{\mathcal{C}} = \frac{l+1}{2l+1}\Bigl(\frac12\,R\,\psi_{\mathcal{C}} - i\rho\cdot\psi_{\mathcal{C}}\Bigr),
\tag{27}
\]
where $R$ is the scalar curvature of $M$.

Proof. This is actually a variant of Proposition 5 from [4], the only difference being that $\xi_\tau$ ($\Psi_\omega$ in their notation) is no longer a section of $K^{1/2}$, but of $K^{(l+1)/(2l+1)}$, so the coefficients $1/2$ in formulas (8) and (9) of [4] have to be replaced by $\frac{l+1}{2l+1}$.

Using (26), (27) and the fact that $\rho = \frac{1}{8l+2}\,R\,\Omega = (8l+4)\Omega$, we obtain

Corollary 4.1. The spinor field $\psi_{\mathcal{C}}$ is an eigenspinor of $D^2$ with respect to the eigenvalue $16l(l+1)$.
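For the reader's convenience, the arithmetic behind Corollary 4.1 can be spelled out as follows. This expansion is an added illustration that uses only (26), (27) and the normalisation $R = 2m(2m+2)$ with $m = 4l+1$ fixed above.

```latex
% m = 4l+1, so R = 2m(2m+2) = (8l+2)(8l+4) and \rho = (8l+4)\Omega;
% \Omega.\psi_C = -i\psi_C by (26).
\begin{align*}
 i\rho\cdot\psi_{\mathcal C}
   &= i(8l+4)\,\Omega\cdot\psi_{\mathcal C} = (8l+4)\,\psi_{\mathcal C},\\
 \tfrac12 R\,\psi_{\mathcal C} - i\rho\cdot\psi_{\mathcal C}
   &= \bigl((4l+1)(8l+4) - (8l+4)\bigr)\psi_{\mathcal C}
    = 4l\,(8l+4)\,\psi_{\mathcal C} = 16\,l\,(2l+1)\,\psi_{\mathcal C},\\
 D^2\psi_{\mathcal C}
   &= \frac{l+1}{2l+1}\cdot 16\,l\,(2l+1)\,\psi_{\mathcal C}
    = 16\,l\,(l+1)\,\psi_{\mathcal C}.
\end{align*}
```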
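Let us introduce some notations
\[
\psi_- := \psi_{\mathcal{C}} \in \Gamma(\Sigma_{2l+1}M), \qquad
\psi_+ := \frac{1}{4l+4}\,D\psi_{\mathcal{C}} \in \Gamma(\Sigma_{2l+2}M).
\tag{28}
\]
By integration over $M$ we immediately obtain from the above corollary,
\[
|\psi_-|^2_{L^2} = \frac{l+1}{l}\,|\psi_+|^2_{L^2}.
\tag{29}
\]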
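Proposition 4.2. The following relations hold:
\[
\nabla_Z\psi_- = 0, \qquad \forall Z \in T^{1,0}M,
\tag{30}
\]
\[
\nabla_{\bar Z}\psi_- + \bar Z\cdot\psi_+ = 0, \qquad \forall\bar Z \in T^{0,1}M,
\tag{31}
\]
\[
\nabla_{\bar Z}\psi_+ = 0, \qquad \forall\bar Z \in T^{0,1}M,
\tag{32}
\]
\[
\nabla_Z\psi_+ + Z\cdot\psi_- = 0, \qquad \forall Z \in T^{1,0}M.
\tag{33}
\]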
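Proof. The first relation is part of Proposition 4.1. In order to prove (31), let us consider the local frames of $T^{1,0}(M)$ and $T^{0,1}(M)$ introduced in Sect. 2: $Z_\alpha = \frac12(X_\alpha - iY_\alpha)$ and $Z_{\bar\alpha} = \frac12(X_\alpha + iY_\alpha)$, where $Y_\alpha = J(X_\alpha)$, and $\{X_\alpha, Y_\alpha\}$ is a local orthonormal frame of $TM$. From (30) we find $\nabla_{Z_{\bar\alpha}}\psi_- = \nabla_{X_\alpha}\psi_- = i\nabla_{Y_\alpha}\psi_-$, so using (6) and (28) gives
\begin{align*}
0 &\le \sum_{\alpha=1}^{m} |\nabla_{Z_{\bar\alpha}}\psi_- + Z_{\bar\alpha}\cdot\psi_+|^2\\
&= \sum_{\alpha=1}^{m} |\nabla_{X_\alpha}\psi_-|^2 - 2\,\Re e\sum_{\alpha=1}^{m} (\psi_+, Z_\alpha\cdot\nabla_{Z_{\bar\alpha}}\psi_-) - \sum_{\alpha=1}^{m} (\psi_+, Z_\alpha\cdot Z_{\bar\alpha}\cdot\psi_+)\\
&= \frac12\,|\nabla\psi_-|^2 - \Re e(\psi_+, D_+\psi_-) - \frac12\,(\psi_+, (-i\Omega - m)\psi_+)\\
&= \frac12\,|\nabla\psi_-|^2 - (4l+4)|\psi_+|^2 + \frac12\,(4l+4)|\psi_+|^2.
\end{align*}
The last expression is by construction a non-negative function, say $|F|^2$, on $M$. Integrating over $M$ and using the generalised Lichnerowicz formula ([6], Appendix D), Corollary 4.1 and (29), we obtain
\begin{align*}
|F|^2_{L^2} &= \frac12\,(\nabla^*\nabla\psi_-, \psi_-)_{L^2} - (4l+4)|\psi_+|^2_{L^2} + \frac12\,(4l+4)|\psi_+|^2_{L^2}\\
&= \frac12\Bigl(D^2\psi_- - \frac14\,R\,\psi_- + \frac{i}{2}\,\frac{1}{2l+1}\,\rho\cdot\psi_-,\ \psi_-\Bigr)_{L^2} - (2l+2)|\psi_+|^2_{L^2}\\
&= |\psi_-|^2_{L^2}\Bigl(8l(l+1) - \frac{(8l+2)(8l+4)}{8} + \frac{i}{4}\,\frac{-i(8l+4)}{2l+1} - 2l\Bigr) = 0,
\end{align*}
thus proving that $F = 0$ and consequently (31). In order to check the last two equations one has to make use of the operator $\tilde D$. From $D_-\psi_- = 0$ we find
\[
0 = \frac{1}{4l+4}\,D_+^2\psi_- = D_+\psi_+,
\tag{34}
\]
so
\[
\tilde D\psi_+ = -iD\psi_+.
\tag{35}
\]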
We take a local orthonormal frame $e_i$ and write (using (1), (5), (28) and (35))
\begin{align*}
0 &\le \sum_{j=1}^{n} \Bigl|\nabla_{e_j}\psi_+ + \frac12\,(e_j - iJ(e_j))\cdot\psi_-\Bigr|^2\\
&= |\nabla\psi_+|^2 - \Re e((D + i\tilde D)\psi_+, \psi_-) - \frac14\sum_{j=1}^{n} ((e_j + iJ(e_j))\cdot(e_j - iJ(e_j))\cdot\psi_-, \psi_-)\\
&= |\nabla\psi_+|^2 - 2\,\Re e(D\psi_+, \psi_-) + ((m - i\Omega)\cdot\psi_-, \psi_-)\\
&= |\nabla\psi_+|^2 - 8l|\psi_-|^2 + 4l|\psi_-|^2 =: |G|^2.
\end{align*}
Just as before, we compute the integral over $M$ of the non-negative function $|G|^2$, namely
\begin{align*}
|G|^2_{L^2} &= |\nabla\psi_+|^2_{L^2} - 4l|\psi_-|^2_{L^2}\\
&= (\nabla^*\nabla\psi_+, \psi_+)_{L^2} - 4l|\psi_-|^2_{L^2}\\
&= \Bigl(D^2\psi_+ - \frac14\,R\,\psi_+ + \frac{i}{2}\,\frac{1}{2l+1}\,\rho\cdot\psi_+,\ \psi_+\Bigr)_{L^2} - 4l|\psi_-|^2_{L^2}\\
&= |\psi_+|^2_{L^2}\Bigl(16l(l+1) - \frac{(8l+2)(8l+4)}{4} + \frac{i}{2}\,\frac{-3i(8l+4)}{2l+1} - 4(l+1)\Bigr) = 0,
\end{align*}
thus proving $G = 0$. Consequently $\nabla_X\psi_+ + \frac12(X - iJ(X))\cdot\psi_- = 0$, $\forall X \in TM$, which is equivalent to (32) and (33).

The above proposition motivates the following

Definition 4.2. A section $\psi$ of the spinor bundle of a given Spin^c structure on a Kähler manifold $(M^{8l+2}, g, J)$ satisfying
\[
\nabla^A_X\psi = \frac12\,X\cdot\psi + \frac{i}{2}\,JX\cdot\bar\psi, \qquad \forall X \in TM
\tag{36}
\]
is called a Kählerian Killing spinor.

Defining $\psi := \psi_+ - \psi_-$ we immediately obtain the

Corollary 4.2. Let $\mathcal{C}$ be a complex contact structure on a Kähler–Einstein manifold $(M^{8l+2}, g, J)$. Then the Spin^c structure on $M$ with auxiliary bundle $L_{\mathcal{C}}$ carries a Kählerian Killing spinor $\psi \in \Gamma(\Sigma_{2l+1}M \oplus \Sigma_{2l+2}M)$.

5. Geometric Consequences

We can now state the main application of the above results:

Theorem 5.1. Let $M$ be a compact Kähler manifold of positive scalar curvature and complex dimension $4l+1$. Then the following statements are equivalent:
(i) $M$ is the twistor space of some quaternionic Kähler manifold;
(ii) $M$ is Kähler–Einstein and admits a complex contact structure;
(iii) There exists a Spin^c structure on $M$ with auxiliary bundle $L$ and spinor bundle $\Sigma M$ such that $L^{\otimes(2l+1)} \cong \Lambda^{4l+1,0}M$ and $\Sigma M$ carries a Kählerian Killing spinor $\psi \in \Gamma(\Sigma_{2l+1}M \oplus \Sigma_{2l+2}M)$.
Proof. The implications (i)$\Rightarrow$(ii) and (ii)$\Rightarrow$(iii) follow directly from [13] and Corollary 4.2 respectively. Suppose now that (iii) holds. The proof of (iii)$\Rightarrow$(i) parallels that of [8]. We first show that $M$ is Kähler–Einstein. Let $\psi \in \Gamma(\Sigma_{2l+1}M \oplus \Sigma_{2l+2}M)$ be a spinor field on $M$ which satisfies (36). Taking the covariant derivative with respect to an arbitrary vector field $Y$ we obtain
\[
\nabla^A_Y\nabla^A_X\psi = \frac14\,(X\cdot Y + JX\cdot JY)\cdot\psi + \frac{i}{4}\,(X\cdot JY - JX\cdot Y)\cdot\bar\psi + \nabla^A_{\nabla_YX}\psi,
\tag{37}
\]
which easily implies
\[
R^A_{Y,X}\psi = \frac12\,(X\cdot Y + JX\cdot JY + 2g(X, Y))\cdot\psi - i\,g(X, JY)\,\bar\psi.
\tag{38}
\]
A local computation shows that the curvature operator $R^A$ on the spinor bundle is given by the formula
\[
R^A = R + \frac{i}{2}\,\omega,
\tag{39}
\]
where $i\omega := -\frac{i}{2l+1}\,\rho$ is the curvature form of the auxiliary bundle $L$, and
\[
R_{X,Y} = \frac12\sum_{j<k} R(X, Y, e_j, e_k)\,e_j\cdot e_k\cdot
\tag{40}
\]
in a local orthonormal frame $\{e_1, \ldots, e_n\}$. Using the first Bianchi identity for the curvature tensor one obtains ([2], p.16)
\[
\sum_i e_i\cdot R_{e_i,X} = \frac12\,\mathrm{Ric}(X),
\tag{41}
\]
so, by (39) and (41),
\[
\sum_j e_j\cdot R^A_{e_j,X}\psi = \sum_j e_j\cdot\Bigl(R_{e_j,X}\psi + \frac{i}{2}\,\omega(e_j, X)\psi\Bigr) = \frac12\,\mathrm{Ric}(X)\cdot\psi - \frac{i}{2}\,(X \lrcorner\, \omega)\cdot\psi.
\tag{42}
\]
On the other hand, a straightforward computation using (38) and the fact that $\psi \in \Gamma(\Sigma_{2l+1}M \oplus \Sigma_{2l+2}M)$ yields
\begin{align*}
\sum_j e_j\cdot R^A_{e_j,X}\psi &= (4l+2)X\cdot\psi + iJX\cdot\bar\psi + JX\cdot\Omega\cdot\psi\\
&= (4l+2)X\cdot\psi - 2iJX\cdot\psi,
\end{align*}
which, together with (42), gives
\[
\Bigl(\frac12\,\mathrm{Ric}(X) - (4l+2)X\Bigr)\cdot\psi = \frac{i}{2l+1}\,J\Bigl(\frac12\,\mathrm{Ric}(X) - (4l+2)X\Bigr)\cdot\psi.
\tag{43}
\]
As $\psi$ never vanishes, if the equality $A\cdot\psi = iB\cdot\psi$ holds for some real vectors $A$, $B$, then $|A| = |B|$. The above formula thus shows that $\mathrm{Ric}(X) = (8l+4)X$, $\forall X \in TM$, so $M$ is Kähler–Einstein with scalar curvature $R = (8l+2)(8l+4)$.
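The passage from (43) to $\mathrm{Ric}(X) = (8l+4)X$ can be written out in one line. The following is an added sketch in which the letters $A$, $B$ are the real vectors of the preceding sentence, not the connection form; it assumes $l \ge 1$, which is implicit in the text since (29) divides by $l$.

```latex
% Notation for this sketch only:
%   A := (1/2) Ric(X) - (4l+2) X,   B := (1/(2l+1)) J(A).
% Then (43) reads A.\psi = i B.\psi, so |A| = |B|.  Since J is an isometry,
\[
 |A| = |B| = \frac{|J(A)|}{2l+1} = \frac{|A|}{2l+1},
\]
% and for l >= 1 the factor 1/(2l+1) is strictly less than 1, which forces A = 0:
\[
 \mathrm{Ric}(X) = (8l+4)\,X, \qquad
 R = \operatorname{tr}\mathrm{Ric} = (8l+2)(8l+4) \quad (\dim_{\mathbb R} M = 8l+2).
\]
```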
From Theorem 3.1 we deduce that the principal U(1) bundle $\bar M$ associated to $L$ admits a canonical metric $\bar g$ and a canonical spin structure such that the spinor $\pi^*\psi$ induced by $\psi$ satisfies
\[
\bar\nabla_{X^*}(\pi^*\psi) = \pi^*\Bigl(\nabla^A_X\psi - \frac{i}{2}\,J(X)\cdot\bar\psi\Bigr) = \pi^*\Bigl(\frac12\,X\cdot\psi\Bigr) \qquad \forall X \in TM,
\tag{44}
\]
\[
\bar\nabla_V\pi^*\psi = -\frac12\,\pi^*\Bigl(\Omega\cdot\psi + \frac{i(8l+4)}{2(2l+1)}\,\psi\Bigr) = \pi^*\Bigl(\frac{i}{2}\,\bar\psi\Bigr),
\tag{45}
\]
and (10), (11) show that $\pi^*\psi$ is a Killing spinor on $\bar M$.

The spinor field $\pi^*\psi$ then induces a parallel spinor $\Psi$ on the cone $C\bar M$ over $\bar M$, which is a Kähler manifold (cf. [1, 8, 11]). Moreover, using (45) we can compute the action of the Kähler form of $C\bar M$ on $\Psi$, and obtain that $\Psi \in \Sigma_{2l+3}C\bar M$. From C. Bär's classification [1] we know that the restricted holonomy group of $C\bar M$ is one of the following: $SU(4l+2)$, $Sp(2l+1)$ or $0$. The fixed points of the spin representation of $SU(4l+2)$ lie in $\Sigma_0$ and $\Sigma_{4l+2}$, so as $\Psi$ is a parallel spinor in $\Sigma_{2l+3}C\bar M$, the restricted holonomy group of $C\bar M$ cannot be equal to $SU(4l+2)$. This implies that the universal covering of $C\bar M$ is hyperkähler, and thus that the universal covering of $\bar M$ is 3-Sasakian (see [1]).

Let us denote by $\bar M'$ the U(1) bundle associated to some maximal root of $L$. Using the Gysin exact sequence we deduce that $\bar M'$ is simply connected (see [2], p.85). Moreover, there exists a canonical covering projection $\bar M' \to \bar M$, thus proving that $\bar M'$ is the universal covering of $\bar M$. Consequently, $(\bar M', \bar g')$ is a 3-Sasakian manifold, where $\bar g'$ is the metric induced from $\bar g$ via the covering projection. On the other hand, the unit vertical vector field $V'$ on $\bar M'$ defines a Sasakian structure, since this is true for its projection $V$ on $\bar M$. It is well known that any Sasakian structure on a 3-Sasakian manifold $P^{4k-1}$ of non-constant sectional curvature belongs to the 2-sphere of Sasakian structures. Indeed, the cone $CP$ over $P$ has restricted holonomy $Sp(k)$, and since the centraliser of $Sp(k)$ in $U(2k)$ is just $Sp(1)$, every Kähler structure on $CP$ must belong to the 2-sphere of Kähler structures of $CP$, which is equivalent to our statement.

Now, $\bar M'$ is regular in the direction of $V'$, so an old result of Tanno implies that it is actually a regular 3-Sasakian manifold (cf. [14]). It is then well known that the quotient of $\bar M'$ by the corresponding SO(3) action is a quaternionic Kähler manifold of positive scalar curvature, say $N$, and that the twistor space over $N$ is biholomorphic to the quotient of $\bar M'$ by each of the $S^1$ actions given by the Sasakian vector fields, so in particular to $M$, which is the quotient of $\bar M'$ by the $S^1$ action generated by $V'$.

From Theorem A and Theorem 5.1 we immediately obtain the result of LeBrun mentioned in Sect. 1:

Corollary 5.1. Let $Z$ be a Fano contact manifold. Then $Z$ is a twistor space iff it admits a Kähler–Einstein metric.
References

1. Bär, C.: Real Killing spinors and holonomy. Commun. Math. Phys. 154, 509–521 (1993)
2. Baum, H., Friedrich, Th., Grunewald, R., Kath, I.: Twistor and Killing Spinors on Riemannian Manifolds. Stuttgart-Leipzig: Teubner-Verlag, 1991
3. Hijazi, O.: Eigenvalues of the Dirac operator on Compact Kähler Manifolds. Commun. Math. Phys. 160, 563–579 (1994)
4. Kirchberg, K.-D., Semmelmann, U.: Complex Contact Structures and the first Eigenvalue of the Dirac Operator on Kähler Manifolds. Geom. Anal. Funct. Anal. 5, 604–618 (1995)
5. Kobayashi, S.: Remarks on Complex Contact Manifolds. Proc. Am. Math. Soc. 10, 164–167 (1959)
6. Lawson, B., Michelsohn, M.-L.: Spin Geometry. Princeton: Princeton University Press, 1989
7. LeBrun, C.: Fano Manifolds, Contact Structures, and Quaternionic Geometry. Int. J. Math. 6, 419–437 (1995)
8. Moroianu, A.: La première valeur propre de l'opérateur de Dirac sur les variétés kählériennes compactes. Commun. Math. Phys. 169, 373–384 (1995)
9. Moroianu, A.: Opérateur de Dirac et submersions riemanniennes. Thesis, Ecole Polytechnique, 1996
10. Moroianu, A.: Spineurs et variétés de Hodge. Prépublications de l'Ecole Polytechnique 1126, 1996
11. Moroianu, A.: Parallel and Killing Spinors on Spin^c manifolds. Commun. Math. Phys. 187, 417–428 (1997)
12. Moroianu, A., Semmelmann, U.: Kählerian Killing Spinors, Complex Contact Structures and Twistor Spaces. C. R. Acad. Sci. Paris t. 323 Série I, 57–61 (1996)
13. Salamon, S.: Quaternionic Kähler Manifolds. Invent. Math. 67, 143–171 (1982)
14. Tanno, S.: Killing vectors on contact Riemannian manifolds and fiberings related to the Hopf fibrations. Tohoku Math. J. 23, 314–333 (1971)

Communicated by A. Connes
Commun. Math. Phys. 193, 675 – 711 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Equilibrium Measures for Coupled Map Lattices: Existence, Uniqueness and Finite-Dimensional Approximations
Miaohua Jiang$^{1,*}$, Yakov B. Pesin$^{2}$
$^1$ Center for Dynamical Systems and Nonlinear Studies, Georgia Institute of Technology, Atlanta, GA 30332, USA
$^2$ Department of Mathematics, The Pennsylvania State University, University Park, PA 16802, USA
Received: 9 May 1997 / Accepted: 24 September 1997
* Current address: IMA, University of Minnesota, 514 Vincent Hall, 206 Church St. SE, Minneapolis, MN 55455, USA.
Abstract: We consider coupled map lattices of hyperbolic type, i.e., chains of weakly interacting hyperbolic sets (attractors) over multi-dimensional lattices. We describe the thermodynamic formalism of the underlying spin lattice system and then prove existence, uniqueness, mixing properties, and exponential decay of correlations of equilibrium measures for a class of Hölder continuous potential functions with a sufficiently small Hölder constant. We also study finite-dimensional approximations of equilibrium measures in terms of lattice systems ($\mathbb{Z}$-approximations) and lattice spin systems ($\mathbb{Z}^d$-approximations). We apply our results to establish existence, uniqueness, and the mixing property of SRB-measures as well as to obtain the entropy formula.
Introduction
Coupled map lattices form a special class of infinite-dimensional dynamical systems. They were introduced by K. Kaneko [Ka] in 1983 as simple models with essential features of spatio-temporal chaos. These systems are built as weak interactions of identical local finite-dimensional subsystems at lattice points. Such systems have proven to be useful in studying qualitative properties of spatially extended dynamical systems. They can easily be simulated on a computer, and many remarkable results about coupled map lattices were obtained by researchers working in different areas of physics, biology, mathematics, and engineering.
Bunimovich and Sinai initiated the rigorous mathematical study of coupled map lattices in [BuSi]. They constructed special Sinai–Bowen–Ruelle (SRB)-measures for weakly coupled expanding circle maps (under some additional assumptions that the interaction is of finite range and preserves the unique fixed point of the map). SRB-measures are invariant under both space and time translations and have strong ergodic
properties including mixing, positive entropy, and exponential decay of correlations. From the physical point of view this is interpreted as evidence of spatio-temporal chaos. In [BK1–BK3], Bricmont and Kupiainen extended the results of Bunimovich and Sinai to general expanding circle maps. In [KK], Keller and Künzle studied the case when the local subsystems are piecewise smooth interval maps. A detailed survey can be found in [Bu].
The first attempt to analyze coupled map lattices with multidimensional local subsystems of hyperbolic type was made by Pesin and Sinai in [PS]. They constructed conditional distributions for the SRB-measure on unstable local manifolds assuming that the local subsystem possesses a hyperbolic attractor. In [J1, J2], Jiang considered the case when a local subsystem possesses a hyperbolic set and obtained some partial results on the existence and uniqueness of Gibbs distributions.
In this paper we extend these results and establish the existence and uniqueness of Gibbs distributions for arbitrary chains of weakly interacting hyperbolic sets. Our main tool of study is the thermodynamic formalism, which is applied to the lattice spin system of statistical mechanics associated with a given coupled map lattice. We point out that the lattice spin systems corresponding to coupled map lattices are of a special type and have not been studied in the framework of “classical” statistical mechanics until recently.
The study of Gibbs distributions for these special lattice spin systems required new and advanced techniques, which were developed in [JM] and [BK2, BK3]. In [JM], the authors considered two-dimensional lattice spin systems. Using polymer expansions of partition functions they found an explicit formula for Gibbs states in terms of the potentials.
They proved existence and uniqueness of Gibbs states for the special class of potentials arising from the corresponding coupled map lattices (which are generated by Hölder continuous functions with sufficiently small Hölder constant). They also established continuity of Gibbs states over such potentials. In [BK2, BK3], the authors considered general multidimensional lattice spin systems. Using expansions of the correlation functions they also established existence and uniqueness of the Gibbs states as well as the mixing property for the same type of potentials.
In this paper we include a detailed discussion of lattice spin systems and their relation to coupled map lattices. The appendix contains a concise description of polymer expansions. This makes the paper relatively self-contained and thus more accessible to specialists in dynamical systems who are not very familiar with this highly specialized area of statistical physics.
The paper is divided into five sections. In the first three sections we generalize results of [J1] on the topological structure of coupled map lattices of hyperbolic type. Our main result is that these systems are structurally stable (Theorem 1.1). This result allows us to obtain a complete description of topological properties of coupled map lattices of hyperbolic type as well as to construct their symbolic representations. When the interaction is short ranged and thus the coupling is exponentially weak, the conjugacy map allows one to use Markov partitions for the uncoupled map lattice to build Markov partitions for the coupled map lattice. This leads to a symbolic representation of the lattice system as a lattice spin system of statistical mechanics. In [JM] (see also [BK3]) the authors established uniqueness of Gibbs states and exponential decay of correlations for these lattice spin systems. We use their results (as well as results in [BK3]) to establish uniqueness and the exponential mixing property of equilibrium measures.
Our main result is Theorem 3.6.
In Sect. 4 we construct “natural” finite-dimensional approximations of equilibrium measures. There are two different types of approximations. One results from $\mathbb{Z}$-approximations by finite volumes in the lattice while the other is obtained from $\mathbb{Z}^{d+1}$-approximations by finite volumes in the lattice spin systems. Our main results are stated in Theorems 4.2 and 4.3.
In Sect. 5 we apply our results to establish the existence, uniqueness, and mixing property of SRB-measures for chains of weakly interacting hyperbolic attractors. We show that these measures are Gibbs states for Hölder continuous functions and we describe them in terms of their finite-dimensional approximations using lattice spin systems (see Theorem 5.1). One direct consequence of our construction of SRB-measures is a formula for the $\mathbb{Z}^{d+1}$-measure theoretic entropy (see Remark (5) in Sect. 5; see [J3] for the detailed proof). This generalizes the well-known formula for the entropy of SRB-measures in the finite-dimensional case.
1. Coupled Map Lattices
1.1. Definition of coupled map lattices. Let $M$ be a smooth compact Riemannian manifold and $f$ a $C^r$-map of $M$, $r \ge 1$. Let also $\mathbb{Z}^d$, $d \ge 1$, be the $d$-dimensional integer lattice. Set $\mathcal{M} = \otimes_{i\in\mathbb{Z}^d} M_i$, where $M_i$ are copies of $M$. The space $\mathcal{M}$ admits the structure of an infinite-dimensional Banach manifold with the Finsler metric induced by the Riemannian metric on $M$, i.e.,
$$\|\bar v\| = \sup_{i\in\mathbb{Z}^d} \|v_i\|. \tag{1.1}$$
The distance in $\mathcal{M}$ induced by the Finsler metric is given as follows:
$$\rho(\bar x, \bar y) = \sup_{i\in\mathbb{Z}^d} d(x_i, y_i), \tag{1.2}$$
where $\bar x = (x_i)$ and $\bar y = (y_i)$ are two points in $\mathcal{M}$ and $d$ is the Riemannian distance on $M$.
We define the direct product map on $\mathcal{M}$ by $F = \otimes_{i\in\mathbb{Z}^d} f_i$, where $f_i$ are copies of $f$. Consider a map $G$ on $\mathcal{M}$ which is $C^r$-close to the identity map id. Set $\Phi = F \circ G$. The map $G$ is said to be an interaction between points (space sites) of the lattice $\mathbb{Z}^d$ and the map $\Phi$ is said to be a perturbation of $F$. Iterates of the map $\Phi$ generate a $\mathbb{Z}$-action on $\mathcal{M}$ called time translations. We also consider the group action of the lattice $\mathbb{Z}^d$ on $\mathcal{M}$ by spatial translations $S^k$. Namely, for any $k \in \mathbb{Z}^d$ and any $\bar x = (x_i) \in \mathcal{M}$, we set $(S^k \bar x)_i = x_{i+k}$.
The pair of actions $(\Phi, S)$ on $\mathcal{M}$ is called a coupled map lattice generated by the local map $f$ and the interaction $G$. If $G$ commutes with the spatial translations $S^k$, i.e., $S^k \circ G = G \circ S^k$, we call $G$ spatial translation invariant. In this case the pair $(\Phi, S)$ generates a $\mathbb{Z}^{d+1}$-action on $\mathcal{M}$. If $G = \mathrm{id}$, the lattice is called uncoupled.
One can also define the perturbation in the form $\Phi = G \circ F$. If $F$ is invertible (and in what follows we will always assume this) the study of perturbations of such a form is equivalent to the study of perturbations in the previous form since $G \circ F = F \circ (F^{-1} \circ G \circ F)$ with $F^{-1} \circ G \circ F$ being close to the identity.
1.2. Coupled map lattices of hyperbolic type. We consider a special type of coupled map lattice assuming that the local map is hyperbolic. More precisely, let $U \subset M$ be an open set, $f : U \to M$ a $C^1$-diffeomorphism, and $\Lambda \subset U$ a closed invariant hyperbolic set for $f$. The latter means that the tangent bundle $T_\Lambda M$ over $\Lambda$ is split into two subbundles: $T_\Lambda M = E^s \oplus E^u$, where $E^s$ and $E^u$ are stable and unstable subspaces. They are both invariant under the differential $Df$, and for some $C > 0$ and $0 < \lambda < 1$,
$$\begin{aligned} \|Df^n v\| &\le C\lambda^n \|v\| \quad \text{for } n \ge 0,\ v \in E^s; \\ \|Df^{-n} w\| &\le C\lambda^n \|w\| \quad \text{for } n \ge 0,\ w \in E^u. \end{aligned} \tag{1.3}$$
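The estimates (1.3) can be checked by hand on a standard finite-dimensional example. The sketch below uses the Arnold cat map on the 2-torus, which is our illustrative choice and is not an example taken from the paper; the names `A`, `A_INV`, `LAM`, `E_S`, `E_U`, `growth` and `check_splitting` are all hypothetical. Because the differential is a constant symmetric matrix, the splitting is given by its eigenvectors and (1.3) holds with $C = 1$.

```python
import math

# Arnold cat map f(x, y) = (2x + y, x + y) mod 1 (illustrative choice only).
# Its differential at every point is the constant matrix A.
A = ((2.0, 1.0), (1.0, 1.0))
A_INV = ((1.0, -1.0), (-1.0, 2.0))  # inverse of A, since det A = 1

LAM = (3.0 - math.sqrt(5.0)) / 2.0          # contracting eigenvalue, 0 < LAM < 1
E_S = (1.0, (-1.0 - math.sqrt(5.0)) / 2.0)  # eigenvector spanning E^s
E_U = (1.0, (-1.0 + math.sqrt(5.0)) / 2.0)  # eigenvector spanning E^u


def apply(m, v):
    """Multiply the 2x2 matrix m by the vector v."""
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])


def norm(v):
    return math.hypot(v[0], v[1])


def growth(m, v, n):
    """Return ||m^n v|| / ||v||."""
    w = v
    for _ in range(n):
        w = apply(m, w)
    return norm(w) / norm(v)


def check_splitting(n_max=8, tol=1e-6):
    """Check both inequalities of (1.3) with C = 1 for n = 0, ..., n_max."""
    for n in range(n_max + 1):
        bound = LAM ** n * (1.0 + tol)
        if growth(A, E_S, n) > bound:       # forward contraction on E^s
            return False
        if growth(A_INV, E_U, n) > bound:   # backward contraction on E^u
            return False
    return True
```

In this example the two directions are orthogonal, so the angle between them is as large as possible; for a general hyperbolic set the splitting varies from point to point and the constant $C$ need not equal 1.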
The hyperbolic set $\Lambda$ is called locally maximal if there exists an open set $U \supset \Lambda$ such that $\Lambda = \bigcap_{n\in\mathbb{Z}} f^n(\bar U)$, where $\bar U$ is the closure of $U$. For any point $x$ in a hyperbolic set $\Lambda$ one can construct local stable and unstable manifolds defined by
$$\begin{aligned} V^s(x) &= \{y \in M : d(x, y) \le \epsilon,\ d(f^n(x), f^n(y)) \to 0,\ n \to +\infty\}; \\ V^u(x) &= \{y \in M : d(x, y) \le \epsilon,\ d(f^n(x), f^n(y)) \to 0,\ n \to -\infty\}. \end{aligned} \tag{1.4}$$
It is known that these submanifolds are as smooth as the map $f$.
The definition of hyperbolicity can easily be extended to diffeomorphisms of Banach manifolds. Suppose that $H$ is a $C^1$-diffeomorphism of an open set $U$ of a Banach manifold $N$ (endowed with a Finsler metric) and a set $\Delta \subset U$ is invariant under $H$ (note that $\Delta$ may not be compact). We say that $\Delta$ is hyperbolic if the tangent bundle $T_\Delta N$ over $\Delta$ admits a splitting $T_\Delta N = E^s \oplus E^u$ with the following properties:
1) $E^s$ and $E^u$ are invariant under the differential $DH$;
2) for any continuous sections $v$ valued in $E^s$ and $w$ valued in $E^u$ we have $\|DH^n v\| \le C\lambda^n \|v\|$ and $\|DH^{-n} w\| \le C\lambda^n \|w\|$, for some constants $C > 0$ and $0 < \lambda < 1$ independent of $v$ and $w$;
3) there exists $b > 0$ such that for any $z$ the angle between $E^s(z)$ and $E^u(z)$ is bounded away from zero, i.e.,
$$\inf\{\|\xi - \eta\| : \xi \in E^s(z),\ \eta \in E^u(z),\ \|\xi\| = \|\eta\| = 1\} \ge b. \tag{1.5}$$
Note that in the finite-dimensional case the last condition holds true automatically. It is easy to see that the map $F$ is hyperbolic in the above sense, i.e., it possesses an infinite-dimensional hyperbolic set $\Delta_F = \otimes_{i\in\mathbb{Z}^d} \Lambda_i$, where $\Lambda_i$ is a copy of $\Lambda$. Moreover, for each point $\bar x = (x_i) \in \Delta_F$ the tangent space $T_{\bar x}\mathcal{M}$ admits the splitting $T_{\bar x}\mathcal{M} = E^s(\bar x) \oplus E^u(\bar x)$, where the stable and unstable subspaces are
$$E^s(\bar x) = \otimes_{i\in\mathbb{Z}^d} E^s(x_i), \qquad E^u(\bar x) = \otimes_{i\in\mathbb{Z}^d} E^u(x_i). \tag{1.6}$$
Furthermore, for each point $\bar x = (x_i) \in \Delta_F$ the local stable and unstable manifolds passing through $\bar x$ are
$$V^s_F(\bar x) = \otimes_{i\in\mathbb{Z}^d} V^s_i(x_i), \qquad V^u_F(\bar x) = \otimes_{i\in\mathbb{Z}^d} V^u_i(x_i), \tag{1.7}$$
where $V^s_i(x_i)$ and $V^u_i(x_i)$ are the local stable and unstable manifolds at $x_i$ respectively. If the hyperbolic set $\Lambda$ is locally maximal, so is $\Delta_F$.
1.3. Short range maps. The goal of this paper is to investigate metric properties of coupled map lattices of hyperbolic type. In the finite-dimensional case one uses thermodynamic formalism (see [Bo, Ru]) to construct invariant measures and then studies the ergodicity of hyperbolic maps with respect to these measures. The extension of this formalism to the infinite-dimensional case faces some obstacles. The most crucial obstacle is noncompactness of the hyperbolic set $\Delta_F$. One of the ways to overcome this obstacle is to
introduce a new metric on $\mathcal{M}$ with respect to which the space becomes compact. This metric is known as a metric with weights and is defined as follows: given $0 < q < 1$ and $\bar x, \bar y \in \mathcal{M}$, we set
$$\rho_q(\bar x, \bar y) = \sup_{i\in\mathbb{Z}^d} q^{|i|}\, d(x_i, y_i), \tag{1.8}$$
where $|i| = |i_1| + |i_2| + \cdots + |i_d|$, $i = (i_1, i_2, \ldots, i_d) \in \mathbb{Z}^d$. For different $0 < q < 1$ the metrics $\rho_q$ induce the same compact (Tychonov) topology in $\mathcal{M}$.
Although working with $\rho_q$-metrics gives us some advantages in studying invariant measures for the maps $F$ and $\Phi$, it also introduces some new problems. For example, the set $\mathcal{M}$ is no longer a differential manifold and the maps $F$ and $\Phi$, while being continuous, need not be differentiable. In particular, the set $\Delta_F$, being compact, is no longer hyperbolic in the above sense but only in some weak sense. More precisely, this set is topologically hyperbolic, i.e., for every point in $\Delta_F$ the local stable and unstable manifolds (1.7) are, in general, only continuous (not smooth).
We will restrict the class of perturbations in order to keep track of the hyperbolic behavior of trajectories for the perturbation map $\Phi$. More precisely, we consider the special class of perturbations called short range maps. The concept of short range maps was introduced by Bunimovich and Sinai in [BuSi] and was further developed by Pesin and Sinai in [PS] (see also [KK]). We follow their approach.
Let $Y$ be a subset of $\mathcal{M}$ and $G : Y \to \mathcal{M}$ a map. We say that $G$ is short ranged if $G$ is of the form $G = (G_i)_{i\in\mathbb{Z}^d}$, where $G_i : Y \to M_i$ satisfy the following condition: for any fixed $k \in \mathbb{Z}^d$ and any points $\bar x = (x_j)$, $\bar y = (y_j) \in Y$ with $x_j = y_j$ for all $j \in \mathbb{Z}^d$, $j \ne k$, we have
$$d(G_i(\bar x), G_i(\bar y)) \le C\theta^{|i-k|}\, d(x_k, y_k), \tag{1.9}$$
where $C$ and $\theta$ are constants and $C > 0$, $0 < \theta < 1$. We call $\theta$ the decay constant of $G$. If $G$ is spatial translation invariant then $G$ can be shown to be short ranged with a decay constant $\theta$ if and only if
$$d(G_0(\bar x), G_0(\bar y)) \le C\theta^{|k|}\, d(x_k, y_k), \tag{1.10}$$
for any $\bar x = (x_j)$, $\bar y = (y_j) \in Y$ with $x_j = y_j$ for all $j \in \mathbb{Z}^d$, $j \ne k$.
In the following Propositions 1.1–1.3 we collect some basic properties of short range maps. The proofs can be found in [J1].
Proposition 1.1. Let $G$ be a $C^1$-diffeomorphism of an open set $U \subset \mathcal{M}$ onto its image. Assume that $G$ is short ranged with a decay constant $\theta$. Then
1) the differential of $G$ at every point $\bar x$, $D_{\bar x}G : T_{\bar x}\mathcal{M} \to T_{\bar x}\mathcal{M}$, is a short range linear map with the same decay constant $\theta$;
2) the bundle map $DG$ is short ranged with the same decay constant $\theta$.
Moreover, if the map $G$ is continuous with respect to a $\rho_q$-metric then either of statements (1) or (2) implies that $G$ is short ranged.
Proposition 1.2. For any $0 < \theta < 1$, there exists $\epsilon > 0$ such that if $G : \mathcal{M} \to \mathcal{M}$ is a short range $C^{1+\alpha}$-diffeomorphism with the decay constant $\theta$ and $\mathrm{dist}_{C^1}(G, \mathrm{id}) < \epsilon$ then $G^{-1}$ is also a short range map.
Short range maps are well adapted to the metric structure of $\mathcal{M}$ generated by $\rho_q$-metrics, as the following result shows.
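Before turning to that result, here is a minimal numerical sketch of the short range condition (1.9). Everything in it is an assumption made for illustration: the lattice is replaced by a finite window of $\mathbb{Z}$ (so $d = 1$), each site carries a real number with distance $|x - y|$ instead of a point of a compact manifold, and the coupling `interaction` with its parameters `eps` and `theta` is a hypothetical choice, not the interaction studied in the paper.

```python
import math


def interaction(x, eps=0.05, theta=0.5):
    """Hypothetical coupling on a finite window of Z:
    G_i(x) = x_i + eps * sum over k != i of theta^|i-k| * sin(x_k)."""
    n = len(x)
    return [
        x[i] + eps * sum(theta ** abs(i - k) * math.sin(x[k])
                         for k in range(n) if k != i)
        for i in range(n)
    ]


def satisfies_short_range(x, k, new_value, eps=0.05, theta=0.5, c=1.0, slack=1e-12):
    """Check (1.9) for two configurations that differ only at site k.

    The constant c = 1 works for this coupling because |sin a - sin b| <= |a - b|
    and eps < 1; slack absorbs floating point rounding.
    """
    y = list(x)
    y[k] = new_value
    gx = interaction(x, eps, theta)
    gy = interaction(y, eps, theta)
    dk = abs(x[k] - y[k])
    return all(
        abs(gx[i] - gy[i]) <= c * theta ** abs(i - k) * dk + slack
        for i in range(len(x))
    )
```

For example, `satisfies_short_range([0.1 * i for i in range(11)], 5, 1.7)` perturbs the middle site of an 11-site window and checks that the response at site $i$ decays at least like $\theta^{|i-5|}$.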
Proposition 1.3. 1) Let $G : \mathcal{M} \to \mathcal{M}$ be a short range map with a decay constant $\theta$. Then $G$ is Lipschitz continuous as a map from $(\mathcal{M}, \rho_q)$ into itself for any $q > \theta$.
2) If $G$ is a Lipschitz continuous map from $(\mathcal{M}, \rho_q)$ to $(\mathcal{M}, \rho_{q_1})$, with some $0 < q_1 < 1$, then $G$ is short ranged with the decay constant $\theta = q$.
3) For any $\epsilon > 0$ and $0 < \theta < q < 1$, there exists $\delta > 0$ such that if $G$ is a $C^{1+\alpha}$ spatial translation invariant short range map of $\mathcal{M}$ with the decay constant $\theta$ and $\mathrm{dist}_{C^1}(G, \mathrm{id}) \le \delta$, then $G$ is Lipschitz continuous in the $\rho_q$-metric with a Lipschitz constant $L \le 1 + \epsilon$.
1.4. Structural stability. We consider the problem of structural stability of coupled map lattices of hyperbolic type $(\mathcal{M}, F)$. It is well known that finite-dimensional hyperbolic dynamical systems are structurally stable (see for example, [KH, Sh]) and so are hyperbolic maps of Banach manifolds which admit a partition of unity (see [Lang]). We stress that the Banach manifold $\mathcal{M} = \otimes_{i\in\mathbb{Z}^d} M_i$ does not admit a partition of unity and this result cannot be applied directly. In order to study structural stability we will exploit the special structure of the system $(\mathcal{M}, F)$ as the direct product of countably many copies of the same finite-dimensional dynamical system $(M, f)$. This enables us to establish structural stability by modifying arguments from the proof in the finite-dimensional case. From now on we always assume that the interaction $G$ is short ranged.
Theorem 1.1. 1) For any $\epsilon > 0$ there exists $0 < \delta < \delta_0$ such that, if $\mathrm{dist}_{C^1}(\Phi, F) \le \delta$, then there is a unique homeomorphism $h : \Delta_F \to \mathcal{M}$ satisfying $\Phi \circ h = h \circ F|_{\Delta_F}$ with $\mathrm{dist}_{C^0}(h, \mathrm{id}) \le \epsilon$. In particular, the set $\Delta_\Phi = h(\Delta_F)$ is hyperbolic and locally maximal.
2) For any $0 < \theta < 1$ there exists $\delta > 0$ such that if $G$ is a $C^2$ spatial translation invariant short range map with a decay constant $\theta$ and $\mathrm{dist}_{C^1}(G, \mathrm{id}) \le \delta$, then the conjugacy map $h$ is Hölder continuous with respect to the metric $\rho_q$, $0 < q < 1$.
Moreover, $h = (h_i(\bar x))_{i\in\mathbb{Z}^d}$ satisfies the following property:
$$d(h_0(\bar x), h_0(\bar y)) \le C(\delta)\, d^{\alpha}(x_k, y_k) \tag{1.11}$$
for every $k \ne 0$ and any $\bar x, \bar y \in \mathcal{M}$ with $x_i = y_i$, $i \in \mathbb{Z}^d$, $i \ne k$, where $0 < \alpha < 1$ and $C(\delta) > 0$ is a constant. Furthermore, $C(\delta) \to 0$ as $\mathrm{dist}_{C^1}(G, \mathrm{id}) \to 0$.
Proof. We describe the main steps of the proof of Statement 1, recalling those arguments that will be used below (detailed arguments can be found in [J1]). Let $U(\Delta_F)$ be an open neighborhood of $\Delta_F$ and $C^0(\Delta_F, U(\Delta_F))$ the space of all continuous maps from $\Delta_F$ to $U(\Delta_F)$. Consider the map
$$\mathcal{G} : C^0(\Delta_F, U(\Delta_F)) \to C^0(\Delta_F, \mathcal{M}) \tag{1.12}$$
defined by $\beta \mapsto \Phi \circ \beta \circ F^{-1}$. We wish to show that $\mathcal{G}$ has a unique fixed point near the identity map.
Let $\Gamma^0(\Delta_F, T\mathcal{M})$ be the space of all continuous vector fields on $\Delta_F$. We denote by $I$ the identity embedding of $\Delta_F$ into $\mathcal{M}$, by $B_\gamma(I)$ the ball in $C^0(\Delta_F, U(\Delta_F))$ centered at $I$ of radius $\gamma$, and by $A : B_\gamma(I) \to \Gamma^0(\Delta_F, T\mathcal{M})$ the map that is defined as follows:
$$A\beta(\bar y) = \bigl(\exp^{-1}_{y_i} \beta_i(\bar y)\bigr)_{i\in\mathbb{Z}^d}. \tag{1.13}$$
When $\gamma$ is small, $A$ is a homeomorphism onto the ball $D_\gamma(0)$ in $\Gamma^0(\Delta_F, T\mathcal{M})$ centered at the zero section 0 of radius $\gamma$. Set
$$\mathcal{G}' = A \circ \mathcal{G} \circ A^{-1} : D_\gamma(0) \to \Gamma^0(\Delta_F, T\mathcal{M}). \tag{1.14}$$
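If a section $v \in D_\gamma(0)$ is a fixed point of $\mathcal{G}'$, then $A \circ \mathcal{G} \circ A^{-1} v = v$, and hence the preimage of $v$, $A^{-1}v \in B_\gamma(I)$, is a fixed point of $\mathcal{G}$. To show that $\mathcal{G}'$ has a fixed point in $D_\gamma(0)$ we want to prove that the following equation has a unique solution $v$ in $D_\gamma(0)$:
$$-\bigl((D\mathcal{G}')|_0 - \mathrm{Id}\bigr)^{-1}\bigl(\mathcal{G}' v - (D\mathcal{G}')|_0 v\bigr) = v. \tag{1.15}$$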
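That a solution of (1.15) is the same thing as a fixed point of $\mathcal{G}'$ is a one-line computation, spelled out here for convenience. The shorthand $L = (D\mathcal{G}')|_0$ is ours, and the computation assumes that $L - \mathrm{Id}$ is invertible, which is part of what the proof goes on to establish.

```latex
\begin{aligned}
-(L - \mathrm{Id})^{-1}\bigl(\mathcal{G}' v - L v\bigr) = v
  &\iff \mathcal{G}' v - L v = -(L - \mathrm{Id})\, v \\
  &\iff \mathcal{G}' v - L v = v - L v \\
  &\iff \mathcal{G}' v = v .
\end{aligned}
```

The rewriting is of Newton type: it trades the fixed point problem for $\mathcal{G}'$ for one in which the linear part at the zero section has been removed, so that what remains can be made small on a small ball.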
Note that $\Gamma^0(\Delta_F, T\mathcal{M})$ is a Banach space and the map $\mathcal{G}'$ is differentiable in $D_\gamma(0)$. In fact, $D\mathcal{G}'$ is Lipschitz in $v$ since the exponential map and its inverse are both smooth. Since the map $G$ is short ranged, so are the maps $\mathcal{G}'$ and $(D\mathcal{G}')|_0$. Therefore, we can use weak$^*$ bases to represent $(D\mathcal{G}')$ in a matrix form. This enables one to readily reproduce the arguments in [KH] (see Lemma 18.1.4) and, exploiting hyperbolicity of $F$, to show that:
1) the operator $-((D\mathcal{G}')|_0 - \mathrm{Id})^{-1}$ is bounded;
2) the map $K : D_\gamma(0) \to \Gamma^0(\Delta_F, T\mathcal{M})$ defined by
$$K v = -\bigl((D\mathcal{G}')|_0 - \mathrm{Id}\bigr)^{-1}\bigl(\mathcal{G}' v - (D\mathcal{G}')|_0 v\bigr) \tag{1.16}$$
is contracting in a smaller ball $D_{\gamma_0}(0) \subset D_\gamma(0) \subset \Gamma^0(\Delta_F, T\mathcal{M})$;
3) $K(D_{\gamma_0}(0)) \subset D_{\gamma_0}(0)$.
Thus, $K$ has a unique fixed point in $D_{\gamma_0}(0)$.
We now proceed with Statement 2 of the theorem. In order to establish (1.11) we need to show that the section $v$ has such a property. Let $w$ be a section satisfying (1.11). Since the map $K$ is short ranged and sufficiently close to an uncoupled contracting map, it is straightforward to verify that the section $Kw$ also satisfies (1.11). Since the map $G$ is spatial translation invariant, so is $h$. The Hölder continuity of $h$ was proved in [J1] by showing that stable and unstable manifolds for $\Phi$ vary Hölder continuously in the $\rho_q$-metric. In Sect. 5, we describe finite-dimensional approximations for $h$ which can also be used to establish an alternative proof of the Hölder continuity.
The hyperbolicity of the map $\Phi|_{\Delta_\Phi}$ enables one to establish the following topological properties of this map:
1) the manifolds $V^s_\Phi(h(\bar x)) = h(V^s_F(\bar x))$ and $V^u_\Phi(h(\bar x)) = h(V^u_F(\bar x))$ are local stable and unstable manifolds for $\Phi$. They are infinite-dimensional submanifolds of $\mathcal{M}$ and are transversal in the sense that the distance between their tangent bundles is bounded away from 0.
2) stable and unstable manifolds for $\Phi$ constitute a local product structure of the set $\Delta_\Phi$. This means that there exists a constant $\delta$ such that for any $\bar x, \bar y \in \Delta_\Phi$ with $\rho(\bar x, \bar y) < \delta$, the intersection $V^s_\Phi(\bar x) \cap V^u_\Phi(\bar y)$ consists of a single point which belongs to $\Delta_\Phi$.
Furthermore, in [J1] the author proved the following result.
Theorem 1.2. If the map $f|_\Lambda$ is topologically mixing then so is the map $\Phi|_{\Delta_\Phi}$.
Although the space $\mathcal{M}$ equipped with the $\rho_q$-metric is not a Banach manifold and the maps $F$ and $\Phi$ are not differentiable, Theorem 1.1 allows one to keep track of the hyperbolic properties of these maps. More precisely, the following statements hold:
1) The local stable and unstable manifolds are Lipschitz continuous with respect to the $\rho_q$-metric. The map $\Phi$ is uniformly contracting on stable manifolds and the map $\Phi^{-1}$ is uniformly contracting on unstable manifolds. The contracting coefficients can be estimated from above by $(1 + \epsilon)\lambda$ with $\epsilon$ arbitrarily small.
2) The local stable and unstable manifolds are transversal in the $\rho_q$-metric in the following sense: for any points $\bar x$, $\bar y \in V^s_\Phi(\bar x)$, and $\bar z \in V^u_\Phi(\bar x)$,
$$\rho_q(\bar x, \bar y) + \rho_q(\bar x, \bar z) \le C\rho_q(\bar y, \bar z), \tag{1.17}$$
where $C$ is a constant depending only on the size of local stable and unstable manifolds and the number $q$.
The first property was originally proved in [PS] based upon the graph transform technique. The second property was established in [J2]. These properties allow one to say that the map $\Phi$ is “topologically hyperbolic”.
2. Existence of Equilibrium Measures
Let $\Omega$ be a compact metric space and $\tau$ a $\mathbb{Z}^{d+1}$-action on $\Omega$ induced by $d + 1$ commuting homeomorphisms, $d \ge 0$. Let also $\mathcal{U} = \{U_i\}$ be a cover of $\Omega$. For a finite set $X \subset \mathbb{Z}^{d+1}$ define
$$\mathcal{U}^X = \bigvee_{x\in X} \tau^{-x}\mathcal{U}. \tag{2.1}$$
Denote by $|X|$ the cardinality of the set $X$.
The action $\tau$ is said to be expansive if there exists $\epsilon > 0$ such that for any $\xi, \eta \in \Omega$, $d(\tau^x \xi, \tau^x \eta) \le \epsilon$ for all $x \in \mathbb{Z}^{d+1}$ implies $\xi = \eta$.
A Borel measure $\mu$ on $\Omega$ is said to be $\tau$-invariant if $\mu$ is invariant under all $d + 1$ homeomorphisms. We denote the set of all $\tau$-invariant measures on $\Omega$ by $I(\Omega)$.
Let $\mu \in I(\Omega)$ and $\mathcal{U} = \{U_i\}$ be a finite Borel partition of $\Omega$. Define
$$H(\mu, \mathcal{U}) = -\sum_i \mu(U_i) \log \mu(U_i), \tag{2.2}$$
and then set
$$h_\tau(\mu, \mathcal{U}) = \lim_{a_1,\ldots,a_{d+1}\to\infty} \frac{1}{|X(a)|} H(\mu, \mathcal{U}^{X(a)}) = \inf_a \frac{1}{|X(a)|} H(\mu, \mathcal{U}^{X(a)}), \tag{2.3}$$
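where $X(a) = \{(i_1, \ldots, i_{d+1}) \in \mathbb{Z}^{d+1} : |i_k| \le a_k,\ k = 1, \ldots, d + 1\}$ for $a = (a_1, \ldots, a_{d+1})$, $a_k > 0$. The (measure-theoretic) entropy of $\mu$ is defined to be
$$h_\tau(\mu) = \sup_{\mathcal{U}} h_\tau(\mu, \mathcal{U}) = \lim_{\mathrm{diam}\,\mathcal{U}\to 0} h_\tau(\mu, \mathcal{U}), \tag{2.4}$$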
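where $\mathrm{diam}\,\mathcal{U} = \max_i(\mathrm{diam}\,U_i)$.
Let $\mathcal{U}$ be a finite open cover of $\Omega$, $\varphi$ a continuous function on $\Omega$, and $X$ a finite subset of $\mathbb{Z}^{d+1}$. Define
$$Z_X(\varphi, \mathcal{U}) = \min_{\{B_j\}} \Bigl\{ \sum_j \exp \inf_{\xi\in B_j} \sum_{x\in X} \varphi(\tau^x \xi) \Bigr\}, \tag{2.5}$$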
where the minimum is taken over all subcovers $\{B_j\}$ of $\mathcal{U}^X$. Set
$$P_\tau(\varphi, \mathcal{U}) = \limsup_{a_1,\ldots,a_{d+1}\to\infty} \frac{1}{|X(a)|} \log Z_{X(a)}(\varphi, \mathcal{U}). \tag{2.6}$$
The quantity
$$P_\tau(\varphi) = \lim_{\mathrm{diam}\,\mathcal{U}\to 0} P_\tau(\varphi, \mathcal{U}) = \sup_{\mathcal{U}} P_\tau(\varphi, \mathcal{U}) \tag{2.7}$$
is called the topological pressure of $\varphi$ (one can show that the limit in (2.7) exists).
For any continuous function $\varphi$ the variational principle of statistical mechanics claims that
$$P_\tau(\varphi) = \sup_{\nu\in I(\Omega)} \Bigl( h_\tau(\nu) + \int \varphi\, d\nu \Bigr). \tag{2.8}$$
A measure $\mu \in I(\Omega)$ is called an equilibrium measure for $\varphi$ with respect to a $\mathbb{Z}^{d+1}$-action $\tau$ if
$$P_\tau(\varphi) = h_\tau(\mu) + \int \varphi\, d\mu. \tag{2.9}$$
In [Ru], Ruelle shows that expansiveness of a $\mathbb{Z}^{d+1}$-action implies the upper semi-continuity of the metric entropy $h_\tau(\mu)$ with respect to $\mu$. Therefore, it also implies the existence of equilibrium measures for continuous functions.
For uncoupled map lattices one can easily check that the action $(F, S)$ is expansive on $\Delta_F$ in the $\rho_q$-metric. The expansiveness of the action $(\Phi, S)$ on $\Delta_\Phi$ is a direct consequence of the structural stability (see Theorem 1.1). Thus, we have the following result.
Theorem 2.1. Let $\tau = (\Phi, S)$ be a $\mathbb{Z}^{d+1}$-action on $\Delta_\Phi$, where $\Phi = F \circ G$ and $G$ is short ranged, spatial translation invariant, and sufficiently $C^1$-close to the identity. Then for any $0 < q < 1$ and any continuous function $\varphi$ on $(\Delta_\Phi, \rho_q)$, there exists an equilibrium measure $\mu_\varphi$ for $\varphi$ with respect to $\tau$. The measure $\mu_\varphi$ does not depend on $q$.
While this theorem guarantees the existence of equilibrium measures for continuous functions (with respect to $\rho_q$-metrics), it does not tell us anything about uniqueness and ergodic properties of these measures. One can show that uniqueness of equilibrium measures implies their ergodicity (see [Mañé]) and usually some stronger ergodic properties (mixing, etc.). Ruelle [Ru] obtained the following general result about uniqueness which is a direct consequence of the convexity of the topological pressure on the Banach space $C^0(\Delta_\Phi)$ of all continuous functions in a $\rho_q$-metric.
Theorem 2.2. Assume that the map $f$ is topologically mixing. Then for a residual set of (continuous) functions in $C^0(\Delta_\Phi)$, the corresponding equilibrium measures are unique.
3. Uniqueness of Equilibrium Measures
Ruelle's theorem does not specify the class of functions for which the uniqueness takes place. In this section we establish uniqueness for Hölder continuous functions with sufficiently small Hölder constant.
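As a sanity check on the variational principle (2.8) and the definition (2.9) from Sect. 2, consider the simplest situation, which is our illustrative assumption and not a case treated in the paper: the full shift on $m$ symbols over any lattice, with a potential that depends only on the symbol at the origin. In that case the pressure equals $\log \sum_a e^{\varphi(a)}$ and the product measure with one-site weights $p_a \propto e^{\varphi(a)}$ attains the supremum. The sketch below compares $h + \int \varphi$ only among product measures; the function names are hypothetical.

```python
import math


def pressure(phi):
    """Pressure of a one-site potential on the full shift: log of sum of exp(phi[a])."""
    return math.log(sum(math.exp(v) for v in phi))


def gibbs_weights(phi):
    """One-site marginal of the equilibrium (product) measure: p_a proportional to exp(phi[a])."""
    z = sum(math.exp(v) for v in phi)
    return [math.exp(v) / z for v in phi]


def free_energy(p, phi):
    """Entropy per site plus the integral of phi, for the product measure with marginal p."""
    entropy = -sum(pi * math.log(pi) for pi in p if pi > 0.0)
    return entropy + sum(pi * v for pi, v in zip(p, phi))
```

With `phi = [0.3, -1.0, 2.0]`, `free_energy(gibbs_weights(phi), phi)` agrees with `pressure(phi)` up to rounding, while any other one-site marginal gives a strictly smaller value. This is the finite-alphabet form of (2.9).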
Our main tool is the thermodynamic formalism applied to symbolic models corresponding to the coupled map lattices.
3.1. Markov partitions and symbolic representations. One of the main manifestations of Structural Stability Theorem 1.1 is that the conjugacy map $h$ is continuous in the $\rho_q$-metric and is even Hölder continuous. Therefore, the study of existence, uniqueness, and
ergodic properties of an equilibrium measure $\mu_\varphi$ corresponding to a (Hölder) continuous function $\varphi$ on $\Delta_\Phi$ for the perturbed map $\Phi$ is equivalent to the study of these properties for the equilibrium measure $\mu_{\varphi\circ h}$ for the unperturbed map $F$.
We shall assume that $f$ is topologically mixing and the hyperbolic set $\Lambda$ is locally maximal. For any $\epsilon > 0$ there exists a Markov partition of $\Lambda$ of “size” $\epsilon$. This means that $\Lambda$ is the union of sets $R_i$, $i = 1, \ldots, m$ satisfying:
1) each set $R_i$ is a “rectangle”, i.e., for any $x, y \in R_i$ the intersection of the local stable and unstable manifolds $V^s(x) \cap V^u(y)$ is a single point which lies in $R_i$;
2) $\mathrm{diam}\,R_i < \epsilon$ and $R_i$ is the closure of its interior;
3) $R_i \cap R_j = \partial R_i \cap \partial R_j$, where $\partial R_i$ denotes the boundary of $R_i$;
4) if $x \in R_i$ and $f(x) \in \mathrm{int}\,R_j$, then $f(V^s(x, R_i)) \subset V^s(f(x), R_j)$; if $x \in R_i$ and $f^{-1}(x) \in \mathrm{int}\,R_j$, then $f^{-1}(V^u(x, R_i)) \subset V^u(f^{-1}(x), R_j)$; here $V^s(x, R_i) = V^s(x) \cap R_i$ and $V^u(x, R_i) = V^u(x) \cap R_i$.
The transfer matrix $A = (a_{ij})_{1\le i,j\le m}$ associated with the Markov partition is defined as follows: $a_{ij} = 1$ if $f(\mathrm{int}\,R_i) \cap \mathrm{int}\,R_j \ne \emptyset$ and $a_{ij} = 0$ otherwise. Let $(\Sigma_A, \sigma)$ be the associated subshift of finite type (where $\sigma$ denotes the shift). For each $\xi \in \Sigma_A$ the set $\bigcap_{n=-\infty}^{\infty} f^{-n}(R_{\xi(n)})$ contains a single point. The coding map $\pi : \Sigma_A \to \Lambda$ defined by $\pi\xi = \bigcap_{n=-\infty}^{\infty} f^{-n}(R_{\xi(n)})$ is a semi-conjugacy between $f$ and $\sigma$, i.e., $f \circ \pi = \pi \circ \sigma$.
We consider $\Sigma_A^{\mathbb{Z}^d}$ as a subset of the direct product $\Omega^{\mathbb{Z}^{d+1}}$, where $\Omega = \{1, 2, \ldots, m\}$. The elements will be denoted by $\bar\xi = (\bar\xi(i, j))_{i\in\mathbb{Z}^d, j\in\mathbb{Z}}$, or sometimes by $\bar\xi = (\xi_i(j))_{i\in\mathbb{Z}^d, j\in\mathbb{Z}}$. This symbolic space is endowed with the distance
$$\rho_q(\bar\xi, \bar\eta) = \sup_{(i,j)\in\mathbb{Z}^{d+1}} q^{|i|+|j|}\, |\bar\xi(i, j) - \bar\eta(i, j)|, \tag{3.1}$$
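which is compatible with the product topology. Let $\sigma_t$ and $\sigma_s$ be the time and space translations on $\Sigma_A^{\mathbb{Z}^d}$ defined as follows: for $\bar\xi = (\xi_i) \in \Sigma_A^{\mathbb{Z}^d}$, $\xi_i = \xi_i(\cdot) \in \Sigma_A$,
$$(\sigma_t^k \bar\xi)_i(j) = \xi_i(j + k),\ k \in \mathbb{Z}; \qquad (\sigma_s^k \bar\xi)_i = \xi_{i+k},\ k \in \mathbb{Z}^d. \tag{3.2}$$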
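The definitions (3.1) and (3.2) are easy to experiment with. The sketch below takes $d = 1$, represents a configuration as a Python function of the pair $(i, j)$, ignores the admissibility constraint imposed by the transfer matrix (that is, it assumes the full shift), and truncates the supremum in (3.1) to a finite window, which gives a lower bound for $\rho_q$ in general. The sample configurations `xi` and `eta` and all function names are hypothetical.

```python
def sigma_t(conf, k):
    """Time translation from (3.2): (sigma_t^k conf)_i(j) = conf_i(j + k)."""
    return lambda i, j: conf(i, j + k)


def sigma_s(conf, k):
    """Space translation from (3.2) with d = 1: (sigma_s^k conf)_i = conf_{i + k}."""
    return lambda i, j: conf(i + k, j)


def rho_q(conf_a, conf_b, q=0.5, window=6):
    """Supremum in (3.1) restricted to |i|, |j| <= window."""
    return max(
        q ** (abs(i) + abs(j)) * abs(conf_a(i, j) - conf_b(i, j))
        for i in range(-window, window + 1)
        for j in range(-window, window + 1)
    )


def xi(i, j):
    """Sample configuration with symbols in {1, 2, 3}."""
    return 1 + (i * i + 3 * j) % 3


def eta(i, j):
    """Same as xi except for a changed symbol at the single site (2, -1)."""
    if (i, j) == (2, -1):
        return xi(i, j) % 3 + 1
    return xi(i, j)
```

Two configurations that differ only far from the origin are close in $\rho_q$, and the two translations commute, which is what makes $(\sigma_t, \sigma_s)$ a $\mathbb{Z}^{d+1}$-action.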
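We define the coding map $\bar\pi = \otimes_{i\in\mathbb{Z}^d} \pi : \Sigma_A^{\mathbb{Z}^d} \to \Delta_F$. It is a semi-conjugacy between the uncoupled map lattice and the symbolic dynamical system, i.e., the following diagram is commutative:
$$\begin{array}{ccc} \Delta_F & \xrightarrow{\ (F,S)\ } & \Delta_F \\ \uparrow \bar\pi & & \uparrow \bar\pi \\ \Sigma_A^{\mathbb{Z}^d} & \xrightarrow{\ (\sigma_t,\sigma_s)\ } & \Sigma_A^{\mathbb{Z}^d} \end{array} \tag{3.3}$$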
The following statement describes the properties of the map $\bar\pi$. Its proof follows from the definitions. We denote the boundary set of the Markov partition for $f$ by $\partial R$ and the boundary set of the induced Markov partition of $\Delta_F$ by $B$. The set $B$ can be written in the form of a countable union: $B = \bigcup_{k\in\mathbb{Z}^d} B(k)$, where $B(k) = \{\bar x = (x_i)_{i\in\mathbb{Z}^d} : x_k \in \partial R\}$.
Proposition 3.1. (1) $\bar\pi$ is surjective and Lipschitz continuous with respect to the $\rho_q$-metric for any $0 < q < 1$.
(2) $\bar\pi \circ \sigma_t = F \circ \bar\pi$, $\bar\pi \circ \sigma_s = S \circ \bar\pi$, i.e., $\bar\pi \circ \tau^* = \tau \circ \bar\pi$.
(3) $\bar\pi$ is injective outside the set $\bigcup_{k\in\mathbb{Z}^{d+1}} \tau^{*k}(\bar\pi^{-1}(B))$.
3.2. Coupled map lattices and lattice spin systems. The coding map $\bar\pi$ enables one to reduce the study of the uniqueness and ergodic properties of equilibrium measures corresponding to a (Hölder) continuous function $\varphi$ on $(\Delta_F, \rho_q)$ for the $\mathbb{Z}^{d+1}$-action $\tau = (F, S)$ to the study of the same properties of equilibrium measures corresponding to the function $\varphi^* = \varphi \circ \bar\pi$ on $\Sigma_A^{\mathbb{Z}^d}$ for the action $\tau^* = (\sigma_t, \sigma_s)$. In statistical physics the latter is called the lattice spin system. We describe the reduction in the following series of results.
Theorem 3.1. (1) Let $\varphi$ be a continuous function on $\Delta_F$. Then $P_{\tau^*}(\varphi^*) \ge P_\tau(\varphi)$.
(2) Let $\mu^*$ be a $\tau^*$-invariant measure on $\Sigma_A^{\mathbb{Z}^d}$ and $\mu = \mu^* \circ \bar\pi^{-1}$. Then $h_\tau(\mu) \le h_{\tau^*}(\mu^*)$.
As in the case of finite-dimensional dynamical systems it is crucial to know that the projection measure $\mu = \mu^* \circ \bar\pi^{-1}$ of the equilibrium measure $\mu^*$ corresponding to the function $\varphi^*$ is not concentrated on the boundary $B$ of the Markov partition, i.e., that
$$\mu^*(\bar\pi^{-1}(B)) = 0. \tag{3.4}$$
Theorem 3.2. Let $\varphi$ be a continuous function on $\Delta_F$. Assume that the condition (3.4) holds for any equilibrium measure $\mu^*$ corresponding to $\varphi^* = \varphi \circ \bar\pi$. Then,
(1) the pressure $P_{\tau^*}(\varphi^*) = P_\tau(\varphi)$;
(2) the measure $\mu = \mu^* \circ \bar\pi^{-1}$ is an equilibrium measure corresponding to $\varphi$;
(3) if $\mu_\varphi$ is an equilibrium measure for $\varphi$ on $\Delta_F$, then there exists an equilibrium measure $\mu^*$ for $\varphi^* = \varphi \circ \bar\pi$ with the property $\mu_\varphi(E) = \mu^*(\bar\pi^{-1}(E))$ for any Borel set $E \subset \Delta_F$.
Theorem 3.1 and Statements 1 and 2 of Theorem 3.2 follow directly from the definitions of topological pressure and metric entropy for the $\mathbb{Z}^d$-actions and the variational principle (see (2.4) and (2.7)). Statement 3 of Theorem 3.2 can be proved using arguments similar to those in the finite-dimensional case (see [Bo]). Let $\mathcal{A}$ be the set of continuous functions on $\Sigma_A^{\mathbb{Z}^d}$ of the form $g \circ \bar\pi$, where $g$ is a continuous function on $\Delta_F$. Clearly, $\mathcal{A}$ is a closed linear subspace of the space of all continuous functions on $\Sigma_A^{\mathbb{Z}^d}$. Define a linear functional $\mathcal{F}$ on $\mathcal{A}$ by the formula $g \circ \bar\pi \mapsto \int g\, d\mu$ and then extend it to the entire space by the Hahn-Banach theorem. Consider a new functional $\mathcal{F}^*$ which is a weak$^*$-accumulation point of the average of translations of $\mathcal{F}$ over finite volumes of the lattice. Let $\mu^*$ be the measure corresponding to $\mathcal{F}^*$. One can see that $\mu^*$ is a translation invariant measure. Finally, one can use the variational principle to show that $\mu^*$ is an equilibrium measure.
In the finite-dimensional case Condition (3.4) holds provided the potential function is Hölder continuous. This is due to the fact that the equilibrium measure is unique and hence is ergodic [Ma]. In the infinite-dimensional case the ergodicity of $\mu^*$ with respect to time translations is still sufficient for (3.4) to hold.
Theorem 3.3 ([J1]). Let $\mu^*$ be an equilibrium measure corresponding to a Hölder continuous function on $\Sigma_A^{\mathbb{Z}^d}$. Assume that $\mu^*$ is ergodic with respect to the time translation $\sigma_t$. Then it satisfies Condition (3.4).
M. Jiang, Ya.B. Pesin
The proof of this theorem is similar to the argument in the finite-dimensional case (see [Bo]). The boundary $B$ can be represented as the union $B = \bigcup_{k \in \mathbb{Z}^d} B(k)$, where $B(k) = \{\bar x = (x_i) : x_k$ lies on the boundary of the Markov partition for $f_k\}$. Each $B(k)$ can be decomposed into "stable" and "unstable" parts, $B^+(k)$ and $B^-(k)$ (depending on whether $x_k$ lies on stable or unstable local manifolds). The stable part is invariant under $F$ and is a closed subset. Thus, its preimage $\bar\pi^{-1}(B^+(k))$ in $\Sigma_A^{\mathbb{Z}^d}$ is a closed subset and is invariant under time translations. By ergodicity, its measure is either zero or one. Since every equilibrium measure is a Gibbs state and takes on positive values on open sets (see below), the open complement of $\bar\pi^{-1}(B^+(k))$ has positive measure, so the measure of the stable part $\bar\pi^{-1}(B^+(k))$ is zero. Applying the above arguments to the inverse of $F$, we conclude that the measure of the unstable part $\bar\pi^{-1}(B^-(k))$ is also zero and hence Condition (3.4) holds for the whole boundary set.

Uniqueness of the equilibrium measure implies its ergodicity with respect to the $\mathbb{Z}^{d+1}$-action induced by $(F, S)$. This is weaker than ergodicity with respect to the time translation. In [J1], the author proved directly that for a class of Hölder continuous functions Condition (3.4) holds. Recall that a function $\varphi$ on $\Delta_F$ is Hölder continuous in the $\rho_q$-metric if
$$|\varphi(\bar x) - \varphi(\bar y)| \le c\,\rho_q^\alpha(\bar x, \bar y),$$
where $\bar x = (x_i)$, $\bar y = (y_i) \in \Delta_F$. Note that if the function $\varphi$ is Hölder continuous on $\Delta_F$ (in the $\rho_q$-metric) then the function $\varphi^* = \varphi \circ \bar\pi$ on $\Sigma_A^{\mathbb{Z}^d}$ is also Hölder continuous. The following statement enables one to reduce the study of the uniqueness problem for coupled map lattices to the study of the same problem for lattice spin systems.

Theorem 3.4 ([J1]). Let $\varphi$ be a Hölder continuous function on $(\Delta_F, \rho_q)$. Assume in addition that
$$|\varphi(\bar x) - \varphi(\bar y)| \le c\,\rho_q^\alpha(\bar x, \bar y),$$
where $\bar x = (x_i)$, $\bar y = (y_i) \in \Delta_F$, $x_0 = y_0$, and $c$ is sufficiently small. Then $\mu^*(\bar\pi^{-1}(B)) = 0$
holds for any equilibrium measure $\mu^*$ for $\varphi^*$ on $\Sigma_A^{\mathbb{Z}^d}$. Therefore, for this class of potential functions, the uniqueness of the equilibrium measure for $\varphi^*$ implies the uniqueness of the equilibrium measure for $\varphi$. In the next section we shall actually show that the equilibrium measure for $\varphi^*$ is unique and exponentially mixing for the class of Hölder continuous functions satisfying the condition of Theorem 3.4.

3.3. Gibbs states for lattice spin systems. We remind the reader of the concept of Gibbs states for lattice spin systems of statistical physics. An element $\bar\xi \in \Sigma_A^{\mathbb{Z}^d} \subset \Omega^{\mathbb{Z}^{d+1}}$ is called a configuration. For any subset $X \subset \mathbb{Z}^{d+1}$ we set
$$\Omega_X = \{\bar\eta \in \Omega^X : \text{there exists } \bar\xi \in \Sigma_A^{\mathbb{Z}^d} \text{ such that } \bar\eta(i) = \bar\xi(i),\ i \in X\}.$$
The elements of $\Omega_X$ will be denoted by $\bar\xi_X$, or sometimes by $\bar\xi(X)$. One can say that $\Omega_X$ consists of restrictions of configurations $\bar\xi$ to $X$.

Let $\varphi$ be a Hölder continuous function on $\Sigma_A^{\mathbb{Z}^d}$ with respect to the $\rho_q$-metric (see (3.1)). For each finite subset $X \subset \mathbb{Z}^{d+1}$ define the function $p_X(\bar\xi)$ on $\Sigma_A^{\mathbb{Z}^d}$ by
$$p_X(\bar\xi) = \frac{1}{\sum_{\bar\eta:\ \bar\eta(\hat X) = \bar\xi(\hat X)} \exp \sum_{x \in \mathbb{Z}^{d+1}} \bigl(\varphi(\tau^x \bar\eta) - \varphi(\tau^x \bar\xi)\bigr)}, \tag{3.5}$$
Equilibrium Measures for Coupled Map Lattices
where $\tau^x$ is the action $(\sigma_t)^i \circ (\sigma_s)^j$, $\hat X = \mathbb{Z}^{d+1} \setminus X$, and $x = (i, j)$, $i \in \mathbb{Z}^d$, $j \in \mathbb{Z}$.

A probability measure $\mu$ on $\Sigma_A^{\mathbb{Z}^d}$ is called a Gibbs state for $\varphi$ if for any finite subset $X \subset \mathbb{Z}^{d+1}$,
$$\mu_X(\bar\xi_X) = \int_{\Omega_{\hat X}} p_X(\bar\xi)\, d\mu_{\hat X}, \tag{3.6}$$
where $\mu_X$ and $\mu_{\hat X}$ are the probability measures on $\Omega_X$ and $\Omega_{\hat X}$, respectively, that are induced by natural projections. This equation is known as the Dobrushin-Ruelle-Lanford equation.

There is an equivalent way to describe Gibbs states corresponding to Hölder continuous functions on symbolic spaces. Let $\varphi$ be such a function. For each finite volume $X$ we define a conditional Gibbs distribution on $\Omega_X$ under the given boundary condition $\bar\eta^*$ by
$$\mu_{\bar\eta^*, X}(\bar\xi(X)) = \frac{1}{\sum_{\bar\eta:\ \bar\eta(\hat X) = \bar\eta^*(\hat X)} \exp \sum_{x \in \mathbb{Z}^{d+1}} \bigl(\varphi(\tau^x \bar\eta) - \varphi(\tau^x(\bar\xi(X) + \bar\eta^*(\hat X)))\bigr)}, \tag{3.7}$$
where $\bar\xi(X) + \bar\eta^*(\hat X)$ denotes the (admissible) configuration on $X \cup \hat X$ whose restrictions to $X$ and $\hat X$ are $\bar\xi(X)$ and $\bar\eta^*(\hat X)$ respectively. The set of all Gibbs states for $\varphi$ is the convex hull of the thermodynamic limits of the conditional Gibbs distributions. The relation between translation invariant Gibbs states and equilibrium measures can be stated as follows (see [Ru]).

Theorem 3.5. If the transfer matrix $A$ is aperiodic then $\mu$ is an equilibrium measure for $\varphi$ if and only if it is a translation invariant Gibbs state for $\varphi$.

In statistical mechanics Gibbs states are usually defined for potentials rather than for functions. We briefly describe this approach. A potential $U$ is a collection of functions defined on the family of all finite configurations, i.e., $U = \{U_X : X \subset \mathbb{Z}^{d+1},\ U_X : \Omega_X \to \mathbb{R}\}$. Gibbs states for a potential $U$ are defined as the convex hull of the thermodynamic limits of the conditional Gibbs distributions:
$$\mu_{\bar\eta^*, X}(\bar\xi(X)) = \frac{\exp\bigl(\sum_{V \cap X \neq \emptyset} U_V(\bar\xi(X) + \bar\eta^*(\hat X))\bigr)}{\sum_{\bar\eta:\ \bar\eta(\hat X) = \bar\eta^*(\hat X)} \exp\bigl(\sum_{V \cap X \neq \emptyset} U_V(\bar\eta)\bigr)}, \tag{3.8}$$
where $\bar\eta^*$ is a fixed configuration.

We describe potentials corresponding to Hölder continuous functions (in the $\rho_q$-metric, see (3.1)). Let $\varphi$ be such a function. We write $\varphi$ in the form of a series
$$\varphi = \sum_{n=0}^{\infty} \varphi_n. \tag{3.9}$$
Here the value of $\varphi_n$ depends only on configurations inside the $(d+1)$-dimensional cube $Q_n$ centered at the origin of side $2n \times \cdots \times 2n$. We also set $Q_0 = (0, 0)$. We define the functions $\varphi_n$ as follows. Fix a configuration $\bar\eta^*$ and set
$$\varphi_0(\bar\xi) = \varphi\bigl(\bar\xi(Q_0) + \bar\eta^*(\hat Q_0)\bigr). \tag{3.10}$$
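Continuing inductively we define
$$\varphi_{n+1}(\bar\xi) = \varphi\bigl(\bar\xi(Q_{n+1}) + \bar\eta^*(\hat Q_{n+1})\bigr) - \varphi\bigl(\bar\xi(Q_n) + \bar\eta^*(\hat Q_n)\bigr), \qquad n = 0, 1, 2, \ldots. \tag{3.11}$$
It is easy to see that $\|\varphi_n\| \to 0$ exponentially fast as $n \to \infty$. We define the potential $U_\varphi$ associated with the function $\varphi$ on $Q_n$ by setting
$$U_\varphi(\bar\xi(Q_n)) = \varphi_n(\bar\xi(Q_n)). \tag{3.12}$$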
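The decomposition (3.9)-(3.11) can be checked numerically on a toy example. The sketch below is only an illustration under assumptions that are not in the text: the lattice is one-dimensional, the spins take values in {0, 1}, the boundary configuration is identically zero, and `phi` is a hypothetical Hölder-type function whose dependence on site k is weighted by q^|k|. It computes the terms of the series by brute force and records their sup norms.

```python
import itertools
import math

Q = 0.5      # assumed decay rate q, 0 < q < 1
N_MAX = 6    # largest cube Q_n = {-n, ..., n} that is examined

def phi(xi):
    """Hypothetical Hoelder-type function: site k enters with weight q^|k|."""
    return math.sin(sum(Q ** abs(k) * s for k, s in xi.items()))

def restrict_and_pad(xi, n, eta_star):
    """Configuration equal to xi on Q_n and to eta_star outside Q_n."""
    return {k: (xi[k] if abs(k) <= n else eta_star[k]) for k in xi}

def phi_n(xi, n, eta_star):
    """n-th term of the series (3.9), defined as in (3.10) and (3.11)."""
    current = phi(restrict_and_pad(xi, n, eta_star))
    if n == 0:
        return current
    return current - phi(restrict_and_pad(xi, n - 1, eta_star))

sites = range(-N_MAX, N_MAX + 1)
eta_star = {k: 0 for k in sites}   # assumed boundary configuration

def sup_norm(n):
    """Sup of |phi_n| over all spin configurations on the sites."""
    return max(
        abs(phi_n(dict(zip(sites, spins)), n, eta_star))
        for spins in itertools.product((0, 1), repeat=len(sites))
    )

norms = [sup_norm(n) for n in range(1, N_MAX + 1)]
for n, value in enumerate(norms, start=1):
    print(f"n={n}: sup|phi_n| = {value:.5f}, bound 2*q^n = {2 * Q ** n:.5f}")
```

For this particular `phi` the bound sup|phi_n| <= 2 q^n follows from |sin a - sin b| <= |a - b|, which is the toy counterpart of the exponential decay of the norms stated above; the partial sums telescope to `phi` itself.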
For other $(d+1)$-dimensional cubes that are translations of $Q_n$ we assign the same value of $U_\varphi$. For other finite subsets of $\mathbb{Z}^{d+1}$ we define the potential to be zero. Thus, we obtain a translation invariant potential whose values on finite volumes decrease exponentially when the diameter of the volume grows. If $\varphi_0 = 0$, the value of the corresponding potential $U_\varphi$ is bounded by the Hölder constant of the function $\varphi$. More generally, let us set
$$\mathcal{F}(\alpha, q, \varepsilon) = \{\varphi : |\varphi(\bar\xi) - \varphi(\bar\eta)| \le \varepsilon\,\rho_q^\alpha(\bar\xi, \bar\eta)\}, \tag{3.13}$$
$$\|U_{Q_n}\| = \sup_{\bar\xi(Q_n) \in \Omega_{Q_n}} |U_{Q_n}(\bar\xi(Q_n))|, \tag{3.14}$$
$$\mathcal{P}(q, \varepsilon) = \{U : \sup_{n \ge 1} q^{-n} \|U_{Q_n}\| \le \varepsilon\}. \tag{3.15}$$
It is easy to see that, if $\varphi \in \mathcal{F}(\alpha, q, \varepsilon)$, then $U_\varphi \in \mathcal{P}(q^\alpha, \varepsilon)$. On the other hand, $U_\varphi \in \mathcal{P}(q, \varepsilon)$ implies $\varphi \in \mathcal{F}(1/2, q, \varepsilon)$. The definition of Gibbs states corresponding to potentials is consistent with the one corresponding to functions. More precisely, Gibbs distributions corresponding to a Hölder continuous function $\varphi$ are exactly the Gibbs distributions corresponding to the potential $U_\varphi$.

As we have seen, the problem of uniqueness of equilibrium states on symbolic spaces can be reduced to the problem of uniqueness of translation invariant Gibbs states provided the function $\varphi$ is Hölder continuous. This problem has been extensively studied in statistical physics for a long time. In the one-dimensional case (when $d = 0$) Gibbs states are always unique and are mixing with respect to the shift provided the potential decays exponentially fast as the length of intervals goes to infinity (see [Ru]). In the case of higher dimensional lattice spin systems the well-known Ising model provides an example where the Gibbs states are not unique even for potentials of finite range (see [Sim]). We first describe the two-dimensional Ising model in the context of lattice spin systems.

Example 1 (The Ising Model, $d = 1$). Define the potential function $\varphi$ on the configuration space by
$$\varphi(\bar\xi) = \beta\bigl(\bar\xi(1, 0)\,\bar\xi(0, 0) + \bar\xi(0, 0)\,\bar\xi(0, 1)\bigr). \tag{3.16}$$
Then the following statements hold: (1) $\varphi(\bar\xi)$ depends only on the values of $\bar\xi$ at three lattice points: $(1, 0)$, $(0, 0)$, and $(0, 1)$ and is Hölder continuous; (2) there exists $\beta_0 > 0$ such that for $\beta > \beta_0$ Gibbs states corresponding to the potential $U_\varphi$ generated by $\varphi$ are not unique.

Based upon this Ising model we now describe an example of a coupled map lattice and a Hölder continuous function with a non-unique equilibrium measure.
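Before turning to that example, the boundary dependence behind statement (2) of Example 1 can be seen in a small brute-force computation. The sketch below is an illustration only, under assumptions not made in the text: spins take values ±1, the volume is a 3 × 3 box, and the boundary condition is constant. It evaluates the conditional Gibbs distribution in the box, using the fact that the sum of $\varphi(\tau^x \bar\xi)$ from (3.16) reduces to $\beta$ times the sum of products over nearest-neighbour bonds, and returns the mean of the central spin. A finite box cannot prove non-uniqueness; it only shows how strongly the boundary is felt at large $\beta$.

```python
import itertools
import math

def center_magnetization(beta, boundary_spin, size=3):
    """Mean of the central spin in a size x size box with a frozen boundary.

    The weight of a configuration is exp(beta * sum of xi(a) * xi(b)) over all
    nearest-neighbour bonds with at least one endpoint in the box; bonds that
    lie entirely outside the box cancel in the normalised distribution.
    """
    box = [(i, j) for i in range(size) for j in range(size)]
    bonds = set()
    for (i, j) in box:
        for (di, dj) in ((1, 0), (0, 1), (-1, 0), (0, -1)):
            a, b = (i, j), (i + di, j + dj)
            bonds.add((min(a, b), max(a, b)))   # the set removes duplicates
    center = (size // 2, size // 2)
    numerator = 0.0
    partition = 0.0
    for spins in itertools.product((-1, 1), repeat=len(box)):
        xi = dict(zip(box, spins))
        energy = sum(
            xi.get(a, boundary_spin) * xi.get(b, boundary_spin)
            for a, b in bonds
        )
        weight = math.exp(beta * energy)
        partition += weight
        numerator += xi[center] * weight
    return numerator / partition

for beta in (0.2, 1.0):
    plus = center_magnetization(beta, +1)
    minus = center_magnetization(beta, -1)
    print(f"beta={beta}: center spin mean {plus:+.4f} (+ boundary), "
          f"{minus:+.4f} (- boundary)")
```

At the larger value of $\beta$ the two boundary conditions give central-spin means close to +1 and -1 respectively, while at the smaller value both are much closer to zero.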
Example 2 (Phase Transition For Coupled Map Lattices). Let $M$ be a compact smooth surface and $(\Lambda, f)$ the Smale horseshoe. One can show that the semi-conjugacy $\bar\pi$ between $\mathcal{M} = \otimes_{i \in \mathbb{Z}} M$ and $\{0, 1\}^{\mathbb{Z}^2}$ induced by the Markov partition can be chosen as an isometry. Thus, the function $\psi = \varphi \circ \bar\pi^{-1}$ is Hölder continuous on $\Delta_F$, where the function $\varphi$ is chosen as in Example 1. Since the boundary of the Markov partition is empty, Condition (3.4) holds. We conclude that there is more than one equilibrium measure for the function $\psi$.

The following statement provides a general sufficient condition for uniqueness of Gibbs states. Let $U$ be a translation invariant potential on the configuration space $\Omega^{\mathbb{Z}^{d+1}}$, where $\Omega = \{1, 2, \ldots, m\}$.

(1) (Dobrushin's Uniqueness Theorem [D1, Sim]): Assume that
$$\sum_{X:\ 0 \in X} (|X| - 1)\,\|U(X)\| < 1. \tag{3.17}$$
Then the Gibbs state for $U$ is unique.

(2) ([Gro, Sim]): There exist $r > 0$ and $\varepsilon > 0$ such that if
$$\sum_{X:\ 0 \in X} e^{r d(X)}\,\|U(X)\| \le \varepsilon \tag{3.18}$$
($d(X)$ denotes the diameter of $X$) then the unique Gibbs state is exponentially mixing with respect to the $\mathbb{Z}^{d+1}$-action on $\Omega^{\mathbb{Z}^{d+1}}$.

The proof of Dobrushin's uniqueness theorem exploits the direct product structure of the configuration space $\Omega^{\mathbb{Z}^{d+1}}$. This result cannot be directly applied to establish uniqueness of Gibbs states for lattice spin systems which are symbolic representations of coupled map lattices, because the configuration space $\Sigma_A^{\mathbb{Z}^d}$ is, in general, a translation invariant subset of $\Omega^{\mathbb{Z}^{d+1}}$. In [BuSt], the authors constructed examples of strongly irreducible subshifts of finite type for which there are many Gibbs states corresponding to the function $\varphi = 0$. In order to establish uniqueness we will use some special structure of the space $\Sigma_A^{\mathbb{Z}^d}$: it admits subshifts of finite type in the "time" direction and the Bernoulli shift in the "space" direction.

We now present the main result on uniqueness and mixing property of Gibbs states for lattice spin systems which are symbolic representations of coupled map lattices of hyperbolic type. In the two-dimensional case ($d = 1$), it was proved by Jiang and Mazel (see [JM]). In the multidimensional case it was established by Bricmont and Kupiainen (see [BK3]). A potential $U_0$ on $\Sigma_A^{\mathbb{Z}}$ is called longitudinal if it is zero everywhere except for configurations on vertical finite intervals of the lattice. A potential $U_0$ is said to be exponentially decreasing if
$$|U_0(\bar\xi(I))| \le C e^{-\lambda |I|}, \tag{3.19}$$
where $C > 0$ and $\lambda > 0$ are constants, $I$ is a vertical interval (i.e., in the time direction), $|I|$ is its length, and $\bar\xi(I)$ is a configuration over $I$. Exponentially decreasing longitudinal potentials correspond to those potential functions whose values depend only on the configuration $\bar\xi(0, j)$, $j \in \mathbb{Z}$.

We say that a Gibbs state is exponentially mixing if for every integrable function on the configuration space the $\mathbb{Z}^{d+1}$-correlation functions decay exponentially to zero.
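Returning briefly to Dobrushin's condition (3.17), the following sketch evaluates its left-hand side for a simple case. It rests on an assumption that is not part of the text: the interaction of Example 1 is represented by the standard nearest-neighbour pair potential on $\mathbb{Z}^2$ with $\|U(\{0, e\})\| = \beta$ for each of the four neighbours $e$ of the origin, rather than by the cube-based potential $U_\varphi$ of (3.12). The function names are hypothetical.

```python
def dobrushin_sum(potential_norms):
    """Left-hand side of (3.17): sum over X containing 0 of (|X| - 1) * ||U(X)||.

    potential_norms maps each finite set X (a frozenset of lattice sites that
    contains the origin) to the sup norm ||U(X)||.
    """
    return sum((len(X) - 1) * norm for X, norm in potential_norms.items())

def nearest_neighbour_potential(beta):
    """Assumed pair potential: ||U({0, e})|| = beta for the four neighbours e."""
    origin = (0, 0)
    neighbours = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    return {frozenset({origin, e}): beta for e in neighbours}

for beta in (0.1, 0.2, 0.3):
    total = dobrushin_sum(nearest_neighbour_potential(beta))
    print(f"beta={beta}: sum={total:.2f}, condition (3.17) holds: {total < 1}")
```

Under this assumption the sum equals $4\beta$, so (3.17) holds exactly when $\beta < 1/4$. The condition is only sufficient, so nothing is implied about uniqueness for larger $\beta$.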
Theorem 3.6 (Uniqueness and Mixing Property of Gibbs States). For any exponentially decreasing longitudinal potential $U_0$ and every $0 < q < 1$, there exists $\varepsilon > 0$ such that the Gibbs state for any potential $U = U_0 + U_1$ with $U_1 \in \mathcal{P}(q, \varepsilon)$ is unique and exponentially mixing.

Proof. We provide a brief sketch of the proof assuming first that $U_0 = 0$ and $d = 1$. We may assume that the potential is non-negative (otherwise, the non-negative potential $U'(\eta(Q)) = U(\eta(Q)) + \max_{\eta(Q)} |U(\eta(Q))|$ defines the same family of Gibbs distributions). We introduce a new potential $\tilde U$ which is defined on rectangles and is equivalent to the potential $U$. The latter means that both potentials generate the same conditional Gibbs distributions. Consider a square $Q$ and a rectangle $P$ and denote by $b(Q) = (b_1(Q), b_2(Q))$ and $b(P) = (b_1(P), b_2(P))$ the lower left corners of $Q$ and $P$, respectively. Fix $L > 0$ (its choice will be specified later) and define a rectangular potential $\tilde U(\bar\eta(P))$ in the following way. For every rectangle $P$ with $b_2(P) = nL$, $n \in \mathbb{Z}$, of size $l(P) \times L\,l(P)$ we have
$$\tilde U(\bar\eta(P)) = \sum_{Q:\ Q \sim P} U(\bar\eta(Q)), \tag{3.20}$$
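where the sum is taken over all squares $Q$ associated with $P$ (we write this as $Q \sim P$), i.e., the following condition holds: $Q$ is of size $l(P) \times l(P)$ and $b_1(Q) = b_1(P)$, $b_2(P) \le b_2(Q) < b_2(P) + L$. It is easy to show that $\tilde U \in \mathcal{P}(q, \delta)$, where $\delta = \delta(\varepsilon) \to 0$ as $\varepsilon \to 0$.

Let $V \subset \mathbb{Z}^2$ be any finite volume. Fix a boundary condition $\bar\eta^*(\hat V)$. For any configuration $\bar\xi(V)$ such that $\bar\xi(V) + \bar\eta^*(\hat V)$ is a configuration in $\mathbb{Z}^2$, a conditional Hamiltonian specified by the potential $\tilde U(\bar\eta(P))$ is defined as follows (see (A2.3)):
$$H_{\tilde U}(\bar\xi(V) \,|\, \bar\eta^*(\hat V)) = -\sum_{P \cap V \neq \emptyset} \tilde U\bigl(\bar\eta(P) \,|\, \bar\xi(V) + \bar\eta^*(\hat V)\bigr).$$
The expression $\tilde U(\bar\eta(P) \,|\, \bar\xi(V) + \bar\eta^*(\hat V))$ means that the potential $\tilde U(\bar\eta(P))$ is evaluated under the condition that $\bar\xi(V) + \bar\eta^*(\hat V)$ is fixed. It is easy to see that
$$\begin{aligned}
H_{\tilde U}(\bar\xi(V) \,|\, \bar\eta^*(\hat V)) &= -\sum_{Q:\ Q \cap V \neq \emptyset} U\bigl(\bar\eta(Q) \,|\, \bar\xi(V) + \bar\eta^*(\hat V)\bigr) - \sum_{P \cap V \neq \emptyset}\ \sum_{\substack{Q:\ Q \sim P \\ Q \cap V = \emptyset}} U\bigl(\bar\eta(Q) \,|\, \bar\xi(V) + \bar\eta^*(\hat V)\bigr) \\
&= H_U(\bar\xi(V) \,|\, \bar\eta^*(\hat V)) - \sum_{P \cap V \neq \emptyset}\ \sum_{\substack{Q:\ Q \sim P \\ Q \cap V = \emptyset}} U\bigl(\bar\eta(Q) \,|\, \bar\xi(V) + \bar\eta^*(\hat V)\bigr).
\end{aligned} \tag{3.21}$$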
The conditional Gibbs distributions defined by (3.8) for the potential $\tilde U$ can be expressed in terms of the conditional Hamiltonian as follows:
$$\mu_{V, \bar\eta^*}(\bar\xi(V)) = \frac{\exp H(\bar\xi(V) \,|\, \bar\eta^*(\hat V))}{\Xi(V \,|\, \bar\eta^*(\hat V))}, \tag{3.22}$$
where
$$\Xi(V \,|\, \bar\eta^*(\hat V)) = \sum_{\bar\eta(V)} \exp H(\bar\eta(V) \,|\, \bar\eta^*(\hat V))$$
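is the partition function for the potential $\tilde U$ in the volume $V$ with the boundary condition $\bar\eta^*(\hat V)$ (see (A2.2) and (A2.4)). It follows from (3.21) that
$$\frac{\exp H_U(\bar\xi(V) \,|\, \bar\eta^*(\hat V))}{\sum_{\bar\eta(V)} \exp H_U(\bar\eta(V) \,|\, \bar\eta^*(\hat V))} = \frac{\exp H_{\tilde U}(\bar\xi(V) \,|\, \bar\eta^*(\hat V))}{\sum_{\bar\eta(V)} \exp H_{\tilde U}(\bar\eta(V) \,|\, \bar\eta^*(\hat V))}.$$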
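The cancellation behind this identity can be written out in one line. The shorthand $c(\bar\eta^*)$ is introduced here for convenience and does not appear in the original: it denotes the double sum in (3.21). Every square $Q$ in that sum satisfies $Q \cap V = \emptyset$, so the sum depends only on the boundary condition and not on the configuration inside $V$.

```latex
% c(\bar\eta^*) is a hypothetical shorthand for the double sum in (3.21).
H_{\tilde U}(\bar\xi(V)\,|\,\bar\eta^*(\hat V))
  = H_U(\bar\xi(V)\,|\,\bar\eta^*(\hat V)) - c(\bar\eta^*)
\quad\Longrightarrow\quad
\frac{\exp H_{\tilde U}(\bar\xi(V)\,|\,\bar\eta^*(\hat V))}
     {\sum_{\bar\eta(V)} \exp H_{\tilde U}(\bar\eta(V)\,|\,\bar\eta^*(\hat V))}
  = \frac{e^{-c(\bar\eta^*)}\,\exp H_U(\bar\xi(V)\,|\,\bar\eta^*(\hat V))}
         {e^{-c(\bar\eta^*)}\,\sum_{\bar\eta(V)} \exp H_U(\bar\eta(V)\,|\,\bar\eta^*(\hat V))}.
```

The common factor $e^{-c(\bar\eta^*)}$ cancels between numerator and denominator.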
Therefore, the potentials $U$ and $\tilde U$ generate the same conditional Gibbs distributions on any finite volume $V \subset \mathbb{Z}^2$.

Let $B \subset V \subset \mathbb{Z}^2$. We use (3.22) to compute the probability $\mu_{V, \bar\eta^*}(\bar\xi(B))$ of the configuration $\bar\xi(B)$ under the boundary condition and wish to show that it has a limit as $V \to \mathbb{Z}^2$ independent of $\bar\eta^*$. The latter is the unique Gibbs state for the potential $\tilde U$. Using (3.22) we obtain the following formula for the conditional measure $\mu_{V, \bar\eta^*}(\bar\xi(B))$:
$$\mu_{V, \bar\eta^*}(\bar\xi(B)) = \sum_{\bar\eta(V):\ \bar\eta(V)|_B = \bar\xi(B)} \mu_{V, \bar\eta^*}(\bar\eta(V)).$$
We wish to use the Polymer Expansion Theorem (see Appendix) and decompose the above expression in the form of (A4.3). Namely,
$$\mu_{V, \bar\eta^*}(\bar\xi(B)) = N(B) \exp\Bigl[\sum_{P \subseteq B} \tilde U(\bar\eta(P)) + \sum_{\wp:\ \wp \cap V \setminus B \neq \emptyset} w(\wp \,|\, \bar\xi(B) + \bar\eta^*(\hat V)) - \sum_{\wp:\ \wp \cap V \neq \emptyset} w(\wp \,|\, \bar\eta^*(\hat V))\Bigr], \tag{3.23}$$
where $N(B)$ is the normalizing factor determined by the volume $B$ (see (A4.4)), $w(\wp \,|\, \bar\eta^*(\hat V))$ and $w(\wp \,|\, \bar\xi(B) + \bar\eta^*(\hat V))$ are the statistical weights for the polymer $\wp$ (see (A4.3)), and $P$ is a rectangle. If the parameter $L$ in the definition of the rectangles is chosen sufficiently large and $\varepsilon$ is sufficiently small, then, by the Polymer Expansion Theorem, each sum in (3.23) converges to a limit uniformly in $\mathcal{P}(q, \delta)$.

The above argument can be extended to the general case when $U_0$ is an exponentially decreasing longitudinal potential (see [JM] for details). The case $d > 1$ is considered by Bricmont and Kupiainen in [BK3] and is treated in a slightly different way by obtaining polymer expansions of correlation functions.

Theorems 3.4 and 3.6 enable us to obtain the following main result about uniqueness and mixing property of equilibrium measures for coupled map lattices.

Theorem 3.7. Let $(\Phi, S)$ be a coupled map lattice and $\varphi = \varphi_0 + \varphi_1$ a function on $\Delta_\Phi$, where $\varphi_0$ is a Hölder continuous function depending only on the coordinate $x_0$ and $\varphi_1$ is a Hölder continuous function with a small Hölder constant in the metric $\rho_q$. Then there exists a unique equilibrium measure $\mu_\varphi$ on $\Delta_\Phi$ corresponding to $\varphi$. This measure is mixing and takes on positive values on open sets. Furthermore, the correlation functions decay exponentially for every Hölder continuous function on $\Delta_\Phi$ satisfying the above assumptions.
4. Finite-Dimensional Approximations

In this section we describe finite-dimensional approximations of equilibrium measures for coupled map lattices. One should distinguish two different types of approximations: by $\mathbb{Z}^{d+1}$-action equilibrium measures and by $\mathbb{Z}$-action equilibrium measures. The first comes from the corresponding $(d+1)$-dimensional lattice spin system while the second is a straightforward finite-dimensional approximation of the initial coupled map lattice.

In order to explain some basic ideas concerning finite-dimensional approximations we first consider an uncoupled map lattice $(\mathcal{M}, F)$. Let $\varphi$ be a Hölder continuous function on $\mathcal{M}$ which depends only on the central coordinate, i.e., $\varphi(\bar x) = \psi(x_0)$, where $\psi$ is a Hölder continuous function on $M$ (whose Hölder constant is not necessarily small). It is easy to see that the equilibrium measure $\mu_\varphi$ corresponding to $\varphi$ is unique with respect to the $\mathbb{Z}^{d+1}$-action $(F, S)$ and that $\mu_\varphi = \otimes_{i \in \mathbb{Z}^d} \mu_\psi$, where $\mu_\psi$ is the equilibrium measure on $\Lambda \subseteq M$ for $\psi$ with respect to the $\mathbb{Z}$-action generated by $f$. One can also verify that for any finite set $X \subset \mathbb{Z}^d$ the measure $\mu_X = \otimes_{i \in X} \mu_\psi$ is the unique equilibrium measure on the space $M_X = \otimes_{i \in X} M$ corresponding to the function $\varphi_X = \sum_{i \in X} \varphi(S^i \bar x)$ with respect to the $\mathbb{Z}$-action $F_X = \otimes_{i \in X} f$. Clearly, $\mu_{X_n} \to \mu_\varphi$ in the weak$^*$-topology for any sequence of subsets $X_n \to \mathbb{Z}^d$ (i.e., $X_n \subset X_{n+1}$ and $\bigcup_{n \ge 0} X_n = \mathbb{Z}^d$). It is worth emphasizing that the sequence of the functions $\varphi_{X_n}$ does not converge to a finite function on $\mathcal{M}$ as $n \to \infty$, while the corresponding $\mathbb{Z}$-action equilibrium measures $\mu_{\varphi_{X_n}}$ approach the $\mathbb{Z}^{d+1}$-action equilibrium measure $\mu_\varphi$.

On the other hand, one can consider $\varphi$ as a function on the space $M_X$ provided $0 \in X$. The unique equilibrium measure with respect to the $\mathbb{Z}$-action generated by $F_X$ is $\mu_\psi \times \otimes_{i \in X,\, i \neq 0}\, \nu_0$, where $\nu_0$ is the measure of maximal entropy on $M$.
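The contrast between the summed function $\varphi_X$ and the function $\varphi$ itself can be mimicked in a static toy model. The sketch below is an analogy only, under assumptions that are not in the text: there is no dynamics, each site carries a symbol from a three-letter alphabet, the hypothetical one-site potential `PSI` stands in for $\psi$, and "equilibrium measure" is replaced by the distribution with weights proportional to the exponential of the potential. In this toy setting the summed potential gives a product of identical one-site distributions, while the potential depending on site 0 alone gives the one-site distribution at site 0 and the uniform distribution elsewhere.

```python
import itertools
import math

ALPHABET = (0, 1, 2)
PSI = {0: 0.0, 1: 1.0, 2: -0.5}   # hypothetical one-site potential psi

def gibbs(weight, sites):
    """Distribution proportional to exp(weight(x)) on ALPHABET^sites."""
    configs = list(itertools.product(ALPHABET, repeat=len(sites)))
    raw = [math.exp(weight(dict(zip(sites, c)))) for c in configs]
    total = sum(raw)
    return {c: r / total for c, r in zip(configs, raw)}

def marginal(dist, index):
    """Marginal distribution of the coordinate at tuple position `index`."""
    out = {a: 0.0 for a in ALPHABET}
    for config, p in dist.items():
        out[config[index]] += p
    return out

sites = (-1, 0, 1)
# Analogue of phi_X: the one-site potential summed over all sites of X.
summed = gibbs(lambda x: sum(PSI[x[i]] for i in sites), sites)
# Analogue of phi viewed on M_X: only the central site contributes.
central = gibbs(lambda x: PSI[x[0]], sites)
# Analogue of mu_psi: the one-site distribution.
one_site = gibbs(lambda x: PSI[x[0]], (0,))

print("one-site distribution:", one_site)
print("summed potential, marginal at site -1:", marginal(summed, 0))
print("central potential, marginal at site -1:", marginal(central, 0))
print("central potential, marginal at site 0:", marginal(central, 1))
```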
This simple example illustrates that the $\mathbb{Z}^{d+1}$-action equilibrium measures corresponding to a function $\varphi$ may not admit approximations by the $\mathbb{Z}$-action equilibrium measures corresponding to the restrictions of $\varphi$ to finite volumes.

4.1. Continuity of equilibrium measures over potentials. In this section we show that equilibrium measures for coupled map lattices depend continuously on their potential functions in the weak$^*$-topology.

Fix $0 < q < 1$ and consider the space of all Hölder continuous functions on $\Delta_\Phi$ with Hölder exponent $0 < \alpha < 1$ and Hölder constant $\varepsilon > 0$ in the metric $\rho_q$. We denote this space by $\tilde{\mathcal{F}}(\alpha, q, \varepsilon)$. It is endowed with the usual supremum norm $\|\varphi\|$. We also introduce the $q^\alpha$-norm on this space by
$$\|\varphi\|_{q^\alpha} = \max\Bigl\{\sup_{n \ge 0}\, q^{-\alpha n} \sup_{\bar x, \bar y \in \Delta_\Phi} |\varphi(\bar x) - \varphi(\bar y)|,\ \|\varphi\|\Bigr\}, \tag{4.1}$$
where the second supremum is taken over all points $\bar x, \bar y$ for which $x_i = y_i$ for $|i| \le n$.

The following statement establishes the continuous dependence of equilibrium measures for coupled map lattices on potential functions in $\tilde{\mathcal{F}}(\alpha, q, \varepsilon)$. We provide a proof in the case $d = 1$ using an approach based on polymer expansions. If $d > 1$ the continuous dependence still holds and can be established using methods in [BK3].

Theorem 4.1. There exists $\varepsilon > 0$ such that the unique equilibrium measure $\mu_\varphi$ on $\Delta_\Phi$ depends continuously (in the weak$^*$-topology) on $\varphi \in \tilde{\mathcal{F}}(\alpha, q, \varepsilon)$ with respect to the norm $\|\cdot\|_{q^\alpha}$, i.e., for $\psi_m \in \tilde{\mathcal{F}}(\alpha, q, \varepsilon)$, $\|\psi_m - \varphi\|_{q^\alpha} \to 0$ implies $\mu_{\psi_m} \to \mu_\varphi$ in the weak$^*$-topology.
Proof. Observe that the convergence $\|\psi_m - \varphi\|_{q^\alpha} \to 0$ implies the convergence of the corresponding potentials on the symbolic space. Therefore, we need only to establish the continuity of the Gibbs state for the corresponding symbolic representation. For a potential $U$ on $\Sigma_A^{\mathbb{Z}}$ its norm $\|\cdot\|_q$ is defined as
$$\|U\|_q = \sup_{n \ge 0}\, q^{-n}\, \|U_{Q_n}(\bar\xi_{Q_n})\|, \tag{4.2}$$
where $0 < q < 1$. By Theorem 3.6 the Gibbs state is unique when $\|U\|_q$ is sufficiently small. We denote the Gibbs state for $U$ by $\mu_U$. We show that for any cylinder set $E \subset \Sigma_A^{\mathbb{Z}}$, $\mu_U(E)$ depends on $U$ continuously in a neighborhood of the zero potential in the set $\mathcal{P}(q, 1) = \{U : \|U\|_q \le 1\}$. For this purpose we use the explicit expression of $\mu_U(E)$ in terms of the potential $U$ provided by the Polymer Expansion Theorem (see (A4.5)). Namely, for a non-negative potential $U \in \mathcal{P}(q, \varepsilon)$ and any finite volume $B \subset \mathbb{Z}^2$ we have that
$$\mu_U(\bar\xi(B)) = N(B) \exp\Bigl[\sum_{P \subseteq B} U(\bar\xi(P)) + \sum_{\substack{\wp:\ \mathrm{dist}(\wp, B) \le 1 \\ \mathrm{dist}(\wp, \hat B) = 0}} w(\wp \,|\, \bar\xi(B)) - \sum_{\wp:\ \mathrm{dist}(\wp, B) \le 1} w(\wp)\Bigr], \tag{4.3}$$
where $N(B)$ is a normalizing factor determined by the volume $B$ (see (A4.4)), $w(\wp)$ and $w(\wp \,|\, \bar\xi(B))$ are the statistical weights for the polymer $\wp$ (see (A4.5)), and $P$ is a rectangle. By the Polymer Expansion Theorem the statistical weights $w(\wp)$ and $w(\wp \,|\, \bar\xi(B))$ ($B$ is fixed) depend continuously on $U(\eta(P))$ with respect to the norm $\|\cdot\|_q$. This implies that $\mu_U$ depends weakly continuously on $U$.

To show that $\mu_U$ depends on $U$ continuously for all (not necessarily non-negative) potentials $U \in \mathcal{P}(q, \varepsilon/4)$ let us consider the potential $U^\varepsilon$ defined as $U^\varepsilon(\bar\xi(Q_n)) = \varepsilon q^n$. Then, for any $U \in \mathcal{P}(q, \varepsilon/4)$ we have that
$$U + U^\varepsilon/4 \ge 0, \qquad U + U^\varepsilon/4 \in \mathcal{P}(q, 1/2). \tag{4.4}$$
Note that given $Q_n$, $U^\varepsilon$ is a constant potential on $Q_n$. Therefore, Gibbs distributions for $U$ and $U + U^\varepsilon/4$ coincide and hence $\mu_U = \mu_{U + U^\varepsilon/4}$. This implies the desired result.
4.2. Finite-dimensional $\mathbb{Z}^{d+1}$-approximations. We now describe finite-dimensional $\mathbb{Z}^{d+1}$-approximations of equilibrium measures for coupled map lattices. Let $\varphi \in \tilde{\mathcal{F}}(\alpha, q, \varepsilon)$ be a Hölder continuous function on $\Delta_\Phi$. Fix a point $\bar x^* = (x_i^*)$ which we call the boundary condition. Given a finite volume $V \subset \mathbb{Z}^d$ consider the function on $\Delta_\Phi$
$$\varphi_{n, \bar x^*}(\bar x) = \varphi(\bar x|_V,\ \bar x^*|_{\hat V}). \tag{4.5}$$
One can see that
$$\|\varphi_{n, \bar x^*} - \varphi\|_{q_1^\alpha} \to 0 \tag{4.6}$$
Theorem 4.2. $\mu_{\varphi_{n, \bar x^*}} \xrightarrow{\ \text{weak}^*\ } \mu_\varphi$ independently of the boundary condition $\bar x^*$ (recall that $\mu_{\varphi_{n, \bar x^*}}$ is the unique equilibrium measure corresponding to the function $\varphi_{n, \bar x^*}$ and $\mu_\varphi$ is the unique equilibrium measure corresponding to the function $\varphi$).

4.3. Finite-dimensional $\mathbb{Z}$-approximations I: uncoupled map lattices. We describe some "natural" finite-dimensional approximations of equilibrium measures for coupled map lattices by $\mathbb{Z}$-action equilibrium measures. We first consider an uncoupled map lattice $(F, S)$ in the space $(\mathcal{M}, \rho_q)$. For every volume $V \subset \mathbb{Z}^d$ we set $M_V = \otimes_{i \in V} M_i$, $F_V = \otimes_{i \in V} f_i$, and $\Delta_{F, V} = \otimes_{i \in V} \Lambda_i$. One can see that $M_V$ is a smooth finite-dimensional manifold, $F_V$ is a $C^r$ diffeomorphism of $M_V$, and $\Delta_{F, V}$ is a locally maximal hyperbolic set for $F_V$.

Fix a point $\bar x^* = (x_i^*) \in \Delta_F$ (the boundary condition) and consider a Hölder continuous function $\varphi \in \tilde{\mathcal{F}}(\alpha, q, \varepsilon)$ on $\Delta_F$. Define the function $\psi_{V, \bar x^*}$ on $\Delta_{F, V}$ by
$$\psi_{V, \bar x^*}(x) = \sum_{i \in V} \varphi\bigl(S^i(x,\ x^*|_{\Delta_{F, \hat V}})\bigr). \tag{4.7}$$
Consider the $\mathbb{Z}$-action equilibrium measure $\nu_V$ corresponding to the function $\psi_{V, \bar x^*}$. We can view these measures as being supported on $\mathcal{M}$. Let $\mu_\varphi$ be the $\mathbb{Z}^{d+1}$-action equilibrium measure corresponding to $\varphi$. This measure is concentrated on $\Delta_F$ and thus can also be viewed as being supported on $\mathcal{M}$.

Theorem 4.3. There exists $c_0 > 0$ such that if $0 < \varepsilon \le c_0$ then $\mu_\varphi$ is the limit (in the weak$^*$-topology) of the equilibrium measures $\nu_V$ as $V \to \mathbb{Z}^d$ in the sense of van Hove, i.e., for any fixed $a \in \mathbb{Z}^d$,
$$\lim_{V \to \mathbb{Z}^d} \frac{|\tau^a V \setminus V|}{|V|} = 0.$$

Proof. We consider only the case $d = 1$. For $d > 1$ the arguments are similar. It is sufficient to prove the convergence of the measures $\nu_V^* = \nu_V \circ \bar\pi$ to the measure $\mu^* = \mu_{\varphi^*}$ ($\varphi^* = \varphi \circ \bar\pi$) on the symbolic space $\otimes_{\mathbb{Z}} \Sigma_A$ as $V \to \mathbb{Z}$. Let us fix a configuration $\bar\eta^*$ on $\mathbb{Z}^2$. Given $n > 0$ and $m > 0$, consider the rectangle $V_{nm} = \{x = (i, j) \in \mathbb{Z}^2 : |i| \le n,\ |j| \le m\}$ and define the Gibbs distribution on $V_{nm}$ as follows: for any configuration $\bar\xi(V_{nm})$ over the volume $V_{nm}$ we set
$$\mu_{nm}(\bar\xi(V_{nm})) = \frac{\exp \sum_{x \in V_{nm}} \varphi^*\bigl(\tau^x(\bar\xi(V_{nm}) + \bar\eta^*(\hat V_{nm}))\bigr)}{\sum_{\bar\eta(V_{nm})} \exp \sum_{x \in V_{nm}} \varphi^*\bigl(\tau^x(\bar\eta(V_{nm}) + \bar\eta^*(\hat V_{nm}))\bigr)}. \tag{4.8}$$
Given a finite volume $W \subset \mathbb{Z}^2$, for sufficiently large $n$ and $m$ we have that $W \subset V_{nm}$. Therefore, the set of configurations $\bar\xi(W)$ over $W$ is a subset of the configuration space $\bar\xi(V_{nm})$ over $V_{nm}$. We denote by $\mu_{nm}(\bar\xi(W))$ the measure on this set, where $\mu_{nm}$ is defined by (4.8). By the definition of Gibbs states and the uniqueness of $\mu^*$, the measure $\mu^*$ is the thermodynamic limit of the measures $\mu_{nm}$, i.e., for any finite volume $W \subset \mathbb{Z}^2$ and any configuration $\bar\xi(W)$ over $W$,
$$\mu^*(\bar\xi(W)) = \lim_{V_{nm} \to \mathbb{Z}^2} \mu_{nm}(\bar\xi(W)),$$
where $V_{nm}$ converges to $\mathbb{Z}^2$ in the sense of van Hove. We observe that for each $n > 0$, there exists the limit $\nu_n^* = \lim_{m \to \infty} \mu_{nm}$, which is the $\mathbb{Z}$-action Gibbs state for the function $\psi^*_{V_n, \eta^*}$ on $\Omega_{V_n} = \otimes_{i=-n}^{n} \Sigma_A$. Thus, for each fixed $n$ there exists $m(n)$ such that
$$|\mu_{nm(n)}(\bar\xi(W)) - \nu_n(\bar\xi(W))| \le \frac{1}{n}$$
for every $W \subset V_{nm}$. Notice that $V_{nm(n)} \to \mathbb{Z}^2$ in the sense of van Hove. This implies that $\lim_{n \to \infty} \nu_n = \lim_{n \to \infty} \mu_{nm(n)} = \mu_\varphi$.

4.4. Finite-dimensional $\mathbb{Z}$-approximations II: coupled map lattices. We consider a coupled map lattice $(\Phi, S)$ in the space $(\mathcal{M}, \rho_q)$ and define its finite-dimensional approximations as follows. Fix a point $\bar x^* \in \Delta_\Phi$ (the boundary condition). For any finite volume $V \subset \mathbb{Z}^d$ consider the map on $M_V$,
$$\bigl(\Phi_V(x)\bigr)_i = \bigl(\Phi(x,\ x^*|_{\hat V})\bigr)_i, \tag{4.9}$$
where $(\cdot)_i$ denotes the coordinate at the lattice site $i$. One can see that if the perturbation is sufficiently small then $\Phi_V$ is a diffeomorphism of $M_V$. It can be written as $\Phi_V = G_V \circ F_V$, where $G_V$ is the restriction of $G$ to $M_V$:
$$G_V(x) = G\bigl(F_{\hat V}(x^*|_{\hat V}),\ x\bigr). \tag{4.10}$$
Since the diffeomorphism $\Phi_V$ is close to the diffeomorphism $F_V$, by the structural stability theorem it possesses a locally maximal hyperbolic set which we denote by $\Delta_{\Phi, V}$. Moreover, there exists a conjugacy homeomorphism $h_V : \Delta_{F, V} \to \Delta_{\Phi, V}$ which is close to the identity.

The maps $\Phi_V$ and $h_V$ provide finite-dimensional approximations for the infinite-dimensional maps $\Phi$ and $h$ respectively. In order to describe this in a more explicit way we introduce the following maps:
$$\tilde\Phi_V(\bar x) = \bigl(\Phi_V(\bar x|_V),\ F_{\hat V}(\bar x|_{\hat V})\bigr), \qquad \tilde h_V(\bar x) = \bigl(h_V(\bar x|_V),\ \mathrm{id}_{\hat V}(\bar x|_{\hat V})\bigr).$$
We denote by $d_q^0$ and $d_q^1$ the $C^0$ and, respectively, $C^1$ distances in the space of diffeomorphisms induced by the $\rho_q$-metric. We also use $d(0, \partial V)$ to denote the shortest distance from the origin of the lattice to the boundary of the set $V$.

Theorem 4.4. There exist constants $C > 0$ and $\beta > 0$ such that for any $V \subset V' \subset \mathbb{Z}^d$, (1) $d_q^1(\Phi_V, \Phi_{V'}) \le C e^{-\beta d(0, \partial V)}$ and $\Phi_V \to \Phi$; (2) $d_q^0(h_V, h_{V'}) \le C e^{-\beta d(0, \partial V)}$ and $h_V \to h$.

Proof. The first statement is obvious since $\Phi$ is short ranged. The proof of the second statement is based on arguments in the proof of structural stability (see Theorem 1.1). We recall that the conjugacy map $h$ is determined as a unique fixed point for a contracting map $K$ acting on a ball $D_\gamma(0)$ contained in the Banach space $\Gamma^0(\Delta_F, T\mathcal{M})$ of all continuous vector fields on $\Delta_F$ (see (1.16)). In order to obtain the conjugacy map $h_V$ one needs to find a (unique) fixed point for a contracting map $K_V$ acting in $D_\gamma(0)$ by a formula similar to (1.16):
$$K_V v = -\bigl((DG_V^0)|_0 - \mathrm{Id}\bigr)^{-1}\bigl(G_V^0 v - (DG_V^0)|_0 v\bigr),$$
where $G_V^0 = A \circ G_V \circ A^{-1}$ (see (1.14)) and $G_V^0 \beta = \tilde\Phi_V \circ \beta \circ F^{-1}$. One can show that the contraction coefficient of $F_V$ is uniform over $V$ and that $F_V$ converges exponentially fast to $F$. Therefore, the corresponding fixed point $h_V$ converges exponentially fast to $h$.

For a Hölder continuous function $\varphi \in \tilde{\mathcal{F}}(\alpha, q, \varepsilon)$ on $\Delta_\Phi$ consider the function $\tilde\varphi = \varphi \circ h$ on $\Delta_F$, where $h : \Delta_F \to \Delta_\Phi$ is a conjugacy homeomorphism. Let $\tilde\nu_V$ be the $\mathbb{Z}$-action equilibrium measure on $\Delta_{F, V}$ corresponding to the function $\tilde\psi_{V, \bar x^*}$ which is determined by (4.7) with respect to the function $\tilde\varphi$. Finally, we define the measure $\nu_V = (h_V^{-1})^* \circ \tilde\nu_V$ on $\Delta_{\Phi, V}$. It also can be considered as a measure on $\mathcal{M}$. As a direct consequence of Theorem 4.3 we conclude the following result.

Theorem 4.5. If $\varepsilon$ is sufficiently small then the measure $\mu_\varphi$ is the limit (in the weak$^*$-topology) of the measures $\nu_V$ as $V \to \mathbb{Z}^d$.

5. Existence, Uniqueness, and Ergodic Properties of SRB-Measures

In this section we discuss the problems of existence and uniqueness of Sinai–Bowen–Ruelle measures for coupled map lattices as well as some of their ergodic properties (including mixing and decay of correlations). The first construction of these measures appeared in [BuSi]. In [BK2], Bricmont and Kupiainen constructed these measures for general expanding circle maps. Their approach is based upon the study of the Perron-Frobenius operator. In [PS], Pesin and Sinai developed another method for constructing SRB-measures for coupled map lattices assuming that the local map possesses a hyperbolic attractor. In this section we develop a new approach and obtain stronger results under more general assumptions.

Let $f$ be a $C^r$-diffeomorphism of a compact finite-dimensional manifold $M$ possessing a hyperbolic attractor $\Lambda$. The latter means that $\Lambda$ is a hyperbolic set and there exists an open neighborhood $U$ of $\Lambda$ such that $f(U) \subset U$. In particular, $\Lambda = \bigcap_{n > 0} f^n(U)$ and is a locally maximal invariant set.

We assume that the map $f$ is topologically mixing. Then an SRB-measure $\mu$ on $\Lambda$ is unique and is characterized as follows:

1) the conditional distributions generated by $\mu$ on the unstable manifolds are absolutely continuous with respect to the Lebesgue measure;

2) for any continuous function $g$ and almost all $x \in U$ with respect to the Lebesgue measure in $U$,
$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} g(f^k x) = \int g \, d\mu; \tag{5.1}$$

3) $\mu$ is the unique equilibrium measure corresponding to the Hölder continuous function $\varphi^u(x) = -\log \mathrm{Jac}^u f(x)$, where $\mathrm{Jac}^u f(x)$ denotes the Jacobian of $f$ at $x$ along the unstable subspace.

In the infinite-dimensional case we construct a measure on $\Delta_\Phi$ which has similar properties. This is an SRB-measure for the coupled map lattice. Our construction is based upon symbolic representations of the finite-dimensional approximations of the lattice constructed in the previous section. Let $V \subset \mathbb{Z}^d$ be a finite volume. Consider the diffeomorphisms $F_V$ and $\Phi_V$. Since $\Phi_V$ is close to $F_V$ it has a hyperbolic attractor $\Delta_{\Phi, V}$.
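Property (5.1) of a finite-dimensional SRB-measure can be observed numerically on a standard example. The sketch below is an illustration under assumptions that go beyond the text: the local map is taken to be Arnold's cat map on the 2-torus, an area-preserving Anosov diffeomorphism whose SRB-measure is the Lebesgue measure, and the observable `g` and the starting point are arbitrary choices. The time average is computed in floating-point arithmetic, so it is only a numerical approximation of the left-hand side of (5.1).

```python
import math

def cat_map(x, y):
    """Arnold's cat map (x, y) -> (2x + y, x + y) mod 1 on the 2-torus."""
    return (2.0 * x + y) % 1.0, (x + y) % 1.0

def birkhoff_average(g, x, y, n):
    """(1/n) * sum over k < n of g(f^k(x, y)), the left-hand side of (5.1)."""
    total = 0.0
    for _ in range(n):
        total += g(x, y)
        x, y = cat_map(x, y)
    return total / n

def g(x, y):
    """Test observable; its integral over the torus equals 1/2."""
    return math.cos(2.0 * math.pi * x) ** 2

avg = birkhoff_average(g, 0.1234, 0.5678, 200_000)
print(f"time average = {avg:.4f}, space average = 0.5")
```

For a typical starting point the time average should be close to the space average 1/2, in line with (5.1).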
Since we assume that the map $f$ is topologically mixing, so are the maps $F$, $\Phi$, $F_V$, and $\Phi_V$. Therefore, the map $\Phi_V$ possesses a unique SRB-measure $\mu_V$ that is supported on $\Delta_{\Phi, V}$. This measure is the unique equilibrium measure corresponding to the Hölder continuous function $\varphi_V(x) = -\log \mathrm{Jac}^u \Phi_V(x)$, where $\mathrm{Jac}^u \Phi_V(x)$ is the Jacobian of the map $\Phi_V$ at $x$ along the unstable subspace. We can consider the measure $\mu_V$ to be supported on the compact space $(\mathcal{M}, \rho_q)$. Our main result is the following.

Theorem 5.1. The SRB-measures $\mu_V$ converge in the weak$^*$-topology to a measure on $\mathcal{M}$ which is the unique equilibrium measure $\mu = \mu_\varphi$ corresponding to a Hölder continuous function $\varphi$ on $\mathcal{M}$ and is mixing. Furthermore, the correlation functions decay exponentially for every continuous function on $\mathcal{M}$ satisfying the assumptions of Theorem 3.1.

Remarks. (1) It is clear that for an uncoupled map lattice the SRB-measures $\mu_V$ converge to the measure $\otimes_{i \in \mathbb{Z}^d}\, \mu_f$ which is the equilibrium measure for the potential function $\varphi_0(\bar x) = -\log \mathrm{Jac}^u f(x_0)$. The potential function $\varphi(\bar x)$ of the SRB-measure for a coupled map lattice is a small perturbation of $\varphi_0(\bar x)$. More precisely, $\varphi(\bar x) = \varphi_0(\bar x) + \varphi_1(\bar x)$, where $\varphi_1(\bar x)$ is a Hölder continuous function with sufficiently small Hölder constant. Its precise description is given by (5.15).

(2) We follow the approach suggested in [BK2, BK3]. We thank J. Bricmont, who suggested using formula (5.8) to expand the Jacobian.

(3) To avoid some technical obstacles we assume that $f$ is an Anosov map. In this case $\Delta_{\Phi_V} = \Delta_{F_V} = M_V$. The general case of hyperbolic attractors can be treated in a similar way with the use of Theorem 4.4.

(4) Another approach to the existence of SRB-measures was suggested in [PS]. It is based upon a delicate analysis of conditional measures generated by the measures $\mu_V$ on finite-dimensional unstable manifolds for $\Phi_V$. Combining results in [PS] and Theorem 5.1 one can show that these conditional measures determine, in a unique way, the conditional measures generated by the SRB-measure $\mu_\varphi$ on infinite-dimensional unstable manifolds for $\Phi$. This justifies one of the main characteristic features of SRB-measures.

(5) Using the finite-dimensional approximations approach developed in the proof of Theorem 5.1 one can show that the $\mathbb{Z}^{d+1}$-topological pressure $P_\tau(\varphi) = 0$, where $\varphi$ is the potential function for the SRB-measure. Since the SRB-measure is an equilibrium measure, in view of (2.9) we obtain the entropy formula for the SRB-measure
$$h_\tau(\mu_\varphi) = -\int \varphi \, d\mu_\varphi$$
(see detailed arguments in [J3]).

(6) Another interesting manifestation of our construction of the SRB-measure is the continuous dependence of the entropy on the perturbation $\Phi$. Using arguments in the proof presented below one can show that the potential function depends continuously on the map $\Phi$ in the $\rho_q$-metric. Moreover, the SRB-measure as a Gibbs state is also continuous in the weak sense with respect to the potential function (see Sect. 4.1). Therefore, the entropy formula gives the continuous dependence.

Proof of Theorem 5.1. Let $\pi_V = \otimes_{i \in V} \pi_i$ be the semi-conjugacy map between the symbolic dynamical system $(\sigma_t, \otimes_{i \in V} \Sigma_A)$ and $(F_V, M_V)$ (here $\pi_i$ are copies of the coding map $\pi$). Define the measure $\nu_V$ on $\Sigma_A^V = \otimes_{i \in V} \Sigma_A$ by the following relation: $\mu_V = (h_V \pi_V)^* \nu_V$. It is easy to see that the following statement holds.

Lemma 5.1. The measures $\mu_V$ converge in the weak$^*$-topology to a measure on $\mathcal{M}$ if the measures $\nu_V$ converge in the weak$^*$-topology to a measure on $\Sigma_A^{\mathbb{Z}^d}$ as $V \to \mathbb{Z}^d$.

The desired result is now a consequence of Lemma 5.1 and the following lemma.
Lemma 5.2. The measures $\nu_V$ converge in the weak$^*$ topology to a measure on the $(d+1)$-dimensional lattice spin system $\Sigma_A^{\mathbb{Z}^d}$ which is the unique Gibbs state for a Hölder continuous function. It is also exponentially mixing with respect to the $\mathbb{Z}^{d+1}$-action of the lattice.

Proof of the lemma. Note that the measure $\nu_V$ is the unique Gibbs state for the Hölder continuous function
$$\varphi^*_V(\xi_V) = -\log \mathrm{Jac}^u \Phi_V(h_V\pi_V(\xi_V)) \tag{5.2}$$
on $\Sigma_A^V$. We express the Jacobian $\mathrm{Jac}^u \Phi_V(x_V)$, $x_V \in M_V$, as a product
$$\mathrm{Jac}^u \Phi_V(x_V) = \det\big(D\Phi_V|W^u_{\Phi_V}(x_V)\big) = \det(I + A_V(x_V)) \prod_{i\in V} \mathrm{Jac}^u f(x_i), \tag{5.3}$$
where $I$ is the identity matrix and $A_V$ is a matrix whose entries are submatrices satisfying some special properties which we specify later. Let $E^u_{\Phi_V}(x_V)$ be the unstable subspace at $x_V$ for the map $\Phi_V$. One can see that $E^u_{\Phi_V}(x_V)$ is close to the direct product $\otimes_{i\in V} E^u_f(x_i)$. We choose a basis $\{u_i(x_i), s_i(x_i), i \in V\}$ in the space $\otimes_{i\in V} T_{x_i}M = \otimes_{i\in V} E^u_f(x_i) \otimes \otimes_{i\in V} E^s_f(x_i)$ such that $u_i(x_i)$ and $s_i(x_i)$ are bases in $E^u_f(x_i)$ and $E^s_f(x_i)$ respectively, and we assume that they depend Hölder continuously on the base point $x_V$. The derivative $D\Phi_V(x_V)$ can now be written as follows:
$$D\Phi_V(x_V) = \begin{pmatrix} (D^u f(x_i)) & 0 \\ 0 & (D^s f(x_i)) \end{pmatrix} \left( I + \begin{pmatrix} (a^{uu}_{ij}(x_V)) & (a^{us}_{ij}(x_V)) \\ (a^{su}_{ij}(x_V)) & (a^{ss}_{ij}(x_V)) \end{pmatrix} \right), \tag{5.4}$$
where we arrange the elements of the basis $\{u_i(x_i), s_i(x_i), i \in V\}$ in an arbitrary linear order, $u_i$ first, followed by $s_i$. Since $\Phi$ is $C^1$-close to $F$ and is short ranged, the submatrices $(a^*_{ij}(x_V))$ satisfy the following conditions (we use $*$ to denote one of the symbols $uu$, $us$, $su$, or $ss$):

(1) $\|a^*_{ij}(x_V)\| \le \varepsilon e^{-\beta|i-j|}$, where $|i-j|$ is the distance between the lattice sites $i$ and $j$, and the constants $\varepsilon > 0$ and $\beta > 0$ are independent of the volume $V$ as well as of the base point $x_V$;

(2) each submatrix $a^*_{ij}(x_V)$ depends Hölder continuously on $x_V$:
$$\|a^*_{ij}(x_V) - a^*_{ij}(y_V)\| \le \varepsilon e^{-\beta|i-k|} d^\delta(x_k, y_k), \tag{5.5}$$
where $x_V = (x_i)$ and $y_V = (y_i)$ are such that $x_i = y_i$ for $i \ne k$ (recall that $d$ is the Riemannian distance on $M$).

The constant $\varepsilon > 0$ can be chosen arbitrarily small as the $C^1$-distance between $\Phi$ and $F$ goes to zero. The constant $\delta$ is independent of the volume $V$ and the base point $x_V$.

Using the graph transform technique one can identify the unstable subspace $E^u_{\Phi_V}(x_V)$ with the graph of a linear map $H_{x_V} : \otimes_{i\in V} E^u_f(x_i) \to \otimes_{i\in V} E^s_f(x_i)$, i.e.,
$$E^u_{\Phi_V}(x_V) = \big(\otimes_{i\in V} E^u_f(x_i),\; H_{x_V} \otimes_{i\in V} E^u_f(x_i)\big). \tag{5.6}$$
The linear map $H_{x_V}$ has a unique matrix representation $(c^{us}_{ij})$ in the basis $\{u_i(x_i), s_i(x_i)\}$,
$$H_{x_V} u_i(x_i) = \sum_j c^{us}_{ij} s_j(x_j), \tag{5.7}$$
where each submatrix $c^{us}_{ij}$ satisfies conditions similar to Conditions (1) and (2):
(3) $\|c^{us}_{ij}\| \le \varepsilon e^{-\beta|i-j|}$;

(4) $\|c^{us}_{ij}(x_V) - c^{us}_{ij}(y_V)\| \le \varepsilon e^{-\beta|i-k|} d^\delta(x_k, y_k)$, where $x_V = (x_i)$ and $y_V = (y_i)$ are such that $x_i = y_i$ for $i \ne k$.

To prove Condition (3) one can use the graph transform technique in the form described in [JLP] and combine it with the fact that the linear map $H_{x_V}$ is short ranged. Condition (4) follows from the fact that the distributions $E^u_{\Phi_V}(x_V)$, $\otimes_{i\in V} E^u_f(x_i)$, and $\otimes_{i\in V} E^s_f(x_i)$ depend Hölder continuously on the base point $x_V$. Moreover, the entries $c^{us}_{ij}$ satisfy the following crucial condition which allows one to pass from a finite volume to a bigger one:

(5) $\|c^{us}_{ij}(x_V) - c^{us}_{ij}(y_{V'})\| \le \varepsilon e^{-\beta d(i,\partial V)}$ for any finite volume $V \subset V'$ and any point $y_{V'}$ satisfying $y_{V'}|V = x_V$.

In order to prove (5), we apply the graph transform technique to the map $\Phi_{V'}$ on $M_{V'}$ with the $\rho_q$-metric restricted to $M_{V'}$. Note that the $\rho_q$-distance between $\Phi_{V'}$ and $\Phi_V \otimes F_{V'\setminus V}$ is proportional to $e^{-\beta d(V)}$. Therefore, using results in [PS] we obtain that the $\rho_q$-distance between the subspaces $E^{u,s}_{\Phi_{V'}}(x_V)$ and $E^{u,s}_{\Phi_V}(x_V) \otimes_{i\in V'\setminus V} E^{u,s}_f(y_i)$ is also proportional to $e^{-\beta d(V)}$. Hence, so is the $\rho_q$-distance between the linear operators $H_{x_{V'}}$ and $H_{x_V}$. This implies (5).

We choose $\{\tilde u_i\} = \{u_i + Hu_i\} = \{u_i + \sum_j c^{us}_{ij} s_j\}$ as a basis in $E^u_{\Phi_V}(x_V)$, and we write the derivative $D\Phi|E^u_{\Phi_V}(x_V)$ in the new basis $\{\tilde u_i, s_i, i \in V\}$ in the following matrix form:
$$D\Phi|E^u_{\Phi_V}(x_V) = (D^u f(x_i))\big(I + a^{uu}_{ij}(x_V)\big) + \big(a^{us}_{ij}(x_V)\big)\big(c^{us}_{ij}(x_V)\big).$$
The latter expression can be rewritten in the form $(D^u f(x_i))(I + (a_{ij}(x_V)))$, where $A_V(x_V) = (a_{ij}(x_V))$ is the matrix whose submatrix entries $a_{ij}(x_V)$ satisfy the following conditions (which follow immediately from (1)–(5)):

(6) $\|a_{ij}\| \le \varepsilon e^{-\beta|i-j|}$;

(7) $\|a_{ij}(x_V) - a_{ij}(y_V)\| \le \varepsilon e^{-\beta|i-k|} d^\delta(x_k, y_k)$, where $x_V = (x_i)$ and $y_V = (y_i)$ are such that $x_i = y_i$ for $i \ne k$;

(8) $\|a_{ij}(x_V) - a_{ij}(y_{V'})\| \le \varepsilon e^{-\beta d(i,\partial V)}$ for any $V \subset V'$.

Next, we apply the well-known formula $\det(\exp(B)) = \exp(\mathrm{trace}(B))$. In our case, $\exp(B) = I + A_V(x_V)$ and hence
$$\det(I + A_V) = \exp\big(\mathrm{trace}(\ln(I + A_V))\big) = \exp\Big(-\sum_{i\in V} w_{Vi}\Big), \tag{5.8}$$
where
$$w_{Vi}(x_V) = \sum_{n=1}^{\infty} \frac{(-1)^n}{n}\, \mathrm{trace}\big(a^n_{ii}(x_V)\big) \tag{5.9}$$
and $a^n_{ii}(x_V)$ are the submatrices on the main diagonal of $(A_V)^n$.
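The expansion (5.8)–(5.9) can be checked numerically on a toy example. The sketch below is only an illustration under simplifying assumptions that are not in the text: the lattice is a one-dimensional segment, each block $a_{ij}$ is a $1\times 1$ scalar equal to $\varepsilon e^{-\beta|i-j|}$ (so that Condition (6) holds with equality), and the series (5.9) is truncated after a fixed number of terms. The helper names are hypothetical.

```python
import math


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    n = len(m)
    m = [row[:] for row in m]
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        if m[p][c] == 0.0:
            return 0.0
        if p != c:
            m[c], m[p] = m[p], m[c]
            d = -d
        d *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= f * m[c][k]
    return d


def site_weights(a, terms=60):
    """Truncated series (5.9) with scalar blocks: w_i = sum_n (-1)^n/n (A^n)_ii."""
    n = len(a)
    w = [0.0] * n
    power = [row[:] for row in a]  # holds A^k during the loop
    for k in range(1, terms + 1):
        for i in range(n):
            w[i] += (-1) ** k / k * power[i][i]
        power = mat_mul(power, a)
    return w


def decaying_matrix(size, eps, beta):
    """Assumed toy coupling: a_ij = eps * exp(-beta * |i - j|)."""
    return [[eps * math.exp(-beta * abs(i - j)) for j in range(size)]
            for i in range(size)]


size = 6
a = decaying_matrix(size, eps=0.05, beta=1.0)
identity_plus_a = [[(1.0 if i == j else 0.0) + a[i][j] for j in range(size)]
                   for i in range(size)]
lhs = det(identity_plus_a)             # det(I + A_V)
rhs = math.exp(-sum(site_weights(a)))  # exp(-sum_i w_Vi)
print(lhs, rhs)
```

For small $\varepsilon$ the two printed numbers should agree up to rounding, since the logarithmic series converges geometrically once the norm of $A_V$ is below one.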
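Sublemma. The functions $w_{Vi}(x_V)$ satisfy:

(1) $|w_{Vi}(x_V)| \le C\varepsilon$;

(2) $|w_{Vi}(x_V) - w_{Vi}(y_V)| \le C\varepsilon \exp(-\frac{\beta}{2}|i-k|)\, d^\delta(x_k, y_k)$, where $x_V = (x_i)$ and $y_V = (y_i)$ are such that $x_i = y_i$ for $i \ne k$;

(3) if $V \subset V'$ then $|w_{Vi}(x) - w_{V'i}(y)| \le C\varepsilon \exp(-\frac{\beta}{2} d(i, \partial V))$;

(4) there exists the limit $\varphi_i = \lim_{V\to\mathbb{Z}^d} w_{Vi}(x)$ which is translation invariant in the following sense: $\varphi_i(\bar x) = \psi(\sigma^i_s \bar x)$. Moreover, $\psi$ is Hölder continuous with Hölder constant which goes to zero as $\varepsilon \to 0$.

Proof of the sublemma. The proof is a straightforward calculation. We first show the following inequality:
$$\|a^n_{ij}\| \le (C\varepsilon)^n e^{-\tilde\beta|i-j|}, \tag{5.10}$$
where $\tilde\beta$ is a number smaller than $\beta$ and $C = C(\tilde\beta)$ is a constant. We use induction. For $n = 2$ we have
$$\begin{aligned}
\|a^2_{ij}\| = \Big\|\sum_{l\in V} a_{il} a_{lj}\Big\| &\le \sum_{l\in V} \varepsilon^2 \exp(-\beta(|i-l| + |l-j|)) \\
&\le \sum_{l\in V} \varepsilon^2 \exp\big(-\tilde\beta(|i-l| + |l-j|) - (\beta - \tilde\beta)|l-j|\big) \\
&\le \varepsilon^2 e^{-\tilde\beta|i-j|} \sum_{l\in V} \exp(-(\beta-\tilde\beta)|l-j|) \le C\varepsilon^2 e^{-\tilde\beta|i-j|},
\end{aligned} \tag{5.11}$$
where $C = C(\tilde\beta) = \sum_{l\in\mathbb{Z}^d} \exp(-(\beta-\tilde\beta)|l|)$. Let us assume that $\|a^{n-1}_{ij}\| \le C^{n-2}\varepsilon^{n-1} \exp(-\tilde\beta|i-j|)$. Then
$$\begin{aligned}
\|a^n_{ij}\| = \Big\|\sum_{l\in V} a^{n-1}_{il} a_{lj}\Big\| &\le \sum_{l\in V} C^{n-2}\varepsilon^n \exp\big(-\tilde\beta(|i-l| + |l-j|) - (\beta-\tilde\beta)|l-j|\big) \\
&\le C^{n-1}\varepsilon^n \exp(-\tilde\beta|i-j|).
\end{aligned} \tag{5.12}$$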
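The base case (5.11) is easy to test numerically. The following is a minimal sketch under assumptions that are not part of the proof: the lattice is one-dimensional ($d = 1$), the volume $V$ is a finite segment, and the entries are scalars saturating Condition (6), $a_{ij} = \varepsilon e^{-\beta|i-j|}$. In $d = 1$ the constant $C(\tilde\beta)$ is a geometric sum, $\sum_{l\in\mathbb{Z}} r^{|l|} = (1+r)/(1-r)$ with $r = e^{-(\beta-\tilde\beta)}$. The function name is hypothetical.

```python
import math


def two_step_bound_holds(size, eps, beta, beta_tilde):
    """Check (a^2)_ij <= C eps^2 exp(-beta_tilde |i-j|) on a segment of length `size`."""
    r = math.exp(-(beta - beta_tilde))
    c = (1.0 + r) / (1.0 - r)  # sum over l in Z of exp(-(beta - beta_tilde)|l|)
    for i in range(size):
        for j in range(size):
            # Entry (i, j) of A^2 for the assumed scalar coupling.
            a2 = sum(eps * math.exp(-beta * abs(i - l)) *
                     eps * math.exp(-beta * abs(l - j))
                     for l in range(size))
            bound = c * eps ** 2 * math.exp(-beta_tilde * abs(i - j))
            if a2 > bound:
                return False
    return True


print(two_step_bound_holds(size=20, eps=0.1, beta=1.0, beta_tilde=0.5))
```

The check relies only on the triangle inequality $|i-l| + |l-j| \ge |i-j|$, which is the same step used in passing from the second to the third line of (5.11).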
Therefore, Statement 1 follows directly from the definition of $w_{Vi}$. To prove Statement 2 we need only to show the following inequality:
$$\|a^n_{ij}(x_V) - a^n_{ij}(y_V)\| \le (C\varepsilon)^n e^{-\frac{\beta}{2}|i-k|} d^\delta(x_k, y_k),$$
where $x_V = (x_i)$ and $y_V = (y_i)$ are such that $x_i = y_i$ for $i \ne k$. We again use induction. For $n = 2$,
$$\begin{aligned}
\|a^2_{ij}(x_V) - a^2_{ij}(y_V)\| &= \Big\|\sum_{l\in V} a_{il}(x_V) a_{lj}(x_V) - a_{il}(y_V) a_{lj}(y_V)\Big\| \\
&= \Big\|\sum_{l\in V} a_{il}(x_V)[a_{lj}(x_V) - a_{lj}(y_V)] + a_{lj}(y_V)[a_{il}(x_V) - a_{il}(y_V)]\Big\| \\
&\le \sum_{l\in V} \varepsilon^2 \big[\exp(-\beta(|l-k| + |i-l|)) + \exp(-\beta(|l-j| + |i-k|))\big] d^\delta(x_k, y_k) \\
&\le C\varepsilon^2 \exp\Big(-\frac{\beta}{2}|i-k|\Big) d^\delta(x_k, y_k),
\end{aligned} \tag{5.13}$$
where $C = 2\sum_{l\in\mathbb{Z}^d} \exp(-\frac{\beta}{2}|l|)$. For $n > 2$ we argue similarly using Statement (1):
$$\begin{aligned}
\|a^n_{ij}(x_V) - a^n_{ij}(y_V)\| &= \Big\|\sum_{l\in V} a^{n-1}_{il}(x_V) a_{lj}(x_V) - a^{n-1}_{il}(y_V) a_{lj}(y_V)\Big\| \\
&= \Big\|\sum_{l\in V} a^{n-1}_{il}(x_V)[a_{lj}(x_V) - a_{lj}(y_V)] + a_{lj}(y_V)[a^{n-1}_{il}(x_V) - a^{n-1}_{il}(y_V)]\Big\| \\
&\le \sum_{l\in V} (C\varepsilon)^{n-1}\varepsilon \exp\Big(-\frac{\beta}{2}|i-l| - \beta|l-k| - \beta|l-j| - \frac{\beta}{2}|i-k|\Big) d^\delta(x_k, y_k) \\
&\le (C\varepsilon)^n \exp\Big(-\frac{\beta}{2}|i-k|\Big) d^\delta(x_k, y_k).
\end{aligned} \tag{5.14}$$
Statement 3 follows from Condition (8), while Statement 4 is a consequence of Statements 2 and 3 and our assumption that the map $\Phi$ is spatial translation invariant.

We proceed with the proof of the theorem. Let $V$ be a $d$-dimensional cube centered at the origin. Choose any finite volume $V_0 \subset V$ and numbers $0 < m < n$. We have that
$$\nu_V(\xi_{(V_0,m)}) = \lim_{n\to\infty} \nu_V\big(\xi_{(V_0,m)} \,\big|\, \eta^*_{\widehat{(V,n)}}\big).$$
In order to obtain the desired result we shall show that the one-dimensional Gibbs distributions $\nu_V(\xi_{(V,n)} | \eta^*_{\widehat{(V,n)}})$ have a unique thermodynamic limit as $V \to \mathbb{Z}^{d+1}$ and $n \to \infty$. This thermodynamic limit is precisely the unique $(d+1)$-Gibbs state for the potential function
$$\varphi^*(\bar\xi) = (\psi - \log \mathrm{Jac}^u f)(h\bar\pi(\bar\xi)) \tag{5.15}$$
on $\Sigma_A^{\mathbb{Z}^d}$, where $\psi$ is defined in Statement 4 of Sublemma. Note that the function $\varphi^*$ is the sum of two functions, $\varphi^* = \varphi^*_0 + \varphi^*_1$, where
$$\varphi^*_0 = -\log \mathrm{Jac}^u f \circ \bar\pi \quad\text{and}\quad \varphi^*_1 = (\psi - \log \mathrm{Jac}^u f)\circ h \circ \bar\pi + \log \mathrm{Jac}^u f \circ \bar\pi.$$
By Statements 1, 2, and 4 of Sublemma and Theorem 1.1 the function $\varphi^*_1$ is Hölder continuous with a small Hölder constant in the metric $\rho_q$ provided $\varepsilon$ is sufficiently small. The function $\varphi^*_0$ is also Hölder continuous and depends only on the coordinate $\xi_0$. Therefore, by Theorem 3.7 the Gibbs state corresponding to this function is unique.

Since the measure $\nu_V$ is the unique Gibbs state for the Hölder continuous function $\varphi^*_V(\xi_V)$ on $\Sigma_A^V$ (see (5.2)), it satisfies the following equation [Ru]: given a configuration $\eta^* \in \Sigma_A^{\mathbb{Z}^d}$,
$$\nu_V\big(\xi_{(V,n)} \,\big|\, \eta^*_{\widehat{(V,n)}}\big) = \frac{\exp \sum_{k\in\mathbb{Z}} \varphi^*_V\big(\sigma^k_t(\xi_{(V,n)} + \eta^*_{\widehat{(V,n)}})\big)}{\sum_{\eta_{(V,n)}} \exp \sum_{k\in\mathbb{Z}} \varphi^*_V\big(\sigma^k_t(\eta_{(V,n)} + \eta^*_{\widehat{(V,n)}})\big)}, \tag{5.16}$$
where $\xi_{(V,n)}$ is a configuration over the finite volume $(V,n) = V \times [-n,n] \subset \mathbb{Z}^{d+1}$, and $\eta^*_{\widehat{(V,n)}}$ is the restriction of the configuration $\eta^*$ to $\widehat{(V,n)} = \mathbb{Z}^{d+1} \setminus V \times [-n,n]$. Using (5.3) and (5.8) we rewrite (5.16) in the following way:
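$$\begin{aligned}
\nu_V\big(\xi_{(V,n)} \,\big|\, \eta^*_{\widehat{(V,n)}}\big)
&= \frac{\exp \sum_{k\in\mathbb{Z}} \varphi^*_V\big(h_V\pi_V\sigma^k_t(\xi_{(V,n)} + \eta^*_{\widehat{(V,n)}})\big)}{\sum_{\eta_{(V,n)}} \exp \sum_{k\in\mathbb{Z}} \varphi^*_V\big(h_V\pi_V\sigma^k_t(\eta_{(V,n)} + \eta^*_{\widehat{(V,n)}})\big)} \\
&= \frac{\exp \sum_{k\in\mathbb{Z}} \sum_{i\in V} (w_{Vi} - \log \mathrm{Jac}^u f)\big(h_V\pi_V\sigma^k_t(\xi_{(V,n)} + \eta^*_{\widehat{(V,n)}})\big)}{\sum_{\eta_{(V,n)}} \exp \sum_{k\in\mathbb{Z}} \sum_{i\in V} (w_{Vi} - \log \mathrm{Jac}^u f)\big(h_V\pi_V\sigma^k_t(\eta_{(V,n)} + \eta^*_{\widehat{(V,n)}})\big)}.
\end{aligned}$$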
The rest of the proof is split into the following steps.

Step 1. We wish to rewrite the last expression for the conditional distributions $\nu_V(\xi_{(V,n)} | \eta^*_{\widehat{(V,n)}})$ in terms of potentials (see Sect. 3). The potential $U$ corresponding to the function $(\psi - \log \mathrm{Jac}^u f)(h\bar\pi)$ can be constructed using (3.9)–(3.12). Given a finite volume $V$ and $i \in V$, consider the function $(w_{Vi} - \log \mathrm{Jac}^u f)(h_V\pi_V)$. In order to construct the potential $U^{Vi}$ corresponding to this function we again follow the procedure described in Sect. 3 and use $(w_{Vi} - \log \mathrm{Jac}^u f)(h_V\pi_V)$ for each $\mathbb{Z}^{d+1}$-cube centered at $(i,k) \in V \times \mathbb{Z}$. Note that the resulting potential is invariant under time translations but may not be invariant under spatial translations.

Step 2. We now rewrite the distributions $\nu_V(\xi_{(V,n)} | \eta^*_{\widehat{(V,n)}})$ in terms of the potentials $U^{Vi}$:
$$\nu_V\big(\xi_{(V,n)} \,\big|\, \eta^*_{\widehat{(V,n)}}\big) = \frac{\exp \sum_{Q\cap(V,n)\ne\emptyset} U^{Vi}_Q\big(\xi_{(V,n)} + \eta^*_{\widehat{(V,n)}}\big)}{\sum_{\eta_{(V,n)}} \exp \sum_{Q\cap(V,n)\ne\emptyset} U^{Vi}_Q\big(\eta_{(V,n)} + \eta^*_{\widehat{(V,n)}}\big)}. \tag{5.17}$$

Step 3. By Statement 3 of Sublemma, $w_{Vi} \to \varphi_i = \psi(\sigma^i_s)$ exponentially fast. Using the fact that $h_V \to h$ exponentially fast in the $\rho_q$-metric (see Theorem 4.4) we obtain that for any $\mathbb{Z}^{d+1}$-cube $Q$ centered at $(i,k) \in V \times \mathbb{Z}$,
$$|U^{Vi}(\xi(Q)) - U(\xi(Q))| \le C\varepsilon e^{-\beta d(i,\partial V)}. \tag{5.18}$$
By Statement 2 of Sublemma both potentials $U^{Vi}|Q$ and $U|Q$ go to zero exponentially fast as the side length of $Q$ increases.
Step 4. Take a larger volume $(V',n') \subset \mathbb{Z}^{d+1}$ such that $(V,n) \subset (V',n')/2 = (V'/2, n'/2)$, where $V'/2$ is the $d$-dimensional cube centered at the origin of the side length equal to $1/2$ of the side length of $V'$. We follow the approach elaborated by Ruelle in [Ru] (see Sect. 1.7). (For the reader's convenience we provide the correspondence between Ruelle's notations and ours: $M = (V',n')$, $\Lambda = (V,n)$, $X = Q$, and $\Phi = U^{Vi}, U$.) We first decompose the numerator of (5.17) (for volume $(V',n')$) into two terms:
$$\exp \sum_{Q\cap(V',n')\ne\emptyset} U^{Vi}_Q\big(\xi_{(V,n)} + \eta^*_{\widehat{(V,n)}}\big) = \exp\big[H_{(V,n)}(\xi_{(V,n)}) + B_{(V',n')}(\xi_{(V',n')})\big],$$
where the main term $H_{(V,n)}(\xi_{(V,n)})$, the Hamiltonian in volume $(V,n)$, is given by
$$H_{(V,n)}(\xi_{(V,n)}) = \sum_{Q\subset(V,n)} U_Q\big(\xi_{(V,n)} + \eta^*_{\widehat{(V,n)}}\big),$$
while the boundary term is given as follows:
$$\begin{aligned}
B_{(V',n')}(\xi_{(V',n')}) = {} & \sum_{Q\cap(V',n')\ne\emptyset} \Big[ U^{V'i}_Q\big(\xi_{(V',n')} + \eta^*_{\widehat{(V',n')}}\big) - U_Q\big(\xi_{(V',n')} + \eta^*_{\widehat{(V',n')}}\big) \Big] \\
& + \sum_{\substack{Q\cap(V',n')\ne\emptyset \\ Q\cap\widehat{(V',n')}\ne\emptyset}} U_Q\big(\xi_{(V',n')} + \eta^*_{\widehat M}\big).
\end{aligned}$$
By (5.17) and results in [Ru] (see Sect. 1.6) we now only need to verify that the boundary term satisfies the conditions stated in Sect. 1.7 of [Ru]. We first split $B_{(V',n')}(\xi_{(V',n')})$ into two terms, $B_{(V',n')}(\xi_{(V',n')}) = B'(\eta) + B''(\xi_{(V,n)} + \eta)$, where $\xi_{(V',n')} = \xi_{(V,n)} + \eta$ and $B'(\eta)$ collects the terms depending only on the configuration $\eta$ over $(V',n')\setminus(V,n)$, i.e.,
$$\begin{aligned}
B'(\eta) = {} & \sum_{\substack{Q\cap(V',n')\ne\emptyset \\ Q\cap(V,n)=\emptyset}} \Big[ U^{V'i}_Q\big(\xi_{(V',n')} + \eta^*_{\widehat{(V',n')}}\big) - U_Q\big(\xi_{(V',n')} + \eta^*_{\widehat{(V',n')}}\big) \Big] \\
& + {\sum_Q}^{*}\, U_Q\big(\xi_{(V',n')} + \eta^*_{\widehat{(V',n')}}\big),
\end{aligned}$$
while the second term is given as follows:
$$\begin{aligned}
B''(\xi_{(V,n)} + \eta) = {} & \sum_{Q\cap(V,n)\ne\emptyset} \Big[ U^{V'i}_Q\big(\xi_{(V',n')} + \eta^*_{\widehat{(V',n')}}\big) - U_Q\big(\xi_{(V',n')} + \eta^*_{\widehat{(V',n')}}\big) \Big] \\
& + {\sum_Q}^{**}\, U_Q\big(\xi_{(V',n')} + \eta^*_{\widehat{(V',n')}}\big).
\end{aligned}$$
Here $\sum^*_Q$ runs over $\{Q : Q\cap(V',n')\ne\emptyset,\ Q\cap\widehat{(V',n')}\ne\emptyset,\ Q\cap(V,n)=\emptyset\}$ and $\sum^{**}_Q$ runs over $\{Q : Q\cap(V',n')\ne\emptyset,\ Q\cap\widehat{(V',n')}\ne\emptyset,\ Q\cap(V,n)\ne\emptyset\}$.

According to [Ru], in order to show that the thermodynamic limit of $\nu_V(\xi_{(V,n)} | \eta^*_{\widehat{(V,n)}})$ goes to a $\mathbb{Z}^{d+1}$-Gibbs state of $U$, we only need to check that for any fixed $(V,n)$, $B''(\xi_{(V,n)} + \eta)$, as a function of the configuration $\eta$ over $(V',n')\setminus(V,n)$, goes to zero uniformly in $\eta$ as $(V',n') \to \mathbb{Z}^{d+1}$. The second sum in $B''$, $\sum^{**}_Q$, goes to zero uniformly since the potential $U$ decays exponentially. The first sum in $B''$ can be further decomposed into two sums. Let $(i(Q), k(Q)) \in \mathbb{Z}^{d+1}$ denote the center of $Q$. We may assume that $(V',n')$ is a $\mathbb{Z}^{d+1}$-cube with equal sides. Then,
$$\begin{aligned}
& \sum_{Q\cap(V,n)\ne\emptyset} \Big[ U^{V'i}_Q\big(\xi_{(V',n')} + \eta^*_{\widehat{(V',n')}}\big) - U_Q\big(\xi_{(V',n')} + \eta^*_{\widehat{(V',n')}}\big) \Big] \\
& \quad = \Big( \sum_{\substack{i(Q)\in(V',n')/2 \\ Q\cap(V,n)\ne\emptyset}} + \sum_{\substack{i(Q)\notin(V',n')/2 \\ Q\cap(V,n)\ne\emptyset}} \Big) \Big[ U^{V'i}_Q\big(\xi_{(V',n')} + \eta^*_{\widehat{(V',n')}}\big) - U_Q\big(\xi_{(V',n')} + \eta^*_{\widehat{(V',n')}}\big) \Big].
\end{aligned}$$
By (5.18) we have
$$\Big| \sum_{\substack{i(Q)\in(V',n')/2 \\ Q\cap(V,n)\ne\emptyset}} \Big[ U^{V'i}_Q\big(\xi_{(V',n')} + \eta^*_{\widehat{(V',n')}}\big) - U_Q\big(\xi_{(V',n')} + \eta^*_{\widehat{(V',n')}}\big) \Big] \Big| \le C'\varepsilon\, |(V,n)|\, |(V',n')/2|\, e^{-\beta d((V',n'))},$$
where $|(V,n)|$ and $|(V',n')/2|$ are the cardinalities of the corresponding sets and $d((V',n'))$ is the side length of $(V',n')$. The sum
$$\sum_{\substack{i(Q)\notin(V',n')/2 \\ Q\cap(V,n)\ne\emptyset}} \Big[ U^{V'i}_Q\big(\xi_{(V',n')} + \eta^*_{\widehat{(V',n')}}\big) - U_Q\big(\xi_{(V',n')} + \eta^*_{\widehat{(V',n')}}\big) \Big]$$
also goes to zero uniformly as $d((V',n')) \to \infty$ since both potentials $U^{V'i}$ and $U$ go to zero exponentially fast as $d((V',n')) \to \infty$. This completes the proof of the theorem.
Appendix: Spin Lattice Systems

1. Abstract Polymer Expansion Theorem. Consider a finite or countable set $\Theta$. Its elements are called (abstract) contours and denoted by $\theta$, $\theta'$, etc. Fix some reflexive and symmetric relation on $\Theta \times \Theta$. A pair $\theta, \theta' \in \Theta \times \Theta$ is called incompatible ($\theta \not\sim \theta'$) if it belongs to the given relation. Otherwise, this pair is called compatible ($\theta \sim \theta'$). A collection $\{\theta_j\}$ is called a compatible collection of contours if any two of its elements are compatible. A statistical weight $w$ is a complex function on the set of contours. For any finite subset $\Lambda \subseteq \Theta$ an abstract partition function is defined as
$$Z(\Lambda) = \sum_{\{\theta_j\}\subseteq\Lambda} \prod_j w(\theta_j), \tag{A1.1}$$
where the sum is extended to all compatible collections of contours $\theta_i \in \Lambda$. The empty collection is compatible by definition and it is included in $Z(\Lambda)$ with statistical weight 1.

A polymer $\wp = [\theta_i^{\alpha_i}]$ is an (unordered) finite collection of different contours $\theta_i \in \Theta$ with positive integer multiplicities $\alpha_i$ such that for every pair $\theta', \theta'' \in \wp$ there exists a sequence $\theta' = \theta_{i_1}, \theta_{i_2}, \ldots, \theta_{i_s} = \theta'' \in \wp$ with $\theta_{i_j} \not\sim \theta_{i_{j+1}}$, $j = 1, 2, \ldots, s-1$. The notation $\wp \subseteq \Lambda$ means that $\theta_i \in \Lambda$ for every $\theta_i \in \wp$.

With every polymer $\wp$ we associate an (abstract) graph $\Gamma(\wp)$ which consists of $\sum_i \alpha_i$ vertices labeled by the contours from $\wp$ and edges joining every two vertices labeled by incompatible contours. It follows from the definition of $\Gamma(\wp)$ that it is connected, and we denote by $r(\wp)$ the quantity
$$r(\wp) = \prod_i (\alpha_i!)^{-1} \sum_{\Gamma'\subset\Gamma(\wp)} (-1)^{|\Gamma'|}, \tag{A1.2}$$
where the sum is taken over all connected subgraphs $\Gamma'$ of $\Gamma(\wp)$ containing all of the $\sum_i \alpha_i$ vertices and $|\Gamma'|$ denotes the number of edges in $\Gamma'$. For any $\theta \in \wp$ we denote by $\alpha(\theta, \wp)$ the multiplicity of $\theta$ in the polymer $\wp$.

The polymer expansion theorem below is a modification of results of [Se] and [KP] proven in [MSu] (see also [D2] for closely related results).

Abstract Polymer Expansion Theorem. Suppose that there exists a function $a(\theta) : \Theta \to \mathbb{R}^+$ such that for any contour $\theta$
$$\sum_{\theta' :\, \theta'\not\sim\theta} |w(\theta')| e^{a(\theta')} \le a(\theta). \tag{A1.3}$$
Then, for any finite $\Lambda$,
$$\log Z(\Lambda) = \sum_{\wp\subseteq\Lambda} w(\wp), \tag{A1.4}$$
where the statistical weight of a polymer $\wp = [\theta_i^{\alpha_i}]$ equals
$$w(\wp) = r(\wp) \prod_i w(\theta_i)^{\alpha_i}. \tag{A1.5}$$
Moreover, the series (A1.4) converges absolutely in view of the estimate
$$\sum_{\wp :\, \wp\ni\theta} \alpha(\theta, \wp) |w(\wp)| \le |w(\theta)| e^{a(\theta)}, \tag{A1.6}$$
which holds true for any contour $\theta$.

2. Gibbs States. Let $S = \{1, 2, \ldots, p\}$ and let $A$ be a $p \times p$ transfer matrix with entries $a_{ij}$ equal to either 0 or 1. Assume that $A$ is transitive, i.e., there is a constant $n_0$ such that every entry of $A^{n_0}$ is positive. For any volume $V \subseteq \mathbb{Z}^2$ a configuration in $V$ is an element $\eta(V)$ of $S^V$ with the value $\eta_x(V)$ at the point $x = (i,j) \in V$. A configuration $\eta$ is called admissible if $a_{\eta_{x_1}\eta_{x_2}} = 1$ for any pair $x_1 = (i,j)$, $x_2 = (i,j+1) \in V$. For a family of configurations $\eta(V_i)$ in mutually disjoint volumes $V_i$ we denote by $\sum_i \eta(V_i)$ the corresponding configuration in $\cup_i V_i$ provided such a configuration exists (i.e., is admissible). When $V = \mathbb{Z}^2$ we have the configuration space $\Sigma_A^{\mathbb{Z}} = \bigotimes_{\mathbb{Z}} \Sigma_A$, where $\Sigma_A$ is the subshift generated by the matrix $A$.

Let $Q$ be a square in $\mathbb{Z}^2$ and $l(Q)$ its side length. Consider a potential $U$ satisfying
$$0 \le U(\eta(Q)) \le \exp[-l(Q)] \tag{A2.1}$$
for every square $Q \subset \mathbb{Z}^2$. Take a finite volume $V$ and fix a configuration $\eta'$ over $\widehat V = \mathbb{Z}^2 \setminus V$. The configuration $\eta'(\widehat V)$ is called a boundary condition. Conditional Gibbs distributions over $V$ under the boundary condition $\eta'(\widehat V)$ are defined by
$$\mu_{V,\eta'}(\eta(V)) = \frac{\exp\big[H(\eta(V)|\eta'(\widehat V))\big]}{\Xi(V|\eta'(\widehat V))}. \tag{A2.2}$$
Here $\eta(V)$ is a configuration over $V$ such that $\eta(V) + \eta'(\widehat V)$ is also a configuration in $\mathbb{Z}^2$,
$$H(\eta(V)|\eta'(\widehat V)) = -\sum_{Q\subseteq V} U(\eta(Q)) - \sum_{Q\cap V\ne\emptyset,\ Q\cap\widehat V\ne\emptyset} U\big(\eta(Q\cap V) + \eta'(Q\cap\widehat V)\big) \tag{A2.3}$$
is the conditional Hamiltonian, and the denominator in (A2.2) is the partition function for the potential $U$ in the volume $V$ with the boundary condition $\eta'(\widehat V)$:
$$\Xi(V|\eta'(\widehat V)) = \sum_{\eta(V)} \exp\big[-\beta H(\eta(V)|\eta'(\widehat V))\big]. \tag{A2.4}$$
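As a small illustration of the abstract partition function (A1.1) from Sect. 1 of this appendix, the sketch below enumerates all compatible collections by brute force. It is only a toy under stated assumptions: the three contours `a`, `b`, `c`, their weights, and the incompatibility relation are hypothetical and chosen so that the result can be verified by hand; real weights may be complex, while floats are used here for simplicity.

```python
from itertools import combinations


def partition_function(weights, incompatible):
    """Z(Lambda) of (A1.1): sum over compatible collections of products of weights.

    `weights` maps each contour to its statistical weight; `incompatible(s, t)`
    is the symmetric incompatibility relation for distinct contours. Reflexivity
    is enforced implicitly, since a collection never repeats a contour.
    """
    contours = list(weights)
    total = 0.0
    for size in range(len(contours) + 1):
        for collection in combinations(contours, size):
            if all(not incompatible(s, t)
                   for s, t in combinations(collection, 2)):
                product = 1.0  # the empty collection contributes weight 1
                for theta in collection:
                    product *= weights[theta]
                total += product
    return total


# Hypothetical example: `a` and `b` are incompatible, `c` is compatible with both.
weights = {"a": 0.1, "b": 0.2, "c": 0.3}
bad_pairs = {frozenset(("a", "b"))}


def incompatible(s, t):
    return frozenset((s, t)) in bad_pairs


z = partition_function(weights, incompatible)
print(z)
```

In this example the compatible collections are the empty one, the three singletons, $\{a,c\}$ and $\{b,c\}$, so $Z$ factorizes as $(1 + w(a) + w(b))(1 + w(c))$, which gives a direct check of the enumeration.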
3. Contour Representation of Partition Functions. We shall show that the partition function $\Xi(V|\eta'(\widehat V))$ can be represented in the form of an abstract partition function (A1.1). It has a polymer expansion (A1.4) if $\beta$ is sufficiently small. We shall describe the terms in (A1.1) in our specific context.

We first introduce a new potential which is equivalent to the original one (A2.1)–(A2.4). This means that the new potential defines the same Gibbs distributions over any finite volume under a fixed boundary condition. Let $b(Q)$ be the leftmost lower corner of $Q$. Take an integer $L \ge n_0$ and consider a rectangle $P$ of size $n(P) \times Ln(P)$ such that its leftmost lower corner $b(P) = (b_1(P), b_2(P))$ has $b_2(P) = rL$, where $r$ and $n(P)$ are integers. We say that the square $Q$ with $b(Q) = (b_1(Q), b_2(Q))$ is associated with the rectangle $P$ if $b_1(Q) = b_1(P)$, $L[b_2(Q)/L] = b_2(P)$, $l(Q) = n(P)$, and hence $Q \subseteq P$ (here $[\,\cdot\,]$ denotes the integer part). For any rectangle $P$ we define
$$U(\eta(P)) = \sum_Q U(\eta(Q)), \tag{A3.1}$$
where the sum is taken over all squares $Q$ associated with the rectangle $P$. Clearly,
$$0 \le U(\eta(P)) \le L\exp[-n(P)] \tag{A3.2}$$
and absorbing $L$ in $\beta$ one can assume that the potential is defined on rectangles $P$ (instead of squares $Q$) and satisfies
$$0 \le U(\eta(P)) \le \exp[-n(P)]. \tag{A3.3}$$
Set $\partial^I V = \{x \in V \mid \mathrm{dist}(x, \widehat V) = 1\}$ and $\partial^E V = \{x \in \widehat V \mid \mathrm{dist}(x, V) = 1\}$. We call $\partial^I V$ and $\partial^E V$ the internal and the external boundary of $V$, respectively. Observe that every finite volume $V$ can be uniquely partitioned into vertical segments $V_n$, with each segment being a connected component of the intersection of $V$ and some vertical line. We denote by $a(V_n)$ and $b(V_n)$ the points of $\partial^E V$ adjacent to $V_n$ from above and from below, respectively. The collection of such elements will be denoted by $a(V)$ and $b(V)$. In addition, we restrict our considerations to the volumes with
$$L[a(V_n)/L] = a(V_n) \quad\text{and}\quad L[(b(V_n) + 1)/L] - 1 = b(V_n). \tag{A3.4}$$
As we still allow arbitrary boundary conditions, it is sufficient to prove the uniqueness of the limiting Gibbs state when the limit is taken over volumes of the special shape described above.

3.1. Definition of contours. A precontour $\gamma = \{P_j\}$ is a family of rectangles which satisfy the following conditions:

(1) $\bar\gamma = \cup_j P_j$ is a connected subset of $\mathbb{Z}^2$;

(2) every $P_j$ contains a point which does not belong to any other rectangle of $\gamma$.

Consider a finite family of rectangles $\Gamma = \{P_i\}$ such that $\bar\Gamma = \cup_i P_i$ is a connected subset of $\mathbb{Z}^2$. We describe an algorithm which produces a unique minimal covering $\gamma(\Gamma)$ of $\bar\Gamma$. This family of rectangles $\gamma(\Gamma)$ will be a precontour by our definition. It is called the precontour of $\Gamma$.

(i) Fix the leftmost lower point in $\bar\Gamma$. Among all rectangles of $\Gamma$ that begin at this point choose the rectangle $P_{i_1}$ with the maximal linear size $n(P_{i_1})$ and include it in $\gamma(\Gamma)$.
(ii) Suppose that the rectangles $P_{i_1}, \ldots, P_{i_k}$ have already been selected for $\gamma(\Gamma)$ during the previous steps of the algorithm. Fix the leftmost lower point $x \in \bar\Gamma \setminus (\cup_{j=1}^k P_{i_j})$. Consider all rectangles of $\Gamma$ covering $x$. Among them choose the rectangles with the maximal right upper corner (here maximal means rightmost upper). From this family of rectangles include in $\gamma(\Gamma)$ the rectangle $P_{i_{k+1}}$ which has the maximal linear size.

(iii) Repeat step (ii) until $\bar\Gamma$ is totally covered, i.e. $\bar\Gamma = \cup_j P_{i_j}$.

We say that a rectangle $P$ is compatible with a precontour $\gamma = \{P_j\}$, and denote it by $P \prec \gamma$, if for $\Gamma = \{P_j\} \cup \{P\}$ one has $\gamma(\Gamma) = \gamma$. Obviously, any $P \prec \gamma$ belongs to $\bar\gamma$ and any $P$ embedded into some $P_j \in \gamma$ is compatible with $\gamma$. It is also clear that some of the rectangles $P \subseteq \bar\gamma$ can be incompatible with $\gamma$. A collection of precontours $\{\gamma_i\}$ is called compatible if for any $\gamma_{i_1}, \gamma_{i_2} \in \{\gamma_i\}$ either $\mathrm{dist}(\bar\gamma_{i_1}, \bar\gamma_{i_2}) > 1$ or $\bar\gamma_{i_1} \subseteq \bar\gamma_{i_2} \setminus \partial^I\bar\gamma_{i_2}$. For $V \subset \mathbb{Z}^2$, the inclusion $\Gamma \subset V$ means that every rectangle of $\Gamma$ is contained in $V$. Furthermore, $\Gamma \cap V \ne \emptyset$ means that $P \cap V \ne \emptyset$ for every $P \subset \Gamma$. A collection of precontours $\{\Gamma_i\} \cap V \ne \emptyset$ if $\Gamma_i \cap V \ne \emptyset$ for each $i$.

A contour is a triple $(\{\gamma_i\}, \{\tau_j\}, \eta)$, where

(i) either $\{\gamma_i\} \cap V \ne \emptyset$ is a compatible collection of precontours or $\{\Gamma_i\}$ is an empty set;

(ii) $\{\tau_j\} \subseteq V \setminus (\cup_i \partial^I\bar\gamma_i)$ is a collection of mutually disjoint finite vertical segments with $a(\tau_j), b(\tau_j) \in \cup_i(\partial^I\bar\gamma_i \cap V) \cup \partial^E V$;

(iii) $\eta$ is a configuration in $\cup_i(\partial^I\bar\gamma_i \cap V)$;

(iv) either $\{\gamma_i\}$ is non-empty and for every $\tau_j$ at least one of its ends ($a(\tau_j)$ or $b(\tau_j)$) belongs to $\cup_i(\partial^I\bar\gamma_i \cap V)$, or $\{\gamma_i\}$ is empty and $\{\tau_j\}$ consists of a single segment $\tau$ with $a(\tau), b(\tau) \in \partial^E V$;

(v) for every pair $\gamma_{i'}$ and $\gamma_{i''}$ there exists a sequence $\gamma_{i'} = \gamma_{i_1}, \tau_{j_1}, \ldots, \gamma_{i_s}, \tau_{j_s}, \gamma_{i_{s+1}} = \gamma_{i''}$ such that for any $1 \le k \le s$ either $a(\tau_{j_k}) \in \partial^I\bar\gamma_{i_k}$ and $b(\tau_{j_k}) \in \partial^I\bar\gamma_{i_{k+1}}$, or $b(\tau_{j_k}) \in \partial^I\bar\gamma_{i_k}$ and $a(\tau_{j_k}) \in \partial^I\bar\gamma_{i_{k+1}}$.
The contour clearly depends on V . In the special case when V = Z2 we obtain so called free contours. Given a contour = {γi }, {τj }, η , we set ¯ τ = ∪ j τj , ¯ γ = ∪i γ¯ i , ¯ = ¯τ ∪ ¯ γ, ˜ = ¯ τ ∪ (∪i ∂ I γ¯ i ). ˜ l1 ∩ ˜ l2 = ∅ and the A collection {l } is compatible if for any l1 and l2 one has total collection {γi (l1 ), γi (l2 )} is a compatible collection of precontours. A contour belongs to the volume V if the corresponding precontours γi ⊆ V and ¯ ⊆ V . A contour has non empty intersection with the volume V if {γi } ∩ V 6= ∅ ¯τ ⊆V. and 3.2. Definition of statistical weight for contours. We partition the finite volume V into vertical segments Vn and denote the distance between a(Vn ) and b(Vn ) by ||Vn || = |Vn | + 1. The number of configurations in V with the boundary condition η 0 (Vb ) can be calculated as Y 0 0 N (V |η 0 (∂ E V )) = N Vn |ηa(V , ηb(V , (A3.5) n) n) n
where $N(V_n | \eta'_{a(V_n)}, \eta'_{b(V_n)})$ is the entry of the matrix $A^{\|V_n\|}$ indexed by $\eta'_{a(V_n)}, \eta'_{b(V_n)}$. By the Perron–Frobenius theorem both the matrix $A$ and its adjoint $A^*$ have a unique maximal eigenvalue $\lambda > 1$. Let $e$ and $e^*$ be the corresponding eigenvectors with positive components $e_\eta$ and $e^*_\eta$. We normalize $e$ and $e^*$ in such a way that $\sum_\eta e_\eta e^*_\eta = 1$. Using the Jordan normal form for the matrix $A$ one can show that
$$N\big(V_n | \eta'_{a(V_n)}, \eta'_{b(V_n)}\big) = e_{\eta'_{a(V_n)}} e^*_{\eta'_{b(V_n)}} \lambda^{\|V_n\|} \Big(1 + F\big(V_n | \eta'_{a(V_n)}, \eta'_{b(V_n)}\big)\Big), \tag{A3.6}$$
where for some $0 < \rho(A) < 1$ and $\nu(A) > 0$,
$$F\big(V_n | \eta'_{a(V_n)}, \eta'_{b(V_n)}\big) \le \nu(A)\rho(A)^{\|V_n\|}. \tag{A3.7}$$
We define
$$L(V) = \lambda^{-\sum_n \|V_n\|}, \qquad E(\eta(\partial^E V)) = \Big(\prod_n e_{\eta_{a(V_n)}}\Big)^{-1} \Big(\prod_n e^*_{\eta_{b(V_n)}}\Big)^{-1}, \tag{A3.8}$$
$$E^*(\eta(\partial^E V)) = \Big(\prod_n e^*_{\eta_{a(V_n)}}\Big)^{-1} \Big(\prod_n e_{\eta_{b(V_n)}}\Big)^{-1}. \tag{A3.9}$$
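The asymptotics (A3.6)–(A3.7) can be observed on a concrete transfer matrix. The sketch below is an illustration only, under assumptions not taken from the text: it uses the golden-mean matrix $A = \begin{pmatrix}1 & 1\\ 1 & 0\end{pmatrix}$, obtains $\lambda$, $e$, and $e^*$ by plain power iteration (applied to $A$ and to its transpose), and interprets the first index of $A^n$ as the top symbol and the second as the bottom symbol. All function names are hypothetical.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_power(a, n):
    """A^n for n >= 1 by repeated multiplication (exact for integer entries)."""
    result = [row[:] for row in a]
    for _ in range(n - 1):
        result = mat_mul(result, a)
    return result


def perron(a, iterations=200):
    """Power iteration: maximal eigenvalue and a positive eigenvector (max entry 1)."""
    v = [1.0] * len(a)
    lam = 1.0
    for _ in range(iterations):
        w = [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]
        lam = max(w)
        v = [x / lam for x in w]
    return lam, v


def relative_error(a, n):
    """Matrix of F-values: (A^n)_{ab} / (e_a e*_b lambda^n) - 1, as in (A3.6)."""
    lam, e = perron(a)
    transpose = [list(col) for col in zip(*a)]
    _, e_star = perron(transpose)
    scale = sum(x * y for x, y in zip(e, e_star))
    e_star = [y / scale for y in e_star]  # enforce sum_eta e_eta e*_eta = 1
    power = mat_power(a, n)
    size = len(a)
    return [[power[i][j] / (e[i] * e_star[j] * lam ** n) - 1.0
             for j in range(size)] for i in range(size)]


golden_mean = [[1, 1], [1, 0]]
errors_short = relative_error(golden_mean, 5)
errors_long = relative_error(golden_mean, 10)
print(errors_short)
print(errors_long)
```

For a transitive matrix the printed corrections should shrink geometrically with the segment length, at a rate governed by the ratio of the second largest eigenvalue modulus to $\lambda$, which plays the role of $\rho(A)$ in (A3.7).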
Similarly, we define E(η(∂ I V )) and E ∗ (η(∂ I V )) by using the top and bottom elements of ¯ ), Vn instead of a(Vn ) and b(Vn ). Given a precontour γ and a fixed configuration η(∂ I γ∩V we define a precontour partition function by −1 ¯ ∩ V E ∗ η(∂ I γ¯ ∩ V ) E η 0 (∂ E V ∩ γ) ¯ Ξ γ, η(∂ I γ¯ ∩ V )|η 0 (Vb ) = L (γ¯ \ ∂ I γ) ×
Y
X
η (γ\∂ ¯ I γ)∩V ¯
Y [U η(P ∩ V ) + η 0 (P ∩ Vb ) − 1] U η(P ∩ V ) + η 0 (P ∩ Vb ) .
P ∈γ
P ≺γ
(A3.10) Set Ξ ∗ (V |η 0 (∂ E V )) = L(V )E(η 0 (∂ E V ))
Y
X
1 + U β, η(P )
.
η(V ) P : P ⊆V
The statistical weight of precontour is defined by I 0 b Ξ γ, η(∂ γ ¯ ∩ V )|η ( V ) . W γ, η(∂ γ¯ ∩ V )|η (Vb ) = ∗ Iγ Ξ (γ¯ ∩ V ) \ ∂ I γ|η(∂ ¯ ¯ ∩ V ) + η 0 (∂ E V ∩ γ) ¯ (A3.11) For any contour = {γi }, {τj }, η , the statistical weight is 0
I
W (|η 0 (Vb )) =
Y
Y
W γi , η(∂ I γ¯ i ∩ V )|η 0 (Vb )
i
where η 00 = η 0 ∂ E V \ ∪i γ¯ i
j
+
P i
η ∂ I γ¯ i ∩ V .
00 00 F (τj |ηa(τ , ηb(τ ), j) j)
(A3.12)
Polymer Expansion Theorem (see [JM]). Let U (η(P )) be a potential which is defined on rectangles of size n(P )×Ln(P ). Assume that U ∈ P(q, ) satisfies (A3.3). Then there exists a constant 0 > 0 such that for any 0 < ≤ 0 , any finite volume V satisfying (A3.4), and arbitrary boundary condition η 0 (Vb ) the following equation holds: X Y W (j |η 0 (Vb )), (A4.1) L(V )E(η 0 (∂ E V ))Ξ(V |η 0 (Vb )) = {j }∩V 6=∅ j
where the partition function Ξ(V |η 0 (Vb )) on the left-hand side is defined by (A2.1)– (A2.4) with U (η(P )) replacing U (η(Q)) and the right-hand side is the abstract partition function over contours defined in the previous sections. Thus, the partition function has the polymer expansion X w(℘)), (A4.2) L(V )E(η 0 (∂ E V ))Ξ(V |η 0 (Vb )) = exp ℘∩36=∅
where the statistical weight w(℘) is defined in (A1.5) i ¯ i , a potential U ∈ P(q, ) satisfying (A3.3), For a polymer ℘ = [α ¯ = ∪i i ], ℘ and every sufficiently small the conditional Gibbs distributions (see (A2.2)) can be computed by the following formula
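$$\mu_{V,\eta'}(\xi(B)) = N(B) \exp\Big[ \sum_{P\subseteq B} U(\eta(P)) + \sum_{\wp :\, \wp\cap V\setminus B\ne\emptyset} w\big(\wp | \xi(B) + \eta'(\widehat V)\big) - \sum_{\wp :\, \wp\cap V\ne\emptyset} w\big(\wp | \eta'(\widehat V)\big) \Big], \tag{A4.3}$$
where $P$ is a rectangle, $B \subset V \subset P$ are finite volumes ($V$ satisfies (A3.4)) and
$$N(B) = \frac{L(B)}{E^*(\xi(\partial^I B))} \tag{A4.4}$$
is the normalizing factor (recall that $L(B)$ and $E^*(\xi(\partial^I B))$ are defined by (A3.8)).

One can show that the infinite sums on the right-hand side of the above formula converge uniformly for all $B$ in $\mathbb{Z}^2$ and obtain an explicit formula for the Gibbs state in terms of the potential $U$, independent of the boundary condition $\eta'$:
$$\mu(\xi(B)) = N(B) \exp\Big[ \sum_{P\subseteq B} U(\xi(P)) + \sum_{\substack{\wp :\, \mathrm{dist}(\bar\wp, B)\le 1 \\ \mathrm{dist}(\bar\wp, \widehat B)=0}} w(\wp | \xi(B)) - \sum_{\wp :\, \mathrm{dist}(\bar\wp, B)\le 1} w(\wp) \Big]. \tag{A4.5}$$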
Acknowledgement. The authors wish to thank Jean Bricmont and Antti Kupiainen for helpful discussions. Both authors were partially supported by the National Science Foundation grant DMS9403723. M. J. was also partially supported by grants from the Army Research Office and the National Institute of Standards and Technology. Ya. P. was also partially supported by the National Science Foundation grant DMS9704564. Ya. P. wishes to thank IHES (Bures-sur-Yvette, France), ESI (Vienna, Austria), and ETH (Zurich, Switzerland) for hospitality and for providing excellent conditions to work on the paper. The authors also thank IMA (Minneapolis) for providing a great opportunity to get together and complete the paper.
References

[Bo] Bowen, R.: Equilibrium States and the Ergodic Theory of Anosov Diffeomorphisms. Lecture Notes in Mathematics 470, Berlin: Springer-Verlag, 1975
[BK1] Bricmont, J., Kupiainen, A.: Coupled Analytic Maps. Nonlinearity 8, 379–396 (1995)
[BK2] Bricmont, J., Kupiainen, A.: High Temperature Expansions and Dynamical Systems. Commun. Math. Phys. 178, 703–732 (1996)
[BK3] Bricmont, J., Kupiainen, A.: Infinite dimensional SRB-measures. mp arc Preprint, 1995
[Bu] Bunimovich, L.: Coupled map lattices: One step forward and two steps back. Physica D 86, 248–255 (1995)
[BuSi] Bunimovich, L., Sinai, Ya.G.: Space-time Chaos in Coupled Map Lattices. Nonlinearity 1, 491–516 (1988)
[BuSt] Burton, R., Steif, J.: Non-uniqueness of Measures of Maximal Entropy for Subshifts of Finite Type. Ergod. Theor. & Dyn. Systems 14, 2, 213–235 (1994)
[D1] Dobrushin, R.: The Problem of Uniqueness of a Gibbsian Random Field and the Problem of Phase Transitions. Funct. Anal. Appl. 2, 302–312 (1968)
[D2] Dobrushin, R.: Estimates of Semiinvariants for the Ising Model at Low Temperatures. Preprint ESI 125, 1994
[DM1] Dobrushin, R., Martirosian, M.: Nonfinite Perturbations of the Random Gibbs Fields. Theor. Math. Phys. 74, 10–20 (1988)
[DM2] Dobrushin, R., Martirosian, M.: Possibility of High-temperature Phase Transitions Due to the Many-particle Nature of the Potential. Theor. Math. Phys. 74, 443–448 (1988)
[Geo] Georgii, H.: Gibbs Measures and Phase Transitions. Berlin: Walter de Gruyter, 1988
[Gro] Gross, L.: Decay of correlations in classical lattice models at high temperature. Commun. Math. Phys. 68, 9–27 (1979)
[J1] Jiang, M.: Equilibrium states for lattice models of hyperbolic type. Nonlinearity 8, 5, 631–659 (1995)
[J2] Jiang, M.: Ergodic Properties of Coupled Map Lattices of hyperbolic type. Penn State University Dissertation, 1995
[J3] Jiang, M.: Entropy Formula for Coupled Map Lattices. In preparation, 1997
[JLP] Jiang, M., de la Llave, R., Pesin, Ya.: On the integrability of intermediate distributions for Anosov diffeomorphisms. Ergod. Th. & Dynam. Sys. 15, 2, 317–331 (1995)
[JM] Jiang, M., Mazel, A.: Uniqueness of Gibbs states and Exponential Decay of Correlation for Some Lattice Models. J. Stat. Phys. 82, 3–4, 797–821 (1995)
[Ka] Kaneko, K., ed.: Theory and Applications of Coupled Map Lattices. New York: Wiley, 1993
[KH] Katok, A., Hasselblatt, B.: Introduction to the Modern Theory of Dynamical Systems. Cambridge: Cambridge University Press, 1994
[KK] Keller, G., Künzle, M.: Transfer Operators for Coupled Map Lattices. Ergod. Theor. & Dyn. Systems 12, 297–318 (1992)
[KP] Kotecký, R., Preiss, D.: Cluster Expansion for Abstract Polymer Models. Commun. Math. Phys. 103, 491–498 (1986)
[Lang] Lang, S.: Differential Manifolds. New York: Springer-Verlag, 1985
[Ma] Mañé, R.: Ergodic Theory and Differential Dynamics. New York: Springer-Verlag, 1987
[MM] Malyshev, V., Minlos, R.: Gibbs Random Fields. Dordrecht: Kluwer Academic Publisher, 1991
[MSu] Mazel, A., Suhov, Yu.: Ground States of a Boson Quantum Lattice Model. Sinai's Moscow Seminar on Dynamical Systems 185–226, Amer. Math. Soc. Transl. 2, 171 (1996)
[PS] Pesin, Ya., Sinai, Ya.: Space-time chaos in chains of weakly interacting hyperbolic mappings. Adv. in Soviet Math. 3, 165–198 (1991)
[Ru] Ruelle, D.: Thermodynamic Formalism. Encyclopedia of Mathematics and Its Applications, New York: Addison Wesley, 1978
[Se] Seiler, E.: Gauge Theories as a Problem of Constructive Quantum Field Theory and Statistical Mechanics. Lect. Notes in Physics 159, New York: Springer-Verlag, 1982
[Sh] Shub, M.: Global Stability of Dynamical Systems. New York: Springer-Verlag, 1987
[Sim] Simon, B.: The Statistical Mechanics of Lattice Gases. V. 1, Princeton, NJ: Princeton University Press, 1993
[Sinai] Sinai, Ya.G.: Topics in Ergodic Theory. Princeton, NJ: Princeton University Press, 1994
[V1] Volevich, D.L.: The Sinai–Bowen–Ruelle Measure for a Multidimensional Lattice of Interacting Hyperbolic Mappings. Russ. Acad. Dokl. Math. 47, 117–121 (1993)
[V2] Volevich, D.L.: Construction of an Analogue of Bowen–Ruelle–Sinai Measure for a Multidimensional Lattice of Interacting Hyperbolic Mappings. Russ. Acad. Math. Sbornik 79, 347–363 (1994)

Communicated by Ya. G. Sinai
Commun. Math. Phys. 193, 713 – 736 (1998)
Communications in
Mathematical Physics © Springer-Verlag 1998
Nonlinear Stability for Saddle Solutions of Ideal Flows and Symmetry Breaking
G. Wolansky¹, M. Ghil²
¹ Department of Mathematics, Technion-Israel Institute of Technology, Haifa, 32000, Israel. E-mail: [email protected]
² Department of Atmospheric Sciences and Institute of Geophysics and Planetary Physics, University of California, Los Angeles, CA 90095-1565, USA
Received: 25 February 1997 / Accepted: 24 September 1997
Abstract: In this paper we generalize Arnol’d’s method for nonlinear stability of the two-dimensional, incompressible Euler equation’s stationary solutions. The method is extended so as to apply for saddle solutions, which include many cases of practical interest, in particular the case of solutions which do not share the symmetry of the domain. The extension is made possible by our technique of supporting functionals and the use of additional flow invariants.
1. Introduction and Main Result
1.1. Background. The two-dimensional (2-D) incompressible Euler equation is given in vorticity form by
$$\partial_t \omega + \partial_\psi \omega = 0, \qquad (1.1)$$
where $\psi$ is the stream function, $\omega \equiv \Delta\psi \equiv (\partial_x^2 + \partial_y^2)\psi$ the vorticity, and $\partial_\psi$ stands for the Jacobian derivative along streamlines, $-\psi_y\partial_x + \psi_x\partial_y$. Equation (1.1) is considered in a bounded domain $\Omega \subset \mathbb{R}^2$ with a smooth boundary $\partial\Omega \equiv \Lambda$ composed of a finite number of connected components $\Lambda_i$. The boundary conditions associated with (1.1) are given by
$$\psi|_{\Lambda_i} \equiv \Psi_i, \qquad (1.2)$$
with $\Psi_i$ being undetermined constants that may depend on time, while
$$\oint_{\Lambda_i} \nabla\psi\cdot n \equiv C_i \qquad (1.3)$$
for any of the boundary components; here $n$ stands for the normal along $\Lambda_i$ and the circulations $C_i$ are prescribed constants. The boundary conditions (1.2), (1.3) correspond
to $\Lambda$ being composed of streamlines and to Kelvin’s circulation theorem, respectively. Another boundary condition for which (1.1) is well posed corresponds to prescribing constant fluxes $\Psi_i^*$,
$$\psi|_{\Lambda_i} = \Psi_i^*, \qquad (1.4)$$
while the circulations $C_i$ in (1.3) are left unprescribed.
A stationary solution of (1.1) is given by a stream function $\psi^0$ subject to (1.2, 1.3) which satisfies
$$\partial_{\psi^0}\omega_0 = 0, \qquad (1.5)$$
where $\omega_0 \equiv \Delta\psi^0$ is the associated vorticity. The object of this paper is to study conditions for nonlinear stability of such stationary solutions, using Lyapunov’s direct method. This method was first used to prove genuine, nonlinear stability for a class of stationary solutions of the Euler equation (1.1) by Arnol’d in the 1960s [Ar1, Ar2]. Arnol’d considered the flow-invariant, energy-Casimir [Cas] functional
$$E(\psi) = \frac{1}{2}\int_\Omega |\nabla\psi|^2\,dx_1dx_2 + \int_\Omega F(\Delta\psi)\,dx_1dx_2 \qquad (1.6)$$
for a given, twice-differentiable function $F$. The critical points $\psi^0$ of $E$ are stationary solutions, given by
$$-\psi^0 + F'(\omega_0) = 0. \qquad (1.7)$$
It then follows that $\psi^0$ is a nonlinearly stable solution of the Euler equation (1.1) if $\psi^0$ is a local extremizer of $E$ in an appropriately defined function space.
Arnol’d’s observation was that $\psi^0$ is a global minimizer if $F$ is a convex function [Ar1], i.e. if $F'' \ge 0$, and hence
$$E(\psi) - E(\psi^0) \ge \frac{1}{2}\int_\Omega |\nabla(\psi - \psi^0)|^2\,dx_1dx_2. \qquad (1.8)$$
Alternatively, if $F'' < -c < 0$ uniformly for sufficiently large $c \ge c^* > 0$,
$$E(\psi) - E(\psi^0) \le \frac{1}{2}\int_\Omega \left[|\nabla(\psi - \psi^0)|^2 - c\,|\Delta(\psi - \psi^0)|^2\right]dx_1dx_2 \qquad (1.9)$$
and the quadratic form on the right-hand side (RHS) of (1.9) is negative definite for $c^*$ large enough. In the latter case, $\psi^0$ is a global, strict maximizer of $E$ and likewise nonlinearly stable.
A large number of publications concerning this so-called energy-Casimir (EC) method have appeared during the 1980s and 1990s. Some of these deal with the generalization of Arnol’d’s ideas to other, more complicated but related equations, such as the quasi-geostrophic equations for planetary-scale rotating flows [Bl, SG2] or the equations governing multi-layer or stratified incompressible flows [Ab], compressible flows, Vlasov-Poisson ([BR]), Vlasov-Einstein ([KM]) and magneto-hydrodynamic (MHD) problems; see [Ho] for various applications and references, and [L] for an application to free-boundary flows. All these equations, like (1.1), admit a Hamilton-Poisson structure and a Casimir functional, generalizing the vorticity integral that appears in the second term on the RHS of (1.6).
There are, however, serious limitations to the applicability of the EC method for proving nonlinear hydrodynamic stability in nontrivial cases. The main difficulty is the need for uniform estimates on the EC functional $E$ to guarantee global, strict convexity
(1.8) or strict concavity (1.9). It was pointed out by Arnol’d himself (see [Ar2], cf. also [Ho]) that the condition of the second variation of $E$ being definite is only sufficient for the stability of the linearized Euler equation. In particular, nonlinear stability of the solution (1.7) is not guaranteed by the EC method even if the functional $E$ is locally concave at $\psi^0$, i.e. if
$$\left(\delta^2_{\psi^0}E\,\phi, \phi\right) = \frac{1}{2}\int_\Omega \left[|\nabla\phi|^2 + F''(\omega_0)|\Delta\phi|^2\right]dx_1dx_2 < -\sigma\int_\Omega |\nabla\phi|^2 \qquad (1.10)$$
holds for some $\sigma > 0$ and any compatible test function $\phi$. The reason for this difficulty is that the EC functional $E$ is not twice Fréchet differentiable on any reasonable Banach space, and the condition (1.10) is not sufficient for $\psi^0$ to be a local maximizer or, for that matter, even for it to be an isolated critical point in the vorticity norm. Examples of critical points of functionals that admit positive-definite second variations but are not local minima appear in [BM, BKM]. It is true, however, that the left-hand side (LHS) of (1.10) is an invariant along solutions of the linearized Euler equation, so (1.10) does guarantee linearized stability [WG1].
In [WG2] we have shown, using the method of supporting functionals, that the local concavity condition (1.10) is, indeed, sufficient for genuine, nonlinear stability of $\psi^0$ provided $F'' < -\varepsilon$ for arbitrarily small $\varepsilon > 0$. This condition is significantly weaker than $F'' < -c$ for $c > 0$ large enough, as required in [Ar2].
The concavity of $F$ and ensuing monotonicity of $F'$ enable us to invert (1.7) and obtain
$$\Delta\psi^0 = g(\psi^0), \qquad (1.11)$$
together with appropriately defined boundary conditions; here $g$ is the inverse of $F'$. Condition (1.10) can then be verified by studying the generalized eigenvalue problem for the linear operator
$$\Delta - \mu\, g'(\psi^0) = 0. \qquad (1.12)$$
Since $g$, as the inverse function of $F'$, is monotone decreasing by assumption, it follows that $g' < 0$, and there exists a sequence of generalized eigenvalues
$$0 \le \mu_0 \le \mu_1 \le \mu_2 \le \dots \le \mu_k \le \dots. \qquad (1.13)$$
It then follows that the local concavity condition (1.10) holds if and only if (iff)
$$\mu_0 > 1. \qquad (1.14)$$
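As a rough illustration of how a criterion like (1.14) could be checked numerically, the sketch below treats a one-dimensional analogue of (1.12): $-\phi'' = \mu\, w(x)\,\phi$ on an interval $(0, L)$ with Dirichlet ends, where $w$ stands in for the positive weight $-g'(\psi^0)$. The interval, the constant weights and the helper names `reaches_zero` and `ground_eigenvalue` are assumptions made for this example only; they do not come from the paper. The ground eigenvalue is located by shooting: by Sturm's oscillation theory, the solution with $\phi(0)=0$ first vanishes on $(0, L]$ exactly when $\mu \ge \mu_0$.

```python
import math

def reaches_zero(mu, weight, length, steps=2000):
    """Integrate phi'' = -mu*w(x)*phi with phi(0)=0, phi'(0)=1.

    Returns True if phi becomes <= 0 at some grid point in (0, length].
    """
    h = length / steps
    phi, dphi = 0.0, 1.0
    for i in range(steps):
        x = i * h
        a0 = -mu * weight(x) * phi                    # phi'' at the current point
        phi_new = phi + h * dphi + 0.5 * h * h * a0   # velocity-Verlet position update
        a1 = -mu * weight(x + h) * phi_new            # phi'' at the next point
        dphi += 0.5 * h * (a0 + a1)
        phi = phi_new
        if phi <= 0.0:
            return True
    return False

def ground_eigenvalue(weight, length, tol=1e-8):
    """Smallest mu for which the Dirichlet problem has a nontrivial solution."""
    lo, hi = 0.0, 1.0
    while not reaches_zero(hi, weight, length):   # bracket mu_0 from above
        hi *= 2.0
    while hi - lo > tol:                          # bisection on the monotone predicate
        mid = 0.5 * (lo + hi)
        if reaches_zero(mid, weight, length):
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Hypothetical constant weight w = k^2 on (0, pi): exact ground eigenvalue is 1/k^2.
mu0 = ground_eigenvalue(lambda x: 0.8 ** 2, math.pi)
criterion_holds = mu0 > 1.0   # analogue of (1.14)
```

For a constant weight $w = k^2$ the exact value is $\mu_0 = 1/k^2$, so the analogue of (1.14) holds for $k < 1$ and fails for $k > 1$; a non-constant `weight` can be passed in the same way.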
Hence (1.14) is a sufficient condition for nonlinear stability of $\psi^0$ (cf. [WG2]).
Still, in many interesting cases, $\psi^0$ is a saddle solution and even the local condition (1.10) or, equivalently, (1.14) is not satisfied. Consider, for example, a periodic channel given by $0 < y < 1$, $x = x \pmod L$ [SG2, WG1]. Then $\phi \equiv \partial_x\psi^0$ annihilates the quadratic form on the LHS of (1.10), as can easily be seen by differentiating $E(\psi^0(\cdot + \gamma, \cdot))$ twice with respect to $\gamma$. Thus, if $\psi^0$ is not a function of $y$ alone, i.e. if $\psi^0$ is not symmetric under the shift action $x \to x + \gamma$, $y \to y$, then $\phi \not\equiv 0$ and (1.10) is not satisfied. This observation was made by Andrews [An] and generalized in Appendix A of [SG1] and by [CM] to other domains and their associated symmetry groups. In all these cases, $\phi$ is an eigenfunction of (1.12) corresponding to the (generalized) eigenvalue $\mu^* = 1$ and $\delta^2_{\psi^0}E$ is degenerate. Moreover, for the periodic channel, since $\phi \not\equiv 0$ must change sign in the domain (by periodicity of $\psi^0$ in the $x$-direction), it follows by
a general theorem (cf. Sect. 5) that it cannot be the ground eigenfunction of (1.12), and thus, necessarily, $\mu_0 < \mu^* = 1$. In particular, the LHS of (1.10) has then to take a positive value for the ground state $\phi_0$ and $\delta^2_{\psi^0}E$ is not only degenerate but also indefinite.
Chern and Marsden [CM] suggested considering the stability of the family of stationary (or stationarily rotating) solutions obtained by the actions of the domain’s symmetry group on $\omega_0$, rather than the stability of an isolated stationary solution (see also [CS]). The argument uses the additional invariant of the flow due to the symmetry group. This approach is related to phase-space reduction by the momentum map (e.g., [Ma, SLM] and references therein), and has become known as the energy-momentum method. It was successfully applied to various problems in rigid-body motion [SPM90], elasticity [SPM91], and others. However, it seems that there are only a few results related to the particular problem of stability of equilibria (1.7) of the incompressible Euler equation ([L, R]).
1.2. Outline of main results. In this paper we extend the applicability of the EC method to the case of critical points $\psi^0$ that are saddles – rather than maxima – of the functional $E$. In Sects. 2–4 we shall assume that there exists at least one generalized eigenvalue of (1.12) between 0 and 1,
$$0 \le \mu_0 < \mu_1 < \dots < \mu_n < 1 < \mu_{n+1} \le \dots, \qquad (1.15)$$
i.e., that $\psi^0$ is a saddle point of the EC functional $E$, and that all eigenvalues in the $[0,1)$ interval have multiplicity one. In Sect. 5 we extend the results to the case where $\psi^0$ is not equivariant with respect to the action of some Lie group of isometries. Here, necessarily, $\mu = 1$ is a generalized eigenvalue.
Let $W$ be the linear space of functions which are expressed as single-valued functions of $\psi^0$, i.e. $\psi \in W$ iff $\exists h : \mathbb{R}^1 \to \mathbb{R}^1$, where $\psi = h(\psi^0)$. This is essentially the tangent space to the (nonlinear) manifold induced by several Casimir functionals. Similarly, let $\hat W \supset W$ be the space of functions that depend functionally on $\psi^0$, i.e. $\psi \in \hat W$ iff $\partial_{\psi^0}\psi = 0$.
Let $W_m$ ($\hat W_m$) be some $m$-dimensional subspace of $W$ ($\hat W$) and $\xi_1, \dots, \xi_m$ be a basis of $W_m$ ($\hat W_m$). Define
$$\alpha_{i,j}(\mu) \equiv \langle R^{(\mu)}\xi_i, \xi_j\rangle, \qquad (1.16)$$
where $R^{(\mu)}$ is the resolvent corresponding to (1.12) and $\langle\cdot,\cdot\rangle$ stands for the $-g'$ weighted $L^2$ inner product. Set $\mathbf{A}(\mu)$ to be the $m \times m$ matrix-valued function whose entries are given by (1.16). Define the characteristic polynomial of $\mathbf{A}(\mu)$,
$$\mathcal{A}(\mu) \equiv \mathrm{Det}\left[\mathbf{A}(\mu) - I\right], \qquad (1.17)$$
where $I$ is the $m \times m$ unit matrix. Then one can state our
Main Result. Assume the solution $\psi^0$ is a saddle point of $E$, i.e. (1.15) holds.
A) Assume that a subspace $W_m$ can be found for which $\mathcal{A}(\mu) \ne 0$ for $0 \le \mu < 1$, and $\mathcal{A}$ admits poles at $\mu = \mu_i$ for $0 \le i \le n$. Then $\psi^0$ is nonlinearly stable.
B) Under the same conditions as in (A), but upon choosing $\hat W_m \subset \hat W$, we obtain linearized stability of $\psi^0$.
Part (B) of the main result was actually derived in [WG1] for the special case of channel geometry, and can easily be extended to the general case. In this paper we concentrate on the harder proof of Part (A).
In Sect. 2 we review the concept of supporting functional [WG1, WG2]. This method is generalized, in Sect. 3, to deal with hyperbolic, stationary solutions ψ 0 , that are saddles of the EC functional. In Sect. 4 we study the spectral properties of the operator associated with the supporting functional and derive the Main Result. In Sect. 5 we deal with the case of symmetry breaking, when there exists a Lie group of isometries admitted by the domain while ψ 0 is not equivariant; in this case the saddle is not hyperbolic, i.e. ∃µj = 1. To handle this case we extend the family of stationary solutions into stationarily rotating solutions (cf. also [Ar3], Appendix 2) and prove an extension of the Main Result for nonhyperbolic saddles. In Sect. 6 we apply the Main Result to the laterally periodic channel of [WG1], extending the linear stability result obtained there to genuine, nonlinear stability.
2. Method of Supporting Functionals
Let $\psi^0$ be a stationary solution of (1.1) satisfying (1.7) and $\omega_0$ the corresponding vorticity. We denote by $J$ the compact interval
$$J \equiv \mathrm{Range}\left[\omega_0(\Omega)\right] \subset \mathbb{R}^1, \qquad (2.1)$$
where $\mathbb{R}^1$ is the real line. In order to generalize Arnol’d’s second stability theorem [Ar2], we assume, as in [WG2], that $F \in C^2(J)$ and that there exists $\varepsilon' > 0$ for which
$$\sup_{s\in J}\frac{d^2F}{ds^2}(s) < -\varepsilon'; \qquad \inf_{s\in J}\frac{d^2F}{ds^2}(s) > -\frac{1}{\varepsilon'} \qquad (2.2)$$
holds. Thus, (1.7) can be inverted to yield
$$\Delta\psi^0 = g(\psi^0), \qquad (2.3)$$
where $\Delta\psi^0 \equiv \omega_0$ and $g \equiv (F')^{-1}$.
We now define function spaces which accommodate the class of initial data $\psi$ under consideration. For the sake of simplicity we assume, now and hereafter, that the domain $\Omega$ is simply connected. In this case both boundary conditions (1.2, 1.3) reduce to $\psi|_{\partial\Omega} = \mathrm{const}$. Since an additive constant for $\psi$ is irrelevant for the Euler equation (1.1), we may set the homogeneous Dirichlet boundary conditions
$$\psi|_{\partial\Omega} = 0 \qquad (2.4)$$
for all initial data. This suggests the choice of Hilbert spaces
$$X = H_0^1(\Omega) \cap H^2(\Omega), \qquad Y \equiv H_0^1(\Omega), \qquad (2.5)$$
with the respective norms
$$\|\psi\|_X^2 \equiv \int_\Omega |\Delta\psi|^2\,dx_1dx_2, \qquad \|\psi\|_Y^2 \equiv \int_\Omega |\nabla\psi|^2\,dx_1dx_2. \qquad (2.6)$$
The following definition utilizes the norms associated with both spaces:
Definition 2.1. A stationary solution $\psi^0$ of (1.1) is called stable provided $\forall\varepsilon\ \exists\delta$ such that
$$\|\psi(\cdot, 0) - \psi^0\|_X < \delta \implies \sup_{t>0}\|\psi(\cdot, t) - \psi^0\|_Y < \varepsilon,$$
where $\psi(\cdot, 0) \in X$ is an initial state for which the solution $\psi(\cdot, t) \in Y$ exists $\forall t > 0$.
It was observed by M. Mu that stability in the sense of Definition 2.1 implies stability in a stronger sense:
Proposition 2.1 ([Mu]). If $\psi^0 = F'(\omega_0)$ and $F'' < -c$, ($c > 0$), then
$$\|\psi(\cdot, t) - \psi^0\|_X^2 \le \frac{1}{c}\,\|\psi(\cdot, t) - \psi^0\|_Y^2 + \|\psi(\cdot, 0) - \psi^0\|_X^2.$$
Proof. Consider the functional
$$E(t) = \frac{1}{2}\int_\Omega \left|\nabla\psi(\cdot, t) - \nabla\psi^0\right|^2 dx_1dx_2 + \int_\Omega \left[F(\omega) - F(\omega_0) - F'(\omega_0)(\omega - \omega_0)\right]dx_1dx_2.$$
It follows by $F'(\omega_0) = \psi^0$ that $E$ is constant in time. We now use $F'' < -c$ to obtain
$$F(\omega) - F(\omega_0) - F'(\omega_0)(\omega - \omega_0) \ge \frac{c}{2}\,|\omega - \omega_0|^2. \qquad (2.7)$$
The proof follows directly from (2.7) and the invariance of $E$.
We address the stability problem by presenting the idea of supporting functionals. The following definition is an extension of Definition 2 given in [WG2].
Definition 2.2. Let $S \subseteq X$ be some subset of $X$ (not necessarily a subspace) and $\psi^0 \in X$. A functional $D$ defined on $Y$ is said to be (locally) $\{E, S\}$-supporting if $D$ is twice Fréchet differentiable on (an open neighborhood $V \ni \psi^0$ of) $Y$ and satisfies
$$D(\psi) \ge E(\psi) \qquad \forall\psi \in X \cap S \quad (\psi \in X \cap S \cap V) \qquad (2.8)$$
and
$$D(\psi^0) = E(\psi^0). \qquad (2.9)$$
Granting the existence of $D$ we obtain a stability result of the form:
Lemma 2.1. Suppose $E$ is a flow-invariant, continuous functional on the space $X$ and $D$ is $\{E, X\}$-supporting. Assume that the second Fréchet derivative $\delta^2_{\psi^0}D$ of $D$ at the stationary solution $\psi^0$ is negative definite. Then $\psi^0$ is stable in the sense of Definition 2.1 and hence is genuinely nonlinearly stable by Proposition 2.1.
The EC functional (1.6) is not uniquely determined by the stationary solution $\psi^0$, since one has some freedom in the definition of the function $F$ appearing in the Casimir term. The function $F$ can be defined arbitrarily outside a compact interval containing $J$ (cf. (2.1)). We may, therefore, extend the definition of $F$ so that (2.2) is satisfied on the whole line $\mathbb{R}^1$ and $F'' \equiv d^2F/ds^2$ is uniformly bounded by some negative constant. Moreover, we may assume that there exists $q > 0$ so that
$$\lim_{|s|\to\infty}\frac{F(s)}{s^2} = -\frac{q}{2}. \qquad (2.10)$$
Lemma 2.2. Let $\psi^0$ be a stationary solution given by (1.7) (or, equivalently, (2.3)) and $F$ satisfy (2.10). Then
i) $E$ is a well-defined, continuous functional on $X$.
ii) The functional
$$D(\psi) \equiv -\frac{1}{2}\int_\Omega |\nabla\psi|^2\,dx_1dx_2 + \int_\Omega G(-\psi)\,dx_1dx_2 \qquad (2.11)$$
is $\{E, X\}$-supporting, defined on the space $Y$; here $G$ is the Legendre transform of the convex function $-F$:
$$G(\lambda) = \sup_{\mu\in\mathbb{R}}\left[\mu\lambda + F(\mu)\right]. \qquad (2.12)$$
In particular, we observe by (2.12) and (1.7) that
$$G''(-\psi^0) = -\frac{1}{F''(\omega_0)}. \qquad (2.13)$$
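The identity (2.13) can be checked numerically for a concrete concave $F$. The sketch below is only an illustration: the choice $F(s) = -\tfrac{a}{2}s^2 - \log\cosh s$ with $a = 0.5$, the sample value of $\lambda$ (which plays the role of $-\psi^0$ at one point), and all function names are assumptions made for this example, not taken from the paper. The supremum in (2.12) is found by solving $\lambda + F'(\mu) = 0$ with bisection, and $G''$ is approximated by a central difference.

```python
import math

A = 0.5  # hypothetical curvature parameter; it guarantees F'' <= -A < 0

def F(s):
    return -0.5 * A * s * s - math.log(math.cosh(s))

def dF(s):
    return -A * s - math.tanh(s)

def d2F(s):
    return -A - 1.0 / math.cosh(s) ** 2

def maximizer(lam):
    """Solve lam + F'(mu) = 0 by bisection; F' is strictly decreasing."""
    lo, hi = -abs(lam) / A - 1.0, abs(lam) / A + 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if lam + dF(mid) > 0.0:   # objective still increasing: maximizer lies to the right
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def G(lam):
    """Legendre transform (2.12): sup over mu of mu*lam + F(mu)."""
    mu = maximizer(lam)
    return mu * lam + F(mu)

def d2G_numeric(lam, h=1e-4):
    """Central second difference of G."""
    return (G(lam + h) - 2.0 * G(lam) + G(lam - h)) / (h * h)

lam = 0.7                  # stands in for -psi0 at a single point
omega0 = maximizer(lam)    # then F'(omega0) = -lam, mirroring (1.7)
lhs = d2G_numeric(lam)     # G''(-psi0)
rhs = -1.0 / d2F(omega0)   # -1 / F''(omega0)
```

Under these assumptions `lhs` and `rhs` should agree up to finite-difference error, and both are positive, which is the sign needed for $G''(-\psi^0)$ to serve as a weight in the eigenvalue problem that follows.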
The proofs of Lemmas 2.1 and 2.2 are given in [WG2]. As a result of these lemmas, we can prove nonlinear stability provided the quadratic form
$$\left(\delta^2_{\psi^0}D\,\phi, \phi\right) \equiv -\frac{1}{2}\int_\Omega |\nabla\phi|^2\,dx_1dx_2 + \frac{1}{2}\int_\Omega G''(-\psi^0)\,\phi^2\,dx_1dx_2 \qquad (2.14)$$
is negative definite. Viewing (2.14) as a quadratic form in the space $L^2\left(G''(-\psi^0)\,dx_1dx_2\right)$, we obtain this condition by considering the generalized, self-adjoint eigenvalue problem of the Laplacian on $\Omega$, subject to a Dirichlet boundary condition (if $\partial\Omega \ne \emptyset$):
$$\Delta\phi + \mu\, G''(-\psi^0)\,\phi = 0. \qquad (2.15)$$
Since $G''(-\psi^0) > 0$, (2.15) has a complete set of eigenfunctions $\phi_i$, orthonormal in $L^2\left(G''(-\psi^0)\,dx_1dx_2\right)$, and eigenvalues $0 \le \mu_0 \le \dots \le \mu_i \to \infty$, where $\mu_0 > 0$ if $\partial\Omega \ne \emptyset$. The quadratic form (2.14) is negative definite iff $\mu_0 > 1$. Thus (cf. [WG2]) one obtains:
Theorem 2.1. Assume the conditions of Lemma 2.1. Then $\psi^0$ is stable in the sense of Definition 2.1 provided $\mu_0 > 1$ is the ground eigenvalue of (2.15).
3. Hyperbolic Saddle Solutions
Definition 3.1. A stationary solution $\psi^0$ of (1.1) is called a hyperbolic saddle iff
$$0 \le \mu_0 < \mu_1 < \dots < \mu_n < 1 < \mu_{n+1} \le \dots$$
holds for some $n \ge 0$.
To handle the situation of a saddle solution $\psi^0$, we first look for a set $S \subset X$ invariant under the Euler flow (1.1) and containing $\psi^0$. Then we shall construct a $\{E, S\}$-supporting functional. To define the set $S$, consider the inner product
$$\langle\zeta, \eta\rangle \equiv \int_\Omega G''(-\psi^0)\,\zeta\eta\,dx_1dx_2 \qquad (3.1)$$
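over the space $Y$. Let
$$W \equiv \left\{\psi \in Y : \exists h \in C^2\left[F'(J)\right] \text{ and } \psi = h(\psi^0)\right\}. \qquad (3.2)$$
Let $W_m$ be an $m$-dimensional subspace of $W$ and isolate a $\langle\cdot,\cdot\rangle$-orthonormal basis of $W_m$:
$$W_m \equiv \mathrm{Span}\left\{\xi_1^0, \dots, \xi_m^0\right\}.$$
Denote
$$\xi_i^0 \equiv h_i(\psi^0); \qquad \hat\xi_i = h_i\left(F'(\cdot)\right) \in C^2(J); \qquad \hat\Xi_i(s) \equiv \int^s \hat\xi_i$$
and extend $\hat\Xi_i$ over $\mathbb{R}^1$ so that $\hat\Xi_i'' \in C^2(\mathbb{R}^1)$. Given $m$ real numbers $M_1, \dots, M_m$, define the set
$$S_{M_1,\dots,M_m} \equiv \left\{\psi \in X : \int_\Omega \hat\Xi_i(\Delta\psi) = M_i,\ 1 \le i \le m\right\}. \qquad (3.3)$$
Let
$$M_i^0 \equiv \int_\Omega \hat\Xi_i(\omega_0) \qquad (3.4)$$
and denote $S_0 \equiv S_{M_1^0,\dots,M_m^0}$. Then $\psi^0 \in S_0$ by definition.
Our object now is to construct an $\{E, S_0\}$-supporting functional over $Y$. In order to do it, one may try to repeat the proof of Lemma 2.2 (cf. [WG2]), namely to define
$$D(\psi) \equiv -\frac{1}{2}\int_\Omega (\nabla\psi)^2 + \sup_{\zeta\in\Delta S_0}\int_\Omega \left[-\psi\zeta + F(\zeta)\right], \qquad (3.5)$$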
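where $\Delta S_0 \equiv \{\Delta\eta : \eta \in S_0\}$. However, this direct approach cannot work here since the set $\Delta S_0$ is not weakly closed in $L^2(\Omega)$. Thus, the supremum in (3.5) is not necessarily attained within this set and, in particular, we cannot prove condition (2.8).
We thus take a somewhat different approach. Given $2m$ real numbers $\lambda^{(1)}, \dots, \lambda^{(m)}, M_1, \dots, M_m$, define
$$\mathcal{E}(\lambda, M, \psi) \equiv \frac{1}{2}\int_\Omega |\nabla\psi|^2 + \int_\Omega F(\Delta\psi) + \sum_{i=1}^m \lambda^{(i)}\left[\int_\Omega \hat\Xi_i(\Delta\psi)\,dx_1dx_2 - M_i\right] = E(\psi) + \sum_{i=1}^m \lambda^{(i)}\left[\int_\Omega \hat\Xi_i(\Delta\psi)\,dx_1dx_2 - M_i\right] \qquad (3.6)$$
over $\mathbb{R}^{2m} \otimes X$, where $\lambda \equiv \{\lambda^{(1)}, \dots, \lambda^{(m)}\} \in \mathbb{R}^m$ and $M \equiv \{M_1, \dots, M_m\} \in \mathbb{R}^m$. Since $E$ is continuous over $X$ by Lemma 2.1 and $\hat\Xi_i'' \in C^2(\mathbb{R}^1)$ by assumption, the functional $\mathcal{E}(\lambda, M, \cdot)$ is well defined and continuous over $X$ for any fixed $\{\lambda, M\} \in \mathbb{R}^{2m}$. It is obvious that an $\{\mathcal{E}(\lambda, M^0, \cdot), X\}$-supporting functional is also $\{E, S_0\}$-supporting for any $\lambda \in \mathbb{R}^m$, where $M^0 \equiv \{M_1^0, \dots, M_m^0\}$. In fact, according to (3.4) and for $\forall\psi \in S_0$: $\mathcal{E}(\lambda, M^0, \psi) \equiv E(\psi)$. Let $\delta > 0$ be small enough so that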
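$$\sup_{s\in\mathbb{R}}\left(\frac{d^2}{ds^2}F(s) + \sum_{i=1}^m \lambda^{(i)}\frac{d^2}{ds^2}\hat\Xi_i(s)\right) < -\frac{\varepsilon'}{2}, \qquad (3.7)$$
provided $|\lambda| \equiv \sum_{i=1}^m |\lambda^{(i)}| < \delta$. This is possible since $\hat\Xi_i'' \in C^2(\mathbb{R}^1)$, while $F'' < -\varepsilon'$ everywhere. Define now
$$D(\lambda, M, \psi) \equiv -\frac{1}{2}\int_\Omega |\nabla\psi|^2 + \sup_{\zeta\in L^2(\Omega)}\int_\Omega \left(-\psi\zeta + F(\zeta) + \sum_{i=1}^m \lambda^{(i)}\hat\Xi_i(\zeta)\right)dx_1dx_2 - \sum_{i=1}^m \lambda^{(i)}M_i. \qquad (3.8)$$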
The functional (3.8) is well defined for any $\psi \in Y$ and any $M \in \mathbb{R}^m$, provided $|\lambda| < \delta$. Indeed, the integrand in the RHS of (3.8) is strictly concave by (3.7), and the supremum is attained at some $\zeta = \zeta_{\psi,\lambda} \in L^2(\Omega)$. It follows that
$$D(\lambda, M, \psi) = -\frac{1}{2}\int_\Omega |\nabla\psi|^2 + \int_\Omega G(\lambda, -\psi) - \sum_{i=1}^m \lambda^{(i)}M_i, \qquad (3.9)$$
where $G(\lambda, \cdot)$ is the convex conjugate, with respect to the second variable, of
$$-F(\cdot) - \sum_{i=1}^m \lambda^{(i)}\hat\Xi_i(\cdot), \qquad (3.10)$$
namely
$$G(\lambda, p) = \sup_{q\in\mathbb{R}}\left[pq + F(q) + \sum_{i=1}^m \lambda^{(i)}\hat\Xi_i(q)\right]. \qquad (3.11)$$
Since (3.10) is strictly convex and smooth in the dummy variable denoted by $q$ in (3.11), we obtain that $G(\lambda, \cdot)$ is smooth, convex and, by using (3.7), that it is majorized by a quadratic function for fixed $|\lambda| < \delta$ as well. Using standard Sobolev embedding we obtain that
$$\mathcal{G}(\lambda, \psi) \equiv \int_\Omega G(\lambda, -\psi) \qquad (3.12)$$
is a twice Fréchet-differentiable functional on $Y$ for any fixed $\lambda$ satisfying $|\lambda| < \delta$. Define
$$\hat D(M, \psi) \equiv \inf_{|\lambda|\le\delta} D(\lambda, M, \psi). \qquad (3.13)$$
Lemma 3.1. There exists an open neighborhood $V \subset Y$ containing $\psi^0$ and an open neighborhood $\hat U \subset \mathbb{R}^m$ containing $M^0$, where $\hat D \in C^2(\hat U \oplus V)$. In addition, $\hat D(M^0, \cdot)$ is locally $\{E, S_0\}$-supporting in $V$ and, moreover, $\forall\psi \in S_M \cap V$ and $\forall M \in \hat U$,
$$\hat D(M, \psi) \ge E(\psi) \qquad (3.14)$$
holds.
Proof. We shall first prove that $\hat D(M^0, \cdot)$ satisfies (2.8) and (2.9) in a $Y$-neighborhood of $\psi^0$. We claim that $D(\lambda, M^0, \psi)$ is an $\{E, S_0\}$-supporting functional for any $|\lambda| < \delta$. To see this, take $\zeta \equiv \Delta\psi$ in (3.8) and observe that
$$D(\lambda, M, \psi) \ge -\frac{1}{2}\int_\Omega |\nabla\psi|^2 + \int_\Omega \left(-\psi\Delta\psi + F(\Delta\psi) + \sum_{i=1}^m \lambda^{(i)}\hat\Xi_i(\Delta\psi)\right)dx_1dx_2 - \sum_{i=1}^m \lambda^{(i)}M_i = \mathcal{E}(\lambda, M, \psi) \qquad (3.15)$$
for any $\psi \in X$, $M \in \mathbb{R}^m$ and $|\lambda| < \delta$. The last equality in (3.15) follows via integration by parts. On the other hand, $\mathcal{E}(\lambda, M, \psi) = E(\psi)$ for any $\psi \in S_M$ and any $|\lambda| < \delta$ by definition. Hence
$$D(\lambda, M, \psi) \ge E(\psi) \qquad (3.16)$$
over $S_M \subset Y$. The inequality (3.16) is preserved if we take the infimum (3.13) and (3.14) follows. The inequality (2.8) follows upon substituting $M = M^0$ in (3.14). On the other hand,
$$\hat D(M^0, \psi) \le D(0, M^0, \psi) \qquad (3.17)$$
for any $\psi \in Y$ by (3.13), while
$$D(0, M, \psi) \equiv D(\psi) \qquad (3.18)$$
over $Y$ for any $M \in \mathbb{R}^m$, where $D$ is as given by (2.11). The equality (3.18) follows, by comparing (2.12) and (3.11), from the equality $G(0, \cdot) \equiv G(\cdot)$. Thus
$$\hat D(M^0, \psi^0) \le D(\psi^0) = E(\psi^0); \qquad (3.19)$$
the equality in (3.19) follows from the fact that $D$ is $\{E, X\}$-supporting (cf. Lemma 2.2). By the already proven (2.8), we conclude that the inequality (3.19) is, indeed, an equality and (2.9) follows.
Next we have to investigate the Fréchet differentiability of $\hat D$. Let
$$\hat D(M, \psi) = -\frac{1}{2}\int_\Omega |\nabla\psi|^2\,dx_1dx_2 + \hat{\mathcal{G}}(M, \psi),$$
where
$$\hat{\mathcal{G}}(M, \psi) \equiv \inf_{|\lambda|\le\delta}\left[\mathcal{G}(\lambda, \psi) - \sum_{i=1}^m \lambda^{(i)}M_i\right] \qquad (3.20)$$
and $\mathcal{G}(\lambda, \psi)$ is given by (3.12). We will show that $\hat{\mathcal{G}}$ is twice Fréchet differentiable in an appropriately defined neighborhood $\hat U \oplus V \subset \mathbb{R}^m \oplus Y$ containing $\{M^0, \psi^0\}$. This is the object of the following proposition.
Proposition 3.1. $\mathcal{G}(\lambda, \psi^0)$ satisfies
$$\frac{\partial\mathcal{G}}{\partial\lambda^{(i)}} = M_i^0 \qquad (3.21)$$
and
$$\frac{\partial^2\mathcal{G}}{\partial\lambda^{(i)}\partial\lambda^{(j)}} = \delta_{ij}, \qquad (3.22)$$
at $\lambda = 0$ and $\psi = \psi^0$.
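By Proposition 3.1 there exist a neighborhood $U \subset \mathbb{R}^m$ of 0 and a neighborhood $V \subset Y$ of $\psi^0$ so that $\mathcal{G}(\lambda, \psi)$ is strictly convex with respect to $\lambda$ in $U$, uniformly for any $\psi$ in $V$. This implies that the infimum in (3.13) is attained at $\lambda(M, \psi)$, $|\lambda(M, \psi)| \le \delta$, provided $\psi \in V$ lies in a sufficiently small neighborhood of $\psi^0$ and $M - M^0$ is sufficiently small. The $\mathbb{R}^m$-valued function $\lambda$ is, thus, twice-continuously differentiable in a neighborhood $V \oplus \hat U \subset Y \oplus \mathbb{R}^m$ of $\{\psi^0, M^0\}$, which implies that $\hat{\mathcal{G}}$ is twice-continuously differentiable in the same neighborhood.
We proceed to prove Proposition 3.1. Using the fact that $G(\lambda, p)$ is the Legendre transform of $-F(q) - \sum_{i=1}^m \lambda^{(i)}\hat\Xi_i(q)$ with respect to the $q$-variable (see Eq. (3.11)), we may introduce the relation between the dual variables $p$ – $q$ as:
$$q \equiv q(\lambda, p) = \left[F' + \sum_{i=1}^m \lambda^{(i)}\frac{\partial\hat\Xi_i(\cdot)}{\partial\,\cdot}\right]^{-1}(-p); \qquad (3.23)$$
accordingly,
$$G(\lambda, p) = F(q(\lambda, p)) + \sum_{i=1}^m \lambda^{(i)}\hat\Xi_i(q(\lambda, p)) + p\,q(\lambda, p) \qquad (3.24)$$
is defined for any $|\lambda| < \delta$. Using (3.24) we obtain the usual conjugacy relations
$$\nabla_\lambda G = \left\{\partial_{\lambda^{(1)}}G, \dots, \partial_{\lambda^{(m)}}G\right\}, \qquad \partial_{\lambda^{(i)}}G \equiv \frac{\partial G}{\partial\lambda^{(i)}}(\lambda, p) = \hat\Xi_i(q(\lambda, p)), \qquad (3.25)$$
and
$$\nabla^2_\lambda G = \left\{\partial^2_{\lambda^{(i)},\lambda^{(j)}}G\right\}, \qquad \partial^2_{\lambda^{(i)},\lambda^{(j)}}G \equiv \frac{\partial^2 G}{\partial\lambda^{(i)}\partial\lambda^{(j)}} = -\frac{\hat\Xi_i'(q(\lambda, p))\,\hat\Xi_j'(q(\lambda, p))}{F''(q(\lambda, p)) + \sum_{k=1}^m \lambda^{(k)}\hat\Xi_k''(q(\lambda, p))}. \qquad (3.26)$$
The proof follows by noticing that (3.25) yields
$$\frac{\partial\mathcal{G}}{\partial\lambda^{(i)}}(0, -\psi^0) = \int_\Omega \hat\Xi_i\left(q(0, -\psi^0)\right). \qquad (3.27)$$
At the same time, by (3.23), $q(0, -\psi^0) = (F')^{-1}(\psi^0)$ and, by (2.3), $(F')^{-1}(\psi^0) \equiv g(\psi^0) = \Delta\psi^0$. Thus (3.21) follows from (3.4). Similarly, we obtain by (2.13), (3.1), (3.26) and the presumed $\langle\cdot,\cdot\rangle$-orthonormality of $\xi_i^0$:
$$\frac{\partial^2\mathcal{G}}{\partial\lambda^{(i)}\partial\lambda^{(j)}}(0, -\psi^0) = -\int_\Omega \frac{\hat\Xi_i'(\Delta\psi^0)\,\hat\Xi_j'(\Delta\psi^0)}{F''(\Delta\psi^0)} = \int_\Omega G''(-\psi^0)\,\xi_i^0\xi_j^0 = \langle\xi_i^0, \xi_j^0\rangle = \delta_{i,j}. \qquad (3.28)$$
This completes the proof of Lemma 3.1.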
The choice of the particular $\langle\cdot,\cdot\rangle$-orthonormal basis of the subspace $W_m \subset W$ led us to the definition of $\hat D$. It is evident that a different choice of basis
$$\xi_i^0 \to \zeta_i^0 = \sum_{j=1}^m \beta_{ij}\,\xi_j^0 \qquad (3.29)$$
will lead us to the analogous relation
$$\hat D(M, \psi) \to \hat D(BM, \psi), \qquad (3.30)$$
where $B \equiv \{\beta_{ij}\}$ is the transformation matrix. We may, thus, consider $\hat D$ as uniquely defined, modulo the transformation (3.30), by a finite-dimensional subspace of $W$.
We are now in a position to prove the stability result in the case of a saddle solution $\psi^0$:
Theorem 3.1. Assume there exists a finite-dimensional subspace of $W$ so that $\delta^2_{\psi^0}\hat D(M^0, \cdot)$ is negative definite, where $\hat D(\cdot, \cdot)$ is constructed in terms of this subspace. Then $\psi^0$ is stable.
Proof. Let $\varepsilon > 0$. Define
$$B_X^\delta \equiv \left\{\hat\psi \in X : \|\hat\psi - \psi^0\|_X < \delta\right\}$$
and, similarly,
$$B_Y^\varepsilon \equiv \left\{\hat\psi \in Y : \|\hat\psi - \psi^0\|_Y < \varepsilon\right\}.$$
Given $\varepsilon > 0$ we have to show, by Proposition 2.1, the existence of $\delta > 0$ so that
$$\psi(\cdot, t) \in B_Y^\varepsilon \qquad (3.31)$$
for any $t > 0$ and any $\psi(\cdot, 0) \equiv \hat\psi \in B_X^\delta$.
By the assumption of the theorem, there exists $\varepsilon'(\varepsilon) > 0$ for which
$$\sup_{\|\hat\psi - \psi^0\|_Y = \varepsilon}\left\{\hat D(M^0, \hat\psi) - \hat D(M^0, \psi^0)\right\} < -\varepsilon'. \qquad (3.32)$$
For $\delta$ sufficiently small we obtain, by the continuity of $\hat D$:
$$\sup_{\|\psi - \psi^0\|_Y = \varepsilon}\left\{\hat D(M, \psi) - \hat D(M^0, \psi^0)\right\} < -\frac{\varepsilon'}{2}, \qquad (3.33)$$
provided $M - M^0 = O(\delta)$. By Lemma 3.1,
$$E(\psi) - E(\psi^0) \le \hat D(M, \psi) - \hat D(M^0, \psi^0), \qquad (3.34)$$
provided $\psi \in B_Y^\varepsilon \cap X \cap S_M$. Now, consider the solution $\psi(\cdot, t)$ of (1.1) originating at $\hat\psi \in B_X^\delta$. Then $\hat\psi \in S_M$, where $M - M^0 = O(\delta)$. By the invariance of $S_M$ under the flow, it follows that $\psi(\cdot, t) \in S_M$ for all $t \ge 0$. Suppose there exists $t > 0$ for which $\|\psi(\cdot, t) - \psi^0\|_Y = \varepsilon$. Then the RHS of (3.34) is smaller than $-\varepsilon'/2$ by (3.33), while the LHS of (3.34) is $O(\delta)$ by the continuity of $E$. We, then, obtain a contradiction by setting $\delta = \delta(\varepsilon')$ sufficiently small.
4. Proof of the Main Result
An explicit computation of the quadratic form induced by the second Fréchet derivative of $\hat D(M^0, \cdot)$ at $\psi^0$, acting on $\phi \in Y$, yields
$$\left(\phi, \delta^2_{\psi^0}\hat D(M^0, \cdot)\,\phi\right) = -\frac{1}{2}\left[\int_\Omega |\nabla\phi|^2\,dx_1dx_2 - \int_\Omega G''(-\psi^0)\,\phi^2\,dx_1dx_2 + \sum_{i=1}^m \langle\xi_i^0, \phi\rangle^2\right]. \qquad (4.1)$$
Notice that the first two terms on the RHS of (4.1) make up the quadratic form (2.14) associated with $\delta^2_{\psi^0}D$. The sum in the third term reduces the value of the quadratic form associated with $\delta^2_{\psi^0}\hat D$ further. If this effect is strong enough, we obtain stability via Theorem 3.1.
The eigenvalue problem associated with (4.1) generalizes (2.15) to
$$\Delta\eta - G''(-\psi^0)\sum_{i=1}^m \langle\xi_i^0, \eta\rangle\,\xi_i^0 + \nu\, G''(-\psi^0)\,\eta = 0, \qquad (4.2)$$
where a generalized eigenfunction $\eta$ should satisfy the Dirichlet boundary condition if $\partial\Omega \ne \emptyset$. As in the case (2.15), we see that (4.2) admits a complete, $L^2\left(G''(-\psi^0)\,dx_1dx_2\right)$-orthonormal set of eigenfunctions $\eta_i$ and positive eigenvalues $0 \le \nu_0 \le \dots \le \nu_i \to \infty$, where $\nu_0 \ge \mu_0$. The quadratic form (4.1) is negative definite iff $\nu_0 > 1$.
Corollary 4.1. A stationary solution $\psi^0$ is stable if $\nu_0 > 1$.
We now consider algebraic conditions for $\nu_0 > 1$. Let us rewrite (4.2) as:
$$\left[G''(-\psi^0)\right]^{-1}\Delta\eta - \sum_{k=1}^m \langle\xi_k^0, \eta\rangle\,\xi_k^0 + \nu\eta = 0, \qquad (4.3)$$
and denote by $R^{(\nu)}$ the resolvent of $\left[G''(-\psi^0)\right]^{-1}\Delta$:
$$R^{(\nu)} = \left[\left(G''(-\psi^0)\right)^{-1}\Delta + \nu\right]^{-1}.$$
Define
$$\alpha_{i,j}(\nu) \equiv \langle R^{(\nu)}\xi_i^0, \xi_j^0\rangle \qquad (4.4)$$
and denote by $\mathbf{A}(\nu)$ the $m \times m$ matrix-valued function whose entries are given by (4.4). Consider
$$\mathcal{A}(\nu) \equiv \mathrm{Det}\left[\mathbf{A}(\nu) - I\right], \qquad (4.5)$$
where $I$ is the $m \times m$ unit matrix. Evidently, $\mathcal{A}$ is meromorphic in the entire complex plane, with possible poles at the eigenvalues $\nu = \mu_i$ of (2.15).
Proposition 4.1. A real, nonnegative $\nu$ is a generalized eigenvalue of (4.2) iff either
i) $\nu \ne \mu$ for any eigenvalue $\mu = \mu_i$ of (2.15) and $\mathcal{A}(\nu) = 0$, or
ii) $\nu = \mu_i$ for some eigenvalue $\mu_i$ of (2.15), and $\mathcal{A}$ is analytic at $\mu_i$.
Proof. i) Define $\eta_i$ to be the solution of
$$\Delta\eta_i + \nu\, G''(-\psi^0)\,\eta_i - G''(-\psi^0)\,\xi_i^0 = 0, \qquad (4.6)$$
so that $\eta_i = R^{(\nu)}\xi_i^0$. If $\eta$ is a generalized eigenfunction of (4.2) corresponding to the eigenvalue $\nu$, then
$$\eta = \sum_{j=1}^m \beta_j\eta_j, \qquad (4.7)$$
while $\beta_j = \langle\eta, \xi_j^0\rangle$. Since $\langle\eta_i, \xi_j^0\rangle = \alpha_{i,j}(\nu)$ by definition, we obtain
$$\beta_k = \sum_{i=1}^m \beta_i\,\alpha_{i,k}(\nu). \qquad (4.8)$$
Evidently, (4.8) admits a nontrivial solution iff $\mathcal{A}(\nu) = 0$ iff $\eta \ne 0$ is an eigenfunction of (4.2).
ii) Let $\nu = \mu_i$ be some eigenvalue of (2.15). Let $\phi = \phi_i$ be the corresponding eigenfunction; i.e. a nontrivial solution of (2.15) at $\mu = \mu_i$. Let $P$ be the $\langle\cdot,\cdot\rangle$-orthonormal projection on $\phi$. By self-adjointness of the operators of interest,
$$R^{(\nu)} = (I - P)R^{(\nu)}(I - P) + PR^{(\nu)}P.$$
Define then
$$\alpha^0_{i,j}(\nu) = \langle(I - P)R^{(\nu)}(I - P)\xi_i^0, \xi_j^0\rangle \qquad (4.9)$$
and $\mathbf{A}^{(0)}(\nu)$ to be the corresponding $m \times m$ matrix-valued function. Then $\mathbf{A}^{(0)}$ is analytic at $\nu = \mu_i$ and
$$\mathbf{A}(\nu) = \mathbf{A}^{(0)}(\nu) + \frac{1}{\nu - \mu_i}\mathbf{A}^{(1)}, \qquad (4.10)$$
where $\mathbf{A}^{(1)}$ is the constant matrix whose entries are $\alpha^{(1)}_{kj} = \langle\phi, \xi_j^0\rangle\langle\phi, \xi_k^0\rangle$.
If $\mathbf{A}^{(1)} \equiv 0$ then $\mathcal{A}$ is certainly analytic at $\nu = \mu_i$. This is the case iff $\mathrm{Span}\{\xi_1^0, \dots, \xi_m^0\}$ is $\langle\cdot,\cdot\rangle$-perpendicular to $\phi$. It is easy to see that $\phi$ is an eigenfunction of (4.2) corresponding to the eigenvalue $\nu = \mu_i$ in this case.
Otherwise, consider the normalized $m$-column vector $c^{(1)}$ whose components are:
$$c^{(1)}_j \equiv \frac{\langle\phi, \xi_j^0\rangle}{\sqrt{\sum_{k=1}^m \langle\phi, \xi_k^0\rangle^2}}, \qquad (4.11)$$
and let $c^{(k)}$, $2 \le k \le m$ be some complement of $c^{(1)}$ extending it into an orthonormal basis of $\mathbb{R}^m$. Define $C$ to be the orthonormal matrix
$$C = \mathrm{col}\left[c^{(1)}, \dots, c^{(m)}\right]. \qquad (4.12)$$
Then
$$C^T \circ \mathbf{A}^{(1)} \circ C = I^{(1,1)}, \qquad (4.13)$$
where $I^{(1,1)}$ is the $m \times m$ matrix whose upper-left ($\{1,1\}$) entry is one and all the rest are zeros. Thus
$$\mathcal{A}(\nu) = \mathrm{Det}\left[C^T\mathbf{A}^{(0)}(\nu)C - I + \frac{1}{\nu - \mu_i}I^{(1,1)}\right] = \frac{1}{\nu - \mu_i}\mathrm{Det}\left[Q(\nu)\right] + \text{analytic function}, \qquad (4.14)$$
where $Q(\nu) = \left[\mathbf{A}^{(0)}(\nu) - I\right]^{(1,1)}$ is the $(m-1) \times (m-1)$ minor of $\mathbf{A}^{(0)}(\nu) - I$ obtained by omitting the first row and column. Now, $\mathrm{Det}[Q]$ is an analytic function at $\nu = \mu_i$ and we have to show that $\mathrm{Det}[Q](\mu_i) = 0$ iff $\mu_i$ is an eigenvalue of (4.2).
We claim that $\mu_i$ is an eigenvalue of (4.2) iff there exists a real number $\alpha$ and a nontrivial solution $\beta \in \mathbb{R}^m$ of
$$\mathbf{A}^{(0)}(\mu_i)\beta + \alpha c^{(1)} = \beta, \qquad (4.15)$$
$$\beta^T \cdot c^{(1)} = 0. \qquad (4.16)$$
If $\eta$ is an eigenfunction of (4.2) corresponding to $\mu_i$ then $\beta = \beta_j \equiv \langle\eta, \xi_j^0\rangle$ is a solution of (4.15, 4.16) where
$$\alpha = -\left(c^{(1)}\right)^T\mathbf{A}^{(0)}\beta. \qquad (4.17)$$
Conversely, if $\beta$ is a nontrivial solution of (4.15, 4.16), then
$$\eta = \sum_{k=1}^m \beta_k\eta_k^0 + \alpha\phi \qquad (4.18)$$
is an eigenfunction of (4.2) corresponding to $\mu_i$, where $\eta_k^0$ is the solution of
$$\Delta\eta_k^0 + \mu_i\, G''(-\psi^0)\,\eta_k^0 - G''(-\psi^0)\left[\xi_k^0 - \langle\phi, \xi_k^0\rangle\phi\right] = 0 \qquad (4.19)$$
orthogonal to $\phi$ and $\alpha$ defined as in (4.17); notice that $\alpha^0_{j,k}(\mu_i) = \alpha^0_{k,j}(\mu_i) = \langle\eta_k^0, \xi_j^0\rangle$. Defining $\gamma = C^T\beta$ and substituting into (4.15, 4.16) yields the equivalent
$$\left[C^T\mathbf{A}^{(0)}(\mu_i)C - I\right]\gamma + \alpha\, C^Tc^{(1)} = 0, \qquad (4.20)$$
$$\gamma^TC^Tc^{(1)} \equiv \gamma_0 = 0. \qquad (4.21)$$
We thus see that $\mathrm{Det}[Q(\mu_i)] = 0$ is equivalent to the solvability condition of either (4.20, 4.21) or equivalently to that of (4.15, 4.16). This completes the proof of Proposition 4.1.
It is easy to see that the meromorphic function $\mathcal{A}$ is unchanged, up to multiplication by a nonzero constant, if we choose a different basis for $\mathrm{Span}\{\xi_1^0, \dots, \xi_m^0\}$ (not necessarily $\langle\cdot,\cdot\rangle$-orthonormal). Thus the zeros and the poles of $\mathcal{A}$ are determined completely as a property of a finite-dimensional subspace of $W$, whose basis $\xi_1^0, \dots, \xi_m^0$ is used to define $\alpha_{i,j}$ in (4.4). We have thus proved Part (A) of our Main Result as stated in the Introduction:
Theorem 4.1. Let a hyperbolic stationary solution $\psi^0$ have the $n + 1$ eigenvalues $\{\mu_0, \dots, \mu_n\}$ of its associated generalized eigenproblem (2.15). Suppose there exists a finite-dimensional subspace of $W$ so that the corresponding meromorphic function $\mathcal{A}(\nu)$ has no zero in $[0, 1]$ and admits poles at $\nu = \mu_0, \dots, \mu_n$. Then $\psi^0$ is nonlinearly stable.
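The determinant criterion can be made concrete in a finite-dimensional toy model. The sketch below is an assumption-laden illustration rather than the paper's construction: the operator $-[G''(-\psi^0)]^{-1}\Delta$ is replaced by a diagonal matrix with hypothetical eigenvalues `mu` (only one of them below 1), the subspace is one-dimensional ($m = 1$) and spanned by a unit vector with hypothetical components `c` in the eigenbasis, and the names `script_A` and `perturbed_eigenvalues` are invented for this example. In this setting $\mathcal{A}(\nu) = \sum_k c_k^2/(\nu - \mu_k) - 1$, and the analogue of (4.2) is a rank-one update whose eigenvalues are the zeros of $\mathcal{A}$.

```python
import math

def script_A(nu, mu, c):
    """Analogue of Det[A(nu) - I] for m = 1: sum_k c_k^2 / (nu - mu_k) - 1."""
    return sum(ck * ck / (nu - mk) for mk, ck in zip(mu, c)) - 1.0

def perturbed_eigenvalues(mu, c, iters=100):
    """Zeros of script_A: one in each gap (mu_k, mu_{k+1}) and one above max(mu).

    Assumes mu is sorted and every c_k is nonzero, so each mu_k is a pole.
    """
    norm2 = sum(ck * ck for ck in c)
    uppers = list(mu[1:]) + [mu[-1] + norm2 + 1.0]
    roots = []
    for left, right in zip(mu, uppers):
        lo, hi = left, right   # script_A decreases monotonically across each gap
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if script_A(mid, mu, c) > 0.0:
                lo = mid
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
    return roots

mu = [0.4, 1.6, 2.5, 3.7]   # hypothetical spectrum: only mu_0 lies below 1

# Unit vector concentrated on the unstable mode: the lowest zero moves above 1.
c_good = [math.sqrt(v) for v in (0.9, 0.04, 0.03, 0.03)]
nu_good = perturbed_eigenvalues(mu, c_good)

# Unit vector with weak overlap on the unstable mode: the lowest zero stays below 1.
c_bad = [math.sqrt(v) for v in (0.2, 0.3, 0.3, 0.2)]
nu_bad = perturbed_eigenvalues(mu, c_bad)
```

In this toy model `nu_good[0] > 1`, so $\mathcal{A}$ has a pole at $\mu_0$ and no zero in $[0,1]$, while `nu_bad[0] < 1` violates the criterion. Because the update has rank one, the zeros interlace with the poles, which suggests why at least $n + 1$ basis functions are needed when $n + 1$ eigenvalues lie below 1.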
The proof of Part (B) is based on equivalent arguments, concerning the Euler equation linearized at the stationary solution $\psi^0$. Let $\phi(\cdot, t)$ be a solution of the linearized equation. Then the linearized energy
$$\langle \delta^2_{\psi^0} E\,\phi, \phi \rangle \equiv \frac{1}{2}\int_\Omega \left[|\nabla\phi|^2 + F''(\omega_0)\,|\Delta\phi|^2\right] dx_1\, dx_2$$ (4.22)
is an invariant of the linearized Euler equation. Moreover, the linear functionals
$$\int_\Omega \xi\phi\; dx_1\, dx_2$$ (4.23)
are invariants as well, provided
$$\xi \in \hat{W} \equiv \{\xi \in Y : \partial_{\psi^0}\xi \equiv 0\}$$ (4.24)
(cf. [MS, S]). Using these invariants we may construct the supporting functional (4.1) for the linearized equation, where $\xi_i^0$ satisfies (4.24). The rest of the proof is obtained along the same lines as Theorem 4.1 (cf. [WG2]).

A special case in which the conditions of our Main Result are satisfied yields:

Corollary 4.2. Suppose the eigenfunctions $\phi_0, \dots, \phi_n$ of (2.15) corresponding to the eigenvalues $\mu_0, \dots, \mu_n$ are contained in $W$ ($\hat{W}$). Then $\psi^0$ is nonlinearly (linearly) stable.

Indeed, under the conditions of Corollary 4.2, we may choose $m = n + 1$ and set $\xi_k^0 = \phi_k$ for $0 \le k \le n$. We then observe that $\alpha_{kj}(\nu) = \frac{1}{\nu - \mu_k}\,\delta_{kj}$; hence
$$A(\nu) = \prod_{k=0}^{n}\left[\frac{1}{\nu - \mu_k} - 1\right] = \prod_{k=0}^{n}\frac{1 + \mu_k - \nu}{\nu - \mu_k}$$
satisfies the respective condition of the Main Result. Unfortunately, the conditions of Corollary 4.2 are, in general, too strong to be satisfied in interesting cases.

5. Symmetry Breaking and Non-Hyperbolic Solutions

The case in which the flow (1.1) is equivariant with respect to a continuous symmetry group is particularly interesting from the point of view of stability analysis (see also [CM, CS, R and SG1]). In this section we focus on the action of such a group on the domain $\Omega$.

The Euler flow (1.1) can be generalized to a 2-D, compact Riemannian manifold $\Omega$ associated with the metric $g_{ij}\, dz^i dz^j$, where $z^1 \equiv x$, $z^2 \equiv y$. Let us then equip $\Omega$ with the symplectic form obtained from the Riemannian volume $\sqrt{g}\, dx \wedge dy$, where $g \equiv \mathrm{Det}\, g_{ij}$, and replace $\partial_\psi\omega$ by $\{\omega, \psi\}$; here $\{\cdot, \cdot\}$ is the Poisson bracket associated with the flow of (1.1) on $\Omega$, while the vorticity $\omega$ is obtained by applying the Laplace-Beltrami operator to $\psi$.

Let $\Gamma$ be the Lie group of isometries on the domain $\Omega$. Given our assumptions, this group induces a symplectic action on $\Omega$. We denote the action of $\gamma \in \Gamma$ on $\Omega$ by $\Omega \ni z \to \gamma \circ z \in \Omega$. This action on $\Omega$ induces a Poisson action on the functional spaces $X$ or $Y$ by
$$\psi(z) \to \psi^\gamma(z) \equiv \psi(\gamma \circ z).$$
Usually, this Poisson action is defined on the Banach manifold of vorticities $\omega \equiv \Delta\psi$. The Poisson structure is defined on this manifold in a natural way, but we omit its detailed description since the algebraic structure will not be used.

The Euler flow (1.1) is equivariant with respect to the action of $\Gamma$ on $X$ in the following sense: For any orbit $\psi(\cdot, t)$ that represents a time-dependent solution of (1.1), and any $\gamma \in \Gamma$, $\psi^\gamma(\cdot, t)$ is an orbit of (1.1) as well. The dual Lie algebra of $\Gamma$ can be identified with the linear space of Hamiltonian functions $\mathcal{K}$ (the momenta) defined on $\Omega$, whose flow induces the symplectic action $z \to \gamma \circ z$ in $\Omega$. We have in mind the following three examples:

A) $\Omega$ is the x-periodic, zonal channel $a < y < b$ with the Euclidean metric. The symplectic structure is given by $dx \wedge dy$, $\Gamma$ is the one-dimensional group of meridional shifts $z \equiv (x, y) \to (x + \gamma, y)$, and $\mathcal{K} \equiv \mathrm{Span}\{K_1\}$, where $K_1(z) = y$ is the Hamiltonian function inducing this flow.

B) $\Omega$ is an axisymmetric domain (disc or annulus) with the same metric and symplectic structure as in (A), $\Gamma$ is the group of rotations about the axis $z \to e^{i\gamma} z$, and $\mathcal{K} = \mathrm{Span}\{K_3\}$, where $K_3(z) = |z|^2$ is the Hamiltonian inducing the rotation.

C) $\Omega$ is the sphere $S^2$ with the Euclidean metric induced by $\mathbb{R}^3$, and $\Gamma = SO(3)$; $\mathcal{K}$ is spanned by the three angular momenta, given in the spherical coordinates on $S^2$.

Remark 5.1. If the flow is doubly-periodic in space, then $\Omega$ can be identified with the two-torus, and $\Gamma$ induces the 2-D shift: $z = (x, y) \to (x + \gamma_1, y + \gamma_2)$. In this case $\mathcal{K} = \mathrm{Span}\{K_1, K_2\}$, where $K_2(z) = -x$. However, the momenta $\{K_1, K_2\}$ are not doubly-periodic and the associated vector fields are only locally Hamiltonian. It is interesting that, therefore, this symmetry group on the torus does not seem to induce conservation laws!

Let us assume that $\Omega$ admits a nontrivial Lie group of isometries $\Gamma$. The space of momenta associated with the group action is $\mathcal{K} = \mathrm{Span}\{K_1, \dots, K_k\}$.

Example 1.
Let $\psi^0$ be a solution of (2.3), $K \in \mathcal{K}$, and $\psi$ be a solution of
$$\Delta\psi = g(\psi + K),$$ (5.1)
where $g$ is the inverse of $F'$, as in (2.3). An application of $\partial_\psi$ to $\Delta\psi \equiv \omega$ yields, via (5.1),
$$\partial_\psi\omega = g'(\psi + K)\,\partial_\psi(\psi + K) = g'(\psi + K)\,\partial_\psi K = -g'(\psi + K)\,\partial_K\psi.$$ (5.2)
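Let us define $\gamma_\lambda^t$ to be the orbit in $\Gamma$ that corresponds to the solution $\psi(\cdot, t)$ of (5.1) through the identity element $i \in \Gamma$ which is generated by the momentum $K$. Set $\psi(\cdot, t) \equiv \psi^{\gamma_\lambda^t}(\cdot)$; $\omega(\cdot, t) \equiv \omega^{\gamma_\lambda^t}(\cdot)$. Then the RHS of (5.2) acting on $\psi = \psi(\cdot, t)$ can be rewritten as
$$-\partial_t\left(g(\psi(\cdot, t) + K)\right) = -\partial_t\,\omega(\cdot, t)$$ (5.3)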
and we obtain from (5.2) that $\psi(\cdot, t)$ is also a solution of (1.1). Assume now that $\sup|K|$ is sufficiently small and that the solution $\psi$ of (5.1) is a small perturbation of the solution $\psi^0$ of (2.3), i.e. $\|\psi^0 - \psi\|_X < \delta$. If, moreover, $\psi^0$ is not equivariant (i.e., if there is no $\gamma \in \Gamma$ for which, according to (5.3), $\psi^{0,\gamma} \equiv \psi^0$), then $\exists\varepsilon$ for which $\|\psi^0 - \psi(\cdot, t)\|_Y \approx \|\psi^0 - \psi^{0,\gamma_t}\|_Y > \varepsilon$ for some $t$. Hence $\psi^0$ cannot be stable in the sense of Definition 2.1 of Sect. 2.

Example 1 demonstrates the need to generalize Definition 2.1 for stability in the presence of a symmetry group, on the one hand, and the need to generalize stationary solutions into stationarily rotating solutions, on the other. We introduce (cf. also [Ar3]):

Definition 5.1. A solution $\psi^0 \in X \cap C^\infty(\Omega)$ is called stationarily rotating if $\exists K \in \mathcal{K}$ for which
$$(\partial_K + \partial_{\psi^0})\,\omega_0 = 0.$$ (5.4)
Equation (5.4) is an obvious generalization of (1.5). We shall restrict attention to stationarily rotating solutions which satisfy the generalization (5.1) of (2.3).

Definition 5.2. A stationary or stationarily rotating solution $\psi^0$ is called stable modulo the group action $\Gamma$ provided $\forall\varepsilon\ \exists\delta$ such that
$$\|\psi(\cdot, 0) - \psi^0\|_X < \delta \implies \sup_{t>0}\,\inf_{\gamma\in\Gamma}\,\|\psi(\cdot, t) - \psi^{0,\gamma}\|_Y < \varepsilon,$$
where $\psi(\cdot, 0) \in X$ is an initial state for which the solution $\psi(\cdot, t) \in Y$ exists $\forall t > 0$.

By Example 1 we observe that a stationary solution $\psi^0$ is not, in general, stable in the sense of Definition 2.1 when a nontrivial isometry group $\Gamma$ is present and $\psi^0$ is not equivariant, i.e.
$$\partial_K\psi^0 \ne 0$$ (5.5)
for any $K \in \mathcal{K}$. Since the action of $\Gamma$ preserves the invariants of the Euler equation (1.1) it follows that
$$E(\psi^\gamma) = E(\psi)$$ (5.6)
for any $\gamma \in \Gamma$ and any $\psi \in X$. A stationary solution $\psi^0$ which satisfies (5.5) cannot, therefore, be a strict extremizer of $E$ and (1.10) is violated for $\phi \equiv \partial_K\psi^0$. Indeed, differentiating $E(\psi^{0,\gamma})$ twice along the group's orbit generated by $K \in \mathcal{K}$ at the identity $i \in \Gamma$ and applying (5.6), we obtain
$$0 = \partial_K^2 E(\psi^{0,\gamma}) = \delta_{\psi^0}E \circ \partial_K^2\psi^0 + 2\langle \delta^2_{\psi^0}E \circ \partial_K\psi^0, \partial_K\psi^0 \rangle = 0 + 2\langle \delta^2_{\psi^0}E \circ \partial_K\psi^0, \partial_K\psi^0 \rangle = 0,$$ (5.7)
having used $\delta_{\psi^0}E \equiv 0$. In fact, the situation is even worse:

Lemma 5.1. Let $\Omega$ be a bounded domain and $\partial\Omega \ne \emptyset$. Suppose there exists an isometry group $\Gamma$ on $\Omega$ and let $\psi^0$ be a stationary solution satisfying (5.5). Then there exists at least one test function $\phi$ for which $\langle \delta^2_{\psi^0}E \circ \phi, \phi \rangle > 0$.
Example 2. Let $\Omega$ be the disc $\{|z| < R\}$ or the annulus $\{R_1 < |z| < R_2\}$, and $\Gamma$ the group of rotations around the center $z = 0$. Any stationary solution $\psi^0$ which is not radially symmetric ($\psi^0 \ne \psi^0(|z|)$) satisfies the assumptions of Lemma 5.1.

Proof of Lemma 5.1. It suffices to show that there exists an eigenvalue $0 < \mu < 1$ of (2.15). Indeed, suppose such an eigenvalue exists. Then the associated eigenfunction $\phi$ satisfies
$$\Delta\phi + \mu\, G''(-\psi^0)\,\phi = 0$$ (5.8)
and, using (2.13),
$$F''(\Delta\psi^0)\,\Delta\phi - \mu\phi = 0.$$ (5.9)
Multiplying (5.9) by $\Delta\phi$ and integrating by parts over the domain $\Omega$, using the homogeneous Dirichlet condition for $\phi$ on the boundary, we obtain
$$0 = \int_\Omega F''(\Delta\psi^0)\,|\Delta\phi|^2\, dx_1 dx_2 + \mu\int_\Omega |\nabla\phi|^2\, dx_1 dx_2 = 2\langle \delta^2_{\psi^0}E \circ \phi, \phi \rangle + (\mu - 1)\int_\Omega |\nabla\phi|^2\, dx_1 dx_2 < 2\langle \delta^2_{\psi^0}E \circ \phi, \phi \rangle.$$ (5.10)
Next, recall that the Laplace-Beltrami operator $\Delta$ commutes with $\partial_K$ for any $K \in \mathcal{K}$. Using this and (2.3) we obtain
$$\partial_K\Delta\psi^0 = g'(\psi^0)\,\partial_K\psi^0 \equiv -G''(-\psi^0)\,\partial_K\psi^0;$$ (5.11)
hence $\hat\phi \equiv \partial_K\psi^0$ is an eigenfunction of (2.15) that corresponds to $\hat\mu = 1$. On the other hand, any integral curve of the vector field $\partial_K$ is either a periodic orbit of (1.1) or a dense orbit in the compact, 2-D manifold $\Omega$. Thus $\partial_K\psi^0$ must change sign in $\Omega$ if not identically zero. However, the ground eigenfunction $\phi$ of (2.15) must be strictly positive [RS, Ch. XII.12]. Hence there exists $\mu < \hat\mu = 1$ and a corresponding eigenfunction $\phi \ne \partial_K\psi^0$ for (2.15).

The invariance of the EC functional $E$ with respect to the group action (5.6) is also valid for the supporting functional $\hat{D}(M_0, \cdot)$, given by (3.13) on the space $Y$, i.e. $\forall\psi \in Y$ and $\forall M \in \mathbb{R}^m$,
$$\hat{D}(M, \psi^\gamma) = \hat{D}(M, \psi).$$ (5.12)
Notice that (5.12) holds even though $\hat{D}$ is not an invariant functional with respect to the Euler equation (1.1). In particular we obtain that the second Fréchet derivative of $\hat{D}(M_0, \cdot)$ at $\psi^0$ is degenerate, i.e., $\forall K \in \mathcal{K}$ and $\phi \equiv \partial_K\psi^0$,
$$(\delta^2_{\psi^0}\hat{D}(M_0, \cdot)\,\phi, \phi) \equiv 0.$$ (5.13)
Lemma 5.1 is not necessarily valid if $\delta^2_{\psi^0}E$ is replaced by $\delta^2_{\psi^0}\hat{D}(M_0, \cdot)$. However, an application of (5.12) yields, by an argument similar to the first part of the proof of Lemma 5.1:

Lemma 5.2. Let $\Phi$ be the space spanned by all the eigenfunctions of (4.2) that correspond to the eigenvalue $\nu = 1$. Then
$$\partial_{\mathcal{K}}\psi^0 \equiv \mathrm{Span}\{\partial_K\psi^0 : K \in \mathcal{K}\} \subseteq \Phi.$$
The space $\partial_{\mathcal{K}}\psi^0$ contained in $\Phi$ cannot be eliminated by any choice of a subspace $W_m \subset W$. However, it may be reduced by using the flow invariants induced by the isometry group, as in the energy-momentum (EM) method. The key idea of the EM method is to use the flow invariants of the dynamical system in question in order to reduce the phase space. The flow invariants of the Euler equation due to the isometry group $\Gamma$ are all linear functionals over $X$ of the form
$$\int_\Omega K\,\Delta\psi\; dx_1 dx_2, \qquad \forall K \in \mathcal{K}.$$ (5.14)
The invariants (5.14) can be added to the Casimir invariants of the EC method and be used to foliate the invariant set $S_M$ of (3.3) into
$$S_{M_1,\dots,M_m,M_{m+1},\dots,M_{m+k}} \equiv S_{M_1,\dots,M_m} \cap \left\{\psi \in X : \int_\Omega K_i\,\Delta\psi = M_{m+i},\ 1 \le i \le k\right\}.$$ (5.15)
More generally, we may add the functional space $\mathcal{K}$ to the “Casimir space” $W$ of (3.2) to obtain
$$W = \mathrm{Span}\{W \cup \mathcal{K}\}.$$ (5.16)
Given a finite-, m-dimensional subspace of $W$, we may define
$$\hat{S}_M \equiv \left\{\psi \in X : \int_\Omega \left[\Xi_i(\Delta\psi) + K^{(i)}\Delta\psi\right] dx_1 dx_2 = M_i,\ 1 \le i \le m\right\},$$ (5.17)
where $K^{(i)} \in \mathcal{K}$. We can go now through the steps (3.6)–(3.13), using this time (5.17) rather than (3.3), to define $\hat{D}$ in a neighborhood of $(M_0, \psi^0)$. Moreover, we may assume $\psi^0$ to be a stationarily rotating solution that satisfies (5.1), rather than (2.3). Lemma 3.1 is still valid for the above functional, with Definition 5.2 replacing Definition 2.2. From now on, we consider $\hat{D}$ to be defined for a finite-dimensional subspace of $W$ and will refer to the eigenvalue problem (4.2) where $\xi_i^0 \in W$.

Define the locally stationary space
$$\mathcal{K}_s^0 \equiv \left\{K \in \mathcal{K} : \forall K' \in \mathcal{K},\ \int_\Omega \omega_0\,\partial_K K' = 0\right\},$$ (5.18)
and the stationary space
$$\mathcal{K}_s \equiv \left\{K \in \mathcal{K} : \forall K' \in \mathcal{K},\ \partial_K K' = 0\right\}.$$ (5.19)
Both $\mathcal{K}_s^0$ and $\mathcal{K}_s$ are subalgebras of $\mathcal{K}$. Let $\Gamma_s^0$ and $\Gamma_s$ be the corresponding locally stationary and stationary subgroups of $\Gamma$. Evidently $\Gamma_s \subseteq \Gamma_s^0$ and we obtain:

Lemma 5.3. For each $\psi \in Y$ and $M \in \mathbb{R}^m$, the function
$$\gamma : \Gamma \to \hat{D}(M, \psi^\gamma)$$ (5.20)
is constant over $\Gamma_s \subseteq \Gamma$. If $\psi = \psi^0$ and $M = M_0$, then (5.20) is constant over $\Gamma_s^0 \supseteq \Gamma_s$.

Lemma 5.3 immediately yields Lemma 5.4 and Theorem 5.1.
Lemma 5.4. Let $\Phi$ be the space defined in Lemma 5.2. Then
$$\partial_{\mathcal{K}_s^0}\psi^0 \equiv \mathrm{Span}\{\partial_K\psi^0 : \forall K \in \mathcal{K}_s^0\} \subseteq \Phi.$$

Theorem 5.1. Assume there exists a nontrivial isometry group $\Gamma$ on $\Omega$. Let $\psi^0$ be a stationary or stationarily rotating solution, and assume that $\Gamma_s^0$ is a compact subgroup. Suppose there exists a finite-dimensional subspace of $W$ which satisfies the conditions of Theorem 4.1 and, in addition, $\Phi = \partial_{\mathcal{K}_s^0}\psi^0$. Then $\psi^0$ is stable in the sense of Definition 5.2.

Remark 5.2. Comparing the result of Lemma 5.4 with the condition on $\Phi$ given in Theorem 5.1 we obtain that
$$\partial_{\mathcal{K}_s^0}\psi^0 = \partial_{\mathcal{K}_s}\psi^0$$ (5.21)
is necessary for the conditions of Theorem 5.1 to hold. Condition (5.21) is satisfied, in particular, if $\psi^0$ is equivariant, i.e. $\partial_{\mathcal{K}_s^0}\psi^0 = \partial_{\mathcal{K}_s}\psi^0 = \partial_{\mathcal{K}}\psi^0 = \{0\}$.

Proof of Theorem 5.1. Let us define the hyperspace
$$Y_\perp \equiv \left\{\psi = \psi^0 + \phi \in Y : \forall\zeta \in \partial_{\mathcal{K}_s^0}\psi^0,\ \int_\Omega \nabla\phi\cdot\nabla\zeta\; dx_1 dx_2 = 0\right\}.$$
By Remark 5.2 and the conditions of the theorem we obtain that $\psi^0 \in Y_\perp$ is a local maximizer of $\hat{D}(M_0, \cdot)$ restricted to $Y_\perp$. Proceeding as in the proof of Theorem 3.1 we obtain the inequality (3.34) for any $\psi \in B_Y^\varepsilon \cap X \cap S_M \cap Y_\perp$. To end the proof we just have to show that, if $\psi$ satisfies
$$\inf_{\gamma\in\Gamma_s}\|\psi - \psi^{0,\gamma}\|_Y = \varepsilon,$$
then one can find $\gamma \in \Gamma_s$ so that $\psi^\gamma \in Y_\perp$ satisfies both
$$\hat{D}(M, \psi^\gamma) = \hat{D}(M, \psi)$$ (5.22)
and
$$\|\psi^\gamma - \psi^0\|_Y = \varepsilon,$$ (5.23)
after which one proceeds as in the proof of Theorems 3.1 and 4.1. To obtain (5.22, 5.23) we use Lemma 5.3 and the invariance of the inner product $\int_\Omega \nabla\phi\cdot\nabla\zeta\; dx_1 dx_2$ with respect to the group action.
6. Application to Channel Flows

Let us consider the Euler flow (1.1) in the channel
$$\Omega \equiv \{0 < y < \pi\}$$ (6.1)
subject to the boundary conditions $\psi(x, 0) = 0$ and $\psi(x, \pi) = \mathrm{const.}$ (arbitrary but fixed). Notice that this boundary condition corresponds to a prescribed flux through the channel. The analogous problem of prescribed circulation (see Sect. 1.1) can be studied in a similar way.
Let the velocity profile
$$v^0 = \{\gamma\cos(\gamma y),\ 0\}$$ (6.2)
correspond to the stream function $\psi^0 = \psi^0(y) = \sin(\gamma y)$ and to the vorticity $\omega_0 = -\gamma^2\sin(\gamma y)$. The relation
$$\frac{d\psi}{d\omega} = -\gamma^{-2}$$ (6.3)
follows. If $\gamma^2 < 1$ then Arnol’d’s stability theorem applies and the flow is nonlinearly stable. A similar result can be obtained if $\gamma^2 = 1$. If, however, $\gamma^2 > 1$ then the flow is known to be unstable, in the absence of additional restrictions on the allowable perturbations [N].

In [WG1], linearized stability was proved for any value of $\gamma$, provided the perturbed flow is required to be L-periodic in the x-direction and $L$ is sufficiently small. The arguments in [WG1] were based on Part (B) of the Main Result here, utilizing the fact that $\hat{W}$ is composed of all the functions of the form $\psi = \psi(y)$ in $Y$. Here we will use Part (A) to extend Arnol’d’s nonlinear stability result to an interval $\gamma \in (0, \gamma_c)$, where $\gamma_c > 1$, for L-periodic flow in the x-direction. The restriction on the x-period $L$ is the same as in [WG1; see Eq. (11) there] and is given more explicitly, for the case at hand here, as inequality (6.8) below. The eigenvalue problem corresponding to (2.15) takes the form
$$\Delta\phi + \mu\gamma^2\phi = 0$$ (6.4)
and the solutions are:
$$\phi_{jk} = e^{i\frac{2\pi}{L}kx}\sin(jy),$$ (6.5)
$$\mu_{jk} = \gamma^{-2}\left[j^2 + \left(\frac{2\pi k}{L}\right)^2\right]$$ (6.6)
for any pair of integers $\{j, k\}$. In order to apply Theorem 4.1, we need to avoid nonzonal eigenvalues ($\mu_{jk}$ with $k \ne 0$) in the interval $[0, 1]$. To do so, we fold our domain (6.1) into a cylinder
$$\{0 < y < \pi;\ x = x\ (\mathrm{mod}\ L)\},$$ (6.7)
where
$$L < \frac{2\pi}{\sqrt{\gamma^2 - 1}}.$$ (6.8)
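The choice (6.8) yields, $\forall k \in \mathbb{Z}$,
$$0 < \mu_0 \equiv \mu_{0,0} < \dots < \mu_n \equiv \mu_{n,0} < 1 < \mu_{n+1,k},$$ (6.9)
where
$$\frac{n}{\gamma} < 1 < \frac{n+1}{\gamma},$$
while $\mu_{jk} > 1$ for any $j \ge 0$ and $k \ne 0$. We consider the case $m = 1$ in Theorem 4.1 and choose
$$\xi_1^0 = \frac{\sin(\gamma y)}{\sqrt{\dfrac{2\gamma\pi - \sin(2\gamma\pi)}{4\gamma}}}.$$
This is the simplest possible choice, with $\hat\xi_1(\psi^0) = \psi^0$ up to the normalization constant. A straightforward calculation of $A(\nu)$ yields
$$A(\nu) = \frac{1}{\gamma(\nu - 1)}\left\{1 - 4\,\frac{\sin(\gamma\pi)\cos(\gamma\pi) - \sqrt{\nu}\,\sin^2(\gamma\pi)\cot(\sqrt{\nu}\,\gamma\pi)}{(1 - \nu)\,(2\gamma\pi - \sin(2\gamma\pi))}\right\} - 1.$$ (6.10)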
One should notice that $A(\nu)$ admits poles at the eigenvalues $\nu = j^2/\gamma^2 = \mu_{j,0}$ for any integer $j$. Hence $A$ satisfies the second condition of Theorem 4.1. Notice, however, that $A$ does not admit a singularity at $\nu = 1$ (unless $\gamma = 1$), even though it looks as if it does. To estimate the range of $\gamma > 1$ for which the stability of $\psi^0$ is guaranteed by Theorem 4.1 under the assumption (6.8), we substitute $\nu = 1$ in (6.10) to obtain the condition
$$A(1) > 0 \implies \frac{\pi^2\cot(\gamma\pi)}{2\gamma\pi - \sin(2\gamma\pi)} > 1.$$ (6.11)

Corollary 6.1. The stationary solution $\psi^0 = \alpha\sin(\gamma y)$ of (1.1) in the periodic channel given by (6.7) is nonlinearly stable in the sense of Definition 2.1, provided $\gamma \le 1$ with $L$ arbitrary, or provided $1 < \gamma < 3/2$ satisfies (6.11) and $L$ satisfies (6.8).

The conditions of Corollary 6.1 are stronger than the conditions for linearized stability obtained for the same problem in [WG1]. There we did not have to assume any limitation on $\gamma$, provided (6.8) holds. A natural question is the following: Is there really a number $\gamma > 1$ for which the zonal flow in a periodic channel subject to (6.8) is nonlinearly unstable, while being linearly stable according to [WG1]?

Acknowledgement. It is a pleasure to acknowledge interesting discussions with M. Mu (see Proposition 2.1). This work was supported by NSF Grant ATM 95-23787 and by Grant 94-00239/2 from the United States-Israel Binational Science Foundation. MG also wishes to acknowledge a Visiting Professorship at the Collège de France that provided extra time for the finishing touches. Constructive comments and additional references from an anonymous referee helped improve the presentation, and K. E. Hartman helped with the word processing and references. This is publication no. 4915 of UCLA’s Institute of Geophysics and Planetary Physics.
References

[Ab] Abarbanel, H.D.I., Holm, D.D., Marsden, J.E., Ratiu, T.: Nonlinear stability analysis of ideal stratified, incompressible fluid flow. Phys. Rev. Lett. 52, 2352–2355 (1984)
[An] Andrews, D.G.: On the existence of non-zonal flows satisfying sufficient conditions for stability. Geophys. Astrophys. Fluid Dyn. 28, 243–256 (1984)
[Ar1] Arnol’d, V.I.: Conditions for nonlinear stability of stationary plane curvilinear flows of an ideal fluid. Sov. Math. Dokl. 6, 773–776 (1965)
[Ar2] Arnol’d, V.I.: On an a priori estimate in the theory of hydrodynamical stability. Am. Math. Soc. Transl.: Series 2 79, 267–269 (1969)
[Ar3] Arnol’d, V.I.: Mathematical Methods of Classical Mechanics. New York: Springer-Verlag, 1980
[BM] Ball, J.M., Marsden, J.E.: Quasiconvexity at the boundary, positivity of the second variation and elastic stability. Arch. Rat. Mech. Anal. 86, 251–277 (1984)
[BKM] Ball, J.M., Knops, J.E., Marsden, J.E.: Two examples in nonlinear elasticity. Lec. Notes in Math. 665, 41–49 (1978)
[Bl] Blumen, W.: On the stability of quasigeostrophic flow. J. Atmos. Sci. 25, 929–931 (1968)
[BR] Batt, J., Rein, G.: A rigorous stability result for the Vlasov-Poisson system in three dimensions. Ann. Mat. Pura Appl. 164, 133–154 (1993)
[CS] Carnevale, G.F., Shepherd, T.G.: On the interpretation of Andrews’ theorem. Geophys. Astrophys. Fluid Dyn. 51, 1–17 (1990)
[Cas] Casimir, H.G.B.: Ueber die Konstruktion einer zu den irreduziblen Darstellungen halbeinfacher kontinuierlicher Gruppen gehörigen Differentialgleichung. Proc. R. Soc. Amsterdam 34, 844–846 (1931)
[CM] Chern, S.-J., Marsden, J.E.: A note on symmetry and stability for fluid flows. Geophys. Astrophys. Fluid Dyn. 51, 19–26 (1990)
[Ho] Holm, D.D.: Nonlinear stability of fluid and plasma equilibria. Phys. Reports 123 #1 & 2, 1–116 (1985)
[KM] Kandrup, H.E., Morrison, P.J.: Hamiltonian structure of the Vlasov-Einstein system and the problem of stability for spherical relativistic star clusters. Ann. Phys. 233, 114–166 (1993)
[L] Lewis, D.: Nonlinear stability of a rotating planar liquid drop. Arch. Rat. Mech. Anal. 106, 287–333 (1989)
[Ma] Marsden, J.E.: Lectures on Mechanics. Lond. Math. Soc. Lect. Note Ser. 174, Cambridge: Cambridge Univ. Press, 1992
[MS] McIntyre, M.E., Shepherd, T.G.: An exact local conservation theory for finite-amplitude disturbance to non-parallel shear flows, with remarks on Hamiltonian structure and on Arnol’d’s stability theorems. J. Fluid Mech. 181, 527–565 (1987)
[Mu] Mu, M.: Personal communication, Isaac Newton Institute, Cambridge (September 1996)
[N] Nycander, J.: Refutation of stability proofs for dipole vortices. Phys. Fluids A 4, 467–476 (1992)
[R] Ripa, P.: On the stability of elliptical vortex solutions of the shallow-water equations. J. Fluid Mech. 183, 343–363 (1987)
[RS] Reed, M., Simon, B.: Methods of Modern Mathematical Physics. Vol. IV, New York: Academic Press, 1978
[S] Shepherd, T.G.: Rigorous bounds on the nonlinear saturation of instabilities of parallel shear flows. J. Fluid Mech. 196, 291–322 (1988)
[SG1] Sakuma, H., Ghil, M.: Stability of stationary barotropic modons by Lyapunov’s direct method. J. Fluid Mech. 211, 393–416 (1990)
[SG2] Sakuma, H., Ghil, M.: Stability of propagating modons for small-amplitude perturbations. Phys. Fluids A 3, 408–414 (1991)
[SLM] Simo, J.C., Lewis, D.R., Marsden, J.E.: Stability of relative equilibria I: The reduced energy momentum method. Arch. Rat. Mech. Anal. 115, 15–59 (1991)
[SPM90] Simo, J.C., Posbergh, T.A., Marsden, J.E.: Stability of coupled rigid body and geometrically exact rods: Block diagonalization and the energy-momentum method. Phys. Repts. 193, 280–360 (1990)
[SPM91] Simo, J.C., Posbergh, T.A., Marsden, J.E.: Stability of relative equilibria II: Three dimensional elasticity. Arch. Rat. Mech. Anal. 115, 61–100 (1991)
[WG1] Wolansky, G., Ghil, M.: Stability of quasi-geostrophic flows in periodic channels. Phys. Lett. A 202, 111–116 (1995)
[WG2] Wolansky, G., Ghil, M.: An extension of Arnol’d’s second stability theorem for the Euler equation. Physica D 94, 161–167 (1996)

Communicated by J. L. Lebowitz