Commun. Math. Phys. 253, 1–24 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1190-8
Communications in
Mathematical Physics
On Deformation of Poisson Manifolds of Hydrodynamic Type
Luca Degiovanni (1), Franco Magri (2), Vincenzo Sciacca (3)
(1) Dottorato in Matematica, University of Torino, via C. Alberto 10, 10123 Torino, Italy.
(2) Department of Mathematics and Applications, University of Milano Bicocca, via degli Arcimboldi 8, 20126 Milano, Italy.
(3) Dottorato in Matematica, University of Palermo, via Archirafi 34, 90123 Palermo, Italy.
Received: 1 November 2000 / Accepted: 13 May 2004
Published online: 5 November 2004 – © Springer-Verlag 2004
Abstract: We study a class of deformations of infinite-dimensional Poisson manifolds of hydrodynamic type which are of interest in the theory of Frobenius manifolds. We prove two results. First, we show that the second cohomology group of these manifolds, in the Poisson-Lichnerowicz cohomology, is “essentially” trivial. Then, we prove a conjecture of B. Dubrovin about the triviality of homogeneous formal deformations of the above manifolds.

1. Dubrovin’s Conjecture

In this paper we solve a problem proposed by B. Dubrovin in the framework of the theory of Frobenius manifolds [2]. It concerns the deformations of Poisson tensors of hydrodynamic type. The challenge is to show that a large class of these deformations are trivial. In an epitomized form the problem can be stated as follows. Let M be a Poisson manifold endowed with a Poisson bivector $P_0$ fulfilling the Jacobi condition $[P_0, P_0] = 0$ with respect to the Schouten bracket on the algebra of multivector fields on M. A deformation of $P_0$ is a formal series

$$P_\epsilon = P_0 + \epsilon P_1 + \epsilon^2 P_2 + \cdots$$

in the space of bivector fields on M satisfying the Jacobi condition

$$[P_\epsilon, P_\epsilon] = 0 \tag{1}$$
Work sponsored by the Italian Ministry of Research under the project 40%: Geometry of Integrable Systems.
for any value of the parameter $\epsilon$. The deformation is trivial if there exists a formal diffeomorphism $\phi_\epsilon : M \to M$, admitting the Taylor expansion

$$\phi_\epsilon = \phi_0 + \epsilon \phi_1 + \epsilon^2 \phi_2 + \cdots,$$

which pulls back $P_\epsilon$ to $P_0$:

$$P_\epsilon = \phi_\epsilon^{*}(P_0).$$

Assume that the class of deformations $P_\epsilon$ and of diffeomorphisms $\phi_\epsilon$ is restricted by a set of additional conditions to be described below. The task is to prove that every allowed deformation is trivial, and to provide an explicit procedure to construct the trivializing map $\phi_\epsilon$ in the class of allowed transformations.

In the concrete form suggested by Dubrovin, the manifold M is very simple but the class of allowed deformations is rather large. That is the source of difficulty of the problem. Indeed, the manifold M is the space of $C^\infty$-maps $u^a(x)$ from $S^1$ into $\mathbb{R}^n$, and the bivector $P_0$ is of hydrodynamic type [1]. By using the so-called “flat coordinates” $u^a$ in $\mathbb{R}^n$, it can be written in the simple form

$$P_0 = g^{ab} \frac{d}{dx},$$
where the coefficients $g^{ab}$ are the entries of a constant, regular, symmetric n × n matrix (not necessarily positive definite). The allowed deformations $P_\epsilon$ are formal series of matrix-valued differential operators. The coefficient $P_k$ has degree k + 1, and is written in the form

$$P_k = A_0(u) \frac{d^{k+1}}{dx^{k+1}} + A_1(u; u_x) \frac{d^{k}}{dx^{k}} + \cdots + A_{k+1}(u; u_x, \dots, u_{k+1}).$$
The entries of the matrix coefficient $A_l$ are assumed to be homogeneous polynomials of degree l in the derivatives of the field functions $u^a(x)$. The degree of a polynomial is computed by attributing degree zero to the field functions, degree one to their first derivatives, degree two to the second derivatives, and so on. By this requirement the form of the operator $P_k$ is fixed up to the choice of an increasing number of arbitrary functions of the coordinates $u^a$. These functions, finally, must be chosen so as to guarantee that the operator $P_k$ is skewsymmetric,

$$P_k^{*} = -P_k \tag{2}$$
and that the Jacobi condition (1) is satisfied at the order k. This means that the first k operators $P_l$ must be chosen so as to verify the k conditions

$$\sum_{i+j=l} [P_i, P_j] = 0, \qquad l = 1, \dots, k,$$

or, explicitly,

$$2[P_0, P_1] = 0, \qquad 2[P_0, P_2] + [P_1, P_1] = 0, \qquad 2[P_0, P_3] + 2[P_1, P_2] = 0, \qquad \dots \tag{3}$$
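As a quick check of the list (3), the following sketch expands the Jacobi condition (1) order by order, writing $\epsilon$ for the deformation parameter. It assumes only the bilinearity of the Schouten bracket and its symmetry on bivectors, $[P_i, P_j] = [P_j, P_i]$; the truncation at third order is chosen here purely for illustration.

```latex
\begin{align*}
[P_\epsilon, P_\epsilon]
  &= \sum_{l \ge 0} \epsilon^{l} \sum_{i+j=l} [P_i, P_j] \\
  &= [P_0, P_0] + 2\epsilon\,[P_0, P_1]
   + \epsilon^{2} \bigl( 2[P_0, P_2] + [P_1, P_1] \bigr)
   + \epsilon^{3} \bigl( 2[P_0, P_3] + 2[P_1, P_2] \bigr)
   + O(\epsilon^{4}).
\end{align*}
```

Since $[P_0, P_0] = 0$ by assumption, requiring each coefficient of $\epsilon^l$ with $l \ge 1$ to vanish reproduces the conditions listed in (3).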
and so on. The conjecture of Dubrovin is that all these homogeneous deformations are trivial, and that the trivializing map is homogeneous as well.

To better understand the problem, let us consider the scalar-valued case. According to the rules of the game the first three coefficients of the deformation $P_\epsilon$ have the form

$$P_0 = \frac{d}{dx},$$

$$P_1 = A(u) \frac{d^2}{dx^2} + B(u) u_x \frac{d}{dx} + \bigl( C(u) u_{xx} + D(u) u_x^2 \bigr),$$

$$P_2 = E(u) \frac{d^3}{dx^3} + F(u) u_x \frac{d^2}{dx^2} + \bigl( G(u) u_{xx} + H(u) u_x^2 \bigr) \frac{d}{dx} + \bigl( L(u) u_{xxx} + M(u) u_{xx} u_x + N(u) u_x^3 \bigr).$$
They depend on eleven arbitrary functions of the coordinate u. By imposing the skewsymmetry condition (2) this number falls to four. Indeed we get the seven differential constraints

$$A = 0, \quad 2C = B, \quad 2D = B', \quad 2F = 3E', \quad 4L = 2G - E', \quad 4N = 2H' - E''', \quad 4M = 2G' + 4H - 3E''.$$

The remaining four functions are constrained by the Jacobi condition. To work out this condition we use the operator form of the Schouten bracket of two skewsymmetric operators P and Q. We need the following notations. We denote by $Q_u \alpha$ the value of the differential operator $Q_u$ on the argument $\alpha$, and by

$$Q'_u(\alpha; \dot u) = \frac{d}{ds} Q_{u + s \dot u}\, \alpha \Big|_{s=0}$$

its derivative along the vector field $\dot u$. The adjoint of this derivative with respect to $\dot u$ is denoted by $Q'^{*}_u(\alpha; \beta)$. It is defined by

$$\langle Q'_u(\alpha; \dot u), \beta \rangle = \langle \dot u, Q'^{*}_u(\alpha; \beta) \rangle,$$

where the pairing between vector fields and 1-forms is defined, as usual, by

$$\langle \dot u, \beta \rangle = \int_{S^1} \dot u(x) \beta(x)\, dx.$$
(Of course, in the vector-valued case we have to sum over the different components.) Then the Schouten bracket is given by [4]:

$$2[P, Q](\alpha, \beta) = P'_u(\alpha; Q_u \beta) - P'_u(\beta; Q_u \alpha) - Q_u \cdot P'^{*}_u(\alpha; \beta) + Q'_u(\alpha; P_u \beta) - Q'_u(\beta; P_u \alpha) - P_u \cdot Q'^{*}_u(\alpha; \beta).$$

In our example, the bivector $P_0$ is constant. Therefore, the first two Jacobi conditions (3) take the simple form:

$$P'_{1u}(\alpha; P_0 \beta) - P'_{1u}(\beta; P_0 \alpha) - P_0 \cdot P'^{*}_{1u}(\alpha; \beta) = 0$$
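and

$$P'_{2u}(\alpha; P_0 \beta) - P'_{2u}(\beta; P_0 \alpha) - P_0 \cdot P'^{*}_{2u}(\alpha; \beta) + P'_{1u}(\alpha; P_{1u} \beta) - P'_{1u}(\beta; P_{1u} \alpha) - P_{1u} \cdot P'^{*}_{1u}(\alpha; \beta) = 0$$

respectively. By expanding these operator conditions, we obtain two further relations

$$B = 0, \qquad 4H = 2G' + E'',$$

among the eleven functions (A(u), . . . , N(u)). Solving them and the previous ones we obtain that the first coefficients of $P_\epsilon$ are:

$$P_1 = 0, \tag{4}$$

$$P_2 = E(u) \frac{d^3}{dx^3} + \frac{3}{2} E'(u) u_x \frac{d^2}{dx^2} + G(u) u_{xx} \frac{d}{dx} + \frac{1}{2} \Bigl( G'(u) + \frac{1}{2} E''(u) \Bigr) u_x^2 \frac{d}{dx} + \frac{1}{2} \Bigl( G(u) - \frac{1}{2} E'(u) \Bigr) u_{xxx} + \Bigl( G'(u) - \frac{1}{2} E''(u) \Bigr) u_{xx} u_x + \frac{1}{4} \Bigl( G''(u) - \frac{1}{2} E'''(u) \Bigr) u_x^3. \tag{5}$$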
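The coefficients of Eq. (5) can be cross-checked by substituting the two Jacobi relations into the seven skewsymmetry constraints. The sketch below assumes the constraints with primes denoting derivatives with respect to u, namely $2F = 3E'$, $4L = 2G - E'$, $4M = 2G' + 4H - 3E''$, $4N = 2H' - E'''$, together with $B = 0$ and $4H = 2G' + E''$; this reading of the primes is inferred from homogeneity and is not stated explicitly in the text.

```latex
\begin{align*}
F &= \tfrac{3}{2} E', \\
H &= \tfrac{1}{2} G' + \tfrac{1}{4} E'', \\
L &= \tfrac{1}{2} G - \tfrac{1}{4} E', \\
M &= \tfrac{1}{4} \bigl( 2G' + 4H - 3E'' \bigr) = G' - \tfrac{1}{2} E'', \\
N &= \tfrac{1}{4} \bigl( 2H' - E''' \bigr) = \tfrac{1}{4} G'' - \tfrac{1}{8} E'''.
\end{align*}
```

Only E and G remain free, so the second-order coefficient depends on two arbitrary functions, in agreement with Eq. (5).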
Up to the second order in $\epsilon$, this is the most general homogeneous deformation in the scalar case. To check Dubrovin’s conjecture to the second order in $\epsilon$, it is enough to consider the homogeneous map

$$\phi_\epsilon(u) = u + \epsilon R(u) u_x + \epsilon^2 \bigl( S(u) u_{xx} + T(u) u_x^2 \bigr) + \cdots$$

and to use the operator form [4]

$$P_\epsilon = (\phi_\epsilon)'_u \cdot P_0 \cdot (\phi_\epsilon)'^{*}_u \tag{6}$$

of the transformation law for bivectors. As before, $(\phi_\epsilon)'^{*}_u$ denotes the adjoint operator of the Fréchet derivative of $\phi_\epsilon(u)$. By expanding Eq. (6), we find:

$$P_1 = 0,$$

$$P_2 = \bigl( 2S(u) - R^2(u) \bigr) \frac{d^3}{dx^3} + 3 \bigl( S'(u) - R(u) R'(u) \bigr) u_x \frac{d^2}{dx^2} + \bigl( 5S'(u) - 4T(u) - R(u) R'(u) \bigr) u_{xx} \frac{d}{dx} + \bigl( 3S''(u) - 2T'(u) - R(u) R''(u) - R'(u)^2 \bigr) u_x^2 \frac{d}{dx} + 2 \bigl( S'(u) - T(u) \bigr) u_{xxx} + 4 \bigl( S''(u) - T'(u) \bigr) u_{xx} u_x + \bigl( S'''(u) - T''(u) \bigr) u_x^3.$$
By comparison with Eq. (4) and Eq. (5), we realize that Dubrovin’s conjecture is true in the scalar case, up to the second order in $\epsilon$. In fact the choices

$$R(u) = 0, \qquad S(u) = \frac{1}{2} E(u), \qquad T(u) = \frac{5}{8} E'(u) - \frac{1}{4} G(u) \tag{7}$$
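The choices (7) can be verified coefficient by coefficient. The following sketch substitutes $R = 0$, $S = \tfrac{1}{2}E$, $T = \tfrac{5}{8}E' - \tfrac{1}{4}G$ into the seven coefficients of the transformed operator $P_2$ obtained from Eq. (6), assuming the primes (derivatives with respect to u) are placed as homogeneity requires.

```latex
\begin{align*}
2S - R^2 &= E, \\
3(S' - R R') &= \tfrac{3}{2} E', \\
5S' - 4T - R R' &= \tfrac{5}{2} E' - \tfrac{5}{2} E' + G = G, \\
3S'' - 2T' - R R'' - R'^2 &= \tfrac{3}{2} E'' - \tfrac{5}{4} E'' + \tfrac{1}{2} G'
   = \tfrac{1}{2} G' + \tfrac{1}{4} E'', \\
2(S' - T) &= \tfrac{1}{2} G - \tfrac{1}{4} E', \\
4(S'' - T') &= G' - \tfrac{1}{2} E'', \\
S''' - T'' &= \tfrac{1}{4} G'' - \tfrac{1}{8} E'''.
\end{align*}
```

Each right-hand side is the corresponding coefficient of Eq. (5), which is the sense in which the map $\phi_\epsilon$ trivializes the deformation to second order.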
allow us to reconstruct the diffeomorphism $\phi_\epsilon$ from the deformation $P_\epsilon$.

The questions now are: what happens at higher order in $\epsilon$, or in the matrix case? What is the meaning of the relations (7) connecting $P_\epsilon$ to $\phi_\epsilon$? Due to the great complexity of the computations, it is clear that any direct attack is beyond our reach. We have to devise an alternative approach. Our strategy is to convert the given problem into a problem in Poisson-Lichnerowicz cohomology. It is based on two remarks:

1. Poisson manifolds of hydrodynamic type are transversally constant.
2. The second cohomology group in the Poisson-Lichnerowicz cohomology of these manifolds is “essentially” trivial.

The first remark concerns the symplectic foliation associated with the Poisson bivector $P_0$. In our example, this foliation is rather regular. All the leaves are affine hyperplanes of codimension n. They are the level sets of n globally defined Casimir functions $C^a$, a = 1, 2, . . . , n. Furthermore there exists an abelian group of symplectic diffeomorphisms which transform the symplectic leaves among themselves.

The second remark concerns the bivectors Q fulfilling the condition $[P_0, Q] = 0$. They must be compared with the bivectors $Q = L_X(P_0)$ which are Lie derivatives of $P_0$ along any vector field X on M. The former are called 2-cocycles in the Poisson-Lichnerowicz cohomology defined by $P_0$ on M [3]. The latter are called 2-coboundaries. Not all cocycles are coboundaries. A first simple obstruction is the vanishing of the Poisson bracket of the Casimir functions $C^a$ with respect to Q:

$$\{C^a, C^b\}_Q = 0. \tag{8}$$
Further obstructions depend on the topology of the manifold. The main result of the paper is the proof, in §3, that these further obstructions are absent on a Poisson manifold of hydrodynamic type. By a combined use of ideas of the theory of transversally constant Poisson manifolds (suitably extended to infinite-dimensional manifolds) and of the operator approach to the inverse problem of the Calculus of Variations in the style of Volterra [9, 8], we show that every 2-cocycle verifying Eq. (8) is a 2-coboundary, and we give an explicit formula for the vector field X (called the potential of Q). Several examples of this result are shown in §4, where possible applications to the classification of bihamiltonian manifolds are also briefly discussed. Once equipped with this result, the conjecture of Dubrovin can be proved in a direct and simple way. First we notice that the homogeneous deformations pass the obstructions (8). Then we notice that the Jacobi condition $[P_\epsilon, P_\epsilon] = 0$ may be replaced by a
recursive system of cohomological equations. This leads to a simple general representation of the deformation $P_\epsilon$. The argument goes as follows. Consider the first Jacobi condition $[P_0, P_1] = 0$. It is already in a cohomological form. By the main result of §3, it follows that there exists a vector field $X_1$ such that $P_1 = L_{X_1}(P_0)$. By inserting this information into the second Jacobi equation $2[P_0, P_2] + [P_1, P_1] = 0$ we get a new cohomological equation

$$\Bigl[ P_0,\; P_2 - \frac{1}{2} L^2_{X_1}(P_0) \Bigr] = 0,$$

hence there exists a second vector field $X_2$ such that

$$P_2 = L_{X_2}(P_0) + \frac{1}{2} L^2_{X_1}(P_0).$$

By induction, one proves the existence of a sequence of vector fields $\{X_k\}_{k \in \mathbb{N}}$ such that all the coefficients $P_k$ of the deformation $P_\epsilon$ admit the representation

$$P_k = \sum_{j_1 + 2 j_2 + \cdots + k j_k = k} \frac{L^{j_k}_{X_k}}{j_k!}\, \frac{L^{j_{k-1}}_{X_{k-1}}}{j_{k-1}!} \cdots \frac{L^{j_1}_{X_1}}{j_1!}\, (P_0). \tag{9}$$
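To make the index set in Eq. (9) concrete, the following sketch writes out the first three coefficients. The enumeration of the solutions of $j_1 + 2j_2 + \cdots + k j_k = k$ is elementary; the ordering of the Lie derivatives (higher index to the left) is taken from Eq. (9) as read here.

```latex
\begin{align*}
k = 1:&\quad (j_1) = (1)
  &&\Rightarrow\; P_1 = L_{X_1}(P_0), \\
k = 2:&\quad (j_1, j_2) \in \{(2,0),\,(0,1)\}
  &&\Rightarrow\; P_2 = L_{X_2}(P_0) + \tfrac{1}{2} L^{2}_{X_1}(P_0), \\
k = 3:&\quad (j_1, j_2, j_3) \in \{(3,0,0),\,(1,1,0),\,(0,0,1)\}
  &&\Rightarrow\; P_3 = L_{X_3}(P_0) + L_{X_2} L_{X_1}(P_0)
      + \tfrac{1}{6} L^{3}_{X_1}(P_0).
\end{align*}
```

The cases k = 1 and k = 2 coincide with the two formulas derived from the first two Jacobi conditions.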
This result gives complete control of the deformations of Poisson brackets of hydrodynamic type. In particular it allows us to give the following simple proof of Dubrovin’s conjecture. Consider separately the different flows $\phi^{(k)}_{t_k}$ associated with the vector fields $X_k$. Give them a different weight by setting $t_k = \epsilon^k$, and make the ordered product of these flows by multiplying each flow by the subsequent one on the left. The result is the one-parameter family of diffeomorphisms

$$\phi_\epsilon = \prod_{k \ge 1} \phi^{(k)}_{\epsilon^k}.$$

It provides the solution we were looking for. Indeed, according to the theory of “Lie transform”, Eqs. (9) are equivalent to the transformation law

$$P_\epsilon = \phi_\epsilon^{*}(P_0)$$

as required. We believe that the strategy sketched above is of interest in itself, and that it can be profitably used in a more general context. In our opinion it can provide, for instance, new insights on the problem of classification of bihamiltonian manifolds associated with soliton equations.
2. Transversally Constant Poisson Manifolds

In this section we collect a few ideas of the theory of Poisson manifolds which are used later on to solve Dubrovin’s conjecture. Our interest is mainly centered around the difference between 2-cocycles and 2-coboundaries on a regular transversally constant Poisson manifold. We recall that a finite-dimensional Poisson manifold (M, P) is regular if the symplectic foliation defined by the Poisson bivector P has constant rank. Let k denote the corank of the foliation. It follows that, around any point of the manifold, there exist k functions $C^a$, a = 1, 2, . . . , k, which are independent and constant on each symplectic leaf. They are called Casimir functions. Their differentials $dC^a$ span the kernel of the bivector P. We also recall that the Poisson manifold is called transversally constant [7] if there exist k vector fields $Z_a$ which are transversal to the symplectic leaves and are symmetries of P:

$$L_{Z_a}(P) = 0.$$

Without loss of generality, one can assume that these vector fields satisfy the normalization conditions

$$Z_a(C^b) = \delta_a^b$$

with respect to the chosen family of Casimir functions. The local structure of a transversally constant Poisson manifold is quite simple: essentially it is the product of a symplectic leaf and of the abelian group generated by the vector fields $Z_a$. In particular, the tangent space at any point m can be split into the direct sum

$$T_m M = H_m \oplus V_m$$

of a “horizontal space” $H_m$ (the tangent space of the symplectic leaf) and of a “vertical space” $V_m$, spanned by the vector fields $Z_a$. This splitting induces a corresponding decomposition of the dual space and, hence, of any tensor field on M. For a bivector Q the basic elements are the vector fields

$$X^a = Q\, dC^a$$

and the horizontal bivector

$$Q_H = \pi_H \circ Q \circ \pi_H^{*},$$

where $\pi_H$ denotes, as usual, the canonical projection on H along V. A simple computation gives

$$Q_H = Q + X^a \wedge Z_a + \frac{1}{2} X^a(C^b)\, Z_a \wedge Z_b. \tag{10}$$
They already contain the clue to the distinction between 2-cocycles and 2-coboundaries.

Lemma 1. If Q is a cocycle, the vector fields $X^a$ are symmetries of P and $Q_H$ is a cocycle. If Q is a coboundary, the vector fields $X^a$ are Hamiltonian and $Q_H$ is a coboundary.
Proof. If Q is a cocycle we have

$$L_{Q\,dF}(P) + L_{P\,dF}(Q) = 0$$

for any function F. For $F = C^a$, this equation shows that $X^a$ is a symmetry of P. Hence

$$\Bigl[ P,\; X^a \wedge Z_a + \frac{1}{2} X^a(C^b)\, Z_a \wedge Z_b \Bigr] = 0$$

since both $X^a$ and $Z_a$ are symmetries of P and $X^a(C^b)$ is a Casimir function. This shows that $[P, Q_H] = 0$ as claimed. If $Q = L_X(P)$ is a coboundary, we find

$$Q\, dC^a = L_X(P)\, dC^a = L_X(P\, dC^a) - P\, dX(C^a) = -P\, dX(C^a),$$

showing that

$$Q\, dC^a = P\, dH^a \quad \text{with} \quad H^a = -X(C^a).$$

Therefore we find

$$X^a \wedge Z_a + \frac{1}{2} X^a(C^b)\, Z_a \wedge Z_b = L_Z(P) \quad \text{with} \quad Z = -H^a Z_a.$$

This proves the second part of the lemma.
The previous remark alone is sufficient for later applications. However, in view of adapting the result to the case of infinite-dimensional Poisson manifolds of hydrodynamic type, it is better to restate it in a different form. The idea is to trade multivectors for forms. To this end, we first split the vector fields $X^a$ into horizontal and vertical parts. Then, the components of the vertical parts are used to define the matrix

$$\{C^a, C^b\}_Q := X^a(C^b).$$

The horizontal parts $X^a_H$ are, instead, used to define k 1-forms $\theta^a$ living on the symplectic leaves. They are given by

$$\theta^a(X_F) = X^a_H(F), \tag{11}$$

where $X_F = P\, dF$ is the Hamiltonian vector field associated with the function F. Similarly, the horizontal bivector $Q_H$ is traded for a 2-form ω, living on the symplectic leaves, according to

$$\omega(X_F, X_G) = Q_H(dF, dG).$$

The outcome is that any bivector Q on a regular transversally constant Poisson manifold M is characterized by three elements:
1. the functions $\{C^a, C^b\}_Q$,
2. the 1-forms $\theta^a$,
3. the 2-form ω.

As a simple restatement of the previous lemma, we obtain the following result.

Lemma 2. If Q is a cocycle, $\{C^a, C^b\}_Q$ is a Casimir function, and the forms $\theta^a$ and ω are closed. If Q is a coboundary, the functions $\{C^a, C^b\}_Q$ vanish, and the forms $\theta^a$ and ω are exact.

We do not give the proof of this result, which can be found in [7]. Instead, for further convenience, we show its converse in the following form.

Lemma 3. If the functions $\{C^a, C^b\}_Q$ vanish,

$$\{C^a, C^b\}_Q = 0, \tag{12}$$

and the forms $\theta^a$ and ω are exact,

$$\theta^a = dH^a, \tag{13}$$

$$\omega = d\theta, \tag{14}$$

the bivector Q is a coboundary. Its potential X is given by

$$X = -H^a Z_a + P\theta. \tag{15}$$
Proof. The first assumption (12) entails that the vector fields $X^a$ are tangent to the symplectic leaves. Hence $X^a = X^a_H$. Thus the definition (11) and the second assumption (13) lead to $X^a = P\, dH^a$. Set $Z = -H^a Z_a$. As in the proof of Lemma 1 we get

$$Q = Q_H + L_Z(P). \tag{16}$$

Finally, we notice that the third assumption (14) entails

$$Q_H(dF, dG) = \omega(X_F, X_G) = d\theta(X_F, X_G) = L_{P\theta}(P)(dF, dG).$$

Hence the previous equation becomes

$$Q = L_{P\theta}(P) + L_Z(P) = L_X(P)$$

as claimed.
A difficulty is readily met in trying to extend the previous result to infinite-dimensional manifolds. It is connected to the definition (10) of the bivector $Q_H$ where the operation of the exterior product is used. We have found it difficult to extend this formula to the infinite-dimensional setting where vector fields and bivectors are represented by differential operators. To circumvent this difficulty, we can follow a two-step procedure, where the vector fields $X^a$ come first, and only later the bivector $Q_H$ is introduced as the complementary part of $L_Z(P)$ in the splitting (16) of Q. This detour leads to an “eight step algorithm” to check if a given bivector Q on a transversally constant Poisson manifold is a coboundary. The steps are:

1. Check that the functions $\{C^a, C^b\}_Q$ vanish.
2. Check that the vector fields $Q\, dC^a$ are Hamiltonian: $Q\, dC^a = P\, dH^a$.
3. Introduce the transversal vector field $Z = -H^a Z_a$.
4. Compute the Lie derivative of P along Z.
5. Define the horizontal bivector $Q_H$ according to: $Q_H = Q - L_Z(P)$.
6. Introduce the 2-form ω by factorizing $Q_H$ according to: $Q_H = P \circ \omega \circ P$.
7. Check that this form is exact on the symplectic leaves.
8. Compute its potential θ.
At the end of this long chain of tests, one can affirm that Q is a coboundary and construct its potential X according to Eq. (15). In the next section we shall display this procedure for manifolds of hydrodynamic type.

3. Poisson Manifolds of Hydrodynamic Type

Let now

$$P = P_0 = g^{ab} \frac{d}{dx}.$$

We notice that this bivector admits k globally defined Casimir functions

$$C^a(u) = \int_0^1 u^a(x)\, dx.$$
Therefore its symplectic leaves are affine hyperplanes and the manifold is regular. We also notice that the vector fields

$$Z_a : \quad \dot u^b = \delta_a^b$$

are globally defined transversal symmetries. Hence, the manifold is transversally constant as well. On this manifold we consider the class of bivectors Q which are represented by matrix-valued differential operators

$$Q = \sum_{k \ge 0} A_k(u, u_x, \dots) \frac{d^k}{dx^k} \tag{17}$$

and which satisfy the simple condition

$$\{C^a, C^b\}_Q = 0. \tag{18}$$
We stress that no homogeneity conditions are imposed on Q. So the present class of bivectors is bigger than that considered in Dubrovin’s conjecture. We shall prove

Proposition 1. Each cocycle Q in this class is a coboundary.

To appreciate the strength of this result, let us consider the case of a single loop function u(x). Condition (18) is automatically verified in this case, since there is only one Casimir function, and therefore we can conclude that every scalar-valued cocycle is a coboundary. This result is far from being trivial. Let us check this claim for the simple cocycle $P_2$ considered in §1. We have to exhibit a vector field

$$\dot u = X(u, u_x, u_{xx})$$

such that

$$-P_2 = -L_X(P_0) = X'_u \cdot P_0 + P_0 \cdot X'^{*}_u,$$

where $X'^{*}_u$ is the formal adjoint of the Fréchet derivative $X'_u$ of the operator defining the vector field X. A reasonable guess is to look for a homogeneous vector field

$$X = a(u) u_{xx} + b(u) u_x^2.$$

Since

$$X'_u = a(u) \frac{d^2}{dx^2} + 2 b(u) u_x \frac{d}{dx} + a'(u) u_{xx} + b'(u) u_x^2,$$

a simple computation leads to

$$-L_X(P_0) = 2a(u) \frac{d^3}{dx^3} + 3a'(u) u_x \frac{d^2}{dx^2} + \bigl( 5a'(u) - 4b(u) \bigr) u_{xx} \frac{d}{dx} + \bigl( 3a''(u) - 2b'(u) \bigr) u_x^2 \frac{d}{dx} + 2 \bigl( a'(u) - b(u) \bigr) u_{xxx} + 4 \bigl( a''(u) - b'(u) \bigr) u_x u_{xx} + \bigl( a'''(u) - b''(u) \bigr) u_x^3.$$
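The “simple computation” can be spelled out as follows. This is a sketch for the scalar case $P_0 = D$ with the shorthand $D = d/dx$ (introduced here only for brevity), using the Leibniz rule for operator composition and the formula $-L_X(P_0) = X'_u \cdot P_0 + P_0 \cdot X'^{*}_u$ quoted above, with primes denoting derivatives with respect to u.

```latex
\begin{align*}
X'_u &= a D^2 + 2 b u_x D + a' u_{xx} + b' u_x^2, \\
X'^{*}_u &= D^2 \circ a - D \circ (2 b u_x) + a' u_{xx} + b' u_x^2 \\
  &= a D^2 + 2 (a' - b) u_x D + 2 (a' - b) u_{xx} + (a'' - b') u_x^2, \\[4pt]
X'_u \cdot D &= a D^3 + 2 b u_x D^2 + \bigl( a' u_{xx} + b' u_x^2 \bigr) D, \\
D \cdot X'^{*}_u &= a D^3 + (3a' - 2b) u_x D^2
   + \bigl( 4 (a' - b) u_{xx} + 3 (a'' - b') u_x^2 \bigr) D \\
  &\quad + 2 (a' - b) u_{xxx} + 4 (a'' - b') u_x u_{xx} + (a''' - b'') u_x^3 .
\end{align*}
```

Adding the last two lines gives the coefficients $2a$, $3a' u_x$, $(5a' - 4b) u_{xx} + (3a'' - 2b') u_x^2$ for $D^3$, $D^2$, $D$, and the stated zero-order term.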
The problem is solved by noticing that the relations

$$a(u) = -\frac{1}{2} E(u), \qquad b(u) = \frac{1}{4} G(u) - \frac{5}{8} E'(u)$$

allow us to identify the operator $L_X(P)$ with $P_2$, for any choice of the functions E(u) and G(u) (see Eq. (5)), as claimed in Proposition 1. In this section we shall show that the above relations are simply an instance of the general formula (15), defining the potential X of any coboundary of a transversally constant Poisson manifold. The main difficulty is to identify the geometrical objects (the vector fields $Q\, dC^a$, the 1-forms $\theta^a$, and the 2-form ω) to be associated with each bivector of the form (17). To this end, it is useful to split the operator Q into the sum of three operators. The first operator has order zero. Therefore it is simply a skewsymmetric matrix E, whose entries are functions of the loops $u^a(x)$ and of their derivatives. The second operator has order one. It is written as the anticommutator $S \cdot \frac{d}{dx} + \frac{d}{dx} \cdot S$ of a symmetric matrix S with $\frac{d}{dx}$. The third operator, finally, collects all the higher order terms.

Lemma 4. Any bivector Q can be uniquely written in the form:

$$Q = E + S \cdot \frac{d}{dx} + \frac{d}{dx} \cdot S + \frac{d}{dx} \cdot \Phi \cdot \frac{d}{dx}, \tag{19}$$

where Φ is the skewsymmetric operator

$$\Phi = \sum_{k \ge 0} \Bigl( \Phi_k \cdot \frac{d^k}{dx^k} + \frac{d^k}{dx^k} \cdot \Phi_k \Bigr).$$

The coefficients $\Phi_k$ of this operator are alternatively symmetric and skewsymmetric matrices, according to the order of the derivatives.
This lemma is very simple to prove, but it is interesting because each term in the splitting (19) has a geometrical meaning. Roughly speaking, the first term E controls the brackets $\{C^a, C^b\}_Q$, the second term controls the 1-forms $\theta^a$, and the third term controls the 2-form ω. By using this representation formula we can now work out the “eight step algorithm” stated at the end of the previous section.

Step 1. The vanishing of the functions $\{C^a, C^b\}_Q$. Since the differentials of the Casimir functions are the constant matrices

$$\frac{\delta C^a}{\delta u^b} = \delta^a_b$$

we easily find

$$\{C^a, C^b\}_Q = \int_0^1 E^{ab}\, dx,$$

where $E^{ab}$ is the entry of place (a, b) in the matrix E. Therefore, condition (18) holds iff there exists a second skewsymmetric matrix $\mathcal{E}$ such that

$$E = \frac{d}{dx}(\mathcal{E}).$$

Writing this condition in the commutator form

$$E = \frac{d}{dx} \cdot \mathcal{E} - \mathcal{E} \cdot \frac{d}{dx},$$

we can easily eliminate E from the representation (19) of Q. Setting $B = \mathcal{E} + S$ we get

$$Q = B^t \cdot \frac{d}{dx} + \frac{d}{dx} \cdot B + \frac{d}{dx} \cdot \Phi \cdot \frac{d}{dx}.$$

Finally we replace the differential operator $\frac{d}{dx}$ by the Poisson bivector

$$P = G \cdot \frac{d}{dx}.$$

We then arrive at the following useful second representation theorem.

Lemma 5. Each bivector Q for which $\{C^a, C^b\}_Q = 0$ can be uniquely represented in the form

$$Q = A^t \cdot P + P \cdot A + P \cdot \Psi \cdot P, \tag{20}$$

where A is the n × n matrix and Ψ is the skewsymmetric differential operator given by:

$$B = G \cdot A, \qquad \Phi = G \cdot \Psi \cdot G.$$
Step 2. The vector fields $Q\, dC^a$ are Hamiltonian. Since $\{C^a, C^b\}_Q = 0$ we know that the vector fields $Q\, dC^a$ are tangent to the symplectic leaves of P. Therefore there exist 1-forms $\theta^a$ such that $Q\, dC^a = P \theta^a$. From the representation theorem we easily recognize that the 1-form $\theta^a$ is given by the a-th column of the matrix A. So the component b of the 1-form $\theta^a$ is the entry $A^a_b$ of place (a, b) of the matrix A:

$$\theta^a_b = A^a_b.$$

We further know that the vector fields $Q\, dC^a$ are symmetries of P by the cocycle condition. If we work out explicitly the condition

$$L_{Q\, dC^a}(P) = 0 \tag{21}$$

in the operator formalism we have

$$-L_{P\theta^a}(P) = P \cdot \theta^{a\prime} \cdot P - P \cdot \theta^{a\prime *} \cdot P = P \cdot (\theta^{a\prime} - \theta^{a\prime *}) \cdot P = 0.$$

This is the same as writing

$$\frac{d}{dx} \cdot (\theta^{a\prime} - \theta^{a\prime *}) \cdot \frac{d}{dx} = 0. \tag{22}$$

Let us expand the differential operator $\theta^{a\prime} - \theta^{a\prime *}$ in powers of $\frac{d}{dx}$:

$$\theta^{a\prime} - \theta^{a\prime *} = A_0 + A_1 \cdot \frac{d}{dx} + \cdots + A_n \cdot \frac{d^n}{dx^n}.$$

Substituting into the previous equation we obtain

$$\frac{d}{dx} \cdot (\theta^{a\prime} - \theta^{a\prime *}) \cdot \frac{d}{dx} = A_{0x} \cdot \frac{d}{dx} + (A_0 + A_{1x}) \cdot \frac{d^2}{dx^2} + \cdots + (A_{n-1} + A_{nx}) \cdot \frac{d^{n+1}}{dx^{n+1}} + A_n \cdot \frac{d^{n+2}}{dx^{n+2}},$$

showing that condition (22) can be verified iff $\theta^{a\prime} - \theta^{a\prime *} = 0$. This means that the Fréchet derivative of the operator $\theta^a$ is symmetric and, therefore, that this operator is a potential operator [6]. In geometric language this means that the 1-form $\theta^a$ is closed and therefore exact, since the topology of the manifold M is simple. The potential is the functional

$$H^a = \int_0^1 h^a(u, u_x, \dots)\, dx,$$

where, according to [5],

$$h^a = \int_0^1 A^a_b(\lambda u, \lambda u_x, \dots)\, u^b\, d\lambda. \tag{23}$$

We have thus proved that the vector fields $Q\, dC^a$ are Hamiltonian.
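Two scalar instances may help to see how the homotopy formula (23) operates. The choices $A = u$ and $A = u_{xx}$ below are illustrative and are not taken from the text; both have a symmetric Fréchet derivative (multiplication by 1 and $d^2/dx^2$ respectively), so they fall under the hypothesis of Step 2. The Euler operator is written $\delta/\delta u = \partial_u - D \partial_{u_x} + D^2 \partial_{u_{xx}} - \cdots$ with $D = d/dx$.

```latex
\begin{align*}
A = u : \quad
  & h = \int_0^1 (\lambda u)\, u \, d\lambda = \tfrac{1}{2} u^2,
  && \frac{\delta h}{\delta u} = u = A, \\[4pt]
A = u_{xx} : \quad
  & h = \int_0^1 (\lambda u_{xx})\, u \, d\lambda = \tfrac{1}{2} u\, u_{xx},
  && \frac{\delta h}{\delta u}
     = \tfrac{1}{2} u_{xx} + D^2 \bigl( \tfrac{1}{2} u \bigr) = u_{xx} = A .
\end{align*}
```

In both cases the density h reproduces A under the Euler operator, which is the property used to conclude that $Q\, dC^a$ is Hamiltonian.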
Step 3. The transversal vector field Z. We choose the transversal vector field

$$Z = -h^a(u, u_x, \dots)\, Z_a \tag{24}$$

(sum over the repeated index a). We notice that by this choice we depart slightly from the geometrical scheme. According to the third step of §2 we should have introduced at this point the vector field

$$\hat Z = -\Bigl( \int_0^1 h^a(u, u_x, \dots)\, dx \Bigr) Z_a$$

whose components are the functionals $H^a$, instead of the associated densities $h^a$. The change is permitted since $Z(C^a) = \hat Z(C^a)$ and so the functions $C^a$ are Casimir functions also for $Q - L_Z P$. This fact allows us to still define a 2-form ω but in general this is different from the one associated with the previous “horizontal” part of the bivector Q. Our choice has the advantage that the vector field Z is local.

Step 4. The Lie derivative $L_Z(P)$. The next step is to compute the Lie derivative of P along Z. In the operator formalism this is easily accomplished if we know the Fréchet derivative $Z'_u$ of the vector field Z. This derivative is a matrix differential operator. A key property is that the zero-order term of this operator is the transpose of the matrix A defining the 1-form $\theta^a$.

Lemma 6. The Fréchet derivative $Z'_u$ of the vector field Z may be uniquely represented as the difference

$$Z'_u = -A^t + P \cdot R \tag{25}$$

of the transpose of the matrix A and of a factorized differential operator P · R, taking into account all the higher-order terms appearing in $Z'_u$.

Proof. The identity (25) is nothing else but a disguised form of the Lagrange identity

$$(\alpha, Z'_u \varphi) - (\varphi, Z'^{*}_u \alpha) = \frac{d}{dx} B(\alpha, \varphi) \tag{26}$$

used to define the formal adjoint of the operator $Z'_u$. In this identity α and φ are arbitrary, and the bracket denotes the usual scalar product in $\mathbb{R}^n$. We notice that, by Eq. (24), the vector $Z'^{*}_u(e_l)$ is the opposite of the Euler operator associated with the lagrangian density $h^l$,

$$Z'^{*}_u(e_l) = -\frac{\delta h^l}{\delta u},$$

and we write the identity (26), for $\alpha = e_l$, in the operator form

$$-h^{l\prime}(\varphi) + \sum_{b=1}^{n} \varphi^b \frac{\delta h^l}{\delta u^b} = \frac{d}{dx} B(e_l, \varphi),$$

where $h^{l\prime}$ is the Fréchet derivative of the scalar differential operator $h^l$. One easily recognizes in this equation the identity (25) by recalling that

$$A^l_b = \frac{\delta h^l}{\delta u^b}.$$
The identity (25) allows us to perform the fourth step in our program rather easily. By using once again the operator form of the Lie derivative of P we obtain

$$L_Z(P) = -Z'_u \cdot P - P \cdot Z'^{*}_u = A^t \cdot P + P \cdot A + P \cdot (R^{*} - R) \cdot P.$$

Steps 5 and 6. The horizontal bivector $Q_H$ and the 2-form ω. By subtracting this identity from the basic representation formula (20), we obtain

$$Q = L_Z(P) + P \cdot (\Psi + R - R^{*}) \cdot P = L_Z(P) + P \cdot \Omega \cdot P. \tag{27}$$

It allows us to identify the 2-form ω with the restriction to the symplectic leaves of P of the differential operator $-\Omega$, where

$$\Omega = \Psi + R - R^{*}$$

is defined on M. The explicit computation of this form is algorithmic, as shown by the examples given in the next section. We can thus conclude that we have a systematic procedure to identify the 2-form ω.

Step 7. The 2-form ω is exact. The last steps are now performed along a well-established path. The closure of the 2-form follows from the cocycle condition [P, Q] = 0. By using the operator form of this condition we obtain:

$$[P, Q](\alpha, \beta, \gamma) = [P, P \cdot \Omega \cdot P](\alpha, \beta, \gamma) = \langle \alpha, P \cdot \Omega'_u(P\beta; P\gamma) \rangle + \langle \beta, P \cdot \Omega'_u(P\gamma; P\alpha) \rangle + \langle \gamma, P \cdot \Omega'_u(P\alpha; P\beta) \rangle$$

$$= \langle \alpha, P \cdot [\Omega'_u(P\beta; P\gamma) - \Omega'_u(P\gamma; P\beta) + \Omega'^{*}_u(P\beta; P\gamma)] \rangle = 0.$$

Therefore Q is a cocycle iff Ω verifies the equation

$$P \cdot [\Omega'_u(P\beta; P\gamma) - \Omega'_u(P\gamma; P\beta) + \Omega'^{*}_u(P\beta; P\gamma)] = 0 \tag{28}$$

for any choice of the arguments β and γ. Let us fix β. We can regard the previous equation as a differential equation on γ of the form

$$\frac{d}{dx} \cdot T \cdot \frac{d}{dx} \gamma = 0,$$

where T is a suitable differential operator depending on β. By the argument already used in discussing Eq. (22) we see that this equation can be verified by any γ only if T = 0. This gives rise to a new differential system on β. Once again it can be satisfied by any β only if the equations are identically vanishing. Thus we conclude that the operator Eq. (28) holds iff

$$\Omega'_u(\varphi; \psi) - \Omega'_u(\psi; \varphi) + \Omega'^{*}_u(\varphi; \psi) = 0$$

for any choice of the arguments φ and ψ. This is the closure condition for the 2-form Ω.
Step 8. The potential θ. Since we are working on a manifold with simple topology, by the Poincaré lemma we can affirm the existence of a 1-form θ such that ω = dθ. In our particular context the 1-form θ may be represented as a vector-valued differential operator $\theta = \theta(u, u_x, \dots)$ and its exactness, in operator formalism, may be explicitly written as

$$\Omega = \theta'^{*}_u - \theta'_u, \tag{29}$$

where $\theta'_u$ is, as usual, the Fréchet derivative of the 1-form θ. As in the finite-dimensional case, the operator θ can be reconstructed from Ω by a quadrature. The formula

$$\theta = -\int_0^1 \Omega_{\lambda u}(\lambda u)\, d\lambda \tag{30}$$

means that we must apply the matrix differential operator Ω, evaluated at the point λu, to the vector λu itself. Then we must integrate, term by term, the resulting vector-valued differential operator $\Omega_{\lambda u}(\lambda u)$, depending on λ, on the interval [0, 1]. Applications of this formula will be given in the next section.

We have finally achieved our goal. By inserting the representation (29) of the 2-form Ω in the representation formula (27) of the cocycle Q we obtain

$$Q = L_Z(P) + P \cdot (\theta'^{*}_u - \theta'_u) \cdot P = L_Z(P) + L_{P\theta}(P) = L_{Z + P\theta}(P),$$

showing that the cocycle Q is a coboundary. Furthermore we obtain the explicit formula

$$X = Z + P\theta \tag{31}$$
for the potential X of Q, as in the finite-dimensional case. The proposition stated at the beginning of this section is thus completely proved.

To prepare the discussion on Dubrovin’s conjecture, to be performed in the final section, it remains to understand what relation connects the class of homogeneous bivectors considered by Dubrovin (and described in §1) to the class of bivectors considered in this section.

Lemma 7. The class of Dubrovin’s cocycles is strictly contained in the present class of cocycles.

Proof. The point is to show that the homogeneity assumption (together with the cocycle condition) entails the involutivity condition (18) used to define our class of cocycles. To prove this result we exploit the well-known property that, for every cocycle, the bracket $\{C^a, C^b\}_Q$ is still a Casimir function of P. This means that $P\, d(\{C^a, C^b\}_Q) = 0$, that is

$$\frac{d}{dx} \frac{\delta E^{ab}}{\delta u^l} = 0.$$

Let us write

$$\frac{\delta E^{ab}}{\delta u^l} = A^{ab}_l. \tag{32}$$

By the above condition the functions $A^{ab}_l$ are constant, and therefore

$$E^{ab} = A^{ab}_l u^l + \frac{d}{dx} K^{ab}.$$

Accordingly

$$\{C^a, C^b\}_Q = \int_0^1 A^{ab}_l u^l\, dx.$$

The homogeneity condition of Dubrovin entails $A^{ab}_l = 0$, since $E^{ab}$ should have at least degree one. So the combined action of the cocycle and of the homogeneity condition entails the involutivity (18), as required.
We finally notice that the vector field Z and the 1-form θ associated with the homogeneous cocycle Q are themselves homogeneous operators, due to Eq. (23) and Eq. (30). Thus we can end our discussion by stating the following proposition.

Proposition 2. All cocycles in Dubrovin's class are coboundaries, and their potentials are homogeneous operators.

In our opinion, this is the deep reason for the validity of Dubrovin's conjecture.

4. Three Examples

As the first example we consider again the homogeneous third-order scalar differential operators (5). We have already shown that they are coboundaries by guessing the form of the vector field X, inside the class of homogeneous vector fields. Presently we want to rediscover systematically this vector field, by using the previous algorithm. We recall the main steps of this approach.

1. The starting point is the representation formula $Q = A^t\cdot P + P\cdot A + P\cdot\Omega\cdot P$. It allows us to identify the matrix A and the 2-form Ω.
2. The next step is to exploit the pieces of information encoded into the matrix A. Its columns are exact 1-forms, and their potentials are the Hamiltonians with Lagrangian densities $h^a$. They allow us to define the transversal vector field $Z = -h^a Z_a$.
3. The Fréchet derivative $Z'$ of this vector field is the last object to be analyzed. Through the second representation formula $Z' = -A^t + P\cdot R$, it allows us to identify the operator R, which defines the deformation $\Omega' = \Omega + R - R^*$ of Ω we are interested in.
4. At this point we compute the potential θ of the 2-form Ω'. The vector field X is given by the formula $X = -H^a Z_a + P\theta$.

For the example at hand, the first representation formula reads
\[ P_2 = [2\alpha(u)u_{xx} + \alpha'(u)u_x^2]\cdot\frac{d}{dx} + \frac{d}{dx}\cdot[2\alpha(u)u_{xx} + \alpha'(u)u_x^2] + \frac{d}{dx}\cdot\Big[\beta(u)\cdot\frac{d}{dx} + \frac{d}{dx}\cdot\beta(u)\Big]\cdot\frac{d}{dx}, \]
where α(u) and β(u) are related to the previous coefficients E(u) and G(u) according to
\[ \alpha(u) = \frac14\Big[G(u) - \frac12 E'(u)\Big], \qquad \beta(u) = \frac12 E(u). \]
Since $P = \frac{d}{dx}$, the "matrix" A is simply the scalar function
\[ A(u; u_x, u_{xx}) = 2\alpha(u)u_{xx} + \alpha'(u)u_x^2 . \]
We recognize in this expression the Euler operator associated with the Lagrangian density $h(u; u_x) = -\alpha(u)u_x^2$. Consequently the vector field Z is given by
\[ Z(u; u_x) = \alpha(u)u_x^2 = \frac14\Big[G(u) - \frac12 E'(u)\Big]u_x^2 . \]
Its Fréchet derivative is
\[ Z'_u = -2\alpha(u)u_{xx} - \alpha'(u)u_x^2 + 2\frac{d}{dx}\cdot[\alpha(u)u_x], \]
and therefore
\[ Z'_u + A^t = 2\frac{d}{dx}\cdot[\alpha(u)u_x]. \]
In this way we obtain the operator $R = 2\alpha(u)u_x$. Since $R^* = R$, the 2-form Ω' is simply given by
\[ \Omega' = \beta(u)\cdot\frac{d}{dx} + \frac{d}{dx}\cdot\beta(u). \]
This form is exact and its potential is given by
\[ \theta(u; u_x) = -\beta(u)u_x = -\frac12 E(u)u_x . \]
It can be computed either by direct inspection or by using the equation
\[ \theta = -\int_0^1 \Omega'_{\lambda u}(\lambda u)\,d\lambda = -\int_0^1 \Big[\beta(\lambda u)\frac{d}{dx}(\lambda u) + \frac{d}{dx}\big(\beta(\lambda u)\,\lambda u\big)\Big]d\lambda . \]
Finally, the vector field X is given by
\[ X = Z + P\theta = -\frac12 E(u)u_{xx} + \Big[\frac14 G(u) - \frac58 E'(u)\Big]u_x^2 . \]
It coincides with the vector field already obtained at the beginning of §3.
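As a consistency check on this first example, the Euler operator of the Lagrangian density and the final vector field can be recomputed by hand. The sketch below assumes the readings $\alpha = \frac14(G - \frac12 E')$, $\beta = \frac12 E$, $\theta = -\beta u_x$ of the relations above, and uses $D = d/dx$ as a shorthand introduced here.

```latex
\begin{align*}
h = -\alpha(u)\,u_x^2 :\quad
\frac{\delta h}{\delta u}
  &= \frac{\partial h}{\partial u} - D\,\frac{\partial h}{\partial u_x}
   = -\alpha' u_x^2 + D\bigl(2\alpha u_x\bigr)
   = 2\alpha u_{xx} + \alpha' u_x^2 = A, \\
P\theta &= D\bigl(-\tfrac12 E\,u_x\bigr)
   = -\tfrac12 E\,u_{xx} - \tfrac12 E' u_x^2, \\
X = Z + P\theta
  &= \bigl(\tfrac14 G - \tfrac18 E'\bigr)u_x^2 - \tfrac12 E\,u_{xx} - \tfrac12 E' u_x^2
   = -\tfrac12 E\,u_{xx} + \bigl(\tfrac14 G - \tfrac58 E'\bigr)u_x^2 .
\end{align*}
```

The coefficient $\frac58 = \frac18 + \frac12$ of $E'$ in the last line is what fixes the bracketing of α used in this check.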
As the second example let us consider the non-homogeneous third-order scalar differential operator
\[ Q = \frac{d^3}{dx^3} + 2u\frac{d}{dx} + u_x . \tag{33} \]
It is the well-known second Hamiltonian operator of the KdV hierarchy. By writing Q in the form
\[ Q = u\cdot\frac{d}{dx} + \frac{d}{dx}\cdot u + \frac{d}{dx}\cdot\frac{d}{dx}\cdot\frac{d}{dx} \]
we identify
\[ A(u) = u, \qquad \Omega = \frac{d}{dx}. \]
Once again, we recognize in A the Euler operator associated with the Lagrangian density
\[ h(u) = \frac12 u^2 . \]
Consequently
\[ Z = -\frac12 u^2 . \]
Its Fréchet derivative verifies the equation
\[ Z' + A^t = -u + u = 0 . \]
Hence R = 0, and
\[ \Omega' = \frac{d}{dx}. \]
The potential θ of this 2-form is
\[ \theta = -\frac12 u_x . \]
The potential X associated with the operator Q is then
\[ X = -\frac12 u^2 - \frac12 u_{xx} . \tag{34} \]
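The result (34) can be checked directly. The sketch below assumes that, for the constant bivector $P = D = d/dx$, the Lie derivative reduces to $L_X(P) = -(X'\cdot P + P\cdot X'^{*})$; this sign convention is inferred here from the representation formulas of the algorithm and is not quoted from the text.

```latex
\begin{align*}
X &= -\tfrac12 u^2 - \tfrac12 u_{xx}
  \;\Longrightarrow\;
  X' = -u - \tfrac12 D^2 = X'^{*}, \\
L_X(P) &= -\bigl(X'\cdot D + D\cdot X'^{*}\bigr)
  = \bigl(u + \tfrac12 D^2\bigr)D + D\bigl(u + \tfrac12 D^2\bigr) \\
  &= uD + \tfrac12 D^3 + uD + u_x + \tfrac12 D^3
  = D^3 + 2uD + u_x = Q .
\end{align*}
```

Under the stated convention the vector field (34) therefore reproduces the operator (33).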
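The final example concerns a non-homogeneous matrix-valued bivector Q. We consider the pair of Poisson bivectors
\[ P = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\frac{d}{dx} \tag{35} \]
and
\[ Q = \begin{pmatrix}
-\dfrac{d^3}{dx^3} + 2u\dfrac{d}{dx} + u_x & 3v\dfrac{d}{dx} + 2v_x \\[2mm]
3v\dfrac{d}{dx} + v_x & \dfrac{d^5}{dx^5} - \dfrac{10}{3}u\dfrac{d^3}{dx^3} - 5u_x\dfrac{d^2}{dx^2} + \Big(\dfrac{16}{3}u^2 - 3u_{xx}\Big)\dfrac{d}{dx} + \Big(\dfrac{16}{3}uu_x - \dfrac23 u_{xxx}\Big)
\end{pmatrix} \tag{36} \]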
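associated with the Boussinesq hierarchy. It is well-known that P is a coboundary of Q. Indeed P is the Lie derivative of Q along the vector field
\[ X_{-1}: \quad \dot u = 0, \qquad \dot v = \tfrac12 . \]
Our aim is presently to show that Q is a coboundary of P, and we want to compute explicitly its potential $X_1$: $Q = L_{X_1}(P)$. The boring part is the identification of the matrix A and of the operator Ω. We start from the representation formula
\[ Q = E + S\cdot\frac{d}{dx} + \frac{d}{dx}\cdot S + \frac{d}{dx}\cdot\Big[\Omega_0 + \Big(\Omega_1\cdot\frac{d}{dx} + \frac{d}{dx}\cdot\Omega_1\Big) + \Big(\Omega_2\cdot\frac{d^2}{dx^2} + \frac{d^2}{dx^2}\cdot\Omega_2\Big) + \Big(\Omega_3\cdot\frac{d^3}{dx^3} + \frac{d^3}{dx^3}\cdot\Omega_3\Big)\Big]\cdot\frac{d}{dx} \tag{37} \]
valid for any fifth-order bivector. By comparison with Eq. (36), we obtain
\[ E = \begin{pmatrix} 0 & \frac12 v_x \\ -\frac12 v_x & 0 \end{pmatrix}, \qquad
S = \begin{pmatrix} u & \frac32 v \\ \frac32 v & \frac83 u^2 - \frac23 u_{xx} \end{pmatrix}, \]
\[ \Omega_0 = 0, \qquad
\Omega_1 = \begin{pmatrix} -\frac12 & 0 \\ 0 & -\frac53 u \end{pmatrix}, \qquad
\Omega_2 = 0, \qquad
\Omega_3 = \begin{pmatrix} 0 & 0 \\ 0 & \frac12 \end{pmatrix}. \]
We notice that E is a total derivative with respect to x, guaranteeing that $\{C^a, C^b\}_Q = 0$. Since
\[ E = \begin{pmatrix} 0 & -\frac12 v \\ \frac12 v & 0 \end{pmatrix}\cdot\frac{d}{dx} + \frac{d}{dx}\cdot\begin{pmatrix} 0 & \frac12 v \\ -\frac12 v & 0 \end{pmatrix} \tag{38} \]
we obtain the first representation formula of the operator Q,
\[ Q = \begin{pmatrix} u & v \\ 2v & \frac83 u^2 - \frac23 u_{xx} \end{pmatrix}\cdot\frac{d}{dx}
 + \frac{d}{dx}\cdot\begin{pmatrix} u & 2v \\ v & \frac83 u^2 - \frac23 u_{xx} \end{pmatrix}
 + \frac{d}{dx}\cdot\begin{pmatrix} -\frac{d}{dx} & 0 \\ 0 & \frac{d^3}{dx^3} - \frac{10}{3}u\frac{d}{dx} - \frac53 u_x \end{pmatrix}\cdot\frac{d}{dx}. \]
Replacing the operator $\frac{d}{dx}$ by the bivector P, we obtain the second representation formula
\[ Q = \begin{pmatrix} v & u \\ \frac83 u^2 - \frac23 u_{xx} & 2v \end{pmatrix}\cdot P
 + P\cdot\begin{pmatrix} v & \frac83 u^2 - \frac23 u_{xx} \\ u & 2v \end{pmatrix}
 + P\cdot\begin{pmatrix} \frac{d^3}{dx^3} - \frac{10}{3}u\frac{d}{dx} - \frac53 u_x & 0 \\ 0 & -\frac{d}{dx} \end{pmatrix}\cdot P. \]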
It follows that
\[ A = \begin{pmatrix} v & \frac83 u^2 - \frac23 u_{xx} \\ u & 2v \end{pmatrix} \]
and
\[ \Omega = \begin{pmatrix} \frac{d^3}{dx^3} - \frac{10}{3}u\frac{d}{dx} - \frac53 u_x & 0 \\ 0 & -\frac{d}{dx} \end{pmatrix}. \]
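At this point we just repeat the usual scheme. We notice that the entries of the columns of A are the Euler operators associated with the Lagrangian densities
\[ h_1(u, v) = uv, \qquad h_2(u, v; u_x, v_x) = v^2 + \frac89 u^3 + \frac13 u_x^2 \]
respectively. Then we use the transversal symmetries
\[ z_1: \quad \dot u = 1, \ \dot v = 0, \qquad\qquad z_2: \quad \dot u = 0, \ \dot v = 1 \]
to build the vector field
\[ Z: \quad \dot u = -uv, \qquad \dot v = -v^2 - \frac89 u^3 - \frac13 u_x^2 . \]
Its Fréchet derivative verifies the equation
\[ Z' + A^t = \begin{pmatrix} -v & -u \\ -\frac83 u^2 - \frac23 u_x\frac{d}{dx} & -2v \end{pmatrix}
 + \begin{pmatrix} v & u \\ \frac83 u^2 - \frac23 u_{xx} & 2v \end{pmatrix}
 = \frac{d}{dx}\cdot\begin{pmatrix} 0 & 0 \\ -\frac23 u_x & 0 \end{pmatrix}. \]
So the operator R is given by
\[ R = \begin{pmatrix} -\frac23 u_x & 0 \\ 0 & 0 \end{pmatrix}. \]
Since $R = R^*$ we finally get $\Omega' = \Omega$. Its potential θ is:
\[ \theta = \begin{pmatrix} \frac53 uu_x - \frac12 u_{xxx} \\ \frac12 v_x \end{pmatrix}. \]
Consequently
\[ X_1: \quad \dot u = -uv + \frac12 v_{xx}, \qquad \dot v = -v^2 - \frac89 u^3 + \frac43 u_x^2 + \frac53 uu_{xx} - \frac12 u_{xxxx} . \]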
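The last step of the Boussinesq example, $X_1 = Z + P\theta$, can be spelled out as follows. This is only a sketch: it assumes the potential $\theta = (\frac53 uu_x - \frac12 u_{xxx},\ \frac12 v_x)$ and the vector field Z as read above, with $D = d/dx$ a shorthand introduced here.

```latex
\begin{align*}
P\theta &= \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} D
  \begin{pmatrix} \tfrac53 uu_x - \tfrac12 u_{xxx} \\ \tfrac12 v_x \end{pmatrix}
  = \begin{pmatrix} \tfrac12 v_{xx} \\ \tfrac53 u_x^2 + \tfrac53 uu_{xx} - \tfrac12 u_{xxxx} \end{pmatrix}, \\
Z + P\theta &= \begin{pmatrix} -uv + \tfrac12 v_{xx} \\
  -v^2 - \tfrac89 u^3 + \bigl(-\tfrac13 + \tfrac53\bigr)u_x^2 + \tfrac53 uu_{xx} - \tfrac12 u_{xxxx} \end{pmatrix}
  = \begin{pmatrix} -uv + \tfrac12 v_{xx} \\
  -v^2 - \tfrac89 u^3 + \tfrac43 u_x^2 + \tfrac53 uu_{xx} - \tfrac12 u_{xxxx} \end{pmatrix}.
\end{align*}
```

The coefficient $\frac43$ of $u_x^2$ arises as $-\frac13$ from Z plus $\frac53$ from $P\theta$.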
Let us finally consider the third vector field 2X0 = [X−1 , X1 ].
It is a conformal symmetry of both Poisson bivectors P and Q. Indeed
\[ L_{X_0}(P) = \frac12 P, \qquad L_{X_0}(Q) = \frac12 Q . \]
Furthermore, the vector fields (X−1 , X0 , X1 ) satisfy the commutation relations [X−1 , X0 ] = X1
[X1 , X0 ] = −X−1 .
Therefore by the present algorithm we have constructed the sl(2)-subalgebra of the W-algebra associated with the Boussinesq hierarchy. This remark suggests that the methods used in this paper are potentially very useful in analyzing and classifying Poisson pencils on bihamiltonian manifolds.

5. Proof of Dubrovin's Conjecture

The key idea for proving the conjecture is to reduce the Jacobi identity $[P_\epsilon, P_\epsilon] = 0$ to a sequence of cohomological equations. This is possible on a manifold of hydrodynamic type due to the results of §3. The outcome is a peculiar representation of the coefficients of the deformation $P_\epsilon$ in terms of vector fields.

Proposition 3. A sequence of homogeneous vector fields $X_k$ may be associated with every homogeneous deformation $P_\epsilon$, in such a way that the coefficients $P_k$ of the Taylor expansion of $P_\epsilon$ are written as iterated Lie derivatives of the given bivector $P_0$. To this end consider the Lie derivatives $L_{X_k}$ associated with the vector fields, and construct with them the operator
\[ T_k = \sum_{j_1 + 2j_2 + \cdots + kj_k = k} \frac{L_{X_k}^{j_k}}{j_k!} \cdots \frac{L_{X_1}^{j_1}}{j_1!} \]
to be referred to as the Schur polynomial of order k associated with the given sequence of vector fields. Then
\[ P_k = T_k(P_0). \tag{39} \]
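To make the index set of the Schur polynomial concrete, the following sketch enumerates the exponent tuples $(j_1, \ldots, j_k)$ with $j_1 + 2j_2 + \cdots + kj_k = k$ together with the numerical coefficient $1/(j_1! \cdots j_k!)$ of each term. The function names are hypothetical and the code only handles the combinatorics, not the Lie derivatives themselves.

```python
from fractions import Fraction
from math import factorial


def schur_terms(k):
    """Yield all tuples (j_1, ..., j_k) of non-negative integers
    with j_1 + 2*j_2 + ... + k*j_k == k."""
    def rec(i, remaining):
        # rec(i, r) yields tuples (j_1, ..., j_i) with sum of m*j_m equal to r
        if i == 0:
            if remaining == 0:
                yield ()
            return
        for j in range(remaining // i + 1):
            for rest in rec(i - 1, remaining - i * j):
                yield rest + (j,)
    yield from rec(k, k)


def schur_coefficient(js):
    """Coefficient 1 / (j_1! * ... * j_k!) of the term with exponents js."""
    c = Fraction(1)
    for j in js:
        c /= factorial(j)
    return c


if __name__ == "__main__":
    for js in sorted(schur_terms(3)):
        print(js, schur_coefficient(js))
```

For k = 1 the only tuple is (1,), so $T_1 = L_{X_1}$; for k = 2 the tuples (2, 0) and (0, 1) give $T_2 = L_{X_2} + \frac12 L_{X_1}^2$. The number of terms of $T_k$ equals the number of partitions of k.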
Proof. Let us first check the formula for k = 1. We know that the first coefficient $P_1$ is a homogeneous bivector verifying the cocycle condition $[P_1, P_0] = 0$. Hence, by the final proposition of §3 there exists a homogeneous vector field $X_1$, such that $P_1 = L_{X_1}(P_0)$. This proves the first case of identity (39). To prove by induction the remaining cases, we use the identity
\[ T_k([P, P]) = \sum_{\substack{j,l=0 \\ j+l=k}}^{k} [T_j(P), T_l(P)]. \]
It follows from the transformation law
\[ \psi_*([P, P]) = [\psi_*(P), \psi_*(P)] \tag{40} \]
with respect to the special one parameter family of local diffeomorphisms $\phi^{(k)} : M \to M$ constructed as follows. First we compose the flows $(\phi_{t_1}, \ldots, \phi_{t_k})$ associated with the vector fields $(X_1, \ldots, X_k)$ so as to obtain the multiparameter family of local diffeomorphisms
\[ \phi^{(k)}_{t_1,\cdots,t_k} = \phi_{t_k} \circ \cdots \circ \phi_{t_1}. \tag{41} \]
Then we reduce this family by setting
\[ t_j = \epsilon^j . \tag{42} \]
By expanding Eq. (41) in powers of ε, and by equating the coefficients of $\epsilon^k$ we obtain exactly Eq. (40). Assume presently that the representation (39) is true for the first n coefficients $(P_1, \ldots, P_n)$. To prove that it is also true for $P_{n+1}$ we consider Eq. (40) for k = n + 1. We notice that this equation holds for any choice of the vector fields $(X_1, \ldots, X_{n+1})$. In particular it holds also for $X_{n+1} = 0$. Let us denote by $\hat T_{n+1} = T_{n+1}|_{X_{n+1}=0}$ the restriction of the operator $T_{n+1}$ to the first n vector fields of the sequence. Then we can write
\[ \hat T_{n+1}([P_0, P_0]) = \sum_{\substack{j,l=0 \\ j+l=n+1}}^{n+1} [\hat T_j(P_0), \hat T_l(P_0)]. \tag{43} \]
By assumption $[P_0, P_0] = 0$, and
\[ P_l = T_l(P_0) = \hat T_l(P_0) \qquad \forall\, l = 1, \ldots, n. \]
Therefore Eq. (43) becomes:
\[ 2[P_0, \hat T_{n+1}(P_0)] + \sum_{\substack{j,l=1 \\ j+l=n+1}}^{n} [P_j, P_l] = 0. \]
Let us compare this equation with
\[ 2[P_0, P_{n+1}] + \sum_{\substack{j,l=1 \\ j+l=n+1}}^{n} [P_j, P_l] = 0, \]
expressing the Jacobi identity $[P_\epsilon, P_\epsilon] = 0$ at the order n + 1 in ε. Subtracting the first equation from the second, the sums cancel and the difference takes the form of a cocycle condition:
\[ [P_0, P_{n+1} - \hat T_{n+1}(P_0)] = 0. \]
Therefore there exists a vector field $X_{n+1}$ such that
\[ P_{n+1} = L_{X_{n+1}}(P_0) + \hat T_{n+1}(P_0) = T_{n+1}(P_0). \]
By induction this proves the representation formula (39) for any k.

To end the proof of Dubrovin's conjecture it is sufficient now to notice that the infinite sequence of identities
\[ P_k = T_k(P_0) \tag{44} \]
means that
\[ P_\epsilon = \phi_*(P_0) \tag{45} \]
for the limit φ of the sequence of the local diffeomorphisms $\phi^{(k)}$ for k → ∞. Indeed, according to the theory of "Lie transform", Eq. (44) is nothing else but the Taylor expansion of Eq. (45) in powers of ε. We have then obtained a constructive proof of Dubrovin's conjecture. The relation
\[ \phi^{(k)} = \phi^{[X_k]}_{\epsilon^k} \circ \cdots \circ \phi^{[X_1]}_{\epsilon} \tag{46} \]
gives the approximation, at order k of the trivializing map φ : M → M we were looking for. Acknowledgements. We sincerely thank B. Dubrovin for introducing us to the problem of deformation of Poisson manifolds of hydrodynamic type. We also thank G. Falqui for many useful discussions. We finally thank the Istituto Nazionale di Alta Matematica of Rome, who supported a meeting on the geometry of Frobenius manifolds, giving us the occasion to meet all together and discuss the problem.
References

1. Dubrovin, B., Krichever, I.M., Novikov, S.P.: Integrable systems I. In: Encyclopaedia of Mathematical Sciences, 4, Dynamical systems IV, Berlin-Heidelberg-New York: Springer-Verlag, 1990, pp. 173–280
2. Personal communication during the "Intensive INDAM two-month seminar on the Geometry of Frobenius manifolds", Milano, 2000. See also: Dubrovin, B., Zhang, Y.: Bi-Hamiltonian hierarchies in 2D topological field theory at one-loop approximation. Commun. Math. Phys. 198, 311–361 (1998)
3. Lichnerowicz, A.: Les varietes de Poisson et leurs algebres de Lie associees. J. Diff. Geom. 12, 253–300 (1977)
4. Magri, F., Morosi, C.: A Geometrical Characterization of integrable Hamiltonian Systems through the Theory of Poisson-Nijenhuis Manifolds. Quaderno S 19/1984 of the Department of Mathematics of the University of Milano
5. Tonti, E.: Inverse problem: its general solution. Differential geometry, calculus of variations, and their applications. In: Lecture Notes in Pure and Appl. Math. 100, Berlin-Heidelberg-New York: Springer, 1985, pp. 497–510
6. Vainberg, M.M.: Variational Methods for the Study of Nonlinear Operators. Holden-Day, 1964
7. Vaisman, I.: Lectures on the geometry of Poisson manifolds. Basel: Birkhäuser Verlag, 1994
8. Volterra, V.: Fonctions de lignes. Paris: Gauthier-Villars, 1913
9. Volterra, V.: Theory of functionals and of integral and integro-differential equations. New York: Dover Publications Inc., 1959

Communicated by N.A. Nekrasov
Commun. Math. Phys. 253, 25–49 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1194-4
Communications in
Mathematical Physics
Chern-Simons Theory, Matrix Integrals, and Perturbative Three-Manifold Invariants
Marcos Mariño
Jefferson Physical Laboratory, Harvard University, Cambridge, MA 02138, USA. E-mail:
[email protected] Received: 21 August 2002 / Accepted: 10 May 2004 Published online: 5 November 2004 – © Springer-Verlag 2004
Abstract: The universal perturbative invariants of rational homology spheres can be extracted from the Chern-Simons partition function by combining perturbative and nonperturbative results. We spell out the general procedure to compute these invariants, and we work out in detail the case of Seifert spaces. By extending some previous results of Lawrence and Rozansky, the Chern-Simons partition function with arbitrary simply-laced group for these spaces is written in terms of matrix integrals. The analysis of the perturbative expansion amounts to the evaluation of averages in a Gaussian ensemble of random matrices. As a result, explicit expressions for the universal perturbative invariants of Seifert homology spheres up to order five are presented.

Contents
1. Introduction
2. The Partition Function of Chern-Simons Theory
3. Chern-Simons Perturbation Theory
4. The Chern-Simons Partition Function on Seifert Spaces
   4.1 Seifert homology spheres
   4.2 Computation of the partition function
   4.3 Connection to matrix models
5. Asymptotic Expansion and Matrix Integrals
   5.1 Asymptotic expansion of the exact result
   5.2 Evaluating the integrals
   5.3 Universal perturbative invariants up to order 5
6. Open Problems
A. Appendix
   A.1 Group theory factors
   A.2 Matrix integrals
   A.3 Symmetric polynomials
1. Introduction Chern-Simons theory [44] has been at the heart of the developments in three-manifold topology and knot theory for the last ten years. The partition function of Chern-Simons theory defines a topological invariant of three-manifolds, sometimes known as the Witten-Reshetikhin-Turaev invariant, that can be studied from many different points of view. In general, the invariant thus obtained contains information about the threemanifold itself but also about the gauge theory group that one uses to define the theory. However, from a perturbative point of view it is clear that one can extract numerical invariants of the three-manifold which are intrinsic to it and do not depend on the gauge group. This goes as follows: if we compute the partition function in perturbation theory, the contribution at a given order consists of a sum of terms associated to Feynman diagrams. Each term is the product of a group dependent factor (the group weight of the diagram), and a factor involving multiple integrals of the propagators over the threemanifold. This last factor does not depend on the gauge group one started with, and in this sense it is universal. Therefore, one can extract from perturbation theory an infinite series of invariants, the so-called universal perturbative invariants of three-manifolds. The idea of looking at the perturbative expansion of Chern-Simons theory in order to extract numerical invariants that “forget” about the gauge group was first implemented in the context of knot invariants, leading to the theory of Vassiliev invariants and the Kontsevich integral (see [7, 28]). The perturbative approach to the study of the partition function of Chern-Simons theory has a long story, starting in [44]. This has been pursued from many points of view. 
On the one hand, the structure of the perturbative series has been analyzed in detail (see for example [3, 5] and [15] for a nice review), leading to the graph homology of trivalent graphs as a systematic tool to organize the expansion. On the other hand, the asymptotic expansion of the nonperturbative results has also been studied [26, 35–37, 29, 30], although so far all the analysis have focused on theories with gauge group SU (2). Finally, a mathematically rigorous theory of universal perturbative invariants of three-manifolds has been constructed starting from the Kontsevich integral: the so-called LMO invariant [31] and its Aarhus version [8]. The main goal of the present paper is to elaborate on the topological field theory approach to universal perturbative invariants. The point of view presented here is very similar to the one advocated in [1, 2] to extract Vassiliev invariants from Chern-Simons perturbation theory: first, one analyzes the structure of the perturbative series of an observable in the theory. This means, in practical terms, choosing a basis of independent group factors and computing its value for various gauge groups. In a second step, one computes the corresponding invariant nonperturbatively for those gauge groups, performs an asymptotic expansion, and extracts the universal invariants by comparing to the perturbative result. This program was applied successfully in [1, 2] to compute Vassiliev invariants of many knots. It turns out that, in the case of the Chern-Simons partition function, the first step is relatively easy, but the calculation of the partition function for arbitrary gauge groups in a way that is suitable for an asymptotic expansion turns out to be trickier, except in very simple cases. In this paper, some well-known results concerning the structure of the perturbative series are put together, and we carry out a detailed analysis up to order five. 
The focus is on a rather general class of rational homology spheres, Seifert spaces. The partition function of Chern-Simons theory with gauge group SU (2) on these spaces and its asymptotic expansion have been studied in [19, 35, 37]. The extension to higher rank gauge groups has also been considered [40, 22], but in forms that are not useful for a systematic perturbative expansion. In [30], Lawrence and Rozansky found a beautiful expression
for the SU (2) partition function on Seifert spaces in terms of a sum of integrals and residues. It turns out that their result can be generalized to any simply-laced group and written in terms of integrals over the Cartan subalgebra of the gauge group (these kind of integrals already appeared in a related context in [36]). Interestingly, they are closely related to models of random matrices, and one can use matrix model technology to study the Chern-Simons partition function on these spaces. The resulting expressions can be expanded in series in a fairly systematic way, and by comparing the result with the general structure of the perturbative expansion, the universal perturbative invariants can be extracted. It should be mentioned that the full LMO invariant of Seifert spaces has been computed by Bar-Natan and Lawrence [9] by using techniques from the theory of the Aarhus integral. However, their result is rather implicit and involves a complicated graphical calculus. This paper is organized as follows: in Sect. 2, we review the computation of the ChernSimons partition function starting from a surgery presentation. In Sect. 3, we analyze in some detail the structure of the Chern-Simons perturbation series. In Sect. 4 we compute the exact partition function of Seifert spaces for simply-laced gauge groups, generalizing the results of Lawrence and Rozansky, and we make the connection to matrix models. In Sect. 5, we analyze the asymptotic expansion of the exact result, explain how to evaluate the matrix integrals, and present the results for universal perturbative invariants up to order five. In Sect. 6, we comment on the possible relevance of these results to other physical contexts, and some avenues for future research are suggested. The Appendix collects the explicit expressions for the group factors and the matrix integrals, together with a summary of the properties of symmetric functions that are used in the paper. 2. 
The Partition Function of Chern-Simons Theory

In this section we review some well-known results about the computation of the Chern-Simons partition function in terms of surgery presentations. An excellent summary, that we follow quite closely, is given in [36]. We consider Chern-Simons theory on a three-manifold M and for a simply-laced gauge group G, with action
\[ S(A) = \frac{k}{4\pi}\int_M \mathrm{Tr}\Big(A \wedge dA + \frac23 A \wedge A \wedge A\Big), \tag{2.1} \]
where A is a G-connection on M. We will be interested in framed three-manifolds, i.e. a three-manifold together with a trivialization of the bundle $TM \oplus TM$. As explained in [4], for every three-manifold there is a canonical choice of framing, and the different choices are labeled by an integer $s \in \mathbb{Z}$ in such a way that s = 0 corresponds to the canonical framing. Unless otherwise stated, we will always work in the canonical framing, and we will explain below how to incorporate this in the calculations, following [26, 19, 30]. As shown by Witten in [44], the partition function of Chern-Simons theory
\[ Z_k(M) = \int \mathcal{D}A\, e^{iS_{CS}(A)} \tag{2.2} \]
defines an invariant of framed manifolds. There is a very nice procedure to evaluate (2.2) in a combinatorial way which goes as follows. By the Lickorish theorem (see for example [32]), any three-manifold M can be obtained by surgery on a link L in $S^3$. Let
us denote by $K_i$, i = 1, ..., L, the components of L. The surgery operation means that around each of the knots $K_i$ we take a tubular neighborhood $\mathrm{Tub}(K_i)$ that we remove from $S^3$. This tubular neighborhood is a solid torus with a contractible cycle $\alpha_i$ and a noncontractible cycle $\beta_i$. We then glue the solid torus back after performing an SL(2, Z) transformation given by the matrix
\[ U^{(p_i,q_i)} = \begin{pmatrix} p_i & r_i \\ q_i & s_i \end{pmatrix}. \tag{2.3} \]
This means that the cycles $p_i\alpha_i + q_i\beta_i$ and $r_i\alpha_i + s_i\beta_i$ on the boundary of the complement of $K_i$ are identified with the cycles $\alpha_i$, $\beta_i$ in $\mathrm{Tub}(K_i)$.

This geometric description leads to the following prescription to compute the invariants in Chern-Simons theory. By canonical quantization, one associates a Hilbert space to any two-dimensional compact manifold that arises as the boundary of a three-manifold, so that the path-integral over a manifold with boundary gives a state in the corresponding Hilbert space. As it was shown in [44], the states of the Hilbert space of Chern-Simons theory associated to the torus are in one to one correspondence with the integrable representations of the WZW model with gauge group G at level k. We will use the following notation: r denotes the rank of G, and d its dimension. y denotes the dual Coxeter number. The fundamental weights will be denoted by $\lambda_i$, and the simple roots by $\alpha_i$, with i = 1, ..., r. The weight and root lattices of G are denoted by $\Lambda_w$ and $\Lambda_r$, respectively. Finally, we put l = k + y. A representation given by a highest weight Λ is integrable if the weight ρ + Λ is in the fundamental chamber $F_l$ (ρ denotes as usual the Weyl vector, given by the sum of the fundamental weights). The fundamental chamber is given by $\Lambda_w / l\Lambda_r$ modded out by the action of the Weyl group. For example, in SU(N) a weight $p = \sum_{i=1}^r p_i\lambda_i$ is in $F_l$ if
\[ \sum_{i=1}^{r} p_i < l, \quad \text{and} \quad p_i > 0, \ i = 1, \ldots, r. \tag{2.4} \]
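Condition (2.4) is easy to enumerate for small rank and level. The sketch below lists the weights of the fundamental chamber for SU(N) in terms of their Dynkin labels $(p_1, \ldots, p_r)$; the function names are hypothetical, and the only inputs are the rank r = N − 1 and l = k + y, where y = N for SU(N).

```python
from itertools import product


def in_fundamental_chamber(p, l):
    """Condition (2.4): all p_i > 0 and p_1 + ... + p_r < l."""
    return all(pi > 0 for pi in p) and sum(p) < l


def chamber_weights(rank, l):
    """All Dynkin-label tuples (p_1, ..., p_rank) satisfying (2.4)."""
    candidates = product(range(1, l), repeat=rank)
    return [p for p in candidates if in_fundamental_chamber(p, l)]


if __name__ == "__main__":
    # SU(2) at level k = 2: rank 1, l = k + 2 = 4
    print(chamber_weights(1, 4))
    # SU(3) at level k = 2: rank 2, l = k + 3 = 5
    print(chamber_weights(2, 5))
```

For SU(2) at level 2 this gives three weights and for SU(3) at level 2 it gives six, matching the usual count of integrable representations at those levels.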
In the following, the basis of integrable representations will be labeled by the weights in $F_l$. In the case of simply-laced gauge groups, the SL(2, Z) transformation given by $U^{(p,q)}$ has the following matrix elements in the above basis [26, 36]:
\[ U^{(p,q)}_{\alpha\beta} = \frac{[i\,\mathrm{sign}(q)]^{|\Delta_+|}}{(l|q|)^{r/2}} \exp\Big[-\frac{id\pi}{12}\Phi(U^{(p,q)})\Big] \Big(\frac{\mathrm{Vol}\,\Lambda_w}{\mathrm{Vol}\,\Lambda_r}\Big)^{\frac12} \sum_{n \in \Lambda_r/q\Lambda_r} \sum_{w \in W} \epsilon(w) \exp\Big[\frac{i\pi}{lq}\Big(p\alpha^2 - 2\alpha(ln + w(\beta)) + s(ln + w(\beta))^2\Big)\Big]. \tag{2.5} \]
In this equation, $|\Delta_+|$ denotes the number of positive roots of G, and the second sum is over the Weyl group W of G. $\Phi(U^{(p,q)})$ is the Rademacher function:
\[ \Phi\begin{pmatrix} p & r \\ q & s \end{pmatrix} = \frac{p+s}{q} - 12 s(p, q), \tag{2.6} \]
where s(p, q) is the Dedekind sum
\[ s(p, q) = \frac{1}{4q} \sum_{n=1}^{q-1} \cot\Big(\frac{\pi n}{q}\Big) \cot\Big(\frac{\pi n p}{q}\Big). \tag{2.7} \]
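The Dedekind sum (2.7) and the Rademacher function (2.6) are simple to evaluate numerically. The sketch below implements the cotangent formula as written, and also the standard equivalent sawtooth form $s(p,q) = \sum_{n=1}^{q-1} ((n/q))((pn/q))$, which gives exact rationals. The function names are hypothetical, and both routines assume q > 0 with p and q coprime.

```python
from fractions import Fraction
from math import gcd, pi, tan


def dedekind_sum_cot(p, q):
    """Dedekind sum via the cotangent formula (2.7); floating point."""
    assert q > 0 and gcd(p, q) == 1
    total = sum(1.0 / (tan(pi * n / q) * tan(pi * n * p / q))
                for n in range(1, q))
    return total / (4 * q)


def sawtooth(x):
    """((x)) = x - floor(x) - 1/2 for non-integer x, and 0 for integer x."""
    if x.denominator == 1:
        return Fraction(0)
    return x - (x.numerator // x.denominator) - Fraction(1, 2)


def dedekind_sum(p, q):
    """Exact Dedekind sum via the sawtooth form (equivalent for coprime p, q)."""
    assert q > 0 and gcd(p, q) == 1
    return sum(sawtooth(Fraction(n, q)) * sawtooth(Fraction(p * n, q))
               for n in range(1, q))


def rademacher_phi(p, r, q, s):
    """Rademacher function (2.6) of the matrix [[p, r], [q, s]], for q > 0."""
    assert p * s - q * r == 1 and q > 0
    return Fraction(p + s, q) - 12 * dedekind_sum(p, q)


if __name__ == "__main__":
    print(dedekind_sum(1, 3), dedekind_sum_cot(1, 3))
    print(rademacher_phi(0, -1, 1, 0))
```

The exact version is convenient when the value enters a phase such as the one in (2.5), where rounding errors in the cotangents would otherwise accumulate.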
With these data we can already present Witten's result for the Chern-Simons partition function of M. As before, suppose that M is obtained by surgery on a link L in $S^3$. Then, the partition function of M is given by:
\[ Z(M, l) = e^{i\phi_{\rm fr}} \sum_{\alpha_1,\cdots,\alpha_L \in F_l} Z_{\alpha_1,\cdots,\alpha_L}(L)\, U^{(p_1,q_1)}_{\alpha_1\rho} \cdots U^{(p_L,q_L)}_{\alpha_L\rho}. \tag{2.8} \]
In this equation, $Z_{\alpha_1,\cdots,\alpha_L}(L)$ is the invariant of the link L with representation $\alpha_i - \rho$ attached to its i-th component (recall that the weights in $F_l$ are of the form ρ + Λ). The phase factor $e^{i\phi_{\rm fr}}$ is a framing correction that guarantees that the resulting invariant is in the canonical framing for the three-manifold M. Its explicit expression is:
\[ \phi_{\rm fr} = \frac{\pi k d}{12 l}\Big(\sum_{i=1}^{L} \Phi(U^{(p_i,q_i)}) - 3\sigma(L)\Big), \tag{2.9} \]
where σ(L) is the signature of the linking matrix of L.

3. Chern-Simons Perturbation Theory

The expression (2.8) gives the nonperturbative result for the partition function of M, and allows an explicit evaluation for many three-manifolds for any gauge group G and level k. However, from the point of view of Chern-Simons perturbation theory, the partition function can be also understood as an asymptotic series in $l^{-1}$, whose coefficients can be computed by evaluating Feynman diagrams. In this section we review some known facts about the perturbative expansion of Chern-Simons theory and we state our strategy to compute the universal perturbative invariants. We are interested in the perturbative evaluation of the partition function (2.2). Let us assume (as we will do in this paper) that M is a rational homology sphere. The classical solutions of the Chern-Simons action are just flat connections on M, and for a rational homology sphere these are a finite set of points. Therefore, in the perturbative evaluation one expresses $Z_k(M)$ as a sum of terms associated to stationary points:
\[ Z_k(M) = \sum_c Z_k^{(c)}(M), \tag{3.1} \]
where c labels the different flat connections $A^{(c)}$ on M. Each of the terms in this sum has a perturbative expansion as an asymptotic series in $l^{-1}$. The structure of the perturbative series was analyzed in various papers [44, 37, 5] and is given by the following expression:
\[ Z_k^{(c)}(M) = Z^{(c)}_{\rm 1-loop}(M) \cdot \exp\Big\{\sum_{\ell=1}^{\infty} S_\ell^{(c)} x^\ell\Big\}. \tag{3.2} \]
In this equation, x is the effective expansion parameter:
\[ x = \frac{2\pi i}{l}. \tag{3.3} \]
The one-loop correction $Z^{(c)}_{\rm 1-loop}(M)$ was first analyzed in [44], and has been studied in much detail since then. It has the form,
\[ Z^{(c)}_{\rm 1-loop}(M) = \frac{(2\pi x)^{\frac12(\dim H_c^0 - \dim H_c^1)}}{\mathrm{vol}(H_c)}\, e^{-\frac1x S_{CS}(A^{(c)}) - \frac{i\pi}{4}\varphi} \sqrt{|\tau_R^{(c)}|}, \tag{3.4} \]
where $H_c^{0,1}$ are the cohomology groups with values in the Lie algebra of G associated to the flat connection $A^{(c)}$, $\tau_R^{(c)}$ is the Reidemeister-Ray-Singer torsion of $A^{(c)}$, $H_c$ is the isotropy group of $A^{(c)}$, and φ is a certain phase. More details about the structure of this term can be found in [44, 19, 26, 35, 36]. Our main object of concern in this paper are the terms in the exponential of (3.2) corresponding to the trivial connection, which we will simply denote by $S_\ell$. In order to make a precise statement about the structure of these terms, we have to explain in some detail what is the appropriate set of diagrams we want to consider. In principle, in order to compute $S_\ell$ we just have to consider all the connected bubble diagrams with $\ell + 1$ loops. To each of these diagrams we will associate a group factor times a Feynman integral. However, not all these diagrams are independent, since the underlying Lie algebra structure imposes the Jacobi identity:
\[ \sum_e \big(f_{abe}f_{edc} + f_{dae}f_{ebc} + f_{ace}f_{edb}\big) = 0. \tag{3.5} \]
This leads to the diagram relation known as IHX relation. Also, antisymmetry of $f_{abc}$ leads to the so-called AS relation (see for example [7, 15, 28, 39]). The existence of these relations between diagrams suggests to define an equivalence relation in the space of connected trivalent graphs by quotienting by the IHX and the AS relations, and this gives the so-called graph homology. The space of homology classes of connected diagrams will be denoted by $A(\emptyset)^{\rm conn}$. This space is graded by half the number of vertices ℓ, and this number gives the degree of the graph. The space of homology classes of graphs at degree ℓ is then denoted by $A(\emptyset)_\ell^{\rm conn}$. For every ℓ, this is a finite-dimensional vector space of dimension d(ℓ). The dimensions of these spaces are explicitly known for low degrees (see for example [7]), and we have listed some of them in Table 1. Finally, notice that, given any group G, we have a map
\[ A(\emptyset)^{\rm conn} \longrightarrow \mathbb{R} \tag{3.6} \]
that associates to every graph Γ its group theory factor $r_\Gamma(G)$. This map is an example of a weight system for $A(\emptyset)^{\rm conn}$. Every gauge group gives a weight system for $A(\emptyset)^{\rm conn}$, but one may in principle find weight systems not associated to gauge groups, although so far the only known example is the one constructed by Rozansky and Witten in [38], which uses instead hyperKähler manifolds.

Table 1. Dimensions d(ℓ) of $A(\emptyset)_\ell^{\rm conn}$ up to ℓ = 10

ℓ       1   2   3   4   5   6   7   8   9   10
d(ℓ)    1   1   1   2   2   3   4   5   6   8
We can now state very precisely what is the structure of the $S_\ell$ appearing in (3.2): since the Feynman diagrams can be grouped into homology classes, we have
\[ S_\ell = \sum_{\Gamma \in A(\emptyset)_\ell^{\rm conn}} r_\Gamma(G)\, I_\Gamma(M). \tag{3.7} \]
The factors $I_\Gamma(M)$ appearing in (3.7) are certain (complicated) integrals of propagators over M. It was shown in [5] that these are differentiable invariants of the three-manifold M, and since the dependence on the gauge group has been factored out, they only capture topological information of M, in contrast to $Z_k(M)$, which also depends on the choice of the gauge group. These are the universal perturbative invariants defined by Chern-Simons theory. Notice that, at every order ℓ in perturbation theory, there are d(ℓ) independent perturbative invariants. Of course, these invariants inherit from $A(\emptyset)_\ell^{\rm conn}$ the structure of a finite-dimensional vector space, and it is convenient to pick a basis once and for all. Here we will study these invariants up to order 5, and we choose the basis presented by Sawon in [39]:

[Figure (3.8): the basis graphs for ℓ = 1, 2, 3, 4, 5; the diagrams are not recoverable from the extracted text.]
As in [39], we will denote the graphs with k circles joined by lines by $\theta_k$. Therefore, the graph corresponding to ℓ = 1 will be denoted by θ, the graph corresponding to ℓ = 2 will be denoted $\theta_2$, and so on. The second graph for ℓ = 4 will be denoted by ω, and the second graph in ℓ = 5 by ωθ. The group factors associated to these diagrams can be easily computed by using the techniques of [12] (see also [6, 7]). Explicit results for all classical gauge groups are presented in the Appendix.

Remark.
1. It is interesting to understand the framing dependence of the universal perturbative invariants (see [5] for a discussion of this issue). As shown in [44], the full partition function $Z_k(M)$ changes as follows under a change of framing:
\[ Z \to e^{\frac{\pi i s c}{12}} Z, \tag{3.9} \]
where $s \in \mathbb{Z}$ labels the choice of framing and
\[ c = \frac{kd}{k+y} \tag{3.10} \]
is the central charge of the WZW model with group G. Using now that (see Appendix A)
\[ r_\theta(G) = 2yd, \tag{3.11} \]
we find that under a change of framing one has
\[ I_\theta(M) \to I_\theta(M) - \frac{s}{48}, \tag{3.12} \]
while the other universal perturbative invariants remain the same. Since we will work in the canonical framing of M, this will produce a canonical value of $I_\theta(M)$.
2. Notice that Chern-Simons theory detects the graph homology through the weight system associated to Lie algebras. Unfortunately it is known [43] that there is an element of graph homology at degree 16 that is not detected by any weight system associated to simple Lie algebras. However, there is a very elegant mathematical definition of the universal perturbative invariant of a three-manifold that works directly in the graph homology. This is called the LMO invariant [31] and it is a formal linear combination of homology graphs with rational coefficients:
\[ \omega(M) = \sum_{\Gamma \in A(\emptyset)^{\rm conn}} I_\Gamma^{\rm LMO}(M)\, \Gamma \in A(\emptyset)^{\rm conn}[\mathbb{Q}]. \tag{3.13} \]
It is believed that the universal invariants extracted from Chern-Simons perturbation theory agree with the LMO invariant. More precisely, since the LMO invariant ω(M) is taken to be 0 for $S^3$, we have:
\[ I_\Gamma^{\rm LMO}(M) = I_\Gamma(M) - I_\Gamma(S^3), \tag{3.14} \]
as long as the graph Γ is detected by Lie algebra weight systems. In that sense the LMO invariant is more refined than the universal perturbative invariants extracted from Chern-Simons theory.
3. The Chern-Simons approach to the theory of universal perturbative invariants is very similar to the approach to Vassiliev invariants based on the analysis of vevs of Wilson loops in perturbation theory [1, 2]. The role of graph homology is played there by the homology of chord diagrams (see for example [7, 28]).

4. The Chern-Simons Partition Function on Seifert Spaces

In this section we write the partition function of Chern-Simons theory on Seifert homology spheres as a sum of integrals over the Cartan subalgebra and a set of residues, by extending results of Lawrence and Rozansky [30] for SU(2). We also show that these integrals can be interpreted in terms of matrix integrals associated to a random matrix model.

4.1. Seifert homology spheres. Seifert homology spheres can be constructed by performing surgery on a link L in $S^3$ with n + 1 components, consisting of n parallel and unlinked unknots together with a single unknot whose linking number with each of the other n unknots is one. The surgery data are $p_j/q_j$ for the unlinked unknots, j = 1, ..., n, and 0 on the final component. $p_j$ is coprime to $q_j$ for all j = 1, ..., n, and the $p_j$'s are pairwise coprime. After doing surgery, one obtains the Seifert space
$M = X\big(\tfrac{p_1}{q_1}, \dots, \tfrac{p_n}{q_n}\big)$. This is a rational homology sphere whose first homology group $H_1(M, \mathbb{Z})$ has order $|H|$, where
$$H = P \sum_{j=1}^{n} \frac{q_j}{p_j}, \qquad P = \prod_{j=1}^{n} p_j. \tag{4.1}$$
Another topological invariant that will enter the computation is the signature of $L$, which turns out to be [30]
$$\sigma(L) = \sum_{i=1}^{n} {\rm sign}\Big(\frac{q_i}{p_i}\Big) - {\rm sign}\Big(\frac{H}{P}\Big). \tag{4.2}$$
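As a small illustration of (4.1) and (4.2), the following sketch evaluates $P$, $H$ and $\sigma(L)$ with exact rational arithmetic. The function name `seifert_data` and the sample surgery data $p = (2,3,5)$, $q = (-1,1,1)$ are our own choices for illustration (the $q_j$ were picked so that $|H| = 1$); they are not taken from the text.

```python
from fractions import Fraction
from math import prod


def sign(v):
    """Sign of a rational number: -1, 0 or +1."""
    return (v > 0) - (v < 0)


def seifert_data(p, q):
    """Return (P, H, sigma) of (4.1)-(4.2) for X(p_1/q_1, ..., p_n/q_n)."""
    P = prod(p)
    H = P * sum(Fraction(qj, pj) for pj, qj in zip(p, q))
    sigma = sum(sign(Fraction(qj, pj)) for pj, qj in zip(p, q)) - sign(H / P)
    return P, H, sigma


# Illustrative data: p = (2, 3, 5) with q chosen so that H = 1.
P, H, sigma = seifert_data(p=(2, 3, 5), q=(-1, 1, 1))
print(P, H, sigma)
```

With these data $H = 30\,(-\tfrac12 + \tfrac13 + \tfrac15) = 1$, so the resulting space is an integral homology sphere.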
For $n = 1, 2$, Seifert homology spheres reduce to lens spaces, and one has that $L(p, q) = X(q/p)$. For $n = 3$, we obtain the Brieskorn homology spheres $\Sigma(p_1, p_2, p_3)$ (in this case the manifold is independent of $q_1, q_2, q_3$). In particular, $\Sigma(2, 3, 5)$ is the Poincaré homology sphere. Finally, the Seifert manifold $X\big(\tfrac{2}{-1}, \tfrac{m}{(m+1)/2}, \tfrac{t-m}{1}\big)$, with $m$ odd, can be obtained by integer surgery on a $(2, m)$ torus knot with framing $t$.

4.2. Computation of the partition function. In order to compute the partition function of $M$, we first have to compute the invariant of $L$ for generic representations $\beta - \rho, \Lambda_1, \dots, \Lambda_n$ of the gauge group $G$, where $\beta - \rho$ is the irreducible representation coloring the unknot with surgery data 0, and $\Lambda_i$ are irreducible representations coloring the unknots with surgery data $p_i/q_i$, $i = 1, \dots, n$. This can be easily done by using the formula of [44] for connected sums of knots, and one obtains:
$$Z_{\beta,\rho+\Lambda_1,\dots,\rho+\Lambda_n}(L) = \frac{\prod_{i=1}^{n} S_{\beta\,\rho+\Lambda_i}}{S_{\rho\beta}^{\,n-1}}. \tag{4.3}$$
Therefore, the partition function of $M$ will be given by
$$Z_k(M) = e^{i\phi_{\rm fr}} \sum_{\beta\in F_l} \frac{\prod_{i=1}^{n} \sum_{\rho+\Lambda_i\in F_l} S_{\beta\,\rho+\Lambda_i}\, U^{(p_i,q_i)}_{\rho+\Lambda_i\,\rho}}{S_{\rho\beta}^{\,n-2}}, \tag{4.4}$$
where the framing correction is given by the general formula (2.9). Seifert homology spheres can be also obtained by doing surgery on $n$ strands parallel to $S^1$ in $S^2 \times S^1$ [35], and then (4.4) follows from Verlinde's formula [42]. This expression is not suitable for an asymptotic expansion in $1/l$, since it involves a sum over integrable representations that depends itself on $l$. In order to obtain a useful expression, we follow a series of steps generalizing the procedure in [36, 30]. First of all, we perform the matrix multiplication $\sum_{\rho+\Lambda_i\in F_l} S_{\beta\,\rho+\Lambda_i} U^{(p_i,q_i)}_{\rho+\Lambda_i\,\rho}$. This gives
$$\sum_{\rho+\Lambda_i\in F_l} S_{\beta\,\rho+\Lambda_i}\, U^{(p_i,q_i)}_{\rho+\Lambda_i\,\rho} = \exp\Big[\frac{\pi i k d}{4 l}\,{\rm sign}\Big(\frac{q_i}{p_i}\Big)\Big]\, U^{(-q_i,p_i)}_{\beta\rho}, \tag{4.5}$$
where the $SL(2, \mathbb{Z})$ transformation in the right hand side is given by
$$U^{(-q_i,p_i)} = \begin{pmatrix} -q_i & -s_i \\ p_i & r_i \end{pmatrix} = S\cdot U^{(p_i,q_i)}, \tag{4.6}$$
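and the phase factor is needed in order to keep track of the framing. The partition function is then, up to a multiplicative constant, given by:
$$\sum_{\beta\in F_l} \frac{1}{\prod_{\alpha>0} \sin^{n-2}\frac{\pi}{l}(\beta\cdot\alpha)} \prod_{i=1}^{n} \sum_{n_i\in\Lambda_r/p_i\Lambda_r} \sum_{w_i\in\mathcal{W}} \epsilon(w_i) \exp\Big\{\frac{i\pi}{l p_i}\Big(-q_i\beta^2 - 2\beta\,(l n_i + w_i(\rho)) + r_i\,(l n_i + w_i(\rho))^2\Big)\Big\}. \tag{4.7}$$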
If $G$ is simply-laced, the summand is invariant under the simultaneous shift,
$$\beta \to \beta + l\alpha, \qquad n_i \to n_i - q_i\alpha, \tag{4.8}$$
and also under
$$n_i \to n_i + p_i\alpha. \tag{4.9}$$
In these equations, $\alpha$ is any element in the root lattice. This invariance allows us to put $n_i = 0$ in the above sum by extending the range of $\beta$: $\beta = p + l\alpha$, where $p \in F_l$, and $\alpha = \sum_i a_i\alpha_i$, $0 \le a_i < P$. It is easy to see that the resulting summand is invariant under the Weyl group $\mathcal{W}$ acting on $\beta$, and by translations by $lP\alpha$, where $\alpha$ is any root. We can then sum over Weyl reflections and divide by the order of $\mathcal{W}$, denoted by $|\mathcal{W}|$, and use the translation symmetry to extend the sum over $\beta$ in the above set to a sum over $\beta \in (\Lambda_w/lP\Lambda_r)\setminus\mathcal{M}$. Here $\mathcal{M}$ denotes the set given by the wall of $F_l$ together with its Weyl reflections and translations by $lP\alpha$ inside $\Lambda_w/lP\Lambda_r$ (for $SU(N)$, the wall of $F_l$ is given by the weights with $\sum_i p_i = l$). We won't need a precise description of the points of $\mathcal{M}$ in the following, since they only enter in the contribution of irreducible flat connections to the path integral [30]. After performing all these changes, and using the Weyl denominator formula
$$\prod_{\alpha>0} 2\sinh\frac{\alpha}{2} = \sum_{w\in\mathcal{W}} \epsilon(w)\, e^{w(\rho)}, \tag{4.10}$$
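we can write (4.7) as:
$$\frac{1}{|\mathcal{W}|}\, e^{\frac{i\pi}{l}\rho^2 \sum_{i=1}^{n}\frac{r_i}{p_i}} \sum_{\beta\in(\Lambda_w/lP\Lambda_r)\setminus\mathcal{M}} e^{-\frac{i\pi H}{lP}\beta^2}\, \frac{1}{\prod_{\alpha>0}\sin^{n-2}\frac{\pi}{l}(\beta\cdot\alpha)} \prod_{i=1}^{n}\prod_{\alpha>0} (-2i)\sin\frac{\pi}{l p_i}(\beta\cdot\alpha). \tag{4.11}$$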
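The last step involves transforming the above sum into a sum over integrals and residues. To do that, we generalize slightly [30] and we introduce a holomorphic function of $\beta_1, \dots, \beta_r$ and $x_1, \dots, x_r$ given by:
$$h(\beta, x) = \frac{e^{-\frac{i\pi H}{lP}\beta^2}}{\prod_{\alpha>0}\Big(e^{\frac{\pi i}{l}(\beta\cdot\alpha)} - e^{-\frac{\pi i}{l}(\beta\cdot\alpha)}\Big)^{n-2}}\; \frac{e^{\frac{2\pi i}{l}\beta\cdot x}}{\prod_{i=1}^{r}\big(1 - e^{-2\pi i\beta_i}\big)} = \frac{f(\beta, x)}{\prod_{i=1}^{r}\big(1 - e^{-2\pi i\beta_i}\big)}, \tag{4.12}$$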
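where $\beta = \sum_{i=1}^{r}\beta_i\lambda_i \in \Lambda_w\otimes\mathbb{C}$, $x = \sum_{i=1}^{r} x_i\alpha_i \in \Lambda_r\otimes\mathbb{C}$. This function satisfies:
$$h(\beta + lP\alpha, x) = e^{2\pi i P\alpha\cdot x}\, h(\beta, x - lH\alpha), \tag{4.13}$$
for any $\alpha \in \Lambda_r$. Notice also that $h(\beta, x)$ has poles at the points of $\Lambda_w$, the weight lattice. Introduce now the integral over $C^r$:
$$\Xi(x) = \int_{C^r} h(\beta, x)\, d\beta, \tag{4.14}$$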
where $C^r = C\times\cdots\times C$ is a multiple contour in $\mathbb{C}^r$, and $C$ is the contour considered in [30]: a line through the origin from $(-1+i)\infty$ to $(1-i)\infty$ for ${\rm sign}(H/P) > 0$ (if ${\rm sign}(H/P) < 0$, we rotate $C$ by $\pi/2$ in the clockwise direction). This contour is chosen to guarantee good convergence properties as $\beta_i \to \infty$. Let us now shift the contour in such a way that it crosses all the poles corresponding to the weights in the chamber $\Lambda_w/lP\Lambda_r$. Using (4.13) it is easy to see that, if $P\alpha\cdot x \in \mathbb{Z}$ for any root $\alpha$, the resulting integral can be written as
$$\sum_{i=1}^{r}\Xi(x - lH\alpha_i) - \sum_{1\le i<j\le r}\Xi\big(x - lH(\alpha_i + \alpha_j)\big) + \cdots + (-1)^{r-1}\,\Xi\Big(x - lH\sum_{i=1}^{r}\alpha_i\Big). \tag{4.15}$$
The difference between the original integral and the shifted integral (4.15) can be written as
$$\sum_{t\in\Lambda_r/H\Lambda_r}\int_{C^r} f(\beta, x)\, e^{-2\pi i t\cdot\beta}\, d\beta. \tag{4.16}$$
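On the other hand, the effect of shifting the contour is to pick the residues corresponding to all the weights in the chamber $\Lambda_w/lP\Lambda_r$. Here the residue is understood as $\lim_{\beta_i\to n_i}\prod_i(\beta_i - n_i)\,h(\beta, x)$, and the residues for the weights that are not in $\mathcal{M}$ are simply given by $(2\pi i)^{-r} f(\beta, x)$. Putting everything together we find,
$$\sum_{n\in(\Lambda_w/lP\Lambda_r)\setminus\mathcal{M}} f(n, x) = \sum_{t\in\Lambda_r/H\Lambda_r}\int_{C^r} f(\beta, x)\, e^{-2\pi i\beta\cdot t}\, d\beta \;-\; (2\pi i)^r \sum_{n\in\mathcal{M}} {\rm Res}\big(h(\beta, x), \beta = n\big), \tag{4.17}$$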
whenever $P\alpha\cdot x \in \mathbb{Z}$. We can apply this formula to (4.11), since what we have there is just a sum of expressions of the form $f(\beta, x)$ in (4.12), with $x$ of the form $\alpha/P$, $\alpha \in \Lambda_r$. In this context, the sum over $t \in \Lambda_r/H\Lambda_r$ is interpreted as a sum over reducible flat connections on the Seifert sphere, and of course $t = 0$ corresponds to the trivial connection. In the remainder of this paper we will focus on these contributions, i.e. we will not deal with the residue terms in (4.17), which should give the contribution of irreducible flat connections [37, 30]. In fact, we will only analyze in detail the contribution of the trivial connection in order to make contact with the universal perturbative invariants.
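The quasi-periodicity (4.13), on which the contour shift above relies, can be checked numerically in the simplest case. The sketch below specializes to $SU(2)$ under the following assumptions of ours: $\beta = \beta_1\lambda$ and $x = x_1\alpha$ with $\alpha = 2\lambda$, $\alpha^2 = 2$, so that $\beta^2 = \beta_1^2/2$, $\beta\cdot\alpha = \beta_1$, $\beta\cdot x = \beta_1 x_1$ and $\alpha\cdot x = 2x_1$; the function $h$ is taken in the form $e^{-\frac{i\pi H}{lP}\beta^2 + \frac{2\pi i}{l}\beta\cdot x}$ divided by the two denominators of (4.12). The name `h_su2` and the numerical values of $l, P, H, n$ are hypothetical.

```python
import cmath


def h_su2(beta1, x1, l, P, H, n):
    """h(beta, x) of (4.12) for SU(2), with beta = beta1*lambda, x = x1*alpha."""
    gauss = cmath.exp(-1j * cmath.pi * H * beta1 ** 2 / (2 * l * P))  # beta^2 = beta1^2 / 2
    source = cmath.exp(2j * cmath.pi * beta1 * x1 / l)                # beta . x = beta1 * x1
    weyl = (2j * cmath.sin(cmath.pi * beta1 / l)) ** (n - 2)          # (e^{i pi b/l} - e^{-i pi b/l})^(n-2)
    poles = 1 - cmath.exp(-2j * cmath.pi * beta1)
    return gauss * source / (weyl * poles)


# Hypothetical sample values (integers l, P, H; generic complex beta1).
l, P, H, n = 5, 30, 1, 3
beta1, x1 = 0.3 + 0.1j, 0.25

# beta -> beta + l*P*alpha shifts beta1 by 2*l*P; x -> x - l*H*alpha shifts x1 by -l*H.
lhs = h_su2(beta1 + 2 * l * P, x1, l, P, H, n)
rhs = cmath.exp(2j * cmath.pi * P * 2 * x1) * h_su2(beta1, x1 - l * H, l, P, H, n)
rel_err = abs(lhs - rhs) / abs(rhs)
print(rel_err)
```

The two sides agree to floating-point accuracy; the check uses only that $l$, $P$, $H$ are integers and that the root lattice is even.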
In order to present the final result for the contribution of reducible flat connections to the partition function of Chern-Simons theory on Seifert spaces, we have to collect the prefactors, including the phases. Define as in [30]:
$$\phi = 3\,{\rm sign}\Big(\frac{H}{P}\Big) + \sum_{i=1}^{n}\Big(12\, s(q_i, p_i) - \frac{q_i}{p_i}\Big). \tag{4.18}$$
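To make (4.18) concrete, the sketch below evaluates $\phi$ exactly, assuming that $s(q,p)$ is the standard Dedekind sum $s(q,p) = \sum_{k=1}^{p-1}((k/p))((kq/p))$ with the sawtooth function $((x))$. As a cross-check it also evaluates the first universal invariant $I_\theta = -\frac{1}{48}\big(\phi - \frac{P}{H}(2 + E_1)\big)$, with $E_1 = -n + \sum_i p_i^{-2}$, as quoted in (5.34) later in the text. The function names and the sample data $p = (2,3,5)$, $q = (-1,1,1)$ are our own illustrative choices.

```python
from fractions import Fraction
from math import floor, prod


def sawtooth(x):
    """((x)): 0 at integers, x - floor(x) - 1/2 otherwise."""
    return Fraction(0) if x.denominator == 1 else x - floor(x) - Fraction(1, 2)


def dedekind(q, p):
    """Standard Dedekind sum s(q, p) = sum_{k=1}^{p-1} ((k/p)) ((k q/p))."""
    return sum(sawtooth(Fraction(k, p)) * sawtooth(Fraction(k * q, p))
               for k in range(1, p))


def phi(p, q):
    """The phase phi of (4.18)."""
    P = prod(p)
    H = P * sum(Fraction(qj, pj) for pj, qj in zip(p, q))
    sgn = 1 if H / P > 0 else -1
    return 3 * sgn + sum(12 * dedekind(qj, pj) - Fraction(qj, pj)
                         for pj, qj in zip(p, q))


def first_invariant(p, q):
    """I_theta from the first line of (5.34)."""
    P = prod(p)
    H = P * sum(Fraction(qj, pj) for pj, qj in zip(p, q))
    E1 = sum(Fraction(1, pj * pj) - 1 for pj in p)
    return -Fraction(1, 48) * (phi(p, q) - (P / H) * (2 + E1))


p, q = (2, 3, 5), (-1, 1, 1)
print(phi(p, q), first_invariant(p, q))
```

For these data one finds $\phi = 181/30$ and $I_\theta = -1/2$, i.e. $\lambda(M) = -1$ by (5.35), which has the magnitude of the Casson invariant of the Poincaré sphere (whose sign depends on orientation conventions).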
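Therefore, the contribution of reducible flat connections to the Chern-Simons partition function of $X\big(\tfrac{p_1}{q_1}, \dots, \tfrac{p_n}{q_n}\big)$ is given by
$$\frac{(-1)^{|\Delta_+|}}{|\mathcal{W}|\,(2\pi i)^r}\, \frac{{\rm Vol}\,\Lambda_w}{{\rm Vol}\,\Lambda_r}\, \frac{[{\rm sign}(P)]^{|\Delta_+|}}{|P|^{r/2}}\, e^{\frac{\pi i d}{4}{\rm sign}(H/P) - \frac{\pi i d y}{12 l}\phi} \sum_{t\in\Lambda_r/H\Lambda_r}\int d\beta\; e^{-\beta^2/2\hat{x} - l\, t\cdot\beta}\; \frac{\prod_{i=1}^{n}\prod_{\alpha>0} 2\sinh\frac{\beta\cdot\alpha}{2p_i}}{\prod_{\alpha>0}\Big(2\sinh\frac{\beta\cdot\alpha}{2}\Big)^{n-2}}. \tag{4.19}$$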
In this equation, $\phi$ is given by (4.18), and in obtaining the phase factor we have made use of the Freudenthal-De Vries formula
$$\rho^2 = \frac{1}{12}\, d\, y. \tag{4.20}$$
We have also introduced the hatted coupling constant
$$\hat{x} = \frac{P x}{H}, \tag{4.21}$$
where $x$ is the coupling constant given in (3.3). In the evaluation of the above integral we can rotate the integration contour $C^r$ to $\mathbb{R}^r$ as long as we are careful with phases in the Gaussian integral, as explained for example in [44]. If we specialize (4.19) to $G = SU(2)$, we obtain the result derived in [30]. The expression (4.19) is in principle only valid for simply-laced groups, although the results for the perturbative series turn out to be valid for any gauge group. Notice that, in the sum over $\Lambda_r/H\Lambda_r$, the $t$'s that are related by Weyl transformations correspond to the same flat connection. Fortunately, each of the integrals in (4.19) is invariant under Weyl permutations of $t$, so in order to consider the contribution of a given flat connection, one can just evaluate (4.19) for a particular representative and then multiply by the corresponding degeneracy factor (i.e. the number of Weyl-equivalent $t$ configurations giving the same flat connection). If one is just interested in obtaining the contribution of the trivial connection, one can use the shorter arguments of [36] and end up with (4.19) with $t = 0$. The contribution of the reducible connections can also be obtained by generalizing the arguments of [37] to the higher rank situation.

4.3. Connection to matrix models. In (4.19) we have written the contribution of reducible connections to the Chern-Simons partition function in terms of an integral over the Cartan subalgebra, since $d\beta = \prod_{i=1}^{r} d\beta_i$ and $\beta_i$ are the Dynkin coordinates. In fact, the above expression can be interpreted as the partition function of a random matrix model (for a review of random matrices, see [34, 24]). To see this, let us consider a
slight generalization of the above results to the $U(N)$ and $O(2r)$ theories. The partition function for these groups can be obtained by writing $\beta$ in terms of the orthonormal basis in the space of weights. Let us first consider the case of $U(N)$. Denote the orthonormal basis as $\{e_k\}_{k=1,\dots,N}$, and put $\beta = \sum_k \beta_k e_k$ (where $\beta_k$ are taken to be independent variables), $t = \sum_k t_k e_k$. It is well-known that the positive roots can be written as
$$\alpha_{kl} = e_k - e_l, \qquad 1 \le k < l \le N. \tag{4.22}$$
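Therefore, the integral in (4.19) becomes
$$\int d\beta\; e^{-\sum_k \beta_k^2/2\hat{x} - l\sum_k t_k\beta_k}\; \frac{\prod_{i=1}^{n}\prod_{k<l} 2\sinh\frac{\beta_k - \beta_l}{2p_i}}{\prod_{k<l}\Big(2\sinh\frac{\beta_k - \beta_l}{2}\Big)^{n-2}}. \tag{4.23}$$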
We can interpret the $\beta_k$ as the eigenvalues of a Hermitian matrix in a Gaussian potential and interacting through
$$\sum_{i=1}^{n}\sum_{k<l}\log\Big(2\sinh\frac{\beta_k - \beta_l}{2p_i}\Big) + (2-n)\sum_{k<l}\log\Big(2\sinh\frac{\beta_k - \beta_l}{2}\Big). \tag{4.24}$$
Notice moreover that for a small separation of the eigenvalues (4.24) becomes, at leading order,
$$\sum_{k<l}\log(\beta_k - \beta_l)^2, \tag{4.25}$$
which is the interaction between eigenvalues of the standard Hermitian matrix model. Therefore, the integral above can be interpreted as a nonlinear deformation of the usual Gaussian unitary ensemble (GUE). In fact, as we will see in detail in the next section, Chern-Simons perturbation theory means that we expand around the GUE, and the perturbative corrections are obtained by evaluating averages in this ensemble. Note that the non-trivial reducible flat connections, labeled by $t$, are interpreted in the matrix model language as a source term coupling linearly to the eigenvalues. Similar considerations apply to the orthogonal group $O(2r)$. The positive roots can be written in terms of an orthonormal basis as follows:
$$\alpha^{\pm}_{kl} = e_k \pm e_l, \qquad 1 \le k < l \le r, \tag{4.26}$$
and the interaction between the eigenvalues reduces again, in the limit of small separation, to
$$\sum_{k<l}\log(\beta_k^2 - \beta_l^2)^2, \tag{4.27}$$
which is the eigenvalue interaction of the orthogonal ensemble O(N ) for even N .
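The small-separation limit relating (4.24) to (4.25) can be illustrated numerically for a single pair of eigenvalues. In the sketch below (function name and sample values $p = (2,3,5)$ are our own), the pair interaction of (4.24) at separation $\delta$ is compared with $\log\delta^2 - \sum_i\log p_i$; the additive constant $-\sum_i \log p_i$ does not depend on the eigenvalues and is dropped in (4.25).

```python
import math


def pair_interaction(delta, p):
    """Contribution of one eigenvalue pair to (4.24) at separation delta."""
    n = len(p)
    return (sum(math.log(2 * math.sinh(delta / (2 * pj))) for pj in p)
            + (2 - n) * math.log(2 * math.sinh(delta / 2)))


p = (2, 3, 5)
const = -sum(math.log(pj) for pj in p)   # eigenvalue-independent shift
diffs = []
for delta in (1e-1, 1e-2, 1e-3):
    diff = pair_interaction(delta, p) - (math.log(delta ** 2) + const)
    diffs.append(diff)
    print(delta, diff)
```

The difference shrinks like $\delta^2$, as expected from the leading correction to $\log\sinh$.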
5. Asymptotic Expansion and Matrix Integrals

5.1. Asymptotic expansion of the exact result. In this subsection we will study the asymptotic expansion of the exact result obtained in the previous section for the contribution of the trivial connection $t = 0$. The expression (4.19) is very well suited for an asymptotic expansion in powers of $x$: we just have to expand the integrand in a power series of $\beta$, and integrate the result term by term with the Gaussian weight. The integrand has the expansion:
$$\frac{\prod_{i=1}^{n}\prod_{\alpha>0} 2\sinh\frac{\beta\cdot\alpha}{2p_i}}{\prod_{\alpha>0}\Big(2\sinh\frac{\beta\cdot\alpha}{2}\Big)^{n-2}} = \frac{1}{P^{|\Delta_+|}}\prod_{\alpha>0}(\beta\cdot\alpha)^2\, f(\beta), \tag{5.1}$$
where $f(\beta)$ has the form
$$f(\beta) = \prod_{\alpha>0}\Big(1 + \sum_{s=1}^{\infty} a_s\,(\beta\cdot\alpha)^{2s}\Big). \tag{5.2}$$
The coefficients $a_s$ can be obtained in a very straightforward way from (5.1). They are polynomials of degree $s$ in $n$ and in the power sums
$$\pi_j = \sum_{i=1}^{n} p_i^{-2j}. \tag{5.3}$$
One has, for example,
$$a_1 = \frac{1}{24}(\pi_1 + 2 - n), \qquad a_2 = \frac{1}{5760}\big(16 + 5n^2 - 18n - 10n\pi_1 + 20\pi_1 + 5\pi_1^2 - 2\pi_2\big). \tag{5.4}$$
Let us analyze in more detail the structure of $f(\beta)$. Define
$$\sigma_j(\beta) = \sum_{\alpha>0}(\beta\cdot\alpha)^{2j}. \tag{5.5}$$
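The coefficients in (5.4) can be reproduced by expanding the single-root factor of (5.1), $\prod_i \frac{\sinh(u/2p_i)}{u/2p_i}\Big/\Big(\frac{\sinh(u/2)}{u/2}\Big)^{n-2}$ with $u = \beta\cdot\alpha$, as a truncated power series in $u^2$. The following is a minimal sketch with exact rationals; the helper names, the truncation order `K` and the sample values of $p_i$ are our own.

```python
from fractions import Fraction
from math import factorial

K = 3  # keep terms up to u^(2K)


def sinhc(p):
    """Coefficients of sinh(z)/z in powers of u^2, where z = u/(2p)."""
    return [Fraction(1, factorial(2 * m + 1) * (2 * p) ** (2 * m))
            for m in range(K + 1)]


def mul(a, b):
    """Product of two truncated series."""
    c = [Fraction(0)] * (K + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= K:
                c[i + j] += ai * bj
    return c


def inv(a):
    """Reciprocal of a truncated series with a[0] == 1."""
    b = [Fraction(1)] + [Fraction(0)] * K
    for m in range(1, K + 1):
        b[m] = -sum(a[i] * b[m - i] for i in range(1, m + 1))
    return b


def a_coeffs(p):
    """[1, a_1, a_2, ...] of (5.2) for surgery data p = (p_1, ..., p_n)."""
    n = len(p)
    series = [Fraction(1)] + [Fraction(0)] * K
    for pj in p:
        series = mul(series, sinhc(pj))
    step = inv(sinhc(1)) if n >= 2 else sinhc(1)
    for _ in range(abs(n - 2)):
        series = mul(series, step)
    return series


def a_closed(p):
    """a_1 and a_2 as given in (5.4)."""
    n = len(p)
    pi1 = sum(Fraction(1, pj ** 2) for pj in p)
    pi2 = sum(Fraction(1, pj ** 4) for pj in p)
    a1 = (pi1 + 2 - n) / 24
    a2 = (16 + 5 * n ** 2 - 18 * n - 10 * n * pi1 + 20 * pi1
          + 5 * pi1 ** 2 - 2 * pi2) / 5760
    return a1, a2


print(a_coeffs((2, 3, 5))[1:3], a_closed((2, 3, 5)))
```

The series coefficients and the closed forms agree exactly for any choice of the $p_i$.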
By taking the log of (5.2), one finds:
$$f(\beta) = \exp\Big(\sum_{k=1}^{\infty} a^{(c)}_k\,\sigma_k(\beta)\Big), \tag{5.6}$$
where the connected coefficients $a^{(c)}_k$ are defined in the usual way: $\log\big(1 + \sum_k a_k x^k\big) = \sum_k a^{(c)}_k x^k$. An explicit expression for $f(\beta)$ can be obtained as follows. Let $\vec{k} = (k_1, k_2, \dots)$ be a vector whose components are nonnegative integers. Denote $\ell = \sum_j j\,k_j$, and define:
$$a^{(c)}_{\vec{k}} = \prod_j \big(a^{(c)}_j\big)^{k_j}, \qquad \sigma_{\vec{k}}(\beta) = \prod_j \sigma_j^{k_j}(\beta). \tag{5.7}$$
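Then,
$$f(\beta) = 1 + \sum_{\vec{k}} \frac{1}{\vec{k}!}\, a^{(c)}_{\vec{k}}\,\sigma_{\vec{k}}(\beta), \tag{5.8}$$
with $\vec{k}! = \prod_j k_j!$ and the sum is over all vectors $\vec{k}$. We see that the perturbative expansion of the partition function can be written in terms of the quantities
$$R_{\vec{k}}(G) = \frac{\int d\beta\,\Delta^2(\beta)\, e^{-\beta^2/2}\,\sigma_{\vec{k}}(\beta)}{\int d\beta\,\Delta^2(\beta)\, e^{-\beta^2/2}}, \tag{5.9}$$
where we have denoted
$$\Delta^2(\beta) = \prod_{\alpha>0}(\beta\cdot\alpha)^2. \tag{5.10}$$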
Notice that, when we write $\beta$ in terms of the orthogonal basis (4.22) or (4.26), (5.10) is indeed the square of the Vandermonde determinant in the variables $\beta_j$ (for $U(N)$) or $\beta_j^2$ (for $O(2r)$). Therefore, as we anticipated before, the asymptotic expansion of the integral is an expansion around the corresponding Gaussian ensemble, and the perturbative corrections can be evaluated systematically as averages in this ensemble. We will denote
$$Z_0 = \int d\beta\,\Delta^2(\beta)\, e^{-\beta^2/2}, \tag{5.11}$$
so that the partition function on Seifert spaces can be written, using (4.19), as
$$\log\frac{Z_k(M)}{Z_{\rm 1-loop}} = -\frac{1}{24}\, d\, y\,\phi\, x + \log\Big(1 + \sum_{\ell=1}^{\infty}\;\sum_{\vec{k}\,|\,\sum_j j k_j = \ell} \frac{1}{\vec{k}!}\, a^{(c)}_{\vec{k}}\, R_{\vec{k}}(G)\,\hat{x}^{\ell}\Big). \tag{5.12}$$
In this equation $Z_{\rm 1-loop}$ is given by
$$Z_{\rm 1-loop} = \frac{(-1)^{|\Delta_+|}}{|\mathcal{W}|\,(2\pi i)^r}\, \frac{{\rm Vol}\,\Lambda_w}{{\rm Vol}\,\Lambda_r}\, \frac{e^{\frac{\pi i d}{4}{\rm sign}(H/P)}}{|P|^{d/2}}\, Z_0\,\hat{x}^{d/2}, \tag{5.13}$$
and indeed gives the one-loop contribution around the trivial connection. This follows by comparing the exact result with the perturbative expansion
$$\log\frac{Z_k(M)}{Z_{\rm 1-loop}} = \sum_{\ell=1}^{\infty}\;\sum_{\Gamma\in\mathcal{A}(\emptyset)^{\rm conn}} r_{\Gamma}(G)\, I_{\Gamma}(M)\, x^{\ell}. \tag{5.14}$$
We also see that, by comparing (5.12) and (5.14), we can extract the value of the universal perturbative invariants $I_{\Gamma}(M)$ at each order $x^{\ell}$. In order to do that we just have to evaluate $R_{\vec{k}}(G)$ for all vectors $\vec{k}$ with $\sum_j j\,k_j \le \ell$, and also the group factors $r_{\Gamma}(G)$ for graphs with $2\ell$ vertices. Of course, from a mathematical point of view it is not obvious that the asymptotic expansion of the exact partition function has the structure predicted by the perturbation theory analysis. The fact that this is the case provides an important consistency check of the procedure.
5.2. Evaluating the integrals. We now address the problem of computing the integrals in (5.9). As we explained in Sect. 4, the partition function of Chern-Simons theory on Seifert spaces can be interpreted as a matrix model with an interaction between eigenvalues of the form $\log(\sinh(\beta_i - \beta_j))$. In the perturbative approach we have to expand the sinh in power series, and the integrals $R_{\vec{k}}(G)$ are nothing but averages of symmetric polynomials in the eigenvalues in a Gaussian matrix model. We will present two methods to compute these averages. The first method gives the complete answer only up to $\ell = 5$, but it has the advantage of providing general expressions for any simply-laced gauge group. The starting point is the following identity:
$$\int d\beta\; e^{-\frac{a}{2}\beta^2}\prod_{\alpha>0} 4\sinh\frac{t(\beta\cdot\alpha)}{2}\,\sinh\frac{s(\beta\cdot\alpha)}{2} = |\mathcal{W}|\,(\det C)^{\frac12}\Big(\frac{2\pi}{a}\Big)^{r/2} e^{\frac{t^2+s^2}{2a}\rho^2}\prod_{\alpha>0} 2\sinh\frac{ts(\rho\cdot\alpha)}{2a}, \tag{5.15}$$
where $C$ is the Cartan matrix of the group. This formula is easily proved by using (4.10). Another useful fact is that $\sigma_1(\beta)$ can be written as (see [14], pp. 519–20)
$$\sum_{\alpha>0}(\beta\cdot\alpha)^2 = y\,\beta^2. \tag{5.16}$$
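For $SU(N)$ the identity (5.16) can be seen directly in the orthonormal basis of (4.22): with $\beta = \sum_k\beta_k e_k$ restricted to $\sum_k\beta_k = 0$, one has $\sum_{k<l}(\beta_k - \beta_l)^2 = N\sum_k\beta_k^2$, and $y = N$. A minimal exact check, with arbitrary sample values of our own choosing:

```python
from fractions import Fraction
from itertools import combinations


def sigma1_su(beta):
    """sigma_1(beta) = sum over positive roots e_k - e_l of (beta . alpha)^2."""
    return sum((bk - bl) ** 2 for bk, bl in combinations(beta, 2))


# Arbitrary rational sample, projected onto the traceless (SU(N)) subspace.
raw = [Fraction(3), Fraction(-1, 2), Fraction(5, 3), Fraction(-7, 4)]
mean = sum(raw) / len(raw)
beta = [b - mean for b in raw]

N = len(beta)                       # dual Coxeter number y = N for SU(N)
lhs = sigma1_su(beta)
rhs = N * sum(b * b for b in beta)
print(lhs, rhs)
```

Without the traceless projection the right-hand side would acquire the extra term $-(\sum_k\beta_k)^2$, which is the $U(1)$ part that distinguishes $U(N)$ from $SU(N)$.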
One can easily show that, by expanding (5.15) in $s$, $t$, and by using (5.16), it is possible to determine the integrals $R_{\vec{k}}(G)$ for any gauge group up to $\ell = 5$, which is enough for the computational purposes of the present paper. The answer is given in terms of $y$, $d$, and the quantities
$$\alpha_k = \sum_{\alpha>0}(\alpha\cdot\rho)^{2k}. \tag{5.17}$$
For example, one finds:
$$R_{(0,1,0,\dots)}(G) = 5\, d\, y^2. \tag{5.18}$$
The answers obtained by this method are listed in the Appendix. In order to evaluate the integrals (5.9) for arbitrary $\sigma_{\vec{k}}$, it is important to have a more general and systematic method. Here is where the connection to matrix integrals becomes computationally useful. It is easy to see that, since the integrals $R_{\vec{k}}(G)$ are normalized, one can evaluate them in $U(N)$ and $O(2r)$ instead of $SU(N)$ and $SO(2r)$. Therefore, one has
$$R_{\vec{k}}(SU(N)) = \frac{1}{Z_0}\int d\beta\; e^{-\sum_j\beta_j^2/2}\prod_{i<j}(\beta_i - \beta_j)^2\,\sigma_{\vec{k}}(\beta), \tag{5.19}$$
where
$$\sigma_n(\beta) = \sum_{i<j}(\beta_i - \beta_j)^{2n}. \tag{5.20}$$
In (5.19) one integrates over $N$ independent variables $\beta_1, \dots, \beta_N$. It is clear that the $\sigma_{\vec{k}}(\beta)$ are symmetric polynomials in these $N$ variables. One can for example write (5.20) in terms of power sum polynomials $P_j(\beta)$ (defined in (A.8)) as follows,
$$\sigma_n(\beta) = N P_{2n}(\beta) + \frac{1}{2}\sum_{s=1}^{2n-1}(-1)^s\binom{2n}{s} P_s(\beta)\, P_{2n-s}(\beta). \tag{5.21}$$
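The rewriting (5.21) follows from the binomial expansion of $\frac12\sum_{i,j}(\beta_i-\beta_j)^{2n}$, and it is easy to confirm on integer samples. In the sketch below the function names and the test values are our own.

```python
from itertools import combinations
from math import comb


def power_sum(beta, j):
    """P_j(beta) = sum_i beta_i^j, as in (A.8)."""
    return sum(b ** j for b in beta)


def sigma_direct(beta, n):
    """sigma_n(beta) from its definition (5.20)."""
    return sum((bi - bj) ** (2 * n) for bi, bj in combinations(beta, 2))


def sigma_from_power_sums(beta, n):
    """sigma_n(beta) from the right-hand side of (5.21)."""
    N = len(beta)
    cross = sum((-1) ** s * comb(2 * n, s)
                * power_sum(beta, s) * power_sum(beta, 2 * n - s)
                for s in range(1, 2 * n))
    # cross equals 2*sigma_n - 2*N*P_{2n}, so it is even for integer input.
    return N * power_sum(beta, 2 * n) + cross // 2


beta = [3, -1, 4, 1, -5]
for n in (1, 2, 3):
    print(n, sigma_direct(beta, n), sigma_from_power_sums(beta, n))
```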
The averages of symmetric polynomials in the Gaussian unitary ensemble can be evaluated in principle by using the Selberg integral [34], or the results of [10]. A more effective way is the following: any symmetric polynomial in the $\beta_i$'s can be written as a linear combination of Schur polynomials $S_\lambda(\beta)$, which are labeled by Young tableaux associated to a partition $\lambda$ (see (A.6)). Therefore, if we know how to compute the normalized average of a Schur polynomial,
$$\langle S_\lambda(\beta)\rangle = \frac{1}{Z_0}\int d\beta\; e^{-\sum_j\beta_j^2/2}\prod_{i<j}(\beta_i - \beta_j)^2\, S_\lambda(\beta), \tag{5.22}$$
we can compute all $R_{\vec{k}}$. An explicit expression for (5.22) has been presented in [25]. The result is the following: let $|\lambda|$ be the total number of boxes in the tableau labeled by $\lambda$, and let $\lambda_i$ denote the number of boxes in the $i$-th row of the Young tableau. Define now the $|\lambda|$ integers $f_i$ as follows
$$f_i = \lambda_i + |\lambda| - i, \qquad i = 1, \dots, |\lambda|. \tag{5.23}$$
Following [25], we will say that the Young tableau associated to $\lambda$ is even if the number of odd $f_i$'s is the same as the number of even $f_i$'s. Otherwise, we will say that it is odd. If $\lambda$ is odd, the normalized average $\langle S_\lambda(\beta)\rangle$ vanishes. Otherwise, it is given by:
$$\langle S_\lambda(\beta)\rangle = (-1)^{\frac{A(A-1)}{2}}\, \frac{\prod_{f\ {\rm odd}} f!!\;\prod_{f'\ {\rm even}} f'!!}{\prod_{f\ {\rm odd},\, f'\ {\rm even}}(f - f')}\,\dim\lambda, \tag{5.24}$$
where $A = |\lambda|/2$ (notice that $|\lambda|$ has to be even in order to have a non vanishing result). Here $\dim\lambda$ is the dimension of the irreducible representation of $SU(N)$ associated to $\lambda$, and can be computed by using the hook formula. This expression solves the problem of computing the averages (5.19) in the general case: we express the product of power sums appearing in (5.21) in terms of Schur polynomials by using the Frobenius formula (A.9), and then we compute the averages of these with (5.24). As an example of this procedure, let us compute $R_{(0,1,0,\dots)}(SU(N))$. Using (5.21) and the Frobenius formula (A.9), we find (labeling each Young tableau by its partition):
$$\sigma_2 = (N-1)\,S_{(4)} - (N+1)\,S_{(1,1,1,1)} + (N-3)\,S_{(2,1,1)} - (N+3)\,S_{(3,1)} + 10\,S_{(2,2)}. \tag{5.25}$$
The averages of the different Schur polynomials can be computed from (5.24), and we obtain, after some simple algebra:
$$R_{(0,1,0,\dots)}(SU(N)) = 5N^2(N^2 - 1), \tag{5.26}$$
in agreement with (5.18).
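The value (5.26) can also be cross-checked without Schur polynomials. The sketch below does not use (5.24); instead it is our own independent route, assuming the standard fact that the eigenvalue integral (5.19) equals a Hermitian matrix integral with weight $e^{-{\rm tr}M^2/2}$, for which $\langle M_{ij}M_{kl}\rangle = \delta_{il}\delta_{jk}$ and each Wick pairing of a product of traces contributes $N^{\#{\rm faces}}$. Combined with (5.21) for $n=2$, i.e. $\sigma_2 = N P_4 - 4P_1P_3 + 3P_2^2$, this gives $R_{(0,1,0,\dots)}$. All function names are hypothetical.

```python
def matchings(items):
    """Yield all perfect matchings of a list as lists of pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        for m in matchings(rest[:i] + rest[i + 1:]):
            yield [(first, partner)] + m


def gue_trace_average(lengths, N):
    """<prod_v tr M^lengths[v]> for weight exp(-tr M^2 / 2), via Wick pairings."""
    total = sum(lengths)
    if total % 2:
        return 0
    # nxt[h]: next half-edge around the vertex (trace) containing h.
    nxt, start = {}, 0
    for length in lengths:
        for k in range(length):
            nxt[start + k] = start + (k + 1) % length
        start += length
    result = 0
    for m in matchings(list(range(total))):
        pair = {}
        for a, b in m:
            pair[a], pair[b] = b, a
        # Faces are the cycles of h -> nxt[pair[h]]; each gives a free index sum.
        seen, faces = set(), 0
        for h in range(total):
            if h in seen:
                continue
            faces += 1
            g = h
            while g not in seen:
                seen.add(g)
                g = nxt[pair[g]]
        result += N ** faces
    return result


def sigma2_average(N):
    """<sigma_2> from sigma_2 = N P_4 - 4 P_1 P_3 + 3 P_2^2, cf. (5.21)."""
    return (N * gue_trace_average([4], N)
            - 4 * gue_trace_average([1, 3], N)
            + 3 * gue_trace_average([2, 2], N))


for N in range(2, 6):
    print(N, sigma2_average(N), 5 * N ** 2 * (N ** 2 - 1))
```

The intermediate averages are $\langle{\rm tr}M^4\rangle = 2N^3 + N$, $\langle{\rm tr}M\,{\rm tr}M^3\rangle = 3N^2$ and $\langle({\rm tr}M^2)^2\rangle = N^4 + 2N^2$, which combine to $5N^2(N^2-1)$.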
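Let us now consider the orthogonal ensemble. The averages that we want to compute are given by
$$\frac{1}{Z_0}\int d\beta\; e^{-\sum_{j=1}^{r}\beta_j^2/2}\prod_{1\le i<j\le r}(\beta_i^2 - \beta_j^2)^2\,\sigma_{\vec{k}}(\beta), \tag{5.27}$$
where
$$\sigma_n(\beta) = \sum_{i<j}\Big[(\beta_i + \beta_j)^{2n} + (\beta_i - \beta_j)^{2n}\Big] = (2r - 2^{2n-1})\,P_n(\beta_i^2) + \sum_{s=1}^{n-1}\binom{2n}{2s} P_s(\beta_i^2)\, P_{n-s}(\beta_i^2). \tag{5.28}$$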
The functions $\sigma_{\vec{k}}(\beta)$ are now symmetric polynomials in the $\beta_i^2$, so we can write them in terms of Schur polynomials $S_\lambda(\beta_i^2)$. This allows us to express the integrals (5.27) in terms of the integrals
$$\int_0^{\infty}\cdots\int_0^{\infty} dy\;\Delta^2(y)\,(y_1\cdots y_r)^{\alpha-1}\, e^{-(y_1+\cdots+y_r)/2}\, S_\lambda(y), \tag{5.29}$$
which are a special case of a generalization of the Selberg integral studied by Kadell [27], see also [33]. Their value is given by
$$r!\,\prod_{i=1}^{r}\Gamma(\lambda_i + \alpha + r - i)\prod_{i<j}(\lambda_i - \lambda_j + j - i). \tag{5.30}$$
In our case, $\alpha = 1/2$. The normalized average of a Schur polynomial is then:
$$\langle S_\lambda(\beta_i^2)\rangle = 2^{|\lambda|}\,\dim\lambda\,\prod_{i=1}^{r}\frac{\Gamma(\lambda_i + 1/2 + r - i)}{\Gamma(1/2 + r - i)}. \tag{5.31}$$
In this equation, $\dim\lambda$ denotes the dimension of the representation of $SU(r)$ associated to $\lambda$. This solves the problem of computing the averages (5.27) in the orthogonal ensemble. As a simple example, let us consider again $\vec{k} = (0, 1, 0, \dots)$. It is easy to see that (labeling each Young tableau by its partition)
$$\sigma_2 = (2r - 2)\,S_{(2)} + (14 - 2r)\,S_{(1,1)}, \tag{5.32}$$
and one finds
$$R_{(0,1,0,\dots)}(SO(2r)) = 20\, r(2r-1)(r-1)^2, \tag{5.33}$$
in agreement with (5.18).
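The orthogonal-ensemble example can be reproduced from (5.31) and (5.32) alone. The sketch below assumes that the two tableaux in (5.32) are $\lambda = (2)$ and $\lambda = (1,1)$ (this follows from (5.28) with $n=2$, since $P_2 = S_{(2)} - S_{(1,1)}$ and $P_1^2 = S_{(2)} + S_{(1,1)}$), evaluates $\dim\lambda$ with the hook-content formula, and uses $\Gamma(a+k)/\Gamma(a) = a(a+1)\cdots(a+k-1)$. Function names are our own.

```python
from fractions import Fraction


def gamma_ratio(a, k):
    """Gamma(a + k) / Gamma(a) for an integer k >= 0."""
    out = Fraction(1)
    for m in range(k):
        out *= a + m
    return out


def dim_su(lam, r):
    """Dimension of the SU(r) irrep with Young diagram lam (hook-content formula)."""
    conj = [sum(1 for row in lam if row > c) for c in range(lam[0])]
    out = Fraction(1)
    for i, row in enumerate(lam):
        for j in range(row):
            hook = (row - j) + (conj[j] - i) - 1
            out *= Fraction(r + j - i, hook)
    return out


def schur_average(lam, r):
    """Right-hand side of (5.31); assumes len(lam) <= r."""
    rows = list(lam) + [0] * (r - len(lam))
    out = 2 ** sum(lam) * dim_su(lam, r)
    for i, li in enumerate(rows, start=1):
        out *= gamma_ratio(Fraction(1, 2) + r - i, li)
    return out


def R_010(r):
    """R_(0,1,0,...)(SO(2r)) from (5.32) and (5.31)."""
    return ((2 * r - 2) * schur_average((2,), r)
            + (14 - 2 * r) * schur_average((1, 1), r))


for r in range(2, 7):
    print(r, R_010(r), 20 * r * (2 * r - 1) * (r - 1) ** 2)
```

With $d = r(2r-1)$ and $y = 2r-2$ for $SO(2r)$, the result $20\,r(2r-1)(r-1)^2$ is indeed $5dy^2$.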
5.3. Universal perturbative invariants up to order 5. Using the above ingredients, it is easy to find the universal perturbative invariants of Seifert spaces up to order 5. Although the coefficients $a_s$ in (5.2) are functions of $n$ and the Newton polynomials $P_k(p_i^{-2})$, the answer turns out to be more compact when written in terms of elementary symmetric polynomials $E_k$ in the variables $p_i^{-2} - 1$ by using (A.10) (so for example $E_1 = -n + \sum_{i=1}^{n} p_i^{-2}$). One finds,
$$\begin{aligned}
I_{\theta} &= -\frac{1}{48}\Big(\phi - \frac{P}{H}(2 + E_1)\Big),\\
I_{\theta_2} &= \frac{1}{1152}\Big(\frac{P}{H}\Big)^2 (1 + E_1 + E_2),\\
I_{\theta_3} &= \frac{1}{13824}\Big(\frac{P}{H}\Big)^3 E_3,\\
I_{\theta_4} &= \frac{1}{11059200}\Big(\frac{P}{H}\Big)^4 \Big(82E_4 - 46E_3 - 18(1 + E_2 + E_3)E_1 - 9E_2^2 - 18E_2 - 9E_1^2 - 9\Big),\\
I_{\omega} &= \frac{1}{1382400}\Big(\frac{P}{H}\Big)^4 \Big(2E_4 - 6E_3 + 2E_1(1 + E_2 - E_3) + E_2^2 + 2E_2 + E_1^2 + 1\Big),\\
I_{\theta_5} &= \frac{1}{66355200}\Big(\frac{P}{H}\Big)^5 \Big(55E_5 + E_4(27E_1 - 56) - E_3(9E_2 + 36E_1 + 8)\Big),\\
I_{\omega\theta} &= \frac{1}{8294400}\Big(\frac{P}{H}\Big)^5 \Big(5E_5 - E_4(3E_1 + 16) + E_3(E_2 + 4E_1 + 12)\Big).
\end{aligned} \tag{5.34}$$
Note that the first universal invariant is given by
$$I_{\theta} = \frac{\lambda(M)}{2}, \tag{5.35}$$
where $\lambda(M)$ is the Casson invariant of $M$, in accord with the general result of [36] and with the result for the LMO invariant [31]. We have also checked that the value for $I_{\theta_2}$ listed above agrees with the value obtained in [9] for the LMO invariant using the Aarhus integral. It would be interesting to see if these invariants have some nice integrality properties. The perturbative $SU(2)$ invariants of integral homology spheres do exhibit some integrality properties (discussed for example in [30]), but they include extra factors coming from the $SU(2)$ group weights. In general, the invariants listed above are rational even for integral homology spheres. For example, $I_{\theta_3} \in \mathbb{Z}/4$ for Brieskorn integral homology spheres. Also, $I_{\theta_2} - \frac{1}{1152} \in \mathbb{Z}/2$ for those spaces.

6. Open Problems

Besides the original motivation of understanding universal perturbative invariants from a field theory point of view, the results presented here also provide a computationally feasible framework to study the Chern-Simons partition function with higher rank gauge groups.
There are various avenues to explore in the context of Chern-Simons theory and the theory of three-manifold invariants. It would be interesting for example to understand the structure of perturbation theory in the background of a reducible nontrivial connection, and work out the asymptotic expansion starting from (4.19). One could also consider other three-manifolds (not necessarily rational homology spheres) and see in particular if the matrix model representation provided here can be generalized to other cases.
There are also various interesting physical contexts in which the results of this paper might be relevant. Let us end by mentioning a few of them:

1) The computation of Rozansky-Witten invariants [38] involves the universal perturbative invariants of three-manifolds that are extracted from Chern-Simons theory, but the weight system is now associated to a hyperKähler manifold (see [39] for a very nice review). Therefore, the universal perturbative invariants of Chern-Simons theory are relevant in Rozansky-Witten theory. On the other hand, this theory is an essential ingredient in the worldvolume theory of M2 membranes in manifolds of $G_2$ holonomy [23], and the computation of the Rozansky-Witten partition function should play a role in understanding membrane instanton effects in $G_2$ compactifications of M theory.

2) Chern-Simons theory on a three-manifold $M$ describes topological A branes wrapping the Lagrangian submanifold $M$ in the Calabi-Yau $T^*M$ [45]. Moreover, the perturbative invariants of Chern-Simons theory correspond to topological open string amplitudes on that target. Having a systematic procedure to compute Chern-Simons perturbative invariants may prove to be useful in further understanding topological open strings in those backgrounds.

3) Another consequence of our results is that topological A branes in $T^*M$ are described by a matrix model, when $M$ is a Seifert sphere. It has been recently shown [16] that topological B branes on some noncompact Calabi-Yau spaces are described by a Hermitian matrix model characterized by a potential $W(\Phi)$ with multiple cuts. It would be interesting to know if there is a relation between the matrix models of [16] and the ones presented here. Notice that according to our results the partition function of $U(N)$ Chern-Simons theory on $S^3$ can be written as
$$Z = \frac{e^{-\frac{x}{12}N(N^2-1)}}{N!}\int\prod_{i=1}^{N}\frac{d\beta_i}{2\pi}\; e^{-\sum_i\beta_i^2/2x}\prod_{i<j}\Big(2\sinh\frac{\beta_i - \beta_j}{2}\Big)^2. \tag{6.1}$$
This describes open topological A strings on $T^*S^3$ with $N$ branes wrapping $S^3$, or equivalently (after the geometric transition of [21]) closed topological strings on the resolved conifold. In the limit $x \to 0$, (6.1) gives the standard Gaussian model, as we have argued at length in this paper. On the other hand, it is shown in [16] that the Gaussian model (which corresponds to $W(\Phi) = \Phi^2$) describes type B topological strings in the deformed conifold geometry. This is consistent with the fact that, as explained in [41], the deformed conifold geometry gives the mirror of the resolved conifold only at small 't Hooft coupling $t = Nx$, which for fixed $N$ means precisely small $x$. This also suggests considering multicut matrix models with a potential $W(\Phi)$, as in [16], but where the eigenvalue interaction is not the usual one $\prod_{i<j}(\beta_i - \beta_j)^2$, but $\prod_{i<j}\big(2\sinh((\beta_i - \beta_j)/2)\big)^2$. In view of the above observation, these deformed models might be relevant to understand the mirror of the geometric transition studied in [11, 16]. Indeed, a compact version of this, corresponding to a unitary matrix model where the $\beta_i$ are periodic variables, has been considered in [17] in order to describe the stringy realization of the $\mathcal{N} = 2$ Seiberg-Witten geometry.

4) Although in this paper the focus has been on the perturbative expansion of the partition function, the matrix model is also very useful to understand its large $N$ expansion. This is an interesting problem in itself, and it would be nice to see what the connection is to the approach of [18]. But of course the large $N$ expansion of these models is particularly interesting in view of the large $N$ dualities involving
Chern-Simons theory [20, 21]. Although these dualities are not expected to hold for arbitrary Seifert spaces, the results presented here may be useful to understand in detail the case of lens spaces (already analyzed in [20]) and shed light on the situation for more general three-manifolds. Acknowledgements. I would like to thank Bobby Acharya, Mina Aganagic, Emanuel Diaconescu, Jaume Gomis, Rajesh Gopakumar, Shinobu Hikami, Jose Labastida, Greg Moore, Boris Pioline, Justin Sawon, Toshie Takata, Miguel Tierz and Cumrun Vafa for useful discussions and correspondence. Thanks to Justin too for providing me the graphs of his thesis. This work is supported by the grant NSF–PHY/98–02709.
A. Appendix

A.1. Group theory factors. We first present the group theory factors associated to the connected graphs that give a basis of $\mathcal{A}(\emptyset)^{\rm conn}$ up to order 5. The evaluation of these factors is straightforward by using the graphical techniques of Cvitanović [12], and rather immediate for all of them (except for $r_\omega(G)$, that gives the quartic Casimir in the adjoint and has been computed for all gauge groups in the second reference of [12]). Our conventions are as in [12]: the Lie algebra in the defining representation has Hermitian generators $T_i$, $i = 1, \dots, d$ satisfying the commutation relations $[T_i, T_j] = i C_{ijk} T_k$. The generators are normalized in such a way that the quadratic Casimir of the adjoint representation $C_A$ (which is defined here by $C_A\delta_{ij} = \sum_{k,l} C_{ikl} C_{jkl}$) is twice the dual Coxeter number. This implies that ${\rm Tr}(T_i T_j) = a\,\delta_{ij}$ with $a = 1$ for $SU(N)$ and $Sp(N)$, and $a = 2$ for $SO(N)$ (notice that these normalizations differ from the ones in [1, 2]). The group factor will be written in terms of the dual Coxeter number $y$, the dimension of the group $d$, and $\alpha_2$ (where $\alpha_k$ is defined in (5.17)). Their values for $SU(N)$, $SO(N)$ are listed in Table 2. The results for $Sp(N)$ follow from the $Sp(N) = SO(-N)$ relation [13], so for $Sp(N)$ one has $d = N(N+1)/2$, $y = N + 2$ and $\alpha_2^{Sp(N)}(N) = \alpha_2^{SO(N)}(-N)$. The group theory factors for the graphs in (3.8) are listed in Table 3.

A.2. Matrix integrals. We now list the results for the matrix integrals (5.9), up to order 5. The results for $k_1 = 0$ are:
$$\begin{aligned}
R_{(0,1,0,\dots)}(G) &= 5\, d\, y^2,\\
R_{(0,0,1,0,\dots)}(G) &= 35\, d\, y^3,\\
R_{(0,2,0,0,\dots)}(G) &= 25\, d(d+12)\, y^4 - 2880\,\alpha_2,\\
R_{(0,0,0,1,0,\dots)}(G) &= 350\, d\, y^4 - 1680\,\alpha_2,\\
R_{(0,1,1,0,0,\dots)}(G) &= 35\, y\,\big(5 d(d+24)\, y^4 - 1728\,\alpha_2\big),\\
R_{(0,0,0,0,1,0,\dots)}(G) &= 4620\, y\,\big(d\, y^4 - 12\,\alpha_2\big).
\end{aligned} \tag{A.1}$$

Table 2. Dimensions, dual Coxeter numbers and $\alpha_2$ for $SU(N)$ and $SO(N)$

              SU(N)                                   SO(N)
  $d$         $N^2 - 1$                               $\frac{1}{2}N(N-1)$
  $y$         $N$                                     $N - 2$
  $\alpha_2$  $\frac{1}{60}N^2(N^2-1)(2N^2-3)$        $\frac{1}{480}N(N-1)(N-2)(8N^3 - 45N^2 + 54N + 32)$
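The closed forms for $\alpha_2$ in Table 2 can be compared with the definition (5.17) by summing over positive roots directly. The sketch below does this for $SU(N)$, where $\rho\cdot\alpha_{kl} = l - k$, and for $SO(2r)$, where $\rho = (r-1, r-2, \dots, 0)$ in the orthonormal basis of (4.26); it also checks $\alpha_1 = y\rho^2 = dy^2/12$, which follows from (5.16) and (4.20). Function names are our own, and only even $N = 2r$ is tested for $SO(N)$.

```python
from fractions import Fraction
from itertools import combinations


def alpha_k_su(N, k):
    """alpha_k of (5.17) for SU(N): positive roots e_a - e_b with rho.alpha = b - a."""
    return sum((b - a) ** (2 * k) for a, b in combinations(range(N), 2))


def alpha_k_so_even(r, k):
    """alpha_k of (5.17) for SO(2r): roots e_a +/- e_b, rho = (r-1, ..., 1, 0)."""
    rho = [r - 1 - a for a in range(r)]
    return sum((rho[a] - rho[b]) ** (2 * k) + (rho[a] + rho[b]) ** (2 * k)
               for a, b in combinations(range(r), 2))


def alpha2_su_closed(N):
    """alpha_2 for SU(N) as listed in Table 2."""
    return Fraction(N ** 2 * (N ** 2 - 1) * (2 * N ** 2 - 3), 60)


def alpha2_so_closed(N):
    """alpha_2 for SO(N) as listed in Table 2."""
    return Fraction(N * (N - 1) * (N - 2)
                    * (8 * N ** 3 - 45 * N ** 2 + 54 * N + 32), 480)


for N in range(2, 7):
    print("SU", N, alpha_k_su(N, 2), alpha2_su_closed(N))
for r in range(2, 6):
    print("SO", 2 * r, alpha_k_so_even(r, 2), alpha2_so_closed(2 * r))
```

As a consistency check of the conventions, $SO(6)$ and $SU(4)$ give the same value $\alpha_2 = 116$.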
Table 3. Group theory factors for the Feynman graphs up to $\ell = 5$

  $\ell$   graph              group factor
  1        $\theta$           $2 d y$
  2        $\theta_2$         $4 d y^2$
  3        $\theta_3$         $8 d y^3$
  4        $\theta_4$         $16 d y^4$
  4        $\omega$           $18 d y^4 - 480\,\alpha_2$
  5        $\theta_5$         $32 d y^5$
  5        $\omega\theta$     $2y\,(18 d y^4 - 480\,\alpha_2)$
The results for $k_1 > 0$ can be obtained from (A.1) very easily: insertions of $\sigma_1(\beta)$ can be reduced to insertions of $\beta^2$ by using (5.16), and these can be computed by taking derivatives with respect to $a$ in (5.15). We have for example:
$$R_{(2,1,0,\dots)}(G) = 5\, d(d+4)(d+6)\, y^4. \tag{A.2}$$
Finally, the integral $Z_0$ in (5.11) is given by
$$Z_0 = (2\pi)^{\frac{r}{2}}\,|\mathcal{W}|\,(\det C)^{\frac12}\prod_{\alpha>0}(\alpha\cdot\rho). \tag{A.3}$$
Note that this combines with the rest of the factors in (5.13) to produce, up to an overall phase
$$Z_{\rm 1-loop} = \frac{(2\pi)^{|\Delta_+|}}{(l|H|)^{d/2}}\Big(\frac{{\rm Vol}\,\Lambda_w}{{\rm Vol}\,\Lambda_r}\Big)^{\frac12}\prod_{\alpha>0}(\alpha\cdot\rho), \tag{A.4}$$
where we have used that ${\rm Vol}\,\Lambda_r/{\rm Vol}\,\Lambda_w = \det(C)$. If instead of taking the volume of the root lattice in (A.4) we take that of the coroot lattice, the resulting expression for the partition function is probably valid for any gauge group.

A.3. Symmetric polynomials. Here we summarize some ingredients of the elementary theory of symmetric functions that are used in the paper. A standard reference is [33]. Let $x_1, \dots, x_N$ denote a set of $N$ variables. The elementary symmetric polynomials in these variables, $E_m(x)$, are defined as:
$$E_m(x) = \sum_{i_1<\cdots<i_m} x_{i_1}\cdots x_{i_m}. \tag{A.5}$$
The products of elementary symmetric polynomials provide a basis for the symmetric functions of $N$ variables with integer coefficients. Another basis is given by the Schur polynomials, $S_\lambda(x)$, which are labeled by Young tableaux. A tableau will be denoted here by a partition $\lambda = (\lambda_1, \lambda_2, \dots, \lambda_p)$, where $\lambda_i$ is the number of boxes of the $i$-th row of the tableau, and we have $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p$. The total number of boxes of a tableau will be denoted by $|\lambda| = \sum_i \lambda_i$. The Schur polynomials are defined as quotients of determinants,
$$S_\lambda(x) = \frac{\det x_j^{\lambda_i + N - i}}{\det x_j^{N-i}}. \tag{A.6}$$
A third set of symmetric functions is given by the Newton polynomials $P_{\vec{k}}(x)$. These are labeled by vectors $\vec{k} = (k_1, k_2, \dots, k_p)$, where the $k_j$ are nonnegative integers, and they are defined as
$$P_{\vec{k}}(x) = \prod_{j=1}^{p} P_j^{k_j}(x), \tag{A.7}$$
where
$$P_j(x) = \sum_{i=1}^{N} x_i^{j}, \tag{A.8}$$
are power sums. The Newton polynomials are homogeneous of degree $\ell = \sum_j j\,k_j$ and give a basis for the symmetric functions in $x_1, \dots, x_N$ with rational coefficients. They are related to the Schur polynomials through the Frobenius formula,
$$P_{\vec{k}}(x) = \sum_{\lambda}\chi_\lambda(\vec{k})\, S_\lambda(x), \tag{A.9}$$
where the sum is over all tableaux such that $|\lambda| = \ell$, and $\chi_\lambda(\vec{k})$ is the character of the symmetric group in the representation associated to $\lambda$ and evaluated on the conjugacy class associated to $\vec{k}$ (this is the conjugacy class with $k_j$ cycles of length $j$). Finally, one has the following relation between elementary symmetric polynomials and Newton polynomials,
$$E_m(x) = \sum_{\vec{k}}\;\prod_j\frac{(-1)^{k_j(j-1)}}{k_j!\; j^{k_j}}\; P_{\vec{k}}(x), \tag{A.10}$$
where the sum is over all vectors with $\sum_j j\,k_j = m$.
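The relation between elementary symmetric polynomials and Newton polynomials can be tested on integer samples. The sketch below uses the standard form of this relation, with sign $(-1)^{\sum_j k_j(j-1)}$ and weight $1/\prod_j k_j!\,j^{k_j}$ for each vector $\vec{k}$ with $\sum_j j k_j = m$; the vectors $\vec{k}$ are enumerated as partitions of $m$. Function names and sample values are our own.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial, prod


def partitions(m, largest=None):
    """Yield the partitions of m as non-increasing tuples of parts."""
    if largest is None:
        largest = m
    if m == 0:
        yield ()
        return
    for part in range(min(m, largest), 0, -1):
        for rest in partitions(m - part, part):
            yield (part,) + rest


def elementary_direct(x, m):
    """E_m(x) from its definition (A.5)."""
    return sum(prod(c) for c in combinations(x, m))


def elementary_from_power_sums(x, m):
    """E_m(x) as a combination of Newton polynomials P_k(x)."""
    total = Fraction(0)
    for lam in partitions(m):
        k = {j: lam.count(j) for j in set(lam)}          # k_j = number of parts equal to j
        sign = (-1) ** sum(kj * (j - 1) for j, kj in k.items())
        weight = prod(factorial(kj) * j ** kj for j, kj in k.items())
        p_k = prod(sum(xi ** j for xi in x) ** kj for j, kj in k.items())
        total += Fraction(sign * p_k, weight)
    return total


x = [2, -1, 3, 5]
for m in range(5):
    print(m, elementary_direct(x, m), elementary_from_power_sums(x, m))
```

For $m = 2$ this reduces to the familiar $E_2 = \frac12(P_1^2 - P_2)$.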
References ´ 1. Alvarez, M., Labastida, J.M.F.: Numerical knot invariants of finite type from Chern-Simons perturbation theory. Nucl. Phys. B 433, 555 (1995) [Erratum-ibid. B 441, 403 (1995)]; Vassiliev invariants for torus knots. J. Knot Theory Ramifications 5, 779 (1996); Primitive Vassiliev invariants and factorization in Chern-Simons perturbation theory. Commun. Math. Phys. 189, 641 (1997) ´ 2. Alvarez, M., Labastida, J.M.F., P´erez, E.: Vassiliev invariants for links from Chern-Simons perturbation theory. Nucl. Phys. B 488, 677 (1997) ´ 3. Alvarez-Gaum´ e, L., Labastida, J.M.F., Ramallo, A.V.: A note on perturbative Chern-Simons theory. Nucl. Phys. B 334, 103 (1990)
4. Atiyah, M.: On framings of three-manifolds. Topology 29, 1 (1990)
5. Axelrod, S., Singer, I.: Chern-Simons perturbation theory. In: Differential geometric methods in theoretical physics, Singapore: World Scientific, 1991, pp. 3
6. Bar-Natan, D.: Perturbative aspects of the Chern-Simons topological quantum field theory. Ph.D. Thesis, 1991, in http://www.math.toronto.edu.il/∼drorbn/LOP.html, 1991
7. Bar-Natan, D.: On the Vassiliev knot invariants. Topology 34, 423 (1995)
8. Bar-Natan, D., Garoufalidis, S., Rozansky, L., Thurston, D.P.: The Aarhus integral of rational homology spheres, I, II, III. Selecta Mathematica, New Series 8, 315–339; 341–371 (2002), and Selecta Mathematica, To appear
9. Bar-Natan, D., Lawrence, R.: A rational surgery formula for the LMO invariant. Israel J. Math. 140, 29–60 (2004)
10. Brézin, E., Hikami, S.: Characteristic polynomials of random matrices. Commun. Math. Phys. 214, 214 (2000)
11. Cachazo, F., Intriligator, K.A., Vafa, C.: A large N duality via a geometric transition. Nucl. Phys. B 603, 3 (2001)
12. Cvitanović, P.: Group theory for Feynman diagrams in non-Abelian gauge theories. Phys. Rev. D 14, 1536 (1976); Group Theory, Nordita, 1984, updated as Group theory. Exceptional Lie groups as invariance groups http://www.nbi.dk/GroupTheory/postscript.html, 2002
13. Cvitanović, P., Kennedy, A.D.: Spinors in negative dimensions. Phys. Scripta 26, 5 (1982)
14. Di Francesco, P., Mathieu, P., Sénéchal, D.: Conformal field theory, Berlin-Heidelberg-New York: Springer-Verlag, 1997
15. Dijkgraaf, R.: Perturbative topological field theory. In: String theory, gauge theory and quantum gravity ’93, Singapore: World Scientific, 1994, pp. 189
16. Dijkgraaf, R., Vafa, C.: Matrix models, topological strings, and supersymmetric gauge theories. Nucl. Phys. B 644, 3 (2002)
17. Dijkgraaf, R., Vafa, C.: On geometry and matrix models. Nucl. Phys. B 644, 21 (2002)
18. Douglas, M.R.: Chern-Simons-Witten theory as a topological Fermi liquid. http://arxiv.org/abs/hep-th/9403119, 1994
19. Freed, D.S., Gompf, R.E.: Computer calculation of Witten’s three manifold invariant. Commun. Math. Phys. 141, 79 (1991)
20. Gopakumar, R., Vafa, C.: Topological gravity as large N topological gauge theory. Adv. Theor. Math. Phys. 2, 413 (1998)
21. Gopakumar, R., Vafa, C.: On the gauge theory/geometry correspondence. Adv. Theor. Math. Phys. 3, 1415 (1999)
22. Hansen, S.K.: Reshetikhin-Turaev invariants of Seifert 3-manifolds and a rational surgery formula. Algebr. Geom. Top. 1, 627 (2001)
23. Harvey, J.A., Moore, G.W.: Superpotentials and membrane instantons. http://arxiv.org/abs/hep-th/9907026, 1999
24. Itzykson, C., Drouffe, J.-M.: Théorie statistique des champs, 2. Paris: InterEditions, 1989
25. Itzykson, C., Zuber, J.B.: Matrix integration and combinatorics of modular groups. Commun. Math. Phys. 134, 197 (1990)
26. Jeffrey, L.C.: Chern-Simons-Witten invariants of lens spaces and torus bundles, and the semiclassical approximation. Commun. Math. Phys. 147, 563 (1992)
27. Kadell, K.W.J.: The Selberg-Jack symmetric functions. Adv. Math. 130, 33 (1997)
28. Labastida, J.M.F.: Chern-Simons gauge theory: ten years after. http://arxiv.org/abs/hep-th/9905057, 1999
29. Lawrence, R.J.: Asymptotic expansions of Witten-Reshetikhin-Turaev invariants for some simple 3-manifolds. J. Math. Phys. 36, 6106 (1995)
30. Lawrence, R., Rozansky, L.: Witten-Reshetikhin-Turaev invariants of Seifert manifolds. Commun. Math. Phys. 205, 287 (1999)
31. Le, T.T.Q., Murakami, J., Ohtsuki, T.: On a universal perturbative invariant of 3-manifolds. Topology 37, 539 (1998)
32. Lickorish, W.B.R.: An introduction to knot theory. Berlin-Heidelberg-New York: Springer-Verlag, 1998
33. MacDonald, I.G.: Symmetric functions and Hall polynomials. 2nd edn, Oxford: Oxford University Press, 1995
34. Mehta, M.L.: Random matrices. 2nd edn, London-New York: Academic Press, 1991
35. Rozansky, L.: A large k asymptotics of Witten’s invariant of Seifert manifolds. Commun. Math. Phys. 171, 279 (1995)
36. Rozansky, L.: A contribution of the trivial connection to Jones polynomial and Witten’s invariant of 3-D manifolds. 1, 2. Commun. Math. Phys. 175, 275, 297 (1996)
37. Rozansky, L.: Residue formulas for the large k asymptotics of Witten’s invariants of Seifert manifolds: The case of SU(2). Commun. Math. Phys. 178, 27 (1996)
38. Rozansky, L., Witten, E.: HyperKähler geometry and invariants of three-manifolds. Selecta Math. 3, 401 (1997)
39. Sawon, J.: Rozansky-Witten invariants of hyperKähler manifolds. Ph.D. Thesis, http://arxiv.org/abs/math.DG/0404360, 2004
40. Takata, T.: On quantum PSU(N) invariants for Seifert manifolds. J. Knot Theory Ramifications 6, 417 (1997)
41. Vafa, C.: Superstrings and topological strings at large N. J. Math. Phys. 42, 2798 (2001)
42. Verlinde, E.: Fusion rules and modular transformations in 2-D conformal field theory. Nucl. Phys. B 300, 360 (1988)
43. Vogel, P.: Algebraic structure on modules of diagrams. Preprint 1997, at http://www.math.jussieu.fr/˜ vogel
44. Witten, E.: Quantum field theory and the Jones polynomial. Commun. Math. Phys. 121, 351 (1989)
45. Witten, E.: Chern-Simons gauge theory as a string theory. Prog. Math. 133, 637–378 (1995)

Communicated by N.A. Nekrasov
Commun. Math. Phys. 253, 51–79 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1206-4
Communications in
Mathematical Physics
Characteristic Properties of the Scattering Data for the mKdV Equation on the Half-Line
Anne Boutet de Monvel¹, Vladimir Kotlyarov²
¹ Institut de Mathématiques de Jussieu, case 7012, Université Paris 7, 2 place Jussieu, 75251 Paris, France. E-mail: [email protected]
² Mathematical Division, Institute for Low Temperature Physics, 47 Lenin Avenue, 61103 Kharkiv, Ukraine. E-mail: [email protected]
Received: 22 May 2003 / Accepted: 12 March 2004 Published online: 5 November 2004 – © Springer-Verlag 2004
Abstract: In this paper we describe characteristic properties of the scattering data of the compatible eigenvalue problem for the pair of differential equations related to the modified Korteweg-de Vries (mKdV) equation whose solution is defined in some half-strip or in the quarter plane (0 < x < ∞) × [0, T), T ≤ ∞. We suppose that this solution has a C^∞ initial function vanishing as x → ∞, and C^∞ boundary values, vanishing as t → ∞ when T = ∞. We study the corresponding scattering problem for the compatible Zakharov-Shabat system of differential equations associated with the mKdV equation and obtain a representation of the solution of the mKdV equation through Marchenko integral equations of the inverse scattering method. The kernel of these equations is valid only for x ≥ 0 and it takes into account all specific properties of the pair of compatible differential equations in the chosen half-strip or in the quarter plane. The main result of the paper is the collection A–B–C of characteristic properties of the scattering functions given below.

1. Introduction

1.1. Initial-boundary value problems on the half-line. Initial value problems on the whole line for nonlinear integrable equations such as the nonlinear Schrödinger equation, the Korteweg-de Vries equation, the sine-Gordon equation, etc. are well studied. The solvability of the Cauchy problem, multi-soliton solutions, and the proof that these nonlinear equations are completely integrable infinite-dimensional Hamiltonian systems are the most significant results of soliton theory on the whole line. By contrast, the initial-boundary value problem on the half-line for nonlinear integrable equations has not been studied as thoroughly. In the last decade attention to this problem has strongly increased. Among the papers [2–15, 20–33, 36–47] devoted to this problem, the most interesting results were obtained by A.S. Fokas [20] and by A.S. Fokas and A.R. Its [22–24]. Later, in [26], A.S. Fokas proposed a general method for solving boundary value problems for two-dimensional linear and integrable nonlinear partial differential
equations. This method, which was further developed in [21, 27–29], is based on the simultaneous spectral analysis of the two eigenvalue equations of the associated Lax pair. It expresses the solution in terms of the solution of a matrix Riemann-Hilbert problem in the complex plane of the spectral parameter. The spectral functions determining the Riemann-Hilbert problem are expressed in terms of the initial and boundary values of the solution. The fact that these initial and boundary values are in general related can be expressed in a simple way in terms of a global relation satisfied by the corresponding spectral functions. In the framework of this approach we recently found characteristic properties of the scattering data for the compatible Zakharov-Shabat eigenvalue problem associated with focusing and defocusing nonlinear Schrödinger equations on the half-line with initial and boundary functions of Schwartz type [9].

Recently in [11] (see also [33]) an initial-boundary value problem for the mKdV equation on the half-line was analyzed by expressing the solution in terms of the solution of a matrix Riemann-Hilbert problem in the complex k-plane. In particular, it is shown that for a subclass of boundary conditions, the “linearizable boundary conditions”, all spectral functions can be computed from the given initial data by using algebraic manipulations of some “global relation”. Thus in this case, the problem on the half-line can be solved as efficiently as the problem on the whole line. But the general initial-boundary value problem on the half-line remains non-linearizable. Characteristic properties of the spectral functions were not considered in [11]. In this connection the characterization of the spectral functions becomes important. Besides, a description of the characteristic properties of the scattering or spectral data is a very important problem in itself [35].
Most papers initiated by problems on the half-line deal with nonlinear dynamics of the spectral or scattering data. We prefer the approach of A.S. Fokas and A.R. Its where scattering (spectral) data have trivial dynamics. But then analytic properties of the scattering data are more complicated. Therefore it is necessary to give their complete description when, of course, initial and boundary functions belong to suitable classes of functions. We introduce spectral data in a natural way as in ordinary scattering problems:
• First, a “scattering matrix” for the x-equation (by initial function).
• Then, a “scattering matrix” for the t-equation (by boundary functions).
• Finally, a “scattering matrix” for the compatible x- and t-equations as a product.
In this case a kernel of the Marchenko integral equations or a jump matrix of the corresponding Riemann-Hilbert problem has an explicit (x, t)-dependence. That makes it possible to study the asymptotic behavior of the solution of the non-linear problem by using, for example, the powerful steepest descent method of P. Deift and X. Zhou [16, 17], while the non-linear dynamics of the spectral or scattering data makes it almost impossible to obtain effective asymptotics of the solution.
1.2. The initial-boundary value problem for the modified Korteweg-de Vries equation. In this paper we consider the problem of characterizing the “scattering data” for a compatible pair of differential equations attached to the modified Korteweg-de Vries (mKdV) equation. Let q(x, t) be a real-valued solution of the mKdV equation
\[ q_t + q_{xxx} - 6\lambda q^2 q_x = 0, \qquad x \in \mathbb{R}_+,\quad t \in [0, T),\quad T \le \infty,\quad \lambda = \pm 1 \tag{1.1} \]
in the half-strip or quarter xt-plane and suppose that the initial function q(x, 0) = u(x) with x ∈ R+ and the boundary values
\[ q(0, t) = v(t), \qquad q_x(0, t) = v_1(t), \qquad q_{xx}(0, t) = v_2(t) \qquad \text{with } t \in [0, T],\ T \le \infty \]
are C^∞, and that u(x) ∈ S(R+), where S(R+) is the Schwartz space of rapidly decreasing functions on R+, i.e. C^∞ functions whose derivatives of any order n ≥ 0 vanish at infinity faster than any negative power of x. For T = ∞ the boundary values are also supposed to be rapidly decreasing: v(t), v_1(t), v_2(t) ∈ S(R+).

Remark. We consider here the IBV problem for the mKdV equation (1.1) in the first quarter (x ≥ 0, t ≥ 0) of the xt-plane. This problem differs from that studied in [11] which is also on the first quarter but for the mKdV equation of the form: q_t − q_xxx + 6λq² q_x = 0. This form can be easily reduced to (1.1), but then the IBV problem is on the second quarter (x ≤ 0, t ≥ 0) of the xt-plane. Scattering (spectral) data for that problem have different analytic properties. It is well-known that for KdV and mKdV equations there are differences between the IBV problems for x > 0 and for x < 0.

To study the solution q(x, t) we shall use spectral analysis of a compatible eigenvalue problem for the linear x-equation
\[ w_x + ik\sigma_3 w = Q(x, t)w, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad Q(x, t) = \begin{pmatrix} 0 & q(x, t) \\ \lambda q(x, t) & 0 \end{pmatrix}, \tag{1.2} \]
and for the linear t-equation
\[ w_t + 4ik^3\sigma_3 w = \hat{Q}(x, t, k)w, \qquad \hat{Q}(x, t, k) = 2Q^3(x, t) - Q_{xx} - 2ik\big(Q^2(x, t) + Q_x(x, t)\big)\sigma_3 + 4k^2 Q(x, t). \tag{1.3} \]
This is the well-known Ablowitz-Kaup-Newell-Segur [1] or Zakharov-Shabat [48] system of linear equations, which are compatible if and only if q(x, t) satisfies the mKdV equation. The main goal of the present paper is to study the scattering problem for compatible differential equations (1.2) and (1.3) on the half-strip or on the quarter of the xt-plane. We will combine the Marchenko integral equation and the corresponding Riemann–Hilbert problem in our approach to obtain characteristic properties of scattering data. We get a description of these characteristic properties and obtain a representation of the solution of the mKdV equation through Marchenko integral equations of the inverse scattering method. The kernel of these equations is valid for x ≥ 0 only and it takes into account all specific properties which occur for compatible differential equations on the half-line. In particular, the solution of the mKdV equation given by our method does not have a continuation for x < 0. It is well defined on the half-strip or on the quarter of the
Fig. 1. Σ = {k ∈ C | Im k³ = 0} and the domains Ω_1, ..., Ω_6
xt-plane. Then, using our representation of the solution of the mKdV equation and the explicit (x, t)-dependence of the kernel of the Marchenko integral equations or of the jump matrix in the Riemann-Hilbert problem, one can easily obtain the asymptotic behavior of the solution in the same way as in [16, 17] or [34].

1.3. Scattering data: definition. Let q(x, t) be a real-valued solution of Eq. (1.1) with initial and boundary functions satisfying the smoothness and decay assumptions described above. Let us define Σ = {k ∈ C | Im k³ = 0} and domains Ω_1, ..., Ω_6 as depicted in Fig. 1. Scattering data are introduced as follows:

1. The initial function u(x) = q(x, 0) and the x-equation (1.2) at t = 0 define the Jost solution Φ(x, 0, k) = exp(−ikxσ3) + o(1), x → ∞, and a “scattering matrix”
\[ S(k) := \Phi^{-1}(0, 0, k) = \begin{pmatrix} s_2^+(k) & -s_1^+(k) \\ -s_2^-(k) & s_1^-(k) \end{pmatrix}, \qquad s_1^-(k) = \bar{s}_2^+(\bar{k}), \qquad s_2^-(k) = \lambda \bar{s}_1^+(\bar{k}). \]
In particular, they define
• the spectral function r(k) = −s_2^-(k)/s_2^+(k), called the “reflection coefficient”,
and, in a generic situation:
• eigenvalues k_j ∈ C+, j = 1, ..., n, which are the zeros of s_2^+(k) in C+,
• numbers m_j = [i s_1^+(k_j) ṡ_2^+(k_j)]^{-1}, j = 1, ..., n (the dot denotes differentiation with respect to k).

2. Boundary data v(t) = q(0, t), v_1(t) = q_x(0, t), v_2(t) = q_xx(0, t), with t ∈ [0, T], T ≤ ∞, together with the t-equation at x = 0 define a solution
\[ Y(0, t, k) = \exp(-4ik^3 t\sigma_3), \qquad t \ge T, \]
then a “scattering matrix”
\[ P(k, T) := Y(0, 0, k) = \begin{pmatrix} p_1^-(k, T) & p_1^+(k, T) \\ p_2^-(k, T) & p_2^+(k, T) \end{pmatrix}, \]
where p_1^-(k, T) = p̄_2^+(k̄, T), p_2^-(k, T) = λ p̄_1^+(k̄, T), and, together with S(k), another “scattering matrix”
\[ R(k, T) = S(k)P(k, T) = \begin{pmatrix} r_1^-(k, T) & r_1^+(k, T) \\ r_2^-(k, T) & r_2^+(k, T) \end{pmatrix}. \]

3. Now we introduce:
• one more spectral function
\[ c(k, T) = \frac{p_2^-(k, T)}{s_2^+(k)\, r_1^-(k, T)}, \]
and, in a generic situation:
• eigenvalues z_j ∈ Ω_2, j = 1, ..., m, which are the zeros of r_1^-(k, T) in Ω_2,
• numbers m_j^2 = −i Res_{k=z_j} c(k, T), j = 1, ..., m,
which depend on the initial and boundary functions.

In what follows we consider the generic situation where initial and boundary functions are such that eigenvalues k_j ∈ C+, j = 1, ..., n and z_j ∈ Ω_2, j = 1, ..., m are simple and finite in number, with s_1^+(k_j) ≠ 0, j = 1, ..., n and s_2^+(k) ≠ 0, k ∈ R.

Definition (scattering data). We define the set
\[ R = \{ s_1^+(k),\ s_2^+(k),\ k \in \mathbb{C}_+;\ \ p_1^+(k, T),\ p_2^+(k, T),\ k \in D \}, \qquad D = \begin{cases} \mathbb{C} & \text{if } T < \infty, \\ \Omega_1 \cup \Omega_3 \cup \Omega_5 & \text{if } T = \infty, \end{cases} \]
as “scattering data” of the compatible eigenvalue problem for the system of differential equations (1.2)-(1.3) with q(x, t) satisfying the mKdV equation (1.1).

1.4. Scattering data: conditions A-B-C. Now we introduce three sets of conditions on the set of scattering data R.

Conditions A. Conditions on s_1^+(k), s_2^+(k), p_1^+(k, T), p_2^+(k, T).
A1. s_1^+(k), s_2^+(k) are analytic in k ∈ C+ and s_j^+(k) = s̄_j^+(−k̄).
A2. |s_2^+(k)|² − λ|s_1^+(k)|² ≡ 1 for k ∈ R.
A3. s_2^+(k) = 1 + O(k^{-1}), s_1^+(k) = O(k^{-1}), k ∈ C+.
A4. If λ = 1, s_2^+(k) ≠ 0 in C+. If λ = −1, s_2^+(k) = 0, k ∈ C+ ⇒ s_1^+(k) ≠ 0.
A5. p_1^+(k, T) and p_2^+(k, T) are entire for T < ∞ or analytic in k ∈ D if T = ∞. Moreover, p_j^+(k) = p̄_j^+(−k̄).
A6. p_2^+(k) p̄_2^+(k̄) − λ p_1^+(k) p̄_1^+(k̄) ≡ 1 for k³ ∈ R.
A7. p_2^+(k, T) = 1 + O(k^{-1}) and p_1^+(k, T) = O(k^{-1}) for k ∈ D, as k → ∞.
A8. If T < ∞, then
\[ p_2^+(k, T) = 1 + O\!\left(\frac{e^{8ik^3 T}}{k}\right), \qquad p_1^+(k, T) = O\!\left(\frac{e^{8ik^3 T}}{k}\right) \]
for k ∈ Ω_2 ∪ Ω_4 ∪ Ω_6, as k → ∞.
A9. s_1^+(k), s_2^+(k), p_1^+(k, T) and p_2^+(k, T) satisfy the relation:
\[ s_1^+(k)\,p_2^+(k, T) - s_2^+(k)\,p_1^+(k, T) = \begin{cases} c_1(k, T)\,e^{8ik^3 T} & \text{if } T < \infty, \\ 0 & \text{if } T = \infty, \end{cases} \]
where c_1(k, T) is analytic in k ∈ C+ and O(k^{-1}) as k → ∞.

Definition. Let R be given. We define r(k), k ∈ R by
\[ r(k) = -\lambda\, \frac{\bar{s}_1^+(k)}{s_2^+(k)}. \]
Conditions B. Conditions on r(k), k ∈ R.
B1. r(k) ∈ C^∞(R).
B2. The function
\[ \int_{-\infty}^{\infty} r(k)\, e^{ikx + 8ik^3 t}\, dk \]
is C^∞ in x, t for x > 0 and t ≥ 0.

Definition. Let R be given. We define c(k, T), k ∈ C+ by
\[ c(k, T) = \frac{\lambda\, \bar{p}_1^+(\bar{k}, T)}{s_2^+(k)\big[\bar{p}_2^+(\bar{k}, T)\, s_2^+(k) - \lambda\, \bar{p}_1^+(\bar{k}, T)\, s_1^+(k)\big]}. \]

Conditions C. Conditions on c(k, T), k ∈ C+.
C1. If λ = 1, c(k, T) is analytic in the half-plane C+ for T < ∞ or in the domain Ω_2 for T = ∞.
C2. If λ = −1, c(k, T) is meromorphic in the half-plane C+ for T < ∞, or in the domain Ω_2 for T = ∞, where it can have poles at some points z_1, z_2, ..., z_m.
C3. c(k) has C^∞ boundary values on R for T < ∞, or on ∂Ω_2 for T = ∞.
C4. If T = ∞, then
\[ \frac{d^n c(k, T)}{dk^n}\bigg|_{k=0} = -\frac{d^n r(k)}{dk^n}\bigg|_{k=0}, \qquad n = 0, 1, 2, \ldots. \]
C5. The function
\[ \int_{\partial\Omega_2} c(k, T)\, e^{8ik^3 t}\, dk + \int_{-\infty}^{\infty} r(k)\, e^{8ik^3 t}\, dk \]
is C^∞ in t.
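The formula for c(k, T) used in Conditions C agrees with the definition c = p_2^-/(s_2^+ r_1^-) of Sect. 1.3. The short derivation below is a sketch that uses only R = SP and the symmetry relations for p_1^-, p_2^- stated there.

```latex
% (1,1) entry of R = S P, with S and P as in Sect. 1.3:
r_1^-(k,T) = s_2^+(k)\,p_1^-(k,T) - s_1^+(k)\,p_2^-(k,T).
% Insert p_1^-(k,T) = \bar p_2^+(\bar k,T) and p_2^-(k,T) = \lambda\,\bar p_1^+(\bar k,T):
r_1^-(k,T) = \bar p_2^+(\bar k,T)\,s_2^+(k) - \lambda\,\bar p_1^+(\bar k,T)\,s_1^+(k).
% Hence
c(k,T) = \frac{p_2^-(k,T)}{s_2^+(k)\,r_1^-(k,T)}
       = \frac{\lambda\,\bar p_1^+(\bar k,T)}
              {s_2^+(k)\bigl[\bar p_2^+(\bar k,T)\,s_2^+(k) - \lambda\,\bar p_1^+(\bar k,T)\,s_1^+(k)\bigr]}.
```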
1.5. Recovering q(x, t) from scattering data. Then we prove that the solution q(x, t) of the non-linear problem (1.1) can be written
\[ q(x, t) = 2K_2(x, x, t), \tag{1.4} \]
where K_2(x, y, t), together with K_1(x, y, t), satisfies the Marchenko integral equations:
\[ K_1(x, y, t) + \lambda \int_x^{\infty} K_2(x, z, t)\, H(z + y, t)\, dz = 0 \qquad \text{for } 0 \le x < y < \infty, \tag{1.5} \]
\[ K_2(x, y, t) + H(x + y, t) + \int_x^{\infty} K_1(x, z, t)\, H(z + y, t)\, dz = 0 \tag{1.6} \]
with kernel
\[ H(x, t) = \frac{1-\lambda}{2} \Bigg[ \sum_{k_j \in \Omega_1} m_j^1\, e^{ik_j x + 8ik_j^3 t} + \sum_{z_j \in \Omega_2} m_j^2\, e^{iz_j x + 8iz_j^3 t} + \sum_{k_j \in \Omega_3} m_j^3\, e^{ik_j x + 8ik_j^3 t} \Bigg] + \frac{1}{2\pi} \int_{\partial\Omega_2} c(k, T)\, e^{ikx + 8ik^3 t}\, dk + \frac{1}{2\pi} \int_{-\infty}^{\infty} r(k)\, e^{ikx + 8ik^3 t}\, dk, \tag{1.7} \]
where m_j^1 = [i s_1^+(k_j) ṡ_2^+(k_j)]^{-1} for k_j ∈ Ω_1, m_j^3 = [i s_1^+(k_j) ṡ_2^+(k_j)]^{-1} for k_j ∈ Ω_3, and m_j^2 = −i Res_{k=z_j} c(k, T) for z_j ∈ Ω_2.

1.6. Main theorem. The main result is that properties A–B–C are characteristic:

Theorem. In the generic situation described above, Conditions A–B–C on
\[ R = \{ s_1^+(k),\ s_2^+(k),\ k \in \mathbb{C}_+;\ \ p_1^+(k, T),\ p_2^+(k, T),\ k \in D \} \]
characterize the scattering data of a compatible eigenvalue problem (1.2)–(1.3) for x- and t-equations defined by a solution q(x, t) of the mKdV equation (1.1) with initial function u(x) ∈ S(R+) and boundary values v(t), v_1(t), v_2(t) ∈ C^∞[0, T] if T < ∞, or v(t), v_1(t), v_2(t) ∈ S(R+) if T = ∞.

Remark. Condition B2 (about smoothness of the integral with respect to x and t) is fulfilled if the initial and boundary functions obey the following relations:
\[ \frac{d^n}{dx^n} u(x)\bigg|_{x=0} = \frac{d^n}{dt^n} v(t)\bigg|_{t=0} = \frac{d^n}{dt^n} v_1(t)\bigg|_{t=0} = \frac{d^n}{dt^n} v_2(t)\bigg|_{t=0} = 0 \tag{1.8} \]
for any n ≥ 0. In this case the reflection coefficient r(k) is in the Schwartz space S(R). Under these assumptions Condition C5 is also fulfilled.

Remark. If u(x) ≡ 0, Conditions A1–A4 and B are trivial: r(k) ≡ 0, s_1^+(k) ≡ 0, s_2^+(k) ≡ 1, {k_1, ..., k_n} = ∅. There remain only Conditions A5–A9 and C. They mean that any solution of the mKdV equation in the half-strip or quarter plane with zero initial function is parametrized by only one function c(k, T) = c̄(−k̄, T), analytic (λ = 1) or meromorphic (λ = −1) in Ω_2 with poles at z_1, ..., z_m.
2. Basic Solutions of Compatible x- and t-Equations

Let us write the x- and t-equations in the form
\[ W_x = U(x, t, k)W, \tag{2.1} \]
\[ W_t = V(x, t, k)W, \tag{2.2} \]
where U(x, t, k) and V(x, t, k) are matrices given by
\[ U(x, t, k) = Q(x, t) - ik\sigma_3, \]
\[ V(x, t, k) = 2Q^3(x, t) - Q_{xx} - 2ik\big(Q^2(x, t) + Q_x(x, t)\big)\sigma_3 + 4k^2 Q(x, t) - 4ik^3\sigma_3. \]

Lemma 1. Let the system (2.1), (2.2) be compatible for all k. Let W(x, t, k) satisfy the x-equation (2.1) for all t, and let W(x_0, t, k) satisfy the t-equation (2.2) for some x = x_0 (including the case x_0 = ∞). Then W(x, t, k) satisfies the t-equation for all x.

Proof. See e.g. [9].
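The compatibility assumed in Lemma 1 is the zero-curvature condition U_t − V_x + [U, V] = 0. The sketch below checks numerically that, for U and V as above, this expression reduces to the left-hand side of (1.1) times the matrix with rows (0, 1) and (λ, 0). It treats q and its derivatives as independent numbers, which is an assumption of this illustration; the helper names are hypothetical.

```python
import random

SIGMA3 = ((1, 0), (0, -1))


def mul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][n] * b[n][j] for n in range(2)) for j in range(2))
        for i in range(2)
    )


def lin(*terms):
    """Linear combination sum(c * M) over (c, M) pairs of 2x2 matrices."""
    return tuple(
        tuple(sum(c * m[i][j] for c, m in terms) for j in range(2))
        for i in range(2)
    )


def zero_curvature_residual(q, qx, qxx, qxxx, qt, k, lam):
    """Return U_t - V_x + [U, V] for U, V of (2.1)-(2.2).

    q, qx, qxx, qxxx, qt are independent numbers (assumption of this sketch);
    x-derivatives of matrix products are formed with the product rule.
    """
    E = ((0, 1), (lam, 0))                          # Q = q * E
    Q, Qx, Qxx, Qxxx, Qt = (lin((v, E)) for v in (q, qx, qxx, qxxx, qt))
    Q2 = mul(Q, Q)
    Q3 = mul(Q2, Q)
    Q2x = lin((1, mul(Qx, Q)), (1, mul(Q, Qx)))     # (Q^2)_x
    Q3x = lin((1, mul(Q2x, Q)), (1, mul(Q2, Qx)))   # (Q^3)_x
    U = lin((1, Q), (-1j * k, SIGMA3))
    V = lin((2, Q3), (-1, Qxx),
            (-2j * k, mul(lin((1, Q2), (1, Qx)), SIGMA3)),
            (4 * k * k, Q), (-4j * k ** 3, SIGMA3))
    Vx = lin((2, Q3x), (-1, Qxxx),
             (-2j * k, mul(lin((1, Q2x), (1, Qxx)), SIGMA3)),
             (4 * k * k, Qx))
    commutator = lin((1, mul(U, V)), (-1, mul(V, U)))
    return lin((1, Qt), (-1, Vx), (1, commutator))  # U_t equals Q_t


def mkdv_lhs(q, qx, qxxx, qt, lam):
    """Left-hand side of (1.1): q_t + q_xxx - 6*lam*q^2*q_x."""
    return qt + qxxx - 6 * lam * q * q * qx


rng = random.Random(1)
q, qx, qxx, qxxx, qt = (rng.uniform(-1, 1) for _ in range(5))
k = complex(rng.uniform(-2, 2), rng.uniform(-2, 2))
print(zero_curvature_residual(q, qx, qxx, qxxx, qt, k, 1))
print(mkdv_lhs(q, qx, qxxx, qt, 1))
```

All k-dependent terms cancel, so the residual vanishes for every k exactly when q satisfies (1.1).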
Notations. The over-bar denotes complex conjugation. C± denotes the upper (lower) complex half-plane. If A = (A^− A^+) denotes a 2 × 2 matrix, the vectors A^∓ denote the first and second columns of A. We also denote [A, B] = AB − BA. In this section we shall introduce basic solutions of compatible x- and t-equations.
2.1. First basic solution. The first basic solution is a matrix-valued Jost solution of the x-equation (1.2). It has the triangular integral representation (see e.g. [19])
\[ \Phi(x, t, k) = \left( e^{-ikx\sigma_3} + \int_x^{\infty} K(x, y, t)\, e^{-iky\sigma_3}\, dy \right) e^{-4ik^3 t\sigma_3}, \tag{2.3} \]
where the real-valued matrix K(x, y, t) has the form
\[ K(x, y, t) = \begin{pmatrix} K_1(x, y, t) & \lambda K_2(x, y, t) \\ K_2(x, y, t) & K_1(x, y, t) \end{pmatrix} \]
with entries in C^∞(R+ × R+ × R+) and rapidly decreasing as x + y → ∞ for any t ∈ R+. Matrices K(x, x, t) and Q(x, t) are connected by the relation:
\[ [\sigma_3, K(x, x, t)] = Q(x, t)\sigma_3. \tag{2.4} \]
This last equality yields the important formula (1.4) for the solution q(x, t) of the modified Korteweg-de Vries equation. The matrix Φ(x, t, k) satisfies the x-equation (1.2) and it satisfies the t-equation (1.3) with x = ∞, because the matrix e^{−ik(x+4k²t)σ3} is a solution of both Eqs. (1.2) and (1.3) with Q(x, t) ≡ 0. Lemma 1 implies that Φ(x, t, k) satisfies the t-equation for any x ∈ R+ due to the compatibility of the x- and t-equations. The triangular integral representation (2.3) and Lemma 1 imply the following properties of the matrix-valued Jost solution Φ(x, t, k) (cf. [19]):
Properties of the first basic solution.
1. Φ(x, t, k) satisfies the x- and t-equations (1.2)–(1.3).
2. Φ(x, t, k) = Λ Φ̄(x, t, k) Λ^{-1} for k ∈ R, where
\[ \Lambda = \begin{pmatrix} 0 & 1 \\ \lambda & 0 \end{pmatrix}. \]
3. det Φ(x, t, k) ≡ 1 for k ∈ R.
4. (x, t, k) ↦ Φ(x, t, k) ∈ C^∞(R+ × R+ × R).
5. Φ^+(x, t, k) is analytic in k ∈ C+, Φ^−(x, t, k) is analytic in k ∈ C−.
6. Φ^±(x, t, k) = Φ̄^±(x, t, −k̄).
7. For k → ∞,
\[ e^{ikx + 4ik^3 t}\, \Phi^-(x, t, k) = \begin{pmatrix} 1 \\ 0 \end{pmatrix} + O(k^{-1}) \quad \text{if } \operatorname{Im} k \le 0, \qquad e^{-ikx - 4ik^3 t}\, \Phi^+(x, t, k) = \begin{pmatrix} 0 \\ 1 \end{pmatrix} + O(k^{-1}) \quad \text{if } \operatorname{Im} k \ge 0. \]

2.2. Second basic solution. Now let us introduce the second basic solution Ψ(x, t, k) of the x- and t-equations which satisfies the initial condition
\[ \Psi(0, 0, k) = \sigma_0 \equiv \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{2.5} \]
It can be represented as a product of two matrices:
\[ \Psi(x, t, k) = \phi(x, t, k)\, \hat{\phi}(t, k), \tag{2.6} \]
where ϕ(x, t, k) satisfies the x-equation under the condition ϕ(0, t, k) = σ0, and ϕ̂(t, k) satisfies the t-equation with x = 0 under the initial condition ϕ̂(0, k) = σ0. Lemma 1 implies that Ψ(x, t, k) is a compatible solution of the x- and t-equations. The existence of the solution ϕ(x, t, k) and its representation
\[ \phi(x, t, k) = e^{-ikx\sigma_3} + \int_{-x}^{x} A(x, y, t)\, e^{-iky\sigma_3}\, dy \tag{2.7} \]
by some integral kernel A(x, y, t) are proved in [9]. The matrix ϕ̂(t, k) can be found as a solution of the Volterra integral equation:
\[ \hat{\phi}(t, k) = e^{-4ik^3 t\sigma_3} + \int_0^t e^{4ik^3(\tau - t)\sigma_3}\, \hat{Q}(0, \tau, k)\, \hat{\phi}(\tau, k)\, d\tau, \tag{2.8} \]
where
\[ \hat{Q}(0, t, k) = \begin{pmatrix} -2i\lambda k v^2(t) & 2\lambda v^3(t) + 4k^2 v(t) + 2ik v_1(t) - v_2(t) \\ 2v^3(t) + 4\lambda k^2 v(t) - 2i\lambda k v_1(t) - \lambda v_2(t) & 2i\lambda k v^2(t) \end{pmatrix}. \]
Besides, ϕ̂(t, k) has the integral representation:
\[ \hat{\phi}(t, k) = e^{-4ik^3 t\sigma_3} + \int_{-t}^{t} B(t, s)\, e^{-4ik^3 s\sigma_3}\, ds + ik \int_{-t}^{t} C(t, s)\, e^{-4ik^3 s\sigma_3}\, ds + k^2 \int_{-t}^{t} D(t, s)\, e^{-4ik^3 s\sigma_3}\, ds, \tag{2.9} \]
which will be used below. The proof of this triangular representation can be done in the same way as in [9]. In the present case the matrix-valued real functions A(x, y, t), B(t, s), C(t, s) and D(t, s) are C^∞ and bounded in x, y, t, s. The triangular integral representations (2.6)–(2.9) yield the following properties of the solution Ψ(x, t, k):

Properties of the second basic solution.
1. Ψ(x, t, k) is a solution of the x- and t-equations.
2. Ψ(x, t, k) = Λ Ψ̄(x, t, k̄) Λ^{-1} for any k ∈ C.
3. Ψ(x, t, k) = Ψ̄(x, t, −k̄) for any k ∈ C.
4. det Ψ(x, t, k) ≡ 1 for any k ∈ C.
5. (x, t, k) ↦ Ψ(x, t, k) ∈ C^∞(R+ × R+ × C).
6. Ψ(x, t, k) is analytic (entire) in k ∈ C.
7. For k ∈ C, k → ∞,
\[ \Psi(x, t, k) = \left[ I + O(k^{-1}) + O\!\left(\frac{e^{2ikx\sigma_3}}{k}\right) + O\!\left(\frac{e^{8ik^3 t\sigma_3}}{k}\right) \right] e^{-ik(x + 4k^2 t)\sigma_3}. \]
8. For k ∈ Ω_1 ∪ Ω_3, k → ∞,
\[ e^{ikx + 4ik^3 t}\, \Psi^-(x, t, k) = \begin{pmatrix} 1 \\ 0 \end{pmatrix} + O(k^{-1}). \]
The last asymptotic relation can be easily proved using large k asymptotics for the functions ϕ^∓(x, t, k) and ϕ̂^−(t, k).

2.3. Third basic solution. Let Σ = {k ∈ C | Im k³ = 0} as above and let Φ̂(t, k) be a solution of the Volterra integral equation
\[ \hat{\Phi}(t, k) = e^{-4ik^3 t\sigma_3} - \int_t^{\infty} e^{4ik^3(\tau - t)\sigma_3}\, \hat{Q}(0, \tau, k)\, \hat{\Phi}(\tau, k)\, d\tau, \qquad k \in \Sigma, \]
where Q̂(0, t, k) is as in (2.8) and Q̂(0, t, k) ≡ 0 for t > T if T < ∞. That means the matrix Φ̂(t, k) satisfies the t-equation with x = 0 under the asymptotic condition Φ̂(t, k) = e^{−4ik³tσ3} + o(1) as t → ∞. Again, for Φ̂(t, k) the triangular integral representation
\[ \hat{\Phi}(t, k) = e^{-4ik^3 t\sigma_3} + \int_t^{\infty} L(t, s)\, e^{-4ik^3 s\sigma_3}\, ds + ik \int_t^{\infty} M(t, s)\, e^{-4ik^3 s\sigma_3}\, ds + k^2 \int_t^{\infty} N(t, s)\, e^{-4ik^3 s\sigma_3}\, ds \tag{2.10} \]
can be obtained as in [9]. Here the matrix-valued real functions L(t, s), M(t, s) and N(t, s) are C^∞ and bounded in t, s, and they vanish for s > 2T − t if T < ∞ or tend to zero as t + s → ∞ if T = ∞. We introduce the matrix
\[ Y(x, t, k) = \phi(x, t, k)\, \hat{\Phi}(t, k), \qquad k \in \Sigma, \tag{2.11} \]
where ϕ(x, t, k) is as in (2.7). Lemma 1 implies that Y(x, t, k) is a solution of the x- and t-equations with det Y(x, t, k) = 1 for k ∈ Σ. For k ∉ Σ the function Φ̂(t, k), hence also Y(x, t, k), is unbounded in t ∈ R+. Since the integral equation is of Volterra type with τ ∈ (t, ∞), the first column Y^−(x, t, k) is analytic in k ∈ Ω_2 ∪ Ω_4 ∪ Ω_6 and the second column Y^+(x, t, k) is analytic in k ∈ Ω_1 ∪ Ω_3 ∪ Ω_5, or Y(x, t, k) is an entire matrix-valued function if T < ∞. The properties of the solution Y(x, t, k) follow from the triangular integral representations (2.7) and (2.10):

Properties of the third basic solution.
1. Y(x, t, k) satisfies the x- and t-equations.
2. Y(x, t, k) = Λ Ȳ(x, t, k̄) Λ^{-1} for k ∈ Σ.
3. det Y(x, t, k) = 1 for k ∈ Σ.
4. (x, t, k) ↦ Y(x, t, k) ∈ C^∞(R+ × R+ × Σ).
5. Y^+(x, t, k) is analytic in k ∈ Ω_1 ∪ Ω_3 ∪ Ω_5, Y^−(x, t, k) is analytic in k ∈ Ω_2 ∪ Ω_4 ∪ Ω_6, or they are entire if T < ∞.
6. Ȳ^−(x, t, −k̄) = Y^−(x, t, k), k ∈ Ω_2 ∪ Ω_4 ∪ Ω_6.
7. Ȳ^+(x, t, −k̄) = Y^+(x, t, k), k ∈ Ω_1 ∪ Ω_3 ∪ Ω_5.
8. For k → ∞, k ∈ Ω_2,
\[ e^{ikx + 4ik^3 t}\, Y^-(x, t, k) = \begin{pmatrix} 1 \\ 0 \end{pmatrix} + O(k^{-1}). \]
3. Analysis of the Direct Scattering Problem

The basic solutions we have introduced are clearly linearly dependent:
\[ \Psi(x, t, k) = \Phi(x, t, k)S(k), \qquad Y(x, t, k) = \Psi(x, t, k)P(k), \qquad Y(x, t, k) = \Phi(x, t, k)R(k). \tag{3.1} \]
The matrices S(k), P(k) and R(k) depend neither on x nor on t because by virtue of the x-equation they do not depend on x, and by virtue of the t-equation they do not depend on t. Hence:
\[ S(k) = \Phi^{-1}(0, 0, k),\ k \in \mathbb{R}; \qquad P(k) = Y(0, 0, k),\ k \in \Sigma; \qquad R(k) = S(k)P(k),\ k \in \mathbb{R}. \tag{3.2} \]
We omit the dependence on T of the matrices P (k) = P (k, T ) and R(k) = R(k, T ) if T < ∞. Let us study the properties of these “scattering” (transition) matrices.
Properties of the scattering matrix S(k). They follow from the scattering problem for the x-equation with t = 0. Indeed, consider the problem on the whole x-line by putting
\[ q(x, 0) = \hat{u}(x) = \begin{cases} 0 & \text{for } x \in (-\infty, 0), \\ u(x) & \text{for } x \in [0, \infty). \end{cases} \]
Let Ψ̃(x, k) be the Jost solution [19] normalized by Ψ̃(x, k) = e^{−ikxσ3} for x < 0 and let T̃(k) be the transition matrix for that case, i.e. Ψ̃(x, k) = Φ(x, 0, k)T̃(k). Putting x = 0 we find S(k) ≡ T̃(k). Hence the “scattering” matrix S(k) has all properties of the transition matrix T̃(k) [19]:
• S(k) = Λ S̄(k) Λ^{-1} for k ∈ R.
• det S(k) ≡ 1 for k ∈ R.
• S(k) ∈ C^∞(R).
For the half-line case there are additional properties:
• \( S(k) = \begin{pmatrix} s_2^+(k) & -s_1^+(k) \\ -s_2^-(k) & s_1^-(k) \end{pmatrix} \), where s_j^+(k) = Φ_j^+(0, 0, k).
• (s_2^+(k), −s_1^+(k)) is analytic in k ∈ C+ and s_j^+(k) = s̄_j^+(−k̄).
• (−s_2^-(k), s_1^-(k)) is analytic² in k ∈ C− and s_j^-(k) = s̄_j^-(−k̄).
• If k ∈ C+ and k → ∞,
\[ s_2^+(k) = 1 + O(k^{-1}), \qquad s_1^+(k) = O(k^{-1}). \tag{3.3} \]
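These properties can be made concrete on a profile for which the Jost solution is available in closed form. The sketch below is an illustration only: it assumes a hypothetical box profile u(x) = q0 on [0, L] and u(x) = 0 for x > L (which is not in the Schwartz class required above), takes t = 0, and computes s_1^+(k), s_2^+(k) as the second column of Φ(0, 0, k). The function names are invented for the example.

```python
import cmath


def box_scattering(k, q0, L, lam):
    """Return (s1_plus, s2_plus) for u(x) = q0 on [0, L] and 0 for x > L.

    Inside the box the x-equation has the constant matrix M = Q - i*k*sigma3
    with M*M = mu**2 * I, so exp(s*M) = cosh(mu*s)*I + sinh(mu*s)/mu * M.
    """
    mu = cmath.sqrt(lam * q0 * q0 - k * k)
    cosh_term = cmath.cosh(mu * L)
    # sinh(mu*L)/mu is even in mu and tends to L as mu -> 0.
    sinh_over_mu = cmath.sinh(mu * L) / mu if abs(mu) > 1e-12 else L
    phase = cmath.exp(1j * k * L)
    # Second column of Phi(0, 0, k) = exp(-L*M) * exp(-i*k*L*sigma3).
    s1_plus = -phase * q0 * sinh_over_mu
    s2_plus = phase * (cosh_term - 1j * k * sinh_over_mu)
    return s1_plus, s2_plus


def unitarity_defect(k, q0, L, lam):
    """|s2+|^2 - lam*|s1+|^2 - 1 for real k; det S(k) = 1 makes it vanish."""
    s1, s2 = box_scattering(k, q0, L, lam)
    return abs(s2) ** 2 - lam * abs(s1) ** 2 - 1


def zero_on_imaginary_axis(q0, L, lam, lo, hi, steps=200):
    """Bisect s2+(i*kappa), which is real for real kappa, on [lo, hi]."""
    def f(kappa):
        return box_scattering(1j * kappa, q0, L, lam)[1].real
    if f(lo) * f(hi) > 0:
        return None        # no sign change, so no zero is bracketed
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


for lam in (1, -1):
    print(lam, unitarity_defect(1.7, 2.0, 1.0, lam),
          zero_on_imaginary_axis(2.0, 1.0, lam, 0.0, 2.0))
```

For this box with q0 = 2 and L = 1, a zero of s_2^+ on the imaginary axis is bracketed only for λ = −1, in line with the distinction between the two signs of λ made in the text.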
Let us prove some integral representations for s_1^+(k) and s_2^+(k). We use the limit formulas
\[ s_1^+(k) = \lim_{x \to -\infty} e^{-ikx}\, \Phi_1^+(x, 0, k), \qquad s_2^+(k) = \lim_{x \to -\infty} e^{-ikx}\, \Phi_2^+(x, 0, k), \]
which follow from the definition of the matrix S(k). If one puts χ_j(x, k) = e^{−ikx} Φ_j^+(x, 0, k), the x-equation yields
\[ \chi_1' + 2ik\chi_1 = \hat{u}(x)\chi_2, \qquad \chi_1(x, k) \to 0 \ \text{ as } x \to +\infty, \]
\[ \chi_2' = -\hat{u}(x)\chi_1, \qquad \chi_2(x, k) \to 1 \ \text{ as } x \to +\infty. \]
Then, by integration:
\[ \chi_1(x, k) = -\int_x^{\infty} e^{2ik(y - x)}\, \hat{u}(y)\, \chi_2(y, k)\, dy, \qquad \chi_2(x, k) = 1 + \int_x^{\infty} \hat{u}(y)\, \chi_1(y, k)\, dy, \]

² For an arbitrary function û(x), x ∈ R these analytic properties do not hold: s_2^+(k) and s_1^-(k) are only analytic in k ∈ C+ and k ∈ C− respectively. In our case û(x) ≡ 0 for x < 0 and therefore s_1^+(k) and s_2^-(k) are also analytic in k ∈ C+ and k ∈ C− respectively.
therefore
\[ s_1^+(k) = -\int_0^{\infty} u(x)\, e^{2ikx}\, dx - \int_0^{\infty} u(x)\, e^{2ikx}\, dx \int_0^{\infty} K_1(x, x + y)\, e^{iky}\, dy, \tag{3.4} \]
\[ s_2^+(k) = 1 - \int_0^{\infty} e^{ikx}\, dx \int_0^{\infty} u(y)\, K_2(y, y + x)\, dy, \tag{3.5} \]
where K_1(x, y) and K_2(x, y) are entries of the kernel of the triangular integral transformation (2.3). The last two formulas allow one to find the large-k asymptotic expansions of s_1^+(k) and s_2^+(k) to any order and to obtain, in particular, (3.3), which is sharp if u(0) ≠ 0. The matrix S(k) = Φ^{-1}(0, 0, k) is determined by u(x) ∈ S(R+). The entries of this matrix are not independent and can be recovered from one known function. Let
\[ s(k) \equiv \frac{s_1^+(k)}{s_2^+(k)} \]
be given and let Σ_dic be the set of zeros of the analytic function s_2^+(k). As in [19] we shall consider a subset S_0(R+) of functions u(x) ∈ S(R+) for which s_2^+(k) has a finite number of zeros: Σ_dic = {k_1, ..., k_n ∈ C+ | s_2^+(k_j) = 0}, all of multiplicity 1, i.e. ṡ_2^+(k_j) ≠ 0, and s_2^+(k) ≠ 0 for every k ∈ R. Since det S(k) ≡ 1, then |s_2^+(k)|² − λ|s_1^+(k)|² ≡ 1 for any k ∈ R. This identity yields the well-known formula:
\[ s_2^+(k) = \prod_{k_j \in \mathbb{C}_+} \left( \frac{k - k_j}{k - \bar{k}_j} \right)^{\frac{1-\lambda}{2}} \exp\left\{ \frac{i}{2\pi} \int_{-\infty}^{\infty} \frac{\log(1 - \lambda|s(\mu)|^2)}{\mu - k}\, d\mu \right\}. \tag{3.6} \]
The remaining entries of S(k) are also recovered:
\[ s_1^+(k) = s(k)\, s_2^+(k), \qquad s_2^-(k) = \lambda \bar{s}_1^+(\bar{k}), \qquad s_1^-(k) = \bar{s}_2^+(\bar{k}). \]
So, if λ = 1, the function s_2^+(k) has no zeros at all and the set Σ_dic is empty. It follows from the self-adjointness of the x-equation (1.2) and the obvious inequality: |s_2^+(k)| ≥ 1 for k ∈ R. If λ = −1 then s_2^+(k) may vanish at some points k_j ∈ C+. Since u(x) is real-valued, Σ_dic is symmetric with respect to the imaginary axis. We can enumerate the k_j’s in such a way that
• k_j = iκ_j, κ_j > 0 for j = 1, ..., n_1 ≤ n with n = n_1 + 2n_2,
• k_{n_1+l} = −k̄_{n_1+n_2+l} for l = 1, ..., n_2.
Let us briefly discuss the discrete spectrum of the x-problem, which may appear when λ = −1. The main relation of the x-scattering problem is
\[ \frac{1}{s_2^+(k)}\, \Psi^-(x, t, k) = \Phi^-(x, t, k) + r(k)\, \Phi^+(x, t, k) \qquad \text{for } k \in \mathbb{R}, \tag{3.7} \]
A. Boutet de Monvel, V. Kotlyarov
where
$$r(k) = -\frac{s_2^-(k)}{s_2^+(k)}. \tag{3.8}$$
F (x, t, k) = − (x, t, k)/s2+ (k) is analytic in k ∈ C+ except for dic = {k1 , . . . , kn }, where it has poles. We have s2+ (kj ) = det − (x, t, kj ) + (x, t, kj ) = 0, then − (x, t, kj ) = γj1 + (x, t, kj ). Hence, Resk=kj F (x, t, k) = cj1 + (x, t, kj ) with cj1 =
$\gamma_j^1/\dot s_2^+(k_j)$ and $\gamma_j^1 = 1/s_1^+(k_j)$, $j = 1, \dots, n$.
Note that s1+ (kj ) = 0 because otherwise we come to a contradiction: + (x, t, kj ) ≡ 0 since 1+ (0, 0, kj ) = s1+ (kj ) = 0 and 2+ (0, 0, kj ) = s2+ (kj ) = 0. We also assume all zeros are simple, i.e. s˙2+ (kj ) = 0. Using asymptotics of the function − (x, t, k) at k = ∞, for k ∈ 1 ∪ 3 , we find 3 1 −1 F (x, t, k) = + O(|k| ) e−ikx−4ik t for |k| → ∞, k ∈ 1 ∪ 3 , 0
(3.9)
that will be used below. So, we come to the following (cf. Conditions A and B):
Properties of r(k), t(k) and kj.
• The reflection coefficient $r(k)$ belongs to $C^\infty(\mathbb{R})$, $r(-k) = \bar r(k)$ and $r(k) = O(k^{-1})$ as $k \to \infty$. It is the ratio of two functions $-s_2^-(k)$ and $s_2^+(k)$, analytic in $k \in \mathbb{C}_-$ and $k \in \mathbb{C}_+$ respectively, and $|r(k)| < 1$ if $\lambda = 1$.
• The transition coefficient $t(k) = [s_2^+(k)]^{-1}$ is represented through formula (3.6), where $|s(\mu)| = |r(\mu)|$.
• If $\lambda = -1$, $k_i \neq k_j$ for $i \neq j$ and $\operatorname{Im} k_j > 0$, $j = 1, \dots, n$, with $k_j = i\kappa_j$, $1 \le j \le n_1 \le n = n_1 + 2n_2$, and $k_{n_1+l} = -\bar k_{n_1+n_2+l}$, $1 \le l \le n_2$.
These properties follow from the x-scattering problem on the whole line. We take into account that for the half-line case the function $s_2^-(k) = -\bar s_1^+(\bar k)$ is analytic in $k \in \mathbb{C}_-$ and that the constants $\gamma_j^1$ are not independent parameters (as they are for the whole line): they are evaluated by means of the function $s_1^+(k)$ at $k_j$: $\gamma_j^1 = 1/s_1^+(k_j)$.
Properties of the scattering matrix P (k). They follow from the defining relation, i.e. from (3.2), (2.11), (2.10): ∞ 3 L(t, s)e−4ik sσ3 ds P (k) = I + ∞ 0∞ 3 3 M(t, s)e−4ik sσ3 ds + k 2 N (t, s)e−4ik sσ3 ds. +k 0
0
It is easy to find the following properties: ¯ −1 for k ∈ . • P (k) = P¯ (k) • det P (k) ≡ 1 for k ∈ . • P (k) is C ∞ in k ∈ . • If T < ∞ the matrix-valued function P (k) is entire in k ∈ C. • If T = ∞ the vector-function P + (k) is analytic in k ∈ 1 ∪ 3 ∪ 5 , and P − (k) is analytic in k ∈ 2 ∪ 4 ∪ 6 . ¯ • P ± (k) = P ± (−k). • P (k) = σ0 + O(k −1 ), k ∈ , k → ∞. Properties of the scattering matrix R(k). Below we need to study properties of the “scattering” matrix R(k) introduced by Eqs. (3.1)–(3.2). The matrix R(k) has the following form − r1 (k) r1+ (k) ¯ ¯ for k ∈ R(k) = − r2+ (k) = r¯1− (k) , r1+ (k) = −¯r2− (k), r2 (k) r2+ (k) with r1− (k) = p1− (k)s2+ (k) − p2− (k)s1+ (k)
(3.10)
analytic in $k \in \mathbb{C}_+$ (if $T < \infty$) and $k \in \Omega_2$ (if $T = \infty$). Hence $r_2^+(k)$ is analytic in $k \in \mathbb{C}_-$ (if $T < \infty$) and $k \in \Omega_5$ (if $T = \infty$). Furthermore
$$r_2^-(k) = p_2^-(k)s_1^-(k) - p_1^-(k)s_2^-(k) \tag{3.11}$$
is analytic in k ∈ C− (for T < ∞) and k ∈ 4 ∪ 6 (for T = ∞), hence r1+ (k) is analytic in k ∈ C+ (for T < ∞) and k ∈ 1 ∪ 3 (for T = ∞). In the domains of analyticity we have the following symmetry properties: ¯ and r ± (k) = r ± (−k). ¯ r1± (k) = r1± (−k) 2 2 From (3.1) we derive Y + (x, t, k) = r1+ (k) − (x, t, k) + r2+ (k) + (x, t, k) with r1+ (k) = det(Y + (x, t, k) + (x, t, k)), r2+ (k) = det( − (x, t, k) Y + (x, t, k)). Let us put x = 0, and k = k1 + ik2 ∈ 1 ∪ 3 . Using (2.3), (2.11), (2.10) for large t we obtain |r1+ (k)| ≤ C1 (k) exp 8t K − (3k12 − k22 )k2 ,
where $C_1(k)$ is independent of $t$, and
$$K = \max_{1 \le j \le n}\,(3\operatorname{Re}^2 k_j - \operatorname{Im}^2 k_j)\operatorname{Im} k_j,$$
where $k_j \in \Omega_1 \cup \Omega_3$ is an eigenvalue of the x-scattering problem. Taking into account the analyticity of the function $r_1^+(k)$ for $k \in \Omega_1 \cup \Omega_3$, choosing a large enough $k$ and letting $t \to \infty$ (if $T = \infty$) we find $r_1^+(k) \equiv 0$ for any $k \in \Omega_1 \cup \Omega_3$, hence $r_2^-(k) \equiv 0$ for any $k \in \Omega_4 \cup \Omega_6$. So, we come to the main property of the compatible scattering problem for x- and t-equations.
• If $T = \infty$ the "scattering" matrix $R(k)$ is diagonal:
$$R(k) = \begin{pmatrix} \rho_-(k) & 0 \\ 0 & \rho_+(k) \end{pmatrix} \quad \text{for } k \in \mathbb{R}$$
with
$$\rho_+(k) = \frac{p_2^+(k)}{s_2^+(k)} = \frac{p_1^+(k)}{s_1^+(k)}, \qquad \rho_-(k) = \frac{p_1^-(k)}{s_1^-(k)} = \frac{p_2^-(k)}{s_2^-(k)}. \tag{3.12}$$
• We also have the important relation (Condition A9):
$$\frac{p_1^+(k)}{p_2^+(k)} \equiv \frac{s_1^+(k)}{s_2^+(k)}, \tag{3.13}$$
which says that the function $p(k) := p_1^+(k)/p_2^+(k)$, being meromorphic in the domain $\Omega_1 \cup \Omega_3$, has an analytic continuation into $\mathbb{C}_+$ up to the meromorphic function $s(k) = s_1^+(k)/s_2^+(k)$.
• If $T < \infty$, instead of (3.13) we have the so-called "global relation" [21]:
$$s_2^+(k)p_1^+(k, T) - s_1^+(k)p_2^+(k, T) = r_1^+(k, T),$$
where $r_1^+(k, T) = c_1(k, T)e^{8ik^3T}$ with
$$c_1(k, T) = -\int_0^\infty K(0, y, T)e^{iky}\,dy$$
is analytic in k ∈ C+ . We write down the dependence on T to emphasize that functions p1+ (k, T ), p2+ (k, T ), and r1+ (k, T ) really depend on T . The asymptotic behavior of S(k) and P (k) yields the following asymptotic expansions: ω2 ρ1 ω1 r1− (k) = 1 + + . . . , r2− (k) = + 2 + . . . for k → ±∞. k k k If T = ∞ then r2− (k) ≡ r1+ (k) ≡ 0. Since det R(k) ≡ 1, then ρ− (k)ρ+ (k) = |ρ+ (k)|2 ≡ 1. Hence, ρ± (k) can be written in the form: ρ± (k) = e±iν(k)
k∈R
(3.14)
with a real function $\nu(k)$ for $k \in \mathbb{R}$. The function $\nu(k)$ has an analytic continuation to $\Omega_1 \cup \Omega_3 \cup \Omega_4 \cup \Omega_6$ which satisfies:
¯ ν(k) = −ν(−k), ¯ and ν(k) → 0 as k → ∞, • ν(k) = ν¯ (k), in view of the asymptotics of the functions s2+ (k) and p2+ (k). Indeed, in view of (3.12), pj+ (k) = ρ+ (k) sj+ (k) for j = 1, 2,
(3.15)
ρ+ (k) must have poles at the points where s1+ (k) and s2+ (k) vanish. On the other hand, s1+ (k) and s2+ (k) must simultaneously vanish at poles in view of the analyticity of the functions pj+ (k) for k ∈ 1 ∪ 3 . Hence + (x, t, k) vanish identically if k is a pole, which is impossible. So ρ+ (k) is analytic (without singularities) in k ∈ 1 ∪ 3 . Hence the functions pj+ (k) and sj+ (k) have a common set of zeros, possibly empty, in 1 ∪ 3 . The other statements about ν(k) are obvious. So, for r1− (k) which is analytic in k ∈ 2 we obtain: r1− (k) =
$\dfrac{s_2^+(k)}{p_2^+(k)} = e^{-i\nu(k)}$ for $k \in \partial\Omega_2$. (3.16)
The last formula follows from the relations
$$r_1^-(k) = p_1^-(k)s_2^+(k) - p_2^-(k)s_1^+(k), \qquad p_1^+(k)s_2^+(k) - p_2^+(k)s_1^+(k) = 0.$$
Hence the function $r_1^-(k)$ has an analytic continuation to the domain $\Omega_1 \cup \Omega_3$, where it coincides with the function $1/\rho_+(k)$. Therefore the function $r_1^-(k)$ does not vanish for $k \in \partial\Omega_2$, and its zeros are some points $z_j \in \Omega_2$. Let dbc be the set of zeros of the function $r_1^-(k)$. Here we also assume that the boundary functions are such that the number of zeros is finite: dbc = $\{z_1, \dots, z_m \in \Omega_2 \mid r_1^-(z_j) = 0\}$, and all zeros are simple, i.e. $\dot r_1^-(z_j) \neq 0$. We have dbc = ∅ if $\lambda = 1$. Let
$$\rho(k) := \frac{r_2^-(k)}{r_1^-(k)}.$$
The functions $\rho(k)$ and $r_1^-(k)$ are dependent. They satisfy the determinant relation
$$1 - \lambda|\rho(k)|^2 = \frac{1}{|r_1^-(k)|^2}, \qquad k \in \mathbb{R}, \tag{3.17}$$
and $\rho(k) \equiv 0$ if $T = \infty$. They have the following properties:
• $\rho(k), r_1^-(k) \in C^\infty(\mathbb{R})$, and $\rho(-k) = \bar\rho(k)$, $r_1^-(k) = \bar r_1^-(-k)$.
• If $T = \infty$, $\rho(k) \equiv 0$ for $k \in \mathbb{R}$, and $r_1^-(k) = e^{-i\nu(k)}$, where $\nu(k)$ is described above.
• If $T < \infty$, then
$$r_1^-(k) = \prod_{z_j \in \mathbb{C}_+} \left(\frac{k - z_j}{k - \bar z_j}\right)^{\frac{1-\lambda}{2}} \exp\left\{\frac{i}{2\pi}\int_{-\infty}^{\infty} \frac{\log(1 - \lambda|\rho(s)|^2)}{s - k}\,ds\right\}, \qquad k \in \mathbb{C}_+. \tag{3.18}$$
• The function
$$\rho(k) - r(k) = \frac{p_2^-(k)}{r_1^-(k)s_2^+(k)} \tag{3.19}$$
has an analytic continuation to $\mathbb{C}_+$ for $T < \infty$. For $T = \infty$ the r.h.s. is analytic only in $\Omega_2$.
The last item follows from Eqs. (3.10) and (3.11), which yield
$$p_1^-(k) = r_2^-(k)s_1^+(k) + r_1^-(k)s_1^-(k) = r_1^-(k)\left[1/s_2^+(k) + s_1^+(k)(\rho(k) - r(k))\right], \tag{3.20}$$
$$p_2^-(k) = r_2^-(k)s_2^+(k) + r_1^-(k)s_2^-(k) = r_1^-(k)s_2^+(k)\left[\rho(k) - r(k)\right] \quad \text{for } k \in \mathbb{R}. \tag{3.21}$$
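The algebra behind (3.20) and (3.21) relies only on $\det S(k) = 1$, the definition $\rho = r_2^-/r_1^-$ and (3.8). The sketch below, under those assumptions, checks both identities and (3.10) with arbitrary complex numbers standing in for the scattering functions at one point $k$; the function names are hypothetical and the sample values carry no physical meaning.

```python
def compose(s1p, s2p, s2m, r1m, r2m):
    """Return (p1m, p2m, rho - r) for given complex entries.

    s1m is fixed by det S = s1m*s2p - s1p*s2m = 1 (requires s2p != 0);
    p1m, p2m are the middle expressions of (3.20), (3.21);
    rho = r2m / r1m and r = -s2m / s2p as in (3.8).
    """
    s1m = (1 + s1p * s2m) / s2p
    p1m = r2m * s1p + r1m * s1m
    p2m = r2m * s2p + r1m * s2m
    c = r2m / r1m - (-s2m / s2p)
    return p1m, p2m, c


def identities_hold(s1p, s2p, s2m, r1m, r2m, tol=1e-10):
    """Check the right-hand sides of (3.20), (3.21) and relation (3.10)."""
    p1m, p2m, c = compose(s1p, s2p, s2m, r1m, r2m)
    ok_320 = abs(p1m - r1m * (1 / s2p + s1p * c)) < tol
    ok_321 = abs(p2m - r1m * s2p * c) < tol
    ok_310 = abs(r1m - (p1m * s2p - p2m * s1p)) < tol
    return ok_320 and ok_321 and ok_310
```

The step that makes (3.20) work is $s_1^- = (1 + s_1^+ s_2^-)/s_2^+$, which is just $\det S = 1$ solved for $s_1^-$.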
For T < ∞ the difference ρ(k) − r(k) has an analytic continuation to C+ because the l.h.s. are analytic in k ∈ C+ . Hence, the r.h.s. must have analytic continuations to C+ . The second main relation of the compatible scattering problem is: 1 Y − (x, t, k) r1− (k) − (x, t, k) + ρ(k) + (x, t, k) = − (x, t, k)
G(x, t, k) =
for k ∈ R if T < ∞ for k ∈ R if T = ∞.
(3.22)
The function G(x, t, k) is analytic in k ∈ 2 , for k = zj and the zj ’s are poles of G. If r1− (zj ) = 0 then Y − (x, t, zj ) and + (x, t, zj ) are linearly dependent: Y − (x, t, zj ) = γj2 + (x, t, zj ),
j = 1, . . . , m,
Resk=zj G(x, t, k) = cj2 + (x, t, zj ),
cj2 =
hence γj2
r˙1− (zj )
with γj2 =
p1− (zj ) s1+ (zj )
=
p2− (zj ) s2+ (zj )
.
Using asymptotics of the function Y − (x, t, k) in the neighborhood of k = ∞ for k ∈ 2 , we find 3 1 −1 G(x, t, k) = + O(|k| ) e−ikx−4ik t 0 for |k| → ∞, k ∈ 2 . (3.23) This asymptotic formula will be used in the next section.
4. The Main Integral Equations The main relations of the compatible scattering problem follow from (3.1), (3.2) and (3.7), (3.22): F (x, t, k) = − (x, t, k) + r(k) + (x, t, k) for k ∈ R, G(x, t, k) = − (x, t, k) + ρ(k) + (x, t, k) for k ∈ R.
(4.1) (4.2)
These relations give: G(x, t, k) − F (x, t, k) = c(k) + (x, t, k), where c(k) can be written as follows c(k) = ρ(k) − r(k) =
p2− (k)
s2+ (k) r1− (k)
for
k ∈ C+ k ∈ 2
(4.3)
if T < ∞, if T = ∞.
(4.4)
Properties of c(k). Indeed, c(k) is meromorphic in C+ if T < ∞, and in 2 if T = ∞, ¯ because p − (k), s + (k) and r − (k) are analytic in k ∈ C+ if T < ∞, and c(k) = c(−k), 2 2 1 and in k ∈ 2 if T = ∞. Hence relation (4.3) is true for all k ∈ 2 . Furthermore, since ρ(k) = c(k) + r(k) ≡ 0 (T = ∞) for k ∈ R, c(k) satisfies Condition C4. The function c(k) has poles at the points zj , where s2+ (zj ) = r1− (zj ) = 0. Since the zeros of s2+ (k) and r1− (k) are simple and finite in number, all poles of c(k) are simple and also finite in number. Indeed, we only have to check the case s2+ (z0 ) = r1− (z0 ) = 0. Due to (3.10) we also find p2− (z0 ) = 0. If λ = 1 the function c(k) is regular analytic in the corresponding domains. We have the following relation on ∂2 : Y − (x, t, k − 0) − (x, t, k + 0) − = c(k) + (x, t, k) r1− (k − 0) s2+ (k + 0)
(4.5)
for k ∈ ∂2 . To deduce the integral equations of the inverse scattering problem let us put 1 −ikx−4ik 3 t − e for k ∈ ∂2 , h (x, t, k) = G(x, t, k) − 0 1 −ikx−4ik 3 t h+ (x, t, k) = F (x, t, k) − e for k ∈ R. 0 Let us consider the integral ∞ 1 1 3 − iky+4ik 3 t J (x, y, t) = h (x, t, k)e dk + h+ (x, t, k)eiky+4ik t dk. 2π ∂2 2π −∞ Using Eqs. (4.1), (4.2), (4.4), (2.3) we find 1 3 J (x, y, t) − h− (x, t, k)eiky+4ik t dk 2π ∂2 ∞ K1 0 λK2 = (x, y, t) + Fs (x + y, t) + (x, z, t)Fs (z + y, t)dz, 1 K2 K1 x
where
$$F_s(x + y, t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} r(k)e^{ik(x+y)+8ik^3t}\,dk.$$
On the other hand, using estimates (3.9) and (3.23) of F (x, t, z) and G(x, t, z) for large k, taking into account (4.3) (4.4), (4.5) and applying the Jordan lemma, we find 1−λ 3 J (x, y, t) = Resk=kj h+ (x, t, k)eiky+4ik t i 2 +i
kj ∈1 ∪3 s2+ (kj )=0
3 Resk=zj h− (x, t, k)eiky+4ik t
zj ∈2 r1− (zj )=0
1 3 [G(x, t, k) − F (x, t, k)]eiky+4ik t dk 2π ∂2 3 1−λ = m1j eikj y+4ikj t + (x, t, kj ) − 2 kj ∈1 3 ikj y+4ikj3 t + − mj e (x, t, kj ) −
kj ∈3
1 − λ 2 izj y+4izj3 t + − mj e (x, t, zj ) 2 zj ∈2 1 3 − c(k)eiky+4ik t + (x, t, k)dk. 2π ∂2 Finally we have the following integral equations of the inverse scattering: ∞ K1 (x, y, t) + λ K2 (x, z, t)H (z + y, t)dz = 0 for 0 ≤ x < y < ∞, (4.6) x ∞ K1 (x, z, t)H (z + y, t)dz = 0 (4.7) K2 (x, y, t) + H (x + y, t) + x
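with the kernel
$$H(x, t) = \frac{1-\lambda}{2}\Big[\sum_{k_j \in \Omega_1} m_j^1 e^{ik_jx+8ik_j^3t} + \sum_{z_j \in \Omega_2} m_j^2 e^{iz_jx+8iz_j^3t} + \sum_{k_j \in \Omega_3} m_j^3 e^{ik_jx+8ik_j^3t}\Big] + \frac{1}{2\pi}\int_{\partial\Omega_2} c(k)e^{ikx+8ik^3t}\,dk + \frac{1}{2\pi}\int_{-\infty}^{\infty} r(k)e^{ikx+8ik^3t}\,dk. \tag{4.8}$$
The coefficients $m_j^1$, $m_j^2$, and $m_j^3$ are given by
$$m_j^1 = [is_1^+(k_j)\dot s_2^+(k_j)]^{-1}, \quad k_j \in \Omega_1,$$
$$m_j^3 = [is_1^+(k_j)\dot s_2^+(k_j)]^{-1}, \quad k_j \in \Omega_3,$$
$$m_j^2 = p_1^-(z_j)[is_1^+(z_j)\dot r_1^-(z_j)]^{-1} = p_2^-(z_j)[is_2^+(z_j)\dot r_1^-(z_j)]^{-1} = -i\operatorname{Res}_{k=z_j} c(k), \quad z_j \in \Omega_2. \tag{4.9}$$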
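Every term of the kernel (4.8) is a superposition of exponentials $e^{ikx+8ik^3t}$, and Sect. 6 uses the fact that $H$ satisfies $\partial H/\partial t + 8\,\partial^3 H/\partial x^3 = 0$. The short sketch below (function names are illustrative, not from the paper) checks this for a single exponential by comparing the symbols of the two derivatives, and also evaluates the modulus of the exponential for complex $k$.

```python
import cmath


def mode(k: complex, x: float, t: float) -> complex:
    """One exponential building block of the kernel (4.8)."""
    return cmath.exp(1j * k * x + 8j * k**3 * t)


def dispersion_coefficient(k: complex) -> complex:
    """Symbol of d/dt + 8 d^3/dx^3 applied to mode(k, x, t).

    d/dt multiplies the mode by 8*i*k^3 and d^3/dx^3 by (i*k)^3,
    so the operator multiplies it by 8*i*k^3 + 8*(i*k)^3.
    """
    return 8j * k**3 + 8 * (1j * k) ** 3


def mode_modulus(k: complex, x: float, t: float) -> float:
    """|mode| = exp(-x*Im k - 8*t*Im k^3)."""
    return abs(mode(k, x, t))
```

Since $(ik)^3 = -ik^3$, the coefficient vanishes identically in $k$; the modulus formula shows that a term with $\operatorname{Im} k > 0$ and $\operatorname{Im} k^3 > 0$ decays in both $x$ and $t$.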
Using (4.7) for $y = x$ one can prove that $H(x, t) \in C^\infty(\mathbb{R}_+ \times \mathbb{R}_+)$ and is rapidly decreasing in $x$, i.e. $H(x, t) = O(x^{-\infty})$ as $x \to \infty$, since $K_1(x, y, t)$ and $K_2(x, y, t)$ are in $C^\infty(\mathbb{R}_+ \times \mathbb{R}_+ \times \mathbb{R}_+)$ and also rapidly decreasing as $x + y \to \infty$, and (4.7) is a Volterra integral equation with respect to the kernel $H(x, t)$. All terms in (4.8) are clearly $C^\infty$ and of the Schwartz type except for the last term
$$\frac{1}{2\pi}\int_{-\infty}^{\infty} r(k)e^{ikx+8ik^3t}\,dk,$$
because the reflection coefficient $r(k)$ vanishes at infinity as $O(k^{-1})$, which follows from (3.3), (3.4). Thus we arrive at the requirement that this integral must be $C^\infty$ in $x$, $t$ (Condition B2). It is easy to see that under the additional condition (1.8) the reflection coefficient $r(k) \in S(\mathbb{R})$, therefore Condition B2 can be omitted because the corresponding integral is $C^\infty$. By assumption the boundary functions $q(0, t)$, $q_x(0, t)$, $q_{xx}(0, t)$ are $C^\infty$; we then obtain the following condition: the kernel $H(x, t)$ must be $C^\infty$ in $t$ when $x = 0$. Therefore the function
$$\int_{\partial\Omega_2} c(k)e^{8ik^3t}\,dk + \int_{-\infty}^{\infty} r(k)e^{8ik^3t}\,dk$$
should be $C^\infty$. Thus we arrive at the property given in Condition C5.
For any fixed $t \in \mathbb{R}_+$ the function $H(x, t)$ will be rapidly decreasing for $x \to \infty$. Indeed, using the method of steepest descent and integration by parts we see that $H(x, t) = O(x^{-\infty})$ because $c(k)$ and $r(k)$ are $C^\infty$ and, according to their asymptotic behavior, they vanish at infinity as well as their derivatives of any order. If $T = \infty$ we also use Condition C4.
Remark. For $t = 0$ the kernel $H(x, t)|_{t=0}$ coincides with the kernel
$$H_0(x) = \sum_{k_j \in \mathbb{C}_+} m_j e^{ik_jx} + \frac{1}{2\pi}\int_{-\infty}^{\infty} r(k)e^{ikx}\,dk,$$
because in this case (t = 0) the integral over ∂2 can be evaluated by using the residues of the function c(k). After integration we find that H (x, 0) = H0 (x). Then Marchenko integral equations with kernel H0 (x) yield q(x, 0) = u(x). Now it is natural to introduce the set R = {s1+ (k), s2+ (k), k ∈ C+ ; p1+ (k, T ), p2+ (k, T ), k ∈ }
(4.10)
and to call it (see §1.3) the set of scattering data of the compatible eigenvalue problem for the pair of differential equations (1.2), (1.3) defined by q(x, t) satisfying (1.1). The kernel H (x, t) of the Marchenko equations is completely defined by the scattering data R. Conditions A–B–C from the Introduction follow immediately from the properties proved above for the scattering data R. The properties given by Conditions A–B–C are characteristic, i.e. they are sufficient to ensure that the functions s1+ (k), s2+ (k), k ∈ C+ , and p1+ (k, T ), p2+ (k, T ), k ∈ , are the scattering data of compatible x- and t-equations (1.2), (1.3) with q(x, t) satisfying the mKdV equation (1.1) with an initial function u(x) ∈ S(R+ ) and boundary values v(t), v1 (t), v2 (t) ∈ C ∞ [0, T ] if T < ∞, or v(t), v1 (t), v2 (t) ∈ S(R+ ) if T = ∞. Anyway formula (1.4) and Marchenko integral equations (4.6), (4.7) represent a solution of the mKdV equation if the kernel (4.8) is sufficiently smooth and rapidly decreasing as x → ∞. It follows from statements in Sect. 6.
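To illustrate how the Marchenko equations (4.6), (4.7) produce $q(x, t) = 2K_2(x, x, t)$, here is a minimal sketch for the simplest reflectionless situation. Everything specific in it is an assumption made for illustration and is not taken from the paper: $\lambda = -1$, $r \equiv 0$, $c \equiv 0$, a single eigenvalue $k_1 = i\kappa$ with $\kappa > 0$, and a real coefficient $m > 0$, so that the kernel reduces to $H(x, t) = m e^{-\kappa x + 8\kappa^3 t}$. For such a separable kernel the ansatz $K_2 = a\,e^{-\kappa y}$, $K_1 = b\,e^{-\kappa y}$ turns (4.6), (4.7) into two linear equations for $a$ and $b$.

```python
import math


def marchenko_one_pole(x, t, kappa, m):
    """Solve (4.6)-(4.7) for lambda = -1 and H = m*exp(-kappa*x + 8*kappa^3*t).

    Returns (a, b) with K2(x, y, t) = a*exp(-kappa*y) and
    K1(x, y, t) = b*exp(-kappa*y).  Assumed toy kernel, see lead-in.
    """
    amp = m * math.exp(8 * kappa**3 * t)
    g = amp * math.exp(-2 * kappa * x) / (2 * kappa)
    a = -amp * math.exp(-kappa * x) / (1 + g * g)   # from (4.7)
    b = a * g                                       # from (4.6)
    return a, b


def q_from_kernel(x, t, kappa, m):
    """q(x, t) = 2*K2(x, x, t)."""
    a, _ = marchenko_one_pole(x, t, kappa, m)
    return 2 * a * math.exp(-kappa * x)


def q_sech(x, t, kappa, m):
    """The same function written as a sech profile."""
    phase = 2 * kappa * x - 8 * kappa**3 * t - math.log(m / (2 * kappa))
    return -2 * kappa / math.cosh(phase)


def residual_47(x, y, t, kappa, m, length=40.0, n=20000):
    """Left-hand side of (4.7), integral by the midpoint rule on [x, x+length]."""
    a, b = marchenko_one_pole(x, t, kappa, m)
    amp = m * math.exp(8 * kappa**3 * t)
    step = length / n
    integral = 0.0
    for i in range(n):
        z = x + (i + 0.5) * step
        integral += b * math.exp(-kappa * z) * amp * math.exp(-kappa * (z + y))
    integral *= step
    return a * math.exp(-kappa * y) + amp * math.exp(-kappa * (x + y)) + integral
```

Under these assumptions the result is a sech-shaped pulse of amplitude $2\kappa$ whose phase is $2\kappa x - 8\kappa^3 t$ up to a constant shift; `residual_47` gives an independent quadrature check that the closed form does satisfy (4.7).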
5. Formulation of the Riemann-Hilbert Problem Here we give a formulation of the inverse scattering problem as a Riemann-Hilbert problem which will be used for proving that the solution q(x, t) arising from Marchenko equations satisfies the boundary conditions. We recall that the equality q(x, 0) = u(x) is already proved. The main scattering relations (3.1) yield the following Riemann-Hilbert problem. Indeed, from (3.1) we derive: − (x, t, k) = − (x, t, k) + r(k) + (x, t, k), s2+ (k)
k ∈ R,
Y − (x, t, k) = − (x, t, k) + ρ(k) + (x, t, k), r1− (k)
k ∈ R,
+ (x, t, k) = + (x, t, k) + λ¯r (k) − (x, t, k), s1− (k)
Y + (x, t, k) + ¯ = + (x, t, k) + λρ(k) (x, t, k), r2+ (k)
Y − (x, t, k) − (x, t, k) = c(k) + (x, t, k), − s2+ (k) r1− (k)
Y + (x, t, k) + (x, t, k) ¯ − (x, t, k), − = λc( ¯ k) s1− (k) r2+ (k)
(5.1)
k ∈ R, k ∈ R,
(5.2)
k ∈ ∂2 ,
(5.3)
k ∈ ∂5 .
(5.4)
Let us define the sectionally meromorphic (analytic for λ = 1) matrix M(k, x, t): − 1 (x, t, k)eiθ + −iθ 1 (x, t, k)e s2+ (k) k ∈ 1 ∪ 3 − iθ (x, t, k)e + 2 −iθ 2 (x, t, k)e + − s2 (k) iθ Y1 (x, t, k)e + −iθ 1 (x, t, k)e r1− (k) k ∈ 2 − Y2 (x, t, k)eiθ + −iθ (x, t, k)e 2 − (k) r 1 M(k, x, t) = Y1+ (x, t, k)e−iθ − iθ 1 (x, t, k)e r2+ (k) k ∈ 5 + −iθ Y (x, t, k)e − 2 iθ 2 (x, t, k)e + r2 (k) + (x, t, k)e−iθ − 1 iθ 1 (x, t, k)e s1− (k) k ∈ 4 ∪ 6 . + 2 (x, t, k)e−iθ − iθ (x, t, k)e 2 s1− (k) Here θ = θ (x, t, k) = k(x + 4k 2 t). Then we have the following Riemann-Hilbert problem M− (k, x, t) = M+ (k, x, t)J (k, x, t),
k∈
(5.5)
on the contour $\Sigma = \{k \mid \operatorname{Im} k^3 = 0\}$. The orientation on the contour is chosen in such a way that the sign "+" (resp. "−") corresponds to the left (resp. right) boundary values of the matrix $M(k, x, t)$ in the domains $\Omega_1$, $\Omega_3$, $\Omega_5$ marked by "+", and in the domains $\Omega_2$, $\Omega_4$, $\Omega_6$ marked by "−". The corresponding graph is depicted in Fig. 2.
Fig. 2. The oriented contour $\Sigma$
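The contour $\operatorname{Im} k^3 = 0$ consists of the six rays $\arg k = j\pi/3$, which cut the plane into six sectors. The sketch below assumes the sectors are numbered counterclockwise starting from the positive real axis (this matches the rays listed for the jump matrix, but the numbering convention and the function names are assumptions). It returns the sector index of a point and the sign of $\operatorname{Im} k^3$ there.

```python
import cmath
import math


def sector(k: complex) -> int:
    """Index j in 1..6 of the sector (j-1)*pi/3 <= arg k < j*pi/3.

    Assumes counterclockwise numbering from the positive real axis.
    Points on the rays themselves are assigned to the sector that
    starts there; k = 0 is not meaningful.
    """
    theta = cmath.phase(k) % (2 * math.pi)          # angle in [0, 2*pi)
    return min(int(theta // (math.pi / 3)), 5) + 1  # guard against rounding


def sign_im_k3(k: complex) -> int:
    """Sign of Im k^3: +1, -1, or 0 on the contour."""
    value = (k**3).imag
    return (value > 0) - (value < 0)
```

With this numbering $\operatorname{Im} k^3 > 0$ in the odd sectors and $\operatorname{Im} k^3 < 0$ in the even ones, which is the "+"/"−" marking described above.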
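The jump matrix has the form:
$$J(k, x, t) = \begin{cases} \begin{pmatrix} 1 & \lambda\bar r(k)e^{-2i\theta} \\ -r(k)e^{2i\theta} & 1 - \lambda|r(k)|^2 \end{pmatrix}, & \arg k = 0,\ \pi; \\[2mm] \begin{pmatrix} 1 & 0 \\ c(k)e^{2i\theta} & 1 \end{pmatrix}, & \arg k = \dfrac{\pi}{3},\ \dfrac{2\pi}{3}; \\[2mm] \begin{pmatrix} 1 & \lambda\bar c(\bar k)e^{-2i\theta} \\ 0 & 1 \end{pmatrix}, & \arg k = \dfrac{4\pi}{3},\ \dfrac{5\pi}{3}. \end{cases}$$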
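On the real axis the jump matrix has unit determinant, while on the other rays it is triangular with ones on the diagonal. The following sketch checks the real-axis case numerically with $\theta = k(x + 4k^2t)$ as defined above; the function names and the sample value of $r$ are illustrative assumptions.

```python
import cmath


def theta(k: float, x: float, t: float) -> float:
    """Phase theta(x, t, k) = k*(x + 4*k^2*t)."""
    return k * (x + 4 * k**2 * t)


def jump_real_axis(r: complex, k: float, x: float, t: float, lam: int):
    """Jump matrix on arg k = 0, pi for a given value r = r(k), k real."""
    e = cmath.exp(2j * theta(k, x, t))     # e^{2 i theta}, modulus 1 for real k
    return ((1, lam * r.conjugate() / e),  # upper right: lam * conj(r) * e^{-2 i theta}
            (-r * e, 1 - lam * abs(r) ** 2))


def det2(m) -> complex:
    """Determinant of a 2x2 matrix given as nested tuples."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]
```

The determinant is $(1 - \lambda|r|^2) + \lambda|r|^2 = 1$ for either sign of $\lambda$, independently of $x$ and $t$.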
The proof of Eqs. (5.5) is a simple algebraic verification of relations (5.1)–(5.4). Remark. The above Riemann-Hilbert problem is written for the case when the set of the eigenvalues is empty. More details about the Riemann-Hilbert problem for the mKdV equation and complete consideration of the initial-boundary value problem can be found in [11]. Remark. We consider the problem for the form (1.1) of the mKdV equation in the first quarter (x ≥ 0, t ≥ 0) of the xt-plane. This problem is different from that studied in [11], which is the initial boundary value problem for the same equation, but in the second quarter (x ≤ 0, t ≥ 0) of the xt-plane. Scattering (spectral) data for that problem have different analytic properties. It is well-known that for the KdV and mKdV equations there are differences between the initial boundary value problems for x > 0 and for x < 0.
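Indeed, the kernel of the Marchenko integral equations, which have to be considered now in the domain $-\infty < y < x \le 0$, takes the form:
$$H(x, t) = \frac{1-\lambda}{2}\Big[\sum_{k_j \in \Omega_5} \hat m_j^5 e^{ik_jx+8ik_j^3t} + \sum_{z_j \in \Omega_4} m_j^4 e^{iz_jx+8iz_j^3t} + \sum_{z_j \in \Omega_6} m_j^6 e^{iz_jx+8iz_j^3t}\Big] + \frac{1}{2\pi}\int_{\partial\Omega_5} c(k)e^{ikx+8ik^3t}\,dk + \frac{1}{2\pi}\int_{-\infty}^{\infty} (r(k) + c(k))e^{ikx+8ik^3t}\,dk,$$
where $c(k)$ is now meromorphic (analytic for $\lambda = 1$) in $\Omega_4 \cup \Omega_6$.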
To prove that q(x, t) satisfies the boundary condition we have also to formulate the Riemann-Hilbert problem for the t-equation. To do so let us define the sectionally meromorphic (analytic for λ = 1) matrix N (k, t): ˆ + (t, k)e−4ik 3 t − 3t 1 4ik ϕˆ1 (t, k)e p2+ (k) k ∈ 1 ∪ 3 ∪ 5 ˆ + (t, k)e−4ik 3 t − 3t 2 4ik ϕˆ2 (t, k)e p2+ (k) N(k, t) = ˆ − (t, k)e4ik 3 t 3t + 1 −4ik ϕˆ1 (t, k)e p1− (k) k ∈ 2 ∪ 4 ∪ 6 , − 4ik 3 t ˆ (t, k)e 3 + 2 −4ik t ϕˆ2 (t, k)e p1− (k) − ˆ k) = ˆ (t, k) ˆ + (t, k) are matrix where ϕ(t, ˆ k) = ϕˆ − (t, k) ϕˆ + (t, k) and (t, solutions (2.9) and (2.10) of the t-equation. Then it is easy to verify that N (k, t) is a solution of the following Riemann-Hilbert problem: N− (k, t) = N+ (k, t)J t (k, t),
k∈
(5.6)
on the contour $\Sigma$ (Fig. 2) oriented as above. The jump matrix $J^t(k, t)$ has the form:
$$J^t(k, t) = \begin{pmatrix} 1 & p^-(k)e^{4ik^3t} \\ -p^+(k)e^{-4ik^3t} & 1 - p^-(k)p^+(k) \end{pmatrix},$$
where $p^-(k) = p_2^-(k)/p_1^-(k)$, $p^+(k) = p_1^+(k)/p_2^+(k)$, and $p_j^\pm(k)$ are the entries of the "scattering" matrix $P(k)$. We have already proved that $q(x, 0) = u(x)$. The proof that $q(x, t)$ satisfies the boundary values $q(0, t) = v(t)$, $q_x(0, t) = v_1(t)$ and $q_{xx}(0, t) = v_2(t)$ is carried out by using Riemann-Hilbert problems (5.5) and (5.6) in the same manner as in [28]. The main tool in the proof is the existence of an analytic map from the Riemann-Hilbert problem (5.5) attached to $M(k, 0, t)$ into the Riemann-Hilbert problem (5.6) attached to $N(k, t)$, together with the "global relation" (Condition A9). Such a proof for the mKdV equation is given in detail in [11].
6. Inverse Scattering Problem
Let R be scattering data (4.10) satisfying Conditions A–B–C. Then:
Statements.
1. The xt-integral equation
$$\mathbf{K}(x, y, t) + \mathbf{H}(x + y, t) + \int_x^\infty \mathbf{K}(x, z, t)\mathbf{H}(z + y, t)\,dz = 0, \qquad 0 \le x < y < \infty, \quad 0 \le t < \infty, \tag{6.1}$$
with the 2 × 2 matrix kernel
$$\mathbf{H}(x, t) = \begin{pmatrix} 0 & H(x, t) \\ \lambda H(x, t) & 0 \end{pmatrix},$$
where real scalar function H (x, t) given by (4.8), is uniquely solvable in L1 (x, ∞) for any x ≥ 0 and t ≥ 0. 2. The solution K(x, y, t) belongs to C ∞ (R+ × R+ × R+ ), it and all its derivatives decrease faster than any negative power of x + y, for x + y → ∞, and t fixed. 3. The matrix ∞ 3 −ikxσ3 −ikyσ3 + K(x, y, t)e dy e−4ik tσ3 (x, t, k) = e x
satisfies the symmetry conditions ¯ (x, t, k) = (x, t, k)−1 ¯ ± (x, t, k) = ± (x, t, −k)
for k ∈ R, for k ∈ C± ,
and is a solution of the x-equation (1.2) with Q(x, t) given by Q(x, t) = σ3 K(x, x, t)σ3 − K(x, x, t).
(6.2)
4. (x, t, k) is a solution of the x-and t-equations constructed from the matrix Q(x, t) and its derivatives Qx (x, t), Qxx (x, t), using Eqs. (6.2), (1.2), (1.3) and (2.3). 5. The scattering data R of these compatible differential equations coincide with the chosen functions s1+ (k), s2+ (k), p1+ (k, T ), p2+ (k, T ). Statement 1 follows from Lemma 2 about the solvability of the xt-integral equations: Lemma 2. Let R be scattering data satisfying Conditions A–B–C. Then the xt-integral equations (4.6)–(4.8) have a unique solution in L1 (x, ∞). Proof. Under Conditions A–B–C the integral operator of the xt-integral equation is compact in L1 (x, ∞). Then, by Fredholm theory the xt-integral equation has a unique solution if the homogeneous equation has no non-zero solution. If a non-zero solution does exist in L1 (x, ∞), in view of the homogeneity of the integral equation, it is bounded, hence belongs to L2 (x, ∞). The integral operator is clearly skew-Hermitian in L2 (x, ∞), so we obtain a contradiction, because the only solution in this case is zero. For λ = 1 the proof is more complicated. For example, it follows from the solvability of the corresponding Riemann-Hilbert problem (5.5). In turn, the unique solvability of the Riemann-Hilbert problem is proved by the same way as in [50].
Statement 2 follows from Lemma 3:
Lemma 3. Let Conditions A–B–C be fulfilled and $(K_1(x, y, t), K_2(x, y, t))$ be the solution of the xt-integral equations (4.6)–(4.8). Then $(K_1(x, y, t), K_2(x, y, t)) \in C^\infty(\mathbb{R}_+ \times \mathbb{R}_+ \times \mathbb{R}_+)$. These functions and all their derivatives decrease faster than any negative power of $x + y$, for $x + y \to \infty$, $t$ fixed. Moreover, $q(x, t) = 2K_2(x, x, t)$
(a) is $C^\infty$ in $x$ and $t$,
(b) decreases faster than any negative power of $x$ for $x \to \infty$, $t$ fixed,
(c) is a solution of the mKdV equation with initial function $q(x, 0) = u(x)$ and boundary values $q(0, t) = v(t)$, $q_x(0, t) = v_1(t)$, $q_{xx}(0, t) = v_2(t)$.
Proof (of Statement 2). According to Lemma 2 the xt-integral equations have a solution
$$\mathbf{K}(x, y, t) = \begin{pmatrix} K_1(x, y, t) & K_2(x, y, t) \\ \lambda K_2(x, y, t) & K_1(x, y, t) \end{pmatrix}$$
which belongs to $L^1(x, \infty)$. By Condition A, the kernel $H(x + y, t)$ is in $C^\infty(\mathbb{R}_+ \times \mathbb{R}_+)$ and is fast decreasing as $x + y \to \infty$ (end of Sect. 4). Therefore $(K_1(x, y, t), K_2(x, y, t))$ is in $C^\infty(\mathbb{R}_+ \times \mathbb{R}_+ \times \mathbb{R}_+)$ and vanishes faster than any negative power of $x + y$ as $x + y \to \infty$, $t$ fixed. The same is true for all their derivatives and for $q(x, t)$. It is clear that $H(x, t)$ satisfies:
$$\frac{\partial}{\partial t}H(x, t) + 8\frac{\partial^3}{\partial x^3}H(x, t) = 0.$$
Then it is well-known that $q(x, t)$ solves the mKdV equation, cf. [1, 49]. The fact that $q(x, t)$ satisfies the boundary conditions was discussed at the end of Sect. 5.
Proof (of Statements 3 and 4). The proof of Statement 3 is well-known [1, 19]. In particular, formula (6.2) follows from Eq. (2.4). Statement 4 is also true. Indeed, due to Lemma 3, the function $q(x, t)$ is a solution of the mKdV equation (1.1). Hence the constructed x- and t-equations are compatible. Therefore, according to Lemma 1, the matrix-valued function (x, t, k) solves the t-equation.
Proof (of Statement 5). Let us consider compatible solutions (x, t, k) (2.3), (x, t, k) (2.6), and $Y(x, t, k)$ (2.11) of the constructed x- and t-equations. Let $\tilde R$ be the corresponding scattering data. We have to show that $R \equiv \tilde R$. First of all one can find that the matrix $\mathbf{K}(x, y, 0)$ solves the xt-integral equation (6.1) ($t = 0$) with kernel $\tilde{\mathbf{H}}(x, 0)$ generated by
$$\tilde H(x) = \sum_{\substack{\tilde k_j \in \mathbb{C}_+ \\ \tilde s_2^+(\tilde k_j) = 0}} \tilde m_j e^{i\tilde k_jx} + \frac{1}{2\pi}\int_{-\infty}^{\infty} \tilde r(k)e^{ikx}\,dk.$$
On the other hand the matrix $\mathbf{K}(x, y, 0)$ solves the same integral equation ($t = 0$) with kernel $\mathbf{H}(x, 0)$ generated by
$$H(x) = \sum_{\substack{k_j \in \mathbb{C}_+ \\ s_2^+(k_j) = 0}} m_j e^{ik_jx} + \frac{1}{2\pi}\int_{-\infty}^{\infty} r(k)e^{ikx}\,dk,$$
since
$$\frac{1}{2\pi}\int_{\partial\Omega_2} c(k)e^{ikx}\,dk = \sum_{\substack{k_j \in \Omega_2 \\ s_2^+(k_j) = 0}} m_j^2 e^{ik_jx} - \sum_{\substack{z_j \in \Omega_2 \\ r_1^-(z_j) = 0}} m_j^2 e^{iz_jx}.$$
Hence $F(x) = \tilde H(x) - H(x)$ is a solution of the homogeneous Volterra integral equation
$$F(2x) + \int_{2x}^{\infty} K_1(x, y - x, 0)F(y)\,dy = 0,$$
which yields the identity $F(x) \equiv 0$. Therefore $\tilde k_j = k_j$, $\tilde m_j = m_j$ (hence $\tilde m_j^1 = m_j^1$, $\tilde m_j^3 = m_j^3$) and $\tilde r(k) \equiv r(k)$ for $k \in \mathbb{R}$. Scattering data $\{s_1^+(k), s_2^+(k)\}$ for the x-equation can be easily recovered [19] from $r(k)$ and $k_1, \dots, k_n$ by using formula (3.6). For $t > 0$ by using (4.8) we shall now obtain the relation $F(x, t) = \tilde H(x, t) - H(x, t) \equiv 0$, which yields $\tilde c(k) \equiv c(k)$ and $\tilde z_j = z_j$, $\tilde m_j^2 = m_j^2$. The uniqueness of $c(k)$ follows from estimates for the functions $p_j^+(k)$ (Conditions A7 and A8). Then we recover $\rho(k) = r(k) + c(k)$ and $r_1^-(k)$ by using (3.19) and (3.18). Furthermore $p_1^+(k)$ and $p_2^+(k)$ are recovered by using (3.20), (3.21), an analytic continuation and the symmetry conditions $p_1^+(k) = \lambda\bar p_2^-(\bar k)$ ($\lambda = \pm 1$), $p_2^+(k) = \bar p_1^-(\bar k)$. Hence $R \equiv \tilde R$. All Statements 1–5 on the inverse scattering problem are proved. As a consequence the main theorem (see §1.6) is also proved.
References
1. Ablowitz, M.J., Segur, H.: Solitons and the Inverse Scattering Transform. Philadelphia: SIAM, 1981
2. Ablowitz, M.J., Segur, H.: The Inverse Scattering Transform: semi-infinite interval. J. Math. Phys. 16, 1054–1056 (1975)
3. Belokolos, E.D.: General Formulas for Solutions of Initial and Boundary-Value Problem for the sine-Gordon Equation. Teor. Mat. Fiz. 103, 358–367 (1995); Engl. Transl.: Theor. Math. Phys. 103, 613–620 (1995)
4. Berezanskii, Yu.M.: Integration of nonlinear difference equations by the method of the inverse spectral problem. Dokl. Akad. Nauk SSSR 281, 16–19 (1985); Engl. Transl.: Sov. Math. Dokl. 31, 264–267 (1985)
5. Berezanskii, Yu.M., Shmoish, M.: Nonisospectral flows of the semi-infinite Jacobi matrices. J. Nonlinear Math. Phys. 1, 116–146 (1994)
6. Bikbaev, R.F., Its, A.R.: Algebraic-Geometric solutions of a Boundary Value Problem for the Nonlinear Schrödinger Equation. Mat. Zametki 45, 3–9 (1989); Engl. Transl.: Math. Notes 45, 349–354 (1989)
7. Bikbaev, R.F., Tarasov, V.O.: Initial Boundary Value Problem for the Nonlinear Schrödinger Equation. J. Phys. A: Math. Gen. 24, 2507–2516 (1991)
8. Bikbaev, R.F., Tarasov, V.O.: An inhomogeneous boundary value problem on the semi-axis and on a segment for the sine-Gordon equation. Algebra i Analiz 3, 78–92 (1991); Engl. Transl.: St. Petersburg Math. J. 3, 775–789 (1992)
9. Boutet de Monvel, A., Kotlyarov, V.P.: Scattering problem for the Zakharov-Shabat equations on the semi-axis. Inverse Problems 16, 1813–1837 (2000)
10. Boutet de Monvel, A., Kotlyarov, V.P.: Generation of asymptotic solitons of the nonlinear Schrödinger equation by boundary data. J. Math. Phys. 44, 3185–3215 (2003)
11. Boutet de Monvel, A., Fokas, A.S., Shepelsky, D.: The mKdV equation on the half-line. J. Inst. Math. Jussieu 3, 139–164 (2004)
12. Calogero, F., De Lillo, S.: The Burgers Equation on the semi-infinite and finite intervals. Nonlinearity 2, 37–43 (1989)
13. Caroll, R., Bu, Q.: Solution of the forced nonlinear Schrödinger equation using PDE techniques. Appl. Anal. 41, 33–51 (1991)
14. Degasperis, A., Manakov, S.V., Santini, P.M.: On the Initial-Boundary Value Problems for Soliton Equations. JETP Lett. 74, 481–485 (2001)
15. Degasperis, A., Manakov, S.V., Santini, P.M.: Initial-Boundary Value Problems for Linear and Soliton PDEs. Theoret. and Math. Phys. 133, 1475–1489 (2002)
16. Deift, P., Zhou, X.: A steepest descent method for oscillatory Riemann-Hilbert problems. Bull. Am. Math. Soc. 26, 119–123 (1992)
17. Deift, P., Zhou, X.: A steepest descent method for oscillatory Riemann-Hilbert problems. Asymptotics for the mKdV equation. Ann. Math. 137, 295–368 (1993)
18. Dubrovin, B.A.: Theta functions and nonlinear equations. Usp. Mat. Nauk 36, 11–80 (1981); Engl. Transl.: Russ. Math. Surv. 36, 11–92 (1982)
19. Faddeev, L.D., Takhtajan, L.A.: Hamiltonian Methods in the Theory of Solitons. Moscow: Nauka, 1986; Engl. Transl.: Berlin: Springer, 1987
20. Fokas, A.S.: An Initial Boundary Value Problem for the Nonlinear Schrödinger Equation. Phys. D 35, 167–185 (1989)
21. Fokas, A.S.: On the integrability of linear and nonlinear partial differential equations. J. Math. Phys. 41, 4188–4237 (2000)
22. Fokas, A.S., Its, A.R.: An initial-boundary value problem for the sine-Gordon equation in laboratory coordinates. Teoret. Mat. Fiz. 92, 387–403 (1992); Engl. Transl.: Theoret. and Math. Phys. 92, 964–978 (1992)
23. Fokas, A.S., Its, A.R.: An initial-boundary value problem for the Korteweg-de Vries equation. In: "Solitons, nonlinear wave equations and computation (New Brunswick, NJ, 1992)". Math. Comput. Simulation 37, 293–321 (1994)
24. Fokas, A.S., Its, A.R.: The Linearization of the Initial Boundary Value Problem of the Nonlinear Schrödinger Equation. SIAM J. Math. Anal. 27, 738–764 (1996)
25. Fokas, A.S., Gelfand, I.M.: Integrability of Linear and Nonlinear Evolution Equations and the Associated Nonlinear Fourier Transform. Lett. Math. Phys. 32, 189–210 (1992)
26. Fokas, A.S.: A unified transform method for solving linear and certain nonlinear PDEs. Proc. Roy. Soc. London Ser. A 453, 1411–1443 (1997)
27. Fokas, A.S.: Two dimensional linear PDEs in a convex polygon. Proc. Roy. Soc. London Ser. A 457, 371–393 (2001)
28. Fokas, A.S., Its, A.R., Sung, L.Y.: The nonlinear Schrödinger equation on the half-line. Preprint, Cambridge University, 2001
29. Fokas, A.S.: Integrable Nonlinear Evolution Equations on the Half-Line. Commun. Math. Phys. 230, 1–39 (2002)
30. Gattobigio, M., Liguori, A., Mintchev, M.: The Nonlinear Schrödinger equation on the half-line. J. Math. Phys. 40, 2949–2970 (1999)
31. Habibullin, I.T.: Bäcklund Transformation and Integrable Boundary-Initial Value Problems. In: Nonlinear world 1, Kiev NTPP-89. River Edge, NJ: World Scientific, 1990, pp. 130–138
32. Habibullin, I.T.: The KdV equation on a half-line with a zero boundary condition. Teoret. Mat. Fiz. 119, 397–404 (1999); Engl. Transl.: Theoret. and Math. Phys. 119, 712–718 (1999)
33. Habibullin, I.T.: An initial-boundary value problem on the half-line for the mKdV equation. Funct. Anal. Appl. 34, 52–59 (2000)
34. Kotlyarov, V.P.: Asymptotic analysis of the Marchenko integral equation and soliton asymptotics of a solution of the nonlinear Schrödinger equation. Phys. D 87, 176–185 (1995)
35. Marchenko, V.A.: Sturm-Liouville operators and their applications. Kiev: Naukova Dumka, 1977
36. Sabatier, P.C.: New direct linearizations for KdV and solutions of the other Cauchy problem. J. Math. Phys. 40, 2983–3020 (1999)
37. Sabatier, P.C.: Elbow scattering and inverse scattering applications to LKdV and KdV. J. Math. Phys. 41, 414–436 (2000)
38. Sabatier, P.C.: Past and future of inverse problems. J. Math. Phys. 41, 4082–4124 (2000)
39. Sabatier, P.C.: Elbow scattering and boundary value problems of NLPDE. J. Nonlinear Math. Phys. 8, suppl., 249–253 (2001)
40. Sabatier, P.C.: Should we study sophisticated inverse problems? Inverse Probl. 17, 1219–1223 (2001)
41. Sakhnovich, L.A.: Explicit Formulas for Spectral Characteristics and Solution of the sinh-Gordon Equation. Ukr. Mat. Zh. 42, 1517–1523 (1990); Engl. Transl.: Ukr. Mat. J. 42, 1359–1365 (1990)
42. Sakhnovich, L.A.: Integrable Nonlinear equations on the semi-axis. Ukr. Mat. Zh. 43, 1578–1584 (1991); Engl. Transl.: Ukr. Math. J. 43, 1470–1476 (1991)
43. Sakhnovich, L.A.: The Goursat Problem for the sine-Gordon Equation and an Inverse Spectral Problem. Izv. Vyssh. Uchebn. Zaved. Mat. 36, 44–54 (1992); Engl. Transl.: Russian Math. (Iz. VUZ) 36, 42–52 (1992)
44. Sakhnovich, L.A.: Sine-Gordon Equation in Laboratory coordinates and Inverse Problem on the Semi-Axis. In: Boutet de Monvel, A., Marchenko, V.A. (eds.), Algebraic and Geometrical Methods in Mathematical Physics. Proceedings, Kaciveli 1993. Dordrecht: Kluwer, 1996, pp. 443–447
45. Sklyanin, E.K.: Boundary Conditions for Integrable Equations. Funktsional. Anal. i Prilozhen. 21, 86–87 (1987); Engl. Transl.: Funct. Anal. Appl. 21, 164–166 (1987)
46. Sung, L.Y.: Solution of the Initial Boundary Problem of the nonlinear Schrödinger equation using PDE techniques. Preprint, Clarkson University, 1993
47. Tarasov, V.O.: The Boundary Value Problem for the Nonlinear Schrödinger Equation. Zap. Nauch. Semin. LOMI 169, 151–165 (1988)
48. Zakharov, V.E., Shabat, A.B.: An exact theory of two-dimensional self-focusing and one-dimensional automodulation of waves in a nonlinear medium. Soviet Phys. JETP 34, 62–78 (1972)
49. Zakharov, V.E., Shabat, A.B.: A scheme for integrating the nonlinear equations of mathematical physics by the method of the inverse scattering problem. I. Funktsional. Anal. i Prilozhen. 6, 43–53 (1974); Engl. Transl.: Funct. Anal. Appl. 8, 226–235 (1974)
50. Zhou, X.: The Riemann-Hilbert problem and inverse scattering. SIAM J. Math. Anal. 20, 966–986 (1989)
Communicated by M. Aizenman
Commun. Math. Phys. 253, 81–119 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1173-9
Communications in
Mathematical Physics
Roughness-Induced Effects on the Quasi-Geostrophic Model
Didier Bresch¹, David Gérard-Varet²
¹ LMC-IMAG, U.M.R. 5523, Université Joseph Fourier, 38041 Grenoble, France
² UMPA, U.M.R. 5669, E.N.S. Lyon, 46 Allée d'Italie, 69364 Lyon, France
Received: 21 August 2003 / Accepted: 31 March 2004 Published online: 5 November 2004 – © Springer-Verlag 2004
Abstract: We study in this paper the effect of small-scale irregularities on the quasi-geostrophic model. This study is motivated by some problems related to oceanography, such as the Gulf Stream separation, or the impact of the topography on the global circulation. We first consider the role of coastal roughness in the phenomenon of western intensification of boundary currents. We show that the roughness is responsible for nonlinear dynamics of the boundary layers, governed by a quasilinear elliptic equation. We thus extend substantially the classical derivation of Munk layers [15] and the convergence results obtained in [10]. We then discuss the effect of a rough topography, by generalizing and justifying some formal computations of [17]. In particular, we derive rigorously a simplified model of oceanic circulation, with a nonlinear and nonlocal dissipative term due to the roughness.
1. Introduction

This paper is devoted to the study of the so-called “quasi-geostrophic” system. We shall derive rigorously some new asymptotic models, accounting for roughness-induced effects. Before we present our results in more detail, let us introduce the problem in a precise way. The quasi-geostrophic system is a classical model of oceanic circulation. It reads
\[
\begin{cases}
(\partial_t + u_1\partial_x + u_2\partial_y)\,(\Delta\Psi + \beta y - F\Psi + \eta_B) + r\Delta\Psi = \beta\,\mathrm{curl}\,\tau + Re^{-1}\Delta^2\Psi,\\[2pt]
u = (u_1, u_2)^t = \nabla^\perp\Psi,\\[2pt]
\Psi|_{\partial\Omega} = \dfrac{\partial\Psi}{\partial n}\Big|_{\partial\Omega} = 0, \qquad \Psi|_{t=0} = 0.
\end{cases}
\tag{1}
\]
In these equations:
• $\Psi = \Psi(t, \mathbf{x}) \in \mathbb{R}$ is a stream function, associated to the velocity field $u = (u_1(t, \mathbf{x}), u_2(t, \mathbf{x}))^t$. The time variable $t$ lies in $\mathbb{R}_+$, and the space variable $\mathbf{x} = (x, y)$ lies in a two-dimensional domain $\Omega$ to be made precise later on.
• $d/dt = \partial_t + u_1\partial_x + u_2\partial_y$ is the transport operator by the two-dimensional flow.
• $\Delta\Psi$ is the vorticity, and $r\Delta\Psi$, $r > 0$, is the Ekman pumping term.
• $F\Psi$ is due to the free surface, with $F$ the Froude number.
• $\beta y$ is the second term in the development of the Coriolis force.
• $\eta_B$ is a bottom topography term.
• $\beta\,\mathrm{curl}\,\tau$ is the vorticity created by the wind, where $\tau$ is a given stress tensor.
• $Re^{-1}\Delta^2\Psi$ is the usual viscosity term.

Formally, system (1) is derived starting from the three-dimensional Navier-Stokes equations with a Coriolis force and a free surface: see [15] or [8] for a brief recall of this derivation. Mathematically, the justification of this derivation is open. But if the free surface is replaced by a rigid lid approximation, (1) with F = 0 can be rigorously derived, as shown by B. Desjardins and E. Grenier in [10]. Note that the derivation of the quasi-geostrophic equations with F = 0 may be mathematically proved starting from viscous shallow water equations, see for instance [5].

Despite its simplicity, this model catches some of the main features of oceanic circulation. We refer the reader to [3, 15] for an extensive physical discussion. From a mathematical viewpoint, qualitative properties of (1) have also been widely studied [7, 10, 1, 4, 6]. Among the related papers, let us mention the important work of B. Desjardins and E. Grenier [10]. These authors have performed a complete boundary layer analysis, under various asymptotics, in domains
\[
\Omega = \{\chi_w(y) \le x \le \chi_e(y),\ y_{min} \le y \le y_{max}\}.
\]
They have notably derived the so-called Munk layers, responsible for the western intensification of boundary currents. In their stationary version, Eqs. (1) have been studied in [1, 4, 6], looking for instance at the influence of an island or of the choice of boundary conditions. Note that there exist other quasi-geostrophic models, see [9], depending on the small-scale parametrizations.

The aim of the present paper is to study roughness-induced effects on the quasi-geostrophic model. This work has strong physical motivation: the ocean bottom and coastal topography vary over a wide range of scales, the smallest of which cannot be resolved in numerical computations. Such small-scale variations cannot be seen in former theoretical studies, as topography terms depend regularly on x. There is therefore a need for the development of simplified models that account for the impact of the roughness on the large-scale ocean circulation.

In a first part, we investigate the influence of rough shores. We consider the asymptotic behaviour of solutions of (1), as β goes to infinity, in a domain with rough boundaries. We extend the convergence result obtained in [10] about Munk layers. In the rough case, we show that the dynamics of the boundary layers is strongly modified. The usual Munk system of ordinary differential equations is replaced by a quasilinear elliptic equation, leading to serious additional difficulties. To overcome these difficulties, we use some methods introduced by one of the authors in [12], for rotating fluids in rough domains. In our view, this is an important step towards understanding the so-called “Gulf Stream separation”. This expression refers to the abrupt separation of the Gulf Stream from the North American coastline, at Cape Hatteras. This phenomenon, which
has been observed for many years, seems to have very little variability: the current leaves the continent on a straight path without any visible deflection at the separation point. However, it is still poorly understood from a physical point of view: see [2] for more physical insight (in particular for a numerical study of the influence of boundary conditions on the separation point). Therefore, it is important to try to obtain new models which reflect the Gulf Stream separation, taking into account for instance the effect of rough boundaries.

In a second part, we investigate the effect of a rough bottom topography. We consider rapidly oscillating functions $\eta_B^\zeta = \eta_B(x, y, x/\zeta, y/\zeta)$, periodic in both variables, with ζ a small positive parameter. Such a study has been carried out at a formal level and in the linear regime in articles [17] and [18]. We consider in this paper the “weakly nonlinear” case, which seems to be the most relevant (see [17] for a discussion on the choice of the scalings). That means:
\[
\zeta \to 0, \qquad \Psi = \zeta\psi, \qquad u = \zeta v, \qquad Re^{-1} = \nu\zeta^2, \qquad \beta, r, \nu > 0 \ \text{given}.
\]
Thus, system (1) becomes
\[
\begin{cases}
(\partial_t + \zeta\, v\cdot\nabla)\Big(\Delta\psi + \dfrac{\beta y}{\zeta} + \dfrac{1}{\zeta}\,\eta_B^\zeta\Big) + r\Delta\psi = \beta\,\mathrm{curl}\,\tau + \nu\zeta^2\Delta^2\psi,\\[4pt]
v = (v_1, v_2)^t = \nabla^\perp\psi, \qquad \psi|_{t=0} = 0.
\end{cases}
\tag{2}
\]
We show convergence results on the solutions $\psi^\zeta$ of (2) as ζ → 0. As will be seen in the sequel, the limit system has a dissipative term, due to the small-scale roughness. This term is both nonlinear and nonlocal. In the linear case, it degenerates into a convolution product, turning the limit system into an integro-differential system. All these results are in agreement with formal computations of [17].

The plan of the paper is as follows. In Sect. 2, we describe precisely the different domains we will consider and state the main results of the paper. In Sect. 3, we analyze the case of rough shores. In Sect. 4, we look at the case of the periodic topography $\eta_B^\zeta$.

2. Statement of the Results

2.1. Rough shores. In order to lighten notations, we assume throughout the study of rough coasts (Sect. 3) that r = 1, Re = 1. Up to minor changes, similar results would hold for arbitrary constants r and Re. Moreover, we use the parameter $\varepsilon = \beta^{-1/3}$ preferentially to β. The reason is that ε is the natural size of the boundary layers arising in this study. Assuming F = 0, System (1) then reads
\[
\begin{cases}
(\partial_t + u_1\partial_x + u_2\partial_y)\,(\Delta\Psi + \varepsilon^{-3} y + \eta_B) + \Delta\Psi = \varepsilon^{-3}\,\mathrm{curl}\,\tau + \Delta^2\Psi,\\[2pt]
u = (u_1, u_2)^t = \nabla^\perp\Psi,\\[2pt]
\Psi|_{\partial\Omega^\varepsilon} = \dfrac{\partial\Psi}{\partial n}\Big|_{\partial\Omega^\varepsilon} = 0, \qquad \Psi|_{t=0} = 0.
\end{cases}
\tag{3}
\]
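2.1.1. The rough domain $\Omega^\varepsilon$. Let us describe the domain $\Omega^\varepsilon$ where Eqs. (3) hold (see also Fig. 1). We write
\[
\Omega^\varepsilon = \Omega^\varepsilon_w \cup \Sigma_w \cup \Omega \cup \Sigma_e \cup \Omega^\varepsilon_e.
\]
• $\Omega$ is the “interior domain”, which, following notations of [10], is defined by
\[
\Omega = \{\chi_w(y) \le x \le \chi_e(y),\ y \in [y_{min}, y_{max}]\},
\]
where $\chi_w$ and $\chi_e$ are smooth functions defined for $y \in [y_{min}, y_{max}]$.
• $\Sigma_w = \{(\chi_w(y), y),\ y \in (y_{min}, y_{max})\}$ and $\Sigma_e = \{(\chi_e(y), y),\ y \in (y_{min}, y_{max})\}$ are “interfaces”.
• $\Omega^\varepsilon_w$ and $\Omega^\varepsilon_e$ are the “rough shores”. More precisely, let $\gamma_w = \gamma_w(Y)$ and $\gamma_e = \gamma_e(Y)$ be smooth, positive and 1-periodic functions. We set
\[
\Omega^\varepsilon_w = \{(x, y),\ 0 > x - \chi_w(y) > -\varepsilon\gamma_w(\varepsilon^{-1} y)\}, \tag{4}
\]
\[
\Omega^\varepsilon_e = \{(x, y),\ 0 < x - \chi_e(y) < \varepsilon\gamma_e(\varepsilon^{-1} y)\}. \tag{5}
\]
We also define lateral boundaries
\[
\Gamma^\varepsilon_w = \{(\chi_w(y) - \varepsilon\gamma_w(\varepsilon^{-1} y),\ y),\ y \in (y_{min}, y_{max})\},
\qquad
\Gamma^\varepsilon_e = \{(\chi_e(y) + \varepsilon\gamma_e(\varepsilon^{-1} y),\ y),\ y \in (y_{min}, y_{max})\}.
\]
Remark. Up to a few more technicalities, one could consider more general (hence more realistic) domains: the results below extend to functions $\gamma_w = \gamma_w(y, Y)$ (resp. $\gamma_e = \gamma_e(y, Y)$), with $\gamma_w(y, \cdot)$ $T_w(y)$-periodic (resp. $\gamma_e(y, \cdot)$ $T_e(y)$-periodic).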
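Fig. 1. The rough domain $\Omega^\varepsilon$ (labelled regions: rough shores $\Omega^\varepsilon_w$, $\Omega^\varepsilon_e$; interior domain $\Omega$; interfaces $\Sigma_w$, $\Sigma_e$; lateral boundaries $\Gamma^\varepsilon_w$, $\Gamma^\varepsilon_e$)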
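The definitions (4) and of the lateral boundary $\Gamma^\varepsilon_w$ can be made concrete with a short Python sketch. It is only an illustration: the coastline `chi_w` and the roughness profile `gamma_w` below are hypothetical choices (any smooth coastline and any smooth, positive, 1-periodic profile would do), and the function names are not taken from the paper.

```python
import math


def gamma_w(Y):
    # Hypothetical roughness profile: smooth, positive and 1-periodic.
    return 1.0 + 0.5 * math.cos(2.0 * math.pi * Y)


def chi_w(y):
    # Hypothetical smooth western coastline.
    return 0.1 * y


def in_western_shore(x, y, eps):
    """True when 0 > x - chi_w(y) > -eps * gamma_w(y / eps), as in (4)."""
    offset = x - chi_w(y)
    return -eps * gamma_w(y / eps) < offset < 0.0


def western_boundary(y, eps):
    """Point of the lateral boundary Gamma_w^eps at ordinate y."""
    return (chi_w(y) - eps * gamma_w(y / eps), y)
```

With these sample choices the shore has width of order ε and oscillates at scale ε in y, which is the regime studied in Sect. 3.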
2.1.2. The boundary layer domains. The study of the boundary layers requires additional boundary layer domains (Fig. 2). Namely, we define $\omega_w = \omega_w^+ \cup \sigma_w \cup \omega_w^-$, where
\[
\omega_w^+ = \{X > 0,\ Y \in (0, 1)\}, \qquad \sigma_w = \{X = 0,\ Y \in (0, 1)\},
\]
\[
\omega_w^- = \{(X, Y),\ Y \in (0, 1),\ -\gamma_w(Y) < X < 0\}.
\]
We define similarly $\omega_e = \omega_e^- \cup \sigma_e \cup \omega_e^+$. We call $n_w$ (resp. $n_e$) the outward unit normal vector at the boundary $\{X = -\gamma_w(Y)\}$ (resp. $\{X = \gamma_e(Y)\}$). For all R > 0, we denote
\[
\omega_R = \omega_w \cap \{X > R\}.
\]
Finally, we set
\[
\Omega_w = \bigcup_{k \in \mathbb{Z}} \{(X, Y + k),\ X > -\gamma_w(Y),\ Y \in [0, 1]\}.
\]
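2.1.3. Other notations. Let us introduce some more notations that will be useful in the study of the western boundary layer.

Operators. For all $y \in [y_{min}, y_{max}]$, we set
\[
\nabla_w(y) = (\partial_X,\ \partial_Y - \chi_w'(y)\partial_X)^t, \qquad
\nabla_w^\perp(y) = (\chi_w'(y)\partial_X - \partial_Y,\ \partial_X)^t,
\]
and define the operators $\Delta_w(y)$ and $Q_w(y)$ by:
\[
\Delta_w(y) = \nabla_w(y)^2 = \partial_X^2 + (\chi_w'(y)\partial_X - \partial_Y)^2,
\]
\[
Q_w(y)(\Psi, \tilde\Psi) = \nabla_w^\perp(y) \cdot \Big( \big(\nabla_w^\perp(y)\Psi \cdot \nabla_w(y)\big)\, \nabla_w^\perp(y)\tilde\Psi \Big).
\]
Similarly,
\[
\nabla_e(y) = (\partial_X,\ -\chi_e'(y)\partial_X - \partial_Y)^t, \qquad
\nabla_e^\perp(y) = (\chi_e'(y)\partial_X + \partial_Y,\ \partial_X)^t,
\]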
Fig. 2. The boundary layer domain $\omega_w$ (axes $x$, $y$; labelled regions $\omega_w^-$, $\omega_w^+$ and interface $\sigma_w$)
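and
\[
\Delta_e(y) = \nabla_e(y)^2 = \partial_X^2 + (\chi_e'(y)\partial_X + \partial_Y)^2,
\]
\[
Q_e(y)(\psi, \tilde\psi) = \nabla_e^\perp(y) \cdot \Big( \big(\nabla_e^\perp(y)\psi \cdot \nabla_e(y)\big)\, \nabla_e^\perp(y)\tilde\psi \Big).
\]
Functional spaces. We name
\[
V = \{\varphi \in C^\infty(\Omega_w),\ \varphi \ \text{1-periodic in } Y\}, \tag{6}
\]
\[
V_0 = \{\varphi \in V,\ (\mathrm{supp}\,\varphi) \cap \partial\Omega_w = \emptyset,\ \mathrm{supp}\,\varphi \ \text{bounded in the } X \text{ direction}\}. \tag{7}
\]
Finally, we define
\[
\mathcal{H}^2(\omega_w) = \text{the closure of } V \text{ in } H^2(\omega_w), \tag{8}
\]
\[
\mathcal{H}^2_0(\omega_w) = \text{the closure of } V_0 \text{ in } H^2(\omega_w), \tag{9}
\]
\[
D^2_0(\omega_w) = \text{the completion of } V_0 \text{ for the norm } \|\psi\| = \|D^2\psi\|_{(L^2(\omega_w))^4}, \tag{10}
\]
where $D^2\psi$ is the Hessian matrix of ψ.

2.1.4. Results. We want to consider System (3) in the limit of small ε, in the domain $\Omega^\varepsilon$ defined above. Let T > 0; for the sake of simplicity we assume that
\[
\mathrm{curl}\,\tau \in C^\infty([0, T] \times \Omega^\varepsilon), \qquad \mathrm{curl}\,\tau|_{t=0} = 0.
\]
Since we consider a problem with zero initial data and a source term, we are actually studying well-prepared data, as in [10]. Moreover, in order to avoid problems near northern and southern boundaries, we assume as in [10] that:
\[
\exists\, \lambda > 0, \quad \mathrm{curl}\,\tau = 0 \ \text{ for } \ y \in [y_{max} - \lambda,\ y_{max}] \cup [y_{min},\ y_{min} + \lambda]. \tag{11}
\]
As pointed out in [10], System (3) is the vorticity formulation of a two-dimensional Navier-Stokes type equation: for all ε > 0, (3) has a unique smooth solution $\Psi^\varepsilon$. As usual in boundary layer problems, the study of $\Psi^\varepsilon$ will require an additional system. For $t \in [0, T]$, $\mathbf{x} = (x, y)$ in $\Omega^\varepsilon$, let
\[
\Psi_{int}(t, \mathbf{x}) := \int_{\chi_e(y)}^{x} \mathrm{curl}\,\tau(t, x', y)\, dx', \tag{12}
\]
and for $t \in [0, T]$, $y \in [y_{min}, y_{max}]$,
\[
\phi(t, y) := \int_{\chi_e(y)}^{\chi_w(y)} \mathrm{curl}\,\tau(t, x', y)\, dx'. \tag{13}
\]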
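As an illustration of (12) and (13), the interior stream function and the boundary value φ can be approximated by a simple quadrature. The following Python sketch is an assumption-laden example rather than part of the paper: the names `psi_int` and `phi`, the trapezoidal rule and the number of nodes `n` are all illustrative choices, and `curl_tau`, `chi_w`, `chi_e` are user-supplied callables.

```python
def psi_int(curl_tau, t, x, y, chi_e, n=200):
    """Trapezoidal approximation of the integral in (12):
    curl_tau(t, x', y) integrated for x' from chi_e(y) to x."""
    a = chi_e(y)
    h = (x - a) / n  # signed step: negative when x lies west of chi_e(y)
    total = 0.5 * (curl_tau(t, a, y) + curl_tau(t, x, y))
    for k in range(1, n):
        total += curl_tau(t, a + k * h, y)
    return h * total


def phi(curl_tau, t, y, chi_w, chi_e, n=200):
    """Approximation of (13): the value of psi_int on the western interface."""
    return psi_int(curl_tau, t, chi_w(y), y, chi_e, n)
```

By construction `psi_int` vanishes at `x = chi_e(y)`, and `phi` is its value at `x = chi_w(y)`, which is the quantity that the western boundary layer has to compensate.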
For all $t \in [0, T]$ and $y \in [y_{min}, y_{max}]$, we consider the following boundary layer systems, of unknown $\Psi^{t,y} = \Psi^{t,y}(\mathbf{X})$, for $\mathbf{X} \in \omega_w$:
\[
Q_w(y)(\Psi^{t,y}, \Psi^{t,y}) + \partial_X \Psi^{t,y} - (\Delta_w(y))^2\, \Psi^{t,y} = 0, \tag{14}
\]
with the boundary conditions:
\[
\Psi^{t,y} \ \text{1-periodic in } Y, \qquad
\Psi^{t,y}|_{X = -\gamma_w(Y)} = -\phi(t, y), \qquad
\frac{\partial \Psi^{t,y}}{\partial n_w}\Big|_{X = -\gamma_w(Y)} = 0. \tag{15}
\]
Note that for fixed y, $\Delta_w(y)$ (and consequently $(\Delta_w(y))^2$) is an elliptic operator. We prove in Sect. 3 the following

Theorem 2.1. There exist a constant $\phi_\infty > 0$ and a function $\Psi_w : [0, T] \times [y_{min}, y_{max}] \times \omega_w \to \mathbb{R}$, such that if $\|\phi\|_\infty < \phi_\infty$, for all (t, y), $\Psi^{t,y} = \Psi_w(t, y, \cdot)$ is the unique weak solution in $\mathcal{H}^2(\omega_w)$ of (14)–(15) (see remark below). Moreover, $\Psi^{t,y}$ belongs to $H^m(\omega_w)$ for all m ≥ 0, and satisfies the estimate
\[
\sup_{t,y}\ \|\Psi_w(t, y, \cdot)\|_{H^m(\omega_R)} \le C_m \exp(-\sigma R), \qquad R \ge R_1, \tag{16}
\]
where σ and $R_1$ are independent of m.

Remark. For all (t, y) and for all $\Psi^{t,y}$ in $\mathcal{H}^2(\omega_w)$, one can easily see that Eq. (14) has a meaning in $(\mathcal{H}^2_0(\omega_w))'$. Furthermore, the boundary conditions (15) also have a meaning in the trace sense. Hence, “the unique weak solution of (14)–(15)” means the only function $\Psi^{t,y}$ satisfying (14) in $(\mathcal{H}^2_0(\omega_w))'$, and (15) in the trace sense.
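Once this auxiliary system is solved, we can prove our main convergence result. We prove in Sect. 3

Theorem 2.2. Let $\Psi^\varepsilon$ be the solution of System (3). There exists $C_\infty$, such that if $\|\phi\|_\infty < C_\infty$, then
\[
\|\Psi^\varepsilon - \Psi^\varepsilon_{app}\|_{L^\infty(0, T; H^1(\Omega^\varepsilon))} \to 0 \quad \text{as } \varepsilon \to 0,
\]
where
\[
\Psi^\varepsilon_{app}(t, \mathbf{x}) = \Psi_{int}(t, \mathbf{x}) + \Psi_w\Big(t, y, \frac{x - \chi_w(y)}{\varepsilon}, \frac{y}{\varepsilon}\Big),
\]
where $\Psi_{int}$ is given by (12) and $\Psi_w$ is given by Theorem 2.1.

Remark. The previous theorem provides an approximation of $\Psi^\varepsilon$ at the main order. In fact, for the proof, we will have to build an approximation up to the order ε, that is, of the form
\[
\Psi^\varepsilon_{app} = \Psi_{int}(t, \mathbf{x}) + \Psi_w^0\Big(t, y, \frac{x - \chi_w(y)}{\varepsilon}, \frac{y}{\varepsilon}\Big)
+ \varepsilon \Big[ \Psi_e^0\Big(t, y, \frac{x - \chi_w(y)}{\varepsilon}, \frac{y}{\varepsilon}\Big) + \Psi_w^1\Big(t, y, \frac{x - \chi_w(y)}{\varepsilon}, \frac{y}{\varepsilon}\Big) \Big].
\]
The last two quantities being of size $\sqrt{\varepsilon}$ in the $H^1$ norm, Theorem 2.2 will be obtained by a classical energy estimate on the difference $\Psi^\varepsilon - \Psi^\varepsilon_{app}$.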
Remark. Theorem 2.2 is the generalization to rough boundaries of the convergence theorem obtained in [10], for a domain without roughness. The authors of [10] proved a similar asymptotic result, with an asymptotic behaviour
\[
\Psi^\varepsilon_{app}(t, \mathbf{x}) = \Psi_{int}(t, \mathbf{x}) + \Psi_m\Big(t, y, \frac{x - \chi_w(y)}{\varepsilon}\Big).
\]
The stream function $\Psi_{int}$ was also given by (12), whereas the Munk layer term $\Psi_m = \Psi_m(t, y, X)$ was the solution of
\[
\begin{cases}
\partial_X \Psi_m(t, y, \cdot) - (1 + \chi_w'(y)^2)^2\, \partial_X^4 \Psi_m(t, y, \cdot) = 0,\\[2pt]
\Psi_m(t, y, \cdot)|_{X=0} = -\phi(t, y), \qquad \partial_X \Psi_m(t, y, \cdot)|_{X=0} = 0,
\end{cases}
\tag{17}
\]
with φ given by (13). Note that (17) is a simple linear differential equation in the variable X. It is strongly different from the rough case, in which the boundary layer system (14)–(15) is genuinely two-dimensional, and nonlinear.

Remark. A smallness assumption on the function φ of (13) is crucial, as can be seen from both theorems. It is first needed to prove the existence of the boundary layer term $\Psi_w$. We then use it to prove the convergence theorem. Note that such a condition was already present in the convergence theorem of [10]. As explained in that paper, it is deeply linked to the stability of the boundary layer: it can be read as a smallness condition on an appropriate Reynolds number.

Remark. We recover the fact that the velocity is very large ($O(\varepsilon^{-1/2})$ in $L^2$ norm), due to the western boundary layer: it is the so-called intensification of western currents [3].

2.2. Rough topography.

2.2.1. The topography $\eta_B^\zeta$. Throughout Sect. 4, we will study Eqs. (2) in $\Omega = \mathbb{T}^2 = (\mathbb{R}/\mathbb{Z})^2$. Up to slight modifications, the case $\Omega = \mathbb{R}^2$ could be handled similarly. We still assume that $\mathrm{curl}\,\tau \in C^\infty([0, T] \times \Omega)$, with $\mathrm{curl}\,\tau|_{t=0} = 0$. We assume that the bottom topography is given by
\[
\eta_B^\zeta(\mathbf{x}) = \eta_B\Big(\mathbf{x}, \frac{\mathbf{x}}{\zeta}\Big),
\]
with $\eta_B = \eta_B(\mathbf{x}, \mathbf{X})$ smooth, 1-periodic in its variables. We assume ζ such that 1/ζ is an integer.

2.2.2. Results. We want to consider System (2) in the limit of small ζ. There again, we need auxiliary systems. Let T > 0, and $U = U(t, \mathbf{x}) \in (L^\infty((0, T) \times \Omega))^2$. We first consider the following equations, of unknown $\tilde\psi = \tilde\psi(t, \mathbf{x}, \mathbf{X})$ for t > 0 and $(\mathbf{x}, \mathbf{X}) \in \mathbb{T}^2 \times \mathbb{T}^2$:
\[
\begin{cases}
\partial_t \Delta_X \tilde\psi + \tilde u \cdot \nabla_X \Delta_X \tilde\psi + U \cdot \nabla_X \Delta_X \tilde\psi + \tilde u \cdot \nabla_X \eta_B + r \Delta_X \tilde\psi - \nu (\Delta_X)^2 \tilde\psi + U \cdot \nabla_X \eta_B = 0,\\[2pt]
\tilde u = \nabla_X^\perp \tilde\psi = (-\partial_Y \tilde\psi,\ \partial_X \tilde\psi), \qquad \tilde\psi|_{t=0} = 0.
\end{cases}
\tag{18}
\]
Theorem 2.3. There exists a unique weak solution $\tilde\psi$ of (18),
\[
\tilde\psi \in L^\infty\big((0, T) \times \Omega;\ H^1(Q)\big) \cap L^2\big(0, T;\ L^\infty(\Omega; H^2(Q))\big), \qquad \int_{\mathbb{T}^2} \tilde\psi(t, \cdot) = 0.
\]
We now introduce what will be shown to be the limit system of (2). First, on the basis of Theorem 2.3, we define $\mathcal{F} : (L^\infty((0, T) \times \Omega))^2 \to (L^\infty((0, T) \times \Omega))^2$ by: for all $\mathbf{x} \in \mathbb{T}^2$,
\[
\mathcal{F}(U)(t, \mathbf{x}) = -\int_{\mathbb{T}^2_X} \eta_B(\mathbf{x}, \mathbf{X})\, \nabla_X \tilde\psi(t, \mathbf{x}, \mathbf{X})\, d\mathbf{X},
\]
where $\tilde\psi$ is the solution of (18), given by Theorem 2.3. We will show that it is dissipative, in the sense of

Proposition 2.4. For all U in $(L^\infty((0, T) \times \Omega))^2$, for all $t \in (0, T)$, for all $\mathbf{x}$,
\[
\int_0^t \mathcal{F}(U)(s, \mathbf{x}) \cdot U(s, \mathbf{x})\, ds \ge 0.
\]
The limit system is: for t > 0, $\mathbf{x} \in \mathbb{T}^2_x$,
\[
\begin{cases}
\partial_t \Delta_x \psi + \beta \partial_x \psi + u \cdot \nabla_x \bar\eta_B + r \Delta_x \psi + \mathrm{curl}_x\, \mathcal{F}(u) = \beta\, \mathrm{curl}\,\tau,\\[2pt]
u = \nabla_x^\perp \psi, \qquad \psi|_{t=0} = 0,
\end{cases}
\tag{19}
\]
where the bar stands for the average in the rough variable $\mathbf{X}$. We show in Sect. 4:

Theorem 2.5. For all m > 1, there exists $T_m > 0$, such that System (19) has a unique solution $\psi \in C([0, T_m]; H^{m+1}(\mathbb{T}^2))$, with, for all t ≥ 0, $\int_{\mathbb{T}^2} \psi(t, \mathbf{x})\, d\mathbf{x} = 0$.

Remark. The dissipative term $\mathrm{curl}_x\, \mathcal{F}(u^0)$ in the limit system is of course due to the roughness: it is the mathematical translation of energy loss due to the friction at the bottom.

Remark. System (19) is close to the one derived in [12], for rotating fluids in rough domains. Namely, the limit equation was of type
\[
\partial_t \Delta_x \psi + u \cdot \nabla_x \Delta_x \psi + \mathrm{curl}_x\, F(u) = 0
\]
for a function F defined in a neighborhood of 0 in $\mathbb{R}^2$, with values in $\mathbb{R}^2$. In particular, in both cases, we do not manage to control $\mathrm{curl}_x\, \mathcal{F}(u)$ in $L^p$ spaces, so that we cannot conclude to the existence of global in time solutions, for instance through a Yudovitch scheme (cf. [14]).
However, we wish to point out a strong difference between the two cases: the function $\mathcal{F}$ is here non-local in time, whereas it was local for the rotating fluids system. In other words, for our model of oceanic circulation, the dissipative mechanisms depend on the whole history of the large-scale flow. This appears clearly when one considers the linear case, i.e. omitting the nonlinear term in (2). Following [17], Eq. (19) becomes
\[
\partial_t \Delta_x \psi + \beta \partial_x \psi + u \cdot \nabla_x \bar\eta_B + r \Delta_x \psi + \mathrm{curl}_x \int_0^t R(t - \tau, \cdot)\, u(\tau, \cdot)\, d\tau = \beta\, \mathrm{curl}\,\tau, \tag{20}
\]
for some kernel R. Thus, ψ obeys an integro-differential rather than a differential equation. For more on this linear case (such as numerical computations), see [17]. We now state the convergence result on solutions $\psi^\zeta$ of System (2):

Theorem 2.6. Let m > 1. Let $u = \nabla_x^\perp \psi$, where $\psi \in C([0, T_m]; H^{m+1}(\mathbb{T}^2_x))$ is the solution of (19) given by Theorem 2.5. Let $u^\zeta = \nabla_x^\perp \psi^\zeta$, where $\psi^\zeta$ is the solution of (2). Then
\[
u^\zeta \to u \quad \text{in } \big(L^\infty((0, T_m); L^2(\Omega))\big)^2, \quad \text{as } \zeta \to 0.
\]
3. Rough Coasts

This section is devoted to the proof of Theorems 2.1 and 2.2. In Subsect. 3.1, we formally derive an asymptotic expansion of the solution $\Psi^\varepsilon$ of (3). We recover formally the functions $\Psi_{int}$ and $\Psi_w$ defined in the previous section. In Subsect. 3.2, we prove Theorem 2.1. In Subsect. 3.3, we prove Theorem 2.2.

3.1. Formal asymptotic expansion. In this part, we build formally an approximate solution $\Psi^\varepsilon_{app}$ of System (3).

3.1.1. Ansatz. We look for an approximation of the following type
\[
\Psi^\varepsilon_{app}(t, x, y) = \sum_{i=0}^{n} \varepsilon^i \Big[ \Psi^i(t, x, y) + \Psi_w^i\Big(t, y, \frac{x - \chi_w(y)}{\varepsilon}, \frac{y}{\varepsilon}\Big) + \Psi_e^i\Big(t, y, \frac{\chi_e(y) - x}{\varepsilon}, \frac{y}{\varepsilon}\Big) \Big], \tag{21}
\]
where
• $\Psi^i(t, x, y)$ are interior terms, defined for t > 0, $(x, y) \in \Omega^\varepsilon$, with support in the interior domain $\Omega$.
• $\Psi_w^i = \Psi_w^i(t, y, X, Y)$ and $\Psi_e^i = \Psi_e^i(t, y, X, Y)$ are respectively the western and eastern boundary layer profiles. They are defined for t > 0, and $\mathbf{X}$ respectively in $\omega_w$ and $\omega_e$, with 1-periodicity in the Y variable.
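Dirichlet conditions. We expect the boundary layer terms to have no role far from the boundaries, which leads to the condition
\[
\Psi_e^i,\ \Psi_w^i \xrightarrow[X \to \infty]{} 0. \tag{22}
\]
We then want our approximate solution to satisfy $\Psi^\varepsilon_{app} = 0$ at $\partial\Omega^\varepsilon$. It yields the following Dirichlet conditions: for all (t, y),
\[
\Psi_w^i(t, y, \cdot)|_{X = -\gamma_w(Y)} = 0, \qquad \Psi_e^i(t, y, \cdot)|_{X = \gamma_e(Y)} = 0. \tag{23}
\]
Neumann conditions. We then want that $\partial_n \Psi^\varepsilon_{app} = 0$ at $\partial\Omega^\varepsilon$. It yields easily Neumann conditions on the boundary layer profiles, of the type: for all (t, y),
\[
\frac{\partial \Psi_w^i}{\partial n_w}\Big|_{X = -\gamma_w(Y)} = \varphi_w^i(t, y, \mathbf{X})|_{X = -\gamma_w(Y)}, \qquad
\frac{\partial \Psi_e^i}{\partial n_e}\Big|_{X = -\gamma_e(Y)} = \varphi_e^i(t, y, \mathbf{X})|_{X = -\gamma_e(Y)}, \tag{24}
\]
where the functions $\varphi_w^i$ and $\varphi_e^i$ involve the $\Psi_w^k$'s and $\Psi_e^k$'s for k ≤ i − 1.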
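Interface conditions. It remains to determine conditions at the interfaces. As the interior terms are zero outside $\Omega$, they create discontinuities at the interfaces $\Sigma_w$ and $\Sigma_e$. Boundary layer terms are then added to cancel such discontinuities. This procedure is classical in roughness effect problems, see for instance [13, 12]. More precisely, to prove the convergence result, we will carry out energy estimates on the difference of the stream functions $\phi^\varepsilon = \Psi^\varepsilon - \Psi^\varepsilon_{app}$. The roughness corrector $\Psi^\varepsilon_{ext}$ is made to drop the surface integral that appears when evaluating the viscous term
\[
\int_{\Omega^\varepsilon} (\Delta^2 \phi^\varepsilon)\, \phi^\varepsilon
= \int_{\Sigma_e \cup \Sigma_w} \partial_n(\Delta \Psi^\varepsilon_{app})\, \Psi^\varepsilon_{app}
- \int_{\Sigma_e \cup \Sigma_w} \nabla \partial_n \Psi^\varepsilon_{app} \cdot \nabla \Psi^\varepsilon_{app}
+ \int_{\Omega^\varepsilon} |D^2 \phi^\varepsilon|^2.
\]
Sufficient conditions for this integral to vanish are
\[
\big[\partial_x^k \Psi^\varepsilon_{app}\big]\big|_{\Sigma_e \cup \Sigma_w} = 0, \qquad k = 0, \dots, 3. \tag{25}
\]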
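Straightforward computations give the following jump conditions on the boundary layer terms:
\[
\big[\psi_w^i(t, y, \cdot)\big]\big|_{\sigma_w} = -\Psi^i(t, y, \cdot)|_{x = \chi_w(y)}, \qquad
\big[\psi_e^i(t, y, \cdot)\big]\big|_{\sigma_e} = -\Psi^i(t, y, \cdot)|_{x = \chi_e(y)}, \tag{26}
\]
and
\[
\big[\partial_X^k \psi_w^i(t, y, \cdot)\big]\big|_{\sigma_w} = f_w^{i,k}(t, y), \qquad
\big[\partial_X^k \psi_e^i(t, y, \cdot)\big]\big|_{\sigma_e} = f_e^{i,k}(t, y), \qquad k = 1, \dots, 3, \tag{27}
\]
where the $f_w^{i,k}$ and $f_e^{i,k}$, k = 1, …, 3, depend on the $\Psi^j$, $\Psi_w^j$ and $\Psi_e^j$, j ≤ i − 1.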
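3.1.2. Derivation of the first profiles. We now derive the equations satisfied by the first profiles of the expansion (21). In the interior, we obtain at the leading order $\varepsilon^{-3}$ the so-called Sverdrup relation
\[
\partial_x \Psi^0 = \mathrm{curl}\,\tau. \tag{28}
\]
We choose $\Psi^0$ to cancel at $\Sigma_e$, so that
\[
\Psi^0(t, x, y) = \int_{\chi_e(y)}^{x} \mathrm{curl}\,\tau(t, x', y)\, dx' \ \text{ in } \Omega, \qquad \Psi^0 = 0 \ \text{ in } \Omega^\varepsilon - \Omega. \tag{29}
\]
In the western layer, we have $\nabla \sim \varepsilon^{-1} \nabla_w(y)$. We find, for $\mathbf{X}$ in $\omega_w^+ \cup \omega_w^-$:
\[
Q_w(y)\big(\Psi_w^0(t, y, \cdot), \Psi_w^0(t, y, \cdot)\big) + \partial_X \Psi_w^0(t, y, \cdot) - (\Delta_w(y))^2\, \Psi_w^0(t, y, \cdot) = 0. \tag{30}
\]
Remark. The quadratic operator $Q_w$ is deduced from the quadratic term $(u \cdot \nabla)\Delta\Psi$ of system (3), replacing $\nabla$ by $\nabla_w = (\partial_X, \partial_Y - \chi_w' \partial_X)^t$. Indeed, we recall that
\[
(u \cdot \nabla)\Delta\Psi = (u \cdot \nabla)\, \mathrm{curl}\, u = \mathrm{curl}\big((u \cdot \nabla) u\big) = \nabla^\perp \cdot \big( (\nabla^\perp \Psi \cdot \nabla)\, \nabla^\perp \Psi \big).
\]
We complete these equations by the interface and boundary conditions mentioned above, namely
\[
\begin{cases}
\big[\Psi_w^0(t, y, \cdot)\big]\big|_{\sigma_w} = -\phi(t, y),\\[2pt]
\big[\partial_X^k \Psi_w^0(t, y, \cdot)\big]\big|_{\sigma_w} = 0, \qquad k = 1, \dots, 3,\\[2pt]
\Psi_w^0(t, y, \cdot) \ \text{1-periodic in } Y, \qquad
\Psi_w^0(t, y, \cdot)|_{X = -\gamma_w(Y)} = \dfrac{\partial \Psi_w^0(t, y, \cdot)}{\partial n_w}\Big|_{X = -\gamma_w(Y)} = 0.
\end{cases}
\tag{31}
\]
In the eastern layer, we obtain homogeneous equations: for $\mathbf{X} \in \omega_e^+ \cup \omega_e^-$,
\[
Q_e(y)\big(\Psi_e^0(t, y, \cdot), \Psi_e^0(t, y, \cdot)\big) - \partial_X \Psi_e^0(t, y, \cdot) - (\Delta_e(y))^2\, \Psi_e^0(t, y, \cdot) = 0. \tag{32}
\]
The interface and boundary conditions are also homogeneous, thanks to our definition of $\Psi^0$:
\[
\big[\partial_X^k \Psi_e^0(t, y, \cdot)\big]\big|_{\sigma_e} = 0, \qquad k = 0, \dots, 3, \tag{33}
\]
\[
\Psi_e^0(t, y, \cdot) \ \text{1-periodic in } Y, \qquad
\Psi_e^0(t, y, \cdot)|_{X = -\gamma_e(Y)} = \frac{\partial \Psi_e^0(t, y, \cdot)}{\partial n_e}\Big|_{X = -\gamma_e(Y)} = 0. \tag{34}
\]
Of course, this system has the solution $\Psi_e^0 \equiv 0$.

Remark. Our choice for the stream function $\Psi^0$ is the same as in [10]. It is due to the properties of the eastern coast, which cannot bear a large boundary layer. Indeed, as explained in [10], the equation satisfied by the eastern profile $\Psi_e^0$ in the non-rough case is
\[
-\partial_X \Psi_e^0 - (1 + (\chi_e')^2)^2\, \partial_X^4 \Psi_e^0 = 0. \tag{35}
\]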
The characteristic equation of this ODE has only one root with negative real part, so that one cannot impose simultaneously conditions on $\Psi_e^0$, $\partial_X \Psi_e^0$ at the boundary and a condition at infinity. The same difficulty appears in the rough case, as Eqs. (32) “contain” the differential equation (35). Note that such a difficulty is not present in the western layer, as the underlying ODE
\[
\partial_X \Psi_w^0 - (1 + (\chi_w')^2)^2\, \partial_X^4 \Psi_w^0 = 0
\]
has two characteristic roots with negative real parts. This ends the derivation of the first corrector terms.

If we gather differently the leading order terms of the approximation, we obtain formally that in $\Omega^\varepsilon$,
\[
\Psi^\varepsilon_{app}(t, \mathbf{x}) \sim \Psi_{int}(t, x, y) + \Psi_w\Big(t, y, \frac{x - \chi_w(y)}{\varepsilon}, \frac{y}{\varepsilon}\Big),
\]
where $\Psi_w = \Psi_w(t, y, \mathbf{X})$ is defined by
\[
\Psi_w(t, y, \mathbf{X}) = \Psi_w^0(t, y, \mathbf{X}), \qquad \mathbf{X} \in \omega_w^+, \tag{36}
\]
\[
\Psi_w(t, y, \mathbf{X}) = \Psi_w^0(t, y, \mathbf{X}) - \Psi_{int}(t, x, y), \qquad \mathbf{X} \in \omega_w^-. \tag{37}
\]
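The root count invoked in the remark on the eastern and western ODEs can be checked directly. Substituting $e^{rX}$ into $s\,\partial_X\Psi - a\,\partial_X^4\Psi = 0$, with $s = +1$ in the west, $s = -1$ in the east and $a = (1 + (\chi')^2)^2 > 0$, gives $s\,r - a\,r^4 = 0$, whose nonzero roots are the cube roots of $s/a$. The Python sketch below is only an illustration; the function names and the sample value of `a` are assumptions, not notation from the paper.

```python
import cmath
import math


def characteristic_roots(a, sign):
    """Nonzero roots of sign*r - a*r**4 = 0, i.e. the three cube roots of sign/a."""
    modulus = (1.0 / a) ** (1.0 / 3.0)
    base_angle = 0.0 if sign > 0 else math.pi  # argument of sign/a
    return [
        modulus * cmath.exp(1j * (base_angle + 2.0 * math.pi * k) / 3.0)
        for k in range(3)
    ]


def count_decaying(a, sign):
    """Number of roots with negative real part (modes decaying as X -> +infinity)."""
    return sum(1 for r in characteristic_roots(a, sign) if r.real < -1e-12)
```

For any `a > 0`, `count_decaying(a, +1)` gives two decaying modes (western layer) and `count_decaying(a, -1)` gives one (eastern layer), matching the discussion: two conditions can be imposed at the western boundary together with decay at infinity, but not at the eastern one.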
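Using Eqs. (30)–(31), we get (still at a formal level) the following equations on $\Psi^{t,y} = \Psi_w(t, y, \cdot)$: for $\mathbf{X}$ in $\omega_w^+ \cup \omega_w^-$,
\[
Q_w(y)(\Psi^{t,y}, \Psi^{t,y}) + \partial_X \Psi^{t,y} - (\Delta_w(y))^2\, \Psi^{t,y} = 0, \tag{38}
\]
with
\[
\big[\partial_X^k \Psi^{t,y}\big]\big|_{\sigma_w} = 0, \quad k = 0, \dots, 3, \qquad
\Psi^{t,y}|_{X = -\gamma_w(Y)} = -\phi(t, y), \qquad
\frac{\partial \Psi^{t,y}}{\partial n_w}\Big|_{X = -\gamma_w(Y)} = 0. \tag{39}
\]
As the jump conditions are homogeneous, Eq. (38) is in fact satisfied on all $\omega_w$. We thus recover formally the boundary layer system and the asymptotic result of Theorems 2.1 and 2.2.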
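3.2. The boundary layer system. The aim of this section is to prove Theorem 2.1. To do so, we use an equivalent formulation of (14), (15), more appropriate for a variational treatment. Let us define $\psi_m = \psi_m(t, y, \mathbf{X})$ by: for $\mathbf{X} = (X, Y)$ in $\omega_w^-$, $\psi_m(t, y, \mathbf{X}) = 0$, and for $\mathbf{X} = (X, Y)$ in $\omega_w^+$,
\[
\psi_m(t, y, \mathbf{X}) = -\phi(t, y)\, \exp\Big(-\frac{D_w(y) X}{2}\Big) \Big[ \cos\Big(\frac{\sqrt{3}}{2} D_w(y) X\Big) + \frac{1}{\sqrt{3}} \sin\Big(\frac{\sqrt{3}}{2} D_w(y) X\Big) \Big], \tag{40}
\]
where φ is given by (13), and $D_w(y) = (1 + (\chi_w'(y))^2)^{-2/3}$. Note that for all (t, y), $\psi_m^{t,y}(\mathbf{X}) = \psi_m(t, y, \mathbf{X})$ satisfies (14) in $\omega_w^+ \cup \omega_w^-$, with the following jump conditions:
\[
\big[\psi_m^{t,y}\big]\big|_{\sigma_w} = -\phi(t, y), \qquad \big[\partial_X \psi_m^{t,y}\big]\big|_{\sigma_w} = 0.
\]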
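A quick numerical sanity check of the Munk-type profile can be sketched in Python. The sketch assumes the decaying form $-\phi\, e^{-D_w X/2}\big(\cos(\tfrac{\sqrt{3}}{2} D_w X) + \tfrac{1}{\sqrt{3}} \sin(\tfrac{\sqrt{3}}{2} D_w X)\big)$ with $D_w = (1 + (\chi_w')^2)^{-2/3}$, and tests it against the linear equation $\partial_X \psi - (1 + (\chi_w')^2)^2 \partial_X^4 \psi = 0$ by finite differences. The function names, the sample values and the step `h` are illustrative assumptions; the formula is evaluated as a smooth function of X for differencing purposes only.

```python
import math


def munk_profile(X, phi, chi_prime):
    """Assumed decaying Munk profile for X >= 0, with D = (1 + chi'^2)^(-2/3)."""
    D = (1.0 + chi_prime ** 2) ** (-2.0 / 3.0)
    s = 0.5 * math.sqrt(3.0) * D * X
    return -phi * math.exp(-0.5 * D * X) * (math.cos(s) + math.sin(s) / math.sqrt(3.0))


def munk_residual(X, phi, chi_prime, h=1e-2):
    """Finite-difference residual of psi' - (1 + chi'^2)^2 * psi'''' at the point X."""
    f = lambda x: munk_profile(x, phi, chi_prime)
    d1 = (f(X + h) - f(X - h)) / (2.0 * h)
    d4 = (f(X + 2 * h) - 4 * f(X + h) + 6 * f(X) - 4 * f(X - h) + f(X - 2 * h)) / h ** 4
    return d1 - (1.0 + chi_prime ** 2) ** 2 * d4
```

The residual should be small (of the order of the discretization error) at any X > 0, the value at X = 0 should equal −φ, and the slope at X = 0 should vanish, in line with the jump conditions stated for $\psi_m$.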
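If we set
\[
\psi_w^0(t, y, \mathbf{X}) = \Psi_w(t, y, \mathbf{X}) - \psi_m(t, y, \mathbf{X}), \qquad \mathbf{X} \in \omega_w^+, \tag{41}
\]
\[
\psi_w^0(t, y, \mathbf{X}) = \Psi_w(t, y, \mathbf{X}) + \Psi_{int}(t, x, y), \qquad \mathbf{X} \in \omega_w^-, \tag{42}
\]
and denote $\psi^{t,y} = \psi_w^0(t, y, \cdot)$, (14)–(15) are equivalent to the following equation: for $\mathbf{X}$ in $\omega_w^+ \cup \omega_w^-$,
\[
Q_w(y)(\psi^{t,y}, \psi^{t,y}) + Q_w(y)(\psi_m^{t,y}, \psi^{t,y}) + Q_w(y)(\psi^{t,y}, \psi_m^{t,y}) + \partial_X \psi^{t,y} - (\Delta_w(y))^2\, \psi^{t,y} = 0, \tag{43}
\]
\[
\begin{cases}
\big[\partial_X^k \psi^{t,y}\big]\big|_{\sigma_w} = 0, \qquad k = 0, 1,\\[2pt]
\big[\partial_X^k \psi^{t,y}\big]\big|_{\sigma_w} = -\big[\partial_X^k \psi_m^{t,y}\big]\big|_{\sigma_w}, \qquad k = 2, 3,\\[2pt]
\psi^{t,y} \ \text{1-periodic in } Y, \qquad
\psi^{t,y}|_{X = -\gamma_w(Y)} = \dfrac{\partial \psi^{t,y}}{\partial n_w}\Big|_{X = -\gamma_w(Y)} = 0.
\end{cases}
\tag{44}
\]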
We proceed in the following way:
i) Under a smallness assumption on φ, we prove for all t, y the existence of a solution $\psi^{t,y}$ of (43)–(44) (in a sense to be made precise).
ii) We show that $\psi^{t,y}$ belongs to $H^m(\omega_R)$, for R > 1, and satisfies estimates (16).
iii) Under a stronger smallness assumption on φ, we prove that $\psi^{t,y}$ is the unique solution of (43), (44).

3.2.1. Variational formulation. In this part, we show for all (t, y) the existence of a solution $\psi^{t,y}$ of (43)–(44). More precisely, we associate a variational formulation to these equations, and show the existence of a solution for this formulation. For all y, we define a trilinear form $b_y(\psi, \tilde\psi, \varphi)$ on $D^2_0(\omega_w) \times D^2_0(\omega_w) \times V_0$ by
\[
b_y(\psi, \tilde\psi, \varphi) = -\int_{\omega_w} \big(\nabla_w^\perp \psi \cdot \nabla_w\big)\, \nabla_w^\perp \tilde\psi \cdot \nabla_w^\perp \varphi. \tag{45}
\]
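One checks easily that $b_y$ is well-defined for all y, satisfies
\[
b_y(\psi, \tilde\psi, \varphi) = \int_{\omega_w} Q_w(y)(\psi, \tilde\psi)\, \varphi
\]
for all $\psi, \tilde\psi, \varphi$ regular enough, and that
\[
b_y(\psi, \varphi, \varphi) = 0, \qquad \forall\, \varphi \in V_0.
\]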
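The variational formulation associated to (43)–(44) is: for all t, y, find $\psi^{t,y} \in D^2_0(\omega_w)$ such that, for all φ in $V_0$,
\[
\begin{aligned}
& b_y(\psi^{t,y}, \psi^{t,y}, \varphi) + b_y(\psi^{t,y}, \psi_m^{t,y}, \varphi) + b_y(\psi_m^{t,y}, \psi^{t,y}, \varphi)
- \int_{\omega_w} \langle D_w^{2,y} \psi^{t,y}, D_w^{2,y} \varphi \rangle + \int_{\omega_w} \partial_X \psi^{t,y}\, \varphi \\
&\qquad = -(1 + (\chi_w'(y))^2) \Big( \int_{\sigma_w} \partial_X^2 \psi_m^{t,y}|_{\sigma_w}\, \varphi + \int_{\sigma_w} \partial_X^3 \psi_m^{t,y}|_{\sigma_w}\, \partial_X \varphi \Big),
\end{aligned}
\tag{46}
\]
where
\[
D_w^{2,y} \psi =
\begin{pmatrix}
(1 + (\chi_w'(y))^2)\, \partial_X^2 \psi & -\chi_w'(y)\, \partial_{XY}^2 \psi \\
-\chi_w'(y)\, \partial_{XY}^2 \psi & \partial_Y^2 \psi
\end{pmatrix},
\]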
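and $\langle\ ,\ \rangle$ stands for the Euclidean matrix scalar product. We now prove

Theorem 3.1. There exists $\phi_\infty$, such that if $\|\phi\|_\infty < \phi_\infty$, there exists for all $t \in [0, T]$, for all $y \in [y_{min}, y_{max}]$ a function $\psi^{t,y}$ in $D^2_0(\omega_w)$ satisfying (46) for all φ in $V_0$.

Proof of Theorem 3.1. The proof relies on a Galerkin scheme. Let $(\varphi_n)_n$ in $V_0$ be a basis of $D^2_0(\omega_w)$. We consider the sequence of approximate problems: for all n, find $\psi_n$ of the form
\[
\psi_n(t, y, \mathbf{X}) = \sum_{k=0}^{n} \alpha_k(t, y)\, \varphi_k(\mathbf{X}),
\]
such that for all $k \in \{0, \dots, n\}$, for all (t, y), $\psi_n^{t,y} = \psi_n(t, y, \cdot)$ satisfies
\[
\begin{aligned}
& b_y(\psi_n^{t,y}, \psi_n^{t,y}, \varphi_k) + b_y(\psi_m^{t,y}, \psi_n^{t,y}, \varphi_k) + b_y(\psi_n^{t,y}, \psi_m^{t,y}, \varphi_k)
- \int_{\omega_w} \langle D_w^{2,y} \psi_n^{t,y}, D_w^{2,y} \varphi_k \rangle + \int_{\omega_w} \partial_X \psi_n^{t,y}\, \varphi_k \\
&\qquad = -(1 + (\chi_w'(y))^2) \Big( \int_{\sigma_w} \partial_X^2 \psi_m^{t,y}|_{\sigma_w}\, \varphi_k + \int_{\sigma_w} \partial_X^3 \psi_m^{t,y}|_{\sigma_w}\, \partial_X \varphi_k \Big).
\end{aligned}
\tag{47}
\]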
• A priori estimates. We start with a priori estimates. As t and y are simple parameters, we drop them temporarily in order to lighten notations: one can easily verify that all the constants $C_i$ arising in the inequalities below can be chosen independent of t and y (and n). Multiplying the last equation by $\alpha_k$ and summing over k leads to
\[
\int_{\omega_w} \langle D_w^2 \psi_n, D_w^2 \psi_n \rangle = b(\psi_n, \psi_m, \psi_n) + \int_{\omega_w} (\partial_X \psi_n)\, \psi_n
+ (1 + (\chi_w')^2) \Big( \int_{\sigma_w} \partial_X^2 \psi_m|_{\sigma_w}\, \psi_n + \int_{\sigma_w} \partial_X^3 \psi_m|_{\sigma_w}\, \partial_X \psi_n \Big). \tag{48}
\]
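We obtain the following energy estimates. We have
\[
\int_{\omega_w} (\partial_X \psi_n)\, \psi_n = \int_{\omega_w} \partial_X \Big( \frac{(\psi_n)^2}{2} \Big) = 0.
\]
The integrals over $\sigma_w$ are bounded in the following way:
\[
\Big| (1 + (\chi_w')^2) \Big( \int_{\sigma_w} \partial_X^2 \psi_m|_{\sigma_w}\, \psi_n + \int_{\sigma_w} \partial_X^3 \psi_m|_{\sigma_w}\, \partial_X \psi_n \Big) \Big|
\le C_1 |\phi|\, \|\psi_n\|_{H^2(\omega_w^-)} \le C_2 |\phi|\, \|D^2 \psi_n\|_{L^2(\omega_w)}, \tag{49}
\]
where we have used Poincaré's inequality on $\omega_w^-$.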
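It remains to estimate the trilinear term, i.e. to handle terms of the type
\[
I = \int_0^1 dY \int_0^{+\infty} dX\ \partial_i \partial_j \psi_m\, (\partial_k \psi_n)\, (\partial_l \psi_n), \qquad i, j, k, l \in \{X, Y\}.
\]
We write
\[
I = \int_0^1 dY \int_0^{+\infty} dX\ \partial_i \partial_j \psi_m\, \Big( \partial_k \psi_n(0, Y) + \int_0^X dX'\, \partial_X \partial_k \psi_n \Big) \Big( \partial_l \psi_n(0, Y) + \int_0^X dX'\, \partial_X \partial_l \psi_n \Big),
\]
so that $I = I_1 + I_2 + I_3 + I_4$, with
\[
\begin{aligned}
I_1 &= \int_0^1 dY\ \partial_k \psi_n(0, Y)\, \partial_l \psi_n(0, Y) \int_0^{+\infty} dX\ \partial_i \partial_j \psi_m,\\
I_2 &= \int_0^1 dY\ \partial_k \psi_n(0, Y) \int_0^{+\infty} dX\ (\partial_i \partial_j \psi_m) \int_0^X dX'\, \partial_X \partial_l \psi_n,\\
I_3 &= \int_0^1 dY\ \partial_l \psi_n(0, Y) \int_0^{+\infty} dX\ (\partial_i \partial_j \psi_m) \int_0^X dX'\, \partial_X \partial_k \psi_n,\\
I_4 &= \int_0^1 dY \int_0^{+\infty} dX\ (\partial_i \partial_j \psi_m) \int_0^X dX'\, \partial_X \partial_k \psi_n \int_0^X dX'\, \partial_X \partial_l \psi_n.
\end{aligned}
\]
Through repeated use of the Cauchy-Schwarz inequality, we get
\[
\begin{aligned}
|I_1| &\le \sup_Y \Big( \int_0^{+\infty} dX\ |\partial_i \partial_j \psi_m| \Big)\, \|\psi_n\|^2_{H^{3/2}(\sigma_w)} \le C_1 |\phi|\, \|D^2 \psi_n\|^2_{L^2(\omega_w)},\\
|I_2| + |I_3| &\le \sup_Y \Big( \int_0^{+\infty} dX\ |\partial_i \partial_j \psi_m|\, \sqrt{X} \Big)\, \|\psi_n\|_{H^{3/2}(\sigma_w)}\, \|D^2 \psi_n\|_{L^2(\omega_w)} \le C_2 |\phi|\, \|D^2 \psi_n\|^2_{L^2(\omega_w)},\\
|I_4| &\le \sup_Y \Big( \int_0^{+\infty} dX\ |\partial_i \partial_j \psi_m|\, X \Big)\, \|D^2 \psi_n\|^2_{L^2(\omega_w)} \le C_3 |\phi|\, \|D^2 \psi_n\|^2_{L^2(\omega_w)}.
\end{aligned}
\]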
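Remark. With similar computations, one has more generally that: there exist C, σ > 0, such that for all $\psi \in D^2_0(\omega_w)$ and for all R ≥ 0,
\[
\Big| \int_{\omega_R} \big(\nabla_w^\perp \psi \cdot \nabla_w\big)\, \nabla_w^\perp \psi_m \cdot \nabla_w^\perp \psi \Big| \le C\, |\phi|\, \exp(-\sigma R)\, \|D^2 \psi\|^2_{L^2(\omega_w)}. \tag{50}
\]
If we gather all these bounds, we end with the a priori estimate
\[
\|D^2 \psi_n\|^2_{L^2(\omega_w)} \le C_5 \int_{\omega_w} \langle D_w^2 \psi_n, D_w^2 \psi_n \rangle \le C_6\, |\phi|\, \Big( \|D^2 \psi_n\|_{L^2(\omega_w)} + \|D^2 \psi_n\|^2_{L^2(\omega_w)} \Big).
\]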
Recovering the dependence on t and y, we get
\[
\|D^2 \psi_n^{t,y}\|^2_{L^2(\omega_w)} \le C_6\, |\phi(t, y)|\, \Big( \|D^2 \psi_n^{t,y}\|_{L^2(\omega_w)} + \|D^2 \psi_n^{t,y}\|^2_{L^2(\omega_w)} \Big). \tag{51}
\]
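In particular, if $\|\phi\|_\infty < 1/(2 C_6)$, the quadratic term on the right-hand side of (51) can be absorbed in the left-hand side, and we get
\[
\|D^2 \psi_n^{t,y}\|_{L^2(\omega_w)} \le 2\, C_6\, |\phi(t, y)|. \tag{52}
\]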
• Existence of ψn Let α = (α1 , . . . , αn )t , where ψn =
αk ϕk . Equations (47) read
F y, φ(t, y), α(t, y) = 0,
with F of type
F (y, φ, a) = Q1 (y, a, a) + Q2 (y, a, φ) + Q2 (y, φ, a) + A(y) a + L(y)φ. The Qi ’s come from the trilinear term by : they are smooth and bilinear in their last two variables. The matrix A(y) comes from the viscous term: it is smooth in y, and invertible, as the ϕk ’s are independent. The matrix L(y) is smooth and comes from the surface integral. Note that up to extending the functions χw and φ, one can assume that F : R × R × Rn → Rn . Moreover, it is straightforward that for all y in R, F (y, 0, 0) = 0,
∂F (y, 0, 0) invertible. ∂a
By the implicit function theorem, for all y, there exist neighborhoods $Y_y$ of y, $P_y$ of $\phi = 0$, and $A_y$ of $a = 0$, and a smooth function $a$ from $Y_y \times P_y$ to $A_y$ such that
\[ \forall\, (\tilde y, \phi, a) \in Y_y \times P_y \times A_y, \qquad F(\tilde y, \phi, a) = 0 \iff a = a(\tilde y, \phi). \]
As $[y_{min}, y_{max}]$ is compact, we can find a covering of $[y_{min}, y_{max}]$ by a finite number of neighborhoods $Y_{y_1}, \dots, Y_{y_n}$. Let
\[ P = \cap_{i=1}^n P_{y_i}, \qquad A = \cap_{i=1}^n A_{y_i}, \]
and let $a_i : Y_{y_i} \times P_{y_i} \to A_{y_i}$ be the corresponding solution of $F(y, \phi, a) = 0$, given by the implicit function theorem. Thanks to (52), there exists $\phi_\infty$ such that, for $\|\phi\|_\infty < \phi_\infty$, for all $t \in [0,T]$, for all $y \in [y_{min}, y_{max}]$, $\phi(t,y)$ belongs to $P$ and for all i, $a_i(y, \phi(t,y))$ belongs to $A$. It is then straightforward to see that
\[ \alpha(t,y) = a_i\bigl(y, \phi(t,y)\bigr), \qquad t \in [0,T], \quad y \in Y_{y_i}, \]
is well-defined and is a smooth solution of the approximate problem.
D. Bresch, D. Gérard-Varet
• Convergence of $\psi_n$. The estimate (52) shows that, uniformly in t and y, $(\psi_n^{t,y})_n$ is a bounded sequence of $D_0^2(\omega_w)$. Hence, for all (t, y), there exists $\psi^{t,y} \in D_0^2(\omega_w)$ such that, up to a subsequence, $(\psi_n^{t,y})_n$ converges weakly to $\psi^{t,y}$, with (cf. (52))
\[ \|D^2\psi^{t,y}\|_{L^2(\omega_w)} \le 2\, C_6\, |\phi(t,y)|. \]
Moreover, using Rellich's theorem (and again up to a subsequence), $\psi_n^{t,y}$ converges strongly to $\psi^{t,y}$ in $H^s_{loc}(\omega_w)$, for any $s < 2$. It is then a routine argument to pass to the limit in (47) as n goes to infinity, to obtain: for all t, y, $\psi^{t,y}$ satisfies (46) for all $\varphi \in V_0$. This ends the proof of Theorem 3.1.
Remark. We do not know at this point the behaviour at infinity of the solutions $\psi^{t,y}$. In particular, we still may not use $\psi^{t,y}$ as a test function in (46) (through a density argument), which would allow us to conclude uniqueness. Also, if we wish to use Sobolev injections, we need to distinguish between the oscillatory part of $\psi$ (for which injections hold) and its average, as will be done in the next subsection.

3.2.2. $H^m$ estimates. Let $\|\phi\|_\infty < \phi_\infty$ as in Theorem 3.1, and let $\psi^{t,y} \in D_0^2(\omega_w)$ satisfy (46) for all $\varphi \in V_0$. Note that $\psi^{t,y}$ belongs to $H^2_{loc}(\omega_w)$. Besides, thanks to classical regularity results for elliptic operators, $\psi^{t,y} \in C^\infty(\omega_w^- \cup \omega_w^+)$, and satisfies (43) in the classical sense. In this part, we show that $\psi^{t,y}$ belongs to $H^m(\omega^R)$ for all $m \ge 0$ and for all $R > 1$. As t and y are simple parameters, we drop them from the notation: it is straightforward to see that the various constants appearing in the inequalities below can be chosen independent of t and y.

Average and oscillations. We introduce the following notations: to all $w \in L^1_{loc}(\mathbb{R}^+ \times \mathbb{T})^N$ ($N \ge 1$), we associate its vertical average $\bar w \in L^1_{loc}(\mathbb{R}^+)^N$ and its oscillatory part $w^* \in L^1_{loc}(\mathbb{R}^+ \times \mathbb{T})^N$, through
\[ \forall\, \mathbf{X} = (X, Y) \in \mathbb{R}^+ \times \mathbb{T}, \qquad \bar w(X) = \int_{\sigma(X)} w \, dY, \qquad w^*(\mathbf{X}) = w(\mathbf{X}) - \bar w(X), \]
where for all $R \ge 0$, $\sigma(R)$ is the cross-section at $X = R$. In order to lighten notations, we may sometimes consider $\bar w$ as a function of $\mathbf{X} \in \omega_w^+$ instead of $X > 0$.

The aim of this paragraph is to prove

Proposition 3.2. There exist $C_\infty \in \mathbb{R}$ and $C, \sigma > 0$, such that for all $R > 0$, $\psi - C_\infty$ belongs to $H^2(\omega^R)$ with
\[ \|\psi - C_\infty\|_{H^2(\omega^R)} \le C \left( \exp(-\sigma R) + \|D^2\psi^*\|_{L^2(\omega^{R/2})} \right). \tag{53} \]
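Proof of Proposition 3.2. As $\psi$ belongs to $D_0^2(\omega_w)$, we have
\[ \psi^* \in H^2(\omega_w^+), \qquad \bar\psi^{(2)} \in L^2(\{X > 0\}). \]
$\bar\psi$ satisfies for all $X > 0$ the linear differential equation
\[ -\bigl(1 + (\chi_w')^2\bigr)^2\, \bar\psi^{(4)} + \bar\psi^{(1)} = -\overline{Q_w(\psi, \psi)} - \overline{Q_w(\psi, \psi_m)} - \overline{Q_w(\psi_m, \psi)}. \]
Simple computations yield
\[ \overline{Q_w(\psi_m, \psi)} = \overline{Q_w(\psi, \psi_m)} = 0, \qquad \overline{Q_w(\psi, \psi)} = \overline{Q_w(\psi^*, \psi^*)}, \]
so that $\bar\psi$ satisfies
\[ -\bigl(1 + (\chi_w')^2\bigr)^2\, \bar\psi^{(4)} + \bar\psi^{(1)} = -\overline{Q_w(\psi^*, \psi^*)}. \tag{54} \]
Remember that
\[ Q_w(\tilde\psi, \tilde\psi) = \nabla_w \times \left( (\nabla_w^\perp \tilde\psi \cdot \nabla_w)\, \nabla_w^\perp \tilde\psi \right). \]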
Applying the last equality to $\tilde\psi = \psi^*$ and integrating in the Y variable leads to $\overline{Q_w(\psi^*, \psi^*)} = \partial_X B(\psi^*, \psi^*)$ for a bilinear operator B involving the derivatives of $\psi^*$ up to order two. If we substitute this expression into (54) and integrate in the X variable, we finally obtain
\[ -\bigl(1 + (\chi_w')^2\bigr)^2\, \bar\psi^{(3)} + \bar\psi = C_\infty + B(\psi^*, \psi^*), \tag{55} \]
where $C_\infty$ is a real constant. Any solution $\bar\psi$ of Eq. (55) is given by $\bar\psi = C_\infty + \varphi$, with $\varphi$ a solution of
\[ -\bigl(1 + (\chi_w')^2\bigr)^2\, \varphi^{(3)} + \varphi = B(\psi^*, \psi^*). \tag{56} \]
This last equation can be expressed in the form of a first order system $V' = AV + F$, with
\[ V = \begin{pmatrix} \varphi \\ \varphi^{(1)} \\ \varphi^{(2)} \end{pmatrix}, \qquad A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ \bigl(1 + (\chi_w')^2\bigr)^{-2} & 0 & 0 \end{pmatrix}, \qquad F = \begin{pmatrix} 0 \\ 0 \\ -B(\psi^*, \psi^*) \end{pmatrix}. \tag{57} \]
A is diagonalisable, with eigenvalues $D_w$, $j D_w$, $j^2 D_w$, with $D_w = \bigl(1 + (\chi_w')^2\bigr)^{-2/3}$. Let $E_+$ and $E_-$ be the eigenspaces corresponding respectively to 1 and $\{j, j^2\}$, and let $P_+$, $P_-$ be the associated eigenprojections. Note that $E_+ = \mathbb{C}\, (D_w^{-2}, D_w^{-1}, 1)^t$. We introduce the "Green function"
\[ G(X) = e^{AX} P_- \ \text{ if } X > 0, \qquad G(X) = -e^{AX} P_+ \ \text{ if } X < 0, \]
and
\[ f(X) = F(X) \ \text{ if } X > 0, \qquad f(X) = 0 \ \text{ if } X < 0. \]
The function $G * f$ is a solution of the system $V' = AV + F$. Thus, we can write, for a $V_0 \in \mathbb{C}^3$, for all $X > 0$,
\[ V(X) = e^{AX} V_0 + G * f. \tag{58} \]
Note that G satisfies, for positive constants C and $\alpha$,
\[ \forall\, X, \qquad |G(X)| \le C\, e^{-\alpha |X|}, \]
and that $f \in L^1(\mathbb{R})$. More precisely, one has easily that for all $R > 0$,
\[ \int_{\{X > R\}} |F| \le C\, \|D^2\psi^*\|^2_{L^2(\omega^R)}. \]
Proceeding exactly as in [12, p. 1473], one is able to obtain
\[ \|G * F\|^2_{L^2(\{X > R\})} \le C \left( \exp(-\alpha R) + \|D^2\psi^*\|^2_{L^2(\omega^{R/2})} \right). \tag{59} \]
Now, as $\bar\psi^{(2)} = \varphi^{(2)} \in L^2(\{X > 0\})$, we necessarily have $P_+ V_0 \in \mathbb{C}^2 \times \{0\}$. But $E_+ = \mathbb{C}\, (D_w^{-2}, D_w^{-1}, 1)^t$, so that $P_+ V_0 = 0$. It yields the inequality:
\[ \|\varphi\|_{H^2(\{X > R\})} \le C \left( \exp(-\sigma R) + \|D^2\psi^*\|_{L^2(\omega^{R/2})} \right). \]
As $\psi - C_\infty = \varphi + \psi^*$, we finally obtain (53). This ends the proof of the proposition.
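The spectral facts about A used in this proof can be checked by hand. The sketch below assumes the reading $D_w = (1 + (\chi_w')^2)^{-2/3}$, so that the lower-left entry of A equals $D_w^3$, and writes $j = e^{2i\pi/3}$.

```latex
\begin{align*}
\det(\lambda I - A) &= \lambda^3 - D_w^{3},
  \qquad D_w^{3} = \bigl(1 + (\chi_w')^2\bigr)^{-2}, \\
\lambda &\in \{ D_w,\ jD_w,\ j^2 D_w \}, \qquad j = e^{2i\pi/3}, \\
A\,(1, \lambda, \lambda^2)^t &= (\lambda, \lambda^2, D_w^{3})^t
  = \lambda\,(1, \lambda, \lambda^2)^t
  \quad \text{since } \lambda^3 = D_w^{3}, \\
\lambda = D_w : \quad (1, D_w, D_w^{2})^t
  &= D_w^{2}\,(D_w^{-2}, D_w^{-1}, 1)^t \in E_+, \\
\operatorname{Re}(jD_w) &= \operatorname{Re}(j^2 D_w) = -\tfrac12 D_w < 0 .
\end{align*}
```

In particular, every nonzero element of $E_+$ has a nonzero third component, which is what forces $P_+ V_0 = 0$ once $P_+ V_0 \in \mathbb{C}^2 \times \{0\}$, and the two remaining eigenvalues have negative real part, which gives the exponential decay of $e^{AX} P_-$ for $X > 0$.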
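$H^m$ estimates on $\psi - C_\infty$. In this paragraph, we prove

Proposition 3.3. There exists $\sigma > 0$, such that for all $m \ge 0$, for all $R \ge 1$, $\psi - C_\infty \in H^m(\omega^R)$, with
\[ \|\psi - C_\infty\|_{H^m(\omega^R)} \le C_m \left( \exp(-\sigma R) + \|\psi - C_\infty\|_{H^2(\omega^{R-1})} \right). \tag{60} \]
The key point of the proof is the following lemma:

Lemma 3.4. Let $\Psi, f_1, f_2 \in C^\infty(\omega_w^+)$, 1-periodic in Y, satisfying
\[ (\Delta_w)^2 \Psi = \nabla_w^\perp \cdot f_1 + f_2 \quad \text{in } \omega_w^+, \]
where $\nabla_w = (\partial_X, \partial_Y - \chi_w' \partial_X)^t$. Then, for all $s \ge 1$, $\delta \in (0, s)$, for all $m \ge 0$, and $1 < q < \infty$,
\[ \|\Psi\|_{m+3,q,s,s+1} \le C \left( \|f_1\|_{m,q,s-\delta,s+1+\delta} + \|f_2\|_{m,q,s-\delta,s+1+\delta} + \|\Psi\|_{2,q,s-\delta,s+1+\delta} \right), \tag{61} \]
where $\|\cdot\|_{m,q,s,t}$ stands for the norm in $W^{m,q}\bigl(\omega_w^+ \cap \{s < X < t\}\bigr)$, and C is independent of s.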
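Proof of Lemma 3.4. The proof of this lemma is very close to the proof of [11, Lemma 1.2], and relies on classical elliptic regularity results. Note that it is enough to prove the inequality (61) for $s = 1$: we can always come back to this case through a change of variable $X \to X - \xi$. This also implies that the constant C of the lemma is independent of s.

We content ourselves with proving: Let $\psi, f_1, f_2 \in C^\infty(\omega_w^+)$, 1-periodic in Y, satisfying $(\Delta_w)^2 \psi = \nabla_w^\perp \cdot f_1 + f_2$ in $\omega_w^+$. Then, for all $0 < \delta < \tilde\delta$, for all $1 < q < \infty$,
\[ \|\psi\|_{3,q,1-\delta,2+\delta} \le C \left( \|f_1\|_{0,q,1-\tilde\delta,2+\tilde\delta} + \|f_2\|_{0,q,1-\tilde\delta,2+\tilde\delta} + \|\psi\|_{2,q,1-\tilde\delta,2+\tilde\delta} \right). \tag{62} \]
Indeed, the proof of the lemma follows easily by induction, applying (62) for $\psi = \partial^\alpha \Psi$, $f_1 = \partial^\alpha f_1$, $f_2 = \partial^\alpha f_2$, $|\alpha| = m$, and appropriate $\delta$, $\tilde\delta$.

Let $0 < \delta < \tilde\delta$ and $1 < q < \infty$ be fixed. Let $0 \le \zeta \le 1$ be a $C^\infty$ function on $\mathbb{R}^2$ such that
\[ \zeta = 1 \ \text{ in } K = [1 - \delta, 2 + \delta] \times \left[ -\tfrac12, \tfrac32 \right], \qquad \zeta = 0 \ \text{ outside a } C^\infty \text{ open set } O \text{ with } K \subset O \subset\, ]1 - \tilde\delta, 2 + \tilde\delta[\, \times\, ]-1, 2[. \]
We set $\tilde\psi = \zeta \psi$, $\tilde u = \nabla_w^\perp \tilde\psi$, and $u = \nabla_w^\perp \psi$. We derive the following system on $\tilde u$:
\[ \nabla_w^\perp \cdot (\Delta_w \tilde u) = \nabla_w^\perp \cdot \tilde f_1 + \tilde f_2 \ \text{ in } O, \qquad \nabla_w \cdot \tilde u = 0 \ \text{ in } O, \qquad \tilde u = 0 \ \text{ on } \partial O, \tag{63} \]
where
\[ \tilde f_1 = (4 \nabla_w \zeta \cdot \nabla_w)\, u + \zeta f_1, \]
\[ \tilde f_2 = \zeta f_2 + \Delta_w\bigl( (\Delta_w \zeta) \psi \bigr) + 2\, (\nabla_w \Delta_w \zeta \cdot \nabla_w)\, \psi + \bigl( \nabla_w (\nabla_w \zeta) \cdot \nabla_w (\nabla_w \psi) \bigr) - 4 \nabla_w (\nabla_w \zeta) \cdot \nabla_w u - \nabla_w^\perp \cdot f_1. \tag{64} \]
Using [11, Theorem 3.2, p. 130], there exists $G \in W_0^{1,q}(O)$ such that
\[ \nabla \cdot G = \tilde f_2 \ \text{ in } O, \quad \text{with} \quad \|G\|_{W^{1,q}(O)} \le C\, \|\tilde f_2\|_{L^q(O)}. \]
The function $F = (G_2, -G_1 + \chi_w' G_2)$, where $G = (G_1, G_2)^t$, satisfies
\[ \nabla_w \times F = \tilde f_2, \quad \text{with} \quad \|F\|_{W^{1,q}(O)} \le \|\tilde f_2\|_{L^q(O)}. \]
Going from the vorticity formulation (63) to a velocity formulation, we obtain
\[ \Delta_w \tilde u + \nabla_w q = \tilde f_1 + F \ \text{ in } O, \qquad \nabla_w \cdot \tilde u = 0 \ \text{ in } O, \qquad \tilde u = 0 \ \text{ on } \partial O. \]
Classical estimates for elliptic systems yield
\[ \|\tilde u\|_{W^{2,q}(O)} \le C \left( \|\tilde f_1 + F\|_{L^q(O)} + \|\tilde u\|_{L^q(O)} \right), \tag{65} \]
which leads easily to (62).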
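Proof of Proposition 3.3. To prove this proposition, we work with the function $\psi_\infty := \psi - C_\infty - \psi_m$, which satisfies on $\omega_w^+$ the equation
\[ Q_w(\psi_\infty, \psi_\infty) + \partial_X \psi_\infty - \Delta_w^2 \psi_\infty = 0, \]
which can also be written, introducing $u_\infty = \nabla^\perp \psi_\infty$,
\[ \Delta_w^2 \psi_\infty = \nabla_w^\perp \times (-u_\infty \cdot \nabla_w u_\infty) - \partial_X \psi_\infty. \tag{66} \]
Let us show that: for all $\delta > 0$, there exists $C > 0$ such that
\[ \forall\, R \ge 1, \qquad \|\psi_\infty\|_{H^3(\omega^R)} \le C\, \|\psi_\infty\|_{H^2(\omega^{R-\delta})}. \tag{67} \]
Let $\delta > 0$; for all $s \ge 1$, we have
\[ \|{-u_\infty} \cdot \nabla_w u_\infty\|_{0,3/2,s-\delta,s+1+\delta} \le C_1\, \|u_\infty\|_{0,6,s-\delta,s+1+\delta}\, \|\nabla_w u_\infty\|_{0,2,s-\delta,s+1+\delta} \le C_2\, \|\psi_\infty\|^2_{2,2,s-\delta,s+1+\delta} \le C_3\, \|\psi_\infty\|_{2,2,s-\delta,s+1+\delta}. \tag{68} \]
Note that the constants $C_i$ do not depend on s (thanks to the invariance by translation along X). We also have
\[ \|\partial_X \psi_\infty\|_{0,3/2,s-\delta,s+1+\delta} \le C_4\, \|\partial_X \psi_\infty\|_{0,2,s-\delta,s+1+\delta} \le C_5\, \|\psi_\infty\|_{2,2,s-\delta,s+1+\delta}, \]
so that applying Lemma 3.4, we have
\[ \|\psi_\infty\|_{3,3/2,s-\delta/2,s+1+\delta/2} \le C_6\, \|u_\infty \cdot \nabla_w u_\infty + \partial_X \psi_\infty\|_{0,3/2,s-\delta,s+1+\delta} \le C_7\, \|\psi_\infty\|_{2,2,s-\delta,s+1+\delta}. \tag{69} \]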
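We now iterate the process and get an improved regularity. By Sobolev imbedding, we have
\[ \|u_\infty\|_{0,\infty,s-\delta/2,s+1+\delta/2} \le C_8\, \|\psi_\infty\|_{3,3/2,s-\delta/2,s+1+\delta/2} \le C_9\, \|\psi_\infty\|_{2,2,s-\delta,s+1+\delta}. \]
Hence, we get the following bound
\[ \|{-u_\infty} \cdot \nabla_w u_\infty\|_{0,2,s-\delta/2,s+1+\delta/2} \le C_{10}\, \|u_\infty\|_{0,\infty,s-\delta/2,s+1+\delta/2}\, \|\nabla_w u_\infty\|_{0,2,s-\delta/2,s+1+\delta/2} \le C_{11}\, \|\psi_\infty\|^2_{2,2,s-\delta,s+1+\delta} \le C_{12}\, \|\psi_\infty\|_{2,2,s-\delta,s+1+\delta}. \tag{70} \]
Moreover,
\[ \|\partial_X \psi_\infty\|_{0,2,s-\delta/2,s+1+\delta/2} \le \|\psi_\infty\|_{2,2,s-\delta,s+1+\delta}, \]
so that applying again Lemma 3.4,
\[ \|\psi_\infty\|_{3,2,s,s+1} \le C\, \|u_\infty \cdot \nabla_w u_\infty + \partial_X \psi_\infty\|_{0,2,s-\delta/2,s+1+\delta/2} \le C'\, \|\psi_\infty\|_{2,2,s-\delta,s+1+\delta}, \tag{71} \]
with $C'$ independent of s. Using this last inequality with $s = R + k$, $k = 1, 2, \dots$, and summing over k, we get (67). Using the same type of bootstrap arguments with the derivatives of $\psi_\infty$ instead of $\psi_\infty$, it is easy to show that more generally: for all $m \ge 3$, for all $\delta > 0$, there exists $C > 0$ such that:
\[ \forall\, R \ge 1, \qquad \|\psi_\infty\|_{H^m(\omega^R)} \le C\, \|\psi_\infty\|_{H^2(\omega^{R-\delta})}. \tag{72} \]
Back to $\psi - C_\infty = \psi_\infty + \psi_m$, the proposition follows. This ends the proof.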
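The exponents appearing in the two bootstrap steps can be traced through Hölder's inequality and the Sobolev embeddings in dimension two. The sketch below works on a bounded strip $S = \omega_w^+ \cap \{s - \delta < X < s + 1 + \delta\}$; the name $S$ is introduced here for illustration only.

```latex
\begin{align*}
\tfrac16 + \tfrac12 = \tfrac23
  \ &\Longrightarrow\
  \|u \cdot \nabla_w u\|_{L^{3/2}(S)} \le \|u\|_{L^{6}(S)}\,\|\nabla_w u\|_{L^{2}(S)}
  && \text{(H\"older)} \\
H^{1}(S) &\hookrightarrow L^{6}(S)
  && \text{(dimension 2: } H^1 \hookrightarrow L^p \text{ for all } p < \infty\text{)} \\
L^{2}(S) &\hookrightarrow L^{3/2}(S)
  && \text{(} S \text{ has finite measure)} \\
W^{2,3/2}(S) &\hookrightarrow L^{\infty}(S)
  && \text{(dimension 2: } 2 \cdot \tfrac32 = 3 > 2\text{)}
\end{align*}
```

With $u_\infty = \nabla^\perp \psi_\infty$, the second line gives $\|u_\infty\|_{L^6} \le C \|\psi_\infty\|_{H^2}$ in the first step, and the last line gives $\|u_\infty\|_{L^\infty} \le C \|\psi_\infty\|_{W^{3,3/2}}$, which is the gain used in the second step.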
Control of $\psi$ at infinity. To show that $\psi$ is in $H^m(\omega^R)$ for all $R > 0$, it remains to prove that the constant $C_\infty$ of Proposition 3.2 is zero. On one hand, we multiply Eq. (43) by $\psi$ ($= \psi^{t,y}$) and integrate on $\omega \cap \{X < R\}$. After a few integrations by parts, using the jump conditions (44), we are left with
2 2 ∇w⊥ ψ · ∇w ∇w⊥ ψ · ∇w⊥ ψ Dw ψ, Dw ψ = − ω∩{X
ω∩{X
−
ω∩{X
∇w⊥ ψm · ∇w ∇w⊥ ψ · ∇w⊥ ψm
− +
⊥ ⊥ ∇w ψm · ∇w ∇w ψ · ∇w⊥ ψ
ω∩{X
2
∂X (w ψ) ψ −
+ σ (R)
(∇w ∂X ψ) · ∇w ψ σ (R)
χw ⊥ ⊥ + · ∇w (ψ + ψm ) · ∇w ∇w ψ 1 σ (R)
χw + · ∇w⊥ ψ · ∇w ∇w⊥ ψm 1 σ (R) 2 2 3 +(1 + (χw ) ) ∂X ψm ψ + ∂ X ψm | σw
| σw
σw
∂X ψ .
(73)
σw
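As $\psi - C_\infty$ belongs to $H^m(\omega^R)$ for all $R > 1$, we may pass easily to the limit as R goes to infinity in (73) and obtain
\[ \int_{\omega_w} \langle D_w^2 \psi, D_w^2 \psi \rangle = b(\psi, \psi_m, \psi) + \frac{C_\infty^2}{2} + \bigl(1 + (\chi_w')^2\bigr) \left( \partial_X^2 \psi_m|_{\sigma_w} \int_{\sigma_w} \psi + \partial_X^3 \psi_m|_{\sigma_w} \int_{\sigma_w} \partial_X \psi \right). \tag{74} \]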
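On the other hand, we use the fact that, up to a subsequence,
\[ \psi_n \to \psi \quad \text{weakly in } D_0^2(\omega_w), \]
where $\psi_n$ was built in Sect. 3.2.1. Hence,
\[ \int_{\omega_w} \langle D_w^2 \psi, D_w^2 \psi \rangle \le \liminf_{n \to +\infty} \int_{\omega_w} \langle D_w^2 \psi_n, D_w^2 \psi_n \rangle. \]
Using (48) yields
\[ \liminf_{n \to +\infty} \int_{\omega_w} \langle D_w^2 \psi_n, D_w^2 \psi_n \rangle = \liminf_{n \to +\infty} b(\psi_n, \psi_m, \psi_n) + \liminf_{n \to +\infty} \bigl(1 + (\chi_w')^2\bigr) \left( \partial_X^2 \psi_m|_{\sigma_w} \int_{\sigma_w} \psi_n + \partial_X^3 \psi_m|_{\sigma_w} \int_{\sigma_w} \partial_X \psi_n \right). \tag{75} \]
By the Rellich theorem, for all $s < 2$, we have up to a subsequence
\[ \psi_n \to \psi \quad \text{strongly in } H^s_{loc}. \tag{76} \]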
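If we choose for instance $s = 7/4$, and thanks to the imbedding
\[ H^s(\omega_w^-) \hookrightarrow H^1(\sigma_w), \]
we deduce
\[ \liminf_{n \to +\infty} \bigl(1 + (\chi_w')^2\bigr) \left( \partial_X^2 \psi_m|_{\sigma_w} \int_{\sigma_w} \psi_n + \partial_X^3 \psi_m|_{\sigma_w} \int_{\sigma_w} \partial_X \psi_n \right) = \bigl(1 + (\chi_w')^2\bigr) \left( \partial_X^2 \psi_m|_{\sigma_w} \int_{\sigma_w} \psi + \partial_X^3 \psi_m|_{\sigma_w} \int_{\sigma_w} \partial_X \psi \right). \tag{77} \]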
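It remains to show that
\[ \liminf_{n \to +\infty} b(\psi_n, \psi_m, \psi_n) = b(\psi, \psi_m, \psi). \]
Thanks to (50),
\[ \left| \int_{\omega^R} (\nabla_w^\perp \psi_n \cdot \nabla_w) \nabla_w^\perp \psi_m \cdot \nabla_w^\perp \psi_n \right| \le C_1\, |\phi| \exp(-\sigma R)\, \|D^2 \psi_n\|^2_{L^2(\omega_w)} \le C_2 \exp(-\sigma R), \tag{78} \]
where we have used the uniform bound (52). In the same way
\[ \left| \int_{\omega^R} (\nabla_w^\perp \psi \cdot \nabla_w) \nabla_w^\perp \psi_m \cdot \nabla_w^\perp \psi \right| \le C_3 \exp(-\sigma R). \]
Thus, for all $\varepsilon > 0$, there exists R such that
\[ |b(\psi_n, \psi_m, \psi_n) - b(\psi, \psi_m, \psi)| \le \varepsilon + \left| \int_{\omega \cap \{X < R\}} (\nabla_w^\perp \psi_n \cdot \nabla_w) \nabla_w^\perp \psi_n \cdot \nabla_w^\perp \psi_m - \int_{\omega \cap \{X < R\}} (\nabla_w^\perp \psi \cdot \nabla_w) \nabla_w^\perp \psi \cdot \nabla_w^\perp \psi_m \right|. \tag{79} \]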
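The second term on the right-hand side goes to zero as $n \to +\infty$, using (76). Finally,
\[ \int_{\omega_w} \langle D_w^2 \psi, D_w^2 \psi \rangle \le b(\psi, \psi_m, \psi) + \bigl(1 + (\chi_w')^2\bigr) \left( \partial_X^2 \psi_m|_{\sigma_w} \int_{\sigma_w} \psi + \partial_X^3 \psi_m|_{\sigma_w} \int_{\sigma_w} \partial_X \psi \right). \tag{80} \]
The comparison with (74) yields $C_\infty = 0$.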
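The comparison can be spelled out in two lines. The sketch assumes that (74) carries the extra term $C_\infty^2/2$, and writes $\mathcal{B}(\psi)$ (a shorthand introduced here) for the boundary contribution on $\sigma_w$ that is common to (74) and (80).

```latex
\begin{align*}
\int_{\omega_w} \langle D_w^2\psi, D_w^2\psi \rangle
  &= b(\psi, \psi_m, \psi) + \frac{C_\infty^2}{2} + \mathcal{B}(\psi)
  && \text{(74)} \\
\int_{\omega_w} \langle D_w^2\psi, D_w^2\psi \rangle
  &\le b(\psi, \psi_m, \psi) + \mathcal{B}(\psi)
  && \text{(80)} \\
\Longrightarrow \quad \frac{C_\infty^2}{2} &\le 0
  \quad \Longrightarrow \quad C_\infty = 0 .
\end{align*}
```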
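Exponential decay. In this last paragraph, we prove the so-called "Saint-Venant" estimates on $\psi$, that is, exponential decay estimates (16). Once again, we rather work with $\Psi = \psi - \psi_m$, which satisfies on $\omega_w^+$ the equation
\[ Q_w(\Psi, \Psi) + \partial_X \Psi - \Delta_w^2 \Psi = 0. \]
Combining Propositions 3.2 and 3.3, it is enough to prove

Proposition 3.5. There exist $R_1, C, \sigma > 0$, such that
\[ \forall\, R \ge R_1, \qquad \|D^2 \Psi^*\|_{L^2(\omega^R)} \le C \exp(-\sigma R). \]
Proof of Proposition 3.5. Let $f(R) = \int_{\omega^R} |D^2 \psi^*|^2$. An energy estimate on the previous equation gives
\[ f(R) \le C \int_{\omega^R} \langle D_w^2 \Psi^*, D_w^2 \Psi^* \rangle = C \sum_{k=1}^{6} I_k, \tag{81} \]
with
\[ I_1 = \int_{\omega^R} \partial_X \Psi^*\, \Psi^*, \qquad I_2 = \int_{\omega^R} Q_w(\bar\Psi, \Psi^*)\, \Psi^*, \qquad I_3 = \int_{\omega^R} Q_w(\Psi^*, \bar\Psi)\, \Psi^*, \]
\[ I_4 = \int_{\omega^R} Q_w(\Psi^*, \Psi^*)\, \Psi^*, \qquad I_5 = -\int_{\sigma(R)} \bigl((\chi_w')^2 + 1\bigr) (\partial_X \Delta_w \Psi^*)\, \Psi^*, \qquad I_6 = +\int_{\sigma(R)} \bigl(1 + (\chi_w')^2\bigr)\, \partial_X |\nabla_w \Psi^*|^2. \tag{82} \]
\[ I_1 = -\int_{\sigma(R)} \frac{|\Psi^*|^2}{2} \le 0. \]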
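We have
\[ I_4 = \int_R^{+\infty} \int_{\sigma(R')} \left( \nabla_w^\perp \Psi^* \cdot \nabla_w \Delta_w \Psi^* \right) \Psi^* \le \sup_{Y,\, X \ge R} |\nabla_w \Delta_w \Psi^*| \int_R^{+\infty} \left( \int_{\sigma(R')} |\Psi^*|^2 \right)^{1/2} \left( \int_{\sigma(R')} |\nabla_w^\perp \Psi^*|^2 \right)^{1/2}. \tag{83} \]
Using Proposition 3.3, we obtain for R large enough $\sup_{Y,\, X \ge R} |\nabla_w \Delta_w \Psi^*| \le \delta$, where $\delta$ positive will be chosen later. Moreover,
\[ \int_{\sigma(R')} |\Psi^*|^2 + \int_{\sigma(R')} |\nabla_w^\perp \Psi^*|^2 \le 2 \int_{\sigma(R')} |D^2 \Psi^*|^2. \]
Finally, $I_4 \le \delta f(R)$. The estimates of $I_2$ and $I_3$ are similar to the one of $I_4$. We get for R large enough, $I_2 + I_3 \le \delta f(R)$.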
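We treat $I_5$ in the following way:
\[ I_5 = -\bigl(1 + (\chi_w')^2\bigr) \int_{\sigma(R)} \partial_X \Delta_w \Psi^*\, \Psi^* \]
\[ = -\bigl(1 + (\chi_w')^2\bigr)\, \partial_X \int_{\sigma(R)} \Delta_w \Psi^*\, \Psi^* + \bigl(1 + (\chi_w')^2\bigr) \int_{\sigma(R)} \Delta_w \Psi^*\, \partial_X \Psi^* \]
\[ \le -\bigl(1 + (\chi_w')^2\bigr)\, \partial_X \int_{\sigma(R)} \Delta_w \Psi^*\, \Psi^* + C \int_{\sigma(R)} |D^2 \Psi^*|^2 \]
\[ \le -\bigl(1 + (\chi_w')^2\bigr)\, \partial_X \int_{\sigma(R)} \Delta_w \Psi^*\, \Psi^* - C f'(R). \tag{84} \]
Finally,
\[ I_6 = \bigl(1 + (\chi_w')^2\bigr)\, \partial_X \int_{\sigma(R)} |\nabla_w \Psi^*|^2. \]
Back to the main energy estimate (81), we obtain for R large enough
\[ (1 - 2\delta) f(R) \le -C f'(R) - \bigl(1 + (\chi_w')^2\bigr)\, \partial_X \int_{\sigma(R)} \Delta_w \Psi^*\, \Psi^* + \bigl(1 + (\chi_w')^2\bigr)\, \partial_X \int_{\sigma(R)} |\nabla_w \Psi^*|^2, \tag{85} \]
where C is a positive constant. We integrate this inequality between $t > 0$ large enough and $+\infty$. It leads to
\[ \int_t^{+\infty} f(R)\, dR + C_1 f'(t) \le C_2 f(t), \tag{86} \]
where the $C_i$'s are positive constants. We conclude thanks to [11, Lemma 2.2].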
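To see why an integral inequality of the type (86) produces exponential decay, it may help to look at a simplified model case in which the term with coefficient $C_1$ is dropped. This simplification is an assumption made here for illustration only; the general case is the content of the cited lemma. Write $g(t) := \int_t^{+\infty} f(R)\, dR$, so that $g' = -f$.

```latex
\begin{align*}
g(t) &\le C_2\, f(t) = -C_2\, g'(t), \\
\frac{d}{dt}\Bigl( e^{t/C_2}\, g(t) \Bigr)
  &= e^{t/C_2} \Bigl( g'(t) + \tfrac{1}{C_2}\, g(t) \Bigr) \le 0, \\
g(t) &\le g(t_0)\, e^{-(t - t_0)/C_2}, \qquad t \ge t_0 .
\end{align*}
```

If, as assumed here, $f$ is nonnegative and nonincreasing in R, then $f(t+1) \le \int_t^{t+1} f \le g(t)$, so $f$ inherits the exponential decay of $g$.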
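3.2.3. Uniqueness. In this part, we show for all (t, y) the uniqueness of the solution $\psi^{t,y} \in H_0^2(\omega_w)$ of (43)–(44), under a smallness assumption on $\phi$. Let (t, y) be fixed, let $\psi_1^{t,y}$, $\psi_2^{t,y}$ in $H_0^2(\omega_w)$ be two solutions of (43)–(44), and let $\theta = \psi_1^{t,y} - \psi_2^{t,y}$. It satisfies, for all $\varphi \in V_0$,
\[ -\int_{\omega_w} \langle D_w^2 \theta, D_w^2 \varphi \rangle + b_y(\theta, \psi_2^{t,y}, \varphi) + b_y(\psi_1^{t,y}, \theta, \varphi) + b_y(\theta, \psi_m^{t,y}, \varphi) + b_y(\psi_m^{t,y}, \theta, \varphi) + \int_{\omega_w} (\partial_X \theta)\, \varphi = 0. \tag{87} \]
As $\theta \in H_0^2(\omega_w)$, there exists $\varphi_n$ in $V_0$ such that $\varphi_n \to \theta$ in $H_0^2(\omega_w)$. It is then easy to pass to the limit in the above equation, as n goes to infinity, with $\varphi = \varphi_n$. It yields
\[ -\int_{\omega_w} \langle D_w^2 \theta, D_w^2 \theta \rangle + b(\theta, \psi_m, \theta) = 0. \]
Using (50), we obtain
\[ |b(\theta, \psi_m, \theta)| \le C\, |\phi|\, \|D^2 \theta\|^2_{L^2(\omega_w)}, \]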
so that
\[ \|D^2 \theta\|^2_{L^2(\omega_w)} \le C'\, |\phi(t,y)|\, \|D^2 \theta\|^2_{L^2(\omega_w)}. \]
Finally, for $\|\phi\|_\infty \le \phi_\infty < 1/C'$, we obtain $\theta = 0$, which concludes the proof of uniqueness.

We thus may define, for $\|\phi\|_\infty < \phi_\infty$, a function $\psi_w^0 = \psi_w^0(t, y, \mathbf{X})$ by: for all (t, y), $\psi_w^0(t, y, \cdot)$ is the only solution in $H_0^2(\omega_w)$ of (43)–(44). As already mentioned, if we come back to the original system (14), (15) and define $\Psi_w = \Psi_w(t, y, \mathbf{X})$ by (41)–(42), it is clear that the function $\Psi_w$ satisfies the statements of Theorem 2.1. This ends the proof of Theorem 2.1.

We end Sect. 3.2 with additional regularity results on $\psi_w^0$ with respect to t and y. Indeed, building the other profiles of expansion (21) requires the existence and control of $\partial_t^\alpha \partial_y^\beta \Psi_w^0$ (or equivalently $\partial_t^\alpha \partial_y^\beta \psi_w^0$). This is possible thanks to

Proposition 3.6. For all $k \ge 0$, there exists $\phi_\infty^k > 0$, such that if $\|\phi\|_\infty < \phi_\infty^k$, then
i) $\psi_w^0 \in W^{k,\infty}\bigl( (0,T) \times (y_{min}, y_{max});\, D_0^2(\omega_w) \bigr)$.
ii) For all $\alpha, \beta$, $|\alpha| + |\beta| \le k$, for all $R > 1$, $\partial_t^\alpha \partial_y^\beta \psi_w^0(t, y, \cdot)$ belongs to $H^m(\omega^R)$, with
\[ \sup_{t,y} \|\partial_t^\alpha \partial_y^\beta \psi_w^0(t, y, \cdot)\|_{H^m(\omega^R)} \le C_m \exp(-\sigma R), \qquad R \ge R_1. \]
Proof of Proposition 3.6. We only give the main ideas of the proof. Indeed, precise computations are very close to those performed above, so that we do not give all the details.

• Case k = 0. Let $\psi_n = \psi_n(t, y, \mathbf{X})$ be the approximation built in Subsect. 3.2. On one hand, estimate (52) shows that $(\psi_n)_n$ is a bounded sequence of
\[ E := L^\infty\bigl( (0,T) \times (y_{min}, y_{max});\, D_0^2(\omega_w) \bigr). \]
Thus, there exists a subsequence, that we still denote $(\psi_n)_n$, which converges for the weak star topology to a function $\psi^0$ of E. In particular, for any compact subset K of $\omega_w$, we get
\[ \forall\, \varphi \in C^\infty([0,T] \times [y_{min}, y_{max}] \times K), \qquad \int \psi_n \varphi \xrightarrow[n \to +\infty]{} \int \psi^0 \varphi. \]
On the other hand, as $\psi_w^0(t, y, \cdot)$ is the only solution of (43)–(44), one shows through standard arguments that for all (t, y), for all $s < 2$,
\[ \psi_n(t, y, \cdot) \to \psi_w^0(t, y, \cdot), \quad n \to +\infty, \quad \text{strongly in } H^s(K). \]
Thanks to this last convergence, we also get
\[ \forall\, \varphi \in C^\infty([0,T] \times [y_{min}, y_{max}] \times K), \qquad \int \psi_n \varphi \xrightarrow[n \to +\infty]{} \int \psi_w^0 \varphi. \]
Finally, $\psi_w^0 \equiv \psi^0$, and therefore belongs to E.
• Case k = 1 Let (hn ) be a sequence going to zero. The idea is to consider the sequence (ϕn = ϕn (t, y, X))n given by ϕn (t, y, ·) =
ψw0 (t + hn , y, ·) − ψw0 (t, y, ·) . hn
Note that ϕn converges to ∂t ψw0 in D ((0, T ) × (ymin , ymax ) × ωw ). Moreover, repeating the whole process above with ϕn instead of ψn , one shows that, for φ ∞ small enough, this sequence converges as above, to a function p = p(t, y, X) ∈ E. For all t, y, p(t, y, ·) is the unique solution of equations that are of the same type as (43)–(44). In particular, it satisfies exponential decay estimates of type (16). Finally, one has ∂t ψw0 = p ∈ E, with appropriate exponential decay estimates. In the same way, one has also ∂y ψw0 ∈ E, with appropriate exponential decay estimates. • Case k > 1 One applies inductively the same arguments, replacing ψw0 by its derivatives. 3.3. Convergence result. This part is devoted to the proof of Theorem 2.2. The scheme of the proof is classical: on the basis of the previous analysis, we build an approxiε of the type (21) with n = 1. We then perform energy estimates on mate solution app ⊥ ε ε ). ε v = ∇ ( − app 3.3.1. The approximate solution. Interior and eastern profiles. We recall that 0 is given by (29), and that e0 ≡ 0. j The next profiles are built inductively. Assume that the j and e , j ≤ i − 1, are k well-defined, with e and its derivatives decaying exponentially to zero. Then, the i th interior profile satisfies ∂X i = F i , where F i depends on the j , j ≤ i − 1. This defines i , up to the addition of a function of t and y. To determine this function we use the equation on the eastern profile ei . This profile satisfies a linear elliptic equation of the type −2e ei − ∂X ei = Gi ,
X ∈ ωe+ ∪ ωe− ,
(88)
fulfilled with jump and boundary conditions = − i (t, y, ·) , ei (t, y, ·) |σe |{x=χe (y)} ∂Xk ei (t, y, ·) = g i,k (t, y), k = 1, . . . , 3,
(89)
|σe
ei (t, y, ·) 1-periodic in Y, ei (t, y, ·)|X=γe (Y )
= 0,
∂nw ei (t, y, ·)|X=γe (Y )
(90) =h, i
where the Gi , g i,k , and hi depend on the j and e , j ≤ i − 1. If we introduce , X ∈ ωe+ , ψei = ei + i |{x=χe (y)}
ψei
=
X ∈ ωe− ,
ei ,
ψei is formally a solution of (88), (90), with the homogeneous jump condition = 0. ei (t, y, ·) |σe
(91)
Using the Lax-Milgram lemma, it is then direct to show the existence and uniqueness of the ψei solution of (88), (90), (91). Moreover, using the same kind of arguments as in Subsect. 3.3.2, one shows that there exists C = C(t, y) such that ψei (t, y, ·) + C(t, y) and its derivatives decay exponentially to zero. Hence, we choose i such that = C(t, y), i (t, y, ·) |{x=χe (y)}
ei (t, y, ·)
so that and ei .
decays exponentially to zero. This completes the construction of i
Western profiles. The first western profile w0 has been built in Sect. 3.2. The second profile w1 can be obtained by solving similar equations, taking into account the error term created by w0 . Such an error term is well controlled, thanks to Proposition 3.6. More generally, the following profiles are solutions of similar equations, with error terms coming from lower profiles. As the construction is straightforward but tedious, we will not detail it. 3.3.2. Final energy estimates. We now focus on the energy estimates on the difference ε be the approximation above, between the exact and the approximate solutions. Let app ε which is well defined (and regular enough) for φ ∞ small enough. Let uεapp = ∇ ⊥ app , ε ⊥ ε ε and u = ∇ , with satisfying (3). We wish to show that ε L∞ ε − app t
or equivalently that
uε − uεapp L∞ t
The difference
vε
=
uε
− uεapp
belongs to
−−→ 0,
(H 1 (ε ))
ε→0
−−→ 0.
(L2 (ε ))
H 1 (ε ),
ε→0
and satisfies
∂t v ε + v ε · ∇v ε + v ε · ∇uεapp + uεapp · ∇v ε +v ε + (ηB + βy)(v ε )⊥ + ∇q ε − v ε = R ε ∇ · vε = 0 ε v | ∪ = 0 ε w e ∂v = σε − q ε n ∂n | w ∪ e vε = ϕε
in ∪ εw ∪ εe , in ∪ εw ∪ εe , at w ∪ e , at w ∪ e , ε at w ∪ eε
(92)
with R ε L∞ −1 (ε )) = O(ε), t (H
σ ε L∞ (H 1/2 ( w ∪ e )) = O(ε), t
λ ϕ ε W 1,∞ (H 1/2 ( ε ∪ ε )) = O exp − , λ > 0. t w e ε
The boundary term ϕ ε is the trace of the western (resp. eastern) boundary layer at eε ε ): (resp. at w ε (t, y, χe (y) + εγe (ε −1 y))|eε , ϕ ε |eε = ϕw
ϕ ε |wε = ϕeε (t, y, χw (y) − εγw (ε −1 y))|wε , ε , ϕ ε , and their derivatives exponentially decaying in their last variable. This with ϕw e ε are zero for explains the exponential bound on ϕ ε . Note that thanks to (11), ϕeε and ϕw y ∈ [ymax − λ, ymax ] ∪ [ymin , ymin + λ].
Lift of the boundary term. We must add a corrector to v ε so as to correct the boundary ε conditions. Let w (y) = χw (y) − εγw (ε −1 y), eε (y) = χe (y) + εγe (ε −1 y). We set
ε (y))2 + 1 (2x − w ε ε ε ε θ(x − w (y)) w1 (t, x, y) = ϕe (t, y, w (y)) ε (y))2 + 1 (w
(2x − eε (y))2 + 1 ε ε ε + ϕw (t, y, e (y)) θ(x − e (y)) , eε (y)2 + 1 where θ belongs to Cc∞ ([−δ, δ]) for δ > 0 small enough. It is clear that w1ε W 1,∞ (H 1 (ε )) ≤ C exp(−λ /ε),
λ > 0. 3 Now, we may apply [11, Lemma 3.1]: there exists w2ε ∈ L∞ H01 (ε ) such that t
∇ · w2ε = −∇ · w1ε ,
w2ε L∞ ≤ C(ε ) ∇ · w1ε L∞ 1 ε 2 ε . t (H ( )) t (L ( ))
Moreover, we can choose
C(ε ) = C δ(ε )2 1 + δ(ε ) ≤ C ,
where δ is the Lebesgue measure on R2 , and C is independent of ε. A look at the proof also shows that the same inequality holds for time derivatives, so that finally w2ε W 1,∞ (H 1 (ε )) ≤ C ε 2 . t
If we set v ε = wε + w1ε + w2ε , w ε is solution of ∂t w ε + w ε · ∇w ε + w ε · ∇uεapp + uεapp · ∇w ε +w ε + (ηB + βy)(w ε )⊥ + ∇q ε − w ε = R˜ ε ∇ · w ε = 0 in ∪ εw ∪ εe , ε w | ∪ = 0 at w ∪ e , ε w e ∂w = σ˜ε at w ∪ e , − q ε n ∂n | w ∪ e wε = 0
ε at w ∪ eε
in ∪ εw ∪ εe , (93)
with R˜ ε L∞ −1 (ε )) = O(ε), t (H
σ˜ ε L∞ 1/2 ( ∪ )) = O(ε). w e t (H
Final estimates. The final estimates are very close to those of [12]. We get ∂t w ε (t, ·) 2L2 + w ε (t, ·) 2L2 + ∇w ε (t, ·) 2L2 ≤ σ˜ ε · w ε (t, ·) σ ε ε + R · w (t, ·) + w ε · ∇uεapp · w ε = I1 + I2 + I3 . ε
(94)
ε
• I1 satisfies ε |I1 | ≤ σ˜ ε L∞ −1/2 (ε )) w (t, ·) H 1/2 (ε ) t (H
≤ C1 ε ∇w ε (t, ·) L2 (ε ) ≤ C12 ε + ε ∇w ε (t, ·) 2L2 (ε ) . • Similarly, we get
(95)
|I2 | ≤ C2 ε + ε ∇w ε (t, ·) 2L2 .
• I3 is the most difficult to control, because the boundary layer part of uεapp has strong gradient. Indeed, ∇uεapp
=ε
−2
y x − χw (y) ∇w (y) uw t, y, , + higher order terms in ε, ε ε
where uw = ∇w (y)w . The worst term is then
y x − χw (y) ε −2 ∇w (y) uw t, y, , . J =ε ε ε ε Proceeding exactly as in [12, p. 1495], one obtains For φ ∞ small enough, |Jε | ≤
1 ∇wε (t, ·) 2L2 (ε ) , 2
which allows us to absorb this term into the one coming from viscosity, and to conclude thanks to a Gronwall lemma.

4. Rough Topography

4.1. Formal Derivation. We try to build an approximate solution to system (2) in the following natural form:
\[ \psi^\zeta_{app} = \psi^0(t, x, y) + \zeta\, \psi^1\Bigl(t, x, y, \frac{x}{\zeta}, \frac{y}{\zeta}\Bigr) + \zeta^2\, \psi^2\Bigl(t, x, y, \frac{x}{\zeta}, \frac{y}{\zeta}\Bigr) + \dots, \tag{96} \]
where $\psi^0$ describes the large scale circulation and the $\psi^i$'s, $i \ge 1$, are correctors taking into account the effect of the rough topography, through the variables $X = x/\zeta$, $Y = y/\zeta$.
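At order $\zeta^{-1}$, we get the following equation on $\psi^1$:
\[ \partial_t \Delta_X \psi^1 + \tilde u^1 \cdot \nabla_X \Delta_X \psi^1 + u^0 \cdot \nabla_X \Delta_X \psi^1 + \tilde u^1 \cdot \nabla_X \eta_B + r \Delta_X \psi^1 - \nu (\Delta_X)^2 \psi^1 + u^0 \cdot \nabla_X \eta_B = 0, \tag{97} \]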
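where $\tilde u^1 = \nabla_X^\perp \psi^1$, and $u^0 = \nabla_x^\perp \psi^0$. This last equation is of type (18), setting $U := u^0$ and $\tilde\psi := \psi^1$.

We obtain the evolution equation on $\psi^0$ by averaging in X, Y the equation at order $\zeta^0$. Note that the nonlinearity in (2) can be written ($u = \nabla^\perp \psi$):
\[ u \cdot \nabla \Delta \psi = \partial_y \bigl( \mathrm{div}\, (u_1 \otimes u) \bigr) - \partial_x \bigl( \mathrm{div}\, (u_2 \otimes u) \bigr). \]
Thus, the largest nonlinear term with non-zero average appears at $O(\zeta)$ only. We get
\[ \partial_t \Delta_x \psi^0 + \beta \partial_x \psi^0 + u^0 \cdot \nabla_x \eta_B + r \Delta_x \psi^0 + \overline{\nabla_X \eta_B \cdot \nabla_x^\perp \psi^1} + \overline{\nabla_x \eta_B \cdot \nabla_X^\perp \psi^1} = \beta\, \mathrm{curl}\, \tau, \]
where the bar stands for the average in the rough variable $\mathbf{X}$. Then,
\[ \overline{\nabla_X \eta_B \cdot \nabla_x^\perp \tilde\psi} = \int_{X,Y} \left( -\partial_X \eta_B\, \partial_y \tilde\psi + \partial_Y \eta_B\, \partial_x \tilde\psi \right) = \int_{X,Y} \left( \eta_B\, \partial_y \partial_X \tilde\psi - \eta_B\, \partial_x \partial_Y \tilde\psi \right) \]
\[ = \partial_x \int_{X,Y} \left( -\eta_B\, \partial_Y \tilde\psi \right) + \partial_y \int_{X,Y} \eta_B\, \partial_X \tilde\psi + \int_{X,Y} \partial_x \eta_B\, \partial_Y \tilde\psi - \int_{X,Y} \partial_y \eta_B\, \partial_X \tilde\psi \]
\[ = \mathrm{curl}_x \left( -\int_{X,Y} \eta_B \nabla_X \tilde\psi \right) - \overline{\nabla_x \eta_B \cdot \nabla_X^\perp \tilde\psi}, \]
so that we recover the limit system (19).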
4.2. The auxiliary systems. The aim of this section is to show that the systems formally derived above are well-posed. 4.2.1. Resolution of (18). In system (18), variables x, y are simply parameters, so that this system is very close to classical geostrophic equations. A simple Galerkin scheme leads to the existence and uniqueness of the solution ∈ L∞ (0, T ) × ; H 1 (Q) ∩ L2 0, T ; L∞ (; H 2 (Q) , ψ
T2
(t, ·) = 0. ψ
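4.2.2. Dissipativity of $\mathcal{F}$. We now prove Proposition 2.4. Let $U \in (L^\infty((0,T) \times \Omega))^2$, and $\tilde\psi$ the solution of (18). The dissipativity of $\mathcal{F}$ is equivalent to the inequality:
\[ \int_0^t \left( -\int_{X,Y} \eta_B \nabla_X \tilde\psi \right) \cdot U \ge 0. \tag{98} \]
An integration by parts gives
\[ \left( -\int_{X,Y} \eta_B \nabla_X \tilde\psi \right) \cdot U = \int_{X,Y} \tilde\psi\, (\nabla_X \eta_B \cdot U). \tag{99} \]
Now, we get from (18) that
\[ \int_0^t \int_{X,Y} \tilde\psi\, (\nabla_X \eta_B \cdot U) = \int_0^t \Bigl[ r \int_{X,Y} |\nabla_X \tilde\psi|^2 + \nu \int_{X,Y} |\Delta_X \tilde\psi|^2 + \frac{\partial_t}{2} \int_{X,Y} |\nabla_X \tilde\psi|^2 + \int_{X,Y} (\tilde u \cdot \nabla_X \tilde\psi)\, \Delta_X \tilde\psi + \int_{X,Y} (U \cdot \nabla_X \tilde\psi)\, \Delta_X \tilde\psi + \int_{X,Y} (-\tilde u \cdot \nabla_X \eta_B)\, \tilde\psi \Bigr] = \int_0^t \sum_{i=1}^{6} I_i. \tag{100} \]
• As $\tilde\psi(0, \cdot) \equiv 0$, $\int_0^t I_3 = \frac12 \int_{X,Y} |\nabla_X \tilde\psi(t, \cdot)|^2 \ge 0$.
• We do not detail the treatment of $I_4$, $I_5$, $I_6$. Integrations by parts lead to $I_4 = I_5 = I_6 = 0$, using that U does not depend on $\mathbf{X}$ and $\tilde u = \nabla_X^\perp \tilde\psi$.
Thus, we obtain inequality (98), which ends the proof.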
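The three cancellations can be checked directly. The sketch below assumes, reading (100), that $I_4 = \int (\tilde u \cdot \nabla_X \tilde\psi)\, \Delta_X \tilde\psi$, $I_5 = \int (U \cdot \nabla_X \tilde\psi)\, \Delta_X \tilde\psi$ and $I_6 = -\int (\tilde u \cdot \nabla_X \eta_B)\, \tilde\psi$, with periodic boundary conditions in (X, Y) and $\tilde u = \nabla_X^\perp \tilde\psi = (-\partial_Y \tilde\psi, \partial_X \tilde\psi)$.

```latex
\begin{align*}
I_4 &: \quad \tilde u \cdot \nabla_X \tilde\psi
  = -\partial_Y \tilde\psi\, \partial_X \tilde\psi
    + \partial_X \tilde\psi\, \partial_Y \tilde\psi = 0
  \quad \text{pointwise}, \\
I_5 &= -\int_{X,Y} \nabla_X \bigl( U \cdot \nabla_X \tilde\psi \bigr) \cdot \nabla_X \tilde\psi
  = -\frac12 \int_{X,Y} U \cdot \nabla_X |\nabla_X \tilde\psi|^2 = 0
  \quad (U \text{ independent of } \mathbf{X}), \\
I_6 &= -\frac12 \int_{X,Y} \nabla_X^{\perp} \bigl( \tilde\psi^2 \bigr) \cdot \nabla_X \eta_B
  = \frac12 \int_{X,Y} \tilde\psi^2\; \nabla_X^{\perp} \cdot \nabla_X \eta_B = 0 .
\end{align*}
```

The last identity uses $\nabla_X^\perp \cdot \nabla_X = -\partial_Y \partial_X + \partial_X \partial_Y = 0$. Since $I_1$ and $I_2$ are nonnegative and $\int_0^t I_3 \ge 0$, the sum in (100) is nonnegative, which is (98).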
4.2.3. Resolution of (19). System (19) can be written
\[ \mathrm{curl}_x \left( \partial_t u^0 + \nabla_x p^0 - (\eta_B + \beta y)(u^0)^\perp + r u^0 + \mathcal{F}(u^0) \right) = 0, \tag{101} \]
for any pressure $p^0$, so that it is a kind of "semilinear Euler equation". Therefore, as we do not manage to control $\|\mathrm{curl}_x \mathcal{F}(u^0)\|_\infty$, it seems difficult to prove the existence of global solutions. Hence, we look for existence of local in time strong solutions. The only difficult point is to check that $\mathcal{F}$ behaves well with respect to functions in the space $L^\infty(0, T; H^m(\mathbb{T}^2))^2$, $m > 1 = d/2$. Once we have appropriate bounds on $\mathcal{F}$, it is then routine to conclude through standard Galerkin or iterative schemes (see for instance [14] on classical Euler equations).

Let us fix $M > 0$, and denote by $B_T^\infty(0, M)$ the ball of radius M in $L^\infty((0, T) \times \mathbb{T}^2)^2$. We prove

Proposition 4.1. Let $m > 1$. There exists $T > 0$, such that $\mathcal{F}$ sends $B_T^\infty(0, M) \cap L^\infty(0, T; H^m(\mathbb{T}^2))^2$ to $L^\infty(0, T; H^m(\mathbb{T}^2))^2$, and is uniformly Lipschitzian on bounded subsets.

Remark. Note that we have to look at a solution $u^0$ in $(L^\infty((0, T) \times \Omega))^2$ in order to be able to pass to the limit in the nonlinear terms $u^1 \cdot \nabla_X (\Delta_X \psi^1) + u^0 \cdot \nabla_X (\Delta_X \psi^1)$ which are in the equation satisfied by $\psi^1$.
2 Remark. If u0 = ∇x⊥ ψ0 belongs to H m T2 ) , m > 1, and if we set M > u0 ∞ , Proposition 4.1 above allows us to build a solution in short time to (19), with u0 as initial data. Proof of Proposition 4.1. This proof relies on two lemmas. The first one is a generalization of the classical Schauder’s lemma about the behaviour of F (u) for u ∈ H m and F a smooth function (cf. [16]). Lemma 4.2. Let F be a C ∞ function from U ∈ (L∞ (0, T ))2 , U ∞ < M to (L∞ (0, T ))2 , such that F and its derivatives send bounded sets to bounded sets. Then F 2 2 to L∞ 0, T ; H m T2 , and is uniformly sends BT∞ (0, M)∩ L∞ 0, T ; H m T2 Lipschitzian on bounded subsets. The proof of this lemma is a straightforward adaptation of the proof of Schauder’s lemma [16]. For the sake of brevity, we skip it and refer to [16] for all necessary details. Let U ∈ (L∞ (0, T ))2 . We define G(U ) as the solution = (t, X, Y ) of ∂ + u · ∇X (X ) + U · ∇X (X ) + u · ∇X ηB + rX t X − ν (X )2 + U · ∇X ηB = 0, u = ∇X⊥ = (−∂Y , ∂X ).
(102)
As already mentioned in the proof of Theorem 2.3, such a system has a unique weak solution in the space E := L∞ 0, T ; H 1 (Q) ∩ L2 0, T ; H 2 (Q) , so that G is well defined as a function from (L∞ (0, T ))2 to E. The link between G and our functional F is made through the relation
ηB ∇X G u0 (·, x, y) (t, X, Y ) dX dY.
F(u )(t, x, y) = − 0
(103)
X,Y
Proposition 4.1 is then a direct consequence of Lemma 4.2 and of the following Lemma 4.3. There exists T > 0 such that G is a C ∞ function from u ∈ (L∞ (0, T ))2 , u ∞ < M to E. Moreover, G and its derivatives send bounded sets to bounded sets. Remark. This lemma can be seen as a result of smooth dependence on parameters for 2D Navier-Stokes equations. Proof of Lemma 4.3. We divide this proof into two parts
First part. We introduce an auxiliary function. Let T > 0. We define H : (L∞ (0, T ))2 × E × E → E, by: H(U, α, f ) is the solution of ∂ + U · ∇X (X ) + u · ∇X (X α) t X + u · ∇X ηB + rX − ν (X )2 = f, u = ∇X⊥ = (−∂Y , ∂X ).
(104)
We show that H is a smooth function and that each of its derivatives sends bounded sets to bounded sets. • Let t > 0. Multiplying by 1[0,t] and integrating both over Q and from 0 to t leads to the estimate ∇X (t, ·) 2L2
X
2
t
+r 0
∇(s, ·) 2L2 + ν X
t 0
X (s, ·) 2L2 = − f, 1[0,t] E ,E . X
(105) Hence, ∇X (t, ·) 2L2
X
2
t t ∇X (s, ·) 2L2 + ν X (s, ·) 2L2 +r X X 0 0
t
≤ C1 f E 0
1/2
X (t, ·) 2L2 X
+
sup ∇X (s, ·) L2
. (106)
X
0≤s≤t
Thus ∇X (t, ·) 2L2
X
2
t
+r 0
ν ∇(s, ·) 2L2 + X 2
≤ C2 f 2E + C1 f E
0
t
X (s, ·) 2L2 X
sup ∇X (s, ·) L2
X
0≤s≤t
(107)
.
Applying the last inequality for t from 0 to T yields sup 0≤t≤T
∇X (t, ·) 2L2
X
2
≤ C2 f 2E + C1 f E
sup ∇X (t, ·) L2
0≤t≤T
≤ C2 f 2E + C3 f 2E +
X
1 sup ∇X (t, ·) 2L2 . (108) X 4 0≤t≤T
Finally, we obtain H(U, α, f ) 2E ≤ C f 2E , where 2E = 2L∞ (0,T ;H 1 (Q)) + 2L2 (0,T ;H 2 (Q)) .
(109)
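The step from (107) to (108) and (109) is an absorption by Young's inequality. The sketch below uses two shorthands introduced here: $S$ for the supremum over $0 \le t \le T$ of the $L^2_X$ norm of the gradient of the solution, and $a := \|f\|_{E'}$. It assumes that (107) bounds half the squared gradient norm by $C_2 a^2 + C_1 a S$.

```latex
\begin{align*}
\tfrac12\, S^2 &\le C_2\, a^2 + C_1\, a\, S
  && \text{(sup over } t \text{ of (107))} \\
C_1\, a\, S &\le C_1^2\, a^2 + \tfrac14\, S^2
  && \text{(Young: } xy \le x^2 + \tfrac14 y^2\text{)} \\
\tfrac14\, S^2 &\le \bigl( C_2 + C_1^2 \bigr)\, a^2
  && \text{(so one may take } C_3 = C_1^2 \text{ in (108))}
\end{align*}
```

Inserting this bound on $S$ back into (107) also controls the two time integrals on its left-hand side, which gives an estimate of the form (109).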
• We may now show that H is a smooth function. For two elements (U, α, f ) and (U˜ , α, ˜ f˜) in the domain of H, we set = H(U˜ , α, − , = H(U, α, f ), ˜ f˜), χ = V = U˜ − U, b = α˜ − α, g = f˜ − f.
(110)
It is then straightforward to see that χ = H(U˜ , α, F ), where · ∇X (X b) + g. F = −V · ∇X (X ) − ∇X⊥ · ∇X (X b) belong to E . First, for all φ ∈ E, Note that both V · ∇X (X ) and ∇X⊥ we have T T V · ∇X (X ) φ = V · ∇X φ (X ) 0
0
Q
Q
≤ V L∞ (0,T ) ∇φ L2 (0,T ×) (X ) L2 (0,T ×) ≤ C V L∞ (0,T ) E φ E . The other term is treated in the following way: T T ⊥ ⊥ ∇X · ∇X (X b) φ = ∇X · ∇X φ (X b) 0
0
Q
Q
· ∇X φ L2 (0,T ×) . ≤ X b L2 (0,T ×) ∇X⊥ Besides, · ∇X φ 2 2 ∇X⊥ ≤ L (0,T ×)
T 0
|2 |∇X φ|2 |∇X
Q T
≤
0
(t, ·) 2 4 ∇X φ(t, ·) 2 4 . ∇X L L
The Gagliardo-Nirenberg inequality yields T · ∇X φ 2 2 (t, ·) ∇X⊥ ≤ ∇X L (0,T ×) 0
(t, ·) L2 ∇X φ(t, ·) L2 X φ(t, ·) L2 × L2 X L∞ (0,T ;L2 ) X L2 (0,T ;L2 ) ∇X φ L∞ (0,T ;L2 ) X φ L2 (0,T ;L2 ) . ≤ ∇X Thus, we get
0
T
Q
· ∇X (X b) φ ≤ E b E φ E . ∇X⊥
(111)
Finally, V · ∇X (X ) E ≤ V L∞ (0,T ) E ,
· ∇X (X b) E ≤ E b E . ∇X⊥ (112)
Using estimates (109), (112), we deduce

$$\|H(\tilde U, \tilde\alpha, \tilde f) - H(U, \alpha, f)\|^2 \le C \left( \|V\|^2 + \|b\|^2 + \|g\|^2 \right), \qquad (113)$$

where $\|\cdot\|$ stands for the various appropriate norms. Thus, the function H is Lipschitz and sends bounded sets to bounded sets. One checks easily that we may also write

$$\chi = H(U, \alpha, F_1) + H(U, \alpha, F_2), \qquad (114)$$

where

$$F_1 = -\nabla^\perp \Psi \cdot \nabla_X(\Delta_X b) - V \cdot \nabla_X(\Delta_X \Psi) + g, \qquad F_2 = -\nabla^\perp \chi \cdot \nabla_X(\Delta_X b) - V \cdot \nabla_X(\Delta_X \chi).$$

Note that we have used the linearity of H in its last variable. Thanks to estimates (109) and (113) we have $H(U, \alpha, F_2) = O(\|V\|^2 + \|b\|^2 + \|g\|^2)$. We have thus proved that H is differentiable, together with the formula $dH(U, \alpha, f)(V, b, g) = H(U, \alpha, F_1)$. If we introduce the (smooth) bilinear map $\mathcal{Q}$ and the linear map $\mathcal{L}$ respectively defined by

$$\mathcal{Q}(\Psi, (V, b, g)) = -\nabla^\perp \Psi \cdot \nabla_X(\Delta_X b) - V \cdot \nabla_X(\Delta_X \Psi), \qquad \mathcal{L}(V, b, g) = g,$$

we have the relation

$$dH(U, \alpha, f)(\cdot) = H\big(U, \alpha, \mathcal{Q}(H(U, \alpha, f), \cdot) + \mathcal{L}(\cdot)\big). \qquad (115)$$

Thanks to this last relation, one can use a bootstrap argument to show that H is smooth and that each of its derivatives sends bounded sets to bounded sets.

Second part. We link G to H. We first note the identity $G(U) = H(U, G(U), -U \cdot \nabla_X \eta_B)$. Thus, to prove that $G(U)$ is smooth (for T small enough), and that all its derivatives send bounded sets to bounded sets, it is enough to prove, for all U, the invertibility of the derivative of the function $K : \alpha \mapsto \alpha - H(U, \alpha, -U \cdot \nabla_X \eta_B)$. Using (115) leads to $dK[\alpha](\cdot) = \mathrm{Id} - H(U, \alpha, B(\Psi, \cdot))$, where

$$B(\Psi, b) = -\nabla^\perp \Psi \cdot \nabla_X(\Delta_X b), \qquad \Psi = H(U, \alpha, -U \cdot \nabla_X \eta_B).$$
Bounds (109) and (112) yield $|||H(U, \alpha, B(\Psi, \cdot))||| \le \|\Psi\|_E$. In the same spirit as (109), one can prove easily that $\|\Psi\|_E^2 \le C\, |U|^2\, \|\eta_B\|^2_{L^2(Q)}\, T$, so that

$$|||H(U, \alpha, B(\Psi, \cdot))||| \le \sqrt{C}\, M\, \|\eta_B\|_{L^2(Q)}\, \sqrt{T}.$$

Thus, if $\sqrt{C}\, M\, \|\eta_B\|_{L^2(Q)}\, \sqrt{T} < 1$, then $dK[\alpha]$ is invertible, which ends the proof of the proposition.
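The final step of the proof uses the standard fact that Id − A is invertible as soon as the operator norm of A is smaller than 1, the inverse being given by the Neumann series Id + A + A² + ⋯. The following is only a finite-dimensional sketch of that fact: the 2×2 matrix `A` is a hypothetical stand-in for the operator $b \mapsto H(U, \alpha, B(\Psi, b))$, and the helper names are illustrative, not taken from the paper.

```python
def identity(n):
    """n x n identity matrix as a list of rows."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_mul(A, B):
    """Product of two square matrices of the same size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_sub(A, B):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def row_sum_norm(A):
    """Operator norm induced by the sup norm (maximum absolute row sum)."""
    return max(sum(abs(x) for x in row) for row in A)

def neumann_inverse(A, terms=200):
    """Approximate (I - A)^{-1} by the partial sum I + A + ... + A^terms.

    The series converges when some operator norm of A is < 1.
    """
    n = len(A)
    total = identity(n)
    power = identity(n)
    for _ in range(terms):
        power = mat_mul(power, A)
        total = mat_add(total, power)
    return total

# Hypothetical small operator with norm 0.3 < 1.
A = [[0.2, 0.1], [0.0, 0.3]]
approx_inverse = neumann_inverse(A)
residual = mat_sub(mat_mul(mat_sub(identity(2), A), approx_inverse), identity(2))
```

In the proof the role of the smallness condition `row_sum_norm(A) < 1` is played by $\sqrt{C}\, M\, \|\eta_B\|_{L^2(Q)} \sqrt{T} < 1$, which is why T has to be taken small.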
4.3. Convergence results. As usual, on the basis of the previous analysis, one can construct an approximate solution $u^\zeta_{app} = \nabla_x^\perp \psi^\zeta_{app}$, where $\psi^\zeta_{app}$ is given by (96). Theorem 2.6 then follows from an energy estimate on $v = u^\zeta - u^\zeta_{app}$. Note that

$$u^\zeta_{app} \approx \nabla_x^\perp \psi^0 + \nabla_X \Psi^1, \qquad \nabla_x u^\zeta_{app} \approx \frac{1}{\zeta}\, \nabla_X^2 \Psi^1,$$

so that we have the following bound on the nonlinear term:

$$\|v \cdot \nabla_x u^\zeta_{app}(t, \cdot)\|_{L^2_x} \le \frac{C}{\zeta}\, \|v\|_{L^2_x}.$$

All other terms are easy to handle, so that we do not give further details.

Acknowledgements. This work has been partially supported by the GDR “Amplitude Equations and Qualitative Properties” (GDR CNRS 2103: EAQP) and by the IDOPT project in Grenoble.
Communicated by P. Constantin
Commun. Math. Phys. 253, 121–155 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1175-7
Communications in
Mathematical Physics
Poisson Geometrical Symmetries Associated to Non-Commutative Formal Diffeomorphisms
Fabio Gavarini
Dipartimento di Matematica, Università degli Studi di Roma “Tor Vergata”, Via della Ricerca Scientifica 1, 00133 Roma, Italy. E-mail:
[email protected] Received: 14 September 2003 / Accepted: 20 April 2004 Published online: 5 November 2004 – © Springer-Verlag 2004
Abstract: Let $\mathcal{G}^{\mathrm{dif}}$ be the group of all formal power series starting with $x$ with coefficients in a field $k$ of zero characteristic (with the composition product), and let $F[\mathcal{G}^{\mathrm{dif}}]$ be its function algebra. In [BF] a non-commutative, non-cocommutative graded Hopf algebra $\mathcal{H}^{\mathrm{dif}}$ was introduced via a direct process of “disabelianisation” of $F[\mathcal{G}^{\mathrm{dif}}]$, taking the like presentation of the latter as an algebra but dropping the commutativity constraint. In this paper we apply a general method to provide four one-parameter deformations of $\mathcal{H}^{\mathrm{dif}}$, which are quantum groups whose semiclassical limits are Poisson geometrical symmetries such as Poisson groups or Lie bialgebras, namely two quantum function algebras and two quantum universal enveloping algebras. In particular the two Poisson groups are extensions of $\mathcal{G}^{\mathrm{dif}}$, isomorphic as proalgebraic Poisson varieties but not as proalgebraic groups.

“A series of outlaws joined and formed the Nottingham group, whose renowned chieftain was the famous Robin Hopf”
N. Barbecue, “Robin Hopf”

Introduction

The most general notion of “symmetry” in mathematics is encoded in the notion of Hopf algebra. Then, among all Hopf algebras (over a field $k$), there are two special families which are of relevant interest for their geometrical meaning: assuming for simplicity that $k$ has zero characteristic, these are the function algebras $F[G]$ of algebraic groups $G$ and the universal enveloping algebras $U(\mathfrak{g})$ of Lie algebras $\mathfrak{g}$. Function algebras are exactly those Hopf algebras which are commutative, and enveloping algebras those which are connected (in the general sense of Hopf algebra theory) and cocommutative.

Given a Hopf algebra $H$, encoding some generalized symmetry, one can ask whether there are any other Hopf algebras “close” to $H$, which are of either one of the above mentioned geometrical types, hence encoding geometrical symmetries associated to $H$.
The answer is affirmative: namely (see [Ga4]), it is possible to give functorial recipes to get out of any Hopf algebra $H$ two pairs of Hopf algebras of geometrical type, say $(F[G_+], U(\mathfrak{g}_-))$ and $(F[K_+], U(\mathfrak{k}_-))$. Moreover, the algebraic groups thus obtained are connected Poisson groups, and the Lie algebras are Lie bialgebras; therefore in both cases Poisson geometry is involved. In addition, the two pairs above are related to each other by Poisson duality (see below), thus only either one of them is truly relevant. Finally, these four “geometrical” Hopf algebras are “close” to $H$ in that they are 1-parameter deformations (with pairwise isomorphic fibers) of a quotient or a subalgebra of $H$.

The method above to associate Poisson geometrical Hopf algebras to general Hopf algebras, called “Crystal Duality Principle” (CDP in short), is explained in detail in [Ga4]. It is a special instance of a more general result, the “Global Quantum Duality Principle” (GQDP in short), explained in [Ga2, Ga3], which in turn is a generalization of the “Quantum Duality Principle” due to Drinfeld (cf. [Dr], §7, and see [Ga1] for a proof).

Drinfeld’s QDP deals with quantum universal enveloping algebras (QUEAs in short) and quantum formal series Hopf algebras (QFSHAs in short) over the ring of formal power series $k[[\hbar]]$. A QUEA is any topologically free, topological Hopf $k[[\hbar]]$–algebra whose quotient modulo $\hbar$ is the universal enveloping algebra $U(\mathfrak{g})$ of some Lie algebra $\mathfrak{g}$; in this case we denote the QUEA by $U_\hbar(\mathfrak{g})$. Instead, a QFSHA is any topological Hopf $k[[\hbar]]$–algebra of type $k[[\hbar]]^S$ (as a $k[[\hbar]]$–module, $S$ being a set) whose quotient modulo $\hbar$ is the function algebra $F[[G]]$ of some formal algebraic group $G$; then we denote the QFSHA by $F_\hbar[[G]]$. The QDP claims that the category of all QUEAs and the category of all QFSHAs are equivalent, and provides an equivalence in either direction.

From QFSHAs to QUEAs it goes as follows: given a QFSHA, say $F_\hbar[[G]]$, let $J$ be its augmentation ideal (the kernel of its counit map) and set $F_\hbar[[G]]^\vee := \sum_{n \ge 0} \hbar^{-n} J^n$. Then $F_\hbar[[G]] \mapsto F_\hbar[[G]]^\vee$ defines (on objects) a functor from QFSHAs to QUEAs. To go the other way round, i.e. from QUEAs to QFSHAs, one uses a perfectly dual recipe. Namely, given a QUEA, say $U_\hbar(\mathfrak{g})$, let again $J$ be its augmentation ideal; for each $n \in \mathbb{N}$, let $\delta_n$ be the composition of the $n$–fold iterated coproduct followed by the projection onto $J^{\otimes n}$ (this makes sense since $U_\hbar(\mathfrak{g}) = k[[\hbar]] \cdot 1_{U_\hbar(\mathfrak{g})} \oplus J$): then set $U_\hbar(\mathfrak{g})' := \bigcap_{n \ge 0} \delta_n^{-1}\big(\hbar^n\, U_\hbar(\mathfrak{g})^{\otimes n}\big)$, or more explicitly $U_\hbar(\mathfrak{g})' := \{ \eta \in U_\hbar(\mathfrak{g}) \mid \delta_n(\eta) \in \hbar^n\, U_\hbar(\mathfrak{g})^{\otimes n}, \ \forall n \in \mathbb{N} \}$. Then $U_\hbar(\mathfrak{g}) \mapsto U_\hbar(\mathfrak{g})'$ defines (on objects) a functor from QUEAs to QFSHAs. The functors $(\ )^\vee$ and $(\ )'$ are inverse to each other, hence they provide the claimed equivalence.

Note that the objects (QUEAs and QFSHAs) involved in the QDP are quantum groups; their semiclassical limits then are endowed with Poisson structures: namely, every $U(\mathfrak{g})$ is in fact a co-Poisson Hopf algebra and every $F[[G]]$ is a (topological) Poisson Hopf algebra. The geometrical structures they describe are then Lie bialgebras and Poisson groups. The QDP then brings further information: namely, the semiclassical limit of the image of a given quantum group is Poisson dual to the Poisson geometrical object we start from. In short

$$F_\hbar[[G]]^\vee \big/ \hbar\, F_\hbar[[G]]^\vee = U(\mathfrak{g}^\times), \quad \text{i.e. (roughly)} \quad F_\hbar[[G]]^\vee = U_\hbar(\mathfrak{g}^\times), \qquad (I.1)$$

where $\mathfrak{g}^\times$ is the cotangent Lie bialgebra of the Poisson group $G$, and

$$U_\hbar(\mathfrak{g})' \big/ \hbar\, U_\hbar(\mathfrak{g})' = F[[G^\star]], \quad \text{i.e. (roughly)} \quad U_\hbar(\mathfrak{g})' = F_\hbar[[G^\star]], \qquad (I.2)$$
where $G^\star$ is a connected Poisson group with cotangent Lie bialgebra $\mathfrak{g}$. So the QDP involves both Hopf duality (switching enveloping and function algebras) and Poisson duality.

The generalization from QDP to GQDP stems from a simple observation: the construction of Drinfeld’s functors need not start from quantum groups! Indeed, in order to define either $H^\vee$ or $H'$ one only needs that $H$ be a torsion-free Hopf algebra over some 1-dimensional domain $R$ and $\hbar \in R$ be any non-zero prime (actually, even less is truly necessary, see [Ga2, Ga3]). On the other hand, the outcome still is, in both cases, a “quantum group”, now meant in a new sense. Namely, a QUEA now will be any torsion-free Hopf algebra $H$ over $R$ such that $H / \hbar H \cong U(\mathfrak{g})$, for some Lie (bi)algebra $\mathfrak{g}$. Also, instead of QFSHAs we consider “quantum function algebras”, QFAs in short: here a QFA will be any torsion-free Hopf algebra $H$ over $R$ such that $H / \hbar H \cong F[G]$ (plus one additional technical condition) for some connected (Poisson) group $G$. In this new framework Drinfeld’s recipes give that $H^\vee$ is a QUEA and $H'$ is a QFA, whatever is the torsion-free Hopf $R$–algebra $H$ one starts from. Moreover, when restricted to quantum groups Drinfeld’s functors $(\ )^\vee$ and $(\ )'$ again provide equivalences of quantum group categories, respectively from QFAs to QUEAs and vice versa; then Poisson duality is involved once more, like in (I.1–2).

Therefore, the generalization process from the QDP to the GQDP spreads over several concerns. Arithmetically, one can take as $(\hbar)$ any non-generic point of the spectrum of $R$, and define Drinfeld’s functors and specializations accordingly; in particular, the corresponding quotient field $k_\hbar := R / \hbar R$ might have positive characteristic. Geometrically, one considers algebraic groups rather than formal groups, i.e. global vs. local objects. Algebraically, one drops any topological worry ($\hbar$–adic completeness, etc.), and deals with general Hopf algebras rather than with quantum groups.

This last point is the one of most concern to us now, in that it means that we have (functorial) recipes to get several quantum groups, hence – taking semiclassical limits – Poisson geometrical symmetries, springing out of the “generalized symmetry” encoded by a torsion-free Hopf algebra $H$ over $R$: namely, for each non-trivial point of the spectrum of $R$, the quantum groups $H^\vee$ and $H'$ given by the corresponding Drinfeld’s functors. Note, however, that a priori nothing prevents any of these $H^\vee$ or $H'$ or their semiclassical limits from being (essentially) trivial.

The CDP comes out when looking at Hopf algebras over a field $k$, and then applying the GQDP to their scalar extensions $H[\hbar] := k[\hbar] \otimes_k H$ with $R := k[\hbar]$ (and $\hbar := \hbar$ itself). A first application of Drinfeld’s functors to $H_\hbar := H[\hbar]$ followed by specialization at $\hbar = 0$ provides the pair $(F[G_+], U(\mathfrak{g}_-))$ mentioned above: in a nutshell,

$$\big(F[G_+],\, U(\mathfrak{g}_-)\big) = \Big( H_\hbar'\big|_{\hbar=0},\ H_\hbar^\vee\big|_{\hbar=0} \Big), \quad \text{where hereafter } X\big|_{\hbar=0} := X / \hbar X.$$

Then applying once more Drinfeld’s functors to $H_\hbar^\vee$ and to $H_\hbar'$ and specializing at $\hbar = 0$ yields the pair $(F[K_+], U(\mathfrak{k}_-))$, namely

$$\big(F[K_+],\, U(\mathfrak{k}_-)\big) = \Big( (H_\hbar^\vee)'\big|_{\hbar=0},\ (H_\hbar')^\vee\big|_{\hbar=0} \Big).$$

Finally, the very last part of the GQDP explained before implies that $K_+ = G_-^\star$ and $\mathfrak{k}_- = \mathfrak{g}_+^\times$.

While in the second step above one really needs the full strength of the GQDP, for the first step instead it turns out that the construction of Drinfeld’s functors on $H[\hbar]$ can be fully “tracked through” and described at the “classical level”, i.e. in terms of $H$ alone. In addition, the exact relationship among $H$ and the pair $(F[G_+], U(\mathfrak{g}_-))$ can be made quite clear, and more information is available about this pair. We now sketch it in some detail.
Let $J$ be the augmentation ideal of $H$, let $\underline{J} := \{J^n\}_{n \in \mathbb{N}}$ be the associated (decreasing) $J$–adic filtration, $\hat{H} := G_{\underline{J}}(H)$ the associated graded vector space and $H^\vee := H \big/ \bigcap_{n \in \mathbb{N}} J^n$. One can prove that $\underline{J}$ is a Hopf algebra filtration, hence $\hat{H}$ is a graded Hopf algebra. The latter happens to be connected and cocommutative, so $\hat{H} \cong U(\mathfrak{g}_-)$ for some Lie algebra $\mathfrak{g}_-$; in addition, since $\hat{H}$ is graded also $\mathfrak{g}_-$ itself is graded as a Lie algebra. The fact that $\hat{H}$ is cocommutative allows one to define on it a Poisson cobracket which makes $\hat{H}$ into a graded co-Poisson Hopf algebra; eventually, this implies that $\mathfrak{g}_-$ is a Lie bialgebra. The outcome is that our $U(\mathfrak{g}_-)$ is just $\hat{H}$.

On the other hand, one considers a second (increasing) filtration defined in a dual manner to $\underline{J}$, namely $\underline{D} := \{D_n := \mathrm{Ker}(\delta_{n+1})\}_{n \in \mathbb{N}}$. Let now $\tilde{H} := G_{\underline{D}}(H)$ be the associated graded vector space and $H' := \bigcup_{n \in \mathbb{N}} D_n$. Again, one shows that $\underline{D}$ is a Hopf algebra filtration, hence $\tilde{H}$ is a graded Hopf algebra. Moreover, the latter is commutative, so $\tilde{H} = F[G_+]$ for some algebraic group $G_+$. One proves also that $\tilde{H} = F[G_+]$ has no non-trivial idempotents, thus $G_+$ is connected; in addition, since $\tilde{H}$ is graded, $G_+$ as a variety is just an affine space. The fact that $\tilde{H}$ is commutative allows one to define on it a Poisson bracket which makes $\tilde{H}$ into a graded Poisson Hopf algebra: this means that $G_+$ is an algebraic Poisson group. Thus eventually $F[G_+]$ is just $\tilde{H}$.

The relationship among $H$ and the “geometrical” Hopf algebras $\hat{H}$ and $\tilde{H}$ can be expressed in terms of “reduction steps” and regular 1-parameter deformations, namely

$$\tilde{H} \ \xleftrightarrow[\ R^\hbar_{\underline{D}}(H)\ ]{\ 0 \leftarrow \hbar \rightarrow 1\ } \ H' \hookrightarrow H \twoheadrightarrow H^\vee \ \xleftrightarrow[\ R^\hbar_{\underline{J}}(H^\vee)\ ]{\ 1 \leftarrow \hbar \rightarrow 0\ } \ \hat{H} \qquad (I.3)$$

where one-way arrows are Hopf algebra morphisms and two-way arrows are regular 1-parameter deformations of Hopf algebras, realized through the Rees Hopf algebras $R^\hbar_{\underline{D}}(H)$ and $R^\hbar_{\underline{J}}(H^\vee)$ associated to the filtration $\underline{D}$ of $H$ and to the filtration $\underline{J}$ of $H^\vee$. Hereafter “regular” for a deformation means that all its fibers are pairwise isomorphic as vector spaces. In classical terms, (I.3) comes directly from the construction above; on the other hand, in terms of the GQDP it comes from the fact that $R^\hbar_{\underline{D}}(H) = H_\hbar'$ and $R^\hbar_{\underline{J}}(H^\vee) = H_\hbar^\vee$.

As we mentioned above, the next step is the “application” of (suitable) Drinfeld functors to the Rees algebras $R^\hbar_{\underline{D}}(H) = H_\hbar'$ and $R^\hbar_{\underline{J}}(H^\vee) = H_\hbar^\vee$ occurring in (I.3). The outcome is a second frame of regular 1-parameter deformations for $H'$ and $H^\vee$, namely

$$U(\mathfrak{g}_+^\times) = U(\mathfrak{k}_-) \ \xleftrightarrow[\ (H_\hbar')^\vee\ ]{\ 0 \leftarrow \hbar \rightarrow 1\ } \ H' \hookrightarrow H \twoheadrightarrow H^\vee \ \xleftrightarrow[\ (H_\hbar^\vee)'\ ]{\ 1 \leftarrow \hbar \rightarrow 0\ } \ F[K_+] = F[G_-^\star] \qquad (I.4)$$

which is the analogue of (I.3). In particular, when $H^\vee = H = H'$ from (I.3) and (I.4) together we find $H$ as the mid-point of four deformation families, whose “external points” are Hopf algebras of “Poisson geometrical” type, namely

$$\begin{array}{c} U(\mathfrak{g}_-) \ \xleftrightarrow[\ H_\hbar^\vee\ ]{\ 0 \leftarrow \hbar \rightarrow 1\ } \ H \ \xleftrightarrow[\ (H_\hbar^\vee)'\ ]{\ 1 \leftarrow \hbar \rightarrow 0\ } \ F[G_-^\star] \\[2mm] F[G_+] \ \xleftrightarrow[\ H_\hbar'\ ]{\ 0 \leftarrow \hbar \rightarrow 1\ } \ H \ \xleftrightarrow[\ (H_\hbar')^\vee\ ]{\ 1 \leftarrow \hbar \rightarrow 0\ } \ U(\mathfrak{g}_+^\times) \end{array} \qquad (✠)$$

which gives four different regular 1-parameter deformations from $H$ to Hopf algebras encoding Poisson geometrical objects. Then each of these four Hopf algebras may be
thought of as a semiclassical geometrical counterpart of the “generalized symmetry” encoded by $H$.

The purpose of the present paper is to show the effectiveness of the CDP, applying it to a key example, the Hopf algebra of non-commutative formal diffeomorphisms of the line. Indeed, the interest of the latter, besides its own reasons, grows bigger as we can see it as a toy model for a broad family of Hopf algebras of great concern in mathematical physics, non-commutative geometry and beyond. Now I present the results of this paper.

Let $\mathcal{G}^{\mathrm{dif}}$ be the set of all formal power series starting with $x$ with coefficients in a field $k$ of zero characteristic. Endowed with the composition product, this is an infinite dimensional prounipotent proalgebraic group – known as the “(normalised) Nottingham group” among group-theorists and the “(normalised) group of formal diffeomorphisms of the line” among mathematical physicists – whose tangent Lie algebra is a special subalgebra of the one-sided Witt algebra. The function algebra $F[\mathcal{G}^{\mathrm{dif}}]$ is a graded, commutative Hopf algebra with countably many generators, which admits a neat combinatorial description.

In [BF] a non-commutative version of $F[\mathcal{G}^{\mathrm{dif}}]$ is introduced: this is a non-commutative non-cocommutative Hopf algebra $\mathcal{H}^{\mathrm{dif}}$ which is presented exactly like $F[\mathcal{G}^{\mathrm{dif}}]$ but dropping commutativity, i.e. taking the presentation as one of a unital associative – and not commutative – algebra; in other words, $\mathcal{H}^{\mathrm{dif}}$ is the outcome of applying to $F[\mathcal{G}^{\mathrm{dif}}]$ a raw “disabelianization” process. In particular, $H = \mathcal{H}^{\mathrm{dif}}$ is graded and verifies $H^\vee = H = H'$, hence the scheme (✠) makes sense and yields four Poisson symmetries associated to $\mathcal{H}^{\mathrm{dif}}$.

Note that in each line in (✠) there is essentially only one Poisson geometry involved, since Poisson duality relates mutually opposite sides; thus any classical symmetry on the same line carries as much information as the other one (but for global-to-local differences). Nevertheless, in the case of $H = \mathcal{H}^{\mathrm{dif}}$ we shall prove that the pieces of information from either line in (✠) are complementary, because $G_+$ and $G_-^\star$ happen to be isomorphic as proalgebraic Poisson varieties but not as groups. In particular, we find that the Lie bialgebras $\mathfrak{g}_-$ and $\mathfrak{g}_+^\times$ are both isomorphic as Lie algebras to the free Lie algebra $\mathcal{L}(\mathbb{N}_+)$ over a countable set, but they have different, non-isomorphic Lie coalgebra structures. Moreover, $G_-^\star \cong \mathcal{G}^{\mathrm{dif}} \times \mathcal{N} \cong G_+$ as Poisson varieties, where $\mathcal{N}$ is a proaffine Poisson variety whose coordinate functions are in bijection with a basis of the derived subalgebra $\mathcal{L}(\mathbb{N}_+)'$; indeed, the latter are obtained by iterated Poisson brackets of coordinate functions on $\mathcal{G}^{\mathrm{dif}}$, in short because both $F[G_-^\star]$ and $F[G_+]$ are freely generated as Poisson algebras by a copy of $F[\mathcal{G}^{\mathrm{dif}}]$. For $G_-^\star$ we have a more precise result, namely $G_-^\star \cong \mathcal{G}^{\mathrm{dif}} \ltimes \mathcal{N}$ (a semidirect product) as proalgebraic groups: thus in a sense $G_-^\star$ is the free Poisson group over $\mathcal{G}^{\mathrm{dif}}$, which geometrically speaking is obtained by “pasting” to $\mathcal{G}^{\mathrm{dif}}$ all 1-parameter subgroups freely obtained via iterated Poisson brackets of those of $\mathcal{G}^{\mathrm{dif}}$; in particular, these Poisson brackets iteratively yield 1-parameter subgroups which generate $\mathcal{N}$.

We perform the same analysis simultaneously for $\mathcal{G}^{\mathrm{dif}}$, for its subgroup of odd formal diffeomorphisms and for all the groups $\mathcal{G}_\nu$ of truncated (at order $\nu \in \mathbb{N}_+$) formal diffeomorphisms, whose projective limit is $\mathcal{G}^{\mathrm{dif}}$ itself; mutatis mutandis, the results are the same.

The case of $\mathcal{H}^{\mathrm{dif}}$ is just one of many samples of the same type: indeed, several cases of Hopf algebras built out of combinatorial data – graphs, trees, Feynman diagrams, etc. – have been introduced in (co)homological theories (see e.g. [LR] and [Fo1, Fo2], and references therein) and in renormalization studies (see [CK1, CK2, CK3]); in most cases these algebras – or their (graded) duals – are commutative polynomial, like $F[\mathcal{G}^{\mathrm{dif}}]$,
and admit non-commutative analogues (thanks to [Fo1, Fo2]), so our discussion applies almost verbatim to them too, with like results. Thus the given analysis of the “toy model” Hopf algebra $\mathcal{H}^{\mathrm{dif}}$ can be taken as a general pattern for all those cases.
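1. Notation and Terminology

1.1. The classical data. Let $k$ be a fixed field of zero characteristic. Consider the set $\mathcal{G}^{\mathrm{dif}} := \big\{ x + \sum_{n \ge 1} a_n x^{n+1} \,\big|\, a_n \in k \ \forall n \in \mathbb{N}_+ \big\}$ of all formal series starting with $x$: endowed with the composition product, this is a group, which can be seen as the group of all “formal diffeomorphisms” $f : k \to k$ such that $f(0) = 0$ and $f'(0) = 1$ (i.e. tangent to the identity), also known as the Nottingham group (see, e.g., [Ca] and references therein). In fact, $\mathcal{G}^{\mathrm{dif}}$ is an infinite dimensional (pro)affine algebraic group, whose function algebra $F[\mathcal{G}^{\mathrm{dif}}]$ is generated by the coordinate functions $a_n$ ($n \in \mathbb{N}_+$). Giving to each $a_n$ the weight¹ $\partial(a_n) := n$, we have that $F[\mathcal{G}^{\mathrm{dif}}]$ is an $\mathbb{N}$–graded Hopf algebra, with polynomial structure $F[\mathcal{G}^{\mathrm{dif}}] = k[a_1, a_2, \dots, a_n, \dots]$ and Hopf algebra structure given by

$$\Delta(a_n) = a_n \otimes 1 + 1 \otimes a_n + \sum_{m=1}^{n-1} a_m \otimes Q^m_{n-m}(a_*), \qquad \epsilon(a_n) = 0,$$

$$S(a_n) = -a_n - \sum_{m=1}^{n-1} a_m\, S\big(Q^m_{n-m}(a_*)\big) = -a_n - \sum_{m=1}^{n-1} S(a_m)\, Q^m_{n-m}(a_*),$$

where $Q^\ell_t(a_*) := \sum_{k=1}^{t} \binom{\ell+1}{k} P^{(k)}_t(a_*)$ and $P^{(k)}_t(a_*) := \sum_{j_1, \dots, j_k > 0,\ j_1 + \cdots + j_k = t} a_{j_1} \cdots a_{j_k}$ (the symmetric monic polynomial of weight $t$ and degree $k$ in the indeterminates $a_j$’s) for all $t, k, \ell \in \mathbb{N}_+$, and the formula for $S(a_n)$ gives the antipode by recursion. From now on, to simplify notation we shall write $\mathcal{G} := \mathcal{G}^{\mathrm{dif}}$ and $\mathcal{G}_\infty := \mathcal{G} = \mathcal{G}^{\mathrm{dif}}$. Note also that the tangent Lie algebra of $\mathcal{G}^{\mathrm{dif}}$ is just the Lie subalgebra $W_1^{\ge 1} = \mathrm{Span}\big(\{d_n \mid n \in \mathbb{N}_+\}\big)$ of the one-sided Witt algebra $W_1 := \mathrm{Der}\big(k[t]\big) = \mathrm{Span}\big(\{ d_n := t^{n+1} \tfrac{d}{dt} \mid n \in \mathbb{N} \cup \{-1\} \}\big)$.

¹ We say weight instead of degree because we save the latter term for the degree of polynomials.

In addition, for all $\nu \in \mathbb{N}_+$ the subset $\mathcal{G}^\nu := \{ f \in \mathcal{G} \mid a_n(f) = 0, \ \forall n \le \nu \}$ is a normal subgroup of $\mathcal{G}$; the corresponding quotient group $\mathcal{G}_\nu := \mathcal{G} / \mathcal{G}^\nu$ is unipotent, with dimension $\nu$ and function algebra $F[\mathcal{G}_\nu]$ (isomorphic to) the Hopf subalgebra of $F[\mathcal{G}]$ generated by $a_1, \dots, a_\nu$. In fact, the $\mathcal{G}^\nu$’s form exactly the lower central series of $\mathcal{G}$ (cf. [Je]). Moreover, $\mathcal{G}$ is (isomorphic to) the inverse (or projective) limit of these quotient groups $\mathcal{G}_\nu$ ($\nu \in \mathbb{N}_+$), hence $\mathcal{G}$ is pro-unipotent; conversely, $F[\mathcal{G}]$ is the direct (or inductive) limit of the direct system of its graded Hopf subalgebras $F[\mathcal{G}_\nu]$ ($\nu \in \mathbb{N}_+$).

Finally, the set $\mathcal{G}^{\mathrm{odd}} := \{ f \in \mathcal{G}^{\mathrm{dif}} \mid a_{2n-1}(f) = 0 \ \forall n \in \mathbb{N}_+ \}$ is another normal subgroup of $\mathcal{G}^{\mathrm{dif}}$ (the group of odd formal diffeomorphisms² after [CK3]), whose function algebra $F[\mathcal{G}^{\mathrm{odd}}]$ is (isomorphic to) the quotient Hopf algebra $F[\mathcal{G}^{\mathrm{dif}}] \big/ \big(\{a_{2n-1}\}_{n \in \mathbb{N}_+}\big)$. The latter has the following description: denoting again the cosets of the $a_{2n}$’s with the like symbol, we have $F[\mathcal{G}^{\mathrm{odd}}] = k[a_2, a_4, \dots, a_{2n}, \dots]$ with Hopf algebra structure

$$\Delta(a_{2n}) = a_{2n} \otimes 1 + 1 \otimes a_{2n} + \sum_{m=1}^{n-1} a_{2m} \otimes \bar{Q}^m_{n-m}(a_{2*}), \qquad \epsilon(a_{2n}) = 0,$$

$$S(a_{2n}) = -a_{2n} - \sum_{m=1}^{n-1} a_{2m}\, S\big(\bar{Q}^m_{n-m}(a_{2*})\big) = -a_{2n} - \sum_{m=1}^{n-1} S(a_{2m})\, \bar{Q}^m_{n-m}(a_{2*}),$$

where $\bar{Q}^\ell_t(a_{2*}) := \sum_{k=1}^{t} \binom{2\ell+1}{k} \bar{P}^{(k)}_t(a_{2*})$ and $\bar{P}^{(k)}_t(a_{2*}) := \sum_{j_1, \dots, j_k > 0,\ j_1 + \cdots + j_k = t} a_{2j_1} \cdots a_{2j_k}$ for all $t, k, \ell \in \mathbb{N}_+$.

² The fixed-point set of the group homomorphism $\Phi : \mathcal{G} \to \mathcal{G}$, $f \mapsto \Phi(f)$, where $\Phi(f)(x) := -f(-x)$.

For each $\nu \in \mathbb{N}_+$ we can consider also the normal subgroup $\mathcal{G}^\nu \cap \mathcal{G}^{\mathrm{odd}}$ and the corresponding quotient $\mathcal{G}^{\mathrm{odd}}_\nu := \mathcal{G}^{\mathrm{odd}} \big/ \big(\mathcal{G}^\nu \cap \mathcal{G}^{\mathrm{odd}}\big)$: then $F[\mathcal{G}^{\mathrm{odd}}_\nu]$ is (isomorphic to) the quotient Hopf algebra $F[\mathcal{G}_\nu] \big/ \big(\{a_{2n-1}\}_{(2n-1) \in \mathbb{N}_\nu}\big)$, in particular it is the Hopf sub-algebra of $F[\mathcal{G}^{\mathrm{odd}}]$ generated by $a_2, \dots, a_{2[\nu/2]}$. All the $F[\mathcal{G}^{\mathrm{odd}}_\nu]$’s are graded Hopf (sub)algebras forming a direct system with direct limit $F[\mathcal{G}^{\mathrm{odd}}]$; conversely, the $\mathcal{G}^{\mathrm{odd}}_\nu$’s form an inverse system with inverse limit $\mathcal{G}^{\mathrm{odd}}$. In the sequel we write $\mathcal{G}^+ := \mathcal{G}^{\mathrm{odd}}$ and $\mathcal{G}^+_\nu := \mathcal{G}^{\mathrm{odd}}_\nu$.

For each $\nu \in \mathbb{N}_+$, set $\mathbb{N}_\nu := \{1, \dots, \nu\}$; set also $\mathbb{N}_\infty := \mathbb{N}_+$. For each $\nu \in \mathbb{N}_+ \cup \{\infty\}$, let $\mathcal{L}_\nu = \mathcal{L}(\mathbb{N}_\nu)$ be the free Lie algebra over $k$ generated by $\{x_n\}_{n \in \mathbb{N}_\nu}$ and let $U_\nu = U(\mathcal{L}_\nu)$ be its universal enveloping algebra; let also $V_\nu = V(\mathbb{N}_\nu)$ be the $k$–vector space with basis $\{x_n\}_{n \in \mathbb{N}_\nu}$, and let $T_\nu = T(V_\nu)$ be its associated tensor algebra. Then there are canonical identifications $U(\mathcal{L}_\nu) = T(V_\nu) = k\big\langle \{x_n \mid n \in \mathbb{N}_\nu\} \big\rangle$, the latter being the unital $k$–algebra of non-commutative polynomials in the set of indeterminates $\{x_n\}_{n \in \mathbb{N}_\nu}$, and $\mathcal{L}_\nu$ is just the Lie subalgebra of $U_\nu = T_\nu$ generated by $\{x_n\}_{n \in \mathbb{N}_\nu}$. Moreover, $\mathcal{L}_\nu$ has a basis $B_\nu$ made of Lie monomials in the $x_n$’s ($n \in \mathbb{N}_\nu$), like $[x_{n_1}, x_{n_2}]$, $[[x_{n_1}, x_{n_2}], x_{n_3}]$, $[[[x_{n_1}, x_{n_2}], x_{n_3}], x_{n_4}]$, etc.: details can be found e.g. in [Re], Ch. 4–5. In the sequel I shall use these identifications with no further mention. We consider on $U(\mathcal{L}_\nu)$ the standard Hopf algebra structure given by $\Delta(x) = x \otimes 1 + 1 \otimes x$, $\epsilon(x) = 0$, $S(x) = -x$ for all $x \in \mathcal{L}_\nu$, which is also determined by the same formulas for $x \in \{x_n\}_{n \in \mathbb{N}_\nu}$ alone. By construction $\nu \le \mu$ implies $\mathcal{L}_\nu \subseteq \mathcal{L}_\mu$, whence the $\mathcal{L}_\nu$’s form a direct system (of Lie algebras) whose direct limit is exactly $\mathcal{L}_\infty$; similarly, $U(\mathcal{L}_\infty)$ is the direct limit of all the $U(\mathcal{L}_\nu)$’s. Finally, with $\mathbb{B}_\nu$ we shall mean the obvious PBW-like basis of $U(\mathcal{L}_\nu)$ w.r.t. some fixed total order $\preceq$ of $B_\nu$, namely $\mathbb{B}_\nu := \big\{ x_{\underline{b}} \,\big|\, \underline{b} = b_1 \cdots b_k;\ b_1, \dots, b_k \in B_\nu;\ b_1 \preceq \cdots \preceq b_k \big\}$.

The same construction applies to make out “odd” objects, based on $\{x_n\}_{n \in \mathbb{N}^+_\nu}$, with $\mathbb{N}^+_\nu := \mathbb{N}_\nu \cap 2\mathbb{N}$ ($\nu \in \mathbb{N}_+ \cup \{\infty\}$), instead of $\{x_n\}_{n \in \mathbb{N}_\nu}$: $\mathcal{L}^+_\nu = \mathcal{L}(\mathbb{N}^+_\nu)$, $U^+_\nu = U(\mathcal{L}^+_\nu)$, $V^+_\nu = V(\mathbb{N}^+_\nu)$, $T^+_\nu = T(V^+_\nu)$, with the obvious canonical identifications $U(\mathcal{L}^+_\nu) = T(V^+_\nu) = k\big\langle \{x_n \mid n \in \mathbb{N}^+_\nu\} \big\rangle$; moreover, $\mathcal{L}^+_\nu$ has a basis $B^+_\nu$ made of Lie monomials in the $x_n$’s ($n \in \mathbb{N}^+_\nu$), etc. The $\mathcal{L}^+_\nu$’s form a direct system whose direct limit is $\mathcal{L}^+_\infty$, and $U(\mathcal{L}^+_\infty)$ is the direct limit of all the $U(\mathcal{L}^+_\nu)$’s.

Warning. In the sequel, we shall often deal with subsets $\{y_b\}_{b \in B_\nu}$ (of some algebra) in bijection with $B_\nu$, the fixed basis of $\mathcal{L}_\nu$. Then we shall write things like $y_\lambda$ with $\lambda \in \mathcal{L}_\nu$: this means we extend the bijection $\{y_b\}_{b \in B_\nu} \cong B_\nu$ to $\mathrm{Span}\big(\{y_b\}_{b \in B_\nu}\big) \cong \mathcal{L}_\nu$ by linearity, so that $y_\lambda \cong \sum_{b \in B_\nu} c_b\, y_b$ iff $\lambda = \sum_{b \in B_\nu} c_b\, b$ ($c_b \in k$). The same kind of convention will be applied with $B^+_\nu$ instead of $B_\nu$ and $\mathcal{L}^+_\nu$ instead of $\mathcal{L}_\nu$.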
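The coproduct and antipode formulas of §1.1 can be checked numerically on truncated series. The sketch below is not part of the paper: it assumes that $\Delta(a_n)$, evaluated on a pair $(f, g)$, returns the coefficient $a_n(f \circ g)$, with $f$ the outer series, and it uses exact rational arithmetic. The function names (`compose_direct`, `compose_via_coproduct`, `inverse_via_antipode`) are illustrative.

```python
from fractions import Fraction
from math import comb

def truncated_mul(p, q, deg):
    """Product of two polynomials (coefficient lists), truncated at degree deg."""
    r = [Fraction(0)] * (deg + 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if i + j > deg:
                break
            r[i + j] += pi * qj
    return r

def series(a, N):
    """Coefficient list of x + sum_{n=1..N} a[n] x^(n+1), degrees 0..N+1."""
    return [Fraction(0), Fraction(1)] + [Fraction(a[n]) for n in range(1, N + 1)]

def compose_direct(a, b, N):
    """Coordinates a_n(f o g) by brute force: f o g = g + sum_m a_m g^(m+1)."""
    g = series(b, N)
    result, power = list(g), list(g)
    for m in range(1, N + 1):
        power = truncated_mul(power, g, N + 1)          # power = g^(m+1)
        result = [r + a[m] * p for r, p in zip(result, power)]
    return {n: result[n + 1] for n in range(1, N + 1)}

def P(t, k, b):
    """P_t^(k): sum of b_{j1}...b_{jk} over j1+...+jk = t with all ji > 0."""
    if k == 0:
        return Fraction(1) if t == 0 else Fraction(0)
    return sum((b[j] * P(t - j, k - 1, b) for j in range(1, t - k + 2)), Fraction(0))

def Q(l, t, b):
    """Q_t^l = sum_{k=1..t} binom(l+1, k) P_t^(k)."""
    return sum((comb(l + 1, k) * P(t, k, b) for k in range(1, t + 1)), Fraction(0))

def compose_via_coproduct(a, b, N):
    """Same coordinates, read off from the coproduct formula for a_n."""
    return {n: a[n] + b[n] + sum((a[m] * Q(m, n - m, b) for m in range(1, n)), Fraction(0))
            for n in range(1, N + 1)}

def inverse_via_antipode(a, N):
    """Coordinates of the compositional inverse via S(a_n) = -a_n - sum S(a_m) Q^m_{n-m}."""
    s = {}
    for n in range(1, N + 1):
        s[n] = -a[n] - sum((s[m] * Q(m, n - m, a) for m in range(1, n)), Fraction(0))
    return s
```

For instance, with `a = {1: Fraction(1, 2), 2: Fraction(2, 3)}` the recursion gives `inverse_via_antipode(a, 2)[2] == 2 * a[1]**2 - a[2]`, the familiar third-order coefficient of an inverse series. Since the coordinates are evaluated on numbers, this only tests the commutative algebra $F[\mathcal{G}^{\mathrm{dif}}]$, not its non-commutative analogue.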
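1.2. The noncommutative Hopf algebra of formal diffeomorphisms. For all $\nu \in \mathbb{N}_+ \cup \{\infty\}$, let $\mathcal{H}_\nu$ be the Hopf $k$–algebra given as follows: as a $k$–algebra it is simply $\mathcal{H}_\nu := k\big\langle \{a_n \mid n \in \mathbb{N}_\nu\} \big\rangle$ (the $k$–algebra of non-commutative polynomials in the set of indeterminates $\{a_n\}_{n \in \mathbb{N}_\nu}$), and its Hopf algebra structure is given by (for all $n \in \mathbb{N}_\nu$)

$$\begin{aligned} \Delta(a_n) &= a_n \otimes 1 + 1 \otimes a_n + \sum_{m=1}^{n-1} a_m \otimes Q^m_{n-m}(a_*), \qquad \epsilon(a_n) = 0, \\ S(a_n) &= -a_n - \sum_{m=1}^{n-1} a_m\, S\big(Q^m_{n-m}(a_*)\big) = -a_n - \sum_{m=1}^{n-1} S(a_m)\, Q^m_{n-m}(a_*), \end{aligned} \qquad (1.1)$$

(notation like in §1.1) where the latter formula yields the antipode by recursion. Moreover, $\mathcal{H}_\nu$ is in fact an $\mathbb{N}$–graded Hopf algebra, once generators have been given degree – in the sequel called weight – by the rule $\partial(a_n) := n$ (for all $n \in \mathbb{N}_\nu$). By construction the various $\mathcal{H}_\nu$’s (for all $\nu \in \mathbb{N}_+$) form a direct system, whose direct limit is $\mathcal{H}_\infty$: the latter was originally introduced³ in [BF], §5.1 (with $k = \mathbb{C}$), under the name $\mathcal{H}^{\mathrm{dif}}$.

³ However, the formulas in [BF] give the opposite coproduct, hence change the antipode accordingly; we made the present choice to make these formulas “fit well” with those for $F[\mathcal{G}^{\mathrm{dif}}]$ (see below).

Similarly, for all $\nu \in \mathbb{N}_+ \cup \{\infty\}$ we set $\mathcal{K}_\nu := k\big\langle \{a_n \mid n \in \mathbb{N}^+_\nu\} \big\rangle$ (where $\mathbb{N}^+_\nu := \mathbb{N}_\nu \cap (2\mathbb{N})$): this bears a Hopf algebra structure given by (for all $2n \in \mathbb{N}^+_\nu$)

$$\Delta(a_{2n}) = a_{2n} \otimes 1 + 1 \otimes a_{2n} + \sum_{m=1}^{n-1} a_{2m} \otimes \bar{Q}^m_{n-m}(a_{2*}), \qquad \epsilon(a_{2n}) = 0,$$

$$S(a_{2n}) = -a_{2n} - \sum_{m=1}^{n-1} a_{2m}\, S\big(\bar{Q}^m_{n-m}(a_{2*})\big) = -a_{2n} - \sum_{m=1}^{n-1} S(a_{2m})\, \bar{Q}^m_{n-m}(a_{2*})$$

(notation of §1.1). Indeed, this is an $\mathbb{N}$–graded Hopf algebra where generators have degree – called weight – given by $\partial(a_n) := n$ (for all $n \in \mathbb{N}^+_\nu$). All the $\mathcal{K}_\nu$’s form a direct system with direct limit $\mathcal{K}_\infty$. Finally, for each $\nu \in \mathbb{N}_+$ there is a graded Hopf algebra epimorphism $\mathcal{H}_\nu \twoheadrightarrow \mathcal{K}_\nu$ given by $a_{2n} \mapsto a_{2n}$, $a_{2m+1} \mapsto 0$ for all $2n, 2m+1 \in \mathbb{N}_\nu$.

Definitions and §1.1 imply that

$$(\mathcal{H}_\nu)_{ab} := \mathcal{H}_\nu \big/ \big([\mathcal{H}_\nu, \mathcal{H}_\nu]\big) \cong F[\mathcal{G}_\nu], \quad \text{via } a_n \mapsto a_n \ \ \forall n \in \mathbb{N}_\nu,$$

as $\mathbb{N}$–graded Hopf algebras: in other words, the abelianization of $\mathcal{H}_\nu$ is nothing but $F[\mathcal{G}_\nu]$. Thus in a sense one can think of $\mathcal{H}_\nu$ as a non-commutative version (indeed, the “coarsest” one) of $F[\mathcal{G}_\nu]$, hence as a “quantization” of $\mathcal{G}_\nu$ itself: however, this is not a quantization in the usual sense, because $F[\mathcal{G}_\nu]$ is attained through abelianization, not via specialization of some deformation parameter. Similarly we have also

$$(\mathcal{K}_\nu)_{ab} := \mathcal{K}_\nu \big/ \big([\mathcal{K}_\nu, \mathcal{K}_\nu]\big) \cong F[\mathcal{G}^+_\nu], \quad \text{via } a_{2n} \mapsto a_{2n} \ \ \forall\, 2n \in \mathbb{N}^+_\nu,$$

as $\mathbb{N}$–graded Hopf algebras: in other words, the abelianization of $\mathcal{K}_\nu$ is just $F[\mathcal{G}^+_\nu]$.

In the following I make the analysis explicit for $\mathcal{H}_\nu$, the case $\mathcal{K}_\nu$ being the same (details are left to the reader); I drop the subscript $\nu$, which stands fixed, and write $\mathcal{H} := \mathcal{H}_\nu$.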
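1.3. Deformations. Let $\hbar$ be an indeterminate. In this paper we shall consider several Hopf algebras over $k[\hbar]$, which can also be seen as 1-parameter depending families of Hopf algebras over $k$, the parameter being $\hbar$; each $k$–algebra in such a family can then be thought of as a 1-parameter deformation of any other object in the same family. As a matter of notation, if $H$ is such a Hopf $k[\hbar]$–algebra I call the fibre of $H$ (thought of as a deformation) any Hopf $k$–algebra of type $H / p(\hbar) H$ for some irreducible $p(\hbar) \in k[\hbar]$; in particular $H\big|_{\hbar=c} := H / (\hbar - c) H$, for any $c \in k$, is called specialization of $H$ at $\hbar = c$.

We start from $\mathcal{H}_\hbar := \mathcal{H}[\hbar] \equiv k[\hbar] \otimes_k \mathcal{H}$: this is indeed a Hopf $k[\hbar]$–algebra, namely $\mathcal{H}_\hbar = k[\hbar]\big\langle \{a_n \mid n \in \mathbb{N}_\nu\} \big\rangle$ with Hopf structure given by (1.1) again. Set also $\mathcal{H}(\hbar) := k(\hbar) \otimes_{k[\hbar]} \mathcal{H}_\hbar = k(\hbar) \otimes_k \mathcal{H} = k(\hbar)\big\langle \{a_n \mid n \in \mathbb{N}_\nu\} \big\rangle$, a Hopf $k(\hbar)$–algebra ruled by (1.1) too.

2. The Rees Deformation $\mathcal{H}_\hbar^\vee$

2.1. The goal. The crystal duality principle (cf. [Ga2], §5, or [Ga4]) yields a recipe to produce a 1-parameter deformation $\mathcal{H}_\hbar^\vee$ of $\mathcal{H}$ which is a quantized universal enveloping algebra (QUEA in the sequel): namely, $\mathcal{H}_\hbar^\vee$ is a Hopf $k[\hbar]$–algebra such that $\mathcal{H}_\hbar^\vee\big|_{\hbar=1} = \mathcal{H}$ and $\mathcal{H}_\hbar^\vee\big|_{\hbar=0} = U(\mathfrak{g}_-)$, the universal enveloping algebra of a graded Lie bialgebra $\mathfrak{g}_-$. Thus $\mathcal{H}_\hbar^\vee$ is a quantization of $U(\mathfrak{g}_-)$, and the quantum symmetry $\mathcal{H}$ is a deformation of the classical Poisson symmetry $U(\mathfrak{g}_-)$. By definition $\mathcal{H}_\hbar^\vee$ is the Rees algebra associated to a distinguished decreasing Hopf algebra filtration of $\mathcal{H}$, so that $U(\mathfrak{g}_-)$ is just the graded Hopf algebra associated to this filtration. The purpose of this section is to describe explicitly $\mathcal{H}_\hbar^\vee$ and its semiclassical limit $U(\mathfrak{g}_-)$, hence also $\mathfrak{g}_-$ itself. This will also provide a direct, independent proof of all the above mentioned results about $\mathcal{H}_\hbar^\vee$ and $U(\mathfrak{g}_-)$ themselves.

2.2. The Rees algebra $\mathcal{H}_\hbar^\vee$. Let $J := \mathrm{Ker}\big(\epsilon_{\mathcal{H}} : \mathcal{H} \to k\big)$ be the augmentation ideal of $\mathcal{H}$, and let $\underline{J} := \{J^n\}_{n \in \mathbb{N}}$ be the $J$–adic filtration in $\mathcal{H}$. It is easy to show (see [Ga4]) that $\underline{J}$ is a Hopf algebra filtration of $\mathcal{H}$; since $\mathcal{H}$ is graded connected we have $J = \mathcal{H}_+ := \bigoplus_{n \in \mathbb{N}_+} \mathcal{H}^{(n)}$ (where $\mathcal{H}^{(n)}$ is the $n$-th homogeneous component of $\mathcal{H}$), whence $\bigcap_{n \in \mathbb{N}} J^n = \{0\}$ and $\mathcal{H}^\vee := \mathcal{H} \big/ \bigcap_{n \in \mathbb{N}} J^n \cong \mathcal{H}$. We let the Rees algebra associated to $\underline{J}$ be

$$\mathcal{H}_\hbar^\vee := k[\hbar] \cdot \sum_{n \ge 0} \hbar^{-n} J^n = \sum_{n \ge 0} k[\hbar]\, \hbar^{-n} \cdot J^n = \sum_{n \ge 0} k[\hbar] \big(\hbar^{-1} \cdot J\big)^n \ \subseteq \ \mathcal{H}(\hbar). \qquad (2.1)$$

Letting $J_\hbar := \mathrm{Ker}\big(\epsilon_{\mathcal{H}_\hbar} : \mathcal{H}_\hbar \to k[\hbar]\big) = k[\hbar] \cdot J$ (the augmentation ideal of $\mathcal{H}_\hbar$) one has

$$\mathcal{H}_\hbar^\vee = \sum_{n \ge 0} \hbar^{-n} J_\hbar^{\,n} = \sum_{n \ge 0} \big(\hbar^{-1} J_\hbar\big)^n \ \subseteq \ \mathcal{H}(\hbar).$$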
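For all $n \in \mathbb{N}_\nu$, set $x_n := \hbar^{-1}a_n$; clearly $\mathcal{H}_\hbar^\vee$ is the $k[\hbar]$–subalgebra of $\mathcal{H}(\hbar)$ generated by $J^\vee := \hbar^{-1}J$, hence by $\{x_n\}_{n\in\mathbb{N}_\nu}$, so $\mathcal{H}_\hbar^\vee = k[\hbar]\langle\{x_n \mid n\in\mathbb{N}_\nu\}\rangle$. Moreover,
$$\begin{aligned}
\Delta(x_n) &= x_n\otimes 1 + 1\otimes x_n + \sum_{m=1}^{n-1}\sum_{k=1}^{m}\hbar^{k}\binom{n-m+1}{k}\,x_{n-m}\otimes P_m^{(k)}(x_*), \qquad \epsilon(x_n) = 0,\\
S(x_n) &= -x_n - \sum_{m=1}^{n-1}\sum_{k=1}^{m}\hbar^{k}\binom{n-m+1}{k}\,x_{n-m}\,S\big(P_m^{(k)}(x_*)\big)\\
&= -x_n - \sum_{m=1}^{n-1}\sum_{k=1}^{m}\hbar^{k}\binom{n-m+1}{k}\,S(x_{n-m})\,P_m^{(k)}(x_*)
\end{aligned}\tag{2.2}$$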
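for all $n \in \mathbb{N}_\nu$, due to (1.1). From this one sees by hand that the following holds:

Proposition 2.1. Formulas (2.2) make $\mathcal{H}_\hbar^\vee = k[\hbar]\langle\{x_n \mid n\in\mathbb{N}_\nu\}\rangle$ into a graded Hopf $k[\hbar]$–algebra, embedded into $\mathcal{H}(\hbar) := k(\hbar)\otimes_k\mathcal{H}$ as a graded Hopf subalgebra. Moreover, $\mathcal{H}_\hbar^\vee$ is a deformation of $\mathcal{H}$, for its specialization at $\hbar = 1$ is isomorphic to $\mathcal{H}$, i.e.
$$\mathcal{H}_\hbar^\vee\big|_{\hbar=1} := \mathcal{H}_\hbar^\vee\big/(\hbar-1)\mathcal{H}_\hbar^\vee \;\cong\; \mathcal{H} \qquad\text{via}\qquad x_n \bmod (\hbar-1)\mathcal{H}_\hbar^\vee \mapsto a_n \quad (\forall\, n\in\mathbb{N}_\nu)$$
as graded Hopf algebras over $k$.

Remark. The previous result shows that $\mathcal{H}_\hbar^\vee$ is a deformation of $\mathcal{H}$, which is recovered as a specialization (of $\mathcal{H}_\hbar^\vee$) at $\hbar = 1$. The next result instead shows that $\mathcal{H}_\hbar^\vee$ is also a deformation of $U(L_\nu)$, recovered as specialization at $\hbar = 0$. Altogether, this gives the top-left horizontal arrow in the frame () in the Introduction for $H = \mathcal{H}$ ($:= \mathcal{H}_\nu$), with $\mathfrak{g}_- = L_\nu$.

Theorem 2.1. $\mathcal{H}_\hbar^\vee$ is a QUEA at $\hbar = 0$. Namely, the specialization limit of $\mathcal{H}_\hbar^\vee$ at $\hbar = 0$ is $\mathcal{H}_\hbar^\vee\big|_{\hbar=0} := \mathcal{H}_\hbar^\vee\big/\hbar\,\mathcal{H}_\hbar^\vee \cong U(L_\nu)$ via $x_n \bmod \hbar\,\mathcal{H}_\hbar^\vee \mapsto x_n$ for all $n\in\mathbb{N}_\nu$, thus inducing on $U(L_\nu)$ the structure of a co-Poisson Hopf algebra uniquely provided by the Lie bialgebra structure on $L_\nu$ given by $\delta(x_n) = \sum_{\ell=1}^{n-1}(\ell+1)\,x_\ell\wedge x_{n-\ell}$ (for all $n\in\mathbb{N}_\nu$).${}^{4}$ In particular in the diagram () for $H = \mathcal{H}$ ($= \mathcal{H}_\nu$) we have $\mathfrak{g}_- = L_\nu$. Finally, the grading $d$ given by $d(x_n) := 1$ ($n\in\mathbb{N}_+$) makes $\mathcal{H}_\hbar^\vee\big|_{\hbar=0} \cong U(L_\nu)$ into a graded co-Poisson Hopf algebra; similarly, the grading $\partial$ given by $\partial(x_n) := n$ ($n\in\mathbb{N}_+$) makes $\mathcal{H}_\hbar^\vee\big|_{\hbar=0} \cong U(L_\nu)$ into a graded Hopf algebra and $L_\nu$ into a graded Lie bialgebra.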
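${}^{4}$ Hereafter, I use the notation $a\wedge b := a\otimes b - b\otimes a$.

Proof. First observe that since $\mathcal{H}_\hbar^\vee = k[\hbar]\langle\{x_n \mid n\in\mathbb{N}_\nu\}\rangle$ and $U(L_\nu) = T(V_\nu) = k\langle\{x_n \mid n\in\mathbb{N}_\nu\}\rangle$, mapping $x_n \bmod \hbar\,\mathcal{H}_\hbar^\vee \mapsto x_n$ ($\forall\, n\in\mathbb{N}_\nu$) does really define an isomorphism of algebras $\mathcal{H}_\hbar^\vee\big/\hbar\,\mathcal{H}_\hbar^\vee \cong U(L_\nu)$. Second, formulas (2.2) give $\Delta(x_n) \equiv x_n\otimes 1 + 1\otimes x_n \bmod \hbar\,\mathcal{H}_\hbar^\vee\otimes\mathcal{H}_\hbar^\vee$, $\epsilon(x_n) \equiv 0 \bmod \hbar\,k[\hbar]$, $S(x_n) \equiv -x_n \bmod \hbar\,\mathcal{H}_\hbar^\vee$ for all $n\in\mathbb{N}_\nu$; comparing with the standard Hopf structure of $U(L_\nu)$, this shows that the above isomorphism is an isomorphism of Hopf algebras too. Finally, as $\mathcal{H}_\hbar^\vee\big|_{\hbar=0}$ is cocommutative, a Poisson co-bracket is defined on it by the standard recipe used in quantum group theory, namely
$$\delta(x_n) := \hbar^{-1}\big(\Delta(x_n) - \Delta^{\mathrm{op}}(x_n)\big) \bmod \hbar\,\mathcal{H}_\hbar^\vee\otimes\mathcal{H}_\hbar^\vee = \sum_{m=1}^{n-1}\binom{n-m+1}{1}\,x_{n-m}\wedge P_m^{(1)}(x_*) = \sum_{\ell=1}^{n-1}(\ell+1)\,x_\ell\wedge x_{n-\ell} \qquad \forall\, n\in\mathbb{N}_\nu.$$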
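The last equality in the proof can be checked mechanically. The sketch below is only an illustration: the dictionary encoding of tensors $x_i\otimes x_j$ and the helper names `wedge`, `cobracket_from_coproduct` and `cobracket_closed_form` are assumptions made here, not notation from the paper. It uses $P_m^{(1)}(x_*) = x_m$, expands both sums in the basis $x_i\otimes x_j$, and confirms that they agree; as a by-product, the coefficient of $x_i\otimes x_j$ (with $i+j=n$) turns out to be $i-j$.

```python
from collections import defaultdict
from math import comb


def wedge(i, j, coeff, acc):
    """Add coeff * (x_i ^ x_j) = coeff * (x_i (x) x_j - x_j (x) x_i) to acc."""
    acc[(i, j)] += coeff
    acc[(j, i)] -= coeff


def cobracket_from_coproduct(n):
    """sum_{m=1}^{n-1} C(n-m+1, 1) x_{n-m} ^ P_m^{(1)}(x_*), using P_m^{(1)}(x_*) = x_m."""
    acc = defaultdict(int)
    for m in range(1, n):
        wedge(n - m, m, comb(n - m + 1, 1), acc)
    return {key: c for key, c in acc.items() if c != 0}


def cobracket_closed_form(n):
    """sum_{l=1}^{n-1} (l+1) x_l ^ x_{n-l}."""
    acc = defaultdict(int)
    for l in range(1, n):
        wedge(l, n - l, l + 1, acc)
    return {key: c for key, c in acc.items() if c != 0}
```

For instance, `cobracket_closed_form(3)` encodes $\delta(x_3) = 2\,x_1\wedge x_2 + 3\,x_2\wedge x_1 = x_2\otimes x_1 - x_1\otimes x_2$.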
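3. The Drinfeld's Deformation $(\mathcal{H}_\hbar^\vee)'$

3.1. The goal. The second step in the crystal duality principle is to build a second deformation based upon the Rees deformation $\mathcal{H}_\hbar^\vee$. This will be a new Hopf $k[\hbar]$–algebra $(\mathcal{H}_\hbar^\vee)'$, contained in $\mathcal{H}_\hbar^\vee$, which for $\hbar = 1$ specializes to $\mathcal{H}$ and for $\hbar = 0$ specializes to $F[K_+]$, the function algebra of some connected Poisson group $K_+$; in other words, $(\mathcal{H}_\hbar^\vee)'\big|_{\hbar=1} = \mathcal{H}$ and $(\mathcal{H}_\hbar^\vee)'\big|_{\hbar=0} = F[K_+]$, the latter meaning that $(\mathcal{H}_\hbar^\vee)'$ is a quantized function algebra (QFA in the sequel). Therefore $(\mathcal{H}_\hbar^\vee)'$ is a quantization of $F[K_+]$, and the quantum symmetry $\mathcal{H}$ is a deformation of the classical Poisson symmetry $F[K_+]$. In addition, the general theory also describes the relationship between $K_+$ and the Lie bialgebra $\mathfrak{g}_- = L_\nu$ in §2.1, which is $L_\nu = \mathrm{coLie}(K_+)$, so that we can write $K_+ = G_{L_\nu}$. Comparing with §2.1, one eventually concludes that the quantum symmetry encoded by $\mathcal{H}$ is intermediate between the two classical Poisson symmetries ruled by $G_{L_\nu}$ and $L_\nu$. In this section I describe explicitly $(\mathcal{H}_\hbar^\vee)'$ and its semiclassical limit $F[G_-]$, hence $G_-$ itself too. This yields a direct proof of all above mentioned results about $(\mathcal{H}_\hbar^\vee)'$ and $G_-$.

3.2. Drinfeld's $\delta_\bullet$–maps. Let $H$ be any Hopf algebra (over a ring $R$). For every $n\in\mathbb{N}$, define the iterated coproduct $\Delta^n : H \to H^{\otimes n}$ by $\Delta^0 := \epsilon$, $\Delta^1 := \mathrm{id}_H$, and finally $\Delta^n := \big(\Delta\otimes\mathrm{id}_H^{\otimes(n-2)}\big)\circ\Delta^{n-1}$ if $n \ge 2$. For any ordered subset $\Sigma = \{i_1,\dots,i_k\} \subseteq \{1,\dots,n\}$ with $i_1 < \cdots < i_k$, define the linear map $j_\Sigma : H^{\otimes k} \to H^{\otimes n}$ by $j_\Sigma(a_1\otimes\cdots\otimes a_k) := b_1\otimes\cdots\otimes b_n$ with $b_i := 1$ if $i\notin\Sigma$ and $b_{i_m} := a_m$ for $1\le m\le k$; then set $\Delta_\Sigma := j_\Sigma\circ\Delta^k$, $\Delta_\emptyset := \Delta^0$, and $\delta_\Sigma := \sum_{\Sigma'\subseteq\Sigma}(-1)^{|\Sigma|-|\Sigma'|}\,\Delta_{\Sigma'}$, $\delta_\emptyset := \epsilon$. The inverse formula $\Delta_\Sigma = \sum_{\Sigma'\subseteq\Sigma}\delta_{\Sigma'}$ also holds. We shall also use the shorthand notation $\delta_0 := \delta_\emptyset$, $\delta_n := \delta_{\{1,2,\dots,n\}}$ for $n\in\mathbb{N}_+$.

The following properties of the maps $\delta_\Sigma$ will be used:
(a) $\delta_n = (\mathrm{id}_H - u\circ\epsilon)^{\otimes n}\circ\Delta^n$ for all $n\in\mathbb{N}_+$, where $u : R\to H$ is the unit map;
(b) the maps $\delta_n$ are coassociative, that is $\big(\mathrm{id}_H^{\otimes s}\otimes\delta_\ell\otimes\mathrm{id}_H^{\otimes(n-1-s)}\big)\circ\delta_n = \delta_{n+\ell-1}$ for all $n,\ell,s\in\mathbb{N}$, $0\le s\le n-1$, and similarly in general for the maps $\delta_\Sigma$;
(c) $\delta_\Sigma(ab) = \sum_{\Lambda\cup Y=\Sigma}\delta_\Lambda(a)\,\delta_Y(b)$ for all finite subsets $\Sigma\subseteq\mathbb{N}$ and all $a,b\in H$;
(d) $\delta_\Sigma(ab-ba) = \sum_{\Lambda\cup Y=\Sigma,\ \Lambda\cap Y\ne\emptyset}\big(\delta_\Lambda(a)\,\delta_Y(b) - \delta_Y(b)\,\delta_\Lambda(a)\big)$ for all $\Sigma\ne\emptyset$ and $a,b\in H$.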
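To make the definitions of §3.2 concrete, here is a small sketch on a toy example. The Hopf algebra used is an assumption chosen only for illustration: the polynomial algebra $k[x]$ with $x$ primitive, so that $\Delta^k(x^m)$ is a sum over weak compositions of $m$ with multinomial coefficients. It is not the algebra $\mathcal{H}$ of the paper, and all function names are hypothetical. A tensor $x^{e_1}\otimes\cdots\otimes x^{e_n}$ is stored as the exponent tuple $(e_1,\dots,e_n)$. The code computes $\delta_n$ both by inclusion-exclusion over subsets of $\{1,\dots,n\}$ and via property (a), and the two can be compared.

```python
from collections import defaultdict
from itertools import combinations
from math import factorial


def weak_compositions(m, k):
    """Yield all (e_1, ..., e_k) with every e_i >= 0 and e_1 + ... + e_k = m."""
    if k == 0:
        if m == 0:
            yield ()
        return
    for first in range(m + 1):
        for rest in weak_compositions(m - first, k - 1):
            yield (first,) + rest


def iterated_coproduct(m, k):
    """Delta^k(x^m) as {exponent tuple: multinomial coefficient}; Delta^0 is the counit."""
    out = {}
    for comp in weak_compositions(m, k):
        coeff = factorial(m)
        for e in comp:
            coeff //= factorial(e)
        out[comp] = coeff
    return out


def delta_subset(m, n, subset):
    """Delta_S(x^m) in the n-fold tensor power: Delta^{|S|} in the slots of S, 1 elsewhere."""
    out = {}
    for comp, coeff in iterated_coproduct(m, len(subset)).items():
        slots = [0] * n
        for pos, e in zip(subset, comp):
            slots[pos - 1] = e  # positions in S are 1-based
        out[tuple(slots)] = coeff
    return out


def delta_inclusion_exclusion(m, n):
    """delta_n(x^m) as the signed sum of Delta_S over all subsets S of {1, ..., n}."""
    acc = defaultdict(int)
    for k in range(n + 1):
        for subset in combinations(range(1, n + 1), k):
            sign = (-1) ** (n - k)
            for key, coeff in delta_subset(m, n, subset).items():
                acc[key] += sign * coeff
    return {key: c for key, c in acc.items() if c != 0}


def delta_direct(m, n):
    """delta_n(x^m) via property (a): (id - u.eps) kills every tensor factor equal to 1."""
    return {key: c for key, c in iterated_coproduct(m, n).items()
            if all(e > 0 for e in key)}
```

In this toy algebra one finds $\delta_m(x^m) = m!\,x^{\otimes m}$ and $\delta_n(x^m) = 0$ for $n > m$, so $x^m$ lies in $\mathrm{Ker}(\delta_{m+1})$ but not in $\mathrm{Ker}(\delta_m)$.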
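3.3. Drinfeld's algebra $(\mathcal{H}_\hbar^\vee)'$. Using Drinfeld's $\delta_\bullet$–maps of §3.2, we define
$$(\mathcal{H}_\hbar^\vee)' := \Big\{\eta\in\mathcal{H}_\hbar^\vee \;\Big|\; \delta_n(\eta)\in\hbar^n\big(\mathcal{H}_\hbar^\vee\big)^{\otimes n}\ \ \forall\, n\in\mathbb{N}\Big\} \subseteq \mathcal{H}_\hbar^\vee. \tag{3.1}$$
Now I describe $(\mathcal{H}_\hbar^\vee)'$ and its specializations at $\hbar = 1$ and $\hbar = 0$, in several steps.

Step I. A direct check shows that $\tilde{x}_n := \hbar x_n = a_n \in (\mathcal{H}_\hbar^\vee)'$, for all $n\in\mathbb{N}_\nu$. Indeed, we have of course $\delta_0(\tilde{x}_n) = \epsilon(\tilde{x}_n)\in\hbar^0\,\mathcal{H}_\hbar^\vee$ and $\delta_1(\tilde{x}_n) = \tilde{x}_n - \epsilon(\tilde{x}_n)\in\hbar^1\,\mathcal{H}_\hbar^\vee$. Moreover,
$$\delta_2(\tilde{x}_n) = \sum_{m=1}^{n-1}\tilde{x}_{n-m}\otimes Q^{n-m}_m(\tilde{x}_*) = \sum_{m=1}^{n-1}\sum_{k=1}^{m}\hbar^{k+1}\binom{n-m+1}{k}\,x_{n-m}\otimes P_m^{(k)}(x_*) \in \hbar^2\,\mathcal{H}_\hbar^\vee\otimes\mathcal{H}_\hbar^\vee.$$
Since in general $\delta_\ell = (\delta_{\ell-1}\otimes\mathrm{id})\circ\delta_2$ for all $\ell\ge 2$, we have
$$\delta_\ell(\tilde{x}_n) = (\delta_{\ell-1}\otimes\mathrm{id})\big(\delta_2(\tilde{x}_n)\big) = \sum_{m=1}^{n-1}\sum_{k=1}^{m}\hbar^{k+1}\binom{n-m+1}{k}\,\delta_{\ell-1}(x_{n-m})\otimes P_m^{(k)}(x_*),$$
whence induction gives $\delta_\ell(\tilde{x}_n)\in\hbar^\ell\big(\mathcal{H}_\hbar^\vee\big)^{\otimes\ell}$ for all $\ell\in\mathbb{N}$, thus $\tilde{x}_n\in(\mathcal{H}_\hbar^\vee)'$, q.e.d.

Step II. Using property (c) in §3.2 one easily checks that $(\mathcal{H}_\hbar^\vee)'$ is a $k[\hbar]$–subalgebra of $\mathcal{H}_\hbar^\vee$ (see [Ga2, Ga3], Proposition 3.5 for details). In particular, by Step I and the very definitions this implies that $(\mathcal{H}_\hbar^\vee)'$ contains $\mathcal{H}_\hbar$.

Step III. Using property (d) in §3.2 one easily sees that $(\mathcal{H}_\hbar^\vee)'\big|_{\hbar=0}$ is commutative (cf. [Ga2, Ga3], Theorem 3.8 for details): this means $[a,b]\equiv 0 \bmod \hbar\,(\mathcal{H}_\hbar^\vee)'$, that is $[a,b]\in\hbar\,(\mathcal{H}_\hbar^\vee)'$ hence also $\hbar^{-1}[a,b]\in(\mathcal{H}_\hbar^\vee)'$, for all $a,b\in(\mathcal{H}_\hbar^\vee)'$. In particular, we get $\widetilde{[x_n,x_m]} := \hbar\,[x_n,x_m] = \hbar^{-1}[\tilde{x}_n,\tilde{x}_m]\in(\mathcal{H}_\hbar^\vee)'$ for all $n,m\in\mathbb{N}_\nu$, whence iterating (and recalling that $L_\nu$ is generated by the $x_n$'s) we get $\tilde{x} := \hbar x\in(\mathcal{H}_\hbar^\vee)'$ for every $x\in L_\nu$. Hereafter we identify the free Lie algebra $L_\nu$ with its image via the natural embedding $L_\nu\hookrightarrow U(L_\nu) = k\langle\{x_n\}_{n\in\mathbb{N}_\nu}\rangle\hookrightarrow k[\hbar]\langle\{x_n\}_{n\in\mathbb{N}_\nu}\rangle = \mathcal{H}_\hbar^\vee$ given by $x_n\mapsto x_n$ ($n\in\mathbb{N}_\nu$).

Step IV. The previous step showed that, if we embed $L_\nu\hookrightarrow U(L_\nu)\hookrightarrow\mathcal{H}_\hbar^\vee$ via $x_n\mapsto x_n$ ($n\in\mathbb{N}_\nu$), we find $\widetilde{L_\nu} := \hbar\,L_\nu\subseteq(\mathcal{H}_\hbar^\vee)'$. Let $\langle\widetilde{L_\nu}\rangle$ be the $k[\hbar]$–subalgebra of $\mathcal{H}_\hbar^\vee$ generated by $\widetilde{L_\nu}$: then $\langle\widetilde{L_\nu}\rangle\subseteq(\mathcal{H}_\hbar^\vee)'$, because $(\mathcal{H}_\hbar^\vee)'$ is a subalgebra. In particular, if $\mathbf{b}_b\in\mathcal{H}_\hbar^\vee$ is the image of any $b\in B_\nu$ (cf. §1.1) we have $\tilde{\mathbf{b}}_b := \hbar\,\mathbf{b}_b\in(\mathcal{H}_\hbar^\vee)'$.

Step V. Conversely to Step IV, we have $\langle\widetilde{L_\nu}\rangle\supseteq(\mathcal{H}_\hbar^\vee)'$. In fact, let $\eta\in(\mathcal{H}_\hbar^\vee)'$; then there are unique $d\in\mathbb{N}$, $\eta_+\in\mathcal{H}_\hbar^\vee\setminus\hbar\,\mathcal{H}_\hbar^\vee$ such that $\eta = \hbar^d\eta_+$; set also $\bar{y} := y \bmod \hbar\,\mathcal{H}_\hbar^\vee\in\mathcal{H}_\hbar^\vee\big/\hbar\,\mathcal{H}_\hbar^\vee$ for all $y\in\mathcal{H}_\hbar^\vee$. As $\mathcal{H}_\hbar^\vee = k[\hbar]\langle\{x_n\mid n\in\mathbb{N}_\nu\}\rangle$ there is a unique $\hbar$–adic expansion of $\eta_+$, namely $\eta_+ = \eta_0 + \hbar\,\eta_1 + \cdots + \hbar^s\eta_s = \sum_{k=0}^{s}\hbar^k\eta_k$ with all $\eta_k\in k\langle\{x_n\mid n\in\mathbb{N}_\nu\}\rangle$ and $\eta_0\ne 0$. Then $\bar{\eta}_+ = \bar{\eta}_0 := \eta_0 \bmod \hbar\,\mathcal{H}_\hbar^\vee$, with $\bar{\eta}_+ = \bar{\eta}_0\in\mathcal{H}_\hbar^\vee\big|_{\hbar=0} = U(L_\nu)$ by Theorem 2.1. On the other hand, $\eta\in(\mathcal{H}_\hbar^\vee)'$ implies $\delta_{d+1}(\eta)\in\hbar^{d+1}\big(\mathcal{H}_\hbar^\vee\big)^{\otimes(d+1)}$, whence $\delta_{d+1}(\eta_+) = \hbar^{-d}\delta_{d+1}(\eta)\in\hbar\,\big(\mathcal{H}_\hbar^\vee\big)^{\otimes(d+1)}$, so that $\delta_{d+1}(\bar{\eta}_0) = 0$; the latter implies that the degree $\partial(\bar{\eta}_0)$ of $\bar{\eta}_0$ for the standard filtration of $U(L_\nu)$ is at most $d$ (cf. [Ga2, Ga3], Lemma 4.2(d) for a proof). By the PBW theorem, $\partial(\bar{\eta}_0)$ is also the degree of $\bar{\eta}_0$ as a polynomial in the $\bar{x}_b$'s, hence also of $\eta_0$ as a polynomial in the $x_b$'s ($b\in B_\nu$): then $\hbar^d\eta_0\in\langle\widetilde{L_\nu}\rangle\subseteq(\mathcal{H}_\hbar^\vee)'$ (using Step III), hence we find $\eta_{(1)} := \hbar^{d+1}\big(\eta_1 + \hbar\,\eta_2 + \cdots + \hbar^{s-1}\eta_s\big) = \eta - \hbar^d\eta_0\in(\mathcal{H}_\hbar^\vee)'$. Thus we can apply our argument again, with $\eta_{(1)}$ instead of $\eta$. Iterating we find $\partial(\bar{\eta}_k)\le d+k$, whence $\hbar^{d+k}\eta_k\in\langle\widetilde{L_\nu}\rangle$ ($\subseteq(\mathcal{H}_\hbar^\vee)'$) for all $k$, thus $\eta = \sum_{k=0}^{s}\hbar^{d+k}\eta_k\in\langle\widetilde{L_\nu}\rangle$,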
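q.e.d. An entirely similar analysis clearly works with $\mathcal{K}$ taking the role of $\mathcal{H}$, with similar results (mutatis mutandis). On the upshot, we get the following description:

Theorem 3.1.
(a) With notation of Step III in §3.3 (and $[a,c] := ac - ca$), we have
$$(\mathcal{H}_\hbar^\vee)' = \langle\widetilde{L_\nu}\rangle = k[\hbar]\big\langle\{\tilde{\mathbf{b}}_b\}_{b\in B_\nu}\big\rangle\Big/\Big(\big\{[\tilde{\mathbf{b}}_{b_1},\tilde{\mathbf{b}}_{b_2}] - \hbar\,\tilde{\mathbf{b}}_{[b_1,b_2]}\ \big|\ b_1,b_2\in B_\nu\big\}\Big).$$
(b) $(\mathcal{H}_\hbar^\vee)'$ is a graded Hopf $k[\hbar]$–subalgebra of $\mathcal{H}_\hbar^\vee$, and $\mathcal{H}$ is naturally embedded into $(\mathcal{H}_\hbar^\vee)'$ as a graded Hopf subalgebra via $\mathcal{H}\hookrightarrow(\mathcal{H}_\hbar^\vee)'$, $a_n\mapsto\tilde{x}_n$ (for all $n\in\mathbb{N}_\nu$).
(c) $(\mathcal{H}_\hbar^\vee)'\big|_{\hbar=0} := (\mathcal{H}_\hbar^\vee)'\big/\hbar\,(\mathcal{H}_\hbar^\vee)' = F[G_{L_\nu}]$, where $G_{L_\nu}$ is an infinite dimensional connected Poisson algebraic group with cotangent Lie bialgebra isomorphic to $L_\nu$ (with the graded Lie bialgebra structure of Theorem 2.1). Indeed, $(\mathcal{H}_\hbar^\vee)'\big|_{\hbar=0}$ is the free Poisson (commutative) algebra over $\mathbb{N}_\nu$, generated by all the $\tilde{x}_n\big|_{\hbar=0}$ ($n\in\mathbb{N}_\nu$), with Hopf structure given by (1.1) with $\tilde{x}_*$ instead of $a_*$. Thus $(\mathcal{H}_\hbar^\vee)'\big|_{\hbar=0}$ is the polynomial algebra $k\big[\{\beta_b\}_{b\in B_\nu}\big]$ generated by a set of indeterminates $\{\beta_b\}_{b\in B_\nu}$ in bijection with the basis $B_\nu$ of $L_\nu$, so $G_{L_\nu}\cong\mathbb{A}_k^{B_\nu}$ (a (pro)affine $k$–space) as algebraic varieties. Finally, $F[G_{L_\nu}] = (\mathcal{H}_\hbar^\vee)'\big|_{\hbar=0} = k\big[\{\beta_b\}_{b\in B_\nu}\big]$ bears the natural algebra grading $d$ of polynomial algebras and the Hopf algebra grading inherited from $(\mathcal{H}_\hbar^\vee)'$, respectively given by $d(\tilde{\mathbf{b}}_b) = 1$ and $\partial(\tilde{\mathbf{b}}_b) = \sum_{i=1}^{k}n_i$ for all $b = [[\cdots[[x_{n_1},x_{n_2}],x_{n_3}],\cdots],x_{n_k}]\in B_\nu$.
(d) $F[G_\nu]$ is naturally embedded into $(\mathcal{H}_\hbar^\vee)'\big|_{\hbar=0} = F[G_{L_\nu}]$ as a graded Hopf subalgebra via $\mu : F[G_\nu]\hookrightarrow(\mathcal{H}_\hbar^\vee)'\big|_{\hbar=0} = F[G_{L_\nu}]$, $a_n\mapsto\tilde{x}_n \bmod \hbar\,(\mathcal{H}_\hbar^\vee)'$ (for all $n\in\mathbb{N}_\nu$); moreover, $F[G_\nu]$ freely generates $F[G_{L_\nu}]$ as a Poisson algebra. Thus there is an algebraic group epimorphism $\mu_* : G_{L_\nu}\twoheadrightarrow G_\nu$, that is $G_{L_\nu}$ is an extension of $G_\nu$.
(e) Mapping $\tilde{x}_n \bmod \hbar\,(\mathcal{H}_\hbar^\vee)'\mapsto a_n$ (for all $n\in\mathbb{N}_\nu$) gives a well-defined graded Hopf algebra epimorphism $\pi : F[G_{L_\nu}]\twoheadrightarrow F[G_\nu]$. Thus there is an algebraic group monomorphism $\pi_* : G_\nu\hookrightarrow G_{L_\nu}$, that is $G_\nu$ is an algebraic subgroup of $G_{L_\nu}$.
(f) The map $\mu$ is a section of $\pi$, hence $\pi_*$ is a section of $\mu_*$. Thus $G_{L_\nu}$ is a semidirect product of algebraic groups, namely $G_{L_\nu} = G_\nu\ltimes\mathcal{N}_\nu$, where $\mathcal{N}_\nu := \mathrm{Ker}(\mu_*)\trianglelefteq G_{L_\nu}$.
(g) The analogues of statements (a)–(f) hold with $\mathcal{K}$ instead of $\mathcal{H}$, with $X^+$ instead of $X$ for all $X = L_\nu, B_\nu, \mathbb{N}_\nu, \mu, \pi, \mathcal{N}_\nu$, and with $G_{L_\nu^+}$ instead of $G_{L_\nu}$.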
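Proof. (a) This part follows directly from Step IV and Step V in §3.3.

(b) To show that $(\mathcal{H}_\hbar^\vee)'$ is a graded Hopf subalgebra we use its presentation in (a). But first recall that, by Step II, $\mathcal{H}_\hbar$ embeds into $(\mathcal{H}_\hbar^\vee)'$ via an embedding which is compatible with the Hopf operations (it is a restriction of the identity on $\mathcal{H}(\hbar)$): then this will be a Hopf algebra monomorphism, up to proving that $(\mathcal{H}_\hbar^\vee)'$ is a Hopf subalgebra (of $\mathcal{H}_\hbar^\vee$). Now, $\epsilon_{\mathcal{H}_\hbar^\vee}$ obviously restricts to give a counit for $(\mathcal{H}_\hbar^\vee)'$. Second, we show that $\Delta\big((\mathcal{H}_\hbar^\vee)'\big)\subseteq(\mathcal{H}_\hbar^\vee)'\otimes(\mathcal{H}_\hbar^\vee)'$, so $\Delta$ restricts to a coproduct for $(\mathcal{H}_\hbar^\vee)'$. Indeed, each $b\in B_\nu$ is a Lie monomial, say $b = [[[\dots[x_{n_1},x_{n_2}],x_{n_3}],\dots],x_{n_k}]$ for some $k, n_1,\dots,n_k\in\mathbb{N}_\nu$, where $k$ is its Lie degree: by induction on $k$ we'll prove $\Delta(\tilde{\mathbf{b}}_b)\in(\mathcal{H}_\hbar^\vee)'\otimes(\mathcal{H}_\hbar^\vee)'$ (with $\tilde{\mathbf{b}}_b := \hbar\,\mathbf{b}_b = \hbar\,[[[\dots[x_{n_1},x_{n_2}],x_{n_3}],\dots],x_{n_k}]$).

If $k = 1$ then $b = x_n$ for some $n\in\mathbb{N}_\nu$. Then $\tilde{\mathbf{b}}_b = \hbar x_n = a_n$ and
$$\Delta(\tilde{\mathbf{b}}_b) = \Delta(a_n) = a_n\otimes 1 + 1\otimes a_n + \sum_{m=1}^{n-1}a_{n-m}\otimes Q^{n-m}_m(a_*) \in \mathcal{H}^{\mathrm{dif}}\otimes\mathcal{H}^{\mathrm{dif}} \subseteq (\mathcal{H}_\hbar^\vee)'\otimes(\mathcal{H}_\hbar^\vee)'.$$
If $k > 1$ then $b = [b^-, x_n]$ for some $n\in\mathbb{N}_\nu$ and some $b^-\in B_\nu$ expressed by a Lie monomial of degree $k-1$. Then, writing $\tilde{\mathbf{b}}^- := \tilde{\mathbf{b}}_{b^-}$, we have $\tilde{\mathbf{b}}_b = \hbar\,[\mathbf{b}_{b^-},x_n] = [\tilde{\mathbf{b}}^-,x_n]$ and
$$\begin{aligned}
\Delta(\tilde{\mathbf{b}}_b) &= \Delta\big([\tilde{\mathbf{b}}^-,x_n]\big) = \big[\Delta(\tilde{\mathbf{b}}^-),\Delta(x_n)\big] = \hbar^{-1}\big[\Delta(\tilde{\mathbf{b}}^-),\Delta(a_n)\big]\\
&= \hbar^{-1}\sum_{(\tilde{\mathbf{b}}^-)}\Big[\tilde{\mathbf{b}}^-_{(1)}\otimes\tilde{\mathbf{b}}^-_{(2)},\ a_n\otimes 1 + 1\otimes a_n + \sum_{m=1}^{n-1}a_{n-m}\otimes Q^{n-m}_m(a_*)\Big]\\
&= \sum_{(\tilde{\mathbf{b}}^-)}\Big(\hbar^{-1}\big[\tilde{\mathbf{b}}^-_{(1)},a_n\big]\otimes\tilde{\mathbf{b}}^-_{(2)} + \tilde{\mathbf{b}}^-_{(1)}\otimes\hbar^{-1}\big[\tilde{\mathbf{b}}^-_{(2)},a_n\big]\Big)\\
&\quad + \sum_{(\tilde{\mathbf{b}}^-)}\sum_{m=1}^{n-1}\Big(\hbar^{-1}\big[\tilde{\mathbf{b}}^-_{(1)},a_{n-m}\big]\otimes\tilde{\mathbf{b}}^-_{(2)}\,Q^{n-m}_m(a_*) + a_{n-m}\,\tilde{\mathbf{b}}^-_{(1)}\otimes\hbar^{-1}\big[\tilde{\mathbf{b}}^-_{(2)},Q^{n-m}_m(a_*)\big]\Big),
\end{aligned}$$
where we used the standard $\Sigma$–notation $\Delta(\tilde{\mathbf{b}}^-) = \sum_{(\tilde{\mathbf{b}}^-)}\tilde{\mathbf{b}}^-_{(1)}\otimes\tilde{\mathbf{b}}^-_{(2)}$. By inductive hypothesis we have $\tilde{\mathbf{b}}^-_{(1)},\tilde{\mathbf{b}}^-_{(2)}\in(\mathcal{H}_\hbar^\vee)'$; then since also $a_\ell\in(\mathcal{H}_\hbar^\vee)'$ for all $\ell$ and since $(\mathcal{H}_\hbar^\vee)'$ is commutative modulo $\hbar$ we have
$$\hbar^{-1}\big[\tilde{\mathbf{b}}^-_{(1)},a_n\big],\quad \hbar^{-1}\big[\tilde{\mathbf{b}}^-_{(2)},a_n\big],\quad \hbar^{-1}\big[\tilde{\mathbf{b}}^-_{(1)},a_{n-m}\big],\quad \hbar^{-1}\big[\tilde{\mathbf{b}}^-_{(2)},Q^{n-m}_m(a_*)\big]\ \in\ (\mathcal{H}_\hbar^\vee)'$$
for all $n$ and $(n-m)$ above: so the previous formula gives $\Delta(\tilde{\mathbf{b}}_b)\in(\mathcal{H}_\hbar^\vee)'\otimes(\mathcal{H}_\hbar^\vee)'$, q.e.d.

Finally, the antipode. Take the Lie monomial $b = [[[\dots[x_{n_1},x_{n_2}],x_{n_3}],\dots],x_{n_k}]\in B_\nu$, so $\tilde{\mathbf{b}}_b = \hbar\,\mathbf{b}_b = \hbar\,[[[\dots[x_{n_1},x_{n_2}],x_{n_3}],\dots],x_{n_k}]$. We prove that $S(\tilde{\mathbf{b}}_b)\in(\mathcal{H}_\hbar^\vee)'$ by induction on the degree $k$. If $k = 1$ then $b = x_n$ for some $n$, so $\tilde{\mathbf{b}}_b = \hbar x_n = a_n$ and
$$S(\tilde{\mathbf{b}}_b) = S(a_n) = -a_n - \sum_{m=1}^{n-1}a_{n-m}\,S\big(Q^{n-m}_m(a_*)\big) \in \mathcal{H}^{\mathrm{dif}} \subseteq (\mathcal{H}_\hbar^\vee)', \qquad\text{q.e.d.}$$
If $k > 1$ then $b = [b^-,x_n]$ for some $n\in\mathbb{N}_\nu$ and some $b^-\in B_\nu$ which is a Lie monomial of degree $k-1$. Then $\tilde{\mathbf{b}}_b = \hbar\,[\mathbf{b}_{b^-},x_n] = [\tilde{\mathbf{b}}^-,x_n] = \hbar^{-1}[\tilde{\mathbf{b}}^-,a_n]$ and so
$$S(\tilde{\mathbf{b}}_b) = S\big([\tilde{\mathbf{b}}^-,x_n]\big) = \hbar^{-1}\big[S(a_n),S(\tilde{\mathbf{b}}^-)\big] \in \hbar^{-1}\big[(\mathcal{H}_\hbar^\vee)',(\mathcal{H}_\hbar^\vee)'\big] \subseteq (\mathcal{H}_\hbar^\vee)',$$
using the fact $S(a_n) = S(\tilde{x}_n)\in(\mathcal{H}_\hbar^\vee)'$ (by the case $k = 1$) along with the inductive assumption $S(\tilde{\mathbf{b}}^-)\in(\mathcal{H}_\hbar^\vee)'$ and the commutativity of $(\mathcal{H}_\hbar^\vee)'$ modulo $\hbar$.

(c) As a consequence of (a), the $k$–algebra $(\mathcal{H}_\hbar^\vee)'\big|_{\hbar=0}$ is a polynomial algebra, namely $(\mathcal{H}_\hbar^\vee)'\big|_{\hbar=0} = k\big[\{\beta_b\}_{b\in B_\nu}\big]$ with $\beta_b := \tilde{\mathbf{b}}_b \bmod \hbar\,(\mathcal{H}_\hbar^\vee)'$ for all $b\in B_\nu$. So $(\mathcal{H}_\hbar^\vee)'\big|_{\hbar=0}$ is the algebra of regular functions $F[\Gamma]$ of some (affine) algebraic variety $\Gamma$; as $(\mathcal{H}_\hbar^\vee)'$ is a Hopf algebra the same is true for $(\mathcal{H}_\hbar^\vee)'\big|_{\hbar=0} = F[\Gamma]$, so $\Gamma$ is an algebraic group; and since $F[\Gamma] = (\mathcal{H}_\hbar^\vee)'\big|_{\hbar=0}$ is a specialization limit of $(\mathcal{H}_\hbar^\vee)'$, it is endowed with the Poisson bracket $\big\{a|_{\hbar=0},b|_{\hbar=0}\big\} := \big(\hbar^{-1}[a,b]\big)\big|_{\hbar=0}$, which makes $\Gamma$ into a Poisson group too.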
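We compute the cotangent Lie bialgebra of $\Gamma$. First, $\mathfrak{m}_e := \mathrm{Ker}\big(\epsilon_{F[\Gamma]}\big) = \big(\{\beta_b\}_{b\in B_\nu}\big)$ (the ideal generated by the $\beta_b$'s) by construction, so $\mathfrak{m}_e^{\,2} = \big(\{\beta_{b_1}\beta_{b_2}\}_{b_1,b_2\in B_\nu}\big)$. Therefore the cotangent Lie bialgebra $Q\big(F[\Gamma]\big) := \mathfrak{m}_e\big/\mathfrak{m}_e^{\,2}$ as a $k$–vector space has basis $\{\bar{\beta}_b\}_{b\in B_\nu}$, where $\bar{\beta}_b := \beta_b \bmod \mathfrak{m}_e^{\,2}$ for all $b\in B_\nu$. For its Lie bracket we have (cf. Remark 1.5)
$$\begin{aligned}
\big[\bar{\beta}_{b_1},\bar{\beta}_{b_2}\big] &:= \big\{\beta_{b_1},\beta_{b_2}\big\} \bmod \mathfrak{m}_e^{\,2} = \Big(\hbar^{-1}\big[\tilde{\mathbf{b}}_{b_1},\tilde{\mathbf{b}}_{b_2}\big] \bmod \hbar\,(\mathcal{H}_\hbar^\vee)'\Big) \bmod \mathfrak{m}_e^{\,2}\\
&= \Big(\hbar^{-1}\hbar^{2}\big[\mathbf{b}_{b_1},\mathbf{b}_{b_2}\big] \bmod \hbar\,(\mathcal{H}_\hbar^\vee)'\Big) \bmod \mathfrak{m}_e^{\,2} = \Big(\hbar\,\mathbf{b}_{[b_1,b_2]} \bmod \hbar\,(\mathcal{H}_\hbar^\vee)'\Big) \bmod \mathfrak{m}_e^{\,2}\\
&= \Big(\tilde{\mathbf{b}}_{[b_1,b_2]} \bmod \hbar\,(\mathcal{H}_\hbar^\vee)'\Big) \bmod \mathfrak{m}_e^{\,2} = \beta_{[b_1,b_2]} \bmod \mathfrak{m}_e^{\,2} = \bar{\beta}_{[b_1,b_2]},
\end{aligned}$$
thus the $k$–linear map $L_\nu\to\mathfrak{m}_e\big/\mathfrak{m}_e^{\,2}$ defined by $b\mapsto\bar{\beta}_b$ for all $b\in B_\nu$ is a Lie algebra isomorphism. As for the Lie cobracket, using the general identity $\delta = \Delta - \Delta^{\mathrm{op}} \bmod \big(\mathfrak{m}_e^{\,2}\otimes F[\Gamma] + F[\Gamma]\otimes\mathfrak{m}_e^{\,2}\big)$ (written $\bmod\ \widehat{\mathfrak{m}_e^{\,2}}$ for short) we get, for all $n\in\mathbb{N}_\nu$,
$$\begin{aligned}
\delta\big(\bar{\beta}_{x_n}\big) &= \big(\Delta-\Delta^{\mathrm{op}}\big)(\beta_{x_n}) \bmod \widehat{\mathfrak{m}_e^{\,2}} = \Big(\big(\Delta-\Delta^{\mathrm{op}}\big)(\tilde{x}_n) \bmod \hbar\,(\mathcal{H}_\hbar^\vee)'\otimes(\mathcal{H}_\hbar^\vee)'\Big) \bmod \widehat{\mathfrak{m}_e^{\,2}}\\
&= \Big(\Big(a_n\wedge 1 + 1\wedge a_n + \sum_{m=1}^{n-1}a_{n-m}\wedge Q^{n-m}_m(a_*)\Big) \bmod \hbar\,(\mathcal{H}_\hbar^\vee)'\otimes(\mathcal{H}_\hbar^\vee)'\Big) \bmod \widehat{\mathfrak{m}_e^{\,2}}\\
&= \sum_{m=1}^{n-1}\beta_{x_{n-m}}\wedge Q^{n-m}_m(\beta_{x_*}) \bmod \widehat{\mathfrak{m}_e^{\,2}} = \sum_{m=1}^{n-1}\sum_{k=1}^{m}\binom{n-m+1}{k}\,\beta_{x_{n-m}}\wedge P_m^{(k)}(\beta_{x_*}) \bmod \widehat{\mathfrak{m}_e^{\,2}}\\
&= \sum_{m=1}^{n-1}\binom{n-m+1}{1}\,\beta_{x_{n-m}}\wedge P_m^{(1)}(\beta_{x_*}) \bmod \widehat{\mathfrak{m}_e^{\,2}} = \sum_{\ell=1}^{n-1}(\ell+1)\,\bar{\beta}_{x_\ell}\wedge\bar{\beta}_{x_{n-\ell}}
\end{aligned}$$
because (among other things) one has $P_m^{(k)}(\beta_{x_*})\in\mathfrak{m}_e^{\,2}$ for all $k>1$: therefore
$$\delta\big(\bar{\beta}_{x_n}\big) = \sum_{\ell=1}^{n-1}(\ell+1)\,\bar{\beta}_{x_\ell}\wedge\bar{\beta}_{x_{n-\ell}} \qquad \forall\, n\in\mathbb{N}_\nu. \tag{3.2}$$
Since $L_\nu$ is generated (as a Lie algebra) by the $x_n$'s, the last formula shows that the map $L_\nu\to\mathfrak{m}_e\big/\mathfrak{m}_e^{\,2}$ given above is also an isomorphism of Lie bialgebras, q.e.d. Finally, the statements about gradings of $(\mathcal{H}_\hbar^\vee)'\big|_{\hbar=0}$ should be trivially clear.

(d) The part about Hopf algebras is a direct consequence of (a) and (b), noting that the $\tilde{x}_n$'s commute modulo $\hbar\,(\mathcal{H}_\hbar^\vee)'$, since $(\mathcal{H}_\hbar^\vee)'\big|_{\hbar=0}$ is commutative. Taking spectra (i.e. sets of characters of each Hopf algebra) we get an algebraic group morphism $\mu_* : G_{L_\nu}\to G_\nu$, which in fact is onto because, as these algebras are polynomial, each character of $F[G_\nu]$ does extend to a character of $F[G_{L_\nu}]$, so the former arises from restriction of the latter.

(e) Due to the explicit description of $F[G_{L_\nu}]$ coming from (a) and (b), mapping $\tilde{x}_n \bmod \hbar\,(\mathcal{H}_\hbar^\vee)'\mapsto a_n$ (for all $n\in\mathbb{N}_\nu$) clearly yields a Hopf algebra epimorphism $\pi : F[G_{L_\nu}]\twoheadrightarrow F[G_\nu]$. Taking spectra gives an algebraic group monomorphism $\pi_* : G_\nu\hookrightarrow G_{L_\nu}$ as required.

(f) The map $\mu$ is a section of $\pi$ by construction. Then clearly $\pi_*$ is a section of $\mu_*$, which implies $G_{L_\nu} = G_\nu\ltimes\mathcal{N}_\nu$ (with $\mathcal{N}_\nu := \mathrm{Ker}(\mu_*)\trianglelefteq G_{L_\nu}$) by general theory.

(g) This ought to be clear from the whole discussion, for all arguments apply again (mutatis mutandis) when starting with $\mathcal{K}$ instead of $\mathcal{H}$; details are left to the reader.

Remark. Roughly speaking, we can say that the extension $F[G_\nu]\hookrightarrow F[G_{L_\nu}]$ is performed simply by adding to $F[G_\nu]$ a free Poisson structure, which happens to be compatible with the Hopf structure. Then the Poisson bracket starting from the "elementary" coordinates $a_n$ (for $n\in\mathbb{N}_\nu$) freely generates new coordinates $\{a_{n_1},a_{n_2}\}$, $\{\{a_{n_1},a_{n_2}\},a_{n_3}\}$, etc., thus enlarging $F[G_\nu]$ and generating $F[G_{L_\nu}]$. At the group level, this means that $G_\nu$ freely Poisson-generates the Poisson group $G_{L_\nu}$: new 1-parameter subgroups, built up in a Poisson-free manner from those attached to the $a_n$'s, are freely "pasted" to $G_\nu$, expanding it and building up $G_{L_\nu}$. Then the epimorphism $\mu_* : G_{L_\nu}\twoheadrightarrow G_\nu$ is just a forgetful map: it kills the new 1-parameter subgroups and is injective (hence an isomorphism) on the subgroup generated by the old ones. On the other hand, definitions imply that $F[G_{L_\nu}]\big/\big(\{F[G_{L_\nu}],F[G_{L_\nu}]\}\big)\cong F[G_\nu]$, and with this identification $\pi : F[G_{L_\nu}]\twoheadrightarrow F[G_\nu]$ is just the canonical map, which mods out all Poisson brackets $\{f_1,f_2\}$, for $f_1,f_2\in F[G_{L_\nu}]$.

3.4. Specialization limits. So far, we have already pointed out (by Proposition 2.1, Theorem 2.1, Theorem 3.1(c)) the following specialization limits of $\mathcal{H}_\hbar^\vee$ and $(\mathcal{H}_\hbar^\vee)'$:
$$\mathcal{H}_\hbar^\vee \xrightarrow{\ \hbar\to 1\ } \mathcal{H}, \qquad \mathcal{H}_\hbar^\vee \xrightarrow{\ \hbar\to 0\ } U(L_\nu), \qquad (\mathcal{H}_\hbar^\vee)' \xrightarrow{\ \hbar\to 0\ } F[G_{L_\nu}]$$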
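as graded Hopf $k$–algebras, with (co-)Poisson structures in the last two cases. As for the specialization limit of $(\mathcal{H}_\hbar^\vee)'$ at $\hbar = 1$, Theorem 3.1 implies that it is $\mathcal{H}$. Indeed, by Theorem 3.1(b) $\mathcal{H}_\hbar$ embeds into $(\mathcal{H}_\hbar^\vee)'$ via $a_n\mapsto\tilde{x}_n$ (for all $n\in\mathbb{N}_\nu$): then
$$[a_n,a_m] = [\tilde{x}_n,\tilde{x}_m] = \hbar\,\widetilde{[x_n,x_m]} \equiv \widetilde{[x_n,x_m]} \bmod (\hbar-1)\,(\mathcal{H}_\hbar^\vee)' \qquad \forall\, n,m\in\mathbb{N}_\nu,$$
whence, due to the presentation of $(\mathcal{H}_\hbar^\vee)'$ by generators and relations in Theorem 3.1(a),
$$(\mathcal{H}_\hbar^\vee)'\big|_{\hbar=1} := (\mathcal{H}_\hbar^\vee)'\big/(\hbar-1)\,(\mathcal{H}_\hbar^\vee)' = k\big\langle\overline{\tilde{x}_1},\overline{\tilde{x}_2},\dots,\overline{\tilde{x}_n},\dots\big\rangle = k\langle a_1,a_2,\dots,a_n,\dots\rangle$$
(where $\overline{c} := c \bmod (\hbar-1)\,(\mathcal{H}_\hbar^\vee)'$) as $k$–algebras, and the Hopf structure is exactly the one of $\mathcal{H}$ because it is given by the like formulas on generators. In a nutshell, we have $(\mathcal{H}_\hbar^\vee)'\xrightarrow{\ \hbar\to 1\ }\mathcal{H}$ as Hopf $k$–algebras. This completes the top part of the diagram () in the Introduction, for $H = \mathcal{H}$ ($:= \mathcal{H}_\nu$), because $\mathcal{H}^\vee := \mathcal{H}\big/\bigcap_{n\in\mathbb{N}}J^n = \mathcal{H}$ by §2.2: namely,
$$U(L_\nu)\ \xleftrightarrow[\ \mathcal{H}_\hbar^\vee\ ]{\ 0\,\leftarrow\,\hbar\,\rightarrow\,1\ }\ \mathcal{H}\ \xleftrightarrow[\ (\mathcal{H}_\hbar^\vee)'\ ]{\ 1\,\leftarrow\,\hbar\,\rightarrow\,0\ }\ F[G_{L_\nu}].$$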
4. The Rees Deformation $\mathcal{H}_\hbar'$

4.1. The goal. The crystal duality principle (cf. [Ga2, Ga4]) yields also a recipe to produce a 1-parameter deformation $\mathcal{H}_\hbar'$ of $\mathcal{H}$ which is a quantized function algebra (QFA in the sequel): namely, $\mathcal{H}_\hbar'$ is a Hopf $k[\hbar]$–algebra such that $\mathcal{H}_\hbar'\big|_{\hbar=1} = \mathcal{H}$ and $\mathcal{H}_\hbar'\big|_{\hbar=0} = F[G_+]$, the function algebra of a connected algebraic Poisson group $G_+$. Thus $\mathcal{H}_\hbar'$ is a quantization of $F[G_+]$, and the quantum symmetry $\mathcal{H}$ is a deformation of the classical Poisson symmetry $F[G_+]$. By definition $\mathcal{H}_\hbar'$ is the Rees algebra associated to a distinguished increasing Hopf algebra filtration of $\mathcal{H}$, and $F[G_+]$ is simply the graded Hopf algebra associated to this filtration. The purpose of this section is to describe explicitly $\mathcal{H}_\hbar'$ and its semiclassical limit $F[G_+]$, hence also $G_+$ itself. This will also provide a direct, independent proof of all the above mentioned results about $\mathcal{H}_\hbar'$ and $F[G_+]$ themselves.

4.2. The Rees algebra $\mathcal{H}_\hbar'$. Let's consider Drinfeld's $\delta_\bullet$–maps, as in §3.2, for the Hopf algebra $\mathcal{H}$. Using them, we define the $\delta_\bullet$–filtration $D := \{D_n\}_{n\in\mathbb{N}}$ of $\mathcal{H}$ by $D_n := \mathrm{Ker}(\delta_{n+1})$, for all $n\in\mathbb{N}$. It is easy to show (cf. [Ga4]) that $D$ is a Hopf algebra filtration of $\mathcal{H}$; moreover, since $\mathcal{H}$ is graded connected, we have $\mathcal{H} = \bigcup_{n\in\mathbb{N}}D_n =: \mathcal{H}'$. We define the Rees algebra associated to $D$ as
$$\mathcal{H}_\hbar' := k[\hbar]\cdot\sum_{n\ge 0}\hbar^{+n}D_n = \sum_{n\ge 0}k[\hbar]\,\hbar^{+n}\cdot D_n \subseteq \mathcal{H}_\hbar := \mathcal{H}[\hbar]. \tag{4.1}$$
A trivial check shows that the following intrinsic characterization (inside $\mathcal{H}_\hbar$) also holds:
$$\mathcal{H}_\hbar' = \Big\{\eta\in\mathcal{H}_\hbar \;\Big|\; \delta_n(\eta)\in\hbar^n\,\mathcal{H}_\hbar^{\otimes n},\ \forall\, n\in\mathbb{N}\Big\} \subseteq \mathcal{H}_\hbar.$$
We shall describe $\mathcal{H}_\hbar'$ explicitly, and we'll compute its specialization at $\hbar = 0$ and at $\hbar = 1$: in particular we'll show that it is really a QFA and a deformation of $\mathcal{H}$, as claimed. By (4.1), all we need is to compute the filtration $D = \{D_n\}_{n\in\mathbb{N}}$; the idea is to describe it in combinatorial terms, based on the non-commutative polynomial nature of $\mathcal{H}$.

4.3. Gradings and filtrations. Let $\partial_-$ be the unique Lie algebra grading of $L_\nu$ given by $\partial_-(x_n) := n-1+\delta_{n,1}$ (for all $n\in\mathbb{N}_\nu$). Let also $d$ be the standard Lie algebra grading associated with the lower central series of $L_\nu$, i.e. the one defined by $d\big([\cdots[[x_{s_1},x_{s_2}],\dots x_{s_k}]\big) = k-1$ on any Lie monomial of $L_\nu$. As both $\partial_-$ and $d$ are Lie algebra gradings, $(\partial_- - d)$ is a Lie algebra grading too. Let $\{F_n\}_{n\in\mathbb{N}}$ be the Lie algebra filtration associated with $(\partial_- - d)$; then the down-shifted filtration $T := \{T_n := F_{n-1}\}_{n\in\mathbb{N}}$ is again a Lie algebra filtration of $L_\nu$. There is a unique algebra filtration on $U(L_\nu)$ extending $T$: we denote it $\Theta = \{\Theta_n\}_{n\in\mathbb{N}}$, and set also $\Theta_{-1} := \{0\}$. Finally, for each $y\in U(L_\nu)\setminus\{0\}$ there is a unique $\tau(y)\in\mathbb{N}$ with $y\in\Theta_{\tau(y)}\setminus\Theta_{\tau(y)-1}$; in particular $\tau(b) = \partial_-(b) - d(b)$, $\tau(bb') = \tau(b)+\tau(b')$ and $\tau\big([b,b']\big) = \tau(b)+\tau(b')-1$ for $b,b'\in B_\nu$.

We can explicitly describe $\Theta$. Indeed, let us fix any total order $\preceq$ on the basis $B_\nu$ of §1.1: then $\mathcal{X} := \big\{\underline{b} := b_1\cdots b_k \,\big|\, k\in\mathbb{N},\ b_1,\dots,b_k\in B_\nu,\ b_1\preceq\cdots\preceq b_k\big\}$ is a $k$–basis of $U(L_\nu)$, by the PBW theorem. It follows that $\Theta$ induces a set-theoretic filtration $\{\mathcal{X}_n\}_{n\in\mathbb{N}}$ of $\mathcal{X}$ with $\mathcal{X}_n := \mathcal{X}\cap\Theta_n = \big\{\underline{b} := b_1\cdots b_k \,\big|\, k\in\mathbb{N},\ b_1,\dots,b_k\in B_\nu,\ b_1\preceq\cdots\preceq b_k,\ \tau(\underline{b}) = \tau(b_1)+\cdots+\tau(b_k)\le n\big\}$, and also that $\Theta_n = \mathrm{Span}(\mathcal{X}_n)$ for all $n\in\mathbb{N}$.

Let us define $\alpha_1 := a_1$ and $\alpha_n := a_n - a_1^{\,n}$ for all $n\in\mathbb{N}_\nu\setminus\{1\}$. This "change of variables" (which switches from the $a_n$'s to their differentials, in a sense) is the key to achieve a complete description of $D$, via a close comparison between $\mathcal{H}$ and $U(L_\nu)$.
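The degree functions of §4.3 are easy to experiment with. The following sketch assumes a purely illustrative encoding that is not taken from the paper: a generator $x_n$ is the integer `n`, and a bracket $[b,b']$ is the pair `(b, b')`. It works on arbitrary Lie monomials and does not check membership in the basis $B_\nu$. It implements $\partial_-$, $d$ and $\tau = \partial_- - d$ exactly as stated above, so that the rule $\tau([b,b']) = \tau(b)+\tau(b')-1$ can be tested on examples.

```python
def d_minus(mono):
    """partial_-: partial_-(x_n) = n - 1 + delta_{n,1}, additive on brackets."""
    if isinstance(mono, int):
        return mono - 1 + (1 if mono == 1 else 0)
    left, right = mono
    return d_minus(left) + d_minus(right)


def d(mono):
    """d = (number of generators occurring in the Lie monomial) - 1."""
    if isinstance(mono, int):
        return 0
    left, right = mono
    return d(left) + d(right) + 1


def tau(mono):
    """tau(b) = partial_-(b) - d(b) on a single Lie monomial b."""
    return d_minus(mono) - d(mono)


def tau_of_product(monos):
    """tau of a product b_1 ... b_k of Lie monomials: the sum of the tau(b_i)."""
    return sum(tau(b) for b in monos)
```

For example, `tau(3)` is 2 and `tau((3, 2))` is 2, in agreement with $\tau(x_3) = 2$ and $\tau([x_3,x_2]) = \tau(x_3)+\tau(x_2)-1$.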
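By definition $\mathcal{H} = \mathcal{H}_\nu$ is the free associative algebra over $\{a_n\}_{n\in\mathbb{N}_\nu}$, hence (by definition of the $\alpha$'s) also over $\{\alpha_n\}_{n\in\mathbb{N}_\nu}$; so we have an algebra isomorphism $\mathcal{H}\xrightarrow{\ \cong\ }U(L_\nu)$ given by $\alpha_n\mapsto x_n$ ($\forall\, n\in\mathbb{N}_\nu$). Via this isomorphism we pull back all data and results about gradings, filtrations, PBW bases and so on mentioned above for $U(L_\nu)$; in particular we let $\alpha_{\underline{b}} = \alpha_{b_1}\cdots\alpha_{b_k}$ ($b_1,\dots,b_k\in B_\nu$) be the preimage of $x_{\underline{b}}$, and we let $\mathcal{A}_n$ ($n\in\mathbb{N}$) and $\mathcal{A} = \bigcup_{n\in\mathbb{N}}\mathcal{A}_n$ be the preimages of $\mathcal{X}_n$ and $\mathcal{X}$. For gradings on $\mathcal{H}$ we stick to the like notation, i.e. $\partial_-$, $d$ and $\tau$, and similarly for $\Theta$. Finally, for all $a\in\mathcal{H}\setminus\{0\}$ we set $\kappa(a) := k$ iff $a\in D_k\setminus D_{k-1}$ (with $D_{-1} := \{0\}$). Our goal is to prove an identity of filtrations, namely $D = \Theta$, or equivalently $\kappa = \tau$. In fact, this would give to the Hopf filtration $D$, which is defined intrinsically in Hopf algebraic terms, an explicit combinatorial description, namely the one of $\Theta$ explained above.

Lemma 4.1. $Q^\ell_t(a_*)\in\Theta_t\setminus\Theta_{t-1}$, $\ Z^\ell_t(\alpha_*) := Q^\ell_t(a_*) - \binom{\ell+t}{t}a_1^{\,t}\in\Theta_{t-1}$ ($\ell,t\in\mathbb{N}$, $t\ge 1$).

Proof. When $t = 1$ definitions give $Q^\ell_1(a_*) = (\ell+1)a_1\in\Theta_1$ and so $Z^\ell_1(\alpha_*) = (\ell+1)a_1 - \binom{\ell+1}{1}a_1 = 0\in\Theta_0$, for all $\ell\in\mathbb{N}$. Similarly, when $\ell = 0$ we have $Q^0_t(a_*) = a_t\in\Theta_t$ and so $Z^0_t(\alpha_*) = a_t - \binom{t}{t}a_1^{\,t} = \alpha_t\in\Theta_{t-1}$ (by definition), for all $t\in\mathbb{N}_+$.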
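The base cases just treated, and the role of the binomial coefficient in the definition of $Z^\ell_t$, can be checked mechanically. The sketch below is an illustration only: it assumes the definition $Q^\ell_t(a_*) = \sum_{s=1}^{t}\binom{\ell+1}{s}P^{(s)}_t(a_*)$, with $P^{(s)}_t(a_*)$ the sum of all words $a_{j_1}\cdots a_{j_s}$ with $j_i > 0$ and $j_1+\cdots+j_s = t$, and it encodes a non-commutative polynomial as a dictionary from words (tuples of indices) to coefficients; the function names are hypothetical. Setting every $a_j$ equal to 1 sums the coefficients, and since each word contributes exactly one monomial $\alpha_1^{\,t}$ after substituting $a_1 = \alpha_1$, $a_j = \alpha_j + \alpha_1^{\,j}$ ($j>1$), that sum is the coefficient of $\alpha_1^{\,t}$ in $Q^\ell_t$.

```python
from math import comb


def compositions(t, s):
    """Yield all (j_1, ..., j_s) with every j_i >= 1 and j_1 + ... + j_s = t."""
    if s == 0:
        if t == 0:
            yield ()
        return
    for first in range(1, t - s + 2):
        for rest in compositions(t - first, s - 1):
            yield (first,) + rest


def Q(l, t):
    """Q^l_t(a_*) as {word: coefficient}; the word (j_1, ..., j_s) stands for a_{j_1} ... a_{j_s}."""
    poly = {}
    for s in range(1, t + 1):
        coeff = comb(l + 1, s)
        if coeff:
            for word in compositions(t, s):
                poly[word] = coeff
    return poly


def value_at_ones(poly):
    """Evaluate at a_1 = a_2 = ... = 1, i.e. add up all coefficients."""
    return sum(poly.values())
```

In every case tried, `value_at_ones(Q(l, t))` agrees with `comb(l + t, t)`, which is the coefficient subtracted in the definition of $Z^\ell_t(\alpha_*)$.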
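When $\ell > 0$ and $t > 1$, we can prove the claim using two independent methods.

First method. The very definitions imply that the following recurrence formula holds:
$$Q^\ell_t(a_*) = Q^{\ell-1}_t(a_*) + \sum_{s=1}^{t-1}Q^{\ell-1}_{t-s}(a_*)\cdot a_s + a_t \qquad \forall\ \ell\ge 1,\ t\ge 2.$$
From this formula and from the identities $a_1 = \alpha_1$, $a_s = \alpha_s + \alpha_1^{\,s}$ ($s\ge 2$), we argue
$$\begin{aligned}
Z^\ell_t(\alpha_*) &:= Q^\ell_t(a_*) - \binom{\ell+t}{t}a_1^{\,t} = Q^{\ell-1}_t(a_*) + \sum_{s=1}^{t-1}Q^{\ell-1}_{t-s}(a_*)\,a_s + a_t - \binom{\ell+t}{t}a_1^{\,t}\\
&= Z^{\ell-1}_t(\alpha_*) + \binom{\ell-1+t}{t}a_1^{\,t} + \sum_{s=1}^{t-1}\Big(Z^{\ell-1}_{t-s}(\alpha_*) + \binom{\ell-1+t-s}{t-s}a_1^{\,t-s}\Big)a_s + a_t - \binom{\ell+t}{t}a_1^{\,t}\\
&= Z^{\ell-1}_t(\alpha_*) + \sum_{s=1}^{t-1}Z^{\ell-1}_{t-s}(\alpha_*)\,a_s + \sum_{s=2}^{t-1}\binom{\ell-1+t-s}{t-s}\alpha_1^{\,t-s}\alpha_s + \alpha_t\\
&\quad + \Big(\sum_{r=0}^{t}\binom{\ell-1+r}{\ell-1} - \binom{\ell+t}{t}\Big)\alpha_1^{\,t}\\
&= Z^{\ell-1}_t(\alpha_*) + \sum_{s=1}^{t-1}Z^{\ell-1}_{t-s}(\alpha_*)\,a_s + \sum_{s=2}^{t-1}\binom{\ell-1+t-s}{t-s}\alpha_1^{\,t-s}\alpha_s + \alpha_t
\end{aligned}$$
because of the classical identity $\binom{\ell+t}{t} = \sum_{r=0}^{t}\binom{\ell-1+r}{\ell-1}$. Here the terms in $\alpha_1^{\,t}$ come from $a_t$ (the summand $r = 0$), from $\binom{\ell-1+t-s}{t-s}a_1^{\,t-s}a_s$ for $1\le s\le t-1$ (the summands $r = t-s$), and from $\binom{\ell-1+t}{t}a_1^{\,t}$ (the summand $r = t$). Then induction upon $\ell$ and the very definitions (recall that $a_s\in\Theta_s$, while $\alpha_s\in\Theta_{s-1}$ for $s\ge 2$) allow us to argue that all summands in the final sum belong to $\Theta_{t-1}$, hence $Z^\ell_t(\alpha_*)\in\Theta_{t-1}$ as well. Finally, this implies $Q^\ell_t(a_*) = Z^\ell_t(\alpha_*) + \binom{\ell+t}{t}\alpha_1^{\,t}\in\Theta_t\setminus\Theta_{t-1}$.

Second method. By definition
$$Q^\ell_t(a_*) := \sum_{s=1}^{t}\binom{\ell+1}{s}P^{(s)}_t(a_*) = \sum_{s=1}^{t}\binom{\ell+1}{s}\sum_{\substack{j_1,\dots,j_s>0\\ j_1+\cdots+j_s=t}}a_{j_1}\cdots a_{j_s};$$
then expanding the $a_j$'s (for $j>1$) as above we find that $Q^\ell_t(a_*) = Q^\ell_t\big(\alpha_*+\alpha_1^{\,*}\big)$ is a linear combination of monomials $\alpha_{(j_1)}\cdots\alpha_{(j_s)}$ with $j_1,\dots,j_s>0$, $j_1+\cdots+j_s = t$, $\alpha_{(j_r)}\in\big\{\alpha_{j_r},\alpha_1^{\,j_r}\big\}$ for all $r$. Let $Q_-$ be the linear combination of those monomials such that $\big(\alpha_{(j_1)},\alpha_{(j_2)},\dots,\alpha_{(j_s)}\big)\ne\big(\alpha_1^{\,j_1},\alpha_1^{\,j_2},\dots,\alpha_1^{\,j_s}\big)$; the remaining monomials enjoy $\alpha_{(j_1)}\cdot\alpha_{(j_2)}\cdots\alpha_{(j_s)} = \alpha_1^{\,j_1+\cdots+j_s} = \alpha_1^{\,t}$, so their linear combination giving $Q_+ := Q^\ell_t(a_*) - Q_-$ is a multiple of $\alpha_1^{\,t}$, say $Q_+ = N\alpha_1^{\,t}$. Now we compute this $N$.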
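By construction, $N$ is nothing but $N = Q^\ell_t(1_*) = Q^\ell_t(1,1,\dots,1,\dots)$, where the latter is the value of $Q^\ell_t$ when all indeterminates are set equal to 1; thus we compute $Q^\ell_t(1_*)$. Recall that the $Q^\ell_t$'s enter in the definition of the coproduct of $F[G^{\mathrm{dif}}]$: the latter is dual to the (composition) product of series in $G^{\mathrm{dif}}$, thus if $\{a_n\}_{n\in\mathbb{N}_+}$ and $\{b_n\}_{n\in\mathbb{N}_+}$ are two countable sets of commutative indeterminates then
$$\Big(x+\sum_{n=1}^{+\infty}a_nx^{n+1}\Big)\circ\Big(x+\sum_{m=1}^{+\infty}b_mx^{m+1}\Big) := \Big(x+\sum_{m=1}^{+\infty}b_mx^{m+1}\Big) + \sum_{n=1}^{+\infty}a_n\Big(x+\sum_{m=1}^{+\infty}b_mx^{m+1}\Big)^{n+1} = x+\sum_{k=1}^{+\infty}c_kx^{k+1}$$
with $c_k = Q^0_k(b_*) + \sum_{r=1}^{k}a_r\cdot Q^r_{k-r}(b_*)$ (cf. §1.1). Specializing $a_\ell = 1$ and $a_r = 0$ for all $r\ne\ell$ we get $c_{t+\ell} = Q^0_{t+\ell}(b_*) + Q^\ell_t(b_*) = b_{t+\ell} + Q^\ell_t(b_*)$. In particular setting $b_* = 1_*$ we have that $1+Q^\ell_t(1_*)$ is the coefficient $c_{\ell+t}$ of $x^{\ell+t+1}$ in the series
$$\begin{aligned}
\big(x+x^{\ell+1}\big)\circ\Big(x+\sum_{m=1}^{+\infty}x^{m+1}\Big) &= \big(x+x^{\ell+1}\big)\circ\big(x\cdot(1-x)^{-1}\big) = x\cdot(1-x)^{-1} + \big(x\cdot(1-x)^{-1}\big)^{\ell+1}\\
&= \sum_{m=0}^{+\infty}x^{m+1} + x^{\ell+1}\Big(\sum_{m=0}^{+\infty}x^m\Big)^{\ell+1} = \sum_{m=0}^{+\infty}x^{m+1} + x^{\ell+1}\sum_{n=0}^{+\infty}\binom{\ell+n}{n}x^n\\
&= \sum_{s=0}^{\ell-1}x^{s+1} + \sum_{s=\ell}^{+\infty}\Big(1+\binom{s}{\ell}\Big)x^{s+1};
\end{aligned}$$
therefore $1+Q^\ell_t(1_*) = c_{\ell+t} = 1+\binom{\ell+t}{t}$, whence $Q^\ell_t(1_*) = \binom{\ell+t}{t}$. As an alternative approach, one can prove that $Q^\ell_t(1_*) = \binom{\ell+t}{t}$ by induction using the recurrence formula $Q^\ell_t(x_*) = Q^{\ell-1}_t(x_*) + \sum_{s=1}^{t-1}Q^{\ell-1}_{t-s}(x_*)\,x_s + x_t$ and the identity $\binom{\ell+t}{t} = \sum_{s=0}^{t}\binom{\ell-1+s}{\ell-1}$.

The outcome is $N = Q^\ell_t(1_*) = \binom{\ell+t}{t}$ (for all $t$, $\ell$), thus $Q^\ell_t(a_*) - \binom{\ell+t}{t}a_1^{\,t} = Q_- + Q_+ - \binom{\ell+t}{t}a_1^{\,t} = Q_- + N a_1^{\,t} - \binom{\ell+t}{t}a_1^{\,t} = Q_-$. Now, by definition $\tau(\alpha_{j_r}) = j_r - 1$ (for $j_r > 1$) and $\tau\big(\alpha_1^{\,j_r}\big) = j_r$. Therefore if $\alpha_{(j_r)}\in\big\{\alpha_{j_r},\alpha_1^{\,j_r}\big\}$ (for all $r = 1,\dots,s$) and $\big(\alpha_{(j_1)},\alpha_{(j_2)},\dots,\alpha_{(j_s)}\big)\ne\big(\alpha_1^{\,j_1},\alpha_1^{\,j_2},\dots,\alpha_1^{\,j_s}\big)$, then $\tau\big(\alpha_{(j_1)}\cdots\alpha_{(j_s)}\big)\le j_1+\cdots+j_s-1 = t-1$. Then by construction $\tau(Q_-)\le t-1$, whence, since $Z^\ell_t(\alpha_*) := Q^\ell_t(a_*) - \binom{\ell+t}{t}a_1^{\,t} = Q_-$, we get also $\tau\big(Z^\ell_t(\alpha_*)\big)\le t-1$, i.e. $Z^\ell_t(\alpha_*)\in\Theta_{t-1}$, so $Q^\ell_t(a_*) = Z^\ell_t(\alpha_*) + \binom{\ell+t}{t}\alpha_1^{\,t}\in\Theta_t\setminus\Theta_{t-1}$.

Proposition 4.1. $\Theta$ is a Hopf algebra filtration of $\mathcal{H}$.

Proof. By construction (cf. §4.3) $\Theta$ is an algebra filtration; so to check it is Hopf too we are left only to show that ($\star$) $\Delta(\Theta_n)\subseteq\sum_{r+s=n}\Theta_r\otimes\Theta_s$ (for all $n\in\mathbb{N}$), for then $S(\Theta_n)\subseteq\Theta_n$ (for all $n$) will follow from that by recurrence (and Hopf algebra axioms). By definition $\Theta_0 = k\cdot 1_{\mathcal{H}}$; then $\Delta(1_{\mathcal{H}}) = 1_{\mathcal{H}}\otimes 1_{\mathcal{H}}$ proves ($\star$) for $n = 0$. For $n = 1$, by definition $\Theta_1$ is the direct sum of $\Theta_0$ with the (free) Lie (sub)algebra (of $\mathcal{H}$) generated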
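by $\{\alpha_1,\alpha_2\}$. Since $\Delta(\alpha_1) = \alpha_1\otimes 1 + 1\otimes\alpha_1$ and $\Delta(\alpha_2) = \alpha_2\otimes 1 + 1\otimes\alpha_2$ and
$$\Delta\big([x,y]\big) = \big[\Delta(x),\Delta(y)\big] = \sum_{(x),(y)}\Big([x_{(1)},y_{(1)}]\otimes x_{(2)}y_{(2)} + y_{(1)}x_{(1)}\otimes[x_{(2)},y_{(2)}]\Big)$$
(for all $x,y\in\mathcal{H}$) we argue ($\star$) for $n = 1$ too. Moreover, for every $n > 1$ (setting $Q^n_0(a_*) = 1 = a_0$ for short) we have
$$\Delta(\alpha_n) = \Delta(a_n) - \Delta\big(a_1^{\,n}\big) = \sum_{k=0}^{n}a_k\otimes Q^k_{n-k}(a_*) - \sum_{k=0}^{n}\binom{n}{k}a_1^{\,k}\otimes a_1^{\,n-k} = \sum_{k=2}^{n}\alpha_k\otimes Q^k_{n-k}(a_*) + \sum_{k=0}^{n-1}\alpha_1^{\,k}\otimes Z^k_{n-k}(\alpha_*),$$
and therefore $\Delta(\alpha_n)\in\sum_{r+s=n-1}\Theta_r\otimes\Theta_s$ thanks to Lemma 4.1 (and to $\alpha_m\in\Theta_{m-1}$ for $m > 1$). Finally, as $\Delta\big([x,y]\big) = \big[\Delta(x),\Delta(y)\big] = \sum_{(x),(y)}\big([x_{(1)},y_{(1)}]\otimes x_{(2)}y_{(2)} + y_{(1)}x_{(1)}\otimes[x_{(2)},y_{(2)}]\big)$ and similarly $\Delta(xy) = \Delta(x)\Delta(y) = \sum_{(x),(y)}x_{(1)}y_{(1)}\otimes x_{(2)}y_{(2)}$ (for $x,y\in\mathcal{H}$), we have that $\Delta$ does not increase $(\partial_- - d)$: as $\Theta$ is exactly the (algebra) filtration induced by $(\partial_- - d)$, it is a Hopf algebra filtration as well.

Lemma 4.2 (Notation of §4.3).
(a) $\kappa(a)\le\partial(a)$ for every $a\in\mathcal{H}\setminus\{0\}$ which is $\partial(a)$–homogeneous.
(b) $\kappa(aa')\le\kappa(a)+\kappa(a')$ and $\kappa\big([a,a']\big) < \kappa(a)+\kappa(a')$ for all $a,a'\in\mathcal{H}\setminus\{0\}$.
(c) $\kappa(\alpha_n) = \partial_-(\alpha_n) = \tau(\alpha_n)$ for all $n\in\mathbb{N}_\nu$.
(d) $\kappa\big([\alpha_r,\alpha_s]\big) = \partial_-(\alpha_r)+\partial_-(\alpha_s)-1 = \tau\big([\alpha_r,\alpha_s]\big)$ for all $r,s\in\mathbb{N}_\nu$ with $r\ne s$.
(e) $\kappa(\alpha_b) = \partial_-(\alpha_b) - d(\alpha_b) = \tau(\alpha_b)$ for every $b\in B_\nu$.
(f) $\kappa(\alpha_{b_1}\alpha_{b_2}\cdots\alpha_{b_\ell}) = \tau(\alpha_{b_1}\alpha_{b_2}\cdots\alpha_{b_\ell})$ for all $b_1,b_2,\dots,b_\ell\in B_\nu$.
(g) $\kappa\big([\alpha_{b_1},\alpha_{b_2}]\big) = \kappa(\alpha_{b_1})+\kappa(\alpha_{b_2})-1 = \tau\big([\alpha_{b_1},\alpha_{b_2}]\big)$, for all $b_1,b_2\in B_\nu$.

Proof. (a) Let $a\in\mathcal{H}\setminus\{0\}$ be $\partial(a)$–homogeneous. Since $\mathcal{H}$ is graded, we have $\partial\big(\delta_\ell(a)\big) = \partial(a)$ for all $\ell$; moreover, $\delta_\ell(a)\in J^{\otimes\ell}$ (with $J := \mathrm{Ker}(\epsilon_{\mathcal{H}})$) by definition, and $\partial(y) > 0$ for each $\partial$–homogeneous $y\in J\setminus\{0\}$. Then $\delta_\ell(a) = 0$ for all $\ell > \partial(a)$, whence the claim.

(b) Let $a\in D_m$, $b\in D_n$: then $ab\in D_{m+n}$ by property (c) in §3.2. Similarly, we have $[a,b]\in D_{m+n-1}$ because of property (d) in §3.2. The claim follows.

(c) By part (a) we have $\kappa(a_n)\le\partial(a_n) = n$. Moreover, by definition $\delta_2(a_n) = \sum_{k=1}^{n-1}a_k\otimes Q^k_{n-k}(a_*)$, thus $\delta_n(a_n) = (\delta_{n-1}\otimes\delta_1)\big(\delta_2(a_n)\big) = \sum_{k=1}^{n-1}\delta_{n-1}(a_k)\otimes\delta_1\big(Q^k_{n-k}(a_*)\big)$ by coassociativity.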
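Since $\delta_\ell(a_m) = 0$ for $\ell > m$, $Q^{n-1}_1(a_*) = n\,a_1$ and $\delta_1(a_1) = a_1$, we have $\delta_n(a_n) = \delta_{n-1}(a_{n-1})\otimes(n\,a_1)$, thus by induction $\delta_n(a_n) = n!\,a_1^{\otimes n}$ ($\ne 0$), whence $\kappa(a_n) = n$. But also $\delta_n\big(a_1^{\,n}\big) = n!\,a_1^{\otimes n}$. Thus $\delta_n(\alpha_n) = \delta_n(a_n) - \delta_n\big(a_1^{\,n}\big) = 0$ for $n > 1$. Clearly $\kappa(\alpha_1) = 1$. For the general case, for all $\ell\ge 2$ we have
$$\delta_{\ell-1}(a_\ell) = (\delta_{\ell-2}\otimes\delta_1)\big(\delta_2(a_\ell)\big) = \sum_{k=1}^{\ell-1}\delta_{\ell-2}(a_k)\otimes\delta_1\big(Q^k_{\ell-k}(a_*)\big),$$
which by the previous analysis gives
$$\delta_{\ell-1}(a_\ell) = \delta_{\ell-2}(a_{\ell-2})\otimes\Big((\ell-1)\,a_2 + \binom{\ell-1}{2}a_1^{\,2}\Big) + \delta_{\ell-2}(a_{\ell-1})\otimes\ell\,a_1 = (\ell-1)!\cdot a_1^{\otimes(\ell-2)}\otimes\Big(a_2 + \frac{\ell-2}{2}\,a_1^{\,2}\Big) + \ell\cdot\delta_{\ell-2}(a_{\ell-1})\otimes a_1.$$
Iterating we get, for all $\ell\ge 2$ (with $\binom{1}{2} := 0$, and changing indices)
$$\delta_{\ell-1}(a_\ell) = \sum_{m=1}^{\ell-1}\frac{\ell!}{m+1}\cdot a_1^{\otimes(m-1)}\otimes\Big(a_2 + \frac{m-1}{2}\cdot a_1^{\,2}\Big)\otimes a_1^{\otimes(\ell-1-m)}.$$
On the other hand, we have also $\delta_{\ell-1}\big(a_1^{\,\ell}\big) = \sum_{m=1}^{\ell-1}\frac{\ell!}{2}\cdot a_1^{\otimes(m-1)}\otimes a_1^{\,2}\otimes a_1^{\otimes(\ell-1-m)}$. Therefore, for $\delta_{n-1}(\alpha_n) = \delta_{n-1}(a_n) - \delta_{n-1}\big(a_1^{\,n}\big)$ (for all $n\in\mathbb{N}_\nu$, $n\ge 2$) the outcome is
$$\delta_{n-1}(\alpha_n) = \sum_{m=1}^{n-1}\frac{n!}{m+1}\cdot a_1^{\otimes(m-1)}\otimes\big(a_2 - a_1^{\,2}\big)\otimes a_1^{\otimes(n-1-m)} = \sum_{m=1}^{n-1}\frac{n!}{m+1}\cdot\alpha_1^{\otimes(m-1)}\otimes\alpha_2\otimes\alpha_1^{\otimes(n-1-m)}; \tag{4.2}$$
in particular $\delta_{n-1}(\alpha_n)\ne 0$, whence $\alpha_n\notin D_{n-2}$ and so $\kappa(\alpha_n) = n-1$, q.e.d.

(d) Let $r\ne 1\ne s$. From (b)–(c) we get $\kappa\big([\alpha_r,\alpha_s]\big) < \kappa(\alpha_r)+\kappa(\alpha_s) = r+s-2$. In addition, we prove that $\delta_{r+s-3}\big([\alpha_r,\alpha_s]\big)\ne 0$, yielding (d). Property (d) in §3.2 gives
$$\delta_{r+s-3}\big([\alpha_r,\alpha_s]\big) = \sum_{\substack{\Lambda\cup Y=\{1,\dots,r+s-3\}\\ \Lambda\cap Y\ne\emptyset}}\big[\delta_\Lambda(\alpha_r),\delta_Y(\alpha_s)\big] = \sum_{\substack{\Lambda\cup Y=\{1,\dots,r+s-3\}\\ \Lambda\cap Y\ne\emptyset,\ |\Lambda|=r-1,\ |Y|=s-1}}\big[j_\Lambda\big(\delta_{r-1}(\alpha_r)\big),\,j_Y\big(\delta_{s-1}(\alpha_s)\big)\big].$$
Using (4.2) in the form $\delta_{\ell-1}(\alpha_\ell) = \frac{\ell!}{2}\cdot\alpha_2\otimes\alpha_1^{\otimes(\ell-2)} + \alpha_1\otimes\eta_\ell$ (for some $\eta_\ell\in\mathcal{H}^{\otimes(\ell-2)}$), and counting how many $\Lambda$'s and $Y$'s exist with $1\in\Lambda$ and $\{1,2\}\subseteq Y$, and (conversely) how many of them exist with $\{1,2\}\subseteq\Lambda$ and $1\in Y$, we argue
$$\delta_{r+s-3}\big([\alpha_r,\alpha_s]\big) = c_{r,s}\cdot[\alpha_2,\alpha_1]\otimes\alpha_2\otimes\alpha_1^{\otimes(r+s-5)} + \alpha_1\otimes\varphi_1 + \alpha_2\otimes\varphi_2 + [\alpha_2,\alpha_1]\otimes\alpha_1\otimes\psi$$
for some $\varphi_1,\varphi_2\in\mathcal{H}^{\otimes(r+s-4)}$, $\psi\in\mathcal{H}^{\otimes(r+s-5)}$, and with
$$c_{r,s} = \frac{r!}{2}\cdot\frac{s!}{3}\cdot\binom{r+s-5}{r-2} - \frac{s!}{2}\cdot\frac{r!}{3}\cdot\binom{s+r-5}{s-2} = \frac{2}{3}\binom{r}{2}\binom{s}{2}(s-r)\,(r+s-5)!\ \ne 0.$$
In particular $\delta_{r+s-3}\big([\alpha_r,\alpha_s]\big) = c_{r,s}\cdot[\alpha_2,\alpha_1]\otimes\alpha_2\otimes\alpha_1^{\otimes(r+s-5)} + \text{l.i.t.}$, where "l.i.t." stands for some further terms which are linearly independent of $[\alpha_2,\alpha_1]\otimes\alpha_2\otimes\alpha_1^{\otimes(r+s-5)}$, and $c_{r,s}\ne 0$. Then $\delta_{r+s-3}\big([\alpha_r,\alpha_s]\big)\ne 0$, q.e.d.

Finally, if $r > 1 = s$ (and similarly if $r = 1 < s$) things are simpler. Indeed, again (b) and (c) together give $\kappa\big([\alpha_r,\alpha_1]\big) < \kappa(\alpha_r)+\kappa(\alpha_1) = (r-1)+1 = r$, and we prove that $\delta_{r-1}\big([\alpha_r,\alpha_1]\big)\ne 0$. Like before, property (d) in §3.2 gives (since $\delta_1(\alpha_1) = \alpha_1$)
$$\begin{aligned}
\delta_{r-1}\big([\alpha_r,\alpha_1]\big) &= \sum_{\substack{\Lambda\cup Y=\{1,2,\dots,r-1\}\\ \Lambda\cap Y\ne\emptyset,\ |\Lambda|=r-1,\ |Y|=1}}\big[\delta_\Lambda(\alpha_r),\delta_Y(\alpha_1)\big] = \sum_{k=1}^{r-1}\big[\delta_{r-1}(\alpha_r),\,1^{\otimes(k-1)}\otimes\alpha_1\otimes 1^{\otimes(r-1-k)}\big]\\
&= \sum_{m=1}^{r-1}\frac{r!}{m+1}\cdot\alpha_1^{\otimes(m-1)}\otimes[\alpha_2,\alpha_1]\otimes\alpha_1^{\otimes(r-1-m)}\ \ne 0.
\end{aligned}$$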
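(e) We perform induction upon $d(b)$: the case $d(b) < 2$ is dealt with in parts (c) and (d), thus we assume $d(b) \geq 2$, so $b = [b', x_\ell]$ for some $\ell \in \mathbb{N}_\nu$ and some $b' \in B_\nu$ with $d(b') = d(b) - 1$; then $\tau(\alpha_b) = \tau\big([\alpha_{b'},\alpha_\ell]\big) = \tau(\alpha_{b'}) + \tau(\alpha_\ell) - 1$, directly from definitions. Moreover $\tau(\alpha_\ell) = \kappa(\alpha_\ell)$ by part (c), and $\tau(\alpha_{b'}) = \kappa(\alpha_{b'})$ by inductive assumption. From (b) we have $\kappa(\alpha_b) = \kappa\big([\alpha_{b'},\alpha_\ell]\big) \leq \kappa(\alpha_{b'}) + \kappa(\alpha_\ell) - 1 = \tau(\alpha_{b'}) + \tau(\alpha_\ell) - 1 = \tau(\alpha_b)$, i.e. $\kappa(\alpha_b) \leq \tau(\alpha_b)$; we must prove the converse, for which it is enough to show
\[
  \delta_{\tau(\alpha_b)}(\alpha_b) = c_b \cdot \underbrace{[\cdots[[\alpha_1,\alpha_2],\alpha_2],\dots,\alpha_2]}_{d(b)+1} \otimes\, \alpha_2 \otimes \alpha_1^{\otimes(\tau(\alpha_b)-2)} + \text{l.i.t.} \tag{4.3}
\]
for some $c_b \in k \setminus \{0\}$, where "l.i.t." means the same as before. Since $\tau(\alpha_b) = \tau\big([\alpha_{b'},\alpha_\ell]\big) = \tau(\alpha_{b'}) + \ell - 2$, using property (d) in §3.2 we get
\[
\begin{aligned}
  \delta_{\tau(\alpha_b)}(\alpha_b) &= \delta_{\tau(\alpha_b)}\big([\alpha_{b'},\alpha_\ell]\big)
   = \sum_{\substack{\Lambda \cup Y = \{1,\dots,\tau(\alpha_b)\} \\ \Lambda \cap Y \neq \emptyset}} \big[\delta_\Lambda(\alpha_{b'}),\ \delta_Y(\alpha_\ell)\big] \\
  &= \sum_{\substack{\Lambda \cup Y = \{1,\dots,\tau(\alpha_b)\},\ \Lambda \cap Y \neq \emptyset \\ |\Lambda| = \tau(\alpha_{b'}),\ |Y| = \ell-1}} \Big[\, j_\Lambda\big(\delta_{\tau(\alpha_{b'})}(\alpha_{b'})\big),\ j_Y\big(\delta_{\ell-1}(\alpha_\ell)\big) \Big] \\
  &= \sum_{\substack{\Lambda \cup Y = \{1,\dots,\tau(\alpha_b)\},\ \Lambda \cap Y \neq \emptyset \\ |\Lambda| = \tau(\alpha_{b'}),\ |Y| = \ell-1}} \Big[\, j_\Lambda\Big( c_{b'} \underbrace{[\cdots[\alpha_1,\alpha_2],\dots,\alpha_2]}_{d(b')+1} \otimes\, \alpha_2 \otimes \alpha_1^{\otimes(\tau(\alpha_{b'})-2)} \Big),\ j_Y\Big( \tfrac{\ell!}{2}\, \alpha_2 \otimes \alpha_1^{\otimes(\ell-2)} \Big) \Big] + \text{l.i.t.} \\
  &= c_{b'} \cdot \frac{\ell!}{2} \cdot \binom{\tau(\alpha_b)-2}{\ell-2} \cdot \underbrace{[[\cdots[[\alpha_2,\alpha_1],\alpha_2],\dots,\alpha_2],\alpha_2]}_{d(b')+1+1 \,=\, d(b)+1} \otimes\, \alpha_2 \otimes \alpha_1^{\otimes(\tau(\alpha_b)-2)} + \text{l.i.t.}
\end{aligned}
\]
(using induction about $\alpha_{b'}$); this proves (4.3) with $c_b = c_{b'} \cdot \frac{\ell!}{2} \cdot \binom{\tau(\alpha_b)-2}{\ell-2} \neq 0$. Thus (4.3) holds, yielding $\delta_{\tau(\alpha_b)}(\alpha_b) \neq 0$, hence $\kappa(\alpha_b) \geq \tau(\alpha_b)$, q.e.d.

(f) The case $\ell = 1$ is proved by part (e), so we can assume $\ell > 1$. By part (b) and the case $\ell = 1$ we have $\kappa(\alpha_{b_1}\alpha_{b_2}\cdots\alpha_{b_\ell}) \leq \sum_{i=1}^{\ell} \kappa(\alpha_{b_i}) = \sum_{i=1}^{\ell} \tau(\alpha_{b_i}) = \tau(\alpha_{b_1}\alpha_{b_2}\cdots\alpha_{b_\ell})$; so we must only prove the converse inequality.

We begin with $\ell = 2$ and $d(b_1) = d(b_2) = 0$, so $\alpha_{b_1} = \alpha_r$, $\alpha_{b_2} = \alpha_s$, for some $r, s \in \mathbb{N}_\nu$. If $r = s = 1$ then $\kappa(\alpha_r) = \kappa(\alpha_s) = \kappa(\alpha_1) = 1$, by part (c). Then $\delta_2(\alpha_1\alpha_1) = \delta_2(a_1 a_1) = (\mathrm{id} - \epsilon)^{\otimes 2}\big(\Delta(a_1^2)\big) = 2 \cdot a_1 \otimes a_1 = 2 \cdot \alpha_1 \otimes \alpha_1 \neq 0$, so that $\kappa(\alpha_1\alpha_1) \geq 2 = \kappa(\alpha_1) + \kappa(\alpha_1)$, hence $\kappa(\alpha_1\alpha_1) = \kappa(\alpha_1) + \kappa(\alpha_1)$, q.e.d.

If $r > 1 = s$ (and similarly if $r = 1 < s$) then $\kappa(\alpha_r) = r - 1$, $\kappa(\alpha_s) = \kappa(\alpha_1) = 1$, by part (c). Then property (d) in §3.2 gives $\delta_r(\alpha_r\alpha_1) =$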
δ (α r )δY (α 1 ) =
r
r! m+1
m=1 k<m
∪Y ={1,...,r} | |=r−1,|Y |=1 ⊗(k−1) × α1 ⊗ 1 ⊗ α 1⊗(m−1−k) ⊗ α 2 ⊗ α 1⊗(r−1−m)
r × 1⊗(k−1) ⊗ α 1 ⊗ 1⊗(r−k) +
r! ⊗(m−1) α1 m+1 m=1 k>m ⊗α 2 ⊗ α 1⊗(k−1−m) ⊗ 1 ⊗ α 1⊗(r−1−k) × 1⊗(k−1) ⊗ α 1 ⊗ 1⊗(r−k) r r! = · α 1⊗(m−1) ⊗ α 2 ⊗ α 1⊗(r−1−m) = 0 m+1 m=1
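so that $\kappa(\alpha_r\alpha_1) \geq r = \kappa(\alpha_r) + \kappa(\alpha_1)$, hence $\kappa(\alpha_r\alpha_1) = \kappa(\alpha_r) + \kappa(\alpha_1)$, q.e.d.

Finally let $r, s > 1$ (and $r \neq s$). Then $\kappa(\alpha_r) = r - 1$, $\kappa(\alpha_s) = s - 1$, by part (c); then property (d) in §3.2 gives
\[
  \delta_{r+s-2}(\alpha_r\alpha_s)
  = \sum_{\substack{\Lambda \cup Y = \{1,\dots,r+s-2\} \\ |\Lambda| = r-1,\ |Y| = s-1}} \delta_\Lambda(\alpha_r) \cdot \delta_Y(\alpha_s)
  = \sum_{\substack{\Lambda \cup Y = \{1,\dots,r+s-2\} \\ |\Lambda| = r-1,\ |Y| = s-1}} j_\Lambda\big(\delta_{r-1}(\alpha_r)\big) \cdot j_Y\big(\delta_{s-1}(\alpha_s)\big).
\]
Using (4.2) in the form $\delta_{t-1}(a_t) = \frac{t!}{2} \cdot \alpha_2 \otimes \alpha_1^{\otimes(t-2)} + \alpha_1 \otimes \eta_t$ (for some $\eta_t \in H$ and $t \in \{r, s\}$) and counting how many $\Lambda$'s and $Y$'s exist with $1 \in \Lambda$ and $2 \in Y$ and vice versa (actually, it is a matter of counting $(r-2, s-2)$-shuffles) we argue
\[
  \delta_{r+s-2}(\alpha_r\alpha_s) = e_{r,s} \cdot \alpha_2 \otimes \alpha_2 \otimes \alpha_1^{\otimes(r+s-4)} + \alpha_1 \otimes \varphi
\]
for some $\varphi \in H^{\otimes(r+s-3)}$, with
\[
  e_{r,s} = \frac{r!}{2} \cdot \frac{s!}{2} \cdot \left( \binom{r+s-4}{r-2} + \binom{s+r-4}{s-2} \right) = \frac{r!\,s!}{2} \cdot \binom{r+s-4}{r-2} \neq 0 .
\]
In particular $\delta_{r+s-2}(\alpha_r\alpha_s) = e_{r,s} \cdot \alpha_2 \otimes \alpha_2 \otimes \alpha_1^{\otimes(r+s-4)} + \text{l.i.t.}$, where "l.i.t." stands again for some further terms which are linearly independent of $\alpha_2 \otimes \alpha_2 \otimes \alpha_1^{\otimes(r+s-4)}$, and $e_{r,s} \neq 0$. Then $\delta_{r+s-2}(\alpha_r\alpha_s) \neq 0$, so $\kappa(\alpha_r\alpha_s) \geq r + s - 2 = \kappa(\alpha_r) + \kappa(\alpha_s)$, q.e.d.

Now let again $\ell = 2$ but $d(b_1), d(b_2) > 0$. Set $\kappa_i := \kappa(\alpha_{b_i})$ for $i = 1, 2$. Applying (4.3) to $b = b_1$ and $b = b_2$ (and recalling that $\tau \equiv \kappa$) gives
\[
\begin{aligned}
  \delta_{\kappa_1+\kappa_2}(\alpha_{b_1}\alpha_{b_2})
  &= \sum_{\Lambda \cup Y = \{1,\dots,\kappa_1+\kappa_2\}} \delta_\Lambda(\alpha_{b_1})\, \delta_Y(\alpha_{b_2})
   = \sum_{\substack{\Lambda \cup Y = \{1,\dots,\kappa_1+\kappa_2\} \\ |\Lambda| = \kappa_1,\ |Y| = \kappa_2}} j_\Lambda\big(\delta_{\kappa_1}(\alpha_{b_1})\big)\, j_Y\big(\delta_{\kappa_2}(\alpha_{b_2})\big) \\
  &= \sum_{\substack{\Lambda \cup Y = \{1,\dots,\kappa_1+\kappa_2\} \\ |\Lambda| = \kappa_1,\ |Y| = \kappa_2}}
     j_\Lambda\Big( c_{b_1} \cdot \underbrace{[\cdots[[\alpha_1,\alpha_2],\alpha_2],\dots,\alpha_2]}_{d(b_1)+1} \otimes\, \alpha_2 \otimes \alpha_1^{\otimes(\kappa_1-2)} + \text{l.i.t.} \Big) \\
  &\qquad\qquad \times\, j_Y\Big( c_{b_2} \cdot \underbrace{[\cdots[[\alpha_1,\alpha_2],\alpha_2],\dots,\alpha_2]}_{d(b_2)+1} \otimes\, \alpha_2 \otimes \alpha_1^{\otimes(\kappa_2-2)} + \text{l.i.t.} \Big) \\
  &= 2\, c_{b_1} c_{b_2} \binom{\kappa_1+\kappa_2-4}{\kappa_1-2} \cdot \underbrace{[\cdots[\alpha_1,\alpha_2],\dots,\alpha_2]}_{d(b_1)+1} \otimes \underbrace{[\cdots[\alpha_1,\alpha_2],\dots,\alpha_2]}_{d(b_2)+1} \otimes\, \alpha_2^{\otimes 2} \otimes \alpha_1^{\otimes(\kappa_1+\kappa_2-4)} + \text{l.i.t.}
\end{aligned}
\]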
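which proves the claim for $\ell = 2$. In addition, we can take this last result as the basis of induction (on $\ell$) to prove the following: for all $b := (b_1,\dots,b_\ell) \in B_\nu^{\,\ell}$, one has
\[
  \delta_{|\kappa|}\Big( \prod_{i=1}^{\ell} \alpha_{b_i} \Big) = c_b \cdot \Big( \bigotimes_{i=1}^{\ell} \underbrace{[\cdots[[\alpha_1,\alpha_2],\alpha_2],\dots,\alpha_2]}_{d(b_i)+1} \Big) \otimes \alpha_2^{\otimes \ell} \otimes \alpha_1^{\otimes(|\kappa|-2\ell)} + \text{l.i.t.} \tag{4.4}
\]
for some $c_b \in k \setminus \{0\}$, with $|\kappa| := \sum_{i=1}^{\ell} \kappa_i$ and $\kappa_i := \kappa(\alpha_{b_i})$ ($i = 1,\dots,\ell$). The induction step, from $\ell$ to $(\ell+1)$, amounts to computing (with $\kappa_{\ell+1} := \kappa(\alpha_{b_{\ell+1}})$)
\[
\begin{aligned}
  \delta_{|\kappa|+\kappa_{\ell+1}}\big(\alpha_{b_1}\cdots\alpha_{b_\ell}\cdot\alpha_{b_{\ell+1}}\big)
  &= \sum_{\Lambda \cup Y = \{1,\dots,|\kappa|+\kappa_{\ell+1}\}} \delta_\Lambda(\alpha_{b_1}\cdots\alpha_{b_\ell})\, \delta_Y(\alpha_{b_{\ell+1}}) \\
  &= \sum_{\substack{\Lambda \cup Y = \{1,\dots,|\kappa|+\kappa_{\ell+1}\} \\ |\Lambda| = |\kappa|,\ |Y| = \kappa_{\ell+1}}} j_\Lambda\big(\delta_{|\kappa|}(\alpha_{b_1}\cdots\alpha_{b_\ell})\big) \cdot j_Y\big(\delta_{\kappa_{\ell+1}}(\alpha_{b_{\ell+1}})\big) \\
  &= \sum_{\substack{\Lambda \cup Y = \{1,\dots,|\kappa|+\kappa_{\ell+1}\} \\ |\Lambda| = |\kappa|,\ |Y| = \kappa_{\ell+1}}}
     j_\Lambda\Big( c_b \cdot \Big( \bigotimes_{i=1}^{\ell} \underbrace{[\cdots[[\alpha_1,\alpha_2],\alpha_2],\dots,\alpha_2]}_{d(b_i)+1} \Big) \otimes \alpha_2^{\otimes \ell} \otimes \alpha_1^{\otimes(|\kappa|-2\ell)} + \text{l.i.t.} \Big) \\
  &\qquad\qquad \times\, j_Y\Big( c_{b_{\ell+1}} \cdot \underbrace{[\cdots[[\alpha_1,\alpha_2],\alpha_2],\dots,\alpha_2]}_{d(b_{\ell+1})+1} \otimes\, \alpha_2 \otimes \alpha_1^{\otimes(\kappa_{\ell+1}-2)} + \text{l.i.t.} \Big) \\
  &= c_b\, c_{b_{\ell+1}} \cdot (\ell+1) \binom{|\kappa|+\kappa_{\ell+1}-2(\ell+1)}{|\kappa|-2\ell} \times \Big( \bigotimes_{i=1}^{\ell} \underbrace{[\cdots[[\alpha_1,\alpha_2],\alpha_2],\dots,\alpha_2]}_{d(b_i)+1} \Big) \\
  &\qquad\qquad \otimes \underbrace{[\cdots[[\alpha_1,\alpha_2],\alpha_2],\dots,\alpha_2]}_{d(b_{\ell+1})+1} \otimes\, \alpha_2^{\otimes(\ell+1)} \otimes \alpha_1^{\otimes(|\kappa|+\kappa_{\ell+1}-2(\ell+1))} + \text{l.i.t.}
\end{aligned}
\]
which proves (4.4) for $(b, b_{\ell+1})$ with $c_{(b,b_{\ell+1})} = c_b\, c_{b_{\ell+1}} \cdot (\ell+1) \binom{|\kappa|+\kappa_{\ell+1}-2(\ell+1)}{|\kappa|-2\ell} \neq 0$. Finally, (4.4) yields $\delta_{|\kappa|}(\alpha_{b_1}\cdots\alpha_{b_\ell}) \neq 0$, so $\kappa(\alpha_{b_1}\cdots\alpha_{b_\ell}) \geq \kappa(\alpha_{b_1}) + \cdots + \kappa(\alpha_{b_\ell})$, q.e.d.

(g) Part (d) proves the claim for $d(b_1) = d(b_2) = 0$, that is $b_1, b_2 \in \{x_n\}_{n \in \mathbb{N}_\nu}$. Moreover, when $b_2 = x_n \in \{x_m\}_{m \in \mathbb{N}_\nu}$ we can replicate the proof of part (d) to show that $\kappa\big([\alpha_{b_1},\alpha_{b_2}]\big) = \kappa\big([\alpha_{b_1},\alpha_n]\big) = \partial_-\big([\alpha_{b_1},\alpha_n]\big) - d\big([\alpha_{b_1},\alpha_n]\big)$: but the latter is exactly $\tau\big([\alpha_{b_1},\alpha_{b_2}]\big)$, q.e.d. Everything is similar if $b_1 = x_n \in \{x_m\}_{m \in \mathbb{N}_\nu}$. Now let $b_1, b_2 \in B_\nu \setminus \{x_n\}_{n \in \mathbb{N}_\nu}$. Then (b) gives $\kappa\big([\alpha_{b_1},\alpha_{b_2}]\big) \leq \kappa(\alpha_{b_1}) + \kappa(\alpha_{b_2}) - 1 = \tau\big([\alpha_{b_1},\alpha_{b_2}]\big)$. Applying (4.3) to $b = b_1$ and $b = b_2$ we get, for $\kappa_i := \kappa(\alpha_{b_i})$ ($i = 1, 2$)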
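\[
\begin{aligned}
  \delta_{\kappa_1+\kappa_2-1}\big([\alpha_{b_1},\alpha_{b_2}]\big)
  &= \sum_{\substack{\Lambda \cup Y = \{1,\dots,\kappa_1+\kappa_2-1\} \\ \Lambda \cap Y \neq \emptyset}} \big[\delta_\Lambda(\alpha_{b_1}),\ \delta_Y(\alpha_{b_2})\big] \\
  &= \sum_{\substack{\Lambda \cup Y = \{1,\dots,\kappa_1+\kappa_2-1\} \\ |\Lambda| = \kappa_1,\ |Y| = \kappa_2}}
     \Big[\, j_\Lambda\Big( c_{b_1} \cdot \underbrace{[\cdots[[\alpha_2,\alpha_1],\alpha_2],\dots,\alpha_2]}_{d(b_1)+1} \otimes\, \alpha_2 \otimes \alpha_1^{\otimes(\kappa_1-2)} + \text{l.i.t.} \Big), \\
  &\qquad\qquad\quad j_Y\Big( c_{b_2} \cdot \underbrace{[\cdots[[\alpha_2,\alpha_1],\alpha_2],\dots,\alpha_2]}_{d(b_2)+1} \otimes\, \alpha_2 \otimes \alpha_1^{\otimes(\kappa_2-2)} + \text{l.i.t.} \Big) \Big] \\
  &= 2\, c_{b_1} c_{b_2} \binom{\kappa_1+\kappa_2-4}{\kappa_1-2} \Big[ \underbrace{[\cdots[\alpha_2,\alpha_1],\dots,\alpha_2]}_{d(b_1)+1},\ \underbrace{[\cdots[\alpha_2,\alpha_1],\dots,\alpha_2]}_{d(b_2)+1} \Big] \otimes \alpha_2^{\otimes 2} \otimes \alpha_1^{\otimes(\kappa_1+\kappa_2-4)} + \text{l.i.t.}
\end{aligned}
\]
(note that $d(b_i) \geq 1$ because $b_i \notin \{ x_n \mid n \in \mathbb{N}_\nu \}$ for $i = 1, 2$). In particular this means $\delta_{\kappa_1+\kappa_2-1}\big([\alpha_{b_1},\alpha_{b_2}]\big) \neq 0$, thus $\kappa\big([\alpha_{b_1},\alpha_{b_2}]\big) \geq \kappa(\alpha_{b_1}) + \kappa(\alpha_{b_2}) - 1 = \tau\big([\alpha_{b_1},\alpha_{b_2}]\big)$.

Lemma 4.3. Let $V$ be a $k$–vector space, and $\psi \in \mathrm{Hom}_k(V, V \wedge V)$. Let $L(V)$ be the free Lie algebra over $V$, and $\psi_{dL} \in \mathrm{Hom}_k\big(L(V), L(V) \wedge L(V)\big)$ the unique extension of $\psi$ from $V$ to $L(V)$ by derivations, i.e. such that $\psi_{dL}\big|_V = \psi$ and
\[
  \psi_{dL}\big([x,y]\big) = \big[x \otimes 1 + 1 \otimes x,\ \psi_{dL}(y)\big] + \big[\psi_{dL}(x),\ y \otimes 1 + 1 \otimes y\big] = x.\psi_{dL}(y) - y.\psi_{dL}(x)
\]
in the $L(V)$–module $L(V) \wedge L(V)$, $\forall\, x, y \in L(V)$. Let $K := \mathrm{Ker}(\psi)$: then $\mathrm{Ker}(\psi_{dL}) = L(K)$, the free Lie algebra over $K$.

Proof. Standard, by universal arguments (for a direct proof see [Ga2], Lemma 10.15).

Lemma 4.4. The Lie cobracket $\delta$ of $U(L_\nu)$ preserves $\tau$. That is, for each $\vartheta \in U(L_\nu)$ in the expansion $\delta_2(\vartheta) = \sum_{b_1, b_2 \in B} c_{b_1,b_2}\, \alpha_{b_1} \otimes \alpha_{b_2}$ (w.r.t. the basis $B \otimes B$, where $B$ is a PBW basis as in §1.1 w.r.t. some total order of $B_\nu$) we have $\tau(\hat b_1) + \tau(\hat b_2) = \tau(\vartheta)$ for some $\hat b_1, \hat b_2$ with $c_{\hat b_1, \hat b_2} \neq 0$, so $\tau\big(\delta(\vartheta)\big) := \max\big\{ \tau(b_1) + \tau(b_2) \,\big|\, c_{b_1,b_2} \neq 0 \big\} = \tau(\vartheta)$ if $\delta(\vartheta) \neq 0$.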
Proof. It follows from Proposition 4.1 that τ δ(ϑ) ≤ τ (ϑ); so δ : U (Lν ) −→ U (Lν )⊗2 is a morphism of filtered algebras, hence it naturally induces a morphism of graded alge ⊗2 bras δ : GΘ U (Lν ) −−−→ GΘ U (Lν ) . Thus proving the claim is equivalent to showing that Ker δ = GΘ∩Ker(δ) Ker(δ) =: Ker(δ), the latter being embedded into GΘ U (Lν ) . By construction, τ (xy − yx) = τ [x, y] < τ (x) + τ (y) for x, y ∈ U (Lν ), so GΘ U (Lν ) is commutative: indeed, it is clearly isomorphic – as an algebra – to S(Vν ), the symmetric algebra over Vν . Moreover, δ acts as a derivation, that is δ(xy) = δ(x)(y) + (x)δ(y) (for all x, y ∈ U (Lν )), thus the same holds for δ too. Like in Lemma 4.3, since GΘ U (Lν ) is generated by GΘ∩Lν (Lν ) =: Lν it follows that Ker δ is the free (associative sub)algebra over Ker δ L , in short ν
Ker δ = Ker δ L . Now, by definition δ(xn ) = n−1 =1 ( + 1)x ∧ xn− (cf. Theν orem 2.1) is τ – homogeneous, of τ – degree equal to τ (x n ) = n − 1. As δ also enjoys δ [x, y] = x ⊗ 1 + 1 ⊗ x, δ(y) + δ(x), y ⊗ 1 + 1 ⊗ y (for x, y ∈ Lν ) we have that δ L is even τ – homogeneous, i.e. such that τ δ(z) = τ (z), for any τ ν homogeneous z ∈ Lν such that δ(z) = 0; this implies that the induced map δ L enjoys ν δ L ϑ = 0 ⇐⇒ δ(ϑ) = 0 for any ϑ ∈ Lν , whence Ker δ L = Ker δ L . On ν ν ν the upshot we get Ker δ = Ker δ L = Ker δ L = Ker(δ). q.e.d ν
ν
Proposition 4.2. D = Θ, that is Dn = Θn for all n ∈ N, or κ = τ . Therefore, given any total order in Bν , the set A≤n = A ∩ Θn = A ∩ Dn of ordered monomials A≤n = α b = α b1 · · · α bk k ∈ N, b1 , . . . , bk ∈ Bν , b1 · · · bk , τ (b) ≤ n is a k–basis of Dn , and An := A≤n mod Dn−1 is a k–basis of Dn Dn−1 (∀n ∈ N). Proof. Both claims to D = Θ. Also, An := about the A≤n ’s and An ’s are equivalent A ≤n mod Dn−1 = A≤n \ A≤n−1 mod Dn−1 , with A≤n \ A≤n−1 = α b ∈ Aτ (b) = n . By Lemma 4.2(f) we have A≤n = A Θn ⊆ A Dn ⊆ Dn ; since A is a basis, A≤n is linearly independent and is a k–basis of Θn (by definition): so Θn ⊆ Dn for all n ∈ N. n = 0. By definition D0 := Ker(δ1 ) = k · 1H =: Θ0 , spanned by A≤0 = {1H }, q.e.d. n = 1. Let η ∈ D1 := Ker(δ2 ). Let B be a PBW-basis of H∨= U (Lν ) as in Lemma 4.4; expanding η w.r.t. A we have η = α b ∈A cb α b = b∈B cb α b . Then η := η − τ (b)≤1 cb α b = τ (b)>1 cb α b ∈ D1 , since α b ∈ A1 ⊆ Θ1 ⊆ D1 for τ (b) ≤ 1. Now, α 1 := a1 and α s := as − a1s = xs + s−1 x1s for all s ∈ Nν \ {1} yield cb α b = g(b) cb xb + χb ∈ H∨ η= b∈B τ (b)>1
b∈B τ (b)>1
for some χb ∈ H∨ : hereafter we set g(b) := k for each b = b1· · · bk ∈ B (i.e. g(b) is the degree of b as a monomial in the bi ’s). If η = 0, let g0 := min g(b)τ (b) > 1, cb = 0 ; then g0 > 0, η+ := −g0 η ∈ H∨ \ H∨ and
0 = η+ = cb x b = cb xb ∈ H∨ H∨ = U (Lν ). g(b)=g0
g(b)=g0
Now δ2 (η) = 0 yields δ2 η+ = 0, thus g(b)=g0 cb xb = η+ ∈ P U (Lν ) = Lν ; therefore all PBW monomials occurring in the last sum do belongto B ν (and g0 = 1). In addition, δ2 (η) = 0 also implies δ2 (η+ ) = 0 which yields also δ η+ = 0 for the Lie cobracket δ of Lν arising as the semiclassical limit of H∨ (see Theorem 2.1); therefore η+ = b∈Bν cb xb is an element of Lν killed by the Lie cobracket δ, i.e. η+ ∈ Ker(δ). Now we apply Lemma 4.3 to V = Vν , L(V ) = L(Vν ) =: Lν and ψ = δ V , so that ν ψd L = δ. By the formulas for δ in Theorem 2.1 we get K := Ker(ψ) = Ker δ V =
ν
Span {x1 , x2 } , hence L(K) = L Span {x1 , x2 } = Span xb b ∈ Bν ; τ (b) = 1 , thus eventually (via Theorem 2.1) Ker(δ) = L(K) = Span xb b ∈ Bν ; τ (b) = 1 . As η+ ∈ Ker(δ) = Span xb b ∈ Bν ; τ (b) = 1 , we have η+ = b∈Bν ,τ (b)=1 cb xb ; but cb = 0 whenever τ (b) ≤ 1, by construction of η: thus η+ = 0, a contradiction. The outcome is η = 0, whence finally η ∈ Θ1 , q.e.d. n > 1. We must show that induction Dm = Θm that Dn = Θn , while assuming by for all m < n. Let η = b∈B cb α b ∈ Dn ; then τ (η) = max τ (b)cb = 0 . If δ2 (η) = 0 then η ∈ D1 = Θ1 by the previous analysis, and we’re done. Otherwise, δ2 (η) = 0 and τ δ2 (η) = τ (η) by Lemma 4.4. On the other hand, since D is a Hopf algebra filtration we have δ2 (η) ∈ r+s=n Dr ⊗ Ds = r+s=n Θr ⊗ Θs , thanks to the induction; but r,s>0 r,s>0 then τ δ2 (η) ≤ n, by definition of τ . Thus τ (η) = τ δ2 (η) ≤ n, which means η ∈ Θn . α b := κ(α b ) α b = τ (b) α b . Theorem 4.1. For any b ∈ Bν set (a) The set of ordered monomials ≤n := A α b := α b1 · · · α bk k ∈ N, b1 , . . . , bk ∈ B, b1 · · · bk , κ(α b ) = τ (b) ≤ n
:= is a k[]–basis of Dn = Dn H = n Dn . So A n∈N A≤n is a k[]–basis of H . (b) H = k[] α b b∈B α b2 − α b1 , α [b1 ,b2 ] ∀b1 , b2 ∈ Bν . ν
(c) H is a graded Hopf of H .
k[]–subalgebra (d) H := H H = H = F ΓLν , where ΓLν is a connected Poisson alge=0 braic group with cotangent Lie bialgebra isomorphic to Lν (as a Lie algebra) with the graded Lie bialgebra structure given by δ(xn ) = (n − 2)xn−1 ∧ x1 (for all n ∈ Nν ). is the free Poisson (commutative) algebra over Nν , generated by Indeed, H =0 αn (n ∈ Nν ) with Hopf structure given (for all n ∈ Nν ) by all the α¯ n := =0
n−1 n−1 n α¯ n = α¯ n ⊗ 1 + 1 ⊗ α¯ n + α¯ k ⊗ α¯ n−k + (k + 1)α¯ k1 ⊗ α¯ n−k , 1 k k=2 k=1 n−1 n−1 k n n−k S α¯ n = −α¯ n − S α¯ k α¯ 1 − (k + 1)S α¯ 1 α¯ n−k , α¯ n = 0. k k=2
Thus H
=0
k=1
is the polynomial algebra k {ηb }b∈Bν generated by a set of indeter-
B minates {ηb }b∈Bν in bijection with Bν , so ΓLν ∼ = Ak ν as algebraic varieties. Finally, H = F ΓLν = k {ηb }b∈Bν is a graded Poisson Hopf algebra
=0
w.r.t. the grading ∂(α¯ n ) = n (inherited from H ) and w.r.t. the grading induced from κ = τ (on H), and a graded algebra w.r.t. the (polynomial) grading d(α¯ n ) = 1 (for all n ∈ N+ ). (e) The analogues of statements (a)–(d) hold with K instead of H, with X + instead of X for all X = Lν , Bν , Nν , and with ΓL+ν instead of ΓLν .
Proof. (a) This follows from Proposition 4.2 and the definition of H in §4.2. (b) This is a direct consequence of claim (a) and Lemma 4.2(g). (c) Thanks to claims (a) and (b), we can look at H as a Poisson algebra, whose Poisson bracket is given by {x, y} := −1 [x, y] = −1 (xy − yx) (for all x, y ∈ H ); then H itself is the free associative Poisson algebra generated α n n ∈ N . Clearly by is a Poisson map, therefore it is enough to prove that α n ∈ H ⊗ H for all n ∈ N+ . This is clear for α 1 and α 2 which are primitive; as for n > 2, we have, like in Proposition 4.1, n n−1 k k−1 α k ⊗ n−k Qkn−k (a∗ ) + k α 1k ⊗ n−k−1 Zn−k (α ∗ ) αn =
=
k=2 n
n−1
k=0
k=2
k=0
α k ⊗ n−k Qkn−k (a∗ ) +
k α k1 ⊗ n−k−1 Zn−k (α ∗ ) ∈ H ⊗ H
(4.5)
thanks to Lemma 4.1 (with notations used therein). In addition, S H ⊆ H also follows by induction from (4.5) because Hopf algebra axioms along with (4.5) give n−1 n−1 n−k k k n−k−1 k S α k Qn−k (a∗ ) − S α1 Zn−k (α ∗ ) ∈ H S α n = − αn − k=2
k=1
for all n ∈ Nν (using induction).The claim follows. (d) Thanks to (a) and (b), H is a polynomial k–algebra as claimed, over the set =0 of indeterminates α¯ b := α b =0 ∈ H =0 . Furthermore, in the proof of (c) b∈Bν we noticed that H is also the free Poisson algebra generated by α n n ∈ N ; therefore H is the free commutative Poisson algebra generated by α¯ n := αˇ xn =0 n∈N . =0
Then formula (4.5) – for all n ∈ Nν – describes uniquely the Hopf structure of H , hence the formula it yields at = 0 will describe the Hopf structure of H =0 . in (a) we find a sum of terms Expanding n−k Qkn−k (a∗ ) in (4.5) w.r.t. the basis A of τ –degree less than or equal to (n − k), and the sole one achieving equality is α n−k 1 , n k n−k−1 which occurs with coefficient k : similarly, when expanding Zn−k (α ∗ ) in (4.5) all summands have τ –degree less than or equal to (n − k − 1), and equality w.r.t. A holds only for α n−k , whose coefficient is (k + 1). Therefore for some η ∈ H =0 we have
n n−1 n n−k αk ⊗ α1 + (k + 1) α k1 ⊗ α n−k + η; αn = k k=2
k=0
this yields the formula for, from which the formula for S follows too as usual. Finally, let Γ := Spec H =0 be the algebraic Poisson group such that F Γ = H =0 , and let γ ν := coLie(Γ ) be its cotangent Lie bialgebra. Since H =0 is Pois son free over α¯ n n∈N , as a Lie algebra γ ν is free over dn := α¯ n mod m2 n∈N ν ν (where m := JH |=0 ), so γ ν ∼ = Lν , via dn → xn (n ∈ N+ ) as a Lie algebra. The Lie cobracket is
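\[
\begin{aligned}
  \delta_{\gamma_\nu}(d_n) &= (\Delta - \Delta^{op})(\bar\alpha_n) \bmod \mathfrak{m}_\otimes \\
  &= \sum_{k=2}^{n-1} \binom{n}{k}\, \bar\alpha_k \wedge \bar\alpha_1^{\,n-k} + \sum_{k=1}^{n-1} (k+1)\, \bar\alpha_1^{\,k} \wedge \bar\alpha_{n-k} \bmod \mathfrak{m}_\otimes \\
  &= \binom{n}{n-1}\, \bar\alpha_{n-1} \wedge \bar\alpha_1 + 2\, \bar\alpha_1 \wedge \bar\alpha_{n-1} \bmod \mathfrak{m}_\otimes \\
  &= (n-2)\, \bar\alpha_{n-1} \wedge \bar\alpha_1 \bmod \mathfrak{m}_\otimes = (n-2)\, d_{n-1} \wedge d_1 \in \gamma_\nu \otimes \gamma_\nu ,
\end{aligned}
\]
where $\mathfrak{m}_\otimes := \mathfrak{m}^2 \otimes H'_\hbar\big|_{\hbar=0} + \mathfrak{m} \otimes \mathfrak{m} + H'_\hbar\big|_{\hbar=0} \otimes \mathfrak{m}^2$, whence $\Gamma = \Gamma_{L_\nu}$ as claimed in (d). Finally, the statements about gradings of $H'_\hbar\big|_{\hbar=0} = F[\Gamma_{L_\nu}]$ hold by construction.

(e) This should be clear from the whole discussion, since all arguments apply again, mutatis mutandis, when starting with K instead of H; we leave details to the reader.

5. Drinfeld's Deformation $(H'_\hbar)^\vee$

5.1. The goal. Like in §3.1, there is a second step in the crystal duality principle which builds another deformation based upon the Rees deformation $H'_\hbar$. This will be again a Hopf $k[\hbar]$–algebra, namely $(H'_\hbar)^\vee$, which specializes to $H'$ for $\hbar = 1$ and for $\hbar = 0$ instead specializes to $U(\mathfrak{k}_-)$, for some Lie bialgebra $\mathfrak{k}_-$. In other words, $(H'_\hbar)^\vee\big|_{\hbar=1} = H'$ and $(H'_\hbar)^\vee\big|_{\hbar=0} = U(\mathfrak{k}_-)$, the latter meaning that $(H'_\hbar)^\vee$ is a quantized universal enveloping algebra (QUEA in the sequel). Thus $(H'_\hbar)^\vee$ is a quantization of $U(\mathfrak{k}_-)$, and the quantum symmetry $H'$ is a deformation of the classical Poisson symmetry $U(\mathfrak{k}_-)$.

The general theory describes explicitly the relationship between $\mathfrak{k}_-$ and $\Gamma_{L_\nu}$ in §4, which is $\mathfrak{k}_- = \gamma_\nu := \mathrm{coLie}(\Gamma_{L_\nu}) \cong L_\nu$ (with the structure in Theorem 4.1(d)), the cotangent Lie bialgebra of $\Gamma_{L_\nu}$. Thus, from this and §4 we see that the quantum symmetry encoded by $H'$ is (also) intermediate between the two classical Poisson symmetries ruled by $\Gamma_{L_\nu}$ and $\gamma_\nu$.

In this section I describe explicitly $(H'_\hbar)^\vee$ and its semiclassical limit $U(\mathfrak{k}_-)$, hence $\mathfrak{k}_-$ itself too. This provides a direct proof of the above mentioned results on $(H'_\hbar)^\vee$ and $\mathfrak{k}_-$.

5.2. Drinfeld's algebra $(H'_\hbar)^\vee$. Let $J := J_{H'_\hbar}$, and define
\[
  (H'_\hbar)^\vee := \sum_{n \in \mathbb{N}} \hbar^{-n} J^n = \sum_{n \in \mathbb{N}} \big(\hbar^{-1} J\big)^n \ \big(\subseteq H(\hbar)\big). \tag{5.1}
\]
Now I describe $(H'_\hbar)^\vee$ and its specializations at $\hbar = 1$ and $\hbar = 0$. The main step is

Theorem 5.1. For any $b \in B_\nu$ set $\check\alpha_b := \hbar^{\kappa(\alpha_b)-1} \alpha_b = \hbar^{\tau(b)-1} \alpha_b = \hbar^{-1} \tilde\alpha_b$.

(a) $(H'_\hbar)^\vee = k[\hbar]\big\langle \{\check\alpha_b\}_{b \in B_\nu} \big\rangle \Big/ \Big( \big\{ [\check\alpha_{b_1}, \check\alpha_{b_2}] - \check\alpha_{[b_1,b_2]} \ \big|\ \forall\, b_1, b_2 \in B_\nu \big\} \Big)$.

(b) $(H'_\hbar)^\vee$ is a graded Hopf $k[\hbar]$–subalgebra of $H_\hbar$.

(c) $(H'_\hbar)^\vee\big|_{\hbar=0} := (H'_\hbar)^\vee \big/ \hbar\, (H'_\hbar)^\vee \cong U(L_\nu)$ as co-Poisson Hopf algebras, where $L_\nu$ bears the Lie bialgebra structure given by $\delta(x_n) = (n-2)\, x_{n-1} \wedge x_1$ (for all $n \in \mathbb{N}_\nu$). Finally, the grading $d$ given by $d(x_n) := 1$ ($n \in \mathbb{N}_+$) makes $(H'_\hbar)^\vee\big|_{\hbar=0} \cong U(L_\nu)$ into a graded co-Poisson Hopf algebra, and the grading $\partial$ given by $\partial(x_n) := n$ ($n \in \mathbb{N}_+$) makes $(H'_\hbar)^\vee\big|_{\hbar=0} \cong U(L_\nu)$ into a graded Hopf algebra and $L_\nu$ into a graded Lie bialgebra.

(d) The analogues of statements (a)–(c) hold with K, $L_\nu^+$, $B_\nu^+$ and $\mathbb{N}_\nu^+$ respectively instead of H, $L_\nu$, $B_\nu$ and $\mathbb{N}_\nu$.

Proof. (a) This follows from Theorem 4.1(b) and the very definition of $(H'_\hbar)^\vee$ in §5.2.

(b) This is a direct consequence of claim (a) and Theorem 4.1(c).

(c) It follows from claim (a) that mapping $\check\alpha_b\big|_{\hbar=0} \mapsto b$ ($\forall\, b \in B_\nu$) yields a well-defined algebra isomorphism $(H'_\hbar)^\vee\big|_{\hbar=0} \xrightarrow{\ \cong\ } U(L_\nu)$. In addition, when expanding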
n−k Qkn−k (a∗ ) in (4.5) w.r.t. the basis A (see Proposition 4.2) we find a sum of terms of τ –degree less than or equal to (n − k), and equality is achieved only for α n−k 1 , which k (α ) in (4.5) yields occurs with coefficient nk : similarly, the expansion of n−k−1 Zn−k ∗ a sum of terms whose τ –degree is less than or equal to (n − k − 1), with equality only α s = αˇ s (s ∈ N+ ) we for α n−k , whose coefficient is (k + 1). Thus using the relation get n−1 n−1 k αˇ n = αˇ n ⊗ 1 + 1 ⊗ αˇ n + (α ∗ ) αˇ k ⊗ n−k Qkn−k (a∗ ) + αˇ k1 ⊗ n−1 Zn−k k=2 n−1
k=1
n−1 n n−k k n−k αˇ k ⊗ αˇ 1 + (k + 1)αˇ k1 ⊗ αˇ n−k + 2 η = αˇ n ⊗ 1 + 1 ⊗ αˇ n + k k=2 k=1 = αˇ n ⊗ 1 + 1 ⊗ αˇ n + nαˇ n−1 ⊗ αˇ 1 + 2αˇ 1 ⊗ αˇ n−1 + 2 χ
∨ ∨ for some η, χ ∈ H ⊗ H . It follows that αˇ n =0 = αˇ n =0 ⊗1+1⊗ αˇ n =0 for all n ∈ Nν . Similarly we have S αˇ n =0 = −αˇ n =0 and αˇ n =0 = 0 for all n ∈ Nν , thus is an isomorphism of Hopf algebras too. In addition, the Poisson co ∨ ∨ inherited from H is given by bracket of H =0 ∨ ∨ −1 mod H ⊗ H δ αˇ n =0 = ( − op ) αˇ n ∨ ∨ = nαˇ n−1 ∧ αˇ 1+ 2αˇ 1 ∧ αˇ n−1 mod H ⊗ H = (n − 2)αˇ n−1 =0 ∧ αˇ 1 =0 , hence is also an isomorphism of co-Poisson Hopf algebras, as claimed. ∨ = U (Lν ) should be clear by construction. The statements on gradings of H =0 (d) This should be clear from the whole discussion, as all arguments apply again – mutatis mutandis – when starting with K instead of H; details are left to the reader. 5.3. Specialization limits. So far, Theorem ∨4.1(d) and Theorem 5.1(c) prove the follow ing specialization results for H and H respectively: →0 H −−−→F ΓLν
,
∨ →0 H −−−→U (Lν )
as graded Poisson or co-Poisson Hopf k–algebras. In addition, Theorem 4.1(b) implies →1
that H −−−→H = H as graded Hopf k–algebras. Indeed, by Theorem 4.1(b) H (or even H ) embeds as an algebra into H , via α n → α n (for all n ∈ Nν ): then [α n , α m ] → αn, α m = α [xn ,xm ] mod (−1)H α [xn ,xm ] ≡ ∀n, m ∈ Nν , := thus, thanks to the presentation of H in Theorem 4.1(b), H is isomorphic to H =1
α 1 =1 , α 2 =1 , . . . , α n , . . . , as a k–algebra, via α n → H ( − 1)H = k =1 α n =1 . Moreover, the Hopf structure of H is given by =1
n n−1 k α k ⊗ n−k Qkn−k (a∗ ) + α k1 ⊗ n−1 Zn−k α n =1 = (α ∗ ) mod ( − 1)H ⊗ H . k=2
k=0
Now, Qkn−k (a∗ ) = Qkn−k (α ∗ + α 1 ∗ ) = Qkn−k (α ∗ ) for some polynomial Qkn−k (α ∗ ) s,k in the α i ’s ; let Qkn−k (α ∗ ) = s Tn−k (α ∗ ) be the splitting of Qkn−k into τ –homogeneous s,k summands (i.e., each Tn−k (α ∗ ) is a homogeneous polynomial of τ –degree s): then n−k Qkn−k (a∗ ) = n−k Qkn−k (α ∗ ) = n−k
s,k Tn−k (α ∗ ) =
s
s,k n−k−s Tn−k ( α∗) s
s,k s,k ( α ∗ ) ≡ Tn−k ( α∗) with n−k−s > 0 for all s (by construction). Since clearly n−k−s Tn−k s,k k k n−k n−k n−k−s mod ( − 1)H , we find Qn−k (a∗ ) = Qn−k (α ∗ ) = s Tn−k ( α∗) ≡ s,k k ( T ( α ) mod ( − 1)H = Q α ), for all k and n. Similarly we argue that s n−k ∗ n−k ∗ k k n−1 α ∗ ) mod ( − 1)H , for all k and n. The outcome is that Zn−k (α ∗ ) ≡ Zn−k ( n α n =1 = α k ⊗ n−kQkn−k (α ∗ ) k=2 n−1
k α k1 ⊗ n−1 Zn−k (α ∗ ) mod ( − 1)H ⊗ H
+
k=2 n n−1 k = α∗) + ( α∗) α k ⊗ Qkn−k ( α k1 ⊗ Zn−k k=2
mod ( − 1)H ⊗ H .
k=0
On the other hand, we have (α n ) =
n
n−1 k k k k=2 α k ⊗Qn−k (α ∗ )+ k=0 α 1 ⊗Zn−k (α ∗ ) in ∼ = : H−−H given by α n → α n =1 =1
H. Thus the graded algebra isomorphism preserves the coproduct too. Similarly, respects the antipode and the counit, hence it is a graded Hopf algebra isomorphism. In a nutshell, we have (as graded Hopf k– ∨ →1 →1 algebras) H −−−→H = H. Similarly, Theorem 5.1 implies that H −−−→H as ∨ graded Hopf k–algebras. Indeed, Theorem 5.1(a) shows that H ∼ = k[] ⊗k U (Lν ) ∨ as graded associative algebras,via αˇ n → xn (n ∈ Nν ), in particular H is the free associative k[]–algebra over αˇ n n∈N ; then specialization yields a graded algebra ν isomorphism ∨
∨ ∼ ∨ = := H (−1) H −−H, αˇ n =1 → α n . : H =1
As for the Hopf structure, in $(H'_\hbar)^\vee\big|_{\hbar=1}$ it is given by
n n−1 k (α ∗ )=1 . αˇ k =1 ⊗ n−k Qkn−k (α ∗ )=1 + αˇ k1 =1 ⊗ n−2 Zn−k αˇ n =1 = k=2
k=0
s,k s,k As before, split Qkn−k (α ∗ ) as Qkn−k (α ∗ ) = s Tn−k (α ∗ ), and split each Tn−k ( α∗) s,k into homogeneous components w.r.t. the total degree in the α i ’s, say Tn−k ( α∗) = s,k s,k n−k−s+r s,k n−k−s T s,k ( n−k−s ˇ ∗ ), Y ( α ): then α ) = Y ( α ) = Y ∗ ∗ r,n (α r r,n r r,n r n−k ∗ ∨ s,k s,k n−k−s+r because α ∗ = αˇ ∗ . As Yr,n (αˇ ∗ ) ≡ Yr,n (αˇ ∗ ) mod ( − 1) H , we eventually get s,k n−k Qkn−k (α ∗ ) = n−k−s+r Yr,n (αˇ ∗ ) ≡
s,r
∨ s,k Yr,n (αˇ ∗ ) mod ( − 1) H = Qkn−k (a∗ )
s,r
for all k
k (α ) and n. Similarly n−1 Zn−k ∗
k (α ) mod ( − 1) H ∨ (∀k, n). Thus ≡ Zn−k ∗
n n−1 k (α ∗ )=1 αˇ k =1 ⊗ n−k Qkn−k (α ∗ )=1 + αˇ k1 =1 ⊗ n−2 Zn−k αˇ n =1 = k=2
=
n
k=0
αˇ k
k =1 ⊗ Qn−k
(α ∗ )
k=2
=1 +
n
n−1
k (α ∗ )=1 . αˇ k1 =1 ⊗ Zn−k
k=0
k k On the other hand, one has (α n ) = k=2 α k ⊗ Qkn−k (α ∗ ) + n−1 k=0 α 1 ⊗ Zn−k (α ∗ ) ∼ ∨ = in H, thus the algebra isomorphism : H −−H given by α n =1 → α n also =1 preserves the coproduct; similarly, it also respects the antipode and the counit, hence it is a graded Hopf algebra isomorphism. In a nutshell, we have (as graded Hopf k–algebras) ∨ →1 H −−−→H. Therefore we have filled in the bottom part of the diagram () in the Introduction, for H = H(:= Hν ), because H := ∪n∈N Dn = H by §4.2: namely, 0←→1 1←→0 H ←−−−−−−→ U (Lν ) F ΓLν ←−−−−−−→ ∨ H
(H )
where now in right-hand side Lν is given the Lie bialgebra structure of Theorems 4.1 and 5.1, and ΓLν is the corresponding dual Poisson group mentioned in Theorem 4.1. 6. Summary and Generalizations 6.1. Summary. The analysis in §§2–5 yields a complete description of the nontrivial deformations the Rees deformations H∨ and H and Drinfeld’s ∨ of H – namely ∨ deformations H and H – built out of the trivial deformation H . In particular g× G+ = ΓLν , g× G− = GLν , (6.1) − = Lν , δ• , + = Lν , δ∗ (with notation of ()) where δ• and δ∗ denote the Lie cobracket on Lν defined respectively in Theorem 2.1 and in Theorems 4.1 and 5.1. The next result shows that the four objects in (6.1) are really different, though they share some common features:
Theorem 6.1. (a) H∨ ∼ H as Hopf = = H as Poisson k[]–algebras, but H∨ ∼ k[]–algebras. (b) Lν , δ• ∼ Lν , δ∗ as Lie bialgebras. = Lν , δ∗ as Lie algebras, but Lν , δ• ∼ = (c) GLν ∼ ΓLν as (algebraic) groups. = ΓLν as (algebraic) Poisson varieties, but GLν ∼ = (d) The analogues of statements (a)–(c) hold with K and L+ ν instead of H and Lν . Proof. It follows from Theorem 3.1(a) that H∨ can be seen as a Poisson Hopf algebra, with Poisson bracket given by {x, y} := −1 [x, y] = −1 (xy x, y ∈ − yx) (for all ∨ ∨ H ); then H is the free Poisson algebra generated by bxn = xn = an n ∈ N ; n since an = α n + (1 −δ1,n)α 1n and α n = an − (1 − δ1,n )a1 (n ∈ N+ ) it is also (freely) Poisson-generated by α n ∈ N . We also saw that H is the free Poisson algebra over n α n n ∈ N ; thus mapping α n → α n (∀n ∈ N) does define a unique Poisson algebra ∨ ∼ = isomorphism : H −→H , given by α b := −d(b) α b → α b , for all b ∈ Bν . This proves the first half of (a), and then also (taking semiclassical limits and spectra) of (c). The group structure of either GLν or ΓLν yields a Lie cobracket onto the cotangent space at the unit point of the above, isomorphic Poisson varieties: this cotangent space identifies with Lν , and the two cobrackets are given respectively by δ• (xn ) = n−1 =1 ( + 1)x ∧ xn− for GLν (by Theorem 3.1) and by δ∗ (xn ) = (n − 2)xn−1 ∧ x1 for ΓLν (by Theorem 4.1), for all n ∈ Nν . It follows that Ker(δ• ) = {0} = Ker(δ∗ ), which implies that the two Lie coalgebra structures on Lν are not isomorphic. (b), This proves and also means that GLν ∼ F ΓLν as Hopf ΓLν as (algebraic) groups, hence F GLν ∼ = = k–algebras, and so H∨ ∼ = H as Hopf k[]–algebras, which ends the proof of (c) and (a) too. Finally, claim (d) should be clear: one applies the like arguments mutatis mutandis, and everything follows as before.
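The two cobracket formulas quoted in the proof above, $\delta_\bullet(x_n) = \sum_{\ell=1}^{n-1} (\ell+1)\, x_\ell \wedge x_{n-\ell}$ and $\delta_*(x_n) = (n-2)\, x_{n-1} \wedge x_1$, can be tabulated on the generators. The sketch below is illustrative only: the function names and the dictionary encoding of $x_i \wedge x_j$ (keys $(i,j)$ with $i < j$, using $x_i \wedge x_j = -x_j \wedge x_i$) are assumptions made here, and $\delta_*(x_1)$ is taken to be zero since $x_0$ is not a generator.

```python
from collections import defaultdict


def _add_wedge(acc, i, j, coeff):
    """Add coeff * (x_i ^ x_j) to acc, normalising to pairs with i < j."""
    if i == j or coeff == 0:
        return  # x_i ^ x_i = 0
    if i > j:
        i, j, coeff = j, i, -coeff  # antisymmetry of the wedge
    acc[(i, j)] += coeff


def _nonzero(acc):
    return {pair: c for pair, c in acc.items() if c != 0}


def delta_bullet(n):
    """delta_bullet(x_n) = sum_{l=1}^{n-1} (l+1) x_l ^ x_{n-l}."""
    acc = defaultdict(int)
    for l in range(1, n):
        _add_wedge(acc, l, n - l, l + 1)
    return _nonzero(acc)


def delta_star(n):
    """delta_star(x_n) = (n-2) x_{n-1} ^ x_1 (assumed zero for n = 1)."""
    acc = defaultdict(int)
    if n >= 2:
        _add_wedge(acc, n - 1, 1, n - 2)
    return _nonzero(acc)


for n in range(1, 7):
    print(n, delta_bullet(n), delta_star(n))
```

Collecting terms by hand gives $\delta_\bullet(x_n) = \sum_{\ell < n/2} (2\ell - n)\, x_\ell \wedge x_{n-\ell}$, whose $\ell = 1$ term is exactly $\delta_*(x_n)$. So the two formulas coincide on $x_1, \dots, x_4$ and first differ on $x_5$, where $\delta_\bullet$ carries the extra term $-x_2 \wedge x_3$.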
6.2. Generalizations. Plenty of features of $H = H^{dif}$ are shared by a whole bunch of graded Hopf algebras, which usually arose in connection with some physical problem or some (co)homological topic and all bear a nice combinatorial content; essentially, most of them can be described as "formal series" over indexing sets, replacing $\mathbb{N}$, of various (combinatorial) nature: planar trees (with or without labels), forests, graphs, Feynman diagrams, etc. Besides the ice-breaking examples in physics provided by Connes and Kreimer (cf. [CK1, CK2, CK3]), which are all commutative or cocommutative Hopf algebras, other non-commutative non-cocommutative examples (like the one of $H^{dif}$) are introduced in [BF], roughly through a "disabelianization process" applied to the commutative Hopf algebras of Connes and Kreimer. A very general analysis and wealth of examples in this context is due to Foissy (see [Fo1, Fo2, Fo3]), who also makes an interesting study of $\delta_\bullet$–maps and of the functor $H \mapsto H'$ ($H$ a Hopf $k$–algebra). Other examples, issued out of topological motivations, can be found in the works of Loday et al.: see e.g. [LR], and references therein.

When performing the like analysis, as we did for H, for a graded Hopf algebra $H$ of the aforementioned type, the arguments used for H apply essentially the same, up to minor changes, and give much the same results. To give an example, the Hopf algebras considered by Foissy are non-commutative polynomial, say $H = k\big\langle \{x_i\}_{i \in I} \big\rangle$ for some index set $I$: then one finds $H^\vee_\hbar\big|_{\hbar=0} = U(\mathfrak{g}_-) = U(L_I)$, where $L_I$ is the free Lie algebra over $I$.
This opens the way to apply the methods presented in this paper to all these graded Hopf algebras, of great interest for their applications in mathematical physics or in topology (or whatever); the simplest case of $H^{dif}$ plays the role of a toy model which realizes a clear and faithful pattern for many common features of all Hopf algebras of this kind.

Acknowledgements. The author thanks Alessandra Frabetti and Loïc Foissy for many helpful discussions.
References

[BF] Brouder, C., Frabetti, A.: Noncommutative renormalization for massless QED. Preprint, http://arxiv.org/abs/hep-th/0011161, 2000
[Ca] Carmina, R.: The Nottingham Group. In: M. Du Sautoy, D. Segal, A. Shalev (eds.), New Horizons in pro-p Groups, Progress in Math. 184, 2000, pp. 205–221
[CK1] Connes, A., Kreimer, D.: Hopf algebras, Renormalization and Noncommutative Geometry. Commun. Math. Phys. 199, 203–242 (1998)
[CK2] Connes, A., Kreimer, D.: Renormalization in quantum field theory and the Riemann-Hilbert problem I: the Hopf algebra structure of graphs and the main theorem. Commun. Math. Phys. 210, 249–273 (2000)
[CK3] Connes, A., Kreimer, D.: Renormalization in quantum field theory and the Riemann-Hilbert problem II: the β function, diffeomorphisms and the renormalization group. Commun. Math. Phys. 216, 215–241 (2001)
[Dr] Drinfeld, V.G.: Quantum groups. Proc. Intern. Congress of Math. (Berkeley, 1986), 1987, pp. 798–820
[Fo1] Foissy, L.: Les algèbres de Hopf des arbres enracinés décorés, I. Bull. Sci. Math. 126, 193–239 (2002)
[Fo2] Foissy, L.: Les algèbres de Hopf des arbres enracinés décorés, II. Bull. Sci. Math. 126, 249–288 (2002)
[Fo3] Foissy, L.: Finite dimensional comodules over the Hopf algebra of rooted trees. J. Algebra 255, 89–120 (2002)
[Ga1] Gavarini, F.: The quantum duality principle. Annales de l'Institut Fourier 52, 809–834 (2002)
[Ga2] Gavarini, F.: The global quantum duality principle: theory, examples, and applications. Preprint, http://arxiv.org/abs/math.QA/0303019, 2004
[Ga3] Gavarini, F.: The global quantum duality principle. To appear, 2004
[Ga4] Gavarini, F.: The Crystal Duality Principle: from Hopf Algebras to Geometrical Symmetries. Preprint, http://arxiv.org/abs/math.QA/0304164, 2003
[Je] Jennings, S.: Substitution groups of formal power series. Canadian J. Math. 6, 325–340 (1954)
[LR] Loday, J.-L., Ronco, M.O.: Hopf algebra of the planar binary trees. Adv. Math. 139, 293–309 (1998)
[Re] Reutenauer, C.: Free Lie Algebras. London Mathematical Society Monographs, New Series 7, New York: Oxford Science Publications, 1993
Communicated by A. Connes
Commun. Math. Phys. 253, 157–170 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1207-3
Communications in
Mathematical Physics
Algebro-Geometric Solution of the Discrete KP Equation over a Finite Field out of a Hyperelliptic Curve
Mariusz Białecki1, Adam Doliwa2
1 Instytut Geofizyki PAN, ul. Księcia Janusza 64, 01-452 Warszawa, Poland. E-mail: [email protected]
2 Wydział Matematyki i Informatyki, Uniwersytet Warmińsko–Mazurski, ul. Żołnierska 14A, 10-561 Olsztyn, Poland. E-mail: [email protected]
Received: 8 October 2003 / Accepted: 16 April 2004 Published online: 5 November 2004 – © Springer-Verlag 2004
Abstract: We transfer the algebro-geometric method of construction of solutions of the discrete KP equation to the finite field case. We emphasize the role of the Jacobian of the underlying algebraic curve in construction of the solutions. We illustrate in detail the procedure on example of a hyperelliptic curve. 1. Introduction Cellular automata are dynamical systems on a lattice with values being discrete (usually finite) as well. They are one of the more popular and distinctive classes of models of complex systems. Introduced in various contexts [28, 23] around 1950 they have found wide applications in different areas, from physics through chemistry and biology to social sciences [29]. One of the most interesting properties of cellular automata is that complex patterns can emerge from very simple uptade rules. However, usually one cannot easily predict how a given cellular automaton will behave without going through a number of time steps on a computer. Due to their completely discrete nature, cellular automata are naturally suitable for computer simulations, but also here it would be instructive to have examples of rules with large classes of analytical solutions, integrals of motions and other “integrable features”. The problem of construction of integrable cellular automata is not new and was undertaken in a number of papers (see, for example, [8, 4, 5, 26]). In particular in [27, 18] it was given a systematic method, called ultra-discretization, of obtaining cellular automaton version of a given discrete integrable system. Recently a new approach to integrable cellular automata was proposed in [7]. Its main idea is to keep the form of a given integrable discrete system but to transfer the algebro-geometric method of construction of its solutions [14, 1] from the complex field C to a finite field. This method, which in principle can be applied to any integrable discrete system with known algebro-geometric method of solution, has been applied to the
fully discrete 2D Toda system (the Hirota equation) and in [3] to discrete KP and KdV equations (in Hirota form). In particular, the finite field valued multisoliton solutions of these equations have been constructed. We remark that algebraic geometry over finite fields, although conceptually similar to that over the field of complex numbers [10], has its own tools and peculiarities [6, 21]. It is also nowadays very important in practical use in modern approaches to public key cryptography [13] and in the theory of error correcting codes [25].

The aim of this paper is to study in the finite field context a very distinguished example of an integrable system, the discrete KP equation. We present the algebro-geometric scheme of construction of its solutions in a finite field and we demonstrate its linearization on the level of the abstract Jacobi variety of the corresponding algebraic curve. We illustrate details of the construction in the example of a hyperelliptic curve.

We remark that in [4] it was observed that the Lax representation of the discrete sine-Gordon equation of Hirota [11] has a meaning also when the field of complex numbers is replaced by a finite field of the form $\mathbb{F}_{p^2}$, where $p$ is a prime number. The authors of [4] showed also that, in principle, the corresponding integrals of motion can be calculated. Finally, we note that the possibility of considering the soliton theory in positive characteristic has been anticipated in [22].

The organization of the paper is as follows. In Sect. 2 we first summarize the finite field version of Krichever’s construction of solutions of the discrete KP equation, then we present its abstract Jacobian picture. In Sect. 3 we apply the method starting from an algebraic curve of genus two.

2. The Finite Field Solution of the Discrete KP Equation out of Nonsingular Algebraic Curves

We first briefly recall the algebro-geometric construction of solutions of the discrete KP equation over finite fields [7, 3].
We discuss in addition a possible degeneracy of the linear problem and its consequences. Then we present the Jacobian picture of the construction, in which the integrable nature of the equation is evident. We point out some aspects of the representation which will help us to construct solutions of the equation effectively.

2.1. General construction. Consider an algebraic projective curve $C/K$ (or simply $C$), absolutely irreducible, nonsingular, of genus $g$, defined over the finite field $K = \mathbb{F}_q$ with $q$ elements, where $q$ is a power of a prime integer $p$ (see, for example [25, 10]). By $C(K)$ we denote the set of $K$-rational points of the curve. By $\overline{K}$ denote the algebraic closure of $K$, i.e., $\overline{K} = \bigcup_{\ell=1}^{\infty} \mathbb{F}_{q^\ell}$, and by $C(\overline{K})$ denote the corresponding infinite set (often identified with $C$) of $\overline{K}$-rational points of the curve. The action of the Galois group $G(\overline{K}/K)$ (of automorphisms of $\overline{K}$ which are the identity on $K$, see [17]) extends naturally to an action on $C(\overline{K})$. Let us choose:
1. four points $A_i \in C(K)$, $i = 0, 1, 2, 3$,
2. an effective $K$-rational divisor of order $g$, i.e., $g$ points $B_\gamma \in C(\overline{K})$, $\gamma = 1, \dots, g$, which satisfy the following $K$-rationality condition (the Galois group permutes the points among themselves):
\[ \forall \sigma \in G(\overline{K}/K), \qquad \sigma(B_\gamma) = B_{\gamma'}. \]
As a rule we assume here that all the points used in the construction are distinct and in general position. In particular, the divisor $\sum_{\gamma=1}^{g} B_\gamma$ is non-special.

Definition 1. Fix the $K$-rational local parameter $t_0$ at $A_0$. For any integers $n_1, n_2, n_3 \in \mathbb{Z}$ define the function $\psi(n_1, n_2, n_3)$ as a rational function on the curve $C$ with the following properties:
1. it has a pole of order at most $n_1 + n_2 + n_3$ at $A_0$,
2. the first nontrivial coefficient of its expansion in $t_0$ at $A_0$ is normalized to one,
3. it has zeros of order at least $n_i$ at $A_i$ for $i = 1, 2, 3$,
4. it has at most simple poles at the points $B_\gamma$, $\gamma = 1, \dots, g$.

As usual, a zero (pole) of negative order means a pole (zero) of the corresponding positive order. Correspondingly one should exchange the expressions “at most” and “at least” in front of the orders of poles and zeros. By the standard (see e.g., [1]) application of the Riemann–Roch theorem (and the general position assumption) we conclude that the wave function $\psi(n_1, n_2, n_3)$ exists and is unique. The function $\psi(n_1, n_2, n_3)$ is $K$-rational, which follows from the $K$-rationality conditions of the sets of points in its definition.

Remark. In what follows we will often normalize functions in the sense of point 2 of Definition 1.

Fix $K$-rational local parameters $t_i$ at $A_i$, $i = 1, 2, 3$. In the generic case, which we assume in the sequel, when the order of the pole of $\psi(n_1, n_2, n_3)$ at $A_0$ is $n_1 + n_2 + n_3$, denote by $\zeta_k^{(i)}(n_1, n_2, n_3)$, $i = 0, 1, 2, 3$, the $K$-rational coefficients of the expansion of $\psi(n_1, n_2, n_3)$ at $A_i$, respectively, i.e.,
\[ \psi(n_1, n_2, n_3) = \frac{1}{t_0^{\,n_1+n_2+n_3}} \Bigl( 1 + \sum_{k=1}^{\infty} \zeta_k^{(0)}(n_1, n_2, n_3)\, t_0^k \Bigr), \]
\[ \psi(n_1, n_2, n_3) = t_i^{\,n_i} \sum_{k=0}^{\infty} \zeta_k^{(i)}(n_1, n_2, n_3)\, t_i^k, \qquad i = 1, 2, 3. \]
Denote by $T_i$ the operator of translation in the variable $n_i$, $i = 1, 2, 3$, for example $T_1 \psi(n_1, n_2, n_3) = \psi(n_1 + 1, n_2, n_3)$. Uniqueness of the wave function implies the following statement.

Proposition 1. Generically, the function $\psi$ satisfies the equations
\[ T_i \psi - T_j \psi + \frac{T_j \zeta_0^{(i)}}{\zeta_0^{(i)}}\, \psi = 0, \qquad i \neq j, \quad i, j = 1, 2, 3. \tag{1} \]
Remark. When the genericity assumption fails then the linear problem (1) degenerates, i.e., some of its terms are absent.

Notice that Eq. (1) gives
\[ \frac{T_j \zeta_0^{(i)}}{\zeta_0^{(i)}} = - \frac{T_i \zeta_0^{(j)}}{\zeta_0^{(j)}}, \qquad i \neq j, \quad i, j = 1, 2, 3. \tag{2} \]
Define
\[ \rho_i = (-1)^{\sum_{j<i} n_j}\, \zeta_0^{(i)}, \qquad i = 1, 2, 3, \tag{3} \]
then Eq. (2) implies the existence of a $K$-valued potential (the $\tau$-function) defined (up to a multiplicative constant) by the formulas
\[ \frac{T_i \tau}{\tau} = \rho_i, \qquad i = 1, 2, 3. \tag{4} \]
Finally, Eq. (1) gives rise to the condition
\[ \frac{T_2 \rho_1}{\rho_1} - \frac{T_3 \rho_1}{\rho_1} + \frac{T_3 \rho_2}{\rho_2} = 0, \tag{5} \]
which written in terms of the $\tau$-function gives the discrete KP equation [12], also called the Hirota equation,
\[ (T_1 \tau)(T_2 T_3 \tau) - (T_2 \tau)(T_3 T_1 \tau) + (T_3 \tau)(T_1 T_2 \tau) = 0. \tag{6} \]
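As a minimal illustration (not part of the original construction), Eq. (6) can be checked numerically for any candidate $\tau$ given as a function on $\mathbb{Z}^3$. The sketch below assumes a prime field $\mathbb{F}_p$ (not a general $\mathbb{F}_q$); the helper names `hirota_residual` and `satisfies_hirota` are hypothetical.

```python
from itertools import product

def hirota_residual(tau, n, p):
    """Left-hand side of Eq. (6) at the lattice point n, reduced modulo p."""
    n1, n2, n3 = n
    # t(a, b, c) is tau shifted by (a, b, c), i.e. T1^a T2^b T3^c tau
    t = lambda a, b, c: tau(n1 + a, n2 + b, n3 + c) % p
    return (t(1, 0, 0) * t(0, 1, 1)
            - t(0, 1, 0) * t(1, 0, 1)
            + t(0, 0, 1) * t(1, 1, 0)) % p

def satisfies_hirota(tau, p, size):
    """Check Eq. (6) at every point of the cube {0, ..., size-1}^3."""
    return all(hirota_residual(tau, n, p) == 0
               for n in product(range(size), repeat=3))
```

For instance, the function equal to 1 for even $n_1$ and 0 for odd $n_1$ passes the check trivially, because every term of Eq. (6) then contains a vanishing factor, whereas a nonzero constant does not satisfy the equation.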
Corollary 2. Equation (5) can also be obtained from the expansion of Eq. (1) at $A_k$, where $k = 1, 2, 3$, $k \neq i, j$.

Remark. Absence of a term in the linear problem (1) (see the remark after Proposition 1) reflects, due to Corollary 2, the absence of the corresponding term in Eq. (6). This implies that in the non-generic case, when we have not defined the $\tau$-function yet, we are forced to put it to zero. Let us notice in advance (see the next section) that this is in complete analogy with the well known (complex field) interpretation of the algebro-geometric solution $\tau$ of the discrete KP equation as, essentially, the Riemann theta function.
2.2. The Jacobian interpretation. Denote by $\mathrm{Div}(C)$ the abelian group of the divisors on the curve $C$ and by $\mathrm{Pic}^0(C)$ the group of equivalence classes of degree zero divisors $\mathrm{Div}^0(C)$ modulo the principal divisors. There exists [16, 20] an abelian variety $J(C)$ of dimension $g$ (the Jacobian of the curve) and an injective map $\phi : C \to J(C)$ (the Abel map) such that the extension of $\phi$ to $\mathrm{Div}(C)$ establishes an isomorphism between $\mathrm{Pic}^0(C)$ and $J(C)$. Moreover, if there exists a $K$-rational point $A \in C(K)$ of the curve, then $\phi$ can be defined by
\[ C \ni P \mapsto \phi(P) = [P - A] \in J(C), \]
where $[P - A]$ designates the class of the degree zero divisor $P - A$ in $\mathrm{Pic}^0(C)$. Denote by $D_r(C)$ the effective divisors of degree $r$ of the curve $C$ and by $\phi_r$ the extension of $\phi$ to $D_r(C)$,
\[ D_r(C) \ni D \mapsto \phi_r(D) = [D - r \cdot A] \in J(C). \]
The direct image of $\phi_r$ is a subvariety $W_r$ of dimension $r$ if $0 \le r \le g$, and of dimension $g$ if $r > g$. In particular, $W_{g-1}$ defines a divisor in $J(C)$.

The group $\mathrm{Pic}^0(C; K)$ of equivalence classes of $K$-rational degree zero divisors $\mathrm{Div}^0(C; K)$ modulo the principal $K$-rational divisors can be identified with the abelian group $J(C; K)$ of $K$-rational points of the Jacobian variety. For a finite field $K$ the group $\mathrm{Pic}^0(C; K)$ is finite as well and its order can be found using properties of the zeta function of the curve (see, for example [25]).
Let us present in this picture the description of the wave function $\psi$ and of the $\tau$-function. We choose the point $A_0$ as the reference point $A$ and consider the following divisor $D(n_1, n_2, n_3) \in \mathrm{Div}^0(C; K)$ of degree zero,
\[ D(n_1, n_2, n_3) = n_1 (A_0 - A_1) + n_2 (A_0 - A_2) + n_3 (A_0 - A_3) + \sum_{\gamma=1}^{g} B_\gamma - g \cdot A_0, \]
with linear dependence on $n_1$, $n_2$ and $n_3$. Its equivalence class in $\mathrm{Pic}^0(C; K)$ has the unique $K$-rational representation of the form
\[ X(n_1, n_2, n_3) = \sum_{\gamma=1}^{g} X_\gamma(n_1, n_2, n_3) - g \cdot A_0. \]
This equivalence is given by a function whose divisor reads
\[ n_1 (A_1 - A_0) + n_2 (A_2 - A_0) + n_3 (A_3 - A_0) + \sum_{\gamma=1}^{g} X_\gamma(n_1, n_2, n_3) - \sum_{\gamma=1}^{g} B_\gamma \sim 0. \tag{7} \]
If we normalize such a function at $A_0$ according to Definition 1 it becomes the wave function $\psi$. Notice that some of the $X_\gamma$ could be $A_0$, which would mean that $[D(n_1, n_2, n_3)] \in W_{g-1}$. This correspondence gives rise to a set of important facts.

Corollary 3. Evolutions in the variables $n_1$, $n_2$ and $n_3$ define linear flows in the Jacobian.

Corollary 4. The points $X_\gamma$, $\gamma = 1, \dots, g$, indicate zeros of the wave function which are not specified in the previous construction.

Corollary 5. If $[D(n_1, n_2, n_3)] \in W_{g-1}$, then the pole of the wave function at $A_0$ has order less than $n_1 + n_2 + n_3$, i.e., we are in the non-generic case, thus $\tau(n_1, n_2, n_3) = 0$.

Remark. Notice that because $[D(0, 0, 0)] \notin W_{g-1}$, we have $\tau(0, 0, 0) \neq 0$.

Remark. If $K = \mathbb{C}$ then the algebraic curve $C$ is a compact Riemann surface, theorems of Abel and Jacobi identify the Jacobian with the quotient of $\mathbb{C}^g$ by the period lattice, and a theorem of Riemann identifies $W_{g-1}$ with a certain translate of the zero locus of the Riemann theta function (see [9]). Then, as we mentioned in the remark after Corollary 2, the algebro-geometric solution $\tau$ of the discrete KP equation becomes, with appropriate understanding of its argument via the divisor $D(n_1, n_2, n_3)$ and up to an inessential and non-vanishing multiplier, the Riemann theta function (see, e.g. [15]). In particular, in such an interpretation the zeros of $\tau$ are located at points of the translate $W_{g-1}$ of the theta divisor.

Let us discuss periodicity of solutions of the finite field version of the KP equation obtained using the above method. Denote by $\ell_i$, $i = 1, 2, 3$, the orders of the cyclic subgroups of $J(C; K)$ generated by the divisors $A_i - A_0$; then for arbitrary $k_i \in \mathbb{Z}$, $i = 1, 2, 3$,
\[ D(n_1 + k_1 \ell_1, n_2 + k_2 \ell_2, n_3 + k_3 \ell_3) \sim D(n_1, n_2, n_3). \]
In particular, $\tau(n_1, n_2, n_3) = 0$ implies $\tau(n_1 + k_1 \ell_1, n_2 + k_2 \ell_2, n_3 + k_3 \ell_3) = 0$.

There exist unique (normalized at $A_0$) functions $h_i$, $i = 1, 2, 3$, with zeros of order $\ell_i$ at $A_i$, poles of order $\ell_i$ at $A_0$ and no other singularities and zeros, such that
\[ \psi(n_1 + k_1 \ell_1, n_2 + k_2 \ell_2, n_3 + k_3 \ell_3) = h_1^{k_1} h_2^{k_2} h_3^{k_3}\, \psi(n_1, n_2, n_3). \tag{8} \]
Remark. Generalizing the above considerations, if for $\ell_i \in \mathbb{Z}$, $i = 1, 2, 3$,
\[ \sum_{i=1}^{3} \ell_i (A_i - A_0) \sim 0, \]
then $(\ell_1, \ell_2, \ell_3)$ is a period vector of zeros of the $\tau$-function and a vector of quasi-periodicity (in the above sense) of the wave function.

Equation (8) implies quasi-periodicity of the functions $\zeta_0^{(i)}$, $i = 1, 2, 3$,
\[ \zeta_0^{(i)}(n_1 + k_1 \ell_1, n_2 + k_2 \ell_2, n_3 + k_3 \ell_3) = c_{(i)1}^{k_1} c_{(i)2}^{k_2} c_{(i)3}^{k_3}\, \zeta_0^{(i)}(n_1, n_2, n_3), \]
with the (non-zero) factors $c_{(i)j} \in K^*$ equal to
\[ c_{(i)j} = \left. \frac{h_j}{t_j^{\,\delta_{ij} \ell_j}} \right|_{P = A_i}. \]
The multiplicative group $K^*$ is a cyclic group of order $q - 1$, therefore the functions $\zeta_0^{(i)}$ are periodic. Their periods in the variable $n_j$ are equal to $\ell_j$ times the order of the subgroup of $K^*$ generated by $c_{(i)j}$ (a divisor of $q - 1$). Due to the possible change of sign (see Eq. (3)) the periods of $\rho_i$ can be at most doubled with respect to the corresponding periods of $\zeta_0^{(i)}$. Again, periodicity of $\rho_i$ implies quasi-periodicity of $\tau$ with a factor from $K^*$, thus the period of $\tau$ in the variable $n_i$ can be at most $q - 1$ times the period of $\rho_i$ in that variable.

3. A “Hyperelliptic” Solution of the Discrete KP Equation

Our goal here is to demonstrate how the method described above works. We perform all steps of the construction (see also [2] for details) starting from a given algebraic curve, which we have chosen to be a hyperelliptic curve, due to the relatively simple description of Jacobians of such curves [19]. We consider a hyperelliptic curve of genus $g = 2$ but the technical tools used here can be applied directly to hyperelliptic curves of arbitrary genus.

3.1. A hyperelliptic curve and its Jacobian. Consider a hyperelliptic curve $C$ of genus $g = 2$ defined over the field $\mathbb{F}_7$ and given by the equation
\[ C: \quad v^2 + uv = u^5 + 5u^4 + 6u^2 + u + 3. \tag{9} \]
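The affine $\mathbb{F}_7$-rational points of the curve (9) can be recovered by brute force. The following is only a sketch with hypothetical helper names; it assumes that the hyperelliptic involution acts as $(u, v) \mapsto (u, -v - u)$, which follows from Eq. (9) being quadratic in $v$ with the sum of the two roots equal to $-u$.

```python
P = 7  # the base field is F_7

def on_curve(u, v, p=P):
    """True if (u, v) satisfies v^2 + u*v = u^5 + 5u^4 + 6u^2 + u + 3 over F_p."""
    lhs = (v * v + u * v) % p
    rhs = (pow(u, 5, p) + 5 * pow(u, 4, p) + 6 * u * u + u + 3) % p
    return lhs == rhs

def opposite(u, v, p=P):
    """Image of (u, v) under the hyperelliptic involution (assumed form)."""
    return (u, (-v - u) % p)

# All affine F_7-rational points; the point at infinity has to be added by hand.
affine_points = [(u, v) for u in range(P) for v in range(P) if on_curve(u, v)]
```

Together with the point at infinity, the resulting list can be compared with Table 1.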
The $(u, v)$ coordinates of its $\mathbb{F}_7$-rational points are presented in Table 1. The curve has one point at infinity, denoted by $\infty$, whose preimage on the nonsingular model of $C$ consists of one point only [24], and where the local uniformizing parameter can be chosen as $u^2/v$ ($u$ is a polynomial function of order 2, and $v$ is a polynomial function of order 5). The point opposite (with respect to the hyperelliptic automorphism) to $P$ is denoted by $\tilde{P}$. The only two special points of the curve are $(6, 4)$ and the infinity point $\infty$. We identify the field $\mathbb{F}_{49}$ as the extension of $\mathbb{F}_7$ by the polynomial $x^2 + 2$, i.e., $\mathbb{F}_{49} = \mathbb{F}_7[x]/(x^2 + 2)$. Let us introduce the following notation: the element $k \in \mathbb{F}_{49}$
Table 1. $\mathbb{F}_7$-rational points of the curve $C$
i | P_i | \tilde{P}_i
0 | ∞ | P_0
1 | (1, 1) | (1, 5)
2 | (2, 2) | (2, 3)
3 | (5, 3) | (5, 6)
4 | (6, 4) | P_4
Table 2. $\mathbb{F}_{49}$-rational points of the curve $C$ (which are not $\mathbb{F}_7$-rational); here $\tilde{P}$ is the opposite of $P$, and $P^\sigma$ denotes its conjugate with respect to the lift of the Frobenius automorphism
i | P_i | \tilde{P}_i | P_i^σ | \tilde{P}_i^σ
5 | (0, 21) | (0, 28) | \tilde{P}_5 | P_5
6 | (3, 9) | (3, 44) | \tilde{P}_6 | P_6
7 | (4, 26) | (4, 33) | \tilde{P}_7 | P_7
8 | (7, 5) | (7, 44) | (42, 5) | (42, 9)
9 | (8, 22) | (8, 26) | (43, 29) | (43, 33)
10 | (11, 5) | (11, 47) | (46, 5) | (46, 12)
11 | (12, 6) | (12, 45) | (47, 6) | (47, 10)
12 | (13, 14) | (13, 29) | (48, 35) | (48, 22)
13 | (14, 8) | (14, 34) | (35, 43) | (35, 27)
14 | (15, 13) | (15, 28) | (36, 48) | (36, 21)
15 | (16, 17) | (16, 23) | (37, 38) | (37, 30)
16 | (17, 0) | (17, 39) | (38, 0) | (38, 18)
17 | (18, 4) | (18, 41) | (39, 4) | (39, 20)
18 | (19, 9) | (19, 28) | (40, 44) | (40, 21)
19 | (20, 12) | (20, 31) | (41, 47) | (41, 24)
20 | (22, 4) | (22, 30) | (29, 4) | (29, 23)
21 | (25, 6) | (25, 32) | (32, 6) | (32, 25)
22 | (27, 7) | (27, 22) | (34, 42) | (34, 29)
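Entries of Table 2 can be spot-checked with a few lines of $\mathbb{F}_{49}$ arithmetic. The sketch below is an illustration only: it assumes the labelling in which the natural number $7\beta + \alpha$ stands for $\beta x + \alpha$ with $x^2 = -2$, and all function names are hypothetical.

```python
P = 7  # characteristic; F_49 = F_7[x]/(x^2 + 2)

def dec(k):
    """Label k = 7*beta + alpha -> pair (alpha, beta) standing for alpha + beta*x."""
    return (k % P, k // P)

def enc(a):
    """Pair (alpha, beta) -> label 7*beta + alpha."""
    return P * a[1] + a[0]

def add(a, b):
    return ((a[0] + b[0]) % P, (a[1] + b[1]) % P)

def mul(a, b):
    # (a0 + a1*x)(b0 + b1*x) reduced with x^2 = -2
    return ((a[0] * b[0] - 2 * a[1] * b[1]) % P,
            (a[0] * b[1] + a[1] * b[0]) % P)

def power(a, n):
    result = (1, 0)
    for _ in range(n):
        result = mul(result, a)
    return result

def frobenius(k):
    """Frobenius automorphism on labels: beta -> -beta, alpha unchanged."""
    alpha, beta = dec(k)
    return enc((alpha, (-beta) % P))

def on_curve(ku, kv):
    """True if the labelled pair (ku, kv) satisfies Eq. (9) over F_49."""
    u, v = dec(ku), dec(kv)
    lhs = add(mul(v, v), mul(u, v))
    rhs = (0, 0)
    for coeff, degree in [(1, 5), (5, 4), (6, 2), (1, 1), (3, 0)]:
        rhs = add(rhs, mul((coeff, 0), power(u, degree)))
    return lhs == rhs

def conjugate_point(point):
    """Apply the Frobenius automorphism to both coordinates of a point."""
    return (frobenius(point[0]), frobenius(point[1]))
```

Looping `on_curve` over all $49 \times 49$ label pairs gives the number of affine $\mathbb{F}_{49}$-rational points, which can be compared with the point count quoted in the text.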
represented by the polynomial $\beta x + \alpha$ is denoted by the natural number $7\beta + \alpha$. The Galois group $G(\mathbb{F}_{49}/\mathbb{F}_7) = \{\mathrm{id}, \sigma\}$, where $\sigma$ is the Frobenius automorphism, acts on elements of $\mathbb{F}_{49} \setminus \mathbb{F}_7$ in the following way:
\[ \mathbb{F}_{49} \setminus \mathbb{F}_7 \ni k = 7\beta + \alpha \mapsto \sigma(k) = 7(7 - \beta) + \alpha. \]
The coordinates of the $\mathbb{F}_{49}$-rational points of the curve (which are not $\mathbb{F}_7$-rational) are presented in Table 2.

In the next step we find the group $J(C; \mathbb{F}_7)$ of the $\mathbb{F}_7$-rational points of the Jacobian of the curve. The number of its points can be found from the number of $\mathbb{F}_7$-rational and $\mathbb{F}_{49}$-rational points of the curve by application of properties of the zeta function of the curve $C$ (see for instance [25, 13]). In our case the curve has 8 $\mathbb{F}_7$-rational points and 74 $\mathbb{F}_{49}$-rational points, which implies the following form of the zeta function $\zeta(C; T)$:
\[ \zeta(C; T) = \frac{P(T)}{(1 - T)(1 - 7T)}, \qquad P(T) = 1 + 12T^2 + 49T^4. \]
The number $\#J(C; \mathbb{F}_7)$ of the $\mathbb{F}_7$-rational points of the Jacobian is equal to $P(1) = 62$, and therefore $J(C; \mathbb{F}_7)$ is the direct sum of cyclic groups of orders 31 and 2.

Let us choose the infinity point $\infty$ as the basepoint. The group law in the Jacobian of a hyperelliptic curve can be intuitively described in a way which is a higher-genus
analog of the well known addition operation for points of elliptic curves. We present here only its sketch for genus $g = 2$ and in the generic case of addition of two points of $J(C)$ with representations of the form
\[ E_i = Q_i + R_i - 2\infty, \qquad i = 1, 2, \]
with all points distinct. If $E_3 = Q_3 + R_3 - 2\infty$ is the representation of $[E_1 + E_2]$, i.e., $E_1 + E_2 = (g) + E_3$, then
\[ E_1 + E_2 + \tilde{E}_3 \sim 0, \tag{10} \]
where we have used the fact that for any point $P \in C(\overline{K})$ of a hyperelliptic curve the divisor $P + \tilde{P} - 2\infty$ is principal. Therefore, there exists a normalized polynomial function $f$ of order six, thus necessarily of the form $f = a + bu + cu^2 + du^3 + ev$, with the divisor given by the left-hand side of Eq. (10). Its zeros at $Q_i$ and $R_i$, $i = 1, 2$, and the normalization condition allow one to fix the coefficients and then to find the two other zeros $\tilde{Q}_3$ and $\tilde{R}_3$.

Geometrically, we are looking for the two other intersection points of the hyperelliptic curve with the cubic interpolating the four known points. In cases when some points of $E_1 + E_2$ are repeated, the interpolation step must be adjusted to ensure tangency to the curve with sufficient multiplicity. When the divisors have fewer points we consider an interpolating curve of lower degree (some intersection points are at infinity). Finally, the transition function $g$ is the unique normalized function with the numerator equal to $f$ and the denominator being the normalized polynomial function with the divisor $E_3 + \tilde{E}_3$.

The full description of the group $J(C; \mathbb{F}_7)$ is given in Table 3. The divisor $D_1 = P_1 - \infty$ generates the subgroup of order 31 and the divisor $D_4 = P_4 - \infty$ generates the subgroup of order 2. We present the reduced representations $[nD_1 + mD_4]_r$ of the elements $[nD_1 + mD_4]$ of $J(C; \mathbb{F}_7)$, where $n \in \{0, 1, \dots, 30\}$ and $m \in \{0, 1\}$. Moreover we write down the transition functions $g_m(n)$ defined by the following divisor equation:
\[ [nD_1 + mD_4]_r + D_1 = (g_m(n)) + [(n + 1)D_1 + mD_4]_r. \tag{11} \]
Also the transition functions for the sums $[nD_1 + mD_4]_r + D_4$ can be read off from Table 3. In particular, to find such a transition function (we call it $W(0, 1)$) with $n = 29$ and $m = 0$, i.e.,
\[ (1, 5) + (1, 5) + D_4 - 2\infty = (W(0, 1)) + (12, 6) + (47, 6) - 2\infty, \tag{12} \]
we make use of the fact that the analogous transition function for $n = 30$ and $m = 0$ is equal to 1. Then
\[ W(0, 1) = g_0(29) \cdot 1 \cdot [g_1(29)]^{-1} = \frac{2 + 3u + 4u^2 + v}{6 + 4u + u^2}, \tag{13} \]
where in the last equality we have used Eq. (9) of the curve to get rid of $v$ from the denominator.
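The zeros and poles of $W(0, 1)$ required by Eq. (12) can be spot-checked by evaluating the numerator and denominator of Eq. (13) at the relevant points. This is a hedged sketch rather than a full divisor computation: it only tests vanishing (not multiplicities), it assumes the labelling $7\beta + \alpha \leftrightarrow \beta x + \alpha$ with $x^2 = -2$, and the function names are hypothetical.

```python
P = 7  # F_49 = F_7[x]/(x^2 + 2), elements stored as pairs (alpha, beta)

def dec(k):
    """Label k = 7*beta + alpha -> pair (alpha, beta)."""
    return (k % P, k // P)

def add(a, b):
    return ((a[0] + b[0]) % P, (a[1] + b[1]) % P)

def mul(a, b):
    # multiplication with x^2 = -2
    return ((a[0] * b[0] - 2 * a[1] * b[1]) % P,
            (a[0] * b[1] + a[1] * b[0]) % P)

def numerator(ku, kv):
    """Value of 2 + 3u + 4u^2 + v at the labelled point (ku, kv)."""
    u, v = dec(ku), dec(kv)
    value = add((2, 0), mul((3, 0), u))
    value = add(value, mul((4, 0), mul(u, u)))
    return add(value, v)

def denominator(ku):
    """Value of 6 + 4u + u^2 at the labelled coordinate ku."""
    u = dec(ku)
    return add(add((6, 0), mul((4, 0), u)), mul(u, u))

ZERO = (0, 0)
```

The numerator is expected to vanish at $(1, 5)$, $(6, 4)$ and at the opposites $(12, 45)$, $(47, 10)$ of the two poles, while the denominator vanishes at $u = 12$ and $u = 47$; the surviving poles of the quotient are then $(12, 6)$ and $(47, 6)$.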
Table 3. The group $J(C; \mathbb{F}_7)$ of $\mathbb{F}_7$-rational points of the Jacobian as the simple sum of its cyclic subgroups
n | [nD_1]_r | g_0(n) | [nD_1 + D_4]_r | g_1(n)
0 | 0 | 1 | (6,4) − ∞ | 1
1 | (1,1) − ∞ | 1 | (1,1) + (6,4) − 2∞ | (5+5u+3u^2+v)/(6+4u+u^2)
2 | (1,1) + (1,1) − 2∞ | (u+5u^2+v)/(2+u)^2 | (12,45) + (47,10) − 2∞ | (1+5u^2+v)/(2+5u+u^2)
3 | (5,6) + (5,6) − 2∞ | (1+u+4u^2+v)/((2+u)(5+u)) | (15,28) + (36,21) − 2∞ | (6u^2+v)/(2+u^2)
4 | (2,3) + (5,3) − 2∞ | (2+4u^2+v)/(5+4u+u^2) | (7,44) + (42,9) − 2∞ | (5+u+v)/(4+6u+u^2)
5 | (19,9) + (40,44) − 2∞ | (4u+2u^2+v)/(5+5u+u^2) | (11,5) + (46,5) − 2∞ | (6+6u+u^2+v)/(3+6u+u^2)
6 | (22,4) + (29,4) − 2∞ | (5+2u+6u^2+v)/((2+u)(5+u)) | (18,41) + (39,20) − 2∞ | (5+3u+5u^2+v)/(5+3u+u^2)
7 | (2,3) + (5,6) − 2∞ | (5+6u+2u^2+v)/(5+2u+u^2) | (16,17) + (37,38) − 2∞ | (5+4u+4u^2+v)/(3+u+u^2)
8 | (27,22) + (34,29) − 2∞ | (1+3u+2u^2+v)/(1+u^2) | (17,39) + (38,18) − 2∞ | (3+2u+u^2+v)/((5+u)(6+u))
9 | (14,34) + (35,27) − 2∞ | (1+5u+v)/((1+u)(5+u)) | (1,5) + (2,2) − 2∞ | 6+u
10 | (2,2) + (6,4) − 2∞ | (3+5u+5u^2+v)/(5+u)^2 | (2,2) − ∞ | 1
11 | (2,3) + (2,3) − 2∞ | (6+u+6u^2+v)/(3+2u+u^2) | (1,1) + (2,2) − 2∞ | (4+2u^2+v)/(3+5u+u^2)
12 | (13,14) + (48,35) − 2∞ | (3+6u+4u^2+v)/(2+2u+u^2) | (8,22) + (43,29) − 2∞ | (2+4u+v)/((2+u)(6+u))
13 | (20,12) + (41,47) − 2∞ | (5u+u^2+v)/((1+u)(2+u)) | (1,5) + (5,3) − 2∞ | 6+u
14 | (5,3) + (6,4) − 2∞ | (6+5u+2u^2+v)/(6+6u+u^2) | (5,3) − ∞ | 1
15 | (25,32) + (32,25) − 2∞ | (5+u^2+v)/(6+6u+u^2) | (1,1) + (5,3) − 2∞ | (u+5u^2+v)/((2+u)(6+u))
16 | (25,6) + (32,6) − 2∞ | (6+5u+2u^2+v)/((1+u)(2+u)) | (1,5) + (5,6) − 2∞ | 6+u
17 | (5,6) + (6,4) − 2∞ | (5u+u^2+v)/(2+2u+u^2) | (5,6) − ∞ | 1
18 | (20,31) + (41,24) − 2∞ | (3+6u+4u^2+v)/(3+2u+u^2) | (1,1) + (5,6) − 2∞ | (2+4u+v)/(3+5u+u^2)
19 | (13,29) + (48,22) − 2∞ | (6+u+6u^2+v)/(5+u)^2 | (8,26) + (43,33) − 2∞ | (4+2u^2+v)/((5+u)(6+u))
20 | (2,2) + (2,2) − 2∞ | (3+5u+5u^2+v)/((1+u)(5+u)) | (1,5) + (2,3) − 2∞ | 6+u
21 | (2,3) + (6,4) − 2∞ | (1+5u+v)/(1+u^2) | (2,3) − ∞ | 1
22 | (14,8) + (35,43) − 2∞ | (1+3u+2u^2+v)/(5+2u+u^2) | (1,1) + (2,3) − 2∞ | (3+2u+u^2+v)/(3+u+u^2)
23 | (27,7) + (34,42) − 2∞ | (5+6u+2u^2+v)/((2+u)(5+u)) | (17,0) + (38,0) − 2∞ | (5+4u+4u^2+v)/(5+3u+u^2)
24 | (2,2) + (5,3) − 2∞ | (5+2u+6u^2+v)/(5+5u+u^2) | (16,23) + (37,30) − 2∞ | (5+3u+5u^2+v)/(3+6u+u^2)
25 | (22,30) + (29,23) − 2∞ | (4u+2u^2+v)/(5+4u+u^2) | (18,4) + (39,4) − 2∞ | (6+6u+u^2+v)/(4+6u+u^2)
26 | (19,28) + (40,21) − 2∞ | (2+4u^2+v)/((2+u)(5+u)) | (11,47) + (46,12) − 2∞ | (5+u+v)/(2+u^2)
27 | (2,2) + (5,6) − 2∞ | (1+u+4u^2+v)/(2+u)^2 | (7,5) + (42,5) − 2∞ | (6u^2+v)/(2+5u+u^2)
28 | (5,3) + (5,3) − 2∞ | (u+5u^2+v)/(6+u)^2 | (15,13) + (36,48) − 2∞ | (1+5u^2+v)/(6+4u+u^2)
29 | (1,5) + (1,5) − 2∞ | 6+u | (12,6) + (47,6) − 2∞ | (5+5u+3u^2+v)/((1+u)(6+u))
30 | (1,5) − ∞ | 6+u | (1,5) + (6,4) − 2∞ | 6+u
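The order 62 of the group listed in Table 3 can be recomputed from the two point counts quoted earlier. The sketch below uses the standard relation between the numerator of the zeta function of a genus two curve and the numbers $N_1$, $N_2$ of points over $\mathbb{F}_q$ and $\mathbb{F}_{q^2}$; the function name is hypothetical.

```python
def genus2_L_polynomial(q, N1, N2):
    """Coefficients [1, a1, a2, q*a1, q^2] of the zeta numerator P(T) for genus 2.

    Uses a1 = N1 - q - 1 and 2*a2 = N2 - q^2 - 1 + a1^2, together with the
    functional equation that fixes the two highest coefficients.
    """
    a1 = N1 - q - 1
    a2 = (N2 - q * q - 1 + a1 * a1) // 2
    return [1, a1, a2, q * a1, q * q]

# Point counts quoted in the text: 8 over F_7 and 74 over F_49.
coeffs = genus2_L_polynomial(7, 8, 74)

# The number of F_7-rational points of the Jacobian is P(1).
jacobian_order = sum(coeffs)
```

Since $62 = 2 \cdot 31$ is squarefree, a group of this order is necessarily the direct sum of cyclic groups of orders 31 and 2, in agreement with the two generators $D_1$ and $D_4$ used in Table 3.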
3.2. Construction of the wave and $\tau$ functions. In order to find a solution of the discrete KP equation let us fix the following points of the curve $C$,
\[ A_0 = \infty, \qquad A_1 = (1, 1), \qquad A_2 = (2, 2), \qquad A_3 = (5, 3), \]
with the uniformizing parameters
\[ t_0 = u^2/v, \qquad t_1 = u - 1, \qquad t_2 = u - 2, \qquad t_3 = u - 5, \]
and
\[ B_1 = (12, 6), \qquad B_2 = (47, 6). \]
Because
\[ A_0 - A_1 \sim 30D_1, \qquad A_0 - A_2 \sim 21D_1 + D_4, \qquad A_0 - A_3 \sim 17D_1 + D_4, \qquad B_1 + B_2 - 2A_0 \sim 29D_1 + D_4, \]
the points $X_1(n_1, n_2, n_3)$ and $X_2(n_1, n_2, n_3)$ where the wave function $\psi(n_1, n_2, n_3)$ has additional zeros can be found from Table 3 and from the equation
\[ X_1(n_1, n_2, n_3) + X_2(n_1, n_2, n_3) - 2\infty = [nD_1 + mD_4]_r, \tag{14} \]
where $n \in \{0, 1, \dots, 30\}$ and $m \in \{0, 1\}$ are given by
\[ n = 29 + 30n_1 + 21n_2 + 17n_3 \mod 31, \tag{15} \]
\[ m = 1 + n_2 + n_3 \mod 2. \tag{16} \]
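Equations (15) and (16) translate directly into a lookup index for Table 3. A minimal sketch (the function name `jacobian_index` is hypothetical):

```python
def jacobian_index(n1, n2, n3):
    """Pair (n, m) of Eqs. (15)-(16) selecting the row [n*D1 + m*D4]_r of Table 3."""
    n = (29 + 30 * n1 + 21 * n2 + 17 * n3) % 31
    m = (1 + n2 + n3) % 2
    return n, m
```

For example, `jacobian_index(0, 0, 0)` returns `(29, 1)`, the row whose reduced representation is $(12, 6) + (47, 6) - 2\infty = B_1 + B_2 - 2A_0$. Python's `%` operator returns a non-negative remainder, so negative values of $n_i$ are handled as well.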
Remark. Notice that the infinity point $\infty$ is a Weierstrass point of the curve $C$, which violates the assumption of general position of the points used in the construction of solutions of the discrete KP equation. This will not change the Jacobian picture of the construction but in some situations, which we will point out, will affect uniqueness of the wave function. We remark that such a choice is indispensable in the reduction of the method from the discrete KP equation to the discrete KdV equation (see, for example [15, 3]).

To find the wave function effectively we will constrain the range of parameters from $\mathbb{Z}^3$ to the parameters of the group of $\mathbb{F}_7$-rational points of the Jacobian. Let us introduce functions $h_1$ and $h_4$ corresponding to the generators of the two cyclic subgroups of $J(C; \mathbb{F}_7)$. The function $h_1$ with the divisor $31D_1$ and normalized at the infinity point is equal to the product $\prod_{i=0}^{30} g_0(i)$ and reads
\[ h_1 = 1 + 2u + u^2 + 4u^3 + 3u^5 + u^6 + 3u^7 + u^8 + 4u^9 + 4u^{10} + 2u^{11} + 5u^{12} + 2u^{13} + 4u^{14} + 3u^{15} + \bigl( 5u + 2u^2 + 5u^3 + 4u^5 + 6u^6 + 4u^7 + 3u^9 + 5u^{10} + 5u^{11} + 4u^{12} + u^{13} \bigr) v, \]
where we also used the equation of the curve (9) to reduce higher order terms in $v$. The normalized function $h_4$ with the divisor $2D_4$ is $h_4 = u - 6$.

Let us introduce other auxiliary functions $f_2$ and $f_3$ to factorize the zeros of the wave function at $A_2$ and $A_3$. Notice that $A_2 + 21D_1 + D_4 - \infty \sim 0$, which implies that there exists a polynomial function on $C$ with a simple zero at $A_2$ and other zeros at the distinguished (by our choice of description of $J(C; \mathbb{F}_7)$) points $(1, 1)$ and $(6, 4)$. Define $f_2$ as the unique such function normalized at the infinity point $\infty$; then
\[ f_2 = 1 + 5u + u^2 + 4u^4 + 6u^5 + 4u^6 + 4u^7 + 3u^8 + 4u^9 + 6u^{11} + \bigl( 6 + 4u + 2u^2 + 5u^3 + 6u^4 + 6u^6 + u^7 + u^8 + u^9 \bigr) v. \]
Similarly we define the normalized function
\[ f_3 = 1 + 6u + 2u^2 + 6u^5 + u^6 + 5u^7 + 5u^8 + 4u^9 + \bigl( 4 + 3u + 5u^2 + 4u^5 + 2u^6 + u^7 \bigr) v, \]
with the divisor $A_3 + 17D_1 + D_4 - \infty$. Uniqueness of the wave function $\psi$ implies that it can be decomposed as follows:
\[ \psi(n_1, n_2, n_3) = \frac{f_2^{\,n_2} f_3^{\,n_3}}{h_1^{\,p} h_4^{\,q}}\, W(m_1, m_2), \tag{17} \]
where $W(m_1, m_2)$ is the unique normalized function with the divisor
\[ m_1 D_1 + m_2 D_4 + Y_1(m_1, m_2) + Y_2(m_1, m_2) - (12, 6) - (47, 6), \tag{18} \]
where
\[ Y_1(m_1, m_2) + Y_2(m_1, m_2) = X_1(n_1, n_2, n_3) + X_2(n_1, n_2, n_3), \tag{19} \]
and the new variables $m_1$ and $m_2$ are given by
\[ 21n_2 + 17n_3 - n_1 = 31p - m_1, \qquad m_1 \in \{0, 1, \dots, 30\}, \tag{20} \]
\[ n_2 + n_3 = 2q - m_2, \qquad m_2 \in \{0, 1\}. \tag{21} \]
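The auxiliary functions entering the factorization (17) can be sanity-checked by evaluating them at $\mathbb{F}_7$-rational points where they are expected to vanish. The sketch below is illustrative only: the coefficient dictionaries are transcribed from the formulas for $h_1$, $f_2$ and $f_3$ above, the helper names are hypothetical, and the check tests vanishing but not the orders of the zeros.

```python
P = 7

def poly(coeffs, u):
    """Evaluate sum(c * u**k) modulo P for a {degree: coefficient} dictionary."""
    return sum(c * pow(u, k, P) for k, c in coeffs.items()) % P

def evaluate(a, b, point):
    """Value of a(u) + b(u) * v at an affine F_7-rational point (u, v)."""
    u, v = point
    return (poly(a, u) + poly(b, u) * v) % P

# h1 = H1_A(u) + H1_B(u) * v
H1_A = {0: 1, 1: 2, 2: 1, 3: 4, 5: 3, 6: 1, 7: 3, 8: 1, 9: 4,
        10: 4, 11: 2, 12: 5, 13: 2, 14: 4, 15: 3}
H1_B = {1: 5, 2: 2, 3: 5, 5: 4, 6: 6, 7: 4, 9: 3, 10: 5, 11: 5, 12: 4, 13: 1}

# f2 = F2_A(u) + F2_B(u) * v
F2_A = {0: 1, 1: 5, 2: 1, 4: 4, 5: 6, 6: 4, 7: 4, 8: 3, 9: 4, 11: 6}
F2_B = {0: 6, 1: 4, 2: 2, 3: 5, 4: 6, 6: 6, 7: 1, 8: 1, 9: 1}

# f3 = F3_A(u) + F3_B(u) * v
F3_A = {0: 1, 1: 6, 2: 2, 5: 6, 6: 1, 7: 5, 8: 5, 9: 4}
F3_B = {0: 4, 1: 3, 2: 5, 5: 4, 6: 2, 7: 1}

A1, A2, A3, P4 = (1, 1), (2, 2), (5, 3), (6, 4)
```

With these definitions `evaluate(H1_A, H1_B, A1)` vanishes, as do $f_2$ at $A_2$, $A_1$, $P_4$ and $f_3$ at $A_3$, $A_1$, $P_4$. The values of $h_1$ at $A_2$ and $A_3$ are the factors by which $\zeta_0^{(2)}$ and $\zeta_0^{(3)}$ are multiplied under the shift $n_1 \to n_1 + 31$.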
To find the functions $W(m_1, m_2)$ for all $m_1 \in \{0, 1, \dots, 30\}$ and $m_2 \in \{0, 1\}$ let us first notice that $W(0, 0) = 1$ and $W(0, 1)$ is indeed the function found in Eq. (13). For $m_1 \in \{1, \dots, 30\}$ and $m_2 \in \{0, 1\}$ define the functions $w_{m_2}(m_1)$ as follows:
\[ W(m_1, m_2) = w_{m_2}(m_1)\, W(m_1 - 1, m_2). \]
Equations (11), (14)–(16) and (18)–(21) imply that for such a range of $m_1$ and $m_2$ we have
\[ g_m(n) = w_{m_2}(m_1), \qquad \text{where} \quad m_2 = 1 - m \mod 2, \quad m_1 = 29 - n \mod 31. \]
Finally, under the identification $w_{m_2}(0) = W(0, m_2)$ we obtain
\[ W(m_1, m_2) = \prod_{i=0}^{m_1} w_{m_2}(i), \]
which, together with factorization (17), gives the wave function $\psi$ for all values of $(n_1, n_2, n_3) \in \mathbb{Z}^3$.
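The bookkeeping of Eqs. (20) and (21), and its relation to the indices of Eqs. (15) and (16), can be written out as follows. This is a sketch with hypothetical function names, not the authors' implementation.

```python
def decompose(n1, n2, n3):
    """Integers (p, q, m1, m2) of Eqs. (20)-(21) for given (n1, n2, n3)."""
    s = 21 * n2 + 17 * n3 - n1
    m1 = (-s) % 31          # unique m1 in {0, ..., 30} with s + m1 divisible by 31
    p = (s + m1) // 31
    t = n2 + n3
    m2 = (-t) % 2           # unique m2 in {0, 1} with t + m2 even
    q = (t + m2) // 2
    return p, q, m1, m2

def jacobian_index(n1, n2, n3):
    """Pair (n, m) of Eqs. (15)-(16)."""
    return (29 + 30 * n1 + 21 * n2 + 17 * n3) % 31, (1 + n2 + n3) % 2

def consistent(n1, n2, n3):
    """Check m1 = 29 - n (mod 31) and m2 = 1 - m (mod 2)."""
    p, q, m1, m2 = decompose(n1, n2, n3)
    n, m = jacobian_index(n1, n2, n3)
    return m1 == (29 - n) % 31 and m2 == (1 - m) % 2
```

At the origin the decomposition gives $p = q = m_1 = m_2 = 0$, so that $\psi(0, 0, 0) = W(0, 0) = 1$, as expected for the normalized wave function.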
Fig. 1. $\mathbb{F}_7$-valued solutions of the discrete KP equation out of the genus two hyperelliptic curve $C$. Variables $n_1$ (directed to the right) and $n_2$ (directed up) take values from 0 to 34. Subsequent figures are for values of $n_3 = -1, 0, 1, 2$. The legend of the figure distinguishes the seven values 0, 1, ..., 6 of $\mathbb{F}_7$.
Remark. For $(m_1, m_2) = (29, 1)$ we have $X_1 = X_2 = \infty$. Because the infinity point $\infty$ is a Weierstrass point of order two, there exist functions with a divisor of poles equal to $2\infty$. This means that $\psi$ is not uniquely determined in this case. However it is natural to keep the divisor of $\psi$, and therefore $\psi$ itself, exactly as it is given by the flow on the Jacobian. Notice that because for $X_1 = X_2 = \infty$ we stay in the divisor $W_{g-1}$, this ambiguity does not affect the construction of the $\tau$-function.

The coefficients $\zeta_0^{(k)}(n_1, n_2, n_3)$, $k = 1, 2, 3$, of the expansion of the wave function can be obtained from the factorization (17) and are given by
\[ \zeta_0^{(1)}(n_1, n_2, n_3) = 6^{\,n_3 + p}\, 4^{\,q} \left. \frac{W(m_1, m_2)}{t_1^{\,m_1}} \right|_{t_1 = 0}, \tag{22} \]
\[ \zeta_0^{(2)}(n_1, n_2, n_3) = 6^{\,n_2}\, 5^{\,n_3 + q}\, 4^{\,p}\, W(m_1, m_2) \big|_{t_2 = 0}, \tag{23} \]
\[ \zeta_0^{(3)}(n_1, n_2, n_3) = 5^{\,n_2 + n_3}\, 3^{\,p}\, 6^{\,q}\, W(m_1, m_2) \big|_{t_3 = 0}. \tag{24} \]
Using the definition of the $\tau$-function for nonzero $\rho_i$, i.e. Eq. (4), and putting $\tau = 0$ for points of the divisor $W_{g-1}$, we obtain the corresponding $\mathbb{F}_7$-valued solution of the discrete KP equation (6). This $\tau$-function is presented in Fig. 1. Notice that due to quasi-periodicity of the $\tau$-function, we have to calculate the solution in this way only for a finite range of values of the variables. The periods $\ell_i$, $i = 1, 2, 3$, of zeros of the $\tau$-function are, respectively, 31, 62 and 62. Equivalently, the “period vectors” of zeros can be chosen as
\[ v_1 = \begin{pmatrix} -4 \\ -1 \\ 1 \end{pmatrix}, \qquad v_2 = \begin{pmatrix} 11 \\ 2 \\ 0 \end{pmatrix}, \qquad v_3 = \begin{pmatrix} 2 \\ 6 \\ 0 \end{pmatrix}. \]
Because $c_{(1)1} = 6$, $c_{(2)1} = 2$ and $c_{(3)1} = 5$ (these are the values of $h_1/t_1^{31}$ at $A_1$ and of $h_1$ at $A_2$ and $A_3$; they agree with the factors $6^p$, $4^p$ and $3^p$ in Eqs. (22)–(24) and have multiplicative orders 2, 3 and 6 in $\mathbb{F}_7^*$), the periods of the functions $\rho_i$, $i = 1, 2, 3$, in the variable $n_1$ are, respectively, $2 \cdot 31$, $3 \cdot 31 \cdot 2$ and $6 \cdot 31$. Moreover, we have $\tau(31, 0, 0) = 3$, which gives $\tau(62, 0, 0) = 5 = 3^2 \cdot 6^{31} \mod 7$, and therefore the period of $\tau$ in $n_1$ is $6 \cdot 2 \cdot 31$.

Acknowledgements. The paper was partially supported by the University of Warmia and Mazury in Olsztyn under the grant 522-1307-0201 and by KBN grant 2 P03B 12622.
References
1. Belokolos, E.D., Bobenko, A.I., Enol’skii, V.Z., Its, A.R., Matveev, V.B.: Algebro-geometric approach to nonlinear integrable equations. Berlin: Springer-Verlag, 1994
2. Białecki, M.: Methods of algebraic geometry over finite fields in construction of integrable cellular automata. PhD dissertation, Warsaw University, Institute of Theoretical Physics, 2003
3. Białecki, M., Doliwa, A.: The discrete KP and KdV equations over finite fields. Theor. Math. Phys. 137, 1412–1418 (2003)
4. Bobenko, A., Bordemann, M., Gunn, Ch., Pinkall, U.: On two integrable cellular automata. Commun. Math. Phys. 158, 127–134 (1993)
5. Bruschi, M., Santini, P.M.: Cellular automata in 1+1, 2+1 and 3+1 dimensions, constants of motion and coherent structures. Physica D 70, 185–209 (1994)
6. Cornell, G., Silverman, J.H. (eds.): Arithmetic geometry. New York: Springer-Verlag, 1986
7. Doliwa, A., Białecki, M., Klimczewski, P.: The Hirota equation over finite fields: algebro-geometric approach and multisoliton solutions. J. Phys. A 36, 4827–4839 (2003)
8. Fokas, A.S., Papadopoulou, E.P., Saridakis, Y.G.: Soliton cellular automata. Physica D 41, 297–321 (1990)
9. Griffiths, P., Harris, J.: Principles of algebraic geometry. New York: John Wiley and Sons, 1978
10. Hartshorne, R.: Algebraic geometry. New York: Springer-Verlag, 1977
11. Hirota, R.: Nonlinear partial difference equations. III. Discrete sine-Gordon equation. J. Phys. Soc. Jpn. 43, 2079–2086 (1977)
12. Hirota, R.: Discrete analogue of a generalized Toda equation. J. Phys. Soc. Jpn. 50, 3785–3791 (1981)
13. Koblitz, N.: Algebraic aspects of cryptography. Berlin: Springer-Verlag, 1998
14. Krichever, I.M.: Algebraic curves and non-linear difference equations. Usp. Mat. Nauk 33, 215–216 (1978)
15. Krichever, I.M., Wiegmann, P., Zabrodin, A.: Elliptic solutions to difference non-linear equations and related many body problems. Commun. Math. Phys. 193, 373–396 (1998)
16. Lang, S.: Abelian varieties. New York: Interscience Publishers Inc., 1958
17. Lang, S.: Algebra. Reading, MA: Addison-Wesley, 1970
18. Matsukidaira, J., Satsuma, J., Takahashi, D., Tokihiro, T., Torii, M.: Toda-type cellular automaton and its N-soliton solution. Phys. Lett. A 225, 287–295 (1997)
19. Menezes, A.J., Wu, Y.H., Zuccherato, R.J.: An elementary introduction to hyperelliptic curves. Appendix in [13], pp. 151–178
20. Milne, J.S.: Jacobian varieties. Chapter VII in [6], pp. 167–212
21. Moreno, C.: Algebraic curves over finite fields. Cambridge: University Press, 1991
22. Mumford, D.: An algebro-geometric construction of commuting operators and of solutions to the Toda lattice equation, Korteweg–de Vries equation, and related nonlinear equations. Proceedings of the International Symposium on Algebraic Geometry (M. Nagata, ed.), Kinokuniya, Tokyo, 1978, pp. 115–153
23. von Neumann, J.: The general and logical theory of automata. In: The collected works of John von Neumann (A.W. Taub, ed.), Vol. 5, New York: Pergamon Press, 1963
24. Shafarevich, I.: Basic algebraic geometry. Heidelberg: Springer-Verlag, 1974
25. Stichtenoth, H.: Algebraic function fields and codes. Berlin: Springer-Verlag, 1993
26. Takahashi, D., Satsuma, J.: A soliton cellular automaton. J. Phys. Soc. Jpn. 59, 3514–3519 (1990)
27. Tokihiro, T., Takahashi, D., Matsukidaira, J., Satsuma, J.: From soliton equations to integrable cellular automata through a limiting procedure. Phys. Rev. Lett. 76, 3247–3250 (1996)
28. Ulam, S.: Random processes and transformations. In: Proceedings of the International Congress of Mathematicians, Cambridge, MA, 30 August–6 September 1950 (P. A. Smith, O. Zariski, eds.), Providence, RI: AMS, 1952, pp. 264–275
29. Wolfram, S.: Theory and application of cellular automata. Singapore: World Scientific, 1986

Communicated by L. Takhtajan
Commun. Math. Phys. 253, 171–219 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1132-5
Communications in
Mathematical Physics
Fusion Rules for the Vertex Operator Algebras $M(1)^+$ and $V_L^+$
Toshiyuki Abe1, Chongying Dong2, Haisheng Li3,4
1 Graduate School of Mathematical Sciences, University of Tokyo, 3-8-1 Komaba, Tokyo 153-8914, Japan
2 Department of Mathematics, University of California, Santa Cruz, CA 95064, USA
3 Department of Mathematical Sciences, Rutgers University, Camden, NJ 08102, USA
4 Department of Mathematics, Harbin Normal University, Harbin, P.R. China
Received: 27 October 2003 / Accepted: 4 December 2003 Published online: 8 July 2004 – © Springer-Verlag 2004
Abstract: The fusion rules for the vertex operator algebras $M(1)^+$ (of any rank) and $V_L^+$ (for any positive definite even lattice $L$) are determined completely.

Contents
1. Introduction
2. Preliminaries
2.1 Vertex operator algebras and modules
2.2 Intertwining operators and fusion rules
3. Vertex Operator Algebras $M(1)^+$ and $V_L^+$
3.1 Vertex operator algebras $M(1)^+$ and $V_L^+$ and their modules
3.2 Contragredient modules
4. Fusion Rules for Vertex Operator Algebra $M(1)^+$
4.1 Construction of intertwining operators
4.2 Main theorem
5. Fusion Rules for Vertex Operator Algebra $V_L^+$
5.1 Main theorem
5.2 Fusion rules among modules of untwisted types
5.3 Fusion rules involving modules of twisted type
5.4 Fusion product for $V_L^+$
5.5 Application
Supported by JSPS Research Fellowships for Young Scientists.
Partially supported by NSF grants and a research grant from the Committee on Research, UC Santa Cruz.
Partially supported by an NSA grant and a grant from Rutgers University Research Council.
1. Introduction

In this paper we study the orbifold vertex operator algebras $M(1)^+$ and $V_L^+$ for a positive definite even lattice $L$. The vertex operator algebra $V_L^+$ (see [FLM]) is the fixed point subalgebra of the lattice vertex operator algebra $V_L$ under the automorphism lifted from the $-1$ isometry of the lattice, and the vertex operator algebra $M(1)^+$ can be regarded as a subalgebra of $V_L$. The vertex operator algebra $V_L^+$ in the case that $L$ is the Leech lattice was first studied in [FLM] in order to construct the moonshine module vertex operator algebra $V^{\natural}$, which is a direct sum of $V_L^+$ and an irreducible $V_L^+$-module. This construction was extended to some other lattices in [DGM].

The vertex operator algebras $M(1)^+$ and $V_L^+$ have been studied extensively in the literature. The irreducible modules for both $M(1)^+$ and $V_L^+$ have been classified in [DN1, DN2, DN3, AD]. If $L$ is of rank 1, the fusion rules for these vertex operator algebras have also been determined in [A1, A2]. In this paper we determine the fusion rules for general $M(1)^+$ and $V_L^+$. It turns out that all of the fusion rules are either 0 or 1.

The fusion rules for $M(1)^+$ are obtained in the following way. First we construct certain untwisted and twisted intertwining operators which are similar to the untwisted and twisted vertex operators constructed in Chapters 8 and 9 of [FLM]. The main problem is to find an upper bound for each fusion rule. To achieve this we use a general result about the fusion rules for a tensor product vertex operator algebra to reduce the problem to the case when the rank is 1. Applying the fusion rules obtained in [A1] we get the required upper bound. In particular, the constructed intertwining operators are the only nonzero intertwining operators up to scalar multiples.

The determination of the fusion rules for $V_L^+$ is much more complicated. The main strategy is to employ the results (on fusion rules) for $M(1)^+$.
(Notice that $M(1)^+$ is a vertex operator subalgebra of $V_L^+$ and each irreducible $V_L^+$-module is a completely reducible $M(1)^+$-module.) First, we show that the fusion rules of certain types are nonzero by exhibiting nonzero intertwining operators. Then we prove that the fusion rules for $V_L^+$ are either 0 or 1. Observe that the intertwining operators constructed in [DL1] for $V_L$ restrict to nonzero (untwisted) intertwining operators for $V_L^+$. We then construct certain (nonzero) intertwining operators among untwisted and twisted $V_L$-modules and again restrict them to nonzero (twisted) intertwining operators for $V_L^+$. The main difficulty is in proving that the constructed intertwining operators are all the nonzero intertwining operators. This is achieved by a lengthy calculation involving commutativity and associativity of vertex operators.

As an application of our main result we show that if $L$ is self-dual and if $V_L^+$ extends to a vertex operator algebra by an irreducible module from the (unique) twisted $V_L$-module, then the resulting vertex operator algebra is always holomorphic in the sense that it is rational and the vertex operator algebra itself is the only irreducible module. The moonshine module vertex operator algebra is such an extension for the Leech lattice and thus it is holomorphic (this result was obtained previously in [D3]). It is expected that the main result will be useful in the future study of orbifold conformal field theory for $L$ not self-dual.

The organization of the paper is as follows. Section 2 is preliminary. In Sect. 2.1 we recall definitions of modules for vertex operator algebras, and in Sect. 2.2 we review the notion of intertwining operators and fusion rules and we also prove that fusion rules for a tensor product of two vertex operator algebras are equal to the product of fusion rules for each vertex operator algebra. In Sect.
3.1, we present the construction of vertex operator algebras M(1)+ and VL+ and their irreducible modules following [FLM].
The classifications of irreducible $M(1)^+$-modules and irreducible $V_L^+$-modules given in [DN1, DN2, DN3, AD] are also stated here. In Sect. 3.2 we identify the contragredient modules of irreducible $M(1)^+$-modules and $V_L^+$-modules. This result reduces the number of cases to be considered when calculating fusion rules. In Sect. 4 we determine the fusion rules for $M(1)^+$ completely. The nontrivial intertwining operators among irreducible $M(1)^+$-modules are constructed in Sect. 4.1, and it is proved that all of the fusion rules are either 0 or 1.

Throughout the paper, $\mathbb{Z}_{\ge 0}$ is the set of nonnegative integers.

2. Preliminaries

2.1. Vertex operator algebras and modules. In this section we recall certain basic notions such as the notions of (weak) twisted module and contragredient module (see [FLM, FHL, DLM3]). For any vector space $W$ (over $\mathbb{C}$) we set
$$W[[z, z^{-1}]] = \Big\{ \sum_{n\in\mathbb{Z}} v_n z^{-n-1} \;\Big|\; v_n \in W \Big\},$$
$$W((z)) = \Big\{ \sum_{n\in\mathbb{Z}} v_n z^{-n-1} \;\Big|\; v_n \in W,\ v_n = 0 \text{ for sufficiently large } n \Big\},$$
$$W\{z\} = \Big\{ \sum_{n\in\mathbb{C}} v_n z^{-n-1} \;\Big|\; v_n \in W \Big\}.$$
We first briefly recall the definition of vertex operator algebra (see [B, FLM]). A vertex operator algebra is a $\mathbb{Z}$-graded vector space $V = \bigoplus_{n\in\mathbb{Z}} V_{(n)}$ such that $\dim V_{(n)} < \infty$ for all $n \in \mathbb{Z}$ and such that $V_{(n)} = 0$ for $n$ sufficiently small, equipped with a linear map, called the vertex operator map,
$$Y(\,\cdot\,, z) : V \to (\mathrm{End}\, V)[[z, z^{-1}]], \qquad a \mapsto Y(a, z) = \sum_{n\in\mathbb{Z}} a_n z^{-n-1}.$$
The vertex operators $Y(a, z)$ satisfy the Jacobi identity. There are two distinguished vectors: the vacuum vector $\mathbf{1} \in V_{(0)}$ and the Virasoro element $\omega \in V_{(2)}$. It is assumed that $Y(\mathbf{1}, z) = \mathrm{id}_V$ and that the following Virasoro algebra relations hold for $m, n \in \mathbb{Z}$:
$$[L(m), L(n)] = (m - n)L(m + n) + \frac{1}{12}(m^3 - m)\,\delta_{m+n,0}\, c_V, \tag{2.1}$$
where $Y(\omega, z) = \sum_{n\in\mathbb{Z}} L(n) z^{-n-2}$ $\big(= \sum_{n\in\mathbb{Z}} \omega_n z^{-n-1}\big)$ and $c_V$ is a complex scalar, called the central charge of $V$. It is also assumed that for $n \in \mathbb{Z}$, the homogeneous subspace $V_{(n)}$ is the eigenspace for $L(0)$ of eigenvalue $n$. We say that a nonzero vector $v$ of $V_{(n)}$ is a homogeneous vector of weight $n$ and write $\mathrm{wt}(v) = n$.

Let $V$ be a vertex operator algebra, fixed throughout this section. An automorphism of the vertex operator algebra $V$ is a linear isomorphism $g$ of $V$ such that $g(\omega) = \omega$ and $gY(a, z)g^{-1} = Y(g(a), z)$ for any $a \in V$. A simple consequence of this definition is that $g(\mathbf{1}) = \mathbf{1}$ and that $g(V_{(n)}) = V_{(n)}$ for $n \in \mathbb{Z}$. Denote by $\mathrm{Aut}(V)$ the group of all automorphisms of $V$. For a subgroup $G < \mathrm{Aut}(V)$ the fixed point set $V^G = \{a \in V \mid g(a) = a \text{ for } g \in G\}$ is a vertex operator subalgebra.
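As a quick sanity check on the Virasoro relations (2.1), the following worked evaluation may help; it is only an illustration obtained by substituting small values of $m, n$ into (2.1) and is not taken from the paper. The central term vanishes whenever $m \in \{-1, 0, 1\}$, and first appears at $m = 2$:

```latex
% Substituting m, n into (2.1); the factor m^3 - m vanishes for m = -1, 0, 1.
\begin{align*}
[L(0), L(\pm 1)] &= \mp L(\pm 1), \\
[L(1), L(-1)]    &= 2L(0) + \tfrac{1}{12}(1 - 1)\,c_V = 2L(0), \\
[L(2), L(-2)]    &= 4L(0) + \tfrac{1}{12}(8 - 2)\,c_V = 4L(0) + \tfrac{1}{2}\,c_V .
\end{align*}
```

In particular $L(-1), L(0), L(1)$ span a copy of $\mathfrak{sl}_2$ on which the central charge plays no role.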
Let $g$ be an automorphism of the vertex operator algebra $V$ of (finite) order $T$. Then $V$ is decomposed into the eigenspaces for $g$:
$$V = \bigoplus_{r=0}^{T-1} V^r, \qquad V^r = \{\, a \in V \mid g(a) = e^{-2\pi i r/T} a \,\}.$$
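For orientation, here is the decomposition above written out in the only case used later in the paper, an automorphism of order $T = 2$ (such as the automorphism $\theta$ of Sect. 3). This is a direct specialization of the displayed formula; the labels $V^{\pm}$ are the usual notation for the $\pm 1$-eigenspaces.

```latex
% Specialization of V = \bigoplus_r V^r to T = 2, where e^{-2\pi i r/2} = (-1)^r.
\begin{align*}
V   &= V^0 \oplus V^1, \\
V^0 &= \{\, a \in V \mid g(a) = a \,\} = V^{+}, \\
V^1 &= \{\, a \in V \mid g(a) = -a \,\} = V^{-}.
\end{align*}
```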
Definition 2.1. A weak $g$-twisted $V$-module is a vector space $M$ equipped with a linear map
$$Y_M : V \to (\mathrm{End}\, M)\{z\}, \qquad a \mapsto Y_M(a, z) = \sum_{n\in\mathbb{Q}} a_n z^{-n-1} \quad (\text{where } a_n \in \mathrm{End}\, M),$$
called the vertex operator map, such that the following conditions hold for $0 \le r \le T - 1$, $a \in V^r$, $b \in V$ and $u \in M$:
(1) $Y_M(a, z)u \in z^{-r/T} M((z))$,
(2) $Y_M(\mathbf{1}, z) = \mathrm{id}_M$,
(3) (the twisted Jacobi identity)
$$z_0^{-1}\delta\!\left(\frac{z_1 - z_2}{z_0}\right) Y_M(a, z_1)Y_M(b, z_2) - z_0^{-1}\delta\!\left(\frac{z_2 - z_1}{-z_0}\right) Y_M(b, z_2)Y_M(a, z_1) = z_2^{-1}\left(\frac{z_1 - z_0}{z_2}\right)^{-r/T}\delta\!\left(\frac{z_1 - z_0}{z_2}\right) Y_M(Y(a, z_0)b, z_2).$$
A weak $g$-twisted $V$-module is denoted by $(M, Y_M)$, or simply by $M$. When $g = 1$, a weak $g$-twisted $V$-module is called a weak $V$-module. A $g$-twisted weak $V$-submodule of a $g$-twisted weak module $M$ is a subspace $N$ of $M$ such that $a_n N \subset N$ holds for all $a \in V$ and $n \in \mathbb{Q}$. If $M$ has no $g$-twisted weak $V$-submodule except $0$ and $M$, then $M$ is said to be irreducible.

It is known (see [DLM2]) that the operators $L(n)$ for $n \in \mathbb{Z}$ on $M$ with $Y_M(\omega, z) = \sum_{n\in\mathbb{Z}} L(n) z^{-n-2}$ also satisfy the Virasoro algebra relations (2.1). Moreover, we have the $L(-1)$-derivative property
$$Y_M(L(-1)a, z) = \frac{d}{dz} Y_M(a, z) \quad \text{for all } a \in V. \tag{2.2}$$
Definition 2.2. An admissible $g$-twisted $V$-module is a weak $g$-twisted $V$-module $M$ equipped with a $\frac{1}{T}\mathbb{N}$-grading $M = \bigoplus_{n\in\frac{1}{T}\mathbb{N}} M(n)$ such that
$$a_m M(n) \subset M(\mathrm{wt}(a) - m - 1 + n) \tag{2.3}$$
for any homogeneous $a \in V$ and for $n \in \frac{1}{T}\mathbb{N}$, $m \in \mathbb{Q}$.

In the case $g = 1$, an admissible $g$-twisted $V$-module is called an admissible $V$-module. A $g$-twisted weak $V$-submodule $N$ of a $g$-twisted admissible $V$-module is called a $g$-twisted admissible $V$-submodule if $N = \bigoplus_{n\in\frac{1}{T}\mathbb{N}} N \cap M(n)$. A $g$-twisted admissible $V$-module $M$ is said to be irreducible if $M$ has no nontrivial admissible submodule. A $g$-twisted admissible $V$-module $M$ is said to be completely reducible if $M$ is a direct sum of irreducible admissible submodules.
Definition 2.3. The vertex operator algebra $V$ is said to be $g$-rational if any $g$-twisted admissible $V$-module is completely reducible. If $V$ is $\mathrm{id}_V$-rational, then $V$ is said to be rational.

Definition 2.4. A $g$-twisted $V$-module is a weak $g$-twisted $V$-module $M$ which is $\mathbb{C}$-graded by $L(0)$-eigenspaces, $M = \bigoplus_{\lambda\in\mathbb{C}} M_{(\lambda)}$ (where $M_{(\lambda)} = \{u \in M \mid L(0)u = \lambda u\}$), such that $\dim M_{(\lambda)} < \infty$ for all $\lambda \in \mathbb{C}$ and such that for any fixed $\lambda \in \mathbb{C}$, $M_{(\lambda + n/T)} = 0$ for $n \in \mathbb{Z}$ sufficiently small. In the case $g = 1$, a $g$-twisted $V$-module is called a $V$-module.

A $V$-module $M$ is said to be irreducible if $M$ is irreducible as a weak $V$-module. The vertex operator algebra $V$ is said to be simple if $V$ as a $V$-module is irreducible.

Let $M = \bigoplus_{\lambda\in\mathbb{C}} M_{(\lambda)}$ be a $V$-module. Set $M' = \bigoplus_{\lambda\in\mathbb{C}} M_{(\lambda)}^{*}$, the restricted dual of $M$. It was proved in [FHL] that $M'$ is naturally a $V$-module where the vertex operator map, denoted by $Y'$, is defined by the property
$$\langle Y'(a, z)u', v\rangle = \langle u', Y(e^{zL(1)}(-z^{-2})^{L(0)} a, z^{-1})v\rangle \tag{2.4}$$
for $a \in V$, $u' \in M'$ and $v \in M$. The $V$-module $M'$ is called the contragredient module of $M$. It was proved therein that if $M$ is irreducible, then so is $M'$. A $V$-module $M$ is said to be self-dual if $M$ and $M'$ are isomorphic $V$-modules. Then a $V$-module $M$ is self-dual if and only if there exists a nondegenerate invariant bilinear form on $M$ in the sense that (2.4) with the obvious modification holds. The following result was proved in [L]:

Lemma 2.5. Let $V$ be a simple vertex operator algebra such that $L(1)V_{(1)} \neq V_{(0)}$. Then $V$ is self-dual.

2.2. Intertwining operators and fusion rules. We recall the definitions of the notions of intertwining operator and fusion rule from [FHL] and we prove a theorem about fusion rules for a tensor product vertex operator algebra.

Definition 2.6. Let $M^1$, $M^2$ and $M^3$ be weak $V$-modules. An intertwining operator $\mathcal{Y}(\,\cdot\,, z)$ of type $\binom{M^3}{M^1\, M^2}$ is a linear map
$$\mathcal{Y}(\,\cdot\,, z) : M^1 \to \mathrm{Hom}(M^2, M^3)\{z\}, \qquad v^1 \mapsto \mathcal{Y}(v^1, z) = \sum_{n\in\mathbb{C}} v^1_n z^{-n-1} \quad (\text{where } v^1_n \in \mathrm{Hom}(M^2, M^3))$$
satisfying the following conditions:
(1) For any $v^1 \in M^1$, $v^2 \in M^2$ and $\lambda \in \mathbb{C}$, $v^1_{n+\lambda} v^2 = 0$ for $n \in \mathbb{Z}$ sufficiently large.
(2) For any $a \in V$, $v^1 \in M^1$,
$$z_0^{-1}\delta\!\left(\frac{z_1 - z_2}{z_0}\right) Y_{M^3}(a, z_1)\mathcal{Y}(v^1, z_2) - z_0^{-1}\delta\!\left(\frac{z_2 - z_1}{-z_0}\right) \mathcal{Y}(v^1, z_2)Y_{M^2}(a, z_1) = z_2^{-1}\delta\!\left(\frac{z_1 - z_0}{z_2}\right) \mathcal{Y}(Y_{M^1}(a, z_0)v^1, z_2).$$
(3) For $v^1 \in M^1$, $\dfrac{d}{dz}\mathcal{Y}(v^1, z) = \mathcal{Y}(L(-1)v^1, z)$.
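A standard example may make Definition 2.6 more concrete. The following sketch is an illustration added here, not a statement quoted from the paper: for any $V$-module $M$, the module vertex operator map $Y_M$ is itself an intertwining operator, obtained by taking $M^1 = V$ and $M^2 = M^3 = M$.

```latex
% Checking conditions (1)-(3) of Definition 2.6 for \mathcal{Y} = Y_M,
% with M^1 = V and M^2 = M^3 = M (untwisted case, g = 1).
\begin{align*}
&\text{(1)}\quad Y_M(a, z)u \in M((z)), \ \text{so } a_n u = 0 \text{ for } n \text{ sufficiently large;} \\
&\text{(2)}\quad \text{the Jacobi identity for } \mathcal{Y} = Y_M \text{ is the Jacobi identity of the module } M
   \text{ (Definition 2.1 with } r = 0\text{);} \\
&\text{(3)}\quad \frac{d}{dz}\, Y_M(a, z) = Y_M(L(-1)a, z) \ \text{ is the } L(-1)\text{-derivative property (2.2).}
\end{align*}
% Hence Y_M is an intertwining operator of type \binom{M}{V\ M}.
```

In particular, the space of intertwining operators of type $\binom{M}{V\,M}$ is nonzero whenever $M \neq 0$.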
All of the intertwining operators of type $\binom{M^3}{M^1\, M^2}$ form a vector space, denoted by $I_V\binom{M^3}{M^1\, M^2}$. The dimension of $I_V\binom{M^3}{M^1\, M^2}$ is called the fusion rule of type $\binom{M^3}{M^1\, M^2}$ for $V$. The following result, given in [FHL] and [HL], describes the symmetry of the fusion rules:

Proposition 2.7. Let $M$, $N$ and $L$ be $V$-modules. Then there exist canonical vector space isomorphisms such that
$$I_V\binom{L}{M\, N} \cong I_V\binom{L}{N\, M} \cong I_V\binom{N'}{M\, L'}.$$

The following proposition can be found in [DL1, Prop. 11.9]:

Proposition 2.8. Let $M^i$ $(i = 1, 2, 3)$ be $V$-modules. Suppose that $M^1$ and $M^2$ are irreducible and that $I_V\binom{M^3}{M^1\, M^2} \neq 0$. Let $\mathcal{Y}(\,\cdot\,, z)$ be any nonzero intertwining operator of type $\binom{M^3}{M^1\, M^2}$. Then for any nonzero vectors $u \in M^1$ and $v \in M^2$, $\mathcal{Y}(u, z)v \neq 0$.

Assume that $U$ is a vertex operator subalgebra of $V$ (with the same Virasoro element). Then every $V$-module is naturally a $U$-module. Let $M^1$, $M^2$, $M^3$ be $V$-modules and let $N^1$ and $N^2$ be any $U$-submodules of $M^1$ and $M^2$, respectively. Clearly, any intertwining operator $\mathcal{Y}(\,\cdot\,, z)$ of type $\binom{M^3}{M^1\, M^2}$ in the category of $V$-modules is an intertwining operator of type $\binom{M^3}{M^1\, M^2}$ in the category of $U$-modules. Furthermore, the restriction of $\mathcal{Y}(\,\cdot\,, z)$ onto $N^1 \otimes N^2$ is an intertwining operator of type $\binom{M^3}{N^1\, N^2}$ in the category of $U$-modules. Then we have a restriction map
$$I_V\binom{M^3}{M^1\, M^2} \to I_U\binom{M^3}{N^1\, N^2}, \qquad \mathcal{Y}(\,\cdot\,, z) \mapsto \mathcal{Y}(\,\cdot\,, z)|_{N^1 \otimes N^2}.$$
Now, assume that $M^1$, $M^2$ are irreducible $V$-modules and $M^3$ is any $V$-module (not necessarily irreducible) and assume that $N^1$ and $N^2$ are nonzero $U$-submodules, e.g., irreducible $U$-submodules. It follows immediately from Proposition 2.8 that the restriction map is injective. Therefore we have proved:

Proposition 2.9. Let $V$ be a vertex operator algebra and let $M^1$, $M^2$, $M^3$ be $V$-modules among which $M^1$ and $M^2$ are irreducible. Suppose that $U$ is a vertex operator subalgebra of $V$ (with the same Virasoro element) and that $N^1$ and $N^2$ are irreducible $U$-submodules of $M^1$ and $M^2$, respectively. Then the restriction map from $I_V\binom{M^3}{M^1\, M^2}$ to $I_U\binom{M^3}{N^1\, N^2}$ is injective. In particular,
$$\dim I_V\binom{M^3}{M^1\, M^2} \le \dim I_U\binom{M^3}{N^1\, N^2}. \tag{2.5}$$
Let $V^1$ and $V^2$ be vertex operator algebras, let $M^i$ $(i = 1, 2, 3)$ be $V^1$-modules and let $N^i$ $(i = 1, 2, 3)$ be $V^2$-modules. For any intertwining operator $\mathcal{Y}_1(\,\cdot\,, z)$ of type $\binom{M^3}{M^1\, M^2}$ and for any intertwining operator $\mathcal{Y}_2(\,\cdot\,, z)$ of type $\binom{N^3}{N^1\, N^2}$, by using commutativity and rationality one can prove that $\mathcal{Y}_1(\,\cdot\,, z) \otimes \mathcal{Y}_2(\,\cdot\,, z)$ is an intertwining operator of type $\binom{M^3 \otimes N^3}{M^1 \otimes N^1\ \ M^2 \otimes N^2}$, where $(\mathcal{Y}_1 \otimes \mathcal{Y}_2)(\,\cdot\,, z)$ is defined by
$$(\mathcal{Y}_1 \otimes \mathcal{Y}_2)(u^1 \otimes v^1, z)(u^2 \otimes v^2) = \mathcal{Y}_1(u^1, z)u^2 \otimes \mathcal{Y}_2(v^1, z)v^2$$
for $u^i \in M^i$ and $v^i \in N^i$ $(i = 1, 2)$. Then we have a canonical linear map
$$\sigma : I_{V^1}\binom{M^3}{M^1\, M^2} \otimes I_{V^2}\binom{N^3}{N^1\, N^2} \to I_{V^1 \otimes V^2}\binom{M^3 \otimes N^3}{M^1 \otimes N^1\ \ M^2 \otimes N^2}, \qquad \mathcal{Y}_1(\,\cdot\,, z) \otimes \mathcal{Y}_2(\,\cdot\,, z) \mapsto (\mathcal{Y}_1 \otimes \mathcal{Y}_2)(\,\cdot\,, z).$$
The following is our main theorem of this section:

Theorem 2.10. With the above setting, the linear map $\sigma$ is one-to-one. Furthermore, if either
$$\dim I_{V^1}\binom{M^3}{M^1\, M^2} < \infty \quad \text{or} \quad \dim I_{V^2}\binom{N^3}{N^1\, N^2} < \infty,$$
then $\sigma$ is a linear isomorphism.

To prove this theorem we shall need some preparation. Denote by $\omega^i$ the Virasoro element of $V^i$ for $i = 1, 2$, and write
$$Y(\omega^i, x) = \sum_{n\in\mathbb{Z}} L^i(n) x^{-n-2}.$$
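In terms of dimensions, Theorem 2.10 says that fusion rules are multiplicative under tensor products, which is how the introduction describes the result. The following restatement is a sketch in notation introduced here for illustration only (the symbols $N(\,\cdot\,)$ are not used in the paper):

```latex
% Hypothetical shorthand: N_{V}(...) denotes the dimension of the space of
% intertwining operators of the indicated type.
\begin{align*}
N_{V^1 \otimes V^2}\binom{M^3 \otimes N^3}{M^1 \otimes N^1\ \ M^2 \otimes N^2}
  &= \dim I_{V^1 \otimes V^2}\binom{M^3 \otimes N^3}{M^1 \otimes N^1\ \ M^2 \otimes N^2} \\
  &= \dim I_{V^1}\binom{M^3}{M^1\ M^2} \cdot \dim I_{V^2}\binom{N^3}{N^1\ N^2},
\end{align*}
% valid when \sigma is an isomorphism, e.g. when one of the two factors on the
% right is finite. For instance, if both factors are 0 or 1, so is the product.
```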
The following proposition is a modification and a generalization of Proposition 13.18 of [DL1]. It can also be proved in the same way.

Proposition 2.11. Let $V^1$ and $V^2$ be vertex operator algebras and let $W^i$ $(i = 1, 2, 3)$ be $V^1 \otimes V^2$-modules on which both $L^1(0)$ and $L^2(0)$ act semisimply. Let $\mathcal{Y}(\,\cdot\,, x)$ be an intertwining operator of type $\binom{W^3}{W^1\, W^2}$ for $V^1 \otimes V^2$. Then for any $h \in \mathbb{C}$,
$$x^{-L^2(0)} P_h \mathcal{Y}(x^{L^2(0)}\,\cdot\,, x)\, x^{L^2(0)}$$
is an intertwining operator of type $\binom{W^3(L^2(0),\, h)}{W^1\, W^2}$ for $V^1$-modules, where $W^3(L^2(0), h)$ is the $L^2(0)$-eigenspace of $W^3$ with eigenvalue $h$, which is naturally a (weak) $V^1$-module, and $P_h$ is the projection of $W^3$ onto $W^3(L^2(0), h)$.

For a vector space $U$, we say that a formal series $a(z) = \sum_{n\in\mathbb{C}} a(n) z^n \in U\{z\}$ is lower truncated if $a(n) = 0$ for $n$ whose real part is sufficiently small. Furthermore, for vector spaces $A$ and $B$, a linear map $g(z)$ from $A$ to $B\{z\}$ is said to be lower truncated if $g(z)$ sends every vector in $A$ to a lower truncated series in $B\{z\}$. With these notions we formulate the following result, which will be very useful in our proof of Theorem 2.10:
Lemma 2.12. Let $W = \bigoplus_{h\in\mathbb{C}} W_{(h)}$ be a $\mathbb{C}$-graded vector space satisfying the condition that $\dim W_{(h)} < \infty$ for any $h \in \mathbb{C}$ and that $W_{(h)} = 0$ for $h$ whose real part is sufficiently small. Let $A$ and $B$ be any vector spaces, and let $g_i(x)$ $(i = 1, \ldots, r)$ be linearly independent lower truncated linear maps from $A$ to $B\{x\}$. Suppose that $f_i(x) \in W\{x\}$ $(i = 1, \ldots, r)$ are lower truncated formal series such that for any $h \in \mathbb{C}$, there exists $s \in \mathbb{C}$ such that $P_h f_i(x) \in x^s W_{(h)}$ for all $i$, where $P_h$ is the projection map of $W$ onto $W_{(h)}$, and such that
$$f_1(x) \otimes g_1(x) + \cdots + f_r(x) \otimes g_r(x) = 0$$
as an element of $\mathrm{Hom}(A, (W \otimes B)\{x\})$. Then $f_i(x) = 0$ for all $i$.

Proof. For any $\eta \in W^*$, we extend $\eta$ to a linear map from $W \otimes B$ to $B$ by $\eta(w \otimes u) = \eta(w)u$ for $w \in W$ and $u \in B$, and then canonically extend it to a linear map from $W \otimes (B\{x\})$ to $B\{x\}$. For any $h \in \mathbb{C}$, $\eta \in W^*$ and $u \in A$, we see that
$$\eta\big(P_h(f_1(x) \otimes g_1(x)(u) + \cdots + f_r(x) \otimes g_r(x)(u))\big) = x^s\big(\eta(w_1)g_1(x)(u) + \cdots + \eta(w_r)g_r(x)(u)\big) = 0,$$
where we set $P_h f_i(x) = x^s w_i$ with $w_i \in W_{(h)}$. Since $g_i(x)$ $(i = 1, \ldots, r)$ are linearly independent linear maps from $A$ to $B\{x\}$, $\eta(w_i) = 0$ for all $i$. Thus $w_i = 0$ for any $h \in \mathbb{C}$ and $i$, that is, $P_h f_i(x) = 0$. This implies $f_i(x) = 0$ for all $i$.

Now we prove Theorem 2.10.

Proof. For $h \in \mathbb{C}$, let $P_h$ be the projection map of $M^3 \otimes N^3$ onto $(M^3)_{(h)} \otimes N^3$.
Suppose that $\mathcal{Y}_1^i(\,\cdot\,, x)$ for $i = 1, \ldots, r$ are intertwining operators of type $\binom{M^3}{M^1\, M^2}$ and suppose that $\mathcal{Y}_2^i(\,\cdot\,, x)$ for $i = 1, \ldots, r$ are linearly independent intertwining operators of type $\binom{N^3}{N^1\, N^2}$. Assume that
$$\sum_{i=1}^{r} (\mathcal{Y}_1^i \otimes \mathcal{Y}_2^i)(\,\cdot\,, x) = 0.$$
That is,
$$\sum_{i=1}^{r} \mathcal{Y}_1^i(w^1, x)w^2 \otimes \mathcal{Y}_2^i(v^1, x)v^2 = 0 \tag{2.6}$$
for $w^j \in M^j$, $v^j \in N^j$ with $j = 1, 2$. Write
$$\mathcal{Y}_1^i(w^1, x)w^2 = \sum_{n\in\mathbb{C}} f_n^i(w^1, w^2)\, x^{-n-1}.$$
From [FHL], for homogeneous vectors $w^1$, $w^2$, we have
$$L^1(0) f_n^i(w^1, w^2) = (\mathrm{wt}(w^1) + \mathrm{wt}(w^2) - n - 1)\, f_n^i(w^1, w^2).$$
Then for any $h \in \mathbb{C}$,
$$P_h \mathcal{Y}_1^i(w^1, x)w^2 = f^i_{\mathrm{wt}(w^1)+\mathrm{wt}(w^2)-h-1}(w^1, w^2)\, x^{h-\mathrm{wt}(w^1)-\mathrm{wt}(w^2)} \in x^{h-\mathrm{wt}(w^1)-\mathrm{wt}(w^2)} (M^3)_{(h)}. \tag{2.7}$$
Now it follows immediately from (2.6) and Lemma 2.12 that
$$\mathcal{Y}_1^i(w^1, x)w^2 = 0 \quad \text{for } i = 1, \ldots, r.$$
Thus $\mathcal{Y}_1^i(\,\cdot\,, x) = 0$ for all $i$. This proves that $\sigma$ is injective.
Assume $\dim I_{V^2}\binom{N^3}{N^1\, N^2} < \infty$. We are going to show that $\sigma$ is also surjective. Let $\mathcal{Y}(\,\cdot\,, x)$ be any intertwining operator of type $\binom{M^3 \otimes N^3}{M^1 \otimes N^1\ \ M^2 \otimes N^2}$ for $V^1 \otimes V^2$. We must prove that $\mathcal{Y}(\,\cdot\,, x) \in \mathrm{Im}\, \sigma$.

Let $\mathcal{Y}_2^i(\,\cdot\,, x)$ $(i = 1, \ldots, r)$ be a basis of $I_{V^2}\binom{N^3}{N^1\, N^2}$. We fix vectors $w^1 \in M^1$, $w^2 \in M^2$ arbitrarily. By Proposition 2.11 (with the roles of $V^1$ and $V^2$ exchanged), for $h \in \mathbb{C}$,
$$x^{-L^1(0)} P_h \mathcal{Y}(x^{L^1(0)} w^1 \otimes \,\cdot\,, x)(x^{L^1(0)} w^2 \otimes \,\cdot\,)$$
is an intertwining operator of type $\binom{(M^3)_{(h)} \otimes N^3}{N^1\, N^2}$ for $V^2$-modules. (Notice that $(M^3)_{(h)} \otimes N^3$ is the $L^1(0)$-eigenspace of eigenvalue $h$.) Since $\dim (M^3)_{(h)} < \infty$, we have
$$I_{V^2}\binom{(M^3)_{(h)} \otimes N^3}{N^1\, N^2} \cong (M^3)_{(h)} \otimes I_{V^2}\binom{N^3}{N^1\, N^2}.$$
Thus for any $v^1 \in N^1$ and $v^2 \in N^2$, we can write
$$x^{-L^1(0)} P_h \mathcal{Y}((x^{L^1(0)} w^1) \otimes v^1, x)(x^{L^1(0)} w^2 \otimes v^2) = \sum_{i=1}^{r} f_i(w^1, w^2, h) \otimes \mathcal{Y}_2^i(v^1, x)v^2$$
for some $f_i(w^1, w^2, h) \in (M^3)_{(h)}$. That is,
$$P_h \mathcal{Y}(w^1 \otimes v^1, x)(w^2 \otimes v^2) = \sum_{i=1}^{r} x^h f_i(x^{-L^1(0)} w^1, x^{-L^1(0)} w^2, h) \otimes \mathcal{Y}_2^i(v^1, x)v^2.$$
Then
$$\mathcal{Y}(w^1 \otimes v^1, x)(w^2 \otimes v^2) = \sum_{h\in\mathbb{C}} \sum_{i=1}^{r} x^h f_i(x^{-L^1(0)} w^1, x^{-L^1(0)} w^2, h) \otimes \mathcal{Y}_2^i(v^1, x)v^2$$
for any $v^1 \in N^1$ and $v^2 \in N^2$. Now we set
$$\mathcal{Y}_1^i(w^1, x)w^2 = \sum_{h\in\mathbb{C}} f_i(x^{-L^1(0)} w^1, x^{-L^1(0)} w^2, h)\, x^h.$$
Since $M^3$ is an ordinary $V^1$-module, for each $i$, $\mathcal{Y}_1^i(w^1, x)w^2$ is a lower truncated element of $M^3\{x\}$. For example, when $w^1 \in M^1$, $w^2 \in M^2$ are homogeneous, we have
$$\mathcal{Y}_1^i(w^1, x)w^2 = \sum_{h\in\mathbb{C}} f_i(w^1, w^2, h)\, x^{h-\mathrm{wt}(w^1)-\mathrm{wt}(w^2)}.$$
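Then
$$P_h \mathcal{Y}_1^i(w^1, x)w^2 \in x^{h-\mathrm{wt}(w^1)-\mathrm{wt}(w^2)} (M^3)_{(h)}.$$
Furthermore, for a homogeneous vector $a \in V^1$ and for $n \in \mathbb{Z}$, we have
$$P_h a_n \mathcal{Y}_1^i(w^1, x)w^2 = a_n P_{h-\mathrm{wt}(a)+n+1} \mathcal{Y}_1^i(w^1, x)w^2 \in x^{h-\mathrm{wt}(a)+n+1-\mathrm{wt}(w^1)-\mathrm{wt}(w^2)} (M^3)_{(h)}.$$
We are going to prove that the $\mathcal{Y}_1^i(\,\cdot\,, x)$ are intertwining operators, so that we will have $\mathcal{Y}(\,\cdot\,, x) \in \mathrm{Im}\, \sigma$.

Noticing that $L(-1) = L^1(-1) \otimes 1 + 1 \otimes L^2(-1)$, and using the $L(-1)$ (resp. $L^2(-1)$)-derivative property for $\mathcal{Y}(\,\cdot\,, x)$ (resp. $\mathcal{Y}_2^i(\,\cdot\,, x)$), we get
$$\begin{aligned}
\sum_{i=1}^{r} \mathcal{Y}_1^i(L^1(-1)w^1, x)w^2 \otimes \mathcal{Y}_2^i(v^1, x)v^2
&= \mathcal{Y}(L(-1)(w^1 \otimes v^1), x)(w^2 \otimes v^2) - \sum_{i=1}^{r} \mathcal{Y}_1^i(w^1, x)w^2 \otimes \mathcal{Y}_2^i(L^2(-1)v^1, x)v^2 \\
&= \frac{d}{dx}\mathcal{Y}(w^1 \otimes v^1, x)(w^2 \otimes v^2) - \sum_{i=1}^{r} \mathcal{Y}_1^i(w^1, x)w^2 \otimes \frac{d}{dx}\mathcal{Y}_2^i(v^1, x)v^2 \\
&= \sum_{i=1}^{r} \frac{d}{dx}\mathcal{Y}_1^i(w^1, x)w^2 \otimes \mathcal{Y}_2^i(v^1, x)v^2.
\end{aligned}$$
Since $\mathcal{Y}_2^i(\,\cdot\,, x)$ $(i = 1, \ldots, r)$ are linearly independent, by Lemma 2.12 we get
$$\mathcal{Y}_1^i(L^1(-1)w^1, x)w^2 = \frac{d}{dx}\mathcal{Y}_1^i(w^1, x)w^2 \tag{2.8}$$
for any $i = 1, \ldots, r$ and $w^j \in M^j$ $(j = 1, 2)$.

Finally, we show that each $\mathcal{Y}_1^i(\,\cdot\,, x)$ satisfies the Jacobi identity. Let $a \in V^1$, $w^1 \in M^1$, $w^2 \in M^2$. By linearity we may assume that $a$, $w^1$ and $w^2$ are homogeneous. From the Jacobi identity
$$\begin{aligned}
& x_0^{-1}\delta\!\left(\frac{x_1 - x_2}{x_0}\right) Y(a \otimes 1, x_1)\mathcal{Y}(w^1 \otimes v^1, x_2)(w^2 \otimes v^2) - x_0^{-1}\delta\!\left(\frac{x_2 - x_1}{-x_0}\right) \mathcal{Y}(w^1 \otimes v^1, x_2)Y(a \otimes 1, x_1)(w^2 \otimes v^2) \\
&\qquad = x_2^{-1}\delta\!\left(\frac{x_1 - x_0}{x_2}\right) \mathcal{Y}(Y(a \otimes 1, x_0)(w^1 \otimes v^1), x_2)(w^2 \otimes v^2),
\end{aligned}$$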
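we get
$$\begin{aligned}
& \sum_{i=1}^{r} x_0^{-1}\delta\!\left(\frac{x_1 - x_2}{x_0}\right) Y(a, x_1)\mathcal{Y}_1^i(w^1, x_2)w^2 \otimes \mathcal{Y}_2^i(v^1, x_2)v^2 \\
&\quad - \sum_{i=1}^{r} x_0^{-1}\delta\!\left(\frac{x_2 - x_1}{-x_0}\right) \mathcal{Y}_1^i(w^1, x_2)Y(a, x_1)w^2 \otimes \mathcal{Y}_2^i(v^1, x_2)v^2 \\
&\qquad = \sum_{i=1}^{r} x_2^{-1}\delta\!\left(\frac{x_1 - x_0}{x_2}\right) \mathcal{Y}_1^i(Y(a, x_0)w^1, x_2)w^2 \otimes \mathcal{Y}_2^i(v^1, x_2)v^2
\end{aligned} \tag{2.9}$$
for any $v^j \in N^j$ $(j = 1, 2)$. For $m, n \in \mathbb{Z}$ and $h \in \mathbb{C}$, we have
$$\begin{aligned}
\mathrm{Res}_{x_1}\, x_1^n (x_1 - x_2)^m P_h Y(a, x_1)\mathcal{Y}_1^i(w^1, x_2)w^2
&= \sum_{j=0}^{\infty} \binom{m}{j} (-x_2)^j P_h\, a_{n+m-j} \mathcal{Y}_1^i(w^1, x_2)w^2 \\
&= \sum_{j=0}^{\infty} \binom{m}{j} (-x_2)^j a_{n+m-j} P_{h-\mathrm{wt}(a)+n+m-j+1} \mathcal{Y}_1^i(w^1, x_2)w^2 \\
&\in x_2^{h-\mathrm{wt}(a)-\mathrm{wt}(w^1)-\mathrm{wt}(w^2)+n+m+1} (M^3)_{(h)}.
\end{aligned} \tag{2.10}$$
Similarly, we have
$$\mathrm{Res}_{x_1}\, x_1^n (x_1 - x_2)^m P_h \mathcal{Y}_1^i(w^1, x_2)Y(a, x_1)w^2 \in x_2^{h-\mathrm{wt}(a)-\mathrm{wt}(w^1)-\mathrm{wt}(w^2)+n+m+1} (M^3)_{(h)}, \tag{2.11}$$
and
$$\begin{aligned}
& \mathrm{Res}_{x_0}\mathrm{Res}_{x_1}\, x_0^m x_1^n x_2^{-1}\delta\!\left(\frac{x_1 - x_0}{x_2}\right) P_h \mathcal{Y}_1^i(Y(a, x_0)w^1, x_2)w^2 \\
&\quad = \mathrm{Res}_{x_0}\, x_0^m (x_2 + x_0)^n P_h \mathcal{Y}_1^i(Y(a, x_0)w^1, x_2)w^2 \\
&\quad = \sum_{j=0}^{\infty} \binom{n}{j} x_2^{n-j} P_h \mathcal{Y}_1^i(a_{m+j}w^1, x_2)w^2 \\
&\quad \in x_2^{h-\mathrm{wt}(a)-\mathrm{wt}(w^1)-\mathrm{wt}(w^2)+n+m+1} (M^3)_{(h)}.
\end{aligned} \tag{2.12}$$
With (2.9)–(2.12), it follows from Lemma 2.12 that each $\mathcal{Y}_1^i(\,\cdot\,, x)$ satisfies the Jacobi identity. Then the $\mathcal{Y}_1^i(\,\cdot\,, x)$ are intertwining operators. This shows that $\sigma$ is onto, completing the proof.

3. Vertex Operator Algebras $M(1)^+$ and $V_L^+$

3.1. Vertex operator algebras $M(1)^+$ and $V_L^+$ and their modules. In this section we review the construction of the vertex operator algebras $M(1)^+$ and $V_L^+$ associated with a positive definite even lattice $L$, following [FLM].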
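Let $\mathfrak{h}$ be a $d$-dimensional vector space equipped with a nondegenerate symmetric bilinear form $(\cdot\,, \cdot)$. Consider the Lie algebra $\hat{\mathfrak{h}} = \mathfrak{h} \otimes \mathbb{C}[t, t^{-1}] \oplus \mathbb{C}C$ defined by the commutation relations
$$[\beta_1 \otimes t^m, \beta_2 \otimes t^n] = m(\beta_1, \beta_2)\delta_{m,-n} C \quad \text{and} \quad [C, \hat{\mathfrak{h}}] = 0$$
for any $\beta_1, \beta_2 \in \mathfrak{h}$, $m, n \in \mathbb{Z}$. Set $\hat{\mathfrak{h}}^+ = \mathbb{C}[t] \otimes \mathfrak{h} \oplus \mathbb{C}C$, which is clearly an abelian subalgebra. For any $\lambda \in \mathfrak{h}$, let $\mathbb{C}e^\lambda$ denote the 1-dimensional $\hat{\mathfrak{h}}^+$-module on which $\mathfrak{h} \otimes t\mathbb{C}[t]$ acts as zero, $\mathfrak{h}$ $(= \mathfrak{h} \otimes \mathbb{C}t^0)$ acts according to the character $\lambda$, i.e., $he^\lambda = (\lambda, h)e^\lambda$ for $h \in \mathfrak{h}$, and $C$ acts as the scalar 1. Set
$$M(1, \lambda) = U(\hat{\mathfrak{h}}) \otimes_{U(\hat{\mathfrak{h}}^+)} \mathbb{C}e^\lambda \cong S(t^{-1}\mathbb{C}[t^{-1}] \otimes \mathfrak{h}),$$
the induced $\hat{\mathfrak{h}}$-module. For $h \in \mathfrak{h}$, $n \in \mathbb{Z}$, we denote by $h(n)$ the corresponding operator of $h \otimes t^n$ on $M(1, \lambda)$, and write
$$h(z) = \sum_{n\in\mathbb{Z}} h(n) z^{-n-1}.$$
Define a linear map
$$Y(\,\cdot\,, z) : M(1, 0) \to (\mathrm{End}\, M(1, \lambda))[[z, z^{-1}]] \tag{3.1}$$
by
$$Y(v, z) = {}^\circ_\circ \left[\frac{1}{(n_1 - 1)!}\left(\frac{d}{dz}\right)^{n_1 - 1}\beta_1(z)\right] \cdots \left[\frac{1}{(n_r - 1)!}\left(\frac{d}{dz}\right)^{n_r - 1}\beta_r(z)\right] {}^\circ_\circ,$$
for the vector $v = \beta_1(-n_1) \cdots \beta_r(-n_r)e^0$ with $\beta_i \in \mathfrak{h}$, $n_i \ge 1$, where the normal ordering ${}^\circ_\circ \cdot {}^\circ_\circ$ is an operation which reorders the operators so that the $\beta(n)$ $(\beta \in \mathfrak{h}, n < 0)$ are placed to the left of the $\beta(n)$ $(\beta \in \mathfrak{h}, n \ge 0)$. Following [FLM], we denote $M(1) = M(1, 0)$ and set
$$\mathbf{1} = e^0 \in M(1), \qquad \omega = \frac{1}{2}\sum_{i=1}^{d} h_i(-1)^2 e^0 \in M(1),$$
where $\{h_1, \ldots, h_d\}$ is an orthonormal basis of $\mathfrak{h}$. (Note that $\omega$ does not depend on the choice of the orthonormal basis.) Then $(M(1), Y(\,\cdot\,, z), \mathbf{1}, \omega)$ is a simple vertex operator algebra, and $(M(1, \lambda), Y(\,\cdot\,, z))$ is an irreducible $M(1)$-module for any $\lambda \in \mathfrak{h}$ (see [FLM]).

We next recall a construction of the vertex operator algebra $V_L$ associated to an even lattice and its irreducible modules, following [DL1] (see also [FLM] and [D1]). First we start with a rank $d$ rational lattice $P$ with a positive definite symmetric $\mathbb{Z}$-bilinear form $(\cdot\,, \cdot)$. We suppose that $L$ is a rank $d$ even sublattice of $P$ such that $(L, P) \subset \mathbb{Z}$.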
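For the Heisenberg modules $M(1, \lambda)$ constructed above, the $L(0)$-weight of the generator $e^\lambda$ can be read off directly from the definitions of $\omega$ and of the normal ordering. The following computation is a sketch added for illustration; it uses only the action of $h(n)$ on $e^\lambda$ described above.

```latex
% Y(\omega, z) = \frac12 \sum_i {}^\circ_\circ h_i(z) h_i(z) {}^\circ_\circ ;
% L(0) is the coefficient of z^{-2}.
\begin{align*}
L(0) &= \frac{1}{2}\sum_{i=1}^{d} h_i(0)^2 + \sum_{i=1}^{d}\sum_{k \ge 1} h_i(-k)\,h_i(k), \\
L(0)\,e^{\lambda}
     &= \frac{1}{2}\sum_{i=1}^{d} (\lambda, h_i)^2\, e^{\lambda}
        \qquad (\text{since } h_i(k)e^{\lambda} = 0 \text{ for } k \ge 1) \\
     &= \frac{(\lambda, \lambda)}{2}\, e^{\lambda}
        \qquad (\text{since } \{h_i\} \text{ is an orthonormal basis}).
\end{align*}
```

So $e^\lambda$ spans the lowest weight space of $M(1, \lambda)$, with weight $(\lambda, \lambda)/2$.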
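Let $q$ be a positive even integer such that $(\lambda, \mu) \in \frac{2}{q}\mathbb{Z}$ for all $\lambda, \mu \in P$ and let $\hat{P}$ be a central extension of $P$ by the cyclic group $\langle\kappa_q\rangle$ of order $q$:
$$1 \to \langle \kappa_q \mid \kappa_q^q = 1 \rangle \to \hat{P} \to P \to 0$$
with commutator map $c(\cdot\,, \cdot)$ such that $c(\alpha, \beta) = \kappa^{(\alpha,\beta)}$ for $\alpha, \beta \in L$, where $\kappa = \kappa_q^{q/2}$. It is known that such a central extension exists if $q$ is sufficiently large (see Remark 12.18 in [DL1]). Let $e : P \to \hat{P}$, $\lambda \mapsto e_\lambda$ be a section such that $e_0 = 1$ and let $\epsilon : P \times P \to \langle\kappa_q\rangle$ be the corresponding 2-cocycle, i.e., $e_\lambda e_\mu = \epsilon(\lambda, \mu)e_{\lambda+\mu}$ for any $\lambda, \mu \in P$. We can assume that $\epsilon$ is bimultiplicative. Then
$$\epsilon(\alpha, \beta)\epsilon(\beta, \alpha)^{-1} = \kappa^{(\alpha,\beta)}, \qquad \epsilon(\alpha + \beta, \gamma) = \epsilon(\alpha, \gamma)\epsilon(\beta, \gamma).$$
We may further assume that $\epsilon(\alpha, \alpha) = \kappa^{(\alpha,\alpha)/2}$ for any $\alpha \in L$. Denote by $\mathbb{C}[P] = \bigoplus_{\lambda\in P}\mathbb{C}e^\lambda$ the group algebra. For any subset $M$ of $P$, we write $\mathbb{C}[M] = \bigoplus_{\lambda\in M}\mathbb{C}e^\lambda$. Then $\mathbb{C}[P]$ becomes a $\hat{P}$-module by the action
$$e_\lambda e^\mu = \epsilon(\lambda, \mu)e^{\lambda+\mu} \quad \text{and} \quad \kappa_q e^\mu = \omega_q e^\mu \tag{3.2}$$
for $\lambda, \mu \in P$, where $\omega_q \in \mathbb{C}^\times$ is a $q$th root of unity. It is clear that for any $\lambda \in P$, $\mathbb{C}[\lambda + L]$ is an $\hat{L}$-module on which $\kappa$ $(= \kappa_q^{q/2})$ acts by the scalar $-1$.

Set $\mathfrak{h} = \mathbb{C} \otimes_{\mathbb{Z}} L$ and extend the $\mathbb{Z}$-bilinear form $(\cdot\,, \cdot)$ to a $\mathbb{C}$-bilinear form on $\mathfrak{h}$. Then $V_P := M(1) \otimes \mathbb{C}[P]$ is endowed with an $\hat{\mathfrak{h}}$-module structure such that
$$h(n)(u \otimes e^\lambda) = (h(n)u) \otimes e^\lambda \quad \text{and} \quad h(0)(u \otimes e^\lambda) = (h, \lambda)(u \otimes e^\lambda)$$
for $h \in \mathfrak{h}$, $n \neq 0$, $\lambda \in P$, and such that $C$ acts as the identity. We have
$$V_P \cong \bigoplus_{\lambda\in P} M(1, \lambda)$$
as an $M(1)$-module. For any subset $M$ of $P$, we set $V_M = M(1) \otimes \mathbb{C}[M]$, which is an $M(1)$-submodule of $V_P$. For $\lambda \in P$, we define $Y(e^\lambda, z) \in (\mathrm{End}\, V_P)\{z\}$ by
$$Y(e^\lambda, z) = \exp\!\left(\sum_{n=1}^{\infty}\frac{\lambda(-n)}{n}z^n\right)\exp\!\left(-\sum_{n=1}^{\infty}\frac{\lambda(n)}{n}z^{-n}\right)e_\lambda z^\lambda, \tag{3.3}$$
where $e_\lambda$ is the left action of $e_\lambda \in \hat{P}$ on $\mathbb{C}[P]$ and $z^\lambda$ is the operator on $\mathbb{C}[P]$ defined by $z^\lambda e^\mu = z^{(\lambda,\mu)}e^\mu$. The vertex operator associated to the vector $v = \beta_1(-n_1) \cdots \beta_r(-n_r)e^\lambda$ for $\beta_i \in \mathfrak{h}$, $n_i \ge 1$ and $\lambda \in P$ is defined by
$$Y(v, z) = {}^\circ_\circ \left[\frac{1}{(n_1 - 1)!}\left(\frac{d}{dz}\right)^{n_1 - 1}\beta_1(z)\right] \cdots \left[\frac{1}{(n_r - 1)!}\left(\frac{d}{dz}\right)^{n_r - 1}\beta_r(z)\right] Y(e^\lambda, z)\, {}^\circ_\circ,$$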
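where the normal ordering ${}^\circ_\circ \cdot {}^\circ_\circ$ is an operation which reorders the operators so that the $\beta(n)$ $(\beta \in \mathfrak{h}, n < 0)$ and $e_\lambda$ are placed to the left of the $X(n)$ $(X \in \mathfrak{h}, n \ge 0)$ and $z^\lambda$. This defines a linear map
$$Y(\,\cdot\,, z) : V_P \to (\mathrm{End}\, V_P)\{z\}. \tag{3.4}$$
Let $\alpha, \lambda \in P$ be such that $(\alpha, \lambda) \in \mathbb{Z}$. Then for $u \in M(1, \alpha)$, $v \in M(1, \lambda)$, we have
$$\begin{aligned}
& z_0^{-1}\delta\!\left(\frac{z_1 - z_2}{z_0}\right) Y(u, z_1)Y(v, z_2) - (-1)^{(\alpha,\lambda)}c(\alpha, \lambda)\, z_0^{-1}\delta\!\left(\frac{z_2 - z_1}{-z_0}\right) Y(v, z_2)Y(u, z_1) \\
&\qquad = z_2^{-1}\delta\!\left(\frac{z_1 - z_0}{z_2}\right) Y(Y(u, z_0)v, z_2).
\end{aligned} \tag{3.5}$$
Set $L^\circ = \{\lambda \in \mathfrak{h} \mid (\alpha, \lambda) \in \mathbb{Z},\ \alpha \in L\}$, the dual lattice of $L$, and fix a coset decomposition $L^\circ = \bigcup_{i\in L^\circ/L}(L + \lambda_i)$ such that $\lambda_0 = 0$. In the case $P = L^\circ$, we see that $V_P = \bigoplus_{i\in L^\circ/L} V_{\lambda_i+L}$ and that the restriction of $Y(\,\cdot\,, z)$ to $V_L$ gives a linear map $V_L \to (\mathrm{End}\, V_{\lambda_i+L})[[z, z^{-1}]]$ for any $i \in L^\circ/L$. From [FLM], $(V_L, Y(\,\cdot\,, z), \mathbf{1}, \omega)$ is a vertex operator algebra and the $(V_{\lambda+L}, Y(\,\cdot\,, z))$ are irreducible $V_L$-modules. Note that $M(1)$ is a vertex operator subalgebra of $V_L$ (with the same vacuum vector and the same Virasoro element).

Now we define a map $\theta$ from $\widehat{L^\circ}$ to itself by
$$\theta(\kappa_q^s e_\lambda) = \kappa_q^s e_{-\lambda}$$
for any $s \in \mathbb{Z}$ and $\lambda \in L^\circ$. Since the 2-cocycle $\epsilon$ is bimultiplicative, $\theta$ is in fact an automorphism of $\widehat{L^\circ}$. Now we define the action of $\theta$ on $V_{L^\circ}$ by
$$\theta(\beta_1(-n_1)\beta_2(-n_2) \cdots \beta_k(-n_k)e^\lambda) = (-1)^k \beta_1(-n_1)\beta_2(-n_2) \cdots \beta_k(-n_k)e^{-\lambda}$$
for $\beta_i \in \mathfrak{h}$, $n_i \ge 1$ and $\lambda \in L^\circ$. Then we see that
$$\theta Y(u, z)v = Y(\theta(u), z)\theta(v) \tag{3.6}$$
for any $u, v \in V_{L^\circ}$. In particular, $\theta$ gives an automorphism of $V_L$ which induces an automorphism of $M(1)$.

For any $\theta$-stable subspace $U$ of $V_{L^\circ}$, let $U^\pm$ be the $\theta$-eigenspace of $U$ (of eigenvalue $\pm 1$). Then both $(M(1)^+, Y(\,\cdot\,, z), \mathbf{1}, \omega)$ and $(V_L^+, Y(\,\cdot\,, z), \mathbf{1}, \omega)$ are simple vertex operator algebras. We have the following proposition (see [DM] and [DLM1]):

Proposition 3.1. (1) $M(1)^\pm$ and $M(1, \lambda)$ for $\lambda \in \mathfrak{h} - \{0\}$ are irreducible $M(1)^+$-modules, and $M(1, \lambda) \cong M(1, -\lambda)$.
(2) $(V_{\lambda_i+L} + V_{-\lambda_i+L})^\pm$ for $i \in L^\circ/L$ are irreducible $V_L^+$-modules. Moreover, if $2\lambda_i \notin L$ then $(V_{\lambda_i+L} + V_{-\lambda_i+L})^\pm$, $V_{\lambda_i+L}$ and $V_{-\lambda_i+L}$ are isomorphic $V_L^+$-modules.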
Next we recall a construction of $\theta$-twisted modules for $M(1)$ and $V_L$ following [FLM] and [D2]. Denote by $\hat{\mathfrak{h}}[-1] = \mathfrak{h} \otimes t^{1/2}\mathbb{C}[t, t^{-1}] \oplus \mathbb{C}C$ the twisted affinization of $\mathfrak{h}$ defined by the commutation relations
$$[\beta_1 \otimes t^m, \beta_2 \otimes t^n] = m(\beta_1, \beta_2)\delta_{m,-n} C \quad \text{and} \quad [C, \hat{\mathfrak{h}}[-1]] = 0$$
for any $\beta_1, \beta_2 \in \mathfrak{h}$, $m, n \in \frac{1}{2} + \mathbb{Z}$. Set
$$M(1)(\theta) = S(t^{-1/2}\mathbb{C}[t^{-1}] \otimes \mathfrak{h}).$$
Then $M(1)(\theta)$ is (up to equivalence) the unique irreducible $\hat{\mathfrak{h}}[-1]$-module such that $C = 1$ and $(\beta \otimes t^n) \cdot 1 = 0$ if $n > 0$. This space is an irreducible $\theta$-twisted $M(1)$-module (see [FLM]).

Set $K = \{a^{-1}\theta(a) \mid a \in \hat{L}\}$. For any $\hat{L}/K$-module $T$ such that $\kappa$ acts by the scalar $-1$, we define $V_L^T = M(1)(\theta) \otimes T$. Then there exists a linear map $Y(\,\cdot\,, z) : V_L \to (\mathrm{End}\, V_L^T)[[z^{1/2}, z^{-1/2}]]$ such that $(V_L^T, Y(\,\cdot\,, z))$ becomes a $\theta$-twisted $V_L$-module (see [FLM]). The cyclic group $\langle\theta\rangle$ acts on $M(1)(\theta)$ and $V_L^T$ by
$$\theta(\beta_1(-n_1)\beta_2(-n_2) \cdots \beta_k(-n_k)) = (-1)^k \beta_1(-n_1)\beta_2(-n_2) \cdots \beta_k(-n_k)$$
and
$$\theta(\beta_1(-n_1)\beta_2(-n_2) \cdots \beta_k(-n_k)t) = (-1)^k \beta_1(-n_1)\beta_2(-n_2) \cdots \beta_k(-n_k)t \tag{3.7}$$
for $\beta_i \in \mathfrak{h}$, $n_i \in \frac{1}{2} + \mathbb{Z}_{\ge 0}$ and $t \in T$. We denote by $M(1)(\theta)^\pm$ and $V_L^{T,\pm}$ the $\pm 1$-eigenspaces for $\theta$ of $M(1)(\theta)$ and $V_L^T$, respectively.

Following [FLM], let $T_\chi$ be the irreducible $\hat{L}/K$-module associated to a central character $\chi$ satisfying $\chi(\kappa) = -1$. Then any irreducible $\theta$-twisted $V_L$-module is isomorphic to $V_L^{T_\chi}$ for some central character $\chi$ with $\chi(\kappa) = -1$ (see [D2]). From [DLi] we have:

Proposition 3.2. (1) $M(1)(\theta)^\pm$ are irreducible $M(1)^+$-modules.
(2) Let $\chi$ be a central character of $\hat{L}/K$ such that $\chi(\kappa) = -1$, and $T_\chi$ the irreducible $\hat{L}/K$-module with central character $\chi$. Then the $V_L^+$-modules $V_L^{T_\chi,\pm}$ are irreducible.
The following classification of the irreducible $M(1)^+$-modules is due to [DN1] and [DN3]:

Theorem 3.3. The $M(1)^+$-modules
$$M(1)^\pm, \quad M(1)(\theta)^\pm, \quad M(1, \lambda)\ (\cong M(1, -\lambda)) \ \text{ for } \lambda \in \mathfrak{h} - \{0\} \tag{3.8}$$
are all the irreducible $M(1)^+$-modules (up to equivalence).

Furthermore, the following classification of the irreducible $V_L^+$-modules was obtained in [DN2] and [AD]:

Theorem 3.4. Let $L$ be a positive-definite even lattice and let $\{\lambda_i\}$ be a set of representatives of $L^\circ/L$. Then any irreducible $V_L^+$-module is isomorphic to one of the irreducible modules $V_L^\pm$, $V_{\lambda_i+L}$ with $2\lambda_i \notin L$, $V_{\lambda_i+L}^\pm$ with $2\lambda_i \in L$, or $V_L^{T_\chi,\pm}$ for a central character $\chi$ of $\hat{L}/K$ with $\chi(\kappa) = -1$. Furthermore, $V_{\lambda_i+L}$ and $V_{\lambda_j+L}$ are isomorphic if and only if $\lambda_i \pm \lambda_j \in L$.

We refer to the irreducible $V_L^+$-modules $V_L^\pm$, $V_{\lambda+L}$ $(2\lambda \notin L)$ and $V_{\lambda+L}^\pm$ $(2\lambda \in L)$ as the irreducible modules of untwisted type and refer to $V_L^{T_\chi,\pm}$ as the irreducible modules of twisted type.
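To see how the list in Theorem 3.4 is organized, it may help to count the modules in the rank one case, which is the case treated in [DN2]. The following is only an illustrative sketch: the parameter $k$, the representatives $\lambda_r$, and the labels $T_1, T_2$ are notation introduced here, and the count of twisted modules assumes that for a rank one lattice there are exactly two central characters $\chi$ with $\chi(\kappa) = -1$.

```latex
% Rank one: L = \mathbb{Z}\alpha with (\alpha, \alpha) = 2k, k a positive integer.
% Then L^\circ = \mathbb{Z}\tfrac{\alpha}{2k}, with coset representatives
% \lambda_r = \tfrac{r}{2k}\alpha, r = 0, 1, \dots, 2k-1.
\begin{align*}
2\lambda_r \in L \iff r \in \{0, k\}:
   &\quad V_L^{\pm},\ V_{\alpha/2+L}^{\pm} && \text{(4 modules)} \\
2\lambda_r \notin L,\ \lambda_r \sim -\lambda_r = \lambda_{2k-r}:
   &\quad V_{\lambda_r+L},\ 1 \le r \le k-1 && \text{($k-1$ modules)} \\
\text{twisted type}:
   &\quad V_L^{T_1,\pm},\ V_L^{T_2,\pm} && \text{(4 modules)} \\
\text{total}:
   &\quad 4 + (k-1) + 4 = k + 7 .
\end{align*}
```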
3.2. Contragredient modules. In this section we identify the contragredient modules of the irreducible $M(1)^+$-modules and $V_L^+$-modules explicitly. First we have:

Proposition 3.5. Every irreducible $M(1)^+$-module $W$ is self-dual, i.e., $W' \cong W$.

Proof. First, since $M(1)^+$ is simple and $M(1)^+_{(1)} = 0$, by Lemma 2.5 $M(1)^+$ is self-dual. Similarly, the vertex operator algebra $M(1)$ is also self-dual because $L(1)M(1)_{(1)} = L(1)\mathfrak{h} = 0$. Then as an $M(1)^+$-module
$$M(1)' = (M(1)^+)' \oplus (M(1)^-)' \cong M(1) = M(1)^+ \oplus M(1)^-.$$
Since $M(1)^+$ and $M(1)^-$ are nonisomorphic irreducible $M(1)^+$-modules, we must have that $M(1)^-$ is self-dual.

We claim that for any $\lambda \in \mathfrak{h}$, $M(1, \lambda)' \cong M(1, -\lambda)$ as an $M(1)$-module. Note that the lowest $L(0)$-weight subspace of $M(1, \lambda)$ is $\mathbb{C}e^\lambda$, whose $L(0)$-weight is $(\lambda, \lambda)/2$. Define a linear functional $\psi \in M(1, \lambda)'$ by $\psi(e^\lambda) = 1$ and $\psi(u) = 0$ for $u \in M(1, \lambda)_{(n)}$ with $n - (\lambda, \lambda)/2 \in \mathbb{Z}_{>0}$. From (2.4) we get $h(0)\psi = -(\lambda, h)\psi$ and $h(n)\psi = 0$ for $h \in \mathfrak{h}$, $n \ge 1$. Thus $M(1, \lambda)' \cong M(1, -\lambda)$ as an $\hat{\mathfrak{h}}$-module, since $M(1, \lambda)'$ and $M(1, -\lambda)$ are irreducible $\hat{\mathfrak{h}}$-modules. Since $M(1, \lambda)$ and $M(1, -\lambda)$ are isomorphic $M(1)^+$-modules, we see that $M(1, \lambda)$ as an $M(1)^+$-module is self-dual.

It remains to show that the irreducible $M(1)^+$-modules $M(1)(\theta)^+$ and $M(1)(\theta)^-$ are self-dual. It is known that the lowest $L(0)$-weights of $M(1)(\theta)^+$ and $M(1)(\theta)^-$ are $\dim\mathfrak{h}/16$ and $1/2 + \dim\mathfrak{h}/16$, respectively. Noticing that any irreducible module and its contragredient module have the same lowest $L(0)$-weight, we see that $M(1)(\theta)^\pm$ must be self-dual.

Combining Proposition 3.5 with Proposition 2.7 we immediately have:

Proposition 3.6. Let $M^i$ $(i = 1, 2, 3)$ be irreducible $M(1)^+$-modules. Then the fusion rule of type $\binom{M^3}{M^1\, M^2}$ as a function of $(M^1, M^2, M^3)$ is invariant under the permutation group of $\{1, 2, 3\}$.

Next we identify the contragredient modules of the irreducible $V_L^+$-modules:

Proposition 3.7. The irreducible $V_L^+$-modules $V_L^\pm$ and $V_{\lambda+L}$ for $\lambda \in L^\circ$ with $2\lambda \notin L$ are self-dual. For any $\lambda \in L^\circ$ with $2\lambda \in L$, $V_{\lambda+L}^\pm$ are self-dual if $2(\lambda, \lambda)$ is even and $(V_{\lambda+L}^\pm)' \cong V_{\lambda+L}^\mp$ if $2(\lambda, \lambda)$ is odd. Let $\chi$ be a central character of $\hat{L}/K$ such that $\chi(\kappa) = -1$. Then the irreducible modules $(V_L^{T_\chi,\pm})'$ are isomorphic to $V_L^{T_{\chi'},\pm}$, where $\chi'$ is a central character of $\hat{L}/K$ defined by $\chi'(a) = (-1)^{(\bar{a},\bar{a})/2}\chi(a)$ for any $a \in Z(\hat{L}/K)$.

Proof. We first prove that for $\lambda \in L^\circ$, $(V_{\lambda+L})' \cong V_{-\lambda+L}$ as a $V_L$-module. Since $V_{\lambda+L} = \bigoplus_{\alpha\in L} M(1, \lambda + \alpha)$ and since $(M(1, \lambda))' \cong M(1, -\lambda)$ as an $M(1)$-module (from the proof of Proposition 3.5), we have $(V_{\lambda+L})' \cong \bigoplus_{\alpha\in L} M(1, -\lambda + \alpha)$. By the classification of irreducible $V_L$-modules (see [D1]), we must have $(V_{\lambda+L})' \cong V_{-\lambda+L}$. Since $V_{\lambda+L} \cong V_{-\lambda+L}$ as a $V_L^+$-module we see that $V_{\lambda+L}$ as a $V_L^+$-module is self-dual.

Now suppose that $2\lambda \in L$. Then $\lambda + L = -\lambda + L$, so that $(V_{\lambda+L})' \cong V_{\lambda+L}$. We have a nondegenerate $V_L$-invariant bilinear form $\langle\cdot\,, \cdot\rangle$ on $V_{\lambda+L}$. From the invariance property we have
$$\langle h(n)u, v\rangle = -\langle u, h(-n)v\rangle$$
Fusion Rules for Vertex Operator Algebras M(1)+ and VL+
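for $h \in \mathfrak{h}$, $n \in \mathbb{Z}$, $u, v \in V_{\lambda+L}$, noticing that $L(1)h = 0$ and $L(0)h = h$. Thus we get $\langle e_\lambda, e_{-\lambda+\alpha}\rangle = 0$ for nonzero $\alpha \in L$. Since the bilinear form is nondegenerate, we must have that $\langle e_\lambda, e_{-\lambda}\rangle \neq 0$. By (3.3) and (3.2) we have
\[ Y(e_{2\lambda}, z)e_{-\lambda} = \epsilon(2\lambda,-\lambda)\, z^{-2(\lambda,\lambda)} \exp\Big(\sum_{n\geq 1}\frac{2\lambda(-n)}{n} z^{n}\Big) e_\lambda. \]
Using this and the invariance property we have
\begin{align*}
\langle Y(e_{2\lambda}, z)e_{-\lambda}, e_{-\lambda}\rangle
&= \epsilon(2\lambda,-\lambda)\, z^{-2(\lambda,\lambda)} \Big\langle \exp\Big(\sum_{n\geq 1}\frac{2\lambda(-n)}{n} z^{n}\Big) e_\lambda,\ e_{-\lambda}\Big\rangle \\
&= \epsilon(2\lambda,-\lambda)\, z^{-2(\lambda,\lambda)} \Big\langle e_\lambda,\ \exp\Big(-\sum_{n\geq 1}\frac{2\lambda(n)}{n} z^{n}\Big) e_{-\lambda}\Big\rangle \\
&= \epsilon(2\lambda,-\lambda)\, z^{-2(\lambda,\lambda)} \langle e_\lambda, e_{-\lambda}\rangle.
\end{align*}
On the other hand, we have
\begin{align*}
\langle e_{-\lambda}, Y(e^{zL(1)}(-z^{-2})^{L(0)} e_{2\lambda}, z^{-1})e_{-\lambda}\rangle
&= \langle e_{-\lambda}, (-1)^{2(\lambda,\lambda)} z^{-4(\lambda,\lambda)} Y(e_{2\lambda}, z^{-1})e_{-\lambda}\rangle \\
&= (-1)^{2(\lambda,\lambda)} \epsilon(2\lambda,-\lambda)\, z^{-2(\lambda,\lambda)} \langle e_{-\lambda}, e_\lambda\rangle,
\end{align*}
noticing that $L(1)e_{2\lambda} = 0$ and $L(0)e_{2\lambda} = 2(\lambda,\lambda)e_{2\lambda}$, where $2(\lambda,\lambda)$ is a nonnegative integer. By the invariance property we have $\langle e_\lambda, e_{-\lambda}\rangle = (-1)^{2(\lambda,\lambda)}\langle e_{-\lambda}, e_\lambda\rangle$. This shows that
\[ \langle e_\lambda \pm e_{-\lambda},\ e_\lambda \pm (-1)^{2(\lambda,\lambda)} e_{-\lambda}\rangle = \pm 2. \]
The irreducibility of $V_{\lambda+L}^{\pm}$ and the $V_L^+$-invariance of $\langle\cdot,\cdot\rangle$ prove that if $2(\lambda,\lambda)$ is even (resp. odd), then $\langle\cdot,\cdot\rangle$ gives a nondegenerate invariant bilinear form on $V_{\lambda+L}^{\pm} \times V_{\lambda+L}^{\pm}$ (resp. $V_{\lambda+L}^{\pm} \times V_{\lambda+L}^{\mp}$). Therefore, $(V_{\lambda+L}^{\pm})' \cong V_{\lambda+L}^{\pm}$ if $2(\lambda,\lambda)$ is even and $(V_{\lambda+L}^{\pm})' \cong V_{\lambda+L}^{\mp}$ if $2(\lambda,\lambda)$ is odd.

Let $\chi$ be a central character of $\hat{L}/K$ such that $\chi(\kappa) = -1$. Then $(V_L^{T_\chi})'$ is a $\theta$-twisted $V_L$-module (see [X]; cf. [FHL]). The classification of irreducible $\theta$-twisted modules (see [D2]) implies that $(V_L^{T_\chi})'$ is isomorphic to $V_L^{T_{\chi_1}}$ for some central character $\chi_1$. We are going to show that $\chi_1 = \chi'$, using the same method that was used for the untwisted modules. For $\alpha \in L$, we have ([FLM, Sect. 9.1])
\[ Y(e_\alpha, z) = 2^{-(\alpha,\alpha)} z^{-(\alpha,\alpha)/2} \exp\Big(\sum_{n\in 1/2+\mathbb{Z}_{\geq 0}} \frac{\alpha(-n)}{n} z^{n}\Big) \exp\Big(-\sum_{n\in 1/2+\mathbb{Z}_{\geq 0}} \frac{\alpha(n)}{n} z^{-n}\Big) e_\alpha, \]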
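so that
\[ Y(e_\alpha, z)t = \chi(e_\alpha)\, 2^{-(\alpha,\alpha)} z^{-(\alpha,\alpha)/2} \exp\Big(\sum_{n\in 1/2+\mathbb{Z}_{\geq 0}} \frac{\alpha(-n)}{n} z^{n}\Big) t \]
for $t \in T_\chi$ and $\alpha \in \bar{R} = \{\, \bar{a} \mid a \in Z(\hat{L}/K) \,\}$. Then for any $\alpha \in \bar{R}$, $t \in T_\chi$ and $t_1 \in T_{\chi_1}$, we have
\[ \langle Y(e_\alpha, z)t_1, t\rangle = \chi_1(e_\alpha)\, 2^{-(\alpha,\alpha)} z^{-(\alpha,\alpha)/2} \langle t_1, t\rangle \]
and
\[ \langle t_1, Y(e^{zL(1)}(-z^{-2})^{L(0)} e_\alpha, z^{-1})t\rangle = (-1)^{(\alpha,\alpha)/2} \chi(e_\alpha)\, 2^{-(\alpha,\alpha)} z^{-(\alpha,\alpha)/2} \langle t_1, t\rangle. \]
Therefore, we get $\chi(e_\alpha)\langle t_1, t\rangle = (-1)^{(\alpha,\alpha)/2}\chi_1(e_\alpha)\langle t_1, t\rangle$ for any $\alpha \in \bar{R}$, $t \in T_\chi$ and $t_1 \in T_{\chi_1}$. This proves $\chi_1 = \chi'$ and $(V_L^{T_\chi})' \cong V_L^{T_{\chi'}}$. Then it is clear that $(V_L^{T_\chi,\pm})' \cong V_L^{T_{\chi'},\pm}$ as a $V_L^+$-module.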
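The parity criterion of Proposition 3.7 for the modules $V_{\lambda+L}^{\pm}$ with $2\lambda \in L$ is easy to evaluate for a concrete lattice. The following is only an illustrative sketch: the lattice is given by the Gram matrix of a chosen basis, $\lambda$ is given by rational coordinates in that basis, and the function names `pairing` and `contragredient_sign` are hypothetical, introduced here for illustration.

```python
from fractions import Fraction


def pairing(gram, x, y):
    """Bilinear form (x, y) for coordinate vectors relative to a basis with Gram matrix `gram`."""
    n = len(gram)
    return sum(Fraction(x[i]) * gram[i][j] * Fraction(y[j])
               for i in range(n) for j in range(n))


def contragredient_sign(gram, lam, sign):
    """Return eps with (V^sign_{lam+L})' isomorphic to V^eps_{lam+L} (Proposition 3.7).

    Assumes 2*lam lies in L, i.e. twice each coordinate is an integer,
    and that lam lies in the dual lattice, so that 2(lam, lam) is an integer.
    """
    if any((2 * Fraction(c)).denominator != 1 for c in lam):
        raise ValueError("2*lam must lie in L")
    twice_norm = 2 * pairing(gram, lam, lam)
    if twice_norm.denominator != 1:
        raise ValueError("2(lam, lam) must be an integer")
    # even: self dual; odd: the two signs are exchanged
    return sign if int(twice_norm) % 2 == 0 else -sign


# Rank one, L = Z*alpha: (alpha, alpha) = 2 gives 2(lam, lam) = 1 for lam = alpha/2.
swap_case = contragredient_sign([[2]], [Fraction(1, 2)], +1)
# (alpha, alpha) = 4 gives 2(lam, lam) = 2 for lam = alpha/2.
self_dual_case = contragredient_sign([[4]], [Fraction(1, 2)], +1)
```

In the first example the contragredient exchanges the two signs, in the second each module is self dual.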
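4. Fusion Rules for Vertex Operator Algebra $M(1)^+$

4.1. Construction of intertwining operators. In this subsection we prove that the fusion rules of certain types are nonzero for the vertex operator algebra $M(1)^+$ by constructing nonzero intertwining operators. This construction of intertwining operators is essentially due to [FLM].

For any $\lambda, \mu, \nu \in \mathfrak{h}$ we call the triple $(\lambda, \mu, \nu) \in \mathfrak{h} \times \mathfrak{h} \times \mathfrak{h}$ an admissible triple if $p\lambda + q\mu + r\nu = 0$ for some $p, q, r \in \{\pm 1\}$. Clearly, if $(\lambda, \mu, \nu)$ is admissible, so is every permutation of $(\lambda, \mu, \nu)$. Note that in view of Theorem 3.3, $M(1,\lambda)$ and $M(1,\mu)$ are isomorphic $M(1)^+$-modules if and only if $(0, \lambda, \mu)$ is an admissible triple.

For $\lambda, \mu \in \mathfrak{h}$, we define a linear map $p_\lambda : M(1,\mu) \to M(1,\lambda+\mu)$ by $p_\lambda(u \otimes e_\mu) = u \otimes e_{\lambda+\mu}$. The vertex operator associated to the vectors $e_\lambda$ and $v = \beta_1(-n_1)\cdots\beta_r(-n_r)e_\lambda$ for $\beta_i \in \mathfrak{h}$, $n_i \geq 1$ is defined by
\[ Y_{\lambda,\mu}(e_\lambda, z) = \exp\Big(\sum_{n=1}^{\infty}\frac{\lambda(-n)}{n} z^{n}\Big) \exp\Big(-\sum_{n=1}^{\infty}\frac{\lambda(n)}{n} z^{-n}\Big) p_\lambda z^{\lambda}, \tag{4.1} \]
\[ Y_{\lambda,\mu}(v, z) = {}^{\circ}_{\circ}\, \frac{1}{(n_1-1)!}\Big(\frac{d}{dz}\Big)^{n_1-1}\beta_1(z) \cdots \frac{1}{(n_r-1)!}\Big(\frac{d}{dz}\Big)^{n_r-1}\beta_r(z)\, Y_{\lambda,\mu}(e_\lambda, z)\, {}^{\circ}_{\circ}, \tag{4.2} \]
where $z^{\lambda}$ is the operator on $\mathbb{C}e_\mu$ defined by $z^{\lambda}e_\mu = z^{(\lambda,\mu)}e_\mu$, and the normal ordering ${}^{\circ}_{\circ}\,\cdot\,{}^{\circ}_{\circ}$ is an operation which reorders the operators so that $\beta(n)$ ($\beta \in \mathfrak{h}$, $n < 0$) and $p_\lambda$ are placed to the left of $\beta(n)$ ($\beta \in \mathfrak{h}$, $n \geq 0$) and $z^{\lambda}$. From the arguments in [FLM, Sect. 8], we see that the operator
\[ Y_{\lambda,\mu}(\,\cdot\,, z) : M(1,\lambda) \to \mathrm{Hom}(M(1,\mu), M(1,\lambda+\mu))\{z\} \tag{4.3} \]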
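satisfies
\begin{align*}
& z_0^{-1}\delta\Big(\frac{z_1 - z_2}{z_0}\Big) Y_{\lambda,\mu+\nu}(u, z_1) Y_{\mu,\nu}(v, z_2) - (-1)^{(\lambda,\mu)} z_0^{-1}\delta\Big(\frac{z_2 - z_1}{-z_0}\Big) Y_{\mu,\lambda+\nu}(v, z_2) Y_{\lambda,\nu}(u, z_1) \\
&\qquad = z_2^{-1}\delta\Big(\frac{z_1 - z_0}{z_2}\Big) Y_{\lambda+\mu,\nu}(Y_{\lambda,\mu}(u, z_0)v, z_2) \tag{4.4}
\end{align*}
for $\lambda, \mu, \nu \in \mathfrak{h}$ with $(\lambda, \mu) \in \mathbb{Z}$, $u \in M(1,\lambda)$ and $v \in M(1,\mu)$. We also have the $L(-1)$-derivative property $\frac{d}{dz}Y_{\lambda,\mu}(u, z) = Y_{\lambda,\mu}(L(-1)u, z)$. Noting that $Y_{0,\nu}(\,\cdot\,, z)$ is the vertex operator map of the irreducible $M(1)$-module $M(1,\nu)$, we see that $Y_{\lambda,\mu}(\,\cdot\,, z)$ is a nonzero intertwining operator of type $\binom{M(1,\lambda+\mu)}{M(1,\lambda)\ M(1,\mu)}$ for $M(1)$. Consequently, the fusion rule of type $\binom{M(1,\lambda+\mu)}{M(1,\lambda)\ M(1,\mu)}$ for $M(1)^+$ is not zero. Since $M(1,\nu) \cong M(1,-\nu)$ as an $M(1)^+$-module for any $\nu \in \mathfrak{h}$, the fusion rule of type $\binom{M(1,-\lambda+\mu)}{M(1,\lambda)\ M(1,\mu)}$ for $M(1)^+$ is not zero. Therefore we have proved:

Proposition 4.1. For any admissible triple $(\lambda, \mu, \nu)$, the fusion rule of type $\binom{M(1,\nu)}{M(1,\lambda)\ M(1,\mu)}$ for $M(1)^+$ is nonzero.

For any $\lambda \in \mathfrak{h}$, we define a linear map
\[ \theta : M(1,\lambda) \to M(1,-\lambda); \qquad \theta(u \otimes e_\lambda) = \theta(u) \otimes e_{-\lambda} \quad \text{for } u \in M(1). \tag{4.5} \]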
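For $h \in \mathfrak{h}$, $u \in M(1)$, we have
\[ (\theta \circ h(0) \circ \theta^{-1})(u \otimes e_\lambda) = \theta\big(h(0)(\theta^{-1}(u) \otimes e_{-\lambda})\big) = (h, -\lambda)\, u \otimes e_\lambda = -h(0)(u \otimes e_\lambda), \]
and for $n \neq 0$, we have
\[ (\theta \circ h(n) \circ \theta^{-1})(u \otimes e_\lambda) = \theta\big((h(n)\theta^{-1}(u)) \otimes e_{-\lambda}\big) = (\theta h(n)\theta^{-1}(u)) \otimes e_\lambda = -h(n)(u \otimes e_\lambda). \]
Therefore, we see that $\theta \circ h(z) \circ \theta^{-1} = -h(z)$ for any $h \in \mathfrak{h}$. Since $\theta \circ p_\lambda \circ \theta^{-1} = p_{-\lambda}$ for any $\lambda \in \mathfrak{h}$, one has $\theta \circ Y_{\lambda,-\mu}(e_\lambda, z) \circ \theta^{-1} = Y_{-\lambda,\mu}(e_{-\lambda}, z)$. By using (4.2) we can prove that the intertwining operator $Y_{\lambda,\mu}(\,\cdot\,, z)$ satisfies
\[ \theta Y_{\lambda,-\mu}(u, z)\theta^{-1}(v) = Y_{-\lambda,\mu}(\theta(u), z)v \tag{4.6} \]
for any $u \in M(1,\lambda)$ and $v \in M(1,\mu)$. By using the isomorphism $\theta$, we define an operator
\[ Y^{\theta}_{\lambda,\mu}(\,\cdot\,, z) : M(1,\lambda) \to \mathrm{Hom}(M(1,\mu), M(1,-\lambda+\mu))\{z\} \]
by
\[ Y^{\theta}_{\lambda,\mu}(u, z)v = Y_{-\lambda,\mu}(\theta(u), z)v \]
for $u \in M(1,\lambda)$ and $v \in M(1,\mu)$. Then, by using (4.6), one can see that $Y^{\theta}_{\lambda,\mu}(\,\cdot\,, z)$ is a nonzero intertwining operator of type $\binom{M(1,-\lambda+\mu)}{M(1,\lambda)\ M(1,\mu)}$ for $M(1)^+$.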
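The admissibility condition behind Proposition 4.1 is a finite check over the eight sign choices. As a small illustration, here is a sketch that represents vectors of $\mathfrak{h}$ by coordinate tuples in some fixed basis; the function name `is_admissible` is hypothetical and not taken from the paper.

```python
from itertools import product


def is_admissible(lam, mu, nu):
    """True if p*lam + q*mu + r*nu = 0 for some signs p, q, r in {+1, -1}."""
    return any(
        all(p * a + q * b + r * c == 0 for a, b, c in zip(lam, mu, nu))
        for p, q, r in product((1, -1), repeat=3)
    )


# lam + mu = nu, so the triple is admissible and Proposition 4.1 applies.
example_yes = is_admissible((1, 2), (3, -1), (4, 1))
# No signed sum vanishes here (the second coordinate is q + 2r, never zero).
example_no = is_admissible((1, 0), (0, 1), (1, 2))
# (0, lam, mu) is admissible exactly when lam = mu or lam = -mu.
example_iso = is_admissible((0, 0), (1, 2), (-1, -2))
```

Because the condition is symmetric in the three entries, permuting the arguments does not change the answer, in agreement with the remark following the definition.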
Now we consider the case $\lambda = \mu = \nu = 0$ in Proposition 4.1. Since $M(1)$ is simple, $Y(u, z)v \neq 0$ for nonzero vectors $u, v \in M(1)$ by Proposition 2.8. Clearly, we have
\[ Y(u, z)v \in \begin{cases} M(1)^+((z)) & \text{if } u \in M(1)^{\pm} \text{ and } v \in M(1)^{\pm}, \\ M(1)^-((z)) & \text{if } u \in M(1)^{\pm} \text{ and } v \in M(1)^{\mp}. \end{cases} \]
The restrictions of $Y(\,\cdot\,, z)$ give nonzero intertwining operators of types $\binom{M(1)^+}{M(1)^{\pm}\ M(1)^{\pm}}$ and $\binom{M(1)^-}{M(1)^{\pm}\ M(1)^{\mp}}$. Thus we have:

Proposition 4.2. The fusion rules of types $\binom{M(1)^+}{M(1)^{\pm}\ M(1)^{\pm}}$ and $\binom{M(1)^-}{M(1)^{\pm}\ M(1)^{\mp}}$ are nonzero.

Next we consider the case $\lambda = 0$ and $\mu \neq 0$ in Proposition 4.1. Notice that the vertex operator map $Y(\,\cdot\,, z)$ of the irreducible $M(1)$-module $M(1,\mu)$ is an intertwining operator. Then the restrictions of $Y(\,\cdot\,, z)$ give intertwining operators of types $\binom{M(1,\mu)}{M(1)^{\pm}\ M(1,\mu)}$. By Proposition 2.8, $Y(u, z)v \neq 0$ for any nonzero vectors $u \in M(1)$ and $v \in M(1,\mu)$. Therefore the following proposition holds:

Proposition 4.3. For any $\mu \in \mathfrak{h} - \{0\}$, the fusion rules of types $\binom{M(1,\mu)}{M(1)^{\pm}\ M(1,\mu)}$ are nonzero.
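We shall discuss the construction of intertwining operators of type $\binom{M(1)(\theta)^{\epsilon_2}}{M(1,\lambda)\ M(1)(\theta)^{\epsilon_1}}$ for $\lambda \in \mathfrak{h}$ and $\epsilon_i \in \{\pm\}$ ($i = 1, 2$). Let $\lambda \in \mathfrak{h}$. Following [FLM], we define a linear map
\[ Y^{tw}_{\lambda}(\,\cdot\,, z) : M(1,\lambda) \to (\mathrm{End}\, M(1)(\theta))\{z\} \tag{4.7} \]
as follows. First we set
\[ Y^{tw}_{\lambda}(e_\lambda, z) = e^{-|\lambda|^2 \log 2}\, z^{-\frac{|\lambda|^2}{2}} \exp\Big(\sum_{n\in\frac12+\mathbb{Z}_{\geq 0}} \frac{\lambda(-n)}{n} z^{n}\Big) \exp\Big(-\sum_{n\in\frac12+\mathbb{Z}_{\geq 0}} \frac{\lambda(n)}{n} z^{-n}\Big). \tag{4.8} \]
Next we define $W(u, z)$ for $u = \beta_1(-n_1)\cdots\beta_r(-n_r)e_\lambda$ ($\beta_i \in \mathfrak{h}$, $n_i \geq 1$) by
\[ W(u, z) = {}^{\circ}_{\circ}\, \frac{1}{(n_1-1)!}\Big(\frac{d}{dz}\Big)^{n_1-1}\beta_1(z) \cdots \frac{1}{(n_r-1)!}\Big(\frac{d}{dz}\Big)^{n_r-1}\beta_r(z)\, Y^{tw}_{\lambda}(e_\lambda, z)\, {}^{\circ}_{\circ}, \tag{4.9} \]
where the normal ordering ${}^{\circ}_{\circ}\,\cdot\,{}^{\circ}_{\circ}$ reorders the operators so that $\beta(n)$ ($\beta \in \mathfrak{h}$, $n < 0$) are placed to the left of $\beta(n)$ ($\beta \in \mathfrak{h}$, $n > 0$). Now we introduce an operator $\Delta_z$ defined by
\[ \Delta_z = \sum_{i=1}^{d}\sum_{m,n=0}^{\infty} c_{mn}\, h_i(m)h_i(n)\, z^{m+n} \]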
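by using an orthonormal basis $\{h_i\}$ of $\mathfrak{h}$ and the coefficients $c_{mn}$ subject to the following formal expansion:
\[ \sum_{m,n\geq 0} c_{mn}\, x^{m}y^{n} = -\log\Big(\frac{(1+x)^{\frac12} + (1+y)^{\frac12}}{2}\Big). \]
Finally we set $Y^{tw}_{\lambda}(u, z) = W(e^{\Delta_z}u, z)$. Then by using the same arguments as in [FLM, Chapter 9], we get the following twisted Jacobi identity
\begin{align*}
& z_0^{-1}\delta\Big(\frac{z_1 - z_2}{z_0}\Big) Y(a, z_1) Y^{tw}_{\lambda}(u, z_2) - z_0^{-1}\delta\Big(\frac{z_2 - z_1}{-z_0}\Big) Y^{tw}_{\lambda}(u, z_2) Y(a, z_1) \\
&\qquad = \frac12 \sum_{p=0,1} z_2^{-1}\delta\Big((-1)^{p}\frac{(z_1 - z_0)^{1/2}}{z_2^{1/2}}\Big) Y^{tw}_{\lambda}(Y(\theta^{p}(a), z_0)u, z_2)
\end{align*}
for any $a \in M(1)$ and $u \in M(1,\lambda)$ and the $L(-1)$-derivative property $\frac{d}{dz}Y^{tw}_{\lambda}(u, z) = Y^{tw}_{\lambda}(L(-1)u, z)$ for $u \in M(1,\lambda)$. These imply that $Y^{tw}_{\lambda}(\,\cdot\,, z)$ is a nonzero intertwining operator of type $\binom{M(1)(\theta)}{M(1,\lambda)\ M(1)(\theta)}$ for $M(1)^+$. By definition we have
\[ \theta Y^{tw}_{\lambda}(u, z)\theta^{-1}(v) = Y^{tw}_{-\lambda}(\theta(u), z)v \tag{4.10} \]
for any $u \in M(1,\lambda)$ and $v \in M(1)(\theta)$.

Let $p_{\epsilon} : M(1)(\theta) \to M(1)(\theta)^{\epsilon}$ be the canonical projection and $\iota_{\epsilon} : M(1)(\theta)^{\epsilon} \to M(1)(\theta)$ the canonical inclusion for $\epsilon \in \{\pm\}$. Then for any $\epsilon_1, \epsilon_2 \in \{\pm\}$, the composition $p_{\epsilon_2} \circ Y^{tw}_{\lambda}(\,\cdot\,, z) \circ \iota_{\epsilon_1}$ is an intertwining operator of type $\binom{M(1)(\theta)^{\epsilon_2}}{M(1,\lambda)\ M(1)(\theta)^{\epsilon_1}}$ for $M(1)^+$. By direct calculation, one has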
|λ|2 |λ|2 1 2 Y tw (eλ , z)1 ≡ e−|λ| log 2 z− 2 1 + λ (−1/2) z1/2 mod z− 2 +1 M(1)(θ )[[z 2 ]] and Y tw (eλ , z)λ(−1/2) ≡e−|λ|
z−
|λ|2 2
−|λ|2 z−1/2 + (1 − 2|λ|2 )λ (−1/2) z0 2 2 2 1/2 3 + 2(1 − 2|λ| )λ (−1/2) z + 4λ (−1/2) − λ (−3/2) z 3 2 log 2
mod z−
|λ|2 2 +2
1
M(1)(θ )[[z 2 ]].
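These show that if $\lambda$ is nonzero then the intertwining operator $p_{\epsilon_2} \circ Y^{tw}_{\lambda}(\,\cdot\,, z) \circ \iota_{\epsilon_1}$ is nonzero for any $\epsilon_1, \epsilon_2 \in \{\pm\}$. Therefore, the following proposition holds:

Proposition 4.4. For any $\lambda \in \mathfrak{h} - \{0\}$, the fusion rules of types $\binom{M(1)(\theta)^{\pm}}{M(1,\lambda)\ M(1)(\theta)^{\pm}}$ and $\binom{M(1)(\theta)^{\mp}}{M(1,\lambda)\ M(1)(\theta)^{\pm}}$ are nonzero.

In the case $\lambda = 0$, $Y^{tw}_{0}(\,\cdot\,, z)$ is the vertex operator map $Y(\,\cdot\,, z)$ of the $\theta$-twisted $M(1)$-module $M(1)(\theta)$. In particular, $Y^{tw}_{0}(\mathbf{1}, z) = \mathrm{id}$ and $Y^{tw}_{0}(h(-1)\mathbf{1}, z) = h(z)$ for any $h \in \mathfrak{h}$. Thus $Y^{tw}_{0}(\,\cdot\,, z) \circ \iota_{\epsilon}$ is a nonzero intertwining operator of type $\binom{M(1)(\theta)}{M(1,\lambda)\ M(1)(\theta)^{\epsilon}}$ for any $\epsilon \in \{\pm\}$. By using the conjugation property (4.10) we immediately have:

Proposition 4.5. The fusion rules of types $\binom{M(1)(\theta)^{+}}{M(1)^{\pm}\ M(1)(\theta)^{\pm}}$ and $\binom{M(1)(\theta)^{-}}{M(1)^{\pm}\ M(1)(\theta)^{\mp}}$ are nonzero.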
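4.2. Main theorem. In this section we determine the fusion rules for irreducible $M(1)^+$-modules, generalizing a result of [A1]. The following result was proved in [A1]:

Theorem 4.6. Let $\mathfrak{h}$ be a 1-dimensional vector space equipped with a symmetric nondegenerate bilinear form $(\cdot,\cdot)$. For any irreducible $M(1)^+$-modules $M^i$ ($i = 1, 2, 3$), the fusion rule of type $\binom{M^3}{M^1\ M^2}$ is either 0 or 1 and it is invariant under the permutations of $\{1, 2, 3\}$. The fusion rule of type $\binom{M^3}{M^1\ M^2}$ is 1 if and only if $M^i$ ($i = 1, 2, 3$) satisfy one of the following conditions:

(i) $M^1 = M(1)^+$ and $M^2 \cong M^3$.

(ii) $M^1 = M(1)^-$ and $(M^2, M^3)$ is one of the following pairs: $(M(1)^+, M(1)^-)$, $(M(1)^-, M(1)^+)$, $(M(1,\mu), M(1,\nu))$ for $\mu, \nu \in \mathfrak{h} - \{0\}$ such that $\mu = \pm\nu$, $(M(1)(\theta)^+, M(1)(\theta)^-)$, $(M(1)(\theta)^-, M(1)(\theta)^+)$.

(iii) $M^1 = M(1,\lambda)$ ($\lambda \in \mathfrak{h} - \{0\}$) and $(M^2, M^3)$ is one of the following pairs: $(M(1)^{\pm}, M(1,\mu))$, $(M(1,\mu), M(1)^{\pm})$ for $\mu \in \mathfrak{h} - \{0\}$ such that $\lambda = \pm\mu$, $(M(1,\mu), M(1,\nu))$ for $\mu, \nu \in \mathfrak{h} - \{0\}$ such that $(\lambda, \mu, \nu)$ is an admissible triple, $(M(1)(\theta)^{\pm}, M(1)(\theta)^{\pm})$, $(M(1)(\theta)^{\pm}, M(1)(\theta)^{\mp})$.

(iv) $M^1 = M(1)(\theta)^+$ and $(M^2, M^3)$ is one of the following pairs: $(M(1)^{\pm}, M(1)(\theta)^{\pm})$, $(M(1)(\theta)^{\pm}, M(1)^{\pm})$, $(M(1,\mu), M(1)(\theta)^{\pm})$, $(M(1)(\theta)^{\pm}, M(1,\mu))$ ($\mu \in \mathfrak{h} - \{0\}$).

(v) $M^1 = M(1)(\theta)^-$ and $(M^2, M^3)$ is one of the following pairs: $(M(1)^{\pm}, M(1)(\theta)^{\mp})$, $(M(1)(\theta)^{\pm}, M(1)^{\mp})$, $(M(1,\mu), M(1)(\theta)^{\pm})$, $(M(1)(\theta)^{\pm}, M(1,\mu))$ ($\mu \in \mathfrak{h} - \{0\}$).

This section is devoted to proving the following generalization:

Theorem 4.7. Let $\mathfrak{h}$ be any finite-dimensional vector space equipped with a symmetric nondegenerate bilinear form $(\cdot,\cdot)$. Then all the assertions of Theorem 4.6 hold.

We write $M_{\mathfrak{h}}(1)$ for the vertex operator algebra $M(1)$ associated with $\mathfrak{h}$ and similarly for the modules. It is clear that if $\mathfrak{h}'$ is a subspace of $\mathfrak{h}$ such that the bilinear form of $\mathfrak{h}$ restricted to $\mathfrak{h}'$ is nondegenerate, then $M_{\mathfrak{h}'}(1)^+$ is a vertex operator subalgebra of $M_{\mathfrak{h}}(1)^+$ (with a different Virasoro element if $\mathfrak{h}' \neq \mathfrak{h}$). Furthermore, if $\mathfrak{h} = \mathfrak{h}_1 \oplus \mathfrak{h}_2$ such that $(\mathfrak{h}_1, \mathfrak{h}_2) = 0$, then the irreducible $M_{\mathfrak{h}}(1)^+$-modules are decomposed into direct sums of irreducible $M_{\mathfrak{h}_1}(1)^+ \otimes M_{\mathfrak{h}_2}(1)^+$-modules as follows:
\begin{align*}
M_{\mathfrak{h}}(1)^+ &\cong M_{\mathfrak{h}_1}(1)^+ \otimes M_{\mathfrak{h}_2}(1)^+ \oplus M_{\mathfrak{h}_1}(1)^- \otimes M_{\mathfrak{h}_2}(1)^-, \tag{4.11} \\
M_{\mathfrak{h}}(1)^- &\cong M_{\mathfrak{h}_1}(1)^+ \otimes M_{\mathfrak{h}_2}(1)^- \oplus M_{\mathfrak{h}_1}(1)^- \otimes M_{\mathfrak{h}_2}(1)^+, \tag{4.12} \\
M_{\mathfrak{h}}(1,\lambda) &\cong M_{\mathfrak{h}_1}(1,\lambda_1) \otimes M_{\mathfrak{h}_2}(1,\lambda_2), \tag{4.13} \\
M_{\mathfrak{h}}(1)(\theta)^+ &\cong M_{\mathfrak{h}_1}(1)(\theta)^+ \otimes M_{\mathfrak{h}_2}(1)(\theta)^+ \oplus M_{\mathfrak{h}_1}(1)(\theta)^- \otimes M_{\mathfrak{h}_2}(1)(\theta)^-, \tag{4.14} \\
M_{\mathfrak{h}}(1)(\theta)^- &\cong M_{\mathfrak{h}_1}(1)(\theta)^+ \otimes M_{\mathfrak{h}_2}(1)(\theta)^- \oplus M_{\mathfrak{h}_1}(1)(\theta)^- \otimes M_{\mathfrak{h}_2}(1)(\theta)^+, \tag{4.15}
\end{align*}
where we decompose $\lambda \in \mathfrak{h}$ into $\lambda = \lambda_1 + \lambda_2$ so that $\lambda_i \in \mathfrak{h}_i$. First we prove the following result: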
Proposition 4.8. For any irreducible $M(1)^+$-modules $M$, $N$ and $L$, the fusion rule of type $\binom{L}{M\ N}$ is either 0 or 1.

Proof. We shall use induction on $d = \dim\mathfrak{h}$. Noticing that Theorem 4.7 in the case $d = \dim\mathfrak{h} = 1$ has been proved in [A1] (Theorem 4.6), we assume that $d > 1$. Assume that Theorem 4.7 for $M_{\mathfrak{h}}(1)^+$ with $\dim\mathfrak{h} < d$ has been proved. We decompose $\mathfrak{h}$ into a direct sum of mutually orthogonal subspaces $\mathfrak{h}_1$ and $\mathfrak{h}_2$ with $\dim\mathfrak{h}_1 = 1$. Theorem 4.7 applies to both $M_{\mathfrak{h}_1}(1)^+$ and $M_{\mathfrak{h}_2}(1)^+$. Recall (4.11)–(4.15) for the decompositions of the irreducible $M_{\mathfrak{h}}(1)^+$-modules into direct sums of irreducible $M_{\mathfrak{h}_1}(1)^+ \otimes M_{\mathfrak{h}_2}(1)^+$-modules. Notice that each of $M$, $N$ and $L$ is isomorphic to one of those $M_{\mathfrak{h}}(1)^+$-modules. Pick irreducible $M_{\mathfrak{h}_1}(1)^+ \otimes M_{\mathfrak{h}_2}(1)^+$-submodules $M^1 \otimes M^2$ of $M$ and $N^1 \otimes N^2$ of $N$, where $M^i$ and $N^i$ are irreducible $M_{\mathfrak{h}_i}(1)^+$-modules for $i = 1, 2$. Decompose $L$ as a direct sum of irreducible $M_{\mathfrak{h}_1}(1)^+ \otimes M_{\mathfrak{h}_2}(1)^+$-modules:
\[ L \cong \bigoplus_{j} L^1_j \otimes L^2_j, \]
where $L^i_j$ are irreducible $M_{\mathfrak{h}_i}(1)^+$-modules for $i = 1, 2$. By Proposition 2.9 and Theorem 2.10 we have
\begin{align*}
\dim I_{M(1)^+}\binom{L}{M\ N} &\leq \dim I_{M_{\mathfrak{h}_1}(1)^+ \otimes M_{\mathfrak{h}_2}(1)^+}\binom{L}{M^1 \otimes M^2\ \ N^1 \otimes N^2} \\
&= \sum_{j} \dim I_{M_{\mathfrak{h}_1}(1)^+}\binom{L^1_j}{M^1\ N^1} \cdot \dim I_{M_{\mathfrak{h}_2}(1)^+}\binom{L^2_j}{M^2\ N^2}. \tag{4.16}
\end{align*}
We take suitable irreducible $M_{\mathfrak{h}_1}(1)^+ \otimes M_{\mathfrak{h}_2}(1)^+$-modules $M^1 \otimes M^2$ and $N^1 \otimes N^2$ from $M$ and $N$ respectively, and consider inequality (4.16). By the inductive hypothesis, all the summands on the right-hand side of (4.16) are less than or equal to 1. Furthermore, using Theorem 4.6 for $M_{\mathfrak{h}_1}(1)^+$ we see that at most one of the summands on the right-hand side of (4.16) is possibly nonzero. For example, in the case $M = N = M(1)^-$ and $L = M(1)^+$, we have
\begin{align*}
\dim I_{M(1)^+}\binom{M(1)^+}{M(1)^-\ M(1)^-}
&\leq \dim I_{M_{\mathfrak{h}_1}(1)^+ \otimes M_{\mathfrak{h}_2}(1)^+}\binom{M(1)^+}{M_{\mathfrak{h}_1}(1)^+ \otimes M_{\mathfrak{h}_2}(1)^-\ \ M_{\mathfrak{h}_1}(1)^- \otimes M_{\mathfrak{h}_2}(1)^+} \\
&= \dim I_{M_{\mathfrak{h}_1}(1)^+}\binom{M(1)^+}{M(1)^+\ M(1)^-} \cdot \dim I_{M_{\mathfrak{h}_2}(1)^+}\binom{M(1)^+}{M(1)^-\ M(1)^+} \\
&\quad + \dim I_{M_{\mathfrak{h}_1}(1)^+}\binom{M(1)^-}{M(1)^+\ M(1)^-} \cdot \dim I_{M_{\mathfrak{h}_2}(1)^+}\binom{M(1)^-}{M(1)^-\ M(1)^+} \\
&= 1.
\end{align*}
Therefore, the right-hand side of (4.16) is zero or one. This proves the proposition.

Next, we show that fusion rules of certain types for $M(1)^+$ are zero.
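The worked example for (4.16) involves only the modules $M(1)^{\pm}$, so its right-hand side can be evaluated mechanically. The sketch below makes two assumptions that should be read as such: signs are encoded as `+1`/`-1`, and the fusion rules among the modules $M(1)^{\pm}$ of each factor are taken from Theorem 4.6 (i)–(ii) (valid for the second factor by the inductive hypothesis), which amounts to the rule "the output sign is the product of the input signs". All function names are hypothetical.

```python
def fusion_pm(a, b, c):
    """Fusion rule of type (M(1)^c over M(1)^a, M(1)^b), as read off Theorem 4.6 (i)-(ii)."""
    return 1 if a * b == c else 0


def summands(eps):
    """Sign pairs (s1, s2) of the summands M_{h1}(1)^{s1} (x) M_{h2}(1)^{s2} of M_h(1)^eps, by (4.11)-(4.12)."""
    return [(s1, s2) for s1 in (1, -1) for s2 in (1, -1) if s1 * s2 == eps]


def rhs_4_16(m, n, l_sign):
    """Right-hand side of (4.16) for chosen submodules m = (m1, m2) of M and n = (n1, n2) of N."""
    return sum(
        fusion_pm(m[0], n[0], l1) * fusion_pm(m[1], n[1], l2)
        for l1, l2 in summands(l_sign)
    )


# The example in the text: M = N = M(1)^-, L = M(1)^+,
# with submodules M^+ (x) M^- of M and M^- (x) M^+ of N.
example_bound = rhs_4_16((1, -1), (-1, 1), +1)
# A choice with M = N = M(1)^+ and L = M(1)^-: the bound is zero.
vanishing_bound = rhs_4_16((1, 1), (1, 1), -1)
```

The first value reproduces the bound 1 computed above; the second shows how the same inequality forces a fusion rule to vanish when every summand is zero.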
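Lemma 4.9. The fusion rules of types $\binom{M(1)^-}{M(1)^+\ M(1)^+}$ and $\binom{M(1)^-}{M(1)^-\ M(1)^-}$ are zero.

Proof. Again we shall use induction on $d = \dim\mathfrak{h}$. As it was proved in [A1] in the case $\dim\mathfrak{h} = 1$, we assume that $\dim\mathfrak{h} \geq 2$. Take $h \in \mathfrak{h}$ such that $(h, h) \neq 0$ and set $\mathfrak{h}_1 = \mathbb{C}h$, $\mathfrak{h}_2 = \mathfrak{h}_1^{\perp}$. Then by using (4.16) for $M^1 = M_{\mathfrak{h}_1}(1)^+$, $M^2 = M_{\mathfrak{h}_2}(1)^{\pm}$, $N^1 = M_{\mathfrak{h}_1}(1)^+$ and $N^2 = M_{\mathfrak{h}_2}(1)^{\pm}$ and the inductive hypothesis, we get $I_{M(1)^+}\binom{M(1)^-}{M(1)^{\pm}\ M(1)^{\pm}} = 0$ respectively.

Using a similar argument we have:

Lemma 4.10. For $\lambda \in \mathfrak{h} - \{0\}$, the fusion rules of types $\binom{M(1,\lambda)}{M(1)^{\pm}\ M(1)^{\pm}}$ and $\binom{M(1,\lambda)}{M(1)^{\pm}\ M(1)^{\mp}}$ are zero.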
We shall need the following simple result in linear algebra:

Lemma 4.11. Let $\mathfrak{h}$ be a (nonzero) finite-dimensional vector space over $\mathbb{C}$ equipped with a nondegenerate symmetric bilinear form $(\cdot,\cdot)$. Let $S$ be a finite set of nonzero vectors in $\mathfrak{h}$. Then there exists a one-dimensional vector subspace $\mathfrak{h}_1$ of $\mathfrak{h}$ such that $(\cdot,\cdot)$ is nondegenerate on $\mathfrak{h}_1$ and such that $u_1 \neq 0$ for any $u \in S$, where $u_1$ denotes the orthogonal projection of $u$ into $\mathfrak{h}_1$. In particular, for $\lambda, \mu, \nu \in \mathfrak{h}$, if the triple $(\lambda, \mu, \nu)$ is not admissible then there exists a one-dimensional vector subspace $\mathfrak{h}_1$ of $\mathfrak{h}$ such that $(\cdot,\cdot)$ is nondegenerate on $\mathfrak{h}_1$ and such that the triple $(\lambda_1, \mu_1, \nu_1)$ is not admissible.

Proof. Let $h_1, \ldots, h_d$ be an orthonormal basis of $\mathfrak{h}$. Then the bilinear form $(\cdot,\cdot)$ restricted to the $\mathbb{R}$-subspace $E = \mathbb{R}h_1 \oplus \cdots \oplus \mathbb{R}h_d$ is positive definite. For any $u \in \mathfrak{h}$, we consider $u$ as a linear functional on $\mathfrak{h}$ through the bilinear form on $\mathfrak{h}$. If $u \neq 0$, we have $(u, h_i) \neq 0$ for some $1 \leq i \leq d$, so that $\ker u \cap E$ is a proper $\mathbb{R}$-subspace of $E$. By a well known fact in linear algebra we have $E \neq \bigcup_{u\in S}(\ker u \cap E)$. Take $h \in E - \bigcup_{u\in S}(\ker u \cap E)$ and set $\mathfrak{h}_1 = \mathbb{C}h$. We have $(u, h) \neq 0$ for all $u \in S$. Then $\mathfrak{h}_1$ meets our needs.

For $\lambda, \mu, \nu \in \mathfrak{h}$, set $S = \{a\lambda + b\mu + c\nu \mid a, b, c \in \{1, -1\}\}$. We see that the triple $(\lambda, \mu, \nu)$ is not admissible if and only if $S$ consists of nonzero vectors. Then the particular assertion follows immediately.

Next we prove the following lemma:

Lemma 4.12. (1) For any $\lambda, \mu \in \mathfrak{h} - \{0\}$, the fusion rules of types $\binom{M(1,\mu)}{M(1)^{\pm}\ M(1,\lambda)}$ are zero if $(\lambda, \mu, 0)$ is not an admissible triple.

(2) Let $\lambda, \mu, \nu \in \mathfrak{h} - \{0\}$ such that $(\lambda, \mu, \nu)$ is not an admissible triple. Then the fusion rule of type $\binom{M(1,\nu)}{M(1,\lambda)\ M(1,\mu)}$ for $M(1)^+$ is zero.

Proof. We also use induction on $\dim\mathfrak{h}$. As it has been proved (Theorem 4.6) in the case $\dim\mathfrak{h} = 1$, we assume that $\dim\mathfrak{h} \geq 2$. Since $(\lambda, \mu, 0)$ is not an admissible triple, in view of Lemma 4.11, there exists an orthogonal decomposition $\mathfrak{h} = \mathfrak{h}_1 \oplus \mathfrak{h}_2$ such that
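$\dim\mathfrak{h}_1 = 1$ and $(\lambda_1, \mu_1, 0)$ is not an admissible triple. Using (4.16) and the initial case, we obtain
\begin{align*}
\dim I_{M(1)^+}\binom{M(1,\mu)}{M(1)^{\pm}\ M(1,\lambda)}
&\leq \dim I_{M_{\mathfrak{h}_1}(1)^+ \otimes M_{\mathfrak{h}_2}(1)^+}\binom{M_{\mathfrak{h}_1}(1,\mu_1) \otimes M_{\mathfrak{h}_2}(1,\mu_2)}{M_{\mathfrak{h}_1}(1)^{\pm} \otimes M_{\mathfrak{h}_2}(1)^+\ \ M_{\mathfrak{h}_1}(1,\lambda_1) \otimes M_{\mathfrak{h}_2}(1,\lambda_2)} \\
&\leq \dim I_{M_{\mathfrak{h}_1}(1)^+}\binom{M(1,\mu_1)}{M(1)^{\pm}\ M(1,\lambda_1)} \cdot \dim I_{M_{\mathfrak{h}_2}(1)^+}\binom{M(1,\mu_2)}{M(1)^+\ M(1,\lambda_2)} \\
&= 0,
\end{align*}
proving the assertion (1). From this proof the assertion (2) is also clear.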
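The reduction to rank one in the proof above rests on Lemma 4.11: one needs a real direction $h$ that pairs nontrivially with each of the eight signed sums $a\lambda + b\mu + c\nu$. The sketch below is one possible way to find such a direction and is not the argument of the paper: it works in coordinates with respect to an orthonormal basis, assumes integer coordinates so that the arithmetic is exact, and searches along the curve $h = (1, t, t^2, \ldots)$, on which each nonzero vector can fail for at most $d - 1$ values of $t$. All names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def bilinear(u, v):
    """Bilinear form in coordinates relative to an orthonormal basis (no complex conjugation)."""
    return sum(a * b for a, b in zip(u, v))


def signed_sums(lam, mu, nu):
    """The set S = {a*lam + b*mu + c*nu : a, b, c in {1, -1}} as a list of tuples."""
    return [tuple(p * a + q * b + r * c for a, b, c in zip(lam, mu, nu))
            for p, q, r in product((1, -1), repeat=3)]


def is_admissible(lam, mu, nu):
    return any(all(x == 0 for x in u) for u in signed_sums(lam, mu, nu))


def separating_direction(vectors):
    """Find a real h with (u, h) != 0 for every (nonzero) u in `vectors`."""
    d = len(vectors[0])
    bad_values = len(vectors) * (d - 1)  # upper bound on the number of failing t
    for t in range(1, bad_values + 2):
        h = tuple(t ** i for i in range(d))
        if all(bilinear(u, h) != 0 for u in vectors):
            return h
    raise ValueError("some vector in the input is zero")


def project(u, h):
    """Coefficient of the orthogonal projection of u onto the line spanned by h."""
    return (Fraction(bilinear(u, h), bilinear(h, h)),)


lam, mu, nu = (1, 0), (0, 1), (1, 2)          # not an admissible triple
h = separating_direction(signed_sums(lam, mu, nu))
projected = (project(lam, h), project(mu, h), project(nu, h))
still_not_admissible = not is_admissible(*projected)
```

For this example the search stops at the first candidate, and the projected triple remains non-admissible, which is the property used to pass to $\mathfrak{h}_1$.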
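We also have:

Lemma 4.13. The fusion rules of types $\binom{M(1)(\theta)^{\pm}}{M(1)^-\ M(1)^{\pm}}$, $\binom{M(1)(\theta)^{\mp}}{M(1)^-\ M(1)^{\pm}}$ and $\binom{M(1)(\theta)^{\pm}}{M(1)^-\ M(1)(\theta)^{\pm}}$ are zero.

Proof. We shall also use induction on $\dim\mathfrak{h}$. As it was proved in Theorem 4.6 for rank one, we assume that $\dim\mathfrak{h} > 1$. As we have done before, we decompose $\mathfrak{h} = \mathfrak{h}_1 \oplus \mathfrak{h}_2$ (an orthogonal sum) with $\dim\mathfrak{h}_1 = 1$. For any $\gamma \in \mathfrak{h}$, $\gamma$ is decomposed as $\gamma_1 + \gamma_2$ with $\gamma_i \in \mathfrak{h}_i$ for $i = 1, 2$. Using the decomposition (4.14), the inequality (4.16) and the inductive hypothesis, we have
\begin{align*}
\dim I_{M(1)^+}\binom{M(1)(\theta)^{\pm}}{M(1)^-\ M(1)(\theta)^{\pm}}
&\leq \dim I_{M_{\mathfrak{h}_1}(1)^+ \otimes M_{\mathfrak{h}_2}(1)^+}\binom{M(1)(\theta)^{\pm}}{M_{\mathfrak{h}_1}(1)^+ \otimes M_{\mathfrak{h}_2}(1)^-\ \ M_{\mathfrak{h}_1}(1)(\theta)^+ \otimes M_{\mathfrak{h}_2}(1)(\theta)^{\pm}} \\
&\leq \dim I_{M_{\mathfrak{h}_1}(1)^+ \otimes M_{\mathfrak{h}_2}(1)^+}\binom{M_{\mathfrak{h}_1}(1)(\theta)^+ \otimes M_{\mathfrak{h}_2}(1)(\theta)^{\pm}}{M_{\mathfrak{h}_1}(1)^+ \otimes M_{\mathfrak{h}_2}(1)^-\ \ M_{\mathfrak{h}_1}(1)(\theta)^+ \otimes M_{\mathfrak{h}_2}(1)(\theta)^{\pm}} \\
&\quad + \dim I_{M_{\mathfrak{h}_1}(1)^+ \otimes M_{\mathfrak{h}_2}(1)^+}\binom{M_{\mathfrak{h}_1}(1)(\theta)^- \otimes M_{\mathfrak{h}_2}(1)(\theta)^{\mp}}{M_{\mathfrak{h}_1}(1)^+ \otimes M_{\mathfrak{h}_2}(1)^-\ \ M_{\mathfrak{h}_1}(1)(\theta)^+ \otimes M_{\mathfrak{h}_2}(1)(\theta)^{\pm}} \\
&= \dim I_{M_{\mathfrak{h}_1}(1)^+}\binom{M(1)(\theta)^+}{M(1)^+\ M(1)(\theta)^+} \cdot \dim I_{M_{\mathfrak{h}_2}(1)^+}\binom{M(1)(\theta)^{\pm}}{M(1)^-\ M(1)(\theta)^{\pm}} \\
&\quad + \dim I_{M_{\mathfrak{h}_1}(1)^+}\binom{M(1)(\theta)^-}{M(1)^+\ M(1)(\theta)^+} \cdot \dim I_{M_{\mathfrak{h}_2}(1)^+}\binom{M(1)(\theta)^{\mp}}{M(1)^-\ M(1)(\theta)^{\pm}} \\
&= 0,
\end{align*}
respectively. Similarly, the fusion rules of types $\binom{M(1)(\theta)^{\pm}}{M(1)^-\ M(1)^{\pm}}$, $\binom{M(1)(\theta)^{\mp}}{M(1)^-\ M(1)^{\pm}}$ are also zero.

Now we put everything together to prove Theorem 4.7.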
Proof. By Propositions 4.8, 3.5 and 2.7, all the fusion rules among irreducible $M(1)^+$-modules are either 0 or 1 and are stable under the permutation of modules. We see that the fusion rule of arbitrary type for $M(1)^+$ coincides with one of those in Lemmas 4.9–4.13 or Propositions 4.1–4.5 after permuting irreducible modules. Furthermore, we can show that any type of fusion rule indicated in (i)–(v) of Theorem 4.7 agrees with one of those in Propositions 4.1–4.5 after permuting irreducible modules. This completes the proof.

5. Fusion Rules for Vertex Operator Algebra $V_L^+$

5.1. Main theorem. In this section we state the main result on the fusion rules for irreducible $V_L^+$-modules. To do this we need to introduce some notation. First, recall the commutator map $c(\,\cdot\,,\,\cdot\,)$ of $\hat{L^\circ}$. This defines an alternating $\mathbb{Z}$-bilinear form $c_0 : L^\circ \times L^\circ \to \mathbb{Z}/q\mathbb{Z}$ by the property $c(a, b) = \kappa_q^{c_0(\bar{a},\bar{b})}$ for $a, b \in \hat{L^\circ}$. For $\lambda, \mu \in L^\circ$, we set
\[ \pi_{\lambda,\mu} = e^{(\lambda,\mu)\pi i}\, \omega_q^{c_0(\mu,\lambda)}. \tag{5.1} \]
Note that $\pi_{\lambda,\alpha} = \pm 1$ for any $\alpha \in L$ if $2\lambda \in L$. Next, for a central character $\chi$ of $\hat{L}/K$ with $\chi(\kappa) = -1$ and $\lambda \in L^\circ$ with $2\lambda \in L$ we set
\[ c_\chi(\lambda) = (-1)^{(\lambda,2\lambda)} \epsilon(\lambda, 2\lambda)\, \chi(e_{2\lambda}). \tag{5.2} \]
It is easy to see that $c_\chi(\lambda) = \pm 1$. For any $\lambda \in L^\circ$ and a central character $\chi$ of $\hat{L}/K$, let $\chi^{(\lambda)}$ be the central character defined by $\chi^{(\lambda)}(a) = (-1)^{(\bar{a},\lambda)}\chi(a)$. We set $T_\chi^{(\lambda)} = T_{\chi^{(\lambda)}}$. We call a triple $(\lambda, \mu, \nu)$ an admissible triple modulo $L$ if $p\lambda + q\mu + r\nu \in L$ for some $p, q, r \in \{\pm 1\}$.

Theorem 5.1. Let $L$ be a positive-definite even lattice. For any irreducible $V_L^+$-modules
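$M^i$ ($i = 1, 2, 3$), the fusion rule of type $\binom{M^3}{M^1\ M^2}$ is either 0 or 1. The fusion rule of type $\binom{M^3}{M^1\ M^2}$ is 1 if and only if $M^i$ ($i = 1, 2, 3$) satisfy one of the following conditions:

(i) $M^1 = V_{\lambda+L}$ for $\lambda \in L^\circ$ such that $2\lambda \notin L$ and $(M^2, M^3)$ is one of the following pairs:
$(V_{\mu+L}, V_{\nu+L})$ for $\mu, \nu \in L^\circ$ such that $2\mu, 2\nu \notin L$ and $(\lambda, \mu, \nu)$ is an admissible triple modulo $L$,
$(V_{\mu+L}^{\pm}, V_{\nu+L})$, $((V_{\nu+L})', (V_{\mu+L}^{\pm})')$ for $\mu, \nu \in L^\circ$ such that $2\mu \in L$ and $(\lambda, \mu, \nu)$ is an admissible triple modulo $L$,
$(V_L^{T_\chi,\pm}, V_L^{T_\chi^{(\lambda)},\pm})$, $(V_L^{T_\chi,\pm}, V_L^{T_\chi^{(\lambda)},\mp})$ for any irreducible $\hat{L}/K$-module $T_\chi$.

(ii) $M^1 = V_{\lambda+L}^{+}$ for $\lambda \in L^\circ$ such that $2\lambda \in L$ and $(M^2, M^3)$ is one of the following pairs:
$(V_{\mu+L}, V_{\nu+L})$ for $\mu, \nu \in L^\circ$ such that $2\mu \notin L$ and $(\lambda, \mu, \nu)$ is an admissible triple modulo $L$,
$(V_{\mu+L}^{\pm}, V_{\nu+L}^{\pm})$ for $\mu, \nu \in L^\circ$ such that $2\mu \in L$, $\pi_{\lambda,2\mu} = 1$ and $(\lambda, \mu, \nu)$ is an admissible triple modulo $L$,
$(V_{\mu+L}^{\pm}, V_{\nu+L}^{\mp})$ for $\mu, \nu \in L^\circ$ such that $2\mu \in L$, $\pi_{\lambda,2\mu} = -1$ and $(\lambda, \mu, \nu)$ is an admissible triple modulo $L$,
$(V_L^{T_\chi,\pm}, V_L^{T_\chi^{(\lambda)},\pm})$, $((V_L^{T_\chi^{(\lambda)},\pm})', (V_L^{T_\chi,\pm})')$ for any irreducible $\hat{L}/K$-module $T_\chi$ such that $c_\chi(\lambda) = 1$,
$(V_L^{T_\chi,\pm}, V_L^{T_\chi^{(\lambda)},\mp})$, $((V_L^{T_\chi^{(\lambda)},\mp})', (V_L^{T_\chi,\pm})')$ for any irreducible $\hat{L}/K$-module $T_\chi$ such that $c_\chi(\lambda) = -1$.

(iii) $M^1 = V_{\lambda+L}^{-}$ for $\lambda \in L^\circ$ such that $2\lambda \in L$ and $(M^2, M^3)$ is one of the following pairs:
$(V_{\mu+L}, V_{\nu+L})$ for $\mu, \nu \in L^\circ$ such that $2\mu \notin L$ and $(\lambda, \mu, \nu)$ is an admissible triple modulo $L$,
$(V_{\mu+L}^{\pm}, V_{\nu+L}^{\mp})$ for $\mu, \nu \in L^\circ$ such that $2\mu \in L$, $\pi_{\lambda,2\mu} = 1$ and $(\lambda, \mu, \nu)$ is an admissible triple modulo $L$,
$(V_{\mu+L}^{\pm}, V_{\nu+L}^{\pm})$ for $\mu, \nu \in L^\circ$ such that $2\mu \in L$, $\pi_{\lambda,2\mu} = -1$ and $(\lambda, \mu, \nu)$ is an admissible triple modulo $L$,
$(V_L^{T_\chi,\pm}, V_L^{T_\chi^{(\lambda)},\mp})$, $((V_L^{T_\chi^{(\lambda)},\mp})', (V_L^{T_\chi,\pm})')$ for any irreducible $\hat{L}/K$-module $T_\chi$ such that $c_\chi(\lambda) = 1$,
$(V_L^{T_\chi,\pm}, V_L^{T_\chi^{(\lambda)},\pm})$, $((V_L^{T_\chi^{(\lambda)},\pm})', (V_L^{T_\chi,\pm})')$ for any irreducible $\hat{L}/K$-module $T_\chi$ such that $c_\chi(\lambda) = -1$.

(iv) $M^1 = V_L^{T_\chi,+}$ for an irreducible $\hat{L}/K$-module $T_\chi$ and $(M^2, M^3)$ is one of the following pairs:
$(V_{\lambda+L}, V_L^{T_\chi^{(\lambda)},\pm})$, $((V_L^{T_\chi^{(\lambda)},\pm})', (V_{\lambda+L})')$ for $\lambda \in L^\circ$ such that $2\lambda \notin L$,
$(V_{\lambda+L}^{\pm}, V_L^{T_\chi^{(\lambda)},\pm})$, $((V_L^{T_\chi^{(\lambda)},\pm})', (V_{\lambda+L}^{\pm})')$ for $\lambda \in L^\circ$ such that $2\lambda \in L$ and that $c_\chi(\lambda) = 1$,
$(V_{\lambda+L}^{\pm}, V_L^{T_\chi^{(\lambda)},\mp})$, $((V_L^{T_\chi^{(\lambda)},\mp})', (V_{\lambda+L}^{\pm})')$ for $\lambda \in L^\circ$ such that $2\lambda \in L$ and that $c_\chi(\lambda) = -1$.

(v) $M^1 = V_L^{T_\chi,-}$ for an irreducible $\hat{L}/K$-module $T_\chi$ and $(M^2, M^3)$ is one of the following pairs:
$(V_{\lambda+L}, V_L^{T_\chi^{(\lambda)},\pm})$, $((V_L^{T_\chi^{(\lambda)},\pm})', (V_{\lambda+L})')$ for $\lambda \in L^\circ$ such that $2\lambda \notin L$,
$(V_{\lambda+L}^{\pm}, V_L^{T_\chi^{(\lambda)},\mp})$, $((V_L^{T_\chi^{(\lambda)},\pm})', (V_{\lambda+L}^{\mp})')$ for $\lambda \in L^\circ$ such that $2\lambda \in L$ and that $c_\chi(\lambda) = 1$,
$(V_{\lambda+L}^{\pm}, V_L^{T_\chi^{(\lambda)},\pm})$, $((V_L^{T_\chi^{(\lambda)},\mp})', (V_{\lambda+L}^{\mp})')$ for $\lambda \in L^\circ$ such that $2\lambda \in L$ and that $c_\chi(\lambda) = -1$.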
Remark 5.2. In the case that the rank of L is one, Theorem 5.1 was previously proved in [A2]. We will give a proof of this theorem in Sects. 5.2 and 5.3, where we deal with the fusion rules for irreducible modules of untwisted types and for those of twisted types respectively.
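To make the conditions of Theorem 5.1 concrete in the rank one situation of Remark 5.2, the sketch below encodes a rank one lattice $L = \mathbb{Z}\alpha$ with $(\alpha, \alpha) = 2k$, so that $L^\circ = \mathbb{Z}\alpha/(2k)$ and an element $m\alpha/(2k)$ of $L^\circ$ is stored as the integer $m$. Two points are assumptions of this illustration: the value of $\pi_{\lambda,2\mu}$ is passed in as an input rather than computed from (5.1), and the sign rule is simply a compact restatement of the untwisted pairs in cases (ii) and (iii) for $2\lambda, 2\mu \in L$. The function names are hypothetical.

```python
from itertools import product


def is_admissible_mod_L(m1, m2, m3, k):
    """Admissibility modulo L for L = Z*alpha, (alpha, alpha) = 2k.

    The triple (m1, m2, m3) stands for (m1, m2, m3) * alpha / (2k) in the dual lattice;
    p*lam + q*mu + r*nu lies in L iff p*m1 + q*m2 + r*m3 is divisible by 2k.
    """
    return any((p * m1 + q * m2 + r * m3) % (2 * k) == 0
               for p, q, r in product((1, -1), repeat=3))


def output_sign(eps1, eps2, pi_value):
    """Sign eps3 of V^{eps3}_{nu+L} allowed by Theorem 5.1 (ii)-(iii) for given eps1, eps2.

    `pi_value` is the value (+1 or -1) of pi_{lam, 2mu}, supplied by the caller.
    """
    return eps1 * eps2 * pi_value


# k = 3: the dual lattice modulo L is Z/6.
mod_yes = is_admissible_mod_L(1, 2, 3, 3)   # 1 + 2 - 3 = 0
mod_no = is_admissible_mod_L(1, 1, 3, 3)    # signed sums are +-1, +-3, +-5 only
```

For instance, `output_sign(-1, 1, 1)` returns `-1`, matching the pair $(V_{\mu+L}^{+}, V_{\nu+L}^{-})$ listed in case (iii) for $\pi_{\lambda,2\mu} = 1$.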
5.2. Fusion rules among modules of untwisted types. In this section we determine the fusion rules for the irreducible VL+ -modules of untwisted types. We first prove that the
fusion rules of certain types for irreducible $V_L^+$-modules of untwisted types are nonzero by giving nonzero intertwining operators. Such intertwining operators come from intertwining operators constructed in [DL1] for irreducible $V_L$-modules.

We recall a construction of intertwining operators for irreducible $V_L$-modules following [DL1]. Let $Y(\,\cdot\,, z) : V_{L^\circ} \to (\mathrm{End}\, V_{L^\circ})\{z\}$ be the linear map as in (3.4) (with $P = L^\circ$). Although $Y(\,\cdot\,, z)$ satisfies the $L(-1)$-derivative property, the identity (3.5) implies that $Y(\,\cdot\,, z)$ does not give intertwining operators among irreducible $V_L$-modules. We attach an extra factor to $Y(\,\cdot\,, z)$ to get intertwining operators. Let $\lambda \in L^\circ$. We define a linear map $\pi^{(\lambda)} \in \mathrm{End}\, V_{L^\circ}$ which acts on $M(1,\mu)$ ($\mu \in L^\circ$) as the scalar $\pi_{\lambda,\mu}$ $(= e^{(\lambda,\mu)\pi i}\omega_q^{c_0(\mu,\lambda)})$ and we then define a linear map $Y_\lambda(\,\cdot\,, z) : V_{\lambda+L} \to (\mathrm{End}\, V_{L^\circ})\{z\}$ by
\[ Y_\lambda(u, z)v = Y(u, z)\pi^{(\lambda)}(v) \tag{5.3} \]
for any $u \in M(1,\lambda)$ and $v \in M(1,\mu)$. Then the restriction of $Y_\lambda(\,\cdot\,, z)$ gives rise to a nonzero intertwining operator of type $\binom{V_{\lambda+\mu+L}}{V_{\lambda+L}\ V_{\mu+L}}$.

Let $\lambda, \mu, \gamma \in L^\circ$ (the dual lattice of $L$). It was proved in [DL1, Prop. 12.8] that $I_{V_L}\binom{V_{\gamma+L}}{V_{\lambda+L}\ V_{\mu+L}}$ is nonzero if and only if $\gamma - \lambda - \mu \in L$ and that $I_{V_L}\binom{V_{\lambda+\mu+L}}{V_{\lambda+L}\ V_{\mu+L}}$ is one-dimensional. Thus the fusion rule of type $\binom{V_{\lambda+\mu+L}}{V_{\lambda+L}\ V_{\mu+L}}$ for $V_L^+$ is nonzero. Using a $V_L^+$-module isomorphism between $V_{\mu+L}$ and $V_{-\mu+L}$, we see that the fusion rule of type $\binom{V_{\lambda-\mu+L}}{V_{\lambda+L}\ V_{\mu+L}}$ is also nonzero. Furthermore we have:

Proposition 5.3. For any $\lambda, \mu, \nu \in L^\circ$, the fusion rule of type $\binom{V_{\nu+L}}{V_{\lambda+L}\ V_{\mu+L}}$ for $V_L^+$ is nonzero if and only if $(\lambda, \mu, \nu)$ is an admissible triple modulo $L$.

Proof. Let $(\lambda, \mu, \nu)$ be an admissible triple modulo $L$. Then $V_{\nu+L}$ is isomorphic to $V_{\lambda+\mu+L}$ or $V_{\lambda-\mu+L}$ as a $V_L^+$-module. Hence the fusion rule of type $\binom{V_{\nu+L}}{V_{\lambda+L}\ V_{\mu+L}}$ is nonzero.

Conversely, let us assume that the fusion rule of type $\binom{V_{\nu+L}}{V_{\lambda+L}\ V_{\mu+L}}$ is nonzero. We take $\lambda, \mu$ to be nonzero if necessary. Note that for any $\gamma \in L^\circ$, $V_{\gamma+L} \cong \bigoplus_{\alpha\in L} M(1,\gamma+\alpha)$ as an $M(1)^+$-module. Since $V_{\lambda+L}$ and $V_{\mu+L}$ contain the irreducible $M(1)^+$-modules $M(1,\lambda)$ and $M(1,\mu)$, respectively, by Proposition 2.9, the fusion rule of type $\binom{V_{\nu+L}}{M(1,\lambda)\ M(1,\mu)}$ for $M(1)^+$ is nonzero. By Theorem 4.7, $V_{\nu+L}$ must contain an irreducible $M(1)^+$-submodule isomorphic to $M(1,\lambda+\mu)$ or $M(1,\lambda-\mu)$. Then $\lambda + \mu \in \nu + L$, or $-\nu + L$, or $\lambda - \mu \in \nu + L$, or $-\nu + L$. This shows that $(\lambda, \mu, \nu)$ is an admissible triple modulo $L$.

Furthermore, if $2\lambda \in L$, by Proposition 2.9, we see that the fusion rules of types $\binom{V_{\lambda+\mu+L}}{V_{\lambda+L}^{\pm}\ V_{\mu+L}}$ are not zero. Similarly, the fusion rules of types $\binom{V_{\lambda-\mu+L}}{V_{\lambda+L}^{\pm}\ V_{\mu+L}}$ are also nonzero. Clearly, if one of the fusion rules of types $\binom{V_{\nu+L}}{V_{\lambda+L}^{\pm}\ V_{\mu+L}}$ is nonzero, the fusion rule of type $\binom{V_{\nu+L}}{V_{\lambda+L}\ V_{\mu+L}}$ for $V_L^+$ is nonzero. In view of Proposition 5.3 we immediately have:

Proposition 5.4. For any $\lambda, \mu, \nu \in L^\circ$ with $2\lambda \in L$, the fusion rules of types $\binom{V_{\nu+L}}{V_{\lambda+L}^{\pm}\ V_{\mu+L}}$ are nonzero if and only if $(\lambda, \mu, \nu)$ is an admissible triple modulo $L$.

We next prove the following result:
Proposition 5.5. Let $M^1$, $M^2$ and $M^3$ be irreducible $V_L^+$-modules of untwisted types. Suppose that one of $M^i$ ($i = 1, 2, 3$) is isomorphic to $V_{\lambda+L}$ for $\lambda \in L^\circ$ with $2\lambda \notin L$ or $V_L^{\pm}$. Then the fusion rule of type $\binom{M^3}{M^1\ M^2}$ is either 0 or 1.

Proof. For $\lambda \in L^\circ$, the $V_L^+$-module $V_{\lambda+L}$ is decomposed into a direct sum of irreducible $M(1)^+$-modules as
\[ V_{\lambda+L} \cong \bigoplus_{\alpha\in L} M(1,\lambda+\alpha). \]
Moreover, if $2\lambda \in L$, we can take a subset $S_\lambda \subset \lambda + L$ so that $S_\lambda \cap (-S_\lambda) = \emptyset$ and $S_\lambda \cup (-S_\lambda) = \lambda + L$ ($= L - \{0\}$ if $\lambda \in L$), and we have
\begin{align*}
V_L^{\pm} &\cong M(1)^{\pm} \oplus \bigoplus_{\mu\in S_\lambda} M(1,\mu) && \text{if } \lambda \in L, \\
V_{\lambda+L}^{+} \cong V_{\lambda+L}^{-} &\cong \bigoplus_{\mu\in S_\lambda} M(1,\mu) && \text{if } \lambda \notin L
\end{align*}
as $M(1)^+$-modules. Therefore, the multiplicity of any irreducible $M(1)^+$-module in any irreducible $V_L^+$-module of untwisted type is at most one and any irreducible $V_L^+$-module of untwisted type contains an irreducible $M(1)^+$-submodule isomorphic to $M(1,\beta)$ with $0 \neq \beta \in L^\circ$.

Let $M^1$, $M^2$ and $M^3$ be irreducible $V_L^+$-modules of untwisted type. From the previous paragraph, each $M^i$ contains $M(1,\lambda_i)$ as an irreducible $M(1)^+$-submodule for some nonzero $\lambda_i \in L^\circ$ for $i = 1, 2, 3$. In view of Proposition 2.9, we see that the fusion rule of type $\binom{M^3}{M^1\ M^2}$ for $V_L^+$-modules is not bigger than that of type $\binom{M^3}{M(1,\lambda_1)\ M(1,\lambda_2)}$ for $M(1)^+$-modules. Assume that the fusion rule of type $\binom{M^3}{M^1\ M^2}$ for $V_L^+$-modules is not zero. From Theorem 4.6 we must have $a\lambda_1 + b\lambda_2 \in \lambda_3 + L$ for some $a, b \in \{1, -1\}$. That is, $(\lambda_1, \lambda_2, \lambda_3)$ is an admissible triple modulo $L$.

By using Propositions 2.7 and 3.7 we may assume that $M^1$ is isomorphic to one of the irreducible modules $V_L^{\pm}$ and $V_{\lambda+L}$ for $\lambda \in L^\circ$ with $2\lambda \notin L$. We divide the proof into the following three cases.

Case 1. $M^1 = V_L^+$. From Remark 2.9 of [L] we have that for any vertex operator algebra $V$ and for any $V$-modules $W$ and $M$, the fusion rule of type $\binom{M}{V\ W}$ equals $\dim \mathrm{Hom}_V(W, M)$. It follows from the Schur lemma (see [FHL]) that the fusion rule of type $\binom{M}{V\ W}$ for irreducible $V$-modules $W$ and $M$ is either 0 or 1.

Case 2. $M^1 = V_L^-$. From Theorem 4.6 (ii), for any irreducible $M(1)^+$-module $W$ the fusion rule of type $\binom{W}{M(1)^-\ M(1,\lambda_2)}$ for $M(1)^+$-modules is 1 if $W \cong M(1,-\lambda_2)$ and it is zero otherwise. We also know that the multiplicity of $M(1,-\lambda_2)$ in $M^3$ is one. Thus the fusion rule of type $\binom{M^3}{M^1\ M^2}$ is at most 1.

Case 3. $M^1 = V_{\lambda+L}$ for $\lambda \in L^\circ$ with $2\lambda \notin L$. Because $(\lambda_1, \lambda_2, \lambda_3)$ is an admissible triple modulo $L$, we have that either $2\lambda_2 \notin L$ or $2\lambda_3 \notin L$. By using Propositions 2.7 and 3.7 we may assume that $2\lambda_3 \notin L$. This implies that $V_{\lambda_3+L}$ contains either $M(1,\lambda_1+\lambda_2)$ or $M(1,\lambda_1-\lambda_2)$ as an $M(1)^+$-submodule with multiplicity one. In view of Theorem 4.6 and Proposition 2.9, the fusion rule of type $\binom{M^3}{M^1\ M^2}$ is either 0 or 1.
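Let $\lambda, \mu \in L^\circ$ such that $2\lambda, 2\mu \in L$. Then we see that $Y_\lambda(\,\cdot\,, z)$ gives rise to a nonzero intertwining operator of type $\binom{V_{\lambda+\mu+L}}{V_{\lambda+L}^{\epsilon_1}\ V_{\mu+L}^{\epsilon_2}}$ for any $\epsilon_1, \epsilon_2 \in \{\pm\}$. We consider the conjugation $\theta Y_\lambda(\,\cdot\,, z)\theta^{-1}$. By definition, we have for any $\beta \in L$ and $v \in M(1,\mu+\beta)$,
\begin{align*}
\theta(\pi^{(\lambda)}(\theta^{-1}(v))) &= e^{(\lambda,-\mu-\beta)\pi i}\, \omega_q^{c_0(-\mu-\beta,\lambda)}\, v \\
&= e^{(\lambda,-2\mu-2\beta)\pi i}\, \omega_q^{c_0(-2\mu-2\beta,\lambda)}\, e^{(\lambda,\mu+\beta)\pi i}\, \omega_q^{c_0(\mu+\beta,\lambda)}\, v \\
&= e^{(\lambda,-2\beta)\pi i}\, \omega_q^{c_0(-2\beta,\lambda)}\, e^{(\lambda,-2\mu)\pi i}\, \omega_q^{c_0(-2\mu,\lambda)}\, \pi^{(\lambda)}(v) \\
&= \pi_{\lambda,-2\mu}\, \pi^{(\lambda)}(v) = \pi_{\lambda,2\mu}\, \pi^{(\lambda)}(v),
\end{align*}
noticing that
\[ e^{(\lambda,-2\beta)\pi i} = 1, \qquad \omega_q^{c_0(-2\beta,\lambda)} = \omega_q^{c_0(\beta,-2\lambda)} = (-1)^{(\beta,-2\lambda)} = 1. \]
Using (3.6) and (5.3), we get
\[ \theta Y_\lambda(u, z)\theta^{-1}(v) = \pi_{\lambda,2\mu}\, Y_\lambda(\theta(u), z)v. \tag{5.4} \]
Next we prove the following result:

Proposition 5.6. Let $\lambda, \mu, \nu \in L^\circ$ such that $2\lambda, 2\mu \in L$.

(1) If $(\lambda, \mu, \nu)$ is not an admissible triple modulo $L$ then the fusion rule of type $\binom{V_{\nu+L}^{\epsilon_3}}{V_{\lambda+L}^{\epsilon_1}\ V_{\mu+L}^{\epsilon_2}}$ is zero for any $\epsilon_i \in \{\pm\}$ ($i = 1, 2, 3$).

(2) Let $(\lambda, \mu, \nu)$ be an admissible triple modulo $L$. Then the fusion rules of types $\binom{V_{\nu+L}^{+}}{V_{\lambda+L}^{\pm}\ V_{\mu+L}^{\pm}}$ and $\binom{V_{\nu+L}^{-}}{V_{\lambda+L}^{\pm}\ V_{\mu+L}^{\mp}}$ are nonzero if and only if $\pi_{\lambda,2\mu} = 1$. The fusion rules of types $\binom{V_{\nu+L}^{-}}{V_{\lambda+L}^{\pm}\ V_{\mu+L}^{\pm}}$ and $\binom{V_{\nu+L}^{+}}{V_{\lambda+L}^{\pm}\ V_{\mu+L}^{\mp}}$ are nonzero if and only if $\pi_{\lambda,2\mu} = -1$. Furthermore, the fusion rule of type $\binom{V_{\nu+L}^{\epsilon_3}}{V_{\lambda+L}^{\epsilon_1}\ V_{\mu+L}^{\epsilon_2}}$ is either 0 or 1 for $\epsilon_i \in \{\pm\}$.

Proof. The assertion (1) follows immediately from Proposition 5.3. We now prove (2). By (5.4) we see that $Y_\lambda(\,\cdot\,, z)$ gives nonzero intertwining operators of types $\binom{V_{\nu+L}^{+}}{V_{\lambda+L}^{\pm}\ V_{\mu+L}^{\pm}}$ ($\binom{V_{\nu+L}^{-}}{V_{\lambda+L}^{\pm}\ V_{\mu+L}^{\pm}}$ resp.) if $\pi_{\lambda,2\mu} = 1$ ($\pi_{\lambda,2\mu} = -1$ resp.), so that the corresponding fusion rules are nonzero. It is enough to prove that the fusion rule of type $\binom{V_{\lambda+\mu+L}}{V_{\lambda+L}^{\epsilon_1}\ V_{\mu+L}^{\epsilon_2}}$ for $V_L^+$ is one for any $\epsilon_1, \epsilon_2 \in \{\pm\}$. We shall demonstrate the proof only for $\epsilon_1 = \epsilon_2 = +$. The other cases can be proved similarly.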
Fusion Rules for Vertex Operator Algebras M(1)+ and VL+
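As in the proof of Proposition 5.5, for any nonzero $\nu \in L^\circ$ with $2\nu \in L$, we take a subset $S_\nu \subset \nu + L$ such that $S_\nu \cap (-S_\nu) = \emptyset$ and $S_\nu \cup (-S_\nu) = \nu + L$ ($L - \{0\}$ if $\nu \in L$). We may assume that $\nu, 3\nu \in S_\nu$. Then we have an $M(1)^+$-isomorphism
\[
\phi : V_{\nu+L}^+ \to \bigoplus_{\gamma \in S_\nu} M(1, \gamma)
\qquad \Bigl( V_L^+ \to M(1)^+ \oplus \bigoplus_{\gamma \in S_\nu} M(1, \gamma) \ \text{if } \nu \in L \Bigr)
\]
such that $\phi(u + \theta(u)) = u$ for any $\gamma \in S_\nu$ and $u \in M(1, \gamma)$. Set
\[
V_{\nu+L}^+[\gamma] = \bigl(M(1)^+ \otimes (e_\gamma + e_{-\gamma})\bigr) \oplus \bigl(M(1)^- \otimes (e_\gamma - e_{-\gamma})\bigr) \subset V_{\nu+L}^+
\]
for $\gamma \in \nu + L$. Then $\phi$ gives an $M(1)^+$-isomorphism from $V_{\nu+L}^+[\gamma]$ to $M(1, \gamma)$.

Let $\gamma \in \lambda + L$ and $\delta \in \mu + L$. By Proposition 2.9 it suffices to prove that the image of $I_{V_L^+}\binom{V_{\lambda+\mu+L}}{V_{\lambda+L}^+\ V_{\mu+L}^+}$ in $I_{M(1)^+}\binom{V_{\lambda+\mu+L}}{V_{\lambda+L}^+[\lambda]\ V_{\mu+L}^+[\mu]}$ under the restriction map is one-dimensional. By Theorem 4.6, the vector space $I_{M(1)^+}\binom{V_{\lambda+\mu+L}}{V_{\lambda+L}^+[\gamma]\ V_{\mu+L}^+[\delta]}$ has dimension 4 and is spanned by $\mathcal{Y}_i(\,\cdot\,, z)$ ($i = 1, 2, 3, 4$) defined by
\[
\begin{aligned}
\mathcal{Y}_1(u, z)v &= \mathcal{Y}_{\gamma,\delta}(\phi(u), z)\phi(v), &
\mathcal{Y}_2(u, z)v &= \mathcal{Y}_{\gamma,-\delta}(\phi(u), z)\theta(\phi(v)),\\
\mathcal{Y}_3(u, z)v &= \mathcal{Y}_{-\gamma,\delta}(\theta(\phi(u)), z)\phi(v), &
\mathcal{Y}_4(u, z)v &= \mathcal{Y}_{-\gamma,-\delta}(\theta(\phi(u)), z)\theta(\phi(v))
\end{aligned}
\]
for any $u \in V_{\lambda+L}^+[\gamma]$ and $v \in V_{\mu+L}^+[\delta]$. Let $\mathcal{Y}(\,\cdot\,, z)$ be an intertwining operator of type $\binom{V_{\lambda+\mu+L}}{V_{\lambda+L}^+\ V_{\mu+L}^+}$. Then for $\gamma \in \lambda + L$, $\delta \in \mu + L$, there are $c^{i}_{\gamma,\delta} \in \mathbb{C}$ such that the restriction of $\mathcal{Y}(\,\cdot\,, z)$ to $V_{\lambda+L}^+[\gamma] \otimes V_{\mu+L}^+[\delta]$ is expressed by $\mathcal{Y}(\,\cdot\,, z) = \sum_{i=1}^{4} c^{i}_{\gamma,\delta}\,\mathcal{Y}_i(\,\cdot\,, z)$. So we need to prove that the space of possible coefficient vectors $(c^{i}_{\gamma,\delta})_{i=1,2,3,4}$ is one-dimensional. This is achieved by using commutativity and associativity of certain vertex operators and intertwining operators.

For $\alpha \in L$ we set $E^\alpha = e_\alpha + e_{-\alpha} \in V_L^+$. Then for $u \in V_{\lambda+L}^+[\lambda]$, $v \in V_{\mu+L}^+[\mu]$, there exists a nonnegative integer $k$ such that (by using the explicit expressions of the operators)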
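\[
\begin{aligned}
&(z_1 - z_2)^k\, Y(E^{2\mu}, z_1)\mathcal{Y}_1(u, z_2)v
= (z_1 - z_2)^k\, Y(E^{2\mu}, z_1)\mathcal{Y}_{\lambda,\mu}(\phi(u), z_2)\phi(v)\\
&= (z_1 - z_2)^k \bigl( \epsilon(2\mu, \lambda+\mu)\,\mathcal{Y}_{2\mu,\lambda+\mu}(e_{2\mu}, z_1)\mathcal{Y}_{\lambda,\mu}(\phi(u), z_2)\phi(v)\\
&\qquad + \epsilon(-2\mu, \lambda+\mu)\,\mathcal{Y}_{-2\mu,\lambda+\mu}(e_{-2\mu}, z_1)\mathcal{Y}_{\lambda,\mu}(\phi(u), z_2)\phi(v) \bigr)\\
&= (z_1 - z_2)^k (-1)^{(2\mu,\lambda)} \bigl( \epsilon(2\mu, \lambda+\mu)\,\mathcal{Y}_{\lambda,3\mu}(\phi(u), z_2)\mathcal{Y}_{2\mu,\mu}(e_{2\mu}, z_1)\phi(v)\\
&\qquad + \epsilon(-2\mu, \lambda+\mu)\,\mathcal{Y}_{\lambda,-\mu}(\phi(u), z_2)\mathcal{Y}_{-2\mu,\mu}(e_{-2\mu}, z_1)\phi(v) \bigr).
\end{aligned}
\]
As well, we have
\[
\begin{aligned}
&(z_1 - z_2)^k\, Y(E^{2\mu}, z_1)\mathcal{Y}_2(u, z_2)v\\
&= (z_1 - z_2)^k (-1)^{(2\mu,\lambda)} \bigl( \epsilon(2\mu, \lambda-\mu)\,\mathcal{Y}_{\lambda,\mu}(\phi(u), z_2)\mathcal{Y}_{2\mu,-\mu}(e_{2\mu}, z_1)\theta(\phi(v))\\
&\qquad + \epsilon(-2\mu, \lambda-\mu)\,\mathcal{Y}_{\lambda,-3\mu}(\phi(u), z_2)\mathcal{Y}_{-2\mu,-\mu}(e_{-2\mu}, z_1)\theta(\phi(v)) \bigr),\\
&(z_1 - z_2)^k\, Y(E^{2\mu}, z_1)\mathcal{Y}_3(u, z_2)v\\
&= (z_1 - z_2)^k (-1)^{(2\mu,-\lambda)} \bigl( \epsilon(2\mu, -\lambda+\mu)\,\mathcal{Y}_{-\lambda,3\mu}(\theta(\phi(u)), z_2)\mathcal{Y}_{2\mu,\mu}(e_{2\mu}, z_1)\phi(v)\\
&\qquad + \epsilon(-2\mu, -\lambda+\mu)\,\mathcal{Y}_{-\lambda,-\mu}(\theta(\phi(u)), z_2)\mathcal{Y}_{-2\mu,\mu}(e_{-2\mu}, z_1)\phi(v) \bigr),\\
&(z_1 - z_2)^k\, Y(E^{2\mu}, z_1)\mathcal{Y}_4(u, z_2)v\\
&= (z_1 - z_2)^k (-1)^{(2\mu,-\lambda)} \bigl( \epsilon(2\mu, -\lambda-\mu)\,\mathcal{Y}_{-\lambda,\mu}(\theta(\phi(u)), z_2)\mathcal{Y}_{2\mu,-\mu}(e_{2\mu}, z_1)\theta(\phi(v))\\
&\qquad + \epsilon(-2\mu, -\lambda-\mu)\,\mathcal{Y}_{-\lambda,-3\mu}(\theta(\phi(u)), z_2)\mathcal{Y}_{-2\mu,-\mu}(e_{-2\mu}, z_1)\theta(\phi(v)) \bigr).
\end{aligned}
\]
For simplicity, we set
\[
\begin{aligned}
A^{i,j} &= \mathcal{Y}_{(-1)^i\lambda,\,(2+(-1)^j)\mu}(\theta^i(\phi(u)), z_2)\,\mathcal{Y}_{2\mu,\,(-1)^j\mu}(e_{2\mu}, z_1)\,\theta^j(\phi(v)),\\
B^{i,j} &= \mathcal{Y}_{(-1)^i\lambda,\,(-2+(-1)^j)\mu}(\theta^i(\phi(u)), z_2)\,\mathcal{Y}_{-2\mu,\,(-1)^j\mu}(e_{-2\mu}, z_1)\,\theta^j(\phi(v))
\end{aligned}
\]
for $i, j = 0, 1$. Then we see that $A^{i,j} \in M(1, (-1)^i\lambda + (2+(-1)^j)\mu)\{z_1\}\{z_2\}$ and $B^{i,j} \in M(1, (-1)^i\lambda + (-2+(-1)^j)\mu)\{z_1\}\{z_2\}$, and that $A^{i,j}$ and $B^{i,j}$ for $i, j = 0, 1$ are linearly independent in $V_{\lambda+\mu+L}\{z_1\}\{z_2\}$. Thus
\[
\begin{aligned}
&(z_1 - z_2)^k\, Y(E^{2\mu}, z_1)\mathcal{Y}(u, z_2)v\\
&= (z_1 - z_2)^k (-1)^{(2\mu,\lambda)} \bigl(
\epsilon(2\mu, \lambda+\mu)c^{(1)}_{\lambda,\mu} A^{0,0} + \epsilon(-2\mu, \lambda+\mu)c^{(1)}_{\lambda,\mu} B^{0,0}\\
&\qquad + \epsilon(2\mu, \lambda-\mu)c^{(2)}_{\lambda,\mu} A^{0,1} + \epsilon(-2\mu, \lambda-\mu)c^{(2)}_{\lambda,\mu} B^{0,1}\\
&\qquad + \epsilon(2\mu, -\lambda+\mu)c^{(3)}_{\lambda,\mu} A^{1,0} + \epsilon(-2\mu, -\lambda+\mu)c^{(3)}_{\lambda,\mu} B^{1,0}\\
&\qquad + \epsilon(2\mu, -\lambda-\mu)c^{(4)}_{\lambda,\mu} A^{1,1} + \epsilon(-2\mu, -\lambda-\mu)c^{(4)}_{\lambda,\mu} B^{1,1} \bigr).
\end{aligned}
\]
Since $\mu, 3\mu \in S_\mu$, we see that
\[
\phi(Y(E^{2\mu}, z)v) = Y(e_{2\mu}, z)\phi(v) + Y(e_{2\mu}, z)\theta(\phi(v))
= \epsilon(2\mu, \mu)\,\mathcal{Y}_{2\mu,\mu}(e_{2\mu}, z)\phi(v) + \epsilon(2\mu, -\mu)\,\mathcal{Y}_{2\mu,-\mu}(e_{2\mu}, z)\theta(\phi(v)).
\]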
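Thus we get
\[
\begin{aligned}
&(z_1 - z_2)^k\, \mathcal{Y}(u, z_2)Y(E^{2\mu}, z_1)v\\
&= (z_1 - z_2)^k \bigl(
c^{(1)}_{\lambda,3\mu}\epsilon(2\mu, \mu)\,\mathcal{Y}_{\lambda,3\mu}(\phi(u), z_2)\mathcal{Y}_{2\mu,\mu}(e_{2\mu}, z_1)\phi(v)\\
&\qquad + c^{(2)}_{\lambda,3\mu}\epsilon(2\mu, \mu)\,\mathcal{Y}_{\lambda,-3\mu}(\phi(u), z_2)\mathcal{Y}_{-2\mu,-\mu}(e_{-2\mu}, z_1)\theta(\phi(v))\\
&\qquad + c^{(3)}_{\lambda,3\mu}\epsilon(2\mu, \mu)\,\mathcal{Y}_{-\lambda,3\mu}(\theta(\phi(u)), z_2)\mathcal{Y}_{2\mu,\mu}(e_{2\mu}, z_1)\phi(v)\\
&\qquad + c^{(4)}_{\lambda,3\mu}\epsilon(2\mu, \mu)\,\mathcal{Y}_{-\lambda,-3\mu}(\theta(\phi(u)), z_2)\mathcal{Y}_{-2\mu,-\mu}(e_{-2\mu}, z_1)\theta(\phi(v))\\
&\qquad + c^{(1)}_{\lambda,\mu}\epsilon(2\mu, -\mu)\,\mathcal{Y}_{\lambda,\mu}(\phi(u), z_2)\mathcal{Y}_{2\mu,-\mu}(e_{2\mu}, z_1)\theta(\phi(v))\\
&\qquad + c^{(2)}_{\lambda,\mu}\epsilon(2\mu, -\mu)\,\mathcal{Y}_{\lambda,-\mu}(\phi(u), z_2)\mathcal{Y}_{-2\mu,\mu}(e_{-2\mu}, z_1)\phi(v)\\
&\qquad + c^{(3)}_{\lambda,\mu}\epsilon(2\mu, -\mu)\,\mathcal{Y}_{-\lambda,\mu}(\theta(\phi(u)), z_2)\mathcal{Y}_{2\mu,-\mu}(e_{2\mu}, z_1)\theta(\phi(v))\\
&\qquad + c^{(4)}_{\lambda,\mu}\epsilon(2\mu, -\mu)\,\mathcal{Y}_{-\lambda,-\mu}(\theta(\phi(u)), z_2)\mathcal{Y}_{-2\mu,\mu}(e_{-2\mu}, z_1)\phi(v) \bigr)\\
&= (z_1 - z_2)^k \bigl(
c^{(1)}_{\lambda,3\mu}\epsilon(2\mu, \mu)A^{0,0} + c^{(2)}_{\lambda,3\mu}\epsilon(2\mu, \mu)B^{0,1}
+ c^{(3)}_{\lambda,3\mu}\epsilon(2\mu, \mu)A^{1,0} + c^{(4)}_{\lambda,3\mu}\epsilon(2\mu, \mu)B^{1,1}\\
&\qquad + c^{(1)}_{\lambda,\mu}\epsilon(2\mu, -\mu)A^{0,1} + c^{(2)}_{\lambda,\mu}\epsilon(2\mu, -\mu)B^{0,0}
+ c^{(3)}_{\lambda,\mu}\epsilon(2\mu, -\mu)A^{1,1} + c^{(4)}_{\lambda,\mu}\epsilon(2\mu, -\mu)B^{1,0} \bigr).
\end{aligned}
\]
Since $\mathcal{Y}(\,\cdot\,, z)$ is an intertwining operator for $V_L^+$, we have $(z_1 - z_2)^k\, Y(E^{2\mu}, z_1)\mathcal{Y}(u, z_2)v = (z_1 - z_2)^k\, \mathcal{Y}(u, z_2)Y(E^{2\mu}, z_1)v$ for a sufficiently large integer $k$. Therefore, the linear independence of $A^{i,j}$ and $B^{i,j}$ for $i, j = 0, 1$ gives the following equations:
\[
\begin{aligned}
(-1)^{(2\mu,\lambda)}\epsilon(2\mu, \lambda+\mu)\,c^{(1)}_{\lambda,\mu} &= c^{(1)}_{\lambda,3\mu}\,\epsilon(2\mu, \mu), &
(-1)^{(2\mu,\lambda)}\epsilon(-2\mu, \lambda+\mu)\,c^{(1)}_{\lambda,\mu} &= c^{(2)}_{\lambda,\mu}\,\epsilon(2\mu, -\mu),\\
(-1)^{(2\mu,\lambda)}\epsilon(2\mu, \lambda-\mu)\,c^{(2)}_{\lambda,\mu} &= c^{(1)}_{\lambda,\mu}\,\epsilon(2\mu, -\mu), &
(-1)^{(2\mu,\lambda)}\epsilon(-2\mu, \lambda-\mu)\,c^{(2)}_{\lambda,\mu} &= c^{(2)}_{\lambda,3\mu}\,\epsilon(2\mu, \mu),\\
(-1)^{(2\mu,\lambda)}\epsilon(2\mu, -\lambda+\mu)\,c^{(3)}_{\lambda,\mu} &= c^{(3)}_{\lambda,3\mu}\,\epsilon(2\mu, \mu), &
(-1)^{(2\mu,\lambda)}\epsilon(-2\mu, -\lambda+\mu)\,c^{(3)}_{\lambda,\mu} &= c^{(4)}_{\lambda,\mu}\,\epsilon(2\mu, -\mu),\\
(-1)^{(2\mu,\lambda)}\epsilon(2\mu, -\lambda-\mu)\,c^{(4)}_{\lambda,\mu} &= c^{(3)}_{\lambda,\mu}\,\epsilon(2\mu, -\mu), &
(-1)^{(2\mu,\lambda)}\epsilon(-2\mu, -\lambda-\mu)\,c^{(4)}_{\lambda,\mu} &= c^{(4)}_{\lambda,3\mu}\,\epsilon(2\mu, \mu).
\end{aligned}
\]
From these equations we get
\[
c^{(2)}_{\lambda,\mu} = (-1)^{(2\mu,\lambda)}\epsilon(-2\mu, \lambda)\,c^{(1)}_{\lambda,\mu}, \qquad
c^{(4)}_{\lambda,\mu} = (-1)^{(2\mu,\lambda)}\epsilon(2\mu, \lambda)\,c^{(3)}_{\lambda,\mu}.
\]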
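Next we shall apply a similar argument to the associativity
\[
(z_0 + z_2)^k\, Y(E^{2\lambda}, z_0 + z_2)\mathcal{Y}(u, z_2)v = (z_2 + z_0)^k\, \mathcal{Y}(Y(E^{2\lambda}, z_0)u, z_2)v \tag{5.5}
\]
for $u \in V_{\lambda+L}^+[\lambda]$, $v \in V_{\mu+L}^+[\mu]$ and a sufficiently large integer $k$. By using (4.4) we have
\[
\begin{aligned}
&(z_0 + z_2)^k\, Y(E^{2\lambda}, z_0 + z_2)\mathcal{Y}_{\lambda,\mu}(\phi(u), z_2)\phi(v)\\
&= (z_2 + z_0)^k \bigl( \epsilon(2\lambda, \lambda+\mu)\,\mathcal{Y}_{3\lambda,\mu}(\mathcal{Y}_{2\lambda,\lambda}(e_{2\lambda}, z_0)\phi(u), z_2)\phi(v)\\
&\qquad + \epsilon(-2\lambda, \lambda+\mu)\,\mathcal{Y}_{-\lambda,\mu}(\mathcal{Y}_{-2\lambda,\lambda}(e_{-2\lambda}, z_0)\phi(u), z_2)\phi(v) \bigr).
\end{aligned}
\]
Similarly,
\[
\begin{aligned}
&(z_0 + z_2)^k\, Y(E^{2\lambda}, z_0 + z_2)\mathcal{Y}_{\lambda,-\mu}(\phi(u), z_2)\theta(\phi(v))\\
&= (z_2 + z_0)^k \bigl( \epsilon(2\lambda, \lambda-\mu)\,\mathcal{Y}_{3\lambda,-\mu}(\mathcal{Y}_{2\lambda,\lambda}(e_{2\lambda}, z_0)\phi(u), z_2)\theta(\phi(v))\\
&\qquad + \epsilon(-2\lambda, \lambda-\mu)\,\mathcal{Y}_{-\lambda,-\mu}(\mathcal{Y}_{-2\lambda,\lambda}(e_{-2\lambda}, z_0)\phi(u), z_2)\theta(\phi(v)) \bigr),\\
&(z_0 + z_2)^k\, Y(E^{2\lambda}, z_0 + z_2)\mathcal{Y}_{-\lambda,\mu}(\theta(\phi(u)), z_2)\phi(v)\\
&= (z_2 + z_0)^k \bigl( \epsilon(2\lambda, -\lambda+\mu)\,\mathcal{Y}_{\lambda,\mu}(\mathcal{Y}_{2\lambda,-\lambda}(e_{2\lambda}, z_0)\theta(\phi(u)), z_2)\phi(v)\\
&\qquad + \epsilon(-2\lambda, -\lambda+\mu)\,\mathcal{Y}_{-3\lambda,\mu}(\mathcal{Y}_{-2\lambda,-\lambda}(e_{-2\lambda}, z_0)\theta(\phi(u)), z_2)\phi(v) \bigr),\\
&(z_0 + z_2)^k\, Y(E^{2\lambda}, z_0 + z_2)\mathcal{Y}_{-\lambda,-\mu}(\theta(\phi(u)), z_2)\theta(\phi(v))\\
&= (z_2 + z_0)^k \bigl( \epsilon(2\lambda, -\lambda-\mu)\,\mathcal{Y}_{\lambda,-\mu}(\mathcal{Y}_{2\lambda,-\lambda}(e_{2\lambda}, z_0)\theta(\phi(u)), z_2)\theta(\phi(v))\\
&\qquad + \epsilon(-2\lambda, -\lambda-\mu)\,\mathcal{Y}_{-3\lambda,-\mu}(\mathcal{Y}_{-2\lambda,-\lambda}(e_{-2\lambda}, z_0)\theta(\phi(u)), z_2)\theta(\phi(v)) \bigr).
\end{aligned}
\]
Hence,
\[
\begin{aligned}
&(z_0 + z_2)^k\, Y(E^{2\lambda}, z_0 + z_2)\mathcal{Y}(u, z_2)v\\
&= (z_2 + z_0)^k \bigl(
c^{(1)}_{\lambda,\mu}\epsilon(2\lambda, \lambda+\mu)\,\mathcal{Y}_{3\lambda,\mu}(\mathcal{Y}_{2\lambda,\lambda}(e_{2\lambda}, z_0)\phi(u), z_2)\phi(v)\\
&\qquad + c^{(1)}_{\lambda,\mu}\epsilon(-2\lambda, \lambda+\mu)\,\mathcal{Y}_{-\lambda,\mu}(\mathcal{Y}_{-2\lambda,\lambda}(e_{-2\lambda}, z_0)\phi(u), z_2)\phi(v)\\
&\qquad + c^{(2)}_{\lambda,\mu}\epsilon(2\lambda, \lambda-\mu)\,\mathcal{Y}_{3\lambda,-\mu}(\mathcal{Y}_{2\lambda,\lambda}(e_{2\lambda}, z_0)\phi(u), z_2)\theta(\phi(v))\\
&\qquad + c^{(2)}_{\lambda,\mu}\epsilon(-2\lambda, \lambda-\mu)\,\mathcal{Y}_{-\lambda,-\mu}(\mathcal{Y}_{-2\lambda,\lambda}(e_{-2\lambda}, z_0)\phi(u), z_2)\theta(\phi(v))\\
&\qquad + c^{(3)}_{\lambda,\mu}\epsilon(2\lambda, -\lambda+\mu)\,\mathcal{Y}_{\lambda,\mu}(\mathcal{Y}_{2\lambda,-\lambda}(e_{2\lambda}, z_0)\theta(\phi(u)), z_2)\phi(v)\\
&\qquad + c^{(3)}_{\lambda,\mu}\epsilon(-2\lambda, -\lambda+\mu)\,\mathcal{Y}_{-3\lambda,\mu}(\mathcal{Y}_{-2\lambda,-\lambda}(e_{-2\lambda}, z_0)\theta(\phi(u)), z_2)\phi(v)\\
&\qquad + c^{(4)}_{\lambda,\mu}\epsilon(2\lambda, -\lambda-\mu)\,\mathcal{Y}_{\lambda,-\mu}(\mathcal{Y}_{2\lambda,-\lambda}(e_{2\lambda}, z_0)\theta(\phi(u)), z_2)\theta(\phi(v))\\
&\qquad + c^{(4)}_{\lambda,\mu}\epsilon(-2\lambda, -\lambda-\mu)\,\mathcal{Y}_{-3\lambda,-\mu}(\mathcal{Y}_{-2\lambda,-\lambda}(e_{-2\lambda}, z_0)\theta(\phi(u)), z_2)\theta(\phi(v)) \bigr).
\end{aligned}
\]
Now we set
\[
\begin{aligned}
C^{i,j} &= \mathcal{Y}_{(2+(-1)^i)\lambda,\,(-1)^j\mu}\bigl(\mathcal{Y}_{2\lambda,\,(-1)^i\lambda}(e_{2\lambda}, z_0)\theta^i(\phi(u)), z_2\bigr)\theta^j(\phi(v)),\\
D^{i,j} &= \mathcal{Y}_{(-2+(-1)^i)\lambda,\,(-1)^j\mu}\bigl(\mathcal{Y}_{-2\lambda,\,(-1)^i\lambda}(e_{-2\lambda}, z_0)\theta^i(\phi(u)), z_2\bigr)\theta^j(\phi(v))
\end{aligned}
\]
for $i, j = 0, 1$. Since $C^{i,j} \in M(1, (2+(-1)^i)\lambda + (-1)^j\mu)((z_0))((z_2)) \subset V_{\lambda+\mu+L}((z_0))((z_2))$ and $D^{i,j} \in M(1, (-2+(-1)^i)\lambda + (-1)^j\mu)((z_0))((z_2)) \subset V_{\lambda+\mu+L}((z_0))((z_2))$,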
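$C^{i,j}$ and $D^{i,j}$ for $i, j = 0, 1$ are linearly independent in $V_{\lambda+\mu+L}((z_0))((z_2))$. Using $C^{i,j}$ and $D^{i,j}$, we can rewrite the identity above as
\[
\begin{aligned}
&(z_0 + z_2)^k\, Y(E^{2\lambda}, z_0 + z_2)\mathcal{Y}(u, z_2)v\\
&= (z_2 + z_0)^k \bigl(
c^{(1)}_{\lambda,\mu}\epsilon(2\lambda, \lambda+\mu)C^{0,0} + c^{(1)}_{\lambda,\mu}\epsilon(-2\lambda, \lambda+\mu)D^{0,0}
+ c^{(2)}_{\lambda,\mu}\epsilon(2\lambda, \lambda-\mu)C^{0,1} + c^{(2)}_{\lambda,\mu}\epsilon(-2\lambda, \lambda-\mu)D^{0,1}\\
&\qquad + c^{(3)}_{\lambda,\mu}\epsilon(2\lambda, -\lambda+\mu)C^{1,0} + c^{(3)}_{\lambda,\mu}\epsilon(-2\lambda, -\lambda+\mu)D^{1,0}
+ c^{(4)}_{\lambda,\mu}\epsilon(2\lambda, -\lambda-\mu)C^{1,1} + c^{(4)}_{\lambda,\mu}\epsilon(-2\lambda, -\lambda-\mu)D^{1,1} \bigr).
\end{aligned}
\]
As before we note that
\[
\phi(Y(E^{2\lambda}, z_0)u) = \epsilon(2\lambda, \lambda)\,\mathcal{Y}_{2\lambda,\lambda}(e_{2\lambda}, z_0)\phi(u) + \epsilon(2\lambda, -\lambda)\,\mathcal{Y}_{2\lambda,-\lambda}(e_{2\lambda}, z_0)\theta(\phi(u)).
\]
So
\[
\begin{aligned}
&(z_0 + z_2)^k\, \mathcal{Y}(Y(E^{2\lambda}, z_0)u, z_2)v\\
&= (z_0 + z_2)^k \bigl(
c^{(1)}_{3\lambda,\mu}\epsilon(2\lambda, \lambda)\,\mathcal{Y}_{3\lambda,\mu}(\mathcal{Y}_{2\lambda,\lambda}(e_{2\lambda}, z_0)\phi(u), z_2)\phi(v)\\
&\qquad + c^{(2)}_{3\lambda,\mu}\epsilon(2\lambda, \lambda)\,\mathcal{Y}_{3\lambda,-\mu}(\mathcal{Y}_{2\lambda,\lambda}(e_{2\lambda}, z_0)\phi(u), z_2)\theta(\phi(v))\\
&\qquad + c^{(3)}_{3\lambda,\mu}\epsilon(2\lambda, \lambda)\,\mathcal{Y}_{-3\lambda,\mu}(\mathcal{Y}_{-2\lambda,-\lambda}(e_{-2\lambda}, z_0)\theta(\phi(u)), z_2)\phi(v)\\
&\qquad + c^{(4)}_{3\lambda,\mu}\epsilon(2\lambda, \lambda)\,\mathcal{Y}_{-3\lambda,-\mu}(\mathcal{Y}_{-2\lambda,-\lambda}(e_{-2\lambda}, z_0)\theta(\phi(u)), z_2)\theta(\phi(v))\\
&\qquad + c^{(1)}_{\lambda,\mu}\epsilon(2\lambda, -\lambda)\,\mathcal{Y}_{\lambda,\mu}(\mathcal{Y}_{2\lambda,-\lambda}(e_{2\lambda}, z_0)\theta(\phi(u)), z_2)\phi(v)\\
&\qquad + c^{(2)}_{\lambda,\mu}\epsilon(2\lambda, -\lambda)\,\mathcal{Y}_{\lambda,-\mu}(\mathcal{Y}_{2\lambda,-\lambda}(e_{2\lambda}, z_0)\theta(\phi(u)), z_2)\theta(\phi(v))\\
&\qquad + c^{(3)}_{\lambda,\mu}\epsilon(2\lambda, -\lambda)\,\mathcal{Y}_{-\lambda,\mu}(\mathcal{Y}_{-2\lambda,\lambda}(e_{-2\lambda}, z_0)\phi(u), z_2)\phi(v)\\
&\qquad + c^{(4)}_{\lambda,\mu}\epsilon(2\lambda, -\lambda)\,\mathcal{Y}_{-\lambda,-\mu}(\mathcal{Y}_{-2\lambda,\lambda}(e_{-2\lambda}, z_0)\phi(u), z_2)\theta(\phi(v)) \bigr)\\
&= (z_0 + z_2)^k \bigl(
c^{(1)}_{3\lambda,\mu}\epsilon(2\lambda, \lambda)C^{0,0} + c^{(2)}_{3\lambda,\mu}\epsilon(2\lambda, \lambda)C^{0,1}
+ c^{(3)}_{3\lambda,\mu}\epsilon(2\lambda, \lambda)D^{1,0} + c^{(4)}_{3\lambda,\mu}\epsilon(2\lambda, \lambda)D^{1,1}\\
&\qquad + c^{(1)}_{\lambda,\mu}\epsilon(2\lambda, -\lambda)C^{1,0} + c^{(2)}_{\lambda,\mu}\epsilon(2\lambda, -\lambda)C^{1,1}
+ c^{(3)}_{\lambda,\mu}\epsilon(2\lambda, -\lambda)D^{0,0} + c^{(4)}_{\lambda,\mu}\epsilon(2\lambda, -\lambda)D^{0,1} \bigr).
\end{aligned}
\]
Since $C^{i,j}$ and $D^{i,j}$ for $i, j = 0, 1$ are linearly independent, the associativity formula implies the equations
\[
\begin{aligned}
c^{(1)}_{\lambda,\mu}\,\epsilon(2\lambda, \lambda+\mu) &= c^{(1)}_{3\lambda,\mu}\,\epsilon(2\lambda, \lambda), &
c^{(1)}_{\lambda,\mu}\,\epsilon(-2\lambda, \lambda+\mu) &= c^{(3)}_{\lambda,\mu}\,\epsilon(2\lambda, -\lambda),\\
c^{(2)}_{\lambda,\mu}\,\epsilon(2\lambda, \lambda-\mu) &= c^{(2)}_{3\lambda,\mu}\,\epsilon(2\lambda, \lambda), &
c^{(2)}_{\lambda,\mu}\,\epsilon(-2\lambda, \lambda-\mu) &= c^{(4)}_{\lambda,\mu}\,\epsilon(2\lambda, -\lambda),\\
c^{(3)}_{\lambda,\mu}\,\epsilon(2\lambda, -\lambda+\mu) &= c^{(1)}_{\lambda,\mu}\,\epsilon(2\lambda, -\lambda), &
c^{(3)}_{\lambda,\mu}\,\epsilon(-2\lambda, -\lambda+\mu) &= c^{(3)}_{3\lambda,\mu}\,\epsilon(2\lambda, \lambda),\\
c^{(4)}_{\lambda,\mu}\,\epsilon(2\lambda, -\lambda-\mu) &= c^{(2)}_{\lambda,\mu}\,\epsilon(2\lambda, -\lambda), &
c^{(4)}_{\lambda,\mu}\,\epsilon(-2\lambda, -\lambda-\mu) &= c^{(4)}_{3\lambda,\mu}\,\epsilon(2\lambda, \lambda).
\end{aligned}
\]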
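This proves that
\[
c^{(3)}_{\lambda,\mu} = \epsilon(-2\lambda, \mu)\,c^{(1)}_{\lambda,\mu} = \epsilon(-\lambda, 2\mu)\,c^{(1)}_{\lambda,\mu}. \tag{5.6}
\]
Combining (5.6) with (5.5) we see that
\[
\begin{aligned}
\mathcal{Y}(u, z_2)v = c^{(1)}_{\lambda,\mu} \bigl(
&\mathcal{Y}_{\lambda,\mu}(\phi(u), z_2)\phi(v) + (-1)^{(2\mu,\lambda)}\epsilon(-2\mu, \lambda)\,\mathcal{Y}_{\lambda,-\mu}(\phi(u), z_2)\theta(\phi(v))\\
&+ \epsilon(-\lambda, 2\mu)\,\mathcal{Y}_{-\lambda,\mu}(\theta(\phi(u)), z_2)\phi(v) + (-1)^{(2\mu,\lambda)}\,\mathcal{Y}_{-\lambda,-\mu}(\theta(\phi(u)), z_2)\theta(\phi(v)) \bigr).
\end{aligned}
\]
Thus the image of $I_{V_L^+}\binom{V_{\lambda+\mu+L}}{V_{\lambda+L}^+\ V_{\mu+L}^+}$ in $I_{M(1)^+}\binom{V_{\lambda+\mu+L}}{V_{\lambda+L}^+[\lambda]\ V_{\mu+L}^+[\mu]}$ is spanned by one intertwining operator; in particular, its dimension is one. Hence the fusion rule of type $\binom{V_{\lambda+\mu+L}}{V_{\lambda+L}^+\ V_{\mu+L}^+}$ is at most one. This completes the proof.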
In view of Propositions 2.7, 5.3, 5.4, 5.5 and 5.6 we immediately have:

Proposition 5.7. Let $M^i$ ($i = 1, 2, 3$) be irreducible $V_L^+$-modules of untwisted type. Then the fusion rule of type $\binom{M^3}{M^1\ M^2}$ is either 0 or 1. The fusion rule of type $\binom{M^3}{M^1\ M^2}$ is 1 if and only if $M^i$ ($i = 1, 2, 3$) satisfy one of the following conditions:

(i) $M^1 = V_{\lambda+L}$ for $\lambda \in L^\circ$ such that $2\lambda \notin L$, and $(M^2, M^3)$ is one of the following pairs:
$(V_{\mu+L}, V_{\nu+L})$ for $\mu, \nu \in L^\circ$ such that $2\mu, 2\nu \notin L$ and $(\lambda, \mu, \nu)$ is an admissible triple modulo $L$,
$(V_{\mu+L}^{\pm}, V_{\nu+L})$, $((V_{\nu+L})', (V_{\mu+L}^{\pm})')$ for $\mu, \nu \in L^\circ$ such that $2\mu \in L$ and $(\lambda, \mu, \nu)$ is an admissible triple modulo $L$.

(ii) $M^1 = V_{\lambda+L}^{+}$ for $\lambda \in L^\circ$ such that $2\lambda \in L$, and $(M^2, M^3)$ is one of the following pairs:
$(V_{\mu+L}, V_{\nu+L})$ for $\mu, \nu \in L^\circ$ such that $2\mu \notin L$ and $(\lambda, \mu, \nu)$ is an admissible triple modulo $L$,
$(V_{\mu+L}^{\pm}, V_{\nu+L}^{\pm})$ for $\mu, \nu \in L^\circ$ such that $2\mu \in L$, $\pi_{\lambda,2\mu} = 1$ and $(\lambda, \mu, \nu)$ is an admissible triple modulo $L$,
$(V_{\mu+L}^{\pm}, V_{\nu+L}^{\mp})$ for $\mu, \nu \in L^\circ$ such that $2\mu \in L$, $\pi_{\lambda,2\mu} = -1$ and $(\lambda, \mu, \nu)$ is an admissible triple modulo $L$.

(iii) $M^1 = V_{\lambda+L}^{-}$ for $\lambda \in L^\circ$ such that $2\lambda \in L$, and $(M^2, M^3)$ is one of the following pairs:
$(V_{\mu+L}, V_{\nu+L})$ for $\mu, \nu \in L^\circ$ such that $2\mu \notin L$ and $(\lambda, \mu, \nu)$ is an admissible triple modulo $L$,
$(V_{\mu+L}^{\pm}, V_{\nu+L}^{\mp})$ for $\mu, \nu \in L^\circ$ such that $2\mu \in L$, $\pi_{\lambda,2\mu} = 1$ and $(\lambda, \mu, \nu)$ is an admissible triple modulo $L$,
$(V_{\mu+L}^{\pm}, V_{\nu+L}^{\pm})$ for $\mu, \nu \in L^\circ$ such that $2\mu \in L$, $\pi_{\lambda,2\mu} = -1$ and $(\lambda, \mu, \nu)$ is an admissible triple modulo $L$.
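5.3. Fusion rules involving modules of twisted type. In this section we construct nonzero intertwining operators among irreducible $V_L^+$-modules involving modules of twisted type. We use $\chi$ for a central character of $\hat{L}/K$ with $\chi(\kappa) = -1$ and use $T_\chi$ to denote the corresponding irreducible $\hat{L}/K$-module.

Let $\lambda \in L^\circ$. We define an automorphism $\sigma_\lambda$ of $\hat{L}$ by
\[
\sigma_\lambda(a) = \kappa^{(\lambda, \bar{a})} a
\]
for any $a \in \hat{L}$. Since $\sigma_\lambda(\theta(a)) = \theta(\sigma_\lambda(a))$, $\sigma_\lambda$ stabilizes $K$. Hence $\sigma_\lambda$ induces an automorphism of $\hat{L}/K$.

For any $\hat{L}/K$-module $T$ we denote by $T \circ \sigma_\lambda$ the $\hat{L}/K$-module twisted by $\sigma_\lambda$. That is, $T \circ \sigma_\lambda = T$ as vector spaces but with a new action defined by $a.t = \sigma_\lambda(a)t$ for $a \in \hat{L}/K$ and $t \in T$. Since $T_\chi \circ \sigma_\lambda$ is also irreducible, there is a unique central character $\chi^{(\lambda)}$ of $\hat{L}/K$ (with $\chi^{(\lambda)}(\kappa) = -1$) such that $T_\chi \circ \sigma_\lambda \cong T_{\chi^{(\lambda)}}$. Let $f$ be an $\hat{L}/K$-module isomorphism $T_\chi \circ \sigma_\lambda \to T_{\chi^{(\lambda)}}$. Then $f$ is a linear isomorphism from $T_\chi$ to $T_{\chi^{(\lambda)}}$ satisfying
\[
f(\sigma_\lambda(a)t) = a f(t) \tag{5.7}
\]
for $a \in \hat{L}/K$ and $t \in T_\chi$.

We now fix $\lambda \in L^\circ$ and an $\hat{L}/K$-module isomorphism $f : T_\chi \circ \sigma_\lambda \to T_{\chi^{(\lambda)}}$. For any $\alpha \in L$, we define a linear isomorphism $\eta_{\lambda+\alpha} : T_\chi \circ \sigma_\lambda \to T_{\chi^{(\lambda)}}$ by $\eta_{\lambda+\alpha} = \epsilon(-\alpha, \lambda)\, e_\alpha \circ f$, where we write $\epsilon(\mu, \nu) = \omega_q^{c_0(\mu,\nu)}$ for $\mu, \nu \in L^\circ$ as before. Then we have a linear isomorphism
\[
\eta_\gamma : T_\chi \to T_{\chi^{(\lambda)}}
\]
for any $\gamma \in \lambda + L$.

Lemma 5.8. For any $\gamma \in \lambda + L$ and $\alpha \in L$,
\[
e_\alpha \circ \eta_\gamma = (-1)^{(\alpha,\gamma)}\, \eta_\gamma \circ e_\alpha, \tag{5.8}
\]
\[
e_\alpha \circ \eta_\gamma = \epsilon(\alpha, \gamma)\,\eta_{\gamma+\alpha} = \epsilon(-\alpha, \gamma)\,\eta_{\gamma-\alpha}. \tag{5.9}
\]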
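Proof. Set $\beta = \gamma - \lambda \in L$. Since $e_\alpha \circ f = (-1)^{(\alpha,\lambda)} f \circ e_\alpha$ and $e_\alpha e_\beta = (-1)^{(\alpha,\beta)} e_\beta e_\alpha$, we have $e_\alpha \circ \eta_\gamma = (-1)^{(\alpha,\gamma)} \eta_\gamma \circ e_\alpha$. This proves (5.8). By definition we have
\[
e_\alpha \circ \eta_\gamma = \epsilon(-\beta, \lambda)\, e_\alpha \circ e_\beta \circ f = \epsilon(-\beta, \lambda)\epsilon(\alpha, \beta)\, e_{\alpha+\beta} \circ f = \epsilon(-\beta, \lambda)\epsilon(\alpha, \beta)\epsilon(\alpha+\beta, \lambda)\,\eta_{\gamma+\alpha} = \epsilon(\alpha, \gamma)\,\eta_{\gamma+\alpha}.
\]
Thus the first equality in (5.9) holds. The second equality in (5.9) follows from the fact that $e_{-\alpha} = \theta(e_\alpha) = e_\alpha$ on $T_\chi$.

Remark 5.9. In the case $L = \mathbb{Z}\alpha$ of rank one, there are two irreducible $\hat{L}/K$-modules $T^1$, $T^2$ on which $e_\alpha$ acts as $1$ and $-1$ respectively. Then for any $\lambda = \frac{r}{|\alpha|^2}\alpha \in L^\circ$, $\eta_\lambda$ stabilizes $T^i$ for $i = 1, 2$ if $r$ is even and switches $T^1$ and $T^2$ if $r$ is odd. Thus the map $\eta_\lambda$ coincides with $\psi_\lambda$ in [A2] up to a scalar multiple.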
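The parity statement in Remark 5.9 can be checked with a few lines of code. The sketch below is only an illustration under stated assumptions: it uses the relation $e_\alpha \circ f = (-1)^{(\alpha,\lambda)} f \circ e_\alpha$ from the proof of Lemma 5.8 together with $(\alpha, \lambda) = r$ for $\lambda = r\alpha/|\alpha|^2$. The function names `eigenvalue_after_twist` and `target_module` are hypothetical and do not come from the paper.

```python
def eigenvalue_after_twist(r: int, s: int) -> int:
    """Eigenvalue of e_alpha on the image f(T), given eigenvalue s on T.

    Assumes e_alpha f(t) = (-1)^r f(e_alpha t), where r = (alpha, lambda).
    """
    if s not in (1, -1):
        raise ValueError("s must be +1 or -1")
    return s if r % 2 == 0 else -s


def target_module(r: int, i: int) -> int:
    """Index j in {1, 2} such that eta_lambda sends T^i to T^j (rank-one case)."""
    if i not in (1, 2):
        raise ValueError("i must be 1 or 2")
    s = 1 if i == 1 else -1  # e_alpha acts as +1 on T^1 and as -1 on T^2
    return 1 if eigenvalue_after_twist(r, s) == 1 else 2
```

For even `r` the function returns `i` itself, and for odd `r` it returns the other index, matching the behaviour described in the remark.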
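Let $\lambda \in L^\circ$. Recall the operators $\mathcal{Y}_{\lambda,\mu}(\,\cdot\,, z)$ and $\mathcal{Y}^{tw}_\lambda(\,\cdot\,, z)$ defined in (4.3) and (4.7). Following the arguments in [FLM, Chapter 9], we have the following identity for any $\alpha \in L$, $\lambda \in L^\circ$, $a \in M(1, \alpha)$ and $u \in M(1, \lambda)$:
\[
\begin{aligned}
& z_0^{-1}\delta\!\left(\frac{z_1 - z_2}{z_0}\right) \mathcal{Y}^{tw}_\alpha(a, z_1)\mathcal{Y}^{tw}_\lambda(u, z_2)
- (-1)^{(\alpha,\lambda)} z_0^{-1}\delta\!\left(\frac{z_2 - z_1}{-z_0}\right) \mathcal{Y}^{tw}_\lambda(u, z_2)\mathcal{Y}^{tw}_\alpha(a, z_1)\\
&\quad = \frac{1}{2}\sum_{p=0,1} z_2^{-1}\delta\!\left((-1)^p \frac{(z_1 - z_0)^{1/2}}{z_2^{1/2}}\right) \mathcal{Y}^{tw}_{\lambda+(-1)^p\alpha}\bigl(\mathcal{Y}_{(-1)^p\alpha,\lambda}(\theta^p(a), z_0)u, z_2\bigr)
\end{aligned}
\tag{5.10}
\]
and
\[
\mathcal{Y}^{tw}_\lambda(L(-1)u, z) = \frac{d}{dz}\mathcal{Y}^{tw}_\lambda(u, z) \tag{5.11}
\]
on $M(1)(\theta)$.

Now we define an operator $\tilde{\mathcal{Y}}^{tw}_\lambda(u, z)$ from $V_L^{T_\chi}$ to $V_L^{T_{\chi^{(\lambda)}}}$ by
\[
\tilde{\mathcal{Y}}^{tw}_\lambda(u, z) = \mathcal{Y}^{tw}_{\lambda+\beta}(u, z) \otimes \eta_{\lambda+\beta} \tag{5.12}
\]
for any $u \in M(1, \lambda+\beta) \subset V_{\lambda+L}$. So we have a linear map
\[
\tilde{\mathcal{Y}}^{tw}_\lambda(\,\cdot\,, z) : V_{\lambda+L} \to \mathrm{Hom}\bigl(V_L^{T_\chi}, V_L^{T_{\chi^{(\lambda)}}}\bigr)\{z\}.
\]
We remark that if $\lambda = 0$ then $T_\chi = T_{\chi^{(\lambda)}}$, $\eta_{\lambda+\alpha} = e_\alpha$, and $\tilde{\mathcal{Y}}^{tw}_\lambda(a, z) = \mathcal{Y}^{tw}_\alpha(a, z) \otimes e_\alpha$ is exactly the twisted vertex operator $Y(a, z)$ associated to $a \in M(1, \alpha) \subset V_L$ which defines the twisted module structure on $V_L^{T_\chi}$ (see [FLM]).

Proposition 5.10. Let $\lambda \in L^\circ$ and let $\chi$ be a central character of $\hat{L}/K$ with $\chi(\kappa) = -1$. Then for any $a \in V_L$ and $u \in V_{\lambda+L}$, the identities
\[
\begin{aligned}
& z_0^{-1}\delta\!\left(\frac{z_1 - z_2}{z_0}\right) Y(a, z_1)\tilde{\mathcal{Y}}^{tw}_\lambda(u, z_2)
- z_0^{-1}\delta\!\left(\frac{z_2 - z_1}{-z_0}\right) \tilde{\mathcal{Y}}^{tw}_\lambda(u, z_2)Y(a, z_1)\\
&\quad = \frac{1}{2}\sum_{p=0,1} z_2^{-1}\delta\!\left((-1)^p \frac{(z_1 - z_0)^{1/2}}{z_2^{1/2}}\right) \tilde{\mathcal{Y}}^{tw}_\lambda\bigl(Y(\theta^p(a), z_0)u, z_2\bigr)
\end{aligned}
\]
and
\[
\frac{d}{dz}\tilde{\mathcal{Y}}^{tw}_\lambda(u, z) = \tilde{\mathcal{Y}}^{tw}_\lambda(L(-1)u, z)
\]
hold on $V_L^{T_\chi}$. In particular, $\tilde{\mathcal{Y}}^{tw}_\lambda(\,\cdot\,, z)$ is an intertwining operator of type $\binom{V_L^{T_{\chi^{(\lambda)}}}}{V_{\lambda+L}\ \ V_L^{T_\chi}}$ for $V_L^+$.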
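Proof. By (5.8)–(5.10) we see that for any $a \in M(1, \alpha) \subset V_L$ and $u \in M(1, \gamma)$ with $\gamma \in \lambda + L$,
\[
\begin{aligned}
& z_0^{-1}\delta\!\left(\frac{z_1 - z_2}{z_0}\right) Y(a, z_1)\tilde{\mathcal{Y}}^{tw}_\lambda(u, z_2)
- z_0^{-1}\delta\!\left(\frac{z_2 - z_1}{-z_0}\right) \tilde{\mathcal{Y}}^{tw}_\lambda(u, z_2)Y(a, z_1)\\
&= z_0^{-1}\delta\!\left(\frac{z_1 - z_2}{z_0}\right) \bigl(\mathcal{Y}^{tw}_\alpha(a, z_1) \otimes e_\alpha\bigr)\bigl(\mathcal{Y}^{tw}_\gamma(u, z_2) \otimes \eta_\gamma\bigr)
- z_0^{-1}\delta\!\left(\frac{z_2 - z_1}{-z_0}\right) \bigl(\mathcal{Y}^{tw}_\gamma(u, z_2) \otimes \eta_\gamma\bigr)\bigl(\mathcal{Y}^{tw}_\alpha(a, z_1) \otimes e_\alpha\bigr)\\
&= z_0^{-1}\delta\!\left(\frac{z_1 - z_2}{z_0}\right) \mathcal{Y}^{tw}_\alpha(a, z_1)\mathcal{Y}^{tw}_\gamma(u, z_2) \otimes e_\alpha \circ \eta_\gamma
- z_0^{-1}\delta\!\left(\frac{z_2 - z_1}{-z_0}\right) \mathcal{Y}^{tw}_\gamma(u, z_2)\mathcal{Y}^{tw}_\alpha(a, z_1) \otimes \eta_\gamma \circ e_\alpha\\
&= \left( z_0^{-1}\delta\!\left(\frac{z_1 - z_2}{z_0}\right) \mathcal{Y}^{tw}_\alpha(a, z_1)\mathcal{Y}^{tw}_\gamma(u, z_2)
- (-1)^{(\alpha,\gamma)} z_0^{-1}\delta\!\left(\frac{z_2 - z_1}{-z_0}\right) \mathcal{Y}^{tw}_\gamma(u, z_2)\mathcal{Y}^{tw}_\alpha(a, z_1) \right) \otimes e_\alpha \circ \eta_\gamma\\
&= \frac{1}{2} z_2^{-1}\delta\!\left(\frac{(z_1 - z_0)^{1/2}}{z_2^{1/2}}\right) \mathcal{Y}^{tw}_{\gamma+\alpha}\bigl(\mathcal{Y}_{\alpha,\gamma}(a, z_0)u, z_2\bigr) \otimes \epsilon(\alpha, \gamma)\,\eta_{\gamma+\alpha}\\
&\quad + \frac{1}{2} z_2^{-1}\delta\!\left(-\frac{(z_1 - z_0)^{1/2}}{z_2^{1/2}}\right) \mathcal{Y}^{tw}_{\gamma-\alpha}\bigl(\mathcal{Y}_{-\alpha,\gamma}(\theta(a), z_0)u, z_2\bigr) \otimes \epsilon(-\alpha, \gamma)\,\eta_{\gamma-\alpha}\\
&= \frac{1}{2} z_2^{-1}\delta\!\left(\frac{(z_1 - z_0)^{1/2}}{z_2^{1/2}}\right) \mathcal{Y}^{tw}_{\gamma+\alpha}\bigl(Y(a, z_0)u, z_2\bigr) \otimes \eta_{\gamma+\alpha}
+ \frac{1}{2} z_2^{-1}\delta\!\left(-\frac{(z_1 - z_0)^{1/2}}{z_2^{1/2}}\right) \mathcal{Y}^{tw}_{\gamma-\alpha}\bigl(Y(\theta(a), z_0)u, z_2\bigr) \otimes \eta_{\gamma-\alpha}\\
&= \frac{1}{2}\sum_{p=0,1} z_2^{-1}\delta\!\left((-1)^p\frac{(z_1 - z_0)^{1/2}}{z_2^{1/2}}\right) \tilde{\mathcal{Y}}^{tw}_\lambda\bigl(Y(\theta^p(a), z_0)u, z_2\bigr).
\end{aligned}
\]
It follows from (5.11) that $\tilde{\mathcal{Y}}^{tw}_\lambda(\,\cdot\,, z)$ satisfies the $L(-1)$-derivative property. The last assertion is clear.

We recall the canonical projection $p_\pm : M(1)(\theta) \to M(1)(\theta)^\pm$ and the canonical injection $\iota_\pm : M(1)(\theta)^\pm \to M(1)(\theta)$. We then have the projection $p_\pm \otimes \mathrm{id} : V_L^T \to V_L^{T,\pm}$ and the injection $\iota_\pm \otimes \mathrm{id} : V_L^{T,\pm} \to V_L^T$ for any irreducible $\hat{L}/K$-module $T$ on which $\kappa = -1$, noting that $V_L^T = M(1)(\theta) \otimes T$. We also denote these maps by $p_\pm$ and $\iota_\pm$ respectively. Let $\epsilon_1, \epsilon_2 \in \{\pm\}$ and $\lambda \in L^\circ$. It is clear from the definition that the restriction $p_{\epsilon_2} \circ \tilde{\mathcal{Y}}^{tw}_\lambda(\,\cdot\,, z) \circ \iota_{\epsilon_1}$ is a nonzero intertwining operator of type $\binom{V_L^{T_{\chi^{(\lambda)}},\epsilon_2}}{V_{\lambda+L}\ \ V_L^{T_\chi,\epsilon_1}}$ for $V_L^+$. Thus we have: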
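Proposition 5.11. For any $\lambda \in L^\circ$, the fusion rules of types
\[
\binom{V_L^{T_{\chi^{(\lambda)}},\pm}}{V_{\lambda+L}\ \ V_L^{T_\chi,\pm}} \quad\text{and}\quad \binom{V_L^{T_{\chi^{(\lambda)}},\mp}}{V_{\lambda+L}\ \ V_L^{T_\chi,\pm}}
\]
for $V_L^+$ are nonzero.

We now consider the case $2\lambda \in L$. Let $\tilde{\mathcal{Y}}^{tw}_\lambda(\,\cdot\,, z)$ be the intertwining operator of type $\binom{V_L^{T_{\chi^{(\lambda)}}}}{V_{\lambda+L}\ \ V_L^{T_\chi}}$ defined in (5.12). By the conjugation formula (4.10), one has
\[
\theta\,\tilde{\mathcal{Y}}^{tw}_\lambda(u, z)\,\theta^{-1}(v \otimes t) = \bigl(\mathcal{Y}^{tw}_{-\lambda-\alpha}(\theta(u), z)v\bigr) \otimes \eta_{\lambda+\alpha}(t) \tag{5.13}
\]
for $\alpha \in L$, $u \in M(1, \lambda+\alpha)$, $v \in M(1)(\theta)$ and $t \in T_\chi$. By (5.9),
\[
\begin{aligned}
\eta_{-\lambda-\alpha}(t) &= \epsilon(2\lambda+\alpha, \lambda)\, e_{-2\lambda-\alpha}\eta_\lambda(t)\\
&= \epsilon(2\lambda+\alpha, \lambda)\epsilon(\alpha, 2\lambda)(-1)^{(\alpha,2\lambda)}\, e_{-2\lambda}e_{-\alpha}\eta_\lambda(t)\\
&= \epsilon(2\lambda, \lambda)\epsilon(\alpha, 3\lambda)\, e_{2\lambda}e_\alpha\eta_\lambda(t)\\
&= \epsilon(2\lambda, \lambda)\epsilon(\alpha, 4\lambda)\, e_{2\lambda}\eta_{\lambda+\alpha}(t)\\
&= \epsilon(2\lambda, \lambda)\, e_{2\lambda}\eta_{\lambda+\alpha}(t).
\end{aligned}
\]
Note that $e_{2\lambda}$ is in the center of $\hat{L}$, as $(2\lambda, \beta) \in 2\mathbb{Z}$ for any $\beta \in L$. Therefore, $e_{2\lambda}$ acts on $T_{\chi^{(\lambda)}}$ by the scalar $\chi^{(\lambda)}(e_{2\lambda}) = (-1)^{(\lambda,2\lambda)}\chi(e_{2\lambda}) = \chi(e_{2\lambda})$. Hence
\[
\eta_{-\lambda-\alpha}(t) = \epsilon(2\lambda, \lambda)(-1)^{(\lambda,2\lambda)}\chi(e_{2\lambda})\,\eta_{\lambda+\alpha}(t) = c_\chi(\lambda)\,\eta_{\lambda+\alpha}(t), \tag{5.14}
\]
where $c_\chi(\lambda)$ is the constant defined in (5.2). It follows from (5.13) and (5.14) that
\[
\theta\,\tilde{\mathcal{Y}}^{tw}_\lambda(u, z)\,\theta^{-1} w = c_\chi(\lambda)^{-1}\,\tilde{\mathcal{Y}}^{tw}_\lambda(\theta(u), z)w \tag{5.15}
\]
for any $u \in V_{\lambda+L}^{\epsilon_1}$ and $w \in V_L^{T_\chi,\epsilon_2}$. It is clear that $c_\chi(\lambda)$ depends on $\lambda$ only modulo $L$. Recall that $c_\chi(\lambda) \in \{\pm 1\}$. We have the following proposition:

Proposition 5.12. Let $\chi$ be a central character of $\hat{L}/K$ such that $\chi(\kappa) = -1$. For any $\lambda \in L^\circ$ with $2\lambda \in L$, the fusion rules of types
\[
\binom{V_L^{T_{\chi^{(\lambda)}},\pm}}{V_{\lambda+L}^{+}\ \ V_L^{T_\chi,\pm}} \quad\text{and}\quad \binom{V_L^{T_{\chi^{(\lambda)}},\mp}}{V_{\lambda+L}^{-}\ \ V_L^{T_\chi,\pm}}
\]
are nonzero if $c_\chi(\lambda) = 1$, and the fusion rules of types
\[
\binom{V_L^{T_{\chi^{(\lambda)}},\mp}}{V_{\lambda+L}^{+}\ \ V_L^{T_\chi,\pm}} \quad\text{and}\quad \binom{V_L^{T_{\chi^{(\lambda)}},\pm}}{V_{\lambda+L}^{-}\ \ V_L^{T_\chi,\pm}}
\]
are nonzero if $c_\chi(\lambda) = -1$.

We are now in a position to prove that the fusion rules of type $\binom{M^3}{M^1\ M^2}$ are at most 1 if one of $M^1$, $M^2$ and $M^3$ is of twisted type.

Proposition 5.13. Let $M^1$, $M^2$ and $M^3$ be irreducible $V_L^+$-modules. The fusion rule of type $\binom{M^3}{M^1\ M^2}$ is 0 if all three or exactly one of $M^1$, $M^2$ and $M^3$ is of twisted type.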
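Proof. First we consider the case that one of $M^1$, $M^2$ and $M^3$ is of twisted type and the others are of untwisted type. In view of Propositions 2.7 and 3.7, we may assume that $M^1$ and $M^2$ are modules of untwisted type. Then there exist $\lambda, \mu \in L^\circ$ such that $M^1$ and $M^2$ contain irreducible $M(1)^+$-submodules isomorphic to $M(1, \lambda)$ and $M(1, \mu)$, respectively. By Proposition 2.9, the fusion rule of type $\binom{M^3}{M^1\ M^2}$ is less than or equal to the fusion rule of type $\binom{M^3}{M(1,\lambda)\ M(1,\mu)}$ for $M(1)^+$. Since $M^3$ is a module of twisted type and is a direct sum of irreducible $M(1)^+$-modules isomorphic to $M(1)(\theta)^+$ or $M(1)(\theta)^-$, the fusion rule of type $\binom{M^3}{M^1\ M^2}$ is 0 by Theorem 4.7.

Next we consider the case that all of $M^1$, $M^2$ and $M^3$ are of twisted type. Then each $M^i$ is a direct sum of copies of $M(1)(\theta)^+$ or $M(1)(\theta)^-$. Proposition 2.9 and Theorem 4.7 show that the fusion rule of type $\binom{M^3}{M^1\ M^2}$ is 0.

Let $\chi_1$ and $\chi_2$ be central characters of $\hat{L}/K$ such that $\chi_i(\kappa) = -1$, and let $M$ be an irreducible $V_L^+$-module of untwisted type. We shall prove that for any $\epsilon_1, \epsilon_2 \in \{\pm\}$, the fusion rule of type $\binom{V_L^{T_{\chi_2},\epsilon_2}}{M\ \ V_L^{T_{\chi_1},\epsilon_1}}$ is 0 if $\chi_2 \neq \chi_1^{(\lambda)}$.

Suppose that the fusion rule of type $\binom{V_L^{T_{\chi_2},\epsilon_2}}{M\ \ V_L^{T_{\chi_1},\epsilon_1}}$ is nonzero, and let $\mathcal{Y}(\,\cdot\,, z)$ be a nonzero intertwining operator of the corresponding type. Since $M$ is an irreducible $V_L^+$-module of untwisted type, there is an $M(1)^+$-submodule $W$ isomorphic to $M(1, \lambda)$ for some $\lambda \in L^\circ$. Let $\xi$ be an $M(1)^+$-module isomorphism from $W$ to $M(1, \lambda)$, and define $\tilde{\mathcal{Y}}(\,\cdot\,, z)$ by
\[
\tilde{\mathcal{Y}}(u, z)v = \mathcal{Y}(\xi^{-1}(u), z)v
\]
for $u \in M(1, \lambda)$ and $v \in V_L^{T_{\chi_1},\epsilon_1}$. Then $\tilde{\mathcal{Y}}(\,\cdot\,, z)$ is a nonzero intertwining operator of type $\binom{V_L^{T_{\chi_2},\epsilon_2}}{M(1,\lambda)\ \ V_L^{T_{\chi_1},\epsilon_1}}$ for $M(1)^+$. Since $V_L^{T_{\chi_i},\epsilon_i} \cong M(1)(\theta)^{\epsilon_i} \otimes T_{\chi_i}$ as $M(1)^+$-modules, we have the following isomorphism of vector spaces:
\[
I_{M(1)^+}\binom{V_L^{T_{\chi_2},\epsilon_2}}{M(1,\lambda)\ \ V_L^{T_{\chi_1},\epsilon_1}} \cong I_{M(1)^+}\binom{M(1)(\theta)^{\epsilon_2}}{M(1,\lambda)\ \ M(1)(\theta)^{\epsilon_1}} \otimes \mathrm{Hom}_{\mathbb{C}}(T_{\chi_1}, T_{\chi_2}). \tag{5.16}
\]
We recall that $I_{M(1)^+}\binom{M(1)(\theta)^{\epsilon_2}}{M(1,\lambda)\ \ M(1)(\theta)^{\epsilon_1}}$ is one-dimensional and has a basis $p_{\epsilon_2} \circ \mathcal{Y}^{tw}_\lambda(\,\cdot\,, z) \circ \iota_{\epsilon_1}$. By using (5.16) we see that there exists $f_\lambda \in \mathrm{Hom}(T_{\chi_1}, T_{\chi_2})$ such that
\[
\tilde{\mathcal{Y}}(u, z)(v \otimes t) = p_{\epsilon_2}\bigl(\mathcal{Y}^{tw}_\lambda(u, z)\iota_{\epsilon_1}(v)\bigr) \otimes f_\lambda(t) \tag{5.17}
\]
for any $u \in M(1, \lambda)$, $v \in M(1)(\theta)^{\epsilon_1}$ and $t \in T_{\chi_1}$. The vertex operator $Y(a, z)$ associated to $a \in V_L^+[\alpha]$ acts on $V_L^{T_{\chi_2},\epsilon_2}$ as $Y(a, z) = \bigl(\mathcal{Y}^{tw}_\alpha(b, z) + \mathcal{Y}^{tw}_{-\alpha}(\theta(b), z)\bigr) \otimes e_\alpha$, where $a = b + \theta(b)$ with $b \in M(1, \alpha)$. Thus we have
\[
Y(a, z_1)\mathcal{Y}(u, z_2)(v \otimes t) = p_{\epsilon_2}\Bigl(\bigl(\mathcal{Y}^{tw}_\alpha(b, z_1) + \mathcal{Y}^{tw}_{-\alpha}(\theta(b), z_1)\bigr)\mathcal{Y}^{tw}_\lambda(\xi(u), z_2)\iota_{\epsilon_1}(v)\Bigr) \otimes e_\alpha f_\lambda(t) \tag{5.18}
\]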
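for any $u \in W$, $v \in M(1)(\theta)^{\epsilon_1}$ and $t \in T_{\chi_1}$. Similarly, we get
\[
\mathcal{Y}(u, z_2)Y(a, z_1)(v \otimes t) = p_{\epsilon_2}\Bigl(\mathcal{Y}^{tw}_\lambda(\xi(u), z_2)\bigl(\mathcal{Y}^{tw}_\alpha(b, z_1) + \mathcal{Y}^{tw}_{-\alpha}(\theta(b), z_1)\bigr)\iota_{\epsilon_1}(v)\Bigr) \otimes f_\lambda(e_\alpha t). \tag{5.19}
\]
From (5.10), we see that for a sufficiently large integer $k$,
\[
(z_1 - z_2)^k\, \mathcal{Y}^{tw}_{\pm\alpha}(b, z_1)\mathcal{Y}^{tw}_\lambda(\xi(u), z_2)v = (-1)^{(\alpha,\lambda)}(z_1 - z_2)^k\, \mathcal{Y}^{tw}_\lambda(\xi(u), z_2)\mathcal{Y}^{tw}_{\pm\alpha}(b, z_1)v
\]
for $b \in M(1, \pm\alpha)$, $u \in W$ and $v \in M(1)(\theta)^{\epsilon_1}$, respectively. Therefore, (5.18) and (5.19) show that
\[
\begin{aligned}
&(z_1 - z_2)^k\, \mathcal{Y}(u, z_2)Y(a, z_1)(v \otimes t)\\
&= (z_1 - z_2)^k\, p_{\epsilon_2}\Bigl(\mathcal{Y}^{tw}_\lambda(\xi(u), z_2)\bigl(\mathcal{Y}^{tw}_\alpha(b, z_1) + \mathcal{Y}^{tw}_{-\alpha}(\theta(b), z_1)\bigr)\iota_{\epsilon_1}(v)\Bigr) \otimes f_\lambda(e_\alpha t)\\
&= (-1)^{(\alpha,\lambda)}(z_1 - z_2)^k\, p_{\epsilon_2}\Bigl(\bigl(\mathcal{Y}^{tw}_\alpha(b, z_1) + \mathcal{Y}^{tw}_{-\alpha}(\theta(b), z_1)\bigr)\mathcal{Y}^{tw}_\lambda(\xi(u), z_2)\iota_{\epsilon_1}(v)\Bigr) \otimes f_\lambda(e_\alpha t)\\
&= (-1)^{(\alpha,\lambda)}(z_1 - z_2)^k\, Y(a, z_1)\mathcal{Y}(u, z_2)\bigl(v \otimes f_\lambda^{-1}((e_\alpha)^{-1} f_\lambda(e_\alpha t))\bigr).
\end{aligned}
\tag{5.20}
\]
Since $\mathcal{Y}(\,\cdot\,, z)$ is an intertwining operator for $V_L^+$, we have $(z_1 - z_2)^k\, \mathcal{Y}(u, z_2)Y(a, z_1)(v \otimes t) = (z_1 - z_2)^k\, Y(a, z_1)\mathcal{Y}(u, z_2)(v \otimes t)$ for large $k$. Thus (5.18), (5.20) and Proposition 2.8 imply the identity
\[
e_\alpha f_\lambda(t) = (-1)^{(\lambda,\alpha)} f_\lambda(e_\alpha t) = f_\lambda(\sigma_\lambda(e_\alpha)(t)) \tag{5.21}
\]
for any $\alpha \in L$ and $t \in T_{\chi_1}$. Therefore, $f_\lambda \in \mathrm{Hom}_{\hat{L}/K}(T_{\chi_1^{(\lambda)}}, T_{\chi_2})$. Consequently, we see that there exists an injective linear map
\[
I_{V_L^+}\binom{V_L^{T_{\chi_2},\epsilon_2}}{M\ \ V_L^{T_{\chi_1},\epsilon_1}} \to I_{M(1)^+}\binom{M(1)(\theta)^{\epsilon_2}}{M(1,\lambda)\ \ M(1)(\theta)^{\epsilon_1}} \otimes \mathrm{Hom}_{\hat{L}/K}(T_{\chi_1^{(\lambda)}}, T_{\chi_2}). \tag{5.22}
\]
We have $\dim_{\mathbb{C}} \mathrm{Hom}_{\hat{L}/K}(T_{\chi_1^{(\lambda)}}, T_{\chi_2}) = \delta_{\chi_1^{(\lambda)},\chi_2}$. Hence the dimension of the right-hand side in (5.22) is less than or equal to 1 by Theorem 4.7. We obtain the following proposition:

Proposition 5.14. Let $M$ be an irreducible $V_L^+$-module containing an $M(1)^+$-submodule isomorphic to $M(1, \lambda)$ and let $\epsilon_1, \epsilon_2 \in \{\pm\}$. Then the fusion rule of type $\binom{V_L^{T_{\chi_2},\epsilon_2}}{M\ \ V_L^{T_{\chi_1},\epsilon_1}}$ is zero if $\chi_2 \neq \chi_1^{(\lambda)}$ and is less than or equal to 1 if $\chi_2 = \chi_1^{(\lambda)}$.

Corollary 5.15. For any $\lambda \in L^\circ$ with $2\lambda \notin L$ and $\epsilon_1, \epsilon_2 \in \{\pm\}$, the fusion rule of type $\binom{V_L^{T_{\chi_2},\epsilon_2}}{V_{\lambda+L}\ \ V_L^{T_{\chi_1},\epsilon_1}}$ is $\delta_{\chi_1^{(\lambda)},\chi_2}$.

Proof. It is clear from Propositions 5.11 and 5.14.

Finally we prove the following proposition: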
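Proposition 5.16. Let $\lambda \in L^\circ$ with $2\lambda \in L$ and let $\chi$ be a central character of $\hat{L}/K$ such that $\chi(\kappa) = -1$. Then

(1) the fusion rules of types $\binom{V_L^{T_{\chi^{(\lambda)}},\mp}}{V_{\lambda+L}^{+}\ \ V_L^{T_\chi,\pm}}$ and $\binom{V_L^{T_{\chi^{(\lambda)}},\pm}}{V_{\lambda+L}^{-}\ \ V_L^{T_\chi,\pm}}$ are 0 if $c_\chi(\lambda) = 1$,

(2) the fusion rules of types $\binom{V_L^{T_{\chi^{(\lambda)}},\pm}}{V_{\lambda+L}^{+}\ \ V_L^{T_\chi,\pm}}$ and $\binom{V_L^{T_{\chi^{(\lambda)}},\mp}}{V_{\lambda+L}^{-}\ \ V_L^{T_\chi,\pm}}$ are 0 if $c_\chi(\lambda) = -1$.

Proof. We shall only prove that the fusion rule of type $\binom{V_L^{T_{\chi^{(\lambda)}},+}}{V_{\lambda+L}^{+}\ \ V_L^{T_\chi,+}}$ is 0 when $c_\chi(\lambda) = -1$. The others can be dealt with similarly. Let $\mathcal{Y}(\,\cdot\,, z)$ be an intertwining operator of type $\binom{V_L^{T_{\chi^{(\lambda)}},+}}{V_{\lambda+L}^{+}\ \ V_L^{T_\chi,+}}$ and $f_\lambda$ the linear map defined as in (5.17). As in the proof of Proposition 5.6, we take a subset $S_\lambda \subset \lambda + L$ such that $V_{\lambda+L}^+ \cong \bigoplus_{\mu \in S_\lambda} M(1, \mu)$ ($\cong M(1)^+ \oplus \bigoplus_{\mu \in S_\lambda} M(1, \mu)$ if $\lambda \in L$) as $M(1)^+$-modules. We recall the $M(1)^+$-module isomorphism $\phi : V_{\lambda+L}^+[\lambda] \to M(1, \lambda)$ for $\lambda \in S_\lambda$. We may also assume that $\lambda, 3\lambda \in S_\lambda$. By (5.10), we have for any $a \in M(1, 2\lambda)$, $u \in M(1, \lambda)$ and a sufficiently large integer $k$,
\[
\begin{aligned}
&(z_0 + z_2)^k\, \mathcal{Y}^{tw}_{2\lambda}(a, z_0 + z_2)\mathcal{Y}^{tw}_\lambda(u, z_2)(v \otimes t)\\
&= \frac{1}{2}\sum_{p=0,1} (z_2 + z_0)^k\, \mathcal{Y}^{tw}_{(1+(-1)^p 2)\lambda}\bigl(\mathcal{Y}_{(-1)^p 2\lambda,\lambda}(\theta^p(a), z_0)u, z_2\bigr)(v \otimes t)\\
&= \frac{1}{2}(z_2 + z_0)^k \Bigl(\mathcal{Y}^{tw}_{3\lambda}\bigl(\mathcal{Y}_{2\lambda,\lambda}(a, z_0)u, z_2\bigr) + \mathcal{Y}^{tw}_{-\lambda}\bigl(\mathcal{Y}_{-2\lambda,\lambda}(\theta(a), z_0)u, z_2\bigr)\Bigr)(v \otimes t)
\end{aligned}
\]
for $v \in M(1)(\theta)^+$ and $t \in T_\chi$. This and (5.18) show that for $a = b + \theta(b) \in V_L^+[2\lambda]$ with $b \in M(1, 2\lambda)$, $u \in V_{\lambda+L}^+[\lambda]$ and $v \in M(1)(\theta)^+$,
\[
\begin{aligned}
&(z_0 + z_2)^k\, Y(a, z_0 + z_2)\mathcal{Y}(u, z_2)(v \otimes t)\\
&= (z_0 + z_2)^k\, p_+\Bigl(\bigl(\mathcal{Y}^{tw}_{2\lambda}(b, z_0 + z_2) + \mathcal{Y}^{tw}_{-2\lambda}(\theta(b), z_0 + z_2)\bigr)\mathcal{Y}^{tw}_\lambda(\phi(u), z_2)v\Bigr) \otimes e_{2\lambda}f_\lambda(t)\\
&= \frac{1}{2}(z_2 + z_0)^k \Bigl(p_+\mathcal{Y}^{tw}_{3\lambda}\bigl(\mathcal{Y}_{2\lambda,\lambda}(b, z_0)\phi(u), z_2\bigr)v + p_+\mathcal{Y}^{tw}_{-\lambda}\bigl(\mathcal{Y}_{-2\lambda,\lambda}(\theta(b), z_0)\phi(u), z_2\bigr)v\\
&\qquad + p_+\mathcal{Y}^{tw}_{-\lambda}\bigl(\mathcal{Y}_{-2\lambda,\lambda}(\theta(b), z_0)\phi(u), z_2\bigr)v + p_+\mathcal{Y}^{tw}_{3\lambda}\bigl(\mathcal{Y}_{2\lambda,\lambda}(b, z_0)\phi(u), z_2\bigr)v\Bigr) \otimes e_{2\lambda}f_\lambda(t)\\
&= (z_2 + z_0)^k\, p_+\Bigl(\mathcal{Y}^{tw}_{3\lambda}\bigl(\mathcal{Y}_{2\lambda,\lambda}(b, z_0)\phi(u), z_2\bigr)v + \mathcal{Y}^{tw}_{-\lambda}\bigl(\mathcal{Y}_{-2\lambda,\lambda}(\theta(b), z_0)\phi(u), z_2\bigr)v\Bigr) \otimes e_{2\lambda}f_\lambda(t).
\end{aligned}
\tag{5.23}
\]
Consider $\phi(Y(a, z)u)$ for any $a \in V_L^+[2\lambda]$ and $u \in V_{\lambda+L}^+[\lambda]$. Note that $u = \phi(u) + \theta(\phi(u))$. We have
\[
Y(a, z)u = \epsilon(2\lambda, \lambda)\bigl(\mathcal{Y}_{2\lambda,\lambda}(a, z)\phi(u) + \mathcal{Y}_{-2\lambda,-\lambda}(\theta(a), z)\theta(\phi(u))\bigr)
+ \epsilon(-2\lambda, \lambda)\bigl(\mathcal{Y}_{2\lambda,-\lambda}(a, z)\theta(\phi(u)) + \mathcal{Y}_{-2\lambda,\lambda}(\theta(a), z)\phi(u)\bigr).
\]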
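Since $3\lambda, \lambda \in S_\lambda$, we have
\[
\phi(Y(a, z)u) = \epsilon(2\lambda, \lambda)\,\mathcal{Y}_{2\lambda,\lambda}(a, z)\phi(u) + \epsilon(-2\lambda, \lambda)\,\mathcal{Y}_{2\lambda,-\lambda}(a, z)\theta(\phi(u)).
\]
Using (4.6) gives
\[
\phi(Y(a, z)u) = \epsilon(2\lambda, \lambda)\,\mathcal{Y}_{2\lambda,\lambda}(a, z)\phi(u) + \epsilon(-2\lambda, \lambda)\,\theta\,\mathcal{Y}_{-2\lambda,\lambda}(\theta(a), z)\phi(u).
\]
Note that $p_+\mathcal{Y}^{tw}_\nu(u, z)w = p_+\mathcal{Y}^{tw}_{-\nu}(\theta(u), z)w$ for any $u \in M(1, \nu)$ and $w \in M(1)(\theta)^+$. Hence
\[
\begin{aligned}
\mathcal{Y}(Y(a, z_0)u, z_2)(v \otimes t)
&= \epsilon(2\lambda, \lambda)\, p_+\Bigl(\mathcal{Y}^{tw}_{3\lambda}\bigl(\mathcal{Y}_{2\lambda,\lambda}(a, z_0)\phi(u), z_2\bigr)v\Bigr) \otimes f_{3\lambda}(t)\\
&\quad + \epsilon(-2\lambda, \lambda)\, p_+\Bigl(\mathcal{Y}^{tw}_{-\lambda}\bigl(\mathcal{Y}_{-2\lambda,\lambda}(\theta(a), z_0)\phi(u), z_2\bigr)v\Bigr) \otimes f_\lambda(t).
\end{aligned}
\tag{5.24}
\]
On the other hand, (5.23) gives
\[
\begin{aligned}
&(z_0 + z_2)^k\, Y(a, z_0 + z_2)\mathcal{Y}(u, z_2)(v \otimes t)\\
&= (z_2 + z_0)^k \Bigl(p_+\Bigl(\mathcal{Y}^{tw}_{3\lambda}\bigl(\mathcal{Y}_{2\lambda,\lambda}(a, z_0)\phi(u), z_2\bigr)v\Bigr) \otimes e_{2\lambda}f_\lambda(t)
+ p_+\Bigl(\mathcal{Y}^{tw}_{-\lambda}\bigl(\mathcal{Y}_{-2\lambda,\lambda}(\theta(a), z_0)\phi(u), z_2\bigr)v\Bigr) \otimes e_{2\lambda}f_\lambda(t)\Bigr).
\end{aligned}
\tag{5.25}
\]
Since $\mathcal{Y}$ is an intertwining operator we have the identity $(z_0 + z_2)^k\, Y(a, z_0 + z_2)\mathcal{Y}(u, z_2)(v \otimes t) = (z_2 + z_0)^k\, \mathcal{Y}(Y(a, z_0)u, z_2)(v \otimes t)$ for a sufficiently large integer $k$. It follows from (5.24) and (5.25) that
\[
\begin{aligned}
&p_+\Bigl(\mathcal{Y}^{tw}_{3\lambda}\bigl(\mathcal{Y}_{2\lambda,\lambda}(a, z_0)\phi(u), z_2\bigr)v\Bigr) \otimes \bigl(\epsilon(2\lambda, \lambda)f_{3\lambda}(t) - e_{2\lambda}f_\lambda(t)\bigr)\\
&= p_+\Bigl(\mathcal{Y}^{tw}_{-\lambda}\bigl(\mathcal{Y}_{-2\lambda,\lambda}(\theta(a), z_0)\phi(u), z_2\bigr)v\Bigr) \otimes \bigl(e_{2\lambda}f_\lambda(t) - \epsilon(-2\lambda, \lambda)f_\lambda(t)\bigr).
\end{aligned}
\]
Since the least powers of $z_0$ in
\[
\mathcal{Y}^{tw}_{3\lambda}\bigl(\mathcal{Y}_{2\lambda,\lambda}(e_{2\lambda}, z_0)e_\lambda, z_2\bigr)v \quad\text{and}\quad \mathcal{Y}^{tw}_{-\lambda}\bigl(\mathcal{Y}_{-2\lambda,\lambda}(e_{-2\lambda}, z_0)e_\lambda, z_2\bigr)v
\]
are $(2\lambda, \lambda)$ and $-(2\lambda, \lambda)$ respectively, we see that if $\lambda \neq 0$,
\[
\chi^{(\lambda)}(e_{2\lambda})f_\lambda(t) = e_{2\lambda}f_\lambda(t) = \epsilon(-2\lambda, \lambda)f_\lambda(t) \tag{5.26}
\]
for any $t \in T_\chi$. That is,
\[
c_\chi(\lambda)f_\lambda(t) = f_\lambda(t). \tag{5.27}
\]
The condition $c_\chi(\lambda) = -1$ forces $f_\lambda = 0$. This shows $\mathcal{Y}(\,\cdot\,, z) = 0$.

By Propositions 2.7, 5.11–5.14, 5.16 and Corollary 5.15 we immediately have:

Proposition 5.17. Let $M^i$ ($i = 1, 2, 3$) be irreducible $V_L^+$-modules and assume that one of them is of twisted type. Then the fusion rule of type $\binom{M^3}{M^1\ M^2}$ is either 0 or 1. The fusion rule of type $\binom{M^3}{M^1\ M^2}$ is 1 if and only if $M^i$ ($i = 1, 2, 3$) satisfy one of the following conditions: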
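(i) $M^1 = V_{\lambda+L}$ for $\lambda \in L^\circ$ such that $2\lambda \notin L$, and $(M^2, M^3)$ is one of the following pairs:
$(V_L^{T_\chi,\pm}, V_L^{T_{\chi^{(\lambda)}},\pm})$, $(V_L^{T_\chi,\pm}, V_L^{T_{\chi^{(\lambda)}},\mp})$ for any irreducible $\hat{L}/K$-module $T_\chi$.

(ii) $M^1 = V_{\lambda+L}^{+}$ for $\lambda \in L^\circ$ such that $2\lambda \in L$, and $(M^2, M^3)$ is one of the following pairs:
$(V_L^{T_\chi,\pm}, V_L^{T_{\chi^{(\lambda)}},\pm})$, $((V_L^{T_{\chi^{(\lambda)}},\pm})', (V_L^{T_\chi,\pm})')$ for any irreducible $\hat{L}/K$-module $T_\chi$ such that $c_\chi(\lambda) = 1$,
$(V_L^{T_\chi,\pm}, V_L^{T_{\chi^{(\lambda)}},\mp})$, $((V_L^{T_{\chi^{(\lambda)}},\mp})', (V_L^{T_\chi,\pm})')$ for any irreducible $\hat{L}/K$-module $T_\chi$ such that $c_\chi(\lambda) = -1$.

(iii) $M^1 = V_{\lambda+L}^{-}$ for $\lambda \in L^\circ$ such that $2\lambda \in L$, and $(M^2, M^3)$ is one of the following pairs:
$(V_L^{T_\chi,\pm}, V_L^{T_{\chi^{(\lambda)}},\mp})$, $((V_L^{T_{\chi^{(\lambda)}},\mp})', (V_L^{T_\chi,\pm})')$ for any irreducible $\hat{L}/K$-module $T_\chi$ such that $c_\chi(\lambda) = 1$,
$(V_L^{T_\chi,\pm}, V_L^{T_{\chi^{(\lambda)}},\pm})$, $((V_L^{T_{\chi^{(\lambda)}},\pm})', (V_L^{T_\chi,\pm})')$ for any irreducible $\hat{L}/K$-module $T_\chi$ such that $c_\chi(\lambda) = -1$.

(iv) $M^1 = V_L^{T_\chi,+}$ for an irreducible $\hat{L}/K$-module $T_\chi$, and $(M^2, M^3)$ is one of the following pairs:
$(V_{\lambda+L}, V_L^{T_{\chi^{(\lambda)}},\pm})$, $((V_L^{T_{\chi^{(\lambda)}},\pm})', (V_{\lambda+L})')$ for $\lambda \in L^\circ$ such that $2\lambda \notin L$,
$(V_{\lambda+L}^{\pm}, V_L^{T_{\chi^{(\lambda)}},\pm})$, $((V_L^{T_{\chi^{(\lambda)}},\pm})', (V_{\lambda+L}^{\pm})')$ for $\lambda \in L^\circ$ such that $2\lambda \in L$ and $c_\chi(\lambda) = 1$,
$(V_{\lambda+L}^{\pm}, V_L^{T_{\chi^{(\lambda)}},\mp})$, $((V_L^{T_{\chi^{(\lambda)}},\mp})', (V_{\lambda+L}^{\pm})')$ for $\lambda \in L^\circ$ such that $2\lambda \in L$ and $c_\chi(\lambda) = -1$.

(v) $M^1 = V_L^{T_\chi,-}$ for an irreducible $\hat{L}/K$-module $T_\chi$, and $(M^2, M^3)$ is one of the following pairs:
$(V_{\lambda+L}, V_L^{T_{\chi^{(\lambda)}},\pm})$, $((V_L^{T_{\chi^{(\lambda)}},\pm})', (V_{\lambda+L})')$ for $\lambda \in L^\circ$ such that $2\lambda \notin L$,
$(V_{\lambda+L}^{\pm}, V_L^{T_{\chi^{(\lambda)}},\mp})$, $((V_L^{T_{\chi^{(\lambda)}},\pm})', (V_{\lambda+L}^{\mp})')$ for $\lambda \in L^\circ$ such that $2\lambda \in L$ and $c_\chi(\lambda) = 1$,
$(V_{\lambda+L}^{\pm}, V_L^{T_{\chi^{(\lambda)}},\pm})$, $((V_L^{T_{\chi^{(\lambda)}},\mp})', (V_{\lambda+L}^{\mp})')$ for $\lambda \in L^\circ$ such that $2\lambda \in L$ and $c_\chi(\lambda) = -1$.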
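Now Theorem 5.1 follows from Propositions 5.7 and 5.17.

5.4. Fusion product for $V_L^+$. Let $V$ be a vertex operator algebra and $\{W_i\}_{i \in I}$ the set of all the equivalence classes of irreducible $V$-modules. For any representatives $M^i$ of $W_i$ ($i \in I$), we write $N_{ij}^k$ for the fusion rule of type $\binom{M^k}{M^i\ M^j}$ for $i, j, k \in I$. The fusion rules $N_{ij}^k$ are independent of the choice of representatives. Here and further we assume that $I$ is a finite set and that all the fusion rules are finite. Set $R(V) = \bigoplus_{i \in I} \mathbb{C}W_i$, a vector space over $\mathbb{C}$ with basis $\{W_i\}_{i \in I}$. Then the product on $R(V)$ is defined by
\[
W_i \times W_j = \sum_{k \in I} N_{ij}^k\, W_k
\]
for any $i, j \in I$. By Proposition 2.7 the product is commutative. The commutative algebra $R(V)$ is called the fusion algebra of $V$.

Denote by $W_i'$ the equivalence class of the contragredient module of a representative of $W_i$. Then for any $i \in I$ there exists a unique $i' \in I$ such that $W_{i'} = W_i'$. By Proposition 2.7, we have $N_{ij}^k = N_{ik'}^{j'}$ for any $i, j, k \in I$.

We now describe the fusion products for $V_L^+$. For simplicity, we introduce notation for equivalence classes of irreducible $V_L^+$-modules. For $\lambda \in L^\circ$, we set $[\lambda]$ to be the equivalence class of irreducible $V_L^+$-modules isomorphic to $V_{\lambda+L}$. When $2\lambda \in L$, we denote by $[\lambda]^\pm$ the equivalence class of irreducible $V_L^+$-modules isomorphic to $V_{\lambda+L}^\pm$. By abuse of notation we set $[\lambda] = [\lambda]^+ + [\lambda]^-$ for $\lambda \in L^\circ$ with $2\lambda \in L$. We then have that $[\lambda] = [-\lambda]$ and $[\lambda + \alpha] = [\lambda]$ for any $\lambda \in L^\circ$ and $\alpha \in L$. This implies that $[\lambda + \mu] = [\lambda - \mu]$ for $\lambda, \mu \in L^\circ$ if $2\lambda \in L$ or $2\mu \in L$. For a central character $\chi$ of $\hat{L}/K$ with $\chi(\kappa) = -1$, we write $[\chi]^\pm$ for the equivalence classes of the irreducible $V_L^+$-modules $V_L^{T_\chi,\pm}$, respectively. We set $S_0 = \{\lambda \in L^\circ \mid 2\lambda \in L\}$ and $S_1 = \{\lambda \in L^\circ \mid 2\lambda \notin L\}$. By Theorem 5.1 we have the following fusion products:
\[
\begin{aligned}
[\lambda] \times [\mu] &= [\lambda + \mu] + [\lambda - \mu] && \text{for } \lambda, \mu \in S_1, && (5.28)\\
[\lambda]^{\pm} \times [\mu] &= [\lambda + \mu] && \text{for } \lambda \in S_0,\ \mu \in S_1, && (5.29)\\
[\lambda]^{+} \times [\mu]^{\pm} &= [\lambda + \mu]^{\pm} && \text{for } \lambda, \mu \in S_0 \text{ such that } \pi_{\lambda,2\mu} = 1, && (5.30)\\
[\lambda]^{+} \times [\mu]^{\pm} &= [\lambda + \mu]^{\mp} && \text{for } \lambda, \mu \in S_0 \text{ such that } \pi_{\lambda,2\mu} = -1, && (5.31)\\
[\lambda]^{-} \times [\mu]^{-} &= [\lambda + \mu]^{+} && \text{for } \lambda, \mu \in S_0 \text{ such that } \pi_{\lambda,2\mu} = 1, && (5.32)\\
[\lambda]^{-} \times [\mu]^{-} &= [\lambda + \mu]^{-} && \text{for } \lambda, \mu \in S_0 \text{ such that } \pi_{\lambda,2\mu} = -1, && (5.33)\\
[\lambda] \times [\chi]^{\pm} &= [\chi^{(\lambda)}]^{+} + [\chi^{(\lambda)}]^{-} && \text{for } \lambda \in S_1, && (5.34)\\
[\lambda]^{+} \times [\chi]^{\pm} &= [\chi^{(\lambda)}]^{\pm} && \text{for } \lambda \in S_0 \text{ such that } c_\chi(\lambda) = 1, && (5.35)\\
[\lambda]^{+} \times [\chi]^{\pm} &= [\chi^{(\lambda)}]^{\mp} && \text{for } \lambda \in S_0 \text{ such that } c_\chi(\lambda) = -1, && (5.36)\\
[\lambda]^{-} \times [\chi]^{-} &= [\chi^{(\lambda)}]^{+} && \text{for } \lambda \in S_0 \text{ such that } c_\chi(\lambda) = 1, && (5.37)\\
[\lambda]^{-} \times [\chi]^{-} &= [\chi^{(\lambda)}]^{-} && \text{for } \lambda \in S_0 \text{ such that } c_\chi(\lambda) = -1. && (5.38)
\end{aligned}
\]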
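The sign patterns in (5.30)–(5.33) and (5.35)–(5.38) can be summarized by a single product of signs. The sketch below is an illustration, not a statement taken from the paper: writing $\pm$ as $\pm 1$, it assumes that the sign of the product class is $\epsilon_1 \epsilon_2 s$, where $s$ stands for $\pi_{\lambda,2\mu}$ in the untwisted case and for $c_\chi(\lambda)$ in the twisted case. The names `fused_sign`, `LISTED` and `matches_listed_products` are hypothetical. Only the sign combinations written out in the list are compared.

```python
def fused_sign(eps1: int, eps2: int, s: int) -> int:
    """Conjectured sign of the product class: eps1 * eps2 * s, all in {+1, -1}."""
    for x in (eps1, eps2, s):
        if x not in (1, -1):
            raise ValueError("signs must be +1 or -1")
    return eps1 * eps2 * s


# Rows (eps1, eps2, s, sign of the product) read off from (5.30)-(5.33);
# the same rows describe (5.35)-(5.38) with s = c_chi(lambda).
LISTED = [
    (+1, +1, +1, +1), (+1, -1, +1, -1),  # (5.30) and (5.35)
    (+1, +1, -1, -1), (+1, -1, -1, +1),  # (5.31) and (5.36)
    (-1, -1, +1, +1),                    # (5.32) and (5.37)
    (-1, -1, -1, -1),                    # (5.33) and (5.38)
]


def matches_listed_products() -> bool:
    """Check the sign rule against every row of LISTED."""
    return all(fused_sign(a, b, s) == out for a, b, s, out in LISTED)
```

The remaining combinations, such as $[\lambda]^- \times [\mu]^+$, are not listed explicitly and follow from commutativity of the product.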
The other products are derived from these products with the symmetries of fusion rules in Proposition 2.7. For example, the product of $[\chi_1]^+$ and $[\chi_2]^+$ is given by
\[
[\chi_1]^+ \times [\chi_2]^+ = \sum_{\lambda} [\lambda] + \sum_{\mu} [\mu]^+ + \sum_{\nu} [\nu]^-,
\]
where $\lambda$ runs through $S_1$ such that $\chi_1^{(\lambda)} = \chi_2$, $\mu$ runs through $S_0$ such that $\chi_1^{(\mu)} = \chi_2$ and $c_{\chi_1}(\mu)(-1)^{2(\mu,\mu)} = 1$, and $\nu$ runs through $S_0$ such that $\chi_1^{(\nu)} = \chi_2$ and $c_{\chi_1}(\nu)(-1)^{2(\nu,\nu)} = -1$.
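The definition of the product on $R(V)$ lends itself to a small computational check. The following sketch is an illustration only: it stores fusion rules $N_{ij}^k$ in a dictionary, multiplies formal sums of classes, and tests commutativity and associativity by brute force. The helper names (`fuse`, `is_commutative`, `is_associative`) and the toy table, which is the group algebra of $\mathbb{Z}/2 \times \mathbb{Z}/2$, are assumptions made for the example and are not taken from the paper.

```python
from itertools import product


def fuse(N, x, y):
    """Multiply formal sums x and y (dicts: class label -> coefficient).

    N maps a pair (i, j) to a dict {k: N_ij^k}; missing entries count as 0.
    """
    out = {}
    for (i, a), (j, b) in product(x.items(), y.items()):
        for k, n in N.get((i, j), {}).items():
            out[k] = out.get(k, 0) + a * b * n
    return {k: c for k, c in out.items() if c != 0}


def is_commutative(N, labels):
    """True if W_i x W_j == W_j x W_i for all basis classes."""
    return all(fuse(N, {i: 1}, {j: 1}) == fuse(N, {j: 1}, {i: 1})
               for i, j in product(labels, repeat=2))


def is_associative(N, labels):
    """True if (W_i x W_j) x W_k == W_i x (W_j x W_k) for all basis classes."""
    return all(
        fuse(N, fuse(N, {i: 1}, {j: 1}), {k: 1})
        == fuse(N, {i: 1}, fuse(N, {j: 1}, {k: 1}))
        for i, j, k in product(labels, repeat=3)
    )


# Toy data (hypothetical): four classes labelled by pairs of bits, with
# N_ij^k = 1 exactly when k is the bitwise XOR of i and j.
LABELS = [(0, 0), (0, 1), (1, 0), (1, 1)]
TOY_N = {(i, j): {(i[0] ^ j[0], i[1] ^ j[1]): 1}
         for i, j in product(LABELS, repeat=2)}
```

The same two checks apply to any finite table of fusion rules supplied in place of `TOY_N`.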
Theorem 5.18. The fusion algebra R(VL+ ) is a commutative associative algebra. Proof. For any equivalence classes W1 , W2 and W3 of irreducible VL+ -modules, we have to prove that W1 × (W2 × W3 ) = (W1 × W2 ) × W3 . We can do this case by case. For example we shall prove [λ]+ × ([µ]+ × [ν]− ) = ([λ]+ × [µ]+ ) × [ν]−
(5.39)
Fusion Rules for Vertex Operator Algebras M(1)+ and VL+
217
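for $\lambda,\mu,\nu \in S_0$ such that $\pi_{\lambda,2\mu} = 1$ and $\pi_{\mu,2\nu} = -1$, and
$$[\lambda]^+ \times ([\mu]^+ \times [\chi]^-) = ([\lambda]^+ \times [\mu]^+) \times [\chi]^- \tag{5.40}$$
for $\lambda,\mu \in S_0$ and a central character $\chi$ of $\hat{L}/K$ with $\chi(\kappa) = -1$ such that $\pi_{\lambda,2\mu} = 1$ and $c_\chi(\mu) = -1$.
We first show (5.39). By using (5.30) and (5.31), we have
$$[\mu]^+ \times [\nu]^- = [\mu+\nu]^+, \qquad [\lambda]^+ \times [\mu]^+ = [\lambda+\mu]^+.$$
Since
$$\pi_{\lambda,2\mu+2\nu} = e^{(\lambda,2\mu+2\nu)\pi i}\,\omega_q^{c_0(2\mu+2\nu,\lambda)} = \pi_{\lambda,2\mu}\pi_{\lambda,2\nu} = \pi_{\lambda,2\nu},$$
$$\pi_{\lambda+\mu,2\nu} = e^{(\lambda+\mu,2\nu)\pi i}\,\omega_q^{c_0(2\nu,\lambda+\mu)} = \pi_{\lambda,2\nu}\pi_{\mu,2\nu} = -\pi_{\lambda,2\nu},$$
we see that
$$[\lambda]^+ \times ([\mu]^+ \times [\nu]^-) = [\lambda]^+ \times [\mu+\nu]^+ = [\lambda+\mu+\nu]^\pm,$$
$$([\lambda]^+ \times [\mu]^+) \times [\nu]^- = [\lambda+\mu]^+ \times [\nu]^- = [\lambda+\mu+\nu]^\pm$$
if $\pi_{\lambda,2\nu} = \pm 1$ respectively. Thus (5.39) holds.
Next we show (5.40). Equation (5.36) implies $[\mu]^+ \times [\chi]^- = [\chi^{(\mu)}]^+$. Then we see that
$$[\lambda]^+ \times ([\mu]^+ \times [\chi]^-) = [\lambda]^+ \times [\chi^{(\mu)}]^+ = [(\chi^{(\mu)})^{(\lambda)}]^\pm$$
if $c_\chi(\lambda) = \pm 1$ respectively. On the other hand,
$$([\lambda]^+ \times [\mu]^+) \times [\chi]^- = [\lambda+\mu]^+ \times [\chi]^- = [\chi^{(\lambda+\mu)}]^\mp$$
if $c_\chi(\lambda+\mu) = -c_\chi(\lambda) = \pm 1$ respectively. Since
$$(\chi^{(\mu)})^{(\lambda)}(a) = \chi^{(\mu)}(\sigma_\lambda(a)) = \chi(\sigma_\mu\sigma_\lambda(a)) = \chi(\sigma_{\lambda+\mu}(a)) = \chi^{(\lambda+\mu)}(a)$$
for any $a \in Z(\hat{L}/K)$, we have $(\chi^{(\mu)})^{(\lambda)} = \chi^{(\lambda+\mu)}$. Therefore, (5.40) holds.
5.5. Application. In this section we apply the results on fusion rules for $V_L^+$-modules to study orbifold vertex operator algebras constructed from $V_L$ and the automorphism $\theta$ when $L$ is unimodular. Let $L$ be a positive-definite even unimodular lattice. That is, $L = L^\circ$. Then $V_L$ is a holomorphic vertex operator algebra in the sense that $V_L$ is rational and $V_L$ is the only irreducible $V_L$-module up to isomorphism (see [D1 and DLM2]). Moreover, $V_L$ has a unique irreducible $\theta$-twisted module $V_L^T$ up to isomorphism, where $T$ is the unique simple module for $\hat{L}/K$ such that $\kappa K$ acts as $-1$ (see [FLM and D2]). Recall that $V_L^T = M(1)(\theta) \otimes T$ and $d$ is the rank of $L$. The weight gradation of $V_L^T$ is given by
$$V_L^T = \bigoplus_{n \in \frac{1}{2}\mathbb{Z}_{\ge 0}} (V_L^T)_{n + \frac{d}{16}} \tag{5.41}$$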
(see [DL2]). Since $L$ is unimodular, $d$ is divisible by 8. Hence the $L(0)$-weights of either $V_L^{T,+}$ or $V_L^{T,-}$ are integers (half integers for the other). We denote by $V_L^{T,e}$ (resp., $V_L^{T,o}$) the irreducible $V_L^+$-submodule of $V_L^T$ with integral (resp., half integral) $L(0)$-weights. It is clear that $V_L^{T,e} = V_L^{T,+}$ if $d/8$ is even and $V_L^{T,e} = V_L^{T,-}$ if $d/8$ is odd. By Theorems 3.4 (also see [AD]) and 5.1, we have:
Proposition 5.19. Let $L$ be a positive-definite even unimodular lattice.
(i) The vertex operator algebra $V_L^+$ has exactly 4 irreducible modules $V_L^\pm$, $V_L^{T,\pm}$ up to isomorphism.
(ii) The fusion rules among modules are
$$V_L^+ \times W = W \times V_L^+ = W,$$
$$V_L^- \times V_L^- = V_L^+,$$
$$V_L^- \times V_L^{T,e} = V_L^{T,e} \times V_L^- = V_L^{T,o},$$
$$V_L^- \times V_L^{T,o} = V_L^{T,o} \times V_L^- = V_L^{T,e},$$
$$V_L^{T,e} \times V_L^{T,e} = V_L^{T,o} \times V_L^{T,o} = V_L^+,$$
$$V_L^{T,e} \times V_L^{T,o} = V_L^{T,o} \times V_L^{T,e} = V_L^-,$$
where $W$ is any irreducible $V_L^+$-module.
Remark 5.20. If $L$ is the Leech lattice, the irreducible modules for $V_L^+$ have been classified previously in [D3] by using the representation theory for the Virasoro algebra of central charge 1/2.
The main result in this subsection is the following:
Proposition 5.21. Let $L$ be a positive-definite even unimodular lattice. Assume that $V_L^+$ is rational and $V = V_L^+ + V_L^{T,e}$ is a vertex operator algebra. Then $V$ is a holomorphic vertex operator algebra and $C_2$-cofinite.
Proof. It is known that $V_L^+$ is $C_2$-cofinite (see [Ya and ABD]). Since $V$ is $C_2$-cofinite as a $V_L^+$-module by [Bu], it is also $C_2$-cofinite as a vertex operator algebra. We assume that $V = V_L^+ + V_L^{T,-}$. The case that $V = V_L^+ + V_L^{T,+}$ can be proved similarly.
We first prove that $V$ is the only irreducible $V$-module up to isomorphism. Let $W$ be an irreducible $V$-module. Then $W$ is a completely reducible $V_L^+$-module. Let $M$ be an irreducible $V_L^+$-submodule of $W$. If $M = V_L^+$ or $V_L^{T,-}$, the fusion rules given in Proposition 5.19 show that $V_L^+$ is always contained in $W$ as a $V_L^+$-submodule. So $W$ contains a vacuum-like vector and thus is isomorphic to $V$ (see [L]). If $M = V_L^-$, then $V_L^{T,-} \times V_L^- = V_L^{T,+}$ is a $V_L^+$-submodule of $W$. Note that $V_L^{T,-}$ has integral weights and $V_L^{T,+}$ has strictly half integral weights. So $W$ has both integral weights from $V_L^-$ and half integral weights from $V_L^{T,+}$. But this is impossible as $W$ is irreducible. Similarly, $M$ cannot be $V_L^{T,+}$.
We now prove that $V$ is rational. That is, any admissible $V$-module is completely reducible. Let $W$ be an admissible $V$-module and $M$ be the maximal semisimple admissible submodule. Then $W = M \oplus X$ for a $V_L^+$-submodule $X$ of $W$, as $V_L^+$ is rational. If $X \neq 0$ then $W/M$ is a nonzero $V$-module. So as a $V_L^+$-module $W/M = X$ contains a $V_L^+$-submodule isomorphic to $V_L^+$. This shows that $X$ contains a vacuum-like vector $x$, and the $V$-submodule $Z$ of $W$ generated by $x$ is isomorphic to $V$. Clearly, $M \cap Z = 0$ and $M \oplus Z$ is a semisimple admissible $V$-submodule of $W$ which strictly contains $M$. This contradiction shows that $W = M$.
Again, if $L$ is the Leech lattice, this result has been given in [D3] before.
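The fusion rules of Proposition 5.19 (ii) form a finite multiplication table, so commutativity and associativity (Theorem 5.18 in this special case) can be checked by brute force. The following is a minimal sketch for illustration only; the string labels `"V+"`, `"V-"`, `"Te"`, `"To"` and the helper names are our own shorthand for $V_L^+$, $V_L^-$, $V_L^{T,e}$, $V_L^{T,o}$ and do not appear in the paper.

```python
from itertools import product

# Hypothetical labels for the four irreducible V_L^+-modules.
MODULES = ["V+", "V-", "Te", "To"]

# Fusion table transcribed from Proposition 5.19 (ii).
FUSION = {
    ("V-", "V-"): "V+",
    ("V-", "Te"): "To", ("Te", "V-"): "To",
    ("V-", "To"): "Te", ("To", "V-"): "Te",
    ("Te", "Te"): "V+", ("To", "To"): "V+",
    ("Te", "To"): "V-", ("To", "Te"): "V-",
}
# V_L^+ acts as the unit: V_L^+ x W = W x V_L^+ = W.
for w in MODULES:
    FUSION[("V+", w)] = w
    FUSION[(w, "V+")] = w


def fuse(a, b):
    """Return the fusion product a x b (a single irreducible module here)."""
    return FUSION[(a, b)]


def is_commutative():
    """Check a x b = b x a for all pairs."""
    return all(fuse(a, b) == fuse(b, a) for a, b in product(MODULES, repeat=2))


def is_associative():
    """Check a x (b x c) = (a x b) x c for all triples."""
    return all(
        fuse(a, fuse(b, c)) == fuse(fuse(a, b), c)
        for a, b, c in product(MODULES, repeat=3)
    )


if __name__ == "__main__":
    print("commutative:", is_commutative())
    print("associative:", is_associative())
```

Every module squares to the unit in this table, so the four classes multiply like the Klein four-group.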
References
[A1] Abe, T.: Fusion rules for the free bosonic orbifold vertex operator algebra. J. Alg. 229, 333–374 (2000)
[A2] Abe, T.: Fusion rules for the charge conjugation orbifold. J. Alg. 242, 624–655 (2001)
[ABD] Abe, T., Buhl, G., Dong, C.: Rationality, Regularity and C2 co-finiteness. Trans. Am. Math. Soc. 356, 3391–3402 (2004)
[AD] Abe, T., Dong, C.: Classification of irreducible modules for the vertex operator algebra VL+: General case. J. Alg. 273, 657–685 (2004)
[B] Borcherds, R.: Vertex algebras, Kac-Moody algebras, and the Monster. Proc. Natl. Acad. Sci. USA 83, 3068–3071 (1986)
[Bu] Buhl, G.: A spanning set for VOA modules. J. Alg. 254, 125–151 (2002)
[DGM] Dolan, L., Goddard, P., Montague, P.: Conformal field theories, representations and lattice constructions. Commun. Math. Phys. 179, 61–120 (1996)
[D1] Dong, C.: Vertex algebras associated with even lattices. J. Alg. 160, 245–265 (1993)
[D2] Dong, C.: Twisted modules for vertex algebras associated with even lattices. J. Alg. 165, 91–112 (1994)
[D3] Dong, C.: Representations of the moonshine module vertex operator algebra. Contemporary Math. 175, 27–36 (1994)
[DL1] Dong, C., Lepowsky, J.: Generalized vertex algebras and relative vertex operators. Progress in Math., Vol. 112, Boston: Birkhäuser, 1993
[DL2] Dong, C., Lepowsky, J.: The algebraic structure of relative twisted vertex operators. J. Pure Appl. Alg. 110, 259–295 (1996)
[DLM1] Dong, C., Li, H., Mason, G.: Compact automorphism groups of vertex operator algebras. Int. Math. Res. Notices 18, 913–921 (1996)
[DLM2] Dong, C., Li, H., Mason, G.: Regularity of rational vertex operator algebras. Adv. Math. 132, 148–166 (1997)
[DLM3] Dong, C., Li, H., Mason, G.: Twisted representation of vertex operator algebras. Math. Ann. 310, 571–600 (1998)
[DLi] Dong, C., Lin, Z.: Induced modules for vertex operator algebras. Commun. Math. Phys. 179, 157–184 (1996)
[DM] Dong, C., Mason, G.: On quantum Galois theory. Duke Math. J. 86, 305–321 (1997)
[DN1] Dong, C., Nagatomo, K.: Classification of irreducible modules for the vertex operator algebra M(1)+. J. Alg. 216, 384–404 (1999)
[DN2] Dong, C., Nagatomo, K.: Representations of Vertex operator algebra VL+ for rank one lattice L. Commun. Math. Phys. 202, 169–195 (1999)
[DN3] Dong, C., Nagatomo, K.: Classification of irreducible modules for the vertex operator algebra M(1)+ II. Higher Rank. J. Alg. 240, 389–325 (2001)
[FHL] Frenkel, I., Huang, Y., Lepowsky, J.: On axiomatic approaches to vertex operator algebras and modules. Mem. Am. Math. Soc. 104, (1993)
[FLM] Frenkel, I., Lepowsky, J., Meurman, A.: Vertex Operator Algebras and the Monster. Pure and Appl. Math., Vol. 134, Boston: Academic Press, 1988
[FZ] Frenkel, I., Zhu, Y.: Vertex operator algebras associated to representations of affine and Virasoro algebras. Duke Math. J. 66, 123–168 (1992)
[HL] Huang, Y., Lepowsky, J.: A theory of tensor products for module categories for a vertex operator algebra I, II. Selecta Math. (N.S.) 1, 757–786 (1995)
[L] Li, H.: Symmetric invariant bilinear forms on vertex operator algebras. J. Pure Appl. Alg. 96, 279–297 (1994)
[X] Xu, X.: Twisted modules of colored lattice vertex operator superalgebras. Quart. J. Math. Oxford 47, 233–259 (1996)
[Ya] Yamskulna, G.: C2-cofiniteness of the vertex operator algebra VL+ when L is a rank one lattice. Comm. Algebra 32, 927–954 (2004)
Communicated by Y. Kawahigashi
Commun. Math. Phys. 253, 221–252 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1153-0
Communications in
Mathematical Physics
Dynamical Analysis of Schrödinger Operators with Growing Sparse Potentials
Serguei Tcheremchantsev
UMR 6628-MAPMO, Université d'Orléans, B.P. 6759, 45067 Orléans Cedex, France. E-mail:
[email protected] Received: 31 October 2003 / Accepted: 20 February 2004 Published online: 27 August 2004 – © Springer-Verlag 2004
Abstract: We consider discrete half-line Schrödinger operators $H$ with potentials of the form $V(n) = S(n) + Q(n)$. Here $Q$ is any compactly supported real function, $S(n) = n^{\frac{1-\eta}{2\eta}}$ if $n = L_N$ and $S(n) = 0$ otherwise, where $\eta \in (0,1)$ and $L_N$ is a very fast growing sequence. We study in a rather detailed manner the time-averaged dynamics $\exp(-itH)\psi$ for various initial states $\psi$. In particular, for some $\psi$ we calculate explicitly the "intermittency function" $\beta_\psi^-(p)$, which turns out to be nonconstant. The dynamical results obtained imply that the spectral measure of $H$ has exact Hausdorff dimension $\eta$ for all boundary conditions, improving the result of Jitomirskaya and Last.
1. Introduction
Consider the discrete Schrödinger operators in $l^2(\mathbb{Z}_+)$:
$$H_\theta\psi(n) = \psi(n-1) + \psi(n+1) + V(n)\psi(n), \tag{1.1}$$
where $V(n)$ is some real function, with boundary condition
$$\psi(0)\cos\theta + \psi(1)\sin\theta = 0, \quad \theta \in (-\pi/2, \pi/2). \tag{1.2}$$
We shall consider the case of sparse potentials. Namely, $V(n) = V_N$ if $n = L_N$ and $V(n) = 0$ otherwise, where $L_N$ is a monotone rapidly increasing sequence. Such potentials were studied, in particular, in [G, P, S, SSP, SST, JL, K, KR, Z]. Their interest lies in the fact that the spectrum on $(-2, 2)$ may be singular continuous with nontrivial Hausdorff dimension. In the present paper we will be interested in a particular case of such potentials, considered by Jitomirskaya and Last [JL]. We consider a slightly more general model. Let
$$V(n) = \sum_{N=1}^{\infty} L_N^{\frac{1-\eta}{2\eta}}\,\delta_{L_N,n} + Q(n) \equiv S(n) + Q(n), \tag{1.3}$$
where $L_N$ is some very fast-growing sequence such that $L_1 L_2 \cdots L_{N-1} = L_N^{\alpha_N}$, $\lim_{N\to\infty}\alpha_N = 0$, $\eta \in (0,1)$ is a parameter, and $Q(n)$ is any compactly supported real function (i.e. $Q(n) = 0$ for all $n \ge n_0$). It is well known that the study of the operator defined by (1.1)–(1.2) is equivalent to the study of the operator with Dirichlet boundary condition $\psi(0) = 0$ and potential $V_1(n) = V(n) - \tan\theta\,\delta_{1,n}$. It is clear that $V_1(n) = S(n) + Q_1(n)$, where $Q_1$ is another compactly supported function. Thus, without loss of generality, we may consider only operators with Dirichlet boundary condition and potentials given by (1.3). We shall denote by $H$ the corresponding operator.
For such a model, it is known [SST] that $(-2, 2)$ belongs to the singular continuous spectrum of $H$, and there may exist some discrete point spectrum outside of $[-2, 2]$. It was shown [JL] that the Hausdorff dimensionality of the spectrum in $(-2, 2)$ lies between $\eta$ and $\frac{2\eta}{1+\eta}$ for all boundary conditions (they consider $Q(n) = 0$ in our notation). Moreover, for Lebesgue a.e. $\theta$, the spectrum on $(-2, 2)$ is of exact dimension $\eta$. Combes and Mantica [CM] showed that the packing dimension of the spectral measure restricted to $(-2, 2)$ is equal to 1. These spectral results imply dynamical lower bounds in the usual way [L, GSB]. However, for the model considered this is only partial dynamical information. Some dynamical upper bounds were obtained by Combes and Mantica [CM] (in our proofs we use some ideas from their paper). Krutikov and Remling [KR, K] studied the behaviour of the Fourier transform of the spectral measure at infinity.
The main motivations of the present paper are the following:
1. To give a rather complete description of the (time-averaged) dynamical behaviour of the considered model related to the singular continuous part of the spectrum (and some strong results in the case of more general initial states). This is the first example of this kind where the dynamics is studied in such a detailed manner. Although this model is simple enough, the results suggest what could be done in more complicated cases, namely, for Fibonacci potentials, bounded sparse barriers, random decaying potentials or a random polymer.
2. For some initial states $\psi$ we find the exact expression of the intermittency function (see the definition below) $\beta_\psi^-(p)$, which is non-constant in $p$. To the best of our knowledge, this is the first model where such a phenomenon of "quantum intermittency" is rigorously proven.
3. Throughout the paper, we use many different methods to study dynamics and show how their combination gives stronger results. In particular, we further develop the method for proving lower bounds based on the Parseval formula [DT], allowing more general initial states $\psi$ than $\delta_1$. We think that these ideas will be useful in many other cases.
4. For a long time priority was given to the spectral analysis of operators with s.c. spectrum rather than to the analysis of the corresponding dynamics (and most dynamical bounds were obtained as a consequence of spectral results). In the present paper we show how it is possible to study dynamics directly with virtually no knowledge of the spectral properties. Indeed, the only information we need in our considerations is that $(-2, 2) \subset \sigma(H)$. Although we prove that the spectral measure is of exact Hausdorff dimension $\eta$ on $(-2, 2)$ for all boundary conditions (improving the result of Jitomirskaya and Last), this is just a particular simple consequence of our dynamical results.
Let $\psi \in l^2(\mathbb{Z}_+)$ be some initial state (in particular, $\psi = \delta_1$). The time evolution is given by $\psi(t) = \exp(-itH)\psi$, where $\exp(-itH)$ is the unitary group. We shall be interested in time-averaged quantities like
$$a_\psi(n, T) = \frac{1}{T}\int_0^\infty dt\,\exp(-t/T)\,|\psi(t, n)|^2.$$
This definition of time-averaging is virtually equivalent to the Cesaro average, but is more convenient for technical reasons. We consider the time averaging because of the rather irregular behaviour in time of $|\psi(t, n)|^2$ in the case of singular continuous spectrum. For the sparse barriers model we can see this from numerical simulations in [CM]. Moreover, effective analytical methods exist to study time-averaged quantities. Upper bounds for the return probability as $t \to \infty$ without time-averaging are obtained in [K, KR], which is technically difficult.
We shall study the inside and outside time-averaged probabilities defined as
$$P_\psi(n \le M, T) = \sum_{n \le M} a_\psi(n, T) \quad \text{and} \quad P_\psi(n \ge M, T) = \sum_{n \ge M} a_\psi(n, T)$$
respectively. Here $M > 0$ are some numbers which may depend on $T$ (increasing with $T$). The quantity $P_\psi(n \le M, T)$ can be interpreted as the time-averaged probability to find the system inside the interval $[0, M]$, and similarly for outside probabilities. The obtained results are of the form
$$P_\psi(n \ge M_1(T), T) \ge c > 0, \quad P_\psi(n \le M_2(T), T) \ge c > 0, \tag{1.4}$$
or
$$P_\psi(n \ge M_3(T), T) \ge h(T), \quad P_\psi(n \le M_4(T), T) \ge g(T), \tag{1.5}$$
and similarly for the upper bounds, where $M_i(T) \to +\infty$ are some increasing functions, and $h(T)$, $g(T)$ tend to 0 not faster than polynomially. Thus, we control the essential parts of the wavepacket (1.4), as well as polynomially small parts of the wavepacket (1.5) (such bounds for outside probabilities imply lower bounds for the moments of the position operator).
We also consider the more traditional quantities:
$$\langle |X|_\psi^p \rangle(T) = \sum_{n > 0} n^p\,a_\psi(n, T), \quad p > 0,$$
called time-averaged moments of order $p$ of the position operator, as well as their growth exponents $\beta_\psi^\pm(p)$ (both functions non-decreasing in $p$):
$$\beta_\psi^-(p) = \frac{1}{p}\liminf_{T \to \infty}\frac{\log\langle |X|_\psi^p \rangle(T)}{\log T}, \quad p > 0,$$
and similarly for $\beta_\psi^+(p)$. Since
$$\langle |X|_\psi^p \rangle(T) \ge M^p\,P_\psi(n \ge M, T)$$
for any $M > 0$, it is clear that probabilities and moments are related. We shall also study the time-averaged return probability:
$$J_\psi(1/T, \mathbb{R}) = \frac{1}{T}\int_0^\infty dt\,\exp(-t/T)\,|\langle\psi(t), \psi\rangle|^2.$$
Let us present the main results. Assume first that $\psi$ belongs to the subspace of continuous spectrum of $H$. Then due to the RAGE theorem, the system escapes with time (after time-averaging) from any finite interval $[1, M]$ and thus the quantum particle goes to infinity. Since the barriers are very sparse, the picture of motion is rather obvious. If the main part of the wavepacket is far enough from the barriers, $L_{N-1} \ll n \ll L_N$ for some $N$, then the propagation is ballistic (as in the case of the free particle). When the wavepacket reaches a barrier $V(L_N)$ (at time $T$ of order $L_N$) the motion is slowed down and the process of tunneling through the high barrier begins. The time necessary for the essential part of the wavepacket to go through is about $L_N V^2(L_N) = L_N^{1/\eta}$. During this time the main part of the wavepacket is confined in the interval $[1, L_N]$. For $T \gg L_N^{1/\eta}$ most of the wavepacket is on $[L_N + 1, \infty)$ and a new period of ballistic motion begins.
It is clear that given a large value of $T$, it is crucial to locate it with respect to the $L_N$. Thus, throughout the paper, for any $T$ we shall denote by $N$ (depending on $T$, with $N \to \infty$ if $T \to \infty$) the unique value such that $L_N/C \le T < L_{N+1}/C$ with some $C > 1$. We prefer considering $L_N/C \le T < L_{N+1}/C$ rather than $L_N \le T < L_{N+1}$ for the following reason: if $L_N/C \le T \le L_N$, the far tail of the wavepacket is already approaching the barrier $V(L_N)$ and the tunneling begins. For simplicity, we take $C = 4$ (of course, any other value $C > 1$ could be used).
Let $\psi = f(H)\delta_1 \neq 0$, where $f$ is a complex function, $f \in C_0^\infty([-2+\nu, 2-\nu])$ for some $\nu \in (0, 1)$. The operator $f(H)$ is given by the spectral theorem. We shall call these $\psi$ initial states of the first kind.
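The tunneling time $L_N^{1/\eta}$ quoted in the heuristic picture above can be checked directly from the barrier height in (1.3). The short computation below is added for illustration; it assumes $N$ is large enough that $Q(L_N) = 0$, so that $V(L_N) = S(L_N)$.

```latex
\[
V(L_N) = L_N^{\frac{1-\eta}{2\eta}}
\quad\Longrightarrow\quad
L_N V^2(L_N) = L_N \cdot L_N^{\frac{1-\eta}{\eta}}
             = L_N^{\,1 + \frac{1}{\eta} - 1}
             = L_N^{1/\eta}.
\]
```

Since $\eta \in (0,1)$, the exponent $1/\eta$ exceeds 1, so the tunneling window $L_N \lesssim T \lesssim L_N^{1/\eta}$ is nonempty.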
The following bounds are proven. For $T$: $L_N/4 \le T \le 2L_N$,
$$C_1 L_N^{1-1/\eta-\alpha_N} \le P_\psi(n \ge 2L_N, T) \le C_2 L_N^{1-1/\eta}, \tag{1.6}$$
where $\alpha_N \to 0$ as $N \to \infty$ (i.e. as $T \to \infty$). These bounds describe the beginning of tunneling. For $T$: $2L_N \le T \le L_N^{1/\eta}$,
$$C_1 T L_N^{-1/\eta-\alpha_N} \le P_\psi(n \ge T, T) \le P_\psi(n \ge 2L_N, T) \le C_2 T L_N^{-1/\eta}, \quad \alpha_N \to 0. \tag{1.7}$$
These bounds describe the main part of the tunneling process. In particular, for $T$: $L_N/4 \le T \le L_N^{1/\eta}$,
$$P_\psi(n \le 2L_N, T) \ge \|\psi\|^2 - C T L_N^{-1/\eta}. \tag{1.8}$$
Thus, for $T$: $L_N/4 \le T \le cL_N^{1/\eta}$ with $c$ small enough, the main part of the wavepacket is located in $[1, 2L_N]$. Moreover, for $T$: $L_N/4 \le T \le L_N^B$ with some $B > 0$,
$$P_\psi(L_N/4 \le n \le L_N, T) \ge C(B) L_N^{-\alpha_N}. \tag{1.9}$$
For $T$: $L_N/4 \le T \le L_N^{1/\eta}$ the following bounds hold for the time-averaged moments of the position operator:
$$C_1 L_N^{-\alpha_N}\left(L_N^p + T^{p+1}L_N^{-1/\eta}\right) \le \langle |X|_\psi^p \rangle(T) \le C_2\left(L_N^p + T^{p+1}L_N^{-1/\eta}\right). \tag{1.10}$$
The bounds (1.6)–(1.10) are proved in Theorem 3.4 and Theorem 4.3. The upper bound in (1.10) for the moments averaged over the boundary condition $\theta$ (1.2) was proved by Combes and Mantica in [CM] for $p \le 2$. Our result holds for all $p > 0$ and any compact potential $Q$ (in particular, for all boundary conditions).
The next bounds describe the beginning and the end of the ballistic regime. If $L_N^{1/\eta} \le T \le L_N^{1/\eta+\delta}$ or $L_{N+1}^{1-\delta} \le T < L_{N+1}/4$ for some $\delta > 0$, then
$$C L_N^{-\alpha_N} \le P_\psi(n \ge T, T) \le P_\psi(n \ge 2L_N, T) \le \|\psi\|^2, \tag{1.11}$$
and for the moments
$$C_1 T^p L_N^{-\alpha_N} \le \langle |X|_\psi^p \rangle(T) \le C_2 T^p. \tag{1.12}$$
These bounds are proved in Theorem 4.3.
Finally, if $L_N^{1/\eta+\delta} \le T \le L_{N+1}^{1-\delta}$, the motion is exactly ballistic. Namely, for any $\theta > 0$ there exists $\tau > 0$ small enough (independent of $T$) such that
$$\|\psi\|^2 - \theta \le P_\psi(n \ge \tau T, T) \le \|\psi\|^2 \tag{1.13}$$
for $T$ large enough, and
$$C_1 T^p \le \langle |X|_\psi^p \rangle(T) \le C_2 T^p. \tag{1.14}$$
Moreover, for $L_N^{1/\eta} \le T < L_{N+1}/4$,
$$P_\psi(n \le 2L_N, T) \le C L_N^{1/\eta+\alpha_N} T^{-1}. \tag{1.15}$$
The bounds (1.13)–(1.14) are proved in Theorem 4.4, and (1.15) follows from (3.7) of Lemma 3.3.
For the time-averaged return probability, for any $T$ such that $L_N/4 \le T < L_{N+1}/4$ the following bounds hold (Theorem 4.2):
$$\frac{C}{L_N\left(1 + T L_N^{-1/\eta}\right)} \le J_\psi(1/T, \mathbb{R}) \le \frac{C L_N^{\alpha_N}}{L_N\left(1 + T L_N^{-1/\eta}\right)}. \tag{1.16}$$
A related result (Lemma 3.3) states that
$$a_\psi(n, T) = \langle |\psi(t, n)|^2 \rangle(T) \le \frac{C L_N^{\alpha_N}}{L_N\left(1 + T L_N^{-1/\eta}\right)} \tag{1.17}$$
for any $n$.
As a particular corollary of our bounds for the time-averaged moments, we obtain the exact expression for the functions $\beta_\psi^\pm(p)$ (the result for $\beta_\psi^+(p)$ follows also from $\dim_P(\mu_\psi) = 1$ proved in [CM]):
$$\beta_\psi^-(p) = \frac{p+1}{p+1/\eta}, \quad \beta_\psi^+(p) = 1, \quad p > 0. \tag{1.18}$$
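To see how the exponent in (1.18) can arise from the two-sided bound (1.10), one may drop the constants and the factors $L_N^{\alpha_N}$, write $T = L_N^s$ with $1 \le s \le 1/\eta$, and minimise the growth exponent over $s$. The sketch below is our own heuristic illustration, not the paper's proof; the function names and the grid search are assumptions made for the example.

```python
def beta_minus(p, eta):
    """Closed form (1.18) for the lower growth exponent."""
    return (p + 1.0) / (p + 1.0 / eta)


def moment_exponent(s, p, eta):
    """Leading power of L_N in L_N^p + T^(p+1) * L_N^(-1/eta) when T = L_N^s."""
    return max(p, (p + 1.0) * s - 1.0 / eta)


def beta_minus_from_bounds(p, eta, steps=20000):
    """Minimise log<|X|^p> / (p log T) over T = L_N^s, 1 <= s <= 1/eta.

    This is a heuristic grid search: constants and L_N^(alpha_N) factors
    in (1.10) are ignored, which is harmless in the limit N -> infinity.
    """
    lo, hi = 1.0, 1.0 / eta
    best = float("inf")
    for k in range(steps + 1):
        s = lo + (hi - lo) * k / steps
        best = min(best, moment_exponent(s, p, eta) / (p * s))
    return best


if __name__ == "__main__":
    for p in (0.5, 2.0, 10.0):
        print(p, beta_minus(p, 0.5), beta_minus_from_bounds(p, 0.5))
```

The minimum is attained where the two terms in (1.10) balance, at $s = (p + 1/\eta)/(p+1)$. The closed form tends to $\eta$ as $p \to 0$ and to 1 as $p \to \infty$, which is one way to see that $\beta_\psi^-$ is nonconstant.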
Thus, the upper bound for $\beta_\psi^-(p)$, obtained in [CM] for $p \le 2$ and for a.e. boundary conditions, gives in fact the exact expression of $\beta_\psi^-(p)$ for all $p > 0$ and all boundary conditions, as was conjectured in [CM]. The result (1.18) is important from two points of view:
1. This is the first example where a nontrivial (i.e. nonconstant) function $\beta_\psi^-(p)$ is rigorously calculated.
2. It implies (Corollary 4.5) that the restriction of the spectral measure on $(-2, 2)$ is of exact Hausdorff dimension $\eta$. This result holds for all compact potentials $Q$ and thus, in particular, for all boundary conditions $\theta$ in (1.2). This improves the result of [JL], where it was proven only for Lebesgue-a.e. $\theta$.
Consider now more general initial states $\psi$, for example, $\psi = \delta_1$. The problem is that we have no control of the discrete spectrum outside $(-2, 2)$. Thus, it is possible that some part of the wavepacket remains well localized at any time. On the other hand, it is also possible that the part of the wavepacket related to the discrete spectrum moves quasiballistically (the well known example is the one of [DRJLS]). As a consequence, we cannot prove non-trivial upper bounds for the outside probabilities and for the moments, and we cannot prove that all of the wave function escapes from $[1, L_N]$ as $T \gg L_N^{1/\eta}$. However, the part of the wavepacket corresponding to the continuous spectrum (if non-zero) behaves in the same manner. It escapes from any interval $[1, M]$, moves ballistically between the barriers, tunnels through the barriers, etc. Therefore, we are able to prove non-trivial lower bounds for outside probabilities and for the moments.
Consider $\psi = f(H)\delta_1 \neq 0$, where $f$ is some bounded Borel complex function such that for some interval $S = [E_0 - \nu, E_0 + \nu] \subset [-2+\nu, 2-\nu]$, $f$ is $C^\infty$ on $S$ and $|f(x)| \ge c > 0$ on $S$. We call these $\psi$ initial states of the second kind. In particular, the $\psi$ considered previously and $\psi = \delta_1$ verify this condition. For $\psi$ described above, the following bounds hold (proved essentially in Theorem 4.3 and Theorem 4.4):
The first bound in (1.6), and the first and the second bounds in (1.7), remain true. Instead of (1.8) we prove that for some $\delta > 0$ small enough and $T$: $L_N/4 \le T \le \delta L_N^{1/\eta}$,
$$P_\psi(n \le 2L_N, T) \ge c_1 > 0.$$
The bound (1.9) remains true, as well as the first bound in (1.10). The bound (1.11) and the first bound in (1.12) hold (we do not have a priori a ballistic upper bound for the $\psi$ considered, except in the case where $f$ is smooth, in particular, $\psi = \delta_1$). Instead of (1.13), one has the bound $P_\psi(n \ge \tau T, T) \ge c_1 > 0$. The first bound in (1.14) follows. For the time-averaged return probability the lower bound in (1.16) holds (Theorem 4.2). For the functions $\beta_\psi^\pm(p)$, one has the lower bounds
$$\beta_\psi^-(p) \ge \frac{p+1}{p+1/\eta}, \quad \beta_\psi^+(p) \ge 1.$$
One can ask whether the smoothness condition on $f$ is relevant. As for the upper bounds for moments and outside probabilities, it seems essential. Some results, namely, Lemma 2.1, Corollary 2.6, Lemma 3.3 and Theorem 4.2, hold for nonsmooth $f$. Probably, lower bounds for outside probabilities and for the moments (for both kinds of $\psi$) can be proved without smoothness of $f$.
The paper is organized as follows. In Sect. 2 we first prove upper bounds for the transfer matrices with complex energies $T(n, 0; z)$ associated with the equation $Hu = zu$.
With this result we obtain some lower bounds for probabilities and for the moments (Theorem 2.4) using the Parseval formula. The combination of this method with the traditional approach going back to Guarneri allows us to obtain some control of the essential part of the wavepacket (Corollary 2.6) as well as a better lower bound for the time-averaged moments (Corollary 2.7 and Theorem 2.8). The approach of Sect. 2 can be applied to a more general class of models, where the transfer matrix has a non-trivial upper bound like
$$\|T(n, 0; E + i\varepsilon)\| \le g_\Delta(n), \quad E \in \Delta,\ \varepsilon \in (0, 1).$$
Here $\Delta$ is any compact interval in $(-2, 2)$, and the function $g_\Delta(n)$, growing not too fast, does not depend on $E \in \Delta$, $\varepsilon \in (0, 1)$. In particular, $g_\Delta = C(\Delta) n^\alpha$ with some $\alpha > 0$ is possible (Theorem 2.9). This result is applied to the operators with bounded sparse potentials (Proposition 2.10). The bounds of Theorem 2.4 show the importance of the integrals
$$I(\Delta, \varepsilon) = \varepsilon\int_\Delta dE\,\mathrm{Im}^2 F(E + i\varepsilon), \quad \Delta \subset (-2, 2),$$
where $F$ denotes the Borel transform of the spectral measure. Good lower bounds for $I(\Delta, \varepsilon)$ imply better lower bounds for probabilities and thus for the moments. These integrals are closely related to the time-averaged return probabilities and to the correlation dimensions of the spectral measure restricted to $(-2, 2)$.
In Sect. 3, which is specific to the model considered with growing sparse potentials, we obtain upper bounds for inside probabilities (Lemma 3.3) and for outside probabilities and moments (Theorem 3.4). These results are proved for $\psi = f(H)\delta_1$ with $f$ compactly supported on $(-2, 2)$ (and moreover $f \in C_0^\infty$ in Theorem 3.4). When considering the inside probabilities, we obtain some upper bound for $\mathrm{Im}\,F(x + i\varepsilon)$, $x \in (-2, 2)$. It implies a very simple proof of the fact that for any $\delta > 0$, $\nu > 0$ the spectral measure is uniformly $(\eta - \delta)$-Hölder continuous on $[-2+\nu, 2-\nu]$ (a result which follows also from the proofs of [JL]).
In Sect. 4 we first use the upper bounds obtained for outside probabilities to obtain a lower bound for the integrals $I(\Delta, \varepsilon)$ which is virtually optimal (Corollary 4.1). Together with Theorem 2.4, it implies better lower bounds for probabilities and for the moments (which are optimal for $\psi$ of the first kind up to factors like $L_N^{\alpha_N}$, where $\alpha_N \to 0$). It also implies bounds for the time-averaged return probabilities (Theorem 4.2). The upper bounds of Sect. 3 are also used (Theorem 4.4) to control the essential part of the wavepacket on $[1, 2L_N]$ and on $[\tau T, +\infty)$ with some $\tau > 0$. Finally, we show that the upper bounds obtained for the moments imply that the restriction of the spectral measure on $(-2, 2)$ is of exact Hausdorff dimension $\eta$.
2. Direct Lower Bounds for Probabilities and Moments
Define the time-averaged quantities (which we call probabilities) of the form
$$P_\psi(n \ge M, T) = \sum_{n \ge M}\langle |\psi(t, n)|^2 \rangle(T) \equiv \sum_{n \ge M}\frac{1}{T}\int_0^{+\infty} dt\,e^{-t/T}\,|\exp(-itH)\psi(n)|^2$$
and similarly for $P_\psi(n \le M, T)$, $P_\psi(L \le n \le M, T)$, where $M$, $L$ may depend on $T$. We shall call $P_\psi(n \ge M, T)$ outside and $P_\psi(n \le M, T)$ inside probabilities respectively.
Throughout the paper we shall consider two kinds of initial states $\psi$:
1. $\psi = f(H)\delta_1$, where $f \in C_0^\infty([-2+\nu, 2-\nu])$ for some $\nu > 0$ and $f(x_0) \neq 0$ for some $x_0$. We shall call these $\psi$ initial states of the first kind.
2. $\psi = f(H)\delta_1$, where $f: \mathbb{R} \to \mathbb{C}$ is a bounded Borel function such that for some $[E_0 - \nu, E_0 + \nu] \subset [-2+\nu, 2-\nu]$, with $\nu > 0$, $f \in C^\infty([E_0 - \nu, E_0 + \nu])$ and
$$|f(x)| \ge c > 0, \quad x \in [E_0 - \nu, E_0 + \nu]. \tag{2.1}$$
In particular, we can take $\psi = \delta_1$. We shall call these $\psi$ initial states of the second kind. We can observe that any $\psi$ of the first kind belongs to the second kind.
For any $\psi$ we shall denote by $\mu_\psi$ the corresponding spectral measure, and by $\mu \equiv \mu_{\delta_1}$ the measure of the state $\delta_1$ (which is a cyclic vector). Observe that $d\mu_\psi(x) = |f(x)|^2 d\mu(x)$.
Let $\psi$ be any vector and $\mu_\psi$ its spectral measure. For any Borel set $\Delta$ and $\varepsilon > 0$ define the following integrals:
$$J_\psi(\varepsilon, \Delta) = \int_\Delta d\mu_\psi(x)\int_{\mathbb{R}} d\mu_\psi(y)\,R((x - y)/\varepsilon),$$
where $R(w) = 1/(1 + w^2)$. These quantities will play an important role in the sequel. We observe the following identity (which can be easily proved using the spectral theorem):
$$\frac{1}{T}\int_0^\infty dt\,\exp(-t/T)\,|\langle\psi(t), \psi\rangle|^2 = J_\psi(\varepsilon, \mathbb{R}), \quad \varepsilon = 1/T. \tag{2.2}$$
Thus, $J_\psi(\varepsilon, \mathbb{R})$ coincides with the time-averaged return probability. The first statement is of a rather general nature, and holds in fact for any self-adjoint operator $H$.
Lemma 2.1. Let $H$ be some self-adjoint operator in $l^2(\mathbb{N})$ and $\psi$ any vector such that $c_1 = \mu_\psi(\Delta) > 0$, where $\Delta$ is some Borel set. Let $M(T) = c_1^2/(16 J_\psi(T^{-1}, \Delta))$. Then
$$P_\psi(n \ge M(T), T) \ge c_1/2 > 0.$$
Proof. The result follows rather directly from [T] and is obtained using the traditional approach developed by Guarneri-Combes-Last. For the sake of completeness we shall give the main lines of the proof. Define $\rho = X_\Delta\psi$, $\chi = \psi - \rho$, where $X_S$ is the spectral projector of the operator $H$ on the set $S$. One has $\rho \neq 0$ since $\|\rho\|^2 = \mu_\psi(\Delta) = c_1 > 0$. We show [T] that for any $M > 0$,
$$P_\psi(n \ge M, T) \ge \|\rho\|^2 - 2|D(M, T)|,$$
where
$$D(M, T) = \frac{1}{T}\int_0^{+\infty} dt\,\exp(-t/T)\sum_{n < M}\psi(t, n)\overline{\rho(t, n)}. \tag{2.3}$$
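Lemma 2.1 of [T] (with $h(u) = \exp(-u)$, $u > 0$) implies
$$|D(M, T)| \le \int_\Delta d\mu_\psi(x)\,\sqrt{b(x, T)\,S_M(x)}, \tag{2.4}$$
where
$$b(x, T) = \int_{\mathbb{R}} d\mu_\psi(u)\,R(T(x - u)) = \varepsilon\,\mathrm{Im}\,F_{\mu_\psi}(x + i\varepsilon), \quad \varepsilon = \frac{1}{T}, \tag{2.5}$$
$F_{\mu_\psi}$ is the Borel transform of the spectral measure, $S_M(x) = \sum_{n < M}|u_\psi(n, x)|^2$, and $u_\psi(n, x)$ are generalized eigenfunctions associated with the state $\psi$. Since
$$\int_{\mathbb{R}} d\mu_\psi(x)\,|u_\psi(n, x)|^2 \le 1$$
for any $n$, the bound (2.4) and the Cauchy-Schwarz inequality yield
$$|D(M, T)|^2 \le M J_\psi(\varepsilon, \Delta), \quad \varepsilon = 1/T. \tag{2.6}$$
Let us take $M = \|\rho\|^4 (16 J_\psi(\varepsilon, \Delta))^{-1}$. It follows from (2.3), (2.6) that
$$P_\psi(n \ge M, T) \ge \|\rho\|^2/2 = c_1/2 > 0.$$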
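The identity (2.2) between the time-averaged return probability and $J_\psi(\varepsilon, \mathbb{R})$ is easy to test numerically on a toy example. The sketch below assumes a purely atomic spectral measure with finitely many atoms (a simplification chosen for illustration; the measures in the paper are singular continuous on $(-2,2)$), and all function names are hypothetical.

```python
import cmath
import math


def lorentz(w):
    """R(w) = 1 / (1 + w^2), as in the definition of J_psi."""
    return 1.0 / (1.0 + w * w)


def J(atoms, eps):
    """J_psi(eps, R) for a toy atomic measure given as [(energy, weight), ...]."""
    return sum(
        wx * wy * lorentz((x - y) / eps)
        for x, wx in atoms
        for y, wy in atoms
    )


def averaged_return_probability(atoms, T, t_max_factor=40.0, steps=100000):
    """(1/T) * int_0^inf exp(-t/T) |<psi(t), psi>|^2 dt via the trapezoidal rule.

    The integral is truncated at t_max_factor * T, where exp(-t/T) is negligible.
    """
    t_max = t_max_factor * T
    h = t_max / steps
    total = 0.0
    for k in range(steps + 1):
        t = k * h
        # Spectral theorem: the return amplitude is the Fourier transform of the
        # measure (its modulus does not depend on the sign convention).
        amp = sum(w * cmath.exp(-1j * t * x) for x, w in atoms)
        val = math.exp(-t / T) * abs(amp) ** 2
        total += val * (0.5 if k in (0, steps) else 1.0)
    return total * h / T


if __name__ == "__main__":
    toy = [(-1.0, 0.2), (0.3, 0.5), (1.1, 0.3)]
    T = 5.0
    print(averaged_return_probability(toy, T), J(toy, 1.0 / T))
```

The two printed numbers should agree up to quadrature error, because $\frac{1}{T}\int_0^\infty e^{-t/T} e^{-it(x-y)}\,dt = 1/(1 + iT(x-y))$ and the imaginary parts cancel in the symmetric double sum.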
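In the sequel we shall also need the following integrals:
$$I_\psi(\varepsilon, \Delta) = \varepsilon\int_\Delta dE\,\mathrm{Im}^2 F_\psi(E + i\varepsilon) = \varepsilon^3\int_\Delta dE\left(\int_{\mathbb{R}}\frac{d\mu_\psi(u)}{\varepsilon^2 + (E - u)^2}\right)^2,$$
where $\psi$ is some state and $F_\psi$ denotes the Borel transform of its spectral measure. In fact, the integrals $I_\psi(\varepsilon, \Delta)$ and $J_\psi(\varepsilon, \Delta)$ are closely related.
Lemma 2.2. Let $0 < \varepsilon < 1$ and $\Delta = [a, b]$ be some bounded interval. The following uniform estimate holds:
$$J_\psi(\varepsilon, \Delta) \le C(\Delta)\,I_\psi(\varepsilon, \Delta). \tag{2.7}$$
Proof. For simplicity we shall omit the dependence on $\psi$ in the proof. The definition of $I$ implies
$$I(\varepsilon, \Delta) = \varepsilon^3\int_{\mathbb{R}} d\mu(x)\int_{\mathbb{R}} d\mu(u)\int_\Delta\frac{dE}{((u - E)^2 + \varepsilon^2)((x - E)^2 + \varepsilon^2)}.$$
Thus
$$I(\varepsilon, \Delta) \ge \int_\Delta d\mu(x)\int_{\mathbb{R}} d\mu(u)\,f(x, u, \varepsilon), \tag{2.8}$$
where
$$f(x, u, \varepsilon) = \varepsilon^3\int_a^b\frac{dE}{((u - E)^2 + \varepsilon^2)((x - E)^2 + \varepsilon^2)}, \quad \Delta = [a, b].$$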
We change the variable $t = (E - x)/\varepsilon$ in the integral over $E$:
$$f(x, u, \varepsilon) = \int_A^B\frac{dt}{(t^2 + 1)((t + s)^2 + 1)},$$
where $A = (a - x)/\varepsilon$, $B = (b - x)/\varepsilon$, $s = (x - u)/\varepsilon$. Since we integrate in (2.8) over $x \in [a, b]$, and $0 < \varepsilon < 1$, one can easily see that $f(x, u, \varepsilon) \ge c/(s^2 + 1)$ with a uniform positive constant. The bound (2.8) yields
$$I(\varepsilon, \Delta) \ge c\int_\Delta d\mu(x)\int_{\mathbb{R}} d\mu(u)\,R((x - u)/\varepsilon) = cJ(\varepsilon, \Delta). \tag{2.9}$$
As a basis of our further proofs we shall use the following statement.
Lemma 2.3. Let $x \in [-2+\nu, 2-\nu]$ with some $\nu > 0$, $\varepsilon \in [0, 1)$. The following uniform bounds hold under the condition $n\varepsilon \le K$ for some $K > 0$.
a) If $n < L_N$, then
$$\|T(n, 0; x + i\varepsilon)\| \le C(K, \nu) L_N^{\alpha_N}, \quad \alpha_N \to 0. \tag{2.10}$$
b) If $n$: $L_N \le n < L_{N+1}$, then
$$\|T(n, 0; x + i\varepsilon)\| \le C(K, \nu) L_N^{\frac{1-\eta}{2\eta}+\alpha_N}, \quad \alpha_N \to 0. \tag{2.11}$$
Proof. Assume first that $Q(n) \equiv 0$. Then we can easily see that for any $n$: $L_m \le n < L_{m+1}$ with some $m \ge 1$,
$$T(n, 0; z) = T_0(n - L_m, z)\,A(L_m, z)\,T_0(L_m - L_{m-1} + 1, z)\,A(L_{m-1}, z)\cdots A(L_1, z)\,T_0(L_1 - 1, z). \tag{2.12}$$
Here $T_0(k, z) = A_0(z)^k$ is the free transfer matrix with
$$A_0(z) = \begin{pmatrix} z & -1 \\ 1 & 0 \end{pmatrix} \tag{2.13}$$
and
$$A(L_k, z) = \begin{pmatrix} z - L_k^{\frac{1-\eta}{2\eta}} & -1 \\ 1 & 0 \end{pmatrix}.$$
For real $x \in [-2+\nu, 2-\nu]$ one can show by direct calculations that $\|T_0(k, x)\| \le C$ uniformly in $x$, $k$. For complex $z = x + i\varepsilon$ we can write $A_0(z) = A_0(x) + i\varepsilon D$ with
$$D = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}.$$
Developing $A_0(z)^k$, one sees that we still have
$$\|T_0(k, z)\| = \|A_0(z)^k\| \le C \tag{2.14}$$
while $k\varepsilon \le K$. As to $A(L_k, z)$, it can be bounded by
$$\|A(L_k, z)\| \le C L_k^{\frac{1-\eta}{2\eta}}, \tag{2.15}$$
since $x \in [-2+\nu, 2-\nu]$, $\varepsilon \in [0, 1)$. The statement of the lemma follows directly from the bounds (2.12)–(2.15) and the sparseness condition:
$$L_1 L_2 \cdots L_m \equiv L_{m+1}^{\nu_{m+1}}, \quad \nu_m \to 0.$$
For more details see the similar proof in [JL]. If we add the finite range perturbation $Q(n)$, it is clear that the norms $\|T(n, 0; z)\|$ remain bounded by the same expressions (2.10), (2.11) with different constants.
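The product structure (2.12) and the bounds (2.14), (2.15) can be explored numerically. The following minimal sketch multiplies the one-step matrices directly and tracks the Frobenius norm, which dominates the operator norm and is used here only because it needs no linear algebra library. The function names, the sample energy $z$ and the single barrier at site 100 are assumptions made for the example.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices stored as ((a11, a12), (a21, a22))."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def step_matrix(z, v=0.0):
    """One-step transfer matrix for H u = z u at a site with potential v."""
    return ((z - v, -1.0), (1.0, 0.0))


def frobenius(a):
    """Frobenius norm of a 2x2 (possibly complex) matrix."""
    return math.sqrt(sum(abs(x) ** 2 for row in a for x in row))


def transfer_norms(z, potential, n_max):
    """Frobenius norms of the products of step matrices over sites 1..n."""
    t = ((1.0, 0.0), (0.0, 1.0))
    norms = []
    for n in range(1, n_max + 1):
        t = matmul(step_matrix(z, potential(n)), t)
        norms.append(frobenius(t))
    return norms


if __name__ == "__main__":
    eta = 0.5
    barrier_site = 100  # hypothetical single barrier, height L^((1-eta)/(2*eta))
    height = barrier_site ** ((1.0 - eta) / (2.0 * eta))

    free = transfer_norms(1.0 + 0.0005j, lambda n: 0.0, 2000)
    sparse = transfer_norms(
        1.0, lambda n: height if n == barrier_site else 0.0, 400
    )
    print("free, n*eps <= 1:", max(free))
    print("before barrier:", max(sparse[: barrier_site - 1]))
    print("after barrier:", max(sparse))
```

In the free case the norms stay bounded as long as $n\varepsilon \le K$, in line with (2.14); a single barrier can raise the norm by at most a factor of order the barrier height, in line with (2.15).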
In the next statement we shall use the notation $I(\varepsilon, \Delta) = I_{\delta_1}(\varepsilon, \Delta)$. In all statements of the paper $\alpha_N$ denotes sequences such that $\lim \alpha_N = 0$ (not necessarily the same).
Theorem 2.4. Assume that $\psi$ is of the second kind (in particular, $\psi = \delta_1$). Let $\Delta = [E_0 - \nu/2, E_0 + \nu/2]$, where $\nu$ comes from (2.1).
1. Let $L_N \le T < L_{N+1}/4$ for some $N$. The following bound, uniform in $T$, holds:
$$P_\psi(n \ge T, T) \ge cT L_N^{\frac{\eta-1}{\eta}-\alpha_N}\left(I(1/T, \Delta) + \frac{1}{T}\right) \ge cL_N^{\frac{\eta-1}{\eta}-\alpha_N}. \tag{2.16}$$
2. Let $L_N/4 \le T \le 4L_N$. Then
$$P_\psi(n > L_N, T) \ge cL_N^{2-\frac{1}{\eta}-\alpha_N}\left(I(1/T, \Delta) + \frac{1}{T}\right) \ge cL_N^{\frac{\eta-1}{\eta}-\alpha_N}. \tag{2.17}$$
3. Let $L_N/4 \le T \le L_N^B$ with some $B > 1$. Then the following uniform bound holds:
$$P_\psi(L_N/4 \le n \le L_N, T) \ge c_B L_N^{1-\alpha_N}\left(I(1/T, \Delta) + \frac{1}{T}\right) \ge c_B L_N^{1-\alpha_N} T^{-1}. \tag{2.18}$$
In all bounds (2.16)–(2.18), $c > 0$ and $\lim_{N\to\infty}\alpha_N = 0$.
Proof. We shall follow the ideas of [DT]. The starting point is the Parseval formula:
$$\langle |\psi(t, n)|^2 \rangle(T) = \frac{\varepsilon}{\pi}\int_{\mathbb{R}} dE\,|(R(E + i\varepsilon)\psi)(n)|^2, \quad \varepsilon = (2T)^{-1}, \tag{2.19}$$
where $R(z) = (H - zI)^{-1}$.
a) We begin with $\psi = \delta_1$. Let $u(n, z) = (R(z)\delta_1)(n)$. It is well known [KKL] that
$$(u(n+1, z), u(n, z))^T = T(n, 0, z)\,(F(z), -1)^T, \quad n \ge 0, \tag{2.20}$$
where $T(n, 0, z)$ is the transfer matrix associated with the equation $Hu = zu$ and $F$ is the Borel transform of the spectral measure, $F(z) = \int_{\mathbb{R}}\frac{d\mu_{\delta_1}(x)}{x - z}$. Let $E \in [-2+\delta, 2-\delta]$ with some $\delta \in (0, 1)$, $z = E + i\varepsilon$, $\varepsilon = (2T)^{-1}$. Assume first that $L_N \le T \le L_{N+1}/4$. The bound (2.11) of Lemma 2.3 and (2.20) imply (since $\|T\| = \|T^{-1}\|$) for any $L_N \le n \le 2T$,
$$|u(n+1, z)|^2 + |u(n, z)|^2 \ge \|T(n, 0, z)\|^{-2}\left(|F(z)|^2 + 1\right) \ge a(\delta) L_N^{\frac{\eta-1}{\eta}-2\alpha_N}\left(\mathrm{Im}^2 F(z) + 1\right), \tag{2.21}$$
where $\alpha_N \to 0$. Summation in (2.21) over $n: T \le n \le 2T$ and integration over $E \in [-2+\delta, 2-\delta]$ in (2.19) yields (2.16) with $\Delta = [-2+\delta, 2-\delta]$. We have used a simple bound $I(u/2,\Delta) \ge \frac{1}{8} I(u,\Delta)$, which directly follows from the definition of the integrals $I$. If $L_N/4 \le T \le 4L_N$, one considers $n: 2L_N \le n \le 3L_N$ to get (2.17). The bound (2.18) is proved in a similar manner using the bound (2.10) of Lemma 2.3 and summing over $n: L_N/4 \le n < L_N - 1$.

b) Assume now that $\psi$ is such that $\psi = g(H)\delta_1$, $g \in C_0^\infty(S)$, where $S = [E_0-\nu, E_0+\nu] \subset [-2+\nu, 2-\nu]$ for some $\nu \in (0,1)$. Assume also that $g(x) \equiv 1$ for $x \in [E_0 - 3\nu/4, E_0 + 3\nu/4]$. Consider the decomposition $\delta_1 = \psi + \chi$, $\psi = g(H)\delta_1$, $\chi = (1 - g(H))\delta_1$. Let $L_N \le T \le L_{N+1}/4$. Since $|R(z)\psi(n)|^2 \ge \frac{1}{2}|R(z)\delta_1(n)|^2 - |R(z)\chi(n)|^2$, integration over $\Delta = [E_0-\nu/2, E_0+\nu/2]$ and summation over $n: T \le n \le 2T$ yields (using the proof of part a):
$$ P_\psi(n \ge T, T) \ge cT L_N^{\frac{\eta-1}{\eta}-\alpha_N} \big( I(1/(2T),\Delta) + 1/T \big) - \frac{c}{T} \sum_{n \ge T} \int_\Delta dE\, |R(E+i\varepsilon)\chi(n)|^2. \tag{2.22} $$
To bound from above $|R(E+i\varepsilon)\chi(n)|$, $E \in \Delta$, we shall now use the following result from [GK]:
$$ |(u(H)\delta_m)(n)| \le C_k |||u|||_{k+2} (1 + |n-m|^2)^{-k/2}, \tag{2.23} $$
for any integer $k > 0$, where $u$ is some smooth complex function,
$$ |||u|||_k = \sum_{r=0}^{k} \int_{\mathbb{R}} dx\, |u^{(r)}(x)| (1+|x|^2)^{(r-1)/2}, $$
and the constants in (2.23) are independent of $u$ and $H$. Although the result of [GK] is stated in the continuous case, we can easily see that the result holds in the discrete case for any self-adjoint operator $H$. We shall take
$$ u_{E+i\varepsilon}(x) = \frac{\chi(x)}{x - E - i\varepsilon}, $$
where $\chi(x) = 1 - g(x)$, and $z = E + i\varepsilon$ is considered as a parameter. Thus, $R(E+i\varepsilon)\chi(n) = (u_{E+i\varepsilon}(H)\delta_1)(n)$. The definition of $g$ implies that $\chi(x) = 0$ for any $x \in [E_0 - 3\nu/4, E_0 + 3\nu/4]$. One can easily show that $|||u_{E+i\varepsilon}|||_k \le C(k,\nu)$ for any $k$ and any $E \in \Delta$, $\varepsilon > 0$ with uniform constants. Thus, (2.23) implies $|R(E+i\varepsilon)\chi(n)| \le C(k) n^{-k}$ and
$$ \sum_{n \ge T} |R(E+i\varepsilon)\chi(n)|^2 \le C(k) T^{-k}. \tag{2.24} $$
Taking k large enough, we see that (2.22), (2.24) imply the same bound (2.16), since T ≥ LN and thus the integral in (2.22) is small with respect to the first term. The bounds (2.17) and (2.18) can be proved in the same manner. c) Now let ψ be any vector of the second kind. Let g be some function verifying
conditions of part b), that is, $g \in C_0^\infty(S)$, $S \equiv [E_0-\nu, E_0+\nu]$, $g(x) = 1$ for $x \in [E_0-3\nu/4, E_0+3\nu/4]$. We can write $g(x) = l(x)f(x)$, where $l(x) = 0$ if $|x - E_0| > \nu$ and $l(x) = g(x)/f(x)$ if $|x - E_0| \le \nu$. The facts that $f \in C^\infty(S)$, $g \in C_0^\infty(S)$ and $|f(x)| \ge c > 0$ on $S$ imply that $l \in C_0^\infty(S)$. Again, due to (2.23), the kernel of $l(H)$ is fast decaying in $|n-m|$, so that for any $k > 0$,
$$ |R(E+i\varepsilon)g(H)\delta_1(n)|^2 \le \sum_m \frac{C_k}{1+|n-m|^k} |R(E+i\varepsilon)f(H)\delta_1(m)|^2. $$
Therefore, for any $L > 0$,
$$ A(2L,T) \equiv \frac{1}{T} \sum_{n \ge 2L} \int dE\, |R(E+i\varepsilon)g(H)\delta_1(n)|^2 \le \frac{1}{T} \int dE \sum_m h_k(m,T) |R(E+i\varepsilon)f(H)\delta_1(m)|^2, $$
where
$$ h_k(m,T) = \sum_{n \ge 2L} \frac{C_k}{1+|n-m|^k}. $$
Let us split the sum over $m$ into two with $m < L$ and $m \ge L$. We observe that $h_k(m,T) \le C_k L^{1-k}$ in the first case and we use a trivial bound $h_k(m,T) \le C_k$ in the second case. Thus, we get
$$ A(2L,T) \le C_k L^{1-k} + \frac{C_k}{T} \int dE \sum_{m \ge L} |R(E+i\varepsilon)f(H)\delta_1(m)|^2, \tag{2.25} $$
where we used the fact that
$$ \sum_m \int_{\mathbb{R}} dE\, |R(E+i\varepsilon)\psi(m)|^2 = \frac{\pi}{\varepsilon} \|\psi\|^2. $$
Let us assume first that $L_N \le T \le L_{N+1}/4$. One can easily see from the proofs of part b) that the quantity $A(2T,T)$ is bounded from below by the r.h.s. of (2.16) (only the constant changes). Taking $k > 1/\eta$, using (2.25) and Parseval equality, we get (2.16) for $\psi = f(H)\delta_1$. For (2.17) the proof is similar with $L = 2L_N$. To prove (2.18), we consider
$$ A(T) \equiv \frac{1}{T} \sum_{L_N/2 \le n \le 3L_N/4} \int dE\, |R(E+i\varepsilon)g(H)\delta_1(n)|^2 \le \frac{1}{T} \int dE \sum_m h_k(m) |R(E+i\varepsilon)f(H)\delta_1(m)|^2 \tag{2.26} $$
with $h_k(m) = \sum_{L_N/2 \le n \le 3L_N/4} C_k (1+|n-m|^k)^{-1}$. Splitting the sum over $m$ in (2.26) into three with $m < L_N/4$, $m > L_N$ and $L_N/4 \le m \le L_N$, we show that the first two are bounded from above by $C_k L_N^{1-k}$ and the third by
$$ \frac{C}{T} \sum_{L_N/4 \le m \le L_N} \int dE\, |R(E+i\varepsilon)f(H)\delta_1(m)|^2. $$
On the other hand, $A(T)$ is bounded from below by the r.h.s. of (2.18) (the proof is identical to the one of part b), only the constants change). Since $T \le L_N^B$, taking $k$ large enough we get the bound (2.18) for $\psi = f(H)\delta_1$.

Corollary 2.5. Let $\Delta = [-2+\nu, 2-\nu]$ with some $\nu > 0$. Let $\varepsilon > 0$ and $N$ be such that $L_N/4 \le T \equiv 1/\varepsilon < L_{N+1}/4$. The following estimate holds:
$$ J(\varepsilon,\Delta) \le C I(\varepsilon,\Delta) \le \frac{C L_N^{\alpha_N}}{L_N + T L_N^{\frac{\eta-1}{\eta}}} \tag{2.27} $$
with constants uniform in $T$ and $\lim \alpha_N = 0$. Here the integrals $J, I$ correspond to $\psi = \delta_1$.

Proof. We shall use the bounds of Theorem 2.4 for $\psi = \delta_1$. In this case, as it follows from part a) of the proof, (2.16), (2.17), (2.18) hold with $\Delta = [-2+\nu, 2-\nu]$. Moreover, (2.18) holds for all $L_N/4 \le T \le L_{N+1}/4$ without restriction. On the other hand, all the quantities $P_\psi(n \ge T, T)$, $P_\psi(n \ge 2L_N, T)$, $P_\psi(L_N/4 \le n \le L_N, T)$ are bounded from above by 1. We thus obtain the last inequality in (2.27). The first inequality is that of Lemma 2.2.
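Corollary 2.6. Let $\psi$ be any vector of the second kind (but $f$ is not necessarily smooth on $\Delta = [E_0-\nu, E_0+\nu]$). Then
$$ P_\psi(n \ge M(T), T) \ge c > 0 \quad \text{for} \quad M(T) = C L_N^{-\alpha_N} \Big( L_N + T L_N^{\frac{\eta-1}{\eta}} \Big), \tag{2.28} $$
where again $L_N/4 \le T < L_{N+1}/4$ and $\alpha_N \to 0$.

Proof. Since $|f(x)| \ge c > 0$ for $x \in \Delta$ and $\Delta \subset (-2,2) \subset \sigma(H)$, it is clear that $\mu_\psi(\Delta) \ge c^2 \mu(\Delta) > 0$. On the other hand, since $f$ is bounded, by (2.27),
$$ J_\psi(\varepsilon,\Delta) \le C J(\varepsilon,\Delta) \le C \frac{L_N^{\alpha_N}}{L_N + T L_N^{\frac{\eta-1}{\eta}}}. \tag{2.29} $$
The result now follows from (2.29) and Lemma 2.1.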
Generally speaking, to obtain a better lower bound for $M(T)$, one should better estimate from above the integrals $J(\varepsilon,\Delta)$. Similarly, to get better lower bounds for probabilities (Theorem 2.4), one should bound from below the integrals $I(\varepsilon,\Delta)$. These quantities are both closely related to the correlation dimensions $D^{\pm}(2)$ [T] of the spectral measure restricted to $\Delta$. To get good bounds for $I, J$, we need rather good knowledge of the fine structure of the spectral measure. In Sect. 4 we shall use the upper bound obtained for the outside probabilities to obtain optimal lower bounds for $I(\varepsilon,\Delta)$. The idea is the following: upper bound on outside probabilities ⇒ upper bound on $M(T)$ such that $P_\psi(n \ge M(T), T) \ge c > 0$ ⇒ lower bound on $J$ ⇒ lower bound on $I$. This method, however, is specific to the considered model with unbounded sparse potentials.

Consider now applications of the results obtained for probabilities to the time-averaged moments of the position operator:
$$ \langle |X|^p_\psi \rangle (T) \equiv \sum_n |n|^p \langle |\psi(t,n)|^2 \rangle (T), \quad p > 0. $$
An immediate consequence of Lemma 2.1 and Theorem 2.4 is the following.
Corollary 2.7. Let $\psi$ be of the second kind, $p > 0$, $T: L_N/4 \le T < L_{N+1}/4$ for some $N$. The bounds hold:
$$ \langle |X|^p_\psi \rangle (T) \ge C J(\varepsilon,\Delta)^{-p} + C \Big( L_N^{p+1-\alpha_N} + T^{p+1} L_N^{\frac{\eta-1}{\eta}-\alpha_N} \Big) \big( I(\varepsilon,\Delta) + 1/T \big) \tag{2.30} $$
$$ \ge C \Big( L_N + T L_N^{\frac{\eta-1}{\eta}} \Big)^p L_N^{-p\alpha_N} + C T^p L_N^{\frac{\eta-1}{\eta}-\alpha_N}, \tag{2.31} $$
where $\varepsilon = 1/T$ and $\alpha_N \to 0$.

Proof. We observe that $\langle |X|^p_\psi \rangle (T) \ge M^p P_\psi(n \ge M, T)$ for any $M, T$. The bound (2.30) now follows directly from Lemma 2.1, Theorem 2.4 and Lemma 2.2. The bound (2.31) follows directly from (2.30) and (2.29) (since $T \ge L_N$, the term with $L_N^{p+1}/T$ is smaller than $L_N^p$, so it is not retained in (2.31)).
What is interesting is the following observation: even if we have no additional information about the integrals $I, J$, we can obtain a bound better than (2.31) by optimizing (2.30) as a sum of two related terms.

Theorem 2.8. Let $\psi$ be of the second kind. Let $p > 0$, $T: L_N/4 \le T < L_{N+1}/4$. The estimate uniform in $T$ holds:
$$ \langle |X|^p_\psi \rangle (T) \ge C L_N^{-\alpha_N} \Big( L_N^p + T^p L_N^{\frac{\eta-1}{\eta} \cdot \frac{p}{p+1}} \Big), \tag{2.32} $$
where $\alpha_N \to 0$. In particular,
$$ \beta_\psi^-(p) \ge \frac{p+1}{p+1/\eta}, \quad \beta_\psi^+(p) \ge 1. \tag{2.33} $$

Proof. The bound (2.30) of Corollary 2.7 and Lemma 2.2 imply
$$ \langle |X|^p_\psi \rangle (T) \ge C \Big( z^{-p} + L_N^{-\alpha_N} \big( L_N^{p+1} + T^{p+1} L_N^{\frac{\eta-1}{\eta}} \big) z \Big), $$
where $z = I(\varepsilon,\Delta)$. The function $f(z) = z^{-p} + Kz$, $z > 0$, is bounded from below by $c(p) K^{\frac{p}{p+1}}$. The bound (2.32) follows. To prove the second statement, define $s = \frac{p(1-\eta)}{(p+1)\eta}$. Considering $L_N/4 \le T \le L_N^{\frac{p+s}{p}}$ and $L_{N+1}/4 > T \ge L_N^{\frac{p+s}{p}}$, one can easily see from (2.32) that in both cases
$$ \langle |X|^p_\psi \rangle (T) \ge c L_N^{-\alpha_N} T^{\frac{p^2}{p+s}} \ge c T^{\frac{p^2}{p+s}-\alpha_N}. $$
The first bound of (2.33) follows. To see that $\beta^+(p) \ge 1$ for any $p > 0$, it is sufficient to take the sequence $T_N = L_N$ in (2.32).
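The optimization step in the proof of Theorem 2.8 can be made concrete. The sketch below is an illustration under stated assumptions rather than part of the original proof: the explicit minimizer $z^* = (p/K)^{1/(p+1)}$ and the constant $c(p) = (p+1)p^{-p/(p+1)}$ are obtained here by elementary calculus and are not quoted from the paper, and all function names are hypothetical. The last helper evaluates the resulting exponent bound $(p+1)/(p+1/\eta)$ from (2.33).

```python
def tradeoff(z, p, K):
    """The function f(z) = z^(-p) + K*z from the proof of Theorem 2.8."""
    return z ** (-p) + K * z


def min_tradeoff(p, K):
    """Minimizer and minimum of f on z > 0 (derived by setting f'(z) = 0)."""
    z_star = (p / K) ** (1.0 / (p + 1.0))
    return z_star, tradeoff(z_star, p, K)


def c_of_p(p):
    """Assumed explicit constant with min f = c(p) * K^(p/(p+1))."""
    return (p + 1.0) * p ** (-p / (p + 1.0))


def beta_minus_lower(p, eta):
    """Lower bound (p+1)/(p+1/eta) for the lower exponent in (2.33)."""
    return (p + 1.0) / (p + 1.0 / eta)


if __name__ == "__main__":
    p, K = 2.0, 5.0
    z_star, value = min_tradeoff(p, K)
    print(z_star, value, c_of_p(p) * K ** (p / (p + 1.0)))
    for eta in (0.25, 0.5, 0.75, 1.0):
        print(eta, beta_minus_lower(p, eta))
```

With $s = p(1-\eta)/((p+1)\eta)$ as in the proof, the ratio $p/(p+s)$ coincides with `beta_minus_lower(p, eta)`, which is how the first bound of (2.33) arises from the exponent $p^2/(p+s)$.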
Remark 1. A priori we don’t have upper bounds for the moments. However, if ψ is such that the ballistic upper bound holds, then βψ+ (p) = 1 for any p.
Remark 2. In a somewhat paradoxical manner, one can obtain better lower bounds for the moments if one has good upper bounds. This can be done in the following way. Assume that $\langle |X|^r_\psi \rangle (T) \le h_r(T)$, $r > 0$, with some nontrivial $h_r(T)$ (that is, better than ballistic). Then the bound (2.30) implies some nontrivial lower bound $J \ge A(r,\varepsilon)$ and upper bound $I \le B(r,\varepsilon)$. The result of Lemma 2.2 yields $I \ge CA(r,\varepsilon)$ and $J \le CB(r,\varepsilon)$. These two bounds (with any values $r = r_1$ and $r = r_2$ respectively) can be inserted into (2.30). Finally, one can optimize the bound obtained (for a given $p > 0$) by choosing appropriate values of $r_1, r_2$. Most probably, one should take $r_1$ small and $r_2$ large.

The methods developed in this section, as mentioned in the Introduction, can be applied to more general models. For example, one can prove the following statement.

Theorem 2.9. Let $\psi$ be of the second kind (in particular, $\psi = \delta_1$). Let $H$ be such that the corresponding transfer matrix verifies the condition
$$ \|T(n,0;E+i\varepsilon)\| \le C n^{\alpha}, \quad \alpha > 0, \tag{2.34} $$
for any $E \in [E_0-\nu, E_0+\nu]$, $\varepsilon \in (0,1)$ and $n$ such that $n\varepsilon \le K$, $K > 0$. For any $T$ the bounds hold:
$$ P_\psi(n \ge KT, T) \ge c T^{1-2\alpha} \big( I(1/T,\Delta) + 1/T \big) \ge c T^{-2\alpha}, \tag{2.35} $$
$$ \langle |X|^p_\psi \rangle (T) \ge C I^{-p}(1/T,\Delta) + T^{p+1-2\alpha} I(1/T,\Delta) \ge C(p) T^{p - 2p\alpha/(p+1)}, \tag{2.36} $$
where $\Delta = [E_0-\nu/2, E_0+\nu/2]$. Thus,
$$ \beta_\psi^-(p) \ge 1 - \frac{2\alpha}{p+1} $$
(this bound is non-trivial only for $p > 2\alpha - 1$).

Proof. The bound (2.35) is obtained following the proof of Theorem 2.4. The condition (2.34) implies [GKT] that $\mu_\psi(\Delta) > 0$ and one can apply Lemma 2.1. The first inequality in (2.36) follows from the proof of Corollary 2.7, and the second from the proof of Theorem 2.8.
This result can be applied, in particular, to the operators with bounded sparse potentials considered in [Z, GKT]:
$$ V(n) = \sum_{N=1}^{\infty} h_N \delta_{L_N,n}, $$
where $|h_N| \le a$ for all $N$ and $L_N \ge c\gamma^N$ for some $\gamma > 1$. Let $\psi$ be of the second kind. Define
$$ B = \sup_{E \in [E_0-\nu, E_0+\nu],\ n \ge 1} \|T_0(n,0;E)\|. $$

Proposition 2.10. The lower bound holds:
$$ \beta_\psi^-(p) \ge 1 - \frac{2\alpha}{p+1}, \tag{2.37} $$
where $\alpha = \frac{\log(B(a+3))}{\log \gamma}$.
Proof. First, we observe [S2] that $\|T_0(n,0;E+i\varepsilon)\| \le B\exp(KB) \equiv B(K)$ for all $E \in [E_0-\nu, E_0+\nu]$, $n: n\varepsilon \le K$. Assume that $n: L_N \le n < L_{N+1}$ for some $N$, $n\varepsilon \le K$. Then, following the proof of Lemma 2.3 (see also [GKT]), one shows that
$$ \|T(n,0;E+i\varepsilon)\| \le B^{N+1}(K) \prod_{j=1}^{N} (|h_j| + 3) \le D n^{\alpha(K)}, \tag{2.38} $$
where
$$ \alpha(K) = \frac{\log(B(K)(a+3))}{\log \gamma}. $$
The statement of Theorem 2.9 yields
$$ \beta_\psi^-(p) \ge 1 - \frac{2\alpha(K)}{p+1} $$
for any $K > 0$. Letting $K$ tend to 0, we get (2.37).
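To get a feeling for the size of the bound (2.37), one can evaluate it next to the two weaker bounds $1 - 2\alpha/p$ and $1 - 2\alpha$ with which it is compared in the text. The snippet is a small sketch only: the function names and the sample values of $B$, $a$, $\gamma$ and $p$ are assumptions chosen for illustration, not data from the paper.

```python
import math


def sparse_alpha(B, a, gamma):
    """alpha = log(B*(a+3)) / log(gamma), as in Proposition 2.10."""
    return math.log(B * (a + 3.0)) / math.log(gamma)


def beta_lower_new(alpha, p):
    """The bound (2.37): 1 - 2*alpha/(p+1)."""
    return 1.0 - 2.0 * alpha / (p + 1.0)


def beta_lower_gkt(alpha, p):
    """The bound 1 - 2*alpha/p attributed in the text to [GKT]."""
    return 1.0 - 2.0 * alpha / p


def beta_lower_jl(alpha):
    """The bound 1 - 2*alpha that the text derives from [JL, L] (alpha < 1/2)."""
    return 1.0 - 2.0 * alpha


if __name__ == "__main__":
    alpha = sparse_alpha(B=2.0, a=1.0, gamma=4096.0)  # illustrative values
    for p in (0.5, 1.0, 2.0, 4.0):
        print(p, beta_lower_new(alpha, p), beta_lower_gkt(alpha, p), beta_lower_jl(alpha))
```

For every $p > 0$ and $\alpha > 0$ the first value is the largest of the three, since $2\alpha/(p+1)$ is smaller than both $2\alpha/p$ and $2\alpha$.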
The bound (2.37) improves both the result of [GKT], $\beta_\psi^-(p) \ge 1 - 2\alpha/p$, and the dynamical bound which follows from [JL, L] in the case of $\alpha < 1/2$: $\beta_\psi^-(p) \ge 1 - 2\alpha$.

3. Dynamical Upper Bounds

In this section we shall establish some upper bounds for the inside and outside probabilities and the moments. It is clear that one cannot consider the same class of initial states $\psi$ as in the previous section. The problem is that we do not have dynamical control of the possible pure point spectrum outside $(-2,2)$. Thus, we shall consider only $\psi = f(H)\delta_1$ such that $\mathrm{supp} f \subset (-2,2)$. Moreover, to control the decay at infinity (when considering outside probabilities), we shall assume that the function $f$ is infinitely smooth (recall that we call these $\psi$ initial states of the first kind).

We begin with the inside probabilities. Let $\psi = f(H)\delta_1$, where $f$ is a bounded Borel function such that $\mathrm{supp} f \subset \Delta = [-2+\nu, 2-\nu]$ for some $\nu > 0$. Following the proof of Lemma 2.1, one can show that for any $K, M > 0$,
$$ \sum_{n=K}^{K+M} \langle |\psi(t,n)|^2 \rangle (T) \le C M J(\varepsilon,\Delta) \le C \frac{M L_N^{\alpha_N}}{L_N + T L_N^{\frac{\eta-1}{\eta}}}. \tag{3.1} $$
In fact, a slightly better result can be obtained using the upper bound for the imaginary part of the Borel transform of the spectral measure. Such a bound is of independent interest since it provides an upper bound for the measure of intervals and thus a lower bound for Hausdorff and packing dimensions of the spectral measure.

Lemma 3.1. Let $\mu$ be the spectral measure of the state $\psi = \delta_1$ and $F(z)$ its Borel transform. For any $\nu \in (0,1)$ there exists a constant $C(\nu)$ such that for all $x \in [-2+\nu, 2-\nu]$ and $\varepsilon: \frac{4}{L_{N+1}} < \varepsilon \le \frac{4}{L_N}$ the bound holds:
$$ \frac{1}{2\varepsilon} \mu([x-\varepsilon, x+\varepsilon]) \le \mathrm{Im} F(x+i\varepsilon) \le C(\nu) L_N^{\alpha_N} \Big( \varepsilon L_N + L_N^{\frac{\eta-1}{\eta}} \Big)^{-1}, \tag{3.2} $$
where $\alpha_N \to 0$.
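Proof. It is well known that
$$ \mathrm{Im} F(z) = \mathrm{Im} z\, \|R(z)\delta_1\|^2 = \mathrm{Im} z \sum_{n=1}^{\infty} |u(n,z)|^2, $$
where $F(z)$ is the Borel transform of $\mu$. The first inequality in (2.21) implies
$$ \mathrm{Im} F(z) \ge c\, \mathrm{Im} z\, (\mathrm{Im}^2 F(z) + 1) \sum_{n=1}^{\infty} \|T(n,0,z)\|^{-2}. \tag{3.3} $$
Let $x \in [-2+\nu, 2-\nu]$, $\varepsilon \in (4/L_{N+1}, 4/L_N]$, $z = x + i\varepsilon$. We can sum over $n: 1 \le n < L_N$ and over $n: L_N \le n \le K/\varepsilon$ with suitable $K$ ($K = 8$ for $1/(2L_N) \le \varepsilon \le 4/L_N$ and $K = 1$ for $\varepsilon < 1/(2L_N)$, for example) using the upper bounds for $\|T\|$ of Lemma 2.3. Thus, we obtain from (3.3):
$$ \mathrm{Im} F(z) \ge C(\nu)\, \varepsilon\, \mathrm{Im}^2 F(z)\, L_N^{-2\alpha_N} \Big( L_N + \varepsilon^{-1} L_N^{\frac{\eta-1}{\eta}} \Big). $$
Since $\mathrm{Im} F(x+i\varepsilon) \ge \frac{1}{2\varepsilon} \mu([x-\varepsilon, x+\varepsilon])$, the result follows.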
Remark. The proof is rather simple because we have from the very beginning the upper bound for $\|T(n,0,z)\|$ for complex $z$. In most applications, however, one has such bounds only for real $z$, and one should proceed in a more complicated way using the Jitomirskaya-Last method [JL].

As a first direct consequence of this result, one can obtain the already known upper bounds (2.27) on $I(\varepsilon,\Delta)$, $J(\varepsilon,\Delta)$. Indeed, for $\Delta = [-2+\nu, 2-\nu]$,
$$ J(\varepsilon,\Delta) = \int_\Delta d\mu(x)\, b(x,T), $$
where
$$ b(x,T) = \varepsilon\, \mathrm{Im} F(x+i\varepsilon) \le C(\nu)\, \varepsilon L_N^{\alpha_N} \Big( \varepsilon L_N + L_N^{\frac{\eta-1}{\eta}} \Big)^{-1} \tag{3.4} $$
due to (3.2). The bound for $J(\varepsilon,\Delta)$ follows. Next,
$$ I(\varepsilon,\Delta) = \varepsilon \int_\Delta dE\, \mathrm{Im}^2 F(E+i\varepsilon) = \int_\Delta dE\, b(E,T)\, \mathrm{Im} F(E+i\varepsilon). $$
The bound (3.4) and
$$ \int_{\mathbb{R}} dE\, \mathrm{Im} F(E+i\varepsilon) = \pi \mu(\mathbb{R}) = \pi $$
imply the bound for $I(\varepsilon,\Delta)$.

Before stating the next corollary, let us recall the definition of the lower and upper Hausdorff dimension of a Borel measure:
$$ \dim_*(\mu) = \inf\{\dim(S) \mid \mu(S) > 0\}, \quad \dim^*(\mu) = \inf\{\dim(S) \mid \mu(S) = \mu(\mathbb{R})\}, $$
where $\dim(S)$ denotes the Hausdorff dimension of the set $S$. Thus, the measure gives zero weight to any set $S$ with $\dim(S) < \dim_*(\mu)$ and for any $\varepsilon > 0$ is supported by some set $S$ with $\dim(S) < \dim^*(\mu) + \varepsilon$. The measure is of exact Hausdorff dimension if $\dim_*(\mu) = \dim^*(\mu)$. It is known (see [T] for the references) that
$$ \dim_*(\mu) = \mu\text{-}\mathrm{essinf}\, \gamma^-(x) = \sup\{\alpha \mid \gamma^-(x) \ge \alpha \ \ \mu\text{-a.s.}\}, \tag{3.5} $$
$$ \dim^*(\mu) = \mu\text{-}\mathrm{esssup}\, \gamma^-(x) = \inf\{\alpha \mid \gamma^-(x) \le \alpha \ \ \mu\text{-a.s.}\}. \tag{3.6} $$
Here $\gamma^-(x)$ is the lower local exponent of $\mu$:
$$ \gamma^-(x) = \liminf_{\varepsilon \to 0} \frac{\log \mu([x-\varepsilon, x+\varepsilon])}{\log \varepsilon}. $$
For the packing dimension similar formulae hold (see [T] for details).

Corollary 3.2. 1. For any $\delta \in (0,1)$, $\nu > 0$ the spectral measure $\mu$ of the state $\psi = \delta_1$ is uniformly $(\eta-\delta)$-Hölder continuous on $[-2+\nu, 2-\nu]$. In particular, for $\mu'$, the restriction of $\mu$ on $(-2,2)$, $\dim_*(\mu') \ge \eta$.
2. The packing dimension of $\mu$ is 1.

Proof. Let $\varepsilon \in (4/L_{N+1}, 4/L_N]$ for some $N$. One can easily see that
$$ \varepsilon L_N + L_N^{\frac{\eta-1}{\eta}} \ge \varepsilon^{1-\eta}. $$
Therefore, Lemma 3.1 implies
$$ \mu([x-\varepsilon, x+\varepsilon]) \le C(\delta)\, \varepsilon^{\eta} L_N^{\alpha_N} \le C_1(\delta)\, \varepsilon^{\eta-\alpha_N} $$
for any $x \in [-2+\nu, 2-\nu]$. Since $\lim \alpha_N = 0$, the uniform $(\eta-\delta)$-continuity of $\mu$ restricted to $[-2+\nu, 2-\nu]$ follows. As a particular consequence, $\gamma^-(x) \ge \eta$ for all $x \in (-2,2)$. The equality (3.5) implies $\dim_*(\mu') \ge \eta$. Taking $\varepsilon_N = 1/L_N$, we obtain from Lemma 3.1 that
$$ \mu([x-\varepsilon_N, x+\varepsilon_N]) \le C(\delta)\, \varepsilon_N^{1-\alpha_N}. $$
Therefore, for the upper local exponents of the measure we have
$$ \gamma^+(x) \equiv \limsup_{\varepsilon \to 0} \frac{\log \mu([x-\varepsilon, x+\varepsilon])}{\log \varepsilon} \ge 1. $$
The fact that $\dim_P(\mu) = 1$ follows [T].
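The elementary inequality $\varepsilon L_N + L_N^{(\eta-1)/\eta} \ge \varepsilon^{1-\eta}$ used in the proof of Corollary 3.2 is easy to probe numerically. The following sketch only spot-checks it on a grid; the helper name and the sample values of $L_N$, $\eta$ and $\varepsilon$ are illustrative assumptions. The reasoning behind it is a two-case split: if $\varepsilon \ge L_N^{-1/\eta}$ the first term alone dominates, otherwise the second one does.

```python
def holder_ratio(eps, L, eta):
    """Ratio (eps*L + L^((eta-1)/eta)) / eps^(1-eta); the text claims it is >= 1."""
    lhs = eps * L + L ** ((eta - 1.0) / eta)
    rhs = eps ** (1.0 - eta)
    return lhs / rhs


if __name__ == "__main__":
    # Illustrative grid: eps runs over (0, 4/L] as in the proof.
    smallest = min(
        holder_ratio(4.0 / L * 0.7 ** k, L, eta)
        for eta in (0.1, 0.3, 0.5, 0.7, 0.9)
        for L in (10.0, 1e3, 1e6)
        for k in range(60)
    )
    print(smallest)
```

The smallest ratio on such a grid occurs near the crossover $\varepsilon \approx L_N^{-1/\eta}$, where the two terms on the left are comparable.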
Remark. These results are not new. The fact that $\dim_*(\mu') \ge \eta$ is proved in [JL] and $\dim_P(\mu) = 1$ in [CM]. Our proof, however, is simpler. Moreover, the upper bound (3.2) contains more information.

Lemma 3.3. Let $\psi = f(H)\delta_1$, where $f$ is a bounded Borel function such that $\mathrm{supp} f \subset \Delta = [-2+\nu, 2-\nu]$ for some $\nu > 0$. Let $L_N/4 \le T < L_{N+1}/4$ for some $N$.
1. For any $n$ the bound holds:
$$ \langle |\psi(t,n)|^2 \rangle (T) \le C \frac{L_N^{\alpha_N}}{L_N + T L_N^{\frac{\eta-1}{\eta}}}. \tag{3.7} $$
2. Define $M(T) = L_N^{-\delta} \big( L_N + T L_N^{\frac{\eta-1}{\eta}} \big)$ with some $\delta > 0$. Then
$$ P_\psi(n \le M(T), T) \le C L_N^{-\delta/2} $$
for $T$ large enough and thus
$$ P_\psi(n \ge M(T), T) \ge \|\psi\|^2 - C L_N^{-\delta/2}. $$
3. For the time-averaged return probability the bound holds:
$$ \frac{1}{T} \int_0^{\infty} dt\, \exp(-t/T)\, |\langle \psi(t), \psi \rangle|^2 \le C \frac{L_N^{\alpha_N}}{L_N + T L_N^{\frac{\eta-1}{\eta}}}. \tag{3.8} $$
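Proof. Using the spectral theorem in a standard way (see [T], for example), one first shows that
$$ 2 \langle |\psi(t,n)|^2 \rangle (T) = \int_{\mathbb{R}} \int_{\mathbb{R}} d\mu_\psi(x)\, d\mu_\psi(y)\, u_\psi(n,x) \overline{u_\psi(n,y)}\, R(T(x-y)) \le 2 \int_{\mathbb{R}} d\mu_\psi(x)\, |u_\psi(n,x)|^2\, b_\psi(x,T), \tag{3.9} $$
where
$$ b_\psi(x,T) = \int_{\mathbb{R}} d\mu_\psi(u)\, R(T(x-u)) = \varepsilon\, \mathrm{Im} F_{\mu_\psi}(x+i\varepsilon), \quad \varepsilon = 1/T. $$
Since $f$ is bounded and $\mathrm{supp} f \subset \Delta$, we get
$$ \int_{\mathbb{R}} d\mu_\psi(x)\, |u_\psi(n,x)|^2\, b_\psi(x,T) \le C \int_\Delta d\mu_\psi(x)\, b(x,T)\, |u_\psi(n,x)|^2. $$
The bound (3.2) and
$$ \int_{\mathbb{R}} d\mu_\psi(x)\, |u_\psi(n,x)|^2 \le 1 $$
yield (3.7). The second statement of the lemma directly follows. For the return probabilities the result follows from the bound $J_\psi(\varepsilon,\mathbb{R}) \le C J(\varepsilon,\Delta)$ and the established upper bound for $J(\varepsilon,\Delta)$ (Corollary 2.5).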
The situation is more difficult with the upper bounds for outside probabilities. We shall consider the initial state $\psi$ of the form $\psi = f(H)\delta_1$, where $f \in C_0^\infty([-2+\nu, 2-\nu])$ with some $\nu \in (0,1/2)$. For smooth $f$ it is well known that the function $\psi(n)$ decays at infinity faster than any inverse power and moreover, for the moments of the time-averaged position operator, the ballistic upper bound holds:
$$ \langle |X|^p_\psi \rangle (T) \le C(p) T^p, \quad p > 0. \tag{3.10} $$
The following statement holds (where we use some ideas of [CM] in the proof).
Theorem 3.4. Consider $\psi$ of the first kind. Let $T$ be such that $L_N/4 \le T \le L_N^{1/\eta}$ for some $N$.
1. For any $p \ge 0$ the following bound holds:
$$ \sum_{n \ge 2L_N} n^p \langle |\psi(t,n)|^2 \rangle (T) \le C(p)\, T^{p+1} L_N^{-1/\eta}. \tag{3.11} $$
In particular,
$$ P_\psi(n \ge 2L_N, T) \le C T L_N^{-1/\eta} \tag{3.12} $$
and
$$ \langle |X|^p_\psi \rangle (T) \le C L_N^p + C T^{p+1} L_N^{-1/\eta}. \tag{3.13} $$
2. Let $T: L_N/4 \le T \le L_{N+1}^{1-\delta}$ with some $\delta > 0$. For $M > 2L_N$ and any $A > 0$ the uniform bound holds:
$$ P_\psi(2L_N \le n \le M, T) \le C \frac{M}{T + L_N^{1/\eta}} + \frac{C_A}{L_N^A}. \tag{3.14} $$
Proof. First of all, observe that the ballistic upper bound (3.10) implies $\langle |\psi(t,n)|^2 \rangle (T) \le C(r) T^r n^{-r}$ for any $r > 0$. Therefore, taking $r$ large enough, we obtain
$$ \sum_{n \ge T^2} n^p \langle |\psi(t,n)|^2 \rangle (T) \le C(r,p)\, T^{2p+2-r} \le C(p,A)\, T^{-A} $$
for any $A > 0$. Thus, to prove (3.11), it is sufficient to consider the sum over $n: 2L_N \le n \le T^2$. We use again the Parseval formula:
$$ \langle |\psi(t,n)|^2 \rangle (T) = \frac{\varepsilon}{\pi} \int_{\mathbb{R}} dE\, |(R(E+i\varepsilon)f(H)\delta_1)(n)|^2, \quad \varepsilon = \frac{1}{2T}. \tag{3.15} $$
Define $\Delta = [-2+\nu/2, 2-\nu/2]$, where $f \in C_0^\infty([-2+\nu, 2-\nu])$. We shall denote by $a_1(n,T)$ the integral over $\mathbb{R} \setminus \Delta$ in (3.15), and by $a_2(n,T)$ the integral over $\Delta$. Since $f(x) = 0$ for $|x| \ge 2-\nu$, one can show, as in the proof of part b) of Theorem 2.4 (bounds (2.22)–(2.24)), that
$$ |R(E+i\varepsilon)f(H)\delta_1(n)| \le \frac{C(k,\nu)}{\langle E \rangle (1+|n|^2)^{k/2}} $$
for any integer $k > 0$ and all $E \in \mathbb{R} \setminus \Delta$ with constants uniform in $n, E, \varepsilon$. Therefore,
$$ a_1(n,T) \le \frac{C(k,\nu)}{T} (1+|n|^2)^{-k} \tag{3.16} $$
for any $k > 0$. In particular, taking $k$ large enough, we obtain
$$ \sum_{n \ge 2L_N} n^p a_1(n,T) \le C(p,A)\, L_N^{-A} \tag{3.17} $$
for any $A > 0$. Consider now the term $a_2(n,T)$. Since $R(z)f(H) = f(H)R(z)$, one can write it as follows:
$$ a_2(n,T) = \frac{\varepsilon}{\pi} \int_\Delta dE\, |(f(H)R(E+i\varepsilon)\delta_1)(n)|^2. \tag{3.18} $$
Since $f \in C_0^\infty([-2,2])$, it follows again from the results of [GK] that for any $\chi \in l^2(\mathbb{N})$,
$$ |(f(H)\chi)(n)|^2 \le C(k) \sum_m (1+|n-m|^2)^{-k} |\chi(m)|^2. $$
Inserting this bound in (3.18) yields after integration:
$$ a_2(n,T) \le C(k)\, \varepsilon \sum_m (1+|n-m|^2)^{-k} \int_\Delta dE\, |(R(E+i\varepsilon)\delta_1)(m)|^2 \tag{3.19} $$
for any $k > 0$. Denote by $a_{21}(n,T)$ the sum in (3.19) over $m: m \le L_N$, by $a_{22}(n,T)$ the sum over $m: L_N < m \le T^2 + L_N$ and by $a_{23}(n,T)$ the sum over $m: m > T^2 + L_N$. It is clear that for any $A > 0$,
$$ \sum_{2L_N \le n \le T^2} n^p (a_{21}(n,T) + a_{23}(n,T)) \le C(p,A)\, L_N^{-A}\, \varepsilon \sum_m \int_\Delta dE\, |(R(E+i\varepsilon)\delta_1)(m)|^2. \tag{3.20} $$
Since
$$ \frac{\varepsilon}{\pi} \sum_m \int_{\mathbb{R}} dE\, |(R(E+i\varepsilon)\delta_1)(m)|^2 = \|\delta_1\|^2 = 1, $$
(3.20) yields
$$ \sum_{2L_N \le n \le T^2} n^p (a_{21}(n,T) + a_{23}(n,T)) \le C(p,A)\, L_N^{-A}. \tag{3.21} $$
The summation over $n$ in the expression of $a_{22}(n,T)$ yields
$$ \sum_{2L_N \le n \le T^2} n^p a_{22}(n,T) \le C\varepsilon \sum_{L_N < m \le T^2 + L_N} m^p \int_\Delta dE\, |(R(E+i\varepsilon)\delta_1)(m)|^2. \tag{3.22} $$
To bound from above the r.h.s. of (3.22), we shall introduce in $l^2(\mathbb{N})$ the operator $H_N = H_0 + V_N$, $V_N(n) = F(n \le L_N)V(n)$ with compactly supported potential and thus absolutely continuous spectrum on $(-2,2)$. We denote by $R(z)$ and $R_N(z)$ the resolvents of $H$ and $H_N$ respectively. We can see that for $N$ large enough (so that $Q(n)$ disappears),
$$ (H - H_N)\phi(n) = \sum_{k=N+1}^{\infty} \delta_{L_k}(n)\, V(L_k)\, \phi(L_k). \tag{3.23} $$
The resolvent equation implies that for any complex $z = E + i\varepsilon$,
$$ \|R(z)\delta_1 - R_N(z)\delta_1\| \le \frac{1}{\varepsilon} \|(H - H_N)R_N(z)\delta_1\|. \tag{3.24} $$
To bound from above the r.h.s. of (3.22) and the r.h.s. of (3.24), we need to control $g(n) = R_N(z)\delta_1(n)$ for $n > L_N$. In fact, a rather explicit expression can be obtained. Since $(H_N - z)g = \delta_1$ and $V_N(n) = 0$ for $n > L_N$, we have $g(n-1) + g(n+1) - zg(n) = 0$, $n > L_N$. Thus,
$$ (g(n+1), g(n))^T = T_0(n - L_N, z)(g(L_N+1), g(L_N))^T, \quad n \ge L_N, \tag{3.25} $$
where $T_0(m,z) = A_0(z)^m$ is the free transfer matrix with $A_0(z)$ given by (2.13). Since $E \in \Delta = [-2+\nu/2, 2-\nu/2]$, the matrix $A_0(z)$ has two complex eigenvalues
$$ \lambda_{1,2} = \frac{1}{2} \big( z \pm \sqrt{z^2 - 4} \big) $$
with corresponding eigenvectors $e_i = (\lambda_i, 1)^T$, $i = 1, 2$. It follows from (3.25) that
$$ (g(n+1), g(n))^T = C_1 \lambda_1^{n-L_N} e_1 + C_2 \lambda_2^{n-L_N} e_2, \quad n \ge L_N, $$
with some complex $C_1, C_2$. Since $\mathrm{Im} z = \varepsilon > 0$, one of the two eigenvalues, say, $\lambda_1$, is such that $|\lambda_1| < 1$ and then $|\lambda_2| > 1$. On the other hand, since $g = R_N(z)\delta_1$, it should be square summable in $n$. Therefore, $C_2 = 0$ and
$$ (g(n+1), g(n))^T = C \lambda_1^{n-L_N} (\lambda_1, 1)^T. $$
Finally, we obtain that
$$ g(n) \equiv (R_N(z)\delta_1)(n) = \lambda_1^{n-L_N} g(L_N) \tag{3.26} $$
for any $n \ge L_N$. One can see from the expression of $\lambda_1$ that
$$ \exp(-c_1 \varepsilon) \le |\lambda_1| \le \exp(-c\varepsilon) \tag{3.27} $$
with uniform $c_1, c > 0$ for all $E \in \Delta$, $\varepsilon \in (0,1)$. Let us return to the resolvent $R(z)$. Using the trivial bound $|g(L_N)| \le 1/\varepsilon$, one gets from (3.23)–(3.24) and (3.26)–(3.27):
$$ \|R(z)\delta_1 - R_N(z)\delta_1\|^2 \le \varepsilon^{-4} \sum_{k=N+1}^{\infty} V^2(L_k) \exp(-2c\varepsilon(L_k - L_N)). \tag{3.28} $$
Since $T = \frac{1}{2\varepsilon} \le L_{N+1}^{1-\delta}$ in all three statements of the theorem, $V(L_k) = L_k^{(1-\eta)/2\eta}$, and $L_k$ is a very fast growing sequence, it is easy to check that
$$ \|R(z)\delta_1 - R_N(z)\delta_1\|^2 \le C \exp(-1/\varepsilon^{\alpha}) \tag{3.29} $$
with some $\alpha > 0$ for all $E \in \Delta$, $\varepsilon \in [L_{N+1}^{\delta-1}, 4L_N^{-1}]$. Thus, the bounds (3.22) and (3.29) imply
$$ \sum_{2L_N \le n \le T^2} n^p a_{22}(n,T) \le \frac{C}{\varepsilon^{2p}} \exp(-1/\varepsilon^{\alpha}) + C\varepsilon \sum_{L_N \le m \le T^2 + L_N} m^p \int_\Delta dE\, |R_N(E+i\varepsilon)\delta_1(m)|^2. \tag{3.30} $$
It follows from (3.26)–(3.27) and $\varepsilon \le 2/L_N$ that
$$ \sum_{L_N \le m \le T^2 + L_N} m^p |R_N(E+i\varepsilon)\delta_1(m)|^2 \le C \varepsilon^{-p-1} |R_N(E+i\varepsilon)\delta_1(L_N)|^2. \tag{3.31} $$
To bound $R_N(E+i\varepsilon)\delta_1(L_N)$, one can use the result of Lemma 4 in [CM]. For the sake of completeness we shall give here a simple and slightly different proof. Namely, we shall show that
$$ |(R_N(E+i\varepsilon)\delta_1)(L_N)|^2 \le C(\Delta) \frac{1}{1 + \varepsilon L_N^{1/\eta}}\, \mathrm{Im} F_N(E+i\varepsilon), \tag{3.32} $$
where $E \in \Delta$ and $F_N$ denotes the Borel transform of the spectral measure associated to the state $\delta_1$ and operator $H_N$. First, it follows from (3.26)–(3.27) that
$$ \frac{1}{\varepsilon} \mathrm{Im} F_N(E+i\varepsilon) = \|R_N(E+i\varepsilon)\delta_1\|^2 \ge \sum_{m > L_N} |g(m)|^2 \ge \frac{C}{\varepsilon} |g(L_N)|^2. $$
Therefore,
$$ |g(L_N)|^2 \le C\, \mathrm{Im} F_N(E+i\varepsilon). \tag{3.33} $$
Let $L_{N-1} < n < L_{N+1}$. The definition of $g = R_N(z)\delta_1$ implies
$$ g(n+1) + g(n-1) - zg(n) = 0, \quad n \ne L_N, \tag{3.34} $$
$$ g(L_N+1) + g(L_N-1) + (V(L_N) - z)g(L_N) = 0. \tag{3.35} $$
It is clear that for $n > L_N$,
$$ (g(n+1), g(n))^T = T_0(n - L_N, z)(g(L_N+1), g(L_N))^T, \tag{3.36} $$
and for $n < L_N - 1$,
$$ (g(n+1), g(n))^T = T_0(n - L_N + 1, z)(g(L_N), g(L_N-1))^T, \tag{3.37} $$
where $T_0(m,z)$ is the free transfer matrix. Since $z = E + i\varepsilon$, $E \in \Delta$, its norm is uniformly bounded for $|m| \le K/\varepsilon$. Using the fact that $\|T^{-1}\| = \|T\|$ and $\varepsilon \le 2/L_N$, we thus get that for $2L_N > n > L_N$,
$$ |g(n+1)|^2 + |g(n)|^2 \ge c\,(|g(L_N+1)|^2 + |g(L_N)|^2), $$
with uniform $c > 0$. Summing this bound, one obtains
$$ cL_N (|g(L_N+1)|^2 + |g(L_N)|^2) \le 2\|g\|^2 = \frac{2}{\varepsilon} \mathrm{Im} F_N(E+i\varepsilon). \tag{3.38} $$
Similarly, summation over $L_N/2 < n < L_N$ yields (since $L_N > 2L_{N-1}$):
$$ \frac{c}{2} L_N (|g(L_N)|^2 + |g(L_N-1)|^2) \le \frac{2}{\varepsilon} \mathrm{Im} F_N(E+i\varepsilon). \tag{3.39} $$
Thus, (3.38)–(3.39) yield
$$ |g(L_N-1)|^2 + |g(L_N+1)|^2 \le \frac{C}{\varepsilon L_N} \mathrm{Im} F_N(E+i\varepsilon). \tag{3.40} $$
It follows from (3.35) that
$$ |V(L_N) - z|^2 |g(L_N)|^2 \le \frac{C}{\varepsilon L_N} \mathrm{Im} F_N(E+i\varepsilon). $$
Since $|z| \le 3$ and $V(L_N) = L_N^{(1-\eta)/2\eta}$, we obtain
$$ |g(L_N)|^2 \le C(\Delta)\, \varepsilon^{-1} L_N^{-1/\eta}\, \mathrm{Im} F_N(E+i\varepsilon). \tag{3.41} $$
The bound (3.32) follows from (3.33) and (3.41). We can finish now the proof of the first part of the theorem. It follows from (3.30), (3.31) and (3.32) that
$$ \sum_{2L_N \le n \le T^2} n^p a_{22}(n,T) \le \frac{C}{\varepsilon^{2p}} \exp(-1/\varepsilon^{\alpha}) + C \varepsilon^{-p-1} L_N^{-1/\eta} \int_\Delta dE\, \mathrm{Im} F_N(E+i\varepsilon) \le C \varepsilon^{-p-1} L_N^{-1/\eta}, \tag{3.42} $$
since $\varepsilon \le 2L_N^{-1}$ and
$$ \int_\Delta dE\, \mathrm{Im} F_N(E+i\varepsilon) \le \int_{\mathbb{R}} dE\, \mathrm{Im} F_N(E+i\varepsilon) = \pi \mu_N(\mathbb{R}) = \pi. $$
The bound (3.11) of the theorem follows from the Parseval equality, (3.17), (3.21) (one takes $A = 1/\eta$) and (3.42). Taking $p = 0$, we obtain the bound for outside probabilities. Since
$$ \langle |X|^p_\psi \rangle (T) \le (2L_N)^p \|\psi\|^2 + \sum_{n \ge 2L_N} n^p \langle |\psi(t,n)|^2 \rangle (T), $$
the upper bound for the moments follows.

The proof of the second statement is similar. One defines $a_1(n,T)$ and $a_2(n,T)$ in the same manner. The bound (3.17) yields
$$ \sum_{2L_N \le n \le M} a_1(n,T) \le C_A L_N^{-A}. $$
Next, we denote by $a_{21}(n,T)$, $a_{22}(n,T)$ and $a_{23}(n,T)$ the sums in (3.19) over $m \le L_N$, $m: L_N < m < 2M$ and $m: m \ge 2M$ respectively. The bound (3.21) yields
$$ \sum_{2L_N \le n \le M} a_{21}(n,T) \le C_A L_N^{-A}. $$
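Similarly, one shows that
$$ \sum_{2L_N \le n \le M} a_{23}(n,T) \le C_A L_N^{-A}. $$
Thus, it is sufficient to bound from above the r.h.s. of
$$ \sum_{2L_N \le n \le M} a_{22}(n,T) \le C\varepsilon \sum_{L_N < m < 2M} \int_\Delta dE\, |(R(E+i\varepsilon)\delta_1)(m)|^2. $$
The same consideration as in the proof of part 1 yields
$$ \sum_{2L_N \le n \le M} a_{22}(n,T) \le C \exp(-1/\varepsilon^{\alpha}) + C\varepsilon \sum_{L_N < m < 2M} \int_\Delta dE\, |(R_N(E+i\varepsilon)\delta_1)(m)|^2, \tag{3.43} $$
where
$$ (R_N(E+i\varepsilon)\delta_1)(m) = g(L_N)\, \lambda_1^{m-L_N}, \quad m \ge L_N. \tag{3.44} $$
The bounds (3.32) and (3.44) imply
$$ \varepsilon \sum_{L_N < m < 2M} |(R_N(E+i\varepsilon)\delta_1)(m)|^2 \le C \frac{M\varepsilon}{1 + \varepsilon L_N^{1/\eta}}\, \mathrm{Im} F_N(E+i\varepsilon). $$
Inserting this bound in (3.43), we obtain
$$ \sum_{2L_N \le n \le M} a_{22}(n,T) \le C \frac{M\varepsilon}{1 + \varepsilon L_N^{1/\eta}}. $$
The bound (3.14) follows.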
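The decay estimate (3.27) for the eigenvalue $\lambda_1$ of the free transfer matrix, on which the proof of Theorem 3.4 relies, can be checked numerically. The code below is a hedged sketch: the function names are hypothetical, and the constants $c = 0.4$, $c_1 = 2$ are assumptions that happen to work for the sample window $|E| \le 1.75$ (that is, $\nu = 1/2$); the paper only asserts that some uniform constants exist.

```python
import cmath
import math


def decaying_eigenvalue(z):
    """Root of lambda^2 - z*lambda + 1 = 0 (eigenvalue of A0(z)) of smaller modulus."""
    disc = cmath.sqrt(z * z - 4.0)
    roots = ((z + disc) / 2.0, (z - disc) / 2.0)
    return min(roots, key=abs)


def satisfies_decay_bounds(energy, eps, c=0.4, c1=2.0):
    """Check exp(-c1*eps) <= |lambda_1| <= exp(-c*eps) for z = energy + i*eps.

    The default constants are illustrative assumptions for |energy| <= 1.75.
    """
    lam = decaying_eigenvalue(complex(energy, eps))
    return math.exp(-c1 * eps) <= abs(lam) <= math.exp(-c * eps)


if __name__ == "__main__":
    for energy in (-1.75, -1.0, 0.0, 0.5, 1.75):
        for eps in (1e-4, 1e-2, 0.3, 0.99):
            lam = decaying_eigenvalue(complex(energy, eps))
            print(energy, eps, abs(lam), satisfies_decay_bounds(energy, eps))
```

Since the two roots have product 1, choosing the root of smaller modulus automatically gives $|\lambda_1| < 1 < |\lambda_2|$, which is the dichotomy used to discard the growing solution in (3.26).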
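4. Improved Lower Bounds

The result of Theorem 3.4 gives us total control of the integrals $I, J$ (up to a factor like $L_N^{\alpha_N}$).

Corollary 4.1. Let $\Delta$ be a non-empty interval such that $\Delta \subset [-2+\nu, 2-\nu]$ for some $\nu > 0$. There exist positive constants uniform in $\varepsilon$ such that for all $\varepsilon: 4L_{N+1}^{-1} < \varepsilon \le 4L_N^{-1}$,
$$ \frac{C\varepsilon}{\varepsilon L_N + L_N^{\frac{\eta-1}{\eta}}} \le J(\varepsilon,\Delta) \le C I(\varepsilon,\Delta) \le \frac{C L_N^{\alpha_N} \varepsilon}{\varepsilon L_N + L_N^{\frac{\eta-1}{\eta}}}, \tag{4.1} $$
where $\alpha_N \to 0$.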
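Proof. Pick a slightly smaller interval $\Delta' \subset \Delta$. Let $f$ be a function from $C_0^\infty(\Delta)$ such that $0 \le f(x) \le 1$ and $f(x) = 1$ on $\Delta'$. Define $\psi = f(H)\delta_1$. The result of Lemma 2.1 applied to the interval $\Delta'$ yields
$$ P_\psi(n \ge M_1(T), T) \ge D_1 c_1 > 0, \tag{4.2} $$
where $c_1 = \mu_\psi(\Delta') = \mu(\Delta') > 0$, and
$$ M_1(T) = c_1^2 / (16 J_\psi(\varepsilon,\Delta')), \quad \varepsilon = 1/T. \tag{4.3} $$
On the other hand, Theorem 3.4 implies
$$ P_\psi(n \ge 2L_N, T) \le D_2 T L_N^{-1/\eta}, \quad T \le L_N^{1/\eta}. \tag{4.4} $$
If $T \le \frac{D_1 c_1}{2D_2} L_N^{1/\eta} \equiv \gamma L_N^{1/\eta}$, then (4.2), (4.4) imply $P_\psi(n \ge 2L_N, T) < P_\psi(n \ge M_1(T), T)$ and thus $M_1(T) < 2L_N$. It follows from (4.3) that
$$ \frac{C}{L_N} \le J_\psi(\varepsilon,\Delta') \le J_\psi(\varepsilon,\mathbb{R}) \tag{4.5} $$
for $\varepsilon \in [\varepsilon_0, 4L_N^{-1}]$, where $\varepsilon_0 = \gamma^{-1} L_N^{-1/\eta}$. Recall the equality
$$ J_\psi(\varepsilon,\mathbb{R}) = \varepsilon \int_0^{\infty} dt\, \exp(-\varepsilon t)\, |\langle \psi(t), \psi \rangle|^2. \tag{4.6} $$
The crucial observation is that $J_\psi(\varepsilon,\mathbb{R})/\varepsilon$ is decreasing in $\varepsilon$. Therefore, it follows from (4.5)–(4.6) that
$$ J_\psi(\varepsilon,\mathbb{R}) \ge \frac{\varepsilon}{\varepsilon_0} J_\psi(\varepsilon_0,\mathbb{R}) \ge C \varepsilon L_N^{\frac{1-\eta}{\eta}} \tag{4.7} $$
for all $\varepsilon \le \varepsilon_0$. The bounds (4.5), (4.7) imply that
$$ J_\psi(\varepsilon,\mathbb{R}) \ge \frac{C\varepsilon}{\varepsilon L_N + L_N^{\frac{\eta-1}{\eta}}} $$
for all $\varepsilon \in (4L_{N+1}^{-1}, 4L_N^{-1}]$ with a suitable constant. Since $0 \le f(x) \le 1$ and $f(x) = 0$ for $x$ outside of $\Delta$, the definition of $J_\psi, J$ implies
$$ J_\psi(\varepsilon,\mathbb{R}) \le \int_\Delta d\mu(x) \int_\Delta d\mu(y)\, R((x-y)/\varepsilon) \le J(\varepsilon,\Delta). $$
Thus, we get
$$ J(\varepsilon,\Delta) \ge \int_\Delta d\mu(x) \int_\Delta d\mu(y)\, R((x-y)/\varepsilon) \ge \frac{C\varepsilon}{\varepsilon L_N + L_N^{\frac{\eta-1}{\eta}}}. \tag{4.8} $$
The first inequality in (4.1) follows. The second and the third inequalities follow from Lemma 2.2 and Corollary 2.5.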
As a direct consequence of this result, one gets a lower bound for the time-averaged return probabilities $J_\psi(1/T,\mathbb{R})$. In fact, if the measure $\mu_\psi$ has a nontrivial point part, $\mu_\psi(\{E_0\}) = \gamma > 0$ for some $E_0$, then clearly $J_\psi(\varepsilon,\mathbb{R}) \ge \gamma^2 > 0$ for any $\varepsilon$. The situation is more interesting if the measure is continuous, in our case if $\mathrm{supp}\, \mu_\psi \subset (-2,2)$.

Theorem 4.2. Assume that $\psi = f(H)\delta_1$, where $f$ is a bounded Borel function a) supported on $[-2+\nu, 2-\nu]$ for some $\nu > 0$, b) such that $|f(x)| \ge c > 0$ on some interval $\Delta \subset [-2+\nu, 2-\nu]$. Then
$$ \frac{C\varepsilon}{\varepsilon L_N + L_N^{\frac{\eta-1}{\eta}}} \le J_\psi(1/T,\mathbb{R}) \le \frac{C\varepsilon L_N^{\alpha_N}}{\varepsilon L_N + L_N^{\frac{\eta-1}{\eta}}}, \quad \varepsilon = 1/T, $$
for $T: L_N/4 \le T < L_{N+1}/4$. If only condition b) is fulfilled, then only the lower bound for $J_\psi(\varepsilon,\mathbb{R})$ holds.

Proof. The upper bound is proved in Lemma 3.3. Since
$$ J_\psi(\varepsilon,\mathbb{R}) \ge c^4 \int_\Delta d\mu(x) \int_\Delta d\mu(y)\, R((x-y)/\varepsilon), $$
the second inequality in (4.8) yields the lower bound.

One observes that the integral
$$ T J_\psi(1/T,\mathbb{R}) = \int_0^{\infty} dt\, \exp(-t/T)\, |\langle \psi(t), \psi \rangle|^2 $$
grows linearly for $L_N \le T \le L_N^{1/\eta}$ and remains stable for $L_N^{1/\eta} < T < L_{N+1}$ (up to factors like $CL_N^{\alpha_N}$). Since the main contribution to the integral comes from the interval $[0,T]$, one can conjecture that the return probability $R_\psi(t) = |\langle \psi(t), \psi \rangle|^2 = |\hat{\mu}_\psi(t)|^2$ is essentially constant of order $L_N^{-1}$ if $t \in [L_N, L_N^{1/\eta}]$, and is small (decaying at least as $1/t$) if $t \in [L_N^{1/\eta+\delta}, L_{N+1}]$.

The obtained lower bounds for $I(\varepsilon,\Delta)$ also imply improved lower bounds for probabilities and moments.

Theorem 4.3. Let $\psi$ be of the second kind. Then
1. For $L_N \le T \le L_N^{1/\eta}$:
$$ P_\psi(n \ge T, T) \ge C T L_N^{-1/\eta-\alpha_N}, $$
and for $L_N^{1/\eta} \le T < L_{N+1}/4$:
$$ P_\psi(n \ge T, T) \ge C L_N^{-\alpha_N}. $$
2. For $L_N/4 \le T \le L_N^{1/\eta}$:
$$ P_\psi(L_N/4 \le n \le L_N, T) \ge C L_N^{-\alpha_N}. $$
As a consequence, for $L_N/4 \le T \le L_N^{1/\eta}$,
$$ \langle |X|^p_\psi \rangle (T) \ge C \big( L_N^p + T^{p+1} L_N^{-1/\eta} \big) L_N^{-\alpha_N}, $$
and for $L_N^{1/\eta} \le T < L_{N+1}/4$,
$$ \langle |X|^p_\psi \rangle (T) \ge C T^p L_N^{-\alpha_N}. $$

The results for probabilities follow directly from Theorem 2.4 and Corollary 4.1. The bound $\langle |X|^p_\psi \rangle (T) \ge M^p P_\psi(n \ge M, T)$ for any $M$ yields the result for the moments.

The result of the theorem tells, in particular, that for $T \ge L_N^{1/\eta}$, some (not too small) part of the wavepacket has gone through the barrier and moves ballistically:
$$ P_\psi(n \ge T, T) \ge C L_N^{-\alpha_N} \ge C T^{-\alpha_N}, \quad \alpha_N \to 0. \tag{4.9} $$
On the other hand, Corollary 1.6 implies that for $T \ge L_N^{1/\eta+\delta}$ with some $\delta > 0$,
$$ P_\psi(n > L_N, T) \ge P_\psi\big(n \ge T L_N^{\frac{\eta-1}{\eta}-\alpha_N}, T\big) \ge c > 0. $$
Thus, some essential (and not small) part of the wavepacket is on the right of $L_N$. This part will continue to move ballistically up to the next barrier located at $n = L_{N+1}$. Therefore, one can expect a bound like $P_\psi(n \ge T, T) \ge c_1 > 0$ for $T > L_N^{1/\eta+\delta}$, which is better than just (4.9). The following statement confirms this conjecture. Slightly modifying the proof, we show also that $P_\psi(n \le 2L_N, T) \ge c_2 > 0$ for $T \le \tau L_N^{1/\eta}$ with $\tau > 0$ small enough. This is better than $P_\psi(n \le 2L_N, T) \ge C L_N^{-\alpha_N}$, which follows from Theorem 4.3.
Theorem 4.4. The following statements hold:
1. Assume that $\psi$ is of the second kind. For any $\delta > 0$ there exist $\tau > 0$, $c_1 > 0$ such that for $T$: $L_N^{1/\eta+\delta} \le T < L_{N+1}^{1-\delta}$ with $N$ large enough,
$$P_\psi(n \ge \tau T, T) \ge c_1 > 0.$$
If $\psi$ is of the first kind, for any $\theta > 0$ one can choose $\tau$ so that
$$P_\psi(n \ge \tau T, T) \ge \|\psi\|^2 - \theta.$$
In both cases, for such $T$,
$$\langle |X|_\psi^p \rangle(T) \ge C(p)\, T^p, \quad p > 0.$$
2. Let $\psi$ be of the second kind. There exists $\tau > 0$ small enough such that
$$P_\psi(n \le 2L_N, T) \ge c_2 > 0$$
for all $T$: $L_N/4 \le T \le \tau L_N^{1/\eta}$. If $\psi$ is of the first kind, then a better bound holds:
$$P_\psi(n \le 2L_N, T) \ge \|\psi\|^2 - C\, T\, L_N^{-1/\eta}. \qquad (4.10)$$
Proof. Recall that ψ = f (H )δ1 , where f is a bounded Borel function, f ∈ C ∞ (S), S = [E0 − ν, E0 + ν] ⊂ [−2 + ν, 2 − ν] and |f (x)| ≥ c > 0 on S. Let h be some function such that 0 ≤ h(x) ≤ 1, h ∈ C0∞ ([E0 − γ , E0 + γ ]) and h(x) = 1, x ∈ [E0 − θ, E0 + θ], where 0 < θ < γ < ν. Define g(x) = f (x)h(x). It is clear that g ∈ C0∞ ([E0 − γ , E0 + γ ]). Let ρ = g(H )δ1 , χ = ψ − ρ = (f (H ) − g(H ))δ1 . As |f (x)| ≥ c > 0 on S, α ≡ ||ρ||2 ≥ c2 µ([E0 − θ, E0 + θ ]) > 0.
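Since
$$\langle \rho, \chi \rangle = \int d\mu(x)\, |f(x)|^2 h(x)(1 - h(x))$$
and $f$ is bounded, choosing the parameter $\gamma$ in the definition of $h$ close enough to $\theta$, one can ensure that
$$|\langle \rho, \chi \rangle| \le \|\rho\|^2/4 = \alpha/4. \qquad (4.11)$$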
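Let $\rho(t) = \exp(-itH)\rho$, $\chi(t) = \exp(-itH)\chi$ and $\psi(t) = \exp(-itH)\psi$. For any $n \in \mathbb{N}$,
$$|\psi(t, n)|^2 = |\rho(t, n)|^2 + |\chi(t, n)|^2 + 2\,\mathrm{Re}\big(\rho(t, n)\overline{\chi(t, n)}\big).$$
Let $M > 0$. Summation over $n \le M$ and time-averaging yield for any $T > 0$:
$$\sum_{n \le M} \langle |\psi(t, n)|^2 \rangle(T) \le \|\chi\|^2 + \sum_{n \le M} \langle |\rho(t, n)|^2 \rangle(T) + 2\|\chi\| \Big( \sum_{n \le M} \langle |\rho(t, n)|^2 \rangle(T) \Big)^{1/2}. \qquad (4.12)$$
We have used the fact that $\|\chi(t)\| = \|\chi\|$ and the Cauchy-Schwarz inequality. The condition (4.11) implies that $\|\chi\|^2 \le \|\psi\|^2 - \alpha/2$. Therefore, (4.12) yields
$$P_\psi(n \le M, T) \le \|\psi\|^2 - \alpha/2 + P_\rho(n \le M, T) + C\big(P_\rho(n \le M, T)\big)^{1/2}. \qquad (4.13)$$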
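The bound on $\|\chi\|^2$ used in the step to (4.13) is a one-line computation. The sketch below uses only $\psi = \rho + \chi$, $\alpha = \|\rho\|^2$ and (4.11); nothing else is assumed.

```latex
\|\psi\|^2 = \|\rho\|^2 + \|\chi\|^2 + 2\,\mathrm{Re}\,\langle \rho, \chi \rangle
\quad\Longrightarrow\quad
\|\chi\|^2 = \|\psi\|^2 - \alpha - 2\,\mathrm{Re}\,\langle \rho, \chi \rangle
  \;\le\; \|\psi\|^2 - \alpha + 2 \cdot \frac{\alpha}{4}
  \;=\; \|\psi\|^2 - \frac{\alpha}{2}.
```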
Thus, if $P_\rho(n \le M, T) \le \eta$, where $\eta$ is small enough (depending on $\alpha$), then $P_\psi(n \ge M, T) \ge \alpha/4 > 0$. Let $M > 2L_N$. To bound $P_\rho(n \le M, T)$ from above, we write
$$P_\rho(n \le M, T) = P_\rho(n \le 2L_N, T) + P_\rho(2L_N < n \le M, T).$$
Recall that $\rho = g(H)\delta_1$, where $g \in C_0^\infty([E_0 - \gamma, E_0 + \gamma])$ and $[E_0 - \gamma, E_0 + \gamma] \subset [E_0 - \nu, E_0 + \nu] \subset [-2 + \nu, 2 - \nu]$. Therefore, all upper bounds of the previous section hold for $\rho$. Since $T \ge L_N^{1/\eta+\delta}$, the bound (3.7) of Lemma 3.3 yields
$$P_\rho(n \le 2L_N, T) \le C L_N^{\alpha_N - \delta} \le C L_N^{-\delta/2} \qquad (4.14)$$
for $N$ large enough. On the other hand, the bound (3.14) of Theorem 3.4 implies
$$P_\rho(2L_N \le n \le M, T) \le C\,\frac{M}{T} + C_A L_N^{-A} \qquad (4.15)$$
for any $A > 0$. The bounds (4.14)–(4.15) yield
$$P_\rho(n \le M, T) \le C\,\frac{M}{T} + \beta_N, \quad \beta_N \to 0.$$
It is clear that taking $M = \tau T$ with $\tau > 0$ small enough, for $N$ large enough we get $P_\rho(n \le M, T) \le \eta$ and thus $P_\psi(n \ge M, T) \ge \alpha/4 > 0$. In the case of $\psi$ of the first kind the proof is simpler. One can directly estimate $P_\psi(n \le 2L_N, T)$ and $P_\psi(2L_N \le n \le M, T)$ as in (4.14), (4.15). Taking $\tau$ small enough, one obtains for $T$ large enough that $P_\psi(n \le \tau T, T) \le \theta$. For the moments the bound directly follows.

To prove the second part of the theorem, one shows a bound similar to (4.13):
$$P_\psi(n \ge M, T) \le \|\psi\|^2 - \alpha/2 + P_\rho(n \ge M, T) + C\big(P_\rho(n \ge M, T)\big)^{1/2}.$$
Taking $M = 2L_N$ and using the bound (3.12) of Theorem 3.4 for the state $\rho$, we get
$$P_\rho(n \ge 2L_N, T) \le C\, T\, L_N^{-1/\eta}.$$
One sees that for $L_N/4 \le T \le \tau L_N^{1/\eta}$ with $\tau$ small enough, $P_\rho(n \ge 2L_N, T) + C\big(P_\rho(n \ge 2L_N, T)\big)^{1/2} \le \alpha/4$. Thus, $P_\psi(n \le 2L_N, T) \ge \alpha/4 > 0$. In the case of $\psi$ of the first kind, the bound (4.10) follows directly from the bound (3.12) of Theorem 3.4.
Corollary 4.5. Let $\psi$ be of the first kind.
1. The equalities hold:
$$\beta_\psi^-(p) = \frac{p+1}{p + 1/\eta}, \quad \beta_\psi^+(p) = 1.$$
2. The measure $\mu_\psi$ and the restriction of $\mu_{\delta_1}$ to $(-2, 2)$ have exact Hausdorff dimension $\eta$.

Proof. The first statement is proved using the bounds for the moments of Theorem 2.8 and Theorem 3.4 and considering $L_N \le T \le L_N^{\alpha}$ and $L_N^{\alpha} \le T < L_{N+1}$ with suitable $\alpha$ as in the proof of Theorem 2.8. The bound $\dim_*(\mu_\psi) \ge \eta$ was proved by Jitomirskaya and Last (see also Corollary 3.2 for a simpler proof). On the other hand, one has the well known inequality $\beta_\psi^-(p) \ge \dim^*(\mu_\psi)$ for all $p > 0$ (which follows from the results of [L]). Since $\beta_\psi^-(p) = \frac{p+1}{p+1/\eta}$, letting $p \to 0$ we obtain the upper bound $\dim^*(\mu_\psi) \le \eta$. Thus, $\mu_\psi$ has exact Hausdorff dimension $\eta$. Since it is true for any $f$ of the first kind, it is true for the restriction of $\mu_{\delta_1}$ to $(-2, 2)$.
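The limit $p \to 0$ used in the proof can be checked numerically. The following is a minimal sketch, not part of the original argument: the function name `beta_minus` and the sample value $\eta = 0.5$ are illustrative assumptions, and the code simply evaluates the formula of Corollary 4.5.

```python
def beta_minus(p: float, eta: float) -> float:
    """Evaluate (p + 1) / (p + 1/eta), the exponent formula of Corollary 4.5.

    Assumes p > 0 and 0 < eta < 1 (illustrative restriction, not from the paper).
    """
    if p <= 0.0 or not (0.0 < eta < 1.0):
        raise ValueError("expected p > 0 and 0 < eta < 1")
    return (p + 1.0) / (p + 1.0 / eta)


if __name__ == "__main__":
    eta = 0.5  # hypothetical sample value
    for p in (10.0, 1.0, 0.1, 0.001):
        # The values decrease towards eta as p -> 0 and approach 1 as p grows.
        print(f"p = {p:g}: beta_minus = {beta_minus(p, eta):.6f}")
```

The exponent increases with $p$ from $\eta$ (as $p \to 0$) towards $1$ (as $p \to \infty$), which is why the small-$p$ limit gives the sharp upper bound on the dimension.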
Acknowledgement. I would like to thank F. Germinet for useful discussions.
References

[CM] Combes, J.M., Mantica, G.: Fractal dimensions and quantum evolution associated with sparse potential Jacobi matrices. In: Long time behaviour of classical and quantum systems, S. Graffi, A. Martinez (eds.), Series on concrete and appl. math. Vol. 1, Singapore: World Scientific, 2001, pp. 107–123
[DT] Damanik, D., Tcheremchantsev, S.: Power-law bounds on transfer matrices and quantum dynamics in one dimension. Commun. Math. Phys. 236, 513–534 (2003)
[DRJLS] del Rio, D., Jitomirskaya, S., Last, Y., Simon, B.: Operators with singular continuous spectrum, IV. Hausdorff dimensions, rank-one perturbations, and localization. J. d'Analyse Math. 69, 153–200 (1996)
[G] Gordon, A. Ya.: Deterministic potential with a pure point spectrum. Math. Notes 48, 1197–1203 (1990)
[GKT] Germinet, F., Kiselev, A., Tcheremchantsev, S.: Transfer matrices and transport for Schrödinger operators. Ann. Inst. Fourier 54 (3), 787–830 (2004)
[GK] Germinet, F., Klein, A.: Operator kernel estimates for functions of generalized Schrödinger operators. Proc. Amer. Math. Soc. 131, 911–920 (2003)
[GSB] Guarneri, I., Schulz-Baldes, H.: Lower bounds on wave-packet propagation by packing dimensions of spectral measure. Math. Phys. Elec. J. 5, 1 (1999)
[JL] Jitomirskaya, S., Last, Y.: Power-law subordinacy and singular spectra. I. Half-line operators. Acta Math. 183, 171–189 (1999)
[KKL] Killip, R., Kiselev, A., Last, Y.: Dynamical upper bounds on wavepacket spreading. Am. J. Math. 125 (5), 1165–1198 (2003)
[K] Krutikov, D.: Asymptotics of the Fourier transform of the spectral measure for Schrödinger operators with bounded and unbounded sparse potentials. J. Phys. A 35, 6393–6417 (2002)
[KR] Krutikov, D., Remling, C.: Schrödinger operators with sparse potentials: asymptotics of the Fourier transform of the spectral measure. Commun. Math. Phys. 223, 509–532 (2001)
[L] Last, Y.: Quantum dynamics and decompositions of singular continuous spectrum. J. Funct. Anal. 142, 405–445 (1996)
[P] Pearson, D.B.: Singular continuous measures in scattering theory. Commun. Math. Phys. 60, 13–36 (1978)
[S] Simon, B.: Operators with singular continuous spectrum, VII. Examples with borderline time decay. Commun. Math. Phys. 176, 713–722 (1996)
[S2] Simon, B.: Bounded eigenfunctions and absolutely continuous spectra for one-dimensional Schrödinger operators. Proc. Am. Math. Soc. 124, 3361–3369 (1996)
[SSP] Simon, B., Spencer, T.: Trace class perturbations and the absence of absolutely continuous spectrum. Commun. Math. Phys. 125, 113–126 (1989)
[SST] Simon, B., Stolz, G.: Operators with singular continuous spectrum, V. Sparse potentials. Proc. Am. Math. Soc. 124, 2073–2080 (1996)
[T] Tcheremchantsev, S.: Mixed lower bounds for quantum transport. J. Funct. Anal. 197, 247–282 (2003)
[Z] Zlatos, A.: Sparse potentials with fractional Hausdorff dimension. J. Funct. Anal. 207, 216–252 (2004)

Communicated by B. Simon
Commun. Math. Phys. 253, 253–282 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1193-5
Communications in
Mathematical Physics
Gerbes, Simplicial Forms and Invariants for Families of Foliated Bundles
Johan L. Dupont1, Franz W. Kamber2
1 Department of Mathematics, University of Aarhus, 8000 Århus C, Denmark. E-mail: [email protected]
2 Department of Mathematics, University of Illinois at Urbana–Champaign, 1409 W. Green Street, Urbana, IL 61801, USA. E-mail: [email protected]
Received: 14 October 2003 / Accepted: 16 April 2004 Published online: 14 October 2004 – © Springer-Verlag 2004
Abstract: The notion of smooth Deligne cohomology is conveniently reformulated in terms of the simplicial deRham complex. In particular the usual Chern-Weil and Chern-Simons theory is well adapted to this framework and rather easily gives rise to characteristic Deligne cohomology classes associated to families of bundles and connections. In turn this gives invariants for families of foliated bundles. The construction provides representing cocycles in the usual Čech-deRham model for smooth Deligne cohomology called ‘gerbes with connection’ as they generalize usual Hermitian line bundles with connection. A special case is the Quillen line bundle associated to families of flat SU(2)-bundles.
Contents
1. Introduction
2. Gerbes with Connection
3. Gerbes and Simplicial Forms
4. Fibre Integration of Simplicial Forms
5. Secondary Characteristic Classes
6. Invariants for Families of Connections
7. Examples
References
Work supported in part by the Erwin Schrödinger International Institute of Mathematical Physics, Wien, Austria and by the Statens Naturvidenskabelige Forskningsråd, Denmark Supported in part by the European Union Network EDGE. Supported in part by ‘Fonds zur Förderung der wissenschaftlichen Forschung, Projekt P 14195 MAT’
1. Introduction

The determinant line bundle was constructed by Quillen [32] for families of Riemann surfaces and generalized to higher dimension by Bismut and Freed (see e.g. [1, 18, 19]). It also admits a ‘geometric’ construction (and further generalization) in terms of families of principal G-bundles with connection for G any Lie group (see e.g. Bonora et al. [2], Brylinski [5, 6], Dupont–Johansen [14]). In this situation the construction in the present paper more generally provides ‘$\ell$–gerbes with connection’ for suitable $\ell = 0, 1, 2, \ldots$ depending on curvature conditions on the fibre connections in the family.

We use the phrase ‘(Hermitian line) gerbe’ (respectively ‘(Hermitian line) gerbe with connection’) as an abbreviation for the notion of a representing cocycle (with a shift in degree) in the Čech (respectively Čech–deRham) model for the usual (respectively Deligne) cohomology associated to the sheaf $\underline{U(1)}$ of smooth functions with values in the circle group $U(1) \subseteq \mathbb{C}$. We are aware that the word ‘gerbe’ originally was used for a rather different kind of object which however, in the abelian case, is closely related to our ‘2–gerbe’ in the same way as a ‘1–gerbe’ corresponds to a Hermitian line bundle. Similarly our notion of a ‘2–gerbe with connection’ is in accordance with Hitchin [25] and is in line with a widespread use of the word ‘gerbe’ in mathematical physics (see e.g. Carey–Mickelsson [8]). We refer to Brylinski [5, 7] for more information about Deligne cohomology and its relation to the original notion of ‘gerbes’ (see also Breen–Messing [3]). However, we are using the word ‘gerbe’ only in the restricted sense described in Sect. 2.

Let us now describe our main results. In the following, X will be a compact oriented smooth manifold and G a Lie group with finitely many components.

Definition 1.1.
A family of principal G-bundles over X with connections consists of the following:
(i) A smooth fibre bundle $\pi : Y \to Z$ with fibre $X$ and structure group $\mathrm{Diff}^+(X)$ of orientation preserving diffeomorphisms.
(ii) A principal G-bundle $p : E \to Y$.
(iii) A smooth family $A = \{A_z \mid z \in Z\}$ of connections in the G-bundles $P_z = E|_{X_z}$, $X_z = \pi^{-1}(z)$.

Notice that the family of connections in (iii) can always be obtained (using a partition of unity) from some ‘global’ connection $B$ in the G-bundle $E$ such that $A_z = B \mid TP_z$ for all $z \in Z$. But this global extension is not part of the structure.

Furthermore let $I^{n+1}_{\mathbb{Z}}(G) \subseteq I^{n+1}(G)$ denote the set of invariant homogeneous polynomials of degree $n + 1$ on the Lie algebra $\mathfrak{g}$ such that the Chern-Weil image is an integral class. That is, $Q \in I^{n+1}_{\mathbb{Z}}(G)$ corresponds in the cohomology $H^{2n+2}(BG, \mathbb{R})$, $BG$ the classifying space, to the image of a class $u \in H^{2n+2}(BG, \mathbb{Z})$ by the map induced by the natural inclusion $\mathbb{Z} \subseteq \mathbb{R}$. We shall distinguish between two cases: In Case I (the ‘Godbillon–Vey’ case) we have $Q \in \ker(I^*(G) \to I^*(K))$, $K \subseteq G$ a maximal compact subgroup, and $u$ can be chosen to be 0. Otherwise in Case II we have $u \ne 0$ (the ‘Cheeger–Chern–Simons case’). With this notation we shall prove the following in Case I:

Theorem 1.2. Consider $Q \in I^{n+1}(G)$ as in Case I above and let $E \to Y$ be a family of G-bundles with connections $\{A_z \mid z \in Z\}$ as in Definition 1.1. Let $\dim X = 2n + 1 - \ell$ with $0 \le \ell \le 2n + 1$.
(i) For $B$ a global extension of the family there is associated a natural class of $\ell$-forms
$$\Big[\int_{Y/Z} \Lambda(Q, B)\Big] \in \Omega^{\ell}(Z)/d\Omega^{\ell-1}(Z).$$
(ii) This class is independent of the choice of extension provided $F_{A_z}^{n+1-\ell} = 0$ for all $z \in Z$, where $F_{A_z}$ is the curvature form in the fibre $P_z$.
(iii) Curvature formula:
$$d\int_{Y/Z} \Lambda(Q, B) = (-1)^{\ell-1} \int_{Y/Z} Q(F_B^{n+1}),$$
where $Q(F_B^{n+1}) \in \Omega^{2n+2}(Y)$ is the characteristic form associated to $Q$.
(iv) If $F_{A_z}^{n-\ell} = 0$ for all $z \in Z$ then $[\int_{Y/Z} \Lambda(Q, B)]$ lies in $H^{\ell}(Z, \mathbb{R})$.

Here $\int_{Y/Z}$ denotes integration over the fibre in the bundle $\pi : Y \to Z$. Also the curvature $F_A$ of a connection $A$ in a principal G-bundle $P \to X$ is defined as usual by $F_A = dA + \tfrac{1}{2}[A, A]$.

For $Q \in I^{n+1}_{\mathbb{Z}}(G)$ as in Case II above we shall prove (Section 6) a result analogous to Theorem 1.2, only the integral class $u \in H^{2n+2}(BG, \mathbb{Z})$ has to be taken into account, and the deRham complex $\Omega^*(Z)$ is going to be replaced by the simplicial deRham complex (as in Dupont [11 or 12]) for the nerve of an open covering of $Z$. In terms of the above mentioned notion of gerbes with connections (see Sect. 2 below) we shall prove the following:

Theorem 1.3. Consider $Q \in I^{n+1}(G)$ and $u \in H^{2n+2}(BG, \mathbb{Z})$ as in Case II above, and let $E \to Y$ be a family of G–bundles with connections $\{A_z \mid z \in Z\}$ as in Definition 1.1. Let $\dim X = 2n + 1 - \ell$, $0 \le \ell \le 2n + 1$.
(i) For $B$ a global extension of the family there is associated a natural equivalence class of $\ell$–gerbes $\theta = \theta(Q, u, B)$ with connection $\omega = (\omega^0, \ldots, \omega^{\ell})$ for a suitable open covering $\mathcal{U} = \{U_i \mid i \in I\}$.
(ii) This class $[\theta, \omega]$ is independent of the choice of extension provided $F_{A_z}^{n+1-\ell} = 0$ for all $z \in Z$, where $F_{A_z}$ is the curvature form in the fibre $P_z$.
(iii) Curvature formula:
$$d\omega^0 = (-1)^{\ell-1}\, \varepsilon^* \int_{Y/Z} Q(F_B^{n+1}) \quad \text{and} \quad \delta_*[\theta] = (-1)^{\ell-1} \pi_!(u(E)). \qquad (1.1)$$
(iv) If $F_{A_z}^{n-\ell} = 0$ then $d\omega^0 = 0$ and the invariant $[\theta, \omega]$ lies in $H^{\ell}(Z, \mathbb{R}/\mathbb{Z})$.

In (1.1) $\varepsilon^* : \Omega^*(Z) \to \check{C}^0(\mathcal{U}, \underline{\Omega}^*)$ is the natural inclusion of the deRham complex into the Čech bicomplex. Furthermore $u(E) \in H^{2n+2}(Y, \mathbb{Z})$ is the associated characteristic class for the G–bundle $E \to Y$ and $\pi_! : H^{2n+2}(Y, \mathbb{Z}) \to H^{\ell+1}(Z, \mathbb{Z})$ is the usual transfer map. Finally
$$\delta_* : H^{\ell}(Z, \underline{U(1)}) \xrightarrow{\ \cong\ } H^{\ell+1}(Z, \mathbb{Z})$$
is the usual isomorphism in Čech–cohomology.

The above theorems contain the classical secondary characteristic classes by taking $X = \{\mathrm{pt}\}$ and $\ell = 2n + 1$; but in this case the invariants may depend on the extension $B$ (see Sect. 5). We are more concerned with the case $\ell \le n$ where this does not happen. In particular we shall apply Theorems 1.2 and 1.3 to families of foliated G–bundles of codimension $q$ in the sense of Kamber–Tondeur [30]. These have adapted connections $A$ whose curvature $F_A$ satisfies $F_A^{q+1} = 0$. Hence we obtain invariants for families of such foliations provided $n - \ell \ge q$. We refer to Sect. 6 for a precise statement.
In the case $\ell = 1$ Theorem 1.3 includes the construction of the generalized Quillen line bundles considered in [14] which was our motivating example. In Sect. 6 we shall also consider a relative version of our construction generalizing the notion of a ‘Chern-Simons section’ considered in [14].

Our Theorems 1.2 and 1.3 overlap with the results of Freed [20] but the methods are rather different. In fact we take advantage of the reformulation of ‘gerbes with connection’ and smooth Deligne cohomology in terms of simplicial differential forms as explained in Sect. 3. In particular the notion of integration along the fibres which we are going to use is fairly straightforward in this formulation (see Sect. 4 below or Dupont–Ljungmann [17]). Also, as we shall see in Sect. 5, the Cheeger–Chern–Simons characters are represented by simplicial differential forms. There are by now several ways of looking at gerbes with connection (see e.g. Hitchin [25]), but we hope to demonstrate that the representation as a simplicial differential form is both an attractive and a convenient point of view.

2. Gerbes with Connection

In this section we briefly recall the notion of a ‘gerbe with connection’ and smooth ‘Deligne cohomology’. We refer to [5] for more information. We shall only consider Hermitian line gerbes which are by definition Čech cocycles for the sheaf $\underline{U(1)}$ of smooth functions with values in the circle group $U(1) \subseteq \mathbb{C}$. For convenience we shall identify this group with $\mathbb{R}/\mathbb{Z}$ via the map $z \leftrightarrow \frac{1}{2\pi i} \log z$, $z \in U(1)$. Hence a (Hermitian line) $p$–gerbe on a smooth manifold $X$ is a $p$-cocycle in the Čech complex
$$\check{C}^p(\mathcal{U}, \underline{\mathbb{R}/\mathbb{Z}}) = \prod_{(i_0, \ldots, i_p)} C^\infty(U_{i_0 \ldots i_p}, \mathbb{R}/\mathbb{Z}),$$
with the usual coboundary
$$(\check{\delta}\theta)_{i_0, \ldots, i_{p+1}} = \sum_{\nu=0}^{p+1} (-1)^{\nu}\, \theta_{i_0, \ldots, \hat{i}_\nu, \ldots, i_{p+1}}. \qquad (2.1)$$
Here $\mathcal{U} = \{U_i \mid i \in I\}$ is an open covering of $X$. For convenience we assume that $\mathcal{U}$ is ‘good’ in the sense that all non-empty intersections $U_{i_0 \ldots i_p} = U_{i_0} \cap \cdots \cap U_{i_p}$ are contractible. It is well-known that every open covering has a good refinement and that for such a covering we have $H^p(\check{C}^*(\mathcal{U}, \underline{\mathbb{R}/\mathbb{Z}})) \cong H^p(X, \underline{\mathbb{R}/\mathbb{Z}})$. Notice also that every cochain is the reduction of a cochain in $\check{C}^*(\mathcal{U}, \underline{\mathbb{R}})$ and that the isomorphism
$$\delta_* : H^p(X, \underline{\mathbb{R}/\mathbb{Z}}) \xrightarrow{\ \cong\ } H^{p+1}(X, \mathbb{Z}) \qquad (2.2)$$
is indeed induced by $\check{\delta}$ in (2.1) applied to such a lift.

In general consider the Čech–deRham bicomplex
$$\check{\Omega}^{p,q}_{\mathbb{R}}(\mathcal{U}) = \check{C}^p(\mathcal{U}, \underline{\Omega}^q) \qquad (2.3)$$
with differential in the total complex $\check{\Omega}^*_{\mathbb{R}}(\mathcal{U})$ given on $\check{\Omega}^{p,*}_{\mathbb{R}}$ by $D = \check{\delta} + (-1)^p d$. Notice that there are natural inclusions of chain complexes
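$$\check{C}^*(\mathcal{U}, \mathbb{Z}) \subseteq \check{C}^*(\mathcal{U}, \underline{\Omega}^0) \subseteq \check{\Omega}^*_{\mathbb{R}}(\mathcal{U}), \qquad (2.4)$$
and
$$\varepsilon^* : \Omega^*(X) \xrightarrow{\ \subseteq\ } \check{C}^0(\mathcal{U}, \underline{\Omega}^*) \subseteq \check{\Omega}^*_{\mathbb{R}}(\mathcal{U}), \qquad (2.5)$$
where $\varepsilon^*$ is induced by the natural map $\varepsilon : \bigsqcup_i U_i \to X$. Since $\mathcal{U}$ is good we have $\check{C}^*(\mathcal{U}, \underline{\mathbb{R}/\mathbb{Z}}) = \check{C}^*(\mathcal{U}, \underline{\Omega}^0)/\check{C}^*(\mathcal{U}, \mathbb{Z})$ and we put
$$\check{\Omega}^*_{\mathbb{R}/\mathbb{Z}}(\mathcal{U}) = \check{\Omega}^*_{\mathbb{R}}(\mathcal{U})/\check{C}^*(\mathcal{U}, \mathbb{Z}). \qquad (2.6)$$
Notice that the canonical map
$$\varepsilon^* : \Omega^*(X) \to \check{\Omega}^*_{\mathbb{R}}(\mathcal{U}) \to \check{\Omega}^*_{\mathbb{R}/\mathbb{Z}}(\mathcal{U})$$
is injective in degrees $> 0$. We now have the following:

Lemma 2.1. Let $\mathcal{U}$ be a good covering of $X$. Then
(i) $H^*(\check{\Omega}^*_{\mathbb{R}}(\mathcal{U})/\varepsilon^*\Omega^*(X)) = 0$.
(ii) $H^*(\check{\Omega}^*_{\mathbb{R}/\mathbb{Z}}(\mathcal{U})) \cong H^*(X, \mathbb{R}/\mathbb{Z})$ for $\mathbb{R}/\mathbb{Z}$ the constant sheaf.
(iii) There is a natural isomorphism
$$D_* : H^{\ell}(\check{\Omega}^*_{\mathbb{R}/\mathbb{Z}}(\mathcal{U})/\varepsilon^*\Omega^*(X)) \cong H^{\ell+1}(X, \mathbb{Z})$$
for $\ell \ge 0$.

Proof. (i) follows since $\varepsilon^* : \Omega^*(X) \xrightarrow{\subseteq} \check{\Omega}^*_{\mathbb{R}}(\mathcal{U})$ is a homology isomorphism.
(ii) follows since, for $\mathbb{R}$ the constant sheaf, the inclusion $\check{C}^*(\mathcal{U}, \mathbb{R}) \xrightarrow{\subseteq} \check{\Omega}^*_{\mathbb{R}}(\mathcal{U})$ is a homology isomorphism.
(iii) Now $D_*$ is just the connecting homomorphism for the exact sequence
$$0 \to \check{C}^*(\mathcal{U}, \mathbb{Z}) \to \check{\Omega}^*_{\mathbb{R}}(\mathcal{U})/\varepsilon^*\Omega^*(X) \to \check{\Omega}^*_{\mathbb{R}/\mathbb{Z}}(\mathcal{U})/\varepsilon^*\Omega^*(X) \to 0.$$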
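The Čech coboundary (2.1) is easy to experiment with on a toy example. The following is a minimal sketch under stated assumptions: the covering is modelled by finite sets, cochains are locally constant real values stored on strictly increasing index tuples $i_0 < \cdots < i_p$ (the ordered variant of the Čech complex, which is an assumption of this sketch rather than the convention of the text), and the names `nerve` and `coboundary` are hypothetical.

```python
from itertools import combinations


def nerve(cover_sets, max_dim):
    """Return {p: [simplices]} where a p-simplex is an increasing index tuple
    (i0 < ... < ip) whose sets have a non-empty common intersection."""
    indices = sorted(cover_sets)
    simplices = {}
    for p in range(max_dim + 1):
        simplices[p] = [
            s for s in combinations(indices, p + 1)
            if set.intersection(*(cover_sets[i] for i in s))
        ]
    return simplices


def coboundary(theta, simplices, p):
    """Apply (2.1): (delta theta)_{i0..i_{p+1}} is the alternating sum of
    theta over the faces obtained by deleting one index."""
    result = {}
    for s in simplices.get(p + 1, []):
        result[s] = sum(
            (-1) ** nu * theta[s[:nu] + s[nu + 1:]] for nu in range(len(s))
        )
    return result


if __name__ == "__main__":
    # Three sets sharing the point 0: the nerve is a full triangle.
    cover = {0: {0, 1}, 1: {0, 2}, 2: {0, 3}}
    sx = nerve(cover, 2)
    theta = {(0,): 1.0, (1,): 2.5, (2,): -4.0}  # a 0-cochain
    d_theta = coboundary(theta, sx, 0)
    dd_theta = coboundary(d_theta, sx, 1)
    print(d_theta)   # differences theta_j - theta_i on each edge (i, j)
    print(dd_theta)  # vanishes on the single 2-simplex
```

Applying `coboundary` twice gives zero, which is the property that makes (2.1) a differential. For a three-arc covering of the circle the nerve has no 2-simplices, so every 1-cochain is automatically a cocycle.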
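We can now define a gerbe with connection as follows:

Definition 2.2. Let $\mathcal{U}$ be a good covering for $X$.
(i) A connection $\omega$ in an $\ell$–gerbe $\theta \in \check{\Omega}^{\ell,0}_{\mathbb{R}/\mathbb{Z}}(\mathcal{U})$, $\check{\delta}\theta = 0$, is given by $\omega \in \check{\Omega}^{\ell}_{\mathbb{R}}(\mathcal{U})$, that is a sequence $\omega = (\omega^0, \ldots, \omega^{\ell})$, $\omega^{\nu} \in \check{\Omega}^{\nu,\ell-\nu}_{\mathbb{R}}(\mathcal{U})$, $\nu = 0, \ldots, \ell$, with $\omega^{\ell} \equiv -\theta \bmod \mathbb{Z}$, such that $\omega$ is a cycle in $\check{\Omega}^*_{\mathbb{R}/\mathbb{Z}}(\mathcal{U})/\varepsilon^*\Omega^*(X)$.
(ii) The curvature form for $\omega$ is the unique closed $(\ell + 1)$–form $F_\omega$ such that
$$\varepsilon^* F_\omega = d\omega^0 \in \check{\Omega}^{0,\ell+1}_{\mathbb{R}}(\mathcal{U}).$$
The connection is called flat if $F_\omega = 0$.
(iii) Two $\ell$–gerbes $\theta_1, \theta_2$ with connections $\omega_1, \omega_2$ are equivalent if $\omega_1 - \omega_2$ is a coboundary in $\check{\Omega}^*_{\mathbb{R}/\mathbb{Z}}(\mathcal{U})$. The set of equivalence classes $[\theta, \omega]$ is denoted $H^{\ell+1}_{\mathcal{D}}(X, \mathbb{Z})$ and is called the (smooth) Deligne cohomology in degree $\ell + 1$ (note the shift in degree).

Remark 2.3.
1. Thus $H^{\ell+1}_{\mathcal{D}}(X, \mathbb{Z})$ is the homology of the sequence
$$\check{\Omega}^{\ell-1}_{\mathbb{R}/\mathbb{Z}}(\mathcal{U}) \xrightarrow{\ d\ } \check{\Omega}^{\ell}_{\mathbb{R}/\mathbb{Z}}(\mathcal{U}) \xrightarrow{\ d\ } \check{\Omega}^{\ell+1}_{\mathbb{R}/\mathbb{Z}}(\mathcal{U})/\varepsilon^*\Omega^{\ell+1}(X). \qquad (2.7)$$
2. The set of equivalence classes of $\ell$–gerbes with flat connections is isomorphic to $H^{\ell}(X, \mathbb{R}/\mathbb{Z})$ by Lemma 2.1.
3. It follows also using Lemma 2.1, that there is a natural exact sequence
$$0 \to H^{\ell}(X, \mathbb{R}/\mathbb{Z}) \longrightarrow H^{\ell+1}_{\mathcal{D}}(X, \mathbb{Z}) \xrightarrow{\ d_*\ } \Omega^{\ell+1}_{\mathrm{cl}}(X, \mathbb{Z}) \longrightarrow 0. \qquad (2.8)$$
Here $\Omega^{\ell+1}_{\mathrm{cl}}(X, \mathbb{Z}) \subseteq \Omega^{\ell+1}(X)$ denotes the set of closed forms with integral periods, and $d_*$ is induced by the map sending $\omega$ to the curvature form $F_\omega$. In particular, as the notation indicates, $H^{\ell+1}_{\mathcal{D}}(X, \mathbb{Z})$ does not depend on the choice of a (good) covering $\mathcal{U}$.
4. Notice the natural commutative diagram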
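$$\begin{array}{ccc}
H^{\ell+1}_{\mathcal{D}}(X, \mathbb{Z}) & \longrightarrow & H^{\ell}(X, \underline{\mathbb{R}/\mathbb{Z}}) \\
\downarrow & & \cong \downarrow \delta_* \\
H^{\ell}(\check{\Omega}^*_{\mathbb{R}/\mathbb{Z}}(\mathcal{U})/\varepsilon^*\Omega^*(X)) & \xrightarrow[\ \cong\ ]{\ D_*\ } & H^{\ell+1}(X, \mathbb{Z})
\end{array} \qquad (2.9)$$
where the top horizontal map is induced by the map forgetting the connection and where $D_*$ is given by Lemma 2.1.
5. The explicit description of an $\ell$–gerbe $\theta$ with connection $\omega$ is as follows. Let $\omega$ be a sequence $(\omega^0, \ldots, \omega^{\ell})$ of cochains $\omega^{\nu} \in \check{\Omega}^{\nu,\ell-\nu}_{\mathbb{R}}(\mathcal{U})$, $\nu = 0, \ldots, \ell$, satisfying
$$\check{\delta}\omega^{\nu-1} + (-1)^{\nu} d\omega^{\nu} = 0, \quad \nu = 1, \ldots, \ell, \qquad \check{\delta}\omega^{\ell} \equiv 0 \bmod \mathbb{Z}. \qquad (2.10)$$
The first equation for $\nu = 1$ in (2.10) implies that $\check{\delta} d\omega^0 = 0$, and $d\omega^0$ defines a global closed $(\ell + 1)$–form $F_\omega$, that is $\varepsilon^* F_\omega = d\omega^0$. The last equation in (2.10) says that $-\check{\delta}\omega^{\ell}$ is an integral $(\ell + 1)$–cycle $z \in \check{Z}^{\ell+1}(\mathcal{U}, \mathbb{Z})$, that is $-\omega^{\ell} \in \check{\Omega}^{\ell,0}_{\mathbb{R}}(\mathcal{U})$ is the lift of a unique $\ell$–cycle $\theta \in \check{\Omega}^{\ell,0}_{\mathbb{R}/\mathbb{Z}}(\mathcal{U})$. Thus from (2.2) we have $\delta_*[\theta] = [z]$. Moreover by construction, the integral class $[z]$ determines the class $[F_\omega]$ under the canonical homomorphism $r : H^{\ell+1}(X, \mathbb{Z}) \to H^{\ell+1}(X, \mathbb{R})$. Then $\omega$ is a connection for the $\ell$–gerbe $\theta$.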
6. In terms of the notation in [5] our smooth Deligne cohomology group $H^{\ell+1}_{\mathcal{D}}(X, \mathbb{Z})$ is canonically isomorphic to the group $H^{\ell+1}_{\mathcal{D},\infty}(X, \mathbb{Z}(\ell+1))$, that is the hypercohomology group of $X$ in degree $\ell + 1$ with values in the sheaf complex
$$\mathbb{Z} \to \underline{\Omega}^0 \to \underline{\Omega}^1 \to \ldots \to \underline{\Omega}^{\ell}.$$
Since in the smooth case $H^k_{\mathcal{D},\infty}(X, \mathbb{Z}(\ell + 1))$ is ordinary cohomology with coefficients $\mathbb{R}/\mathbb{Z}$ for $k < \ell + 1$, respectively $\mathbb{Z}$ for $k > \ell + 1$, $k = \ell + 1$ is the only degree which needs a special name and we have therefore deleted the extra index $\ell$ from the notation. This is of course in contrast to the holomorphic Deligne cohomology for an algebraic variety.

Finally let us mention the interpretation of $H^*_{\mathcal{D}}(X, \mathbb{Z})$ as the group of differential characters in the sense of Cheeger-Simons [9] (see also Dupont et al. [13]). Let $C^{\mathrm{Sing}}_*(X)$ denote the chain complex of (smooth) singular chains in $X$ and let
$$\mathcal{I} : \Omega^*(X) \to C^*_{\mathrm{Sing}}(X, \mathbb{R}) = \mathrm{Hom}_{\mathbb{Z}}(C^{\mathrm{Sing}}_*(X), \mathbb{R})$$
be the deRham integration map.

Definition 2.4. The group of differential characters (mod $\mathbb{Z}$) in degree $\ell + 1$ is
$$\hat{H}^{\ell+1}(X, \mathbb{Z}) = \{(f, \alpha) \in \mathrm{Hom}_{\mathbb{Z}}(Z^{\mathrm{Sing}}_{\ell}(X), \mathbb{R}/\mathbb{Z}) \oplus \Omega^{\ell+1}(X) \mid \delta f = \mathcal{I}(\alpha) \text{ and } d\alpha = 0\}.$$
Here $Z^{\mathrm{Sing}}_{\ell}(X) \subseteq C^{\mathrm{Sing}}_{\ell}(X)$ is the set of cycles.

The following is well-known (cf. [13]) but is included for completeness:

Proposition 2.5. There is a natural isomorphism $H^*_{\mathcal{D}}(X, \mathbb{Z}) \cong \hat{H}^*(X, \mathbb{Z})$.
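Proof. Choose a good open covering $\mathcal{U} = \{U_i \mid i \in I\}$ of $X$ and let $i : C^{\mathrm{Sing}}_*(X, \mathcal{U}) \subseteq C^{\mathrm{Sing}}_*(X)$ be the inclusion of the subcomplex generated by $\bigcup_{i \in I} C^{\mathrm{Sing}}_*(U_i)$. Since $i$ is a chain equivalence we can choose a chain map
$$p : C^{\mathrm{Sing}}_*(X) \to C^{\mathrm{Sing}}_*(X, \mathcal{U})$$
such that $p \circ i = \mathrm{id}$ and $i \circ p$ is chain homotopic to the identity with chain homotopy $s$. Then for $(f, \alpha) \in \hat{H}^{\ell+1}(X, \mathbb{Z})$ and $\xi \in Z^{\mathrm{Sing}}_{\ell}(X)$ we have
$$\langle f, \xi \rangle - \langle f, i \circ p(\xi) \rangle = \langle \delta f, s(\xi) \rangle = \langle \mathcal{I}(\alpha), s(\xi) \rangle$$
so that $f$ is determined by its restriction to the set of cycles $Z^{\mathrm{Sing}}_{\ell}(X, \mathcal{U})$ in the chain complex $C^{\mathrm{Sing}}_*(X, \mathcal{U})$. Hence we can replace $Z^{\mathrm{Sing}}_{\ell}(X)$ by $Z^{\mathrm{Sing}}_{\ell}(X, \mathcal{U})$ in Definition 2.4. Now we consider the Čech bicomplex of singular chains
$$\check{C}^{\mathrm{Sing}}_{p,q}(\mathcal{U}) = \bigoplus_{(i_0, \ldots, i_p)} C^{\mathrm{Sing}}_q(U_{i_0 \cdots i_p})$$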
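with associated total complex $\check{C}^{\mathrm{Sing}}_*(\mathcal{U})$. Then again the natural chain map
$$\varepsilon_* : \check{C}^{\mathrm{Sing}}_*(\mathcal{U}) \longrightarrow \check{C}^{\mathrm{Sing}}_{0,*}(\mathcal{U}) \longrightarrow C^{\mathrm{Sing}}_*(X, \mathcal{U})$$
induced by $\varepsilon : \bigsqcup_{i \in I} U_i \to X$, has an ‘inverse’ chain map $j$ such that $\varepsilon_* \circ j = \mathrm{id}$ and $j \circ \varepsilon_*$ is chain homotopic to the identity. Now we can define a map
$$j^* : H^*_{\mathcal{D}}(X, \mathbb{Z}) \to \hat{H}^*(X, \mathbb{Z})$$
by $j^*[\omega, \theta] = (f, \alpha)$, where $f(\xi) = \langle \mathcal{I}(\omega), j(\xi) \rangle$, $\xi \in Z^{\mathrm{Sing}}_{\ell}(X, \mathcal{U})$, and $\alpha = (\varepsilon^*)^{-1} d\omega^0$. In fact for $x \in C^{\mathrm{Sing}}_{\ell+1}(X, \mathcal{U})$ we have
$$\langle \delta f, x \rangle = \langle \mathcal{I}(\omega), \partial j(x) \rangle = \langle \mathcal{I}(D(\omega)), j(x) \rangle = \langle \mathcal{I}(d\omega^0), j(x) \rangle = \langle \mathcal{I}(\alpha), \varepsilon_* j(x) \rangle = \langle \mathcal{I}(\alpha), x \rangle$$
so that $(f, \alpha) \in \hat{H}^{\ell+1}(X, \mathbb{Z})$. Since any two choices of $j$ are chain homotopic, it is also straightforward to see that $j^*$ does not depend on the particular choice. Finally, in order to show that $j^*$ is an isomorphism one just observes that there is a natural exact sequence similar to the one in (2.8):
$$0 \to H^{\ell}(X, \mathbb{R}/\mathbb{Z}) \to \hat{H}^{\ell+1}(X, \mathbb{Z}) \to \Omega^{\ell+1}_{\mathrm{cl}}(X, \mathbb{Z}) \to 0, \qquad (2.11)$$
where the second map is the one sending $(f, \alpha)$ to $\alpha$.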
3. Gerbes and Simplicial Forms

In this section we shall reformulate the smooth Deligne cohomology in terms of simplicial deRham cohomology as in [11, 12 and 15]. As before let $X$ be a smooth manifold and let $\mathcal{U} = \{U_i\}_{i \in I}$ be a good covering of $X$. For convenience we choose a linear ordering of the index set $I$. The nerve $N\mathcal{U}$ of $\mathcal{U}$ is the simplicial manifold $N\mathcal{U} = \{N\mathcal{U}(p)\}_{p \ge 0}$, given by
$$N\mathcal{U}(p) = \bigsqcup_{i_0 \le \cdots \le i_p} U_{i_0 \cdots i_p} \qquad (3.1)$$
and with face and degeneracy operators $\varepsilon_i : N\mathcal{U}(p) \to N\mathcal{U}(p - 1)$, $i = 0, \ldots, p$, $\eta_j : N\mathcal{U}(p) \to N\mathcal{U}(p + 1)$, $j = 1, \ldots, p$, given by the obvious inclusion maps corresponding to deletion of the $i$-th index, respectively repeating the $j$-th index. Also let $\Delta^p \subseteq \mathbb{R}^{p+1}$ be the standard $p$-simplex
$$\Delta^p = \Big\{ t = (t_0, \ldots, t_p) \ \Big|\ t_i \ge 0,\ \sum_{i=0}^{p} t_i = 1 \Big\}$$
with the corresponding face and degeneracy maps $\varepsilon^i : \Delta^{p-1} \to \Delta^p$, $i = 0, \ldots, p$, respectively $\eta^j : \Delta^{p+1} \to \Delta^p$, $j = 0, \ldots, p$.
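The face and degeneracy maps on the standard simplex can be made concrete in barycentric coordinates. The sketch below assumes the standard conventions, which the text does not spell out: $\varepsilon^i$ inserts a zero in position $i$ and $\eta^j$ adds the coordinates $t_j$ and $t_{j+1}$. The function names are hypothetical.

```python
def face(i, t):
    """epsilon^i : Delta^{p-1} -> Delta^p, inserting 0 at position i."""
    t = tuple(t)
    return t[:i] + (0.0,) + t[i:]


def degeneracy(j, t):
    """eta^j : Delta^{p+1} -> Delta^p, adding coordinates j and j + 1."""
    t = tuple(t)
    return t[:j] + (t[j] + t[j + 1],) + t[j + 2:]


def in_simplex(t, tol=1e-12):
    """Check t_i >= 0 and sum t_i = 1 up to a tolerance."""
    return all(x >= -tol for x in t) and abs(sum(t) - 1.0) <= tol


if __name__ == "__main__":
    t = (0.25, 0.75)  # a point of Delta^1
    print(face(0, t), face(1, t), face(2, t))  # its three images in Delta^2
    # Simplicial identity: epsilon^j epsilon^i = epsilon^i epsilon^{j-1} for i < j.
    print(face(2, face(0, t)) == face(0, face(1, t)))
    # eta^j epsilon^j = id.
    print(degeneracy(1, face(1, t)) == t)
```

Both maps preserve the conditions $t_i \ge 0$, $\sum t_i = 1$, and they satisfy the usual simplicial identities, two of which are checked above.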
Definition 3.1. (i) A simplicial $k$-form $\omega$ on $N\mathcal{U}$ is a sequence of $k$-forms $\omega^{(p)}$ on $\Delta^p \times N\mathcal{U}(p)$ satisfying
$$(\varepsilon^i \times \mathrm{id})^* \omega^{(p)} = (\mathrm{id} \times \varepsilon_i)^* \omega^{(p-1)}, \quad i = 0, \ldots, p, \quad p = 1, 2, \ldots.$$
(ii) $\omega$ is called normal if it furthermore satisfies
$$(\eta^i \times \mathrm{id})^* \omega^{(p)} = (\mathrm{id} \times \eta_i)^* \omega^{(p+1)}, \quad i = 1, \ldots, p, \quad p = 1, 2, \ldots.$$

We shall denote the set of simplicial $k$-forms (respectively normal $k$-forms) by $\Omega^k(\|N\mathcal{U}\|)$ (respectively $\Omega^k(|N\mathcal{U}|)$) corresponding to the ‘fat’ (respectively ‘thin’) realizations $\|N\mathcal{U}\|$ (respectively $|N\mathcal{U}|$). Clearly $\Omega^*(\|N\mathcal{U}\|)$ is a differential graded algebra and $\Omega^*(|N\mathcal{U}|) \subseteq \Omega^*(\|N\mathcal{U}\|)$ is a DGA-subalgebra. Notice that the inclusions $U_i \subseteq X$ induce a natural simplicial map $\varepsilon : N\mathcal{U} \to N\{X\}$ and this in turn induces a DGA-map
$$\varepsilon^* : \Omega^*(X) \to \Omega^*(|N\mathcal{U}|) \subseteq \Omega^*(\|N\mathcal{U}\|), \qquad (3.2)$$
where $\Omega^*(X) = \Omega^*(|N\{X\}|)$ is the usual deRham complex. It follows from [11] that $\varepsilon^*$ induces homology isomorphisms
$$\varepsilon^* : H(\Omega^*(X)) \xrightarrow{\ \cong\ } H(\Omega^*(|N\mathcal{U}|)) \xrightarrow{\ \cong\ } H(\Omega^*(\|N\mathcal{U}\|)). \qquad (3.3)$$
The relation with the Čech–deRham complex in Sect. 2 is given by the integration map
$$I_\Delta : \Omega^{p,q}(\|N\mathcal{U}\|) \to \check{\Omega}^{p,q}_{\mathbb{R}}(\mathcal{U}), \quad I_\Delta(\omega) = \int_{\Delta^p} \omega^{(p)}, \qquad (3.4)$$
where $\omega$ lies in $\Omega^{p,q}$ if it has degree $p$ as a form in the variables of $\Delta^n$, $n \ge p$. This is a map of bicomplexes and again by [11] the corresponding map of total complexes induces an isomorphism
$$I_\Delta : H(\Omega^*(\|N\mathcal{U}\|)) \xrightarrow{\ \cong\ } H(\check{\Omega}^*_{\mathbb{R}}(\mathcal{U})). \qquad (3.5)$$
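Also $I_\Delta$ clearly commutes with $\varepsilon^*$ given by (2.5) and (3.2). For the representation of the integral cohomology we also consider the discrete simplicial set $N_d\mathcal{U}$, where a $p$-simplex is a point $(i_0, \ldots, i_p)$ for each non-empty intersection $U_{i_0} \cap \cdots \cap U_{i_p}$, $i_0 \le i_1 \le \ldots \le i_p$, and we let $\eta : N\mathcal{U} \to N_d\mathcal{U}$ denote the simplicial map sending $U_{i_0} \cap \cdots \cap U_{i_p}$ to $(i_0, \ldots, i_p)$. Notice that for $\mathcal{U}$ a good covering we have a commutative diagram of homotopy equivalences
$$\begin{array}{ccc}
\|N\mathcal{U}\| & \longrightarrow & |N\mathcal{U}| \\
{\scriptstyle \|\eta\|} \downarrow & & \downarrow {\scriptstyle |\eta|} \\
\|N_d\mathcal{U}\| & \longrightarrow & |N_d\mathcal{U}|
\end{array} \qquad (3.6)$$
and a similar diagram of isomorphisms
$$\begin{array}{ccc}
H(\Omega^*(\|N\mathcal{U}\|)) & \xleftarrow{\ \cong\ } & H(\Omega^*(|N\mathcal{U}|)) \\
{\scriptstyle \eta^*} \uparrow \cong & & \cong \uparrow {\scriptstyle \eta^*} \\
H(\Omega^*(\|N_d\mathcal{U}\|)) & \xleftarrow{\ \cong\ } & H(\Omega^*(|N_d\mathcal{U}|))
\end{array} \qquad (3.7)$$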
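Also notice that $\eta^*$ maps $\Omega^*(\|N_d\mathcal{U}\|) = \Omega^{*,0}(\|N_d\mathcal{U}\|)$ injectively into $\Omega^{*,0}(\|N\mathcal{U}\|) \subseteq \Omega^*(\|N\mathcal{U}\|)$ and that $\omega \in \Omega^*(\|N\mathcal{U}\|)$ lies in the image if and only if it only involves the variables of $\Delta^p$.

Definition 3.2. (i) A $k$-form $\omega \in \Omega^*(\|N\mathcal{U}\|)$ is called discrete if $\omega \in \eta^*(\Omega^*(\|N_d\mathcal{U}\|))$.
(ii) $\omega \in \Omega^*(\|N\mathcal{U}\|)$ is called integral if it is discrete and if furthermore
$$I_\Delta(\omega) \in \check{C}^*(\mathcal{U}, \mathbb{Z}) \subseteq \check{\Omega}^{*,0}_{\mathbb{R}}(\mathcal{U}).$$
We let $\Omega^*_{\mathbb{Z}}(\|N\mathcal{U}\|) \subseteq \Omega^*(\|N\mathcal{U}\|)$ (respectively $\Omega^*_{\mathbb{Z}}(|N\mathcal{U}|) \subseteq \Omega^*(|N\mathcal{U}|)$) denote the chain complex of integral forms (respectively integral normal forms) and we also put
$$\Omega^*_{\mathbb{R}/\mathbb{Z}}(\|N\mathcal{U}\|) = \Omega^*(\|N\mathcal{U}\|)/\Omega^*_{\mathbb{Z}}(\|N\mathcal{U}\|) \qquad (3.8)$$
respectively
$$\Omega^*_{\mathbb{R}/\mathbb{Z}}(|N\mathcal{U}|) = \Omega^*(|N\mathcal{U}|)/\Omega^*_{\mathbb{Z}}(|N\mathcal{U}|). \qquad (3.9)$$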
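We now have the following:

Proposition 3.3. Let $\mathcal{U}$ be a good covering. Then there are natural isomorphisms
(i) $H(\Omega^*_{\mathbb{Z}}(\|N_d\mathcal{U}\|)) \overset{\eta^*}{\cong} H(\Omega^*_{\mathbb{Z}}(\|N\mathcal{U}\|)) \overset{I_{\Delta *}}{\cong} H(\check{C}^*(\mathcal{U}, \mathbb{Z})) = H^*(X, \mathbb{Z})$,
(ii) $H(\Omega^*_{\mathbb{R}/\mathbb{Z}}(\|N\mathcal{U}\|)) \overset{I_{\Delta *}}{\cong} H(\check{\Omega}^*_{\mathbb{R}/\mathbb{Z}}(\mathcal{U})) \cong H^*(X, \mathbb{R}/\mathbb{Z})$,
(iii) $H^{\ell}\big(\Omega^*(\|N\mathcal{U}\|)/(\Omega^*_{\mathbb{Z}}(\|N\mathcal{U}\|) + \varepsilon^*\Omega^*(X))\big) \overset{d_*}{\cong} H^{\ell+1}(\Omega^*_{\mathbb{Z}}(\|N\mathcal{U}\|)) \cong H^{\ell+1}(X, \mathbb{Z})$.
(iv) Furthermore $I_\Delta$ induces a natural isomorphism to $H^{\ell+1}_{\mathcal{D}}(X, \mathbb{Z})$ from the homology of the sequence
$$\Omega^{\ell-1}_{\mathbb{R}/\mathbb{Z}}(\|N\mathcal{U}\|) \xrightarrow{\ d\ } \Omega^{\ell}_{\mathbb{R}/\mathbb{Z}}(\|N\mathcal{U}\|) \xrightarrow{\ d\ } \Omega^{\ell+1}_{\mathbb{R}/\mathbb{Z}}(\|N\mathcal{U}\|)/\varepsilon^*\Omega^{\ell+1}(X). \qquad (3.10)$$
(v) In (i)–(iv) above $\|N\mathcal{U}\|$ can be replaced by $|N\mathcal{U}|$.

Proof. (i) In the commutative diagram
$$\begin{array}{ccc}
\Omega^*_{\mathbb{Z}}(\|N_d\mathcal{U}\|) & & \\
{\scriptstyle \eta^*} \downarrow & \searrow^{I_\Delta} & \\
\Omega^*_{\mathbb{Z}}(\|N\mathcal{U}\|) & \xrightarrow{\ I_\Delta\ } & \check{C}^*(\mathcal{U}, \mathbb{Z})
\end{array}$$
$\eta^*$ is an isomorphism and $I_\Delta$ for $N_d\mathcal{U}$ is a homology isomorphism since it is surjective and the kernel has vanishing homology by the simplicial deRham theorem. Hence also $I_\Delta$ for $N\mathcal{U}$ is a homology isomorphism.
(ii) now follows from (i) and (3.5) together with Lemma 2.1 (ii).
(iii) is similar to Lemma 2.1, (iii).
(iv) follows from the five–lemma applied to the sequence in (2.8) and the corresponding sequence for the homology group in (3.10).
(v) follows similarly.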
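Corollary 3.4. Every class in $H^{\ell+1}_{\mathcal{D}}(X, \mathbb{Z})$ can be represented by an $\ell$–gerbe $\theta$ with connection $\omega$ of the form $\omega = I_\Delta(\Lambda)$ for some simplicial $\ell$-form $\Lambda \in \Omega^{\ell}(\|N\mathcal{U}\|)$ satisfying
$$d\Lambda = \varepsilon^*\alpha - \eta^*\beta, \quad \alpha \in \Omega^{\ell+1}(X), \quad \beta \in \Omega^{\ell+1}_{\mathbb{Z}}(\|N_d\mathcal{U}\|). \qquad (3.11)$$
Furthermore $\Lambda$ and $\beta$ can be chosen to be normal in the sense of Definition 3.1.

Remark 3.5.
1. We shall call a (normal) simplicial $\ell$-form $\Lambda$ a (normal) simplicial $\ell$–gerbe if it satisfies (3.11).
2. Continuing with the previous notation, we write
$$\Lambda = \Lambda^0 + \cdots + \Lambda^{\ell} \in \bigoplus_{\nu=0}^{\ell} \Omega^{\nu,\ell-\nu}(\|N\mathcal{U}\|)$$
and we put
$$\theta = -\int_{\Delta^{\ell}} \Lambda^{\ell}, \quad \omega^{\nu} = \int_{\Delta^{\nu}} \Lambda^{\nu}, \quad \nu = 0, \ldots, \ell. \qquad (3.12)$$
Then (3.11) corresponds to the condition (2.10) for the $\ell$–gerbe $\theta$ with connection $\omega = (\omega^0, \ldots, \omega^{\ell} \equiv -\theta)$.
3. Note that $\alpha$ and $\beta$ in (3.11) are uniquely determined by $\Lambda$ and that $\alpha$ is the curvature form of $\omega$. We shall refer to it as the curvature form for $\Lambda$.
4. By (3.11) and (3.12) we have
$$I_\Delta(\beta) = -\int_{\Delta^{\ell+1}} d\Lambda = \check{\delta}\theta \in \check{C}^{\ell+1}(\mathcal{U}, \mathbb{Z}). \qquad (3.13)$$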
Hence $\beta$ represents the characteristic class $z = \check{\delta}_*[\theta] \in H^{\ell+1}(X, \mathbb{Z}) = H^{\ell+1}(\Omega^*_{\mathbb{Z}}(\|N_d\mathcal{U}\|))$.
5. The simplicial deRham complexes $\Omega^*(\|N\mathcal{U}\|)$ and $\Omega^*(|N\mathcal{U}|)$ as well as the corresponding subcomplexes of integral forms are clearly functorial with respect to smooth maps $f : X' \to X$ and compatible coverings. By this we mean coverings $\mathcal{U}' = \{U'_{i'}\}_{i' \in I'}$ of $X'$ and $\mathcal{U} = \{U_i\}_{i \in I}$ of $X$ together with an order preserving map $\nu : I' \to I$ such that $f(U'_{i'}) \subseteq U_{\nu(i')}$ for all $i' \in I'$; that is, $\mathcal{U}'$ is a refinement of $f^{-1}(\mathcal{U})$. The induced maps in the deRham complexes do depend on $\nu$ but the induced map in Deligne cohomology does not. Notice that this is the case also for $f = \mathrm{id} : X \to X$, that is, when $\mathcal{U}'$ is a refinement of $\mathcal{U}$.
6. If $X$ has dimension $m$ then it also has covering dimension $m$ (see e.g. [29], Chap. II). Hence by taking a suitable refinement we obtain a covering $\mathcal{U}'$ for which $N\mathcal{U}'$ has only non-degenerate simplices of dimension $\le m$. In particular for such a covering we have
$$\Omega^{k,\ell}(|N\mathcal{U}'|) = 0 \quad \text{and} \quad \Omega^{k}(|N_d\mathcal{U}'|) = 0 \quad \text{for } k > m. \qquad (3.14)$$
4. Fibre Integration of Simplicial Forms

Fibre integration in smooth Deligne cohomology can be done in various ways, see e.g. Freed [20], Gomi–Terashima [22] or Hopkins–Singer [26]. In this section we sketch how to define it in terms of simplicial forms. We refer to Dupont–Ljungmann [17] for the details.
In the following $X$ denotes an oriented compact manifold of dimension $m$ possibly with boundary and $\pi : Y \to Z$ is a smooth fibre bundle with fibre $X$ and structure group $\mathrm{Diff}^+(X)$ of orientation preserving diffeomorphisms. Also let $\mathcal V = \{V_j\}_{j\in J}$ and $\mathcal U = \{U_i\}_{i\in I}$ be good open coverings of $Y$ respectively $Z$ (not necessarily compatible). We shall define integration along the fibre for a normal simplicial $(k+m)$-form $\omega \in \Omega^{k+m}(|N\mathcal V|)$ as a simplicial $k$-form $\int_{Y/Z}\omega \in \Omega^{k}(\|N\mathcal U\|)$ defined by usual fibre integration in the bundle $\Delta^p \times N(\pi^{-1}\mathcal U)(p) \to \Delta^p \times N\mathcal U(p)$, $p = 0, 1, 2, \dots$ with fibre $X$:
\[ \int_{Y/Z}\omega\,\Big|_{\Delta^p\times N(\pi^{-1}\mathcal U)(p)} = \int_{(\Delta^p\times N(\pi^{-1}\mathcal U)(p))/(\Delta^p\times N\mathcal U(p))} \tilde\phi^{*}\omega, \tag{4.1} \]
where $\pi^{-1}\mathcal U = \{\pi^{-1}U_i\}_{i\in I}$ is the obvious covering of $Y$ and $\tilde\phi : \|N(\pi^{-1}\mathcal U)\| \to |N\mathcal V|$ denotes a 'piecewise smooth' map associated to a choice of partition of unity for the coverings $\{\pi^{-1}U_i \cap V_j\}_{j\in J}$ for each $i \in I$. For the construction of $\tilde\phi$ let us assume for simplicity that $\pi : Y \to Z$ is the product fibration $X \times Z \to Z$. For the case of a general fibration we refer to [17]. By Remark 3.5, 5 we can assume that $\mathcal V = \mathcal U' \times \mathcal U = \{V_{ij} = U'_j \times U_i\}_{i\in I,\, j\in J}$, where $\mathcal U = \{U_i\}_{i\in I}$ and $\mathcal U' = \{U'_j\}_{j\in J}$ are open coverings of $Z$ and $X$ respectively and we order $I \times J$ lexicographically with $i \in I$ before $j \in J$. (Notice the interchange in $V_{ij}$.) Also as in Remark 3.5, 6 we can assume that $N\mathcal U'$ has only non-degenerate simplices of dimension $\le m$ and that $N(\mathcal U' \cap \partial X)$ has only non-degenerate simplices of dimension $\le m-1$ ($m$ = dimension of $X$). Finally we choose a partition of unity $\{\phi_j\}_{j\in J}$ subordinate to $\mathcal U'$. Then the natural projection $|N\mathcal U'| \to X$ has a right-inverse $\bar\phi : X \to |N\mathcal U'|$ defined by
\[ \bar\phi(x) = \big((\phi_{j_0}(x),\dots,\phi_{j_q}(x)),\, x\big)_{j_0\cdots j_q} \in \Delta^q \times N\mathcal U'(q) \tag{4.2} \]
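for those $x \in U'_{j_0\cdots j_q} \subseteq X$ satisfying $\phi_{j_0}(x) + \cdots + \phi_{j_q}(x) = 1$. Now, we would like to define $\tilde\phi$ in a similar fashion as the composite in the diagram
\[ \begin{array}{ccc} \|N(\pi^{-1}\mathcal U)\| & \xrightarrow{\ \tilde\phi\ } & |N\mathcal U' \times N\mathcal U| = |N(\mathcal U'\times\mathcal U)| \\ \downarrow & & \tau\,\downarrow\,\approx \\ |N(\pi^{-1}\mathcal U)| = X \times |N\mathcal U| & \xrightarrow{\ \bar\phi\times\mathrm{id}\ } & |N\mathcal U'| \times |N\mathcal U| \end{array} \tag{4.3} \]
where the homeomorphism $\tau$ is induced by the Eilenberg-Zilber triangulation map $\Delta^n \times (N\mathcal U'(n) \times N\mathcal U(n)) \to (\Delta^n \times N\mathcal U'(n)) \times (\Delta^n \times N\mathcal U(n))$ given by the diagonal $\Delta^n \to \Delta^n \times \Delta^n$. It is well-known that $\tau^{-1}$ is given by the triangulation of a prism $\Delta^q \times \Delta^p$ into $n$-simplices ($n = p + q$), one for each '$(q,p)$-shuffle' of $(0,\dots,n)$, that is, a pair of non-decreasing functions
\[ (\nu,\mu) : \{0,\dots,n\} \to \{0,\dots,q\} \times \{0,\dots,p\} \]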
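satisfying
\[ \mu(0) = \nu(0) = 0, \qquad \mu(n) = p, \quad \nu(n) = q, \tag{4.4} \]
and
\[ \mu(r) - \mu(r-1) + \nu(r) - \nu(r-1) = 1, \qquad r = 1,\dots,n, \tag{4.5} \]
(so that for increasing $r$ the functions $\mu$ and $\nu$ alternate increasing by 1). It follows that $\tilde\phi^{*}\omega \in \Omega^{k+m}(\|N(\pi^{-1}\mathcal U)\|)$ is the simplicial form defined explicitly on $\Delta^p \times (X \times U_{i_0\cdots i_p})$ in a neighborhood of a point $(t, x, z)$ by the sum
\[ (\tilde\phi^{*}\omega)_{i_0\cdots i_p} = \sum_{(\nu,\mu)} \tilde\phi_{(\nu,\mu)}^{*}\,\omega \tag{4.6} \]
with $(\nu,\mu)$ running through the $(q,p)$-shuffles as above. Here $q$ is determined such that $\phi_{j_0} + \cdots + \phi_{j_q} = 1$ near $x$ and
\[ \tilde\phi_{(\nu,\mu)} : \Delta^p \times (U'_{j_0\cdots j_q} \times U_{i_0\cdots i_p}) \to \Delta^n \times \big((U'_{j_{\nu(0)}} \times U_{i_{\mu(0)}}) \cap \cdots \cap (U'_{j_{\nu(n)}} \times U_{i_{\mu(n)}})\big) \tag{4.7} \]
is given by the formula $\tilde\phi_{(\nu,\mu)}(t,x,z) = (\sigma_0,\dots,\sigma_n, x, z)$, where
\[ \sigma_r = \sum_{(\nu',\mu')} t_{\mu'}\,\phi_{j_{\nu'}}(x) \tag{4.8} \]
is a sum over the pairs of integers $(\nu',\mu')$, $\mu' = 1,\dots,p$, $\nu' = 0,\dots,q$, satisfying $(\nu(r-1),\mu(r-1)) < (\nu',\mu') \le (\nu(r),\mu(r))$ in the lexicographical order. That is,
\[ \sigma_r = \begin{cases} t_{\mu(r)}\,\phi_{j_{\nu(r)}}(x) & \text{if } \mu(r-1) = \mu(r),\ \nu(r-1) < \nu(r), \\[4pt] t_{\mu(r-1)} \displaystyle\sum_{\nu(r) < \nu'} \phi_{j_{\nu'}}(x) + t_{\mu(r)} \sum_{\nu' \le \nu(r)} \phi_{j_{\nu'}}(x) & \text{if } \mu(r-1) < \mu(r),\ \nu(r-1) = \nu(r). \end{cases} \]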
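The combinatorics of (4.4), (4.5) and the case formula for $\sigma_r$ can be checked numerically. The following Python sketch enumerates the $(q,p)$-shuffles and evaluates the coordinates $\sigma_r$. The names `shuffles` and `sigma` are illustrative only, and the starting value $\sigma_0 = t_0\,\phi_{j_0}(x)$ is an assumption (the case formula above only covers $r \ge 1$); under that assumption the $\sigma_r$ add up to 1, so that $(\sigma_0,\dots,\sigma_n)$ is a point of $\Delta^n$.

```python
from itertools import combinations
from math import comb, isclose


def shuffles(q, p):
    """Yield all (q, p)-shuffles (nu, mu) as tuples of length n + 1, n = p + q.

    Each pair satisfies (4.4) and (4.5): both start at 0, nu ends at q,
    mu ends at p, and at every step exactly one of them increases by 1.
    """
    n = p + q
    for nu_steps in combinations(range(1, n + 1), q):
        nu, mu = [0], [0]
        for r in range(1, n + 1):
            if r in nu_steps:          # nu increases at step r
                nu.append(nu[-1] + 1)
                mu.append(mu[-1])
            else:                      # mu increases at step r
                nu.append(nu[-1])
                mu.append(mu[-1] + 1)
        yield tuple(nu), tuple(mu)


def sigma(nu, mu, t, phi):
    """Coordinates (sigma_0, ..., sigma_n) for barycentric t and weights phi.

    t has length p + 1, phi has length q + 1 (values phi_{j_0}(x), ...).
    ASSUMPTION: sigma_0 = t_0 * phi_0; the text gives cases only for r >= 1.
    """
    out = [t[0] * phi[0]]
    for r in range(1, len(nu)):
        if mu[r - 1] == mu[r]:         # first case: nu increased
            out.append(t[mu[r]] * phi[nu[r]])
        else:                          # second case: mu increased
            upper = sum(phi[nu[r] + 1:])   # sum over nu' > nu(r)
            lower = sum(phi[:nu[r] + 1])   # sum over nu' <= nu(r)
            out.append(t[mu[r - 1]] * upper + t[mu[r]] * lower)
    return out


# Sample point: t in Delta^1, partition-of-unity values in Delta^2.
t_sample = (0.25, 0.75)
phi_sample = (0.2, 0.3, 0.5)
sigma_sums = [sum(sigma(nu, mu, t_sample, phi_sample))
              for nu, mu in shuffles(2, 1)]
```

The number of shuffles is the binomial coefficient $\binom{p+q}{q}$, which is the number of $n$-simplices in the triangulated prism $\Delta^q \times \Delta^p$.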
The form given by (4.6) clearly defines a smooth form in $\Delta^p \times N(\pi^{-1}\mathcal U)(p)$ so that $\int_{Y/Z}\omega$ is indeed well-defined by the formula (4.1). Also it is easy to see from the construction that it is a simplicial $k$-form, i.e., that it satisfies Definition 3.1 (i). It is however not necessarily a normal simplicial form even though $\omega$ was normal to begin with. We note the following properties of fibre integration. The signs are determined by the convention that we always integrate the variables starting from the left:
Proposition 4.1.
(i) Let $\omega \in \Omega^{k+m-1}(|N\mathcal V|)$, $m = \dim X$. Then
\[ \int_{Y/Z} d\omega = \int_{\partial Y/Z}\omega + (-1)^m\, d\int_{Y/Z}\omega. \]
(ii) If $\partial X = \emptyset$ and $\omega \in \Omega^{*}_{\mathbb Z}(|N\mathcal V|)$ then $\int_{Y/Z}\omega$ is also integral.
(iii) Suppose $\partial X = \emptyset$. Then
\[ \int_{Y/Z} : \Omega^{k+m}(|N\mathcal V|) \to \Omega^{k}(\|N\mathcal U\|) \]
induces the usual transfer map $\pi_! : H^{k+m}(Y) \to H^{k}(Z)$ with coefficients in $\mathbb R$, $\mathbb Z$ or $\mathbb R/\mathbb Z$. Also it induces a well-defined map of smooth Deligne cohomology
\[ \pi_! : H_D^{k+m}(Y,\mathbb Z) \to H_D^{k}(Z,\mathbb Z) \]
independent of choices of coverings and partition of unity.
(iv) $\pi_!$ is functorial with respect to bundle maps and compatible coverings.

Proof. Again we restrict to the product case, referring to [17] for the general case.
(i) By (4.1) this follows as for the usual fibre integration from Stokes' Theorem.
(ii) We shall prove that the Čech cochain $c = I\big(\int_{Y/Z}\omega\big)$ for the covering $\mathcal U$ has integral values. For this we observe that $(\tilde\phi^{*}\omega)_{i_0\cdots i_k}$ in (4.6) only involves $\omega$ restricted to the $(k+m)$-skeleton of $N_d\mathcal V$, hence by (3.14) it can be assumed to be a closed integral form. Since $\bar\phi : X \to |N\mathcal U'|$ has degree one, it is straightforward from (4.3) and the Eilenberg–Zilber Theorem that $c_{i_0\cdots i_k}$ is the evaluation of $I(\omega)$ on the chain $[X] \times (i_0,\dots,i_p)$ and hence is integral. In fact it follows that $c$ represents the slant product $I(\omega)/[X]$ in the integral cohomology.
(iii) Since $\pi_! : H^{k+m}(Y) \to H^{k}(Z)$ is induced by the slant product by $[X]$ the first statement is already contained in the proof of (ii). That $\pi_!$ in Deligne cohomology is independent of choice of partition of unity follows from (i) applied to $Y \times [0,1]$ and the partition of unity $\{(1-t)\phi_j + t\phi'_j\}_{j\in J}$, where $\{\phi_j\}_{j\in J}$ and $\{\phi'_j\}_{j\in J}$ are the two given ones for the covering $\mathcal V$. Independence of choice of covering is now straightforward using Remark 3.5, 5.
(iv) is also straightforward.
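Remark 4.2. In the case of a product fibration $\pi : X \times Z \to Z$ with the covering $\mathcal V = \{U'_j \times U_i\}_{(i,j)\in I\times J}$ as above, we can also represent a class in $H_D^{*}(X \times Z, \mathbb Z)$ by a normal simplicial form $\omega$ in the bisimplicial manifold $N\mathcal U' \times N\mathcal U$ (cf. [15]), i.e., by a collection of compatible forms on $\Delta^q \times \Delta^p \times N\mathcal U'(q) \times N\mathcal U(p)$. We can then define $\int_\xi \omega \in \Omega^{n-\ell}(|N\mathcal U|)$ for $\xi \in \check C^{\mathrm{Sing}}_{\ell}(N\mathcal U')$ any class in the Čech bicomplex of singular chains in the notation at the end of Sect. 2 above. In fact, for a singular $r$-simplex $\xi = \sigma : \Delta^r \to U'_{j_0\cdots j_q} \subseteq X$ with $r + q = \ell$, we just integrate the pull-back of $\omega$ to $\Delta^q \times \Delta^p \times \Delta^r \times N\mathcal U(p)$ over $\Delta^q \times \Delta^r$. For $\xi = [X]$ a representative for the fundamental cycle of $X$ (in case $\partial X = \emptyset$) we have (with $\tau$ being the Eilenberg-Zilber map as in (4.3) above):
\[ \int_{[X]} \omega = \int_{Y/Z} \tau^{*}\omega. \tag{4.9} \]
Also we have the Stokes' formula similar to Proposition 4.1 (i):
\[ \int_\xi d\omega = \int_{\partial\xi}\omega + (-1)^{\ell}\, d\int_\xi \omega \quad\text{for } \omega \in \Omega^{n}(|N\mathcal U'| \times |N\mathcal U|). \tag{4.10} \]
We refer to [17] for further details on fibre integration of simplicial forms.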
5. Secondary Characteristic Classes

In this section we reformulate the classical constructions of secondary characteristic classes and 'characters' for connections on principal $G$-bundles in terms of simplicial forms. For the classical constructions we refer to Kamber–Tondeur [30], Chern–Simons [10], Cheeger–Simons [9] or Dupont–Kamber [16]. In the following $p : P \to X$ is a smooth principal $G$-bundle, $G$ a Lie-group with only finitely many components and $K \subseteq G$ is the maximal compact subgroup. As in Sect. 1 we fix an invariant homogeneous polynomial $Q \in I^{n+1}(G)$, $n \ge 0$, such that one of the following 2 cases occur:

Case I: $Q \in \ker(I^{n+1}(G) \to I^{n+1}(K))$.

Case II: $Q \in I^{n+1}_{\mathbb Z}(G)$, that is, there exists an integral class $u \in H^{2n+2}(BK, \mathbb Z)$ representing the Chern-Weil image of $Q$ in $H^{*}(BG, \mathbb R) \cong H^{*}(BK, \mathbb R)$.

Let us introduce the notation
\[ H_D^{\ell+1}(X) = \Omega^{\ell}(X)/d\Omega^{\ell-1}(X). \tag{5.1} \]
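In the notation of [5], $H_D^{\ell+1}(X)$ is canonically isomorphic to the smooth Deligne cohomology group $H^{\ell+1}_{D,\infty}(X, 0(\ell+1))$, with '0' denoting the 0-ring, that is the hypercohomology group $\mathbb H^{\ell}(X, \Omega^{(\ell)})$ with values in the truncated sheaf complex
\[ \Omega^{(\ell)} : \Omega^{0} \to \Omega^{1} \to \dots \to \Omega^{\ell}. \]
The elements $[\omega] \in H_D^{\ell+1}(X)$ can be interpreted as equivalence classes of connections on the trivial $\ell$–gerbe $\theta = 0$ by setting
\[ \omega_0 = \varepsilon^{*}\omega, \qquad F_\omega = d\omega, \qquad \check\delta\omega_0 = 0, \qquad \omega_1 = \dots = \omega_\ell = 0. \tag{5.2} \]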
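Clearly the connection is flat if and only if $F_\omega = d\omega = 0$, that is $[\omega] \in H^{\ell}(X, \mathbb R)$. Combining this with (2.8), the data in (5.2) determine a commutative diagram with exact rows
\[ \begin{array}{ccccccccc} 0 & \to & H^{\ell}(X,\mathbb R) & \to & H_D^{\ell+1}(X) & \to & \Omega^{\ell}(X)/\Omega^{\ell}_{cl}(X) & \to & 0 \\ & & \downarrow & & \downarrow & & \downarrow d & & \\ 0 & \to & H^{\ell}(X,\mathbb R/\mathbb Z) & \to & H_D^{\ell+1}(X,\mathbb Z) & \xrightarrow{\ d_*\ } & \Omega^{\ell+1}_{cl}(X,\mathbb Z) & \to & 0. \end{array} \tag{5.3} \]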
Further, it is easy to see that the center vertical arrow in diagram (5.3) is induced by the exact hypercohomology sequence associated to the exact triangle of complexes
\[ \Omega^{(\ell)}[-1] \longrightarrow \{\mathbb Z \to \Omega^{0} \to \Omega^{1} \to \dots \to \Omega^{\ell}\} \longrightarrow \mathbb Z \xrightarrow{\ +1\ } \Omega^{(\ell)}[-1]. \]
With this notation the secondary characteristic class associated to $Q$ (Case I) or $(Q, u)$ (Case II) for a connection $A$ on $P \to X$ is a class
\[ \begin{aligned} {[\Lambda(Q, A)]} &\in H_D^{2n+2}(X) && \text{in Case I,} \\ [\Lambda(Q, u, A)] &\in H_D^{2n+2}(X, \mathbb Z) && \text{in Case II.} \end{aligned} \tag{5.4} \]
Note that the characteristic classes in $H_D^{*}(X)$ are defined by global forms, whereas the classes in $H_D^{*}(X, \mathbb Z)$ are defined by simplicial forms.
For the construction we need the following well-known lemma which we include for completeness:

Lemma 5.1. Given a $G$-bundle $p : P \to X$ with connection $A$ and an integer $N$, there is a bundle map
\[ \begin{array}{ccc} P & \xrightarrow{\ \bar\psi\ } & \bar P \\ \downarrow & & \downarrow \\ X & \xrightarrow{\ \psi\ } & \bar X \end{array} \tag{5.5} \]
and a connection $\bar A$ on $\bar P$ such that $\bar P$ is $N$-connected and such that $A = \bar\psi^{*}\bar A$.

Proof. By choosing $\bar X$ to be a smooth approximation to the classifying space $BG$ we can clearly establish the bundle map in (5.5) with $\bar P$ $N$-connected. Furthermore, by multiplying $\bar P$ with a Euclidean space, the classifying map $\psi$ can be assumed to be an embedding. Then the connection $A$ on $P$ clearly extends over a tubular neighborhood of $X$ on $\bar X$ and subsequently over all of $\bar X$ by use of a partition of unity.
Remark 5.2. 1. Since the classifying map $\bar X \to BG$ for $\bar P$ is unique up to homotopy, we have a natural identification of the cohomology $H^{k}(\bar X, \mathbb Z) \cong H^{k}(BG, \mathbb Z)$ for $k \le N$.

2. There is a functorial construction of the bundle map in (5.5) using simplicial manifolds which however requires the use of multi-simplicial constructions for the Deligne cohomology (cf. [16, 13]).

The classes in (5.4) are now constructed as follows: Choose a bundle map and connection $\bar A$ as in Lemma 5.1 with $N > 2n+2$ and choose compatible good coverings $\mathcal U = \{U_i\}_{i\in I}$ and $\bar{\mathcal U} = \{\bar U_{\bar\imath}\}_{\bar\imath\in\bar I}$ of $X$, respectively $\bar X$. Also in Case II choose a representative $\bar\gamma \in \Omega^{2n+2}_{\mathbb Z}(|N\bar{\mathcal U}|)$ for the cohomology class $u \in H^{2n+2}(|N\bar{\mathcal U}|, \mathbb Z) \cong H^{2n+2}(BG, \mathbb Z)$. Then for $F_A$ and $F_{\bar A}$ the curvature forms for $A$, respectively $\bar A$, we can find (normal simplicial) forms $\Lambda(Q, \bar A)$, respectively $\Lambda(Q, u, \bar A)$, such that
\[ \begin{aligned} Q(F_{\bar A}^{n+1}) &= d\Lambda(Q, \bar A) && \text{in Case I,} \\ \varepsilon^{*}Q(F_{\bar A}^{n+1}) - \bar\gamma &= d\Lambda(Q, u, \bar A) && \text{in Case II,} \end{aligned} \tag{5.6} \]
and we put
\[ \begin{aligned} \Lambda(Q, A) &= \psi^{*}\Lambda(Q, \bar A) \in \Omega^{2n+1}(X) && \text{in Case I,} \\ \Lambda(Q, u, A) &= \psi^{*}\Lambda(Q, u, \bar A) \in \Omega^{2n+1}_{\mathbb R/\mathbb Z}(|N\mathcal U|) && \text{in Case II.} \end{aligned} \tag{5.7} \]
Proposition 5.3.
(i) The classes $[\Lambda(Q, A)]$, respectively $[\Lambda(Q, u, A)]$ in (5.4) are well-defined given $\bar P$ and $\bar A$.
(ii) They are independent of the choice of $\bar P$ and $\bar A$.
(iii) They are natural with respect to bundle maps and compatible coverings.
(iv) Curvature formula:
\[ \begin{aligned} d\Lambda(Q, A) &= Q(F_A^{n+1}) && \text{in Case I,} \\ d\Lambda(Q, u, A) &= \varepsilon^{*}Q(F_A^{n+1}) - \gamma && \text{in Case II,} \end{aligned} \tag{5.8} \]
where $\gamma = \psi^{*}\bar\gamma \in \Omega^{2n+2}_{\mathbb Z}(|N\mathcal U|)$ represents the characteristic class $u(P)$ associated with $u$.
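(v) If $Q(F_A^{n+1}) = 0$, then
\[ \begin{aligned} {[\Lambda(Q, A)]} &\in H^{2n+1}(X, \mathbb R) && \text{in Case I,} \\ [\Lambda(Q, u, A)] &\in H^{2n+1}(X, \mathbb R/\mathbb Z) && \text{in Case II,} \end{aligned} \tag{5.9} \]
and
\[ d_*[\Lambda(Q, u, A)] = -u(P), \tag{5.10} \]
where $d_* : H^{2n+1}(X, \mathbb R/\mathbb Z) \to H^{2n+2}(X, \mathbb Z)$ is the Bockstein homomorphism.

Proof. (i), (iii), (iv), and (v) are obvious from the construction in (5.6) and (5.7). Finally for (ii), let $\bar\psi' : P \to \bar P'$ and $\bar A'$ be another choice of bundle map and connection as in Lemma 5.1. Then
\[ \begin{array}{ccc} P & \xrightarrow{\ \bar\psi\times\bar\psi'\ } & \bar P \times \bar P' \\ \downarrow & & \downarrow \\ X & \longrightarrow & (\bar P \times \bar P')/G \end{array} \]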
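is also a bundle map of the required form and $A_t = (1-t)\bar A + t\bar A'$, $t \in [0,1]$ gives a family of connections on $\bar P \times \bar P'$ pulling back to the constant family $A$ in $P$. The claim therefore follows from the following more general formula (with $\frac{dA_t}{dt} = 0$).

Lemma 5.4. Variational formula: Let $A_t$, $t \in [0,1]$, be a smooth family of connections on $P \to X$ and let $\tilde A$ denote the corresponding connection on $P \times [0,1]$ over $X \times [0,1]$. Then we have on $\Omega^{2n+1}(X)$ respectively $\Omega^{2n+1}(|N\mathcal U|)$:
\[ \begin{aligned} \Lambda(Q, A_1) - \Lambda(Q, A_0) &= (n+1)\int_0^1 Q\Big(\frac{dA_t}{dt} \wedge F_{A_t}^{n}\Big)\,dt + d\int_0^1 i_{\frac{d}{dt}}\Lambda(Q, \tilde A)\,dt, \\ \Lambda(Q, u, A_1) - \Lambda(Q, u, A_0) &= (n+1)\,\varepsilon^{*}\!\int_0^1 Q\Big(\frac{dA_t}{dt} \wedge F_{A_t}^{n}\Big)\,dt + d\int_0^1 i_{\frac{d}{dt}}\Lambda(Q, u, \tilde A)\,dt \end{aligned} \tag{5.11} \]
in Cases I and II respectively.

Proof. Notice that the connection $\tilde A$ on $P \times I$ satisfies $i_{\frac{d}{dt}}\tilde A = 0$. Hence for the curvature $F_{\tilde A} = d\tilde A + \frac12[\tilde A, \tilde A]$ we have
\[ i_{\frac{d}{dt}}F_{\tilde A} = i_{\frac{d}{dt}}\,d\tilde A = \frac{dA_t}{dt}. \]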
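In Case II say, we therefore obtain from (5.6):
\[ \begin{aligned} \frac{d}{dt}\Lambda(Q, u, A_t) - d\,i_{\frac{d}{dt}}\Lambda(Q, u, \tilde A) &= i_{\frac{d}{dt}}\,d\Lambda(Q, u, \tilde A) \\ &= \varepsilon^{*}\,i_{\frac{d}{dt}}Q(F_{\tilde A}^{n+1}) \\ &= (n+1)\,\varepsilon^{*}Q\Big(\frac{dA_t}{dt} \wedge F_{A_t}^{n}\Big), \end{aligned} \tag{5.12} \]
since we can choose the representing integral form for $u$ independent of $t$. Formula (5.11) now follows from (5.12) by integration.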
The invariants in (5.7) have certain multiplicative properties which we state next.

Proposition 5.5.
(i) For $Q_1$ and $Q_2$ both satisfying Case I, we have
\[ [\Lambda(Q_1Q_2, A)] = [Q_1(A) \wedge \Lambda(Q_2, A)] = [\Lambda(Q_1, A) \wedge Q_2(A)] \in H_D^{*}(X). \tag{5.13} \]
(ii) In Case II, let $u_1$, $u_2$ and $u_1 \cup u_2 \in H^{*}(BG, \mathbb Z)$ be represented by integral forms $\gamma_1$, $\gamma_2$ and $\gamma_3$ respectively, and choose the form $\mu$ such that $d\mu = \gamma_1 \wedge \gamma_2 - \gamma_3$. Then we have in $H_D^{*}(X, \mathbb Z)$:
\[ \begin{aligned} {[\Lambda(Q_1Q_2, u_1 \cup u_2, A)]} &= [\Lambda(Q_1, u_1, A) \wedge \psi^{*}\gamma_2 + \varepsilon^{*}Q_1(A) \wedge \Lambda(Q_2, u_2, A) - \psi^{*}\mu] \\ &= [\psi^{*}\gamma_1 \wedge \Lambda(Q_2, u_2, A) + \Lambda(Q_1, u_1, A) \wedge \varepsilon^{*}Q_2(A) - \psi^{*}\mu]. \end{aligned} \]
Proof. This is straightforward from the definitions in (5.7).
We now apply Proposition 5.3 to the case of foliated bundles in the sense of Kamber–Tondeur [30]. We recall that a principal $G$-bundle $p : P \to X$ is foliated if there are given two foliations $\bar{\mathcal F}$ on $P$, $\mathcal F$ on $X$ such that

(i) $\bar{\mathcal F}$ is given by a $G$–equivariant involutive subbundle $T\bar{\mathcal F} \subset TP$, that is the action by $G$ on $P$ permutes the leaves of $\bar{\mathcal F}$,
(ii) for each $u \in P$ the differential $p_* : T_u\bar{\mathcal F} \to T_{p(u)}\mathcal F$ is an isomorphism. (5.14)

Also the codimension of the foliated bundle is by definition the codimension of $\mathcal F$ in $X$. It is well-known that a foliated $G$-bundle $p : P \to X$ has an adapted connection, i.e., a connection $A$ satisfying $A(v) = 0$ for $v \in T_u\bar{\mathcal F}$, $u \in P$. Then it follows that the curvature form $F_A$ satisfies $F_A \in J$, where $J$ is the defining ideal of the foliation $\mathcal F$. For the codimension $q$ of $\mathcal F$, we have $J^{q+1} = 0$ and the curvature form satisfies $F_A^{q+1} \equiv 0$.

Theorem 5.6.
(i) The classes $[\Lambda(Q, A)]$ respectively $[\Lambda(Q, u, A)]$ in (5.4) are well-defined.
(ii) They are natural with respect to maps of foliated bundles.
(iii) Curvature formula: We have
\[ \begin{aligned} d\Lambda(Q, A) &= Q(F_A^{n+1}) && \text{in Case I,} \\ d\Lambda(Q, u, A) &= \varepsilon^{*}Q(F_A^{n+1}) - \gamma && \text{in Case II,} \end{aligned} \tag{5.15} \]
where $\gamma = \psi^{*}\bar\gamma \in \Omega^{2n+2}_{\mathbb Z}(|N\mathcal U|)$ represents the characteristic class $u(P)$ associated with $u \in H^{2n+2}(BK, \mathbb Z)$.
(iv) If $n \ge q$, then $Q(F_A^{n+1}) \in J^{q+1} = 0$ and
\[ \begin{aligned} {[\Lambda(Q, A)]} &\in H^{2n+1}(X, \mathbb R) && \text{in Case I,} \\ [\Lambda(Q, u, A)] &\in H^{2n+1}(X, \mathbb R/\mathbb Z) && \text{in Case II.} \end{aligned} \tag{5.16} \]
Moreover these classes are independent of the choice of adapted connection $A$.
(v) Rigidity: If $n \ge q + 1$, then the cohomology classes in (iv) are rigid under variation of the foliated structure $(P, \bar{\mathcal F}) \to (X, \mathcal F)$.

Proof. (i) to (iii) follow from the construction in (5.6), (5.7) and from Proposition 5.3. The statements in (iv) and (v) essentially follow from the variational formulas in (5.11). Equation (5.16) in (iv) follows directly from (5.15). For the last statement in (iv), let $A'$ be another choice for the adapted connection. Then the family of adapted connections $A_t$ given by the convex combination $A_t = (1-t)A + tA'$, $t \in [0,1]$ satisfies $\frac{dA_t}{dt} = A' - A = \alpha \in J$. Thus we have $Q(\alpha \wedge F_{A_t}^{n}) \in J^{q+1} = 0$ for $n \ge q$ and the statement follows from (5.11). For (v), let $(\bar{\mathcal F}_t, \mathcal F_t)$, $t \in [0,1]$, be a smooth family of foliated structures on $P \to X$. Let $A_t$, $t \in [0,1]$, be a smooth family of $(\bar{\mathcal F}_t, \mathcal F_t)$–adapted connections on $P \to X$. Then for $n \ge q+1$ we have $Q\big(\frac{dA_t}{dt} \wedge F_{A_t}^{n}\big) \in J_t^{q+1} = 0$ and (v) follows also from (5.11).
Remark 5.7. 1. Theorem 5.6 is essentially a reformulation of Theorem 2.2 in [16]. The above constructions could of course be extended to define more general characteristic classes associated to elements in (the cohomology of) the relative Weil algebra $F^{2(q+1)}W(G, K)$ as in [16].

2. Following Kamber-Tondeur [30], Sect. 2.24 we call the adapted connection $A$ basic if the Lie derivative $L_X A = i_X dA$ vanishes for all $\mathcal F$-horizontal vector fields $X$ on $P$ or equivalently if $i_X F_A = 0$, that is $F_A \in J^2$. If we can choose the connections in Theorem 5.6 to be basic, then the condition $n \ge q$ in (iv) can be replaced by $2n \ge q$ and the condition $n \ge q+1$ in (v) can be replaced by $2n \ge q+1$. In fact, we have $Q(\alpha \wedge F_{A_t}^{n}) \in J^{2n+1}$ in (iv), and $Q\big(\frac{dA_t}{dt} \wedge F_{A_t}^{n}\big) \in J_t^{2n}$ in (v).

6. Invariants for Families of Connections

We now return to the situation of a family of principal $G$-bundles with connections as in Definition 1.1. That is, (i) $\pi : Y \to Z$ is a $\mathrm{Diff}^+(X)$-fibre bundle with fibre $X$, (ii) $p : E \to Y$ is a principal $G$-bundle, and (iii) $A = \{A_z \mid z \in Z\}$ is a family of connections on $P_z = E|X_z$, $z \in Z$, where $X_z = \pi^{-1}(z)$. Also $\mathcal V = \{V_j\}_{j\in J}$ and $\mathcal U = \{U_i\}_{i\in I}$ are good coverings of $Y$ respectively $Z$. Finally $Q \in I^{n+1}(G)$ is an invariant polynomial satisfying Case I or II as in Sect. 5. Our main result is the following:

Theorem 6.1. Suppose $\partial X = \emptyset$ and $\dim X = 2n+1-\ell$, $0 \le \ell \le 2n+1$. Also let $B$ be a global connection on $E$ extending the family $A$. Then the following holds:
(i) The (simplicial) $\ell$-form defined by
\[ \begin{aligned} \Lambda_{Y/Z}(Q, B) &= \int_{Y/Z}\Lambda(Q, B) && \text{in Case I,} \\ \Lambda_{Y/Z}(Q, u, B) &= \int_{Y/Z}\Lambda(Q, u, B) && \text{in Case II,} \end{aligned} \tag{6.1} \]
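gives well-defined classes in $H_D^{\ell+1}(Z)$, respectively $H_D^{\ell+1}(Z, \mathbb Z)$, functorial with respect to bundle maps
\[ \begin{array}{ccc} E' & \longrightarrow & E \\ \downarrow & & \downarrow \\ Y' & \longrightarrow & Y \\ \downarrow & & \downarrow \\ Z' & \longrightarrow & Z \end{array} \tag{6.2} \]
and the induced connections.
(ii) These classes are independent of the choice of the global extension $B$ provided that $F_{A_z}^{n+1-\ell} = 0$ for all $z \in Z$.
(iii) Curvature formula: We have
\[ \begin{aligned} \int_{Y/Z} Q(F_B^{n+1}) &= (-1)^{\ell-1}\,d\Lambda_{Y/Z}(Q, B) && \text{in Case I,} \\ \varepsilon^{*}\!\int_{Y/Z} Q(F_B^{n+1}) - \int_{Y/Z}\gamma &= (-1)^{\ell-1}\,d\Lambda_{Y/Z}(Q, u, B) && \text{in Case II,} \end{aligned} \tag{6.3} \]
where $\gamma$ represents $u(E) \in H^{2n+2}(Y, \mathbb Z)$.
(iv) In particular in Case II we have in $H^{\ell+1}(Z, \mathbb Z)$:
\[ d_*[\Lambda_{Y/Z}(Q, u, B)] = (-1)^{\ell}\,\pi_!(u(E)). \]
(v) If $F_{A_z}^{n-\ell} = 0$ for all $z \in Z$ then $\Lambda_{Y/Z}(Q, B)$, respectively $\Lambda_{Y/Z}(Q, u, B)$ are closed, respectively closed mod $\mathbb Z$, and
\[ \begin{aligned} {[\Lambda_{Y/Z}(Q, B)]} &\in H^{\ell}(Z, \mathbb R) && \text{in Case I,} \\ [\Lambda_{Y/Z}(Q, u, B)] &\in H^{\ell}(Z, \mathbb R/\mathbb Z) && \text{in Case II} \end{aligned} \tag{6.4} \]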
are well-defined invariants of the family {Az | z ∈ Z}. Proof. Again (i), (iii), and (iv) follow from the definitions and the properties of fibre integration listed in Proposition 4.1. Also (ii) follows from Lemma 5.4 applied to the family Bt = (1 − t)B0 + tB1 , t ∈ [0, 1] for B0 , B1 two choices of global connections extending the family A. The first statements in (v) are a consequence of formula (6.3) and the properties of fibre integration. Finally, the last statement in (v) follows from (ii), since the curvature assumption in (v) is stronger than the assumption in (ii).
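The sign in (6.3) can be traced back to Proposition 4.1 (i). As a short check in Case I, write $\Lambda(Q,B)$ for the form of (5.7) attached to $B$, put $m = \dim X = 2n+1-\ell$, and use $\partial X = \emptyset$ together with the curvature formula (5.8):

```latex
\begin{aligned}
\int_{Y/Z} Q(F_B^{n+1})
  &= \int_{Y/Z} d\Lambda(Q,B)
     && \text{by (5.8)} \\
  &= (-1)^{m}\, d\!\int_{Y/Z} \Lambda(Q,B)
     && \text{by Proposition 4.1 (i), } \partial X = \emptyset \\
  &= (-1)^{2n+1-\ell}\, d\Lambda_{Y/Z}(Q,B)
   = (-1)^{\ell-1}\, d\Lambda_{Y/Z}(Q,B).
\end{aligned}
```

Case II is the same computation applied to $\varepsilon^{*}Q(F_B^{n+1}) - \gamma$.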
Remark 6.2. 1. Theorems 1.2 and 1.3 are reformulations of Theorem 6.1. In fact the $\ell$–gerbe $\theta = \theta(Q, u, B)$ with connection $\omega = (\omega_0, \dots, \omega_\ell)$ in Theorem 1.3 is given by the formulas in (3.12) for $\Lambda = \Lambda_{Y/Z}(Q, u, B)$.

2. In particular we recover from Theorem 6.1 the construction of the Quillen 'determinant line bundle' and their Hermitian connections as in [14] by taking $\ell = 1$ and specializing $Y$ to a product and $\pi : X \times Z \to Z$ the projection on $Z$ (compare also Example 7.5).

3. Again in the product situation $\pi : X \times Z \to Z$ and the covering $\mathcal V = \{U'_j \times U_i\}_{(i,j)\in I\times J}$ as in Remark 4.2, we can define more generally for $\xi \in C^{\mathrm{Sing}}_{2n+1-\ell}(X)$ respectively $\xi \in \check C^{\mathrm{Sing}}_{2n+1-\ell}(N\mathcal U')$ the invariant
\[ \begin{aligned} \Lambda_\xi(Q, B) &= \int_\xi \Lambda(Q, B) && \text{in Case I,} \\ \Lambda_\xi(Q, u, B) &= \int_\xi \Lambda(Q, u, B) && \text{in Case II.} \end{aligned} \tag{6.5} \]
Then we obtain from (4.10) and (5.8):
\[ \begin{aligned} \int_\xi Q(F_B^{n+1}) &= \Lambda_{\partial\xi}(Q, B) + (-1)^{\ell-1}\,d\Lambda_\xi(Q, B) && \text{in Case I,} \\ \varepsilon^{*}\!\int_\xi Q(F_B^{n+1}) - \int_\xi \gamma &= \Lambda_{\partial\xi}(Q, u, B) + (-1)^{\ell-1}\,d\Lambda_\xi(Q, u, B) && \text{in Case II.} \end{aligned} \tag{6.6} \]
Here $\gamma$ represents $u(E)$ in $\Omega^{2n+2}_{\mathbb Z}(|N\mathcal U|)$, hence in particular $\int_\xi \gamma$ is integral. Thus, under the appropriate vanishing conditions for the fibre curvature the left-hand side of (6.6) is going to vanish (mod $\mathbb Z$ in Case II). Hence $\Lambda_\xi(Q, B)$, respectively $\Lambda_\xi(Q, u, B)$, defines a cycle in the total complex of the bicomplex
\[ \begin{aligned} &\mathrm{Hom}(C_*(X), \Omega^{*}(Z)) && \text{in Case I,} \\ &\mathrm{Hom}(\check C_*(\mathcal U'), \Omega^{*}_{\mathbb R/\mathbb Z}(\|N\mathcal U\|)) && \text{in Case II.} \end{aligned} \]
Notice that for $\ell = 0$, $\Lambda_\xi(Q, u, B)$ is essentially the 'Chern–Simons section' of the line bundle given by $\Lambda_{\partial\xi}(Q, u, B)$ as defined in [14].

We can now apply Theorem 6.1 to the general case of families of foliated bundles. By a family of foliated $G$-bundles of codimension $q$ we mean the following:

(i) $\pi : Y \to Z$ is a $\mathrm{Diff}^+(X)$-fibre bundle with fibre $X$.
(ii) $p : E \to Y$ is a principal $G$-bundle.
(iii) $\bar{\mathcal F}$, $\mathcal F$ are foliations of $E$, respectively $Y$, such that $T\mathcal F \subset T(\pi)$, respectively $T\bar{\mathcal F} \subset T(\pi \circ p)$ are involutive ($G$–equivariant) subbundles, inducing foliated structures $(\bar{\mathcal F}_z, \mathcal F_z)$ of codimension $q$ in the principal bundles $p_z : P_z \to X_z$ for $z \in Z$. (6.7)

In this situation $(\bar{\mathcal F}, \mathcal F)$ makes $p : E \to Y$ into a foliated $G$-bundle. By restriction to $T(\pi \circ p) \subset TE$, a global adapted connection $B$ induces a smooth family $A = \{A_z\}$ of adapted connections on the principal bundles $p_z : P_z \to X_z$, $z \in Z$, satisfying the curvature condition $F_{A_z}^{q+1} = 0$. Conversely, any global extension $B$ of a smooth family $A = \{A_z\}$ of adapted connections is adapted to $(\bar{\mathcal F}, \mathcal F)$. Thus by choosing a global adapted connection $B$, we conclude the following from Theorem 6.1:

Theorem 6.3. Suppose $\partial X = \emptyset$ and $\dim X = 2n+1-\ell$, $0 \le \ell \le 2n+1$. Let $B$ be an adapted connection for the family of foliated bundles of codimension $q$ as above. Then the following holds:
(i) The classes
\[ \begin{aligned} {[\Lambda_{Y/Z}(Q, B)]} &\in H_D^{\ell+1}(Z) && \text{in Case I,} \\ [\Lambda_{Y/Z}(Q, u, B)] &\in H_D^{\ell+1}(Z, \mathbb Z) && \text{in Case II} \end{aligned} \tag{6.8} \]
are well-defined and independent of the choice of adapted connection $B$ if $n - \ell \ge q$.
(ii) Suppose that $n - \ell > q$. Then $\Lambda_{Y/Z}(Q, B)$, respectively $\Lambda_{Y/Z}(Q, u, B)$ are closed, respectively closed mod $\mathbb Z$ and
\[ \begin{aligned} {[\Lambda_{Y/Z}(Q, B)]} &\in H^{\ell}(Z, \mathbb R) && \text{in Case I,} \\ [\Lambda_{Y/Z}(Q, u, B)] &\in H^{\ell}(Z, \mathbb R/\mathbb Z) && \text{in Case II} \end{aligned} \tag{6.9} \]
are well-defined invariants of the family of foliated bundles.
(iii) Suppose again that $n - \ell > q$. Then the cohomology classes $[\Lambda_{Y/Z}(Q, B)]$, respectively $[\Lambda_{Y/Z}(Q, u, B)]$ in (6.9) above, are rigid, that is they are invariant under (germs of) smooth deformations of the data in (6.7), (iii).

In either case, we call the invariants in (6.8), respectively in (6.9), the characteristic $\ell$–gerbe, respectively the characteristic flat $\ell$–gerbe of the family of foliated bundles, associated to the pair $(Q, u)$.

Proof. (i) needs some elaboration, since the family $A$ of adapted connections on $T(\pi \circ p)$ is now not fixed. We want to show that (i) follows from the variational formulas in (5.11). Let $A$, $A'$ be two families of adapted connections along the fibres and consider corresponding global extensions $B$, $B'$ of $A$, $A'$. Then the convex combination $B_t = (1-t)B + tB'$, $t \in [0,1]$ is an extension of the adapted connection $A_t = (1-t)A + tA'$, $t \in [0,1]$ on the fibres. Further $B_t$ satisfies $\frac{dB_t}{dt} = B' - B = \beta$, where the $\mathcal F$–transversal 1–form $\beta$ on $Y$ is of the form $\beta = \alpha + \gamma$, with $\alpha = \alpha^{1,0} = A' - A$ on $T(\pi)$ being fibrewise in the ideal $J_z$ of $\mathcal F_z$, that is $\alpha$ vanishes on the subbundle $T\mathcal F \subset T(\pi)$, and $\gamma = \gamma^{0,1}$ being of type $(0,1)$ on $Y$, that is $\gamma$ vanishes on the subbundle $T(\pi) \subset TY$. Thus we have, observing that $F_{B_t}^{2,0} = F_{A_t}$,
\[ \begin{aligned} \int_0^1 Q(F_{\tilde B}^{n+1}) &= (n+1)\int_0^1 dt \wedge Q(\beta \otimes F_{B_t}^{n}) \\ &= (n+1)\int_0^1 Q\Big((\alpha + \gamma) \wedge \big(F_{B_t}^{2,0} + F_{B_t}^{1,1} + F_{B_t}^{0,2}\big)^{n}\Big)\,dt \\ &= (n+1)\int_0^1 Q\Big(\alpha^{1,0} \wedge \big(F_{A_t} + F_{B_t}^{1,1} + F_{B_t}^{0,2}\big)^{n}\Big)\,dt \\ &\quad + (n+1)\int_0^1 Q\Big(\gamma^{0,1} \wedge \big(F_{A_t} + F_{B_t}^{1,1} + F_{B_t}^{0,2}\big)^{n}\Big)\,dt. \end{aligned} \tag{6.10} \]
As we will have to integrate over the fibre, only the components of type $(2n+1-\ell, \ell)$ of the $(2n+1)$–form in the integrand can contribute non–trivial terms. Therefore the relevant terms in the first summand of (6.10) must contain $\alpha \wedge F_{A_t}^{k}$ for $k \ge n - \ell \ge q$, that is $k+1 \ge (n-\ell)+1 \ge q+1$, while the relevant terms in the second summand of (6.10) must contain $F_{A_t}^{k}$ for $k \ge n - (\ell-1) = (n-\ell)+1 \ge q+1$. Since $\alpha \wedge F_{A_t}^{q} = 0$ and $F_{A_t}^{q+1} = 0$, it follows that all the relevant terms vanish in either case. Thus (6.10) vanishes under integration over the fibre and (i) follows from (5.11).

(iii) is proved by the same argument, with the following modifications. Let $A_\sigma$, $\sigma \in [0,1]$ be families of connections along the fibres, adapted to the data $(\bar{\mathcal F}_\sigma, \mathcal F_\sigma)$ and let $B_\sigma$ be corresponding global extensions of $A_\sigma$. Then we can write again $\frac{dB_\sigma}{d\sigma} = \alpha_\sigma^{1,0} + \gamma_\sigma^{0,1}$ as above, except that the horizontal forms $\alpha_\sigma^{1,0}$ on $T(\pi)$ do not necessarily satisfy the fibrewise condition of being in the ideal $(J_\sigma)_z$ of $(\mathcal F_\sigma)_z$. However, the argument in the proof of (ii) remains valid, since the relevant terms $\alpha_\sigma \wedge F_{A_\sigma}^{k}$ satisfy now $k \ge n - \ell \ge q+1$ and therefore vanish.

Of course, (ii) follows from Theorem 6.1 (v). Explicitly, we have to show that the curvature term in (6.3) vanishes under the assumption $n - \ell > q$. Writing $F_B = F_A + F_B^{1,1} + F_B^{0,2}$ as above and expanding $Q(F_B^{n+1}) = Q\big((F_A + F_B^{1,1} + F_B^{0,2})^{n+1}\big)$, the claim follows by a counting argument similar to the one above.
Remark 6.4. 1. In applying the variation formula (5.11) in Lemma 5.4 in the proof of Theorem 6.3 (i), we observe that the adapted connection $A = B \mid T(\pi \circ p)$ is not fixed during a variation $B_t$ of $B$, but we still have $F_{A_t}^{q+1} = 0$, $t \in [0,1]$.

2. The results of Theorem 6.3 apply in particular to families $p : E \to Y$ of flat bundles, that is $A = \{A_z\}$ is a family of flat connections on the $G$–principal bundles $P_z = E|X_z \to X_z$ for $z \in Z$. In this case we have $q = 0$ and $T\mathcal F = T(\pi)$ and the relevant conditions are $n \ge \ell$ in (i) and $n > \ell$ in (ii) and (iii). This case occurs in all examples in Section 7 except for the last Example 7.6. Families of flat bundles are also considered in [21].

3. The characteristic classes of foliated bundles in Theorem 5.6 can be recovered from Theorem 6.3 by taking $Z = \{pt\}$ and $\ell = 0$. In this case, we have $B = A$ and the non–integrated classes $[\Lambda(Q, A)] \in H^{2n+1}(X, \mathbb R)$, respectively $[\Lambda(Q, u, A)] \in H^{2n+1}(X, \mathbb R/\mathbb Z)$ are well-defined under the assumption $n \ge q$ as in Theorem 5.6, (iv). Moreover, the restriction $\dim X = 2n+1$ is obviously not necessary.

Recall that the family of adapted connections $A$ is basic if the Lie derivative $L_X A = i_X dA$ vanishes for all $\mathcal F$-horizontal vector fields $X$ on $E$ or equivalently if $i_X F_A = 0$.

Proposition 6.5. If we can choose $A$ basic in Theorem 6.3, then the conditions $n - \ell \ge q$ in (i), respectively $n - \ell > q$ in (ii), can be replaced by $2(n-\ell) \ge q$, respectively $2(n-\ell) > q$.

Proof. Again, we restrict attention to the first statement (i) in Theorem 6.3. Counting powers of $J_t$ instead of curvature terms, we see that the above estimates for the relevant terms in the proof of (i) give $2k+1 \ge 2(n-\ell)+1$ for the first summand of (6.10), and $2k \ge 2(n-(\ell-1))$, that is $2k \ge 2(n-\ell)+2$ for the second summand of (6.10). Thus in either case, the condition $2(n-\ell) \ge q$ implies that the relevant curvature terms are in $J^{q+1} = 0$.
Remark 6.6. 1. Observe that the global extensions $B_t$ on $E$, respectively the connection $\tilde B$ on $E \times [0,1]$ in the proof of Theorem 6.3 (i) will in general not be basic for the respective foliated structures, even if $A$, $A'$ and hence $A_t$ are.

2. One might expect the correct conditions in Proposition 6.5 to be $n - \ell \ge q'$ in (i), respectively $n - \ell > q'$ in (ii), where $q' = [\frac{q}{2}]$, that is $q = 2q'$ for $q$ even and $q = 2q'+1$ for $q$ odd. Then the basic vanishing property is $F_A^{q'+1} = 0$, since $2(q'+1) = 2q'+2 \ge q+1$. However, for $q$ odd and $n - \ell = q'$, the estimate $2k+1 \ge 2(n-\ell)+1$ gives $2k+1 \ge 2(n-\ell)+1 = 2q'+1 = q$ which is not sufficient.
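The numerical conditions of Theorem 6.3 and Proposition 6.5 are easy to tabulate. The following Python sketch is only a bookkeeping aid: the function names are hypothetical, `l` stands for the gerbe degree $\ell$ with $\dim X = 2n+1-\ell$, and the `basic` flag switches to the inequalities of Proposition 6.5.

```python
def gerbe_degree(n, dim_x):
    """Return l with dim X = 2n + 1 - l, as in Theorems 6.1 and 6.3."""
    l = 2 * n + 1 - dim_x
    if not 0 <= l <= 2 * n + 1:
        raise ValueError("need 0 <= l <= 2n + 1")
    return l


def independent_of_connection(n, l, q, basic=False):
    """Condition of Theorem 6.3 (i): n - l >= q.

    For basic connections Proposition 6.5 replaces it by 2(n - l) >= q.
    """
    return 2 * (n - l) >= q if basic else n - l >= q


def defines_flat_class(n, l, q, basic=False):
    """Condition of Theorem 6.3 (ii): n - l > q.

    For basic connections Proposition 6.5 replaces it by 2(n - l) > q.
    """
    return 2 * (n - l) > q if basic else n - l > q


# Flat families (q = 0) with n = l = 1: condition (i) holds, (ii) does not.
flat_family = (independent_of_connection(1, 1, 0), defines_flat_class(1, 1, 0))

# Remark 6.6, item 2: q odd, q' = q // 2 and n - l = q' fails the basic bound.
q = 3
n, l = 2, 1                      # n - l = 1 = q // 2
remark_6_6_case = independent_of_connection(n, l, q, basic=True)
```

With $q = 3$ and $n - \ell = q' = 1$ the basic condition $2(n-\ell) \ge q$ reads $2 \ge 3$ and fails, which is the situation described in item 2 of the remark.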
7. Examples

In this section we give a few examples of increasing complexity.

Example 7.1. We start with a simple example, found together with R. Ljungmann, which gives non–trivial classes in $H_D^{2}(Z)$. Let $X = T^2 = \mathbb R^2/\mathbb Z^2$ and $Z = \mathbb R^2$ and consider the trivial $GL(1, \mathbb R)^+ = \mathbb R^\times_+$–bundle $E$ over $Y = T^2 \times \mathbb R^2$ with coordinates $(x_1, x_2; z_1, z_2; \lambda)$. If $\omega_0$ denotes the Cartan–Maurer form, then $\omega = \omega_0 + B$, $B = z_1 dx_1 + z_2 dx_2$, defines a foliated structure on $E$ which is flat along the fibres $T^2$ of $\pi : Y \to Z$. In fact, the curvature on $Y$ is given by $F = dB = dz_1 \wedge dx_1 + dz_2 \wedge dx_2$, which is clearly of type $(1,1)$ and vanishes on every fibre $T_z^2 = \pi^{-1}(z)$, $z = (z_1, z_2)$. The flat structure of $E \mid T_z^2 \to T_z^2$ is not trivial; in fact, the holonomy depends on $z \in Z$ and is given by the homomorphism $h_z : \Gamma \to \mathbb R^\times_+$, $\Gamma = \pi_1(T_z^2) \cong \mathbb Z^2$, where
\[ h_z(\lambda_1, \lambda_2) = e^{\lambda_1 z_1 + \lambda_2 z_2}. \tag{7.1} \]
Since the Lie algebra of $\mathbb R^\times_+$ is $\mathbb R$ we can take the polynomial $Q(\xi) = \xi^2$ to obtain the 3–form
\[ \Lambda(Q, B) = B \wedge dB = (dx_1 \wedge dx_2) \wedge (z_2\,dz_1 - z_1\,dz_2), \]
with $d\Lambda(Q, B) = dB^2 = -2\,(dx_1 \wedge dx_2) \wedge (dz_1 \wedge dz_2)$. Thus on $Z$ we have the characteristic form
\[ \Lambda_{Y/Z}(Q, B) = \int_{T^2} B \wedge dB = z_2\,dz_1 - z_1\,dz_2, \tag{7.2} \]
which defines a non-zero class in $H_D^{2}(Z)$, and can be interpreted as a connection in the trivial line bundle on $Z$ with curvature $-2\,dz_1 \wedge dz_2 = -2V$, where $V$ is the volume form on $Z = \mathbb R^2$. Restricting $Y$ and (7.2) to $S^1 \subset Z = \mathbb R^2$ by setting $z_1 = \cos\theta$, $z_2 = \sin\theta$, we obtain on $T^2 \times S^1$,
\[ \Lambda(Q, B) = B \wedge dB = -dx_1 \wedge dx_2 \wedge d\theta. \]
Thus on $S^1$ we have the characteristic form
\[ \Lambda_{Y/S^1}(Q, B) = -\int_{T^2} dx_1 \wedge dx_2\; d\theta = -d\theta, \tag{7.3} \]
representing a non-zero element in $H^1(S^1, \mathbb R) \cong \mathrm{Hom}(\mathbb Z, \mathbb R) = \mathbb R$. Thus the restriction of the class in (7.2) is closed, that is the above line bundle is flat on $S^1$ with holonomy determined by (7.3).

Example 7.2. More generally let $X = X_g$ be a surface of genus $g \ge 2$ and let $\{\alpha_1, \beta_1, \dots, \alpha_g, \beta_g\}$ be a set of closed 1–forms representing a symplectic basis for the cup–product pairing in cohomology, that is
\[ \int_{X_g} \alpha_i \wedge \alpha_j = 0, \qquad \int_{X_g} \alpha_i \wedge \beta_j = \delta_{ij}, \qquad \int_{X_g} \beta_i \wedge \beta_j = 0. \]
We let $Z = \mathbb R^{2g}$ with coordinates $(z_1, \dots, z_{2g})$ and again consider the foliated $\mathbb R^\times_+$–bundle $E$ with the foliated structure given by the 1–form $\omega = \omega_0 + B$, $B = z_1\alpha_1 + z_2\beta_1 + \dots + z_{2g-1}\alpha_g + z_{2g}\beta_g$, similar to Example 7.1. The curvature $F = dB$ on $Y$ is again of type $(1,1)$ and vanishes on every fibre $T_z^2 = \pi^{-1}(z)$, $z \in Z$. The holonomy of the flat bundles $E \mid X_{g,z}$ is determined as a homomorphism $h_z : \Gamma \to \mathbb R^\times_+$, $\Gamma = H_1(X_g, \mathbb Z) \cong \mathbb Z^{2g}$, by a formula similar to (7.1), namely
\[ h_z(\gamma_1, \dots, \gamma_{2g}) = \exp\Big(\int_{\gamma_1}\alpha_1 + \int_{\gamma_2}\beta_1 + \dots + \int_{\gamma_{2g-1}}\alpha_g + \int_{\gamma_{2g}}\beta_g\Big). \tag{7.4} \]
Again we take the polynomial $Q(\xi) = \xi^2$ to obtain the characteristic 1–form on $Z$:
\[ \Lambda_{Y/Z}(Q, B) = \int_{X_g} B \wedge dB = (z_2\,dz_1 - z_1\,dz_2) + \dots + (z_{2g}\,dz_{2g-1} - z_{2g-1}\,dz_{2g}), \tag{7.5} \]
which defines a non-zero class in $H_D^{2}(Z)$, and can be interpreted as a connection in the trivial line bundle on $Z$ with curvature
\[ d\Lambda_{Y/Z}(Q, B) = \int_{X_g} dB^2 = -2\,\big(dz_1 \wedge dz_2 + \dots + dz_{2g-1} \wedge dz_{2g}\big). \tag{7.6} \]
Note that in this and the previous example we have $n = \ell = 1$ and $q = 0$.

Example 7.3. This example is like Example 7.1, but here we take $X = T^k = \mathbb R^k/\mathbb Z^k$ and $Z = \mathbb R^k$ and consider again the trivial $GL(1, \mathbb R)^+ = \mathbb R^\times_+$–bundle $E$ over $Y = T^k \times \mathbb R^k$ with coordinates $(x_1, \dots, x_k; z_1, \dots, z_k; \lambda)$, with the foliated structure given by the 1–form $\omega = \omega_0 + B$, $B = z_1 dx_1 + \dots + z_k dx_k$. This foliated structure is flat along the fibres $T^k$ of $\pi : Y \to Z$. In fact, we have for the curvature $F = dB = dz_1 \wedge dx_1 + \dots + dz_k \wedge dx_k$, which is of type $(1,1)$ and vanishes on every fibre $T_z^k = \pi^{-1}(z)$, $z = (z_1, \dots, z_k)$. As in Example 7.1, the holonomy of the flat bundle $E \mid T_z^k \to T_z^k$ depends on $z \in Z$ and is given by the homomorphism $h_z : \Gamma \to \mathbb R^\times_+$, $\Gamma = \pi_1(T_z^k) \cong \mathbb Z^k$, where
\[ h_z(\lambda_1, \dots, \lambda_k) = e^{\lambda_1 z_1 + \dots + \lambda_k z_k}, \qquad (\lambda_1, \dots, \lambda_k) \in \Gamma. \tag{7.7} \]
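That (7.7) defines a homomorphism from the lattice $\mathbb Z^k$ to the multiplicative group of positive reals, depending on the base point $z$, can be illustrated numerically. The following Python sketch is only an illustration; the function name `holonomy` and the sample values are not taken from the text.

```python
from math import exp, isclose


def holonomy(z, lam):
    """Evaluate h_z(lam) = exp(lam_1 z_1 + ... + lam_k z_k) as in (7.7)."""
    if len(z) != len(lam):
        raise ValueError("z and lam must have the same length k")
    return exp(sum(l * zj for l, zj in zip(lam, z)))


# Sample base point z in R^3 and two lattice elements in Z^3.
z = (0.3, -1.2, 0.5)
a = (1, 0, 2)
b = (-3, 4, 1)
a_plus_b = tuple(x + y for x, y in zip(a, b))

# Homomorphism property: addition in Z^k goes to multiplication in R_+.
is_multiplicative = isclose(holonomy(z, a_plus_b),
                            holonomy(z, a) * holonomy(z, b))

# At z = 0 the holonomy is trivial, so the flat structure varies with z.
trivial_at_origin = holonomy((0.0, 0.0, 0.0), a) == 1.0
```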
Now we take the polynomial $Q(\xi) = \xi^{n+1}$, $k = n + 1$, to obtain the characteristic $(2n+1)$–form
\[ \Lambda(Q, B) = B \wedge dB^n = (-1)^{\binom{k}{2}} (k-1)!\, (dx_1 \wedge \dots \wedge dx_k) \wedge \sum_{j=1}^{k} (-1)^{j-1} z_j\, dz_1 \wedge \dots \wedge \widehat{dz_j} \wedge \dots \wedge dz_k. \]
Thus on $Z = \mathbb{R}^k$ we have the characteristic form
\[ \Lambda_{Y/Z}(Q, B) = \int_{T^k} B \wedge dB^{k-1} = (-1)^{\binom{k}{2}} (k-1)! \sum_{j=1}^{k} (-1)^{j-1} z_j\, dz_1 \wedge \dots \wedge \widehat{dz_j} \wedge \dots \wedge dz_k, \tag{7.8} \]
with curvature
\[ d\Lambda_{Y/Z}(Q, B) = (-1)^{\binom{k}{2}} k!\; dz_1 \wedge \dots \wedge dz_k = (-1)^{\binom{k}{2}} k!\; V, \tag{7.9} \]
with $V$ the volume form on $Z = \mathbb{R}^k$. Hence (7.8) defines a non-zero class in $H^k_{\mathcal{D}}(Z)$. Restricting $Y$ and (7.8) to $S^{k-1} = \{(z_1, \dots, z_k) \mid \sum_{i=1}^{k} z_i^2 = 1\} \subset Z = \mathbb{R}^k$, it is easy to see that $\Lambda_{Y/Z}(Q, B)$ is a non–zero multiple of the volume form on $S^{k-1}$ and is clearly closed. Thus we have $\Lambda_{Y/S^{k-1}}(Q, B) \neq 0 \in H^{k-1}(S^{k-1})$. Note that in this example we have $k = n + 1$, $n = \ell$ and $q = 0$. We can interpret the invariants $\Lambda_{Y/Z}(Q, B)$, respectively $\Lambda_{Y/S^{k-1}}(Q, B)$, as (flat) connections on the trivial $n = (k-1)$–gerbe as in (5.2).
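The passage from (7.8) to (7.9) is a sign count: differentiating $z_j$ gives $dz_j$, which must then be moved past $j-1$ one-forms to restore the standard order. The following is a minimal sketch of that bookkeeping in Python; the function names are illustrative and not from the paper, and only the coefficient of $dz_1 \wedge \dots \wedge dz_k$ is tracked.

```python
from math import comb, factorial

def sort_sign(indices):
    """Sign of the permutation that sorts a tuple of distinct indices."""
    inversions = sum(
        1
        for a in range(len(indices))
        for b in range(a + 1, len(indices))
        if indices[a] > indices[b]
    )
    return -1 if inversions % 2 else 1

def curvature_coefficient(k):
    """Coefficient of dz_1 ^ ... ^ dz_k in d of the (k-1)-form (7.8)."""
    total = 0
    for j in range(1, k + 1):
        rest = tuple(i for i in range(1, k + 1) if i != j)
        # d(z_j dz_rest) = dz_j ^ dz_rest; reorder into dz_1 ^ ... ^ dz_k.
        total += (-1) ** (j - 1) * sort_sign((j,) + rest)
    return (-1) ** comb(k, 2) * factorial(k - 1) * total
```

Each of the $k$ summands contributes $+1$, so the prefactor $(k-1)!$ becomes $k!$, as stated in (7.9).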
So far, the examples have been for Case I. The next two examples will be for Case II.

Example 7.4. The Poincaré $(k-1)$–gerbe (cf. [4, 21]): This example is the Case II analogue of Example 7.3. Let $T$ be the $k$–dimensional real torus, that is $T = \mathbb{R}^k/\Gamma$ for the rank $k$ integral lattice $\Gamma \subset \mathbb{R}^k$. The associated dual torus is defined as
\[ \widehat{T} = H^1(T, \mathbb{R})/H^1(T, \mathbb{Z}) \overset{\exp}{\cong} \mathrm{Hom}_{\mathbb{Z}}(\Gamma, U(1)) \cong U(1)^k, \tag{7.10} \]
that is the points in $\widehat{T}$ parametrize flat unitary connections on the trivial line bundle $\underline{\mathbb{C}} = T \times \mathbb{C} \to T$. For $\xi \in \widehat{T}$, $x \in \mathbb{R}^k$, $a \in \Gamma$ and $\lambda \in \mathbb{C}$, consider the equivalence relation
\[ \mathbb{R}^k \times \widehat{T} \times \mathbb{C} \to \mathbb{R}^k \times \widehat{T} \times \mathbb{C}/\!\sim\,, \qquad (x + a, \xi, \lambda) \sim (x, \xi, \exp(2\pi\iota\, \xi(a))\lambda). \tag{7.11} \]
The quotient space under ‘$\sim$’ defines the Poincaré line bundle $\mathcal{P} \to T \times \widehat{T}$. Let $\hat{p}$ denote the projection $T \times \widehat{T} \to \widehat{T}$. From (7.11) we see that the restriction $\mathcal{P}\,|\,\hat{p}^{-1}(\xi) \cong L_\xi$, where the latter denotes the flat line bundle parametrized by $\xi \in \widehat{T}$. There exists a canonical unitary connection $B$ on the $U(1)$–principal bundle $p : E \to T \times \widehat{T}$ associated to $\mathcal{P}$, with curvature $F_B$ given by
\[ F_B = 2\pi\iota \sum_{j=1}^{k} d\xi^j \wedge dx_j, \tag{7.12} \]
where $\{x_j\}$ are (flat) coordinates on $T$ and $\{\xi^j\}$ are dual (flat) coordinates on $\widehat{T}$. $F_B$ is of type $(1,1)$ and therefore induces a family $A = \{A_\xi\}$ of flat connections on the fibres of $T \to T \times \widehat{T} \overset{\hat{p}}{\to} \widehat{T}$. Now we take $Q = C_1^k$ and $u = c_1^k$, where $c_1 \in H^2(B\,U(1), \mathbb{Z}) = H^2(\mathbb{CP}^\infty, \mathbb{Z}) \cong \mathbb{Z}[c_1]$ is the generator. Thus we obtain from Theorem 6.3 a (simplicial) characteristic $n = (k-1)$–gerbe
\[ [\Lambda_{T \times \widehat{T}/\widehat{T}}(C_1^k, c_1^k, B)] = \Bigl[\int_T \Lambda(C_1^k, c_1^k, B)\Bigr] \in H^k_{\mathcal{D}}(\widehat{T}, \mathbb{Z}). \tag{7.13} \]
We remark that $T \times \widehat{T}$ has a canonical Kähler structure for which the Poincaré bundle $\mathcal{P}$ becomes a holomorphic line bundle such that $C_1(\mathcal{P}) = \frac{1}{2\pi\iota} F_B = \omega$. Here $\omega$ is the Kähler form, so that $T \times \widehat{T}$ has a Hodge structure. It follows that the curvature of the characteristic gerbe in (7.13) is a non–zero multiple of $\int_T \omega^k = V$, the volume form on $\widehat{T}$. Note that in this and the previous example we have $n = \ell = k - 1$ and $q = 0$.
Example 7.5. The Quillen 1–gerbe [32, 33, 14]: This well-known complex line bundle with unitary connection associated to families of flat $SU(2)$–bundles appears in our setup as a characteristic 1–gerbe. We briefly recall this non–abelian example, referring to Ramadas–Singer–Weitsman [33] for details. Let $X = X_g$ be an oriented surface of genus $g$, $G = SU(2)$ and let $Z$ be the smooth part of the representation variety $\mathrm{Hom}(\pi_1(X_g), G)/G$. This is a symplectic manifold of dimension $6(g-1)$ and the symplectic form is in fact the curvature form for the characteristic 1–gerbe constructed below. The family $E \to X_g \times Z$ is the tautological family of flat $SU(2)$–bundles $P_\rho \to X_g$ determined by $\rho : \pi_1(X_g) \to SU(2)$, $\rho \in Z$. The pair $(Q, u)$ is taken to be $Q = C_2$, the second Chern polynomial, and $u = c_2 \in H^4(B\,SU(2), \mathbb{Z}) \cong \mathbb{Z}[c_2]$ is the universal Chern class. Hence choosing a global $SU(2)$–connection $B$ on $E$, extending the family $A$ of flat connections along the fibres $P_\rho \to X_g$, $\rho \in Z$, we obtain from Theorem 6.3 the (simplicial) characteristic 1–gerbe
\[ [\Lambda_{X_g \times Z/Z}(C_2, c_2, B)] = \Bigl[\int_{X_g} \Lambda(C_2, c_2, B)\Bigr] \in H^2_{\mathcal{D}}(Z, \mathbb{Z}). \tag{7.14} \]
The above examples are all cases where $q = 0$, that is we have $T\mathcal{F} = T(\pi)$ and $A = \{A_z\}$ is a family of flat connections on the fibres $P_z \to X_z$, $z \in Z$. We end with a Case I example which relates to variations of the Godbillon–Vey invariant [23] and also gives some new classes of Godbillon–Vey type.

Example 7.6. Godbillon–Vey gerbes for families of foliations: Let $\mathcal{F}$ be a family of transversally oriented foliations of codimension $q$ on $\pi : Y \to Z$ as in (6.7), that is $T\mathcal{F} \subset T(\pi)$. The relative transversal bundle $Q_{\mathcal{F}} = T(\pi)/T\mathcal{F}$ has a natural foliated structure given by the partial Bott connection. On the oriented frame bundle $E = F_{GL(q)^+}(Q_{\mathcal{F}}) \to Y$ this determines a foliated structure $\mathcal{F}$. We choose a family $A = \{A_z\}$ of torsion–free, hence adapted connections along the fibres and extend it to a global connection $B$ on $E \to Y$. For given $n \geq q$, we consider invariant polynomials of the form $C_1 Q \in \ker(I(GL(q, \mathbb{R})^+) \to I(SO(q)))$, where $Q \in I^n(GL(q, \mathbb{R})^+)$ and $I(GL(q, \mathbb{R})^+) \cong \mathbb{R}[C_1, \dots, C_q]$ is generated by the Chern polynomials $C_j$, that is the coefficients of $t^j$ in $\det(\mathrm{Id} + \frac{t}{2\pi} A)$, $A \in \mathfrak{gl}(q, \mathbb{R})$. We have $C_1 = \frac{1}{2\pi} \mathrm{Tr}$ and the kernel of the restriction to $I(SO(q))$ is generated by the odd Chern polynomials $C_{2k+1}$. Then $\Lambda(C_1 Q, B)$ is given by the $(2n+1)$–form
\[ \Lambda(C_1 Q, B) = \beta \wedge Q(F_B^n) \tag{7.15} \]
on $Y$, satisfying
\[ d\Lambda(C_1 Q, B) = d\beta \wedge Q(F_B^n) = \frac{1}{2\pi} \mathrm{Tr}(F_B) \wedge Q(F_B^n) = C_1(F_B) \wedge Q(F_B^n). \tag{7.16} \]
Here $\beta = \frac{1}{2\pi} s^* \mathrm{Tr}(B)$ is the pull–back of the trace of the connection form $B$ on $\det(E) = \bigwedge^q (Q_{\mathcal{F}})_0$ by a trivializing section $s : Y \to \bigwedge^q (Q_{\mathcal{F}})_0$ given by the transverse orientation on the normal bundle $Q_{\mathcal{F}}$. Note that $\beta$ satisfies $d\beta = C_1(F_B)$ and that the choice $Q = C_1^n$ corresponds to the Godbillon–Vey form proper, that is $\Lambda(C_1^{n+1}, B) = \beta \wedge d\beta^n$. For $\ell$ satisfying $n - \ell \geq q$, (7.15) now gives rise to characteristic $\ell$–gerbes on $Z$ according to Theorem 6.3. First of all, the above data determine a family parametrized by $z \in Z$ of secondary characteristic classes of Godbillon–Vey type on the fibres $X_z$ of $\pi : Y \to Z$, according to Theorem 5.6, namely
\[ [\Lambda(C_1 Q, A_z)] = [\alpha_z \wedge Q(F_{A_z}^n)], \tag{7.17} \]
where $\alpha = \beta\,|\,T(\pi)$, $d\alpha_z = C_1(F_{A_z})$ and $F_A^{q+1} = 0$. However, for $n > q$ and in particular for $\ell > 0$, that is $\dim X = 2n + 1 - \ell < 2n + 1$, these forms on the fibres vanish identically. Next, we consider the case $n = q$, that is $\ell = 0$ and $\dim X = 2n + 1 = 2q + 1$ according to our general convention. Then the classes (7.17) actually live on the fibres $X_z = \pi^{-1}(z)$ and we obtain from Theorem 6.3 (i) a global 0–gerbe
\[ [\Lambda_{Y/Z}(C_1 Q, B)] = \Bigl[\int_{Y/Z} \beta \wedge Q(F_B^q)\Bigr] \in H^1_{\mathcal{D}}(Z) = \Omega^0(Z), \tag{7.18} \]
given fibrewise by
\[ [\Lambda_{Y/Z}(C_1 Q, B)](z) = \int_{X_z} \alpha_z \wedge Q(F_{A_z}^q). \tag{7.19} \]
Thus the family of invariants in (7.19) are the integrated fibrewise Godbillon–Vey invariants, which are well–known to be variable and hence non–constant in $\Omega^0(Z)$ for a suitable choice of the family of foliations (compare Heitsch [23, 24] and also the original work of Thurston [34]). A similar result is obtained for $n = q + \ell$, $\ell > 0$ and $\dim X = 2n + 1 - \ell = 2q + \ell + 1$, in which case Theorem 6.3 (i) gives rise to (variable) characteristic $\ell$–gerbes
\[ [\Lambda_{Y/Z}(C_1 Q, B)] = \Bigl[\int_{Y/Z} \beta \wedge Q(F_B^n)\Bigr] \in H^{\ell+1}_{\mathcal{D}}(Z) = \Omega^\ell(Z)/d\,\Omega^{\ell-1}(Z), \tag{7.20} \]
determined by formula (7.15); compare also (5.3). A more original class of gerbes is obtained in the ‘rigid’ range $n - \ell > q$, that is $\ell = 0, \dots, n - (q+1)$, in which case we still have $2q + 1 < \dim X = 2n + 1 - \ell$. Then we can invoke Theorem 6.3 (ii) to obtain well-defined flat characteristic Godbillon–Vey $\ell$–gerbes
\[ [\Lambda_{Y/Z}(C_1 Q, B)] = \Bigl[\int_{Y/Z} \beta \wedge Q(F_B^n)\Bigr] \in H^\ell(Z, \mathbb{R}). \tag{7.21} \]
Note that for $n - \ell \geq q$, $\ell > 0$, we have $2n + 1 > \dim X = 2n + 1 - \ell > 2q + 1$. Hence, as already noted, the fibrewise classes vanish identically on the form level, while the forms $\Lambda(C_1 Q, B)$ are not necessarily closed on $Y$ unless $n \geq q + \dim Z$. In contrast, the classes investigated in Kotschick [31], Hoster–Kamber–Kotschick [28], are families of classes on a fixed manifold $X$, defined with respect to a 1–parameter family $\mathcal{F}_t$ of foliations and foliated bundles and their suspension on the cylinder $X \times I$. Hoster in his thesis [27] considers fibre spaces with flags of foliations along the fibres, but stays essentially in the context of [28].

Acknowledgements. The results of the paper go back a few years but the presentation follows a talk given by the first author in November 2002 during the program ‘Aspects of Foliation Theory’ at the Erwin Schrödinger Institute in Vienna. Both authors gratefully acknowledge the hospitality and support of the Erwin Schrödinger Institute. The second author visited Århus on several occasions during the preparation of this work and would like to thank the Department of Mathematics at Aarhus University for its hospitality and support. Finally we want to thank the referee for some very useful comments, in particular on the terminology of ‘gerbes’ and ‘Deligne cohomology’.
References
1. Bismut, J.M., Freed, D.: The analysis of elliptic families I: Metrics and connections on determinant bundles. Commun. Math. Phys. 106, 159–176 (1986)
2. Bonora, L., Cotta-Ramusino, P., Rinaldi, M., Stasheff, J.: The evaluation map in field theory, sigma-models and strings – II. Commun. Math. Phys. 114, 381–438 (1988)
3. Breen, L., Messing, W.: Differential geometry of gerbes. http://arxiv.org/abs/math.AG/0106083, 2001
4. Bruzzo, U., Marelli, G., Pioli, F.: A Fourier transform for sheaves on real tori I. The equivalence $\mathrm{Sky}(T) \simeq \mathrm{Loc}(\widehat{T})$. J. Geom. Phys. 39, 174–182 (2001)
5. Brylinski, J.-L.: Loop Spaces, Characteristic Classes and Geometric Quantization. Progr. Math. 107, Boston–Basel: Birkhäuser, 1993
6. Brylinski, J.-L.: Geometric construction of Quillen line bundles. In: J.-L. Brylinski (ed.), Advances in Geometry, Progr. Math. 172, Boston–Basel: Birkhäuser, 1999
7. Brylinski, J.-L.: Gerbes on complex reductive Lie groups. http://arxiv.org/abs/math.DG/0002158, 2000
8. Carey, A.L., Mickelsson, J.: The universal gerbe, Dixmier-Douady class and gauge theory. Lett. Math. Phys. 59, 47–60 (2002)
9. Cheeger, J., Simons, J.: Differential characters and geometric invariants. In: J. Alexander, J. Harer (eds.), Geometry and Topology, Proc. Spec. Year, College Park/Md. 1983/84, Lecture Notes in Math. 1167, Berlin–Heidelberg–New York: Springer-Verlag, 1985, pp. 50–80
10. Chern, S.-S., Simons, J.: Characteristic forms and geometric invariants. Ann. Math. 99, 48–69 (1974)
11. Dupont, J.L.: Simplicial de Rham cohomology and characteristic classes of flat bundles. Topology 15, 233–245 (1976)
12. Dupont, J.L.: Curvature and Characteristic Classes. Lecture Notes Math. 640, Berlin–Heidelberg–New York: Springer-Verlag, 1978
13. Dupont, J.L., Hain, R., Zucker, S.: Regulators and characteristic classes of flat bundles. In: B.B. Gordon, J.D. Lewis, S. Müller-Stach, S. Saito, N. Yui (eds.), The Arithmetic and Geometry of Algebraic Cycles, CRM Proceedings and Lecture Notes 24, Providence, RI: Am. Math. Soc. 2000
14. Dupont, J.L., Johansen, F.L.: Remarks on determinant line bundles, Chern–Simons forms and invariants. Math. Scand. 91, 5–26 (2001)
15. Dupont, J.L., Just, H.: Simplicial currents. Illinois J. Math. 41, 354–377 (1997)
16. Dupont, J.L., Kamber, F.W.: On a generalization of Cheeger-Chern-Simons classes. Illinois J. Math. 34, 221–255 (1990)
17. Dupont, J.L., Ljungmann, R.J.: Integration of simplicial forms and Deligne cohomology. http://arxiv.org/abs/math.DG/0402059, 2004
18. Freed, D.: On determinant line bundles. In: S.T. Yau (ed.), Mathematical Aspects of String Theory, Singapore: World Scientific Publishing, 1987, pp. 189–238
19. Freed, D.: Determinant line bundles revisited. In: J.E. Andersen et al. (eds.), Proceedings of the conference Geometry and Physics, Århus, Denmark, July 18–27, 1995, Lecture Notes in Pure and Appl. Math. 184, New York: Marcel Dekker, Inc. 1997, pp. 187–196
20. Freed, D.: Classical Chern-Simons theory, part 2. Houston J. Math. 28, 293–310 (2002)
21. Glazebrook, J.F., Jardim, M., Kamber, F.W.: A Fourier–Mukai transform for real torus bundles. J. Geom. Phys. 50, 360–392 (2004)
22. Gomi, K., Terashima, Y.: A fibre integration formula for the smooth Deligne cohomology. Internat. Math. Res. Notices 13, 699–708 (2000)
23. Heitsch, J.L.: Derivatives of secondary characteristic classes. J. Differ. Geom. 13, 311–339 (1978)
24. Heitsch, J.L.: Independent variation of secondary classes. Ann. Math. 108, 421–460 (1978)
25. Hitchin, N.: Lectures on special Lagrangian submanifolds. In: Winter School on Mirror Symmetry, Vector Bundles and Lagrangian Submanifolds (Cambridge, MA, 1999), AMS/IP Stud. Adv. Math. 23, Providence, RI: Am. Math. Soc. 2001, pp. 151–182
26. Hopkins, M.J., Singer, I.M.: Quadratic functions in geometry, topology, and M-theory. http://arxiv.org/abs/math.AT/0211216, 2002
27. Hoster, M.: Derived secondary classes for flags of foliations. PhD thesis, Ludwig Maximilians Universität München, 2001
28. Hoster, M., Kamber, F., Kotschick, D.: Characteristic classes for families of foliated bundles. In preparation
29. Hurewicz, W., Wallman, H.: Dimension Theory. Princeton, NJ: Princeton University Press, 1948
30. Kamber, F.W., Tondeur, Ph.: Foliated Bundles and Characteristic Classes. Lecture Notes Math. 493, Berlin–Heidelberg–New York: Springer-Verlag, 1975
31. Kotschick, D.: Godbillon-Vey invariants for families of foliations. In: Symplectic and contact topology: Interactions and Perspectives (Toronto/Montreal 2001), Fields Inst. Commun. 35, Providence, RI: Am. Math. Soc. 2003, pp. 131–144
32. Quillen, D.: Determinants of Cauchy-Riemann operators over a Riemann surface. Funct. Anal. Appl. 19(1), 31–34 (1985)
33. Ramadas, T.R., Singer, I.M., Weitsman, J.: Some comments on Chern–Simons gauge theory. Commun. Math. Phys. 126, 409–420 (1989)
34. Thurston, W.: Noncobordant foliations of $S^3$. Bull. Am. Math. Soc. 78, 511–514 (1972)

Communicated by L. Takhtajan
Commun. Math. Phys. 253, 283–322 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1133-4
Communications in
Mathematical Physics
Vertex Algebras in Higher Dimensions and Globally Conformal Invariant Quantum Field Theory Nikolay M. Nikolov Institute for Nuclear Research and Nuclear Energy Tsarigradsko Chaussee 72, 1784 Sofia, Bulgaria. E-mail:
[email protected] Received: 14 November 2003 / Accepted: 27 January 2004 Published online: 5 August 2004 – © Springer-Verlag 2004
Abstract: We propose an extension of the definition of vertex algebras in arbitrary space–time dimensions together with their basic structure theory. A one–to–one correspondence between these vertex algebras and axiomatic quantum field theory (QFT) with global conformal invariance (GCI) is constructed.

Contents
1. Introduction and Preliminaries
2. Vertex Algebra Definition and Operator Product Expansion
3. Consequences of the Existence of a Vacuum and of Translation Invariance
4. Existence Theorem. Analytic Continuations
5. Free Field Examples. Lie Superalgebras of Formal Distributions
6. Categorical Properties of Vertex Algebras. Representations
7. Conformal Vertex Algebras
8. Hermitian Structure in Conformal Vertex Algebras
9. Connection with Globally Conformal Invariant QFT
10. Outlook
Appendix A. Affine System of Charts on Complex Compactified Minkowski Space
Appendix B. Proof of Theorem 9.1
1. Introduction and Preliminaries The axiomatic QFT was proposed and accepted by the physics community about 50 years ago as a collection of mathematically precise structures and their properties which any QFT should possess. Despite the fact that no four dimensional nontrivial model of the axiomatic QFT has been found so far, the long time efforts in these directions have
led to several general results such as the Bargmann–Hall–Wightman (BHW) theorem about analytic properties of correlation (i.e. Wightman) functions, the TCP and the spin and statistics theorems. A basic structure in the axiomatic approach is the Poincaré symmetry. Right from the beginning the question of extending the space–time symmetry to the conformal one has been posed. It was shown in the article [8] that the condition of GCI, i.e. group conformal invariance, in the frame of the axiomatic QFT leads to the rationality of all correlation functions in any number $D$ of space–time dimensions. This result can be viewed as an extension of the above mentioned BHW theorem. Since the Wightman functions carry the full information of the theory, this result shows that the QFT with GCI is essentially algebraic. This gives new insight into the problem of constructing nonfree QFT models in higher dimensions.

In 2 dimensional conformal QFT the theory of vertex algebras is based on simple axiomatic conditions with a straightforward physical interpretation [6]. One of them is the axiom of locality, stating that the commutators or anticommutators of the fields vanish when multiplied by a sufficiently large power of the coordinate difference. This axiom has a natural extension to higher dimensions by replacing the coordinate difference with the space–time interval, and it is a consequence of GCI in the axiomatic QFT: this is a form of the Huygens principle in QFT, called strong locality in [8] (see Remark 3.1). On the other hand, the rationality of correlation functions in a QFT with GCI allows one to define a precise state–field correspondence and an expansion of fields as formal power series in their coordinates $z = (z^1, \dots, z^D)$ and the inverse square interval $1/z^2$ ($z^2 = z \cdot z := (z^1)^2 + \dots + (z^D)^2$). This provides the second axiomatic structure for the vertex algebras. The coordinates “$z$” define a chart in the complex compactified Minkowski space containing the entire real compact space and they are useful for connecting the vertex algebra approach with the axiomatic QFT with GCI (see Sect. 9; they are introduced for $D = 4$ in [12] and for general $D$ in [9], Sect. 2.2). The existence of the latter connection motivates our approach from the physical point of view: giving examples of such vertex algebras one would actually obtain models of the Wightman axioms. Physically, one could regard the vertex algebras as providing a realization of the observable field algebra in higher dimensional conformal QFT. The proposed construction of vertex algebras allows one to give a precise definition of the notion of their representation which would realize the charged sectors in accord with Haag’s program in the algebraic QFT [4].

There is a more general definition of vertex algebras in higher dimensions proposed by Borcherds [2] which allows an arbitrary type of singularities occurring in the correlation functions. From the point of view of GCI the only type of singularities arising is the light cone type [8].

The paper is organized as follows: In Sect. 2 we give the basic definitions and prove the existence of operator product expansions (Theorem 2.1). The vertex algebra fields are denoted by $Y(a, z)$ as in the chiral two–dimensional conformal QFT (chiral CFT), depending on the state $a$ and being formal power series in $z$ including negative powers of $z^2$. A convenient basis for such series is provided by the harmonic decomposition of the polynomials in $z$, which we will briefly recall below. The operator product expansion of two fields $Y(a, z)$ and $Y(b, z)$ is described in terms of an infinite series of “products” $Y(a, z)_{\{n,m,\sigma\}} Y(b, z)$ labeled by integers which generalize the analogous products $Y(a, t)_{(n)} Y(b, t)$ in the chiral CFT. The $\{0, 0, 1\}$–product in our notations is the natural candidate for the normal product in higher–dimensional vertex algebras.

In Sect. 3 we obtain an analogue (Theorem 3.1) of a corollary of the Reeh–Schlieder theorem, the separating property of the vacuum [5]. It is also shown that the state–field
correspondence exhausts the class of translation invariant local fields (i.e. the Borchers class, Proposition 3.2). We also obtain generalizations of some basic formulas for the vertex algebras from the chiral CFT. In Sect. 4 we prove a higher dimensional analogue (Theorem 4.1) of the Kac existence theorem ([6], Theorem 4.5) which provides examples of vertex algebras (at least the free ones). In this section we also find a higher dimensional analogue of the associativity identity “$Y(a, z)\, Y(b, w) = Y(Y(a, z - w)\, b, w)$” (Theorem 4.3). In Sect. 5 we present the free field examples of higher dimensional vertex algebras and also a more general construction based on Lie superalgebras of formal distributions. In Sect. 6 we introduce some constructions with vertex algebras including the basic categorical notions, tensor product and representations of vertex algebras. Sections 7 and 8 are devoted to the incorporation of the conformal symmetry in higher dimensions and the Hermitian structure (needed for the passage to the GCI QFT) within the vertex algebras. In Sect. 9 we give a one–to–one correspondence between vertex algebras with additional conformal and Hermitian structure, and the GCI QFT. Thus the free GCI QFT models provide examples for the vertex algebras with additional structure introduced in the previous sections.

Preliminaries. The $z$– and $w$–variables such as $z$, $z_1$, $z_2$, $w$ etc. will always denote $D$ component variables:
\[ z = (z^1, \dots, z^D), \qquad z_k = (z_k^1, \dots, z_k^D), \qquad w = (w^1, \dots, w^D). \tag{1.1} \]
We fix the standard scalar product:
\[ z_1 \cdot z_2 = \sum_{\mu=1}^{D} z_1^\mu z_2^\mu, \qquad z^2 \equiv z \cdot z. \tag{1.2} \]
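$\mathbb{N} \equiv \{1, 2, \dots\}$, $\mathbb{Z} \equiv \{0, \pm 1, \dots\}$ and $I$ is used for the identity operator or element. For a complex vector space $V$, $V[z]$ stands for the space of polynomials with coefficients in $V$ (i.e., $V[z] \equiv V \otimes \mathbb{C}[z]$). Similarly, $V[[z]]$ is the space of formal power (Taylor) series in $z$ with coefficients in $V$. We introduce the formal derivatives on $V[z]$ and $V[[z]]$:
\[ \partial_z \equiv (\partial_{z^1}, \dots, \partial_{z^D}) \equiv \Bigl(\frac{\partial}{\partial z^1}, \dots, \frac{\partial}{\partial z^D}\Bigr), \tag{1.3} \]
as well as the Euler and Laplace operators:
\[ z \cdot \partial_z \equiv \sum_{\mu=1}^{D} z^\mu \partial_{z^\mu}, \qquad \partial_z^2 \equiv \partial_z \cdot \partial_z \equiv \sum_{\mu=1}^{D} (\partial_{z^\mu})^2. \tag{1.4} \]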
Then $\partial_z$ obeys the Leibniz rule and the homogeneous polynomials of degree $n$ are characterized by the Euler equation $(z \cdot \partial_z\, p)(z) = n\, p(z)$. A harmonic polynomial $p(z) \in V[z]$ is such that the Laplace equation $\partial_z^2\, p(z) = 0$ is satisfied. The basic fact about the existence of harmonic decomposition can be stated as follows:

Lemma 1.1. If $p(z) \in V[z]$ is a homogeneous polynomial of degree $n$ ($\deg p = n$) then there exists a unique decomposition
\[ p(z) = \sum_{k=0}^{[[n/2]]} (z^2)^k\, h_k(z), \qquad \partial_z^2\, h_k(z) = 0, \qquad z \cdot \partial_z\, h_k(z) = (n - 2k)\, h_k(z), \tag{1.5} \]
where $[[a]]$ stands for the integer part of the real number $a$.

The proof is based on induction in $n = \deg p$: if $\partial_z^2\, p(z)$ has by the inductive assumption a unique decomposition $\partial_z^2\, p(z) = \sum_{k=0}^{[[n/2]]-1} (z^2)^k\, h'_k(z)$, $\deg h'_k = n - 2 - 2k$, then the difference
\[ h_0(z) := p(z) - \sum_{k=0}^{[[n/2]]-1} \Bigl[ 4(k+1) \Bigl( n - k + \frac{D-4}{2} \Bigr) \Bigr]^{-1} (z^2)^{k+1}\, h'_k(z) \]
is verified to be a harmonic homogeneous polynomial by a straightforward computation. Thus we obtain that $h_k(z) = \bigl[ 4k \bigl( n - k + 1 + \frac{D-4}{2} \bigr) \bigr]^{-1} h'_{k-1}(z)$ for $k > 0$.

In such a way if we denote by $V_m[z]$ the subspace of homogeneous polynomials of $V[z]$ of degree $m$ and by $V_m^{\mathrm{harm}}[z]$ the subspace of $V_m[z]$ of the harmonic polynomials ($V_m[z] \equiv V_m^{\mathrm{harm}}[z] \equiv \{0\}$ for $m < 0$) then we have the decomposition
\[ V_m[z] = V_m^{\mathrm{harm}}[z] \oplus z^2\, V_{m-2}[z], \tag{1.6} \]
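\[ h_m^D \equiv h_m := \dim \mathbb{C}_m^{\mathrm{harm}}[z] = \dim \mathbb{C}_m[z] - \dim \mathbb{C}_{m-2}[z] = \binom{m+D-1}{D-1} - \binom{m+D-3}{D-1} \tag{1.7} \]
(recall that $(1 - q)^{-D} = \sum_{m=0}^{\infty} (\dim \mathbb{C}_m[z])\, q^m$). The space $\mathbb{C}_m^{\mathrm{harm}}[z]$ carries an irreducible representation of the complex orthogonal group $SO(D; \mathbb{C})$ for every $m = 0, 1, \dots$. Note that $h_0^1 = h_1^1 = h_0^2 = 1$ and $h_{m+1}^1 = 0$, $h_m^2 = 2$ for $m \geq 1$; $h_m^3 = 2m + 1$ and for $D \geq 4$:
\[ h^D_{m - \frac{D}{2} + 1} = \frac{2m}{(D-2)!} \Bigl( m + \frac{D}{2} - 2 \Bigr) \cdots \Bigl( m - \frac{D}{2} + 2 \Bigr), \tag{1.8} \]
so that $h^D_{m - \frac{D}{2} + 1}$ is a polynomial in $m$ of degree $D - 2$ for $D \geq 2$, which is even for $D$ even, and odd for $D$ odd. For $D = 4$: $h_m = (m + 1)^2$. Let us fix for every $m = 0, 1, \dots$ a basis in $\mathbb{C}_m^{\mathrm{harm}}[z]$:
\[ \bigl\{ h_\sigma^{(m)}(z) : \sigma = 1, \dots, h_m \bigr\}, \qquad h_1^{(0)}(z) \equiv 1, \qquad h_\sigma^{(m)}(z) \equiv 0 \ \text{iff}\ m < 0. \tag{1.9} \]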
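As a quick numerical cross-check of the dimension count, the sketch below evaluates (1.7) directly and compares it with the closed form (1.8) and with the special values quoted above. The function names are illustrative, not notation from the paper; exact rational arithmetic is used because the shifted variable $m$ in (1.8) is a half-integer when $D$ is odd.

```python
from fractions import Fraction
from math import comb, factorial

def dim_homogeneous(D, m):
    """Dimension of the degree-m homogeneous polynomials in D variables."""
    return comb(m + D - 1, D - 1) if m >= 0 else 0

def harmonic_dim(D, m):
    """h_m^D from (1.7): homogeneous of degree m, minus those of degree m-2."""
    return dim_homogeneous(D, m) - dim_homogeneous(D, m - 2)

def harmonic_dim_closed(D, j):
    """h_j^D from (1.8), valid for D >= 4, with j = m - D/2 + 1."""
    m = Fraction(2 * j + D - 2, 2)       # shifted variable of (1.8)
    lowest = m - Fraction(D, 2) + 2      # smallest factor in the product
    prod = Fraction(1)
    for i in range(D - 3):               # D-3 consecutive factors
        prod *= lowest + i
    return 2 * m * prod / factorial(D - 2)
```

For instance, `harmonic_dim(4, m)` reproduces $(m+1)^2$ and `harmonic_dim(3, m)` reproduces $2m+1$.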
Then for every $a(z) \in V[[z]]$ we have a unique representation:
\[ a(z) = \sum_{n=0}^{\infty} \sum_{m=0}^{\infty} \sum_{\sigma=1}^{h_m} a_{\{n,m,\sigma\}}\, (z^2)^n\, h_\sigma^{(m)}(z), \qquad a_{\{n,m,\sigma\}} \in V. \tag{1.10} \]
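Degree by degree, the representation (1.10) is the harmonic decomposition of Lemma 1.1, and the inductive proof of the lemma is effectively an algorithm. The following is a minimal sketch of it, assuming scalar coefficients ($V = \mathbb{C}$ restricted to rationals) and a polynomial stored as a dictionary from exponent tuples to `Fraction` coefficients; this representation and all function names are choices made for the illustration, not part of the paper.

```python
from fractions import Fraction

def add(p, q, scale=Fraction(1)):
    """Return p + scale*q, dropping zero coefficients."""
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, Fraction(0)) + scale * c
        if out[e] == 0:
            del out[e]
    return out

def laplacian(p):
    """Apply the Laplace operator of (1.4)."""
    out = {}
    for e, c in p.items():
        for mu, k in enumerate(e):
            if k >= 2:
                f = e[:mu] + (k - 2,) + e[mu + 1:]
                out[f] = out.get(f, Fraction(0)) + c * k * (k - 1)
    return {e: c for e, c in out.items() if c != 0}

def times_z2(p):
    """Multiply by z^2 = sum over mu of (z^mu)^2."""
    out = {}
    for e, c in p.items():
        for mu, k in enumerate(e):
            f = e[:mu] + (k + 2,) + e[mu + 1:]
            out[f] = out.get(f, Fraction(0)) + c
    return {e: c for e, c in out.items() if c != 0}

def harmonic_decomposition(p, n, D):
    """Return [h_0, ..., h_{n//2}] with p = sum_k (z^2)^k h_k as in (1.5).

    p must be homogeneous of degree n in D variables.
    """
    if n < 2:
        return [dict(p)]                     # degree 0 or 1 is already harmonic
    inner = harmonic_decomposition(laplacian(p), n - 2, D)   # the h'_k
    hs = []
    for k, h_prime in enumerate(inner, start=1):
        # h_k = [4k(n - k + 1 + (D-4)/2)]^{-1} h'_{k-1}
        coeff = Fraction(1, 4 * k * (n - k + 1) + 2 * k * (D - 4))
        hs.append({e: c * coeff for e, c in h_prime.items()})
    h0 = dict(p)
    for k, h in enumerate(hs, start=1):
        term = h
        for _ in range(k):
            term = times_z2(term)
        h0 = add(h0, term, Fraction(-1))     # h_0 = p - sum_{k>=1} (z^2)^k h_k
    return [h0] + hs
```

For $D = 3$ and $p = (z^1)^2$ this yields $h_1 = \tfrac{1}{3}$ and $h_0 = (z^1)^2 - \tfrac{1}{3} z^2$.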
For every $n, m = 0, 1, \dots$ and $\sigma = 1, \dots, h_m$ there exists a unique homogeneous polynomial $P_{\{n,m,\sigma\}}(z)$ of degree $2n + m$ such that:
\[ P_{\{n,m,\sigma\}}(\partial_z)\, a(z) \big|_{z=0} = a_{\{n,m,\sigma\}} \tag{1.11} \]
for any $a(z)$ as in (1.10). In the special case of $m = 0$ ($h_m = 1$):
\[ P_{\{n,0,1\}}(z) = K_n\, (z^2)^n, \tag{1.12} \]
where $K_n := \dfrac{(D-2)!!}{2^n\, n!\, (2n + D - 2)!!}$ and $k!! := k (k-2) \cdots \bigl( k - 2 [[\tfrac{k-1}{2}]] \bigr)$. In general, $P_{\{n,m,\sigma\}}(z)$ could be proven to be proportional to $(z^2)^n\, h_\sigma^{(m)}(z)$ under the additional assumption of orthogonality of $h_\sigma^{(m)}$ but we will not need its explicit form.

Denote by $V[\![z, 1/z^2]\!]$ the vector space of all formal series:
\[ a(z) = \sum_{n \in \mathbb{Z}} \sum_{m=0}^{\infty} \sum_{\sigma=1}^{h_m} a_{\{n,m,\sigma\}}\, (z^2)^n\, h_\sigma^{(m)}(z), \qquad a_{\{n,m,\sigma\}} \in V. \tag{1.13} \]
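The constant $K_n$ in (1.12) is fixed by the requirement $K_n\, (\partial_z^2)^n (z^2)^n = 1$, which follows from (1.11) applied to $a(z) = (z^2)^n$. A short sketch, assuming only the standard identity $\partial_z^2 (z^2)^j = 2j(2j + D - 2)(z^2)^{j-1}$, compares the resulting product with the double-factorial expression; the function names are illustrative, and the convention $k!! = 1$ for $k \leq 0$ is an assumption made here so that small $D$ is covered.

```python
from fractions import Fraction
from math import factorial

def double_factorial(k):
    """k!! = k (k-2) (k-4) ... down to 2 or 1; taken to be 1 for k <= 0."""
    out = 1
    while k > 1:
        out *= k
        k -= 2
    return out

def K_closed(n, D):
    """K_n as written below (1.12)."""
    return Fraction(
        double_factorial(D - 2),
        2 ** n * factorial(n) * double_factorial(2 * n + D - 2),
    )

def K_from_laplacian(n, D):
    """1 / [(Laplacian)^n applied to (z^2)^n], one Laplacian at a time."""
    value = Fraction(1)
    for j in range(1, n + 1):
        value /= 2 * j * (2 * j + D - 2)
    return value
```

The check covers only the normalisation; that $P_{\{n,0,1\}}(\partial_z)$ also annihilates the components with $m > 0$ is not tested here.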
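The subspace of $V[\![z, 1/z^2]\!]$ of finite series (1.13) will be denoted by $V[z, 1/z^2]$; the subspace of formal series (1.13) whose sum in $n$ is bounded from below is denoted by $V[[z]]_{z^2}$, i.e. the localization of $V[[z]]$ with respect to the multiplicative system $\{(z^2)^n\}_{n \in \mathbb{N}}$ [1]. Thus $a(z) \in V[[z]]_{z^2}$ iff $(z^2)^N a(z) \in V[[z]]$ for sufficiently large $N$, which we will briefly write as $N \gg 0$. Note also that $V[z, 1/z^2] \equiv V[z]_{z^2}$. The spaces $V[[z]]$ and $V[[z]]_{z^2}$ are $\mathbb{C}[[z]]$ and $\mathbb{C}[[z]]_{z^2}$ modules, respectively, with derivations $\{\partial_{z^\mu}\}_{\mu = 1, \dots, D}$. For $V[\![z, 1/z^2]\!]$ we have a structure of a $\mathbb{C}[z, 1/z^2]$ module with derivations $\partial_{z^\mu}$. To obtain the explicit form of the actions of $\mathbb{C}[z, 1/z^2]$ and $\partial_{z^\mu}$ on $V[\![z, 1/z^2]\!]$ let us note first that for a homogeneous harmonic polynomial $h_m(z)$ of degree $m$ ($h_m \in V_m^{\mathrm{harm}}[z]$), the polynomials
\[ \partial_{z^\mu} h_m(z) \qquad \text{and} \qquad z^\mu h_m(z) - \frac{1}{2 \bigl( m + \frac{D}{2} - 1 \bigr)}\, z^2\, \partial_{z^\mu} h_m(z) \]
are harmonic and homogeneous of degrees $m - 1$ and $m + 1$, respectively. Therefore, there exist constants $A^{(m)}_{\mu \sigma_1 \sigma_2}$ and $B^{(m)}_{\mu \sigma_1 \sigma_2}$ such that
\[ \partial_{z^\mu} h^{(m)}_{\sigma_1}(z) = \sum_{\sigma_2 = 1}^{h_{m-1}} A^{(m)}_{\mu \sigma_1 \sigma_2}\, h^{(m-1)}_{\sigma_2}(z), \qquad z^\mu h^{(m)}_{\sigma_1}(z) = \sum_{\sigma_2 = 1}^{h_{m+1}} B^{(m)}_{\mu \sigma_1 \sigma_2}\, h^{(m+1)}_{\sigma_2}(z) + \frac{1}{2 \bigl( m + \frac{D}{2} - 1 \bigr)}\, z^2 \sum_{\sigma_2 = 1}^{h_{m-1}} A^{(m)}_{\mu \sigma_1 \sigma_2}\, h^{(m-1)}_{\sigma_2}(z) \tag{1.14} \]
(setting $A^{(m)}_{\mu \sigma_1 \sigma_2} = B^{(m)}_{\mu \sigma_3 \sigma_4} = 0$ for $m < 0$). Using these equations one can obtain the explicit form of the actions of $z^\mu$ and of the derivations $\partial_{z^\mu}$ on a general series $a(z) \in V[\![z, 1/z^2]\!]$. The coefficients $A^{(m)}_{\mu \sigma_1 \sigma_2}$ and $B^{(m)}_{\mu \sigma_1 \sigma_2}$ define intertwining operators $A^{(m)} : \mathbb{C}_m^{\mathrm{harm}}[z] \to \mathbb{C}_1^{\mathrm{harm}}[z] \otimes \mathbb{C}_{m-1}^{\mathrm{harm}}[z]$ and $B^{(m)} : \mathbb{C}_1^{\mathrm{harm}}[z] \otimes \mathbb{C}_m^{\mathrm{harm}}[z] \to \mathbb{C}_{m+1}^{\mathrm{harm}}[z]$ as $SO(D; \mathbb{C})$ representations.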
In the same way one can define the spaces $V[z_1, 1/z_1^2; \dots; z_n, 1/z_n^2]$, $V[\![z_1, 1/z_1^2; \dots; z_n, 1/z_n^2]\!]$ and $V[[z_1, \dots, z_n]]_{z_1^2 \cdots z_n^2}$ (the latter symbol stands for the localization of $V[[z_1, \dots, z_n]]$ with respect to the multiplicative system $\{(z_1^2 \cdots z_n^2)^N\}_{N \in \mathbb{N}}$). Note that
\[ V[z_1, 1/z_1^2; \dots; z_n, 1/z_n^2] = V[z_1, 1/z_1^2; \dots; z_{n-1}, 1/z_{n-1}^2]\,[z_n, 1/z_n^2], \tag{1.15} \]
\[ V[\![z_1, 1/z_1^2; \dots; z_n, 1/z_n^2]\!] = V[\![z_1, 1/z_1^2; \dots; z_{n-1}, 1/z_{n-1}^2]\!]\,[\![z_n, 1/z_n^2]\!], \tag{1.16} \]
which allows us to define on $V[\![z_1, 1/z_1^2; \dots; z_n, 1/z_n^2]\!]$ a structure of a $\mathbb{C}[z_1, 1/z_1^2; \dots; z_n, 1/z_n^2]$–module with derivations $\partial_{z_k^\mu}$ ($k = 1, \dots, n$, $\mu = 1, \dots, D$). For the spaces $V[[z_1, \dots, z_n]]$ and $V[[z_1, \dots, z_n]]_{z_1^2 \cdots z_n^2}$ as usual we have structures of $\mathbb{C}[[z_1, \dots, z_n]]$ and $\mathbb{C}[[z_1, \dots, z_n]]_{z_1^2 \cdots z_n^2}$ modules, respectively.

It is important that the $\mathbb{C}[[z]]_{z^2}$–module $V[[z]]_{z^2}$ has no “zero divisors”, i.e. if $f(z)\, a(z) = 0$ for $f(z) \in \mathbb{C}[[z]]_{z^2}$ and $a(z) \in V[[z]]_{z^2}$ then $f(z) = 0$ or $a(z) = 0$. This is not the case for the $\mathbb{C}[z]_{z^2}$–module $V[\![z, 1/z^2]\!]$, as is seen from the following example.

Example 1.1. Let $c = (c^1, \dots, c^D) \in \mathbb{C}^D$ be a complex vector such that $c^2 \equiv c \cdot c = 1$. The polynomial $(z - c)^2$ is invertible in $\mathbb{C}[[z]]$; let $t(z)$ be its inverse. Then $(z - c)^2 \bigl( t(z) - \frac{1}{z^2} J[t(z)] \bigr) = 0$ and $0 \neq t(z) - \frac{1}{z^2} J[t(z)] \in \mathbb{C}[\![z, 1/z^2]\!]$, where $J$ is an involutive automorphism of $V[\![z, 1/z^2]\!]$ for every complex vector space $V$ defined as
\[ J[a(z)] := a(w) \big|_{w = z/z^2} \equiv \sum_{n \in \mathbb{Z}} \sum_{m=0}^{\infty} \sum_{\sigma=1}^{h_m} a_{\{-n-m,m,\sigma\}}\, (z^2)^n\, h_\sigma^{(m)}(z), \tag{1.17} \]
for $a(z)$ in Eq. (1.13) ($J^2 = I$ and $J[f a] = J[f]\, J[a]$ for $f \in \mathbb{C}[z, 1/z^2]$, $a \in V[\![z, 1/z^2]\!]$).

2. Vertex Algebra Definition and Operator Product Expansion

In the next three sections we will use the notation $z_{kl} := z_k - z_l$ as an abbreviation of the polynomial $z_k - z_l \in \mathbb{C}^D[z_k, z_l]$ but not as a new variable.

Definition 2.1. Let $\mathcal{V} = \mathcal{V}_0 \oplus \mathcal{V}_1$ be a $\mathbb{Z}_2$–graded complex vector space (i.e., a superspace) and $\mathrm{End}\,\mathcal{V} = (\mathrm{End}\,\mathcal{V})_0 \oplus (\mathrm{End}\,\mathcal{V})_1$ be the corresponding associative superalgebra with Lie bracket $[A, B] = AB - (-1)^{pq} BA$ for $A \in (\mathrm{End}\,\mathcal{V})_p$ and $B \in (\mathrm{End}\,\mathcal{V})_q$. Then $\mathcal{V}$ is said to be a vertex algebra over $\mathbb{C}^D$ if it is equipped with a parity preserving linear map $\mathcal{V} \to (\mathrm{End}\,\mathcal{V})[\![z, 1/z^2]\!] : a \mapsto Y(a, z)$, mutually commuting endomorphisms $T_\mu \in (\mathrm{End}\,\mathcal{V})_0$ for $\mu = 1, \dots, D$ called translation endomorphisms and an element $\mathbf{1} \in \mathcal{V}_0$ called vacuum such that for every $a, a_1, a_2, b \in \mathcal{V}$:
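(a) $(z^2)^N\, Y(a, z)\, b \in \mathcal{V}[[z]]$ for $N \gg 0$ ($\Leftrightarrow$ $Y(a, z)\, b \in \mathcal{V}[[z]]_{z^2}$);
(b) $(z_{12}^2)^N\, [Y(a_1, z_1), Y(a_2, z_2)] = 0$ for $N \gg 0$ ($z_{12} := z_1 - z_2$);
(c) $[T_\mu, Y(a, z)] = \partial_{z^\mu} Y(a, z)$ for $\mu = 1, \dots, D$;
(d) $Y(a, z)\, \mathbf{1} \in \mathcal{V}[[z]]$ and $Y(a, z)\, \mathbf{1} \big|_{z=0} = a$;
(e) $T_\mu \mathbf{1} = 0$ for $\mu = 1, \dots, D$; $Y(\mathbf{1}, z) = I$.

The map $a \mapsto Y(a, z)$ is represented as a formal series by
\[ Y(a, z) = \sum_{n \in \mathbb{Z}} \sum_{m=0}^{\infty} \sum_{\sigma=1}^{h_m} a_{\{n,m,\sigma\}}\, (z^2)^n\, h_\sigma^{(m)}(z), \qquad a_{\{n,m,\sigma\}} \in \mathrm{End}\,\mathcal{V}, \tag{2.1} \]
and $Y(a, z)\, b$ is understood as the series $\sum_{n \in \mathbb{Z}} \sum_{m=0}^{\infty} \sum_{\sigma=1}^{h_m} a_{\{n,m,\sigma\}} b\, (z^2)^n\, h_\sigma^{(m)}(z) \in \mathcal{V}[\![z, 1/z^2]\!]$. For every $a, b \in \mathcal{V}$:
\[ a_{\{n,m,\sigma\}}\, b = P_{\{n+N,m,\sigma\}}(\partial_z)\, \bigl[ (z^2)^N\, Y(a, z)\, b \bigr] \Big|_{z=0} \qquad \text{for } N \gg 0 \tag{2.2} \]
($P_{\{n,m,\sigma\}}(z)$ are defined by Eq. (1.11)). The product $Y(a_1, z_1) \cdots Y(a_N, z_N)$ will be presented by the series
\[ Y(a_1, z_1) \cdots Y(a_N, z_N) = \sum_{n_1 \in \mathbb{Z}} \sum_{m_1 = 0}^{\infty} \sum_{\sigma_1 = 1}^{h_{m_1}} \cdots \sum_{n_N \in \mathbb{Z}} \sum_{m_N = 0}^{\infty} \sum_{\sigma_N = 1}^{h_{m_N}} a_{1\{n_1,m_1,\sigma_1\}} \cdots a_{N\{n_N,m_N,\sigma_N\}} \times (z_1^2)^{n_1} \cdots (z_N^2)^{n_N}\, h^{(m_1)}_{\sigma_1}(z_1) \cdots h^{(m_N)}_{\sigma_N}(z_N) \]
belonging to $(\mathrm{End}\,\mathcal{V})[\![z_1, 1/z_1^2; \dots; z_N, 1/z_N^2]\!]$.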
Definition 2.2. Let $\mathcal{V}$ be a superspace. An element $u(z_1, \dots, z_n) \in (\mathrm{End}\,\mathcal{V})[\![z_1, 1/z_1^2; \dots; z_n, 1/z_n^2]\!]$ is said to be a field if for every $a \in \mathcal{V}$: $u(z_1, \dots, z_n)\, a \in \mathcal{V}[[z_1, \dots, z_n]]_{z_1^2 \cdots z_n^2}$ (i.e., if $(z_1^2 \cdots z_n^2)^{N_a}\, u(z_1, \dots, z_n)\, a \in \mathcal{V}[[z_1, \dots, z_n]]$ for $N_a \gg 0$). Thus in the case of a vertex algebra $\mathcal{V}$, $Y(a, z)$ are fields for every $a \in \mathcal{V}$, in accord with Definition 2.1 (a). If $u(z_1, \dots, z_n)$ is a field then we can define $u(z, \dots, z)$ by setting:
\[ u(z, \dots, z)\, a := \frac{1}{(z^2)^{n N_a}} \Bigl[ (z_1^2 \cdots z_n^2)^{N_a}\, u(z_1, \dots, z_n)\, a \Bigr] \Big|_{z_1 = \dots = z_n = z}, \tag{2.3} \]
for $a \in \mathcal{V}$ and $N_a \gg 0$, which does not depend on $N_a \in \mathbb{N}$. Clearly, if $u(z_1, \dots, z_n)$ is a field then $u(z, \dots, z)$ and $\partial_{z_k^\mu} u(z_1, \dots, z_n)$ are fields too.
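Fields take values in a localisation, where (as noted in Sect. 1) multiplication by a nonzero series has no zero divisors; Example 1.1 shows that this fails for series unbounded in both directions. The sketch below makes that example concrete in the simplest case $D = 1$, $c = 1$, a special case chosen here for illustration and not treated in the text. Then $t(z) = (z-1)^{-2} = \sum_{k \geq 0} (k+1) z^k$ and $J[t(z)] = t(1/z)$, so the difference $t(z) - z^{-2} J[t(z)]$ is an ordinary two-sided Laurent series whose coefficients can be listed directly.

```python
def u(j):
    """Coefficient of z^j in t(z) - z^{-2} J[t(z)] for D = 1, c = 1."""
    if j >= 0:
        return j + 1          # from t(z) = sum_{k>=0} (k+1) z^k
    if j <= -2:
        return -(-j - 1)      # from -z^{-2} t(1/z) = -sum_{k>=0} (k+1) z^{-k-2}
    return 0                  # there is no z^{-1} term

def product_coefficient(j):
    """Coefficient of z^j in (z - 1)^2 * (t(z) - z^{-2} J[t(z)])."""
    # (z - 1)^2 = z^2 - 2 z + 1 shifts the series by 2, 1 and 0 places.
    return u(j - 2) - 2 * u(j - 1) + u(j)
```

In this case $u(j) = j + 1$ for every integer $j$, so the second difference vanishes identically although the series itself is nonzero.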
Definition 2.3. Let $a(z)$ and $b(z)$ be two fields on a superspace $\mathcal{V}$, $a^{(0)}(z)$, $b^{(0)}(z)$ and $a^{(1)}(z)$, $b^{(1)}(z)$ be their even and odd parts, respectively (i.e. if $a(z)$ has an expansion of type (1.13) with $a_{\{n,m,\sigma\}} \in \mathrm{End}\,\mathcal{V}$ then $a^{(0,1)}(z)$ is the formal series with coefficients $a^{(0,1)}_{\{n,m,\sigma\}} \in (\mathrm{End}\,\mathcal{V})_{0,1}$, $a_{\{n,m,\sigma\}} = a^{(0)}_{\{n,m,\sigma\}} + a^{(1)}_{\{n,m,\sigma\}}$). The fields $a(z)$ and $b(z)$ are said to be mutually local if $(z_{12}^2)^N\, [a^{(\varepsilon_1)}(z_1), b^{(\varepsilon_2)}(z_2)] = 0$ for $N \gg 0$ and $\varepsilon_1, \varepsilon_2 = 0, 1$.

Then $(z_{12}^2)^N\, a(z_1)\, b(z_2)$ is a field for $N \gg 0$. Indeed, if $N \gg 0$ then for all $v \in \mathcal{V}$:
\[ (z_{12}^2)^N\, a(z_1)\, b(z_2)\, v = (z_{12}^2)^N \bigl[ b^{(0)}(z_2)\, a^{(0)}(z_1) + b^{(0)}(z_2)\, a^{(1)}(z_1) + b^{(1)}(z_2)\, a^{(0)}(z_1) - b^{(1)}(z_2)\, a^{(1)}(z_1) \bigr] v, \]
in accord with locality, so that for $M \gg 0$: $(z_1^2 z_2^2)^M\, (z_{12}^2)^N\, a(z_1)\, b(z_2)\, v \in \mathcal{V}[[z_1]]_{z_1^2}[[z_2]] \cap \mathcal{V}[[z_2]]_{z_2^2}[[z_1]] \equiv \mathcal{V}[[z_1, z_2]]$.
Theorem 2.1. Let $a(z)$ and $b(z)$ be mutually local fields on a superspace $V$. Then for $N \gg 0$ and every $v \in V$, $M = 0, 1, \dots$, there exists a unique decomposition:
\[
(z_{12}^2)^N a(z_1)\,b(z_2)\,v = \sum_{\substack{n,m = 0,1,\dots \\ 2n+m \le M}}\ \sum_{\sigma=1}^{h_m} \theta^{(N,M)}_{\{n,m,\sigma\}}(z_2)\,v\,(z_{12}^2)^n\,h^{(m)}_\sigma(z_{12}) + \sum_{\mu_1,\dots,\mu_{M+1}=1}^{D} z_{12}^{\mu_1}\cdots z_{12}^{\mu_{M+1}}\,\psi^{(N,M)}_{\mu_1\dots\mu_{M+1}}(z_1,z_2)\,v, \tag{2.4}
\]
where $\theta^{(N,M)}_{\{n,m,\sigma\}}(z)$ and $\psi^{(N,M)}_{\mu_1\dots\mu_{M+1}}(z_1,z_2)$ are fields. The fields $a(z)_{\{n,m,\sigma\}}\,b(z) := \theta^{(N,M)}_{\{n+N,m,\sigma\}}(z)$ do not depend on $N$ and $M$ and are determined by
\[
a(z)_{\{n,m,\sigma\}}\,b(z)\,v = P_{\{n+N,m,\sigma\}}(\partial_{z_1})\Bigl[(z_{12}^2)^N a(z_1)\,b(z_2)\,v\Bigr]\Big|_{z_1=z_2=z} \tag{2.5}
\]
for sufficiently large $N$, independent of $v \in V$, $n \in \mathbb{Z}$, $m = 0, 1, \dots$ and $\sigma = 1, \dots, h_m$ ($P_{\{n+N,m,\sigma\}}(z)$ are the polynomials introduced by (1.11)). If $c(z)$ is another field which is local with respect to $a(z)$ and $b(z)$ then every field $a(z)_{\{n,m,\sigma\}}\,b(z)$ is also local with respect to $c(z)$.

We will prove first two lemmas.

Lemma 2.2. Let $f(z_1,z_2) \in V[[z_1,z_2]]_{z_1^2 z_2^2}$. Then for every $M = 0, 1, \dots$ there exists a unique decomposition
\[
f(z_1,z_2) = \sum_{\substack{n,m = 0,1,\dots \\ 2n+m \le M}}\ \sum_{\sigma=1}^{h_m} g_{\{n,m,\sigma\}}(z_2)\,(z_{12}^2)^n\,h^{(m)}_\sigma(z_{12}) + \sum_{\mu_1,\dots,\mu_{M+1}=1}^{D} z_{12}^{\mu_1}\cdots z_{12}^{\mu_{M+1}}\,g^{(M)}_{\mu_1\dots\mu_{M+1}}(z_1,z_2), \tag{2.6}
\]
where $g_{\{n,m,\sigma\}}(z) \in V[[z]]_{z^2}$ and $g^{(M)}_{\mu_1\dots\mu_{M+1}}(z_1,z_2) \in V[[z_1,z_2]]_{z_1^2 z_2^2}$. Moreover, if $f(z_1,z_2) \in V[[z_1,z_2]]$ then $g_{\{n,m,\sigma\}}(z) \in V[[z]]$ and $g^{(M)}_{\mu_1\dots\mu_{M+1}}(z_1,z_2) \in V[[z_1,z_2]]$.
Proof. The uniqueness of the decomposition (2.6) follows from the equality
\[
g_{\{n,m,\sigma\}}(z) = P_{\{n,m,\sigma\}}(\partial_{z_1})\,f(z_1,z_2)\Big|_{z_1=z_2=z}
\]
in accord with Eq. (1.11), so that if $f(z_1,z_2) \in V[[z_1,z_2]]$ then all $g_{\{n,m,\sigma\}}(z) \in V[[z]]$ as well as all $g^{(M)}_{\mu_1\dots\mu_{M+1}}(z_1,z_2) \in V[[z_1,z_2]]$. One proves the existence first when $f(z_1,z_2) \in V[[z_1,z_2]]$ by the change of variables $(z_1,z_2) \to (z_{12} = z_1 - z_2,\ z_2)$. In the general case: $f(z_1,z_2) = (z_1^2 z_2^2)^{-N}\varphi(z_1,z_2)$ for $N \gg 0$ and $\varphi(z_1,z_2) \in V[[z_1,z_2]]$. Then it is sufficient to prove that there exists the decomposition
\[
f(z_1,z_2) = f(z_2,z_2) + \sum_{\mu=1}^{D} z_{12}^{\mu}\,g^{(0)}_{\mu}(z_1,z_2),
\]
where $g^{(0)}_{\mu}(z_1,z_2) \in V[[z_1,z_2]]_{z_1^2 z_2^2}$, using further induction in $M$. On the other hand, the existence of the latter decomposition follows from
\[
f(z_1,z_2) - f(z_2,z_2) = (z_1^2 z_2^2)^{-N}\bigl(\varphi(z_1,z_2) - \varphi(z_2,z_2)\bigr) + \varphi(z_2,z_2)\Bigl((z_1^2 z_2^2)^{-N} - (z_2^2)^{-2N}\Bigr),
\]
\[
(z_1^2 z_2^2)^{-N} - (z_2^2)^{-2N} = -(z_1^2)^{-N}(z_2^2)^{-2N}\sum_{\mu=1}^{D} z_{12}^{\mu}\bigl(z_1^{\mu} + z_2^{\mu}\bigr)\sum_{k=0}^{N-1}(z_1^2)^{k}(z_2^2)^{N-k-1}.
\]
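The factorization used in the last step of the proof of Lemma 2.2 can be spot-checked with exact rational arithmetic. The following is only an illustrative sketch: the function names `sq`, `lhs` and `rhs` and the sample points are assumptions made for this check, and the identity being tested is $(z_1^2 z_2^2)^{-N} - (z_2^2)^{-2N} = -(z_1^2)^{-N}(z_2^2)^{-2N}\sum_\mu z_{12}^\mu (z_1^\mu + z_2^\mu)\sum_{k=0}^{N-1}(z_1^2)^k (z_2^2)^{N-k-1}$ evaluated at numerical points.

```python
from fractions import Fraction


def sq(z):
    """Euclidean square z^2 = z . z of a D-component point."""
    return sum(c * c for c in z)


def lhs(z1, z2, N):
    """(z1^2 z2^2)^(-N) - (z2^2)^(-2N)."""
    return (sq(z1) * sq(z2)) ** (-N) - sq(z2) ** (-2 * N)


def rhs(z1, z2, N):
    """-(z1^2)^(-N) (z2^2)^(-2N) * sum_mu z12^mu (z1^mu + z2^mu) * geometric sum."""
    linear = sum((a - b) * (a + b) for a, b in zip(z1, z2))  # equals z1^2 - z2^2
    geometric = sum(sq(z1) ** k * sq(z2) ** (N - k - 1) for k in range(N))
    return -(sq(z1) ** (-N)) * sq(z2) ** (-2 * N) * linear * geometric


# Hypothetical sample points in D = 2 with rational coordinates.
Z1 = [Fraction(1, 2), Fraction(3)]
Z2 = [Fraction(2), Fraction(-1, 3)]
```

Evaluating `lhs(Z1, Z2, N) == rhs(Z1, Z2, N)` for a few positive integers `N` compares both sides exactly, since `Fraction` avoids rounding.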
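Lemma 2.3. For every $M \in \mathbb{N}$ and $P(z) \in \mathbb{C}[z]$ there exist $N \in \mathbb{N}$ and $Q(z,w) \in \mathbb{C}[z,w]$ such that:
\[
(z^2)^N P(\partial_z) = Q(z,\partial_z)\,(z^2)^M, \tag{2.7}
\]
where the equation is understood as an operator equality and $Q(z,\partial_z)$ stands for the polynomial $Q(z,w)$ with each monomial $z^{\mu_1}\cdots z^{\mu_k} w^{\nu_1}\cdots w^{\nu_l}$ replaced by $z^{\mu_1}\cdots z^{\mu_k}\,\partial_{z^{\nu_1}}\cdots\partial_{z^{\nu_l}}$.

Proof. Apply induction in $\deg P$. If $\deg P = 0$ then $N = M$ and $Q = P$. If $\deg P > 0$ then $P(z) = \sum_{\mu=1}^{D} z^{\mu} P_{\mu}(z) + P_0(z)$, where $\deg P_{\mu}(z) < \deg P(z)$ for $\mu = 0, \dots, D$. By induction: for every $\mu = 0, \dots, D$ there exist $N_{\mu} \in \mathbb{N}$ and $Q_{\mu}(z,w) \in \mathbb{C}[z,w]$ such that $(z^2)^{N_{\mu}} P_{\mu}(\partial_z) = Q_{\mu}(z,\partial_z)\,(z^2)^{M+1}$. Then let $N = \max\{N_{\mu} : \mu = 0, \dots, D\}$ so that $(z^2)^{N} P_{\mu}(\partial_z) = Q_{\mu}(z,\partial_z)\,(z^2)^{M+1}$. Thus:
\[
(z^2)^N P(\partial_z) = \Bigl[\sum_{\mu=1}^{D}\bigl(Q_{\mu}(z,\partial_z)\,\partial_{z^{\mu}}\,z^2 - 2(M+1)\,Q_{\mu}(z,\partial_z)\,z^{\mu}\bigr) + Q_0(z,\partial_z)\,z^2\Bigr](z^2)^M.
\]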
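A small instance of Lemma 2.3 may help fix ideas. The sketch below is an illustration under stated assumptions, not part of the original argument: it takes $D = 1$, $M = 1$ and $P(w) = w$, and checks that $N = 2$ with $Q(z,w) = z^2 w - 2z$ (a choice made by hand for this example) satisfies $(z^2)^N P(\partial_z) = Q(z,\partial_z)(z^2)^M$ as operators on polynomials. Polynomials are stored as coefficient lists, and all helper names are hypothetical.

```python
# A polynomial in one variable z is a list p with p[i] the coefficient of z**i.

def mul(p, q):
    """Product of two polynomials."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def diff(p):
    """Derivative d/dz of a polynomial."""
    return [i * a for i, a in enumerate(p)][1:] or [0]


def sub(p, q):
    """Difference p - q, padding the shorter list with zeros."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a - b for a, b in zip(p, q)]


def trim(p):
    """Drop trailing zero coefficients, keeping at least one entry."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


Z2 = [0, 0, 1]        # z^2
Z4 = [0, 0, 0, 0, 1]  # (z^2)^2


def lhs(f):
    """(z^2)^N P(d/dz) f with N = 2 and P(w) = w."""
    return trim(mul(Z4, diff(f)))


def rhs(f):
    """Q(z, d/dz) (z^2)^M f with M = 1 and Q(z, w) = z^2 w - 2 z."""
    g = mul(Z2, f)
    return trim(sub(mul(Z2, diff(g)), mul([0, 2], g)))


def check(max_degree=6):
    """Compare both operators on the monomials 1, z, ..., z**max_degree."""
    return all(lhs([0] * d + [1]) == rhs([0] * d + [1])
               for d in range(max_degree + 1))
```

Since both sides are linear in `f`, agreement on monomials is enough for agreement on all polynomials up to the tested degree.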
Proof of Theorem 2.1. The first part of the theorem follows from Lemma 2.2. For the latter statement we have to prove that:
\[
(z_{12}^2)^N\,\bigl[c(z_1),\ a(z_2)_{\{n,m,\sigma\}}\,b(z_2)\bigr]\,v = 0 \quad \text{for } N \gg 0 \text{ and } v \in V. \tag{2.8}
\]
Because of the equality
\[
\bigl(a(z)_{\{n,m,\sigma\}}\,b(z)\bigr)^{(\varepsilon)} = \sum_{\varepsilon_1 = 0,1 \bmod 2} a^{(\varepsilon+\varepsilon_1)}(z)_{\{n,m,\sigma\}}\,b^{(\varepsilon_1)}(z)
\]
(using the notations of Definition 2.3) it is sufficient to consider the case when the fields $a(z)$, $b(z)$ and $c(z)$ have fixed parities $p_a, p_b, p_c \in \mathbb{Z}_2$, respectively. Then we have
\begin{align*}
(z_{12}^2)^{N+M}\,c(z_1)\,a(z_2)_{\{n,m,\sigma\}}\,b(z_2)\,v
&= (z_{12}^2)^{N}(z_{13}^2)^{M}\,P_{\{n+M,m,\sigma\}}(\partial_{z_2})\Bigl[(z_{23}^2)^{M}\,c(z_1)\,a(z_2)\,b(z_3)\,v\Bigr]\Big|_{z_3=z_2} \\
&= Q(z_{12},\partial_{z_2})\Bigl[(z_{12}^2)^{M}(z_{13}^2)^{M}(z_{23}^2)^{M}\,c(z_1)\,a(z_2)\,b(z_3)\,v\Bigr]\Big|_{z_3=z_2} \\
&= (-1)^{p_a p_c + p_b p_c}\,Q(z_{12},\partial_{z_2})\Bigl[(z_{12}^2)^{M}(z_{13}^2)^{M}(z_{23}^2)^{M}\,a(z_2)\,b(z_3)\,c(z_1)\,v\Bigr]\Big|_{z_3=z_2} \\
&= (-1)^{p_a p_c + p_b p_c}\,(z_{12}^2)^{N}(z_{13}^2)^{M}\,P_{\{n+M,m,\sigma\}}(\partial_{z_2})\Bigl[(z_{23}^2)^{M}\,a(z_2)\,b(z_3)\,c(z_1)\,v\Bigr]\Big|_{z_3=z_2} \\
&= (-1)^{(p_a+p_b)\,p_c}\,(z_{12}^2)^{N+M}\,a(z_2)_{\{n,m,\sigma\}}\,b(z_2)\,c(z_1)\,v, \tag{2.9}
\end{align*}
for sufficiently large $M$ and $N$, independent of $v \in V$, in accord with Eq. (2.5) and Lemma 2.3.

The $\{0,0,1\}$-product is the natural candidate for the notion of normal product in vertex algebras which generalizes the corresponding one from the chiral CFT:
\[
{:}Y(a,z)\,Y(b,z){:} \;:=\; Y(a,z)_{\{0,0,1\}}\,Y(b,z). \tag{2.10}
\]
As a consequence of Eqs. (2.5) and (1.12) it could be expressed as:
\[
{:}Y(a,z)\,Y(b,z){:}\;v = K_N\,(\partial_{z_1}^2)^N\Bigl[(z_{12}^2)^N\,Y(a,z_1)\,Y(b,z_2)\,v\Bigr]\Big|_{z_1=z_2=z} \tag{2.11}
\]
for $N \gg 0$ and every $v \in V$.

Remark 2.1. In the case $D = 1$ (the chiral CFT case) there are two basic "harmonic" polynomials: $h^{(0)}(z) = 1$ and $h^{(1)}(z) = z$ ($z$ being now a 1-dimensional formal variable). Then the series (1.13) takes the form $a(z) = \sum_{n\in\mathbb{Z}}\sum_{m=0,1} a_{\{n,m,1\}}\,z^{2n}\,h^{(m)}(z)$ so that it is related to the expansion $a(z) = \sum_{k\in\mathbb{Z}} a_{(k)}\,z^{-k-1}$ used in [6] by $a_{\{n,0,1\}} = a_{(-2n-1)}$ and $a_{\{n,1,1\}} = a_{(-2n-2)}$. It follows that in the $D = 1$ vertex algebra case the $\{n,m,1\}$-product corresponds to the $(-2n-1-m)$-product ($m = 0, 1$) of [6].
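The index dictionary of Remark 2.1 is easy to tabulate. The short sketch below is only an illustration; the function names `chiral_index` and `power_of_z` are hypothetical. It encodes the rule $\{n,m,1\} \mapsto (-2n-1-m)$ and the exponent of $z$ carried by the monomial $z^{2n} h^{(m)}(z)$, so that one can confirm the two expansions label the same power $z^{-k-1}$.

```python
def chiral_index(n, m):
    """Index k of the mode a_(k) matching a_{n,m,1} when D = 1 (m is 0 or 1)."""
    if m not in (0, 1):
        raise ValueError("for D = 1 only m = 0 and m = 1 occur")
    return -2 * n - 1 - m


def power_of_z(n, m):
    """Exponent of z in z**(2n) * h^(m)(z), with h^(0) = 1 and h^(1) = z."""
    return 2 * n + m


def consistent(n_range=range(-5, 6)):
    """Check that z**(2n + m) equals z**(-k - 1) for k = chiral_index(n, m)."""
    return all(power_of_z(n, m) == -chiral_index(n, m) - 1
               for n in n_range for m in (0, 1))
```

For instance, `chiral_index(0, 0)` evaluates to `-1`, so under this dictionary the $\{0,0,1\}$-product of (2.10) sits at the index $-1$.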
3. Consequences of the Existence of a Vacuum and of Translation Invariance

There is a vertex algebra analog of (the corollary of) the Reeh-Schlieder theorem: the separating property of the vacuum [5].

Theorem 3.1. Let $V$ be a vertex algebra and $u(z)$ be a field on $V$ which is mutually local with respect to all fields $Y(a,z)$, $a \in V$. Then if $u(z)\,\hat{1} = 0$ it follows that $u(z) = 0$.

Proof. Because of locality we have for every $a \in V$ and $N_a \gg 0$:
\[
(z_{12}^2)^{N_a}\,u(z_1)\,Y(a,z_2)\,\hat{1} = (z_{12}^2)^{N_a}\,Y(a,z_2)\,u(z_1)\,\hat{1},
\]
thus obtaining $(z_{12}^2)^{N_a}\,u(z_1)\,Y(a,z_2)\,\hat{1} = 0$. Then we can set $z_2 = 0$ and divide by $(z_{12}^2)^{N_a}\big|_{z_2=0} = (z_1^2)^{N_a}$ because it multiplies an element of $V[[z_1]]_{z_1^2}$ (in the $\mathbb{C}[[z]]_{z^2}$-module $V[[z]]_{z^2}$ there are no zero divisors). Thus we obtain that $u(z)\,a = 0$ for every $a \in V$.

The following proposition shows that the system of fields $\{Y(a,z) : a \in V\}$ is a maximal system of translation invariant local fields.

Proposition 3.2. Let $V$ be a vertex algebra and $u(z)$ be a field on $V$ which is mutually local with respect to all fields $Y(a,z)$, $a \in V$. Then the following conditions are equivalent:
(a) $[T_\mu, u(z)] = \partial_{z^\mu} u(z)$ for $\mu = 1, \dots, D$ and $u(z)\,\hat{1} \in V[[z]]$,^1 and $u(z)\,\hat{1}\big|_{z=0} = c$;
(b) $u(z)\,\hat{1} = \exp(T\cdot z)\,c$, where $T\cdot z \equiv \sum_{\mu=1}^{D} T_\mu z^\mu$ and $\exp(T\cdot z) = \sum_{n=0}^{\infty}\frac{1}{n!}(T\cdot z)^n \in (\mathrm{End}\,V)[[z]]$;
(c) $u(z) = Y(c,z)$.

Proof. (a) $\Rightarrow$ (b). The equality $u(z)\,\hat{1} = \exp(T\cdot z)\,c$ appears as the unique solution of the equations $\partial_{z^\mu}\bigl(u(z)\,\hat{1}\bigr) = T_\mu\,u(z)\,\hat{1}$ for $\mu = 1, \dots, D$ with initial condition $u(z)\,\hat{1}\big|_{z=0} = c$. Indeed, if
\[
u(z)\,\hat{1} = \sum_{n=0}^{\infty}\ \sum_{\mu_1,\dots,\mu_n=1}^{D} c^{(n)}_{\mu_1\dots\mu_n}\,z^{\mu_1}\cdots z^{\mu_n}, \qquad c^{(n)}_{\mu_1\dots\mu_n} \in V,
\]
then $c^{(0)} = c$ and $c^{(n)}_{\mu_1\dots\mu_n} = \frac{1}{n}\,T_{\mu_1}\,c^{(n-1)}_{\mu_2\dots\mu_n} = \cdots = \frac{1}{n!}\,T_{\mu_1}\cdots T_{\mu_n}\,c$ for $n > 1$.

(b) $\Rightarrow$ (c). By Definition 2.1 (c) and (d), and the implication (a) $\Rightarrow$ (b) above we have $Y(c,z)\,\hat{1} = \exp(T\cdot z)\,c$. Then $\bigl(u(z) - Y(c,z)\bigr)\,\hat{1} = 0$ and by Theorem 3.1 we conclude that $u(z) = Y(c,z)$.

(c) $\Rightarrow$ (a). This is a part of Definition 2.1 (conditions (c) and (d)).

^1 We remark that the second assumption of this condition, $u(z)\,\hat{1} \in V[[z]]$, follows from the first one, $[T_\mu, u(z)] = \partial_{z^\mu} u(z)$, and the requirement that $u(z)$ acts as a field on $V$. This is proven in [10] (see Proposition 3.2(a)).
Corollary 3.3. Let $V$ be a vertex algebra. Then for all $a \in V$ and $\mu = 1, \dots, D$:
\[
Y(T_\mu a, z) = \partial_{z^\mu} Y(a,z) \equiv [T_\mu, Y(a,z)], \tag{3.1}
\]
\[
T_\mu a = a_{\{0,1,\mu\}}\,\hat{1} \tag{3.2}
\]
if we choose $h^{(1)}_\mu(z) = z^\mu$.

Proof. Equation (3.1) follows from the equality
\[
Y(T_\mu a, z)\,\hat{1}\Big|_{z=0} = T_\mu a = [T_\mu, Y(a,z)]\,\hat{1}\Big|_{z=0}
\]
and Proposition 3.2. Equation (3.2) follows then from the first equality of (3.1).
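Proposition 3.4. Let $V$ be a vertex algebra. Then for all $a, b \in V$ and $n \in \mathbb{Z}$, $m = 0, 1, \dots$, $\sigma = 1, \dots, h_m$:
\[
Y(a,z)_{\{n,m,\sigma\}}\,Y(b,z) = Y\bigl(a_{\{n,m,\sigma\}}\,b,\ z\bigr), \tag{3.3}
\]
and for $n \ge 0$:
\[
Y(a,z)_{\{n,m,\sigma\}}\,Y(b,z) = \bigl(P_{\{n,m,\sigma\}}(\partial_z)\,Y(a,z)\bigr)_{\{0,0,1\}}\,Y(b,z). \tag{3.4}
\]

Proof. To prove Eq. (3.3) we will basically use Eq. (2.5). First we have for $N \gg 0$ and all $v \in V$, $\mu = 1, \dots, D$:
\begin{align*}
\bigl[T_\mu,\ Y(a,z)_{\{n,m,\sigma\}}\,Y(b,z)\bigr]\,v
&= P_{\{n+N,m,\sigma\}}(\partial_{z_1})\Bigl[(z_{12}^2)^N\,\bigl[T_\mu,\ Y(a,z_1)\,Y(b,z_2)\bigr]\,v\Bigr]\Big|_{z_1=z_2=z} \\
&= P_{\{n+N,m,\sigma\}}(\partial_{z_1})\Bigl[(z_{12}^2)^N\bigl(\partial_{z_1^\mu}Y(a,z_1)\,Y(b,z_2) + Y(a,z_1)\,\partial_{z_2^\mu}Y(b,z_2)\bigr)\,v\Bigr]\Big|_{z_1=z_2=z} \\
&= P_{\{n+N,m,\sigma\}}(\partial_{z_1})\Bigl[\bigl(\partial_{z_1^\mu} + \partial_{z_2^\mu}\bigr)\bigl((z_{12}^2)^N\,Y(a,z_1)\,Y(b,z_2)\,v\bigr)\Bigr]\Big|_{z_1=z_2=z} \\
&= \partial_{z^\mu}\Bigl(P_{\{n+N,m,\sigma\}}(\partial_{z_1})\Bigl[(z_{12}^2)^N\,Y(a,z_1)\,Y(b,z_2)\,v\Bigr]\Big|_{z_1=z_2=z}\Bigr) \\
&= \partial_{z^\mu}\bigl(Y(a,z)_{\{n,m,\sigma\}}\,Y(b,z)\bigr)\,v. \tag{3.5}
\end{align*}
On the other hand, for $N \gg 0$:
\[
Y(a,z)_{\{n,m,\sigma\}}\,Y(b,z)\,\hat{1}\Big|_{z=0} = \Bigl(P_{\{n+N,m,\sigma\}}(\partial_{z_1})\Bigl[(z_{12}^2)^N\,Y(a,z_1)\,Y(b,z_2)\,\hat{1}\Bigr]\Big|_{z_1=z_2=z}\Bigr)\Big|_{z=0}. \tag{3.6}
\]
But $P_{\{n+N,m,\sigma\}}(\partial_{z_1})\bigl[(z_{12}^2)^N\,Y(a,z_1)\,Y(b,z_2)\,\hat{1}\bigr] \in V[[z_1,z_2]]$, so that the consecutive restrictions $z_1 = z_2 = z$ and $z = 0$ are equivalent to the restrictions: first $z_2 = 0$ and then $z_1 = 0$. In such a way we obtain that
\[
Y(a,z)_{\{n,m,\sigma\}}\,Y(b,z)\,\hat{1}\Big|_{z=0} = a_{\{n,m,\sigma\}}\,b.
\]
Combining these two results we conclude by Proposition 3.2 that Eq. (3.3) holds.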
The proof of (3.4) uses Eqs. (3.3), (2.2) and some additional properties of the polynomials $P_{\{n,m,\sigma\}}(z)$. We will not prove (3.4) since we will not use it further.

Corollary 3.5. Let $u(z) = \sum_{n\in\mathbb{Z}}\sum_{m=0}^{\infty}\sum_{\sigma=1}^{h_m} u_{\{n,m,\sigma\}}\,(z^2)^n\,h^{(m)}_\sigma(z)$ and $v(z)$ be two mutually local fields on a superspace $V$ and $\hat{1} \in V_0$ be such that $u(z)\,\hat{1}$ and $v(z)\,\hat{1}$ belong to $V[[z]]$. Then
\[
u(z)_{\{n,m,\sigma\}}\,v(z)\,\hat{1}\Big|_{z=0} = u_{\{n,m,\sigma\}}\Bigl(v(z)\,\hat{1}\Big|_{z=0}\Bigr). \tag{3.7}
\]

Proof. This can be derived as in the proof of Proposition 3.4 (the derivation of (3.6)).

4. Existence Theorem. Analytic Continuations

The next theorem allows one to construct a vertex algebra from a system of mutually local and "translation covariant" fields which give rise to the entire space by acting on the vacuum.

Theorem 4.1 ("Existence Theorem"). Let $u^\alpha(z) = \sum_{n\in\mathbb{Z}}\sum_{m=0}^{\infty}\sum_{\sigma=1}^{h_m} u^\alpha_{\{n,m,\sigma\}}\,(z^2)^n\,h^{(m)}_\sigma(z)$ for $\alpha \in A$ be a system of mutually local fields on a superspace $V$. Let $\hat{1} \in V_0$ and $T_\mu \in (\mathrm{End}\,V)_0$ be such that $T_\mu\,\hat{1} = 0$ for $\mu = 1, \dots, D$ and:
(a) $[T_\mu, u^\alpha(z)] = \partial_{z^\mu} u^\alpha(z)$ and $u^\alpha(z)\,\hat{1} \in V[[z]]$ for all $\alpha \in A$, $\mu = 1, \dots, D$;
(b) the set of all elements $u^{\alpha_1}_{\{n_1,m_1,\sigma_1\}}\cdots u^{\alpha_N}_{\{n_N,m_N,\sigma_N\}}\,\hat{1}$ for $N = 0, 1, \dots$, $\alpha_k \in A$, $n_k \in \mathbb{Z}$, $n_N \ge 0$, $m_k = 0, 1, \dots$, $\sigma_k = 1, \dots, h_{m_k}$ ($k = 1, \dots, N$), spans the space $V$.
Then there exists a unique structure of a vertex algebra with vacuum $\hat{1}$ and translation endomorphisms $T_\mu$ on $V$ such that
\[
Y(u^\alpha, z) = u^\alpha(z) \quad \text{for } u^\alpha := u^\alpha(z)\,\hat{1}\Big|_{z=0},\ \alpha \in A. \tag{4.1}
\]
The operators $Y(a,z)$ are determined for the vectors of the set in the above condition (b) by:
\[
Y\bigl(u^{\alpha_1}_{\{n_1,m_1,\sigma_1\}}\cdots u^{\alpha_N}_{\{n_N,m_N,\sigma_N\}}\,\hat{1},\ z\bigr) = u^{\alpha_1}(z)_{\{n_1,m_1,\sigma_1\}}\Bigl(\cdots\Bigl(u^{\alpha_{N-1}}(z)_{\{n_{N-1},m_{N-1},\sigma_{N-1}\}}\bigl(P_{\{n_N,m_N,\sigma_N\}}(\partial_z)\,u^{\alpha_N}(z)\bigr)\Bigr)\cdots\Bigr). \tag{4.2}
\]

Proof. Set $Y(\hat{1}, z) = I$ and take Eq. (4.2) as a definition for the operators $Y(a,z)$, restricting to a subsystem of the set displayed in condition (b) which contains $\hat{1}$ and forms a basis of $V$. Note that Eq. (4.2) is well defined: any linear dependence among the terms of the right-hand side leads by Eq. (3.7) to a corresponding linear dependence among the terms of the left-hand side. By Theorem 2.1 we obtain a system of mutually local fields. The conditions (c) and (d) of Definition 2.1 can be proven by induction in $N$ for the fields (4.2) following the argument of the first part of the proof of Proposition 3.4 (the computations of (3.5) and (3.6)). The uniqueness follows from Proposition 3.2.
Now we will find an analogue of the analytic continuation of products of Wightman fields acting on the vacuum. Let $R$ be a ring and $V$ be an $R$-module. Then $V[[z]]_{z^2}$ is an $R[[z]]_{z^2}$-module with derivations $\partial_{z^\mu}$ for $\mu = 1, \dots, D$. Moreover, if the $R$-module $V$ has no zero divisors then this is also true for the $R[[z]]_{z^2}$-module $V[[z]]_{z^2}$. From this simple fact it follows by induction that
\[
V[[z_1]]_{z_1^2}\cdots[[z_n]]_{z_n^2} := \bigl(\cdots\bigl(V[[z_1]]_{z_1^2}\bigr)\cdots\bigr)[[z_n]]_{z_n^2} \tag{4.3}
\]
is a $\mathbb{C}[[z_1]]_{z_1^2}\cdots[[z_n]]_{z_n^2}$-module with derivations $\partial_{z_k^\mu}$ ($k = 1, \dots, n$, $\mu = 1, \dots, D$), which has no zero divisors. Note that
\[
V[[z_1,\dots,z_n]]_{z_1^2\cdots z_n^2} \subsetneq V[[z_1]]_{z_1^2}\cdots[[z_n]]_{z_n^2} \subsetneq V[[z_1,1/z_1^2;\dots;z_n,1/z_n^2]]. \tag{4.4}
\]
It follows from the definition of vertex algebra (Def. 2.1) that in a vertex algebra $V$, for all $a_1, \dots, a_n, b \in V$:
\[
Y(a_1,z_1)\cdots Y(a_n,z_n)\,b \in V[[z_1]]_{z_1^2}\cdots[[z_n]]_{z_n^2}. \tag{4.5}
\]
Let us introduce the following multiplicative systems in $\mathbb{C}[z_1,\dots,z_n]$:
\[
L_n := \Bigl\{\prod_{k=1}^{N}\Bigl(\sum_{l=1}^{n}\lambda_{k,l}\,z_l\Bigr)^{2} : N \in \mathbb{N},\ (\lambda_{k,1},\dots,\lambda_{k,n}) \in \mathbb{C}^n\setminus\{0\} \text{ for } k = 1, \dots, N\Bigr\}, \tag{4.6}
\]
\[
R_n := \Bigl\{\Bigl(\prod_{k=1}^{n} z_k^2\Bigr)^{N}\Bigl(\prod_{1\le l<m\le n} z_{lm}^2\Bigr)^{N} : N \in \mathbb{N}\Bigr\} \tag{4.7}
\]
($z_{lm} = z_l - z_m \in \mathbb{C}^D[z_l,z_m]$). Clearly,
\[
R_n \subsetneq L_n \quad\text{and}\quad V[[z_1,\dots,z_n]]_{R_n} \subsetneq V[[z_1,\dots,z_n]]_{L_n} \tag{4.8}
\]
for every vector space $V$ and the localized modules in (4.8) have induced derivations $\partial_{z^\mu}$ ($\mu = 1, \dots, D$).

For every linear automorphism $A : \mathbb{C}^n \to \mathbb{C}^n$ with matrix $(A_{kl})$ and inverse matrix $(A^{-1}_{kl})$, and a vector space $V$ we define an induced automorphism
\[
r(A) : V[[z_1,\dots,z_n]]_{L_n} \longrightarrow V[[z_1,\dots,z_n]]_{L_n}, \tag{4.9}
\]
the "linear change of variables $z_k \mapsto z'_k = \sum_{l=1}^{n} A_{kl}\,z_l$", replacing $z_k \mapsto \sum_{l=1}^{n} A^{-1}_{kl}\,z_l$ ($k = 1, \dots, n$) (note that the multiplicative system $L_n$ (4.6) is invariant under this replacement while $R_n$ is not). There is also a natural action of the symmetric group $S_n$ on $V[[z_1,\dots,z_n]]_{L_n}$ and $V[[z_1,\dots,z_n]]_{R_n}$ induced by the permutation of variables $(z_1,\dots,z_n)$, since the multiplicative systems $L_n$ and $R_n$ are invariant under this action.

Now we will introduce a homomorphism, commuting with the derivations $\partial_{z_k^\mu}$,
\[
\iota_{z_1,\dots,z_n} : \mathbb{C}[[z_1,\dots,z_n]]_{L_n} \longrightarrow \mathbb{C}[[z_1]]_{z_1^2}\cdots[[z_n]]_{z_n^2} \tag{4.10}
\]
that will be the expansion "in the domain $z_1^2 > \cdots > z_n^2$". We first set
\[
\iota_{z_1,\dots,z_n}\Big|_{\mathbb{C}[[z_1,\dots,z_n]]} = I_{\mathbb{C}[[z_1,\dots,z_n]]}. \tag{4.11}
\]
Next, consider for every $N \in \mathbb{Z}$ and constants $(\lambda_1,\dots,\lambda_n) \in \mathbb{C}^n$, the Taylor expansions in the $D$-dimensional variables $z_1, \dots, z_n$:
\begin{align*}
&\Bigl(1 + 2\sum_{2\le k<l\le n}\lambda_1^2\lambda_k\lambda_l\,z_1^2\,z_k\cdot z_l + \sum_{m=2}^{n}\bigl(2\lambda_1\lambda_m\,z_1\cdot z_m + \lambda_1^2\lambda_m^2\,z_1^2 z_m^2\bigr)\Bigr)^{-N} \\
&\qquad = \sum_{k_1,\dots,k_n=0}^{\infty}\lambda_1^{k_1}\cdots\lambda_n^{k_n}\,f^N_{k_1\dots k_n}(z_1,\dots,z_n) \in \mathbb{C}[z_1][[z_2,\dots,z_n]] \subset \mathbb{C}[[z_1,\dots,z_n]], \tag{4.12}
\end{align*}
where $f^N_{k_1\dots k_n}(z_1,\dots,z_n)$ are separately homogeneous polynomials in $z_1, \dots, z_n$ of degrees $k_1, \dots, k_n$, respectively, and the coefficient (formal) series in $z_1$ for every monomial in $z_2, \dots, z_n$ is actually a polynomial. (The last is true because the polynomials $f^N_{k_1\dots k_n}(z_1,\dots,z_n)$ are zero if $k_1 > k_2 + \cdots + k_n$.) Thus we can replace $z_1$ by $z_1/z_1^2$ in the formal series (4.12) and define for every $N \in \mathbb{Z}$ and constants $(\lambda_1,\dots,\lambda_n) \in \mathbb{C}^n$, $\lambda_1 \ne 0$:
\begin{align*}
\iota_{z_1,\dots,z_n}\Bigl(\Bigl(\sum_{k=1}^{n}\lambda_k z_k\Bigr)^{2}\Bigr)^{-N} := {}&(\lambda_1^2 z_1^2)^{-N}\Bigl(1 + 2\sum_{2\le k<l\le n}\lambda_1^{-2}\lambda_k\lambda_l\,z_1^2\,z_k\cdot z_l \\
&\quad + \sum_{m=2}^{n}\bigl(2\lambda_1^{-1}\lambda_m\,z_1\cdot z_m + \lambda_1^{-2}\lambda_m^2\,z_1^2 z_m^2\bigr)\Bigr)^{-N}\Big|_{z_1\to z_1/z_1^2} \\
&\in \mathbb{C}[z_1,1/z_1^2][[z_2,\dots,z_n]] \subset \mathbb{C}[[z_1]]_{z_1^2}\cdots[[z_n]]_{z_n^2}. \tag{4.13}
\end{align*}
For general constants $(\lambda_1,\dots,\lambda_n) \in \mathbb{C}^n\setminus\{0\}$ we set
\[
\iota_{z_1,\dots,z_n}\Bigl(\Bigl(\sum_{k=1}^{n}\lambda_k z_k\Bigr)^{2}\Bigr)^{-N} := \iota_{z_m,\dots,z_n}\Bigl(\Bigl(\sum_{k=m}^{n}\lambda_k z_k\Bigr)^{2}\Bigr)^{-N} \tag{4.14}
\]
($\in \mathbb{C}[[z_1]]_{z_1^2}\cdots[[z_n]]_{z_n^2}$) if $\lambda_1 = \cdots = \lambda_{m-1} = 0$, $\lambda_m \ne 0$. Since the Taylor expansions (4.12) have a multiplicative property, then
\[
\iota_{z_1,\dots,z_n}\Bigl(\Bigl(\sum_{k=1}^{n}\lambda_k z_k\Bigr)^{2}\Bigr)^{-N_1}\ \iota_{z_1,\dots,z_n}\Bigl(\Bigl(\sum_{k=1}^{n}\lambda_k z_k\Bigr)^{2}\Bigr)^{-N_2} = \iota_{z_1,\dots,z_n}\Bigl(\Bigl(\sum_{k=1}^{n}\lambda_k z_k\Bigr)^{2}\Bigr)^{-N_1-N_2} \tag{4.15}
\]
for $N_1, N_2 \in \mathbb{Z}$. Finally, the homomorphism $\iota_{z_1,\dots,z_n}$ (4.10) is uniquely determined by Eqs. (4.11)-(4.15).

Remark 4.1. The operation $\iota_{z_1,\dots,z_n}$ applied to a rational function $R(z_1,\dots,z_n)$, regular for $z_2 = 0, \dots, z_n = 0$, should give the Taylor expansion of $R$ in $z_2, \dots, z_n$ around $(0,\dots,0)$. Its coefficients are rational functions in $z_1$. In the case of the left-hand side of (4.13) these coefficients are polynomials in $z_1$ and $1/z_1^2$, which follows by induction in the total order of $z_2, \dots, z_n$.^2

Note that $\iota_{z_1,\dots,z_n}$ is a $\mathbb{C}[z_1,1/z_1^2;\dots;z_n,1/z_n^2]$-linear map and commutes with the derivations $\partial_{z_k^\mu}$ ($k = 1, \dots, n$, $\mu = 1, \dots, D$). We can also define a $\mathbb{C}[z_1,1/z_1^2;\dots;z_n,1/z_n^2]$-linear map
\[
\iota_{z_1,\dots,z_n} : V[[z_1,\dots,z_n]]_{L_n} \longrightarrow V[[z_1]]_{z_1^2}\cdots[[z_n]]_{z_n^2},
\]
so that $\iota_{z_1,\dots,z_n}\big|_{V[[z_1,\dots,z_n]]} = I_{V[[z_1,\dots,z_n]]}$ and $\iota_{z_1,\dots,z_n}(f u) = \iota_{z_1,\dots,z_n}(f)\,\iota_{z_1,\dots,z_n}(u)$ for $f \in \mathbb{C}[[z_1,\dots,z_n]]_{L_n}$ and $u \in V[[z_1,\dots,z_n]]_{L_n}$.

The map $\iota_{z_1,\dots,z_n}$ has zero kernel in $V[[z_1,\dots,z_n]]_{L_n}$. Indeed, if $\iota_{z_1,\dots,z_n} u = 0$ for some nonzero $u \in V[[z_1,\dots,z_n]]_{L_n}$ then $u = f^{-1} v$, where $f := \prod_{k=1}^{N}\bigl(\sum_{l=1}^{n}\lambda_{k,l}\,z_l\bigr)^{2}$, $v \in V[[z_1,\dots,z_n]]$. But $\iota_{z_1,\dots,z_n} u = \bigl(\iota_{z_1,\dots,z_n} f^{-1}\bigr)\bigl(\iota_{z_1,\dots,z_n} v\bigr)$ and $\iota_{z_1,\dots,z_n} v \equiv v \ne 0$, $\iota_{z_1,\dots,z_n} f^{-1} \ne 0$ (since $\iota_{z_1,\dots,z_n} f^{-1}$ is the inverse of $f$), which contradicts the fact that in $V[[z_1]]_{z_1^2}\cdots[[z_n]]_{z_n^2}$ there are no zero divisors.

Proposition 4.2. In any vertex algebra $V$ and $a_1, \dots, a_n, b \in V$ one has
\[
Y(a_1,z_1)\cdots Y(a_n,z_n)\,b \in \iota_{z_1,\dots,z_n}\bigl(V[[z_1,\dots,z_n]]_{R_n}\bigr) \subset V[[z_1]]_{z_1^2}\cdots[[z_n]]_{z_n^2} \tag{4.16}
\]
(see Eq. (4.7)). Moreover, the inverse image
\[
Y_n(a_1,z_1;\dots;a_n,z_n;b) := \iota^{-1}_{z_1,\dots,z_n}\,Y(a_1,z_1)\cdots Y(a_n,z_n)\,b \in V[[z_1,\dots,z_n]]_{R_n} \tag{4.17}
\]
is $\mathbb{Z}_2$-symmetric in the sense that if $a_1, \dots, a_n$ have fixed parities $p_1, \dots, p_n$ (resp.) then for any permutation $\sigma \in S_n$:
\[
Y_n\bigl(a_{\sigma(1)},z_{\sigma(1)};\dots;a_{\sigma(n)},z_{\sigma(n)};b\bigr) = (-1)^{\varepsilon(\sigma)}\,Y_n(a_1,z_1;\dots;a_n,z_n;b), \tag{4.18}
\]
where $\varepsilon(\sigma) = \sum p_{a_{\sigma(i)}}\,p_{a_{\sigma(j)}} \pmod 2$, where the sum is over all pairs of indices $i < j$ such that $\sigma(i) > \sigma(j)$.

^2 The author thanks A. Retakh for his interest in this work and for asking a question answered in this Remark.
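The simplest instance of the expansion map can be checked by hand or by machine. The sketch below is an illustration under explicit assumptions, not a statement from the text: it takes $D = 1$, $n = 2$, $\lambda_1 = 1$, $\lambda_2 = -1$ and a positive integer $N$, in which case (4.13) reduces to $(z_1^2)^{-N}(1-t)^{-2N}$ with $t = z_2/z_1$, whose Taylor coefficients in $t$ are the binomial coefficients $\binom{2N+k-1}{k}$. The function names are hypothetical. The code verifies that multiplying the truncated series back by $((1-t)^2)^N$ returns $1$ up to the truncation order, i.e. that the expansion really inverts $((z_1 - z_2)^2)^N$ order by order.

```python
from math import comb


def expansion_coeffs(N, order):
    """Coefficients c_k of t**k (k < order) in (1 - t)**(-2N), for N >= 1."""
    return [comb(2 * N + k - 1, k) for k in range(order)]


def times_one_minus_t_squared(series, N, order):
    """Multiply a truncated series by ((1 - t)**2)**N, keeping terms below t**order."""
    out = list(series[:order])
    for _ in range(2 * N):
        # Multiplying by (1 - t) sends c_k to c_k - c_{k-1}.
        out = [out[k] - (out[k - 1] if k > 0 else 0) for k in range(len(out))]
    return out


def is_inverse(N, order=8):
    """True if the truncated expansion inverts ((1 - t)**2)**N below t**order."""
    product = times_one_minus_t_squared(expansion_coeffs(N, order), N, order)
    return product == [1] + [0] * (order - 1)
```

For `N = 1` the coefficients start `1, 2, 3, 4`, which is the familiar expansion of $1/(z_1 - z_2)^2$ in powers of $z_2/z_1$ up to the overall factor $z_1^{-2}$.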
Proof. Locality (Definition 2.1 (b)) implies that
\[
\rho_n^N\,Y(a_1,z_1)\cdots Y(a_n,z_n)\,b \in V[[z_1,\dots,z_n]] \quad\text{for } N \gg 0,
\]
where $\rho_n := \prod_k z_k^2\,\prod_{l<m} z_{lm}^2$. On the other hand,
\[
Y(a_1,z_1)\cdots Y(a_n,z_n)\,b \in V[[z_1]]_{z_1^2}\cdots[[z_n]]_{z_n^2},
\]
because of Definition 2.1 (a). Then (4.16) follows from the fact that $\iota_{z_1,\dots,z_n}\,\rho_n^{-N}$ is an inverse element of $\rho_n^N$ in $\mathbb{C}[[z_1]]_{z_1^2}\cdots[[z_n]]_{z_n^2}$. To prove Eq. (4.18) we note that for $N \gg 0$: $\rho_n^N\,Y_n(a_1,z_1;\dots;a_n,z_n;b)$ is $\mathbb{Z}_2$-symmetric, while $\rho_n^N$ is symmetric in $z_1, \dots, z_n$.

Theorem 4.3. In any vertex algebra $V$ and $a, b, c \in V$ it follows that
\[
Y\bigl(Y(a,z-w)\,b,\ w\bigr)\,c = \iota_{w,\,z-w}\ r^{\,w,\,z-w}_{z,\,w}\ \iota^{-1}_{z,\,w}\ Y(a,z)\,Y(b,w)\,c, \tag{4.19}
\]
where $Y\bigl(Y(a,z-w)\,b,\ w\bigr)\,c$ is viewed as a series belonging to $V[[w]]_{w^2}[[z-w]]_{(z-w)^2}$, $\iota^{-1}_{z,w}$ is the inverse of $\iota_{z,w}$ on its image and $r^{\,w,\,z-w}_{z,\,w} : V[[z,w]]_{L_2} \to V[[w,z-w]]_{L_2}$ is the map of type (4.9) induced by the change of variables $(z,w) \to (w,z-w)$.

Proof. The theorem follows from Theorem 2.1 (Eq. (2.4)) and Eq. (3.3). More precisely, we obtain the following equalities in $V[[z,w]] \cong V[[w,z-w]]$ for $N \gg 0$:
\begin{align*}
\bigl(z^2 w^2 (z-w)^2\bigr)^N\,Y(a,z)\,Y(b,w)\,c
&= (z^2 w^2)^N \sum_{n,m=0}^{\infty}\sum_{\sigma=1}^{h_m} Y(a,w)_{\{n-N,m,\sigma\}}\,Y(b,w)\,c\ \bigl((z-w)^2\bigr)^n\,h^{(m)}_\sigma(z-w) \\
&= (z^2 w^2)^N \sum_{n,m=0}^{\infty}\sum_{\sigma=1}^{h_m} Y\bigl(a_{\{n-N,m,\sigma\}}\,b,\ w\bigr)\,c\ \bigl((z-w)^2\bigr)^n\,h^{(m)}_\sigma(z-w) \\
&= \Bigl[\bigl(z_1^2 z_2^2 (z_1+z_2)^2\bigr)^N\,Y\bigl(Y(a,z_1)\,b,\ z_2\bigr)\,c\Bigr]\Big|_{z_1 = z-w,\ z_2 = w};
\end{align*}
then the prefactors can be cancelled after applying to both sides the corresponding $\iota^{-1}$ operators.

Proposition 4.4 (Quasisymmetry relation).
\[
Y(a,z)\,b = (-1)^{p_a p_b}\,e^{z\cdot T}\,Y(b,-z)\,a \tag{4.20}
\]
(here $z\cdot T = \sum_{\mu=1}^{D} z^\mu T_\mu$) for all $a, b \in V$, the right-hand side being understood as an action of the series $e^{z\cdot T} \in (\mathrm{End}\,V)[[z]]$ on a series belonging to $V[[z]]_{z^2}$.

Sketch of the proof. Using the series $Y_n$ introduced in Proposition 4.2, we first derive that $Y_2(a,z;b,w;\hat{1}) \in V[[z,w]]_{(z-w)^2}$ and
\[
Y_2(a,z;b,w;\hat{1})\Big|_{w=0} = Y_1(a,z;b) = Y(a,z)\,b, \tag{4.21}
\]
\[
e^{w\cdot T}\,Y_2(a,z_1;b,z_2;\hat{1}) = Y_2(a,z_1+w;b,z_2+w;\hat{1}) \tag{4.22}
\]
(using the argument of the proof of Proposition 3.2 (a) $\Rightarrow$ (b)); we then apply to the left-hand side of Eq. (4.22) the $\mathbb{Z}_2$-symmetry (4.18) and set $z_1 := 0$, $w = -z_2 := z$.
5. Free Field Examples. Lie Superalgebras of Formal Distributions

Let us consider a central extension of the free commutative Lie superalgebra
\[
\mathrm{Span}_{\mathbb{C}}\bigl\{u^\alpha_{\{n,m,\sigma\}} : \alpha \in A,\ n \in \mathbb{Z},\ m \in \mathbb{N}\cup\{0\},\ \sigma = 1, \dots, h_m\bigr\}, \tag{5.1}
\]
where $A$ is some index set and all generators $u^\alpha_{\{n,m,\sigma\}}$ have parities $p_\alpha$ which do not depend on $n$, $m$ and $\sigma$. The commutation relations are presented by the following generating functions:
\[
\bigl[u^\alpha(z),\ u^\beta(w)\bigr] = \Bigl(\iota_{z,w}\frac{Q_{\alpha\beta}(z-w)}{\bigl((z-w)^2\bigr)^{\mu_{\alpha\beta}}} - \iota_{w,z}\frac{Q_{\alpha\beta}(z-w)}{\bigl((z-w)^2\bigr)^{\mu_{\alpha\beta}}}\Bigr)K, \tag{5.2}
\]
\[
u^\alpha(z) = \sum_{n\in\mathbb{Z}}\sum_{m=0}^{\infty}\sum_{\sigma=1}^{h_m} u^\alpha_{\{n,m,\sigma\}}\,(z^2)^n\,h^{(m)}_\sigma(z) \tag{5.3}
\]
for $\alpha, \beta \in A$, where $\mu_{\alpha\beta}$ are positive integers, $K$ is the central element and $Q_{\alpha\beta}(z)$ are polynomials such that $Q_{\alpha\beta}(-z) = (-1)^{p_\alpha p_\beta}\,Q_{\beta\alpha}(z)$. Without loss of generality we can suppose that the leading (i.e. harmonic) term in the harmonic decomposition (1.5) for every $Q_{\alpha\beta}(z) =: p(z)$ is nonzero if $Q_{\alpha\beta}(z) \ne 0$. Then the right-hand side of Eq. (5.2) uniquely determines the polynomials $Q_{\alpha\beta}(z)$. As a consequence of Eq. (5.2) (and (4.15)) we have
\[
\bigl((z-w)^2\bigr)^{\mu_{\alpha\beta}}\,\bigl[u^\alpha(z),\ u^\beta(w)\bigr] = 0. \tag{5.4}
\]
The Lie superalgebra $H$ so obtained has a decomposition
\[
H = H_{\{+\}} \oplus \mathbb{C}K \oplus H_{\{-\}}, \tag{5.5}
\]
\[
H_{\{+\}} = \mathrm{Span}_{\mathbb{C}}\bigl\{u^\alpha_{\{n,m,\sigma\}} : \alpha \in A,\ n \ge 0,\ m \in \mathbb{N}\cup\{0\},\ \sigma = 1, \dots, h_m\bigr\}, \tag{5.6}
\]
\[
H_{\{-\}} = \mathrm{Span}_{\mathbb{C}}\bigl\{u^\alpha_{\{n,m,\sigma\}} : \alpha \in A,\ n < 0,\ m \in \mathbb{N}\cup\{0\},\ \sigma = 1, \dots, h_m\bigr\}. \tag{5.7}
\]
Let $F$ be the Fock representation space of $H$ determined by $H_{\{-\}}\,|0\rangle = 0$ and $K\big|_F = kI$, where $|0\rangle \in F$ is the Fock vacuum. Then the formal series (5.3) is represented on $F$ as a field for every $\alpha \in A$. In fact we will prove a more general statement:

Proposition 5.1. Let a system of formal series (5.3) be given with coefficients $u^\alpha_{\{n,m,\sigma\}}$ generating a Lie superalgebra $L$ and such that Eq. (5.4) is satisfied for some positive integers $\mu_{\alpha\beta}$ and all $\alpha, \beta \in A$ ($u^\alpha_{\{n,m,\sigma\}}$ are supposed to have a parity independent of $n$, $m$ and $\sigma$). Let $L_{\{+\}}$ and $L_{\{-\}}$ be the subalgebras of $L$ which are generated by the right-hand sides of Eqs. (5.6) and (5.7), respectively. Let $U$ be the representation of $L$ obtained by factorization of the universal enveloping algebra $U(L)$ of $L$ by the left ideal $U(L)\,L_{\{-\}}$, where $L$ is assumed to act by left multiplication. Then the formal series (5.3) is represented on $U$ as a field for every $\alpha \in A$.

Proof. Let us denote the class of $a \in U(L)$ in $U$ by $[a]$. Thus we have to prove that
\[
u^\alpha(z)\,\bigl[u^{\alpha_1}_{\{n_1,m_1,\sigma_1\}}\cdots u^{\alpha_k}_{\{n_k,m_k,\sigma_k\}}\,I\bigr] \in U[[z]]_{z^2} \tag{5.8}
\]
for all $k = 0, 1, \dots$ and all values of the indices. We will make the proof by induction in $k$: for $k = 0$ Eq. (5.8) follows from the factorization by $L_{\{-\}}$. Suppose that (5.8) is satisfied for $k \ge 0$ and all values of the indices. Let us set $v_k := \bigl[u^{\alpha_1}_{\{n_1,m_1,\sigma_1\}}\cdots u^{\alpha_k}_{\{n_k,m_k,\sigma_k\}}\,I\bigr] \in U$. Then we have to prove that $(z^2)^N\,u^\alpha(z)\,u^\beta_{\{n,m,\sigma\}}\,v_k \in U[[z]]$ for $N \gg 0$. By the inductive assumption $u^\beta_{\{n,m,\sigma\}}\,v_k = P_{\{n+M,m,\sigma\}}(\partial_w)\bigl[(w^2)^M\,u^\beta(w)\,v_k\bigr]\big|_{w=0}$ for $M \gg 0$ (recall the definition (1.11) of $P_{\{n,m,\sigma\}}(z)$). Then by Lemma 2.3, for every $L \in \mathbb{N}$ there exist $N \in \mathbb{N}$ and a polynomial $Q(z,w) \in \mathbb{C}[z,w]$ such that
\begin{align*}
(z^2)^N\,u^\alpha(z)\,u^\beta_{\{n,m,\sigma\}}\,v_k
&= \bigl((z-w)^2\bigr)^N\,u^\alpha(z)\,P_{\{n+M,m,\sigma\}}(\partial_w)\Bigl[(w^2)^M\,u^\beta(w)\,v_k\Bigr]\Big|_{w=0} \\
&= Q(z-w,\partial_w)\Bigl[\bigl((z-w)^2\bigr)^L\,(w^2)^M\,u^\alpha(z)\,u^\beta(w)\,v_k\Bigr]\Big|_{w=0}
\end{align*}
for $M \gg 0$. On the other hand, it follows from (5.4) that $\bigl((z-w)^2\bigr)^L\,u^\alpha(z)\,u^\beta(w)\,v_k \in U[[z,w]]$ for $L \gg 0$ (as in the case of vertex algebras, after Def. 2.3). Consequently, $(z^2)^N\,u^\alpha(z)\,u^\beta_{\{n,m,\sigma\}}\,v_k \in U[[z]]$ for $N \gg 0$.

To obtain a vertex algebra structure we need additional assumptions.

Proposition 5.2. In the assumptions of Proposition 5.1 let us suppose that there exist even derivations $T_1, \dots, T_D$ of $L$ such that
\[
T_\mu\bigl(u^\alpha(z)\bigr) = \partial_{z^\mu}\,u^\alpha(z) \tag{5.9}
\]
for $\mu = 1, \dots, D$ and $\alpha \in A$. Then $L_{\{-\}}$ is $T$-invariant and hence $T_\mu$ are represented on $U$. Suppose also that $V = U/J$ is a quotient representation of $L$ by a $T$-invariant subrepresentation $J$ such that the class $\hat{1} := [I] \in V$ of $I \in U(L)$ is nonzero. Then if $T_\mu$ are represented on $V$ by $T_\mu \in \mathrm{End}\,V$ it follows that the representation of the formal series (5.3) on $V$ satisfies all the assumptions of the existence Theorem 4.1 and hence $V$ has the structure of a vertex algebra.

Proof. Using Eqs. (1.14) one can prove that Eq. (5.9) is equivalent to
\[
T_\mu\bigl(u^\alpha_{\{n,m,\sigma\}}\bigr) = \frac{n + m + \frac{D}{2}}{m + \frac{D}{2}}\sum_{\sigma_1=1}^{h_{m+1}} A^{(m+1)}_{\mu\sigma_1\sigma}\,u^\alpha_{\{n,m+1,\sigma_1\}} + 2(n+1)\sum_{\sigma_1=1}^{h_{m-1}} B^{(m-1)}_{\mu\sigma_1\sigma}\,u^\alpha_{\{n+1,m-1,\sigma_1\}}. \tag{5.10}
\]
Therefore, $L_{\{-\}}$ is $T$-invariant. By Proposition 5.1, $u^\alpha(z)$ acts as a field on $V$ for every $\alpha \in A$. The verifications of the other assumptions of Theorem 4.1 are straightforward.

Corollary 5.3. The Fock space $F$ defined above has the structure of a vertex algebra which is generated by the fields (5.3) satisfying the relations (5.2).
Proof. Equation (5.10) and $T_\mu(K) = 0$ define an even derivation of the algebra $H$ (5.5) since the relations (5.2) are $\partial$-invariant. To apply Propositions 5.1 and 5.2 we extend the system of formal series (5.3) with the constant series $K(z) = K$. Then the role of $L_{\{-\}}$ is played by $H_{\{-\}}$ and $F$ is obtained by additional factorization of $U$ by the subrepresentation generated by $K - kI$, which is $T$-invariant. Finally, $F$ is isomorphic to the symmetric superalgebra generated by $H_{\{+\}}$ (5.6) so that the class $\hat{1}$ is nonzero.

The vertex algebra obtained in Corollary 5.3 is called a free field vertex algebra. A Lie superalgebra $L$ and a system of series (5.3) satisfying the assumptions of Proposition 5.1 and possessing a system of even derivations $T_1, \dots, T_D$ such that Eq. (5.9) holds is called a Lie superalgebra of formal distributions.

6. Categorical Properties of Vertex Algebras. Representations

We begin with some basic categorical notions for vertex algebras in higher dimensions which are straightforward generalizations of the corresponding ones from the chiral CFT [6]. A morphism $f$ of vertex algebras $V$ and $V'$ is a parity preserving linear map $f : V \to V'$ such that
\[
f\bigl(a_{\{n,m,\sigma\}}\,b\bigr) = f(a)_{\{n,m,\sigma\}}\,f(b), \tag{6.1}
\]
\[
f\bigl(T_\mu(a)\bigr) = T'_\mu\bigl(f(a)\bigr), \tag{6.2}
\]
\[
f(\hat{1}) = \hat{1}' \tag{6.3}
\]
for all $a, b \in V$, $n \in \mathbb{Z}$, $m = 0, 1, \dots$, $\sigma = 1, \dots, h_m$ and $\mu = 1, \dots, D$, where $T_\mu$, $\hat{1}$ and $T'_\mu$, $\hat{1}'$ are the translation endomorphisms and the vacuum in $V$ and $V'$, respectively. An isomorphism of vertex algebras is a morphism which is an isomorphism as a linear map. A morphism $f$ is called injective or surjective if it is injective or surjective as a linear map, respectively. The image $g(U)$ and the kernel $\mathrm{Ker}\,g$ of a morphism $g : U \to V$ are called a vertex subalgebra of $V$ and an ideal of $U$, respectively. Note that the image $g(U)$ is itself a vertex algebra. If $f : V \to V'$ is a surjective morphism and $J$ is its kernel then the quotient space $V/J$ possesses the structure of a vertex algebra isomorphic to $V'$. It is called a quotient vertex algebra.

Proposition 6.1. Let $V$ be a vertex algebra.
(a) A super-subspace $U$ of $V$ has the structure of a vertex subalgebra of $V$ under the inclusion $U \hookrightarrow V$ iff $\hat{1} \in U$ and $a_{\{n,m,\sigma\}}\,b \in U$ for all $a, b \in U$ and $n \in \mathbb{Z}$, $m = 0, 1, \dots$, $\sigma = 1, \dots, h_m$.
(b) A super-subspace $J$ of $V$ is a proper ideal iff $J$ is $T_\mu$-invariant ($\mu = 1, \dots, D$), $\hat{1} \notin J$ and $a_{\{n,m,\sigma\}}\,b \in J$ for all $a \in V$, $b \in J$ and $n \in \mathbb{Z}$, $m = 0, 1, \dots$, $\sigma = 1, \dots, h_m$.

Proof. To prove the statement (a) first observe that $U$ is $T_\mu$-invariant ($\mu = 1, \dots, D$), since for every $a \in U$: $a = Y(a,z)\,\hat{1}\big|_{z=0}$ and then $T_\mu a = \partial_{z^\mu}Y(a,z)\,\hat{1}\big|_{z=0}$. Then (a) follows directly by the above definitions. For the proof of part (b), as in the chiral CFT ([6]), we need to show that $a_{\{n,m,\sigma\}}\,b \in J$ for all $a \in J$, $b \in V$, $n \in \mathbb{Z}$, $m = 0, 1, \dots$, $\sigma = 1, \dots, h_m$. However the latter property is a consequence of the quasisymmetry relation (4.20).
Let $V$ be a vertex algebra over $\mathbb{C}^D$ and let $A : \mathbb{C}^{D'} \to \mathbb{C}^D$ be a linear orthogonal map ($(Ax)\cdot(Ax) = x\cdot x$) with a matrix $A^\mu_\nu$: $A e'_\nu = \sum_{\mu=1}^{D} A^\mu_\nu\,e_\mu$ in the standard bases $\{e_\mu\}_{\mu=1}^{D}$ and $\{e'_\nu\}_{\nu=1}^{D'}$ ($D' \le D$). Then the formal series $Y'(a,x) := Y(a,z)\big|_{z=Ax}$ for $x = (x^1, \dots, x^{D'})$ and $(Ax)^\mu = \sum_{\nu=1}^{D'} A^\mu_\nu\,x^\nu$, are correctly defined for every $a \in V$ as series belonging to $(\mathrm{End}\,V)[[x,1/x^2]]$. They generate, combined with the maps $T'_\nu := \sum_{\mu=1}^{D} A^\mu_\nu\,T_\mu$ for $\nu = 1, \dots, D'$, a structure of vertex algebra on $V$ over $\mathbb{C}^{D'}$ with the same $\mathbb{Z}_2$-grading and vacuum $\hat{1} \in V$. We denote this vertex algebra by $A^*V$ and call it a restriction of $V$ over $\mathbb{C}^{D'}$.

Let $V$ and $V'$ be vertex algebras over $\mathbb{C}^D$ and let the corresponding state-field correspondence, translation operators and vacua be: $Y(a,z)$, $T_\mu$, $\hat{1}$ (in $V$) and $Y'(b,x)$, $T'_\nu$, $\hat{1}'$ (in $V'$). Then for every $a \in V$ and $b \in V'$ the formal series $Y(a,z_1)\otimes Y'(b,z_2)$ is a field on the superspace $V\otimes V'$ and consequently one can define the field $Y(a\otimes b, z) := Y(a,z_1)\otimes Y'(b,z_2)\big|_{z_1=z_2=z}$. The fields $Y(a\otimes b,z)$ together with the operators $T_\mu + T'_\mu$ (that is, $T_\mu\otimes I + I\otimes T'_\mu$) generate a vertex algebra structure over $\mathbb{C}^D$ on $V\otimes V'$ with a vacuum $\hat{1}\otimes\hat{1}'$. This vertex algebra is called a tensor product of $V$ and $V'$ and we will denote it by $V\otimes V'$.

A representation of the vertex algebra $V$ is a superspace $M$ together with a parity preserving linear map $V \to (\mathrm{End}\,M)[[z,1/z^2]] : a \mapsto Y_M(a,z)$ and even endomorphisms $T_\mu \in \mathrm{End}\,M$ for $\mu = 1, \dots, D$, called again translation endomorphisms, such that:
(a) $Y_M(a,z)$ is a field and $Y_M(a,z)$, $Y_M(b,z)$ are mutually local for all $a$ and $b \in V$;
(b) $Y_M(a,z)_{\{n,m,\sigma\}}\,Y_M(b,z) = Y_M\bigl(a_{\{n,m,\sigma\}}\,b,\ z\bigr)$ for all $a, b \in V$ and all $n \in \mathbb{Z}$, $m = 0, 1, \dots$, $\sigma = 1, \dots, h_m$, where the field $\{n,m,\sigma\}$-products are defined in accord with Theorem 2.1;
(c) $[T_\mu, Y_M(a,z)] = \partial_{z^\mu} Y_M(a,z)$ for $\mu = 1, \dots, D$.
An example of a representation of a vertex algebra $V$ is provided by the vertex algebra itself and it is called the vacuum representation of $V$.

7. Conformal Vertex Algebras

We begin by recalling some basic facts about the conformal group and its action for higher space dimensions $D$. We will use the complex conformal group $\mathfrak{C}_{\mathbb{C}}$ which is convenient to choose to be the connected complex spinor group $\mathrm{Spin}_0(D+2;\mathbb{C}) =: \mathfrak{C}_{\mathbb{C}}$ (the latter is in fact a covering of the "geometrical" conformal group in space dimensions $D \ge 3$). The geometrical action of $\mathfrak{C}_{\mathbb{C}}$ on $\mathbb{C}^D$ will be denoted as
\[
\mathbb{C}^D \ni z \mapsto g(z) \in \mathbb{C}^D \quad (g \in \mathfrak{C}_{\mathbb{C}}) \tag{7.1}
\]
and it generally has singularities: we will denote the regularity of $z$ for $g$ as "$g(z) \in \mathbb{C}^D$". The Lie algebra $\mathfrak{c}_{\mathbb{C}}$ of $\mathfrak{C}_{\mathbb{C}}$ is isomorphic to $so(D+2;\mathbb{C})$ and it has generators $T_1, \dots, T_D$, $S_1, \dots, S_D$, $H$ and $\Omega_{\mu\nu}$ for $1 \le \mu < \nu \le D$ ($\Omega_{\nu\mu} := -\Omega_{\mu\nu}$), with the following commutation relations:
\begin{align*}
&[H, \Omega_{\mu\nu}] = 0 = [T_\mu, T_\nu] = [S_\mu, S_\nu], \\
&[\Omega_{\mu_1\nu_1}, \Omega_{\mu_2\nu_2}] = \delta_{\mu_1\mu_2}\Omega_{\nu_1\nu_2} + \delta_{\nu_1\nu_2}\Omega_{\mu_1\mu_2} - \delta_{\mu_1\nu_2}\Omega_{\nu_1\mu_2} - \delta_{\nu_1\mu_2}\Omega_{\mu_1\nu_2}, \\
&[H, S_\mu] = -S_\mu, \qquad [H, T_\mu] = T_\mu, \\
&[\Omega_{\mu\nu}, T_\rho] = \delta_{\mu\rho}T_\nu - \delta_{\nu\rho}T_\mu, \qquad [\Omega_{\mu\nu}, S_\rho] = \delta_{\mu\rho}S_\nu - \delta_{\nu\rho}S_\mu, \\
&[S_\mu, T_\nu] = 2\delta_{\mu\nu}H - 2\Omega_{\mu\nu}. \tag{7.2}
\end{align*}
Thus: $T_1, \dots, T_D$ are the generators of the translations on $\mathbb{C}^D$, $t_a(z) := z + a$, $t_a = e^{a\cdot T}$, $a\cdot T \equiv \sum_{\mu=1}^{D} a^\mu T_\mu$; $H$ is the generator of the dilations $e^{\lambda H}(z) = e^{\lambda} z$; $\Omega_{\mu\nu}$ are the generators of the orthogonal group of $\mathbb{C}^D$ ($e^{\vartheta\Omega_{\mu\nu}}$ being the rotation on angle $\vartheta$ in the plane $(\mu,\nu)$ of $\mathbb{R}^D$); and finally, $S_\mu$ are generators of the special conformal transformations:
\[
s_a := e^{a\cdot S}, \qquad s_a(z) = \frac{z + z^2 a}{1 + 2a\cdot z + a^2 z^2} \quad (a, z \in \mathbb{C}^D). \tag{7.3}
\]
(For the explicit expression of the generators $T_1, \dots, T_D$, $S_1, \dots, S_D$, $H$ and $\Omega_{\mu\nu}$ in terms of the standard generators of $so(D+2;\mathbb{C})$ see for example [3].) We will call the Lie subalgebra of $\mathfrak{c}_{\mathbb{C}}$ generated by $\Omega_{\mu\nu}$ ($1 \le \mu < \nu \le D$) the rotation subalgebra ($\cong so(D;\mathbb{C})$) and its corresponding subgroup in $\mathfrak{C}_{\mathbb{C}}$ the spinor rotation subgroup ($\cong \mathrm{Spin}_0(D;\mathbb{C})$).

An important element of the group $\mathfrak{C}_{\mathbb{C}}$ is the Weyl inversion $j_W$:
\[
j_W(z) := \frac{R_D(z)}{z^2}, \tag{7.4}
\]
where $R_\mu(z)$ for $z = (z^1, \dots, z^D)$ and $\mu = 1, \dots, D$, is the reflection
\[
R_\mu(z^1, \dots, z^D) = (z^1, \dots, -z^\mu, \dots, z^D). \tag{7.5}
\]
$j_W$ is represented in $SO(D+1,1;\mathbb{R})$ as a rotation on $\pi$ in the plane $(D, D+1)$, i.e., $j_W = e^{\frac{\pi}{2}(S_D - T_D)}$. Note that $j_W^2(z) = z$ for all $z$ but nevertheless $j_W^2$ as an element of $\mathrm{Spin}_0(D+2,\mathbb{C})$ is the nonunit central element $C = -I_{\mathrm{Cliff}}$,
\[
j_W^2 = C, \tag{7.6}
\]
where $I_{\mathrm{Cliff}}$ is the Clifford algebra unit.

The passage from the vertex algebras to the globally conformal invariant QFT needs first an additional symmetry structure for our vertex algebras. For this purpose we will extend the abelian Lie algebra of the translations $T_1, \dots, T_D$ to the conformal one $\mathfrak{c}_{\mathbb{C}} \cong so(D+2;\mathbb{C})$.
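The geometric formulas (7.3)-(7.5) can be exercised numerically. The sketch below is an illustration only: the function names and the sample points are assumptions, and exact rational arithmetic is used so that equalities are tested without rounding. It checks three elementary facts about the point transformations: $j_W(j_W(z)) = z$, the composition rule $s_a(s_b(z)) = s_{a+b}(z)$, and that conjugating a translation by the Weyl inversion gives a special conformal transformation with reflected parameter, $j_W(t_a(j_W(z))) = s_{R_D(a)}(z)$. The last two identities are properties of the maps as written in (7.3) and (7.4), verified here at sample points rather than quoted from the text.

```python
from fractions import Fraction


def dot(x, y):
    """Bilinear form x . y on D-component points."""
    return sum(a * b for a, b in zip(x, y))


def translate(a, z):
    """t_a(z) = z + a."""
    return [zi + ai for zi, ai in zip(z, a)]


def special_conformal(a, z):
    """s_a(z) = (z + z^2 a) / (1 + 2 a.z + a^2 z^2), as in Eq. (7.3)."""
    z2, a2 = dot(z, z), dot(a, a)
    denom = 1 + 2 * dot(a, z) + a2 * z2
    return [(zi + z2 * ai) / denom for zi, ai in zip(z, a)]


def reflect_last(z):
    """R_D(z): flip the sign of the last coordinate, as in Eq. (7.5)."""
    return list(z[:-1]) + [-z[-1]]


def weyl_inversion(z):
    """j_W(z) = R_D(z) / z^2, as in Eq. (7.4)."""
    z2 = dot(z, z)
    return [c / z2 for c in reflect_last(z)]


# Hypothetical sample data in D = 3.
Z = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 5)]
A = [Fraction(1), Fraction(-2), Fraction(3, 7)]
B = [Fraction(2, 3), Fraction(1, 4), Fraction(-1)]
```

For example, `weyl_inversion(weyl_inversion(Z)) == Z` holds exactly, in line with the remark that $j_W^2$ acts trivially on points even though it is a nontrivial central element of the spinor group.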
Definition 7.1. A conformal vertex algebra is a vertex algebra $V$ endowed with an action of $\mathfrak{c}_{\mathbb{C}}$ by even linear endomorphisms such that
\[
[H, Y(a, z)] = z \cdot \partial_z Y(a, z) + Y(Ha, z),
\tag{7.7}
\]
\[
[\Omega_{\mu\nu}, Y(a, z)] = -z^\mu \partial_{z^\nu} Y(a, z) + z^\nu \partial_{z^\mu} Y(a, z) + Y(\Omega_{\mu\nu} a, z),
\tag{7.8}
\]
\[
[S_\mu, Y(a, z)] = \left( z^2 \partial_{z^\mu} - 2 z^\mu\, z \cdot \partial_z \right) Y(a, z) - 2 z^\mu Y(Ha, z) - 2 \sum_{\nu=1}^{D} z^\nu Y(\Omega_{\nu\mu} a, z) + Y(S_\mu a, z)
\tag{7.9}
\]
($z \cdot \partial_z \equiv \sum_{\mu=1}^{D} z^\mu \partial_{z^\mu}$, $z^2 \equiv z \cdot z \equiv \sum_{\mu=1}^{D} z^\mu z^\mu$). The compatibility of the commutation relations (7.2) with Eqs. (7.7)–(7.9) is obtained by a straightforward computation. We also require that:
(a) The endomorphism $H$ is diagonalizable with nonnegative eigenvalues (the energy positivity condition).
(b) The representation of the rotation subalgebra $so(D; \mathbb{C}) \subset \mathfrak{c}_{\mathbb{C}}$ on $V$ decomposes into a direct sum of finite dimensional irreducible subrepresentations. Then this representation admits integration to an action of the spinor rotation subgroup $Spin_0(D; \mathbb{C})$.
(c) Let $C$ be the central element (7.6) ($C^2 = I$); then $H + \frac{1}{4}(I - C)$ has an integer spectrum. In particular, $H$ has only integer or half-integer eigenvalues.
(d) The vacuum $1$ is the only $\mathfrak{c}_{\mathbb{C}}$-invariant element of $V$ up to multiplication by a scalar, i.e., $Xa = 0$ for every $X \in \mathfrak{c}_{\mathbb{C}}$ $\Leftrightarrow$ $a \sim 1$.
If $a$ is an element of a conformal vertex algebra $V$ which is an eigenvector of $H$, we will denote its eigenvalue by $\mathrm{wt}_H(a)$ and call it the weight of $a$:
\[
H a = \mathrm{wt}_H(a)\, a.
\tag{7.10}
\]
Then if $a, b \in V$ have fixed weights:
\[
\mathrm{wt}_H\!\left( a_{\{n,m,\sigma\}} b \right) = \mathrm{wt}_H(a) + \mathrm{wt}_H(b) + 2n + m,
\tag{7.11}
\]
which follows from the equation
\[
a_{\{n,m,\sigma\}} b = P_{\{n+N,m,\sigma\}}(\partial_z) \left. (z^2)^N\, Y(a, z)\, b \right|_{z=0}
\]
for $N \gg 0$ (see Eq. (1.11)) and the relation (7.7). As a consequence of Definition 7.1 (a) and (d), and the commutation relations (7.2), the endomorphisms $S_\mu$ act locally nilpotently on a conformal vertex algebra $V$. Thus the representation of $S_1, \dots, S_D$, $H$ and $\Omega_{\mu\nu}$ ($1 \le \mu < \nu \le D$) on $V$ can be integrated to a group action. Recall that the latter generators span a Lie subalgebra $=: \mathfrak{c}_{\mathbb{C},0}$ of $\mathfrak{c}_{\mathbb{C}}$ which corresponds to the subgroup $\mathfrak{C}_{\mathbb{C},0}$ of $\mathfrak{C}_{\mathbb{C}}$, the connected part of the stabilizer of $0 \in \mathbb{C}^D$. Note that $\mathfrak{C}_{\mathbb{C},0}$ is isomorphic to the inhomogeneous connected spinor group of $\mathbb{C}^D$ with dilations (i.e. the complex Euclidean spinor group with dilations) and it is simply connected. Denote the obtained action by
\[
\pi_0 : \mathfrak{C}_{\mathbb{C},0} \to \mathrm{Aut}_0 V
\tag{7.12}
\]
and define
\[
\pi_z(g) := \pi_0\!\left( t_{g(z)}^{-1}\, g\, t_z \right)
\tag{7.13}
\]
for all pairs $(z, g)$ in some neighbourhood of $(0, I) \in \mathbb{C}^D \times \mathfrak{C}_{\mathbb{C}}$ such that $g(z) \in \mathbb{C}^D$. Note that $t_{g(z)}^{-1} g t_z (0) = 0$ if $z$ is regular for $g$ (i.e., $g(z) \in \mathbb{C}^D$), so that $t_{g(z)}^{-1} g t_z \in \mathfrak{C}_{\mathbb{C},0}$ for small $z$ and $g$.

Proposition 7.1. (a) The function $\pi_z(g)$ (7.13) is rational in $z$ for every $g \in \mathfrak{C}_{\mathbb{C}}$ with values in $(\mathrm{End}\, V)_0$, i.e. it can be expressed as a ratio of a polynomial belonging to $(\mathrm{End}\, V)_0[z]$ and a polynomial belonging to $\mathbb{C}[z]$. It has the cocycle property
\[
\pi_z(g_1 g_2) = \pi_{g_2(z)}(g_1)\, \pi_z(g_2) \quad \text{iff} \quad g_1 g_2(z),\ g_2(z) \in \mathbb{C}^D
\tag{7.14}
\]
and satisfies $\pi_z(t_a) = I$.
(b) Let the assumptions of Definition 7.1 hold except for condition (c). Then if the cocycle $\pi_z(g)$ (7.13) possesses a continuation to a rational function in $z$, condition (c) of Definition 7.1 is also satisfied.

Proof. (a) Equation (7.14) is a straightforward consequence of (7.13) for small $z$, $g_1$ and $g_2$. If $g$ belongs to the (spinor) Euclidean group of $\mathbb{C}^D$ with dilations then $t_{g(z)}^{-1} g t_z = g_1$ does not depend on $z$ and it is just the homogeneous part of $g$ (i.e. the projection on the spinor and dilation group). Thus if we prove the rationality of $\pi_z(j_W)$ it will follow for the general $\pi_z(g)$ because of Eq. (7.14) and the fact that the conformal group $\mathfrak{C}_{\mathbb{C}}$ is generated by $j_W$ and the spinor Euclidean group with dilations. To compute $\pi_z(j_W)$ first we observe that
\[
t_{j_W(z)}^{-1}\, j_W\, t_z = s_{-R_D(z)}\, O_z, \qquad O_z(w) = \frac{1}{z^2}\, R_D\!\left( w - 2\, \frac{z \cdot w}{z^2}\, z \right) \equiv \frac{1}{z^2}\, R_D R(z)\, w,
\tag{7.15}
\]
where $R(z)$ is the reflection determined by $z$, and $s_{-R_D(z)}, O_z \in \mathfrak{C}_{\mathbb{C},0}$, so that
\[
\pi_z(j_W) = \pi_0\!\left( s_{-R_D(z)} \right) \pi_0(O_z).
\tag{7.16}
\]
Then $\pi_0(O_z)$ is rational due to Definition 7.1 (c): indeed, if $\pi_0$ is a subrepresentation of the spinor representation of the Clifford algebra $\mathrm{Cliff}(D; \mathbb{R})$ (for even $D$ it is uniquely determined and for odd $D$ there are two irreducible representations, up to equivalence), then $O_z$ is represented by $\pm (z^2)^{-H - \frac{1}{2}} \left( \sum_{\mu=1}^{D} z^\mu \gamma_D \gamma_\mu \right)$, where $\gamma_\mu$ are the Clifford algebra generators; in the general case $\pi_0$ is a direct sum of subrepresentations of some tensor power of the above one. To prove the second part (b) we observe that the latter argument is invertible: the rationality of $\pi_0(O_z)$ implies the condition (c) of Definition 7.1.

A standard consequence of the commutation relations (7.2) is that the full eigenspaces of the action of $H$ on a conformal vertex algebra are invariant for the action of the rotation subalgebra $so(D; \mathbb{C})$. Therefore, the irreducible subrepresentations of $so(D; \mathbb{C})$ are eigenspaces for $H$.
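The decomposition (7.15) used in the proof of Proposition 7.1 can be tested at the level of point transformations. The sketch below is an illustration under stated assumptions, not the author's computation: it evaluates $t_{j_W(z)}^{-1} j_W t_z$ at a point $w$ as $j_W(z + w) - j_W(z)$ and compares it with $s_{-R_D(z)}(O_z(w))$. The function names and the sample points are chosen here for illustration.

```python
def dot(u, v):
    """Bilinear Euclidean product on C^D."""
    return sum(x * y for x, y in zip(u, v))

def reflect_last(z):
    """R_D: flips the sign of the last coordinate."""
    return z[:-1] + [-z[-1]]

def weyl_inversion(z):
    """j_W(z) = R_D(z) / z^2, Eq. (7.4)."""
    z2 = dot(z, z)
    return reflect_last([x / z2 for x in z])

def special_conformal(a, z):
    """s_a(z), Eq. (7.3)."""
    z2, a2, az = dot(z, z), dot(a, a), dot(a, z)
    den = 1 + 2 * az + a2 * z2
    return [(x + z2 * y) / den for x, y in zip(z, a)]

def reflect_through(z, w):
    """R(z) w = w - 2 (z.w / z^2) z, the reflection determined by z."""
    c = 2 * dot(z, w) / dot(z, z)
    return [x - c * y for x, y in zip(w, z)]

def o_map(z, w):
    """O_z(w) = R_D R(z) w / z^2, the linear part in Eq. (7.15)."""
    z2 = dot(z, z)
    return [x / z2 for x in reflect_last(reflect_through(z, w))]

# Arbitrary sample points in C^3; w is kept small so that z + w stays regular.
z = [0.3 + 0.1j, -0.2 + 0.4j, 0.5 - 0.3j]
w = [0.02 + 0.01j, -0.03j, 0.01 + 0.0j]

# Left-hand side of (7.15) applied to w: shift by z, invert, shift back.
shifted = [x + y for x, y in zip(z, w)]
lhs = [p - q for p, q in zip(weyl_inversion(shifted), weyl_inversion(z))]

# Right-hand side of (7.15) applied to w.
minus_rd_z = [-x for x in reflect_last(z)]
rhs = special_conformal(minus_rd_z, o_map(z, w))

max_error = max(abs(p - q) for p, q in zip(lhs, rhs))
```

The agreement of `lhs` and `rhs` up to rounding reflects the fact that the linear part of the left-hand side at $w = 0$ is exactly $O_z$, a dilation by $1/z^2$ composed with two reflections.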
Lemma 7.2. For every element $a$ in a conformal vertex algebra $V$ there exists a finite dimensional subspace $U$ of $V$ which contains $a$ and is invariant with respect to the representation $\pi_0$ (7.12). Therefore, $U$ is also invariant for the cocycle $\pi_z(g)$ (7.13).

Proof. Because of Definition 7.1 (b) it is sufficient to prove the lemma in the case when $a$ belongs to some irreducible subrepresentation $U_0$ of the rotation group. Then $H|_{U_0} = d\, I$ for $2d \in \{0\} \cup \mathbb{N}$. Therefore, $S_{\mu_1} \cdots S_{\mu_{2d}} U_0 = 0$ for all $\mu_k = 1, \dots, D$ ($k = 1, \dots, 2d$), and we can take $U$ to be the linear span of all vectors belonging to $S_{\mu_1} \cdots S_{\mu_k} U_0$ ($k = 0, \dots, 2d - 1$).

Let $f(\lambda)$ and $f'(\lambda, \lambda')$ be functions with values in a finite dimensional vector space $V$ which are holomorphic in $\lambda$ and $\lambda'$ in a neighbourhood of $0 \in \mathbb{C}$. Then we set $\iota_\lambda f(\lambda) \in V[[\lambda]]$ and $\iota_{\lambda,\lambda'} f'(\lambda, \lambda') \in V[[\lambda, \lambda']]$ to be just the Taylor series of $f$ and $f'$ around $0 \in \mathbb{C}$. This definition is applicable to the functions $\pi_z(e^{\lambda X})$ and $\pi_{e^{\lambda' X'}(z)}(e^{\lambda X})$ for $X, X' \in \mathfrak{c}_{\mathbb{C}}$ because of Lemma 7.2. Then Eq. (7.14) implies
\[
\iota_{\lambda,\lambda'}\, \pi_z\!\left( e^{\lambda X} e^{\lambda' X'} \right) = \iota_{\lambda,\lambda'}\, \pi_{e^{\lambda' X'}(z)}\!\left( e^{\lambda X} \right)\, \iota_{\lambda,\lambda'}\, \pi_z\!\left( e^{\lambda' X'} \right).
\tag{7.17}
\]
We will distinguish the above notations $\iota_\lambda$ and $\iota_{\lambda,\lambda'}$ from the similar $\iota_{z_1,\dots,z_n}$ of Sect. 4 by the type of the arguments $\lambda$, $\lambda'$ and $z_k$. It follows from the above constructions that the following equation is valid for any conformal vertex algebra $V$:
\[
e^{\lambda X}\, Y(a, z)\, e^{-\lambda X}\, b = Y\!\left( \iota_\lambda\, \pi_z(e^{\lambda X})\, a,\ e^{\lambda X}(z) \right) b
\tag{7.18}
\]
as a series belonging to $V[[z]]_{z^2}[[\lambda]]$ for all $a, b \in V$ and $X \in \mathfrak{c}_{\mathbb{C}}$. In the case of $X \in \mathfrak{c}_{\mathbb{C},0}$, Eq. (7.18) follows from the construction of $\pi_0$ (7.12) and relations (7.7)–(7.9). For $X = T_\mu$, (7.18) is just a consequence of Eq. (3.1).

8. Hermitian Structure in Conformal Vertex Algebras

Besides the conformal structure, the passage from the vertex algebras to the QFT requires a Hermitian structure. In this section we will use, besides the variables $z$, $w$, etc., their $D$-dimensional conjugate variables $\bar{z} \equiv (\bar{z}^1, \dots, \bar{z}^D) = \overline{(z^1, \dots, z^D)}$, etc. If $z$ is considered as a formal variable then $\bar{z}$ will be treated as an independent formal variable, and if $z$ takes values in $\mathbb{C}^D$ then $\bar{z}$ will be its complex conjugate. We set also
\[
\overline{u(z)} = \bar{u}(\bar{z})
\]
for a series $u(z) \in \mathbb{C}[[z, 1/z^2]]$, where $\bar{u}(z)$ stands for the series with complex conjugate coefficients. Define the following conjugation:
\[
z \mapsto z^* := \frac{\bar{z}}{\bar{z}^2}.
\tag{8.1}
\]
It can also be written by means of the Weyl inversion (7.4) as:
\[
z^* = j_W^{-1}\!\left( R_D(\bar{z}) \right) \equiv j_W\!\left( R_D(\bar{z}) \right).
\tag{8.2}
\]
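The corresponding conjugations on the complex conformal Lie algebra and group are introduced by:
\[
T_\mu^* := S_\mu \ (1 \le \mu \le D), \qquad H^* = -H, \qquad \Omega_{\mu\nu}^* = \Omega_{\mu\nu} \ (1 \le \mu < \nu \le D), \qquad \left( e^{\lambda X} \right)^* = e^{\bar{\lambda} X^*} \ \text{ for } X \in \mathfrak{c}_{\mathbb{C}} \text{ and } \lambda \in \mathbb{C}
\tag{8.3}
\]
as $[X_1, X_2]^* = [X_1^*, X_2^*]$ and $g_1^* g_2^* = (g_1 g_2)^*$ for $X_1, X_2 \in \mathfrak{c}_{\mathbb{C}}$ and $g_1, g_2 \in \mathfrak{C}_{\mathbb{C}}$. Since the conjugation $*$ uses the conformal inversion $z \mapsto z/z^2$, under which the generators $T_\mu$ and $S_\mu$ are conjugated, we have the consistency relation:
\[
g(z)^* = g^*(z^*)
\tag{8.4}
\]
for all $g \in \mathfrak{C}_{\mathbb{C}}$ and $z \in \mathbb{C}^D$ such that $z^*, g(z) \in \mathbb{C}^D$. For a series $a(z) \in V[[z, 1/z^2]]$, $V$ being a complex vector space, the substitution $a(z^*)$ is correctly defined by Eq. (1.17) as the series
\[
a(z^*) := \left. J[a(w)] \right|_{w = \bar{z}} \in V[[\bar{z}, 1/\bar{z}^2]].
\tag{8.5}
\]
If $V$ is endowed with a Hermitian form $\langle a \,|\, b \rangle \in \mathbb{C}$ for $a, b \in V$, then we will use the convention:
\[
\langle a \,|\, u(z)\, b \rangle = u(z)\, \langle a \,|\, b \rangle = \langle \overline{u(z)}\, a \,|\, b \rangle \qquad \left( \langle a \,|\, b \rangle = \overline{\langle b \,|\, a \rangle} \right)
\tag{8.6}
\]
for a series $u(z) \in \mathbb{C}[[z, 1/z^2]]$.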
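As a quick consistency check of (8.3), one can verify on two of the relations (7.2) that the conjugation respects the bracket. The sketch below assumes that $*$ is an involution, so that $S_\mu^* = T_\mu$; this is suggested by calling $*$ a conjugation but is not written out in (8.3).

```latex
\begin{align*}
[H, T_\mu]^* &= [H^*, T_\mu^*] = [-H, S_\mu] = S_\mu = T_\mu^*, \\
[S_\mu, T_\nu]^* &= [S_\mu^*, T_\nu^*] = [T_\mu, S_\nu] = -[S_\nu, T_\mu]
  = -2\delta_{\mu\nu} H + 2\Omega_{\nu\mu} \\
  &= -2\delta_{\mu\nu} H - 2\Omega_{\mu\nu}
  = \left( 2\delta_{\mu\nu} H - 2\Omega_{\mu\nu} \right)^* .
\end{align*}
```

Both lines use only (7.2), the antisymmetry $\Omega_{\nu\mu} = -\Omega_{\mu\nu}$ and the assignments $H^* = -H$, $\Omega_{\mu\nu}^* = \Omega_{\mu\nu}$ from (8.3).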
Definition 8.1. A conformal vertex algebra $V$ with Hermitian structure is a conformal vertex algebra equipped with a nondegenerate Hermitian form $\langle a \,|\, b \rangle \in \mathbb{C}$ for $a, b \in V$, compatible with the $\mathbb{Z}_2$-grading of $V$ ($\langle a \,|\, b \rangle = 0$ if $a \in V_0$ and $b \in V_1$) and possessing an antilinear even involution $a \mapsto a^+$ ($a \in V$; $(\lambda a)^+ = \bar{\lambda} a^+$, $(a^+)^+ = a$, $V_{0,1}^+ = V_{0,1}$) satisfying the following conditions:
\[
\langle a \,|\, X b \rangle = -\langle X^* a \,|\, b \rangle,
\tag{8.7}
\]
\[
\langle a \,|\, Y(c^+, z)\, b \rangle = \langle Y\!\left( \pi_{z^*}(j_W)^{-1} c,\ z^* \right) a \,|\, b \rangle
\tag{8.8}
\]
for all $a, b, c \in V$ and $X \in \mathfrak{c}_{\mathbb{C}}$. Here the latter equality is understood in the sense of rational functions in $z$ and it is correct in view of the following remark.

Remark 8.1. (a) As a consequence of Eq. (7.11) and the orthogonality of the different eigenspaces of $H$ (because of (8.7) and (8.3)) we have
\[
\langle a \,|\, Y(c, z)\, b \rangle \in \mathbb{C}[z, 1/z^2] \equiv \mathbb{C}[z]_{z^2}
\tag{8.9}
\]
(note that in accord with Definition 2.1 we only have $\langle a \,|\, Y(c, z)\, b \rangle \in \mathbb{C}[[z]]_{z^2}$). Note also that due to the decomposition (7.16) we will also have
\[
\pi_z\!\left( j_W^{\pm 1} \right) a \in V[z, 1/z^2].
\tag{8.10}
\]
(b) The conjugation in Eq. (8.8) is a combination of the “Minkowski” conjugation $z \mapsto R_D(\bar{z})$ and the Weyl inversion, since it can be rewritten (using Eq. (8.2)) also as
\[
\langle a \,|\, Y(c^+, z)\, b \rangle = \langle Y\!\left( \pi_{R_D(\bar{z})}(j_W^{-1})\, c,\ j_W^{-1}(R_D(\bar{z})) \right) a \,|\, b \rangle.
\tag{8.11}
\]
This conjugation law is also idempotent as a consequence of Eq. (8.19) below.

Proposition 8.1. In any conformal vertex algebra $V$ endowed with a Hermitian form $\langle a \,|\, b \rangle \in \mathbb{C}$ for $a, b \in V$, satisfying Eqs. (8.7) and (8.9), the correlation functions $\langle b \,|\, Y(a_1, z_1) \cdots Y(a_n, z_n)\, c \rangle$ are rational for all $a_1, \dots, a_n, b, c \in V$ in the sense that they belong to $\iota_{z_1,\dots,z_n} \mathbb{C}[z_1, \dots, z_n]_{R_n}$ (see Sect. 4). In particular, they are regular for $z_k^2 \neq 0$ and $(z_k - z_l)^2 \neq 0$ ($k, l = 1, \dots, n$). The vacuum correlation functions
\[
W_n(a_1, \dots, a_n; z_1, \dots, z_n) := \iota_{z_1,\dots,z_n}^{-1} \langle 1 \,|\, Y(a_1, z_1) \cdots Y(a_n, z_n)\, 1 \rangle
\tag{8.12}
\]
are globally conformal invariant in the sense that
\[
W_n\!\left( \pi_{z_1}(g)\, a_1, \dots, \pi_{z_n}(g)\, a_n;\ g(z_1), \dots, g(z_n) \right) = W_n(a_1, \dots, a_n; z_1, \dots, z_n)
\tag{8.13}
\]
(for all $g \in \mathfrak{C}_{\mathbb{C}}$) as rational functions in $z_1, \dots, z_n$.

Proof. It follows from locality and Eq. (8.9) that
\[
\left( \prod_{k=1}^{n} z_k^2 \right)^{N} \left( \prod_{1 \le l < m \le n} z_{lm}^2 \right)^{N} \langle b \,|\, Y(a_1, z_1) \cdots Y(a_n, z_n)\, c \rangle \in \mathbb{C}[z_1, \dots, z_n]
\]
for $N \gg 0$ ($z_{lm} := z_l - z_m$). Then we multiply by
\[
\iota_{z_1,\dots,z_n} \left( \prod_{k=1}^{n} z_k^2 \right)^{-N} \left( \prod_{1 \le l < m \le n} z_{lm}^2 \right)^{-N}
\]
as in the proof of Proposition 4.2. To prove Eq. (8.13) we first obtain by Eq. (7.18) and by the conformal invariance of the vacuum (Def. 7.1 (d)) that
\[
e^{\lambda X}\, Y(a_1, z_1) \cdots Y(a_n, z_n)\, 1 = Y\!\left( \iota_\lambda \pi_{z_1}(e^{\lambda X})\, a_1,\ e^{\lambda X}(z_1) \right) \cdots Y\!\left( \iota_\lambda \pi_{z_n}(e^{\lambda X})\, a_n,\ e^{\lambda X}(z_n) \right) 1
\tag{8.14}
\]
for all $X \in \mathfrak{c}_{\mathbb{C}}$ as an equality in $V[[z_1]]_{z_1^2} \cdots [[z_n]]_{z_n^2}[[\lambda]]$. Then in view of Eq. (8.7) and the conformal invariance of the vacuum we find
ιλ Wn πz1 e λ X a1 , . . . , πzn e λ X an ; e λ X (z1 ) , . . . , e λ X (zn ) λ=0
= Wn (a1 , . . . , an ; z1 , . . . , zn ) for all X ∈ cC which actually proves (8.13).
(8.15)
A special case of the GCI (8.13) is the translation invariance:
\[
W_n(a_1, \dots, a_n; z_1, \dots, z_n) = W_n(a_1, \dots, a_n; z_{12}, \dots, z_{n-1\,n})
\tag{8.16}
\]
($z_{kl} = z_k - z_l$, $a_1, \dots, a_n \in V$), since $\pi_z(t_a) = I$. A consequence of locality for the correlation functions (8.12) is the $\mathbb{Z}_2$-symmetry:
\[
W_n(a_1, \dots, a_n; z_1, \dots, z_n) = (-1)^{\varepsilon(\sigma)}\, W_n\!\left( a_{\sigma(1)}, \dots, a_{\sigma(n)}; z_{\sigma(1)}, \dots, z_{\sigma(n)} \right)
\tag{8.17}
\]
for every permutation $\sigma \in S_n$, where $\varepsilon(\sigma)$ is the $\mathbb{Z}_2$-parity of $\sigma$ introduced in Proposition 4.2.
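Proposition 8.2. Let $V$ be a conformal vertex algebra with Hermitian structure. Then
\[
\left( \pi_z(g)\, a \right)^+ = \pi_{R_D(\bar{z})}\!\left( j_W\, g^*\, j_W^{-1} \right) a^+ \quad \text{or} \quad \pi_z(g)^+ = \pi_{R_D(\bar{z})}\!\left( j_W\, g^*\, j_W^{-1} \right),
\tag{8.18}
\]
\[
\pi_{z^{**}}(j_W)^{-1} \left( \pi_{z^*}(j_W)^{-1} a \right)^+ = a^+.
\tag{8.19}
\]
Proof. As a consequence of Eqs. (8.8), (8.7) and (7.18) we have the following equalities in $\mathbb{C}[[z, 1/z^2]][[\lambda]]$ ($\lambda$ being a one component variable) for all $a, b, c \in V$ and $X \in \mathfrak{c}_{\mathbb{C}}$:
\[
\begin{aligned}
\langle b \,|\, Y\!\left( \pi_z(e^{\lambda X})\, a,\ e^{\lambda X}(z) \right) c \rangle
&= \langle b \,|\, e^{\lambda X}\, Y(a, z)\, e^{-\lambda X}\, c \rangle \\
&= \langle e^{\bar{\lambda} X^*}\, Y\!\left( \pi_{z^*}(j_W)^{-1} a^+,\ z^* \right) e^{-\bar{\lambda} X^*}\, b \,|\, c \rangle \\
&= \langle Y\!\left( \pi_{z^*}(e^{\bar{\lambda} X^*})\, \pi_{z^*}(j_W)^{-1} a^+,\ e^{\bar{\lambda} X^*}(z^*) \right) b \,|\, c \rangle.
\end{aligned}
\tag{8.20}
\]
On the other hand:
\[
\langle b \,|\, Y\!\left( \pi_z(e^{\lambda X})\, a,\ e^{\lambda X}(z) \right) c \rangle
= \langle Y\!\left( \pi_{e^{\bar{\lambda} X^*}(z^*)}(j_W)^{-1} \left( \pi_z(e^{\lambda X})\, a \right)^+,\ e^{\bar{\lambda} X^*}(z^*) \right) b \,|\, c \rangle.
\tag{8.21}
\]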
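The comparison of (8.20) with (8.21) and the passage to (8.18) can be spelled out as follows. This is a sketch in the notation above; the inverse rule in the first line is not stated explicitly in the text and is derived here from the cocycle property (7.14) together with $\pi_z(I) = I$, which we assume from the definition (7.13). Throughout, $g = e^{\lambda X}$, $g^* = e^{\bar{\lambda} X^*}$, and $g(z)^* = g^*(z^*)$ by (8.4).

```latex
\begin{align*}
% inverse rule: take g_1 = g^{-1}, g_2 = g in (7.14)
I = \pi_z(g^{-1} g) = \pi_{g(z)}(g^{-1})\,\pi_z(g)
  \quad &\Longrightarrow \quad \pi_z(g)^{-1} = \pi_{g(z)}(g^{-1}), \\
% equate the first arguments of Y in (8.20) and (8.21)
\pi_{g(z)^*}(j_W)^{-1}\,\bigl(\pi_z(g)\,a\bigr)^+
  &= \pi_{z^*}(g^*)\,\pi_{z^*}(j_W)^{-1}\,a^+ , \\
% inverse rule with j_W(z^*) = R_D(\bar z), which is Eq. (8.2)
\pi_{z^*}(j_W)^{-1} &= \pi_{R_D(\bar z)}(j_W^{-1}) , \\
% (7.14) at the point R_D(\bar z), using j_W^{-1}(R_D(\bar z)) = z^*
\pi_{z^*}(g^*)\,\pi_{R_D(\bar z)}(j_W^{-1})
  &= \pi_{R_D(\bar z)}(g^*\,j_W^{-1}) , \\
% (7.14) once more, using g^* j_W^{-1}(R_D(\bar z)) = g(z)^*
\pi_{g(z)^*}(j_W)\,\pi_{R_D(\bar z)}(g^*\,j_W^{-1})
  &= \pi_{R_D(\bar z)}(j_W\,g^*\,j_W^{-1}) .
\end{align*}
```

Multiplying the second line by $\pi_{g(z)^*}(j_W)$ and inserting the last three lines gives $(\pi_z(g)\,a)^+ = \pi_{R_D(\bar z)}(j_W\, g^*\, j_W^{-1})\, a^+$.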
Since the Hermitian form is nondegenerate, comparing the right-hand sides of the latter equations and replacing $g = e^{\lambda X}$ we obtain
\[
\left( \pi_z(g)\, a \right)^+ = \pi_{g(z)^*}(j_W)\, \pi_{z^*}(g^*)\, \pi_{z^*}(j_W)^{-1}\, a^+,
\tag{8.22}
\]
which actually proves (8.18) if we further use the cocycle property (7.14) and Eq. (8.2). Equation (8.19) is a corollary of Eq. (8.18), the conjugation law $j_W^* = j_W^{-1}$ (see Appendix A: Eq. (A.5) and the relation between $h$ and $j_W$ explained before the equation) and the cocycle property (7.14).

9. Connection with Globally Conformal Invariant QFT

In this section we will construct a connection between vertex algebras and Wightman QFT with GCI in higher dimensions. For this purpose we start by introducing some additional notation. We take the $D$-dimensional Minkowski space $M$ to have signature $(D-1, 1)$, so that its scalar product is:
\[
x \cdot y := -x^0 y^0 + \mathbf{x} \cdot \mathbf{y} \equiv -x^0 y^0 + x^1 y^1 + \dots + x^{D-1} y^{D-1}
\tag{9.1}
\]
for $x = (x^0, \mathbf{x}) \equiv (x^0, x^1, \dots, x^{D-1}) \in \mathbb{R}^D$, etc., and $x^2 \equiv x \cdot x$. The complex Minkowski space $M_{\mathbb{C}}$ is just $M + iM \cong \mathbb{C}^D$ with the complexified scalar product (9.1). For simplicity of notation we will use the same sign “$\cdot$” for two different scalar products (metrics) in $\mathbb{C}^D$: the first is the Euclidean one (1.2), used with variables labeled by letters such as $z$ and $w$; the second is the above Minkowski metric, used with variables labeled by letters such as $x$, $y$ (in the real case) and $\zeta$ (in the complex case). We will identify these metrics via the isomorphism:
\[
s : (\mathbf{z}, z^D) \mapsto (-i z^D, \mathbf{z}), \qquad \mathbf{z} := (z^1, \dots, z^{D-1}).
\tag{9.2}
\]
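9.1. Analytic picture on the compactified Minkowski space. The connection between the structures of the vertex algebras and QFT is based on the following coordinate transformation:
\[
\mathbb{C}^D \ni z = (\mathbf{z}, z^D) \longleftrightarrow \zeta = (\zeta^0, \boldsymbol{\zeta}) \in \mathbb{C}^D,
\tag{9.3}
\]
where $\boldsymbol{\zeta} := (\zeta^1, \dots, \zeta^{D-1})$ and
\[
\zeta^0 = \frac{i}{2}\, \frac{1 - z^2}{\frac{1 + z^2}{2} + z^D}, \qquad \boldsymbol{\zeta} = \frac{1}{\frac{1 + z^2}{2} + z^D}\, \mathbf{z},
\tag{9.4}
\]
\[
\mathbf{z} = \frac{1}{\frac{1 + \zeta^2}{2} - i \zeta^0}\, \boldsymbol{\zeta}, \qquad z^D = \frac{1}{2}\, \frac{1 - \zeta^2}{\frac{1 + \zeta^2}{2} - i \zeta^0}.
\tag{9.5}
\]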
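We will regard the coordinates $\zeta$ and $z$ as two different charts on the complex compactified Minkowski space $\overline{M}_{\mathbb{C}}$. They are particular cases of the system of charts considered in Appendix A. The map
\[
h : \zeta \mapsto z
\tag{9.6}
\]
is an invertible rational transformation of $\mathbb{C}^D$. We will use the same letter $h$ also for the transformation $s \circ h$, which can be represented by an element of the connected complex conformal group $\mathfrak{C}_{\mathbb{C}} \equiv Spin_0(D+2; \mathbb{C})$ of $\mathbb{C}^D$, as it is derived in Appendix A. Other properties of the above transformations are:
(a) $h^2 = j_W^{-1}$ (where $j_W$ is defined by Eq. (7.4)).
(b) If $\zeta \leftrightarrow z$ under the correspondences (9.4) and (9.5) then $\bar{\zeta} \leftrightarrow z^*$, where $\bar{\zeta}$ is the standard (coordinate) complex conjugation and $z^*$ is the conjugation given by Eq. (8.1).
(c) The transformation $h$ is regular on $M$ and maps it onto a precompact subset of $\mathbb{C}^D$:
\[
\overline{h(M)} = \overline{M} := \left\{ z \in \mathbb{C}^D : z^* = z \right\} = \left\{ z \in \mathbb{C}^D : z = e^{i\vartheta} u,\ \vartheta \in \mathbb{R},\ u \in \mathbb{R}^D,\ u^2 = 1 \right\} \cong S^1 \times S^{D-1} / \mathbb{Z}_2.
\tag{9.7}
\]
(d) Let
\[
\mathfrak{T}_\pm = \left\{ \zeta = x + iy \in \mathbb{C}^D : \pm y^0 > |\mathbf{y}| := \Big( \sum_{i=1}^{D-1} (y^i)^2 \Big)^{\frac{1}{2}} \right\}
\tag{9.8}
\]
be the forward and backward tube in $M_{\mathbb{C}}$. Then $h$ is regular on $\mathfrak{T}_+$ and
\[
h(\mathfrak{T}_+) = T_+ := \left\{ z \in \mathbb{C}^D : |z^2| < 1,\ \bar{z} \cdot z = |z^1|^2 + \dots + |z^D|^2 < \frac{1 + |z^2|^2}{2} \right\}.
\tag{9.9}
\]
(Note that $B_{\frac{1}{2}} \subset T_+ \subset B_1$, where $B_\lambda$ is the Hermitian ball $\{ z : \bar{z} \cdot z < \lambda \}$.)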
These statements follow from the considerations made in Appendix A and from a straightforward computation. We will only make a comment about the derivation of Eq. (9.9). It can be obtained using the transformation properties of the interval $(\zeta - \bar{\zeta})^2$ under the conformal transformation $h$ (note that $\zeta \in \mathfrak{T}_+ \cup \mathfrak{T}_-$ iff $(\zeta - \bar{\zeta})^2 > 0$, see also [8] Eq. (A.6)). The regularity of $h$ on $\mathfrak{T}_\pm$ follows from the boundedness of the transformed interval $(h(\zeta) - h(\bar{\zeta}))^2$.

We will use the active point of view for the conformal group action. This means that the group $\mathfrak{C}_{\mathbb{C}}$ will be assumed to act on the points of the compactified Minkowski space $\overline{M}_{\mathbb{C}}$, i.e. the conformal transformations will not be considered as coordinate changes. Then every conformal transformation will have different coordinate expressions in different charts of $\overline{M}_{\mathbb{C}}$. In particular, for the above two charts the corresponding coordinate expressions for an element $g \in \mathfrak{C}_{\mathbb{C}}$ will be:
\[
\zeta \mapsto g^{(M)}(\zeta) \quad \text{and} \quad z \mapsto g(z) \qquad (g \in \mathfrak{C}_{\mathbb{C}}).
\tag{9.10}
\]
These two coordinate actions are connected by the conjugation $g(z) = h\big( g^{(M)}( h^{-1}(z) ) \big)$. When there is no danger of confusion we will just write $g(\zeta)$ for $g^{(M)}(\zeta)$. Under these conventions we have one more property:
(e) There are two natural real forms for the complex conformal group $Spin(D+2; \mathbb{C})$ defined by the conjugations
\[
g \mapsto \bar{g}, \qquad \bar{g}(z) := \overline{g(\bar{z})},
\tag{9.11}
\]
\[
g \mapsto g^*, \qquad g^*(z) := \left( g(z^*) \right)^*
\tag{9.12}
\]
($\overline{g_1 g_2} = \bar{g}_1 \bar{g}_2$ and $(g_1 g_2)^* = g_1^* g_2^*$), and the corresponding real subgroups are the Euclidean conformal group $Spin(D+1, 1)$ and the Minkowski conformal group $Spin(D, 2)$, respectively.

9.2. From GCI QFT to vertex algebras. We continue with the construction of the passage from the Wightman QFT with GCI to vertex algebras with Hermitian positive definite structure. It is convenient to define the axiomatic QFT with GCI as a bilinear map $(f, a) \mapsto \phi[f, a]$ which gives an operator for every complex Schwartz test function $f \in \mathcal{S}(M)$ on Minkowski space and every vector $a$ from a complex vector space $F$. The operator $\phi[f, a]$ is supposed to act on a dense invariant domain $\mathcal{D}$ in a Hilbert space $\mathcal{H}$. The axiomatic assumptions which we impose on the fields $\phi[f, a]$ are the Wightman axioms ([5], Chapt. III) in which Lorentz covariance is replaced by the GCI condition for the correlation functions. In more detail, $\phi[f, a]$ is assumed to be a nonzero operator valued distribution for any fixed nonzero $a \in F$, formally written as:
\[
\phi[f, a] = \int_M \phi(x, a)\, f(x)\, d^D x.
\tag{9.13}
\]
The Hermitian conjugation of the fields requires the existence of a conjugation a → a+ on F such that for all a ∈ F and 1 , 2 ∈ D:
# " # " (9.14) 1 φ [f, a] 2 = φ f , a + 1 2 .
We demand the existence of a unitary representation of the Poincaré translations which leave invariant the domain $\mathcal{D}$ and a fixed element $|0\rangle \in \mathcal{D}$ called vacuum, so that they satisfy the covariance axiom, the spectral condition and the uniqueness of the vacuum. The locality axiom requires that the space $F$ is $\mathbb{Z}_2$-graded, $F = F_0 \oplus F_1$, so that $F_{0,1}^+ = F_{0,1}$ and
\[
\phi(x_1, a_1)\, \phi(x_2, a_2) - (-1)^{p_1 p_2}\, \phi(x_2, a_2)\, \phi(x_1, a_1) = 0
\tag{9.15}
\]
if $x_{12}^2 \equiv (x_1 - x_2)^2 > 0$ and $a_k \in F_{p_k}$ ($k = 1, 2$). The condition of GCI is imposed on the correlation functions $\langle 0 |\, \phi(x_1, a_1) \cdots \phi(x_n, a_n)\, | 0 \rangle$ and it supposes first that there exists a cocycle $\pi_x^M(g)$: a rational function in $x \in M$ for fixed $g \in \mathfrak{C}_{\mathbb{C}}$ with values in $(\mathrm{End}\, F)_0$ (i.e. a ratio of a polynomial in $x$ whose coefficients are even endomorphisms of $F$ and a complex polynomial), regular in the domain of $g$ and satisfying the cocycle property (7.14) and triviality for Poincaré translations, $\pi_x^M(\tau_a) = I_F$ ($\tau_a(x) := x + a$); then the correlation functions should be invariant under the substitution
\[
\phi(x, a) \mapsto \phi\!\left( g(x),\ \pi_x^M(g)\, a \right)
\tag{9.16}
\]
in the sense of [8], Sect. 2. The cocycle should be consistent with the conjugation on F in the sense that: M + πζ (g) a = πζM g ∗ a + . (9.17) Note that from the triviality of πxM (τa ) it follows that πxM (g) does not depend on x for the transformations g belonging to the complex Weyl group, the complex Poincar´e group with dilations, and this is an action of this group on F . Since the linear space F can be infinite dimensional we impose an additional condition on the cocycle π M : the above representation of the complex Weyl group is supposed to be decomposable into a direct sum of finite dimensional irreducible representations. Finally, the axiom of completeness is naturally generalized. This completes our characterization of axiomatic QFT with GCI. Under these assumptions the result about the analytic continuation of the vector–valued distribution φ (x1 , a1 ) . . . φ (xn , an )|0 ([5] IV.2) comes true: they are the boundary value of the functions n (ζ1 , a1 ; . . . ; ζn , an ) ∈ D,
(9.18)
analytic in the tube domain Tn := {(ζ1 , . . . , ζn ) : ζk+1 k ∈ T+ for k = 1, . . . , n − 1, ζn ∈ T+ }
(9.19)
(ζk+1 k := ζk+1 − ζk ). Direct generalizations of Theorems 3.1 and 4.1 of [8] lead to rationality of the Wightman functions: 0| φ (x1 , a1 ) . . . φ (xn , an )|0 =
P (x1 , . . . , xn ; a1 , . . . , an ) , 2 0 µ(ak ,al ) xkl + i0xkl
(9.20)
1k
where P (x1 , . . . , xn ; a1 , . . . , an ) are polynomials and µ (a, b) is such a function that: µ(a,b) (9.21) φ (x, a) φ (y, b) − (−1)p q φ (y, b) φ (x, a) = 0 (x − y)2
for all x, y ∈ M (in the sense of distributions) and all a ∈ Fp and b ∈ Fq (p, q = 0, 1). It is natural to expect that the rationality of the correlation functions will imply stronger analytic properties for the analytic vector functions (9.18): Theorem 9.1. Let the system of fields φ (x, a) satisfies the above conditions. Then the analytic vector–valued functions (9.18) possess a continuation which we will denote again by n (ζ1 , a1 ; . . . ; ζn , an ) such that it is holomorphic in the domain of all sets nonisotropic points of the forward tube T+ . The functions (ζ 1 , .. . , ζn ) of mutually µ(ak ,al ) ζkl2 n (ζ1 , a1 ; . . . ; ζn , an ) possess analytic continuation in the domain k
of all ζ1 , . . . , ζn ∈ T+ . We will prove this theorem in Appendix B. A straightforward corollary of Theorem 9.1 is that the real connected conformal group Spin0 (D, 2) has a unitary representation U (g) on H such that U (g) n (ζ1 , a1 ; . . . ; ζn , an ) = n g (ζ1 ) , πζM1 (g) a1 ; . . . ; g (ζn ) , πζMn (g) an (9.22) for all (Minkowski) real conformal transformations g ∈ C := Spin0 (D, 2), all ζ1 , . . . , ζn ∈ T+ for which ζkl2 = 0 for k < l, and for all vectors a1 . . . an ∈ F . Indeed, Eq. (9.22) determines U (g) as a preserving norm map on a dense subspace of H, in accord with the GCI. Note also that πζ (g) is defined (regular) for g ∈ Spin0 (D, 2) and g (ζ ) ∈ T+ since T+ is a homogeneous space for the real conformal group [13]. As a corollary of Theorem 9.1 the operator–valued generalized functions 2 µ(ak ,al ) n ζkl φ (xm , am ) 1k
m=1
together with all their derivatives in x1 , . . . , xn are defined (and regular) for coinciding arguments x1 = · · · = xn =: x and are again operator distributions in x acting on a % ⊇ D. Denote the linear span of all these operator common dense invariant domain D functions ψ (x) together with the constant function I (x) ≡ IH by A. The space A has natural Z2 –grading induced by those of the fields φ (a, x) (i.e. the Z2 –grading of F ). The construction of the operator distributions ψ (x) ∈ A is done via the correlation functions (9.20) so we conclude that every correlator 0| ψ1 (x1 ) . . . ψn (xn )|0, for ψ1 , . . . , ψn ∈ A, is rational and Z2 –symmetric in the sense of Eq. (9.20) and the vector distribution ψ1 (x1 ) . . . ψn (xn )|0 is a boundary value of an analytic function YnM (ψ1 , ζ1 ; . . . ; ψn , ζn ) (≡ ψ1 (ζ1 ) . . . ψn (ζn )|0) satisfying Theorem 9.1. Note that " M + # ,ζ Yn ψ1 , ζ1 ; . . . ; ψn+ , ζn YmM ψ1 , ζ1 ; . . . ; ψm m # M ,ζ = 0| Yn+m ψ1 , ζ1 ; . . . ; ψn , ζn ; ψ1 , ζ1 ; . . . ; ψm m
(9.23)
(9.24)
M ), where ψ → ψ + is the conjugation (the scalar product of the vacuum with Yn+m induced on A by the Hermitian conjugation of ψ (x) ∈ A. Next observe that the cocycle π M on F gives rise to a cocycle on A, which we denote again by π M , such that it is a continuation of the initial one under the natural inclusion of F in A and U (g) ψ (x) U (g)−1 = πxM (g) ψ (g(x)) (9.25)
for g ∈ C and x ∈ M, regular for g (i.e., g (x) ∈ M). In particular, (a ∈ F, g (x) ∈ M). U (g) φ (x, a) U (g)−1 = φ g(x), πxM (g) a
(9.26)
Equations (9.25)and (9.26) are understood in the sense of distributions in the domain x : g (x) ∈ M . The sketch of the arguments is the following: first observe that the action of the Weyl group on A, which is naturally generated by those on F , is again decomposable into a direct sum of finite dimensional irreducible representations (since it is representable by the sum of tensor products of such representations). The energy and Wightman positivities imply that the eigenvalues of the dilation operator are positive numbers, which allows then to define an extension of the cocycle π M on A so that every element ψ ∈ A belongs to a finite dimensional π M –invariant subspace (see Lemma 7.2). This makes meaningful the action of πxM (g) in Eqs. (9.25) and (9.26) as an action of multiplicators. Having made precise the meaning of Eqs. (9.25) and (9.26) we can prove the second one using (9.22) and the first is then obtained. Note also that for all g ∈ C we have a similar equation as (9.22) for YnM (ψ1 , ζ1 ; . . . ; ψn , ζn ). The heuristic connection between the system of fields A and the vertex operators Y (a, z) which we are going to introduce is (9.27) Y (a, z) = πsM(z) (h) ψ h−1 (z) , where h is the complex conformal transformation introduced in Sect. 9.1. In more detail, let us first define the transformed vector functions Yn (ψ1 , z1 ; . . . ; ψn , zn) := YnM πsM(z) h−1 ψ1 , h−1 (z1 ) ; . . . ; πsM(z) h−1 ψn , h−1 (zn ) , (9.28) 2 µkl Y (z , a ; . . . ; z , a ) are analytic in z , . . . , zkl such that the products n 1 1 n n 1 k
πz (g) := πsM(z) hgh−1
(9.30)
is a rational cocycle on A in the z–chart as those of Sect. 7. Note that both cocycles πζM (g) and πz (g) are constant (not depending on ζ and z, resp.) for different subgroups of CC : the complex Weyl group of Minkowski space and the complex Euclidean group with dilations for a z–chart, respectively. This is because we consider the conformal group action in an active sense. In particular, the translations τa and ta for both charts (M) (τa (ζ ) := ζ + a and ta (z) := z + a) are different, as in the first case these are the Poincar´e translations while in the second case these translations are generated by the T ’s of Sect. 7. Now we are ready to define the vertex algebra structure that arises from the considered GCI QFT. Set the vertex algebra space V to be the image of the linear map:
∈ H. (9.31) A ψ → Y1 (ψ, z) z=0
In fact, this map is injective as a corollary of the separating property of the vacuum (indeed, the translation covariance in (9.29) will imply that the function Y1 (ψ, z) will be identically zero in z). So we will further identify the space A by V under this injection, thus transferring the Z2 –grading and the above cocycle πz (g) on V. Note that the space V is dense in the Hilbert space H which is a consequence of the completeness axiom. The vertex algebra vacuum 1 we set to be the QFT vacuum |0 (note that Y1 (I, z) ≡ |0). The action of the conformal Lie algebra cC (which includes the action of the translation endomorphisms Tµ ) is the derivative action of that of (9.29) (since A include all the derivative fields we will not thus leave the space V). The Hermitian form on V is the restricted Hilbert scalar product. Finally, the state–field correspondence is defined as
−N N
Y (a, z) b := z2 ι Y2 (a, z; b, w) (9.32) (z − w)2 w=0
for N 0, where ι stands for the Taylor expansion at 0 of the resulting analytic function in z. The coefficients in the formal series (9.32) belong to V since the derivatives of 2 N ψ1 (x1 ) ψ2 (x2 ) (for N 0) of fields ψ1 , ψ2 ∈ A, at the regularized products x12 coinciding arguments x1 = x2 , again belong to A. Theorem 9.2. The above defined V is a vertex algebra with Hermitian structure. The space V is dense in the Hilbert space H and coincides with the finite conformal energy space: the linear span of all eigenvectors of the conformal Hamiltonian H (having a discrete spectrum). We will sketch the proof only for some of the properties which one has to verify. First, one can find by a straightforward computation that under the transformation (9.28) the conversed Eq. (9.24) reads # " Yn πz1∗ (jW )−1 a1+ , z1∗ ; . . . ; πzn∗ (jW )−1 an+ , zn∗ Ym (b1 , w1 ; . . . ; bm , wm ) # " (9.33) = 1 Yn+m (a1 , z1 ; . . . ; an , zn ; b1 , w1 ; . . . ; bm , wm ) (one uses Eqs. (9.17) and (A.5), the cocycle property as well as the properties (a) and (b) of the previous subsection). As a corollary of this equation and Eq. (9.32) we then obtain that Eq. (8.8) from the definition of the Hermitian structure on vertex algebra is satisfied. Next one can derive the equality 1 = ιz1 ,...,zn ρn−N ι ρnN Yn (a1 , z1 ; . . . ; an , zn ) (9.34) Y (a1 , z1 ) . . . Y (an , zn ) for N 0, where ρn :=
1k
2 and the second ι stand for the Taylor expanzkl
sion of the resulting analytic function in z1 , . . . , zn at (0, . . . , 0). For this purpose, first we note that the left-hand side in (9.34) belongs to V [[z1 ]]z2 . . . [[zn ]]z2 which follows n 1 inductively in n. Then Eq. (9.34) is proven by induction in n starting from the definition equality (9.32) and for the inductive step using Eq. (9.33). From Eq. (9.34) the locality follows (Definition 2.1 (b)). The covariance properties of Definition 7.1 (Eqs. (7.7)–(7.9) together with Definition 2.1 (c)) follows from the conformal covariance law (9.29). Finally, we point out that the condition (c) of Definition 7.1 is a consequence of the rationality of the cocycle πz (g). This condition also implies the second statement in Theorem 9.2.
9.3. The converse passage. Here we start by a vertex algebra V with a strongly positive Hermitian structure. Thus V is a prehilbert space and denote by H its Hilbert completion. We are going to construct a GCI QFT on H. The first step is to reconstruct the functions Yn (a1 , z1 ; . . . ; an , zn ) as an analytic in the domain of (z1 , . . . , zn ) ∈ T+n . Here they are defined by the sum of the formal series n 2 , divided by ρ −N . The last ρnN 1 for N 0, where ρn = Y (am , zm ) zkl n 1k
m=1
formal series converges due to Fact B.3 since its Hilbert norm converges to a polynomial in accord with Proposition 8.1. Next one proves Eq. (9.33) as a consequence of the iterated Eq. (8.8) in Definition 8.1. Then one can define a unitary representation of the real conformal group C on H via Eq. (9.29) as the right-hand side does not change its Hilbert norm due to the GCI (8.13). Now we have to make the converse passage in Eq. (9.28) and obtain the functions YnM (a1 , ζ1 ; . . . ; an , ζn ) analytic in the same domain as those of previous subsection. They can be proven to satisfy again Eq. (9.24) and the corresponding conformal covariance law with a cocycle πxM (g) on V given bythe converse equality of Eq. (9.30). Finally we define a system of local fields y (a, x) : a ∈ V using the generalized vector functions obtained by the limit YnM (a1 , ζ1 ; . . . ; an , ζn ) → y (a1 , x1 ) . . . y (an , xn )|0
(9.35)
(|0 := 1), for: (ζ1 , . . . , ζn ) ∈ T+ (Eq. (9.19)), Reζk = xk , Imζl+1l → 0 (k = 1, . . . , n and l = 1, . . . , n − 1) and Imζn → 0. In order to express the action of y (a, x) by the vector distributions y (a1 , x1 ) . . . y (an , xn )|0 one should use Eq. (9.24). Summarizing the above construction we have found: Theorem 9.3. The above defined system of fields y (a, x) : a ∈ V with the cocycle πxM (g) satisfies the axioms of the GCI QFT. 10. Outlook In conclusion, we shall discuss some possibilities for finding new models for the vertex algebras in higher dimensions. They are based on the construction of vertex algebra from Lie algebras of formal distributions explained in Sect. 5. A straightforward generalization of the commutation relations (5.2) is presented by the so-called Lie field models. In our notations these models can be defined by the relations:
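\[
\begin{aligned}
\left[ u_\alpha(z_1), u_\beta(z_2) \right] &= \sum_{\gamma=1}^{A} \int_{\overline{M}} D_{\alpha\beta\gamma}(z_1, z_2, z_3)\, u_\gamma(z_3)\, d^D z_3 + F_{\alpha\beta}(z_1, z_2)\, I, \\
D_{\alpha\beta\gamma}(z_1, z_2, z_3) &= \iota_{z_3, z_1, z_2} W^{(3)}_{\alpha\beta\gamma}(z_{12}, z_{23}) - \iota_{z_3, z_2, z_1} W^{(3)}_{\alpha\beta\gamma}(z_{12}, z_{23}), \\
F_{\alpha\beta}(z_1, z_2) &= \iota_{z_1, z_2} W^{(2)}_{\alpha\beta}(z_{12}) - \iota_{z_2, z_1} W^{(2)}_{\alpha\beta}(z_{12}),
\end{aligned}
\tag{10.1}
\]
where $W^{(2)}_{\alpha\beta}(z, w),\ W^{(3)}_{\alpha\beta\gamma}(z, w) \in \mathbb{C}[z, w]_{z^2 w^2 (z+w)^2}$, $z_{kl} = z_k - z_l$, and the (complex) measure $d^D z$ on $\overline{M}$ is provided by the restriction of the volume form $dz^1 \wedge \dots \wedge dz^D$ on the real submanifold $\overline{M}$ (9.7). The sum of integrals of this type should appear in any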
GCI QFT with possibly infinite number A of fields. It was shown by Robinson [11] that there are no nonfree (i.e. with $D_{\alpha\beta\gamma}(z, w) \neq 0$ for some α, β and γ) such scalar models (i.e. A = 1) in space-time dimension higher than 2. For nonscalar but finite component models it is not clear whether there exists a solution of the Jacobi identity for the commutators in (10.1). From the point of view of the GCI the interesting solutions are those for which the functions $W^{(2)}_{\alpha\beta}$ and $W^{(3)}_{\alpha\beta\gamma}$ are conformally invariant (finite component) 2- and 3-point functions. They would provide a conformal and Hermitian structure on the vertex algebra of the type considered in Sects. 7 and 8. For example, there are nontrivial candidates for a three-point function of a nonabelian gauge field $F^a_{\mu\nu}$ displayed in [7], and it is an interesting question whether they yield the Jacobi identity for the corresponding commutators.

Acknowledgements. The author thanks Bojko Bakalov, Petko Nikolov, Ventseslav Rizov and Ivan Todorov for useful discussions. The author is grateful to Ivan Todorov, Ventseslav Rizov and the referees for their careful reading of the manuscript and for proposing several corrections and improvements. The author acknowledges the hospitality of the Erwin Schrödinger International Institute for Mathematical Physics (ESI) where this work was conceived as well as partial support by the Bulgarian National Council for Scientific Research under contract F-828, and by the Research Training Network within the Framework Programme 5 of the European Commission under contract HPRN-CT-2002-00325.
Appendix A. Affine System of Charts on Complex Compactified Minkowski Space

In this section we will follow the notation of [8], Sect. 2 and Appendix A but our considerations will be basically done over the complex field $\mathbb{C}$. Recall that as an algebraic variety, the complex D-dimensional compactified Minkowski space $\overline{M}_{\mathbb{C}}$ is defined as a complex D-dimensional projective nondegenerate quadric. The real compactified Minkowski space $\overline{M}$ is then characterized by a conjugation $\overline{M}_{\mathbb{C}} \ni p \mapsto p^* \in \overline{M}_{\mathbb{C}}$, $\overline{M} \equiv \{p \in \overline{M}_{\mathbb{C}} : p = p^*\}$, such that the corresponding real restriction of the complex quadric has signature (D, 2). The manifold $\overline{M}_{\mathbb{C}}$ is a homogeneous space of the connected complex conformal group $\mathcal{C}_{\mathbb{C}} \equiv Spin_0(D + 2; \mathbb{C})$ which acts on it via its orthogonal action on the quadric. The stabilizer $\mathcal{C}_{\mathbb{C},p}$ of a point $p \in \overline{M}_{\mathbb{C}}$ is isomorphic to the complex spinor Weyl group: the complex spinor Poincaré group with dilations. The action of $\mathcal{C}_{\mathbb{C},p}$ leaves invariant two natural subsets of $\overline{M}_{\mathbb{C}}$: the set $K_{\mathbb{C},p}$ of all points of $\overline{M}_{\mathbb{C}}$ mutually isotropic with respect to p (recall that the isotropy relation is conformally invariant and coincides on the projective quadric with the orthogonality relation of the rays); the other invariant subset is the complement $M_{\mathbb{C},p} := \overline{M}_{\mathbb{C}} \setminus K_{\mathbb{C},p}$. The set $M_{\mathbb{C},p}$ is open and dense in $\overline{M}_{\mathbb{C}}$ and the action of $\mathcal{C}_{\mathbb{C},p}$ induces on it a structure of affine space with a conformal class of flat metrics. Thus we obtain a complex affine atlas $\{M_{\mathbb{C},p} : p \in \overline{M}_{\mathbb{C}}\}$ on $\overline{M}_{\mathbb{C}}$ indexed by its points.

To illustrate the above construction note that a particular case of an affine chart of $\overline{M}_{\mathbb{C}}$ is the injection of the complex Minkowski space $M_{\mathbb{C}}$ in $\overline{M}_{\mathbb{C}}$, $M_{\mathbb{C}} \hookrightarrow \overline{M}_{\mathbb{C}}$. In this case $M_{\mathbb{C}} \equiv M_{\mathbb{C},p_\infty}$ for a special point $p_\infty \in \overline{M}_{\mathbb{C}}$: the tip of the cone $K_\infty \equiv K_{\mathbb{C},p_\infty}$ of "infinite" points. The complex spinor Weyl group of the Minkowski space $M_{\mathbb{C}}$ coincides then with $\mathcal{C}_{\mathbb{C},p_\infty}$. Every other stabilizer $\mathcal{C}_{\mathbb{C},p}$ is conjugated to $\mathcal{C}_{\mathbb{C},p_\infty}$: $\mathcal{C}_{\mathbb{C},p} = g\, \mathcal{C}_{\mathbb{C},p_\infty}\, g^{-1}$ if $g(p_\infty) = p$. Thus $M_{\mathbb{C},p} = g(M_{\mathbb{C}})$, which transfers the affine structure from $M_{\mathbb{C}}$ to $M_{\mathbb{C},p}$.
On fixing a point $q \in M_{\mathbb{C},p}$, the chart $M_{\mathbb{C},p}$ becomes a vector space (with center q). In the case of the (complex) Minkowski space $M_{\mathbb{C}}$ the center is denoted by $p_0$ (note that $p_0$ and $p_\infty$ are
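real points). Thus every pair $p, q \in \overline{M}_{\mathbb{C}}$ of mutually nonisotropic points determines a vector space included in $\overline{M}_{\mathbb{C}}$ as a dense open subset. Since there always exists a transformation $g \in \mathcal{C}_{\mathbb{C}}$ such that $g(p_\infty) = p$ and $g(p_0) = q$ ([8] Proposition 1.1), the map $g : M_{\mathbb{C}} \to g(M_{\mathbb{C}}) \equiv M_{\mathbb{C},p}$ will be an isomorphism of the corresponding vector spaces.

In the projective description of $\overline{M}_{\mathbb{C}}$ a vector chart $M_{\mathbb{C},p}$ with center q is determined by two representatives $\vec{\eta}_\infty, \vec{\eta}_0 \in \mathbb{C}^{D+2}$, $p = \{\lambda \vec{\eta}_\infty\}$, $q = \{\lambda \vec{\eta}_0\}$ ($\vec{\eta}_\infty^{\,2} = \vec{\eta}_0^{\,2} = 0$ as in [8] Eq. (A.1)) with fixed mutual normalization $\vec{\eta}_\infty \cdot \vec{\eta}_0 = 1$. Then the orthogonal complement $\{\vec{\eta}_\infty, \vec{\eta}_0\}^{\perp} \cong \mathbb{C}^D$ ($\mathrm{Span}\{\vec{\eta}_\infty, \vec{\eta}_0\}$ has nondegenerate metric) plays the role of the vector space of the chart, $M_{\mathbb{C},p} \cong \{\vec{\eta}_\infty, \vec{\eta}_0\}^{\perp}$. A particular case of the latter correspondence is the Klein–Dirac compactification formulae [8] Eq. (A.2) where $\vec{\eta}_\infty \equiv \vec{\xi}_\infty$ and $\vec{\eta}_0 \equiv \vec{\xi}_0$ so that

\[
\{\vec{\xi}_\infty, \vec{\xi}_0\}^{\perp} \ni \sum_{\mu=0}^{D-1} \zeta^\mu \vec{e}_\mu \mapsto \{\lambda \vec{\xi}_\zeta\} \in M_{\mathbb{C}}, \qquad
\vec{\xi}_\zeta := \vec{\xi}_0 - \frac{\zeta^2}{2}\, \vec{\xi}_\infty + \sum_{\mu=0}^{D-1} \zeta^\mu \vec{e}_\mu,
\tag{A.1}
\]

where the expressions for $\vec{\xi}_\infty$ and $\vec{\xi}_0$ can be found in Appendix A of [8]. In the general case every vector $\vec{v} \in \{\vec{\eta}_\infty, \vec{\eta}_0\}^{\perp}$ is mapped to the point $\{\lambda \vec{\eta}_{\vec{v}}\} \in M_{\mathbb{C},p}$ determined by the representative

\[
\vec{\eta}_{\vec{v}} := \vec{\eta}_0 - \frac{\vec{v}^{\,2}}{2}\, \vec{\eta}_\infty + \vec{v}.
\tag{A.2}
\]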
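As a quick consistency check (added here as an illustration, not part of the original argument), one can verify that the representative in Eq. (A.2) is isotropic and is not orthogonal to $\vec{\eta}_\infty$. The sketch below uses only the stated assumptions $\vec{\eta}_\infty^{\,2} = \vec{\eta}_0^{\,2} = 0$, $\vec{\eta}_\infty \cdot \vec{\eta}_0 = 1$ and $\vec{v} \perp \vec{\eta}_\infty, \vec{\eta}_0$:

```latex
\begin{align*}
\vec{\eta}_{\vec{v}}^{\,2}
  &= \vec{\eta}_0^{\,2}
     + \frac{(\vec{v}^{\,2})^2}{4}\, \vec{\eta}_\infty^{\,2}
     + \vec{v}^{\,2}
     - \vec{v}^{\,2}\, (\vec{\eta}_0 \cdot \vec{\eta}_\infty)
     + 2\, \vec{\eta}_0 \cdot \vec{v}
     - \vec{v}^{\,2}\, (\vec{\eta}_\infty \cdot \vec{v}) \\
  &= 0 + 0 + \vec{v}^{\,2} - \vec{v}^{\,2} + 0 - 0 = 0, \\
\vec{\eta}_{\vec{v}} \cdot \vec{\eta}_\infty
  &= \vec{\eta}_0 \cdot \vec{\eta}_\infty = 1 \neq 0 .
\end{align*}
```

The first line shows that $\{\lambda \vec{\eta}_{\vec{v}}\}$ is a point of the quadric; the second shows that it is not isotropic to p, so it lies in the chart $M_{\mathbb{C},p}$ as claimed.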
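There is a special kind of affine complex charts $M_{\mathbb{C},p}$ determined by the condition that they cover the whole real compact Minkowski space, i.e. $\overline{M} \subset M_{\mathbb{C},p}$:

Proposition A.1. $\overline{M} \subset M_{\mathbb{C},p}$ iff $p \in T_+ \cup T_- \subset M_{\mathbb{C}}$ ($T_\pm$ are defined in Eq. (9.8)).

Proof. First, $q \in \overline{M}$ iff $q = q^*$. On the other hand, let $p = \{\lambda \vec{\eta}\}$ and $q = q^*$ for $q = \{\lambda \vec{\theta}\}$ and $\vec{\theta} = \vec{\theta}^{\,*}$ ($\vec{\theta} \in \mathbb{R}^{D+2}$). Then $q \in M_{\mathbb{C},p}$ iff $\vec{\eta} \cdot \vec{\theta} \neq 0$, which is also equivalent to $\vec{\eta} \cdot \vec{\theta} \neq 0$ and $\vec{\eta}^{\,*} \cdot \vec{\theta} \neq 0$. Thus $q \in M_{\mathbb{C},p}$ iff $\vec{\theta} \notin \{\vec{\eta}, \vec{\eta}^{\,*}\}^{\perp} \equiv \{\mathrm{Re}\, \vec{\eta}, \mathrm{Im}\, \vec{\eta}\}^{\perp}_{\mathbb{R}}$ (the latter one stands for the orthogonal complement in $\mathbb{R}^{D+2}$). Therefore, $\overline{M} \subset M_{\mathbb{C},p}$ iff the space $\{\mathrm{Re}\, \vec{\eta}, \mathrm{Im}\, \vec{\eta}\}^{\perp}_{\mathbb{R}}$ has definite restriction of the metric (i.e. it does not contain isotropic vectors). Since we have chosen the signature of $\mathbb{R}^{D+2}$ to be (D, 2) we conclude that the space $\mathrm{Span}_{\mathbb{R}}\{\mathrm{Re}\, \vec{\eta}, \mathrm{Im}\, \vec{\eta}\}$ should be of signature (0, 2). In particular, $\vec{\eta} \cdot \vec{\xi}_\infty \neq 0$, i.e. $p \in M_{\mathbb{C}} \equiv M_{\mathbb{C},p_\infty}$, and if we set $\vec{\eta} := \vec{\xi}_\zeta$, according to Eq. (A.1), then we derive that $0 > (\mathrm{Im}\, \vec{\xi}_\zeta)^2 = -\tfrac{1}{4} (\vec{\xi}_\zeta - \vec{\xi}_\zeta^{\,*})^2 = (\mathrm{Im}\, \zeta)^2$. But the latter means that $\zeta \in T_+ \cup T_-$.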
Remark A.1. If we start with the real form on the quadric $\overline{M}_{\mathbb{C}}$ of signature (r + 1, s + 1) with $s \neq 1$ and $r \neq 1$ then it follows from the above proof that there does not exist an affine chart containing the whole corresponding real quadric.

Since all points of $T_+$ as well as $T_-$ form single orbits under the action of the real conformal group $\mathcal{C}$ ([13]), the charts in Proposition A.1 are mutually conjugated in both cases of $p \in T_+$ and $p \in T_-$. Besides that, we prefer the case when $p \in T_-$ because $p \notin M_{\mathbb{C},p}$. Thus up to a real conformal transformation we can choose p to be the fixed point $-ie_0 = (-i, \mathbf{0})$. The center of the chart is convenient to be the conjugated point $p^* \in T_+$ since then the conjugation becomes simpler in the coordinates of the chart. The vector space of the chart is then isomorphic to $\{\vec{\xi}_{ie_0}, \vec{\xi}_{-ie_0}\}^{\perp}$, which coincides with $\mathrm{Span}\{\vec{e}_1, \dots, \vec{e}_D\}$ in the notations of [8] Eq. (A.1). The real part of the latter space is Euclidean so we denote this chart by $E_{\mathbb{C}} \hookrightarrow \overline{M}_{\mathbb{C}}$, $E_{\mathbb{C}} \cong \mathbb{C}^D$. Equation (A.2) takes then the form similar to Eq. (A.1):

\[
\mathbb{C}^D \ni z \mapsto \{\lambda \vec{\eta}_z\} \in \overline{M}_{\mathbb{C}}, \qquad
\vec{\eta}_z := \vec{\eta}_0 - \frac{z^2}{2}\, \vec{\eta}_\infty + \sum_{\mu=1}^{D} z^\mu \vec{e}_\mu,
\tag{A.3}
\]

where $\vec{\eta}_0 := \tfrac{1}{2} (\vec{e}_{-1} + i \vec{e}_0) = \tfrac{1}{2} \vec{\xi}_{ie_0}$ and $\vec{\eta}_\infty := -\vec{e}_{-1} + i \vec{e}_0 = -2\, \vec{\eta}_0^{\,*}$. The relations (9.4) and (9.5), giving the connection between the coordinates in the charts $M_{\mathbb{C}}$ and $E_{\mathbb{C}}$, are derived from the equation $\vec{\xi}_\zeta \sim \vec{\eta}_z$. Note that the transformation h, which acts on $\mathbb{C}^{D+2}$ as a Euclidean rotation in the plane $\mathrm{Span}\{i \vec{e}_0, \vec{e}_D\}$ by an angle $\frac{\pi}{2}$, conjugates the charts $M_{\mathbb{C}}$ and $E_{\mathbb{C}}$ since

\[
h\, \vec{\eta}_z = \vec{\xi}_{s(z)}
\tag{A.4}
\]

(s is defined by Eq. (9.2)). This is exactly the transformation (9.6), and its square, acting as a Euclidean rotation in the plane $\mathrm{Span}\{i \vec{e}_0, \vec{e}_D\}$ by an angle π, is the Weyl reflection $j_W^{-1}$ (7.4) (here the axis of $i \vec{e}_0$ corresponds to the axis "D + 1" in the notations of Sect. 7). Note also that

\[
h^* = h^{-1}.
\tag{A.5}
\]
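We conclude this geometric review with a remark about the projective interpretation of the tube domains $T_\pm$ (9.8) of the complex Minkowski space $M_{\mathbb{C}}$. First observe that for a complex vector $\vec{\eta} \in \mathbb{C}^{D+2}$ we have: $\vec{\eta}^{\,2} = 0$ iff $(\mathrm{Re}\, \vec{\eta})^2 = (\mathrm{Im}\, \vec{\eta})^2$ and $\mathrm{Re}\, \vec{\eta} \cdot \mathrm{Im}\, \vec{\eta} = 0$. Therefore, there are three natural subsets of rays $\{\lambda \vec{\eta}\} \in \overline{M}_{\mathbb{C}}$, invariant with respect to the real conformal group $\mathcal{C}$:

(a) $(\mathrm{Re}\, \vec{\eta})^2 = (\mathrm{Im}\, \vec{\eta})^2 > 0$,
(b) $(\mathrm{Re}\, \vec{\eta})^2 = (\mathrm{Im}\, \vec{\eta})^2 < 0$,
(c) $(\mathrm{Re}\, \vec{\eta})^2 = (\mathrm{Im}\, \vec{\eta})^2 = 0$.

In the selected signature (D, 2) of the real quadric $\overline{M}$, the case (b) corresponds to the union $T_+ \cup T_-$. This follows from the equality $(\mathrm{Im}\, \vec{\xi}_\zeta)^2 = (\mathrm{Im}\, \zeta)^2$, for $\zeta \in M_{\mathbb{C}}$, already used in the proof of Proposition A.1. The separation between $T_+$ and $T_-$ in the subspace (b) of rays corresponds to the orientation of the pair $(\mathrm{Re}\, \vec{\eta}, \mathrm{Im}\, \vec{\eta})$.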
Appendix B. Proof of Theorem 9.1

To prove Theorem 9.1 we need some technical preparation.

Proposition B.2. Let $\Theta_0(\zeta_1, \dots, \zeta_n)$ be a continuous function with values in a Hilbert space H and defined in an open domain $U_0$ in $M_{\mathbb{C}}^{n}$. Let $U_0$ be contained in a connected domain U in $M_{\mathbb{C}}^{n}$, such that the scalar product

\[
F_0(\zeta_1, \dots, \zeta_{2n}) := \langle \Theta_0(\overline{\zeta}_1, \dots, \overline{\zeta}_n) \,|\, \Theta_0(\zeta_{n+1}, \dots, \zeta_{2n}) \rangle
\tag{B.1}
\]

($\overline{\zeta} \equiv \overline{x + iy} = x - iy$ for x, y ∈ M) has a continuation to an analytic function $F(\zeta_1, \dots, \zeta_{2n})$ in the domain $U^* \times U$ in $M_{\mathbb{C}}^{\times 2n}$, where $U^*$ is the set of complex conjugate elements of U. Then there exists a single-valued continuation $\Theta(\zeta_1, \dots, \zeta_n)$ of $\Theta_0(\zeta_1, \dots, \zeta_n)$, to a strongly analytic function (i.e. analytic in norm) on the domain U, with values in H such that

\[
F(\zeta_1, \dots, \zeta_{2n}) = \langle \Theta(\overline{\zeta}_1, \dots, \overline{\zeta}_n) \,|\, \Theta(\zeta_{n+1}, \dots, \zeta_{2n}) \rangle
\tag{B.2}
\]

in $U^* \times U$.

To prove this proposition the following simple fact is useful:
" #∞ Fact B.3. Let {n }∞ m n m,n=1 is Cauchy fundan=1 ⊂ H, then if the sequence mental then the sequence {n }∞ n=1 is fundamental in the norm of H too.
" # " # " # This follows from m − n 2 = m m + n n −2 m n −→ 0. Continm, n→∞
uing with the proof of Proposition B.2 we first observe that the vector–valued function Θ0 (ζ1 , . . . , ζn ) is strongly analytical in the domain U0 , which follows from the analyticity of the scalar product (B.1) and the above Fact B.3. Let (ζ1 , . . . , ζn ) ∈ U . The domain U is connected and consequently there exists a piecewise linear path in U connecting (ζ1 , . . . , ζn ) with U0 . Then for every interval (ζ1 (t) , . . . , ζn (t)), t ∈ [0, 1] we can make the requested continuation by the following fact: If 0 (t) is a strongly analytic function with values in H defined of " in a neighborhood # t = 0 ∈ C and such that the scalar product h0 (t1 , t2 ) := 0 t1 0 (t2 ) possesses an analytic continuation h (t1 , t2 ) in (t1 , t2 ) in some neighborhood of [0, 1]×2 ⊂ C2 , then the function 0 (t) can be continued in a neighborhood of# [0, 1] in C to a strongly " analytic function (x), for which h (t1 , t2 ) = t1 (t2 ) . To prove the latter statement we first note that there exists a positive number ρ such that if Dρ ⊂ C stands for the open complex disk with centre at 0 ∈ C and radius ρ, then the set [0, 1]×2 + Dρ×2 = u + v : u ∈ [0, 1]×2 , v ∈ Dρ×2 is contained in the analyticity domain of h (t1 , t2 ). Thus, if for every t ∈ [0, 1], Jt stands for the interval in R with centre t and radius ρ, then Jt × Jt lies in the convergence domain of the Taylor expansion of the function h (t1 , t2 ) around (t, t). Since the length of the intervals Jt is constant, then there exist a finite number Jtk , for k = 0, . . . , n such that 0 = t0 < · · · < tk < · · · < tn = 1 and tk+1 ∈ Jtk (0 k < n). In such a way if the requested analytic expansion (t) is done in a neighborhood of [0, tk ] (0 k n), then the Taylor expansion of (t) with centre tk will be convergent in Jk – by the construction of Jk and Fact B.3. Thus we arrive to tk+1 and by induction, to tn = 1. The single-valuedness of the obtained continuation Θ (ζ1 , . . . , ζn ) follows by Eq. (B.2). 
Indeed, if we denote by H0 , the closed linear span of the vectors of the type
Θ0 (ζ1 , . . . , ζn ), for all (ζ1 , . . . , ζn ) in U0 , then in accord with the above construction the continuation Θ (ζ1 , . . . , ζn ) will take its values again in H0 . Thus the single-valuedness of Θ (ζ1 , . . . , ζn ) will follow from the single-valuedness of the scalar product (B.2). The strong analyticity of Θ (ζ1 , . . . , ζn ) is implied again by Eq. (B.2), as that of Θ0 (ζ1 , . . . , ζn ).

To prove Theorem 9.1 we just need to apply Proposition B.2 for the analytic function (9.18), taking U0 to be Tn (9.19) and for U using the domain of all sets of mutually nonisotropic points of T+ .

References
1. Atiyah, M.F., Macdonald, I.G.: Introduction to Commutative Algebra. Reading, MA: Addison-Wesley, 1969
2. Borcherds, R.E.: Vertex Algebras. In: Topological field theory, primitive forms and related topics (Kyoto, 1996), Progr. Math. 160, Boston, MA: Birkhäuser Boston, 1998, pp. 35–77
3. Dubrovin, B.A., Novikov, S.P., Fomenko, A.T.: Modern Geometry: Methods and Applications. New York: Springer-Verlag, 1992
4. Haag, R.: Local Quantum Physics: Fields, Particles, Algebras, 2nd revised edition. Berlin: Springer-Verlag, 1996
5. Jost, R.: The General Theory of Quantized Fields. Providence, R.I.: Amer. Math. Soc., 1965
6. Kac, V.: Vertex Algebras for Beginners. Providence, R.I.: AMS, 1996
7. Nikolov, N.M., Stanev, Ya.S., Todorov, I.T.: Globally conformal invariant gauge field theory with rational correlation functions. Nucl. Phys. B 670 [FS], 373–400 (2003)
8. Nikolov, N.M., Todorov, I.T.: Rationality of Conformally Invariant Local Correlation Functions on Compactified Minkowski Space. Commun. Math. Phys. 218, 417–436 (2001)
9. Nikolov, N.M., Todorov, I.T.: Conformal Quantum Field Theory in Two and Four Dimensions. In: Dragovich, B., Sazdović, B. (eds.), Proceedings of the Summer School in Modern Mathematical Physics. Belgrade: Belgrade Institute of Physics, 2002, pp. 1–49
10. Nikolov, N.M., Todorov, I.T.: Finite Temperature Correlation Functions and Modular Forms in a Globally Conformal Invariant QFT, hep-th/0403191
11. Robinson, D.W.: On a soluble model of relativistic field theory. Phys. Lett. 9, 189–191 (1964)
12. Todorov, I.T.: Infinite dimensional Lie algebras in conformal QFT models. In: Barut, A.O., Doebner, H.-D. (eds.), Conformal Groups and Related Symmetries, Lecture Notes in Physics 261, Berlin-Heidelberg-New York: Springer, 1986, pp. 387–443
13. Uhlmann, A.: Remarks on the future tube. Acta Phys. Pol. 24, 293 (1963); The closure of Minkowski space, ibid. pp. 295–296; Some properties of the future tube, preprint KMU-HEP 7209 (Leipzig, 1972)

Communicated by Y. Kawahigashi
Commun. Math. Phys. 253, 323–335 (2005)
Digital Object Identifier (DOI) 10.1007/s00220-004-1191-7
Communications in Mathematical Physics

The Universal Connection and Metrics on Moduli Spaces

Fortuné Massamba 1,2 , George Thompson 1
1 The Abdus Salam ICTP, P.O. Box 586, 34100 Trieste, Italy. E-mail: [email protected], [email protected]
2 I.M.S.P., B.P. 613, Porto-Novo, Bénin

Received: 14 November 2003 / Accepted: 16 April 2004
Published online: 17 September 2004 – © Springer-Verlag 2004
Abstract: We introduce a class of metrics on gauge theoretic moduli spaces. These metrics are made out of the universal matrix that appears in the universal connection construction of M.S. Narasimhan and S. Ramanan. As an example we construct metrics on the c2 = 1 SU (2) moduli space of instantons on R4 for various universal matrices.

Contents
1. Introduction
2. Universal Metrics
2.1. More metrics from the universal connection
3. The Universal Connection
3.1. The Narasimhan-Ramanan matrix
3.2. Abelian universal connections
4. Instanton Moduli Space
4.1. The ADHM universal matrix and its metric
4.2. The NR universal matrix and its metric
5. Conclusions
References
1. Introduction The aim of this paper is to give, more or less, natural metrics on gauge theoretic moduli spaces. The question of alternate metrics to the standard L2 metric arose recently in the context of the AdS/CFT correspondence. The moduli space in question is the c2 = 1, SU (2) moduli space of instantons on R4 . The L2 metric is not ‘the right one’ in that context, essentially because it does not preserve the conformal invariance inherent in
the definition of the moduli space. A rather remarkable alternative is the information metric which is built out of Tr FA ∗ FA and its derivatives with respect to the moduli [3]. The information metric is designed to preserve the conformal invariance of the theory at hand and yields, for the round metric on S 4 , the standard Einstein metric on AdS5 . It is remarkable in that if one perturbs the metric on S 4 then to first order in that perturbation the information metric remains Einstein [2]. There are also a host of other small miracles associated with this metric. On other instanton moduli spaces the information metric fails to be a metric as it becomes highly degenerate. The reason for this is that it is overly gauge invariant, meaning that it is invariant under gauge transformations that depend on the moduli. For example a convenient parameterization of the c2 = 1, SU (n) instanton moduli space on R4 is the one where the SU (2) instanton is embedded in SU (n) and then one acts with rigid SU (n) gauge transformations to obtain the general instanton. But all the rigid gauge transformations leave Tr FA ∗ FA invariant and so drop right out of the information metric leaving us with the SU (2) parameters only. Rather more dramatically one sees that the information metric is in fact zero on moduli spaces of flat or parabolic bundles. So, while the L2 metric does not keep some of the properties we would like, it does have the advantage of being a metric! The problem then, as set out at the start of this introduction is to find other natural metrics. The L2 metric is made out of the gauge connection directly while the information metric is constructed from the curvature 2-form. These are natural objects in the theory and that is why these metrics are also natural. These, however, do not exhaust the natural objects that are available to us. In 1961 M.S. Narasimhan and Ramanan [4] introduced the concept of a universal connection. 
This connection plays the same role as that of the universal bundle construction for bundles. Specifically any connection on the Principal bundle of interest can be obtained by pull-back from the universal connection (the bundle itself is obtained by pull-back of the universal bundle). The important conclusion there is that any U (n) connection1 can be expressed as A = U † d U,
(1.1)
where U is an m × n rectangular matrix satisfying U † . U = In .
(1.2)
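A minimal worked example may help fix ideas about Eqs. (1.1) and (1.2). It is an illustration added here, not taken from the original text: take n = 1, m = 2, and let θ and φ be arbitrary real functions on the manifold (both are hypothetical names introduced only for this sketch).

```latex
\begin{align*}
U &= \begin{pmatrix} \cos\theta \\ e^{i\varphi} \sin\theta \end{pmatrix},
\qquad
U^\dagger U = \cos^2\theta + \sin^2\theta = 1, \\
A &= U^\dagger\, dU
   = \cos\theta\; d(\cos\theta)
     + e^{-i\varphi} \sin\theta\; d\!\left(e^{i\varphi} \sin\theta\right) \\
  &= -\sin\theta \cos\theta\, d\theta
     + \sin\theta \cos\theta\, d\theta
     + i \sin^2\theta\, d\varphi
   = i \sin^2\theta\, d\varphi .
\end{align*}
```

In this sketch A is purely imaginary, as expected for a U(1) connection written in the anti-Hermitian convention of Eq. (1.1), and its curvature is $dA = i \sin 2\theta\, d\theta \wedge d\varphi$.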
Since any such connection is made up of n2 d real ‘functions’ on a d-dimensional manifold it is apparent that there is a raw lower bound on m, namely that 2m ≥ n(d + 1). This bound is very difficult to meet. Indeed the bound one gets depends precisely on how the matrices U are constructed. In any case, the observation of M.S. Narasimhan and Ramanan was turned into a powerful tool for self-dual connections on 4-manifolds in the ADHM construction. That construction amounts to a method for obtaining from a universal connection the required matrix U . In fact the universal connection appears in the construction of many moduli spaces. We take the attitude that the U matrices are also natural objects in gauge theory and so should be used in the construction of metrics. Our construction gives us metrics on moduli spaces where the parameterization of the moduli space is contained in the connection. See our main assumption in the next section. When there are ‘matter’ or ‘Higgs’ 1
There is no restriction on the structure group; we have fixed on U (n) here for ease of presentation.
fields present one should use that data as well in the construction of metrics but this depends on the details of the equations one is trying to solve so we do not enter into this, except loosely in the Conclusions. As an example one can check that for instantons on flat R4 with U given by the ADHM construction one of the proposed metrics is the non-degenerate AdS5 metric. The paper is organized as follows. We delay describing the universal connection construction, and so how one may arrive at the matrices U , till Sect. 3. Consequently, we require the readers’ indulgence until then and ask that they take on faith results that will be proven in that section. In the next section we introduce possible metrics on the space of connections and ultimately on the moduli space of interest. In Sect. 3 the universal connection construction is finally given. An improvement in the Abelian case is also presented. Metrics on the c2 = 1 SU (2) instanton moduli space are considered in detail in Sect. 4 by way of example of the general construction. Finally, in the Conclusions, we end with many open questions. 2. Universal Metrics The title of this section is perhaps misleading. We mean that these are metrics made from the matrices U . Suppose that we have a parametrized family of connections, with parameters t i such that A(t) = U † (t)dU (t).
(2.1)
Denote the derivative $\partial/\partial t^i$ by $\partial_i$. We denote by $\mathcal{G}$ the space of maps from M to G. We also denote by $\mathcal{G}_t$ the space of maps from T × M to G, where T is the space of moduli. So in fact for fixed $t^i \in T$, $g(t, x) \in \mathcal{G}$.

Assumptions. We demand that the parameters are 'honest' parameters, that is, we demand that the parameterization is complete. Furthermore, we will presume that we are working on a smooth part of the moduli space. The two assumptions imply that there are no non-zero vectors $v^i(t)$ such that $v^i(t)\, \partial_i A(t) = 0$ at any point $t^i$ of the moduli space under consideration. We will refer to these assumptions as the main assumption.

Introduce the projector,
P = U.U †
(2.2)
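which is an m × m Hermitian matrix, satisfying $P^2 = P$. The projector, P, is not only gauge invariant, since $U \to U . g^\dagger$ for $g \in \mathcal{G}$, but also invariant under $\mathcal{G}_t$. We have the following simple

Lemma 2.1. If $\partial_i P = 0$, then $\partial_i A = d_A A_i$, where $A_i = U^\dagger \partial_i U$ and the covariant derivative is $d_A = d + [A, \cdot\,]$.

Proof. This is by direct computation. $\partial_i P = 0$ implies that

\[
\partial_i U^\dagger = -U^\dagger\, \partial_i U\, U^\dagger, \qquad \partial_i U = -U\, \partial_i U^\dagger\, U.
\]

For the variation of the connection we have, thanks to these two equations,

\[
\partial_i A = \partial_i U^\dagger . dU + U^\dagger d \partial_i U
= -U^\dagger \partial_i U\, A + A\, U^\dagger \partial_i U + d\left(U^\dagger \partial_i U\right)
= d_A A_i .
\]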
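The algebraic properties of the projector P = U.U† used above can be checked numerically on a small example. The following is a minimal sketch in plain Python, not part of the original construction: the 3 × 2 matrix returned by `example_U` and the rotation `g` are hypothetical choices made only for illustration, and the helper functions are ad hoc.

```python
import cmath
import math


def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def dagger(a):
    """Conjugate transpose of a nested-list matrix."""
    return [[complex(a[i][j]).conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]


def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices of equal shape."""
    return all(abs(x - y) < tol
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))


def example_U(theta, phi):
    """Hypothetical 3 x 2 matrix with orthonormal columns (m = 3, n = 2)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0], [s * cmath.exp(1j * phi), 0], [0, 1]]


def projector(U):
    """P = U U^dagger, an m x m matrix."""
    return matmul(U, dagger(U))


U = example_U(0.7, 1.3)
P = projector(U)

# A unitary 2 x 2 gauge transformation g (here a real rotation).
angle = 0.4
g = [[math.cos(angle), -math.sin(angle)],
     [math.sin(angle), math.cos(angle)]]
P_gauged = projector(matmul(U, dagger(g)))

identity_2 = [[1, 0], [0, 1]]
print(close(matmul(dagger(U), U), identity_2))  # checks U^dagger U = I_n
print(close(matmul(P, P), P))                   # checks P^2 = P
print(close(dagger(P), P))                      # checks P is Hermitian
print(close(P_gauged, P))                       # checks P is unchanged under U -> U g^dagger
```

The last check reflects the statement in the text that P is gauge invariant: replacing U by U.g† gives U g† g U† = U U†, so the dependence on g cancels whenever g is unitary.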
One can consider the connection form to be a form in a higher dimension, that is on T × M, with the components in the T direction being $A_i = U^\dagger \partial_i U$. With this understood Lemma 2.1 says that $v^i \partial_i P = 0$ only if $v^i F_{i\mu} = 0$ ($F_{i\mu}$ are the mixed components of the curvature 2-form on T × M). We now introduce some universal metrics. Set

\[
g^0_{ij} = \int_M \mathrm{Tr}\left(\partial_i P * \partial_j P\right), \tag{2.3}
\]
\[
g^1_{ij} = -\int_M \mathrm{Tr}\left(A_i * A_j\right), \tag{2.4}
\]

where ∗ is the Hodge star operator. $g^0$ is invariant under $\mathcal{G}_t$ while $g^1$ is only invariant under $\mathcal{G}$. The metrics that one gets naturally descend to $\mathcal{A}/\mathcal{G}$ and finally to the moduli space.
Theorem 2.2. Let M be a compact closed manifold and U a universal matrix for some family of connections, then there is a linear combination of the components of the quadratic forms $g^0$ (2.3) and $g^1$ (2.4) which is a metric on the moduli space.

Proof. There are two cases to consider. First suppose that there is no vector $v^i$ such that $v^i \partial_i P = 0$; then $g^0$ is a metric. This follows from the fact that ∗ and Tr (on Hermitian matrices) are positive definite so that for $g^0$ to be degenerate, that is for $g^0_{ij} v^i v^j = 0$ for some $v^i$, we must have $v^i \partial_i P = 0$. Secondly suppose that there is a vector $v^i$ such that $v^i \partial_i P = 0$; then $g^0$ is degenerate in this direction. However, by Lemma 2.1 we have that $v^i \partial_i A = d_A (v^i A_i)$, and $v^i A_i$ cannot be zero as that would contradict our main assumption. Positive definiteness of ∗ and negative definiteness of Tr (on anti-Hermitian matrices) guarantee that $g^1_{ij} v^i v^j \neq 0$. Consequently we can always organize for some linear combination of the components of $g^0$ and $g^1$ to yield a non-degenerate symmetric quadratic form on T.

We have a kind of converse to the theorem.

Corollary 2.3 (to Theorem 2.2). Given the conditions of the theorem, if $A_i = 0$ then $g^0$ is a metric on the moduli space.

Proof. If $A_i = 0$ then there is no $v^i$ such that $v^i \partial_i P = 0$, since if there were we would conclude $v^i \partial_i A = 0$, contradicting our main assumption.

To see that we can apply Corollary 2.3 of Theorem 2.2 directly we quote a specialized version of Lemma 3.3.

Lemma 2.4. On a relatively compact subset of the moduli space the NR (M.S. Narasimhan, Ramanan) matrix U can be chosen to satisfy $A_i = U^\dagger \partial_i U = 0$.

Proof. This is delayed till Sect. 3 where a stronger version will be proved and all the definitions will also be available.

As the construction in [4] applies to any connection we learn, from Corollary 2.3, that any moduli space can be given the metric $g^0$, at least on a relatively compact set, provided the universal matrix used is a NR matrix.
If one wishes to make use of U matrices other than those which are NR matrices then (2.3) can fail to be, but need not fail to be, a metric. In the proof of Theorem 2.2
we saw that degeneracy of $g^0$ only comes from having connections whose dependence on some moduli is through gauge transformations. This is precisely the situation that we described in the Introduction for the data for instantons of higher rank and which plagues the information metric. In cases of this type one has that some of the moduli, say $s^a$, are obtained by a gauge transformation on a connection $A_0$ which depends on moduli $r^\alpha$, that is

\[
A(s, r) = \left(U_0(r) h(s, r)\right)^\dagger d\left(U_0(r) h(s, r)\right) = h^\dagger(s, r)\, A_0(r)\, h(s, r) + h^\dagger(s, r)\, dh(s, r),
\]

then $\partial_a A(s, r) = d_A A_a(s, r)$, where $A_0(r) = U_0^\dagger(r)\, dU_0(r)$ and $A_a(s, r) = h^\dagger(s, r)\, \partial_a h(s, r)$. Furthermore, we have that P depends on all the coordinates $r^\alpha$ but $\partial_a P = 0$. Now $g^0_{ab} = g^0_{a\alpha} = 0$, however,

\[
g^1_{ab} = -\int_M \mathrm{Tr}\left(h^\dagger(s, r)\, \partial_a h(s, r) * h^\dagger(s, r)\, \partial_b h(s, r)\right).
\]
Thus, in this situation, one can consider as a metric a linear combination of $g^0$ and $g^1$.

Proposition 2.5. Let M be a compact, closed manifold and U a universal matrix for a family of connections with $U = U_0(r)\, h(s, r)$, where $g^0$ for $U_0$ is a metric on the part of the moduli space parameterized by $r^\alpha$ and $h \in \mathcal{G}_s$ as above, then a linear combination of $g^0$ and $g^1_{ab}$ is a metric on the moduli space.

Remark 2.6. Notice that we are not saying that $A_\alpha = g^\dagger(s, r)\, \partial g(s, r)/\partial r^\alpha$ is zero. Such a condition is not required for $g^0$ to be a metric.

2.1. More metrics from the universal connection. One of the problems we are faced with is that for non-compact manifolds the integrals that go into defining $g^0$ and $g^1$ may well not converge. To improve the situation we add a 'damping' factor. This generalization gives us many possible metrics even in the compact case. Let

\[
(U) = *^{-1}\, \mathrm{Tr}(dP * dP). \tag{2.5}
\]

So far we had only used a volume form on M, however (U) requires a metric. Note that (U) is invariant under $\mathcal{G}_t$. As an aside, note that the mass dimension of (U) is 2 which means that it is like a mass term for the gauge field $A_\mu$. In fact it is gauge invariant albeit highly non-local and non-polynomial (in the gauge field). One has

\[
\int_{\mathbb{R}^4} d^4x\, (U) = \int_{\mathbb{R}^4} d^4x\, \mathrm{Tr}\left(-A_\mu A^\mu + \partial_\mu U^\dagger \partial^\mu U\right),
\]

which is much more suggestive of a mass for the gauge field. Set

\[
g^{0,\alpha}_{ij} = \int_M (U)^\alpha\, \mathrm{Tr}\left(\partial_i P * \partial_j P\right), \tag{2.6}
\]
\[
g^{1,\beta}_{ij} = -\int_M (U)^\beta\, \mathrm{Tr}\left(U^\dagger \partial_i U * U^\dagger \partial_j U\right). \tag{2.7}
\]

Proposition 2.7. $g^{0,\alpha}$ and $g^{1,\beta}$ clearly have the following properties:
1. They are gauge invariant (that is under $\mathcal{G}$).
2. For α = β = d/2 where dim M = d, the metric on M enters only through its conformal class.
3. For M non-compact with α and β suitably large the integrals in (2.6, 2.7) formally converge provided that (U ) has some suitable integrability properties. As far as the third point of the proposition is concerned it may be natural to suppose that (U ) be integrable (as indicated by the analogy with a mass term). However, weaker integrability conditions may also suffice in certain cases. 3. The Universal Connection M.S. Narasimhan and Ramanan [4] prove that, at least locally, any connection A on a bundle P can be expressed as A = iU † dU .
(3.1)
The set up is as follows [4]. One begins with the Stiefel manifold, V (m, n), of all unitary n-frames through the origin of $\mathbb{C}^m$, thought of as a U (n) principal bundle over the Grassmann manifold $G(m, n) \equiv U(m)/(U(n) \times U(m - n))$. There is a connection on this bundle of the n-planes which has the following description. Denote the components by $v_i = \sum_{a=1}^{m} e_a S_{ai}$, where the $(e_a)$ are a canonical basis for $\mathbb{C}^m$ and i = 1, . . . , n. By orthogonality one requires that $S^\dagger . S = I_n$ (the unit n × n matrix). Denote the matrix valued function which associates to each n-plane its matrix $S_{ai}$ by the same letter. Then S† d S = ω
(3.2)
is a canonical connection on the bundle. This is the universal connection. The main theorem of M.S. Narasimhan and S. Ramanan is that any connection on a principal U (n)-bundle, P over a d-dimensional manifold X, is obtained by pullback of ω from a differentiable bundle homomorphism from P to V (m, n) for some sufficiently large m. The theorem tells us that any connection may be expressed in the form (3.1) for an m × n matrix U provided that m is sufficiently large. They provide a lower bound on m, $m \geq (d + 1)(2d + 1)n^3$ will do, but it is not a very efficient one. We will see below that for the local problem for U (1) bundles one can in fact do much better than requiring m = (d + 1)(2d + 1).

3.1. The Narasimhan-Ramanan matrix. In their paper [4] Narasimhan and Ramanan not only prove the existence of the universal connection but also give a construction of the matrices U . The way they do this is to pass from a local construction of U to a more global one. They certainly do not give the most minimal form of U but, nevertheless, their procedure is the only one we know of that will produce the required matrix U for any connection.

Lemma 3.1 ([4]). Let V be an open subset of $\mathbb{R}^d$ and W a relatively compact open subset whose closure is contained in V . For every differential form A of degree 1 on V with values in u(n) (the space of skew-Hermitian matrices), there exist differentiable functions $\phi_1, \dots, \phi_m$ in W with values in the space $M_n(\mathbb{C})$ of (n × n) complex matrices such that
1. $\sum_{j=1}^{m} \phi_j^* \phi_j = I_n$, and
2. $A = \sum_{j=1}^{m} \phi_j^*\, d\phi_j$,
where $m = (2d + 1)n^2$.
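To connect Lemma 3.1 with the matrix U of Eqs. (1.1) and (1.2), the following short computation may be useful. It is a sketch added for illustration and assumes, as the natural reading of the lemma, that U is obtained by stacking the m blocks $\phi_j$ into a single (mn) × n matrix; the anti-Hermitian convention of Eq. (1.1) is used, without the factor i of Eq. (3.1).

```latex
\begin{align*}
U &= \begin{pmatrix} \phi_1 \\ \phi_2 \\ \vdots \\ \phi_m \end{pmatrix},
\qquad
U^\dagger = \begin{pmatrix} \phi_1^* & \phi_2^* & \cdots & \phi_m^* \end{pmatrix}, \\
U^\dagger U &= \sum_{j=1}^{m} \phi_j^* \phi_j = I_n
  && \text{(condition 1 of Lemma 3.1)}, \\
U^\dagger\, dU &= \sum_{j=1}^{m} \phi_j^*\, d\phi_j = A
  && \text{(condition 2 of Lemma 3.1)}.
\end{align*}
```

Under this reading the two conditions of the lemma are exactly the normalization (1.2) and the representation (1.1), with the number of rows of U equal to $mn = (2d + 1)n^3$.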
The Narasimhan-Ramanan matrices are of a very particular form. Write

\[
A = i \sum_{\mu=1}^{d} \sum_{r=1}^{n^2} \lambda_{r,\mu}\, f_r\, dx_\mu, \tag{3.3}
\]
where $f_1, \dots, f_{n^2}$ is a set of positive definite matrices which form a basis for the complex Hermitian matrices over the reals, such that $\|f_r\| = 1$ for every $r$, with $\|f\|$ being the norm of $f$ as a linear transformation. According to the proof of the lemma, there exist strictly positive differentiable functions $p_{r,\mu}$, $q_{r,\mu}$ and $h_r$ such that
$$\lambda_{r,\mu} = p_{r,\mu}^2 - q_{r,\mu}^2, \qquad h_r^2(x) = \frac{1}{n^2}\, \mathbb{I}_n - \sum_{\mu=1}^{d} \left( p_{r,\mu}^2 + q_{r,\mu}^2 \right) f_r. \tag{3.4}$$
The matrix $U$ that Narasimhan and Ramanan propose is
$$U(A) = \begin{pmatrix} U_1 \\ U_2 \\ U_3 \end{pmatrix}, \tag{3.5}$$
where the block $U_1$ has $d(n \times n)$ components defined by the functions $p_{r,\mu}\, e^{i x_\mu} \cdot g_r$, the block $U_2$ has $d(n \times n)$ components defined by the functions $q_{r,\mu}\, e^{-i x_\mu} \cdot g_r$, the block $U_3$ has $(n \times n)$ components defined by the functions $h_r$, and $g_r$ is the positive square root of $f_r$. Think of all of these as vectors whose entries are $(n \times n)$ matrices. There are two somewhat surprising results that can be deduced from the lemma and its proof.

Proposition 3.2. In general the Narasimhan-Ramanan matrix for the gauge transform of a connection is not the gauge transformation of the Narasimhan-Ramanan matrix of the original connection, i.e. it is not equivariant:
$$U(A^g) \neq U(A).g. \tag{3.6}$$
Lemma 3.3. Let $A(t)$ be a family of connections parameterized by $t^i$ and let $U(t)$ be the corresponding family of Narasimhan-Ramanan universal matrices. Then
$$U^\dagger(A)\, \partial_i U(A) = k(t)\, \partial_i k(t)^{-1} \sum_{\mu} x^\mu A_\mu, \tag{3.7}$$
where $\partial_i = \partial/\partial t^i$ and $k$ is the moduli dependent scaling of coordinates.

The first proposition is evident from the construction of the matrix $U(A)$ (one should refer to [4] for the details of that construction).

Proof of Lemma 3.3. The scaling of coordinates is alluded to on p. 566 of [4] and we take the scaling to be $k$, so that the coordinates are related by $x^\mu = k y^\mu$. In the scaled coordinates one has $k\lambda_{r,\mu} = p_{r,\mu}^2 - q_{r,\mu}^2$, the components of the first block $U_1$ of $U$ are taken to be $p_{r,\mu}\, e^{i y_\mu} \cdot g_r$ and those of the second block $U_2$ to be $q_{r,\mu}\, e^{-i y_\mu} \cdot g_r$. From the definitions we have,
F. Massamba, G. Thompson
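for the three blocks $U_1$, $U_2$, $U_3$ of the matrix $U$ in (3.5),
$$U_1^\dagger \partial_i U_1 = \sum_{r,\mu} \left( \tfrac{1}{2}\, \partial_i p_{r,\mu}^2 + i\, \partial_i k^{-1}\, x^\mu\, p_{r,\mu}^2 \right) f_r, \qquad U_2^\dagger \partial_i U_2 = \sum_{r,\mu} \left( \tfrac{1}{2}\, \partial_i q_{r,\mu}^2 - i\, \partial_i k^{-1}\, x^\mu\, q_{r,\mu}^2 \right) f_r,$$
and
$$U_3^\dagger \partial_i U_3 = \frac{1}{2} \sum_{r} \partial_i h_r^2,$$
so that
$$\sum_{a=1}^{3} U_a^\dagger \partial_i U_a = \frac{1}{2}\, \partial_i \left[ \sum_{r,\mu} \left( p_{r,\mu}^2 + q_{r,\mu}^2 \right) f_r + \sum_{r} h_r^2 \right] + i\, \partial_i k^{-1} \sum_{r,\mu} x^\mu \left( p_{r,\mu}^2 - q_{r,\mu}^2 \right) f_r = \frac{1}{2}\, \partial_i\, \mathbb{I}_{n \times n} + k\, \partial_i k^{-1} \sum_{\mu} x^\mu A_\mu.$$
The first term is the derivative of a constant matrix and so vanishes, which gives (3.7).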
From Proposition 3.2 we learn that there are universal parameterizations which are not gauge covariant. In particular, it is difficult to see how to construct invariants from the Narasimhan-Ramanan matrix apart from those made out of the curvature 2-form. This suggests that, in this case, one should already work on a slice of the space of connections with this parameterization.

Somewhat more mysterious is Lemma 3.3. Firstly, it implies that for this parameterization the 'Schwinger-Fock' gauge $\sum_\mu x^\mu A_\mu = 0$ is singled out. Furthermore, on a relatively compact subset of an open set in the moduli space one can take $k$ to be constant. The lemma then implies a special case of Proposition 3.2. Suppose that $U$ is some parameterization matrix, not necessarily the NR matrix, which satisfies (3.7) and is covariant, $U(A^g) = U(A).g$ for all $g \in G_t$; then we arrive at a contradiction. Inserting the covariance condition into (3.7) we find that $g^\dagger \partial_i g = 0$, which implies that $g$ cannot be an arbitrary gauge transformation but rather only one that does not depend on the parameters $t^i$.

3.2. Abelian universal connections. The NR matrices are very messy to deal with, as can be seen from the details of their construction. However, in the Abelian case one can simplify the discussion and construction somewhat.

Lemma 3.4. There exist fixed real-valued functions $r_i$, for $i = 1, \dots, d - 1$, on $\mathbb{R}^d$ such that any $U(1)$ connection $A$ which is pure gauge at infinity can be expressed as
$$A = -\sum_{i=1}^{d-1} \theta_i\, dr_i^2 + d\bar\theta \tag{3.8}$$
for some functions $\bar\theta$ and $\theta_i$, with $i = 1, \dots, d - 1$.
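Proof. Let $\partial_j r_i^2 = \delta_{ij}\, \partial_i r_i^2$ for $i = 1, \dots, d - 1$ and be such that this derivative does not vanish anywhere except at infinity, i.e. $\partial_i r_i^2 \to 0$ as $|x| \to \infty$. This means that we can invert $\partial_i r_i^2$ everywhere. Let $A$ match $d\bar\theta$ at infinity. Set
$$\theta_i = -\frac{1}{\partial_i r_i^2} \left( A_i - \partial_i \bar\theta \right),$$
and in any case we have that $A_d = \partial_d \bar\theta$, from which we can solve for $\bar\theta$. Plugging back in establishes the lemma. A set of functions that have the required property is $r_i^2 = c_i (\exp(x_i) + 1)^{-1}$, where the $c_i$ are constants.

Corollary 3.5. A universal matrix for any $U(1)$ connection $A$ on $\mathbb{R}^d$, which is pure gauge at infinity, is given by
$$U = \begin{pmatrix} r_1\, e^{-i(\theta_1 + \theta_d)} \\ r_2\, e^{-i(\theta_2 + \theta_d)} \\ \vdots \\ r_{d-1}\, e^{-i(\theta_{d-1} + \theta_d)} \\ r_d\, e^{-i\theta_d} \end{pmatrix},$$
where $r_d^2 = 1 - \sum_{i=1}^{d-1} r_i^2$ and $\theta_d = \bar\theta - \sum_{i=1}^{d-1} r_i^2 \theta_i$.

Proof. We have that
$$A = iU^\dagger dU = \sum_{i=1}^{d-1} r_i^2\, d(\theta_i + \theta_d) + r_d^2\, d\theta_d = \sum_{i=1}^{d-1} r_i^2\, d\theta_i + d\theta_d = -\sum_{i=1}^{d-1} dr_i^2\, \theta_i + d\bar\theta,$$
and by the lemma this is the general form for any such connection.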
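The identity used in the proof of Corollary 3.5 can be checked numerically. The following is a minimal sketch for $d = 3$, not part of the original argument: the test functions `theta`, `theta_bar` and the constants `C` are arbitrary choices made here for illustration, and derivatives are taken by central finite differences. It compares $A_\mu = iU^\dagger \partial_\mu U$ with the right-hand side of (3.8).

```python
import cmath
import math

D = 3                      # dimension d; coordinates x = (x_0, x_1, x_2)
C = (0.4, 0.3)             # hypothetical constants c_i, chosen so that sum r_i^2 < 1

def r2(i, x):
    """r_i^2 = c_i / (exp(x_i) + 1): depends on x_i only."""
    return C[i] / (math.exp(x[i]) + 1.0)

def theta(i, x):
    """Arbitrary smooth test functions theta_i (an assumption of this sketch)."""
    return (i + 1) * math.sin(x[0] + 2.0 * x[1]) + 0.5 * x[2]

def theta_bar(x):
    """Arbitrary smooth test function theta-bar (an assumption of this sketch)."""
    return math.cos(x[0] * x[1]) + 0.3 * x[2] ** 2

def universal_matrix(x):
    """The d x 1 column U of Corollary 3.5, as a list of complex numbers."""
    r2s = [r2(i, x) for i in range(D - 1)]
    theta_d = theta_bar(x) - sum(r * theta(i, x) for i, r in enumerate(r2s))
    col = [math.sqrt(r) * cmath.exp(-1j * (theta(i, x) + theta_d))
           for i, r in enumerate(r2s)]
    col.append(math.sqrt(1.0 - sum(r2s)) * cmath.exp(-1j * theta_d))
    return col

def shifted(x, mu, h):
    y = list(x)
    y[mu] += h
    return y

def partial(f, x, mu, h=1e-5):
    """Central finite difference of a scalar function f along x_mu."""
    return (f(shifted(x, mu, h)) - f(shifted(x, mu, -h))) / (2.0 * h)

def connection_from_U(x, mu, h=1e-5):
    """A_mu = i U^dagger d_mu U, with d_mu U taken by central differences."""
    up = universal_matrix(shifted(x, mu, h))
    um = universal_matrix(shifted(x, mu, -h))
    u0 = universal_matrix(x)
    return 1j * sum(u.conjugate() * (p - m) / (2.0 * h)
                    for u, p, m in zip(u0, up, um))

def connection_from_lemma(x, mu):
    """A_mu = -sum_i theta_i d_mu(r_i^2) + d_mu(theta-bar), as in (3.8)."""
    grad = sum(theta(i, x) * partial(lambda y, i=i: r2(i, y), x, mu)
               for i in range(D - 1))
    return -grad + partial(theta_bar, x, mu)

point = [0.3, -0.7, 1.1]
for mu in range(D):
    print(mu, connection_from_U(point, mu), connection_from_lemma(point, mu))
```

The two columns printed should agree up to finite-difference error, and the imaginary part of the first should be negligible because $\sum_j r_j^2 = 1$.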
The NR matrices, in the $U(1)$ case, are $(2d + 1) \times 1$ matrices. The connection matrix above is a $d \times 1$ matrix and, since a connection form is essentially given by $d$ functions, we may consider this to be an optimal parameterization.

4. Instanton Moduli Space

As already mentioned, a convenient parameterization of the $c_2 = 1$, $SU(n)$ instanton moduli space on $\mathbb{R}^4$ is the one where the $SU(2)$ instanton is embedded in $SU(n)$ and then one acts with rigid $SU(n)$ gauge transformations to obtain the general instanton. From previous sections we know that the metric coming from these gauge transformations will be captured by $g^1$ and the $SU(2)$ part will come from $g^0$. In this section we concentrate on the $SU(2)$ part. The moduli space $M_1^{SU(2)}$ is known to be five-dimensional, with the topology of the open 5-ball, and is thus parametrized by five moduli, namely four coordinates $a^\mu$ (the centre of the instanton) and one scale $\rho$. The parameter $\rho$ measures the size of the instanton, and zero size corresponds to delta function or so-called singular instantons. $\rho = 0$ corresponds to the boundary of the 5-ball, $S^4$, where the moduli $a^\mu$ are the coordinates on the $S^4$. The following sections are relatively brief as all the details are essentially computational and many steps are skipped.
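4.1. The ADHM universal matrix and its metric. The ADHM construction gives the universal connection form for the instanton that we are interested in [1]. These authors find a $4 \times 2$ representation for the rectangular matrices required to parameterize an instanton; namely, the explicit expression of $U$ for the self-dual gauge potential is
$$U(x) = \frac{1}{\sqrt{(x - a)^2 + \rho^2}} \begin{pmatrix} (x - a)^\mu\, \bar\sigma_\mu \\ -\rho\, \mathbb{I}_{2 \times 2} \end{pmatrix}. \tag{4.1}$$
We use the notation
$$\sigma_\mu = (\mathbb{I}_{2 \times 2},\, i\tau_a), \qquad \bar\sigma_\mu = (\mathbb{I}_{2 \times 2},\, -i\tau_a), \tag{4.2}$$
with $\tau_a$ the usual Pauli matrices. A straightforward calculation leads to
$$\mathrm{Tr}\left( \partial_{a^\mu} P\, \partial_{a^\nu} P \right) = \frac{4\delta_{\mu\nu}\rho^2}{\left( (x - a)^2 + \rho^2 \right)^2}, \qquad \mathrm{Tr}\left( \partial_\rho P\, \partial_{a^\nu} P \right) = \frac{4\rho\, (x - a)_\nu}{\left( (x - a)^2 + \rho^2 \right)^2}, \qquad \mathrm{Tr}\left( \partial_\rho P\, \partial_\rho P \right) = \frac{4 (x - a)^2}{\left( (x - a)^2 + \rho^2 \right)^2}. \tag{4.3}$$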
We do not have to work to calculate $\Phi(U)$. By translational invariance one can replace derivatives with respect to $x^\mu$ with derivatives with respect to $-a^\mu$, so that we find
$$\mathrm{Tr}\, \Phi(U) = \frac{16\rho^2}{\left( (x - a)^2 + \rho^2 \right)^2}. \tag{4.4}$$
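The traces in (4.3), and their sum over $\mu$ in (4.4), lend themselves to a quick numerical check. The sketch below is illustrative only and rests on one assumption not restated in this section, namely that $P$ denotes the projector $UU^\dagger$ built from (4.1). Derivatives with respect to the five moduli are taken by central finite differences, and all function names are ours.

```python
import math

I2 = [[1, 0], [0, 1]]
TAU = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]   # Pauli matrices
# sigma-bar_mu = (1, -i tau_a), as in (4.2)
SIGMA_BAR = [I2] + [[[-1j * e for e in row] for row in t] for t in TAU]

def projector(params, x):
    """P = U U^dagger (assumed) for the 4 x 2 matrix U of (4.1); params = (a_0..a_3, rho)."""
    a, rho = params[:4], params[4]
    y = [xi - ai for xi, ai in zip(x, a)]
    norm = math.sqrt(sum(c * c for c in y) + rho * rho)
    top = [[sum(y[m] * SIGMA_BAR[m][i][j] for m in range(4)) / norm
            for j in range(2)] for i in range(2)]
    bottom = [[-rho * I2[i][j] / norm for j in range(2)] for i in range(2)]
    u = top + bottom
    return [[sum(u[i][k] * complex(u[j][k]).conjugate() for k in range(2))
             for j in range(4)] for i in range(4)]

def d_projector(params, x, idx, h=1e-5):
    """Central finite difference of P with respect to params[idx]."""
    plus, minus = list(params), list(params)
    plus[idx] += h
    minus[idx] -= h
    pp, pm = projector(plus, x), projector(minus, x)
    return [[(pp[i][j] - pm[i][j]) / (2 * h) for j in range(4)] for i in range(4)]

def trace_product(m1, m2):
    return sum(m1[i][j] * m2[j][i] for i in range(4) for j in range(4))

def traces(params, x):
    """Numerical Tr(d_i P d_j P) for all pairs of the five moduli (a_0..a_3, rho)."""
    dps = [d_projector(params, x, idx) for idx in range(5)]
    return [[trace_product(dps[i], dps[j]).real for j in range(5)] for i in range(5)]

def expected_traces(params, x):
    """The closed forms of (4.3), arranged in the same 5 x 5 layout."""
    a, rho = params[:4], params[4]
    y = [xi - ai for xi, ai in zip(x, a)]
    y2 = sum(c * c for c in y)
    n4 = (y2 + rho * rho) ** 2
    e = [[0.0] * 5 for _ in range(5)]
    for mu in range(4):
        e[mu][mu] = 4 * rho * rho / n4
        e[mu][4] = e[4][mu] = 4 * rho * y[mu] / n4
    e[4][4] = 4 * y2 / n4
    return e

params = [0.2, -0.1, 0.4, 0.3, 0.8]      # arbitrary test values for (a, rho)
x = [1.0, 0.5, -0.3, 0.7]                # arbitrary test point
gap = max(abs(n - e)
          for rn, re in zip(traces(params, x), expected_traces(params, x))
          for n, e in zip(rn, re))
print("largest deviation from (4.3):", gap)
```

Summing the four diagonal entries for the $a^\mu$ directions reproduces the right-hand side of (4.4).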
Our next task is to determine the metric $g^{0,\alpha}$. For $\alpha > 1/2$ the integrals converge and we consider $\alpha$ in this range. By rotational invariance the integral defining $g^{0,\alpha}_{\rho a^\mu}$ is zero. Likewise, by translational invariance, the other integrals do not depend on $a^\mu$. The metric on the moduli space is, therefore, of the form
$$ds^2 = \rho^{2(1 - \alpha)} A(\alpha) \left( d\vec{a}^{\,2} + B(\alpha)\, d\rho^2 \right), \tag{4.5}$$
where the dependence on $\rho$ is determined by dimensional arguments. The coefficients $A(\alpha)$ and $B(\alpha)$ are determined on doing the integrals and they are both non-zero, thus this is a metric. When $\alpha = 1$ this is proportional to the $L^2$ metric. When $\alpha = 4/2 = 2$, $ds^2$ is proportional to the $AdS_5$ metric, and this had to be so as a consequence of Proposition 2.7. Thus with $\alpha = 2$ the universal metric exhibits the nice feature of the information metric, namely that it is Einstein. For completeness we list the values of the coefficients:
$$A(\alpha) = 4^{2\alpha + 1} \pi^2\, [2\alpha - 1]\, \frac{\Gamma(2\alpha - 1)}{\Gamma(2(\alpha + 1))}, \qquad B(\alpha) = \frac{2}{2\alpha - 1}.$$
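The listed coefficients can be compared with a direct numerical integration. The sketch below assumes, as our reading of the definition referred to as (2.5), that the components of $g^{0,\alpha}$ are the integrals over $\mathbb{R}^4$ of the $\alpha$-th power of the damping factor (4.4) times the traces (4.3). It sets $\rho = 1$, uses the substitution $|x - a| = \tan t$ to reduce the radial integrals to a finite interval, and applies a simple midpoint rule. The function names are ours.

```python
import math

def angular_integral(p, q, n=20000):
    """Midpoint rule for the integral of sin^p(t) cos^q(t) over [0, pi/2]."""
    h = (math.pi / 2) / n
    return h * sum(math.sin((j + 0.5) * h) ** p * math.cos((j + 0.5) * h) ** q
                   for j in range(n))

def metric_coefficients_numeric(alpha):
    """Return (A, B) from the radial integrals at rho = 1.

    Assumes g_ij = int d^4x (damping factor)^alpha Tr(d_i P d_j P), with the
    damping factor of (4.4) and the traces of (4.3); 2 pi^2 is the volume of
    the unit 3-sphere.
    """
    prefactor = 4.0 ** (2 * alpha + 1) * 2.0 * math.pi ** 2
    g_aa = prefactor * angular_integral(3, 4 * alpha - 1)   # one a^mu direction
    g_rr = prefactor * angular_integral(5, 4 * alpha - 3)   # the rho direction
    return g_aa, g_rr / g_aa

def metric_coefficients_closed(alpha):
    """The closed forms of A(alpha) and B(alpha) listed in the text."""
    a = (4.0 ** (2 * alpha + 1) * math.pi ** 2 * (2 * alpha - 1)
         * math.gamma(2 * alpha - 1) / math.gamma(2 * (alpha + 1)))
    return a, 2.0 / (2 * alpha - 1)

for alpha in (1.0, 1.5, 2.0):
    print(alpha, metric_coefficients_numeric(alpha), metric_coefficients_closed(alpha))
```

The midpoint rule is only reliable here when the exponent $4\alpha - 3$ is non-negative; for $1/2 < \alpha < 3/4$ the integrand has an integrable endpoint singularity and a better quadrature would be needed.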
4.2. The NR universal matrix and its metric. The construction of the NR matrix is somewhat involved, so here we will outline some of the ingredients and leave the details for [5]. For the instanton moduli space $M_1^{SU(2)}$, the explicit expression for the connection of interest, $A_\mu$, is given by
$$A_\mu = \eta^a_{\mu\nu}\, \frac{(x - a)^\nu}{(x - a)^2 + \rho^2}\, \tau_a, \tag{4.6}$$
where $\eta^a_{\mu\nu}$ are the 't Hooft eta-symbols, a basis for the self-dual two-forms on $\mathbb{R}^4$, and $\tau_a$ are the Pauli matrices. The basis of positive definite Hermitian matrices $f_r$ is chosen to be $f_i = (\tau_i + 3 I_2)/4$ for $i = 1, 2, 3$ and $f_4 = I_2$. The $\lambda_{r,\mu}$ are given by the expressions
$$\lambda_{i,\mu} = 4\, \eta^i_{\mu\nu}\, \frac{(x - a)^\nu}{(x - a)^2 + \rho^2}, \qquad i = 1, 2, 3, \tag{4.7}$$
$$\lambda_{4,\mu} = -\frac{3}{4} \sum_{i=1}^{3} \lambda_{i,\mu}. \tag{4.8}$$
One can see directly from the construction of $p_{r,\mu}$ and $q_{r,\mu}$, as spelled out in the proof of the lemma on p. 565 of [4], that these functions depend on the position of the instanton, $a^\mu$, only through the combination $(x - a)^\mu$. This is due to the fact that the $a_{r,\mu}$ (that appear on p. 566), for the single $SU(2)$ instanton, are functions of the scale $\rho$ and not of $a^\mu$. Unfortunately, the exponential factors $e^{i x^\mu}$ break the translation invariance. The NR matrices are not designed to preserve any particular symmetry of a connection. Fortunately, one can just as well take the exponential factors to depend on the combination $(x - a)$ and the proof of Lemma 3.1 goes through unchanged. One should note that in this case Lemma 3.3 no longer applies, but a similar result holds.

One important aspect in the present situation that must be mentioned is that powers of $\Phi(U)$ as given in (2.5) will not dampen. However, we can take advantage of the freedom to pick the damping factor to be, for example,
$$\Phi(U) = \left| {*}^{-1}\, \mathrm{Tr}\left( F_A * F_A \right) \right|^{1/2}. \tag{4.9}$$
Consequently, once more by dimensional arguments, with the damping factor given by (4.9) the $g^{0,\alpha}$ metric will take the form given in (4.5) and only the coefficients $A(\alpha)$ and $B(\alpha)$ need to be determined. This agreement of $g^0$, up to change of parameters, of the ADHM and NR universal matrices is not a 'generic' situation. The proof of the lemma just cited holds if the $a_{r,\mu}$ are replaced by $a_{r,\mu} + k_{r,\mu}$, where the $k_{r,\mu}$ are positive functions. For example one could consider $k_{r,\mu} = c_{r,\mu} \exp(-d_{r,\mu}\, a^2/\rho^2)$ with $c_{r,\mu}$ and $d_{r,\mu}$ positive constants satisfying suitable conditions so that $h_r$ is well defined. In this case the universal matrix would lead to a metric with a highly non-trivial dependence on $a^2$.

5. Conclusions

There are many more metrics that one can form from the universal matrices $U$. For example, both the $L^2$ and information metrics can be written in terms of $U$. One can also construct other 'damping' factors. If we set $\phi = {*}^{-1}\, dP * dP$,
then powers of terms of the form $\mathrm{Tr}\, \phi^{\alpha_1} \cdots \mathrm{Tr}\, \phi^{\alpha_k}$, with $\alpha_i$ positive integers, give us more positive definite damping factors. Proposition 2.7 holds if one replaces $\Phi$ with the damping factors we have just introduced.

One question that has not been addressed is: how are metrics that come from different parameterizations, meaning different universal matrices, related? In some cases are they related simply by a diffeomorphism? We do not have an answer to this series of questions. However, there is a set of cases in this direction where we have a simple result.

Proposition 5.1. Let $U$ be an $m \times n$ universal matrix and $M$ a $k \times n$ universal matrix with $k \geq m$ so that $A_\mu = iU^\dagger \partial_\mu U = iM^\dagger \partial_\mu M$. If
$$M = \frac{1}{\sqrt{N}} \begin{pmatrix} U \\ U \\ \vdots \\ U \end{pmatrix},$$
with $k = Nm$, then the universal metrics of $U$ and $M$ agree.

We have also not investigated the geometry of the metrics that have been introduced. This appears to require a more detailed knowledge of the moduli space that one is dealing with, as for example in the case of the instanton moduli space of the previous section.

Many gauge theory moduli spaces also involve 'matter' fields and consequently any metric on the moduli space would probably need to involve those objects as well. Suppose $\Psi$ is such a field, that is, a section of some bundle associated to the principal bundle (tensored with other bundles). Think of $\Psi$ as being valued in some tensor product representation $V \otimes \cdots \otimes V$, where $V$ is the $n$-dimensional representation of $SU(n)$. In this case the covariant derivative is $d_A \Psi = U^\dagger \otimes \cdots \otimes U^\dagger\, d\,(U \otimes \cdots \otimes U\, \Psi)$, and one can form gauge invariant combinations $U \otimes \cdots \otimes U\, \Psi$ and from this construct gauge invariant terms to be added to the universal metric.

Finally, we would also like to know if there exists a construction which produces equivariant universal matrices. We note that the $U$ matrices in the Abelian case in Corollary 3.5 are indeed equivariant.

Acknowledgements. It is a pleasure to thank M. Blau, K. Narain, M.S. Narasimhan and T. Ramadas for many useful discussions. F. Massamba would like to thank the Abdus Salam ICTP for a fellowship. This research was supported in part by EEC contract HPRN-CT-2000-00148.
References
1. Atiyah, M., Drinfeld, V., Hitchin, N., Manin, Y.: Construction of Instantons. Phys. Lett. 65A, 185–187 (1978)
2. Blau, M., Narain, K.S., Thompson, G.: Instantons, the Information Metric and the AdS/CFT Correspondence. http://arxiv.org/abs/hep-th/0108122, 2001
3. Hitchin, N.J.: The Geometry and Topology of Moduli Space. In: Global Geometry and Mathematical Physics, Lecture Notes in Mathematics 1451, Heidelberg: Springer, 1988, pp. 1–48
4. Narasimhan, M.S., Ramanan, S.: Existence of Universal Connections. Am. J. Math. 83, 563–572 (1961)
5. Massamba, F.: Ph.D. Thesis. On Certain Aspects of Moduli Spaces of Connections. In preparation

Communicated by M.R. Douglas
Commun. Math. Phys. 253, 337–370 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1192-6
Communications in
Mathematical Physics
Toric Self-Dual Einstein Metrics as Quotients Charles P. Boyer1 , David M.J. Calderbank2 , Krzysztof Galicki1 , Paolo Piccinni3 1
Department of Mathematics and Statistics, University of New Mexico, Albuquerque, NM 87131, USA. E-mail:
[email protected];
[email protected] 2 School of Mathematics, University of Edinburgh, King’s Buildings, Mayfield Road, Edinburgh EH9 3JZ, Scotland. E-mail:
[email protected] 3 Università degli Studi di Roma, “La Sapienza”, Dipartimento di Matematica, Piazzale Aldo Moro, 00185 Roma, Italia. E-mail:
[email protected] Received: 19 November 2003 / Accepted: 20 April 2004 Published online: 14 October 2004 – © Springer-Verlag 2004
Abstract: We use the quaternion Kähler reduction technique to study old and new self-dual Einstein metrics of negative scalar curvature with at least a two-dimensional isometry group, and relate the quotient construction to the hyperbolic eigenfunction Ansatz. We focus in particular on the (semi-)quaternion Kähler quotients of (semi-)quaternion Kähler hyperboloids, analysing the completeness and topology, and relating them to the self-dual Einstein Hermitian metrics of Apostolov–Gauduchon and Bryant.

During the preparation of this work the first and third authors were supported by NSF grant DMS0203219. The second author was supported by the Leverhulme Trust, the William Gordon Seggie Brown Trust and an EPSRC Advanced Fellowship. The fourth author was supported by the MIUR Project “Proprietà Geometriche delle Varietà Reali e Complesse”. The authors are also grateful for support from EDGE, Research Training Network HPRN-CT-2000-00101, funded by the European Human Potential Programme.

Introduction

There has been quite a lot of interest in self-dual Einstein (SDE) metrics in dimension four. In the negative scalar curvature case, such metrics naturally generalize the symmetric metrics on the real 4-ball $\mathcal{H}^4 \cong \mathbb{H}\mathcal{H}^1$ (the real or quaternionic hyperbolic metric) and on the complex 2-disc $\mathbb{C}\mathcal{H}^2$ (the complex hyperbolic or Bergman metric). A rather general construction of negative SDE metrics was offered by C. LeBrun in 1982 [LeB82]. LeBrun observed that for any real-analytic conformal structure $[h]$ on $S^3$, there is a Riemannian metric $g_0$ defined on some open neighborhood of $S^3 \subset \mathbb{R}^4$ such that $g_0$ is self-dual, the restriction of it to $S^3$ is in the conformal class $[h]$, and moreover $g = f^{-2} g_0$ is Einstein for some defining function $f$ for $S^3$ in this open neighbourhood. However, this result is purely local: the Einstein metric it defines typically cannot be extended to a complete metric everywhere inside the ball. Nevertheless, in later work [LeB91], LeBrun showed that the moduli space of negative complete SDE metrics on a ball is infinite dimensional, which led him to formulate
a conjecture. A conformal structure on $S^3$ is said to have positive frequency if it bounds a complete SDE metric on the ball and negative frequency if it bounds a complete anti-self-dual Einstein (ASDE) metric on the ball. The conjecture then asserts that near the standard conformal structure on $S^3$ (in an appropriate sense) the moduli spaces of positive and negative frequency subspaces are transverse (i.e., their tangent spaces at the standard conformal structure give a direct sum decomposition). The positive frequency conjecture is now proven, thanks to the remarkable work of O. Biquard [Biq02] (see also [Biq00, Biq99]). However, this still provides very little information about which conformal structures on $S^3$ bound complete SDE metrics on the ball.

The known examples are rather few. Apart from the hyperbolic metric, the first such metrics were obtained by H. Pedersen in [Ped86]: the conformal class $[h]$ on $S^3$ is represented by a Berger sphere metric $\sigma_1^2 + \sigma_2^2 + \lambda^2 \sigma_3^2$ (where $\sigma_1$, $\sigma_2$ and $\sigma_3$ are the standard left-invariant 1-forms on $S^3 \cong Sp(1)$ and $\lambda$ is a nonzero constant), and the corresponding complete SDE metric on the 4-ball is equally explicit. Later, N. Hitchin [Hit95] generalized this result by showing that any left-invariant conformal structure on $S^3$ determines a complete SDE metric on the ball, although now explicitness requires elliptic rather than elementary functions.

The reason that these metrics are tractable is the presence of symmetry. The real and complex hyperbolic metrics have isometry groups $SO(4, 1)$ and $U(2, 1)$ respectively, while the Pedersen metrics on the ball have isometry group $U(2)$. There are also (related) $U(2)$-invariant SDE metrics on complex line bundles $O(n) \to \mathbb{C}P^1$, with $n \geq 3$, called the Pedersen–LeBrun metrics [LeB88, Hit95]. Hitchin [Hit95] actually classifies all $SU(2)$-invariant SDE metrics, and proves that the complete examples of negative scalar curvature consist only of the real and complex hyperbolic metrics, the Pedersen and Pedersen–LeBrun metrics, and SDE metrics on the ball associated to a left-invariant conformal or CR structure on $S^3 \cong SU(2)$.

Recent progress on SDE metrics with symmetry concerns the much smaller symmetry group $T^2 \cong S^1 \times S^1$ (and its non-compact forms): such SDE metrics are said to be toric. In the positive case, these metrics can be constructed using the Galicki–Lawson quaternion Kähler reduction [GL88] of the quaternionic projective space $\mathbb{H}P^n$ by the action of an $(n - 1)$-dimensional subtorus of the maximal torus of $Sp(n + 1)$. Although the only positive SDE metrics on compact manifolds are the standard metrics on $S^4$ and $\mathbb{C}P^2$, these methods produce positive SDE metrics on compact orbifolds. The general such metrics were described by C. Boyer et al. in [BGMR98], following the construction by Galicki and Lawson of positive SDE metrics on weighted projective spaces [GL88]. It is natural to conjecture that all positive compact SDE orbifolds arise in this way: this would be similar to a related result of R. Bielawski [Bie99] stating that all toric 3-Sasakian manifolds (in any dimension) are the 3-Sasakian quotients considered in [BGMR98].

Another impetus to study toric SDE metrics comes from the recent work [CP02] of D. Calderbank and H. Pedersen, who proved that if a (positive or negative) SDE metric admits two commuting Killing vector fields, it can be expressed locally in an explicit form depending on a single function $F$ on the upper-half plane, where $F$ is an eigenfunction of the hyperbolic Laplacian with eigenvalue 3/4. Conversely, any metric of this form is an SDE metric. Calderbank and Pedersen then showed explicitly how the positive SDE metrics of Galicki–Lawson and Boyer et al.
arise from such an eigenfunction F , and tied together a number of examples of negative SDE metrics. The (locally) toric SDE metrics of [CP02] also relate to a recent study by V. Apostolov and P. Gauduchon of SDE Hermitian metrics [AG02]. SDE metrics with symmetry are
conformal to metrics which are Kähler with the opposite orientation (hence scalar-flat), but it is much rarer for an SDE metric to admit a Hermitian structure inducing the given orientation. Nevertheless, many of the examples of SDE metrics discussed so far are Hermitian in this sense. Other non-locally symmetric examples of SDE Hermitian metrics include cohomogeneity one metrics under the action of $\mathbb{R} \times \mathrm{Isom}(\mathbb{R}^2)$, $U(1, 1)$, and $U(2)$ constructed by A. Derdziński [Der81] (the $U(2)$ case being the Pedersen–LeBrun metrics mentioned above). Apostolov and Gauduchon show, quite generally, that SDE Hermitian metrics always admit two distinguished commuting Killing vector fields, and that if the induced local $\mathbb{R}^2$ action does not have two dimensional generic orbits, then the isometry group necessarily acts transitively or with cohomogeneity one. In either case, they show that SDE Hermitian metrics are toric, hence given locally by the metrics of Calderbank and Pedersen.

The emergence of non-trivial isometries for SDE Hermitian metrics is perhaps less surprising in view of a link with recent work of R. Bryant on Bochner-flat Kähler metrics [Bry01]. In four dimensions, the Bochner tensor coincides with the anti-self-dual Weyl tensor and so Kähler metrics with vanishing Bochner tensor are just self-dual Kähler metrics. Apostolov and Gauduchon show that SDE Hermitian metrics are necessarily conformal to self-dual Kähler metrics, hence they belong to the class of metrics studied by Bryant. In his impressive paper, Bryant obtains an explicit local classification of Bochner-flat Kähler metrics and studies in detail their global geometry. The symmetries here arise naturally from a differential system, which amounts to the realisation of Bochner-flat Kähler $2n$-manifolds as local quotients of the flat CR structure on $S^{2n+1}$. Bryant’s work not only provides an alternative way of classifying SDE Hermitian metrics locally, but it also gives insight into the question of completeness, and he discusses some examples in an appendix to his paper.

In spite of this work (and in contrast to the case of $SU(2)$ symmetry, where Hitchin provides a classification) the issue of completeness for negative SDE Hermitian metrics is not yet fully explored, and for the toric SDE metrics in general, the complete examples are far from understood. In fact, there are very many examples. In [CS03], Calderbank and M. A. Singer constructed examples of complete SDE metrics on resolutions of complex cyclic singularities and showed that the moduli of such metrics is (continuously) infinite dimensional. In particular these metrics can have arbitrarily large second Betti number (cf. [BGMR98] in the positive case). Examples of infinite topological type are also known. The simplest examples in [CS03] are quaternion Kähler quotients of $\mathbb{H}\mathcal{H}^m$ generalizing the Pedersen–LeBrun metrics on $O(n)$ ($n \geq 3$), and may be viewed as negative analogues of the compact orbifold SDE metrics of Galicki–Lawson and Boyer et al.

In fact many of the metrics discussed in this introduction occur as quaternion Kähler quotients [Gal87a, GL88]. For positive toric SDE metrics, compact orbifold examples are well understood (as we have discussed). For negative toric SDE metrics, many examples have been introduced as quotients by Galicki [Gal87b, Gal91], but the quotient approach has not been thoroughly explored. Our purpose in this work is to develop systematically the quotient approach to toric SDE metrics, which has a number of advantages. In addition to producing an abundance of examples locally, the quotient approach provides more direct insight into the global behaviour of such metrics (completeness or topology), as well as a systematic way to organise these examples into families.

In this paper we set the initial stage for such a systematic study by considering the toric SDE metrics arising as (semi-)quaternion Kähler quotients of 8-dimensional quaternionic hyperbolic space $\mathbb{H}\mathcal{H}^2$ and its indefinite signature analogue $\mathbb{H}\mathcal{H}^{1,1}$ by a one
dimensional group action. A given reduction may be encoded by the adjoint orbits in $sp(1, 2)$ of the generator of the action, which in turn may be classified using work of Burgoyne and Cushman [BC77]. There are essentially four distinct possible types of generator:
(i) elements belonging to the Lie algebra of a maximal torus;
(ii) elements in a Cartan subalgebra with exponential image $S^1 \times \mathbb{R}$;
(iii) non-semisimple elements with two step nilpotent part;
(iv) non-semisimple elements with three step nilpotent part.

The quaternion Kähler quotients by generators in the first two classes correspond to the 3-pole solutions discussed in [CP02], but we present a detailed and self-contained analysis of the completeness and topology of the quotient. The other two classes may be regarded as limiting cases, but the geometry of the quotient is less well studied. According to Apostolov and Gauduchon [AG02], quaternion Kähler quotients of $\mathbb{H}\mathcal{H}^2$ (and $\mathbb{H}P^2$) by one dimensional group actions are SDE Hermitian, and their argument applies also to quotients of $\mathbb{H}\mathcal{H}^{1,1}$. Therefore all of the quotients we discuss in this paper are SDE Hermitian manifolds. Furthermore, by comparing our examples with the classification of self-dual Kähler metrics by Bryant [Bry01], we see that in fact all SDE Hermitian metrics with nonzero scalar curvature are (at least locally) quaternion Kähler quotients of $\mathbb{H}P^2$, $\mathbb{H}\mathcal{H}^2$ or $\mathbb{H}\mathcal{H}^{1,1}$.

In addition to studying the quotients of $\mathbb{H}\mathcal{H}^2$ and $\mathbb{H}\mathcal{H}^{1,1}$ in detail, we develop some aspects of the general theory of quotients of $\mathbb{H}\mathcal{H}^{k,l}$ by $(k + l - 1)$-dimensional Abelian semi-quaternion Kähler group actions. In particular we show how the quotient metrics are related to the hyperbolic eigenfunction Ansatz, simplifying and extending a result of [CP02].

1. Semi-Quaternionic Projective Spaces

Definition 1.1. Let $(M^{4n}, g)$ be a semi-Riemannian manifold of signature $(4\nu, 4n - 4\nu)$. We say that $(M^{4n}, g)$ is semi-quaternion Kähler if the holonomy group of the metric connection is a subgroup of $Sp(\nu, n - \nu) \cdot Sp(1)$ when $n > 1$. As usual, when $n = 1$ we extend our definition and require that $(M, g)$ be self-dual and Einstein. We will always suppose that the scalar curvature of $(M, g)$ is nonzero. We refer to $\nu$ as the quaternionic index of $M$.

Exactly as in the Riemannian case, the above definition implies the existence of the quaternion Kähler 4-form $\Omega$, which is parallel with respect to the Levi-Civita connection and gives rise to the quaternionic rank 3 bundle $\mathcal{V}$ over $M$.
The simplest examples of semi-quaternion Kähler manifolds are obtained as follows. Let
$$\mathbb{H}^{k,l} = \{ u = (a, b) \mid a = (u_0, \dots, u_{k-1}),\ b = (u_k, \dots, u_{k+l-1}) \}$$
be the set of all quaternionic $(n + 1)$-vectors, where $n + 1 = k + l$, together with the symmetric form
$$F_{k,l}(u^1, u^2) = -\sum_{\alpha=0}^{k-1} \bar{u}^1_\alpha u^2_\alpha + \sum_{\alpha=k}^{k+l-1} \bar{u}^1_\alpha u^2_\alpha = -\langle a_1, a_2 \rangle + \langle b_1, b_2 \rangle. \tag{1.1}$$
Here $\langle a_1, a_2 \rangle$ denotes the standard quaternionic-Hermitian inner product on $\mathbb{H}^k$ and we shall denote the associated norm by $\|a\|^2 = \langle a, a \rangle$. The form $F_{k,l}$ defines the flat semi-Riemannian metric of signature $(4k, 4l)$ on $\mathbb{H}^{k,l}$.
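Definition 1.2. Let $H_{k,l}(\epsilon) = \{ (a, b) \in \mathbb{H}^{k,l} \mid -\|a\|^2 + \|b\|^2 = \epsilon \}$.
(i) $H_{k,l}(-1) \cong S^{4k-1} \times \mathbb{H}^l$, where $k > 0$, is a semi-Riemannian submanifold of signature $(4k - 1, 4l)$ called the pseudohyperboloid.
(ii) $H_{k,l}(+1) \cong \mathbb{H}^k \times S^{4l-1}$, where $l > 0$, is a semi-Riemannian submanifold of signature $(4k, 4l - 1)$ called the pseudosphere.
(iii) $H_{k,l}(0) \cong S^{4k-1} \times S^{4l-1} \times \mathbb{R}/\!\sim$, where $k, l > 0$ and $\sim$ identifies $S^{4k-1} \times S^{4l-1} \times \{0\}$ with a point, is called the null cone.

Let $k + l = n + 1$ and let $Sp(k, l) \subset GL(n + 1, \mathbb{H})$ be the subgroup which preserves the form $F_{k,l}$. It is well-known that $H_{k,l}(\pm 1)$ are spaces of constant curvature and, as homogeneous spaces of the semi-symplectic group $Sp(k, l)$, they are
$$H_{k,l}(\epsilon) = \begin{cases} Sp(k, l)/Sp(k, l - 1) & \text{when } l > 0 \text{ and } \epsilon = +1, \\ Sp(k, l)/Sp(k - 1, l) & \text{when } k > 0 \text{ and } \epsilon = -1. \end{cases}$$
Consider $\mathbb{H}^{k,l}_+ = \{ (a, b) \in \mathbb{H}^{k,l} \mid \|a\|^2 < \|b\|^2 \}$ and $\mathbb{H}^{k,l}_- = \{ (a, b) \in \mathbb{H}^{k,l} \mid \|a\|^2 > \|b\|^2 \}$. Also, let us write $\mathbb{H}^{k,l}_0$ for $H_{k,l}(0)$ as an alternative notation. We can then write
$$\mathbb{H}^{k,l} = \mathbb{H}^{k,l}_- \cup \mathbb{H}^{k,l}_0 \cup \mathbb{H}^{k,l}_+. \tag{1.2}$$
After removing $0 \in \mathbb{H}^{k,l}$ we consider the action of $\mathbb{H}^*$ on (1.2) by right multiplication.

Definition 1.3. Let $\mathbb{H}^{k,l}$ be the quaternionic vector space with semi-hyperkähler metric of signature $(4k, 4l)$. We define the following projective spaces.
(i) $\mathbb{H}\mathcal{H}^{k-1,l} := P_{\mathbb{H}}(\mathbb{H}^{k,l}_-) = \mathbb{H}^{k,l}_-/\mathbb{H}^* = H_{k,l}(-1)/Sp(1)$,
(ii) $\mathbb{H}\mathcal{H}^{k,l-1} := P_{\mathbb{H}}(\mathbb{H}^{k,l}_+) = \mathbb{H}^{k,l}_+/\mathbb{H}^* = H_{k,l}(+1)/Sp(1)$,
(iii) $P_{\mathbb{H}}(\mathbb{H}^{k,l}_0) = (\mathbb{H}^{k,l}_0 \setminus \{0\})/\mathbb{H}^* \cong S^{4k-1} \times_{Sp(1)} S^{4l-1}$.

If we make a choice of $\mathbb{C}^* \subset \mathbb{H}^*$ we also have complex ‘projective’ spaces.

Definition 1.4. Let $\mathbb{H}^{k,l}$ be the quaternionic vector space with semi-hyperkähler metric of signature $(4k, 4l)$. Let $\mathbb{C}^* \subset \mathbb{H}^*$. We define
(i) $P_{\mathbb{C}}(\mathbb{H}^{k,l}_-) = \mathbb{H}^{k,l}_-/\mathbb{C}^* = H_{k,l}(-1)/U(1)$,
(ii) $P_{\mathbb{C}}(\mathbb{H}^{k,l}_+) = \mathbb{H}^{k,l}_+/\mathbb{C}^* = H_{k,l}(+1)/U(1)$,
(iii) $P_{\mathbb{C}}(\mathbb{H}^{k,l}_0) = (\mathbb{H}^{k,l}_0 \setminus \{0\})/\mathbb{C}^* \cong S^{4k-1} \times_{U(1)} S^{4l-1}$.

Proposition 1.5. As homogeneous spaces of the semi-symplectic group,
$$P_{\mathbb{H}}(\mathbb{H}^{k,l}_-) = \frac{Sp(k, l)}{Sp(1) \times Sp(k - 1, l)}, \qquad P_{\mathbb{C}}(\mathbb{H}^{k,l}_-) = \frac{Sp(k, l)}{U(1) \times Sp(k - 1, l)}, \qquad k > 0,$$
$$P_{\mathbb{H}}(\mathbb{H}^{k,l}_+) = \frac{Sp(k, l)}{Sp(k, l - 1) \times Sp(1)}, \qquad P_{\mathbb{C}}(\mathbb{H}^{k,l}_+) = \frac{Sp(k, l)}{Sp(k, l - 1) \times U(1)}, \qquad l > 0. \tag{1.3}$$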
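Furthermore, we have the natural fibrations
$$\mathbb{H}^{k,l}_- \to P_{\mathbb{C}}(\mathbb{H}^{k,l}_-) \to P_{\mathbb{H}}(\mathbb{H}^{k,l}_-), \qquad \mathbb{H}^{k,l}_+ \to P_{\mathbb{C}}(\mathbb{H}^{k,l}_+) \to P_{\mathbb{H}}(\mathbb{H}^{k,l}_+), \tag{1.4}$$
which can be glued together along the common boundary
$$\begin{array}{ccl} \mathbb{C}^* & \to & \mathbb{H}^{k,l}_0 \setminus \{0\} \\ & & \downarrow \\ S^2 & \to & P_{\mathbb{C}}(\mathbb{H}^{k,l}_0) \cong S^{4k-1} \times_{S^1} S^{4l-1} \\ & & \downarrow \\ & & P_{\mathbb{H}}(\mathbb{H}^{k,l}_0) \cong S^{4k-1} \times_{S^3} S^{4l-1} \end{array} \tag{1.5}$$
to give $\mathbb{H}P^{k+l-1}$, its twistor space $\mathbb{C}P^{2k+2l-1}$ and the vector space $\mathbb{H}^{k+l} \setminus \{0\}$. Note that $P_{\mathbb{H}}(\mathbb{H}^{k,l}_0) \cong S^{4k-1} \times_{S^3} S^{4l-1}$ is both an $S^{4k-1}$-bundle over $\mathbb{H}P^{l-1}$ and an $S^{4l-1}$-bundle over $\mathbb{H}P^{k-1}$. The following proposition is straightforward.

Proposition 1.6. The manifolds $P_{\mathbb{H}}(\mathbb{H}^{k,l}_-)$, $k > 0$, are semi-quaternion Kähler with holonomy group $Sp(k - 1, l) \cdot Sp(1)$, index $\nu = k - 1$, negative scalar curvature, twistor space $P_{\mathbb{C}}(\mathbb{H}^{k,l}_-)$, and Swann bundle $\mathbb{H}^{k,l}_-$; furthermore, $P_{\mathbb{H}}(\mathbb{H}^{k,l}_-)$ is the quaternionic $\mathbb{H}^l$-bundle over the standard quaternionic projective space $\mathbb{H}P^{k-1}$ associated to the quaternionic Hopf fibration. The manifolds $P_{\mathbb{H}}(\mathbb{H}^{k,l}_+)$, $l > 0$, are semi-quaternion Kähler with holonomy group $Sp(k, l - 1) \cdot Sp(1)$, index $\nu = k$, positive scalar curvature, twistor space $P_{\mathbb{C}}(\mathbb{H}^{k,l}_+)$, and Swann bundle $\mathbb{H}^{k,l}_+$; furthermore, $P_{\mathbb{H}}(\mathbb{H}^{k,l}_+)$ is the quaternionic $\mathbb{H}^k$-bundle over the standard quaternionic projective space $\mathbb{H}P^{l-1}$ associated to the quaternionic Hopf fibration. Topologically, $\mathbb{H}\mathcal{H}^{k-1,l} = P_{\mathbb{H}}(\mathbb{H}^{k,l}_-)$ and $\mathbb{H}\mathcal{H}^{k,l-1} = P_{\mathbb{H}}(\mathbb{H}^{k,l}_+)$ are the components of $\mathbb{H}P^{k+l-1} \setminus P_{\mathbb{H}}(\mathbb{H}^{k,l}_0)$.

The bundle structure of $P_{\mathbb{H}}(\mathbb{H}^{k,l}_-)$ is the one associated to the right quaternionic multiplication of the quaternionic vector space $\mathbb{H}^k$ by the unit quaternions. Explicitly, let $(a, b) \in H_{k,l}(-1) \subset \mathbb{H}^{k,l}_-$. Let us identify $H_{k,l}(-1) \cong S^{4k-1}(1) \times \mathbb{H}^l$ via the map
$$f(a, b) = (v, b) = \left( \frac{a}{\sqrt{\|b\|^2 + 1}},\ b \right).$$
Then $\sigma \in Sp(1)$ acting on $S^{4k-1}(1) \times \mathbb{H}^l$ by $(v, b) \mapsto (v\sigma, b\sigma)$ gives the quotient which can be identified with $P_{\mathbb{H}}(\mathbb{H}^{k,l}_-)$. Hence, as stated in Proposition 1.6, $P_{\mathbb{H}}(\mathbb{H}^{k,l}_-)$ is an $\mathbb{H}^l$-bundle (quaternionic vector bundle) over $\mathbb{H}P^{k-1}$ associated to the quaternionic Hopf bundle $S^3 \to S^{4k-1} \to \mathbb{H}P^{k-1}$.

Example 1.7. Let $(k, l) = (1, 2)$. Then $P_{\mathbb{H}}(\mathbb{H}^{1,2}_-)$ is simply the unit open 8-ball in $\mathbb{H}^2$. The boundary of this cell, $P_{\mathbb{H}}(\mathbb{H}^{1,2}_0) = S^3 \times_{S^3} S^7 \cong S^7$, is the unit sphere. The space $P_{\mathbb{H}}(\mathbb{H}^{1,2}_+)$ is the $\mathbb{H}^1 \cong \mathbb{R}^4$ bundle over $\mathbb{H}P^1 \cong S^4$ associated to the quaternionic Hopf bundle $S^3 \to S^7 \to S^4$. Viewed another way, $P_{\mathbb{H}}(\mathbb{H}^{1,2}_+)$ is a complement of the unit 8-ball in $\mathbb{H}^2$ with $\mathbb{H}P^1 \cong S^4$ added in at infinity.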
Remark 1.1. Note that the map $\psi:\mathbb H^{k,l}\to\mathbb H^{l,k}$ defined by
$$\psi(u_0,u_1,\dots,u_{k-1},u_k,\dots,u_n)=(u_n,\dots,u_k,u_{k-1},\dots,u_0)\tag{1.6}$$
is the anti-isometry (or metric reversal) which induces anti-isometries $\psi:\mathbb H^{k,l}(\epsilon)\to\mathbb H^{l,k}(-\epsilon)$, i.e., $\psi:\mathbb{H}H^{k,l}\to\mathbb{H}H^{l,k}$. For example, $P_{\mathbb H}(\mathbb H^{n+1,0}_-)$ is diffeomorphic to $\mathbb{H}P^n$ but has negative-definite metric. It can be identified with $P_{\mathbb H}(\mathbb H^{0,n+1}_+)$, which is obviously the usual definition of $\mathbb{H}P^n$, by changing the sign of the metric. As a result we can restrict our discussion to the negative scalar curvature spaces $P_{\mathbb H}(\mathbb H^{k,l}_-)$, $k>0$. This is not natural if one talks about the projective space $P_{\mathbb H}(\mathbb H^{n+1,0}_-)$, but in this paper we will mostly deal with the case $k<n+1$.

We now describe the spaces $(P_{\mathbb H}(\mathbb H^{k,l}_-),g^-_{k,l})$ in inhomogeneous quaternionic coordinates. One needs $k$ quaternionic charts to cover $P_{\mathbb H}(\mathbb H^{k,l}_-)$, namely
$$U_\beta=\{u\in P_{\mathbb H}(\mathbb H^{k,l}_-)\mid u_\beta\neq0\},\qquad\beta=0,\dots,k-1.\tag{1.7}$$
On $U_\beta$ we write
$$x^\beta=(x^\beta_1,\dots,x^\beta_n)=(u_0u_\beta^{-1},\dots,u_{\beta-1}u_\beta^{-1},u_{\beta+1}u_\beta^{-1},\dots,u_nu_\beta^{-1})\in\mathbb H^n.\tag{1.8}$$
Note that (1.1) implies that on $U_\beta$ we have
$$1-F_{k-1,l}(x^\beta,x^\beta)=1+\sum_{\alpha=1}^{k-1}|x^\beta_\alpha|^2-\sum_{\alpha=k}^{n}|x^\beta_\alpha|^2=\frac{1}{|u_\beta|^2}>0.\tag{1.9}$$
Let us denote $F_{k-1,l}$ simply by $\langle *,*\rangle_{k-1,l}$ with the associated semi-norm $\|*\|_{k-1,l}$. Then, on $U_\beta$, $\|x^\beta\|^2_{k-1,l}<1$ and the semi-quaternion Kähler metric $g^-_{k,l}$ reads
$$g^-_{k,l}=\frac{1}{1-\|x^\beta\|^2_{k-1,l}}\Big(\|dx^\beta\|^2_{k-1,l}+\frac{1}{1-\|x^\beta\|^2_{k-1,l}}\,|\langle dx^\beta,x^\beta\rangle_{k-1,l}|^2\Big).\tag{1.10}$$
We will often refer to $u=(u_0,u_1,\dots,u_n)$ as homogeneous coordinates on $P_{\mathbb H}(\mathbb H^{k,l}_-)$.

Example 1.8. It is clear that $P_{\mathbb H}(\mathbb H^{1,n}_-)=\mathbb{H}H^n$ is simply the unit ball in $\mathbb H^n$ with the quaternionic hyperbolic metric. In this case $U_0$ is the only chart, so we have global inhomogeneous coordinates
$$x=(x_1,\dots,x_n)=(u_1u_0^{-1},\dots,u_nu_0^{-1})\in\mathbb H^n\tag{1.11}$$
with the positive definite hyperbolic metric
$$g=\frac{1}{1-|x|^2}\Big(|dx|^2+\frac{1}{1-|x|^2}\,|\langle dx,x\rangle|^2\Big).\tag{1.12}$$
These are not the only examples of semi-quaternion Kähler manifolds, as we shall see. However, many other examples can be obtained by taking semi-quaternionic Kähler quotients of $P_{\mathbb H}(\mathbb H^{k,l}_-)$ by subgroups of $Sp(k,l)$. The quotient construction in the semi-Riemannian case works in a similar way as in the Riemannian case. However, the zero-level set for the moment map need not be a semi-Riemannian submanifold. For example, when $G$ is a 1-parameter subgroup acting on a semi-quaternion Kähler manifold $(M^{4n},g)$ of index $\nu$, then $N=\mu^{-1}(0)\subset M$ can have regions of signature $(4\nu-3,4n-4\nu)$ and $(4\nu,4n-4\nu-3)$ separated by all points in $M$ with $g(V,V)=0$, where $V$ is the vector field of the $G$-action on $M$. Let us call these two regions $N_-=\mu_-^{-1}(0)$ and $N_+=\mu_+^{-1}(0)$. We have

Theorem 1.9. Let $(M^{4n},g)$ be a semi-Riemannian manifold with quaternionic index $\nu$ and $G\subset\mathrm{Isom}(M,g)$ be a one-parameter subgroup of isometries of $M$ preserving the quaternion Kähler 4-form $\Omega$. Let $\mu:M\to V$ be the quaternion Kähler moment map for this action and let $\mu_-^{-1}(0)\subset M$, $\mu_+^{-1}(0)\subset M$ be semi-Riemannian submanifolds of signature $(4\nu-3,4n-4\nu)$ and $(4\nu,4n-4\nu-3)$. If $G$ acts freely and properly on $\mu_\pm^{-1}(0)$, the quotients $M_-=\mu_-^{-1}(0)/G$ and $M_+=\mu_+^{-1}(0)/G$ are semi-quaternion Kähler manifolds of dimension $4n-4$ and quaternionic index $\nu-1$ and $\nu$, respectively.

The situation is even more complex when we choose an arbitrary $G\subset\mathrm{Isom}(M,g)$. In general, depending on how $G$ acts on $M$, one should separate $\mu^{-1}(0)$ into submanifolds of signature $(4\nu-3c,4n-4\nu-3d)$, where $c+d=\dim(G)$, and one could expect quotients of various quaternionic indices ranging from $0$ to $\min(\dim(G),\nu)$. However, in this paper we shall focus our interest on the special case when $M=P_{\mathbb H}(\mathbb H^{k,l}_-)$ or $M=P_{\mathbb H}(\mathbb H^{k,l}_+)$ with $k+l=3$ and $\dim(G)=k+l-2=1$.
When $(k,l)\in\{(0,3),(3,0)\}$ we are in the realm of the $S^1$ reductions of $\mathbb{H}P^2$, which have already been studied in [GL88]: the quotients are orbifold complex weighted projective planes. In the case of $(k,l)=(1,2)$ we have two projective spaces one can consider: $P_{\mathbb H}(\mathbb H^{1,2}_-)=\mathbb{H}H^2$ and $P_{\mathbb H}(\mathbb H^{1,2}_+)=\mathbb{H}H^{1,1}$. However, as described in Example 1.7, these are two pieces of $\mathbb{H}P^2$ cut along a 7-sphere. The choice of $G\subset Sp(1,2)$ simultaneously determines the quaternion Kähler reduction of both $\mathbb{H}H^2$ and $\mathbb{H}H^{1,1}$ by $G$. In fact, the reduction depends only on the conjugacy classes of such 1-parameter subgroups in $Sp(1,2)$. These, on the other hand, are given by adjoint orbits in the Lie algebra $\mathfrak{sp}(1,2)$. For each such adjoint orbit $[\Delta]$ ($\Delta\in\mathfrak{sp}(1,2)$) one can consider a 1-parameter group
$$G(\Delta)=\{A\in Sp(1,2)\mid A=e^{\Delta t},\ t\in\mathbb R\}\tag{1.13}$$
acting on $\mathbb H^{1,2}$ as a subgroup of $Sp(1,2)\subset GL(3,\mathbb H)$. This action descends to an action on
$$\mathbb{H}P^2=\mathbb{H}H^2\cup S^7\cup\mathbb{H}H^{1,1}\tag{1.14}$$
preserving the above decomposition and defining the semi-quaternion Kähler moment maps. Following Swann [Swa91], it is convenient to consider the semi-hyperkähler moment map $\mu:\mathbb H^{1,2}\to\mathrm{Im}(\mathbb H)$ and the corresponding decomposition of the Swann bundle. We then write
$$\mu^{-1}(0)=N_-(\Delta)\cup N_0(\Delta)\cup N_+(\Delta),\tag{1.15}$$
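where $N_-(\Delta)$, $N_0(\Delta)$, $N_+(\Delta)$ are the restrictions of $\mu^{-1}(0)$ to $\mathbb H^{1,2}_-$, $\mathbb H^{1,2}_0$, $\mathbb H^{1,2}_+$, respectively. As we shall see, $N_-(\Delta)$ can be empty, while $N_+(\Delta)$ is never empty. Let $N_-(\Delta)$ be nonempty and suppose $P_{\mathbb H}(N_-(\Delta))=N_-(\Delta)/\mathbb H^*\subset\mathbb{H}H^2$ is a submanifold in the 8-ball $\mathbb{H}H^2$. Further assuming that $G(\Delta)$ acts freely and properly on $P_{\mathbb H}(N_-(\Delta))$, we define the quotient
$$G(\Delta)\to P_{\mathbb H}(N_-(\Delta))\to M_-(\Delta)=G(\Delta)\backslash N_-(\Delta)/\mathbb H^*.\tag{1.16}$$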
It follows that the metric $g(\Delta)$ on $M_-(\Delta)$ obtained by inclusion and submersion in the quotient construction is a complete SDE metric of negative scalar curvature. Its Swann bundle $\mathcal U(M_-(\Delta))=G(\Delta)\backslash N_-(\Delta)$ is a semi-hyperkähler manifold of index 1. Hence, for every $\Delta\in\mathfrak{sp}(1,2)$ such that $P_{\mathbb H}(N_-(\Delta))\subset\mathbb{H}H^2$ and $G(\Delta)$ acts freely and properly on it, we get a negative SDE manifold $(M_-(\Delta),g(\Delta))$. What remains is to enumerate all possible adjoint orbits (this will be done in the next section) and examine all the possible quotients (the following four sections).

The projectivisation $P_{\mathbb H}(N_+(\Delta))\subset P_{\mathbb H}(\mathbb H^{1,2}_+)$, in general, does not need to be a semi-Riemannian submanifold. Let $V(u)=\Delta\cdot u$ be the vector field for the $G(\Delta)$-action on $P_{\mathbb H}(\mathbb H^{1,2}_+)$. Then the norm square of $V$ in the semi-Riemannian metric $g^+_{1,2}$ can be negative, positive, or it can vanish. Let $P^+_{\mathbb H}(N_+(\Delta))\subset P_{\mathbb H}(\mathbb H^{1,2}_+)$ be the subset on which $g^+_{1,2}(V,V)>0$, while $P^-_{\mathbb H}(N_+(\Delta))\subset P_{\mathbb H}(\mathbb H^{1,2}_+)$ is the subset on which $g^+_{1,2}(V,V)<0$. If $P^+_{\mathbb H}(N_+(\Delta))$ is a submanifold in $P_{\mathbb H}(\mathbb H^{1,2}_+)$ then it is a semi-Riemannian submanifold of signature $(4,1)$. On the other hand, if $P^-_{\mathbb H}(N_+(\Delta))$ is a submanifold in $P_{\mathbb H}(\mathbb H^{1,2}_+)$ then it is a semi-Riemannian submanifold of signature $(1,4)$. At least locally we can define two different quotient metrics:

(1) if $P^+_{\mathbb H}(N_+(\Delta))$ is not empty, we have a positive scalar curvature metric $g^+(\Delta)$ on $P^+_{\mathbb H}(N_+(\Delta))/G(\Delta)$ of signature $(4,0)$ (anti-Riemannian);
(2) if $P^-_{\mathbb H}(N_+(\Delta))$ is not empty, we have a positive scalar curvature metric $g^-(\Delta)$ on $M^-_+(\Delta)=P^-_{\mathbb H}(N_+(\Delta))/G(\Delta)$ of signature $(0,4)$.

The metric $g^-(\Delta)$ is a Riemannian metric of positive scalar curvature. Typically this metric is not complete, unless the quotient can be globally extended to the symmetric metric on $S^4$ or $\mathbb{C}P^2$. On the other hand, $g^+(\Delta)$ is an anti-Riemannian metric of positive scalar curvature, so that $-g^+(\Delta)$ is a Riemannian metric of negative scalar curvature on $M^+_+(\Delta)=P^+_{\mathbb H}(N_+(\Delta))/G(\Delta)$. Generally this metric is not complete. However, as we shall see in Sect. 7, complete metrics of this type can occur. Hence, a priori, for each $\Delta\in\mathfrak{sp}(1,2)$ we have locally three different metrics: $g(\Delta)$, $-g^+(\Delta)$ and $g^-(\Delta)$. The two metrics $g(\Delta)$, $-g^+(\Delta)$ are negative SDE while $g^-(\Delta)$ is positive SDE.
Remark 1.2. Similarly, we can consider any orbits $[\mathfrak g]$ under $Sp(1,n)$ of the $(n-1)$-dimensional subalgebras $\mathfrak g\subset\mathfrak{sp}(1,n)$. Our analysis carried out for $Sp(1,2)$ applies without any changes and, a priori, for each $\mathfrak g$ we obtain locally 3 different metrics: $g(\mathfrak g)$, $-g^+(\mathfrak g)$ and $g^-(\mathfrak g)$. In addition, when $\mathfrak g\subset\mathfrak{sp}(1,n)$ is Abelian these metrics have two commuting Killing vectors. Even more generally, we could consider any orbit $[\mathfrak g]$ under $Sp(k,l)$ of $(k+l-2)$-dimensional subalgebras $\mathfrak g\subset\mathfrak{sp}(k,l)$. If both $k$, $l$ are greater than one, $P_{\mathbb H}(\mathbb H^{k,l}_\pm)$ are both semi-quaternion Kähler. Our analysis carried out for $(1,2)$ still applies and, a priori, for each $\mathfrak g$ we get locally four different metrics: two from the reduction of $P_{\mathbb H}(\mathbb H^{k,l}_-)$ and the other two from the reduction of $P_{\mathbb H}(\mathbb H^{k,l}_+)$. Two of these metrics will have negative scalar curvature. The case of $k=l$ is of special
interest as we shall see in Sect. 7. Again, for Abelian subalgebras the metrics will have two commuting Killing vectors while the non-Abelian case is more general.

2. Adjoint Orbits in sp(1, 2)

Adjoint orbits of elements in the classical Lie algebras $\mathfrak g$ have been determined by Burgoyne and Cushman [BC77]. We shall use this work to find all the conjugacy classes of one-parameter subgroups of $Sp(1,2)$. First let us review some basic definitions. The symmetric form on $\mathbb H^{1,2}$ is given by $u^\dagger Fu$, where
$$u=\begin{pmatrix}u_0\\u_1\\u_2\end{pmatrix},\qquad F=F_{1,2}=\begin{pmatrix}-1&0&0\\0&1&0\\0&0&1\end{pmatrix}.\tag{2.1}$$
We can describe $Sp(1,2)$ and its Lie algebra $\mathfrak{sp}(1,2)$ as $3\times3$ matrices preserving $F$, i.e.,
$$Sp(1,2)=\{g\in M_{3\times3}(\mathbb H)\mid g^\dagger Fg=F\},\tag{2.2}$$
$$\mathfrak{sp}(1,2)=\{Y\in M_{3\times3}(\mathbb H)\mid FY+Y^\dagger F=0\}.\tag{2.3}$$
Explicitly, an element of $\mathfrak{sp}(1,2)$ can be written as
$$Y=\begin{pmatrix}a&\alpha&\beta\\\bar\alpha&b&\gamma\\\bar\beta&-\bar\gamma&c\end{pmatrix}.\tag{2.4}$$
Setting $\alpha=\beta=0$ gives the maximal compact subalgebra $\mathfrak{sp}(1)\oplus\mathfrak{sp}(2)$, while $\beta=\gamma=0$ yields $\mathfrak{sp}(1,1)\oplus\mathfrak{sp}(1)$. We say that $Y\in\mathfrak{sp}(1,2)$ is decomposable if $\mathbb H^{1,2}$ may be split non-trivially as a direct sum of mutually orthogonal $Y$-invariant quaternionic subspaces. Otherwise we say that $Y$ is indecomposable. Choosing a particular unit quaternion $i$ identifies $\mathbb H^{1,2}\cong\mathbb H^3$ with $\mathbb C^6$, which realizes $\mathfrak{sp}(1,2)$ as a subalgebra of $\mathfrak{gl}(6,\mathbb C)$. We shall say an element $Y\in\mathfrak{sp}(1,2)$ is semisimple iff it is diagonalizable as an element of $\mathfrak{gl}(6,\mathbb C)$. Any $Y\in\mathfrak{sp}(1,2)$ can be uniquely written as $Y=S+N$, where $S$ is semisimple, and $N$ is nilpotent with $[S,N]=0$. If $N^{m+1}=0$, $N^m\neq0$, then the integer $m$ is called the height of $Y$. Semisimple elements have height equal to zero.

Definition 2.1. We define the following elements of $\mathfrak{sp}(1,2)$:
$$T_0(ip_0,ip_1,ip_2)=\begin{pmatrix}ip_0&0&0\\0&ip_1&0\\0&0&ip_2\end{pmatrix},\qquad T_0(\lambda,ip,iq)=\begin{pmatrix}ip&\lambda&0\\\lambda&ip&0\\0&0&iq\end{pmatrix},$$
$$T_1(\lambda,ip,iq)=\begin{pmatrix}ip&0&0\\0&ip&0\\0&0&iq\end{pmatrix}+\lambda\begin{pmatrix}i&i&0\\-i&-i&0\\0&0&0\end{pmatrix},$$
$$T_2(\lambda,ip)=ip\,I_3+\lambda\begin{pmatrix}0&0&-i\\0&0&i\\i&i&0\end{pmatrix},$$
where (throughout) $\lambda\neq0$.
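The membership and height claims for the matrices of Definition 2.1 can be checked numerically. The following is only a sketch, not part of the original argument: it uses the fact that all entries lie in the complex subfield spanned by $1$ and $i$, so ordinary complex arithmetic suffices, and every helper name (`in_sp12`, `height`, `T0_hyp`, and so on) is a hypothetical choice made for this illustration.

```python
# Sketch: check Definition 2.1 against condition (2.3), F Y + Y^dagger F = 0,
# using plain Python complex numbers (all entries lie in span{1, i}).
F = [[-1, 0, 0], [0, 1, 0], [0, 0, 1]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def dagger(A):
    # Conjugate transpose.
    return [[complex(A[j][i]).conjugate() for j in range(3)] for i in range(3)]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(3)] for i in range(3)]

def scale(c, A):
    return [[c * A[i][j] for j in range(3)] for i in range(3)]

def is_zero(A, tol=1e-12):
    return all(abs(A[i][j]) < tol for i in range(3) for j in range(3))

def in_sp12(Y):
    """True if Y satisfies F Y + Y^dagger F = 0."""
    return is_zero(add(matmul(F, Y), matmul(dagger(Y), F)))

def diag(a, b, c):
    return [[a, 0, 0], [0, b, 0], [0, 0, c]]

# Nilpotent parts of T1 and T2.
N1 = [[1j, 1j, 0], [-1j, -1j, 0], [0, 0, 0]]
N2 = [[0, 0, -1j], [0, 0, 1j], [1j, 1j, 0]]

def T0_diag(p0, p1, p2):
    return diag(1j * p0, 1j * p1, 1j * p2)

def T0_hyp(lam, p, q):
    return [[1j * p, lam, 0], [lam, 1j * p, 0], [0, 0, 1j * q]]

def T1(lam, p, q):
    return add(diag(1j * p, 1j * p, 1j * q), scale(lam, N1))

def T2(lam, p):
    return add(diag(1j * p, 1j * p, 1j * p), scale(lam, N2))

def height(N):
    """Smallest m with N^(m+1) = 0, for a nilpotent 3x3 matrix N."""
    power = N
    for m in range(3):
        if is_zero(power):      # here power equals N^(m+1)
            return m
        power = matmul(power, N)
    raise ValueError("matrix is not nilpotent")

if __name__ == "__main__":
    samples = [T0_diag(1, 2, 3), T0_hyp(0.5, 1, 2), T1(2.0, 1, -1), T2(1.5, 3)]
    print([in_sp12(Y) for Y in samples])
    print(height(N1), height(N2))
```

The same helpers also confirm the relation $T_1(1,0,0)=i\,T_2(1,0)^2$ and the vanishing of the commutator of the two nilpotent parts.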
The first two 3-parameter families of elements are semisimple and they are in two different Cartan subalgebras of $\mathfrak{sp}(1,2)$. They are necessarily decomposable. The first one corresponds to the decomposition of $\mathbb H^{1,2}$ into $\mathbb H\oplus\mathbb H\oplus\mathbb H$, while the second decomposes $\mathbb H^{1,2}$ into $\mathbb H^{1,1}\oplus\mathbb H$. The 3-parameter family $T_1(\lambda,ip,iq)$ has height one (and $T_1(\lambda,0,0)$ is 2-step nilpotent). These are decomposable, splitting $\mathbb H^{1,2}$ into $\mathbb H^{1,1}\oplus\mathbb H$. Finally, the 2-parameter family $T_2(\lambda,ip)$ has height two (and $T_2(\lambda,0)$ is 3-step nilpotent). These are indecomposable. Note that all elements in Definition 2.1 are inside the subalgebra $\mathfrak u(1,2)$. Furthermore, note that we chose $T_1:=T_1(1,0,0)$ and $T_2:=T_2(1,0)$ so that they commute. In fact $\{iI_3,\,T_2,\,T_1=iT_2^2\}$ span a maximal nilpotent Abelian subalgebra of $\mathfrak{sp}(1,2)$. The following proposition follows from [BC77].

Proposition 2.2. Let $Y$ be an arbitrary non-zero element of $\mathfrak{sp}(1,2)$. Then $Y$ is conjugate under the adjoint $Sp(1,2)$ action to an element of Definition 2.1. This element is unique, except in the height one case, where $T_1(\lambda,p,q)$ is conjugate to $T_1(1,p,q)$ or $T_1(-1,p,q)$ for $p\neq0$, and to $T_1(1,0,q)$ for $p=0$, and in the height two case, where $T_2(\lambda,p)$ is conjugate to $T_2(1,p)$. Furthermore, any one-parameter subgroup in $Sp(1,2)$ is conjugate to $G(\Delta)=\{A\in Sp(1,2)\mid A=e^{\Delta t}\}$, where $\Delta$ is one of the types of Definition 2.1.

In other words, the list of Definition 2.1 enumerates all adjoint orbits in $\mathfrak{sp}(1,2)$. The corresponding conjugacy classes of one-parameter subgroups of $Sp(1,2)$ are enumerated by these elements up to scale: $\Delta$ and $c\Delta$ define the same subgroup for any $c\neq0$.

In the following, it will sometimes be more convenient to work with a different basis of $\mathbb H^{1,2}$ in which the symmetric form may be written $v^\dagger\tilde Fv$ with
$$v=\begin{pmatrix}v_0\\v_1\\v_2\end{pmatrix},\qquad\tilde F=\begin{pmatrix}0&1&0\\1&0&0\\0&0&1\end{pmatrix}.\tag{2.5}$$
The advantage of this basis is that the last three matrices in Definition 2.1 are conjugated to the following simpler forms:
$$\tilde T_0(\lambda,ip,iq)=\begin{pmatrix}ip+\lambda&0&0\\0&ip-\lambda&0\\0&0&iq\end{pmatrix},\qquad\tilde T_1(\lambda,ip,iq)=\begin{pmatrix}ip&0&0\\-i\lambda&ip&0\\0&0&iq\end{pmatrix},$$
$$\tilde T_2(\lambda,ip)=ip\,I_3+\lambda\begin{pmatrix}0&0&0\\0&0&i\\i&0&0\end{pmatrix}.$$
We end our discussion by noting that it is straightforward to compute the momentum map in homogeneous coordinates associated to a generator $T$ or $\tilde T$ using the general
formulae
$$\mu_T(u)=u^\dagger FTu=(-\bar u_0,\bar u_1,\bar u_2)\,T\begin{pmatrix}u_0\\u_1\\u_2\end{pmatrix},\qquad\mu_{\tilde T}(v)=v^\dagger\tilde F\tilde Tv=(\bar v_1,\bar v_0,\bar v_2)\,\tilde T\begin{pmatrix}v_0\\v_1\\v_2\end{pmatrix}.$$

3. The Pedersen–LeBrun Metrics on Line Bundles over CP^1

In this section we will examine the case of $\Delta=\Delta_0(p)=T_0(ip_0,ip_1,ip_2)$. We shall assume that this generates a circle action, which means, after rescaling $\Delta$, that we may assume that the $p_i$'s are integers with $\gcd(p_0,p_1,p_2)=1$. (We can assume that the weights do not vanish, as the cases when one or two of the weights vanish are degenerate.) We then have a circle action on the quaternionic hyperbolic 2-ball $\mathbb{H}H^2$ given in homogeneous coordinates by
$$\varphi^p_t(u_0,u_1,u_2)=(e^{2\pi ip_0t}u_0,\,e^{2\pi ip_1t}u_1,\,e^{2\pi ip_2t}u_2),\tag{3.1}$$
where $t\in[0,1)$. We note that this action is effective unless the weights $p_0,p_1,p_2$ are all odd, in which case we obtain an effective action of a quotient circle by taking $t\in[0,1/2)$. In inhomogeneous coordinates $(x_1,x_2)$ we have
$$\varphi^p_t(x_1,x_2)=(e^{2\pi ip_1t}x_1e^{-2\pi ip_0t},\,e^{2\pi ip_2t}x_2e^{-2\pi ip_0t}),\tag{3.2}$$
and the moment map is given as
$$\mu_p(u)=-p_0\bar u_0iu_0+p_1\bar u_1iu_1+p_2\bar u_2iu_2,\tag{3.3}$$
$$f_p(x)=\bar u_0^{-1}\mu_p(u)u_0^{-1}=-ip_0+p_1\bar x_1ix_1+p_2\bar x_2ix_2\tag{3.4}$$
in homogeneous or inhomogeneous coordinates. We now write
$$x=z+wj=z+j\bar w,\tag{3.5}$$
where $z,w\in\mathbb C^2$, and observe that
$$\varphi^p_t\begin{pmatrix}z_1&w_1\\z_2&w_2\end{pmatrix}=\begin{pmatrix}e^{2\pi i(p_1-p_0)t}z_1&e^{2\pi i(p_1+p_0)t}w_1\\e^{2\pi i(p_2-p_0)t}z_2&e^{2\pi i(p_2+p_0)t}w_2\end{pmatrix}.$$
The zero level set is
$$\mu_p^{-1}(0)=\Big\{(z,w)\in\mathbb{H}H^2:\ \sum_{\alpha=1,2}p_\alpha(|z_\alpha|^2-|w_\alpha|^2)=p_0,\ \sum_{\alpha=1,2}p_\alpha\bar w_\alpha\cdot z_\alpha=0\Big\}.\tag{3.6}$$
Proposition 3.1. Let $q_\alpha=p_\alpha/p_0$. Then the subset $\mu_p^{-1}(0)\subset\mathbb{H}H^2$ is empty unless $|q_\alpha|>1$ for at least one $\alpha$. In that case, $\mu_p^{-1}(0)$ is an open smooth submanifold of codimension 3.
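As a quick numerical illustration of the statement (a sketch under stated assumptions, not a proof), one can evaluate the two constraints of (3.6) in the complex coordinates $(z,w)$. The function names, the sample weights $(2,3,1)$ and $(3,2,1)$, and the random sampling scheme are all hypothetical choices for this example. For positive weights with $p_1>p_0$ the point $z=(\sqrt{p_0/p_1},0)$, $w=0$ lies in the level set, whereas for $p_1,p_2\le p_0$ the real constraint is strictly negative on the whole open unit ball.

```python
# Sketch: evaluate the level-set constraints (3.6) for positive weights.
import math
import random

def moment_constraints(z, w, p):
    """Return (real constraint, complex constraint); both vanish on the level set."""
    p0, p1, p2 = p
    real_part = (p1 * (abs(z[0]) ** 2 - abs(w[0]) ** 2)
                 + p2 * (abs(z[1]) ** 2 - abs(w[1]) ** 2) - p0)
    complex_part = (p1 * w[0].conjugate() * z[0]
                    + p2 * w[1].conjugate() * z[1])
    return real_part, complex_part

def in_unit_ball(z, w):
    return sum(abs(c) ** 2 for c in (*z, *w)) < 1

def witness(p):
    """A point of the level set, assuming 0 < p0 < p1 (so q1 > 1)."""
    p0, p1, _ = p
    if not 0 < p0 < p1:
        raise ValueError("witness requires 0 < p0 < p1")
    return (complex(math.sqrt(p0 / p1)), 0j), (0j, 0j)

def random_ball_point(rng):
    """Random point of the open unit ball in C^4 (Gaussian direction, radius in [0, 1))."""
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(4)]
    norm = math.sqrt(sum(abs(c) ** 2 for c in v))
    r = rng.random()
    v = [r * c / norm for c in v]
    return (v[0], v[1]), (v[2], v[3])

if __name__ == "__main__":
    z, w = witness((2, 3, 1))
    print(in_unit_ball(z, w), moment_constraints(z, w, (2, 3, 1)))
    rng = random.Random(0)
    worst = max(moment_constraints(*random_ball_point(rng), (3, 2, 1))[0]
                for _ in range(1000))
    print(worst < 0)
```

The sampling only illustrates the emptiness claim; the inequality argument in the proof is what establishes it.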
Proof. We can assume that both $q_1,q_2$ are positive (otherwise we simply reverse the role of $z_\alpha$ and $w_\alpha$ in the argument below). On the one hand, the momentum constraint gives $q_1(|z_1|^2-|w_1|^2)+q_2(|z_2|^2-|w_2|^2)=1$, so that $-q_1|z_1|^2-q_2|z_2|^2\le-1$. On the other hand, we have $|z_1|^2+|z_2|^2\le|z_1|^2+|w_1|^2+|z_2|^2+|w_2|^2<1$ by the unit ball condition. Adding the two inequalities we get $(1-q_1)|z_1|^2+(1-q_2)|z_2|^2<0$. This has no solutions when $1\ge q_1$ and $1\ge q_2$. Otherwise, if (say) $|q_1|>1$, then by taking $z_2,w_2=0$ it is easy to see that $\mu_p^{-1}(0)$ is nonempty. The last statement follows because a straightforward computation reveals that $0$ is a regular value of $\mu_p$.

Without loss of generality we will further assume that all weights are positive. We will also choose $q_1>1$, that is, $p_1>p_0$. Then we have the following

Proposition 3.2. The $\varphi^p_t$-action on the level set $\mu_p^{-1}(0)\subset\mathbb{H}H^2$ is free if and only if $p_1=p_0+1$ and $0<p_2\le p_0+1$ when one of the weights is even, or $p_1=p_0+2$ and $0<p_2\le p_0+2$ when all the weights are odd.

Proof. Consider the set described by $(z_1,0,0,0)$. This meets $\mu_p^{-1}(0)$ in a circle, but any point on this circle is fixed by $\mathbb Z_{p_1-p_0}$. Hence we must have $p_1=p_0+1$, unless all weights are odd, when we have $p_1=p_0+2$. Next, suppose that $p_2>p_0$. Then the set described by $(0,z_2,0,0)$ also meets $\mu_p^{-1}(0)$ in a circle and any point on this circle is fixed by $\mathbb Z_{p_2-p_0}$. Thus if $p_2>p_0$, we must have $p_2=p_0+1$ (or $p_0+2$ if all weights are odd). It is easy to see that $p_2$ can be any integer with $0<p_2\le p_0+1$ (or $p_0+2$ if all weights are odd).

We now have:

Theorem 3.3. For $p\in\mathbb Z^3_+$ as in Proposition 3.2, the quotient $M(p)=\mu_p^{-1}(0)/S^1(p)$ is a complete self-dual Einstein manifold with negative scalar curvature and at least two commuting Killing vectors. When $p_2=p_1$ (which is $p_0+1$ or $p_0+2$) the metric is $U(2)$-invariant, while when $p_2=p_0$ the metric is $U(1,1)$-invariant.

Proof. Only completeness of the induced metric on $M(p)$ remains to be proven, and this follows from the fact that the induced metric on the closed embedded submanifold $\mu_p^{-1}(0)\hookrightarrow\mathbb{H}H^2$ is complete and the action (3.1) is proper.

We continue with describing the total space of these metrics. When $p_2=p_1$, we expect that the metric is complete and, hence, it has to be one of the possibilities listed by Hitchin in Theorem 13 of [Hit95]. We will show that our quotient metrics are the Pedersen–LeBrun metrics on complex line bundles $\mathcal O(n)\to\mathbb{C}P^1$, $n\ge3$ (Theorem
13:3(d) of [Hit95]). Before we analyze $M(p,p+1,p+1)$ and $M(p,p+2,p+2)$ let us recall a standard description of a complex line bundle over $\mathbb{C}P^1$ with first Chern class $s$. Let $S^3=\{v\in\mathbb C^2:\|v\|=1\}$ and let $s\in\mathbb Z_+$. Then we set
$$L_s\equiv(S^3\times\mathbb C)/S^1,\tag{3.7}$$
where $S^1$ acts on $S^3\times\mathbb C$ by
$$\tau\cdot(v,\alpha)=(\tau v,\tau^s\alpha).\tag{3.8}$$
The natural projection $L_s\to S^2\cong S^3/S^1$ makes $L_s$ a complex line bundle over $S^2$ with $c_1(L_s)=s$. Note that we get the same conclusion when we replace $\mathbb C$ by $D^1_{\mathbb C}(1)=\{\alpha\in\mathbb C:|\alpha|<1\}$. Then $L_s$ is a complex unit disk bundle with first Chern class $s$. Now we are ready for

Theorem 3.4. Let $p\in\mathbb Z$ and $\mathbf p=(p-1,p,p)$, $p>1$. Then the quotient metric $g(\mathbf p)$ is complete, $U(2)$-invariant, and the total space $M(\mathbf p)$ can be identified with the complex unit disk bundle $L_{2p}\to\mathbb{C}P^1$ with first Chern class equal to $2p$.

Proof. By Theorem 3.3 it suffices only to identify the quotient in this special case. Let $(z,w)\in\mathbb{H}H^2$. We make a slight change of these coordinates by setting
$$x=\frac{1}{\sqrt{p_0/p+\|w\|^2}}\,z,\qquad y=\sqrt{2p}\,w.\tag{3.9}$$
In these coordinates the moment map equations can be written
$$\mu_{\mathbf p}^{-1}(0)=\{(x,y)\in\mathbb C^2\times\mathbb C^2:\ \|x\|^2=1,\ \bar y\cdot x=0,\ \|y\|<1\}.\tag{3.10}$$
Then the circle action is given by
$$\varphi_\tau(x,y)=(\tau x,\tau^{2p-1}y)\tag{3.11}$$
for $\tau=e^{2\pi it}\in S^1$. Consequently, $M(\mathbf p)$ is equivalent to the quotient of the set
$$\mu_{\mathbf p}^{-1}(0)=\{(x,y)\in S^3\times D^2_{\mathbb C}(1):\ x\perp y\}\simeq S^3\times D^1_{\mathbb C}(1)$$
by the action (3.11). We define a map $f:S^3\times D^1_{\mathbb C}(1)\to\mu_{\mathbf p}^{-1}(0)$ by setting $f(v,\alpha)=(v,\alpha v^\dagger)$, where if $v=(v_0,v_1)$ then $v^\dagger=(-\bar v_1,\bar v_0)$. Note that for any $\tau\in S^1$ we have the commutative diagram
$$(v,\alpha)\mapsto(v,\alpha v^\dagger),\qquad(\tau v,\tau^{2p}\alpha)\mapsto(\tau v,\tau^{2p-1}\alpha v^\dagger),$$
i.e., $f$ intertwines the action (3.8) with $s=2p$ and the action (3.11). Thus $f$ is an $S^1$-equivariant diffeomorphism and therefore $f$ induces a smooth equivalence of the quotient spaces. Hence $M(\mathbf p)\simeq L_{2p}$.

When $\mathbf p=(p-2,p,p)$ we immediately get the other half of the line bundles with odd Chern classes:
Theorem 3.5. Let $p\in\mathbb Z$ and $\mathbf p=(p-2,p,p)$, $p=2k+1>2$. Then the quotient metric $g(\mathbf p)$ is complete, $U(2)$-invariant, and the total space $M(\mathbf p)$ can be identified with the complex unit disk bundle $L_p\to\mathbb{C}P^1$ with first Chern class equal to $p$.

Note that this construction does not give the line bundles over $\mathbb{C}P^1$ with first Chern classes $c_1=1,2$. The metrics on $L_p$ with $p\ge3$ have a curious history. The quotient construction presented here was written in [Gal87b]. The Pedersen metric on the 4-ball [Ped86] depends on a single parameter $m^2\in(-1,\infty)$. It was realized later (see [Hit95]) that setting this parameter to $(2-n)/n$ (with $n\in\mathbb Z$, $n>2$) allows for the analytic continuation of this metric to a complete metric on $\mathcal O(n)\to\mathbb{C}P^1$. The reason these metrics are called Pedersen–LeBrun in [Hit95] is that they are conformal to the scalar flat Kähler metrics on $\mathcal O(-n)\to\mathbb{C}P^1$ constructed by LeBrun [LeB88].

When $p_0+1=p_1>p_2>0$ we take a different approach. Observe that one can still easily solve the complex equation of the moment map by setting
$$(w_1,w_2)=\alpha(-p_2\bar z_2,\,p_1\bar z_1),\tag{3.12}$$
where $\alpha\in\mathbb C$. The unit ball condition in terms of $(z_1,z_2,\alpha)$ reads:
$$|z_1|^2+|z_2|^2+|\alpha|^2p_1^2|z_1|^2+|\alpha|^2p_2^2|z_2|^2=|z_1|^2(1+p_1^2|\alpha|^2)+|z_2|^2(1+p_2^2|\alpha|^2)<1,\tag{3.13}$$
while the remaining moment map equation is
$$|z_1|^2(p_1-p_2p_1^2|\alpha|^2)+|z_2|^2(p_2-p_1p_2^2|\alpha|^2)=p_0.$$
Let us solve this equation with respect to $|z_1|^2$:
$$|z_1|^2=\frac{p_0}{p_1}\,\frac{1}{1-p_1p_2|\alpha|^2}-\frac{p_2}{p_1}|z_2|^2.\tag{3.14}$$
One can immediately see that $z_1$ cannot vanish, as then
$$|z_2|^2=\frac{p_0}{p_2}\,\frac{1}{1-p_1p_2|\alpha|^2}\ge\frac{p_0}{p_2}\ge1.\tag{3.15}$$
Let $\rho=z_1/|z_1|$. It is easy to see that
$$\varphi_\tau(\rho,z_2,\alpha)=(\tau\rho,\,\tau^{p_2-p_0}z_2,\,\tau^{p_2+p_1}\alpha).$$

Proposition 3.6. The level set $\mu_p^{-1}(0)\simeq D\times S^1$, where $D\subset\mathbb C^2$ is an open 4-ball.
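Proof. It is clear that $(\rho,z_2,\alpha)\in S^1\times D$ are coordinates on $\mu_p^{-1}(0)$. We have to check that $D$ is diffeomorphic to a 4-ball. To do that, let us substitute (3.14) into (3.13) and consider
$$\Big(\frac{p_0}{p_1}\,\frac{1}{1-p_1p_2|\alpha|^2}-\frac{p_2}{p_1}|z_2|^2\Big)(1+p_1^2|\alpha|^2)+|z_2|^2(1+p_2^2|\alpha|^2)<1,$$
which, after multiplying by $p_1(1-p_1p_2|\alpha|^2)$ and using $p_1=p_0+1$, can be written as
$$f_p(z_2,\alpha)=(p_1-p_2)|z_2|^2\,[1-p_1p_2|\alpha|^2]^2+p_1^2(p_0+p_2)|\alpha|^2-1<0.$$
One can easily see that $|\alpha|^2<\dfrac{1}{p_1^2(p_0+p_2)}$, and that
$$D(p)=\{(z_2,\alpha)\in\mathbb C\times\mathbb C\mid f_p(z_2,\alpha)<0\}$$
is an open 4-ball.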
Theorem 3.7. The quotient $M(p)\simeq D(p)$ is diffeomorphic to a 4-ball. The self-dual Einstein metric $g(p)$ obtained from the quaternion Kähler quotient is complete and it has two commuting Killing vectors. Furthermore, $M(p,p+1,p)$ is of cohomogeneity one with respect to $U(1,1)$.

The cohomogeneity one $U(1,1)$ action on $M(p,p+1,p)$ can be explicitly described as follows. Let
$$A=\begin{pmatrix}a&\tau b\\\bar b&\tau\bar a\end{pmatrix}\in U(1,1),$$
where $a,b,\tau\in\mathbb C$ with $|a|^2-|b|^2=1$ and $|\tau|^2=1$. This group acts on the quaternionic ball as
$$\varphi_A(u)=\begin{pmatrix}a&0&\tau b\\0&1&0\\\bar b&0&\tau\bar a\end{pmatrix}\begin{pmatrix}u_0\\u_1\\u_2\end{pmatrix}$$
and it commutes with the circle action given by $\varphi^p_t$. In the inhomogeneous chart we get
$$\varphi_A\begin{pmatrix}x_1\\x_2\end{pmatrix}=\begin{pmatrix}x_1(a+\tau bx_2)^{-1}\\(\bar b+\tau\bar ax_2)(a+\tau bx_2)^{-1}\end{pmatrix}.$$
The above action preserves the zero level set of the moment map and it descends to a cohomogeneity one isometric action on the quotient space $M(p,p+1,p)$. Cohomogeneity one SDE metrics with an isometric action of a four-dimensional Lie group have been studied by Derdziński. Hence, up to isometries, $M(p,p+1,p)$ must be the cohomogeneity one self-dual Kähler metric introduced in [Der81] and more recently studied by Apostolov and Gauduchon in [AG02].

4. Generalized Pedersen Metrics on the Ball

In this section and the following two, we will consider the $\mathbb R$-actions on $\mathbb{H}H^2$ whose generators do not belong to the Lie algebra of a maximal torus. To do this we shall work in the $v=(v_0,v_1,v_2)$ coordinates introduced in Sect. 2. In these coordinates $\mathbb{H}H^2$ is the open subset of $\mathbb{H}P^2$ defined by the equation
$$\bar v_0v_1+\bar v_1v_0+|v_2|^2<0.\tag{4.1}$$
It follows that $v_0$ does not vanish on $\mathbb{H}H^2$ and so the inhomogeneous coordinates $y_1=v_1v_0^{-1}$, $y_2=v_2v_0^{-1}$ provide a global chart identifying $\mathbb{H}H^2$ with the domain
$$y_1+\bar y_1+|y_2|^2<0\tag{4.2}$$
in $\mathbb H^2$. We remark (for later use) that the real part of $y_1$ is strictly negative on this domain.

We begin by considering the case of $\Delta_0(p,q)=\tilde T_0(1,ip,iq)$, where we have taken $\lambda=1$ by rescaling. The $\mathbb R$-action on the quaternionic hyperbolic 2-ball $(\mathbb{H}H^2,g)$ is given explicitly by
$$\varphi^{p,q}_t(v)=\begin{pmatrix}e^{(ip+1)t}&0&0\\0&e^{(ip-1)t}&0\\0&0&e^{iqt}\end{pmatrix}\begin{pmatrix}v_0\\v_1\\v_2\end{pmatrix}=\begin{pmatrix}e^{ipt}e^{t}v_0\\e^{ipt}e^{-t}v_1\\e^{iqt}v_2\end{pmatrix},\tag{4.3}$$
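which reduces, in inhomogeneous coordinates $y=(y_1,y_2)$ with $y_\alpha=v_\alpha v_0^{-1}$, to
$$\varphi^{p,q}_t\begin{pmatrix}y_1\\y_2\end{pmatrix}=\begin{pmatrix}e^{ipt}e^{-2t}y_1e^{-ipt}\\e^{iqt}e^{-t}y_2e^{-ipt}\end{pmatrix}.\tag{4.4}$$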
This action is a quaternionic isometry of the hyperbolic metric $g$ and it defines a bundle valued momentum map $\mu_{p,q}:\mathbb{H}H^2\to V$ given in homogeneous and inhomogeneous coordinates by the functions
$$\mu_{p,q}(v)=\bar v_1v_0-\bar v_0v_1+p(\bar v_1iv_0+\bar v_0iv_1)+q\bar v_2iv_2,$$
$$f_{p,q}(y)=\bar y_1-y_1+p(\bar y_1i+iy_1)+q\bar y_2iy_2.$$
Although this function is not invariant under the action (4.4), its zero set is, and the quaternion Kähler reduction of $\mathbb{H}H^2$ by the one-parameter group $e^{\Delta_0(p,q)t}$ is the quotient of this zero set by the group action. The resulting SDE metrics were first introduced in [Gal87b] and may be regarded as a deformation of the Pedersen metrics on the ball to metrics with fewer symmetries.

Theorem 4.1. Let $\Delta=\Delta_0(p,q)=\tilde T_0(1,ip,iq)$ and consider the one-parameter group $\varphi^{p,q}_t=e^{\Delta t}$ acting on the quaternionic hyperbolic space $\mathbb{H}H^2$. The quaternion Kähler reduction $M(p,q)=\mu_{p,q}^{-1}(0)/\varphi^{p,q}$ is diffeomorphic to an open 4-ball for all $(p,q)\in\mathbb R^2$. The quotient metric is complete, self-dual, and Einstein of negative scalar curvature, and its isometry group contains a 2-torus. Furthermore, the quotient metrics on $M(0,q)$ are isometric to the Pedersen metrics, and their isometry group contains $U(2)$.

Proof. Consider the following set
$$S_{p,q}=\{y\mid f_{p,q}(y_1,y_2)=0\ \text{and}\ y_1+\bar y_1=2\,\mathrm{Re}(y_1)=-1\}.\tag{4.5}$$
For any $y_2\in\mathbb H$ there is a unique $y_1$ such that $y\in S_{p,q}$. It follows that $S_{p,q}\cap\mathbb{H}H^2$ is diffeomorphic to the open 4-ball $|y_2|^2<1$ in $\mathbb H\cong\mathbb R^4$. Furthermore, $S_{p,q}\cap\mathbb{H}H^2$ provides a global slice for the action of $e^{\Delta t}$ in the zero set of the momentum map: to see this, we only have to note that $e^{\Delta t}$ sends $y_1+\bar y_1$ to $e^{-2t}(y_1+\bar y_1)$ and therefore, since $y_1+\bar y_1<0$, there is a unique $(y_1,y_2)$ in any orbit with $y_1+\bar y_1=-1$. Therefore $M(p,q)$ is diffeomorphic to the open 4-ball equipped with the metric obtained by restriction to $\mu_{p,q}^{-1}(0)$ and submersion. The fact that the quotient is a complete Riemannian manifold follows as in the proof of Theorem 3.3. Moreover, it must be an SDE metric of negative scalar curvature since it is obtained as a quaternion Kähler quotient of $\mathbb{H}H^2$. The isometry group contains a 2-torus since $S_{p,q}$ is invariant under the transformation $(y_1,y_2)\mapsto(\sigma y_1\sigma^{-1},\tau y_2\sigma^{-1})$ by $(\tau,\sigma)\in U(1)\times U(1)$. As this action is by quaternionic isometries on $\mathbb{H}H^2$ and commutes with $e^{\Delta t}$, it descends to give an action by isometries on $M(p,q)$. If $p=0$ this action may be extended, by taking $\sigma\in Sp(1)$, to yield an action of $U(1)\cdot Sp(1)\simeq U(2)$. To identify $M(0,q)$ as the Pedersen family, one can compute the metric explicitly. Alternatively, we can use the classification of SDE metrics with $SU(2)$ symmetry by Hitchin [Hit95]. This classification provides very few possible candidates with $U(2)$ symmetry: apart from the real and complex hyperbolic metrics, the Pedersen metrics are the only examples. In fact, one can see that $M(0,0)$ is real hyperbolic space, but for other values of $q$ the metric is not symmetric.
5. The Height One Quotients

In this section we will examine the family of quotients of $\mathbb{H}H^2$ obtained from $\tilde T_1(\lambda,ip,iq)$. By rescaling we can assume that $p$ is $0$ or $1$, and if $p=0$ we can scale $q$ to $1$ or $0$. Since we assume $\lambda$ is nonzero, we can then conjugate so that $\lambda=\pm1$ (or $\lambda=1$ if $p=0$) and rescale by the sign. Hence we only need to consider the quotients $\Delta_1(p,q)=\tilde T_1(1,ip,iq)$ with $p\in\{-1,0,1\}$, and if $p=0$ we can suppose $q\in\{0,1\}$. Nevertheless, for convenience we shall carry out our analysis for arbitrary $p,q$. We have
$$\varphi^{p,q}_t(v)=\begin{pmatrix}e^{ipt}&0&0\\-it\,e^{ipt}&e^{ipt}&0\\0&0&e^{iqt}\end{pmatrix}\begin{pmatrix}v_0\\v_1\\v_2\end{pmatrix}=\begin{pmatrix}e^{ipt}v_0\\e^{ipt}(v_1-itv_0)\\e^{iqt}v_2\end{pmatrix},\tag{5.1}$$
which reduces, in inhomogeneous coordinates $y=(y_1,y_2)$, to
$$\varphi^{p,q}_t\begin{pmatrix}y_1\\y_2\end{pmatrix}=\begin{pmatrix}e^{ipt}(y_1-it)e^{-ipt}\\e^{iqt}y_2e^{-ipt}\end{pmatrix}.\tag{5.2}$$
The moment map for this action is given in homogeneous or inhomogeneous coordinates by
$$\mu_{p,q}(v)=-\bar v_0iv_0+p(\bar v_0iv_1+\bar v_1iv_0)+q\bar v_2iv_2,$$
$$f_{p,q}(y)=-i+p(iy_1+\bar y_1i)+q\bar y_2iy_2.$$

Theorem 5.1. Let $\Delta=\Delta_1(p,q)=\tilde T_1(1,ip,iq)$ and consider the one-parameter group $\varphi^{p,q}_t=e^{\Delta t}$ acting on the quaternionic hyperbolic space $\mathbb{H}H^2$. Then

(i) the quaternion Kähler reduction $M(p,q)=\mu_{p,q}^{-1}(0)/\varphi^{p,q}$ is diffeomorphic to $\mathbb R^4$ for all $(p,q)$ with $p<0$;
(ii) the quaternion Kähler reduction $M(p,q)=\mu_{p,q}^{-1}(0)/\varphi^{p,q}$ is diffeomorphic to $S^1\times\mathbb R^3$ for all $(p,q)$ with $0\le p<|q|$.

In these cases $M(p,q)$ has a complete self-dual Einstein metric of negative scalar curvature and its isometry group contains a 2-torus. In all other cases (i.e., if $p\ge|q|$) the zero set of the momentum map is empty.

Proof. We begin by defining the set
$$S_{p,q}=\{(y_1,y_2)\mid f_{p,q}(y_1,y_2)=0\ \text{and}\ iy_1-\bar y_1i=2\,\mathrm{Re}(iy_1)=0\},\tag{5.3}$$
and claim that $S_{p,q}\cap\mathbb{H}H^2$ can be identified with the quotient space $M(p,q)$ as a global slice for the $\varphi^{p,q}$ action on the momentum zero set. Indeed, the action of $e^{\Delta t}$ sends $2\,\mathrm{Re}(iy_1)$ to $2\,\mathrm{Re}(iy_1)+2t$, so there is a unique point of $S_{p,q}$ on each orbit of $e^{\Delta t}$ in $\mu_{p,q}^{-1}(0)$. It remains to describe the set $S_{p,q}\cap\mathbb{H}H^2$. For $p\neq0$, there is a unique $(y_1,y_2)\in S_{p,q}$ for any $y_2\in\mathbb H$. We now note that $\mathbb{H}H^2$ is the domain
$$p(y_1+\bar y_1)+p\bar y_2y_2<0\quad(p>0),\qquad p(y_1+\bar y_1)+p\bar y_2y_2>0\quad(p<0).$$
On the other hand
$$0=\mathrm{Re}(if_{p,q})=1-p(y_1+\bar y_1)-q\,\mathrm{Re}(i\bar y_2iy_2)$$
so that $S_{p,q}\cap\mathbb{H}H^2$ may be identified with the set of $y_2\in\mathbb H$ satisfying
$$-q\,\mathrm{Re}(i\bar y_2iy_2)+p\bar y_2y_2<-1\quad(p>0),\qquad-q\,\mathrm{Re}(i\bar y_2iy_2)+p\bar y_2y_2>-1\quad(p<0).$$
Writing $y_2=z_2+jw_2$ for $w_2,z_2\in\mathbb C$, this is the domain in $\mathbb C^2$ given by
$$(p+q)|z_2|^2+(p-q)|w_2|^2<-1\quad(p>0),\qquad(p+q)|z_2|^2+(p-q)|w_2|^2>-1\quad(p<0).$$
For $p>0$, this domain is empty unless $p<|q|$, in which case it is the exterior of a hyperboloid, which is diffeomorphic to $S^1\times\mathbb R^3$. For $p<0$, this domain is the interior of a hyperboloid for $-|q|<p<0$, the interior of a cylinder for $p=-|q|$, and the interior of an ellipsoid for $p<-|q|$: all these domains are diffeomorphic to $\mathbb R^4$.

We now consider the case $p=0$, when $y_1$ is not uniquely determined by $y_2$. For $q=0$, the momentum zero set is empty. Otherwise, for $q>0$, we have $y_2=e^{is}/\sqrt q$ for $s\in\mathbb R$, while for $q<0$, we have $y_2=e^{is}j/\sqrt{-q}$ for $s\in\mathbb R$. In either case $|y_2|^2=1/|q|$, so $S_{0,q}\cap\mathbb{H}H^2$ is identified with the set of $(iy_1,e^{is})\in\mathrm{Im}\,\mathbb H\times S^1$ with $y_1+\bar y_1<-1/|q|$. This is diffeomorphic to $S^1\times\mathbb R^3$.

It is now clear that when $S_{p,q}\cap\mathbb{H}H^2$ is non-empty, as in the previous cases, it carries a complete SDE metric of negative scalar curvature. The isometry group contains the 2-torus $(y_1,y_2)\mapsto(\sigma y_1\sigma^{-1},\tau y_2\sigma^{-1})$ with $(\tau,\sigma)\in U(1)\times U(1)$.
6. The Height Two Quotients

To complete our analysis of the quotients of $\mathbb{H}H^2$ we consider the height two case $\tilde T_2(\lambda,ip)$. As in the height one case, by scaling and conjugation, we can suppose $\Delta_2(p)=\tilde T_2(1,ip)$ with $p\in\{0,1\}$, so there are only two distinct quotients up to scale, but we shall carry out our computations for arbitrary $p$. We then have
$$\varphi^p_t(v)=\begin{pmatrix}e^{ipt}&0&0\\-\tfrac{t^2}{2}e^{ipt}&e^{ipt}&it\,e^{ipt}\\it\,e^{ipt}&0&e^{ipt}\end{pmatrix}\begin{pmatrix}v_0\\v_1\\v_2\end{pmatrix}=\begin{pmatrix}e^{ipt}v_0\\e^{ipt}(v_1+itv_2-t^2v_0/2)\\e^{ipt}(v_2+itv_0)\end{pmatrix},\tag{6.1}$$
which reduces, in inhomogeneous coordinates $y=(y_1,y_2)$, to
$$\varphi^p_t\begin{pmatrix}y_1\\y_2\end{pmatrix}=\begin{pmatrix}e^{ipt}(y_1+ity_2-t^2/2)e^{-ipt}\\e^{ipt}(y_2+it)e^{-ipt}\end{pmatrix}.\tag{6.2}$$
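The closed form (6.1) can be compared with a direct exponentiation of $t\,\tilde T_2(1,ip)$. The sketch below is an illustration only: it works with complex entries, uses a truncated Taylor series in place of a library matrix exponential, and all function names as well as the sample values $t=0.7$, $p=1$ are hypothetical. It also checks that the resulting matrix preserves the form $\tilde F$ of (2.5), as an element of $Sp(1,2)$ should.

```python
# Sketch: compare exp(t * T2_tilde(1, ip)) with the closed form (6.1).
import cmath

F_TILDE = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def dagger(A):
    return [[complex(A[j][i]).conjugate() for j in range(3)] for i in range(3)]

def close(A, B, tol=1e-9):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(3) for j in range(3))

def expm_series(A, terms=40):
    """Matrix exponential via a truncated Taylor series (fine for small norms)."""
    result = [[complex(x) for x in row] for row in IDENTITY]
    term = [[complex(x) for x in row] for row in IDENTITY]
    for n in range(1, terms):
        term = [[x / n for x in row] for row in matmul(term, A)]  # A^n / n!
        result = [[result[i][j] + term[i][j] for j in range(3)] for i in range(3)]
    return result

def t2_tilde(lam, p):
    """T2_tilde(lam, ip) = ip*I + lam*N in the basis of (2.5)."""
    return [[1j * p, 0, 0],
            [0, 1j * p, 1j * lam],
            [1j * lam, 0, 1j * p]]

def flow_closed_form(t, p):
    """The matrix on the right-hand side of (6.1)."""
    e = cmath.exp(1j * p * t)
    return [[e, 0, 0],
            [-t * t / 2 * e, e, 1j * t * e],
            [1j * t * e, 0, e]]

if __name__ == "__main__":
    t, p = 0.7, 1.0
    g = expm_series([[t * x for x in row] for row in t2_tilde(1.0, p)])
    print(close(g, flow_closed_form(t, p)))
    print(close(matmul(dagger(g), matmul(F_TILDE, g)), F_TILDE))
```

The same comparison can be repeated for $\tilde T_0$ and $\tilde T_1$ against (4.3) and (5.1) by swapping in the corresponding generator and closed form.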
The moment map for this action is given in homogeneous or inhomogeneous coordinates by
$$\mu_p(v)=\bar v_0iv_2+\bar v_2iv_0+p(\bar v_0iv_1+\bar v_1iv_0)+p\,\bar v_2iv_2,$$
$$f_p(y)=iy_2+\bar y_2i+p(iy_1+\bar y_1i)+p\,\bar y_2iy_2.$$
Theorem 6.1. Let = 2 (p) = T˜2 (1, ip) and consider the one parameter group p ϕt = e t acting on the quaternionic hyperbolic space HH2 . Then the quaternion p 4 K¨ahler reduction M(p) = µ−1 p (0)/ϕ is diffeomorphic to R for all p, and carries a complete self-dual Einstein metric of negative scalar curvature whose isometry group contains S 1 × R. Furthermore, M(0) is a quaternionic hyperbolic space. Proof. Consider the following set Sp = {y | fp (y1 , y2 ) = 0 and iy2 − y¯2 i = 2 Re(iy2 ) = 0}.
(6.3)
It is clear that this is a global slice for the action of $\varphi^p_t$ on the zero set of the momentum map. For $p = 0$, we obtain $y_2 = 0$, and hence $S_p \cap \mathbb{H}\mathcal{H}^2$ is diffeomorphic to $\{y_1 \in \mathbb{H} : \mathrm{Re}\, y_1 < 0\}$, so let us suppose that $p \neq 0$. We write $y_2 = s_2 + j w_2$ with $s_2 \in \mathbb{R}$ and $w_2 \in \mathbb{C}$. The momentum constraint determines the imaginary part of $i y_1$ in terms of $y_2$. In particular, it implies that $2s_2/p + y_1 + \bar y_1 + s_2^2 - |w_2|^2 = 0$. We find that $y_2$ is constrained to lie in the paraboloid $s_2/p > |w_2|^2$. $M(p)$ is diffeomorphic to the product of this paraboloid with the real line, which is diffeomorphic to $\mathbb{R}^4$, and as before has a complete SDE metric of negative scalar curvature. The isometry group of the quotient metric contains the group generated by $\tilde T_1(\lambda, ip, ip)$, which is isomorphic to $S^1 \times \mathbb{R}$. The last statement follows from a direct computation.

7. The Bergman Metric on the 4-Ball

In this section we turn our attention to the quotients of $\mathbb{H}\mathcal{H}^{1,1} = \mathbb{P}_{\mathbb{H}}(\mathbb{H}^{1,2}_+)$. One could consider all the cases studied in the previous four sections. Locally we will get families of metrics of both positive and negative scalar curvature. However, because $\mathbb{H}\mathcal{H}^{1,1}$ is not Riemannian, singularities can arise when the vector field generating the one-parameter group action is null somewhere on the zero-set of the momentum map. For this reason, we shall restrict our attention to the special case of a diagonal circle action with weights $p$ (cf. [Gal87a]). Furthermore, it will be convenient to switch signature and take quotients of $\mathbb{P}_{\mathbb{H}}(\mathbb{H}^{2,1}_-)$: this means we don't have to reverse the sign of the quotient metric to get a positive definite metric of negative scalar curvature.

We begin by placing the case $p = (1, 1, 1)$ in a more general context. Recall the following construction of the Wolf space $X(2, k) = U(2, k)/U(2) \times U(k)$. We start with the space $\mathbb{P}_{\mathbb{H}}(\mathbb{H}^{2,k}_-)$ and the diagonal circle action on $\mathbb{P}_{\mathbb{H}}(\mathbb{H}^{2,k}_-)$, described in quaternionic coordinates $u = (u_0, u_1, u_2, \ldots, u_{k+1})$ as
\[ \varphi_t(u) = e^{2\pi i t} u, \tag{7.1} \]
where $t \in [0, 1/2)$. The moment map for this action reads
\[ \mu(u) = -\bar u_0 i u_0 - \bar u_1 i u_1 + \sum_{\alpha=2}^{k+1} \bar u_\alpha i u_\alpha. \tag{7.2} \]
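By introducing the complex coordinates $u_\alpha = z_\alpha + j \bar w_\alpha$ and the matrices
\[ Z = \begin{pmatrix} Z_0 \\ Z_1 \end{pmatrix}, \qquad Z_0 = \begin{pmatrix} z_0 & w_0 \\ z_1 & w_1 \end{pmatrix}, \qquad Z_1 = \begin{pmatrix} z_2 & w_2 \\ \vdots & \vdots \\ z_{k+1} & w_{k+1} \end{pmatrix}, \tag{7.3} \]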
we can describe the set $\mu^{-1}(0) \cap H^{2,k}(-1)$ by a matrix equation
\[ -Z_0^\dagger Z_0 + Z_1^\dagger Z_1 = -I_{2\times 2}. \tag{7.4} \]
Now, one observes that the $U(1)\cdot Sp(1) \simeq U(2)$ action which takes us from $\mu^{-1}(0) \cap H^{2,k}(-1)$ to the quotient is nothing but a $U(2)$ matrix multiplication of $Z$ from the right. This action is free and the quotient is a simple bounded domain in $\mathbb{C}^{2k}$. As homogeneous (symmetric) spaces
\[ \mu^{-1}(0) \cap H^{2,k}(-1) \simeq U(2, k)/U(k), \tag{7.5} \]
and
\[ M = \frac{\mu^{-1}(0) \cap H^{2,k}(-1)}{U(2)} = \frac{U(2, k)}{U(2) \times U(k)}. \tag{7.6} \]
In particular, when $k = 1$ we get the complex hyperbolic (or Bergman) metric on the unit ball in $\mathbb{C}^2$. Below, we will show that this construction is rigid in the sense that introducing weights automatically leads to orbifold singularities. As we are interested in 4-dimensional quotients we will set $k = 1$. In the previous sections we have seen that all of the complete $U(2)$-symmetric SDE metrics of negative scalar curvature can be obtained as quaternion Kähler quotients of the ball $\mathbb{P}_{\mathbb{H}}(\mathbb{H}^{1,2}_-)$. The only exception is the complex hyperbolic Bergman metric. The above calculation now shows that this metric can be constructed as a quotient of the pseudo-Riemannian quaternion Kähler manifold $\mathbb{P}_{\mathbb{H}}(\mathbb{H}^{2,1}_-)$. More generally, take weights $p$ and examine the following circle action:
\[ \varphi^p_t(u_0, u_1, u_2) = (e^{2\pi i p_0 t} u_0,\ e^{2\pi i p_1 t} u_1,\ e^{2\pi i p_2 t} u_2), \tag{7.7} \]
where $p = (p_0, p_1, p_2) \in \mathbb{Z}^3$, $\gcd(p_0, p_1, p_2) = 1$, $t \in [0, 1)$ when all the weights are odd, and $t \in [0, 1/2)$ otherwise. Now, the moment map $\mu_p : \mathbb{P}_{\mathbb{H}}(\mathbb{H}^{2,1}_-) \to V$ is given as
\[ \mu_p(u) = -p_0 \bar u_0 i u_0 - p_1 \bar u_1 i u_1 + p_2 \bar u_2 i u_2. \tag{7.8} \]
Theorem 7.1. Let $p \in (\mathbb{Z}^+)^3$ and let $M(p)$ be the quaternion Kähler quotient of $\mathbb{P}_{\mathbb{H}}(\mathbb{H}^{2,1}_-)$ by the above circle action. Then $M(p)$ has orbifold singularities unless $p = (1, 1, 1)$, in which case $M(1, 1, 1) \simeq U(2, 1)/U(2) \times U(1)$ is the symmetric complex hyperbolic metric on the unit ball in $\mathbb{C}^2$.

Proof. Unlike in the case of $\mathbb{P}_{\mathbb{H}}(\mathbb{H}^{2,1}_-)$ we no longer have the advantage of global coordinates. We need to consider two cases:
\[ \mathbb{P}_{\mathbb{H}}(\mathbb{H}^{2,1}_-) = U_0 \cup U_1, \tag{7.9} \]
where each $U_i$ is defined as the open subset on which $u_i \neq 0$. We first consider $U_0$, where we can switch to the inhomogeneous local chart $(x_1^0, x_2^0) = (u_1 u_0^{-1}, u_2 u_0^{-1})$. On $U_0$ we have
\[ -|x_1^0|^2 + |x_2^0|^2 < 1. \tag{7.10} \]
As before, the action and the zero level of the moment map become
\[ \varphi^p_t(x_1^0, x_2^0) = (e^{2\pi i p_1 t} x_1^0 e^{-2\pi i p_0 t},\ e^{2\pi i p_2 t} x_2^0 e^{-2\pi i p_0 t}), \tag{7.11} \]
\[ 0 = -i p_0 - p_1 \bar x_1^0 i x_1^0 + p_2 \bar x_2^0 i x_2^0. \tag{7.12} \]
We then write
\[ x^0 = z^0 + j \bar w^0, \tag{7.13} \]
where $(z^0, w^0) \in U_0$, and observe that on $U_0$
\[ \varphi^p_t \begin{pmatrix} z_1^0 & w_1^0 \\ z_2^0 & w_2^0 \end{pmatrix} = \begin{pmatrix} e^{2\pi i (p_1 - p_0) t} z_1^0 & e^{2\pi i (p_1 + p_0) t} w_1^0 \\ e^{2\pi i (p_2 - p_0) t} z_2^0 & e^{2\pi i (p_2 + p_0) t} w_2^0 \end{pmatrix}, \]
while the moment map equations (7.12) become
\[ -p_1 (|z_1^0|^2 - |w_1^0|^2) + p_2 (|z_2^0|^2 - |w_2^0|^2) = p_0, \qquad -p_1 \bar w_1^0 z_1^0 + p_2 \bar w_2^0 z_2^0 = 0. \tag{7.14} \]
In this case we no longer have any analogue of Proposition 3.1, as $\mu_p^{-1}(0)$ always intersects the open set defined by (7.10). Without loss of generality we will further assume that all weights are non-negative. Furthermore, neither $p_0$ nor $p_1$ can equal 0 if we want the quotient to be non-singular. If, say, $p_0 = 0$, then take $(u_0, 0, 0) \in \mathbb{P}_{\mathbb{H}}(\mathbb{H}^{2,1}_-)$. This point is also on the level set of the moment map and it is fixed by every element of $S^1(p)$. The third weight $p_2$ can be zero. On $U_0 \cap \mu_p^{-1}(0)$ we can choose $z_1^0 = z_2^0 = w_2^0 = 0$ and $|w_1^0|^2 = p_0/p_1$, which is a circle of points fixed by $\mathbb{Z}_{p_0 + p_1}$. Hence, in order to get a smooth quotient we must assume $p_0 = p_1 = 1$ and $p_2$ odd. But then, taking $z_1^0 = w_1^0 = w_2^0 = 0$ and $|z_2^0|^2 = p_0/p_2 = 1/p_2$, one gets a circle of points where the isotropy group equals $\mathbb{Z}_{(p_2 + p_0)/2}$. This forces $p_0 = p_1 = p_2 = 1$. From our previous example we know that $M(1, 1, 1)$ is the complex hyperbolic Bergman metric on $\mathbb{C}^2$.

Remark 7.1. Let us observe that all quaternion Kähler reductions of the symmetric space $X(2, 2) \simeq U(2, 2)/U(2) \times U(2)$ can now be obtained using our construction in a very simple manner. As $X(2, 2)$ is itself a reduction of $\mathbb{P}_{\mathbb{H}}(\mathbb{H}^{2,2}_-)$ by the circle action corresponding to the generator $T_1 = ipI_4$, we can consider all possible quotients of $\mathbb{P}_{\mathbb{H}}(\mathbb{H}^{2,2}_-)$ by 2-dimensional Lie algebras $g = \{T_1, T_2\}$, where $T_2 \in sp(2, 2)$. Since $T_1$ is fixed to be a multiple of the identity these are classified by the adjoint orbit $[T_2]$ in $sp(2, 2)$. Hence, one could begin by enumerating all such classes. Here, there are many more cases. To begin with, $sp(2, 2)$ has 3 different Cartan subalgebras. In addition, we have elements of height 0, 1, 2, 3. In fact, there are six distinct families of 'purely' nilpotent classes [BC77]. All of these quotients can be examined and they should lead to many new metrics.
Example 7.2. The simplest example in which one gets a non-trivial negative SDE Hermitian metric is a deformation of the Bergman metric on the 4-ball. We choose the second generator as
\[ T_2(p, q, r) = \begin{pmatrix} ip & 0 & 0 & 0 \\ 0 & iq & 1 & 0 \\ 0 & 1 & iq & 0 \\ 0 & 0 & 0 & ir \end{pmatrix} \in sp(2, 2). \tag{7.15} \]
One can easily see that $p = q = r = 0$ gives the Bergman metric which should correspond to a 4-parameter family of deformations of this metric. Detailed analysis of this and other quotients will be carried out elsewhere.

8. Quotients, Hyperbolic Eigenfunctions and Bochner-Flat Metrics

The SDE metrics that we have constructed have in common that they possess (at least) two commuting Killing vector fields, and therefore belong to the class of metrics classified locally by Calderbank and Pedersen [CP02]. Furthermore, according to Apostolov–Gauduchon [AG02], quaternion Kähler quotients of $\mathbb{H}\mathcal{H}^2$ are not just SDE, but Hermitian, and are therefore conformal to the self-dual (and therefore Bochner-flat) Kähler metrics classified by Bryant [Bry01]. In this section we relate our metrics to the hyperbolic eigenfunction Ansatz of Calderbank–Pedersen (which gives the explicit local form of the metrics), and to the SDE Hermitian metrics of Apostolov–Gauduchon and Bryant.

We recall that the work of Calderbank and Pedersen shows that an SDE metric of nonzero scalar curvature with two commuting Killing vector fields is determined explicitly (on the open set where the vector fields are linearly independent) by an eigenfunction $F$ of the Laplacian on the hyperbolic plane with eigenvalue 3/4, so it suffices to find the eigenfunction $F$ corresponding to our quotients. According to [CP02], the hyperbolic eigenfunctions $F$ arising as quotients of $\mathbb{H}\mathcal{H}^2$ or $\mathbb{H}\mathcal{H}^{1,1}$ should be either '3-pole' solutions, or limits in which one or more of the 'centers' of the 3-pole coincide. We shall justify this claim here.
8.1. Quotients and hyperbolic eigenfunctions. We first consider SDE manifolds arising as semi-quaternion Kähler quotients of $\mathbb{H}\mathcal{H}^{k-1,l}$ or $\mathbb{H}\mathcal{H}^{k,l-1}$ by $(n-1)$-dimensional Abelian subgroups $G$ of $Sp(k, l)$ (with $k + l = n + 1$) in full generality. Following [CP02], we study such a quotient $(M, g)$ using the Swann bundle $(\tilde M, \tilde g)$, which is the principal $CO(3)$ bundle over $(M, g)$ arising as the corresponding semi-hyperkähler quotient of $\mathbb{H}^{k,l}$. More precisely, we take the semi-hyperkähler quotient by $G$ of (a connected component of) $\mathbb{H}^{k,l}_* = (\mathbb{H}^{k,l} \setminus \mathbb{H}^{k,l}_0)/\{\pm 1\}$, which is a principal $CO(3)$-bundle over $\mathbb{H}\mathcal{H}^{k-1,l} \cup \mathbb{H}\mathcal{H}^{k,l-1}$. $\tilde M$ is thus the quotient by $G$ of the zero-set of the momentum map of $G$ in $\mathbb{H}^{k,l}_*$ and we have a commutative diagram
\[ \begin{array}{ccc} \mathbb{H}^{k,l}_* & \longrightarrow & \mathbb{H}\mathcal{H}^{k-1,l} \cup \mathbb{H}\mathcal{H}^{k,l-1} \\ \downarrow & & \downarrow \\ \tilde M & \longrightarrow & M, \end{array} \]
where the vertical arrows denote semi-hyperkähler and quaternion Kähler quotients, and the horizontal arrows are principal $CO(3)$ bundles: $SO(3)$ acts by isometries, and $\mathbb{R}^+$ by homotheties, so that if $q$ is an $\mathbb{H}^\times$-valued function on the double cover of $\tilde M$ coming from a local trivialization, we have $\tilde g = s|q|^2 g + |dq + q\omega|^2$, where $\omega$ is the principal $SO(3)$ connection on $\tilde M$ and $s$ is a positive multiple of the scalar curvature of $g$, so that $sg$ is a (possibly negative definite) SDE metric of positive scalar curvature. We can then arrange our conventions so that $|q|^2$ pulls back to the zero-set of the momentum map in $\mathbb{H}^{k,l}$ to give the absolute value $|F_{k,l}(u, u)|$ of the quadratic form.

Any $(n-1)$-dimensional Abelian subgroup $G$ of $Sp(k, l)$ lies in a maximal Abelian subgroup $H$, which has dimension $n + 1$. For generic $G$ this maximal Abelian subgroup will be unique, but in general we must choose such a group $H$ so that we have a quotient group $H/G$ acting on $\tilde M$ and $M$. The Lie algebra of this quotient group may be identified with $\mathbb{R}^2$. Now, according to [CP02], there is also a commutative diagram
\[ \begin{array}{ccc} \tilde M & \longrightarrow & M \\ \downarrow & & \downarrow \\ \mathrm{Im}\,\mathbb{H} \otimes_0 \mathbb{R}^2 & \longrightarrow & \mathcal{H}^2, \end{array} \]
where the vertical arrows are (possibly only locally defined) isometric quotients by $H/G$, $\mathrm{Im}\,\mathbb{H} \otimes_0 \mathbb{R}^2$ is the open subset of indecomposable elements of the flat vector space $\mathrm{Im}\,\mathbb{H} \otimes \mathbb{R}^2 \cong \mathrm{Im}\,\mathbb{H} \oplus \mathrm{Im}\,\mathbb{H}$, and $\mathcal{H}^2$ is the hyperboloid of positive definite elements of determinant one in $S^2\mathbb{R}^2$, equipped with the metric induced by the determinant on $S^2\mathbb{R}^2$ (which is the hyperbolic metric). The lower horizontal map, like the upper map, is a principal $CO(3)$-bundle, and is given explicitly by the Grammian map
\[ x = (x_1, x_2) \in \mathrm{Im}\,\mathbb{H} \otimes_0 \mathbb{R}^2 \ \mapsto\ \frac{1}{|x_1 \wedge x_2|} \begin{pmatrix} |x_1|^2 & \langle x_1, x_2 \rangle \\ \langle x_1, x_2 \rangle & |x_2|^2 \end{pmatrix}. \]
Given a hyperbolic eigenfunction $F$ on (an open subset of) $\mathcal{H}^2$, we can lift $F$ to a homogeneity 1/2 function $\tilde F$ on the corresponding union of rays in the space of positive definite elements of $S^2\mathbb{R}^2$. Now we have the following result, which was proven in the definite case (i.e., $k = 0$ or $l = 0$) in [CP02]. The more general result also has a more direct proof, and we correct a minor error in [CP02].

Theorem 8.1. Let $(M^4, g)$ be an SDE metric with two commuting Killing vector fields obtained as a semi-quaternion Kähler quotient of $\mathbb{H}\mathcal{H}^{k-1,l}$ or $\mathbb{H}\mathcal{H}^{k,l-1}$ by an $(n-1)$-dimensional Abelian subgroup $G$ of $Sp(k, l)$ (where $k + l = n + 1$), and let $\tilde F$ be the homogeneity 1/2 lift of the hyperbolic eigenfunction $F$ generating $g$, locally, with respect to a 2-dimensional Abelian quotient group acting by isometries. Then the pullback of $\tilde F$ to the zero-set of the momentum map in $\mathbb{H}^{k,l}$ is a nonzero constant multiple of the restriction of the quadratic form $F_{k,l}(u, u)$.

Proof. Let $A$ be a positive definite element of $S^2\mathbb{R}^2$, and write $A = \sqrt{\det A}\, A_1$ with $A_1 \in \mathcal{H}^2$ having determinant one. Then by definition $\tilde F(A) = (\det A)^{1/4} F(A_1)$ and so $\tilde F = (\det A)^{1/4} F$, where $F$ now denotes the (homogeneity 0) pullback to $S^2\mathbb{R}^2$. Now it was shown in [CP02] that the pullback of the function $A \mapsto \det A$ to the Swann bundle
$\tilde M$ is $|q|^8/|F|^4$ (although the result is incorrectly stated there). It follows that $\tilde F$ pulls back to the Swann bundle to give $|q|^2 F/|F|$, which pulls back to the momentum zero set in $\mathbb{H}^{k,l}$ to give a nonzero constant multiple of the absolute value of the quadratic form times a (possibly nonconstant) sign. However, $\tilde F$ is smooth, even through its zero-set, so the result follows.

Note that the pullback of $\tilde F$ is independent of the choice of quotient torus (in the case that such a choice exists). We are going to use this result to calculate the hyperbolic eigenfunction corresponding to the metrics we have studied in detail here. In order to do this we just need to write the quadratic form $F_{k,l}(u, u)$ in momentum coordinates and restrict it to the zero-set of the momentum map, as we now explain.

Having chosen (if there is a choice) the maximal Abelian subgroup $H$ of $Sp(k, l)$ containing $G$ (and a basis for the Lie algebra of $H$ so that we can identify it with $\mathbb{R}^{n+1}$), we have momentum coordinates $y_0, \ldots, y_n \in \mathrm{Im}\,\mathbb{H}$ which are independent on the open subset $U$ of $\mathbb{H}^{k,l}$ where the $H$ action is free. Since $F_{k,l}(u, u)$ is $H$-invariant it will be a function of $y = (y_0, \ldots, y_n)$ on $U$, and our first task is to compute this function. Then, secondly, we must restrict to the zero-set of the momentum map of the $G$ action.

For this second step, following Bielawski–Dancer [BD00], we introduce an explicit parameterization of the momentum zero-set of $G$ in terms of the momentum coordinates of the quotient torus. To do this, we write the Lie algebra $g$ of $G$ as the kernel of a $2 \times (n + 1)$ matrix $S : \mathbb{R}^{n+1} \to \mathbb{R}^2$. Then the transpose matrix $S^t : \mathbb{R}^{2*} \to \mathbb{R}^{(n+1)*}$ parameterizes the kernel of the projection $\mathbb{R}^{(n+1)*} \to g^*$. Since the momentum map of $H$ is injective on $U$, the momentum zero-set of $G$ in $U$ is the subset where the momentum map of $H$ takes values in the image of $S^t$, so we can parameterize it by writing $y = S^t x$, with $x = (x_1, x_2)$.
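The parameterization by the image of $S^t$ can be illustrated with a small numerical sketch. It is not taken from the paper: the weights `p` and the solution vectors `a`, `b` are hypothetical sample values, a single number stands in for each $\mathrm{Im}\,\mathbb{H}$-valued coordinate, and the column convention $(-b_j, a_j)$ for $S$ is the one used in Sect. 8.2. Under these assumptions, the check is that the Lie algebra element `p` lies in the kernel of $S$ and pairs to zero with every momentum of the form $S^t x$.

```python
def transpose_apply(S, x):
    """Return y = S^t x, where S is a 2 x (n+1) matrix given as two rows."""
    row1, row2 = S
    x1, x2 = x
    return [r1 * x1 + r2 * x2 for r1, r2 in zip(row1, row2)]


def pairing(xi, y):
    """Pair a Lie algebra element xi in R^(n+1) with momenta y in R^(n+1)*."""
    return sum(c * yj for c, yj in zip(xi, y))


# Hypothetical weights spanning the Lie algebra g = ker S (here n = 2).
p = (1, 2, 3)
# Two independent sample solutions of a.p = 0 and b.p = 0.
a = (2, -1, 0)
b = (3, 0, -1)
# S has columns (-b_j, a_j), i.e. its two rows are -b and a.
S = ([-bj for bj in b], list(a))

# p is annihilated by both rows of S, so it lies in the kernel.
kernel_image = [pairing(row, p) for row in S]

# On the image of S^t the g-component of the momentum vanishes.
samples = [(1, 0), (0, 1), (5, -7), (-2, 11)]
g_momenta = [pairing(p, transpose_apply(S, x)) for x in samples]
```

Each component of `transpose_apply(S, x)` is $a_j x_2 - b_j x_1$, which is the combination that appears in the multipole formulas.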
The hyperbolic eigenfunction is now obtained by substituting this into the quadratic form $F_{k,l}$, writing the result in terms of the $SO(3)$ invariants $\langle x_i, x_j \rangle$ and restricting to the hyperboloid $\det \langle x_i, x_j \rangle = 1$, where we can write
\[ \begin{pmatrix} |x_1|^2 & \langle x_1, x_2 \rangle \\ \langle x_1, x_2 \rangle & |x_2|^2 \end{pmatrix} = \begin{pmatrix} 1/\rho & \eta/\rho \\ \eta/\rho & (\rho^2 + \eta^2)/\rho \end{pmatrix} \]
for half-space coordinates $(\rho, \eta)$ on $\mathcal{H}^2$. We now carry out this procedure for the examples we have studied.

8.2. Subgroups of a maximal torus and the generalized Pedersen–LeBrun metrics. Let $H \cong (S^1)^{n+1}$ be the standard maximal torus in $Sp(k, l)$ acting diagonally on $\mathbb{H}^{k,l}$ with respect to the coordinates $(u_0, \ldots, u_n)$, i.e., the $j$th circle acts by scalar multiplication by $e^{it}$ on the $j$th coordinate $u_j$, and has momentum map $y_j = \bar u_j i u_j$. We therefore have
\[ F_{k,l}(u, u) = -\sum_{j=0}^{k-1} |y_j| + \sum_{j=k}^{k+l-1} |y_j|. \]
On the zero-set of the momentum map of $G$ we then get
\[ F_{k,l}(u, u) = -\sum_{j=0}^{k-1} |a_j x_2 - b_j x_1| + \sum_{j=k}^{k+l-1} |a_j x_2 - b_j x_1|, \]
where the matrix $S_{ij}$ defining $g$ has columns $(-b_j, a_j)$. We now observe that
\[ |a x_2 - b x_1| = \sqrt{a^2 |x_2|^2 - 2ab \langle x_1, x_2 \rangle + b^2 |x_1|^2} = \frac{\sqrt{a^2 (\rho^2 + \eta^2) - 2ab\eta + b^2}}{\sqrt{\rho}} = \frac{\sqrt{a^2 \rho^2 + (a\eta - b)^2}}{\sqrt{\rho}}, \tag{8.1} \]
and thus the corresponding hyperbolic eigenfunction is
\[ F(\rho, \eta) = -\sum_{j=0}^{k-1} \frac{\sqrt{a_j^2 \rho^2 + (a_j \eta - b_j)^2}}{\sqrt{\rho}} + \sum_{j=k}^{k+l-1} \frac{\sqrt{a_j^2 \rho^2 + (a_j \eta - b_j)^2}}{\sqrt{\rho}}, \]
in accordance with the discussion in [CP02] (see also [CS03]). These hyperbolic eigenfunctions may be interpreted as 'multipole' solutions, in the sense that they are a linear combination of solutions of the form (8.1), which we regard as the eigenfunction generated by a monopole source at the point $\eta = b/a$ on the boundary $\rho = 0$ of the hyperbolic plane (which is a circle $\mathbb{R} \cup \{\infty\}$). In the case studied in this paper, $n = 2$, and the vectors $(a_0, a_1, a_2)$ and $(b_0, b_1, b_2)$ are any two linearly independent solutions to the equation $a_0 p_0 + a_1 p_1 + a_2 p_2 = 0$, where $p_0, p_1, p_2$ are the weights of the torus action. Note that $SL(2, \mathbb{R})$ acts on the vectors $(a_j, b_j)$ to give equivalent solutions so that the points $b_j/a_j$ can be fixed (for instance at $1$, $-1$ and $\infty$, as in [CP02]; we remark that the points are distinct provided the weights $p_0, p_1, p_2$ are nonzero).

8.3. The generalized Pedersen metrics. For the generalized Pedersen metrics, the family of generators that we are using span a Cartan subalgebra of $sp(1, 2)$ which is not the Lie algebra of a maximal torus. However, this can be understood as an analytic continuation of the generalized Pedersen–LeBrun metrics (replace λ by it, where $i$ acting on the left is a complex scalar commuting with the right quaternionic action, and diagonalize). As discussed in [CP02] this implies that the hyperbolic eigenfunction can be assumed to take the form
\[ F(\rho, \eta) = \frac{a}{\sqrt{\rho}} + \frac{b + ic}{2} \frac{\sqrt{\rho^2 + (\eta + i)^2}}{\sqrt{\rho}} + \frac{b - ic}{2} \frac{\sqrt{\rho^2 + (\eta - i)^2}}{\sqrt{\rho}}. \]
This is still a 3-pole solution, but two of the sources are complex conjugate rather than real.
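The eigenfunction property of the real multipole solutions can be checked numerically. The sketch below is not part of the paper and rests on an assumed convention: that in the half-space coordinates $(\rho, \eta)$ the hyperbolic Laplacian is $\rho^2(\partial_\rho^2 + \partial_\eta^2)$ and that the eigenvalue equation reads $\Delta F = \tfrac34 F$. The sample sources (signs for $k = 1$, $l = 2$, poles at $\eta = 1, -1, \infty$) and the evaluation point are illustrative choices only.

```python
import math


def monopole(rho, eta, a, b):
    """Monopole solution sqrt(a^2 rho^2 + (a eta - b)^2) / sqrt(rho), as in (8.1)."""
    return math.sqrt(a * a * rho * rho + (a * eta - b) ** 2) / math.sqrt(rho)


def multipole(rho, eta, sources):
    """Signed sum of monopoles; sources is a list of (sign, a, b) triples."""
    return sum(sign * monopole(rho, eta, a, b) for sign, a, b in sources)


def hyperbolic_laplacian(f, rho, eta, h=1e-3):
    """rho^2 (d^2/drho^2 + d^2/deta^2) by central differences (assumed convention)."""
    centre = f(rho, eta)
    d2_rho = (f(rho + h, eta) - 2 * centre + f(rho - h, eta)) / h ** 2
    d2_eta = (f(rho, eta + h) - 2 * centre + f(rho, eta - h)) / h ** 2
    return rho ** 2 * (d2_rho + d2_eta)


# Sample 3-pole with sources at eta = b/a = 1, -1 and infinity (a = 0).
sources = [(-1, 1, 1), (1, 1, -1), (1, 0, 1)]


def F(rho, eta):
    return multipole(rho, eta, sources)


rho0, eta0 = 0.7, 0.3
residual = hyperbolic_laplacian(F, rho0, eta0) - 0.75 * F(rho0, eta0)
```

With this convention the residual is at the level of the finite-difference error, consistent with the eigenvalue 3/4 quoted from [CP02].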
8.4. The height one quotients. In the remaining cases, it is more convenient to begin with the coordinates $v = (v_0, v_1, v_2)$ that we introduced already before, so that $F_{1,2}(u, u) = \bar v_0 v_1 + \bar v_1 v_0 + \bar v_2 v_2$. In the case of the height one quotients, the momentum coordinates (in terms of the $v_0, v_1, v_2$ coordinates) that we shall use are
\[ y_0 = \bar v_1 i v_0 + \bar v_0 i v_1, \qquad y_1 = -\bar v_0 i v_0, \qquad y_2 = \bar v_2 i v_2, \]
and we compute
\[ F_{1,2}(u, u) = \bar v_0 v_1 + \bar v_1 v_0 + \bar v_2 v_2 = \frac{\langle y_0, y_1 \rangle}{|y_1|} + |y_2|. \]
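After substituting for $x_1, x_2$, the second term is treated as before, so it suffices to compute
\[ \frac{\langle a_0 x_2 - b_0 x_1,\ a_1 x_2 - b_1 x_1 \rangle}{|a_1 x_2 - b_1 x_1|} = \frac{a_0 a_1 \rho^2 + (a_0 \eta - b_0)(a_1 \eta - b_1)}{\sqrt{\rho}\, \sqrt{a_1^2 \rho^2 + (a_1 \eta - b_1)^2}}. \]
Under the action of $SL(2, \mathbb{R})$ this is equivalent to
\[ F(\rho, \eta) = \frac{\eta}{\sqrt{\rho}\, \sqrt{\rho^2 + \eta^2}} = \frac{\partial}{\partial \eta} \frac{\sqrt{\rho^2 + \eta^2}}{\sqrt{\rho}}, \]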
which may be interpreted as an ‘infinitesimal dipole’, i.e., a limit of oppositely charged monopoles at η = ±ε as ε → 0. Thus the hyperbolic eigenfunctions corresponding to height one quotients are combinations of a monopole and an infinitesimal dipole.
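The limiting description of the infinitesimal dipole can be made concrete with a short numerical sketch (an illustration, not the authors' computation). It assumes the normalization in which the two opposite monopoles at $\eta = \pm\varepsilon$ are divided by $2\varepsilon$, so that the limit is the $\eta$-derivative of the monopole at $\eta = 0$; the evaluation point is an arbitrary sample.

```python
import math


def monopole_at(rho, eta, c):
    """Monopole with source at eta = c on the boundary rho = 0."""
    return math.sqrt(rho * rho + (eta - c) ** 2) / math.sqrt(rho)


def dipole(rho, eta):
    """Infinitesimal dipole eta / (sqrt(rho) sqrt(rho^2 + eta^2))."""
    return eta / (math.sqrt(rho) * math.sqrt(rho * rho + eta * eta))


def dipole_from_monopoles(rho, eta, eps):
    """Opposite monopoles at eta = -eps and eta = +eps, scaled by 1 / (2 eps)."""
    return (monopole_at(rho, eta, -eps) - monopole_at(rho, eta, eps)) / (2 * eps)


rho0, eta0 = 0.7, 0.3
errors = [
    abs(dipole_from_monopoles(rho0, eta0, eps) - dipole(rho0, eta0))
    for eps in (1e-1, 1e-2, 1e-3)
]
```

The entries of `errors` shrink as the two sources approach each other, which is the sense in which the dipole is a limit of monopole pairs.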
8.5. The height two quotients. In the height two case, the maximal Abelian subalgebra containing $T_2$ is unique, being spanned by $iI_3$, $T_2$ and $T_1 = iT_2^2$. The momentum coordinates of these generators are
\[ y_0 = \bar v_1 i v_0 + \bar v_0 i v_1 + \bar v_2 i v_2, \qquad y_1 = \bar v_2 i v_0 + \bar v_0 i v_2, \qquad y_2 = -\bar v_0 i v_0. \]
Writing the quadratic form in these coordinates is straightforward once one has computed all the inner products between them. The result is
\[ F_{1,2}(u, u) = \bar v_0 v_1 + \bar v_1 v_0 + \bar v_2 v_2 = \frac{|y_1|^2 |y_2|^2 - \langle y_1, y_2 \rangle^2 + 2 \langle y_0, y_2 \rangle |y_2|^2}{2 |y_2|^3}. \]
The family of quotients we consider is the span of $iI_3$ and $T_2$, so we can take $y_0 = a_0 x_1$, $y_1 = a_1 x_1$ and $y_2 = x_2$ as our parameterization in quotient coordinates to yield
\[ \frac{a_1^2 (|x_1|^2 |x_2|^2 - \langle x_1, x_2 \rangle^2) + 2 a_0 \langle x_1, x_2 \rangle |x_2|^2}{2 |x_2|^3} = a_0 \frac{\eta}{\sqrt{\rho}\, \sqrt{\rho^2 + \eta^2}} + \frac{a_1^2}{2} \frac{\rho^{3/2}}{(\rho^2 + \eta^2)^{3/2}}. \]
We recognise the first term as an infinitesimal dipole. Differentiating again with respect to $\eta$, we see that the second term may be regarded as an infinitesimal tripole. As the two terms have different homogeneities in $(\rho, \eta)$, by scaling the coordinates and the eigenfunction, we have just three distinct quotients:
• $a_0 = 0$, the pure tripole, corresponds to the quotient by $iI_3$, which is complex hyperbolic space (under the non-semisimple $\mathbb{R}^2$ action induced by $T_1$ and $T_2$);
• $a_1 = 0$, the pure dipole, corresponds to the quotient by $T_2$, which is real hyperbolic space (under the non-semisimple $S^1 \times \mathbb{R}$ action induced by $T_1$ and $iI_3$);
• the nontrivial case with $a_0$, $a_1$ both nonzero.
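The reduction of the height two expression to a dipole plus a tripole can be verified numerically. The following sketch is only an illustration: it assumes the half-space form of the Gram matrix from Sect. 8.1, namely $|x_1|^2 = 1/\rho$, $\langle x_1, x_2 \rangle = \eta/\rho$, $|x_2|^2 = (\rho^2 + \eta^2)/\rho$, and the sample values of $\rho$, $\eta$, $a_0$, $a_1$ are arbitrary.

```python
import math


def gram(rho, eta):
    """Return (|x1|^2, <x1, x2>, |x2|^2) on the hyperboloid det = 1."""
    return 1 / rho, eta / rho, (rho * rho + eta * eta) / rho


def height_two_invariant_form(rho, eta, a0, a1):
    """Quadratic form written through the SO(3) invariants of x1, x2."""
    n1, ip, n2 = gram(rho, eta)
    return (a1 * a1 * (n1 * n2 - ip * ip) + 2 * a0 * ip * n2) / (2 * n2 ** 1.5)


def dipole_plus_tripole(rho, eta, a0, a1):
    """a0 * dipole + (a1^2 / 2) * tripole in half-space coordinates."""
    r2 = rho * rho + eta * eta
    dipole = eta / (math.sqrt(rho) * math.sqrt(r2))
    tripole = rho ** 1.5 / r2 ** 1.5
    return a0 * dipole + 0.5 * a1 * a1 * tripole


points = [(0.7, 0.3, 1.0, 2.0), (1.5, -0.8, -0.5, 1.0), (0.2, 2.0, 3.0, 0.0)]
differences = [
    abs(height_two_invariant_form(*pt) - dipole_plus_tripole(*pt)) for pt in points
]
```

The two expressions agree to rounding error at each sample point, including the pure dipole case `a1 = 0`.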
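8.6. Infinitesimal multipoles from the quotient point of view. We have seen, as conjectured in [CP02], that the nilpotent cases (height one and height two quotients) can be regarded as limiting cases in which two or more monopoles come together to form an infinitesimal multipole. This can be seen from the group theory of the quotient construction by realizing a non-semisimple element as a limit of semisimple ones. For example, consider the following generator in $sp(1, 2)$:
\[ T_{p,\lambda} = \begin{pmatrix} ip_0 & \lambda & 0 \\ \lambda & ip_1 & 0 \\ 0 & 0 & ip_2 \end{pmatrix} \in sp(1, 2). \tag{8.2} \]
Generically this generator is of height 0 but a special choice of the parameters $(p, \lambda)$ raises the height to 1. To see this, let us consider the following one-parameter group actions on the ball:
\[ \varphi^{p,\lambda}_t(u) = \exp(T_{p,\lambda} t) \cdot u \equiv A_{p,\lambda}(t) \cdot u, \]
where now all $(\lambda, p_0, p_1, p_2)$ are real parameters, $A_{p,\lambda}(t) \in U(1, 2) \subset Sp(1, 2)$, and we assume $\lambda \neq 0$. We set
\[ \alpha = \frac{p_0 - p_1}{2}, \qquad \beta = \frac{p_0 + p_1}{2}, \qquad \gamma = \sqrt{|\alpha^2 - \lambda^2|}. \tag{8.3} \]
We can compute the matrix $A_{p,\lambda}(t)$ explicitly. Depending on the sign of $\alpha^2 - \lambda^2$ we get three distinct cases. If we denote the corresponding $U(1, 2)$ matrices by $A^+_{p,\lambda}(t)$, $A^0_{p,\lambda}(t)$, and $A^-_{p,\lambda}(t)$, we obtain
\[ A^+_{p,\lambda}(t) = \begin{pmatrix} e^{i\beta t}(\cosh \gamma t + \frac{i\alpha}{\gamma} \sinh \gamma t) & \frac{\lambda}{\gamma} e^{i\beta t} \sinh \gamma t & 0 \\ \frac{\lambda}{\gamma} e^{i\beta t} \sinh \gamma t & e^{i\beta t}(\cosh \gamma t - \frac{i\alpha}{\gamma} \sinh \gamma t) & 0 \\ 0 & 0 & e^{ip_2 t} \end{pmatrix}, \]
\[ A^0_{p,\lambda}(t) = \begin{pmatrix} e^{i\beta t}(1 + i\alpha t) & e^{i\beta t} \lambda t & 0 \\ e^{i\beta t} \lambda t & e^{i\beta t}(1 - i\alpha t) & 0 \\ 0 & 0 & e^{ip_2 t} \end{pmatrix}, \]
\[ A^-_{p,\lambda}(t) = \begin{pmatrix} e^{i\beta t}(\cos \gamma t + \frac{i\alpha}{\gamma} \sin \gamma t) & \frac{\lambda}{\gamma} e^{i\beta t} \sin \gamma t & 0 \\ \frac{\lambda}{\gamma} e^{i\beta t} \sin \gamma t & e^{i\beta t}(\cos \gamma t - \frac{i\alpha}{\gamma} \sin \gamma t) & 0 \\ 0 & 0 & e^{ip_2 t} \end{pmatrix}. \]
Note that $\lim_{\gamma \to 0} A^+_{p,\lambda}(t) = \lim_{\gamma \to 0} A^-_{p,\lambda}(t) = A^0_{p,\lambda}(t)$. Also, $A^-_{p,\lambda}(t)$ is actually a circle provided the triple $(\gamma, \beta, p_2)$ is commensurate (all ratios are in $\mathbb{Q}$). Note that the above calculation has to do with writing
\[ T_{p,\lambda} = L + N = \begin{pmatrix} i\beta & 0 & 0 \\ 0 & i\beta & 0 \\ 0 & 0 & ip_2 \end{pmatrix} + \begin{pmatrix} i\alpha & \lambda & 0 \\ \lambda & -i\alpha & 0 \\ 0 & 0 & 0 \end{pmatrix}, \tag{8.4} \]
where $[L, N] = 0$ and $L = T_0(i\beta, i\beta, ip_2)$. Now, $N^2 = 0$ when $\lambda^2 = \alpha^2$. This shows that when $\lambda^2 = \alpha^2$, $T_{p,\lambda}$ must be conjugate to some $T_1(\mu, p, q)$ of Definition 2.1. When $\lambda^2 \neq \alpha^2$ the generator $T_{p,\lambda}$ has height 0 and, depending on the sign of $\lambda^2 - \alpha^2$, is conjugate either to some $T_0(\mu, ip, iq)$ or $T_0(iq_0, iq_1, iq_2)$. In either case, one can think of height 1 metrics as certain limits of height 0 metrics.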
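The three closed forms for the exponential can be cross-checked against a Taylor series. The sketch below is an illustration under stated assumptions rather than the authors' computation: it treats only the upper-left $2 \times 2$ block of $N$ (the full matrix is obtained by multiplying this block by $e^{i\beta t}$ and adjoining $e^{ip_2 t}$), it uses the identity $N^2 = (\lambda^2 - \alpha^2) I$ on that block, and the sample values of $\alpha$, $\lambda$, $t$ are arbitrary.

```python
import math


def mat_mul(A, B):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def exp_series(N, t, terms=60):
    """exp(N t) computed from its Taylor series."""
    result = [[1 + 0j, 0j], [0j, 1 + 0j]]
    term = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for n in range(1, terms):
        term = mat_mul(term, N)
        term = [[entry * t / n for entry in row] for row in term]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result


def exp_closed(alpha, lam, t):
    """Closed form c*I + s*N, valid because N^2 = (lam^2 - alpha^2) I."""
    d = lam * lam - alpha * alpha
    if d > 0:  # hyperbolic case (cosh, sinh)
        g = math.sqrt(d)
        c, s = math.cosh(g * t), math.sinh(g * t) / g
    elif d < 0:  # elliptic case (cos, sin)
        g = math.sqrt(-d)
        c, s = math.cos(g * t), math.sin(g * t) / g
    else:  # nilpotent case, N^2 = 0
        c, s = 1.0, t
    return [[c + 1j * alpha * s, lam * s], [lam * s, c - 1j * alpha * s]]


def max_error(alpha, lam, t):
    """Largest entrywise gap between the series and the closed form."""
    N = [[1j * alpha, lam], [lam, -1j * alpha]]
    A, B = exp_series(N, t), exp_closed(alpha, lam, t)
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))


cases = {"plus": (1.0, 2.0), "zero": (1.5, 1.5), "minus": (2.0, 1.0)}
errors = {name: max_error(alpha, lam, 0.7) for name, (alpha, lam) in cases.items()}
```

In this sketch the hyperbolic branch is the one with $\lambda^2 > \alpha^2$, the elliptic branch the one with $\lambda^2 < \alpha^2$, and the nilpotent branch $\lambda^2 = \alpha^2$ is their common limit as $\gamma \to 0$.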
8.7. Quotients and Bochner-flat Kähler metrics. We finally discuss the relationship between quaternion Kähler quotients of $\mathbb{H}\mathcal{H}^2$ or $\mathbb{H}\mathcal{H}^{1,1}$ and Bochner-flat (i.e., self-dual) Kähler surfaces. On a self-dual Kähler surface $(M, h, J)$ the conformal metric $g = s_h^{-2} h$, defined wherever the scalar curvature $s_h$ of $h$ is nonzero, is an SDE metric. Conversely, $h$ can be recovered from $g$ using the fact that the Weyl tensor $W = W^+$ of a self-dual Kähler surface is a constant multiple of $s_h\, \omega \otimes_0 \omega$, where $\omega$ is the Kähler form and the subscript zero denotes the tracefree part in $S^2_0(\Lambda^2_+ T^*M)$: thus, up to a constant, $s_h = |W|_h = |W|_g^{1/3}$ and $h = |W|_g^{2/3} g$. This sets up a one to one correspondence, at least locally, between self-dual Kähler metrics and SDE Hermitian metrics [Der83, AG02] which are not conformally flat. ($h$ and $g$ are equal up to homothety iff they are locally symmetric.)

Bochner-flat Kähler manifolds have been completely classified, locally and globally, by Bryant [Bry01]. The local classification is quite easy to understand: over a Bochner-flat Kähler $2n$-manifold $M$, the (locally defined) rank 1 bundle with connection, whose curvature is the Kähler form of $M$, has a flat CR structure (given by the horizontal lift of the Kähler structure on $M$) and is therefore locally CR isomorphic to $S^{2n+1}$. This realises the Kähler metric on $M$ as a local quotient of $S^{2n+1}$ by a one parameter subgroup of $PSU(1, n + 1)$, the group of CR automorphisms of $S^{2n+1}$ (which is naturally realised as the quadric of totally null complex lines in the projective space of $\mathbb{C}^{1,n+1}$). It then follows that Bochner-flat Kähler metrics are classified by adjoint orbits in $su(1, n + 1)$.
Specialising to $n = 2$, self-dual Kähler surfaces are classified, as local quotients of $S^5$, by adjoint orbits in $su(1, 3)$, and it is natural to conjecture that the corresponding SDE Hermitian metrics are obtained as (perhaps only local) quaternion Kähler quotients of $\mathbb{HP}^2$, $\mathbb{H}\mathcal{H}^2$ and $\mathbb{H}\mathcal{H}^{1,1}$, classified by adjoint orbits in $sp(3)$ and $sp(1, 2)$. This is essentially correct, as the work of Apostolov–Gauduchon [AG02] shows.

Proposition 8.2. Let $(M, g)$ be a self-dual Einstein manifold given as a (semi-)quaternion Kähler quotient of $\mathbb{HP}^2$, $\mathbb{H}\mathcal{H}^2$ or $\mathbb{H}\mathcal{H}^{1,1}$ by a (possibly local) $S^1$ or $\mathbb{R}$ action. Then $(M, g)$ admits a compatible Hermitian structure and there is an invariant Sasakian structure on the momentum zero-set of the action, whose underlying CR structure is flat.

Sketch proof. We outline the arguments, referring the reader to Apostolov–Gauduchon [AG02] for more details. Let $K$ be a quaternionic Killing vector field on a (semi-)quaternion Kähler manifold $Q$ of nonzero scalar curvature; this means that $\nabla K \in C^\infty(Q, V_Q \oplus sp(TQ)) \subset C^\infty(Q, so(TQ))$, where $V_Q$ is the bundle of $sp(1)$'s in $so(TQ)$ defining the quaternionic structure and $sp(TQ) \subset so(TQ)$ is the bundle of $sp(n)$'s in $so(TQ)$ consisting of the skew endomorphisms which commute with $V_Q$. Since the scalar curvature is nonzero, the momentum map of $K$ is defined to be the $V_Q$ component of $\nabla K$. It follows that on the zero-set $S$ of the momentum map, $\nabla K$ is a section of $sp(TQ)$. It is also $K$ invariant, so its horizontal part descends to the (perhaps only locally defined) quotient $M = S/K$, which is the quaternion Kähler quotient of $Q$ by $K$, to give a section of $sp(TM)$. If $Q$ is an 8-manifold, then $M$ is a 4-manifold and $so(TM) = V_M \oplus sp(TM)$, and with our conventions $V_M = so_-(TM)$ and $sp(TM) \cong so_+(TM)$, the bundles of (anti-)self-dual endomorphisms associated to $\Lambda^2_- T^*M$ and $\Lambda^2_+ T^*M$ using the metric.

It follows that, wherever this section of $sp(TM)$ is nonzero, multiplying it by $\sqrt{2}$ and dividing by its norm gives an almost complex structure which is self-dual (i.e., orthogonal and commuting with the quaternionic structure), so that $M$ is an almost Hermitian manifold. Apostolov and Gauduchon show that this complex structure is integrable if $Q$
is $\mathbb{HP}^2$ or $\mathbb{H}\mathcal{H}^2$ and their argument applies unchanged to $\mathbb{H}\mathcal{H}^{1,1}$ (it is a straightforward consequence of the fact that these spaces are flat as quaternionic manifolds). Thus $M$ is SDE Hermitian, as claimed, and one can check that the conformal Kähler metric $h$ is $|K|^{-2} g$. Now the curvature of the rank 1 bundle $S \to M$ (i.e., the horizontal part of the 2-form associated to $|K|^{-2} \nabla K$) is then the Kähler form of $M$, so that the Kähler structure on $M$ lifts to the horizontal distribution to give a $K$-invariant Sasakian structure on $S$. This is the canonical Sasakian structure associated to $(M, h)$, and the underlying CR structure is flat because $(M, h)$ is self-dual.

A flat CR manifold is locally isomorphic to $S^5$ with its standard flat CR structure (as the projective light cone in $\mathbb{C}^{1,3}$). Since $K$ generates an action by CR automorphisms, such a local isomorphism determines an element of $su(1, 3)$, the Lie algebra of CR automorphisms of $S^5$. However, the local isomorphism is only determined up to conjugation by $PSU(1, 3)$, so we do not obtain a Lie algebra homomorphism from $sp(3)$ or $sp(1, 2)$ to $su(1, 3)$; these Lie algebras are certainly not isomorphic. Nevertheless, the classifications of self-dual Kähler manifolds (in terms of adjoint orbits in $su(1, 3)$) and quotients of $\mathbb{HP}^2$, $\mathbb{H}\mathcal{H}^2$ and $\mathbb{H}\mathcal{H}^{1,1}$ (in terms of adjoint orbits in $sp(3)$ and $sp(1, 2)$) do essentially coincide. This is slightly subtle, as in both quotient constructions the manifold (or orbifold) corresponding to a conjugacy class may not be connected: for the Kähler metric, these components correspond to Bryant's 'momentum cells', whereas for the Einstein metric, the conformal infinity (which in Kähler terms is the zero-set of $s_h$) separates the quotients of $\mathbb{H}\mathcal{H}^2$ from the quotients of $\mathbb{H}\mathcal{H}^{1,1}$. Also some of the self-dual Kähler quotients of $S^5$ will have associated Einstein metrics which are scalar-flat, while some of the SDE quotients of $\mathbb{HP}^2$, $\mathbb{H}\mathcal{H}^2$ and $\mathbb{H}\mathcal{H}^{1,1}$ will be conformally flat.
One way to relate the classifications is to observe that every element of $su(1, 3)$ has a spacelike eigenvector, and some of them (the 'elliptic' elements) have a timelike eigenvector too. Since $PSU(1, 3)$ acts transitively on the spacelike or timelike lines, we can fix one of each and conjugate any element of $su(1, 3)$ into $u(1, 2)$, and the elliptic elements into $u(3)$. On the other hand all adjoint orbits in $sp(3)$ are represented by elements of $u(3)$, and the same is true for $sp(1, 2)$, since we have given representatives in $u(1, 2)$ in Definition 2.1.

Remark 8.1. There is a rather beautiful Hermitian/quaternionic real form of the classical Klein correspondence that allows us to make the identification of adjoint orbits more natural. Recall that there is a special isomorphism between $so(6, \mathbb{C})$ and $sl(4, \mathbb{C})$: $\mathbb{C}^4$ is the spin representation of $so(6, \mathbb{C})$, or, more straightforwardly, $sl(4, \mathbb{C})$ acts on $\Lambda^2 \mathbb{C}^4$ (via $A \cdot u \wedge v = A(u) \wedge v + u \wedge A(v)$) preserving a complex bilinear form $g_c$ given by the contraction of $(\alpha, \beta) \mapsto \alpha \wedge \beta$ with the volume element. This isomorphism underlies the Klein correspondence:
• lines in $P(\mathbb{C}^4)$ correspond bijectively to points on the quadric in $P(\Lambda^2 \mathbb{C}^4)$ ($P(U)$ corresponds to the null line $\Lambda^2 U$);
• points in $P(\mathbb{C}^4)$ correspond bijectively to α-planes in the quadric ($[u]$ corresponds to the projectivization of the maximal totally null subspace $\{u \wedge v : v \in \mathbb{C}^4\}$);
• planes in $P(\mathbb{C}^4)$ correspond bijectively to β-planes in the quadric ($P(W)$ corresponds to the projectivization of the maximal totally null subspace $\Lambda^2 W$).
Now $su(1, 3)$ is the real form of $sl(4, \mathbb{C})$ preserving a Hermitian metric $(\cdot, \cdot)$ of signature $(1, 3)$. Consider now the Hodge star operator on $\Lambda^2 \mathbb{C}^{1,3}$ defined by $(*\alpha) \wedge \beta = (\alpha, \beta)\,\mathrm{vol}$. For this to make sense, we must take the Hermitian metric to be anti-linear in $\alpha$ and thus
$*$ anti-commutes with $i$. The signature of the metric implies that $*^2 = -1$, so $j := *$ defines a quaternionic structure on $\Lambda^2 \mathbb{C}^{1,3}$. It is convenient to make $\Lambda^2 \mathbb{C}^{1,3}$ into a right quaternionic vector space in this way (thus $k = ij = j \circ i$). We denote by $so^*(3, \mathbb{H})$ the subalgebra of $so(6, \mathbb{C})$ commuting with $j$: it is the real form isomorphic to $su(1, 3)$. We can describe it in quaternionic terms as the Lie algebra of the group of $\mathbb{H}$-linear transformations of $\mathbb{H}^3$ preserving an $(i, j, k)$-invariant skew form $\omega$, and hence also the triple of signature $(6, 6)$ symmetric forms $g_i, g_j, g_k$ defined by $g_i(a, b) = \omega(ai, b)$ and so on. Note that $g_i$ is $i$-invariant, but is anti-invariant with respect to $j$ and $k$, and similarly for $g_j$ and $g_k$. Hence the quaternionic definition is related to the complex one by taking $g_j$ to be the real part of $g_c$ (since $g_c$ is $i$-bilinear and $j$-invariant).

A spacelike or timelike line in $\mathbb{C}^{1,3}$ defines a maximal totally null (α) subspace of $\Lambda^2 \mathbb{C}^{1,3}$ and its perpendicular hyperplane defines a complementary maximal totally null (β) subspace. Such a decomposition is equivalently given by a $g_j$-orthogonal complex structure $I$ on $\Lambda^2 \mathbb{C}^{1,3}$ commuting with the quaternionic structure: the null subspaces are the $\pm i$ eigenspaces. Note that $g(a, b) = \omega(Ia, b)$ is therefore an $(i, j, k)$-invariant inner product and it is easy to check that it is indefinite or definite according to whether the line is spacelike or timelike. An element of $su(1, 3)$ belongs to $u(1, 2)$ or $u(3)$ (i.e., preserves the spacelike or timelike line) if and only if its action on $\Lambda^2 \mathbb{C}^{1,3}$ commutes with $I$, if and only if it is skew with respect to $g$. In fact this realizes $u(1, 2)$ and $u(3)$ as $sp(1, 2) \cap so^*(3, \mathbb{H})$ and $sp(3) \cap so^*(3, \mathbb{H})$ respectively. For example, consider the diagonal element
\[ \begin{pmatrix} ir_0 & 0 & 0 & 0 \\ 0 & ir_1 & 0 & 0 \\ 0 & 0 & ir_2 & 0 \\ 0 & 0 & 0 & ir_3 \end{pmatrix} \]
with $r_0 + r_1 + r_2 + r_3 = 0$, defined using the standard basis $e_0, e_1, e_2, e_3$ for $\mathbb{C}^{1,3}$ with $e_0$ timelike.

Its action on $\Lambda^2 \mathbb{C}^{1,3}$ with respect to the quaternionic basis $e_0 \wedge e_1$, $e_0 \wedge e_2$, $e_0 \wedge e_3$ is easily computed to be
\[ \begin{pmatrix} i(r_0 + r_1) & 0 & 0 \\ 0 & i(r_0 + r_2) & 0 \\ 0 & 0 & i(r_0 + r_3) \end{pmatrix}, \]
where $i$ acts by left multiplication (we have chosen our quaternionic basis so that the complex structure $I$ determined by $e_0$ is left multiplication by $i$).

Adjoint orbits in $su(1, 3)$ are essentially determined by their characteristic and minimal polynomials, and Bryant [Bry01] gives his classification in these terms (more precisely, in terms of the polynomials of the associated Hermitian matrices). If $P_c$ is the characteristic polynomial and $P_m$ is the minimal polynomial, then the degree $d$ of $P_c/P_m$ determines the local cohomogeneity of the self-dual Kähler metric as $2 - d$. In the generic, local cohomogeneity two case, Bryant discusses the classification in detail, which he divides into Cases 1–4. For reference, we shall give the correspondence between adjoint orbits in $su(1, 3)$ and $sp(1, 2)$ which relate Bryant's classification to ours. We do this by giving the characteristic and minimal polynomials $P_c$ corresponding to the representatives in Definition 2.1. These correspondences are obtained by choosing an element of $su(1, 3)$ with given $P_c$,
Pm and spacelike eigenvector e1 , and computing its action on 2 C1,3 with quaternionic basis e1 ∧ e0 , e1 ∧ e2 , e1 ∧ e3 . We recall that there are exceptional adjoint orbits which do not give SDE and self-dual K¨ahler metrics which are conformal. For these exceptional orbits, the self-dual K¨ahler metric is conformal to a scalar-flat SDE metric, while the quotient SDE metric is the real hyperbolic metric (conformally flat). We begin with the (local) cohomogeneity two K¨ahler metrics, where Pm (t) = Pc (t). (i) Pc (t) = (t − r0 )(t − r1 )(t − r2 )(t − r3 ), where r0 , r1 , r2 , r3 are distinct with r0 + r1 + r2 + r3 = 0. This is Bryant’s Case 4, and corresponds to ip0 0 0 T0 (ip0 , ip1 , ip2 ) = 0 ip1 0 , 0 0 ip2 with p0 = r0 + r1 , p1 = −(r0 + r2 ), p2 = −(r0 + r3 ) (and so pi = ±pj for i, j distinct). The exceptional orbits arise when one of the weights vanish. (ii) Pc (t) = (t − r1 )(t − r2 )(t − r − iλ)(t − r + iλ), where r1 , r2 are distinct with r1 + r2 + 2r = 0. This is Bryant’s Case 1 and corresponds to ip λ 0 T0 (λ, ip, iq) = λ ip 0 0 0 iq with p = r + r1 , q = −2r = r1 + r2 (and so p = 0). The exceptional orbits arise when q vanishes. (iii) Pc (t) = (t −r1 )(t −r2 )(t −r)2 , where r1 , r2 and r are distinct with r1 +r2 +2r = 0. This is Bryant’s Case 3 and corresponds to ip 0 0 i i 0 T1 (1, ip, iq) = 0 ip 0 + −i −i 0 , 0 0 iq 0 0 0 with p = r + r1 , q = −2r = r1 + r2 (and so p = 0 and p = ±q). The exceptional orbits arise when q = 0. (iv) Pc (t) = (t − r1 )(t − r)3 , where r1 and r are distinct with r1 + 3r = 0. This is Bryant’s Case 2 and corresponds to 0 0 −i T2 (1, ip) = ip I3 + 0 0 i , i i 0 with p = r + r1 (so that p = 0). We finally consider the cohomogeneity one and homogeneous K¨ahler metrics. (i) Pc (t) = (t − r0 )2 (t − r1 )(t − r2 ) and Pm (t) = (t − r0 )(t − r1 )(t − r2 ), where r0 , r1 , r2 are distinct with 2r0 + r1 + r2 = 0. 
These metrics have cohomogeneity one under $U(2)$ or $U(1,1)$ according to the signature of the Hermitian metric on the repeated eigenspace, and correspond to $T_0(ip, \pm ip, iq)$ or $T_0(iq, ip, \pm iq)$ with $p \neq \pm q$. When $q = 0$ we have an exceptional orbit. Further degenerations give homogeneous metrics:
• $P_c(t) = (t-r)^3(t+3r)$ and $P_m(t) = (t-r)(t+3r)$ with $r \neq 0$ corresponds to $T_0(ip, ip, ip)$ and the Bergman metric;
• $P_c(t) = (t-r)^2(t+r)^2$ and $P_m(t) = (t-r)(t+r)$ with $r \neq 0$ is exceptional: the Kähler metric is the product metric on $S^2 \times \mathcal{H}^2$ and the SDE quotient (by $T_0(0, 0, ip)$ or $T_0(ip, 0, 0)$) is $\mathcal{H}^4$.

(ii) $P_c(t) = (t+r)^2(t-r-i\lambda)(t-r+i\lambda)$ and $P_m(t) = (t+r)(t-r-i\lambda)(t-r+i\lambda)$. These have cohomogeneity one under $U(2)$ and correspond to the Pedersen metrics $T_0(\lambda, 0, ir)$, apart from exceptional orbits when $r = 0$.

(iii) $P_c(t) = (t+r)^2(t-r)^2$ or $(t+r)(t-r)^3$ and $P_m(t) = (t+r)(t-r)^2$ with $r \neq 0$, correspond to the height one quotients by $T_1(1, 0, iq)$ and $T_1(1, ip, \pm ip)$, which are cohomogeneity one metrics. A further degeneration gives an exceptional orbit: $P_c(t) = t^4$ and $P_m(t) = t^2$. The Kähler metric in this case is flat, while the SDE quotient by $T_1(1, 0, 0)$ is $\mathcal{H}^4$.

(iv) $P_c(t) = t^4$ and $P_m(t) = t^3$ corresponds to the height two quotient $T_2(1, 0)$. This is an exceptional orbit: the self-dual Kähler metric has cohomogeneity one, but the SDE quotient is $\mathcal{H}^4$.

Acknowledgements. The first author thanks the École Polytechnique, Palaiseau and the Università di Roma “La Sapienza” for hospitality and support. The third author would like to thank the Università di Roma “La Sapienza”, I.N.d.A.M, M.P.I-Bonn, and IHES as parts of this paper were written during his visits there. The fourth named author would like to thank University of New Mexico for hospitality and support. The authors are grateful to Paul Gauduchon, Michael Singer and Pavel Winternitz for invaluable discussions.
References

[AG02] Apostolov, V., Gauduchon, P.: Self-dual Einstein Hermitian four manifolds. Ann. Sc. Norm. Sup. Pisa I, 203–246 (2002)
[BC77] Burgoyne, N., Cushman, R.: Conjugacy classes in linear groups. J. Algebra 44(2), 339–362 (1977)
[BD00] Bielawski, R., Dancer, A.S.: The geometry and topology of toric hyperkähler manifolds. Comm. Anal. Geom. 8, 727–760 (2000)
[BGMR98] Boyer, C.P., Galicki, K., Mann, B.M., Rees, E.G.: Compact 3-Sasakian 7-manifolds with arbitrary second Betti number. Invent. Math. 131(2), 321–344 (1998)
[Bie99] Bielawski, R.: Complete hyper-Kähler 4n-manifolds with a local tri-Hamiltonian $\mathbb{R}^n$-action. Math. Ann. 314(3), 505–528 (1999)
[Biq99] Biquard, O.: Einstein deformations of hyperbolic metrics. In: Surveys in differential geometry: essays on Einstein manifolds, Surv. Differ. Geom., VI, Boston, MA: Int. Press, 1999, pp. 235–246
[Biq00] Biquard, O.: Métriques d’Einstein asymptotiquement symétriques. Astérisque (265), vi+109 (2000)
[Biq02] Biquard, O.: Métriques autoduales sur la boule. Invent. Math. 148(3), 545–607 (2002)
[Bry01] Bryant, R.L.: Bochner-Kähler metrics. J. Amer. Math. Soc. 14(3), 623–715 (2001)
[CP02] Calderbank, D.M.J., Pedersen, H.: Selfdual Einstein metrics with torus symmetry. J. Differ. Geom. 60(3), 485–521 (2002)
[CS03] Calderbank, D.M.J., Singer, M.A.: Einstein metrics and complex singularities. Invent. Math. 156(2), 405–443 (2004)
[Der81] Derdziński, A.: Exemples de métriques de Kähler et d’Einstein autoduales sur le plan complexe. In: Géométrie riemannienne en dimension 4, CEDIC, 1981
[Der83] Derdziński, A.: Self-dual Kähler manifolds and Einstein manifolds of dimension four. Compositio Math. 49(3), 405–433 (1983)
[Gal87a] Galicki, K.: A generalization of the momentum mapping construction for quaternionic Kähler manifolds. Commun. Math. Phys. 108(1), 117–138 (1987)
[Gal87b] Galicki, K.: New matter couplings in N=2 supergravity. Nucl. Phys. B 289(2), 573–588 (1987)
[Gal91] Galicki, K.: Multi-centre metrics with negative cosmological constant. Classical Quantum Gravity 8(8), 1529–1543 (1991)
[GL88] Galicki, K., Lawson, H.B. Jr.: Quaternionic reduction and quaternionic orbifolds. Math. Ann. 282(1), 1–21 (1988)
[Hit95] Hitchin, N.J.: Twistor spaces, Einstein metrics and isomonodromic deformations. J. Differ. Geom. 42(1), 30–112 (1995)
[LeB82] LeBrun, C.R.: H-space with a cosmological constant. Proc. Roy. Soc. London Ser. A 380(1778), 171–185 (1982)
[LeB88] LeBrun, C.R.: Counter-examples to the generalized positive action conjecture. Commun. Math. Phys. 118(4), 591–596 (1988)
[LeB91] LeBrun, C.R.: On complete quaternionic-Kähler manifolds. Duke Math. J. 63(3), 723–743 (1991)
[Ped86] Pedersen, H.: Einstein metrics, spinning top motions and monopoles. Math. Ann. 274(1), 35–59 (1986)
[Swa91] Swann, A.: Hyper-Kähler and quaternionic Kähler geometry. Math. Ann. 289(3), 421–450 (1991)
Communicated by G.W. Gibbons
Commun. Math. Phys. 253, 371–384 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1134-3
Communications in
Mathematical Physics
Spectral Gaps for Periodic Schrödinger Operators with Strong Magnetic Fields

Yuri A. Kordyukov

Institute of Mathematics, Russian Academy of Sciences, 112, Chernyshevsky str., 450077 Ufa, Russia. E-mail:
[email protected] Received: 20 November 2003 / Accepted: 3 February 2004 Published online: 14 July 2004 – © Springer-Verlag 2004
Abstract: We consider Schrödinger operators $H^h = (ih\,d + A)^*(ih\,d + A)$ with the periodic magnetic field $B = dA$ on covering spaces of compact manifolds. Using methods of a paper by Kordyukov, Mathai and Shubin [14], we prove that, under some assumptions on $B$, there is an arbitrarily large number of gaps in the spectrum of these operators in the semiclassical limit of the strong magnetic field $h \to 0$.
Introduction

Let $(M, g)$ be a closed Riemannian oriented manifold of dimension $n \geq 2$, $\widetilde{M}$ be its universal cover and $\tilde{g}$ be the lift of $g$ to $\widetilde{M}$ so that $\tilde{g}$ is a $\Gamma$-invariant Riemannian metric on $\widetilde{M}$, where $\Gamma$ denotes the fundamental group of $M$ acting on $\widetilde{M}$ by the deck transformations. Let $B$ be a real-valued $\Gamma$-invariant closed 2-form on $\widetilde{M}$. We assume that $B$ is exact. Choose a real-valued 1-form $A$ on $\widetilde{M}$ such that $dA = B$. Physically we can think of $A$ as the electromagnetic vector potential for a magnetic field $B$.

We consider the magnetic Schrödinger operator $H^h$ given by

$$H^h = (ih\,d + A)^*(ih\,d + A),$$

acting on the Hilbert space $\mathcal{H} = L^2(\widetilde{M})$. Here $h > 0$ is a semiclassical parameter, which is assumed to be small.

In local coordinates $X = (X_1, \ldots, X_n)$, we write the 1-form $A$ as

$$A = A_1(X)\,dX_1 + \ldots + A_n(X)\,dX_n,$$
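the matrix of the Riemannian metric $\tilde{g}$ as $g(X) = (g_{jl}(X))_{1\leq j,l\leq n}$ and its inverse as $(g^{jl}(X))_{1\leq j,l\leq n}$. If $|g(X)| = \det(g(X))$, then the Schrödinger operator $H^h$ is given by

$$H^h = \frac{1}{\sqrt{|g(X)|}} \sum_{1\leq j,l\leq n} \left(ih\frac{\partial}{\partial X_j} + A_j(X)\right) \left[\sqrt{|g(X)|}\, g^{jl}(X) \left(ih\frac{\partial}{\partial X_l} + A_l(X)\right)\right].$$

For any $x \in \widetilde{M}$ denote by $B(x)$ the anti-symmetric linear operator on the tangent space $T_x\widetilde{M}$ associated with the 2-form $B$:

$$\tilde{g}_x(B(x)u, v) = B_x(u, v), \quad u, v \in T_x\widetilde{M}.$$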
The trace-norm $|B(x)|$ of $B(x)$ is given by the formula

$$|B(x)| = [\operatorname{Tr}(B^*(x) \cdot B(x))]^{1/2}.$$

We will assume that there exists an integer $k > 0$ such that, if $B(x_0) = 0$, then

$$C_1^{-1} d(x, x_0)^k \leq |B(x)| \leq C_1 d(x, x_0)^k$$

in some neighborhood of $x_0$ (here $d$ denotes the geodesic distance on $\widetilde{M}$). We assume that there exists at least one zero of $B$.

Theorem 1. Under the current assumptions, there exists an increasing sequence $\{\lambda_m, m \in \mathbb{N}\}$, satisfying $\lambda_m \to \infty$ as $m \to \infty$, such that for any $a$ and $b$, satisfying $\lambda_m < a < b < \lambda_{m+1}$ with some $m$, the interval $[a h^{\frac{2k+2}{k+2}}, b h^{\frac{2k+2}{k+2}}]$ does not meet the spectrum of $H^h$ for any $h > 0$ small enough. It follows that there exists an arbitrarily large number of gaps in the spectrum of $H^h$ provided the coupling constant $h$ is sufficiently small.

Here the sequence $\{\lambda_m\}$ appears as the set of eigenvalues of a model operator $K^h$ associated to $H^h$. This operator is defined as a direct sum of principal parts of $H^h$ near the zeroes of $B$ in a fundamental domain (see Sect. 2 for a precise definition). It is a differential operator, which acts on the Hilbert space $\mathcal{H}_K = L^2(\mathbb{R}^n; \mathbb{C}^N)$ and has discrete spectrum (here $n$ is the dimension of $M$ and $N$ denotes the number of zeroes of $B$ that lie in a fundamental domain). Using a simple scaling and gauge invariance, it can be shown that the operator $K^h$ is unitarily equivalent to the operator $h^{\frac{2k+2}{k+2}} K^1$. Therefore the operator $h^{-\frac{2k+2}{k+2}} K^h$ has discrete spectrum independent of $h$. This fact explains the appearance of a scaling factor $h^{\frac{2k+2}{k+2}}$ in Theorem 1.

There exist a few examples of periodic partial differential operators of the second order with spectral gaps (see, for instance, [1–3, 10, 11, 13] and a recent survey [12] and references therein). In particular, in [10] Hempel and Herbst studied magnetic Schrödinger operators $H(\lambda a) = (-i\nabla - \lambda a(x))^2$ in $L^2(\mathbb{R}^n)$, where $a \in C^1(\mathbb{R}^n; \mathbb{R}^n)$ is a vector potential and $\lambda \in \mathbb{R}$.
Let $M = \{x \in \mathbb{R}^n : B(x) = 0\}$, where $B = da$ is the magnetic field associated with $a$, and $M_a = \{x \in \mathbb{R}^n : a(x) = 0\}$. They proved that, if $B$ is periodic with respect to the lattice $\mathbb{Z}^n$, the set $M \setminus M_a$ has measure zero, the interior of $M$ is non-empty and $M$ can be represented as
$M = \cup_{j \in \mathbb{Z}^n} M_j$ (up to a set of measure zero) where the $M_j$ are pairwise disjoint compact sets with $M_j = M_0 + j$, then the spectrum of the operator $H(\lambda a)$ has an arbitrarily large number of gaps provided the coupling constant $\lambda$ is sufficiently large. The proof of this result is based on the fact that, as $\lambda \to \infty$, $H(\lambda a)$ converges in the norm resolvent sense to the Dirichlet Laplacian $-\Delta_M$ on the closed set $M$. Since norm resolvent convergence implies convergence of spectra, we immediately obtain that, as $\lambda \to \infty$, the spectrum of $H(\lambda a)$ concentrates around the eigenvalues of $-\Delta_M$ and gaps open up in the spectrum of $H(\lambda a)$.

On the other hand, Hempel and Herbst also proved in [10] that, if $M_a$ has measure zero, then, as $\lambda \to \infty$, $H(\lambda a)$ converges in the strong resolvent sense to the zero operator in $L^2(\mathbb{R}^n)$. So, in this case, their method to produce operators with spectral gaps does not work. In this paper, we consider a particular case when $M_a$ has measure zero. More precisely, Theorem 1 states that if $M_a$ has measure zero and the magnetic field has a regular behaviour near its zeroes, we can still produce examples of magnetic Schrödinger operators $H(\lambda a)$ with an arbitrarily large number of gaps in their spectra.

The proof of Theorem 1 is based on Theorem 2 below. Recall that the magnetic Schrödinger operator $H^h$ commutes with a projective $(\Gamma, \sigma)$-action of the fundamental group $\Gamma$, where $\sigma$ is the multiplier or $U(1)$-valued 2-cocycle on $\Gamma$ defining this projective action. Consider the reduced twisted group $C^*$-algebra $C_r^*(\Gamma, \bar\sigma)$ of the group $\Gamma$. If $\mathcal{H}$ is a Hilbert space, then let $\mathcal{K}(\mathcal{H})$ denote the algebra of compact operators in $\mathcal{H}$, and $\mathcal{K} = \mathcal{K}(\ell^2(\mathbb{N}))$, where $\mathbb{N} = \{1, 2, 3, \ldots\}$. Let $E^h(\lambda) = \chi_{(-\infty,\lambda]}(H^h)$ and $E^0(\lambda) = \chi_{(-\infty,\lambda]}(K^h)$ denote the spectral projections. One can define actions of the $C^*$-algebra $C_r^*(\Gamma, \bar\sigma) \otimes \mathcal{K}$ in the Hilbert spaces $\mathcal{H}$ and $\ell^2(\Gamma) \otimes \mathcal{H}_K$. It can be shown that $E(\lambda)$ and $\mathrm{id} \otimes E^0(\lambda)$ are in $C_r^*(\Gamma, \bar\sigma) \otimes \mathcal{K}$.
Recall that two projections $P$ and $Q$ in a unital $*$-algebra $\mathcal{A}$ are said to be Murray-von Neumann equivalent if there is an element $V \in \mathcal{A}$ such that $P = V^*V$ and $Q = VV^*$.

Theorem 2. Assume that $\lambda \in \mathbb{R}$ does not coincide with $\lambda_k$ for any $k$. There exists a $(\Gamma, \sigma)$-equivariant isometry $U : \mathcal{H} \to \ell^2(\Gamma) \otimes \mathcal{H}_K$ and a constant $h_0 > 0$ such that for all $h \in (0, h_0)$, the spectral projections $U E(h^{\frac{2k+2}{k+2}}\lambda) U^*$ and $\mathrm{id} \otimes E^0(h^{\frac{2k+2}{k+2}}\lambda)$ are Murray-von Neumann equivalent in $C_r^*(\Gamma, \bar\sigma) \otimes \mathcal{K}(\mathcal{H}_K)$.

The proofs of Theorems 1 and 2 are based on abstract operator-theoretic results obtained in [14]. In [14], these results were applied to prove the Murray-von Neumann equivalence of spectral projections of a periodic magnetic Schrödinger operator

$$H_\mu = (i\,d + A)^*(i\,d + A) + \mu^{-2} V(x)$$

on the universal covering $\widetilde{M}$ of a compact manifold $M$ and of the corresponding model operator in the limit of the strong electric field ($\mu \to 0$), where $B = dA$ is a $\Gamma$-invariant closed 2-form on $\widetilde{M}$ and $V \geq 0$ is a $\Gamma$-invariant Morse potential. This led to a new proof of existence of an arbitrarily large number of gaps in the spectrum of the periodic magnetic Schrödinger operators $H_\mu$ as $\mu \to 0$ (see [15] for another proof). It should also be noted that the idea of the model operator was first crystallized in the paper [17].

Other important results, which we use in the construction of the model operator and in the proof of Theorem 2, are connected with the study of Schrödinger operators with magnetic wells and were obtained by Helffer and Mohamed (=Morame) in [5] (see also [6–8] for further developments).
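The scaling exponent $\frac{2k+2}{k+2}$ appearing in Theorems 1 and 2 can be checked by a short computation. The sketch below is only an illustration under a stated assumption: the coefficients $a_l$ of the model 1-form are taken to be homogeneous polynomials of degree $k+1$ (as for the 1-form $A_{1,j}$ of Sect. 2), and the symbol $Y$ for the rescaled variable is introduced here for convenience.

```latex
% Assumption: a_l(tX) = t^{k+1} a_l(X) for t > 0 (homogeneity of degree k+1).
% Substitute X = h^{1/(k+2)} Y, so that \partial/\partial X_l = h^{-1/(k+2)} \partial/\partial Y_l.
\begin{aligned}
ih\frac{\partial}{\partial X_l} + a_l(X)
  &= i h^{1-\frac{1}{k+2}} \frac{\partial}{\partial Y_l}
     + h^{\frac{k+1}{k+2}} a_l(Y) \\
  &= h^{\frac{k+1}{k+2}} \Bigl( i\frac{\partial}{\partial Y_l} + a_l(Y) \Bigr).
\end{aligned}
% The operator is quadratic in these first-order factors, hence it picks up
% the factor ( h^{\frac{k+1}{k+2}} )^2 = h^{\frac{2k+2}{k+2}}.
```

Since the flat metric coefficients are constant, they are unaffected by the rescaling, which is consistent with the unitary equivalence of the model operator at parameter $h$ with $h^{\frac{2k+2}{k+2}}$ times the model operator at $h = 1$.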
The paper is organized as follows. Section 1 contains some background results from [14]. In Sect. 2, we describe a construction of the model operator $K^h$. Section 3 provides some necessary information on magnetic translations and related operator algebras. Finally, in Sect. 4 we give the proofs of the main results.

1. General Results on Equivalence of Projections and Existence of Spectral Gaps

In this section we recall general results on equivalence of projections and existence of spectral gaps proved in [14].

Let $\mathcal{A}$ be a $C^*$-algebra, $\mathcal{H}$ a Hilbert space equipped with a faithful $*$-representation of $\mathcal{A}$, $\pi : \mathcal{A} \to \mathcal{B}(\mathcal{H})$. For simplicity of notation, we will often identify the algebra $\mathcal{A}$ with its image $\pi(\mathcal{A})$.

Consider Hilbert spaces $\mathcal{H}_1$ and $\mathcal{H}_2$ equipped with inner products $(\cdot,\cdot)_1$ and $(\cdot,\cdot)_2$. Assume that there are given unitary operators $V_1 : \mathcal{H}_1 \to \mathcal{H}$ and $V_2 : \mathcal{H}_2 \to \mathcal{H}$. Using the unitary isomorphisms $V_1$ and $V_2$, we get representations $\pi_1$ and $\pi_2$ of $\mathcal{A}$ in $\mathcal{H}_1$ and $\mathcal{H}_2$ respectively, $\pi_l(a) = V_l^{-1} \circ \pi(a) \circ V_l$, $l = 1, 2$, $a \in \mathcal{A}$.

Consider (unbounded) self-adjoint operators $A_1$ in $\mathcal{H}_1$ and $A_2$ in $\mathcal{H}_2$ with the domains $\mathrm{Dom}(A_1)$ and $\mathrm{Dom}(A_2)$ respectively. We will assume that
– the operators $A_1$ and $A_2$ are semi-bounded from below:

$$(A_1 u, u)_1 \geq \lambda_{01} \|u\|_1^2, \quad u \in \mathrm{Dom}(A_1), \tag{1}$$

$$(A_2 u, u)_2 \geq \lambda_{02} \|u\|_2^2, \quad u \in \mathrm{Dom}(A_2), \tag{2}$$

with some $\lambda_{01}, \lambda_{02} \leq 0$;
– for any $t > 0$, the operators $e^{-tA_l}$, $l = 1, 2$, belong to $\pi_l(\mathcal{A})$.

Let $\mathcal{H}_0$ be a Hilbert space, equipped with injective bounded linear maps $i_1 : \mathcal{H}_0 \to \mathcal{H}_1$ and $i_2 : \mathcal{H}_0 \to \mathcal{H}_2$. Assume that there are given bounded linear maps $p_1 : \mathcal{H}_1 \to \mathcal{H}_0$ and $p_2 : \mathcal{H}_2 \to \mathcal{H}_0$ such that $p_1 \circ i_1 = \mathrm{id}_{\mathcal{H}_0}$ and $p_2 \circ i_2 = \mathrm{id}_{\mathcal{H}_0}$.

Consider a self-adjoint bounded operator $J$ in $\mathcal{H}_0$. We assume that
– the operator $V_2 i_2 J p_1 V_1^{-1}$ belongs to the von Neumann algebra $\pi(\mathcal{A})''$;
– $(i_2 J p_1)^* = i_1 J p_2$;
– for any $a \in \mathcal{A}$, the operator $\pi(a) V_2 (i_2 J p_1) V_1^{-1}$ belongs to $\pi(\mathcal{A})$.

Since the operators $i_l : \mathcal{H}_0 \to \mathcal{H}_l$, $l = 1, 2$, are bounded and have bounded left-inverse operators $p_l$, they are topological monomorphisms, i.e. they have closed image and the maps $i_l : \mathcal{H}_0 \to \operatorname{Im} i_l$ are topological isomorphisms. Therefore, we can assume that the estimate

$$\rho^{-1} \|i_2 J u\|_2 \leq \|i_1 J u\|_1 \leq \rho \|i_2 J u\|_2, \quad u \in \mathcal{H}_0, \tag{3}$$

holds with some $\rho > 1$ (depending on $J$).

Define the bounded operators $J_l$ in $\mathcal{H}_l$, $l = 1, 2$, by the formula $J_l = i_l J p_l$. We assume that
– the operator $J_l$, $l = 1, 2$, maps the domain of $A_l$ to itself;
– $J_l$ is self-adjoint, and $0 \leq J_l \leq \mathrm{id}_{\mathcal{H}_l}$, $l = 1, 2$;
– for $u \in \mathcal{H}_0$, $i_1 J u \in \mathrm{Dom}(A_1)$ iff $i_2 J u \in \mathrm{Dom}(A_2)$.
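Denote $D = \{u \in \mathcal{H}_0 : i_1 J u \in \mathrm{Dom}(A_1)\} = \{u \in \mathcal{H}_0 : i_2 J u \in \mathrm{Dom}(A_2)\}$.

Introduce a self-adjoint positive bounded linear operator $J_l'$ in $\mathcal{H}_l$ by the formula $J_l^2 + J_l'^2 = \mathrm{id}_{\mathcal{H}_l}$. We assume that
– the operator $J_l'$, $l = 1, 2$, maps the domain of $A_l$ to itself;
– the operators $[J_l, [J_l, A_l]]$ and $[J_l', [J_l', A_l]]$ extend to bounded operators in $\mathcal{H}_l$, and

$$\max\bigl(\|[J_l, [J_l, A_l]]\|_l, \|[J_l', [J_l', A_l]]\|_l\bigr) \leq \gamma_l, \quad l = 1, 2. \tag{4}$$

Finally, we assume that

$$(A_l J_l' u, J_l' u)_l \geq \alpha_l \|J_l' u\|_l^2, \quad u \in \mathrm{Dom}(A_l), \quad l = 1, 2, \tag{5}$$

for some $\alpha_l > 0$, and

$$(A_2 i_2 J u, i_2 J u)_2 \leq \beta_1 (A_1 i_1 J u, i_1 J u)_1 + \varepsilon_1 \|i_1 J u\|_1^2, \quad u \in D, \tag{6}$$

$$(A_1 i_1 J u, i_1 J u)_1 \leq \beta_2 (A_2 i_2 J u, i_2 J u)_2 + \varepsilon_2 \|i_2 J u\|_2^2, \quad u \in D, \tag{7}$$

for some $\beta_1, \beta_2 \geq 1$ and $\varepsilon_1, \varepsilon_2 > 0$.

Denote by $E_l(\lambda)$, $l = 1, 2$, the spectral projection of the operator $A_l$, corresponding to the semi-axis $(-\infty, \lambda]$. We assume that there exists a faithful, normal, semi-finite trace $\tau$ on $\pi(\mathcal{A})''$ such that, for any $t > 0$, the operators $V_l e^{-tA_l} V_l^{-1}$, $l = 1, 2$, belong to $\pi(\mathcal{A})$ and have finite trace. By standard arguments, it follows that $V_l E_l(\lambda) V_l^{-1} \in \pi(\mathcal{A})''$, and $\tau(V_l E_l(\lambda) V_l^{-1}) < \infty$ for any $\lambda$, $l = 1, 2$.

Theorem 3. Under the current assumptions, let $b_1 > a_1$ and

$$a_2 = \rho\left[\beta_1\left(a_1 + \gamma_1 + \frac{(a_1 + \gamma_1 - \lambda_{01})^2}{\alpha_1 - a_1 - \gamma_1}\right) + \varepsilon_1\right], \tag{8}$$

$$b_2 = \frac{\beta_2^{-1}(b_1\rho^{-1} - \varepsilon_2)(\alpha_2 - \gamma_2) - \alpha_2\gamma_2 + 2\lambda_{02}\gamma_2 - \lambda_{02}^2}{\alpha_2 - 2\lambda_{02} + \beta_2^{-1}(b_1\rho^{-1} - \varepsilon_2)}. \tag{9}$$

Suppose that $\alpha_1 > a_1 + \gamma_1$, $\alpha_2 > b_2 + \gamma_2$ and $b_2 > a_2$. If the interval $(a_1, b_1)$ does not intersect with the spectrum of $A_1$, then:
(1) the interval $(a_2, b_2)$ does not intersect with the spectrum of $A_2$;
(2) for any $\lambda_1 \in (a_1, b_1)$ and $\lambda_2 \in (a_2, b_2)$, the projections $V_1 E_1(\lambda_1) V_1^{-1}$ and $V_2 E_2(\lambda_2) V_2^{-1}$ belong to $\mathcal{A}$ and are Murray-von Neumann equivalent in $\mathcal{A}$.

Remark 1. Since $\rho > 1$, $\beta_1 \geq 1$, $\gamma_1 > 0$ and $\varepsilon_1 > 0$, we clearly have $a_2 > a_1$. The formula (9) is equivalent to the formula

$$b_1 = \rho\left[\beta_2\left(b_2 + \gamma_2 + \frac{(b_2 + \gamma_2 - \lambda_{02})^2}{\alpha_2 - b_2 - \gamma_2}\right) + \varepsilon_2\right],$$

which is obtained from (8), if we replace $\alpha_1, \beta_1, \gamma_1, \varepsilon_1, \lambda_{01}$ by $\alpha_2, \beta_2, \gamma_2, \varepsilon_2, \lambda_{02}$ respectively and $a_1$ and $a_2$ by $b_2$ and $b_1$ respectively. In particular, this implies that $b_1 > b_2$.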
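As a numerical sanity check of formulas (8) and (9), the following Python sketch evaluates the shifted gap edges $a_2$ and $b_2$ and verifies that (9) inverts the expression for $b_1$ given in Remark 1. The function names and the sample constants are illustrative assumptions and are not taken from [14].

```python
def shifted_edge(x, rho, alpha, beta, gamma, eps, lam0):
    """Right-hand side shared by formula (8) and by the formula of Remark 1."""
    y = x + gamma
    if alpha <= y:
        raise ValueError("Theorem 3 requires alpha > x + gamma")
    return rho * (beta * (y + (y - lam0) ** 2 / (alpha - y)) + eps)


def a2_from_a1(a1, rho, alpha1, beta1, gamma1, eps1, lam01):
    """Formula (8): lower edge a2 of the gap of A2."""
    return shifted_edge(a1, rho, alpha1, beta1, gamma1, eps1, lam01)


def b2_from_b1(b1, rho, alpha2, beta2, gamma2, eps2, lam02):
    """Formula (9): upper edge b2 of the gap of A2."""
    c = (b1 / rho - eps2) / beta2
    numerator = (c * (alpha2 - gamma2) - alpha2 * gamma2
                 + 2 * lam02 * gamma2 - lam02 ** 2)
    denominator = alpha2 - 2 * lam02 + c
    return numerator / denominator


# Hypothetical constants (rho, alpha, beta, gamma, eps, lambda_0),
# used for both l = 1 and l = 2 purely for illustration.
SAMPLE = (1.01, 100.0, 1.01, 0.1, 0.01, 0.0)

a1, b1 = 1.0, 2.0
a2 = a2_from_a1(a1, *SAMPLE)
b2 = b2_from_b1(b1, *SAMPLE)
print("a2 =", a2, "b2 =", b2)
```

With these sample values the gap $(a_1, b_1)$ shrinks to a slightly smaller gap $(a_2, b_2)$, in line with the inequalities $a_2 > a_1$ and $b_1 > b_2$ of Remark 1.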
2. The Model Operator

Here we will give a construction of the model operator, using ideas of [5]. We will use the notation of the Introduction.

Choose a fundamental domain $\mathcal{F} \subset \widetilde{M}$ so that there are no zeros of $B$ on the boundary of $\mathcal{F}$. This is equivalent to saying that the translations $\{\gamma\mathcal{F}, \gamma \in \Gamma\}$ cover the set of all zeros of $B$. Let $\{\bar{x}_j \mid j = 1, \ldots, N\}$ denote all the zeros of $B$ in $\mathcal{F}$; $\bar{x}_i \neq \bar{x}_j$ if $i \neq j$.

The model operator $K^h$ associated with $H^h$ is an operator in $L^2(\mathbb{R}^n)^N$ given by

$$K^h = \oplus_{1 \leq j \leq N} K_j^h,$$

where $K_j^h$ is an unbounded self-adjoint differential operator in $L^2(\mathbb{R}^n)$ which corresponds to the zero $\bar{x}_j$.

Let us fix local coordinates $f_j : U(\bar{x}_j) \to \mathbb{R}^n$ on $\widetilde{M}$ defined in a small neighborhood $U(\bar{x}_j)$ of $\bar{x}_j$ for every $j = 1, \ldots, N$. We assume that $f_j(\bar{x}_j) = 0$ and the image $f_j(U(\bar{x}_j))$ is a fixed ball $B = B(0, r) \subset \mathbb{R}^n$ centered at the origin 0.

Write the 2-form $B$ in the local coordinates as

$$B_j(X) = \sum_{1 \leq l < m \leq n} b_{lm}(X)\, dX_l \wedge dX_m, \quad X = (X_1, \ldots, X_n) \in B(0, r).$$

The 1-form $A$ is written in the local coordinates as a 1-form $A_j$ on $B(0, r)$. By [4], there exists a real function $\theta_j \in C^\infty(B(0, r))$ such that

$$|A_j(X) - d\theta_j(X)| \leq C|X|^{k+1}, \quad X \in B(0, r).$$

Write the 1-form $A_j - d\theta_j$ as

$$A_j(X) - d\theta_j(X) = \sum_{l=1}^{n} a_l(X)\, dX_l, \quad X \in B(0, r).$$

Let $A_{1,j}$ be a 1-form on $\mathbb{R}^n$ with polynomial coefficients given by

$$A_{1,j}(X) = \sum_{l=1}^{n} \sum_{|\alpha| = k+1} \frac{X^\alpha}{\alpha!} \frac{\partial^\alpha a_l}{\partial X^\alpha}(0)\, dX_l, \quad X \in \mathbb{R}^n.$$

So we have

$$dA_{1,j}(X) = B_j^0(X), \quad X \in \mathbb{R}^n,$$

where $B_j^0$ is a closed 2-form on $\mathbb{R}^n$ with polynomial coefficients defined by

$$B_j^0(X) = \sum_{1 \leq l < m \leq n} \sum_{|\alpha| = k} \frac{X^\alpha}{\alpha!} \frac{\partial^\alpha b_{lm}}{\partial X^\alpha}(0)\, dX_l \wedge dX_m, \quad X \in \mathbb{R}^n.$$

Take any extension of the function $\theta_j$ to a smooth, compactly supported function in $\mathbb{R}^n$ denoted also by $\theta_j$ and put

$$A_j^0(X) = A_{1,j}(X) + d\theta_j(X), \quad X \in \mathbb{R}^n.$$
Then we still have

$$dA_j^0(X) = B_j^0(X), \quad X \in \mathbb{R}^n,$$

and, moreover,

$$|B_j(X) - B_j^0(X)| \leq C|X|^{k+1}, \quad X \in B(0, r), \tag{10}$$

$$|A_j(X) - A_j^0(X)| \leq C|X|^{k+2}, \quad X \in B(0, r). \tag{11}$$

Then $K_j^h$ is the self-adjoint differential operator with asymptotically polynomial coefficients in $L^2(\mathbb{R}^n)$ given by

$$K_j^h = (ih\,d + A_j^0)^*(ih\,d + A_j^0),$$

where the adjoint is taken with respect to a Hilbert structure in $L^2(\mathbb{R}^n)$ given by the flat Riemannian metric $(g_{lm}(0))$ in $\mathbb{R}^n$. If we write $A_j^0$ as

$$A_j^0 = A_{j,1}^0\, dX_1 + \ldots + A_{j,n}^0\, dX_n,$$

then $K_j^h$ is given as

$$K_j^h = \sum_{1 \leq l,m \leq n} g^{lm}(0) \left(ih\frac{\partial}{\partial X_l} + A_{j,l}^0(X)\right) \left(ih\frac{\partial}{\partial X_m} + A_{j,m}^0(X)\right).$$

The operator $K_j^h$ has discrete spectrum (cf., for instance, [9, 4]). By gauge invariance, the operator $K_j^h$ is unitarily equivalent to the Schrödinger operator

$$H_j^h = (ih\,d + A_{1,j})^*(ih\,d + A_{1,j}),$$

associated with the homogeneous 1-form $A_{1,j}$. Using a simple scaling $X \to h^{\frac{1}{k+2}} X$, it can be shown that the operator $H_j^h$ is unitarily equivalent to the operator $h^{\frac{2k+2}{k+2}} H_j^1$. So we conclude that the operator $h^{-\frac{2k+2}{k+2}} K^h$ has discrete spectrum independent of $h$, which is denoted by $\{\lambda_m : m \in \mathbb{N}\}$, $\lambda_1 < \lambda_2 < \lambda_3 < \ldots$ (not taking into account multiplicities). As will be shown in Sect. 4, the sequence $\{\lambda_m : m \in \mathbb{N}\}$ is precisely the sequence which we need for the proof of Theorem 1.

3. Magnetic Translations and Related Operator Algebras

In this section, we collect some necessary facts on magnetic translations and related operator algebras (see, for instance, [15, 14] and references therein for more details).

As above, let $M$ be a compact connected Riemannian manifold, $\Gamma$ be its fundamental group and $p : \widetilde{M} \to M$ be its universal cover. Let $B$ be a closed $\Gamma$-invariant real-valued 2-form on $\widetilde{M}$. Assume that $B$ is exact. So $B = dA$, where $A$ is a 1-form on $\widetilde{M}$. We will assume without loss of generality that $A$ is real-valued.

The Hermitian connection $A$ defines a projective $(\Gamma, \sigma)$-unitary representation on $L^2(\widetilde{M})$, that is, the map $T : \Gamma \to \mathcal{U}(L^2(\widetilde{M}))$, $\gamma \mapsto T_\gamma$, where for any Hilbert space $\mathcal{H}$ we denote by $\mathcal{U}(\mathcal{H})$ the group of all unitary operators in $\mathcal{H}$, satisfying

$$T_e = \mathrm{id}, \quad T_{\gamma_1} T_{\gamma_2} = \sigma(\gamma_1, \gamma_2) T_{\gamma_1\gamma_2}, \quad \gamma_1, \gamma_2 \in \Gamma.$$

Here $\sigma$ is a multiplier on $\Gamma$, i.e. $\sigma : \Gamma \times \Gamma \to U(1)$ satisfies
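– $\sigma(\gamma, e) = \sigma(e, \gamma) = 1$, $\gamma \in \Gamma$;
– $\sigma(\gamma_1, \gamma_2)\sigma(\gamma_1\gamma_2, \gamma_3) = \sigma(\gamma_1, \gamma_2\gamma_3)\sigma(\gamma_2, \gamma_3)$, $\gamma_1, \gamma_2, \gamma_3 \in \Gamma$ (the cocycle relation).

In other words one says that the map $\gamma \mapsto T_\gamma$ defines a $(\Gamma, \sigma)$-action in $\mathcal{H}$. The operators $T_\gamma$ are also called magnetic translations.

Denote by $\ell^2(\Gamma)$ the standard Hilbert space of complex-valued $L^2$-functions on the discrete group $\Gamma$. For any $\gamma \in \Gamma$, define a bounded operator $T_\gamma^L$ in $\ell^2(\Gamma)$ by

$$T_\gamma^L f(\gamma') = f(\gamma^{-1}\gamma')\,\bar\sigma(\gamma, \gamma^{-1}\gamma'), \quad \gamma' \in \Gamma, \quad f \in \ell^2(\Gamma).$$

It is easy to see that

$$T_e^L = \mathrm{id}, \quad T_{\gamma_1}^L T_{\gamma_2}^L = \bar\sigma(\gamma_1, \gamma_2) T_{\gamma_1\gamma_2}^L, \quad \gamma_1, \gamma_2 \in \Gamma.$$

Also

$$(T_\gamma^L)^* = \sigma(\gamma, \gamma^{-1}) T_{\gamma^{-1}}^L.$$

This means that $T^L$ is a left $(\Gamma, \bar\sigma)$-action on $\ell^2(\Gamma)$ (or, equivalently, a $(\Gamma, \bar\sigma)$-unitary representation in $\ell^2(\Gamma)$).

Define a twisted group algebra $\mathbb{C}(\Gamma, \bar\sigma)$ which consists of complex valued functions with finite support on $\Gamma$, with the twisted convolution operation

$$(f * g)(\gamma) = \sum_{\gamma_1, \gamma_2 : \gamma_1\gamma_2 = \gamma} f(\gamma_1) g(\gamma_2)\,\bar\sigma(\gamma_1, \gamma_2),$$

and with the involution

$$f^*(\gamma) = \sigma(\gamma, \gamma^{-1})\,\overline{f(\gamma^{-1})}.$$

Associativity of the multiplication is equivalent to the cocycle condition. The basis of $\mathbb{C}(\Gamma, \bar\sigma)$ as a vector space is formed by $\delta$-functions $\{\delta_\gamma\}_{\gamma \in \Gamma}$, $\delta_\gamma(\gamma') = 1$ if $\gamma = \gamma'$ and 0 otherwise. We have

$$\delta_{\gamma_1} * \delta_{\gamma_2} = \bar\sigma(\gamma_1, \gamma_2)\,\delta_{\gamma_1\gamma_2}.$$

Note also that the $\delta$-functions $\{\delta_\gamma\}_{\gamma \in \Gamma}$ form an orthonormal basis in $\ell^2(\Gamma)$. It is easy to check that

$$T_\gamma^L \delta_{\gamma'} = \delta_\gamma * \delta_{\gamma'} = \bar\sigma(\gamma, \gamma')\,\delta_{\gamma\gamma'}.$$

It is clear that the correspondence $f \in \mathbb{C}(\Gamma, \bar\sigma) \mapsto T^L(f) \in \mathcal{B}(\ell^2(\Gamma))$, where $T^L(f)u = f * u$, $u \in \ell^2(\Gamma)$, defines a $*$-representation of the twisted group algebra $\mathbb{C}(\Gamma, \bar\sigma)$ in $\ell^2(\Gamma)$. The weak closure of the image of $\mathbb{C}(\Gamma, \bar\sigma)$ in this representation coincides with the (left) twisted group von Neumann algebra $\mathcal{A}^L(\Gamma, \bar\sigma)$. The corresponding norm closure is the so-called reduced twisted group $C^*$-algebra which is denoted $C_r^*(\Gamma, \bar\sigma)$.

The von Neumann algebra $\mathcal{A}^L(\Gamma, \bar\sigma)$ can be described in terms of the matrix elements. For any $A \in \mathcal{B}(\ell^2(\Gamma))$ denote $A_{x,y} = (A\delta_y, \delta_x)$, $x, y \in \Gamma$ (which is a matrix element of $A$). Then repeating standard arguments (given in a similar situation, e.g. in [16]) we can prove that for any $A \in \mathcal{B}(\ell^2(\Gamma))$ the inclusion $A \in \mathcal{A}^L(\Gamma, \bar\sigma)$ is equivalent to the relations

$$A_{x\gamma, y\gamma} = \bar\sigma(x, \gamma)\,\sigma(y, \gamma)\,A_{x,y}, \quad x, y, \gamma \in \Gamma.$$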
In particular, for any $A \in \mathcal{A}^L(\Gamma, \bar\sigma)$, we have

$$A_{x\gamma, x\gamma} = A_{x,x}, \quad x, \gamma \in \Gamma.$$
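The twisted convolution and its relation to the cocycle condition can be made concrete in a simple case. The Python sketch below assumes $\Gamma = \mathbb{Z}^2$ with the multiplier $\sigma(a, b) = \exp(i\theta a_1 b_2)$; this choice of group and multiplier, the value of `THETA`, and all function names are illustrative assumptions rather than part of the text. Finitely supported functions on $\Gamma$ are stored as dictionaries, and the involution is implemented with a complex conjugate on $f$, as is standard for a $*$-algebra.

```python
import cmath
from itertools import product

THETA = 0.7  # hypothetical flux parameter, chosen only for illustration


def sigma(a, b):
    """Multiplier on Z^2: sigma(a, b) = exp(i * THETA * a1 * b2)."""
    return cmath.exp(1j * THETA * a[0] * b[1])


def add(a, b):
    """Group law of Z^2."""
    return (a[0] + b[0], a[1] + b[1])


def neg(a):
    """Inverse element in Z^2."""
    return (-a[0], -a[1])


def twisted_conv(f, g):
    """(f * g)(c) = sum over a + b = c of f(a) g(b) conj(sigma(a, b))."""
    out = {}
    for (a, fa), (b, gb) in product(f.items(), g.items()):
        c = add(a, b)
        out[c] = out.get(c, 0) + fa * gb * sigma(a, b).conjugate()
    return out


def involution(f):
    """f^*(c) = sigma(c, -c) * conj(f(-c))."""
    return {neg(a): sigma(neg(a), a) * fa.conjugate() for a, fa in f.items()}


def delta(a):
    """Delta-function supported at the group element a."""
    return {a: 1.0 + 0j}


def trace(f):
    """Value at the identity, i.e. (T^L(f) delta_e, delta_e)."""
    return f.get((0, 0), 0)


def close(f, g, tol=1e-12):
    """Compare two finitely supported functions up to rounding."""
    return all(abs(f.get(k, 0) - g.get(k, 0)) < tol for k in set(f) | set(g))
```

For example, `twisted_conv(delta(a), delta(b))` returns the single entry `conj(sigma(a, b))` at `add(a, b)`, mirroring the identity $\delta_{\gamma_1} * \delta_{\gamma_2} = \bar\sigma(\gamma_1, \gamma_2)\delta_{\gamma_1\gamma_2}$, and `trace` mirrors the value $(A\delta_e, \delta_e)$ for $A = T^L(f)$.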
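A finite von Neumann trace $\operatorname{tr}_{\Gamma,\bar\sigma} : \mathcal{A}^L(\Gamma, \bar\sigma) \to \mathbb{C}$ is defined by the formula

$$\operatorname{tr}_{\Gamma,\bar\sigma} A = (A\delta_e, \delta_e).$$

We can also write $\operatorname{tr}_{\Gamma,\bar\sigma} A = A_{\gamma,\gamma} = (A\delta_\gamma, \delta_\gamma)$ for any $\gamma \in \Gamma$ because the right-hand side does not depend on $\gamma$.

4. Proof of Main Results

For the proof of the main theorem, we apply Theorem 3 in the following particular setting. Take the $C^*$-algebra $\mathcal{A}$ to be $C_r^*(\Gamma, \bar\sigma) \otimes \mathcal{K}$. Let $\mathcal{H}$ be the Hilbert space $\ell^2(\Gamma) \otimes \ell^2(\mathbb{N})$. Put $\mathcal{H}_1 = \ell^2(\Gamma) \otimes L^2(\mathbb{R}^n)^N$ and $\mathcal{H}_2 = L^2(\widetilde{M})$.

Choose an arbitrary unitary isomorphism $V_1 : L^2(\mathbb{R}^n)^N \to \ell^2(\mathbb{N})$ and define a unitary operator $\mathbf{V}_1 : \mathcal{H}_1 \to \mathcal{H}$ as $\mathbf{V}_1 = \mathrm{id} \otimes V_1$.

As in [14], define a $(\Gamma, \sigma)$-equivariant isometry $\mathbf{U} : L^2(\widetilde{M}) \cong \ell^2(\Gamma) \otimes L^2(\mathcal{F})$ by the formula

$$\mathbf{U}(\phi) = \sum_{\gamma \in \Gamma} \delta_\gamma \otimes i^*(T_\gamma \phi), \quad \phi \in L^2(\widetilde{M}),$$

where $i : \mathcal{F} \to \widetilde{M}$ denotes the inclusion map. Choose an arbitrary unitary isomorphism $V_2 : L^2(\mathcal{F}) \to \ell^2(\mathbb{N})$. Then a unitary operator $\mathbf{V}_2 : \mathcal{H}_2 \to \mathcal{H}$ is defined as $\mathbf{V}_2 = (\mathrm{id} \otimes V_2) \circ \mathbf{U}$. The operators $\mathbf{V}_1$ and $\mathbf{V}_2$ play the role of the unitary operators $V_1$ and $V_2$ of Theorem 3.

Let $\pi$ be the representation of the algebra $\mathcal{A}$ in $\mathcal{H}$ given by the tensor product of the representation $T^L$ of $C_r^*(\Gamma, \bar\sigma)$ on $\ell^2(\Gamma)$ and the standard representation of $\mathcal{K}$ in $\ell^2(\mathbb{N})$. So we have $\pi(C_r^*(\Gamma, \bar\sigma) \otimes \mathcal{K}) \subset \mathcal{A}^L(\Gamma, \bar\sigma) \otimes \mathcal{B}(\ell^2(\mathbb{N}))$ and $\pi(C_r^*(\Gamma, \bar\sigma) \otimes \mathcal{K})'' \cong \mathcal{A}^L(\Gamma, \bar\sigma) \otimes \mathcal{B}(\ell^2(\mathbb{N}))$. Using the unitary isomorphisms $\mathbf{V}_1$ and $\mathbf{V}_2$, we get representations $\pi_1$ and $\pi_2$ of $\mathcal{A}$ in $\mathcal{H}_1$ and $\mathcal{H}_2$ respectively, $\pi_l(a) = \mathbf{V}_l^{-1} \circ \pi(a) \circ \mathbf{V}_l$, $l = 1, 2$, $a \in \mathcal{A}$.

Define a trace $\tau$ on $\mathcal{A}^L(\Gamma, \bar\sigma) \otimes \mathcal{B}(\ell^2(\mathbb{N}))$ as the tensor product of the finite von Neumann trace $\operatorname{tr}_{\Gamma,\bar\sigma}$ on $\mathcal{A}^L(\Gamma, \bar\sigma)$ and the standard trace on $\mathcal{B}(\ell^2(\mathbb{N}))$.

Consider self-adjoint, semi-bounded from below operators $A_1$ in $\mathcal{H}_1$ and $A_2$ in $\mathcal{H}_2$:

$$A_1 = \mathrm{id} \otimes h^{-\frac{2k+2}{k+2}} K^h, \quad A_2 = h^{-\frac{2k+2}{k+2}} H^h.$$

Clearly, we have

$$e^{-tA_1} = \mathrm{id} \otimes e^{-t h^{-\frac{2k+2}{k+2}} K^h} \in \pi_1(\mathcal{A}) \cong C_r^*(\Gamma, \bar\sigma) \otimes \mathcal{K}(\mathcal{H}_K)$$

with $\tau(e^{-tA_1}) < \infty$ for any $t > 0$. As shown in [14], for any $t > 0$, the operator $e^{-tA_2}$ belongs to $\pi_2(\mathcal{A})$ and $\tau(e^{-tA_2}) < \infty$. Remark that, in the notation of Theorem 2,

$$E_1(\lambda) = \mathrm{id} \otimes E^0\bigl(h^{\frac{2k+2}{k+2}}\lambda\bigr), \quad E_2(\lambda) = E\bigl(h^{\frac{2k+2}{k+2}}\lambda\bigr).$$

We will use the notation of Sect. 2. Let

$$\mathcal{H}_0 = \ell^2(\Gamma) \otimes \Bigl(\oplus_{j=1}^N L^2(U(\bar{x}_j))\Bigr).$$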
An inclusion $i_1 : \mathcal{H}_0 \to \mathcal{H}_1$ is defined as $i_1 = \mathrm{id} \otimes j_1$, where $j_1$ is the inclusion

$$\oplus_{j=1}^N L^2(U(\bar{x}_j)) \cong L^2(B(0, r))^N \to L^2(\mathbb{R}^n)^N$$

given by the chosen local coordinates. An inclusion $i_2 : \mathcal{H}_0 \to \mathcal{H}_2$ is defined as $i_2 = \mathbf{U}^* \circ (\mathrm{id} \otimes j_2)$, where $j_2$ is the natural inclusion

$$\oplus_{j=1}^N L^2(U(\bar{x}_j)) \to L^2(\mathcal{F}).$$

The operator $p_1 : \mathcal{H}_1 \to \mathcal{H}_0$ is defined as $p_1 = \mathrm{id} \otimes r_1$, where $r_1$ is the restriction operator

$$L^2(\mathbb{R}^n)^N \to L^2(B(0, r))^N \cong \oplus_{j=1}^N L^2(U(\bar{x}_j)).$$

The operator $p_2 : \mathcal{H}_2 \to \mathcal{H}_0$ is defined as $p_2 = (\mathrm{id} \otimes r_2) \circ \mathbf{U}$, where $r_2 : L^2(\mathcal{F}) \to \oplus_{j=1}^N L^2(U(\bar{x}_j))$ is the restriction operator.

Fix a function $\phi \in C_c^\infty(\mathbb{R}^n)$ such that $0 \leq \phi \leq 1$, $\phi(x) = 1$ if $|x| \leq 1$, $\phi(x) = 0$ if $|x| \geq 2$, and $\phi' = (1 - \phi^2)^{1/2} \in C^\infty(\mathbb{R}^n)$. Fix a number $\kappa > 0$, which we shall choose later. For any $h > 0$ define $\phi^{(h)}(x) = \phi(h^{-\kappa}x)$. For any $h > 0$ small enough, let $\phi_j = \phi^{(h)} \in C_c^\infty(U(\bar{x}_j))$ in the fixed coordinates near $\bar{x}_j$. Denote also $\phi_{j,\gamma} = (\gamma^{-1})^*\phi_j$. (This function is supported in the neighborhood $U(\gamma\bar{x}_j) = \gamma(U(\bar{x}_j))$ of $\gamma\bar{x}_j$.) We will always take $h \in (0, h_0)$, where $h_0$ is sufficiently small, so in particular the supports of all functions $\phi_{j,\gamma}$ are disjoint.

Let $\Phi \in C^\infty(\cup_{j=1}^N U(\bar{x}_j))$ be equal to $\phi_j$ on $U(\bar{x}_j)$, $j = 1, 2, \ldots, N$. Consider a $(\Gamma, \sigma)$-equivariant, self-adjoint, bounded operator $J$ in $\mathcal{H}_0$ defined as $J = \mathrm{id} \otimes \Phi$, where $\Phi$ denotes the multiplication operator by the function $\Phi$ in the space $\oplus_{j=1}^N L^2(U(\bar{x}_j))$.

It is clear that $\mathbf{V}_2 i_2 J p_1 \mathbf{V}_1^{-1} = \mathrm{id} \otimes V_2 j_2 \Phi r_1 V_1^{-1}$, and $j_2 \Phi r_1$ is a bounded operator from $L^2(\mathbb{R}^n)^N$ to $L^2(\mathcal{F})$ given as the composition

$$L^2(\mathbb{R}^n)^N \to L^2(B(0, r))^N \cong \oplus_{j=1}^N L^2(U(\bar{x}_j)) \xrightarrow{\ \Phi\ } \oplus_{j=1}^N L^2(U(\bar{x}_j)) \to L^2(\mathcal{F}).$$

Hence, the operator $\mathbf{V}_2 i_2 J p_1 \mathbf{V}_1^{-1}$ belongs to the von Neumann algebra $\pi(\mathcal{A})'' \cong \mathcal{A}^L(\Gamma, \bar\sigma) \otimes \mathcal{B}(\ell^2(\mathbb{N}))$, and, for any $a \in \mathcal{A}$, the operator $\pi(a)\mathbf{V}_2(i_2 J p_1)\mathbf{V}_1^{-1}$ belongs to $\pi(\mathcal{A})$.

Similarly, we have $i_1 J p_2 = \mathrm{id} \otimes j_1 \Phi r_2$, and $j_1 \Phi r_2$ is a bounded operator from $L^2(\mathcal{F})$ to $L^2(\mathbb{R}^n)^N$ given as the composition

$$L^2(\mathcal{F}) \to \oplus_{j=1}^N L^2(U(\bar{x}_j)) \xrightarrow{\ \Phi\ } \oplus_{j=1}^N L^2(U(\bar{x}_j)) \cong L^2(B(0, r))^N \to L^2(\mathbb{R}^n)^N.$$

So we have $(i_2 J p_1)^* = i_1 J p_2$.

We will use local coordinates near $\bar{x}_j$ such that the Riemannian volume element at the point $\bar{x}_j$ coincides with the Euclidean volume element given by the chosen local coordinates. Then the estimate (3) holds with

$$\rho = 1 + O(h^\kappa). \tag{12}$$
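Denote by the same letters $\phi$ and $\phi'$ the multiplication operators in $L^2(\mathbb{R}^n)$ by the functions $\phi$ and $\phi'$ respectively. Let $\Phi_1$ and $\Phi_1'$ be the bounded operators in $L^2(\mathbb{R}^n)^N \cong L^2(\mathbb{R}^n) \otimes \mathbb{C}^N$ given by $\Phi_1 = \phi \otimes \mathrm{id}_{\mathbb{C}^N}$ and $\Phi_1' = \phi' \otimes \mathrm{id}_{\mathbb{C}^N}$. Then we have $J_1 = \mathrm{id} \otimes \Phi_1$ and $J_1' = \mathrm{id} \otimes \Phi_1'$ in $\ell^2(\Gamma) \otimes L^2(\mathbb{R}^n)^N$.

Let $\Phi_\gamma \in C^\infty(\widetilde{M})$ be equal to $\phi_{j,\gamma}$ on $U(\gamma\bar{x}_j)$, $j = 1, 2, \ldots, N$, and 0 otherwise. Put $\Phi_2 = \sum_{\gamma \in \Gamma} \Phi_\gamma \in C^\infty(\widetilde{M})$. Define a function $\Phi_2' \in C^\infty(\widetilde{M})$, $\Phi_2' \geq 0$, by the equation $(\Phi_2)^2 + (\Phi_2')^2 = 1$ in $C^\infty(\widetilde{M})$. The operators $J_2$ and $J_2'$ are given by the multiplication operators by the functions $\Phi_2$ and $\Phi_2'$ in $L^2(\widetilde{M})$ respectively.

The estimates (4) hold with

$$\gamma_l = O\bigl(h^{-\frac{2k+2}{k+2} + 2 - 2\kappa}\bigr), \quad l = 1, 2. \tag{13}$$

Indeed, for any $j = 1, 2, \ldots, N$, the principal symbol $a_{1,j}^{(2)} \in C^\infty(T^*\mathbb{R}^n)$ of $K_j^h$ is given by

$$a_{1,j}^{(2)}(x, \xi) = h^2 \sum_{i,k=1}^{n} g^{ik}(\bar{x}_j)\xi_i\xi_k, \quad (x, \xi) \in T^*\mathbb{R}^n.$$

So we have

$$[J_1, [J_1, A_1]] = -h^{-\frac{2k+2}{k+2} + 2}\, \mathrm{id} \otimes \Bigl(\oplus_{1 \leq j \leq N}\, a_{1,j}^{(2)}(x, d\phi(x))\Bigr)$$

and

$$[J_1', [J_1', A_1]] = -h^{-\frac{2k+2}{k+2} + 2}\, \mathrm{id} \otimes \Bigl(\oplus_{1 \leq j \leq N}\, a_{1,j}^{(2)}(x, d\phi'(x))\Bigr)$$

in $\ell^2(\Gamma) \otimes L^2(\mathbb{R}^n)^N$. Similarly, the principal symbol $a_2^{(2)} \in C^\infty(T^*\widetilde{M})$ of $H^h$ is given by

$$a_2^{(2)}(x, \xi) = h^2 \sum_{i,k=1}^{n} g^{ik}(x)\xi_i\xi_k, \quad (x, \xi) \in T^*\widetilde{M}.$$

So the operators $[J_2, [J_2, A_2]]$, $[J_2', [J_2', A_2]]$ are the multiplication operators in $L^2(\widetilde{M})$ by the functions $-h^{-\frac{2k+2}{k+2} + 2} a_2^{(2)}(x, d\Phi_2(x))$ and $-h^{-\frac{2k+2}{k+2} + 2} a_2^{(2)}(x, d\Phi_2'(x))$ respectively. Therefore,

$$\gamma_1 = h^{-\frac{2k+2}{k+2} + 2} \max_{j=1,2,\ldots,N} \max\Bigl(\sup_{x \in \mathbb{R}^n} a_{1,j}^{(2)}(x, d\phi(x)),\ \sup_{x \in \mathbb{R}^n} a_{1,j}^{(2)}(x, d\phi'(x))\Bigr) = O\bigl(h^{-\frac{2k+2}{k+2} + 2 - 2\kappa}\bigr),$$

$$\gamma_2 = h^{-\frac{2k+2}{k+2} + 2} \max\Bigl(\sup_{x \in \widetilde{M}} a_2^{(2)}(x, d\Phi_2(x)),\ \sup_{x \in \widetilde{M}} a_2^{(2)}(x, d\Phi_2'(x))\Bigr) = O\bigl(h^{-\frac{2k+2}{k+2} + 2 - 2\kappa}\bigr).$$

The estimates (5) hold with

$$\alpha_l = O\bigl(h^{-\frac{2k+2}{k+2} + k\kappa + 1}\bigr), \quad l = 1, 2. \tag{14}$$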
Indeed, let $q_{0,j}^h$ denote the quadratic hermitian form associated to $K_j^h$,

$$q_{0,j}^h(u) = (K_j^h u, u) = \int_{\mathbb{R}^n} |ih\,du + A_j^0 u|_{g(0)}^2\, dx.$$

Consider the operator $H_j^h = (ih\,d + A_{1,j})^*(ih\,d + A_{1,j})$ and denote by $q_{1,j}^h$ the quadratic hermitian form associated to this operator:

$$q_{1,j}^h(u) = (H_j^h u, u) = \int_{\mathbb{R}^n} |ih\,du + A_{1,j} u|_{g(0)}^2\, dx.$$

By [5, Theorem 4.4], there exists a constant $C_j > 0$ such that for any $h > 0$,

$$h \int_{\mathbb{R}^n} |B_j^0(x)|\, |u(x)|^2\, dx \leq C_j\, q_{1,j}^h(u), \quad u \in C_c^\infty(\mathbb{R}^n). \tag{15}$$

Using gauge invariance and (15), we get

$$q_{0,j}^h(u) = q_{1,j}^h(e^{i\theta_j} u) \geq \frac{h}{C_j} \int_{\mathbb{R}^n} |B_j^0(x)|\, |e^{i\theta_j} u(x)|^2\, dx = \frac{h}{C_j} \int_{\mathbb{R}^n} |B_j^0(x)|\, |u(x)|^2\, dx. \tag{16}$$

Similarly, let $q^h$ be the quadratic hermitian form associated to $H^h$,

$$q^h(u) = (H^h u, u) = \int_{\widetilde{M}} |ih\,du + Au|^2\, d\mu(x),$$

where $d\mu$ denotes the Riemannian volume form on $\widetilde{M}$. By an easy modification of the proof of Theorem 4.5 in [5], one can show that there exists a constant $C_0 > 0$, and for any $\varepsilon \in (0, 1)$, there exists a constant $C_\varepsilon > 0$ such that for any $h \in (0, h_0]$,

$$h \int_{\widetilde{M}} |B(x)|\, |u(x)|^2\, d\mu(x) \leq C_0\, q^h(u) + C_\varepsilon h^{2-\varepsilon} \|u\|^2, \quad u \in C_c^\infty(\widetilde{M}). \tag{17}$$

Assume that $k\kappa < 2$ and take $\varepsilon \in (0, 1)$ so that $k\kappa < 2 - \varepsilon$. Since $|B(x)| \geq C h^{k\kappa}$ for $x \in \operatorname{supp} \Phi_l'$, $l = 1, 2$, the estimates (16) and (17) easily imply the desired estimates (5).

The constants $\lambda_{0l}$, $l = 1, 2$, can be chosen to be independent of $h$. Namely, one can take

$$\lambda_{01} = \lambda_{02} = 0. \tag{18}$$

Finally, the estimates (6) and (7) hold with

$$\beta_l = 1 + O(h^\kappa), \quad \varepsilon_l = O\bigl(h^{2\kappa(k+2) - \kappa - \frac{2k+2}{k+2}}\bigr), \quad l = 1, 2. \tag{19}$$
Using the inequality |a + b|2 ≤ |a|2 + 2|a||b| + |b|2 ≤ (1 + ε)|a|2 + (1 + ε−1 )|b|2 with ε = hκ , we get |ih d(φu) + Aφu|2 dµ(x) q h (φu) =(H h (φu), φu) = ≤(1 + hκ )
U (x¯j )
|ih d(φu) + Aj φu|2
g(0) dx
B(0,r)
≤(1 + hκ ) |ih d(φu) + A0j φu|2 g(0) dx B(0,r) + ch−κ |(Aj − A0j )φu|2 g(0) dx. B(0,r)
By (11), we have 0 2 2κ(k+2) |(Aj − Aj )φu| g(0) dx ≤ Ch B(0,r)
|φu|2
g(0) dx,
B(0,r)
which completes the proof of (6). The proof of (7) is similar.
Now we complete the proofs of Theorems 1 and 2. As above, take {λ_m : m ∈ N}, λ1 < λ2 < λ3 < . . . , to be the spectrum (without taking into account multiplicities) of the operator h^{−(2k+2)/(k+2)} K^h, which is independent of h. Take any a and b such that λ_m < a < b < λ_{m+1} with some m. Clearly, the spectrum of the operator A1 coincides with the spectrum of the operator h^{−(2k+2)/(k+2)} K^h. Therefore, the interval [a, b] does not intersect the spectrum of A1. Take any open interval (a1, b1) that contains [a, b] and does not intersect the spectrum of A1. Using the estimates (12), (13), (14), (18) and (19), one can see that, for a2 and b2 given by (8) and (9), we have
$$a_2 = a_1 + O(h^s), \qquad b_2 = b_1 + O(h^s), \qquad h \to 0, \tag{20}$$
where
$$s = \min\left\{(2k + 3)\kappa - \frac{2k+2}{k+2},\; -\frac{2k+2}{k+2} + 2 - 2\kappa\right\}.$$
The best possible value of s, which is
$$s = \max_{\kappa} \min\left\{(2k + 3)\kappa - \frac{2k+2}{k+2},\; -\frac{2k+2}{k+2} + 2 - 2\kappa\right\} = \frac{2}{(2k+5)(k+2)},$$
is attained when κ = 2/(2k+5). Hence, if h > 0 is small enough, we have α1 > a1 + γ1, α2 > b2 + γ2, b2 > a2 and the interval (a2, b2) contains [a, b]. By Theorem 3, we conclude that the interval (a2, b2) does not intersect the spectrum of A2, which completes the proof of Theorem 1. Moreover, we have that, for any λ1 ∈ (a1, b1) and λ2 ∈ (a2, b2), the spectral projections V1 E1(λ1) V1^{−1} and V2 E2(λ2) V2^{−1} are equivalent in A. Putting U = V1^{−1} V2, we get the desired Murray-von Neumann equivalence of E1(λ1) = id ⊗ E^0(λ) and V1^{−1} V2 E2(λ2) V2^{−1} V1 = U E(λ) U^{−1} in π1(A) = C_r^*(Γ, σ̄) ⊗ K(L^2(R^n)^N).
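The choice of κ above can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the original argument: the function names are illustrative, and it only confirms that the two exponents in the definition of s coincide at κ = 2/(2k+5) and that moving κ away from this value lowers the minimum.

```python
from fractions import Fraction


def gap_exponent(k, kappa):
    """Return s = min of the two exponents in the definition of s, for given k and kappa."""
    shift = Fraction(2 * k + 2, k + 2)
    first = (2 * k + 3) * kappa - shift      # increasing in kappa
    second = -shift + 2 - 2 * kappa          # decreasing in kappa
    return min(first, second)


def optimal_kappa(k):
    """Value of kappa at which the two exponents coincide."""
    return Fraction(2, 2 * k + 5)


def optimal_exponent(k):
    """Closed form of the best exponent stated in the text."""
    return Fraction(2, (2 * k + 5) * (k + 2))


if __name__ == "__main__":
    for k in range(1, 6):
        kappa = optimal_kappa(k)
        print(k, kappa, gap_exponent(k, kappa))
```

Since the first exponent increases and the second decreases in κ, the maximum of their minimum sits at the crossing point, which is what the sketch evaluates.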
Acknowledgements. I am very thankful to Bernard Helffer for bringing these problems to my attention and useful discussions and to Mikhail Shubin for his comments.
References
1. Figotin, A., Kuchment, P.: Band-Gap Structure of Spectra of Periodic Dielectric and Acoustic Media. I. Scalar model. SIAM J. Appl. Math. 56, 68–88 (1996); II. Two-dimensional photonic crystals. SIAM J. Appl. Math. 56, 1561–1620 (1996)
2. Figotin, A., Kuchment, P.: Spectral properties of classical waves in high-contrast periodic media. SIAM J. Appl. Math. 58, 683–702 (1998)
3. Friedlander, L.: On the density of states of periodic media in the large coupling limit. Commun. Partial Differ. Eqs. 27, 355–380 (2002)
4. Helffer, B., Mohamed, A.: Caractérisation du spectre essentiel de l'opérateur de Schrödinger avec un champ magnétique. Ann. Inst. Fourier (Grenoble) 38, 95–112 (1988)
5. Helffer, B., Mohamed, A.: Semiclassical analysis for the ground state energy of a Schrödinger operator with magnetic wells. J. Funct. Anal. 138, 40–81 (1996)
6. Helffer, B., Morame, A.: Magnetic bottles in connection with superconductivity. J. Funct. Anal. 185, 604–680 (2001)
7. Helffer, B., Morame, A.: Magnetic bottles for the Neumann problem: the case of dimension 3. In: Spectral and inverse spectral theory (Goa, 2000). Proc. Indian Acad. Sci. (Math. Sci.) 112, 71–84 (2002)
8. Helffer, B., Morame, A.: Magnetic bottles for the Neumann problem: Curvature effects in the case of dimension 3. To appear in Ann. Sci. Ecole Norm. Sup. 2004
9. Helffer, B., Nourrigat, J.: Hypoellipticité maximale pour des opérateurs polynômes de champs de vecteurs. Boston: Birkhäuser, 1985
10. Hempel, R., Herbst, I.: Strong magnetic fields, Dirichlet boundaries, and spectral gaps. Comm. Math. Phys. 169, 237–259 (1995)
11. Hempel, R., Lienau, K.: Spectral properties of periodic media in the large coupling limit. Comm. Partial Differ. Eqs. 25, 1445–1470 (2000)
12. Hempel, R., Post, O.: Spectral gaps for periodic elliptic operators with high contrast: an overview. http://arxiv.org/abs/math-ph/0207020, 2002
13. Herbst, I., Nakamura, S.: Schrödinger operators with strong magnetic fields: Quasi-periodicity of spectral orbits and topology. In: Differential operators and spectral theory. Am. Math. Soc. Transl. Ser. 2, v. 189, Providence RI: Am. Math. Soc., 1999, pp. 105–123
14. Kordyukov, Yu. A., Mathai, V., Shubin, M.: Equivalence of projections in semiclassical limit and a vanishing theorem for higher traces in K-theory. http://arxiv.org/abs/math.DG/0305189, 2003, to appear in J. Reine Angew. Math.
15. Mathai, V., Shubin, M.: Semiclassical asymptotics and gaps in the spectra of magnetic Schrödinger operators. Geom. Dedicata 91, 155–173 (2002)
16. Shubin, M.: Discrete magnetic Laplacian. Commun. Math. Phys. 164, 259–275 (1994)
17. Shubin, M.: Semiclassical asymptotics on covering manifolds and Morse Inequalities. Geom. Anal. Funct. Anal. 6, 370–409 (1996)
Communicated by B. Simon
Commun. Math. Phys. 253, 385–422 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1155-y
Communications in
Mathematical Physics
Polyvector Super-Poincaré Algebras
Dmitri V. Alekseevsky (1), Vicente Cortés (2), Chandrashekar Devchand (3), Antoine Van Proeyen (4)
(1) Dept. of Mathematics, University of Hull, Cottingham Road, Hull, HU6 7RX, UK. E-mail: [email protected]
(2) Institut Élie Cartan de Mathématiques, Université Henri Poincaré - Nancy 1, B.P. 239, 54506 Vandoeuvre-lès-Nancy Cedex, France. E-mail: [email protected]
(3) Mathematisches Institut der Universität Bonn, Beringstraße 1, 53115 Bonn, Germany. E-mail: [email protected]
(4) Instituut voor Theoretische Fysica, Katholieke Universiteit Leuven, Celestijnenlaan 200D, 30001 Leuven, Belgium. E-mail: [email protected]
Received: 25 November 2003 / Accepted: 13 February 2004 Published online: 12 August 2004 – © Springer-Verlag 2004
Abstract: A class of Z2 -graded Lie algebra and Lie superalgebra extensions of the pseudo-orthogonal algebra of a spacetime of arbitrary dimension and signature is investigated. They have the form g = g0 +g1 , with g0 = so(V )+W0 and g1 = W1 , where the algebra of generalized translations W = W0 +W1 is the maximal solvable ideal of g, W0 is generated by W1 and commutes with W . Choosing W1 to be a spinorial so(V )-module (a sum of an arbitrary number of spinors and semispinors), we prove that W0 consists of polyvectors, i.e.all the irreducible so(V )-submodules of W0 are submodules of ∧V . We provide a classification of such Lie (super)algebras for all dimensions and signatures. The problem reduces to the classification of so(V )-invariant ∧k V -valued bilinear forms on the spinor module S.
Contents
1. Introduction
2. ε-Extensions of so(V)
3. Extensions of Translational Type and ε-Transalgebras
4. Extended Polyvector Poincaré Algebras
5. Decomposition of S ⊗ S: Complex Case
6. Decomposition of S ⊗ S: Real Case
6.1. Odd dimensional case: dim V = 2m + 1
6.2. Even dimensional case: dim V = 2m
6.3. Decomposition of tensor square of semi-spinors
7. N-Extended Polyvector Poincaré Algebras
A. Admissible ∧^k V-Valued Bilinear Forms on S
B. Reformulation for Physicists
B.1. Complex and real Clifford algebras
B.2. Summary of the results for the algebras
1. Introduction
A superextension of a Lie algebra h is a Lie superalgebra g = g0 + g1 such that h ⊂ g0. If the Lie algebra g0 ⊃ h and a g0-module g1 are given, then a superextension is determined by the Lie superbracket in the odd part, which is a g0-equivariant linear map
$$\vee^2 \mathfrak{g}_1 \to \mathfrak{g}_0, \tag{1.1}$$
satisfying the Jacobi identity for X, Y, Z ∈ g1, where ∨ denotes the symmetric tensor product. Similarly, a Z2-graded extension (or simply Lie extension) of h is a Z2-graded Lie algebra g, i.e. a Lie algebra with a Z2-grading g = g0 + g1 compatible with the Lie bracket, [g_α, g_β] ⊂ g_{α+β}, α, β ∈ Z/2Z, such that g0 ⊃ h. As above, a Z2-graded extension is determined by the Lie bracket in g1, which defines a g0-equivariant linear map
$$\wedge^2 \mathfrak{g}_1 \to \mathfrak{g}_0, \tag{1.2}$$
satisfying the Jacobi identity.
For instance, consider a super vector space V0 + V1 endowed with a scalar superproduct g = g0 + g1, i.e. g0 is a (possibly indefinite) scalar product on V0 and g1 is a nondegenerate skewsymmetric bilinear form on V1. The Lie algebra h = so(V0) ⊕ sp(V1) of infinitesimal even automorphisms of (V0 + V1, g) has a natural superextension with even part h and odd part V0 ⊗ V1, where the Lie superbracket is given by
$$[v_0 \otimes v_1, v_0' \otimes v_1'] := g_1(v_1, v_1')\, v_0 \wedge v_0' + g_0(v_0, v_0')\, v_1 \vee v_1'.$$
This is the orthosymplectic Lie superalgebra osp(V0|V1). One can also define an analogous Lie superalgebra spo(V0|V1), starting from a symplectic super vector space (V0 + V1, ω = ω0 + ω1), such that spo(V0|V1) = osp(V1|V0). Similarly, for a Z2-graded vector space V0 + V1 endowed with a scalar product g = g0 + g1 (respectively, a symplectic form ω = ω0 + ω1) we have a natural Z2-graded extension so(V0 + V1) (respectively, sp(V0 + V1)) of the Lie algebra h = so(V0) ⊕ so(V1) (respectively, of h = sp(V0) ⊕ sp(V1)).
For a pseudo-Euclidean space-time V = R^{p,q} (with p positive and q negative directions), Nahm [N] classified superextensions g of the pseudo-orthogonal Lie algebra so(V) under the assumptions that q ≤ 2, g is simple, g0 is a direct sum of ideals, g0 = so(V) ⊕ k, where k is reductive and g1 is a spinorial module (i.e. its irreducible summands are spinors or semi-spinors). These algebras for q = 2 are usually considered as superconformal algebras for Minkowski spacetimes, by virtue of the identification conf(p − 1, 1) = so(p, 2).
In this paper, we shall consider both super and Lie extensions (which we call ε-extensions) of the pseudo-orthogonal Lie algebra so(V), with ε = +1 corresponding to superextensions and ε = −1 to Lie extensions. Here V = R^{p,q} or V = C^n is a vector space endowed with a scalar product.
In the case g0 = so(V) + V (Poincaré Lie algebra), ε-extensions g = g0 + g1 such that g1 is a spinorial module and [g1, g1] ⊂ V were classified in [AC]. In the case ε = −1 such extensions clearly do not respect the conventional field theoretical spin–statistics relationship. However, in order to classify super-Poincaré algebras (ε = +1) with an arbitrary number of irreducible spinorial submodules in g1 we need to classify Lie extensions as well as superextensions with irreducible g1. We study Z2-graded Lie algebras and Lie superalgebras, g = g0 + g1, where g0 = so(V) + W0, g1 = W1, such that so(V) is a maximal semisimple Lie subalgebra of g and
W = W0 + W1 is its maximal solvable ideal. If W0 contains [W1, W1] and commutes with W, we call g an ε-extension of translational type. If, moreover, W0 = [W1, W1], we call g an ε-transalgebra. Our main result is the classification of ε-extended polyvector Poincaré algebras, i.e. ε-extensions of translational type in the case when W1 = S, the spinor so(V)-module, or, more generally, an arbitrary spinorial module. Here V is an arbitrary pseudo-Euclidean vector space R^{p,q}. We prove that, under these assumptions, any irreducible so(V)-submodule of W0 is of the form ∧^k V or ∧^m_± V, where m = (p + q)/2 and ∧^m_± V are the eigenspaces of the Hodge star operator on ∧^m V.
If g = so(V) + W0 + S is an ε-transalgebra, then the (super) Lie bracket defines an so(V)-equivariant surjective map Γ_{W0} : S ⊗ S → W0. If K is the kernel of this map, then there exists a complementary submodule W̃0 such that S ⊗ S = W̃0 + K, and we can identify W̃0 with W0. We note that we can choose W̃0 ⊂ S ∧ S in the Lie algebra case and W̃0 ⊂ S ∨ S in the Lie superalgebra case. Conversely, if we have a decomposition S ⊗ S = W0 + K into a sum of two so(V)-submodules and moreover W0 ⊂ S ∧ S or W0 ⊂ S ∨ S, then the projection Γ_{W0} onto W0 with the kernel K defines an so(V)-equivariant bracket [·, ·] : S ⊗ S → W0, [s, t] = Γ_{W0}(s ⊗ t)
(1.3)
which is skewsymmetric or symmetric, respectively. More generally, if A is an endomorphism of W0 that commutes with so(V), then the twisted projection A ∘ Γ_{W0} is another so(V)-equivariant bracket and any bracket can be obtained in this way. Together with the action of so(V) on W0 and S, this defines the structure of an ε-transalgebra g = so(V) + W0 + S, since the Jacobi identity for X, Y, Z ∈ g1 follows from [g1, [g1, g1]] = 0. The classification problem then reduces essentially to the decomposition of S ∧ S, S ∨ S into irreducible so(V)-submodules and the description of the projection Γ_{W0}. In this paper, we resolve both these matters.
In all cases the irreducible so(V)-submodules occurring in the tensor product S ⊗ S are k-forms ∧^k V, with the exception of the case of even dimensions n = p + q = 2m with signature s = p − q divisible by 4. In the latter case the m-form module splits into irreducible selfdual and anti-selfdual submodules ∧^m_± V. The multiplicities of any irreducible so(V)-submodules of S ⊗ S take values 1, 2, 4 or 8. For example if V = C^n, n = 2m + 1 or if V = R^{m,m+1}, we have (cf. [OV])
$$S \otimes S = \sum_{k=0}^{m} \wedge^k V,$$
$$S \vee S = \sum_{k=0}^{[m/4]} \wedge^{m-4k} V + \sum_{k=0}^{[(m-3)/4]} \wedge^{m-3-4k} V,$$
$$S \wedge S = \sum_{k=0}^{[(m-2)/4]} \wedge^{m-2-4k} V + \sum_{k=0}^{[(m-1)/4]} \wedge^{m-1-4k} V.$$
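A quick consistency check on the decompositions of S ∨ S and S ∧ S is a dimension count. The sketch below is an illustration only and assumes dim S = 2^m for n = 2m + 1 (the standard dimension of the spinor module in the complex or split real case); the function names are hypothetical. It compares the sum of the binomial coefficients dim ∧^j V = C(n, j) with the dimensions of the symmetric and skew squares of a 2^m-dimensional space.

```python
from math import comb


def dim_sym_square(m):
    """Dimension of the right-hand side of the formula for S v S, n = 2m + 1."""
    n = 2 * m + 1
    # Sum over k = 0..[m/4] of dim wedge^(m-4k) V.
    total = sum(comb(n, m - 4 * k) for k in range(m // 4 + 1))
    # Sum over k = 0..[(m-3)/4]; floor division makes the range empty when m < 3.
    total += sum(comb(n, m - 3 - 4 * k) for k in range((m - 3) // 4 + 1))
    return total


def dim_skew_square(m):
    """Dimension of the right-hand side of the formula for S ^ S, n = 2m + 1."""
    n = 2 * m + 1
    total = sum(comb(n, m - 2 - 4 * k) for k in range((m - 2) // 4 + 1))
    total += sum(comb(n, m - 1 - 4 * k) for k in range((m - 1) // 4 + 1))
    return total


if __name__ == "__main__":
    for m in range(0, 7):
        d = 2 ** m  # assumed dimension of the spinor module S
        print(m, dim_sym_square(m), d * (d + 1) // 2,
              dim_skew_square(m), d * (d - 1) // 2)
```

For m = 5 (n = 11) the symmetric square has dimension 528 = 11 + 55 + 462, which corresponds to the one-form, two-form and five-form pieces.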
The vector space of ε = −1-extensions of translational type of the form g = so(V) + ∧^k V + S is identified with the vector space Bil^k_−(S)^{so(V)} := Hom_{so(V)}(S ∧ S, ∧^k V) of ∧^k V-valued invariant skewsymmetric bilinear forms on S. Similarly, the vector space of ε = +1-extensions of translational type of the form g = so(V) + ∧^k V + S is identified with the vector space Bil^k_+(S)^{so(V)} := Hom_{so(V)}(S ∨ S, ∧^k V).
The main problem is the description of these spaces of invariant ∧^k V-valued bilinear forms. For k = 0, 1 this problem was solved in [AC], where three invariants, σ, τ and ι, were defined for bilinear forms on the spinor module. Following [AC], a nondegenerate so(V)-invariant (scalar) bilinear form β on the spinor module S is called admissible if it has the following properties:
1) β is either symmetric or skewsymmetric, β(s, t) = σ(β)β(t, s), s, t ∈ S, σ(β) = ±1. We define σ(β) to be the symmetry of β.
2) Clifford multiplication by v ∈ V, γ(v) : S → S, s ↦ γ(v)s = v · s, is either β-symmetric or β-skewsymmetric, i.e. β(vs, t) = τ(β)β(s, vt), s, t ∈ S, with τ(β) = +1 or −1, respectively. We define τ(β) to be the type of β.
3) If the spinor module is reducible, S = S+ + S−, then the semispinor modules S+ and S− are either mutually orthogonal or isotropic. We define the isotropy of β to be ι(β) = +1 if β(S+, S−) = 0 or ι(β) = −1 if β(S±, S±) = 0.
In [AC], a basis β_i of the space Bil(S)^{so(V)} := Bil^0(S)^{so(V)} of scalar-valued invariant forms was constructed explicitly, which consists of admissible forms. These are tabulated in the appendix (Table A.3). The dimension N(s) = dim Bil(S)^{so(V)} depends only on the signature s = p − q of V (see Table A.1 in the Appendix). We associate with a bilinear form β on S the ∧^k V-valued bilinear form β_k : S ⊗ S → ∧^k V, defined by the following fundamental formula:
$$\langle \beta_k(s \otimes t), v_1 \wedge \cdots \wedge v_k \rangle = \sum_{\pi \in S_k} \mathrm{sgn}(\pi)\, \beta\bigl(\gamma(v_{\pi(1)}) \cdots \gamma(v_{\pi(k)}) s, t\bigr), \qquad s, t \in S,\ v_i \in V,$$
which extends the formula given in [AC] from k = 1 to arbitrary k. For k = 0 we have that β_0 = β. We shall prove that the map Γ^k : β ↦ β_k is so(V)-equivariant and induces an isomorphism
$$\Gamma^k : \mathrm{Bil}(S)^{\mathfrak{so}(V)} \xrightarrow{\ \sim\ } \mathrm{Bil}^k(S)^{\mathfrak{so}(V)}$$
onto the vector space of ∧^k V-valued invariant bilinear forms on S. This was proven for k = 1 in [AC]. The definitions of the invariants σ, τ, ι make sense for ∧^k V-valued bilinear forms as well. If σ(β_k) = −1, the form β_k is skewsymmetric and hence defines a Lie algebra structure on g = so(V) + ∧^k V + S. If σ(β_k) = +1, it defines a Lie superalgebra structure on g = so(V) + ∧^k V + S. We shall prove, for admissible β, that
$$\sigma(\beta_k) = \sigma(\beta)\,\tau(\beta)^k (-1)^{k(k-1)/2}. \tag{1.4}$$
In the cases when semi-spinors exist, we shall prove that
$$\iota(\beta_k) = \iota(\beta)(-1)^k. \tag{1.5}$$
For k > 0 the bilinear forms β_k associated with an admissible bilinear form β have neither value of the type τ. Clearly, the formulae for the invariants show that σ(β_k) and ι(β_k) depend only on k modulo 4. We tabulate these invariants for the forms (β_i)_k, k = 0, 1, 2, 3, in the Appendix.
Let the number N^ε_k(s, n) denote the dimension of the vector space Bil^k_ε(S)^{so(V)} of ε-extended k-polyvector Poincaré algebra structures on g = so(V) + ∧^k V + S. We shall see that the sum N_k(s, n) = N^+_k(s, n) + N^−_k(s, n) = N(s) = dim Bil(S)^{so(V)} depends only on the signature s. We shall also verify the following remarkable shift formula:
$$N_k^{\pm}(s, n + 2k) = N_0^{\pm}(s, n), \tag{1.6}$$
which reduces the calculation of these numbers to the case of zero forms. The function N^±(s, n) := N^±_0(s, n) has the following symmetries:
a) Periodicity modulo 8 in s and n: N^±(s + 8a, n + 8b) = N^±(s, n), a, b ∈ Z. Using this, we can extend the functions N^±(s, n) to all integer values of s and n.
b) Symmetry with respect to reflection in signature 3: N^±(−s + 6, n) = N^±(s, n).
c) The mirror symmetries:
$$N^{\pm}(s, n + 4) = N^{\mp}(s, n), \tag{1.7}$$
$$N^{\pm}(s, -n + 4) = N^{\mp}(s, n). \tag{1.8}$$
Due to the shift formula (1.6), all these identities yield corresponding identities for N^±_k(s, n) for any k. For example the mirror identity (1.8) gives the mirror symmetry for k = 1 (reflection with respect to zero dimension), N^±_1(s, −n) = N^∓_1(s, n), which was discovered in [AC]. In Appendix B, we summarise our results in language more familiar to the physics community.
Recently, there have been many discussions (e.g. [AI, CAIP, DFLV, DN, FV, Sc, Sh, V, VV]) of generalizations of spacetime supersymmetry algebras which go beyond Nahm's classification. Of particular interest has been the M-theory algebra, which extends the d = 11 super Poincaré algebra by two-form and five-form brane charges. In the important paper [DFLV], the authors study superconformal Lie algebras and polyvector super-Poincaré algebras g = so(V) + ∧^k V + W1, where W1 = S or W1 = S±. They propose an approach for the classification of such Lie superalgebras g which consists essentially of the following two steps: first describe the space Hom_{so(V^C)}(S ∨ S, ∧^k V^C), if the complex spinor module S is irreducible, and the spaces Hom_{so(V^C)}(S± ∨ S±, ∧^k V^C)
and Hom_{so(V^C)}(S+ ⊗ S−, ∧^k V^C) if the complex spinor module S = S+ + S− is reducible, then describe so(V)-invariant reality conditions. They determine the dimension of the above vector spaces, which is always zero or one, and discuss the second problem. In the present paper we start from the real spinor module S and, in particular, describe explicitly the real vector space H = Hom_{so(V)}(W1 ∨ W1, ∧^k V) for an arbitrary spinorial module W1. We shall see that even if W1 is an irreducible spinor module S, the dimension of H can be 0, 1, 2 or 3. Polyvector super-Poincaré algebras were also considered in [CAIP] for Lorentzian signature (1, q) in the dual language of left-invariant one-forms on the supergroup of supertranslations.
2. ε-Extensions of so(V)
Let V be a real or complex vector space endowed with a scalar product and W1 an so(V)-module.
Definition 1. A superextension (ε = +1-extension) of so(V) of type W1 is a Lie superalgebra g satisfying the conditions
i) so(V) ⊂ g0 as a subalgebra,
ii) g1 = W1, a g0-module.
A Lie extension (ε = −1-extension) of so(V) of type W1 is a Z2-graded Lie algebra g = g0 + g1, also satisfying i) and ii). Further, an ε-extension is called minimal if it does not contain a proper subalgebra which is also an ε-extension of type W1; more precisely, if g' = g'0 + g1 ⊂ g, so(V) ⊂ g'0, then g' = g.
The Lie superalgebras classified by Nahm are examples of superextensions of so(R^{p,q}) of spinor type W1 = S. Let g = g0 + g1 be an ε-extension of so(V), g0 = so(V) + W0, with W0 an so(V)-submodule that is complementary to so(V) in g0 and g1 = W1. There are two extremal classes of such algebras:
E1: g is semi-simple, i.e. does not contain any proper solvable ideal,
E2: g is of semi-direct type, i.e. g is maximally non-semi-simple, in the sense that so(V) is its largest semi-simple super Lie subalgebra, g = so(V) + W0 + W1 and W = W0 + W1 is a solvable ideal.
3. Extensions of Translational Type and ε-Transalgebras
Definition 2. Let g = so(V) + W0 + W1 be an ε-extension of so(V). If [W0, W] = 0 and [W1, W1] ⊂ W0 then the extension g = so(V) + W0 + W1 is called an ε-extension of so(V) of translational type and the (nilpotent) ideal W = W0 + W1 is called the algebra of generalized translations. If it is minimal, in the sense of Definition 1, then it is called an ε-transalgebra.
We note that such an extension is automatically of semi-direct type, provided that dim V ≥ 3, which we assume in this section. We also assume for definiteness that V is a real vector space. The minimality condition is equivalent to [W1, W1] = W0 and means that even translations are generated by odd translations. The construction of ε-extensions of so(V) of translational type with given so(V)-modules W0 and W1 reduces to the construction of so(V)-equivariant linear maps ∨² W1 → W0 and ∧² W1 → W0. The Jacobi
identity for the Lie bracket associated to such a map follows from the so(V)-equivariance. Now we show that the description of ε-extensions of so(V) of translational type reduces to that of minimal ones (i.e. transalgebras). Let g = so(V) + W0 + W1 be an ε-extension of so(V) of translational type. Then g' := so(V) + [W1, W1] + W1 is an ε-transalgebra and g = g' + a is the semi-direct sum of the subalgebra g' and an (even) Abelian ideal a, where a ⊂ W0 is an so(V)-submodule complementary to [W1, W1] ⊂ W0. Conversely, if g' = so(V) + W'0 + W1 is an ε-transalgebra and a is an so(V)-module then the semi-direct sum g := g' + a is an ε-extension of so(V) of translational type, where W0 := W'0 + a.
Proposition 1. Let W1 be an so(V)-module. Then there exists a unique (up to isomorphism) ε-transalgebra of maximal dimension with g1 = W1: g^ε = g^ε(W1) = g0 + g1 = (so(V) + W0^ε) + W1, where W0^+ = ∨² W1 and W0^− = ∧² W1. The Lie (super)bracket [·, ·] : W1 ⊗ W1 → W0^ε is the projection onto the corresponding summand of W1 ⊗ W1 = ∨² W1 ⊕ ∧² W1. Moreover, any ε-transalgebra with g1 = W1 is isomorphic to a contraction of g^ε(W1).
Proof. It is clear that g^ε is a maximal ε-transalgebra. Let g' = g'0 + g'1 be a maximal ε-transalgebra with g'1 = W1 and g'0 = so(V) + W'0. The Lie (super)bracket [·, ·] : W1 ⊗ W1 → W'0 defines an so(V)-equivariant isomorphism from ∨² W1 or ∧² W1 onto W'0. This isomorphism extends to an isomorphism g^ε → g', which is the identity on so(V) + W1. Similarly, for any ε-transalgebra g with g1 = W1 the (super) Lie bracket [·, ·] : W1 ⊗ W1 → W0 defines an so(V)-equivariant epimorphism ϕ from ∨² W1 or ∧² W1 onto W0. The kernel K is an so(V)-submodule of ∨² W1 or ∧² W1, respectively. Since so(V) is semi-simple, there exists a complementary submodule W̃0 isomorphic to W0. We can identify the so(V)-module W̃0 with W0 by means of the isomorphism ϕ|_{W̃0}. Then the Lie bracket corresponds to the projection π^+_{W̃0} : ∨² W1 = K + W̃0 → W̃0 or π^−_{W̃0} : ∧² W1 = K + W̃0 → W̃0. This defines an ε-transalgebra g^ε(W1, W̃0) = so(V) + W̃0 + W1, whose bracket is the above projection π^ε_{W̃0}. The isomorphism ϕ|_{W̃0} : W̃0 → W0 of so(V)-modules extends trivially to an isomorphism g^ε(W1, W̃0) → g. This shows that any ε-transalgebra is isomorphic to an ε-transalgebra of the form g^ε(W1, W̃0), where W̃0 ⊂ W1 ⊗ W1 is an so(V)-submodule contained in ∨² W1 or ∧² W1, respectively. Consider now the one-parameter family of Lie brackets [·, ·]_t := t(Id − π^ε_{W̃0}) + π^ε_{W̃0}. This defines a family of ε-transalgebras (g^ε(W1), [·, ·]_t). For t ≠ 0 they are isomorphic to the original (t = 1) ε-transalgebra. In the limit t → 0 we obtain the ε-transalgebra g^ε(W1, W̃0) as a contraction of g^ε(W1).
The following proposition describes the structure of extensions of semi-direct type under the additional assumption that the so(V)-module W1 is irreducible. We denote by ρ : g0 → gl(W1) the adjoint representation of g0 = so(V) + W0 on W1 and by K the kernel of ρ|_{W0}, which is clearly an ideal of g. Thus, ρ|_{W0} is the action of ad W0 on W1, and K consists of all elements of W0 that commute with W1.
Proposition 2. Let g = so(V) + W0 + W1 be an ε-extension of semi-direct type. Assume that the so(V)-module W1 is irreducible of dimension at least 3 if it does not admit an so(V)-invariant complex structure and dim_C W1 ≥ 3 if it does, and that dim V ≥ 3. Consider the decomposition of g into a direct sum of so(V)-submodules, g = so(V) + A + K + W1, where A is an so(V)-invariant complement to K in W0 = A + K. Then the
dimension dim A = 0, 1, 2 and the irreducible linear Lie algebra ρ(g0) = ρ(so(V)) + Z, where the centre Z ≅ W0/K is either 0, R·Id or C·Id. Moreover,
$$[A, A] \subset K, \qquad [\mathfrak{so}(V), A] = 0, \qquad [W_1, W_1] \subset K.$$
Proof. By assumption, the linear Lie algebra ρ(g0 ) ⊂ gl(W1 ) is irreducible and hence reductive. Since any solvable ideal of a reductive Lie algebra belongs to the centre, we conclude that the solvable ideal ρ(W0 ) ⊂ ρ(g0 ) is in fact Abelian and consists of operators commuting with so(V ). Now Schur’s Lemma implies that ρ(W0 ) = 0, R·Id, or C·Id. The inclusion [A, A] ⊂ K follows from the fact that ρ(A) is in the centre of g0 . Since the restriction of ρ to so(V ) + A is faithful and [ρ(so(V )), ρ(A)] = 0, we conclude that [so(V ), A] = 0. From the assumptions it follows that there exist three vectors x, y, z ∈ W1 , which are linearly independent over the reals if W1 has no invariant complex structure and over the complex numbers if W1 has an invariant complex structure J . For any three linearly independent vectors (over R or C) x, y, z ∈ W1 , the Jacobi identity gives 0 = [[x, y], z] + [[y, z], x] + [[z, x], y] = ρ([x, y])z + ρ([y, z])x + ρ([z, x])y. Since [W1 , W1 ] ⊂ W0 and ρ(W0 ) = ρ(A) consists of scalar operators (over R or C), we have that ρ([x, y]) = 0, i.e. [W1 , W1 ] ⊂ K = ker ρ. Note that g is a transalgebra if and only if A=0. The following corollary gives sufficient conditions for extensions of semi-direct type to be transalgebras. Corollary 1. Under the assumptions of the previous proposition, assume moreover that g = so(V )+W0 +W1 is a minimal extension of type W1 . Then g is a transalgebra. Proof. Minimality implies W0 = [W1 , W1 ] and, by the above proposition, [W1 , W1 ] commutes with W1 . Now the Jacobi identity for x, y ∈ W1 and z ∈ W0 yields [W0 , W0 ] = 0. Instead of minimality we may assume the irreducibility of the so(V )-module W0 . Proposition 3. Let g = so(V ) + W0 + W1 be an -extension of semi-direct type, with dim V ≥ 3. Assume that W0 and W1 are irreducible so(V )-modules. Then either g is of translational type, i.e. 
[W0, W] = 0, or W0 ≅ R (considered as a real Lie algebra) is the centre of g0 = so(V) + W0 and ad W0 acts on W1 by scalars.
Proof. Let W0, W1 be irreducible so(V)-modules. Since the algebra W = W0 + W1 is solvable, [W0, W0] is a proper so(V)-submodule of W0, hence [W0, W0] = 0. The kernel K of the adjoint representation ρ : W0 → gl(W1) is an so(V)-submodule of W0. Hence K = W0 or 0. In the first case, g is of translational type. In the second case, the representation ρ is faithful and ρ(W0) commutes with ρ(so(V)), hence [so(V), W0] = 0. On the other hand the so(V)-module W0 is irreducible, so W0 ≅ R.
4. Extended Polyvector Poincaré Algebras and ∧^k V-Valued Invariant Bilinear Forms on the Spinor Module S
In this and the next two sections, we devote ourselves to the classification of ε-transalgebras g = g0 + g1 with g1 = W1 = S, the spinor so(V)-module. We take V to be the pseudo-Euclidean space R^{p,q} of dimension n = p + q and signature s = p − q. In other words, we consider (super) Lie algebras g = (so(V) + W0) + S with
$$[W_0, W_0 + S] = 0, \qquad W_0 = [S, S].$$
The (super) Lie bracket defines an so(V)-equivariant surjective map Γ_{W0} : S ⊗ S → W0. If K is the kernel of this map, then S ⊗ S = W̃0 + K, where W̃0 is an so(V)-submodule equivalent to W0 such that W̃0 ⊂ S ∧ S in the Lie algebra case and W̃0 ⊂ S ∨ S in the superalgebra case. Conversely, if we have a decomposition S ⊗ S = W0 + K into a sum of two so(V)-submodules and moreover W0 ⊂ S ∧ S or W0 ⊂ S ∨ S, then the projection Γ_{W0} onto W0 with the kernel K defines an so(V)-equivariant bracket
$$[\cdot, \cdot] : S \otimes S \to W_0, \qquad [s, t] = \Gamma_{W_0}(s \otimes t), \tag{4.1}$$
which is skewsymmetric or symmetric, respectively. More generally, if A is an endomorphism of W0 which commutes with so(V), then the twisted projection A ∘ Γ_{W0} is another so(V)-equivariant bracket and any bracket can be obtained in this way. Together with the action of so(V) on W0 and S, this defines the structure of an ε-transalgebra g = so(V) + W0 + S. We therefore have a 1–1 correspondence between ε-transalgebras of the form g = so(V) + W0 + S, where W0 is a submodule of S ∨ S (for ε = 1) or S ∧ S (for ε = −1), and equivariant surjective maps Γ_{W0} : S ⊗ S → W0, whose kernel contains S ∨ S if ε = −1 and S ∧ S if ε = 1. The problem thus reduces to the description of the decomposition of S ∧ S and S ∨ S into irreducible so(V)-submodules and the determination of the twisted projections A ∘ Γ_{W0}. We consider these projections as equivariant W0-valued symmetric or skewsymmetric bilinear forms on S. In the next section we show that the irreducible submodules of S ⊗ S are of the form ∧^k V or ∧^m_± V ((anti)selfdual m-forms) if n = 2m and s is divisible by 4.
We denote by Bil^k(S) = Hom(S ⊗ S, ∧^k V) the vector space of ∧^k V-valued bilinear forms on S. It can be decomposed, Bil^k(S) = Bil^k_+(S) ⊕ Bil^k_−(S), into the sum of the vector spaces of symmetric (+) and skewsymmetric (−) bilinear forms. For W0 = ∧^k V, the space of ε-transalgebras (ε = ±) is identified with the space Bil^k_ε(S)^{so(V)} of so(V)-invariant symmetric (ε = +) or skewsymmetric (ε = −) bilinear forms. Hence: The classification of ε-transalgebras g = g0 + g1 with g1 = S reduces to the description of the spaces Bil^k_ε(S)^{so(V)} of ∧^k V-valued invariant bilinear forms on the spinor module S.
The following formula associates a ∧^k V-valued bilinear form β_k ∈ Bil^k(S) to every (scalar) bilinear form β ∈ Bil(S):
$$\langle \beta_k(s \otimes t), v_1 \wedge \cdots \wedge v_k \rangle = \sum_{\pi \in S_k} \mathrm{sgn}(\pi)\, \beta\bigl(\gamma(v_{\pi(1)}) \cdots \gamma(v_{\pi(k)}) s, t\bigr), \qquad s, t \in S,\ v_i \in V,$$
where the sum is over permutations π of {1, . . . , k}. Our classification is based on the following theorem.
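For orthogonal vectors the alternating sum in the formula above collapses to k! times a single ordered product, because the operators γ(v_i) anticommute; this fact is used in the proof of Theorem 1. The sketch below illustrates it for V = R^3 with the Pauli matrices standing in for γ(e_1), γ(e_2), γ(e_3). This choice of representation and of sign convention (γ(e_i)² = +1) is an assumption made only for the illustration; the identity relies on anticommutation alone.

```python
from itertools import permutations
from math import factorial

IDENTITY = [[1, 0], [0, 1]]
# Pauli matrices: pairwise anticommuting, each squaring to the identity.
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def product(mats):
    """Ordered product of a list of 2x2 matrices."""
    out = IDENTITY
    for m in mats:
        out = matmul(out, m)
    return out


def sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def alternating_sum(mats):
    """Sum over permutations pi of sgn(pi) * mats[pi(1)] * ... * mats[pi(k)]."""
    total = [[0, 0], [0, 0]]
    for perm in permutations(range(len(mats))):
        term = product([mats[i] for i in perm])
        s = sign(perm)
        total = [[total[i][j] + s * term[i][j] for j in range(2)]
                 for i in range(2)]
    return total


def scaled(c, m):
    """Scalar multiple of a matrix."""
    return [[c * x for x in row] for row in m]


if __name__ == "__main__":
    for k in (1, 2, 3):
        gammas = PAULI[:k]
        print(k, alternating_sum(gammas) == scaled(factorial(k), product(gammas)))
```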
Theorem 1. For any pseudo-Euclidean vector space V ≅ R^{p,q}, the map
$$\Gamma^k : \mathrm{Bil}(S) \to \mathrm{Bil}^k(S), \qquad \beta \mapsto \beta_k,$$
is a Spin(V)-equivariant monomorphism and it induces an isomorphism
$$\Gamma^k : \mathrm{Bil}(S)^{\mathfrak{so}(V)} \xrightarrow{\ \sim\ } \mathrm{Bil}^k(S)^{\mathfrak{so}(V)}$$
of vector spaces.
Proof. It is known that Clifford multiplication γ : V → End S is Spin(V)-equivariant, i.e. γ(gv) = gγ(v)g^{−1}, g ∈ Spin(V), v ∈ V. Using this property we now check that the map Γ^k is also Spin(V)-equivariant:
$$(g \cdot \beta)_k = g \cdot \beta_k,$$
where (g·β)(s, t) = β(g^{−1}s, g^{−1}t) and (g·β_k)(s, t) = gβ_k(g^{−1}s, g^{−1}t). We calculate
$$\begin{aligned}
\langle (g \cdot \beta)_k(s, t), v_1 \wedge \cdots \wedge v_k \rangle
&= \sum_{\pi \in S_k} \mathrm{sgn}(\pi)\, \beta\bigl(g^{-1}\gamma(v_{\pi(1)}) \cdots \gamma(v_{\pi(k)}) s,\ g^{-1} t\bigr) \\
&= \sum_{\pi \in S_k} \mathrm{sgn}(\pi)\, \beta\bigl(\gamma(g^{-1}v_{\pi(1)}) \cdots \gamma(g^{-1}v_{\pi(k)})\, g^{-1}s,\ g^{-1} t\bigr) \\
&= \langle \beta_k(g^{-1}s, g^{-1}t),\ g^{-1}v_1 \wedge \cdots \wedge g^{-1}v_k \rangle \\
&= \langle g\beta_k(g^{-1}s, g^{-1}t),\ v_1 \wedge \cdots \wedge v_k \rangle
= \langle (g \cdot \beta_k)(s, t),\ v_1 \wedge \cdots \wedge v_k \rangle.
\end{aligned} \tag{4.2}$$
Next, we prove that Γ^k is injective. For β ∈ Bil(S) the bilinear form β_k is zero if and only if
$$\beta\Bigl(\sum_{\pi} \mathrm{sgn}(\pi)\, \gamma(v_{\pi(1)}) \cdots \gamma(v_{\pi(k)}) S,\ S\Bigr) = 0$$
or
$$\sum_{\pi} \mathrm{sgn}(\pi)\, \gamma(v_{\pi(1)}) \cdots \gamma(v_{\pi(k)}) S \subset \ker(\beta)$$
for any vectors v_1, . . . , v_k. If the vectors v_1, . . . , v_k are orthogonal, then the endomorphisms γ(v_1), . . . , γ(v_k) anticommute and the endomorphism Σ_π sgn(π) γ(v_{π(1)}) · · · γ(v_{π(k)}) = k! γ(v_1) · · · γ(v_k) is invertible. This implies that ker(β) = S and so β = 0.
To complete the proof of the theorem, we need to check that
$$\dim \mathrm{Bil}^k(S)^{\mathfrak{so}(V)} = \dim \mathrm{Bil}(S)^{\mathfrak{so}(V)} =: N(p - q).$$
In fact, dim Bil^k(S)^{so(V)} = µ(k) dim C(∧^k V), where µ(k) is the multiplicity of ∧^k V in S ⊗ S and C(M) = End_{so(V)}(M) denotes the Schur algebra of an so(V)-module
M. If the signature s = p − q is divisible by 4 and k = m = n/2, then ∧^m V = ∧^m_+ V ⊕ ∧^m_− V is the sum of two inequivalent irreducible so(V)-modules of real type and hence C(∧^m V) ≅ R ⊕ R. If the signature s is even but not divisible by 4 and k = m = n/2, then ∧^m V is an irreducible so(V)-module of complex type, with the complex structure defined by the Hodge star operator, and hence C(∧^m V) ≅ C. In both cases
$$\dim \mathrm{Bil}^m(S)^{\mathfrak{so}(V)} = \mu(m) \dim C(\wedge^m V) = 2\mu(m) = N(s),$$
where the last equation follows from Table A.1 in the appendix. In all other cases, ∧^k V is an irreducible module of real type and C(∧^k V) = R. Therefore, using Table A.1, we obtain
$$\dim \mathrm{Bil}^k(S)^{\mathfrak{so}(V)} = \mu(k) \dim C(\wedge^k V) = \mu(k) = N(s).$$
In the Introduction we defined the three Z/2Z-valued invariants for ∧^k V-valued bilinear forms on the spinor module: symmetry, type and isotropy. We say that a non-zero ∧^k V-valued bilinear form in Bil^k(S), k > 0, is admissible if it is either symmetric or skewsymmetric and, in the cases when semispinor modules exist, if the two semispinor modules are either isotropic or mutually orthogonal with respect to it. Recall that in the case of scalar-valued bilinear forms (k = 0), admissibility requires, in addition, that the bilinear form has a specific type τ. The invariants of admissible ∧^k V-valued bilinear forms in terms of the invariants of the scalar-valued admissible bilinear forms are given by:
Proposition 4. Let β ∈ Bil(S) be an admissible scalar bilinear form and β_k the associated ∧^k V-valued bilinear form. Then β_k is admissible and its invariants, the symmetry σ(β_k) and the isotropy ι(β_k), can be calculated as follows:
$$\sigma(\beta_k) = \sigma(\beta)\,\tau(\beta)^k (-1)^{k(k-1)/2}, \tag{4.3}$$
$$\iota(\beta_k) = \iota(\beta)(-1)^k. \tag{4.4}$$
For k > 0 the bilinear forms β_k ≠ 0 have neither type.
Proof. Let s, t ∈ S and e_1, . . . , e_k ∈ V be orthogonal vectors. We put γ_i := γ(e_i) and compute
$$\begin{aligned}
\langle \beta_k(s \otimes t), e_1 \wedge \cdots \wedge e_k \rangle
&= k!\, \beta(\gamma_1 \cdots \gamma_k s, t) \\
&= k!\, \tau(\beta)^k \beta(s, \gamma_k \cdots \gamma_1 t) \\
&= k!\, \tau(\beta)^k (-1)^{k(k-1)/2} \beta(s, \gamma_1 \cdots \gamma_k t) \\
&= k!\, \tau(\beta)^k (-1)^{k(k-1)/2} \sigma(\beta)\, \beta(\gamma_1 \cdots \gamma_k t, s) \\
&= \tau(\beta)^k (-1)^{k(k-1)/2} \sigma(\beta)\, \langle \beta_k(t \otimes s), e_1 \wedge \cdots \wedge e_k \rangle.
\end{aligned}$$
This proves Eq. (4.3). Equation (4.4) follows from the fact that Clifford multiplication γ(v) maps S+ to S− and vice versa.
5. Decomposition of the Tensor Square of the Spinor Module of Spin(V) into Irreducible Components: Complex Case

In this section we consider the spinor module $S$ of a complex Euclidean vector space $V = \mathbb{C}^n$ and we derive the decompositions of $S \otimes S$, $S \vee S$ and $S \wedge S$ into inequivalent irreducible $\mathrm{Spin}(V)$-submodules. These decompositions also yield the corresponding decompositions for the cases when $S$ is a spinor module of a real vector space $V = \mathbb{R}^{m,m}$ if $n = 2m$ and $V = \mathbb{R}^{m,m+1}$ if $n = 2m+1$. We shall use the well known facts summarised in the following lemma, see e.g. [OV].

Lemma 1. Let $V$ be an $n$-dimensional complex Euclidean vector space or a real pseudo-Euclidean vector space of signature $(p, q)$, $p + q = n$, $p - q = s$. If $n = 2m+1$, then the decomposition of $\wedge V$ into irreducible pairwise inequivalent $\mathfrak{so}(V)$-submodules is given by
\[ \wedge V = \sum_{k=0}^{n} \wedge^k V = \sum_{k=0}^{m} \wedge^k V + \sum_{k=0}^{m} * \wedge^k V = 2 \sum_{k=0}^{m} \wedge^k V . \tag{5.1} \]
If $n = 2m$ then we have the following decomposition into irreducible pairwise inequivalent $\mathfrak{so}(V)$-submodules:
\[ \wedge V = \begin{cases} 2 \sum_{k=0}^{m-1} \wedge^k V + \wedge^m V & \text{if } s/2 \text{ is odd,} \\ 2 \sum_{k=0}^{m-1} \wedge^k V + \wedge^m_+ V + \wedge^m_- V & \text{if } s/2 \text{ is even or if } V \text{ is complex.} \end{cases} \tag{5.2} \]
Here $\wedge^m_\pm V$ are the selfdual and anti-selfdual $m$-forms, the $\pm 1$-eigenspaces of the Hodge $*$-operator, which acts isometrically on $\wedge^m V$, with $*^2 = (-1)^{m+q} = (-1)^{s/2} = +1$ if $s/2$ is even. In particular, the $\mathfrak{so}(V)$-module $\wedge^k V$ is irreducible, unless $n = 2m$, $s/2$ is even and $k = m$, in which case $\wedge^m V = \wedge^m_+ V + \wedge^m_- V$, where $\wedge^m_+ V$ and $\wedge^m_- V$ are irreducible inequivalent modules.

Theorem 2. (i) The $\mathrm{Spin}(V)$-module $S \otimes S$ contains all modules $\wedge^k V$ which are irreducible.
(ii) If $V$ is a complex vector space of dimension $n = 2m+1$ or if $V$ is real of signature $(m, m+1)$ then
\[ S \otimes S = \sum_{k=0}^{m} \wedge^k V , \]
\[ S \vee S = \sum_{i=0}^{[m/4]} \wedge^{m-4i} V + \sum_{i=0}^{[(m-3)/4]} \wedge^{m-3-4i} V , \]
\[ S \wedge S = \sum_{i=0}^{[(m-2)/4]} \wedge^{m-2-4i} V + \sum_{i=0}^{[(m-1)/4]} \wedge^{m-1-4i} V . \]
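As a sanity check on Theorem 2 (ii), the dimensions of both sides can be compared numerically. The sketch below is only an illustration and not part of the proof: it assumes $\dim S = 2^m$ and $\dim \wedge^k V = \binom{n}{k}$ for $n = 2m+1$, and the helper names `wedge_block` and `theorem2_dims` are hypothetical.

```python
from math import comb

def wedge_block(n, start):
    # Sum of dim(wedge^k V) over k = start, start - 4, start - 8, ... with k >= 0.
    # An empty range (start < 0) corresponds to an empty sum in the theorem.
    return sum(comb(n, k) for k in range(start, -1, -4))

def theorem2_dims(m):
    """Return (claimed, expected) dimension pairs for n = 2m + 1."""
    n = 2 * m + 1
    dim_s = 2 ** m  # assumed dimension of the spinor module
    sym = wedge_block(n, m) + wedge_block(n, m - 3)       # right-hand side for S v S
    skew = wedge_block(n, m - 2) + wedge_block(n, m - 1)  # right-hand side for S ^ S
    return {
        "sym": (sym, dim_s * (dim_s + 1) // 2),
        "skew": (skew, dim_s * (dim_s - 1) // 2),
        "total": (sym + skew, dim_s ** 2),
    }

for m in range(0, 8):
    for name, (claimed, expected) in theorem2_dims(m).items():
        assert claimed == expected, (m, name, claimed, expected)
```

Matching dimensions do not prove the decomposition, but a mismatch would expose a misprint in the summation limits.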
Proof. (i) Theorem 1 associates a $\mathrm{Spin}(V)$-equivariant linear map $(\beta^k)^* : \wedge^k V \cong \wedge^k V^* \to S^* \otimes S^* \cong S \otimes S$ with any invariant bilinear form $\beta$ on $S$. In particular, if $\wedge^k V$ is irreducible and $\beta \neq 0$ then $(\beta^k)^*$ embeds $\wedge^k V$ into $S \otimes S$ as a submodule. It was proven in [AC] that a non-zero invariant bilinear form $\beta$ on $S$ always exists. This shows that $S \otimes S \supset \sum_{k=0}^{m} \wedge^k V$.
(ii) If $n = 2m+1$ then the right-hand side has dimension $\frac{1}{2} 2^n = 4^m$ and under the assumptions on $V$ we have that $\dim S = 2^m$. Hence $\dim S \otimes S = 4^m$, so the inclusion is an equality. The decompositions of $S \vee S$ and $S \wedge S$ can either be read off the tables in [OV] or they follow from Proposition 4 using the invariants of the admissible scalar-valued form, which in this case is unique up to scale [AC] (see the tables in the appendix).

Now, we consider the case when $V$ is complex of dimension $n = 2m$ or real of signature $(m, m)$. In this case, $\wedge^m V = \wedge^m_+ V \oplus \wedge^m_- V$ and the spinor module splits as a sum $S = S_+ + S_-$ of inequivalent irreducible semi-spinor modules $S_\pm$ of dimension $2^{m-1}$.

Theorem 3. Let $V$ be complex of dimension $n = 2m$ or real of signature $(m, m)$. Then the decompositions of the $\mathrm{Spin}(V)$-modules $S_+ \otimes S_-$ and $S_\pm \otimes S_\pm$ into inequivalent irreducible submodules are given by:
\[ S_+ \otimes S_- = \sum_{i=0}^{[(m-1)/2]} \wedge^{m-1-2i} V , \tag{5.3} \]
\[ S_\pm \otimes S_\pm = \wedge^m_\pm V + \sum_{i=0}^{[(m-2)/2]} \wedge^{m-2-2i} V , \tag{5.4} \]
\[ S \otimes S = S_+ \otimes S_+ + 2\, S_+ \otimes S_- + S_- \otimes S_- = \wedge V . \tag{5.5} \]
Further, for any admissible bilinear form $\beta$ on $S$, the equivariant maps $\beta^k|_{S_+ \otimes S_-}$ and $\beta^k|_{S_\pm \otimes S_\pm}$ have the following images:
\[ \beta^m(S_\pm \otimes S_\pm) = \wedge^m_\pm V , \tag{5.6} \]
\[ \beta^{m \pm (2i+2)}(S_\pm \otimes S_\pm) = \wedge^{m \pm (2i+2)} V , \quad 0 \le i \le [\tfrac{m-2}{2}] , \tag{5.7} \]
\[ \beta^{m \pm (2i+1)}(S_+ \otimes S_-) = \wedge^{m \pm (2i+1)} V , \quad 0 \le i \le [\tfrac{m-1}{2}] , \tag{5.8} \]
\[ \beta^{m \pm (2i+1)}(S_\pm \otimes S_\pm) = 0 , \quad 0 \le i \le [\tfrac{m-1}{2}] , \tag{5.9} \]
\[ \beta^{m \pm 2i}(S_+ \otimes S_-) = 0 , \quad 0 \le i \le [\tfrac{m}{2}] . \tag{5.10} \]
Proof. To prove the theorem we use the following model for the spinor module of an even dimensional complex Euclidean space $V$ or of a pseudo-Euclidean space $V$ with split signature $(m, m)$: $V = U \oplus U^*$, where $U$ is an $m$-dimensional vector space and the scalar product is defined by the natural pairing between $U$ and the dual space $U^*$. Then the spinor module is given by $S = \wedge U = \wedge^{ev} U + \wedge^{odd} U = S_+ + S_-$, where the
semi-spinor modules $S_\pm$ consist of even and odd forms. The Clifford multiplication is given by exterior and interior multiplication:
\[ u \cdot s := u \wedge s \quad \text{for } u \in U,\ s \in S , \]
\[ u^* \cdot s := \iota_{u^*} s \quad \text{for } u^* \in U^*,\ s \in S . \]
There exist exactly two independent admissible bilinear forms $f$ and $f_E = f(E\,\cdot, \cdot)$ on the spinor module, where $E|_{S_\pm} = \pm\mathrm{Id}$, and the form $f$ is given by
\[ f(\wedge^i U, \wedge^j U) = 0 \quad \text{if } i + j \neq m , \]
\[ f(s, t)\, \mathrm{vol}_U = (-1)^{i(i+1)/2}\, s \wedge t , \quad s \in \wedge^i U,\ t \in \wedge^{m-i} U , \tag{5.11} \]
where $\mathrm{vol}_U \in \wedge^m U$ is a fixed volume form of $U^*$. We note that the symmetry, type and isotropy of the admissible basis $(f, f_E)$ of $\mathrm{Bil}(S)^{\mathfrak{so}(V)}$ are given by
\[ \sigma(f) = (-1)^{m(m+1)/2} , \quad \sigma(f_E) = (-1)^{m(m-1)/2} , \quad \tau(f) = -1 , \quad \tau(f_E) = +1 , \quad \iota(f) = \iota(f_E) = (-1)^m . \]
From this and Proposition 4 it follows that
\[ \sigma(f^k) = (-1)^{(m(m+1) + k(k+1))/2} , \quad \sigma(f_E^k) = (-1)^{(m(m-1) + k(k-1))/2} , \tag{5.12} \]
\[ \iota(f^k) = \iota(f_E^k) = (-1)^{m+k} . \tag{5.13} \]
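The step from Proposition 4 to (5.12) and (5.13) is a short sign computation, which the following sketch checks mechanically. It takes the invariants of $f$ and $f_E$ quoted above as input; the function names `prop4`, `invariants_f` and `invariants_f_e` are hypothetical and introduced only for this illustration.

```python
def prop4(sigma, tau, iota, k):
    """Symmetry and isotropy of the k-form-valued form, following (4.3) and (4.4)."""
    sign = (-1) ** (k * (k - 1) // 2)
    return sigma * tau ** k * sign, iota * (-1) ** k

def invariants_f(m):
    # (sigma, tau, iota) of f in the model V = U + U*, dim U = m, as quoted in the text
    return (-1) ** (m * (m + 1) // 2), -1, (-1) ** m

def invariants_f_e(m):
    # (sigma, tau, iota) of f_E, as quoted in the text
    return (-1) ** (m * (m - 1) // 2), +1, (-1) ** m

for m in range(0, 9):
    for k in range(0, 2 * m + 1):
        sigma_f, iota_f = prop4(*invariants_f(m), k)
        sigma_fe, iota_fe = prop4(*invariants_f_e(m), k)
        assert sigma_f == (-1) ** ((m * (m + 1) + k * (k + 1)) // 2)   # first part of (5.12)
        assert sigma_fe == (-1) ** ((m * (m - 1) + k * (k - 1)) // 2)  # second part of (5.12)
        assert iota_f == iota_fe == (-1) ** (m + k)                    # (5.13)
```

The check for $f$ rests on the identity $k + k(k-1)/2 = k(k+1)/2$, which absorbs the factor $\tau(f)^k = (-1)^k$ into the exponent.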
The formulae (5.7)-(5.10) and the fact that $\beta^m(S_\pm \otimes S_\pm) \neq 0$ follow from the formulae for the isotropy of $f^k$ and $f_E^k$. To prove (5.6) we first show that for any admissible form $\beta$, the image $\beta^m(S_+ \otimes S_+)$ contains $\wedge^m_+ V$ and the image $\beta^m(S_- \otimes S_-)$ does not contain $\wedge^m_+ V$. For this we need to show that for any $a \in \wedge^m_+ V$ there exist spinors $s, t \in S_+ = \wedge^{ev} U$ such that the scalar product $\langle \beta^m(s \otimes t), a \rangle \neq 0$, and that there exists an element $a \in \wedge^m_+ V$ such that $\langle \beta^m(s \otimes t), a \rangle = 0$ for any $s, t \in S_- = \wedge^{odd} U$. Since $\wedge^m_+ V$ is an irreducible $\mathfrak{so}(V)$-module, it follows that if a single non-zero element $a$ of $\wedge^m_+ V$ is contained in the $\mathfrak{so}(V)$-module $\beta^m(S_+ \otimes S_+)$, then all of $\wedge^m_+ V$ is contained in it. Therefore, it will suffice to prove the first statement for just one choice of $a$. We shall use the following lemma.

Lemma 2. Let $V = U \oplus U^*$ as above. Then $\wedge^m U \subset \wedge^m_+ V$.

Proof. Let $(u_1, \dots, u_m)$ be a basis of $U$ and $(u_1^*, \dots, u_m^*)$ the dual basis of $U^*$. Then, up to a sign factor, the volume form is given by $\mathrm{vol} = u_1 \wedge \cdots \wedge u_m \wedge u_1^* \wedge \cdots \wedge u_m^*$. Now, using the definition of the Hodge star operator, $\langle *\alpha, \beta \rangle \mathrm{vol} = \alpha \wedge \beta$, we may immediately check that $*(u_1 \wedge \cdots \wedge u_m) = u_1 \wedge \cdots \wedge u_m$.

Let us consider $a = \mathrm{vol}_U$. By the lemma, $a \in \wedge^m_+ V$. Then for $s = t = 1 \in S_+$ we have
\[ \langle \beta^m(s \otimes t), a \rangle = \beta(a \wedge s, t) = \beta(a, 1) = \pm 1 \neq 0 . \]
Similarly, for any $s, t \in S_-$,
\[ \langle \beta^m(s \otimes t), a \rangle = \beta(a \wedge s, t) = 0 , \]
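since $\deg(a \wedge s) > m = \dim U$ and hence $a \wedge s = 0$. This proves both the above statements and hence $\beta^m(S_+ \otimes S_+) = \wedge^m_+ V$. Since the image $\beta^m(S_- \otimes S_-)$ is non-zero and does not contain $\wedge^m_+ V$, we also have $\beta^m(S_- \otimes S_-) = \wedge^m_- V$. This proves (5.6).

We now prove (5.3). By (5.8), we have the inclusion $\sum_{i=0}^{[(m-1)/2]} \wedge^{m-1-2i} V \subset S_+ \otimes S_-$. To prove equality we compare dimensions. Using the identity $\binom{2m}{m-1-2i} = \binom{2m-1}{m-1-2i} + \binom{2m-1}{m-2-2i}$, we calculate:
\[ \dim \sum_{i=0}^{[(m-1)/2]} \wedge^{m-1-2i} V = \sum_{i=0}^{[(m-1)/2]} \binom{2m}{m-1-2i} = \sum_{i=0}^{m-1} \binom{2m-1}{i} \tag{5.14} \]
\[ = \frac{1}{2} \sum_{i=0}^{2m-1} \binom{2m-1}{i} = 2^{2m-2} = \dim(S_+ \otimes S_-) , \tag{5.15} \]
since $\dim S_\pm = 2^{m-1}$. This proves (5.3).

Similarly, by (5.6) and (5.7), we have $S_\pm \otimes S_\pm \supset \wedge^m_\pm V + \sum_{i=0}^{[(m-2)/2]} \wedge^{m-2i-2} V$. To prove (5.4) we compare dimensions:
\begin{align*}
\dim \Big( \wedge^m_\pm V + \sum_{i=0}^{[(m-2)/2]} \wedge^{m-2i-2} V \Big)
&= \sum_{i=0}^{[m/2]} \binom{2m}{m-2i} - \frac{1}{2} \binom{2m}{m} \\
&= \sum_{i=0}^{[m/2]} \left[ \binom{2m-1}{m-2i} + \binom{2m-1}{m-2i-1} \right] - \frac{1}{2} \binom{2m}{m} \\
&= \sum_{i=0}^{m} \binom{2m-1}{m-i} - \frac{1}{2} \binom{2m}{m} \\
&= \frac{1}{2} \sum_{i=0}^{2m-1} \binom{2m-1}{i} = 2^{2m-2} = 2^{m-1} \cdot 2^{m-1} \\
&= \dim(S_\pm \otimes S_\pm) .
\end{align*}
This proves (5.4) and (5.5).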
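The two dimension counts used in this proof are easy to confirm numerically for small $m$. The following sketch is a check, not a proof; it assumes $\dim S_\pm = 2^{m-1}$ and $\dim \wedge^m_\pm V = \frac{1}{2}\binom{2m}{m}$ as in the text, and the function name `theorem3_dims` is hypothetical.

```python
from math import comb

def theorem3_dims(m):
    """Dimensions of the right-hand sides of (5.3) and (5.4) for n = 2m, m >= 1."""
    n = 2 * m
    # (5.3): wedge^k V for k = m-1, m-3, ... (k >= 0)
    mixed = sum(comb(n, k) for k in range(m - 1, -1, -2))
    # (5.4): half of wedge^m V plus wedge^k V for k = m-2, m-4, ... (k >= 0)
    same = comb(n, m) // 2 + sum(comb(n, k) for k in range(m - 2, -1, -2))
    return mixed, same

for m in range(1, 12):
    mixed, same = theorem3_dims(m)
    semi = 2 ** (m - 1)                       # dim S_+ = dim S_-
    assert mixed == semi * semi               # matches (5.15)
    assert same == semi * semi                # matches the count for (5.4)
    assert same + 2 * mixed + same == 2 ** (2 * m)  # (5.5): total dimension of wedge V
```

The integer division by 2 is exact because $\binom{2m}{m} = 2\binom{2m-1}{m}$ for $m \ge 1$, the same identity that cancels the middle term in the displayed computation.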
Corollary 2. (i) Let V be either complex of even dimension or real of signature (m, m) and β an admissible bilinear form on the spinor module S = S+ + S− . Then for all k the image of βk restricted to S+ ⊗ S+ , S− ⊗ S− and S+ ⊗ S− is an irreducible Spin(V )-module and the Spin(V )-module S ⊗ S is isomorphic to ∧V . (ii) Let V be either complex of odd dimension or real of signature (m, m + 1) and β an admissible bilinear form on the spinor module S. Then for all k the image βk (S ⊗ S) is irreducible and the Spin(V )-module 2S ⊗ S is isomorphic to ∧V .
Corollary 3. Let $V$ be complex of dimension $n = 2m$ or real of signature $(m, m)$. Then we have
\[ S_\pm \vee S_\pm = \wedge^m_\pm V + \sum_{i=0}^{[(m-4)/4]} \wedge^{m-4-4i} V , \tag{5.16} \]
\[ S_\pm \wedge S_\pm = \sum_{i=0}^{[(m-2)/4]} \wedge^{m-2-4i} V . \tag{5.17} \]

Proof. These decompositions follow from (5.4) and (5.12).
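A quick dimension count gives an independent check of the summation limits in (5.16) and (5.17). The sketch below assumes, as in Theorem 3, that $\dim S_\pm = 2^{m-1}$ and $\dim \wedge^m_\pm V = \frac{1}{2}\binom{2m}{m}$; the function name `corollary3_dims` is hypothetical.

```python
from math import comb

def corollary3_dims(m):
    """Dimensions of the right-hand sides of (5.16) and (5.17) for n = 2m, m >= 1."""
    n = 2 * m
    # (5.16): wedge^m_+- V plus wedge^k V for k = m-4, m-8, ... (k >= 0)
    sym = comb(n, m) // 2 + sum(comb(n, k) for k in range(m - 4, -1, -4))
    # (5.17): wedge^k V for k = m-2, m-6, ... (k >= 0)
    skew = sum(comb(n, k) for k in range(m - 2, -1, -4))
    return sym, skew

for m in range(1, 12):
    d = 2 ** (m - 1)  # dim of a semi-spinor module
    sym, skew = corollary3_dims(m)
    assert sym == d * (d + 1) // 2   # dim of the symmetric square
    assert skew == d * (d - 1) // 2  # dim of the exterior square
```

For example, $m = 4$ gives $35 + 1 = 36$ and $28$, the dimensions of the symmetric and exterior squares of an 8-dimensional module.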
6. Decomposition of the Tensor Square of the Spinor Module of Spin(V) into Irreducible Components: Real Case

In this section we describe the decompositions of $S \otimes S$, $S \vee S$ and $S \wedge S$ into inequivalent irreducible $\mathrm{Spin}(V)$-submodules, where $S$ is the spinor module of a pseudo-Euclidean vector space $V = \mathbb{R}^{p,q}$ of arbitrary signature $s = p - q$ and dimension $n = p + q$. We obtain these decompositions in two steps: First, we describe the complexification $S^{\mathbb{C}}$ of the spinor module $S$. Second, using the decomposition of the tensor square $\mathbb{S} \otimes \mathbb{S}$ of the complex spinor module $\mathbb{S}$, we decompose $S^{\mathbb{C}} \otimes S^{\mathbb{C}}$ into complex irreducible $\mathrm{Spin}(V^{\mathbb{C}})$-submodules and then we take real forms.

We recall that the complex spinor module $\mathbb{S}$ associated to the complex Euclidean space $\mathbb{V} = V^{\mathbb{C}} = V \otimes \mathbb{C} = \mathbb{C}^n$ is the restriction to $\mathrm{Spin}(\mathbb{V})$ of an irreducible representation of the complex Clifford algebra $Cl(\mathbb{V})$. Depending on the signature $s \equiv p - q \pmod 8$, the complexification $S^{\mathbb{C}}$ of the spinor module $S$ is given by either
i) $S^{\mathbb{C}} = \mathbb{S}$, where we denote by $\mathbb{S}$ the spinor module of the complex Euclidean space $\mathbb{V} = V^{\mathbb{C}} = V \otimes \mathbb{C} = \mathbb{C}^n$, or
ii) $S^{\mathbb{C}} = \mathbb{S} + \bar{\mathbb{S}}$, where $\bar{\mathbb{S}}$ is the complex conjugated module of $\mathbb{S}$.
In the latter case $S$ admits a $\mathrm{Spin}(V)$-invariant complex structure $J$ and $\mathbb{S}$ is identified with the complex space $(S, J)$ and $\bar{\mathbb{S}}$ with $(S, -J)$. In the next lemma we specify the signatures for which the cases i) or ii) occur. For this we use Table 1, in which we have collected important information about the real and complex Clifford algebras and spinor modules.

Now, we define the notion of $\mathrm{Type}_{Cl^0(V)}(\mathbb{S}, \mathbb{S}_\pm)$ used in Table 1. If $s$ is odd, then the complex spinor module $\mathbb{S}$ is irreducible (as a complex module of the real even Clifford algebra $Cl^0(V)$). In this case we define $\mathrm{Type}_{Cl^0(V)}(\mathbb{S}) := K \in \{\mathbb{R}, \mathbb{H}\}$ if the $Cl^0(V)$-module $\mathbb{S}$ is of real or quaternionic type, i.e. it admits a real or quaternionic structure commuting with $Cl^0(V)$.

For even $s$ the complex spinor module $\mathbb{S} = \mathbb{S}_+ + \mathbb{S}_-$ and $\mathbb{S}_\pm$ are irreducible complex $Cl^0(V)$-modules. We put $\mathrm{Type}_{Cl^0(V)}(\mathbb{S}, \mathbb{S}_\pm) = (lK, K')$, where $K$ and $K'$ are the types of $\mathbb{S}$ and $\mathbb{S}_\pm$, respectively; further $l = 1$ if $\mathbb{S}$ is irreducible and $l = 2$ if $\mathbb{S}_+$ and $\mathbb{S}_-$ are not equivalent as complex $Cl^0(V)$-modules. Note that if the semispinor modules are of complex type ($s = 2, 6$), then they are complex-conjugates of each other: $\bar{\mathbb{S}}_\pm \cong \mathbb{S}_\mp$. If $\mathbb{S}_\pm$ are of real ($s = 0$) or quaternionic ($s = 4$) type, then they are selfconjugate: $\bar{\mathbb{S}}_\pm \cong \mathbb{S}_\pm$.

We now explain how Table 1 has been obtained. The first two columns have been extracted from [LM] and imply the third column. Passing to the complexification of the
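Table 1. Clifford algebras $Cl_{p,q}$, their even parts $Cl^0_{p,q}$, the Schur algebra $C = \mathrm{End}_{Cl^0(V)}(S)$, the complex spinor module $\mathbb{S}$, the complex semispinor modules $\mathbb{S}_\pm$, the type of these $Cl^0(V)$-modules and physics terminology: M stands for Majorana, W for Weyl and Symp for symplectic (i.e. quaternionic) spinors; $s = p - q \bmod 8$, $n = p + q$, $N = 2^{[n/2]}$. Note that $p$ is the number of negative eigenvalues of the product of two gamma matrices, and $q$ the number of positive eigenvalues, see Appendix B.1

| $s$ | $Cl_{p,q}$ | $Cl^0_{p,q}$ | $C$ | $\mathbb{S}$ | $\mathbb{S}_\pm$ | $\mathrm{Type}_{Cl^0(V)}(\mathbb{S}, \mathbb{S}_\pm)$ | Name |
| 0 | $\mathbb{R}(N)$ | $2\mathbb{R}(N/2)$ | $2\mathbb{R}$ | $S \otimes \mathbb{C}$ | $S_\pm \otimes \mathbb{C}$ | $(2\mathbb{R}, \mathbb{R})$ | M-W |
| 1 | $\mathbb{C}(N)$ | $\mathbb{R}(N)$ | $\mathbb{R}(2)$ | $S = S_\pm \otimes \mathbb{C}$ | | $\mathbb{R}$ | M |
| 2 | $\mathbb{H}(N/2)$ | $\mathbb{C}(N/2)$ | $\mathbb{C}(2)$ | $S = S_\pm \otimes \mathbb{C}$ | $S_\pm$ | $(2\mathbb{C}, \mathbb{C})$ | M, W |
| 3 | $2\mathbb{H}(N/2)$ | $\mathbb{H}(N/2)$ | $\mathbb{H}$ | $S$ | | $\mathbb{H}$ | Symp |
| 4 | $\mathbb{H}(N/2)$ | $2\mathbb{H}(N/4)$ | $2\mathbb{H}$ | $S$ | $S_\pm$ | $(2\mathbb{H}, \mathbb{H})$ | Symp-W |
| 5 | $\mathbb{C}(N)$ | $\mathbb{H}(N/2)$ | $\mathbb{H}$ | $S$ | | $\mathbb{H}$ | Symp |
| 6 | $\mathbb{R}(N)$ | $\mathbb{C}(N/2)$ | $\mathbb{C}$ | $S \otimes \mathbb{C}$ | $S$ | $(\mathbb{R}, \mathbb{C})$ | M, W |
| 7 | $2\mathbb{R}(N)$ | $\mathbb{R}(N)$ | $\mathbb{R}$ | $S \otimes \mathbb{C}$ | | $\mathbb{R}$ | M |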
Clifford algebras we have: $Cl(V) \otimes \mathbb{C} = Cl(V \otimes \mathbb{C})$ and $Cl^0(V) \otimes \mathbb{C} = Cl^0(V \otimes \mathbb{C})$. From this we can describe the complex spinor module $\mathbb{S}$ and semispinor modules $\mathbb{S}_\pm$ and determine the relation between $\mathbb{S}$, $\mathbb{S}_\pm$ and $S$, $S_\pm$. This gives the fourth, fifth and sixth columns of the table. Using this table we prove the following lemma, which describes the complex $\mathrm{Spin}(\mathbb{V})$-module $S^{\mathbb{C}}$:

Lemma 3.
\[ S^{\mathbb{C}} = \begin{cases} \mathbb{S} + \bar{\mathbb{S}} & \text{if } s = p - q \equiv 1, 2, 3, 4, 5 \pmod 8 , \\ \mathbb{S} & \text{if } s \equiv 6, 7, 8 \pmod 8 . \end{cases} \]

Proof. According to Table 1, if $s \equiv 6, 7, 8 \pmod 8$ we have $\mathbb{S} = S \otimes \mathbb{C}$. In all other cases there exists a $\mathrm{Spin}(V)$-invariant complex structure $J$ and the complex space $(S, J)$ is identified with $\mathbb{S}$. Then $S^{\mathbb{C}} = \mathbb{S} + \bar{\mathbb{S}}$.

Remark 1. We note that a $\mathrm{Spin}(V)$-invariant real or quaternionic structure $\varphi$ on $\mathbb{S}$ (i.e. an antilinear map with $\varphi^2 = +1$ or $-1$, respectively) defines an isomorphism $\varphi : \mathbb{S} \to \bar{\mathbb{S}}$. From Table 1 it follows that if $s \equiv 1, 2 \pmod 8$, then there exists a $\mathrm{Spin}(V)$-invariant real structure and if $s \equiv 3, 4, 5 \pmod 8$, then there exists a $\mathrm{Spin}(V)$-invariant quaternionic structure on $\mathbb{S}$.

Now, using the results of the previous section for the complex case, we decompose $S^{\mathbb{C}} \otimes S^{\mathbb{C}}$ into complex irreducible $\mathrm{Spin}(\mathbb{V})$-submodules. If $S^{\mathbb{C}} \otimes S^{\mathbb{C}} = \sum \mathbb{W}_i$ and all submodules $\mathbb{W}_i$ are of real type (i.e. complexifications of irreducible real $\mathrm{Spin}(V)$-submodules $W_i$), then $S \otimes S = \sum W_i$ is the desired decomposition. In odd dimensions all modules $\mathbb{W}_i = \wedge^i \mathbb{V}$ are of real type. This is also the case in even dimensions $n = 2m$, with one exception: the modules $\wedge^m_\pm \mathbb{V} \subset S^{\mathbb{C}} \otimes S^{\mathbb{C}}$ are not of real type if $*^2 = -1$, i.e. if $s/2$ is odd. Then, $\wedge^m \mathbb{V} = \wedge^m_+ \mathbb{V} + \wedge^m_- \mathbb{V}$ is the complexification of the irreducible $\mathrm{Spin}(V)$-module $\wedge^m V$, which has the $\mathrm{Spin}(V)$-invariant complex structure $*$. The decompositions of $S \vee S$ and $S \wedge S$ can be obtained using the same method. Using this approach, we describe in detail all these decompositions for any pseudo-Euclidean vector space $V = \mathbb{R}^{p,q}$ in the next three subsections.
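6.1. Odd dimensional case: dim V = 2m + 1. We now describe the decomposition of $S \otimes S = S \vee S + S \wedge S$ for all signatures $s = 1, 3, 5, 7 \pmod 8$ in the odd dimensional case.

Theorem 4. Let $V = \mathbb{R}^{p,q}$ be a pseudo-Euclidean vector space of dimension $n = p + q = 2m+1$. Then the decompositions of the $\mathrm{Spin}(V)$-modules $S \otimes S$, $S \vee S$ and $S \wedge S$ into inequivalent irreducible submodules are given by the following:
If the signature $s = p - q \equiv 1, 3, 5 \pmod 8$, we have
\[ S \otimes S = 2(\wedge V) = 4 \sum_{i=0}^{m} \wedge^i V , \tag{6.1} \]
\[ S \vee S = 3 \sum_{i=0}^{[m/4]} \wedge^{m-4i} V + 3 \sum_{i=0}^{[(m-3)/4]} \wedge^{m-3-4i} V + \sum_{i=0}^{[(m-2)/4]} \wedge^{m-2-4i} V + \sum_{i=0}^{[(m-1)/4]} \wedge^{m-1-4i} V , \tag{6.2} \]
\[ S \wedge S = 3 \sum_{i=0}^{[(m-2)/4]} \wedge^{m-2-4i} V + 3 \sum_{i=0}^{[(m-1)/4]} \wedge^{m-1-4i} V + \sum_{i=0}^{[m/4]} \wedge^{m-4i} V + \sum_{i=0}^{[(m-3)/4]} \wedge^{m-3-4i} V . \tag{6.3} \]
If the signature $s \equiv 7 \pmod 8$, we have
\[ S \otimes S = \sum_{i=0}^{m} \wedge^i V , \tag{6.4} \]
\[ S \vee S = \sum_{i=0}^{[m/4]} \wedge^{m-4i} V + \sum_{i=0}^{[(m-3)/4]} \wedge^{m-3-4i} V , \tag{6.5} \]
\[ S \wedge S = \sum_{i=0}^{[(m-2)/4]} \wedge^{m-2-4i} V + \sum_{i=0}^{[(m-1)/4]} \wedge^{m-1-4i} V . \tag{6.6} \]
Moreover, $S$ is an irreducible $\mathrm{Spin}(V)$-module for $s = 3, 5, 7$ and for $s = 1$, $S = S_+ + S_-$ is the sum of two equivalent semi-spinor modules.

Proof. The signature $s \equiv 7 \pmod 8$ corresponds to $\mathbb{R}^{m,m+1}$, which was already discussed in Theorem 2. For $s \equiv 1, 3, 5 \pmod 8$, the spinor module $S$ has an invariant complex structure $J$ and $(S, J)$ is identified with the complex spinor module $\mathbb{S}$. We denote by $\bar{\mathbb{S}} = (S, -J)$ the module conjugate to $\mathbb{S}$. According to Lemma 3 and Remark 1, $\mathbb{S}$ and $\bar{\mathbb{S}}$ are equivalent as complex modules of the real spin group $\mathrm{Spin}(V)$ and $S^{\mathbb{C}} = \mathbb{S} + \bar{\mathbb{S}} = 2\mathbb{S}$. Hence,
\[ (S \otimes_{\mathbb{R}} S)^{\mathbb{C}} = S^{\mathbb{C}} \otimes_{\mathbb{C}} S^{\mathbb{C}} = 4\, \mathbb{S} \otimes_{\mathbb{C}} \mathbb{S} , \]
\[ (\vee^2 S)^{\mathbb{C}} = \vee^2(\mathbb{S} + \bar{\mathbb{S}}) = \vee^2(2\mathbb{S}) = 3 \vee^2 \mathbb{S} + \wedge^2 \mathbb{S} , \]
\[ (\wedge^2 S)^{\mathbb{C}} = 3 \wedge^2 \mathbb{S} + \vee^2 \mathbb{S} . \]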
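This implies the theorem. For example,
\[ (S \otimes_{\mathbb{R}} S)^{\mathbb{C}} = 4\, \mathbb{S} \otimes_{\mathbb{C}} \mathbb{S} = 4 \sum_{i=0}^{m} \wedge^i \mathbb{V} = 4 \sum_{i=0}^{m} (\wedge^i V)^{\mathbb{C}} , \]
by virtue of Theorem 2 and the real part gives (6.1).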
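The multiplicities 3 and 1 in (6.2) and (6.3) can be checked by a dimension count. The sketch below assumes that for $s \equiv 1, 3, 5 \pmod 8$ the real spinor module has real dimension $2^{m+1}$, which is what $S^{\mathbb{C}} = 2\mathbb{S}$ with $\dim_{\mathbb{C}} \mathbb{S} = 2^m$ implies; the helper names are hypothetical.

```python
from math import comb

def wedge_block(n, start):
    # Sum of dim(wedge^k V) over k = start, start - 4, ... with k >= 0 (empty if start < 0).
    return sum(comb(n, k) for k in range(start, -1, -4))

def theorem4_dims(m):
    """Dimensions of the right-hand sides of (6.2) and (6.3) for n = 2m + 1."""
    n = 2 * m + 1
    a = wedge_block(n, m) + wedge_block(n, m - 3)       # content of the complex symmetric square
    b = wedge_block(n, m - 2) + wedge_block(n, m - 1)   # content of the complex exterior square
    return 3 * a + b, 3 * b + a

for m in range(0, 10):
    d = 2 ** (m + 1)  # assumed real dimension of S for s = 1, 3, 5 (mod 8)
    sym, skew = theorem4_dims(m)
    assert sym == d * (d + 1) // 2
    assert skew == d * (d - 1) // 2
    assert sym + skew == 4 * 2 ** (2 * m)  # (6.1): 4 * sum_{i<=m} C(2m+1, i)
```

With $D = 2^m$, the count reduces to $3 \cdot \frac{D(D+1)}{2} + \frac{D(D-1)}{2} = D(2D+1)$, which is the dimension of the symmetric square of a module of dimension $2D$.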
6.2. Even dimensional case: dim V = 2m. In this subsection, we describe the decomposition of $S \otimes S = S \vee S + S \wedge S$ for all signatures $s = 0, 2, 4, 6 \pmod 8$ in the even dimensional case.

Theorem 5. Let $V = \mathbb{R}^{p,q}$ be a pseudo-Euclidean vector space of dimension $n = p + q = 2m$. Then the decomposition of the $\mathrm{Spin}(V)$-module $S \otimes S$ into inequivalent irreducible submodules is given by the following:
If the signature $s = p - q \equiv 2, 4 \pmod 8$, we have
\[ S \otimes S = 4(\wedge V) = \begin{cases} 8 \sum_{i=0}^{m-1} \wedge^i V + 4 \wedge^m V & \text{if } s \equiv 2 \pmod 8 , \\ 8 \sum_{i=0}^{m-1} \wedge^i V + 4 \wedge^m_+ V + 4 \wedge^m_- V & \text{if } s \equiv 4 \pmod 8 . \end{cases} \tag{6.7} \]
If the signature $s \equiv 0, 6 \pmod 8$, we have
\[ S \otimes S = \wedge V = \begin{cases} 2 \sum_{i=0}^{m-1} \wedge^i V + \wedge^m_+ V + \wedge^m_- V & \text{if } s \equiv 0 \pmod 8 , \\ 2 \sum_{i=0}^{m-1} \wedge^i V + \wedge^m V & \text{if } s \equiv 6 \pmod 8 . \end{cases} \tag{6.8} \]

Proof. Similarly to the odd dimensional case, we have
\[ (S \otimes_{\mathbb{R}} S)^{\mathbb{C}} = \mathbb{S} \otimes_{\mathbb{C}} \mathbb{S} = \wedge \mathbb{V} = 2 \sum_{i=0}^{m-1} \wedge^i \mathbb{V} + \wedge^m_+ \mathbb{V} + \wedge^m_- \mathbb{V} . \tag{6.9} \]
Now we note that $(\wedge^m V)^{\mathbb{C}} = \wedge^m \mathbb{V}$ and $*^2|_{\wedge^m V} = (-1)^{s/2}$. If $s/2$ is even, then $\wedge^m V = \wedge^m_+ V + \wedge^m_- V$, where $\wedge^m_\pm V$ are irreducible submodules, which are the $\pm 1$-eigenspaces of $*$. Then, $(\wedge^m_\pm V)^{\mathbb{C}} = \wedge^m_\pm \mathbb{V}$. In this case, the real part of (6.9) gives the first part of (6.8). If $s/2$ is odd, then $*$ is a complex structure on $\wedge^m V$, which is irreducible since it has a complex structure and its complexification $\wedge^m \mathbb{V}$ has only two irreducible components $\wedge^m_\pm \mathbb{V}$. In this case the real part of (6.9) gives the second part of (6.8).

The decompositions of $S \vee S$ and $S \wedge S$ for the cases when semispinor modules exist, in particular for $s = 0, 2, 4 \pmod 8$, will be given in the next subsection. Therefore it is now sufficient to determine these decompositions for $s = 6 \pmod 8$.
Corollary 4. Let $S$ be the spinor module of a pseudo-Euclidean vector space $V$ of signature $s \equiv 6 \pmod 8$ and dimension $n = 2m$. Then we have
\[ S \vee S = \wedge^m V + 2 \sum_{i=0}^{[(m-4)/4]} \wedge^{m-4-4i} V + \sum_{i=0}^{[(m-1)/2]} \wedge^{m-1-2i} V , \tag{6.10} \]
\[ S \wedge S = 2 \sum_{i=0}^{[(m-2)/4]} \wedge^{m-2-4i} V + \sum_{i=0}^{[(m-1)/2]} \wedge^{m-1-2i} V . \tag{6.11} \]
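Proof. This follows by complexification of Eq. (6.8), using Lemma 3, Eq. (5.3) and Corollary 3.

6.3. Decomposition of tensor square of semi-spinors. According to Table 1, semi-spinor modules $S_\pm$ exist if the signature $s \equiv 0, 1, 2, 4 \pmod 8$. More precisely, we list below whether $S_\pm$ are equivalent $\mathrm{Spin}(V)$-modules and we give $S_\pm^{\mathbb{C}}$.

| $s$ | $S_\pm$ | $S_\pm^{\mathbb{C}}$ |
| 0 | inequivalent | $S_\pm \otimes \mathbb{C} = \mathbb{S}_\pm$ |
| 1 | equivalent | $S_\pm \otimes \mathbb{C} = \mathbb{S}$ |
| 2 | equivalent | $S_\pm \otimes \mathbb{C} = \mathbb{S}$ |
| 4 | inequivalent | $\mathbb{S}_\pm + \bar{\mathbb{S}}_\pm = 2\mathbb{S}_\pm$ |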
For $s \equiv 1, 2, 4 \pmod 8$ we have $\mathbb{S} = S$, whereas for $s \equiv 0 \pmod 8$ we have $\mathbb{S} = S^{\mathbb{C}}$. We also note that for $s \equiv 2 \pmod 8$, although the $\mathrm{Spin}(V)$-modules $S_\pm$ are equivalent, the $\mathrm{Spin}(\mathbb{V})$-modules $\mathbb{S}_+$ and $\mathbb{S}_- = \bar{\mathbb{S}}_+$ are not equivalent. For $s = 4$, $\mathbb{S}_\pm = S_\pm$ admits a $\mathrm{Spin}(V)$-invariant quaternionic structure. Using the above description for $S_\pm^{\mathbb{C}}$ and the decompositions of the tensor squares of complex spinor and semi-spinor modules, we obtain the following:

Theorem 6. Let $S = S_+ + S_-$ be the spinor module of a pseudo-Euclidean vector space $V$ of signature $s \equiv 0, 1, 2, 4 \pmod 8$ and dimension $n = 2m$ or $n = 2m+1$. Then we have the following decomposition of the $\mathrm{Spin}(V)$-modules $S_\pm \otimes S_\pm$ and $S_+ \otimes S_-$ into inequivalent irreducible submodules:
For $s \equiv 0 \pmod 8$:
\[ S_+ \otimes S_- = \sum_{i=0}^{[(m-1)/2]} \wedge^{m-1-2i} V , \]
\[ S_\pm \otimes S_\pm = \wedge^m_\pm V + \sum_{i=0}^{[(m-2)/2]} \wedge^{m-2-2i} V , \]
\[ S_\pm \vee S_\pm = \wedge^m_\pm V + \sum_{i=0}^{[(m-4)/4]} \wedge^{m-4-4i} V , \]
\[ S_\pm \wedge S_\pm = \sum_{i=0}^{[(m-2)/4]} \wedge^{m-2-4i} V . \]
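For $s \equiv 1 \pmod 8$:
\[ S_\pm \otimes S_\pm = S_+ \otimes S_- = \sum_{i=0}^{m} \wedge^i V , \]
\[ S_\pm \vee S_\pm = \sum_{i=0}^{[m/4]} \wedge^{m-4i} V + \sum_{i=0}^{[(m-3)/4]} \wedge^{m-3-4i} V , \]
\[ S_\pm \wedge S_\pm = \sum_{i=0}^{[(m-2)/4]} \wedge^{m-2-4i} V + \sum_{i=0}^{[(m-1)/4]} \wedge^{m-1-4i} V . \]
For $s \equiv 2 \pmod 8$:
\[ S_\pm \otimes S_\pm = S_+ \otimes S_- = \wedge^m V + 2 \sum_{i=0}^{m-1} \wedge^i V , \]
\[ S_\pm \vee S_\pm = \wedge^m V + 2 \sum_{i=0}^{[(m-4)/4]} \wedge^{m-4-4i} V + \sum_{i=0}^{[(m-1)/2]} \wedge^{m-1-2i} V , \]
\[ S_\pm \wedge S_\pm = 2 \sum_{i=0}^{[(m-2)/4]} \wedge^{m-2-4i} V + \sum_{i=0}^{[(m-1)/2]} \wedge^{m-1-2i} V . \]
For $s \equiv 4 \pmod 8$:
\[ S_+ \otimes S_- = 4 \sum_{i=0}^{[(m-1)/2]} \wedge^{m-1-2i} V , \]
\[ S_\pm \otimes S_\pm = 4 \wedge^m_\pm V + 4 \sum_{i=0}^{[(m-2)/2]} \wedge^{m-2-2i} V , \]
\[ S_\pm \vee S_\pm = 3 \wedge^m_\pm V + 3 \sum_{i=0}^{[(m-4)/4]} \wedge^{m-4-4i} V + \sum_{i=0}^{[(m-2)/4]} \wedge^{m-2-4i} V , \]
\[ S_\pm \wedge S_\pm = 3 \sum_{i=0}^{[(m-2)/4]} \wedge^{m-2-4i} V + \wedge^m_\pm V + \sum_{i=0}^{[(m-4)/4]} \wedge^{m-4-4i} V . \]

Proof. The case $s = 0 \pmod 8$ follows from the complex case (see Theorem 3 and Corollary 3). For $s = 1, 2 \pmod 8$ the modules $S_+$ and $S_-$ are isomorphic. Hence we have $S = S_+ + S_- = 2S_+$ and $S \otimes S = 4\, S_+ \otimes S_+$. Since $S \otimes S = 4 \sum_{i=0}^{m} \wedge^i V$,
\[ S_+ \otimes S_+ = S_- \otimes S_- = S_+ \otimes S_- = \sum_{i=0}^{m} \wedge^i V . \]
The splitting into symmetric and skew parts of these tensor products follows that in the complex cases, see Theorem 2 and Corollary 3. For $s = 4 \pmod 8$ the semi-spinor modules $S_\pm$ are not equivalent but $S_\pm^{\mathbb{C}} = \mathbb{S}_\pm + \bar{\mathbb{S}}_\pm = 2\mathbb{S}_\pm$. The result follows from the decomposition in the complex case (Theorem 3 and Corollary 3) on taking the real parts.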
7. N-Extended Polyvector Poincaré Algebras

In the previous sections we have classified $\epsilon$-transalgebras of the form $\mathfrak{g} = \mathfrak{g}_0 + \mathfrak{g}_1$, $\mathfrak{g}_0 = \mathfrak{so}(V) + W_0$, with $\mathfrak{g}_1 = W_1 = S$ the spinor module. As in [AC] we can easily extend this classification to the case where $W_1$ is a general spinorial module, i.e. $W_1 = NS = S \otimes \mathbb{R}^N$, if $S$ is irreducible, or $W_1 = N_+ S_+ \oplus N_- S_- = S_+ \otimes \mathbb{R}^{N_+} \oplus S_- \otimes \mathbb{R}^{N_-}$ if semi-spinors exist. An $\epsilon$-extension of translational type of the above form is called an $N$-extended polyvector Poincaré algebra if $W_1 = NS$ and an $(N_+, N_-)$-extended polyvector Poincaré algebra if $W_1 = N_+ S_+ \oplus N_- S_-$.

Consider first the case $W_1 = NS$. As before, the classification reduces to the decomposition of $\vee^2 W_1$ and $\wedge^2 W_1$ into irreducible submodules. These decompositions follow from the decompositions of $\vee^2 S$ and $\wedge^2 S$ obtained in the previous sections, together with the decompositions
\[ \vee^2 W_1 = \vee^2 S \otimes \vee^2 \mathbb{R}^N \oplus \wedge^2 S \otimes \wedge^2 \mathbb{R}^N , \]
\[ \wedge^2 W_1 = \wedge^2 S \otimes \vee^2 \mathbb{R}^N \oplus \vee^2 S \otimes \wedge^2 \mathbb{R}^N . \]
In particular, this implies that the multiplicities $\mu_+(k, N)$ and $\mu_-(k, N)$ of the module $\wedge^k V$ in $\vee^2 W_1$ and $\wedge^2 W_1$, respectively, are given by
\[ \mu_+(k, N) = \mu_+(k) \frac{N(N+1)}{2} + \mu_-(k) \frac{N(N-1)}{2} , \]
\[ \mu_-(k, N) = \mu_-(k) \frac{N(N+1)}{2} + \mu_+(k) \frac{N(N-1)}{2} , \]
where $\mu_+(k)$ and $\mu_-(k)$ are the multiplicities of $\wedge^k V$ in $\vee^2 S$ and $\wedge^2 S$, respectively. The vector space of $N$-extended polyvector Poincaré $\epsilon$-algebra structures with $W_0 = \wedge^k V$ is identified with the space $\mathrm{Bil}^k_\epsilon(W_1)^{\mathfrak{so}(V)}$ of invariant $\wedge^k V$-valued bilinear forms on $W_1$ that are symmetric ($\epsilon = +$) or skewsymmetric ($\epsilon = -$). Its dimension is given by
\[ \dim \mathrm{Bil}^k_\epsilon(W_1)^{\mathfrak{so}(V)} = \mu_\epsilon(k, N) \dim C(\wedge^k V) , \]
where the Schur algebra $C(\wedge^k V) = \mathbb{R}$, $\mathbb{R} \oplus \mathbb{R}$ or $\mathbb{C}$, see the proof of Theorem 1. Any element of $\mathrm{Bil}^k_+(W_1)^{\mathfrak{so}(V)}$ can be represented as
\[ \sum_i (\beta_+^i)^k \otimes b_+^i + \sum_j (\beta_-^j)^k \otimes b_-^j , \]
and any element of $\mathrm{Bil}^k_-(W_1)^{\mathfrak{so}(V)}$ as
\[ \sum_i (\beta_-^i)^k \otimes b_+^i + \sum_j (\beta_+^j)^k \otimes b_-^j , \]
where $(\beta_\pm^i)^k \in \mathrm{Bil}^k_\pm(S)^{\mathfrak{so}(V)}$ and $b_+^i$ and $b_-^i$ are, respectively, symmetric and skewsymmetric bilinear forms on $\mathbb{R}^N$.

We note also that there exists a unique minimal (i.e. $W_0 = [W_1, W_1]$) $N$-extended polyvector Poincaré $\epsilon$-algebra with $W_0 = \mu_\epsilon(k, N) \wedge^k V$. The Lie (super)bracket is given, up to a twist by an invertible element of the Schur algebra of $W_0$, by the projection onto the corresponding maximal isotypical submodule of $\vee^2 W_1$ or $\wedge^2 W_1$, respectively.

Similarly, in the case when the spinor module $S = S_+ \oplus S_-$ is reducible, we can reduce the description of all $(N_+, N_-)$-extended polyvector Poincaré $\epsilon$-algebras $\mathfrak{g} = \mathfrak{so}(V) + \wedge^k V + W_1$ such that $W_1 = N_+ S_+ + N_- S_-$ to the chiral cases $(N_+, N_-) = (1, 0)$ or $(0, 1)$ and to the isotropic case: $(N_+, N_-) = (1, 1)$ and $[S_+, S_+] = [S_-, S_-] = 0$.
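The multiplicity formulae for $W_1 = NS$ amount to simple bookkeeping: symmetric tensors on $S \otimes \mathbb{R}^N$ pair symmetric with symmetric and skew with skew. The sketch below is a minimal illustration under that reading; the function name `extended_multiplicities` is hypothetical, and its inputs are the multiplicities of $\wedge^k V$ in $\vee^2 S$ and $\wedge^2 S$.

```python
def extended_multiplicities(mu_plus, mu_minus, n_ext):
    """Multiplicities of wedge^k V in the symmetric and exterior squares of W_1 = N S.

    mu_plus, mu_minus: multiplicities of wedge^k V in the symmetric and exterior square of S.
    n_ext: the number N of copies of S.
    """
    sym_n = n_ext * (n_ext + 1) // 2   # dim of the symmetric square of R^N
    skew_n = n_ext * (n_ext - 1) // 2  # dim of the exterior square of R^N
    mu_plus_n = mu_plus * sym_n + mu_minus * skew_n
    mu_minus_n = mu_minus * sym_n + mu_plus * skew_n
    return mu_plus_n, mu_minus_n

# N = 1 reproduces the unextended multiplicities.
assert extended_multiplicities(3, 1, 1) == (3, 1)

# The total multiplicity scales with N^2, as expected for W_1 (x) W_1 = N^2 (S (x) S).
for n_ext in range(1, 6):
    plus, minus = extended_multiplicities(3, 1, n_ext)
    assert plus + minus == (3 + 1) * n_ext ** 2
```

For instance, a module $\wedge^k V$ that occurs once in $\vee^2 S$ and not at all in $\wedge^2 S$ occurs three times in $\vee^2 W_1$ and once in $\wedge^2 W_1$ when $N = 2$.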
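Table A.1. The numbers $(N^+(s, n), N^-(s, n))$ of independent symmetric and skewsymmetric bilinear forms on $S$

| $s \backslash n$ | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | $N(s)$ | $\mu(k) = \mu(0)$ | $\mu(m)$ |
| 0 | 2,0 | | 1,1 | | 0,2 | | 1,1 | | 2 | 2 | 1* |
| 1 | | 3,1 | | 1,3 | | 1,3 | | 3,1 | 4 | 4 | |
| 2 | 6,2 | | 4,4 | | 2,6 | | 4,4 | | 8 | 8 | 4 |
| 3 | | 3,1 | | 1,3 | | 1,3 | | 3,1 | 4 | 4 | |
| 4 | 6,2 | | 4,4 | | 2,6 | | 4,4 | | 8 | 8 | 4* |
| 5 | | 3,1 | | 1,3 | | 1,3 | | 3,1 | 4 | 4 | |
| 6 | 2,0 | | 1,1 | | 0,2 | | 1,1 | | 2 | 2 | 1 |
| 7 | | 1,0 | | 0,1 | | 0,1 | | 1,0 | 1 | 1 | |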
Let $\beta$ be an admissible bilinear form on $S = S_+ \oplus S_-$ and $\beta^k \in \mathrm{Bil}^k(S)^{\mathfrak{so}(V)}$ the corresponding admissible $\wedge^k V$-valued bilinear form. Its restriction to $S_+$ (or $S_-$) defines a $(1, 0)$-extended (respectively, $(0, 1)$-extended) $k$-polyvector Poincaré $\epsilon$-algebra if and only if $\iota(\beta^k) = +1$. If $\iota(\beta^k) = -1$ then we obtain an isotropic $(1, 1)$-extended $k$-polyvector Poincaré $\epsilon$-algebra, i.e. $[S_+, S_+] = [S_-, S_-] = 0$. The values of the invariants $\sigma(\beta^k) = \epsilon$ and $\iota(\beta^k)$ can be read off Tables A.3–A.6 in Appendix A.

Acknowledgements. This work was generously supported by the ‘Schwerpunktprogramm Stringtheorie’ of the Deutsche Forschungsgemeinschaft. Partial support by the European Commission’s 5th Framework Human Potential Programmes under contracts HPRN-CT-2000-00131 (Quantum Spacetime) and HPRN-CT-2002-00325 (EUCLID) is also acknowledged. A.V.P. is supported in part by the Belgian Federal Office for Scientific, Technical and Cultural Affairs through the Inter-University Attraction Pole P5/27. V.C., C.D. and A.V.P. are grateful to the University of Hull for hospitality during a visit in which the basis of this work was discussed.
A. Admissible $\wedge^k V$-Valued Bilinear Forms on S

In Table A.1 we give the numbers $(N^+(s, n), N^-(s, n))$ of independent symmetric (+) and skewsymmetric (−) invariant bilinear forms on $S$, i.e. $N^\pm(s, n) = \dim \mathrm{Bil}_\pm(S)^{\mathfrak{so}(V)}$. They were computed in [AC] and are periodic with period 8 in the signature $s = p - q$ and in the dimension $n = p + q$ of $V = \mathbb{R}^{p,q}$. The entry $N(s) = N^+(s, n) + N^-(s, n)$ is the total number of invariant bilinear forms and $\mu(k)$ is the multiplicity of the irreducible submodule $\wedge^k V$ in $S \otimes S$. For $k \neq n/2$, it does not depend on $k$, i.e. $\mu(k) = \mu(0)$. In the case $n = 2m$ the multiplicities $\mu(m)$ with a $*$ indicate that the module $\wedge^m V$ is reducible as $\wedge^m_+ V + \wedge^m_- V$. From Table A.1, we note the following symmetries:
a) Modulo 8-symmetry:
\[ N^\pm(s + 8a, n + 8b) = N^\pm(s, n) , \quad a, b \in \mathbb{Z} . \]
b) Reflection with respect to the horizontal line $s = 3$,
\[ N^\pm(-s + 6, n) = N^\pm(s, n) , \]
and reflection with respect to the vertical line $n = 0$,
\[ N^\pm(s, n) = N^\pm(s, -n) . \]
c) The reflection with respect to the vertical line $n = 2$ is a mirror symmetry, i.e. it interchanges $N^+$ and $N^-$:
\[ N^\pm(s, n) = N^\mp(s, -n + 4) . \]
It is the same as the mirror symmetry with respect to $n = 6$. We note that the composition of reflections in $n = 0$ and $n = 2$ gives the translation $n \to n + 4$:
\[ N^\pm(s, n) = N^\mp(s, n + 4) . \]
More generally, we consider the dimension $\tilde{N}_k^\pm(p, q) := N_k^\pm(s, n) := \dim \mathrm{Bil}^k_\pm(S)^{\mathfrak{so}(V)}$ of the vector space of invariant $\wedge^k V$-valued bilinear forms on $S$. By Theorem 1, the sum $N_k(s, n) = N_k^+(s, n) + N_k^-(s, n) = N(s, n)$ does not depend on $k$. From Table A.1, we see that it depends only on the signature $s$. As a corollary of Proposition 4, the numbers $\tilde{N}_k^\pm(p, q)$ are periodic modulo 4 in $k$:
\[ \tilde{N}_k^\pm(p, q) = \tilde{N}_{k+4}^\pm(p, q) . \]
Moreover, we have the following periodicities in $(p, q)$:
\[ \tilde{N}_k^\pm(p, q) = \tilde{N}_k^\pm(p + 8, q) = \tilde{N}_k^\pm(p, q + 8) = \tilde{N}_k^\pm(p + 4, q + 4) . \]
In fact, it was proven in [AC] that, for any given symmetry $\sigma_0$, type $\tau_0$ and isotropy $\iota_0$ (if defined), the number of bilinear forms $\beta$ with $\sigma(\beta) = \sigma_0$, $\tau(\beta) = \tau_0$ and $\iota(\beta) = \iota_0$ in a basis $(\beta_i)$ of $\mathrm{Bil}(S)^{\mathfrak{so}(V)}$ consisting of admissible forms is $(8, 0)$-, $(0, 8)$- and $(4, 4)$-periodic in $(p, q)$. By Proposition 4, this implies that for any given symmetry $\sigma_0$ and isotropy $\iota_0$ (if defined), the number of $\wedge^k V$-valued bilinear forms $\beta_i^k$ with $\sigma(\beta_i^k) = \sigma_0$ and $\iota(\beta_i^k) = \iota_0$ is $(8, 0)$-, $(0, 8)$- and $(4, 4)$-periodic in $(p, q)$. Finally, we have the following shift formula:
\[ N_k^\pm(s, n + 2k) = N_0^\pm(s, n) := N^\pm(s, n) , \]
which we can write also as
\[ \tilde{N}_k^\pm(p + k, q + k) = \tilde{N}_0^\pm(p, q) . \]
This shift formula follows from the tables below. In Table A.3 we describe a basis of $\mathrm{Bil}(S)^{\mathfrak{so}(V)}$, which consists of admissible forms and indicate the values of the three invariants $\sigma$, $\tau$ and $\iota$. In the three following tables we give the invariants $\sigma$ and $\iota$ for the corresponding bases of $\mathrm{Bil}^k(S)^{\mathfrak{so}(V)}$, $k = 1, 2, 3$ modulo 4, denoted for simplicity by the same symbols. Due to the above periodicity properties, we can calculate, from these tables, the values of the invariants for the corresponding bases of $\mathrm{Bil}^k(S)^{\mathfrak{so}(V)}$ for all $k \in \mathbb{N}$ and $V = \mathbb{R}^{p,q}$.

For any $V = \mathbb{R}^{p,q}$ an explicit basis of $\mathrm{Bil}^k(S)^{\mathfrak{so}(V)}$ consisting of admissible forms was constructed for $k = 0$ and $k = 1$, in terms of appropriate models of the spinor module in [AC]. It was proven there that any admissible $V$-valued bilinear form on $S$ is of the form $\beta^1$, where $\beta$ is a linear combination of admissible scalar-valued forms. By Theorem 1 and Proposition 4, this result extends to $k > 1$, namely any admissible $\wedge^k V$-valued
bilinear form is of the form $\beta^k$, where $\beta$ is a linear combination of admissible scalar-valued forms. This provides a basis of the vector space $\mathrm{Bil}^k(S)^{\mathfrak{so}(V)} \cong \mathrm{Bil}(S)^{\mathfrak{so}(V)}$ consisting of admissible forms. The dimension of this space is equal to the dimension of the Schur algebra $C(S)$, which depends only on the signature $s = p - q$ modulo 8, see [AC].

In the tables below we use the notation of [AC]. We use the fact that any pseudo-Euclidean vector space can be written as $V = V_1 \oplus V_2$, where $V_1 = \mathbb{R}^{r,r}$ and $V_2 = \mathbb{R}^{0,k}$ or $V_2 = \mathbb{R}^{k,0}$. Then [LM]
\[ Cl(V) \cong Cl(V_1) \,\hat{\otimes}\, Cl(V_2) , \]
where $\hat{\otimes}$ denotes the $\mathbb{Z}/2\mathbb{Z}$-graded tensor product. Let $S_1$ be the spinor module of $\mathrm{Spin}(V_1)$ and $S_2$ the spinor module of $\mathrm{Spin}(V_2)$. Then we always have that $S_1 = S_1^+ + S_1^-$ is a $\mathbb{Z}/2\mathbb{Z}$-graded module of the $\mathbb{Z}/2\mathbb{Z}$-graded algebra $Cl(V_1)$. The spinor module $S$ of $\mathrm{Spin}(V)$ can be described in terms of $S_1$ and $S_2$ as follows, see Proposition 2.3 of [AC].

Consider first the case when $S_2 = S_2^+ + S_2^-$ is reducible. In this case $S_2$ is a $\mathbb{Z}/2\mathbb{Z}$-graded $Cl(V_2)$-module and the spinor module $S = S_1 \hat{\otimes} S_2$ of $\mathrm{Spin}(V)$ is obtained as the $\mathbb{Z}/2\mathbb{Z}$-graded tensor product of the modules $S_1$ and $S_2$. It is again $\mathbb{Z}/2\mathbb{Z}$-graded with even part $S_+ = S_1^+ \otimes S_2^+ + S_1^- \otimes S_2^-$ and odd part $S_- = S_1^+ \otimes S_2^- + S_1^- \otimes S_2^+$.

If $S_2$ is an irreducible $\mathrm{Spin}(V_2)$-module, then the spinor module of $\mathrm{Spin}(V)$ is given by
\[ S = S_1 \hat{\otimes} S_2 = S_1^+ \otimes S_2 + S_1^- \otimes S_2 \]
with the action
\[ (a \,\hat{\otimes}\, b) \cdot (s_1^\pm \otimes s_2) = (-1)^{\deg(s_1^\pm) \deg(b)}\, a s_1^\pm \otimes b s_2 , \]
where $a \in Cl(V_1)$, $b \in Cl(V_2)$, $s_1^\pm \in S_1^\pm$, $s_2 \in S_2$, $\deg(s_1^+) = 0$ and $\deg(s_1^-) = 1$. In this case $S$ is an irreducible $\mathrm{Spin}(V)$-module.

As discussed in Sect. 5, for the case of split signature, $V = \mathbb{R}^{r,r}$, there exist two independent admissible bilinear forms $f$ and $f_E = f(E\,\cdot, \cdot)$ on the spinor module, where $E|_{S_\pm} = \pm\mathrm{Id}$. Their invariants are given in Table A.3. If $V$ is positive or negative definite, then there exists a unique (up to scale) $\mathrm{Pin}(V)$-invariant scalar product $g$ on the spinor module $S$ and any admissible, hence invariant, bilinear form on $S$ is of the form $g_A = g(A\,\cdot, \cdot)$, where $A$ is an admissible element of the Schur algebra $C(S)$. The admissibility of $A$ means that $A$ is either symmetric or skewsymmetric with respect to $g$, that it either commutes or anticommutes with the Clifford multiplication $\gamma_v$ and that it preserves or interchanges $S_+$ and $S_-$ if they exist [AC].

Returning to the general case $V = \mathbb{R}^{p,q} = V_1 \oplus V_2$, $S = S_1 \hat{\otimes} S_2$, as above, the admissible bilinear forms on $S$ can be described as follows. Let $(g_{A_i})$ be a basis of $\mathrm{Bil}(S_2)^{\mathfrak{so}(V_2)}$ consisting of admissible elements. Inspection of Table A.3 shows that for any $g_{A_i}$ there exists a unique element $\varphi_i \in \{f, f_E\}$ which satisfies the condition $\tau(\varphi_i) = \iota(g_{A_i}) \tau(g_{A_i})$. By virtue of Proposition 3.4 of [AC], the tensor products $\varphi_i \otimes g_{A_i}$ provide a basis of $\mathrm{Bil}(S)^{\mathfrak{so}(V)}$ consisting of admissible elements. The corresponding basis $(\varphi_i \otimes g_{A_i})^k$ of $\mathrm{Bil}^k(S)^{\mathfrak{so}(V)}$ and its invariants are tabulated below. For simplicity the superscript $k$ and the symbol $\otimes$ are omitted.

We use the following bases for the Schur algebra $C(S)$ (see Table A.2). If the signature $s = 0, 1, 2$ or $4$, then $S = S_+ \oplus S_-$, and we put $E = \mathrm{diag}(\mathrm{Id}_{S_+}, -\mathrm{Id}_{S_-})$. In the cases $s = 1$ and $s = 6$ we denote by $I$ the standard complex structure in $C(S) = \mathbb{R}(2)$ and $C(S) = \mathbb{C}$, respectively. In fact, in the case $s = 1$, $S = \mathbb{C}^N = \mathbb{R}^N \oplus \mathbb{R}^N$ has a $Cl(V)$-invariant
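Table A.2. Standard bases for the Schur algebras $C(S)$

| $s$ | $C(S)$ | basis of $C(S)$ |
| 0 | $2\mathbb{R}$ | $\mathrm{Id}, E$ |
| 1 | $\mathbb{R}(2)$ | $\mathrm{Id}, E, I, EI$ |
| 2 | $\mathbb{C}(2)$ | $\mathrm{Id}, I, J, K, E, EI, EJ, EK$ |
| 3 | $\mathbb{H}$ | $\mathrm{Id}, I, J, K$ |
| 4 | $2\mathbb{H}$ | $\mathrm{Id}, I, J, K, E, EI, EJ, EK$ |
| 5 | $\mathbb{H}$ | $\mathrm{Id}, I, J, K$ |
| 6 | $\mathbb{C}$ | $\mathrm{Id}, I$ |
| 7 | $\mathbb{R}$ | $\mathrm{Id}$ |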
complex structure. In the cases s = 2, 3, 4 and 5, we denote by I, J, K = I J ∈ C(S) the canonical Cl 0 (V )-invariant hypercomplex structure of S (3 anticommuting complex structures). In fact, in the case s = 4, S = HN/2 = HN/4 ⊕ HN/4 has a Cl(V )-invariant hypercomplex structure. B. Reformulation for Physicists In this appendix, we reformulate our results in a language that may be more familiar to physicists. It is useful first to review some properties of Clifford algebras, in particular those that concern the real Clifford algebras.
B.1. Complex and real Clifford algebras. We use here the terminology of Clifford algebras, spinors and gamma matrices as used in physics. Results for the real case depend on the signature. We remark that in the main text Clifford algebras have been taken with a minus sign:

$$\gamma^a \gamma^b + \gamma^b \gamma^a = -2\eta^{ab} . \tag{B.1}$$

The signature s has been introduced as p − q modulo 8, where p and q are the numbers of +1 and −1 eigenvalues, respectively, of the metric $\eta_{ab}$. Thus,

$$\begin{aligned}
p &= \text{number of negative eigenvalues of } \gamma^a \gamma^b + \gamma^b \gamma^a , \\
q &= \text{number of positive eigenvalues of } \gamma^a \gamma^b + \gamma^b \gamma^a , \\
s &= p - q \bmod 8 , \qquad n = p + q .
\end{aligned} \tag{B.2}$$

This is important in order to interpret Table 1. The bilinear form β corresponds to the charge conjugation matrix C: for spinors s and t, we have $\beta(s, t) = s^T C t$. If $v = e_a$ is a basis vector, then the operation γ(v) is the gamma matrix $\gamma(v = e_a) = \gamma^a$. The invariant σ(C) indicates the symmetry of C, while τ(C) indicates the symmetry of $C\gamma^a$:

$$C^T = \sigma(C)\, C , \qquad (C\gamma^a)^T = \tau(C)\, C\gamma^a . \tag{B.3}$$
Table A.3. Admissible bilinear forms β on S and their invariants (σ, τ, ι)
p\q
0
1
2
0
f fE
+−+ +++
g
++
fg fE gI
−− ++
1
g gE gI gI E g gI gJ gK gE gI E gJ E gKE g gI gJ gK
+−+ +++ −−− ++− +−+ −−+ −−− −−− +++ −++ ++− ++− +− −− −− −−
f fE
−−− ++−
g
−−
fE g fE gI f gE f gI E
++− −++ −−− −−+
f fE
g gI gJ gK gE gI E gJ E gKE g gI gJ gK
+−+ −−+ −−+ −−+ +++ −++ −++ −++ +− −− −+ −+
fE g fE gI fE gJ fE gK f gE f gI E f gJ E f gKE fE g fE gI fE gJ fE gK
++− −+− −++ −++ −−− +−− −−+ −−+ ++ −+ −+ −+
6
g gI
+− −+
fE g fE gI fE gJ fE gK f gE f gI E f gJ E f gKE fE g fE gI fg f gI
++− −+− −+− −+− −−− +−− +−− +−− ++ −+ +− +−
7
g
+−
fE g f gI
++ +−
2
3
4
5
3 fE g fE gI fE gJ fE gK fE g f gI
−+ ++ −− −− −+ −−
−−+ −++
g
−+
fg f gI fE gE fE gI E
−−+ +−− −++ −+−
f fE
+−− −+−
fg f gI f gJ f gK fE gE fE gI E fE gJ E fE gKE fg f gI f gJ f gK
−−+ +−+ +−− +−− −++ +++ −+− −+− −− +− +− +−
fE g fE gI f gE f gI E
−+− +++ +−− +−+
fg f gI f gJ f gK fE gE fE gI E fE gJ E fE gKE fg f gI fE g fE gI
−−+ +−+ +−+ +−+ −++ +++ +++ +++ −− +− ++ ++
fE g fE gI fE gJ fE gK f gE f gI E f gJ E f gKE fE g fE gI fE gJ fE gK
−+− ++− +++ +++ +−− −−− +−+ +−+ −+ ++ ++ ++
fE g fE gI fE gJ fE gK f gE f gI E f gJ E f gKE
−+− ++− ++− ++− +−− −−− −−− −−−
Table A.4. ∧^k V-valued admissible bilinear forms on S and their invariants (σ, ι) for k ≡ 1 (mod 4)
p\q
0
1
2
0
f fE
−− +−
g
+
fg fE gI
+ +
1
g gE gI gI E g gI gJ gK gE gI E gJ E gKE g gI gJ gK
−− +− ++ ++ −− +− ++ ++ +− −− ++ ++ − + + +
f fE
++ ++
g
+
fE g fE gI f gE f gI E
++ −− ++ +−
f fE
g gI gJ gK gE gI E gJ E gKE g gI gJ gK
−− +− +− +− +− −− −− −− − + − −
fE g fE gI fE gJ fE gK f gE f gI E f gJ E f gKE fE g fE gI fE gJ fE gK
++ −+ −− −− ++ −+ +− +− + − − −
6
g gI
− −
fE g fE gI fE gJ fE gK f gE f gI E f gJ E f gKE fE g fE gI fg f gI
++ −+ −+ −+ ++ −+ −+ −+ + − − −
7
g
−
fE g f gI
+ −
2
3
4
5
3 fE g fE gI fE gJ fE gK fE g f gI
− + + + − +
+− −−
g
−
fg f gI fE gE fE gI E
+− −+ −− −+
f fE
−+ −+
fg f gI f gJ f gK fE gE fE gI E fE gJ E fE gKE fg f gI f gJ f gK
+− −− −+ −+ −− +− −+ −+ + − − −
fE g fE gI f gE f gI E
−+ +− −+ −−
fg f gI f gJ f gK fE gE fE gI E fE gJ E fE gKE fg f gI fE g fE gI
+− −− −− −− −− +− +− +− + − + +
fE g fE gI fE gJ fE gK f gE f gI E f gJ E f gKE fE g fE gI fE gJ fE gK
−+ ++ +− +− −+ ++ −− −− − + + +
fE g fE gI fE gJ fE gK f gE f gI E f gJ E f gKE
−+ ++ ++ ++ −+ ++ ++ ++
Table A.5. ∧^k V-valued admissible bilinear forms on S and their invariants (σ, ι) for k ≡ 2 (mod 4)
p\q
0
1
2
0
f fE
−+ −+
g
−
fg fE gI
+ −
1
g gE gI gI E g gI gJ gK gE gI E gJ E gKE g gI gJ gK
−+ −+ +− −− −+ ++ +− +− −+ ++ −− −− − + + +
f fE
+− −−
g
+
fE g fE gI f gE f gI E
−− ++ +− ++
f fE
g gI gJ gK gE gI E gJ E gKE g gI gJ gK
−+ ++ ++ ++ −+ ++ ++ ++ − + + +
fE g fE gI fE gJ fE gK f gE f gI E f gJ E f gKE fE g fE gI fE gJ fE gK
−− +− ++ ++ +− −− ++ ++ − + + +
6
g gI
− +
fE g fE gI fE gJ fE gK f gE f gI E f gJ E f gKE fE g fE gI fg f gI
−− +− +− +− +− −− −− −− − + − −
7
g
−
fE g f gI
− −
2
3
4
5
3 fE g fE gI fE gJ fE gK fE g f gI
+ − + + + +
++ ++
g
+
fg f gI fE gE fE gI E
++ −− ++ +−
f fE
−− +−
fg f gI f gJ f gK fE gE fE gI E fE gJ E fE gKE fg f gI f gJ f gK
++ −+ −− −− ++ −+ +− +− + − − −
fE g fE gI f gE f gI E
+− −+ −− −+
fg f gI f gJ f gK fE gE fE gI E fE gJ E fE gKE fg f gI fE g fE gI
++ −+ −+ −+ ++ −+ −+ −+ + − − −
fE g fE gI fE gJ fE gK f gE f gI E f gJ E f gKE fE g fE gI fE gJ fE gK
+− −− −+ −+ −− +− −+ −+ + − − −
fE g fE gI fE gJ fE gK f gE f gI E f gJ E f gKE
+− −− −− −− −− +− +− +−
Table A.6. ∧^k V-valued admissible bilinear forms on S and their invariants (σ, ι) for k ≡ 3 (mod 4)
p\q
0
1
2
0
f fE
+− −−
g
−
fg fE gI
− −
1
g gE gI gI E g gI gJ gK gE gI E gJ E gKE g gI gJ gK
+− −− −+ −+ +− −− −+ −+ −− +− −+ −+ + − − −
f fE
−+ −+
g
−
fE g fE gI f gE f gI E
−+ +− −+ −−
f fE
g gI gJ gK gE gI E gJ E gKE g gI gJ gK
+− −− −− −− −− +− +− +− + − + +
fE g fE gI fE gJ fE gK f gE f gI E f gJ E f gKE fE g fE gI fE gJ fE gK
−+ ++ +− +− −+ ++ −− −− − + + +
6
g gI
+ +
fE g fE gI fE gJ fE gK f gE f gI E f gJ E f gKE fE g fE gI fg f gI
−+ ++ ++ ++ −+ ++ ++ ++ − + + +
7
g
+
fE g f gI
− +
2
3
4
5
3 fE g fE gI fE gJ fE gK fE g f gI
+ − − − + −
−− +−
g
+
fg f gI fE gE fE gI E
−− ++ +− ++
f fE
++ ++
fg f gI f gJ f gK fE gE fE gI E fE gJ E fE gKE fg f gI f gJ f gK
−− +− ++ ++ +− −− ++ ++ − + + +
fE g fE gI f gE f gI E
++ −− ++ +−
fg f gI f gJ f gK fE gE fE gI E fE gJ E fE gKE fg f gI fE g fE gI
−− +− +− +− +− −− −− −− − + − −
fE g fE gI fE gJ fE gK f gE f gI E f gJ E f gKE fE g fE gI fE gJ fE gK
++ −+ −− −− ++ −+ +− +− + − − −
fE g fE gI fE gJ fE gK f gE f gI E f gJ E f gKE
++ −+ −+ −+ ++ −+ −+ −+
When we can define chiral spinors, called semi-spinors here, the invariant ι indicates whether the charge conjugation matrix maps between spinors of equal chirality, ι(β) = 1, or of different chirality, ι(β) = −1. For complex gamma matrices, there are many references, and one can compare e.g. with [VP]. The invariants σ and τ are related to the two numbers ε and η of [VP] as

$$\sigma(C) = -\epsilon , \qquad \tau(C) = -\eta . \tag{B.4}$$

The main results depend on the dimension n. For n odd, there is one charge conjugation matrix (i.e. one bilinear form C) and

$$\begin{array}{lll}
n = 1 \bmod 8: & \sigma(C) = 1 , & \tau(C) = 1 , \\
n = 3 \bmod 8: & \sigma(C) = -1 , & \tau(C) = -1 , \\
n = 5 \bmod 8: & \sigma(C) = -1 , & \tau(C) = 1 , \\
n = 7 \bmod 8: & \sigma(C) = 1 , & \tau(C) = -1 .
\end{array} \tag{B.5}$$

For even n we can define a charge conjugation matrix for either sign of τ. We define

$$\gamma_* \equiv (-i)^{n/2+p}\, \gamma_1 \cdots \gamma_n , \qquad \gamma_* \gamma_* = 1 . \tag{B.6}$$

Now, if C is a good charge conjugation matrix, then $C' = C\gamma_*$ is a charge conjugation matrix as well, with $\tau(C') = -\tau(C)$. The value of σ is

$$\begin{array}{ll}
n = 0 \bmod 8:\ \sigma(C) = 1 , & n = 2 \bmod 8:\ \sigma(C) = \tau(C) , \\
n = 4 \bmod 8:\ \sigma(C) = -1 , & n = 6 \bmod 8:\ \sigma(C) = -\tau(C) .
\end{array} \tag{B.7}$$

Using $(1 \pm \gamma_*)/2$, we can define chiral spinors in this case, and we find

$$\iota(C) = \overleftarrow{n} \equiv (-1)^{n(n-1)/2} . \tag{B.8}$$

Here, $\overleftarrow{n}$ is the sign change on reversing n indices of an antisymmetric tensor.

In this paper we more often make use of real Clifford algebras. Explicit results on real Clifford algebras can be found in [O]. Here we give some key results. Only in the cases s = 0, 6, 7 can the matrices of the complex Clifford algebra (of dimension $2^{[n/2]}$, where the Gauss bracket [x] denotes the integer part of x) be chosen to be real. This is called the normal type. In the other cases, we can get real matrices of dimension twice that of the complex Clifford algebra. Many representations contain only pure real or pure imaginary gamma matrices. A simple way to obtain real matrices of double dimension is to use the matrices

$$\Gamma^a = \gamma^a \otimes \mathbb{1}_2 \quad \text{if } \gamma^a \text{ is real} , \qquad \Gamma^a = \gamma^a \otimes \sigma_2 \quad \text{if } \gamma^a \text{ is imaginary} . \tag{B.9}$$
In the cases s = 1, 5, called the almost complex type, there is one matrix that commutes with all gamma matrices. In terms of the real matrices $\Gamma^a$ of (B.9), it is

$$J \equiv \Gamma^1 \cdots \Gamma^n , \qquad J^2 = -\mathbb{1} . \tag{B.10}$$
Note that in the complex case the product of all the $\gamma^a$'s is proportional to the identity for odd dimensions, but this is not so for these larger and real $\Gamma^a$'s. (For s = 3, 7 the product of all real gamma matrices is also $\pm\mathbb{1}$.) In this case there is always a charge conjugation matrix $\mathcal{C}$ with¹

$$\sigma(\mathcal{C}) = \sigma(\mathbf{C}) , \qquad \tau(\mathcal{C}) = -1 , \tag{B.11}$$

where $\mathbf{C}$ indicates the charge conjugation matrix for the complex case. There is also a matrix D that satisfies the properties

$$D\Gamma^a + \Gamma^a D = 0 , \qquad D^T = \mathcal{C} D \mathcal{C}^{-1} , \qquad D^2 = \begin{cases} \mathbb{1} & \text{if } s = 1 , \\ -\mathbb{1} & \text{if } s = 5 . \end{cases} \tag{B.12}$$
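In the remaining cases, s = 2, 3, 4, called the quaternionic type, there are 3 matrices which commute with all the $\Gamma^a$'s, denoted $E_i$ for i = 1, 2, 3. They satisfy

$$[E_i, \Gamma^a] = 0 , \qquad E_i E_j = -\delta_{ij}\mathbb{1} + \varepsilon_{ijk} E_k , \qquad E_i^T = -\mathcal{C} E_i \mathcal{C}^{-1} , \tag{B.13}$$

where a charge conjugation matrix $\mathcal{C}$ for the real Clifford algebra is used that is related to the complex one, $\mathbf{C}$, by

$$\sigma(\mathcal{C}) = -\sigma(\mathbf{C}) , \qquad \tau(\mathcal{C}) = \tau(\mathbf{C}) . \tag{B.14}$$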
With these properties, we can obtain the following consequences for bilinear forms:

s = 0, 6. We have the normal type. The two charge conjugation matrices of the complex Clifford algebra can be used (possibly multiplied by i to make them real, but an overall factor is not important), having opposite values of τ. For σ one can use (B.7). For s = 0 there is no imaginary factor in (B.6), and thus $\gamma_*$ is a real matrix that can be used to define real chiral spinors (Majorana-Weyl spinors). The value of ι is then as in the complex case, see (B.8). For s = 6 there is no projection possible in this real case. The fact that the Clifford algebra is real reflects that the irreducible spinors are Majorana spinors.

s = 7. The real Clifford representation is also of the normal type. With the odd dimension there is only one charge conjugation matrix, and no chiral projection. The values of σ and τ are as in (B.5). Again, the reality reflects the property of Majorana spinors.

s = 1, 5. The real Clifford representation is of the 'almost complex type'. We have 4 choices for the charge conjugation matrix: $\mathcal{C}$, $\mathcal{C}J$, $\mathcal{C}D$ and $\mathcal{C}DJ$. We can derive from the given properties that

$$\begin{aligned}
&\sigma(\mathcal{C}) = \sigma(\mathbf{C}) = -\overleftarrow{n}\,\sigma(\mathcal{C}J) = \sigma(\mathcal{C}D) = \overleftarrow{n}\,\sigma(\mathcal{C}DJ) , \\
&\tau(\mathcal{C}) = \tau(\mathcal{C}J) = -1 , \qquad \tau(\mathcal{C}D) = \tau(\mathcal{C}DJ) = 1 .
\end{aligned} \tag{B.15}$$

If s = 1 then (B.12) says that $\frac12(\mathbb{1} \pm D)$ are good projection operators, and can be used to define semispinors. These (real) semispinors have the same dimension as the original complex ones and are the Majorana spinors. It is straightforward to check that

$$\iota(\mathcal{C}) = \iota(\mathcal{C}D) = 1 , \qquad \iota(\mathcal{C}J) = \iota(\mathcal{C}DJ) = -1 . \tag{B.16}$$

¹ In this explanation of properties of real Clifford algebras, we will always denote by $\mathcal{C}$ a specific choice of charge conjugation matrix for the real Clifford algebra, and by $\mathbf{C}$ the one for the complex gamma matrices. In general we use $C$ for any choice of charge conjugation matrix.
If s = 5 no such projection is possible. The size of the spinor representation is doubled by the procedure (B.9), and this reflects the fact that we have symplectic-Majorana spinors.

s = 2, 4. The real Clifford representation is of the quaternionic type. With dimension even, we start from the two charge conjugation matrices of the complex case. For each of them, we can construct 3 extra ones by multiplying with the imaginary units $E_i$, bringing the total to 8 invariant bilinear forms. From (B.13) and (B.14) it follows that (with $\mathcal{C}$ the real and $\mathbf{C}$ the complex charge conjugation matrix, as in footnote 1)

$$\sigma(\mathbf{C}) = -\sigma(\mathcal{C}) = \sigma(\mathcal{C}E_i) , \qquad \tau(\mathbf{C}) = \tau(\mathcal{C}) = \tau(\mathcal{C}E_i) . \tag{B.17}$$

The definition of chiral spinors as in the complex case is only possible if $\gamma_*$ in (B.6) is real. Thus if $\frac12 n + p$ is even, i.e. s = 4, the product of all the $\Gamma^a$'s is a good chiral projection operator. The projected spinors are the components of symplectic Majorana-Weyl spinors. If $\gamma_*$ is imaginary, i.e. s = 2, the product of all the $\Gamma^a$'s squares to $-\mathbb{1}$. Using one of the complex structures, say $E_1$, then gives chiral projections of the form $\frac12(\mathbb{1} \pm i\Gamma_* E_1)$. In this case

$$\iota(\mathcal{C}) = \iota(\mathcal{C}E_1) = -\iota(\mathbf{C}) , \qquad \iota(\mathcal{C}E_2) = \iota(\mathcal{C}E_3) = \iota(\mathbf{C}) . \tag{B.18}$$

The projected spinors are the components of Majorana spinors.

s = 3. Here also, the real Clifford algebra is of the quaternionic type, but since the dimension is odd, there is only one charge conjugation matrix in the complex Clifford algebra. For this, (B.17) applies. There is no chiral projection, and the components correspond to symplectic Majorana spinors.

The results can be seen in Table A.3 (though the names for the different bilinear forms are unrelated to what has been explained here). Table A.1 gives the number of solutions for charge conjugation matrices that have β = 1, β = −1. The map $\beta_k$ in the main text corresponds to the mapping from two spinors s and t to the form with components $s^T C\,\Gamma^{a_1 \ldots a_k} t$ (where C denotes now any choice as explained in footnote 1), and the number $\sigma(\overset{k}{C})$ gives the symmetry of this bispinor (for commuting spinors) under interchange of s and t, while $\iota(\overset{k}{C})$ tells whether s and t have the same chirality. They are related to σ(C), τ(C) and ι(C) by (1.4) and (1.5):

$$\sigma(\overset{k}{C}) = \sigma(C)\,\tau^k(C)\,\overleftarrow{k} , \qquad \iota(\overset{k}{C}) = (-)^k\, \iota(C) . \tag{B.19}$$

For real Clifford algebras they are given explicitly in Tables A.4–A.6.
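As a worked illustration of (B.19), the following sketch evaluates the symmetry of the rank-k bilinears for complex spinors in odd dimension, taking the values of σ(C) and τ(C) from (B.5). The function names are hypothetical, and the reversal sign is taken to be $(-1)^{k(k-1)/2}$ as in (B.8).

```python
# Sketch: symmetry of the rank-k bilinear forms from (B.5) and (B.19).
# Function names are illustrative, not taken from the paper.

# (sigma(C), tau(C)) of the unique charge conjugation matrix for odd n,
# indexed by n mod 8, copied from (B.5).
ODD_N = {1: (1, 1), 3: (-1, -1), 5: (-1, 1), 7: (1, -1)}

def reversal_sign(k):
    """Sign picked up on reversing k antisymmetric indices."""
    return (-1) ** (k * (k - 1) // 2)

def sigma_k(sigma, tau, k):
    """Symmetry of the rank-k bilinear, first relation in (B.19)."""
    return sigma * tau ** k * reversal_sign(k)

def iota_k(iota, k):
    """Chirality invariant of the rank-k bilinear, second relation in (B.19)."""
    return (-1) ** k * iota

def symmetric_ranks(n):
    """All k in 0..n with sigma = +1, for odd dimension n."""
    sigma, tau = ODD_N[n % 8]
    return [k for k in range(n + 1) if sigma_k(sigma, tau, k) == 1]

print(symmetric_ranks(11))
# -> [1, 2, 5, 6, 9, 10]
```

For n = 11 the symmetric ranks up to the middle dimension are k = 1, 2 and 5; the higher ranks 6, 9 and 10 are their duals.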
B.2. Summary of the results for the algebras. This paper treats algebras that consist of an even sector $g_0 = so(p, q) + W_0$, and an odd sector $g_1 = W_1$ consisting of a representation of so(p, q). The algebra so(p, q) is denoted as so(V), and V denotes its vector representation. We consider either the usual case where the odd generators are fermionic (ε = 1, and we have a superalgebra), or they can be bosonic (ε = −1, and we have a '$Z_2$-graded Lie algebra'). We will use the word 'commutator' in all cases, though this is obviously an anticommutator for $[W_1, W_1]$ in the superalgebra case. These algebras are called ε-extensions of so(V). We use the following terminologies for special cases:
Poincaré superalgebras or Lie algebras: $W_0$ are the translations in n dimensions (n = p + q), which are denoted by V [and thus $g_0 = so(V) + V$], and $W_1$ is a spinorial representation.

Algebra of translational type: all generators in $[W_1, W_1]$ belong to $W_0$:

$$[W_1, W_1] \subset W_0 , \qquad [W_1, W_0] = 0 , \qquad [W_0, W_0] = 0 . \tag{B.20}$$

This part $W_0 + W_1$ is called the 'algebra of generalized translations'.

Transalgebra: algebra of translational type where all the generators of $W_0$ appear in $[g_1, g_1]$, i.e.

$$[W_1, W_1] = W_0 . \tag{B.21}$$
ε-extended polyvector Poincaré algebras: algebra of translational type where $W_1$ is a (possibly reducible) spinorial representation (includes chiral and extended supersymmetry).

There are 2 extreme cases: one in which the full g is semisimple, which is the case of the Nahm superalgebras, and the algebras of semi-direct type, where so(p, q) is its largest semisimple subalgebra. Apart from degenerate cases where n ≤ 2, any transalgebra is of semi-direct type. Transalgebras are minimal cases of algebras of translational type in the sense that there are no proper subalgebras, see Definition 1. In fact, any algebra of translational type can be written as $g = g' + a$, where $a \subset W_0$ is an so(p, q) representation, which is irrelevant in the sense that all its generators commute with all of $W_1$ and $W_0$ and do not appear in $[W_1, W_1]$. The algebra $g'$ is a transalgebra. For any choice of $W_1$ there is a unique transalgebra where $W_0$ has all the so(p, q) representations that appear in the (anti)symmetric product of $W_1$ with itself. The (anti)commutators of $W_1$ are then

$$[W_1, W_1] = \sum_{\text{all } r} W_0^{(r)} . \tag{B.22}$$

Here r labels all representations that appear in the symmetric product for ε = 1, i.e. superalgebras, and in the antisymmetric product for ε = −1, i.e. for Lie algebras. Any other transalgebra can be obtained by removing an arbitrary number of terms in (B.22). We can consider these to be contractions of this basic transalgebra, where the representations to be removed are multiplied by some parameter t and the limit t → 0 is taken.

Any ε-extension of semi-direct type with $W_1$ irreducible and of dimension at least 3 is of the following form:

$$\begin{array}{lll}
W_0 = A + K , & [K, W_1] = 0 , & [A, W_1] = \rho W_1 , \\
{[so(V), A]} = 0 , & [A, A] \subset K , & [W_1, W_1] \subset K ,
\end{array} \tag{B.23}$$

where ρ = 0 or R·Id or C·Id. This is thus a transalgebra iff A = 0. Also, if the algebra is minimal, then $[W_1, W_1] = W_0$ and it is a transalgebra.
Table B.1. The values of k in (B.24) for the case of complex spinors. n is the dimension of the vector space. i can be 0, 1, ..., limited by the fact that obviously k ≥ 0. For the even-dimensional case, we split the (anti)commutator between spinors of different and equal chirality. For equal chirality, the k = m generator is either selfdual or antiselfdual.

                         Superalgebra: σ = +1        Lie algebra: σ = −1
n = 2m + 1               k = m − 4i                  k = m − 1 − 4i
                         k = m − 3 − 4i              k = m − 2 − 4i
n = 2m    [S±, S±]       k = m − 4 − 4i              k = m − 2 − 4i
                         k = m (anti)selfdual
          [S+, S−]       k = m − 1 − 2i              k = m − 1 − 2i
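The entries of Table B.1 can be enumerated mechanically for a given dimension. The sketch below is one possible encoding; the function names and the dictionary keys ("equal", "opposite" for the chirality of the two spinors) are hypothetical choices, and the (anti)selfduality of the k = m form is not tracked.

```python
# Sketch: list the ranks k allowed by Table B.1 for complex spinors.
# Names and dictionary keys are illustrative assumptions.

def descending(start, step):
    """Values start, start - step, ... that stay >= 0 (empty if start < 0)."""
    return list(range(start, -1, -step))

def table_b1(n):
    """Allowed ranks k in dimension n, split by superalgebra / Lie algebra."""
    m = n // 2
    if n % 2 == 1:  # n = 2m + 1: a single row, no chirality split
        return {
            "superalgebra": sorted(descending(m, 4) + descending(m - 3, 4)),
            "lie algebra": sorted(descending(m - 1, 4) + descending(m - 2, 4)),
        }
    # n = 2m: split by equal / opposite chirality of the two spinors
    mixed = sorted(descending(m - 1, 2))
    return {
        "superalgebra": {
            "equal": sorted(descending(m - 4, 4) + [m]),  # k = m is (anti)selfdual
            "opposite": mixed,
        },
        "lie algebra": {
            "equal": sorted(descending(m - 2, 4)),
            "opposite": mixed,
        },
    }

print(table_b1(11))
# -> {'superalgebra': [1, 2, 5], 'lie algebra': [0, 3, 4]}
print(table_b1(8)["superalgebra"])
# -> {'equal': [0, 4], 'opposite': [1, 3]}
```

For n = 11 the superalgebra column gives a 1-form, a 2-form and a 5-form, which is the familiar content of the 11-dimensional case.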
Also, if $W_0$ and $W_1$ are irreducible so(p, q) representations, then either the algebra is of translational type, i.e. $[W_0, W] = 0$, or $W_0$ is an abelian generator and $[W_0, W_1] = a W_1$, where a is a number.

We now restrict ourselves to transalgebras where $W_1 = S$, the irreducible spinor representation of so(p, q). Then the representations that appear in the right-hand side of (B.22) are either k-forms or, in the case that s = p − q is divisible by 4, also (anti-)selfdual (n/2)-forms. Thus, the unique maximal transalgebra has (anti)commutators

$$\left[ S_\alpha , S_\beta \right] = \sum_k \left( C\,\Gamma^{a_1 \ldots a_k} \right)_{\alpha\beta} W^{k}_{0\; a_1 \ldots a_k} , \tag{B.24}$$

where α, β denote spinor indices. The classification of transalgebras with $W_1 = S$ reduces to the description of all the charge conjugation matrices C and the specification of the range of the summation over k. The relevant issue is the symmetry for a particular k, i.e. the $\sigma(\overset{k}{C})$ of the previous subsection. When there are chiral spinors involved, the chirality should be respected, which is related to $\iota(\overset{k}{C})$.

First, in Sect. 5, the complex case is discussed. That means that there are no reality conditions on bosonic or fermionic generators. When the dimension is odd, the result is given in Theorem 2. There is only one charge conjugation matrix, and the result can be understood from (B.19) and (B.5). For even dimensions the result is given in Theorem 3. This depends mainly on (B.7) and (B.8). Here the spinors can be split into chiral spinors, and we can separately consider the commutators between spinor generators of the same and of opposite chirality. The result for allowed values of k in (B.24) can be found also in Table B.1. As an example we may check that in 11 dimensions we can indeed have $P$, $Z^{ab}$ and $Z^{a_1 \ldots a_5}$ generators in $W_0$, as is the case of the M-algebra, and the classification implies that we can consistently put any one of these to zero.

For the case of real generators, it is important to note that (anti)selfdual tensors in even dimensions are only consistent for s/2 even. We now discuss the algebras according to the 8 different values of s. The results are shown in Table B.2.

Table B.2. The values of k in (B.24) for the case of real spinors. n is the dimension of the vector space. i can be 0, 1, ..., limited by the fact that obviously k ≥ 0 (and k ≤ n for s = 2, 6). In cases where there are real Weyl spinors, we split the (anti)commutator between spinors of different and equal chirality, and the k = m generator is either selfdual or antiselfdual. When there are symplectic spinors, the right-hand side of (B.24) contains for some k's triplets of the automorphism group su(2), and singlets for other k's. The types of real spinors, Majorana (M), symplectic-Majorana (SM), or symplectic Majorana-Weyl (SMW), are indicated.

                            Superalgebra: σ = +1                 Lie algebra: σ = −1
n = 2m + 1
  s = 1, 7 (M)              k = m − 4i                           k = m − 1 − 4i
                            k = m − 3 − 4i                       k = m − 2 − 4i
  s = 3, 5 (SM)             k = m − 4i             triplet       k = m − 4i             singlet
                            k = m − 3 − 4i         triplet       k = m − 3 − 4i         singlet
                            k = m − 1 − 4i         singlet       k = m − 1 − 4i         triplet
                            k = m − 2 − 4i         singlet       k = m − 2 − 4i         triplet
n = 2m
  s = 0 (MW)    [S±, S±]    k = m − 4 − 4i                       k = m − 2 − 4i
                            k = m (anti)selfdual
                [S+, S−]    k = m − 1 − 2i                       k = m − 1 − 2i
  s = 2, 6 (M)              k = m − 4i, m + 4 + 4i               k = m − 1 − 4i, m + 3 + 4i
                            k = m − 3 − 4i, m + 1 + 4i           k = m − 2 − 4i, m + 2 + 4i
  s = 4 (SMW)   [S±, S±]    k = m − 4 − 4i         triplet       k = m − 2 − 4i         triplet
                            k = m (anti)selfdual   triplet       k = m (anti)selfdual   singlet
                            k = m − 2 − 4i         singlet       k = m − 4 − 4i         singlet
                [S+, S−]    k = m − 1 − 2i         2×2           k = m − 1 − 2i         2×2

s = 0 (Majorana-Weyl spinors). There are chiral spinors and we can split the commutators. The k values that appear in Tables A.3–A.6 with ι = 1 can appear in commutators of equal chirality. The value of σ indicates whether they appear in superalgebras (σ = 1) or in Lie algebras (σ = −1). Those with ι = −1 appear in the same way in commutators of different chirality. The (anti)selfdual tensors appear in the commutators between spinors of the same chirality.

s = 1 (Majorana spinors). The two projections to semispinors mentioned above (B.16) lead to equivalent spinors. We thus consider only the commutator between these irreducible spinors (including the others is contained in the 'extended algebras' discussed below). In Tables A.3–A.6 we thus consider the ι = 1 cases. We can check that ι = −1 always allows both σ = 1 and σ = −1, as this concerns commutators between unrelated but equivalent spinors.

s = 2 (Majorana spinors). The two projections to semispinors lead to equivalent spinors. We thus consider only the commutator between these irreducible spinors. Note that in the table we indicate here also forms with k > m. These are dual to k < m forms, and this duality has been used in the formulation of the s = 2 part of Theorem 6. The formulation here shows the gamma matrices completely, e.g. the appearance of $\Gamma_{abc} = \varepsilon_{abcd}\,\gamma_5 \gamma^d$ in 4 dimensions.

s = 3, s = 5 (Symplectic-Majorana spinors). The symplectic spinors are in a doublet of su(2). According to the value of σ for a particular k we find either a triplet or a singlet of generators in the superalgebra or in the Lie algebra.

s = 4 (Symplectic Majorana-Weyl spinors). In the commutators between generators of equal chirality (which are again doublets of su(2)), we find either triplets (symmetric) or singlet (antisymmetric) generators. For commutators between generators of different
chirality no symmetry or antisymmetry can be defined, and the generators allowed by the chirality (ι = −1) appear in the superalgebra as well as in the Lie algebra.

s = 6 (Majorana spinors). This case is straightforward from the tables and the spinors are just real and not projected. Note that the result is then the same as for the projected ones of s = 2. The same remark about showing tensors with k > m holds here too. These are dualized in the formulation in Corollary 4.

s = 7 (Majorana spinors). Here also, the tables straightforwardly lead to the same result as for the projected spinors of s = 1.

We remark that the result is the same for s and for −s, which shows that the conventional choices discussed at the beginning of Sect. B.1 do not influence the final algebras.

Finally, in Sect. 7, results are obtained for N-extended polyvector Poincaré algebras. This means that $W_1$ consists of N copies of the irreducible spinor S. In cases where there are two inequivalent copies (complex even-dimensional, or real with s = 0 or s = 4) we have $(N_+, N_-)$-extended polyvector Poincaré algebras. The results are straightforward from the above tables and this shows why it has been useful to include the Lie algebra case. The generators in $W_1$ are in an N-representation of the automorphism algebra that acts on the copies of S.

For the complex odd-dimensional case and real s = 1, 2, 6, 7 (Majorana): We just have to split the N × N representations into the symmetric and antisymmetric ones.

$$\begin{array}{ll}
\text{for superalgebras:} & \frac{N(N+1)}{2} \text{ copies of the } \sigma = 1 \text{ generators} \\
 & {} + \frac{N(N-1)}{2} \text{ copies of the } \sigma = -1 \text{ generators} , \\
\text{for Lie algebras:} & \frac{N(N-1)}{2} \text{ copies of the } \sigma = 1 \text{ generators} \\
 & {} + \frac{N(N+1)}{2} \text{ copies of the } \sigma = -1 \text{ generators} .
\end{array} \tag{B.25}$$

For the complex even-dimensional case and real s = 0 (Weyl): We have $(N_+, N_-)$ algebras. We use the above rule separately for the commutators between the $N_+$ chiral generators and between the $N_-$ antichiral ones. Furthermore there are $N_+ N_-$ copies of the generators that appear in $[S_+, S_-]$ in Tables B.1 and B.2. As an example, the (2, 1) superalgebra in 8-dimensional (4,4) space contains: three selfdual 4-forms and one antiselfdual 4-form, four 0-forms (three in $[2S_+, 2S_+]$ and one in $[S_-, S_-]$), one 2-form (in $[2S_+, 2S_+]$), and two 3-forms and two 1-forms in $[2S_+, S_-]$.

For the symplectic real case s = 3, 5: The automorphism algebra is already sp(2) = su(2) for the simple algebras discussed above. For the extended algebras it is sp(N), where N is even. The simple case is thus similar to (B.25) with N = 2, and the 'triplet' and 'singlet' indications in Table B.2 reflect this. Therefore for higher N (always even) we replace in Table B.2 the 'triplet' by N(N + 1)/2 and the 'singlet' by N(N − 1)/2.

For the symplectic Majorana-Weyl case s = 4: We merely need to combine the remarks above for the symplectic case and the Weyl case. Extended algebras are of the form $(N_+, N_-)$ where both numbers are even. The 'triplet' indication in Table B.2 is replaced by $N_+(N_+ + 1)/2$ and $N_-(N_- + 1)/2$, and 'singlet' is replaced by $N_+(N_+ - 1)/2$ and $N_-(N_- - 1)/2$. The mixed commutators are multiplied by $N_+ N_-$.
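The counting in the (2, 1) example can be reproduced with a few lines of code. The sketch below assumes the s = 0 rows of Table B.2 and the superalgebra rule (B.25) applied separately to each chirality; the function names and the labels "++", "--", "+-" for the three kinds of commutators are hypothetical.

```python
# Sketch: generator content of an (N+, N-)-extended superalgebra for
# n = 2m and s = 0, using Table B.2 and the rule (B.25).
# Labels "++", "--", "+-" are illustrative names for [S+,S+], [S-,S-], [S+,S-].

def descending(start, step):
    """Values start, start - step, ... that stay >= 0 (empty if start < 0)."""
    return list(range(start, -1, -step))

def extended_content(m, n_plus, n_minus):
    """Map (commutator label, rank k) -> number of copies of that generator."""
    sigma_plus = sorted(descending(m - 4, 4) + [m])  # sigma = +1; k = m is (anti)selfdual
    sigma_minus = sorted(descending(m - 2, 4))       # sigma = -1 ranks
    mixed = sorted(descending(m - 1, 2))             # ranks in [S+, S-]
    content = {}
    for label, copies in (("++", n_plus), ("--", n_minus)):
        for k in sigma_plus:
            content[(label, k)] = copies * (copies + 1) // 2  # symmetric part
        for k in sigma_minus:
            content[(label, k)] = copies * (copies - 1) // 2  # antisymmetric part
    for k in mixed:
        content[("+-", k)] = n_plus * n_minus
    # Drop generators that do not occur at all.
    return {key: count for key, count in content.items() if count > 0}

content = extended_content(4, 2, 1)  # the (2, 1) case in 8 dimensions
for key in sorted(content):
    print(key, content[key])
```

For m = 4 this gives three copies each of the 0-form and the 4-form in "++", one 2-form in "++", one 0-form and one 4-form in "--", and two copies each of the 1-form and the 3-form in "+-", matching the example in the text.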
References

[AC] Alekseevsky, D.V., Cortés, V.: Classification of N-(super)-extended Poincaré algebras and bilinear invariants of the spinor representation of Spin(p, q). Commun. Math. Phys. 183, 477–510 (1997)
[AI] de Azcárraga, J.A., Izquierdo, J.M.: Lie groups, Lie algebras, cohomology and some applications in physics. Cambridge: Camb. Univ. Press, 1995
[CAIP] Chryssomalakos, C., de Azcárraga, J.A., Izquierdo, J.M., Pérez Bueno, J.C.: The geometry of branes and extended superspaces. Nucl. Phys. B567, 293–330 (2000)
[DFLV] D'Auria, R., Ferrara, S., Lledo, M.A., Varadarajan, V.S.: Spinor algebras. J. Geom. Phys. 40, 101–129 (2001)
[DN] Devchand, C., Nuyts, J.: Supersymmetric Lorentz-covariant hyperspaces and self-duality equations in dimensions greater than (4|4). Nucl. Phys. B503, 627–656 (1997); Lorentz covariant spin two superspaces. Nucl. Phys. B527, 479–498 (1998); Democratic Supersymmetry. J. Math. Phys. 42, 5840–5858 (2001)
[FV] Fradkin, E.S., Vasiliev, M.A.: Candidate for the role of higher spin symmetry. Ann. Phys. 177, 63–112 (1987)
[LM] Lawson, H.B., Michelsohn, M.-L.: Spin geometry. Princeton: Princeton University Press, 1989
[N] Nahm, W.: Supersymmetries and their representations. Nucl. Phys. B135, 149 (1978)
[O] Okubo, S.: Real representations of finite Clifford algebras: (I) Classification. J. Math. Phys. 32, 1657–1668 (1991)
[OV] Onishchik, A.L., Vinberg, E.B.: Lie Groups and Algebraic Groups. Berlin-Heidelberg: Springer, 1990
[Sc] Scheunert, M.: Generalized Lie Algebras. J. Math. Phys. 20, 712–720 (1979)
[Sh] Shnider, S.: The superconformal algebra in higher dimensions. Lett. Math. Phys. 16, 377–383 (1988)
[V] Vasiliev, M.A.: Consistent equations for interacting massless fields of all spins in the first order in curvatures. Ann. Phys. 190, 59–106 (1989)
[VP] Van Proeyen, A.: Tools for supersymmetry. Annals of the University of Craiova, Physics AUC 9 (part I), 1–48 (1999)
[VV] van Holten, J.W., Van Proeyen, A.: N=1 Supersymmetry Algebras in D = 2, D = 3, D = 4 mod 8. J. Phys. A 15, 3763–3783 (1982)

Communicated by G.W. Gibbons
Commun. Math. Phys. 253, 423–449 (2005)
Digital Object Identifier (DOI) 10.1007/s00220-004-1135-2
Communications in Mathematical Physics

Classification of Subsystems for Graded-Local Nets with Trivial Superselection Structure

Sebastiano Carpi1, Roberto Conti2

1 Dipartimento di Scienze, Università “G. d’Annunzio” di Chieti-Pescara, Viale Pindaro 87, 65127 Pescara, Italy. E-mail: [email protected]
2 Mathematisches Institut, Friedrich-Alexander Universität Erlangen-Nürnberg, Bismarckstr. 1 1/2, 91054 Erlangen, Germany. E-mail: [email protected]

Received: 2 December 2003 / Accepted: 5 January 2004
Published online: 27 July 2004 – © Springer-Verlag 2004
Abstract: We classify Haag-dual Poincaré covariant subsystems B ⊂ F of a graded-local net F on 4D Minkowski spacetime which satisfies standard assumptions and has trivial superselection structure. The result applies to the canonical field net $F_A$ of a net A of local observables satisfying natural assumptions. As a consequence, provided that it has no nontrivial internal symmetries, such an observable net A is generated by (the abstract versions of) the local energy-momentum tensor density and the observable local gauge currents which appear in the algebraic formulation of the quantum Noether theorem. Moreover, for a net A of local observables as above, we also classify the Poincaré covariant local extensions B ⊃ A which preserve the dynamics.

1. Introduction

It is a fundamental insight of the algebraic approach to Quantum Field Theory that a proper formulation of relativistic quantum physics should be based only on local observable quantities, see e.g. [31]. The corresponding mathematical structure is a net A of local observables, namely an inclusion preserving (isotonous) map which to every open double cone O in four dimensional Minkowski spacetime associates a von Neumann algebra A(O) (generated by the observables localized in O) acting on a fixed Hilbert space $H_0$ (the vacuum Hilbert space of A) and satisfying mathematically natural and physically motivated assumptions such as isotony, locality, Poincaré covariance, positivity of the energy and Haag-duality (Haag-Kastler axioms). The charge (superselection) structure of the theory is then encoded in the representation theory of the quasi-local $C^*$-algebra (still denoted A) which is generated by the local von Neumann algebras A(O). The problem was then posed [19, 20], whether it is possible to reconcile such an approach with the more conventional ones based on the use of unobservable fields and gauge groups.
Partially supported by the Italian MIUR and GNAMPA-INDAM.
A major breakthrough was then provided by S. Doplicher and J.E. Roberts in [24]. For any given observable net O → A(O), the Doplicher-Roberts reconstruction yields an associated canonical field system with gauge symmetry (F, π, G) describing the superselection structure of the net A corresponding to charges localizable in bounded regions (DHR sectors); for details see [24], where also the case of topological charges which are localizable in spacelike cones is considered. Here $F = F_A$ is the complete normal field net of A, acting on a larger Hilbert space $H \supset H_0$, the representation π is an embedding of A into F ⊂ B(H) so that $A = F^G$, and the gauge group $G \simeq Aut_A(F)$ is a strongly compact subgroup of the unitary group U(H) (to simplify the notation we drop the symbol π when there is no danger of confusion). Actually any (metrizable) compact group may appear as G [23]. If every DHR sector of A is Poincaré covariant (which is the situation considered in this paper) then the net F is also Poincaré covariant with positive energy. In this case A is an example of a covariant subsystem (or subnet) of F. More generally a covariant subsystem B of F is an isotonous map that associates to each double cone O a von Neumann subalgebra B(O) of F(O) which is compatible with the Poincaré symmetry and the grading (giving normal commutation relations) on F. Besides its natural mathematical interest, the study of covariant subsystems appears also to be useful in the understanding of the possible role of local quantum fields of definite physical meaning, such as charge and energy-momentum densities, in the definition of the net of local observables (see [11] and the references therein).
In a previous paper [10] we gave a complete classification of the (Haag-dual) covariant subsystems of a local field net $\mathcal{F}$ satisfying some standard additional assumptions (like the split property and the Bisognano-Wichmann property) and having trivial superselection structure in the sense that every representation of $\mathcal{F}$ satisfying the selection criterion of Doplicher, Haag and Roberts [20] (DHR representation) is unitarily equivalent to a multiple of the vacuum representation. The structure that emerged in this analysis is very simple: every Haag-dual covariant subsystem of $\mathcal{F}$ is of the form $\mathcal{F}_1^H \otimes 1$ for a suitable tensor product decomposition $\mathcal{F} = \mathcal{F}_1 \otimes \mathcal{F}_2$ and strongly compact group $H$ of unbroken internal symmetries of $\mathcal{F}_1$. Under reasonable assumptions for a net of local observables $\mathcal{A}$ it was also pointed out in [10] that the above result is sufficient to classify the covariant subsystems of $\mathcal{F} = \mathcal{F}_{\mathcal{A}}$ (and thus those of $\mathcal{A}$) when the latter net is local, since in that case $\mathcal{F}_{\mathcal{A}}$ has trivial superselection structure as a consequence of [14].

The main achievement of this paper is the generalization of the classification results in [10] to the case of graded-local nets, i.e. nets obeying normal commutation relations at spacelike distances. Apart from the obvious gain in mathematical generality, our work is intended to remove the ad hoc assumption on the locality of $\mathcal{F}_{\mathcal{A}}$ corresponding to the absence of DHR sectors of Fermi type, i.e. obeying (para) Fermi statistics, for the net of local observables $\mathcal{A}$. Besides its great theoretical value, the Doplicher-Roberts (re)construction will provide a major technical tool for our analysis. As in [10] we will repeatedly exploit the possibility of comparing such constructions for different subsystems given by the functorial properties of the correspondence $\mathcal{B} \mapsto (\mathcal{F}_{\mathcal{B}}, G_{\mathcal{B}})$ discussed in [14]. Compared to [10], there are two preliminary problems which have to be settled.
One has to give a meaning to the statement that “$\mathcal{F}$ has trivial superselection structure” and this is done by requiring that $\mathcal{F}$ has no nontrivial DHR representations, or, equivalently, that every DHR representation of the Bose part $\mathcal{F}^b$ of $\mathcal{F}$ is equivalent to
Graded Local Nets with Trivial Superselection Structure
a direct sum of irreducibles and that $\mathcal{F}^b$ has only two DHR sectors. So in particular $\mathcal{F} = \mathcal{F}_{\mathcal{F}^b}$. One also has to make clear in which sense a tensor product decomposition of $\mathcal{F}$ (as a graded local net) may hold, and this is done using the standard notion of the product of Fermionic theories. Having these differences in mind, the classification results we obtain (see Theorem 3.4 and Theorem 3.8) are nothing but the natural reformulation of the results in [10] in the more general context, thus showing that in the graded local case the structure of Haag-dual Poincaré covariant subsystems can still be described in terms of internal symmetries (cf. [1]). However, though the general strategy is very much the same, some of the proofs are significantly different due to technical complications related to the fact that we did not find an efficient way to adapt to the graded-local situation some crucial arguments relying on the work of L. Ge and R.V. Kadison [28]. These differences are particularly evident in the proofs of Theorems 3.4 and 3.3 and in fact also provide a partially alternative argument for the validity of the results in [10]. The main new technical ingredients come from the theory of nets of subfactors [40] and the theory of half-sided modular inclusions of von Neumann algebras supplemented with some ideas of H.J. Borchers [4, 50].

As a natural application of the classification result we provide a solution to a problem raised by S. Doplicher (see [18]) about the possibility of a net of local observables to be generated by the corresponding canonical local implementations of symmetries with the characterization in Theorem 4.4. During our study of subsystems we also realized that some of our methods can be useful to handle the opposite problem as well, namely to classify the local extensions of a given observable net $\mathcal{A}$.
If a local net $\mathcal{B} \supset \mathcal{A}$ extends a given local net $\mathcal{A}$ satisfying the same conditions used in our analysis of subsystems, and if this extension “preserves the dynamics”, then, modulo isomorphisms, $\mathcal{B} = \mathcal{F}_{\mathcal{A}}^H$ for a suitable closed subgroup $H$ of the gauge group of $\mathcal{A}$ (Theorem 5.2).

A crucial assumption on which our results depend and that deserves some comments is the requirement that the net of local observables has at most countably many DHR sectors, all with finite statistical dimension. Although the above properties are probably still waiting for a better understanding, they are strongly supported by experience: no example of a DHR sector with infinite statistics is known for a theory on a four-dimensional spacetime, and the presence of uncountably many DHR sectors can be ruled out, e.g. by the reasonable requirement that the complete field net fulfills the split property (the situation is drastically different in the case of conformal nets on the circle [8, 9, 27, 41]). Thus, at the present state of knowledge, our results appear to be more than satisfactory.

We refer the reader to standard textbooks on operator algebras like [45, 44, 33, 34] for all unexplained notions and facts freely used throughout the text.
2. Preliminaries and Assumptions

We follow closely the discussion in [10], pointing out the relevant modifications. We write $\mathcal{P}$ for the component of the identity of the Poincaré group and $\tilde{\mathcal{P}}$ for the universal covering of $\mathcal{P}$. Elements of $\tilde{\mathcal{P}}$ are denoted by pairs $L = (\Lambda, x)$, where $\Lambda$ is an element of the covering of the connected component of the Lorentz group and $x$ is a spacetime translation. $\mathcal{P}$ acts in the usual fashion on the four-dimensional Minkowski spacetime $M_4$ and the action of $\tilde{\mathcal{P}}$ on $M_4$ factors through $\mathcal{P}$ via the natural covering map $\tilde{\mathcal{P}} \to \mathcal{P}$. The family of all open double cones and (causal) open wedges in $M_4$ will be denoted $\mathcal{K}$ and $\mathcal{W}$, respectively. If $S$ is any open region in spacetime, we denote by $S'$ the interior of the causal complement $S^c$ of $S$.

Throughout this paper we consider a net $\mathcal{F}$ over $\mathcal{K}$, namely a correspondence $\mathcal{O} \mapsto \mathcal{F}(\mathcal{O})$ between open double cones and von Neumann algebras acting on a fixed separable (vacuum) Hilbert space $\mathcal{H} = \mathcal{H}_{\mathcal{F}}$. The following assumptions have been widely discussed in the literature and are by now considered more or less standard:

(i) Isotony. If $\mathcal{O}_1 \subset \mathcal{O}_2$, $\mathcal{O}_1, \mathcal{O}_2 \in \mathcal{K}$, then
$$\mathcal{F}(\mathcal{O}_1) \subset \mathcal{F}(\mathcal{O}_2). \tag{1}$$
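(ii) Graded locality. There exists an involutive unitary operator $\kappa$ on $\mathcal{H}$ inducing a net automorphism $\alpha_\kappa$ of $\mathcal{F}$, i.e. $\alpha_\kappa(\mathcal{F}(\mathcal{O})) = \mathcal{F}(\mathcal{O})$ for each $\mathcal{O} \in \mathcal{K}$. Let $\mathcal{F}^b(\mathcal{O}) = \{F \in \mathcal{F}(\mathcal{O}) \mid \alpha_\kappa(F) = F\}$ and $\mathcal{F}^f(\mathcal{O}) = \{F \in \mathcal{F}(\mathcal{O}) \mid \alpha_\kappa(F) = -F\}$ be the even (i.e., Bose) and the odd (i.e., Fermi) part of $\mathcal{F}(\mathcal{O})$, respectively, and let $\mathcal{F}^t$ be the new (isotonous) net $\mathcal{F}^b + \kappa\mathcal{F}^f$ over $\mathcal{K}$. If $\mathcal{O}_1, \mathcal{O}_2 \in \mathcal{K}$ and $\mathcal{O}_1$ is spacelike separated from $\mathcal{O}_2$ then
$$\mathcal{F}(\mathcal{O}_1) \subset \mathcal{F}^t(\mathcal{O}_2)'. \tag{2}$$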
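(iii) Covariance. There is a strongly continuous unitary representation $U$ of $\tilde{\mathcal{P}}$ such that, for every $L \in \tilde{\mathcal{P}}$ and every $\mathcal{O} \in \mathcal{K}$, there holds
$$U(L)\mathcal{F}(\mathcal{O})U(L)^* = \mathcal{F}(L\mathcal{O}). \tag{3}$$
The grading and the spacetime symmetries are compatible, that is $U(\tilde{\mathcal{P}})$ commutes with $\kappa$.

(iv) Existence and uniqueness of the vacuum. There exists a unique (up to a phase) unit vector $\Omega \in \mathcal{H}$ which is invariant under the restriction of $U$ to the subgroup of spacetime translations. In addition, one has $\kappa\Omega = \Omega$.

(v) Positivity of the energy. The joint spectrum of the generators of the spacetime translations is contained in the closure $\overline{V}_+$ of the open forward light cone $V_+$.

(vi) Reeh-Schlieder property. The vacuum vector $\Omega$ is cyclic and separating for $\mathcal{F}(\mathcal{O})$ for every $\mathcal{O} \in \mathcal{K}$.

(vii) Twisted Haag duality. For every double cone $\mathcal{O} \in \mathcal{K}$ there holds
$$\mathcal{F}(\mathcal{O})' = \mathcal{F}^t(\mathcal{O}'), \tag{4}$$
where, for every isotonous net $\mathcal{F}$ and open set $S \subset M_4$, $\mathcal{F}(S)$ denotes the von Neumann algebra defined by
$$\mathcal{F}(S) = \bigvee_{\mathcal{O} \subset S} \mathcal{F}(\mathcal{O}). \tag{5}$$
Equivalently, one requires that
$$\mathcal{F}(\mathcal{O}) = \bigcap_{\mathcal{O}_1 \subset \mathcal{O}'} \mathcal{F}^t(\mathcal{O}_1)'$$
(in short $\mathcal{F} = \mathcal{F}^d$, where $\mathcal{F}^d$ is the net defined by the r.h.s.).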
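(viii) TCP covariance. There exists an antiunitary operator $\Theta$ (the TCP operator) on $\mathcal{H}$ such that:
$$\Theta U(\Lambda, x)\Theta^{-1} = U(\Lambda, -x) \quad \forall (\Lambda, x) \in \tilde{\mathcal{P}};$$
$$\Theta \mathcal{F}(\mathcal{O})\Theta^{-1} = \mathcal{F}(-\mathcal{O}) \quad \forall \mathcal{O} \in \mathcal{K}.$$

(ix) Bisognano-Wichmann property. Let $W_R = \{x \in M_4 : x^1 > |x^0|\}$ be the right wedge and let $\Delta$ and $J$ be the modular operator and the modular conjugation of the algebra $\mathcal{F}(W_R)$ with respect to $\Omega$, respectively. Then one has:
$$\Delta^{it} = U(\tilde{\Lambda}(t), 0); \tag{6}$$
$$J = ZU(\tilde{R}_1(\pi), 0)\Theta; \tag{7}$$
where $\tilde{\Lambda}(t)$ and $\tilde{R}_1(\theta)$ denote the lifting in $\tilde{\mathcal{P}}$ of the one-parameter groups
$$\Lambda(t) = \begin{pmatrix} \cosh 2\pi t & -\sinh 2\pi t & 0 & 0 \\ -\sinh 2\pi t & \cosh 2\pi t & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{8}$$
of Lorentz boosts in the $x^1$-direction and $R_1(\theta)$ of spatial rotations around the first axis, respectively, and $Z = (I + i\kappa)/(1 + i)$.

(x) Split property. Let $\mathcal{O}_1, \mathcal{O}_2 \in \mathcal{K}$ be open double cones such that the closure of $\mathcal{O}_1$ is contained in $\mathcal{O}_2$ (as usual we write $\mathcal{O}_1 \subset\subset \mathcal{O}_2$). Then there is a type I factor $N(\mathcal{O}_1, \mathcal{O}_2)$ such that
$$\mathcal{F}(\mathcal{O}_1) \subset N(\mathcal{O}_1, \mathcal{O}_2) \subset \mathcal{F}(\mathcal{O}_2). \tag{9}$$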
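The boost matrix $\Lambda(t)$ in assumption (ix) can be checked numerically. The following is only a minimal sketch, assuming the matrix has the standard form of a boost in the $x^1$-direction with rapidity $2\pi t$ and the metric signature $(+,-,-,-)$; the helper names (`boost`, `check_boost`, and so on) are hypothetical and chosen for illustration. It verifies, for sample parameters, the one-parameter group law $\Lambda(s)\Lambda(t) = \Lambda(s+t)$, the invariance of the Minkowski metric, and the fact that the right wedge $W_R$ is mapped into itself.

```python
import math

# Minkowski metric with signature (+, -, -, -); an assumption of this sketch.
ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]


def boost(t):
    """Boost in the x^1-direction with rapidity 2*pi*t (assumed form of Lambda(t))."""
    c, s = math.cosh(2 * math.pi * t), math.sinh(2 * math.pi * t)
    return [[c, -s, 0.0, 0.0],
            [-s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def apply(a, x):
    """Apply the matrix a to the vector x."""
    return [sum(a[i][k] * x[k] for k in range(len(x))) for i in range(len(a))]


def close(a, b, tol=1e-9):
    """Entrywise comparison of two matrices up to a tolerance."""
    return all(abs(p - q) < tol for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def in_right_wedge(x):
    """Membership in W_R = {x : x^1 > |x^0|}."""
    return x[1] > abs(x[0])


def check_boost(s=0.1, t=0.25, x=(0.3, 1.0, 2.0, -1.0)):
    """Return (group law holds, metric preserved, wedge preserved)."""
    group_law = close(matmul(boost(s), boost(t)), boost(s + t))
    lorentz = close(matmul(transpose(boost(t)), matmul(ETA, boost(t))), ETA)
    wedge = in_right_wedge(x) and in_right_wedge(apply(boost(t), x))
    return group_law, lorentz, wedge
```

The wedge check reflects the geometric content of the Bisognano-Wichmann property: the modular group of the wedge algebra acts as the boosts that leave that wedge invariant.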
Assumption (ii) says that $\mathcal{F}$ is a graded-local (or $\mathbb{Z}_2$-graded) net; for $F \in \mathcal{F}$ define $F_+ = (F + \alpha_\kappa(F))/2$ and $F_- = (F - \alpha_\kappa(F))/2$, so that $F = F_+ + F_-$. Then, given $F_i \in \mathcal{F}(\mathcal{O}_i)$, $i = 1, 2$, with $\mathcal{O}_1$ and $\mathcal{O}_2$ spacelike separated, the following normal (i.e., Bose-Fermi) commutation relations hold true:
$$F_{1+}F_{2+} = F_{2+}F_{1+}, \qquad F_{1+}F_{2-} = F_{2-}F_{1+}, \qquad F_{1-}F_{2-} = -F_{2-}F_{1-}.$$
Note that $\mathcal{F}^b$, the net formed by all the elements of $\mathcal{F}$ which are invariant under the $\mathbb{Z}_2$-grading, is truly local, while $\mathcal{F}^t$ is a graded-local net (under the same $\kappa$). Clearly $\mathcal{F}^{tt} = \mathcal{F}$.

Among the consequences of these axioms one has that $\mathcal{F}$ acts irreducibly on $\mathcal{H}$, $\mathcal{F}(M_4) = B(\mathcal{H})$. Moreover, $\Omega$ is $U$-invariant, and the algebras associated with wedge regions are (type $III_1$) factors. Strictly speaking, one can deduce TCP covariance from the other properties, see [30, Theorem 2.10]. By the connection between spin and statistics,
$$\kappa = U(-I, 0) \tag{10}$$
represents a rotation by angle $2\pi$ about any axis, see [30, Theorem 2.11]. (Of course, in the special case where $\kappa = 1$, we are back in the situation of a local (Bose) net as described in [10, Sect. 2].)
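The normal commutation relations and the role of the twist $Z = (I + i\kappa)/(1 + i)$ can be illustrated in a finite-dimensional toy model. The sketch below is not the net of the paper: it assumes a four-dimensional Hilbert space $\mathbb{C}^2 \otimes \mathbb{C}^2$, a grading $\kappa = \sigma_z \otimes \sigma_z$, and two odd generators `PSI1` and `PSI2` (hypothetical names) standing in for Fermi fields localized in spacelike separated regions. It checks that $Z$ is unitary, that the three commutation relations hold, that $ZFZ^* = i\kappa F$ for odd $F$, and that elements of the first algebra commute with twisted elements of the second, as in Eq. (2).

```python
def matmul(a, b):
    """Matrix product for matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def kron(a, b):
    """Kronecker (tensor) product of two matrices."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * p for p in row] for row in a]


def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


def close(a, b, tol=1e-12):
    return all(abs(p - q) < tol for ra, rb in zip(a, b) for p, q in zip(ra, rb))


I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SZ = [[1, 0], [0, -1]]
I4 = kron(I2, I2)
ZERO = scale(0, I4)

KAPPA = kron(SZ, SZ)   # toy grading operator, KAPPA^2 = I
PSI1 = kron(SX, I2)    # odd generator standing in for a field in O_1
PSI2 = kron(SZ, SX)    # odd generator standing in for a field in O_2
Z = scale(1 / (1 + 1j), add(I4, scale(1j, KAPPA)))


def alpha(f):
    """Grading automorphism F -> kappa F kappa."""
    return matmul(KAPPA, matmul(f, KAPPA))


def even(f):
    return scale(0.5, add(f, alpha(f)))


def odd(f):
    return scale(0.5, add(f, scale(-1, alpha(f))))


def twist(f):
    """F -> Z F Z^*."""
    return matmul(Z, matmul(f, dagger(Z)))


def comm(a, b):
    return add(matmul(a, b), scale(-1, matmul(b, a)))


def anticomm(a, b):
    return add(matmul(a, b), matmul(b, a))


def run_checks():
    f1 = add(scale(2, I4), scale(3, PSI1))    # generic element of the first toy algebra
    f2 = add(scale(5, I4), scale(-7, PSI2))   # generic element of the second toy algebra
    return {
        "Z is unitary": close(matmul(Z, dagger(Z)), I4),
        "even-even commute": close(comm(even(f1), even(f2)), ZERO),
        "even-odd commute": close(comm(even(f1), odd(f2)), ZERO),
        "odd-odd anticommute": close(anticomm(odd(f1), odd(f2)), ZERO),
        "twist of odd element is i*kappa*F": close(twist(PSI2), scale(1j, matmul(KAPPA, PSI2))),
        "twisted locality": close(comm(f1, twist(f2)), ZERO),
    }
```

In this toy model the twist turns the anticommuting pair into a commuting one, which is the finite-dimensional shadow of the inclusion $\mathcal{F}(\mathcal{O}_1) \subset \mathcal{F}^t(\mathcal{O}_2)'$.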
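From twisted Haag duality it follows that $\mathcal{F}(\mathcal{O}) = \cap_{\mathcal{O} \subset W} \mathcal{F}(W)$, thus $\mathcal{F}$ corresponds to an AB-system in the sense of [49]. Note that $Z\mathcal{F}(\mathcal{O})Z^* = \mathcal{F}^t(\mathcal{O})$, for every $\mathcal{O} \in \mathcal{K}$. Also, the Bisognano-Wichmann property entails twisted wedge duality, namely $Z\mathcal{F}(W')Z^* \,(= \mathcal{F}^t(W')) = \mathcal{F}(W)'$, see [30, Prop. 2.5]. Note that $\mathcal{F}^b(S)$ (as defined by additivity) does not necessarily coincide with $\mathcal{F}(S)^b := \{F \in \mathcal{F}(S) \mid \alpha_\kappa(F) = F\}$ for general, possibly disconnected, open sets $S$.

Definition 2.1. A covariant subsystem $\mathcal{B}$ of $\mathcal{F}$ is an isotonous (nontrivial) net of von Neumann algebras over $\mathcal{K}$, such that
$$\mathcal{B}(\mathcal{O}) \subset \mathcal{F}(\mathcal{O}), \qquad U(L)\mathcal{B}(\mathcal{O})U(L)^* = \mathcal{B}(L\mathcal{O})$$
for every $\mathcal{O} \in \mathcal{K}$ and $L \in \tilde{\mathcal{P}}$. Then we write $\mathcal{B} \subset \mathcal{F}$.

For instance, $\mathcal{F}^b$ is a covariant subsystem of $\mathcal{F}$. Clearly a covariant subsystem $\mathcal{B} \subset \mathcal{F}$ is local if and only if $\mathcal{B}(\mathcal{O}) \subset \mathcal{F}^b(\mathcal{O})$ for every $\mathcal{O} \in \mathcal{K}$. For any open set $S \subset M_4$, we also set
$$\mathcal{B}(S) = \bigvee_{\mathcal{O} \subset S} \mathcal{B}(\mathcal{O}).$$
If $\mathcal{B}_1$ and $\mathcal{B}_2$ are covariant subsystems of $\mathcal{F}$, we denote by $\mathcal{B}_1 \vee \mathcal{B}_2$ the covariant subsystem of $\mathcal{F}$ determined by $(\mathcal{B}_1 \vee \mathcal{B}_2)(\mathcal{O}) := \mathcal{B}_1(\mathcal{O}) \vee \mathcal{B}_2(\mathcal{O})$, $\mathcal{O} \in \mathcal{K}$. By the relation (10) a covariant subsystem $\mathcal{B} \subset \mathcal{F}$ naturally inherits the grading from $\mathcal{F}$, namely $\kappa\mathcal{B}(\mathcal{O})\kappa^* = \mathcal{B}(\mathcal{O})$. Accordingly, $\mathcal{B}^b$ will stand for the local net over $\mathcal{K}$ defined by $\mathcal{B}^b(\mathcal{O}) = \mathcal{B}(\mathcal{O})^b$. Also, $\mathcal{B}^t \subset \mathcal{F}^t$ is the (isotonous) net $\mathcal{B}^b + \kappa\mathcal{B}^f$. We denote by $\mathcal{H}_{\mathcal{B}} := \overline{\mathcal{B}(M_4)\Omega}$ the closed cyclic subspace generated by $\mathcal{B}$ acting on $\Omega$, and by $E_{\mathcal{B}}$ the corresponding orthogonal projection of $\mathcal{H}$ onto $\mathcal{H}_{\mathcal{B}}$. It follows at once that $E_{\mathcal{B}}$ commutes with $U$, thus with $\kappa$. We say that a covariant subsystem $\mathcal{B} \subset \mathcal{F}$ is Haag-dual if
$$\mathcal{B}(\mathcal{O}) = \bigcap_{W \in \mathcal{W},\, W \supset \mathcal{O}} \mathcal{B}(W).$$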
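As an example, $\mathcal{F}^b$ is a Haag-dual subsystem of $\mathcal{F}$. Note that a Haag-dual subsystem $\mathcal{B}$ does not satisfy twisted Haag duality on $\mathcal{H}_{\mathcal{F}}$ (unless $\mathcal{B} = \mathcal{F}$); however, it satisfies (twisted) Haag duality on its own vacuum Hilbert space $\mathcal{H}_{\mathcal{B}}$, and the latter property in turn characterizes Haag-dual subsystems. If $\mathcal{B}$ is Haag-dual, then it satisfies all the properties (i)-(x) listed above in restriction to $\mathcal{H}_{\mathcal{B}}$ with respect to the restricted representation $\hat{U}$ of $\tilde{\mathcal{P}}$ (and grading and TCP operators), as can be shown by essentially the same arguments given in the local case, see [10, Prop. 2.3]. We briefly discuss only the split property for non-local subsystems. By repeating the argument in the proof of [10, Prop. 2.3], mutatis mutandis, it suffices to show that $\mathcal{H}_{\mathcal{B}}$ is separating for $\mathcal{B}(\mathcal{O}_1) \vee \mathcal{B}^t(\mathcal{O}_2') \subset \mathcal{F}(\mathcal{O}_1) \vee \mathcal{F}(\mathcal{O}_2')$ for every pair of double cones with $\mathcal{O}_1 \subset\subset \mathcal{O}_2$. Pick $\mathcal{O}_0 \subset \mathcal{O}_1' \cap \mathcal{O}_2$. Then $\mathcal{F}(\mathcal{O}_0)^b \subset (\mathcal{B}(\mathcal{O}_1) \vee \mathcal{B}^t(\mathcal{O}_2'))'$ and $\overline{\mathcal{F}(\mathcal{O}_0)^b \mathcal{H}_{\mathcal{B}}} \supset \overline{\mathcal{F}(\mathcal{O}_0)^b \Omega} = \mathcal{H}^b \,(= \{\xi \in \mathcal{H} \mid \kappa\xi = \xi\})$. Consider $\mathcal{O} \in \mathcal{K}$, $\mathcal{O} \subset \mathcal{O}_2'$, and pick a Fermi unitary $u$ in $\mathcal{B}(\mathcal{O})$. (Such a unitary always exists, as can be seen by applying the polar decomposition to any Fermi element in $\mathcal{B}(\mathcal{O}_3)$ with $\mathcal{O}_3 \subset \mathcal{O}$ and then using Borchers property B for $\mathcal{B}^b$. The latter property holds as a consequence of the split property for $\mathcal{B}^b \subset \mathcal{F}^b$, inherited from the split property for $\mathcal{F}$.) Then $\overline{\mathcal{F}(\mathcal{O}_0)^b u\Omega} = u\mathcal{H}^b = \mathcal{H}^f \,(= \{\xi \in \mathcal{H} \mid \kappa\xi = -\xi\})$. Hence $\mathcal{H}_{\mathcal{B}}$ is cyclic for $(\mathcal{B}(\mathcal{O}_1) \vee \mathcal{B}^t(\mathcal{O}_2'))'$ and we are done.$^1$

If $\mathcal{B}$ is not Haag-dual it is always possible to consider the extension $\mathcal{B}^d$ defined by $\mathcal{B}^d(\mathcal{O}) = \cap_{W \in \mathcal{W},\, W \supset \mathcal{O}} \mathcal{B}(W)$; then $\mathcal{B}^d$ will be a Haag-dual covariant subsystem of $\mathcal{F}$ with $\mathcal{B}^d(W) = \mathcal{B}(W)$ for every wedge $W \in \mathcal{W}$ and $\mathcal{H}_{\mathcal{B}^d} = \mathcal{H}_{\mathcal{B}}$. Moreover, in restriction to $\mathcal{H}_{\mathcal{B}}$, $\mathcal{B}^d$ is the (twisted) dual net of $\mathcal{B}$, namely $\widehat{\mathcal{B}^d}(\mathcal{O}) = \hat{\mathcal{B}}^t(\mathcal{O}')'$ holds on $\mathcal{H}_{\mathcal{B}}$, where we used $\hat{\ }$ to denote the restriction of $\mathcal{B}$, resp. $\mathcal{B}^d$, to $\mathcal{H}_{\mathcal{B}}$. (However, in order to simplify the notation, we shall often write $\mathcal{B}$ in place of $\hat{\mathcal{B}}$, specifying if necessary when $\mathcal{B}$ acts on $\mathcal{H}$ or $\mathcal{H}_{\mathcal{B}}$.)

For later use we need to recall the notion of tensor product of graded local nets. Given two graded local nets $\mathcal{F}_1$ on $\mathcal{H}_1$ and $\mathcal{F}_2$ on $\mathcal{H}_2$ with grading involutive unitaries $\kappa_1$ and $\kappa_2$ respectively, their tensor product $\mathcal{F}$ on $\mathcal{H}_1 \otimes \mathcal{H}_2$ is defined$^2$ by setting
$$\mathcal{F}(\mathcal{O}) = \mathcal{F}_1(\mathcal{O}) \otimes \mathcal{F}_2(\mathcal{O})^b + \kappa_1\mathcal{F}_1(\mathcal{O}) \otimes \mathcal{F}_2(\mathcal{O})^f. \tag{11}$$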
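Then $\mathcal{F}$ is still a graded local net with the diagonal grading $\kappa = \kappa_1 \otimes \kappa_2$; moreover it satisfies twisted duality if both the $\mathcal{F}_i$ do [42]. Symbolically we write $\mathcal{F} = \mathcal{F}_1 \hat{\otimes} \mathcal{F}_2$ to stress that we are dealing with the graded tensor product.$^3$ One has $\mathcal{F}^b = (\mathcal{F}_1^t \otimes \mathcal{F}_2)^b$, $\mathcal{F}^f = \kappa_1 \otimes I\,(\mathcal{F}_1^t \otimes \mathcal{F}_2)^f$. Note that $(\mathcal{F}_1 \hat{\otimes} \mathcal{F}_2)^{\mathbb{Z}_2 \times \mathbb{Z}_2} = \mathcal{F}_1^b \otimes \mathcal{F}_2^b$. With our convention $\mathcal{F}_1$ sits inside $\mathcal{F}$ as $\mathcal{F}_1 \otimes I$; however, in general $\mathcal{F}_2$ does not share this property. Nevertheless, $\mathcal{F}_2 \ni F_2 \mapsto 1 \otimes F_{2+} + \kappa_1 \otimes F_{2-}$ is a (normal) representation of $\mathcal{F}_2$ in $\mathcal{F}$, unitarily equivalent to $F_2 \mapsto I \otimes F_2$. For instance, if the (Fermi) field $\psi_1$ (resp. $\psi_2$) generates $\mathcal{F}_1$ (resp. $\mathcal{F}_2$) then $\psi_1 \otimes I$ and $\kappa_1 \otimes \psi_2$ will generate $\mathcal{F} = \mathcal{F}_1 \hat{\otimes} \mathcal{F}_2$.

The graded tensor product $\mathcal{F} = \mathcal{F}_1 \hat{\otimes} \mathcal{F}_2$ is covariant with respect to the representation $U = U_1 \otimes U_2$ and with vacuum vector $\Omega = \Omega_1 \otimes \Omega_2$, whenever $\mathcal{F}_i$ is covariant with respect to the representation $U_i$ with vacuum vector $\Omega_i$, $i = 1, 2$. We say that two graded-local nets $\mathcal{F}_1$ and $\mathcal{F}_2$ as above are unitarily equivalent (or isomorphic) and write $\mathcal{F}_1 \simeq \mathcal{F}_2$ if there exists a unitary operator $W : \mathcal{H}_1 \to \mathcal{H}_2$ such that $W\mathcal{F}_1(\mathcal{O})W^* = \mathcal{F}_2(\mathcal{O})$ for every $\mathcal{O} \in \mathcal{K}$, $W\kappa_1W^* = \kappa_2$, $WU_1(L)W^* = U_2(L)$, $L \in \tilde{\mathcal{P}}$, and $W\Omega_1 = \Omega_2$.

For the reader's convenience we also recall some terminology and a few facts that will be used throughout the paper without any further mention. A representation $\{\pi, \mathcal{H}_\pi\}$ of the quasi-local $C^*$-algebra associated to a local (irreducible, Haag-dual) net, say $\mathcal{B}$, is said to satisfy the DHR selection criterion, or simply called a DHR representation, if for every double cone $\mathcal{O} \in \mathcal{K}$ there exists some unitary $V_{\mathcal{O}} : \mathcal{H}_\pi \to \mathcal{H}_{\mathcal{B}}$ such that
$$V_{\mathcal{O}}\pi(B)V_{\mathcal{O}}^* = \pi_0(B), \qquad B \in \mathcal{B}(\mathcal{O}_1),\ \mathcal{O}_1 \subset \mathcal{O}'. \tag{12}$$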
Here, $\pi_0$ denotes the identical (vacuum) representation of $\mathcal{B}$ on $\mathcal{H}_{\mathcal{B}}$. Unitary equivalence classes of irreducible DHR representations are called DHR superselection sectors or simply DHR sectors. The statistics of a DHR sector is described by the statistical dimension, taking values in $\mathbb{N} \cup \{\infty\}$, and a sign $\pm$ describing the Bose-Fermi alternative. It is well-known that a representation $\pi$ satisfies the DHR selection criterion if and only if it is unitarily equivalent to some representation of the form $\pi_0 \circ \rho$, where $\rho$ is a localized and transportable endomorphism of $\mathcal{B}$. An account of all this can be found e.g. in [31], see also [19, 20, 43]. In passing, we observe that the definition of DHR representation, as expressed by Eq. (12), carries over to graded-local nets; however, the correspondence with localized and transportable endomorphisms is lost.

The results discussed in this paper crucially rely on the analysis in [14], especially Theorem 4.7 therein. This provides support for one further assumption, which plays an important role in the sequel.

(A) Every representation of the local net $\mathcal{F}^b$ satisfying the DHR selection criterion is a (possibly infinite) direct sum of irreducible representations with finite statistics; moreover, an irreducible DHR representation that is inequivalent to the vacuum exists only when $\mathcal{F}^b \neq \mathcal{F}$ and then it is unique (up to unitary equivalence). In particular $\mathcal{F}$ itself is the canonical field net of $\mathcal{F}^b$ in the sense of [24].

The following proposition is useful to shed more light on Assumption (A).

Proposition 2.2. For a field net $\mathcal{F}$ as above, Assumption (A) is satisfied if and only if every DHR representation of $\mathcal{F}$ is a multiple of the vacuum representation.

Proof. Let $\pi$ be a DHR representation of $\mathcal{F}^b$. Then $\pi$ is unitarily equivalent to a subrepresentation of the restriction of a DHR representation of $\mathcal{F}$ to $\mathcal{F}^b$, see [14], p. 275 (the second paragraph following Proposition 4.3). If we assume that every DHR representation of $\mathcal{F}$ is a multiple of the vacuum representation, it follows that $\pi$ is (equivalent to) a direct sum of irreducible representations of $\mathcal{F}^b$ with finite statistics. If $\mathcal{F}^b \neq \mathcal{F}$ these are parametrized by $\hat{\mathbb{Z}}_2 \simeq \mathbb{Z}_2$. Conversely, assume that Condition (A) holds and let $\tilde{\pi}$ be a DHR representation of $\mathcal{F}$. Then, restricting $\tilde{\pi}$ to $\mathcal{F}^b$, it is not difficult to check that one gets a DHR representation of $\mathcal{F}^b$. By assumption, this restriction is thus equivalent to a direct sum of irreducible representations with finite statistics of $\mathcal{F}^b$. But then, it follows from [14, Theorem A.6] that $\tilde{\pi}$ itself is equivalent to a multiple of the vacuum representation of $\mathcal{F}$.

Starting from an observable algebra, the Doplicher-Roberts reconstruction theorem [24] supplies us with many examples of field nets satisfying all the structural assumptions as above, cf. Proposition 4.1.

$^1$ It is perhaps worth pointing out that the possibly stronger assumption (x$'$) that for every $\mathcal{O}_1$ and $\mathcal{O}_2$ in $\mathcal{K}$ with $\mathcal{O}_1 \subset\subset \mathcal{O}_2$ the triple $(\mathcal{F}(\mathcal{O}_1), \mathcal{F}(\mathcal{O}_2), \Omega)$ is a $W^*$-standard split inclusion in the sense of [22] (see Sect. 4) is also inherited by all Haag-dual subsystems.
$^2$ The net $\mathcal{F}_1 \otimes \mathcal{F}_2$ defined by means of the ordinary tensor product does not satisfy the normal commutation relations.
$^3$ Other equivalent definitions are possible, obtained e.g. by exchanging the role of the two components, or also $\mathcal{F}_1^t \hat{\otimes} \mathcal{F}_2 = (\mathcal{F}_1 \otimes \mathcal{F}_2)^b + (\kappa_1 \otimes I)(\mathcal{F}_1 \otimes \mathcal{F}_2)^f$.

3. Classification of Subsystems

Unless otherwise specified, throughout this section, $\mathcal{F}$ is a net satisfying the assumptions (i)–(x) and (A) of Sect. 2 and $\mathcal{B}$ denotes a local Haag-dual covariant subsystem of $\mathcal{F}$. Recall that by [14, Theorem 3.5] an inclusion of local nets $\mathcal{A} \subset \mathcal{B}$ satisfying suitable properties induces an inclusion of the canonical field nets $\mathcal{F}_{\mathcal{A}} \subset \mathcal{F}_{\mathcal{B}}$ compatible with the grading and thus a fortiori also $\mathcal{F}_{\mathcal{A}}^b \subset \mathcal{F}_{\mathcal{B}}^b$ and $\mathcal{F}_{\mathcal{A}}^f \subset \mathcal{F}_{\mathcal{B}}^f$. In particular, in our setting, from $\mathcal{B} \subset \mathcal{F}^b$ we get $\mathcal{B} \subset \mathcal{F}_{\mathcal{B}} \subset \mathcal{F}$ acting on $\mathcal{H}$ and $\mathcal{B} \subset \mathcal{F}_{\mathcal{B}}^b \subset \mathcal{F}^b$ best considered as acting on $\mathcal{H}^b$. Moreover these inclusions are compatible with the action of the Poincaré group and thus, in particular, $\mathcal{F}_{\mathcal{B}}$ is a covariant subsystem of $\mathcal{F}$, cf. [10] p. 96 and [11, Theorem 2.11].
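As usual, for a covariant subsystem $\mathcal{B} \subset \mathcal{F}$ we introduce the coset subsystem defined by $\mathcal{B}^c(\mathcal{O}) = \mathcal{B}(M_4)' \cap \mathcal{F}(\mathcal{O})$, $\mathcal{O} \in \mathcal{K}$. Then $\mathcal{B}^c \subset \mathcal{F}$ is a Haag-dual covariant subsystem. In principle $\mathcal{B}^c \subset \mathcal{F}$ could contain a non-trivial odd part, in which case it is convenient to consider also $\mathcal{B}^{cb} \subset \mathcal{F}^b$. If $\mathcal{B}$ is local then $\mathcal{B}^c$ could be graded local, but if $\mathcal{B}$ is truly graded local then $\mathcal{B}^c$ has to be local. As in [10] we say that $\mathcal{B}$ is full in $\mathcal{F}$ if $\mathcal{B}^c$ is trivial. Note however that in [14] the expression “full” is used in relation to subsystems with a different meaning.

We take a similar route as in [10]. We denote by $\pi_0$ the vacuum representation of $\mathcal{B}$ on $\mathcal{H}_{\mathcal{B}}$, by $\pi^0$ that of $\mathcal{F}$ on $\mathcal{H}$ and by $\pi$ the representation of $\mathcal{B}$ on $\mathcal{H}$. $\pi$ satisfies the DHR selection criterion, hence $\pi \simeq \pi_0 \circ \rho$ for some localized and transportable $\rho$. Note that $\pi_0$ is a subrepresentation of $\pi$, thus $\mathrm{id} \prec \rho$. We have the following result, cf. [10, Proposition 3.2].

Theorem 3.1. In our situation, all DHR sectors of $\mathcal{B}$ are covariant with positive energy and they have finite statistics. Furthermore there are at most countably many such DHR sectors and the actual representation of $\mathcal{B}$ on $\mathcal{H}$ is a direct sum of them in which every DHR sector appears with non zero multiplicity.

Proof. Let $\sigma$ be an irreducible localized transportable endomorphism of $\mathcal{B}$. Since $\mathcal{B}$ and $\mathcal{F}$ are relatively local we can extend $\sigma$ to a localized transportable endomorphism $\hat{\sigma}$ of $\mathcal{F}$, see [14, Lemma 2.1] (and the paragraph preceding it). Then, by Assumption (A), $\hat{\sigma}$, considered as a representation of $\mathcal{F}$, is normal on $\mathcal{F}^b$ and thus is normal on $\mathcal{F}$ by [14, Theorem A.6]. Since $\mathcal{F}$ is irreducible in $\mathcal{H}$ and each normal representation of $B(\mathcal{H})$ is a multiple of the identical one, we find $\pi^0 \circ \hat{\sigma} \simeq \oplus_{i \in I}\, \pi^0$ for some finite or countable index set $I$. After restriction of both sides to $\mathcal{B}$, $\pi \circ \sigma \simeq \oplus_{i \in I}\, \pi$. From now on the same proof as in [10, Proposition 3.1] goes through.

Proposition 3.2. The embedding $\tilde{\pi}$ of $\mathcal{F}_{\mathcal{B}}$ into $\mathcal{F}$ satisfies $\tilde{\pi} \simeq \tilde{\pi}_0 \otimes I$, where $\tilde{\pi}_0$ is the vacuum representation of $\mathcal{F}_{\mathcal{B}}$ on $\mathcal{H}_{\mathcal{F}_{\mathcal{B}}}$.

Proof. By our previous result the actual representation $\tilde{\pi}$ of $\mathcal{F}_{\mathcal{B}}$ on $\mathcal{H}$ is normal with respect to the canonical representation of $\mathcal{B}$ on $\mathcal{H}_{\mathcal{F}_{\mathcal{B}}}$ and thus, by [14], normal with respect to the actual (irreducible) representation of $\mathcal{F}_{\mathcal{B}}$ on $\mathcal{H}_{\mathcal{F}_{\mathcal{B}}}$. The conclusion now follows as in the proof of Theorem 3.1 with $\mathcal{F}_{\mathcal{B}}$ instead of $\mathcal{F}$.

Besides, the split property for $\mathcal{F}$ implies that $\mathcal{F}_{\mathcal{F}_{\mathcal{B}}^b} = \mathcal{F}_{\mathcal{B}}$, see [10]. The following theorem will be of crucial importance. We postpone its lengthy proof to Appendix A.

Theorem 3.3. If $\mathcal{F}_{\mathcal{B}}$ is full in $\mathcal{F}$ then $\mathcal{F}_{\mathcal{B}}^b(W)' \cap \mathcal{F}(W) = \mathbb{C}I$ for every wedge $W$.

We are now ready to state our first classification result:

Theorem 3.4. Let $\mathcal{B}$ be a Haag-dual local covariant subsystem of $\mathcal{F}$ and let $\mathcal{F}_{\mathcal{B}}$ be full in $\mathcal{F}$, i.e. $\mathcal{F}_{\mathcal{B}}^c(\mathcal{O}) \equiv \mathcal{F}_{\mathcal{B}}(M_4)' \cap \mathcal{F}(\mathcal{O}) = \mathbb{C}I$ for every $\mathcal{O} \in \mathcal{K}$. Then $\mathcal{F}_{\mathcal{B}} = \mathcal{F}$. In particular if $\mathcal{B}$ is full, then there is a compact group $G$ of unbroken internal symmetries of $\mathcal{F}$ (with $\kappa \in Z(G)$, the center of $G$) such that $\mathcal{B} = \mathcal{F}^G$.

Proof. Let us denote by $\mathcal{M}$ the covariant subsystem $\mathcal{F}_{\mathcal{B}}^b$. By Theorem 3.3 we have, for any wedge $W$, $\mathcal{M}(W)' \cap \mathcal{F}(W) = \mathbb{C}I$. Let $\pi_0^m$ and $\pi^m$ denote the vacuum representation of $\mathcal{M}$ and the identical representation of $\mathcal{M}$ on $\mathcal{H}$ respectively. Then, as shown below, one can prove that $\pi_0^m$ appears only once in $\pi^m$. Hence $\tilde{\pi} \simeq \tilde{\pi}_0$ (same notation as in Proposition 3.2), namely also the multiplicity of $\tilde{\pi}_0$ in $\tilde{\pi}$ is one. Thus $\mathcal{H}_{\mathcal{F}_{\mathcal{B}}} = \mathcal{H}$ and since $\mathcal{F}_{\mathcal{B}} \subset \mathcal{F}$ the conclusion follows (e.g. by twisted Haag duality).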
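Fix a wedge $W$ and consider the set $I = \{W + a \mid a \in \mathbb{R}^4\}$ ordered under inclusion. Then $\mathcal{M}(W + a) \subset \mathcal{F}(W + a)$ defines a directed standard net of subfactors with a standard conditional expectation as defined in [40, Sect. 3]. Set $\check{\mathcal{M}} = (\cup_a \mathcal{M}(W + a))^{-\|\cdot\|}$, $\check{\mathcal{F}} = (\cup_a \mathcal{F}(W + a))^{-\|\cdot\|}$; then of course $\mathcal{M} \subset \check{\mathcal{M}} \subset \mathcal{M}(M_4) = \check{\mathcal{M}}''$ and $\mathcal{F} \subset \check{\mathcal{F}} \subset \check{\mathcal{F}}'' = \mathcal{F}(M_4) = B(\mathcal{H})$. Let $\check{\pi}^0$ denote the representation of $\check{\mathcal{F}}$ on $\mathcal{H}$, $\check{\pi}_0^m$ that of $\check{\mathcal{M}}$ on $\mathcal{H}_{\mathcal{M}}$ and $\check{\pi}^m = \check{\pi}^0|_{\check{\mathcal{M}}}$; clearly $\pi^0 = \check{\pi}^0|_{\mathcal{F}}$, $\pi_0^m = \check{\pi}_0^m|_{\mathcal{M}}$ and $\pi^m = \check{\pi}^m|_{\mathcal{M}}$. By [40, Corollary 3.3] one can construct an endomorphism $\check{\gamma} : \check{\mathcal{F}} \to \check{\mathcal{M}}$ such that $\check{\gamma}|_{\mathcal{F}(W + a)}$ is Longo's canonical endomorphism of $\mathcal{F}(W + a)$ into $\mathcal{M}(W + a)$ whenever $W \subset W + a$; moreover $\check{\gamma}$ acts trivially on $\check{\mathcal{M}} \cap \mathcal{F}(W)'$. It follows from [40, Proposition 3.4] that
$$\check{\pi}^m \simeq \check{\pi}_0^m \circ \check{\rho},$$
where $\check{\rho} = \check{\gamma}|_{\check{\mathcal{M}}}$. Note that $\check{\rho}(\mathcal{M}(W + a)) \subset \mathcal{M}(W + a)$ if $W \subset W + a$. We set $\check{\rho}_W := \check{\rho}|_{\mathcal{M}(W)}$. It follows from Theorem 3.3 and [25, Corollary 4.2] (see also [26, Theorem 5.2]) that $\iota_{\mathcal{M}(W)} \prec \check{\rho}_W$ with multiplicity one.

Let us assume now that $\pi_0^m \prec \pi^m$ more than once. Then $\check{\pi}_0^m \prec \check{\pi}^m$ more than once. Let $V_1, V_2 \in (\check{\pi}_0^m, \check{\pi}_0^m \circ \check{\rho})$ be isometries with orthogonal ranges. Since the net $\mathcal{M}$ is relatively local with respect to $\mathcal{F}$, the $C^*$-algebra $\mathcal{M}_0(W')$ generated by all the $\mathcal{M}(\mathcal{O})$ with $\mathcal{O} \subset W'$ is contained in $\check{\mathcal{M}} \cap \mathcal{F}(W)'$ and hence the action of $\check{\rho}$ is trivial on it. For $i = 1, 2$ and every $M \in \mathcal{M}_0(W')$ we have $V_i\check{\pi}_0^m(M) = \check{\pi}_0^m \circ \check{\rho}(M)V_i = \check{\pi}_0^m(M)V_i$, i.e. $V_i \in \check{\pi}_0^m(\mathcal{M}_0(W'))' = \check{\pi}_0^m(\mathcal{M}(W))$ and thus $V_i = \check{\pi}_0^m(W_i)$ with $W_i \in \mathcal{M}(W)$ ($i = 1, 2$). But then it follows that $W_1, W_2 \in (\iota_{\mathcal{M}(W)}, \check{\rho}_W)$ (notice that $\check{\pi}_0^m$ is faithful) and they are isometries with orthogonal ranges, which is a contradiction.

We now turn our attention to non full subsystems. We begin with the following

Lemma 3.5. Let $\mathcal{B}$ be a (not necessarily local) covariant subsystem of $\mathcal{F}$. Then $\mathcal{B} \vee \mathcal{B}^c$ is full in $\mathcal{F}$.

Proof. Since we have $(\mathcal{B} \vee \mathcal{B}^c)(M_4) = \mathcal{B}(M_4) \vee \mathcal{B}^c(M_4)$, then
$$(\mathcal{B} \vee \mathcal{B}^c)(M_4)' = \mathcal{B}(M_4)' \cap \mathcal{B}^c(M_4)'.$$
Hence, if a double cone $\mathcal{O}$ is contained in a wedge $W$, then
$$(\mathcal{B} \vee \mathcal{B}^c)(M_4)' \cap \mathcal{F}(\mathcal{O}) = \mathcal{B}^c(\mathcal{O}) \cap \mathcal{B}^c(M_4)' \subset \mathcal{B}^c(W) \cap \mathcal{B}^c(W)'.$$
Hence the conclusion follows because $\mathcal{B}^c(W)$ is a factor.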
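Proposition 3.6. If $\mathcal{B}$ is a local covariant subsystem of $\mathcal{F}$, then one has $\mathcal{F}_{(\mathcal{B}^c)^b} = \mathcal{B}^c$ on $\mathcal{H}_{\mathcal{F}}$.

Proof. Let $\tau$ be a transportable endomorphism of $(\mathcal{B}^c)^b$, say localized in the double cone $\mathcal{O}$, and $\hat{\tau}$ its functorial extension to $\mathcal{F}^b \supset (\mathcal{B}^c)^b$. Since $\hat{\tau}$ is implemented by a 1-cocycle in $(\mathcal{B}^c)^b \subset \mathcal{B}'$ we get $\hat{\tau}(b) = b$, $b \in \mathcal{B}$, hence for the corresponding implementing Hilbert space of isometries in $\mathcal{F} = \mathcal{F}_{\mathcal{F}^b}$ we have $H_{\hat{\tau}} \subset \mathcal{B}' \cap \mathcal{F}(\mathcal{O}) = \mathcal{B}^c(\mathcal{O})$. Letting $\tau$ vary in the set $\Delta_{(\mathcal{B}^c)^b}(\mathcal{O})$ of all transportable morphisms of $(\mathcal{B}^c)^b$ localized in $\mathcal{O}$, such Hilbert spaces generate $\mathcal{F}_{(\mathcal{B}^c)^b}(\mathcal{O})$ and then it follows that $\mathcal{F}_{(\mathcal{B}^c)^b} \subset \mathcal{B}^c$. Moreover it is not difficult to see that the inclusion $\mathcal{B}^c \subset \mathcal{F}_{(\mathcal{B}^c)^b}$ also holds and thus the conclusion follows.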
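Proposition 3.7. The field net $\mathcal{F}_{\mathcal{B} \vee (\mathcal{B}^c)^b}$ acting on its own vacuum Hilbert space is canonically isomorphic to $\widehat{\mathcal{F}_{\mathcal{B}}} \hat{\otimes} \widehat{\mathcal{B}^c}$ on $\mathcal{H}_{\mathcal{F}_{\mathcal{B}}} \otimes \mathcal{H}_{\mathcal{B}^c}$ as defined in formula (11) via the map
$$FB \mapsto (\hat{F} \otimes I)(I \otimes \hat{B}_+ + \hat{\kappa} \otimes \hat{B}_-), \tag{13}$$
where $F \in \mathcal{F}_{\mathcal{B}} \subset \mathcal{F}_{\mathcal{B} \vee (\mathcal{B}^c)^b}$ and $B = B_+ + B_- \in \mathcal{B}^c \subset \mathcal{F}_{\mathcal{B} \vee (\mathcal{B}^c)^b}$.

Proof. Standard arguments relying on the results in [46] show that the vacuum is a product state for $\mathcal{B} \vee (\mathcal{B}^c)^b$ (cf. [4, Subsect. VI.4]). It follows that the local net $\mathcal{B} \vee (\mathcal{B}^c)^b$ acting on $\mathcal{H}_{\mathcal{B} \vee (\mathcal{B}^c)^b}$ is canonically isomorphic to $\mathcal{B} \otimes (\mathcal{B}^c)^b$ acting on $\mathcal{H}_{\mathcal{B}} \otimes \mathcal{H}_{(\mathcal{B}^c)^b}$. It follows from (the proof of) Theorem 3.1 (cf. also Proposition 4.1) that every factorial DHR representation of $\mathcal{B}$ is a multiple of an irreducible DHR representation with finite statistical dimension (in particular it is a type I representation). As a consequence every irreducible DHR representation of $\mathcal{B} \otimes (\mathcal{B}^c)^b$ is unitarily equivalent to a tensor product representation and therefore by Theorem 3.1 and Proposition 3.6 it is realized up to equivalence as a subrepresentation on $\mathcal{H}_{\mathcal{F}_{\mathcal{B}}} \otimes \mathcal{H}_{\mathcal{B}^c}$. Hence, $\mathcal{F}_{\mathcal{B} \vee (\mathcal{B}^c)^b}$ being generated by the product of Hilbert spaces of isometries in $\mathcal{F}_{\mathcal{B}}$, resp. $\mathcal{B}^c$, one has $\mathcal{F}_{\mathcal{B} \vee (\mathcal{B}^c)^b} = \mathcal{F}_{\mathcal{B}} \vee \mathcal{B}^c$ on $\mathcal{H}_{\mathcal{F}_{\mathcal{B} \vee (\mathcal{B}^c)^b}}$ and the conclusion follows from the uniqueness of the canonical field net [24, Theorem 3.6] along with formula (11).

To give more clues on formula (13) we include a more detailed argument. It follows from Proposition 3.2 (with $\mathcal{F}_1 = \mathcal{F}_{\mathcal{B} \vee (\mathcal{B}^c)^b}$ in place of $\mathcal{F}$) that there exists a unitary $W : \mathcal{H}_{\mathcal{F}_1} \to \mathcal{H}_{\mathcal{F}_{\mathcal{B}}} \otimes \mathcal{K}$, for some Hilbert space $\mathcal{K}$, such that
$$WFW^* = \hat{F} \otimes I, \qquad F \in \mathcal{F}_{\mathcal{B}} \subset \mathcal{F}_1.$$
Since $\mathcal{F}_{\mathcal{B}}$ and $(\mathcal{B}^c)^b$ commute (by an argument similar to the proof of Proposition 3.6), it follows that
$$WBW^* =: I \otimes \tau(B), \qquad B \in (\mathcal{B}^c)^b \subset \mathcal{F}_1.$$
Moreover, using the easily proven fact that the grading decomposes as a tensor product, namely $W\kappa W^* = \hat{\kappa} \otimes \hat{\kappa}_c$ for some $\hat{\kappa}_c$, that $(\mathcal{B}^c)^f$ commutes with $\mathcal{F}_{\mathcal{B}}^b$ and that $(\mathcal{B}^c)^f(\mathcal{B}^c)^f \subset (\mathcal{B}^c)^b$, one also deduces that
$$WBW^* =: \hat{\kappa} \otimes \tau(B), \qquad B \in (\mathcal{B}^c)^f \subset \mathcal{F}_1.$$
Since the representation of $\mathcal{B}^c$ on $\mathcal{H}_{\mathcal{F}_1}$ is a multiple of the vacuum representation and $\tau$ is irreducible, without loss of generality one can identify $\mathcal{K}$ with $\mathcal{H}_{\mathcal{B}^c}$ and $\tau$ with the vacuum representation of $\mathcal{B}^c$. But this is exactly formula (13) and we are done.

We are now ready to state the complete classification theorem for local covariant subsystems of $\mathcal{F}$.

Theorem 3.8. Let $\mathcal{B}$ be a local covariant Haag-dual subsystem of $\mathcal{F}$. Then $\mathcal{F} = \mathcal{F}_{\mathcal{B}} \vee \mathcal{B}^c$ on $\mathcal{H}_{\mathcal{F}}$ and the net of inclusions $\mathcal{O} \mapsto \mathcal{B}(\mathcal{O}) \subset \mathcal{F}(\mathcal{O})$ on $\mathcal{H}_{\mathcal{F}}$ is canonically isomorphic to $\mathcal{O} \mapsto \widehat{\mathcal{F}_{\mathcal{B}}}(\mathcal{O})^H \otimes I \subset \widehat{\mathcal{F}_{\mathcal{B}}}(\mathcal{O}) \hat{\otimes} \widehat{\mathcal{B}^c}(\mathcal{O})$ on $\mathcal{H}_{\mathcal{F}_{\mathcal{B}}} \otimes \mathcal{H}_{\mathcal{B}^c}$, where $H$ is the canonical gauge group of $\mathcal{B}$.

Proof. By Proposition 3.6, we have the following chain of inclusions on $\mathcal{H}_{\mathcal{F}}$:
$$\mathcal{B} \vee \mathcal{B}^c \subset \mathcal{F}_{\mathcal{B}} \vee \mathcal{B}^c = \mathcal{F}_{\mathcal{B}} \vee \mathcal{F}_{(\mathcal{B}^c)^b} \subset \mathcal{F}_{\mathcal{B} \vee (\mathcal{B}^c)^b} \subset \mathcal{F}.$$
Thus, by Lemma 3.5, $\mathcal{F}_{\mathcal{B} \vee (\mathcal{B}^c)^b}$ is full in $\mathcal{F}$ and by Theorem 3.4 we have $\mathcal{F} = \mathcal{F}_{\mathcal{B} \vee (\mathcal{B}^c)^b}$. Hence the conclusion follows from Proposition 3.7.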
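The factor $\hat{\kappa}$ in formula (13), like $\kappa_1$ in the graded tensor product (11), is what keeps odd elements of the two factors anticommuting. The following toy check is only a sketch under stated assumptions: each factor is modelled by the full matrix algebra on $\mathbb{C}^2$ with grading $\sigma_z$ (so diagonal matrices are even and off-diagonal ones are odd), and the function names `embed_first` and `embed_second` are hypothetical. It verifies that the ordinary tensor product makes odd elements commute, that the twisted embedding makes them anticommute, and that $B \mapsto I \otimes B_+ + \kappa_1 \otimes B_-$ is multiplicative.

```python
def matmul(a, b):
    """Matrix product for matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def kron(a, b):
    """Kronecker (tensor) product of two matrices."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * p for p in row] for row in a]


def close(a, b, tol=1e-12):
    return all(abs(p - q) < tol for ra, rb in zip(a, b) for p, q in zip(ra, rb))


I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SZ = [[1, 0], [0, -1]]
KAPPA1 = SZ   # toy grading of the first factor
KAPPA2 = SZ   # toy grading of the second factor
ZERO4 = scale(0, kron(I2, I2))


def even(b, kappa):
    """Even part (b + kappa b kappa) / 2."""
    return scale(0.5, add(b, matmul(kappa, matmul(b, kappa))))


def odd(b, kappa):
    """Odd part (b - kappa b kappa) / 2."""
    return scale(0.5, add(b, scale(-1, matmul(kappa, matmul(b, kappa)))))


def embed_first(f):
    """F -> F (x) I."""
    return kron(f, I2)


def embed_second(b):
    """B -> I (x) B_+ + kappa_1 (x) B_-."""
    return add(kron(I2, even(b, KAPPA2)), kron(KAPPA1, odd(b, KAPPA2)))


def comm(a, b):
    return add(matmul(a, b), scale(-1, matmul(b, a)))


def anticomm(a, b):
    return add(matmul(a, b), matmul(b, a))


def run_checks():
    f = [[1, 2], [3, 4]]          # generic element of the first toy factor
    m = [[0, 1j], [2, -1]]        # generic elements of the second toy factor
    n = [[3, -1], [1j, 5]]
    f_odd, m_even, m_odd = odd(f, KAPPA1), even(m, KAPPA2), odd(m, KAPPA2)
    return {
        "ordinary tensor product: odd parts commute":
            close(comm(kron(SX, I2), kron(I2, SX)), ZERO4),
        "twisted embedding: odd parts anticommute":
            close(anticomm(embed_first(f_odd), embed_second(m_odd)), ZERO4),
        "odd of first factor commutes with even of second":
            close(comm(embed_first(f_odd), embed_second(m_even)), ZERO4),
        "embedding of second factor is multiplicative":
            close(embed_second(matmul(m, n)), matmul(embed_second(m), embed_second(n))),
    }
```

The first check is the finite-dimensional counterpart of the remark that the ordinary tensor product does not satisfy the normal commutation relations; the remaining ones show why the twisted embedding does.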
Remark 3.9. If $\mathcal{B}$ is an arbitrary covariant subsystem of $\mathcal{F}$ satisfying twisted Haag duality on its vacuum Hilbert space $\mathcal{H}_{\mathcal{B}}$ then we can apply Theorem 3.8 to $\mathcal{B}^b$. Hence the inclusions $\mathcal{B}^b \subset \mathcal{B} \subset \mathcal{F}_{\mathcal{B}^b}$ allow us to use Theorem 3.8 to classify all (not necessarily local) covariant subsystems of $\mathcal{F}$ satisfying Haag duality or twisted Haag duality on their own vacuum Hilbert space.
4. Nets Generated by Local Generators of Symmetries

In this section we focus our attention on nets of observables generated by the local generators of symmetries, i.e. those arising in the framework of the Quantum Noether Theorem [5]. For some background on these nets we refer the reader to [12, 13, 10, 11] (see also [6] for some related issues). The problem we are interested in is to find structural conditions ensuring that a given observable net is generated by such local generators of symmetries, see [18] (cf. also [38]).

We consider an observable net $\mathcal{A}$ satisfying the assumptions (i)-(x) of Sect. 2, with graded locality (resp. twisted Haag duality) replaced by locality (resp. Haag duality). Notice that Borchers' property B for $\mathcal{A}$ follows e.g. from the split property. We further require $\mathcal{A}$ to have countably many DHR superselection sectors, all of which have finite statistical dimension. Let $\mathcal{F} = \mathcal{F}_{\mathcal{A}}$ and $G = G_{\mathcal{A}}$ be the canonical field net and the compact gauge group of $\mathcal{A}$. Arguing as in [10, Theorem 4.1], one can then show that the assumption (A) for $\mathcal{F}$ introduced in Sect. 2 is indeed satisfied, by virtue of [14, Theorem 4.7]. We record this result here, as a slight improvement of [10, Theorem 4.1].

Proposition 4.1. Let $\mathcal{A}$ be an isotonous net satisfying Haag duality and the split property on its irreducible vacuum representation. If $\mathcal{A}$ has at most countably many DHR superselection sectors, all of which have finite statistical dimension, then every DHR representation of $\mathcal{A}$ is unitarily equivalent to a (possibly infinite) direct sum of irreducible ones. Moreover, the canonical field net $\mathcal{F}_{\mathcal{A}}$ satisfies assumption (A) in Sect. 2.

All the other properties (i)-(ix) for $\mathcal{F}$ are also satisfied, cf. the discussion in [10, Subsect. 4.2]. However, in order to apply the analysis in the previous section to the present situation we also have to assume property (x) in Sect. 2 to hold for $\mathcal{F}$, since it is still unknown whether the split property for $\mathcal{A}$ implies the split property for $\mathcal{F}_{\mathcal{A}}$.
Finally, in order to construct the local symmetry implementations as in [5, 21] we need to assume that, for each pair of double cones $\mathcal{O}_1, \mathcal{O}_2$ with $\mathcal{O}_1 \subset\subset \mathcal{O}_2$, the vacuum vector $\Omega$ is cyclic for the von Neumann algebra $\mathcal{F}(\mathcal{O}_1)' \cap \mathcal{F}(\mathcal{O}_2)$. As a consequence the triple $(\mathcal{F}(\mathcal{O}_1), \mathcal{F}(\mathcal{O}_2), \Omega)$ is a $W^*$-standard split inclusion in the sense of [22]. This last assumption is clearly redundant if $\mathcal{F}$ is local, since in this case it follows directly from the Reeh-Schlieder property. Moreover, as shown in the introduction of [21], in the graded local case it would be a consequence of the split property and the Reeh-Schlieder property for $\mathcal{F}$ if $\mathcal{F}^b$ satisfies additivity and the time slice axiom as in [17, Sect. 1]. We denote by $\Psi_\Lambda$ the universal localizing map associated with the triple
$$\Lambda = (\mathcal{F}(\mathcal{O}_1), \mathcal{F}(\mathcal{O}_2), \Omega), \tag{14}$$
a $*$-isomorphism of $B(\mathcal{H})$ onto the canonical interpolating factor of type I between $\mathcal{F}(\mathcal{O}_1)$ and $\mathcal{F}(\mathcal{O}_2)$, see [5, Sect. 3].
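Let
$$K := G_{\max} \equiv \{k \in U(\mathcal{H}) \mid k\mathcal{F}(\mathcal{O})k^* = \mathcal{F}(\mathcal{O})\ \ \forall \mathcal{O} \in \mathcal{K},\ k\Omega = \Omega\} \supset G$$
be the maximal group of unbroken (unitary) internal symmetries of $\mathcal{F}$, which in our setting is automatically strongly compact and commutes with Poincaré transformations [22, Theorem 10.4], and let $\mathcal{C}$ be the net generated by the local version of the energy-momentum operator [12], defined by
$$\mathcal{C}(\mathcal{O}) := \bigvee_{\mathcal{O}_1, \mathcal{O}_2:\ \mathcal{O}_1 \subset \mathcal{O}_2 \subseteq \mathcal{O}} \Psi_{(\mathcal{F}(\mathcal{O}_1), \mathcal{F}(\mathcal{O}_2), \Omega)}\big(U(I, \mathbb{R}^4)\big).$$
The known properties of the universal localizing map [5, 22] imply that $\mathcal{C}$ is a covariant subsystem of $\mathcal{F}$ such that
$$\mathcal{C}(\mathcal{O}) \subset \mathcal{F}(\mathcal{O})^K \subset \mathcal{F}(\mathcal{O})^G. \tag{15}$$
Moreover, by [15, Corollary 2.5], we have
$$U(I, \mathbb{R}^4) \subset \mathcal{C}(M_4) = \mathcal{C}^d(M_4). \tag{16}$$
From Theorem 3.4, the following result readily follows:

Theorem 4.2. The net $\mathcal{C}$ is a full covariant subsystem of $\mathcal{F}$ such that
$$\mathcal{C}^d = \mathcal{F}^K. \tag{17}$$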
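This equality was already proved in [10] in the case where $\mathcal{F}$ is Bosonic. As a consequence of Eq. (17) one has $\mathcal{F} = \mathcal{F}_{\mathcal{C}^d}$ and $K \simeq \mathrm{Aut}_{\mathcal{C}^d}(\mathcal{F})$ [14, Proposition 4.3]. It follows at once that $\mathcal{A} = \mathcal{C}^d$ if and only if $G = K$, if and only if $\mathcal{A}$ has no proper Haag-dual subsystem full in $\mathcal{F}$, cf. [10, Corollary 4.2].

In the following we exploit, to some extent, the isomorphism between the lattice structure of subgroups of $K$ and that of "intermediate" subsystems of $\mathcal{F}$. Actually our arguments rest only on the validity of the equality (17). The internal symmetries in $K$ leaving $\mathcal{A} = \mathcal{F}^G$ globally invariant are exactly those in $N_K(G) := \{k \in K \mid kGk^{-1} = G\}$, the normalizer of $G$ in $K$. The internal symmetry group of $\mathcal{A}$ can thus be identified with $N_K(G)/G$ [7, Proposition 3.1]. We consider the local extension $\mathcal{C}_G$ of $\mathcal{C}$ in $\mathcal{F}$ by the local currents associated with $G$, $\mathcal{C} \subset \mathcal{C}_G \subset \mathcal{F}$, where
$$\mathcal{C}_G(\mathcal{O}) := \bigvee_{\mathcal{O}_1, \mathcal{O}_2:\ \mathcal{O}_1 \subset \mathcal{O}_2 \subseteq \mathcal{O}} \Psi_{(\mathcal{F}(\mathcal{O}_1), \mathcal{F}(\mathcal{O}_2), \Omega)}\big(U(I, \mathbb{R}^4) \vee (G'' \cap G')\big).$$
Then $\mathcal{C}_G^d = \mathcal{F}^{\tilde{G}}$ for some $\tilde{G} \subset K$ [14, Theorem 4.1]; furthermore, $\mathcal{F} = \mathcal{F}_{\mathcal{C}_G^d}$ and $\tilde{G} \simeq \mathrm{Aut}_{\mathcal{C}_G^d}(\mathcal{F})$, again by Theorem 3.4. Of course, $k \in K$ belongs to $\tilde{G}$ if and only if for each $\Lambda$ one has
$$k\,\Psi_\Lambda(X)\,k^{-1} = \Psi_\Lambda(kXk^{-1}) = \Psi_\Lambda(X), \quad X \in G'' \cap G', \tag{18}$$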
if and only if
$$kXk^{-1} = X, \quad X \in G'' \cap G'. \tag{19}$$
Notice that now $\Lambda$ has disappeared from our condition. In particular $G \subset \tilde{G}$, so that $\mathcal{C} \subset \mathcal{C}_G \subset \mathcal{F}^G \subset \mathcal{F}$. It is also direct to check that $\tilde{G} \subset N_K(G)$, i.e. $\tilde{G}$ leaves $\mathcal{F}^G$ globally invariant. To see this, we observe that the orthogonal projection $E_G$ of $\mathcal{H} = \overline{\mathcal{F}\Omega}$ onto $\mathcal{H}_G = \overline{\mathcal{F}^G\Omega}$ lies in $(\mathcal{F}^G)' \cap (\mathcal{F}^G)'' = G'' \cap G' \equiv Z(G'')$, hence $[\tilde{G}, E_G] = 0$; then for any $\psi \in \mathcal{H}_G$ and $k \in \tilde{G}$ one has $E_G k\psi = k\psi$, hence $khk^{-1}\psi = \psi$, $h \in G$, and the conclusion follows since
$$G = \{k \in K \mid k\psi = \psi,\ \psi \in \mathcal{H}_G\}.$$
Therefore if $k \in \tilde{G}$ then $\mathrm{Ad}(k)$ determines (continuous) automorphisms of both $G$ and $G'' = \mathcal{A}'$. Let $\alpha_0$ be the natural homomorphism $N_K(G) \to \mathrm{Aut}(G)$, with kernel $C_K(G)$, the centralizer of $G$ in $K$. By [24, Lemma 3.13] all the representations of the compact groups above (e.g. $K$) on $\mathcal{H}$ are quasi-equivalent to the corresponding left-regular representations. Since by Eq. (19) $k \in K$ belongs to $\tilde{G}$ if and only if its adjoint action is trivial on the center of the von Neumann algebra $G''$, we can conclude that $k \in \tilde{G}$ if and only if $k \in N_K(G)$ and $\alpha_0(k) \in \mathrm{Aut}_{\hat{G}}(G)$, where $\mathrm{Aut}_{\hat{G}}(G)$ is the group of the automorphisms of $G$ acting trivially on the set $\hat{G}$ of equivalence classes of irreducible continuous unitary representations of $G$. Hence we have proven the following proposition:

Proposition 4.3. We have $\tilde{G} = \alpha_0^{-1}(\mathrm{Aut}_{\hat{G}}(G))$.

One can push this analysis a little bit further. Dividing the sequence
$$\tilde{G} \xrightarrow{\ i\ } N_K(G) \xrightarrow{\ \alpha_0\ } \mathrm{Aut}(G)$$
by $G$ we get another sequence
$$\tilde{G}/G \xrightarrow{\ i\ } N_K(G)/G \xrightarrow{\ \tilde{\alpha}_0\ } \mathrm{Aut}(G)/\mathrm{Inn}(G) =: \mathrm{Out}(G)$$
so that
$$\tilde{G}/G = \tilde{\alpha}_0^{-1}(\mathrm{Out}_{\hat{G}}(G)), \tag{20}$$
where $\mathrm{Out}_{\hat{G}}(G) = \mathrm{Aut}_{\hat{G}}(G)/\mathrm{Inn}(G)$. Notice that $\mathrm{Ker}(\tilde{\alpha}_0) = C_K(G)/Z(G)$. Therefore $\tilde{G} = G$ if and only if $\tilde{G}/G = \{1\}$, if and only if
$$C_K(G)/Z(G) = \{1\}$$
and
$$\tilde{\alpha}_0(N_K(G)/G) \cap \mathrm{Out}_{\hat{G}}(G) = \{1\}.$$
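The characterization in Proposition 4.3 can be tried out by hand on finite groups, which are a special case of compact groups. The following is only an illustrative sketch under assumptions that are not part of the text: the groups are small permutation groups, and the condition "acts trivially on the set of irreducible representations" is replaced by the equivalent condition, valid for finite groups, that the automorphism preserves every conjugacy class. All function and variable names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Permutation product p*q: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugate(k, g):
    """Return k g k^{-1}."""
    return compose(compose(k, g), inverse(k))

def is_even(p):
    inversions = sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))
    return inversions % 2 == 0

def class_preserving_normalizer(K, G):
    """Elements k of K that normalize G and whose conjugation action maps
    every g in G into the G-conjugacy class of g. For finite groups this is
    the analogue of the group written G-tilde in the text (assumption)."""
    G_set = set(G)
    classes = {g: {conjugate(h, g) for h in G} for g in G}
    return {
        k for k in K
        if all(conjugate(k, g) in G_set and conjugate(k, g) in classes[g] for g in G)
    }

S4 = list(permutations(range(4)))
A4 = [p for p in S4 if is_even(p)]          # alternating group on 4 letters
Z2 = [(0, 1, 2, 3), (1, 0, 3, 2)]           # generated by a double transposition

G_tilde_A4 = class_preserving_normalizer(S4, A4)
G_tilde_Z2 = class_preserving_normalizer(S4, Z2)
```

In this toy setting, odd permutations exchange the two classes of 3-cycles of `A4`, so only `A4` itself survives. For the abelian subgroup `Z2` the result is its centralizer in `S4`, which is strictly larger than `Z2`.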
We are now ready to summarize the above discussion and draw some conclusions. It follows from Eq. (20) that $\tilde{G}/G$ can be identified with the group of (unbroken) internal symmetries of the net $\mathcal{F}^G$ that act trivially on the set of its DHR sectors. To see this let $\rho$ be an irreducible DHR endomorphism of $\mathcal{A}$ (with finite statistical dimension by our previous assumption) localized in a double cone $\mathcal{O} \in \mathcal{K}$. Moreover, let $H_\rho \subset \mathcal{F}(\mathcal{O})$ be the corresponding $G$-invariant Hilbert space of isometries. $H_\rho$ carries an irreducible unitary representation $u_\rho$ of $G$ defined by
$$u_\rho(g)V := gVg^{-1}, \quad V \in H_\rho,\ g \in G.$$
Now let $k \in N_K(G)$ and let $\alpha := \mathrm{Ad}\,k|_{\mathcal{A}}$ be the corresponding internal symmetry of $\mathcal{A}$. Then, $\alpha$ acts on $\rho$ by $\rho \mapsto \rho_\alpha := \alpha \circ \rho \circ \alpha^{-1}$. If $V \in H_\rho$ then $kVk^{-1} \in H_{\rho_\alpha}$ (a unitary transformation). Moreover, for every $g \in G$,
$$u_{\rho_\alpha}(g)\,kVk^{-1} = gkVk^{-1}g^{-1} = k\,u_\rho(\alpha_0(k)(g))V\,k^{-1}.$$
Hence $u_{\rho_\alpha} \simeq u_\rho \circ \alpha_0(k)$ and our claim follows since if $\rho_1, \rho_2$ are DHR endomorphisms of $\mathcal{A}$ with finite statistical dimension then $\rho_1 \simeq \rho_2$ if and only if $u_{\rho_1} \simeq u_{\rho_2}$, see [24]. Thanks to the above identification we get immediately the following

Theorem 4.4. Let $\mathcal{A} = \mathcal{F}^G$ be an observable net satisfying all our standing assumptions. Then one has
$$\mathcal{A} = \mathcal{F}^G = \mathcal{C}_G^d$$
if and only if $\mathcal{A}$ has no nontrivial (unbroken) internal symmetries acting identically on the set of its DHR sectors. In particular, if $\mathcal{A}$ has no nontrivial internal symmetries then the above equality of nets holds true.

It has been suggested by R. Haag that the existence of nontrivial internal symmetries for $\mathcal{A}$ is not compatible with the claim that "the net of observable algebras defines the theory completely without need of further specifications" [31, Sect. IV.1].

We briefly list a few special cases for which the computation of $\tilde{G}$ is particularly straightforward:
(i) If $G$ is abelian, then $\mathrm{Aut}_{\hat{G}}(G)$ is trivial so that $\tilde{G} = \ker(\alpha_0) = C_K(G)$. For instance, if $K = O(2)$ and $G = SO(2)$ then $\tilde{G} = G$.
(ii) If $G$ has no outer automorphisms, namely $\mathrm{Aut}(G) = \mathrm{Inn}(G)$,^4 then $\tilde{G} = N_K(G)$. (The same conclusion holds true if $G$ satisfies the weaker condition that $\mathrm{Aut}(G) = \mathrm{Aut}_{\hat{G}}(G)$.)
(iii) If $G$ is quasi-complete, meaning that $\mathrm{Aut}_{\hat{G}}(G) = \mathrm{Inn}(G)$, then $\tilde{G} = G \cdot C_K(G)$.

In particular we have obtained the following result.

Corollary 4.5. If the gauge group $G$ of $\mathcal{A}$ has no outer automorphisms then the following conditions are equivalent:
(1) $G = N_K(G)$, namely $\mathcal{A} = \mathcal{F}^G$ has no nontrivial internal symmetries,
(2) $\mathcal{F}^G = \mathcal{C}_G^d$.

A special case of this situation can be obtained by taking $\mathcal{F}$ to be the net generated by a hermitian scalar free field and $G = K = \mathbb{Z}_2$, see [10, Subsect. 4.1], cf. also [37, 38, 16].

^4 In the mathematical literature, the groups for which $\mathrm{Aut}(G) = \mathrm{Inn}(G)$ and whose center $Z(G)$ is trivial are called complete.
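The example in case (i), K = O(2) and G = SO(2), can be checked numerically. The sketch below is not taken from the text: it represents elements of O(2) as 2x2 real matrices, samples a few rotation angles (chosen arbitrarily), and tests which elements commute with all sampled rotations. Since every element of O(2) is either a rotation or a reflection, the two tests together are consistent with the centralizer of SO(2) in O(2) being SO(2) itself.

```python
import math

def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s), (s, c))

def reflection(phi):
    """Orthogonal matrix of determinant -1: rotation(phi) times diag(1, -1)."""
    c, s = math.cos(phi), math.sin(phi)
    return ((c, s), (s, -c))

def matmul(a, b):
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def transpose(a):
    return tuple(tuple(a[j][i] for j in range(2)) for i in range(2))

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

SAMPLE_ANGLES = [0.3, 1.1, 2.0, 4.5]  # arbitrary angles, none a multiple of pi

def centralizes_SO2(k):
    """True if k commutes with every sampled rotation."""
    return all(
        close(matmul(k, rotation(t)), matmul(rotation(t), k)) for t in SAMPLE_ANGLES
    )

def conjugated_rotation(k, t):
    """k R(t) k^{-1}, using k^{-1} = k^T for orthogonal k."""
    return matmul(matmul(k, rotation(t)), transpose(k))
```

A reflection conjugates the rotation by `t` into the rotation by `-t`, so it normalizes SO(2) but commutes only with the rotations by 0 and pi. This nontrivial action is the inversion automorphism of the abelian group SO(2).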
5. Classification of Local Extensions

So far we have been dealing only with the classification problem for subsystems of a given system. However, to some extent our methods can be useful to handle the extension problem as well. We assume that a local theory $\mathcal{A}$ has been given, acting on its own vacuum Hilbert space $\mathcal{H}_{\mathcal{A}}$. The goal of this section is to set up a framework for classifying all the possible (local) extensions $\mathcal{B} \supset \mathcal{A}$ with some additional properties. The assumptions on $\mathcal{A}$ and $\mathcal{B}$ should allow one to perform the Doplicher-Roberts reconstruction procedures and have good control on the way these are related. We show below how this can be achieved in the case where "the energy content of $\mathcal{B}$ is already contained in $\mathcal{A}$".

In order to be more precise, let us assume throughout this section that $\mathcal{A}$ satisfies the same axioms as in the previous section, including that it has at most countably many DHR sectors, all with finite statistics. We shall however not need the split property for $\mathcal{F}_{\mathcal{A}}$.

Definition 5.1. A local extension of the local covariant net $\mathcal{A}$ is a local net $\mathcal{B}$ satisfying irreducibility on its separable vacuum Hilbert space $\mathcal{H}_{\mathcal{B}}$, Haag duality, covariance under a representation $V_{\mathcal{B}}$ fulfilling the spectrum condition and uniqueness of the vacuum, and containing a covariant subsystem $\mathcal{A}_1$ such that the corresponding net $\hat{\mathcal{A}}_1$ is isomorphic to $\mathcal{A}$.

In agreement with our notation, since $\mathcal{A}$ and $\hat{\mathcal{A}}_1$ are isomorphic, we shall naturally identify $\mathcal{A}$ and $\mathcal{A}_1$ and write $\mathcal{A} \subset \mathcal{B}$. In order to state our result, we need two more assumptions. The first one is of technical nature. We require the local extension $\mathcal{B}$ as above to satisfy the condition of weak additivity, namely
$$\mathcal{B}(M_4) = \bigvee_{x \in \mathbb{R}^4} \mathcal{B}(\mathcal{O} + x)$$
for every $\mathcal{O} \in \mathcal{K}$. Then the Reeh-Schlieder property and Borchers' property B also hold for $\mathcal{B}$. The second hypothesis, on the energy content, is that
$$V_{\mathcal{B}}(I, \mathbb{R}^4) \subset \mathcal{A}(M_4) \tag{21}$$
(on $\mathcal{H}_{\mathcal{B}}$). As a consequence, $\mathcal{A}$ is full in $\mathcal{B}$. The meaning of this assumption is to rule out a number of unnecessarily "large" extensions obtained by tensoring the net $\mathcal{A}$ with any other arbitrary (local, covariant) net, cf. [9].

Theorem 5.2. Let $\mathcal{A}$ be an observable net satisfying all the standing assumptions in Sect. 4. Let $\mathcal{B}$ be a local extension of $\mathcal{A}$, satisfying weak additivity and the condition (21). Then $\mathcal{B}$ is an intermediate net between $\mathcal{A}$ and its canonical field net $\mathcal{F}_{\mathcal{A}}$, and in fact $\mathcal{B} = \mathcal{F}_{\mathcal{A}}^H$, the fixed point net of $\mathcal{F}_{\mathcal{A}}$ under the action of some closed subgroup $H$ of the gauge group of $\mathcal{A}$.

Proof. Since we assume that $\mathcal{A}$ is covariant and the spectrum condition holds, all the DHR sectors of $\mathcal{A}$ are automatically covariant with positive energy, see [29, Theorem 7.1]. It follows that $\mathcal{F}_{\mathcal{A}} = \mathcal{F}_{\mathcal{A},c}$, where $\mathcal{F}_{\mathcal{A},c}$ denotes the covariant field net of [24, Sect. 6].
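By assumption, it is possible to perform the Doplicher-Roberts construction of the covariant field net $\mathcal{F}_{\mathcal{B},c}$ of $\mathcal{B}$, and $\mathcal{F}_{\mathcal{A}}$ embeds into $\mathcal{F}_{\mathcal{B},c}$ as a covariant subsystem [11, Theorem 2.11]. We will make use of Proposition 4.1. It is not difficult to see that the natural representation of $\mathcal{A}$ on $\mathcal{H}_{\mathcal{F}_{\mathcal{B},c}}$ is a direct sum of DHR representations (cf. Sect. 3, [14, Lemma 4.5] and also [10, Lemma 3.1]) and thus, by Proposition 4.1, still decomposable into a direct sum of irreducible DHR representations with finite statistics. Arguing similarly as in Sect. 3, it follows that the representation of $\mathcal{F}_{\mathcal{A}} = \mathcal{F}_{\mathcal{A},c}$ (thought of as a subsystem of $\mathcal{F}_{\mathcal{B},c}$) on the vacuum Hilbert space $\mathcal{H}_{\mathcal{F}_{\mathcal{B},c}}$ of $\mathcal{F}_{\mathcal{B},c}$ is a multiple of the vacuum representation of $\mathcal{F}_{\mathcal{A}}$, and thus there is a unitary $W : \mathcal{H}_{\mathcal{F}_{\mathcal{B},c}} \to \mathcal{H}_{\mathcal{F}_{\mathcal{A}}} \otimes \mathcal{K}$, for a suitable Hilbert space $\mathcal{K}$, such that
$$WFW^* = \hat{F} \otimes I_{\mathcal{K}}, \quad F \in \mathcal{F}_{\mathcal{A}} \subset \mathcal{F}_{\mathcal{B},c}$$
(see e.g. the paragraph preceding Theorem 3.1 and Proposition 3.2) and moreover
$$W U_{\mathcal{F}_{\mathcal{B},c}} W^* = U_{\mathcal{F}_{\mathcal{A}}} \otimes \tilde{U}$$
for some unitary representation $\tilde{U}$ of $\tilde{\mathcal{P}}$ on $\mathcal{K}$ with positive energy, cf. [10], p. 96.

Note that, by uniqueness of the vacuum, if the multiplicity factor $\mathcal{K}$ is nontrivial then $\tilde{U}$ has to be nontrivial as well and in fact with a unique (up to a phase) unit vector $\Omega_{\mathcal{K}}$ which is invariant under translations. To see this, without being too much involved in domain problems which nevertheless can be solved with the help of the spectral theorem, we consider the situation where $U_2$ and $\tilde{U}$ are unitary representations of $\mathbb{R}^4$ satisfying the spectrum condition and $U_1 = U_2 \otimes \tilde{U}$. Then the corresponding generators satisfy the relation $P_1 = P_2 \otimes I + I \otimes \tilde{P}$. Let us assume that $U_i$ has a unique invariant vector $\Omega_i$, $i = 1, 2$ (actually $i = 1$ would be enough). Since $P_1\Omega_1 = 0$, the spectrum condition easily implies that $(P_2 \otimes I)\Omega_1 = 0$ and $(I \otimes \tilde{P})\Omega_1 = 0$. Therefore it follows from the first equation that $\Omega_1 = \Omega_2 \otimes \tilde{\Omega}$ for some vector $\tilde{\Omega}$ and from the second that $\tilde{P}\tilde{\Omega} = 0$. Thus $\tilde{U}$ has an invariant vector as well, and this must be unique. (Alternatively, the same result could have been shown with the help of a Frobenius reciprocity argument.)

Going back to our situation, one can choose $\Omega_{\mathcal{K}}$ so that $W\Omega_{\mathcal{F}_{\mathcal{B},c}} = \Omega_{\mathcal{F}_{\mathcal{A}}} \otimes \Omega_{\mathcal{K}}$. Now, by the assumption on the energy-momentum operators, one must have $U_{\mathcal{F}_{\mathcal{A}}} \otimes \tilde{U} = U_{\mathcal{F}_{\mathcal{A}}} \otimes I$ on $W\mathcal{H}_{\mathcal{B}} \subset W\mathcal{H}_{\mathcal{F}_{\mathcal{B},c}} = \mathcal{H}_{\mathcal{F}_{\mathcal{A}}} \otimes \mathcal{K}$. It follows from the last relation that
$$W\mathcal{H}_{\mathcal{B}} \subset \mathcal{H}_{\mathcal{F}_{\mathcal{A}}} \otimes \Omega_{\mathcal{K}} = \overline{(\hat{\mathcal{F}}_{\mathcal{A}} \otimes I)(\Omega_{\mathcal{F}_{\mathcal{A}}} \otimes \Omega_{\mathcal{K}})}.$$
Therefore $W\mathcal{B}W^* \subset \hat{\mathcal{F}}_{\mathcal{A}} \otimes I$, i.e. $\mathcal{B} \subset \mathcal{F}_{\mathcal{A}}$ on $\mathcal{H}_{\mathcal{F}_{\mathcal{B},c}}$. Recalling that $U_{\mathcal{F}_{\mathcal{B},c}} \in \mathcal{B}(M_4)$ on $\mathcal{H}_{\mathcal{F}_{\mathcal{B},c}}$, one can finally argue that $\tilde{U}$ is trivial on $\mathcal{K}$ and this immediately yields the conclusion that $\mathcal{K} = \mathbb{C}$ and
$$\mathcal{F}_{\mathcal{A}} = \mathcal{F}_{\mathcal{B},c}. \tag{22}$$
Thus we are back to the situation $\mathcal{A} \subset \mathcal{B} \subset \mathcal{F}_{\mathcal{A}}$ discussed in [14, Sect. 4], whence $\mathcal{B} = \mathcal{F}_{\mathcal{A}}^H$ for some closed subgroup $H$ of $G = G_{\mathcal{A}}$, see e.g. [14, Theorem 4.1]. Thanks to Proposition 4.1 it now follows from [14, Corollary 4.8] that
$$\mathcal{F}_{\mathcal{A}} = \mathcal{F}_{\mathcal{B},c} = \mathcal{F}_{\mathcal{B}}$$
and all the DHR sectors of $\mathcal{B}$ are covariant as well.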
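The positivity step used in the proof above, from $P_1\Omega_1 = 0$ to the two separate equations, can be spelled out as follows. This is only a sketch for the energy components. The symbols $H_1$, $H_2$, $\tilde{H}$ for the time components of $P_1$, $P_2$, $\tilde{P}$ are introduced here for illustration and do not appear in the text, and domain questions are ignored, as in the proof itself.

```latex
% Assumed notation: H_1, H_2, \tilde H denote the energy operators
% (time components of P_1, P_2, \tilde P); all are positive by the
% spectrum condition.
\begin{align*}
H_1 &= H_2 \otimes I + I \otimes \tilde H,
      \qquad H_2 \ge 0, \quad \tilde H \ge 0, \\
0 = \langle \Omega_1, H_1 \Omega_1 \rangle
    &= \langle \Omega_1, (H_2 \otimes I)\,\Omega_1 \rangle
     + \langle \Omega_1, (I \otimes \tilde H)\,\Omega_1 \rangle , \\
\langle \Omega_1, (H_2 \otimes I)\,\Omega_1 \rangle
    &= \bigl\| (H_2 \otimes I)^{1/2} \Omega_1 \bigr\|^2 = 0
      \;\Longrightarrow\; (H_2 \otimes I)\,\Omega_1 = 0 , \\
\langle \Omega_1, (I \otimes \tilde H)\,\Omega_1 \rangle
    &= \bigl\| (I \otimes \tilde H)^{1/2} \Omega_1 \bigr\|^2 = 0
      \;\Longrightarrow\; (I \otimes \tilde H)\,\Omega_1 = 0 .
\end{align*}
```

Both summands in the second line are nonnegative, so each must vanish. Since the joint spectrum of each family of generators lies in the closed forward lightcone, a vector of zero energy also has zero spatial momentum, which gives the full statements $(P_2 \otimes I)\Omega_1 = 0$ and $(I \otimes \tilde{P})\Omega_1 = 0$.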
A. Proof of Theorem 3.3

In this appendix we give a proof of Theorem 3.3. Our proof relies on various ideas from [4, Sect. V]. For related problems see [36, Sect. 4]. We first recall a convenient notation for the wedges introduced in [3] (cf. also [4]). Let $l_1, l_2$ be linearly independent lightlike vectors in $\overline{V}_+$. Then the region
$$W[l_1, l_2] := \{\alpha l_1 + \beta l_2 + l^\perp : \alpha > 0,\ \beta < 0,\ l^\perp \cdot l_i = 0,\ i = 1, 2\} \tag{23}$$
is a wedge and every wedge in $\mathcal{W}$ is of the form $W[l_1, l_2] + a$ for suitable $l_1, l_2$ and $a \in M_4$. Moreover, $W[l_1, l_2]' = W[l_2, l_1]$.

Now let $\mathcal{F}$ be a net satisfying the assumptions (i)–(x) and Assumption (A) on triviality of superselection structure in Sect. 2. Given a wedge $W[l_1, l_2]$ we shall denote by $\Lambda[l_1, l_2](t)$ the corresponding one-parameter group of Lorentz transformations (cf. [3, 30]). With this notation, if $\Delta_{[l_1, l_2]}$ is the modular operator for $(\mathcal{F}(W[l_1, l_2]), \Omega)$ then $\Delta^{it}_{[l_1, l_2]} = U(\tilde{\Lambda}[l_1, l_2](t), 0)$.

In [3] Borchers considered the intersection of two wedges $W[l, l_1]$, $W[l, l_2]$ and made the following observations:
a) $W[l, l_1] \cap W[l, l_2]$ is a nonempty open set;
b) $\Lambda[l, l_1](t)(W[l, l_1] \cap W[l, l_2]) \subset W[l, l_1] \cap W[l, l_2]$ for $t \le 0$.
Moreover one finds $\Lambda[l, l_1](t)l = e^{-2\pi t}l$ and $\Lambda[l, l_1](t)l_1 = e^{2\pi t}l_1$.

Now let $S$ be a subset of $M_4$ which is contained in some wedge in $\mathcal{W}$. We define (cf. [48, Sect. III])
$$\mathcal{F}^\sharp(S) := \bigcap_{W \supset S} \mathcal{F}(W). \tag{24}$$
It is easy to see that the map $S \mapsto \mathcal{F}^\sharp(S)$ is isotonous and covariant, namely $\mathcal{F}^\sharp(S_1) \subset \mathcal{F}^\sharp(S_2)$ if $S_1 \subset S_2$ and $U(L)\mathcal{F}^\sharp(S)U(L)^{-1} = \mathcal{F}^\sharp(LS)$ for every $L \in \tilde{\mathcal{P}}$. Clearly $\mathcal{F}^\sharp(W) = \mathcal{F}(W)$ and $\mathcal{F}^\sharp(\mathcal{O}) = \mathcal{F}(\mathcal{O})$ for every $W \in \mathcal{W}$ and every $\mathcal{O} \in \mathcal{K}$, but for other regions (even for intersections of families of wedges) the equality could fail and in general, if $S$ is open and contained in some wedge, $\mathcal{F}(S) \subset \mathcal{F}^\sharp(S)$. Similarly, if $\mathcal{B}$ is a Haag-dual covariant subsystem of $\mathcal{F}$ we define
$$\mathcal{B}^\sharp(S) := \bigcap_{W \supset S} \mathcal{B}(W). \tag{25}$$
Then, the map $S \mapsto \mathcal{B}^\sharp(S)$ is isotonous and covariant and $\mathcal{B}^\sharp(S)$ coincides with $\mathcal{B}(S)$ if $S \in \mathcal{W} \cup \mathcal{K}$.

Now, given two wedges $W[l, l_1]$, $W[l, l_2]$, the observations a) and b) of Borchers recalled above and the Bisognano-Wichmann property imply (cf. [3, Lemma 2.6]) that the inclusions of von Neumann algebras
$$\mathcal{F}^\sharp(W[l, l_1] \cap W[l, l_2]) \subset \mathcal{F}(W[l, l_i]), \quad i = 1, 2, \tag{26}$$
are half-sided modular inclusions in the sense of [4, Definition II.6.1]. Hence, by a theorem of Wiesbrock, Araki and Zsido (see [50] and [4, Theorem II.6.2]) there are strongly continuous one-parameter unitary groups $V_i(t)$, $i = 1, 2$, with nonnegative generators, leaving the vacuum vector invariant and such that
$$V_i(t)\mathcal{F}(W[l, l_i])V_i(-t) \subset \mathcal{F}(W[l, l_i]), \quad i = 1, 2,\ t \ge 0, \tag{27}$$
$$\mathcal{F}^\sharp(W[l, l_1] \cap W[l, l_2]) = V_i(1)\mathcal{F}(W[l, l_i])V_i(-1), \quad i = 1, 2. \tag{28}$$
Moreover, for $i = 1, 2$ and $t \in \mathbb{R}$, $V_i(t)$ is the limit in the strong operator topology of the sequence
$$\Big(\Delta_{[l, l_i]}^{-i\frac{t}{2\pi n}}\, \Delta_{[l, l_1, l_2]}^{\,i\frac{t}{2\pi n}}\Big)^n, \tag{29}$$
where $\Delta_{[l, l_1, l_2]}$ denotes the modular operator of $\mathcal{F}^\sharp(W[l, l_1] \cap W[l, l_2])$ corresponding to $\Omega$. As a consequence, $\mathcal{F}^\sharp(W[l, l_1] \cap W[l, l_2])$ is a factor of type $III_1$. Note also that by Borchers' theorem [2, Theorem II.9] we have, for $i = 1, 2$ and $t, s \in \mathbb{R}$,
$$\Delta^{it}_{[l, l_i]} V_i(s) \Delta^{-it}_{[l, l_i]} = V_i(e^{-2\pi t}s). \tag{30}$$
The following lemma motivates the introduction of the algebras $\mathcal{F}^\sharp(S)$.

Lemma A.1. If $l, l_1, l_2$ are lightlike vectors in the closed forward lightcone such that $l, l_i$ are linearly independent, then the following holds:
$$\mathcal{F}^\sharp(W[l, l_1] \cap W[l, l_2])' = Z\mathcal{F}((W[l, l_1] \cap W[l, l_2])')Z^*. \tag{31}$$

Proof. We have
$$\begin{aligned}
\mathcal{F}^\sharp(W[l, l_1] \cap W[l, l_2])' &= \Big(\bigcap_{W \supset W[l, l_1] \cap W[l, l_2]} \mathcal{F}(W)\Big)' \\
&= \bigvee_{W \supset W[l, l_1] \cap W[l, l_2]} \mathcal{F}(W)' \\
&= \bigvee_{W \supset W[l, l_1] \cap W[l, l_2]} Z\mathcal{F}(W')Z^* \\
&= Z\Big(\bigvee_{W \subset (W[l, l_1] \cap W[l, l_2])'} \mathcal{F}(W)\Big)Z^* \\
&= Z\mathcal{F}((W[l, l_1] \cap W[l, l_2])')Z^*,
\end{aligned}$$
where in the last equality we used the convexity of $W[l, l_1] \cap W[l, l_2]$ and [47, Proposition 3.1].

Proposition A.2. If $l, l_1, l_2$ are lightlike vectors in the closed forward lightcone such that $l, l_i$ are linearly independent and $W \in \mathcal{W}$ then
$$\mathcal{F}(M_4) = \mathcal{F}(W) \vee \mathcal{F}(W'), \tag{32}$$
$$\mathcal{F}(M_4) = \mathcal{F}^\sharp(W[l, l_1] \cap W[l, l_2]) \vee \mathcal{F}((W[l, l_1] \cap W[l, l_2])'). \tag{33}$$
Proof. We only prove the second assertion. The proof of the first is similar but simpler. By Lemma A.1 we have
$$\begin{aligned}
\big(\mathcal{F}^\sharp(W[l, l_1] \cap W[l, l_2]) \vee \mathcal{F}((W[l, l_1] \cap W[l, l_2])')\big)' &= \mathcal{F}^\sharp(W[l, l_1] \cap W[l, l_2])' \cap \mathcal{F}((W[l, l_1] \cap W[l, l_2])')' \\
&= Z\big(\mathcal{F}^\sharp(W[l, l_1] \cap W[l, l_2]) \cap \mathcal{F}((W[l, l_1] \cap W[l, l_2])')\big)Z^*.
\end{aligned}$$
Now let
$$F \in \mathcal{F}^\sharp(W[l, l_1] \cap W[l, l_2]) \cap \mathcal{F}((W[l, l_1] \cap W[l, l_2])')$$
be even with respect to the grading (i.e. $\kappa F\kappa = F$). Then, by graded locality, $F \in \mathcal{F}^\sharp(W[l, l_1] \cap W[l, l_2]) \cap \mathcal{F}^\sharp(W[l, l_1] \cap W[l, l_2])'$, and hence $F$ is a multiple of the identity because $\mathcal{F}^\sharp(W[l, l_1] \cap W[l, l_2])$ is a factor. If $F \in \mathcal{F}^\sharp(W[l, l_1] \cap W[l, l_2]) \cap \mathcal{F}((W[l, l_1] \cap W[l, l_2])')$ is odd then $ZFZ^* = i\kappa F$ and hence $i\kappa F$ commutes with $F^*$. It follows that $FF^* = -F^*F$, which implies $F = 0$.
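The identity $ZFZ^* = i\kappa F$ for odd $F$ used at the end of the proof can be checked in a finite-dimensional toy model. The sketch below rests on assumptions not stated in this section: the twist is taken to be $Z = (I + i\kappa)/(1 + i)$, a common convention (the definition in Sect. 2 is not reproduced here), and $\kappa$ is modelled by the 2x2 matrix diag(1, -1). All names are hypothetical.

```python
KAPPA = ((1, 0), (0, -1))      # toy grading operator (assumption)
IDENTITY = ((1, 0), (0, 1))

def matmul(a, b):
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def scale(c, a):
    return tuple(tuple(c * x for x in row) for row in a)

def add(a, b):
    return tuple(tuple(x + y for x, y in zip(ra, rb)) for ra, rb in zip(a, b))

def dagger(a):
    """Conjugate transpose."""
    return tuple(tuple(complex(a[j][i]).conjugate() for j in range(2)) for i in range(2))

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

# Assumed convention for the twist: Z = (I + i*kappa) / (1 + i).
Z = scale(1 / (1 + 1j), add(IDENTITY, scale(1j, KAPPA)))

def twist(f):
    """Return Z f Z^*."""
    return matmul(matmul(Z, f), dagger(Z))

odd = ((0, 2 + 1j), (3 - 4j, 0))    # anticommutes with KAPPA
even = ((5, 0), (0, 7j))            # commutes with KAPPA
```

With these definitions `twist(even)` returns `even` unchanged, while `twist(odd)` agrees with `i * KAPPA * odd`. Moving $F^*$ past $i\kappa F$ then produces the sign that leads to $FF^* = -F^*F$ in the proof.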
Corollary A.3. If $\mathcal{B}$ is a local Haag-dual covariant subsystem of $\mathcal{F}$, $l, l_1, l_2$ are lightlike vectors in the closed forward lightcone such that $l, l_i$ are linearly independent and $W \in \mathcal{W}$ then, on $\mathcal{H}_{\mathcal{F}}$, we have
$$\mathcal{F}_{\mathcal{B}}(M_4) = \mathcal{F}_{\mathcal{B}}(W) \vee \mathcal{F}_{\mathcal{B}}(W'), \tag{34}$$
$$\mathcal{F}_{\mathcal{B}}(M_4) = \mathcal{F}_{\mathcal{B}}^\sharp(W[l, l_1] \cap W[l, l_2]) \vee \mathcal{F}_{\mathcal{B}}((W[l, l_1] \cap W[l, l_2])'). \tag{35}$$

Proof. By Prop. A.2 (applied to $\mathcal{F}_{\mathcal{B}}$ instead of $\mathcal{F}$) the claimed equalities hold on $\mathcal{H}_{\mathcal{F}_{\mathcal{B}}}$ and the conclusion follows from Prop. 3.2.

Lemma A.4. The set $(W[l, l_1] \cap W[l, l_2])'$ is path connected.

Proof. Recall that, given a set $S$, $S' = (S^c)^o$ is defined as the interior of the spacelike complement $S^c$ of $S$. In order to simplify the notation, set $W_1 := W[l, l_1]$ and $W_2 := W[l, l_2]$. Then $W_1 \cap W_2 \neq \emptyset$ and $W_1' \cap W_2' \neq \emptyset$. One has $(W_1' \cup W_2')^c = \overline{W}_1 \cap \overline{W}_2$ [47, Prop. 2.1, b)] and moreover $(W_1' \cup W_2')^{cc} = (\overline{W}_1 \cap \overline{W}_2)^c$ is open [47, Prop. 5.6, a)]. We claim that one also has $(\overline{W}_1 \cap \overline{W}_2)^c = [(W_1 \cap W_2)^c]^o \equiv (W_1 \cap W_2)'$. The inclusion "$\subseteq$" is clear. The opposite inclusion is a consequence of the following three facts:
1) $(\overline{W}_1 \cap \overline{W}_2)^c = \cup\{W : \overline{W}_1 \cap \overline{W}_2 \subset W^c\}$, as follows by [47, Theor. 3.2, a)], by taking into account the fact that the l.h.s. is actually open;
2) $[(W_1 \cap W_2)^c]^o = \cup\{W : W_1 \cap W_2 \subset W^c\} = \cup\{W : \overline{W_1 \cap W_2} \subset W^c\}$, where we use the fact that the spacelike complement of an open set is closed, and the inclusion $[(W_1 \cap W_2)^c]^o \subset \cup\{W : W_1 \cap W_2 \subset W^c\}$ can be proven with the help of [47, Prop. 3.1];
3) $\overline{W_1 \cap W_2} = \overline{W}_1 \cap \overline{W}_2$, as can be easily shown recalling that $W_1$ and $W_2$ are open and convex.
Finally, since $W_1' \cup W_2'$ is connected it is not difficult to see that $(W_1' \cup W_2')^{cc} = (W_1 \cap W_2)'$ has to be connected too. In fact, if $S$ is open and connected and $p \in S^{cc} \setminus S$, $p$ being spacelike to $S^c$, one has that $p$ belongs to the complement of $S^c$. Since $S$ is open, this means that the open lightcone pointed at $p$ intersects $S$ in at least one point, say $q$, and there is a timelike segment joining $p$ and $q$ in $S^{cc}$ (cf. the paragraphs preceding [47, Proposition 2.2]). The proof is complete.

Proposition A.5. Let $\mathcal{B}$ be a (not necessarily local) Haag-dual covariant subsystem of $\mathcal{F}$ and let $W[l, l_i]$, $i = 1, 2$, be as above. Then there is a vacuum preserving normal conditional expectation of $\mathcal{F}^\sharp(W[l, l_1] \cap W[l, l_2])$ onto $\mathcal{B}^\sharp(W[l, l_1] \cap W[l, l_2])$.

Proof. It follows from the Bisognano-Wichmann property and the covariance of $\mathcal{B}$ that for every $W \in \mathcal{W}$ the algebra $\mathcal{B}(W)$ is left globally invariant by the modular group of $\mathcal{F}(W)$ associated to $\Omega$. Hence, by [46], there is a vacuum preserving conditional expectation $\varepsilon_W$ of $\mathcal{F}(W)$ onto $\mathcal{B}(W)$. Now let $F \in \mathcal{F}^\sharp(W[l, l_1] \cap W[l, l_2])$ and let $W_a, W_b$ be two wedges containing $W[l, l_1] \cap W[l, l_2]$. Moreover, let $x_a \in W_a'$ and $x_b \in W_b'$. Since $(W[l, l_1] \cap W[l, l_2])'$ is path connected by Lemma A.4 (and open by definition), we can find double cones $\mathcal{O}_1, \ldots, \mathcal{O}_n$, all contained in $(W[l, l_1] \cap W[l, l_2])'$, such that $x_a \in \mathcal{O}_1$, $x_b \in \mathcal{O}_n$ and $\mathcal{O}_i \cap \mathcal{O}_{i+1} \neq \emptyset$ for $i = 1, \ldots, n - 1$. Then, recalling that $W[l, l_1] \cap W[l, l_2]$ is convex, we can use [47, Prop. 3.1] to infer the existence of wedges $W_1, \ldots, W_n$ containing $W[l, l_1] \cap W[l, l_2]$ and such that $W_1 = W_a$, $W_n = W_b$ and $\mathcal{O}_i \subset W_i'$, $i = 1, \ldots, n$. Thus, in particular, we have $W_i' \cap W_{i+1}' \neq \emptyset$ for $i = 1, \ldots, n - 1$, and hence $\Omega$ is cyclic for $Z(\mathcal{F}(W_i') \cap \mathcal{F}(W_{i+1}'))Z^*$ and separating for $\mathcal{F}(W_i) \vee \mathcal{F}(W_{i+1})$, $i = 1, \ldots, n - 1$. Thus, it follows from
$$\varepsilon_{W_i}(F)\Omega = E_{\mathcal{B}}F\Omega = \varepsilon_{W_{i+1}}(F)\Omega, \quad i = 1, \ldots, n - 1,$$
that $\varepsilon_{W_a}(F) = \varepsilon_{W_b}(F)$. As a consequence the restriction of $\varepsilon_W$ to the algebra $\mathcal{F}^\sharp(W[l, l_1] \cap W[l, l_2])$ does not depend on $W \supset W[l, l_1] \cap W[l, l_2]$ and gives the claimed conditional expectation.

As a consequence of the Bisognano-Wichmann property, of Prop. A.5 and of the results in [46], for every Haag-dual covariant subsystem $\mathcal{B}$ of $\mathcal{F}$, the one-parameter groups $\Delta^{it}_{[l, l_i]}$, $i = 1, 2$, and $\Delta^{it}_{[l, l_1, l_2]}$ commute with $E_{\mathcal{B}}$. Hence, by (29), $E_{\mathcal{B}}$ commutes with $V_i(t)$, $i = 1, 2$. Moreover, for $i = 1, 2$, the following hold:
$$\mathcal{B}(W[l, l_i]) = \mathcal{F}(W[l, l_i]) \cap \{E_{\mathcal{B}}\}', \tag{36}$$
$$\mathcal{B}^\sharp(W[l, l_1] \cap W[l, l_2]) = \mathcal{F}^\sharp(W[l, l_1] \cap W[l, l_2]) \cap \{E_{\mathcal{B}}\}'. \tag{37}$$
Hence using Eq. (28) we find
$$\mathcal{B}^\sharp(W[l, l_1] \cap W[l, l_2]) = V_i(1)\mathcal{B}(W[l, l_i])V_i(-1), \quad i = 1, 2. \tag{38}$$
In the following, if $M$ is a von Neumann algebra on $\mathcal{H}_{\mathcal{F}}$ globally invariant under $\mathrm{Ad}\,\kappa(\cdot)$, we shall denote by $M^b$ the Bose part of $M$, i.e. $M^b = M \cap \{\kappa\}'$. Accordingly, if $\mathcal{B}$ is a covariant subsystem of $\mathcal{F}$ then $\mathcal{B}^b(S) = \mathcal{B}(S)^b$ for every $S \in \mathcal{K} \cup \mathcal{W}$, and for an arbitrary open set $S$ we have $\mathcal{B}^b(S) \subset \mathcal{B}(S)^b$.
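Proposition A.6. If $\mathcal{B}$ is a local Haag-dual covariant subsystem of $\mathcal{F}$, $l, l_1, l_2$ are lightlike vectors in the closed forward lightcone such that $l, l_i$ are linearly independent and $W \in \mathcal{W}$ then
$$\mathcal{F}_{\mathcal{B}}(W)' \cap \mathcal{F}(W)^b = \mathcal{F}_{\mathcal{B}}(M_4)' \cap \mathcal{F}(W)^b, \tag{39}$$
$$\mathcal{F}_{\mathcal{B}}^\sharp(W[l, l_1] \cap W[l, l_2])' \cap \mathcal{F}^\sharp(W[l, l_1] \cap W[l, l_2])^b = \mathcal{F}_{\mathcal{B}}(M_4)' \cap \mathcal{F}^\sharp(W[l, l_1] \cap W[l, l_2])^b. \tag{40}$$

Proof. We only prove the second equation. By Corollary A.3 we obtain
$$\mathcal{F}_{\mathcal{B}}(M_4)' \cap \mathcal{F}^\sharp(W[l, l_1] \cap W[l, l_2])^b = \mathcal{F}_{\mathcal{B}}^\sharp(W[l, l_1] \cap W[l, l_2])' \cap \mathcal{F}^\sharp(W[l, l_1] \cap W[l, l_2])^b \cap \mathcal{F}_{\mathcal{B}}((W[l, l_1] \cap W[l, l_2])')',$$
and the conclusion follows since (e.g. by Lemma A.1)
$$\mathcal{F}^\sharp(W[l, l_1] \cap W[l, l_2])^b \subset \mathcal{F}_{\mathcal{B}}((W[l, l_1] \cap W[l, l_2])')'.$$

Corollary A.7. Let $\mathcal{B}, l, l_1, l_2$ be as in Prop. A.6 and let $W_1, W_2 \in \mathcal{W}$ be such that $W_1 \subset W_2$. Then the following hold:
(a) $\mathcal{F}_{\mathcal{B}}(W_1)' \cap \mathcal{F}(W_1)^b \subset \mathcal{F}_{\mathcal{B}}(W_2)' \cap \mathcal{F}(W_2)^b$,
(b) $\mathcal{F}_{\mathcal{B}}^\sharp(W[l, l_1] \cap W[l, l_2])' \cap \mathcal{F}^\sharp(W[l, l_1] \cap W[l, l_2])^b \subset \mathcal{F}_{\mathcal{B}}(W[l, l_i])' \cap \mathcal{F}(W[l, l_i])^b$, $i = 1, 2$.

For a local Haag-dual covariant subsystem $\mathcal{B}$ of $\mathcal{F}$, $l, l_1, l_2$ as in Prop. A.6 and $W \in \mathcal{W}$ we shall now use the following notations:
$$\mathcal{N}_{\mathcal{B}}(W) := \mathcal{F}_{\mathcal{B}}(W)' \cap \mathcal{F}(W)^b, \tag{41}$$
$$\mathcal{N}_{\mathcal{B}}(W[l, l_1] \cap W[l, l_2]) := \mathcal{F}_{\mathcal{B}}^\sharp(W[l, l_1] \cap W[l, l_2])' \cap \mathcal{F}^\sharp(W[l, l_1] \cap W[l, l_2])^b. \tag{42}$$

Proposition A.8. Let $\mathcal{B}$ be a local Haag-dual covariant subsystem of $\mathcal{F}$ and let $l, l_1, l_2$ be as in Prop. A.6. Then the following hold:
(a) $V_i(t)\mathcal{N}_{\mathcal{B}}(W[l, l_i])V_i(-t) \subset \mathcal{N}_{\mathcal{B}}(W[l, l_i])$, $i = 1, 2$, $t \ge 0$,
(b) $\mathcal{N}_{\mathcal{B}}(W[l, l_1] \cap W[l, l_2]) = V_i(1)\mathcal{N}_{\mathcal{B}}(W[l, l_i])V_i(-1)$, $i = 1, 2$.

Proof. (b) follows easily from Eq. (28) and Eq. (38) (with $\mathcal{F}_{\mathcal{B}}$ instead of $\mathcal{B}$). Now, recalling that by (b) in Corollary A.7 we have $\mathcal{N}_{\mathcal{B}}(W[l, l_1] \cap W[l, l_2]) \subset \mathcal{N}_{\mathcal{B}}(W[l, l_i])$, $i = 1, 2$, it follows from Eq. (30) and the covariance of the map $\mathcal{W} \ni W \mapsto \mathcal{N}_{\mathcal{B}}(W)$ that, for every $s \in \mathbb{R}$, $i = 1, 2$,
$$\begin{aligned}
V_i(e^{-2\pi s})\mathcal{N}_{\mathcal{B}}(W[l, l_i])V_i(-e^{-2\pi s}) &= \Delta^{is}_{[l, l_i]} V_i(1)\mathcal{N}_{\mathcal{B}}(W[l, l_i])V_i(-1)\Delta^{-is}_{[l, l_i]} \\
&= \Delta^{is}_{[l, l_i]} \mathcal{N}_{\mathcal{B}}(W[l, l_1] \cap W[l, l_2]) \Delta^{-is}_{[l, l_i]} \\
&\subset \Delta^{is}_{[l, l_i]} \mathcal{N}_{\mathcal{B}}(W[l, l_i]) \Delta^{-is}_{[l, l_i]} \\
&= \mathcal{N}_{\mathcal{B}}(W[l, l_i]),
\end{aligned}$$
and also (a) is proven.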
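Lemma A.9. Let $\mathcal{B}$ be a local Haag-dual covariant subsystem of $\mathcal{F}$ and let $l, l_1, l_2$ be as in Prop. A.6. Then, for $i = 1, 2$, the following holds:
$$\overline{\mathcal{N}_{\mathcal{B}}(W[l, l_i])\Omega} = \overline{\mathcal{N}_{\mathcal{B}}(W[l, l_1] \cap W[l, l_2])\Omega}. \tag{43}$$
In particular
$$\overline{\mathcal{N}_{\mathcal{B}}(W[l, l_1])\Omega} = \overline{\mathcal{N}_{\mathcal{B}}(W[l, l_2])\Omega}. \tag{44}$$

Proof. We use a standard Reeh-Schlieder type argument. By Prop. A.8, recalling that $V_i(t)\Omega = \Omega$ for $t \in \mathbb{R}$, $i = 1, 2$, we have
$$\overline{\mathcal{N}_{\mathcal{B}}(W[l, l_1] \cap W[l, l_2])\Omega} = V_i(1)\overline{\mathcal{N}_{\mathcal{B}}(W[l, l_i])\Omega} \subset \overline{\mathcal{N}_{\mathcal{B}}(W[l, l_i])\Omega}.$$
Now let $\psi \in (\mathcal{N}_{\mathcal{B}}(W[l, l_1] \cap W[l, l_2])\Omega)^\perp$. Then if $\xi \in \overline{\mathcal{N}_{\mathcal{B}}(W[l, l_i])\Omega}$ and $t \ge 1$ we have $(\psi, V_i(t)\xi) = 0$. Since the self-adjoint generator of $V_i(t)$ is nonnegative, the function $t \mapsto (\psi, V_i(t)\xi)$ is the boundary value of an analytic function in the upper half-plane. Hence, by the Schwarz reflection principle, it must vanish for every $t \in \mathbb{R}$. It follows that $(\psi, \xi) = 0$ and hence that $(\mathcal{N}_{\mathcal{B}}(W[l, l_1] \cap W[l, l_2])\Omega)^\perp \subset (\mathcal{N}_{\mathcal{B}}(W[l, l_i])\Omega)^\perp$.

Lemma A.10. Let $\mathcal{B}$ be a local Haag-dual covariant subsystem of $\mathcal{F}$. Then, for every wedge $W \in \mathcal{W}$, the following holds:
$$\overline{\mathcal{N}_{\mathcal{B}}(W)\Omega} = \overline{\mathcal{N}_{\mathcal{B}}(W')\Omega}. \tag{45}$$

Proof. Let $J_W$ be the modular conjugation of $\mathcal{F}(W)$ with respect to $\Omega$. Then, by the Bisognano-Wichmann property, $J_W\mathcal{F}(W)J_W = Z\mathcal{F}(W')Z^*$. Moreover $J_W$ commutes with $\kappa$ and $E_{\mathcal{F}_{\mathcal{B}}}$. Hence
$$J_W\mathcal{F}_{\mathcal{B}}(W)J_W = J_W\big(\mathcal{F}(W) \cap \{E_{\mathcal{F}_{\mathcal{B}}}\}'\big)J_W = Z\mathcal{F}_{\mathcal{B}}(W')Z^*,$$
and consequently $J_W\mathcal{N}_{\mathcal{B}}(W)J_W = Z\mathcal{N}_{\mathcal{B}}(W')Z^*$. Accordingly,
$$\overline{\mathcal{N}_{\mathcal{B}}(W)\Omega} = J_W\overline{\mathcal{N}_{\mathcal{B}}(W)\Omega} = Z\,\overline{\mathcal{N}_{\mathcal{B}}(W')\Omega} = \overline{\mathcal{N}_{\mathcal{B}}(W')\Omega}.$$

Proposition A.11. Let $\mathcal{B}$ be a local Haag-dual covariant subsystem of $\mathcal{F}$. Then, the closed subspace $\overline{\mathcal{N}_{\mathcal{B}}(W)\Omega}$ of $\mathcal{H}_{\mathcal{F}}$ does not depend on the choice of $W \in \mathcal{W}$. Accordingly, the family $\{\mathcal{N}_{\mathcal{B}}(W) : W \in \mathcal{W}\}$ is a coherent family of modular covariant subalgebras of $\{\mathcal{F}(W) : W \in \mathcal{W}\}$ in the sense of [4, Definition VI.3.1].

Proof. Let $l_1, l_2$ and $l_1', l_2'$ be two pairs of linearly independent lightlike vectors in the closed forward lightcone. If $l_1$ and $l_1'$ are parallel then $W[l_1', l_2'] = W[l_1, l_2']$ and hence, by Lemma A.9, $\overline{\mathcal{N}_{\mathcal{B}}(W[l_1', l_2'])\Omega} = \overline{\mathcal{N}_{\mathcal{B}}(W[l_1, l_2])\Omega}$. On the other hand, if $l_1$ and $l_1'$ are linearly independent, using Lemma A.9 and Lemma A.10, we find
$$\overline{\mathcal{N}_{\mathcal{B}}(W[l_1, l_2])\Omega} = \overline{\mathcal{N}_{\mathcal{B}}(W[l_1, l_1'])\Omega} = \overline{\mathcal{N}_{\mathcal{B}}(W[l_1, l_1']')\Omega}$$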
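$$= \overline{\mathcal{N}_{\mathcal{B}}(W[l_1', l_1])\Omega} = \overline{\mathcal{N}_{\mathcal{B}}(W[l_1', l_2'])\Omega}.$$
To complete the proof it is enough to show that for any given wedge $W \in \mathcal{W}$ the subspace $\overline{\mathcal{N}_{\mathcal{B}}(W + a)\Omega}$ does not depend on $a \in M_4$. To this end we observe that $\overline{\mathcal{N}_{\mathcal{B}}(W + a)\Omega} = U(I, a)\overline{\mathcal{N}_{\mathcal{B}}(W)\Omega}$ and that, by Corollary A.7, $\overline{\mathcal{N}_{\mathcal{B}}(W + a)\Omega} \subset \overline{\mathcal{N}_{\mathcal{B}}(W)\Omega}$ if $W + a \subset W$. Because of the positivity of the energy, the conclusion then follows by a Reeh-Schlieder type argument.

In the following we shall denote the closed subspace $\overline{\mathcal{N}_{\mathcal{B}}(W)\Omega}$ by $\mathcal{H}_{\mathcal{N}_{\mathcal{B}}}$, without any reference to the irrelevant choice of the wedge $W \in \mathcal{W}$, and the corresponding orthogonal projection by $E_{\mathcal{N}_{\mathcal{B}}}$. We now define, for every open double cone $\mathcal{O} \in \mathcal{K}$,
$$\mathcal{N}_{\mathcal{B}}(\mathcal{O}) := \bigcap_{W \supset \mathcal{O}} \mathcal{N}_{\mathcal{B}}(W). \tag{46}$$
We have the following:

Proposition A.12. For every wedge $W \in \mathcal{W}$ one has
$$\mathcal{N}_{\mathcal{B}}(W) = \bigvee_{\mathcal{O} \subset W} \mathcal{N}_{\mathcal{B}}(\mathcal{O}). \tag{47}$$

We split the proof of this claim in a series of lemmata.

Lemma A.13. Let $\mathcal{O}$ be any double cone, then one has
$$\mathcal{N}_{\mathcal{B}}(\mathcal{O})E_{\mathcal{N}_{\mathcal{B}}} = \bigcap_{W \supset \mathcal{O}} \mathcal{N}_{\mathcal{B}}(W)E_{\mathcal{N}_{\mathcal{B}}}. \tag{48}$$

Proof. The inclusion "$\subset$" is clear. "$\supset$": let $X$ denote a generic element in the r.h.s., then for every $W \supset \mathcal{O}$ there exists $X_W \in \mathcal{N}_{\mathcal{B}}(W)$ such that $X = X_W|_{\mathcal{H}_{\mathcal{N}_{\mathcal{B}}}}$. We have to show that if $W_a, W_b \supset \mathcal{O}$ then $X_{W_a} = X_{W_b}$. Since $\mathcal{O}'$ is connected we can use the argument in the proof of Proposition A.5 to find wedges $W_1, \ldots, W_n$ containing $\mathcal{O}$, with $W_1 = W_a$, $W_n = W_b$ and such that $\Omega$ is separating for $\mathcal{F}(W_i) \vee \mathcal{F}(W_{i+1})$, $i = 1, \ldots, n - 1$. The latter property implies that $X_{W_i} = X_{W_{i+1}}$, $i = 1, \ldots, n - 1$, and hence that $X_{W_a} = X_{W_b}$.

Lemma A.14. For any double cone $\mathcal{O}$ one has $\overline{\mathcal{N}_{\mathcal{B}}(\mathcal{O})\Omega} = \mathcal{H}_{\mathcal{N}_{\mathcal{B}}}$.

Proof. Consider the family of algebras $\hat{\mathcal{N}}(\mathcal{O}) := \mathcal{N}_{\mathcal{B}}(\mathcal{O})E_{\mathcal{N}_{\mathcal{B}}}$ and $\hat{\mathcal{N}}(W) := \mathcal{N}_{\mathcal{B}}(W)E_{\mathcal{N}_{\mathcal{B}}}$ on $\mathcal{H}_{\mathcal{N}_{\mathcal{B}}}$. Since by the previous lemma $\hat{\mathcal{N}}(\mathcal{O}) = \cap_{W \supset \mathcal{O}} \hat{\mathcal{N}}(W)$, one deduces that
$$\hat{\mathcal{N}}(\mathcal{O})' = \bigvee_{W \supset \mathcal{O}} \hat{\mathcal{N}}(W)' = \bigvee_{W \subset \mathcal{O}'} \hat{\mathcal{N}}(W)^t = \Big(\bigvee_{W \subset \mathcal{O}'} \hat{\mathcal{N}}(W)\Big)^t$$
(in the second equality we used the fact that $J_W\mathcal{F}(W)J_W = \mathcal{F}^t(W')$). Now $\vee_{W \subset \mathcal{O}'} \mathcal{N}(W) \subset \vee_{W \subset \mathcal{O}'} \mathcal{F}(W) = \mathcal{F}(\mathcal{O}')$, therefore $\Omega$ is separating for $\vee_{W \subset \mathcal{O}'} \mathcal{N}(W)$ and hence for $\vee_{W \subset \mathcal{O}'} \hat{\mathcal{N}}(W) = (\vee_{W \subset \mathcal{O}'} \mathcal{N}(W))E_{\mathcal{N}_{\mathcal{B}}}$ and also for $(\vee_{W \subset \mathcal{O}'} \hat{\mathcal{N}}(W))^t = \hat{Z}(\vee_{W \subset \mathcal{O}'} \hat{\mathcal{N}}(W))\hat{Z}^*$. Thus $\Omega$ is separating for $\hat{\mathcal{N}}(\mathcal{O})'$.

Lemma A.15. For every wedge $W$ one has $\hat{\mathcal{N}}(W) = \vee_{\mathcal{O} \subset W} \hat{\mathcal{N}}(\mathcal{O})$.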
Proof. The inclusion "$\supset$" is clear. Now let $R$ denote the r.h.s. and $\sigma$ the modular group of $(\hat{\mathcal{N}}(W), \Omega)$; then $\sigma_t(R) = R$, $t \in \mathbb{R}$, since $\sigma$ acts like $W$-preserving Lorentz boosts on the double cones $\mathcal{O} \subset W$. But $R\Omega$ is dense in $\mathcal{H}_{\mathcal{N}_{\mathcal{B}}}$ and the conclusion follows by Takesaki's theorem on conditional expectations [46].

We are ready to prove the following theorem.

Theorem A.16. Let $\mathcal{B}$ be a local Haag-dual covariant subsystem of $\mathcal{F}$ and assume that $\mathcal{F}_{\mathcal{B}}$ is a full subsystem of $\mathcal{F}$. Then, for every wedge $W \in \mathcal{W}$, $\mathcal{F}_{\mathcal{B}}(W)' \cap \mathcal{F}(W)^b = \mathbb{C}1$.

Proof. First recall that $\mathcal{F}_{\mathcal{B}}(W)' \cap \mathcal{F}(W)^b = \mathcal{N}_{\mathcal{B}}(W)$. By Prop. A.6 we have $\mathcal{N}_{\mathcal{B}}(W) = \mathcal{F}_{\mathcal{B}}(M_4)' \cap \mathcal{F}(W)^b$. Hence, for every $\mathcal{O} \in \mathcal{K}$,
$$\mathcal{N}_{\mathcal{B}}(\mathcal{O}) \subset \bigcap_{W \supset \mathcal{O}} \big(\mathcal{F}_{\mathcal{B}}(M_4)' \cap \mathcal{F}(W)\big) = \mathcal{F}_{\mathcal{B}}(M_4)' \cap \mathcal{F}(\mathcal{O}) = \mathbb{C}1,$$
and the conclusion follows by Prop. A.12.

Corollary A.17. Let $\mathcal{B}$ be a local Haag-dual covariant subsystem of $\mathcal{F}$ and assume that $\mathcal{F}_{\mathcal{B}}$ is a full subsystem of $\mathcal{F}$. Then, for every wedge $W \in \mathcal{W}$, $\mathcal{F}_{\mathcal{B}}(W)' \cap \mathcal{F}(W) = \mathbb{C}1$.

Proof. By the Bisognano-Wichmann property the modular group of $\mathcal{F}(W)$ with respect to $\Omega$ is ergodic (cf. [39, Lemma 3.2]) and leaves $\mathcal{F}_{\mathcal{B}}(W)$ globally invariant. Hence, by [46], $\mathcal{F}_{\mathcal{B}}(W)' \cap \mathcal{F}(W)$ has an ergodic modular group and consequently is either a type III factor or trivial, but the first possibility is ruled out by Theorem A.16.

To conclude our proof of Theorem 3.3 we shall show below that Corollary A.17 implies that, if $\mathcal{B}$ is a local Haag-dual covariant subsystem of $\mathcal{F}$ and $\mathcal{F}_{\mathcal{B}}$ is full in $\mathcal{F}$, then $\mathcal{F}_{\mathcal{B}}^b(W)' \cap \mathcal{F}(W) = \mathbb{C}1$. We set $M := \mathcal{F}(W)$, $N := \mathcal{F}_{\mathcal{B}}(W)$ and denote by $\alpha_\kappa$ the automorphism of $M$ induced by the grading operator $\kappa$. $\alpha_\kappa$ defines an action of $\mathbb{Z}_2$ on $M$ and leaves $N$ globally invariant. Let $N_0$ and $M_0$ be the fixpoint algebras for the action of $\alpha_\kappa$ on $N$ and $M$ respectively. We then have $\mathcal{F}_{\mathcal{B}}^b(W) = \mathcal{F}_{\mathcal{B}}(W)^b = N_0$. Moreover, by Corollary A.17, $N \subset M$ is an irreducible inclusion of type III factors and by its proof $M$ has an ergodic modular group $\sigma^t$ commuting with $\alpha_\kappa$ and leaving $N$ globally invariant. It follows that also $N_0 \subset N$ is an irreducible inclusion of type III factors. By [32, p. 48] $N$ is generated by $N_0$ and a unitary $V$ such that $\alpha_\kappa(V) = -V$. Then $V$ normalizes $N_0$ and $V^2 \in N_0$. Let $\beta := \mathrm{Ad}\,V|_{N_0' \cap M}$. Then $\beta$ is an automorphism of period two. Moreover, the fixpoint algebra $(N_0' \cap M)^\beta$ coincides with $N' \cap M = \mathbb{C}1$. Since $N_0' \cap M$ can be either trivial or a type III factor, because of the ergodicity of $\sigma^t$, we can infer that $N_0' \cap M = \mathbb{C}1$, i.e. $\mathcal{F}_{\mathcal{B}}^b(W)' \cap \mathcal{F}(W) = \mathbb{C}1$, and this concludes the proof of Theorem 3.3.

Acknowledgements. We thank H.-J. Borchers, D. R. Davidson, S. Doplicher, R. Longo, G. Piacitelli, and J. E. Roberts for several comments and discussions at different stages of this research. A part of this work was done while the first named author (S. C.) was at the Department of Mathematics of the Università di Roma 3 thanks to a post-doctoral grant of this university. The final part was carried out while the second named author (R. C.) was visiting the Mittag-Leffler Institute in Stockholm during the year devoted to "Noncommutative Geometry". He would like to thank the Organizers for the kind invitation and the Staff for providing a friendly atmosphere and perfect working conditions.
References
1. Araki, H.: Symmetries in the theory of local observables and the choice of the net of local algebras. Rev. Math. Phys. Special Issue, 1–14 (1992)
2. Borchers, H.J.: The CPT-theorem in two-dimensional theories of local observables. Commun. Math. Phys. 143, 315–332 (1992)
3. Borchers, H.J.: Half-sided modular inclusions and the construction of the Poincaré group. Commun. Math. Phys. 179, 703–723 (1996)
4. Borchers, H.J.: On revolutionizing quantum field theory with Tomita's modular theory. J. Math. Phys. 41, 3604–3673 (2000)
5. Buchholz, D., Doplicher, S., Longo, R.: On Noether's theorem in quantum field theory. Ann. Phys. 170, 1–17 (1986)
6. Buchholz, D., Doplicher, S., Longo, R., Roberts, J.E.: A new look at Goldstone theorem. Rev. Math. Phys. Special Issue, 47–82 (1992)
7. Buchholz, D., Doplicher, S., Longo, R., Roberts, J.E.: Extension of automorphisms and gauge symmetries. Commun. Math. Phys. 155, 123–134 (1993)
8. Carpi, S.: The Virasoro algebra and sectors with infinite statistical dimension. Ann. H. Poincaré 4, 601–611 (2003)
9. Carpi, S.: On the representation theory of Virasoro nets. Commun. Math. Phys. 244, 261–284 (2004)
10. Carpi, S., Conti, R.: Classification of subsystems for local nets with trivial superselection structure. Commun. Math. Phys. 217, 89–106 (2001)
11. Carpi, S., Conti, R.: Classification of subsystems, local symmetry generators and intrinsic definition of local observables. In: R. Longo (ed.), Mathematical physics in mathematics and physics. Fields Institute Communications, Vol. 30, Providence, RI: AMS, 2001, pp. 83–103
12. Conti, R.: On the intrinsic definition of local observables. Lett. Math. Phys. 35, 237–250 (1995)
13. Conti, R.: Inclusioni di algebre di von Neumann e teoria algebrica dei campi. Ph.D. Thesis, Università di Roma Tor Vergata (1996)
14. Conti, R., Doplicher, S., Roberts, J.E.: Superselection theory for subsystems. Commun. Math. Phys. 218, 263–281 (2001)
15. D'Antoni, C., Doplicher, S., Fredenhagen, K., Longo, R.: Convergence of local charges and continuity properties of W*-inclusions. Commun. Math. Phys. 110, 325–348 (1987)
16. Davidson, D.R.: Classification of subsystems of local algebras. Ph.D. Thesis, University of California at Berkeley (1993)
17. Doplicher, S.: Local aspects of superselection rules. Commun. Math. Phys. 85, 73–85 (1982)
18. Doplicher, S.: Progress and problems in algebraic quantum field theory. In: S. Albeverio, et al. (eds.), Ideas and Methods in Quantum and Statistical Physics, Vol. 2. Cambridge: Cambridge Univ. Press, 1992, pp. 390–404
19. Doplicher, S., Haag, R., Roberts, J.E.: Fields, observables and gauge transformations I, II. Commun. Math. Phys. 13, 1–23 (1969); Commun. Math. Phys. 15, 173–200 (1969)
20. Doplicher, S., Haag, R., Roberts, J.E.: Local observables and particle statistics I, II. Commun. Math. Phys. 23, 199–230 (1971); Commun. Math. Phys. 35, 49–85 (1974)
21. Doplicher, S., Longo, R.: Local aspects of superselection rules II. Commun. Math. Phys. 88, 399–409 (1983)
22. Doplicher, S., Longo, R.: Standard and split inclusions of von Neumann algebras. Invent. Math. 75, 493–536 (1984)
23. Doplicher, S., Piacitelli, G.: Any compact group is a gauge group. Rev. Math. Phys. 14, 873–885 (2002)
24. Doplicher, S., Roberts, J.E.: Why there is a field algebra with a compact gauge group describing the superselection structure in particle physics. Commun. Math. Phys. 131, 51–107 (1990)
25. Fidaleo, F., Isola, T.: On the conjugate endomorphism in the infinite index case. Math. Scand. 77, 289–300 (1995)
26. Fidaleo, F., Isola, T.: The canonical endomorphism for infinite index inclusions. Z. Anal. und ihre Anwendungen 18, 47–66 (1999)
27. Fredenhagen, K.: Superselection sectors with infinite statistical dimension. In: Subfactors, H. Araki et al., eds., Singapore: World Scientific, 1995, pp. 242–258
28. Ge, L., Kadison, R.V.: On tensor products of von Neumann algebras. Invent. Math. 123, 453–466 (1996)
29. Guido, D., Longo, R.: Relativistic invariance and charge conjugation in quantum field theory. Commun. Math. Phys. 148, 521–551 (1992)
30. Guido, D., Longo, R.: An algebraic spin and statistics theorem. Commun. Math. Phys. 172, 517–533 (1995)
Graded Local Nets with Trivial Superselection Structure
449
31. Haag, R.: Local Quantum Physics. 2nd ed. New-York-Berlin-Heidelberg: Springer-Verlag, 1996 32. Izumi, M., Longo, R., Popa, S.: A Galois correspondence for compact groups of automorphisms of von Neumann algebras with a generalization to Kac algebras. J. Funct. Anal. 155, 25–63 (1998) 33. Kadison, R.V., Ringrose, J.R.: Fundamentals of the Theory of Operator Algebras Volume I: Elementary Theory. Reprint of the 1983 original. Graduate Studies in Mathematics 15. Am. Providence, RI: Am. Math. Soc. 1997 34. Kadison, R.V., Ringrose, J.R.: Fundamentals of the Theory of Operator Algebras Volume II: Advanced Theory. Corrected reprint of the 1986 original. Graduate Studies in Mathematics 16. Providence, RI: Am. Math. Soc. 1997 35. Kastler, D., ed.: The algebraic theory of superselection sectors. Singapore: World Scientific, 1990 36. K¨oster, S.: Local nature of coset models. Rev. Math. Phys. 16, 353–382 (2004) 37. Langerholc, J., Schroer, B.: On the structure of the von Neumann algebras generated by local functions of the free Bose field. Commun. Math. Phys. 1, 215–239 (1965) 38. Langerholc, J., Schroer, B.: Can current operators determine a complete theory? Commun. Math. Phys. 4, 123–136 (1967) 39. Longo, R.: An analogue of the Kac-Wakimoto formula and black hole conditional entropy. Commun. Math. Phys. 186, 451–479 (1997) 40. Longo, R., Rehren, K.-H.: Nets of subfactors. Rev. Math. Phys. 7, 567–597 (1995) 41. Longo, R., Xu, F.: Topological sectors and a dichotomy in conformal field theory. To appear in Commun. Math. Phys. DOI 10/1007/s00220-004-1063-1 42. Roberts, J.E.: The structure of sectors reached by a field algebra. Carg`ese Lectures in Physics, Vol. 4, New York: Gordon and Breach, 1970, pp. 61–78 43. Roberts, J.E.: Lectures on algebraic quantum field theory. In [35] 44. Str˘atil˘a, S.: Modular theory in operator algebras. Tunbridge Wells, Kent: Abacus Press, 1981 45. Str˘atil˘a, S., Zsid´o, L.: Lectures on von Neumann algebras. 
Tunbridge Wells, Kent: Abacus Press, 1979 46. Takesaki, M.: Conditional expectations in von Neumann algebras. J. Funct. Anal. 9, 306–321 (1972) 47. Thomas, L.J. III, Wichmann, E. H.: On the causal structure of Minkowski spacetime. J. Math. Phys. 38, 5044–5086 (1997) 48. Thomas, L.J. III, Wichmann, E.H.: Standard forms of local nets in quantum field theory. J. Math. Phys. 39, 2643–2681 (1998) 49. Wichmann, E.H.: On systems of local operators and the duality condition. J. Math. Phys. 24, 1633– 1644 (1983) 50. Wiesbrock, H.-W.: Half-sided modular inclusions of von Neumann algebras. Commun. Math. Phys. 157, 83–92 (1993); Erratum. Commun. Math. Phys. 184, 683–685 (1997) Communicated by Y. Kawahigashi
Commun. Math. Phys. 253, 451–480 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1201-9
Communications in
Mathematical Physics
Elliptic Eigenvalue Problems with Large Drift and Applications to Nonlinear Propagation Phenomena
Henri Berestycki1, François Hamel2, Nikolai Nadirashvili3
1 EHESS, CAMS, 54 Boulevard Raspail, 75006 Paris, France
2 Université Aix-Marseille III, LATP, Faculté des Sciences et Techniques, Case cour A, Avenue Escadrille Normandie-Niemen, 13397 Marseille Cedex 20, France
3 University of Chicago, Department of Mathematics, 5734 University Avenue, Chicago, IL 60637-1546, USA
Received: 9 December 2003 / Accepted: 22 April 2004 Published online: 5 November 2004 – © Springer-Verlag 2004
Abstract: This paper is concerned with the asymptotic behaviour of the principal eigenvalue of some linear elliptic equations in the limit of high first-order coefficients. Roughly speaking, one of the main results says that the principal eigenvalue, with Dirichlet boundary conditions, is bounded as the amplitude of the coefficients of the first-order derivatives goes to infinity if and only if the associated dynamical system has a first integral, and the limiting eigenvalue is then determined through the minimization of the Dirichlet functional over all first integrals. A parabolic version of these results, as well as other results for more general equations, are given. Some of the main consequences concern the influence of high advection or drift on the speed of propagation of pulsating travelling fronts.
The third author was partially supported by a NSF grant.
Current address: CNRS, LATP, CMI, 39 r.F. Joliot-Curie, 13453 Marseille Cedex 13, France.
Introduction
Nonlinear propagation of fronts in a reaction-diffusion equation of Fisher type often involves a linear eigenvalue problem. In particular, it is of interest to study the effects of various phenomena such as diffusion, reaction and advection on the speed of fronts. Carrying this out rests on some singular limits in these eigenvalue problems. This paper is concerned with both aspects. We derive here some limiting behaviour of eigenvalue problems and we also establish results for nonlinear propagation phenomena.
To illustrate the type of results for the first aspect, let us consider the simple case of the Dirichlet eigenvalue problem for the Laplacian with a large divergence-free drift. Consider a bounded domain $\Omega \subset \mathbb{R}^N$ of class $C^2$, with outward unit normal $\nu = \nu(x)$ (for $x \in \partial\Omega$). Let $v$ be an $L^\infty(\Omega)$ vector field such that $\operatorname{div} v = 0$ in $\mathcal{D}'(\Omega)$, and, for $A \in \mathbb{R}$, let $\lambda_A$ be the principal eigenvalue and $\varphi_A$ be the principal eigenfunction (unique up to multiplication) of
\[
\begin{cases}
-\Delta\varphi_A + A\, v\cdot\nabla\varphi_A = \lambda_A\varphi_A & \text{in } \Omega,\\
\varphi_A = 0 & \text{on } \partial\Omega.
\end{cases}
\tag{0.1}
\]
We recall that $\varphi_A \in W^{2,p}(\Omega)$ for all $1 \le p < +\infty$ (hence $\varphi_A \in C^{1,\alpha}(\overline{\Omega})$ for all $0 \le \alpha < 1$), and that one can assume that $\varphi_A > 0$ in $\Omega$.
Our purpose here is to analyze the limiting behaviour of the first eigenvalue $\lambda_A$ in the limit $A \to +\infty$ (note that through a change of $v$ into $-v$, the limit $A \to -\infty$ is treated similarly). As was already mentioned, the goal is to understand the influence of high drift coefficients. In doing this, one has to understand the role of advection and that of diffusion. The limiting behaviour of the solutions of problem (0.1) as $A \to +\infty$ turns out to be strongly related to the existence of first integrals of $v$ in $H_0^1(\Omega)$. We now define this notion:
Definition 0.1. A function $w$ is said to be a first integral of the vector field $v$ if $w \in H^1(\Omega)$, $w \not\equiv 0$ and $v\cdot\nabla w = 0$ almost everywhere in $\Omega$.
Definition 0.2. We denote by $I_0$ the set of all first integrals of $v$ which belong to $H_0^1(\Omega)$.
Notice that if both $v$ and a first integral $w$ are smooth enough, then $w$ is constant along the trajectories of the dynamical system $\dot X = v(X)$ in $\Omega$.
Theorem 0.3. The first eigenvalues $(\lambda_A)$ of (0.1) are bounded as $A \to +\infty$ if and only if $v$ has a first integral in $H_0^1(\Omega)$. Furthermore,
1) If $v$ does not have any first integral in $H_0^1(\Omega)$, then $\lambda_A \to +\infty$ as $A \to +\infty$.
2) If $v$ has a first integral in $H_0^1(\Omega)$, then
\[
\lambda_A \to \min_{w\in I_0} \frac{\int_\Omega |\nabla w|^2}{\int_\Omega w^2} \quad \text{as } A \to +\infty,
\tag{0.2}
\]
and the minimum in the right-hand side of (0.2) is achieved.
From the proof, we will see that furthermore,
\[
\lambda_A \le \frac{\int_\Omega |\nabla w|^2}{\int_\Omega w^2}
\tag{0.3}
\]
for all $A \in \mathbb{R}$ and for all $w \in I_0$.
Theorem 0.3 gives a necessary and sufficient condition for the eigenvalues $\lambda_A$ to stay bounded as $A \to +\infty$. Formula (0.3) also says that the eigenvalues $\lambda_A$ are never larger than the Rayleigh quotient of a first integral, if any. A sufficient condition for the boundedness of $\lambda_A$, stronger than the existence of first integrals (requiring additional properties for some first integrals), was given by Devinatz, Ellis and Friedman [13]. A large literature has been devoted to this type of question, which we recall in Section 1. Nevertheless, as far as we know, the necessary and sufficient condition for the boundedness of $\lambda_A$, and the limiting behaviour of $\lambda_A$ given by (0.2), had not been observed until now.
More general eigenvalue problems with large drift, as well as operators with other boundary conditions, are also dealt with in Section 2. For instance, Neumann or periodic boundary conditions are considered.
Applications to the decay of solutions of the heat equation with large drift, under Dirichlet boundary conditions, are given in Section 3.
Lastly, some applications to nonlinear propagation phenomena are given in Section 4. One is especially concerned with propagation of fronts for some reaction-diffusion equations of the type
\[
u_t - \operatorname{div}(a\nabla u) + A\, v\cdot\nabla u = f(u)
\]
in periodic media. Under some assumptions on $f$ and on the other parameters, which are recalled in Sect. 4, it is known that there exist some pulsating (or periodic) travelling fronts propagating in any given direction in which the domain is unbounded. In Sect. 4 one gives in particular a necessary and sufficient condition for the minimal speed of pulsating fronts to be asymptotically at least linear with respect to the amplitude $A$ of the drift as $A \to +\infty$. This condition involves the first integrals of the vector field $v$. Applications to reaction-diffusion equations with large reaction or small diffusion are also given.
1. The Case of Dirichlet Boundary Conditions
This section is mainly devoted to the proof of Theorem 0.3. Before proceeding to the proof, let us examine some particular cases and briefly recall the literature which has been devoted to this type of problem.
Let us first mention some particular cases where the field $v$ has first integrals. For instance, if $v$, say, vanishes on an open subset $\omega \subset\subset \Omega$, then $v$ has first integrals: namely, any nonzero $w \in H_0^1(\omega)$, extended by $0$ in $\Omega\setminus\omega$, is a first integral of $v$; furthermore, under the same conditions, it follows that $\lambda_A$ is not larger than the first eigenvalue of the Laplace operator in $\omega$ with Dirichlet boundary conditions (the latter agrees with the monotonicity of the first eigenvalue of an elliptic operator with respect to the domain, see [8]). Notice that if $v$ and a first integral $w$ are respectively continuous and $C^1$ in a neighbourhood of some point $x_0 \in \partial\Omega$, and if $\nabla w(x_0) \neq 0$, then $v(x_0)\cdot\nu(x_0) = 0$. Nevertheless, as a consequence of what was just mentioned, this condition $v\cdot\nu = 0$ on $\partial\Omega$ is not necessary in general for the existence of a first integral $w$. On the other hand, in dimension $N = 2$, if a $C^1(\overline{\Omega})$ vector field $v$ satisfies $v\cdot\nu = 0$ on $\partial\Omega$, together with $\operatorname{div} v = 0$ in $\Omega$, then it is easy to see that $v$ has first integrals.
In the case where $v$ does not have any first integral, the asymptotic rate of growth of $\lambda_A$ to $+\infty$ as $A \to +\infty$ is not known in general. However, for general continuous $v$ without the divergence-free assumption, the limit of $\lambda_A/A^2$ is known explicitly. It is given by the following formula due to Wentzell [37]:
\[
\frac{\lambda_A}{A^2} \;\xrightarrow[A\to+\infty]{}\; \frac14 \lim_{T\to+\infty}\left(\frac1T \inf \int_0^T |\dot X(t) + v(X(t))|^2\,dt\right),
\tag{1.1}
\]
where the infimum is taken over all $C^1$ functions $X : [0,T] \to \overline{\Omega}$. It follows in particular that if the dynamical system $\dot X = -v(X)$ has a trajectory which stays in $\overline{\Omega}$ for all $t \ge 0$, then $\lambda_A = o(A^2)$ as $A \to +\infty$. In particular, $\lambda_A = o(A^2)$ if $N = 2$, $v \in C(\overline{\Omega})$ and $v\cdot\nu = 0$ on at least one connected component of $\partial\Omega$. This also holds in any dimension if $v$ is continuous and has at least one nonzero first integral $w$ in $C(\overline{\Omega}) \cap C^1(\Omega)$ that vanishes on $\partial\Omega$. The last results also hold in the case $\operatorname{div} v \neq 0$. In the case of existence of first integrals in $H_0^1(\Omega)$, together with the divergence-free assumption but without the continuity assumption for $v$, Theorem 0.3 is much more precise. It says further that the $\lambda_A$'s are bounded and their limiting behaviour is given.
Formula (1.1) implies that, as soon as $v$ is continuous, whether or not it is divergence-free, one has $\lambda_A = O(A^2)$ as $A \to +\infty$.1
1 Actually, it is simple to see, using the change of variables $x' = A(x - x_0)$, that $\lambda_A = O(A^2)$ as soon as $v$ is continuous at a point $x_0 \in \Omega$.
In some special cases, one can say that $\lambda_A$ behaves like $A^2$ as $A \to +\infty$. For instance, if $v$ is a gradient field which does not vanish in $\overline{\Omega}$, namely if there exists $U \in C^2(\overline{\Omega})$ such that $v = \nabla U$ and $v \neq 0$ in $\overline{\Omega}$, and if $\varphi_A$ is an eigenfunction of (0.1) for the first eigenvalue $\lambda_A$, then the function
\[
\psi_A(x) = e^{-\frac{A}{2}U(x)}\,\varphi_A(x)
\]
is an eigenfunction of the following self-adjoint problem
\[
\begin{cases}
-\Delta\psi_A + \left(\dfrac{A^2}{4}|v|^2 - \dfrac{A}{2}\operatorname{div} v\right)\psi_A = \lambda_A\psi_A & \text{in } \Omega,\\
\psi_A = 0 & \text{on } \partial\Omega.
\end{cases}
\]
Hence,
\[
\lambda_A = \min_{\phi\in H_0^1(\Omega)\setminus\{0\}} \frac{\int_\Omega |\nabla\phi|^2 + \left(\frac{A^2}{4}|v|^2 - \frac{A}{2}\operatorname{div} v\right)\phi^2}{\int_\Omega \phi^2} \;\sim\; \frac{A^2}{4}\,\min_{\overline{\Omega}}|v|^2
\tag{1.2}
\]
as $A \to +\infty$, since $v \neq 0$ in $\overline{\Omega}$. More generally speaking, $\lambda_A \ge cA^2$ for some positive constant $c$ if there exists a $C^1(\overline{\Omega})$ function $\phi$ such that $v\cdot\nabla\phi > 0$ in $\overline{\Omega}$ (see [13]). On the other hand, $\lambda_A = O(A)$ as soon as $v$ is continuous and vanishes at a point $x_0 \in \Omega$. More generally speaking, any upper bound like $A^\alpha$ with any positive $\alpha$ can be obtained for the $\lambda_A$'s (see [13]). Let us lastly mention that other results on the asymptotics of the first eigenvalues of some elliptic problems set in the whole space $\mathbb{R}^N$ are given in [17].
Remark 1.1. (Case of a field which is not divergence-free). As already said, $\lambda_A$ behaves at most like $A^2$ as $A \to +\infty$, as soon as $v$ is continuous at a point in $\Omega$, with or without the divergence-free assumption. Furthermore, $\lambda_A$ is always nonnegative, from the maximum principle. However, Theorem 0.3 does not hold if $v$ is not divergence-free. For instance, if $v$ is continuous in $\overline{\Omega}$ and $v\cdot\nu > 0$ on $\partial\Omega$, then $\lambda_A = O(Ae^{-\alpha A})$ as $A \to +\infty$, for some constant $\alpha > 0$ (see Friedman [20]). In particular, in a ball $B$ with center $0$, for the velocity field $v = x$, one has $\lambda_A \to 0$ and $v$ does not have any first integral in $H_0^1(B)$. Notice that Theorem 0.3 also says that $\lambda_A$ converges either to $+\infty$ or to a positive constant if $v$ is divergence-free.
We now turn to the proof of Theorem 0.3. We divide it into the next two lemmas, from which Theorem 0.3 obviously follows:
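Lemma 1.2. Let $(A_n)_{n\in\mathbb{N}}$ be a sequence such that $A_n \to +\infty$ and $(\lambda_{A_n})_{n\in\mathbb{N}}$ is bounded. Assume furthermore that the principal eigenfunctions $\varphi_{A_n}$ are normalized with $\|\varphi_{A_n}\|_{L^2(\Omega)} = 1$ for all $n$. Then there exist a subsequence $n' \to +\infty$ and a function $w \in I_0$ such that $\varphi_{A_{n'}} \to w$ strongly in $L^2$ and weakly in $H^1$, and
\[
\liminf_{n'\to+\infty} \lambda_{A_{n'}} \ge \frac{\int_\Omega |\nabla w|^2}{\int_\Omega w^2} = \int_\Omega |\nabla w|^2.
\tag{1.3}
\]
In particular, $I_0 \neq \emptyset$.
Lemma 1.3. Assume that $v$ has at least one first integral in $H_0^1(\Omega)$. Then, for all $w \in I_0$ and for all $A \in \mathbb{R}$,
\[
0 \le \lambda_A \le \frac{\int_\Omega |\nabla w|^2}{\int_\Omega w^2}.
\tag{1.4}
\]
Proof of Lemma 1.2. Set $\lambda_n = \lambda_{A_n}$ and $\varphi_n = \varphi_{A_n}$. Multiplying Eq. (0.1) by $\varphi_n$ and integrating over $\Omega$ yields
\[
\int_\Omega |\nabla\varphi_n|^2 + \frac{A_n}{2}\int_\Omega v\cdot\nabla(\varphi_n)^2 = \lambda_n \int_\Omega (\varphi_n)^2 = \lambda_n.
\]
Since the function $(\varphi_n)^2$ is, say, in $W_0^{1,1}(\Omega)$ (because $\varphi_n \in H_0^1(\Omega)$), $(\varphi_n)^2$ can be approximated in the $W^{1,1}$ norm by a sequence of functions $(u_k)_{k\in\mathbb{N}}$ in $C_0^\infty(\Omega)$. Since $\operatorname{div} v = 0$ in $\mathcal{D}'(\Omega)$, it follows that $\int_\Omega v\cdot\nabla u_k = 0$ and the passage to the limit $k \to +\infty$ gives $\int_\Omega v\cdot\nabla(\varphi_n)^2 = 0$. Hence,
\[
\lambda_n = \int_\Omega |\nabla\varphi_n|^2.
\tag{1.5}
\]
From Rellich's theorem, there exist a subsequence $n' \to +\infty$ and a function $w \in H_0^1(\Omega)$ such that $\varphi_{n'} \to w$ strongly in $L^2$ and weakly in $H^1$. Furthermore, $\|w\|_{L^2(\Omega)} = 1$ and (1.3) holds. On the other hand, we also know that $\Delta\varphi_{n'} \to \Delta w$ in the sense of distributions. Therefore, dividing (0.1) by $A_{n'}$ and passing to the limit $n' \to +\infty$, one gets that $v\cdot\nabla w = 0$ almost everywhere in $\Omega$.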
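Proof of Lemma 1.3. Let $A \in \mathbb{R}$ be given and let $\varphi$ be a (positive) first eigenfunction of (0.1) for the first eigenvalue $\lambda$. We drop the subscripts $A$ for the sake of keeping notations simple. Observe first that $\lambda$ is nonnegative by (1.5) ($\lambda$ is actually positive since $\varphi$ is not constant). Let $w$ be in $I_0$. Fix any $\varepsilon > 0$ and multiply Eq. (0.1) by $w^2/(\varphi+\varepsilon) \in W_0^{1,1}(\Omega)$. One gets
\[
-\int_\Omega \frac{w^2}{\varphi+\varepsilon}\,\Delta\varphi + A\int_\Omega v\cdot\nabla\big(w^2\ln(\varphi+\varepsilon)\big) - 2A\int_\Omega w\ln(\varphi+\varepsilon)\, v\cdot\nabla w = \lambda\int_\Omega \frac{\varphi}{\varphi+\varepsilon}\,w^2.
\tag{1.6}
\]
Since the function $w^2/(\varphi+\varepsilon)$ is in $W_0^{1,1}(\Omega)$, it can be approximated in the $W^{1,1}$ norm by a sequence of functions $(u_k)_{k\in\mathbb{N}}$ in $C_0^\infty(\Omega)$. On the other hand, by standard estimates, $\Delta\varphi \in L^\infty(\Omega)$, and $\nabla\varphi \in L^\infty \cap W^{1,p}$ (for any $1 \le p < +\infty$). Therefore, the first term of the left-hand side of (1.6) can be estimated by
\[
\begin{aligned}
-\int_\Omega \frac{w^2}{\varphi+\varepsilon}\,\Delta\varphi
&= -\lim_{k\to+\infty}\int_\Omega \Delta\varphi\; u_k = \lim_{k\to+\infty}\int_\Omega \nabla\varphi\cdot\nabla u_k\\
&= \int_\Omega \nabla\varphi\cdot\nabla\!\left(\frac{w^2}{\varphi+\varepsilon}\right)
= \int_\Omega \frac{2w(\varphi+\varepsilon)\nabla\varphi\cdot\nabla w - w^2|\nabla\varphi|^2}{(\varphi+\varepsilon)^2}\\
&\le \int_\Omega |\nabla w|^2.
\end{aligned}
\]
The last inequality is pointwise: with $b = w\nabla\varphi/(\varphi+\varepsilon)$, the integrand equals $2\,\nabla w\cdot b - |b|^2 = |\nabla w|^2 - |\nabla w - b|^2 \le |\nabla w|^2$. The second term of the left-hand side of (1.6) is equal to $0$ since $w^2\ln(\varphi+\varepsilon) \in W_0^{1,1}$ and $v$ is divergence-free. Lastly, the third term clearly vanishes since $w$ is a first integral. Therefore,
\[
\lambda\int_\Omega \frac{\varphi}{\varphi+\varepsilon}\,w^2 \le \int_\Omega |\nabla w|^2
\]
and (1.4) follows from Lebesgue's dominated convergence theorem, passing to the limit as $\varepsilon \to 0^+$. This completes the proof of Lemma 1.3 and that of Theorem 0.3.
Remark 1.4. It follows from (1.5) that
\[
\lambda_A \ge \lambda_0 = \min_{\phi\in H_0^1(\Omega),\ \phi\not\equiv 0} \frac{\int_\Omega |\nabla\phi|^2}{\int_\Omega \phi^2},
\]
where $\lambda_0$ is the first eigenvalue of (0.1) with $A = 0$. Theorem 0.3 then says that if $v$ has a first integral, the $\lambda_A$'s converge to the minimum (which is achieved) of the Rayleigh quotient $\int_\Omega |\nabla w|^2 / \int_\Omega w^2$ over all first integrals (instead of over all nonzero $\phi \in H_0^1(\Omega)$ for $A = 0$).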
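As an illustration of the bound $\lambda_A \ge \lambda_0$, here is a minimal one-dimensional sketch. It is not taken from the paper: it assumes $\Omega = (0,1)$ and the constant field $v \equiv 1$, which is divergence-free and has no first integral in $H_0^1(0,1)$. Under that assumption the positive function $e^{Ax/2}\sin(\pi x)$ solves (0.1) with $\lambda_A = \pi^2 + A^2/4$, so that $\lambda_A \ge \lambda_0 = \pi^2$ and $\lambda_A \to +\infty$. The function names `eigenpair_1d` and `max_residual` are hypothetical helpers introduced only for this check.

```python
import math

def eigenpair_1d(A):
    """Explicit principal eigenpair of -phi'' + A*phi' = lam*phi on (0, 1)
    with phi(0) = phi(1) = 0 (assumed model case v = 1, not from the paper)."""
    lam = math.pi ** 2 + A ** 2 / 4.0
    phi = lambda x: math.exp(A * x / 2.0) * math.sin(math.pi * x)
    return lam, phi

def max_residual(A, n=50, h=1e-4):
    """Largest relative residual of the eigenvalue equation at the
    interior points i/n, using central finite differences of step h."""
    lam, phi = eigenpair_1d(A)
    worst = 0.0
    for i in range(1, n):
        x = i / n
        d1 = (phi(x + h) - phi(x - h)) / (2 * h)             # approximates phi'
        d2 = (phi(x + h) - 2 * phi(x) + phi(x - h)) / h ** 2  # approximates phi''
        residual = -d2 + A * d1 - lam * phi(x)
        worst = max(worst, abs(residual) / (lam * phi(x)))
    return worst
```

In this model case the ratio $\lambda_A/A^2$ tends to $1/4$, which agrees with formula (1.2) for $U(x) = x$.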
Furthermore, Lemmas 1.2 and 1.3 immediately yield the following
Corollary 1.5. If $v$ has first integrals (in $H_0^1$), then, for any sequence of eigenfunctions $(\varphi_{A_n})$ of (0.1) with $\|\varphi_{A_n}\|_{L^2(\Omega)} = 1$, there exists a subsequence $\varphi_{A_{n'}}$ which converges weakly in $H^1$ and strongly in $L^2$ to a minimizer of the Dirichlet functional among all first integrals.
Remark 1.6. Whereas the uniqueness (up to multiplication) of the minimizers of the Rayleigh quotient among all nonzero $H_0^1$ functions is a well-known fact, the uniqueness of the minimizers of the right-hand side of (0.2), i.e. the minimizers of the Rayleigh quotient among all nonzero $H_0^1$ first integrals of $v$, if any, does not hold in general. Indeed, consider for instance, in dimension 2, the disk $\Omega = \{x_1^2 + x_2^2 < 1\}$ and $v = (0, \alpha(x_1))$, where $\alpha$ is, say, an even $C^1$ function such that $\alpha = 0$ on $[-b,-a]\cup[a,b]$, $\alpha \neq 0$ on $[-1,-b)\cup(-a,a)\cup(b,1]$ with $0 < a < b < 1$. Call $\omega_- = \Omega\cap\{-b < x_1 < -a\}$ and $\omega_+ = \Omega\cap\{a < x_1 < b\}$, and let $\lambda$ and $\varphi$ be the first eigenvalue and eigenfunction of the Laplace operator in $\omega_+$ with Dirichlet boundary conditions. Up to normalization, let us assume that $\varphi > 0$ in $\omega_+$ and $\|\varphi\|^2_{L^2(\omega_+)} = 1/2$. Let us extend $\varphi$ by $0$ in $\Omega\setminus\omega_+$. It is straightforward to see in this case that $v$ has first integrals and
\[
I_0 = \{w \in H_0^1(\Omega)\setminus\{0\},\ w = 0 \text{ a.e. in } \Omega\setminus(\omega_+\cup\omega_-)\}.
\]
Hence
\[
\min_{w\in I_0} \frac{\int_\Omega |\nabla w|^2}{\int_\Omega w^2} = \lambda
\]
and the set of the minimizers of the above Rayleigh quotient is the set of functions of the type $c_+\chi_{\omega_+}\varphi(x_1,x_2) + c_-\chi_{\omega_-}\varphi(-x_1,x_2)$, where $(c_+,c_-) \in \mathbb{R}^2\setminus\{(0,0)\}$ and $\chi$ denotes the characteristic function. Furthermore, since each first eigenfunction $\varphi_A$ of (0.1) is even in $x_1$, one can say, assuming that $\|\varphi_A\|_{L^2(\Omega)} = 1$, that the whole family $(\varphi_A)_A$ converges to the function $\chi_{\omega_+}\varphi(x_1,x_2) + \chi_{\omega_-}\varphi(-x_1,x_2)$ as $A \to +\infty$.
On the contrary, let $\Omega = \{|x| < 1\}$ be the unit ball in $\mathbb{R}^N$ and assume that $v \in C^1(\overline{\Omega})$ is such that $v\cdot x = 0$ in $\Omega$ and $v(x) \neq 0$ for $x \neq 0$. Then $v$ has first integrals and $I_0$ is the set of nonzero radial functions in $H_0^1(\Omega)$. But since the Rayleigh quotient has a unique minimizer among all nonzero $H_0^1(\Omega)$ functions and since this unique minimizer is radial and is the first eigenfunction $\varphi$ of the Laplace operator in $\Omega$ with Dirichlet boundary conditions, it follows that this Rayleigh quotient has a unique minimizer in $I_0$. Furthermore, one can say in this case that, after normalization, each $\varphi_A$ is equal to $\varphi$, and, for all $A$, $\lambda_A$ is the first eigenvalue associated with $\varphi$.
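The last claim of Remark 1.6 can be checked by a short computation, added here for the reader's convenience. It uses only that the first Dirichlet eigenfunction of the Laplacian in the ball is radial; the profile $g$ and the notation $\lambda_0$ for the corresponding eigenvalue (as in Remark 1.4) are introduced for this sketch.

```latex
% Assumption: \varphi(x) = g(|x|) is the radial first Dirichlet eigenfunction
% of the Laplacian in the unit ball, with eigenvalue \lambda_0, and v \cdot x = 0.
\[
\nabla\varphi(x) = g'(|x|)\,\frac{x}{|x|}
\quad\Longrightarrow\quad
v\cdot\nabla\varphi = \frac{g'(|x|)}{|x|}\,(v\cdot x) = 0 \qquad (x \neq 0),
\]
\[
-\Delta\varphi + A\,v\cdot\nabla\varphi = -\Delta\varphi = \lambda_0\,\varphi
\ \text{ in } \Omega,
\qquad
\varphi = 0 \ \text{ on } \partial\Omega .
\]
```

Thus the positive function $\varphi$ solves (0.1) for every $A$, and the uniqueness of the principal eigenpair gives $\varphi_A = \varphi$ (after normalization) and $\lambda_A = \lambda_0$ for all $A$.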
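2. Extensions to More General Elliptic Equations, and to Neumann or Periodic Boundary Conditions
2.1. More general linear equations with Dirichlet boundary conditions. As in Sect. 1, one still assumes that $\Omega$ is a $C^2$ bounded domain of $\mathbb{R}^N$ and that $v$ is a bounded vector field such that $\operatorname{div} v = 0$ in $\mathcal{D}'(\Omega)$. Let now $a(x) = (a_{ij}(x))_{1\le i,j\le N}$ be a $C^1(\overline{\Omega})$ symmetric matrix field such that
\[
\exists\, 0 < \alpha \le \beta,\ \forall x \in \overline{\Omega},\ \forall \xi \in \mathbb{R}^N, \quad \alpha|\xi|^2 \le \sum_{1\le i,j\le N} a_{ij}(x)\xi_i\xi_j \le \beta|\xi|^2,
\tag{2.1}
\]
and let $c(x)$ be a function in $L^\infty(\Omega)$. Lastly, let $p$ be a measurable function in $\Omega$ such that $0 < p_1 \le p(x) \le p_2$ a.e. in $\Omega$, for some positive constants $p_1$ and $p_2$. The proof of Theorem 0.3 can easily be extended to the following more general situation:
Theorem 2.1. For all $A \in \mathbb{R}$, let $\lambda_A$ be the principal eigenvalue, and $\varphi_A$ be the principal eigenfunction (up to multiplication) of
\[
\begin{cases}
-\operatorname{div}(a\nabla\varphi_A) + A\,v\cdot\nabla\varphi_A + c\,\varphi_A = \lambda_A\, p\,\varphi_A & \text{in } \Omega,\\
\varphi_A = 0 & \text{on } \partial\Omega.
\end{cases}
\tag{2.2}
\]
If $v$ has a first integral in $H_0^1(\Omega)$, then
\[
\lambda_A \to \min_{w\in I_0} \frac{\int_\Omega \nabla w\cdot a(x)\nabla w + c(x)w^2}{\int_\Omega p\,w^2} \quad \text{as } A \to +\infty.
\]
If $v$ has no first integral in $H_0^1(\Omega)$, then $\lambda_A \to +\infty$ as $A \to +\infty$.
As for Theorem 0.3, we can add furthermore that
\[
\lambda_A \le \frac{\int_\Omega \nabla w\cdot a(x)\nabla w + c(x)w^2}{\int_\Omega p\,w^2}
\]
for all $A \in \mathbb{R}$ and for all $w \in I_0$.
The comments and remarks in Sect. 1 can be extended to the more general problem (2.2). Let us especially mention that one always has $\lambda_A \ge \operatorname{ess\,inf}_\Omega (c/p)$. Furthermore, $\lambda_A = O(A^2)$ as soon as $v$ is continuous at a point in $\Omega$, with or without the divergence-free assumption. Lastly, if there exists a $C^2(\overline{\Omega})$ function $U$ such that $v = a\nabla U$, then $\lambda_A$ is the principal eigenvalue, with weight $p$, of the following problem
\[
-\operatorname{div}(a\nabla\psi) + \left(\frac{A^2}{4}(\nabla U\cdot a\nabla U) - \frac{A}{2}\operatorname{div} v\right)\psi + c\,\psi = \lambda_A\, p\,\psi
\]
with Dirichlet boundary conditions, whence $\lambda_A \sim A^2 \times \min_{\overline{\Omega}}\big((\nabla U\cdot a\nabla U)/(4p)\big)$ as $A \to +\infty$ under the additional assumption that, say, $p$ is continuous in $\overline{\Omega}$ and $v$ does not vanish in $\overline{\Omega}$.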
2.2. Neumann or periodic boundary conditions. The case of Neumann or periodic boundary conditions is slightly different since, as we shall see below, first integrals always exist. Let us now describe what we mean by periodic and/or Neumann boundary conditions. Let $d$ be an integer such that $0 \le d \le N$ and call $x = (x_1,\dots,x_d)$ and $y = (x_{d+1},\dots,x_N)$. Let $L_1,\dots,L_d$ be $d$ positive numbers and let $\Omega$ be a $C^2$ connected open subset of $\mathbb{R}^N$ such that
\[
\exists R \ge 0,\quad \forall (x,y) \in \Omega,\quad |y| \le R
\tag{2.3}
\]
and
\[
\forall (k_1,\dots,k_d) \in L_1\mathbb{Z}\times\cdots\times L_d\mathbb{Z},\quad \Omega = \Omega + \sum_{i=1}^d k_i e_i,
\]
where $(e_i)_{1\le i\le N}$ is the canonical basis of $\mathbb{R}^N$. Let $C$ be the set defined by
\[
C = \{(x,y) \in \Omega,\ x \in (0,L_1)\times\cdots\times(0,L_d)\}.
\tag{2.4}
\]
In the case $d = 0$, then $\Omega$ is bounded and $C = \Omega$; otherwise, $\Omega$ is unbounded and $C$ is its cell of periodicity. A function $w$ is said to be $L$-periodic with respect to $x$ in $\Omega$ if $w(x+k,y) = w(x,y)$ almost everywhere in $\Omega$ for all $k \in L_1\mathbb{Z}\times\cdots\times L_d\mathbb{Z}$.
Let $a(x,y) = (a_{ij}(x,y))_{1\le i,j\le N}$ be a $C^1(\overline{\Omega})$ matrix field, $L$-periodic with respect to $x$, satisfying
\[
\exists\, 0 < \alpha \le \beta,\ \forall (x,y) \in \overline{\Omega},\ \forall \xi \in \mathbb{R}^N, \quad \alpha|\xi|^2 \le \sum_{1\le i,j\le N} a_{ij}(x,y)\xi_i\xi_j \le \beta|\xi|^2.
\tag{2.5}
\]
Let $v$ be an $L^\infty(\Omega)$ vector field such that $v \in W^{1,1}(\Omega_b)$ for all $b > 0$, where $\Omega_b = \{(x,y) \in \Omega,\ |x| < b\}$. Assume furthermore that $v$ is $L$-periodic with respect to $x$ and satisfies
\[
\operatorname{div} v = 0 \text{ a.e. in } \Omega \quad\text{and}\quad v\cdot\nu = 0 \text{ in } L^1_{loc}(\partial\Omega),
\tag{2.6}
\]
where $\nu$ denotes the unit outward normal to $\Omega$. Let $c(x,y)$ be an $L^\infty(\Omega)$ function and assume that $c$ is $L$-periodic with respect to $x$.
Let $H$ be the set of all functions $w$ which are $L$-periodic with respect to $x$ and which belong to $H^1(\Omega_b)$ for all $b > 0$. A nonzero function $w \in H$ is said to be a first integral of $v$ if $v\cdot\nabla w = 0$ almost everywhere in $\Omega$. Let $I$ be the set of all first integrals. Note that this set is not empty since it contains all nonzero constant functions.
For all $A \in \mathbb{R}$, there exist a unique principal eigenvalue $\lambda_A$ and a unique (up to multiplication) function $\varphi_A$, which is positive in $\overline{\Omega}$, $L$-periodic with respect to $x$, and solves
\[
\begin{cases}
-\operatorname{div}(a\nabla\varphi_A) + A\,v\cdot\nabla\varphi_A + c\,\varphi_A = \lambda_A\varphi_A & \text{in } \Omega,\\
\nu\cdot a\nabla\varphi_A = 0 & \text{on } \partial\Omega.
\end{cases}
\tag{2.7}
\]
Furthermore, $\varphi_A$ belongs to $W^{2,p}(\Omega_b)$ for all $b > 0$ and $1 \le p < +\infty$ (hence, $\varphi_A \in C^{1,\alpha}(\overline{\Omega})$ for all $0 \le \alpha < 1$).
This general framework includes the case of a bounded domain with Neumann type boundary conditions (case $d = 0$), that of an infinite (straight or oscillating) cylinder with periodic coefficients and Neumann boundary conditions, as well as the case of the whole space $\mathbb{R}^N$ with periodic holes and/or periodic coefficients (see also [4] for a detailed study of problem (2.7)).
Theorem 2.2. The principal eigenvalues $\lambda_A$ are bounded and
\[
\lambda_A \to \min_{w\in I} \frac{\int_C \nabla w\cdot a(x,y)\nabla w + c(x,y)w^2}{\int_C w^2} \quad \text{as } A \to +\infty.
\tag{2.8}
\]
As for Theorems 0.3 and 2.1, one can say furthermore that
\[
\lambda_A \le \frac{\int_C \nabla w\cdot a(x,y)\nabla w + c(x,y)w^2}{\int_C w^2}
\]
for all $A \in \mathbb{R}$ and $w \in I$.
Remark 2.3. It is clear that, unlike problems (0.1) or (2.2), the principal eigenvalues $\lambda_A$ of problem (2.7) are always bounded as $A \to +\infty$. Indeed, one has
\[
\forall A \in \mathbb{R},\quad \inf c \le \lambda_A \le \sup c.
\]
The purpose of Theorem 2.2 is to prove that the $\lambda_A$'s do converge as $A \to +\infty$ and to determine their limit.
Remark 2.4. In the case of the whole space $\mathbb{R}^N$, and under the additional assumption that $v$ has zero average, it is known that there exists a skew-symmetric and $L$-periodic matrix field $B$ such that $v = -\operatorname{div} B$ (see [14, 23]). In this particular case, (2.7) reduces to $-\operatorname{div}((a + AB)\nabla\varphi_A) + c\,\varphi_A = \lambda_A\varphi_A$. However, because of the lack of symmetry of the diffusion term, this last formulation does not seem to be helpful to derive the limit of $\lambda_A$ as $A \to +\infty$.
Proof of Theorem 2.2. The proof follows the main lines of that of Theorem 0.3. Let us especially mention that the integrations by parts are made over the set $C$. Notice also that the additional assumptions $v\cdot\nu = 0$ in $L^1_{loc}(\partial\Omega)$ and $v \in W^{1,1}(\Omega_b)$ (for all $b > 0$), together with $\operatorname{div} v = 0$, are used to guarantee that the integrals of the type $\int_C v\cdot\nabla\varphi_A^2$ vanish (from Green's formula and the periodicity assumptions).
Let us just sketch the proof of the upper bound for $\lambda_A$. Choose $A \in \mathbb{R}$ and $w \in I$. Since $\varphi_A \in C^1(\overline{\Omega})$ is positive on $\overline{\Omega}$, the function $w^2/\varphi_A$ is in $W^{1,1}(\Omega_b)$ (for all $b > 0$), and, since $\Omega$ is $L$-periodic and at least of class $C^1$, the function $w^2/\varphi_A$ can be approximated in the norms of $W^{1,1}(\Omega_b)$ by a sequence of functions $(u_k)_k$ in $C^1(\overline{\Omega})$ which are $L$-periodic with respect to $x$. Each function $u_k$ is in $H$, whence
\[
\int_C \nabla u_k\cdot a\nabla\varphi_A + A\int_C u_k\, v\cdot\nabla\varphi_A + \int_C c\,u_k\varphi_A = \lambda_A\int_C u_k\varphi_A
\]
and
\[
\int_C \nabla\!\left(\frac{w^2}{\varphi_A}\right)\cdot a\nabla\varphi_A + A\int_C w^2\, v\cdot\nabla\ln\varphi_A + \int_C c\,w^2 = \lambda_A\int_C w^2
\]
after passing to the limit $k \to +\infty$. Since $w \in I$, one has $w^2\, v\cdot\nabla\ln\varphi_A = v\cdot\nabla(w^2\ln\varphi_A)$. The function $w^2\ln\varphi_A$ is in $W^{1,1}(\Omega_b)$ (for all $b > 0$) and $L$-periodic with respect to $x$. It can then be approximated in the norms of $W^{1,1}(\Omega_b)$ by a sequence of functions $(U_k)_k$ in $C^1(\overline{\Omega})$ which are $L$-periodic with respect to $x$. Therefore, using Green's formula and (2.6), one gets
\[
\int_C v\cdot\nabla(w^2\ln\varphi_A) = \lim_{k\to+\infty}\int_C v\cdot\nabla U_k = -\lim_{k\to+\infty}\int_C U_k\operatorname{div} v = 0.
\]
One then concludes as in Lemma 1.3 that
\[
\lambda_A \le \frac{\int_C \nabla w\cdot a\nabla w + c\,w^2}{\int_C w^2},
\]
and the proof of Theorem 2.2 is complete.
Remark 2.5. (On the necessity of the assumption $v\cdot\nu = 0$ on $\partial\Omega$) Unlike problems (0.1) or (2.2) with Dirichlet boundary conditions, the additional assumption stating that $v\cdot\nu = 0$ on $\partial\Omega$ is needed to determine the limit of the principal eigenvalues $\lambda_A$ of (2.7). Otherwise, formula (2.8) may or may not hold in general. Consider for instance the eigenvalue problem
\[
\begin{cases}
-\varphi_A'' + A\varphi_A' + c(x)\varphi_A = \lambda_A\varphi_A & \text{in } (0,1),\\
\varphi_A'(0) = \varphi_A'(1) = 0.
\end{cases}
\tag{2.9}
\]
The velocity field $v = 1$ does not satisfy the assumption $v\cdot\nu = 0$ at $0$ and $1$. The first integrals of $v$ are the nonzero constants. If $c \equiv 0$, then (2.8) is clearly true since $\lambda_A = 0$ for all $A \in \mathbb{R}$. On the other hand, for general $c$, after rewriting problem (2.9) in a self-adjoint way:
\[
\begin{cases}
-(e^{-Ax}\varphi_A')' + c(x)e^{-Ax}\varphi_A = \lambda_A e^{-Ax}\varphi_A & \text{in } (0,1),\\
\varphi_A'(0) = \varphi_A'(1) = 0,
\end{cases}
\]
it follows that
\[
\lambda_A = \min_{\phi\in H^1(0,1)\setminus\{0\}} \frac{\int_0^1 e^{-Ax}(\phi')^2 + c(x)e^{-Ax}\phi^2}{\int_0^1 e^{-Ax}\phi^2} \le \frac{\int_0^1 c(x)e^{-Ax}}{\int_0^1 e^{-Ax}}.
\]
Hence, $\limsup_{A\to+\infty}\lambda_A \le c(0)$. On the other hand, since the first integrals of the vector field $v = 1$ are the nonzero constants, the right-hand side of (2.8) is here nothing else but $\int_0^1 c(x)\,dx$. Therefore, (2.8) does not hold for problem (2.9) as soon as $c(0) < \int_0^1 c(x)\,dx$.
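The upper bound used in Remark 2.5 can be evaluated numerically. The following is a minimal sketch, not part of the paper: it approximates the weighted mean $\int_0^1 c\,e^{-Ax} / \int_0^1 e^{-Ax}$ by a midpoint rule, and the choice $c(x) = x$ in the usage note is an assumption made only for illustration (then $c(0) = 0 < \int_0^1 c = 1/2$). The function names are hypothetical.

```python
import math

def weighted_mean_of_c(c, A, n=20000):
    """Midpoint-rule approximation of
    (int_0^1 c(x) e^{-A x} dx) / (int_0^1 e^{-A x} dx),
    the upper bound for lambda_A obtained with the test function phi = 1."""
    num = 0.0
    den = 0.0
    for i in range(n):
        x = (i + 0.5) / n          # midpoint of the i-th subinterval
        weight = math.exp(-A * x)  # weight of the self-adjoint form of (2.9)
        num += c(x) * weight
        den += weight
    return num / den

def plain_mean_of_c(c, n=20000):
    """Approximation of int_0^1 c(x) dx, i.e. the right-hand side of (2.8)
    when the first integrals are the nonzero constants."""
    return weighted_mean_of_c(c, 0.0, n)
```

With the illustrative choice `c = lambda x: x`, the bound `weighted_mean_of_c(c, A)` decreases towards $c(0) = 0$ as `A` grows, while `plain_mean_of_c(c)` stays at $1/2$, so the limit predicted by (2.8) cannot be attained in this example.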
3. Applications to Parabolic Equations
The results of Sect. 1 can be applied to characterize the behaviour in finite time of the solutions of Cauchy problems with large advection. Namely, let $\Omega$ be a $C^2$ bounded domain in $\mathbb{R}^N$, and let $v$ be an $L^\infty(\Omega)$ vector field satisfying $\operatorname{div} v = 0$ in $\mathcal{D}'(\Omega)$. For any $u_0 \in H_0^1(\Omega)$ and $A \in \mathbb{R}$, let $u^A = u^A(t,\cdot)$ be the solution of
\[
\begin{cases}
u^A_t = \Delta u^A - A\,v\cdot\nabla u^A, & t > 0,\\
u^A(t,\cdot) = 0 \text{ on } \partial\Omega, & t \ge 0,\\
u^A(0,\cdot) = u_0.
\end{cases}
\tag{3.1}
\]
This solution $u^A$ belongs to $C(\mathbb{R}_+, H_0^1) \cap L^2(0,T; H_0^1\cap H^2)$, and $u^A_t \in L^2(0,T; L^2)$ for all $T > 0$.
Theorem 3.1. The following properties are equivalent:
(i) there exists $u_0 \in H_0^1(\Omega)$ such that $u^A(1,\cdot) \not\to 0$ in $L^2(\Omega)$ as $A \to +\infty$,
(ii) the vector field $v$ has a nonzero first integral $w \in H_0^1(\Omega)$,
(iii) the first eigenvalues $\lambda_A$ of (0.1) are bounded as $A \to +\infty$.
One has already proved in Theorem 0.3 that properties (ii) and (iii) are equivalent. Theorem 3.1 says further that the nonexistence of first integrals is equivalent to the fact that the solutions of (3.1), with any fixed initial datum, go to 0 in L2 in finite time (see also Remark 3.2 below) as the amplitude of the drift goes to +∞. In the case of Hamiltonian systems in even dimensions, where v is given as the orthogonal gradient of a first integral H , v = ∇H , the behaviour of the functions uA can be made more precise (see Fannjiang, Papanicolaou [14], Wentzell and Freidlin [38]). But, as far as we know, the characterization of the decay of the functions uA in finite time in terms of the first integrals of v, in the general case of non-Hamiltonian systems, had not been investigated yet. Remark 3.2. (On the behaviour of the solutions of (3.1) at other finite times or other times scales). Theorem 3.1 also holds if time t = 1 in property (i) is replaced with any positive time t = T , and even if property (i) is replaced with mint1 ≤t≤t2 uA (t, ·) L2 () → 0 as A → +∞, given any two positive times 0 < t1 ≤ t2 . Indeed, as it can easily be seen by multiplying (3.1) by uA , each function t → uA (t, ·) L2 () is nonincreasing (see the proof of Theorem 3.1 below). On the other hand, any solution uA of (3.1) with A > 0 gives rise to a solution A U (t, ·) = uA (t/A, ·) of A −1 A A Ut = A U − v · ∇U , t > 0, U A (t, ·) = 0 on ∂, t ≥ 0, U A (0, ·) = u . 0 The limiting behaviour of the functions U A in finite time –which corresponds to times proportional to 1/A for the solutions of problem (3.1)– has been thoroughly studied (see [14] and [38]). The above Theorem 3.1 then gives some information about the behaviour of the functions U A for times proportional to A. Proof of Theorem 3.1. As already emphasized, assertions (ii) and (iii) are equivalent, from Theorem 0.3. Proof of ((ii) and/or (iii)) ⇒ (i). 
From the proof of Theorem 0.3, there exist a sequence $A_n \to +\infty$ and a nonzero first integral $w \in H_0^1(\Omega)$ such that the first eigenfunctions $\varphi_n = \varphi_{A_n}$ of problem (0.1), normalized so that $\|\varphi_n\|_{L^2(\Omega)} = 1$, converge to $w$ weakly in $H_0^1$ and strongly in $L^2$. Take the initial condition $u_0 = w$ for the solutions $u_n := u^{A_n}$ of problem (3.1), and call $h_n(t,\cdot) := u_n(t,\cdot) - v_n(t,\cdot)$, where $v_n(t,\cdot) = e^{-\lambda_{A_n} t}\varphi_n(\cdot)$. Since each function $v_n$ solves (3.1) with $A = A_n$ and initial condition $\varphi_n$, it follows that $h_n$ is a solution of
\[
\begin{cases}
(h_n)_t = \Delta h_n - A_n v\cdot\nabla h_n, & t > 0, \\
h_n(t,\cdot) = 0 \ \text{on } \partial\Omega, & t \ge 0, \\
h_n(0,\cdot) = w - \varphi_n.
\end{cases}
\]
Multiplying the above equation by $h_n(t,\cdot)$ and integrating by parts over $\Omega$ and $[t_1,t_2]$ leads to
\[
\frac{1}{2}\|h_n(t_2,\cdot)\|^2_{L^2(\Omega)} - \frac{1}{2}\|h_n(t_1,\cdot)\|^2_{L^2(\Omega)} = -\int_{t_1}^{t_2}\!\!\int_\Omega |\nabla h_n(t,x)|^2\,dx\,dt \le 0 \tag{3.2}
\]
for all $0 \le t_1 \le t_2$. Therefore, $\|h_n(1,\cdot)\|_{L^2(\Omega)} \le \|h_n(0,\cdot)\|_{L^2(\Omega)} = \|w - \varphi_n\|_{L^2(\Omega)}$. Since the right-hand side of the above inequality goes to 0 as $n \to +\infty$, it follows from Theorem 0.3 that $u_n(1,\cdot) \to e^{-\lambda_\infty} w \neq 0$ in $L^2(\Omega)$, where
\[
\lambda_\infty = \min_{w \in I_0} \frac{\int_\Omega |\nabla w|^2}{\int_\Omega w^2}.
\]
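The energy identity (3.2) can be spelled out as follows. This is only a sketch, assuming (as in the setting of problem (0.1)) that $v$ is divergence-free in $\Omega$ and that $h_n$ is regular enough to justify the integrations by parts; under these assumptions the drift term drops out because $h_n$ vanishes on $\partial\Omega$.

```latex
% Sketch under the assumptions: div v = 0 in \Omega, h_n = 0 on \partial\Omega.
\begin{align*}
\frac{1}{2}\frac{d}{dt}\int_\Omega h_n^2
  &= \int_\Omega h_n\,(h_n)_t
   = \int_\Omega h_n\,\Delta h_n - A_n\int_\Omega h_n\, v\cdot\nabla h_n \\
  &= -\int_\Omega |\nabla h_n|^2 - \frac{A_n}{2}\int_\Omega v\cdot\nabla (h_n^2) \\
  &= -\int_\Omega |\nabla h_n|^2 + \frac{A_n}{2}\int_\Omega (\operatorname{div} v)\, h_n^2
   = -\int_\Omega |\nabla h_n|^2 \le 0 .
\end{align*}
% Integrating in time over [t_1, t_2] gives (3.2).
```

In particular the amplitude $A_n$ does not appear in the final identity, which is why the bound on $h_n$ is uniform in $n$.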
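Proof of (i) ⇒ ((ii) and/or (iii)). Let $u_0 \in H_0^1(\Omega)\setminus\{0\}$, $\varepsilon > 0$ and $A_n \to +\infty$ be such that
\[
\forall n \in \mathbb{N}, \quad \|u_n(1,\cdot)\|_{L^2(\Omega)} \ge \varepsilon, \tag{3.3}
\]
where $u_n = u^{A_n}$. Call $u_{n,\pm}$ the solution of problem (3.1) with $A = A_n$ and initial condition $u_0^\pm \in H_0^1(\Omega)$, where $u_0^\pm = \chi_{\{\pm u_0 > 0\}}\,u_0$. By linearity and uniqueness, one has $u_n = u_{n,+} + u_{n,-}$. On the other hand, the maximum principle implies that, for all $t \ge 0$, $u_{n,+}(t,x) \ge 0$ and $u_{n,-}(t,x) \le 0$ almost everywhere in $\Omega$. Lastly, one has either $\|u_{n,+}(1,\cdot)\|_{L^2(\Omega)} \ge \varepsilon/2$ or $\|u_{n,-}(1,\cdot)\|_{L^2(\Omega)} \ge \varepsilon/2$. Therefore, up to extraction of some subsequence, and even if it means changing $\varepsilon/2$ into $\varepsilon$, one can assume without loss of generality that (3.3) holds with $u_0 \ge 0$ a.e. in $\Omega$ (and hence, for all $t \ge 0$, $u_n(t,\cdot) \ge 0$ a.e. in $\Omega$).

Call now $u_0^M = \chi_{\{u_0 < M\}}\,u_0$ and choose $M > 0$ large enough so that $\|u_0 - u_0^M\|_{L^2(\Omega)} \le \varepsilon/2$. Let $u_n^M$ and $v_n$ be the solutions of (3.1) with initial conditions $u_0^M$ and $u_0 - u_0^M \in H_0^1(\Omega)$. One has $u_n = u_n^M + v_n$ and, as in (3.2), the function $t \mapsto \|v_n(t,\cdot)\|_{L^2(\Omega)}$ is nonincreasing, whence $\|v_n(1,\cdot)\|_{L^2(\Omega)} \le \|v_n(0,\cdot)\|_{L^2(\Omega)} = \|u_0 - u_0^M\|_{L^2(\Omega)} \le \varepsilon/2$. Thus, $\|u_n^M(1,\cdot)\|_{L^2(\Omega)} \ge \varepsilon/2$. On the other hand, it follows from the maximum principle that, for all $t \ge 0$, $u_n^M(t,\cdot) \le M$ a.e. in $\Omega$. Therefore, even if it means changing $\varepsilon/2$ into $\varepsilon$, one can assume without loss of generality that (3.3) holds with $0 \le u_0 \le M$ a.e. in $\Omega$ (and hence, for all $t \ge 0$, $0 \le u_n(t,\cdot) \le M$ a.e. in $\Omega$).

Since the function $t \mapsto \|u_n(t,\cdot)\|_{L^2(\Omega)}$ is continuous and nonincreasing, it follows that
\[
\forall n \in \mathbb{N}, \quad \|u_n\|_{L^2((0,1)\times\Omega)} \le \|u_0\|_{L^2(\Omega)}.
\]
Up to extraction of some subsequence, one can then assume that $u_n \rightharpoonup w$ weakly in $L^2((0,1)\times\Omega)$. Furthermore, it also follows as in (3.2) that $\nabla u_n$ is uniformly bounded in $L^2((0,1)\times\Omega)$, namely
\[
\int_{(0,1)\times\Omega} |\nabla u_n|^2 \le \frac{1}{2}\|u_0\|^2_{L^2(\Omega)}. \tag{3.4}
\]
Therefore, standard arguments give that
\[
\frac{\partial w}{\partial x_i} \in L^2((0,1)\times\Omega) \quad \text{for all } 1 \le i \le N.
\]
Fubini's theorem then yields that $w(t,\cdot) \in H^1(\Omega)$ for almost every $t \in (0,1)$. Fix now any $i \in \{1,\dots,N\}$ and any function $\varphi(t,x)$ of class $C^1$ and with compact support in $(0,1)\times\mathbb{R}^N$. For all $n \in \mathbb{N}$ and $t \in (0,1)$, the function $u_n(t,\cdot)$ is in $H_0^1(\Omega)$ and $\varphi(t,\cdot) \in C^1(\mathbb{R}^N)$ has compact support. Hence
\[
\left|\int_\Omega u_n(t,x)\,\frac{\partial\varphi}{\partial x_i}(t,x)\,dx\right| \le \left\|\frac{\partial u_n}{\partial x_i}(t,\cdot)\right\|_{L^2(\Omega)} \|\varphi(t,\cdot)\|_{L^2(\Omega)},
\]
and it follows from the Cauchy-Schwarz inequality that
\[
\left|\int_{(0,1)\times\Omega} u_n\,\frac{\partial\varphi}{\partial x_i}\right| \le \int_0^1 \left\|\frac{\partial u_n}{\partial x_i}(t,\cdot)\right\|_{L^2(\Omega)} \|\varphi(t,\cdot)\|_{L^2(\Omega)}\,dt \le \|\nabla u_n\|_{L^2((0,1)\times\Omega)}\,\|\varphi\|_{L^2((0,1)\times\Omega)}.
\]
Therefore, one gets from (3.4) that
\[
\left|\int_{(0,1)\times\Omega} w\,\frac{\partial\varphi}{\partial x_i}\right| = \lim_{n\to+\infty}\left|\int_{(0,1)\times\Omega} u_n\,\frac{\partial\varphi}{\partial x_i}\right| \le C\,\|\varphi\|_{L^2((0,1)\times\Omega)},
\]
where $C = \|u_0\|_{L^2(\Omega)}/\sqrt{2}$ is independent of $\varphi$. One then concludes that $w(t,\cdot) \in H_0^1(\Omega)$ for almost every $t \in (0,1)$.

On the other hand, the functions $(u_n)_t$ (resp. $\Delta u_n$) converge to $w_t$ (resp. $\Delta w$) in $\mathcal{D}'((0,1)\times\Omega)$. By dividing Eq. (3.1) satisfied by $u_n$ by $A_n$ and passing to the limit $n \to +\infty$, it follows that $v\cdot\nabla w = 0$ almost everywhere in $(0,1)\times\Omega$. Hence, for almost every $t \in (0,1)$, the function $v(\cdot)\cdot\nabla w(t,\cdot)$ is in $L^1(\Omega)$ and is equal to 0 a.e. in $\Omega$.

Lastly, since $0 \le u_n \le M$ and since the function $t \mapsto \|u_n(t,\cdot)\|_{L^2(\Omega)}$ is nonincreasing, one finds that
\[
M|\Omega| \ge \int_{(0,1)\times\Omega} u_n \ge \frac{1}{M}\int_{(0,1)\times\Omega} u_n^2 \ge \frac{1}{M}\|u_n(1,\cdot)\|^2_{L^2(\Omega)},
\]
where $|\Omega|$ denotes the Lebesgue measure of $\Omega$. Thus,
\[
M|\Omega| \ge \int_{(0,1)\times\Omega} u_n \ge \frac{\varepsilon^2}{M}
\]
from (3.3). Since $u_n \rightharpoonup w$ weakly in $L^2((0,1)\times\Omega)$ and since $(0,1)\times\Omega$ is bounded, one gets that $\int_{(0,1)\times\Omega} u_n \to \int_{(0,1)\times\Omega} w$ as $n \to +\infty$, whence
\[
M|\Omega| \ge \int_{(0,1)\times\Omega} w \ge \frac{\varepsilon^2}{M} > 0. \tag{3.5}
\]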
To sum up, one knows that, for almost every $t \in (0,1)$, the function $w(t,\cdot)$ is in $H_0^1(\Omega)$ and satisfies $v(\cdot)\cdot\nabla w(t,\cdot) = 0$ almost everywhere in $\Omega$. From (3.5), one eventually concludes that there exists at least one $t \in (0,1)$ such that $w(t,\cdot)$ is a nonzero first integral of $v$ in $H_0^1(\Omega)$. That completes the proof of Theorem 3.1.

Remark 3.3. Under the same notations as in Sect. 2.1, it is immediate to check that Theorem 3.1 also holds if Eq. (3.1) is replaced with the more general parabolic problem
\[
u^A_t = \operatorname{div}(a\nabla u^A) - A\,v\cdot\nabla u^A + c\,u^A,
\]
where $c \in L^\infty(\Omega)$ and $a$ is a $C^1(\overline{\Omega})$ symmetric matrix field satisfying (2.1).
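The time rescaling used in Remark 3.2 can be verified by the chain rule. The following is a minimal check, assuming that (3.1) reads $u^A_t = \Delta u^A - A\,v\cdot\nabla u^A$ (the form suggested by the equation satisfied by $h_n$ in the proof of Theorem 3.1).

```latex
% Assumption: (3.1) is u^A_t = \Delta u^A - A v\cdot\nabla u^A, and U^A(t,\cdot) := u^A(t/A,\cdot).
\begin{align*}
U^A_t(t,\cdot) &= \frac{1}{A}\,u^A_t\!\left(\tfrac{t}{A},\cdot\right)
  = \frac{1}{A}\Big(\Delta u^A - A\,v\cdot\nabla u^A\Big)\!\left(\tfrac{t}{A},\cdot\right) \\
 &= A^{-1}\Delta U^A(t,\cdot) - v\cdot\nabla U^A(t,\cdot), \\
u^A(1,\cdot) &= U^A(A,\cdot).
\end{align*}
```

The last line makes explicit why a statement about $u^A$ at time $t = 1$ is a statement about $U^A$ at times proportional to $A$.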
4. Applications to Nonlinear Propagation Phenomena

The previous sections dealt with the asymptotic behaviour of some solutions of linear elliptic or parabolic equations with high first-order coefficients. This behaviour is directly related to the first integrals of the underlying velocity field $v$. This section is concerned with nonlinear propagation phenomena for some reaction-diffusion-advection equations in periodic domains. We study the asymptotic behaviour of the speeds of propagation of pulsating travelling fronts in the limit of large advection coefficients, and, as in the previous sections, we will shed light on the role played by the first integrals of the velocity field.

We consider here the general framework described in Sect. 2.2. Let $\Omega$ be a domain satisfying (2.3-2.4) and assume here that $d \ge 1$. The latter implies that $\Omega$ is unbounded in the $d$ variables $(x_1,\dots,x_d)$. Let $v = (v_1,\dots,v_N)$ be a vector field which is $L$-periodic with respect to $x$, of class $C^{1,\delta}(\overline{\Omega})$ (with $\delta > 0$), and satisfies (2.6) and
\[
\forall\, 1 \le i \le d, \quad \int_C v_i\,dx\,dy = 0. \tag{4.1}
\]
Let $a(x,y) = (a_{ij}(x,y))_{1\le i,j\le N}$ be a symmetric, $L$-periodic with respect to $x$, and $C^3(\overline{\Omega})$ matrix field satisfying (2.5). Lastly, let $f(x,y,u)$ be a nonnegative function defined in $\overline{\Omega}\times[0,1]$ and such that
\[
\begin{cases}
f \text{ is } L\text{-periodic with respect to } x, \\
f \text{ is globally Lipschitz-continuous and } \exists\,\delta > 0,\ f \text{ is } C^{1,\delta} \text{ with respect to } u, \\
\forall (x,y) \in \overline{\Omega},\ f(x,y,0) = f(x,y,1) = 0, \\
\exists\,\rho \in (0,1),\ \forall (x,y) \in \overline{\Omega},\ \forall\, 1-\rho \le s \le s' \le 1,\ f(x,y,s) \ge f(x,y,s'), \\
\forall s \in (0,1),\ \exists (x,y) \in \overline{\Omega},\ f(x,y,s) > 0, \\
\forall (x,y) \in \overline{\Omega},\ f_u'(x,y,0) := \lim_{u\to 0^+} f(x,y,u)/u > 0.
\end{cases} \tag{4.2}
\]
The simplest case of a function $f(x,y,u)$ satisfying (4.2) is when $f(x,y,u) = \tilde f(u)$ and the $C^{1,\delta}$ function $\tilde f$ satisfies: $\tilde f(0) = \tilde f(1) = 0$, $\tilde f > 0$ on $(0,1)$, $\tilde f'(0) > 0$ and $\tilde f'(1) < 0$. Such nonlinearities arise especially in combustion and biological models [1, 15, 31]. Another example of such a function $f$ is $f(x,y,u) = h(x,y)\tilde f(u)$, where $\tilde f$ is as before and $h$ is $L$-periodic with respect to $x$, Lipschitz-continuous and positive in $\overline{\Omega}$ (see [33] for such an example in ecology).

We are interested in the propagation of fronts in the domain $\Omega$, with the diffusion $a$, the advection $Av$ and the reaction $f$, and we want to analyze the asymptotic behaviour of their speeds of propagation in the limit $A \to +\infty$. More precisely, for any given unit vector $e = (e_1,\dots,e_d)$ in $\mathbb{R}^d$, a function $u(t,x,y)$, defined for all $t \in \mathbb{R}$ and $(x,y) \in \overline{\Omega}$, satisfying $0 \le u \le 1$ and solving
\[
\begin{cases}
\dfrac{\partial u}{\partial t} - \operatorname{div}(a\nabla u) + A\,v\cdot\nabla u = f(x,y,u), & t \in \mathbb{R},\ (x,y) \in \Omega, \\
\nu\cdot a\nabla u = 0, & t \in \mathbb{R},\ (x,y) \in \partial\Omega,
\end{cases} \tag{4.3}
\]
is called a pulsating travelling front propagating in direction $e$ with a so-called effective speed $c \neq 0$ if it satisfies
\[
\begin{cases}
\forall k \in \prod_{i=1}^d L_i\mathbb{Z},\ \forall (t,x,y) \in \mathbb{R}\times\overline{\Omega}, \quad u\!\left(t - \dfrac{k\cdot e}{c},\,x,\,y\right) = u(t,\,x+k,\,y), \\
u(t,x,y) \underset{x\cdot e\to+\infty}{\longrightarrow} 0, \qquad u(t,x,y) \underset{x\cdot e\to-\infty}{\longrightarrow} 1,
\end{cases} \tag{4.4}
\]
where the above limits hold locally in $t$ and uniformly in $y$ and in the directions of $\mathbb{R}^d$ orthogonal to $e$. Call $\tilde e$ the vector defined by $\tilde e = (e_1,\dots,e_d,0,\dots,0) \in \mathbb{R}^N$.

Under the above assumptions, it was proved in [4] that there exists $c_A^*(e) > 0$ such that pulsating travelling fronts $u$ in the direction $e$ with the speed $c$ exist if and only if $c \ge c_A^*(e)$ (other results with more general nonlinearities $f$ were proved in [4], see Remark 4.2 below).

Theorem 4.1. Let $e$ be a unit direction in $\mathbb{R}^d$. Under the above assumptions and under the notations of Sect. 2.2, one has:

a) if there is a first integral $w \in I$ such that $\int_C (v\cdot\tilde e)\,w^2 > 0$, then
\[
0 < \gamma \le \liminf_{A\to+\infty} \frac{c_A^*(e)}{A} \le \limsup_{A\to+\infty} \frac{c_A^*(e)}{A} \le \delta, \tag{4.5}
\]
where
\[
0 < \gamma = \max\left( \sup_{w\in I^1} \frac{\int_C (v\cdot\tilde e)\,w^2}{\int_C w^2},\ \sup_{w\in I^2} \frac{m\int_C (v\cdot\tilde e)\,w^2}{m\int_C w^2 + \int_C \nabla w\cdot a\nabla w - \int_C \zeta w^2} \right),
\]
\[
m = |C|^{-1}\int_C \zeta, \qquad \zeta(x,y) = f_u'(x,y,0),
\]
\[
I^1 = \Big\{w \in I,\ \int_C \zeta w^2 \ge \int_C \nabla w\cdot a\nabla w > 0\Big\}, \qquad I^2 = \Big\{w \in I,\ 0 < \int_C \zeta w^2 < \int_C \nabla w\cdot a\nabla w\Big\},
\]
and
\[
0 < \delta = \sup_{w\in I} \frac{\int_C (v\cdot\tilde e)\,w^2}{\int_C w^2};
\]

b) if all first integrals $w \in I$ satisfy $\int_C (v\cdot\tilde e)\,w^2 \le 0$, then $c_A^*(e) = o(A)$ as $A \to +\infty$.
Remark 4.2. (General positive nonlinearities and combustion-type nonlinearities) Since the necessary and sufficient condition for the minimal speed to have a lower bound which is linear with respect to the amplitude of the flow only depends on the flow $v$ itself, the results of Theorem 4.1 can be used to deduce some facts for other nonlinearities $f$. Consider for instance either the case of a general nonnegative nonlinearity $f$ satisfying the first five assumptions in (4.2), but maybe not the last one, or, second, let $f$ be a nonnegative function satisfying
\[
\begin{cases}
f \text{ is } L\text{-periodic with respect to } x, \\
f \text{ is globally Lipschitz-continuous and } \exists\,\delta > 0,\ f \text{ is } C^{1,\delta} \text{ with respect to } u, \\
\exists\,\theta \in (0,1),\ \forall (x,y) \in \overline{\Omega},\ \forall s \in [0,\theta]\cup\{1\},\ f(x,y,s) = 0, \\
\exists\,\rho \in (0,1-\theta),\ \forall (x,y) \in \overline{\Omega},\ \forall\, 1-\rho \le s \le s' \le 1,\ f(x,y,s) \ge f(x,y,s'), \\
\forall s \in (\theta,1),\ \exists (x,y) \in \overline{\Omega},\ f(x,y,s) > 0.
\end{cases} \tag{4.6}
\]
It is known (see [4]) that, under the above notations, there still exists a minimal speed $c_A^*(e) > 0$ in the first case, whereas there is a unique speed $c_A(e) > 0$ and a unique (up to shift in time) pulsating travelling front solving (4.3-4.4) if $f$ satisfies (4.6). In both cases, one has $c_A^*(e)$ (resp. $c_A(e)$) $\le \overline{c}_A^*(e)$, where $\overline{c}_A^*(e)$ is the minimal speed associated with a nonlinearity $\overline{f}$ satisfying (4.2) and such that $\overline{f} \ge f$. Therefore, $c_A^*(e)$ (resp. $c_A(e)$) $= O(A)$ and
\[
0 \le \limsup_{A\to+\infty} \frac{c_A^*(e)}{A} \ \Big(\text{resp. } \limsup_{A\to+\infty} \frac{c_A(e)}{A}\Big) \le \sup_{w\in I} \frac{\int_C (v\cdot\tilde e)\,w^2}{\int_C w^2}, \tag{4.7}
\]
where the latter is zero if there is no first integral $w \in I$ such that $\int_C (v\cdot\tilde e)\,w^2 > 0$ (indeed, the choice $w = 1$ gives $\int_C (v\cdot\tilde e)\,w^2 = 0$ because of (4.1)). Furthermore,
\[
\limsup_{A\to+\infty} \frac{c_A^*(e)}{A} \ \Big(\text{resp. } \limsup_{A\to+\infty} \frac{c_A(e)}{A}\Big) \le \max_{(x,y)\in\overline{\Omega}} v(x,y)\cdot\tilde e. \tag{4.8}
\]
Many papers have been devoted to the study of travelling fronts for reaction-diffusion equations of the type (4.3) since the pioneering paper by Kolmogorov, Petrovsky and Piskunov [31] in the one-dimensional case, for the equation $u_t = u_{xx} + f(u)$. Other results for other nonlinear functions $f$, including the bistable case, were obtained in [1, 15, 29]. Periodic nonlinearities $f(x,u)$ in space dimension 1 were first considered by Shigesada, Kawasaki and Teramoto [33], and by Hudson and Zinner [28]. The case of shear flows $v = (\alpha(y),0,\dots,0)$ in straight infinite cylinders $\Omega = \mathbb{R}\times\omega$ was dealt with by Berestycki, Larrouturou and Lions [6], and Berestycki and Nirenberg [7]. The case of the whole space $\mathbb{R}^N$ with periodic diffusion and advection was considered by Xin [35] for a combustion-type nonlinearity $f$ (for which the speed of propagation of fronts is unique). The homogenization limit in $\mathbb{R}^N$ with coefficients having small periods was investigated by Fannjiang and Papanicolaou [14], Freidlin [18], Heinze [23], Majda and Souganidis [32], and Xin [36]. Heinze also considered the case of the whole space with small periodic holes [24]. Formulas for the unique or minimal speeds of propagation of fronts were obtained by Hamel [22] and Heinze, Papanicolaou and Stevens [26] (similar formulas for systems of one-dimensional equations had been proved by Volpert, Volpert and Volpert [34]).

Let us temporarily come back to the case of shear flows $v = (\alpha(x_2,\dots,x_N),0,\dots,0)$ in straight infinite cylinders $\Omega = \mathbb{R}\times\omega = \{(x_1,x'),\ x_1 \in \mathbb{R},\ x' = (x_2,\dots,x_N) \in \omega\}$, where the section $\omega$ may or may not be bounded (under our general assumptions, the boundedness of $\omega$ simply means that $d = 1$). Assume that the diffusion $a$ and the reaction $f$ are independent of $x_1$. Therefore, under the general notations of this section, $L_1$ can be any arbitrary positive number and the travelling fronts in direction $\pm e_1$ can be written as $u(t,x) = \phi(x_1 + ct, x')$.

Without loss of generality, assume furthermore that $\alpha$ is not constant and has zero average (over $\omega$ if $d = 1$ or over the cell of periodicity if $d > 1$). Under these conditions, several lower and upper bounds for the speeds of such fronts were derived by Audoly, Berestycki and Pomeau [2], Constantin, Kiselev and Ryzhik [12, 30] and Heinze [25] for combustion-type or general positive nonlinearities. Furthermore, Berestycki [3] proved that, if the function $f = f(u)$ satisfies (4.2) and the additional assumption $f(u) \le f'(0)u$ for all $u \in [0,1]$, then 1) $c_A^*(\pm e_1)$ (see footnote 2) is increasing with $A$, 2) $c_A^*(\pm e_1)/A$ is decreasing with $A$ and
\[
\exists\,\beta > 0, \quad c_A^*(\pm e_1)/A \to \beta > 0 \ \text{ as } A \to +\infty.
\]
The latter is more precise than the results of Theorem 4.1 in that case. Such an exact linear behaviour is unknown for the general periodic setting, as well as for a function $f = f(u)$ satisfying only (4.2), or (4.6), even in the case of shear flows. However, one can deduce from Theorem 4.1 that, for a shear flow $v = (\alpha(x'),0,\dots,0)$ with nonconstant $\alpha$ having zero average, one has
\[
\liminf_{A\to+\infty} \frac{c_A^*(e)}{A} > 0 \tag{4.9}
\]
for any function $f$ satisfying (4.2) and for any unit vector $e \in \mathbb{R}^d$ such that $\tilde e\cdot e_1 \neq 0$, the diffusion $a$ and the reaction $f$ possibly depending (periodically) on $x_1$ with period $L_1$ (a more precise lower bound in (4.9) can also be derived from Theorem 4.1). Formula (4.9) follows from the observation that any nonzero function $w = w(x_1,x') = w(x')$ which is in $H$ is a first integral; therefore, there are first integrals $w \in I$ such that $\int_C (v\cdot\tilde e)\,w^2 > 0$ (see also the proof of Corollary 4.3 below).
Recently, Heinze [25] showed that, in the case of shear flows in infinite cylinders, without dependence on $x_1$ in the coefficients of (4.3) and with $f = f(u)$ satisfying (4.2), inequality (4.8) for $c_A^*(\pm e_1)$ may be strict if, for instance, $f'(0)$ is small enough. Audoly, Berestycki and Pomeau [2] formally derived the asymptotics $c_A^*(\pm e_1) \sim A\max_{x'\in\overline{\omega}}(\pm\alpha(x'))$ in the limit of high reaction (together with high advection). Under the above assumptions, the latter was made rigorous by Constantin, Kiselev and Ryzhik [12]. That can also be viewed as an immediate consequence of Theorem 4.1, in a more general periodic setting and with a direction of propagation which may not be that of the flow:

Corollary 4.3. Under the assumptions of Theorem 4.1, call $c_{A,B}^*(e)$ the minimal speed of pulsating travelling fronts solving (4.3-4.4) with nonlinearity $Bf$ instead of $f$, for $B > 0$. Then, for any $\varepsilon > 0$, there exists $B_0 > 0$ such that, for all $B \ge B_0$,
\[
\sup_{w\in I} \frac{\int_C (v\cdot\tilde e)\,w^2}{\int_C w^2} - \varepsilon \le \liminf_{A\to+\infty} \frac{c_{A,B}^*(e)}{A} \le \limsup_{A\to+\infty} \frac{c_{A,B}^*(e)}{A} \le \sup_{w\in I} \frac{\int_C (v\cdot\tilde e)\,w^2}{\int_C w^2}. \tag{4.10}
\]
In particular, under the above assumptions, let $v = (\alpha(x'),0,\dots,0)$ be a non-constant shear flow with zero average, in an infinite cylinder $\Omega = \mathbb{R}\times\omega$, where $x' = (x_2,\dots,x_N)$, where $\omega$ may or may not be bounded, and where $a$ and $f$ depend (periodically) on $x_1$ with period $L_1$. Then, for any $\varepsilon > 0$, there exists $B_0 > 0$ such that, for all $B \ge B_0$,
\[
\max_{x'\in\overline{\omega}} \big(\tilde e\cdot e_1\,\alpha(x')\big) - \varepsilon \le \liminf_{A\to+\infty} \frac{c_{A,B}^*(e)}{A} \le \limsup_{A\to+\infty} \frac{c_{A,B}^*(e)}{A} \le \max_{x'\in\overline{\omega}} \big(\tilde e\cdot e_1\,\alpha(x')\big). \tag{4.11}
\]

Footnote 2. One here makes a slight abuse of notation by calling $c_A^*(\pm e_1)$ the minimal speed of the travelling fronts propagating in the direction $\pm x_1$.
The proof of this corollary is given at the end of this section.

Another special class of flows are the rotating flows. Consider, say, a two-dimensional rotating flow of the type $v = (-\partial_y\psi, \partial_x\psi)$, where $\psi(x,y) = \sin(x)\sin(y)$. Under the above notations, formal arguments by Audoly, Berestycki and Pomeau [2] lead to an asymptotic behaviour of the type $c_A^*(e_1) \sim \beta A^{1/4}$ as $A \to +\infty$, for some $\beta > 0$. The estimate $c_A^*(e_1) \ge \kappa A^{1/5}$, for some positive constant $\kappa$, was obtained by Kiselev and Ryzhik [30] (these estimates actually hold for the bulk burning rate, see [11], which concerns, more generally speaking, the solutions of the Cauchy problem (4.3) in infinite cylinders and coincides with the speed of propagation for travelling fronts, if any). Theorem 4.1 applied to this case gives the following additional information that, in any direction $e$, the minimal speed cannot grow like $A$:

Corollary 4.4. Let $v = (-\partial_y\psi, \partial_x\psi)$ be a two-dimensional rotating flow where, say, $\psi(x,y) = \sin(x)\sin(y)$. Then, under the notations of Theorem 4.1, $c_A^*(e) = o(A)$ as $A \to +\infty$, for any direction $e$ of $\mathbb{R}^2$.

Proof. For such a flow $v$, it is easy to see that, for any direction $e$ and any first integral $w \in I$, one has $\int_{(0,2\pi)^2} (v\cdot e)\,w^2 = 0$. The conclusion then follows from Theorem 4.1.
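The vanishing of the integral in the proof of Corollary 4.4 can be illustrated on the subclass of first integrals of the form $w = g(\psi)$. This is only a sketch: the smooth function $g$ and its primitive $G$ are notation introduced here for illustration, and a general first integral in $I$ need not be of this global form.

```latex
% Assumption: w = g(\psi) with g smooth, and G(s) := \int_0^s g(\sigma)^2\,d\sigma.
\begin{align*}
v\cdot\nabla w &= g'(\psi)\,\big(-\partial_y\psi\,\partial_x\psi + \partial_x\psi\,\partial_y\psi\big) = 0, \\
(v\cdot e)\,w^2 &= \big(-e_1\,\partial_y\psi + e_2\,\partial_x\psi\big)\,g(\psi)^2
   = -e_1\,\partial_y\big[G(\psi)\big] + e_2\,\partial_x\big[G(\psi)\big], \\
\int_{(0,2\pi)^2} (v\cdot e)\,w^2 &= 0
   \quad\text{since } G(\psi) \text{ is } 2\pi\text{-periodic in } x \text{ and in } y .
\end{align*}
```

The integrand is an exact derivative of a periodic function, so no direction $e$ can produce a positive value, which is the situation of part b) of Theorem 4.1.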
Before going into the proof of Theorem 4.1, let us lastly point out another consequence of Theorem 4.1, which deals with the case of small diffusion and bounded (from above and below) advection and reaction:

Corollary 4.5. Under the notations at the beginning of this section, let $\gamma_\varepsilon^*(e) > 0$ be the minimal speed of propagation of pulsating fronts $u$ solving (4.4) and
\[
\begin{cases}
\dfrac{\partial u}{\partial t} - \varepsilon\operatorname{div}(a\nabla u) + v\cdot\nabla u = f(x,y,u), & t \in \mathbb{R},\ (x,y) \in \Omega, \\
\nu\cdot a\nabla u = 0, & t \in \mathbb{R},\ (x,y) \in \partial\Omega.
\end{cases} \tag{4.12}
\]
a) Then
\[
\liminf_{\varepsilon\to 0^+} \gamma_\varepsilon^*(e) \ge \sup_{w\in I} \frac{\int_C (v\cdot\tilde e)\,w^2}{\int_C w^2}. \tag{4.13}
\]
b) Furthermore, if $v$ is a shear flow $v = (\alpha(x'),0,\dots,0)$ with zero average, in an infinite cylinder $\Omega = \mathbb{R}\times\omega$, where $x' = (x_2,\dots,x_N)$ and $\omega$ may or may not be bounded, if $a_{i1} = 0$ for $i \ge 2$, if $a_{11}$ is constant and if $a$ and $f$ only depend on $x'$, then
\[
\gamma_\varepsilon^*(\pm e_1) \to \max_{x'\in\overline{\omega}} \big(\pm\alpha(x')\big) \quad \text{as } \varepsilon \to 0^+.
\]
Corollary 4.5 is proved at the end of this section.

Remark 4.6. Other asymptotics have been considered in the literature. Many works have for instance dealt with the solutions of Cauchy problems for equations of the type (4.12), with small diffusion $\varepsilon$, together with large reaction $\varepsilon^{-1}f$. Typically, the solutions of such Cauchy problems converge as $\varepsilon \to 0^+$ to two-phase solutions of Hamilton-Jacobi type equations, separated by interfaces: see Freidlin [19] and Majda and Souganidis [32], where other spatio-temporal scales and various homogenization limits have also been considered.
Let us now turn to the

Proof of Theorem 4.1. Let us begin with the

Proof of the lower bound of a). Let $e$ be a unit direction of $\mathbb{R}^d$ and choose any first integral $w \in I$ such that $\int_C (v\cdot\tilde e)\,w^2 > 0$. We shall now estimate the minimal speed $c_A^*(e)$ from below for large $A$.

Remember that $\zeta$ is the function defined in $\overline{\Omega}$ by $\zeta(x,y) = f_u'(x,y,0)$. It follows from [4] and [5] that
\[
c_A^*(e) \ge \min_{\lambda>0} \frac{\mu(\lambda)}{\lambda}, \tag{4.14}
\]
where $\mu(\lambda)$ is the principal eigenvalue of the operator
\[
L_\lambda\psi := \operatorname{div}(a\nabla\psi) - \lambda\big[\operatorname{div}(a\tilde e\,\psi) + \tilde e\cdot a\nabla\psi\big] - A\,v\cdot\nabla\psi + \big(\lambda A\,v\cdot\tilde e + \lambda^2\,\tilde e\cdot a\tilde e + \zeta\big)\psi
\]
acting on the set $E_\lambda$ of functions $\psi(x,y)$ which are $L$-periodic with respect to $x$ in $\overline{\Omega}$ and satisfy $\nu\cdot a(-\lambda\tilde e\,\psi + \nabla\psi) = 0$ on $\partial\Omega$ (see footnote 3). The right-hand side of (4.14) only depends on $e$, the domain $\Omega$, the coefficients $Av$ and $a$, and the dependence on $f$ is only through $f_u'(\cdot,\cdot,0)$. Let us also mention that equality holds in (4.14) under the additional assumption $f(x,y,u) \le f_u'(x,y,0)\,u$ for all $u \in [0,1]$ and for all $(x,y) \in \overline{\Omega}$ (see [5]). In the case where $\Omega = \mathbb{R}^N$, $a = I$, $v = 0$ and $f = f(u)$ satisfies (4.2) with $f(u) \le f'(0)u$ in $[0,1]$, the latter reduces to the well-known KPP formula $c_A^*(e) = 2\sqrt{f'(0)}$ for the minimal speed of planar fronts [31].

Fix any positive $\lambda$ and call $\psi$ the unique (up to multiplication) positive (in $\overline{\Omega}$) solution $\psi \in E_\lambda$ of
\[
L_\lambda\psi = \mu(\lambda)\psi. \tag{4.15}
\]
First, divide (4.15) by $\psi$ and integrate over the cell $C$. Using (2.6) and the boundary conditions satisfied by $\psi$, it follows that
\[
\int_C \left( \frac{\nabla\psi\cdot a\nabla\psi}{\psi^2} - 2\lambda\,\frac{\tilde e\cdot a\nabla\psi}{\psi} + \lambda^2\,\tilde e\cdot a\tilde e + \zeta \right) = \mu(\lambda)|C|.
\]
Therefore,
\[
\mu(\lambda)|C| = \int_C \left(\frac{\nabla\psi}{\psi} - \lambda\tilde e\right)\cdot a\left(\frac{\nabla\psi}{\psi} - \lambda\tilde e\right) + \int_C \zeta,
\]
whence
\[
\mu(\lambda) \ge m = |C|^{-1}\int_C \zeta \tag{4.16}
\]
because of (2.5).

Footnote 3. Such operators $L_\lambda$ also arise in Bloch eigenvalue problems (see [9, 10]).
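Then, multiply (4.15) by $w^2/\psi$, where $w \in I$ is a first integral such that $\int_C (v\cdot\tilde e)\,w^2 > 0$. Observe that the term
\[
\int_C A\,v\cdot\frac{\nabla\psi}{\psi}\,w^2
\]
vanishes after integration by parts since $v\cdot\nabla w = 0$, since $v$, $\psi$ and $w$ are $L$-periodic with respect to $x$, and because of (2.6). Therefore,
\[
\mu(\lambda)\int_C w^2 = \int_C \left( \frac{\nabla\psi\cdot a\nabla\psi}{\psi^2}\,w^2 - 2\,\frac{w\nabla w\cdot a\nabla\psi}{\psi} - 2\lambda w^2\,\frac{\tilde e\cdot a\nabla\psi}{\psi} \right) + \int_C \Big( 2\lambda w\,\tilde e\cdot a\nabla w + \lambda^2 w^2\,\tilde e\cdot a\tilde e + \zeta w^2 + \lambda A\,(v\cdot\tilde e)\,w^2 \Big). \tag{4.17}
\]
For any $t \in (0,1)$, the first three terms in the right-hand side of (4.17) can be estimated as follows:
\[
\begin{aligned}
\int_C \left( \frac{\nabla\psi\cdot a\nabla\psi}{\psi^2}\,w^2 - 2\,\frac{w\nabla w\cdot a\nabla\psi}{\psi} - 2\lambda w^2\,\frac{\tilde e\cdot a\nabla\psi}{\psi} \right)
&= \int_C \left( t\,\frac{\nabla\psi\cdot a\nabla\psi}{\psi^2}\,w^2 - 2\,\frac{w\nabla w\cdot a\nabla\psi}{\psi} \right) \\
&\quad + \int_C \left( (1-t)\,\frac{\nabla\psi\cdot a\nabla\psi}{\psi^2}\,w^2 - 2\lambda w^2\,\frac{\tilde e\cdot a\nabla\psi}{\psi} \right) \\
&\ge -\int_C \left( \frac{1}{t}\,\nabla w\cdot a\nabla w + \frac{1}{1-t}\,\lambda^2 w^2\,\tilde e\cdot a\tilde e \right)
\end{aligned}
\]
because of (2.5). Putting the above inequality into (4.17) leads to
\[
\mu(\lambda)\int_C w^2 \ge -\int_C \left( \frac{1}{t}\,\nabla w\cdot a\nabla w + \frac{t}{1-t}\,\lambda^2 w^2\,\tilde e\cdot a\tilde e \right) + \int_C \Big( 2\lambda w\,\tilde e\cdot a\nabla w + \zeta w^2 + \lambda A\,(v\cdot\tilde e)\,w^2 \Big). \tag{4.18}
\]
Let us set
\[
\gamma = \int_C \nabla w\cdot a\nabla w > 0 \quad \text{and} \quad \eta = \int_C w^2\,\tilde e\cdot a\tilde e > 0.
\]
Maximizing the right-hand side of (4.18) over all $t \in (0,1)$ gives, with $t = \big(1 + \lambda\sqrt{\eta/\gamma}\big)^{-1}$,
\[
\begin{aligned}
\mu(\lambda)\int_C w^2 &\ge \int_C \left( -\Big(1 + \lambda\sqrt{\tfrac{\eta}{\gamma}}\Big)\nabla w\cdot a\nabla w - \lambda\sqrt{\tfrac{\gamma}{\eta}}\,w^2\,\tilde e\cdot a\tilde e + 2\lambda w\,\tilde e\cdot a\nabla w \right) + \int_C \Big( \zeta w^2 + \lambda A\,(v\cdot\tilde e)\,w^2 \Big) \\
&= \int_C \big( \zeta w^2 - \nabla w\cdot a\nabla w \big) - \lambda\sqrt{\tfrac{\gamma}{\eta}}\int_C z\cdot az + \lambda A\int_C (v\cdot\tilde e)\,w^2,
\end{aligned} \tag{4.19}
\]
where $z = \sqrt{\eta/\gamma}\,\nabla w - w\tilde e$.

Let us now consider two cases, according to the sign of $\int_C \zeta w^2 - \int_C \nabla w\cdot a\nabla w$.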
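As a side check before the case distinction, the value of $t$ used in (4.19) can be recovered by elementary calculus. The sketch below introduces the auxiliary function $\Theta$ (notation not in the original) for the $t$-dependent part of the right-hand side of (4.18).

```latex
% \Theta collects the t-dependent terms of (4.18); \gamma and \eta are as defined in the text.
\begin{align*}
\Theta(t) &:= -\frac{\gamma}{t} - \frac{t}{1-t}\,\lambda^2\eta, \qquad t\in(0,1), \\
\Theta'(t) &= \frac{\gamma}{t^2} - \frac{\lambda^2\eta}{(1-t)^2} = 0
  \iff \frac{1-t}{t} = \lambda\sqrt{\eta/\gamma}
  \iff t = \big(1+\lambda\sqrt{\eta/\gamma}\big)^{-1}, \\
\max_{(0,1)}\Theta &= -\gamma\big(1+\lambda\sqrt{\eta/\gamma}\big) - \lambda\sqrt{\gamma/\eta}\;\eta
  = -\gamma - 2\lambda\sqrt{\gamma\eta}, \\
\sqrt{\gamma/\eta}\int_C z\cdot az &= 2\sqrt{\gamma\eta} - 2\int_C w\,\tilde e\cdot a\nabla w
  \qquad\text{for } z = \sqrt{\eta/\gamma}\,\nabla w - w\tilde e .
\end{align*}
```

Since $\Theta'' < 0$ on $(0,1)$, the critical point is indeed the maximizer, and combining the last two lines gives the second expression in (4.19).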
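Case 1. $\int_C \zeta w^2 \ge \int_C \nabla w\cdot a\nabla w > 0$. From (4.14) and (4.19), it follows that
\[
c_A^*(e) \ge \min_{\lambda>0} \frac{\mu(\lambda)}{\lambda} \ge -\sqrt{\frac{\gamma}{\eta}}\;\frac{\int_C z\cdot az}{\int_C w^2} + A\,\frac{\int_C (v\cdot\tilde e)\,w^2}{\int_C w^2}.
\]
Therefore,
\[
\liminf_{A\to+\infty} \frac{c_A^*(e)}{A} \ge \frac{\int_C (v\cdot\tilde e)\,w^2}{\int_C w^2}
\]
in case 1.

Case 2. $0 < \int_C \zeta w^2 < \int_C \nabla w\cdot a\nabla w$. It follows from (4.16) and (4.19) that, for any $\lambda > 0$,
\[
\frac{\mu(\lambda)}{\lambda} \ge \max\left( \frac{m}{\lambda},\ h(\lambda) \right),
\]
where
\[
h(\lambda) = \frac{1}{\int_C w^2}\left( \frac{1}{\lambda}\int_C \big(\zeta w^2 - \nabla w\cdot a\nabla w\big) - \sqrt{\frac{\gamma}{\eta}}\int_C z\cdot az + A\int_C (v\cdot\tilde e)\,w^2 \right).
\]
The functions $\lambda \mapsto m/\lambda$ and $h$ are respectively decreasing and increasing, and, for $A$ large enough, their graphs have exactly one intersection point, which is the minimum over all positive $\lambda$ of the function $\lambda \mapsto \max(m/\lambda, h(\lambda))$. Since $c_A^*(e) \ge \min_{\lambda>0} \mu(\lambda)/\lambda$, a straightforward calculation gives
\[
\liminf_{A\to+\infty} \frac{c_A^*(e)}{A} \ge \frac{m\int_C (v\cdot\tilde e)\,w^2}{m\int_C w^2 + \int_C \nabla w\cdot a\nabla w - \int_C \zeta w^2}
\]
in case 2. Putting cases 1 and 2 together completes the proof of the lower bound in part a) of Theorem 4.1. Let us now turn to the

Proof of the upper bound in a) and proof of part b) of Theorem 4.1. Fix a direction $e$ of $\mathbb{R}^d$, assume that $\limsup_{A\to+\infty} c_A^*(e)/A > 0$ and choose any positive number $\kappa$ such that
\[
0 < \kappa < \limsup_{A\to+\infty} \frac{c_A^*(e)}{A}. \tag{4.20}
\]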
Let $g = g(u)$ be a function satisfying (4.2), and such that $f(x,y,u) \le g(u)$ for all $(x,y,u) \in \overline{\Omega}\times[0,1]$ and $g(u) \le g'(0)u$ for all $u \in [0,1]$. Call $\gamma_A^*(e)$ the minimal speed of pulsating travelling fronts solving (4.3-4.4) with the nonlinearity $g$ instead of $f$. Let $\chi : \mathbb{R} \to \mathbb{R}$ be a smooth nondecreasing function such that $\chi(s) = 0$ for all $s \le 1$, $0 < \chi(s) < 1$ for all $s \in (1,2)$ and $\chi(s) = 1$ for all $s \ge 2$. For each $\theta \in (0,1/2)$, the function $f_\theta(x,y,s) = \chi(s/\theta)f(x,y,s)$ is of the type (4.6). It was proved in [4] that, for each $\theta \in (0,1/2)$, there exists a unique speed $c_{A,\theta}(e)$ and a unique (up to shift in time) pulsating travelling front solving (4.3-4.4) with the nonlinearity $f_\theta$. Furthermore, $c_{A,\theta}(e) \to c_A^*(e)$ as $\theta \to 0^+$. Similarly, there exists a unique speed $\gamma_{A,\theta}(e)$ and a unique (up to shift in time) pulsating travelling front solving (4.3-4.4) with the nonlinearity $g_\theta(s) = \chi(s/\theta)g(s)$; furthermore, $\gamma_{A,\theta}(e) \to \gamma_A^*(e)$ as $\theta \to 0^+$. But the results in Sections 3 and 4 in [4] yield $c_{A,\theta}(e) \le \gamma_{A,\theta}(e)$ for each $\theta \in (0,1/2)$ (since $f_\theta \le g_\theta$ and both $f_\theta$ and $g_\theta$ satisfy (4.6)). Therefore, $c_A^*(e) \le \gamma_A^*(e)$.

From (4.20), there exists then a sequence $A_n \to +\infty$ such that
\[
\gamma_{A_n}^*(e) \ge \kappa A_n \tag{4.21}
\]
for all $n \in \mathbb{N}$. On the other hand, since $g$ satisfies (4.2) and $g(s) \le g'(0)s$ for all $s \in [0,1]$, it follows from [5] (as already noticed at the beginning of the proof of Theorem 4.1) that, for all $A \in \mathbb{R}$,
\[
\gamma_A^*(e) = \min_{\lambda>0} \frac{k_A(\lambda)}{\lambda},
\]
where $k_A(\lambda)$ is the principal eigenvalue of the elliptic operator
\[
L_{A,\lambda}\psi := \operatorname{div}(a\nabla\psi) - \lambda\big[\operatorname{div}(a\tilde e\,\psi) + \tilde e\cdot a\nabla\psi\big] - A\,v\cdot\nabla\psi + \big(\lambda A\,v\cdot\tilde e + \lambda^2\,\tilde e\cdot a\tilde e + g'(0)\big)\psi
\]
acting on the set $E_\lambda$ of functions $\psi(x,y)$ which are $L$-periodic with respect to $x$ in $\overline{\Omega}$ and satisfy $\nu\cdot a(-\lambda\tilde e\,\psi + \nabla\psi) = 0$ on $\partial\Omega$. From (4.21), it follows that
\[
\forall\lambda > 0,\ \forall n \in \mathbb{N}, \quad k_{A_n}(\lambda) \ge \kappa\lambda A_n. \tag{4.22}
\]
Fix any $\varepsilon > 0$. Take $\lambda_n = (\varepsilon A_n)^{-1}$ in the above inequality and call $\psi_n \in E_{\lambda_n}$ the principal eigenfunction of
\[
L_{A_n,\lambda_n}\psi_n = \operatorname{div}(a\nabla\psi_n) - (\varepsilon A_n)^{-1}\big[\operatorname{div}(a\tilde e\,\psi_n) + \tilde e\cdot a\nabla\psi_n\big] - A_n\,v\cdot\nabla\psi_n + \big(\varepsilon^{-1}v\cdot\tilde e + (\varepsilon A_n)^{-2}\,\tilde e\cdot a\tilde e + g'(0)\big)\psi_n = k_{A_n}(\lambda_n)\psi_n \tag{4.23}
\]
such that $\|\psi_n\|_{L^2(C)} = 1$. Multiply the above equality by $\psi_n$ and integrate by parts over $C$. One obtains
\[
-\int_C \nabla\psi_n\cdot a\nabla\psi_n + \int_C \big[\varepsilon^{-1}v\cdot\tilde e + (\varepsilon A_n)^{-2}\,\tilde e\cdot a\tilde e + g'(0)\big]\psi_n^2 = k_{A_n}(\lambda_n)\int_C \psi_n^2 \ge \frac{\kappa}{\varepsilon}\int_C \psi_n^2 \tag{4.24}
\]
from (4.22) and due to the definition of $\lambda_n$. Since $\int_C \psi_n^2 = 1$ and since the matrix $a$ is uniformly elliptic, one concludes that the sequence $\|\psi_n\|_{H^1(C)}$ is bounded. It also follows that the sequence $k_{A_n}(\lambda_n)$ is bounded (from above and below by two positive constants).

From Rellich's theorem, there exist then a subsequence $n' \to +\infty$ and a function $w_\varepsilon \in H^1_{loc}(\overline{\Omega})$ such that $\psi_{n'}$ converges to $w_\varepsilon$ weakly in $H^1_{loc}$, strongly in $L^2_{loc}$, and almost everywhere in $\Omega$. The function $w_\varepsilon$ is then $L$-periodic with respect to $x$, and it is not the zero function since $\|w_\varepsilon\|_{L^2(C)} = 1$. Multiply (4.23) by $1/A_n$ and pass to the limit in the sense of distributions in $\Omega$. It follows that $v\cdot\nabla w_\varepsilon = 0$ almost everywhere in $\Omega$. That means that $w_\varepsilon$ is a (nonzero) first integral of $v$. Passing to the limit $n' \to +\infty$ in (4.24) immediately leads to
\[
\left(\frac{\kappa}{\varepsilon} - g'(0)\right)\int_C w_\varepsilon^2 \le \frac{1}{\varepsilon}\int_C (v\cdot\tilde e)\,w_\varepsilon^2.
\]
Therefore, choosing $\varepsilon$ smaller than $\kappa/g'(0)$ proves the existence of a first integral $w_\varepsilon \in I$ such that $\int_C (v\cdot\tilde e)\,w_\varepsilon^2 > 0$ and
\[
\kappa \le \frac{\int_C (v\cdot\tilde e)\,w_\varepsilon^2}{\int_C w_\varepsilon^2} + \varepsilon g'(0) \le \sup_{w\in I} \frac{\int_C (v\cdot\tilde e)\,w^2}{\int_C w^2} + \varepsilon g'(0).
\]
That already proves part b) of Theorem 4.1. Furthermore, the passages to the limit $\varepsilon \to 0^+$ and then $\kappa \to \limsup_{A\to+\infty} c_A^*(e)/A$ (from below) lead to
\[
\limsup_{A\to+\infty} \frac{c_A^*(e)}{A} \le \sup_{w\in I} \frac{\int_C (v\cdot\tilde e)\,w^2}{\int_C w^2}.
\]
That completes the proof of Theorem 4.1. Let us now turn to the
Proof of Corollary 4.3. The upper bound in (4.10) follows from Theorem 4.1 and (4.7), which holds whether or not there exists $w \in I$ such that $\int_C (v\cdot\tilde e)\,w^2 > 0$.

The lower bound in (4.10) is immediate if all first integrals $w \in I$ are such that $\int_C (v\cdot\tilde e)\,w^2 \le 0$, since $c_{A,B}^*(e) > 0$ for all $A$ and $B > 0$. In the other case, let $\varepsilon > 0$ be arbitrary and choose a first integral $w_0$ in $I$ such that $\int_C (v\cdot\tilde e)\,w_0^2 > 0$ and
\[
\sup_{w\in I} \frac{\int_C (v\cdot\tilde e)\,w^2}{\int_C w^2} - \varepsilon \le \frac{\int_C (v\cdot\tilde e)\,w_0^2}{\int_C w_0^2}.
\]
Let $\zeta(x,y) = f_u'(x,y,0)$. Since $\zeta$ is continuous and positive in $\overline{\Omega}$, there exists $B_0$ such that, for all $B \ge B_0$,
\[
\int_C B\zeta w_0^2 \ge \int_C \nabla w_0\cdot a\nabla w_0.
\]
Therefore, for all $B \ge B_0$, under the notations of Theorem 4.1, $w_0$ belongs to the set $I^1$ associated to the nonlinearity $Bf$. Hence,
\[
\forall B \ge B_0, \quad \liminf_{A\to+\infty} \frac{c_{A,B}^*(e)}{A} \ge \frac{\int_C (v\cdot\tilde e)\,w_0^2}{\int_C w_0^2} \ge \sup_{w\in I} \frac{\int_C (v\cdot\tilde e)\,w^2}{\int_C w^2} - \varepsilon.
\]
To prove (4.11) in the case where $v = (\alpha(x'),0,\dots,0)$, it is enough to prove that
\[
\sup_{w\in I} \frac{\int_C (v\cdot\tilde e)\,w^2}{\int_C w^2} = \max_{x'\in\overline{\omega}} \big(\tilde e\cdot e_1\,\alpha(x')\big).
\]
The left-hand side is clearly less than or equal to the right-hand side. On the other hand, fix any $\varepsilon > 0$ and choose an open set $U \subset \omega$ such that
\[
\forall x' \in U, \quad \tilde e\cdot e_1\,\alpha(x') \ge \max_{\overline{\omega}} (\tilde e\cdot e_1\,\alpha) - \varepsilon.
\]
Then take a smooth nonzero function $w_0 = w_0(x')$ in $H$ whose support is such that
\[
\operatorname{supp}(w_0) \subset \bigcup_{k \in L_2\mathbb{Z}\times\cdots\times L_d\mathbb{Z}} (U + k).
\]
One can immediately see that $w_0$ is a first integral and
\[
\sup_{w\in I} \frac{\int_C (v\cdot\tilde e)\,w^2}{\int_C w^2} \ge \frac{\int_C (v\cdot\tilde e)\,w_0^2}{\int_C w_0^2} \ge \max_{x'\in\overline{\omega}} \big(\tilde e\cdot e_1\,\alpha(x')\big) - \varepsilon.
\]
Since $\varepsilon > 0$ was arbitrary, one gets the desired result.
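The inequality described as clear in the proof of Corollary 4.3 (left-hand side at most the right-hand side) can be written out in one line. This is a sketch for the shear flow $v = (\alpha(x'),0,\dots,0)$ only, using that $v\cdot\tilde e = (\tilde e\cdot e_1)\,\alpha(x')$ in that case.

```latex
% Shear flow: v = \alpha(x') e_1, hence v\cdot\tilde e = (\tilde e\cdot e_1)\,\alpha(x').
\[
\int_C (v\cdot\tilde e)\,w^2
  = \int_C (\tilde e\cdot e_1)\,\alpha(x')\,w^2
  \le \max_{x'\in\overline{\omega}}\big(\tilde e\cdot e_1\,\alpha(x')\big)\int_C w^2
  \qquad\text{for all } w\in I .
\]
```

Dividing by $\int_C w^2 > 0$ and taking the supremum over $w \in I$ gives the upper bound; the test function $w_0$ concentrated near a maximum point of $\tilde e\cdot e_1\,\alpha$ gives the matching lower bound.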
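Proof of Corollary 4.5. Proof of a). Let us first observe that any solution $u(t,x,y)$ of (4.12) and (4.4) with the speed $c$ gives rise to a solution $w(t,x,y) := u(t/\varepsilon,x,y)$ of
\[
\begin{cases}
\dfrac{\partial w}{\partial t} - \operatorname{div}(a\nabla w) + \dfrac{v}{\varepsilon}\cdot\nabla w = \dfrac{1}{\varepsilon}f(x,y,w), & t \in \mathbb{R},\ (x,y) \in \Omega, \\
\nu\cdot a\nabla w = 0, & t \in \mathbb{R},\ (x,y) \in \partial\Omega,
\end{cases}
\]
satisfying (4.4) with the speed $c/\varepsilon$. Therefore, under the notations of Corollary 4.3, one has
\[
\gamma_\varepsilon^*(e) = \varepsilon\,c^*_{\varepsilon^{-1},\varepsilon^{-1}}(e), \tag{4.25}
\]
where $c^*_{\varepsilon^{-1},\varepsilon^{-1}}(e)$ is the minimal speed of pulsating travelling fronts solving (4.3-4.4) with $A = 1/\varepsilon$ and with the nonlinearity $f/\varepsilon$ instead of $f$.

If all first integrals $w \in I$ are such that $\int_C (v\cdot\tilde e)\,w^2 \le 0$, then (4.13) is immediate since $\gamma_\varepsilon^*(e)$ is positive for any $\varepsilon > 0$. Otherwise, for any $\delta > 0$ small enough, there exists a first integral $w_0 \in I$ such that
\[
\frac{\int_C (v\cdot\tilde e)\,w_0^2}{\int_C w_0^2} \ge \sup_{w\in I} \frac{\int_C (v\cdot\tilde e)\,w^2}{\int_C w^2} - \delta > 0.
\]
Take $M$ large enough so that
\[
\int_C M\zeta w_0^2 \ge \int_C \nabla w_0\cdot a\nabla w_0.
\]
Theorem 4.1 yields that
\[
\liminf_{\varepsilon\to 0^+} \frac{c^*_{\varepsilon^{-1},M}(e)}{\varepsilon^{-1}} \ge \frac{\int_C (v\cdot\tilde e)\,w_0^2}{\int_C w_0^2} \ge \sup_{w\in I} \frac{\int_C (v\cdot\tilde e)\,w^2}{\int_C w^2} - \delta. \tag{4.26}
\]
On the other hand, as in the proof of Theorem 4.1, it follows that the minimal speed $c^*$ is nondecreasing with respect to the nonlinearity $f$. Therefore, $c^*_{\varepsilon^{-1},\varepsilon^{-1}}(e) \ge c^*_{\varepsilon^{-1},M}(e)$ for $\varepsilon$ small enough. Putting that together with (4.25) and (4.26) leads to
\[
\liminf_{\varepsilon\to 0^+} \gamma_\varepsilon^*(e) \ge \sup_{w\in I} \frac{\int_C (v\cdot\tilde e)\,w^2}{\int_C w^2} - \delta.
\]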
Since $\delta > 0$ was an arbitrary small enough positive number, the inequality (4.13) follows.

Proof of b). Let us deal with the case of the propagation in the $e_1$-direction (the propagation in the $-e_1$-direction can be treated similarly). As already observed in the proof of Corollary 4.3, the following formula
\[
\sup_{w\in I} \frac{\int_C (v\cdot e_1)\,w^2}{\int_C w^2} = \max_{x'\in\overline{\omega}} \alpha(x')
\]
holds for a shear flow $v = \alpha(x')e_1$. In order to get the upper bound for $\gamma_\varepsilon^*(e_1)$, let us use a formula derived in [22]:
\[
\gamma_\varepsilon^*(e_1) = \min_{w\in E}\ \sup_{(x_1,x')\in\Omega} \left( -\frac{\varepsilon\operatorname{div}(a(x')\nabla w) + f(x',w)}{\partial_{x_1}w} + \alpha(x') \right),
\]
where $E = \{w \in C^2(\overline{\Omega})$, $w$ is periodic with respect to $(x_2,\dots,x_d)$ with the periods $L_2,\dots,L_d$, $\partial_\nu w = 0$ on $\partial\Omega$, $\partial_{x_1}w < 0$ in $\overline{\Omega}$ and $w(-\infty,x') = 1$, $w(+\infty,x') = 0$ uniformly in $x' \in \overline{\omega}\}$. Actually the above formula was proved in [22] in the case of a diffusion matrix $a = Id$, with a nonlinearity $f$ not depending on $x'$ and in an infinite cylinder with bounded section $\omega$. The generalization to our case with diffusion and reaction depending on $x'$ with bounded or unbounded section $\omega$ is immediate from the proof in [22]. Since $a_{i1} = 0$ for $i \ge 2$ and since $a_{11}$ is constant, it follows that
\[
\gamma_\varepsilon^*(e_1) \le \min_{w = w(x_1)\in E}\ \sup_{x_1\in\mathbb{R}} \left( -\frac{\varepsilon a_{11}w''(x_1) + g(w(x_1))}{w'(x_1)} \right) + \max_{x'\in\overline{\omega}} \alpha(x'),
\]
where $g$ is a given function satisfying (4.2) and such that $g(u) \ge f(x',u)$ for all $(x',u) \in \overline{\omega}\times[0,1]$. In other words, from [22],
\[
\gamma_\varepsilon^*(e_1) \le k_\varepsilon^* + \max_{x'\in\overline{\omega}} \alpha(x'),
\]
where $k_\varepsilon^*$ is the minimal speed of planar travelling fronts $u(t,x_1) = \phi(x_1 - ct)$ solving
\[
u_t = \varepsilon a_{11}u_{x_1x_1} + g(u)
\]
with $\phi(-\infty) = 1$ and $\phi(+\infty) = 0$. It is immediate to check that $k_\varepsilon^* = \sqrt{\varepsilon}\,k_1^*$. Eventually, one concludes that
\[
\limsup_{\varepsilon\to 0^+} \gamma_\varepsilon^*(e_1) \le \max_{x'\in\overline{\omega}} \alpha(x').
\]
That completes the proof of Corollary 4.5.
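The identity $k_\varepsilon^* = \sqrt{\varepsilon}\,k_1^*$ used at the end of the proof of Corollary 4.5 follows from a spatial rescaling. A minimal sketch, with the rescaled function $\tilde u$ and the variable $X$ introduced here for illustration only:

```latex
% Assumption: u solves u_t = \varepsilon a_{11} u_{x_1 x_1} + g(u); set \tilde u(t,X) := u(t,\sqrt{\varepsilon}\,X).
\begin{align*}
\tilde u_t &= u_t = \varepsilon a_{11}\,u_{x_1x_1} + g(u) = a_{11}\,\tilde u_{XX} + g(\tilde u), \\
u(t,x_1) &= \phi(x_1 - ct) \iff \tilde u(t,X) = \phi\big(\sqrt{\varepsilon}\,(X - \varepsilon^{-1/2}c\,t)\big).
\end{align*}
% Fronts of speed c for the \varepsilon-problem thus correspond to fronts of speed c/\sqrt{\varepsilon}
% for the problem with \varepsilon = 1; taking minimal speeds gives k_\varepsilon^* = \sqrt{\varepsilon}\,k_1^*.
```

In particular $k_\varepsilon^* \to 0$ as $\varepsilon \to 0^+$, which is what turns the bound $\gamma_\varepsilon^*(e_1) \le k_\varepsilon^* + \max\alpha$ into the stated upper limit.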
5. Discussion and Open Questions

The aim of this section is to set out a list of open questions and generalizations of the results of the previous sections. Theorem 0.3 gives a necessary and sufficient condition for the first eigenvalues $\lambda_A$ of problem (0.1), with a divergence-free vector field $v$, to be bounded as $A \to +\infty$. Moreover, the limit of $\lambda_A$ as $A \to +\infty$, which always exists, is either finite or equal to $+\infty$. In both cases, this limit is not smaller than any of the $\lambda_A$'s. On the other hand, (1.5) implies that, for all $A \in \mathbb{R}$, $\lambda_A \ge \lambda_0$, where $\lambda_0$ corresponds to (0.1) with $A = 0$ (in other words $\lambda_0$ is the first eigenvalue of the Laplace operator with Dirichlet boundary conditions). Furthermore, one can observe from (1.2) that $\lambda_A$ is a nondecreasing function of $|A|$ as soon as $v$ is a divergence-free gradient field, and it is increasing under the additional assumption that $v$ is not identically equal to 0. However, this monotonicity property remains open for a general divergence-free vector field $v$.

As far as the first eigenfunctions $\varphi_A$ are concerned, it followed from the proof of Theorem 0.3 (see Corollary 1.5) that, if the field $v$ has first integrals, then each sequence $(\varphi_{A_n})_{n\in\mathbb{N}}$ of normalized eigenfunctions has at least a subsequence $(\varphi_{A_{n'}})_{n'}$ which converges as $n' \to +\infty$ to a minimizer of the Rayleigh quotient among all first integrals of $v$ in $I_0$. If this minimizer is unique (up to normalization), then the whole family $(\varphi_A)$ converges to it as $A \to +\infty$. We gave in Remark 1.6 an example of a rotating field $v$ in the unit ball, for which the Rayleigh quotient has a unique minimizer (up to normalization) among all first integrals. This property could certainly be generalized to more general rotating type vector fields having at least one first integral in $I_0$ whose level sets
H. Berestycki, F. Hamel, N. Nadirashvili
are connected hypersurfaces. We also showed in Remark 1.6 that the minimizers of this Rayleigh quotient among all first integrals may not be unique. But even in the example we gave in Remark 1.6, the whole family $(\varphi_A)$ still converges as $A \to +\infty$. One can wonder whether or not this convergence property always holds. Furthermore, this question of the convergence of the (normalized) first eigenfunctions can also naturally be asked in the case where $v$ has no first integral.

Another natural question is about the other eigenvalues. For instance, for problem (0.1), we proved that the first eigenvalues are bounded if and only if the field $v$ has first integrals. Therefore, if there is no first integral, the other eigenvalues go to $+\infty$ as $A \to +\infty$. But if there are first integrals, can one say that the second eigenvalues are bounded as $A \to +\infty$? And so on for the other eigenvalues?

Let us notice here that, under some additional regularity assumptions for $v$, the results of Theorem 0.3 could also be formally derived from the following variational formula by Holland [27] for the first eigenvalue $\lambda_A$ of problem (0.1), namely
$$\lambda_A = \min_{\phi \in \Phi} \; \max_{V \in C^1(\overline{\Omega})} \; \frac{\int_\Omega |\nabla\phi|^2 + A\, v \cdot (\nabla\phi + \phi \nabla V)\phi - |\nabla V|^2 \phi^2}{\int_\Omega \phi^2}, \tag{5.1}$$
where $\Phi = \{\phi \in C^2(\Omega) \cap C(\overline{\Omega}),\ \phi > 0 \text{ in } \Omega,\ \phi^2(x)/d(x) \to 0 \text{ as } x \to \partial\Omega,\ x \in \Omega\}$, and $d(x)$ is the distance of $x$ to $\partial\Omega$. At least under some smoothness assumptions, one has
$$A \int_\Omega v \cdot (\nabla\phi + \phi \nabla V)\phi = A \int_\Omega (1 - 2V)(v \cdot \nabla\phi)\phi.$$
If $\phi \in \Phi$ is not a first integral of $v$, then there is an open set $U \subset \Omega$ such that $v \cdot \nabla\phi$ does not vanish in $U$. The choice $V = (1 - \chi^2 \phi\, v \cdot \nabla\phi)/2$, where $\chi$ is a nonzero $C_0^1$ function whose support is included in $U$, implies that, for such a $\phi$, the maximum with respect to $V$ in the right-hand side of (5.1) goes to $+\infty$ as $A \to +\infty$. This is a formal indication, but not a proof, suggesting that $\lambda_A \to +\infty$ as $A \to +\infty$ when $v$ has no first integral and that the $\lambda_A$'s are expected to converge to the right-hand side of (0.2) as $A \to +\infty$ if $v$ has first integrals.

The aforementioned questions or comments have their similar counterparts for the more general Dirichlet or Neumann/periodic problems (2.2) or (2.7). Nevertheless, the simple question of finding a necessary and sufficient condition for the boundedness of the first eigenvalues remains open for similar elliptic problems with Robin type boundary conditions.

The same question can also be asked for more general elliptic problems with non self-adjoint main part, even with Dirichlet boundary conditions. Namely, consider the following eigenvalue problem
$$-a_{ij}\partial_{ij}\varphi_A + A\, v \cdot \nabla\varphi_A + b_i \partial_i \varphi_A + c\,\varphi_A = \lambda_A \varphi_A \ \text{ in } \Omega, \qquad \varphi_A = 0 \ \text{ on } \partial\Omega,$$
where $\Omega$ is a $C^2$ bounded domain of $\mathbb{R}^N$, $v$ is a bounded vector field such that $\operatorname{div} v = 0$ in $\mathcal{D}'(\Omega)$, $b_i \in C(\overline{\Omega})$, $c = c(x) \in L^\infty(\Omega)$ and $a(x) = (a_{ij}(x))_{1 \le i,j \le N}$ is a $C^1(\overline{\Omega})$ symmetric matrix field satisfying (2.1). Under these conditions, one can easily check that if there exists a sequence $(\lambda_{A_n})_{n \in \mathbb{N}}$ which is bounded, then, after normalization in $L^2$ norm, a subsequence $(\varphi_{A_n})_n$ converges strongly in $L^2$ and weakly in $H^1$ to a first integral of $v$ in $I_0$. Conversely, is it true that if there is a first integral then the first
Elliptic Eigenvalue Problems with Large Drift
eigenvalues are bounded? Furthermore, even if a sequence $(\lambda_{A_n})_{n \in \mathbb{N}}$ is bounded, does that imply that it converges, and that the whole family $(\lambda_A)$ converges as $A \to +\infty$? The answers are not clear.

To finish this section, let us mention that one can also ask about the generalizations of Theorems 0.3, 2.1 or 3.1 for elliptic or parabolic problems with large drift in unbounded domains with Dirichlet asymptotic conditions.

References

1. Aronson, D.G., Weinberger, H.F.: Multidimensional nonlinear diffusions arising in population genetics. Adv. Math. 30, 33–76 (1978)
2. Audoly, B., Berestycki, H., Pomeau, Y.: Réaction-diffusion en écoulement stationnaire rapide. C. R. Acad. Sci. Paris 328 II, 255–262 (2000)
3. Berestycki, H.: The influence of advection on the propagation of fronts in reaction-diffusion equations. In: Nonlinear PDE's in Condensed Matter and Reactive Flows. H. Berestycki, Y. Pomeau (eds), Dordrecht: Kluwer Academic Publ., 2002
4. Berestycki, H., Hamel, F.: Front propagation in periodic excitable media. Commun. Pure Appl. Math. 55, 949–1032 (2002)
5. Berestycki, H., Hamel, F., Nadirashvili, N.: The speed of propagation for KPP type problems. I - Periodic framework. J. Europ. Math. Soc., in press
6. Berestycki, H., Larrouturou, B., Lions, P.-L.: Multidimensional traveling-wave solutions of a flame propagation model. Arch. Rat. Mech. Anal. 111, 33–49 (1990)
7. Berestycki, H., Nirenberg, L.: Travelling fronts in cylinders. Ann. Inst. H. Poincaré, Analyse Non Linéaire 9, 497–572 (1992)
8. Berestycki, H., Nirenberg, L., Varadhan, S.R.S.: The principal eigenvalue and maximum principle for second order elliptic operators in general domains. Commun. Pure Appl. Math. 47, 47–92 (1994)
9. Capdeboscq, Y.: Homogenization of a neutronic critical diffusion problem with drift. Proc. Royal Soc. Edinburgh 132 A, 567–594 (2002)
10. Conca, C., Vanninathan, M.: Homogenization of periodic structures via Bloch decomposition. SIAM J. Appl. Math. 57, 1639–1659 (1997)
11. Constantin, P., Kiselev, A., Oberman, A., Ryzhik, L.: Bulk burning rate in passive-reactive diffusion. Arch. Ration. Mech. Anal. 154, 53–91 (2000)
12. Constantin, P., Kiselev, A., Ryzhik, L.: Quenching of flames by fluid advection. Commun. Pure Appl. Math. 54, 1320–1342 (2001)
13. Devinatz, A., Ellis, R., Friedman, A.: The asymptotic behaviour of the first real eigenvalue of the second-order elliptic operator with a small parameter in the higher derivatives, II. Indiana Univ. Math. J. 991–1011 (1973/74)
14. Fannjiang, A., Papanicolaou, G.: Convection enhanced diffusion for periodic flows. SIAM J. Appl. Math. 54, 333–408 (1994)
15. Fife, P.C.: Mathematical aspects of reacting and diffusing systems. Lecture Notes in Biomathematics 28, Berlin-Heidelberg-New York: Springer Verlag, 1979
16. Fisher, R.A.: The advance of advantageous genes. Ann. Eugenics 7, 335–369 (1937)
17. Fleming, W.H., Sheu, S.-J.: Asymptotics for the principal eigenvalue and eigenfunction of a nearly first-order operator with large potential. Ann. Probab. 25, 1953–1994 (1997)
18. Freidlin, M.: Functional integration and partial differential equations. Ann. of Math. Studies, Princeton, NJ: Princeton University Press, 1985
19. Freidlin, M.: Wave front propagation for KPP type equations. Surveys in Appl. Math. 2, New York: Plenum, 1995, pp. 1–62
20. Friedman, A.: The asymptotic behaviour of the first real eigenvalue of a second order elliptic operator with a small parameter in the highest derivatives. Indiana Univ. Math. J. 22, 1005–1015 (1973)
21. Gärtner, J., Freidlin, M.: On the propagation of concentration waves in periodic and random media. Sov. Math. Dokl. 20, 1282–1286 (1979)
22. Hamel, F.: Formules min-max pour les vitesses d'ondes progressives multidimensionnelles. Ann. Fac. Sci. Toulouse 8, 259–280 (1999)
23. Heinze, S.: Homogenization of flame fronts. Preprint IWR, Heidelberg, 1993
24. Heinze, S.: Wave solutions for reaction-diffusion systems in perforated domains. Z. Anal. Anwendungen 20, 661–670 (2001)
25. Heinze, S.: The speed of travelling waves for convective reaction-diffusion equations. Preprint MPI, Leipzig, 2001
26. Heinze, S., Papanicolaou, G., Stevens, A.: Variational principles for propagation speeds in inhomogeneous media. SIAM J. Appl. Math. 62, 129–148 (2001)
27. Holland, C.J.: A minimum principle for the principal eigenvalue for second-order linear elliptic equations with natural boundary conditions. Commun. Pure Appl. Math. 31, 509–519 (1978)
28. Hudson, W., Zinner, B.: Existence of travelling waves for reaction-diffusion equations of Fisher type in periodic media. In: Boundary Value Problems for Functional-Differential Equations. J. Henderson (ed.), World Scientific, 1995, pp. 187–199
29. Kanel', Ya.I.: Certain problems of burning-theory equations. Sov. Math. Dokl. 2, 48–51 (1961)
30. Kiselev, A., Ryzhik, L.: Enhancement of the traveling front speeds in reaction-diffusion equations with advection. Ann. Inst. H. Poincaré, Analyse Non Linéaire 18, 309–358 (2001)
31. Kolmogorov, A.N., Petrovsky, I.G., Piskunov, N.S.: Étude de l'équation de la diffusion avec croissance de la quantité de matière et son application à un problème biologique. Bulletin Université d'État à Moscou (Bjul. Moskowskogo Gos. Univ.), Série internationale A 1, 1–26 (1937)
32. Majda, A.J., Souganidis, P.E.: Large scale front dynamics for turbulent reaction-diffusion equations with separated velocity scales. Nonlinearity 7, 1–30 (1994)
33. Shigesada, N., Kawasaki, K., Teramoto, E.: Spatial segregation of interacting species. J. Theoret. Biol. 79, 83–99 (1979)
34. Volpert, A.I., Volpert, V.A., Volpert, V.A.: Traveling wave solutions of parabolic systems. Translations of Math. Monographs 140, Providence, RI: Am. Math. Soc. 1994
35. Xin, X.: Existence of planar flame fronts in convective-diffusive periodic media. Arch. Ration. Mech. Anal. 121, 205–233 (1992)
36. Xin, J.X.: Analysis and modeling of front propagation in heterogeneous media. SIAM Review 42, 161–230 (2000)
37. Wentzell, A.D.: On the asymptotic behaviour of the first eigenvalue of a second-order differential operator with small parameter in higher derivatives. Theory Probab. Appl. 20, 599–602 (1975)
38. Wentzell, A.D., Freidlin, M.: Random perturbations of dynamical systems. Grund. Math. Wissenschaften, Second ed., New York: Springer, 1998

Communicated by P. Constantin
Commun. Math. Phys. 253, 481–509 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1158-8
Communications in
Mathematical Physics
Weyl Asymptotic Formula for the Laplacian on Domains with Rough Boundaries
Yu. Netrusov1, Yu. Safarov2
1 School of Mathematics, University of Bristol, Bristol BS8 1TW, UK. E-mail: [email protected]
2 Department of Mathematics, King's College, London WC2R 2LS, UK. E-mail: [email protected]
Received: 12 December 2003 / Accepted: 18 February 2004
Published online: 3 September 2004 – © Springer-Verlag 2004
Abstract: We study asymptotic distribution of eigenvalues of the Laplacian on a bounded domain in $\mathbb{R}^n$. Our main results include an explicit remainder estimate in the Weyl formula for the Dirichlet Laplacian on an arbitrary bounded domain, sufficient conditions for the validity of the Weyl formula for the Neumann Laplacian on a domain with continuous boundary in terms of smoothness of the boundary and a remainder estimate in this formula. In particular, we show that the Weyl formula holds true for the Neumann Laplacian on a $\mathrm{Lip}_\alpha$-domain whenever $(d-1)/\alpha < d$, prove that the remainder in this formula is $O(\lambda^{(d-1)/\alpha})$ and give an example where the remainder estimate $O(\lambda^{(d-1)/\alpha})$ is order sharp. We use a new version of the variational technique which does not require the extension theorem.

Introduction

Let $-\Delta_N$ be the Neumann Laplacian on a bounded domain $\Omega \subset \mathbb{R}^d$ and $N_N(\Omega, \lambda)$ be the number of its eigenvalues which are strictly smaller than $\lambda^2$; if the number of these eigenvalues is infinite or $-\Delta_N$ has essential spectrum below $\lambda$ then we define $N_N(\Omega, \lambda) := +\infty$. Similarly, let $-\Delta_D$ be the Dirichlet Laplacian on $\Omega$ and $N_D(\Omega, \lambda)$ be the number of its eigenvalues lying below $\lambda^2$. We shall omit the lower index D or N and simply write $\Delta$ or $N(\Omega, \lambda)$ if the corresponding statement refers both to the Dirichlet and Neumann Laplacian. According to the Weyl formula,
$$N(\Omega, \lambda) - C_{d,W}\, \mu_d(\Omega)\, \lambda^d = o(\lambda^d), \qquad \lambda \to +\infty, \tag{0.1}$$
where $\mu_d(\Omega)$ is the d-dimensional Lebesgue measure of $\Omega$ and $C_{d,W}$ is the Weyl constant (see Subsect. 1.1). If $N = N_D$ then the Weyl formula holds for all bounded
Research supported by EPSRC grant GR/A00249/01.
domains [BS]. If, in addition, the upper Minkowski dimension of the boundary is equal to $d_1 \in (d-1, d)$, then
$$N(\Omega, \lambda) - C_{d,W}\, \mu_d(\Omega)\, \lambda^d = O(\lambda^{d_1}), \qquad \lambda \to +\infty. \tag{0.2}$$
The asymptotic formula (0.2) with $N = N_D$ is well known and is proved in many papers, for instance, in [BLi] and [Sa] where the authors obtained estimates with explicit constants. This formula remains valid for the Neumann Laplacian whenever $\Omega$ has the extension property (see below). Note that $d_1$ may well coincide with $d$, in which case (0.2) is useless.

If $N = N_N$ then (0.1) is true only for domains with sufficiently regular boundaries. In the general case $N_N$ does not satisfy (0.1); moreover, the Neumann Laplacian on a bounded domain may have a nonempty essential spectrum (see, for example, [HSS or Si]). The necessary and sufficient conditions for the absence of the essential spectrum in terms of capacities have been obtained in [M1]. In [BD] the authors proved that $N_N(\Omega, \lambda)$ is polynomially bounded whenever the Sobolev space $W^{1,2}(\Omega)$ is embedded in $L^q(\Omega)$ for some $q > 2$. If the log-Sobolev inequality holds on $\Omega$ then $N_N(\Omega, \lambda)$ is exponentially bounded [Ma].

For domains with sufficiently smooth boundaries, (0.1) is true for both functions $N_D$ and $N_N$ and the remainder (i.e., the right-hand side) is $O(\lambda^{d-1})$ [Iv1, Se]. The proof is based on the study of propagation of singularities for the corresponding evolution equation (see [Iv3 or SV]). If $\Omega$ has a rough boundary then the propagation of singularities near $\partial\Omega$ cannot be effectively described and one has to invoke the variational technique.

Let $\Omega^b_\delta$ and $\Omega^e_\delta$ be the internal and external $\delta$-neighbourhoods of $\partial\Omega$ respectively. The classical variational proof of the Weyl formula involves covering the domain by a finite collection of disjoint cubes $\{Q_j\}_{j \in J}$ and using the Dirichlet–Neumann bracketing. It is convenient to assume that $\{Q_j\}_{j \in J}$ is the subset of the family of Whitney cubes covering $\Omega \cup \Omega^e_\delta$ (see Theorem 3.3), which consists of the cubes $Q_j$ such that $Q_j \cap \Omega \ne \emptyset$. In view of the Rayleigh–Ritz variational formula, we have the estimates
$$\sum_{j \in J_0} N_D(Q_j, \lambda) \le N_D(\Omega, \lambda) \le \sum_{j \in J} N_N(Q_j, \lambda),$$
where $\{Q_j\}_{j \in J_0}$ is the set of cubes $Q_j$ lying inside $\Omega$. If $\mu_d(\partial\Omega) = 0$ then, estimating $N_D(Q_j, \lambda)$ and $N_N(Q_j, \lambda)$ for each $j$ and taking $\delta = \lambda^{-1}$, we obtain (0.1) and (0.2) for the Dirichlet Laplacian. It is possible to get rid of the condition $\mu_d(\partial\Omega) = 0$ but this requires additional arguments.

Similarly, the Rayleigh–Ritz formula implies that
$$\sum_{j \in J_0} N_D(Q_j, \lambda) \le N_N(\Omega, \lambda) \le \sum_{j \in J_{m\delta}} N_N(Q_j, \lambda) + N_N\Bigl(\Omega \cap \bigcup_{j \in J \setminus J_{m\delta}} Q_j,\ \lambda\Bigr),$$
where $\{Q_j\}_{j \in J_{m\delta}}$ is the set of cubes $Q_j$ lying inside $\Omega \setminus \Omega^b_{m\delta}$. If for some $m \in \mathbb{N}$ and all sufficiently small positive $\delta$ there exist uniformly bounded extension operators from the Sobolev space $W^{1,2}(\Omega^b_{m\delta})$ to $W^{1,2}(\Omega^b_{m\delta} \cup \Omega^e_\delta)$ then
$$N_N\Bigl(\Omega \cap \bigcup_{j \in J \setminus J_{m\delta}} Q_j,\ \lambda\Bigr) \le N_N\Bigl(\bigcup_{j \in J \setminus J_{m\delta}} Q_j,\ C\lambda\Bigr) = \sum_{j \in J \setminus J_{m\delta}} N_N(Q_j, C\lambda),$$
where $C$ is a sufficiently large constant. If, in addition, $\mu_d(\partial\Omega) = 0$ then, estimating the counting functions on the cubes and taking $\delta = \lambda^{-1}$, we obtain (0.1) and (0.2) for $N_N(\Omega, \lambda)$.

However, the known extension theorems require certain regularity conditions on the boundary (for instance, it is sufficient to assume that $\partial\Omega$ belongs to the Lipschitz class or satisfies the cone condition). Domains with very irregular boundaries do not have the $W^{1,2}$-extension property, in which case the above scheme does not work for the Neumann Laplacian. To the best of our knowledge, in all papers devoted to the Weyl
formula for $N_N(\Omega, \lambda)$ the authors either implicitly assumed that the domain has the $W^{1,2}$-extension property or directly applied a suitable extension theorem.

The main aim of this paper is to introduce a different technique which does not use an extension theorem. Instead of disjoint cubes, we cover the domain $\Omega$ by a family of relatively simple sets $S_m \subset \Omega$. For each of these sets the counting function $N(S_m, \lambda)$ can be effectively estimated from below and above. The sets $S_m$ may overlap but, under certain conditions on $\Omega$, the multiplicity of their intersection does not exceed a constant depending only on the dimension $d$. This allows us to apply the Dirichlet–Neumann bracketing and obtain the Weyl asymptotic formula with a remainder estimate for the Neumann Laplacian on domains without the extension property (Theorem 1.3).

The remainder term in this formula may well be of higher order than the first term. Then our asymptotic formula turns into an estimate for $N_N(\Omega, \lambda)$. In particular, this may happen if $\Omega \in \mathrm{Lip}_\alpha$, that is, if $\partial\Omega$ coincides with the subgraph of a $\mathrm{Lip}_\alpha$-function in a neighbourhood of each boundary point. We prove that $N_N(\Omega, \lambda) - C_{d,W}\, \mu_d(\Omega)\, \lambda^d = O(\lambda^{(d-1)/\alpha})$ whenever $\Omega \in \mathrm{Lip}_\alpha$ and $\alpha \in (0, 1)$ (Corollary 1.6) and that this estimate is order sharp (Theorem 1.10). If $(d-1)/\alpha < d$ then the right-hand side is $o(\lambda^d)$ and we have (0.1), otherwise $N_N(\Omega, \lambda) = O(\lambda^{(d-1)/\alpha})$.

We also obtain a remainder estimate in (0.1) for the Dirichlet Laplacian (Theorem 1.8). This estimate holds true for all bounded domains and immediately implies (0.2).

For domains with smooth boundaries our variational method only gives the remainder estimate $O(\lambda^{d-1} \log \lambda)$; in order to obtain $O(\lambda^{d-1})$ one has to use more sophisticated results (see above). On the other hand, it can be applied to many other problems and combined with the technique developed in [BI, Iv3, Iv4, Me, Mi, SV or Z] (see Sect. 5).

1. Definitions and Main Results

1.1. Basic definitions and notation.
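Throughout the paper we assume that $\Omega$ is a bounded open connected subset (domain) of the d-dimensional Euclidean space $\mathbb{R}^d$ and that $d \ge 2$. We shall be using the following notation.
• $\omega_d$ is the volume of the unit ball in $\mathbb{R}^d$ and $C_{d,W} := (2\pi)^{-d} \omega_d$ is the standard Weyl constant.
• If $x = (x_1, \dots, x_d) \in \mathbb{R}^d$ then $x' := (x_1, \dots, x_{d-1})$ so that $x = (x', x_d)$.
• $\overline{\Omega}$ and $\partial\Omega$ are the closure and the boundary of $\Omega$.
• $\mu_d(\Omega)$ denotes the d-dimensional volume of $\Omega$ and $D_\Omega := \operatorname{diam} \Omega$.
• $\operatorname{dist}(\Omega_1, \Omega_2) := \inf_{x \in \Omega_1,\, y \in \Omega_2} |x - y|$ is the standard Euclidean distance.
• $\Omega^b_\varepsilon := \{x \in \Omega \mid \operatorname{dist}(x, \partial\Omega) \le \varepsilon\}$.
• $C$ is the space of continuous functions.
• If $\Omega'$ is a (d − 1)-dimensional domain, $f \in C(\overline{\Omega'})$, $b \in \mathbb{R}$ and $\alpha \in (0, 1]$ then
$$\Gamma_f := \{x \in \mathbb{R}^d \mid x_d = f(x'),\ x' \in \Omega'\}, \qquad G_f := \{x \in \mathbb{R}^d \mid x_d < f(x'),\ x' \in \Omega'\}, \qquad G_{f,b} := \{x \in G_f \mid x_d > b\},$$
$$\operatorname{Osc}(f, \Omega') := \frac{1}{2} \Bigl( \sup_{x \in \Omega'} f(x) - \inf_{x \in \Omega'} f(x) \Bigr) \quad \text{and} \quad |f|_\alpha := \sup_{x, y \in \Omega'} \frac{|f(x) - f(y)|}{|x - y|^\alpha}.$$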
• $Q_a^{(n)}$ is the open n-dimensional cube with edges of length $a$ parallel to the coordinate axes. If the size or the dimension of the cube $Q_a^{(n)}$ is not important for our purposes or evident from the context then we shall omit the corresponding index $a$ or $n$. However, we shall always be assuming that the cube is open and that its edges are parallel to the coordinate axes.
• $\mathrm{Lip}_\alpha$ is the space of functions $f$ on a cube $Q$ such that $|f|_\alpha < \infty$ and $\mathrm{lip}_\alpha$ is the closure of $\mathrm{Lip}_1$ in $\mathrm{Lip}_\alpha$ with respect to the seminorm $|\cdot|_\alpha$.

Definition 1.1. Given a bounded function $f$ on the cube $Q^{(n)}$ and $\delta > 0$, we shall denote by $V_\delta(f, Q^{(n)})$ the maximal number of disjoint cubes $Q^{(n)}(i) \subset Q^{(n)}$ such that $\operatorname{Osc}(f, Q^{(n)}(i)) \ge \delta$ for each $i$. If $\operatorname{Osc}(f, Q^{(n)}) < \delta$ then we define $V_\delta(f, Q^{(n)}) := 1$.

Definition 1.2. If $\tau$ is a positive nondecreasing function on $(0, +\infty)$, let $BV_{\tau,\infty}(Q)$ be the space spanned by all continuous functions $f$ on $\overline{Q}$ such that $V_{1/t}(f, Q) \le \tau(t)$ for all $t > 0$.

We shall briefly discuss the relation between $BV_{\tau,\infty}(Q)$ and known function spaces in Subsect. 5.3.

Let $X$ be a space of continuous real-valued functions defined on a cube $\overline{Q^{(d-1)}}$. We shall say that $\Omega$ belongs to the class $X$ and write $\Omega \in X$ if for each $z \in \partial\Omega$ there exists a neighbourhood $O_z$ of the point $z$, a linear orthogonal map $U : \mathbb{R}^d \to \mathbb{R}^d$, a cube $Q_a^{(d-1)} \subset Q^{(d-1)}$, a function $f \in X$ and $b \in \mathbb{R}$ such that $U(O_z \cap \Omega) = \{x \in G_{f,b} \mid x' \in Q_a^{(d-1)}\}$.

Since $\partial\Omega$ is compact, for every bounded set $\Omega \in BV_{\tau,\infty}$ there exists a finite collection of domains $\Omega_l \subset \Omega$, $l \in L$, such that
(a) $\partial\Omega \subset \bigcup_{l \in L} \overline{\Omega_l}$;
(b) for each $l$ we have $U_l(\Omega_l) = G_{f_l, b_l}$, where $U_l : \mathbb{R}^d \to \mathbb{R}^d$ is a linear orthogonal map, $f_l \in BV_{\tau,\infty}(Q_{a_l}^{(d-1)})$ and $b_l < \inf f_l$;
(c) $a_l \le D_\Omega$ and $\sup f_l - b_l \le D_\Omega$ for all $l \in L$.
Let us fix such a collection $\{\Omega_l\}_{l \in L}$ and denote $n_\Omega := \#L$ and
$$C_{\Omega,\tau} := \sup_{l \in L}\, \sup_{t > 0}\, V_{1/t}(f_l, Q_{a_l}^{(d-1)}) / \tau(t).$$
Let $\delta_\Omega$ be the largest positive number such that $\Omega^b_{\delta_\Omega} \subset \bigcup_{l \in L} \Omega_l$, $\delta_\Omega \le \sqrt{d}\, a_l$ and $2\delta_\Omega \le \inf f_l - b_l$ for all $l \in L$.
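The quantity $V_\delta$ of Definition 1.1 can be made concrete in the simplest case $n = 1$, where cubes are intervals. The following is only a minimal sketch under stated assumptions: the function is replaced by finitely many samples on a grid, candidate intervals have grid points as endpoints, and the names `oscillation` and `count_oscillating_intervals` are hypothetical (they do not come from the paper). It uses a left-to-right greedy scan, closing an interval as soon as its oscillation (half of sup minus inf, as in Subsect. 1.1) reaches $\delta$; on a fixed grid this earliest-endpoint strategy gives the maximal number of disjoint intervals.

```python
def oscillation(values):
    """Osc = (sup - inf) / 2, following the convention of Subsect. 1.1."""
    return 0.5 * (max(values) - min(values))


def count_oscillating_intervals(samples, delta):
    """Grid approximation of V_delta(f, Q) for a function of one variable.

    `samples` are values of f at consecutive grid points of the interval Q.
    Returns the maximal number of disjoint grid intervals on which the
    oscillation is at least `delta`, or 1 if no such interval exists
    (the convention of Definition 1.1).
    """
    count = 0
    start = 0
    last = len(samples) - 1
    while start < last:
        lo = hi = samples[start]
        end = start
        found = False
        # Extend the current interval to the right until Osc >= delta.
        while end < last:
            end += 1
            lo = min(lo, samples[end])
            hi = max(hi, samples[end])
            if 0.5 * (hi - lo) >= delta:
                found = True
                break
        if not found:
            break
        count += 1
        # Open intervals may share an endpoint, so the next one starts here.
        start = end
    return max(count, 1)
```

For a zigzag sampled as `[0, 1, 0, 1, 0]` every grid step already has oscillation 1/2, so the scan closes one interval per step; raising `delta` above the global oscillation falls back to the value 1 prescribed by the definition.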
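1.2. Main results. Throughout the paper we shall denote by $C_d$ various constants depending only on the dimension $d$. Constants appearing in the most important estimates are numbered by an additional lower index; in our opinion, this makes our proofs more transparent. Their precise (but not necessarily best possible) values are given in Sect. 6.

Theorem 1.3. If $\Omega \in BV_{\tau,\infty}$ and $\lambda \ge \delta_\Omega^{-1}$ then
$$|N_N(\Omega, \lambda) - C_{d,W}\, \mu_d(\Omega)\, \lambda^d| \le C_{d,9}\, C_{\Omega,\tau}\, n_\Omega^{1/2}\, \lambda \int_{(2D_\Omega)^{-1}}^{C_\Omega \lambda} t^{-2} \tau(t)\, dt + C_{d,10}\, n_\Omega\, \lambda^{d-1} \int_0^{C_\Omega \lambda} \mu_d(\Omega^b_{t^{-1}})\, dt, \tag{1.1}$$
where $C_\Omega := 4\, C_{d,8}\, n_\Omega^{1/2}$. If, in addition, $\Omega \subset \mathbb{R}^2$ then there exists a positive constant $c$ independent of $\Omega$ such that
$$|N_N(\Omega, \lambda) - (4\pi)^{-1} \mu_2(\Omega)\, \lambda^2| \le c\, C_{\Omega,\tau}\, \tau(c\, n_\Omega^{1/2} \lambda) + c\, n_\Omega\, \lambda \Bigl( D_\Omega + \int_0^{c\, n_\Omega^{1/2} \lambda} \mu_2(\Omega^b_{t^{-1}})\, dt \Bigr), \qquad \forall \lambda \ge \delta_\Omega^{-1}. \tag{1.2}$$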
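Remark 1.4. For each continuous function $f$ on a closed cube there exists a positive nondecreasing function $\tau$ such that $f \in BV_{\tau,\infty}$. Therefore Theorem 1.3 allows one to obtain an estimate of the form (1.1) for every domain $\Omega \in C$. In particular, this implies the following well known result: if $\Omega \in C$ then the essential spectrum of the Neumann Laplacian on $\Omega$ is empty.

The next two corollaries are simple consequences of Theorem 1.3.

Corollary 1.5. If $\Omega \in BV_{\tau,\infty}$ then there exists a constant $C_\Omega$ such that
$$|N_N(\Omega, \lambda) - C_{d,W}\, \mu_d(\Omega)\, \lambda^d| \le C_\Omega\, \lambda^{d-1} \int_{C_\Omega^{-1}}^{C_\Omega \lambda} \bigl( t^{-1} + t^{-d} \tau(t) \bigr)\, dt, \qquad \forall \lambda \ge C_\Omega. \tag{1.3}$$

Corollary 1.6. If $\alpha \in (0, 1)$ and $\Omega \in \mathrm{Lip}_\alpha$ then
$$N_N(\Omega, \lambda) = C_{d,W}\, \mu_d(\Omega)\, \lambda^d + O\bigl(\lambda^{(d-1)/\alpha}\bigr), \qquad \lambda \to +\infty. \tag{1.4}$$
If $\alpha \in (0, 1)$ and $\Omega \in \mathrm{lip}_\alpha$ then
$$N_N(\Omega, \lambda) = C_{d,W}\, \mu_d(\Omega)\, \lambda^d + o\bigl(\lambda^{(d-1)/\alpha}\bigr), \qquad \lambda \to +\infty. \tag{1.5}$$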
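The exponent $(d-1)/\alpha$ in (1.4) can be motivated by a short computation. What follows is a heuristic sketch rather than the paper's proof: it assumes the bound of Corollary 1.5 in the form $C_\Omega \lambda^{d-1} \int (t^{-1} + t^{-d}\tau(t))\,dt$ over $[C_\Omega^{-1}, C_\Omega\lambda]$, and the symbols $a_0$ (side length of the cube on which $f$ is defined), $c$, $C$ and $C'$ are introduced here only for illustration.

```latex
% Step 1: a Lip_alpha function oscillates little on small cubes.
% On a (d-1)-dimensional cube Q_a of side a (diameter \sqrt{d-1}\,a):
\operatorname{Osc}(f, Q_a) \le \tfrac12\, |f|_\alpha \bigl(\sqrt{d-1}\; a\bigr)^{\alpha}.

% Step 2: hence Osc(f, Q_a) >= 1/t forces a >= c\, t^{-1/\alpha}, and at most
% (a_0 / (c\, t^{-1/\alpha}))^{d-1} disjoint cubes of that size fit into Q_{a_0}.
% Together with the convention V >= 1 this suggests the choice
V_{1/t}(f, Q_{a_0}) \le \tau(t) := C \bigl( 1 + t^{(d-1)/\alpha} \bigr).

% Step 3: insert tau into the assumed form of (1.3). Since
% (d-1)/\alpha - d + 1 = (d-1)(1/\alpha - 1) > 0 for \alpha \in (0,1),
\lambda^{d-1} \int_{C_\Omega^{-1}}^{C_\Omega \lambda} t^{-d}\, \tau(t)\, dt
  \le C' \lambda^{d-1} \bigl( 1 + \lambda^{(d-1)/\alpha - d + 1} \bigr)
  = O\bigl( \lambda^{(d-1)/\alpha} \bigr),

% Step 4: the remaining term is of lower order,
\lambda^{d-1} \int_{C_\Omega^{-1}}^{C_\Omega \lambda} t^{-1}\, dt
  = O\bigl( \lambda^{d-1} \log \lambda \bigr)
  = o\bigl( \lambda^{(d-1)/\alpha} \bigr).
```

Both contributions are therefore $O(\lambda^{(d-1)/\alpha})$, in agreement with (1.4).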
Remark 1.7. If $\alpha \le 1 - d^{-1}$ then the asymptotic formula (1.4) turns into the estimate $N_N(\Omega, \lambda) = O\bigl(\lambda^{(d-1)/\alpha}\bigr)$. Similarly, if $\alpha < 1 - d^{-1}$ then (1.5) takes the form $N_N(\Omega, \lambda) = o\bigl(\lambda^{(d-1)/\alpha}\bigr)$.

The following estimates for the Dirichlet Laplacian are much simpler. The inequality (1.6) seems to be new but results of this type are known to experts. Corollary 1.9 is an immediate consequence of Theorem 1.8; (1.7) also follows from (0.2).

Theorem 1.8. For all $\lambda > 0$ we have
$$|N_D(\Omega, \lambda) - C_{d,W}\, \mu_d(\Omega)\, \lambda^d| \le C_{d,11}\, \lambda^{d-1} \int_0^{\lambda} \mu_d(\Omega^b_{t^{-1}})\, dt. \tag{1.6}$$

Corollary 1.9. If $\alpha \in (0, 1)$ and $\Omega \in \mathrm{Lip}_\alpha$ then
$$N_D(\Omega, \lambda) = C_{d,W}\, \mu_d(\Omega)\, \lambda^d + O\bigl(\lambda^{d-\alpha}\bigr), \qquad \lambda \to +\infty. \tag{1.7}$$
If $\alpha \in (0, 1)$ and $\Omega \in \mathrm{lip}_\alpha$ then
$$N_D(\Omega, \lambda) = C_{d,W}\, \mu_d(\Omega)\, \lambda^d + o\bigl(\lambda^{d-\alpha}\bigr), \qquad \lambda \to +\infty. \tag{1.8}$$

Note that $(d-1)/\alpha > d - \alpha$ whenever $\alpha \in (0, 1)$. Therefore the remainder estimate in Corollary 1.9 is better than that in Corollary 1.6. The following theorem shows that the asymptotic formulae (1.4) and (1.5) are order sharp.
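Theorem 1.10. Let $\alpha \in (0, 1)$. Then
(1) there exist a bounded domain $\Omega \in \mathrm{Lip}_\alpha$ and a positive constant $C_\Omega$ such that
$$N_N(\Omega, \lambda) \ge C_{d,W}\, \mu_d(\Omega)\, \lambda^d + C_\Omega^{-1}\, \lambda^{(d-1)/\alpha} \quad \text{for all } \lambda > C_\Omega;$$
(2) for each nonnegative function $\phi$ on $(0, +\infty)$ vanishing at $+\infty$ there exist a bounded domain $\Omega \in \mathrm{lip}_\alpha$ and a positive constant $C_{\phi,\Omega}$ such that
$$N_N(\Omega, \lambda) \ge C_{d,W}\, \mu_d(\Omega)\, \lambda^d + C_{\phi,\Omega}^{-1}\, \phi(\lambda)\, \lambda^{(d-1)/\alpha} \quad \text{for all } \lambda > C_{\phi,\Omega}.$$

Remark 1.11. In [BD] the authors proved that
$$0 < K_{\Omega,N}(t, x, y) \le C_\Omega\, t^{-(\alpha + d - 1)/(2\alpha)}, \qquad \forall x, y \in \Omega, \quad \forall t \in (0, 1], \tag{1.9}$$
whenever $\Omega \in \mathrm{Lip}_\alpha$ and $\alpha \in (0, 1)$, where $K_{\Omega,N}$ is the heat kernel of the Neumann Laplacian on $\Omega$ and $C_\Omega$ is a constant depending on $\Omega$. The estimate (1.9) is order sharp as $t \to 0$ (see [BD], Example 6). Corollary 1.6 implies that there exists a constant $C'_\Omega$ such that
$$\int_\Omega K_{\Omega,N}(t, x, x)\, dx \le C'_\Omega \bigl( t^{-d/2} + t^{-(d-1)/(2\alpha)} \bigr), \qquad \forall t \in (0, 1].$$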
In view of Theorem 1.10, this estimate is also order sharp. Since $d/2 < (\alpha + d - 1)/(2\alpha)$ and $(d-1)/(2\alpha) < (\alpha + d - 1)/(2\alpha)$, we see that integration of the heat kernel $K_{\Omega,N}(t, x, x)$ improves its asymptotic properties.

1.3. Further definitions and notation. In the rest of the paper
• $\#T$ denotes the number of elements of the set $T$.
• If $\{T(i)\}_{i \in I}$ is a finite family of sets $T(i)$ and $T := \bigcup_{i \in I} T(i)$ then
$$\aleph\{T(i)\} := \sup_{x \in T} \bigl( \#\{i \in I \mid x \in T(i)\} \bigr),$$
in other words, $\aleph\{T(i)\}$ is the multiplicity of the covering $\{T(i)\}_{i \in I}$.
• If $s \in \mathbb{R}_+$ then $[s]$ is the entire part of $s$.
• $\operatorname{supp} f$ and $\nabla f$ denote the support and gradient of the function $f$.

The paper is organised as follows. In the next section we recall some well known results from spectral theory and estimate the counting function on 'model' domains. In Sect. 3 we discuss partitions of the domain $\Omega$. In Sect. 4 we deduce the main theorems from the results of Sects. 2 and 3. In the last section we extend our results to a wider class of domains and higher order operators and discuss other possible generalizations.

2. Variational Formulae and Related Results

Recall that the Sobolev space $W^{1,2}(\Omega)$ is the space of functions $u \in L^2(\Omega)$ such that $\nabla u \in L^2(\Omega)$, endowed with the norm $\|u\|_{W^{1,2}(\Omega)} = \bigl( \|\nabla u\|^2_{L^2(\Omega)} + \|u\|^2_{L^2(\Omega)} \bigr)^{1/2}$. If $\Upsilon$ is a subset of $\partial\Omega$, let $W^{1,2}_{0,\Upsilon}(\Omega)$ be the closure in $W^{1,2}(\Omega)$ of the set
$$\{f \in W^{1,2}(\Omega) \mid \operatorname{supp} f \cap \Upsilon = \emptyset\}$$
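and $W^{1,2}_0(\Omega) := W^{1,2}_{0,\partial\Omega}(\Omega)$. Obviously, $W^{1,2}_{0,\emptyset}(\Omega) = W^{1,2}(\Omega)$. Let
$$N_{N,D}(\Omega, \Upsilon, \lambda) := \sup(\dim E_\lambda), \tag{2.1}$$
where the supremum is taken over all subspaces $E_\lambda \subset W^{1,2}_{0,\Upsilon}(\Omega)$ such that
$$\|\nabla u\|^2_{L^2(\Omega)} < \lambda^2\, \|u\|^2_{L^2(\Omega)}, \qquad \forall u \in E_\lambda. \tag{2.2}$$
In view of the Rayleigh–Ritz variational formula, $N_{N,D}(\Omega, \Upsilon, \lambda)$ can be thought of as the counting function of the Laplacian on the bounded domain $\Omega$ subject to Dirichlet boundary condition on $\Upsilon$ and Neumann boundary condition on the remaining part of the boundary. In particular, $N_{N,D}(\Omega, \emptyset, \lambda) = N_N(\Omega, \lambda)$ and $N_{N,D}(\Omega, \partial\Omega, \lambda) = N_D(\Omega, \lambda)$. Equivalently, (2.1) can be rewritten as
$$N_{N,D}(\Omega, \Upsilon, \lambda) = \inf(\operatorname{codim} \tilde{E}_\lambda), \tag{2.3}$$
where the infimum is taken over all subspaces $\tilde{E}_\lambda \subset W^{1,2}_{0,\Upsilon}(\Omega)$ such that
$$\|\nabla u\|^2_{L^2(\Omega)} \ge \lambda^2\, \|u\|^2_{L^2(\Omega)}, \qquad \forall u \in \tilde{E}_\lambda. \tag{2.4}$$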
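Lemma 2.1. Let $\{\Omega_i\}_{i \in I}$ be a countable family of disjoint open sets $\Omega_j \subset \Omega$ such that $\mu_d(\Omega) = \mu_d\bigl(\bigcup_{i \in I} \Omega_i\bigr)$. Then
$$\sum_{i \in I} N_D(\Omega_i, \lambda) \le N_D(\Omega, \lambda) \le N_N(\Omega, \lambda) \le \sum_{i \in I} N_N(\Omega_i, \lambda)$$
and
$$N_N(\Omega, \lambda) \ge \sum_{j \in J} N_{N,D}(\Omega_j, \partial\Omega_j \setminus \partial\Omega, \lambda).$$

Lemma 2.1 is an elementary corollary of the Rayleigh–Ritz formula. The following lemma is less obvious.

Lemma 2.2. Let $\{\Omega_i\}_{i \in I}$ be a countable family of open sets $\Omega_j \subset \Omega$ such that $\mu_d(\Omega) = \mu_d\bigl(\bigcup_{i \in I} \Omega_i\bigr)$, $\Upsilon$ be an arbitrary subset of $\partial\Omega$ and $\Upsilon_j := \partial\Omega_j \cap \Upsilon$. If $\aleph\{\Omega_j\} \le \kappa < +\infty$ then $N_{N,D}(\Omega, \Upsilon, \kappa^{-1/2} \lambda) \le \sum_{j \in J} N_{N,D}(\Omega_j, \Upsilon_j, \lambda)$.

Proof. Denote by $\tilde{E}_{\lambda,j,\Omega}$ the subspace of functions $u \in W^{1,2}_{0,\Upsilon}(\Omega)$ such that $\|\nabla u\|^2_{L^2(\Omega_j)} \ge \lambda^2 \|u\|^2_{L^2(\Omega_j)}$. We have $\kappa\, \|\nabla u\|^2_{L^2(\Omega)} \ge \lambda^2 \|u\|^2_{L^2(\Omega)}$ whenever $u \in \bigcap_{j \in J} \tilde{E}_{\lambda,j,\Omega}$. Therefore, by (2.3),
$$N_N(\Omega, \kappa^{-1/2} \lambda) \le \inf\Bigl(\operatorname{codim} \bigcap_{j \in J} \tilde{E}_{\lambda,j,\Omega}\Bigr) \le \sum_{j \in J} \inf(\operatorname{codim} \tilde{E}_{\lambda,j,\Omega}),$$
where the infima are taken over all subspaces $\tilde{E}_{\lambda,j,\Omega}$ satisfying the above condition. If $\tilde{E}_{\lambda,j}$ is the intersection of the kernels of linear continuous functionals on $W^{1,2}_{0,\Upsilon_j}(\Omega_j)$ and $E_{\lambda,j,\Omega}$ is the intersection of the kernels of the linear continuous functionals on $W^{1,2}_{0,\Upsilon}(\Omega)$ obtained by composing them with the restriction $u \mapsto u|_{\Omega_j}$, then $\operatorname{codim} \tilde{E}_{\lambda,j} \ge \operatorname{codim} E_{\lambda,j,\Omega}$ and $u|_{\Omega_j} \in \tilde{E}_{\lambda,j}$ whenever $u \in E_{\lambda,j,\Omega}$. This observation and (2.3) imply that $\inf(\operatorname{codim} \tilde{E}_{\lambda,j,\Omega}) \le N_{N,D}(\Omega_j, \Upsilon_j, \lambda)$.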
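Remark 2.3. Lemma 2.2 implies that $N_N(\Omega, \kappa^{-1/2} \lambda) \le \sum_{j \in J} N_N(\Omega_j, \lambda)$ whenever $\bigcup_{j \in J} \Omega_j \subset \Omega$, $\mu_d(\Omega) = \mu_d\bigl(\bigcup_{i \in I} \Omega_i\bigr)$ and $\aleph\{\Omega_j\} \le \kappa$. It may well be the case that, under these conditions, $N_N(\Omega, \lambda) \le \sum_{j \in J} N_N(\Omega_j, \lambda)$. This conjecture looks plausible and is equivalent to the following statement: if $\Omega_1 \subset \Omega$, $\Omega_2 \subset \Omega$ and $\mu_d(\Omega) = \mu_d(\Omega_1) + \mu_d(\Omega_2)$ then $N_N(\Omega_1, \lambda) + N_N(\Omega_2, \lambda) \ge N_N(\Omega, \lambda)$.

Remark 2.4. The first eigenvalue of the Neumann Laplacian $-\Delta_N$ is always equal to 0 and the corresponding eigenfunction is identically equal to constant. Let $\lambda_{1,N}(\Omega) := \inf\{\lambda \in \mathbb{R}_+ \mid N_N(\Omega, \lambda) > 1\}$; if $-\Delta_N$ has at least two eigenvalues lying below its essential spectrum (or the essential spectrum is empty) then $\lambda_{1,N}(\Omega)$ coincides with the smallest nonzero eigenvalue of the operator $\sqrt{-\Delta_N}$. By the spectral theorem, we have $\lambda_{1,N}(\Omega) \ge \lambda$ if and only if $\int_\Omega |u(x)|^2\, dx \le \lambda^{-2} \int_\Omega |\nabla u(x)|^2\, dx$ for all functions $u \in W^{1,2}(\Omega)$ such that $\int_\Omega u(x)\, dx = 0$. Note that $\int_\Omega |u(x)|^2\, dx \le \int_\Omega |u(x) - c|^2\, dx$ for all $c \in \mathbb{C}$ whenever $\int_\Omega u(x)\, dx = 0$.

Definition 2.5. Denote by $\mathcal{P}(\delta)$ the set of all rectangles with edges parallel to the coordinate axes, such that the length of the maximal edge does not exceed $\delta$. If $f$ is a continuous function on $\overline{Q^{(d-1)}}$, let $\mathcal{V}(\delta, f)$ be the class of domains $V \subset G_f$ which can be represented in the form $V = G_{f,b}(Q_c^{(d-1)})$, where $Q_c^{(d-1)} \subset Q^{(d-1)}$, $c \le \delta$, $b = \inf f - \delta$ and $\operatorname{Osc}(f, Q_c^{(d-1)}) \le \delta/2$. We shall write $V \in \mathcal{V}(\delta)$ if $V \in \mathcal{V}(\delta, f)$ for some continuous function $f$. Finally, let $\mathcal{M}(\delta)$ be the class of open sets $M \subset \mathbb{R}^d$ such that $M \subset Q_\delta^{(d)}$ for some cube $Q_\delta^{(d)}$.

Lemma 2.6. Let $\delta$ be an arbitrary positive number.
(1) If $P \in \mathcal{P}(\delta)$ then $N_N(P, \lambda) = 1$ for all $\lambda \le \pi \delta^{-1}$.
(2) If $V \in \mathcal{V}(\delta)$ then $N_N(V, \lambda) = 1$ for all $\lambda \le (1 + 2\pi^{-2})^{-1/2} \delta^{-1}$.
(3) If $M \in \mathcal{M}(\delta)$, $M \subset Q_\delta^{(d)}$ and $\Upsilon := \partial M \cap Q_\delta^{(d)}$, then we have $N_{N,D}(M, \Upsilon, \lambda) \le 1$ for all $\lambda \le \pi \delta^{-1}$ and $N_{N,D}(M, \Upsilon, \lambda) = 0$ for all $\lambda \le \bigl(2^{-1} - 2^{-1} \delta^{-d} \mu_d(M)\bigr)^{1/2} \pi \delta^{-1}$.

Proof. If $P$ is a rectangle then $\lambda_{1,N} = \pi a^{-1}$, where $a$ is the length of its maximal edge. This implies (1).

Assume now that $V \in \mathcal{V}(\delta, f)$, where $f$ is a continuous function on $\overline{Q_c^{(d-1)}}$, and denote $b := \inf f - \delta$ and $P := Q_c^{(d-1)} \times (b, b + \delta)$. Clearly, $P \in \mathcal{P}(\delta)$. Let $u \in W^{1,2}(V)$ and $c_u$ the average of $u$ over $P$. If $r \in [b, b + \delta]$ and $s \in [b + \delta, f(x')]$ then, by Jensen's inequality,
$$|u(x', s) - u(x', r)|^2 = \Bigl| \int_r^s \partial_t u(x', t)\, dt \Bigr|^2 \le (s - r) \int_b^{f(x')} |\partial_t u(x', t)|^2\, dt.$$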
Since
b+δ f
b+δ (s
b
b
− r) ds dr = (δ/2) (f − b − δ) (f − b) and
) δ, 0 f − b − δ = f − inf f 2 Osc (f, Q(d−1) c we have b
g(x ) f (x ) g(x )
|u(x , s) − u(x , r)| ds dr δ 2
f (x )
3 b
|∂t u(x , t)|2 dt .
In view of Remark 2.4 and (1), we also have |u(x) − cu |2 dx π −2 δ 2 |∇u(x)|2 dx. P
(2.5)
P
Integrating the inequality |u(x , s) − cu |2 (1 + γ ) |u(x , r) − cu |2 + (1 + γ −1 ) |u(x , s) − u(x , r)|2 over r ∈ [b, b + δ] , s ∈ [b + δ, f (x )] and x ∈ and applying these two estimates, we obtain 2 −2 3 δ |u(x) − cu | dx (1 + γ ) π δ |∇u(x)|2 dx V \P P −1 3 +(1 + γ ) δ |∂xd u(x)|2 dx V
for substituting γ = π 2 , we see that
all γ > 0 . 2Dividing both sides by δ and
−2 2 2 V \P |u(x) − cu | dx is estimated by (1 + π ) δ V |∇u(x)| dx . Now (2) follows from (2.5) and Remark 2.4. In order to prove (3), let us consider a function u ∈ W 1,2 (M) which vanishes near (d) (d) ϒ and extend it by zero to the whole cube Qδ . Since u ∈ W 1,2 (Qδ ) , (1) implies (d) the first inequality (3). If cu is the average of u over Qδ then |cu | dx µd (M) δ 2
−d
M
|cu | dx + 2
M
|u(x) − cu | dx . 2
(d)
(2.6)
Qδ
Therefore Remark 2.4 and (1) imply that |u(x)|2 dx 2 |u(x) − cu |2 dx + 2 |cu |2 dx M
(d)
Qδ
2 1 + µd (M) δ 2π
−2 2
δ
−d
M
1 − µd (M) δ
1 − µd (M) δ
−d
−d
−1 (d)
|u(x) − cu |2 dx
Qδ
−1
|∇u(x)|2 dx . M
The second identity (3) follows from the above inequality and the Rayleigh–Ritz formula.

Remark 2.7. The second estimate in Lemma 2.6(3) is sufficient for our purposes but is very rough. One can obtain a much more precise result in terms of capacities (see [M2], Chap. 10, Sect. 1).

Lemma 2.8. Let $\delta > 0$. Then for all $\lambda > 0$ we have
$$-C_{d,1} \bigl( (\delta\lambda)^{d-1} + 1 \bigr) \le N(Q_\delta^{(d)}, \lambda) - C_{d,W} (\delta\lambda)^d \le C_{d,1} \bigl( (\delta\lambda)^{d-1} + 1 \bigr).$$
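Proof. Changing variables $\tilde{x} = \delta x$, we see that
$$N(\Omega, \delta\lambda) = N(\delta\Omega, \lambda), \qquad \text{where } \delta\Omega := \{x \in \mathbb{R}^d \mid \delta^{-1} x \in \Omega\}. \tag{2.7}$$
Therefore it is sufficient to prove the required estimates only for $\delta = 1$. If $\Omega = \Omega' \times \Omega''$, $\Upsilon' \subset \partial\Omega'$ and $\Upsilon'' \subset \partial\Omega''$ then, separating variables, we obtain
$$N_{N,D}(\Omega, \Upsilon, \lambda) = \int N_{N,D}\bigl(\Omega', \Upsilon', \sqrt{\lambda^2 - \mu^2}\bigr)\, dN_{N,D}(\Omega'', \Upsilon'', \mu), \tag{2.8}$$
where $\Upsilon = (\Upsilon' \times \partial\Omega'') \cup (\partial\Omega' \times \Upsilon'')$ and the right-hand side is a Stieltjes integral. Using (2.8), explicit formulae for the counting functions on the unit interval and the identities
$$\int_0^\lambda (\lambda^2 - \mu^2)^{n/2}\, d\mu = \lambda^{n+1}\, \omega_{n+1}\, (2\omega_n)^{-1}, \qquad \forall n = 1, 2, \dots, \tag{2.9}$$
one can easily prove the required inequality by induction in $d$.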
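The two-sided bound of Lemma 2.8 can be checked numerically on the unit cube, where the eigenvalues are explicit. The sketch below is an illustration under stated assumptions, not part of the paper's argument: it uses the standard fact that the Neumann (respectively Dirichlet) eigenvalues of the Laplacian on $(0,1)^d$ are $\pi^2 |k|^2$ with $k$ ranging over $d$-tuples of nonnegative (respectively positive) integers, and the function names `weyl_constant` and `count_cube_eigenvalues` are hypothetical.

```python
import itertools
import math


def weyl_constant(d):
    """C_{d,W} = (2*pi)^(-d) * omega_d, with omega_d the unit-ball volume."""
    omega_d = math.pi ** (d / 2) / math.gamma(d / 2 + 1)
    return omega_d / (2 * math.pi) ** d


def count_cube_eigenvalues(d, lam, boundary="neumann"):
    """Number of eigenvalues of the Laplacian on (0,1)^d strictly below lam^2.

    Eigenvalues are pi^2 * |k|^2, where the integer vector k has entries
    >= 0 for Neumann and >= 1 for Dirichlet boundary conditions.
    """
    start = 0 if boundary == "neumann" else 1
    radius_sq = (lam / math.pi) ** 2          # condition: |k|^2 < (lam/pi)^2
    k_max = int(lam / math.pi) + 1            # no entry of k can exceed this
    return sum(
        1
        for k in itertools.product(range(start, k_max + 1), repeat=d)
        if sum(j * j for j in k) < radius_sq
    )


if __name__ == "__main__":
    d, lam = 2, 100.0
    main_term = weyl_constant(d) * lam ** d
    print(count_cube_eigenvalues(d, lam, "dirichlet"), main_term,
          count_cube_eigenvalues(d, lam, "neumann"))
```

For the cube the Weyl term is squeezed between the two counts, since each lattice point corresponds to a unit cell lying inside (Dirichlet) or covering (Neumann) the relevant part of the ball of radius $\lambda/\pi$; the gap between the counts and the main term grows like $\lambda^{d-1}$, which is the order of the remainder in Lemma 2.8.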
Remark 2.9. Lemma 2.8 is an immediate consequence of well known results on spectral asymptotics in domains with piecewise smooth boundaries (see, for example, [Iv2] or [F]); a similar result holds true for higher order elliptic operators and operators with variable coefficients [V]. We have given an independent proof in order to find the explicit constant $C_{d,1}$.

3. Properties of Domains and Their Partitions

3.1. Besicovitch's and Whitney's theorems. We shall use the following version of Besicovitch's theorem.

Theorem 3.1. There are two constants $C_n \ge 1$ and $\hat C_n \ge 1$ depending only on the dimension $n$, such that for every compact set $K\subset\mathbb{R}^n$ and every positive function $\rho$ on $K$ one can find a finite subset $Y\subset K$ and a family of cubes $\{Q^{(n)}_{\rho(y)}[y]\}_{y\in Y}$ centred on $y$, which satisfy the following conditions:
(1) $K\subset\bigcup_{y\in Y} Q^{(n)}_{\rho(y)}[y]$;
(2) $\aleph\{K\cap Q^{(n)}_{\rho(y)}[y]\}_{y\in Y}\le C_n$;
(3) there exists a subset $\hat Y\subset Y$ such that $\#Y\le \hat C_n\,(\#\hat Y)$ and the cubes $\{Q^{(n)}_{\rho(y)}[y]\}_{y\in\hat Y}$ are mutually disjoint.
Theorem 3.1 is proved in the same way as Besicovitch's theorem in [G], Chap. 1.

Corollary 3.2. Let $f$ be a continuous function on the closure $\overline{Q^{(d-1)}}$. Then for every $\varepsilon > 0$ there exists a finite family of cubes $\{Q^{(d-1)}(x)\}_{x\in X}$ such that
(1) $\bigcup_{x\in X}\overline{Q^{(d-1)}(x)} = \overline{Q^{(d-1)}}$;
(2) $\aleph\{Q^{(d-1)}(x)\}\le C_{d,2}$;
(3) $\#X\le C_{d,3}\,V_\varepsilon(f,Q^{(d-1)})$;
(4) $\operatorname{Osc}(f,Q^{(d-1)}(x))\le\varepsilon$ for each $x\in X$.
Weyl Formula
Proof. Without loss of generality we can assume that $Q^{(d-1)} = (-1,1)^{d-1}$ and $\operatorname{Osc}(f,Q^{(d-1)}) > \varepsilon$. Let us denote by $Q_t^{(d-1)}[y]$ the cube of the size $t$ centred on $y$, define
\[
\rho(y) := \inf\{t>0 \mid \operatorname{Osc}(f, Q^{(d-1)}\cap Q_t^{(d-1)}[y]) = \varepsilon\}, \qquad y\in\overline{Q^{(d-1)}},
\]
apply Besicovitch's theorem to the set $K = \overline{Q^{(d-1)}}$ and find the sets $Y$ and $\hat Y$. If $y\in Y$, denote $P^{(d-1)}[y] := Q^{(d-1)}\cap Q^{(d-1)}_{\rho(y)}[y]$ and assume that
\[
P^{(d-1)}[y] = (a_1(y),b_1(y))\times(a_2(y),b_2(y))\times\dots\times(a_{d-1}(y),b_{d-1}(y)),
\]
where $-1\le a_j(y) < b_j(y)\le 1$. Let $Q'(y)$ be the minimal cube such that $P^{(d-1)}[y]\subset Q'(y)\subset Q^{(d-1)}$ and $c(y) := \max_j\,(b_j(y)-a_j(y))$. We have
\[
Q'(y) = (a'_1(y),b'_1(y))\times(a'_2(y),b'_2(y))\times\dots\times(a'_{d-1}(y),b'_{d-1}(y)),
\]
where
(−1) if $a_j(y) = -1$ then $a'_j(y) = -1$ and $b'_j(y) = a_j(y)+c(y)$;
(0) if $a_j(y) > -1$ and $b_j(y) < 1$ then $a'_j(y) = a_j(y)$ and $b'_j(y) = b_j(y)$;
(+1) if $b_j(y) = 1$ then $a'_j(y) = b_j(y)-c(y)$ and $b'_j(y) = 1$.

Let us consider the set $\Sigma = \{-1,0,1\}^{d-1}$ of all $(d-1)$-dimensional vectors $\sigma = (\sigma_1,\dots,\sigma_{d-1})$ with entries $\sigma_j$ equal to $-1$, $0$ or $1$. Denote by $\hat Y_\sigma$ the set of points $y\in\hat Y$ such that $a_j(y)$ and $b_j(y)$ satisfy the condition $(\sigma_j)$ for all $j = 1,\dots,d-1$. Since $\aleph\{P^{(d-1)}[y]\}_{y\in\hat Y} = 1$, for each $\sigma\in\Sigma$ the cubes $\{Q'(y)\}_{y\in\hat Y_\sigma}$ are mutually disjoint. Therefore $\#\hat Y_\sigma\le V_\varepsilon(f,Q^{(d-1)})$ for all $\sigma\in\Sigma$ (see Definition 1.1) and, consequently,
\[
\#\hat Y\le(\#\Sigma)\,V_\varepsilon(f,Q^{(d-1)}) = 3^{d-1}\,V_\varepsilon(f,Q^{(d-1)}).
\]
This estimate and Theorem 3.1(3) imply that $\#Y\le 3^{d-1}\,\hat C_{d-1}\,V_\varepsilon(f,Q^{(d-1)})$.
Since $Y\subset\overline{Q^{(d-1)}}$, we have $1/2\le(b_j(y)-a_j(y))^{-1}(b_k(y)-a_k(y))\le 2$ for all $j,k = 1,\dots,d-1$ and $y\in Y$. Using this inequality, one can easily show by induction in $d$ that every rectangle $P^{(d-1)}[y]$ coincides with the union of a finite collection of cubes $\{Q^{(d-1)}(x)\}_{x\in X_y}$ such that $\#X_y\le 2^{d-1}$ and $\aleph\{Q^{(d-1)}(x)\}_{x\in X_y}\le 2^{d-1}$. Let $X := \bigcup_{y\in Y}X_y$. In view of the first two conditions of Theorem 3.1, the family $\{Q^{(d-1)}(x)\}_{x\in X}$ satisfies (1) and (2). The upper bound $\#Y\le 3^{d-1}\,\hat C_{d-1}\,V_\varepsilon(f,Q^{(d-1)})$ implies (3). Finally, since $\operatorname{Osc}(f,P^{(d-1)}[y]) = \varepsilon$ and $Q^{(d-1)}(x)\subset P^{(d-1)}[y]$ whenever $x\in X_y$, we have (4).

The following theorem is due to Whitney. It can be found, for example, in [St], Chap. VI, or [G], Chap. 1.
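Theorem 3.3. There exists a countable family of mutually disjoint cubes $\{Q^{(d)}_{2^{-i}}(i,n)\}_{n\in N_i,\,i\in I}$ such that $\Omega = \bigcup_{i\in I}\bigcup_{n\in N_i}\overline{Q^{(d)}_{2^{-i}}(i,n)}$ and
\[
Q^{(d)}_{2^{-i}}(i,n)\subset\{x\in\Omega \mid \sqrt d\,2^{-i}\le\operatorname{dist}(x,\partial\Omega)\le 4\sqrt d\,2^{-i}\}. \tag{3.1}
\]
Here $I$ is a subset of $\mathbb{Z}$ and $N_i$ are some finite index sets.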
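The simplest instance of Theorem 3.3 can be written down by hand. The sketch below treats only the hypothetical case $d = 1$, $\Omega = (0,1)$, where the Whitney "cubes" are dyadic intervals of length $2^{-i}$ accumulating at the two endpoints; the function names and the truncation level are illustrative choices and not part of the paper. It samples each interval and checks the two-sided bound (3.1), which for $d = 1$ reads $2^{-i}\le\operatorname{dist}(x,\partial\Omega)\le 4\cdot 2^{-i}$.

```python
def whitney_intervals(max_level: int):
    """Dyadic Whitney-type intervals of (0, 1) for the levels 2..max_level.

    Returns tuples (i, left, right); every interval has length 2**-i.
    """
    cubes = []
    for i in range(2, max_level + 1):
        side = 2.0 ** -i
        cubes.append((i, side, 2 * side))          # accumulates at the endpoint 0
        cubes.append((i, 1 - 2 * side, 1 - side))  # accumulates at the endpoint 1
    return cubes


def dist_to_boundary(x: float) -> float:
    """Distance from x in (0, 1) to the boundary {0, 1}."""
    return min(x, 1.0 - x)


def satisfies_3_1(i: int, left: float, right: float, samples: int = 50) -> bool:
    """Check 2**-i <= dist(x, boundary) <= 4 * 2**-i at sample points of the interval."""
    side = 2.0 ** -i
    for s in range(samples + 1):
        x = left + (right - left) * s / samples
        if not (side <= dist_to_boundary(x) <= 4 * side):
            return False
    return True


if __name__ == "__main__":
    family = whitney_intervals(12)
    print(all(satisfies_3_1(*cube) for cube in family))
    # Total length of the truncated family; it tends to 1 as max_level grows.
    print(sum(right - left for _, left, right in family))
```

The truncated family covers $(0,1)$ up to two end pieces of length $2^{-12}$ each, so its total length is $1 - 2^{-11}$.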
3.2. Auxiliary results. In this subsection we shall prove several technical results concerning domains $G_{f,b}$.

Lemma 3.4. Let $f$ be a continuous function defined on the closure $\overline{Q_a^{(d-1)}}$. Then for every $\delta > 0$ and $m\in\mathbb{Z}_+$ there exists a finite family of cubes $\{Q^{(d-1)}(k)\}_{k\in K_m}$ such that
(1) $\bigcup_{k\in K_m}\overline{Q^{(d-1)}(k)} = \overline{Q_a^{(d-1)}}$;
(2) $Q^{(d-1)}(k)\in\mathcal{P}(\delta)$ for all $k\in K_m$;
(3) $\aleph\{Q^{(d-1)}(k)\}_{k\in K_m}\le C_{d,2}$;
(4) $\operatorname{Osc}(f,Q^{(d-1)}(k))\le 2^{m-1}\delta$ for all $k\in K_m$;
(5) $\#\{k\in K_m \mid \mu_{d-1}(Q^{(d-1)}(k))\le 2^{1-d}\delta^{d-1}\}\le C_{d,3}\,V_{2^{m-1}\delta}(f,Q_a^{(d-1)})$.

Proof. Let $\{Q^{(d-1)}(x)\}_{x\in X}$ be a family of cubes satisfying the conditions of Corollary 3.2 with $\varepsilon = 2^{m-1}\delta$. Assume that $Q^{(d-1)}(x) = Q^{(d-1)}_{a_x}$ with some $a_x > 0$ and denote by $X_\delta$ the set of all indices $x\in X$ such that $a_x\le\delta$. For each $x\in X\setminus X_\delta$, we choose a positive integer $m_x$ such that $a_x/m_x\in(\delta/2,\delta]$ and split the closed cube $\overline{Q^{(d-1)}(x)}$ into the union of $m_x^{d-1}$ congruent closed cubes $\overline{Q^{(d-1)}_{a_x/m_x}(x,j)}$, $j = 1,\dots,m_x^{d-1}$. Let $Q^{(d-1)}_{a_x/m_x}(x,j)$ be the corresponding disjoint open cubes and
\[
\{Q^{(d-1)}(k)\}_{k\in K_m} := \{Q^{(d-1)}(x)\}_{x\in X_\delta}\;\cup\;\{Q^{(d-1)}_{a_x/m_x}(x,j)\}_{x\in X\setminus X_\delta,\ j=1,\dots,m_x^{d-1}}.
\]
Then (2) holds true and (1), (3), (4) and (5) follow from Corollary 3.2(1), Corollary 3.2(2), Corollary 3.2(4) and Corollary 3.2(3) respectively.

Theorem 3.5. Let $f$ be a continuous function on $\overline{Q_a^{(d-1)}}$, $\delta\in(0,\sqrt d\,a]$ and $b\in[-\infty,\inf f-2\delta]$. Then there exist countable families of sets $\{P_j\}_{j\in J}$ and $\{V_k\}_{k\in K}$ satisfying the following conditions:
(1) $P_j\subset G_{f,b}$ and $P_j\in\mathcal{P}(\delta)$ for all $j\in J$;
(2) $V_k\subset G_{f,b}$ and $V_k\in\mathcal{V}(\delta,f)$ for all $k\in K$;
(3) $\aleph\{P_j\}\le 3C_{d,2}+1$ and $\aleph\{V_k\}\le C_{d,2}$;
(4) $G_{f,b}\subset\bigcup_{j\in J,\,k\in K}\bigl(\overline{P_j}\cup\overline{V_k}\bigr)$;
(5) $\#\{k\in K \mid \mu_d(V_k)\le 2^{1-d}\delta^d\}\le C_{d,3}\,V_{\delta/2}(f,Q_a^{(d-1)})$ and
\[
\#\{j\in J \mid \mu_d(P_j)\le(2\sqrt d)^{-d}\delta^d\}\le C_{d,3}\sum_{m=0}^{m_\delta}2^m\,V_{2^{m-1}\delta}(f,Q_a^{(d-1)}),
\]
where $m_\delta := \min\{m\in\mathbb{Z}_+ \mid 2^{m-1}\delta\ge\operatorname{Osc}(f,Q_a^{(d-1)})\}$.

Proof. Let $\{Q^{(d-1)}(k)\}_{k\in K_m}$ be the same families of cubes as in Lemma 3.4, $c_k := \inf_{x'\in Q^{(d-1)}(k)}f(x')$, $b_k = c_k-\delta$, $V_k := G_{f,b_k}(Q^{(d-1)}(k))$ and
\[
P_{m,k,n} := Q^{(d-1)}(k)\times(c_k-n\delta,\ c_k-n\delta+\delta),
\]
where $k\in\bigcup_m K_m$ and $n\in\mathbb{Z}_+$. Denote $N_m := \{2^m+1,\dots,2^m+2^{m+1}\}$. Lemma 3.4(4) implies that
\[
\bigcup_{k\in K_m,\,n\in N_m}P_{m,k,n}\subset\{x\in G_f \mid 2^m\delta\le f(x')-x_d\le 2^{m+2}\delta\} \tag{3.2}
\]
for all $m = 0,1,\dots,m_\delta$. Let $K := K_0$, $J_* := \bigcup_{m=0}^{m_\delta}K_m\times N_m$ and $\{P_{j_*}\}_{j_*\in J_*} := \bigcup_{m=0}^{m_\delta}\{P_{m,k,n}\}_{k\in K_m,\,n\in N_m}$.
Assume that $x\in G_f$. If $f(x')-x_d\le 2\delta$ then, by Lemma 3.4(1), we have $x\in\bigcup_{k\in K}\bigl(\overline{V_k}\cup\overline{P_{0,k,2}}\bigr)$. If $f(x')-x_d > 2^{m_\delta+1}\delta$ then
\[
\operatorname{dist}(x,\Gamma_f)\ \ge\ f(x')-x_d-2\operatorname{Osc}(f,Q_a^{(d-1)})\ \ge\ f(x')-x_d-2^{m_\delta}\delta\ >\ 2^{m_\delta}\delta\ \ge\ \delta.
\]
Finally, if $2\delta\le f(x')-x_d\le 2^{m_\delta+1}\delta$ then $2^{m+1}\delta\le f(x')-x_d\le 2^{m+1}\delta+2^m\delta$ for some nonnegative integer $m\le m_\delta$ and, in view of Lemma 3.4(1) and Lemma 3.4(4), we have $x\in\bigcup_{k\in K_m,\,n\in N_m}\overline{P_{m,k,n}}$. Therefore
\[
\{x\in G_f \mid \operatorname{dist}(x,\Gamma_f)\le\delta\}\subset\bigcup_{j_*\in J_*,\,k\in K}\bigl(\overline{P_{j_*}}\cup\overline{V_k}\bigr). \tag{3.3}
\]
Let us choose a constant $c\in(\delta/(2\sqrt d),\delta/\sqrt d]$ in such a way that $a/c\in\mathbb{N}$ and split the set $\overline{Q_a^{(d-1)}}\times[b,+\infty)$ into the union of congruent closed cubes $\overline{Q_c^{(d-1)}(i)}$ whose interiors $Q_c^{(d-1)}(i)$ are mutually disjoint. Let $\{P_j\}_{j\in J}$ be the collection of all the rectangles $P_{j_*}$ and all the cubes $Q_c^{(d-1)}(i)$ which are contained in $G_{f,b}$. Then (1) and (2) are obvious. The second inequality (3) and (5) follow from the corresponding statements of Lemma 3.4. The first inequality (3) is a consequence of (3.2), Lemma 3.4(3) and the identity $\aleph\{[2^m,2^{m+2}]\}_{m\in\mathbb{Z}_+} = 3$. It remains to prove (4).

Let $x\in G_f$. If $\operatorname{dist}(x,\Gamma_f)\le\delta$ then, by (3.3), either $x\in\overline{V_k}$ for some $k\in K$ or $x\in\overline{P_{j_*}}$ for some $j_*\in J_*$. Since $P_{j_*}\in\mathcal{P}(\delta)$ and $b\le\inf f-2\delta$, in the latter case $P_{j_*}\subset G_{f,b}$. If $\operatorname{dist}(x,\Gamma_f) > \delta$ then the cube $Q_c^{(d-1)}(i)$, whose closure contains $x$, is a subset of $G_{f,b}$ because its diameter does not exceed $\delta$. Therefore (4) holds true.

In the two dimensional case we also have the following, more precise result.
Theorem 3.6. Let the conditions of Theorem 3.5 be fulfilled and $d = 2$. Then there exist countable families of sets $\{P_j\}_{j\in J}$ and $\{V_k\}_{k\in K}$ such that
(1) $P_j\subset G_{f,b}$ and $P_j\in\mathcal{P}(\delta)$ for all $j\in J$;
(2) $V_k\subset G_{f,b}$ and $V_k\in\mathcal{V}(\delta,f)$ for all $k\in K$;
(3) $\aleph\bigl(\{P_j\}_{j\in J}\cup\{V_k\}_{k\in K}\bigr)\le 2$;
(4) $G_{f,b}\subset\bigcup_{j\in J,\,k\in K}\bigl(\overline{P_j}\cup\overline{V_k}\bigr)$;
(5) $\#\{k\in K \mid \mu_2(V_k)\le\delta^2/2\}\le V_{\delta/2}(f,Q_a^{(1)})$ and $\#\{j\in J \mid \mu_2(P_j)\le\delta^2/8\}\le 6\,V_{\delta/2}(f,Q_a^{(1)})+12a/\delta$.

Proof. In the two dimensional case we do not need Besicovitch's theorem because the 'cube' $Q_a^{(1)}$ coincides with an interval of the form $(b,b+a)$. Given $\varepsilon > 0$, one can easily construct a finite family $\{Q^{(1)}(x)\}_{x\in X}$ of disjoint subintervals $Q^{(1)}(x)\subset(b,b+a)$ satisfying the conditions (1)–(4) of Corollary 3.2 with $C_{d,2} = C_{d,3} = 1$. Therefore Lemma 3.4 remains valid if we substitute $C_{d,2} = C_{d,3} = 1$.

Let $k\in K := X$ and $b_k$, $Q^{(1)}(k)$ and $V_k = G_{f,b_k}(Q^{(1)}(k))$ be the same as in the proof of Theorem 3.5. By the above, the first inequality in Theorem 3.5(5) holds true with $C_{d,3} = 1$. Therefore $\#K\le V_{\delta/2}(f,Q_a^{(1)})+2a/\delta$ (the second term is the maximal number of intervals $Q^{(1)}(k)$ whose length exceeds $\delta/2$).

Let $V_f := \bigcup_{k\in K}\overline{V_k}$. The set $G_f\setminus V_f$ is a polygon with edges parallel to coordinate axes which has at most $2\,V_{\delta/2}(f,Q_a^{(1)})$ vertices lying on the horizontal lines $\{x \mid x_1\in Q_a^{(1)},\ x_2 = b_k\}$. Let us choose a constant $c\in(\delta/2,\delta]$ in such a way that $a/c\in\mathbb{N}$ and split the interval $Q_a^{(1)}$ into the union of $a/c$ intervals $(a_l,a_{l+1})$ of length $c$; if $a < \delta$ then we take $(a_1,a_2) := Q_a^{(1)}$. Denote
\[
K_l := \{k\in K \mid [a_{l-2},a_{l+3}]\cap Q^{(1)}(k)\ne\emptyset\}, \qquad b_{k,l} := \min_{k\in K_l}b_k,
\]
and $P_{k,l} := (a_l,a_{l+1})\times(b_k,b'_k)$, where $b'_k := \min\{b_{k'} \mid b_{k'} > b_k,\ k'\in K_l\}$; we assume that $P_{k,l} := \emptyset$ whenever $b_k = \max\{b_{k'} \mid k'\in K_l\}$. We have $\operatorname{dist}(x,\Gamma_f) > \delta$ whenever $x_1\in[a_l,a_{l+1}]$ and $x_2 < b_{k,l}$. Therefore
\[
\{x\in G_f\setminus V_f \mid \operatorname{dist}(x,\Gamma_f)\le\delta,\ x_1\in[a_l,a_{l+1}]\}\subset\bigcup_{k\in K_l}\overline{P_{k,l}}
\]
and, consequently, (3.3) holds true with $J_* := \bigcup_l K_l$ and $\{P_{j_*}\}_{j_*\in J_*} := \bigcup_l\{P_{k,l}\}_{k\in K_l}$. For each fixed $l$ the number of rectangles $P_{k,l}$ does not exceed $\#K_l-1$. We also have $\sum_l(\#K_l-1)\le 6\,(\#K)$ because each point $x_1\in Q_a^{(1)}$ belongs to at most six intervals $[a_{l-2},a_{l+3}]$. Therefore
\[
\#J_*\le 6\,(\#K)\le 6\,V_{\delta/2}(f,Q_a^{(1)})+12a/\delta.
\]
The rest of the proof repeats that of Theorem 3.5.
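3.3. General domains. We shall need the following elementary lemma.

Lemma 3.7. Let $h$ be a real-valued function on $\mathbb{R}_+$ and $0 < a\le b$. If the function $t\,h(t)$ is nondecreasing then
\[
\sum_{i\in\mathbb{Z}\,\mid\,a\le 2^i\le b}h(2^i)\ \le\ 2\int_a^{2b}t^{-1}h(t)\,dt.
\]

Proof. We have $\sum_{a\le 2^i\le b}h(2^i) = 2\sum_{a\le 2^i\le b}(2^{-i}-2^{-i-1})\,(2^i)\,h(2^i)$. Since the function $\tilde h(s) = s^{-1}h(s^{-1})$ is decreasing, the right-hand side is estimated by
\[
2\int_{(2b)^{-1}}^{a^{-1}}s^{-1}h(s^{-1})\,ds = 2\int_a^{2b}t^{-1}h(t)\,dt.
\]

Corollary 3.8. Let $\Omega\in BV_{\tau,\infty}$. Then for each $\delta\in(0,\delta_\Omega]$ there exist families of sets $\{P_j\}_{j\in J}$ and $\{V_k\}_{k\in K}$ satisfying the following conditions:
(1) for each $j$ there exists $l\in L$ such that $P_j\subset\Omega_l$ and $U_l(P_j)\in\mathcal{P}(\delta)$;
(2) for each $k$ there exists $l\in L$ such that $V_k\subset\Omega_l$ and $U_l(V_k)\in\mathcal{V}(\delta)$;
(3) $\aleph\{P_j\}\le n_\Omega\,(3C_{d,2}+1)$ and $\aleph\{V_k\}\le n_\Omega\,C_{d,2}$;
(4) $\Omega^b_{\delta_0}\subset\bigcup_{j\in J,\,k\in K}\bigl(\overline{P_j}\cup\overline{V_k}\bigr)\subset\Omega^b_{\delta_1}$;
(5) $\#K\le C_{d,3}\,C_{\Omega,\tau}\,\tau(2/\delta)+n_\Omega\,C_{d,2}\,2^{d-1}\,\delta^{-d}\,\mu_d(\Omega^b_{\delta_1})$ and
\[
\#J\le 4\,C_{d,3}\,C_{\Omega,\tau}\,\delta^{-1}\int_{(2D_\Omega)^{-1}}^{4/\delta}t^{-2}\tau(t)\,dt+n_\Omega\,(3C_{d,2}+1)\,(2\sqrt d)^d\,\delta^{-d}\,\mu_d(\Omega^b_{\delta_1}),
\]
where $\delta_0 := \delta/\sqrt d$ and $\delta_1 := \sqrt d\,\delta+\delta/\sqrt d$.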
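Lemma 3.7, which is applied in the proof of Corollary 3.8, lends itself to a quick numerical sanity check. The sketch below uses the hypothetical test family $h(t) = t^p$ with $p\ge -1$, for which $t\,h(t) = t^{p+1}$ is nondecreasing and the integral has a closed form; the family is an illustrative choice and does not appear in the paper. The dyadic sum is formed by brute force over exponents $-60\le i\le 60$, so $a$ and $b$ are assumed to lie well inside $[2^{-60},2^{60}]$.

```python
import math


def dyadic_sum(h, a: float, b: float) -> float:
    """Sum of h(2**i) over all integers i with a <= 2**i <= b."""
    return sum(h(2.0 ** i) for i in range(-60, 61) if a <= 2.0 ** i <= b)


def integral_bound(p: float, a: float, b: float) -> float:
    """Closed form of 2 * int_a^{2b} t**(-1) * h(t) dt for h(t) = t**p."""
    if p == 0:
        return 2 * math.log(2 * b / a)
    return 2 * ((2 * b) ** p - a ** p) / p


def lemma_3_7_holds(p: float, a: float, b: float) -> bool:
    """Check the inequality of Lemma 3.7 for h(t) = t**p, up to rounding."""
    lhs = dyadic_sum(lambda t: t ** p, a, b)
    rhs = integral_bound(p, a, b)
    return lhs <= rhs * (1 + 1e-12)


if __name__ == "__main__":
    for p, a, b in [(-1, 1, 1), (-1, 0.3, 7.5), (0, 1, 8), (0.5, 0.1, 100), (2, 0.5, 3)]:
        print(p, a, b, lemma_3_7_holds(p, a, b))
```

The borderline case $h(t) = t^{-1}$, $a = b = 1$ gives equality (both sides equal 1), which suggests that the factor 2 in the lemma cannot be improved in general.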
Proof. Let l = Ul−1 (Gfl , bl ) be the sets introduced in Subsect. 1.1. Given δ ∈ (0, δ ] , we apply Theorem 3.5 for each l ∈ L and denote by {Pj }j ∈J (l) and {Vk }k∈K(l) the families of subsets of l , which satisfy the conditions of Theorem 3.5 in an appropriate orthogonal coordinate system. Let J (l) := {j ∈ J (l) | dist(Pj , ∂) δ0 } , {Pj }j ∈J := {Pj }j ∈J (l) and {Vk }k∈K := {Vk }k∈K(l) . l∈L
l∈L
Then each of the conditions (1)–(3) is a consequence of the corresponding condition in Theorem 3.5. If x ∈ l∈L l then dist(x, ∂) δ > δ0 . If x ∈ l bδ0 then, by Theo rem 3.5(4), we have x ∈ j ∈J (l), k∈K(l) Pj Vk . In this case x ∈ j ∈J (l), k∈K(l) √ Pj Vk because diam Pj d δ . Therefore bδ0 is a subset of j ∈J , k∈K Pj Vk . √ √ The estimates supx∈Vk dist(x, ∂) d δ and diam Pj d δ imply the second inclusion (4). In order to prove (5), let us denote by Mδ the smallest positive integer such that 2Mδ −1 δ D . By Theorem 3.5(5), we have Mδ #{j ∈ J (l) | µd (Pj ) 21−d δ d } Cd,3 C, τ 2m τ ((2m−1 δ)−1 ) . l∈L
m=0
Since 2Mδ −1 δ 2D , applying Lemma 3.7 with a = (2D )−1 δ , b = 2 and h(t) = t −1 τ (δ −1 t) , we obtain 4/δ J (l) | µd (Pj ) 21−d δ d } 4 Cd,3 C, τ δ −1 t −2 τ (t) dt . #{j ∈ l∈L
(2D )−1
Now the second estimate (5) follows from the first inequality (3) and the second inclusion (4). Similarly, the first estimate (5) is a consequence of the second inequality (3), the second inclusion (4) and the first inequality in Theorem 3.5(5).

Corollary 3.9. Let $\Omega\in BV_{\tau,\infty}$ and $\Omega\subset\mathbb{R}^2$. Then for each $\delta\in(0,\delta_\Omega]$ there exist families of sets $\{P_j\}_{j\in J}$ and $\{V_k\}_{k\in K}$ satisfying the conditions (1), (2) and (4) of Corollary 3.8 such that
(3′) $\aleph\bigl(\{P_j\}_{j\in J}\cup\{V_k\}_{k\in K}\bigr)\le 2\,n_\Omega$;
(5′) $\#K\le C_{\Omega,\tau}\,\tau(2/\delta)+2\,n_\Omega\,\delta^{-2}\,\mu_2(\Omega^b_{\delta_1})$ and $\#J\le 6\,C_{\Omega,\tau}\,\tau(2/\delta)+12\,D_\Omega/\delta+16\,n_\Omega\,\delta^{-2}\,\mu_2(\Omega^b_{\delta_1})$.

Proof. The corollary is proved in the same way as Corollary 3.8, with the use of Theorem 3.6 instead of Theorem 3.5.

Our proof of Theorem 1.8 is based on the following simple lemma.

Lemma 3.10. Let $\Omega$ be an arbitrary domain. Then for every $\delta > 0$ there exists a family of sets $\{M_k\}_{k\in K}$ satisfying the following conditions:
(1) $M_k\subset\Omega$ and $M_k\in\mathcal{M}(\delta)$ for each $k\in K$;
(2) $\aleph\{M_k\} = 1$;
(3) $\Omega^b_{\delta_0}\subset\bigcup_{k\in K}\overline{M_k}\subset\Omega^b_{\delta_1}$, where $\delta_0 := \delta/\sqrt d$ and $\delta_1 := \sqrt d\,\delta+\delta/\sqrt d$.
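Proof. Consider an arbitrary cover of $\mathbb{R}^d$ by closed cubes $\overline{Q_\delta^{(d)}(k)}$ with disjoint interiors $Q_\delta^{(d)}(k)$ and define $\{M_k\}_{k\in K} := \{\Omega\cap Q_\delta^{(d)}(k)\}_{k\in K}$, where $K$ is the set of indices $k$ such that $\Omega^b_{\delta_0}\cap\overline{Q_\delta^{(d)}(k)}\ne\emptyset$.

4. Spectral Asymptotics

4.1. Estimates of the counting function. In this section we shall always assume that $\delta_0 := \delta/\sqrt d$, $\delta_1 := \sqrt d\,\delta+\delta/\sqrt d$ and denote
\[
R_\Omega(\lambda,\delta_1) := 3\,(4\sqrt d)\,C_{d,1}\int_{\delta_1}^\infty\bigl(s^{-1}\lambda^{d-1}+s^{-d}\bigr)\,d(\mu_d(\Omega^b_s)), \tag{4.1}
\]
where $\int_{\delta_1}^\infty\bigl(s^{-1}\lambda^{d-1}+s^{-d}\bigr)\,d(\mu_d(\Omega^b_s))$ is understood as a Stieltjes integral.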
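Theorem 4.1. If $\Omega\subset\mathbb{R}^d$ is an arbitrary domain and $\delta > 0$ then
\[
N(\Omega,\lambda)-C_{d,W}\,\mu_d(\Omega)\,\lambda^d\ \ge\ -R_\Omega(\lambda,\delta_1)-C_{d,W}\,\mu_d(\Omega^b_{4\delta_1})\,\lambda^d, \qquad \forall\lambda > 0, \tag{4.2}
\]
and
\[
N_D(\Omega,\lambda)-C_{d,W}\,\mu_d(\Omega)\,\lambda^d\ \le\ R_\Omega(\lambda,\delta_1)+\bigl((4d)^d+2\bigr)\,\delta^{-d}\,\mu_d(\Omega^b_{4\delta_1}) \tag{4.3}
\]
for all $\lambda\le\delta^{-1}$. If $\Omega\in BV_{\tau,\infty}$ and $\delta\in(0,\delta_\Omega]$ then
\[
N_N(\Omega,\lambda)-C_{d,W}\,\mu_d(\Omega)\,\lambda^d\ \le\ R_\Omega(\lambda,\delta_1)+(4d)^d\,\delta^{-d}\,\mu_d(\Omega^b_{4\delta_1})+C_{d,6}\,n_\Omega\,\delta^{-d}\,\mu_d(\Omega^b_{\delta_1})+8\,C_{d,3}\,C_{\Omega,\tau}\,\delta^{-1}\int_{(2D_\Omega)^{-1}}^{4/\delta}t^{-2}\tau(t)\,dt \tag{4.4}
\]
for all $\lambda\le\min\{1,\ C_{d,9}^{1/2}\,n_\Omega^{-1/2}\}\,\delta^{-1}$.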
(d)
Proof. Let Q2−i (i, n) be the Whitney cubes introduced in Theorem 3.3, √ √ Iδ− := {i ∈ I | d 2−i δ0 /4} , Iδ+ := {i ∈ I | d 2−i > δ1 } , Iδ0 := I \ (Iδ+
Iδ− ) and σδ :=
(d)
Q2−i (i, n), where σ = + , σ = 0 0 − or σ = − . The set σδ are mutually disjoint and = + δ δ . By virtue of δ (3.1), b − δ ⊂ δ0 ,
and
i∈Iδσ
0δ ⊂ b4δ1 \ bδ0 /4 ,
n∈Ni
b \ b4δ1 ⊂ + δ ⊂ \ δ1
#Ni 2i d µd (b4√d 2−i ) − µd (b√d 2−i ) ,
∀i ∈ I .
In view of the second inclusion (4.5), we have √ #Ni (4 d δ0−1 )d µd (b4δ1 ) = (4d)d δ −d µd (b4δ1 ) . i∈Iδ0
(4.5)
(4.6)
(4.7)
√ √ Since ℵ{ [ d 2−i , 4 d 2−i ] }i∈Z = 3 and bs = bD for all s D , the inequalities (4.6) imply that √ ∞ −1 d−1 ((2i )1−d λd−1 + 1) #Ni 3 (4 d) (s λ + s −d ) d(µd (bs )) (4.8) δ1
i∈Iδ+
for all λ > 0 . By Lemma 2.1, d N(, λ) − Cd,W µd () λd − Cd,W µd ( \ + δ )λ + d + ND (+ , δ , λ) − Cd,W µd (δ ) λ
(4.9)
ND (, λ) − Cd,W µd () λ NN,D ( \ + δ , ∂, λ) + d + NN (+ δ , λ) − Cd,W µd (δ ) λ
(4.10)
NN (, λ) − Cd,W µd () λd NN ( \ + δ , λ) + + d + NN (δ , λ) − Cd,W µd (δ ) λ .
(4.11)
d
and
Lemma 2.1 implies that (d) + d ND (Q2−i (i, n), λ) − Cd,W (2−i λ)d N (+ δ , λ) − Cd,W µd (δ ) λ n∈Ni , i∈Iδ+
n∈Ni , i∈Iδ+
(d) NN (Q2−i (i, n), λ) − Cd,W (2−i λ)d .
In view of Lemma 2.8, sides are estimated from below and above the right- and left-hand by ± Cd,1 i∈I + (2i )1−d λd−1 + 1 #Ni . Therefore, by (4.8), δ
+ d | N(+ δ , λ) − Cd,W µd (δ ) λ | R (λ, δ1 ) ,
∀λ > 0 .
(4.12)
Since \ b4δ1 ⊂ + δ , the lower bound (4.2) is an immediate consequence of (4.9) and (4.12). Assume that λ δ −1 . Let {Mk }k∈K be the family of sets introduced in Lemma 3.10 and (d) {Sm }m∈MD := {Q2−i (i, n)}n∈Nj , i∈I 0 {Mk }k∈K . δ Lemma 3.10(3) and (4.5) imply that m∈MD Sm = \+ δ . In view of Lemma 3.10(2), we have ℵ{Sm }m∈MD 2 . Consequently, by Lemma 2.2, √ NN,D ( \ + NN,D (Sm , ϒm , 2 λ) , δ , ∂, λ) m∈MD
−1/2 δ ) or to M(δ) , where ϒm = ∂Sm ∂ . Since each set 1 √ Sm belongs either to P(d Lemma 2.6 implies that NN (Sm , ϒm√ , 2 λ) 1 . Moreover, if Sm ∈ M(δ) then, in view of Lemma 2.6(3), NN (Sm , ϒm , 2λ) > 0 only if µd (Sm ) δ d − 4π −2 δ d+2 λ2 .
By Lemma 3.10(3), the number of sets M ∈ {Mk }k∈K satisfying this estimate does not exceed −1 δ −d µd (bδ1 ) 2 δ −d µd (bδ1 ). 1 − 4π −2 δ 2 λ2 Taking into account (4.7), we obtain d −d µd (b4δ1 ) + 2 δ −d µd (bδ1 ) . NN,D ( \ + δ , ∂, λ) (4d) δ
This estimate, (4.10) and (4.12) imply (4.3). In order to prove (4.4), let us consider the family of sets {Pj }j ∈J and {Vk }k∈K constructed in Corollary 3.8 and define (d) {Sm }m∈MN := {Q2−i δ (i, n)}n∈Nj , i∈I 0 {Pj }j ∈J {Vk }k∈K . δ
Corollary 3.8(4) and (4.5) imply that m∈M Sm = \+ δ . In view of Corollary 3.8(3), 2 . Consequently, by Lemma 2.2, we have ℵ{Sm }m∈M n Cd,4 1/2 NN ( \ + NN (Sm , n Cd,4 λ) . δ , λ) m∈MN
Since each set Sm belongs either to V(δ) or to P(d −1/2 δ1 ) , Lemma 2.6 implies that 1/2 1/2 NN (Sm , n Cd,4 λ) = 1 whenever n Cd,4 λ Cd,5 δ −1 . Estimating #M with the use of (4.7) and Corollary 3.8(5) and applying the inequalities 4/δ 4/δ 4/δ −2 −2 t dt t τ (t) dt t −2 τ (t) dt , (δ/4) τ (δ/2) = τ (δ/2) 2/δ
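we see that
\[
N_N(\Omega\setminus\Omega^+_\delta,\lambda)\ \le\ 8\,C_{d,3}\,C_{\Omega,\tau}\,\delta^{-1}\int_{(2D_\Omega)^{-1}}^{4/\delta}t^{-2}\tau(t)\,dt+(4d/\delta)^d\,\mu_d(\Omega^b_{4\delta_1})+C_{d,6}\,n_\Omega\,\delta^{-d}\,\mu_d(\Omega^b_{\delta_1}) \tag{4.13}
\]
for all $\lambda\le C_{d,7}\,n_\Omega^{-1/2}\,\delta^{-1}$. Now (4.4) follows from (4.11) and (4.12).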
4.2. Two dimensional domains. If $d = 2$, $\tau(t) = t$ and $\delta\sim\lambda^{-1}$ then the first term on the right-hand side of (4.13) coincides with $c\,\lambda\log\lambda$, where $c$ is some constant. On the other hand, for two dimensional domains with smooth boundaries we have $N_N(\Omega^b_{\lambda^{-1}},\lambda)\sim\lambda$ as $\lambda\to\infty$ (see, for example, [SV]). The following lemma gives a refined estimate for $N_N(\Omega\setminus\Omega^+_\delta,\lambda)$, which does not contain the logarithmic factor.

Lemma 4.2. Let $\Omega\subset\mathbb{R}^2$, $\Omega\in BV_{\tau,\infty}$, $\delta\in(0,\delta_\Omega]$ and $\Omega^+_\delta$ be defined as in Subsect. 4.1. Then for all $\lambda\le\frac{\sqrt 2}{3}\,n_\Omega^{-1/2}\,\delta^{-1}$ we have
\[
N_N(\Omega\setminus\Omega^+_\delta,\lambda)\ \le\ 7\,C_{\Omega,\tau}\,\tau(2/\delta)+(64+18\,n_\Omega)\,\delta^{-2}\,\mu_2(\Omega^b_{4\delta_1})+12\,D_\Omega/\delta. \tag{4.14}
\]

Proof. Applying the same arguments as in the proof of Theorem 4.1 but using Corollary 3.9 instead of Corollary 3.8, one obtains (4.14) instead of (4.13).
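The logarithmic factor mentioned at the start of this subsection can be made explicit. The short computation below is a sketch under two assumptions: the first term of (4.13) is the integral term that also appears in (4.4), and $\delta$ is taken to be exactly $\lambda^{-1}$ (the text only requires $\delta$ to be comparable to $\lambda^{-1}$).

```latex
% Assumptions: d = 2, \tau(t) = t, \delta = \lambda^{-1}.
\[
8\,C_{d,3}\,C_{\Omega,\tau}\,\delta^{-1}\int_{(2D_\Omega)^{-1}}^{4/\delta} t^{-2}\,\tau(t)\,dt
  \;=\; 8\,C_{d,3}\,C_{\Omega,\tau}\,\delta^{-1}\int_{(2D_\Omega)^{-1}}^{4/\delta}\frac{dt}{t}
  \;=\; 8\,C_{d,3}\,C_{\Omega,\tau}\,\delta^{-1}\,\log\frac{8\,D_\Omega}{\delta},
\]
\[
\text{so that for } \delta=\lambda^{-1}:\qquad
8\,C_{d,3}\,C_{\Omega,\tau}\,\lambda\,\log(8\,D_\Omega\,\lambda)
  \;=\; 8\,C_{d,3}\,C_{\Omega,\tau}\,\lambda\log\lambda + O(\lambda),
  \qquad \lambda\to\infty .
\]
```

By contrast, with $\tau(t) = t$ and $\delta = \lambda^{-1}$ the corresponding term $7\,C_{\Omega,\tau}\,\tau(2/\delta)$ in (4.14) equals $14\,C_{\Omega,\tau}\,\lambda$, with no logarithm.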
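4.3. Proof of Theorems 1.3, 1.8 and Corollary 1.5. Integrating by parts in the Stieltjes integral and changing variables $s = t^{-1}$, we obtain
\[
\int_\varepsilon^\infty\bigl(s^{-1}\lambda^{d-1}+s^{-d}\bigr)\,d(\mu_d(\Omega^b_s))+\bigl(\varepsilon^{-1}\lambda^{d-1}+\varepsilon^{-d}\bigr)\,\mu_d(\Omega^b_\varepsilon) = \int_0^{\varepsilon^{-1}}\bigl(\lambda^{d-1}+d\,t^{d-1}\bigr)\,\mu_d(\Omega^b_{t^{-1}})\,dt, \qquad \forall\varepsilon > 0. \tag{4.15}
\]
Therefore
\[
\bigl((4\delta_1)^{-1}\lambda^{d-1}+(4\delta_1)^{-d}\bigr)\,\mu_d(\Omega^b_{4\delta_1})\ \le\ \bigl(\lambda^{d-1}+d\,\delta_1^{1-d}\bigr)\int_0^{\delta_1^{-1}}\mu_d(\Omega^b_{t^{-1}})\,dt
\]
and
\[
\int_{\delta_1}^\infty\bigl(s^{-1}\lambda^{d-1}+s^{-d}\bigr)\,d(\mu_d(\Omega^b_s))\ \le\ \bigl(\lambda^{d-1}+d\,\delta_1^{1-d}\bigr)\int_0^{\delta_1^{-1}}\mu_d(\Omega^b_{t^{-1}})\,dt.
\]
Applying these inequalities and the estimates (4.2)–(4.4) with $\delta_1^{-1} = \lambda$ or $\delta^{-1} = C_{d,8}\,n_\Omega^{1/2}\,\lambda$, we obtain (1.1) and (1.6). The estimate (1.2) is proved in the same manner, using (4.14) instead of (4.13). Finally, since $\int_a^b t^{-2}\tau(t)\,dt\le b^{d-2}\int_a^b t^{-d}\tau(t)\,dt$, (1.3) is a consequence of (1.1) and the following lemma.

Lemma 4.3. If $\Omega\in BV_{\tau,\infty}$ then
\[
\mu_d(\Omega^b_\varepsilon)\ \le\ C_{d,2}\,3^d\,n_\Omega\,D_\Omega^{d-1}\,\varepsilon+C_{d,3}\,3^d\,C_{\Omega,\tau}\,\varepsilon^d\,\tau(\varepsilon^{-1}), \qquad \forall\varepsilon > 0.
\]

Proof. Assume first that $f$ is a continuous function on the closed cube $\overline{Q_a^{(d-1)}}$. Let $\{Q^{(d-1)}(x)\}_{x\in X}$ be the same family of cubes as in Corollary 3.2, $\Gamma_f(x) := \{z\in\Gamma_f \mid z'\in Q^{(d-1)}(x)\}$ and $X_\varepsilon := \{x\in X \mid Q^{(d-1)}(x)\in\mathcal{P}(\varepsilon)\}$. If $\operatorname{dist}(y,\Gamma_f)\le\varepsilon$ then $\operatorname{dist}(y,\Gamma_f(x))\le\varepsilon$ for some $x\in X$. Therefore
\[
\mu_d\{y\in Q_a^{(d-1)} \mid \operatorname{dist}(y,\Gamma_f)\le\varepsilon\}\ \le\ \sum_{x\in X}\mu_d\{y\in Q_a^{(d-1)} \mid \operatorname{dist}(y,\Gamma_f(x))\le\varepsilon\}.
\]
The set $\{y\in Q_a^{(d-1)} \mid \operatorname{dist}(y,\Gamma_f(x))\le\varepsilon\}$ lies in the $\varepsilon$-neighbourhood of the rectangle $Q^{(d-1)}(x)\times\bigl[\inf_{z\in Q^{(d-1)}(x)}f(z),\ \sup_{z\in Q^{(d-1)}(x)}f(z)\bigr]$. In view of Corollary 3.2(4), the measure of this $\varepsilon$-neighbourhood does not exceed $3\varepsilon\,(a_x+2\varepsilon)^{d-1}$, where $a_x$ is the length of the edge of $Q^{(d-1)}(x)$. Therefore
\[
\mu_d\{y\in Q_a^{(d-1)} \mid \operatorname{dist}(y,\Gamma_f)\le\varepsilon\}\ \le\ 3^d\,\varepsilon^d\,(\#X_\varepsilon)+3^d\,\varepsilon\sum_{x\in X\setminus X_\varepsilon}a_x^{d-1}.
\]
Now the obvious inequality $\sum_{x\in X}a_x^{d-1}\le a^{d-1}\,\aleph\{Q^{(d-1)}(x)\}_{x\in X}$ and Corollary 3.2(3) imply that
\[
\mu_d(\{y\in\mathbb{R}^d \mid \operatorname{dist}(y,\Gamma_f)\le\varepsilon\})\ \le\ C_{d,2}\,3^d\,\varepsilon\,a^{d-1}+C_{d,3}\,3^d\,\varepsilon^d\,V_\varepsilon(f,Q_a^{(d-1)}).
\]
Since $\Omega^b_\varepsilon = \bigcup_{l\in L}\{x\in\Omega \mid \operatorname{dist}(x,\Gamma_{f_l})\le\varepsilon\}$, where $f_l$ are the functions introduced in Subsect. 1.1, the lemma follows from this inequality.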
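4.4. Proof of Corollaries 1.6 and 1.9. Let $\Omega\in\mathrm{Lip}_\alpha$, $f_l$ be the functions introduced in Subsect. 1.1 and $|\Omega|_\alpha := \max_l|f_l|_\alpha$, where $|\cdot|_\alpha$ is the seminorm defined in Subsect. 1.1. If $x\in G_{f_l}$ and $\operatorname{dist}(x,(y',f_l(y')))\le\delta$ then
\[
f_l(x')-x_d\ \le\ |x_d-f_l(y')|+|f_l(y')-f_l(x')|\ \le\ \delta+\delta^\alpha\,|f_l|_\alpha. \tag{4.16}
\]
Therefore $\{x\in G_{f_l} \mid \operatorname{dist}(x,\Gamma_{f_l})\le\delta\}\subset\{x\in G_{f_l} \mid f_l(x')-x_d\le\delta+\delta^\alpha|f_l|_\alpha\}$ and, consequently, $\mu_d(\{x\in G_{f_l} \mid \operatorname{dist}(x,\Gamma_{f_l})\le\delta\})\le a^{d-1}(\delta+\delta^\alpha|f_l|_\alpha)$. This immediately implies the following lemma.

Lemma 4.4. If $\Omega\in\mathrm{Lip}_\alpha$ and $\delta\le\delta_\Omega$ then $\mu_d(\Omega^b_\delta)\le n_\Omega\,D_\Omega^{d-1}\,(\delta+\delta^\alpha|\Omega|_\alpha)$.

If $Q_c^{(d-1)}\subset Q_{a_l}^{(d-1)}$ then $\operatorname{diam}Q_c^{(d-1)} = d^{1/2}c$ and
\[
2\operatorname{Osc}(f,Q_c^{(d-1)})\ \le\ \sup_{x',y'\in Q_c^{(d-1)}}|f_l(x')-f_l(y')|\ \le\ d^{\alpha/2}\,c^\alpha\,|f|_\alpha. \tag{4.17}
\]
Therefore $c^{d-1}\ge d^{(1-d)/2}\,|f|_\alpha^{(1-d)/\alpha}\,\delta^{(d-1)/\alpha}$ whenever $\operatorname{Osc}(f,Q_c^{(d-1)})\ge\delta/2$ and, consequently,
\[
V_{\delta/2}(f,Q_a^{(d-1)})\ \le\ d^{(d-1)/2}\,a^{d-1}\,|f|_\alpha^{(d-1)/\alpha}\,\delta^{(1-d)/\alpha}+1. \tag{4.18}
\]
The inequality (4.18) implies the following result.

Lemma 4.5. If $\Omega\in\mathrm{Lip}_\alpha$ and
\[
\tau(t) = 2^{(1-d)/\alpha}\,d^{(d-1)/2}\,D_\Omega^{d-1}\,|\Omega|_\alpha^{(d-1)/\alpha}\,t^{(d-1)/\alpha}+1, \tag{4.19}
\]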
then ∈ BV∞,τ and C,τ n . Clearly, (1.4) follows from (1.1) and Lemma 4.5. Similarly, (1.6) and Lemma 4.4 imply (1.7). It remains to prove (1.5) and (1.8). (ε) Assume that ∈ lipα . Then for each ε > 0 we can find functions fl, 1 ∈ (ε)
(ε)
(ε)
(ε)
Lip1 and fl, 2 ∈ Lipα such that fl = fl, 1 + fl, 2 and |fl, 2 |α ε . Obviously, (ε)
(ε)
(ε)
(ε)
Vδ (fl, 1 + fl, 2 , Q) Vδ/2 (fl, 1 , Q) + Vδ/2 (fl, 2 , Q) . Therefore (4.18) implies that d−1 Vδ (fl , Q(d−1) ε (d−1)/α δ (1−d)/α + Cεd−1 δ 1−d + 2 ) d (d−1)/2 D al ε(d−1)/α τε (δ −1 ) ,
(4.20)
(ε)
where Cε := maxl |fl, 2 |1 , and d−1 (d−1)/α τε (t) := d (d−1)/2 D t + Cε, ε (1−d)/α t d−1 + 2 ε(1−d)/α . We also have |fl (x ) − fl (y )| ε |x − y |α + |fl, 2 |1 |x − y | , (ε)
∀x , y ∈ Q(d−1) . al
Therefore, instead of (4.16), we obtain fl (x ) − xd δ + δ |fl, 2 |1 + δ α ε . This implies d−1 that µd (bδ ) n D (δ + Cε δ + δ α ε) whenever δ δ . (ε)
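In view of (4.20), we have $\Omega\in BV_{\infty,\tau_\varepsilon}$ and $C_{\Omega,\tau_\varepsilon}\le\varepsilon^{(d-1)/\alpha}\,n_\Omega$. Choosing a sufficiently large constant $C$ and applying (4.2)–(4.4) with $\delta = C\lambda^{-1}$ and $\tau = \tau_\varepsilon$, we see that
\[
|N_N(\Omega,\lambda)-C_{d,W}\,\mu_d(\Omega)\,\lambda^d|\ \le\ \varepsilon^{(d-1)/\alpha}\,C_\Omega\,\lambda^{(d-1)/\alpha}+C_{\Omega,\varepsilon}\,\lambda^{d-1},
\]
\[
|N_D(\Omega,\lambda)-C_{d,W}\,\mu_d(\Omega)\,\lambda^d|\ \le\ \varepsilon\,C_\Omega\,\lambda^{d-\alpha}+C_{\Omega,\varepsilon}\,\lambda^{d-1}
\]
for all $\lambda > 1$, where $C_\Omega$ is a constant depending only on the domain $\Omega$ and $C_{\Omega,\varepsilon}$ is a constant depending on $\Omega$ and $\varepsilon$. Since $\varepsilon$ can be made arbitrarily small, these inequalities imply (1.5) and (1.8).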
4.5. Proof of Theorem 1.10. Let $Q_1^{(d-1)} = (0,1)^{d-1}$, $\alpha\in(0,1)$ and $p$ be a sufficiently large positive integer. In particular, we shall be assuming that $p\ge\max\{\alpha^{-1},(1-\alpha)^{-1}\}$ and, consequently,
\[
2^{1-\alpha p}\le 1, \qquad \bigl(1-2^{-\alpha p}\bigr)^{-1}\le 2, \qquad \bigl(1-2^{-(1-\alpha)p}\bigr)^{-1}\le 2 \tag{4.21}
\]
and
\[
\bigl(2^{(1-\alpha)(n+1)p}-1\bigr)\bigl(2^{(1-\alpha)p}-1\bigr)^{-1}\le 2^{1+(1-\alpha)np}, \qquad \forall n = 1,2,\dots. \tag{4.22}
\]
Given $j\in\mathbb{Z}_+$, let us denote by $K_j$ the set of nonnegative integer vectors $k = (k_1,\dots,k_{d-1})\in\mathbb{Z}_+^{d-1}$ such that $\max_i k_i\le 2^{jp}-1$ and consider the $(d-1)$-dimensional cubes
\[
Q(j,k) := \{x'\in\mathbb{R}^{d-1} \mid 2^{jp}x'-k\in Q_1^{(d-1)}\}, \qquad k\in K_j,
\]
with edges of length $2^{-jp}$. For each fixed $j\in\mathbb{Z}_+$ the cubes $Q(j,k)$, $k\in K_j$, are disjoint and $\overline{Q_1^{(d-1)}} = \bigcup_{k\in K_j}\overline{Q(j,k)}$.

Let $\psi\in\mathrm{Lip}_1$ be a nonnegative Lipschitz function on $Q_1^{(d-1)}$ vanishing on the boundary $\partial Q_1^{(d-1)}$, $a_\psi := \sup\psi$ and $b_{\psi,p} := \sqrt d\,2^{3-(1-\alpha)p}\,(|\psi|_1+a_\psi)$. We shall be assuming that $p$ is large enough so that $a_\psi > b_{\psi,p}$. Let us extend $\psi$ by $0$ to the whole space $\mathbb{R}^{d-1}$ and define
\[
g_j(x') := \sum_{k\in K_j}\psi(2^{jp}x'-k), \qquad f_n(x') := \sum_{j=0}^n\varepsilon_j\,2^{-\alpha jp}\,g_j(x')
\]
and $f(x') := \lim_{n\to\infty}f_n(x') = \sum_{j=0}^\infty\varepsilon_j\,2^{-\alpha jp}\,g_j(x')$, where $\{\varepsilon_j\}$ is a nonincreasing sequence such that $\varepsilon_j\in[0,1]$ and
\[
2^{(1-\alpha)([j/2]-j)p}\ \le\ \varepsilon_{[j/2]}\ \le\ 2\,\varepsilon_j, \qquad \forall j = 1,2,\dots. \tag{4.23}
\]
Note that the condition (4.23) is fulfilled whenever $\{\varepsilon_j\}$ is a sufficiently slowly decreasing sequence.

Lemma 4.6. We have
(1) $g_j = 0$ on $\partial Q(n,k)$ for all $k\in K_n$ and $j\ge n$;
(2) $0\le f(x')-f_n(x')\le 2\,\varepsilon_{n+1}\,2^{-\alpha(n+1)p}\,a_\psi\le\varepsilon_{n+1}\,2^{-\alpha np}\,a_\psi$;
(3) $|f_n|_\beta\le 2^{1+(\beta-\alpha)np}\,(|\psi|_1+a_\psi)$ for all $\beta\in[\alpha,1]$;
(4) $f\in\mathrm{Lip}_\alpha$ and $|f|_\alpha\le 2\,(|\psi|_1+a_\psi)$;
(5) $f\in\mathrm{lip}_\alpha$ whenever $\varepsilon_j\to 0$ as $j\to\infty$;
(6) $2\operatorname{Osc}(f_{n-1},Q(n,k))\le\varepsilon_n\,2^{-\alpha np}\,b_{\psi,p}$ for all $k\in K_n$.

Proof. (1) is obvious and (2) immediately follows from (4.21). In order to prove (3), let us fix $\beta\in[\alpha,1]$, denote $n' := \max\{j \mid 2^{-jp}\ge|x'-y'|\}$, $n'' := \min\{n,n'\}$ and estimate
n |gj (x ) − gj (y )| j =0
=
2αjp |x − y |β
|ψ|1
n
n |gj (x ) − gj (y )|
2αjp |x − y |β
j =0
2(1−α)jp |x − y |1−β + aψ
+
n |gj (x ) − gj (y )| 2αjp |x − y |β
j =n +1
n
2−αjp |x − y |−β .
j =n +1
j =0
In view of (4.22), the first term on the right-hand side is estimated by |ψ|1 nj =0 2(1−α)jp+(1−β)np 21+(β−α)np |ψ|1 . If n n then the second term on the right-hand side vanishes; if n > n then, by (4.21), it does not exceed 2 aψ 2−α(n +1)p |x −y |−β 2 aψ 2(β−α)(n +1)p 21+(β−α)np aψ . Thus, n |gj (x ) − gj (y )| j =0
2αjp |x − y |β
21+(β−α)np (|ψ|1 + aψ ) .
(4.24)
This estimate immediately implies (3) and (4). The inclusion (5) is also a consequence |gj (x )−gj (y )| of (4.24) because |f − fn |α εn+1 supx ,y ∞ j =0 2αjp |x −y |α . Finally, in view of (4.23) and (4.24), we have (|ψ|1 + aψ )−1 |fj |1 21+(1−α)[j/2]p + ε[j/2] 21+(1−α)jp εj 23+(1−α)jp . (4.25) √ −np Since diam Q(n, k) = d 2 , (4.25) with j = n − 1 implies (6). Let := Gf, 0 , n, k := {x ∈ | x ∈ Q(n,k) , xd ∈ (fn−1 (x ), f (x ))} , ϒn ,k := ∂n, k \ ∂ and n−1 be the interior of \ k∈Kn n, k . Denote an, k := sup fn−1 (x ) and consider the function x ∈Q(n,k)
un, k (x) :=
sin 2α np (xd − an, k )/εn , 0,
xd an−1, k , xd < an−1, k ,
on n, k . We have un, k (x) ∈ W 1,2 (n, k ) and, in view of Lemma 4.6(1), un, k = 0 on ϒn, k . Applying Lemma 4.6(2) and Lemma 4.6(6), we see that f (x )−an, k |∇un, k (x)|2 dx = εn−2 22α np cos2 2α np xd /εn dxd dx n, k
εn−2 22α np =
Q(n,k) 0
εn 2−α np (gn (x )+aψ )
Q(n,k) 0
εn−1 2α np 2−(d−1) np
(d−1)
Q1
ψ(x )+aψ 0
cos2 2α np xd /εn dxd dx cos2 xd dxd dx
and
f (x )−an, k
|un, k (x)|2 dx = n, k
Q(n,k) 0 εn 2−α np (gn (x )−bψ,p )
Q(n,k) 0
= εn 2−α np 2−(d−1) np
n, k
sin2 2α np xd /εn dxd dx
(d−1)
Q1
Therefore
ψ(x )−bψ,p
sin2 xd dxd dx .
0
2 ε −2 22αnp |∇un, k (x)|2 dx cψ,p n
cψ,p
sin2 2α np xd /εn dxd dx
n, k
|un, k (x)|2 dx , where
1/2
ψ(x )+aψ 2 x dx dx cos (d−1) 0 d d Q := 1 ψ(x )−b . ψ,p 2 sin x dx dx (d−1) 0 d d Q 1
−1 αnp . This implies that NN,D (n, k , ϒn, k , λ) 1 whenever λ cψ,p εn 2 −1 α(n+1)p 2 and, using Lemma 2.1, estiAssume that λ ∈ cψ,p εn−1 2αnp , cψ,p εn+1 mate NN,D (n, k , ϒn, k , λ) . NN (, λ) ND (n−1 , λ) + k∈Kn
By the above, the second term on the right hand side is not smaller than #Kn = (d−1)/α (d−1)/α 2(d−1) np (cψ,p 2αp )(1−d)/α εn+1 λ . On the other hand, in view of Theorem 1.8, Lemma 4.4 and Lemma 4.6(3) with β = α , we have ND (n−1 , λ) Cd,W µd (n−1 ) λd − Cd (|ψ|1 + aψ + 1) λd−α for all sufficiently large λ . Finally, by Lemma 4.6(2), µd () λd − µd (n−1 ) λd εn 2−α (n−1)p aψ λd aψ cψ,p (εn /εn+1 ) 22αp λd−1 . Since εn ε[(n+1)/2] 2 εn+1 , the above estimates imply that (d−1)/α
NN (, λ) Cd,W µd (n ) λd + (cψ,p 2αp )(1−d)/α εn+1
− Cd (|ψ|1 + aψ + 1) λd−α − Cd,W aψ cψ,p 22αp+1 λd−1
λ(d−1)/α (4.26)
−1 α(n+1)p . 2 for all λ ∈ cψ,p εn−1 2αnp , cψ,p εn+1 By Lemma 4.6(4), ∈ Lipα and we have (d − 1)/α > d − α > d − 1 . Therefore taking ε0 = ε1 = · · · = 1 , we obtain a domain satisfying the conditions of Theorem 1.10(1). If φ is a nonnegative function on (0, +∞) and φ(λ) → 0 as λ → ∞ then we can choose a sequence εn converging to zero and satisfying (4.23) in such a way that the function φ(λ) λ(d−1)/α and the last two terms in (4.26) are estimated by (d−1)/α (d−1)/α −1 α(n+1)p (cψ,p 2αp )(1−d)/α εn+1 λ for all λ ∈ cψ,p εn−1 2αnp , cψ,p εn+1 2 and all sufficiently large n . In view of Lemma 4.6(5), this proves Theorem 1.10(2).
5. Remarks and Generalisations

5.1. Poincaré inequality. According to the Poincaré inequality,
\[
\int_\Omega|u|^2\,dx\ \le\ c_\Omega\int_\Omega|\nabla u|^2\,dx \quad\text{whenever } u\in W^{1,2}(\Omega) \text{ and } \int_\Omega u\,dx = 0, \tag{5.1}
\]
where $c_\Omega$ is a positive constant. By Remark 2.4, the Poincaré inequality (5.1) on a domain $\Omega$ holds true if and only if the zero eigenvalue of the Neumann Laplacian is isolated and $c_\Omega\ge\lambda_{1,N}^{-2}(\Omega)$.

Lemma 5.1. Let $\Omega$ satisfy (5.1) and $\tilde\Omega\subset\mathbb{R}^d$. If there exist an invertible map $F:\Omega\to\tilde\Omega$ and a constant $C_F$ such that $|F(x)-F(y)|\le C_F\,|x-y|$ for all $x,y\in\Omega$ and $|F^{-1}(x)-F^{-1}(y)|\le C_F\,|x-y|$ for all $x,y\in\tilde\Omega$ then $\tilde\Omega$ also satisfies (5.1) with a positive constant $\tilde c = C_d\,C_F^{2d+2}\,c_\Omega$.

Proof. Let $v\in W^{1,2}(\tilde\Omega)$, $u(x) := v(F^{-1}(x))$ and $c_u := \int_\Omega u(x)\,dx$. Under the conditions of the lemma the maps $F$ and $F^{-1}$ are differentiable almost everywhere. Changing variables and estimating the Jacobians, we obtain
\[
\int_{\tilde\Omega}|v(y)-c_u|^2\,dy\ \le\ C_d\,C_F^d\int_\Omega|u(x)-c_u|^2\,dx
\]
and
\[
\int_{\tilde\Omega}|\nabla v(y)|^2\,dy\ \ge\ C_d\,C_F^{-d-2}\int_\Omega|\nabla u(x)|^2\,dx.
\]
These two estimates and the Poincaré inequality (5.1) imply that
\[
\int_{\tilde\Omega}|v(y)|^2\,dy\ \le\ \int_{\tilde\Omega}|v(y)-c_u|^2\,dy\ \le\ C_d\,C_F^{2d+2}\,c_\Omega\int_{\tilde\Omega}|\nabla v(y)|^2\,dy
\]
whenever $\int_{\tilde\Omega}v\,dy = 0$.
Lemma 5.1 allows one to extend Theorem 1.3 to more general domains. Theorem 5.2. Assume that there exists a finite collection of domains l ⊂ such that (a) ∂ ⊂ l l ; (b ) for each l there exists an invertible map Fl : Rd → Rd satisfying the condi(d−1) tions of Lemma 5.1 such that Fl (l ) = Gfl , bl , where fl ∈ BVτ,∞ (Qal ) and bl < inf fl ; (c) al D and sup fl − bl D for all l ∈ L. Then (1.1) holds true. Proof. Let CFl be the constant introduced in Lemma 5.1 and C := maxl CFl . Under conditions of the theorem, Corollary 3.8 remains valid if we replace Ul with Fl and take −2 δn := C −1 δn . Since (5.1) is equivalent to the identity NN (, c ) = 1 , Lemma 2.6 −1 and Lemma 5.1 imply that NN (Sm , λ) = 1 for all λ c δ , where Sm are the same is a constant depending on the domain . sets as in the proof of Theorem 4.1 and c Therefore, using the same arguments as in Subsect. 4.1, we obtain the estimates (4.2) and (4.4) with some other constants (which may depend on ). In the same way as in Subsect. 4.3, these estimates imply (1.1).
The following example shows that Theorem 5.2 is not just a formal generalization of Theorem 1.3.

Example 5.3. Let $f$ be a nowhere differentiable $\mathrm{Lip}_\alpha$-function on the interval $[0,1]$. Assume that $f > 1$ and consider the domain $\Omega := \{(\varphi,r)\in\mathbb{R}^2 \mid \varphi\in(0,1),\ 1 < r < f(\varphi)\}$, where $(\varphi,r)$ are the polar coordinates on $\mathbb{R}^2$. If $y_1 = r\sin\varphi$ and $y_2 = r\cos\varphi$ are the standard Cartesian coordinates on $\mathbb{R}^2$ then the map which takes the point with polar coordinates $(\varphi,r)$ into the point with Cartesian coordinates $(y_1,y_2) = (\varphi,r)$ satisfies the conditions of Lemma 5.1. Therefore, by Theorem 5.2, we have (1.1).

On the other hand, if $(x_1,x_2)$ are arbitrary Cartesian coordinates on $\mathbb{R}^2$ then $x_1(\varphi,r) = r\sin(\varphi+\varphi_0)$ and $x_2(\varphi,r) = r\cos(\varphi+\varphi_0)$ for some $\varphi_0\in[0,2\pi)$. For every subinterval $(a,b)\subset(0,1)$ there exist at least two different points $\varphi_1,\varphi_2\in(a,b)$ such that $x_1(\varphi_1,f(\varphi_1)) = x_1(\varphi_2,f(\varphi_2))$ (otherwise the function $x_1(\varphi,f(\varphi))$ would be monotone on $(a,b)$ and, by Lebesgue's theorem, almost everywhere differentiable). Since $x_2(\varphi_1,f(\varphi_1))\ne x_2(\varphi_2,f(\varphi_2))$, we see that the set $\{r = f(\varphi)\}$ cannot be represented as the graph of a continuous function in Cartesian coordinates.

Nowhere differentiable functions $f\in\mathrm{Lip}_\alpha$ do exist. For instance, the function $f(t) := \sum_{n=0}^\infty 10^{-n}\operatorname{dist}(10^n t,\mathbb{Z})$ is not differentiable at any $t\in\mathbb{R}$ (see [W] or [RS-N], Chap. 1, Sect. 1) but $f\in\mathrm{Lip}_\alpha(\mathbb{R})$ for all $\alpha\in(0,1)$.

5.2. Higher order operators. Let us consider, instead of the Laplacian, a homogeneous elliptic nonnegative operator $A(D_x)$ of degree $2m$ with real constant coefficients and denote by $Q_A$ its quadratic form (we use the standard notation $D_x := -i\,\partial_x$). Assume that $Q_A[u]\ge\sum_{|\alpha|=m}\|\partial^\alpha u\|^2_{L^2(\Omega)}$ for all $u\in C^\infty(\Omega)$. Let $W^{m,2}(\Omega)$ be the Sobolev space, $W_0^{m,2}(\Omega)$ be the closure of $C_0^\infty$ in $W^{m,2}(\Omega)$ and $A_N$ and $A_D$ be the self-adjoint operators in the space $L^2(\Omega)$ generated by the quadratic form $Q_A$ with domains $W^{m,2}(\Omega)$ and $W_0^{m,2}(\Omega)$ respectively. Then the results of Sect. 2 remain valid with the following modifications:

(i) In the definitions of $N_{N,D}$, $N_N$, $N_D$ and in Lemma 2.2 we replace the Dirichlet form $\int|\nabla u|^2\,dx$ with $Q_A$, $W^{1,2}(\Omega)$ with $W^{m,2}(\Omega)$, $\lambda^2$ with $\lambda^{2m}$, and $\kappa^{-1/2}$ with $\kappa^{-1/(2m)}$.

(ii) The kernel of the operator $A_N$ is the space $\mathcal{P}_m(\Omega)$ of all polynomials on $\Omega$ whose degree is strictly smaller than $m$. Therefore we have $\int_\Omega|u(x)|^2\,dx\le\lambda^{-2m}\,Q_A[u]$ for all $u\in W^{1,2}(\Omega)\ominus\mathcal{P}_m(\Omega)$ if and only if $\lambda_{1,N}(\Omega)\ge\lambda$, where $\lambda_{1,N}(\Omega)$ is the first nonzero eigenvalue of $A_N$. If $p_u$ is the projection of $u\in L^2(\Omega)$ onto the subspace $\mathcal{P}_m(\Omega)$ then $\|u-p_u\|_{L^2(\Omega)}\le\|u-p\|_{L^2(\Omega)}$ for all $p\in\mathcal{P}_m(\Omega)$ (cf. Remark 2.4).

(iii) Let $C_{A,W} := (2\pi)^{-d}\,\mu_d\{\xi\in\mathbb{R}^d : A(\xi) < 1\}$. Then there exists a constant $C_{A,Q}$ such that
\[
-C_{A,Q}\,(\delta\lambda)^{d-1}\ \le\ N(Q_\delta^{(d)},\lambda)-C_{A,W}\,(\delta\lambda)^d\ \le\ C_{A,Q}\,(\delta\lambda)^{d-1}, \qquad \forall\lambda > \delta^{-1},
\]
for all $\delta > 0$ (see Remark 2.9).

(iv) Instead of Lemma 2.6 we have the following result.

Lemma 5.4. There exists a constant $c_A$ depending only on the operator $A$ and the dimension $d$ such that the following statements hold true:
(1) If P ∈ P(δ) then NN (P , λ) = dim Pm for all λ cA δ −1 . (2) If V ∈ V(δ) then NN (V , λ) = dim Pm for all λ cA δ −1 . (d) (d) (3) If M ∈ M(δ) , M ⊂ Qδ and ϒ := ∂M Qδ then NN,D (M, ϒ, λ) dim Pm −1 −d for all λ cA δ −1 and NN,D (M, ϒ, λ) = 0 for all λ (1 − cA δ µd 1/(2m) −1 (M))+ cA δ . Proof. We shall denote by C various constants depending only on A and d. It is sufficient to prove the lemma assuming that A(Dx ) = Am (Dx ) := d 2m m,2 (Q(d) ) , j =1 Dxj . Then (1) is easily obtained by separation of variables. If u ∈ W δ u ≡ 0 outside M and pu is the projection of u onto the subspace Pm (M) then 2 2 −d |pu | dx µd (M) sup |pu (x)| C µd (M) δ |pu |2 dx M
(d)
Qδ
(d)
x∈Qδ
= C µd (M) δ
−d
|pu | dx + 2
M
|u − pu | dx 2
(d)
.
Qδ
Applying (ii) and this estimate instead of Remark 2.4 and (2.6), we obtain (3) in the same way as Lemma 2.6(3). (d−1) ) with c δ , b = In order to prove (2), let us assume that V = Gf, b (Qc m,2 inf f −δ and Osc f δ/2 and consider a function u ∈ W (V ) . Let pu; r, k (x ) be the (d−1) (d−1) projection of the function ∂xkd u(x , r) ∈ L2 (Qc ) onto the subspace Pm−k (Qc ), m−1 1 1 k ∂ k u(x , r) , (x −r) pu; r (x) := k=0 k! (xd −r)k pu; r, k (x ) and vr (x) := m−1 d xd k=0 k! where r ∈ [b, b + δ] and xd ∈ [b, f (x )] . We have |u(x) − pu; r (x)|2 2 |u(x) − vr (x)|2 + 2 |vr (x) − pu; r (x)|2 .
(5.2)
Since |xd − b| 2δ , Jensen’s inequality implies that |u(x) − vr (x)|2 = ((m − 1)!)−2 |
xd
(xd − t)m−1 ∂xmd u(x , t) dt |2 xd −2 ((m − 1)!) |xd − r| (xd − t)2m−2 |∂xmd u(x , t)|2 dt r
r
((m − 1)!)−2 (2δ)2m−1
f (x ) b
|∂xmd u(x)|2 dxd .
In view of (ii) and (1), we also have |∂xkd u(x) − pu; r, k (x )|2 dx C δ 2m−2k QAm−k [∂xkd u(x)] (d−1)
Qc
for all k = 0, . . . , m−1 , where Am−k (Dx ) :=
d−1
2m−2k and Q Am−k is the quaj =1 Dxj (d−1) dratic form of Am−k with domain W m−k, 2 (Qc ) . Therefore, integrating (5.2) over (d−1) r ∈ [b, b + δ] , xd ∈ [b, f (x )] , x ∈ Qc and estimating |xd − r| 2δ , we obtain
δ
−1
b+δ b
|u(x) − pu; r (x)|2 dx dr m−1 C δ 2m |∂xmd u(x)|2 dx + C δ 2m |∂xα u(x)|2 dx , V
k=0 |α|=m P
(d−1)
where P = Qc × (b, b + δ) . Since the L2 -norms of the mixed derivatives ∂xα u(x) on a rectangle are estimated by the L2 -norms of the derivatives ∂xmj , this estimate and (ii) imply (2). Applying the same arguments as in Sect. 4 and using (iii) and Lemma 5.4, we obtain the following result. Theorem 5.5. If NN (λ, ) and ND (λ, ) denote the number of eigenvalues of the corresponding self-adjoint operator lying below λ2m then Theorems 1.3, 1.8 and Corollaries 1.5, 1.6, 1.9 hold true with Cd,W := CA,W . α be the Besov space and BVβ,∞ := BVτβ ,∞ , 5.3. Other function spaces. Let Bp,q β α where τβ (t) = (t + 1) and β ∈ (0, +∞). Lemma 4.5 implies that B∞,∞ = Lipα ⊂ (d−1) (d−1) α ) → C(Qa ) BV(d−1)/α,∞ . Estimating the norm of the embedding Bp,∞ (Qa α for αp > d − 1 and a > 0 , one can also show that Bp,∞ ⊂ BV(d−1)/α,∞ whenever αp > d − 1 .
5.4. Open problems. 5.4.1. The spaces BVτ,∞ . The space BVβ,∞ or BVτ,∞ (under certain conditions on the function τ ) is a Banach space with respect to an appropriate norm. Similar spaces have been considered in the dimension one, but we could not find references in the multidimensional case. It would be interesting to find a more constructive description of these spaces and to investigate their properties. 5.4.2. More general domains. The crucial point in our proof of Theorem 1.3 is the construction of the families {Sm }M such that (i) bδ ⊂ m Sm ⊂ , (ii) ℵ{Sm }M C , (iii) NN (Sm , λ) C whenever λ C δ −1 , where C , C and C are some constants independent of δ ∈ R+ . The remainder estimate in the Weyl formula for the Neumann Laplacian depends on the behaviour of #M as δ → 0 . In this paper we were assuming that is the union of subgraphs of continuous functions, used Lemma 2.6 in order to prove (iii) and applied Corollary 3.2 in order to estimate ℵ{Sm } and #M . Theorem 3.1 allows one to construct families of open sets Sm satisfying (i)–(iii) for many other domains . It should be possible to find less restrictive sufficient conditions which guarantee the existence of such families and imply an asymptotic formula for NN (, λ) . 5.4.3. Operators with variable coefficients. Our main goal was to estimate the contribution of ∂ to the Weyl formula. In the interior part of we used the old fashioned variational technique based on the Whitney decomposition and Dirichlet–Neumann
bracketing. There are much more advanced methods of studying the asymptotic behaviour of the spectral function at the interior points (see the monographs [Iv3 , SV] or the recent papers [BI , Iv4] ), which are applicable to operators with variable coefficients. Freezing the coefficients at an arbitrary point x ∈ Sm , we see that (iii) remains valid for a uniformly elliptic operator A with variable coefficients, provided that the corresponding quadratic form is homogeneous, the coefficients are uniformly continuous, δ is sufficiently small and diam Sm c δ with some constant c independent of δ . Using this observation and applying a more powerful technique in the interior of , one can try to extend our results to operators with variable coefficients. 5.4.4. Reminder estimate for the Dirichlet Laplacian. It is not difficult to construct a bounded domain such that limδ→0 |δ −α µd (bδ )| = C and ND (, λ) − Cd,W µd () λd − C −1 λd−α ,
∀λ > C ,
(5.3)
where C and C are some positive constants. For example, it can be done by considering a cube with a sequence of ‘cracks’ converging to the outer boundary, which get denser as the outer boundary is approached (similar domains were studied in [LV and MV]). For such a domain the estimate (1.7) is order sharp. It would be interesting find a domain ∈ Lipα satisfying (5.3) (cf. Theorem 1.10). Note that in the known examples disproving the so-called Berry conjecture (see, for instance, [BLe or LV] ) the domain does not belong to the class Lipα . 6. Constants Throughout the paper Cd,W is the Weyl constant (see Subsect. 1.1), Cd,1 :=
d−1 n! (d − n)! n=0
d!
Cn,W ,
C0,W := 1 ,
Cd,2 = 2d−1 Cd−1 and Cd,3 = 6d−1 Cˆd−1 , where Cd−1 and Cˆd−1 are the constants introduced in Theorem 3.1, Cd,4 := (4 Cd,2 + 2)1/2 , Cd,5 := min (1 + 2π −2 )−1/2 , π(1 + d −1 )−1 , √ −1 Cd,6 := 2d−1 Cd,2 + (3 Cd,2 + 1) (2 d)d , Cd,7 := Cd,4 Cd,5 , −1/2
Cd,8 := max{1, Cd,7 } , Cd,9 := 8 Cd,3 Cd,8 , √ Cd,10 := (d + 1) 12 d Cd,1 + 4 Cd,W + (4d d d + Cd,6 ) (4d 1/2 + 4d −1/2 )d , √ Cd,11 := (d + 1) 12 d Cd,1 + 4 Cd,W + (4d d d + 2) (4d 1/2 + 4d −1/2 )d . Remark 6.1. If ρ is continuous then Theorem 3.1 holds true with Cn = 2n and Cˆn = 4n (see [G]). Since the function ρ in the proof of Corollary 3.2 is continuous, all our results remain valid for Cd,2 = 4d−1 and Cd,3 = 24d−1 . Acknowledgements. The authors are very grateful to M. Solomyak and E.B. Davies for their valuable comments.
References

[BD] Burenkov, V., Davies, E.B.: Spectral stability of the Neumann Laplacian. J. Diff. Eqs. 186(2), 485–508 (2002)
[BI] Bronstein, M., Ivrii, V.: Sharp spectral asymptotics for operators with irregular coefficients. I. Pushing the limits. Commun. Part. Diff. Eqs. 28(1–2), 83–102 (2003)
[BLe] van den Berg, M., Levitin, M.: Functions of Weierstrass type and spectral asymptotics for iterated sets. Quart. J. Math. Oxford Ser. 47(188), 493–509 (1996)
[BLi] van den Berg, M., Lianantonakis, M.: Asymptotics for the spectrum of the Dirichlet Laplacian on horn-shaped regions. Indiana Univ. Math. J. 50(1), 299–333 (2001)
[BS] Birman, M.S., Solomyak, M.Z.: The principal term of spectral asymptotics for “non-smooth” elliptic problems. Funkt. Anal. i Pril. 4:4, 1–13 (1970) (Russian); English transl. Funct. Anal. Appl. 4 (1971)
[F] Fedosov, B.: Asymptotic formulae for the eigenvalues of the Laplace operator in the case of a polygonal domain. (Russian), Dokl. Akad. Nauk SSSR 151, 786–789 (1963)
[G] de Guzmán, M.: Differentiation of integrals in R^n. Lecture Notes in Mathematics, V. 481, Berlin-Heidelberg-New York: Springer Verlag, 1975
[HSS] Hempel, R., Seco, L., Simon, B.: The essential spectrum of Neumann Laplacians on some bounded singular domains. J. Funct. Anal. 102(2), 448–483 (1991)
[Iv1] Ivrii, V.: On the second term of the spectral asymptotics for the Laplace-Beltrami operator on manifolds with boundary. Funkt. Anal. i Pril. 14(2), 25–34 (1980); English transl. Funct. Anal. Appl. 14, 98–106 (1980)
[Iv2] Ivrii, V.: The asymptotic Weyl formula for the Laplace-Beltrami operator in Riemannian polyhedra and domains with conical singularities of the boundary. Dokl. Akad. Nauk SSSR 288, 35–38 (1986) (Russian)
[Iv3] Ivrii, V.: Microlocal Analysis and Precise Spectral Asymptotics. SMM, Berlin-Heidelberg-New York: Springer-Verlag, 1998
[Iv4] Ivrii, V.: Sharp spectral asymptotics for operators with irregular coefficients. II. Domains with boundaries and degenerations. Commun. Part. Diff. Eqs. 28(1–2), 103–128 (2003)
[LV] Levitin, M., Vassiliev, D.: Spectral asymptotics, renewal theorem, and the Berry conjecture for a class of fractals. Proc. Lond. Math. Soc. 72, 188–214 (1996)
[M1] Maz’ya, V.: On Neumann’s problem for domains with irregular boundaries. Siberian Math. J. 9, 990–1012 (1968)
[M2] Maz’ya, V.: Sobolev spaces. Leningrad: Leningrad University, 1985. English translation in Springer Series in Soviet Mathematics, Berlin: Springer-Verlag, 1985
[Ma] Mason, C.: Log-Sobolev inequalities and regions with exterior exponential cusps. J. Funct. Anal. 198, 341–360 (2003)
[Me] Métivier, G.: Valeurs propres de problèmes aux limites elliptiques irrégulières. (French). Bull. Soc. Math. France Suppl. Mém. 51–52, 125–219 (1977)
[Mi] Miyazaki, Y.: A sharp asymptotic remainder estimate for the eigenvalues of operators associated with strongly elliptic sesquilinear forms. Japan J. Math. 15(1), 65–97 (1989)
[MV] Molchanov, S., Vainberg, B.: On spectral asymptotics for domains with fractal boundaries of cabbage type. Math. Phys. Anal. Geom. 1(2), 145–170 (1998)
[RS-N] Riesz, F., Sz.-Nagy, B.: Leçons d’analyse fonctionnelle. (French). Académie des Sciences de Hongrie, Budapest: Akadémiai Kiadó, 1952; English translation: Funct. Anal. New York: Dover Publications Inc. 1990
[Se] Seeley, R.: An estimate near the boundary for the spectral function of the Laplace operator. Am. J. Math. 102(5), 869–902 (1980)
[St] Stein, E.: Singular integrals and differentiability properties of functions. Princeton NJ: Princeton University Press, 1970
[Sa] Safarov, Yu.: Fourier Tauberian Theorems and applications. J. Funct. Anal. 185, 111–128 (2001)
[Si] Simon, B.: The Neumann Laplacian of a jelly roll. Proc. Am. Math. Soc. 114(3), 783–785 (1992)
[SV] Safarov, Yu., Vassiliev, D.: The asymptotic distribution of eigenvalues of partial differential operators. Providence, RI: American Mathematical Society, 1996
[V] Vassiliev, D.: Two-term asymptotic behavior of the spectrum of a boundary value problem in the case of a piecewise smooth boundary. Dokl. Akad. Nauk SSSR 286(5), 1043–1046 (1986); English translation: Soviet Math. Dokl. 33(1), 227–230 (1986)
[W] van der Waerden, B.L.: Ein einfaches Beispiel einer nichtdifferenzierbaren stetigen Funktion. Math. Zeitschr. 32, 474–475 (1930)
[Z] Zielinski, L.: Asymptotic distribution of eigenvalues for elliptic boundary value problems. Asymptot. Anal. 16(3–4), 181–201 (1998)

Communicated by B. Simon
Commun. Math. Phys. 253, 511–533 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1233-1
Communications in
Mathematical Physics
On the Strongly Damped Wave Equation Vittorino Pata, Marco Squassina Dipartimento di Matematica “F. Brioschi”, Politecnico di Milano, Via Bonardi 9, 20133 Milano, Italy. E-mail: {pata,squassina}@mate.polimi.it Received: 15 May 2003 / Accepted: 2 June 2004 Published online: 11 November 2004 – © Springer-Verlag 2004
Abstract: We prove the existence of the universal attractor for the strongly damped semilinear wave equation, in the presence of a quite general nonlinearity of critical growth. When the nonlinearity is subcritical, we prove the existence of an exponential attractor of optimal regularity, having a basin of attraction coinciding with the whole phase-space. As a byproduct, the universal attractor is regular and of finite fractal dimension. Moreover, we carry out a detailed analysis of the asymptotic behavior of the solutions in dependence of the damping coefficient.

Research partially supported by the Italian MIUR Research Projects Problemi di Frontiera Libera nelle Scienze Applicate, Aspetti Teorici e Applicativi di Equazioni a Derivate Parziali and Metodi Variazionali e Topologici nello Studio dei Fenomeni Nonlineari. The second author was also supported by the Istituto Nazionale di Alta Matematica “F. Severi” (INdAM).

1. Introduction

Let $\Omega \subset \mathbb{R}^3$ be a bounded domain with smooth boundary $\partial\Omega$. Given $\omega > 0$, we consider the following initial-boundary value problem for $u : \Omega \times \mathbb{R}^+ \to \mathbb{R}$:
\[
\begin{cases}
u_{tt} - \omega\Delta u_t - \Delta u + \phi(u) = f, & x \in \Omega,\ t > 0,\\
u(x,0) = u_0(x), & x \in \Omega,\\
u_t(x,0) = u_1(x), & x \in \Omega,\\
u(x,t) = 0, & x \in \partial\Omega,\ t \ge 0.
\end{cases}
\tag{P}
\]
The semilinear wave equation with strong damping has been investigated by many authors in recent years (see, e.g., [2–4, 7, 10, 14, 16, 17]). In particular, Carvalho and Cholewa [4] have recently proved that for Problem P with the critical nonlinearity (i.e., when the growth of $\phi$ is of order 5), the associated semigroup possesses a universal attractor. Actually, in [4] the authors analyze a more general situation, with a term of
the form $(-\Delta)^\theta u_t$, for $\theta \in [\tfrac12, 1]$, in place of $-\Delta u_t$. This was, in our opinion, a significant progress, since the passage from the subcritical to the critical case is highly nontrivial, mainly due to the fact that in the critical situation the embeddings are no longer compact. The key ingredient of [4] is Alekseev’s nonlinear variation of constants formula, which has been successfully employed also to establish an analogous result for the weakly damped semilinear wave equation (see [1]). However, the universal attractor is not shown to have the best possible regularity, even in the subcritical case. This lack of regularity prevents a more detailed asymptotic analysis.

In this paper, using a different approach, we prove the existence of a universal attractor for Problem P, with a more general nonlinearity than the one used in [4]. Moreover, in the subcritical case, we demonstrate the existence of an exponential attractor of optimal regularity, and in turn the existence of a regular universal attractor of finite fractal dimension. We should mention that the basin of attraction of the exponential attractor is the whole phase-space, and not just a compact invariant subset. This is obtained as a consequence of a remarkable result due to Fabrie, Galusinski, Miranville and Zelik [9], who have proved that the exponential attraction enjoys a transitivity property. Indeed, after the paper [9], it is now clear that the interesting object to investigate is the exponential attractor, rather than the universal attractor, which is recovered as a byproduct. Finally, we pursue a detailed analysis of the longtime behavior of solutions in dependence of the damping coefficient $\omega > 0$. Our technique relies on a bootstrap argument that was envisaged in [11], together with a sharp use of Gronwall-type lemmas.

2. Functional Setting

We denote the inner product and the norm on $L^2(\Omega)$ by $\langle\cdot,\cdot\rangle$ and $\|\cdot\|$, respectively, and the norm on $L^p(\Omega)$ by $\|\cdot\|_{L^p}$.
Let $A$ be the (strictly) positive operator on $L^2(\Omega)$ defined by $A = -\Delta$ with domain $D(A) = H^2(\Omega) \cap H_0^1(\Omega)$. Identifying $L^2(\Omega)$ with its dual space $L^2(\Omega)^*$, we consider the family of Hilbert spaces $D(A^{s/2})$, $s \in \mathbb{R}$, whose inner products and norms are given by
\[
\langle\cdot,\cdot\rangle_{D(A^{s/2})} = \langle A^{s/2}\cdot, A^{s/2}\cdot\rangle
\quad\text{and}\quad
\|\cdot\|_{D(A^{s/2})} = \|A^{s/2}\cdot\|.
\]
Then we have
\[
D(A^0) = L^2(\Omega), \qquad D(A^{1/2}) = H_0^1(\Omega), \qquad D(A^{-1/2}) = H^{-1}(\Omega),
\]
and the compact and dense injections
\[
D(A^{s/2}) \hookrightarrow D(A^{r/2}), \qquad \forall s > r.
\]
In particular, naming $\alpha_1$ the first eigenvalue of $A$, we get the inequalities
\[
\|A^{r/2}v\| \le \alpha_1^{(r-s)/2}\,\|A^{s/2}v\|, \qquad \forall v \in D(A^{s/2}).
\tag{1}
\]
We also recall the continuous embedding
\[
D(A^{s/2}) \hookrightarrow L^{6/(3-2s)}(\Omega), \qquad \forall s \in [0, \tfrac32),
\tag{2}
\]
and the Ehrling lemma, that is, given $s > r > q$, for every $\nu > 0$ there exists $c_\nu > 0$ such that
\[
\|A^{r/2}v\| \le \nu\|A^{s/2}v\| + c_\nu\|A^{q/2}v\|, \qquad \forall v \in D(A^{s/2}).
\tag{3}
\]
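Inequality (1) can be checked numerically in a spectral model. The following is a minimal sketch, not part of the original argument: it assumes a one-dimensional stand-in for $A$ (the Dirichlet Laplacian on an interval $(0, L)$, with eigenvalues $(n\pi/L)^2$) and represents $v$ by its coefficients in an orthonormal eigenbasis. The function names `power_norm` and `check_inequality_1` are hypothetical and introduced only for illustration.

```python
import math
import random


def power_norm(coeffs, eigenvalues, s):
    """Norm of A^{s/2} v for v = sum_n coeffs[n] * e_n in an orthonormal eigenbasis."""
    return math.sqrt(sum((lam ** s) * a * a for a, lam in zip(coeffs, eigenvalues)))


def check_inequality_1(coeffs, eigenvalues, r, s):
    """Return (lhs, rhs) of ||A^{r/2} v|| <= alpha_1^{(r-s)/2} ||A^{s/2} v|| for s > r."""
    alpha_1 = min(eigenvalues)  # first eigenvalue of the model operator
    lhs = power_norm(coeffs, eigenvalues, r)
    rhs = alpha_1 ** ((r - s) / 2) * power_norm(coeffs, eigenvalues, s)
    return lhs, rhs


# Assumed model spectrum: Dirichlet Laplacian on (0, L), truncated to 40 modes.
L = 2.0
eigenvalues = [(n * math.pi / L) ** 2 for n in range(1, 41)]
rng = random.Random(0)
coeffs = [rng.uniform(-1.0, 1.0) / n for n in range(1, 41)]

for r, s in [(0.0, 1.0), (-1.0, 0.0), (0.5, 2.0)]:
    lhs, rhs = check_inequality_1(coeffs, eigenvalues, r, s)
    print(f"r={r}, s={s}: {lhs:.4f} <= {rhs:.4f}")
```

The check works mode by mode: since $r - s < 0$ and every eigenvalue is at least $\alpha_1$, each factor $\lambda_n^{r-s}$ is bounded by $\alpha_1^{r-s}$.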
Concerning the phase-spaces for our problem, we consider, for $s \in \mathbb{R}$, the product Hilbert spaces
\[
\mathcal{H}_s = D(A^{(1+s)/2}) \times D(A^{s/2}),
\]
endowed with the usual inner products and norms (denoted by $\|\cdot\|_s$). Throughout the paper, we will denote by $c \ge 0$ a generic constant, that may vary even from line to line within the same equation, depending only on $\Omega$, $\phi$ and the external source $f$. Further dependencies will be specified on occurrence. Also, we will tacitly use (1)–(3), as well as the Young and the generalized Hölder inequalities, and the usual Sobolev embeddings. We conclude the section with two technical lemmas that will be needed in the course of the investigation.

Lemma 1. Let $X$ be a Banach space, and let $Z \subset C(\mathbb{R}^+, X)$. Let $E : X \to \mathbb{R}$ be a function such that
\[
\sup_{t \in \mathbb{R}^+} E(z(t)) \ge -m
\quad\text{and}\quad
E(z(0)) \le M,
\]
for some $m, M \ge 0$ and every $z \in Z$. In addition, assume that for every $z \in Z$ the function $t \mapsto E(z(t))$ is continuously differentiable, and satisfies the differential inequality
\[
\frac{d}{dt}E(z(t)) + \delta\|z(t)\|_X^2 \le k,
\]
for some $\delta > 0$ and $k > 0$, both independent of $z \in Z$. Then
\[
E(z(t)) \le \sup_{\zeta \in X}\big\{E(\zeta) : \delta\|\zeta\|_X^2 \le 2k\big\}, \qquad \forall t \ge \frac{m+M}{k}.
\]
The proof can be found, for instance, in [2, Lemma 2.7].

Lemma 2. Let $\Phi$ be an absolutely continuous positive function on $\mathbb{R}^+$, which satisfies for some $\varepsilon > 0$ the differential inequality
\[
\frac{d}{dt}\Phi(t) + 2\varepsilon\Phi(t) \le g(t)\Phi(t) + h(t),
\]
for almost every $t \in \mathbb{R}^+$, where $g$ and $h$ are functions on $\mathbb{R}^+$ such that
\[
\int_\tau^t |g(y)|\,dy \le m_1\big(1 + (t-\tau)^\mu\big), \qquad \forall t \ge \tau \ge 0,
\]
for some $m_1 \ge 0$ and $\mu \in [0,1)$, and
\[
\sup_{t \ge 0}\int_t^{t+1} |h(y)|\,dy \le m_2,
\]
for some $m_2 \ge 0$. Then
\[
\Phi(t) \le \beta\Phi(0)e^{-\varepsilon t} + \rho, \qquad \forall t \in \mathbb{R}^+,
\]
for some $\beta = \beta(m_1, \mu) \ge 1$ and
\[
\rho = \frac{\beta m_2 e^{\varepsilon}}{1 - e^{-\varepsilon}}.
\tag{4}
\]
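The bound of Lemma 2 can be illustrated numerically in its simplest instance. The sketch below is an assumption-laden example, not the proof in [11]: it takes $g \equiv 0$ (so $m_1 = 0$), integrates the equality case of the differential inequality with explicit Euler, and compares the result with the right-hand side built from (4) using $\beta = 1$, a value that a direct variation-of-constants estimate suggests is admissible when $g \equiv 0$. The forcing `h`, the step size and all function names are hypothetical choices made for illustration.

```python
import math


def lemma2_bound(phi0, eps, m2, t, beta=1.0):
    """Right-hand side beta*Phi(0)*exp(-eps*t) + rho, with rho as in (4)."""
    rho = beta * m2 * math.exp(eps) / (1.0 - math.exp(-eps))
    return beta * phi0 * math.exp(-eps * t) + rho


def integrate(phi0, eps, h, t_end, dt=1e-3):
    """Explicit Euler for the equality case Phi' + 2*eps*Phi = h(t), i.e. g = 0."""
    steps = int(round(t_end / dt))
    t, phi = 0.0, phi0
    trajectory = [(t, phi)]
    for _ in range(steps):
        phi += dt * (-2.0 * eps * phi + h(t))
        t += dt
        trajectory.append((t, phi))
    return trajectory


def h(t):
    """Assumed nonnegative forcing; its integral over any unit interval is at most 5/3."""
    return 1.0 + math.sin(3.0 * t)


m2 = 1.0 + 2.0 / 3.0  # upper bound for sup_t of the integral of |h| over [t, t+1]
trajectory = integrate(phi0=10.0, eps=0.5, h=h, t_end=20.0)
worst_gap = min(lemma2_bound(10.0, 0.5, m2, t) - phi for t, phi in trajectory)
print("smallest margin between bound and solution:", worst_gap)
```

A positive margin along the whole trajectory is consistent with the lemma; it does not test the general case $g \not\equiv 0$, where $\beta$ depends on $m_1$ and $\mu$.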
For the proof, we refer the reader to [11, Lemma 2.2].

In the applications of the above lemmas, we might not have the required regularity for $E$ and $\Phi$. However, this is not really a problem, since we can always suppose to work within a proper regularization scheme.

3. The Solution Semigroup

We will consider for simplicity a time-independent external source, namely
\[
f \in H^{-1}(\Omega) \ \text{independent of time},
\tag{5}
\]
although the results could be generalized with little effort to the nonautonomous case, provided that $f$ enjoys some translation-compactness properties. Concerning the nonlinearity, we stipulate the following set of assumptions. Let $\phi \in C(\mathbb{R})$ be such that
\[
|\phi(r) - \phi(s)| \le c|r-s|\big(1 + |r|^4 + |s|^4\big), \qquad \forall r, s \in \mathbb{R}.
\tag{6}
\]
Also, let $\phi$ admit the decomposition $\phi = \phi_0 + \phi_1$, with $\phi_0 \in C(\mathbb{R})$, $\phi_1 \in C(\mathbb{R})$, satisfying
\begin{align}
|\phi_0(r)| &\le c(1 + |r|^5), && \forall r \in \mathbb{R}, \tag{7}\\
\phi_0(r)r &\ge 0, && \forall r \in \mathbb{R}, \tag{8}\\
|\phi_1(r)| &\le c(1 + |r|^\gamma), \quad \gamma < 5, && \forall r \in \mathbb{R}, \tag{9}\\
\liminf_{|r| \to \infty} \frac{\phi_1(r)}{r} &> -\alpha_1. \tag{10}
\end{align}
Without loss of generality, we may take $\gamma$ large enough, say, $\gamma \ge 3$. Notice that, by virtue of (10), there exists $\alpha < \alpha_1$ such that
\[
\phi_1(r)r \ge -\alpha r^2 - c, \qquad \forall r \in \mathbb{R}.
\tag{11}
\]
Remark 1. It is apparent that we can replace $\phi_0$ with $\eta\phi_0$ and $\phi_1$ with $\phi_1 + (1-\eta)\phi_0$, where $\eta$ is a smooth function with values in $[0,1]$, such that $\eta(r) = 0$ if $|r| \le 1$, and $\eta(r) = 1$ if $|r| \ge 2$. Then $\phi_0$ and $\phi_1$ still fulfill (8)–(10); moreover,
\[
|\phi_0(r)| \le c(|r| + |r|^5), \qquad \forall r \in \mathbb{R}.
\tag{12}
\]
Thus, in the sequel, we will assume the stronger condition (12) in place of (7).

Remark 2. Analogously to what is observed in [1, Lemma 1.2], a function $\phi \in C(\mathbb{R})$ such that
\[
|\phi(r)| \le c(1 + |r|^5), \quad \forall r \in \mathbb{R},
\qquad
\liminf_{|r| \to \infty} \frac{\phi(r)}{r} > -\alpha_1,
\]
admits a decomposition $\phi = \phi_0 + \phi_1$ satisfying (7)–(10).
Throughout the paper, we will assume conditions (5)–(6), (8)–(10), and (12). By [3] (see also [4, Theorem 1]), the following holds.

Theorem 1. For every $T > 0$, and every $(u_0, u_1) \in \mathcal{H}_0$, Problem P admits a unique weak solution
\[
u \in C([0,T], H_0^1(\Omega)), \quad\text{with}\quad u_t \in C([0,T], L^2(\Omega)) \cap L^2([0,T], H_0^1(\Omega)),
\]
which continuously depends on the initial data.

In other words, Problem P generates a strongly continuous semigroup $S(t)$ on the phase space $\mathcal{H}_0$. Actually, the result has been proved for $f \equiv 0$, but it holds as well in the present case.

Remark 3. As a matter of fact, it is not hard to check that, when $f \in L^2(\Omega)$, $S(t)$ is a strongly continuous semigroup also on the phase space $\mathcal{H}_1$.

For further use, let us write down explicitly the continuous dependence estimate for $S(t)$ on $\mathcal{H}_0$.

Theorem 2. Given any $R > 0$ and any two initial data $z_0, z_1 \in \mathcal{H}_0$ such that $\|z_0\|_0 \le R$ and $\|z_1\|_0 \le R$, there holds
\[
\|S(t)z_0 - S(t)z_1\|_0 \le e^{\frac{K}{\omega}t}\,\|z_0 - z_1\|_0, \qquad \forall t \in \mathbb{R}^+,
\tag{13}
\]
for some $K = K(R)$.

Proof. Given two solutions $u^1$ and $u^2$ corresponding to different initial data, the difference $\bar u = u^1 - u^2$ satisfies
\[
\frac{d}{dt}\big(\|A^{1/2}\bar u\|^2 + \|\bar u_t\|^2\big) + 2\omega\|A^{1/2}\bar u_t\|^2 = -2\langle\phi(u^1) - \phi(u^2), \bar u_t\rangle.
\]
Taking into account the uniform energy estimates for the solutions (see the subsequent Theorem 3), from (6) we have, for every $\nu > 0$,
\begin{align*}
-2\langle\phi(u^1) - \phi(u^2), \bar u_t\rangle
&\le c\big(1 + \|u^1\|_{L^6}^4 + \|u^2\|_{L^6}^4\big)\|\bar u\|_{L^6}\|\bar u_t\|_{L^6}\\
&\le c\big(1 + \|A^{1/2}u^1\|^4 + \|A^{1/2}u^2\|^4\big)\|A^{1/2}\bar u\|\,\|A^{1/2}\bar u_t\|\\
&\le \frac{k}{\nu}\|A^{1/2}\bar u\|^2 + k\nu\|A^{1/2}\bar u_t\|^2,
\end{align*}
for some $k = k(R)$. Therefore, setting $\nu = \frac{2\omega}{k}$, we obtain
\[
\frac{d}{dt}\big(\|A^{1/2}\bar u\|^2 + \|\bar u_t\|^2\big) \le \frac{k^2}{2\omega}\big(\|A^{1/2}\bar u\|^2 + \|\bar u_t\|^2\big),
\]
and the assertion follows from the Gronwall lemma.
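For the reader's convenience, the final Gronwall step of this proof can be written out explicitly. In the sketch below, $Y(t)$ is shorthand introduced here (it is not notation from the paper) for the squared $\mathcal{H}_0$-norm of the difference of the two trajectories; the computation suggests that one admissible choice in (13) is $K = k^2/4$.

```latex
% Y(t) is an auxiliary symbol introduced for this sketch only.
\[
Y(t) := \|A^{1/2}\bar u(t)\|^2 + \|\bar u_t(t)\|^2 = \|S(t)z_0 - S(t)z_1\|_0^2 .
\]
% The differential inequality obtained with \nu = 2\omega/k reads
\[
\frac{d}{dt} Y(t) \le \frac{k^2}{2\omega}\, Y(t)
\quad\Longrightarrow\quad
\frac{d}{dt}\Big( Y(t)\, e^{-\frac{k^2}{2\omega} t} \Big) \le 0
\quad\Longrightarrow\quad
Y(t) \le Y(0)\, e^{\frac{k^2}{2\omega} t}.
\]
% Taking square roots gives the estimate in the form (13):
\[
\|S(t)z_0 - S(t)z_1\|_0 \le e^{\frac{k^2}{4\omega} t}\, \|z_0 - z_1\|_0 ,
\qquad\text{i.e. } K = \frac{k^2}{4}.
\]
```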
4. Dissipativity

We now deal with the dissipative feature of the semigroup $S(t)$. Namely, we show that the trajectories originating from any given bounded set eventually fall, uniformly in time, into a bounded absorbing set $B_0 \subset \mathcal{H}_0$. In order to highlight the dependence on $\omega > 0$, we introduce the function
\[
\Lambda(\omega) =
\begin{cases}
\omega, & \omega < 1,\\[2pt]
\dfrac{1}{\omega}, & \omega \ge 1.
\end{cases}
\tag{14}
\]
Theorem 3. There exists a constant $R_0 > 0$ with the following property: given any $R \ge 0$, there exists $t_0 = t_0(R, \omega)$ such that, whenever $\|z_0\|_0 \le R$, it follows that
\[
\|S(t)z_0\|_0 \le R_0, \qquad \forall t \ge t_0.
\]
Consequently, the set
\[
B_0 = \big\{z_0 \in \mathcal{H}_0 : \|z_0\|_0 \le R_0\big\}
\]
is a bounded absorbing set for $S(t)$ on $\mathcal{H}_0$, that is, for any bounded set $B \subset \mathcal{H}_0$, there is $t_0 = t_0(B, \omega)$ such that $S(t)B \subset B_0$ for every $t \ge t_0$.

Remark 4. Before proceeding to the proof, let us dwell on the physical meaning of Theorem 3. Firstly, the solution corresponding to any set of initial data, after a certain time $t_0$ (depending only on the size of the data) is controlled in norm by the constant $R_0$. Notice that $R_0$ does not depend on the damping coefficient. What makes the difference is actually the time $t_0$ needed to stabilize the system. As shown in formula (21) below, $t_0$ is an increasing function of $R$, and so far this is no surprise, since the larger the initial data, the longer the time needed to squeeze them. Less evident, at first glance, is the dependence on $\omega$. Indeed, for a fixed $R$, $t_0 \to \infty$ both if $\omega \to 0$ and $\omega \to \infty$. This is obvious when $\omega \to 0$, for in the limiting case $\omega = 0$ the dissipation is lost. On the other hand, a very large damping has the effect of freezing the system, since the damping acts only on the velocity $u_t$, and this prevents the squeezing of the component $u$. Therefore the most dissipative situation occurs in between, that is, for a certain damping $\omega^*$, which depends on the other coefficients of the equation (in our case, for simplicity, they are all set equal to 1).

Proof of Theorem 3. In view of a further use, throughout this proof, besides $c$, we will also employ the generic constant $c_0 \ge 0$, which is independent of $R$ and vanishes if $\phi_1 \equiv 0$ and $f \equiv 0$. Denoting $z(t) = S(t)z_0 = (u(t), u_t(t))$, we consider the functional
\[
\mathcal{F}(t) = \mathcal{F}(u(t)) = 2\int_\Omega\int_0^{u(x,t)} \phi(y)\,dy\,dx.
\]
Set = min{1, ω}. Given ε ∈ [0, ε0 ], for some ε0 ≤ 1 to be determined later, we introduce the auxiliary variable ξ(t) = ut (t) +
ε u(t). ω
Testing the equation with ξ yields ε 1 d E+ (1 − ε )A1/2 u2 + ωA1/2 ξ 2 2 dt ω ε ε2 2 ε ε = u, ξ + ξ 2 − f, u − φ(u), u, 2 ω ω ω ω
(15)
where the energy functional E is defined as E(t) = E(z(t)) = (1 − ε )A1/2 u(t)2 + ξ(t)2 + F(t) − 2f, u(t). By (5), (9) and (12), we get the bound from above E(t) ≤ c 1 + z(t)60 ,
(16)
whereas by (5), (8) and (11), and the continuity of φ1 , we find the bound from below E(t) ≥ λz(t)20 − c0 ,
(17)
provided that ε0 is small enough, for some (possibly very small) λ > 0. We now proceed to the evaluation of the right-hand side of (15). Making use of (8) and (11), φ0 (u), u ≥ 0,
φ1 (u), u ≥ −(1 − 2λ0 )A1/2 u2 − c0 ,
(18)
for some λ0 ∈ (0, 21 ). All the constants appearing here are independent of ε ∈ [0, ε0 ]. In particular, λ and λ0 depend only on the value of the limit in (10). Using now (18) and the inequalities −
ε2 2 ε3 3 ε u, ξ ≤ A1/2 u2 + ξ 2 , 2 3 ω 4α1 ω ω ε ε λ0 1/2 2 ε f, u ≤ A u + c0 , ω ω ω
we get from (15) the differential inequality d ε 2 2 1/2 2 4ε ε 2ε A u + 2α1 ω − E+ λ0 − ε − ξ 2 ≤ c0 , 2 dt ω 4α1 ω ω ω which, for ε0 small enough, becomes ε λ0 1/2 2 4ε ε d E+ A u + 2α1 ω − ξ 2 ≤ c0 . dt ω ω ω
(19)
All the inequalities we wrote so far, hold for every ε ∈ [0, ε0 ], provided that ε0 is small enough. However, the size of ε0 does not depend on ω. At this point, we need to treat separately two cases. So, assume first that ω ≥ 1. Then it is easy to see that 2α1 ω −
4ε ελ0 ε λ0 ≥ 2α1 − 4ε ≥ = , ω ω ω
up to taking ε0 possibly smaller. On the contrary, if ω < 1, then 2α1 ω −
4ε ε λ0 = 2α1 ω − 4ε ≥ ελ0 = , ω ω
provided that ε0 is of the form ε˜ ω, for some small ε˜ . In either case, the term εωλ0 can be given the form 2δ(ω), for some δ > 0 small (independently of ω). Then, inequality (19) reads d E + 2δ(ω) A1/2 u2 + ξ 2 ≤ c0 (ω), dt
(20)
which in turn (if δ is sufficiently small) entails d E + δ(ω)z20 ≤ c0 (ω). dt Applying now Lemma 1, on account of (16)–(17), and taking c0 strictly positive, there exists t0 = t0 (R, ω) = such that E(z(t)) ≤ sup
c0 + c(1 + R 6 ) c0 (ω)
E(ζ ) : δ(ω)ζ 20 ≤ 2c0 (ω) ,
ζ ∈H0
(21)
∀t ≥ t0 ,
that is equivalent to E(z(t)) ≤ sup
ζ ∈H0
E(ζ ) : δζ 20 ≤ 2c0 ,
The thesis then follows from (16)–(17).
∀t ≥ t0 .
Corollary 1. Given any R ≥ 0, there exist K0 = K0 (R) and 0 = 0 (R) such that, whenever z0 0 ≤ R, the corresponding solution S(t)z0 = (u(t), ut (t)) fulfills S(t)z0 0 ≤ K0 ,
and ω
∞
∀t ∈ R+ ,
A1/2 ut (y)2 dy ≤ 0 .
0
Proof. Set ε = 0 and integrate (15), on account of (16)–(17). Incidentally, the set z0 ∈ H0 : S(t)z0 0 ≤ K0 (R0 ), ∀t ∈ R+ turns out to be a bounded absorbing set for S(t) on H0 which is invariant under the action of the semigroup. 5. The Universal Attractor The aim of this section is to prove the existence of a universal attractor for S(t) on H0 . Recall that the universal attractor is the (unique) compact set A ⊂ H0 , which is at the same time attracting, in the sense of the Hausdorff semidistance, and fully invariant for S(t), that is, S(t)A = A for all t ∈ R+ (see, e.g., [12, 15]). Theorem 4. For every ω > 0, the semigroup S(t) possesses a connected universal attractor A = A(ω) ⊂ H0 .
In order to prove Theorem 4, and for further purposes, we decompose the solution $u$ to Problem P with initial data $z_0 = (u_0, u_1) \in \mathcal{H}_0$ into the sum
\[
u(t) = v(t) + w(t),
\]
where $v$ and $w$ are the solutions to the problems
\[
\begin{cases}
v_{tt} + \omega A v_t + A v + \phi_0(v) = 0,\\
v(0) = u_0,\\
v_t(0) = u_1,
\end{cases}
\tag{22}
\]
and
\[
\begin{cases}
w_{tt} + \omega A w_t + A w + \phi(u) - \phi_0(v) = f,\\
w(0) = 0,\\
w_t(0) = 0.
\end{cases}
\tag{23}
\]
It is convenient to denote
\[
z(t) = (u(t), u_t(t)), \qquad z_d(t) = (v(t), v_t(t)), \qquad z_c(t) = (w(t), w_t(t)).
\]
As a first step, we show that zd has an exponential decay in H0 , which is uniform as z0 runs into a bounded subset of H0 . Lemma 3. Given any R ≥ 0, there exist M0 = M0 (R) ≥ 0 and ν0 = ν0 (R) > 0 such that, whenever z0 0 ≤ R, it follows that zd (t)0 ≤ M0 e−ν0 (ω)t ,
∀t ∈ R+ ,
with (ω) given by (14). The constants M0 and ν0 depend increasingly and decreasingly, respectively, on R. Proof. Repeating word by word the proof of Theorem 3, that applies to the present case with zd (t) in place of z(t) (with the further simplification that c0 = 0, for now φ1 ≡ 0 and f ≡ 0), we get the differential inequality ε λ0 1/2 2 4ε d E+ (24) A v + 2α1 ω − ξ 2 ≤ 0, dt ω ω for some ε0 small enough, independent of ω. Integrating (24) for ε = 0 on (0, t) gives sup sup zd (t)0 < ∞.
z0 ≤R t∈R+
Hence, we find the uniform estimate F(t) ≤ c v(t)2 + v(t)6L6 ≤ kA1/2 v(t)2 ,
∀t ≥ 0,
for some constant k = k(R) ≥ 1. Upon taking ε0 small enough, we may replace the term A1/2 v2 appearing in (24) with 1 1 1 (1 − ε ) 1/2 2 A v + F = E − ξ 2 , 2k 2k 2k 2k
so obtaining
ε λ0 ε m d E+ E + 2α1 ω − ξ 2 ≤ 0, dt 2kω ω
where we set for simplicity m = m(R) = 4 + λ2k0 . Again, arguing as in the proof of Theorem 3, we see that ε m ≥ 0, 2α1 ω − ω λ0 provided the term ε 2kω is given the form 2ν0 (ω). The only difference here is that ν0 is not a constant any longer, but a decreasing function of R. Thus we end up with
d E + 2ν0 (ω)E ≤ 0. dt By means of the Gronwall lemma, and using subsequently (16)–(17) (recall that c0 = 0), the proof is completed. A straightforward consequence is Corollary 2. If φ1 ≡ 0 and f ≡ 0, then S(t) decays to zero. Thus the set {0} ⊂ H0 is the universal attractor for S(t) on H0 . Next we show that, for every fixed time, the component zc belongs to a compact subset of H0 , uniformly as the initial data z0 belongs to the absorbing set B0 , given by Theorem 3. Lemma 4. For every time T ∈ R+ and every ω > 0, there exists a compact set KT ,ω ⊂ H0 such that zc (t) ∈ KT ,ω , ∀t ∈ [0, T ]. z0 ∈B0
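Proof. The constant $c$ appearing in this proof may depend on $K_0(R_0)$ (given by Theorem 3), which is however a fixed value. Due to Corollary 1 and Lemma 3,
\[
\|A^{1/2}u(t)\| + \|A^{1/2}v(t)\| \le c, \qquad \forall t \in \mathbb{R}^+.
\]
Choosing
\[
\sigma = \min\Big\{\frac14, \frac{5-\gamma}{2}\Big\},
\]
and multiplying (23) times $A^\sigma w_t$, we are led to the identity
\[
\frac12\frac{d}{dt}\|z_c\|_\sigma^2 + \omega\|A^{(1+\sigma)/2}w_t\|^2
= -\langle\phi(u) - \phi(v), A^\sigma w_t\rangle - \langle\phi_1(v), A^\sigma w_t\rangle + \langle f, A^\sigma w_t\rangle.
\tag{25}
\]
By virtue of (6) we get
\begin{align}
-\langle\phi(u) - \phi(v), A^\sigma w_t\rangle
&\le c\big(1 + \|u\|_{L^6}^4 + \|v\|_{L^6}^4\big)\|w\|_{L^{6/(1-2\sigma)}}\|A^\sigma w_t\|_{L^{6/(1+2\sigma)}} \notag\\
&\le c\big(1 + \|A^{1/2}u\|^4 + \|A^{1/2}v\|^4\big)\|A^{(1+\sigma)/2}w\|\,\|A^{(1+\sigma)/2}w_t\| \notag\\
&\le \frac{c}{\omega}\|z_c\|_\sigma^2 + \frac{\omega}{3}\|A^{(1+\sigma)/2}w_t\|^2.
\tag{26}
\end{align}
Since $\frac{\gamma}{5-2\sigma} \le 1$, by (9) we deduce that
\begin{align}
-\langle\phi_1(v), A^\sigma w_t\rangle
&\le c\big(1 + \|v\|^\gamma_{L^{6\gamma/(5-2\sigma)}}\big)\|A^\sigma w_t\|_{L^{6/(1+2\sigma)}} \notag\\
&\le c\big(1 + \|A^{1/2}v\|^\gamma\big)\|A^{(1+\sigma)/2}w_t\| \notag\\
&\le \frac{c}{\omega} + \frac{\omega}{3}\|A^{(1+\sigma)/2}w_t\|^2.
\tag{27}
\end{align}
Finally,
\[
\langle f, A^\sigma w_t\rangle \le \|A^{-1/2}f\|\,\|A^{(1+\sigma)/2}w_t\| \le \frac{c}{\omega} + \frac{\omega}{3}\|A^{(1+\sigma)/2}w_t\|^2.
\tag{28}
\]
Plugging (26)–(28) into (25), we obtain
\[
\frac{d}{dt}\|z_c\|_\sigma^2 \le \frac{c}{\omega}\|z_c\|_\sigma^2 + \frac{c}{\omega},
\]
and the Gronwall lemma entails
\[
\|z_c(t)\|_\sigma^2 \le e^{\frac{k}{\omega}t} - 1,
\]
which concludes the proof.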
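The last Gronwall step uses the fact that $z_c(0) = 0$ by (23). A short sketch of the computation follows; the symbols $y(t)$ and $a$ are shorthand introduced here for illustration and are not notation from the paper.

```latex
% y and a are auxiliary symbols introduced for this sketch only.
\[
y(t) := \|z_c(t)\|_\sigma^2, \qquad y(0) = 0, \qquad a := \frac{c}{\omega}.
\]
% The differential inequality y' \le a y + a can be rewritten for y + 1:
\[
\frac{d}{dt}\big(y(t) + 1\big) \le a\,\big(y(t) + 1\big)
\quad\Longrightarrow\quad
\frac{d}{dt}\Big(\big(y(t) + 1\big)e^{-at}\Big) \le 0 .
\]
% Integrating on (0,t) and using y(0) = 0:
\[
y(t) + 1 \le \big(y(0) + 1\big)e^{at} = e^{at}
\quad\Longrightarrow\quad
\|z_c(t)\|_\sigma^2 \le e^{\frac{c}{\omega}t} - 1 .
\]
```

In particular, for $t \in [0, T]$ the component $z_c(t)$ stays in a ball of $\mathcal{H}_\sigma$, which is compactly embedded into $\mathcal{H}_0$ since $\sigma > 0$.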
Collecting now Theorem 3, Lemma 3 and Lemma 4, we establish that $S(t)$ is asymptotically smooth. Therefore, by means of well-known results of the theory of dynamical systems (see, e.g., [12]), Theorem 4 is proved.

Remark 5. On account of Lemma 3 and Lemma 4, for every $T \in \mathbb{R}^+$ the attractor $\mathcal{A}$ belongs to an $M_0 e^{-\nu_0\Lambda(\omega)T}$-neighborhood of $K_{T,\omega}$. Note that, as $\omega \to \infty$, the set $K_{T,\omega}$ shrinks, but the constant $M_0 e^{-\nu_0\Lambda(\omega)T}$ increases. This seems to suggest that the “smallest” attractor occurs for a certain $\omega^*$, away from zero and infinity. We will come back to this in the next sections, where we discuss the dependence on $\omega$ in the subcritical case.

Remark 6. With the additional assumptions $f \in L^2(\Omega)$ and $\phi_0 \in C^1(\mathbb{R})$ with $\phi_0' \ge 0$, it is also possible to prove that the semigroup $S(t)$ possesses a universal attractor $\mathcal{A}_1$ on the phase-space $\mathcal{H}_1$. Clearly, $\mathcal{A}_1 \subset \mathcal{A}$. If we could prove that $\mathcal{A}$ is a bounded subset of $\mathcal{H}_1$, then, on account of the maximality properties of universal attractors (cf. [15]), we would have the reverse inclusion. As a consequence, $\mathcal{A}$ would not only be bounded, but also compact in $\mathcal{H}_1$. In general, one cannot have an $\mathcal{H}_1$-bound for $\mathcal{A}$ assuming only $f \in H^{-1}(\Omega)$. This follows from the fact that the stationary points (which belong to the attractor) solve the equation
\[
-\Delta\tilde u + \phi(\tilde u) = f,
\]
and are as regular as $f$ permits. In particular, if $f \in H^{-1}(\Omega)$, but not more, then $\tilde u \in H_0^1(\Omega)$, but not more. In this case $\mathcal{A}$ cannot be a subset of $\mathcal{H}_\sigma$ for any $\sigma > 0$.
6. The Subcritical Case: Further Regularity In the last two sections, we want to pursue a quite detailed asymptotic analysis when f is more regular and the nonlinearity is subcritical. More precisely, we make the extra assumptions f ∈ L2 () independent of time, φ0 (r) = 0, ∀r ∈ R, φ1 ∈ C 1 (R) with |φ1 (r)| ≤ c(1 + |r|γ −1 ),
∀r ∈ R.
(29) (30) (31)
Also, we focus on the case when ω is separated from zero. As we will see, this situation is much more interesting (see however Remark 9 at the end). To this aim, we assume ω ≥ ω0 , for some ω0 > 0.
(32)
All the constants and the sets appearing in the sequel are independent of $\omega \ge \omega_0$ (but they do depend on $\omega_0$). Accordingly, all the estimates we will provide are understood to be uniform as $\omega \ge \omega_0$. From now on, let conditions (10) and (29)–(32) hold.

Remark 7. On account of (30)–(32), Lemma 3 simplifies as follows: given any $R \ge 0$, there exist $M_0 = M_0(R) \ge 0$ and $\nu_0 > 0$ (independent of $R$), such that, whenever $\|z_0\|_0 \le R$, it follows that
\[ \|z_d(t)\|_0 \le M_0\, e^{-\frac{\nu_0}{\omega} t}, \quad \forall t \in \mathbb{R}^+. \]
To be more precise, $M_0(R) = cR$, for some $c > 1$.

The goal of this section is to prove the existence of a bounded set $\mathcal{B}_1 \subset \mathcal{H}_1$ which is an attracting set in $\mathcal{H}_0$, with an exponential rate of attraction. Clearly, it is enough to prove the attraction property on the absorbing set $\mathcal{B}_0$. Let us state the result.

Theorem 5. There exist $M \ge 0$, $\nu > 0$, and a set $\mathcal{B}_1$, closed and bounded in $\mathcal{H}_1$, such that
\[ \mathrm{dist}_{\mathcal{H}_0}(S(t)\mathcal{B}_0, \mathcal{B}_1) \le M e^{-\frac{\nu}{\omega} t}, \quad \forall t \in \mathbb{R}^+, \]
where $\mathrm{dist}_{\mathcal{H}_0}$ denotes the usual Hausdorff semidistance in $\mathcal{H}_0$.

In light of Remark 6, a straightforward consequence is

Corollary 3. The universal attractor $\mathcal{A}$ of $S(t)$ on $\mathcal{H}_0$ is a compact subset of $\mathcal{H}_1$. Also, its $\mathcal{H}_1$-bound is uniform as $\omega \ge \omega_0$.

The proof of Theorem 5 will be carried out by means of several lemmas. The main ingredient is a bootstrap procedure, along the lines of [11]. We will keep the same notation as in Sect. 5 (with $\phi_0 \equiv 0$); in particular, we will use again the decomposition $z = z_d + z_c$.
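Theorem 5 is phrased in terms of the Hausdorff semidistance, $\mathrm{dist}(B, C) = \sup_{b \in B} \inf_{c \in C} \|b - c\|$, which is not symmetric in its arguments. The following minimal sketch illustrates this on finite point sets in the plane; the finite sets and the Euclidean norm are toy stand-ins (our assumption) for subsets of the phase-space, and the function name is hypothetical.

```python
import math


def hausdorff_semidistance(B, C):
    """dist(B, C) = sup over b in B of inf over c in C of |b - c|.

    B and C are finite, non-empty lists of points (tuples of floats);
    this is only a toy model of the semidistance used in Theorem 5.
    """
    return max(min(math.dist(b, c) for c in C) for b in B)


# B is contained in C, so every point of B is at distance 0 from C ...
B = [(0.0, 0.0), (1.0, 0.0)]
C = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
d_BC = hausdorff_semidistance(B, C)

# ... but C is not contained in B: the point (5, 0) is 4 away from B.
d_CB = hausdorff_semidistance(C, B)

print(d_BC, d_CB)
```

In particular, a vanishing semidistance from $B$ to $C$ only says that $B$ lies in the closure of $C$, which is why it is the natural notion for attraction statements.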
On the Strongly Damped Wave Equation
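Lemma 5. Let $\sigma \in [0, 1]$ be given. Assume that $\|z_0\|_\sigma \le R_\sigma$, for some $R_\sigma \ge 0$. Then there exist constants
\[ K_\sigma = K_\sigma(R_\sigma) \ge 0, \qquad \Lambda_\sigma = \Lambda_\sigma(R_\sigma) \ge 0, \qquad \mu_\sigma = \mu_\sigma(R_\sigma) \in [0, 1) \]
such that
\[ \|z(t)\|_\sigma \le K_\sigma, \quad \forall t \in \mathbb{R}^+, \tag{33} \]
and
\[ \omega \int_\tau^t \|A^{(1+\sigma)/2} u_t(y)\|^2\, dy \le \Lambda_\sigma \bigl(1 + (t - \tau)^{\mu_\sigma}\bigr), \quad \forall t \ge \tau,\ \tau \in \mathbb{R}^+. \tag{34} \]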
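Proof. The result for $\sigma = 0$ is already demonstrated, on account of Corollary 1. We will reach the desired conclusion by means of a bootstrap argument. Namely, assuming the result true for a certain $\sigma \in [0, 1)$, we show that the thesis holds for $\sigma + s$, for all
\[ s \le \min\Bigl\{\tfrac14,\ \tfrac{5-\gamma}{2},\ 1 - \sigma\Bigr\}. \]
It is thus apparent that, after a finite number of steps, we get the assertion for all $\sigma \in [0, 1]$. Let then $\sigma \in [0, 1)$ be fixed. By the bootstrap hypothesis, (33)–(34) hold for such $\sigma$. Along the proof, the generic constant $c \ge 0$ will depend on $R_\sigma$. It is convenient to consider separately two cases.

Case 1. $\sigma < \frac12$. Given $\varepsilon \in [0, 2\varepsilon_0]$, with $\varepsilon_0 > 0$ to be determined later, set $\xi = u_t + \frac{\varepsilon}{\omega} u$ and define
\[ \Phi(t) = (1 - \varepsilon)\|A^{(1+\sigma+s)/2} u(t)\|^2 + \|A^{(\sigma+s)/2} \xi(t)\|^2 + G(t) + k_0, \]
for some $k_0 = k_0(R_\sigma) \ge 0$, where $G$ is the functional
\[ G(t) = 2\langle \phi_1(u(t)), A^{\sigma+s} u(t)\rangle - 2\langle f, A^{\sigma+s} u(t)\rangle. \]
Choosing $k_0$ large enough and $\varepsilon_0$ small enough, we have
\[ \tfrac12 \|z(t)\|^2_{\sigma+s} \le \Phi(t) \le 2\|z(t)\|^2_{\sigma+s} + c, \]
for all $\varepsilon \in [0, 2\varepsilon_0]$. Indeed,
\begin{align*}
2\langle \phi_1(u), A^{\sigma+s} u\rangle
&\le c\bigl(1 + \|u\|_{L^{6/(6+2\gamma\sigma-\gamma-4\sigma-2s)}} \|u\|^{\gamma-1}_{L^{6/(1-2\sigma)}}\bigr) \|A^{\sigma+s} u\|_{L^{6/(1+2\sigma+2s)}} \\
&\le c\bigl(1 + \|A^{q/2} u\| \|A^{(1+\sigma)/2} u\|^{\gamma-1}\bigr) \|A^{(1+\sigma+s)/2} u\| \\
&\le c\bigl(1 + \|A^{q/2} u\|\bigr) \|A^{(1+\sigma+s)/2} u\|,
\end{align*}
where
\[ q = \max\Bigl\{\tfrac{\gamma + 4\sigma + 2s - 3 - 2\gamma\sigma}{2},\ 0\Bigr\}. \]
Since $q < 1 + \sigma + s$, using (33) we get
\[ \|A^{q/2} u\| \le \nu \|A^{(1+\sigma+s)/2} u\| + c_\nu, \tag{35} \]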
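for an arbitrarily small constant $\nu > 0$ and some $c_\nu = c_\nu(R_\sigma) > 0$. This gives at once the inequality
\[ 2\langle \phi_1(u), A^{\sigma+s} u\rangle \le \tfrac14 \|z(t)\|^2_{\sigma+s} + c. \]
Finally, it is straightforward to see that
\[ 2\langle f, A^{\sigma+s} u\rangle \le \tfrac14 \|z(t)\|^2_{\sigma+s} + c. \]
Multiplying the equation times $A^{\sigma+s}\xi$, we are led to the identity
\begin{align*}
&\frac12 \frac{d}{dt}\Phi + \frac{\varepsilon}{\omega}(1 - \varepsilon)\|A^{(1+\sigma+s)/2} u\|^2 + \omega\|A^{(1+\sigma+s)/2} \xi\|^2 + \frac{\varepsilon}{2\omega} G \\
&\qquad = \frac{\varepsilon}{\omega}\|A^{(\sigma+s)/2} \xi\|^2 - \frac{\varepsilon^2}{\omega^2}\langle A^{(\sigma+s)/2} u, A^{(\sigma+s)/2} \xi\rangle + \langle \phi_1'(u) u_t, A^{\sigma+s} u\rangle. \tag{36}
\end{align*}
There holds
\[ -\frac{\varepsilon^2}{\omega^2}\langle A^{(\sigma+s)/2} u, A^{(\sigma+s)/2} \xi\rangle \le \frac{\varepsilon^3}{4\alpha_1\omega^3}\|A^{(1+\sigma+s)/2} u\|^2 + \frac{\varepsilon}{\omega}\|A^{(\sigma+s)/2} \xi\|^2. \]
Moreover, since $\frac{3(\gamma-1)}{2-s} \le 6$, we deduce from (9) and (35) that
\begin{align*}
\langle \phi_1'(u) u_t, A^{\sigma+s} u\rangle
&\le c\bigl(1 + \|u\|^{\gamma-1}_{L^{3(\gamma-1)/(2-s)}}\bigr) \|u_t\|_{L^{6/(1-2\sigma)}} \|A^{\sigma+s} u\|_{L^{6/(1+2\sigma+2s)}} \\
&\le c\bigl(1 + \|A^{1/2} u\|^{\gamma-1}\bigr) \|A^{(1+\sigma)/2} u_t\| \|A^{(1+\sigma+s)/2} u\| \\
&\le c\|A^{(1+\sigma)/2} u_t\| \|A^{(1+\sigma+s)/2} u\| \\
&\le c\|A^{(1+\sigma)/2} u_t\| + c\|A^{(1+\sigma)/2} u_t\|\,\Phi.
\end{align*}
By virtue of the above inequalities, the right-hand side of (36) is less than or equal to
\[ \frac{\varepsilon^3}{4\alpha_1\omega^3}\|A^{(1+\sigma+s)/2} u\|^2 + \frac{2\varepsilon}{\omega}\|A^{(\sigma+s)/2} \xi\|^2 + h\Phi + h, \]
having set $h(t) = c\|A^{(1+\sigma)/2} u_t(t)\|$. It is then clear that, fixing $\varepsilon_0$ small enough, we find the differential inequality
\[ \frac{d}{dt}\Phi + \frac{\varepsilon}{\omega}\Phi + \omega\|A^{(1+\sigma+s)/2} \xi\|^2 \le h\Phi + h + \frac{k_0\varepsilon}{\omega}, \tag{37} \]
that holds for all $\varepsilon \in [0, 2\varepsilon_0]$. From (34) and the Hölder inequality,
\[ \omega \int_\tau^t h(y)\, dy \le c\bigl(1 + (t - \tau)^{\mu}\bigr), \quad \forall t \ge \tau,\ \tau \in \mathbb{R}^+, \tag{38} \]
with $\mu = \frac{\mu_\sigma + 1}{2} < 1$. So we are in the hypotheses of Lemma 2. Setting $\varepsilon = 2\varepsilon_0$, and using (35), we obtain
\[ \|z(t)\|^2_{\sigma+s} \le c\bigl(1 + \|z_0\|^2_{\sigma+s}\bigr) e^{-\frac{\varepsilon_0}{\omega} t} + \rho, \quad \forall t \in \mathbb{R}^+, \tag{39} \]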
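where, recalling (4), $\rho$ is given by
\[ \rho = \frac{c\, e^{\frac{\varepsilon_0}{\omega}}}{\omega\bigl(1 - e^{-\frac{\varepsilon_0}{\omega}}\bigr)} \le c \quad \text{as } \omega \ge \omega_0. \]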
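The uniform bound on $\rho$ can be checked by hand. One possible argument (ours, not necessarily the one the authors have in mind) uses only the elementary inequality $1 - e^{-x} = \int_0^x e^{-s}\,ds \ge x e^{-x}$ for $x \ge 0$, applied with $x = \varepsilon_0/\omega$:

```latex
\[
\omega\bigl(1 - e^{-\varepsilon_0/\omega}\bigr)
  \;\ge\; \omega \cdot \frac{\varepsilon_0}{\omega}\, e^{-\varepsilon_0/\omega}
  \;=\; \varepsilon_0\, e^{-\varepsilon_0/\omega},
\qquad\text{hence}\qquad
\rho \;\le\; \frac{c}{\varepsilon_0}\, e^{2\varepsilon_0/\omega}
     \;\le\; \frac{c}{\varepsilon_0}\, e^{2\varepsilon_0/\omega_0}
\quad\text{for all } \omega \ge \omega_0 .
\]
```

The right-hand side depends on $\omega_0$ but not on $\omega$, in agreement with the convention that constants are uniform as $\omega \ge \omega_0$.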
Hence (33) holds for $\sigma + s$. Actually, (39) says a little bit more, since the desired result is $\|z(t)\|_{\sigma+s} \le K_{\sigma+s}(R_{\sigma+s})$, whereas the constant $\rho$ depends only on $R_\sigma$. This allows us, for instance, to prove the existence of bounded absorbing sets for $S(t)$ on the phase-space $\mathcal{H}_\sigma$, for all $\sigma \in [0, 1]$. Finally, setting $\varepsilon = 0$ in (37), and using the bound on $\|z(t)\|_{\sigma+s}$, which in turn furnishes a bound on $\Phi$, we get
\[ \frac{d}{dt}\Phi + \omega\|A^{(1+\sigma+s)/2} u_t\|^2 \le \tilde c\, h, \]
for some $\tilde c = \tilde c(R_{\sigma+s})$. Integration on $(\tau, t)$, on account of (38), entails (34) for $\sigma + s$.

Case 2. $\sigma \ge \frac12$. Exploiting Case 1, we readily learn that the theorem holds for all $\sigma \in [\frac12, \tilde\sigma]$, for some $\tilde\sigma > \frac12$. Hence, if $\sigma \ge \tilde\sigma$, in particular we get that $\|A^{(1+\tilde\sigma)/2} u(t)\| \le c$, and the continuous embedding $D(A^{(1+\tilde\sigma)/2}) \hookrightarrow L^\infty(\Omega)$ bears the uniform bound
\[ \sup_{t \in \mathbb{R}^+} \|u(t)\|_{L^\infty} \le c. \tag{40} \]
The proof then goes exactly as in the previous case, with the difference that now the estimates are almost immediate, due to the control (40). The details are left to the reader.

Lemma 6. Let $\sigma \in [0, 1)$ be given, and set
\[ s = s(\sigma) = \min\Bigl\{\tfrac14,\ \tfrac{5-\gamma}{2},\ 1 - \sigma\Bigr\}. \tag{41} \]
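Given any $R_\sigma \ge 0$, there exists $R_{\sigma+s} = R_{\sigma+s}(R_\sigma)$ such that, if $\|z_0\|_\sigma \le R_\sigma$, it follows that
\[ \|z_c(t)\|_{\sigma+s} \le R_{\sigma+s}, \quad \forall t \in \mathbb{R}^+. \]
Proof. The argument is very similar to the one used in the previous proof. Therefore we will just detail those passages in which significant differences occur. As before, let the generic constant $c \ge 0$ depend on $R_\sigma$. Also, by virtue of Lemma 5, we have the uniform bounds (33)–(34). The energy functional considered here is
\[ \Phi_c(t) = (1 - \varepsilon)\|A^{(1+\sigma+s)/2} w(t)\|^2 + \|A^{(\sigma+s)/2} \xi_c(t)\|^2 + G_c(t) + k_0, \]
for some $\varepsilon > 0$ and $k_0 = k_0(R_\sigma) \ge 0$, with $\xi_c = w_t + \frac{\varepsilon}{\omega} w$, and
\[ G_c(t) = 2\langle \phi_1(u(t)), A^{\sigma+s} w(t)\rangle - 2\langle f, A^{\sigma+s} w(t)\rangle. \]
Again, for $k_0$ large enough and $\varepsilon$ small enough, we have
\[ \tfrac12 \|z_c(t)\|^2_{\sigma+s} \le \Phi_c(t) \le 2\|z_c(t)\|^2_{\sigma+s} + c. \]
Indeed,
\[ 2\langle \phi_1(u), A^{\sigma+s} w\rangle \le c\bigl(1 + \|u\|^{\gamma}_{L^{6\gamma/(5-2\sigma-2s)}}\bigr) \|A^{(1+\sigma+s)/2} w\|. \]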
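If $\sigma < \frac12$, on account of the inequality
\[ \frac{6\gamma}{5 - 2\sigma - 2s} \le \frac{6}{1 - 2\sigma}, \]
we get
\[ \|u\|^{\gamma}_{L^{6\gamma/(5-2\sigma-2s)}} \le c\|A^{(1+\sigma)/2} u\|^{\gamma} \le c. \]
If $\sigma \ge \frac12$, we still get the inequality
\[ \|u\|^{\gamma}_{L^{6\gamma/(5-2\sigma-2s)}} \le c, \]
by means of the continuous embedding
\[ D(A^{(1+\sigma)/2}) \hookrightarrow L^p(\Omega), \quad \forall p \ge 1. \]
In either case, we can conclude that
\[ 2\langle \phi_1(u), A^{\sigma+s} w\rangle \le \tfrac14 \|z_c(t)\|^2_{\sigma+s} + c. \]
Multiplying (23) times $A^{\sigma+s}\xi_c$, and repeating the former passages, we obtain the differential inequality
\[ \frac{d}{dt}\Phi_c + \frac{\varepsilon}{\omega}\Phi_c \le h\Phi_c + h + \frac{k_0\varepsilon}{\omega}, \]
for some $\varepsilon > 0$ small enough, where $h$ fulfills (38). An application of Lemma 2 leads to the desired conclusion, since in this case (cf. (39)), $\Phi_c(0) \le c$.

We will complete our task exploiting the transitivity property of exponential attraction [9, Theorem 5.1], that we recall below for the reader's convenience.

Lemma 7. Let $\mathcal{K}_1, \mathcal{K}_2, \mathcal{K}_3$ be subsets of $\mathcal{H}_0$ such that
\[ \mathrm{dist}_{\mathcal{H}_0}(S(t)\mathcal{K}_1, \mathcal{K}_2) \le L_1 e^{-\vartheta_1 t}, \qquad \mathrm{dist}_{\mathcal{H}_0}(S(t)\mathcal{K}_2, \mathcal{K}_3) \le L_2 e^{-\vartheta_2 t}, \]
for some $\vartheta_1, \vartheta_2 > 0$ and $L_1, L_2 \ge 0$. Assume also that for all $z_1, z_2 \in \bigcup_{t \ge 0} S(t)\mathcal{K}_j$ ($j = 1, 2, 3$) there holds
\[ \|S(t) z_1 - S(t) z_2\|_0 \le L_0 e^{\vartheta_0 t} \|z_1 - z_2\|_0, \]
for some $\vartheta_0 \ge 0$ and some $L_0 \ge 0$. Then it follows that
\[ \mathrm{dist}_{\mathcal{H}_0}(S(t)\mathcal{K}_1, \mathcal{K}_3) \le L e^{-\vartheta t}, \]
where $\vartheta = \frac{\vartheta_1 \vartheta_2}{\vartheta_0 + \vartheta_1 + \vartheta_2}$ and $L = L_0 L_1 + L_2$.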
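The bookkeeping in Lemma 7 is simple enough to sketch in code. The snippet below only evaluates the two formulas of the lemma, $\vartheta = \vartheta_1\vartheta_2/(\vartheta_0 + \vartheta_1 + \vartheta_2)$ and $L = L_0 L_1 + L_2$; the function name and the sample numbers are hypothetical. It also illustrates why rates of the form (constant)/$\omega$ are preserved under chaining: if all three exponents scale like $1/\omega$, so does the resulting rate.

```python
from fractions import Fraction


def transitive_attraction(theta0, theta1, theta2, L0, L1, L2):
    """Rate and constant for the attraction K1 -> K3 as given by Lemma 7."""
    theta = theta1 * theta2 / (theta0 + theta1 + theta2)
    L = L0 * L1 + L2
    return theta, L


# Sample values (illustrative only): theta = 2*3/(1+2+3), L = 2*3 + 4.
theta, L = transitive_attraction(Fraction(1), Fraction(2), Fraction(3), 2, 3, 4)


def rate_for(omega):
    """Combined rate when every exponent is a fixed constant divided by omega."""
    rate, _ = transitive_attraction(
        Fraction(1, omega), Fraction(2, omega), Fraction(3, omega), 2, 3, 4
    )
    return rate


print(theta, L, rate_for(5))
```

Note that the combined rate is always strictly smaller than both $\vartheta_1$ and $\vartheta_2$, so each application of the lemma costs some speed of attraction.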
We have now all the tools to proceed to the proof of the theorem.

Proof of Theorem 5. With reference to (41), notice that, starting with $\sigma = 0$, we find a strictly increasing finite sequence of numbers $\{\sigma_j\}_{j=0}^{n}$, with $n = n(\gamma)$, such that
\[ \sigma_0 = 0, \qquad \sigma_{j+1} = \sigma_j + s(\sigma_j), \qquad \sigma_n = 1. \]
Choosing $R_0$ as in Theorem 3, let us define for $j = 0, \ldots, n$
\[ \mathcal{B}_{\sigma_j} = \bigl\{ z_0 \in \mathcal{H}_{\sigma_j} : \|z_0\|_{\sigma_j} \le R_{\sigma_j} \bigr\}, \]
where $R_{\sigma_j} = R_{\sigma_j}(R_{\sigma_{j-1}})$ are given by Lemma 6. After Remark 7 and Lemma 6, we learn at once that
\[ \mathrm{dist}_{\mathcal{H}_0}(S(t)\mathcal{B}_{\sigma_{j-1}}, \mathcal{B}_{\sigma_j}) \le M_j e^{-\frac{\nu_0}{\omega} t}, \quad \forall j = 1, \ldots, n, \]
where
\[ M_j = M_0\, \alpha_1^{\sigma_{j-1}/2} R_{\sigma_{j-1}}. \]
Taking then into account Corollary 1 and (13), by successive applications of Lemma 7, we obtain the estimate
\[ \mathrm{dist}_{\mathcal{H}_0}(S(t)\mathcal{B}_0, \mathcal{B}_1) \le M e^{-\frac{\nu}{\omega} t}, \]
for some $M \ge 0$ and $\nu > 0$.
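The finite sequence $\{\sigma_j\}$ used in the proof can be generated directly from (41). The sketch below does this in exact rational arithmetic; it assumes $\gamma < 5$, so that the step $s(\sigma)$ is positive for $\sigma < 1$, and the function names are ours. The sample values of $\gamma$ are illustrative only.

```python
from fractions import Fraction


def step(sigma, gamma):
    """s(sigma) = min{1/4, (5 - gamma)/2, 1 - sigma}, as in (41)."""
    return min(Fraction(1, 4), (5 - gamma) / 2, 1 - sigma)


def bootstrap_sequence(gamma):
    """Return [sigma_0, ..., sigma_n] with sigma_0 = 0 and sigma_n = 1.

    Assumes gamma < 5 (otherwise the step is not positive and the
    bootstrap does not advance).
    """
    gamma = Fraction(gamma)
    if gamma >= 5:
        raise ValueError("the sketch assumes gamma < 5")
    sigmas = [Fraction(0)]
    while sigmas[-1] < 1:
        sigmas.append(sigmas[-1] + step(sigmas[-1], gamma))
    return sigmas


seq_cubic = bootstrap_sequence(3)                # steps of size 1/4
n_near_critical = len(bootstrap_sequence(Fraction(19, 4))) - 1  # steps of size 1/8

print(seq_cubic, n_near_critical)
```

The number of steps $n(\gamma)$ grows without bound as $\gamma$ approaches 5, which is one way to see why the constants produced by the bootstrap degenerate near the critical exponent.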
Solutions departing from $\mathcal{B}_1$ satisfy an extra regularity, which shall be needed in the sequel.

Lemma 8. There exists $C \ge 0$ such that
\[ \sup_{z_0 \in \mathcal{B}_1} \|z_t(t)\|_0 \le C, \quad \forall t \ge 1. \]
Proof. Let $z_0 = (u_0, u_1) \in \mathcal{B}_1$ and consider the linear nonhomogeneous problem
\[
\begin{cases}
\psi_{tt} + \omega A \psi_t + A \psi = -\phi_1'(u) u_t, \\
\psi(0) = u_1, \\
\psi_t(0) = -\omega A u_1 - A u_0 - \phi_1(u_0) + f,
\end{cases}
\]
obtained by differentiation of Problem P with respect to time. By Lemma 5 (for $\sigma = 1$) we have
\[ \sup_{z_0 \in \mathcal{B}_1} \sup_{t \in \mathbb{R}^+} \|z(t)\|_1 < \infty. \tag{42} \]
Consequently, the continuous embedding $H^2(\Omega) \hookrightarrow C(\overline{\Omega})$ provides the uniform bound
\[ \sup_{z_0 \in \mathcal{B}_1} \sup_{t \in \mathbb{R}^+} \|\phi_1'(u(t)) u_t\| < \infty. \]
Thus, by standard arguments, for every $T > 0$ the above problem admits a unique solution $\psi \in C([0, T], L^2(\Omega))$, with
\[ \psi_t \in C([0, T], H^{-1}(\Omega)) \cap L^2([0, T], L^2(\Omega)). \]
By comparison, $\psi(t) = u_t(t)$ for every $t \ge 0$, so, in particular, $\psi \in C(\mathbb{R}^+, H_0^1(\Omega))$. Taking the product with $\psi_t$, we get
\[ \frac{d}{dt}\bigl(\|\psi_t\|^2 + \|A^{1/2}\psi\|^2\bigr) \le c. \]
Integrating the above inequality on $(r, t + 1)$, for some fixed $r \in [t, t + 1]$, and integrating the resulting inequality with respect to $r$ on $(t, t + 1)$, the proof follows. Notice that this procedure is a simplified version of the uniform Gronwall lemma [15, Lemma III.1.1].
Remark 8. Of course it is a natural question to ask why this approach fails to handle the critical case. In fact, the bootstrap procedure works as well for the critical case (clearly, it is a little bit more complicated, and an additional control on the second derivative of $\phi_0$ is required), provided that we start from $\sigma > 0$. The missing passage is exactly from $\sigma = 0$ to $\sigma = s$. This means that, if we were able to prove that the attractor is bounded in some $\mathcal{H}_\sigma$, for $\sigma > 0$ no matter how small, we would obtain all the results of this paper for the critical case as well. Unfortunately, it seems a really hard task to exhibit such a regularity for the attractor when $\phi$ is critical. In fact, it is quite possible that there is no such regularity.

7. Exponential Attractors for the Subcritical Case

As remarked by many authors, the universal attractor may not be, for practical purposes (e.g., to get numerical results), a satisfactory object to describe the long-term dynamics. Indeed, in spite of its nice features, it is not possible in general to exhibit an actual control of the convergence rate of the trajectories to the attractor. In order to overcome the problem of quantitative control of the time needed to stabilize the system, Eden, Foias, Nicolaenko and Temam (cf. [5, 6]) introduced the notion of exponential attractor. This is a compact invariant (but not fully invariant) subset of the phase-space of finite fractal dimension that attracts a bounded ball of initial data exponentially fast. However, before the results of [9], it was not clear if, for hyperbolic systems, the exponential attractor had a basin of attraction coinciding with the whole phase-space. Clearly, this was quite a significant limitation. Nonetheless, after [9], we now know that it is possible to remove this obstacle, and this justifies the following generalization of the definition given in [5, 6].

Definition 1. A compact set $\mathcal{E} \subset \mathcal{H}_0$ is called an exponential attractor or inertial set for the semigroup $S(t)$ if the following conditions hold:
(i) $\mathcal{E}$ is invariant for $S(t)$, that is, $S(t)\mathcal{E} \subset \mathcal{E}$ for every $t \ge 0$;
(ii) $\dim_F \mathcal{E} < \infty$, that is, $\mathcal{E}$ has finite fractal dimension;
(iii) there exist an increasing function $J : \mathbb{R}^+ \to \mathbb{R}^+$ and $\kappa > 0$ such that, for any set $\mathcal{B} \subset \mathcal{H}_0$ with $\sup_{z_0 \in \mathcal{B}} \|z_0\|_0 \le R$ there holds
\[ \mathrm{dist}_{\mathcal{H}_0}(S(t)\mathcal{B}, \mathcal{E}) \le J(R) e^{-\kappa t}. \]

We remark that, contrary to the universal attractor, the exponential attractor is not unique. As a matter of fact, if there is one, then there are infinitely many of them. It is apparent that if there is an exponential attractor $\mathcal{E}$, then in particular the semigroup possesses a compact attracting set, and thus it has a universal attractor $\mathcal{A} \subset \mathcal{E}$ of finite fractal dimension, being $\dim_F \mathcal{A} \le \dim_F \mathcal{E}$. Our main result is

Theorem 6. The semigroup $S(t)$ acting on $\mathcal{H}_0$ possesses an exponential attractor $\mathcal{E} = \mathcal{E}(\omega)$. Moreover,
(i) $\mathcal{E}$ is a bounded subset of $\mathcal{H}_1$, and the bound is independent of $\omega \ge \omega_0$;
(ii) the rate of exponential attraction $\kappa$ is proportional to $\frac{1}{\omega}$;
(iii) $J(R)$ is independent of $\omega \ge \omega_0$;
(iv) $\sup_{\omega \ge \omega_0} \dim_F \mathcal{E}(\omega) < \infty$.
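Corollary 4. The universal attractor $\mathcal{A}$ of the semigroup $S(t)$ has finite fractal dimension, and
\[ \sup_{\omega \ge \omega_0} \dim_F \mathcal{A}(\omega) < \infty. \]
In order to prove Theorem 6, we shall use the following sufficient condition (cf. [8, Prop. 1] and [6, p.33]):

Lemma 9. Let $\mathcal{X} \subset \mathcal{H}_0$ be a compact invariant subset. Assume that there exists a time $t_* > 0$ such that the following hold:
(i) the map
\[ (t, z_0) \mapsto S(t) z_0 : [0, t_*] \times \mathcal{X} \to \mathcal{X} \]
is Lipschitz continuous (with the metric inherited from $\mathcal{H}_0$);
(ii) the map $S(t_*) : \mathcal{X} \to \mathcal{X}$ admits a decomposition of the form
\[ S(t_*) = S_0 + S_1, \qquad S_0 : \mathcal{X} \to \mathcal{H}_0, \qquad S_1 : \mathcal{X} \to \mathcal{H}_1, \]
where $S_0$ and $S_1$ satisfy the conditions
\[ \|S_0(z_1) - S_0(z_2)\|_0 \le \tfrac18 \|z_1 - z_2\|_0, \quad \forall z_1, z_2 \in \mathcal{X}, \]
and
\[ \|S_1(z_1) - S_1(z_2)\|_1 \le C_* \|z_1 - z_2\|_0, \quad \forall z_1, z_2 \in \mathcal{X}, \]
for some $C_* > 0$.
Then there exists an invariant compact set $\mathcal{E} \subset \mathcal{X}$ such that
\[ \mathrm{dist}_{\mathcal{H}_0}(S(t)\mathcal{X}, \mathcal{E}) \le J_0\, e^{-\frac{\log 2}{t_*} t}, \tag{43} \]
where
\[ J_0 = 2 L_* \sup_{z_0 \in \mathcal{X}} \|z_0\|_0\, e^{\frac{\log 2}{t_*}}, \tag{44} \]
and $L_*$ is the Lipschitz constant of the map $S(t_*) : \mathcal{X} \to \mathcal{X}$. Moreover,
\[ \dim_F \mathcal{E} \le 1 + \frac{\log N_*}{\log 2}, \tag{45} \]
where $N_*$ is the minimum number of $\frac{1}{8C_*}$-balls of $\mathcal{H}_0$ necessary to cover the unit ball of $\mathcal{H}_1$.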
In fact, [8, Prop. 1] allows one to build an exponential attractor $\mathcal{E}$ that attracts $\mathcal{X}$ with an arbitrarily large attraction rate, paying the price of increasing $\dim_F \mathcal{E}$. However, we will be interested in attracting arbitrary bounded subsets of $\mathcal{H}_0$. This translates into an upper bound on the attraction rate, that depends on the velocity at which $\mathcal{X}$ attracts the absorbing set $\mathcal{B}_0$.

We remark that the original technique to find exponential attractors (cf. [5, 6]) is quite different. Indeed, it relies on the proof that the semigroup $S(t)$ satisfies the so-called squeezing property on $\mathcal{X}$. Besides, it works in Hilbert spaces only, since it makes use of orthogonal projections. On the contrary, this alternative approach is applicable in Banach spaces as well. In a Hilbert space setting, like in our case, the choice of which procedure to follow is just a matter of taste. Note that, to get precise numerical calculations, one has to know the number $N_*$, that, in general, is quite difficult to compute. Similarly, the other method requires the explicit knowledge of the eigenvalues $\{\alpha_n\}$ of $A$. In fact, this is actually the same problem.

We define
\[ \mathcal{X} = \overline{\bigcup_{\tau \ge 1} S(\tau)\mathcal{B}_1}^{\,\mathcal{H}_0}. \]
Let us establish some properties of this set.
– $\mathcal{X}$ is a compact set in $\mathcal{H}_0$, bounded in $\mathcal{H}_1$, due to Lemma 5.
– $\mathcal{X}$ is invariant, for, from the continuity of $S(t)$, we have
\[ S(t)\mathcal{X} \subset \overline{\bigcup_{\tau \ge 1} S(t + \tau)\mathcal{B}_1}^{\,\mathcal{H}_0} \subset \mathcal{X}. \]
– There holds
\[ \mathrm{dist}_{\mathcal{H}_0}(S(t)\mathcal{B}_0, \mathcal{X}) \le M e^{-\frac{\nu}{\omega} t}, \quad \forall t \in \mathbb{R}^+, \tag{46} \]
for some $M \ge 0$ and some $\nu > 0$. Indeed, it is apparent that
\[ \mathrm{dist}_{\mathcal{H}_0}(S(t)\mathcal{B}_1, \mathcal{X}) = 0, \quad \forall t \ge 1. \]
Hence (46) follows from Lemma 7, in view of Theorem 5, Lemma 5, and (13).
– There is $C \ge 0$ such that
\[ \sup_{z_0 \in \mathcal{X}} \|z_t(t)\|_0^2 \le C, \quad \forall t \ge 0. \]
This is a direct consequence of Lemma 8.

Therefore such a set $\mathcal{X}$ is a promising candidate for our purposes. Indeed, we have the following two lemmas.

Lemma 10. For every $T > 0$, the mapping $(t, z_0) \mapsto S(t) z_0$ is Lipschitz continuous on $[0, T] \times \mathcal{X}$.

Proof. For $z_1, z_2 \in \mathcal{X}$ and $t_1, t_2 \in [0, T]$ we have
\[ \|S(t_1) z_1 - S(t_2) z_2\|_0 \le \|S(t_1) z_1 - S(t_1) z_2\|_0 + \|S(t_1) z_2 - S(t_2) z_2\|_0. \]
The first term of the above inequality is handled by estimate (13). Concerning the second one,
\[ \|z(t_1) - z(t_2)\|_0 \le \Bigl|\int_{t_1}^{t_2} \|z_t(y)\|_0\, dy\Bigr| \le C |t_1 - t_2|. \]
Hence
\[ \|S(t_1) z_1 - S(t_2) z_2\|_0 \le L\bigl(|t_1 - t_2| + \|z_1 - z_2\|_0\bigr), \]
for some $L = L(T) \ge 0$.
Lemma 11. Assumption (ii) of Lemma 9 holds true.
Proof. The constant $c \ge 0$ of this proof will depend on $\mathcal{X}$ (which, however, is a fixed set). For $z_0 \in \mathcal{X}$, let us denote by $S_0(t) z_0$ the solution at time $t$ of the linear homogeneous problem associated to Problem P, and let $S_1(t) z_0 = S(t) z_0 - S_0(t) z_0$. Given two solutions $z^1 = (u^1, u^1_t)$ and $z^2 = (u^2, u^2_t)$, originating from $z_1, z_2 \in \mathcal{X}$, respectively, set $\bar z = z^1 - z^2 = (\bar u, \bar u_t)$. Let us decompose $\bar z$ into the sum $\bar z = \bar z_d + \bar z_c = (\bar v, \bar v_t) + (\bar w, \bar w_t)$, where
\[ \bar v_{tt} + \omega A \bar v_t + A \bar v = 0, \qquad \bar z_d(0) = z_1 - z_2, \tag{47} \]
and
\[ \bar w_{tt} + \omega A \bar w_t + A \bar w = -\phi_1(u^1) + \phi_1(u^2), \qquad \bar z_c(0) = 0. \tag{48} \]
It is apparent that $\bar z_d(t) = S_0(t) z_1 - S_0(t) z_2$ and $\bar z_c(t) = S_1(t) z_1 - S_1(t) z_2$. By (47) we get (cf. Remark 7),
\[ \|\bar z_d(t)\|_0 \le c \|z_1 - z_2\|_0\, e^{-\frac{\nu_0}{\omega} t}, \]
for some $c > 1$. Hence, setting
\[ t_* = \frac{\omega}{\nu_0} \log 8c, \tag{49} \]
we have
\[ \|\bar z_d(t_*)\|_0 \le \tfrac18 \|z_1 - z_2\|_0. \tag{50} \]
For all trajectories departing from $\mathcal{X}$, the first component is (uniformly) bounded almost everywhere. Therefore the product of (48) and $A \bar w_t$ bears
\[ \frac{d}{dt}\|\bar z_c\|_1^2 + 2\omega \|A \bar w_t\|^2 \le 2\|\phi_1(u^1) - \phi_1(u^2)\| \|A \bar w_t\| \le \frac{c}{\omega}\|\bar u\|^2 + 2\omega \|A \bar w_t\|^2. \]
From (13),
\[ \|\bar u(t)\| \le \|\bar z(t)\|_0 \le e^{\frac{K}{\omega} t} \|z_1 - z_2\|_0, \quad \forall t \in \mathbb{R}^+, \]
thus we obtain the inequality
\[ \frac{d}{dt}\|\bar z_c(t)\|_1^2 \le \frac{c}{\omega}\, e^{\frac{2K}{\omega} t} \|z_1 - z_2\|_0^2, \]
and an integration on $(0, t_*)$ yields
\[ \|\bar z_c(t_*)\|_1^2 \le C_* \|z_1 - z_2\|_0^2, \tag{51} \]
with
\[ C_* = \frac{c}{2K}\, e^{\frac{2K}{\omega} t_*}. \]
Notice that, in light of (49), $C_*$ is independent of $\omega \ge \omega_0$. Collecting (50)-(51), and setting $S_0 = S_0(t_*)$ and $S_1 = S_1(t_*)$, we meet the thesis.
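The $\omega$-dependence claimed at the end of the proof can be made explicit by substituting (49) into the expression $C_* = \frac{c}{2K} e^{\frac{2K}{\omega} t_*}$ and into the rate $\log 2 / t_*$ supplied by Lemma 9. In the sketch below, $c_1$ denotes the constant $c > 1$ appearing in (49) and $c_2$ the generic constant in $C_*$; the subscripts are our own labels, introduced only to keep the two constants apart.

```latex
\[
C_* \;=\; \frac{c_2}{2K}\, e^{\frac{2K}{\omega}\, t_*}
    \;=\; \frac{c_2}{2K}\, e^{\frac{2K}{\nu_0} \log(8 c_1)}
    \;=\; \frac{c_2}{2K}\, (8 c_1)^{2K/\nu_0},
\qquad
\frac{\log 2}{t_*} \;=\; \frac{\nu_0 \log 2}{\log(8 c_1)} \cdot \frac{1}{\omega}.
\]
```

Provided $K$, $\nu_0$, $c_1$ and $c_2$ are themselves independent of $\omega \ge \omega_0$, as the standing convention of this section asserts, $C_*$ carries no dependence on $\omega$, while the attraction rate given by Lemma 9 scales like $1/\omega$.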
Proof of Theorem 6. Thanks to Lemma 10 and Lemma 11, we can apply Lemma 9, so getting a compact invariant set $\mathcal{E} \subset \mathcal{X}$ satisfying (43)–(45). In particular, due to (13) and (49), we may rewrite (43) as
\[ \mathrm{dist}_{\mathcal{H}_0}(S(t)\mathcal{X}, \mathcal{E}) \le J_0\, e^{-\frac{\kappa_0}{\omega} t}, \quad \forall t \in \mathbb{R}^+, \tag{52} \]
for some $J_0 \ge 0$ and $\kappa_0 > 0$, both independent of $\omega \ge \omega_0$. In addition, $N_*$ is independent of $\omega \ge \omega_0$, for so is $C_*$. This implies assertion (iv) of the theorem. In order to complete the proof, we are left to show that $\mathcal{E}$ attracts (exponentially fast) all bounded subsets of the whole phase-space $\mathcal{H}_0$. Thus, let $\mathcal{B} \subset \mathcal{H}_0$ be a bounded set, and call $R = \sup_{z_0 \in \mathcal{B}} \|z_0\|_0$. By Theorem 3 (cf. (21)),
\[ S(t)\mathcal{B} \subset \mathcal{B}_0, \quad \forall t \ge \omega t_0, \]
for some $t_0$ depending (increasingly) only on $R$. Hence, by (46),
\[ \mathrm{dist}_{\mathcal{H}_0}(S(t)\mathcal{B}, \mathcal{X}) \le M e^{\nu t_0} e^{-\frac{\nu}{\omega} t}, \quad \forall t \ge \omega t_0. \]
On the other hand, by Corollary 1, we easily get that
\[ \mathrm{dist}_{\mathcal{H}_0}(S(t)\mathcal{B}, \mathcal{X}) \le k, \quad \forall t \in \mathbb{R}^+, \]
for some $k \ge 0$ depending (increasingly) only on $R$. Collecting the two above inequalities, we have
\[ \mathrm{dist}_{\mathcal{H}_0}(S(t)\mathcal{B}, \mathcal{X}) \le (k + M) e^{\nu t_0} e^{-\frac{\nu}{\omega} t}, \quad \forall t \in \mathbb{R}^+. \tag{53} \]
Applying once more Lemma 7, from (13), (52)–(53) and Lemma 5, we conclude that
\[ \mathrm{dist}_{\mathcal{H}_0}(S(t)\mathcal{B}, \mathcal{E}) \le J e^{-\frac{\kappa}{\omega} t}, \quad \forall t \in \mathbb{R}^+, \]
where $J = J(R)$ is an increasing function of $R$, and $\kappa > 0$. Observe that both $J$ and $\kappa$ are independent of $\omega \ge \omega_0$.

Remark 9. We want to spend a few words to say what happens when $\omega \to 0$. All the results clearly hold ($\omega_0$ can be chosen arbitrarily small), but there will be dependencies on $\omega$. For instance, the set $\mathcal{B}_1$ (and, consequently, $\mathcal{A}$ and $\mathcal{E}$) is bounded in $\mathcal{H}_1$ with a bound that blows up as $\omega \to 0$. Precisely, the bound is proportional to $\omega^{-2n}$, where $n = n(\gamma)$ is the number of steps required in the proof of Theorem 5. Analogously, the exponential convergence rate tends to infinity as $\omega \to 0$, as well as the upper bound for $\dim_F \mathcal{E}$.

Remark 10. Let us conclude the paper with a consideration. We have seen that the fractal dimension of the exponential attractor (and thus of the attractor) remains bounded as $\omega \to \infty$. Clearly, our estimates provide just upper bounds. Nonetheless it seems reasonable that $\dim_F \mathcal{E}$ tends to infinity as $\omega \to 0$. Then, as $\omega$ gets bigger, the fractal dimension decreases (at least, its upper bound) until it stabilizes. Still, the exponential convergence rate gives some information, namely, things start to get worse as soon as $\omega$ is too large. So our analysis seems to suggest, contrary to what is maintained in [17], that $\dim_F \mathcal{E}$ is not a decreasing function of $\omega$, but attains a minimum at some $\omega_*$.
References

1. Arrieta, J., Carvalho, A.N., Hale, J.K.: A damped hyperbolic equation with critical exponent. Comm. Partial Differ. Eqs. 17, 841–866 (1992)
2. Belleri, V., Pata, V.: Attractors for semilinear strongly damped wave equation on $\mathbb{R}^3$. Discrete Contin. Dynam. Systems 7, 719–735 (2001)
3. Carvalho, A.N., Cholewa, J.W.: Local well posedness for strongly damped wave equations with critical nonlinearities. Bull. Austral. Math. Soc. 66, 443–463 (2002)
4. Carvalho, A.N., Cholewa, J.W.: Attractors for strongly damped wave equations with critical nonlinearities. Pacific J. Math. 207, 287–310 (2002)
5. Eden, A., Foias, C., Nicolaenko, B., Temam, R.: Ensembles inertiels pour des équations d'évolution dissipatives. C.R. Acad. Sci. Paris Sér. I Math. 310, 559–562 (1990)
6. Eden, A., Foias, C., Nicolaenko, B., Temam, R.: Exponential attractors for dissipative evolution equations. Paris: Masson, 1994
7. Eden, A., Kalantarov, V.: Finite dimensional attractors for a class of semilinear wave equations. Turkish J. Math. 20, 425–450 (1996)
8. Efendiev, M., Miranville, A., Zelik, S.: Exponential attractors for a nonlinear reaction-diffusion system in $\mathbb{R}^3$. C.R. Acad. Sci. Paris Sér. I Math. 330, 713–718 (2000)
9. Fabrie, P., Galusinski, C., Miranville, A., Zelik, S.: Uniform exponential attractors for a singularly perturbed damped wave equation. Discrete Contin. Dynam. Systems 10, 211–238 (2004)
10. Ghidaglia, J.M., Marzocchi, A.: Longtime behaviour of strongly damped wave equations, global attractors and their dimension. SIAM J. Math. Anal. 22, 879–895 (1991)
11. Grasselli, M., Pata, V.: Asymptotic behavior of a parabolic-hyperbolic system. Commun. Pure Appl. Anal., to appear
12. Hale, J.K.: Asymptotic behavior of dissipative systems. Providence, RI: Amer. Math. Soc., 1988
13. Haraux, A.: Systèmes dynamiques dissipatifs et applications. Paris: Masson, 1991
14. Massat, P.: Limiting behavior for strongly damped nonlinear wave equations. J. Differ. Eqs. 48, 334–349 (1983)
15. Temam, R.: Infinite-dimensional dynamical systems in mechanics and physics. New York: Springer, 1997
16. Webb, G.F.: Existence and asymptotic behavior for a strongly damped nonlinear wave equation. Canad. J. Math. 32, 631–643 (1980)
17. Zhou, S.: Global attractor for strongly damped nonlinear wave equations. Funct. Differ. Eq. 6, 451–470 (1999)

Communicated by P. Constantin
Commun. Math. Phys. 253, 535–560 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1234-0
Communications in
Mathematical Physics
Universal Behavior for Averages of Characteristic Polynomials at the Origin of the Spectrum M. Vanlessen Department of Mathematics, Katholieke Universiteit Leuven, Celestijnenlaan 200 B, 3001 Leuven, Belgium. E-mail:
[email protected] Received: 30 June 2003 / Accepted: 27 July 2004 Published online: 11 November 2004 – © Springer-Verlag 2004
Abstract: It has been shown by Strahov and Fyodorov that averages of products and ratios of characteristic polynomials corresponding to Hermitian matrices of a unitary ensemble involve kernels related to orthogonal polynomials and their Cauchy transforms. We will show that, for the unitary ensemble $\frac{1}{\hat Z_n} |\det M|^{2\alpha} e^{-nV(M)}\, dM$ of $n \times n$ Hermitian matrices, these kernels have universal behavior at the origin of the spectrum, as $n \to \infty$, in terms of Bessel functions. Our approach is based on the characterization of orthogonal polynomials together with their Cauchy transforms via a matrix Riemann-Hilbert problem, due to Fokas, Its and Kitaev, and on an application of the Deift/Zhou steepest descent method for matrix Riemann-Hilbert problems to obtain the asymptotic behavior of the Riemann-Hilbert problem.

1. Introduction

Characteristic polynomials of random matrices are useful to make predictions about moments of the Riemann zeta function, see [8, 18, 19, 21]. Another domain where they are of great value is quantum chromodynamics, see for example [2, 3, 9, 33]. In this paper we consider characteristic polynomials $\det(x - M)$ of random matrices taken from the following unitary ensemble of $n \times n$ Hermitian matrices $M$, cf. [4, 5, 25]:
\[ \frac{1}{\hat Z_n} |\det M|^{2\alpha} e^{-n\,\mathrm{tr}\, V(M)}\, dM, \qquad \alpha > -1/2. \tag{1.1} \]
Here $dM$ is the associated flat Lebesgue measure on the space of $n \times n$ Hermitian matrices, and $\hat Z_n$ is a normalization constant. The confining potential $V$ in (1.1) is a real valued function with enough increase at infinity, for example a polynomial of even degree with positive leading coefficient.

Postdoctoral Fellow of the Research Foundation – Flanders (FWO–Vlaanderen). Supported by FWO research projects G.0176.02 and G.0455.04.

This unitary ensemble induces a probability density function on the $n$ eigenvalues $x_1, \ldots, x_n$ of $M$, see [26]
\[ P^{(n)}(x_1, \ldots, x_n) = \frac{1}{Z_n} \prod_{j=1}^{n} w_n(x_j)\, \Delta^2(x_1, \ldots, x_n), \]
where $\Delta(x_1, \ldots, x_n) = \prod_{i<j} (x_j - x_i)$ stands for the Vandermonde determinant, where $Z_n$ is a normalization constant (the partition function), and where $w_n$ is the following varying weight on the real line:
\[ w_n(x) = |x|^{2\alpha} e^{-nV(x)}. \tag{1.2} \]
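To make the structure of this density concrete, the following sketch evaluates its unnormalized part, $\prod_j w_n(x_j)\,\Delta^2(x_1, \ldots, x_n)$, for a given list of eigenvalues. The function names and the sample potential $V(x) = x^2$ are our own illustrative choices; the partition function $Z_n$ is not computed.

```python
import math


def vandermonde(xs):
    """Delta(x_1, ..., x_n) = product over i < j of (x_j - x_i)."""
    prod = 1.0
    for j in range(len(xs)):
        for i in range(j):
            prod *= xs[j] - xs[i]
    return prod


def weight(x, n, alpha, V):
    """Varying weight w_n(x) = |x|^(2 alpha) * exp(-n V(x)) from (1.2)."""
    return abs(x) ** (2 * alpha) * math.exp(-n * V(x))


def unnormalized_density(xs, alpha, V):
    """prod_j w_n(x_j) * Delta(x)^2, i.e. Z_n * P^(n)(x_1, ..., x_n)."""
    n = len(xs)
    return math.prod(weight(x, n, alpha, V) for x in xs) * vandermonde(xs) ** 2


def V_sample(x):
    # Hypothetical confining potential, chosen only for illustration.
    return x * x


delta_124 = vandermonde([1.0, 2.0, 4.0])          # (2-1)(4-1)(4-2)
p_ab = unnormalized_density([0.5, -0.3], 0.5, V_sample)
p_ba = unnormalized_density([-0.3, 0.5], 0.5, V_sample)
p_coincident = unnormalized_density([0.2, 0.2, 1.0], 0.5, V_sample)

print(delta_124, p_ab, p_ba, p_coincident)
```

The squared Vandermonde factor makes the density symmetric in the eigenvalues and forces it to vanish when two eigenvalues coincide (level repulsion), while for $\alpha > 0$ the factor $|x|^{2\alpha}$ additionally suppresses eigenvalues near the origin.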
The unitary ensemble (1.1) is relevant in three-dimensional quantum chromodynamics [33], and has been investigated before in [4, 5, 20, 25, 28], where universal behavior for local eigenvalue correlations is established in various regimes of the spectrum, as $n \to \infty$.

It is known that averages of products and ratios of characteristic polynomials are intimately related to orthogonal polynomials and their Cauchy transforms, see [7, 8, 17, 27, 30]. Let $\pi_{j,n}(x) = x^j + \cdots$ be the $j$th degree monic orthogonal polynomial with respect to $w_n$. There is an integral representation for the monic orthogonal polynomials, which appears already in the work of Heine in 1878, see for example [31],
\[ \pi_{n,n}(x) = \int \cdots \int \prod_{j=1}^{n} (x - x_j)\, P^{(n)}(x_1, \ldots, x_n)\, dx_1 \ldots dx_n. \]
So, the monic orthogonal polynomial $\pi_{n,n}$ can be understood as the average of the characteristic polynomial $\det(x - M)$ over the unitary ensemble (1.1)
\[ \langle \det(x - M) \rangle_M = \pi_{n,n}(x). \]
Here, the brackets are used to denote the average over the ensemble (1.1) of random matrices $M$. A first generalization of this formula was obtained by Brézin and Hikami [8], and also by Mehta and Normand [27]. They have derived a determinantal formula for the average of products of characteristic polynomials in terms of orthogonal polynomials. A further generalization was obtained by Fyodorov and Strahov [17], who derived a determinantal formula for the average of both products and ratios of characteristic polynomials in terms of both orthogonal polynomials and their Cauchy transforms. Here, the ratios gave rise to the Cauchy transforms. For explicit formulas and streamlined proofs of these results we refer to [7].

Recently, Strahov and Fyodorov [30] showed, see also [7] for an alternative proof, that the averages of characteristic polynomials of $n \times n$ Hermitian matrices are governed by kernels related to orthogonal polynomials and their Cauchy transforms
\[ h_{j,n}(z) = \frac{1}{2\pi i} \int \frac{\pi_{j,n}(x)}{x - z}\, w_n(x)\, dx, \qquad \text{for } \operatorname{Im} z \ne 0. \tag{1.3} \]
Namely, kernels $W_{I,n+m}$ made of orthogonal polynomials, kernels $W_{II,n+m}$ made of both orthogonal polynomials and their Cauchy transforms, and kernels $W_{III,n+m}$ made of Cauchy transforms of orthogonal polynomials. See Table 1 for the explicit expressions of these kernels. This connection between the averages of characteristic polynomials and the three kernels is given by, see [7, 30]
Averages of Characteristic Polynomials at the Origin of the Spectrum
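Table 1. Expressions for the finite kernels $W_{I,n+m}$, $W_{II,n+m}$ and $W_{III,n+m}$, cf. [30]

Finite kernels
\[ W_{I,n+m}(\zeta, \eta) = \frac{\pi_{n+m,n}(\zeta)\,\pi_{n+m-1,n}(\eta) - \pi_{n+m-1,n}(\zeta)\,\pi_{n+m,n}(\eta)}{\zeta - \eta} \]
\[ W_{II,n+m}(\zeta, \eta) = \frac{h_{n+m,n}(\zeta)\,\pi_{n+m-1,n}(\eta) - h_{n+m-1,n}(\zeta)\,\pi_{n+m,n}(\eta)}{\zeta - \eta} \]
\[ W_{III,n+m}(\zeta, \eta) = \frac{h_{n+m,n}(\zeta)\,h_{n+m-1,n}(\eta) - h_{n+m-1,n}(\zeta)\,h_{n+m,n}(\eta)}{\zeta - \eta} \]

k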
det(xi − M) det(yi − M)
i=1
M
(cn+k−1,n 1 = n+k−1 det WI,n+k (xi , yj ) 1≤i,j ≤k , ˆ y) ˆ cj,n (x)( j =n )k
k det(yi − M) det(xi − M) i=1
= (−1) and 2k i=1
1 det(xi − M)
= (−1)
k(k−1) 2
M
(cn−1,n )k
(x, ˆ y) ˆ 2 (x) ˆ 2 (y) ˆ
det WI I,n (xi , yj ) 1≤i,j ≤k ,
M
k (cn−k−1,n )
(2k)!
k
n−1 l=n−k
cl
σ ∈S2k
det WI I I,n−k (xσ (i) , xσ (k+j ) ) 1≤i,j ≤k (xσ (1) , . . . , xσ (k) )(xσ (k+1) , . . . , xσ (2k) )
,
where $\hat x = (x_1, \ldots, x_k)$, $\hat y = (y_1, \ldots, y_k)$, where $c_{j,n} = -2\pi i \gamma_{j,n}^2$ with $\gamma_{j,n}$ the leading coefficient of the $j$th degree orthonormal polynomial with respect to $w_n$, and where $S_{2k}$ is the permutation group of the index set $\{1, \ldots, 2k\}$. There are also explicit formulas for averages containing a non-equal number of characteristic polynomials in the numerator and the denominator, in terms of these kernels, see [30] for details. Strahov and Fyodorov [30] used this connection, together with the Riemann-Hilbert (RH) approach, to establish universal behavior, as $n \to \infty$, for the averages of characteristic polynomials of random matrices taken from the unitary ensemble
\[ \frac{1}{\tilde Z_n} e^{-n\,\mathrm{tr}\, V(M)}\, dM, \tag{1.4} \]
in the bulk of the spectrum.

It is the goal of this paper to establish universal behavior as $n \to \infty$, for the kernels $W_{I,n+m}$, $W_{II,n+m}$ and $W_{III,n+m}$ (and thus also for the averages of characteristic polynomials) associated to the unitary ensemble (1.1), appropriately scaled at the origin such that the asymptotic eigenvalue density at the origin is 1. This scaling limit is called the origin of the spectrum by various authors, see for example [4, 6, 20, 25]. It will turn out that this universal behavior is described in terms of the Bessel kernels given in Table 2. For the case $\alpha = 0$, our results agree with those of Strahov and Fyodorov [30].

The issue of universality at the origin of the spectrum for the averages of characteristic polynomials, corresponding to Hermitian matrices of the unitary ensemble (1.1), was also considered by Akemann and Fyodorov [6]. They showed, on a physical level of rigor using Shohat's method, that the asymptotic behavior near the origin, as $n \to \infty$, of the orthogonal polynomials and their Cauchy transforms are expressed in terms of Bessel and Hankel functions, see [6] for details. However, explicit expressions for the universal behavior of the three kernels $W_{I,n+m}$, $W_{II,n+m}$ and $W_{III,n+m}$ at the origin of the spectrum have not been given yet, which we will determine on a mathematical level of rigor using the RH approach, as in [30].

In [6] it was assumed that the potential $V$ is an even polynomial with positive leading coefficient, and that the spectrum support is only one interval. In this paper, we can allow $V$ to be quite arbitrary, and assume the following:
\[ V : \mathbb{R} \to \mathbb{R} \ \text{is real analytic}, \tag{1.5} \]
\[ \lim_{|x| \to \infty} \frac{V(x)}{\log(x^2 + 1)} = +\infty, \tag{1.6} \]
\[ \psi(0) > 0, \tag{1.7} \]
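Table 2. Expressions for the limiting Bessel kernels. Here, $J_\nu$ is the usual $J$-Bessel function of order $\nu$, and $H_\nu^{(1)}$ and $H_\nu^{(2)}$ are the Hankel functions of order $\nu$ of the first and second kind, respectively. The right column denotes the expressions in case $\alpha = 0$

\[
\begin{array}{l|l|l}
 & \text{Limiting Bessel kernels} & \text{Case } \alpha = 0 \\ \hline
\mathbb{J}_{\alpha,I}(\zeta, \eta) &
\pi \zeta^{-\alpha+\frac12} \eta^{-\alpha+\frac12}\, \dfrac{J_{\alpha+\frac12}(\pi\zeta) J_{\alpha-\frac12}(\pi\eta) - J_{\alpha-\frac12}(\pi\zeta) J_{\alpha+\frac12}(\pi\eta)}{2(\zeta - \eta)} &
\dfrac{\sin \pi(\zeta - \eta)}{\pi(\zeta - \eta)} \\[2ex]
\mathbb{J}^{+}_{\alpha,II}(\zeta, \eta) &
\pi \zeta^{\alpha+\frac12} \eta^{-\alpha+\frac12}\, \dfrac{H^{(1)}_{\alpha+\frac12}(\pi\zeta) J_{\alpha-\frac12}(\pi\eta) - H^{(1)}_{\alpha-\frac12}(\pi\zeta) J_{\alpha+\frac12}(\pi\eta)}{4(\zeta - \eta)} &
-\dfrac{i e^{i\pi(\zeta - \eta)}}{2\pi(\zeta - \eta)} \\[2ex]
\mathbb{J}^{-}_{\alpha,II}(\zeta, \eta) &
-\pi \zeta^{\alpha+\frac12} \eta^{-\alpha+\frac12}\, \dfrac{H^{(2)}_{\alpha+\frac12}(\pi\zeta) J_{\alpha-\frac12}(\pi\eta) - H^{(2)}_{\alpha-\frac12}(\pi\zeta) J_{\alpha+\frac12}(\pi\eta)}{4(\zeta - \eta)} &
-\dfrac{i e^{-i\pi(\zeta - \eta)}}{2\pi(\zeta - \eta)} \\[2ex]
\mathbb{J}^{+}_{\alpha,III}(\zeta, \eta) &
\pi \zeta^{\alpha+\frac12} \eta^{\alpha+\frac12}\, \dfrac{H^{(1)}_{\alpha+\frac12}(\pi\zeta) H^{(1)}_{\alpha-\frac12}(\pi\eta) - H^{(1)}_{\alpha-\frac12}(\pi\zeta) H^{(1)}_{\alpha+\frac12}(\pi\eta)}{8(\zeta - \eta)} &
0 \\[2ex]
\mathbb{J}^{\pm}_{\alpha,III}(\zeta, \eta) &
-\pi \zeta^{\alpha+\frac12} \eta^{\alpha+\frac12}\, \dfrac{H^{(1)}_{\alpha+\frac12}(\pi\zeta) H^{(2)}_{\alpha-\frac12}(\pi\eta) - H^{(1)}_{\alpha-\frac12}(\pi\zeta) H^{(2)}_{\alpha+\frac12}(\pi\eta)}{8(\zeta - \eta)} &
\dfrac{i e^{i\pi(\zeta - \eta)}}{2\pi(\zeta - \eta)} \\[2ex]
\mathbb{J}^{-}_{\alpha,III}(\zeta, \eta) &
\pi \zeta^{\alpha+\frac12} \eta^{\alpha+\frac12}\, \dfrac{H^{(2)}_{\alpha+\frac12}(\pi\zeta) H^{(2)}_{\alpha-\frac12}(\pi\eta) - H^{(2)}_{\alpha-\frac12}(\pi\zeta) H^{(2)}_{\alpha+\frac12}(\pi\eta)}{8(\zeta - \eta)} &
0
\end{array}
\]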
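The right column of Table 2 can be checked numerically from the left column. The sketch below specializes each kernel to $\alpha = 0$, inserts the standard closed forms of the Bessel and Hankel functions of order $\pm\frac12$, and compares with the elementary expressions. The helper names, the grouping of the six kernels into one function, and the two test points are our own choices; the points are arbitrary and kept away from the negative real axis so that the principal square roots are consistent.

```python
import cmath

PI = cmath.pi


def amp(z):
    """Common factor sqrt(2 / (pi z)) of the order +-1/2 closed forms."""
    return cmath.sqrt(2 / (PI * z))


# Standard closed forms for orders +1/2 (suffix _p) and -1/2 (suffix _m).
def J_p(z): return amp(z) * cmath.sin(z)
def J_m(z): return amp(z) * cmath.cos(z)
def H1_p(z): return -1j * amp(z) * cmath.exp(1j * z)
def H1_m(z): return amp(z) * cmath.exp(1j * z)
def H2_p(z): return 1j * amp(z) * cmath.exp(-1j * z)
def H2_m(z): return amp(z) * cmath.exp(-1j * z)


def kernel(sign, denom, F_p, F_m, G_p, G_m, zeta, eta):
    """alpha = 0 case of a Table 2 entry:

    sign * pi * sqrt(zeta) * sqrt(eta)
      * [F_{1/2}(pi zeta) G_{-1/2}(pi eta) - F_{-1/2}(pi zeta) G_{1/2}(pi eta)]
      / (denom * (zeta - eta))
    """
    pref = sign * PI * cmath.sqrt(zeta) * cmath.sqrt(eta)
    num = F_p(PI * zeta) * G_m(PI * eta) - F_m(PI * zeta) * G_p(PI * eta)
    return pref * num / (denom * (zeta - eta))


zeta, eta = 0.7 + 0.3j, 0.2 - 0.5j   # arbitrary sample points
d = zeta - eta

# name -> (value from the Bessel/Hankel expression, elementary expression)
checks = {
    "J_I": (kernel(+1, 2, J_p, J_m, J_p, J_m, zeta, eta),
            cmath.sin(PI * d) / (PI * d)),
    "J+_II": (kernel(+1, 4, H1_p, H1_m, J_p, J_m, zeta, eta),
              -1j * cmath.exp(1j * PI * d) / (2 * PI * d)),
    "J-_II": (kernel(-1, 4, H2_p, H2_m, J_p, J_m, zeta, eta),
              -1j * cmath.exp(-1j * PI * d) / (2 * PI * d)),
    "J+_III": (kernel(+1, 8, H1_p, H1_m, H1_p, H1_m, zeta, eta), 0),
    "J+-_III": (kernel(-1, 8, H1_p, H1_m, H2_p, H2_m, zeta, eta),
                1j * cmath.exp(1j * PI * d) / (2 * PI * d)),
    "J-_III": (kernel(+1, 8, H2_p, H2_m, H2_p, H2_m, zeta, eta), 0),
}

errors = {name: abs(lhs - rhs) for name, (lhs, rhs) in checks.items()}
for name, err in errors.items():
    print(f"{name:8s} |difference| = {err:.2e}")
```

Since the identities are algebraic, the differences should be at the level of floating-point rounding; such a check is a convenient safeguard on the signs and on the denominators 2, 4 and 8 in the table.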
where ψ is the density of the equilibrium measure µV in the presence of the external field V , see [12, 13, 29]. The equilibrium measure µV has compact support, it is supported on a finite union of intervals (since V is real analytic), and it is absolutely continuous with respect to the Lebesgue measure, i.e. dµV (x) = ψ(x)dx. The importance of µV lies in the fact that its density ψ is the limiting mean eigenvalue density of the unitary ensemble (1.1). Assumption (1.7) then states that the mean eigenvalue density does not vanish at the origin. Our results are given by the following three theorems. We use C+ and C− to denote the upper and lower half-plane, respectively. Theorem 1.1. Fix m ∈ Z, let WI,n+m be the kernel given in Table 1, and let γj,n > 0 be the leading coefficient of the j th degree orthonormal polynomial with respect to wn . For ζ, η ∈ C,
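\[
\frac{1}{n\psi(0)}\,\gamma_{n+m-1,n}^{2}\,W_{I,n+m}\!\left(\frac{\zeta}{n\psi(0)},\frac{\eta}{n\psi(0)}\right)
= (n\psi(0))^{2\alpha}\,e^{nV(0)}\,e^{\frac{V'(0)}{2\psi(0)}(\zeta+\eta)}\left(\mathbb{J}_{\alpha,I}(\zeta,\eta)+O(1/n)\right), \tag{1.8}
\]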
as $n\to\infty$, where the Bessel kernel $\mathbb{J}_{\alpha,I}(\zeta,\eta)$ is given in Table 2. The error term holds uniformly for $\zeta$ and $\eta$ in compact subsets of $\mathbb{C}$.

Theorem 1.2. Fix $m\in\mathbb{Z}$, let $W_{II,n+m}$ be the kernel given in Table 1, and let $\gamma_{j,n}>0$ be the leading coefficient of the $j$th degree orthonormal polynomial with respect to $w_n$. Then the following holds:

(a) For $\zeta\in\mathbb{C}_+$ and $\eta\in\mathbb{C}$,
\[
\gamma_{n+m-1,n}^{2}\,\frac{\zeta-\eta}{n\psi(0)}\,W_{II,n+m}\!\left(\frac{\zeta}{n\psi(0)},\frac{\eta}{n\psi(0)}\right)
= (\zeta-\eta)\,e^{-\frac{V'(0)}{2\psi(0)}(\zeta-\eta)}\,\mathbb{J}^{+}_{\alpha,II}(\zeta,\eta)+O(1/n), \tag{1.9}
\]
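as $n\to\infty$, where the Bessel kernel $\mathbb{J}^{+}_{\alpha,II}(\zeta,\eta)$ is given in Table 2. The error term holds uniformly for $\zeta$ and $\eta$ in compact subsets of $\mathbb{C}_+$ and $\mathbb{C}$, respectively.

(b) For $\zeta\in\mathbb{C}_-$ and $\eta\in\mathbb{C}$,
\[
\gamma_{n+m-1,n}^{2}\,\frac{\zeta-\eta}{n\psi(0)}\,W_{II,n+m}\!\left(\frac{\zeta}{n\psi(0)},\frac{\eta}{n\psi(0)}\right)
= (\zeta-\eta)\,e^{-\frac{V'(0)}{2\psi(0)}(\zeta-\eta)}\,\mathbb{J}^{-}_{\alpha,II}(\zeta,\eta)+O(1/n), \tag{1.10}
\]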
as n → ∞, where the Bessel kernel J− α,I I (ζ, η) is given in Table 2. The error term holds uniformly for ζ and η in compact subsets of C− and C, respectively. Theorem 1.3. Fix m ∈ Z, let WI I I,n+m be the kernel given in Table 1, and let γj,n > 0 be the leading coefficient of the j -th degree orthonormal polynomial with respect to wn . Then the following holds:
M. Vanlessen
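(a) For $\zeta,\eta\in\mathbb{C}_+$,
\[
\frac{1}{n\psi(0)}\,\gamma_{n+m-1,n}^{2}\,W_{III,n+m}\!\left(\frac{\zeta}{n\psi(0)},\frac{\eta}{n\psi(0)}\right)
= \left(\frac{1}{n\psi(0)}\right)^{2\alpha} e^{-nV(0)}\,e^{-\frac{V'(0)}{2\psi(0)}(\zeta+\eta)}\left(\mathbb{J}^{+}_{\alpha,III}(\zeta,\eta)+O(1/n)\right), \tag{1.11}
\]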
as n → ∞, where the Bessel kernel J+ α,I I I (ζ, η) is given in Table 2. The error term holds uniformly for ζ and η in compact subsets of C+ . (b) For ζ ∈ C+ and η ∈ C− ,
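\[
\frac{1}{n\psi(0)}\,\gamma_{n+m-1,n}^{2}\,W_{III,n+m}\!\left(\frac{\zeta}{n\psi(0)},\frac{\eta}{n\psi(0)}\right)
= \left(\frac{1}{n\psi(0)}\right)^{2\alpha} e^{-nV(0)}\,e^{-\frac{V'(0)}{2\psi(0)}(\zeta+\eta)}\left(\mathbb{J}^{\pm}_{\alpha,III}(\zeta,\eta)+O(1/n)\right), \tag{1.12}
\]
as $n\to\infty$, where the Bessel kernel $\mathbb{J}^{\pm}_{\alpha,III}(\zeta,\eta)$ is given in Table 2. The error term holds uniformly for $\zeta$ and $\eta$ in compact subsets of $\mathbb{C}_+$ and $\mathbb{C}_-$, respectively.

(c) For $\zeta,\eta\in\mathbb{C}_-$,
\[
\frac{1}{n\psi(0)}\,\gamma_{n+m-1,n}^{2}\,W_{III,n+m}\!\left(\frac{\zeta}{n\psi(0)},\frac{\eta}{n\psi(0)}\right)
= \left(\frac{1}{n\psi(0)}\right)^{2\alpha} e^{-nV(0)}\,e^{-\frac{V'(0)}{2\psi(0)}(\zeta+\eta)}\left(\mathbb{J}^{-}_{\alpha,III}(\zeta,\eta)+O(1/n)\right), \tag{1.13}
\]
as $n\to\infty$, where the Bessel kernel $\mathbb{J}^{-}_{\alpha,III}(\zeta,\eta)$ is given in Table 2. The error term holds uniformly for $\zeta$ and $\eta$ in compact subsets of $\mathbb{C}_-$.

Remark 1.4. In case $\alpha=0$ we can simplify the expressions for the limiting Bessel kernels, using the facts that, see [1]
\[
J_{\frac12}(z)=\sqrt{\frac{2}{\pi z}}\,\sin z,\qquad
J_{-\frac12}(z)=\sqrt{\frac{2}{\pi z}}\,\cos z,\qquad
H^{(1)}_{\frac12}(z)=-i\sqrt{\frac{2}{\pi z}}\,e^{iz},
\]
\[
H^{(1)}_{-\frac12}(z)=\sqrt{\frac{2}{\pi z}}\,e^{iz},\qquad
H^{(2)}_{\frac12}(z)=i\sqrt{\frac{2}{\pi z}}\,e^{-iz},\qquad
H^{(2)}_{-\frac12}(z)=\sqrt{\frac{2}{\pi z}}\,e^{-iz}.
\]
We then obtain the kernels given in the right column of Table 2. This is in agreement with the results of Strahov and Fyodorov [30]. Note however that in [30] the second and the third kernel are multiplied with an extra factor $-2\pi i$.

Remark 1.5. As noted before, it has been shown by Strahov and Fyodorov [30], see also [7], that
\[
\left\langle \frac{\det\left(\frac{\eta}{n\psi(0)}-M\right)}{\det\left(\frac{\zeta}{n\psi(0)}-M\right)} \right\rangle_{M}
= 2\pi i\,\gamma_{n-1,n}^{2}\,\frac{\zeta-\eta}{n\psi(0)}\,W_{II,n}\!\left(\frac{\zeta}{n\psi(0)},\frac{\eta}{n\psi(0)}\right).
\]
Then it follows from (1.9), Table 2 and [1, Formula 9.1.3], that for $\zeta\in\mathbb{C}_+$,
\[
\left\langle \frac{\det\left(\frac{\zeta}{n\psi(0)}-M\right)}{\det\left(\frac{\zeta}{n\psi(0)}-M\right)} \right\rangle_{M}
= \frac{\pi^{2}\zeta}{2}\left(J_{\alpha+\frac12}(\pi\zeta)Y_{\alpha-\frac12}(\pi\zeta)-J_{\alpha-\frac12}(\pi\zeta)Y_{\alpha+\frac12}(\pi\zeta)\right)+O(1/n),
\]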
as n → ∞, where Yν is the Bessel function of the second kind of order ν. By [1, Formula 9.1.16], the right-hand side of this equation is 1 + O(1/n), as it should be. Similarly we find the same result for ζ ∈ C− . The proofs of these theorems are based on the characterization of orthogonal polynomials with respect to the weight (1.2), together with their Cauchy transforms via a 2 × 2 matrix RH problem for Y , due to Fokas, Its and Kitaev [16], and on an application of the Deift/Zhou steepest descent method [15] for matrix RH problems. See [10, 22] for an excellent exposition. This technique was used before by Deift et al. [13] to establish universality for the local eigenvalue correlations in unitary random matrix ensembles (1.4) in the bulk of the spectrum. Strahov and Fyodorov [30] used this method also to establish universality for the three kernels WI,n+m , WI I,n+m and WI I I,n+m in the bulk of the spectrum. In a previous paper [25] together with A.B.J. Kuijlaars, the asymptotic analysis of the RH problem for Y , corresponding to the weight (1.2), has already been done. An essential step in the analysis is the construction of the parametrix near the origin, which gives us the behavior of Y near the origin. Here, the Bessel functions come in. In [25], the behavior of the first column of Y (with the orthogonal polynomials as entries) was determined near the origin for positive (real) values, and used to establish universality for the local eigenvalue correlations at the origin of the spectrum, in terms of a Bessel kernel. Here, we determine the behavior of the first column of Y , as well as the second column of Y (with the Cauchy transforms of orthogonal polynomials as entries) in a full neighborhood of the origin, and use this in a similar fashion to prove our results. The rest of the paper is organized as follows. In Sect. 2 we give a short overview of the asymptotic analysis of the corresponding RH problem for Y . In Sect. 
3 we determine the behavior of Y near the origin, in terms of Bessel functions. This will be used in the last section to prove our results. 2. The Corresponding RH Problem In this section we recall the matrix RH problem for Y , due to Fokas, Its and Kitaev [16], which characterizes the orthogonal polynomials with respect to the weight (1.2), together with their Cauchy transforms. We also give a short overview of the Deift/Zhou steepest descent method [10, 15] to obtain the asymptotic behavior of Y . For details we refer to [13, 25], see also [10, 14]. Our point of interest lies in the asymptotic behavior, as n → ∞, of the orthogonal polynomials πn+m,n of degree n + m with respect to the weight wn , for any fixed m ∈ Z. So, in contrast to the RH problem in [13, 25], we have to modify the asymptotic condition at infinity of the RH problem, and leave the jump condition unchanged. However, this will not create any problems. We seek a 2 × 2 matrix valued function Y = Y (n+m,n) that satisfies the following RH problem, cf. [10, 13, 14, 16, 25]. RH problem for Y . (a) Y : C \ R → C2×2 is analytic. (b) Y possesses continuous boundary values for x ∈ R \ {0} denoted by Y+ (x) and Y− (x), where Y+ (x) and Y− (x) denote the limiting values of Y (z ) as z approaches x from above and below, respectively, and
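\[
Y_+(x)=Y_-(x)\begin{pmatrix}1 & |x|^{2\alpha}e^{-nV(x)}\\ 0 & 1\end{pmatrix},\qquad\text{for } x\in\mathbb{R}\setminus\{0\}. \tag{2.1}
\]
(c) $Y$ has the following asymptotic behavior at infinity:
\[
Y(z)=\left(I+O(1/z)\right)\begin{pmatrix}z^{n+m} & 0\\ 0 & z^{-(n+m)}\end{pmatrix},\qquad\text{as } z\to\infty. \tag{2.2}
\]
(d) $Y$ has the following behavior near the origin:
\[
Y(z)=\begin{cases}
O\begin{pmatrix}1 & |z|^{2\alpha}\\ 1 & |z|^{2\alpha}\end{pmatrix}, & \text{if } \alpha<0,\\[2ex]
O\begin{pmatrix}1 & 1\\ 1 & 1\end{pmatrix}, & \text{if } \alpha>0,
\end{cases} \tag{2.3}
\]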
as z → 0, z ∈ C \ R. Remark 2.1. The O-terms in condition
(d) of the RH problem are to be taken entrywise. So for example $Y(z)=O\begin{pmatrix}1 & |z|^{2\alpha}\\ 1 & |z|^{2\alpha}\end{pmatrix}$ means that $Y_{11}(z)=O(1)$, $Y_{12}(z)=O(|z|^{2\alpha})$, etc. This condition is used to control the behavior of $Y$ near the origin. In the following we will not go into detail about this condition, and refer to [23, 32] for details.

The unique solution of the RH problem for $Y$, see [16] (for condition (d) we refer to [23]), is then given by
\[
Y(z)=Y^{(n+m,n)}(z)=\begin{pmatrix}
\pi_{n+m,n}(z) & h_{n+m,n}(z)\\
-2\pi i\,\gamma_{n+m-1,n}^{2}\,\pi_{n+m-1,n}(z) & -2\pi i\,\gamma_{n+m-1,n}^{2}\,h_{n+m-1,n}(z)
\end{pmatrix}, \tag{2.4}
\]
for $z\in\mathbb{C}\setminus\mathbb{R}$, where $\pi_{j,n}$ is the $j$th degree monic orthogonal polynomial with respect to $w_n$, where $\gamma_{j,n}$ is the leading coefficient of the $j$th degree orthonormal polynomial with respect to $w_n$, and where $h_{j,n}$ is the Cauchy transform of $\pi_{j,n}$, see (1.3).

Remark 2.2. The superscript $n+m$ in the notation $Y^{(n+m,n)}$ corresponds to the asymptotic condition (c) at infinity of the RH problem, which yields that the orthogonal polynomials in the solution (2.4) of the RH problem have degree $n+m$ and $n+m-1$. The superscript $n$ corresponds to the jump condition (b), which yields that the orthogonality is with respect to $w_n$.

Remark 2.3. We note that the first column of $Y$ contains the orthogonal polynomials, and the second column their Cauchy transforms. So, from Table 1 and (2.4), the kernel $W_{I,n+m}$ depends only on the first column of $Y$, the kernel $W_{II,n+m}$ on both the first and the second column, and the kernel $W_{III,n+m}$ only on the second column, as follows:
\[
W_{I,n+m}(\zeta,\eta)=\frac{1}{\gamma_{n+m-1,n}^{2}}\,\frac{1}{-2\pi i(\zeta-\eta)}\det\begin{pmatrix}Y_{11}(\zeta) & Y_{11}(\eta)\\ Y_{21}(\zeta) & Y_{21}(\eta)\end{pmatrix}, \tag{2.5}
\]
\[
W_{II,n+m}(\zeta,\eta)=\frac{1}{\gamma_{n+m-1,n}^{2}}\,\frac{1}{-2\pi i(\zeta-\eta)}\det\begin{pmatrix}Y_{12}(\zeta) & Y_{11}(\eta)\\ Y_{22}(\zeta) & Y_{21}(\eta)\end{pmatrix}, \tag{2.6}
\]
and
\[
W_{III,n+m}(\zeta,\eta)=\frac{1}{\gamma_{n+m-1,n}^{2}}\,\frac{1}{-2\pi i(\zeta-\eta)}\det\begin{pmatrix}Y_{12}(\zeta) & Y_{12}(\eta)\\ Y_{22}(\zeta) & Y_{22}(\eta)\end{pmatrix}. \tag{2.7}
\]
The asymptotic analysis of the RH problem for $Y$ includes a series of transformations $Y\mapsto T\mapsto S\mapsto R$ to obtain a RH problem for $R$ normalized at infinity (i.e. $R(z)\to I$ as $z\to\infty$), and with jumps uniformly close to the identity matrix, as $n\to\infty$. Then [10, 13, 14], $R$ is also uniformly close to the identity matrix, and by unfolding the series of transformations we obtain the asymptotic behavior of $Y$.

Before we can give an overview of the series of transformations, we need some properties of the equilibrium measure $\mu_V$ for $V$. Here, we closely follow [25], see also [12, 13]. The support of $\mu_V$ consists of a finite union of intervals, say $\bigcup_{j=1}^{N+1}[b_{j-1},a_j]$, and we define its interior as $J=\bigcup_{j=1}^{N+1}(b_{j-1},a_j)$. The $N+1$ intervals of $J$ are referred to as the bands. The density $\psi$ of the equilibrium measure is given by
\[
\psi(x)=\frac{1}{2\pi i}R_+^{1/2}(x)h(x),\qquad\text{for } x\in J, \tag{2.8}
\]
with $h$ real analytic on $\mathbb{R}$, and where $R$ is the $2(N+1)$th degree monic polynomial with the endpoints $a_i$, $b_j$ of $J$ as zeros,
\[
R(z)=\prod_{j=1}^{N+1}(z-b_{j-1})(z-a_j). \tag{2.9}
\]
We use $R^{1/2}$ to denote the branch of $\sqrt{R}$ which behaves like $z^{N+1}$ as $z\to\infty$, and which is defined and analytic on $\mathbb{C}\setminus\bar J$. In (2.8), $R_+^{1/2}$ is used to denote the boundary value of $R^{1/2}$ on $J$ from above.

The equilibrium measure satisfies the Euler-Lagrange variational conditions, which state that there exists a constant $\ell\in\mathbb{R}$ such that
\[
2\int\log|x-s|\,\psi(s)\,ds-V(x)=\ell,\qquad\text{for } x\in\bar J, \tag{2.10}
\]
\[
2\int\log|x-s|\,\psi(s)\,ds-V(x)\le\ell,\qquad\text{for } x\in\mathbb{R}\setminus\bar J. \tag{2.11}
\]
If the inequality in (2.11) is strict for every $x\in\mathbb{R}\setminus\bar J$, and if $h(x)\neq 0$ for every $x\in\bar J$, then $V$ is called regular. Otherwise, there are a finite number of points, called singular points of $V$, such that $h$ vanishes there, i.e. a singular point in $\bar J$, or such that we obtain equality in (2.11), i.e. a singular point in $\mathbb{R}\setminus\bar J$.

Let $\sigma_3=\begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix}$ be the Pauli matrix. Following [13], see also [25], we define the $2\times 2$ matrix valued function
\[
T(z)=e^{-(n+m)\frac{\ell}{2}\sigma_3}\,Y(z)\,e^{(n+m)\frac{\ell}{2}\sigma_3}\,e^{-(n+m)g(z)\sigma_3},\qquad\text{for } z\in\mathbb{C}\setminus\mathbb{R}, \tag{2.12}
\]
where $\ell$ is the constant that appears in the Euler-Lagrange variational conditions (2.10) and (2.11), and where the scalar function $g$ is defined by
\[
g(z)=\int\log(z-s)\,\psi(s)\,ds,\qquad\text{for } z\in\mathbb{C}\setminus(-\infty,a_{N+1}]. \tag{2.13}
\]
Note the small difference in the definition (2.12) of $T$ with its definition in [13, 25], which comes from the modified asymptotic condition (c) of the RH problem for $Y$. For
the case $m=0$, both definitions agree. It is known [13, 25] that $T$ is normalized at infinity and satisfies the jump relation $T_+(x)=T_-(x)v^{(1)}(x)$ for $x\in\mathbb{R}\setminus\{0\}$, where
\[
v^{(1)}(x)=\begin{cases}
\begin{pmatrix}e^{-(n+m)(g_+(x)-g_-(x))} & |x|^{2\alpha}e^{mV(x)}\\ 0 & e^{(n+m)(g_+(x)-g_-(x))}\end{pmatrix}, & x\in\bar J\setminus\{0\},\\[3ex]
\begin{pmatrix}e^{-2\pi i(n+m)\Omega_j} & |x|^{2\alpha}e^{mV(x)}e^{(n+m)(g_+(x)+g_-(x)-V(x)-\ell)}\\ 0 & e^{2\pi i(n+m)\Omega_j}\end{pmatrix}, & x\in(a_j,b_j),\\[3ex]
\begin{pmatrix}1 & |x|^{2\alpha}e^{mV(x)}e^{(n+m)(g_+(x)+g_-(x)-V(x)-\ell)}\\ 0 & 1\end{pmatrix}, & x<b_0 \text{ or } x>a_{N+1}.
\end{cases} \tag{2.14}
\]
The constant $\Omega_j$ is the total $\mu_V$-mass of the $N+1-j$ largest bands.

The second transformation is referred to as the opening of the lens. Define [25] for every $z\in\mathbb{C}\setminus\mathbb{R}$ lying in the region of analyticity of $h$ the scalar function
\[
\phi(z)=\frac12\int_z^{a_{N+1}}R^{1/2}(s)h(s)\,ds, \tag{2.15}
\]
where the path of integration does not cross the real axis. Then [25], on the bands, $\phi$ is purely imaginary and satisfies
\[
2\phi_+(x)=-2\phi_-(x)=g_+(x)-g_-(x),\qquad\text{for } x\in J, \tag{2.16}
\]
so that $2\phi$ and $-2\phi$ provide analytic extensions of $g_+-g_-$ into the upper half-plane and lower half-plane, respectively. The opening of the lens is based on the factorization of the jump matrix $v^{(1)}$ on the bands, see (2.14), into the following product of three matrices, cf. [25]:
\[
\begin{pmatrix}e^{-(n+m)(g_+(x)-g_-(x))} & |x|^{2\alpha}e^{mV(x)}\\ 0 & e^{(n+m)(g_+(x)-g_-(x))}\end{pmatrix}
=\begin{pmatrix}1 & 0\\ |x|^{-2\alpha}e^{-mV(x)}e^{-2(n+m)\phi_-(x)} & 1\end{pmatrix}
\begin{pmatrix}0 & |x|^{2\alpha}e^{mV(x)}\\ -|x|^{-2\alpha}e^{-mV(x)} & 0\end{pmatrix}
\begin{pmatrix}1 & 0\\ |x|^{-2\alpha}e^{-mV(x)}e^{-2(n+m)\phi_+(x)} & 1\end{pmatrix}.
\]
We take an analytic continuation of the factor $|x|^{2\alpha}e^{mV(x)}$ by defining for $z$ in the region of analyticity of $V$,
\[
\omega(z)=\begin{cases}(-z)^{2\alpha}e^{mV(z)}, & \text{if } \operatorname{Re} z<0,\\ z^{2\alpha}e^{mV(z)}, & \text{if } \operatorname{Re} z>0,\end{cases} \tag{2.17}
\]
with principal branches of powers. We now open the lens. Let $\Sigma$ be the lens shaped contour, as shown in Fig. 1, going through the endpoints $a_i$, $b_j$ of $J$, going through the origin, and also going through the singular points of $V$ in $J$. Of course we take the lens shaped regions to lie within the region of analyticity of $\phi$ and $V$.

Fig. 1. Part of the contour $\Sigma$

Define, cf. [25]
\[
S(z)=\begin{cases}
T(z), & \text{for } z \text{ outside the lens},\\[1ex]
T(z)\begin{pmatrix}1 & 0\\ -\omega(z)^{-1}e^{-2(n+m)\phi(z)} & 1\end{pmatrix}, & \text{for } z \text{ in the upper parts of the lens},\\[2ex]
T(z)\begin{pmatrix}1 & 0\\ \omega(z)^{-1}e^{-2(n+m)\phi(z)} & 1\end{pmatrix}, & \text{for } z \text{ in the lower parts of the lens}.
\end{cases} \tag{2.18}
\]
As for the first transformation $Y\mapsto T$, there is a small difference in the definition (2.18) for $S$ with its definition in [25], which comes from the modified asymptotic condition (c) of the RH problem for $Y$. For the case $m=0$, again both definitions agree. Then [25], the matrix valued function $S$ is normalized at infinity and satisfies the jump relation $S_+(z)=S_-(z)v^{(2)}(z)$ for $z\in\Sigma$, where
\[
v^{(2)}(z)=\begin{cases}
\begin{pmatrix}1 & 0\\ \omega(z)^{-1}e^{-2(n+m)\phi(z)} & 1\end{pmatrix}, & z\in\Sigma\cap\mathbb{C}_\pm,\\[3ex]
\begin{pmatrix}0 & |z|^{2\alpha}e^{mV(z)}\\ -|z|^{-2\alpha}e^{-mV(z)} & 0\end{pmatrix}, & z\in J\setminus\{0\},\\[3ex]
\begin{pmatrix}e^{-2\pi i(n+m)\Omega_j} & |z|^{2\alpha}e^{mV(z)}e^{(n+m)(g_+(z)+g_-(z)-V(z)-\ell)}\\ 0 & e^{2\pi i(n+m)\Omega_j}\end{pmatrix}, & z\in(a_j,b_j),\\[3ex]
\begin{pmatrix}1 & |z|^{2\alpha}e^{mV(z)}e^{(n+m)(g_+(z)+g_-(z)-V(z)-\ell)}\\ 0 & 1\end{pmatrix}, & z<b_0 \text{ or } z>a_{N+1}.
\end{cases} \tag{2.19}
\]
For $z$ in a neighborhood of a regular point $x\in J$ we have, cf. [25],
\[
\operatorname{Re}\phi(z)>0,\qquad\text{if } \operatorname{Im} z\neq 0,
\]
and for every regular point in $\mathbb{R}\setminus\bar J$ we have from the Euler-Lagrange variational condition (2.11), cf. [13],
\[
g_+(x)+g_-(x)-V(x)-\ell<0,\qquad\text{for } x\in\mathbb{R}\setminus\bar J.
\]
So, we expect that the leading order asymptotics are determined by a RH problem for $P^{(\infty)}$, normalized at infinity, that satisfies the jump relation $P_+^{(\infty)}(x)=P_-^{(\infty)}(x)v^{(\infty)}(x)$ for $x\in(b_0,a_{N+1})$, where
\[
v^{(\infty)}(x)=\begin{cases}
\begin{pmatrix}0 & |x|^{2\alpha}e^{mV(x)}\\ -|x|^{-2\alpha}e^{-mV(x)} & 0\end{pmatrix}, & \text{for } x\in J\setminus\{0\},\\[3ex]
\begin{pmatrix}e^{-2\pi i(n+m)\Omega_j} & 0\\ 0 & e^{2\pi i(n+m)\Omega_j}\end{pmatrix}, & \text{for } x\in(a_j,b_j),\ j=1,\dots,N.
\end{cases} \tag{2.20}
\]
The solution of this RH problem is referred to as the parametrix for the outside region, and is constructed using a Szegő function on multiple intervals associated to $|x|^{2\alpha}e^{mV(x)}$, cf. [25], and using Riemann-Theta functions, cf. [13], see also [11]. For our purpose here, we do not need the explicit formulas for $P^{(\infty)}$, and refer to [13, 25] for details.

Before we can do the third transformation, we have to be careful since the jump matrices for $S$ and $P^{(\infty)}$ are not uniformly close to each other near 0, near the endpoints $a_i$, $b_j$ of $J$, and near the singular points of $V$. To solve this problem, we surround these points by small non-overlapping disks, say of radius $\delta>0$, and within each disk we construct a parametrix $P$ satisfying the following local RH problem.

RH Problem for $P$ near $x_0$ where $x_0$ is 0, an endpoint of $J$, or a singular point of $V$.
(a) $P(z)$ is defined and analytic for $z\in\{|z-x_0|<\delta_0\}\setminus\Sigma$ for some $\delta_0>\delta$.
(b) $P$ satisfies the same jump relations as $S$ does on $\Sigma\cap\{|z-x_0|<\delta\}$.
(c) There is $\kappa>0$ such that, as $n\to\infty$,
\[
P(z)\left(P^{(\infty)}\right)^{-1}(z)=I+O(1/n^{\kappa}),\qquad\text{uniformly for } |z-x_0|=\delta. \tag{2.21}
\]
(d) $SP^{-1}$ has a removable singularity at $x_0$.

For regular endpoints and the origin we can take $\kappa=1$ in (2.21). It is known that this local RH problem is solvable for every $x_0$. For the endpoints of $J$ and the singular points of $V$ we refer to [13], for the origin we refer to [25]. For our purpose here, it suffices to know the explicit formula for the parametrix near the origin.

We will now give the explicit formula for the parametrix $P$ near the origin, see [25, Sect. 5] for details, see also [32, Sect. 4]. This is an essential step in the asymptotic analysis of the RH problem since it allows us to determine the behavior of $Y$ near the origin, which will be the main tool to prove our results. Introduce the scalar function
\[
f(z)=\begin{cases}i\phi(z)-i\phi_+(0), & \text{if } \operatorname{Im} z>0,\\ -i\phi(z)-i\phi_+(0), & \text{if } \operatorname{Im} z<0,\end{cases} \tag{2.22}
\]
which is defined and analytic in a neighborhood of the origin. The behavior of $f$ near the origin [25, Sect. 5] is given by
\[
f(z)=\pi\psi(0)z+O(z^{2}),\qquad\text{as } z\to 0. \tag{2.23}
\]
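The leading coefficient in (2.23) can be checked directly from the definitions. The following is only a sketch of that computation, under the assumption that the boundary values of $\phi$ may be differentiated along the band containing the origin; it uses nothing beyond (2.8), (2.15), (2.16) and (2.22).

```latex
% Sketch: f(0) = 0 and f'(0) = \pi\psi(0), from (2.8), (2.15), (2.16), (2.22).
\begin{align*}
\phi'(z) &= -\tfrac12 R^{1/2}(z)h(z)
  && \text{differentiating (2.15) in its lower limit},\\
\phi_+'(x) &= -\tfrac12 R_+^{1/2}(x)h(x) = -\pi i\,\psi(x), \quad x\in J
  && \text{by (2.8)},\\
\phi_-'(x) &= -\phi_+'(x) = \pi i\,\psi(x), \quad x\in J
  && \text{by (2.16)}.
\end{align*}
% Approaching the origin from Im z > 0, where f = i\phi - i\phi_+(0):
\[
f(0) = i\phi_+(0) - i\phi_+(0) = 0, \qquad
f'(0) = i\phi_+'(0) = i\cdot(-\pi i\,\psi(0)) = \pi\psi(0).
\]
% Approaching from Im z < 0, where f = -i\phi - i\phi_+(0) and \phi_-(0) = -\phi_+(0):
\[
f(0) = -i\phi_-(0) - i\phi_+(0) = 0, \qquad
f'(0) = -i\phi_-'(0) = -i\cdot(\pi i\,\psi(0)) = \pi\psi(0).
\]
```

Both branches of (2.22) therefore agree in value and first derivative at the origin, which is consistent with $f$ being analytic there with expansion (2.23).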
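Let $U_\delta$ be the disk with radius $\delta$ around the origin, with $\delta>0$ sufficiently small such that $U_\delta$ lies in the region of analyticity of $\phi$ and $V$. Since $f'(0)=\pi\psi(0)>0$ we can choose $\delta$ also sufficiently small such that $f$ is a conformal mapping on $U_\delta$ onto a convex neighborhood of 0. We have that $f(x)$ is real and positive (negative) for $x\in U_\delta$ positive (negative). Decompose $f(U_\delta)$ into eight regions I–VIII, as shown in the right of Fig. 2, divided by eight straight rays
\[
\Gamma_j=\left\{\zeta\in\mathbb{C}\ \middle|\ \arg\zeta=(j-1)\frac{\pi}{4}\right\},\qquad j=1,\dots,8.
\]
This in turn divides the disk $U_\delta$ into eight regions I'–VIII' as the pre-images under $f$ of I–VIII, as shown in the left of Fig. 2. Sector I' and IV' correspond to the right and left upper part of the lens inside $U_\delta$, respectively, sector V' and VIII' to the left and right lower part of the lens inside $U_\delta$, respectively.

Let $\Psi_\alpha$ be the piecewise analytic matrix valued function [32, Sect. 4], see also [25, Sect. 5], that satisfies the jump relation $\Psi_{\alpha,+}(\zeta)=\Psi_{\alpha,-}(\zeta)v_\alpha(\zeta)$ for $\zeta\in\Gamma_j$, where
\[
v_\alpha(\zeta)=\begin{cases}
\begin{pmatrix}0 & 1\\ -1 & 0\end{pmatrix}, & \text{for } \zeta\in\Gamma_1\cup\Gamma_5,\\[2ex]
\begin{pmatrix}1 & 0\\ e^{-2\pi i\alpha} & 1\end{pmatrix}, & \text{for } \zeta\in\Gamma_2\cup\Gamma_6,\\[2ex]
e^{\pi i\alpha\sigma_3}, & \text{for } \zeta\in\Gamma_3\cup\Gamma_7,\\[1ex]
\begin{pmatrix}1 & 0\\ e^{2\pi i\alpha} & 1\end{pmatrix}, & \text{for } \zeta\in\Gamma_4\cup\Gamma_8,
\end{cases}
\]
and that has the following behavior near the origin: if $\alpha<0$,
\[
\Psi_\alpha(\zeta)=O\begin{pmatrix}|\zeta|^{\alpha} & |\zeta|^{\alpha}\\ |\zeta|^{\alpha} & |\zeta|^{\alpha}\end{pmatrix},\qquad\text{as } \zeta\to 0,
\]
and if $\alpha>0$,
\[
\Psi_\alpha(\zeta)=\begin{cases}
O\begin{pmatrix}|\zeta|^{\alpha} & |\zeta|^{-\alpha}\\ |\zeta|^{\alpha} & |\zeta|^{-\alpha}\end{pmatrix}, & \text{as } \zeta\to 0 \text{ for } \frac{\pi}{4}<|\arg\zeta|<\frac{3\pi}{4},\\[3ex]
O\begin{pmatrix}|\zeta|^{-\alpha} & |\zeta|^{-\alpha}\\ |\zeta|^{-\alpha} & |\zeta|^{-\alpha}\end{pmatrix}, & \text{as } \zeta\to 0 \text{ for } 0<|\arg\zeta|<\frac{\pi}{4} \text{ and } \frac{3\pi}{4}<|\arg\zeta|<\pi.
\end{cases}
\]

Fig. 2. Decomposition of $U_\delta$ and $f(U_\delta)$ into eight regions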
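The behavior of $\Psi_\alpha$ near the origin will ensure that part (d) of the RH problem for $P$ is satisfied, see [25, 32] for details. The matrix valued function $\Psi_\alpha$ is constructed out of Bessel functions of order $\alpha\pm\frac12$, and its explicit formula for $0<\arg\zeta<\frac{\pi}{4}$ is given by
\[
\Psi_\alpha(\zeta)=\frac12\sqrt{\pi}\,\zeta^{1/2}\begin{pmatrix}
H^{(2)}_{\alpha+\frac12}(\zeta) & -iH^{(1)}_{\alpha+\frac12}(\zeta)\\[1ex]
H^{(2)}_{\alpha-\frac12}(\zeta) & -iH^{(1)}_{\alpha-\frac12}(\zeta)
\end{pmatrix}e^{-(\alpha+\frac14)\pi i\sigma_3}. \tag{2.24}
\]
For $\frac{\pi}{4}<\arg\zeta<\frac{\pi}{2}$ it is given by
\[
\Psi_\alpha(\zeta)=\begin{pmatrix}
\sqrt{\pi}\,\zeta^{1/2}I_{\alpha+\frac12}(\zeta e^{-\frac{\pi i}{2}}) & -\frac{1}{\sqrt{\pi}}\,\zeta^{1/2}K_{\alpha+\frac12}(\zeta e^{-\frac{\pi i}{2}})\\[1ex]
-i\sqrt{\pi}\,\zeta^{1/2}I_{\alpha-\frac12}(\zeta e^{-\frac{\pi i}{2}}) & -\frac{i}{\sqrt{\pi}}\,\zeta^{1/2}K_{\alpha-\frac12}(\zeta e^{-\frac{\pi i}{2}})
\end{pmatrix}e^{-\frac12\pi i\alpha\sigma_3}, \tag{2.25}
\]
where $I_\nu$ and $K_\nu$ are the modified Bessel functions of order $\nu$. See [32, Sect. 4] for the explicit expressions of $\Psi_\alpha$ in the other sectors of the complex plane. Also define the piecewise analytic function $W$ by
\[
W(z)=\begin{cases}
z^{\alpha}e^{m\frac{V(z)}{2}}, & \text{if } z\in \text{III', IV', V', VI'},\\[1ex]
(-z)^{\alpha}e^{m\frac{V(z)}{2}}, & \text{if } z\in \text{I', II', VII', VIII'}.
\end{cases} \tag{2.26}
\]
And finally, define the following matrix valued function, analytic in a neighborhood of the disk $U_\delta$,
\[
E_{n+m,n}(z)=E(z)\,e^{(n+m)\phi_+(0)\sigma_3}\,e^{-\frac{\pi i}{4}\sigma_3}\,\frac{1}{\sqrt2}\begin{pmatrix}1 & i\\ i & 1\end{pmatrix}, \tag{2.27}
\]
where the matrix valued function $E$ is given by [25, (5.27)–(5.30)]. Then, cf. [25, Sect. 5], the parametrix near the origin is defined by
\[
P(z)=E_{n+m,n}(z)\,\Psi_\alpha\bigl((n+m)f(z)\bigr)\,W(z)^{-\sigma_3}\,e^{-(n+m)\phi(z)\sigma_3}. \tag{2.28}
\]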
Remark 2.4. In contrast to [25, Sect. 5], we evaluate the matrix valued function $\Psi_\alpha$ in $(n+m)f(z)$ instead of in $nf(z)$. This comes from the fact that, in order that the matching condition (c) of the RH problem for $P$ is satisfied, we need to cancel out the factor $e^{-(n+m)\phi(z)\sigma_3}$ instead of $e^{-n\phi(z)}$. This follows in essence from the modified asymptotic condition (c) of the RH problem for $Y$. For the case $m=0$, the definition (2.28) of the parametrix $P$ near the origin agrees with its definition in [25, Sect. 5].

Now, we have all the ingredients to give the third transformation. Define [13, 25] the $2\times 2$ matrix valued function $R$ as
\[
R(z)=\begin{cases}
S(z)\left(P^{(\infty)}\right)^{-1}(z), & \text{for } z \text{ outside the disks},\\[1ex]
S(z)P^{-1}(z), & \text{for } z \text{ inside the disks}.
\end{cases} \tag{2.29}
\]
Then [13, 25], $R$ is normalized at infinity, and analytic on the entire plane except for jumps on the reduced system of contours $\Sigma_R$, as shown in Fig. 3, and except for possible isolated singularities at the endpoints $a_i$, $b_j$ of $J$, at the singularities of $V$ and at 0. However, from condition (d) of the RH problem for $P$, these singularities are removable, so that $R$ is analytic on $\mathbb{C}\setminus\Sigma_R$.

Fig. 3. Part of the contour $\Sigma_R$. The points $z_1$ and $z_2$ are singular points of $V$

It is known [13, 25] that the jumps of $R$ on $\Sigma_R$ are uniformly close to the identity matrix as $n\to\infty$. This implies [13], see also [10, 14]
\[
R(z)=I+O(1/n^{\kappa}),\qquad\text{as } n\to\infty, \tag{2.30}
\]
uniformly for $z\in\mathbb{C}\setminus\Sigma_R$, where $\kappa$ is the constant that appears in the matching condition (c) of the RH problem for $P$. By tracing back the steps $Y\mapsto T\mapsto S\mapsto R$ we obtain the asymptotic behavior of $Y$ in all regions of the complex plane, as $n\to\infty$.

3. Behavior of Y Near the Origin

In this section we unravel, as in [25, Lemma 7.1], the series of transformations $Y\mapsto T\mapsto S\mapsto R$, see Sect. 2, to determine the behavior of the first and the second column of $Y$ inside the disk $U_\delta$. This behavior will be the main tool to prove our results. Note that the second column of $Y$ has jumps on the real axis, see (2.1). So, for the behavior of the second column of $Y$ inside the disk $U_\delta$ we have to distinguish between the upper and lower parts of $U_\delta$. For notational convenience we introduce the $2\times 2$ matrix valued function, cf. [25, Lemma 7.1]
\[
M(z)=M_{n+m,n}(z)=R(z)E_{n+m,n}(z),\qquad\text{for } z\in U_\delta, \tag{3.1}
\]
where $E_{n+m,n}$ is given by (2.27). For the case $m=0$, the $M$-matrix defined by (3.1) corresponds to the $M$-matrix in [25, Lemma 7.1]. It is known that $M$ is analytic on $U_\delta$, that each entry of $M$ is uniformly bounded in $U_\delta$ as $n\to\infty$, and that $\det M\equiv 1$, cf. [25, Lemma 7.1]. We also need the following lemma.

Lemma 3.1. For $z\in U_\delta$,
\[
2g(z)-2\phi(z)-\ell=V(z). \tag{3.2}
\]
Proof. Let $H(z)=2g(z)-2\phi(z)-\ell-V(z)$, which is defined and analytic for $z\in U_\delta\setminus\mathbb{R}$. For $x\in(-\delta,\delta)\subset J$ we have by (2.16),
\[
H_+(x)=H_-(x)=g_+(x)+g_-(x)-\ell-V(x), \tag{3.3}
\]
so that $H$ is analytic in the entire disk $U_\delta$. For $x\in(-\delta,\delta)$ we have by (2.13), $g_+(x)+g_-(x)=2\int\log|x-s|\,\psi(s)\,ds$. Inserting this into (3.3) and using the Euler-Lagrange variational condition (2.10), we have that $H(x)=0$ for $x\in(-\delta,\delta)$. This implies from the uniqueness principle that $H\equiv 0$ on $U_\delta$, which proves the lemma.
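For readers who want the two boundary values in the proof of Lemma 3.1 written out, the following sketch spells out the substitutions. It assumes only (2.10), (2.13) and (2.16), with $\ell$ denoting the Euler-Lagrange constant and $x\in(-\delta,\delta)\subset J$.

```latex
% Sketch of the boundary-value computation behind (3.3), for x in (-\delta,\delta).
\begin{align*}
H_+(x) &= 2g_+(x) - 2\phi_+(x) - \ell - V(x)
        = 2g_+(x) - \bigl(g_+(x)-g_-(x)\bigr) - \ell - V(x) \\
       &= g_+(x) + g_-(x) - \ell - V(x), \\[1ex]
H_-(x) &= 2g_-(x) - 2\phi_-(x) - \ell - V(x)
        = 2g_-(x) + \bigl(g_+(x)-g_-(x)\bigr) - \ell - V(x) \\
       &= g_+(x) + g_-(x) - \ell - V(x).
\end{align*}
% In (2.13), \log(x-s)_\pm = \log|x-s| \pm \pi i for s > x and \log|x-s| for s < x,
% so the imaginary parts cancel in the sum:
\[
g_+(x) + g_-(x) = 2\int \log|x-s|\,\psi(s)\,ds ,
\qquad\text{hence}\qquad
H(x) = 2\int \log|x-s|\,\psi(s)\,ds - V(x) - \ell = 0
\]
% by the Euler-Lagrange condition (2.10), since x lies in the closure of J.
```

Since $H_+=H_-$ on $(-\delta,\delta)$, $H$ extends analytically across the interval, and its vanishing there forces $H\equiv 0$ on $U_\delta$.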
First, the behavior of the first column of $Y$ inside the disk $U_\delta$ is given by the following theorem:

Theorem 3.2. Fix $m\in\mathbb{Z}$. For $z\in U_\delta$ and $n$ sufficiently large, the first column of $Y=Y^{(n+m,n)}$ is given by
\[
\begin{pmatrix}Y_{11}(z)\\ Y_{21}(z)\end{pmatrix}
= z^{-\alpha}e^{n\frac{V(z)}{2}}\sqrt{\pi}\,e^{-\frac{\pi i}{4}}\,e^{(n+m)\frac{\ell}{2}\sigma_3}M(z)
\begin{pmatrix}
\bigl((n+m)f(z)\bigr)^{1/2}J_{\alpha+\frac12}\bigl((n+m)f(z)\bigr)\\[1ex]
\bigl((n+m)f(z)\bigr)^{1/2}J_{\alpha-\frac12}\bigl((n+m)f(z)\bigr)
\end{pmatrix}. \tag{3.4}
\]
Here, $J_\nu$ is the J-Bessel function of order $\nu$, $f$ is given by (2.22), and $M$ is given by (3.1).

Proof. Let $z$ be in sector I' of the disk $U_\delta$, see Fig. 2. Unfolding the series of transformations $Y\mapsto T\mapsto S\mapsto R$ we obtain by (2.12), (2.18), (2.28) and (2.29),
\[
Y(z)=e^{(n+m)\frac{\ell}{2}\sigma_3}R(z)E_{n+m,n}(z)\Psi_\alpha\bigl((n+m)f(z)\bigr)W(z)^{-\sigma_3}e^{-(n+m)\phi(z)\sigma_3}
\begin{pmatrix}1 & 0\\ \omega(z)^{-1}e^{-2(n+m)\phi(z)} & 1\end{pmatrix}
e^{-(n+m)\frac{\ell}{2}\sigma_3}e^{(n+m)g(z)\sigma_3}. \tag{3.5}
\]
Note that $\omega(z)=z^{2\alpha}e^{mV(z)}$, see (2.17), and that $W(z)=(-z)^{\alpha}e^{m\frac{V(z)}{2}}=z^{\alpha}e^{-\pi i\alpha}e^{m\frac{V(z)}{2}}$, see (2.26). Inserting this into (3.5) and using (3.1) and (3.2), the first column of $Y$ is then given by
\[
\begin{pmatrix}Y_{11}(z)\\ Y_{21}(z)\end{pmatrix}
= z^{-\alpha}e^{n\frac{V(z)}{2}}e^{(n+m)\frac{\ell}{2}\sigma_3}M(z)\Psi_\alpha\bigl((n+m)f(z)\bigr)e^{\pi i\alpha\sigma_3}\begin{pmatrix}1\\ 1\end{pmatrix}. \tag{3.6}
\]
Since $f(z)$ is in sector I of $f(U_\delta)$, see Fig. 2, we have for $n$ sufficiently large (namely $n+m>0$) that $0<\arg\,(n+m)f(z)<\pi/4$. So, we have to use (2.24) to evaluate $\Psi_\alpha\bigl((n+m)f(z)\bigr)$. From (3.6) and [1, Formulas 9.1.3 and 9.1.4], which connect the Hankel functions of the first and second kind with the ordinary J-Bessel functions, we then establish (3.4) in sector I' of $U_\delta$.

Now, let $z$ be in sector II' of $U_\delta$. Similarly as in sector I', we obtain by (2.12), (2.18), (2.28) and (2.29),
\[
Y(z)=e^{(n+m)\frac{\ell}{2}\sigma_3}R(z)E_{n+m,n}(z)\Psi_\alpha\bigl((n+m)f(z)\bigr)W(z)^{-\sigma_3}e^{-(n+m)\phi(z)\sigma_3}e^{-(n+m)\frac{\ell}{2}\sigma_3}e^{(n+m)g(z)\sigma_3}.
\]
Since $W(z)=z^{\alpha}e^{-\pi i\alpha}e^{m\frac{V(z)}{2}}$, see (2.26), and using (3.1) and (3.2), the first column of $Y$ is then given by
\[
\begin{pmatrix}Y_{11}(z)\\ Y_{21}(z)\end{pmatrix}
= z^{-\alpha}e^{n\frac{V(z)}{2}}e^{(n+m)\frac{\ell}{2}\sigma_3}M(z)\Psi_\alpha\bigl((n+m)f(z)\bigr)e^{\pi i\alpha\sigma_3}\begin{pmatrix}1\\ 0\end{pmatrix}. \tag{3.7}
\]
Since $\pi/4<\arg\,(n+m)f(z)<\pi/2$ for $n$ sufficiently large, we have to use (2.25) to evaluate $\Psi_\alpha\bigl((n+m)f(z)\bigr)$. This implies, using [1, Formula 9.6.3], which connects the modified Bessel function $I_{\alpha\pm\frac12}$ with the Bessel function $J_{\alpha\pm\frac12}$, that
\begin{align*}
\Psi_\alpha\bigl((n+m)f(z)\bigr)e^{\pi i\alpha\sigma_3}\begin{pmatrix}1\\ 0\end{pmatrix}
&=\sqrt{\pi}\,e^{\frac{\pi i\alpha}{2}}\begin{pmatrix}
\bigl((n+m)f(z)\bigr)^{1/2}I_{\alpha+\frac12}\bigl((n+m)f(z)e^{-\frac{\pi i}{2}}\bigr)\\[1ex]
-i\bigl((n+m)f(z)\bigr)^{1/2}I_{\alpha-\frac12}\bigl((n+m)f(z)e^{-\frac{\pi i}{2}}\bigr)
\end{pmatrix}\\
&=\sqrt{\pi}\,e^{-\frac{\pi i}{4}}\begin{pmatrix}
\bigl((n+m)f(z)\bigr)^{1/2}J_{\alpha+\frac12}\bigl((n+m)f(z)\bigr)\\[1ex]
\bigl((n+m)f(z)\bigr)^{1/2}J_{\alpha-\frac12}\bigl((n+m)f(z)\bigr)
\end{pmatrix}. \tag{3.8}
\end{align*}
Inserting this into (3.7), we establish (3.4) also in sector II' of $U_\delta$. In the other sectors of the disk $U_\delta$ the calculations are similar, and are left as an easy exercise for the careful reader.

Remark 3.3. For the case $m=0$, this theorem agrees with [25, Lemma 7.1].

Remark 3.4. It is not quite clear from (3.4) that the first column of $Y$ is analytic in the entire disk $U_\delta$, which must be the case since it has polynomials as entries, see (2.4). Obviously, it is analytic on $U_\delta\setminus(-\delta,0]$. From [1, Formula 9.1.35] we have for $x\in(-\delta,0)$
\[
\begin{pmatrix}Y_{11,+}(x)\\ Y_{21,+}(x)\end{pmatrix}=\begin{pmatrix}Y_{11,-}(x)\\ Y_{21,-}(x)\end{pmatrix}
=|x|^{-\alpha}e^{n\frac{V(x)}{2}}\sqrt{\pi}\,e^{-\frac{\pi i}{4}}\,e^{(n+m)\frac{\ell}{2}\sigma_3}M(x)
\begin{pmatrix}
-\bigl(-(n+m)f(x)\bigr)^{1/2}J_{\alpha+\frac12}\bigl(-(n+m)f(x)\bigr)\\[1ex]
\bigl(-(n+m)f(x)\bigr)^{1/2}J_{\alpha-\frac12}\bigl(-(n+m)f(x)\bigr)
\end{pmatrix}. \tag{3.9}
\]
So, the first column of $Y$ is analytic in the entire disk $U_\delta$ except for a possible isolated singularity at the origin. Since $J_{\alpha\pm\frac12}(z)=O(z^{\alpha\pm\frac12})$ as $z\to 0$, see [1, Formula 9.1.10], this singularity is removable, which implies that the first column of $Y$ is analytic in the entire disk.

Next, the behavior of the second column of $Y$ in the upper part of the disk $U_\delta$ is given by the following theorem:

Theorem 3.5. Fix $m\in\mathbb{Z}$. For $z\in U_\delta\cap\mathbb{C}_+$ and $n$ sufficiently large, the second column of $Y=Y^{(n+m,n)}$ is given by
\[
\begin{pmatrix}Y_{12}(z)\\ Y_{22}(z)\end{pmatrix}
= z^{\alpha}e^{-n\frac{V(z)}{2}}\,\frac12\sqrt{\pi}\,e^{-\frac{\pi i}{4}}\,e^{(n+m)\frac{\ell}{2}\sigma_3}M(z)
\begin{pmatrix}
\bigl((n+m)f(z)\bigr)^{1/2}H^{(1)}_{\alpha+\frac12}\bigl((n+m)f(z)\bigr)\\[1ex]
\bigl((n+m)f(z)\bigr)^{1/2}H^{(1)}_{\alpha-\frac12}\bigl((n+m)f(z)\bigr)
\end{pmatrix}. \tag{3.10}
\]
Here, $H^{(1)}_\nu$ is the Hankel function of the first kind of order $\nu$, $f$ is given by (2.22), and $M$ is given by (3.1).
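Proof. Let $z$ be in sector I' of $U_\delta$. Instead of (3.6) we get for the second column of $Y$,
\[
\begin{pmatrix}Y_{12}(z)\\ Y_{22}(z)\end{pmatrix}
= z^{\alpha}e^{-n\frac{V(z)}{2}}e^{(n+m)\frac{\ell}{2}\sigma_3}M(z)\Psi_\alpha\bigl((n+m)f(z)\bigr)e^{\pi i\alpha\sigma_3}\begin{pmatrix}0\\ 1\end{pmatrix}. \tag{3.11}
\]
Since $0<\arg\,(n+m)f(z)<\pi/4$ for $n$ sufficiently large, we have to use (2.24) to evaluate $\Psi_\alpha\bigl((n+m)f(z)\bigr)$. Inserting this into (3.11), we obtain (3.10) for this choice of $z$.

Now, let $z$ be in sector II' of $U_\delta$. Instead of (3.7) the second column of $Y$ is given by
\[
\begin{pmatrix}Y_{12}(z)\\ Y_{22}(z)\end{pmatrix}
= z^{\alpha}e^{-n\frac{V(z)}{2}}e^{(n+m)\frac{\ell}{2}\sigma_3}M(z)\Psi_\alpha\bigl((n+m)f(z)\bigr)e^{\pi i\alpha\sigma_3}\begin{pmatrix}0\\ 1\end{pmatrix}. \tag{3.12}
\]
Since $\pi/4<\arg\,(n+m)f(z)<\pi/2$ for $n$ sufficiently large, we have to use (2.25) to evaluate $\Psi_\alpha\bigl((n+m)f(z)\bigr)$. From [1, Formula 9.6.4], which connects the modified Bessel function $K_{\alpha\pm\frac12}$ with the Hankel function $H^{(1)}_{\alpha\pm\frac12}$ of the first kind, we then have
\begin{align*}
\Psi_\alpha\bigl((n+m)f(z)\bigr)e^{\pi i\alpha\sigma_3}\begin{pmatrix}0\\ 1\end{pmatrix}
&=-\frac{1}{\sqrt{\pi}}\,e^{-\frac{\pi i\alpha}{2}}\begin{pmatrix}
\bigl((n+m)f(z)\bigr)^{1/2}K_{\alpha+\frac12}\bigl((n+m)f(z)e^{-\frac{\pi i}{2}}\bigr)\\[1ex]
i\bigl((n+m)f(z)\bigr)^{1/2}K_{\alpha-\frac12}\bigl((n+m)f(z)e^{-\frac{\pi i}{2}}\bigr)
\end{pmatrix}\\
&=\frac12\sqrt{\pi}\,e^{-\frac{\pi i}{4}}\begin{pmatrix}
\bigl((n+m)f(z)\bigr)^{1/2}H^{(1)}_{\alpha+\frac12}\bigl((n+m)f(z)\bigr)\\[1ex]
\bigl((n+m)f(z)\bigr)^{1/2}H^{(1)}_{\alpha-\frac12}\bigl((n+m)f(z)\bigr)
\end{pmatrix}.
\end{align*}
Inserting this into (3.12), Eq. (3.10) is proven in this sector as well. Similarly, we can prove (3.10) in the other sectors of the upper part of $U_\delta$.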
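The passage from the $K$-Bessel form to the Hankel form in sector II' can be checked entry by entry. The following is a sketch, assuming the standard form of [1, Formula 9.6.4], $K_\nu(w)=\tfrac12\pi i\,e^{\nu\pi i/2}H^{(1)}_\nu(we^{\pi i/2})$ for $-\pi<\arg w\le\pi/2$; the shorthand $\xi=(n+m)f(z)$ is introduced here only for this check and does not appear in the original.

```latex
% Shorthand (not in the original): \xi = (n+m) f(z), with \pi/4 < \arg\xi < \pi/2,
% so \arg(\xi e^{-\pi i/2}) lies in (-\pi/4, 0), inside the range of validity of 9.6.4.
\[
K_\nu\bigl(\xi e^{-\pi i/2}\bigr) = \tfrac{\pi i}{2}\, e^{\nu\pi i/2}\, H^{(1)}_\nu(\xi).
\]
% Upper entry (\nu = \alpha + 1/2):
\begin{align*}
-\tfrac{1}{\sqrt\pi}\, e^{-\pi i\alpha/2}\, \xi^{1/2} K_{\alpha+\frac12}\bigl(\xi e^{-\pi i/2}\bigr)
 &= -\tfrac{i\sqrt\pi}{2}\, e^{-\pi i\alpha/2}\, e^{(\alpha+\frac12)\pi i/2}\, \xi^{1/2} H^{(1)}_{\alpha+\frac12}(\xi) \\
 &= -\tfrac{i\sqrt\pi}{2}\, e^{\pi i/4}\, \xi^{1/2} H^{(1)}_{\alpha+\frac12}(\xi)
  = \tfrac12\sqrt\pi\, e^{-\pi i/4}\, \xi^{1/2} H^{(1)}_{\alpha+\frac12}(\xi).
\end{align*}
% Lower entry (\nu = \alpha - 1/2), which carries an extra factor i:
\begin{align*}
-\tfrac{i}{\sqrt\pi}\, e^{-\pi i\alpha/2}\, \xi^{1/2} K_{\alpha-\frac12}\bigl(\xi e^{-\pi i/2}\bigr)
 &= \tfrac{\sqrt\pi}{2}\, e^{-\pi i\alpha/2}\, e^{(\alpha-\frac12)\pi i/2}\, \xi^{1/2} H^{(1)}_{\alpha-\frac12}(\xi) \\
 &= \tfrac12\sqrt\pi\, e^{-\pi i/4}\, \xi^{1/2} H^{(1)}_{\alpha-\frac12}(\xi).
\end{align*}
```

In both entries the $\alpha$-dependent phases cancel, leaving the common prefactor $\tfrac12\sqrt{\pi}\,e^{-\pi i/4}$ that appears in (3.10).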
And finally, the behavior of the second column of $Y$ in the lower part of the disk $U_\delta$ is given by the following theorem.

Theorem 3.6. Fix $m\in\mathbb{Z}$. For $z\in U_\delta\cap\mathbb{C}_-$ and $n$ sufficiently large, the second column of $Y=Y^{(n+m,n)}$ is given by
\[
\begin{pmatrix}Y_{12}(z)\\ Y_{22}(z)\end{pmatrix}
= -z^{\alpha}e^{-n\frac{V(z)}{2}}\,\frac12\sqrt{\pi}\,e^{-\frac{\pi i}{4}}\,e^{(n+m)\frac{\ell}{2}\sigma_3}M(z)
\begin{pmatrix}
\bigl((n+m)f(z)\bigr)^{1/2}H^{(2)}_{\alpha+\frac12}\bigl((n+m)f(z)\bigr)\\[1ex]
\bigl((n+m)f(z)\bigr)^{1/2}H^{(2)}_{\alpha-\frac12}\bigl((n+m)f(z)\bigr)
\end{pmatrix}. \tag{3.13}
\]
Here, $H^{(2)}_\nu$ is the Hankel function of the second kind of order $\nu$, $f$ is given by (2.22), and $M$ is given by (3.1).

Proof. The proof is similar to the proofs of Theorem 3.2 and Theorem 3.5.
Remark 3.7. By (2.1), the jump relation for the second column of $Y$ is
\[
\begin{pmatrix}Y_{12,+}(x)\\ Y_{22,+}(x)\end{pmatrix}-\begin{pmatrix}Y_{12,-}(x)\\ Y_{22,-}(x)\end{pmatrix}
=|x|^{2\alpha}e^{-nV(x)}\begin{pmatrix}Y_{11}(x)\\ Y_{21}(x)\end{pmatrix},\qquad\text{for } x\in\mathbb{R}\setminus\{0\}. \tag{3.14}
\]
For $x\in(0,\delta)$ one can check easily, using (3.2), (3.10), (3.13) and [1, Formulas 9.1.3 and 9.1.4] that (3.14) is satisfied. For $x\in(-\delta,0)$ it follows from (3.9), (3.10), (3.13) and [1, Formulas 9.1.3, 9.1.4 and 9.1.39] that (3.14) is satisfied.

Remark 3.8. Theorems 3.2, 3.5 and 3.6 give because of (2.4), after straightforward calculations, the behavior near the origin of the orthogonal polynomials and their Cauchy transforms. It has been shown before by Akemann and Fyodorov [6] that the behavior near the origin of the orthogonal polynomials is given in terms of the J-Bessel functions $J_{\alpha\pm\frac12}$, and that the behavior near the origin of their Cauchy transforms is given in terms of the Hankel functions $H^{(1)}_{\alpha\pm\frac12}$ of the first kind in the upper half-plane, and in terms of the Hankel functions $H^{(2)}_{\alpha\pm\frac12}$ of the second kind in the lower half-plane. However, in [6] this was done on a physical level of rigor, and under the assumption that the eigenvalue density was supported on only one interval.

4. Proofs of Theorem 1.1–1.3

In this section we prove the universal behavior at the origin of the spectrum for the three kernels $W_{I,n+m}$, $W_{II,n+m}$ and $W_{III,n+m}$, in terms of the Bessel kernels given by Table 2. Similarly to [24, 25], where we have investigated local eigenvalue correlations, we do this by using the connection of these kernels with the solution of the RH problem for $Y$, see (2.5)–(2.7), and by using the behavior of $Y$ near the origin, derived in the previous section. We first need the following lemmas.

Lemma 4.1. Let $M$ be the matrix valued function given by (3.1), and let $\zeta,\eta\in\mathbb{C}$. Then, each entry $M_{ij}$ of $M$ satisfies
\[
M_{ij}\!\left(\frac{\zeta}{n\psi(0)}\right)-M_{ij}\!\left(\frac{\eta}{n\psi(0)}\right)=O\!\left(\frac{\zeta-\eta}{n}\right),\qquad\text{as } n\to\infty, \tag{4.1}
\]
uniformly for $\zeta$ and $\eta$ in compact subsets of $\mathbb{C}$.

Proof. Let $\zeta,\eta\in\mathbb{C}$, denote $\zeta_n=\frac{\zeta}{n\psi(0)}$ and $\eta_n=\frac{\eta}{n\psi(0)}$, and let $\gamma$ be a positively oriented simple closed contour in $U_\delta$ going around the origin. Then, since $M$ is analytic on $U_\delta$, an application of Cauchy's formula shows that
\[
M_{ij}(\zeta_n)-M_{ij}(\eta_n)=(\zeta_n-\eta_n)\,\frac{1}{2\pi i}\oint_\gamma\frac{M_{ij}(z)}{(z-\zeta_n)(z-\eta_n)}\,dz,
\]
for $\zeta$ and $\eta$ in compact subsets of $\mathbb{C}$ and $n$ sufficiently large. Since $M_{ij}$ is uniformly bounded in $U_\delta$ as $n\to\infty$, see Sect. 3, the integral is uniformly bounded for $\zeta$ and $\eta$ in compact subsets of $\mathbb{C}$ as $n\to\infty$. This proves the lemma.
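The Cauchy-formula step in the proof of Lemma 4.1 can be made explicit as follows. This is a sketch only: the constants $d$ and $C$ are names introduced here for illustration, with $d>0$ a lower bound for the distance from the fixed contour $\gamma$ to the points $\zeta_n$, $\eta_n$ (available for large $n$ because both points tend to the origin), and $C$ a uniform bound for $|M_{ij}|$ on $\gamma$.

```latex
% Sketch: difference of two Cauchy integrals over the same contour \gamma.
\begin{align*}
M_{ij}(\zeta_n) - M_{ij}(\eta_n)
 &= \frac{1}{2\pi i}\oint_\gamma M_{ij}(z)\left(\frac{1}{z-\zeta_n} - \frac{1}{z-\eta_n}\right)dz \\
 &= \frac{\zeta_n-\eta_n}{2\pi i}\oint_\gamma \frac{M_{ij}(z)}{(z-\zeta_n)(z-\eta_n)}\,dz ,
 \qquad \zeta_n - \eta_n = \frac{\zeta-\eta}{n\psi(0)} .
\end{align*}
% With |M_{ij}| \le C on \gamma and |z-\zeta_n|, |z-\eta_n| \ge d for z on \gamma
% (C and d are illustrative names, not from the original):
\[
\bigl| M_{ij}(\zeta_n) - M_{ij}(\eta_n) \bigr|
 \le \frac{|\zeta-\eta|}{n\psi(0)}\cdot\frac{\operatorname{length}(\gamma)}{2\pi}\cdot\frac{C}{d^{2}}
 = O\!\left(\frac{\zeta-\eta}{n}\right).
\]
```

The bound is uniform because $\gamma$, $C$ and $d$ can be chosen independently of $\zeta$ and $\eta$ once these range over fixed compact sets and $n$ is large.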
Lemma 4.2. Fix $m\in\mathbb{Z}$. Let $\zeta,\eta\in\mathbb{C}$, and denote
\[
\tilde\zeta_n=(n+m)f\!\left(\frac{\zeta}{n\psi(0)}\right),\qquad\text{and}\qquad \tilde\eta_n=(n+m)f\!\left(\frac{\eta}{n\psi(0)}\right).
\]
Then,
\[
\zeta^{-\alpha}\tilde\zeta_n^{1/2}J_{\alpha\pm\frac12}(\tilde\zeta_n)=\zeta^{-\alpha}(\pi\zeta)^{1/2}J_{\alpha\pm\frac12}(\pi\zeta)+O(1/n), \tag{4.2}
\]
as $n\to\infty$, uniformly for $\zeta$ in compact subsets of $\mathbb{C}$. The left-hand side of (4.2) is uniformly bounded for $\zeta$ in compact subsets of $\mathbb{C}$ as $n\to\infty$. Also
\[
\zeta^{-\alpha}\tilde\zeta_n^{1/2}J_{\alpha\pm\frac12}(\tilde\zeta_n)-\eta^{-\alpha}\tilde\eta_n^{1/2}J_{\alpha\pm\frac12}(\tilde\eta_n)
=\zeta^{-\alpha}(\pi\zeta)^{1/2}J_{\alpha\pm\frac12}(\pi\zeta)-\eta^{-\alpha}(\pi\eta)^{1/2}J_{\alpha\pm\frac12}(\pi\eta)+O\!\left(\frac{\zeta-\eta}{n}\right), \tag{4.3}
\]
as $n\to\infty$, uniformly for $\zeta$ and $\eta$ in compact subsets of $\mathbb{C}$.

Proof. By (2.23) it follows that $\tilde\zeta_n=\pi\zeta\bigl(1+O(1/n)\bigr)$ as $n\to\infty$, uniformly for $\zeta$ in compact subsets of $\mathbb{C}$. Inserting this into the left-hand side of (4.2) we easily obtain estimate (4.2), cf. [25, Lemma 7.2]. Since $J_\nu(\zeta)=\zeta^{\nu}H_\nu(\zeta)$ with $H_\nu$ entire, see [1, Formula 9.1.10], we have that $\zeta^{-\alpha+\frac12}J_{\alpha\pm\frac12}(\zeta)$ is entire. This implies by (4.2) that the left-hand side of (4.2) is uniformly bounded for $\zeta$ in compact subsets of $\mathbb{C}$ as $n\to\infty$.

Let $K_1$, $K_2$ be compact subsets of $\mathbb{C}$, and let $\gamma$ be a positively oriented simple closed contour with $K_1$ and $K_2$ in its interior. Define
\[
q_n(z)=z^{-\alpha}\tilde z_n^{1/2}J_{\alpha\pm\frac12}(\tilde z_n)-z^{-\alpha}(\pi z)^{1/2}J_{\alpha\pm\frac12}(\pi z), \tag{4.4}
\]
with $\tilde z_n=(n+m)f\bigl(\frac{z}{n\psi(0)}\bigr)$. Note that $q_n$ is analytic in an open neighborhood of the interior of $\gamma$ for $n$ sufficiently large. An application of Cauchy's theorem then shows that
\[
q_n(\zeta)-q_n(\eta)=(\zeta-\eta)\,\frac{1}{2\pi i}\oint_\gamma\frac{q_n(z)}{(z-\zeta)(z-\eta)}\,dz,
\]
for $\zeta\in K_1$ and $\eta\in K_2$, and $n$ sufficiently large. Since $q_n(z)=O(1/n)$ as $n\to\infty$ uniformly for $z$ in compact subsets of $\mathbb{C}$, see (4.2) and (4.4), and since $\zeta$ and $\eta$ are not close to the contour $\gamma$, the lemma is then proven.

Now, we are ready to prove the universal behavior at the origin of the spectrum for the kernel $W_{I,n+m}$.

Proof of Theorem 1.1. Let $\zeta,\eta\in\mathbb{C}$, denote
\[
\zeta_n=\frac{\zeta}{n\psi(0)},\qquad \eta_n=\frac{\eta}{n\psi(0)},\qquad \tilde\zeta_n=(n+m)f(\zeta_n),\qquad\text{and}\qquad \tilde\eta_n=(n+m)f(\eta_n),
\]
Averages of Characteristic Polynomials at the Origin of the Spectrum
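and let $Y = Y^{(n+m,n)}$. Similar considerations as in [24, 25], using (2.5) and the behavior (3.4) of the first column of $Y$ inside the disk $U_\delta$, show that,
\[
\begin{aligned}
\widehat W_{I,n+m}(\zeta,\eta) &\equiv \gamma_{n+m-1,n}^2 \frac{1}{n\psi(0)} W_{I,n+m}(\zeta_n,\eta_n) \\
&= \frac{1}{-2\pi i(\zeta-\eta)} \det \begin{pmatrix} Y_{11}(\zeta_n) & Y_{11}(\eta_n) \\ Y_{21}(\zeta_n) & Y_{21}(\eta_n) \end{pmatrix} \\
&= (n\psi(0))^{2\alpha} e^{\frac{n}{2}(V(\zeta_n)+V(\eta_n))} \frac{1}{2(\zeta-\eta)} \\
&\quad \times \det \left[ M(\zeta_n) \begin{pmatrix} \zeta^{-\alpha} \tilde\zeta_n^{1/2} J_{\alpha+\frac12}(\tilde\zeta_n) & 0 \\ \zeta^{-\alpha} \tilde\zeta_n^{1/2} J_{\alpha-\frac12}(\tilde\zeta_n) & 0 \end{pmatrix}
+ M(\eta_n) \begin{pmatrix} 0 & \eta^{-\alpha} \tilde\eta_n^{1/2} J_{\alpha+\frac12}(\tilde\eta_n) \\ 0 & \eta^{-\alpha} \tilde\eta_n^{1/2} J_{\alpha-\frac12}(\tilde\eta_n) \end{pmatrix} \right].
\end{aligned} \tag{4.5}
\]
The matrix in the determinant can be written as, cf. [24, 25],
\[
M(\zeta_n) \left[ \begin{pmatrix} \zeta^{-\alpha} \tilde\zeta_n^{1/2} J_{\alpha+\frac12}(\tilde\zeta_n) & \eta^{-\alpha} \tilde\eta_n^{1/2} J_{\alpha+\frac12}(\tilde\eta_n) \\ \zeta^{-\alpha} \tilde\zeta_n^{1/2} J_{\alpha-\frac12}(\tilde\zeta_n) & \eta^{-\alpha} \tilde\eta_n^{1/2} J_{\alpha-\frac12}(\tilde\eta_n) \end{pmatrix}
+ M(\zeta_n)^{-1} \left( M(\eta_n) - M(\zeta_n) \right) \begin{pmatrix} 0 & \eta^{-\alpha} \tilde\eta_n^{1/2} J_{\alpha+\frac12}(\tilde\eta_n) \\ 0 & \eta^{-\alpha} \tilde\eta_n^{1/2} J_{\alpha-\frac12}(\tilde\eta_n) \end{pmatrix} \right].
\]
Since $\det M \equiv 1$ and each entry of $M$ is uniformly bounded in $U_\delta$ as $n \to \infty$, see Sect. 3, each entry of $M(\zeta_n)^{-1}$ is uniformly bounded for $\zeta$ in compact subsets of $\mathbb{C}$ as $n \to \infty$. Using Lemma 4.1 and the fact that $\eta^{-\alpha} \tilde\eta_n^{1/2} J_{\alpha\pm\frac12}(\tilde\eta_n)$ is uniformly bounded for $\eta$ in compact subsets of $\mathbb{C}$ as $n \to \infty$, see Lemma 4.2, we then obtain
\[
M(\zeta_n)^{-1} \left( M(\eta_n) - M(\zeta_n) \right) \begin{pmatrix} 0 & \eta^{-\alpha} \tilde\eta_n^{1/2} J_{\alpha+\frac12}(\tilde\eta_n) \\ 0 & \eta^{-\alpha} \tilde\eta_n^{1/2} J_{\alpha-\frac12}(\tilde\eta_n) \end{pmatrix}
= \begin{pmatrix} 0 & O\left(\frac{\zeta-\eta}{n}\right) \\ 0 & O\left(\frac{\zeta-\eta}{n}\right) \end{pmatrix}.
\]
Using the facts that $\det M \equiv 1$ and that $\zeta^{-\alpha} \tilde\zeta_n^{1/2} J_{\alpha\pm\frac12}(\tilde\zeta_n)$ is uniformly bounded for $\zeta$ in compact subsets of $\mathbb{C}$ as $n \to \infty$, see Lemma 4.2, we then find
\[
\widehat W_{I,n+m}(\zeta,\eta) = (n\psi(0))^{2\alpha} e^{\frac{n}{2}(V(\zeta_n)+V(\eta_n))}
\left[ \frac{1}{2(\zeta-\eta)} \det \begin{pmatrix} \zeta^{-\alpha} \tilde\zeta_n^{1/2} J_{\alpha+\frac12}(\tilde\zeta_n) & \eta^{-\alpha} \tilde\eta_n^{1/2} J_{\alpha+\frac12}(\tilde\eta_n) \\ \zeta^{-\alpha} \tilde\zeta_n^{1/2} J_{\alpha-\frac12}(\tilde\zeta_n) & \eta^{-\alpha} \tilde\eta_n^{1/2} J_{\alpha-\frac12}(\tilde\eta_n) \end{pmatrix} + O(1/n) \right]. \tag{4.6}
\]
We can now replace $z^{-\alpha} \tilde z_n^{1/2} J_{\alpha\pm\frac12}(\tilde z_n)$ by $z^{-\alpha} (\pi z)^{1/2} J_{\alpha\pm\frac12}(\pi z)$ for $z = \zeta, \eta$, and obtain the limiting Bessel kernel $\mathbb{J}_{\alpha,I}(\zeta,\eta)$ given in Table 2. However, then we make an error which does not hold uniformly for $\zeta$ and $\eta$ close to each other. To solve this problem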
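we will work as in [24, 25]. We subtract the second column in the determinant from the first one. From (4.3) and the fact that $\eta^{-\alpha} \tilde\eta_n^{1/2} J_{\alpha\pm\frac12}(\tilde\eta_n)$ is uniformly bounded for $\eta$ in compact subsets of $\mathbb{C}$ as $n \to \infty$, the term inside the brackets in (4.6) is then given by
\[
\frac{1}{2(\zeta-\eta)} \det \begin{pmatrix}
\zeta^{-\alpha} (\pi\zeta)^{1/2} J_{\alpha+\frac12}(\pi\zeta) - \eta^{-\alpha} (\pi\eta)^{1/2} J_{\alpha+\frac12}(\pi\eta) & \eta^{-\alpha} \tilde\eta_n^{1/2} J_{\alpha+\frac12}(\tilde\eta_n) \\
\zeta^{-\alpha} (\pi\zeta)^{1/2} J_{\alpha-\frac12}(\pi\zeta) - \eta^{-\alpha} (\pi\eta)^{1/2} J_{\alpha-\frac12}(\pi\eta) & \eta^{-\alpha} \tilde\eta_n^{1/2} J_{\alpha-\frac12}(\tilde\eta_n)
\end{pmatrix} + O(1/n).
\]
Using the fact that
\[
\eta^{-\alpha} \tilde\eta_n^{1/2} J_{\alpha\pm\frac12}(\tilde\eta_n) = \eta^{-\alpha} (\pi\eta)^{1/2} J_{\alpha\pm\frac12}(\pi\eta) + O(1/n),
\]
and the fact that
\[
\frac{\zeta^{-\alpha} (\pi\zeta)^{1/2} J_{\alpha\pm\frac12}(\pi\zeta) - \eta^{-\alpha} (\pi\eta)^{1/2} J_{\alpha\pm\frac12}(\pi\eta)}{\zeta-\eta}
\]
remains bounded for $\zeta$ and $\eta$ in compact subsets of $\mathbb{C}$, which follows since $z^{-\alpha} (\pi z)^{1/2} J_{\alpha\pm\frac12}(z)$ is entire, we then easily obtain that the term inside the brackets in (4.6) is given by
\[
\frac{1}{2(\zeta-\eta)} \det \begin{pmatrix}
\zeta^{-\alpha} (\pi\zeta)^{1/2} J_{\alpha+\frac12}(\pi\zeta) & \eta^{-\alpha} (\pi\eta)^{1/2} J_{\alpha+\frac12}(\pi\eta) \\
\zeta^{-\alpha} (\pi\zeta)^{1/2} J_{\alpha-\frac12}(\pi\zeta) & \eta^{-\alpha} (\pi\eta)^{1/2} J_{\alpha-\frac12}(\pi\eta)
\end{pmatrix} + O(1/n). \tag{4.7}
\]
The first term in (4.7) is exactly the limiting Bessel kernel $\mathbb{J}_{\alpha,I}(\zeta,\eta)$, see Table 2. From (4.6) and (4.7) we then obtain
\[
\widehat W_{I,n+m}(\zeta,\eta) = (n\psi(0))^{2\alpha} e^{\frac{n}{2}(V(\zeta_n)+V(\eta_n))} \left( \mathbb{J}_{\alpha,I}(\zeta,\eta) + O(1/n) \right), \tag{4.8}
\]
as $n \to \infty$, uniformly for $\zeta$ and $\eta$ in bounded subsets of $\mathbb{C}$. Note that
\[
e^{\frac{n}{2}(V(\zeta_n)+V(\eta_n))} = e^{nV(0) + \frac{V'(0)}{2\psi(0)}(\zeta+\eta) + O(1/n)} = e^{nV(0)} e^{\frac{V'(0)}{2\psi(0)}(\zeta+\eta)} (1 + O(1/n)), \qquad \text{as } n \to \infty,
\]
uniformly for $\zeta$ and $\eta$ in compact subsets of $\mathbb{C}$. Inserting this into (4.8), and using the fact that $\mathbb{J}_{\alpha,I}(\zeta,\eta)$ is bounded for $\zeta$ and $\eta$ in compact subsets of $\mathbb{C}$, the theorem is then proven.

In order to prove Theorem 1.2 we also need the following lemma, which is analogous to Lemma 4.2.

Lemma 4.3. Fix $m \in \mathbb{Z}$. Let $\zeta \in \mathbb{C}_+$, and denote $\tilde\zeta_n = (n+m) f\left(\frac{\zeta}{n\psi(0)}\right)$. Then,
\[
\zeta^{\alpha} \tilde\zeta_n^{1/2} H^{(1)}_{\alpha\pm\frac12}(\tilde\zeta_n) = \zeta^{\alpha} (\pi\zeta)^{1/2} H^{(1)}_{\alpha\pm\frac12}(\pi\zeta) + O(1/n), \tag{4.9}
\]
as $n \to \infty$, uniformly for $\zeta$ in compact subsets of $\mathbb{C}_+$. The left-hand side of (4.9) is uniformly bounded for $\zeta$ in compact subsets of $\mathbb{C}_+$ as $n \to \infty$.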
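Proof. Recall, cf. the proof of Lemma 4.2, that $\tilde\zeta_n = (\pi\zeta)(1 + O(1/n))$. Inserting this into the left-hand side of (4.9) and using the fact that $H^{(1)}_{\alpha\pm\frac12}$ is analytic on $\mathbb{C} \setminus (-\infty, 0]$, we easily obtain estimate (4.9). Since $H^{(1)}_{\alpha\pm\frac12}$ is analytic on $\mathbb{C} \setminus (-\infty, 0]$, we have that $\zeta^{\alpha} (\pi\zeta)^{1/2} H^{(1)}_{\alpha\pm\frac12}(\pi\zeta)$ is bounded for $\zeta$ in compact subsets of $\mathbb{C} \setminus (-\infty, 0]$, and thus in particular in compact subsets of $\mathbb{C}_+$. Together with estimate (4.9) this implies that the left-hand side of (4.9) remains uniformly bounded for $\zeta$ in compact subsets of $\mathbb{C}_+$ as $n \to \infty$.

Proof of Theorem 1.2. Let $\zeta \in \mathbb{C}_+$, $\eta \in \mathbb{C}$, denote
\[
\zeta_n = \frac{\zeta}{n\psi(0)}, \qquad \eta_n = \frac{\eta}{n\psi(0)}, \qquad \tilde\zeta_n = (n+m) f(\zeta_n), \qquad \text{and} \qquad \tilde\eta_n = (n+m) f(\eta_n),
\]
and let $Y = Y^{(n+m,n)}$. Instead of Eq. (4.5) we obtain from (2.6), from the behavior (3.4) of the first column of $Y$ inside $U_\delta$, and from the behavior (3.10) of the second column of $Y$ in the upper part of $U_\delta$,
\[
\begin{aligned}
\widehat W_{II,n+m}(\zeta,\eta) &\equiv \gamma_{n+m-1,n}^2 \frac{\zeta-\eta}{n\psi(0)} W_{II,n+m}(\zeta_n,\eta_n) \\
&= \frac{1}{-2\pi i} \det \begin{pmatrix} Y_{12}(\zeta_n) & Y_{11}(\eta_n) \\ Y_{22}(\zeta_n) & Y_{21}(\eta_n) \end{pmatrix} \\
&= e^{-\frac{n}{2}(V(\zeta_n)-V(\eta_n))} \frac{1}{4} \\
&\quad \times \det \left[ M(\zeta_n) \begin{pmatrix} \zeta^{\alpha} \tilde\zeta_n^{1/2} H^{(1)}_{\alpha+\frac12}(\tilde\zeta_n) & 0 \\ \zeta^{\alpha} \tilde\zeta_n^{1/2} H^{(1)}_{\alpha-\frac12}(\tilde\zeta_n) & 0 \end{pmatrix}
+ M(\eta_n) \begin{pmatrix} 0 & \eta^{-\alpha} \tilde\eta_n^{1/2} J_{\alpha+\frac12}(\tilde\eta_n) \\ 0 & \eta^{-\alpha} \tilde\eta_n^{1/2} J_{\alpha-\frac12}(\tilde\eta_n) \end{pmatrix} \right].
\end{aligned}
\]
We now rewrite the matrix in the determinant as was done in the proof of Theorem 1.1. Using also the fact that $\zeta^{\alpha} \tilde\zeta_n^{1/2} H^{(1)}_{\alpha\pm\frac12}(\tilde\zeta_n)$ is uniformly bounded for $\zeta$ in compact subsets of $\mathbb{C}_+$ as $n \to \infty$, see Lemma 4.3, we obtain instead of Eq. (4.6), in a similar fashion, the following
\[
\widehat W_{II,n+m}(\zeta,\eta) = e^{-\frac{n}{2}(V(\zeta_n)-V(\eta_n))}
\left[ \frac{1}{4} \det \begin{pmatrix} \zeta^{\alpha} \tilde\zeta_n^{1/2} H^{(1)}_{\alpha+\frac12}(\tilde\zeta_n) & \eta^{-\alpha} \tilde\eta_n^{1/2} J_{\alpha+\frac12}(\tilde\eta_n) \\ \zeta^{\alpha} \tilde\zeta_n^{1/2} H^{(1)}_{\alpha-\frac12}(\tilde\zeta_n) & \eta^{-\alpha} \tilde\eta_n^{1/2} J_{\alpha-\frac12}(\tilde\eta_n) \end{pmatrix} + O(1/n) \right], \tag{4.10}
\]
as $n \to \infty$, uniformly for $\zeta$ and $\eta$ in compact subsets of $\mathbb{C}_+$ and $\mathbb{C}$, respectively. We now insert the fact that, see Lemma 4.3,
\[
\zeta^{\alpha} \tilde\zeta_n^{1/2} H^{(1)}_{\alpha\pm\frac12}(\tilde\zeta_n) = \zeta^{\alpha} (\pi\zeta)^{1/2} H^{(1)}_{\alpha\pm\frac12}(\pi\zeta) + O(1/n),
\]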
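into (4.10), and use the fact that $\eta^{-\alpha} \tilde\eta_n^{1/2} J_{\alpha\pm\frac12}(\tilde\eta_n)$ is uniformly bounded for $\eta$ in compact subsets of $\mathbb{C}$ as $n \to \infty$, see Lemma 4.2, to obtain
\[
\widehat W_{II,n+m}(\zeta,\eta) = e^{-\frac{n}{2}(V(\zeta_n)-V(\eta_n))}
\left[ \frac{1}{4} \det \begin{pmatrix} \zeta^{\alpha} (\pi\zeta)^{1/2} H^{(1)}_{\alpha+\frac12}(\pi\zeta) & \eta^{-\alpha} \tilde\eta_n^{1/2} J_{\alpha+\frac12}(\tilde\eta_n) \\ \zeta^{\alpha} (\pi\zeta)^{1/2} H^{(1)}_{\alpha-\frac12}(\pi\zeta) & \eta^{-\alpha} \tilde\eta_n^{1/2} J_{\alpha-\frac12}(\tilde\eta_n) \end{pmatrix} + O(1/n) \right]. \tag{4.11}
\]
Inserting, see Lemma 4.2,
\[
\eta^{-\alpha} \tilde\eta_n^{1/2} J_{\alpha\pm\frac12}(\tilde\eta_n) = \eta^{-\alpha} (\pi\eta)^{1/2} J_{\alpha\pm\frac12}(\pi\eta) + O(1/n),
\]
into (4.11), and using the fact that $\zeta^{\alpha} (\pi\zeta)^{1/2} H^{(1)}_{\alpha\pm\frac12}(\pi\zeta)$ is uniformly bounded for $\zeta$ in compact subsets of $\mathbb{C}_+$ as $n \to \infty$, we then obtain
\[
\begin{aligned}
\widehat W_{II,n+m}(\zeta,\eta) &= e^{-\frac{n}{2}(V(\zeta_n)-V(\eta_n))}
\left[ \frac{1}{4} \det \begin{pmatrix} \zeta^{\alpha} (\pi\zeta)^{1/2} H^{(1)}_{\alpha+\frac12}(\pi\zeta) & \eta^{-\alpha} (\pi\eta)^{1/2} J_{\alpha+\frac12}(\pi\eta) \\ \zeta^{\alpha} (\pi\zeta)^{1/2} H^{(1)}_{\alpha-\frac12}(\pi\zeta) & \eta^{-\alpha} (\pi\eta)^{1/2} J_{\alpha-\frac12}(\pi\eta) \end{pmatrix} + O(1/n) \right] \\
&= e^{-\frac{n}{2}(V(\zeta_n)-V(\eta_n))} \left( (\zeta-\eta)\, \mathbb{J}^{+}_{\alpha,II}(\zeta,\eta) + O(1/n) \right),
\end{aligned} \tag{4.12}
\]
as $n \to \infty$, uniformly for $\zeta$ and $\eta$ in compact subsets of $\mathbb{C}_+$ and $\mathbb{C}$, respectively. In (4.12), $\mathbb{J}^{+}_{\alpha,II}(\zeta,\eta)$ is the Bessel kernel given in Table 2. Note that
\[
e^{-\frac{n}{2}(V(\zeta_n)-V(\eta_n))} = e^{-\frac{V'(0)}{2\psi(0)}(\zeta-\eta) + O(1/n)} = e^{-\frac{V'(0)}{2\psi(0)}(\zeta-\eta)} (1 + O(1/n)), \qquad \text{as } n \to \infty,
\]
uniformly for $\zeta$ and $\eta$ in compact subsets of $\mathbb{C}_+$ and $\mathbb{C}$, respectively. Inserting this into (4.12) and using the fact that $(\zeta-\eta)\, \mathbb{J}^{+}_{\alpha,II}(\zeta,\eta)$ is bounded for $\zeta$ and $\eta$ in compact subsets of $\mathbb{C}_+$ and $\mathbb{C}$, respectively, the first part of the theorem is then proven. The second part of the theorem can be treated in the same way, using the behavior (3.13) of the second column of $Y$ in the lower part of the disk $U_\delta$, instead of in the upper part of the disk.

We leave it as an exercise for the careful reader to prove the universal behavior at the origin of the spectrum for the kernel $W_{III,n+m}$.

Proof of Theorem 1.3. The proof is similar to the proofs of Theorem 1.1 and Theorem 1.2.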
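The Taylor expansion of the exponential prefactor used at the end of the proof of Theorem 1.1 can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the quartic potential `V`, the value `psi0` standing in for $\psi(0)$, and the sample points are hypothetical choices, not data from the paper. It checks that the exponent $\frac{n}{2}(V(\zeta_n)+V(\eta_n))$ differs from $nV(0) + \frac{V'(0)}{2\psi(0)}(\zeta+\eta)$ by a term of order $1/n$.

```python
def prefactor_error(V, dV0, psi0, zeta, eta, n):
    """Return |(n/2)(V(zeta_n) + V(eta_n)) - n V(0) - V'(0)(zeta + eta)/(2 psi0)|,
    where zeta_n = zeta/(n psi0) and eta_n = eta/(n psi0)."""
    zeta_n = zeta / (n * psi0)
    eta_n = eta / (n * psi0)
    exact = 0.5 * n * (V(zeta_n) + V(eta_n))
    approx = n * V(0.0) + dV0 * (zeta + eta) / (2.0 * psi0)
    return abs(exact - approx)


# Hypothetical smooth potential with V(0) = 1 and V'(0) = 0.5.
def V(x):
    return x**4 + x**2 + 0.5 * x + 1.0


dV0 = 0.5   # V'(0) for the potential above
psi0 = 0.7  # hypothetical value standing in for psi(0)
ns = (10, 100, 1000)
errors = [prefactor_error(V, dV0, psi0, 1.5, -0.8, n) for n in ns]

for n, err in zip(ns, errors):
    # n * err stays bounded, which is the O(1/n) behavior claimed in the text
    print(n, err, n * err)
```

The same computation with the difference $V(\zeta_n) - V(\eta_n)$ in place of the sum gives the expansion used for Theorem 1.2, where the $nV(0)$ terms cancel.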
Acknowledgements. I thank Arno Kuijlaars for his careful reading, as well as for useful discussions and comments. I am also grateful to Yan Fyodorov and Eugene Strahov for sending me the recent version of their manuscript “Universal results for correlations of characteristic polynomials: Riemann-Hilbert approach”.
References

1. Abramowitz, M., Stegun, I.A.: Handbook of Mathematical Functions. New York: Dover Publications, 1968
2. Akemann, G., Damgaard, P.H.: Microscopic spectra of Dirac operators and finite-volume partition functions. Nucl. Phys. B 528, no. 1–2, 411–431 (1998)
3. Akemann, G., Damgaard, P.H.: Consistency conditions for finite-volume partition functions. Phys. Lett. B 432, no. 3–4, 390–396 (1998)
4. Akemann, G., Damgaard, P.H., Magnea, U., Nishigaki, S.: Universality of random matrices in the microscopic limit and the Dirac operator spectrum. Nucl. Phys. B 487, no. 3, 721–738 (1997)
5. Akemann, G., Damgaard, P.H., Magnea, U., Nishigaki, S.: Multicritical microscopic spectral correlators of Hermitian and complex matrices. Nucl. Phys. B 519, no. 3, 682–714 (1998)
6. Akemann, G., Fyodorov, Y.V.: Universal random matrix correlations of ratios of characteristic polynomials at the spectral edges. Nucl. Phys. B 664, no. 3, 457–476 (2003)
7. Baik, J., Deift, P., Strahov, E.: Products and ratios of characteristic polynomials of random hermitian matrices. J. Math. Phys. 44, no. 8, 3657–3670 (2003)
8. Brézin, E., Hikami, S.: Characteristic polynomials of random matrices. Commun. Math. Phys. 214, no. 1, 111–135 (2000)
9. Damgaard, P.H.: Dirac operator spectra from finite-volume partition functions. Phys. Lett. B 424, no. 3–4, 322–327 (1998)
10. Deift, P.: Orthogonal Polynomials and Random Matrices: A Riemann-Hilbert Approach. Courant Lecture Notes 3, New York University, 1999
11. Deift, P., Its, A.R., Zhou, X.: A Riemann-Hilbert approach to asymptotic problems arising in the theory of random matrix models, and also in the theory of integrable statistical mechanics. Ann. Math. 146, 149–235 (1997)
12. Deift, P., Kriecherbauer, T., McLaughlin, K.T-R.: New results on the equilibrium measure for logarithmic potentials in the presence of an external field. J. Approx. Theory 95, 388–475 (1998)
13. Deift, P., Kriecherbauer, T., McLaughlin, K.T-R., Venakides, S., Zhou, X.: Uniform asymptotics for polynomials orthogonal with respect to varying exponential weights and applications to universality questions in random matrix theory. Comm. Pure Appl. Math. 52, 1335–1425 (1999)
14. Deift, P., Kriecherbauer, T., McLaughlin, K.T-R., Venakides, S., Zhou, X.: Strong asymptotics of orthogonal polynomials with respect to exponential weights. Comm. Pure Appl. Math. 52, 1491–1552 (1999)
15. Deift, P., Zhou, X.: A steepest descent method for oscillatory Riemann-Hilbert problems. Asymptotics for the MKdV equation. Ann. Math. 137, 295–368 (1993)
16. Fokas, A.S., Its, A.R., Kitaev, A.V.: The isomonodromy approach to matrix models in 2D quantum gravity. Commun. Math. Phys. 147, 395–430 (1992)
17. Fyodorov, Y.V., Strahov, E.: An exact formula for general spectral correlation function of random Hermitian matrices. J. Phys. A 36, no. 12, 3203–3213 (2003)
18. Hughes, C.P., Keating, J.P., O’Connell, N.: Random matrix theory and the derivative of the Riemann zeta function. Proc. R. Soc. Lond. A 456, no. 2003, 2611–2627 (2000)
19. Hughes, C., Keating, J.P., O’Connell, N.: On the characteristic polynomial of a random unitary matrix. Commun. Math. Phys. 220, no. 2, 429–451 (2001)
20. Kanzieper, E., Freilikher, V.: Random matrix models with log-singular level confinement: method of fictitious fermions. Philos. Magazine B 77, no. 5, 1161–1172 (1998)
21. Keating, J.P., Snaith, N.C.: Random matrix theory and ζ(1/2 + it). Commun. Math. Phys. 214, no. 1, 57–89 (2000)
22. Kuijlaars, A.B.J.: Riemann-Hilbert analysis for orthogonal polynomials. In: Orthogonal Polynomials and Special Functions: Leuven 2002, E. Koelink, W. Van Assche, eds., Lect. Notes Math. 1817, Berlin-Heidelberg-New York: Springer-Verlag, 2003, pp. 167–210
23. Kuijlaars, A.B.J., McLaughlin, K.T-R., Van Assche, W., Vanlessen, M.: The Riemann–Hilbert approach to strong asymptotics for orthogonal polynomials. Adv. Math. 188, no. 2, 337–398 (2004)
24. Kuijlaars, A.B.J., Vanlessen, M.: Universality for eigenvalue correlations from the modified Jacobi unitary ensemble. Int. Math. Res. Notices 2002, no. 30, 1575–1600 (2002)
25. Kuijlaars, A.B.J., Vanlessen, M.: Universality for eigenvalue correlations at the origin of the spectrum. Commun. Math. Phys. 243, no. 1, 163–191 (2003)
26. Mehta, M.L.: Random Matrices, 2nd ed. Boston: Academic Press, 1991
27. Mehta, M.L., Normand, J-M.: Moments of the characteristic polynomial in the three ensembles of random matrices. J. Phys. A 34, no. 22, 4627–4639 (2001)
28. Nishigaki, S.: Microscopic universality in random matrix models of QCD. In: New developments in quantum field theory, New York: Plenum Press, 1998, pp. 287–295
29. Saff, E.B., Totik, V.: Logarithmic Potentials with External Fields. New York: Springer-Verlag, 1997
30. Strahov, E., Fyodorov, Y.V.: Universal results for correlations of characteristic polynomials: Riemann-Hilbert approach. Commun. Math. Phys. 241, no. 2–3, 343–382 (2003)
31. Szegő, G.: Orthogonal Polynomials. Fourth edition, Colloquium Publications, Vol. 23, Providence R.I.: Amer. Math. Soc., 1975
32. Vanlessen, M.: Strong asymptotics of the recurrence coefficients of orthogonal polynomials associated to the generalized Jacobi weight. J. Approx. Theory 125, no. 2, 198–237 (2003)
33. Verbaarschot, J.J.M., Zahed, I.: Random matrix theory and three-dimensional QCD. Phys. Rev. Lett. 73, no. 17, 2288–2291 (1994)

Communicated by P. Sarnak
Commun. Math. Phys. 253, 561–583 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1237-x
Communications in
Mathematical Physics
Solutions of the Einstein Constraint Equations with Apparent Horizon Boundaries

David Maxwell
Department of Mathematics, University of Washington, Seattle, WA 98195-4350, USA

Received: 27 July 2003 / Accepted: 16 June 2004
Published online: 17 November 2004 – © Springer-Verlag 2004
Abstract: We construct asymptotically Euclidean solutions of the vacuum Einstein constraint equations with an apparent horizon boundary condition. Specifically, we give sufficient conditions for the constant mean curvature conformal method to generate such solutions. The method of proof is based on the barrier method used by Isenberg for compact manifolds without boundary, suitably extended to accommodate semilinear boundary conditions and low regularity metrics. As a consequence of our results for manifolds with boundary, we also obtain improvements to the theory of the constraint equations on asymptotically Euclidean manifolds without boundary.
1. Introduction

The N-body problem in general relativity concerns the dynamics of an isolated system of N black holes. One aspect of the problem, quite different from its classical counterpart, is the complexity of constructing appropriate initial data for the associated Cauchy problem. Initial data on an n-manifold $M$ is a Riemannian metric $g$ and a symmetric (0, 2)-tensor $K$. We think of $M$ as an embedded spacelike hypersurface of an ambient Lorentz manifold $\mathcal{M}$; $g$ is the pullback of the Lorentz metric on $\mathcal{M}$ and $K$ is the extrinsic curvature of $M$ in $\mathcal{M}$. To model an isolated system of N black holes in vacuum, the triple $(M, g, K)$ must satisfy several requirements.

Isolated systems are typically modeled with asymptotically Euclidean initial data. This requires that $g$ approach the Euclidean metric and that $K$ decay to zero at far distances in $M$ (see Sect. 2 for a rigorous definition). Moreover, the vacuum Einstein equation imposes a compatibility condition on $K$, $g$, and its scalar curvature $R$:
\[
R - |K|^2 + (\operatorname{tr} K)^2 = 0, \qquad \operatorname{div} K - d \operatorname{tr} K = 0. \tag{1}
\]
These are known as the Einstein constraint equations. Finally, data for N black holes must evolve into a spacetime $\mathcal{M}$ containing an event horizon, and the intersection of the event horizon with $M$ must have N connected components.

An event horizon is the boundary of the region that can send signals to infinity. It is a global property of a spacetime and cannot be located in an initial data set without evolving the data. This poses a serious obstacle to creating multiple black hole initial data. To address this problem, schemes such as those in [24, 3, 31, 28] (see also [10]) create initial data containing an apparent horizon, defined below. The motivation for using apparent horizons comes from the weak Cosmic Censorship conjecture. Assuming that asymptotically Euclidean initial data evolves into a weakly censored spacetime, any apparent horizons present in the initial data will be contained in the black hole region of the spacetime. To generate spacetimes with multiple black holes, one constructs initial data with multiple apparent horizons. If the apparent horizons are well separated, one conjectures that they will be associated with distinct black holes.

An interesting approach to creating initial data containing apparent horizons, first suggested by Thornburg [28], is to work with a manifold with boundary and prescribe that the boundary be an apparent horizon. Thornburg numerically investigated generating such initial data. Variations of the apparent horizon condition have subsequently been proposed for numerical study, e.g. [11, 15]. However, as indicated by Dain [13], there has not been a rigorous mathematical investigation of the apparent horizon boundary condition. The goal of this paper is to take an initial step in addressing the problem of constructing asymptotically Euclidean Cauchy data satisfying the apparent horizon boundary condition. We exhibit sufficient conditions for generating a family of this data.
An apparent horizon is a surface that is instantaneously neither expanding nor contracting as it evolves under the flow of its outgoing (to infinity) orthogonal null geodesics. On the boundary of $M$, the expansion under this flow is given by the so-called convergence
\[
\theta_+ = -\operatorname{tr} K + K(\nu,\nu) - (n-1)h, \tag{2}
\]
where $\nu$ is the exterior unit normal of $\partial M$ and $h$ is the mean curvature of $\partial M$ computed with respect to $-\nu$. Hence $\partial M$ is an apparent horizon if $\theta_+ = 0$. More generally, we say that $\partial M$ is outer marginally trapped if $\theta_+ \le 0$. The corresponding convergence for the incoming (from infinity) family of null geodesics is
\[
\theta_- = -\operatorname{tr} K + K(\nu,\nu) + (n-1)h. \tag{3}
\]
A surface is marginally trapped if both $\theta_+ \le 0$ and $\theta_- \le 0$.

There is not a consistent definition of an apparent horizon in the literature. Other definitions of an apparent horizon include the boundary of a trapped region or an outermost marginally trapped surface. All these structures imply the existence of a black hole in the spacetime. We work with the definition $\theta_+ = 0$ since it is a local property and forms a natural boundary condition. Hence we seek asymptotically Euclidean data $(M, g, K)$ satisfying
\[
\begin{aligned}
R - |K|^2 + (\operatorname{tr} K)^2 &= 0, \\
\operatorname{div} K - d \operatorname{tr} K &= 0, \\
-\operatorname{tr} K + K(\nu,\nu) - (n-1)h &= 0 \quad \text{on } \partial M.
\end{aligned} \tag{4}
\]
The conformal method of Lichnerowicz [21], Choquet-Bruhat and York [9] provides a natural approach to the problem. For simplicity we work with its constant mean curvature (CMC) formulation, under which the constraint equations decouple. The conformal method seeks a solution of the form
\[
(\hat g, \hat K) = \left( \phi^{\frac{4}{n-2}} g,\ \phi^{-2} \sigma + \frac{\tau}{n} \hat g \right),
\]
where $g$ is a given asymptotically Euclidean metric prescribing the conformal class of $\hat g$, $\sigma$ is an unknown traceless symmetric (0, 2)-tensor, $\tau$ is a constant specifying $\operatorname{tr} \hat K$, and $\phi$ is an unknown conformal factor tending to 1 at infinity. From the decay conditions on $\hat K$ for asymptotically Euclidean initial data and the assumption that $\tau$ is constant, we have the further simplification $\tau = 0$. Equations (4) then become a semilinear equation with semilinear boundary condition for $\phi$,
\[
\begin{aligned}
-\Delta \phi + \frac{1}{a} \left( R\phi - |\sigma|^2 \phi^{-3-2\kappa} \right) &= 0, \\
\partial_\nu \phi + \frac{1}{\kappa} h \phi - \frac{2}{a} \sigma(\nu,\nu) \phi^{-1-\kappa} &= 0 \quad \text{on } \partial M,
\end{aligned} \tag{5}
\]
and a linear system for $\sigma$,
\[
\operatorname{div} \sigma = 0. \tag{6}
\]
The first equation of (5) is known as the Lichnerowicz equation. In (5), the dimensional constants are $\kappa = 2/(n-2)$ and $a = 2\kappa + 4$. Note that we use the exterior normal $\nu$ to follow traditional PDE notation, but the mean curvature $h$ is computed with respect to the interior unit normal. A trace-free, symmetric (0, 2)-tensor $\sigma$ satisfying (6) is called transverse traceless. The set of transverse traceless tensors forms a linear space, and the choice of $\sigma$ can be thought of as data to be prescribed in solving (5). Hence, we wish to find conditions on $g$ and $\sigma$ under which (5) is solvable.

The construction of solutions of (5) starts with an asymptotically Euclidean manifold $(M, g')$ satisfying
\[
\lambda_{g'} > 0. \tag{7}
\]
For any asymptotically Euclidean metric $(M, g)$, $\lambda_g$ is the conformal invariant
\[
\lambda_g = \inf_{f \in C_c^\infty(M),\ f \not\equiv 0} \frac{\int_M a |\nabla f|^2 + R f^2 \, dV + \int_{\partial M} \frac{a}{\kappa} h f^2 \, dA}{\|f\|^2_{L^{2n/(n-2)}}}.
\]
This is analogous to an invariant for compact manifolds with boundary found in [16]. We next make a conformal change from $g'$ to a metric $g$ satisfying $R = 0$ and $h < 0$; Corollary 1 ensures this can always be done. Our main result, Theorem 1, proves that (5) is then solvable if
\[
(n-1)h \le \sigma(\nu,\nu) \le 0. \tag{8}
\]
We prove Theorem 1 using a barrier method for semilinear boundary conditions. Section 3 establishes a general existence theorem and Sect. 4 applies it to system (5).
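For orientation, the dimensional constants $\kappa = 2/(n-2)$ and $a = 2\kappa + 4$ and the exponents appearing in (5) can be tabulated for small $n$. The helper name `conformal_constants` below is a hypothetical choice, and the check that $2/a = 1/(\kappa(n-1))$ is an arithmetic observation about these constants rather than a statement taken from the text; this is only a sketch using exact rational arithmetic.

```python
from fractions import Fraction


def conformal_constants(n):
    """Return (kappa, a) with kappa = 2/(n - 2) and a = 2*kappa + 4,
    as exact fractions; the text assumes n >= 3."""
    if n < 3:
        raise ValueError("n must be at least 3")
    kappa = Fraction(2, n - 2)
    a = 2 * kappa + 4
    return kappa, a


for n in (3, 4, 5):
    kappa, a = conformal_constants(n)
    interior_exponent = -3 - 2 * kappa  # power of phi multiplying |sigma|^2
    boundary_exponent = -1 - kappa      # power of phi multiplying sigma(nu, nu)
    # Arithmetic identity relating the two coefficient conventions.
    assert Fraction(2) / a == 1 / (kappa * (n - 1))
    print(n, kappa, a, interior_exponent, boundary_exponent)
```

In dimension $n = 3$ this gives $\kappa = 2$ and $a = 8$, so the conformal factor scales the metric by $\phi^4$ and the Lichnerowicz nonlinearity is $\phi^{-7}$.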
One can easily find asymptotically Euclidean manifolds with boundary that satisfy $\lambda_g > 0$. For example, every manifold with $R \ge 0$ and $h \ge 0$ has $\lambda_g > 0$. So condition (7) of the construction can be readily met. On the other hand, it is not obvious that the restriction (8) is reasonable. In Sect. 5 we show that on any sufficiently smooth asymptotically Euclidean manifold with boundary we can freely specify $\sigma(\nu,\nu)$ on the boundary. That is, given a function $f$ on $\partial M$, we can find a transverse traceless tensor $\sigma$ satisfying $\sigma(\nu,\nu) = f$. This follows from the solution to a boundary value problem for the vector Laplacian. Since $h < 0$, it follows that there exists a large family of transverse traceless tensors $\sigma$ satisfying (8).

It should be noted that our construction is not a full parameterization of the set of CMC solutions of (4). Since condition (8) is not conformally invariant, the set of allowed transverse traceless tensors depends on the choice of conformal representative satisfying $R = 0$ and $h < 0$. Hence condition (8) is not necessary. Moreover, although the technical condition (7) is vital for the construction, it is not clear if it is necessary. Hence there remains much to be understood about parameterizing the full set of solutions.

In light of recent low regularity a priori estimates for solutions of the evolution problem [20, 27], there is interest in generating low regularity solutions of the constraints. In terms of $L^p$ Sobolev spaces, a natural setting is $(g, \sigma) \in W^{2,\frac{n}{2}+\epsilon}_{\mathrm{loc}} \times W^{1,\frac{n}{2}+\epsilon}_{\mathrm{loc}}$. This is the weakest regularity that ensures that $g$ has curvature in an $L^p$ space and that the Sobolev space containing $g$ is an algebra. Y. Choquet-Bruhat has announced [6] a construction of such low regularity solutions of the constraint equations in the context of compact manifolds. We construct asymptotically Euclidean solutions with this level of regularity, but under a possibly unneeded assumption. In order to find suitable transverse traceless tensors, we require that $(M, g)$ not admit any nontrivial conformal Killing fields vanishing at infinity. This is known to be true [12] for $C^3$ asymptotically Euclidean manifolds. We prove this also holds for metrics with regularity as weak as $W^{2,n+\epsilon}$. To consider $W^{2,\frac{n}{2}+\epsilon}$ metrics, however, we must assume the non-existence of such fields.

In a previous version of this article, we constructed solutions of (4) under the hypotheses $\lambda_g > 0$ and $\sigma(\nu,\nu) \ge 0$. These solutions have the undesirable feature that although the boundary is an outer marginally trapped surface, it is not a marginally trapped surface. This observation was made in [14], which appeared shortly after our results were announced. From (2) and (3) we see that $\hat\theta_- = \hat\theta_+ + 2(n-1)\hat h$. Since $\hat\theta_+ = 0$ on $\partial M$, we have $\hat\theta_- = 2(n-1)\hat h$. So $\partial M$ is a marginally trapped surface if and only if $\hat h \le 0$. The sign of $\hat h$ is determined by $\sigma(\nu,\nu)$ since $\hat\theta_+ = 0$ implies
\[
(n-1)\hat h = \hat K(\hat\nu, \hat\nu) = \phi^{-2-2\kappa} \sigma(\nu,\nu).
\]
So $\hat h \le 0$ if and only if $\sigma(\nu,\nu) \le 0$. Under the hypothesis $\sigma(\nu,\nu) \ge 0$, one can construct an apparent horizon that is also a marginally trapped surface only if $\sigma(\nu,\nu) = 0$, which leads to $\hat\theta_- = 0$ and $\hat h = 0$.

In [14], Dain constructs trapped surface boundaries by working with $\hat\theta_-$ rather than $\hat\theta_+$. Under suitable hypotheses, [14] prescribes $\hat\theta_- \le 0$ and constructs boundaries with $\hat\theta_+ \le \hat\theta_- \le 0$.
These are trapped surfaces, but the inequality $\hat\theta_+ \le \hat\theta_-$ shows that the resulting boundaries satisfy $\hat h \ge 0$. In particular, the techniques of [14] also cannot construct a boundary that is simultaneously a marginally trapped surface and an apparent horizon unless $\hat\theta_- = 0$ and $\hat h = 0$.

In the current version of this article, we construct solutions with $\hat h \le 0$. This requires that we assume $\sigma(\nu,\nu) \le 0$, since the sign of $\hat h$ is determined by the sign of $\sigma(\nu,\nu)$. The resulting PDEs are more delicate, and the hypothesis $(n-1)h \le \sigma(\nu,\nu)$ arises to compensate for this. Since $\hat h \le 0$, the apparent horizons we construct satisfy $\hat\theta_- \le \hat\theta_+ = 0$, with strict inequality wherever $\sigma(\nu,\nu)$ (and hence $\hat h$) is negative. Figure 1 shows boundary mean curvatures of various signs and further indicates the naturality of the condition $\hat h \le 0$.

Fig. 1. Boundary mean curvatures of an asymptotically Euclidean manifold

1.1. Notation. Let $\mathcal{M}$ be a Lorentz manifold with metric $\gamma$ and connection $D$. The signature of $\gamma$ is $(- + \cdots +)$. If $M$ is a spacelike hypersurface of $\mathcal{M}$ with timelike unit normal $N$, we define the extrinsic curvature $K$ of $M$ in $\mathcal{M}$ by $K(X, Y) = \langle D_X Y, N \rangle_\gamma$ for vector fields tangent to $M$. This definition agrees with that used in [31] and [13], but differs in sign from that used in [30] and [14].

Let $M$ be a Riemannian manifold with metric $g$ and connection $\nabla$. If $\Sigma$ is a spacelike hypersurface of $M$ with unit normal $\nu$, the extrinsic curvature $k$ of $\Sigma$ in $M$ is similarly defined by $k(X, Y) = \langle \nabla_X Y, \nu \rangle_g$. The mean curvature $h$ of $\Sigma$ computed with respect to $\nu$ is $\frac{1}{n-1} \operatorname{tr} k$, where $n$ is the dimension of $M$.

Throughout this paper we take $n$ to be a fixed integer with $n \ge 3$ and $\rho$ to be a negative real number. If $p \in [1, \infty]$, we define the critical Sobolev exponent $p^* = \frac{np}{n-p}$ if $p < n$ and we set $p^* = \infty$ otherwise. The ball of radius $r$ about $x$ is $B_r(x)$, and $E_r$ is the region exterior to $\overline{B}_r(0)$. We define $f^{(+)}(x) = \max(f(x), 0)$.

We use the notation $A \lesssim B$ to mean $A < cB$ for a certain positive constant $c$. The constant is independent of the functions and parameters appearing in $A$ and $B$ that are not assumed to have a fixed value. For example, when considering a sequence $\{f_i\}_{i=1}^\infty$ of functions on a domain $\Omega$, the expression $\|f_i\|_{L^1(\Omega)} \lesssim 1$ means the sequence is uniformly bounded in $L^1(\Omega)$ (with a bound that might depend on $\Omega$).

2. Asymptotically Euclidean Manifolds

An asymptotically Euclidean manifold is a non-compact Riemannian manifold, possibly with boundary, that can be decomposed into a compact core and a finite number of ends where the metric approaches the Euclidean metric at far distances. To make this loose description precise, we use weighted function spaces that prescribe asymptotic behavior like $|x|^\delta$ for large $x$.

For $x \in \mathbb{R}^n$, let $w(x) = (1 + |x|^2)^{1/2}$. Then for any $\delta \in \mathbb{R}$ and any open set $\Omega \subset \mathbb{R}^n$, the weighted Sobolev space $W^{k,p}_\delta(\Omega)$ is the subset of $W^{k,p}_{\mathrm{loc}}(\Omega)$ for which the norm
\[
\|u\|_{W^{k,p}_\delta(\Omega)} = \sum_{|\beta| \le k} \left\| w^{-\delta - \frac{n}{p} + |\beta|} \partial^\beta u \right\|_{L^p(\Omega)}
\]
is finite; we will always work with spaces for which $p \ne 1, \infty$. Weighted spaces of continuous functions are defined by the norm
\[
\|u\|_{C^k_\delta(\Omega)} = \sum_{|\alpha| \le k} \sup_{x \in \Omega} w(x)^{-\delta + |\alpha|} \left| \partial^\alpha u(x) \right|.
\]
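The critical Sobolev exponent and the weight function introduced above are simple enough to sketch in code. The function names below are hypothetical, and `weighted_sup` only samples the $k = 0$ weighted sup norm on a finite set of points, so it is an illustration of the indexing convention rather than a computation of the actual norm.

```python
import math


def critical_sobolev_exponent(n, p):
    """Return p* = n p / (n - p) if p < n, and infinity otherwise."""
    return n * p / (n - p) if p < n else math.inf


def weight(x):
    """w(x) = (1 + |x|^2)^(1/2) for a point x given as a tuple of coordinates."""
    return math.sqrt(1.0 + sum(c * c for c in x))


def weighted_sup(u, delta, points):
    """Sampled version of sup w(x)^(-delta) |u(x)| (the k = 0 weighted sup norm)."""
    return max(weight(x) ** (-delta) * abs(u(x)) for x in points)


delta = -1.0


def u(x):
    # A function decaying like |x|^delta at infinity.
    return weight(x) ** delta


# Sample points marching out to infinity along one axis of R^3.
points = [(10.0 ** j, 0.0, 0.0) for j in range(6)]

print(critical_sobolev_exponent(3, 2))
print(weighted_sup(u, delta, points))  # stays bounded along the samples
print(weighted_sup(u, -2.0, points))   # grows: u does not decay like |x|^(-2)
```

The last two lines illustrate why the value of $\delta$ directly encodes the growth rate: the sampled norm stays bounded exactly when the weight matches the decay of the function.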
Our indexing convention for $\delta$ follows [2] so that the value of $\delta$ directly encodes asymptotic growth at infinity. We refer the reader to [2] for properties of these weighted spaces. In particular, we recall the following facts.

Lemma 1.
1. If $p \le q$ and $\delta' < \delta$ then $L^q_{\delta'}(\Omega) \subset L^p_\delta(\Omega)$ and the inclusion is continuous.
2. For $k \ge 1$ and $\delta' < \delta$ the inclusion $W^{k,p}_{\delta'}(\Omega) \subset W^{k-1,p}_\delta(\Omega)$ is compact.
3. If $1/p > k/n$ then $W^{k,p}_\delta(\Omega) \subset L^r_\delta(\Omega)$, where $1/r = 1/p - k/n$. If $1/p = k/n$ then $W^{k,p}_\delta(\Omega) \subset L^r_\delta(\Omega)$ for all $r \ge p$. If $1/p < k/n$ then $W^{k,p}_\delta(\Omega) \subset C^0_\delta(\Omega)$. These inclusions are continuous.
4. If $m \le \min(j, k)$, $p \le q$, $\epsilon > 0$, and $1/q < (j + k - m)/n$, then multiplication is a continuous bilinear map from $W^{j,q}_{\delta_1}(\Omega) \times W^{k,p}_{\delta_2}(\Omega)$ to $W^{m,p}_{\delta_1+\delta_2+\epsilon}(\Omega)$. In particular, if $1/p < k/n$ and $\delta < 0$, then $W^{k,p}_\delta(\Omega)$ is an algebra.

Let $M$ be a smooth, connected, n-dimensional manifold with boundary, and let $g$ be a metric on $M$ for which $(M, g)$ is complete, and let $\rho < 0$ (these will be standing assumptions for the remainder of the paper). We say $(M, g)$ is asymptotically Euclidean of class $W^{k,p}_\rho$ if:

i. The metric $g \in W^{k,p}_{\mathrm{loc}}(M)$, where $1/p - k/n < 0$ (and consequently $g$ is continuous).
ii. There exists a finite collection $\{N_i\}_{i=1}^m$ of open subsets of $M$ and diffeomorphisms $\Phi_i : E_1 \to N_i$ such that $M - \cup_i N_i$ is compact.
iii. For each $i$, $\Phi_i^* g - \overline{g} \in W^{k,p}_\rho(E_1)$, where $\overline{g}$ denotes the Euclidean metric.

We call the charts $\Phi_i$ end charts and the corresponding coordinates are end coordinates. Suppose $(M, g)$ is asymptotically Euclidean, and let $\{\Phi_i\}_{i=1}^m$ be its collection of end charts. Let $K = M - \cup_i \Phi_i(E_2)$, so $K$ is a compact manifold with boundary. The weighted Sobolev space $W^{k,p}_\delta(M)$ is the subset of $W^{k,p}_{\mathrm{loc}}(M)$ such that the norm
\[
\|u\|_{W^{k,p}_\delta(M)} = \|u\|_{W^{k,p}(K)} + \sum_i \|\Phi_i^* u\|_{W^{k,p}_\delta(E_1)}
\]
is finite. The weighted spaces $L^p_\delta(M)$ and $C^k_\delta(M)$ are defined similarly, and we let $C^\infty_\delta(M) = \cap_{k=0}^\infty C^k_\delta(M)$. Lemma 1 applies equally well to asymptotically Euclidean manifolds.

Using these weighted spaces we can now define an asymptotically Euclidean data set. The extrinsic curvature tensor $K$ of an initial data set $(M, g, K)$ should behave like a first derivative of $g$. Hence, if $(M, g)$ is asymptotically Euclidean of class $W^{k,p}_\rho$, we say $(M, g, K)$ is an asymptotically Euclidean data set if $K \in W^{k-1,p}_{\rho-1}(M)$.
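The three cases of the Sobolev embedding in item 3 of Lemma 1 depend only on the sign of $1/p - k/n$. The following sketch classifies them with exact rational arithmetic; the function name `embedding_target` and its return convention are hypothetical choices made for illustration.

```python
from fractions import Fraction


def embedding_target(n, k, p):
    """Classify the embedding of W^{k,p}_delta following item 3 of Lemma 1.

    Returns ("L", r) with 1/r = 1/p - k/n when 1/p > k/n,
    ("L", None) in the borderline case 1/p = k/n (every r >= p is allowed),
    and ("C0", None) when 1/p < k/n.
    """
    gap = 1 / Fraction(p) - Fraction(k, n)  # the quantity 1/p - k/n
    if gap > 0:
        return ("L", 1 / gap)
    if gap == 0:
        return ("L", None)
    return ("C0", None)


print(embedding_target(3, 1, 2))  # subcritical case
print(embedding_target(4, 2, 2))  # borderline case
print(embedding_target(3, 2, 2))  # embedding into continuous functions
```

For $n = 3$, $k = 1$, $p = 2$ the target exponent is $r = 6$, which agrees with the critical Sobolev exponent $p^*$ defined in the notation section.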
3. Barrier Method for Semilinear Boundary Conditions

In [18], Isenberg used a constructive barrier method (also known as the method of sub- and supersolutions) to completely parameterize CMC solutions of the constraint equations on a compact manifold. Subsequently, the method has been applied to construct non-CMC solutions on compact manifolds [25] and asymptotically Euclidean solutions [8]. In this section we provide a version of the generic barrier construction that accommodates semilinear boundary conditions and low regularity metrics. Consider the boundary value problem
$$-\Delta u = F(x, u), \qquad \partial_\nu u = f(x, u) \ \text{on } \partial M \tag{9}$$
on an asymptotically Euclidean manifold. We use the convention that $\Delta$ has negative eigenvalues, so $\Delta = \partial_{x_1}^2 + \cdots + \partial_{x_n}^2$ in Euclidean space. A subsolution of Eq. (9) is a function $u_-$ that satisfies
$$-\Delta u_- \le F(x, u_-), \qquad \partial_\nu u_- \le f(x, u_-) \ \text{on } \partial M,$$
and a supersolution $u_+$ is defined similarly with the inequalities reversed. In Proposition 2 below, we show that if there exists a subsolution $u_-$ and a supersolution $u_+$ decaying at infinity and satisfying $u_- \le u_+$, then there exists a solution $u$ satisfying $u_- \le u \le u_+$. The proof of Proposition 2 relies on properties of the associated linearized operator
$$-\Delta u + V u = F, \qquad \partial_\nu u + \mu u = f \ \text{on } \partial M, \tag{10}$$
where $V$, $\mu$, $F$, and $f$ are functions of $x$ alone. Let $\mathcal{P}$ denote $\big( -\Delta + V,\ (\partial_\nu + \mu)|_{\partial M} \big)$.
Proposition 1. Suppose $(M, g)$ is asymptotically Euclidean of class $W^{k,p}_\rho$, $k \ge 2$, $k > n/p$, and suppose $V \in W^{k-2,p}_{\rho-2}$ and $\mu \in W^{k-1-\frac{1}{p},p}$. Then if $2 - n < \delta < 0$ the operator
$$\mathcal{P} : W^{k,p}_\delta(M) \to W^{k-2,p}_{\delta-2}(M) \times W^{k-1-\frac{1}{p},p}(\partial M)$$
is Fredholm with index 0. Moreover, if $V \ge 0$ and $\mu \ge 0$ then $\mathcal{P}$ is an isomorphism.

In Proposition 6 we prove a similar result for the vector Laplacian. Since the details are tedious and largely similar, we omit the proof of Proposition 1. The only substantial difference from the proof of Proposition 6 is the method used to show $\mathcal{P}$ is injective when $V$ and $\mu$ are nonnegative. This is an easy consequence of the following weak maximum principle.

Lemma 2. Suppose $(M, g)$, $V$, and $\mu$ satisfy the hypotheses of Proposition 1 and suppose $V \ge 0$ and $\mu \ge 0$. If $u \in W^{k,p}_{\mathrm{loc}}$ satisfies
$$-\Delta u + V u \le 0, \qquad \partial_\nu u + \mu u \le 0 \ \text{on } \partial M, \tag{11}$$
and if $u^{(+)}$ is $o(1)$ on each end of $M$, then $u \le 0$. In particular, if $u \in W^{k,p}_\delta(M)$ for some $\delta < 0$ and $u$ satisfies (11), then $u \le 0$.
D. Maxwell
Proof. Fix $\epsilon > 0$, and let $v = (u - \epsilon)^{(+)}$. Since $u^{(+)} = o(1)$ on each end, we see $v$ is compactly supported. Moreover, since $u \in W^{k,p}_{\mathrm{loc}}$ we have from Sobolev embedding that $u \in W^{1,2}_{\mathrm{loc}}$ and hence $v \in W^{1,2}$. Now,
$$\int_M -v \Delta u \, dV \le -\int_M V u v \, dV \le 0,$$
since $V \ge 0$, $v \ge 0$ and since $u$ is positive wherever $v \ne 0$. Integrating by parts we have
$$\int_M |\nabla v|^2 \, dV - \int_{\partial M} v \, \partial_\nu u \, dA \le 0,$$
since $\nabla u = \nabla v$ on the support of $v$. From the boundary condition we obtain
$$\int_M |\nabla v|^2 \, dV \le -\int_{\partial M} \mu u v \, dA \le 0,$$
since $\mu \ge 0$. So $v$ is constant and compactly supported, and we conclude $u \le \epsilon$. Sending $\epsilon$ to 0 proves $u \le 0$.

Finally, if $u \in W^{k,p}_\delta$, then $u \in C^0_\delta$. Hence if $\delta < 0$, then $u^{(+)} = o(1)$ and the lemma can be applied to $u$.

If $V$ or $\mu$ is negative at some point, the kernel of $\mathcal{P}$ might not be trivial. We have the following estimate for how elements of the kernel decay at infinity.

Lemma 3. Suppose $(M, g)$, $V$, and $\mu$ satisfy the hypotheses of Proposition 1 and suppose that $u \in W^{k,p}_\delta$ with $\delta < 0$ is in the kernel of $\mathcal{P}$. Then $u \in W^{k,p}_{\delta'}(M)$ for every $\delta' \in (2 - n, 0)$.

Proof. Since $V \in W^{k-2,p}_{\rho-2}$ we have $V u \in W^{k-2,p}_{\rho+\delta-2}$. Hence
$$(-\Delta u, \partial_\nu u) = (-V u, -\mu u) \in W^{k-2,p}_{\rho+\delta-2}(M) \times W^{k-1-\frac{1}{p},p}(\partial M).$$
Since $(-\Delta, \partial_\nu|_{\partial M})$ is an isomorphism on $W^{k,p}_{\delta'}$ for each $\delta' \in (2 - n, 0)$, we conclude that $u \in W^{k,p}_{\delta'}$ for each $\delta' \in (\max(2 - n, \rho + \delta - 2), 0)$. Iterating this argument a finite number of times yields the desired result.
Although the barrier construction in Proposition 2 below only uses the weak maximum principle Lemma 2, we need the following strong maximum principle in our later analysis of the Lichnerowicz equation to ensure that the conformal factors we construct never vanish. Note that there is no sign restriction on $V$ and $\mu$.

Lemma 4. Suppose $(M, g)$, $V$, and $\mu$ satisfy the hypotheses of Proposition 1. Suppose also that $u \in W^{k,p}_{\mathrm{loc}}(M)$ is nonnegative and satisfies
$$-\Delta u + V u \ge 0, \qquad \partial_\nu u + \mu u \ge 0 \ \text{on } \partial M.$$
If $u(x) = 0$ at some point $x \in M$, then $u$ vanishes identically.
Proof. From Sobolev embedding, we can assume without loss of generality that the hypotheses of Proposition 1 are satisfied with $k = 2$. Suppose first that $x$ is an interior point of $M$. Since $g$ is continuous and $V \in L^p_{\mathrm{loc}}(M)$ with $p > n/2$, the weak Harnack inequality of [29] holds and we have for some radius $R$ sufficiently small and some exponent $q$ sufficiently large there exists a constant $C > 0$ such that
$$\|u\|_{L^q(B_{2R}(x))} \le C \inf_{B_R(x)} u = 0.$$
Hence $u$ vanishes in a neighbourhood of $x$, and a connectivity argument implies $u$ is identically 0.

It remains to consider the case where $u$ vanishes at a point $x \in \partial M$. Working in local coordinates about $x$ we can do our analysis on $B_1^+(0) \equiv B_1(0) \cap \mathbb{R}^n_+$, where balls are now taken with respect to the flat background metric. Let $b$ be a $W^{1,p}(B_1^+)$ vector field such that $\langle b, \nu \rangle = \mu$ on $D_1$, where $D_1$ is the flat portion of the boundary of $B_1^+$. For example, since $\mu \in W^{1-\frac{1}{p},p}$ and $g \in W^{2,p}_{\mathrm{loc}}$, we can take $b = \hat\mu \hat\nu$ where $\hat\mu$ is a $W^{1,p}$ extension of $\mu$ and $\hat\nu$ is a $W^{2,p}$ extension of $\nu$. Integrating by parts, we have for any nonnegative $\phi \in C_c^\infty(B_1^+ \cup D_1)$,
$$\int_{B_1^+} \langle \nabla\phi, b \rangle u + u\phi \operatorname{div} b + \langle b, \nabla u \rangle \phi \, dV = \int_{D_1} \mu u \phi \, dA.$$
Hence
$$\int_{B_1^+} \langle \nabla u, \nabla\phi \rangle_g + R u\phi + \langle \nabla\phi, b \rangle u + u\phi \operatorname{div} b + \langle b, \nabla u \rangle \phi \, dV = \int_{B_1^+} -\phi \Delta u + R u\phi \, dV + \int_{D_1} \partial_\nu u \, \phi + \mu \phi u \, dA \ge 0, \tag{12}$$
since $u$ is a supersolution. To reduce to the interior case, we now construct an elliptic equation on all of $B_1$. For any function or tensor $f$ defined on $B_1^+(0)$, let $\tilde f$ be the extension of $f$ to $B_1$ via its pushforward under reflection. It follows from (12) and a change of variables argument that for any nonnegative $\phi \in C_c^\infty(B_1)$,
$$\int_{B_1} \langle \nabla \tilde u, \nabla\phi \rangle_{\tilde g} + \tilde R \tilde u \phi + \langle \nabla\phi, \tilde b \rangle_{\tilde g} \tilde u + \tilde u \phi \, \widetilde{\operatorname{div} b} + \langle \tilde b, \nabla \tilde u \rangle_{\tilde g} \phi \, d\tilde V \ge 0. \tag{13}$$
Since $\tilde g \in W^{1,2p}(B_1)$, $\tilde R \in L^p(B_1)$, $\tilde b \in L^{2p}(B_1)$, and $\widetilde{\operatorname{div} b} \in L^p(B_1)$, we conclude from (13) that $\tilde u$ is a weak $W^{1,2p}$ supersolution of an elliptic equation with coefficients having regularity considered by [29]. Since $\tilde u \ge 0$ and $\tilde u(0) = 0$, the weak Harnack inequality again implies that $u$ vanishes in a neighbourhood of $x$ and hence on all of $M$.

We now turn to the existence proof for the nonlinear problem (9). We assume for simplicity that the nonlinearities $F$ and $f$ have the form
$$F(x, y) = \sum_{j=1}^{l} F_j(x) G_j(y), \qquad f(x, y) = \sum_{j=1}^{m} f_j(x) g_j(y).$$
Proposition 2. Suppose
1. $(M, g)$ is asymptotically Euclidean of class $W^{k,p}_\rho$ with $k \ge 2$, $p > n/k$, and $\rho < 0$,
2. $u_-, u_+ \in W^{k,p}_\delta$ with $\delta \in (2 - n, 0)$ are a subsolution and a supersolution respectively of (9) such that $u_- \le u_+$,
3. each $F_j \in W^{k-2,p}_{\delta-2}(M)$ and $f_j \in W^{k-1-\frac{1}{p},p}(\partial M)$,
4. each $G_j$ and $g_j$ are smooth on $I = [\inf(u_-), \sup(u_+)]$.
Then there exists a solution $u$ of (9) such that $u_- \le u \le u_+$.

Proof. We first assume $k = 2$ and $p > n/2$. Let
$$V(x) = \sum_{j=1}^{l} |F_j(x)| \, \Big| \min_I G_j' \Big|, \qquad \mu(x) = \sum_{j=1}^{m} |f_j(x)| \, \Big| \min_I g_j' \Big|,$$
so that $V \in L^p_{\delta-2}(M)$, $\mu \in W^{1-\frac{1}{p},p}(\partial M)$, and both are nonnegative. Let $F_V(x, y) = F(x, y) + V(x) y$ and $f_\mu(x, y) = f(x, y) + \mu(x) y$ so that $F_V$ and $f_\mu$ are both nondecreasing in $y$. Let $L_V = -\Delta + V$, and let $B_\mu = (\partial_\nu + \mu)|_{\partial M}$. From Proposition 1 we have $(L_V, B_\mu)$ is an isomorphism acting on $W^{2,p}_\delta$. We construct a monotone decreasing sequence of functions $u_+ = u_0 \ge u_1 \ge u_2 \ge \cdots$ by letting
$$L_V u_{i+1} = F_V(x, u_i), \qquad B_\mu u_{i+1} = f_\mu(x, u_i).$$
The monotonicity of the sequence follows from the maximum principle and the monotonicity of $F_V(x, y)$ and $f_\mu(x, y)$ in $y$. The maximum principle also implies $u_i \ge u_-$.

We claim the sequence $\{u_i\}_{i=1}^\infty$ is bounded in $W^{2,p}_\delta(M)$. From Proposition 1 we can estimate
$$\|u_{i+1}\|_{W^{2,p}_\delta(M)} \lesssim \|F_V(x, u_i)\|_{L^p_{\delta-2}(M)} + \|f_\mu(x, u_i)\|_{W^{1-\frac{1}{p},p}(\partial M)}. \tag{14}$$
Pick $q \in (p, \infty)$ such that
$$\frac{1}{p} - \frac{1}{n} < \frac{1}{q} < \frac{1}{n},$$
which is possible since $p > n/2$. Then
$$\|F_V(x, u_i)\|_{L^p_{\delta-2}(M)} + \|f_\mu(x, u_i)\|_{W^{1-\frac{1}{p},p}(\partial M)} \lesssim 1 + \|u_i\|_{W^{1,q}(U)} \tag{15}$$
for any fixed smooth bounded neighbourhood $U$ of $\partial M$. From interpolation and Sobolev embedding we have for any $\epsilon > 0$,
$$\|u_i\|_{W^{1,q}(U)} \lesssim C(\epsilon) \|u_i\|_{W^{1,p}(U)} + \epsilon \|u_i\|_{W^{2,p}(U)}.$$
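A second application of interpolation then implies
$$\|u_i\|_{W^{1,q}(U)} \lesssim C(\epsilon) \|u_i\|_{L^p(U)} + \epsilon \|u_i\|_{W^{2,p}(U)}.$$
Hence
$$\|u_i\|_{W^{1,q}(U)} \lesssim C(\epsilon) \|u_i\|_{L^p_\delta(M)} + \epsilon \|u_i\|_{W^{2,p}_\delta(M)}. \tag{16}$$
Since $u_- \le u_i \le u_+$, we have $\|u_i\|_{L^p_\delta(M)}$ is uniformly bounded. Combining (14), (15) and (16) we obtain, taking $\epsilon$ sufficiently small,
$$\|u_{i+1}\|_{W^{2,p}_\delta} \le \frac{1}{2} \|u_i\|_{W^{2,p}_\delta} + C.$$
Iterating this inequality we obtain a bound for all $i$,
$$\|u_i\|_{W^{2,p}_\delta} \le \|u_+\|_{W^{2,p}_\delta} + 2C.$$
Hence some subsequence of $\{u_i\}_{i=1}^\infty$ (and by monotonicity, the whole sequence) converges weakly in $W^{2,p}_\delta$ to a limit $u_\infty$.

It remains to see $u_\infty$ is a solution of (9). Now $u_i$ converges strongly to $u_\infty$ in $W^{1,p}_{\delta'}$ for any $\delta' > \delta$, and also converges uniformly on compact sets. Hence for any $\phi \in C_c^\infty(M)$,
$$\int_M \big( F_V(x, u_i) - V(x) u_{i+1} \big) \phi \, dV \to \int_M F(x, u_\infty) \phi \, dV,$$
$$\int_{\partial M} \big( f_\mu(x, u_i) - \mu(x) u_{i+1} \big) \phi \, dA \to \int_{\partial M} f(x, u_\infty) \phi \, dA,$$
$$\int_M \langle \nabla u_{i+1}, \nabla\phi \rangle \, dV \to \int_M \langle \nabla u_\infty, \nabla\phi \rangle \, dV.$$
So
$$\int_M \langle \nabla u_\infty, \nabla\phi \rangle \, dV = \int_M F(x, u_\infty) \phi \, dV + \int_{\partial M} f(x, u_\infty) \phi \, dA,$$
and an application of integration by parts shows $u_\infty$ solves the boundary value problem. The case $k > 2$ now follows from a bootstrap using the previous result together with Proposition 1.

4. Solving the Lichnerowicz Equation

We now prove the existence of solutions of the Lichnerowicz equation
$$-\Delta\phi + \frac{1}{a} \big( R\phi - |\sigma|^2 \phi^{-3-2\kappa} \big) = 0, \qquad \partial_\nu \phi + \frac{1}{\kappa} h \phi - \frac{2}{a} \sigma(\nu, \nu) \phi^{-1-\kappa} = 0 \ \text{on } \partial M. \tag{17}$$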
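The monotone iteration from the proof of Proposition 2 can be illustrated on a toy problem. The sketch below is not the construction of the paper: it assumes a one-dimensional interval with Dirichlet conditions in place of an asymptotically Euclidean manifold with Neumann-type conditions, a finite-difference Laplacian, and the model nonlinearity $s(x)(1+v)^{-7}$ (the exponent $-3-2\kappa$ under the assumption $\kappa = 2$). All function names are hypothetical. The subsolution is $v_- = 0$, the supersolution solves $-v_+'' = s$, and the shift $V = 7s$ makes $F_V$ nondecreasing on $[0, \sup v_+]$.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[i] multiplies x[i-1], upper[i] multiplies x[i+1] in row i."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def barrier_iteration(n_nodes=99, steps=60, power=7.0, source=lambda x: 1.0):
    """Monotone iteration for -u'' = s(x) (1 + u)^(-power), u(0) = u(1) = 0."""
    h = 1.0 / (n_nodes + 1)
    xs = [(j + 1) * h for j in range(n_nodes)]
    s = [source(x) for x in xs]
    off = [-1.0 / h ** 2] * n_nodes
    lap_diag = [2.0 / h ** 2] * n_nodes
    # Supersolution: -u_plus'' = s, which dominates s (1 + u_plus)^(-power).
    u_plus = solve_tridiagonal(off, lap_diag, off, s)
    # Shift V = power * s bounds |d/dy (1 + y)^(-power)| * s for y >= 0.
    shift = [power * sj for sj in s]
    diag = [dj + vj for dj, vj in zip(lap_diag, shift)]
    history = [u_plus]
    for _ in range(steps):
        u = history[-1]
        # Right-hand side F_V(x, u_i) = F(x, u_i) + V(x) u_i.
        rhs = [sj * (1.0 + uj) ** (-power) + vj * uj
               for sj, uj, vj in zip(s, u, shift)]
        history.append(solve_tridiagonal(off, diag, off, rhs))
    return xs, s, history


def max_residual(s, u, power=7.0):
    """Largest defect of the discrete equation -u'' = s (1 + u)^(-power)."""
    n = len(u)
    h = 1.0 / (n + 1)
    padded = [0.0] + list(u) + [0.0]
    return max(
        abs((2 * padded[j] - padded[j - 1] - padded[j + 1]) / h ** 2
            - s[j - 1] * (1.0 + padded[j]) ** (-power))
        for j in range(1, n + 1)
    )
```

In this toy setting the iterates decrease from the supersolution and remain above the subsolution, mirroring the ordering $u_- \le u_{i+1} \le u_i \le u_+$ used in the proof.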
We first show that if $\lambda_{g'} > 0$, then $(M, g')$ is conformally equivalent to $(M, g)$, where $g$ satisfies $\lambda_g > 0$, $R = 0$ and $h < 0$. We then show that if $\lambda_g > 0$, $R = 0$, and $(n - 1)h \le \sigma(\nu, \nu) \le 0$, then there exists a sub/supersolution pair for (17) and we apply Proposition 2 to obtain a solution.

The following proposition gives useful conditions equivalent to $\lambda_g > 0$. We define for compactly supported functions
$$Q_g(f) = \frac{\int_M a |\nabla f|^2 + R f^2 \, dV + \int_{\partial M} \frac{a}{\kappa} h f^2 \, dA}{\|f\|^2_{L^{2^*}}}.$$
Thus
$$\lambda_g = \inf_{f \in C_c^\infty(M),\ f \not\equiv 0} Q_g(f).$$
We also define
$$\mathcal{P}_\eta = \Big( -\Delta + \frac{\eta}{a} R,\ \big( \partial_\nu + \frac{\eta}{\kappa} h \big)\Big|_{\partial M} \Big).$$
If $k \ge 2$, $\delta < 0$, and $k > n/p$, then $\mathcal{P}_\eta$ is a continuous map from $W^{k,p}_\delta(M)$ to $W^{k-2,p}_{\delta-2}(M) \times W^{k-1-\frac{1}{p},p}(\partial M)$.
Proposition 3. Suppose $(M, g)$ is asymptotically Euclidean of class $W^{k,p}_\delta$, $k \ge 2$, $k > n/p$, and $2 - n < \delta < 0$. Then the following conditions are equivalent:
1. There exists a conformal factor $\phi > 0$ such that $1 - \phi \in W^{k,p}_\delta$ and such that $(M, \phi^{2\kappa} g)$ is scalar flat and has a minimal surface boundary.
2. $\lambda_g > 0$.
3. $\mathcal{P}_\eta$ is an isomorphism acting on $W^{k,p}_\delta$ for each $\eta \in [0, 1]$.

Proof. Suppose Condition 1 holds. Since $\lambda_g$ is a conformal invariant, we can assume that $R = 0$ and $h = 0$. We first make a conformal change to a metric with positive scalar curvature. Let $\mathcal{R}$ be any continuous positive element of $W^{k-2,p}_{\delta-2}$ and let $v$ be the unique solution given by Proposition 1 to
$$-\Delta_g v = \frac{1}{a} \mathcal{R}, \qquad \partial_\nu v = 0. \tag{18}$$
The maximum principle implies $v \ge 0$, and hence $\phi = 1 + v > 0$. Letting $\hat g = \phi^{2\kappa} g$, it follows that $\hat R$ is positive and continuous, and that $\hat h = 0$. From Sobolev embedding and a standard argument with cutoff functions, we have
$$\|f\|^2_{L^{2^*}} \lesssim \|\nabla f\|^2_{L^2} + \|f\|^2_{L^2(K)},$$
where $K$ is the compact core of $M$. Since $\hat R$ is bounded below on $K$ we find
$$\|f\|^2_{L^{2^*}} \lesssim \|\nabla f\|^2_{L^2} + \|\hat R^{1/2} f\|^2_{L^2},$$
and hence $\lambda_g > 0$.

Now suppose Condition 2 holds. We claim that for $\eta \in [0, 1]$, the kernel of $\mathcal{P}_\eta$ is trivial. Since $\mathcal{P}_\eta$ has index 0, this implies $\mathcal{P}_\eta$ is an isomorphism.
Suppose, to produce a contradiction, that $u$ is a nontrivial solution. From Lemma 3 we have $u \in W^{k,p}_{\delta'}$ for any $\delta' \in (2 - n, 0)$. From Sobolev embedding we have $u \in W^{1,2}_{\delta'}(M)$. Taking $\delta' < 1 - n/2$, we can integrate by parts to obtain
$$0 = \int_M -a u \Delta u + \eta R u^2 \, dV = \int_M a |\nabla u|^2 + \eta R u^2 \, dV + \eta \int_{\partial M} \frac{a}{\kappa} h u^2 \, dA.$$
Since $\int_M a |\nabla u|^2 \, dV \ge 0$, and since $\eta \in [0, 1]$, we see
$$0 \ge \eta \Big[ \int_M a |\nabla u|^2 + R u^2 \, dV + \int_{\partial M} \frac{a}{\kappa} h u^2 \, dA \Big].$$
Since $W^{1,2}_{\delta'}(M)$ is continuously embedded in $L^{2^*}(M)$ we conclude that $Q_g(u) \le 0$. Since $Q_g$ is continuous on $W^{k,p}_{\delta'}$, and since $C_c^\infty$ is dense in $W^{k,p}_{\delta'}$ we find $\lambda_g \le 0$, a contradiction.

Now suppose Condition 3 holds. For each $\eta \in [0, 1]$, let $v_\eta$ be the unique solution in $W^{k,p}_\delta$ of
$$\mathcal{P}_\eta v_\eta = -\eta (R/a, h/\kappa).$$
Letting $\phi_\eta = 1 + v_\eta$ we see
$$-\Delta \phi_\eta + \frac{\eta}{a} R \phi_\eta = 0, \qquad \partial_\nu \phi_\eta + \frac{\eta}{\kappa} h \phi_\eta = 0. \tag{19}$$
To show $\phi_\eta > 0$ for all $\eta \in [0, 1]$, we follow [5]. Let $I = \{\eta \in [0, 1] : \phi_\eta > 0\}$. Since $v_0 = 0$, we have $I$ is nonempty. Moreover, the set $\{v \in C^0_\delta : v > -1\}$ is open in $C^0_\delta$. Since the map taking $\eta$ to $v_\eta \in C^0_\delta$ is continuous, $I$ is open. It suffices to show that $I$ is closed. Suppose $\eta_0 \in \bar I$. Then $\phi_{\eta_0} \ge 0$. Since $\phi_{\eta_0}$ solves (19), and since $\phi_{\eta_0}$ tends to 1 at infinity, Lemma 4 then implies $\phi_{\eta_0} > 0$. Hence $\eta_0 \in I$ and $I$ is closed.

Letting $\phi = \phi_1$ we have shown $\phi > 0$. Since $\phi$ solves (19) with $\eta = 1$ it follows that $(M, \phi^{2\kappa} g)$ is scalar flat and has a minimal surface boundary. Moreover, since $v \in W^{k,p}_\delta$ we see $(M, \phi^{2\kappa} g)$ is also of class $W^{k,p}_\delta$.

Remark 1. In the context of asymptotically Euclidean manifolds without boundary, Theorem 2.1 of [5] claims that one can make a conformal change to a scalar flat asymptotically Euclidean metric if and only if
$$\int_M a |\nabla f|^2 + R f^2 \, dV > 0 \quad \text{for all } f \in C_c^\infty(M),\ f \not\equiv 0. \tag{20}$$
The proof of Theorem 2.1 in [5] has a mistake, and the condition (20) is too weak. A similar claim and error appears in [8]. We see from Proposition 3 that the correct condition is
$$\inf_{f \in C_c^\infty(M),\ f \not\equiv 0} \frac{\int_M a |\nabla f|^2 + R f^2 \, dV}{\|f\|^2_{L^{2^*}}} > 0.$$
Corollary 1. Suppose $(M, g')$ is asymptotically Euclidean of class $W^{k,p}_\delta$, $k \ge 2$, $k > n/p$, and $2 - n < \delta < 0$. If $\lambda_{g'} > 0$, then there exists a conformal factor $\phi > 0$ such that $1 - \phi \in W^{k,p}_\delta$ and such that $(M, g) = (M, \phi^{2\kappa} g')$ is scalar flat, has negative boundary mean curvature, and satisfies $\lambda_g > 0$.

Proof. Since $\lambda_{g'} > 0$, from Proposition 3 we can assume without loss of generality that $(M, g')$ satisfies $R' = 0$ and $h' = 0$. Let $v_\epsilon \in W^{k,p}_\delta(M)$ be the unique solution of
$$-\Delta_{g'} v_\epsilon = 0, \qquad \partial_{\nu'} v_\epsilon = -\epsilon.$$
Since $k > n/p$, $v_\epsilon$ depends continuously in $C^0_\delta$ on $\epsilon$. Since $v_0 = 0$, we have $v_\epsilon > -1$ for $\epsilon$ sufficiently small. Fixing one such $\epsilon > 0$ we have $\phi = 1 + v_\epsilon > 0$. Letting $g = \phi^{2\kappa} g'$ we see that $R = 0$ and $h = -\epsilon \kappa \phi^{-\kappa-1} < 0$. Proposition 3 shows that $\lambda_g$ is a conformal invariant and hence $\lambda_g > 0$ also.
Theorem 1. Suppose $(M, g)$ is asymptotically Euclidean of class $W^{k,p}_\delta$, $k \ge 2$, $k > n/p$, and $2 - n < \delta < 0$. Suppose also that $\lambda_g > 0$, $R = 0$, and $h \le 0$. If $\sigma \in W^{k-1,p}_{\delta-1}$ is a transverse traceless tensor on $M$ such that $(n - 1)h \le \sigma(\nu, \nu) \le 0$ on $\partial M$, then there exists a conformal factor $\phi$ solving (17). Moreover, setting $\hat g = \phi^{2\kappa} g$ and $\hat K = \phi^{-2} \sigma$, we have that $(M, \hat g)$ is asymptotically Euclidean of class $W^{k,p}_\delta$, $\hat K \in W^{k-1,p}_{\delta-1}$, $(M, \hat g, \hat K)$ solves the Einstein constraint equations with apparent horizon boundary condition, and $\partial M$ is a marginally trapped surface.

Proof. Setting $\phi = 1 + v$ and $\tilde\sigma = \frac{2}{a} \sigma(\nu, \nu)$, the Lichnerowicz equation reduces to solving
$$-\Delta v = \frac{1}{a} |\sigma|^2 (1 + v)^{-3-2\kappa}, \qquad \partial_\nu v = -\frac{1}{\kappa} h (1 + v) + \tilde\sigma (1 + v)^{-1-\kappa} \ \text{on } \partial M, \tag{21}$$
with the constraint $v > -1$. We solve this by means of Proposition 2. Since $\frac{a}{2\kappa} = n - 1$ and since $(n - 1)h \le \sigma(\nu, \nu)$, we conclude $-\frac{1}{\kappa} h + \tilde\sigma \ge 0$. Therefore $v_- = 0$ is a subsolution of (21). To find a supersolution, we solve for each $\eta \in [0, 1]$,
$$-\Delta v_\eta = \frac{1}{a} |\sigma|^2, \qquad \partial_\nu v_\eta + \frac{\eta}{\kappa} h v_\eta = -\frac{\eta}{\kappa} h. \tag{22}$$
The solution exists since $\lambda_g > 0$. We claim moreover that $\phi_\eta = 1 + v_\eta > 0$. Let $I = \{\eta \in [0, 1] : \phi_\eta > 0\}$. Arguing as in Proposition 3, using the fact that $|\sigma|^2 \ge 0$, we see $I$ is open and nonempty. Suppose $\eta_0 \in \bar I$. Then $\phi_{\eta_0} \ge 0$. Since $\phi_{\eta_0}$ satisfies
$$-\Delta \phi_{\eta_0} \ge 0, \qquad \partial_\nu \phi_{\eta_0} + \frac{\eta_0}{\kappa} h \phi_{\eta_0} = 0,$$
and since $\phi_{\eta_0}$ tends to 1 at infinity, Lemma 4 then implies $\phi_{\eta_0} > 0$. Hence $\eta_0 \in I$ and $I$ is closed.

Let $v_+ = v_1$. We have proved $1 + v_+ > 0$. But then, since
$$-\Delta v_+ = \frac{1}{a} |\sigma|^2, \qquad \partial_\nu v_+ = -\frac{1}{\kappa} h (1 + v_+), \tag{23}$$
and since $-h \ge 0$, $1 + v_+ \ge 0$ and $|\sigma|^2 \ge 0$, we conclude that $v_+ \ge 0$. Now
$$-\Delta v_+ = \frac{1}{a} |\sigma|^2 \ge \frac{1}{a} |\sigma|^2 (1 + v_+)^{-3-2\kappa},$$
and since $\tilde\sigma \le 0$ we have
$$\partial_\nu v_+ = -\frac{1}{\kappa} h (1 + v_+) \ge -\frac{1}{\kappa} h (1 + v_+) + \tilde\sigma (1 + v_+)^{-1-\kappa} \ \text{on } \partial M.$$
So $v_+$ is a nonnegative supersolution of (21).

Now $v_-$, $v_+$, $(M, g)$, and the right hand sides of (21) all satisfy the hypotheses of Proposition 2. So there exists a nonnegative solution $v$ of (21) in $W^{k,p}_\delta$. Letting $\hat g = \phi^{2\kappa} g$ and $\hat K = \phi^{-2} \sigma$, it follows that $(M, \hat g, \hat K)$ solves the constraint equations with apparent horizon boundary condition. To see that the boundary is marginally trapped, we note that
$$(n - 1) \hat h = \hat K(\hat\nu, \hat\nu) = \phi^{-2-2\kappa} \sigma(\nu, \nu) \le 0.$$
Since $\hat\theta_- = \hat\theta_+ + 2(n - 1) \hat h$, we conclude $\hat\theta_- \le \hat\theta_+ = 0$, and $\partial M$ is marginally trapped.

The previous arguments and theorems can all be easily modified for manifolds without boundary by omitting boundary conditions and all references to the boundary of the manifold. Hence we also have the following low regularity construction for manifolds without boundary.

Theorem 2. Suppose $(M, g)$ is an asymptotically Euclidean manifold without boundary of class $W^{k,p}_\delta$, $k \ge 2$, $k > n/p$, $2 - n < \delta < 0$, and suppose $\sigma \in W^{k-1,p}_{\delta-1}$. If $(M, g)$ satisfies $\lambda_g > 0$, then there exists a conformal factor $\phi$ solving (17). Moreover, $(M, \hat g) = (M, \phi^{2\kappa} g)$ is asymptotically Euclidean of class $W^{k,p}_\delta$, $\hat K = \phi^{-2} \sigma \in W^{k-1,p}_{\delta-1}$, and $(M, \hat g, \hat K)$ solves the Einstein constraint equations.

5. Constructing Suitable Transverse Traceless Tensors

We now prove the existence of a class of data satisfying the hypothesis of Theorem 1. It is easy to construct asymptotically Euclidean manifolds satisfying $\lambda_g > 0$. From the proof of Proposition 3, it is clear that every manifold with $R \ge 0$ and $h \ge 0$ satisfies $\lambda_g > 0$. Let $(M, g, K)$ be an asymptotically Euclidean manifold without boundary satisfying $R \ge 0$ (for example any maximal solution of the vacuum constraint equations). Let $v$ be the Green's function for the operator $-a\Delta + R$ with pole at $x \in M$, and let $\phi = 1 + v$. Since $R \ge 0$, we have $v \ge 0$ and $\phi > 0$. Setting $\tilde g = \phi^{2\kappa} g$ we see that $(M - \{x\}, \tilde g)$ satisfies $\tilde R \ge 0$. Moreover, if $B_r$ is a geodesic ball (with respect to $g$) about
$x$ with radius $r$, one readily verifies from the asymptotic behaviour $\phi \approx r^{2-n}$ near $x$ that $\tilde h > 0$ for $r$ sufficiently small. So $(M - B_r, \tilde g)$ satisfies $\lambda_{\tilde g} > 0$. In the same way, given an asymptotically Euclidean manifold with boundary satisfying $R \ge 0$ and $h \ge 0$, we can perform the previous construction using the Green's function for $-a\Delta + R$ corresponding to the boundary condition $\partial v + hv = 0$ to add another boundary component. Continuing iteratively, we can add as many boundary components as we like.

On the other hand, to provide a suitable tensor $\sigma$, we must do more work. The requirements on $\sigma$ are that it be trace-free, divergence-free, and satisfy $(n - 1)h \le \sigma(\nu, \nu) \le 0$ on $\partial M$. We construct $\sigma$ using a boundary value problem for the vector Laplacian. Let $\mathbb{L}$ denote the conformal Killing operator, so $\mathbb{L}X = \frac{1}{2} \mathcal{L}_X g - \frac{1}{n} (\operatorname{div} X) g$. Then the vector Laplacian $\Delta_{\mathbb{L}} = \operatorname{div} \mathbb{L}$ is an elliptic operator on $M$, and the Neumann boundary operator $B$ corresponding to $\Delta_{\mathbb{L}}$ takes a vector field $X$ to the covector field $\mathbb{L}X(\nu, \cdot)$. We propose to solve the boundary value problem
$$\Delta_{\mathbb{L}} X = 0, \qquad BX = \omega \ \text{on } \partial M, \tag{24}$$
where $\omega$ is a covector field over $\partial M$. If we can do this, then letting $\sigma = \mathbb{L}X$, it follows that $\sigma$ is trace and divergence free. Moreover, $\sigma(\nu, \nu) = \omega(\nu)$ on $\partial M$, so taking $\omega$ such that $(n - 1)h \le \omega(\nu) \le 0$ ensures $(n - 1)h \le \sigma(\nu, \nu) \le 0$.

If $(M, g)$ is asymptotically Euclidean of class $W^{k,p}_\rho$ with $k \ge 2$ and $k > n/p$, then $\Delta_{\mathbb{L}}$ and $B$ act continuously as maps from $W^{k,p}_\delta(M)$ to $W^{k-2,p}_{\delta-2}(M)$ and $W^{k-1-\frac{1}{p},p}(\partial M)$ respectively. We show (24) is well posed by proving $\mathcal{P}_{\mathbb{L}} = (\Delta_{\mathbb{L}}, B)$ is an isomorphism if $2 - n < \delta < 0$ and either $k > n/p + 1$ or $k > n/p$ and $(M, g)$ has no nontrivial conformal Killing fields vanishing at infinity.

The method of proof is well established [23, 7]. The first step is to obtain a coercivity estimate (Eq. (29) of Proposition 5 below) that implies $\mathcal{P}_{\mathbb{L}}$ is semi-Fredholm. The second step is to explicitly compute the index and dimension of the kernel of $\mathcal{P}_{\mathbb{L}}$, which we do in Proposition 6. In fact, for smooth metrics it follows from [22] that $\mathcal{P}_{\mathbb{L}}$ is Fredholm. So our principal concern is to show that the coercivity estimate holds for low regularity metrics.

We start with an a priori estimate for $\Delta_{\mathbb{L}}$ on a compact manifold $K$. We assume that the boundary of $K$ is partitioned into two pieces, $\partial K_1$ and $\partial K_2$, each the union of components of $\partial K$ and either possibly empty. We then have an estimate for $\Delta_{\mathbb{L}}$ in terms of a Neumann condition on $\partial K_1$ and a Dirichlet condition on $\partial K_2$.
Proposition 4. Suppose $(K, g)$ is a compact manifold with boundary of class $W^{k,p}$ with $k > n/p$ and $k \ge 2$. If $X \in W^{k,p}(K)$, then
$$\|X\|_{W^{k,p}(K)} \lesssim \|\Delta_{\mathbb{L}} X\|_{W^{k-2,p}(K)} + \|BX\|_{W^{k-1-\frac{1}{p},p}(\partial K_1)} + \|X\|_{W^{k-\frac{1}{p},p}(\partial K_2)} + \|X\|_{L^p(K)}. \tag{25}$$

Proof. The estimate follows from interior and boundary estimates together with a partition of unity argument. We explicitly prove the local estimate near $\partial K_1$, the other estimates being similar and easier. In local coordinates near some fixed $x \in \partial K$ we may assume the boundary is flat, $x = 0$, and $g_{ij}(0) = \delta_{ij}$. In these coordinates,
$$(\Delta_{\mathbb{L}} X)_j = \sum_{i,\ |\alpha| \le 2} c_j^{i,\alpha} \partial_\alpha X_i,$$
where $c_j^{i,\alpha} \in W^{k-2+|\alpha|,p}_{\mathrm{loc}}$.
Suppose first that $X \in W^{k,p}(B_r^+)$ with support contained in $B_{r/2}^+$, where $r$ is a small number to be specified later. Let $\bar\Delta_{\mathbb{L}}$ and $\bar B$ denote the constant coefficient operators given by the principal symbols of $\Delta_{\mathbb{L}}$ and $B$ at 0; these are in fact the vector Laplacian and Neumann boundary operator computed with respect to the Euclidean metric on the half space. It is easy to verify that these satisfy the so-called Lopatinski-Shapiro or covering conditions of [1] and hence we have
$$\|X\|_{W^{k,p}(B_r^+)} \lesssim \|\bar\Delta_{\mathbb{L}} X\|_{W^{k-2,p}(B_r^+)} + \|\bar B X\|_{W^{k-1-\frac{1}{p},p}(D_r)} + \|X\|_{W^{k-2,p}(B_r^+)}, \tag{26}$$
where $D_r$ is the flat portion of $\partial B_r^+$. In local coordinates
$$(\Delta_{\mathbb{L}} X)_j = (\bar\Delta_{\mathbb{L}} X)_j + \sum_{|\alpha| = 2} (c_j^{i,\alpha} - \bar c_j^{\,i,\alpha}) \partial_\alpha X_i + \sum_{|\alpha| < 2} c_j^{i,\alpha} \partial_\alpha X_i.$$
We wish to estimate each of these terms in $W^{k-2,p}(B_r^+)$, and we must proceed carefully since a naive application of the multiplication rule from Lemma 1 will introduce unwanted large terms involving $\|X\|_{W^{2,p}}$. Consider a term of the form $c_j^{i,\alpha} \partial_\alpha X_i$ with $|\alpha| < 2$. Pick $q \in (p, \infty)$ such that
$$\frac{1}{p} - \frac{1}{n} < \frac{1}{q} < \frac{k-1}{n}.$$
Then applying Lemma 1 we have
$$\|c_j^{i,\alpha} \partial_\alpha X_i\|_{W^{k-2,p}(B_r^+)} \lesssim \|X_i\|_{W^{k-1,q}(B_r^+)}$$
and, arguing as in Proposition 2, we obtain from interpolation and Sobolev embedding
$$\|c_j^{i,\alpha} \partial_\alpha X_i\|_{W^{k-2,p}(B_r^+)} \lesssim C(\epsilon) \|X\|_{L^p(B_r^+)} + \epsilon \|X\|_{W^{k,p}(B_r^+)}.$$
The terms involving $(c_j^{i,\alpha} - \bar c_j^{\,i,\alpha}) \partial_\alpha X_i$ are estimated similarly, and we have
$$\|(c_j^{i,\alpha} - \bar c_j^{\,i,\alpha}) \partial_\alpha X_i\|_{W^{k-2,p}(B_r^+)} \lesssim C(\epsilon) \|X\|_{L^p(B_r^+)} + \big( \|c_j^{i,\alpha} - \bar c_j^{\,i,\alpha}\|_{L^\infty} + \epsilon \big) \|X_i\|_{W^{k,p}(B_r^+)}.$$
Since $\|c_j^{i,\alpha} - \bar c_j^{\,i,\alpha}\|_{L^\infty}$ can be made as small as we please by taking $r$ small enough we conclude
$$\|\bar\Delta_{\mathbb{L}} X\|_{L^p(B_r^+)} \lesssim \|\Delta_{\mathbb{L}} X\|_{L^p(B_r^+)} + C(\epsilon) \|X\|_{L^p(B_r^+)} + \epsilon \|X\|_{W^{2,p}(B_r^+)}. \tag{27}$$
A similar argument with the boundary operator leads us to
$$\|\bar B X\|_{W^{k-1-\frac{1}{p},p}(D_r)} \lesssim \|(BX)_j\|_{W^{k-1-\frac{1}{p},p}(D_r)} + C(\epsilon) \|X\|_{W^{k-1,p}(B_r^+)} + \epsilon \|X\|_{W^{k,p}(B_r^+)}. \tag{28}$$
Combining (26), (27) and (28), taking $\epsilon$ sufficiently small, we conclude
$$\|X\|_{W^{k,p}(B_r^+)} \lesssim \|\Delta_{\mathbb{L}} X\|_{W^{k-2,p}(B_r^+)} + \|BX\|_{W^{k-1-\frac{1}{p},p}(D_r)} + \|X\|_{W^{k-1,p}(B_r^+)}.$$
The estimate (25) for $X$ with arbitrary support is now achieved in a standard way with cutoff functions and a partition of unity argument.
With the a priori estimate for compact manifolds in hand, the following coercivity result is now standard [4, 7, 2]. We omit the proof for the sake of brevity.

Proposition 5. Suppose $(M, g)$ is asymptotically Euclidean of class $W^{k,p}_\rho$ with $k > n/p$ and $k \ge 2$. Then if $2 - n < \delta < 0$, $\delta' \in \mathbb{R}$, and $X \in W^{k,p}_\delta(M)$ we have
$$\|X\|_{W^{k,p}_\delta(M)} \lesssim \|\Delta_{\mathbb{L}} X\|_{W^{k-2,p}_{\delta-2}(M)} + \|BX\|_{W^{k-1-\frac{1}{p},p}(\partial M)} + \|X\|_{L^p_{\delta'}(M)}, \tag{29}$$
noting that the inequality is trivial if $\|X\|_{L^p_{\delta'}(M)} = \infty$.

Let $\mathcal{P}^{k,p}_\delta$ denote $\mathcal{P}_{\mathbb{L}}$ acting as a map from $W^{k,p}_\delta(M)$ to $W^{k-2,p}_{\delta-2}(M) \times W^{k-1-\frac{1}{p},p}(\partial M)$. From estimate (29) it follows immediately [26] that $\mathcal{P}^{k,p}_\delta$ is semi-Fredholm under the assumptions on $k$, $p$, and $\delta$ of Proposition 5. We now show $\mathcal{P}^{k,p}_\delta$ is Fredholm with index 0.
Proposition 6. Suppose $(M, g)$ is asymptotically Euclidean of class $W^{k,p}_\rho$ with $k > n/p$, $k \ge 2$, and suppose $2 - n < \delta < 0$. Then $\mathcal{P}^{k,p}_\delta$ is Fredholm of index 0. Moreover, it is an isomorphism if and only if $(M, g)$ possesses no nontrivial conformal Killing fields in $W^{k,p}_\delta(M)$.

Proof. We first suppose $(M, g)$ is asymptotically flat of class $C^\infty_\rho$; the desired result for rough metrics will follow from an index theory argument.

It is enough to prove that $\mathcal{P}^{2,2}_\delta$ is invertible. Indeed, from elliptic regularity we know that the kernels of $\mathcal{P}^{k,p}_\delta$ and $\mathcal{P}^{2,2}_\delta$ agree. And if $\mathcal{P}^{2,2}_\delta$ is surjective, then the image of $\mathcal{P}^{2,2}_\delta$ contains $C_c^\infty(M) \times C^\infty(\partial M)$. Again from elliptic regularity we have that the image of $\mathcal{P}^{k,p}_\delta$ also contains $C_c^\infty(M) \times C^\infty(\partial M)$ and since the image of $\mathcal{P}^{k,p}_\delta$ is closed, $\mathcal{P}^{k,p}_\delta$ is surjective. We restrict our attention now to $\mathcal{P} = \mathcal{P}^{2,2}_\delta$.

To show $\mathcal{P}$ is injective, we prove that any element of the kernel of $\mathcal{P}$ is a conformal Killing field. Since $g$ is smooth, it follows from [12] (see also Sect. 6) that there are no nontrivial conformal Killing fields in $W^{2,2}_\delta$, and hence the kernel of $\mathcal{P}$ is trivial. Suppose $u \in \ker \mathcal{P}$. We would like to integrate by parts to deduce that
$$0 = -\int_M \langle \Delta_{\mathbb{L}} u, u \rangle \, dV = \int_M \langle \mathbb{L}u, \mathbb{L}u \rangle \, dV,$$
and hence $u$ is a conformal Killing field. This computation is only valid if $\delta \le 1 - \frac{n}{2}$. It was shown in [12] (in the case of manifolds without boundary) that if $u \in \ker \mathcal{P}$, then $u \in W^{2,2}_{\delta'}$ for every $\delta' \in (2 - n, 0)$. Their proof also works if $\mathcal{P}u$ is only compactly supported, and a simple hole filling argument then implies the same result holds for manifolds with boundary. So $\mathcal{P}$ is injective.

To show $\mathcal{P}$ is surjective, it is enough to show that the adjoint $\mathcal{P}^*$ is injective. The dual space of $L^2_{\delta-2}(M) \times H^{1/2}(\partial M)$ is $L^2_{2-n-\delta}(M) \times H^{-1/2}(\partial M)$. From elliptic regularity [17] and rescaled interior estimates we know that if $\mathcal{P}^*(f, h) = 0$, then in fact $f$ and $h$ are smooth and $f \in H^2_{2-n-\delta}(M)$. Now if $\phi$ is smooth and compactly supported in each end of $M$ we have from integrating by parts,
$$0 = \langle \mathcal{P}^*(f, h), \phi \rangle = \int_M \langle \Delta_{\mathbb{L}} f, \phi \rangle \, dV + \int_{\partial M} \mathbb{L}\phi(\nu, f + h) - \mathbb{L}f(\nu, \phi) \, dA. \tag{30}$$
We obtain immediately $\Delta_{\mathbb{L}} f = 0$ in $M$. Moreover, one can readily show that if $\omega$ is a smooth 1-form on $\partial M$ and $\psi$ is a smooth function on $\partial M$, then there exists a $\phi \in C_c^\infty$ such that $\phi = \psi$ and $\mathbb{L}\phi = \omega$ on $\partial M$. It follows that $Bf = 0$ and $h = 0$. Since $\Delta_{\mathbb{L}} f = 0$ and $Bf = 0$, we have $f = 0$. Hence $\mathcal{P}^*$ is injective and therefore $\mathcal{P}$ is an isomorphism.

We return to the case where $g$ is not smooth but only in $W^{k,p}_\rho(M)$ with $k > n/p$ and $k \ge 2$. To show $\mathcal{P}^{k,p}_\delta$ is Fredholm of index 0, it is enough to show its index is 0. Since $g$ can be approximated with smooth metrics $g_k$, and since each $\mathcal{P}^{k,p}_{\delta, g_k}$ has index 0, so does the limit $\mathcal{P}^{k,p}_\delta$. To show that the kernel of $\mathcal{P}^{k,p}_\delta$ consists of conformal Killing fields, we integrate by parts again using the fact that elements in the kernel decay sufficiently fast at infinity.

Proposition 6 reduces the question of whether or not $\mathcal{P}_{\mathbb{L}}$ is an isomorphism to the existence of nontrivial conformal Killing fields $X$ vanishing at infinity with $\mathbb{L}X(\nu, \cdot) = 0$ on $\partial M$. In [12] it was proved that if $(M, g)$ is asymptotically Euclidean of class $W^{k,p}_\rho$ with $k > n/p + 3$, there are no nontrivial conformal Killing fields in $W^{k,p}_\delta$ for $\delta < 0$. Hence the boundary value problem (24) is well posed if the metric has this high degree of regularity. It has been subsequently claimed [8] that no nontrivial conformal Killing fields exist with the metric as irregular as $k > n/p + 2$, but no proof exists in the literature. The following section contains a proof that there are no such vector fields for metrics as irregular as $k > n/p + 1$ and thereby establishes

Theorem 3. Suppose $(M, g)$ is asymptotically Euclidean of class $W^{k,p}_\rho$ with $k \ge 2$ and suppose $2 - n < \delta < 0$. If either $k > n/p + 1$ or $k > n/p$ and $(M, g)$ has no nontrivial conformal Killing fields in $W^{k,p}_\delta(M)$, then there exists a unique solution $X \in W^{k,p}_\delta(M)$ of (24).

6. Non-Existence of Conformal Killing Fields Vanishing at Infinity

We break the problem of showing conformal Killing fields vanishing at infinity are trivial into two pieces as was done in the high regularity proof of [12]. We first show that if a conformal Killing field exists, it must be identically zero in a neighbourhood of each end. We then show that the zero set can be extended to encompass the whole manifold. For both parts of the argument, we use a blowup method to construct a conformal Killing field on a subset of $\mathbb{R}^n$ and analyze properties of the resulting limit field.

In this section we use the notation $e_i$ for a standard basis element of $\mathbb{R}^n$ or $\mathbb{R}^{n+1}$. Recall [19] that a basis for the conformal Killing fields on $\mathbb{R}^n$ with the Euclidean metric is comprised of the generators of the translations $\{e_i\}_{i=1}^n$, the rotations $\{x_i e_j - x_j e_i\}_{1 \le i < j \le n}$ and the spherical dilations $\{D_{e_k}\}_{k=1}^{n+1}$. Our notation for the dilations is as follows. Any constant vector $V$ in $\mathbb{R}^{n+1}$ gives rise to a function $x \mapsto \langle V, x \rangle$ on the sphere. The field $D_V$ is the pushforward under stereographic projection of the gradient of this function. We note that $D_{V_1} + D_{V_2} = D_{V_1 + V_2}$. Moreover, in local coordinates we have
$$D_{e_1} = \Big( \tfrac{1}{2} \big( 1 - x_1^2 + x_2^2 + \cdots + x_n^2 \big),\ -x_1 x_2,\ \cdots,\ -x_1 x_n \Big). \tag{31}$$
The dilations $D_{e_j}$ for $1 < j \le n$ have similar coordinate expressions being the pushforward of $D_{e_1}$ under a rotation. Finally, $D_{e_{n+1}}$ has the coordinate expression $D_{e_{n+1}} = (x_1, \cdots, x_n)$.
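The basis fields listed above can be checked numerically against the conformal Killing equation. The following is only a sanity-check sketch under stated assumptions: it works on flat $\mathbb{R}^3$, where the conformal Killing operator reduces to $\frac{1}{2}(\partial_i X_j + \partial_j X_i) - \frac{1}{n}(\operatorname{div} X)\delta_{ij}$, and it uses central differences, which are exact up to rounding for the polynomial fields involved. The function names are hypothetical.

```python
import random


def dilation_e1(x):
    """Coordinate expression (31) for D_{e_1} on R^n."""
    first = 0.5 * (1.0 - x[0] ** 2 + sum(c * c for c in x[1:]))
    return [first] + [-x[0] * c for c in x[1:]]


def dilation_last(x):
    """D_{e_{n+1}} = (x_1, ..., x_n)."""
    return list(x)


def rotation_12(x):
    """Rotation generator x_1 e_2 - x_2 e_1."""
    return [-x[1], x[0]] + [0.0] * (len(x) - 2)


def translation_1(x):
    """Translation generator e_1."""
    return [1.0] + [0.0] * (len(x) - 1)


def jacobian(field, x, h=1e-3):
    """Central-difference Jacobian with entries J[i][j] = d X_j / d x_i."""
    n = len(x)
    rows = []
    for i in range(n):
        xp = list(x)
        xm = list(x)
        xp[i] += h
        xm[i] -= h
        fp, fm = field(xp), field(xm)
        rows.append([(fp[j] - fm[j]) / (2 * h) for j in range(n)])
    return rows


def conformal_killing_defect(field, x):
    """Largest entry of (1/2)(d_i X_j + d_j X_i) - (1/n)(div X) delta_ij at x."""
    n = len(x)
    jac = jacobian(field, x)
    div = sum(jac[i][i] for i in range(n))
    worst = 0.0
    for i in range(n):
        for j in range(n):
            trace_part = div / n if i == j else 0.0
            entry = 0.5 * (jac[i][j] + jac[j][i]) - trace_part
            worst = max(worst, abs(entry))
    return worst
```

A field that is not conformal Killing, such as $(x_1^2, 0, 0)$, gives a visibly nonzero defect with the same function.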
The following two lemmas provide properties of conformal Killing fields on subsets of $\mathbb{R}^n$. The first one can also be deduced from the analysis in [12], but we include it here for completeness.

Lemma 5. Suppose $X$ is a conformal Killing field on the external domain $E_1$ with the Euclidean metric. Suppose moreover that $X \in L^p_\delta(E_1)$ for some $p \ge 1$ and $\delta < 0$. Then $X$ vanishes identically.

Proof. The basis of conformal Killing fields on $\mathbb{R}^n$ restricts to a basis of conformal Killing fields on $E_1$. If $X$ is a sum of such vectors, its coefficients are polynomials. But no non-zero polynomial is in $L^p_\delta(E_1)$, since $\delta < 0$. Hence $X = 0$.

Lemma 6. Suppose $X$ is a nontrivial conformal Killing field on $B_1(\mathbb{R}^n)$ with the Euclidean metric that satisfies $X(0) = 0$ and $\nabla X(0) = 0$. Then $X(x) \ne 0$ if $x \ne 0$.

Proof. By a conformal change of metric, it is enough to show the same is true on $\mathbb{R}^n$. A routine computation shows that the subspace of conformal Killing vector fields in $\mathbb{R}^n$ that vanish at 0 is spanned by the rotations, the dilation $D_{e_{n+1}}$, and the vectors
$$2D_{e_i} - e_i, \qquad i = 1 \cdots n. \tag{32}$$
The coefficients of the vectors (32) are homogeneous polynomials of degree 2 and hence their derivatives at the origin vanish. On the other hand, a vector in the span of $D_{e_{n+1}}$ and the rotations has linear coefficients and vanishes identically if its gradient at the origin vanishes. So if $X(0) = 0$ and $\nabla X(0) = 0$, then $X$ is in the span of the vectors (32). We claim a nontrivial such vector vanishes only at the origin. Indeed,
$$\sum_{j=1}^{n} v^j \big( 2D_{e_j} - e_j \big) = 2D_V - V,$$
where $V = \sum_{j=1}^{n} v^j e_j$. If $V$ is not zero, then by performing a rotation and constant scaling we can push $2D_V - V$ forward to $2D_{e_1} - e_1$. From the explicit expression (31) we observe that $2D_{e_1} - e_1$ vanishes only at 0.

We say a vector field $X$ vanishes in a neighbourhood of infinity if for each end there exists a radius $R$ such that in end coordinates $X \equiv 0$ in the exterior region $E_R$. The following lemma shows under weak hypotheses on the metric that a conformal Killing field vanishing at infinity also vanishes in a neighbourhood of infinity.
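The last step of the proof of Lemma 6 can be made concrete. Expanding the squares in the coordinate expression (31) suggests the identity $|2D_{e_1} - e_1| = |x|^2$, which makes it evident that the field vanishes only at the origin. The sketch below checks this identity at random sample points; it is an illustration under the assumption that (31) is read as stated, with hypothetical function names, and is not part of the original argument.

```python
import math
import random


def special_conformal_e1(x):
    """2 D_{e_1} - e_1, computed from the coordinate expression (31)."""
    rest = sum(c * c for c in x[1:])
    return [rest - x[0] ** 2] + [-2.0 * x[0] * c for c in x[1:]]


def euclidean_norm(v):
    """Euclidean length of a coordinate vector."""
    return math.sqrt(sum(c * c for c in v))


def check_norm_identity(dim=3, samples=100, seed=0):
    """Largest deviation of |2 D_{e_1} - e_1| from |x|^2 over random points."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        x = [rng.uniform(-3.0, 3.0) for _ in range(dim)]
        lhs = euclidean_norm(special_conformal_e1(x))
        rhs = euclidean_norm(x) ** 2
        worst = max(worst, abs(lhs - rhs))
    return worst
```

The deviation returned by `check_norm_identity` should be at the level of floating-point rounding in any dimension tried.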
Lemma 7. Suppose $(M, g)$ is asymptotically Euclidean of class $W^{2,p}_\rho$ with $p > n/2$. Suppose $X$ is a conformal Killing field in $W^{2,p}_\delta$ with $\delta < 0$. Then $X$ vanishes in a neighbourhood of infinity.

Proof. We work in end coordinates and construct a sequence of metrics $\{g_k\}_{k=1}^\infty$ on the exterior region $E_1$ by letting $g_k(x) = g(2^k x)$. Since
$$\|g_k - \bar g\|_{W^{2,p}_\rho(E_1)} \lesssim 2^{\rho k} \|g - \bar g\|_{W^{2,p}_\rho(E_{2^k})}$$
we conclude $\|g_k - \bar g\|_{W^{2,p}_\rho(E_1)} \to 0$. It follows that the associated maps $\Delta_{\mathbb{L}}^k$ and $\mathbb{L}^k$ acting on $W^{2,p}_\delta(E_1)$ converge as operators to $\bar\Delta_{\mathbb{L}}$ and $\bar{\mathbb{L}}$.

Suppose, to produce a contradiction, that $X$ is not identically 0 outside any exterior region $E_R$. Let $\hat X_k$ be the vector field on $E_1$ given by $\hat X_k(x) = X(2^k x)$ and let $X_k = \hat X_k / \|\hat X_k\|_{W^{2,p}_\delta}$. Fix $\delta'$ with $\delta' \in (\delta, 0)$. Then from the $W^{2,p}_\delta$ boundedness of the sequence $\{X_k\}_{k=1}^\infty$ it follows (after reducing to a subsequence) that the vectors $X_k$ converge strongly in $W^{1,p}_{\delta'}$ to some $X_0$. We can assume without loss of generality that $\delta > 2 - n$ and hence we can apply Proposition 5 to the Euclidean metric to obtain
$$\|X_{k_1} - X_{k_2}\|_{W^{2,p}_\delta} \lesssim \|\bar\Delta_{\mathbb{L}} - \Delta_{\mathbb{L}}^{k_1}\|_{W^{2,p}_\delta} + \|\bar\Delta_{\mathbb{L}} - \Delta_{\mathbb{L}}^{k_2}\|_{W^{2,p}_\delta} + \|\bar{\mathbb{L}} - \mathbb{L}^{k_1}\|_{W^{2,p}_\delta} + \|\bar{\mathbb{L}} - \mathbb{L}^{k_2}\|_{W^{2,p}_\delta} + \|X_{k_1} - X_{k_2}\|_{L^p_{\delta'}}.$$
Here we have used the $W^{2,p}_\delta$ boundedness of the sequence $\{X_k\}_{k=1}^\infty$ together with the facts $\Delta_{\mathbb{L}}^k X_k = 0$ and $\mathbb{L}^k X_k = 0$. We conclude $\{X_k\}_{k=1}^\infty$ is Cauchy in $W^{2,p}_\delta(E_1)$ and hence $X_k \to X_0$ in $W^{2,p}_\delta$. Moreover, since $\mathbb{L}^k \to \bar{\mathbb{L}}$ and since $\mathbb{L}^k X_k = 0$ we have $X_0$ is a conformal Killing field for $\bar g$ in $W^{2,p}_\delta(E_1)$. From Lemma 5 it follows that $X_0 = 0$. Hence $X_k$ converges in $W^{2,p}_\delta$ to 0, which contradicts $\|X_k\|_{W^{2,p}_\delta} = 1$ for each $k$.
Theorem 4. Suppose $(M, g)$ is asymptotically Euclidean of class $W^{2,p}_\rho$ with $p > n$. Then there exist no nontrivial conformal Killing fields in $W^{k,p}_\delta$ for any $\delta < 0$.
Proof. From Lemma 7 we know that if $X \in W^{2,p}_\delta$ is a conformal Killing field then it vanishes on an open set. We claim $X^{-1}(0) = M$. If the claim is not true, then there exists a point $x_0$ in the interior of $M$ and on the boundary of the interior of $X^{-1}(0)$. Working in local coordinates near $x_0$, we reduce to the situation where $g$ is a metric on the unit ball, $g_{ij}(0) = \delta_{ij}$, $X$ is a conformal Killing field for $g$ on $B_1$, and the origin is on the boundary of the interior of $X^{-1}(0)$. Since $p > n$ it follows that $X$ is in $C^1(B_r)$ and hence $X(0) = 0$ and $\nabla X(0) = 0$. Let $\{r_k\}_{k=1}^{\infty}$ be a sequence of radii $r_k$ tending down to zero such that $X(x) = 0$ for some $x$ with $|x| = r_k/2$. We construct a sequence of metrics $\{g_k\}_{k=1}^{\infty}$ on the unit ball by taking $g_k(x) = g(x/r_k)$. Evidently, $g_k \to \bar g$ in $W^{2,p}$, and it follows that the associated maps $\Delta_L^{k}$ and $L_k$ converge to $\bar\Delta_L$ and $\bar L$ as operators on $W^{2,p}_\delta(B_1)$. We construct vector fields $X_k$ on $B_1$ by setting $\hat X_k(x) = X(x/r_k)$ and letting $X_k = \hat X_k / \|\hat X_k\|_{W^{2,p}}$ (since $X$ is not identically 0 on $B_{r_k}$ the normalization is possible). By our choice of radii, there exists a point $x_k$ with $|x_k| = 1/2$ such that $X_k(x_k) = 0$. From the $W^{2,p}$ boundedness of the sequence $\{X_k\}_{k=1}^{\infty}$, it follows (after taking a subsequence) that $X_k$ converges strongly in $W^{1,p}$ to some $X_0 \in W^{2,p}$. Arguing as in Lemma 7, replacing the use of Proposition 5 with Proposition 4, we conclude $X_0$ is a conformal Killing field for $\bar g$ and $X_k$ converges in $W^{2,p}$ to $X_0$. From the resulting $C^1(B_1)$ convergence of $X_k$ to $X_0$ we know $X_0(0) = 0$ and $\nabla X_0(0) = 0$. The collection of points $x_k$ where $X_k$ vanishes has a cluster point $x$ with $|x| = 1/2$, and hence $X_0(x) = 0$. But Lemma 6 then implies $X_0 = 0$, which is impossible since $\|X_k\|_{W^{2,p}} = 1$ for each $k$. Hence $X = 0$ identically.
Remark 2. Even though Theorem 4 requires $W^{2,n+\epsilon}$ regularity, we know from Lemma 7 that a conformal Killing field that vanishes at infinity also vanishes in a neighbourhood of infinity, even if the metric only has $W^{2,\frac{n}{2}+\epsilon}$ regularity. Since Theorem 4 is a unique continuation argument, it is perhaps not surprising that it requires $g \in W^{2,n+\epsilon}_{\mathrm{loc}}$, as this is the minimal regularity that guarantees the principal coefficients of $\Delta_L$ are Lipschitz continuous. It would be interesting to determine if Theorem 4 also holds in the low regularity case $W^{2,\frac{n}{2}+\epsilon}$.
Acknowledgements. I would like to thank D. Pollack, J. Isenberg, and S. Dain for helpful discussions and advice. I would also like to thank an anonymous referee for suggestions that improved the paper’s style. This research was partially supported by NSF grant DMS-0305048.
Communicated by G.W. Gibbons
Commun. Math. Phys. 253, 585–609 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1236-y
Communications in
Mathematical Physics
On the Absence of Non-Periodic Ground States for the Antiferromagnetic XXZ Model
Taku Matsui
Graduate School of Mathematics, Kyushu University, 1-10-6 Hakozaki, Fukuoka 812-8581, Japan. E-mail: [email protected]
Received: 22 September 2003 / Accepted: 18 May 2004
Published online: 17 November 2004 – © Springer-Verlag 2004
Abstract: We prove that the antiferromagnetic XXZ model with Ising-like anisotropy on a one-dimensional lattice does not have non-periodic ground states.
1. Introduction
The XXZ model on a one-dimensional lattice is an exactly solvable model. The Hamiltonian of the model is given by
$$H_{XXZ} = \sum_{j \in \mathbb{Z}} \{\sigma_x^{(j)}\sigma_x^{(j+1)} + \sigma_y^{(j)}\sigma_y^{(j+1)} + \Delta\,\sigma_z^{(j)}\sigma_z^{(j+1)}\}, \qquad (1.1)$$
where $\sigma_x^{(j)}$, $\sigma_y^{(j)}$, and $\sigma_z^{(j)}$ are Pauli spin matrices at the site $j$ and $\Delta$ is a real parameter (anisotropy). For this model, we show that the non-periodic ground state for the infinite volume system cannot exist when the parameter $\Delta$ is sufficiently large, $\Delta \gg 1$. This is closely related to the instability of the interface between two periodic ground states. For our purpose, we use the $C^*$-algebraic method (cf. [4]). To explain the results of this paper more precisely, we introduce our notation now. We denote by $\mathfrak{A}$ the UHF $C^*$-algebra $2^\infty$ (the infinite tensor product of 2 by 2 matrix algebras):
$$\mathfrak{A} = \overline{\bigotimes_{\mathbb{Z}} M_2(\mathbb{C})}^{\,C^*}.$$
Each component of the tensor product above is specified with a lattice site $j \in \mathbb{Z}$. $\mathfrak{A}$ is the totality of quasi-local observables. We denote by $Q^{(j)}$ the element of $\mathfrak{A}$ with $Q$ in the $j$th component of the tensor product and the identity in any other component. We fix a representation of Pauli spin matrices as follows:
$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \qquad (1.2)$$
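As a concrete finite-volume illustration of this notation, the following Python sketch builds the local operators $Q^{(j)}$ as Kronecker products and assembles the Hamiltonian (1.1) restricted to an open chain of n sites. The function names (`site_op`, `xxz_open_chain`), the 0-based site labels, and the truncation to a finite open chain are assumptions made for illustration only; the paper itself works in infinite volume.

```python
# Pauli matrices in the representation (1.2), as nested lists.
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]


def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matmul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def site_op(q, j, n):
    """Q^(j): the 2x2 matrix q at site j (0-based), identity at the other sites."""
    out = [[1]]
    for k in range(n):
        out = kron(out, q if k == j else I2)
    return out


def xxz_open_chain(n, delta):
    """Sum over bonds (j, j+1), j = 0..n-2, of sx sx + sy sy + delta * sz sz."""
    dim = 2 ** n
    h = [[0] * dim for _ in range(dim)]
    for j in range(n - 1):
        for q, coeff in ((SX, 1), (SY, 1), (SZ, delta)):
            bond = matmul(site_op(q, j, n), site_op(q, j + 1, n))
            h = [[x + coeff * y for x, y in zip(rh, rb)] for rh, rb in zip(h, bond)]
    return h


def is_hermitian(m):
    """True if the matrix equals its conjugate transpose."""
    size = len(m)
    return all(m[i][j] == complex(m[j][i]).conjugate()
               for i in range(size) for j in range(size))
```

For two sites the result is a 4 by 4 matrix in the basis (++, +−, −+, −−); the anisotropy only enters the diagonal, while the x and y terms together produce the exchange element between +− and −+.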
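For a subset $\Lambda$ of $\mathbb{Z}$, $\mathfrak{A}_\Lambda$ is defined as the $C^*$-subalgebra of $\mathfrak{A}$ generated by physical observables localized in $\Lambda$. We set
$$\mathfrak{A}_{loc} = \bigcup_{\Lambda \subset \mathbb{Z} :\, |\Lambda| < \infty} \mathfrak{A}_\Lambda, \qquad (1.3)$$
where the cardinality of $\Lambda$ is denoted by $|\Lambda|$. We call an element of $\mathfrak{A}_{loc}$ a local observable or a strictly local observable. By a state of $\mathfrak{A}$ we mean a normalized positive linear functional of $\mathfrak{A}$. Let $\varphi$ be a state of $\mathfrak{A}$. The restriction to $\mathfrak{A}_\Lambda$ will be denoted by $\varphi_\Lambda$: $\varphi_\Lambda = \varphi|_{\mathfrak{A}_\Lambda}$. Let $\tau_j$ be the lattice translation determined by
$$\tau_j(Q^{(k)}) = Q^{(j+k)}$$
for any $j$ and $k$ in $\mathbb{Z}$.
The time evolution of an observable $Q$ in $\mathfrak{A}$ is determined by $\alpha_t^{XXZ}(Q) = e^{itH_{XXZ}} Q e^{-itH_{XXZ}}$.
Definition 1.1. Let $\varphi$ be a state of $\mathfrak{A}$. $\varphi$ is a ground state of the XXZ model if the following inequality is valid for any $Q$ in $\mathfrak{A}_{loc}$:
$$\varphi(Q^*[H_{XXZ}, Q]) \geq 0. \qquad (1.4)$$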
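The inequality (1.4) can be checked by hand in the simplest finite-volume analogue. The sketch below assumes a two-site open chain, for which the single-bond Hamiltonian is the 4 by 4 matrix written out in the code (basis ordered ++, +−, −+, −−), and it takes the vector (0, 1, −1, 0), an eigenvector with eigenvalue −Δ − 2, as the ground vector (valid for Δ > −1). All helper names are hypothetical, and Gaussian-integer test matrices Q are used only so that the arithmetic is exact.

```python
import random


def two_site_xxz(delta):
    """sx sx + sy sy + delta * sz sz on two sites, basis (++, +-, -+, --)."""
    return [[delta, 0, 0, 0],
            [0, -delta, 2, 0],
            [0, 2, -delta, 0],
            [0, 0, 0, delta]]


def matvec(m, v):
    """Matrix times vector."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


def inner(u, v):
    """Inner product, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))


PSI = [0, 1, -1, 0]  # unnormalised ground vector of the two-site chain


def ground_state_functional(q, delta):
    """phi(Q^* [H, Q]) for the vector state defined by PSI."""
    h = two_site_xxz(delta)
    q_psi = matvec(q, PSI)
    hq_psi = matvec(h, q_psi)            # H Q psi
    qh_psi = matvec(q, matvec(h, PSI))   # Q H psi
    commutator_psi = [a - b for a, b in zip(hq_psi, qh_psi)]
    # <psi, Q^* [H, Q] psi> = <Q psi, [H, Q] psi>, divided by <psi, psi>
    return inner(q_psi, commutator_psi) / inner(PSI, PSI)


def sample_values(delta, count, seed=0):
    """Evaluate the functional on random 4x4 matrices with Gaussian-integer entries."""
    rng = random.Random(seed)
    values = []
    for _ in range(count):
        q = [[complex(rng.randint(-3, 3), rng.randint(-3, 3)) for _ in range(4)]
             for _ in range(4)]
        values.append(ground_state_functional(q, delta))
    return values
```

Since PSI is an eigenvector with the lowest eigenvalue E, the functional equals the expectation of H − E in the vector Q·PSI, which is why every sampled value is real and non-negative.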
The main result of this paper is described below.
Theorem 1.2. For the antiferromagnetic XXZ model, if $\Delta$ is sufficiently large, there exist precisely two pure ground states $\varphi^{XXZ}_{even}$ and $\varphi^{XXZ}_{odd}$ such that
$$\varphi^{XXZ}_{even} \circ \tau_1 = \varphi^{XXZ}_{odd}, \qquad \varphi^{XXZ}_{odd} \circ \tau_1 = \varphi^{XXZ}_{even}.$$
Any ground state is a convex combination of $\varphi^{XXZ}_{even}$ and $\varphi^{XXZ}_{odd}$.
Let us explain the physical meaning of the infinite volume ground states defined above. Let $\alpha_t$ be a one-parameter group of automorphisms of $\mathfrak{A}$ generated by an infinite volume Hamiltonian $H$:
$$\alpha_t(Q) = e^{itH} Q e^{-itH}, \qquad Q \in \mathfrak{A}.$$
Let $\{\pi(\mathfrak{A}), \Omega, \mathfrak{H}\}$ be the GNS triple for a state $\varphi$ satisfying $\varphi(Q^*[H, Q]) \geq 0$. A positive selfadjoint operator $H_\pi$ on $\mathfrak{H}$ exists such that
$$e^{itH_\pi} \pi(Q) e^{-itH_\pi} = \pi(\alpha_t(Q)), \qquad H_\pi \Omega = 0.$$
$H_\pi$ corresponds to the regularized Hamiltonian. Here by regularization we mean subtraction of the vacuum energy from the Hamiltonian. Another meaning of (1.4) is that the effect of the boundary condition at infinity should be taken into account when we talk about infinite volume systems. Consider a ground state $\varphi^{(\Lambda)}$ (= an eigenvector state for the smallest eigenvalue) of a finite volume Hamiltonian $H_\Lambda$ with free or periodic
boundary condition. Then $\varphi^{(\Lambda)}$ satisfies (1.4) with $H_{XXZ}$ replaced by $H_\Lambda$, and (1.4) is valid for the infinite volume ground state $\varphi$ obtained via the usual thermodynamic limit:
$$w\text{-}\lim_{\Lambda \to \mathbb{Z}} \varphi^{(\Lambda)} = \varphi.$$
For the XXZ model with large Ising-like anisotropy, the infinite volume ground state obtained by this procedure is not a pure state and it is a convex combination of two periodic ground states $\varphi^{XXZ}_{even}$ and $\varphi^{XXZ}_{odd}$ with period 2. Both infinite volume states $\varphi^{XXZ}_{even}$ and $\varphi^{XXZ}_{odd}$ satisfy the inequality (1.4). Another possibility to obtain states satisfying (1.4) is to consider finite volume Hamiltonians with boundary magnetic field:
$$\tilde H_{XXZ} = \sum_{j=-N}^{N-1} \{\sigma_x^{(j)}\sigma_x^{(j+1)} + \sigma_y^{(j)}\sigma_y^{(j+1)} + \Delta\,\sigma_z^{(j)}\sigma_z^{(j+1)}\} + h_{-N}\,\sigma_z^{(-N)} + h_N\,\sigma_z^{(N)}, \qquad (1.5)$$
where $h_{-N}$ and $h_N$ are arbitrary magnetic fields dependent on the sites $-N$ and $N$. The infinite volume limit of ground states for the Hamiltonian (1.5) gives rise to a state satisfying (1.4) as well. More generally, consider a general infinite volume Hamiltonian $H$ obtained by a limit of local Hamiltonians $H_\Lambda$, $\lim_\Lambda H_\Lambda = H$, and a sequence of local selfadjoint operators $B_\Lambda$ satisfying
$$\lim_{\Lambda \to \mathbb{Z}} [H_\Lambda + B_\Lambda, Q] = [H, Q] \qquad (1.6)$$
for any local $Q$ in $\mathfrak{A}_{loc}$. This equation implies that the time evolution generated by $\lim_{\Lambda \to \mathbb{Z}} (H_\Lambda + B_\Lambda)$ is the same as that of $H$. For any infinite volume limit of ground states for $H_\Lambda + B_\Lambda$ we obtain a state satisfying (1.4). For ferromagnetic XXZ models with $\Delta > 1$,
$$H_{ferro} = -\sum_{\langle j, j' \rangle} \{\sigma_x^{(j)}\sigma_x^{(j')} + \sigma_y^{(j)}\sigma_y^{(j')} + \Delta\,\sigma_z^{(j)}\sigma_z^{(j')}\},$$
the above construction of infinite volume ground states yields non-translationally invariant ground states. In [1], S. R. Alcaraz, R. S. Salinas, and W. F. Wreszinski obtained non-translationally invariant ground states of XXZ models. Their construction of non-translationally invariant ground states is valid for higher dimensional integer lattices $\mathbb{Z}^d$. In [13], we have shown that the ground states obtained by S. R. Alcaraz, R. S. Salinas, and W. F. Wreszinski and the standard translationally invariant ground states are the only pure ground states for the one-dimensional ferromagnetic XXZ model. Absence of non-translationally invariant ground states was established for the isotropic ferromagnetic case, $\Delta = 1$, by T. Koma and B. Nachtergaele in [9]. States considered in [13] and [9] are product states, and the proof of the absence (or existence) of non-translationally invariant ground states is not mathematically difficult for the ferromagnetic XXZ model. However, we are not aware of general methods to analyze the problem of the absence of non-translationally invariant ground states for general Hamiltonians. Difficulties appear in two areas. (1) In many cases, two dimensional classical spin systems are equivalent to one dimensional quantum spin systems. Path integral representation of correlation functions reveals that certain ground states of one dimensional quantum models are projections of Gibbs measures for two dimensional classical systems to a one dimensional
line. It is widely believed that non-translationally invariant Gibbs measures cannot exist for two dimensional systems; however, when more than two translationally invariant Gibbs states exist, it is hard to prove the conjecture. In a sense, the two dimensional Ising model is the only example for which we can verify the claim. (2) Another difficulty is related to the spectrum of Hamiltonians. In taking the thermodynamic limit, the ground state eigenvalue of the finite volume Hamiltonian may fail to be a point spectrum of the infinite volume Hamiltonian if the Hilbert space belongs to a soliton sector. Whether the infimum of the spectrum in a soliton sector is a point spectrum or not is a subtle question and the ordinary perturbation theories do not work. For a highly anisotropic antiferromagnetic XXZ model the first question (1) of classical Gibbs measures does not occur as the large $\Delta$ regime corresponds to a high temperature Gibbs measure. The question of the spectrum in (2) is still problematic as non-periodic ground states exist for the one dimensional classical Ising model. Recently N. Datta and T. Kennedy obtained a convergent perturbation expansion for the low lying spectrum (dispersion relation) for finite twisted periodic boundary conditions in [7]. (See [6] as well.) The result of [7] by N. Datta and T. Kennedy suggests that the non-periodic ground states cannot exist for the XXZ models, but it does not really imply the absence of non-periodic ground states for the following reasons: (a) The claim of N. Datta and T. Kennedy that their dispersion relation for one quasiparticle is the smallest excitation energy for a fixed momentum in finite periodic systems relies on the crossing argument, which we cannot apply to infinite systems because different parameters $\Delta$ correspond to mutually non-equivalent representations of the quasi-local algebra $\mathfrak{A}$, so that the comparison between the dispersion relations for different $\Delta$ does not make sense.
(b) So far, there has been no proof that the way of handling the boundary condition of N. Datta and T. Kennedy is the only possibility which can give rise to non-periodic ground states. (c) To extend the construction of the low lying spectrum of N. Datta and T. Kennedy to infinite volume systems, an infinite product of the Pauli matrix appears, and so far we have not found any idea to extend their construction to infinite systems. Now we return to the antiferromagnetic XXZ model. The Hamiltonian $H_{XXZ}$ is unitarily equivalent to the following Hamiltonian:
$$H = -\sum_{j \in \mathbb{Z}} \Big\{\sigma_x^{(j)}\sigma_x^{(j+1)} + \frac{1}{\Delta}\,\sigma_y^{(j)}\sigma_y^{(j+1)} - \frac{1}{\Delta}\,\sigma_z^{(j)}\sigma_z^{(j+1)}\Big\}. \qquad (1.7)$$
In fact, if we set
$$\alpha_t(Q) = e^{itH} Q e^{-itH}, \qquad \gamma(Q) = \mathrm{Ad}\Big(\prod_{j \in \mathbb{Z}} \sigma_z^{(2j)}\Big)(Q) = \prod_{j \in \mathbb{Z}} \sigma_z^{(2j)}\; Q \prod_{j \in \mathbb{Z}} \sigma_z^{(2j)},$$
we have
$$\gamma \circ \alpha_t \circ \gamma^{-1} = \alpha_t^{XXZ}.$$
It is known that there exist precisely two pure translationally invariant ground states for $H$ of (1.7) if $\Delta$ is sufficiently large. To prove our theorem, it suffices to show that non-translationally invariant ground states cannot exist for $H$ with large $\Delta$.
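The relation between (1.1) and (1.7) can be checked numerically on a short open chain. The sketch below is a finite-volume illustration under stated assumptions, not the paper's infinite-volume argument: it reads (1.7) with coefficients 1, 1/Δ and −1/Δ, and in addition to the product of $\sigma_z$ over even sites it uses a site-wise rotation `ROT` that cyclically permutes the Pauli matrices and an overall factor Δ. The rotation, the rescaling, and all function names are assumptions introduced here for illustration.

```python
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
# Assumed site-wise rotation R = (1 - i(sx + sy + sz)) / 2, which satisfies
# R sx R* = sy, R sy R* = sz, R sz R* = sx.
ROT = [[(1 - 1j) / 2, (-1 - 1j) / 2], [(1 - 1j) / 2, (1 + 1j) / 2]]


def kron(a, b):
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(m):
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]


def site_op(q, j, n):
    out = [[1]]
    for k in range(n):
        out = kron(out, q if k == j else I2)
    return out


def bond_sum(n, cx, cy, cz):
    """Sum over bonds of cx sx sx + cy sy sy + cz sz sz on an open n-site chain."""
    dim = 2 ** n
    h = [[0] * dim for _ in range(dim)]
    for j in range(n - 1):
        for q, c in ((SX, cx), (SY, cy), (SZ, cz)):
            bond = matmul(site_op(q, j, n), site_op(q, j + 1, n))
            h = [[x + c * y for x, y in zip(rh, rb)] for rh, rb in zip(h, bond)]
    return h


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def equivalence_defect(n, delta):
    """Distance between W H_XXZ W* and delta * H, with H read as in (1.7)."""
    h_xxz = bond_sum(n, 1, 1, delta)
    target = bond_sum(n, -delta, -1, 1)  # delta * H
    w = site_op(I2, 0, n)                # identity on the whole chain
    for j in range(n):                   # rotate every site
        w = matmul(site_op(ROT, j, n), w)
    for j in range(0, n, 2):             # then sz on the even sites
        w = matmul(site_op(SZ, j, n), w)
    return max_abs_diff(matmul(matmul(w, h_xxz), dagger(w)), target)
```

On an open chain every bond contains exactly one even site, so conjugation by the product of $\sigma_z$ over even sites flips the sign of the x and y couplings on each bond while leaving the z coupling unchanged.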
For the proof of our theorem, we combine the ideas of our previous papers [11–14], the expansion technique of J. Kirkwood and L. Thomas [8], and the construction of low energy quasiparticle states by R. A. Minlos in [15]. Though we are hugely inspired by the work [7] of N. Datta and T. Kennedy, we do not use their results [7] in the course of our proof. Our argument does not depend on the integrability of the model. In fact, the absence of non-translationally invariant ground states can be proved for the following translationally invariant Hamiltonian:
$$H = -\sum_{j \in \mathbb{Z}} \{\sigma_x^{(j)}\sigma_x^{(j+1)} + \delta\,\sigma_y^{(j)}\sigma_y^{(j+1)} - \delta\,\sigma_z^{(j)}\sigma_z^{(j+1)}\} - \epsilon \sum_{j \in \mathbb{Z}} V_j, \qquad (1.8)$$
where $V_j$ is an interaction satisfying Perron-Frobenius positivity and $\mathbb{Z}_2$ invariance of [12]. Both $|\delta|$ and $|\epsilon|$ should be very small, $|\delta| \ll 1$, $|\epsilon| \ll |\delta|$. Under these conditions, we can estimate the dispersion relation of a quasi-particle in soliton sectors as follows:
$$\omega(k) = \{2 + 4\delta \cos 2k\} + O(\delta^2).$$
This non-triviality of the dispersion relation of a quasi-particle implies the absence of non-translationally invariant ground states. We have not succeeded in obtaining explicit upper bounds of $|\delta|$ and $|\epsilon|$ for which our results are valid. The rest of this article is organized as follows. In Sect. 2, we explain the construction of pure translationally invariant ground states. Any ground state gives rise to a positive energy representation defined in Sect. 3. We obtain a complete classification of positive energy representations for our model and uniqueness of infinite volume ground states for weakly coupled quantum Ising models on $\mathbb{Z}$ in Sect. 3. Sect. 4 is devoted to low lying excitations. With the aid of Kramers-Wannier duality for Ising models and Minlos maps, we obtain non-triviality (momentum dependence) of the dispersion relation for one quasiparticle in soliton sectors. This result gives the absence of non-periodic ground states. It is natural to ask whether the results of this paper can be obtained for higher spin XXZ models. In our proof, we used three quite different techniques. (a) One is the realization of the quantum Hamiltonian as a Markov generator of an infinite particle system and the spectral analysis. (b) The second is the operator algebra technique which gives rise to a complete description of soliton sectors. (c) The third is Kramers-Wannier duality. The operator algebraic results (b) of Sect. 3 can be extended to a more general class of massive models in which we do not assume Perron-Frobenius positivity for Hamiltonians. The argument in our proof is purely functional analytic.
The duality technique (c) and the Markov generator method (a) are rather cumbersome in higher spin models, even though we can analyze the same problem, in principle. So far, we are unable to give an infinite dimensional version of the spectral analysis of Datta-Kennedy in [7]. It is desirable to establish another method close to that of [7] for analysis of higher spin systems.
2. Translationally Invariant Ground States
In this section, we explain the construction of the physical Hilbert space (GNS Hilbert space) for translationally invariant ground states of the XXZ model. We assume that the parameter $\Delta$ is sufficiently large.
First we begin with finite volume systems. For later use we consider both free and periodic boundary conditions. For any integers $n$ and $m$, we set
$$H^F_{[n,m]} = -\sum_{j=n}^{m-1} \Big\{\sigma_x^{(j)}\sigma_x^{(j+1)} + \frac{1}{\Delta}\,\sigma_y^{(j)}\sigma_y^{(j+1)} - \frac{1}{\Delta}\,\sigma_z^{(j)}\sigma_z^{(j+1)}\Big\},$$
$$H^P_{[n,m]} = H^F_{[n,m]} - \Big\{\sigma_x^{(n)}\sigma_x^{(m)} + \frac{1}{\Delta}\,\sigma_y^{(n)}\sigma_y^{(m)} - \frac{1}{\Delta}\,\sigma_z^{(n)}\sigma_z^{(m)}\Big\}.$$
We show that these finite volume Hamiltonians $H^F_{[n,m]}$ and $H^P_{[n,m]}$ are identified with spin flip Markov generators of finite volume Ising spin systems. For this purpose we introduce some notation. The idea goes back to the work of J. Kirkwood and L. Thomas in [8]. Let $X$ be the configuration space of the one-dimensional Ising spin system;
$$X = \prod_{j \in \mathbb{Z}} \{+, -\}.$$
By product topology $X$ is a compact space. The set of continuous functions on $X$ will be denoted by $C(X)$ and the Ising spin variable at the site $j$ (the projection to the $j$th coordinate of $X$ to $\{+, -\}$) will be denoted by $\sigma^{(j)}$ while a spin configuration is denoted by $\sigma$. Let $\mathfrak{B}$ be the abelian $C^*$-subalgebra generated by all $\sigma_z^{(j)}$ ($j \in \mathbb{Z}$). We will identify $\mathfrak{B}$ with $C(X)$ via the equality $\sigma^{(m)} = \sigma_z^{(m)}$. Let $X_{[n,m]}$ be the configuration space of the one-dimensional Ising spin system in a finite interval $[n, m]$, $X_{[n,m]} = \prod_{j=n}^{m} \{+, -\}$. The algebra generated by all $\sigma_z^{(j)}$ with $n \leq j \leq m$ is denoted by $\mathfrak{B}_{[n,m]}$. For any finite set $A$ of $\mathbb{Z}$, we define
$$\sigma(A) = \prod_{j \in A} \sigma^{(j)}, \qquad \sigma_\alpha(A) = \prod_{j \in A} \sigma_\alpha^{(j)} \quad (\alpha = x, y, z).$$
Any continuous function $f(\sigma)$ on $X$ is expanded as
$$f(\sigma) = \sum_{A \subset \mathbb{Z}} f_A\, \sigma(A),$$
where the complex coefficients $f_A$ have the following summability:
$$\sum_{A \subset \mathbb{Z}} |f_A|^2 < \infty.$$
We will handle continuous functions with finite $|f|$ or $|f|_{\delta_1}$, where
$$|f| = \sum_{A \subset \mathbb{Z}} |f_A|, \qquad |f|_{\delta_1} = \sum_{A \subset \mathbb{Z}} |f_A|\, e^{\delta_1 d(A)}, \qquad (2.1)$$
and $d(A)$ is the diameter of $A$ (the minimal length of intervals containing $A$). We use the symbol $|f|$ instead of $\|f\|$ to distinguish it from the sup norm of continuous functions. When speaking of finite volume systems, $H^P_{[n,m]}$ and $H^F_{[n,m]}$ may be regarded as $2^{m-n+1}$ by $2^{m-n+1}$ matrices acting on the $2^{m-n+1}$ dimensional Hilbert space $\mathfrak{H}_{[n,m]}$. Note that
$$[\sigma_\alpha([n, m]), H^P_{[n,m]}] = 0, \qquad [\sigma_\alpha([n, m]), H^F_{[n,m]}] = 0.$$
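For a function of finitely many spins, the expansion in the products $\sigma(A)$ and the two norms in (2.1) can be computed directly. The following sketch assumes a function of n spins labelled 0, ..., n−1 with values ±1, recovers $f_A$ by averaging $f(\sigma)\sigma(A)$ over all configurations, and takes $d(A)$ to be max(A) − min(A) with $d(\emptyset) = 0$; this reading of the diameter and all function names are assumptions for illustration.

```python
from itertools import combinations, product
from math import exp


def configurations(n):
    """All spin configurations of n sites, spins taking values +1 and -1."""
    return list(product((1, -1), repeat=n))


def spin_product(sigma, subset):
    """sigma(A): the product of the spins at the sites in A."""
    value = 1
    for j in subset:
        value *= sigma[j]
    return value


def expansion_coefficients(f, n):
    """Nonzero coefficients f_A of f = sum_A f_A sigma(A), keyed by the tuple A."""
    configs = configurations(n)
    coeffs = {}
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            total = sum(f(s) * spin_product(s, subset) for s in configs)
            if total != 0:
                coeffs[subset] = total / len(configs)
    return coeffs


def diameter(subset):
    """d(A), taken here as max(A) - min(A), and 0 for the empty set."""
    return max(subset) - min(subset) if subset else 0


def weighted_norm(coeffs, delta1=0.0):
    """|f|_{delta1} = sum_A |f_A| exp(delta1 * d(A)); delta1 = 0 gives |f|."""
    return sum(abs(c) * exp(delta1 * diameter(a)) for a, c in coeffs.items())
```

For instance, the function `lambda s: 3 + 2 * s[0] * s[2]` on three spins has exactly two nonzero coefficients, one for the empty set and one for the set {0, 2}, and only the latter is affected by the exponential weight.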
We can decompose the Hilbert space $\mathfrak{H}_{[n,m]}$ into eigenspaces of $\sigma_z([n, m])$:
$$\mathfrak{H}_{[n,m]} = \mathfrak{H}^{+}_{[n,m]} \oplus \mathfrak{H}^{-}_{[n,m]}, \qquad \mathfrak{H}^{\pm}_{[n,m]} = \{\xi \in \mathfrak{H}_{[n,m]} \mid \sigma_z([n, m])\xi = \pm\xi\}.$$
We call $\mathfrak{H}^{+}_{[n,m]}$ ($\mathfrak{H}^{-}_{[n,m]}$) the even (odd) sector. As we employ the representation (1.2) of Pauli spin matrices, the off-diagonal matrix elements of $H^P_{[n,m]}$ and $H^F_{[n,m]}$ restricted to the sectors $\mathfrak{H}^{\pm}_{[n,m]}$ are non-positive. On each sector, $-H^P_{[n,m]}$ and $-H^F_{[n,m]}$ are irreducible Perron-Frobenius positive matrices. More precisely, let $e_{[n,m]}(\sigma)$ be a unit vector of $\mathfrak{H}_{[n,m]}$ specified with the following identity:
$$\sigma_z^{(j)} e_{[n,m]}(\sigma) = \sigma^{(j)} e_{[n,m]}(\sigma), \qquad \sigma_x^{(j)} e_{[n,m]}(\sigma) = e_{[n,m]}(\sigma_j)$$
for all $j$ in $[n, m]$. Here $\sigma_j$ is the spin configuration obtained by the spin flip at the site $j$ in the lattice $\mathbb{Z}$. The set of unit vectors $e_{[n,m]}(\sigma)$ forms a basis of $\mathfrak{H}_{[n,m]}$. The Perron-Frobenius irreducibility condition is equivalent to the following positivity of matrix elements:
$$\big(e_{[n,m]}(\sigma),\, e^{-H^{\delta}_{[n,m]}} e_{[n,m]}(\sigma')\big) > 0, \qquad \delta = P, F,$$
provided that both $e_{[n,m]}(\sigma)$ and $e_{[n,m]}(\sigma')$ belong to the same sector. The ground state vectors $\Omega^P_{[n,m]}(\pm)$, $\Omega^F_{[n,m]}(\pm)$ of $H^P_{[n,m]}$ and $H^F_{[n,m]}$ on each sector are written as follows:
$$\Omega^{\delta}_{[n,m]}(+) = \sum_{\sigma :\, e_{[n,m]}(\sigma) \in \mathfrak{H}^{+}_{[n,m]}} c^{\delta}_{[n,m]}(\sigma)\, e_{[n,m]}(\sigma), \qquad \Omega^{\delta}_{[n,m]}(-) = \sum_{\sigma :\, e_{[n,m]}(\sigma) \in \mathfrak{H}^{-}_{[n,m]}} c^{\delta}_{[n,m]}(\sigma)\, e_{[n,m]}(\sigma),$$
$$c^{\delta}_{[n,m]}(\sigma) > 0, \qquad \delta = P, F. \qquad (2.2)$$
We denote the ground state energy of $H^{\delta}_{[n,m]}$ by $E^{\delta}_{[n,m]}$ ($\delta = P, F$). The ground state energy of $H^{\delta}_{[n,m]}$ restricted to the even (resp. odd) sector is denoted by $E^{\delta}_{[n,m]}(even)$ (resp. $E^{\delta}_{[n,m]}(odd)$). When $m - n$ is even, $\sigma_x([n, m])$ and $\sigma_z([n, m])$ are anti-commuting and the Hamiltonians $H^{\delta}_{[n,m]}$ on even and odd sectors are unitarily equivalent. Thus if $m - n$ is even,
$$E^{\delta}_{[n,m]} = E^{\delta}_{[n,m]}(even) = E^{\delta}_{[n,m]}(odd).$$
On the other hand, when $m - n$ is odd, it is known that
$$E^{\delta}_{[n,m]} = E^{\delta}_{[n,m]}(even) < E^{\delta}_{[n,m]}(odd).$$
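The structural facts used in the case where $m - n$ is even can be verified on a small open chain. The sketch below assumes the free-boundary Hamiltonian with couplings 1, 1/Δ and −1/Δ on sites 0, ..., n−1, and checks that it commutes with the global products of $\sigma_z$ and $\sigma_x$, that its off-diagonal elements in the $\sigma_z$ basis are non-positive, and that the two global products anticommute when the number of sites is odd. The function names are hypothetical, and no claim about the ordering of the sector energies for odd $m - n$ is tested here.

```python
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]


def kron(a, b):
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def site_op(q, j, n):
    out = [[1]]
    for k in range(n):
        out = kron(out, q if k == j else I2)
    return out


def global_op(q, n):
    """The product of q over all n sites, as a tensor power."""
    out = [[1]]
    for _ in range(n):
        out = kron(out, q)
    return out


def free_hamiltonian(n, delta):
    """-(sx sx + (1/delta) sy sy - (1/delta) sz sz) summed over open-chain bonds."""
    dim = 2 ** n
    h = [[0] * dim for _ in range(dim)]
    for j in range(n - 1):
        for q, c in ((SX, -1.0), (SY, -1.0 / delta), (SZ, 1.0 / delta)):
            bond = matmul(site_op(q, j, n), site_op(q, j + 1, n))
            h = [[x + c * y for x, y in zip(rh, rb)] for rh, rb in zip(h, bond)]
    return h


def max_abs_entry(m):
    return max(abs(x) for row in m for x in row)


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]


def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]


def largest_offdiagonal(h):
    """Largest real part among the off-diagonal matrix elements."""
    n = len(h)
    return max(complex(h[i][j]).real for i in range(n) for j in range(n) if i != j)
```

With an odd number of sites the global $\sigma_x$ product commutes with the Hamiltonian and maps the even sector onto the odd one, which is the unitary equivalence behind the equality of the two sector energies.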
So far we did not use our assumption that $\Delta$ is sufficiently large. When $\Delta$ is large, convergent expansions show us the following results.
Proposition 2.1. There exists $\Delta_0 > 1$ such that for $\Delta > \Delta_0$,
$$|E^{\delta}_{[n,m]}(even) - E^{\delta}_{[n,m]}(odd)| \leq C_1 e^{-C_2(m-n)}, \qquad (2.3)$$
$$E^{F}_{[n,m]} = d + (m - n)\,e_\infty + O((m - n)^{-1}), \qquad E^{P}_{[n,m]} = (m - n + 1)\,e_\infty + O((m - n)^{-1}), \qquad (2.4)$$
where $e_\infty$ is the mean ground state energy and $d$ is a constant.
When $m - n$ is even, the coefficient $c^{\delta}_{[n,m]}(\sigma)$ in (2.2) is a unique solution to the following Kirkwood-Thomas equation:
$$-\sum_{j} \big(\Delta - \sigma_z^{(j)}\sigma_z^{(j+1)}\big)\, c^{\delta}_{[n,m]}(\sigma_{j,j+1})\, c^{\delta}_{[n,m]}(\sigma)^{-1} + \sum_{j} \sigma_z^{(j)}\sigma_z^{(j+1)} = \Delta\, E^{\delta}_{[n,m]}, \qquad (2.5)$$
where $\sigma_{j,j+1}$ is the configuration in which spins at both the $j$ and $j+1$ sites are flipped. The spectral gap (the difference of the smallest and the next eigenvalues) of $H^P_{[n,m]}$ and $H^F_{[n,m]}$ on each sector is bounded by $\gamma = \gamma(\Delta)$ from below uniformly in the system size. See [6], [7], and [8]. Set $h_{[n,m]}(\sigma) = -2 \ln(c^{P}_{[n,m]}(\sigma))$. Then, we have the following convergent series
$$h_{[n,m]}(\sigma) = \sum_{k=1}^{\infty} \frac{1}{\Delta^k}\, h^{(k)}_{[n,m]}(\sigma). \qquad (2.6)$$
Consider the following expansion:
$$h^{(k)}_{[n,m]}(\sigma) = \sum_{A} J^{(k)}_{[n,m]}(A)\, \sigma(A). \qquad (2.7)$$
If $m - n$ is sufficiently large, the first few terms in (2.6) are identical in this expansion:
$$J^{(k)}_{[n,m]}(A) = J^{(k)}(A) \qquad (2.8)$$
if $k \leq 1/8\,(m - n)$. Set
$$J(A) = \sum_{k=1}^{\infty} \frac{1}{\Delta^k}\, J^{(k)}(A), \qquad J_j(\sigma) = \sum_{A \ni j} J(A)\, \sigma(A). \qquad (2.9)$$
There exists a positive constant $\delta_1$ such that $|J_j|_{\delta_1}$ is finite:
$$|J_j|_{\delta_1} = \sum_{A \ni j} |J(A)|\, e^{\delta_1 d(A)} < \infty. \qquad (2.10)$$
Consider the vector states $\varphi^{(\pm)}([n, m])$ of $\mathfrak{A}_{[n,m]}$ defined by
$$\varphi^{(\pm)}([n, m])(Q) = \big((\Omega^P_{[n,m]}(+) \pm \Omega^P_{[n,m]}(-)),\, Q\,(\Omega^P_{[n,m]}(+) \pm \Omega^P_{[n,m]}(-))\big) \qquad (2.11)$$
for $Q$ in $\mathfrak{A}_{[n,m]}$. These states $\varphi^{(\pm)}([n, m])$ converge to translationally invariant pure ground states as we take $m - n$ to infinity. To see the convergence of this limit, it is convenient to write down the correlation functions in terms of the integral over a classical spin configuration space. Let $X_{[n,m]}$ be the configuration space of the one-dimensional Ising spin system in a finite interval $[n, m]$, $X_{[n,m]} = \prod_{j=n}^{m} \{+, -\}$. Consider the "Gibbs" measure $d\mu_{[n,m]}(\sigma)$ defined by
$$d\mu_{[n,m]}(\sigma) = e^{-h_{[n,m]}(\sigma)}\, d\sigma,$$
where $d\sigma$ is the probability measure on $X_{[n,m]}$ specified with
$$\int_{X_{[n,m]}} \sigma(A)\, d\sigma = 0$$
for any non-empty $A \subset [n, m]$. Then, the correlation functions for $\varphi^{(\pm)}([n, m])$ of (2.11) are written as follows:
$$\varphi^{(+)}([n, m])(\sigma_z(B)\sigma_x(A)) = \int_{X_{[n,m]}} \sigma(B) \left(\frac{d\mu_{[n,m]}(\sigma_A)}{d\mu_{[n,m]}(\sigma)}\right)^{1/2} d\mu_{[n,m]}(\sigma),$$
$$\varphi^{(-)}([n, m])(\sigma_z(B)\sigma_x(A)) = (-1)^{|A|} \int_{X_{[n,m]}} \sigma(B) \left(\frac{d\mu_{[n,m]}(\sigma_A)}{d\mu_{[n,m]}(\sigma)}\right)^{1/2} d\mu_{[n,m]}(\sigma), \qquad (2.12)$$
where $\sigma_A$ is the spin configuration in which all spins inside $A$ are flipped for $\sigma$. Note that (2.12) determines the expectation value of all observables as $\sigma_y(A)$ is a product of $\sigma_x(A)$ and $\sigma_z(A)$. Now we can identify the GNS Hilbert space of $\varphi^{(\pm)}([n, m])$ with $L^2(X_{[n,m]}, \mu)$ in a natural manner. For this purpose, it suffices to exhibit a representation of $\mathfrak{A}$ and a unit vector in $L^2(X_{[n,m]}, \mu)$ such that the associated vector state satisfies the identity (2.12). In fact we set
$$\pi^{(+)}_{[n,m]}(\sigma_x^{(j)})F(\sigma) = \left(\frac{d\mu_{[n,m]}(\sigma_j)}{d\mu_{[n,m]}(\sigma)}\right)^{1/2} F(\sigma_j), \qquad \pi^{(+)}_{[n,m]}(\sigma_z^{(j)})F(\sigma) = \sigma^{(j)} F(\sigma),$$
$$\pi^{(-)}_{[n,m]}(\sigma_x^{(j)})F(\sigma) = -\left(\frac{d\mu_{[n,m]}(\sigma_j)}{d\mu_{[n,m]}(\sigma)}\right)^{1/2} F(\sigma_j), \qquad \pi^{(-)}_{[n,m]}(\sigma_z^{(j)})F(\sigma) = \sigma^{(j)} F(\sigma) \qquad (2.13)$$
for $F(\sigma)$ in $L^2(X_{[n,m]}, \mu)$ and $n \leq j \leq m$. If we set $\Omega_{[n,m]} = 1$,
$$\varphi^{(\pm)}([n, m])(Q) = \big(\Omega_{[n,m]},\, \pi^{(\pm)}_{[n,m]}(Q)\,\Omega_{[n,m]}\big).$$
The advantage of the above presentation of ground states (2.12) is twofold. First we can take the infinite volume limit easily as the measure $d\mu_{[n,m]}$ converges to the Gibbs measure for the following infinite range translationally invariant Hamiltonian $h(\sigma)$ of the classical Ising spin system:
$$h(\sigma) = \sum_{A \subset \mathbb{Z}} J(A)\, \sigma(A). \qquad (2.14)$$
$h(\sigma)$ is translationally invariant because $J(A)$ introduced in (2.9) satisfies the identity $J(A + k) = J(A)$ due to the periodicity of the boundary conditions. By exponential decay of $J(A)$ in (2.10), the Gibbs measure for $h(\sigma)$ is unique. The second reason to use (2.12) is that we have the explicit realization of the GNS Hilbert space for infinite volume ground states. In particular the spectral analysis of the infinite volume Hamiltonian is possible.
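The representation (2.13) and the correlation formula (2.12) can be tested on a few spins with an arbitrary classical weight. The sketch below assumes a finite configuration space of n spins, a user-supplied classical energy `h` (any real function of the configuration, hypothetical here), normalised weights proportional to exp(−h), and real-valued functions F stored as dictionaries. It implements the operators for $\sigma_x^{(j)}$ and $\sigma_z^{(j)}$ and the right-hand side of (2.12); all names are illustrative assumptions.

```python
from itertools import product
from math import exp, sqrt


def configurations(n):
    return list(product((1, -1), repeat=n))


def flip(sigma, sites):
    """The configuration with the spins at the given sites reversed."""
    return tuple(-s if j in sites else s for j, s in enumerate(sigma))


def gibbs_weights(h, n):
    """Normalised weights w(sigma) proportional to exp(-h(sigma))."""
    raw = {s: exp(-h(s)) for s in configurations(n)}
    total = sum(raw.values())
    return {s: v / total for s, v in raw.items()}


def pi_x(j, func, w, sign=1):
    """(2.13): F -> sign * sqrt(w(sigma_j) / w(sigma)) * F(sigma_j)."""
    return {s: sign * sqrt(w[flip(s, {j})] / w[s]) * func[flip(s, {j})] for s in w}


def pi_z(j, func):
    """(2.13): multiplication by the spin at site j."""
    return {s: s[j] * v for s, v in func.items()}


def inner(f, g, w):
    """Inner product of real functions in L^2 with respect to the weights w."""
    return sum(w[s] * f[s] * g[s] for s in w)


def correlation(b_sites, a_sites, w, sign=1):
    """Right-hand side of (2.12) for sigma_z(B) sigma_x(A)."""
    total = 0.0
    for s in w:
        spin_b = 1
        for j in b_sites:
            spin_b *= s[j]
        total += spin_b * sqrt(w[flip(s, set(a_sites))] * w[s])
    return sign ** len(a_sites) * total


def max_diff(f, g):
    return max(abs(f[s] - g[s]) for s in f)
```

The square-root density factor is what makes the spin-flip operator symmetric and an involution with respect to the weighted inner product, so that together with the multiplication operators it satisfies the Pauli relations.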
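Proposition 2.2. Let $d\mu$ be the unique Gibbs measure for the classical Hamiltonian $h(\sigma)$ in (2.14). Let $\varphi^{(\pm)}$ be the states determined by the following equation:
$$\varphi^{(+)}(\sigma_z(B)\sigma_x(A)) = \int_X \sigma(B) \left(\frac{d\mu(\sigma_A)}{d\mu(\sigma)}\right)^{1/2} d\mu(\sigma),$$
$$\varphi^{(-)}(\sigma_z(B)\sigma_x(A)) = (-1)^{|A|} \int_X \sigma(B) \left(\frac{d\mu(\sigma_A)}{d\mu(\sigma)}\right)^{1/2} d\mu(\sigma). \qquad (2.15)$$
$\varphi^{(\pm)}$ are translationally invariant pure states and
$$\varphi^{(\pm)} = w\text{-}\lim_{m \to \infty} \varphi^{(\pm)}([-m, m]). \qquad (2.16)$$
Next we consider the regularized Hamiltonian. By regularization we mean subtraction of the ground state energy. Here we consider finite chains with the periodic boundary condition and their infinite volume limit. Now for simplicity of exposition, we assume that $m - n$ is even, hence $E^P_{[n,m]}(even) = E^P_{[n,m]}(odd)$. By use of the Kirkwood-Thomas equation (2.5) it is possible to show
$$H^P_{[n,m]} - E^P_{[n,m]} = \sum_{j} \Big(1 - \frac{1}{\Delta}\,\sigma_z^{(j)}\sigma_z^{(j+1)}\Big) \Big\{\exp\Big(\frac{1}{2} f^{(j)}_{[n,m]}(\sigma_z)\Big) - \sigma_x^{(j)}\sigma_x^{(j+1)}\Big\}, \qquad (2.17)$$
where
$$f^{(j)}_{[n,m]}(\sigma) = h_{[n,m]}(\sigma_{(j,j+1)}) - h_{[n,m]}(\sigma). \qquad (2.18)$$
We used the natural identification of $\mathfrak{B}$ and $C(X)$. Note that
$$\lim_{m \to \infty} f^{(j)}_{[-m,m]}(\sigma) = 2 \sum_{|\{j,j+1\} \cap A| = 1} J(A)\, \sigma(A).$$
We set
$$B_j = \exp\Big(\sum_{|\{j,j+1\} \cap A| = 1} J(A)\, \sigma_z(A)\Big) - \sigma_x^{(j)}\sigma_x^{(j+1)}, \qquad (2.19)$$
$$A_j = \big(1 - \sigma_x^{(j)}\sigma_x^{(j+1)}\big) \exp\Big(\frac{1}{2} \sum_{|\{j,j+1\} \cap A| = 1} J(A)\, \sigma_z(A)\Big). \qquad (2.20)$$
Then
$$B_j = \frac{1}{2} A_j^* A_j.$$
Formally,
$$\lim_{m \to \infty} \big(H^P_{[-m,m]} - E^P_{[-m,m]}\big) = \frac{1}{2} \sum_{j \in \mathbb{Z}} \Big(1 - \frac{1}{\Delta}\,\sigma_z^{(j)}\sigma_z^{(j+1)}\Big) A_j^* A_j.$$
So we introduce the infinite (short) range Hamiltonian $H_{reg}$ via the following equation:
$$H_{reg} = \frac{1}{2} \sum_{j \in \mathbb{Z}} \Big(1 - \frac{1}{\Delta}\,\sigma_z^{(j)}\sigma_z^{(j+1)}\Big) A_j^* A_j. \qquad (2.21)$$
$H_{reg}$ is referred to as the regularized Hamiltonian.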
Proposition 2.3. (i) $H_{reg}$ is physically equivalent to $H$ in the sense that
$$[H, Q] = [H_{reg}, Q], \qquad Q \in \mathfrak{A}_{loc}.$$
(ii) Any state $\varphi$ of $\mathfrak{A}$ satisfying
$$\varphi(A_j^* A_j) = 0 \qquad (2.22)$$
for all $j$ is a ground state of $H$. $\varphi$ is a convex combination of pure ground states $\varphi^{(\pm)}$ of $H$:
$$\varphi = \lambda \varphi^{(+)} + (1 - \lambda)\varphi^{(-)} \qquad (0 \leq \lambda \leq 1).$$
(iii) Any translationally invariant ground state $\varphi$ of $H$ satisfies the condition (2.22) of (ii).
Proposition 2.4. (i) Let $\varphi$ be a state of $\mathfrak{A}$ satisfying (2.22) for any $j$, and let $\{\pi_\varphi, \Omega_\varphi, \mathfrak{H}_\varphi\}$ be the GNS triple associated with $\varphi$. The following limit exists in the strong resolvent sense on $\mathfrak{H}_\varphi$:
$$\pi_\varphi(H_{reg}) = \lim_{m \to \infty} \frac{1}{2} \sum_{|j| \leq m} \Big(1 - \frac{1}{\Delta}\,\pi_\varphi(\sigma_z^{(j)}\sigma_z^{(j+1)})\Big)\, \pi_\varphi(A_j^* A_j).$$
Furthermore,
$$\pi_\varphi(\alpha_t(Q)) = e^{it\pi_\varphi(H_{reg})}\, \pi_\varphi(Q)\, e^{-it\pi_\varphi(H_{reg})}, \qquad \pi_\varphi(H_{reg})\,\Omega_\varphi = 0.$$
(ii) For $\varphi^{(\pm)}$, $\pi_{\varphi^{(\pm)}}(H_{reg})$ generates a Markov semigroup on $\mathfrak{B}$. $\pi_{\varphi^{(\pm)}}(H_{reg})$ has a spectral gap:
$$\sigma(\pi_{\varphi^{(\pm)}}(H_{reg})) \cap (0, \gamma) = \emptyset,$$
where $\sigma(\pi_{\varphi^{(\pm)}}(H_{reg}))$ is the spectrum of $\pi_{\varphi^{(\pm)}}(H_{reg})$ and $\gamma$ is a positive constant dependent on $\Delta$. For proof of the above propositions, see [12].
3. Positive Energy Representation
In this section, we consider positive energy representations of the XXZ model for large $\Delta$.
Definition 3.1. Let $\{\mathfrak{A}, \alpha_t, \mathbb{R}\}$ be a $C^*$-dynamical system, i.e. $\alpha_t$ ($t \in \mathbb{R}$) is a one-parameter group of *-automorphisms of a $C^*$-algebra $\mathfrak{A}$. A representation $\pi$ of $\mathfrak{A}$ on a Hilbert space $\mathfrak{H}$ is a positive energy representation if there exists a positive selfadjoint operator $H_\pi$ on $\mathfrak{H}$ which implements the time evolution $\alpha_t$ on $\mathfrak{H}$:
$$e^{itH_\pi} \pi(Q) e^{-itH_\pi} = \pi(\alpha_t(Q)), \qquad Q \in \mathfrak{A}. \qquad (3.1)$$
By definition, the GNS representation associated with a ground state is a positive energy representation. By a theorem of H.G. Borchers, the positive (unbounded) operator $H_\pi$ is affiliated with the von Neumann algebra generated by $\pi(A)$, $e^{itH_\pi} \in \pi(A)''$. For the XXZ model, we will see that there exist four positive energy representations, two from ground states and the other two obtained by a simple procedure. We introduce $\pi_+$, $\pi_-$, $\pi_{-+}$ and $\pi_{+-}$ on the same Hilbert space $L^2(X, d\mu)$, where $d\mu$ is the unique Gibbs measure for the classical Hamiltonian $h(\sigma)$ in (2.14). In these representations, any element of the abelian subalgebra $B$ of $A$ acts as a multiplication operator,
\[
\pi_\pm(\sigma_z^{(j)})F(\sigma) = \pi_{-+}(\sigma_z^{(j)})F(\sigma) = \pi_{+-}(\sigma_z^{(j)})F(\sigma) = \sigma^{(j)}F(\sigma), \qquad F \in L^2(X, d\mu), \tag{3.2}
\]
and $\sigma_x^{(j)}$ acts as a transformation of spin variables:
\[
\begin{aligned}
\pi_\pm(\sigma_x^{(j)})F(\sigma) &= \pm\Bigl(\frac{d\mu(\sigma_j)}{d\mu(\sigma)}\Bigr)^{1/2} F(\sigma_j), \\
\pi_{-+}(\sigma_x^{(j)})F(\sigma) &= (-1)\Bigl(\frac{d\mu(\sigma_j)}{d\mu(\sigma)}\Bigr)^{1/2} F(\sigma_j), \qquad j < 0, \\
\pi_{-+}(\sigma_x^{(j)})F(\sigma) &= \Bigl(\frac{d\mu(\sigma_j)}{d\mu(\sigma)}\Bigr)^{1/2} F(\sigma_j), \qquad j \ge 0, \\
\pi_{+-}(\sigma_x^{(j)})F(\sigma) &= \Bigl(\frac{d\mu(\sigma_j)}{d\mu(\sigma)}\Bigr)^{1/2} F(\sigma_j), \qquad j < 0, \\
\pi_{+-}(\sigma_x^{(j)})F(\sigma) &= (-1)\Bigl(\frac{d\mu(\sigma_j)}{d\mu(\sigma)}\Bigr)^{1/2} F(\sigma_j), \qquad j \ge 0,
\end{aligned}
\qquad F \in L^2(X, d\mu). \tag{3.3}
\]
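It is easy to verify that for any $Q$ in $A$,
\[
\begin{aligned}
\pi_-(Q) &= \pi_+\bigl(\mathrm{Ad}(\sigma_z((-\infty,\infty)))(Q)\bigr), \\
\pi_{+-}(Q) &= \pi_+\bigl(\mathrm{Ad}(\sigma_z([0,\infty)))(Q)\bigr), \\
\pi_{-+}(Q) &= \pi_+\bigl(\mathrm{Ad}(\sigma_z((-\infty,-1]))(Q)\bigr).
\end{aligned}
\tag{3.4}
\]
Let $\Omega$ be the constant vector 1 in $L^2(X, d\mu)$. Then
\[
\varphi_\pm(Q) = (\Omega, \pi_\pm(Q)\Omega), \qquad Q \in A.
\]
We set
\[
\varphi_{-+}(Q) = (\Omega, \pi_{-+}(Q)\Omega), \qquad \varphi_{+-}(Q) = (\Omega, \pi_{+-}(Q)\Omega), \qquad Q \in A. \tag{3.5}
\]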
As ϕ+ is a pure ground state for H of (1.7), ϕ−+ is a pure ground state for H˜ due to (3.4), where H˜ is defined by 1 H˜ = (Ad(σz ((−∞, −1])(H ) = H − 2(σx(−1) σx(0) + σy(−1) σy(0) ).
(3.6)
As H˜ is a bounded perturbation of H , ϕ−+ gives rise to a positive energy representation for H . The same argument implies that π+− is a positive energy representation for H as well.
Theorem 3.2. Any factorial positive energy representation $\pi$ for $H$ is quasi-equivalent to one of the irreducible representations $\pi_+$, $\pi_-$, $\pi_{-+}$ and $\pi_{+-}$ of $A$.

We divide our proof of Theorem 3.2 into several steps.

Lemma 3.3. For any positive energy representation $\{\pi, \mathcal{H}\}$ of $\{A, \alpha_t\}$, there exist a vector state $\psi_1$ and a constant $K_1$ such that
\[
0 \le \psi_1\bigl(H^F_{[n,m]} - E^F_{[n,m]}\bigr) \le K_1 \tag{3.7}
\]
uniformly in $n$ and $m$.

Proof. We use ideas of the proofs of Lemmas 6.2.53 and 6.2.55 of [4]. As the positive selfadjoint operator $H_\pi$ satisfying (3.1) is affiliated with the von Neumann algebra $\pi(A)''$, $H_\pi - (H^F_{[n,m]} - E^F_{[n,m]})$ is affiliated with the von Neumann algebra $\pi(A_{(-\infty,n-1]\cup[m+1,\infty)})''$. Take a vector state $\psi_1$ in the positive energy representation $\{\pi, \mathcal{H}\}$ such that $\psi_1(H_\pi)$ is finite, i.e. we choose a vector giving rise to $\psi_1$ in the domain of $H_\pi$. Take a ground state $\omega_{[n+1,m-1]}$ for $H^F_{[n+1,m-1]}$ and set $\tilde\psi = (\psi_1)_{(-\infty,n]\cup[m,\infty)} \otimes \omega_{[n+1,m-1]}$. Then,
\[
\begin{aligned}
\psi_1\bigl(H_\pi - (H^F_{[n,m]} - E^F_{[n,m]})\bigr) &= \tilde\psi\bigl(H_\pi - (H^F_{[n,m]} - E^F_{[n,m]})\bigr) \\
&\ge -\tilde\psi\bigl(H^F_{[n,m]} - E^F_{[n,m]}\bigr) \\
&= -\tilde\psi\bigl(B^F_{[n,m]}\bigr) + \bigl(E^F_{[n,m]} - E^F_{[n+1,m-1]}\bigr),
\end{aligned}
\]
where
\[
B^F_{[n,m]} = H^F_{[n,m]} - H^F_{[n+1,m-1]},
\]
and we used
\[
\tilde\psi\bigl(H^F_{[n+1,m-1]} - E^F_{[n+1,m-1]}\bigr) = 0.
\]
As a result we obtain
\[
\psi_1(H_\pi) + \tilde\psi\bigl(B^F_{[n,m]}\bigr) - \bigl(E^F_{[n,m]} - E^F_{[n+1,m-1]}\bigr) \ge \psi_1\bigl(H^F_{[n,m]} - E^F_{[n,m]}\bigr). \tag{3.8}
\]
Due to the asymptotic behavior (2.4) of the ground state energy, the left-hand side of the above inequality (3.8) is bounded uniformly in $n$ and $m$.

Lemma 3.4. For $\psi_1$ satisfying (3.7), there exists a constant $K_2$ such that the following sum is finite:
\[
\sum_{j\in\mathbb{Z}} \psi_1(A_j^* A_j) < \infty. \tag{3.9}
\]
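Proof. In (3.7), we may replace the Hamiltonian with the one with the periodic boundary condition due to (2.4). So,
\[
0 \le \psi_1\bigl(H^P_{[-m,m]} - E^P_{[-m,m]}\bigr) \le K_3.
\]
Next set
\[
A_j[n,m] = \bigl(1 - \sigma_x^{(j)}\sigma_x^{(j+1)}\bigr)\exp\Bigl(\frac{1}{2}\sum_{|\{j,j+1\}\cap A|=1} J_{[n,m]}(A)\,\sigma_z(A)\Bigr).
\]
Then,
\[
H^P_{[-m,m]} - E^P_{[-m,m]} = \frac{1}{2}\sum_{j=-m}^{m}\Bigl(1 - \frac{1}{\Delta}\sigma_z^{(j)}\sigma_z^{(j+1)}\Bigr) A_j[-m,m]^* A_j[-m,m].
\]
Recall that
\[
0 < 1 - \frac{1}{\Delta} \le 1 - \frac{1}{\Delta}\sigma_z^{(j)}\sigma_z^{(j+1)} \le 1 + \frac{1}{\Delta}.
\]
As a consequence,
\[
0 \le \sum_{j=-n}^{n} \psi_1\bigl(A_j[-m,m]^* A_j[-m,m]\bigr) \le \frac{2\Delta K_3}{\Delta - 1} \tag{3.10}
\]
for any $n$ satisfying $0 < n < m$. In (3.10), we take the limit $m \to \infty$ and arrive at the following inequality:
\[
0 \le \sum_{j=-n}^{n} \psi_1(A_j^* A_j) \le \frac{2\Delta K_3}{\Delta - 1}.
\]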
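The operator $A_j[n,m]^* A_j[n,m]$ used in this proof can be expanded by hand: since $\sigma_x^{(j)}\sigma_x^{(j+1)}$ anticommutes with every $\sigma_z(A)$ for which $|\{j,j+1\}\cap A| = 1$, one finds $A_j^*A_j = 2(e^{S} - \sigma_x^{(j)}\sigma_x^{(j+1)})$ with $S = \sum J(A)\sigma_z(A)$, which is the form in which the regularized Hamiltonian is written in Sect. 4. The following is a minimal numerical sketch of that identity on a three-site chain. The coupling values, the chain length and all function names are assumptions made for illustration only; they are not taken from the paper.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])

def site_op(op, site, n):
    """Embed a one-site operator at position `site` of an n-site chain."""
    return reduce(np.kron, [op if k == site else I2 for k in range(n)])

def sigma_z(A, n):
    """Product of sigma_z over the sites in the finite set A."""
    return reduce(np.matmul, [site_op(Z, k, n) for k in sorted(A)], np.eye(2 ** n))

n, j = 3, 0
# Hypothetical couplings J(A), one for each set A meeting {j, j+1} in exactly one site.
J = {(0,): 0.3, (1,): -0.2, (0, 2): 0.1, (1, 2): 0.05}
S = sum(c * sigma_z(A, n) for A, c in J.items())   # diagonal in the sigma_z basis
XX = site_op(X, j, n) @ site_op(X, j + 1, n)

# S is diagonal, so its exponentials are taken entrywise on the diagonal.
exp_half_S = np.diag(np.exp(0.5 * np.diag(S)))
exp_S = np.diag(np.exp(np.diag(S)))

A_j = (np.eye(2 ** n) - XX) @ exp_half_S
lhs = A_j.T @ A_j                  # A_j is real, so its adjoint is the transpose
rhs = 2.0 * (exp_S - XX)
identity_holds = np.allclose(lhs, rhs)
```

The check relies only on $(1 - \sigma_x\sigma_x)^2 = 2(1 - \sigma_x\sigma_x)$ and on the anticommutation relation, so it does not depend on the particular couplings chosen.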
Lemma 3.5. For any positive energy representation $\{\pi, \mathcal{H}\}$ of $A$, there exist a vector state $\psi_0$ in $\mathcal{H}$ and a positive integer $N$ such that
\[
\psi_0(A_j^* A_j) = 0 \tag{3.11}
\]
if $|j| \ge N$, where $A_j$ is defined in (2.20).

Proof. Fix a small $\epsilon$ and consider a large integer $M$ such that
\[
\sum_{|j|>M} \psi_1(A_j^* A_j) < \epsilon.
\]
Set
\[
L_K = \sum_{M<|j|<K} A_j^* A_j.
\]
Let $\sigma$ be a classical spin configuration in $X$, and by $L_K(\sigma)$ we denote $L_K$ with the spin configuration in $[-M+2, M-1]$ pinned by $\sigma$. In other words, consider the pure product state $\omega^{(\sigma)}$ corresponding to the spin configuration $\sigma$, specified by
\[
\omega^{(\sigma)}(\sigma_z(A)\sigma_x(B)) = 0 \quad (B \ne \emptyset), \qquad \omega^{(\sigma)}(\sigma_z(A)) = \sigma(A).
\]
We take the partial state $\omega^{(\sigma)}_{[-M+2,M-1]}$ on $L_K$:
\[
L_K(\sigma) = \omega^{(\sigma)}_{[-M+2,M-1]}(L_K).
\]
Let $\gamma_K(\sigma)$ be the spectral gap (the gap between the first and the second eigenvalues) of $L_K(\sigma)$. It is possible to show that $\gamma_K(\sigma) \ge \gamma > 0$, where $\gamma$ is a positive constant independent of $K$ and $\sigma$. (For example, the existence of the spectral gap can be verified by identification of $L_K(\sigma)$ with a spin flip Markov generator on a finite classical spin system and
the technique of [10].) Let $P_K$ be the projection to the ground state of $L_K$ (defined on the GNS space of the state $\psi_0$). Obviously,
\[
P_K\,\pi(L_K) = 0, \qquad P_K\,\pi(A_j^* A_j) = 0 \quad (M < |j| < K),
\]
and
\[
\pi(L_K) = (1 - P_K)\,\pi(L_K) \ge \gamma\,(1 - P_K).
\]
Thus
\[
\gamma\,\psi_1(1 - P_K) \le \psi_1(L_K) \le \epsilon.
\]
As we can take arbitrarily small $\epsilon$,
\[
0 < 1 - \frac{\epsilon}{\gamma} \le \psi_1(P_K).
\]
Projections $P_K$ are decreasing, $P_{K+1} \le P_K$, and we obtain the following non-trivial limit in the strong operator topology on the GNS space of $\psi_1$:
\[
\lim_{K\to\infty} P_K = P, \qquad \psi_1(P) \ne 0.
\]
Finally the state $\psi_0$ determined by
\[
\psi_0(Q) = \frac{\psi_1(PQP)}{\psi_1(P)}
\]
satisfies (3.11).
Lemma 3.6. Suppose that a positive energy representation $\{\pi, \mathcal{H}\}$ is a factor and let $\{\pi_0, \Omega_0, \mathcal{H}_0\}$ be the GNS triple for a vector state $\psi_0$ satisfying (3.11) if $|j| \ge N$. Then, when $|j| \ge N$,
\[
\pi_0(\sigma_x^{(j)})\,\Omega_0 = c_j \exp\Bigl(\sum_{j\in A} J(A)\,\sigma_z(A)\Bigr)\Omega_0, \tag{3.12}
\]
where $c_j$ is a constant satisfying
\[
c_N = c_j \quad (N < j), \qquad c_{-N} = c_j \quad (j < -N), \qquad c_j = \pm 1. \tag{3.13}
\]
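Proof. We only consider positive $j$ in (3.13). As the representation is a factor, we have a sequence of positive integers $j(k)$ and a real constant $c$ such that
\[
\mathrm{w}\text{-}\lim_{k\to\infty} \pi(\sigma_x^{(j(k))}) = c\,1.
\]
Due to (3.11),
\[
\pi(\sigma_x^{(j)}\sigma_x^{(j(k))})\,\Omega_0 = \pi\Bigl(\exp\Bigl(\sum_{|\{j,j(k)\}\cap A|=1} J(A)\,\sigma_z(A)\Bigr)\Bigr)\Omega_0.
\]
By taking $k$ to infinity, we obtain
\[
\pi_0(\sigma_x^{(j)})\,\Omega_0 = c\,\exp\Bigl(\sum_{j\in A} J(A)\,\sigma_z(A)\Bigr)\Omega_0.
\]
As Pauli spin matrices are selfadjoint unitaries, $(\sigma_x^{(j)})^2 = 1$, and
\[
\sigma_x^{(j)} \exp\Bigl(\sum_{j\in A} J(A)\,\sigma_z(A)\Bigr) = \exp\Bigl(-\sum_{j\in A} J(A)\,\sigma_z(A)\Bigr)\sigma_x^{(j)},
\]
we obtain $c^2 = 1$.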
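The final step of this proof can be written out in one line. The derivation below is only a sketch: $\Omega_0$ denotes the cyclic vector of the GNS triple of Lemma 3.6, and the shorthand $S_j = \sum_{j\in A} J(A)\sigma_z(A)$ is introduced here for readability and does not appear in the original text. It uses nothing beyond the two facts quoted in the proof, namely $(\sigma_x^{(j)})^2 = 1$ and $\sigma_x^{(j)} e^{S_j} = e^{-S_j}\sigma_x^{(j)}$.

```latex
\begin{align*}
\Omega_0 &= \pi_0(\sigma_x^{(j)})^2\,\Omega_0
          = c\,\pi_0(\sigma_x^{(j)})\,\pi_0(e^{S_j})\,\Omega_0 \\
         &= c\,\pi_0(e^{-S_j})\,\pi_0(\sigma_x^{(j)})\,\Omega_0
          = c^2\,\pi_0(e^{-S_j})\,\pi_0(e^{S_j})\,\Omega_0
          = c^2\,\Omega_0 .
\end{align*}
```

Since $\Omega_0 \ne 0$, this forces $c^2 = 1$, and $c = \pm 1$ because $c$ is real.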
Proof of Theorem 3.2. We consider the case when the constants $c_j$ in (3.13) satisfy
\[
c_j = -1 \quad (j < -N), \qquad c_j = 1 \quad (N < j).
\]
We show that the GNS representation of the state $\varphi_{-+}$ in (3.5) is quasi-equivalent to that of $\psi_0$ satisfying (3.13). For our purpose, it suffices to show that $\varphi_{-+}$ and $\psi_0$ restricted to the complement $[-N,N]^c$ of $[-N,N]$ ($[-N,N]^c = (-\infty,-N-1] \cup [N+1,\infty)$) are quasi-equivalent. Let $\sigma$ be a configuration in $X_{[-N,N]}$ and let $P(\sigma)$ be the projection to the configuration $\sigma$,
\[
P(\sigma) = \prod_{j=-N}^{N} \frac{1}{2}\bigl(1 + \sigma^{(j)}\sigma_z^{(j)}\bigr).
\]
Choose a configuration $\sigma$ satisfying $\psi_0(P(\sigma)) \ne 0$ and set
\[
\psi(Q) = \frac{\psi_0(QP(\sigma))}{\psi_0(P(\sigma))}, \qquad Q \in A_{[-N,N]^c}.
\]
We also introduce a state $\varphi^{(\sigma)}_{-+}$ of $A_{[-N,N]^c}$,
\[
\varphi^{(\sigma)}_{-+}(Q) = \frac{\varphi_{-+}(QP(\sigma))}{\varphi_{-+}(P(\sigma))}.
\]
Note that the Gibbs measure is faithful and $\varphi_{-+}(P(\sigma))$ does not vanish. We claim that
\[
\psi(Q) = \varphi^{(\sigma)}_{-+}(Q) \quad \text{for } Q \text{ in } A_{[-N,N]^c}. \tag{3.14}
\]
First consider the GNS representation of $A_{[-N,N]^c}$ associated with $\psi$, which we denote by $\{\pi(A_{[-N,N]^c}), \Omega, \mathcal{H}_{[-N,N]^c}\}$. Due to (3.13), $\Omega$ is cyclic for the abelian algebra $B_{[-N,N]^c} = C(X_{[-N,N]^c})$ and
\[
\psi(\sigma_z(B)\sigma_x(A)) = (-1)^{|A\cap(-\infty,-N-1]|}\,\psi\Bigl(\sigma_z(B)\exp\Bigl(\sum_{|A\cap C|:\,\mathrm{odd}} J_C\,\sigma_z(C)\Bigr)\Bigr), \tag{3.15}
\]
where $A$ and $B$ are finite subsets of $[-N,N]^c$. We have the same formula for $\varphi^{(\sigma)}_{-+}$:
\[
\varphi^{(\sigma)}_{-+}(\sigma_z(B)\sigma_x(A)) = (-1)^{|A\cap(-\infty,-N-1]|}\,\varphi^{(\sigma)}_{-+}\Bigl(\sigma_z(B)\exp\Bigl(\sum_{|A\cap C|:\,\mathrm{odd}} J_C\,\sigma_z(C)\Bigr)\Bigr). \tag{3.16}
\]
Equations (3.15) and (3.16) tell us that the states $\psi$ and $\varphi^{(\sigma)}_{-+}$ are determined by probability measures $\nu_1$ and $\nu_2$ on $X_{[-N,N]^c}$ obtained via restriction of these states to $B_{[-N,N]^c}$. Let us recall that a measure $d\nu$ on $X_{[-N,N]^c}$ is a Gibbs measure for an interaction of our classical spin system $h(\sigma) = \sum J_A\,\sigma(A)$ (with the spin configuration inside $[-N,N]$ pinned by $\sigma$) if and only if
\[
\frac{d\nu(\sigma_j)}{d\nu(\sigma)} = \exp\Bigl(\sum_{j\in C} J_C\,\sigma_z(C)\Bigr)(\sigma). \tag{3.17}
\]
Moreover the Gibbs measure on the one-dimensional lattice is unique. Due to (3.13), the probability measures $\nu_1$ and $\nu_2$ satisfy (3.17) for the same classical interaction and we conclude that $\nu_1$ and $\nu_2$ are identical. Thus we obtain (3.14).
By the same reasoning, we also have the following results.

Theorem 3.7. For any positive energy representation $\{\pi, \mathcal{H}\}$ for $H$ there exist a unit vector $\xi$ in $\mathcal{H}$ and a positive integer $N$ such that
\[
\Bigl(\xi, \Bigl\{\pi\Bigl(\exp\Bigl(\sum_{|\{j,j+1\}\cap A|=1} J_A\,\sigma_z(A)\Bigr)\Bigr) - \pi\bigl(\sigma_x^{(j)}\sigma_x^{(j+1)}\bigr)\Bigr\}\xi\Bigr) = 0 \tag{3.18}
\]
for any $j$ with $|j| \ge N$.

Obviously, the converse is true, namely, if there exists a unit cyclic vector $\xi$ in a representation $\{\pi, \mathcal{H}\}$ of $A$ which satisfies (3.18), the representation is of positive energy.

In Sect. 4, we consider a quantum Ising model for our proof of the absence of non-periodic ground states of the XXZ model. The following result can be proved in the same way as Theorem 3.2. Consider the following Hamiltonian $H_{QI}$:
\[
H_{QI} = -\sum_{j\in\mathbb{Z}} \Bigl\{\bigl(1 - \delta\,\sigma_z^{(j-1)}\sigma_z^{(j+1)}\bigr)\sigma_x^{(j)} - \delta\,\sigma_z^{(j-1)}\sigma_z^{(j+1)}\Bigr\}. \tag{3.19}
\]
Theorem 3.8. There exists a positive constant $\delta_0$ such that for $|\delta| < \delta_0$ the quantum Ising model $H_{QI}$ of (3.19) has a unique ground state $\varphi_{QI}$. Any positive energy representation for $H_{QI}$ is quasi-equivalent to the ground state representation associated with $\varphi_{QI}$.

As is the case for the XXZ Hamiltonian, we can rewrite the Hamiltonian in a regularized form,
\[
H^{reg}_{QI} = \sum_{j\in\mathbb{Z}} \bigl(1 - \delta\,\sigma_z^{(j-1)}\sigma_z^{(j+1)}\bigr)\bigl(c_j(\sigma_z) - \sigma_x^{(j)}\bigr), \tag{3.20}
\]
where $c_j(\sigma)$ is a positive function on $X$ satisfying $\tau_k(c_j(\sigma)) = c_{j+k}(\sigma)$ and
\[
|\log c_j|_{\delta_1} < \infty, \qquad |c_j|_{\delta_1} < \infty \tag{3.21}
\]
for positive $\delta_1$. There exists a translationally invariant short range classical interaction
\[
h_{QI} = \sum_{A\subset\mathbb{Z}} J_A\,\sigma(A)
\]
such that
\[
\log c_j = \sum_{j\in A} J_A\,\sigma(A) = \delta\bigl(\sigma_z^{(j-2)}\sigma_z^{(j)} + \sigma_z^{(j)}\sigma_z^{(j+2)}\bigr) + d_j(\sigma). \tag{3.22}
\]
Equation (3.22) is equivalent to the detailed balance condition. There exists a constant $C$ such that
\[
|d_j(\sigma)|_{\delta_1} \le C\delta^2. \tag{3.23}
\]
The unique ground state $\varphi_{QI}$ of $H_{QI}$ is obtained from the unique Gibbs measure $\mu$ for $h_{QI}$ in the sense that $\varphi_{QI} = \varphi_\mu$. $\varphi_{QI}$ is characterized by
\[
\varphi_{QI}\bigl(c_j(\sigma_z) - \sigma_x^{(j)}\bigr) = \varphi_{QI}\bigl(c_j(\sigma_z)^{1/2}(1 - \sigma_x^{(j)})\,c_j(\sigma_z)^{1/2}\bigr) = 0
\]
for any $j$.
4. Duality and Minlos Map

Results of the previous sections reveal that a non-periodic ground state must be a vector state in a soliton sector and the infimum of the spectrum of the Hamiltonian is a point spectrum. We show this is not the case for our antiferromagnetic XXZ model with large $\Delta$. For this purpose, we use Kramers-Wannier duality of Ising models and the Minlos map (cf. [15]). In our context, the Kramers-Wannier duality gives rise to a relationship between the XXZ model (with large Ising-like anisotropy) and a (weakly coupled) stochastic Ising model. The Minlos map gives rise to an identification of low energy states with $l^2(\mathbb{Z})$, and the Hamiltonian restricted to this space is unitarily equivalent to a convolution operator. This fact tells us that the low lying spectrum is absolutely continuous unless the dispersion relation is a constant function of momentum.

We start by recalling the Kramers-Wannier duality. Let $\Theta$ be the automorphism of $A$ determined by the following equation:
\[
\Theta(\sigma_x^{(j)}) = -\sigma_x^{(j)}, \qquad \Theta(\sigma_y^{(j)}) = -\sigma_y^{(j)}, \qquad \Theta(\sigma_z^{(j)}) = \sigma_z^{(j)} \tag{4.1}
\]
for any $j$. Let $A_+$ (resp. $A_-$) be the even (resp. odd) part of $A$,
\[
A_\pm = \{Q \in A \mid \Theta(Q) = \pm Q\}.
\]
It is easy to see that $A_+$ is generated by $\sigma_x^{(j)}\sigma_x^{(j+1)}$ and $\sigma_z^{(j)}$. Then we introduce an automorphism $\tau_{1/2}$ of $A_+$ via the following equation:
\[
\tau_{1/2}(\sigma_z^{(j)}) = \sigma_x^{(j)}\sigma_x^{(j+1)}, \qquad \tau_{1/2}(\sigma_x^{(j)}\sigma_x^{(j+1)}) = \sigma_z^{(j+1)}. \tag{4.2}
\]
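The map (4.2) is consistent as an algebra map because it sends the generators of $A_+$ to operators obeying the same commutation and anticommutation relations. The following is a minimal finite-chain sketch of that consistency check, assuming a four-site open chain as a stand-in for the infinite lattice; the chain length and the helper names are illustrative assumptions, not notation from the paper.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])

def site_op(op, site, n):
    """Embed a one-site operator at position `site` of an n-site chain."""
    return reduce(np.kron, [op if k == site else I2 for k in range(n)])

n = 4  # small open chain, used only as a finite stand-in for the lattice
Xs = [site_op(X, k, n) for k in range(n)]
Zs = [site_op(Z, k, n) for k in range(n)]

def bond(j):
    """Generator sigma_x^(j) sigma_x^(j+1) of the even algebra."""
    return Xs[j] @ Xs[j + 1]

def anticommute(a, b):
    return np.allclose(a @ b, -(b @ a))

# (4.2) sends sigma_z^(j) to bond(j) and bond(k) to sigma_z^(k+1).
# Products of Pauli matrices either commute or anticommute, so it suffices
# to compare the anticommutation pattern before and after the map.
relations_preserved = True
for j in range(n - 1):
    for k in range(n - 1):
        before = anticommute(Zs[j], bond(k))
        after = anticommute(bond(j), Zs[k + 1])
        relations_preserved = relations_preserved and (before == after)
```

In both cases the pair anticommutes exactly when $j \in \{k, k+1\}$, which is why the pattern is unchanged.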
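It is known that the automorphism $\tau_{1/2}$ is not extendible to $A$ and $\tau_{1/2}^2 = \tau_1$.

In the previous section, we considered the regularized Hamiltonian
\[
H_{reg} = \sum_{j\in\mathbb{Z}} \Bigl(1 - \frac{1}{\Delta}\sigma_z^{(j)}\sigma_z^{(j+1)}\Bigr)\Bigl(\exp\Bigl(\sum_{|\{j,j+1\}\cap A|=1} J_A\,\sigma_z(A)\Bigr) - \sigma_x^{(j)}\sigma_x^{(j+1)}\Bigr)
\]
which generates the time evolution $\alpha_t$. As $\alpha_t$ commutes with $\Theta$, $\alpha_t \circ \Theta = \Theta \circ \alpha_t$, we can restrict it to $A_+$. Any ground state $\omega$ for $\{A, \alpha_t\}$ gives rise to a ground state for $\{A_+, \alpha_t\}$. Let $\omega_+$ be the restriction of $\omega$ to $A_+$. Then the state $\tilde\omega_+$ defined by $\tilde\omega_+ = \omega_+ \circ \tau_{1/2}$ is a ground state for $\{A_+, \tilde\alpha_t\}$, where $\tilde\alpha_t$ is defined by
\[
\tilde\alpha_t = \tau_{1/2}^{-1} \circ \alpha_t \circ \tau_{1/2}.
\]
The time evolution $\tilde\alpha_t$ is governed by the following Hamiltonian $\tilde H$:
\[
\tilde H = \sum_{j\in\mathbb{Z}} \bigl(1 - \delta\,\sigma_x^{(j-1)}\sigma_x^{(j+1)}\bigr)\Bigl(\exp\Bigl(\sum_{|\{j,j+1\}\cap A|=1} J_A\,\tau_{1/2}(\sigma_z(A))\Bigr) - \sigma_z^{(j)}\Bigr). \tag{4.3}
\]
This Hamiltonian (4.3) is that of a quantum Ising model discussed in the previous section. To identify the Hamiltonians of (4.3) and (3.19) we introduce the automorphism $\beta$ determined by
\[
\beta(\sigma_x^{(j)}) = \sigma_z^{(j)}, \qquad \beta(\sigma_z^{(j)}) = -\sigma_x^{(j)}.
\]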
We set
\[
\Theta_1 = \beta\,\Theta\,\beta^{-1}.
\]
Due to Theorem 3.8, the ground state for the Hamiltonian $H_{QI}$ is unique and it is invariant under $\Theta_1$ and the lattice translation $\tau_1$.

Proposition 4.1. Let $\varphi_{QI}$ be the unique infinite volume ground state of $H_{QI}$. Let $\{\pi_{QI}(A), \Omega_{QI}, \mathcal{H}_{QI}\}$ be the GNS triple for $\varphi_{QI}$ and $L$ be the positive operator on $\mathcal{H}_{QI}$ which implements the time evolution of the quantum Ising model $H_{QI}$, specified uniquely by the following equation:
\[
L\,\Omega_{QI} = 0, \qquad e^{itL}\,\pi_{QI}(Q)\,e^{-itL} = \pi_{QI}\bigl(e^{itH_{QI}} Q\,e^{-itH_{QI}}\bigr). \tag{4.4}
\]
(i) Consider the subspaces $\mathcal{H}_{QI}(\pm)$ of $\mathcal{H}_{QI}$ determined by
\[
\mathcal{H}_{QI}(+) = \mathrm{c.l.h.}\,\{\sigma_z(A)\,\Omega_{QI} \mid |A| \text{ even}\}, \qquad \mathcal{H}_{QI}(-) = \mathrm{c.l.h.}\,\{\sigma_z(A)\,\Omega_{QI} \mid |A| \text{ odd}\},
\]
where c.l.h. is the abbreviation of "closed linear hull". $\mathcal{H}_{QI}(\pm)$ are invariant under the actions of $L$ and $\pi_{QI}(\beta(A_+))$. The representation $\pi_{QI}(\beta(A_+))$ restricted to $\mathcal{H}_{QI}(\pm)$ is irreducible.

(ii) If a non translationally invariant ground state for the XXZ model $H$ exists, the infimum of the spectrum of $L$ on $\mathcal{H}_{QI}(-)$ must be a point spectrum.

Lemma 4.2. The positive energy representations $\pi_\pm$, $\pi_{-+}$ and $\pi_{+-}$ restricted to $A_+$ are irreducible in $L^2(X, d\mu)$.

Proof. Take $\pi = \pi_{-+}$ for example. The vacuum vector $\Omega$ is cyclic and separating for $\pi(B)$. The commutant of $\pi(A_+)$ is contained in $\pi(B)''$. The shift $\tau$ is an approximately inner automorphism of $A_+$, so any element $z$ in the commutant of $\pi(A_+)$ is shift invariant. As the Gibbs measure is ergodic, $z$ is trivial.

Proof of Proposition 4.1(i). Due to Lemma 4.2, the ground state $\varphi^{(+)}$ restricted to $A_+$ is pure. Recall $\varphi_{QI} = \varphi^{(+)} \circ \tau_{1/2} \circ \beta$ on $\beta(A_+)$. Thus $\varphi_{QI}$ restricted to $\beta(A_+)$ is pure. The state $\varphi_{QI} \circ \mathrm{Ad}(\sigma_z^{(0)})$ is a vector state in $\mathcal{H}_{QI}(-)$, so $\pi_{QI}(\beta(A_+))$ restricted to $\mathcal{H}_{QI}(-)$ is irreducible.

Proof of Proposition 4.1(ii). Suppose that $\psi_0$ is a non-translationally invariant ground state for the XXZ model. By Theorem 3.7, there exists a state $\psi_1$ quasi-equivalent to $\psi_0$ satisfying (3.18). Now consider the restriction of $\psi_1$ to $A_+$ and apply $\tau_{1/2} \circ \beta$. Set $\psi_2(Q) = \psi_1 \circ \tau_{1/2} \circ \beta\,(Q + \Theta_1(Q))$. Then the invariant state $\psi_2$ of $A$ satisfies the following equation:
\[
\psi_2\Bigl(\exp\Bigl(\sum_{|\{j,j+1\}\cap A|=1} J_A\,\beta\circ\tau_{1/2}(\sigma_z(A))\Bigr) - \sigma_x^{(j)}\Bigr) = 0 \tag{4.5}
\]
for $j$ with $|j| \ge N + 1$. Equation (4.5) tells us that $\psi_2$ gives rise to a positive energy representation for our quantum Ising model. $\psi_2$ is quasi-equivalent to the ground state representation due to Theorem 3.8. As $\psi_0$ is equivalent to $\psi_1$, the invariant state $\psi_4$ of $A$ defined by
\[
\psi_4(Q) = \psi_0 \circ \beta \circ \tau_{1/2}\,(Q + \Theta_1(Q))
\]
is a vector state of $A_+$ associated with a vector $\xi_0$ in the ground state representation of the quantum Ising model $H_{QI}$. Due to $\Theta_1$ invariance, $\psi([H_{QI}, Q_-]) = 0$ for any $\Theta_1$-odd element $Q_-$. The state $\psi_4$ restricted to the $\Theta_1$-even part of $A$ is invariant under time evolution. Thus $\psi([H_{QI}, Q]) = 0$ for any $Q$ in $A$. As a consequence, we obtain a vector $\xi$ in $\mathcal{H}_{QI}$ such that its vector state is $\psi$. Note the following facts.

(i) Any $\Theta_1$ invariant vector state is implemented by a vector either in the even sector $\mathcal{H}_{QI}(+)$ or in the odd sector $\mathcal{H}_{QI}(-)$.

(ii) As $\psi$ is invariant under time evolution, $\xi$ is an eigenvector of $L$ with an eigenvalue $e$.

Now suppose that $\xi$ is in the odd sector $\mathcal{H}_{QI}(-)$. As $\psi_4$ restricted to the $\Theta_1$-even part of $A$ is a ground state (for the $\Theta_1$-even part) by definition, the infimum of the spectrum of $L$ restricted to the odd sector $\mathcal{H}_{QI}(-)$ must be a point spectrum. On the other hand, suppose that $\xi$ is in the even sector $\mathcal{H}_{QI}(+)$. By the same reasoning as above, the infimum of the spectrum of $L$ restricted to the even sector $\mathcal{H}_{QI}(+)$ must be a point spectrum. As we know that the ground state vector $\Omega$ for $L$ belongs to $\mathcal{H}_{QI}(+)$, we see $\Omega = \xi$. This equation implies that the state $\psi$ is the unique translationally invariant ground state of $H_{QI}$ and the state $\psi_0$ is a translationally invariant ground state of the XXZ model. This leads to a contradiction, so $\xi$ is in the odd sector $\mathcal{H}_{QI}(-)$ and the infimum of the spectrum of $L$ in the odd sector $\mathcal{H}_{QI}(-)$ is a point spectrum.

The following is essentially due to R. Minlos (cf. [15]). This result and the previous proposition complete our proof of Theorem 1.2.

Theorem 4.3. Consider the ground state representation (the GNS triple for the unique ground state) $\{\pi_{QI}(A), \Omega, \mathcal{H}_{QI}\}$ for $H_{QI}$, where $\pi_{QI}$ is the representation of $A$ and $\Omega$ is the GNS cyclic vector in the GNS Hilbert space $\mathcal{H}_{QI}$. Let $U$ be the shift on $\mathcal{H}_{QI}$, i.e. $U$ is a unitary acting on $\mathcal{H}_{QI}$ satisfying
\[
U\,\pi_{QI}(Q)\,U^* = \pi_{QI}(\tau_1(Q)), \qquad U\Omega = \Omega
\]
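for any $Q$ in $A$.

(i) There exist an $L$ and $U$ invariant subspace $\mathcal{H}_1$ and positive constants $d_1$, $d_2$, $d_3$ ($0 < d_1 < d_2 < d_3$) such that $\mathcal{H}_1 \subset \mathcal{H}_{QI}(-)$, and
\[
\mathrm{spec}(L|_{\mathcal{H}_1}) \subset [d_1, d_2], \qquad \mathrm{spec}(L|_{\mathcal{H}_1^\perp}) \subset \{0\} \cup [d_3, \infty). \tag{4.6}
\]
Here the spectrum of $L$ restricted to $\mathcal{H}_1$ is denoted by $\mathrm{spec}(L|_{\mathcal{H}_1})$ and that restricted to the orthogonal complement $\mathcal{H}_1^\perp$ of $\mathcal{H}_1$ is denoted by $\mathrm{spec}(L|_{\mathcal{H}_1^\perp})$.

(ii) There exists a unitary $W$ from $\mathcal{H}_1$ to $L^2([-\pi,\pi], dk)$ such that the shifts $W U^n W^*$ act as multiplication by $e^{ink}$.

(iii) $W L|_{\mathcal{H}_1} W^*$ is a multiplication operator by a function $\hat m(k)$ of the following form:
\[
\hat m(k) = 2 + 4\delta\cos(2k) + O(\delta^2). \tag{4.7}
\]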
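The leading part of (4.7) is the Fourier symbol of the convolution kernel $2\delta_{n,0} + 2\delta(\delta_{n,2} + \delta_{n,-2})$ that appears later in (4.17). As a rough numerical sketch, one can truncate the lattice to a ring of $N$ sites, where the convolution operator becomes a circulant matrix whose eigenvalues sample the symbol on the grid $k = 2\pi p/N$. The ring size, the value of $\delta$ and the function names below are assumptions chosen for illustration, and the $O(\delta^2)$ correction is ignored.

```python
import numpy as np

def symbol(k, delta):
    """Leading-order dispersion 2 + 4*delta*cos(2k), without the O(delta^2) term."""
    return 2.0 + 4.0 * delta * np.cos(2.0 * k)

def circulant_from_kernel(N, delta):
    """Periodic truncation of the kernel 2*d_{n,0} + 2*delta*(d_{n,2} + d_{n,-2})."""
    M = np.zeros((N, N))
    for i in range(N):
        M[i, i] = 2.0
        M[i, (i + 2) % N] += 2.0 * delta
        M[i, (i - 2) % N] += 2.0 * delta
    return M

N, delta = 64, 0.05   # hypothetical ring size and coupling
eigs = np.sort(np.linalg.eigvalsh(circulant_from_kernel(N, delta)))
k_grid = 2.0 * np.pi * np.arange(N) / N
predicted = np.sort(symbol(k_grid, delta))
bandwidth = eigs.max() - eigs.min()
```

For $\delta \neq 0$ the symbol is not constant, so the band $[2 - 4\delta, 2 + 4\delta]$ has positive width; this non-constancy is the property that rules out a point spectrum at the bottom of the band.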
We call $W$ the Minlos map. $L$ commutes with the shifts $U^k$ and any operator on $L^2([-\pi,\pi], dk)$ commuting with $e^{ink}$ is a multiplication operator. Thus the first statement of (iii) follows from (ii). Equation (4.7) shows that the infimum of the spectrum of $L$ on $\mathcal{H}_{QI}(-)$ is not a point spectrum.

We sketch our proof of the above theorem. The only difference between [15] and our case is that R. Minlos considered finite range interactions while we have to consider an (exponentially decreasing) infinite range interaction. This appearance of an infinite range interaction results in lengthy estimates. We use the following fact for our model: for any $\delta_1$, we can find $\delta_0$ such that if $|\delta| < \delta_0$,
\[
\sum_{j\in A} |J_A|\,e^{\delta_1 d(A)} < \infty.
\]
In other words, the decay exponent of the interaction can be arbitrarily large if we choose $\delta$ sufficiently small.

Recall that in the weakly coupled quantum Ising model the regularized Hamiltonian $H^{reg}_{QI}$ of (3.20) is unitarily equivalent to the generator of a stochastic Ising model. See [8] and [11]. In what follows, by $L$ we mean
\[
Lf(\sigma) = \sum_j \bigl(1 - \delta\,\sigma^{(j-1)}\sigma^{(j+1)}\bigr)\,c_j(\sigma)\,\bigl(f(\sigma) - f(\sigma_j)\bigr) \tag{4.8}
\]
for $f$ with finite $|f|_{\delta_1}$. ($|f|_{\delta_1}$ is defined in (2.1).) The operator $L$ is positive and essentially selfadjoint on the set of functions satisfying $|f|_{\delta_1} < \infty$. We decompose the operator $L$ as $L = L(1) + L(2)$, where
\[
L(1)f(\sigma) = \sum_{j\in\mathbb{Z}} \bigl(1 - \delta\,\sigma^{(j-1)}\sigma^{(j+1)}\bigr)\bigl(1 + \delta(\sigma^{(j-2)}\sigma^{(j)} + \sigma^{(j)}\sigma^{(j+2)})\bigr)\bigl(f(\sigma) - f(\sigma_j)\bigr), \tag{4.9}
\]
\[
L(2)f(\sigma) = \sum_{j\in\mathbb{Z}} \bigl(1 - \delta\,\sigma^{(j-1)}\sigma^{(j+1)}\bigr)\bigl\{c_j(\sigma) - \bigl(1 + \delta(\sigma^{(j-2)}\sigma^{(j)} + \sigma^{(j)}\sigma^{(j+2)})\bigr)\bigr\}\bigl(f(\sigma) - f(\sigma_j)\bigr). \tag{4.10}
\]
$L(2)$ contains the terms of $O(\delta^2)$ only, because the result of the perturbation theory of [8] shows
\[
\bigl|c_j(\sigma) - \{1 + \delta(\sigma^{(j-2)}\sigma^{(j)} + \sigma^{(j)}\sigma^{(j+2)})\}\bigr|_{\delta_1} \le C\delta^2
\]
for small $\delta$. Let $\mathcal{H}$ be the set of square integrable functions with respect to the Gibbs measure $\mu$, $\mathcal{H} = L^2(X, d\mu)$. Consider $M$ equipped with the norm $|f|$:
\[
M = \{f \in C(X) \mid |f| < \infty\}.
\]
Obviously
\[
\|f\|_{\mathcal{H}} \le |f|, \qquad |fg| \le |f|\,|g|
\]
for $f$ and $g$ in $M$. For any bounded linear operator $B$ acting on $M$ we denote the norm of $B$ by $\|B\|_M$. R. Minlos has shown that $\|B\|_{\mathcal{H}} \le \|B\|_M$ if $B$ is a bounded selfadjoint
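operator on $\mathcal{H}$. The set of finite polynomials in $C(X)$ is denoted by $\tilde M$, so $\tilde M = B \cap A_{loc}$. We introduce subspaces $M_1$ and $M_1^\perp$ via the following equations:
\[
M_1 = \Bigl\{f \in \tilde M \Bigm| f = \sum_{j\in\mathbb{Z}} f_j\,\sigma^{(j)}\Bigr\}, \qquad M_1^\perp = \Bigl\{f \in M \Bigm| f = \sum_{|A|>1} f_A\,\sigma(A)\Bigr\}. \tag{4.11}
\]
By definition,
\[
M = M_1 + M_1^\perp.
\]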
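This decomposition of $M$ gives rise to a matrix representation of $L$ regarded as an unbounded operator from $\tilde M$ to $M$:
\[
L = \begin{pmatrix} L_{11} & L_{12} \\ L_{21} & L_{22} \end{pmatrix}, \tag{4.12}
\]
where $L_{11} : M_1 \to M_1$, $L_{12} : M_1 \to M_1^\perp$ and so on. We use the same notation $B_{ij}$ ($i, j = 1, 2$) for any linear operator $B$ from $\tilde M$ to $M$.

Lemma 4.4. Set
\[
L(0)f(\sigma) = \sum_j \bigl(f(\sigma) - f(\sigma_j)\bigr), \qquad L' = L - L(0).
\]
Then, there exists a constant $C_1$ such that
\[
|L'\sigma(A)| \le C_1\,\delta\,|A|.
\]
As a consequence, for any $f$ in $M_1^\perp$,
\[
|(L')_{22} f| \le C_1\,\delta\,|L(0)_{22} f|.
\]

Due to this relative boundedness, $L_{22}$ is invertible when $\delta$ is sufficiently small. $L_{11}$ is invertible as well. In fact, by the Neumann series,
\[
\frac{1}{L_{22}} = \frac{1}{L(0)_{22}} - \frac{1}{L(0)_{22}}(L')_{22}\frac{1}{L(0)_{22}} + \frac{1}{L(0)_{22}}(L')_{22}\frac{1}{L(0)_{22}}(L')_{22}\frac{1}{L(0)_{22}} + \dots,
\]
we have
\[
\bigl\|L_{22}^{-1}\bigr\|_M \le \frac{1}{4(1 - C_1\delta)}.
\]
It is easy to verify
\[
\|L_{12}\|_M \le C_2\,\delta,
\]
where we used
\[
|B_{12} f| \le |Bf|_M, \qquad f \in M_1^\perp,
\]
and for any $i, j = 1, 2$,
\[
\|B_{ij}\|_M \le \|B\|_M.
\]
The next task is to construct the $L$ and shift invariant subspaces $\mathcal{L}_1$ and $\mathcal{L}_1^\perp$ which are perturbations of $M_1$ and $M_1^\perp$. The idea of R. Minlos is to construct $\mathcal{L}_1$ and $\mathcal{L}_1^\perp$ as graphs of operators $S$ and $T$:
\[
S : M_1 \to M_1^\perp, \qquad T : M_1^\perp \to M_1,
\]
\[
\mathcal{L}_1 = \{u + Su \mid u \in M_1\}, \qquad \mathcal{L}_1^\perp = \{u + Tu \mid u \in M_1^\perp\}.
\]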
The invariance condition of $\mathcal{L}_1$ and $\mathcal{L}_1^\perp$ leads to
\[
\begin{aligned}
S &= (L_{22})^{-1} S L_{11} + (L_{22})^{-1} S L_{12} S - (L_{22})^{-1} L_{21}, \\
T &= L_{11} T (L_{22})^{-1} + L_{21}(L_{22})^{-1} - T L_{12} T (L_{22})^{-1}.
\end{aligned}
\tag{4.13}
\]

Lemma 4.5. If $\delta$ is sufficiently small, there exist operators $(S, T)$ satisfying (4.13) such that the subspaces $\{u + Su \mid u \in M_1\}$ and $\{u + Tu \mid u \in M_1^\perp\}$ are shift and $L$ invariant.

Existence of operators satisfying (4.13) can be proved by the fixed point theorem. What is crucial for our purpose is the estimate of the norm of $S$. There exists a constant $C_3$ such that
\[
\|S\|_M \le C_3\,\delta, \qquad \|T\|_M \le C_3\,\delta. \tag{4.14}
\]
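The fixed point argument for the first equation of (4.13) can be illustrated in finite dimensions. The sketch below iterates $S \mapsto (L_{22})^{-1}(S L_{11} + S L_{12} S - L_{21})$ on small matrices and then checks that the graph $\{u + Su\}$ is invariant. Everything concrete in it is an assumption made for illustration: the matrix sizes, the numerical entries (diagonal blocks near 2 and near 4 or above, off-diagonal blocks of size $\delta$), and the block convention in which $L_{12}$ carries the second component into the first, which is the convention under which (4.13) expresses invariance for a block matrix acting on column vectors.

```python
import numpy as np

def solve_graph_operator(L11, L12, L21, L22, iterations=200):
    """Fixed point iteration S <- L22^{-1} (S L11 + S L12 S - L21), starting from S = 0."""
    L22_inv = np.linalg.inv(L22)
    S = np.zeros_like(L21)
    for _ in range(iterations):
        S = L22_inv @ (S @ L11 + S @ L12 @ S - L21)
    return S

delta = 0.05  # hypothetical small coupling
L11 = np.array([[2.0, delta], [delta, 2.0]])                  # block near 2
L22 = np.array([[4.0, delta, 0.0],
                [delta, 4.0, delta],
                [0.0, delta, 6.0]])                           # block bounded below near 4
L12 = delta * np.ones((2, 3))
L21 = delta * np.ones((3, 2))

S = solve_graph_operator(L11, L12, L21, L22)

# Invariance of the graph {(u, S u)}: applying the block matrix to (u, S u)
# gives (v, L21 u + L22 S u) with v = (L11 + L12 S) u, which lies in the graph
# exactly when S (L11 + L12 S) = L21 + L22 S.
residual = np.linalg.norm(S @ (L11 + L12 @ S) - (L21 + L22 @ S))
norm_S = np.linalg.norm(S, 2)
```

The iteration contracts here because the norm of $(L_{22})^{-1}$ times the norm of $L_{11}$ is roughly $1/2$, mirroring the gap between the two diagonal blocks, and the resulting $S$ has norm of order $\delta$, in line with (4.14).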
We can also derive the same estimates (4.14) with a different constant $C_3$ for the norm $|f|_{\delta_1,\delta_2}$. Moreover if we define $S_{A,j}$ via the following equation:
\[
S\sigma^{(j)} = \sum_{|A|\ge 2} S_{A,j}\,\sigma(A),
\]
there exists a positive $\delta_3$ such that
\[
\sum_{|A|\ge 2} |S_{A,j}|\,e^{\delta_3 d(A\cup\{j\})} < \infty.
\]
Set
\[
v_j = \sigma^{(j)} + S\sigma^{(j)}.
\]
Now $\mathcal{H}_1$ is defined as the closed linear hull of the $v_j$, where the closure is taken in the Hilbert space topology of $\mathcal{H}$. Then, the next step is the construction of an orthogonal basis of $\mathcal{H}_1$. Set
\[
D(i - j) = (v_j, v_i)_{\mathcal{H}}, \qquad d(j) = D(j) - \delta_{0,j}.
\]

Lemma 4.6. The following estimate is valid:
\[
|d(j)| \le C_4\,\delta\,\kappa^{|j|}, \tag{4.15}
\]
where $\kappa$ is a positive constant strictly less than 1.

The proof of (4.15) follows from the exponential decay of correlations of the Gibbs measure $\mu$ for quasi-local observables $S\sigma^{(j)}$. Equation (4.15) shows that the Fourier transform of $D(j)$ is a strictly positive analytic function if $\delta$ is sufficiently small. Let $D$ be the convolution operator on $l^2(\mathbb{Z})$ defined by
\[
Df(i) = \sum_j D(i - j)\,f(j).
\]
Set
\[
\hat v_i = \sum_j D^{-1/2}(i - j)\,v_j,
\]
where $D^{-1/2}(i - j)$ is the kernel of the inverse of the square root of the operator $D$. Then $\{\hat v_i\}$ is an orthogonal basis of $\mathcal{H}_1$. By $L^{(1)}$ (resp. $L^{(1)}(1)$ or $L^{(1)}(2)$) we denote the restriction of $L$ (resp. $L(1)$ or $L(2)$) to $\mathcal{H}_1$. As $L^{(1)}$ commutes with the shift, $L^{(1)}$ is a convolution operator $m$ on $l^2(\mathbb{Z})$,
\[
L^{(1)} v_i = \sum_j m(i - j)\,v_j, \qquad L^{(1)} \hat v_i = \sum_j m(i - j)\,\hat v_j.
\]
In fact, $m(i - j)$ is expressed in terms of $L$ and $S$,
\[
m(i - j) = (L_{11} + L_{12} S)_{i-j}, \tag{4.16}
\]
where the subscript refers to the $i$, $j$ matrix element. (For details of the derivation of (4.16), see the argument leading to (4.4) and (4.8) of [15].) The leading contribution to $m(i - j)$ comes from the operator $L(1)$,
\[
(L(1)_{11})_{i-j} = 2\delta_{i,j} + 2\delta\,(\delta_{i,j+2} + \delta_{i,j-2}), \tag{4.17}
\]
because, modulo terms in $M_1^\perp$ and terms of order $\delta^2$, we have
\[
L(1)\sigma^{(i)} = 2\bigl(1 + \delta(\sigma^{(i-2)}\sigma^{(i)} + \sigma^{(i)}\sigma^{(i+2)})\bigr)\sigma^{(i)}.
\]
As the norms of $S$ and $L_{12}$ are of order $O(\delta)$ we obtain
\[
m(i - j) = (L(1)_{11})_{i-j} + O(\delta^2).
\]
This completes the proof of (4.7). Equation (4.7) tells us that the spectrum of $L$ restricted to $\mathcal{H}_1$ is contained in the interval $[2(1 - \delta(1 + C_4)), 2(1 + \delta(1 + C_4))]$. Set
\[
\mathcal{H}_2 = \mathcal{H}_1^\perp \cap \{\Omega\}^\perp,
\]
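where $\{\Omega\}^\perp$ is the set of vectors orthogonal to the vacuum $\Omega$. $\mathcal{H}_2$ is the Hilbert space completion of $M_1^\perp$. We consider the infimum of the spectrum of $L$ on $\mathcal{H}_2$:
\[
\begin{aligned}
|L^{-1}(u + Su)|_M &\le \bigl\|(L_{22})^{-1}\bigr\|_{M_1^\perp}\,(1 + \|S\|_M)\,|u| \\
&\le \bigl\|(L_{22})^{-1}\bigr\|_{M_1^\perp}\,(1 + \|S\|_M)\,\frac{1}{1 - \|S\|_M}\,\|u + Su\|_M,
\end{aligned}
\]
where $M_1^\perp = \mathcal{H}_2 \cap M$. Now recall that
\[
\bigl\|(L(0)_{22})^{-1}\bigr\|_{M_1^\perp} \le \frac{1}{4},
\]
and we can estimate $\|(L_{22})^{-1}\|_{M_1^\perp}$ by using the Neumann series:
\[
\bigl\|(L_{22})^{-1}\bigr\|_{M_1^\perp} \le \frac{1}{4(1 - C_5\delta)}
\]
when $\delta$ is sufficiently small. As a consequence,
\[
\bigl\|L^{-1}\bigr\|_{\mathcal{H}_2} \le \frac{1}{4(1 - C_5\delta)}\,\frac{1 + C_3\delta}{1 - C_3\delta}.
\]
This means that the infimum of the spectrum of $L$ on $\mathcal{H}_2$ is greater than $4 - \epsilon$ if $\delta$ is sufficiently small.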
References

1. Alcaraz, S.R., Salinas, R.S., Wreszinski, W.F.: Anisotropic ferromagnetic quantum domain. Phys. Rev. Lett. 75, 930–933 (1995)
2. Araki, H., Matsui, T.: Ground states of the XY-model. Commun. Math. Phys. 101, 213–245 (1985)
3. Bratteli, O., Robinson, D.: Operator algebras and quantum statistical mechanics I. 2nd edition, Berlin-Heidelberg-New York: Springer, 1987
4. Bratteli, O., Robinson, D.: Operator algebras and quantum statistical mechanics II. 2nd edition, Berlin-Heidelberg-New York: Springer, 1997
5. Fannes, M., Nachtergaele, B., Werner, R.: Finitely Correlated States on Quantum Spin Chains. Commun. Math. Phys. 144, 443–490 (1992)
6. Datta, N., Kennedy, T.: Expansions for one quasiparticle states in spin 1/2 systems. J. Statist. Phys. 108, 373–399 (2002)
7. Datta, N., Kennedy, T.: Instability of Interfaces in the Antiferromagnetic XXZ Chain at Zero Temperature. Commun. Math. Phys. 236, 477–511 (2003)
8. Kirkwood, J.R., Thomas, L.: Expansions and phase transitions for the ground state of quantum Ising lattice systems. Commun. Math. Phys. 88, 569–580 (1983)
9. Koma, T., Nachtergaele, B.: The complete set of ground states of the ferromagnetic XXZ chains. Adv. Theor. Math. Phys. 2, 533–558 (1998)
10. Liggett, T.: Interacting Particle Systems. Berlin-Heidelberg-New York: Springer, 1985
11. Matsui, T.: Uniqueness of translationally invariant ground state in quantum spin systems. Commun. Math. Phys. 126, 453–467 (1990)
12. Matsui, T.: On Ground State Degeneracy of Z2 Symmetric Quantum Spin Models. Publ. RIMS, Kyoto Univ. 27, 657–659 (1991)
13. Matsui, T.: On ground states of the one-dimensional ferromagnetic XXZ model. Lett. Math. Phys. 37, 397–403 (1996)
14. Matsui, T.: Translational symmetry breaking and soliton sectors for massive quantum spin models in 1 + 1 dimensions. Commun. Math. Phys. 189, 127–144 (1997)
15. Minlos, R.A.: Invariant subspaces of the stochastic Ising high temperature dynamics. Markov Proc. Rel. Fields 2, 263–284 (1996)

Communicated by M. Aizenman
Commun. Math. Phys. 253, 611–631 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1157-9
Communications in
Mathematical Physics
Absolutely Continuous Spectrum of Schrödinger Operators with Slowly Decaying and Oscillating Potentials

A. Laptev1, S. Naboko2, O. Safronov1

1 Department of Mathematics, Royal Institute of Technology (KTH), 10044 Stockholm, Sweden. E-mail: [email protected]; [email protected]
2 Faculty of Physics, Division of Mathematical Physics, Saint-Petersburg State University, 3 Ulianovskaya ul., Peterhof, St. Petersburg 198504, Russia. E-mail: [email protected]
Received: 4 December 2003 / Accepted: 4 March 2004 Published online: 3 September 2004 – © Springer-Verlag 2004
Abstract: The aim of this paper is to extend the class of potentials for which the absolutely continuous spectrum of the corresponding multidimensional Schrödinger operator is essentially supported by $[0, \infty)$. Our main theorem states that this property is preserved for slowly decaying potentials provided that there are some oscillations with respect to one of the variables.

1. Introduction

In this paper we prove that the absolutely continuous spectrum of a class of Schrödinger operators $-\Delta + V$ in $L^2(\mathbb{R}^d)$, $d \ge 3$, is essentially supported by $[0, \infty)$. This means that the spectral projection corresponding to any subset of $[0, \infty)$ of positive Lebesgue measure is not zero. We develop a technique which allows one to estimate the spectral measure of $-\Delta + V$ in terms of eigenvalue sums. Namely, as soon as we have a "good" estimate for the eigenvalues of the Schrödinger operator, we can prove that the a.c. spectrum fills the interval $(0, +\infty)$. Different relations between the discrete and continuous spectrum appeared in Damanik-Killip [9] for one-dimensional operators. They proved that if one-dimensional Schrödinger operators with potentials $+V$ and $-V$ have only a finite number of eigenvalues, then their positive spectrum is absolutely continuous.

Recently O. Safronov [23] has shown that our results can be extended to long range potentials $V \in L^{d+1}(\mathbb{R}^d)$, $d \ge 3$, whose Fourier transform is square integrable near the origin. In particular this implies that for any real function $V_0 \in L^{d+1}$ whose Fourier transform is square integrable near $\xi_0$, the Schrödinger operator with the potential
\[
V(x) = \cos(\xi_0 x)\,V_0(x)
\]
has a.c. spectrum essentially supported by $\mathbb{R}_+$. The same assertion holds for $V \in L^{d+1}(\mathbb{R}^d)$ such that
\[
\int_{\mathbb{R}^d} \Bigl|\int_{|y|<\delta} V(x + y)\,dy\Bigr|^2 dx < \infty
\]
for some positive number $\delta > 0$.

Our work differs from the results obtained in scattering theory, where the existence of wave operators is proved under more restrictive conditions on $V$. Here the best known conditions are $|V(x)| \le C(1 + |x|)^{-1-\varepsilon}$, $\varepsilon > 0$, or $|V(x)| + |\nabla V(x)|(1 + |x|) \le C(1 + |x|)^{-\varepsilon}$. In this case the corresponding a.c. property of the spectrum is a byproduct of much stronger results on the unitary equivalence of the operators $-\Delta$ and $-\Delta + V$. One should also notice that some generalizations of the Hack-Cook theorem require only $V \in L^1 + L^p$ with $p < d$. This means that for $d > 4$ one should apply the modified theorem [23] rather than its preliminary version (see Theorem 2.2) which deals with potentials from $L^4$. Also the operator has a lot of a.c. spectrum if there is a cone where $V$ decays very fast.

As in our previous paper [17] the multidimensional case is reduced to a problem for a one-dimensional second order elliptic integro-differential operator. The "potential" type term appears to be a dissipative Fredholm integral operator depending on the spectral parameter. Such an operator might have poles appearing in an operator version of the so-called first Buslaev-Faddeev-Zakharov (BFZ) trace formula. Their contribution appears with the "right" sign and therefore can be ignored.

There are two new crucial elements compared with [17]. One of them suggests new "spectrally local" Lieb-Thirring inequalities for the 3/2 moments of the negative eigenvalues of Schrödinger operators (compare with O. Safronov [22]). Before applying this result we need an argument from A. Laptev and T. Weidl [18] lifting the corresponding eigenvalue estimates for their 1/2-moments to 3/2-moments by using an induction with respect to dimension. This argument forces us to consider the problem starting from dimension $d \ge 3$. The second new element is concerned with a parallel consideration of a couple of Schrödinger operators with potentials $V$ and $-V$. This leads to the cancellation of the term $\int_0^\infty \int_{S^{d-1}} V\,d\theta\,dr$ appearing in the BFZ first trace formula.

Note that the first result based on the Buslaev-Faddeev-Zakharov trace formulae for the study of the a.c. properties of the spectrum of one-dimensional Schrödinger operators was suggested in the paper by P. Deift and R. Killip [11]. Their theorem gave a natural generalization of the results obtained by M. Christ, A. Kiselev and C. Remling in [7], M. Christ, A. Kiselev [8] and C. Remling [21]. R. Killip [14] was the first to prove a "local" one-dimensional result: if $\hat V \in L^2(2a, 2b)$, $a > 0$, and $V \in L^3(\mathbb{R})$, then the absolutely continuous spectrum fills the interval $(a^2, b^2)$. For the three-dimensional case our theorems require only $V \in L^4(\mathbb{R}^3)$ rather than the condition $V \in L^3(\mathbb{R}^3)$.

Note that the second $L^2$-condition (2.7) on the Fourier transform of $V$ with respect to one of the variables near the origin becomes interesting if there are cancellations provided by oscillations of the potential $V$ near infinity. There is extensive literature concerning the properties of the spectrum of oscillating potentials starting from the classical Wigner-von Neumann construction [34], see also M. Skriganov [32] and H. Behncke [3, 4]. Some examples of oscillating potentials with respect to the radial variable were given in M. Reed and B. Simon [24], Vol. 3, Ch. XI.

Our Theorems 2.1 and 2.2 are applied to a class of potentials described in terms of the Fourier transform either with respect to one of the variables or with respect to all variables. Some related results for a class of Schrödinger operators with anisotropic
behaviour of potentials at infinity were considered in the paper by V.G. Deich, E.L. Korotjaev and D.R. Yafaev [10]. This article is a natural development of our previous paper [17]. For the sake of completeness we recall the arguments of Sects. 3, 4 and 8 from [17] which become in this text Sects. 4–6 and 10 respectively.

2. The Main Results

Let us consider a Schrödinger operator $-\Delta + V$ in $L^2(\mathbb{R}^d)$, $d \ge 3$, where
$$V \in L^\infty(\mathbb{R}^d), \qquad V(x) \to 0 \ \text{ as } |x| \to \infty. \tag{2.1}$$
Let $\hat V$ be the Fourier transform of $V$ with respect to the first variable,
$$\hat V(\xi, y) = \int_{\mathbb{R}} e^{-i\xi s} V(s, y)\, ds, \qquad x = (s, y) \in \mathbb{R}^d. \tag{2.2}$$
Theorem 2.1. Let $d \ge 3$ and let $V$ be a real valued function on $\mathbb{R}^d$ obeying (2.1) and let, for some $\delta > 0$,
$$\int_{\mathbb{R}^d} V^4(x)\, dx < \infty, \qquad \int_{\mathbb{R}^{d-1}} \int_{-\delta}^{\delta} |\hat V(\xi, y)|^2\, d\xi\, dy < \infty.$$
Then the absolutely continuous spectrum of the operator $-\Delta + V$ is essentially supported by $[0, \infty)$.

The latter theorem gives some qualitative information about the absolutely continuous spectrum of Schrödinger operators. The next result is related to more delicate properties of the a.c. spectrum. It provides some quantitative characteristics of the spectral measure which is a multidimensional continuous analog of the well-known Szegő condition for orthogonal polynomials and Jacobi matrices (compare with [17]).

Let $\Omega_1$ be the unit ball in $\mathbb{R}^d$, $\partial\Omega_1 = S^{d-1}$, and let $V$ be a real valued function on $\mathbb{R}^d \setminus \Omega_1$. We consider the operator $H$ in $L^2(\mathbb{R}^d \setminus \Omega_1)$ with the Dirichlet boundary conditions on $S^{d-1}$,
$$H u = H_0 u + V u = -\Delta u + V u, \qquad u|_{\partial\Omega_1} = 0. \tag{2.3}$$
Let us assume for the sake of simplicity that there is $c_1 > 1$ such that
$$V + \frac{\alpha_d}{|x|^2} = 0 \quad \text{for } 1 < |x| < c_1, \tag{2.4}$$
where $\alpha_d = \frac{(d-1)^2}{4} - \frac{d-1}{2}$. Let $E_H(\omega)$, $\omega \subset \mathbb{R}$, be the spectral projection of the operator $H$. We construct a measure $\mu$ on the real line such that for spherically symmetric functions $f$,
$$(E_H(\omega) f, f) = \int_{\omega} |F(\lambda)|^2\, d\mu(\lambda), \qquad \omega \subset \mathbb{R}_+ = (0, \infty), \tag{2.5}$$
where
$$F(\lambda) = \frac{1}{k} \int_0^{c_1} \sin(k(r-1))\, f(r)\, r^{(d-1)/2}\, dr, \qquad \operatorname{supp} f \subset \{x : 1 < |x| < c_1\}, \tag{2.6}$$
and $k^2 = \lambda > 0$. Let us extend $V$ by zero into $\Omega_1$ and then define $\hat V$ as in (2.2). The following theorem is the main result of the paper.
Theorem 2.2. Let $d \ge 3$ and let $V$ be a real valued function on $\mathbb{R}^d \setminus \Omega_1$ obeying (2.1) and (2.4). Let
$$\int_{\mathbb{R}^d \setminus \Omega_1} V^4(x)\, dx < \infty, \qquad \int_{\mathbb{R}^{d-1}} \int_{-\delta}^{\delta} |\hat V(\xi, y)|^2\, d\xi\, dy < \infty \tag{2.7}$$
for some $\delta > 0$. Then
$$\int_0^\infty \frac{\log(1/\mu'(t))\, dt}{(1 + t^{3/2})\sqrt{t}} < \infty, \tag{2.8}$$
where $\mu$ is defined in (2.5). If (2.4) is satisfied then (2.8) is equivalent to
$$\int_0^\infty \frac{\log\big[\frac{d}{d\lambda}(E_H(\lambda) f, f)\big]\, d\lambda}{(1 + \lambda^{3/2})\sqrt{\lambda}} > -\infty, \tag{2.9}$$
for any bounded spherically symmetric function $f \ne 0$ with $\operatorname{supp} f \subset \{x : 1 < |x| < c_1\}$.

Remark 1. The inequality (2.8) guarantees that the a.c. spectrum of $H$ is essentially supported by $[0, \infty)$, since $\mu' > 0$ almost everywhere, and gives quantitative information about the measure $\mu$.

Remark 2. If $d = 1$, then the conditions (2.7) do not provide existence of the absolutely continuous spectrum on $\mathbb{R}_+$. This is confirmed by examples of sparse potentials constructed in [15]. The validity of Theorem 2.2 in dimension $d = 2$ remains open.

Remark 3. The equivalence of (2.8) and (2.9) follows from the fact that if $F$ is defined as in (2.6), then the function $(1 + \lambda^2)^{-1} \log(|F(\lambda)|)$ is in $L^1(\mathbb{R}_+)$; see, for example, P. Koosis [16] (Sect. IIIG2).

Remark 4. When proving Theorem 2.2 we use the projection operator $P_0$ on the spherical function $Y_0$, which leads us to a scalar one-dimensional problem (4.2) with an operator valued potential $Q_z$. Had we used instead of $P_0$ the projection $\sum_{j=1}^n P_j$, where $P_j$ are projections on the spherical functions $Y_j$, then we would have obtained the corresponding system of one-dimensional equations with an operator valued potential which could be treated similarly. This would imply that the multiplicity of the a.c. spectrum is not smaller than $n$. Since $n$ is arbitrary, we obtain that the a.c. spectrum is of infinite multiplicity.

Example. The statement of the theorem holds true for a 3-dimensional operator with the potential $V(x, y, z) = v_1(x) v(y, z)$, $v \in L^4(\mathbb{R}^2)$. Here $v_1$ is a so-called Wigner-von Neumann potential
$$v_1(x) = \sum_{j=1}^m c_j\, \frac{\sin(\omega_j x) + o(1)}{1 + |x|^{p_j}}, \qquad |x| \to \infty,$$
where $\omega_j > 0$, $p_j > 1/4$, $c_j \in \mathbb{R}$, $m \in \mathbb{N}$, and $v_1$ is a function whose Fourier transform vanishes on a small interval containing zero. For example, one can consider
$$v_1(x) = \mathrm{Re} \sum_{j=1}^m \int_{\omega_j}^{1+\omega_j} \frac{C_j \exp(ikx)}{(k - \omega_j)^{1-p_j}}\, dk,$$
with appropriate constants $C_j$.
3. Estimates for the Discrete Spectrum

Throughout the paper, $T_\pm$ denotes the positive and negative part of a self adjoint operator $T$, i.e. $2T_\pm = |T| \pm T$. Denote by $S_p$, $p > 0$, the standard Neumann-Schatten classes of compact operators
$$S_p = \{T : \operatorname{tr}(T^* T)^{p/2} < \infty\}.$$
Consider a one dimensional Schrödinger operator $J = -\frac{d^2}{dx^2} + V(x)$ in $L^2(\mathbb{R})$ with a real valued potential $V \in C_0^\infty(\mathbb{R})$.

Theorem 3.1. Let $V \in C_0^\infty(\mathbb{R})$. Then for any $\delta > 0$,
$$\operatorname{tr}\Big(-\frac{d^2}{dx^2} + V(x)\Big)_-^{3/2} \le C \Big( \int_{\mathbb{R}} V^4\, dx + \int_{-\delta}^{\delta} |\hat V(\xi)|^2\, d\xi \Big), \tag{3.1}$$
where the constant $C = C(\delta, \|V\|_\infty)$ and $\hat V(\xi) = \int \exp(-i\xi x) V(x)\, dx$.
Proof. For each $T \in S_1$, one can define a complex-valued function $\det(1 + T)$, so that $|\det(1 + T)| \le \exp(\|T\|_{S_1})$. For $T \in S_4$ one defines
$$\det{}_4(1 + T) = \det\big((1 + T)\, e^{-T + T^2/2 - T^3/3}\big). \tag{3.2}$$
It is proved in [30], Sect. 9, Theorem 9.2(b), that there is a constant $c > 0$ such that
$$|\det{}_4(1 + T)| \le \exp\big(c \|T\|_{S_4}^4\big). \tag{3.3}$$
Note that if $J_0$ is the operator $J_0 = -\frac{d^2}{dx^2}$ in $L^2(\mathbb{R})$, then
$$\lim_{\varepsilon \to 0} \big|\det\big(I + V(J_0 - (\lambda \pm i\varepsilon))^{-1}\big)\big| \ge 1.$$
In order to prove Theorem 3.1 we need the following auxiliary statement:

Lemma 3.1. Let $V(x)$ be a smooth real valued function of finite support. For every $\delta > 0$ there is a constant $C = C(\delta, \|V\|_\infty)$ such that for all $z$ with $|z - \|V\|_\infty| = \|V\|_\infty + \delta^2$ it holds that
$$\big|\log \det{}_4\big(I + V(J_0 - z)^{-1}\big)\big| \le \frac{C}{|\mathrm{Im}\, z|^4}\, \|V\|_{L^4}^4. \tag{3.4}$$
Proof. Let $z = \lambda + i\eta$, where $\lambda$ and $\eta$ are real. One can repeat the arguments of R. Killip and B. Simon, Proposition 5.2 [13], in order to show that
$$\Big| \frac{d}{d\eta} \log \det{}_4\big(I + V(J_0 - z)^{-1}\big) \Big| = \big| \operatorname{tr}\big(i[(J_0 - z)^{-1} V]^4 (J - z)^{-1}\big) \big| \le \|V(J_0 - z)^{-1}\|_{S_4}^4\, \|(J - z)^{-1}\|. \tag{3.5}$$
On the other hand,
$$\lim_{\eta \to \infty} \det{}_4\big(I + V(J_0 - z)^{-1}\big) = 1.$$
Therefore the estimate (3.4) follows from (3.5) by the Fundamental Theorem of Calculus.
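It was established in [14] that for $z = k^2$, $k \in \mathbb{R}$,
$$-\mathrm{Re}\, \operatorname{tr}\big(V(J_0 - z)^{-1}\big)^2 = \frac{|\hat V(2k)|^2}{2k^2}.$$
Therefore for $z = k^2$, $k \in \mathbb{R}$,
$$\begin{aligned}
0 &\le \log\big|\det\big(I + V(J_0 - z)^{-1}\big)\big| + \log\big|\det\big(I - V(J_0 - z)^{-1}\big)\big| \\
&= -\mathrm{Re}\, \operatorname{tr}\big(V(J_0 - z)^{-1}\big)^2 + \log\big|\det{}_4\big(I + V(J_0 - z)^{-1}\big)\big| + \log\big|\det{}_4\big(I - V(J_0 - z)^{-1}\big)\big| \\
&= \frac{|\hat V(2k)|^2}{2k^2} + \log\big|\det{}_4\big(I + V(J_0 - z)^{-1}\big)\big| + \log\big|\det{}_4\big(I - V(J_0 - z)^{-1}\big)\big|.
\end{aligned} \tag{3.6}$$
Let now
$$\sigma(k) = k^2 (k^2 - \delta^2)^4, \qquad L = \{k : |k^2 - \|V\|_\infty| = \|V\|_\infty + \delta^2\}.$$
Then applying (3.4) we obtain
$$\int_L \log \det{}_4\big(I + V(J_0 - k^2)^{-1}\big)\, \sigma(k)\, dk \le C \|V\|_{L^4}^4. \tag{3.7}$$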
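Now let $i\beta_j(V)$ be the zeros of $\log \det{}_4\big(I + V(J_0 - k^2)^{-1}\big)$ and let $B(k, V)$ be the Blaschke product
$$B(k, V) = \prod_j \frac{k - i\beta_j(V)}{k + i\beta_j(V)}.$$
Then
$$\mathrm{Re} \int_{-\delta}^{\delta} \log \det{}_4\big(I + V(J_0 - z)^{-1}\big)\, \sigma(k)\, dk = \mathrm{Re} \int_L \log \det{}_4\big(I + V(J_0 - z)^{-1}\big)\, \sigma(k)\, dk - \mathrm{Re} \int_L \log(B(k, V))\, \sigma(k)\, dk. \tag{3.8}$$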
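Thus, combining the inequality (3.6) with the estimate (3.7) and the relation (3.8), we obtain
$$\sum_j f(\beta_j(V)) + \sum_j f(\beta_j(-V)) \le C \Big( \int_{\mathbb{R}} V^4\, dx + \int_{-\delta}^{\delta} |\hat V(2\xi)|^2\, d\xi \Big), \tag{3.9}$$
where
$$f(t) = \mathrm{Re} \int_L \log\Big(\frac{k - it}{k + it}\Big)\, \sigma(k)\, dk, \qquad t > 0.$$
Integrating by parts and using the fact that $\sigma$ is even we obtain
$$f(t) = \mathrm{Re} \int_L \Big( \frac{1}{k + it} - \frac{1}{k - it} \Big) \Phi(k)\, dk = \mathrm{Re} \int_{|k| = 2t} \frac{1}{k - it}\, \Phi(k)\, dk = 2\pi\, \Phi(it),$$
where
$$\Phi(k) = \int_0^k \sigma(\tau)\, d\tau.$$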
This implies
$$f(t) \ge \frac{2\pi \delta^8}{3}\, t^3.$$
The proof is complete.
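The cubic lower bound can be checked by hand. The computation below is only a sketch: it writes $\Phi$ for the primitive of $\sigma$ that vanishes at zero (the letter is an assumption made here for readability) and evaluates it on the imaginary axis.

```latex
% Assumption: \Phi(k) = \int_0^k \sigma(\tau)\,d\tau with \sigma(\tau) = \tau^2(\tau^2-\delta^2)^4.
% Substitute \tau = is, 0 \le s \le t, so that \sigma(is) = -s^2 (s^2+\delta^2)^4.
\begin{align*}
\Phi(it) &= \int_0^{t} \sigma(is)\, i\, ds
          = -\,i \int_0^{t} s^2 (s^2+\delta^2)^4 \, ds, \\
|\Phi(it)| &= \int_0^{t} s^2 (s^2+\delta^2)^4 \, ds
          \;\ge\; \delta^8 \int_0^{t} s^2 \, ds
          = \frac{\delta^8 t^3}{3}, \qquad t > 0 .
\end{align*}
```

Hence the modulus of $2\pi\Phi(it)$ is at least $2\pi\delta^8 t^3/3$, in agreement with the bound on $f(t)$, and summing over the eigenvalues gives control of the $3/2$-moment $\sum_j \beta_j^3$.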
4. The Beginning of the Proof of Theorem 2.2

In this section we reduce problem (2.3) to a one-dimensional problem with an operator valued potential. Such a reduction has already been used in [17]. Assume that $V \in C_0^\infty$ and introduce polar coordinates $(r, \theta)$, $x = r\theta \in \mathbb{R}^d$, $\theta \in S^{d-1}$. Denote by $\{Y_j\}_{j=0}^\infty$ the orthonormal basis in $L^2(S^{d-1})$ of (real) spherical functions, i.e. eigenfunctions of the Laplace-Beltrami operator $-\Delta_\theta$, and let $P_j$ be the orthogonal projection given by
$$P_j u(r, \theta) = Y_j(\theta) \int_{S^{d-1}} Y_j(\theta') u(r, \theta')\, d\theta'.$$
Clearly $P_0 u$ depends only on $r$. Denote
$$V_1 = P_0 V P_0, \qquad V_{1,2} = P_0 V (I - P_0), \qquad V_{2,1} = V_{1,2}^*, \qquad V_2 = (I - P_0) V (I - P_0),$$
$$H_{0,1} = P_0 H_0 P_0, \qquad H_{0,2} = (I - P_0) H_0 (I - P_0).$$
Then the operator $H - z$ can be represented as a matrix:
$$H - z = \begin{pmatrix} H_{0,1} + V_1 - z & V_{1,2} \\ V_{2,1} & H_{0,2} + V_2 - z \end{pmatrix},$$
and the equation
$$(H - z) u = P_0 f, \qquad \mathrm{Im}\, z \ne 0,$$
is equivalent to
$$(H_{0,1} + T_z - z) P_0 u = P_0 f, \qquad (H_{0,2} + V_2 - z)^{-1} V_{2,1} P_0 u = (P_0 - I) u. \tag{4.1}$$
Here the operator $T_z$ is defined by
$$T_z = V_1 - V_{1,2} (H_{0,2} + V_2 - z)^{-1} V_{2,1}$$
on $L^2((1, \infty), r^{d-1} dr)$. By using the unitary operator from $L^2((1, \infty), dr)$ to $L^2((1, \infty), r^{d-1} dr)$, $U u(r) = r^{-(d-1)/2} u$, we reduce (4.1) to the problem for the following one-dimensional Schrödinger operator in $L^2(1, \infty)$:
$$L_z u(r) = -\frac{d^2 u}{dr^2} + Q_z u, \qquad u \in L^2(1, \infty), \quad u(1) = 0, \tag{4.2}$$
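The passage from the block matrix to (4.1) is a Schur complement computation. The lines below are a sketch of that elimination; the shorthands $u_1$, $u_2$ and $A_2$ are introduced here and are not the paper's notation.

```latex
% Shorthand (assumed here): u_1 = P_0 u, u_2 = (I - P_0) u, A_2 = H_{0,2} + V_2 - z.
% The equation (H - z) u = P_0 f, written row by row:
\begin{align*}
(H_{0,1} + V_1 - z)\,u_1 + V_{1,2}\,u_2 &= P_0 f, \\
V_{2,1}\,u_1 + A_2\,u_2 &= 0 .
\end{align*}
% A_2 is invertible for Im z \neq 0 because H_{0,2} + V_2 is self-adjoint.
% Solving the second row for u_2 and substituting into the first row:
\begin{align*}
u_2 &= -A_2^{-1} V_{2,1}\, u_1, \\
\bigl(H_{0,1} + V_1 - V_{1,2} A_2^{-1} V_{2,1} - z\bigr)\, u_1 &= P_0 f .
\end{align*}
```

The first relation is the second identity in (4.1), since $(P_0 - I)u = -u_2$, and the operator in the last line is $H_{0,1} + T_z - z$.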
where
$$Q_z = V_1 + \frac{\alpha_d}{r^2} - V_{1,2} (U^* H_{0,2} U + V_2 - z)^{-1} V_{2,1}, \qquad \alpha_d = \frac{(d-1)^2}{4} - \frac{d-1}{2}.$$
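The constant $\alpha_d$ arises when the radial part of the Laplacian is conjugated with $U$. The following check is a sketch for functions depending only on $r$; the shorthand $a = (d-1)/2$ is introduced here for brevity.

```latex
% Assumption: u(r) = r^{-a} v(r) with a = (d-1)/2, and the radial part of -\Delta
% acting on u is -u'' - \frac{d-1}{r} u' = -u'' - \frac{2a}{r} u'.
\begin{align*}
u'  &= r^{-a}\Bigl(v' - \frac{a}{r}\,v\Bigr), \qquad
u'' = r^{-a}\Bigl(v'' - \frac{2a}{r}\,v' + \frac{a(a+1)}{r^2}\,v\Bigr), \\
-u'' - \frac{2a}{r}\,u'
    &= r^{-a}\Bigl(-v'' + \frac{a^2 - a}{r^2}\,v\Bigr), \\
a^2 - a &= \frac{(d-1)^2}{4} - \frac{d-1}{2} = \frac{(d-1)(d-3)}{4} = \alpha_d .
\end{align*}
```

In particular $\alpha_d \ge 0$ for $d \ge 3$, and $\alpha_3 = 0$.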
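We are going to approximate the problem by the corresponding problem with a smooth compactly supported potential $V$ and the term $\alpha_d / r^2$ substituted by $\zeta_\varepsilon(r) \alpha_d / r^2$, where $\zeta_\varepsilon / r^2 \to 1 / r^2$, as $\varepsilon \to 0$, in both spaces $L^1(1, \infty)$ and $L^2(1, \infty)$ and $\zeta_\varepsilon \in C_0^\infty(1, +\infty)$. The same should be done with the term $\Delta_\theta u / r^2$, i.e. it should be substituted by $\zeta_\varepsilon(r) \Delta_\theta u / r^2$. So when approximating the problem we always assume that
$$Q_z = V_1 + \zeta_\varepsilon(r) \frac{\alpha_d}{r^2} - V_{1,2} (S_\varepsilon + V_2 - z)^{-1} V_{2,1}, \tag{4.3}$$
where
$$S_\varepsilon u = -\frac{d^2 u}{dr^2} - \zeta_\varepsilon(r) \frac{\Delta_\theta u}{r^2}, \qquad u(1, \theta) = 0. \tag{4.4}$$
According to (4.1) we obtain
$$P_0 (H - z)^{-1} P_0 = U (L_z - z)^{-1} U^*. \tag{4.5}$$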
We see also that if $\operatorname{supp} V \cup \operatorname{supp} \zeta_\varepsilon(|\cdot|) \subset \{x \in \mathbb{R}^d : c_1 < |x| < c_2\}$, $c_1 > 1$, then for the operator (4.3) we have $Q_z = Q_z \chi = \chi Q_z$, where $\chi$ is the operator of multiplication by the characteristic function of the interval $(c_1, c_2)$, $c_1 > 0$. It is important for us that $Q_z$ is an analytic operator valued function of $z$ with a negative imaginary part in the upper half plane and a positive imaginary part in the lower half plane.

5. Green’s Function

In Sects. 5–7 we assume that $V$ is not a potential but the operator $P V P$, $P = \sum_{j=0}^n P_j$, which approximates $V$ for large $n$. It can be interpreted as an operator of multiplication by a matrix valued function of $r$. In this case the function $V_1$ remains the same as before. Since $P_j$ are projections on real spherical functions, this matrix is real. Recall that the factor $1/r^2$ in front of $-\Delta_\theta$ and $\alpha_d$ is also substituted by a smooth compactly supported function $\zeta_\varepsilon / r^2$. Let us consider the equation
$$-\frac{d^2}{dr^2} \psi(r) + (Q_z \psi)(r) = z \psi(r), \qquad r \ge 1, \ z \in \mathbb{C}, \tag{5.1}$$
with $Q_z$ given by (4.3) and let $\psi_k(r)$ be the solution of Eq. (5.1) satisfying
$$\psi_k(r) = \exp(ikr), \qquad k^2 = z, \ \mathrm{Im}\, k > 0, \ \forall r > c_2.$$
Then this solution also satisfies the following “integral” equation
$$\psi_k(r) = e^{ikr} - k^{-1} \int_r^\infty \sin k(r - s)\, (Q_z \psi_k)(s)\, ds. \tag{5.2}$$
In order to describe the properties of $\psi_k(r)$ we systematically use the following analytic Fredholm theorem (see, for example, M. Reed and B. Simon [24], Theorem VI.14 or D. Yafaev [35], Ch. I, Sect. 8):

Theorem 5.1. Let $D \subset \mathbb{C}$ be an open connected set and let $T(k)$ be an analytic operator valued function on $D$ such that $T(k)$ is a compact operator in a Hilbert space for each $k \in D$. Then
(1) either $(I - T(k))^{-1}$ exists for no $k \in D$,
(2) or $(I - T(k))^{-1}$ exists for all $k \in D \setminus D_0$, where $D_0$ is a discrete subset of $D$. In this case $(I - T(k))^{-1}$ is meromorphic in $D$ with possible poles belonging to $D_0$.

We first apply this theorem in order to prove a statement which is quite standard in resonance theory.

Lemma 5.1. The operator $Q_z$ has a meromorphic continuation into the second sheet of the complex plane.

Proof. Let $S_\varepsilon$ be the same operator as in (4.4) and let $\tilde S = -d^2/dr^2$ be the operator in $L^2((1, \infty), P L^2(S^{d-1}))$ with the Dirichlet boundary condition at 1. Let $\phi \in C_0^\infty(\mathbb{R}_+)$ be a function which is identically equal to one on the support of the matrix-function $V$ and $\zeta_\varepsilon$. Then
$$\phi (S_\varepsilon + V_2 - z)^{-1} \phi = \Big[ I + \phi (\tilde S - z)^{-1} \Big( V_2 + \zeta_\varepsilon \frac{\Delta_\theta}{r^2} \Big) \Big]^{-1} \phi (\tilde S - z)^{-1} \phi.$$
Obviously both operators $\phi (\tilde S - z)^{-1} \big( V_2 + \zeta_\varepsilon \frac{\Delta_\theta}{r^2} \big)$ and $\phi (\tilde S - z)^{-1} \phi$ have an analytic continuation into the second sheet of the complex plane through the positive semi-axis. By using Theorem 5.1 we obtain that the operator
$$\Big[ I + \phi (\tilde S - z)^{-1} \Big( V_2 + \zeta_\varepsilon \frac{\Delta_\theta}{r^2} \Big) \Big]^{-1}$$
and thus the operator $Q_z$ defined in (4.3) have meromorphic continuations into the second sheet of the complex plane.

Let us now apply Theorem 5.1 to the operator
$$T(k) \psi(r) = -k^{-1} \int_r^\infty \sin k(r - s)\, (Q_z \psi)(s)\, ds$$
in $L^2(1, c_2)$. We conclude that Eq. (5.2) is uniquely solvable for all $k$ except perhaps a discrete sequence of points and that its solution $\psi_k$ is a meromorphic function of $k$ with values in $L^2(1, c_2)$ in a neighbourhood of every $k$ with $\mathrm{Im}\, k \ge 0$, $k \ne 0$. Clearly
$$\psi_k(x) = a(k) e^{ikx} + b(k) e^{-ikx}, \qquad 1 < x < c_1, \tag{5.3}$$
and therefore both $a(k)$ and $b(k)$ are meromorphic functions (even in some neighborhoods of points $k \ne 0$ of the real axis).
Consider the resolvent operator $R(z) = (L_z - z)^{-1}$, where $L_z$ is defined in (4.2). If $\chi_{c_1}$ is the operator of multiplication by the characteristic function of $(1, c_1)$, then $R(z) \chi_{c_1}$ is an integral operator with the kernel
$$G_z(r, s) = \begin{cases} \dfrac{\psi_k(s)}{\psi_k(1)}\, \dfrac{\sin(k(r-1))}{k}, & \text{for } r < s < c_1, \\[2mm] \dfrac{\psi_k(r)}{\psi_k(1)}\, \dfrac{\sin(k(s-1))}{k}, & \text{for } s < \min\{c_1, r\}. \end{cases} \tag{5.4}$$
Indeed, assuming that $\operatorname{supp}(f) \subset (1, c_1)$ we can easily check that the function
$$u(r) = \frac{1}{\psi_k(1)} \Big( \frac{\sin(k(r-1))}{k} \int_r^\infty \psi_k(s) f(s)\, ds + \psi_k(r) \int_1^r \frac{\sin(k(s-1))}{k} f(s)\, ds \Big)$$
satisfies the equation
$$-\frac{d^2}{dr^2} u(r) + (Q_z u)(r) - z u(r) = f(r), \qquad r \ge 1, \ z \in \mathbb{C}, \tag{5.5}$$
and moreover $u(1) = 0$. Here we should also mention that since $\psi_k(1)$ is meromorphic in $k$ in a neighborhood of any $k \ne 0$, we conclude that $\psi_k(1) = 0$ only on a discrete subset of the closed upper half plane, having no accumulation points except perhaps zero.

6. Wronskian and Properties of the M-Function

Let, as in (4.3),
$$Q_z = V_1 - V_{1,2} (S_\varepsilon + V_2 - z)^{-1} V_{2,1}.$$
The function
$$M(k) = \frac{\psi_k'(1)}{\psi_k(1)}, \qquad \mathrm{Im}\, k \ge 0, \tag{6.1}$$
is now well defined and called the Weyl M-function of the operator (5.1). Let us consider the Wronskian
$$W[\psi_k, \bar\psi_k](r) = \psi_k'(r) \bar\psi_k(r) - \psi_k(r) \bar\psi_k'(r). \tag{6.2}$$
Note that $\bar\psi_k$ satisfies Eq. (5.1) with $\overline{Q_z}$ and $\bar z$ instead of $Q_z$ and $z$. Since $\psi_k$ is a solution of Eq. (5.1) we find
$$\frac{d}{dr} W[\psi_k, \bar\psi_k](r) = (\bar z - z) \psi_k(r) \bar\psi_k(r) + (Q_z \psi_k)(r) \bar\psi_k(r) - \psi_k(r) \overline{(Q_z \psi_k)(r)}.$$
So we obtain
$$\pm \mathrm{Im}\, \{ W[\psi_k, \bar\psi_k](c_2) - W[\psi_k, \bar\psi_k](c_1) \} \ge 0, \qquad \text{for } \pm \mathrm{Im}\, z \ge 0+,$$
which means that for all real $k$ we have the following inequality:
$$\frac{k}{\mathrm{Im}\, M(k)} \le |\psi_k(1)|^2. \tag{6.3}$$
Moreover, if we represent the solution $\psi_k$ for real $k$ in the form
$$\psi_k(x) = a(k) e^{ikx} + b(k) e^{-ikx}, \qquad x < c_1,$$
then it follows from (6.3) that
$$|a|^2 - |b|^2 \ge 1, \qquad k = \bar k.$$
Clearly
$$M(k) = \psi_k'(1) (\psi_k(1))^{-1} = ik (1 - \rho(k)) (1 + \rho(k))^{-1},$$
where
$$\rho(k) := e^{-2ik}\, b(k)\, a(k)^{-1}.$$
The latter implies
$$\rho(k) = (ik - M(k)) (ik + M(k))^{-1}.$$
Since $|a|^2 - |b|^2 \ge 1$ we obtain that for real $k$
$$|a(k)|^{-2} \le 1 - |\rho(k)|^2 = \frac{4k\, \mathrm{Im}\, M}{|ik + M(k)|^2}.$$
Note that since $\mathrm{Im}\, M \ge 0$, then for any $k > 0$ we have
$$|ik + M(k)|^2 = k^2 + |M|^2 + 2k\, \mathrm{Im}\, M \ge k^2,$$
and therefore
$$|a(k)|^{-2} \le 4 k^{-1}\, \mathrm{Im}\, M, \qquad k > 0. \tag{6.4}$$
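For completeness, the algebra behind the bound (6.4) can be spelled out. The sketch below uses only the definition of $\rho$, the relation $\rho = (ik - M)(ik + M)^{-1}$ and real $k > 0$.

```latex
% For real k: |\rho| = |b|/|a|, so |a|^2 - |b|^2 \ge 1 gives 1 - |\rho|^2 \ge |a|^{-2}.
% For complex numbers p, q one has |p+q|^2 - |p-q|^2 = 4\,\mathrm{Re}(\bar p\, q);
% here p = ik, q = M, and \mathrm{Re}(\overline{ik}\,M) = \mathrm{Re}(-ikM) = k\,\mathrm{Im}\,M.
\begin{align*}
1 - |\rho(k)|^2
  &= \frac{|ik + M|^2 - |ik - M|^2}{|ik + M|^2}
   = \frac{4k\,\mathrm{Im}\,M}{|ik + M|^2}, \\
|ik + M|^2 &= k^2 + |M|^2 + 2k\,\mathrm{Im}\,M \;\ge\; k^2, \\
|a(k)|^{-2} &\le \frac{4k\,\mathrm{Im}\,M}{k^2} = 4k^{-1}\,\mathrm{Im}\,M .
\end{align*}
```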
From (6.1) and (6.2) we obtain
$$\mathrm{Im}\, M(k) > 0 \qquad \text{if } \mathrm{Im}\, k^2 > 0. \tag{6.5}$$
Thus, there are constants $C_0 \in \mathbb{R}$ and $C_1 \ge 0$ and a positive measure $\mu$, such that
$$\int_{-\infty}^{\infty} \frac{d\mu(t)}{1 + t^2} < \infty$$
and
$$M(k) = C_0 + C_1 z + \int_{\mathbb{R}} \Big( \frac{1}{t - z} - \frac{t}{1 + t^2} \Big)\, d\mu(t), \qquad k^2 = z. \tag{6.6}$$
Finally, note that $R(z) = P_0 (U^* H_0 U + V - z)^{-1} P_0$, and hence we can formally write that
$$M(k) = \frac{\partial^2}{\partial r \partial s} G_z(r, s) \Big|_{(1,1)} = \big( P_0 (U^* H_0 U + V - z)^{-1} P_0\, \delta_1', \delta_1' \big),$$
where $\delta_1'$ is the derivative of the delta function $\delta(r - 1)$. Let $\chi_{c_1}$ be the characteristic function of $(1, c_1)$. The representation (5.4) for the resolvent operator gives us the representation for the operator $\chi_{c_1} P_0 E_{U^* H_0 U + V}(\omega) P_0 \chi_{c_1}$, where $E_{U^* H_0 U + V}(\omega)$ is the spectral measure of $U^* H_0 U + V$:
$$\big( P_0 E_{U^* H_0 U + V}(\omega) P_0 f, f \big) = \int_\omega |F(\lambda)|^2\, d\mu(\lambda), \tag{6.7}$$
where
$$F(\lambda) = \frac{1}{k} \int_0^{c_1} \sin(k(r - 1)) f(r)\, dr, \qquad \operatorname{supp} f \subset (1, c_1), \quad k^2 = \lambda.$$
Since $F$ is a boundary value of an analytic function, we obtain that $F(\lambda) \ne 0$ for a.e. $\lambda$. This means that $E_H(\omega) \ne 0$ if $\mu' > 0$ a.e. on $\omega$.
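7. Trace Inequalities

Recall that we assume that $V$ is not a potential but the operator $\sum_{j=0}^n P_j\, V \sum_{j=0}^n P_j$, which approximates $V$ for large $n$. As before we substitute the terms $-\Delta_\theta / r^2$ and $\alpha_d / r^2$ on $(1, \infty)$ by “compactly supported” approximations $-\zeta_\varepsilon(r) \Delta_\theta / r^2$ and $\zeta_\varepsilon(r) \alpha_d / r^2$, where $\zeta_\varepsilon \in C_0^\infty(1, \infty)$ and $\zeta_\varepsilon(r) / r^2 \to 1 / r^2$ in $L^1(1, \infty)$ and $L^2(1, \infty)$ as $\varepsilon \to 0$. Then the coefficient $a(k)$ introduced in (5.3) will depend on $\varepsilon$ and we shall write $a_\varepsilon(k)$ instead of $a(k)$. From (5.2) and (4.3) we find that
$$\exp(-ikr) \psi_k(r) = 1 - \frac{1}{2ik} \int_r^\infty \big(1 - e^{2ik(s - r)}\big) \big(\zeta_\varepsilon(s) \alpha_d / s^2 + V_1(s)\big)\, ds + o(1/k)$$
and thus
$$a_\varepsilon(k) = \lim_{r \to -\infty} \exp(-ikr) \psi_k(r) = 1 - \frac{1}{2ik} \int_0^\infty \big(\zeta_\varepsilon(r) \alpha_d / r^2 + V_1\big)\, dr + o(1/k), \tag{7.1}$$
as $k \to \infty$. Now let $i\beta_m$ and $\gamma_j$ be zeros and poles of $a_\varepsilon(k)$ in the open upper half plane. Note that $-\bar\gamma_j$ are also poles of $a_\varepsilon(k)$ (this will follow from (7.5)). We shall show in Proposition 7.1 that $\{-\beta_m^2\}$ are the eigenvalues of a certain self-adjoint operator of Schrödinger type. Therefore we choose $\beta_m > 0$. Let $B$ be the corresponding Blaschke product
$$B(k) = \prod_m \frac{(k - i\beta_m)}{(k + i\beta_m)} \prod_j \frac{(k - \bar\gamma_j)}{(k - \gamma_j)}.$$
Clearly $|B(k)| = 1$, $\overline{B(k)} = B(-k)$, $k \in \mathbb{R}$, and we obtain
$$\int_{-\infty}^{+\infty} \log\big(a_\varepsilon(k) / B(k)\big)\, dk = \frac{\pi}{2} \int_0^\infty \big(\zeta_\varepsilon(r) \alpha_d / r^2 + V_1(r)\big)\, dr + 2\pi \Big( \sum_m \beta_m - \sum_j \mathrm{Im}\, \gamma_j \Big), \tag{7.2}$$
provided that for some integer $l \ge 0$ the coefficient $a_\varepsilon(k)$ has an expansion $a_\varepsilon(k) = \sum_{j \ge -l} c_j k^j$ at zero. The existence of such an expansion as well as the condition $|a_\varepsilon(k)| - 1 = O(1/|k|^2)$ as $k \to \pm\infty$ will be proven in the Appendix. Let $P = \sum_{j=0}^n P_j$ and let $\hat H_\varepsilon$ be the operator in $L^2(\mathbb{R}, P L^2(S^{d-1}))$ such that
$$\hat H_\varepsilon u = -\frac{d^2 u}{dr^2} - \zeta_\varepsilon \frac{\Delta_\theta u}{r^2}, \qquad (I - P_0) u(1, \cdot) = 0, \qquad u(r, \cdot) \in P L^2(S^{d-1}), \ \forall r, \tag{7.3}$$
where $\zeta_\varepsilon$ is the same as above.

Proposition 7.1. Each $-\beta_m^2$ is one of the eigenvalues $-\beta_m^2(V)$ of the operator $\hat H_\varepsilon + V$. Moreover,
$$\int_{-\infty}^{+\infty} \log |a_\varepsilon(k)|\, dk \le 2\pi \Big( \sum_m \beta_m(V) + \sum_m \beta_m(-V) \Big) + \pi \int_0^\infty \frac{\zeta_\varepsilon(r) \alpha_d}{r^2}\, dr. \tag{7.4}$$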
Proof. Obviously, if $s < c_1 < c_2 < r$, then the kernel of the operator $P_0 (\hat H_\varepsilon + V - z)^{-1} P_0$ equals
$$g(r, s, k) = -\frac{\exp ik(r - s)}{2ik\, a_\varepsilon(k)}. \tag{7.5}$$
The proof of the latter relation is a counterpart of the proof of (5.4). On the other hand we can consider the expansion of $g$ near the eigenvalue $-\beta_m^2$. Denote by $\phi_{m,j}(r, \theta)$, $j = 1, 2, \dots, n$, the orthonormal system of eigenfunctions corresponding to $-\beta_m^2$. If $\phi_{m,j}^{(0)} = \int_{S^{d-1}} \phi_{m,j}(r, \theta)\, d\theta$, then
$$g(r, s, k) = \frac{\sum_{j=1}^n \phi_{m,j}^{(0)}(r)\, \phi_{m,j}^{(0)}(s)}{k^2 + \beta_m^2} + g_0(r, s, k), \qquad s < c_1 < c_2 < r, \tag{7.6}$$
where $g_0(r, s, k) = O(1)$, as $k \to i\beta_m$. This proves that $a_\varepsilon(k)$ is a meromorphic function in the upper half plane and its zeros correspond to the eigenvalues $-\beta_m^2$ of the operator $\hat H_\varepsilon + V$. Comparing (7.5) and (7.6) we find that the multiplicities of these zeros are equal to one. For the latter arguments see [18]. Taking into account the estimate $|a_\varepsilon(k)| \ge 1$, we obtain the statement of the proposition if we add to (7.2) the same identity with $-V$ instead of $V$.

Observe that when $\varepsilon \to 0$ the eigenvalues of $\hat H_\varepsilon + V$ converge to the eigenvalues of the operator $\hat H + V$, where $\hat H$ is the following operator in $L^2(\mathbb{R}, L^2(S^{d-1}))$:
$$\hat H u = -\frac{d^2 u}{dr^2} + \frac{1}{r^2} (-\Delta_\theta u + \alpha_d u), \qquad (I - P_0) u(1, \cdot) = 0.$$
Denote the eigenvalues of $\hat H + V$ by $-(\beta_m^{(0)})^2$, where $\beta_m^{(0)} > 0$, and let $\hat V$ be the Fourier transform of $V$ with respect to the first variable as in Theorem 2.2.

Proposition 7.2. For any $\delta > 0$ there is a constant $C = C(\delta, \|V\|_\infty) > 0$ such that
$$\sum_m \beta_m^{(0)} \le C \Big( \int_{\mathbb{R}^d} V^4\, dx + \int_{\mathbb{R}^{d-1}} \int_{-\delta}^{\delta} |\hat V(\xi, y)|^2\, d\xi\, dy + \|V(x)\|_\infty^{1/2} \Big). \tag{7.7}$$
Proof. For any self-adjoint operator $T$ and $t > 0$ denote $N(t, T) = \operatorname{rank} E_T(-\infty, -t)$. Then
$$\sum_m \beta_m^{(0)} = \int_0^{\|V\|_\infty} N(t, \hat H + V)\, \frac{dt}{2\sqrt{t}} \le \int_0^{\|V\|_\infty} \big(1 + N(t, \hat H_D + V)\big)\, \frac{dt}{2\sqrt{t}} = \operatorname{tr} (\hat H_D + V)_-^{1/2} + \|V\|_\infty^{1/2},$$
where
$$\hat H_D u = -\frac{d^2 u}{dr^2} - \frac{\Delta_\theta u}{r^2} + \frac{\alpha_d}{r^2} u, \qquad u(1, \cdot) = 0.$$
Let $-\Delta + V$ be the operator in $L^2(\mathbb{R}^d)$. Then the mini-max principle tells us that
$$\sum_m \beta_m^{(0)} \le \operatorname{tr} (-\Delta + V)_-^{1/2} + \|V\|_\infty^{1/2}. \tag{7.8}$$
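The integral representation used in this proof is a layer-cake identity. The sketch below assumes a self-adjoint operator $T$ whose negative spectrum consists of eigenvalues $-\beta_m^2$ with $\beta_m > 0$; the counting function $N(t, T)$ is the one defined in the proof.

```latex
% Assumption: the negative eigenvalues of T are -\beta_m^2 (\beta_m > 0, counted with
% multiplicity), so that N(t,T) = \#\{m : \beta_m^2 > t\}.
\begin{align*}
\sum_m \beta_m
  &= \sum_m \int_0^{\beta_m^2} \frac{dt}{2\sqrt{t}}
   = \int_0^{\infty} \#\{m : \beta_m^2 > t\}\,\frac{dt}{2\sqrt{t}}
   = \int_0^{\infty} N(t,T)\,\frac{dt}{2\sqrt{t}}, \\
\int_0^{\|V\|_\infty} \frac{dt}{2\sqrt{t}} &= \|V\|_\infty^{1/2} .
\end{align*}
```

If $T = T_0 + V$ with $T_0 \ge 0$ (an assumption on the unperturbed operator), then $N(t, T) = 0$ for $t > \|V\|_\infty$, which is why the integration can stop at $\|V\|_\infty$; the second line accounts for the additive term $\|V\|_\infty^{1/2}$.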
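Applying the Lieb-Thirring inequality for operator valued potentials (see [12]) and Theorem 3.1 we obtain
$$\begin{aligned}
\operatorname{tr} (-\Delta + V)_-^{1/2} &\le C \int_{\mathbb{R}^{d-1}} \operatorname{tr} \Big( -\frac{d^2}{ds^2} + V(s, y) \Big)_-^{d/2} dy \\
&\le C_0 \int_{\mathbb{R}^{d-1}} \operatorname{tr} \Big( -\frac{d^2}{ds^2} + V(s, y) \Big)_-^{3/2} dy \\
&\le C_1 \Big( \int_{\mathbb{R}^d} V^4\, dx + \int_{\mathbb{R}^{d-1}} \int_{-\delta}^{\delta} |\hat V(\xi, y)|^2\, d\xi\, dy \Big),
\end{aligned}$$
where $C_0 = C(\|V\|_\infty)$ and $C_1 = C(\delta, \|V\|_\infty)$. The latter inequality together with (7.8) implies (7.7).

Now the trace formula (7.4) and the inequality (7.7) lead us to
$$\limsup_{\varepsilon \to 0} \int_{-\infty}^{+\infty} \log |a_\varepsilon(k)|\, dk \le C \Big( \int_{\mathbb{R}^d} V^4\, dx + \int_{\mathbb{R}^{d-1}} \int_{-\delta}^{\delta} |\hat V(\xi, y)|^2\, d\xi\, dy + \|V\|_\infty^{1/2} + 1 \Big). \tag{7.9}$$
For a perturbation $V$ satisfying the conditions of Theorem 2.2 the Weyl function $M$ can also be defined as $M(k) = \frac{\partial^2}{\partial r \partial s} G_z(r, s)|_{(1,1)}$, where $G_z$ is the integral kernel of the operator $P_0 (U^* H U - z)^{-1} P_0$. For any pair of finite numbers $r_2 > r_1 \ge 0$ and for $V \in C_0^\infty(\mathbb{R}^d \setminus \Omega_1)$ it follows from Corollary 5.3 [13] that
$$\frac{1}{2} \int_{r_1}^{r_2} \log \Big( \frac{k}{4\, \mathrm{Im}\, M(k)} \Big)\, dk \le \limsup_{\varepsilon \to 0} \int_{-\infty}^{+\infty} \log |a_\varepsilon(k)|\, dk. \tag{7.10}$$
Therefore (7.9) and (7.10) imply

Proposition 7.3. For any pair of finite numbers $r_2 > r_1 \ge 0$ and for $V \in C_0^\infty(\mathbb{R}^d \setminus \Omega_1)$,
$$\frac{1}{2} \int_{r_1}^{r_2} \log \Big( \frac{k}{4\, \mathrm{Im}\, M(k)} \Big)\, dk \le C \Big( \int_{\mathbb{R}^d} V^4\, dx + \int_{\mathbb{R}^{d-1}} \int_{-\delta}^{\delta} |\hat V(\xi, y)|^2\, d\xi\, dy + \|V\|_\infty^{1/2} + 1 \Big), \tag{7.11}$$
where $C = C(\delta, \|V\|_\infty)$.

8. The End of the Proof of Theorem 2.2

Let $Q = [0, 1)^d$. The cubes $Q_m = Q + m$, $m \in \mathbb{Z}^d$, form a partition of $\mathbb{R}^d$ to which we associate classes of functions $u$ such that the sequence of (quasi-)norms $\|u\|_{L^p(Q_m)}$, $p > 0$, belongs to $\ell^\infty$. These classes are denoted by $\ell^\infty(\mathbb{Z}^d; L^p(Q))$. It is clear that (2.1) implies
$$V \in \ell^\infty(\mathbb{Z}^d; L^p(Q)), \qquad p > d, \tag{8.1}$$
and therefore by [6] it guarantees the boundedness of the operator $\sqrt{|V|}\, (-\Delta + 1)^{-1/2}$. The next proposition allows us to approximate $V$ by compactly supported smooth functions $V_n$.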
Proposition 8.1. Let $V$ satisfy the conditions of Theorem 2.2. Then there exists a sequence $V_n$ of compactly supported smooth functions converging to $V$,
$$\int |V_n|^4\, dx < C(V), \qquad \|V_n\|_\infty < C(V) \tag{8.2}$$
and
$$\int_{\mathbb{R}^{d-1}} \int_{-\delta/2}^{\delta/2} |\hat V_n(\xi, y)|^2\, d\xi\, dy < C(V) \tag{8.3}$$
such that the Weyl functions $M_n$ corresponding to $V_n$ converge uniformly to $M(k)$ when $k^2$ belongs to any compact subset of the upper half plane. Therefore the sequence of measures $\mu_n$ converges weakly to the spectral measure $\mu$.

Proof. Let $W_\pm = \sqrt{V_\pm}$. Since the class $C_0^\infty$ is dense in $L^p$ for any $p > 0$, we can find a pair of sequences $W_n^- \in C_0^\infty$ and $W_n^+ \in C_0^\infty$ satisfying
$$W_n^\pm \to W_\pm \ \text{in } L^8(\mathbb{R}^d); \qquad W_n^\pm \to W_\pm \ \text{in } \ell^\infty(\mathbb{Z}^d; L^p(Q)), \quad 2p > d. \tag{8.4}$$
Introduce a sequence of functions $\{V_n\}_{n=1}^\infty$, $V_n = (W_n^+)^2 - (W_n^-)^2$. The sequences $W_n^\pm$ can be chosen so that
$$\int_{\mathbb{R}^{d-1}} \int_{-\delta/2}^{\delta/2} |\hat V_n(\xi, y)|^2\, d\xi\, dy < C(V).$$
Then $V_n \in C_0^\infty$ and the relations (8.2), (8.4) hold true. Let
$$S_0 u = -\frac{d^2 u}{dr^2} - \frac{\Delta_\theta u}{r^2}, \qquad u(1, \theta) = 0 \tag{8.5}$$
acting in $L^2\big((1, \infty); L^2(S^{d-1})\big)$. Suppose now that $\Gamma_0(z)$ and $\Gamma_n(z)$ are the resolvent operators of $S_0$ and $S_0 + V_n$ respectively. Recall that by $\delta_1'$ we denote the derivative of the delta function $\delta(r - 1)$. The expression $\Gamma_0(z) \delta_1'$, $\mathrm{Im}\, z \ne 0$, can be understood as an exponentially decaying function (Hankel’s function) which coincides with the corresponding solution of the equation
$$-\frac{d^2 \psi}{dr^2} + \frac{\alpha_d}{r^2} \psi = z \psi, \qquad \psi(1) = -1. \tag{8.6}$$
According to assumptions (8.4) we have that $W_n^\pm \Gamma_0(z) \delta_1' \to W_\pm \Gamma_0(z) \delta_1'$ in $L^2(\mathbb{R}^d)$. Thus in order to prove that the Weyl functions
$$M_n(k) = \frac{\partial^2}{\partial r \partial s} G_{n,z}(r, s) \Big|_{(1,1)} = (\Gamma_n(z) \delta_1', \delta_1') = (\Gamma_0(z) \delta_1', \delta_1') - \big( (W_n^+ - W_n^-) \Gamma_0(z) \delta_1', (W_n^+ + W_n^-) \Gamma_n(z) \delta_1' \big)$$
converge, it is sufficient to show that
$$(W_n^+ + W_n^-) \Gamma_n(z) \delta_1' \to (W_+ + W_-) (S_0 + V - z)^{-1} \delta_1' \quad \text{in } L^2(\mathbb{R}^d). \tag{8.7}$$
Let us denote $W_n = W_n^+ + W_n^-$ and $W_n^{(0)} = W_n^+ - W_n^-$. Clearly, if $W_n^\pm \to W_\pm$ in the class (8.1) with $2p > d$, as $n \to \infty$, then
$$W_n \Gamma_0(z) W_n^{(0)} \to (W_+ + W_-) \Gamma_0(z) (W_+ - W_-) \tag{8.8}$$
in the operator norm topology. Then (8.7) follows from the identity
$$W_n \Gamma_n(z) \delta_1' = \big(I + W_n \Gamma_0(z) W_n^{(0)}\big)^{-1} W_n \Gamma_0(z) \delta_1'.$$
Similarly we can prove the following result which allows us to pass from $\sum_{j=0}^l P_j\, V \sum_{j=0}^l P_j$ to $V$.

Proposition 8.2. Let $V$ be a compactly supported smooth function. Then the Weyl functions $M_l$ corresponding to the potential $\sum_{j=0}^l P_j\, V \sum_{j=0}^l P_j$ converge uniformly to $M$ when $k^2$ belongs to any compact subset $K$ of the upper half plane and therefore the sequence of measures $\mu_l$ converges weakly to the spectral measure $\mu$ constructed for the potential $V$.

Proof. Let us denote $V_l = \sum_{j=0}^l P_j\, V \sum_{j=0}^l P_j$, and let $\Gamma_0(z)$ and $\Gamma_l(z)$ be the resolvent operators of $S_0$ defined in (8.5) and of $S_0 + V_l$ respectively. As in the proof of Proposition 8.1, the expression $\Gamma_0(z) \delta_1'$, $\mathrm{Im}\, z \ne 0$, is understood as the exponentially decaying solution of Eq. (8.6). According to our assumptions
$$V_l \Gamma_0(z) \delta_1' = \sum_{j=0}^l P_j V \Gamma_0(z) \delta_1' \to V \Gamma_0(z) \delta_1'$$
in $L^2(\mathbb{R}^d)$. Thus in order to prove that the Weyl functions
$$M_l(k) = \frac{\partial^2}{\partial r \partial s} G_{l,z}(r, s) \Big|_{(1,1)} = (\Gamma_l(z) \delta_1', \delta_1') = (\Gamma_0(z) \delta_1', \delta_1') - (V_l \Gamma_0(z) \delta_1', \Gamma_l(z) \delta_1')$$
converge, it is sufficient to show that $\Gamma_l(z) \delta_1'$ converges to $(S_0 + V - z)^{-1} \delta_1'$ in $L^2(\mathbb{R}^d)$ uniformly on compact subsets $K$ of the complex plane. The latter follows from the identity
$$\begin{aligned}
\Gamma_l(z) \delta_1' &= (S_0 + V - z)^{-1} \delta_1' - \Gamma_l(z) (V_l - V) (S_0 + V - z)^{-1} \delta_1' \\
&= (S_0 + V - z)^{-1} \delta_1' + \Gamma_l(z) \Big(I - \sum_{j=0}^l P_j\Big) V (S_0 + V - z)^{-1} \delta_1' \\
&\quad + \Gamma_l(z) \sum_{i=0}^l P_i\, V \Big(I - \sum_{j=0}^l P_j\Big) (S_0 + V - z)^{-1} \delta_1'
\end{aligned}$$
and from the bound
$$\|\Gamma_l(z)\| \le \frac{1}{\mathrm{Im}\, z} \le C, \qquad z \in K.$$
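The identity used in this proof combines the second resolvent identity with an algebraic splitting of $V - V_l$. The sketch below uses shorthands chosen here for readability: $P$ for the finite sum of projections, and $R_l$, $R$ for the resolvents of $S_0 + V_l$ and $S_0 + V$.

```latex
% Shorthand (assumed here): P = \sum_{j=0}^{l} P_j, V_l = P V P,
% R_l = (S_0 + V_l - z)^{-1}, R = (S_0 + V - z)^{-1}.
\begin{align*}
R_l - R &= R_l\bigl[(S_0 + V - z) - (S_0 + V_l - z)\bigr] R
         = R_l\,(V - V_l)\,R, \\
V - P V P &= (I - P)\,V + P\,V\,(I - P), \\
R_l &= R + R_l\,(I - P)\,V\,R + R_l\,P\,V\,(I - P)\,R .
\end{align*}
```

Since $I - P$ tends to zero strongly as $l \to \infty$ and $\|R_l\|$ is bounded uniformly for $z \in K$, both correction terms are expected to vanish in the limit when applied to a fixed vector; this is the mechanism the proof relies on.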
Finally, according to inequality (7.11) and Propositions 8.1 and 8.2 we observe that there exists a sequence of measures $\mu_l$ weakly convergent to $\mu$, such that for any fixed $c > 0$,
$$\int_0^c \frac{\log(1/\mu_l'(t))\, dt}{(1 + t^{3/2}) \sqrt{t}} < C(V), \qquad \forall l,$$
where $C(V)$ is independent of $c$. Therefore due to the statement on the upper semi-continuity of an entropy (see [13]) we obtain
$$\int_0^\infty \frac{\log(1/\mu'(t))\, dt}{(1 + t^{3/2}) \sqrt{t}} < \infty.$$
The proof of Theorem 2.2 is complete.

9. Proof of Theorem 2.1

The proof reduces to the references [5, 2] and Theorem 2.2. Let $-\Delta$ be the Laplace operator in $L^2(\mathbb{R}^d)$. According to [5], if $V$ satisfies the conditions of Theorem 2.2, then
$$(-\Delta + V - z)^{-n} - (H + V - z)^{-n} \in S_1$$
for some $z$ and sufficiently large $n > 0$. The latter relation implies that $-\Delta + V$ and $H + V$ have the same a.c. spectrum. Now by Theorem 2.11 and Corollary 2.13 of [2], the a.c. spectrum does not change if we add to $V$ any real $L^\infty$-function $V_0$ with a finite support. Indeed, in this case
$$(-\Delta + V - z)^{-n} - (-\Delta + V + V_0 - z)^{-n} \in S_1$$
for some $z$ and sufficiently large $n > 0$. This proves Theorem 2.1.

10. Appendix

Here we show that $a_\varepsilon(k)$ appearing in (7.1) is a meromorphic function in a neighborhood of zero and $|a_\varepsilon(k)| = 1 + O(1/|k|^2)$, as $k \to \pm\infty$, which, in particular, means that $\log |a_\varepsilon(k)| \in L^1(\mathbb{R})$.

1. Let $P = \sum_{j=0}^n P_j$, $V = P V P$. Introduce matrices $A(k)$ and $B(k)$ defined in the space $P L^2(S^{d-1})$, such that the solution of the equation (for the matrix valued function $\Psi$)
$$-\frac{d^2 \Psi}{dr^2} + \frac{\zeta_\varepsilon}{r^2} \big(-\Delta_\theta + \alpha_d\big) \Psi + V \Psi = k^2 \Psi \tag{10.1}$$
satisfies the following conditions:
$$\Psi = \exp(ikr) P \quad \text{for } r > c_2,$$
and
$$\Psi = \exp(ikr) A(k) + \exp(-ikr) B(k) \quad \text{for } r < c_1.$$
We shall see that A(k) and B(k) both have at most a simple pole at zero and therefore by (10.2) aε (k) could also have a pole at zero.
Proposition 10.1. The following relation holds true:
$$\frac{1}{a_\varepsilon(k)} P_0 = P_0 \big[ A(k) + (I - P_0) e^{-2ik} B(k) \big]^{-1} P_0. \tag{10.2}$$
Proof. Let $G(r, s, k)$ be the kernel of the operator $(\hat H_\varepsilon + V - z)^{-1} \chi_{c_1}$, where $\chi_{c_1}$ is the operator of multiplication by the characteristic function of $(1, c_1)$. Then
$$G(r, s, k) = \begin{cases} \Phi(r, k) Z_1(s, k), & \text{as } r < s < c_1, \\ -\Psi(r, k) Z_2(s, k), & \text{as } s < c_1, \ s < r. \end{cases}$$
Here $\Phi(r, k) = e^{-ikr} P_0 + k^{-1} \sin(k(r - 1)) (P - P_0)$ for $r < c_1$ and $\Psi(r, k) = e^{ikr} P$ for $r > c_2$. The matrices $Z_1(s, k)$ and $Z_2(s, k)$ are chosen such that $G(r, s, k)$ is continuous at the diagonal and
$$\lim_{r \to s - 0} G_r'(r, s, k) = \lim_{r \to s + 0} G_r'(r, s, k) + P.$$
The two latter equations are equivalent to
$$\begin{aligned}
\big[ e^{-ikr} P_0 + k^{-1} \sin(k(r - 1)) (P - P_0) \big] Z_1 + \big[ e^{-ikr} B(k) + e^{ikr} A(k) \big] Z_2 &= 0; \\
\big[ -ik e^{-ikr} P_0 + \cos(k(r - 1)) (P - P_0) \big] Z_1 + \big[ -ik e^{-ikr} B(k) + ik e^{ikr} A(k) \big] Z_2 &= P
\end{aligned} \tag{10.3}$$
and are uniquely solvable if and only if $k^2$ is not an eigenvalue of $\hat H_\varepsilon + V$. The first equation of the system (10.3) gives
$$Z_1 = -\Big[ e^{ikr} P_0 + \frac{k}{\sin(k(r - 1))} (P - P_0) \Big] \big[ e^{-ikr} B(k) + e^{ikr} A(k) \big] Z_2.$$
Therefore we obtain from the second equation of (10.3) that
$$\big[ ik P_0 - k \operatorname{ctg}(k(r - 1)) (P - P_0) \big] \big[ e^{-ikr} B(k) + e^{ikr} A(k) \big] Z_2 + \big[ -ik e^{-ikr} B(k) + ik e^{ikr} A(k) \big] Z_2 = P, \tag{10.4}$$
or equivalently
$$(P - P_0) \Big[ \big( -k \operatorname{ctg}(k(r - 1)) - ik \big) e^{-ikr} B(k) + \big( -k \operatorname{ctg}(k(r - 1)) + ik \big) e^{ikr} A(k) \Big] Z_2 + 2ik P_0 e^{ikr} A(k) Z_2 = P.$$
Obviously
$$-k \operatorname{ctg}(k(r - 1)) \pm ik = -\frac{k e^{\mp ik(r - 1)}}{\sin k(r - 1)}.$$
This implies
$$(P - P_0) \frac{-k}{\sin k(r - 1)} \big[ e^{-ik} B(k) + e^{ik} A(k) \big] Z_2 + 2ik P_0 e^{ikr} A(k) Z_2 = P.$$
Multiplying both sides of this identity by
$$-\frac{\sin k(r - 1)}{k} e^{-ik} (P - P_0) + \frac{e^{-ikr}}{2ik} P_0$$
we derive
$$P_0 Z_2(r, k) P_0 = (2ik)^{-1} e^{-ikr} P_0 \big[ A(k) + e^{-2ik} (P - P_0) B(k) \big]^{-1} P_0.$$
Finally, since
$$P_0 Z_2(r, k) P_0 = (2ik a_\varepsilon)^{-1} e^{-ikr} P_0,$$
we obtain (10.2).
2. In this subsection we adapt the argument from [18]. The solution $\Psi(r, k)$ of (10.1) satisfies the integral equation
$$\Psi(r, k) = e^{ikr} P - \int_r^\infty k^{-1} \sin k(r - s)\, V_*(s) \Psi(s, k)\, ds, \tag{10.5}$$
where $V_* = V - r^{-2} \zeta_\varepsilon P \Delta_\theta$. Denote $X(r, k) = e^{-ikr} \Psi(r, k) - P$. Then
$$X(r, k) = \int_r^\infty K(r, s, k)\, ds + \int_r^\infty K(r, s, k) X(s, k)\, ds, \tag{10.6}$$
where
$$K(r, s, k) = \frac{e^{2ik(s - r)} - 1}{2ik}\, V_*(s). \tag{10.7}$$
(10.8)
for all k with Im k ≥ 0 and all k with 1 < r ≤ s. Here and below · denotes the norm of an operator in P L2 (Sd−1 ). Solving the Volterra equation (10.6) we obtain the following convergent series: X(r, k) =
∞
···
m
K(rl−1 , rl , k) dr1 · · · drm .
m=1 r≤r1 ≤···≤rm l=1
From (10.8) we see that |X(r, k)| ≤ C2 (V∗ ) for all 1 < r. Obviously X(r, k) is an entire function in k. Inserting this estimate back into (10.6), we conclude that the inequality X(r, k) ≤ C3 (V∗ , n)(1 + |k|)−1
(10.9)
holds for all r with 1 < r and all k with Im k ≥ 0. If we rewrite (10.5) as follows: ∞ ∞ 1 1 ikr (r, k) = e P− V∗ (s) ds − V∗ (s)X(s, k) ds 2ik 2ik ∞ r ∞ r −ikr e + e2iks V∗ (s) ds + e2iks V∗ (s)X(s, k) dx , (10.10) 2ik r r
then the expressions in the brackets in the r.h.s. do not depend on $r$ for $r \le 1$. From (10.10) it follows that
$$A(k) = P - \frac{1}{2ik} \int_{-\infty}^{+\infty} V_*(s)\, ds - \frac{1}{2ik} \int_{-\infty}^{+\infty} V_*(s) X(s, k)\, ds, \tag{10.11}$$
$$B(k) = \frac{1}{2ik} \int_{-\infty}^{+\infty} e^{2iks} V_*(s)\, ds + \frac{1}{2ik} \int_{-\infty}^{+\infty} e^{2iks} V_*(s) X(s, k)\, ds. \tag{10.12}$$
Recall that $\operatorname{supp} \tilde V \subset (1, \infty)$. Thus for sufficiently large $|k|$ the smoothness of $V$ and (10.9) imply
$$\Big\| A(k) - P + \frac{1}{2ik} \int_{-\infty}^{+\infty} V_*(s)\, ds \Big\| \le C_4(V_*, n) |k|^{-2}, \qquad \mathrm{Im}\, k \ge 0, \tag{10.13}$$
$$\big\| e^{-2ik} B(k) \big\| \le C_5(V_*, n) |k|^{-2}, \qquad \mathrm{Im}\, k \ge 0. \tag{10.14}$$
Note that from (10.2), (10.13) and (10.14) we now obtain that $a_\varepsilon(k)$ is a meromorphic function in a neighborhood of zero and $|a_\varepsilon(k)|$ tends to 1 as $O(1/|k|^2)$ when $k \to \pm\infty$.

Acknowledgements. A.L. and O.S. are grateful for the partial support of the ESF European programme SPECT. S.N. would like to thank the Gustafsson Foundation which has allowed him to spend one month at the Royal Institute of Technology in Stockholm. This research was also partly supported by the KBN grant 5, PO3A/026/21.
Communicated by B. Simon
Commun. Math. Phys. 253, 633–644 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1200-x
On the Asymptotic Density in a One-Dimensional Self-Organized Critical Forest-Fire Model

J. van den Berg*, A.A. Járai**

Centrum voor Wiskunde en Informatica, P.O. Box 94079, 1090 GB Amsterdam, The Netherlands. E-mail: [email protected]; [email protected]

* Also at Vrije Universiteit Amsterdam.
** Current address: School of Mathematics and Statistics, Carleton University, 1125 Colonel By Drive, Ottawa, ON K1S 5B6, Canada. E-mail: [email protected]

Received: 15 December 2003 / Accepted: 21 April 2004
Published online: 20 October 2004 – © Springer-Verlag 2004

Abstract: Consider the following forest-fire model where the possible locations of trees are the sites of $\mathbb{Z}$. Each site has two possible states: ‘vacant’ or ‘occupied’. Vacant sites become occupied at rate 1. At each site ignition (by lightning) occurs at ignition rate λ, the parameter of the model. When a site is ignited, its occupied cluster becomes vacant instantaneously. In the literature similar models have been studied for discrete time. The most interesting behaviour occurs when the ignition rate approaches 0. It has been stated by Drossel, Clar and Schwabl (1993) that then (in our notation) the density of vacant sites (at stationarity) is of order 1/log(1/λ). Their argument uses a ‘scaling ansatz’ and is not rigorous. We give a rigorous and mathematically more natural proof for our version of the model, and point out how it can be modified for the model studied by Drossel et al. Our proof shows that regardless of the initial configuration, already after time of order log(1/λ) the density is of the above mentioned order 1/log(1/λ). We also obtain bounds on the cluster size distribution, showing that the scaling ansatz of Drossel et al. needs correction.

1. Introduction

Suppose each site of the lattice $\mathbb{Z}^d$ is either vacant or occupied by a tree. Vacant sites become occupied according to independent rate 1 Poisson processes. Also, lightning strikes at any site according to independent rate λ Poisson processes. Here λ > 0 is the parameter of the model. When a site is hit by lightning, its entire occupied cluster burns down, that is, becomes vacant. When d = 1, a process with the above description can be constructed in a standard way, by using a graphical representation; see e.g. Liggett (1985). For this, note that if we start with a configuration in which infinitely many sites on the negative and on the
positive half-line are vacant, there are, with probability 1, at each time t infinitely many sites (on both half-lines) that have remained vacant throughout the interval [0, t]. These sites ‘break the infinite line into finite pieces’, which enables a graphical representation mentioned above. When d ≥ 2, the existence of the infinite-volume process is not clear. (In principle, the state of a given site can be influenced by infinitely many Poisson events in finite time.) In the physics literature usually a different but closely related forest-fire model is studied. In that model, time is discrete, space is large but finite, and the fire does not spread instantaneously but at a finite speed. The most interesting object of study seems to be the limiting behaviour as, roughly speaking, the ignition rate goes to 0 and the speed of fire and the volume go to infinity, jointly, in an appropriate way. It is believed that this behaviour resembles, in some sense, that of statistical mechanics systems at criticality. This belief is partly supported by heuristic arguments and computer simulations, but very little has been proved rigorously. See Jensen (1998) for a general overview of and introduction to these and other so-called self-organised critical systems, and Schenk, Drossel and Schwabl (2002) for current insights on the forest-fire model, in particular for d = 2. A paper by Malamud, Morein and Turcotte (1998) compares the model with data from real forest fires. As to the one-dimensional case, one of the main statements in the paper by Drossel, Clar and Schwabl (1993) says that under the above mentioned asymptotics, the steady state probability that a site is vacant is (in our notation) of order 1/ log(1/λ). Their arguments leading to this statement are not rigorous (see our Remark at the end of Sect. 2.1). We give, for our model, a rigorous and, in our opinion, more natural proof of the above mentioned asymptotic behaviour of the density of vacancies. 
It turns out that (see Theorem 4), uniformly in the starting configuration, already after time of order log(1/λ) the density is of order 1/log(1/λ). In Sect. 3.2 we point out how our arguments can be adapted to the model of Drossel et al. We also derive results for the cluster size distribution (see Theorems 5 and 6), and show that the scaling ansatz given by Drossel et al. is incorrect for cluster sizes of the order of the relevant spatial scale (see the remark preceding Theorem 6). The ingredients of our proofs are fairly elementary but the way one has to combine them is quite subtle.

We believe that for each λ > 0 there is a unique invariant distribution for the above dynamics. We have not been able to prove this, but hope that our results and ideas can also be used to make progress on that problem.

Notation and terminology. Let $\Omega$ denote the set of all configurations $\omega \in \{0,1\}^{\mathbb{Z}}$ for which there are infinitely many positive and negative x’s with $\omega_x = 0$. For each $x \in \mathbb{Z}$ let $B_x = \{b_1, b_2, \dots\} \subset (0,\infty)$ and $I_x = \{i_1, i_2, \dots\} \subset (0,\infty)$ denote the birth and ignition times (respectively) at x. As said before, these correspond to the points of independent Poisson point processes with intensities 1 and λ respectively. We let $P_\lambda$ denote the probability measure governing B and I. Given B and I, and $\eta(0) \in \Omega$, let $\{\eta_x(t)\}_{(x,t)\in\mathbb{Z}\times[0,\infty)}$ denote the forest fire process with initial configuration $\{\eta_x(0)\}_{x\in\mathbb{Z}}$. We let $P^\xi_\lambda$ denote the probability measure governing the forest fire process with lightning density λ and initial configuration $\eta(0) = \xi \in \Omega$. For arbitrary $J \subset \mathbb{Z}$, we let $\mathcal{F}_J(s,t)$ denote the information about the births and ignitions during the time interval [s, t] in the set J. That is,
\[
\mathcal{F}_J(s,t) = \sigma(B_x \cap [s,t] : x \in J) \vee \sigma(I_x \cap [s,t] : x \in J), \qquad t \ge s \ge 0.
\]
When s equals 0 or $J = \mathbb{Z}$, we omit these symbols from the notation. In particular, we have the notation $\mathcal{F}(t) = \mathcal{F}_{\mathbb{Z}}(0,t)$.
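As an illustration of the dynamics just described, the following is a minimal simulation sketch; it is not part of the paper. It assumes a finite segment of `num_sites` sites with permanently vacant sites outside (a finite-volume stand-in for the process on the whole line), starts from the empty configuration, and draws events from the superposition of the birth and ignition Poisson processes. The function names and parameters are hypothetical.

```python
import random


def simulate_forest_fire(num_sites, lam, t_end, seed=0):
    """Run the forest-fire dynamics on sites 0..num_sites-1 up to time t_end.

    Assumptions (not from the paper): sites outside the segment stay vacant,
    and the initial configuration is empty. Returns a list of 0/1 states.
    """
    rng = random.Random(seed)
    eta = [0] * num_sites
    # Superposition of all birth (rate 1) and ignition (rate lam) processes.
    total_rate = num_sites * (1.0 + lam)
    t = rng.expovariate(total_rate)
    while t <= t_end:
        x = rng.randrange(num_sites)          # site of the next event
        if rng.random() < 1.0 / (1.0 + lam):  # the event is a birth
            eta[x] = 1                        # no effect if x is occupied
        elif eta[x] == 1:                     # ignition of an occupied site
            left = x
            while left > 0 and eta[left - 1] == 1:
                left -= 1
            right = x
            while right < num_sites - 1 and eta[right + 1] == 1:
                right += 1
            for y in range(left, right + 1):  # the whole cluster burns
                eta[y] = 0
        t += rng.expovariate(total_rate)
    return eta


def vacant_density(eta):
    """Fraction of vacant sites in a configuration."""
    return eta.count(0) / len(eta)
```

Calling `vacant_density(simulate_forest_fire(2000, 1e-3, 30.0))` gives an empirical vacancy density for one run; boundary effects of the finite segment mean it should only be compared qualitatively with the infinite-volume statements.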
We denote the time shift operators on the underlying probability space by $(\theta_s)_{s\ge 0}$.

To avoid confusion we make the following remark about our terminology: Occasionally we make statements like ‘x has an ignition at time t’ and ‘x burns at time t’. There is an essential difference between these two statements. The first means, formally, that $t \in I_x$, which informally says that site x is hit by lightning at time t. The site may be empty, in which case the lightning has no effect. The second statement says that x is occupied just before time t and becomes vacant at time t (by lightning at x or somewhere else in its occupied cluster). Finally, we note that by ‘x has a birth at time t’ we just mean that $t \in B_x$. If x was already occupied, this ‘birth’ has no effect.

2. Relevant Scales and the Blocking Property

2.1. Relevant space and time scales. For a set of sites $J \subset \mathbb{Z}$ and t > 0, define the events
\[
A_J(s,t) = \{\forall x \in J : B_x \cap [s,t] \neq \emptyset\} = \{\text{each } x \in J \text{ has a birth at some time in } [s,t]\},
\]
\[
B_J(s,t) = \{\exists x \in J : I_x \cap [s,t] \neq \emptyset\} = \{\text{ignition occurs at some } x \in J \text{ at a time in } [s,t]\}.
\]
We denote the complements of these events by $A^c_J(s,t)$ and $B^c_J(s,t)$, respectively. When s = 0, we simply write $A_J(t)$ and $B_J(t)$.

Definition. Assume $\lambda < \lambda_0 = (3\log 3)^{-1}$. We define n = n(λ) ≥ 2 as the positive integer satisfying
\[
n \log n \le \frac{1}{\lambda} < (n+1)\log(n+1). \tag{1}
\]
For convenience, we let n(λ) = 2 when $\lambda \ge \lambda_0$. In the rest of the paper we assume that n and λ are related as in the definition. It is easy to see that
\[
P_\lambda\bigl(A_{[0,n]}(\log n)\bigr) = (1 - e^{-\log n})^{n+1} = (1 - n^{-1})^{n+1} \in (C_1, 1 - C_1), \tag{2}
\]
and that for $0 < \lambda < \lambda_0$
\[
P_\lambda\bigl(B_{[0,n]}(\log n)\bigr) = 1 - (e^{-\lambda \log n})^{n+1} = 1 - e^{-\lambda(n+1)\log n} \in (C_2, 1 - C_2), \tag{3}
\]
for some constants $C_1, C_2 > 0$. This indicates that n and log n are the relevant space and time scales in the model. Even if we replace, in the above computation, the spatial scale by a constant multiple of n, the result is still bounded away from 0 and 1: for any α > 0, there are constants $0 < C_1(\alpha) < 1$ and $0 < C_2(\alpha) < 1$ such that for all $0 < \lambda < \lambda_0$,
\[
P_\lambda\bigl(A_{[0,\alpha n]}(\log n)\bigr) \in (C_1(\alpha), 1 - C_1(\alpha)), \qquad P_\lambda\bigl(B_{[0,\alpha n]}(\log n)\bigr) \in (C_2(\alpha), 1 - C_2(\alpha)). \tag{4}
\]
Remark. Throughout this paper, quantities like αn above are to be replaced by their integer parts when they refer to discrete quantities like spatial location or cluster size.
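The relation (1) between n and λ and the two probabilities in (2) and (3) are easy to evaluate numerically. The sketch below is an illustration only, with hypothetical function names; it finds n(λ) by a direct search and evaluates the closed-form expressions, so that one can see both probabilities staying away from 0 and 1 as λ decreases.

```python
import math

LAMBDA_0 = 1.0 / (3.0 * math.log(3.0))


def n_of_lambda(lam):
    """Integer n with n log n <= 1/lam < (n+1) log(n+1); n = 2 if lam >= LAMBDA_0."""
    if lam >= LAMBDA_0:
        return 2
    n = 2
    # Increase n while the upper inequality in (1) still fails.
    while (n + 1) * math.log(n + 1) <= 1.0 / lam:
        n += 1
    return n


def prob_all_births(n):
    """Closed form in (2): every site of [0, n] has a birth by time log n."""
    return (1.0 - 1.0 / n) ** (n + 1)


def prob_some_ignition(lam, n):
    """Closed form in (3): some site of [0, n] has an ignition by time log n."""
    return 1.0 - math.exp(-lam * (n + 1) * math.log(n))


if __name__ == "__main__":
    for lam in (1e-2, 1e-3, 1e-4):
        n = n_of_lambda(lam)
        print(lam, n, prob_all_births(n), prob_some_ignition(lam, n))
```

The linear search is adequate for moderate values of 1/λ; for very small λ a bisection on n would be the natural replacement.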
Remark. For the event B, we can also replace the time log n by β log n (and let $C_2$ depend not only on α but also on β). Note that we do not have similar flexibility in the time variable for the event A: for any β > 1 the probability of $A_{[0,n]}(\beta \log n)$ tends to 1 as n → ∞.

Remark. In the paper by Drossel, Clar and Schwabl, an analog of (4) alone is taken as sufficient support to conclude that n (in their notation: $s_{\max}$) is the ‘characteristic length scale’. This quantity is then explicitly inserted in the postulation of a ‘scaling ansatz’. This ansatz, combined with other computations (concerning the steady-state conditional probabilities of the local configuration near site 0, given that site 0 is vacant), leads to the earlier mentioned order 1/log(1/λ) for the density of vacant sites, and to results for the cluster size distribution. See the arguments between (7) and (9) in their paper. However, this ansatz is non-rigorous and, as we point out in Sect. 3 (see the remark above Theorem 6), even partly incorrect. Our rigorous proof for the asymptotic density of vacancies and for the cluster size distribution is very different from their arguments and uses properties much more subtle than (4) (see Lemmas 1 and 2 in Sect. 2.2). Further, we avoid the computations regarding the steady-state probabilities mentioned above (although these are interesting in themselves), so that our proof is in some sense more direct.

2.2. Blocking intervals. From (4) we see that if we start with a configuration in which [0, n] is empty, then with probability bounded away from 0, there is a site in [0, n] that remains vacant at each time s ∈ [0, log n] (consider the event $A^c_{[0,n]}(\log n)$). Such events are useful, because they imply that the half-lines to the left and to the right of [0, n] ‘do not communicate’, that is, no fire can pass through in either direction. Below we prove two technical lemmas concerning such and related ‘blocking intervals’.

For the proof of our main results, Theorems 4–6 in Sect. 3, we will need an initially vacant spatial interval of length of order n to maintain a certain blocking property during time β log n for some β > 1. Here we cannot simply require that there be a site that remains vacant during time [0, β log n], since as we said in the lines following (4), for each β > 1 we have $P_\lambda(A^c_{[0,n]}(\beta \log n)) \to 0$ as λ → 0. Instead, we are going to achieve the blocking property in a more subtle way, by constructing an event such that at each time s ∈ [0, β log n] there is at least one vacant site in [0, n]. As the first lemma shows, we can reach this goal by arguments involving suitable fires in space intervals of length $n^{\alpha}$ (with some α < 1). It is important here that we establish the blocking property irrespective of what happens outside the initially vacant interval. In the second lemma, we show that a vacant interval of length n is created in time O(log n) with probability bounded away from 0, regardless of the initial configuration. The combination of these lemmas allows us to create suitable blocking intervals after time 2 log n with reasonable probabilities, regardless of the initial configuration (Proposition 3 below). From this our main theorems follow quite easily.

For $J \subset \mathbb{Z}$, let $N_J(t_1, t_2)$ denote the event that no fire propagates from J to $J^c$ during the interval $[t_1, t_2]$. (Formally this is the event that there are no $s \in [t_1, t_2]$, $j \in J$, $k \in J^c$ and space interval I containing both j and k, with the properties that $\eta(s^-) \equiv 1$ on I, and j has an ignition at time s.)

Definition. For a segment $J \subset \mathbb{Z}$, we define the event
\[
H_J(s,t) = N_J(s,t) \cap \{\text{for all } u \in [s,t] \text{ there exists } x \in J \text{ with } \eta_x(u) = 0\},
\]
with $N_J(s,t)$ as above.
The complement of this event will be denoted by $H^c_J(s,t)$. Note that $H_J(s,t)$ implies that during [s, t] no fire propagates from the half-line left of (and including) the rightmost point of J to the complement of this half-line. A similar statement holds with ‘left’ and ‘right’ interchanged. When $H_J(s,t)$ occurs, we say that the segment J blocks during [s, t].

Lemma 1. (a) For any α > 0 there is a constant $C_3 = C_3(\alpha) > 0$, such that for all $0 < \lambda < \lambda_0$ and all initial configurations ξ with ξ ≡ 0 on [0, αn],
\[
P^\xi_\lambda\bigl(H_{[0,\alpha n]}((3/2)\log n)\bigr) > C_3.
\]
(b) For all α > 0 and β > 0 there is a constant $C_4 = C_4(\alpha, \beta) > 0$, such that for all $0 < \lambda < \lambda_0$ and all initial configurations ξ with ξ ≡ 0 on [0, αn],
\[
P^\xi_\lambda\bigl(H_{[0,\alpha n]}(\beta \log n)\bigr) > C_4.
\]
(c) Above we can even replace $H_{[0,\alpha n]}(\beta \log n)$ by an event that implies it, and is in $\mathcal{F}_{[0,\alpha n]}(\beta \log n)$. More precisely, for all α > 0 and β > 0 there is a constant $C_4 = C_4(\alpha, \beta) > 0$, such that for all $0 < \lambda < \lambda_0$ there is an event $\hat H_{[0,\alpha n]}(\beta \log n) \in \mathcal{F}_{[0,\alpha n]}(\beta \log n)$ such that
\[
\{\eta(0) \equiv 0 \text{ on } [0,\alpha n]\} \cap \hat H_{[0,\alpha n]}(\beta \log n) \subset H_{[0,\alpha n]}(\beta \log n), \qquad\text{and}\qquad P_\lambda\bigl(\hat H_{[0,\alpha n]}(\beta \log n)\bigr) > C_4.
\]

Remark. Note that parts (b) and (c) with β ≤ 1 are trivial; they follow immediately from (4) and the fact that $A^c_{[0,\alpha n]}(\log n) \cap B^c_{[0,\alpha n]}(\log n)$ implies $H_{[0,\alpha n]}(\log n)$. The difficulty is to prove (b) and (c) for some β > 1. That is part (a) of the lemma. We will see that once we have part (b) for some β > 1, it follows quite easily for β + 3/4, and hence for all positive β.

Proof of Lemma 1. We first give the proof of part (a). Let $J_1 = [0, \alpha n/4)$, $J_2 = [\alpha n/4, 3\alpha n/4)$, $J_3 = [3\alpha n/4, \alpha n]$. Here we assume, without loss of generality, that αn is sufficiently large, so that subdivision makes sense. Subdivide $J_2$ into $\alpha n^{1/4}$ segments of length $n^{3/4}$, denoted $K_1, K_2, \dots$. Consider the following events (i)–(v):

(i) There is no ignition in $J_1 \cup J_3$ before time (3/2) log n. In formal notation, this event is $B^c_{J_1 \cup J_3}((3/2)\log n)$.

(ii) The intervals $J_1$ and $J_3$ do not try to fill before time log n. More precisely, $A^c_{J_1}(\log n) \cap A^c_{J_3}(\log n)$.

(iii) There is no ignition in $J_2$ before time (3/4) log n. That is, $B^c_{J_2}((3/4)\log n)$.

(iv) At least one of the blocks $K_i$ has the following three properties: it tries to fill before time (3/4) log n; it has an ignition between times (3/4) log n and (7/8) log n; and it does not try to fill in the interval ((3/4) log n, (3/2) log n]. More formally, this is the event
\[
\bigcup_{i=1}^{\alpha n^{1/4}} \Bigl[ A_{K_i}\bigl(\tfrac34 \log n\bigr) \cap B_{K_i}\bigl(\tfrac34 \log n, \tfrac78 \log n\bigr) \cap A^c_{K_i}\bigl(\tfrac34 \log n, \tfrac32 \log n\bigr) \Bigr].
\]

(v) There is no ignition in $J_2$ between time (7/8) log n and time (3/2) log n. That is, $B^c_{J_2}((7/8)\log n, (3/2)\log n)$.

Now we will show that if each of the events (i)–(v) occurs, then the event in part (a) of the lemma occurs. First of all, events (i), (ii) and (v) guarantee that no fire propagates from [0, αn] to its complement during the time [0, (3/2) log n]. Further, let $K_i$ be a block with the three properties mentioned in (iv). Its first property, together with events (i), (ii) and (iii), ensures that $K_i$ is indeed fully occupied at time (3/4) log n. Its second property then ensures that at some time between (3/4) log n and (7/8) log n it becomes completely vacant. This, together with its third property, then guarantees that some site in $K_i$ remains vacant during [(7/8) log n, (3/2) log n]. Finally, this last property of $K_i$ together with (ii) ensures that at each time in [0, (3/2) log n] some site in [0, αn] is vacant.

By independence of (i)–(v), it now suffices to show that for given α > 0 each of the events (i)–(v) has probability bounded away from 0, uniformly in λ. For the events (i)–(iii) and (v), this follows easily from (4). The same computations which led to (4) show that for each i the probability that $K_i$ has the first and third property in event (iv) is larger than some constant $c_1 > 0$. The probability that it has the second property is $1 - \exp(-\lambda n^{3/4} (1/8) \log n)$, which by (1) and some elementary computations is larger than or equal to $c_2 n^{-1/4}$, where $c_2$ is a positive constant. So, if $X_i$ is the indicator of the event that $K_i$ has the three properties mentioned above, then, since we have $\alpha n^{1/4}$ blocks, the expectation of the sum of the $X_i$’s is at least $\alpha n^{1/4} c_1 c_2 n^{-1/4} = \alpha c_1 c_2$. Since the $X_i$’s are independent, the probability that at least one $X_i$ equals 1, and hence that event (iv) occurs, is therefore larger than some constant $c_3(\alpha)$. This completes the proof of part (a) of the lemma.

Now we prove part (c), which clearly implies part (b).
For β = 3/2, and hence for all β ≤ 3/2, we already know that part (c) holds: take for $\hat H_{[0,\alpha n]}(0, (3/2)\log n)$ the intersection of the events (i)–(v) in the proof of part (a). Now suppose part (c) holds for some β ≥ 3/2. We will show that it then also holds for β + 3/4. Let, as above, $J_1 = [0, \alpha n/4)$, $J_2 = [\alpha n/4, 3\alpha n/4)$, $J_3 = [3\alpha n/4, \alpha n]$. Consider the following events (I)–(V):

(I) $\hat H_{J_1}(\beta \log n) \cap \hat H_{J_3}(\beta \log n)$.

(II) Each site in the interval $J_2$ has a birth before time log n, that is, $A_{J_2}(\log n)$.

(III) The interval $J_2$ has no ignition before time (β − 1/4) log n, but does have an ignition between times (β − 1/4) log n and (β − 1/8) log n. That is, $B^c_{J_2}((\beta - 1/4)\log n) \cap B_{J_2}((\beta - 1/4)\log n, (\beta - 1/8)\log n)$.

(IV) $J_2$ does not try to fill during ((β − 1/4) log n, (β + 3/4) log n). That is, $A^c_{J_2}((\beta - 1/4)\log n, (\beta + 3/4)\log n)$.

(V) There are no ignitions in [0, αn] during (β log n, (β + 3/4) log n). That is, $B^c_{[0,\alpha n]}(\beta \log n, (\beta + 3/4)\log n)$.

With very similar (and even somewhat simpler) arguments as in the proof of part (a), one can show that the events (I)–(V) imply $H_{[0,\alpha n]}((\beta + 3/4)\log n)$: event (I), together with (II) and (III), ensures that $J_2$ is completely occupied at time log n and becomes vacant at some time between (β − 1/4) log n and (β − 1/8) log n. This, with (IV), implies that some site in $J_2$ remains vacant during (β log n, (β + 3/4) log n). Finally, this, together with (I) and (V), implies that indeed $H_{[0,\alpha n]}((\beta + 3/4)\log n)$ occurs. Since the events (I)–(V) are $\mathcal{F}_{[0,\alpha n]}((\beta + 3/4)\log n)$-measurable, we can define the desired $\hat H_{[0,\alpha n]}((\beta + 3/4)\log n)$ as the intersection of these events. Since the five events are independent (note that here we use that the events $\hat H_{J_1}(\beta \log n)$ and $\hat H_{J_3}(\beta \log n)$ in (I) are $\mathcal{F}_{J_1}(\beta \log n)$-measurable and $\mathcal{F}_{J_3}(\beta \log n)$-measurable respectively), it remains to show that each of the events (I)–(V) has a probability that is larger than some positive constant which depends on α and β but not on λ. The probability of (I), again using the above mentioned measurability properties, is at least $(C_4(\alpha/4, \beta))^2$. Suitable lower bounds for the other events follow easily from (4). This completes the proof of part (c) and hence of part (b).

For the second lemma, define for $J \subset \mathbb{Z}$ the stopping time
\[
T_J = \inf\{t > 0 : \eta_x(t) = 0 \text{ for all } x \in J\}. \tag{5}
\]

Lemma 2. Let α > 0 and let $T = T_{[0,\alpha n]} \wedge T_{[\alpha n, 2\alpha n]} \wedge T_{[2\alpha n, 3\alpha n]}$.
(a) There exists $C_5 = C_5(\alpha) > 0$ such that for all $0 < \lambda < \lambda_0$ and all initial configurations ξ,
\[
P^\xi_\lambda(T \le 2\log n) > C_5.
\]
(b) We can even replace the event above by an $\mathcal{F}_{[0,3\alpha n]}(2\log n)$-measurable event. More precisely, there is an $\mathcal{F}_{[0,3\alpha n]}(2\log n)$-measurable event A = A(λ, α) with $A \subset \{T \le 2\log n\}$ and $P_\lambda(A) > C_5$.

Proof. Suppose each of the three events $B^c_{[0,3\alpha n]}(\log n)$, $A_{[\alpha n, 2\alpha n]}(\log n)$ and $B_{[\alpha n, 2\alpha n]}(\log n, 2\log n)$ occurs. We show this implies T ≤ 2 log n. Since each of these events is clearly $\mathcal{F}_{[0,3\alpha n]}(2\log n)$-measurable and by (4) has a probability bounded away from 0, this will prove the lemma. Let
\[
\sigma_L = \inf\{t > 0 : \text{site } \alpha n \text{ burns at time } t\}, \qquad \sigma_R = \inf\{t > 0 : \text{site } 2\alpha n \text{ burns at time } t\}.
\]
If $\sigma_L \wedge \sigma_R > \log n$, then, by this and the first of the three events above, no site in [αn, 2αn] burns before time log n and hence, by the second of the three events, this segment [αn, 2αn] is filled at time log n. Finally, by the third event, it will then burn completely down at some time between log n and 2 log n, so that we have $T \le T_{[\alpha n, 2\alpha n]} \le 2\log n$.

On the other hand, if $\sigma_L \wedge \sigma_R \le \log n$, there must, by the first of the three events, have been a fire before or at time log n from outside [0, 3αn] which reached the site αn or the site 2αn. So this fire has completely burnt the segment [0, αn] or the segment [2αn, 3αn] and hence T ≤ log n.

From Lemma 1 and Lemma 2 we obtain the following proposition.

Proposition 3. For each pair α, β > 0 there is a constant $C_6 = C_6(\alpha, \beta) > 0$ such that the following holds. Let m be a positive integer, and $K_i = [x_i, x_i + 3\alpha n]$, i = 1, . . . , m, disjoint segments in $\mathbb{Z}$. For all $0 < \lambda < \lambda_0$, all t > 2 log n, and any initial configuration ξ we have
\[
P^\xi_\lambda\Bigl( \bigcap_{i=1}^m H_{K_i}(t, t + \beta \log n) \Bigm| \mathcal{F}_{(\cup_{i=1}^m K_i)^c} \Bigr) > (C_6)^m, \tag{6}
\]
and
\[
P^\xi_\lambda\Bigl( \bigcup_{i=1}^m H_{K_i}(t, t + \beta \log n) \Bigm| \mathcal{F}_{(\cup_{i=1}^m K_i)^c} \Bigr) > 1 - (C_6)^m. \tag{7}
\]
Proof. For each i write $K_i$ as the union of three segments $K_i(j) = [x_i + (j-1)\alpha n, x_i + j\alpha n]$, j = 1, 2, 3. We look from time $t_0 = t - 2\log n > 0$. For $J \subset \mathbb{Z}$, let $\tau_J = \inf\{s > t_0 : \eta_x(s) = 0 \text{ for all } x \in J\} = \theta_{t_0}(T_J)$, with $T_J$ as in (5). Further, for 1 ≤ i ≤ m let
\[
\tau(i) = \tau_{K_i(1)} \wedge \tau_{K_i(2)} \wedge \tau_{K_i(3)}. \tag{8}
\]
We know from Lemma 2 that there is an $\mathcal{F}_{K_i}(t_0, t)$-measurable event $A(i) \subset \{\tau(i) \le t\}$ satisfying $P_\lambda(A(i)) > C_5(\alpha)$. From Lemma 1 we know that there is an $\mathcal{F}_{K_i(1)}((2+\beta)\log n)$-measurable event $\hat H_{K_i(1)}((2+\beta)\log n)$ such that $\hat H_{K_i(1)}((2+\beta)\log n) \cap \{\eta(0) \equiv 0 \text{ on } K_i(1)\} \subset H_{K_i(1)}((2+\beta)\log n)$ and $P_\lambda(\hat H_{K_i(1)}((2+\beta)\log n)) > C_4(\alpha, \beta)$, and we have similar events $\hat H_{K_i(2)}((2+\beta)\log n)$ and $\hat H_{K_i(3)}((2+\beta)\log n)$ for $K_i(2)$ and $K_i(3)$ respectively. Let $L_i$ be the minimizing segment in (8). Note that if τ(i) ≤ t and $\theta_{\tau(i)} \hat H_{L_i}((2+\beta)\log n)$ occurs, then $H_{L_i}(t, t + \beta\log n)$ occurs. Moreover, if also $B^c_{K_i \setminus L_i}(\tau(i), \tau(i) + (2+\beta)\log n)$ occurs, then $H_{K_i}(t, t + \beta\log n)$ occurs. The price to pay for the latter event is $(C_2(2\alpha))^{2+\beta}$, with $C_2$ as in (4). If, for each i, τ(i) would be a stopping time with respect to the filtration $(\mathcal{F}_{K_i}(s))_{s \ge 0}$, the above observations would immediately give that the left hand side of (6) is at least $(C_4(\alpha,\beta)\, C_5(\alpha)\, (C_2(2\alpha))^{2+\beta})^m$. However, τ(i) is not a stopping time with respect to that filtration but with respect to $(\mathcal{F}(s))_{s \ge 0}$. Nevertheless, with some more care one can, using fairly standard arguments, still obtain the above mentioned bound $(C_4(\alpha,\beta)\, C_5(\alpha)\, (C_2(2\alpha))^{2+\beta})^m$. Very similar arguments yield (7).

3. Main Results

3.1. Statement and proof of the main theorems. Recall the definitions of $\lambda_0$ and n = n(λ) in Sect. 2.1. We are now ready to prove our main results on the asymptotic density of vacant sites, and the cluster size distribution. The first of these results, Theorem 4 below, is formulated in such a way that the former restriction $\lambda < \lambda_0$ can be dropped.
Theorem 4. There exist constants $C_7, C_8 > 0$ such that for any initial configuration ξ, any λ > 0 and for all t > 3 log n + 1,
\[
\frac{C_7}{\log(1/\lambda) \vee 1} \le P^\xi_\lambda(\eta_0(t) = 0) \le \frac{C_8}{\log(1/\lambda) \vee 1}.
\]

Proof. We start with the lower bound and with the more interesting case $0 < \lambda < \lambda_0$. Let $t_0 = t - \log n - 1 > 2\log n$. It is clear that, to have $\eta_0(t) = 0$ it is sufficient that each of the following events occur: $H_{[-4n,-n]}(t_0, t-1)$, $H_{[n,4n]}(t_0, t-1)$, $A_{(-n,n)}(t_0, t-1)$, $B^c_{(-n,n)}(t_0, t-1)$, $B_{(-n,n)}(t-1, t)$ and $A^c_0(t-1, t)$. By Proposition 3 and (4), this has probability at least $(C_6)^2\, C_1(2)\, C_2(2)\, (1 - e^{-(2n-1)\lambda})\, e^{-1}$, which, by (1), gives the desired lower bound. For the case $\lambda \ge \lambda_0$, note that the event $\eta_0(t) = 0$ is implied by the event $A^c_0(t-1, t) \cap B_0(t-1, t)$. This has probability $e^{-1}(1 - e^{-\lambda}) \ge e^{-1}(1 - e^{-\lambda_0})$, completing the proof of the lower bound.

We continue with the proof of the upper bound. In the case $\lambda \ge \lambda_0$, the upper bound is trivial. For the case $0 < \lambda < \lambda_0$, we need the following claim.

Claim. There is a constant $c_1 > 0$ such that for all $0 < \lambda < \lambda_0$ and all t > 2 log n,
\[
P^\xi_\lambda(O \text{ burns at some time in } [t, t+1]) \le c_1/\log n. \tag{9}
\]

Proof of Claim. It is easy to check that, if the event in the claim happens, then there exists an integer k ≥ 0 such that the following events (i) and (ii) occur:
(i) We have $\{S(k) \le t + 1\}$, where $S(k) = \inf\{s \ge t : \eta(s) \equiv 1 \text{ on } [-4kn, 0] \text{ or } \eta(s) \equiv 1 \text{ on } [0, 4kn]\}$.
(ii) An ignition occurs in (−4(k + 1)n, 4(k + 1)n) at some time in [S(k), t + 1].
It is clear that given (i), the conditional probability that (ii) happens is bounded above by
\[
P_\lambda\bigl(B_{(-4(k+1)n,\,4(k+1)n)}(1)\bigr) \le 8\lambda(k+1)n \le \frac{8(k+1)}{\log n},
\]
where again we have used (1). Moreover, for fixed k the probability that (i) holds is at most $2(C_6)^k$ by (7) in Proposition 3. Combining these facts, the probability in the statement of the claim is at most
\[
\frac{1}{\log n} \sum_{k=0}^{\infty} 16\,(k+1)(C_6)^k,
\]
from which the claim follows.
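The series in the last bound can be summed in closed form, which gives one admissible value for the constant in (9). The following worked step is an illustration only: it writes q for the constant $C_6$ appearing in the bound $2(C_6)^k$ and assumes q < 1, as the convergence of the series requires.

```latex
\[
\sum_{k=0}^{\infty} (k+1)\,q^{k}
  = \frac{d}{dq} \sum_{k=0}^{\infty} q^{k+1}
  = \frac{d}{dq}\,\frac{q}{1-q}
  = \frac{1}{(1-q)^{2}},
\qquad\text{hence}\qquad
\frac{1}{\log n} \sum_{k=0}^{\infty} 16\,(k+1)\,q^{k}
  = \frac{16}{(1-q)^{2}\,\log n}.
\]
```

Under this assumption, $c_1 = 16/(1-q)^2$ is one possible choice of the constant in the claim.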
We continue the proof of the upper bound in Theorem 4. If $\eta_0(t) = 0$, then either $\eta_0(s) = 0$ for all s ∈ [t − (1/2) log n, t] or there is an integer k ∈ [0, (1/2) log n] such that O burns at some time in the interval [t − k − 1, t − k] and has no birth attempt in [t − k, t]. Hence, using the above claim,
\[
P^\xi_\lambda(\eta_0(t) = 0) \le \exp(-(1/2)\log n) + \frac{c_1}{\log n} \sum_{k:\, 0 \le k \le (1/2)\log n} \exp(-k),
\]
from which the upper bound in Theorem 4 follows immediately.
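For completeness, here is one way to pass from the last display to the stated upper bound in the case $0 < \lambda < \lambda_0$. The explicit constants are illustrative and not taken from the paper. The sketch uses the elementary inequality $\log n \le (2/e)\sqrt{n}$, a geometric series, and the relation (1) between n and λ together with n ≥ 2.

```latex
\begin{align*}
e^{-\frac12 \log n} &= n^{-1/2} \le \frac{2/e}{\log n},
\qquad
\sum_{k \ge 0} e^{-k} = \frac{e}{e-1}, \\
P^{\xi}_{\lambda}(\eta_0(t) = 0)
  &\le \frac{1}{\log n}\Bigl( \frac{2}{e} + \frac{c_1 e}{e-1} \Bigr), \\
\frac{1}{\lambda} &< (n+1)\log(n+1) \le (n+1)^2 \le n^4
\quad\Longrightarrow\quad
\log n \ge \tfrac14 \log(1/\lambda).
\end{align*}
```

Combining the last two lines gives the upper bound with, for instance, $C_8 = 4\,(2/e + c_1 e/(e-1))$ in this range of λ, where $\log(1/\lambda) \vee 1 = \log(1/\lambda)$.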
With the same type of arguments we get results for the cluster size. Let $C_x(t)$ denote the occupied cluster of vertex x at time t, and $|C_x(t)|$ its size.

Theorem 5. There is a constant $C_9 > 0$ such that for any initial configuration ξ, any $0 < \lambda < \lambda_0$, and all integers 1 ≤ k < n and all t > 4 log n,
\[
P^\xi_\lambda\{|C_0(t)| = k\} \ge \frac{C_9}{k \log n}. \tag{10}
\]

Proof. Instead of the vertex 0 we may take any other vertex, and in this proof we take (for notational convenience) the vertex 1. The proof is of the same flavour as that of the previous theorem. This time take $t_0 = t - 2\log n$. It is easy to see that for the occurrence of the event $\{\eta_0(t) = 0,\ \eta_{k+1}(t) = 0, \text{ and } \eta_x(t) = 1 \text{ for } x = 1, \dots, k\}$ it is sufficient that each of the following events occur:
• $H_{[-4n,-n)}(t_0, t) \cap H_{(n,4n]}(t_0, t)$, i.e. the intervals [−4n, −n) and (n, 4n] block during $[t_0, t]$;
• $A_{[-n,n]}(t_0, t - \log k - 1) \cap B^c_{[-n,n]}(t_0, t - \log k - 1)$, i.e. the interval [−n, n] fills up by time t − log k − 1;
• $B_{[-n,n]}(t - \log k - 1, t - \log k)$, i.e. there is a fire in [−n, n] between times t − log k − 1 and t − log k;
• $B^c_{[-n,n]}(t - \log k, t)$, i.e. there is no ignition in [−n, n] between times t − log k and t;
• $A_{[1,k]}(t - \log k, t) \cap A^c_{\{0,k+1\}}(t - \log k - 1, t)$, i.e. [1, k] fills up by time t but 0 and k + 1 remain vacant.
Using again Proposition 3 and (4) we see that the probability that all these events occur is bounded from below by a constant times the product of the probabilities of two of these events, namely $B_{[-n,n]}(t - \log k - 1, t - \log k)$ and $A^c_{\{0,k+1\}}(t - \log k - 1, t)$. The first of these two events has probability $1 - \exp(-(2n+1)\lambda)$, which is larger than $c_1/\log n$ for some constant $c_1 > 0$, and the second has probability $\exp(-2(\log k + 1)) = e^{-2}/k^2$. Hence
\[
P^\xi_\lambda\{\eta_0(t) = 0,\ \eta_{k+1}(t) = 0, \text{ and } \eta_x(t) = 1 \text{ for } x = 1, \dots, k\} > \frac{c_2}{k^2 \log n}, \tag{11}
\]
for some constant $c_2 > 0$. Finally, the event $\{|C_1(t)| = k\}$ is the union of k disjoint events of the above form, and hence has probability larger than $k\, c_2/(k^2 \log n) = c_2/(k \log n)$.

Remark. We believe that, when k is sufficiently small compared with n, the reverse of inequality (10) (with a different $C_9$) also holds. The earlier mentioned heuristic ‘scaling ansatz’ in the paper by Drossel et al. leads to the prediction that an upper bound of the form $C(k \log n)^{-1}$ even holds for k of order n. The same ansatz also leads to the prediction that the probability that the cluster size is larger than n goes to 0 as λ → 0. (See the computations and arguments between (8) and (10) in their paper, and recall that their quantity $s_{\max}$ corresponds with our quantity n.) However, although their arguments lead to the correct order of the density of vacant sites, the two above mentioned predictions are false, as our next theorem shows. (For this, note our Remark (ii) in the next subsection explaining that our results apply to their slightly different setup.)
Theorem 6. For all $0 < \alpha < \beta$ there is a constant $C_{10} = C_{10}(\alpha,\beta) \in (0,1)$ such that for any initial configuration $\xi$, any $0 < \lambda < \lambda_0$, and all $t > 3\log n$,
$$P^\xi_\lambda\{\alpha n \le |C_0(t)| \le \beta n\} \in (C_{10},\ 1 - C_{10}).$$
Remark. In a very similar way as below (and as in the proof of Theorem 4) we can also bound the tail of the cluster size distribution. We can show that there are $p_1, p_2 \in (0,1)$ such that for all $0 < \lambda < \lambda_0$, all $k \ge 1$ and all $t > 3\log n$,
$$p_1^k < P^\xi_\lambda\{|C_0(t)| > kn\} < p_2^k.$$
Proof. [Theorem 6]. Proof of lower bound. This proof is implicitly more or less already in the proof of the lower bound in Theorem 4, but for clarity we state it explicitly here: Let $t_0 = t - \log n$. Let $\gamma = (\beta-\alpha)/4$. Without loss of generality we may assume $\gamma n \ge 1$. (The case $\gamma n < 1$ can be easily handled afterwards by adapting the constant $C_{10}$.) Then it is easy to see that if the events $H_{[-\gamma n,-1]}(t_0,t)$, $H_{[\alpha n+1,(\alpha+\gamma)n]}(t_0,t)$, $A_{[0,\alpha n]}(t_0,t)$ and $B^c_{[0,\alpha n]}(t_0,t)$ occur, then $[0,\alpha n] \subset C_0(t) \subset [-\gamma n, (\alpha+\gamma)n]$. Hence, by the choice of $\gamma$, $\alpha n \le |C_0(t)| \le \beta n$. Now apply Proposition 3 and (4) as before. That completes the proof of the lower bound.
Proof of upper bound. If $H_{[-\alpha n/4,-1]}(t_0,t)$ and $H_{[1,\alpha n/4]}(t_0,t)$ both hold, then clearly $C_0(t) \subset [-\alpha n/4, \alpha n/4]$, and hence $|C_0(t)| \notin [\alpha n, \beta n]$. Now apply Proposition 3 again.
3.2. Remarks and discussion. (i) Note that Theorem 4 immediately implies that if $\mu$ is a distribution that is invariant under the dynamics, then
$$\frac{C_7}{\log(1/\lambda) \vee 1} \le \mu(\eta_0 = 0) \le \frac{C_8}{\log(1/\lambda) \vee 1}. \qquad (12)$$
For the special case where µ is also invariant under spatial translation, we have a considerably simpler proof of (12). (In particular, the proof of the lower bound in (12) then only needs a combination of the arguments in the proof of Lemma 2 and general stationarity arguments.) Since we do not have a proof that all stationary distributions are translation invariant, our present argument is needed. Furthermore, Theorem 4 is much stronger than (12). Its major ingredient, Proposition 3, which in turn is based on Lemmas 1 and 2, gives strong properties of the spatial and temporal dependencies in the process. We believe these properties will also be useful for other purposes, for instance for the study of the question whether the model has a unique stationary distribution.
(ii) As we wrote in the Introduction, our model is somewhat different from the one studied by Drossel et al. (1993). In that paper the fire propagates at a finite speed. In some more recent papers (see e.g. Schenk, Drossel and Schwabl (2002)) the speed is infinite, as in our model; that is, when a tree is hit by lightning, its occupied cluster becomes vacant instantaneously. Nevertheless we will point out in (a) and (b) below how a modification of our arguments also works for the original model of Drossel et al. (1993).
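The infinite-speed dynamics recalled in (ii) can be illustrated by a small simulation. The following is only a sketch under stated assumptions, not the construction used in the proofs: the forest is restricted to a finite segment of sites −N, ..., N, vacant sites become occupied at rate 1, every site is hit by lightning at rate λ, and a hit on an occupied site empties its whole occupied cluster at once. The function names, the seed and the uniformization trick (one event clock of total rate (2N+1)(1+λ)) are illustrative choices.

```python
import random


def simulate_forest_fire(N, lam, t_max, seed=0):
    """Run the forest-fire dynamics on sites -N..N until time t_max.

    Assumed rates: growth at rate 1 per vacant site, lightning at rate
    lam per site. Returns a list of 0/1 of length 2N+1, where index j
    corresponds to site j - N (0 = vacant, 1 = occupied).
    """
    rng = random.Random(seed)
    size = 2 * N + 1
    eta = [0] * size                      # start from the empty forest
    t = 0.0
    total_rate = size * (1.0 + lam)       # uniformized event rate
    while True:
        t += rng.expovariate(total_rate)
        if t >= t_max:
            break
        site = rng.randrange(size)
        if rng.random() < 1.0 / (1.0 + lam):
            eta[site] = 1                 # growth (no effect if occupied)
        elif eta[site] == 1:
            # Lightning on a tree: burn its whole occupied cluster.
            left = site
            while left >= 0 and eta[left] == 1:
                eta[left] = 0
                left -= 1
            right = site + 1
            while right < size and eta[right] == 1:
                eta[right] = 0
                right += 1
    return eta


def cluster_size(eta, idx):
    """Size of the occupied cluster containing index idx (0 if vacant)."""
    if eta[idx] == 0:
        return 0
    left = idx
    while left > 0 and eta[left - 1] == 1:
        left -= 1
    right = idx
    while right < len(eta) - 1 and eta[right + 1] == 1:
        right += 1
    return right - left + 1
```

Calling `simulate_forest_fire(N, lam, t_max)` for several seeds and averaging `1 - eta[N]` gives a rough empirical estimate of the vacancy probability at the origin for the finite forest; no numerical values are claimed here.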
(a) Another look at the proofs shows that the arguments and estimates leading to Theorem 4 are, in some sense, ‘local’: the births and ignitions outside the space interval $[-4n, 4n]$ ‘do not matter’. In particular this means the following. Suppose that instead of the infinite line we have a finite forest, with locations $-N, \dots, N$. The forest-fire process is then clearly a finite-state continuous-time Markov chain. It is easy to see that it is irreducible and hence has a unique invariant distribution, which we denote by $\mu_{\lambda,N}$. For all $\lambda > 0$ and $N > 4n$ we have
$$\frac{C_1}{\log(1/\lambda) \vee 1} \le \mu_{\lambda,N}(\eta_0(t) = 0) \le \frac{C_2}{\log(1/\lambda) \vee 1}.$$
(b) Apart from the above mentioned spatial locality, the arguments also have a locality in time. They essentially reduce to ‘controlling’ what happens in certain space-time blocks, with spatial length of order $n$ and time length of order $\log n$. If we modified our model and let the fire spread at some finite rate $\kappa$, a closer examination of our arguments shows that they still work when the time it takes a fire to move through the segment $[0, n]$ is typically $o(\log n)$. That is, when $n/\kappa \ll \log n$. This in turn is guaranteed if $\kappa \gg 1/\lambda$, which corresponds with the condition p p/f in the paragraph preceding (2) in Drossel et al. (1993).
(iii) Our theorems suggest that under stationarity, the cluster size distribution satisfies limit laws on two different scales, as $\lambda \to 0$. Theorems 5 and 6 suggest that $\log(|C_0| + 1)/\log n$ and $|C_0|/n$ both have non-trivial limit laws.
Acknowledgement. We thank Jeff Steif for stimulating discussions about these and related problems.
References
1. Drossel, B., Clar, S., Schwabl, F.: Exact Results for the One-Dimensional Self-Organized Critical Forest-Fire Model. Phys. Rev. Lett. 71, 3739–3742 (1993)
2. Jensen, H.J.: Self-Organized Criticality. Cambridge Lecture Notes in Physics, Cambridge: Cambridge University Press, 1998
3. Liggett, T.M.: Interacting Particle Systems. Berlin: Springer-Verlag, 1985
4. Malamud, B.D., Morein, G., Turcotte, D.L.: Forest Fires: An Example of Self-Organized Critical Behaviour. Science 281, 1840–1841 (1998)
5. Schenk, K., Drossel, B., Schwabl, F.: Self-organized critical forest-fire model on large scales. Phys. Rev. E 65, 026135-1–8 (2002)
Communicated by H. Spohn
Commun. Math. Phys. 253, 645–674 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1199-z
Communications in
Mathematical Physics
Formal Symplectic Groupoid
Alberto S. Cattaneo1, Benoit Dherin2, Giovanni Felder2
1 Institut für Mathematik, Universität Zürich–Irchel, Winterthurerstrasse 190, 8057 Zürich, Switzerland. E-mail: [email protected]
2 D-MATH, ETH-Zentrum, 8092 Zürich, Switzerland. E-mail: [email protected]; [email protected]
Received: 5 January 2004 / Accepted: 16 April 2004 Published online: 20 October 2004 – © Springer-Verlag 2004
Abstract: The multiplicative structure of the trivial symplectic groupoid over $\mathbb{R}^d$ associated to the zero Poisson structure can be expressed in terms of a generating function. We address the problem of deforming such a generating function in the direction of a non-trivial Poisson structure so that the multiplication remains associative. We prove that such a deformation is unique under some reasonable conditions and we give the explicit formula for it. This formula turns out to be the semi-classical approximation of Kontsevich’s deformation formula. For the case of a linear Poisson structure, the deformed generating function reduces exactly to the CBH formula of the associated Lie algebra. The methods used to prove existence are interesting in their own right as they come from an at first sight unrelated domain of mathematics: the Runge–Kutta theory of the numerical integration of ODEs.
1. Introduction
In this paper we give a formal version of the integration of Poisson manifolds by symplectic groupoids. The solution of this formal integration problem relies on the existence of a generating function for which we give here the explicit formula. This generating function turns out to be a universal Campbell–Baker–Hausdorff (CBH) formula for the non-linear case. It reduces to the usual CBH formula when the Poisson structure comes from a Lie algebra. This generating function can be interpreted as the semi-classical part of the Kontsevich deformation quantization formula. This fact reminds us of the origin of symplectic groupoids, which were first introduced by Weinstein in [6], Karasev in [11], and Zakrzewski in [18] as a tool to quantize the algebra of functions on a Poisson manifold. This section is devoted to recalling some basic features of the program of quantization by symplectic groupoid, to formulating the formal integration problem for Poisson
A.S.C. acknowledges partial support of SNF Grant No. 20-100029/1. B.D. and G.F. acknowledge partial support of SNF Grant No. 21-65213.01.
manifolds and to state the main theorem of this article which gives a positive answer to the formal integration problem.
1.1. Quantization by symplectic groupoid. The program of quantization by symplectic groupoid is an attempt to quantize the algebra of functions on Poisson manifolds by geometric means. It is based mainly on the belief or hope, coming from geometric quantization, that there should exist a kind of correspondence or dictionary between the world of symplectic manifolds (classical level) and the world of linear spaces (quantum level). This correspondence, as explained in [1], is summarized in the following table:

Symplectic world      Linear world
$M$                   $Q(M)$
$L \subset M$         $Q(L) \in Q(M)$
$\bar M$              $Q(\bar M) = Q(M)^*$
$M \times N$          $Q(M \times N) = Q(M) \otimes Q(N)$

Here $M$ is a symplectic manifold, $\bar M$ the same manifold with opposite symplectic structure, $L$ a Lagrangian submanifold, and $Q(M)$ a complex vector space. $Q$ stands for the “Quantization functor”. In particular, canonical relations, i.e., Lagrangian submanifolds of $\bar M \times N$, are sent by $Q$ to linear maps from $Q(M)$ to $Q(N)$. The main ingredient is the assumption that quantization is functorial, i.e., the composition of canonical relations should be sent to the composition of linear maps (see [16]). If such a quantization functor existed, we could ask the following question: To what kind of symplectic manifold should we associate an algebra (i.e., a vector space with an associative product)? Answering this question leads directly to the notion of symplectic groupoid, see [17].
Definition 1. A symplectic groupoid is a Lie groupoid $G$ (see [1] for a precise definition of a Lie groupoid) with a symplectic form $\omega$ for which the multiplication space $G^{(m)} = \{(x, y, x \bullet y) : x, y \in G \text{ are composable elements}\}$ is a Lagrangian submanifold of $G \times G \times \bar G$ ($\bar G$ being the symplectic manifold with symplectic form $-\omega$).
It can be shown (see [14]) that, given a symplectic groupoid $G$, there is an induced Poisson structure on the base space $G^{(0)}$. Conversely, given a Poisson manifold $P$ we call a symplectic groupoid over $P$ any symplectic groupoid $G$ such that the base space $G^{(0)}$ is diffeomorphic as a Poisson manifold to $P$. In this case we say that $G$ integrates $P$ and we call integrable Poisson manifolds the Poisson manifolds for which we can find such a $G$. Applying the “Quantization functor” $Q$ to the symplectic groupoid $G$, we should then get a vector space $Q(G)$ and an associative product $Q(G^{(m)})$ on it. The associativity of this product is guaranteed by the associativity of the groupoid multiplication and the functoriality of $Q$. These facts suggest the following procedure to quantize Poisson manifolds $P$:
Step 1. Find a symplectic groupoid $G$ such that the base $G^{(0)}$ is diffeomorphic to the Poisson manifold $P$.
Step 2. Quantize (geometric quantization,...) $G$ and $G^{(m)}$ to get the quantum algebra.
This is the idea of quantization by symplectic groupoid. Step 1 is known as the integrability problem and was recently completely settled. Coste, Dazord and Weinstein in [6] and independently Karasev in [11] showed the existence of a local symplectic groupoid over any Poisson manifold, “local” meaning that the multiplication is defined only on a neighborhood of the unit space. Cattaneo and Felder in [5] gave an explicit construction of a topological groupoid canonically associated to any Poisson manifold, which is a global symplectic groupoid whenever the Poisson structure is integrable. Crainic and Fernandes in [8] derived an if-and-only-if criterion which tells one when the previous construction yields a manifold. Step 2 however was only partially achieved (see [15]). If we compare this program with deformation quantization (see [2] and [13]), we see that starting with an integrable Poisson manifold P whose symplectic groupoid is G we should have the following relation between objects involved in these programs:
                        Deformation quantization      Quantization by symplectic groupoids
Semi-classical level    ?                             $(G, G^{(m)})$
Quantum level           $(C^\infty(P)[[h]], \ast_h)$   $(Q(G), Q(G^{(m)}))$
We can regard the symplectic groupoid over a Poisson manifold as a (semi-)classical version of the quantum algebra. In this picture $G^{(m)}$ should then correspond to a semiclassical version of the Kontsevich star-product formula. This is in some sense the case. Namely we can restate the integrability problem as a formal integration problem. The solution of this problem is called the formal symplectic groupoid over a Poisson manifold; it is a formal version of the “true symplectic groupoid” which, however, exists even for non-integrable Poisson structures. This is exactly what the question mark stands for in the above table. Let us be more precise.
1.2. Formal integration problem for Poisson manifolds. In the sequel we will only consider Poisson structures $\alpha$ over $M = \mathbb{R}^d$. Suppose that $(M, \alpha)$ is integrable and that its symplectic groupoid $G$ satisfies the following two properties (which are always satisfied in a neighborhood of $M$):
(1) $G \subset T^*M \simeq \mathbb{R}^{*d} \times \mathbb{R}^d$,
(2) $G^{(m)} \subset T^*M \times T^*M \times T^*M$ is an exact Lagrangian manifold, i.e., there exists a generating function $S : \mathbb{R}^{*d} \times \mathbb{R}^{*d} \times \mathbb{R}^d \to \mathbb{R}$ such that $G^{(m)} = \mathrm{graph}(dS)$.
We would like to see what sort of constraints the associativity of the groupoid product imposes on $S$. First of all we may remark that under the previous assumptions the product space $G^{(m)}$ can be described as follows:
$$G^{(m)} = \big\{ \big( (p_1, \nabla_{p_1} S),\ (p_2, \nabla_{p_2} S),\ (\nabla_x S, x) \big) : (p_1, p_2, x) \in B_2 \big\},$$
where the partial derivatives are evaluated at $(p_1, p_2, x) \in B_2 := (\mathbb{R}^{*d})^2 \times \mathbb{R}^d$. The groupoid product associativity could be expressed by saying that, whenever the composition is allowed, we have $g = \bar g \bullet g_3$ and $g = g_1 \bullet \tilde g$, where $\bar g = g_1 \bullet g_2$ and $\tilde g = g_2 \bullet g_3$.
Denoting $g = (p, x)$, $\bar g = (\bar p, \bar x)$ and $\tilde g = (\tilde p, \tilde x)$ implies that $(g_1, g_2, \bar g) \in G^{(m)}$, $(g_2, g_3, \tilde g) \in G^{(m)}$, $(\bar g, g_3, g) \in G^{(m)}$ and $(g_1, \tilde g, g) \in G^{(m)}$. Now expressing $g_1, g_2, g_3, g, \bar g$ and $\tilde g$ each time in terms of the generating function $S$ and equating the different expressions found for the same element we get a system of six equations which can be summarized into the following more compact equation.
Symplectic Groupoid Associativity equation (SGA equation).
$$S(p_1, p_2, \bar x) + S(\bar p, p_3, x) - \bar x \bar p = S(p_2, p_3, \tilde x) + S(p_1, \tilde p, x) - \tilde x \tilde p,$$
where
$$\bar x = \nabla_{p_1} S(\bar p, p_3, x), \qquad \bar p = \nabla_x S(p_1, p_2, \bar x),$$
$$\tilde x = \nabla_{p_2} S(p_1, \tilde p, x), \qquad \tilde p = \nabla_x S(p_2, p_3, \tilde x).$$
This equation encodes the associativity of the groupoid product into the generating function. It can also be seen from two other different points of view. First it is easy to check that one gets the SGA equation by requiring that the saddle point evaluation as $h$ goes to 0 of the two integrals
$$\int \frac{d^d \bar p\, d^d \bar x}{(2\pi h)^{d/2}}\, e^{\frac{i}{h}[S(p_1, p_2, \bar x) + S(\bar p, p_3, x) - \bar p \bar x]} \quad \text{and} \quad \int \frac{d^d \tilde p\, d^d \tilde x}{(2\pi h)^{d/2}}\, e^{\frac{i}{h}[S(p_2, p_3, \tilde x) + S(p_1, \tilde p, x) - \tilde p \tilde x]}$$
be equal. This allows us to provide in Sect. 7 a quick but non-rigorous proof of the existence of the generating function relying only on the associativity of the Kontsevich star product.
The second way to derive the SGA equation is symplectic reduction. Consider the symplectic groupoid $G$ over $M = \mathbb{R}^d$ as above. Let us call $L_S \subset G \times G \times G$ the Lagrangian submanifold associated to the generating function $S$ (i.e., $L_S = \mathrm{graph}(dS)$). Now consider the spaces $H(k) = G^k \times G$ and the diagonal $\Delta_{l_1,\dots,l_k} \subset H(k) \times H(l_1) \times \cdots \times H(l_k)$,
$$\Delta_{l_1,\dots,l_k} = \big\{ \big( (g_1, \dots, g_k, y),\ (x_{11}, \dots, x_{1 l_1}, g_1),\ \dots,\ (x_{k1}, \dots, x_{k l_k}, g_k) \big) \big\}.$$
This is a coisotropic subspace of $H(k) \times H(l_1) \times \cdots \times H(l_k)$. Then one can consider the symplectic reduction by the diagonal $\Delta_{l_1,\dots,l_k}$ which sends Lagrangian submanifolds of $H(k) \times H(l_1) \times \cdots \times H(l_k)$ to Lagrangian submanifolds of $H(l_1 + \cdots + l_k)$. In particular $L_S \oplus L_S \oplus L_I$ (with $I(p, x) = px$) is sent to $L_1 \subset H(3)$ and $L_S \oplus L_I \oplus L_S$ to $L_2 \subset H(3)$. One can check that $L_1 = L_2$ iff $S$ satisfies the SGA equation. In fact we have here, hidden in the background, a structure of an operad, the Lagrangian operad (see [3]).
Now consider $M = \mathbb{R}^d$ with the zero Poisson structure. The symplectic groupoid $G_0$ over it is the cotangent bundle ($G_0 = \mathbb{R}^{*d} \times \mathbb{R}^d$). The source map and the target map $s, t : G_0 \to \mathbb{R}^d$ are identified with the cotangent bundle projection. The inclusion $\epsilon : \mathbb{R}^d \to G_0$ is defined by $\epsilon(x) = (0, x)$, the inverse map $i : G_0 \to G_0$ by $i(p, x) = (-p, x)$ and the product is the fiberwise addition, i.e., $(p_1, x) \bullet (p_2, x) = (p_1 + p_2, x)$. The product space $G_0^{(m)}$ can be seen as the graph of the differential of the function $S_0(p_1, p_2, x) = x(p_1 + p_2)$. It is easy to check that $S_0$ satisfies the SGA equation. We investigate deformations of this trivial generating function. Let us be more precise.
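For the reader's convenience, the check that $S_0$ satisfies the SGA equation can be written out as follows. This is only a worked verification using the SGA equation and the definition of $S_0$ as stated above; no new notation is introduced.

```latex
\begin{align*}
\bar p &= \nabla_x S_0(p_1,p_2,\bar x) = p_1 + p_2, &
\bar x &= \nabla_{p_1} S_0(\bar p,p_3,x) = x,\\
\tilde p &= \nabla_x S_0(p_2,p_3,\tilde x) = p_2 + p_3, &
\tilde x &= \nabla_{p_2} S_0(p_1,\tilde p,x) = x,\\
\text{LHS} &= x(p_1+p_2) + x(p_1+p_2+p_3) - x(p_1+p_2) = x(p_1+p_2+p_3),\\
\text{RHS} &= x(p_2+p_3) + x(p_1+p_2+p_3) - x(p_2+p_3) = x(p_1+p_2+p_3).
\end{align*}
```

Both sides reduce to $x(p_1 + p_2 + p_3)$, which is the generating-function counterpart of the associativity of fiberwise addition.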
Definition 2. A deformation of the trivial generating function is a formal power series in $h$, $S_h = S_0 + hS_1 + h^2 S_2 + \cdots$, obeying the SGA equation and such that $S_0(p_1, p_2, x) = x(p_1 + p_2)$. Such a deformation is called natural if
(1) $S_n(p, q, x)$ is a polynomial in $p$ and $q$,
(2) $S_n(\lambda p, \lambda q, x) = \lambda^{n+1} S_n(p, q, x)$,
(3) $S_n(p, 0, x) = S_n(0, p, x) = 0$,
(4) $S_n^i(p, p, x) = 0$, where $S_n^i$ is the homogeneous part of $S_n$ of degree $i$ in the first argument.
In Sect. 2 we show that, provided we have a natural deformation $S_h = S_0 + hS_1 + h^2 S_2 + \cdots$ of the trivial generating function, we can deform the structure maps of the trivial symplectic groupoid into
$\epsilon_h(x) = (0, x)$ (unit map),
$i_h(p, x) = (-p, x)$ (inverse map),
$s_h(p, x) = \nabla_{p_2} S_h(p, 0, x)$ (source map),
$t_h(p, x) = \nabla_{p_1} S_h(0, p, x)$ (target map),
such that the groupoid structure is (formally) preserved. Moreover there is a unique Poisson bracket on $\mathbb{R}^d$ such that the source, $s_h$, is a Poisson map with respect to the canonical symplectic structure on the formal symplectic groupoid. This Poisson bracket is given by $\{f, g\}_{\mathbb{R}^d}(x) = 2hS_1(df, dg, x)$, the first order term of the generating function. We can now formulate the formal integration problem for Poisson manifolds.
Formal integration problem for Poisson manifolds. Given a Poisson structure on $\mathbb{R}^d$, does there exist a deformation of the trivial generating function such that the first order term is the original Poisson structure?
1.3. Main result, main example, main interpretation. The following theorem gives a positive answer to the deformation problem for symplectic groupoids. This is the main result of this article.
Theorem 1. Given a Poisson structure $\alpha$ on $\mathbb{R}^d$ there exists a unique natural deformation of the trivial generating function such that the first order is precisely $\alpha$. Moreover we have an explicit formula for this deformation:
$$S_h(p_1, p_2, x) = x(p_1 + p_2) + \sum_{n=1}^{\infty} \frac{h^n}{n!} \sum_{\Gamma \in T_{n,2}} W_\Gamma \hat B_\Gamma(p_1, p_2, x),$$
where $T_{n,2}$ is the set of Kontsevich trees of type $(n, 2)$, $W_\Gamma$ is the Kontsevich weight of $\Gamma$ and $\hat B_\Gamma$ is the symbol of the bidifferential operator $B_\Gamma$ associated to $\Gamma$.
Section 2 explains how to recover the structure maps from the deformed generating function. In Sect. 3 we present basic examples of formal symplectic groupoids. In particular the main one is the case of a linear Poisson structure $\alpha^{ij}(x) = \alpha^{ij}_k x^k$, i.e., when one considers the Kirillov–Kostant Poisson structure on the dual $\mathcal{G}^*$ of a Lie algebra $\mathcal{G}$.
In this case, the generating function of the symplectic groupoid over $\mathcal{G}^*$ reduces exactly to the Campbell–Baker–Hausdorff formula
$$S_h(p_1, p_2, x) = \frac{1}{h} \langle CBH(hp_1, hp_2), x \rangle,$$
where $\langle\,,\rangle$ is the natural pairing between $\mathcal{G}$ and $\mathcal{G}^*$. This basic example suggests considering the generating function as a generalization of the CBH formula to the non-linear case, and it reproves in the linear case a result of V. Kathotia ([12]).
Sections 4 to 6 are devoted to the proof of Theorem 1. In Sect. 4 we introduce special graphs, the Cayley trees, which allow us to write down a perturbative version of the SGA equation. In Sect. 5 we describe the Kontsevich trees. We use them to produce an explicit solution for the deformation problem. Section 6 completes the proof of Theorem 1. In the last section we come to the comparison with deformation quantization. We see that the Kontsevich star-product can be put into the form
$$f \ast g(x) = \exp\Big( \frac{1}{h} \sum_{i=0}^{\infty} h^i D_i(h\partial_y, h\partial_z, x) \Big) f(y) g(z) \Big|_{y=z=x},$$
where
$$S_h(p_1, p_2, x) = x(p_1 + p_2) + \frac{1}{h} D_0(hp_1, hp_2, x).$$
This allows us to interpret the generating function as a semi-classical version of the Kontsevich star-product formula. Finally, considering associativity of the star product of exponential functions, we are able to provide an elegant but non-rigorous proof of the existence part of Theorem 1.
1.4. Planned developments. One of the next objectives is to carry the construction of the formal symplectic groupoid over to a general Poisson manifold. Karabegov in [10] already gave some hints on how to make such a globalisation. Namely, he constructed a global source and target provided there is a global star product on the Poisson manifold. These maps are proven to be Poisson maps whenever the Poisson manifold is symplectic. A second possible development is to try to derive the existence of the deformation of the trivial generating function from a kind of “semi-classical” formality theorem. Finally, we plan to compare the formal construction carried out in this article with the non-formal construction coming from the Poisson-sigma model (see [5]) and with the local symplectic groupoid construction of [6] and [11].
2. Recovering the Formal Groupoid from the Generating Function
In this section we show that one can recover formally the structure of symplectic groupoid from a generating function obeying the SGA equation.
Proposition 1. Let $S_h$ be a natural deformation of the trivial generating function which satisfies the SGA equation. Then the set $G_h = \mathbb{R}^{*d}[[h]] \times \mathbb{R}^d[[h]]$ can be given a structure of formal symplectic groupoid, i.e., the maps
$\epsilon_h(x) = (0, x)$ (unit map),
$i_h(p, x) = (-p, x)$ (inverse map),
$s_h(p, x) = \nabla_{p_2} S_h(p, 0, x)$ (source map),
$t_h(p, x) = \nabla_{p_1} S_h(0, p, x)$ (target map),
and the multiplication given by
$$G_h^{(m)} = \mathrm{graph}(dS_h)$$
satisfy formally the axioms of a groupoid. In particular, if we endow $G_h$ with the canonical symplectic form, then $G_h^{(m)}$ is formally Lagrangian in $G_h \times G_h \times G_h$.
Proof. The multiplication space being given by the graph of the differential of the generating function, we have automatically that the product, when defined, is associative (it satisfies the SGA equation) and that $G_h^{(m)}$ is formally a Lagrangian submanifold of $G_h \times G_h \times G_h$. We still have to check that the space of composable pairs is the right one, i.e., $(g, h) \in G_h^{(2)}$ iff $s(g) = t(h)$. We do that by noticing that all products are of the form
$$(p_1, \nabla_{p_1} S_h(p_1, p_2, x)) \bullet (p_2, \nabla_{p_2} S_h(p_1, p_2, x)) = (\nabla_x S_h(p_1, p_2, x), x).$$
Thus the check amounts to seeing that
$$s(p_1, \nabla_{p_1} S_h(p_1, p_2, x)) = t(p_2, \nabla_{p_2} S_h(p_1, p_2, x)),$$
which can be seen by differentiating the SGA equation with respect to $p_2$, putting $p_2 = 0$ and using the fact that $S_h$ is natural. It remains still to check the following axioms:
(1) $t(gh) = t(g)$,
(2) $s(gh) = s(h)$,
(3) $\epsilon(t(g))\, g = g$,
(4) $g\, \epsilon(s(g)) = g$,
(5) $s(i(g)) = t(g)$,
(6) $i(g)\, g = \epsilon(s(g))$,
(7) $g\, i(g) = \epsilon(t(g))$.
Axiom 1 is obtained by differentiating the SGA equation w.r.t. $p_1$, putting $p_1 = 0$ and using naturality of $S_h$. Axiom 2 is similar, with $p_1$ replaced by $p_3$. Axiom 3 and Axiom 4 are direct consequences of the naturality. The last three axioms are however a bit more tricky. First let us prove two lemmas.
Lemma 1 (Inversion of source and target). Denote $F_p(x) = \nabla_{p_2} S_h(p, 0, x)$ and $G_p(x) = \nabla_{p_1} S_h(0, p, x)$. Then $F_p$ and $G_p$ are formal diffeomorphisms and their inverses are given by
$$F_p^{-1}(x) = \nabla_{p_2} S_h(-p, p, x), \qquad G_p^{-1}(x) = \nabla_{p_1} S_h(p, -p, x).$$
Proof. Denote $\bar F_p(x) = \nabla_{p_2} S_h(-p, p, x)$ and $\bar G_p(x) = \nabla_{p_1} S_h(p, -p, x)$. Differentiating the SGA equation w.r.t. $p_1$, putting $p_1 = p$, $p_2 = -p$, $p_3 = p$, we get that $\bar G_p \circ G_p = \mathrm{id}$. Putting $p_1 = 0$, $p_2 = p$, $p_3 = -p$, we get $G_p \circ \bar G_p = \mathrm{id}$. Thus $\bar G_p = G_p^{-1}$. Similarly, differentiating the SGA equation w.r.t. $p_3$, putting $p_1 = p$, $p_2 = -p$, $p_3 = p$, we get that $\bar F_p \circ F_p = \mathrm{id}$. Putting $p_1 = p$, $p_2 = -p$, $p_3 = 0$, we get $F_p \circ \bar F_p = \mathrm{id}$. Thus $\bar F_p = F_p^{-1}$.
Lemma 2 (Relation between source and target). Denote $F_p(x) = \nabla_{p_2} S_h(p, 0, x)$ and $G_p(x) = \nabla_{p_1} S_h(0, p, x)$. Then we have the relation $F_p = G_{-p}$.
Proof. Notice that it is equivalent to prove that $F_p = G_{-p}$ or $F_p^{-1} = G_{-p}^{-1}$. We prove the second identity. For each $n \ge 1$ we have the decomposition
$$S_n(p_1, p_2, x) = S_n^1(p_1, p_2, x) + S_n^2(p_1, p_2, x) + \cdots + S_n^n(p_1, p_2, x),$$
where $S_n^i$ is the part of $S_n$ which is homogeneous of degree $i$ in the first argument. Now we have that $S_n^i(-p, p, x) = (-1)^i S_n^i(p, p, x) = 0$ because of naturality of the generating function. This implies that $S_h(-p, p, x) = 0$. If we differentiate this equation with respect to $p$ we get exactly $F_p^{-1} = G_{-p}^{-1}$.
Going back to the check of axioms we get that Axiom 5 is exactly equivalent to $F_p = G_{-p}$. As for Axiom 6, if we set $i(p, x) = (-p, \nabla_{p_1} S_h(-p, p, s(p, x)))$, then
$$i(p, x)(p, x) = (\nabla_x S_h(-p, p, s(p, x)), s(p, x)) = (0, s(p, x)) = \epsilon(s(p, x)),$$
provided that $x = \nabla_{p_2} S_h(-p, p, \nabla_{p_2} S_h(p, 0, x))$, which is guaranteed by Lemma 1. Similarly for Axiom 7, if we put $\tilde i(p, x) = (-p, \nabla_{p_2} S_h(p, -p, t(p, x)))$ we get that $(p, x)\,\tilde i(p, x) = \epsilon(t(p, x))$. Now by Lemma 1 and Lemma 2 we get
$$\nabla_{p_2} S_h(p, -p, t(p, x)) = F_{-p}^{-1} \circ G_p(x) = x, \qquad \nabla_{p_1} S_h(-p, p, s(p, x)) = G_{-p}^{-1} \circ F_p(x) = x.$$
Thus $\tilde i(p, x) = i(p, x) = (-p, x)$.
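The axioms just verified can be tested in exact arithmetic on the simplest non-trivial case, the constant Poisson structure treated in Sect. 3.0.1, where $S_h = S_0 + h\,p_1^t \alpha p_2$, $s_h(p,x) = x - h\alpha p$ and $t_h(p,x) = x + h\alpha p$. The sketch below is illustrative only: $d = 2$, the numerical value substituted for the formal parameter $h$, the matrix `ALPHA` and all function names are assumptions made for the example.

```python
from fractions import Fraction as Fr

H = Fr(1, 10)                                   # assumed value for h
ALPHA = [[Fr(0), Fr(3)], [Fr(-3), Fr(0)]]       # assumed constant alpha


def mat_vec(a, v):
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]


def add(u, v):
    return [a + b for a, b in zip(u, v)]


def sub(u, v):
    return [a - b for a, b in zip(u, v)]


def h_alpha(p):
    """The vector h * alpha * p."""
    return [H * c for c in mat_vec(ALPHA, p)]


def source(g):                                  # s_h(p, x) = x - h alpha p
    p, x = g
    return sub(x, h_alpha(p))


def target(g):                                  # t_h(p, x) = x + h alpha p
    p, x = g
    return add(x, h_alpha(p))


def unit(x):
    return ([Fr(0), Fr(0)], x)


def inverse(g):
    p, x = g
    return ([-c for c in p], x)


def multiply(g1, g2):
    """Product read off from the multiplication space of Sect. 3.0.1:
    (p1, x + h alpha p2) . (p2, x - h alpha p1) = (p1 + p2, x)."""
    (p1, y1), (p2, y2) = g1, g2
    if source(g1) != target(g2):
        raise ValueError("elements are not composable")
    return (add(p1, p2), sub(y1, h_alpha(p2)))


def left_factor(p, g):
    """An element with momentum p whose source equals target(g)."""
    return (p, add(target(g), h_alpha(p)))
```

With these definitions one can build a composable triple via `left_factor` and compare `multiply(multiply(g1, g2), g3)` with `multiply(g1, multiply(g2, g3))`, as well as check Axioms 1 to 7 directly.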
Now using the canonical symplectic bracket on $G_h$, i.e.,
$$\{F, G\}_{G_h}(p, x) = \langle \nabla_x F(p, x), \nabla_p G(p, x) \rangle - \langle \nabla_p F(p, x), \nabla_x G(p, x) \rangle,$$
we can consider the problem of finding a Poisson bracket on $\mathbb{R}^d$ such that $s_h$ is a Poisson algebra homomorphism, i.e.,
$$s_h^* \{f, g\}_{\mathbb{R}^d}(p, x) = \{s_h^* f, s_h^* g\}_{G_h}(p, x).$$
The following proposition answers this question.
Proposition 2. There is a unique Poisson structure on $\mathbb{R}^d$ such that $s_h^*$ is a (formal) Poisson map. Moreover this Poisson structure is given by
$$\{f, g\}_{\mathbb{R}^d}(x) = \{s_h^* f, s_h^* g\}_{G_h}(0, x) = 2hS_1(df, dg, x).$$
Proof. Suppose there exists a Poisson structure $\{\,,\}_{\mathbb{R}^d}$ such that $s_h^*$ is Poisson. This means that $\{f, g\}_{\mathbb{R}^d}(s_h(p, x)) = \{s_h^* f, s_h^* g\}_{G_h}(p, x)$. In particular if we put $p = 0$ we get exactly that $\{f, g\}_{\mathbb{R}^d}(x) = \{s_h^* f, s_h^* g\}_{G_h}(0, x) = 2hS_1(df, dg, x)$, which shows uniqueness. Now it remains to prove that $\{f, g\}_{\mathbb{R}^d}(s_h(p, x)) = \{s_h^* f, s_h^* g\}_{G_h}(p, x)$, which proves as well that the induced bracket is Poisson. Then we have to check that
$$\{s_h^* f, s_h^* g\}_{G_h}(0, s_h(p, x)) = \{s_h^* f, s_h^* g\}_{G_h}(p, x).$$
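An easy computation gives us that this equation is equivalent to the following:
$$\frac{\partial s_h^l}{\partial p_k}(0, s_h(p, x)) - \frac{\partial s_h^k}{\partial p_l}(0, s_h(p, x)) = \sum_{i=1}^{d} \Big( \frac{\partial s_h^k}{\partial x^i}(p, x) \frac{\partial s_h^l}{\partial p_i}(p, x) - \frac{\partial s_h^l}{\partial x^i}(p, x) \frac{\partial s_h^k}{\partial p_i}(p, x) \Big).$$
Differentiating the SGA equation first with respect to $p_3$ and then to $p_2$ and then putting $p_1 = p$, $p_2 = p_3 = 0$, we get
$$\nabla_{p_1^k} \nabla_{p_2^l} S(0, 0, s_h(p, x)) = \sum_{i=1}^{d} \nabla_{x^i} \nabla_{p_2^k} S(p, 0, x)\, \nabla_{p_1^i} \nabla_{p_2^l} S(p, 0, x) - \nabla_{p_2^k} \nabla_{p_2^l} S(p, 0, x).$$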
Taking the difference between this equation and the same one with the indices $k$ and $l$ interchanged, we finish the proof.
3. Basic Examples
Let us see in some examples what the generating functions and the formal symplectic groupoids are. We already know what happens in the case of the trivial Poisson structure over $\mathbb{R}^d$. The generating function is $S_0(p_1, p_2, x) = (p_1 + p_2)x$ and the associated symplectic groupoid is the cotangent bundle $T^*\mathbb{R}^d$ with structure maps $s(p, x) = x$, $t(p, x) = x$, $\epsilon(x) = (0, x)$, $i(p, x) = (-p, x)$. The composition is the fiberwise addition.
3.0.1. Constant Poisson structure. Suppose one has a constant Poisson structure $\alpha(x) = \alpha$. The main theorem tells us that the generating function is
$$S_h(p_1, p_2, x) = S_0(p_1, p_2, x) + h\, p_1^t \alpha p_2.$$
The multiplication space can then be described as
$$G_h^{(m)} = \big\{ \big( (p_1, x + h\alpha p_2),\ (p_2, x - h\alpha p_1),\ (p_1 + p_2, x) \big) : (p_1, p_2, x) \in B_2 \big\}.$$
By Proposition 1 the structure maps are given by $\epsilon_h(x) = (0, x)$, $i_h(p, x) = (-p, x)$, $s_h(p, x) = x - h\alpha p$, $t_h(p, x) = x + h\alpha p$.
3.0.2. Linear Poisson structure. Suppose that we have a linear Poisson structure $\alpha^{ij}(x) = \alpha^{ij}_k x^k$ on $\mathbb{R}^d$, which can then be considered as the dual of a Lie algebra, $\mathbb{R}^d = \mathcal{G}^*$, the bracket on $\mathcal{G}$ being given by $[e^i, e^j] = 2\alpha^{ij}_k e^k$, where $e^l$, $l = 1, \dots, d$, is a basis of $\mathcal{G}$ ($= \mathbb{R}^{*d}$). For this Lie algebra we have the CBH formula
$$\exp(p_1) \exp(p_2) = \exp(CBH(p_1, p_2)),$$
$$CBH(p_1, p_2) = p_1 + p_2 + \frac{1}{2}[p_1, p_2] + \frac{1}{12}\big( [p_1, [p_1, p_2]] + [p_2, [p_2, p_1]] \big) + \dots.$$
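The CBH formula and its associativity can be checked exactly in a case where the series terminates. The sketch below uses the Heisenberg algebra of strictly upper triangular 3×3 matrices, an example chosen here for illustration and not taken from the text: all double brackets vanish there, so CBH(X, Y) = X + Y + ½[X, Y] holds exactly, and exp(X) = I + X + X²/2. Function names are hypothetical.

```python
from fractions import Fraction as Fr


def heis(a, b, c):
    """Strictly upper triangular 3x3 matrix with entries a, b, c."""
    z = Fr(0)
    return [[z, Fr(a), Fr(c)], [z, z, Fr(b)], [z, z, z]]


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matadd(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(3)] for i in range(3)]


def scal(c, X):
    return [[c * X[i][j] for j in range(3)] for i in range(3)]


def bracket(X, Y):
    """Matrix commutator [X, Y] = XY - YX."""
    return matadd(matmul(X, Y), scal(Fr(-1), matmul(Y, X)))


def exp_nil(X):
    """exp(X) for a strictly upper triangular 3x3 matrix (X^3 = 0)."""
    identity = [[Fr(int(i == j)) for j in range(3)] for i in range(3)]
    return matadd(matadd(identity, X), scal(Fr(1, 2), matmul(X, X)))


def cbh(X, Y):
    """CBH series, which terminates after the first bracket here."""
    return matadd(matadd(X, Y), scal(Fr(1, 2), bracket(X, Y)))
```

For any such matrices one can compare `matmul(exp_nil(X), exp_nil(Y))` with `exp_nil(cbh(X, Y))`, and `cbh(X, cbh(Y, Z))` with `cbh(cbh(X, Y), Z)`; the second identity is the associativity of CBH in this nilpotent case.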
It is easy to check directly that
$$S_h(p_1, p_2, x) = \frac{1}{h} \langle CBH(hp_1, hp_2), x \rangle,$$
where $\langle \cdot, \cdot \rangle$ is the usual pairing between $\mathcal{G}$ and $\mathcal{G}^*$, satisfies the SGA equation. It is equivalent to the associativity of CBH, i.e.,
$$CBH(hp_1, CBH(hp_2, hp_3)) = CBH(CBH(hp_1, hp_2), hp_3).$$
By the uniqueness of the generating function given by the main theorem we recover a result of V. Kathotia (see [12]):
Proposition 3. For the Poisson structure coming from the dual of a Lie algebra we have
$$\Big\langle \frac{1}{h} CBH(hp_1, hp_2) - (p_1 + p_2),\ x \Big\rangle = \sum_{n \ge 1} \frac{h^n}{n!} \sum_{\Gamma \in T_{n,2}} W_\Gamma \hat B_\Gamma(p_1, p_2, x).$$
This result is one of the main ingredients to prove that CBH-quantization is a deformation quantization in the case of the dual of a Lie algebra. It allows us to consider the generating function as a generalization of the CBH formula to the non-linear case. By Proposition 1 we have that the deformed source and target maps are
$$t_h(p, x) = \nabla_{p_1} \frac{1}{h} \langle CBH(0, hp), x \rangle = x + h\alpha^{ij}_k x^k p_j + \frac{h^2}{3} \alpha^{uv}_l \alpha^{ni}_v x^l p_u p_n + \dots,$$
$$s_h(p, x) = \nabla_{p_2} \frac{1}{h} \langle CBH(hp, 0), x \rangle = x - h\alpha^{ij}_k x^k p_j + \frac{h^2}{3} \alpha^{uv}_l \alpha^{ni}_v x^l p_u p_n + \dots.$$
4. Perturbative Form of the SGA Equation
The goal of this section is to formulate a perturbative version of the SGA equation. It is divided into two parts. First we introduce some tools and state the perturbative version of the SGA equation in Proposition 4. The proof is then split into several lemmas.
4.1. Perturbative SGA and Cayley trees. Let us recall that $B_n := (\mathbb{R}^{*d})^n \times \mathbb{R}^d$. First suppose that we are looking for a generating function of the form $S_h = S_0 + hS$, where $S_0(p_1, p_2, x) = (p_1 + p_2)x$ is the trivial generating function and $S \in C^\infty(B_2)[[h]]$ is a formal series $S = S_1 + hS_2 + \dots$. Inserting $S_h$ in the SGA equation we get a new version of this equation for $S$,
$$M^1(S) = M^2(S),$$
where $M^i : C^\infty(B_2)[[h]] \to C^\infty(B_3)[[h]]$ are defined by
$$M^1(S)(p_1, p_2, p_3, x; h) = hS(p_1, p_2, \bar x) + hS(\bar p, p_3, x) - h^2 \nabla_x S(p_1, p_2, \bar x) \nabla_{p_1} S(\bar p, p_3, x),$$
$$\bar p = p_1 + p_2 + h\nabla_x S(p_1, p_2, \bar x), \qquad \bar x = x + h\nabla_{p_1} S(\bar p, p_3, x),$$
and
$$M^2(S)(p_1, p_2, p_3, x; h) = hS(p_2, p_3, \tilde x) + hS(p_1, \tilde p, x) - h^2 \nabla_x S(p_2, p_3, \tilde x) \nabla_{p_2} S(p_1, \tilde p, x),$$
$$\tilde p = p_2 + p_3 + h\nabla_x S(p_2, p_3, \tilde x), \qquad \tilde x = x + h\nabla_{p_2} S(p_1, \tilde p, x).$$
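The passage from the SGA equation to $M^1(S) = M^2(S)$ can be spelled out by direct substitution. The following is a sketch of the computation for the left-hand side, using only $S_h = S_0 + hS$, $S_0(p_1, p_2, x) = (p_1 + p_2)x$ and the defining relations $\bar p = \nabla_x S_h(p_1, p_2, \bar x)$, $\bar x = \nabla_{p_1} S_h(\bar p, p_3, x)$; the right-hand side is handled in the same way with $\tilde p$, $\tilde x$.

```latex
\begin{align*}
S_h(p_1,p_2,\bar x) + S_h(\bar p,p_3,x) - \bar x\bar p
 &= \bar x(p_1+p_2) + hS(p_1,p_2,\bar x) + x(\bar p + p_3) + hS(\bar p,p_3,x) - \bar x\bar p\\
 &= x(p_1+p_2+p_3) + hS(p_1,p_2,\bar x) + hS(\bar p,p_3,x)
    + (x-\bar x)(\bar p - p_1 - p_2)\\
 &= x(p_1+p_2+p_3) + hS(p_1,p_2,\bar x) + hS(\bar p,p_3,x)
    - h^2\,\nabla_x S(p_1,p_2,\bar x)\,\nabla_{p_1} S(\bar p,p_3,x),
\end{align*}
```

where the last step uses $\bar x - x = h\nabla_{p_1} S(\bar p, p_3, x)$ and $\bar p - p_1 - p_2 = h\nabla_x S(p_1, p_2, \bar x)$. The term $x(p_1 + p_2 + p_3)$ appears on both sides of the SGA equation and cancels, which leaves the equality of the two remaining expressions.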
The idea now is to expand $M^i(S)(h)$, $i = 1, 2$, into powers of $h$ and then to analyze the conditions imposed on $S$ by the equation at each order. For that purpose we are going to introduce some tools and methods that are heavily inspired by the tools used in numerical analysis to determine the order conditions of a Runge–Kutta method. The main ingredients are trees which are used to represent the so-called elementary differentials and elementary functions. As these ideas go back to Cayley, we call such trees Cayley trees, in order to distinguish them from Kontsevich trees which will also appear in the story. In the sequel we will mainly follow the notations of [9].
Definition 3. (1) A graph $t$ is given by a set of vertices $V_t = \{1, \dots, n\}$ and a set of edges $E_t$ which is a set of pairs of elements of $V_t$. We denote the number of vertices by $|t|$. An isomorphism between two graphs $t$ and $t'$ having the same number of vertices is a permutation $\sigma \in S_{|t|}$ such that $\{\sigma(v), \sigma(w)\} \in E_{t'}$ if $\{v, w\} \in E_t$. Two graphs are called equivalent if there is an isomorphism between them. The symmetries of a graph are the automorphisms of the graph. We denote the group of symmetries by $\mathrm{sym}(t)$.
(2) A tree is a graph which has no cycles. Isomorphisms and symmetries are defined the same way as for graphs.
(3) A rooted tree is a tree with one distinguished vertex. An isomorphism of rooted trees is an isomorphism of graphs which sends the root to the root. Symmetries and equivalence are defined correspondingly.
(4) A bipartite graph is a graph $t$ together with a map $\omega : V_t \to \{\circ, \bullet\}$ such that $\omega(v) \neq \omega(w)$ if $\{v, w\} \in E_t$. An isomorphism of bipartite trees is an isomorphism of graphs which respects the coloring, i.e., $\omega(\sigma(v)) = \omega(v)$.
The following table summarizes some notations we will use in the sequel.
$T$: the set of bipartite trees
$RT$: the set of rooted bipartite trees
$RT_\circ$: the set of elements of $RT$ with white root
$RT_\bullet$: the set of elements of $RT$ with black root
[A]: the set of equivalence classes of graphs in A (ex: [RT ]). They are called topological “A” trees. The elements of [RT ] can be described recursively as follows (1) ◦, • ∈ [RT ]. (2) If t1 , . . . , tm ∈ [RT◦ ], then the tree [t1 , . . . , tm ]• ∈ [RT ] where [t1 , . . . , tm ]• is defined by connecting the roots of t1 , . . . , tm with • and saying that • is the new root, and the same if we interchange ◦ and •. Now with the help of this recursive description of topological rooted trees we define elementary differentials and elementary generating functions. Definition 4 (Elementary Differentials (ED)). Let i = 1, 2, t ∈ [RT ]. The elementary differential D i S(t) of S ∈ C ∞ (B2 )[[h]] is defined recursively as follows: (1) D i S(◦) = ∇x S , D i S(•) = ∇pi S, (m+1)
S(D i S(t1 ), . . . , D i S(tm )) if t = [t1 , . . . , tm ]• ,
(m+1)
S(D i S(t1 ), . . . , D i S(tm )) if t = [t1 , . . . , tm ]◦ ,
(2) D i S(t) = ∇pi (3) D i S(t) = ∇x
A.S. Cattaneo, B. Dherin, G. Felder

where $\nabla_x^{(k)}S$ stands for the $k$-th derivative of $S$ with respect to $x$, evaluated at $(p_1,p_2,x)$ if $i = 1$ and at $(p_2,p_3,x)$ if $i = 2$, and $\nabla_{p_i}^{(k)}S$ stands for the $k$-th derivative of $S$ with respect to $p_i$, evaluated at $(p_1+p_2,p_3,x)$ if $i = 1$ and at $(p_1,p_2+p_3,x)$ if $i = 2$.

Definition 5 (Elementary Generating Functions (EGF)). Let $i = 1, 2$ and $t \in [RT]$. The elementary generating function $S^i(t)$ of $S \in C^\infty(B_2)[[h]]$ is defined recursively as follows:
(1) $S^1(\circ) = S(p_1,p_2,x)$, $\quad S^1(\bullet) = S(p_1+p_2,p_3,x)$,
(2) $S^2(\circ) = S(p_2,p_3,x)$, $\quad S^2(\bullet) = S(p_1,p_2+p_3,x)$,
(3) $S^i(t) = \nabla_{p_i}^{(m)}S\big(D^iS(t_1),\dots,D^iS(t_m)\big)$ if $t = [t_1,\dots,t_m]_\bullet$,
(4) $S^i(t) = \nabla_x^{(m)}S\big(D^iS(t_1),\dots,D^iS(t_m)\big)$ if $t = [t_1,\dots,t_m]_\circ$,
with the same notation as above.

Some examples are given in the following table (the tree diagrams are not reproduced here):

  Notation                             ED                                                                   EGF
  $[\bullet]_\circ$                    $\nabla_x^{(2)}S\,\nabla_pS$                                         $\nabla_xS\,\nabla_pS$
  $[\circ,\circ]_\bullet$              $\nabla_p^{(3)}S(\nabla_xS,\nabla_xS)$                               $\nabla_p^{(2)}S(\nabla_xS,\nabla_xS)$
  $[\bullet,[\circ]_\bullet]_\circ$    $\nabla_x^{(3)}S(\nabla_pS,\nabla_p^{(2)}S\,\nabla_xS)$              $\nabla_x^{(2)}S(\nabla_pS,\nabla_p^{(2)}S\,\nabla_xS)$
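The recursive Definitions 4 and 5 can be followed mechanically. The sketch below is only an illustration, not part of the original text: it encodes a topological rooted bipartite tree as a hypothetical pair `(color, children)` with `color` equal to `"w"` (white) or `"b"` (black), and prints the ED and EGF as strings, writing `Dx^k S` for $\nabla_x^{(k)}S$ and `Dp^k S` for $\nabla_{p}^{(k)}S$. The encoding and the function names `ed` and `egf` are assumptions made for this example.

```python
def ed(tree):
    """Elementary differential of Definition 4, as a string.

    A root with m children contributes a derivative of order m + 1,
    in x for a white root and in p for a black root.
    """
    color, children = tree
    var = "x" if color == "w" else "p"
    head = f"D{var}^{len(children) + 1} S"
    if not children:
        return head
    return head + "(" + ", ".join(ed(c) for c in children) + ")"


def egf(tree):
    """Elementary generating function of Definition 5, as a string.

    Same recursion as `ed`, except that the root derivative has order m
    (so a single vertex gives S itself); the arguments are EDs.
    """
    color, children = tree
    var = "x" if color == "w" else "p"
    if not children:
        return "S"
    head = f"D{var}^{len(children)} S"
    return head + "(" + ", ".join(ed(c) for c in children) + ")"


# The three rows of the table: [b]_w, [w, w]_b and [b, [w]_b]_w.
white, black = ("w", []), ("b", [])
row1 = ("w", [black])
row2 = ("b", [white, white])
row3 = ("w", [black, ("b", [white])])

for row in (row1, row2, row3):
    print(ed(row), "|", egf(row))
```

For the third row this prints `Dx^3 S(Dp^1 S, Dp^2 S(Dx^1 S)) | Dx^2 S(Dp^1 S, Dp^2 S(Dx^1 S))`, which matches the table entry up to notation.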
Note that for the EGF it does not matter which vertex is the root. This is not the case for the ED. Let us be more precise.

Definition 6 (Butcher product). Let $u = [u_1,\dots,u_k]$, $v = [v_1,\dots,v_l] \in [RT]$. We denote by
\[ u \circ v = [u_1,\dots,u_k,v], \qquad v \circ u = [v_1,\dots,v_l,u] \]
the Butcher product. We have not written the obvious conditions on the $u_i$ and $v_i$ which ensure that the product remains bipartite.

Definition 7 (Equivalence relation on rooted topological trees). We consider the minimal equivalence relation on $[RT]$ such that $u \circ v \sim v \circ u$.
Properties of this relation. It is clear that
(1) two topological rooted trees are equivalent if it is possible to pass from one to the other by changing the root. More precisely: for $t, t' \in [RT]$, $t \sim t'$ iff there exist a representative $(E,V,r)$ of $t$, a representative $(E',V',r')$ of $t'$ and a vertex $r'' \in V$ such that $(E,V,r'')$ and $(E',V',r')$ are isomorphic rooted trees;
(2) the quotient of $[RT]$ by this equivalence relation is exactly $[T]$;
(3) it follows immediately from the definitions that $S^i(t) = S^i(t')$ if $t \sim t'$, for $i = 1, 2$.
It therefore makes sense to define the EGF on bipartite trees.

Definition 8. Let $S \in C^\infty(B_2)[[h]]$ and $t = (V_t,E_t) \in T$. Then
\[ S^1(t) := \sum_{\beta : E_t \to \{1,\dots,d\}} \; \prod_{v \in V_t} \Big( \prod_{\substack{e \in E_t \\ e = \{*,v\}}} D^{1,\omega(v)}_{\beta(e)} \Big) S, \]
where
\[ D^{1,\bullet}_{\beta(e_1)} \cdots D^{1,\bullet}_{\beta(e_k)} S := \frac{\partial^k S}{\partial p^1_{\beta(e_1)} \cdots \partial p^1_{\beta(e_k)}}(p_1+p_2,p_3,x), \]
\[ D^{1,\circ}_{\beta(e_1)} \cdots D^{1,\circ}_{\beta(e_k)} S := \frac{\partial^k S}{\partial x^{\beta(e_1)} \cdots \partial x^{\beta(e_k)}}(p_1,p_2,x), \]
and correspondingly for $S^2(t)$. It is clear that this new definition of $S^i(t)$ is equivalent to the recursive one introduced previously. It is, however, better suited to dealing with the fact that $S$ is a formal series. Namely, we immediately get the relation
\[ h^{|t|} S^i(t) = h^{|t|} \sum_{\beta : E_t \to \{1,\dots,d\}} \; \prod_{v \in V_t} \Big( \prod_{\substack{e \in E_t \\ e = \{*,v\}}} D^{i,\omega(v)}_{\beta(e)} \Big) \frac{1}{h}\sum_{n=1}^{\infty} h^n S_n = \sum_{n=|t|}^{\infty} h^n \sum_{\substack{n_1+\dots+n_{|t|}=n \\ n_i \ge 1}} C^i_t(S_{n_1},\dots,S_{n_{|t|}}), \]
which defines the $C^i_t$, which are multi-differential maps from $C^\infty(B_2)^{|t|}$ to $C^\infty(B_3)$. We can now state the main proposition of this section.

Proposition 4 (Perturbative version of the SGA equation). The formal series $S_h = S_0 + \sum_{n \ge 1} h^n S_n$ satisfies the SGA equation iff for each $n > 0$ we have
\[ \sum_{\substack{t \in T \\ |t| \le n}} \frac{1}{|t|!} \sum_{\substack{n_1+\dots+n_{|t|}=n \\ n_i \ge 1}} \Big( C^1_t(S_{n_1},\dots,S_{n_{|t|}}) - C^2_t(S_{n_1},\dots,S_{n_{|t|}}) \Big) = 0. \]
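As an illustration of Proposition 4, the two lowest orders can be written out by hand. The following is a sketch under our reading of the definitions above: at order $n = 1$ only the one-vertex trees $\circ$ and $\bullet$ contribute, and at order $n = 2$ the one-vertex trees contribute with $S_2$ while the two labelled two-vertex bipartite trees contribute with $(S_1, S_1)$, each with the factor $1/2!$. The same expressions are obtained by expanding $M^1(S)$ and $M^2(S)$ directly to second order in $h$.

```latex
% Order n = 1 (trees: \circ and \bullet)
S_1(p_1,p_2,x) + S_1(p_1+p_2,p_3,x) - S_1(p_2,p_3,x) - S_1(p_1,p_2+p_3,x) = 0 .

% Order n = 2 (one-vertex trees with S_2, two-vertex trees with (S_1,S_1))
S_2(p_1,p_2,x) + S_2(p_1+p_2,p_3,x)
  + \nabla_x S_1(p_1,p_2,x)\,\nabla_{p_1} S_1(p_1+p_2,p_3,x)
=
S_2(p_2,p_3,x) + S_2(p_1,p_2+p_3,x)
  + \nabla_x S_1(p_2,p_3,x)\,\nabla_{p_2} S_1(p_1,p_2+p_3,x) .
```

Here $\nabla_{p_1}$ and $\nabla_{p_2}$ denote the derivative with respect to the first and second momentum argument respectively, contracted with $\nabla_x$.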
Let us remark that for all $f \in C^\infty(B_2)$ we have
\[ C^1_\bullet(f) + C^1_\circ(f) - C^2_\bullet(f) - C^2_\circ(f) = df, \]
where $d : C^\infty(B_n) \to C^\infty(B_{n+1})$ is a differential (i.e., $d^2 = 0$) defined by the formula
\[ df(p_1,\dots,p_{n+1}) = f(p_2,\dots,p_{n+1}) - \sum_{i=1}^{n} (-1)^{i+1} f(p_1,\dots,p_i+p_{i+1},\dots,p_{n+1}) + (-1)^{n+1} f(p_1,\dots,p_n). \]
This differential can be interpreted either as the Hochschild differential on symbols of multi-differential operators on $C^\infty(\mathbb{R}^d)$ or as the differential of the trivial symplectic groupoid cohomology over $\mathbb{R}^d$. This remark allows us to put the previous recursive equations into the form
\[ dS_n + H_n(S_{n-1},\dots,S_1) = 0, \]
which is exactly the analog of the recursive equation that arises when considering star-products. The remainder of this section is devoted to proving Proposition 4.

4.2. Proof of the Proposition. It follows from a series of small lemmas. We are first interested in expanding
\[ \bar p = p_1 + p_2 + h\nabla_x S(p_1,p_2,\bar x), \tag{1} \]
\[ \bar x = x + h\nabla_{p_1} S(\bar p,p_3,x), \tag{2} \]
and
\[ \tilde p = p_2 + p_3 + h\nabla_x S(p_2,p_3,\tilde x), \tag{3} \]
\[ \tilde x = x + h\nabla_{p_2} S(p_1,\tilde p,x) \tag{4} \]
as a power series in $h$. The method used is essentially the same as in numerical analysis when one wants to express the Taylor series of the numerical flow of a Runge–Kutta method. Namely, the equations above have a form very close to the partitioned implicit Euler method (see [9]).

Definition 9. Let $t = [t_1,\dots,t_m] \in [RT]$. Consider the list $\tilde t_1,\dots,\tilde t_k$ of all non-isomorphic trees appearing in $t_1,\dots,t_m$. Define $\mu_i$ as the number of times the tree $\tilde t_i$ appears in $t_1,\dots,t_m$. Then we introduce the symmetry coefficient $\sigma(t)$ of $t$ by the following recursive definition:
\[ \sigma(t) = \mu_1!\,\mu_2! \cdots \mu_k!\; \sigma(t_1) \cdots \sigma(t_m), \]
where the second product runs over all subtrees $t_1,\dots,t_m$, counted with multiplicity. Moreover $\sigma(\circ) = \sigma(\bullet) = 1$. It is clear that $\sigma(t)$ is the number of symmetries of each representative of $t$ (i.e., $\sigma(t) = |\mathrm{sym}(t')|$ for all $t' \in t$).

Lemma 3. There exist unique formal series for $\bar x, \bar p$ (resp. $\tilde x, \tilde p$) which satisfy Eqs. (1) and (2) (resp. (3) and (4)). They are given by
\[ \bar x(h) = x + \sum_{t \in [RT_\bullet]} \frac{h^{|t|}}{\sigma(t)} D^1S(t), \tag{5} \]
\[ \bar p(h) = p_1 + p_2 + \sum_{t \in [RT_\circ]} \frac{h^{|t|}}{\sigma(t)} D^1S(t), \tag{6} \]
and by
\[ \tilde x(h) = x + \sum_{t \in [RT_\bullet]} \frac{h^{|t|}}{\sigma(t)} D^2S(t), \tag{7} \]
\[ \tilde p(h) = p_2 + p_3 + \sum_{t \in [RT_\circ]} \frac{h^{|t|}}{\sigma(t)} D^2S(t), \tag{8} \]
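respectively.

Proof. Uniqueness is trivial. Let us check that we have the right formal series. We only check Eq. (5), the series for $\bar x$; the other computations are similar. Inserting the series (6) for $\bar p$ into Eq. (2) and Taylor expanding, we get
\[ \bar x(h) = x + h\nabla_{p_1}S(\bar p,p_3,x) \]
\[ = x + h \sum_{m \ge 0} \frac{1}{m!} \nabla_p^{(m+1)}S\Big( \sum_{t \in [RT_\circ]} \frac{h^{|t|}}{\sigma(t)} D^1S(t), \dots, \sum_{t \in [RT_\circ]} \frac{h^{|t|}}{\sigma(t)} D^1S(t) \Big) \]
\[ = x + \sum_{m \ge 0} \sum_{t_1 \in [RT_\circ]} \cdots \sum_{t_m \in [RT_\circ]} \frac{h^{1+|t_1|+\dots+|t_m|}}{m!\,\sigma(t_1)\cdots\sigma(t_m)} \nabla_p^{(m+1)}S\big(D^1S(t_1),\dots,D^1S(t_m)\big) \]
\[ = x + \sum_{m \ge 0} \sum_{t_1} \cdots \sum_{t_m} \frac{h^{|t|}}{m!\,\sigma(t)} (\mu_1!\,\mu_2! \cdots)\, D^1S(t), \qquad \text{with } t = [t_1,\dots,t_m]_\bullet, \]
\[ = x + \sum_{t \in [RT_\bullet]} \frac{h^{|t|}}{\sigma(t)} D^1S(t). \]
In the last step we used that exactly $m!/(\mu_1!\,\mu_2!\cdots)$ ordered tuples $(t_1,\dots,t_m)$ give the same tree $t$.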
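The symmetry coefficient of Definition 9 is easy to compute recursively. The following sketch is an illustration only: it assumes a hypothetical encoding of a topological rooted bipartite tree as a pair `(color, children)` with the children stored as a sorted tuple, so that isomorphic subtrees compare equal, and it takes the product of the $\sigma(t_j)$ over all subtrees counted with multiplicity (the convention under which $\sigma(t)$ counts the symmetries of $t$). The helper names `make`, `size` and `sigma` are ours.

```python
from collections import Counter
from math import factorial


def make(color, *children):
    """Build a rooted bipartite tree in canonical form.

    color is "w" (white) or "b" (black); every child must have a root
    of the opposite color.  Children are sorted so that two isomorphic
    trees are represented by equal tuples.
    """
    assert color in ("w", "b")
    assert all(child[0] != color for child in children), "tree must be bipartite"
    return (color, tuple(sorted(children)))


def size(tree):
    """Number of vertices |t|."""
    return 1 + sum(size(child) for child in tree[1])


def sigma(tree):
    """Symmetry coefficient: product of mu_i! over distinct subtrees,
    times sigma of every subtree (with multiplicity)."""
    children = tree[1]
    result = 1
    for multiplicity in Counter(children).values():
        result *= factorial(multiplicity)
    for child in children:
        result *= sigma(child)
    return result


W, B = make("w"), make("b")
cherry = make("b", W, W)          # [w, w]_b : the two leaves can be swapped
double = make("w", cherry, cherry)  # two copies of the cherry under a white root
print(size(cherry), sigma(cherry))  # 3 2
print(size(double), sigma(double))  # 7 8
```

In the second example the factor 8 comes from swapping the two cherries (2) and swapping the leaves inside each of them (2 times 2).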
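We now insert these expansions into $M^1$ and $M^2$.

Lemma 4.
\[ M^i(S)(h) = \sum_{t \in [RT]} \frac{h^{|t|}}{\sigma(t)} S^i(t) - \Big( \sum_{t \in [RT_\circ]} \frac{h^{|t|}}{\sigma(t)} D^iS(t) \Big) \Big( \sum_{t \in [RT_\bullet]} \frac{h^{|t|}}{\sigma(t)} D^iS(t) \Big) \]
for $i = 1, 2$.

Proof. Let us do the proof for $M^1$. First we compute the different terms arising in the formula for $M^1$ in terms of trees:
\[ hS(p_1,p_2,\bar x) = h \sum_{m \ge 0} \frac{1}{m!} \nabla_x^{(m)}S\Big( \sum_{t \in [RT_\bullet]} \frac{h^{|t|}}{\sigma(t)} D^1S(t), \dots, \sum_{t \in [RT_\bullet]} \frac{h^{|t|}}{\sigma(t)} D^1S(t) \Big) \]
\[ = \sum_{m \ge 0} \sum_{t_1 \in [RT_\bullet]} \cdots \sum_{t_m \in [RT_\bullet]} \frac{h^{|t|}}{m!\,\sigma(t)} (\mu_1!\,\mu_2! \cdots)\, \nabla_x^{(m)}S\big(D^1S(t_1),\dots,D^1S(t_m)\big), \qquad \text{with } t = [t_1,\dots,t_m]_\circ, \]
\[ = \sum_{t \in [RT_\circ]} \frac{h^{|t|}}{\sigma(t)} S^1(t). \]
By the same sort of computations we also get
\[ hS(\bar p,p_3,x) = \sum_{t \in [RT_\bullet]} \frac{h^{|t|}}{\sigma(t)} S^1(t), \]
\[ h\nabla_x S(p_1,p_2,\bar x) = \sum_{t \in [RT_\circ]} \frac{h^{|t|}}{\sigma(t)} D^1S(t), \]
\[ h\nabla_{p_1} S(\bar p,p_3,x) = \sum_{t \in [RT_\bullet]} \frac{h^{|t|}}{\sigma(t)} D^1S(t). \]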
The $M^i$'s are expressed as sums over topological rooted bipartite trees. We would now like to regroup the terms of the formula in the previous lemma. To do so we express all terms in terms of topological trees (no longer rooted).

Lemma 5. Let $u \in [RT_\circ]$ and $v \in [RT_\bullet]$. Then
\[ D^iS(u)\,D^iS(v) = S^i(u \circ v) = S^i(v \circ u). \]

Proof. We prove it only for $i = 1$. Suppose $u = [u_1,\dots,u_m]_\circ$ and $v = [v_1,\dots,v_l]_\bullet$; then
\[ D^1S(u)\,D^1S(v) = \nabla_x^{(m+1)}S\big(D^1S(u_1),\dots,D^1S(u_m)\big) \cdot D^1S(v) = \nabla_x^{(m+1)}S\big(D^1S(u_1),\dots,D^1S(u_m),D^1S(v)\big) = S^1(u \circ v). \]
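Lemma 6. Let $t = (V_t,E_t) \in T$. For all $v \in V_t$ let $t_v$ be the bipartite rooted tree $(V_t,E_t,v) \in RT$. For $v \in V_t$ and $e = \{u,v\} \in E_t$ we have
\[ \frac{|\mathrm{sym}(t)|}{|\mathrm{sym}(t_v)|} = \big|\{ v' \in V_t \;/\; t_{v'} \text{ is isomorphic to } t_v \}\big|, \]
\[ \frac{|\mathrm{sym}(t)|}{|\mathrm{sym}(t_u)|\,|\mathrm{sym}(t_v)|} = \big|\{ e' = \{u',v'\} \in E_t \;/\; t_{u'}t_{v'} \text{ is isomorphic to } t_ut_v \}\big|. \]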
Proof. Consider the induced action of the symmetry group of the tree on the set of vertices. Notice that two vertices $v$ and $w$ are in the same orbit iff $t_v$ is isomorphic to $t_w$. Then the number of vertices of $t$ which lead to a rooted tree isomorphic to $t_v$ is exactly the cardinality of the orbit of $v$, which is $|\mathrm{sym}(t)|$ divided by the cardinality of the isotropy subgroup which fixes $v$. But the latter is $|\mathrm{sym}(t_v)|$ by definition. This gives the first statement. For the second statement we consider the induced action on the edges and apply the same type of argument.

Lemma 7. Let $S \in C^\infty(B_2)[[h]]$. The SGA equation for $S$ can be expressed in terms of bipartite Cayley trees as
\[ \sum_{t \in T} \frac{h^{|t|}}{|t|!} \big( S^1(t) - S^2(t) \big) = 0. \]

Proof. We have for $i = 1, 2$,
\[ M^i(S) = \sum_{t \in [RT]} \frac{h^{|t|}}{\sigma(t)} S^i(t) - \sum_{u \in [RT_\circ]} \sum_{v \in [RT_\bullet]} \frac{h^{|u|+|v|}}{\sigma(u)\sigma(v)} D^iS(u)\,D^iS(v) \]
\[ = \sum_{\bar t \in [T]} \Big( \sum_{t \in \bar t} \frac{1}{|\mathrm{sym}(t)|} - \sum_{\substack{u \in [RT_\bullet],\, v \in [RT_\circ] \\ u \circ v \in \bar t}} \frac{1}{|\mathrm{sym}(u)|\,|\mathrm{sym}(v)|} \Big) h^{|\bar t|} S^i(\bar t) \]
\[ = \sum_{t \in T} \frac{h^{|t|}}{|t|!} \Big( \sum_{v \in V_t} \frac{|\mathrm{sym}(t)|}{|\mathrm{sym}(t_v)|} \frac{1}{k(t,v)} - \sum_{e = \{u,v\} \in E_t} \frac{|\mathrm{sym}(t)|}{|\mathrm{sym}(t_u)|\,|\mathrm{sym}(t_v)|} \frac{1}{l(t,e)} \Big) S^i(t), \]
where $k(t,v) = |\{ v' \in V_t \;/\; t_{v'} \text{ is isomorphic to } t_v \}|$ and $l(t,e) = |\{ e' = \{u',v'\} \in E_t \;/\; t_{u'}t_{v'} \text{ is isomorphic to } t_ut_v \}|$. By Lemma 6 every term of the two inner sums equals 1, so the bracket is the number of vertices minus the number of edges of $t$, which is 1 for a tree. This gives the desired result.

Using now the fact that $S$ is a formal series we immediately get Proposition 4.

5. Geometry of Kontsevich Trees

In this section we present a diagrammatic notation introduced by Kontsevich which allows us to write an explicit solution of the SGA equation.
5.1. Basic Definitions.

Definition 10.
(1) A Kontsevich graph of type $(n,m)$ is a directed graph $\Gamma = (E_\Gamma,V_\Gamma)$ which has the following properties:
• It possesses two types of vertices, $V_\Gamma = V_\Gamma^a \sqcup V_\Gamma^g$: the aerial vertices $V_\Gamma^a = \{1,\dots,n\}$ and the ground vertices $V_\Gamma^g = \{\bar 1,\dots,\bar m\}$.
• Each aerial vertex possesses exactly two ordered edges starting from it. The edge set can be described as $E_\Gamma = \{(k,\gamma^i(k)),\ k = 1,\dots,n,\ i = 1,2\}$, where $\gamma^i : V_\Gamma^a \to V_\Gamma$. Sometimes one denotes the two edges of a vertex $k$ by $e_k^1$ and $e_k^2$.
• For each aerial vertex $v$ we do not allow small loops (i.e., $\gamma^i(v) = v$) or double edges (i.e., $\gamma^1(v) = \gamma^2(v)$).
We denote the set of Kontsevich graphs of type $(n,m)$ by $G_{n,m}$. If $\Gamma \in G_{n,m}$ then we set $|\Gamma| := n$.
(2) Let $A \subset V_\Gamma$. We call $\Gamma/A$ the restriction of $\Gamma$ to $A$. It is the graph with vertex set $A$ and edges $E_\Gamma \cap (A \times A)$. We call $\Gamma(A)$ the contraction of $\Gamma$ to $A$. It is the graph with vertex set $(V_\Gamma \setminus A) \sqcup \{*\}$ (the vertices of $A$ are contracted to a single vertex $*$) and edges $(i,j) \in E_\Gamma$, where $i$ is replaced by the new vertex $*$ in $\Gamma(A)$ if $i \in A$, and the same for $j$ (simple loops are deleted). Note that the resulting graphs might not be Kontsevich graphs.
(3) We denote by $\Gamma^a = (V_\Gamma^a,E_\Gamma^a)$ the restriction of $\Gamma \in G_{n,m}$ to the aerial vertices. Sometimes we write $E_\Gamma^g = E_\Gamma \setminus E_\Gamma^a$. We say that a Kontsevich graph is connected if $\Gamma^a$ is connected in the usual sense. We say that a connected Kontsevich graph is a tree if $\Gamma^a$ is a tree (i.e., a graph without cycles). We denote by $C_{n,m}$ the set of connected Kontsevich graphs of type $(n,m)$ and by $T_{n,m}$ the set of Kontsevich trees of type $(n,m)$.

Given a Poisson structure $\alpha$ on $\mathbb{R}^d$ one can associate to each graph $\Gamma \in G_{n,m}$ an $m$-multidifferential operator on $C^\infty(\mathbb{R}^d)$. The general formula is the following:
\[ B_\Gamma(f_1,\dots,f_m) := \sum_{I : E_\Gamma \to \{1,\dots,d\}} \Big[ \prod_{k \in V_\Gamma^a} \Big( \prod_{\substack{e \in E_\Gamma \\ e = (*,k)}} \partial_{I(e)} \Big) \alpha^{I(e_k^1) I(e_k^2)} \Big] \times \Big[ \prod_{i \in V_\Gamma^g} \Big( \prod_{\substack{e \in E_\Gamma \\ e = (*,i)}} \partial_{I(e)} \Big) f_i \Big]. \]
We call $\hat B_\Gamma$ the symbol of $B_\Gamma$. It can be defined by the formula
\[ B_\Gamma(e^{p_1x},\dots,e^{p_mx}) = \hat B_\Gamma(p_1,\dots,p_m,x)\, e^{(p_1+\dots+p_m)x}. \]

Example 1. Take the graph $\Gamma$ of the accompanying figure (diagram not reproduced); then we have
\[ \hat B_\Gamma(p_1,p_2,x) = \sum_{1 \le i,j,k,l,m,n \le d} \alpha^{ij}(x)\, \partial_n\partial_j\alpha^{kl}(x)\, \alpha^{mn}(x)\, p^1_k\, p^2_i\, p^2_l\, p^2_m. \]
Associated to each Kontsevich graph $\Gamma \in G_{n,m}$ there is also a number, the Kontsevich weight $W_\Gamma$. In these notes we only need to define these weights for graphs of type $(n,2)$. The generalization is however straightforward. We do this in several steps.
(1) Take a Kontsevich graph $\Gamma \in G_{n,2}$ and identify its vertices $1,\dots,n \in V_\Gamma$ with $n$ complex numbers $z_1,\dots,z_n$ lying in the upper half complex plane $\mathbb{H} = \{z \in \mathbb{C} \;/\; \mathrm{Im}(z) > 0\}$ (we require that $z_i \neq z_j$ if $i \neq j$). Identify further $\bar 1$ and $\bar 2$ with $0$ and $1$ in $\mathbb{R}$.
(2) Consider now the hyperbolic metric on $\mathbb{H}$. The geodesic joining two points $p, q \in \mathbb{H}$ is in this metric either the half circle intersecting the real line orthogonally and passing through $p$ and $q$, or the line orthogonal to the real line passing through $p$ and $q$. We can now associate the oriented edges $e_k^i = (k,\gamma^i(k))$ to the oriented geodesics joining $z_k$ and $z_{\gamma^i(k)}$. We call such an embedding of $\Gamma$ a configuration of $\Gamma$. We can then identify the configuration space of a Kontsevich graph with $\mathbb{H}^n \setminus D^n$, where $\mathbb{H}^n$ is the $n$-fold Cartesian product of $\mathbb{H}$ and
\[ D^n := \{(z_1,\dots,z_n) \in \mathbb{H}^n \;/\; \exists\, i,j,\ i \neq j \text{ and } z_i = z_j\}. \]
Notice that $\mathbb{H}^n \setminus D^n$ is a real non-compact manifold of dimension $2n$. We can however compactify it into a compact manifold with corners $\overline{\mathbb{H}^n \setminus D^n}$ such that the open stratum is exactly $\mathbb{H}^n \setminus D^n$.
(3) For each edge $e_k^i = (k,\gamma^i(k))$ we can define an "angle function" on $\mathbb{H}^n \setminus D^n$ by
\[ \psi^i_{z_k}(z_1,\dots,z_n) := \phi^h(z_k,z_{\gamma^i(k)}), \]
where $\phi^h(z_k,z_{\gamma^i(k)})$ is the oriented hyperbolic angle between the geodesic joining $z_k$ and $\infty$ and the geodesic joining $z_k$ and $z_{\gamma^i(k)}$. So
\[ \phi^h(p,q) = \arg\Big( \frac{q-p}{q-\bar p} \Big). \]
(4) We can now consider the 1-forms $d\psi^i_{z_k} \in \Omega^1(\mathbb{H}^n \setminus D^n)$, which can be extended to the compactified space. Then the Kontsevich weight of $\Gamma$ is defined by
\[ W_\Gamma := \frac{1}{(2\pi)^{2n}} \int_{\overline{\mathbb{H}^n \setminus D^n}} \bigwedge_{k=1}^{n} \big( d\psi^1_{z_k} \wedge d\psi^2_{z_k} \big). \]
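The angle function of step (3) is straightforward to evaluate numerically. The sketch below is only an illustration of the formula $\phi^h(p,q) = \arg\big((q-p)/(q-\bar p)\big)$; the function name `hyperbolic_angle` is ours, and we rely on Python's `cmath.phase`, which returns the argument in $(-\pi,\pi]$, so the branch of the argument is an assumption of this example rather than a convention taken from the text.

```python
import cmath
import math


def hyperbolic_angle(p, q):
    """phi^h(p, q) = arg((q - p) / (q - conj(p))) for p in the upper half
    plane and q in the upper half plane or on the real line."""
    assert p.imag > 0, "p must lie in the upper half plane"
    return cmath.phase((q - p) / (q - p.conjugate()))


# q directly above p: the geodesic from p to q is the vertical line
# towards infinity, so the angle vanishes.
print(hyperbolic_angle(1j, 2j))

# An edge from the aerial point i to the ground point 1.
print(hyperbolic_angle(1j, 1 + 0j), -math.pi / 2)
```

The Kontsevich weight itself requires integrating wedge products of the differentials of such angles over the compactified configuration space, which this sketch does not attempt.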
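Further explanations about these operators and weights can be found in [13]. However, we still need a lemma which is also proven in (or follows directly from) [13].

Definition 11. Let $\Gamma \in G_{n,3}$. We denote by $\mathrm{sub}(\Gamma)_{\{\bar 1,\bar 2\}}$ the set of the subsets $S$ of $V_\Gamma^a$ such that $\Gamma/_{\{\bar 1,\bar 2\} \sqcup S}$ and $\Gamma(\{\bar 1,\bar 2\} \sqcup S)$ are still Kontsevich graphs of type $(n,2)$. We define $\mathrm{sub}(\Gamma)_{\{\bar 2,\bar 3\}}$ similarly.

Lemma 8.
\[ \sum_{\Gamma \in G_{n,3}} \Big( \sum_{S \in \mathrm{sub}(\Gamma)_{\{\bar 1,\bar 2\}}} W_{\Gamma/_{\{\bar 1,\bar 2\} \sqcup S}}\, W_{\Gamma(\{\bar 1,\bar 2\} \sqcup S)} - \sum_{S \in \mathrm{sub}(\Gamma)_{\{\bar 2,\bar 3\}}} W_{\Gamma/_{\{\bar 2,\bar 3\} \sqcup S}}\, W_{\Gamma(\{\bar 2,\bar 3\} \sqcup S)} \Big) \hat B_\Gamma = 0. \]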
5.2. Factorization into connected components of graphs of type $(n,2)$. We describe here a procedure which allows us to decompose a graph $\Gamma$ of type $(n,2)$ into $l$ graphs $\Gamma_1,\dots,\Gamma_l$ of the same type, its connected components in a slightly unusual sense. Take $\Gamma \in G_{n,2}$ and let $\Gamma^a$ be its restriction to the aerial vertices. Then
(1) Consider the usual connected components of $\Gamma^a$. We can number them in a unique way using the following rule: let $\Gamma^a_i$, $\Gamma^a_j$ be two connected components of $\Gamma^a$. We impose that $i < j$ iff $\min V_{\Gamma^a_i} < \min V_{\Gamma^a_j}$.
(2) For each connected component $\Gamma^a_i$ of $\Gamma^a$ we can reconstruct a Kontsevich graph, which we denote by $\Gamma_i$:
(a) To begin with, add to each $\Gamma^a_i$ the vertices and edges that we removed when passing to $\Gamma^a$. Let $\hat\Gamma_i$ be this graph.
(b) Relabel the vertices of $\hat\Gamma_i$ by $1,2,\dots,|\Gamma^a_i|$, preserving the relative order of the vertices of $\Gamma^a_i$. One gets a new Kontsevich graph $\Gamma_i$.

Definition 12.
(1) Let $\Gamma \in G_{n,2}$. We call the $\Gamma_i$'s as constructed above the connected factors of $\Gamma$. Because of the numbering of the $\Gamma^a_i$, the connected factors of a Kontsevich graph are uniquely numbered. The connected factors of $\Gamma$ are connected Kontsevich graphs.
(2) We denote by $G_{n,2}(n_1,\dots,n_k)$ the graphs of $G_{n,2}$ which have $k$ connected factors and such that the $i$-th connected factor $\Gamma_i$ is a Kontsevich graph of order $n_i$.
(3) We call the factorization map the map $D$ defined by $D(\Gamma) = (\Gamma_1,\dots,\Gamma_k)$, where the $\Gamma_i$ are the connected factors of $\Gamma$.

Similar considerations about connected Kontsevich graphs and connected factorization can be found in [12]. In particular one can find the following lemma:

Lemma 9 (Factorization Lemma). Let $\Gamma \in G_{n,2}$ and let $D(\Gamma) = (\Gamma_1,\dots,\Gamma_k)$ be its connected factorization. Then we have
(1) $W_\Gamma = W_{\Gamma_1} \cdots W_{\Gamma_k}$,
(2) $\hat B_\Gamma = \hat B_{\Gamma_1} \cdots \hat B_{\Gamma_k}$.

5.3. Number of graphs leading to the same connected factorization. We are looking for the number of graphs of $G_{n,2}$ which lead to the same connected factorization. This number plays a crucial role in proving the existence of the generating function. It is clear that $D(\Gamma) = D(\Gamma')$ only if $\Gamma, \Gamma' \in G_{n,2}(n_1,\dots,n_k)$ for some $n_1,\dots,n_k$. Therefore the problem of counting the number of Kontsevich graphs of type $(n,2)$ that lead to the same factorization can be stated in the following terms: given $(\Gamma_1,\dots,\Gamma_k) \in C_{n_1,2} \times \dots \times C_{n_k,2}$, what is the number of elements of $D^{-1}(\Gamma_1,\dots,\Gamma_k)$? The answer is contained in the following remarks.

Notice that the permutation group $S_n$ acts on $G_{n,2}$ by permuting the aerial vertices. Let $\Gamma \in G_{n,2}(n_1,\dots,n_k)$. All the graphs $\Gamma' \in G_{n,2}(n_1,\dots,n_k)$ which give the same connected factorization as $\Gamma$ are generated by a subset of $S_n$, i.e.,
\[ \forall\, \Gamma' \in G_{n,2}(n_1,\dots,n_k) \text{ s.t. } D(\Gamma) = D(\Gamma') \quad \exists\, \sigma \in P \text{ s.t. } \sigma\Gamma = \Gamma'. \]
This subset $P \subset S_n$ is defined by the constraints:
(1) The permutation must preserve the relative order of the vertices of each $V_{\Gamma_i}$.
(2) Consider the set of the minimum vertices of the $V_{\Gamma_i}$. The permutation must preserve the relative order of this set.
It remains then to count the number of such permutations. The second constraint restricts the number of allowed permutations to $\frac{n!}{k!}$. The first further restricts it to $\frac{n!}{k!\,n_1! \cdots n_k!}$. Thus
\[ |D^{-1}(\Gamma_1,\dots,\Gamma_k)| = \frac{n!}{k!\,n_1! \cdots n_k!}. \]
As this number reappears in another context, let us denote it by $d(n_1,\dots,n_k)$ and call it the decomposition coefficient.
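The decomposition coefficient is a simple closed formula, and the following sketch evaluates it exactly. It is an illustration only: the function names are ours, values are returned as exact fractions, and the consistency check at the end uses a standard combinatorial identity that is not stated in the text, namely that summing $d(n_1,\dots,n_k)$ over all ordered $k$-tuples of positive integers with $n_1+\dots+n_k = n$ gives the number of ways to split $\{1,\dots,n\}$ into $k$ non-empty unordered blocks (a Stirling number of the second kind).

```python
from fractions import Fraction
from itertools import product
from math import factorial


def decomposition_coefficient(*sizes):
    """d(n_1, ..., n_k) = n! / (k! n_1! ... n_k!) with n = n_1 + ... + n_k."""
    n, k = sum(sizes), len(sizes)
    denominator = factorial(k)
    for size in sizes:
        denominator *= factorial(size)
    return Fraction(factorial(n), denominator)


def compositions(n, k):
    """All ordered k-tuples of positive integers summing to n."""
    return [c for c in product(range(1, n + 1), repeat=k) if sum(c) == n]


def total(n, k):
    """Sum of d(n_1, ..., n_k) over all compositions of n into k parts."""
    return sum(decomposition_coefficient(*c) for c in compositions(n, k))


print(decomposition_coefficient(2, 2))  # 3
print(total(4, 2), total(4, 3))         # 7 6
```

The same sum over $(n_1,\dots,n_k)$ weighted by $d(n_1,\dots,n_k)$ is what turns the sum over all graphs into an exponential of the sum over connected graphs in the exponential formula of Section 7.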
5.4. Contraction-restriction decomposition of trees of type $(n,3)$. Here begin some new considerations about Kontsevich graphs. We will see that hidden in each Kontsevich tree of type $(n,3)$ lie two Cayley trees which encode the contraction and restriction of the tree leading to Kontsevich trees. These two Cayley trees allow us to make a link between the perturbative SGA equation, which is expressed in terms of Cayley trees, and the proposed solution, which is expressed in terms of Kontsevich trees. The main results of this section are summarized in Definition 14 and Proposition 5. But let us first establish a few small facts that are needed to make any statement. In what follows, $\Gamma^a$ denotes the restriction of $\Gamma$ to its aerial vertices.

Lemma 10. Let $\Gamma \in T_{n,m}$; then
(1) $|E_\Gamma^a| = n - 1$,
(2) $|E_\Gamma^g| = n + 1$.

Proof. For the first assertion one notices that $\Gamma^a$, which has $n$ vertices, is connected, so there are at least $n - 1$ edges connecting these vertices. Now, if we add an edge, we create a cycle, which contradicts the fact that $\Gamma^a$ is a tree. The second assertion follows from the identity $|E_\Gamma^a| + |E_\Gamma^g| = 2n$.

Corollary 1. There is no Kontsevich tree of type $(n,1)$ (i.e., $T_{n,1} = \emptyset$).

Proof. As $|E_\Gamma^g| = n + 1$ and $|V_\Gamma^a| = n$, one aerial vertex has its two edges landing at the only ground vertex, and we do not allow double edges.

Corollary 2. Suppose $\Gamma \in T_{n,2}$. Then $E_\Gamma^g$ has at least one edge landing at $\bar 1$ and one edge landing at $\bar 2$.

Proof. Without loss of generality, suppose that all edges of $E_\Gamma^g$ land at $\bar 1$; then $\Gamma/_{V_\Gamma^a \sqcup \{\bar 1\}} \in T_{n,1} = \emptyset$.

Corollary 3. Suppose $\Gamma \in T_{n,2}$. There is at least one $v \in V_\Gamma^a$ such that $\gamma^1(v) = \bar 1$, $\gamma^2(v) = \bar 2$.

Proof. As $|E_\Gamma^g| = n + 1$ and $|V_\Gamma^a| = n$, there is one aerial vertex both of whose edges are ground edges. Those two edges cannot land at the same ground vertex, as we exclude double edges.

Definition 13.
(1) Let $\Gamma \in G_{n,m}$. One defines the following transitive relation among the vertices of $\Gamma$: $v < w$ iff there exist $a_1,\dots,a_k \in V_\Gamma$ such that $(w,a_1),\dots,(a_i,a_{i+1}),\dots,(a_k,v) \in E_\Gamma$.
(2) Let $\Gamma \in G_{n,m}$. Let us denote
\[ \mathrm{star}_{in}(v) := \{w \in V_\Gamma \text{ s.t. } v < w\}, \qquad \mathrm{star}_{out}(v) := \{w \in V_\Gamma \text{ s.t. } w < v\}. \]

Lemma 11. Let $\Gamma \in T_{n,3}$. Denote $N_{\bar 1} := \mathrm{star}_{in}(\bar 1)$, $B_{\bar 1} := V_\Gamma^a \setminus N_{\bar 1}$, and let $\Gamma^1_{B_{\bar 1}},\dots,\Gamma^l_{B_{\bar 1}}$ be the connected factors of $\Gamma/_{\{\bar 2,\bar 3\} \sqcup B_{\bar 1}}$. Then the $\Gamma^i_{B_{\bar 1}}$'s are Kontsevich trees with two ground vertices (provided that $B_{\bar 1} \neq \emptyset$). The same statement holds if we replace $B_{\bar 1}$ by $B_{\bar 3}$ and make the restriction around $\{\bar 1,\bar 2\} \sqcup B_{\bar 3}$.
Proof. Take $\Gamma^i_{B_{\bar 1}}$. As there are no edges $(v,w)$ starting from $B_{\bar 1}$ and landing at $N_{\bar 1}$ (otherwise $v > w > \bar 1 \Rightarrow v \in N_{\bar 1}$), all the vertices of $B_{\bar 1}$ conserve their two edges when passing to the restriction $\Gamma/_{\{\bar 2,\bar 3\} \sqcup B_{\bar 1}}$. It remains to be shown that the edges of $E^g_{\Gamma^i_{B_{\bar 1}}}$ do not all land exclusively at one of $\bar 1$ or $\bar 2$. But Corollary 3 prevents this phenomenon from happening.

Trivial little facts. We define for convenience $B^i_{\bar 1} := V^a_{\Gamma^i_{B_{\bar 1}}}$, $i = 1,\dots,l$, and $N^j_{\bar 1} := V^a_{\Gamma^j_{N_{\bar 1}}}$, $j = 1,\dots,k$, where the $\Gamma^j_{N_{\bar 1}}$ are the connected factors of $\Gamma(\{\bar 2,\bar 3\} \sqcup B_{\bar 1})$. We see that:
(1) There is at most one edge starting from an $N^j_{\bar 1}$ and landing at a given $B^i_{\bar 1}$ (otherwise one introduces a cycle).
(2) There is no edge from an $N^j_{\bar 1}$ to another $N^i_{\bar 1}$ (they are connected factors).
(3) There is no edge from a $B^j_{\bar 1}$ to another $B^i_{\bar 1}$ (they are connected factors).
(4) There is no edge from a $B^j_{\bar 1}$ to an $N^i_{\bar 1}$ (otherwise one vertex of $B^j_{\bar 1}$ would be in $N^i_{\bar 1}$).

Corollary 4 (Contraction/Restriction trees). Let $\Gamma \in T_{n,3}$. We can make the following construction:
• identifying each $N^j_{\bar 1}$, $j = 1,\dots,k$, with a black vertex and each $B^i_{\bar 1}$, $i = 1,\dots,l$, with a white vertex,
• putting an edge between a black vertex and a white vertex iff there is an edge between the corresponding sets $N^j_{\bar 1}$ and $B^i_{\bar 1}$,
• labelling the black and white vertices such that $i < j$ iff the minimum of the set corresponding to $i$ is less than the minimum of the set corresponding to $j$,
we get a Cayley tree $t_2 \in T$. This tree $t_2$ is called the second contraction/restriction tree of $\Gamma$. If we start the construction from $B_{\bar 3}$ and $N_{\bar 3}$ we get $t_1$, the first contraction/restriction tree of $\Gamma$.

Example 2. The following graph illustrates these phenomena:

[Figure not reproduced: a Kontsevich tree with three ground vertices $\bar 1, \bar 2, \bar 3$.]

For this graph, the two contraction/restriction trees are $t_1 = \bullet$ and a tree $t_2$ with several vertices [diagram of $t_2$ not reproduced].
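The sets $N_{\bar 1}$ and $B_{\bar 1}$ entering the construction above are defined by reachability, so they can be computed by a simple graph search. The sketch below is an illustration under assumptions of our own: a Kontsevich graph is stored as a hypothetical dictionary mapping each aerial vertex $k$ to the pair $(\gamma^1(k),\gamma^2(k))$, ground vertices are written as the strings `"g1"`, `"g2"`, `"g3"`, and the sample graph is a small tree of type $(3,3)$ chosen for this example (it is not the graph of Example 2).

```python
def star_in(graph, target):
    """Vertices from which `target` can be reached along directed edges,
    i.e. star_in(target) = {w : target < w} in the notation of Definition 13."""
    reached = set()
    frontier = [target]
    while frontier:
        w = frontier.pop()
        for v, targets in graph.items():
            if w in targets and v not in reached:
                reached.add(v)
                frontier.append(v)
    return reached


def split_aerial_vertices(graph, ground):
    """Return (N, B): N = star_in(ground), B = remaining aerial vertices."""
    n_set = star_in(graph, ground)
    b_set = set(graph) - n_set
    return n_set, b_set


# Hypothetical tree of type (3, 3): aerial edges 3 -> 1 and 3 -> 2,
# ground edges 1 -> g1, 1 -> g2, 2 -> g2, 2 -> g3.
sample = {1: ("g1", "g2"), 2: ("g2", "g3"), 3: (1, 2)}

n1, b1 = split_aerial_vertices(sample, "g1")
n3, b3 = split_aerial_vertices(sample, "g3")
print(sorted(n1), sorted(b1))  # [1, 3] [2]
print(sorted(n3), sorted(b3))  # [2, 3] [1]
```

In this sample the graph has $n - 1 = 2$ aerial edges and $n + 1 = 4$ ground edges, in agreement with Lemma 10.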
Lemma 12. Let $\Gamma \in T_{n,3}$. Denote $N_{\bar 1} := \mathrm{star}_{in}(\bar 1)$, $B_{\bar 1} := V_\Gamma^a \setminus N_{\bar 1}$, and let $\Gamma^1_{N_{\bar 1}},\dots,\Gamma^k_{N_{\bar 1}}$ be the connected factors of $\Gamma(\{\bar 2,\bar 3\} \sqcup B_{\bar 1})$. Then the $\Gamma^i_{N_{\bar 1}}$'s are Kontsevich trees with two ground vertices (provided that $B_{\bar 1} \neq \emptyset$). The same statement holds if we replace $B_{\bar 1}$ by $B_{\bar 3}$ and make the contraction around $\{\bar 1,\bar 2\} \sqcup B_{\bar 3}$.

Proof. From the vertices in $N^i_{\bar 1} := V_{\Gamma^i_{N_{\bar 1}}}$ there is at least one edge landing at $\bar 1$ and at most one landing at each $B^\mu_{\bar 1}$. The only bad thing that can happen is then that there is a $v \in N^i_{\bar 1}$ such that $\gamma^1(v) \in B^\mu_{\bar 1}$ and $\gamma^2(v) \in B^\nu_{\bar 1}$. But then $v$ has no edge left starting from it, which implies that $\bar 1 \notin \mathrm{star}_{out}(v)$.

Definition 14. Let $\Gamma \in T_{n,3}$. We define the contraction/restriction decomposition maps
\[ P^i(\Gamma) = (t_i,\Gamma_1,\dots,\Gamma_m), \qquad i = 1, 2, \]
where $t_i \in T$ is the $i$-th contraction/restriction tree of $\Gamma$ and the $\Gamma_j$ are the connected factors of the contraction and the restriction of $\Gamma$ around $\{\bar 1,\bar 2\} \sqcup B_{\bar 3}$ for $i = 1$ and around $\{\bar 2,\bar 3\} \sqcup B_{\bar 1}$ for $i = 2$. We index these connected factors with the usual convention, that is, $k < l$ if the minimum of the aerial vertices of $\Gamma_k$ is less than the minimum of the aerial vertices of $\Gamma_l$. We denote by $T^i_{n,3}(t,\Gamma_1,\dots,\Gamma_{|t|})$ the subset of those $\Gamma \in T_{n,3}$ such that $P^i(\Gamma) = (t,\Gamma_1,\dots,\Gamma_{|t|})$, for $i = 1, 2$.

Example 3. For the previous graph, $P^1(\Gamma)$ consists of the one-vertex tree $t_1 = \bullet$ together with a single Kontsevich tree, and $P^2(\Gamma)$ consists of the tree $t_2$ together with one Kontsevich tree for each of its vertices [diagrams not reproduced].
Proposition 5. Let $\Gamma \in T_{n,3}$. Then, in the notation used above, we have:
(1) Let $\Gamma \in T^1_{n,3}(t;\Gamma_1,\dots,\Gamma_{|t|})$; then
\[ W_{\Gamma_1} \cdots W_{\Gamma_{|t|}} = W_{\Gamma/_{B_{\bar 3} \sqcup \{\bar 1,\bar 2\}}}\, W_{\Gamma(B_{\bar 3} \sqcup \{\bar 1,\bar 2\})}. \]
Let $\Gamma \in T^2_{n,3}(t;\Gamma_1,\dots,\Gamma_{|t|})$; then
\[ W_{\Gamma_1} \cdots W_{\Gamma_{|t|}} = W_{\Gamma/_{B_{\bar 1} \sqcup \{\bar 2,\bar 3\}}}\, W_{\Gamma(B_{\bar 1} \sqcup \{\bar 2,\bar 3\})}. \]
(2) We have the following equation for the Kontsevich weights:
\[ \sum_{\Gamma \in T_{n,3}} \Big( W_{\Gamma/_{\{\bar 1,\bar 2\} \sqcup B_{\bar 3}}}\, W_{\Gamma(\{\bar 1,\bar 2\} \sqcup B_{\bar 3})} - W_{\Gamma/_{\{\bar 2,\bar 3\} \sqcup B_{\bar 1}}}\, W_{\Gamma(\{\bar 2,\bar 3\} \sqcup B_{\bar 1})} \Big) \hat B_\Gamma = 0. \]
(3) The following relates Cayley trees and Kontsevich trees: for all $t \in T$ we have
\[ C^i_t(\hat B_{\Gamma_1},\dots,\hat B_{\Gamma_{|t|}}) = \frac{1}{d(n_1,\dots,n_{|t|})} \sum_{\Gamma \in T^i_{n,3}(t;\Gamma_1,\dots,\Gamma_{|t|})} \hat B_\Gamma. \]

Proof. (1) is trivial.
(2) is a consequence of Lemma 8 once one has proved that $\mathrm{sub}(\Gamma)_{\{\bar 1,\bar 2\}} = \{B_{\bar 3}\}$ and $\mathrm{sub}(\Gamma)_{\{\bar 2,\bar 3\}} = \{B_{\bar 1}\}$. By Lemmas 11 and 12 one has already that $B_{\bar 1} \in \mathrm{sub}(\Gamma)_{\{\bar 2,\bar 3\}}$ and $B_{\bar 3} \in \mathrm{sub}(\Gamma)_{\{\bar 1,\bar 2\}}$. It remains to check that they are the only ones. Let us prove this only for $B_{\bar 1}$. Suppose there is another subset $K \subset V_\Gamma^a$ such that $\Gamma(\{\bar 2,\bar 3\} \sqcup K)$ and $\Gamma/_{\{\bar 2,\bar 3\} \sqcup K}$ are Kontsevich trees. This implies that in the process of
(a) restriction around $\{\bar 2,\bar 3\} \sqcup K$, one should not lose an edge,
(b) contraction around $\{\bar 2,\bar 3\} \sqcup K$, one should not end up with a double edge.
(A) Suppose that $K \cap N_{\bar 1} \neq \emptyset$. Take $v \in K \cap N_{\bar 1}$; then $\mathrm{star}_{out}(v)$ is a subset of $K$, otherwise we lose an edge when doing the restriction around $\{\bar 2,\bar 3\} \sqcup K$. But $\bar 1 \in \mathrm{star}_{out}(v)$, which implies that $\bar 1 \in K$, otherwise we lose an edge when doing the restriction. This contradicts $K \subset V_\Gamma^a$.
(B) By (A) we have that $K \subset B_{\bar 1}$. Suppose that $K$ is strictly contained in $B_{\bar 1}$. Then $(\Gamma/_{\{\bar 2,\bar 3\} \sqcup B_{\bar 1}})(K \sqcup \{\bar 2,\bar 3\})$ is a subgraph of $\Gamma(K \sqcup \{\bar 2,\bar 3\})$. But as there are no edges starting from $B_{\bar 1}$ and landing at $\bar 1$, $(\Gamma/_{\{\bar 2,\bar 3\} \sqcup B_{\bar 1}})(K \sqcup \{\bar 2,\bar 3\})$ is a Kontsevich tree with only one ground vertex, which implies that it is not a Kontsevich tree. Contradiction.
(3) First remark that
\[ \sum_{\Gamma \in T^i_{n,3}(t,\Gamma_1,\dots,\Gamma_{|t|})} B_\Gamma = d(n_1,\dots,n_{|t|}) \sum_{\Gamma \in A} B_\Gamma, \]
where $A$ is the subset of trees $\Gamma \in T^i_{n,3}(t,\Gamma_1,\dots,\Gamma_{|t|})$ such that all vertices in $V_\Gamma$ corresponding to those of $V_{\Gamma_i}$ are less than those corresponding to $V_{\Gamma_j}$ if $i < j$. It is clear that, letting all the permutations of $S_n$ act which preserve the relative order of the minimal vertex of each $V_{\Gamma_i}$ and the relative order of the vertices in each $V_{\Gamma_i}$, we get all trees of $T^i_{n,3}(t,\Gamma_1,\dots,\Gamma_{|t|})$. We have already counted the number of such permutations; it is exactly the decomposition coefficient $d(n_1,\dots,n_{|t|})$. The identity $\sum_{\Gamma \in A} B_\Gamma = C^i_t(\hat B_{\Gamma_1},\dots,\hat B_{\Gamma_k})$ follows from the Leibniz rule.
6. Proof of Theorem 1

Let us restate the main theorem.

Theorem 1. Given a Poisson structure $\alpha$ on $\mathbb{R}^d$ there exists a unique natural deformation of the trivial generating function such that the first order is precisely $\alpha$. Moreover, we have an explicit formula for this deformation:
\[ S_h(p_1,p_2,x) = x(p_1+p_2) + \sum_{n=1}^{\infty} \frac{h^n}{n!} \sum_{\Gamma \in T_{n,2}} W_\Gamma \hat B_\Gamma(p_1,p_2,x), \]
where $T_{n,2}$ is the set of Kontsevich trees of type $(n,2)$, $W_\Gamma$ is the Kontsevich weight of $\Gamma$, and $\hat B_\Gamma$ is the symbol of the bidifferential operator $B_\Gamma$ associated to $\Gamma$.
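Proof. Existence of the solution. Let us verify that the proposed solution satisfies the perturbative version of the SGA equation. Denote
\[ M^i_n(S) = \sum_{\substack{t \in T \\ |t| \le n}} \frac{1}{|t|!} \sum_{\substack{n_1+\dots+n_{|t|}=n \\ n_i \ge 1}} C^i_t(S_{n_1},\dots,S_{n_{|t|}}). \]
Let us compute $M^1_n(S)$ for the proposed solution:
\[ M^1_n(S) = \sum_{\substack{t \in T \\ |t| \le n}} \frac{1}{|t|!} \sum_{\substack{n_1+\dots+n_{|t|}=n \\ n_i \ge 1}} \; \sum_{\substack{\Gamma_i \in T_{n_i,2} \\ i = 1,\dots,|t|}} \frac{W_{\Gamma_1} \cdots W_{\Gamma_{|t|}}}{n_1! \cdots n_{|t|}!}\, C^1_t\big(\hat B_{\Gamma_1},\dots,\hat B_{\Gamma_{|t|}}\big) \]
\[ = \sum_{\substack{t \in T \\ |t| \le n}} \; \sum_{\substack{n_1+\dots+n_{|t|}=n \\ n_i \ge 1}} \; \sum_{\substack{\Gamma_i \in T_{n_i,2} \\ i = 1,\dots,|t|}} \frac{W_{\Gamma_1} \cdots W_{\Gamma_{|t|}}}{(n_1+\dots+n_{|t|})!} \sum_{\Gamma \in T^1_{n,3}(t;\Gamma_1,\dots,\Gamma_{|t|})} \hat B_\Gamma \]
\[ = \frac{1}{n!} \sum_{\substack{t \in T \\ |t| \le n}} \; \sum_{\substack{n_1+\dots+n_{|t|}=n \\ n_i \ge 1}} \; \sum_{\substack{\Gamma_i \in T_{n_i,2} \\ i = 1,\dots,|t|}} \; \sum_{\Gamma \in T^1_{n,3}(t,\Gamma_1,\dots,\Gamma_{|t|})} W_{\Gamma(\{\bar 1,\bar 2\} \sqcup B_{\bar 3})}\, W_{\Gamma/_{\{\bar 1,\bar 2\} \sqcup B_{\bar 3}}}\, \hat B_\Gamma \]
\[ = \frac{1}{n!} \sum_{\Gamma \in T_{n,3}} W_{\Gamma(\{\bar 1,\bar 2\} \sqcup B_{\bar 3})}\, W_{\Gamma/_{\{\bar 1,\bar 2\} \sqcup B_{\bar 3}}}\, \hat B_\Gamma, \qquad n \ge 1, \]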
which implies by Proposition 5 that $M^1_n(S) - M^2_n(S) = 0$ for all $n > 0$.

Uniqueness of the solution. We have seen that the perturbative SGA equations can be put at each order into the form
\[ dS_m + H_m(S_{m-1},\dots,S_1) = 0, \]
where the differential $d$ may be identified with the Hochschild differential on symbols. Let $S$ and $S'$ be two generating functions. By definition we have that $S_1 = S'_1 = \alpha$. Now suppose that $S$ and $S'$ are equal up to order $m - 1$ (i.e., $S_k = S'_k$, $k \le m - 1$). Thus $K_m := S_m - S'_m \in C^\infty(B_2)$ satisfies the following equation:
\[ dK_m = H_m(S_1,\dots,S_{m-1}) - H_m(S'_1,\dots,S'_{m-1}) = 0. \]
As $H^2(C^\infty(B_\bullet),d) = V^2(\mathbb{R}^d)$ (bivector fields over $\mathbb{R}^d$), we have that $K_m$ can be written as $K_m = dk_m + \omega$, where $k_m$ is a 1-cochain and $\omega$ is a bivector field. Because of the homogeneity of $K_m$ in the $p$'s, $\omega$ vanishes.

Claim. $k_m(p) := -\frac{1}{m+1} K^1_m(p,p)$ is a primitive of $K_m$, i.e., $dk_m = K_m$.

This claim proves the uniqueness, because by assumption we have $K^1_m(p,p) = 0$, which means that $k_m = 0$ and thus $dk_m = K_m = 0$. As for the claim, suppose that
\[ K_m(p_1,p_2) = \sum_{|I|+|J|=m+1} K^{I,J}_m \frac{p_1^I\, p_2^J}{I!\,J!}, \]
where we use the usual convention for the multi-indices $I = (i_1,\dots,i_d)$, $J = (j_1,\dots,j_d) \in \mathbb{N}^d$. Then an easy computation yields that
(1) $k_m(p) = -\frac{1}{m+1} K^1_m(p,p) = -\sum_{|I|=m+1} K^{e_1,I-e_1}_m \frac{p^I}{I!}$, where $e_1 = (1,0,\dots,0)$,
(2) $dK_m = 0$ implies that $K^{I,J}_m = K^{L,N}_m$ if $|I| + |J| = |L| + |N|$,
which implies that $dk_m(p_1,p_2) = K_m(p_1,p_2)$.
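The uniqueness argument rests on the differential $d$ introduced in Section 4.1. As a sanity check, the sketch below implements that formula for functions of the momenta only (the variable $x$ is a spectator and is suppressed) and verifies $d^2 = 0$ on a few integer sample points. The function name `hochschild_d` and the sample polynomials are assumptions made for this illustration.

```python
def hochschild_d(f, n):
    """Return df, a function of n + 1 arguments, for f a function of n arguments:

    df(p_1..p_{n+1}) = f(p_2..p_{n+1})
                       - sum_{i=1}^{n} (-1)^{i+1} f(p_1, .., p_i + p_{i+1}, .., p_{n+1})
                       + (-1)^{n+1} f(p_1..p_n).
    """
    def df(*p):
        assert len(p) == n + 1
        total = f(*p[1:])
        for i in range(1, n + 1):
            # merge the i-th and (i+1)-th arguments (1-based) into their sum
            merged = p[:i - 1] + (p[i - 1] + p[i],) + p[i + 1:]
            total -= (-1) ** (i + 1) * f(*merged)
        total += (-1) ** (n + 1) * f(*p[:n])
        return total
    return df


def f1(p):
    """A sample 1-cochain (integer valued on integers)."""
    return p ** 3 + 2 * p


def f2(p, q):
    """A sample 2-cochain (integer valued on integers)."""
    return p * p * q + 3 * q ** 3 - p


ddf1 = hochschild_d(hochschild_d(f1, 1), 2)  # function of 3 arguments
ddf2 = hochschild_d(hochschild_d(f2, 2), 3)  # function of 4 arguments

print(ddf1(1, 2, 3), ddf2(1, -2, 3, 5))  # 0 0
```

For $n = 2$ the function `hochschild_d(f2, 2)` evaluates $f(p_2,p_3) - f(p_1+p_2,p_3) + f(p_1,p_2+p_3) - f(p_1,p_2)$, the combination that appears in the first-order SGA equation.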
7. Comparison with Deformation Quantization

In this section we make precise the statement that the generating function may be seen as the semi-classical approximation of the Kontsevich deformation formula. Namely, Kontsevich gave in [13] an explicit formula for the associative deformation of the usual product of functions on $\mathbb{R}^d$ in the direction of a Poisson structure $\alpha$,
\[ f * g = fg + \sum_{n \ge 1} \frac{h^n}{n!} \sum_{\Gamma \in G_{n,2}} W_\Gamma B_\Gamma(f,g), \]
where the $W_\Gamma$ are the weights and the $B_\Gamma$ the bidifferential operators introduced in Sect. 5.

Definition 15. Consider a graph $\Gamma$ in $C_{n,2}$, the set of connected graphs of type $(n,2)$. We denote by $n_\Gamma := |E_\Gamma^a|$ the number of aerial edges and by $e_\Gamma := |E_\Gamma^g|$ the number of ground edges.

In order to introduce the number of loops in a connected graph let us make the following remark. If $\Gamma$ is a connected graph of type $(n,2)$ then $\Gamma$ must have at least $n - 1$ aerial edges, which means that $n - 1 \le n_\Gamma$. On the other hand we have $n_\Gamma + e_\Gamma = 2n$. This implies that for connected Kontsevich graphs the number $n - e_\Gamma + 1$ is always positive or zero.

Definition 16. For a connected graph $\Gamma$ of type $(n,2)$ we call the number $n - e_\Gamma + 1$ the number of loops of the graph and we denote it by $b_\Gamma$. We denote by $B^l_n$ the set of connected graphs of type $(n,2)$ with $l$ loops and we set $B^l = \cup_{n=1}^{\infty} B^l_n$. It is easy to see that the $B^0_n$ are exactly the Kontsevich trees $T_{n,2}$.

The following lemma shows that the star-product can be considered as a suitable exponentiation of a deformation of the Poisson structure.

Lemma 13 (Exponential formula). Let $f, g \in C^\infty(M)$. The star-product can be expressed as
\[ f * g(x) = \exp\Big( \frac{1}{h} D\big(h\partial_{x'},h\partial_{x''},x\big) \Big) f(x')g(x'') \Big|_{x'=x''=x}, \]
where
\[ D(p_1,p_2,x) = \sum_{j=0}^{\infty} h^j D^j(p_1,p_2,x) \quad \text{and} \quad D^j(p_1,p_2,x) = \sum_{\Gamma \in B^j} \frac{W_\Gamma}{|\Gamma|!} \hat B_\Gamma(p_1,p_2,x). \]
Proof. By definition of the star-product, the definition of the Bˆ and using Lemma 9 of Sect. 5 we can do the following computation:
Formal Symplectic Groupoid
671
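\[
\begin{aligned}
f * g(x)
&= \Big(1 + \sum_{n=1}^{\infty} \frac{h^n}{n!} \sum_{\Gamma\in G_{n,2}} W_\Gamma \hat B_\Gamma(\partial_{x'},\partial_{x''},x)\Big) f(x')g(x'')\Big|_{x'=x''=x} \\
&= \Big(1 + \sum_{n=1}^{\infty} \frac{h^n}{n!} \sum_{\substack{\Gamma\in G_{n,2} \\ D(\Gamma)=(\Gamma_1,\dots,\Gamma_k)}} (W_{\Gamma_1}\hat B_{\Gamma_1})\cdots(W_{\Gamma_k}\hat B_{\Gamma_k})\Big) f(x')g(x'')\Big|_{x'=x''=x} \\
&= \Big(1 + \sum_{n=1}^{\infty} \frac{h^n}{n!} \sum_{k=1}^{n} \sum_{\substack{n_1,\dots,n_k\in\mathbb N\setminus\{0\} \\ n_1+\dots+n_k=n}} \ \sum_{\Gamma\in G_{n,2}(n_1,\dots,n_k)} (W_{\Gamma_1}\hat B_{\Gamma_1})\cdots(W_{\Gamma_k}\hat B_{\Gamma_k})\Big) f(x')g(x'')\Big|_{x'=x''=x} \\
&= \Big(1 + \sum_{n=1}^{\infty} \frac{h^n}{n!} \sum_{k=1}^{n} \sum_{\substack{n_1,\dots,n_k\in\mathbb N\setminus\{0\} \\ n_1+\dots+n_k=n}} \frac{n!}{k!\,n_1!\cdots n_k!} \sum_{(\Gamma_1,\dots,\Gamma_k)\in C_{n_1,2}\times\dots\times C_{n_k,2}} (W_{\Gamma_1}\hat B_{\Gamma_1})\cdots(W_{\Gamma_k}\hat B_{\Gamma_k})\Big) f(x')g(x'')\Big|_{x'=x''=x} \\
&= \Big(1 + \sum_{k=1}^{\infty} \frac{1}{k!} \sum_{n_1,\dots,n_k\in\mathbb N\setminus\{0\}} \ \sum_{(\Gamma_1,\dots,\Gamma_k)\in C_{n_1,2}\times\dots\times C_{n_k,2}} \Big(h^{n_1}\frac{W_{\Gamma_1}}{n_1!}\hat B_{\Gamma_1}\Big)\cdots\Big(h^{n_k}\frac{W_{\Gamma_k}}{n_k!}\hat B_{\Gamma_k}\Big)\Big) f(x')g(x'')\Big|_{x'=x''=x} \\
&= \Big(1 + \sum_{k=1}^{\infty} \frac{1}{k!} \Big(\sum_{n=1}^{\infty} h^n \sum_{\Gamma\in C_{n,2}} \frac{W_\Gamma}{n!}\hat B_\Gamma\Big)^k\Big) f(x')g(x'')\Big|_{x'=x''=x} \\
&= \exp\Big(\frac{1}{h} \sum_{n=1}^{\infty} \sum_{\Gamma\in C_{n,2}} h^{n+1} \frac{W_\Gamma}{n!}\, \hat B_\Gamma(\partial_{x'},\partial_{x''},x)\Big) f(x')g(x'')\Big|_{x'=x''=x}.
\end{aligned}
\]
Remarking that $\hat B_\Gamma(\partial_{x'},\partial_{x''},x) = \frac{1}{h^{e_\Gamma}}\hat B_\Gamma(h\partial_{x'},h\partial_{x''},x)$, we can conclude that
\[
\begin{aligned}
f * g(x)
&= \exp\Big(\frac{1}{h} \sum_{n=1}^{\infty} \sum_{\Gamma\in C_{n,2}} h^{n+1-e_\Gamma} \frac{W_\Gamma}{|\Gamma|!}\, \hat B_\Gamma(h\partial_{x'},h\partial_{x''},x)\Big) f(x')g(x'')\Big|_{x'=x''=x} \\
&= \exp\Big(\frac{1}{h} \sum_{j=0}^{\infty} h^j D^j(h\partial_{x'},h\partial_{x''},x)\Big) f(x')g(x'')\Big|_{x'=x''=x}.
\end{aligned}
\]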
The semi-classical part of the deformation formula is $\frac{1}{h} D^0(hp_1, hp_2, x)$. It is easy to see that
\[
x(p_1 + p_2) + \frac{1}{h} D^0(hp_1, hp_2, x)
\]
is exactly the formal symplectic groupoid generating function. It is in this sense that one can consider the generating function as a semi-classical approximation of the deformation formula.

We now give a quick but non-rigorous proof of the existence part of Theorem 1. We use the technique of saddle point approximation (over integrals that are not really well defined). The following computations are therefore by no means a replacement for the rigorous and more technical argument developed in the previous sections. First notice that, as a consequence of the exponential formula of the previous lemma, we have
\[
e^{\frac{i}{\hbar} p_1 x} * e^{\frac{i}{\hbar} p_2 x} = e^{\frac{i}{\hbar}\left(\sum_{j\ge 0} \left(\frac{\hbar}{i}\right)^j D^j(p_1,p_2,x)\right)}.
\]
We have replaced in the above identity the previously used formal parameter $h$ by $\frac{\hbar}{i}$ for better agreement with the notations in quantum mechanics. Moreover we have absorbed the term $x(p_1 + p_2)$ into $D^0$. We keep using this convention throughout the following computation. Let us compute both sides of
\[
\underbrace{\big(e^{\frac{i}{\hbar} p_1 x} * e^{\frac{i}{\hbar} p_2 x}\big) * e^{\frac{i}{\hbar} p_3 x}}_{A}
= \underbrace{e^{\frac{i}{\hbar} p_1 x} * \big(e^{\frac{i}{\hbar} p_2 x} * e^{\frac{i}{\hbar} p_3 x}\big)}_{B}
\]
with the help of the asymptotical Fourier transform. We have then
\[
A = (2\pi)^{-d/2} \int \hat f(p_1,p_2,p)\,\big(e^{\frac{i}{\hbar} p x} * e^{\frac{i}{\hbar} p_3 x}\big)\,dp,
\]
where $\hat f(p_1,p_2,p)$ is the Fourier transform of
\[
f(p_1,p_2,x) = e^{\frac{i}{\hbar} p_1 x} * e^{\frac{i}{\hbar} p_2 x} = e^{\frac{i}{\hbar}\sum_{j=0}^{\infty} \left(\frac{\hbar}{i}\right)^j D^j(p_1,p_2,x)},
\]
that is,
\[
\hat f(p_1,p_2,p) = (2\pi)^{-d/2} \int e^{\frac{i}{\hbar}\left(\sum_{j=0}^{\infty} \left(\frac{\hbar}{i}\right)^j D^j(p_1,p_2,x) - px\right)}\,dx.
\]
We use the method of the saddle point approximation to evaluate this integral when "$\frac{\hbar}{i}$ is very small". First notice that for functions of the form
\[
g_{\frac{\hbar}{i}}(x) = g_0(x) + \tfrac{\hbar}{i}\, g_1(x) + \big(\tfrac{\hbar}{i}\big)^2 g_2(x) + \dots
\]
a formal application of the implicit function theorem to $F(\frac{\hbar}{i}, x) = \nabla g_{\frac{\hbar}{i}}(x)$ tells us that $\exists\, \bar x : I \to \mathbb R^n$, where $I$ is an interval around zero, so that
(1) $\bar x(\frac{\hbar}{i})$ is an extremal point of $g_{\frac{\hbar}{i}}(x)$ if $\bar x(0) = \bar x$ is an extremal point of $g_0(x)$,
(2) $\bar x(\frac{\hbar}{i}) = \bar x - (g_0'')^{-1}(\bar x)\, g_1'(\bar x)\, \frac{\hbar}{i} + O\big((\frac{\hbar}{i})^2\big)$.
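As an illustration of item (2), the following sketch checks the first-order shift of a critical point numerically. It is not part of the argument: the functions g0 and g1, the real parameter eps (standing in for the formal parameter, which is imaginary in the text) and all helper names are assumptions chosen for the example.

```python
import math

# Illustrative choices (not from the paper):
#   g0(x) = x**2/2 + x**4/4, with critical point x_bar = 0 and g0''(0) = 1
#   g1(x) = sin(x) + x,      with g1'(0) = 2

def dg0(x):
    return x + x**3

def d2g0(x):
    return 1.0 + 3.0 * x**2

def dg1(x):
    return math.cos(x) + 1.0

def d2g1(x):
    return -math.sin(x)

def critical_point(eps, x_start=0.0, steps=50):
    """Newton iteration for the zero of d/dx (g0 + eps*g1) near x_start."""
    x = x_start
    for _ in range(steps):
        f = dg0(x) + eps * dg1(x)
        fp = d2g0(x) + eps * d2g1(x)
        x -= f / fp
    return x

def first_order_prediction(eps, x_bar=0.0):
    """x_bar - g0''(x_bar)^(-1) * g1'(x_bar) * eps, as in item (2)."""
    return x_bar - dg1(x_bar) / d2g0(x_bar) * eps
```

For small eps the two functions agree up to the higher-order remainder, which is the content of item (2) in this one-dimensional toy case.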
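Second, notice that we have the following asymptotical expansion:
\[
g_{\frac{\hbar}{i}}\big(\bar x(\tfrac{\hbar}{i})\big) = g_{\frac{\hbar}{i}}\Big(\bar x - (g_0'')^{-1}(\bar x)\, g_1'(\bar x)\,\tfrac{\hbar}{i} + O\big((\tfrac{\hbar}{i})^2\big)\Big);
\]
around $\bar x$ we get
\[
g_{\frac{\hbar}{i}}\big(\bar x(\tfrac{\hbar}{i})\big) = g_0(\bar x) + \tfrac{\hbar}{i}\, g_1(\bar x) + O\big((\tfrac{\hbar}{i})^2\big).
\]
Now if we apply the method of the stationary phase to
\[
I = \int e^{\frac{i}{\hbar} g_{\frac{\hbar}{i}}(x)}\,dx
\]
we find
\[
I \approx c\big(\bar x, \tfrac{\hbar}{i}\big)\, e^{\frac{i}{\hbar}\left(g_0(\bar x) + \frac{\hbar}{i} g_1(\bar x)\right)},
\]
where $\bar x$ is the extremal point of $g_0$.

Let us come back to the computation of $A$. With the preceding remarks in mind the computation of $\hat f(p_1,p_2,p)$ leads, through the application of the stationary phase method, to
\[
\hat f(p_1,p_2,p) \approx c\big(p_1,p_2,\bar x,\tfrac{\hbar}{i}\big)\, e^{\frac{i}{\hbar}\left(D^0(p_1,p_2,\bar x) - p\bar x + \frac{\hbar}{i} D^1(p_1,p_2,\bar x)\right)},
\]
where $c$ is a certain function of $p_1$, $p_2$, $\bar x$ and $\frac{\hbar}{i}$, and where $\bar x$ is a critical point of $D^0(p_1,p_2,x) - px$ (i.e., $\nabla_x D^0(p_1,p_2,\bar x) = p$). Then
\[
A \approx \int c\big(p_1,p_2,\bar x,\tfrac{\hbar}{i}\big)\, e^{\frac{i}{\hbar}\left(D^0(p_1,p_2,\bar x) - p\bar x + \frac{\hbar}{i} D^1(p_1,p_2,\bar x) + \sum_{j=0}^{\infty} \left(\frac{\hbar}{i}\right)^j D^j(p,p_3,x)\right)}\,dp.
\]
Using the same method as above again we obtain
\[
A \approx \tilde C\big(p_1,p_2,\bar p,\bar x,x,\tfrac{\hbar}{i}\big)\, e^{\frac{i}{\hbar}\left(D^0(p_1,p_2,\bar x) - \bar p\bar x + D^0(\bar p,p_3,x)\right)}\, e^{-D^1(\bar p,p_3,x) - D^1(p_1,p_2,\bar x)},
\]
where $\bar x$ is determined by $\nabla_x D^0(p_1,p_2,\bar x) = \bar p$ as above and $\bar p$ by $\nabla_{p_1} D^0(\bar p,p_3,x) = \bar x$. Namely,
\[
\frac{d}{dp}\Big(D^0(p_1,p_2,\bar x) - \bar p\bar x + D^0(\bar p,p_3,x)\Big) = 0
\]
gives
\[
\nabla_x D^0(p_1,p_2,\bar x)\frac{d\bar x}{dp} - \nabla_x D^0(p_1,p_2,\bar x)\frac{d\bar x}{dp} - \bar x + \nabla_{p_1} D^0(\bar p,p_3,x) = 0.
\]
By the same kind of computation we approximate $B$ for $\frac{\hbar}{i}$ "small enough",
\[
B \approx \tilde C\big(p_2,p_3,\tilde p,\tilde x,x,\tfrac{\hbar}{i}\big)\, e^{\frac{i}{\hbar}\left(D^0(p_2,p_3,\tilde x) - \tilde p\tilde x + D^0(p_1,\tilde p,x)\right)}\, e^{-D^1(p_2,p_3,\tilde x) - D^1(p_1,\tilde p,x)},
\]
with $\tilde x$ and $\tilde p$ determined by $\nabla_x D^0(p_2,p_3,\tilde x) = \tilde p$ and $\nabla_{p_2} D^0(p_1,\tilde p,x) = \tilde x$. Equating $A$ and $B$ we then get that $D^0(p_1,p_2,x)$ satisfies the SGA equation.

Acknowledgements. The second author thanks Ernst Hairer for useful discussions and suggestions.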
A.S. Cattaneo, B. Dherin, G. Felder
References
1. Bates, S., Weinstein, A.: Lectures on the geometry of quantization. Berkeley Mathematics Lecture Notes, 8. Providence, RI and Berkeley, CA: American Mathematical Society, Berkeley Center for Pure and Applied Mathematics, 1997
2. Bayen, F., Flato, M., Fronsdal, C., Lichnerowicz, A., Sternheimer, D.: Deformation theory and quantization. I. Deformations of symplectic structures. Ann. Physics 111(1), 61–110 (1978)
3. Cattaneo, A.S.: The Lagrangian operad. Unpublished notes, http://www.math.unizh.ch/asc/lagop.pdf
4. Cattaneo, A.S., Felder, G.: Poisson sigma models and deformation quantization. Euroconference on Brane New World and Noncommutative Geometry (Torino, 2000). Modern Phys. Lett. A 16(4–6), 179–189 (2001)
5. Cattaneo, A.S., Felder, G.: Poisson sigma models and symplectic groupoids. In: Quantization of singular symplectic quotients, Progr. Math. 198, Basel: Birkhäuser, 2001, pp. 61–93
6. Coste, A., Dazord, P., Weinstein, A.: Groupoïdes symplectiques. (French) [Symplectic groupoids] Publications du Département de Mathématiques. Nouvelle Série. A, Vol. 2, i–ii, Publ. Dép. Math. Nouvelle Sér. A, 87-2, Lyon: Univ. Claude-Bernard, 1987, pp. 1–62
7. Crainic, M.: Differentiable and algebroid cohomology, van Est isomorphisms, and characteristic classes. http://arxiv.org/math.DG/0008064, 2000
8. Crainic, M., Fernandes, R.L.: Integrability of Lie brackets. Ann. of Math. (2) 157(2), 575–620 (2003)
9. Hairer, E., Lubich, C., Wanner, G.: Geometric numerical integration. In: Structure-preserving algorithms for ordinary differential equations. Springer Series in Computational Mathematics, 31, Berlin: Springer-Verlag, 2002
10. Karabegov, K.: On Dequantization of Fedosov's Deformation Quantization. http://arxiv.org/abs/math.QA/0307381, 2003
11. Karasëv, M.V.: Analogues of objects of the theory of Lie groups for nonlinear Poisson brackets. (Russian) Izv. Akad. Nauk SSSR Ser. Mat. 50(3), 508–538, 638 (1986)
12. Kathotia, V.: Kontsevich's universal formula for deformation quantization and the Campbell-Baker-Hausdorff formula. Internat. J. Math. 11(4), 523–551 (2000)
13. Kontsevich, M.: Deformation quantization of Poisson manifolds, I. Lett. Math. Phys. 66, 157–216 (2003)
14. Weinstein, A.: Symplectic groupoids and Poisson manifolds. Bull. Am. Math. Soc. (N.S.) 16(1), 101–104 (1987)
15. Weinstein, A., Xu, P.: Extensions of symplectic groupoids and quantization. J. Reine Angew. Math. 417, 159–189 (1991)
16. Weinstein, A.: Noncommutative geometry and geometric quantization. In: Symplectic geometry and mathematical physics (Aix-en-Provence, 1990), Progr. Math. 99, Boston, MA: Birkhäuser Boston, 1991, pp. 446–461
17. Weinstein, A.: Tangential deformation quantization and polarized symplectic groupoids. In: Deformation theory and symplectic geometry (Ascona, 1996), Math. Phys. Stud. 20, Dordrecht: Kluwer Acad. Publ. 1997, pp. 301–314
18. Zakrzewski, S.: Quantum and classical pseudogroups. I. Union pseudogroups and their quantization. Commun. Math. Phys. 134(2), 347–370 (1990)

Communicated by L. Takhtajan
Commun. Math. Phys. 253, 675–704 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1198-0
Communications in
Mathematical Physics
Profiles and Quantization of the Blow Up Mass for Critical Nonlinear Schrödinger Equation
Frank Merle1,2,3, Pierre Raphael1,2
1 Département de Mathématiques, Université de Cergy–Pontoise, 2 av. Adolphe Chauvain, 95302 Cergy-Pontoise, France
2 Institute for Advanced Study, Princeton, NJ 08540, USA
3 CNRS
Received: 19 January 2004 / Accepted: 12 April 2004 Published online: 20 October 2004 – © Springer-Verlag 2004
Abstract: We consider finite time blow up solutions to the critical nonlinear Schrödinger equation $iu_t = -\Delta u - |u|^{\frac{4}{N}}u$ for which $\lim_{t\uparrow T<+\infty} |\nabla u(t)|_{L^2} = +\infty$. For a suitable class of initial data in the energy space $H^1$, we prove that the solution splits in two parts: the first part corresponds to the singular part and accumulates a quantized amount of $L^2$ mass at the blow up point, the second part corresponds to the regular part and has a strong $L^2$ limit at blow up time.

1. Introduction

1.1. Setting of the problem. We consider in this paper the critical nonlinear Schrödinger equation
\[
(NLS)\quad
\begin{cases}
iu_t = -\Delta u - |u|^{\frac{4}{N}}u, & (t,x)\in[0,T)\times\mathbb R^N \\
u(0,x) = u_0(x), & u_0:\mathbb R^N\to\mathbb C
\end{cases}
\tag{1}
\]
with $u_0\in H^1 = H^1(\mathbb R^N)$ in dimension $N\ge 1$. From a result of Ginibre and Velo [4], (1) is locally well-posed in $H^1$ and thus, for $u_0\in H^1$, there exists $0<T\le+\infty$ such that $u(t)\in C([0,T),H^1)$ and either $T=+\infty$, we say the solution is global, or $T<+\infty$ and then $\limsup_{t\uparrow T}|\nabla u(t)|_{L^2} = +\infty$, we say the solution blows up in finite time. Moreover, the Cauchy problem is locally well-posed in $L^2$ from Cazenave and Weissler [3].

(1) admits the following conservation laws in energy space $H^1$:
$L^2$ norm: $\int |u(t,x)|^2 = \int |u_0(x)|^2$;
Energy: $E(u(t,x)) = \frac12\int|\nabla u(t,x)|^2 - \frac{1}{2+\frac4N}\int|u(t,x)|^{2+\frac4N} = E(u_0)$;
Momentum: $\mathrm{Im}\big(\int\nabla u\,\bar u(t,x)\big) = \mathrm{Im}\big(\int\nabla u_0\,\bar u_0(x)\big)$.
Part of this work has been supported by grant DMS-0111298.
For notational purposes, we shall introduce the following invariant:
\[
E^G(u) = E(u) - \frac12\left(\frac{\big|\mathrm{Im}\big(\int\nabla u\,\bar u\big)\big|}{|u|_{L^2}}\right)^2.
\tag{2}
\]
It is classical from the conservation of the energy and the $L^2$ norm that the power nonlinearity in (1) is the smallest one for which blow up may occur, and existence of blow up solutions is known from the virial identity: let an initial data $u_0\in\Sigma = H^1\cap\{xu\in L^2\}$, then the corresponding solution $u(t)$ to (1) satisfies:
\[
u(t)\in\Sigma \quad\text{and}\quad \frac{d^2}{dt^2}\int|x|^2|u(t,x)|^2 = 16E(u_0).
\tag{3}
\]
Thus if $u_0\in\Sigma$ with $E(u_0)<0$, the positive quantity $\int|x|^2|u(t,x)|^2$ cannot exist for all times and $u$ blows up in finite time.

Equation (1) admits a number of symmetries in energy space $H^1$: if $u(t,x)$ is a solution to (1) then $\forall(\lambda_0,t_0,x_0,\beta_0,\gamma_0)\in\mathbb R_+^*\times\mathbb R\times\mathbb R^N\times\mathbb R^N\times\mathbb R$, so is
\[
v(t,x) = \lambda_0^{\frac N2}\, u(\lambda_0^2 t + t_0,\ \lambda_0 x + x_0 - \beta_0 t)\, e^{i\frac{\beta_0}{2}\cdot\left(x - \frac{\beta_0}{2}t\right)}\, e^{i\gamma_0}.
\]
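Returning to the virial identity (3), the blow up argument can be made quantitative. The sketch below is an illustration under assumptions not stated in the text: v0 and v1 denote hypothetical values of $\int|x|^2|u_0|^2$ and of its first time derivative at $t=0$, and the function name is ours. Integrating (3) twice gives the parabola $v_0 + v_1 t + 8E(u_0)t^2$, whose first positive zero bounds the blow up time from above when $E(u_0)<0$.

```python
import math

def virial_upper_bound(v0, v1, e0):
    """First positive zero of v0 + v1*t + 8*e0*t**2.

    v0 : assumed initial variance, must be > 0
    v1 : assumed initial time derivative of the variance
    e0 : energy E(u0), must be < 0
    The solution cannot exist beyond this time, since the variance
    would have to become negative.
    """
    if not (e0 < 0.0 < v0):
        raise ValueError("need e0 < 0 < v0")
    disc = v1 * v1 - 32.0 * e0 * v0      # strictly positive here
    return (-v1 - math.sqrt(disc)) / (16.0 * e0)
```

For instance, with v0 = 1, v1 = 0 and e0 = -1/8 the parabola is $1 - t^2$, so the bound is $t = 1$.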
The last symmetry is not in the energy space $H^1$ but in the virial space $\Sigma$, the pseudo conformal transformation: if $u(t,x)$ solves (1), then so does
\[
v(t,x) = \frac{1}{|t|^{\frac N2}}\, u\Big(\frac1t,\frac xt\Big)\, e^{i\frac{|x|^2}{4t}}.
\]
Special solutions play a fundamental role for the description of the dynamics of (1). They are the so called solitary waves of the form $u(t,x) = e^{i\omega t}W_\omega(x)$, $\omega>0$, where $W_\omega$ solves
\[
\Delta W_\omega + W_\omega|W_\omega|^{\frac4N} = \omega W_\omega.
\tag{4}
\]
Equation (4) is a standard nonlinear elliptic equation, and from [1] and [6], there is a unique positive solution up to translation $Q_\omega(x)$. $Q_\omega$ is in addition radially symmetric. Letting $Q = Q_{\omega=1}$, then $Q_\omega(x) = \omega^{\frac N4}Q(\omega^{\frac12}x)$ from the scaling property, and from direct computation and the Pohozaev identity: $E(Q_\omega) = \omega E(Q) = 0$ and $|Q_\omega|_{L^2} = |Q|_{L^2}$. Recall also that in dimension $N\ge2$, (4) for $\omega=1$ admits a family of non zero radial solutions $Q_i\in H^1$ which is unbounded in $L^2$.

For $|u_0|_{L^2} < |Q|_{L^2}$, the solution is global in $H^1$ from the conservation of the energy, the $L^2$ norm and the Gagliardo-Nirenberg inequality as exhibited by Weinstein in [20]:
\[
\forall u\in H^1,\quad E(u) \ge \frac12\left(\int|\nabla u|^2\right)\left(1 - \left(\frac{\int|u|^2}{\int Q^2}\right)^{\frac2N}\right).
\tag{5}
\]
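The identities $E(Q_\omega)=0$ and $|Q_\omega|_{L^2}=|Q|_{L^2}$ can be checked numerically in dimension $N=1$, where the ground state has the known closed form $Q(x) = (3/\cosh^2(2x))^{1/4}$. This closed form is not quoted in the text and is used here as an assumption of the sketch; the quadrature rule, the truncation of the integration domain and the function names are likewise illustrative.

```python
import math

def Q(x):
    """Explicit ground state for N = 1: solves Q'' - Q + Q**5 = 0."""
    return 3.0 ** 0.25 / math.sqrt(math.cosh(2.0 * x))

def dQ(x):
    """Derivative of Q, using d/dx cosh(2x)**(-1/2) = -tanh(2x) * cosh(2x)**(-1/2)."""
    return -Q(x) * math.tanh(2.0 * x)

def integrate(f, half_width=20.0, n=40000):
    """Trapezoid rule on [-half_width, half_width]; the integrands decay exponentially."""
    h = 2.0 * half_width / n
    total = 0.5 * (f(-half_width) + f(half_width))
    for k in range(1, n):
        total += f(-half_width + k * h)
    return total * h

def mass(omega=1.0):
    """Squared L2 norm of Q_omega(x) = omega**(1/4) * Q(sqrt(omega) * x)."""
    r = math.sqrt(omega)
    return integrate(lambda x: (omega ** 0.25 * Q(r * x)) ** 2)

def energy(omega=1.0):
    """E(Q_omega) = (1/2) * int |Q_omega'|^2 - (1/6) * int Q_omega^6 for N = 1."""
    r = math.sqrt(omega)
    kinetic = integrate(lambda x: (omega ** 0.75 * dQ(r * x)) ** 2)
    potential = integrate(lambda x: (omega ** 0.25 * Q(r * x)) ** 6)
    return 0.5 * kinetic - potential / 6.0
```

With these definitions, mass(omega) is independent of omega up to quadrature error and energy(omega) vanishes, in agreement with the scaling property and the Pohozaev identity.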
In addition, this condition is sharp: for $|u_0|_{L^2}\ge|Q|_{L^2}$, blow up may occur. Indeed, the pseudo-conformal transformation applied to the stationary solution $e^{it}Q(x)$ yields an explicit solution
\[
S(t,x) = \frac{1}{|t|^{\frac N2}}\, Q\Big(\frac xt\Big)\, e^{-i\frac{|x|^2}{4t} + \frac it}
\tag{6}
\]
which blows up at $T=0$ with $|S(t)|_{L^2} = |Q|_{L^2}$. Note that blow up speed for $S(t)$ is:
\[
|\nabla S(t)|_{L^2} \sim \frac{1}{|t|}.
\]
Moreover, from [10], $S(t)$ is the unique minimal mass blow up solution up to the symmetries.

Most results concerning blow up dynamics of (1) now concern the perturbative situation when
\[
u_0\in B_{\alpha^*} = \Big\{u_0\in H^1 \text{ with } \int Q^2 \le \int|u_0|^2 < \int Q^2 + \alpha^*\Big\}
\]
for some small constant $\alpha^*>0$. At least two different blow up behaviors are known to possibly occur:
• There exist in dimension $N=1,2$ a family of solutions of type $S(t)$ by a result of Bourgain and Wang [2], that is solutions with $|\nabla u(t)|_{L^2}\sim\frac{1}{T-t}$ near blow up time.
• On the other hand, numerical simulations [7] and formal arguments [19] suggest the existence of solutions blowing up like $|\nabla u(t)|_{L^2}\sim\left(\frac{\log|\log(T-t)|}{T-t}\right)^{\frac12}$ in dimension $N=2$. Perelman proves in [17] in dimension $N=1$ the existence of an even solution of this type and its stability in some space $E\subset H^1$.

The situation has been clarified in the series of papers [11–14], [18]. More precisely, let us consider the following property:

Spectral Property. Let $N\ge2$. Consider the two real Schrödinger operators
\[
L_1 = -\Delta + \frac2N\Big(\frac4N+1\Big)Q^{\frac4N-1}\,y\cdot\nabla Q,\qquad
L_2 = -\Delta + \frac2N\,Q^{\frac4N-1}\,y\cdot\nabla Q,
\tag{7}
\]
and the real valued quadratic form for $\varepsilon = \varepsilon_1 + i\varepsilon_2\in H^1$:
\[
H(\varepsilon,\varepsilon) = (L_1\varepsilon_1,\varepsilon_1) + (L_2\varepsilon_2,\varepsilon_2).
\tag{8}
\]
Then there exists a universal constant $\tilde\delta_1>0$ such that $\forall\varepsilon\in H^1$, if $(\varepsilon_1,Q) = (\varepsilon_1,Q_1) = (\varepsilon_1,yQ) = (\varepsilon_2,Q_1) = (\varepsilon_2,Q_2) = (\varepsilon_2,\nabla Q) = 0$, then:
(i) for $N=2$, $H(\varepsilon,\varepsilon)\ge\tilde\delta_1\big(\int|\nabla\varepsilon|^2 + \int|\varepsilon|^2e^{-2^-|y|}\big)$ for some universal constant $2^-<2$;
(ii) for $N\ge3$, $H(\varepsilon,\varepsilon)\ge\tilde\delta_1\int|\nabla\varepsilon|^2$,
where $Q_1 = \frac N2Q + y\cdot\nabla Q$ and $Q_2 = \frac N2Q_1 + y\cdot\nabla Q_1$.

This property has been proved in [11] for dimension $N=1$ and constant $2^- = \frac95$, and will always be implicitly assumed in higher dimension $N\ge2$.
We first have the following theorem which exhibits two different blow up behaviors in $H^1$:

Theorem 1 (Dynamics of (1), [11–14], [18]). Let $N=1$ or $N\ge2$ assuming Spectral Property holds true. There exist universal constants $\alpha^*>0$, $C^*>0$ such that the following holds true. For $u_0\in H^1$, let $u(t)$ be the corresponding solution to (1) with $[0,T)$ its maximum time interval of existence on the right in $H^1$. Let the set
\[
O = \left\{u_0\in B_{\alpha^*} \text{ with } T<+\infty \text{ and } \lim_{t\to T}\frac{|\nabla u(t)|_{L^2}}{|\nabla Q|_{L^2}}\left(\frac{T-t}{\log|\log(T-t)|}\right)^{\frac12} = \frac{1}{\sqrt{2\pi}}\right\},
\tag{9}
\]
then:

Log-log regime.
(i) Dynamic of non positive energy solutions:
\[
\Big\{u_0\in B_{\alpha^*} \text{ with } E_0^G\le0 \text{ and } \int|u_0|^2 > \int Q^2\Big\}\subset O.
\tag{10}
\]
(ii) Stability of the log-log regime: $O$ is open in $H^1$.
(iii) Universality of blow up profile: if $u_0\in O$, then there exist parameters $\lambda_0(t) = \frac{|\nabla Q|_{L^2}}{|\nabla u(t)|_{L^2}}$, $x_0(t)\in\mathbb R^N$ and $\gamma_0(t)\in\mathbb R$ such that
\[
e^{i\gamma_0(t)}\lambda_0^{\frac N2}(t)\,u(t,\lambda_0(t)x + x_0(t)) \to Q \ \text{ in } \dot H^1 \text{ as } t\to T.
\tag{11}
\]

S(t) type of regime.
(iv) If $0<T<+\infty$ and $u_0\in B_{\alpha^*}$ does not belong to $O$, then $E_0^G>0$ and the following lower bound holds:
\[
|\nabla u(t)|_{L^2} \ge \frac{C^*}{(T-t)\sqrt{E_0^G}}.
\tag{12}
\]
Moreover, asymptotic stability (11) holds on a sequence $t_n\to T$.

1.2. Statement of the results. Our aim in this paper is to make precise the nature of the singularity formation. This question was first investigated in [13] but in the rescaled variable, which corresponds to the proof of asymptotic stability of the soliton in the blow up regime (11). Here, we further investigate this question and exhibit the structure of the singularity formation in the original space variable.

Since the 60's, a question of fundamental physical importance is the one of the amount of mass which is focused by the blow up dynamic. Recall that (1) is in dimension $N=2$ a model for the self focusing of laser beams, and in this frame, the amount of mass which goes into the blow up point is related to the energy of the laser beam. It was conjectured (at least in the stable regime) that this focused mass is quantized.

Another question regards the localization in space of the singularity formation. More precisely, considering a blow up solution, one asks the question of the size of the singular set of blow up points and of the behavior of the solution outside this set. One can conjecture from the criticality of the problem that the number of blow up points is finite
and outside these points, $|u|^2$ converges in the distributional sense to a $L^1$ function. In the setting of small excess of mass $u_0\in B_{\alpha^*}$, the conjecture is written: there exist $x(T)\in\mathbb R^N$ and $f\in L^1$ such that:
\[
|u(t)|^2 \rightharpoonup \Big(\int Q^2\Big)\delta_{x=x(T)} + f \ \text{ as } t\to T.
\tag{13}
\]
We in fact claim a stronger result which is more natural in the context of solving the Cauchy problem locally in time in $L^2$: up to a singular part whose structure is universal, the solution remains smooth at blow up time in $L^2$.

Theorem 2 (Existence of a $L^2$ profile at blow up time). Let $N=1$ or $N\ge2$ assuming that Spectral Property holds true. There exists a universal constant $\alpha^*>0$ such that the following holds true. Let $u_0\in B_{\alpha^*}$ and assume the corresponding solution to (1) blows up in finite time $0<T<+\infty$, then there exist parameters $(\lambda(t),x(t),\gamma(t))\in\mathbb R_+^*\times\mathbb R^N\times\mathbb R$ and an asymptotic profile $u^*\in L^2$ such that
\[
u(t) - \frac{1}{\lambda(t)^{\frac N2}}\,Q\Big(\frac{x-x(t)}{\lambda(t)}\Big)e^{i\gamma(t)} \to u^* \ \text{ in } L^2 \text{ as } t\to T.
\tag{14}
\]
Moreover, the blow up point is finite in the sense that $x(t)\to x(T)\in\mathbb R^N$ as $t\to T$.

Comments on the result.
1. Quantization of the blow up mass. Observe that (14) implies:
\[
|u(t)|^2 \rightharpoonup \Big(\int Q^2\Big)\delta_{x=x(T)} + |u^*|^2 \ \text{ as } t\to T \quad\text{with}\quad \int|u_0|^2 = \int Q^2 + \int|u^*|^2,
\]
and thus the quantization of mass (13).
2. About concentration points. In the general large data case, the only facts known about $L^2$ concentration have been obtained in [15]: there is a function $x(t)\in\mathbb R^N$ without any a priori limit such that:
\[
\forall R>0,\quad \liminf_{t\to T}\int_{|x-x(t)|\le R}|u(t)|^2 \ge \int Q^2.
\]
In the radial case, $x(t)=0$. In the small super critical mass case $u_0\in B_{\alpha^*}$, open problems following this work were:
– prove that $x(t)$ does not oscillate in time and has a finite limit as $t\to T$;
– prove that the amount of mass focused at $x(T)$ has a limit;
– avoid a small concentration of $L^2$ norm outside the blow up point $x(T)$, and then prove existence of the strong $L^2$ limit.
Note that existence and finiteness of the blow up point $x(T)$ for data $u_0\in O$ was already proved in [13]. Of course, the main feature of the result is to say that the focused mass is exactly $\int Q^2$ which is the mass of the minimal mass blow up solution.
3. Asymptotic stability and mass quantization. On the one hand, it is elementary to check that strong convergence (14) implies the asymptotic stability of blow up profile (11). In particular, a first step in the proof will be to prove asymptotic stability of $Q$ as the blow up profile on the whole sequence in time in the $S(t)$ regime. On the other hand, the proof of (11) in [13] is the starting point of our analysis. Indeed, it gives the exact shape of the solution in the rescaled variables, in both blow up regimes, log-log and $S(t)$. The proof of Theorem 2 follows by propagating in the original variables this information outside the blow up point, and the corresponding estimates are based on those used in [13, 14], but of a different type. We conjecture in a general context that the universality of the blow up profile and the mass quantization are equivalent formulations of the same property. For example, for the generalized critical KdV equation, the blow up picture is similar in some sense to the one of (1), and indeed universality of $Q$ as a blow up profile has been proved in [8]. Nevertheless, the quantization of the blow up mass is still an open problem.
4. Comparison with the Zakharov model. In dimension $N=2$, if we consider the next term in the physical approximation leading to (NLS), we get the Zakharov equation:
\[
\begin{cases}
iu_t = -\Delta u + nu \\
\frac{1}{c_0^2}\,n_{tt} = \Delta n + \Delta|u|^2
\end{cases}
\tag{15}
\]
for some large constant $c_0$. Now as exhibited in [5], this system admits a one parameter family of radially symmetric finite time blow up solutions which each concentrate at $x=0$ in $L^2$ an amount of mass $m_0\in(\int Q^2,+\infty)$.
5. Regularity outside blow up point. From Theorem 2, the formation of the singularity is localized in space. Indeed, outside the blow up point $x(T)$, the solution has a strong $L^2$ limit, whereas the Cauchy problem for (1) is well posed in $L^2$. It means in particular that the phase of the solution is not oscillatory outside the blow up point, whereas the phase $\gamma(t)$ of the singularity is known to satisfy $\gamma(t)\to+\infty$ as $t\to T$. This strong regularity of the solution outside the blow up point is a surprise to us.

Following this theorem, we conjecture the following for the large initial data case in $H^1$:

Conjecture. Let $u(t)\in H^1$ be a solution to (1) which blows up in finite time $0<T<+\infty$. Then there exist $(x_i)_{1\le i\le L}\in\mathbb R^N$ with $L\le\frac{\int|u_0|^2}{\int Q^2}$, and $u^*\in L^2$ such that: $\forall R>0$,
\[
u(t)\to u^* \ \text{ in } L^2\Big(\mathbb R^N - \bigcup_{1\le i\le L}B(x_i,R)\Big)
\]
and $|u(t)|^2 \rightharpoonup \sum_{1\le i\le L}m_i\delta_{x=x_i} + |u^*|^2$ with $m_i\in[\int Q^2,+\infty)$.

The set $M$ of admissible focused mass $m_i$ for $N\ge2$ is known to contain the unbounded set of the $L^2$ masses of excited bound state $Q_i$ solutions to (4), see for example [9]. These are the only known examples.

In the case of the log-log regime, the singular part of the solution is up to a phase shift completely universal and independent of the Cauchy data; this is an open problem for the $S(t)$ regime.
Proposition 1 (Universality of the singular structure in the log-log regime). Under assumptions of Theorem 2, let $u_0\in O$, $x(T)$ be its blow up point and $u^*$ its $L^2$ profile. Set
\[
\lambda_0(t) = \sqrt{2\pi}\,\sqrt{\frac{T-t}{\log|\log(T-t)|}},
\tag{16}
\]
then there exists a phase parameter $\gamma_0(t)\in\mathbb R$ such that:
\[
u(t) - \frac{1}{\lambda_0(t)^{\frac N2}}\,Q\Big(\frac{x-x(T)}{\lambda_0(t)}\Big)e^{i\gamma_0(t)} \to u^* \ \text{ in } L^2 \text{ as } t\to T.
\tag{17}
\]
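To get a feeling for the size of the log-log correction in (16), the following sketch evaluates $\lambda_0$ and compares the corresponding blow up speed $1/\lambda_0$ with the scaling rate $1/\sqrt{T-t}$. It is only an illustration: the variable tau stands for $T-t$ and is assumed to lie in $(0, 1/e)$ so that the double logarithm is positive, and the function names are ours.

```python
import math

def lambda0(tau):
    """Universal scale (16) as a function of tau = T - t, for 0 < tau < 1/e."""
    return math.sqrt(2.0 * math.pi * tau / math.log(abs(math.log(tau))))

def speed_ratio(tau):
    """Ratio of the log-log speed 1/lambda0 to the scaling rate 1/sqrt(tau)."""
    return math.sqrt(tau) / lambda0(tau)
```

The ratio equals $\sqrt{\log|\log\tau|/(2\pi)}$, so it grows without bound as $\tau\to0$, but extremely slowly, which is why the log-log regime sits just above the scaling rate.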
Remark 1. Regarding the universal behavior of the phase shift, we know:
\[
\gamma_0(t) \sim \frac{1}{2\pi}\,|\log(T-t)|\,\log|\log(T-t)| \ \text{ as } t\to T.
\]
Nevertheless, we do not know whether $\gamma_0(t)$ minus its equivalent has a finite limit as $t\to T$.

Remark 2. Continuity of $u^*$ in the log-log regime: From [18], the set $O$ of log-log blow up is open. Moreover, from [13], blow up time $T$ and concentration point $x(T)$ are continuous functions of the data. From the proof of existence of $u^*$, the same kind of arguments apply and $u^*$ is in $O$ a continuous function of the data.

We claim that the two different blow up dynamics can be characterized by regularity properties at blow up point of the profile $u^*$:

Theorem 3 (Asymptotic behavior of $u^*$ at the blow up point). Let $N=1$ or $N\ge2$ assuming the Spectral Property holds true. There exist universal constants $\alpha^*>0$, $C^*>0$ such that the following holds true. Let $u_0\in B_{\alpha^*}$ and assume the corresponding solution $u(t)$ to (1) blows up in finite time $0<T<+\infty$. Let $x(T)$ be its blow up point and $u^*\in L^2$ its profile given by Theorem 2, then for $R>0$ small enough, we have:
(i) Log-log case: if $u_0\in O$, then
\[
\frac{1}{C^*(\log|\log(R)|)^2} \le \int_{|x-x(T)|\le R}|u^*(x)|^2\,dx \le \frac{C^*}{(\log|\log(R)|)^2},
\tag{18}
\]
and in particular: $u^*\notin H^1$ and $u^*\notin L^p$ for $p>2$.
(ii) $S(t)$ case: if $u(t)$ satisfies (12), then
\[
\int_{|x-x(T)|\le R}|u^*|^2 \le C^*E_0R^2,
\tag{19}
\]
\[
\text{and}\quad u^*\in H^1.
\tag{20}
\]
Comments on the result.
1. The two blow up scenarios. The fact that one can separate within the two blow up dynamics and see the different blow up speeds on asymptotic profile $u^*$ is a completely new feature for (NLS), and was not even expected at the formal level. Moreover, this result strengthens our belief that the $S(t)$ type of solutions are in some sense on the boundary of the set of finite time blow up solutions. Indeed, the stable log-log blow up scenario is based on the ejection of a radiative mass which strongly couples the singular and the regular parts of the solution and induces singular behavior (18) of the profile at the blow up point; on the contrary, the $S(t)$ regime corresponds to formation of a minimal mass blow up bubble very decoupled from the regular part which indeed remains in the Cauchy space.
2. Degeneracy of $u^*$ in the $S(t)$ regime. For $N=1$, (20) implies $u^*(0)=0$. In [2], Bourgain and Wang construct for a given radial profile $u^*$ smooth with $\frac{d^i}{dr^i}u^*(r)|_{r=0} = 0$, $1\le i\le A$, a solution to (1) with blow up point $x=0$ and asymptotic profile $u^*$. In their proof, $A$ is very large, and is used to decouple the regular and the singular part of the solutions. In this sense, estimate (20) proves in general a decoupling of this kind for the $S(t)$ dynamic. It is an open problem to estimate the exact degeneracy of $u^*$.

The proofs of Theorems 2 and 3 rely on deep estimates on the solution in some rescaled variables which have been established in our previous works on this problem. Our analysis here collects these results altogether to derive information in the original variables. Nevertheless, the proof of Theorem 3, and in particular of asymptotic (18), follows a different scheme and involves a new type of estimates.

2. Dispersive Estimates on the Solution

We now recall tools and dispersive estimates needed to describe the blow up dynamics. These estimates depend on the considered blow up regime. We first recall them from [12, 14] in the log-log regime, and then from [18] in the $S(t)$ regime. We derive the asymptotic stability of $Q$ as a blow up profile in this last case which generalizes a corollary in [13].

Let $u_0\in B_{\alpha^*}$ and assume that $u(t)$ blows up in finite time $0<T<+\infty$. As first observed in [11], we may always assume up to a fixed Galilean transform:
\[
\mathrm{Im}\Big(\int\nabla u_0\,\bar u_0\Big) = 0,
\tag{21}
\]
estimate the solution in this context, and then conclude using Galilean invariance, see Subsect. 3.1. For a given function $f$, we note:
\[
f_1 = \frac N2f + y\cdot\nabla f,\qquad f_2 = \frac N2f_1 + y\cdot\nabla f_1.
\]
2.1. Dynamical controls in the log-log regime. In this subsection, we assume $u_0\in O$, i.e. (9) holds.

Step 1. Localized self similar profiles. We recall from [12] the existence of a one parameter family of localized self similar profiles in the vicinity of ground state solution $Q$. Let a parameter $0<\eta\ll1$ small enough to be fixed later. For $b\ne0$, set $R_b = \frac{2}{|b|}\sqrt{1-\eta}$ and $R_b^- = \sqrt{1-\eta}\,R_b$. Denote $B_{R_b} = \{y\in\mathbb R^N,\ |y|\le R_b\}$. We introduce a regular radially symmetric cut-off function $\phi_b(x)=0$ for $|x|\ge R_b$ and $\phi_b(x)=1$ for $|x|\le R_b^-$, $0\le\phi_b(x)\le1$, such that $|\phi_b'|_{L^\infty} + |\Delta\phi_b|_{L^\infty}\to0$ as $|b|\to0$. We claim:
Proposition 2 (Localized self similar profiles). See Propositions 8 and 9 of [13]. There exist universal constants $C>0$, $\eta^*>0$ such that the following holds true. For all $0<\eta<\eta^*$, there exist constants $\varepsilon^*(\eta)>0$, $b^*(\eta)>0$ going to zero as $\eta\to0$ such that for all $|b|<b^*(\eta)$, there exists a unique radial solution $Q_b^o$ to
\[
\begin{cases}
\Delta Q_b^o - Q_b^o + ib\big(\frac N2Q_b^o + y\cdot\nabla Q_b^o\big) + Q_b^o|Q_b^o|^{\frac4N} = 0, \\
P_b^o = Q_b^o\,e^{i\frac{b|y|^2}{4}} > 0 \ \text{ in } B_{R_b}, \\
Q_b^o(0)\in(Q(0)-\varepsilon^*(\eta),\,Q(0)+\varepsilon^*(\eta)),\quad Q_b^o(R_b)=0.
\end{cases}
\]
Moreover, let
\[
\tilde Q_b(r) = Q_b^o(r)\phi_b(r),
\]
then:
\[
\big\|e^{Cr}(\tilde Q_b - Q)\big\|_{C^3}\to0 \ \text{ as } b\to0.
\]
From now on, we note: $\tilde Q_b = \Sigma + i\Theta$ in terms of real and imaginary parts. Profiles $\tilde Q_b$ are not exact self similar solutions and we define the error term $\Psi_b$ by:
\[
\Delta\tilde Q_b - \tilde Q_b + ib(\tilde Q_b)_1 + \tilde Q_b|\tilde Q_b|^{\frac4N} = -\Psi_b.
\tag{22}
\]
We next introduce outgoing radiation escaping the soliton core according to the following Lemma:

Lemma 1 (Linear outgoing radiation). See Lemma 15 in [13]. There exist universal constants $C>0$ and $\eta^*>0$ such that $\forall0<\eta<\eta^*$, there exists $b^*(\eta)>0$ such that $\forall0<b<b^*(\eta)$, the following holds true: let $\Psi_b$ be given by (22), there exists a unique radial solution $\zeta_b$ to
\[
\begin{cases}
\Delta\zeta_b - \zeta_b + ib(\zeta_b)_1 = \Psi_b, \\
\int|\nabla\zeta_b|^2 < +\infty.
\end{cases}
\]
Moreover, let
\[
\Gamma_b = \lim_{|y|\to+\infty}|y|^N|\zeta_b(y)|^2,
\]
then there holds:
\[
e^{-(1+C\eta)\frac{\pi}{b}} \le \Gamma_b \le e^{-(1-C\eta)\frac{\pi}{b}}.
\tag{23}
\]

Step 2. Geometrical decomposition close to $\tilde Q_b$. The solution $u(t)$ admits for $t$ close enough to $T$ the geometrical decomposition close to the four dimensional manifold
\[
M = \{e^{i\gamma}\lambda^{\frac N2}\tilde Q_b(\lambda y + x)\}.
\]

Lemma 2 (Nonlinear modulation of the solution with respect to $M$). See Lemma 2 in [12]. There exist some time $t(u_0)\in[0,T)$ and some $C^1$ functions $(\lambda,\gamma,x,b):[t(u_0),T)\to(0,+\infty)\times\mathbb R\times\mathbb R^N\times\mathbb R$ such that
\[
\forall t\in[t(u_0),T),\quad \varepsilon(t,y) = e^{i\gamma(t)}\lambda^{\frac N2}(t)\,u(t,\lambda(t)y+x(t)) - \tilde Q_{b(t)}(y)
\]
satisfies the following:
(i)
\[
\begin{aligned}
&(\varepsilon_1(t),|y|^2\Sigma_{b(t)}) + (\varepsilon_2(t),|y|^2\Theta_{b(t)}) = (\varepsilon_1(t),y\Sigma_{b(t)}) + (\varepsilon_2(t),y\Theta_{b(t)}) = 0, \\
&-(\varepsilon_1(t),(\Theta_{b(t)})_2) + (\varepsilon_2(t),(\Sigma_{b(t)})_2) = -(\varepsilon_1(t),(\Theta_{b(t)})_1) + (\varepsilon_2(t),(\Sigma_{b(t)})_1) = 0,
\end{aligned}
\]
where $\varepsilon = \varepsilon_1 + i\varepsilon_2$ in terms of real and imaginary part;
(ii) $\Big|1 - \lambda(t)\frac{|\nabla u(t)|_{L^2}}{|\nabla Q|_{L^2}}\Big| + |\varepsilon(t)|_{H^1} + |b(t)| \le \delta(\alpha^*)$, where $\delta(\alpha^*)\to0$ as $\alpha^*\to0$.
(iii) Let the rescaled time
\[
s(t) = \int_{t(u_0)}^t\frac{d\tau}{\lambda^2(\tau)},
\]
then:
\[
\Big|\frac{\lambda_s}{\lambda}+b\Big| + |b_s| + |\tilde\gamma_s| + \Big|\frac{x_s}{\lambda}\Big| \le C\Big(\int|\nabla\varepsilon|^2 + \int|\varepsilon|^2e^{-2^-|y|}\Big)^{\frac12} + \Gamma_b^{1-C\eta}.
\tag{24}
\]

Step 3. Control of the parameters in the log-log dynamic. We proved in [14] the following sharp controls of the parameters for the log-log dynamic:

Proposition 3 (Sharp controls of the log-log regime).
(i) Equivalent of the geometrical parameters as $t\to T$:
\[
\lambda(t)\left(\frac{\log|\log(T-t)|}{T-t}\right)^{\frac12}\to\sqrt{2\pi},
\tag{25}
\]
\[
b(t)\log|\log(T-t)|\to\pi,
\tag{26}
\]
\[
\frac{s(t)}{|\log(T-t)|\,\log|\log(T-t)|}\to\frac{1}{2\pi}.
\tag{27}
\]
(ii) Pointwise dispersive control of $\varepsilon$: There holds for $t$ close enough to $T$,
\[
\int|\nabla\varepsilon(t)|^2 + \int|\varepsilon(t)|^2e^{-2^-|y|} \le \Gamma_{b(t)}^{1-C\eta},
\tag{28}
\]
for some universal constant $C>0$.

Remark 3. From (23) and (26), we have:
\[
e^{-(1+C\eta)\log|\log(T-t)|} \le e^{-(1+C\eta)\frac{\pi}{b(t)}} \le \Gamma_{b(t)} \le e^{-(1-C\eta)\frac{\pi}{b(t)}} \le e^{-(1-C\eta)\log|\log(T-t)|},
\]
and thus:
\[
\frac{1}{|\log(T-t)|^{1+C\eta}} \le \Gamma_{b(t)} \le \frac{1}{|\log(T-t)|^{1-C\eta}},
\tag{29}
\]
and
\[
\lambda(t) \le e^{-\frac{1}{\Gamma_b^C}},
\tag{30}
\]
for some universal constant $C>0$.

Estimates of Proposition 3 are pointwise in time. Yet, the analysis in [14] provides us with a slight time averaging improvement of (28) and (29) which will be crucial in our further analysis. In what follows, given a parameter $A>0$, we note $\chi_A(r) = \chi\big(\frac rA\big)$ a radial cut-off function with $\chi(r)=1$ for $0\le r\le1$ and $\chi(r)=0$ for $r\ge2$.
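The passage from (23) and (26) to (29) in Remark 3 rests on the identity $e^{-\pi/b} = 1/|\log(T-t)|$ when $b$ is replaced by its leading order $\pi/\log|\log(T-t)|$. The sketch below checks this numerically. It is an illustration only: tau stands for $T-t$ (assumed in $(0,1/e)$), c_eta is a hypothetical value for the product $C\eta$, and using the leading-order $b$ in place of the true $b(t)$ is a simplification.

```python
import math

def b_leading(tau):
    """Leading-order b from (26): b ~ pi / log|log(tau)|, with tau = T - t."""
    return math.pi / math.log(abs(math.log(tau)))

def gamma_window(tau, c_eta):
    """Lower and upper bounds of (23) evaluated at b = b_leading(tau)."""
    b = b_leading(tau)
    lower = math.exp(-(1.0 + c_eta) * math.pi / b)
    upper = math.exp(-(1.0 - c_eta) * math.pi / b)
    return lower, upper
```

Evaluated this way, the window is exactly $\big[\,|\log\tau|^{-(1+C\eta)},\ |\log\tau|^{-(1-C\eta)}\big]$, which is the form of (29).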
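Proposition 4 (Dispersive control of $\varepsilon$). There exist universal constants $\eta^*, C^*>0$ such that the following holds true. $\forall0<\eta<\eta^*$, let
\[
a = \sqrt\eta
\tag{31}
\]
and
\[
A = A(t) = e^{a\frac{\pi}{b(t)}} \quad\text{so that}\quad \Gamma_b^{-a(1-C\eta)} \le A \le \Gamma_b^{-a(1+C\eta)},
\tag{32}
\]
and consider the approximate radiation
\[
\tilde\zeta = \chi_A\zeta_b,
\tag{33}
\]
where $\zeta_b$ is the linear outgoing radiation of Lemma 1. Consider the new variable $\tilde\varepsilon = \varepsilon - \tilde\zeta$, we then have for $s$ large enough:
(i) Full dispersive control:
\[
\int_s^{+\infty}\Big(\int|\nabla\tilde\varepsilon(s)|^2 + \int|\varepsilon(s)|^2e^{-2^-|y|} + \Gamma_{b(s)}\Big)ds \le \frac{C^*}{\log(s)}.
\tag{34}
\]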
(ii) Space localization of the L2 mass: ∀K ≥ K > 0, +∞ K A C2 (K, K ) C1 (K, K ) 2 |ε(s)| ds ≤ ≤ . log(s) log(s) s KA
(35)
(36)
2.2. Dynamical controls in the $S(t)$ regime. Let $u_0\in B_{\alpha^*}$ satisfying (21) such that the corresponding solution $u(t)$ to (1) blows up in finite time $0<T<+\infty$ with lower bound (12). First observe from Theorem 1 and (21) that: $E_0>0$.

Step 1. Geometrical decomposition of the solution. From the blow up assumption on $u(t)$ and variational characterization of $Q$, the solution admits for $t$ close enough to $T$ a geometrical decomposition close to the four dimensional manifold
\[
\hat M = \{e^{i\gamma}\lambda^{\frac N2}\hat Q_b(\lambda y+x)\},
\]
where the profiles $\hat Q_b$ are adapted to the $S(t)$ blow up behavior:
\[
\hat Q_b = Qe^{-i\frac{b|y|^2}{4}} = \hat\Sigma + i\hat\Theta
\tag{37}
\]
in terms of real and imaginary parts.

Remark 4. Introduction of profiles (37) was needed in [18] to obtain some optimal monotonicity results. This is not needed here and we could work with profiles $\tilde Q_b$. Nevertheless, as all key estimates in [18] have been written under this decomposition, we stick to it.

We then have:

Lemma 3 (Nonlinear modulation of the solution with respect to $\hat M$). See Lemma 2 in [18]. There exist some time $t(u_0)\in[0,T)$ and some continuous functions $(\lambda,\gamma,x,b):[t(u_0),T)\to(0,+\infty)\times\mathbb R\times\mathbb R^N\times\mathbb R$ such that
\[
\forall t\in[t(u_0),T),\quad \varepsilon(t,y) = e^{i\gamma(t)}\lambda^{\frac N2}(t)\,u(t,\lambda(t)y+x(t)) - \hat Q_{b(t)}(y)
\]
satisfies the following:
(i)
\[
\begin{aligned}
&(\varepsilon_1(t),|y|^2\hat\Sigma) + (\varepsilon_2(t),|y|^2\hat\Theta) = (\varepsilon_1(t),y\hat\Sigma) + (\varepsilon_2(t),y\hat\Theta) = 0, \\
&-(\varepsilon_1(t),\hat\Theta_2) + (\varepsilon_2(t),\hat\Sigma_2) = -(\varepsilon_1(t),\hat\Theta_1) + (\varepsilon_2(t),\hat\Sigma_1) = 0,
\end{aligned}
\]
where $\varepsilon = \varepsilon_1 + i\varepsilon_2$ in terms of real and imaginary part;
(ii) $\Big|1 - \lambda(t)\frac{|\nabla u(t)|_{L^2}}{|\nabla Q|_{L^2}}\Big| + |\varepsilon(t)|_{H^1} + |b(t)| \le \delta(\alpha^*)$, where $\delta(\alpha^*)\to0$ as $\alpha^*\to0$.
(iii) Let the rescaled time be $s(t) = \int_{t(u_0)}^t\frac{dt'}{\lambda^2(t')}$, then
\[
\Big|\frac{\lambda_s}{\lambda}+b\Big| + |b_s+b^2| + |\tilde\gamma_s| + \Big|\frac{x_s}{\lambda}\Big| \le C\Big(\int|\nabla\varepsilon|^2 + \int|\varepsilon|^2e^{-2^-|y|}\Big)^{\frac12},
\tag{38}
\]
and
\[
\int|\nabla\varepsilon|^2 \le C\Big(\int|\varepsilon|^2e^{-|y|}\Big)^{\frac12} + C(b^2+\lambda^2E_0).
\tag{39}
\]

Step 2. Virial estimate and the sharp monotonicity property. In this regime, we do not know the analogue of Proposition 3 and it is indeed an open problem to get the exact laws for the parameters, if any in general. Nevertheless, we have the following:

Proposition 5 (Control of the parameters in the $S(t)$ regime). See Proposition 1 in [18]. There exist universal constants $\delta_0, C^*>0$ such that for $t$ close enough to $T$:
(i) Pointwise control of $b$ by $\lambda$:
\[
|b(t)| \le \lambda(t)\,C^*\sqrt{E_0},
\tag{40}
\]
(ii) Pointwise control of the speed:
\[
\lambda(t) \le C^*\sqrt{E_0}\,(T-t),
\tag{41}
\]
(iii) Dispersive virial relation: for all $s$ large enough,
\[
b_s \ge \delta_0\Big(\int|\nabla\varepsilon|^2 + \int|\varepsilon|^2e^{-2^-|y|}\Big) - \frac{1}{\delta_0}\lambda^2E_0.
\tag{42}
\]

Step 3. Asymptotic stability in the $S(t)$ regime. We now prove, as a consequence of the local virial estimate (42) coupled with sharp control (40), time averaged dispersive estimates on $\varepsilon$ which in particular imply the asymptotic stability of $Q$ as a blow up profile in the $S(t)$ regime.

Proposition 6 (Asymptotic stability of the blow up profile in the $S(t)$ regime). There exists some universal constant $C>0$ such that the following holds true:
(i) Asymptotic stability of $Q$ as the blow up profile:
\[
b(t) + \int|\nabla\varepsilon(t)|^2 + \int|\varepsilon(t)|^2e^{-|y|}\to0 \ \text{ as } t\to T.
\tag{43}
\]
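(ii) Estimate of the speed of dispersion: we have for t close enough to T,
\[ \int_t^T \frac{d\tau}{\lambda^2(\tau)}\left(\int |\nabla\varepsilon(\tau)|^2 + \int |\varepsilon(\tau)|^2 e^{-|y|}\right) \le C E_0 (T - t). \tag{44} \]
In particular, there exists a sequence $t_n \to T$ such that:
\[ \forall n, \quad \int |\nabla\varepsilon(t_n)|^2 + \int |\varepsilon(t_n)|^2 e^{-|y|} \le C\lambda^2(t_n) E_0. \tag{45} \]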
Remark 5. Asymptotic stability (43) has been proved for the log-log regime in [13, 14]. The proof there is the most difficult in our analysis. This fact is related to the proximity of the log-log regime to a possible scaling regime, $|\nabla u(t)|_{L^2} \sim \frac{1}{\sqrt{T-t}}$, whose existence would contradict asymptotic stability. In general, results in [13] ensure (43) on a sequence in time, or on the whole sequence under monotonicity assumptions on λ (this last fact is still open). Observe now that in the S(t) regime, the lower bound (12) is far above the scaling estimate, and the proof of (43) will then be direct.

Proof of Proposition 6. First observe from $\frac{dt}{ds} = \lambda^2$ and the finite time blow up assumption on u(t) that:
\[ \int_s^{+\infty} \lambda^2(\sigma)\, d\sigma = T - t. \]
Integrating (42) in time s and using $b(s) \to 0$ as $s \to +\infty$ from (40), we get:
\[ \int_t^T \frac{d\tau}{\lambda^2(\tau)}\left(\int |\nabla\varepsilon(\tau)|^2 + \int |\varepsilon(\tau)|^2 e^{-|y|}\right) = \int_s^{+\infty} d\sigma\left(\int |\nabla\varepsilon(\sigma)|^2 + \int |\varepsilon(\sigma)|^2 e^{-|y|}\right) \]
\[ \le C\int_s^{+\infty} \lambda^2(\sigma) E_0\, d\sigma + C|b(s)| \le C E_0 (T - t). \]
This proves (44). In particular,
\[ \int_s^{+\infty} d\sigma \int |\varepsilon(\sigma)|^2 e^{-|y|} < +\infty. \tag{46} \]
Moreover, from the equation satisfied by ε and the control on the parameters (38), we have (see [13]):
\[ \left|\left(\int |\varepsilon(s)|^2 e^{-|y|}\right)_s\right| < +\infty, \]
and thus (46) implies: $\int |\varepsilon(s)|^2 e^{-|y|} \to 0$ as $s \to +\infty$. Equation (39) with (40) now yields (43). Equation (45) directly follows from (44). This concludes the proof of Proposition 6.

3. Reduction of the Proof of Main Results

3.1. Reduction to the zero momentum case. Our aim in this subsection is to use Galilean invariance to reduce the proof of Theorem 2 and Theorem 3 to that of a similar result under the additional assumption (21) on $u_0$:
\[ \mathrm{Im}\left(\int \nabla u_0\, \overline{u_0}\right) = 0. \]
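Proposition 7 (Reduction to the zero momentum case). Let $u_0 \in B_{\alpha^*}$ satisfy (21) and assume the corresponding solution u(t) to (1) blows up in finite time $0 < T < +\infty$. Let
\[ u(t, x) = \frac{1}{\lambda^{N/2}(t)}\,(Q_{b(t)} + \varepsilon)\left(t, \frac{x - x(t)}{\lambda(t)}\right) e^{i\gamma(t)} \]
be the decomposition of Lemma 2 or Lemma 3 depending on the blow up regime, where $Q_b$ is correspondingly $\tilde Q_b$ or $\hat Q_b$. Let
\[ \tilde u(t, x) = \frac{1}{\lambda^{N/2}(t)}\,\varepsilon\left(t, \frac{x - x(t)}{\lambda(t)}\right) e^{i\gamma(t)}, \tag{47} \]
then there exist $u^* \in L^2$, $x(T) \in \mathbb{R}^N$, such that:
\[ \tilde u(t) \to u^* \ \text{ in } L^2 \text{ as } t \to T, \tag{48} \]
\[ x(t) \to x(T) \ \text{ as } t \to T, \tag{49} \]
and $u^*$ satisfies the estimates of Theorem 3.

Proof of Theorem 2 and Theorem 3 assuming Proposition 7. Let $u_0 \in B_{\alpha^*}$ be such that the corresponding solution u(t) to (1) blows up in finite time $0 < T < +\infty$. Let
\[ \beta = -2\,\frac{\mathrm{Im}\left(\int \nabla u_0\, \overline{u_0}\right)}{\int |u_0|^2} \quad\text{and}\quad u_\beta(0, x) = u_0\, e^{i\frac{\beta}{2}\cdot x}, \tag{50} \]
then the corresponding solution to (1) is, from Galilean invariance,
\[ u_\beta(t, x) = u(t, x - \beta t)\, e^{i\frac{\beta}{2}\cdot(x - \beta t)}, \tag{51} \]
which satisfies (21) from the choice of β, and blows up at T. We thus may apply Proposition 7 and denote by
\[ u_\beta(t, x) = \frac{1}{\lambda_\beta^{N/2}(t)}\,(Q_{b_\beta(t)} + \varepsilon_\beta)\left(t, \frac{x - x_\beta(t)}{\lambda_\beta(t)}\right) e^{i\gamma_\beta(t)} \]
its geometrical decomposition, for which:
\[ \tilde u_\beta(t, x) = \frac{1}{\lambda_\beta^{N/2}(t)}\,\varepsilon_\beta\left(t, \frac{x - x_\beta(t)}{\lambda_\beta(t)}\right) e^{i\gamma_\beta(t)} \to u^*_\beta \ \text{ in } L^2 \text{ as } t \to T, \tag{52} \]
and $x_\beta(t) \to x_\beta(T) \in \mathbb{R}^N$. Let now
\[ x(t) = x_\beta(t) - \beta t, \qquad \gamma(t) = \gamma_\beta(t) - \frac{\beta}{2}\cdot x(t), \]
then
\[ u(t) - \frac{1}{\lambda_\beta(t)^{N/2}}\, Q\left(\frac{x - x(t)}{\lambda_\beta(t)}\right) e^{i\gamma(t)} = \tilde u_\beta(t, x + \beta t)\, e^{-i\frac{\beta}{2}\cdot x} + R_\beta(t, x) \tag{53} \]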
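with
\[ R_\beta(t, x) = \frac{e^{i\gamma_\beta(t)}\, e^{-i\frac{\beta}{2}\cdot x(t)}}{\lambda_\beta(t)^{N/2}} \left[ e^{-i\lambda_\beta(t)\frac{\beta}{2}\cdot y}\, Q_{b_\beta(t)} - Q \right]\left(\frac{x - (x_\beta(t) - \beta t)}{\lambda_\beta(t)}\right). \]
From $L^2$ scaling,
\[ |R_\beta|_{L^2} = \left| e^{-i\lambda_\beta(t)\frac{\beta}{2}\cdot y}\, Q_{b_\beta(t)} - Q \right|_{L^2} \to 0 \ \text{ as } t \to T \]
from $\lambda_\beta(t) \to 0$, $Q_b \to Q$ as $b \to 0$ in some strong sense, and asymptotic stability $b(t) \to 0$ as $t \to T$. Thus from (52):
\[ u(t) - \frac{1}{\lambda_\beta(t)^{N/2}}\, Q\left(\frac{x - x(t)}{\lambda_\beta(t)}\right) e^{i\gamma(t)} \to u^*(x) = u^*_\beta(x + \beta T)\, e^{-i\frac{\beta}{2}\cdot x} \ \text{ in } L^2 \text{ as } t \to T, \tag{54} \]
and the blow up point is finite: $x(T) = x_\beta(T) - \beta T \in \mathbb{R}^N$.

Moreover, from (54), $u^*(x + x(T)) = u^*_\beta(x + x_\beta(T))\, e^{-i\frac{\beta}{2}\cdot(x + x(T))}$, so that the local behavior in $H^1$ of $u^*$ at x(T) is the same as that of $u^*_\beta$ at $x_\beta(T)$, and Theorem 3 follows. This concludes the proofs of Theorem 2 and Theorem 3 assuming Proposition 7.

3.2. Universal structure of the singularity in the log-log regime. This subsection is devoted to the proof of Proposition 1.

Proof of Proposition 1 assuming Proposition 7. From the previous subsection, it is enough to prove it under the additional assumption (21). Let then $u_0 \in O$ satisfy (21) and let $(\lambda(t), x(t), \gamma(t))$ be the parameters associated to the geometrical decomposition of Lemma 2. Let $\lambda_0(t)$ be given by (16); we have:
\[ \frac{\lambda(t)}{\lambda_0(t)} \to 1 \ \text{ as } t \to T, \tag{55} \]
\[ \frac{|x(T) - x(t)|}{\lambda(t)} \to 0 \ \text{ as } t \to T. \tag{56} \]
Indeed, (55) follows from (9). For (56), we have from (24), (25), (28), (29) and (30):
\[ \left|\frac{x_s}{\lambda}\right| \le \frac{1}{|\log(T - t)|^{C}} \quad\text{and thus}\quad |x_t| = \frac{1}{\lambda}\left|\frac{x_s}{\lambda}\right| \le \frac{1}{|\log(T - t)|^{C}} \sqrt{\frac{\log|\log(T - t)|}{T - t}}. \]
Integrating this in time, we conclude:
\[ \frac{|x(T) - x(t)|}{\lambda(t)} \le C\sqrt{\frac{\log|\log(T - t)|}{T - t}} \int_t^T \frac{1}{|\log(T - \tau)|^{C}} \sqrt{\frac{\log|\log(T - \tau)|}{T - \tau}}\, d\tau \le \frac{1}{|\log(T - t)|^{\frac{C}{2}}} \to 0 \ \text{ as } t \to T. \]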
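We now have:
\[ \int \left| \frac{1}{\lambda(t)^{N/2}}\, Q\left(\frac{x - x(t)}{\lambda(t)}\right) - \frac{1}{\lambda_0(t)^{N/2}}\, Q\left(\frac{x - x(T)}{\lambda_0(t)}\right) \right|^2 dx = \int \left| Q(y) - \left(\frac{\lambda(t)}{\lambda_0(t)}\right)^{\frac N2} Q\left(\frac{\lambda(t)}{\lambda_0(t)}\left(y - \frac{x(T) - x(t)}{\lambda(t)}\right)\right) \right|^2 dy \to 0 \ \text{ as } t \to T. \]
This concludes the proof of (56) and of Proposition 1.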
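The time integration behind (56) can be sketched as follows. This is only an illustration, assuming that C > 0 is the exponent in the bound on $|x_t|$ stated in the proof, that $1/\lambda(t)$ is controlled by $\sqrt{\log|\log(T-t)|/(T-t)}$ as in the log-log regime, and that t is close enough to T for the auxiliary function g below (a name introduced here, not notation from the paper) to be nonincreasing in τ.

```latex
% Sketch only: g is a hypothetical auxiliary function, not notation from the paper.
\[
g(\tau) = \frac{\sqrt{\log|\log(T-\tau)|}}{|\log(T-\tau)|^{C}},
\qquad
\int_t^T \frac{g(\tau)}{\sqrt{T-\tau}}\,d\tau
\le g(t)\int_t^T \frac{d\tau}{\sqrt{T-\tau}}
= 2\,g(t)\sqrt{T-t}.
\]
% Multiply by the prefactor coming from 1/\lambda(t):
\[
\sqrt{\frac{\log|\log(T-t)|}{T-t}}\cdot 2\,g(t)\sqrt{T-t}
= \frac{2\log|\log(T-t)|}{|\log(T-t)|^{C}}
\le \frac{1}{|\log(T-t)|^{C/2}}
\quad\text{for } t \text{ close enough to } T.
\]
```

The last inequality only uses that $2\log L \le L^{C/2}$ for $L = |\log(T-t)|$ large; any multiplicative constant is absorbed in the same way.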
The rest of this paper is devoted to the proof of Proposition 7. We thus let an initial data $u_0 \in B_{\alpha^*}$ satisfy (21) with blow up time $0 < T < +\infty$, and assume the corresponding solution to (1) admits a geometrical decomposition on [0, T) as in Lemma 2 or Lemma 3 depending on the blow up regime, where $Q_b$ is correspondingly $\tilde Q_b$ or $\hat Q_b$, which we denote:
\[ u(t, x) = \frac{1}{\lambda^{N/2}(t)}\,(Q_{b(t)} + \varepsilon)\left(t, \frac{x - x(t)}{\lambda(t)}\right) e^{i\gamma(t)} = Q^S(t, x) + \tilde u(t, x) \tag{57} \]
with
\[ Q^S(t, x) = \frac{1}{\lambda(t)^{N/2}}\, Q_{b(t)}\left(\frac{x - x(t)}{\lambda(t)}\right) e^{i\gamma(t)}, \qquad \tilde u(t, x) = \frac{1}{\lambda^{N/2}(t)}\,\varepsilon\left(t, \frac{x - x(t)}{\lambda(t)}\right) e^{i\gamma(t)}. \tag{58} \]

4. Existence of $L^2$ Profile $u^*$

This section is devoted to the proof of (48) and (49) of Proposition 7.

4.1. Convergence of the blow up point. In this subsection, we prove the existence and finiteness of the blow up point x(T).

Proposition 8 (Existence and finiteness of the blow up point).
\[ x(t) \to x(T) \in \mathbb{R}^N \ \text{ as } t \to T. \tag{59} \]

Remark 6. This result has already been proved in the log-log regime, see [13], as a direct consequence of the log-log upper bound on blow up speed. In the S(t) regime, it will follow from dispersive controls on ε.

Proof of Proposition 8. (59) follows from:
\[ \int_0^T |x_t|\, dt < +\infty. \tag{60} \]
log-log regime. If $u_0 \in O$, then from (24) and (9),
\[ |x_t| = \frac{1}{\lambda}\left|\frac{x_s}{\lambda}\right| \le C|\nabla u(t)|_{L^2} \le C\sqrt{\frac{\log|\log(T - t)|}{T - t}}, \tag{61} \]
and (60) follows.
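S(t) regime. In this case, $\int_0^T |\nabla u(t)|_{L^2}\, dt = +\infty$, so that the previous argument breaks down. From (38), we have:
\[ |x_t| = \frac{1}{\lambda}\left|\frac{x_s}{\lambda}\right| \le \frac{C}{\lambda}\left(\int |\nabla\varepsilon(t)|^2 + \int |\varepsilon|^2 e^{-2^-|y|}\right)^{\frac12}. \]
Now (44) yields $\int_0^T \frac{dt}{\lambda^2(t)}\left(\int |\nabla\varepsilon(t)|^2 + \int |\varepsilon(t)|^2 e^{-|y|}\right) < +\infty$ and (60) follows. This concludes the proof of Proposition 8.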
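The passage from (44) to (60) in the S(t) regime can be spelled out with the Cauchy-Schwarz inequality in time. In the sketch below, D(t) is shorthand introduced here (not notation from the paper), $t_1$ is any time close enough to T for (38) and (44) to hold, and we use $e^{-2^-|y|} \le e^{-|y|}$.

```latex
% D(t) and t_1 are illustrative names, not taken from the paper.
\[
D(t) = \int|\nabla\varepsilon(t)|^2 + \int|\varepsilon(t)|^2e^{-|y|},
\qquad
|x_t| \le \frac{C}{\lambda(t)}\,D(t)^{1/2},
\]
\[
\int_{t_1}^T |x_t|\,dt
\le C\int_{t_1}^T \frac{D(t)^{1/2}}{\lambda(t)}\,dt
\le C\,(T-t_1)^{1/2}\left(\int_{t_1}^T \frac{D(t)}{\lambda^2(t)}\,dt\right)^{1/2}
\le C\,E_0^{1/2}\,(T-t_1) < +\infty .
\]
```

The point of the computation is that only the time average of $D/\lambda^2$ is needed, not a pointwise bound on $1/\lambda$.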
4.2. Existence of an $L^2$ limit outside the blow up point. We now claim existence of an $L^2$ profile outside the blow up point x(T):

Proposition 9 (Existence of an $L^2$ profile outside the blow up point). There exists $u^* \in L^2$ such that for all R > 0,
\[ \tilde u(t) \to u^* \ \text{ in } L^2(|x - x(T)| > R). \tag{62} \]

Proof of Proposition 9. Step 1. Space time control of u outside the blow up point. We claim: for all R > 0,
\[ \int_0^T \int_{|x - x(T)| > R} |\nabla u(t)|^2\, dt < +\infty. \tag{63} \]

Remark 7. From local well posedness of the Cauchy problem in $L^2$, blow up occurs in $L^2$ if $\int_0^T \int |\nabla u(t)|^2\, dt = +\infty$. Equation (63) thus means a space time gain of regularity on the solution outside the blow up point, which will yield some $L^2$ control.

Proof of (63). Fix R > 0. From (57), we have:
\[ \frac12 \int_0^T \int_{|x - x(T)| > R} |\nabla u(t)|^2\, dt \le \int_0^T \int_{|x - x(T)| > R} |\nabla \tilde u(t)|^2\, dt + \int_0^T \int_{|x - x(T)| > R} |\nabla Q^S(t)|^2\, dt. \]
From convergence (59) and uniform exponential decay on $Q_b$, we have for $t \in [t(R), T)$:
\[ \int_{|x - x(T)| > R} |\nabla Q^S(t)|^2 \le e^{-C\frac{R}{\lambda(t)}} \le C(R). \]
For the $\tilde u$ term, we argue differently depending on the blow up regime:
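log-log regime. Using (59), we have for $t \in [t(R), T)$:
\[ \int_0^T \int_{|x - x(T)| > R} |\nabla \tilde u(\tau)|^2\, d\tau = \int_0^T \frac{d\tau}{\lambda^2(\tau)} \int_{|\lambda(\tau)y + x(\tau) - x(T)| > R} |\nabla\varepsilon(\tau)|^2 \le \int_0^{+\infty} \int_{|y| > \frac{R}{2\lambda}} |\nabla\varepsilon(s)|^2\, ds. \]
Now observe from the choice (32) of cut-off parameter A(t) and (30) that:
\[ A(t) \le \Gamma_b^{-C} \ll e^{\frac{1}{b^C}} \le \frac{1}{\lambda}. \]
Since the cut radiation $\tilde\zeta(y)$ given by (33) is zero for $|y| \ge \frac{R}{2\lambda}$, and from definition (34), dispersive control (35) now yields for s large enough:
\[ \int_s^{+\infty} \int_{|y| > \frac{R}{2\lambda(\sigma)}} |\nabla\varepsilon(\sigma)|^2\, d\sigma = \int_s^{+\infty} \int_{|y| > \frac{R}{2\lambda(\sigma)}} |\nabla\tilde\varepsilon(\sigma)|^2\, d\sigma < +\infty. \]
S(t) regime. From (44),
\[ \int_0^T \int |\nabla \tilde u(t)|^2\, dt = \int_0^T \frac{dt}{\lambda^2(t)} \int |\nabla\varepsilon(t)|^2 < +\infty. \]
This concludes the proof of (63).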
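Both regimes in the proof of (63) rely on the same change of variables between $\tilde u$ and ε. As a short check, assuming only the definition (58) of $\tilde u$:

```latex
\[
\nabla\tilde u(t,x)
= \frac{1}{\lambda^{\frac N2+1}(t)}\,(\nabla\varepsilon)\!\left(t,\frac{x-x(t)}{\lambda(t)}\right)e^{i\gamma(t)},
\]
\[
\int|\nabla\tilde u(t,x)|^2\,dx
= \frac{1}{\lambda^{N+2}(t)}\int\left|\nabla\varepsilon\!\left(t,\frac{x-x(t)}{\lambda(t)}\right)\right|^2dx
= \frac{1}{\lambda^{2}(t)}\int|\nabla\varepsilon(t,y)|^2\,dy,
\qquad y=\frac{x-x(t)}{\lambda(t)},\quad dx=\lambda^N(t)\,dy .
\]
```

Under the same substitution, the region $|x - x(T)| > R$ becomes $|\lambda(t)y + x(t) - x(T)| > R$, which is the domain appearing in the log-log case.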
Step 2. Existence of the $L^2$ limit. Fix a parameter R > 0. We claim that u(t) satisfies the Cauchy criterion as $t \to T$ in $L^2(|x| > R)$. Indeed, pick a small $\varepsilon_0 > 0$. We may assume from (63) that t(R) is close enough to T so that:
\[ \int_{t(R)}^T \int_{|x - x(T)| > \frac{R}{10}} |\nabla u(t)|^2\, dt < \varepsilon_0. \tag{64} \]
Next, given a fixed parameter τ > 0, let $v^\tau(t, x) = u(t + \tau, x) - u(t, x)$. u(t) is strongly continuous in $L^2$ at time t(R), so there exists $\tau_0 > 0$ such that
\[ \forall \tau \in [0, \tau_0], \quad \int |v^\tau(t(R))|^2 < \varepsilon_0. \tag{65} \]
We now claim: $\forall \tau \in [0, \tau_0]$, $\forall t \in [t(R), T - \tau)$,
\[ \int_{|x - x(T)| \ge \frac{R}{4}} |v^\tau(t)|^2 < C\varepsilon_0, \tag{66} \]
for some universal constant C > 0. This indeed implies that u(t) is a Cauchy sequence in $L^2(|x| \ge R)$ and yields the claim.
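Proof of (66). $v^\tau(t, x)$ satisfies:
\[ i v^\tau_t + \Delta v^\tau = -\left( u|u|^{\frac4N}(t + \tau) - u|u|^{\frac4N}(t) \right). \]
Let a cut off function $\varphi(x) = 1$ for $|x| \ge 2$, $\varphi(x) = 0$ for $|x| \le 1$; we compute:
\[ \frac12 \left( \int \varphi\left(\frac{x - x(T)}{R}\right) |v^\tau|^2 \right)_t = \frac{1}{R}\, \mathrm{Im}\left( \int \nabla\varphi\left(\frac{x - x(T)}{R}\right) \cdot \nabla v^\tau\, \overline{v^\tau} \right) + \mathrm{Im}\left( \int \varphi\left(\frac{x - x(T)}{R}\right) v^\tau\, \overline{\left( u|u|^{\frac4N}(t + \tau) - u|u|^{\frac4N}(t) \right)} \right). \]
From Cauchy-Schwarz and the conservation of the $L^2$ norm:
\[ \left| \int \nabla\varphi\left(\frac{x - x(T)}{R}\right) \cdot \nabla v^\tau\, \overline{v^\tau} \right| \le C(R) + C(R) \int_{|x - x(T)| \ge \frac{R}{10}} \left( |\nabla u(t)|^2 + |\nabla u(t + \tau)|^2 \right). \]
For the second term, we first have by homogeneity:
\[ |v^\tau| \left( |u(t)|^{1 + \frac4N} + |u(t + \tau)|^{1 + \frac4N} \right) \le C\left( |u(t)|^{2 + \frac4N} + |u(t + \tau)|^{2 + \frac4N} \right). \]
Then, we may assume the cut off function φ is written $\varphi = \tilde\varphi^{2 + \frac4N}$ with $\tilde\varphi$ regular, and thus from the Gagliardo-Nirenberg inequality and the conservation of the $L^2$ norm:
\[ \int \varphi\left(\frac{x - x(T)}{R}\right) |u(t)|^{2 + \frac4N} = \int \left| \tilde\varphi\left(\frac{x - x(T)}{R}\right) u(t) \right|^{2 + \frac4N} \le C \left( \int \left| \nabla\left( \tilde\varphi\left(\frac{x - x(T)}{R}\right) u(t) \right) \right|^2 \right) \left( \int \left| \tilde\varphi\left(\frac{x - x(T)}{R}\right) u(t) \right|^2 \right)^{\frac2N} \le C(R)\left( 1 + \int_{|x - x(T)| \ge \frac{R}{10}} |\nabla u(t)|^2 \right). \]
We thus conclude, writing $\varphi_R(x) = \varphi\left(\frac{x - x(T)}{R}\right)$: $\forall \tau \in [0, \tau_0)$, $\forall t$ with $[t, t + \tau] \subset [t(R), T)$,
\[ \left| \left( \int \varphi_R |v^\tau|^2 \right)_t \right| \le C(R)\left( 1 + \int_{|x - x(T)| > \frac{R}{10}} \left( |\nabla u(t)|^2 + |\nabla u(t + \tau)|^2 \right) \right). \]
Integrating this in time with controls (64) and (65) now yields (66). This concludes the proof of Proposition 9.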
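The localized mass identity used at the start of the proof of (66) follows from a standard integration by parts. The sketch below is written for a generic equation $i v_t + \Delta v = -F$ and a real, time-independent weight φ; here F is a placeholder name for the difference of nonlinear terms, not notation from the paper.

```latex
% F is a placeholder for the right-hand side; \varphi is real and time independent.
\[
\frac12\frac{d}{dt}\int\varphi|v|^2
= \mathrm{Re}\int\varphi\,\bar v\,v_t
= \mathrm{Re}\int\varphi\,\bar v\,(i\Delta v + iF)
= -\,\mathrm{Im}\int\varphi\,\bar v\,\Delta v \;-\; \mathrm{Im}\int\varphi\,\bar v\,F .
\]
% Integrate by parts; the term \int\varphi|\nabla v|^2 is real and drops out.
\[
-\,\mathrm{Im}\int\varphi\,\bar v\,\Delta v
= \mathrm{Im}\int\nabla(\varphi\bar v)\cdot\nabla v
= \mathrm{Im}\int\nabla\varphi\cdot\nabla v\,\bar v ,
\qquad
-\,\mathrm{Im}\int\varphi\,\bar v\,F = \mathrm{Im}\int\varphi\,v\,\bar F .
\]
```

Replacing $\varphi(x)$ by $\varphi\left(\frac{x - x(T)}{R}\right)$ produces the factor $\frac1R$ in front of the gradient term.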
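4.3. Non concentration of $\tilde u$ at the blow up point. In this subsection, we conclude the proof of the $L^2$ convergence (48) of $\tilde u$ of Proposition 7, which from Proposition 9 amounts to proving that $\tilde u$ does not concentrate any $L^2$ mass at the blow up point.

Proof of $L^2$ convergence (48) of Proposition 7. Remark that (48) is implied by
\[ \int |u^*|^2 = \int |u_0|^2 - \int Q^2. \tag{67} \]
Indeed, $\tilde u \rightharpoonup u^*$ weakly in $L^2$. Moreover,
\[ \int |u_0|^2 = \int |u(t)|^2 = \int |\tilde u(t) + Q^S(t)|^2, \]
and from asymptotic stability:
\[ \int |Q^S(t)|^2 \to \int Q^2 \quad\text{and}\quad \left| (\tilde u(t), Q^S(t)) \right| \le \left( \int |\varepsilon(t, y)|^2 e^{-C|y|}\, dy \right)^{\frac12} \to 0 \ \text{ as } t \to T. \tag{68} \]
Thus (67) implies $\int |\tilde u(t)|^2 \to \int |u^*|^2$, which concludes the proof.

Proof of (67). As for the asymptotic stability of the blow up profile, the proof requires some work in the log-log regime, but is straightforward in the S(t) regime.

log-log case. Let $R(t) = A(t)\lambda(t)$ with A(t) given by (32). Note from (32) and (29) that there holds for some constant C > 0:
\[ \frac{1}{A(t)} \le \frac{1}{|\log(T - t)|^{Ca}}, \tag{69} \]
with a given by (31). Let a cut-off function $\chi(x) = 1$ for $|x| \ge 2$, $\chi(x) = 0$ for $|x| \le 1$. Compute from (1) using (61):
\[ \left| \frac12 \left( \int \chi\left(\frac{x - x(\tau)}{R(t)}\right) |u(\tau)|^2 \right)_\tau \right| = \left| \frac{1}{R(t)}\, \mathrm{Im}\left( \int \nabla\chi\left(\frac{x - x(\tau)}{R(t)}\right) \cdot \nabla u\, \bar u \right) - \frac{x_\tau}{2R(t)} \cdot \int \nabla\chi\left(\frac{x - x(\tau)}{R(t)}\right) |u(\tau)|^2 \right| \le \frac{C}{A(t)\lambda(t)} \left( \int |\nabla u(\tau)|^2 \right)^{\frac12}. \]
Integrating this in time, we get using (48) and (69):
\[ \left| \int \chi\left(\frac{x - x(T)}{R(t)}\right) |u^*|^2 - \int \chi\left(\frac{x - x(t)}{R(t)}\right) |u(t)|^2 \right| \le \frac{C}{A(t)\lambda(t)} \int_t^T |\nabla u(\tau)|_{L^2}\, d\tau \le \frac{C}{|\log(T - t)|^{Ca}} \sqrt{\frac{\log|\log(T - t)|}{T - t}} \int_t^T \sqrt{\frac{\log|\log(T - \tau)|}{T - \tau}}\, d\tau \le \frac{1}{|\log(T - t)|^{\frac{Ca}{2}}}. \]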
Letting $t \to T$, we have:
\[ \int |u^*|^2 = \lim_{t \to T} \int \chi\left(\frac{x - x(t)}{R(t)}\right) |u(t)|^2. \tag{70} \]
We then have:
\[ \lim_{t \to T} \int \chi\left(\frac{x - x(t)}{R(t)}\right) |u(t)|^2 = \lim_{t \to T} \int \chi\left(\frac{x - x(t)}{R(t)}\right) |\tilde u(t)|^2 = \lim_{t \to T} \int |\tilde u(t)|^2, \tag{71} \]
which concludes the proof.
Proof of (71). Recall first the Sobolev type estimate observed in [14]: there exists a universal constant C > 0 such that
\[ \forall D \ge 1,\ \forall v \in H^1, \quad \int_{|y| \le D} |v|^2 \le C D^2 \left( \int |\nabla v|^2 + \int |v|^2 e^{-|y|} \right). \tag{72} \]
Then from (28):
\[ \int_{|x - x(t)| \le 2R(t)} |\tilde u(t)|^2 = \int_{|y| \le 2A(t)} |\varepsilon(t, y)|^2\, dy \le C A(t)^2 \left( \int |\nabla\varepsilon(t)|^2 + \int |\varepsilon(t)|^2 e^{-|y|} \right) \le \Gamma_{b(t)}^{\frac12} \to 0 \ \text{ as } t \to T \]
provided η > 0 is small enough, which concludes the proof of (71) and (67).

S(t) case. From Proposition 9, $\tilde u \to u^*$ in $L^2(|x - x(T)| > 1)$ as $t \to T$. Moreover, from (45), there exists a sequence $t_n \to T$ such that
\[ \int |\nabla \tilde u(t_n)|^2 = \frac{1}{\lambda^2(t_n)} \int |\nabla\varepsilon(t_n)|^2 \le C E_0, \]
and thus by uniqueness of the weak limit in $H^1$,
\[ u^* \in H^1 \quad\text{and}\quad \tilde u(t_n) \rightharpoonup u^* \ \text{ in } H^1 \text{ as } t_n \to T. \tag{73} \]
From the compact embedding of $H^1$ into $L^2_{loc}$, $\tilde u(t_n) \to u^*$ in $L^2(|x - x(T)| \le 1)$, and thus:
\[ \int |\tilde u(t_n)|^2 \to \int |u^*|^2. \]
Now (67) follows from the conservation of the $L^2$ norm and (68). This concludes the proof of $L^2$ convergence (48) of Proposition 7.
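The step from (67) to the strong convergence (48) uses a general Hilbert space fact: weak convergence together with convergence of the norms gives strong convergence. A one-line check, using only $\tilde u(t) \rightharpoonup u^*$ weakly in $L^2$ and $\int |\tilde u(t)|^2 \to \int |u^*|^2$:

```latex
\[
\int|\tilde u(t)-u^*|^2
= \int|\tilde u(t)|^2 \;-\; 2\,\mathrm{Re}\,\bigl(\tilde u(t),u^*\bigr) \;+\; \int|u^*|^2
\;\longrightarrow\;
\int|u^*|^2 - 2\int|u^*|^2 + \int|u^*|^2 = 0
\quad\text{as } t\to T .
\]
```

The middle term converges because testing the weak convergence against the fixed function $u^*$ gives $(\tilde u(t), u^*) \to \int |u^*|^2$.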
5. Asymptotic of the L2 Profile at the Blow Up Point In this section, we conclude the proof of Proposition 7 by proving that L2 profile u∗ given by (48) satisfies estimates of Theorem 3. To estimate the size of u∗ at the blow up point is equivalent to understand the mass ejection phenomenon which couples the singular part with the regular part of the solution. In the S(t) regime, these two parts are decoupled which leads to a regular and flat profile u∗ at the blow up point as expressed by (20). On the contrary, a characteristic of the log-log regime is the strong coupling between the singularity formation and dispersion outside blow up point. The mass ejection process is indeed the main mechanism in the blow up dynamic. The proof of estimate (18) involves two different arguments: 1. Localization of the L2 mass in the rescaled variables. In [14], an explicit study of dispersive effects in L2 in the rescaled variables has allowed us to derive the very strong localization estimate (36) which should be formally understood as:
\[ \int_{A \le |y| \le 2A} |\varepsilon(t)|^2 \sim \Gamma_{b(t)}. \tag{74} \]
2. Control of the $L^2$ flux. To derive estimates on $u^*$, we propagate this information in original variables by proving that this amount of mass is frozen in time up to T.

Remark 8. In the log-log case, we will in fact prove that given any R > 0, there is a time t(R) such that for $t \in [t(R), T)$, the mass ejection phenomenon at R becomes negligible, and then the $L^2$ flux around R is frozen at its asymptotic value given by the equivalent (18) for $u^*$. In the time averaging sense, we will then have: $\forall t \in [t(R), T)$,
\[ \int_{\frac{R}{2} \le |x - x(T)| \le R} |u(t)|^2 \sim \int_{\frac{R}{2\lambda(t)} \le |y| \le \frac{R}{\lambda(t)}} |\varepsilon(t)|^2 \sim \frac{C}{\log(R)\, (\log|\log(R)|)^3}. \tag{75} \]
The behavior in R of the right-hand side is a consequence of (74), or more specifically, as exhibited in [14], of the fact that in the region $|y| \le A(t)$, a good approximation of the solution ε(t) is the universal radiation $\zeta_{b(t)}$ of Lemma 1. From (75), we now may estimate the size of the region in space for which this approximation is good, which will turn out to be much smaller than the one suggested at a formal level in [19]. Indeed, assuming $\varepsilon \sim \zeta_b$ for $|y| \le B(t)$, we get from explicit computation on $\zeta_{b(t)}$ and (75):
\[ \Gamma_{b(t)} \sim \int_{|y| \sim B(t)} |\varepsilon(t)|^2 = \int_{|y| \sim \frac{R}{\lambda(t)}} |\varepsilon(t)|^2 \sim \frac{C}{\log(R)\, (\log|\log(R)|)^3} = \frac{C}{\log(\lambda(t)B(t))\, (\log|\log(\lambda(t)B(t))|)^3}. \]
This provides us with a control from above and below on B(t) which implies in particular:
\[ B(t) \le \frac{1}{\lambda(t)^{1 - \delta}}, \quad 0 < \delta < 1. \]
In original variables, this means that the radiative zone does not escape the focusing point, and that the regular part of the solution indeed corresponds to a different regime.
S(t) case. First observe that $u^* \in H^1$ has been proved in the proof of Proposition 7 as a straightforward consequence of (45), see (73). It remains to prove the degeneracy estimate at the blow up point (20). Indeed, let $t_n$ be the sequence such that (45) holds. From $L^2$ convergence (14),
\[ \int_{|x - x(T)| \le R} |u^*(x)|^2\, dx = \lim_{t_n \to T} \int_{|x - x(t_n)| \le R} |\tilde u(t_n, x)|^2\, dx. \]
Now for the sequence $t_n$:
\[ \int_{|x - x(t_n)| \le R} |\tilde u(t_n, x)|^2\, dx = \int_{|y| \le \frac{R}{\lambda(t_n)}} |\varepsilon(t_n, y)|^2\, dy \le \frac{C R^2}{\lambda^2(t_n)} \left( \int |\nabla\varepsilon(t_n)|^2 + \int |\varepsilon(t_n)|^2 e^{-|y|} \right) \le C E_0 R^2, \]
where we used the Sobolev type estimate (72). This concludes the proof of (20) and of Proposition 7 in the S(t) regime.

log-log regime. Step 1. Control of the $L^2$ flux. For all $t_0$ close enough to T, let $A(t_0)$ be given by (32); we define:
\[ R(t_0) = \lambda(t_0) A(t_0). \tag{76} \]
For a given radial cut off function ψ(r) supported in $\{1 \le r \le 2\}$ with ψ(r) = 1 in a neighborhood of $r = \frac32$, let for $t \in [t_0, T)$:
\[ m(\psi, t) = \int \psi\left(\frac{x - x(t)}{R(t_0)}\right) |\tilde u(t)|^2. \]
Then:
\[ m(\psi, t_0) = \int \psi\left(\frac{x - x(t_0)}{R(t_0)}\right) |\tilde u(t_0, x)|^2\, dx = \int \psi\left(\frac{y}{A(t_0)}\right) |\varepsilon(t_0, y)|^2\, dy, \]
and thus estimate (36) with (27) is written:
\[ \frac{1}{C(\psi) \log|\log(T - t)|} \le \int_t^T \frac{m(\psi, t_0)}{\lambda^2(t_0)}\, dt_0 \le \frac{C(\psi)}{\log|\log(T - t)|}. \tag{77} \]
We now claim:

Lemma 4 (Control on the flux of the $L^2$ norm). There holds for η > 0 small enough and $t_0$ close enough to T: $\forall t \in [t_0, T)$,
\[ \left| \int \psi\left(\frac{x - x(t)}{R(t_0)}\right) |\tilde u(t)|^2 - \int \psi\left(\frac{x - x(t_0)}{R(t_0)}\right) |\tilde u(t_0)|^2 \right| \le C(\psi)\, \Gamma_{b(t_0)}^{\frac{1+a}{2}} \sup_{t \in [t_0, T)} \left( \int \left| \nabla\psi\left(\frac{x - x(t)}{R(t_0)}\right) \right|^2 |\tilde u(t)|^2 \right)^{\frac12}. \tag{78} \]
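Remark 9. From now on, the parameter η is fixed small enough so that the above estimates hold, and so is $a = \sqrt{\eta}$.

Proof of Lemma 4. We first claim from the support property of ψ: $\forall t \in [t_0, T)$,
\[ Q^S(t, x) = 0, \ \text{ i.e. } \ u(t, x) = \tilde u(t, x), \quad\text{for } \frac{x - x(t)}{R(t_0)} \in \mathrm{Supp}(\psi). \tag{79} \]
Indeed, first we remark that
\[ \left( \frac{b(t)}{\lambda(t)} \right)_t = \frac{1}{\lambda^3}\left( b_s - \frac{\lambda_s}{\lambda}\, b \right) \ge \frac{C b^2}{\lambda^3} > 0, \]
where we used (24) and (28) in the last step. Thus $\frac{A(t_0)\lambda(t_0)}{\lambda(t)} \ge \frac{A(t_0) b(t_0)}{b(t)} \ge \frac{100}{b(t)}$ for $t_0$ close to T from (32). Consequently:
\[ \forall t \in [t_0, T), \quad \frac{R(t_0)}{\lambda(t)} = \frac{A(t_0)\lambda(t_0)}{\lambda(t)} \ge \frac{100}{b(t)}. \]
Let $\frac{x - x(t)}{R(t_0)} \in \mathrm{Supp}(\psi) \subset [1, 2]$; then
\[ \left| \frac{x - x(t)}{R(t_0)} \right| \ge 1 \ \text{ implies } \ \left| \frac{x - x(t)}{\lambda(t)} \right| = \left| \frac{x - x(t)}{R(t_0)} \right| \frac{R(t_0)}{\lambda(t)} \ge \frac{100}{b(t)}. \]
Now from Proposition 2, $\tilde Q_b$ has support in the ball $|y| \le \frac{4}{b}$, so that (58) yields (79).

We now compute the flux of the $L^2$ norm using (79): $\forall t \in [t_0, T)$,
\[ \frac12 \left( \int \psi\left(\frac{x - x(t)}{R(t_0)}\right) |\tilde u(t)|^2 \right)_t = -\frac{x_t}{2R(t_0)} \cdot \int \nabla\psi\left(\frac{x - x(t)}{R(t_0)}\right) |\tilde u|^2 + \frac{1}{R(t_0)}\, \mathrm{Im}\left( \int \nabla\psi\left(\frac{x - x(t)}{R(t_0)}\right) \cdot \nabla \tilde u\, \overline{\tilde u} \right). \]
From (24), (28) and (47),
\[ |x_t| + \left( \int |\nabla \tilde u(t)|^2 \right)^{\frac12} = |x_t| + \frac{1}{\lambda(t)} \left( \int |\nabla\varepsilon(t)|^2 \right)^{\frac12} \le \frac{\Gamma_b^{\frac12 - C\eta}}{\lambda(t)}. \]
Thus: $\forall t \in [t_0, T)$,
\[ \left| \left( \int \psi\left(\frac{x - x(t)}{R(t_0)}\right) |\tilde u(t)|^2 \right)_t \right| \le C(\psi)\, \frac{\Gamma_{b(t)}^{\frac12 - C\eta}}{R(t_0)\lambda(t)} \left( \int \left| \nabla\psi\left(\frac{x - x(t)}{R(t_0)}\right) \right|^2 |\tilde u(t)|^2 \right)^{\frac12} \le C(\psi)\, \frac{1}{A(t_0)}\, \frac{\Gamma_{b(t)}^{\frac12 - C\eta}}{\lambda(t_0)\lambda(t)} \sup_{t \in [t_0, T)} \left( \int \left| \nabla\psi\left(\frac{x - x(t)}{R(t_0)}\right) \right|^2 |\tilde u(t)|^2 \right)^{\frac12}. \]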
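Integrating this in time, we first estimate from (25) and (29):
\[ \frac{1}{\lambda(t_0)} \int_{t_0}^t \frac{\Gamma_{b(\tau)}^{\frac12 - C\eta}}{\lambda(\tau)}\, d\tau \le C \sqrt{\frac{\log|\log(T - t_0)|}{T - t_0}} \times \int_{t_0}^T \frac{1}{|\log(T - \tau)|^{\frac12 - 2C\eta}} \sqrt{\frac{\log|\log(T - \tau)|}{T - \tau}}\, d\tau \le \frac{1}{|\log(T - t_0)|^{\frac12 - 3C\eta}} \le \Gamma_{b(t_0)}^{\frac12 - 4C\eta}, \]
and thus:
\[ \left| \int \psi\left(\frac{x - x(t)}{R(t_0)}\right) |\tilde u(t)|^2 - \int \psi\left(\frac{x - x(t_0)}{R(t_0)}\right) |\tilde u(t_0)|^2 \right| \le C(\psi)\, \frac{\Gamma_{b(t_0)}^{\frac12 - C\eta}}{A(t_0)} \sup_{t \in [t_0, T)} \left( \int \left| \nabla\psi\left(\frac{x - x(t)}{R(t_0)}\right) \right|^2 |\tilde u(t)|^2 \right)^{\frac12}. \]
Now from (32),
\[ \frac{\Gamma_{b(t_0)}^{\frac12 - C\eta}}{A(t_0)} \le \Gamma_{b(t_0)}^{\frac12 - C\eta}\, \Gamma_{b(t_0)}^{a(1 - C\eta)} \le \Gamma_{b(t_0)}^{\frac{1+a}{2}}, \]
where we used (31) for η > 0 small enough in the last step. This concludes the proof of (78) and of Lemma 4.

Step 2. Iteration of the flux control. After integration in time, the right hand side of (78) is much larger than the size estimate (77). Our aim now is to iterate this flux control to get the contrary in the time averaging sense and for some suitable ψ:
\[ \int \psi\left(\frac{x - x(t_0)}{R(t_0)}\right) |\tilde u(t_0)|^2 \gg \Gamma_{b(t_0)}^{\frac{1+a}{2}} \sup_{t \in [t_0, T)} \left( \int \left| \nabla\psi\left(\frac{x - x(t)}{R(t_0)}\right) \right|^2 |\tilde u(t)|^2 \right)^{\frac12}. \]
Fix an integer L = L(a) such that:
\[ 1 + \frac{a}{4} \le (1 + a)\left( 1 - \frac{1}{2^L} \right). \]
For $0 \le k \le L$, we consider a family of radial cut-off functions $\psi_k$ with the following properties:
\[ \psi_k(x) = \begin{cases} 1 & \text{for } 1 + \frac12 - \frac{1}{2^{k+2}} \le |x| \le 1 + \frac12 + \frac{1}{2^{k+2}}, \\ 0 & \text{for } |x| \le 1 + \frac12 - \frac{1}{2^{k+1}} \text{ and } |x| \ge 1 + \frac12 + \frac{1}{2^{k+1}}, \end{cases} \]
\[ \forall k \ge 0,\ \forall x \in \mathbb{R}^N, \quad |\nabla\psi_{k+1}(x)| \le 3\psi_k(x). \]
We claim: $\forall t_0$ close enough to T,
\[ \left| \int \psi_L\left(\frac{x - x(T)}{R(t_0)}\right) |u^*|^2 - m(\psi_L, t_0) \right| \le C\, \Gamma_{b(t_0)}^{\frac{a}{4}} \sup_{0 \le k \le L-1} m(\psi_k, t_0) + C\, \Gamma_{b(t_0)}^{1 + \frac{a}{8}}. \tag{80} \]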
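Proof of (80). For a given cut off ψ supported in $\{1 \le |x| \le 2\}$, we note $M(\psi, t_0) = \sup_{t \in [t_0, T)} m(\psi, t)$. Applying flux control (78), we have: $\forall\, 0 \le k \le L-1$, $\forall t \in [t_0, T)$,
\[ |m(\psi_{k+1}, t) - m(\psi_{k+1}, t_0)| \le C\, \Gamma_{b(t_0)}^{\frac{1+a}{2}}\, M(|\nabla\psi_{k+1}|^2, t_0)^{\frac12} \le C\, \Gamma_{b(t_0)}^{\frac{1+a}{2}}\, M(\psi_k, t_0)^{\frac12}, \tag{81} \]
and in particular:
\[ M(\psi_{k+1}, t_0) \le m(\psi_{k+1}, t_0) + C\, \Gamma_{b(t_0)}^{\frac{1+a}{2}}\, M(\psi_k, t_0)^{\frac12}. \tag{82} \]
We claim: there holds for some universal constant C > 0,
\[ \forall\, 0 \le k \le L, \quad M(\psi_k, t_0) \le C\left( \sup_{0 \le p \le k} m(\psi_p, t_0) + \Gamma_{b(t_0)}^{p_k} \right), \tag{83} \]
with $p_k = (1 + a)(1 - 2^{-(k+1)})$. Indeed, we argue by induction on k. For k = 0, (78) yields (83) with $p_0 = \frac{1+a}{2}$. We now assume (83) for k and prove it for k + 1. Indeed, from (82):
\[ M(\psi_{k+1}, t_0) \le m(\psi_{k+1}, t_0) + C\, \Gamma_{b(t_0)}^{\frac{1+a}{2}}\, M(\psi_k, t_0)^{\frac12} \]
\[ \le m(\psi_{k+1}, t_0) + C\, \Gamma_{b(t_0)}^{\frac{1+a}{2}} \left( \sup_{0 \le p \le k} m(\psi_p, t_0) + \Gamma_{b(t_0)}^{p_k} \right)^{\frac12} \]
\[ \le m(\psi_{k+1}, t_0) + C\, \Gamma_{b(t_0)}^{\frac{1+a}{2}} \left( \Big( \sup_{0 \le p \le k} m(\psi_p, t_0) \Big)^{\frac12} + \Gamma_{b(t_0)}^{\frac{p_k}{2}} \right) \]
\[ \le C\left( \sup_{0 \le p \le k+1} m(\psi_p, t_0) + \Gamma_{b(t_0)}^{1+a} + \Gamma_{b(t_0)}^{\frac{1 + a + p_k}{2}} \right) \]
\[ \le C\left( \sup_{0 \le p \le k+1} m(\psi_p, t_0) + \Gamma_{b(t_0)}^{p_{k+1}} \right), \]
from $p_{k+1} = \frac{1 + a + p_k}{2}$, and (83) follows.

For k = L − 1, (83) is written:
\[ M(\psi_{L-1}, t_0) \le C \sup_{0 \le k \le L-1} m(\psi_k, t_0) + C\, \Gamma_{b(t_0)}^{1 + \frac{a}{4}}, \]
which injected into (81) for k = L − 1 yields: $\forall t \in [t_0, T)$,
\[ |m(\psi_L, t) - m(\psi_L, t_0)| \le C\, \Gamma_{b(t_0)}^{\frac{1+a}{2}}\, M(\psi_{L-1}, t_0)^{\frac12} \le C\, \Gamma_{b(t_0)}^{\frac{1+a}{2}} \left( \Big( \sup_{0 \le k \le L-1} m(\psi_k, t_0) \Big)^{\frac12} + \Gamma_{b(t_0)}^{\frac12 + \frac{a}{8}} \right) \le C\, \Gamma_{b(t_0)}^{\frac{a}{4}} \sup_{0 \le k \le L-1} m(\psi_k, t_0) + C\, \Gamma_{b(t_0)}^{1 + \frac{a}{8}}. \]
Letting $t \to T$ according to the strong $L^2$ convergence (48) yields (80).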
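The exponents $p_k$ in (83) can be checked directly. The computation below is a small verification, assuming the recursion $p_{k+1} = \frac{1+a+p_k}{2}$ stated in the proof and reading the condition on L as $1 + \frac a4 \le (1+a)(1 - 2^{-L})$.

```latex
\[
p_0 = (1+a)\left(1-\tfrac12\right) = \frac{1+a}{2},
\qquad
\frac{1+a+p_k}{2}
= \frac{(1+a)\left(2-2^{-(k+1)}\right)}{2}
= (1+a)\left(1-2^{-(k+2)}\right) = p_{k+1},
\]
\[
p_{L-1} = (1+a)\left(1-2^{-L}\right) \ge 1+\frac a4
\quad\text{by the choice of } L .
\]
```

Since the quantity raised to the power $p_{L-1}$ is smaller than 1, the bound $p_{L-1} \ge 1 + \frac a4$ is what allows the exponent $1 + \frac a4$ at the step k = L − 1.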
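Step 3. Proof of estimate (18). We claim for t close to T:
\[ \frac{1}{C \log|\log(T - t)|} \le \int_t^T \frac{dt_0}{\lambda^2(t_0)} \int \psi_L\left(\frac{x - x(T)}{R(t_0)}\right) |u^*|^2 \le \frac{C}{\log|\log(T - t)|}. \tag{84} \]
Proof of (84). From (77),
\[ \frac{1}{C \log|\log(T - t)|} \le \int_t^T \frac{m(\psi_L, t_0)}{\lambda^2(t_0)}\, dt_0 \le \frac{C}{\log|\log(T - t)|}, \]
so that using (80), (84) is equivalent to proving:
\[ \log|\log(T - t)| \times \int_t^T \frac{dt_0}{\lambda^2(t_0)} \left( \Gamma_{b(t_0)}^{\frac{a}{4}} \sup_{0 \le k \le L-1} m(\psi_k, t_0) + \Gamma_{b(t_0)}^{1 + \frac{a}{8}} \right) \to 0 \ \text{ as } t \to T. \tag{85} \]
This is a consequence of (35) and (77). Indeed, first rewrite (35) with (27):
\[ \int_t^T \Gamma_{b(t_0)}\, \frac{dt_0}{\lambda^2(t_0)} = \int_s^{+\infty} \Gamma_{b(\sigma)}\, d\sigma \le \frac{C}{\log(s)} \le \frac{C}{\log|\log(T - t)|}. \]
Using this and (77), we conclude:
\[ \log|\log(T - t)| \int_t^T \frac{dt_0}{\lambda^2(t_0)} \left( \Gamma_{b(t_0)}^{\frac{a}{4}} \sup_{0 \le k \le L-1} m(\psi_k, t_0) + \Gamma_{b(t_0)}^{1 + \frac{a}{8}} \right) \le \log|\log(T - t)|\, \Gamma_{b(t)}^{\frac{a}{8}} \left( \int_t^T \frac{dt_0}{\lambda^2(t_0)} \sup_{0 \le k \le L-1} m(\psi_k, t_0) + \int_t^T \frac{dt_0}{\lambda^2(t_0)}\, \Gamma_{b(t_0)} \right) \le C\, \Gamma_{b(t)}^{\frac{a}{8}} \to 0 \ \text{ as } t \to T. \]
This is (85) and thus (84) follows.

We now change variables in (84). Let
\[ R_0 = R(t_0) = \lambda(t_0) A(t_0) = \lambda(t_0)\, e^{\frac{a\pi}{b(t_0)}}. \]
We claim
\[ \frac{\log|\log(R_0)|}{C R_0} \le \frac{1}{\lambda^2(t_0)} \left| \frac{dt_0}{dR_0} \right| \le \frac{C \log|\log(R_0)|}{R_0}. \tag{86} \]
Indeed, from (24) and pointwise control (28), we estimate:
\[ -\frac{dR_0}{dt_0} = A(t_0)\left( -\lambda_{t_0} + a\pi\lambda\, \frac{b_{t_0}}{b^2} \right) = \frac{A(t_0)}{\lambda(t_0)} \left( b - \Big( \frac{\lambda_s}{\lambda} + b \Big) + a\pi\, \frac{b_s}{b^2} \right), \]
\[ 0 < \frac{b(t_0) A(t_0)}{C\lambda(t_0)} \le -\frac{dR_0}{dt_0} \le C\, \frac{b(t_0) A(t_0)}{\lambda(t_0)}, \]
\[ \frac{1}{C b(t_0) R_0} = \frac{1}{C b(t_0) A(t_0) \lambda(t_0)} \le \frac{1}{\lambda^2(t_0)} \left| \frac{dt_0}{dR_0} \right| \le \frac{C}{b(t_0) A(t_0) \lambda(t_0)} = \frac{C}{b(t_0) R_0}. \tag{87} \]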
Moreover, we estimate from (76), (26) and (25):
\[ \frac{1}{C b(t_0)} \le \frac{1}{C} \log|\log(T - t_0)| \le \log|\log(R_0)| \le C \log|\log(T - t_0)| \le \frac{C}{b(t_0)}, \]
which with (87) yields (86). With (86), (84) is written for R > 0 small enough:
\[ \frac{1}{C \log|\log(R)|} \le \int_0^R \frac{\log|\log(R_0)|}{R_0} \int \psi_L\left(\frac{x - x(T)}{R_0}\right) |u^*(x)|^2\, dx\, dR_0 \le \frac{C}{\log|\log(R)|}. \]
We now have from the support property of $\psi_L$: $\forall y \in \mathbb{R}^N$, $1_{a_1 \le |y| \le b_1} \le \psi_L(y) \le 1_{a_2 \le |y| \le b_2}$ for some $a_1, b_1, a_2, b_2 > 0$. From Fubini:
\[ \int_0^R \frac{\log|\log(R_0)|}{R_0} \int \psi_L\left(\frac{x - x(T)}{R_0}\right) |u^*(x)|^2\, dx\, dR_0 \le \iint_{a_2 R_0 \le |x - x(T)| \le b_2 R_0,\ 0 \le R_0 \le R} \frac{\log|\log(R_0)|}{R_0}\, |u^*(x)|^2\, dx\, dR_0 \]
\[ \le \int_{|x - x(T)| \le b_2 R} |u^*(x)|^2 \left( \int_{\frac{|x - x(T)|}{b_2}}^{\frac{|x - x(T)|}{a_2}} \frac{\log|\log(R_0)|}{R_0}\, dR_0 \right) dx \le C \int_{|x - x(T)| \le b_2 R} \log|\log(|x - x(T)|)|\, |u^*(x)|^2\, dx, \]
and we conclude:
\[ \int_{|x - x(T)| \le b_2 R} \log|\log(|x - x(T)|)|\, |u^*(x)|^2\, dx \ge \frac{1}{C \log|\log(R)|}. \]
Arguing similarly for the upper bound, we thus have proved: for all R > 0 small,
\[ \frac{1}{C \log|\log(R)|} \le \int_{|x - x(T)| \le R} \log|\log(|x - x(T)|)|\, |u^*(x)|^2\, dx \le \frac{C}{\log|\log(R)|}. \tag{88} \]
The upper bound in (18) follows. For the lower bound, letting C > 0 be the constant involved in (88) and $f(R) = e^{-|\log(R)|^{10C^2}}$, so that:
\[ \frac{C}{\log|\log(f(R))|} = \frac{1}{10 C \log|\log(R)|}, \tag{89} \]
we have from control (88):
\[ \frac{1}{C \log|\log(R)|} \le \int_{|x - x(T)| \le R} \log|\log(|x - x(T)|)|\, |u^*(x)|^2\, dx = \int_{f(R) \le |x - x(T)| \le R} \log|\log(|x - x(T)|)|\, |u^*(x)|^2\, dx + \int_{|x - x(T)| \le f(R)} \log|\log(|x - x(T)|)|\, |u^*(x)|^2\, dx \]
\[ \le \log|\log(f(R))| \int_{|x - x(T)| \le R} |u^*(x)|^2\, dx + \frac{C}{\log|\log(f(R))|} \le 10 C^2 \log|\log(R)| \int_{|x - x(T)| \le R} |u^*(x)|^2 + \frac{1}{10 C \log|\log(R)|}, \]
where we used (89) in the last step, and thus:
\[ \int_{|x - x(T)| \le R} |u^*(x)|^2 \ge \frac{1}{20 C^3 (\log|\log(R)|)^2}. \]
This concludes the proof of estimate (18).

Step 4. $u^* \notin H^1$. To conclude the proof of Proposition 7, we remark that $\forall p > 2$, $u^* \notin L^p$. Indeed, assume by contradiction $u^* \in L^p$ for some p > 2, then by Hölder:
\[ \int_{|x - x(T)| \le R} |u^*(x)|^2\, dx \le \left( \int |u^*|^p \right)^{\frac{2}{p}} \left( \int_{0 \le r \le R} r^{N-1}\, dr \right)^{\frac{p-2}{p}} \le C(u^*)\, R^{\frac{(p-2)N}{p}}, \]
which contradicts the lower bound in (18) in the limit $R \to 0$. This concludes the proof of the estimates of Proposition 7.

References

1. Berestycki, H., Lions, P.-L.: Nonlinear scalar field equations. I. Existence of a ground state. Arch. Rat. Mech. Anal. 82(4), 313–345 (1983)
2. Bourgain, J., Wang, W.: Construction of blowup solutions for the nonlinear Schrödinger equation with critical nonlinearity. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 25(1–2), 197–215 (1998)
3. Cazenave, Th., Weissler, F.: Some remarks on the nonlinear Schrödinger equation in the critical case. In: Nonlinear semigroups, partial differential equations and attractors (Washington, DC, 1987), Lecture Notes in Math. 1394, Berlin: Springer, 1989, pp. 18–29
4. Ginibre, J., Velo, G.: On a class of nonlinear Schrödinger equations. I. The Cauchy problem, general case. J. Funct. Anal. 32(1), 1–32 (1979)
5. Glangetas, L., Merle, F.: Existence of self-similar blow-up solutions for Zakharov equation in dimension two. I. Commun. Math. Phys. 160(1), 173–215 (1994)
6. Kwong, M.K.: Uniqueness of positive solutions of $\Delta u - u + u^p = 0$ in $\mathbb{R}^n$. Arch. Rat. Mech. Anal. 105(3), 243–266 (1989)
7. Landman, M.J., Papanicolaou, G.C., Sulem, C., Sulem, P.-L.: Rate of blowup for solutions of the nonlinear Schrödinger equation at critical dimension. Phys. Rev. A (3) 38(8), 3837–3843 (1988)
8. Martel, Y., Merle, F.: Stability of blow-up profile and lower bounds for blow-up rate for the critical generalized KdV equation. Ann. of Math. (2) 155(1), 235–280 (2002)
9. Merle, F.: Construction of solutions with exact k blow up points for the Schrödinger equation with critical power. Commun. Math. Phys. 129, 223–240 (1990)
10. Merle, F.: Determination of blow-up solutions with minimal mass for nonlinear Schrödinger equations with critical power. Duke Math. J. 69(2), 427–454 (1993)
11. Merle, F., Raphael, P.: Blow up dynamic and upper bound on the blow up rate for critical nonlinear Schrödinger equation. To appear in Annals of Math.
12. Merle, F., Raphael, P.: Sharp upper bound on the blow up rate for critical nonlinear Schrödinger equation. Geom. Funct. Anal. 13, 591–642 (2003)
13. Merle, F., Raphael, P.: On Universality of Blow up Profile for L2 critical nonlinear Schrödinger equation. Invent. Math. 156, 565–672 (2004)
14. Merle, F., Raphael, P.: Sharp lower bound on the blow up rate for critical nonlinear Schrödinger equation. Preprint
15. Merle, F., Tsutsumi, Y.: L2 concentration of blow up solutions for the nonlinear Schrödinger equation with critical power nonlinearity. J. Diff. Eq. 84, 205–214 (1990)
16. Nawa, H.: Asymptotic and limiting profiles of blowup solutions of the nonlinear Schrödinger equation with critical power. Commun. Pure Appl. Math. 52(2), 193–270 (1999)
17. Perelman, G.: On the blow up phenomenon for the critical nonlinear Schrödinger equation in 1D. Ann. Henri Poincaré 2, 605–673 (2001)
18. Raphael, P.: Stability of the log-log bound for blow up solutions to the critical nonlinear Schrödinger equation. To appear in Math. Annalen
19. Sulem, C., Sulem, P.L.: The nonlinear Schrödinger equation. Self-focusing and wave collapse. Applied Mathematical Sciences 139, New York: Springer-Verlag, 1999
20. Weinstein, M.I.: Nonlinear Schrödinger equations and sharp interpolation estimates. Commun. Math. Phys. 87, 567–576 (1983)

Communicated by P. Constantin
Commun. Math. Phys. 253, 705–721 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1159-7
Communications in
Mathematical Physics
T-Duality for Torus Bundles with H-Fluxes via Noncommutative Topology
Varghese Mathai1, Jonathan Rosenberg2
1 Department of Pure Mathematics, University of Adelaide, Adelaide, SA 5005, Australia. E-mail: [email protected]
2 Department of Mathematics, University of Maryland, College Park, MD 20742, USA. E-mail: [email protected]
Received: 23 January 2004 / Accepted: 3 February 2004
Published online: 27 August 2004 – © Springer-Verlag 2004
Abstract: It is known that the T-dual of a circle bundle with H-flux (given by a Neveu-Schwarz 3-form) is the T-dual circle bundle with dual H-flux. However, it is also known that torus bundles with H-flux do not necessarily have a T-dual which is a torus bundle. A big puzzle has been to explain these mysterious “missing T-duals.” Here we show that this problem is resolved using noncommutative topology. It turns out that every principal $T^2$-bundle with H-flux does indeed have a T-dual, but in the missing cases (which we characterize), the T-dual is non-classical and is a bundle of noncommutative tori. The duality comes with an isomorphism of twisted K-theories, just as in the classical case. The isomorphism of twisted cohomology which one gets in the classical case is replaced by an isomorphism of twisted cyclic homology.

1. Introduction

An important symmetry of string theories is T-duality, which exchanges wrapping of fields over a torus with wrapping over the dual torus [8, 9, 1, 2]. (The exact mathematical meaning of “dual torus” is as follows: the dual torus of the quotient of $\mathbb{R}^n$ by a lattice is the quotient of the dual vector space $(\mathbb{R}^n)^*$ by the dual lattice.) Many authors have tried to understand this duality from various points of view. Since Ramond-Ramond (RR) charges are expected to be represented by classes in K-theory (see, e.g., [39, 40, 27, 24]), T-duality should come with an isomorphism of K-theories (usually with a degree shift) between a theory and its dual. The type of K-theory appropriate for the situation (e.g., K, KO, or KSp) depends on the type of string theory being considered; here we deal with the type II situation, which leads to complex K-theory. (For a few comments on type I theories, see Sect. 6.)
VM was supported by the Australian Research Council. JR was partially supported by NSF Grant DMS-0103647, and thanks the Department of Pure Mathematics of the University of Adelaide for its hospitality in January 2004, which made this collaboration possible.
As pointed out in many contexts (e.g., [37, 17]), T-duality can apply not only to theories over spaces of the form X × T n , but also to non-trivial torus bundles, and even to spaces which are only “approximately” of this form, for example, spaces admitting a torus action which is generically free. (However, in this paper we only consider the case of free torus actions.) In addition, it should apply as well to situations with a non-trivial Neveu-Schwarz (NS) 3-form H . In these situations, the H-flux gives rise to a twisting of K-theory, so that one expects an isomorphism of twisted K-theories. In its general form, T-duality often involves a change of topology (see, e.g., [5, 6 and 7]). Our initial interest was in trying to explain the T-duality of torus bundles, in the presence of twisting by an H-flux, from the perspective of noncommutative topology. An unexpected byproduct, which we will discuss in Sect. 5, is that we have found that several known cases of torus bundles with “missing” T-duals are in fact naturally T-dual to noncommutative torus bundles, in a sense we will make precise below. This suggests an unexpected link between classical string theories and the “noncommutative” ones, obtained by “compactifying” along noncommutative tori, as in [13] (cf. also [36, §§6–7]). Just as a complete characterization of T-duality on circle bundles with H-flux is given in [5 and 6], in this paper, we give a complete characterization of T-duality on principal T2 -bundles with H-flux, Theorem 4.13. We also describe partial results for T-duality on general principal torus bundles with H-flux. The main mathematical result is a detailed analysis of the equivariant Brauer group for principal T2 -bundles, Theorem 4.10, which refines earlier results in [14 and 28]. This depends on some explicit calculations of Moore’s “Borel cochain” cohomology groups. 2. Preliminaries on Noncommutative Tori Here the definition of a (2-dimensional) noncommutative torus is recalled, cf. [34]. 
This algebra (stabilized by tensoring with the compact operators $\mathcal K$) occurs geometrically as the foliation algebra associated to Kronecker foliations on the torus [10]. It also occurs naturally in the matrix formulation of M-theory as the components of Yang-Mills connections in the classification of BPS states [13]. For each $\theta \in [0,1]$, the noncommutative torus $A_\theta$ is defined abstractly as the $C^*$-algebra generated by two unitaries $U$ and $V$ in an infinite-dimensional Hilbert space satisfying the relation $UV = \exp(2\pi i\theta)\,VU$. Elements in $A_\theta$ can be represented by infinite power series
\[ f = \sum_{(m,n)\in\mathbb Z^2} a_{(m,n)}\, U^m V^n, \tag{1} \]
where the coefficients $a_{(m,n)} \in \mathbb C$ satisfy a decay condition (very hard to make precise) as $(m,n)\to\infty$ in $\mathbb Z^2$. There is a natural smooth subalgebra $A_\theta^\infty$, called the smooth noncommutative torus, which is defined as those elements in $A_\theta$ that can be represented by infinite power series (1) with $(a_{(m,n)}) \in \mathcal S(\mathbb Z^2)$, the Schwartz space of rapidly decreasing sequences on $\mathbb Z^2$. $A_\theta$ can also be realized as the crossed product $C(\mathbb T)\rtimes_\theta \mathbb Z$, where the generator of $\mathbb Z$ acts on $\mathbb T$ by rotation by the angle $2\pi\theta$. When $\theta$ is rational, $A_\theta$ is type I, and is even Morita equivalent to $C(\mathbb T^2)$. However, when $\theta$ is irrational, $A_\theta$ is a simple non-type I $C^*$-algebra. Because of the realization of $A_\theta$ as a crossed product by rotation by $2\pi\theta$, the algebra in this case is often called an irrational rotation algebra.
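The defining relation $UV = \exp(2\pi i\theta)VU$ can be checked by hand in the rational case. The following is a minimal sketch, assuming $\theta = p/q$: finite "clock" and "shift" matrices then satisfy the same commutation relation. The function names `clock_and_shift` and `relation_defect` are hypothetical, introduced only for this illustration; the matrices give one finite-dimensional representation of the relation and are not a model of the algebra $A_\theta$ itself.

```python
import cmath

def clock_and_shift(p, q):
    """Build q x q matrices U (clock) and V (shift) for theta = p/q."""
    omega = cmath.exp(2j * cmath.pi * p / q)
    # U e_j = omega^j e_j  (diagonal "clock" matrix)
    U = [[omega ** j if i == j else 0 for j in range(q)] for i in range(q)]
    # V e_j = e_{(j+1) mod q}  (cyclic "shift" matrix)
    V = [[1 if i == (j + 1) % q else 0 for j in range(q)] for i in range(q)]
    return U, V, omega

def matmul(A, B):
    """Plain matrix product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def relation_defect(p, q):
    """Largest entry of U V - exp(2 pi i p/q) V U; zero up to rounding."""
    U, V, omega = clock_and_shift(p, q)
    UV = matmul(U, V)
    VU = matmul(V, U)
    return max(abs(UV[i][j] - omega * VU[i][j])
               for i in range(q) for j in range(q))
```

Calling `relation_defect(p, q)` for small coprime `p`, `q` should return a value at the level of floating-point rounding. No such finite matrices exist for irrational $\theta$, which is consistent with $A_\theta$ being simple and non-type I in that case.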
T-Duality via Noncommutative Topology
Consider the 2 dimensional torus T2 = R2 /Z2 . For each θ ∈ [0, 1], the noncommutative torus Aθ is Morita equivalent to the foliation algebra associated to the foliation on T2 defined by the differential equation dx = θ dy on T2 . 3. Mathematical Framework We begin by explaining the precise mathematical framework in which we are working. We assume X (which will be the spacetime of a string theory) is a (second-countable) locally compact Hausdorff space. In practice it will usually be a compact manifold, though we do not need to assume this. However it is convenient to assume that X is finite-dimensional and has the homotopy type of a finite CW-complex. (This assumption can be weakened but some finiteness assumption is necessary to avoid some pathologies. This is not a problem as far as the physics is concerned.) We assume X comes with a free action of a torus T ; thus (by the Gleason slice theorem [21]) the quotient map p : X → Z is a principal T -bundle. A continuous-trace algebra A over X is a particular type of type I C ∗ -algebra with spectrum X and good local structure (the “Fell condition” [20]).1 We will always assume A is separable; then a basic structure theorem of Dixmier and Douady [16] says that after stabilization (i.e., tensoring by K, the algebra of compact operators on an infinitedimensional separable Hilbert space H), A becomes locally isomorphic to C0 (X, K), the continuous K-valued functions on X vanishing at infinity. However, A need not be globally isomorphic to C0 (X, K), even after stabilization. The reason is that a stable continuous-trace algebra is the algebra of sections (vanishing at infinity) of a bundle of algebras over X, with fibers all isomorphic to K. The structure group of the bundle is Aut K ∼ = P U (H), the projective unitary group U (H)/T. 
Since U (H) is contractible and the circle group T acts freely on it, P U (H) is an Eilenberg-MacLane K(Z, 2)-space, and thus bundles of this type are classified by homotopy classes of continuous maps from X to BP U (H), which is a K(Z, 3)-space, or in other words by H 3 (X, Z). Alternatively, the bundles are classified by H 1 (X, P U (H)), the sheaf cohomology of the sheaf P U (H) of germs of continuous P U -valued functions on X, where the transition functions of the bundle naturally live. But because of the exact sequences in sheaf cohomology 0 = H 1 (X, U (H)) → H 1 (X, P U (H)) → H 2 (X, T) → 0 and 0 = H 2 (X, R) → H 2 (X, T) → H 3 (X, Z) → H 3 (X, R) = 0, the bundles are classified by H 2 (X, T) ∼ = H 3 (X, Z) [35, §1]. Hence stable isomorphism classes of continuous-trace algebras over X are classified by the Dixmier-Douady class in H 3 (X, Z). It turns out that continuous-trace algebras over X, modulo Morita equivalence over X, naturally form a group under the operation of tensor product over C0 (X), called the Brauer group Br(X), and that this group is isomorphic to H 3 (X, Z) via the Dixmier-Douady class. Given an element δ ∈ H 3 (X, Z), we denote by CT (X, δ) the associated stable continuous-trace algebra. (Thus if δ = 0, this is simply C0 (X, K).) The (complex topological) K-theory K• (CT (X, δ)) is called the twisted K-theory [35, §2] of X with twist 1
Except in Sect. 6 below, all C ∗ -algebras and Hilbert spaces in this paper will be over C.
δ, denoted K −• (X, δ). When δ is torsion, twisted K-theory had earlier been considered by Karoubi and Donovan [18]. When δ = 0, twisted K-theory reduces to ordinary K-theory (with compact supports). Now recall we are assuming X is equipped with a free T -action with quotient X/T = Z. (This means our theory is “compactified along tori” in a way reflecting a global symmetry group of X.) In general, a group action on X need not lift to an action on CT (X, δ) for any value of δ other than 0, and even when such a lift exists, it is not necessarily essentially unique. So one wants a way of keeping track of what lifts are possible and how unique they are. The correct generalization of Br(X) to the equivariant setting is the equivariant Brauer group defined in [14], consisting of equivariant Morita equivalence classes of continuous-trace algebras over X equipped with group actions lifting the action on X. By [14, Lemma 3.1], two group actions on the same stable continuous-trace algebra over X define the same element in the equivariant Brauer group if and only if they are outer conjugate. (This implies in particular that the crossed products are isomorphic.) Now let G be the universal cover of the torus T , a vector group. Then G also acts on X via the quotient map G T (whose kernel N can be identified with the free abelian group π1 (T )). In our situation there are three Brauer groups to consider: Br(X) ∼ = H 3 (X, Z), Br T (X), and Br G (X). It turns out, however, that Br T (X) is rather uninteresting, as it is naturally isomorphic to Br(Z) [14, §6.2]. Again by [14, §6.2], the natural “forgetful map” (forgetting the T -action) Br T (X) → Br(X) can simply be identified with p∗ : Br(Z) ∼ = H 3 (Z, Z) → H 3 (X, Z) ∼ = Br(X). Finally, we can summarize what we are interested in. Basic Setup 3.1. A spacetime X compactified over a torus T will correspond to a space X (locally compact, finite-dimensional homotopically finite) equipped with a free T -action. 
The quotient map p : X → Z is a principal T -bundle. The NS 3-form H on X has an integral cohomology class δ which corresponds to an element of Br(X) ∼ = H 3 (X, Z). A pair (X, δ) will be a candidate for having a T -dual when the T -symmetry of X lifts to an action of the vector group G on CT (X, δ), or in other words, when δ lies in the image of the forgetful map F : Br G (X) → Br(X). 4. Structure of the Equivariant Brauer Group and T-Duality Throughout this section, the above Basic Setup 3.1 will be in force. We let n = dim T , the dimension of the tori involved. 4.1. Review of the case n = 1. The case n = 1 was treated in [31, Theorem 4.12], from a purely C ∗ -algebraic perspective, in [5], from a combined mathematical and physical perspective, and in [6] from a more physical point of view. In this case, G = R, T = T = R/Z, and N = Z. By [14, Cor. 6.1], the forgetful map F : Br G (X) → Br(X) is an isomorphism, and thus every δ ∈ H 3 (X, Z) is dualizable, in fact in a unique way. It is proven in [5] that the T-dual of the pair (p : X → Z, δ) is a pair (p # : X # → Z, δ # ), where X# is another principal circle bundle over Z and δ # ∈ H 3 (X # , Z). Furthermore, there is a beautiful symmetry in this situation. Principal T-bundles over Z are classified by their Euler class in H 2 (Z, Z), or equivalently by the first Chern class of the associated complex line bundle. So let [p], [p # ] ∈ H 2 (Z, Z) be the characteristic classes of the two circle bundles. One has p! (δ) = [p# ],
$(p^{\#})_!(\delta^{\#}) = [p],$ (1)
where $p_!$ and $(p^\#)_!$ are the push-forward maps in the Gysin sequences of the two bundles. At the level of forms, $p_!$ and $(p^\#)_!$ are simply “integration over the fiber,” which reduces the degree of a form by one. Furthermore, the crossed product $CT(X,\delta)\rtimes\mathbb R$ is isomorphic to $CT(X^\#,\delta^\#)$, and $CT(X^\#,\delta^\#)\rtimes\mathbb R$ is isomorphic to $CT(X,\delta)$. In fact, the $\mathbb R$-action on $CT(X^\#,\delta^\#)$ may be chosen to be the dual action on the crossed product. If one takes the crossed product $CT(X,\delta)\rtimes\mathbb Z$ by the $\mathbb R$-action restricted to $\mathbb Z = \ker(\mathbb R\to\mathbb T)$, or the similar crossed product $CT(X^\#,\delta^\#)\rtimes\mathbb Z$, the result is $CT\bigl(X\times_Z X^\#,\ p^*(\delta^\#) = (p^\#)^*(\delta)\bigr)$. Thus one obtains a commutative diagram of principal $\mathbb T$-bundles
\[
\begin{array}{ccccc}
 & & X\times_Z X^\# & & \\
 & {\scriptstyle p^*(p^\#)}\swarrow & & \searrow{\scriptstyle (p^\#)^*(p)} & \\
X & & & & X^\# \\
 & {\scriptstyle p}\searrow & & \swarrow{\scriptstyle p^\#} & \\
 & & Z & &
\end{array}
\tag{2}
\]
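As an illustration of Eq. (1), here is a worked example for the Hopf fibration, the standard example from the circle-bundle case. It is a sketch under the following assumptions, which are not fixed by the text above: generators of $H^3(S^3,\mathbb Z)\cong\mathbb Z$ and $H^2(S^2,\mathbb Z)\cong\mathbb Z$ are chosen so that no signs appear, and $k\neq 0$.

```latex
% Hopf fibration p : X = S^3 -> Z = S^2, Euler class [p] = 1, H-flux \delta = k.
% Gysin sequence of p:
%   H^3(S^2) = 0 --> H^3(S^3) --p_!--> H^2(S^2) --\cup [p]--> H^4(S^2) = 0,
% so p_! is an isomorphism \mathbb{Z} \to \mathbb{Z}.
\begin{align*}
  [p^{\#}] &= p_!(\delta) = k
     &&\Longrightarrow\quad X^{\#} = L(k,1) = S^3/\mathbb{Z}_k, \\
  (p^{\#})_!(\delta^{\#}) &= [p] = 1
     &&\Longrightarrow\quad \delta^{\#} = 1 \in H^3(L(k,1),\mathbb{Z}) \cong \mathbb{Z}.
\end{align*}
% The second implication uses the Gysin sequence of p^{\#} in the same way:
%   H^3(S^2) = 0 --> H^3(L(k,1)) --(p^{\#})_!--> H^2(S^2) --> H^4(S^2) = 0.
```

So under these conventions the pair ($S^3$ with $k$ units of flux) is dual to the pair (lens space $L(k,1)$ with one unit of flux), and the Euler class and the flux are interchanged, as Eq. (1) requires.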
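Finally, we get the desired isomorphisms of twisted K-theory and of twisted homology by using the above results on crossed products and applying Connes’ Thom isomorphism theorem [11] and its analogue in cyclic homology, due to Elliott, Natsume, and Nest [19]. The final result, found in [5], is a commutative diagram
\[
\begin{array}{ccc}
K^{\bullet+1}(X,\delta) & \xrightarrow[\ \cong\ ]{\ T_!\ } & K^{\bullet}(X^\#,\delta^\#) \\
\Big\downarrow{\scriptstyle Ch} & & \Big\downarrow{\scriptstyle Ch} \\
H^{\bullet+1}(X,\delta) & \xrightarrow[\ \cong\ ]{\ T_*\ } & H^{\bullet}(X^\#,\delta^\#).
\end{array}
\tag{3}
\]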
Here $Ch$ is the Chern character, which is an isomorphism after tensoring with $\mathbb R$, and cohomology should be $\mathbb Z/2$-graded (i.e., we lump together all the even cohomology and all the odd cohomology). Since this duality interchanges even and odd K-theory, it also exchanges type IIa and type IIb string theories.
4.2. Features of the general case. We return again to the Basic Setup 3.1, but now with $T$ a torus of arbitrary dimension $n$, so $G \cong \mathbb R^n$. When $n > 1$, it is no longer true that the forgetful map $F\colon \mathrm{Br}_G(X)\to \mathrm{Br}(X)$ is an isomorphism. However, some facts about this map are contained in [14] and in [28]. We briefly summarize a few of these results, specialized to the case where $G$ is connected (which forces $G$ to act trivially on the cohomology of $X$). So as to avoid confusion between cohomology of spaces and of topological groups, we have denoted by $H^\bullet_M(G,A)$ the cohomology of the topological group $G$ with coefficients in the topological $G$-module $A$, as defined in [26]. This is sometimes called “Moore cohomology” or “cohomology with Borel cochains.”
Theorem 4.1 ([14, Theorem 5.1]). Suppose $G$ is a connected Lie group and $X$ is a locally compact $G$-space (satisfying our finiteness assumptions). Then there is an exact sequence
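\[ \mathrm{Br}_G(X) \xrightarrow{\ F\ } \ker(d_2) \xrightarrow{\ d_3\ } H^3_M(G, C(X,\mathbb T))/\operatorname{im}(d_2''), \]
where
\[ d_2\colon H^3(X,\mathbb Z)\to H^2_M(G, H^2(X,\mathbb Z)) \]
and
\[ d_2''\colon H^1_M(G,H^2(X,\mathbb Z)) \to H^3_M(G, C(X,\mathbb T)). \]
In addition, there is an exact sequence
\[ H^2(X,\mathbb Z) \xrightarrow{\ d_2\ } H^2_M(G,C(X,\mathbb T)) \xrightarrow{\ \xi\ } \ker F \xrightarrow{\ \eta\ } H^1_M(G,H^2(X,\mathbb Z)). \]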
Fortunately, since in our situation $G$ is a vector group and is thus contractible, $H^\bullet_M(G,A)$ vanishes when $A$ is discrete, thanks to:
Theorem 4.2 ([38, Theorem 4]). If $G$ is a Lie group and $A$ is a discrete $G$-module, then $H^\bullet_M(G,A)$ is canonically isomorphic to $H^\bullet(BG,\underline{A})$ (the sheaf cohomology of the classifying space $BG$ with coefficients in the locally constant sheaf $\underline{A}$ defined by $A$).
Corollary 4.3. If $G$ is a vector group and if $A$ is a discrete abelian group on which $G$ acts trivially, then $H^\bullet_M(G,A) = 0$ for $\bullet > 0$.
Proof. Since the action of $G$ on $A$ is trivial, the sheaf $\underline{A}$ is constant and can be replaced by $A$. Since $BG$ is contractible, $H^\bullet(BG,A) = 0$ for $\bullet > 0$.
Substituting Corollary 4.3 into Theorem 4.1, we obtain (since our finiteness assumption on $X$ implies $H^2(X,\mathbb Z)$ is countable and discrete):
Theorem 4.4. Suppose $G \cong \mathbb R^n$ is a vector group and $X$ is a locally compact $G$-space (satisfying our finiteness assumptions). Then there is an exact sequence:
\[ H^2(X,\mathbb Z) \xrightarrow{\ d_2\ } H^2_M(G,C(X,\mathbb T)) \xrightarrow{\ \xi\ } \mathrm{Br}_G(X) \xrightarrow{\ F\ } H^3(X,\mathbb Z) \xrightarrow{\ d_3\ } H^3_M(G,C(X,\mathbb T)). \]
This still leaves one set of Moore cohomology groups to calculate, namely
\[ H^\bullet_M(G, C(X,\mathbb T)), \qquad \bullet = 2, 3. \]
For purposes of doing this calculation, it is convenient to use the exact sequence of G-modules: 0 → H 0 (X, Z) → C(X, R) → C(X, T) → H 1 (X, Z) → 0.
(4)
This is just the start of the long exact cohomology sequence for the exact sequence of sheaves $0 \to \mathbb Z \to \mathbb R \to \mathbb T \to 0$. Our finiteness assumption on $X$ implies that the cohomology groups of $X$ are countable and discrete. So by Corollary 4.3 again, $H^0(X,\mathbb Z)$ and $H^1(X,\mathbb Z)$ are cohomologically trivial (for $H^\bullet_M(G,-)$), and thus
\[ H^\bullet_M(G,C(X,\mathbb T)) \cong H^\bullet_M(G,C(X,\mathbb R)), \qquad \bullet > 1. \tag{5} \]
Finally, for computing the latter we can use another result from [38]:
Theorem 4.5 ([38, Theorem 3]). If $G$ is a Lie group and $A$ is a $G$-module which is a topological vector space, then $H^\bullet_M(G,A)$ agrees with “continuous cohomology” $H^\bullet_{\mathrm{cont}}(G,A)$, the cohomology of the complex of continuous cochains.
On the other hand, “continuous cohomology” for modules which are topological vector spaces is well studied, so we can apply:
Theorem 4.6 (“Generalized van Est” [23, Cor. III.7.5] or [29]). If $G$ is a connected Lie group and $A$ is a $G$-module which is a complete metrizable topological vector space, then $H^\bullet_{\mathrm{cont}}(G,A)$ agrees with the relative Lie algebra cohomology $H^\bullet_{\mathrm{Lie}}(\mathfrak g, \mathfrak k; A_\infty)$, where $\mathfrak g$ is the Lie algebra of $G$, $\mathfrak k$ is the Lie algebra of a maximal compact subgroup $K$, and $A_\infty$ is the set of smooth vectors in $A$ (for the action of $G$).
Corollary 4.7. If $G$ is a vector group with Lie algebra $\mathfrak g$, and if $A$ is a $G$-module which is a complete metrizable topological vector space, then $H^\bullet_{\mathrm{cont}}(G,A) \cong H^\bullet_{\mathrm{Lie}}(\mathfrak g, A_\infty)$. In particular, it vanishes for $\bullet > \dim G$.
Proof. For a vector group, $K$ is trivial. Lie algebra cohomology is computed from the complex $\operatorname{Hom}(\Lambda^\bullet \mathfrak g, A_\infty)$, which vanishes for $\bullet > \dim G$.
4.3. Calculations for the case n = 2. We now specialize our Basic Setup 3.1 to the case where $n = 2$, i.e., $p\colon X \to Z$ is a principal $\mathbb T^2$-bundle, and now $G = \mathbb R^2$. We apply Theorem 4.4. But since $H^3_M(G,C(X,\mathbb T)) \cong H^3_M(G,C(X,\mathbb R))$ (by Eq. (5)), to which we can apply Theorem 4.5 and Corollary 4.7, we obtain:
Proposition 4.8. If $G = \mathbb R^2$ and $X$ is a $G$-space as above, then $H^3_M(G,C(X,\mathbb T))$ vanishes and the forgetful map $F\colon \mathrm{Br}_G(X) \to H^3(X,\mathbb Z)$ is surjective.
Furthermore, we can also explicitly compute $H^2_M(G,C(X,\mathbb T))$, because of the following:
Lemma 4.9. If $G = \mathbb R^2$ and $X$ is a $G$-space as in the Basic Setup 3.1, then the maps $p^*\colon C(Z,\mathbb R) \to C(X,\mathbb R)$ and “averaging along the fibers of $p$” $\int\colon C(X,\mathbb R) \to C(Z,\mathbb R)$ (defined by $(\int f)(z) = \int_T f(g\cdot x)\,dg$, where $dg$ is Haar measure on the torus $T$ and we choose $x \in p^{-1}(z)$) induce isomorphisms
\[ H^2_M(G,C(X,\mathbb R)) \leftrightarrows H^2_M(G,C(Z,\mathbb R)) \cong C(Z,\mathbb R) \]
which are inverses to one another.
Proof. We apply Theorem 4.6. Note that the $G$-action on $C(Z,\mathbb R)$ is trivial, so every element of $C(Z,\mathbb R)$ is smooth for the action of $G$. But since $\dim G = 2$, we have for any real vector space $V$ with trivial $G$-action the isomorphisms
\[ H^2_M(G,V) \cong H^2_{\mathrm{Lie}}(\mathfrak g,V) \cong H^2_{\mathrm{Lie}}(\mathfrak g,\mathbb R)\otimes V \cong V, \]
since $H^2_{\mathrm{Lie}}(\mathfrak g,\mathbb R) \cong (\Lambda^2\mathfrak g)^* \cong \mathbb R$. Clearly $\int\circ\, p^*$ is the identity on $C(Z,\mathbb R)$, so we need to show $p^*\circ\int$ induces an isomorphism on $H^2_M(G,C(X,\mathbb R))$. The calculation turns out to be local, so by a Mayer-Vietoris argument we can reduce to the case where $p$ is a trivial bundle, i.e., $X = (G/N)\times Z$,
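with $N = \mathbb Z^2$ and $G$ acting only on the first factor. The smooth vectors in $C(X,\mathbb R)$ for the action of $G$ can then be identified with $C(Z, C^\infty(G/N))$. So we obtain
\[ H^2_M\bigl(G,C(X,\mathbb R)\bigr) \cong H^2_{\mathrm{Lie}}\bigl(\mathfrak g, C(Z,C^\infty(G/N))\bigr) \cong C\bigl(Z, H^2_{\mathrm{Lie}}(\mathfrak g, C^\infty(G/N))\bigr), \]
with the cohomology moving inside since $G$ acts trivially on $Z$. However, by Poincaré duality for Lie algebra cohomology,
\[ H^2_{\mathrm{Lie}}\bigl(\mathfrak g, C^\infty(G/N)\bigr) \cong H_0^{\mathrm{Lie}}\bigl(\mathfrak g, C^\infty(G/N)\bigr), \]
which is the quotient of $C^\infty(G/N)$ by all derivatives $X\cdot f$, $X \in \mathfrak g$ and $f \in C^\infty(G/N)$. This quotient is $\mathbb R$ by the de Rham theorem, since $f(g)\,d\mathrm{vol}(g)$ is exact on $T$ exactly when $\int_T f(g)\,dg = 0$. And it’s easy to check that the isomorphism $H^2_M\bigl(G,C(X,\mathbb R)\bigr) \cong C(Z,\mathbb R)$ is induced by $\int$.
Theorem 4.10. In Basic Setup 3.1 with $n = 2$, there is a commutative diagram of exact sequences:
\[
\begin{array}{ccccccc}
 & & H^0(Z,\mathbb Z) & & 0 & & \\
 & & \big\downarrow & & \big\downarrow & & \\
H^2(X,\mathbb Z) & \xrightarrow{\ d_2\ } & H^2_M(G,C(X,\mathbb T)) & \xrightarrow{\ \xi\ } & \ker F & \xrightarrow{\ \eta\ } & 0 \\
 & & \big\downarrow{\scriptstyle a} & & \big\downarrow & & \\
 & & C(Z,H^2_M(\mathbb Z^2,\mathbb T)) & \xleftarrow{\ M\ } & \mathrm{Br}_G(X) & & \\
 & & \big\downarrow{\scriptstyle h} & & \big\downarrow{\scriptstyle F} & & \\
 & & H^1(Z,\mathbb Z) & \xleftarrow{\ p_!\ } & H^3(X,\mathbb Z) & & \\
 & & \big\downarrow & & \big\downarrow & & \\
 & & 0 & & 0 & &
\end{array}
\]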
Here $M\colon \mathrm{Br}_G(X) \to C(Z,H^2_M(\mathbb Z^2,\mathbb T)) \cong C(Z,\mathbb T)$ is the Mackey obstruction map defined in [28], and $h\colon C(Z,\mathbb T) \to H^1(Z,\mathbb Z)$ is the map sending a continuous function $Z \to S^1$ to its homotopy class. The definitions of the dotted arrows ($a$ and $p_!$) will be given in the course of the proof.
Proof. Most of this is immediate from Theorem 4.4 together with Proposition 4.8. There are just a few more things to check. First we define the dotted arrows in the diagram. The arrow $p_!\colon H^3(X,\mathbb Z) \to H^1(Z,\mathbb Z)$ is “integration over the fibers” of the bundle $T^2 \to X \xrightarrow{\,p\,} Z$; more specifically, it is the projection of $H^3(X,\mathbb Z)$ onto $E_\infty^{1,2}$ in the Serre spectral sequence of $p$. Since $E_\infty^{1,2} \subseteq E_2^{1,2} = H^1(Z,H^2(T^2,\mathbb Z))$, we can think of the image as lying in $H^1(Z,\mathbb Z)$. In fact,
\[ E_\infty^{1,2} \subseteq E_3^{1,2} = \ker\bigl(d_2\colon H^1(Z,H^2(T^2,\mathbb Z)) \to H^3(Z,H^1(T^2,\mathbb Z)) \cong H^3(Z,\mathbb Z^2)\bigr), \]
and this map $d_2$ can be identified with the cup product with $[p] \in H^2(Z,\mathbb Z^2)$.
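Next we define the downward dotted arrow $a$ using Lemma 4.9. It is simply the following composite:
\[ H^2_M(G,C(X,\mathbb T)) \xrightarrow[\ \cong\ ]{\ \text{Eq. (5)}\ } H^2_M(G,C(X,\mathbb R)) \xrightarrow[\ \cong\ ]{\ \text{Lemma 4.9}\ } C(Z,\mathbb R) \xrightarrow{\ \exp\ } C(Z,\mathbb T). \]
Exactness of the middle downward sequence
\[ H^0(Z,\mathbb Z) \to H^2_M(G,C(X,\mathbb T)) \xrightarrow{\ a\ } C(Z,\mathbb T) \xrightarrow{\ h\ } H^1(Z,\mathbb Z) \]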
follows immediately from (4) with $X$ replaced by $Z$. We still need to check commutativity of the squares. As far as the upper square is concerned, the key fact is that the restriction map
\[ \mathbb R \cong H^2_M(\mathbb R^2,\mathbb T) \to H^2_M(\mathbb Z^2,\mathbb T) \cong \mathbb T \]
is surjective and can be identified with the exponential map (see the Hochschild-Serre spectral sequence
\[ H^p_M\bigl(\mathbb R^2/\mathbb Z^2, H^q_M(\mathbb Z^2,\mathbb T)\bigr) \Rightarrow H^\bullet_M(\mathbb R^2,\mathbb T) \]
of [25] for a method of calculation). To check commutativity for the upper square, choose a Borel cocycle $\omega \in Z^2_M(G,C(X,\mathbb T))$ representing a class in $H^2_M(G,C(X,\mathbb T))$. By Lemma 4.9, we may assume $\omega$ takes its values in functions constant on $T$-orbits, i.e., pulled back from $C(Z,\mathbb T)$ via $p^*$. As in [14, Theorem 5.1(3)], choose a Borel map $u\colon G \to UM(C_0(X,\mathcal K))$ satisfying
\[ u_s\,\tau_s(u_t) = \omega(s,t)\,u_{s+t}, \qquad s, t \in G. \]
(Here $\tau$ is the action of $G$ on $X$.) Then by the prescription in [28], $\xi([\omega])$ is given by $C_0(X,\mathcal K)$ with the $G$-action $s \mapsto (\operatorname{Ad} u_s)\tau_s$. We need to compute the Mackey obstruction for the restriction of the action to $N = \mathbb Z^2$. But this is just given by $z \mapsto M(u_z)$, the Mackey obstruction of the projective unitary representation of $N$ defined by $u$ over a point $z \in Z$. But as the cocycle of the representation is just $\omega$ restricted to $z$ (this makes sense since we took $\omega$ to have values constant on $G$-orbits), we can use the above fact about restricting the Moore cohomology from $G$ to $N$ to deduce that $M(\xi([\omega])) = a([\omega])$.
Finally we need to check commutativity of the bottom square. This amounts to showing that if we have an action $\alpha$ of $G$ on $CT(X,\delta)$ representing an element of $\mathrm{Br}_G(X)$, then $h\circ M(\alpha) = p_!(\delta)$. (In the case where $M(\alpha)$ is trivial, this is basically in [28].) First of all, we note that $h\circ M(\alpha)$ can only depend on $\delta$, not on the choice of the action $\alpha$ on $CT(X,\delta)$. The reason is that any two different actions differ by an element of $\ker F$, which by the rest of the diagram is in the image of $H^2_M(G,C(X,\mathbb T)) \cong C(Z,\mathbb R)$. By commutativity of the upper square, this only changes $M(\alpha)$ within its homotopy class. Since we already know $\mathrm{Br}_G(X) \to H^3(X,\mathbb Z)$ is surjective, it follows that $h\circ M$ induces a homomorphism $H^3(X,\mathbb Z) \to H^1(Z,\mathbb Z)$. This map is trivial on $p^*(H^3(Z,\mathbb Z))$, since this part of $H^3(X,\mathbb Z)$ is represented by $G$-actions where $N = \mathbb Z^2$ acts trivially [14, §6.2]. And of course when $N$ acts trivially, there is no Mackey obstruction.
Next we show that the map $H^3(X,\mathbb Z) \to H^1(Z,\mathbb Z)$ induced by $h\circ M$ vanishes on the $E_\infty^{2,1}$ subquotient of the spectral sequence. This consists (modulo classes pulled back from $H^3(Z,\mathbb Z)$) of classes pulled back from some intermediate space $Y$, where $X \xrightarrow{\,p_1\,} Y \xrightarrow{\,p_2\,} Z$ is some factorization of the $T^2$-bundle $p\colon X \to Z$ as a composite of
two principal $S^1$-bundles. But given such a factorization and a class $\delta_Y \in H^3(Y,\mathbb Z)$, there is an essentially unique action of $\mathbb R$ on $CT(Y,\delta_Y)$ compatible with the $S^1$-action on $Y$ with quotient $Z$, because of the results of Sect. 4.1. Pulling back from $Y$ to $X$, we get an action of $\mathbb R\times\mathbb T$ on $CT(X,p_1^*\delta_Y)$, or in other words an action of $G$ factoring through $\mathbb R\times\mathbb T$. Such an action necessarily has trivial Mackey obstruction.
So it follows that the map induced by $h\circ M$ factors through the remaining subquotient of $H^3(X,\mathbb Z)$, i.e., $E_\infty^{1,2}$. That says exactly that the map factors through $p_!$. By naturality, it must be a multiple of $p_!$, and we just need to compute in the case of a trivial bundle to verify that the multiple is 1. Thus the proof is completed with the following Proposition 4.11.
Proposition 4.11. Let $p\colon X = Z\times\mathbb T^2 \to Z$ be a trivial $\mathbb T^2$-bundle, let $\beta \in H^1(Z,\mathbb Z)$, and let $\delta = \beta\times\gamma \in H^3(X,\mathbb Z)$, where $\gamma$ is the usual generator of $H^2(\mathbb T^2,\mathbb Z) \cong \mathbb Z$. Then there is an action $\alpha$ of $G = \mathbb R^2$ on $CT(X,\delta)$, compatible with the free $\mathbb T^2$-action on $X$, for which $h\circ M(\alpha) = \beta$.
Proof. Choose a function $f\colon Z \to \mathbb T$ with $h(f) = \beta$. Let $\mathcal H = L^2(\mathbb T)$ and for $z \in Z$, consider the projective unitary representation $\rho_{f(z)}\colon \mathbb Z^2 \to PU(\mathcal H)$ defined by sending the first generator of $\mathbb Z^2$ to multiplication by the identity map $\mathbb T \to \mathbb T \hookrightarrow \mathbb C$, and the second generator to translation by $f(z) \in \mathbb T$. Then the Mackey obstruction of $\rho_{f(z)}$ is $f(z) \in \mathbb T \cong H^2(\mathbb Z^2,\mathbb T)$. We can view $\rho$ as a spectrum-fixing action of $\mathbb Z^2$ on $C(Z,\mathcal K(\mathcal H))$, which is given at the point $z \in Z$ by $\operatorname{Ad}\rho_{f(z)}$. We now let $(A,\alpha)$ be the $C^*$-dynamical system obtained by inducing up $(C(Z,\mathcal K(\mathcal H)),\rho)$ from $\mathbb Z^2$ to $\mathbb R^2$. More precisely,
\[ A = \operatorname{Ind}_{\mathbb Z^2}^{\mathbb R^2}\bigl(C(Z,\mathcal K(\mathcal H)),\rho\bigr) = \bigl\{ f\colon \mathbb R^2 \to C(Z,\mathcal K(\mathcal H)) : f(t+g) = \rho(g)(f(t)),\ t \in \mathbb R^2,\ g \in \mathbb Z^2 \bigr\}. \]
Since $\rho$ acts trivially on the spectrum $Z$ of the inducing algebra and $A$ is an algebra of sections of a locally trivial bundle of $C^*$-algebras with fibers isomorphic to $\mathcal K$, $A$ is a continuous-trace algebra having spectrum $Z\times\mathbb T^2$. There is a natural action $\alpha$ of $\mathbb R^2$ on $A$ by translation, and by construction, $M(\alpha) = f$. We just need to compute the Dixmier-Douady invariant of $A$. We get it by “inducing in stages”. Let $B = \operatorname{Ind}_{\mathbb Z}^{\mathbb R} C(Z,\mathcal K(\mathcal H))$ be the result of inducing over the first copy of $\mathbb R$. Since the first generator of $\mathbb Z^2$ was always acting by conjugation by multiplication by the identity map $\mathbb T \to \mathbb T$ on $L^2(\mathbb T)$, one can see that $B$ is a trivial continuous-trace algebra, viz., $B \cong C_0(Z\times\mathbb T,\mathcal K(\mathcal H))$. We still have another action of $\mathbb Z$ on $B$ coming from the second generator of $\mathbb Z^2$, and $A = \operatorname{Ind}_{\mathbb Z}^{\mathbb R} B$, where we induce over the second copy of $\mathbb R$ to get $A$. The action of $\mathbb Z$ on $B$ is by means of a map $\sigma\colon Z\times\mathbb T \to PU(\mathcal H) = \operatorname{Aut}\mathcal K(\mathcal H)$, whose value at $(z,t)$ is the product of multiplication by $t$ with translation by $f(z)$. Thus the Dixmier-Douady invariant of $A$ is $[\sigma]\times c$, where $[\sigma] \in H^2(Z\times\mathbb T,\mathbb Z)$ is the homotopy class of $\sigma\colon Z\times\mathbb T \to PU(\mathcal H) = K(\mathbb Z,2)$ and $c$ is the usual generator of $H^1(S^1,\mathbb Z)$. But $[\sigma]$ is now $h(f)\times c$, so the Dixmier-Douady class of $A$ is $\beta\times c\times c = \beta\times\gamma$.
4.4. Applications to T-duality. Now we are ready to apply Theorem 4.10 to T-duality in type II string theory. First we need a definition.
Definition 4.12. Let $p\colon X \to Z$ be a principal $T$-bundle as in the Basic Setup 3.1, and let $\delta \in H^3(X,\mathbb Z)$. We will say that the pair $(p,\delta)$ has a classical T-dual if there is an
element $[A,\alpha]$ of $\mathrm{Br}_G(X)$, with $A$ a continuous-trace algebra over $X$ with Dixmier-Douady class $\delta$, and with $\alpha$ an action of $G$ on $A$ inducing the given free action of $T = G/N$ on $X$, such that the crossed product $A\rtimes_\alpha G$ is again a continuous-trace algebra over some other principal torus bundle over $Z$, with the dual action of $\widehat G$ inducing the bundle projection to $Z$. This definition is essentially equivalent to that in [7]; we will say more about this later in Remark 4.15. The following is the main result of this paper.
Theorem 4.13. Let $p\colon X \to Z$ be a principal $\mathbb T^2$-bundle as in the Basic Setup 3.1. Let $\delta \in H^3(X,\mathbb Z)$ be an “H-flux” on $X$. Then:
1. If $p_!\delta = 0 \in H^1(Z,\mathbb Z)$, then there is a (uniquely determined) classical T-dual to $(p,\delta)$, consisting of $p^\#\colon X^\# \to Z$, which is another principal $\mathbb T^2$-bundle over $Z$, and $\delta^\# \in H^3(X^\#,\mathbb Z)$, the “T-dual H-flux” on $X^\#$. One obtains a picture exactly like Eq. (2).
2. If $p_!\delta \neq 0 \in H^1(Z,\mathbb Z)$, then a classical T-dual as above does not exist. However, there is a “nonclassical” T-dual bundle of noncommutative tori over $Z$. It is not unique, but the non-uniqueness does not affect its K-theory.
Proof. By Theorem 4.10, the map $F\colon \mathrm{Br}_G(X) \to H^3(X,\mathbb Z)$ is always surjective. This will be the key to the proof. First consider the case when $p_!\delta = 0 \in H^1(Z,\mathbb Z)$. This case is considered in [7], but we will redo the results using Theorem 4.10. By commutativity of the lower square, we can lift $\delta \in H^3(X,\mathbb Z)$ to an element $[CT(X,\delta),\alpha]$ of $\mathrm{Br}_G(X)$ with $M(\alpha)$ homotopically trivial. Then by using commutativity of the upper square in Theorem 4.10, we can perturb $\alpha$, without changing $\delta$, so that $M(\alpha)$ actually vanishes. Once this is done, the element we get in $\mathrm{Br}_G(X)$ is actually unique. On the one hand, this can be seen from [28, Lemma 1.3] and [28, Cor. 5.18]. Alternatively, it can be read off from Theorem 4.10, since any two classes in $\ker M$ mapping to the same $\delta \in H^3(X,\mathbb Z)$ differ by the image under $\xi$ of something in $\ker a$.
Thus they differ by the image under $\xi$ of a $\mathbb Z$-valued cocycle, which is trivial since such a cocycle exponentiates to the trivial cocycle with values in $\mathbb T$, and this is all that is used in the construction of $\xi$ in [14]. Finally, if $[CT(X,\delta),\alpha]$ has trivial Mackey obstruction, then as explained in [28, §1], $CT(X,\delta)\rtimes_\alpha G$ has continuous trace and has spectrum which is another principal torus bundle over $Z$ (for the dual torus, $\widehat G$ divided by the dual lattice).
Now consider the case when
\[ p_!\delta \neq 0 \in H^1(Z,\mathbb Z). \tag{6} \]
It is still true as before that we can find an element $[CT(X,\delta),\alpha]$ in $\mathrm{Br}_G(X)$ corresponding to $\delta$. But there is no classical T-dual in this situation since the Mackey obstruction can’t be trivial, because of Theorem 4.10. In fact, since any representative $f\colon Z \to \mathbb T$ of a non-zero class in $H^1(Z,\mathbb Z)$ must take on all values in $\mathbb T$, there are necessarily points $z \in Z$ for which the Mackey obstruction in $H^2(\mathbb Z^2,\mathbb T) \cong \mathbb T$ is irrational, and hence the crossed product $CT(X,\delta)\rtimes_\alpha G$ cannot be type I. Nevertheless, we can view this crossed product as a non-classical T-dual to $(p,\delta)$. The crossed product can be viewed as the algebra of sections of a bundle of algebras (not locally trivial) over $Z$, in the sense of [15]. The fiber of this bundle over $z \in Z$ will be $C(p^{-1}(z),\mathcal K(\mathcal H))\rtimes G \cong C(G/\mathbb Z^2,\mathcal K(\mathcal H))\rtimes G \cong A_{f(z)}\otimes\mathcal K(\mathcal H)$, which is Morita equivalent to the twisted group $C^*$-algebra $A_{f(z)}$ of the stabilizer group $\mathbb Z^2$ for the
Mackey obstruction class $f(z)$ at that point. In other words, the T-dual will be realized by a bundle of (stabilized) noncommutative tori fibered over $Z$. (See Fig. 1.) The bundle is not unique since there is no canonical representative $f$ for a given non-zero class in $H^1(Z,\mathbb Z)$. However, any two choices are homotopic, and the resulting bundles will be in some sense homotopic to one another. As expected, our notion of T-duality comes with isomorphisms in twisted K-theory and (periodic cyclic) homology:
Theorem 4.14. In the situation of Theorem 4.13, if $X$ is a manifold, $H$ is an integral 3-form representing $\delta$ (in de Rham cohomology), and we choose a smooth model for $CT(X,\delta)$ (by taking a smooth bundle over $X$ with fibers the smoothing operators), we have a commutative diagram
\[
\begin{array}{ccc}
K^\bullet(X,H) & \xrightarrow[\ \cong\ ]{\ T_!\ } & K_\bullet\bigl(CT(X,\delta)\rtimes\mathbb R^2\bigr) \\
\Big\downarrow{\scriptstyle Ch_H} & & \Big\downarrow{\scriptstyle Ch} \\
H^\bullet(X,H) & \xrightarrow[\ \cong\ ]{\ T_*\ } & HP_\bullet\bigl(CT(X,\delta)^\infty\rtimes\mathbb R^2\bigr)
\end{array}
\tag{7}
\]
where the horizontal arrows are isomorphisms, $Ch_H$ is the twisted Chern character and $Ch$ is the Connes-Chern character [12]. When $p_!\delta = 0$ and there is a classical T-dual, this reduces to a diagram like Eq. (3), except that there is no degree shift since the tori are even-dimensional.
Proof. This is done almost exactly as in [5], so we will be brief. We have the isomorphisms in K-theory
\[ K^\bullet(X,H) \cong K_\bullet(CT(X,\delta)) \cong K_\bullet\bigl(CT(X,\delta)\rtimes\mathbb R^2\bigr) \quad \text{(Connes-Thom isomorphism [11]).} \]
We can also consider the smooth subalgebra $CT(X,\delta)^\infty\rtimes G$. The fiber at $z \in Z$ is given by
\[ C^\infty(p^{-1}(z),\mathcal K^\infty(\mathcal H))\rtimes G \cong C^\infty(G/\mathbb Z^2,\mathcal K^\infty(\mathcal H))\rtimes G \cong A^\infty_{f(z)}\otimes\mathcal K^\infty(\mathcal H), \]
[Figure 1 (image not reproduced); labels in the figure: $X$, $A_{f(z)}$, $p$, $Z$, $z$.]
Fig. 1. In the diagram, the fiber over $z \in Z$ is the noncommutative torus $A_{f(z)}$, which is represented by a foliated torus, with foliation angle equal to $f(z)$
where $\mathcal K^\infty(\mathcal H)$ is the algebra of smoothing operators on $\mathcal H$ and $A^\infty_{f(z)}$ is the smooth noncommutative torus with multiplier equal to $f(z)$. Then we have the isomorphisms
\[ H^\bullet(X,H) \cong HP_\bullet(CT(X,\delta)^\infty) \cong HP_\bullet\bigl(CT(X,\delta)^\infty\rtimes\mathbb R^2\bigr) \quad \text{(ENN-Thom isomorphism [19]).} \]
It is well known that the Chern character is compatible with the isomorphisms in K-theory and cohomology, from which the commutativity of the diagram in (7) follows.
Remark 4.15. The reader might wonder what happened to the dual H-flux $H^\#$ in the context of Theorem 4.13(2). It doesn’t really make sense as a cohomology class or differential form since the nonclassical T-dual is not a space; rather, it is subsumed in the noncommutative structure of the dual. Now let us describe the relationship between our Definition 4.12 and Theorem 4.13 and the corresponding notions in [7]. If the pair $(p\colon X \to Z, \delta)$ is T-dualizable in the sense of [7], that means $\delta$ is represented by a closed 3-form $H$, such that $\iota_V H = p^*\widehat F(V)$, for some integral closed 2-form $\widehat F$ with values in the dual of $\mathfrak g$, the Lie algebra of $T$, and for all $V \in \mathfrak g$. This essentially means that when we integrate $H$ over the fibers of $p_1$, where $X \xrightarrow{\,p_1\,} Y \xrightarrow{\,p_2\,} Z$ is a factorization of $p$ into two circle bundles, then the resulting 2-form is pulled back from $Z$. This implies in turn that integrating $H$ over the fibers of $p$ gives 0, which is the condition $p_![H] = 0$. (We do not need to worry about torsion in cohomology since $p_!\delta$ lies in $H^1(Z,\mathbb Z)$, which is always torsion-free.) Thus the condition in our Theorem 4.13(1) is satisfied. Conversely, suppose our condition $p_!\delta = 0$ is satisfied, so we have a classical T-dual $(p^\#\colon X^\# \to Z, \delta^\#)$. The condition of [7] that $\iota_V H = p^*\widehat F(V)$, for some closed integral 2-form $\widehat F$ with values in the dual of $\mathfrak g$ and for all $V \in \mathfrak g$, will follow from the fact that since $p_!\delta = 0$ (and we can divide out by trivial cases where $\delta$ is pulled back from $Z$), $\delta$ comes from the $E_\infty^{2,1}$ subquotient of $H^3(X,\mathbb Z)$.
5. Examples: Torus Bundles and Noncommutative Torus Bundles over the Circle
A famous example of a principal torus bundle with non T-dualizable H-flux is provided by $\mathbb T^3$, considered as the trivial $\mathbb T^2$-bundle over $\mathbb T$, with $H$ given by $k$ times the volume form on $\mathbb T^3$, $k \neq 0$. $H$ is non T-dualizable in the classical sense since $p_![H] \neq 0$. Alternatively, there are no non-trivial $\mathbb T^2$-bundles over $\mathbb T$, since $H^1(\mathbb T,\mathbb T^2) \cong H^2(\mathbb T,\mathbb Z^2) = 0$, that is, there is no way to dualize the H-flux by a (principal) torus bundle over $\mathbb T$. This example is covered by Theorem 4.13(2) and by Theorem 4.14. The T-dual is realized by a bundle of stabilized noncommutative tori fibered over $\mathbb T$. In fact the construction of the non-classical T-dual in this case is a special case of the construction in the proof of Proposition 4.11, but we repeat the details since we can make things more explicit. Let $\mathcal H = L^2(\mathbb T)$ and consider the projective unitary representation $\rho_\theta\colon \mathbb Z^2 \to PU(\mathcal H)$ given by the first $\mathbb Z$ factor acting by multiplication by $z^k$ (where $\mathbb T$ is thought of as the unit circle in $\mathbb C$) and the second $\mathbb Z$ factor acting by translation by $\theta \in \mathbb T$. Then the Mackey obstruction of $\rho_\theta$ is $\theta^k \in \mathbb T \cong H^2(\mathbb Z^2,\mathbb T)$. Let $\mathbb Z^2$ act on $C(\mathbb T,\mathcal K(\mathcal H))$ by $\alpha$, which is given at the point $\theta$ by $\rho_\theta$. Define the $C^*$-algebra
\[ B = \operatorname{Ind}_{\mathbb Z^2}^{\mathbb R^2}\bigl(C(\mathbb T,\mathcal K(\mathcal H)),\alpha\bigr) = \bigl\{ f\colon \mathbb R^2 \to C(\mathbb T,\mathcal K(\mathcal H)) : f(t+g) = \alpha(g)(f(t)),\ t \in \mathbb R^2,\ g \in \mathbb Z^2 \bigr\}. \]
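That is, $B$ (with an implied action of $\mathbb{R}^2$) is the result of inducing a $\mathbb{Z}^2$-action on $C(\mathbb{T}, \mathcal{K}(\mathcal{H}))$ from $\mathbb{Z}^2$ up to $\mathbb{R}^2$. Then $B$ is a continuous-trace $C^*$-algebra having spectrum $\mathbb{T}^3$, having an action of $\mathbb{R}^2$ whose induced action on the spectrum of $B$ is the trivial bundle $\mathbb{T}^3 \to \mathbb{T}$. The crossed product algebra $B \rtimes \mathbb{R}^2 \cong C(\mathbb{T}, \mathcal{K}(\mathcal{H})) \rtimes \mathbb{Z}^2$ has fiber over $\theta \in \mathbb{T}$ given by $\mathcal{K}(\mathcal{H}) \rtimes_{\rho_\theta} \mathbb{Z}^2 \cong A_\theta \otimes \mathcal{K}(\mathcal{H})$, where $A_\theta$ is the noncommutative 2-torus. In fact, the crossed product $B \rtimes \mathbb{R}^2$ is Morita equivalent to $C(\mathbb{T}, \mathcal{K}(\mathcal{H})) \rtimes \mathbb{Z}^2$ and is even isomorphic to the stabilization of this algebra (by [22]). Thus $B \rtimes \mathbb{R}^2$ is isomorphic to $C^*(H_\mathbb{Z}) \otimes \mathcal{K}$, where $H_\mathbb{Z}$ is the integer Heisenberg-type group,
\[
H_\mathbb{Z} = \left\{ \begin{pmatrix} 1 & x & \tfrac{1}{k} z \\ 0 & 1 & y \\ 0 & 0 & 1 \end{pmatrix} : x, y, z \in \mathbb{Z} \right\},
\]
a lattice in the usual Heisenberg group $H_\mathbb{R}$ (consisting of matrices of the same form, but with $x, y, z \in \mathbb{R}$). Then we have the isomorphisms in K-theory
\begin{align*}
K_\bullet(B) &= K^\bullet(\mathbb{T}^3, k\,d\mathrm{vol}) && \text{(definition)} \\
&\cong K_\bullet(B \rtimes \mathbb{R}^2) && \text{(Connes-Thom isomorphism)} \\
&\cong K_\bullet(C^*(H_\mathbb{Z})) && \text{(above identification)} \\
&\cong K_\bullet(H_\mathbb{R}/H_\mathbb{Z}) && \text{(special case of the Baum-Connes conjecture}^2\text{)} \\
&\cong K^{\bullet+1}(H_\mathbb{R}/H_\mathbb{Z}) && \text{(Poincaré duality for } H_\mathbb{R}/H_\mathbb{Z}\text{)},
\end{align*}
where we observe that the Heisenberg nilmanifold $H_\mathbb{R}/H_\mathbb{Z}$ (which happens to be the classifying space $BH_\mathbb{Z}$) is a circle bundle over $\mathbb{T}^2$ with first Chern class equal to $k\,dx \wedge dy$. Notice that as far as K-theory is concerned, the T-dual of $(\mathbb{T}^3, k\,d\mathrm{vol})$ can also be taken to be the nilmanifold $H_\mathbb{R}/H_\mathbb{Z}$ with the trivial H-field. This is a non-principal $\mathbb{T}^2$-bundle over $S^1$. But a better model for a non-classical T-dual is simply the group $C^*$-algebra of $H_\mathbb{Z}$.

We can also consider the smooth subalgebra $B^\infty$ of $B$ defined by
\[
B^\infty = \mathrm{Ind}_{\mathbb{Z}^2}^{\mathbb{R}^2}\bigl(C^\infty(\mathbb{T}, \mathcal{K}^\infty(\mathcal{H})), \alpha\bigr) = \bigl\{ f : \mathbb{R}^2 \to C^\infty(\mathbb{T}, \mathcal{K}^\infty(\mathcal{H})) : f(t + g) = \alpha(g)(f(t)),\ t \in \mathbb{R}^2,\ g \in \mathbb{Z}^2 \bigr\},
\]
where $\mathcal{K}^\infty(\mathcal{H})$ denotes the algebra of smoothing operators on $\mathbb{T}$. Note that $B^\infty \rtimes \mathbb{R}^2 \cong C^\infty(\mathbb{T}, \mathcal{K}^\infty(\mathcal{H})) \rtimes \mathbb{Z}^2$ has fiber over $\theta \in \mathbb{T}$ given by $\mathcal{K}^\infty(\mathcal{H}) \rtimes_{\rho_\theta} \mathbb{Z}^2 \cong A_\theta^\infty \otimes \mathcal{K}^\infty(\mathcal{H})$, where $A_\theta^\infty$ is the smooth noncommutative torus and the tensor product is the projective tensor product. In this case, the crossed product $B^\infty \rtimes \mathbb{R}^2 \cong \mathcal{S}(H_\mathbb{Z}) \otimes \mathcal{K}^\infty(\mathcal{H})$, where $\mathcal{S}(H_\mathbb{Z})$ is the rapid decrease algebra. Then we have the isomorphisms
\begin{align*}
HP_\bullet(B^\infty) &= H^\bullet(\mathbb{T}^3, k\,d\mathrm{vol}) && \text{(definition)} \\
&\cong HP_\bullet(B^\infty \rtimes \mathbb{R}^2) && \text{(ENN-Thom isomorphism)} \\
&\cong HP_\bullet(\mathcal{S}(H_\mathbb{Z})) && \text{(above identification)} \\
&\cong H_\bullet(H_\mathbb{R}/H_\mathbb{Z}) && \text{(cyclic homology Baum-Connes conjecture)} \\
&\cong H^{\bullet+1}(H_\mathbb{R}/H_\mathbb{Z}) && \text{(Poincaré duality for } H_\mathbb{R}/H_\mathbb{Z}\text{)},
\end{align*}
where $HP_\bullet$ denotes periodic cyclic homology, which is stable under the (projective) tensor product with $\mathcal{K}^\infty(\mathcal{H})$, and $H_\bullet$, $H^\bullet$ denote the $\mathbb{Z}_2$-graded homology and cohomology respectively. Finally, T-duality can be expressed in this case by the following commutative diagram,
\[
\begin{array}{ccc}
K^\bullet(\mathbb{T}^3, k\,d\mathrm{vol}) & \xrightarrow{\ T_!\ } & K_\bullet(C^*(H_\mathbb{Z})) \\
\Big\downarrow {\scriptstyle \mathrm{Ch}_H} & & \Big\downarrow {\scriptstyle \mathrm{Ch}} \\
H^\bullet(\mathbb{T}^3, k\,d\mathrm{vol}) & \xrightarrow{\ T_*\ } & HP_\bullet(\mathcal{S}(H_\mathbb{Z}))
\end{array}
\tag{1}
\]
where $H = k\,d\mathrm{vol}$, $\mathrm{Ch}_H$ is the twisted Chern character and $\mathrm{Ch}$ is the Connes-Chern character [12].

Footnote 2. This is not as complicated as it sounds. The Baum-Connes conjecture (for torsion-free groups) says that the “index map” or “assembly map” $K_\bullet(B\Gamma) \to K_\bullet(C_r^*(\Gamma))$ should be an isomorphism for an arbitrary discrete torsion-free group $\Gamma$ [4]. Here $B\Gamma$ is the classifying space of $\Gamma$, which if $\Gamma$ is a torsion-free cocompact discrete subgroup of a connected Lie group $G$ can be taken to be $K\backslash G/\Gamma$, $K$ a maximal compact subgroup of $G$, and $C_r^*(\Gamma)$ denotes the reduced group $C^*$-algebra, i.e., the $C^*$-algebra generated by the left regular representation of $\Gamma$ on $\ell^2(\Gamma)$. If $\Gamma$ is amenable, this coincides with the full group $C^*$-algebra, or in other words the universal $C^*$-algebra whose $*$-representations correspond to unitary representations of $\Gamma$. When $\Gamma$, like $H_\mathbb{Z}$, is a poly-$\mathbb{Z}$ group, i.e., has a composition series with infinite cyclic composition factors, then this is easy to prove by induction on the length of the composition series, using the Pimsner-Voiculescu exact sequence [30] for the K-theory of a crossed product by an action of $\mathbb{Z}$. Finally, the Pimsner-Voiculescu sequence can be deduced from Connes' Thom isomorphism theorem (see [11]) by inducing the action of $\mathbb{Z}$ to an action of $\mathbb{R}$.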
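The integer Heisenberg-type group $H_\mathbb{Z}$ of this section can be explored with exact arithmetic. The following is only a minimal sketch: the function names (`heis`, `matmul`, `coords`, `inverse`, `commutator_of_generators`) are hypothetical and not taken from the paper. It builds the matrices with upper-right entry $z/k$, checks that products stay in the lattice, and computes the commutator of the two generators $(1,0,0)$ and $(0,1,0)$, which turns out to be the central element with $z = k$.

```python
from fractions import Fraction


def heis(x, y, z, k):
    """Matrix [[1, x, z/k], [0, 1, y], [0, 0, 1]] with exact rational entries."""
    return (
        (Fraction(1), Fraction(x), Fraction(z, k)),
        (Fraction(0), Fraction(1), Fraction(y)),
        (Fraction(0), Fraction(0), Fraction(1)),
    )


def matmul(a, b):
    """Product of two 3x3 matrices stored as tuples of rows."""
    return tuple(
        tuple(sum(a[i][m] * b[m][j] for m in range(3)) for j in range(3))
        for i in range(3)
    )


def coords(m, k):
    """Recover the coordinates (x, y, z) of a matrix of the above form."""
    return (m[0][1], m[1][2], m[0][2] * k)


def inverse(x, y, z, k):
    """Inverse element; in coordinates (x, y, z)^(-1) = (-x, -y, -z + k*x*y)."""
    return heis(-x, -y, -z + k * x * y, k)


def commutator_of_generators(k):
    """Coordinates of a*b*a^(-1)*b^(-1) for a = (1, 0, 0) and b = (0, 1, 0)."""
    a, b = heis(1, 0, 0, k), heis(0, 1, 0, k)
    a_inv, b_inv = inverse(1, 0, 0, k), inverse(0, 1, 0, k)
    return coords(matmul(matmul(a, b), matmul(a_inv, b_inv)), k)
```

In coordinates the product reads $(x, y, z)(x', y', z') = (x + x', y + y', z + z' + kxy')$, so integer coordinates stay integer, and the commutator of the generators is the $k$-th power of the central generator $(0, 0, 1)$.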
6. Concluding Remarks

In this paper, we have only dealt with complex $C^*$-algebras and complex K-theory, which are relevant for type II string theory. In principle, most of what we have done should also extend to the type I case, which involves real K-theory. However, one has to be careful. Since T-duality is related to the Fourier transform, and since the Fourier transform of a real function is not necessarily real, a theory of T-duality in type I string theory necessarily involves KR-theory, or Real K-theory in the sense of Atiyah [3]. The correct notion of twisted KR-theory is that of K-theory of real continuous-trace algebras in the sense of [35, §3]. What complicates things is that such algebras are built out of continuous-trace algebras of real, quaternionic, and complex type (locally isomorphic to $C(X, \mathcal{K}_\mathbb{R})$, $C(X, \mathcal{K}_\mathbb{H})$, and $C(X, \mathcal{K}_\mathbb{C})$, respectively). Even if one's original interest is in algebras of real type, passage to the T-dual will often involve algebras of the other types.

One possibility suggested by the example in Sect. 5 is that there is a good theory of T-duality for arbitrary torus bundles with H-fluxes, that doesn't require going to a category of noncommutative bundles, but that it is necessary to include the possibility of non-principal bundles. We have seen that there is a sense in which the Heisenberg nilmanifold (with trivial H-field) can be viewed as a T-dual to $\mathbb{T}^3$ with a non-trivial H-field. (This is literally true in the sense of [5] if we think of both manifolds as $\mathbb{T}$-bundles over $\mathbb{T}^2$, rather than as $\mathbb{T}^2$-bundles over $S^1$.)

It is of course a little disappointing that our main theorem only applies when the fibers of the torus bundle are 2-dimensional. From Theorem 4.4, it is not even clear if the map $\mathrm{Br}_G(X) \to H^3(X, \mathbb{Z})$ is surjective when $n = \dim G > 2$. However, the methods of this paper should apply on the image of this map.
References

1. Álvarez, E., Álvarez-Gaumé, L., Barbón, J.L.F., Lozano, Y.: Some global aspects of duality in string theory. Nucl. Phys. B415, 71–100 (1994)
2. Alvarez, O.: Target space duality I: General theory. Nucl. Phys. B584, 659–681 (2000); Target space duality II: Applications. Nucl. Phys. B584, 682–704 (2000)
3. Atiyah, M.F.: K-theory and reality. Quart. J. Math. Oxford Ser. (2) 17, 367–386 (1966)
4. Baum, P., Connes, A., Higson, N.: Classifying space for proper actions and K-theory of group $C^*$-algebras. In: $C^*$-algebras: 1943–1993 (San Antonio, TX, 1993), Contemp. Math. 167, Providence, RI: Am. Math. Soc. 1994, pp. 240–291
5. Bouwknegt, P., Evslin, J., Mathai, V.: T-duality: Topology change from H-flux. Commun. Math. Phys. 249, 383–415 (2004)
6. Bouwknegt, P., Evslin, J., Mathai, V.: On the topology and H-flux of T-dual manifolds. Phys. Rev. Lett. 92, 181601 (2004)
7. Bouwknegt, P., Hannabuss, K., Mathai, V.: T-duality for principal torus bundles. JHEP 03, 018 (2004)
8. Buscher, T.: A symmetry of the string background field equations. Phys. Lett. B194, 59–62 (1987)
9. Buscher, T.: Path integral derivation of quantum duality in nonlinear sigma models. Phys. Lett. B201, 466–472 (1988)
10. Connes, A.: A survey of foliations and operator algebras. In: Operator algebras and applications, Part I (Kingston, Ont., 1980), R. V. Kadison, (ed.), Proc. Sympos. Pure Math. 38, Providence, R.I.: Am. Math. Soc. 1982, pp. 521–628
11. Connes, A.: An analogue of the Thom isomorphism for crossed products of a $C^*$-algebra by an action of R. Adv. Math. 39(1), 31–55 (1981)
12. Connes, A.: Noncommutative differential geometry. Inst. Hautes Études Sci. Publ. Math. No. 62, 257–360 (1985). Available at http://www.numdam.org/item?id=PMIHES_1985__62__41_0
13. Connes, A., Douglas, M.R., Schwarz, A.: Noncommutative geometry and matrix theory: compactification on tori. J. High Energy Phys. 02, 003 (1998)
14. Crocker, D., Kumjian, A., Raeburn, I., Williams, D.P.: An equivariant Brauer group and actions of groups on $C^*$-algebras. J. Funct. Anal. 146(1), 151–184 (1997)
15. Dauns, J., Hofmann, K.H.: Representation of rings by sections. Mem. Am. Math. Soc. no. 83, Providence, R.I.: Am. Math. Soc. 1968
16. Dixmier, J., Douady, A.: Champs continus d'espaces hilbertiens et de $C^*$-algèbres. Bull. Soc. Math. France 91, 227–284 (1963)
17. Donagi, R., Pantev, T.: Torus fibrations, gerbes and duality. http://xxx.lanl.gov/abs/math.AG/0306213, 2003
18. Donovan, P., Karoubi, M.: Graded Brauer groups and K-theory with local coefficients. Inst. Hautes Études Sci. Publ. Math. No. 38, 5–25 (1970). Available at http://www.numdam.org/item?id=PMIHES_1970__38__5_0
19. Elliott, G., Natsume, T., Nest, R.: Cyclic cohomology for one-parameter smooth crossed products. Acta Math. 160(3–4), 285–305 (1988)
20. Fell, J.M.G.: The structure of algebras of operator fields. Acta Math. 106, 233–280 (1961)
21. Gleason, A.: Spaces with a compact Lie group of transformations. Proc. Am. Math. Soc. 1, 35–43 (1950)
22. Green, P.: The structure of imprimitivity algebras. J. Funct. Anal. 36(1), 88–104 (1980)
23. Guichardet, A.: Cohomologie des groupes topologiques et des algèbres de Lie. Textes Mathématiques, 2. Paris: CEDIC, 1980
24. Maldacena, J., Moore, G., Seiberg, N.: D-brane instantons and K-theory charges. J. High Energy Phys. 11, 062 (2001)
25. Moore, C.C.: Extensions and low dimensional cohomology theory of locally compact groups. I. Trans. Am. Math. Soc. 113, 40–63 (1964)
26. Moore, C.C.: Group extensions and cohomology for locally compact groups. III. Trans. Am. Math. Soc. 221(1), 1–33 (1976)
27. Moore, G.: K-theory from a physical perspective. http://arXiv.org/abs/hep-th/0304018, 2003
28. Packer, J., Raeburn, I., Williams, D.P.: The equivariant Brauer group of principal bundles. J. Operator Theory 36, 73–105 (1996)
29. Pichaud, J.: Cohomologie continue et cohomologie différentiable des groupes localement compacts. C. R. Acad. Sci. Paris Sér. I Math. 292(3), 171–173 (1981)
30. Pimsner, M., Voiculescu, D.: Exact sequences for K-groups and Ext-groups of certain cross-product $C^*$-algebras. J. Operator Theory 4(1), 93–118 (1980)
31. Raeburn, I., Rosenberg, J.: Crossed products of continuous-trace $C^*$-algebras by smooth actions. Trans. Am. Math. Soc. 305(1), 1–45 (1988)
32. Raeburn, I., Williams, D.P.: Dixmier–Douady classes of dynamical systems and crossed products. Can. J. Math. 45, 1032–1066 (1993)
33. Raeburn, I., Williams, D.P.: Topological invariants associated with the spectrum of crossed product $C^*$-algebras. J. Funct. Anal. 116(2), 245–276 (1993)
34. Rieffel, M.: $C^*$-algebras associated with irrational rotations. Pacific J. Math. 93(2), 415–429 (1981)
35. Rosenberg, J.: Continuous-trace algebras from the bundle theoretic point of view. J. Austral. Math. Soc. Ser. A 47(3), 368–381 (1989)
36. Seiberg, N., Witten, E.: String theory and noncommutative geometry. J. High Energy Phys. 09, 032 (1999)
37. Strominger, A., Yau, S.-T., Zaslow, E.: Mirror symmetry is T-duality. Nucl. Phys. B479, 243–259 (1996)
38. Wigner, D.: Algebraic cohomology of topological groups. Trans. Am. Math. Soc. 178, 83–93 (1973)
39. Witten, E.: D-branes and K-theory. J. High Energy Phys. 12, 019 (1998)
40. Witten, E.: Overview of K-theory applied to strings. Int. J. Mod. Phys. A16, 693–706 (2001)

Communicated by A. Connes
Commun. Math. Phys. 253, 723–764 (2005) Digital Object Identifier (DOI) 10.1007/s00220-004-1160-1
Communications in
Mathematical Physics
Solitons in Affine and Permutation Orbifolds
Victor G. Kac1, Roberto Longo2, Feng Xu3
1 Department of Mathematics, MIT, Cambridge, MA 02139, USA. E-mail: [email protected]
2 Dipartimento di Matematica, Università di Roma “Tor Vergata”, Via della Ricerca Scientifica, 1, 00133 Roma, Italy. E-mail: [email protected]
3 Department of Mathematics, University of California at Riverside, Riverside, CA 92521, USA. E-mail: [email protected]
Received: 12 February 2004 / Accepted: 20 February 2004 Published online: 3 September 2004 – © Springer-Verlag 2004
Abstract: We consider properties of solitons in general orbifolds in the algebraic quantum field theory framework and constructions of solitons in affine and permutation orbifolds. Under general conditions we show that our construction gives all the twisted representations of the fixed point subnet. This allows us to prove a number of conjectures: in the affine orbifold case we clarify the issue of “fixed point resolutions”; in the permutation orbifold case we determine all irreducible representations of the orbifold, and we also determine the fusion rules in a nontrivial case, which imply an integral property of chiral data for any completely rational conformal net.
V.G. Kac was supported in part by NSF, R. Longo by GNAMPA-INDAM and MIUR, and F. Xu by NSF.

1. Introduction

Let $\mathcal{A}$ be a completely rational conformal net (cf. §3.5 and Def. 3.6 following [21]). Let $\Gamma$ be a finite group acting properly on $\mathcal{A}$ (cf. Def. 3.4). The starting point of this paper is Th. 3.7, proved in [44], which states that the fixed point subnet (the orbifold) $\mathcal{A}^\Gamma$ is also completely rational, and by [21] $\mathcal{A}^\Gamma$ has finitely many irreducible representations, which are divided into two classes: the ones obtained by restricting a representation of $\mathcal{A}$ to $\mathcal{A}^\Gamma$, which are called untwisted representations, and the ones which are twisted (cf. the definition after Th. 3.7). It follows from Th. 3.7 that a twisted representation of $\mathcal{A}^\Gamma$ always exists if $\mathcal{A}^\Gamma \neq \mathcal{A}$. The motivating question for this paper is how to construct these twisted representations of $\mathcal{A}^\Gamma$. It turns out that all representations of $\mathcal{A}^\Gamma$ are closely related to the solitons of $\mathcal{A}$ (cf. §3.3 and Prop. 4.1). Solitons are representations of $\mathcal{A}_0$, the restriction of $\mathcal{A}$ to the real line identified with a circle with one point removed. Every representation of $\mathcal{A}$ restricts to a soliton of $\mathcal{A}_0$, but not every soliton of $\mathcal{A}_0$ can be extended to a representation of $\mathcal{A}$.

In §4 we develop a general theory of solitons in the case of orbifolds with two main results: Th. 4.5 gives a formula for the index of solitons obtained from restrictions, and Th. 4.8 clarifies the general structure of the restriction of a soliton. These results are natural extensions of similar results obtained in special cases in [33] and [45].

The construction of solitons depends on the net $\mathcal{A}$ and the action of $\Gamma$. In the case of an affine orbifold, our construction (cf. Def. 5.6) is partially inspired by the “twisted representations” of [18], and in fact can be viewed as an “exponentiated version” of the “twisted representations” of [18] (cf. §5.2.1). Combined with the general properties of solitons described above, this construction allows us to clarify the issue of the “fixed point” problem in [18] in Th. 5.16, and we also show that our construction gives all the irreducible representations of the fixed point subnet under general conditions in Th. 5.11 and Cor. 5.12, thus answering our motivating question in this case.

In the case of permutation orbifolds (cf. §6), our construction of solitons in (6.5) is a simple generalization of the construction of solitons in [33] for the case of cyclic orbifolds. Note that the construction of solitons in [33] also leads to structure results such as a dichotomy for any split local conformal net. In Th. 8.1 (resp. Th. 8.5) we show that our construction gives all the irreducible representations of the cyclic orbifold (resp. the permutation orbifold), and in Th. 8.4 (resp. Th. 8.7) we list all the irreducible representations of the cyclic orbifold (resp. the permutation orbifold). These results generalize the results of [33] and prove a claim in [1] which is based on heuristic arguments. Using these results, in §9 we determine the fusion rules for the first nontrivial case $n = 2$ in Th. 9.8, which implies an integral property of the chiral data for any completely rational net (cf. Cor. 9.9), proving a conjecture in the paper [5], which contained the first computations leading to correct fusion rules.

The rest of this paper is organized as follows: §2 and §3 are preliminaries on the algebraic quantum field theory framework where the orbifold construction is considered. In these sections we have collected some basic notions that appear in this paper for the convenience of the reader who may not have an operator algebra background. The results in §2 and §3 are known except Prop. 3.8 on extensions of solitons, which plays an important role in §7. In §4 we apply the results in §2 and §3 to obtain general properties of solitons under inductions and restrictions, and in particular we prove Th. 4.5 and Th. 4.8. In §5, after recalling basic definitions and properties of affine orbifolds from [18], we give the construction of solitons in §5.2 and in §5.2.1 compare it with the twisted representations in [18]. Theorem 5.11 and its Cor. 5.12 are proved in §5.3. In §5.5 we clarify the issue of fixed point resolutions in Th. 5.16 and Cor. 5.17. In §5.6 we illustrate the results of §5.5 in an example considered in [18]. In §6 we first recall the construction of solitons from [33] in the cyclic permutation case, and in §6.3 give the general construction of solitons for permutation orbifolds. We prove in §7 the important property of these solitons (Th. 7.1) which so far has no direct proof. In §8 we apply the results of previous sections to prove the four theorems briefly described above. In §9, after proving some simple properties of the S matrix (cf. Lemma 9.1), we determine the fusions of solitons in the cyclic orbifold in a special case in Prop. 9.4. In §9.3 we determine the fusion rules for the case $n = 2$ in Th. 9.8, which implies an integral property in Cor. 9.9.

2. Elements of Operator Algebras and Conformal QFT

For the convenience of the reader we collect here some basic notions that appear in this paper. This is only a guideline and the reader should look at the references for a more complete treatment.
2.1. von Neumann algebras. Let $\mathcal{H}$ be a Hilbert space, which we always assume to be separable to simplify the exposition. With $B(\mathcal{H})$ the algebra of all bounded linear operators on $\mathcal{H}$, a von Neumann algebra $M$ is a $*$-subalgebra of $B(\mathcal{H})$ containing the identity operator such that $M = M^-$ (weak closure or, equivalently, strong closure). Equivalently $M = M''$ (von Neumann density theorem), where the prime denotes the commutant: $M' \equiv \{a \in B(\mathcal{H}) : xa = ax\ \forall x \in M\}$.

A linear map $\eta$ from a von Neumann algebra $M$ to a von Neumann algebra $N$ is positive if $\eta(M_+) \subset N_+$, where $M_+ \equiv \{x \in M : x \geq 0\}$ denotes the cone of positive elements of $M$. $\eta$ is normal if it commutes with the sup operation, namely $\sup \eta(x_i) = \eta(\sup x_i)$ for any bounded increasing net of elements in $M_+$; $\eta$ is normal iff it is weakly (equivalently strongly) continuous on the unit ball of $M$. $\eta$ is faithful if $\eta(x) = 0$, $x \in M_+$, implies $x = 0$. By a homomorphism of a von Neumann algebra we shall always mean an identity preserving homomorphism commuting with the $*$-operation, and analogously for isomorphisms and endomorphisms. Isomorphisms between von Neumann algebras are automatically normal. By a representation of $M$ on a Hilbert space $\mathcal{K}$ we mean a homomorphism of $M$ into $B(\mathcal{K})$.

A state $\omega$ on the von Neumann algebra $M$ is a positive linear functional on $M$ with the normalization $\omega(1) = 1$. The relevant states for von Neumann algebras are the normal states. By the GNS construction, every normal state of $M$ is given by $\omega(x) = (\pi(x)\xi, \xi)$, where $\pi$ is a normal representation of $M$ on a Hilbert space $\mathcal{K}$ and $\xi \in \mathcal{K}$ is cyclic (i.e. $\pi(M)\xi$ is dense in $\mathcal{K}$, see below). Given $\omega$, the triple $(\mathcal{K}, \pi, \xi)$ is unique up to unitary equivalence.

A factor is a von Neumann algebra with trivial center, namely $M \cap M' = \mathbb{C}$. We note that a factor is a (topologically) simple algebra, i.e., every weakly closed ideal of the factor is either trivial or equal to the factor itself. If $M$ is a factor (and $\mathcal{K}$ is separable), a representation of $M$ on $\mathcal{K}$ is automatically normal.

A factor $M$ is finite if there exists a tracial state $\omega$ on $M$, namely $\omega(xy) = \omega(yx)$, $x, y \in M$ (automatically normal and unique). Otherwise $M$ is called an infinite factor. For a factor $M$, the following are equivalent:
• $M$ is infinite;
• $M$ is isomorphic to $M \otimes B(\mathcal{K})$, with $\mathcal{K}$ a separable infinite dimensional Hilbert space;
• $M$ contains a non-unitary isometry (an isometry $v$ is an operator with the property $v^*v = 1$);
• $M$ contains a non-degenerate Hilbert space $H$ of isometries with arbitrary dimension (but separable).

Here by a Hilbert space of isometries $H$ in $M$ we mean a norm closed linear subspace $H \subset M$ such that $x^*y \in \mathbb{C}$ for all $x, y \in H$. Thus $x, y \mapsto y^*x$ is a scalar product on $H$. Then, if $L$ is a set with $\{v_\ell, \ell \in L\}$ an orthonormal basis for $H$, we have $v_\ell^* v_{\ell'} = \delta_{\ell\ell'}$, namely the $v_\ell$'s are isometries of $H$ with pairwise orthogonal range projections. $H$ is non-degenerate if the left support of $H$ is 1, that is, the final projections form a partition of the identity: $\sum_{\ell \in L} v_\ell v_\ell^* = 1$.

A factor $M$ is of type III (or purely infinite) if every non-zero projection $e \in M$ is equivalent to the identity, namely there exists an isometry $v \in M$ with $v^*v = 1$ and $vv^* = e$. As we shall see, factors appearing in CFT as local algebras are of type III and the reader may focus on this case for the needs of this paper. A semifinite factor is a factor $M$ isomorphic to $M_0 \otimes B(\mathcal{K})$ with $M_0$ a finite factor and $\mathcal{K}$ a Hilbert space. Semifinite factors are characterized by the existence of a normal, possibly unbounded, trace (that we do not define here). A factor is either semifinite or of type III.
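The simplest finite factors are the full matrix algebras $M_n(\mathbb{C})$, whose tracial state is the normalized trace. The sketch below is only an illustration of the definition in this elementary case; the helper names (`matmul`, `adjoint`, `tau`) are hypothetical. It checks the trace property $\omega(xy) = \omega(yx)$, the normalization $\omega(1) = 1$ and the positivity $\omega(x^*x) \geq 0$ on $2 \times 2$ matrices.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def adjoint(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def tau(a):
    """Normalized trace tr(a)/n, the tracial state on n x n matrices."""
    n = len(a)
    return sum(a[i][i] for i in range(n)) / n


# Two sample elements of M_2(C) with small integer entries (exact in floating point).
x = [[1, 2j], [3, 4]]
y = [[0, 1], [1j, 2]]
identity = [[1, 0], [0, 1]]

trace_xy = tau(matmul(x, y))
trace_yx = tau(matmul(y, x))
positivity = tau(matmul(adjoint(x), x))
```

Here `trace_xy` and `trace_yx` agree although $xy \neq yx$, and `positivity` is the sum of the squared moduli of the entries of $x$ divided by 2.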
A factor $M$ of type III has only one representation (on a separable Hilbert space) up to unitary equivalence. Namely, if $\pi : M \to B(\mathcal{K})$ is a representation, there exists a unitary $U : \mathcal{H} \to \mathcal{K}$ such that $\pi(x) = UxU^*$, $x \in M$, where $\mathcal{H}$ is the underlying Hilbert space of $M$.

Refs: [40].

2.2. Tomita-Takesaki modular theory. Let $M$ be a von Neumann algebra and $\omega$ a normal faithful state on $M$. By the GNS construction, we may assume that $\omega = (\,\cdot\,\Omega, \Omega)$ with $\Omega$ a cyclic and separating vector ($M$ acts standardly). Here a vector $\Omega$ is cyclic if $\overline{M\Omega} = \mathcal{H}$ and separating if $x \in M$, $x\Omega = 0$ implies $x = 0$. A vector is cyclic for $M$ iff it is separating for $M'$. The anti-linear operator $x\Omega \mapsto x^*\Omega$, $x \in M$, is closable and its closure is denoted by $S$. The polar decomposition $S = J\Delta^{1/2}$ gives an antiunitary involution $J$, the modular involution, and a positive non-singular linear operator $\Delta \equiv S^*S$, the modular operator. We have
\[
\Delta^{it} M \Delta^{-it} = M, \tag{1}
\]
\[
JMJ = M', \tag{2}
\]
in other words the modular theory associates with $\Omega$ a canonical “evolution”, i.e. a one-parameter group of modular automorphisms of $M$, $\sigma_t^\omega \equiv \mathrm{Ad}\,\Delta^{it}$, and an anti-isomorphism $\mathrm{Ad}J$ of $M$ with $M'$.

Let $N \subset M$ be an inclusion of von Neumann algebras. We always assume that $N$ and $M$ have the same identity. A conditional expectation $\varepsilon : M \to N$ is a positive, unital map from $M$ onto $N$ such that $\varepsilon(n_1 x n_2) = n_1 \varepsilon(x) n_2$, $x \in M$, $n_1, n_2 \in N$. If $\omega$ is a faithful normal state of $M$, by the Takesaki theorem there exists a normal conditional expectation $\varepsilon : M \to N$ preserving $\omega$ (i.e. $\omega \cdot \varepsilon = \omega$) if and only if $N$ is globally invariant under the modular group $\sigma^\omega$ of $M$. If $\rho$ is an endomorphism of $M$ and $\varepsilon : M \to \rho(M)$ is a conditional expectation, the map $\varphi \equiv \rho^{-1} \cdot \varepsilon$ satisfies $\varphi \cdot \rho = \mathrm{id}$ and is called a left inverse of $\rho$.

Refs: [40].

2.3. Jones index. Let $N \subset M$ be an inclusion of factors. The index of $N$ in $M$ can be defined from different points of view: analytic, probabilistic or tensor categorical.

Analytic definition. The index was originally considered by Jones in the setting of finite factors. Assume $M$ to be finite and let $\omega$ be the faithful tracial state on $M$. As above we may assume that $\omega$ is the vector state given by the vector $\Omega$. With $e$ the projection onto $\overline{N\Omega}$, the von Neumann algebra generated by $M$ and $e$,
\[
M_1 = \{M, e\}'' = J_M N' J_M,
\]
is a semifinite factor. $N \subset M$ has finite index iff $M_1$ is finite and the index is then defined by $\lambda = \omega(e)^{-1}$ with $\omega$ also denoting the tracial state of $M_1$. The Jones theorem shows the possible values for the index:
\[
\lambda \in \left\{ 4\cos^2\frac{\pi}{n},\ n \geq 3 \right\} \cup [4, \infty].
\]
If $N \subset M$ is an inclusion of finite factors, there exists a unique trace-preserving conditional expectation $\varepsilon : M \to N$ ($\sigma^\omega$ is trivial in this case).
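The set of index values allowed by the Jones theorem is easy to tabulate. The following sketch uses hypothetical helper names (`jones_index`, `is_allowed_index`) and a numerical tolerance that is an assumption of this illustration, not part of the theorem. It computes the discrete values $4\cos^2(\pi/n)$ and tests whether a given number lies in the discrete series or in the continuous part $[4, \infty]$.

```python
import math


def jones_index(n):
    """Discrete Jones value 4*cos^2(pi/n), defined for integers n >= 3."""
    if n < 3:
        raise ValueError("n must be at least 3")
    return 4 * math.cos(math.pi / n) ** 2


def is_allowed_index(value, tol=1e-9):
    """Check membership in {4 cos^2(pi/n), n >= 3} union [4, infinity], up to tol."""
    if value >= 4 - tol:
        return True
    if value < 1 - tol:
        return False
    # Solve 4*cos^2(pi/n) = value for n and test the nearest integer.
    angle = math.acos(min(1.0, math.sqrt(max(value, 0.0)) / 2))
    n = round(math.pi / angle)
    return n >= 3 and abs(jones_index(n) - value) < tol


# The first few discrete values: 1, 2, (3 + sqrt(5))/2, 3, ...
first_values = [jones_index(n) for n in range(3, 7)]
```

The discrete values increase to 4 as $n \to \infty$, so every number strictly between two consecutive values, such as 2.5, is excluded.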
A definition for the index $[M : N]_\varepsilon$ of an arbitrary inclusion of factors $N \subset M$ with a faithful normal conditional expectation $\varepsilon : M \to N$ was given by Kosaki using Connes-Haagerup dual weights. It depends on the choice of $\varepsilon$. Given $\varepsilon$, choose a normal faithful state $\omega$ of $M$ with $\omega \cdot \varepsilon = \omega$ and a cyclic vector $\Omega$ implementing $\omega$. If $[M : N]_\varepsilon < \infty$, it is possible to define a canonical expectation $\varepsilon' : M_1 \to M$ and then $[M : N]_\varepsilon = \varepsilon'(e)^{-1}$, with $e$ the projection onto $\overline{N\Omega}$. The Jones restriction on the index values holds for $[M : N]_\varepsilon$ as well. The good properties are shared by the minimal index
\[
[M : N] = \inf_\varepsilon\, [M : N]_\varepsilon = [M : N]_{\varepsilon_0},
\]
where $\varepsilon_0$ is the unique minimal conditional expectation. The analytic point of view will not play an explicit role in this paper.

Probabilistic definition. The Pimsner and Popa inequality, and its extension to the infinite factor case, shows that $\lambda \equiv [M : N]_\varepsilon^{-1}$ is the best constant such that
\[
\varepsilon(x) \geq \lambda x, \qquad x \in M_+,
\]
where $\varepsilon : M \to N$ is a normal conditional expectation (if $M$ is finite-dimensional $\lambda$ is not an optimal bound). This gives a general way to define the index and a powerful tool to check whether a given inclusion has finite index.

Tensor categorical definition. We shall get to this point in a moment.

Refs: [16, 22, 25, 28, 31, 35, 40] and references therein.

2.4. Joint modular structure. Sectors. Let $N \subset M$ be an inclusion of infinite factors. We may assume that $N'$ and $M'$ are infinite so $M$ and $N$ have a cyclic and separating vector. With $J_N$ and $J_M$ modular conjugations of $N$ and $M$, the unitary $\Gamma = J_N J_M$ implements a canonical endomorphism of $M$ into $N$,
\[
\gamma(x) = \Gamma x \Gamma^*, \qquad x \in M.
\]
$\gamma$ depends on the choice of $J_N$ and $J_M$ only up to perturbations by an inner automorphism of $M$ associated with a unitary in $N$. The restriction $\gamma|_N$ is called the dual canonical endomorphism (it is the canonical endomorphism associated with $\gamma(M) \subset N$). $\gamma$ is canonical as a sector of $M$ as we define now.

Given the infinite factor $M$, the sectors of $M$ are given by
\[
\mathrm{Sect}(M) = \mathrm{End}(M)/\mathrm{Inn}(M),
\]
namely $\mathrm{Sect}(M)$ is the quotient of the semigroup of the endomorphisms of $M$ modulo the equivalence relation: $\rho, \rho' \in \mathrm{End}(M)$, $\rho \sim \rho'$ iff there is a unitary $u \in M$ such that $\rho'(x) = u\rho(x)u^*$ for all $x \in M$. $\mathrm{Sect}(M)$ is a $*$-semiring (there are an addition, a product and an involution) equivalent to the Connes correspondences (bimodules) on $M$ up to unitary equivalence. If $\rho$ is an element of $\mathrm{End}(M)$ we shall denote by $[\rho]$ its class in $\mathrm{Sect}(M)$. The operations are:
Addition (direct sum). Let $\rho_1, \rho_2, \ldots, \rho_n \in \mathrm{End}(M)$. Choose a non-degenerate $n$-dimensional Hilbert space $H$ of isometries in $M$ and a basis $v_1, \ldots, v_n$ for $H$. Then
\[
\rho(x) \equiv \sum_{i=1}^{n} v_i \rho_i(x) v_i^*, \qquad x \in M,
\]
is an endomorphism of $M$. The definition of the direct sum endomorphism $\rho$ does not depend on the choice of $H$ or on the basis, up to inner automorphism of $M$, namely $\rho$ is a well-defined sector of $M$.

Composition (monoidal product). The usual composition of maps
\[
\rho_1 \cdot \rho_2(x) = \rho_1(\rho_2(x)), \qquad x \in M,
\]
defined on $\mathrm{End}(M)$ passes to the quotient $\mathrm{Sect}(M)$.

Conjugation. With $\rho \in \mathrm{End}(M)$, choose a canonical endomorphism $\gamma_\rho : M \to \rho(M)$. Then $\bar\rho = \rho^{-1} \cdot \gamma_\rho$ well-defines a conjugation in $\mathrm{Sect}(M)$. By definition we thus have
\[
\gamma_\rho = \rho \cdot \bar\rho. \tag{3}
\]

Refs: [15, 22, 26] and references therein.

2.5. The tensor category End(M). With $M$ an infinite factor, $\mathrm{End}(M)$ is a strict tensor $C^*$-category, as is already implicit in the previous section. More precisely, define a category $\mathrm{End}(M)$ whose objects are the elements of $\mathrm{End}(M)$ and whose arrows $\mathrm{Hom}(\rho, \rho')$ between the objects $\rho, \rho'$ are
\[
\mathrm{Hom}(\rho, \rho') \equiv \{a \in M : a\rho(x) = \rho'(x)a\ \forall x \in M\}.
\]
The composition of intertwiners (arrows) is the operator product. Clearly $\mathrm{Hom}(\rho, \rho')$ is a Banach space and there is a $*$-operation $a \in \mathrm{Hom}(\rho, \rho') \mapsto a^* \in \mathrm{Hom}(\rho', \rho)$ with the usual properties and the $C^*$-norm equality $\|a^*a\| = \|a\|^2$. Thus $\mathrm{End}(M)$ is a $C^*$-category.

Moreover there is a tensor (or monoidal) product in $\mathrm{End}(M)$. The tensor product $\rho \otimes \rho'$ is simply the composition $\rho\rho'$. For simplicity the symbol $\otimes$ is thus omitted in this case: $\rho \otimes \rho' = \rho\rho'$. If $\sigma, \sigma' \in \mathrm{End}(M)$, and $t \in \mathrm{Hom}(\rho, \rho')$, $s \in \mathrm{Hom}(\sigma, \sigma')$, the tensor product arrow $t \otimes s$ is the element of $\mathrm{Hom}(\rho \otimes \sigma, \rho' \otimes \sigma')$ given by
\[
t \otimes s \equiv t\rho(s) = \rho'(s)t.
\]
As usual, there is a natural compatibility between tensor product and composition, thus $\mathrm{End}(M)$ is a $C^*$-tensor category. Moreover there is an identity object $\iota$ for the tensor product (the identity automorphism).

So far we have not made much use of the fact that $M$ is an infinite factor. This enters crucially for the conjugation in $\mathrm{End}(M)$. If $\rho$ is irreducible (i.e. $\rho(M)' \cap M = \mathbb{C}$) and has finite index, then $\bar\rho$ is the unique sector such that $\rho\bar\rho$ contains the identity sector. More generally the objects $\rho, \bar\rho \in \mathrm{End}(M)$
are conjugate according to the analytic definition and have finite index if and only if there exist isometries $v \in \mathrm{Hom}(\iota, \rho\bar\rho)$ and $\bar v \in \mathrm{Hom}(\iota, \bar\rho\rho)$ such that
\[
(\bar v^* \otimes 1_{\bar\rho}) \cdot (1_{\bar\rho} \otimes v) \equiv \bar v^* \bar\rho(v) = \frac{1}{d}, \qquad
(v^* \otimes 1_\rho) \cdot (1_\rho \otimes \bar v) \equiv v^* \rho(\bar v) = \frac{1}{d},
\]
for some $d > 0$. The minimal possible value of $d$ in the above formulas is the dimension $d(\rho)$ of $\rho$; it is related to the minimal index by $[M : \rho(M)] = d(\rho)^2$ (tensor categorical definition of the index) and satisfies the dimension properties
\[
d(\rho_1 \oplus \rho_2) = d(\rho_1) + d(\rho_2), \qquad d(\rho_1\rho_2) = d(\rho_1)d(\rho_2), \qquad d(\bar\rho) = d(\rho).
\]
It follows that the subcategory of $\mathrm{End}(M)$ having finite-index objects is a $C^*$-tensor category with conjugates and direct sums.

Formula (3) shows that given $\gamma \in \mathrm{End}(M)$ the problem of deciding whether it is a canonical endomorphism with respect to some subfactor is essentially the problem of finding a “square root” $\rho$. $\gamma$ is canonical and has finite index iff there exist isometries $t \in \mathrm{Hom}(\iota, \gamma)$, $s \in \mathrm{Hom}(\gamma, \gamma^2)$ satisfying the algebraic relations
\[
s^*s^* = s^*\gamma(s^*), \tag{4}
\]
\[
s^*\gamma(t) \in \mathbb{C}\setminus\{0\}, \qquad s^*t \in \mathbb{C}\setminus\{0\}. \tag{5}
\]
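The additivity and multiplicativity of the dimension already determine $d(\rho)$ in simple situations. As a hedged illustration that is not taken from this paper, suppose a finite-index sector satisfies the hypothetical fusion rule $\rho^2 = a\,\iota \oplus b\,\rho$ (multiplicities $a$ and $b$). Then $d(\rho)^2 = a + b\,d(\rho)$, and the index $[M : \rho(M)] = d(\rho)^2$ follows; for $a = b = 1$ it equals the Jones value $4\cos^2(\pi/5)$. The function name `dimension_from_fusion` is an assumption of this sketch.

```python
import math


def dimension_from_fusion(a, b):
    """Positive root d of d^2 = a + b*d.

    Models d(rho) for a sector with the (assumed) fusion rule
    rho^2 = a * iota + b * rho, using additivity and multiplicativity of d.
    """
    return (b + math.sqrt(b * b + 4 * a)) / 2


# Example: rho^2 = iota + rho gives the golden ratio as dimension.
d = dimension_from_fusion(1, 1)
index = d ** 2  # tensor categorical index [M : rho(M)] = d(rho)^2
```

With $a = 1$, $b = 0$ (an automorphism of order two) the same formula returns $d = 1$, consistent with automorphisms having dimension one.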
It is straightforward to generalize the notion of $\mathrm{Sect}(M)$ to $\mathrm{Sect}(M, N)$, for a pair of factors $M, N$. Its elements are the homomorphisms of $M$ into $N$ up to unitary equivalence given by a unitary in $N$. If $N \subset M$ is an inclusion of infinite factors, the canonical endomorphism $\gamma : M \to N$ is a well defined element of $\mathrm{Sect}(M, N)$; if $[M : N] < \infty$, the above formulas show that $\gamma$ is the conjugate sector of the inclusion homomorphism $\iota_N : N \to M$:
\[
\gamma = \bar\iota_N \iota_N, \qquad \gamma|_N = \iota_N \bar\iota_N.
\]
We use $\langle\lambda, \mu\rangle$ to denote the dimension of $\mathrm{Hom}(\lambda, \mu)$; it can be $\infty$, but it is finite if $\lambda, \mu$ have finite index. $\langle\lambda, \mu\rangle$ depends only on $[\lambda]$ and $[\mu]$. Moreover, if $\nu$ has finite dimension, then $\langle\nu\lambda, \mu\rangle = \langle\lambda, \bar\nu\mu\rangle$ and $\langle\lambda\nu, \mu\rangle = \langle\lambda, \mu\bar\nu\rangle$, which follows from Frobenius duality. $\mu$ is a subsector of $\lambda$ if there is an isometry $v \in M$ such that $\mu(x) = v^*\lambda(x)v$, $\forall x \in M$. We will also use the following notation: if $\mu$ is a subsector of $\lambda$, we will write it as $\mu \prec \lambda$ or $\lambda \succ \mu$. A sector is said to be irreducible if it has only one subsector.

Refs: [7, 29, 32] and references therein.

3. Conformal Nets on $S^1$

By an interval of the circle we mean an open connected non-empty subset $I$ of $S^1$ such that the interior of its complement $I'$ is not empty. We denote by $\mathcal{I}$ the family of all intervals of $S^1$. A net $\mathcal{A}$ of von Neumann algebras on $S^1$ is a map
\[
I \in \mathcal{I} \mapsto \mathcal{A}(I) \subset B(\mathcal{H})
\]
from $\mathcal{I}$ to von Neumann algebras on a fixed Hilbert space $\mathcal{H}$ that satisfies:
A. Isotony. If $I_1 \subset I_2$ belong to $\mathcal{I}$, then $\mathcal{A}(I_1) \subset \mathcal{A}(I_2)$.

If $E \subset S^1$ is any region, we shall put $\mathcal{A}(E) \equiv \bigvee_{E \supset I \in \mathcal{I}} \mathcal{A}(I)$ with $\mathcal{A}(E) = \mathbb{C}$ if $E$ has empty interior (the symbol $\vee$ denotes the von Neumann algebra generated). The net $\mathcal{A}$ is called local if it satisfies:

B. Locality. If $I_1, I_2 \in \mathcal{I}$ and $I_1 \cap I_2 = \emptyset$ then $[\mathcal{A}(I_1), \mathcal{A}(I_2)] = \{0\}$, where brackets denote the commutator.

The net $\mathcal{A}$ is called Möbius covariant if in addition it satisfies the following properties C, D, E, F:

C. Möbius covariance. There exists a strongly continuous unitary representation $U$ of the Möbius group Möb (isomorphic to $PSU(1, 1)$) on $\mathcal{H}$ such that
\[
U(g)\mathcal{A}(I)U(g)^* = \mathcal{A}(gI), \qquad g \in \text{Möb},\ I \in \mathcal{I}.
\]

D. Positivity of the energy. The generator of the one-parameter rotation subgroup of $U$ (conformal Hamiltonian) is positive.

E. Existence of the vacuum. There exists a unit $U$-invariant vector $\Omega \in \mathcal{H}$ (vacuum vector), and $\Omega$ is cyclic for the von Neumann algebra $\bigvee_{I \in \mathcal{I}} \mathcal{A}(I)$.

By the Reeh-Schlieder theorem $\Omega$ is cyclic and separating for every fixed $\mathcal{A}(I)$. The modular objects associated with $(\mathcal{A}(I), \Omega)$ have a geometric meaning:
\[
\Delta_I^{it} = U(\Lambda_I(2\pi t)), \qquad J_I = U(r_I).
\]
Here $\Lambda_I$ is a canonical one-parameter subgroup of Möb and $U(r_I)$ is an antiunitary acting geometrically on $\mathcal{A}$ as a reflection $r_I$ on $S^1$. This implies Haag duality:
\[
\mathcal{A}(I)' = \mathcal{A}(I'), \qquad I \in \mathcal{I},
\]
where $I'$ is the interior of $S^1 \setminus I$.

F. Irreducibility. $\bigvee_{I \in \mathcal{I}} \mathcal{A}(I) = B(\mathcal{H})$. Indeed $\mathcal{A}$ is irreducible iff $\Omega$ is the unique $U$-invariant vector (up to scalar multiples). Also $\mathcal{A}$ is irreducible iff the local von Neumann algebras $\mathcal{A}(I)$ are factors. In this case they are III$_1$-factors in Connes' classification of type III factors (unless $\mathcal{A}(I) = \mathbb{C}$ for all $I$).

By a conformal net (or diffeomorphism covariant net) $\mathcal{A}$ we shall mean a Möbius covariant net such that the following holds:

G. Conformal covariance. There exists a projective unitary representation $U$ of $\mathrm{Diff}(S^1)$ on $\mathcal{H}$ extending the unitary representation of Möb such that for all $I \in \mathcal{I}$ we have
\[
U(g)\mathcal{A}(I)U(g)^* = \mathcal{A}(gI), \qquad g \in \mathrm{Diff}(S^1),
\]
\[
U(g)xU(g)^* = x, \qquad x \in \mathcal{A}(I),\ g \in \mathrm{Diff}(I'),
\]
where $\mathrm{Diff}(S^1)$ denotes the group of smooth, positively oriented diffeomorphisms of $S^1$ and $\mathrm{Diff}(I)$ the subgroup of diffeomorphisms $g$ such that $g(z) = z$ for all $z \in I'$.

Let $G$ be a simply connected compact Lie group. By Th. 3.2 of [9], the vacuum positive energy representation of the loop group $LG$ (cf. [36]) at level $k$ gives rise to an irreducible conformal net denoted by $\mathcal{A}_{G_k}$. By Th. 3.3 of [9], every irreducible positive energy representation of the loop group $LG$ at level $k$ gives rise to an irreducible covariant representation of $\mathcal{A}_{G_k}$.
3.1. Doplicher-Haag-Roberts superselection sectors in CQFT. The DHR theory was originally formulated on the 4-dimensional Minkowski spacetime, but can be generalized to our setting. There are however several important structural differences in the low dimensional case.

A (DHR) representation π of A on a Hilbert space H is a map $I \in \mathcal{I} \to \pi_I$ that associates to each I a normal representation of A(I) on B(H) such that
$$\pi_{\tilde I}\restriction A(I) = \pi_I, \qquad I \subset \tilde I, \quad I, \tilde I \in \mathcal{I}.$$
π is said to be Möbius (resp. diffeomorphism) covariant if there is a projective unitary representation $U_\pi$ of Möb (resp. $\mathrm{Diff}^{(\infty)}(S^1)$, the infinite cover of Diff($S^1$)) on H such that
$$\pi_{gI}(U(g)xU(g)^*) = U_\pi(g)\pi_I(x)U_\pi(g)^*$$
for all $I \in \mathcal{I}$, x ∈ A(I) and g ∈ Möb (resp. $g \in \mathrm{Diff}^{(\infty)}(S^1)$). Note that if π is irreducible and diffeomorphism covariant then U is indeed a projective unitary representation of Diff($S^1$).

By definition the irreducible conformal net is in fact an irreducible representation of itself and we will call this representation the vacuum representation. Given an interval I and a representation π of A, there is an endomorphism ρ of A localized in I equivalent to π; namely ρ is a representation of A on the vacuum Hilbert space H, unitarily equivalent to π, such that $\rho_{I'} = \mathrm{id}\restriction A(I')$.

Fix an interval $I_0$ and endomorphisms ρ, ρ′ of A localized in $I_0$. Then the composition (tensor product) ρρ′ is defined by $(\rho\rho')_I = \rho_I\rho'_I$ with I an interval containing $I_0$. One can indeed define $(\rho\rho')_I$ for an arbitrary interval I of $S^1$ (by using covariance) and get a well defined endomorphism of A localized in $I_0$. If π and π′ are representations of A, fix an interval $I_0$ and choose endomorphisms ρ, ρ′ localized in $I_0$ with ρ equivalent to π and ρ′ equivalent to π′. Then π · π′ is defined (up to unitary equivalence) to be ρρ′. The class of a DHR representation modulo unitary equivalence is a superselection sector (or simply a sector). Indeed the localized endomorphisms of A form a tensor C∗-category. For our needs, ρ, ρ′ will always be localized in a common interval I.

We now define the statistics. Given the endomorphism ρ of A localized in $I \in \mathcal{I}$, choose an equivalent endomorphism $\rho_0$ localized in an interval $I_0 \in \mathcal{I}$ with $\bar I_0 \cap \bar I = \emptyset$ and let u be a local intertwiner in Hom(ρ, $\rho_0$) as above, namely $u \in \mathrm{Hom}(\rho_{\tilde I}, \rho_{0,\tilde I})$ with $I_0$ following clockwise I inside $\tilde I$ which is an interval containing both I and $I_0$.

The statistics operator $\varepsilon := u^*\rho(u) = u^*\rho_{\tilde I}(u)$ belongs to $\mathrm{Hom}(\rho^2_{\tilde I}, \rho^2_{\tilde I})$. An elementary computation shows that it gives rise to a presentation of the Artin braid group
$$\varepsilon_i\varepsilon_{i+1}\varepsilon_i = \varepsilon_{i+1}\varepsilon_i\varepsilon_{i+1}, \qquad \varepsilon_i\varepsilon_{i'} = \varepsilon_{i'}\varepsilon_i \quad \text{if } |i - i'| \ge 2,$$
where $\varepsilon_i = \rho^{i-1}(\varepsilon)$. The (unitary equivalence class of the) representation of the Artin braid group thus obtained is the statistics of the superselection sector ρ. It turns out the endomorphisms localized in a given interval form a braided C∗-tensor category with unitary braiding.
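The braid relations above can be checked numerically in a finite-dimensional toy model. The following is only a minimal sketch, not the DHR construction: it assumes a hypothetical stand-in in which the statistics operator is a matrix R on $\mathbb{C}^d \otimes \mathbb{C}^d$ and the shift $\rho^{i-1}$ is replaced by placing R on tensor factors (i, i+1). The helper names `flip`, `braid_generator` and `satisfies_braid_relations` are illustrative.

```python
import numpy as np

def flip(d):
    """Flip operator on C^d (x) C^d: e_a (x) e_b -> e_b (x) e_a."""
    F = np.zeros((d * d, d * d))
    for a in range(d):
        for b in range(d):
            F[b * d + a, a * d + b] = 1.0
    return F

def braid_generator(R, d, n, i):
    """Place R on tensor factors (i, i+1) of (C^d)^{(x) n}, 1 <= i <= n-1.

    This plays the role of eps_i = rho^{i-1}(eps) in the toy model.
    """
    left = np.eye(d ** (i - 1))
    right = np.eye(d ** (n - i - 1))
    return np.kron(np.kron(left, R), right)

def satisfies_braid_relations(R, d, n):
    """Check eps_i eps_{i+1} eps_i = eps_{i+1} eps_i eps_{i+1} and far commutativity."""
    gens = [braid_generator(R, d, n, i) for i in range(1, n)]
    for a, b in zip(gens, gens[1:]):
        if not np.allclose(a @ b @ a, b @ a @ b):
            return False
    for i in range(len(gens)):
        for j in range(i + 2, len(gens)):
            if not np.allclose(gens[i] @ gens[j], gens[j] @ gens[i]):
                return False
    return True

if __name__ == "__main__":
    print(satisfies_braid_relations(flip(2), 2, 4))
    print(satisfies_braid_relations(np.exp(0.3j) * flip(2), 2, 4))
    print(satisfies_braid_relations(np.diag([1.0, 1.0, 1.0, -1.0]), 2, 3))
```

The flip, and any phase multiple of it, satisfies the relations, while a generic unitary such as the diagonal matrix in the last call does not.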
The statistics parameter $\lambda_\rho$ can be defined in general. In particular, assume ρ to be localized in I and $\rho_I \in \mathrm{End}(A(I))$ to be irreducible with a conditional expectation $E : A(I) \to \rho_I(A(I))$, then $\lambda_\rho := E(\varepsilon)$ depends only on the superselection sector of ρ. The statistical dimension $d_{DHR}(\rho)$ and the univalence $\omega_\rho$ are then defined by
$$d_{DHR}(\rho) = |\lambda_\rho|^{-1}, \qquad \omega_\rho = \frac{\lambda_\rho}{|\lambda_\rho|}.$$
Refs: [7, 8, 25, 26, 31].

3.2. Index-statistics and spin-statistics relations. Let ρ be an endomorphism localized in the interval I. A natural connection between the Jones and DHR theories is realized by the index-statistics theorem
$$\mathrm{Ind}(\rho) = d_{DHR}(\rho)^2 .$$
Here Ind(ρ) is Ind($\rho_I$); namely $d(\rho_I) = d_{DHR}(\rho)$. We will thus omit the suffix DHR in the dimension. Since by duality $\rho(A(I)) \subset A(I)$ coincides with $\rho(A(I)) \subset \rho(A(I'))'$, one may rewrite the above index formula directly in terms of the representation ρ. The map $\rho \to \rho_I$ is a faithful functor of C∗-tensor categories of the endomorphisms of A localized in I into End(M) with M ≡ A(I). Passing to the quotient one obtains a natural embedding Superselection sectors → Sect(M). Restricting to finite-dimensional endomorphisms, the above functor is full, namely, given endomorphisms ρ, ρ′ localized in I, if $a \in \mathrm{Hom}(\rho_I, \rho'_I)$ then a intertwines the representations ρ and ρ′ (this is obviously true also in the infinite-dimensional case if there holds the strong additivity property below, but otherwise a non-trivial result). The conformal spin-statistics theorem shows that
$$\omega_\rho = e^{i2\pi L_0(\rho)},$$
where $L_0(\rho)$ is the conformal Hamiltonian (the generator of the rotation subgroup) in the representation ρ. The right-hand side in the above equality is called the univalence of ρ. Refs: [11, 25].

3.3. Genus 0 S, T-matrices. Next we will recall some of the results of [37] and introduce notations. Let {[λ], λ ∈ L} be a finite set of all equivalence classes of irreducible, covariant, finite-index representations of an irreducible local conformal net A. We will denote the conjugate of [λ] by $[\bar\lambda]$ and the identity sector (corresponding to the vacuum representation) by [1] if no confusion arises, and let $N_{\lambda\mu}^{\nu} = \langle[\lambda][\mu], [\nu]\rangle$. Here $\langle\mu, \nu\rangle$ denotes the dimension of the space of intertwiners from µ to ν (denoted by Hom(µ, ν)). We will denote
by $\{T_e\}$ a basis of isometries in Hom(ν, λµ). The univalence of λ and the statistical dimension of λ (cf. §2 of [10]) will be denoted by $\omega_\lambda$ and d(λ) (or $d_\lambda$) respectively. Let $\varphi_\lambda$ be the unique minimal left inverse of λ, define:
$$Y_{\lambda\mu} := d(\lambda)d(\mu)\,\varphi_\mu\big(\varepsilon(\mu, \lambda)^*\varepsilon(\lambda, \mu)^*\big), \qquad (6)$$
where ε(µ, λ) is the unitary braiding operator (cf. [10]). We list two properties of $Y_{\lambda\mu}$ (cf. (5.13), (5.14) of [37]) which will be used in the following:

Lemma 3.1.
$$Y_{\lambda\mu} = Y_{\mu\lambda} = Y^*_{\lambda\bar\mu} = Y_{\bar\lambda\bar\mu}, \qquad Y_{\lambda\mu} = \sum_{\nu} N_{\lambda\mu}^{\nu}\,\frac{\omega_\lambda\omega_\mu}{\omega_\nu}\, d(\nu).$$

We note that one may take the second equation in the above lemma as the definition of $Y_{\lambda\mu}$. Define $a := \sum_i d_{\rho_i}^2\,\omega_{\rho_i}^{-1}$. If the matrix $(Y_{\mu\nu})$ is invertible, by Proposition on p. 351 of [37] a satisfies $|a|^2 = \sum_\lambda d(\lambda)^2$.

Definition 3.2. Let $a = |a| \exp(-2\pi i\, c_0/8)$, where $c_0 \in \mathbb{R}$ and $c_0$ is well defined mod $8\mathbb{Z}$. Define matrices
$$S := |a|^{-1} Y, \qquad T := C\,\mathrm{Diag}(\omega_\lambda), \qquad (7)$$
where
$$C := \exp\Big(-2\pi i\,\frac{c_0}{24}\Big).$$

Then these matrices satisfy (cf. [37]):

Lemma 3.3.
$$SS^\dagger = TT^\dagger = \mathrm{id}, \qquad STS = T^{-1}ST^{-1}, \qquad S^2 = \hat C, \qquad T\hat C = \hat C T,$$
where $\hat C_{\lambda\mu} = \delta_{\lambda\bar\mu}$ is the conjugation matrix. Moreover
$$N_{\lambda\mu}^{\nu} = \sum_{\delta} \frac{S_{\lambda\delta}\, S_{\mu\delta}\, S^*_{\nu\delta}}{S_{1\delta}} \qquad (8)$$
is known as the Verlinde formula. We will refer to the S, T matrices as defined above as genus 0 modular matrices of A since they are constructed from the fusion rules, monodromies and minimal indices which can be thought of as genus 0 chiral data associated to a Conformal Field Theory.
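As a numerical illustration of Lemma 3.3, the Verlinde formula (8) and the relation $|a|^2 = \sum_\lambda d(\lambda)^2$, here is a minimal sketch. It does not build S and T from a net; it assumes the well-known closed-form modular data of SU(2) at level k (labels a = 0, ..., k, conformal weights a(a+2)/(4(k+2)), central charge 3k/(k+2)). The function names are hypothetical.

```python
import numpy as np

def su2_modular_data(k):
    """Standard S, T matrices of SU(2) at level k (assumed input data)."""
    a = np.arange(k + 1)
    S = np.sqrt(2.0 / (k + 2)) * np.sin(np.pi * np.outer(a + 1, a + 1) / (k + 2))
    h = a * (a + 2) / (4.0 * (k + 2))       # conformal weights
    c = 3.0 * k / (k + 2)                   # central charge
    omega = np.exp(2j * np.pi * h)          # univalences omega_lambda
    T = np.exp(-2j * np.pi * c / 24.0) * np.diag(omega)
    return S, T, omega, c

def verlinde(S):
    """N[l, m, n] = sum_d S[l,d] S[m,d] conj(S[n,d]) / S[0,d], label 0 = vacuum."""
    return np.einsum('ld,md,nd->lmn', S, S, np.conj(S) / S[0]).real

def gauss_sum_data(S, omega):
    """Return (c_0 mod 8, |a|^2, sum of d^2) for a = sum_i d_i^2 / omega_i."""
    d = S[0] / S[0, 0]                      # statistical dimensions
    a = np.sum(d ** 2 / omega)
    c0 = (-np.angle(a) * 8.0 / (2.0 * np.pi)) % 8.0
    return c0, abs(a) ** 2, np.sum(d ** 2)

if __name__ == "__main__":
    S, T, omega, c = su2_modular_data(3)
    Tinv = np.linalg.inv(T)
    print(np.allclose(S @ T @ S, Tinv @ S @ Tinv))   # STS = T^{-1} S T^{-1}
    print(np.rint(verlinde(S)[1, 1]))                # fusion of label 1 with itself
    print(gauss_sum_data(S, omega), c)
```

In this family the value of $c_0$ recovered from the phase of a agrees with the central charge, in line with the conjecture $c_0 - c \in 8\mathbb{Z}$ stated in the text.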
Let c be the central charge associated with the projective representations of Diff($S^1$) of the conformal net A (cf. [33]). We conjecture that $c_0 - c \in 8\mathbb{Z}$ is true in general. We will prove in Lemma 9.7 that $c_0 - c \in 4\mathbb{Z}$ under general conditions. The commutative algebra generated by λ's with structure constants $N_{\lambda\mu}^{\nu}$ is called the fusion algebra of A. If Y is invertible, it follows from Lemma 3.3, (8) that any nontrivial irreducible representation of the fusion algebra is of the form $\lambda \to S_{\lambda\mu}/S_{1\mu}$ for some µ.

3.4. The orbifolds. Let A be an irreducible conformal net on a Hilbert space H and let Γ be a finite group. Let $V : \Gamma \to U(H)$ be a unitary representation of Γ on H. If $V : \Gamma \to U(H)$ is not faithful, we set $\Gamma' := \Gamma/\ker V$.

Definition 3.4. We say that Γ acts properly on A if the following conditions are satisfied: (1) For each fixed interval I and each g ∈ Γ, $\alpha_g(a) := V(g)aV(g^*) \in A(I)$, ∀a ∈ A(I); (2) For each g ∈ Γ, $V(g)\Omega = \Omega$.

We note that if Γ acts properly, then V(g), g ∈ Γ, commutes with the unitary representation U of Möb. Define $B(I) := \{a \in A(I) \mid \alpha_g(a) = a, \forall g \in \Gamma\}$ and $A^\Gamma(I) := B(I)P_0$ on $H_0$, where $H_0 := \{x \in H \mid V(g)x = x, \forall g \in \Gamma\}$ and $P_0$ is the projection from H to $H_0$. Then U restricts to a unitary representation (still denoted by U) of Möb on $H_0$. Then:

Proposition 3.5. The map $I \in \mathcal{I} \to A^\Gamma(I)$ on $H_0$ together with the unitary representation (still denoted by U) of Möb on $H_0$ is an irreducible Möbius covariant net.

The irreducible Möbius covariant net in Prop. 3.5 will be denoted by $A^\Gamma$ and will be called the orbifold of A with respect to Γ. We note that by definition $A^\Gamma = A^{\Gamma'}$.

3.5. Complete rationality. We first recall some definitions from [21]. Recall that $\mathcal{I}$ denotes the set of intervals of $S^1$. Let $I_1, I_2 \in \mathcal{I}$. We say that $I_1, I_2$ are disjoint if $\bar I_1 \cap \bar I_2 = \emptyset$, where $\bar I$ is the closure of I in $S^1$. When $I_1, I_2$ are disjoint, $I_1 \cup I_2$ is called a 1-disconnected interval in [46]. Denote by $\mathcal{I}_2$ the set of unions of 2 disjoint elements in $\mathcal{I}$.

Let A be an irreducible Möbius covariant net as in §2.1. For $E = I_1 \cup I_2 \in \mathcal{I}_2$, let $I_3 \cup I_4$ be the interior of the complement of $I_1 \cup I_2$ in $S^1$ where $I_3, I_4$ are disjoint intervals. Let
$$A(E) := A(I_1) \vee A(I_2), \qquad \hat A(E) := (A(I_3) \vee A(I_4))'.$$
Note that $A(E) \subset \hat A(E)$. Recall that a net A is split if $A(I_1) \vee A(I_2)$ is naturally isomorphic to the tensor product of von Neumann algebras $A(I_1) \otimes A(I_2)$ for any disjoint intervals $I_1, I_2 \in \mathcal{I}$. A is strongly additive if $A(I_1) \vee A(I_2) = A(I)$ where $I_1 \cup I_2$ is obtained by removing an interior point from I.

Definition 3.6 ([21]). A is said to be completely rational if A is split, strongly additive, and the index $[\hat A(E) : A(E)]$ is finite for some $E \in \mathcal{I}_2$. The value of the index $[\hat A(E) : A(E)]$ (it is independent of E by Prop. 5 of [21]) is denoted by $\mu_A$ and is called the µ-index of A. If the index $[\hat A(E) : A(E)]$ is infinity for some $E \in \mathcal{I}_2$, we define the µ-index of A to be infinity.
A formula for the µ-index of a subnet is proved in [21]. With the result on strong additivity for $A^\Gamma$ in [44], we have the complete rationality in the following theorem. Note that, by our recent results in [33], every irreducible, split, local conformal net with finite µ-index is automatically strongly additive.

Theorem 3.7. Let A be an irreducible Möbius covariant net and let Γ be a finite group acting properly on A. Suppose that A is completely rational. Then: (1) $A^\Gamma$ is completely rational or µ-rational and $\mu_{A^\Gamma} = |\Gamma'|^2\mu_A$; (2) There are only a finite number of irreducible covariant representations of $A^\Gamma$ (up to unitary equivalence), and they give rise to a unitary modular category as defined in II.5 of [39] by the construction as given in §1.7 of [48].

Suppose that A and Γ satisfy the assumptions of Th. 3.7. Then $A^\Gamma$ has only a finite number of irreducible representations $\dot\lambda$ and
$$\sum_{\dot\lambda} d(\dot\lambda)^2 = \mu_{A^\Gamma} = |\Gamma'|^2\mu_A .$$
The set of such $\dot\lambda$'s is closed under conjugation and compositions, and by Cor. 32 of [21], the Y-matrix in (6) for $A^\Gamma$ is non-degenerate, and we will denote the corresponding genus 0 modular matrices by $\dot S, \dot T$. We note that $d(\dot\lambda)$ is conjectured to be related to the asymptotic dimension of Kac-Wakimoto in [19], and one can find a precise statement of the conjecture and its consequences in [27] and in §2.3 of [50].

Denote by $\dot\lambda$ (resp. µ) the irreducible covariant representations of $A^\Gamma$ (resp. A) with finite index. Denote by $b_{\mu\dot\lambda} \in \mathbb{N} \cup \{0\}$ the multiplicity of the representation $\dot\lambda$ which appears in the restriction of the representation µ when restricting from A to $A^\Gamma$. The $b_{\mu\dot\lambda}$ are also known as the branching rules. An irreducible covariant representation $\dot\lambda$ of $A^\Gamma$ is called an untwisted representation if $b_{\mu\dot\lambda} \neq 0$ for some representation µ of A. These are representations of $A^\Gamma$ which appear as subrepresentations in the restriction of some representation of A to $A^\Gamma$. A representation is called twisted if it is not untwisted. Note that $\sum_{\dot\lambda} d(\dot\lambda)\, b_{\mu\dot\lambda} = d(\mu)|\Gamma'|$, and $b_{1\dot\lambda} = d(\dot\lambda)$. So we have
$$\sum_{\dot\lambda\ \mathrm{untwisted}} d(\dot\lambda)^2 \;\le\; |\Gamma'| + \sum_{\mu \neq 1}\Big(\sum_{\dot\lambda} d(\dot\lambda)\, b_{\mu\dot\lambda}\Big)^2 \;=\; |\Gamma'| + \sum_{\mu \neq 1} d(\mu)^2|\Gamma'|^2 \;<\; |\Gamma'|^2 + \sum_{\mu \neq 1} d(\mu)^2|\Gamma'|^2 \;=\; \mu_{A^\Gamma}$$
if $\Gamma'$ is not a trivial group, where in the last = we have used Th. 2. It follows that the set of twisted representations of $A^\Gamma$ is not empty. This fact has already been observed in a special case in [21] under the assumption that A is strongly additive. Note that this is very different from the case of cosets, cf. [47] Cor. 3.2 where it was shown that under certain conditions there are no twisted representations for the coset.

3.6. Restriction to the real line: Solitons. Denote by $\mathcal{I}_0$ the set of open, connected, non-empty, proper subsets of $\mathbb{R}$, thus $I \in \mathcal{I}_0$ iff I is an open interval or half-line (by an interval of $\mathbb{R}$ we shall always mean a non-empty open bounded interval of $\mathbb{R}$). Given a net A on $S^1$ we shall denote by $A_0$ its restriction to $\mathbb{R} = S^1 \setminus \{-1\}$. Thus $A_0$ is an isotone map on $\mathcal{I}_0$, that we call a net on $\mathbb{R}$. In this paper we denote by $J_0 := (0, \infty) \subset \mathbb{R}$.
A representation π of $A_0$ on a Hilbert space H is a map $I \in \mathcal{I}_0 \to \pi_I$ that associates to each $I \in \mathcal{I}_0$ a normal representation of A(I) on B(H) such that
$$\pi_{\tilde I}\restriction A(I) = \pi_I, \qquad I \subset \tilde I, \quad I, \tilde I \in \mathcal{I}_0 .$$
A representation π of $A_0$ is also called a soliton. As $A_0$ satisfies half-line duality, namely
$$A_0(-\infty, a)' = A_0(a, \infty), \qquad a \in \mathbb{R},$$
by the usual DHR argument [7] π is unitarily equivalent to a representation ρ which acts identically on $A_0(-\infty, 0)$, thus ρ restricts to an endomorphism of $A(J_0) = A_0(0, \infty)$. ρ is said to be localized on $J_0$ and we also refer to ρ as a soliton endomorphism. Clearly a representation π of A restricts to a soliton $\pi_0$ of $A_0$. But a representation $\pi_0$ of $A_0$ does not necessarily extend to a representation of A. If A is strongly additive, and a representation $\pi_0$ of $A_0$ extends to a DHR representation of A, then it is easy to see that such an extension is unique, and in this case we will use the same notation $\pi_0$ to denote the corresponding DHR representation of A.

3.7. A result on extensions of solitons. The following proposition will play an important role in proving Th. 7.1.

Proposition 3.8. Let $H_1, H_2$ be two subgroups of a compact group Γ which acts properly on A, and let π be a soliton of $A_0$. Assume that A is strongly additive. Suppose that $\pi\restriction A^{H_i}$, i = 1, 2, are DHR representations. Then $\pi\restriction(A^{H_1} \vee A^{H_2})$ is also a DHR representation, where $A^{H_1} \vee A^{H_2}$ is an intermediate net with $(A^{H_1} \vee A^{H_2})(I) = A^{H_1}(I) \vee A^{H_2}(I)$, ∀I.

Proof. Let I be an arbitrary interval with −1 ∈ I. It is sufficient to show that π has a normal extension to $A^{H_1}(I) \vee A^{H_2}(I)$. Since π is a soliton, by choosing a representative of the unitary equivalence class of π we may assume that π(x) = x, $\forall x \in A(I')$. Let J ⊃ I be an interval sharing a boundary point with I and let $I_0 = J \cap I'$. Since $\pi\restriction A^{H_i}$ is a DHR representation, it is localizable on $I_0$. Denote the corresponding DHR representation localized on $I_0$ by $\pi_{i,I_0}$, then we can find unitaries $u_i$ such that $u_i\pi_{i,I_0}u_i^* = \pi$ on $A^{H_i}$. It follows that $u_i \in A^{H_i}(J)$ since π is localized on I, and we have $\pi(x) = u_i x u_i^*$, $\forall x \in A^{H_i}(I)$. Note that $A^{H_1}(I) \cap A^{H_2}(I) \supset A^\Gamma(I)$, hence $u_2^*u_1 \in A^\Gamma(I)' \cap A(J)$. Since $A^\Gamma \subset A$ is a strongly additive pair (cf. [49]), it follows that $A^\Gamma(I)' \cap A(J) = A(I_0)$, and $u_1xu_1^* = u_2xu_2^*$, ∀x ∈ A(I). Hence $\mathrm{Ad}\,u_1$ defines a normal extension of π from $A^{H_1}(I)$ to $A^{H_1}(I) \vee A^{H_2}(I)$. Such an extension is also unique by definition.

4. Induction and Restriction for General Orbifolds

Let A be a Möbius covariant net and B a subnet. Given a bounded interval $I_0 \in \mathcal{I}_0$ we fix a canonical endomorphism $\gamma_{I_0}$ associated with $B(I_0) \subset A(I_0)$. Then we can choose for each $I \in \mathcal{I}_0$ with $I \supset I_0$ a canonical endomorphism $\gamma_I$ of A(I) into B(I) in such a way that $\gamma_I\restriction A(I_0) = \gamma_{I_0}$ and $\lambda_I$ is the identity on $B(I_1)$ if $I_1 \in \mathcal{I}_0$ is disjoint from $I_0$, where $\lambda_I \equiv \gamma_I\restriction B(I)$. We then have an endomorphism γ of the C∗-algebra $A \equiv \cup_I A(I)$ (I bounded interval of $\mathbb{R}$).
Given a DHR endomorphism ρ of B localized in $I_0$, the α-induction $\alpha_\rho$ of ρ is the endomorphism of A given by
$$\alpha_\rho \equiv \gamma^{-1} \cdot \mathrm{Ad}\,\varepsilon(\rho, \lambda) \cdot \rho \cdot \gamma,$$
where ε denotes the right braiding unitary symmetry (there is another choice for α associated with the left braiding). $\alpha_\rho$ is localized in a right half-line containing $I_0$, namely $\alpha_\rho$ is the identity on A(I) if I is a bounded interval contained in the left complement of $I_0$ in $\mathbb{R}$. Up to unitary equivalence, $\alpha_\rho$ is localizable in any right half-line, thus $\alpha_\rho$ is normal on left half-lines, that is to say, for every $a \in \mathbb{R}$, $\alpha_\rho$ is normal on the C∗-algebra $A(-\infty, a) \equiv \cup_{I \subset (-\infty, a)} A(I)$ (I bounded interval of $\mathbb{R}$), namely $\alpha_\rho\restriction A(-\infty, a)$ extends to a normal morphism of $A(-\infty, a)$. We have the following Prop. 3.1 of [33]:

Proposition 4.1. $\alpha_\rho$ is a soliton endomorphism of $A_0$.

4.1. Solitons as endomorphisms. Let A be a conformal net and Γ a finite group acting properly on A (cf. (3.4)). We will assume that A is strongly additive. Let π be an irreducible soliton of $A_0$ localized on $J_0 = (0, \infty)$. Note that the restriction of π to $A(J_0)$ is an endomorphism and we denote this restriction by π when no confusion arises. Let $\pi_{A^\Gamma}$ be a soliton of $A_0^\Gamma$ localized on $J_0$ and unitarily equivalent to $\pi\restriction A^\Gamma$. Let $\rho_1$ be an endomorphism of $A(J_0)$ such that $\rho_1(A(J_0)) = A^\Gamma(J_0)$ and $\rho_1\bar\rho_1 = \gamma$, where γ is the canonical endomorphism from $A(J_0)$ to $A^\Gamma(J_0)$. Note that $[\gamma] = \sum_{g \in \Gamma'}[g]$, where for simplicity we have used [g] to denote the sector of $A(J_0)$ induced by the automorphism $\beta_g$, where β is the action. By [31] as sectors of $A^\Gamma(J_0)$ we have $[\pi_{A^\Gamma}] = [\gamma\pi\restriction A^\Gamma(J_0)]$.

Definition 4.2. Define $\Gamma_\pi := \{h \in \Gamma \mid [h\pi h^{-1}] = [\pi]\}$.

Note that ker V (cf. the definition before (3.4)) is a normal subgroup of $\Gamma_\pi$ and let $\Gamma'_\pi := \Gamma_\pi/\ker V$. Note that $\mathrm{Hom}(\pi_{A^\Gamma}, \pi_{A^\Gamma})_{A^\Gamma(J_0)} \simeq \mathrm{Hom}(\bar\rho_1\pi\rho_1, \bar\rho_1\pi\rho_1)_{A(J_0)}$. By Frobenius duality we have $\langle\pi_{A^\Gamma}, \pi_{A^\Gamma}\rangle = \langle\pi, \gamma\pi\gamma\rangle$.

Lemma 4.3. (1) If g, h ∈ Γ have different images in $\Gamma'$, then $\langle\pi, g\pi h^{-1}\rangle = 0$; (2) $\langle\pi, \gamma\pi\gamma\rangle = |\Gamma'_\pi| = \langle\gamma\pi\restriction A^\Gamma, \gamma\pi\restriction A^\Gamma\rangle$ where $\Gamma'_\pi = \Gamma_\pi/\ker V$; (3) $\langle\gamma\pi\restriction A^\Gamma, \gamma\pi\restriction A^\Gamma\rangle = \langle\gamma_1\pi\restriction A^{\Gamma_\pi}, \gamma_1\pi\restriction A^{\Gamma_\pi}\rangle$, where $\gamma_1$ is the canonical endomorphism from $A(J_0)$ to $A^{\Gamma_\pi}(J_0)$; (4) Every irreducible summand of $\pi\restriction A_0^{\Gamma_\pi}$ (as a soliton of $A_0^{\Gamma_\pi}$) remains irreducible when restricting to $A_0^\Gamma$.

Proof. Note that $g\pi h^{-1} = g\pi g^{-1}gh^{-1}$, and $g\pi g^{-1}$ is a soliton equivalent to $\pi g^{-1}$ but localized on $J_0$. By Lemma 8.5 of [33] we have proved (1). Parts (2), (3) follow from (1) and the definition of $\Gamma_\pi$. Part (4) follows from (3).

Proposition 4.4. Let $\pi_1, \pi_2$ be two irreducible solitons of $A_0$. If there is g ∈ Γ such that $[\pi_1] = [g\pi_2g^{-1}]$, then $[\gamma\pi_1\restriction A^\Gamma] = [\gamma\pi_2\restriction A^\Gamma]$. Otherwise $\langle\gamma\pi_1\restriction A^\Gamma, \gamma\pi_2\restriction A^\Gamma\rangle = 0$.
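Proof. By Frobenius duality and Lemma 8.5 of [33] we have
$$\langle\gamma\pi_1\restriction A^\Gamma, \gamma\pi_2\restriction A^\Gamma\rangle = \sum_{g \in \Gamma'}\langle\pi_1, g\pi_2g^{-1}\rangle .$$
Hence $\langle\gamma\pi_1\restriction A^\Gamma, \gamma\pi_2\restriction A^\Gamma\rangle = 0$ if there is no g ∈ Γ such that $[\pi_1] = [g\pi_2g^{-1}]$. If there is g ∈ Γ such that $[\pi_1] = [g\pi_2g^{-1}]$, then
$$\langle\gamma\pi_1\restriction A^\Gamma, \gamma\pi_2\restriction A^\Gamma\rangle = \sum_{h \in \Gamma'_{\pi_1}}\langle\pi_1, hg\pi_2g^{-1}h^{-1}\rangle = |\Gamma'_{\pi_1}| .$$
By exchanging $\pi_1$ and $\pi_2$ we get
$$\langle\gamma\pi_1\restriction A^\Gamma, \gamma\pi_2\restriction A^\Gamma\rangle = \langle\gamma\pi_1\restriction A^\Gamma, \gamma\pi_1\restriction A^\Gamma\rangle = \langle\gamma\pi_2\restriction A^\Gamma, \gamma\pi_2\restriction A^\Gamma\rangle .$$
It follows that $[\gamma\pi_1\restriction A^\Gamma] = [\gamma\pi_2\restriction A^\Gamma]$.

Theorem 4.5. Assume that π is irreducible with finite index and $[\beta] = [\gamma\pi\restriction A^\Gamma] = \sum_j m_j[\beta_j]$. Then $[\alpha_{\beta_j}] = m_j\big(\sum_i [h_i\pi h_i^{-1}]\big)$, where $h_i$ are representatives of $\Gamma/\Gamma_\pi$. In particular $d(\beta_j) = m_j\, d(\pi)\,\frac{|\Gamma|}{|\Gamma_\pi|}$, and $\sum_j d(\beta_j)^2 = \frac{|\Gamma|^2}{|\Gamma_\pi|}\, d(\pi)^2$.

Proof. By the definition we have $[\gamma\alpha_\beta] = [\beta\gamma] = [\gamma\pi\gamma]$. So we have
$$\langle\gamma\alpha_\beta, \pi\rangle = \langle\gamma\pi\gamma, \pi\rangle = |\Gamma'_\pi| .$$
By Lemma 8.5 of [33] we have $\langle\gamma\alpha_\beta, \pi\rangle = \langle\alpha_\beta, \pi\rangle$, and therefore $\alpha_\beta \succ |\Gamma'_\pi|\,\pi$. By Lemma 8.1 of [33] we have $[h_i\alpha_\beta h_i^{-1}] = [\alpha_\beta]$, so $[\alpha_\beta] \succ |\Gamma'_\pi|\sum_i [h_i\pi h_i^{-1}]$. On the other hand $d(\alpha_\beta) = d(\beta) = |\Gamma'|\,d(\pi) = |\Gamma'_\pi|\sum_i d(h_i\pi h_i^{-1})$. It follows that
$$[\alpha_\beta] = |\Gamma'_\pi|\Big(\sum_i [h_i\pi h_i^{-1}]\Big) .$$
Note that by Lemma 8.1 of [33], $[h_i^{-1}\alpha_{\beta_j}h_i] = [\alpha_{\beta_j}]$, hence
$$\langle\alpha_{\beta_j}, h_i\pi h_i^{-1}\rangle = \langle h_i^{-1}\alpha_{\beta_j}h_i, \pi\rangle = \langle\alpha_{\beta_j}, \pi\rangle .$$
So we must have $[\alpha_{\beta_j}] = k_j\big(\sum_i [h_i\pi h_i^{-1}]\big)$ for some positive integer $k_j$. We note that $k_j = \langle\alpha_{\beta_j}, \pi\rangle \le \langle\beta_j, \gamma\pi\restriction A^\Gamma\rangle = m_j$ by definitions and Frobenius duality. On the other hand $\sum_j m_jk_j = |\Gamma'_\pi| = \sum_j m_j^2$, and we conclude that $k_j = m_j$. Since by definition $\frac{|\Gamma|}{|\Gamma_\pi|} = \frac{|\Gamma'|}{|\Gamma'_\pi|}$, the proof of the theorem follows.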
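4.2. Solitons as representations. In this section we use $\hat\pi$ to denote an irreducible soliton of $A_0$ on a Hilbert space $H_\pi$. Let π be a soliton unitarily equivalent to $\hat\pi$ but localized on $J_0$ as in the previous section. The restriction of $\hat\pi$ to $A_0^\Gamma$, denoted by $\hat\pi\restriction A_0^\Gamma$, is also a soliton. Define $\mathrm{Hom}(\hat\pi\restriction A_0^\Gamma, \hat\pi\restriction A_0^\Gamma) := \{x \in B(H_\pi) \mid x\hat\pi(a) = \hat\pi(a)x, \forall a \in A_0^\Gamma\}$, and let $\langle\hat\pi\restriction A_0^\Gamma, \hat\pi\restriction A_0^\Gamma\rangle = \dim\mathrm{Hom}(\hat\pi\restriction A_0^\Gamma, \hat\pi\restriction A_0^\Gamma)$.

Lemma 4.6. (1) $\langle\hat\pi\restriction A_0^\Gamma, \hat\pi\restriction A_0^\Gamma\rangle = \langle\gamma\pi\restriction A_0^\Gamma, \gamma\pi\restriction A_0^\Gamma\rangle$; (2) $h \in \Gamma_\pi$ if and only if $\hat\pi \cdot \mathrm{Ad}_h \simeq \hat\pi$ as representations of $A_0$.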
Proof. By [31] $\hat\pi\restriction A_0^\Gamma$ and $\gamma\pi\restriction A_0^\Gamma$ are unitarily equivalent as solitons of $A_0^\Gamma$. Note that $\gamma\pi\restriction A_0^\Gamma$ is localized on $J_0$, and (1) follows directly. As for (2), we note that $h^{-1}\pi h$ is localized on $J_0$ and unitarily equivalent to $\hat\pi \cdot \mathrm{Ad}_h$, and (2) now follows from Def. (4.2).

From (2) of Lemma 4.6 we have for any $h \in \Gamma_\pi$, there is a unitary operator denoted by $\hat\pi(h)$ on $H_\pi$ such that $\mathrm{Ad}_{\hat\pi(h)^*} \cdot \hat\pi = \hat\pi \cdot \mathrm{Ad}_h$ as solitons of $A_0$. Since $\hat\pi$ is irreducible, there is a U(1) valued cocycle $c_\pi(h_1, h_2)$ on $\Gamma_\pi$ such that $\hat\pi(h_1)\hat\pi(h_2) = c_\pi(h_1, h_2)\hat\pi(h_1h_2)$. We note that $c_\pi(h_1, h_2)$ is fixed up to coboundaries (cf. §2 of [20]). Hence $h \to \hat\pi(h)$ is a projective unitary representation of $\Gamma_\pi$ on $H_\pi$ with cocycle $c_\pi$. Assume that
$$H_\pi = \bigoplus_{\sigma \in E} M_\sigma \otimes V_\sigma,$$
where E is a subset of irreducible projective representations of $\Gamma_\pi$ with cocycle $c_\pi$, and $M_\sigma$ is the multiplicity space of the representation $V_\sigma$ of $\Gamma_\pi$. Then by definition each $M_\sigma$ is a representation of $A_0^\Gamma$.

Lemma 4.7. Fix an interval I. Assume that $\hat\pi$ is a representation of A(I) (resp. a projective representation of Γ with cocycle $c_\pi$) on a Hilbert space H such that $\hat\pi(\beta_h(x)) = \hat\pi(h)\hat\pi(x)\hat\pi(h)^*$, ∀x ∈ A(I). Let $\sigma_1 \in \hat\Gamma$, where $\hat\Gamma$ denotes the set of irreducible representations of Γ, and $\sigma_2$ be an irreducible summand of the representation $\hat\pi$ of Γ. Then: (1) any irreducible summand σ of $\sigma_1 \otimes \sigma_2$ appears as an irreducible summand in the projective representation $\hat\pi$ of Γ with cocycle $c_\pi$. In particular if $\sigma_2$ is the trivial representation of Γ then all elements of $\hat\Gamma$ appear as an irreducible summand of the representation $\hat\pi$ of Γ. (2) Every irreducible projective representation of Γ with cocycle $c_\pi$ appears as an irreducible summand of $\hat\pi$, and
$$\sum_{\sigma,\ \sigma \text{ has cocycle } c_\pi} \dim(\sigma)^2 = |\Gamma| .$$

Proof. Ad (1): Since the action of Γ on A is proper, and $A^\Gamma(I)$ is a type III factor, for any $\sigma_1 \in \hat\Gamma$, by p. 48 of [15] we can find a basis $V(\sigma_1)_i$, $1 \le i \le \dim\sigma_1$, in A(I) such that $V(\sigma_1)_i^*V(\sigma_1)_j = \delta_{ij}$, and the linear span of $V(\sigma_1)_i$, $1 \le i \le \dim\sigma_1$, forms the irreducible representation $\sigma_1$ of Γ. Let $W(\sigma_2)_i \in H$, $1 \le i \le \dim\sigma_2$, be an orthogonal basis of the representation $\sigma_2$. We claim that the vectors $\hat\pi(V(\sigma_1)_i)W(\sigma_2)_j$, $1 \le i \le \dim\sigma_1$, $1 \le j \le \dim\sigma_2$, in H are linearly independent. If $\sum_{ij} C_{ij}\hat\pi(V(\sigma_1)_i)W(\sigma_2)_j = 0$ for some complex numbers $C_{ij}$, multiply both sides by $\hat\pi(V(\sigma_1)_i^*)$ and use the orthogonality property of the $V(\sigma_1)_j$'s above; we have $\sum_j C_{ij}W(\sigma_2)_j = 0$, and hence $C_{ij} = 0$ since the $W(\sigma_2)_j$'s are linearly independent. It follows that the linear span of $\hat\pi(V(\sigma_1)_i)W(\sigma_2)_j$, $1 \le i \le \dim\sigma_1$, $1 \le j \le \dim\sigma_2$, gives a tensor product representation of Γ on a subspace of H, and the lemma follows.

Ad (2): Let $\sigma_3$ be an irreducible summand of $\hat\pi$, and let $\sigma_4$ be an arbitrary irreducible projective representation of Γ with cocycle $c_\pi$. By definition $\bar\sigma_3 \otimes \sigma_4$ is a representation of Γ ($\bar\sigma_3$ stands for the conjugate of $\sigma_3$), and hence $\bar\sigma_3 \otimes \sigma_4 \succ \sigma_5$ for some $\sigma_5 \in \hat\Gamma$, and it follows that $\sigma_4$ appears as an irreducible summand of $\sigma_3 \otimes \sigma_5$, and so by (1) every irreducible projective representation of Γ with cocycle $c_\pi$ appears as an irreducible summand of $\hat\pi$. Note the twisted group algebra $\mathbb{C}_{c_\pi}[\Gamma]$ with cocycle $c_\pi$ (cf. p. 85 of [20]) is semisimple, and the equality in (2) follows.
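The dimension count in part (2) of Lemma 4.7 can be illustrated on the smallest group with a nontrivial cocycle class. The sketch below is a toy example under stated assumptions, not taken from the paper: it uses the group $\mathbb{Z}_2 \times \mathbb{Z}_2$ and the projective representation generated by the Pauli matrices X and Z; the helper names are hypothetical. It checks the cocycle identity, irreducibility (via the character norm) and that the single 2-dimensional irreducible representation already saturates $\sum \dim(\sigma)^2 = 4$.

```python
import itertools
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
GROUP = list(itertools.product(range(2), repeat=2))   # elements (a, b) of Z_2 x Z_2

def mul(g, h):
    """Group law of Z_2 x Z_2."""
    return ((g[0] + h[0]) % 2, (g[1] + h[1]) % 2)

def rho(g):
    """Projective representation (a, b) -> X^a Z^b."""
    a, b = g
    return np.linalg.matrix_power(X, a) @ np.linalg.matrix_power(Z, b)

def cocycle(g, h):
    """c(g, h) defined by rho(g) rho(h) = c(g, h) rho(gh)."""
    return (rho(g) @ rho(h) @ rho(mul(g, h)).conj().T)[0, 0]

def is_cocycle():
    """Check c(g,h) c(gh,k) = c(h,k) c(g,hk) for all triples."""
    return all(
        np.isclose(cocycle(g, h) * cocycle(mul(g, h), k),
                   cocycle(h, k) * cocycle(g, mul(h, k)))
        for g in GROUP for h in GROUP for k in GROUP
    )

def character_norm():
    """(1/|G|) sum_g |tr rho(g)|^2; equals 1 exactly when rho is irreducible."""
    return sum(abs(np.trace(rho(g))) ** 2 for g in GROUP) / len(GROUP)

if __name__ == "__main__":
    print(is_cocycle(), character_norm())
    print(cocycle((0, 1), (1, 0)), cocycle((1, 0), (0, 1)))
```

The two cocycle values printed last differ on a commuting pair, so the cocycle is not a coboundary. For cyclic groups no such example exists, which is the input $H^2(\mathbb{Z}_k, U(1)) = 0$ used for Corollary 4.9.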
Theorem 4.8. (1) $\mathrm{Hom}(\hat\pi\restriction A^\Gamma, \hat\pi\restriction A^\Gamma) \simeq \bigoplus_{\sigma \in E}\mathrm{Mat}(\dim(\sigma))$, where E is the set of irreducible projective representations of $\Gamma_\pi$ with the cocycle $c_\pi$; (2) $\sum_{\sigma \in E}\dim(\sigma)^2 = |\Gamma'_\pi|$; (3) $M_\sigma$ as defined before Lemma 4.7 is an irreducible representation of $A_0^\Gamma$, and $M_\sigma$ is not unitarily equivalent to $M_{\sigma'}$ if $\sigma \neq \sigma'$.

Proof. Parts (1), (2) follow directly from Lemma 4.7. As for (3), note that by (2) of Lemma 4.3 and (1) of Lemma 4.6 we have $\langle\hat\pi\restriction A^\Gamma, \hat\pi\restriction A^\Gamma\rangle = |\Gamma'_\pi|$. On the other hand $\langle\hat\pi\restriction A^\Gamma, \hat\pi\restriction A^\Gamma\rangle \ge \sum_{\sigma \in E}(\dim V_\sigma)^2$ with equality iff $M_\sigma$ as above is an irreducible representation of $A_0^\Gamma$, and $M_\sigma$ is not unitarily equivalent to $M_{\sigma'}$ if $\sigma \neq \sigma'$. Since we have equality by (2), (3) is proved.

Since for the cyclic group $H^2(\mathbb{Z}_k, U(1)) = 0$, we have proved the following corollary which generalizes Lemma 2.1 of [45].

Corollary 4.9. If $\Gamma_\pi = \mathbb{Z}_k$ for some positive integer k, then $\mathrm{Hom}(\hat\pi\restriction A^\Gamma, \hat\pi\restriction A^\Gamma)$ is isomorphic to the group algebra of $\mathbb{Z}_k$, and $\hat\pi\restriction A^\Gamma$ decomposes into k distinct irreducible pieces.

5. Solitons in Affine Orbifold

5.1. Conformal nets associated with the affine algebras. Let G be a compact Lie group of the form $G := G_0 \times G_1 \times \cdots \times G_s$ where $G_0 = U(1)^r$, and $G_j$, j = 1, ..., s, are simple simply-connected groups. Let $g_j$ denote the Lie algebra of $G_j$, j = 0, ..., s, and let $L := \{\omega \in g_0 \mid e^{2\pi i\omega} = 1\}$. Note that $G_0 = U(1)^r = \mathbb{R}^r/L$. We assume that $g = \oplus_j g_j$ is equipped with a symmetric even negative definite invariant bilinear form. This means that the length square of any $\omega \in ig_j$ (j = 0, ..., s) such that $e^{2\pi i\omega} = 1$ is an even integer. Note that our condition on the bilinear form is slightly stronger than the condition on p. 61 of [18] to ensure locality of our nets (cf. Remark 1.1 of [18]). When restricted to a simple $g_j$, the even property means that the bilinear form is equal to $k_j(v|v')$, where $k_j \in \mathbb{N}$ will be identified with the level of the affine Kac-Moody algebra $\hat g_j$ and
$$(v|v') = \frac{1}{2g_j^\vee}\,\mathrm{Tr}_{g_j}(\mathrm{Ad}_v\,\mathrm{Ad}_{v'})$$
($g_j^\vee$ is the dual Coxeter number of $g_j$). We will fix $k_0 = 1$.

We will denote by $\tilde LG$ the central extension of LG whose Lie algebra is the (smooth) affine Kac-Moody algebra $\hat g$. For an interval $I \subset S^1$, we denote by $\tilde L_IG := \{f \in \tilde LG \mid f(t) = e, \forall t \in I'\}$, where e is the identity element in G, and $\tilde L_Ig := \{p \in \tilde Lg \mid p(t) = 0, \forall t \in I'\}$. We will write elements of $\tilde Lg$ as (f, c), where f ∈ Lg, $c \in \mathbb{C}$ and (0, c) is in the center of $\tilde Lg$. Denote by $A_{G_k}$ the conformal net associated with representations of $\tilde LG$ at level $k = (k_0, ..., k_s)$. The following lemma follows from [41]:

Lemma 5.1. $A_{G_k}$ is strongly additive.

For simplicity we will denote $A_{G_k}$ by A in this chapter. Let $Z^j \subset G_j$ denote the center of $G_j$, j = 1, ..., s, and let $Z^0 = L^*/L$, where $L^* := \{\mu \in g_0 \mid (\mu|\omega) \in \mathbb{Z}, \forall\omega \in L\}$. The following finite subgroup of G will play an important role:
$$Z(G) := Z^0 \times Z^1 \times \cdots \times Z^s .$$
Recall from §4.2 of [18] that an element g ∈ G is called non-exceptional if there exists β(g) ∈ ig such that $g = e^{2\pi i\beta(g)}$ and the centralizer $G_g := \{b' \in G \mid b'gb'^{-1} = g\}$ of g is the same as $G_{\beta(g)} := \{b' \in G \mid b'\beta(g)b'^{-1} = \beta(g)\}$, the centralizer of β(g). Let Γ be a finite subgroup of G. Then it follows by definition that Γ acts properly on A. We will be interested in the irreducible representations of $A^\Gamma$. Note that Z(G) acts on A trivially. Hence $A^\Gamma = A^{\langle\Gamma, Z(G)\rangle}$, where $\langle\Gamma, Z(G)\rangle$ is the subgroup of G generated by Γ, Z(G). Without loss of generality, we will always assume that Γ ⊃ Z(G). By the definition before (3.4) we have $\Gamma' = \Gamma/Z(G)$. The following definition is Definition 4.1 of [18]:

Definition 5.2. A group Γ is called a non-exceptional subgroup of G if for any g ∈ Γ there exists ζ ∈ Z(G) such that ζg is a non-exceptional element.

Recall from [18] that every element of Z can be written in the form
$$\zeta = (\zeta^{(0)}_{j_0}, ..., \zeta^{(s)}_{j_s}) \in Z^0 \times \cdots \times Z^s, \qquad \zeta^{(\nu)}_{j_\nu} = e^{2\pi i\Lambda^{(\nu)}_{j_\nu}} .$$
Here the $\Lambda^{(0)}_j$ generate the finite abelian group $L^*/L$; for each simple component g the fundamental weight $\Lambda_j$ belongs to the set J (1.33) of [18]. If both g and $\zeta_jg$ are non-exceptional, we can write
$$\beta(\zeta_jg) = \beta(g) + \Lambda_j + m, \qquad [\beta(g), \beta(\zeta_jg)] = 0, \qquad e^{2\pi im} = 1. \qquad (9)$$
Now we define the action of $\zeta_j$ on Λ. By Lemma 4.1 of [18] the phase factor
$$\sigma_j(b') = e^{2\pi i(k\Lambda_j + km|\beta')}, \qquad b' = e^{2\pi i\beta'} \in \Gamma_g, \quad [\beta', \Lambda_j + m] = 0 \qquad (10)$$
gives a 1-dimensional representation $\sigma_j$ of $\Gamma_g$. The transformation $\Lambda \to \zeta_j(\Lambda)$ of a lattice weight $\Lambda \in L^*$ is given by $\zeta_j(\Lambda) = (\Lambda + \Lambda_j) \bmod L$. If g is a simple rank l Lie algebra and Λ is an integral weight at level k, then $\zeta_j(\Lambda) := k\Lambda_j + w_j\Lambda$, where $w_j$ is the unique element of the Weyl group of g that permutes the set $\{-\theta, \alpha_1, ..., \alpha_l\}$ and satisfies $-w_j\theta = \alpha_j$.

Definition 5.3 ([18]). For any ζ ∈ Z, $\Lambda = \sum_\nu \Lambda^\nu$, we define:
$$\zeta(\Lambda) = \sum_\nu (w_{j_\nu}\Lambda^\nu + k_\nu\Lambda_{j_\nu}) .$$

We will use $\pi_\Lambda$ to denote the irreducible representations of $\tilde LG$ on a Hilbert space $H_\Lambda$ with highest weight Λ. Note that $\pi_\Lambda$ gives an irreducible representation of $A_{G_k}$ by §3 of [9] on $H_\Lambda$. We will write $\zeta = e^{2\pi i\beta(\zeta)}$ with $\beta(\zeta) = (\beta(\zeta^{(0)}_{j_0}), ..., \beta(\zeta^{(s)}_{j_s}))$ and $\beta(\zeta^{(\nu)}_{j_\nu}) = \Lambda^{(\nu)}_j + m$, where m is as in (10). Let $P_g : [0, 1] \to G$ be a map with $P_g(\theta) = e^{2\pi i\beta(g)\theta}$, 0 ≤ θ ≤ 1, $P_{\zeta g} : [0, 1] \to G$ be a map with $P_{\zeta g}(\theta) = e^{2\pi i\beta(\zeta g)\theta}$, 0 ≤ θ ≤ 1, and $P_\zeta : [0, 1] \to G$ be a map with $P_\zeta(\theta) = e^{2\pi i\beta(\zeta)\theta}$, 0 ≤ θ ≤ 1. We note that $\mathrm{Ad}P_\zeta$ is an automorphism of LG since ζ is in the center of G.

Lemma 5.4. (1) If g is non-exceptional then $P_g \in Z(G_g)$; (2) If ζg, g are non-exceptional then $P_{\zeta g}P_g^{-1} = P_\zeta$.
Proof. If h ∈ Gg , since g is non-exceptional, it follows that he2πiθβ(g) h−1 = e2πiθβ(g) , 0 ≤ θ ≤ 1 and (1) is proved. Since ζ g, g are non-exceptional , by (9) [β(ζ g), β] = 0 and (2) follows immediately. Lemma 5.5. If ζ g, g are non-exceptional, and with notations as above, we have: ˜ (1) AdPζ lifts to an automorphism denoted by Adζ of LG; ˜ is given by (2) The induced action of Adζ on Lg Adζ (f, c) = (AdPζ .f, k(ζ |f ) + c); (3) There is an unitary U : Hζ () → H such that U ∗ πζ () (Adζ )U = π as repre˜ sentations of LG; (4) U ∗ πζ () (h)σζ (h)U = π (h) for any h ∈ g , where σζ = ⊗ν σjν with σjν as defined in (10). Proof. We note that the path Pζg Pg∗ is an element of L(G/Z(G)). When G is semisimple, (1), (2) follow from Lemma 4.6.5 and Eq. (4.6.4) of [36]. The proof in §4.6 of [36] also generalizes easily to the proof of (1) and (2) when G = G0 = U (1)r . As for (3), ˜ since such irreducible first note that πζ () (Adζ ) is an irreducible representation of LG, representations are classified (cf. [36 and 17]. ), we just have to identify it with the known representations. By using Th. 4.2 of [18] for the special case when the group ˜ is trivial, we conclude that the character of πζ () · Adζ is the same as that of π (LG), ˜ and it follows that they are unitarily equivalent as representations of LG. ˜ by (2) we have For any h = e2π iβ ∈ Gg ⊂ LG, e2πik(β |jν +m) = π (h)σζ (h). π (Adζ (h)) = π (h) ν
Using (3) we have
U ∗ πζ () (h)σζ (h)U = π (h).
5.2. Constructions of solitons. Let $\pi_\Lambda$ be an irreducible representation of $\widetilde{LG}$ with highest integral weight $\Lambda$. We will denote the net $A_{G_k}$ simply by $A$ in this section. For $g \in G$, let $\beta(g)$ be an element in the Lie algebra of $G$ such that $e^{2\pi i\beta(g)} = g$. Define $P_g(\theta) := e^{2\pi i\theta\beta(g)}$, $0 \le \theta \le 1$. Identify $\mathbb{R}$ with the open interval $(0,1)$ via the smooth map $\varphi : (-\infty, +\infty) \to (0,1)$, $\varphi(t) = \frac{1}{\pi}\left(\tan^{-1}(t) + \frac{\pi}{2}\right)$. For any $I \subset \mathbb{R}$, let $P_{g,I} \in L_I G$ be a loop localized on $I$ such that $P_{g,I}(t) = P_g(\varphi(t))$, $\forall t \in I$.

Definition 5.6. For any $x \in A(I)$, define $\hat\pi_{\Lambda,g,I}(x) := \pi_\Lambda(P_{g,I}\, x\, P_{g,I}^*)$.

We note that the above definition is independent of the choice of $P_{g,I}$: if $\tilde P_{g,I}$ is another loop such that $\tilde P_{g,I}(t) = P_{g,I}(t)$, $\forall t \in I$, then $\tilde P_{g,I} P_{g,I}^{-1}$ is a loop with support in $I'$, and so $\pi_\Lambda(P_{g,I}\, x\, P_{g,I}^*) = \pi_\Lambda(\tilde P_{g,I}\, x\, \tilde P_{g,I}^*)$, $\forall x \in A(I)$. One checks easily that Def. 5.6 defines a soliton, and we denote it by $\hat\pi_{\Lambda,g}$.
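The two ingredients of Def. 5.6, the identification $\varphi$ of $\mathbb{R}$ with $(0,1)$ and the path $P_g$, can be checked numerically in the simplest abelian case. The following is a minimal sketch, assuming $G = U(1)$ so that $\beta(g)$ is a real number; the function names `phi` and `path` are illustrative and do not come from the paper.

```python
import cmath
import math

def phi(t):
    """The map phi(t) = (arctan(t) + pi/2) / pi from the text; sends R onto (0, 1)."""
    return (math.atan(t) + math.pi / 2) / math.pi

def path(beta, theta):
    """P_g(theta) = exp(2*pi*i*theta*beta) for the toy case G = U(1) (an assumption)."""
    return cmath.exp(2j * math.pi * theta * beta)

if __name__ == "__main__":
    # phi is increasing and stays strictly inside (0, 1)
    samples = [-1e6, -1.0, 0.0, 1.0, 1e6]
    print([round(phi(t), 6) for t in samples])
    # the path runs from the identity to g = exp(2*pi*i*beta)
    beta = 0.25
    print(path(beta, 0.0), path(beta, 1.0))
```

In this toy case the path starts at the identity and ends at $g$, which is the boundary behaviour that the construction of $P_{g,I}$ relies on.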
Fix $J_0 := (0, \infty) \subset \mathbb{R}$. To obtain a soliton equivalent to $\hat\pi_{\Lambda,g}$ but localized on $J_0$, we choose a smooth path $P_g^{J_0} \in C^\infty(\mathbb{R}, G)$ which satisfies the following boundary conditions: $P_g^{J_0}(t) = e$ if $-\infty < t \le 0$, and $P_g^{J_0}(t) = g$ if $1 \le t < \infty$. For any interval $I \subset \mathbb{R}$, we choose a loop $P_{g,I}^{J_0} \in LG$ such that $P_{g,I}^{J_0}(t) = P_g^{J_0}(t)$, $\forall t \in I$.

Definition 5.7. For any $x \in A(I)$, define $\pi_{\Lambda,g,I}(x) := \Lambda\big(P_{g,I}^{J_0}\, x\, (P_{g,I}^{J_0})^*\big)$, where we use $\Lambda$ to denote a representation unitarily equivalent to $\pi_\Lambda$ but localized on $J_0$.
We denote the soliton in the above definition as $\pi_{\Lambda,g}$.

Proposition 5.8. The unitary equivalence class of $\pi_{\Lambda,g}$ is independent of the choice of the path $P_g^{J_0}$ as long as it satisfies the boundary conditions given above, and $\pi_{\Lambda,g}$ is localized on $J_0$. Moreover $\pi_{\Lambda,g}$ is unitarily equivalent to $\hat\pi_{\Lambda,g}$, and $\pi_{\Lambda,g}$ restricts to a DHR representation of $A^{\langle g\rangle}$, where $\langle g\rangle$ denotes the closed subgroup of $G$ generated by $g$.

Proof. If $\tilde P_g^{J_0}$ is another path which satisfies the same boundary conditions as $P_g^{J_0}$, then $\tilde P_g^{J_0} (P_g^{J_0})^{-1} \in LG$, and the first statement of the proposition follows by definition. By definition $\pi_{\Lambda,g,J_0'}(x) = x$, $\forall x \in A(J_0')$, since $P_g^{J_0}(t) = e$ if $-\infty < t \le 0$, and so $\pi_{\Lambda,g}$ is localized on $J_0$. Since $P_g^{J_0} P_g^{-1}$ extends to an element in $LG$, it follows that $\pi_{\Lambda,g}$ is unitarily equivalent to $\hat\pi_{\Lambda,g}$. To prove the last statement, let $I$ be an interval with $-1 \in I$. It is sufficient to show that $\pi_{\Lambda,g}$ has a normal extension to $A^{\langle g\rangle}(I)$. Recall from §3.6 that we identify $\mathbb{R} = S^1 \setminus \{-1\}$ and $J_0 = (0, \infty) \subset \mathbb{R}$. Since the net $A$ is strongly additive by Lemma 5.1, and so $A^{\langle g\rangle}$ is strongly additive by [49], we can assume that $A^{\langle g\rangle}(I) = A^{\langle g\rangle}(-\infty, a) \vee A^{\langle g\rangle}(b, \infty)$, where $a < b$. Let us assume that $P^{J_0}_{g,(-\infty,a)}$ and $P^{J_0}_{g,(b,\infty)}$ are the elements in $LG$ such that $\mathrm{Ad}(P^{J_0}_{g,(-\infty,a)}) = \pi_{\Lambda,g,(-\infty,a)}$ and $\mathrm{Ad}(P^{J_0}_{g,(b,\infty)}) = \pi_{\Lambda,g,(b,\infty)}$ as in Definition 5.7. Choose an element $\tilde P \in LG$ so that $\tilde P(t) = g P^{J_0}_{g,(-\infty,a)}(t)$ for $-\infty < t < a$ and $\tilde P(t) = P^{J_0}_{g,(b,\infty)}(t)$ for $b < t < \infty$. Then by definition $\mathrm{Ad}(\tilde P)(x) = \pi_{\Lambda,g}(x)$, $\forall x \in A^{\langle g\rangle}(-\infty, a) \vee A^{\langle g\rangle}(b, \infty)$, and hence $\mathrm{Ad}(\tilde P)$ defines the normal extension of $\pi_{\Lambda,g}$ to $A^{\langle g\rangle}(I)$.
Proposition 5.9. As sectors of A(J0 ) we have: (1) [π,g ] = [π1,g ]; (2) [π1,g1 π1,g2 ] = [π1,g1 g2 ], [hπ,g h−1 ] = [π,hgh−1 ]; (3) Assume that , µ are irreducible DHR representations of A. Then , µπ1,g h = 1 if and only if h ∈ Z(G), g ∈ Z(g) and = g −1 (µ), where the action of the center is as in (5.3). In all other cases , µπ1,g h = 0; (4) If 1 , 2 are irreducible DHR representations of A , then π1 ,g1 , π2 ,g2 h = 1 if and only if h ∈ Z(G) and there exists a g ∈ Z(G) such that g2 = gg1 and 2 = g −1 (1 ). In all other cases π1 ,g1 , π2 ,g2 h = 0; (5) The stabilizer ,g of π,g (cf. (4.2)) is given by ,g = {h ∈ |hgh−1 = g1 g, g1 () = , g1 ∈ Z(G)}. Proof. Parts (1) and (2) follow directly from Def. 5.7. Now assume that , µπ1,g h = 1. By Lemma 8.5 of [33] we conclude that [h] = [1] and so h ∈ Z(G), hence [] =
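$[\mu\pi_{1,g}]$, and it follows that $\mu\pi_{1,g}$ is a DHR representation of the net $A$. In particular $\mu\pi_{1,g}$ is normal on $A(-\infty, 0) \vee A(1, \infty)$. Choose $\mu$ to be localized on $A(0, 1)$. Since $A(-\infty, 0) \vee A(1, \infty)$ is a type III von Neumann algebra, there is a unitary $u$ such that $\pi_{1,g}(x) = u x u^*$, $\forall x \in A(-\infty, 0) \vee A(1, \infty)$. Since $\pi_{1,g} = \mathrm{id}$ on $A(-\infty, 0)$ and $\pi_{1,g} = \mathrm{Ad}_g$ on $A(1, \infty)$, we have $u \in A(-\infty, 0)' \cap A^G(1, \infty)'$. By (2) of Lemma 3.6 in [49] the pair $A^G \subset A$ is strongly additive (cf. Def. 3.2 of [49]) since $A$ is strongly additive by Lemma 5.1, and so $A(-\infty, 0)' \cap A^G(1, \infty)' = A(0, 1)$. Therefore $u \in A(0, 1)$, $\mathrm{Ad}_g(x) = x$, $\forall x \in A(1, \infty)$, and so $g \in Z(G)$. Hence we have $\langle \Lambda, \mu\pi_{1,g}\rangle = 1$. By (3) of Lemma 5.5 and the definition of $\pi_{1,g}$ we have $\Lambda = g^{-1}(\mu)$, where the action of the center is defined in (5.3). As for (4), by (1) and (2) we have

$$\langle \pi_{\Lambda_1,g_1}, \pi_{\Lambda_2,g_2} h\rangle = \langle \Lambda_1 \pi_{1,g_1}, \Lambda_2 \pi_{1,g_2} h\rangle = \langle \Lambda_1, \Lambda_2 \pi_{1,g_2}\, h\, \pi_{1,g_1^{-1}}\rangle \qquad (11)$$
$$= \langle \Lambda_1, \Lambda_2 \pi_{1,\, g_2 h g_1^{-1} h^{-1}}\, h\rangle \qquad (12)$$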
and (4) follows from the above equation and (3). Part (5) follows from definitions and (4). 5.2.1. Comparing solitons with “twisted representations”. Let e2πiβ = g and choose the Cartan subalgebra of g which contains β. In Def. (5.6), if we choose x = π1 (y), y ∈ ∗ ). Note that Ad L˜ I G, then πˆ ,g,I (π1 (y)) = π (Pg,I yPg,I Pg,I is an automorphism of L˜ I G, and induces an automorphism on L˜ I g. By Prop. 4.3.2 of [36], if we write elements of L˜ I g as (f, c), where f ∈ C ∞ (S 1 , g) with support in I , and c ∈ C, then AdPg,I (f, c) = (AdPg,I .f, c + k(β|f )).
(13)
Let us check that (13) agrees with the definition of the twisted representation (2.11)– (2.14) of [18] on L˜ I g, ∀I ⊂ R. Let E α be a raising or lowering operator as on p. 64 of [18]. Let f1 ∈ C ∞ (S 1 , R) be a smooth map such that f1 (t) = 0, ∀t ∈ I . By the commutation relation [E α , β] = −(α|β)E α we have AdPg,I .f = z−(α|β) E α f1 , where z−(α|β) := e−2π iθ(α|β) as a function on [0, 1], and (β|f1 E α ) = 0 by definition. By (13) we have AdPg,I (f1 E α , c) = (z−(α|β) E α f1 , c) which is the restriction of (2.11) of [18] to L˜ I g. Similarly one can check that (13) agrees with the definition of twisted representation (2.12)-(2.14) of [18] on L˜ I g, ∀I ⊂ R. Hence our soliton representations in Def. 5.6 can be regarded as an “exponentiated” version of the twisted representations in §2 of [18]. In the next section we shall see that these soliton representations are important in constructing irreducible DHR representations of A . Motivated by the above observations, we have the following conjecture: Conjecture 5.10. There is a natural one to one correspondence between the set of irreducible DHR representations of A and the set of irreducible representations of the orbifold chiral algebra as defined on p. 74 of [18] with gauge group . We note that this conjecture, together with the results of §5.4 and §5.5, give a prediction on the set of irreducible representations of the orbifold chiral algebra as defined on p. 74 of [18] with non-exceptional gauge group .
5.3. Completely rational case. Assume that the net $A$ associated to $G$ has the property that
$$\mu_A = \sum_{\Lambda} d(\Lambda)^2, \qquad (14)$$
where the sum is over all irreducible projective representations of LG of a fixed level. When G = SU (N ) this property is proved by [46]. We show that all irreducible DHR representations of A are obtained from decomposing the restriction of solitons π,g to A , answering one of the motivating questions for this paper. By Prop. 4.4 π1 ,g1 A π2 ,g2 A iff there exists h ∈ such that [hπ1 ,g1 h−1 ] = [π2 ,g2 . By (2) and (4) of Prop. 5.9 this is true if there is a g3 ∈ Z(G) such that 2 = g3−1 (1 ) and g2 = hg3 g1 h−1 . Define an action of group Z(G) × on the set (, g) by (g3 , h).(, g) = (g3 −1 (), hg3 g1 h−1 ). Denote the orbit of (, g) by {, g}. Note that the stabilizer of (, g) has the same order as the stabilizer ,g of π,g by (5) of Prop. 5.9. Hence the orbit {, g} contains |Z(G)×| i mi [βi ], where βi are |,g | elements. Let [γ π,g ] = irreducible DHR representations of A . 2 By Th. 4.5, i d(βi )2 = || | | d()2 . By Prop. 4.4 we get the sum of the index ,g
of all different irreducible DHR representations of A coming from decomposing the restriction of π,g to A is given by | |2 2 | d() . |,g
{,g}
Since the orbit {, g} contains
|Z(G)×| |,g |
| |2 ,g
||
elements, the above sum is equal to
d()2 = | |2 µA = µA ,
where in the last = we have used Th. 3.7. By Th. 33 of [21] we have proved the following: Theorem 5.11. If Eq. (14) holds, then every irreducible DHR representation of A is contained in the restriction of π,g to A for some , g, where π,g , is defined as in (5.7). Let G = SU (N1 ) × SU (N2 ) × · · · × SU (Nm ) and let level k = (k1 , ..., km ). Since AGk verifies Eq. (14) by [46], we have the following: Corollary 5.12. Let ⊂ G = SU (N1 )×SU (N2 )×· · ·×SU (Nm ) be a finite subgroup. Then every irreducible DHR representation of AGk is contained in the restriction of π,g to AGk for some , g ∈ , where π,g is defined as in Def. (5.7) and AGk is the conformal net associated with the projective representation of LG at level k = (k1 , ..., km ). 5.4. Identifying representations of A for non-exceptional . In this section we assume that is a non-exceptional finite subgroup of G (cf. 5.2). Assume that g ∈ is a non-exceptional element in with g = e2πiβ and Gg = Gβ . We will choose the path
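$P_g$ as $P_g(\theta) = e^{2\pi i\theta\beta}$, $0 \le \theta \le 1$. Let $\sigma$ be an irreducible character of the group $\Gamma_\beta := \Gamma \cap G_\beta = \Gamma_g$. Let
$$P_{\Lambda,\sigma} := \frac{\sigma(1)}{|\Gamma_g|} \sum_{h \in \Gamma_\beta} \sigma(h)^*\, \pi_\Lambda(h). \qquad (15)$$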
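By Lemma 5.15, $P_{\Lambda,\sigma}\pi_{\Lambda,g}$ is a direct sum of $\sigma(1)$ copies of a DHR representation of $A^{\Gamma_g}$ (on $P_{\Lambda,\sigma}H_\Lambda$) which we denote by $\pi_{\Lambda,g,\sigma}$. We have:

Proposition 5.13. Let $h \in N_G(\Gamma_g) := \{b \in G \mid b\Gamma_g b^{-1} = \Gamma_g\}$. Then as a representation of $A^{\Gamma_g}$ we have $\pi_{\Lambda,g,\sigma} \cdot \mathrm{Ad}_{h^{-1}} \simeq \pi_{\Lambda,hgh^{-1},\sigma^h}$, where $\sigma^h$ is an irreducible representation of $\Gamma_{hgh^{-1}}$ defined by $\sigma^h(b) = \sigma(h^{-1}bh)$.

Proof. By Def. 5.6, $\forall x \in A(I)$, $I \subset \mathbb{R}$, we have
$$\hat\pi_{\Lambda,g}(\mathrm{Ad}_{h^{-1}} x) = \pi_\Lambda(P_{g,I}\, h^{-1} x h\, P_{g,I}^*) = \pi_\Lambda(h)^*\, \pi_\Lambda(h P_{g,I} h^{-1}\, x\, h P_{g,I}^* h^{-1})\, \pi_\Lambda(h) \qquad (16)$$
$$= \pi_\Lambda(h)^*\, \hat\pi_{\Lambda,hgh^{-1}}(x)\, \pi_\Lambda(h). \qquad (17)$$
On the other hand from the definition (15) one checks that $\pi_\Lambda(h)^* P_{\Lambda,\sigma^h} \pi_\Lambda(h) = P_{\Lambda,\sigma}$. It follows that $\forall y \in A^{\Gamma_g}(I)$, $\pi_{\Lambda,g,\sigma} \cdot \mathrm{Ad}_{h^{-1}}\, y = \pi_\Lambda(h)^*\, \pi_{\Lambda,hgh^{-1},\sigma^h}(y)\, \pi_\Lambda(h)$.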
Proposition 5.14. For the pair of non-exceptional triples X = (, g, σ ) and
ν ζ (X) := (wjν + kν jν , ζg, σ ⊗ (⊗ν σjν )) , ν
where σjν is defined as in (10), we have πX πζ (X) as DHR representations of A0 g . Proof. For any a ∈ A(I ) we have: ∗ πˆ ζ (),ζg (a) = πζ () (Pζg Pg∗ Pg aPg∗ Pg Pζg ) = πζ () (Pζ Pg xPg∗ Pζ∗ ),
where we have used (2) of Lemma 5.4. By (3) of Lemma 5.5, there exists a unitary U such that πζ () (Pζ Pg aPg∗ Pζ∗ ) = U π,g (a)U ∗ . By (4) of Lemma 5.5,
πζ () (h) = U π (h)σζ (h)U ∗ ,
and it follows by definition (15) Pζ (),σ ⊗σζ = U P,σ U ∗ , hence the proposition is proved by definition.
5.5. Details on decomposing solitons: fixed point resolutions. Assume that g ∈ is a non-exceptional element with g = e2πiβ and Gg = Gβ . We will choose the path Pg as Pg (θ ) = e2π iθβ , 0 ≤ θ ≤ 1. Let πˆ ,g A i mi βi , where βi are irreducible DHR representations of A . Define g := {h ∈ |hg = gh}. Note that g is a normal subgroup of ,g and ,g / g = {h ∈ Z(G)|h = } is an abelian group (cf. (5) of Lemma 5.9). Lemma 5.15. For all x ∈ A(I ), h ∈ g , π (h)πˆ ,g (x)π (h)∗ = πˆ ,g (hxh∗ ). Proof. Since π1 (L˜ I G) generates A(I ), it is sufficient to check the equation for x = π1 (y), y ∈ LI G. As elements in LG we have −1 −1 −1 h = Pg,I hyh−1 Pg,I , hPg,I yPg,I
where we have used hPg h−1 = Pg by (1) of Lemma 5.4. It follows by Def. (5.6) that π (h)πˆ ,g (x)π (h)∗ = πˆ ,g (hxh∗ ). Assume that when restricting to Ag , H = σ ∈E Mσ ⊗ Vσ , where Vσ are irreducible representation spaces of g , E ⊂ Irrg and Mσ the corresponding multiplicity spaces. By Th. 4.1 of [18], σ appears in the above decomposition iff σ |Z(G) = |Z(G). Applying Th. 4.8 to the pair Ag ⊂ A, each Mσ with σ |Z(G) = |Z(G) is an irreducible DHR representation of Ag . We will denote Mσ by π,g,σ . When ,g / g is nontrivial, the next question is how π,g,σ decomposes when restricting to A,g . This is the issue of “fixed point resolutions”, since the action of the center has a nontrivial fixed point on the quadruples as described on p. 78 of [18], and the question about the nature of how π,g,σ decomposes as a representation of A is implicitly raised. Assume that ,g / g = {h ∈ Z(G)|h = }. Then A,g ⊂ Ag is the fixed point subnet under the action of ,g / g . Note that ,g / g {ζ ∈ Z(G)|ζ = } and denote the isomorphism by h → ζ (h). Then we have: Theorem 5.16. (1) π,g,σ A , π,g,σ A = |{h ∈ ,g / g |σζ (h) σ ⊗ σζ }|, where σζ (h) is as defined in (4) of Lemma 5.5; (2) π,g,σ A decomposes into irreducible representations of A which are in oneto-one correspondence with all irreducible projective representations of the group ,g / g with a fixed cocycle. Proof. Ad (1): A,g ⊂ Ag is the fixed point subnet under the action of ,g / g . Applying Lemma 4.3 to the pair A,g ⊂ Ag , π,g,σ Ag , π,g,σ Ag is equal to the number of elements h ∈ ,g / g such that π,g,σ π,g,σ (Ad.h) as representations of Ag . By Prop.5.13 π,g,σ (Adh ) π,hgh−1 ,σ h = π,ζ (h)g,σ h , and by Prop. π,g,σ π,ζ (h)g,σ ⊗σζ (h) as representations of Ag . It follows that π,g,σ π,g,σ (Ad.h) as representations of Ag iff σ h σ ⊗ σζ (h) . Hence π,g,σ A,g , π,g,σ A,g = |{h ∈ ,g / g |σζ (h) σ ⊗ σζ }|. By (4) of Lemma 4.3 (1) is proved. Part (2) follows by applying Th. 4.8 to the pair A,g ⊂ Ag and (4) of Lemma 4.3.
Combine the above theorem with Cor. 4.9, we immediately have: Corollary 5.17. If the group {h ∈ ,g / g |σ h σ ⊗ σζ (h)} is cyclic of order m, then π,g,σ A decomposes into m irreducible pieces. 5.6. An example. Here we illustrate Cor. 5.17 in Example 6.4 of [18]. We keep the same notation of [18]. Set G = SU (2) and = H8 the quaternion group. H8 has 8 elements,{1, , qi , , qi , i = 1, 2, 3, }; they obey the multiplication rules qi2 = , qi qj qi−1 = qj = qj−1 , i = j . We note that qi , qi are non-exceptional elements of SU (2). The centralizer of qi Z4 , and we will label its irreducible representations by the exponents σ = 0, +1, −1, 2. There are 5 irreducible representations of H8 , {α0 , α1 , α2 , α3 , α4 } with dimensions 1, 1, 2, 1, 1 respectively. The characters of these representations are given on p. 94 of [18]. Consider the net ASU (2)2k1 . The irreducible DHR representations of ASU (2)2k1 are ˜ labeled by irreducible representations of LSU (2) at level 2k1 , and we will use integers 0, 1, ..., 2k1 to label these representations such that 0 is the vacuum representation. The only representation which is fixed by the action of the center is k1 . We note that σ = 2k1 (mod)4. When k1 is odd, consider the DHR representation πk1 ,qj ,1 . We have k1 ,qj = H8 . We note that σ = 2k1 (mod)4, and so the stabilizer of πk1 ,qj ,±1 is {h ∈ H8 /Z4 |σ h σζ (h) } Z2 . Hence by Cor. 5.17, πk1 ,qj ,±1 decomposes into two 8 distinct irreducible DHR representations of AH SU (2)2k . When k1 = 1 this is first observed 1
in [18] by identifying $A^{H_8}_{SU(2)_{2k_1}}$ with the tensor products of three “Ising Models” (cf. p. 99 of [18]). When $k_1$ is even, consider the DHR representation $\pi_{k_1,q_j,0}$ or $\pi_{k_1,q_j,2}$. Similar to the above, the stabilizer of $\pi_{k_1,q_j,0}$ or $\pi_{k_1,q_j,2}$ is $\mathbb{Z}_2$, and by using Cor. 5.17 again we conclude that $\pi_{k_1,q_j,0}$ or $\pi_{k_1,q_j,2}$ decomposes into two distinct irreducible DHR representations of $A^{H_8}_{SU(2)_{2k_1}}$.
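The group-theoretic facts about $H_8$ used in this example (eight elements, centralizer of $q_i$ isomorphic to $\mathbb{Z}_4$, five irreducible representations) can be confirmed by direct enumeration. The sketch below is illustrative only: it models $H_8$ as the unit quaternions $\pm 1, \pm i, \pm j, \pm k$ written as integer 4-tuples, and counts conjugacy classes, whose number equals the number of irreducible representations of a finite group. All names in the code are hypothetical.

```python
def qmul(a, b):
    """Hamilton product of quaternions given as (w, x, y, z) tuples."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def qinv(a):
    """Inverse of a unit quaternion (its conjugate)."""
    return (a[0], -a[1], -a[2], -a[3])

def neg(a):
    return tuple(-x for x in a)

ONE = (1, 0, 0, 0)
Q1, Q2, Q3 = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
H8 = [ONE, neg(ONE), Q1, neg(Q1), Q2, neg(Q2), Q3, neg(Q3)]

def conjugacy_classes(group):
    """Set of conjugacy classes, each stored as a frozenset of elements."""
    return {frozenset(qmul(qmul(h, x), qinv(h)) for h in group) for x in group}

def centralizer(x, group):
    """Elements of the group commuting with x."""
    return [h for h in group if qmul(h, x) == qmul(x, h)]

def order(x):
    """Order of x in the group."""
    n, y = 1, x
    while y != ONE:
        y = qmul(y, x)
        n += 1
    return n

if __name__ == "__main__":
    print(len(H8), len(conjugacy_classes(H8)))
    print(centralizer(Q1, H8), order(Q1))
```

The five classes match the five irreducible representations of dimensions 1, 1, 2, 1, 1 quoted in the text, and $1+1+4+1+1 = 8$ is the group order.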
6. Constructions of Solitons for Permutation Orbifolds 6.1. Preliminaries on cyclic orbifolds. In the rest of this paper we assume that A is completely rational. D := A ⊗ A... ⊗ A (n-fold tensor product) and B := DZn (resp. DPn , where Pn is the permutation group on n letters) is the fixed point subnet of D under the action of cyclic permutations (resp. permutations). Recall that J0 = (0, ∞) ⊂ R. Note that the action of Zn (resp. Pn ) on D is faithful and proper. Let v ∈ D(J0 ) be a unitary 2π i such that βg (v) = e n v (such v exists by p. 48 of [15]), where g is the generator of the cyclic group Zn and βg stands for the action of g on D. Note that σ := Adv is a DHR representation of B localized on J0 . Let γ : D(J0 ) → B(J0 ) be the canonical endomorphism from D(J0 ) to B(J0 ) and let γB := γ B(J0 ). Note [γ ] = [1] + [g] + ... + [g n−1 ] as sectors of D(J0 ) and [γB ] = [1] + [σ ] + ... + [σ n−1 ] as sectors of B(J0 ). Here [g i ] denotes the sector of D(J0 ) which is the automorphism induced by g i . All the sectors considered in the rest of this paper will be sectors of D(J0 ) or B(J0 ) as should be clear from their definitions. All DHR representations will be assumed to be localized on J0 and have finite statistical dimensions unless noted otherwise. For simplicity of notations, for a DHR representation σ0 of D or B localized on J0 , we will use the same notation σ0 to denote its restriction to D(J0 ) or B(J0 ) and we will make no distinction between
local and global intertwiners for DHR representations localized on $J_0$ since they are the same by the strong additivity of $D$ and $B$. The following is Lemma 8.3 of [33]:

Lemma 6.1. Let $\mu$ be an irreducible DHR representation of $B$. Let $i$ be any integer. Then:
(1) $G(\mu, \sigma^i) := \varepsilon(\mu, \sigma^i)\varepsilon(\sigma^i, \mu) \in \mathbb{C}$, $G(\mu, \sigma)^i = G(\mu, \sigma^i)$. Moreover $G(\mu, \sigma)^n = 1$;
(2) If $\mu_1 \prec \mu_2\mu_3$ with $\mu_1, \mu_2, \mu_3$ irreducible, then $G(\mu_1, \sigma^i) = G(\mu_2, \sigma^i)G(\mu_3, \sigma^i)$;
(3) $\mu$ is untwisted if and only if $G(\mu, \sigma) = 1$;
(4) $G(\bar\mu, \sigma^i) = \overline{G(\mu, \sigma^i)}$.

6.2. One cycle case. First we recall the construction of solitons for permutation orbifolds in §6 of [33]. Let $h : S^1 \setminus \{-1\} \simeq \mathbb{R} \to S^1$ be a smooth, orientation preserving, injective map which is smooth also at $\pm\infty$, namely the left and right limits $\lim_{z \to -1^\pm} \frac{d^n h}{dz^n}$ exist for all $n$. The range $h(S^1 \setminus \{-1\})$ is either $S^1$ minus a point or a (proper) interval of $S^1$. With $I \in \mathcal{I}$, $-1 \notin I$, we set $\Phi_{h,I} \equiv \mathrm{Ad}\,U(k)$, where $k \in \mathrm{Diff}(S^1)$ and $k(z) = h(z)$ for all $z \in I$ and $U$ is the projective unitary representation of $\mathrm{Diff}(S^1)$ associated with $A$. Then $\Phi_{h,I}$ does not depend on the choice of $k \in \mathrm{Diff}(S^1)$ and $\Phi_h : I \mapsto \Phi_{h,I}$ is a well defined soliton of $A_0 \equiv A \restriction \mathbb{R}$. Clearly $\Phi_h(A_0(\mathbb{R}))'' = A(h(S^1 \setminus \{-1\}))''$, thus $\Phi_h$ is irreducible if the range of $h$ is dense, otherwise it is a type III factor representation. It is easy to see that, in the last case, $\Phi_h$ does not depend on $h$ up to unitary equivalence.

Let now $f : S^1 \to S^1$ be the degree $n$ map $f(z) \equiv z^n$. There are $n$ right inverses $h_i$, $i = 0, 1, \ldots, n-1$, for $f$ ($n$-th roots); namely there are $n$ injective smooth maps $h_i : S^1 \setminus \{-1\} \to S^1$ such that $f(h_i(z)) = z$, $z \in S^1 \setminus \{-1\}$. The $h_i$'s are smooth also at $\pm\infty$. Note that the ranges $h_i(S^1 \setminus \{-1\})$ are $n$ pairwise disjoint intervals of $S^1$, thus we may fix the labels of the $h_i$'s so that these intervals are counterclockwise ordered, namely we have $h_0(1) < h_1(1) < \cdots < h_{n-1}(1) < h_0(1)$, and we choose $h_j = e^{\frac{2\pi i j}{n}} h_0$, $0 \le j \le n-1$.
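The $n$ right inverses of $f(z) = z^n$ can be written down explicitly. The sketch below is a hedged illustration, assuming the branch $h_0(e^{i\theta}) = e^{i\theta/n}$ for $\theta \in (-\pi, \pi)$, which is one natural choice and not necessarily the normalization used in [33]; the helper `arc_index` is a hypothetical name used to identify which of the $n$ arcs a point lies in.

```python
import cmath
import math

def right_inverses(n):
    """Return [h_0, ..., h_{n-1}] with h_j(z)**n == z on S^1 minus {-1}.

    Assumed branch: h_0(e^{i theta}) = e^{i theta / n}, theta in (-pi, pi),
    and h_j = e^{2 pi i j / n} * h_0 as in the text.
    """
    def make(j):
        def h(z):
            theta = cmath.phase(z)  # lies in (-pi, pi) for z != -1
            return cmath.exp(1j * (theta + 2 * math.pi * j) / n)
        return h
    return [make(j) for j in range(n)]

def arc_index(w, n):
    """Index j of the arc centred at e^{2 pi i j / n} of half-width pi/n containing w."""
    return round(cmath.phase(w) * n / (2 * math.pi)) % n

if __name__ == "__main__":
    n = 3
    hs = right_inverses(n)
    z = cmath.exp(2.5j)
    for j, h in enumerate(hs):
        w = h(z)
        print(j, abs(w**n - z) < 1e-12, arc_index(w, n))
```

Each $h_j$ maps into its own open arc of length $2\pi/n$, which is the disjointness of ranges that the tensor product construction in (18) uses.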
For any interval $I$ of $\mathbb{R}$, we set
$$\pi_{1,\{0,1\ldots n-1\},I} \equiv \chi_I \cdot (\Phi_{h_0,I} \otimes \Phi_{h_1,I} \otimes \cdots \otimes \Phi_{h_{n-1},I}), \qquad (18)$$
where χI is the natural isomorphism from A(I0 )⊗· · ·⊗A(In−1 ) to A(I0 )∨· · ·∨A(In−1 ) given by the split property, with Ik ≡ hk (I ). Clearly π1,{0,1...n−1} is a soliton of D0 ≡ A0 ⊗ A0 ⊗ · · · ⊗ A0 (n-fold tensor product). Let p ∈ Pn . We set π1,{p(0),p(1),...,p(n−1)} = π1,{0,1,...,n−1} · βp−1 ,
(19)
where β is the natural action of Pn on D, and π1,{0,1,...,n−1} is as in (18). The following is part of Prop. 6.1 in [33]:
Proposition 6.2. (1) $\mathrm{Index}(\pi_{1,\{0,1,\ldots,n-1\}}) = \mu_A^{n-1}$. (2) The conjugate of $\pi_{1,\{0,1,\ldots,n-1\}}$ is $\pi_{1,\{0,n-1,n-2,\ldots,1\}}$.

Let $\lambda$ be a DHR representation of $A$. Given an interval $I \subset S^1 \setminus \{-1\}$, we set

Definition 6.3. $\pi_{\lambda,\{p(0),p(1),\ldots,p(n-1)\},I}(x) = \pi_{\lambda,J}\big(\pi_{1,\{p(0),p(1),\ldots,p(n-1)\},I}(x)\big)$, $x \in D(I)$,
where π1,{p(0),p(1),...,p(n−1)},I is defined as in (19), and J is any interval which contains I0 ∪ I1 ∪ ... ∪ In−1 . Denote the corresponding soliton by πλ,{p(0),p(1),...,p(n−1)} . When p is the identity element in Pn , we will denote the corresponding soliton by πλ,n . The following follows from Prop. 6.4 of [33]: Proposition 6.4. The above definition is independent of the choice of J , thus πλ,{p(0),p(1),...,p(n−1)},I is a well defined soliton of D. We can localize π1,{p(0),p(1),...,p(n−1)} , πλ,{p(0),p(1),...,p(n−1)} and λ on J0 . Denote by π˜ , π˜ λ and (λ, 1, 1, ..., 1) := λ ⊗ ι ⊗ ι · · · ⊗ ι D(J0 ) respectively the corresponding endomorphisms of D(I ). Then as sectors of D(J0 ) we have [π˜ λ ] = [π˜ · (λ, 1, 1, ..., 1) ]. In particular Index(πλ,{p(0),p(1),...p(n−1)} ) = d(λ)2 µn−1 A . 6.3. General case. Let ψ : {0, 1, ..., n − 1} → L, where L is the set of all irreducible DHR representations of D. For any p ∈ Pn we set p.ψ(i) := ψ(p −1 .i), i = 0, ..., n−1, where Pn acts via permutation on the n numbers {0, 1, ..., n−1}. Assume that p.ψ = ψ, and p = c1 ...ck is a product of disjoint cycles. Since p.ψ = ψ, ψ takes the same value denoted by ψ(cj ) on the elements {a1 , a2 , ..., al } of each cycle cj = (a1 ...al ). A presentation fj of the cycle cj = (a1 ...al ) is a list of numbers {b1 , ..., bl } such that (b1 ...bl ) = cj as cycles. The length l(fj ) of fj is l. We note that for a cycle of length l there are l different presentations. For each element x = x0 ⊗ x1 ⊗ · · · ⊗ xn−1 ∈ D, and each cycle c = (a1 ...al ) with a fixed presentation f = {b1 , ..., bl }, we define xc,f = xb1 ⊗ xb2 ⊗ · · · ⊗ xbl . Now we are ready to define solitons for permutation orbifolds: Definition 6.5. Assume that p.ψ = ψ and p = c1 ...ck is a product of disjoint cycles as above. For each cj we fix a presentation fj . 
Then for any x = x0 ⊗ x1 ⊗ · · · ⊗ xn−1 ∈ D(I ), I ⊂ S 1 {−1} = R, πψ,p ≡ πψ,c1 c2 ...,ck ,f1 ,...fk (x) = πλ1 ,l(f1 ) (xc1 ,f1 ) ⊗ πλ2 ,l(f2 ) (xc2 ,f2 ) ⊗ · · · ⊗ πλk ,l(fk ) (xck ,fk ) on Hψ(c1 ) ⊗ Hψ(c2 ) ⊗ · · · ⊗ Hψ(ck ) , where πλj ,l(fj ) is as in Def. 6.3. Here and in the following, to simplify notations, we do not put the interval suffix I in a representation, if no confusion arises. Lemma 6.6. The unitary equivalence class of πψ,p in Def. 6.5 depends only on p ∈ Pn . Proof. We have to check that the unitary equivalence class of πψ,p in Def. 6.5 is independent of the order c1 , ..., ck and the presentation of cj . The first case is obvious, and second case follows from (a) of Prop. 6.2 in [33].
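The combinatorics behind Def. 6.5 and Lemma 6.6 (disjoint cycles, the $l$ presentations of a cycle of length $l$, and the selection of tensor factors $x_{c,f}$) can be sketched in a few lines. This is an illustrative sketch only: permutations are stored as lists with `p[i]` the image of `i`, tensor factors are stood in for by plain list entries, and all function names are hypothetical.

```python
def cycles(p):
    """Disjoint cycles of a permutation p (p[i] is the image of i), fixed points included."""
    seen, out = set(), []
    for start in range(len(p)):
        if start not in seen:
            cyc, x = [], start
            while x not in seen:
                seen.add(x)
                cyc.append(x)
                x = p[x]
            out.append(tuple(cyc))
    return out

def presentations(c):
    """All lists (b_1, ..., b_l) describing the same cycle as c: its l rotations."""
    return [c[i:] + c[:i] for i in range(len(c))]

def restrict(x, f):
    """Stand-in for x_{c,f}: the factors x_{b_1}, ..., x_{b_l} in the order given by f."""
    return [x[b] for b in f]

def conjugate(h, p):
    """The permutation h p h^{-1}, determined by (h p h^{-1})(h(a)) = h(p(a))."""
    q = [0] * len(p)
    for a in range(len(p)):
        q[h[a]] = h[p[a]]
    return q

if __name__ == "__main__":
    p = [1, 2, 0, 4, 3]            # p = (0 1 2)(3 4)
    print(cycles(p))
    print(presentations((0, 1, 2)))
    print(restrict(["x0", "x1", "x2", "x3", "x4"], (1, 2, 0)))
    h = [4, 3, 2, 1, 0]
    print(cycles(conjugate(h, p)))  # cycles (h(a_1) ... h(a_l))
```

The last line illustrates the identity $h c_j h^{-1} = (h(a_1)\ldots h(a_l))$ used in the proof of Prop. 6.7.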
Due to the above lemma, for each p ∈ Pn we will fix a choice of the order c1 , ..., ck and presentations of c1 , ..., ck . For simplicity we will denote the corresponding soliton simply by πψ,p . Proposition 6.7. πh.ψ,hph−1 πψ,p · βh−1 as solitons of D0 , p, h ∈ Pn . Proof. Let p = c1 ...ck be a product of disjoint cycles with cj = (a1 ...al ). Then hph−1 = hc1 h−1 ...hck h−1 with hcj h−1 = (h(a1 )...h(al )). Note that h.ψ(h(a1 )) = ψ(a1 ) = ψ(cj ), and βh−1 (x0 ⊗ x1 ⊗ · · · ⊗ xn−1 ) = xh(0) ⊗ xh(1) ⊗ · · · ⊗ xh(n−1) , ∀x0 ⊗ x1 ⊗ · · · ⊗ xn−1 ∈ D(I ). The proposition now follows directly from Def. (6.5). 7. Identifying Solitons in the Permutation Orbifolds The goal in this section is to prove the following: Theorem 7.1. Let πψ1 ,p1 , πψ2 ,p2 be two solitons as given in Def. (6.5). Then πψ1 ,p1 πψ2 ,p2 as solitons of D0 if and only if ψ1 = ψ2 , p1 = p2 . We note that even for the first nontrivial case n = 3 we do not know a direct proof of the theorem. Our proof is indirect and is divided into the following steps:
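7.1. Identifying solitons: Cyclic case. We will first prove Th. 7.1 for the case when both $p_1, p_2$ are one cycle. In this case $\psi_1$ (resp. $\psi_2$) is a constant function with value denoted by $\lambda_1$ (resp. $\lambda_2$). We will denote $\psi_1$ (resp. $\psi_2$) simply by $\lambda_1$ (resp. $\lambda_2$). If $g \in P_n$, we will denote by $D^{\langle g\rangle}$ the fixed-point subnet of $D$ under the subgroup generated by $g$.

Proposition 7.2. (1) Let $g_1 = (01\ldots n-1)$ and $g_2 = g_1^m$ with $(m, n) = 1$. Then $\pi_{\lambda_1,g_1} \simeq \pi_{\lambda_2,g_2}$ if and only if $\lambda_1 = \lambda_2$, $g_1 = g_2$; (2) If $\pi_{\lambda_1,g_1}$ restricts to a DHR representation of a subnet $B$ with $D^{\langle g_1\rangle} \subset B \subset D$, then $B = D^{\langle g_1\rangle}$.

Proof. Ad (1): It is sufficient to show that if $\pi_{\lambda_1,g_1} \simeq \pi_{\lambda_2,g_2}$, then $\lambda_1 = \lambda_2$, $g_1 = g_2$. Since $(m, n) = 1$, there exists $h \in P_n$ such that $h g_1 h^{-1} = g_2$. By Prop. 6.7, we can assume that $\pi_{\lambda_1,g_1} \simeq \pi_{\lambda_2,g_1} \cdot \mathrm{Ad}_h$. As in §8.3 of [33], we denote the $n$ irreducible DHR representations of $D^{\langle g_1\rangle}$ of $\pi_{\lambda_1,g_1}$ by $\tau_{\lambda_1}^{(0)}, \ldots, \tau_{\lambda_1}^{(n-1)}$. Since $\pi_{\lambda_1,g_1} \simeq \pi_{\lambda_2,g_1} \cdot \mathrm{Ad}_h$, we must have $\tau_{\lambda_1}^{(0)} \simeq \tau_{\lambda_2}^{(i)} \cdot \mathrm{Ad}_h$ for some $0 \le i \le n-1$. By (48) of [33] we have that $[\tau_{\lambda_1}^{(0)}] \prec [(\lambda_1, 1, \ldots, 1)\restriction D^{\langle g_1\rangle}\, \tau^{(0)}]$, and by (2) and (3) of Lemma 6.1 we have
$$G(\tau_{\lambda_1}^{(0)}, \sigma^{k(1)}) = G(\tau^{(0)}, \sigma^{k(1)}) = e^{\frac{2\pi i}{n}},$$
where $1 \le k(1) \le n-1$ and $(k(1), n) = 1$ (cf. the paragraph after (47)). Similarly $G(\tau_{\lambda_2}^{(i)}, \sigma^{k(1)}) = e^{\frac{2\pi i}{n}}$. On the other hand note that by definition
$$G(\tau_{\lambda_2}^{(i)} \cdot \mathrm{Ad}_h, \sigma^{k(1)} \cdot \mathrm{Ad}_h) = G(\tau_{\lambda_2}^{(i)}, \sigma^{k(1)}) = e^{\frac{2\pi i}{n}}.$$
Since $[\mathrm{Ad}_h.g] = [g^m]$, we have $\sigma \cdot \mathrm{Ad}_h \simeq \sigma^m$, and so we have
$$G(\tau_{\lambda_2}^{(i)} \cdot \mathrm{Ad}_h, \sigma^{k(1)} \cdot \mathrm{Ad}_h) = G(\tau_{\lambda_2}^{(i)} \cdot \mathrm{Ad}_h, \sigma^{mk(1)}) = G(\tau_{\lambda_1}^{(0)}, \sigma^{k(1)})^m = e^{\frac{2\pi i m}{n}},$$
where in the second = we have used (1) of Lemma 6.1. Hence $e^{\frac{2\pi i}{n}} = e^{\frac{2\pi i m}{n}}$ and it follows that $m = 1$ since $(m, n) = 1$. So we have $\pi_{\lambda_1,g_1} \simeq \pi_{\lambda_2,g_1}$, and by (2) of Th. 8.6 in [33] we have $\lambda_1 = \lambda_2$.

Ad (2): First we note that the subnet $B = D^{\langle g_1^l\rangle}$ for some $1 \le l \le n$, $n = l l_1$, by the Galois correspondence (cf. [15]). Also the vacuum representation of $D^{\langle g_1^l\rangle}$ restricts to $\bigoplus_{1 \le i \le l} \sigma^{l_1 i}$ of $D^{\langle g_1\rangle}$. If $\pi_{\lambda_1,g_1}$ restricts to a DHR representation of $D^{\langle g_1^l\rangle}$, by applying (3) of Lemma 6.1 to the pair $D^{\langle g_1\rangle} \subset D^{\langle g_1^l\rangle}$ we conclude that $G(\tau_{\lambda_1}^{(0)}, \sigma^{l_1}) = 1$. Since $G(\tau_{\lambda_1}^{(0)}, \sigma^{k(1)}) = e^{\frac{2\pi i}{n}}$, by using (1) of Lemma 6.1 we have
$$G(\tau_{\lambda_1}^{(0)}, \sigma^{l_1 k(1)}) = G(\tau_{\lambda_1}^{(0)}, \sigma^{l_1})^{k(1)} = 1 = G(\tau_{\lambda_1}^{(0)}, \sigma^{k(1)})^{l_1} = e^{\frac{2\pi l_1 i}{n}}.$$
Hence $n \mid l_1$ and we conclude that $l_1 = n$, $B = D^{\langle g_1\rangle}$.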
Proposition 7.3. Let g1 (resp. g2 ) be one cycle of length n. Then πλ1 ,g1 πλ2 ,g2 if and only if λ1 = λ2 , g1 = g2 . Proof. It is sufficient to show that if πλ1 ,g1 πλ2 ,g2 , then λ1 = λ2 , g1 = g2 . Note that πλ1 ,g1 (resp. πλ2 ,g2 ) restricts to a DHR representation of Dg1 (resp. g2 D ), it follows that πλ1 ,g1 restricts to a DHR representation of Dg2 . By Prop.3.8 πλ1 ,g1 restricts to a DHR representation of Dg1 ∨ Dg2 , and by (2) of 7.2 we must have Dg1 ∨ Dg2 = Dg1 . It follows that Dg2 ⊂ Dg1 and by Galois correspondence again (cf. [15]) we have g2 ⊂ g1 . Exchanging g1 and g2 we conclude that g2 = g1 . Hence g2 = g1m for some integer m with (m, n) = 1. By (1) of Prop.7.2 we have proved that g1 = g2 , λ1 = λ2 . 7.2. Proof of Th. 7.1 for general case and its corollary. Assume that g1 = c1 c2 ...ck and g2 = c1 ...cl , where cj (resp. ci ) are disjoint cycles. Fix 1 ≤ j ≤ k and let cj = (a1 ...am ). Let us first show that a1 , ..., am must appear in one cycle of g2 . Let U be the unitary such that πψ1 ,g1 = AdU · πψ2 ,g2 . Choose x = x0 ⊗ x1 ⊗ · ⊗ xn ∈ D0 such that xi = 1 if i = aj , j = 1, ..., m, and no other constraints. Denote by D0,cj the subalgebra of D0 generated by such elements. We note that πψ1 ,g1 ( D0,cj ) is B(Hcj ), a type I factor by strong additivity. If a1 , ..., am appear in more than one cycle of g2 , then by definition (18) πψ2 ,p2 ( D0,cj ) will be tensor products of factors of the form πλ (AJ ), where J is a union of intervals of S 1 , but J¯ = S 1 , and so πψ1 ,p1 ( D0,cj ) will be tensor products of type III factors, contradicting πψ1 ,p1 ( D0,cj ) = U πψ2 ,p2 ( D0,cj )U ∗ . By exchanging the role of g1 and g2 we conclude that a1 , ..., am must be exactly the elements in one cycle ci of g2 for some 1 ≤ i ≤ l, and we have πψ1 ,p1 ( D0,cj ) = U πψ2 ,p2 ( D0,ci )U ∗ . Let H = Hψ1 (cj ) ⊗ Hr . We have U B(Hψ1 (cj ) )U ∗ = B(Hψ2 (ci ) ). 
Since every automorphism of a type I factor is inner, there a unitary U1 Hψ2 (ci ) → Hψ1 (cj ) such that U B(Hψ1 (cj ) )U ∗ = U1 B(Hψ2 (ci ) )U1∗ . Hence πψ1 (cj ),l(cj ) = U1 πψ2 (ci ),l(ci ) U1∗ on D0,cj , and by Prop.7.3 we conclude that cj = ci , ψ1 (cj ) = ψ2 (ci ). Since j is arbitrary, exchanging the roles of g1 and g2 we have proved g1 = g2 , ψ1 = ψ2 . Proposition 7.4. Assume that p = c1 ...ck , where ci are disjoint cycles. Let ψ be such that p.ψ = ψ. Then:
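(1) The centralizer (cf. (4.2)) of $\pi_{\psi,p}$ in $P_n$ is $\Gamma_{\psi,p} = \{h \in P_n \mid h.\psi = \psi,\ hph^{-1} = p\}$;
(2) If $p \in \mathbb{Z}_n$, the centralizer (cf. (4.2)) of $\pi_{\psi,p}$ in $\mathbb{Z}_n$ is $\Gamma_{\psi,p} = \{h \in \mathbb{Z}_n \mid h.\psi = \psi,\ hph^{-1} = p\}$;
(3) $d(\pi_{\psi,p})^2 = \prod_{1 \le i \le k} d(\psi(c_i))^2\, \mu_A^{n-k}$.

Proof. Parts (1), (2) follow from Prop. 6.7 and Th. 7.1. Assume that each cycle $c_i$ has length $m_i$, $1 \le i \le k$. Then $\sum_{1 \le i \le k} m_i = n$. By Def. 6.5, we have $d(\pi_{\psi,p})^2 = \prod_{1 \le i \le k} d(\pi_{\psi(c_i)})^2$. By Prop. 6.4 and (1) of Prop. 6.2, $d(\pi_{\psi(c_i)})^2 = d(\psi(c_i))^2 \mu_A^{m_i - 1}$, hence
$$d(\pi_{\psi,p})^2 = \prod_{1 \le i \le k} d(\psi(c_i))^2\, \mu_A^{m_i - 1} = \prod_{1 \le i \le k} d(\psi(c_i))^2\, \mu_A^{n-k}.$$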
8. Identifying all the Irreducible Representations of the Permutation Orbifolds

8.1. Cyclic orbifold case.

Theorem 8.1. Let $g = (01\ldots n-1)$. Then every irreducible DHR representation of $D^{\mathbb{Z}_n}$ appears as an irreducible summand of $\pi_{\psi,g^i}$ for some $\psi, g^i$.

Proof. By Prop. 4.4, $\pi_{\psi_1,g^{i_1}} \restriction B \simeq \pi_{\psi_2,g^{i_2}} \restriction B$ iff there exists $h \in \mathbb{Z}_n$ such that $\pi_{\psi_1,g^{i_1}}(\beta_{h^{-1}}) \simeq \pi_{\psi_2,g^{i_2}}$, and by Prop. 6.7 and (2) of Prop. 7.4 we have $h.\psi_1 = \psi_2$, $h g^{i_1} h^{-1} = g^{i_2}$. Denote the orbit of $\pi_{\psi_1,g^{i_1}}$ under the action of $\mathbb{Z}_n$ by $\{\psi_1, g^{i_1}\}$. Note that the orbit $\{\psi_1, g^{i_1}\}$ has length $\frac{n}{|\Gamma_{\psi_1,g^{i_1}}|}$. By Th. 4.5 the sum of the index of the irreducible summands of $\pi_{\psi,g^i}$ is $\frac{n^2}{|\Gamma_{\psi,g^i}|} d(\pi_{\psi,g^i})^2$. Hence the sum of the index of distinct irreducible summands of $\pi_{\psi,g^i}$ for all $\psi$, $g^i \in \mathbb{Z}_n$ is given by
$$\sum_{\{\psi,g^i\}} \frac{n^2}{|\Gamma_{\psi,g^i}|}\, d(\pi_{\psi,g^i})^2,$$
where the sum is over different orbits. Assume that $g^i = c_1 \ldots c_k$. Then $k = (n, i)$ (the greatest common divisor of $n$ and $i$) and each cycle $c_j$ has length $\frac{n}{(n,i)}$. For each element $\psi_2, g^{i_2}$ in the orbit $\{\psi, g^i\}$, by (3) of Prop. 7.4, $d(\pi_{\psi_2,g^{i_2}})^2 = d(\pi_{\psi,g^i})^2 = \prod_{1 \le j \le (n,i)} d(\psi(c_j))^2\, \mu_A^{n-(n,i)}$. Hence
$$\sum_{\{\psi,g^i\}} \frac{n^2}{|\Gamma_{\psi,g^i}|}\, d(\pi_{\psi,g^i})^2 = \sum_{\psi,\, 0 \le i \le n-1} \frac{|\Gamma_{\psi,g^i}|}{n} \cdot \frac{n^2}{|\Gamma_{\psi,g^i}|}\, d(\pi_{\psi,g^i})^2 = n \sum_{\psi,\, 0 \le i \le n-1}\ \prod_{1 \le j \le (n,i)} d(\psi(c_j))^2\, \mu_A^{n-(n,i)} = n^2 \mu_A^n = \mu_{D^{\mathbb{Z}_n}}, \qquad (20)$$
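where in the last = we have used Th. 3.7. The theorem now follows from Th. 30 of [21].

Let us now decompose $\pi_{\lambda,g}$ into irreducible pieces. In this case $\Gamma_{\lambda,g} = \mathbb{Z}_n$ since $g = (012\ldots n-1)$ (cf. (2) of Prop. 7.4). By definition (18), $\forall x_0 \otimes x_1 \otimes \cdots \otimes x_{n-1} \in D(I)$,
$$\pi_{\lambda,g} \cdot \mathrm{Ad}_{g^{-1}}(x_0 \otimes x_1 \otimes \cdots \otimes x_{n-1}) = \pi_{\lambda,g}(x_1 \otimes x_2 \otimes \cdots \otimes x_0) \qquad (21)$$
$$= \pi_\lambda\!\left(R\!\left(\tfrac{2\pi}{n}\right)\right) \pi_{\lambda,g}(x_0 \otimes x_1 \otimes \cdots \otimes x_{n-1})\, \pi_\lambda\!\left(R\!\left(\tfrac{2\pi}{n}\right)\right)^{*}. \qquad (22)$$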
Here πλ (R(·)) denotes the unitary n one-parameter rotation subgroup in the represen= πλ (R(2π)) = Cλ id for some complex number tation λ. Note that πλ R 2π n Cλ , |Cλ | = 1. Let λ ∈ Hλ be a unit vector such that πλ R 2π λ = Cλ λ with n n (Cλ ) = Cλ . Definition 8.2. πλ,g (g) := Cλ −1 πλ R 2π , and πλ,g (g i ) := πλ (g)i . n Then it follows that g i → πλ,g (g i ) gives a representation of Zn on Hλ , and πλ,g (g i ). λ = λ . So λ affords a trivial representation of Zn on Hλ . It follows from Lemma 4.7 that all irreducible representations of Zn appear in the representation πλ . It follows ˆ n are distinct irreducible representations. by Th. 4.8 that πλ,g,i , i ∈ Z n i Note that g = c1 ...ck is a product of k = (n, i) disjoint cycles of the same length (n,i) . Let h ∈ ψ,g i . Then Adh induces a permutation among the cycles c1 , ..., ck . We define an element h ∈ Pk by the formula hci h−1 = ch (i) , i = 1, ..., k. We note that in the definition of πψ,g a presentation of g has been fixed. Assume that hfh −1 (i) h−1 = h (i).fi , where h (i) is an element in the cyclic group generated by ci . Define Definition 8.3. πψ,g i (h) := h πψ(c1 ),c1 (h (1)) ⊗ · · · ⊗ πψ(ck ),ck (h (k) , where the action of h ∈ Pk on Hψ(c1 ) ⊗ · · · Hψ(ck ) is by permutation of the tensor factors, and πψ(ci ),ci (h (i)) is as defined in Def. (8.2). One checks easily that Def. 8.3 gives a representation of ψ,g i , Adπψ,gi (h) πψ,g i = πψ,g i Adh , and the vector ψ(c1 ) ⊗ · · · ⊗ ψ(ck ) is fixed by πψ,g i (ψ,g i ). It follows by Lemma 4.7 and Th. 4.8 that we have proved the following: Theorem 8.4. πψ,g i ,σ ∈ˆ
ψ,g i
gives all the irreducible summands of πψ,g i DZn .
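The counting in the proof of Th. 8.1 rests on two facts: $g^i$ splits into $(n,i)$ cycles of length $n/(n,i)$, and summing $\prod_j d(\psi(c_j))^2 \mu_A^{n-(n,i)}$ over all $g^i$-invariant $\psi$ and all $i$ gives $n\mu_A^n$. The sketch below checks both by brute force. The list of dimensions `dims` is a hypothetical stand-in (values $1, \sqrt{2}, 1$, for which the sum of squares is 4) and is not taken from the paper; function names are illustrative.

```python
from itertools import product
from math import gcd, isclose, sqrt

def cycles(p):
    """Disjoint cycles of a permutation p (p[a] is the image of a)."""
    seen, out = set(), []
    for start in range(len(p)):
        if start not in seen:
            cyc, x = [], start
            while x not in seen:
                seen.add(x)
                cyc.append(x)
                x = p[x]
            out.append(tuple(cyc))
    return out

def power_of_long_cycle(n, i):
    """g^i for g = (0 1 ... n-1), i.e. the map a -> a + i mod n."""
    return [(a + i) % n for a in range(n)]

def index_sum(n, dims):
    """n * sum over i and over g^i-invariant psi of prod_j d(psi(c_j))^2 * mu^(n-k)."""
    mu = sum(d * d for d in dims)
    total = 0.0
    for i in range(n):
        k = len(cycles(power_of_long_cycle(n, i)))
        # a g^i-invariant psi is a choice of one label per cycle
        for labels in product(range(len(dims)), repeat=k):
            term = mu ** (n - k)
            for label in labels:
                term *= dims[label] ** 2
            total += term
    return n * total, n * n * mu ** n

if __name__ == "__main__":
    n = 6
    for i in range(n):
        cyc = cycles(power_of_long_cycle(n, i))
        print(i, len(cyc), sorted({len(c) for c in cyc}))
    print(index_sum(4, [1.0, sqrt(2.0), 1.0]))
```

For each $i$ the inner sum over labels equals $\mu_A^{(n,i)}$, so every $i$ contributes $\mu_A^n$ and the total is $n^2\mu_A^n$, in agreement with (20).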
We note that Th. 8.1 and Th. 8.4 generalize the considerations of §8 of [33] for the case $n = 2, 3, 4$.

8.2. Permutation orbifold case.

Theorem 8.5. Every irreducible DHR representation of $D^{P_n}$ appears as an irreducible summand of $\pi_{\psi,p}$ for some $\psi$, $p \in P_n$.

Proof. The proof is similar to the proof of Th. 8.1 with small modifications. By Prop. 4.4 and Th. 7.1, $\pi_{\psi_1,p_1} \restriction D^{P_n} \simeq \pi_{\psi_2,p_2} \restriction D^{P_n}$ iff there exists $h \in P_n$ such that $h.\psi_1 = \psi_2$, $h p_1 h^{-1} = p_2$. Denote the orbit of $\pi_{\psi_1,p_1}$ under the action of $P_n$ by $\{\psi_1, p_1\}$. Note that the orbit $\{\psi_1, p_1\}$ has length $\frac{n!}{|\Gamma_{\psi_1,p_1}|}$. By Prop. 4.5 the sum of the index of the irreducible summands of $\pi_{\psi,p}$ is $\frac{n!^2}{|\Gamma_{\psi,p}|} d(\pi_{\psi,p})^2$. Hence the sum of the index of distinct irreducible summands of $\pi_{\psi,p}$ for all $\psi$, $p \in P_n$ is given by $\sum_{\{\psi,p\}} \frac{n!^2}{|\Gamma_{\psi,p}|} d(\pi_{\psi,p})^2$, where the sum is over different orbits. Assume that $p = c_1 \ldots c_k$ is a product of disjoint cycles. For each element $\psi_2, p_2$ in the orbit $\{\psi, p\}$, by Prop. 7.4, $d(\pi_{\psi_2,p_2})^2 = d(\pi_{\psi,p})^2 = \prod_{1 \le j \le k} d(\psi(c_j))^2\, \mu_A^{n-k}$. Hence
$$\sum_{\{\psi,p\}} \frac{n!^2}{|\Gamma_{\psi,p}|}\, d(\pi_{\psi,p})^2 = n! \sum_{\psi,p}\ \prod_{1 \le j \le k} d(\psi(c_j))^2\, \mu_A^{n-k} = n!^2 \mu_A^n = \mu_{D^{P_n}},$$
where in the last = we have used Th. 3.7. The theorem now follows from Th. 30 of [21].

Let $p = c_1 \cdots c_k$ be a product of $k$ disjoint cycles. Let $h \in \Gamma_{\psi,p}$. Then $\mathrm{Ad}_h$ induces a permutation among the cycles $c_1, \ldots, c_k$. We define an element $h' \in P_k$ by the formula $h c_i h^{-1} = c_{h'(i)}$, $i = 1, \ldots, k$. We note that in the definition of $\pi_{\psi,g}$ a presentation of $g$ has been fixed. Assume that $h f_{h'^{-1}(i)} h^{-1} = h''(i).f_i$, where $h''(i)$ is an element in the cyclic group generated by $c_i$. Define

Definition 8.6. $\pi_{\psi,p}(h) := h'\,\big(\pi_{\psi(c_1),c_1}(h''(1)) \otimes \cdots \otimes \pi_{\psi(c_k),c_k}(h''(k))\big)$, where the action of $h' \in P_k$ on $H_{\psi(c_1)} \otimes \cdots \otimes H_{\psi(c_k)}$ is by permutation of the tensor factors, and $\pi_{\psi(c_i),c_i}(h''(i))$ is as defined in Def. 8.2.

One checks easily that Def. 8.6 gives a representation of $\Gamma_{\psi,p}$, that $\mathrm{Ad}_{\pi_{\psi,p}(h)}\,\pi_{\psi,p} = \pi_{\psi,p}\,\mathrm{Ad}_h$, and that the vector $\Omega_{\psi(c_1)} \otimes \cdots \otimes \Omega_{\psi(c_k)}$ is fixed by $\pi_{\psi,p}(\Gamma_{\psi,p})$. It follows by Lemma 4.7 and Th. 4.8 that $\pi_{\psi,p,\sigma}$, $\sigma \in \hat\Gamma_{\psi,p}$, gives all the irreducible summands of $\pi_{\psi,p} \restriction \mathcal{D}^{P_n}$, and we have proved:

Theorem 8.7. $\pi_{\psi,p,\sigma}$, $\sigma \in \hat\Gamma_{\psi,p}$, gives all the irreducible summands of $\pi_{\psi,p} \restriction \mathcal{D}^{P_n}$.

Note that by Th. 8.7 and Th. 8.5 the irreducible DHR representations of $\mathcal{D}^{P_n}$ are labeled by triples $(\psi, p, \sigma)$ with $p.\psi = \psi$, $\sigma \in \hat\Gamma_{\psi,p}$, with equivalence relation $\sim$: $(\psi, p, \sigma) \sim (\psi_1, p_1, \sigma_1)$ iff there is $h \in P_n$ such that $\psi_1 = h.\psi$, $p_1 = h p h^{-1}$, $\sigma_1 = \sigma^h$. In [1], based on a heuristic argument, it is claimed that the irreducible representations of $\mathcal{D}^{P_n}$ should be given by the set of pairs $(\psi, \varphi)$, where $\varphi$ is an irreducible representation of the double $D(F_\psi)$ of the stabilizer $F_\psi = \{p \in P_n \mid p.\psi = \psi\}$, with equivalence relation $(\psi, \varphi) \sim (\psi_1, \varphi_1)$ iff there is $h \in P_n$ such that $\psi_1 = h.\psi$, $\varphi_1 = \varphi^h$. We note that the irreducible representations of the double $D(F_\psi)$ are labeled by $(g, \pi)/F_\psi$, where $g \in F_\psi$, $\pi$ is an irreducible representation of the centralizer of $g$ in $F_\psi$, and the action of $F_\psi$ on $(g, \pi)$ is given by $h.(g, \pi) = (h g h^{-1}, \pi^h)$.
Hence the labels of [1] are exactly the same as the labels we described above, and we have confirmed this claim of [1].

9. Examples of Fusion Rules

9.1. Some properties of S matrix for general orbifolds. Let $\mathcal{A}$ be a completely rational conformal net and let $\Gamma$ be a finite group acting properly on $\mathcal{A}$. By Th. 3.7 $\mathcal{A}^\Gamma$ has only finitely many irreducible representations. We use $\dot\lambda$ (resp. $\mu$) to label representations of $\mathcal{A}^\Gamma$ (resp. $\mathcal{A}$). We will denote the corresponding genus 0 modular matrices by $\dot S$, $\dot T$ (cf. (7)). Denote by $\dot\lambda$ (resp. $\mu$) the irreducible covariant representations of $\mathcal{A}^\Gamma$ (resp. $\mathcal{A}$) with finite index. Recall that $b_{\mu\dot\lambda} \in \mathbb{Z}$ denotes the multiplicity with which the representation $\dot\lambda$ appears in the restriction of the representation $\mu$ from $\mathcal{A}$ to $\mathcal{A}^\Gamma$. The $b_{\mu\dot\lambda}$ are also known as the branching rules.

Lemma 9.1. (1) If $\tau$ is an automorphism (i.e., $d(\tau) = 1$) then $S_{\tau(\lambda)\mu} = G(\tau,\mu)^* S_{\lambda\mu}$, where $\tau(\lambda) := \tau\lambda$, $G(\tau,\mu) = \epsilon(\tau,\mu)\epsilon(\mu,\tau)$;
(2) For any $h \in \Gamma$, let $h(\lambda)$ be the DHR representation $\lambda \cdot \mathrm{Ad}_{h^{-1}}$. Then $S_{\lambda\mu} = S_{h(\lambda)h(\mu)}$;
(3) If $\lambda \to z(\lambda)\frac{S_{\lambda\mu}}{S_{1\mu}}$ gives a representation of the fusion algebra of $\mathcal{A}$, where $z(\lambda)$ is a complex-valued function, $z(1) = 1$, then there exists an automorphism $\tau$ such that $z(\lambda) = \frac{S_{\lambda\tau}}{S_{\lambda 1}}$;
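(4) If $[\alpha_{\dot\lambda}] = [\mu\alpha_{\dot\delta}]$, then for any $\dot\lambda_1$, $\mu_1$ with $b_{\dot\lambda_1\mu_1} \neq 0$ we have
\[ \frac{S_{\dot\lambda\dot\lambda_1}}{S_{\dot 1\dot\lambda_1}} = \frac{S_{\mu\mu_1}}{S_{1\mu_1}}\, \frac{S_{\dot\delta\dot\lambda_1}}{S_{\dot 1\dot\lambda_1}} . \]

Proof. Ad (1): Since $\lambda \to \frac{S_{\lambda\mu}}{S_{1\mu}}$ is a representation of the fusion algebra, it follows that
\[ \frac{S_{\tau(\lambda)\mu}}{S_{1\mu}} = \frac{S_{\lambda\mu}}{S_{1\mu}}\, \frac{S_{\tau\mu}}{S_{1\mu}} . \]
On the other hand
\[ \frac{S_{\tau\mu}}{S_{1\mu}} = \frac{\omega_\tau\,\omega_\mu}{\omega_{\tau\mu}} = G(\tau,\mu)^* , \]
where the last equation follows from the monodromy equation (cf. [37]), and (1) is proved.

Ad (2): By Lemma 3.1, it is sufficient to show that $N^{h(\delta)}_{h(\lambda)h(\mu)} = N^{\delta}_{\lambda\mu}$ and $\omega_{h(\lambda)} = \omega_\lambda$. The first equation follows from the definition. For the second one, we note that $\omega_\lambda = \pi_\lambda(R(2\pi))$. Since $h$ commutes with the vacuum unitary representation of Möb, it follows that $\omega_{h(\lambda)} = \omega_\lambda$.

Ad (3): By assumption $\lambda \to z(\lambda)\frac{S_{\lambda 1}}{S_{11}}$ is a non-trivial representation of the fusion algebra, and so there exists $\tau$ such that $z(\lambda)\frac{S_{\lambda 1}}{S_{11}} = \frac{S_{\lambda\tau}}{S_{1\tau}}$, $\forall\lambda$. Hence $|z(\lambda)| \le 1$. From
\begin{align}
z(\lambda_1)\frac{S_{\lambda_1\mu}}{S_{1\mu}}\; z(\lambda_2)\frac{S_{\lambda_2\mu}}{S_{1\mu}} &= \sum_{\lambda_3} N^{\lambda_3}_{\lambda_1\lambda_2}\, z(\lambda_1) z(\lambda_2)\, \frac{S_{\lambda_3\mu}}{S_{1\mu}} \tag{23} \\
&= \sum_{\lambda_3} N^{\lambda_3}_{\lambda_1\lambda_2}\, z(\lambda_3)\, \frac{S_{\lambda_3\mu}}{S_{1\mu}} , \tag{24}
\end{align}
we have
\[ \sum_{\lambda_3} N^{\lambda_3}_{\lambda_1\lambda_2}\,\big(z(\lambda_1) z(\lambda_2) - z(\lambda_3)\big)\, \frac{S_{\lambda_3\mu}}{S_{1\mu}} = 0 . \]
Using $N^{\lambda_3}_{\lambda_1\lambda_2} = \sum_\delta \frac{S_{\lambda_1\delta} S_{\lambda_2\delta} S^*_{\lambda_3\delta}}{S_{1\delta}}$ and the orthogonal property of the S matrix in Lemma 3.3 we have
\[ N^{\lambda_3}_{\lambda_1\lambda_2}\,\big(z(\lambda_1) z(\lambda_2) - z(\lambda_3)\big) = 0 . \]
Since $N^{1}_{\lambda_1\bar\lambda_1} = 1$ we have $z(\lambda_1) z(\bar\lambda_1) = 1$. So we conclude that $|z(\lambda)| = 1$, $\forall\lambda$, and
\[ \frac{1}{S_{1\tau}^2} = \sum_\lambda \left|\frac{S_{\lambda\tau}}{S_{1\tau}}\right|^2 = \sum_\lambda \left|\frac{S_{\lambda 1}}{S_{11}}\right|^2 = \frac{1}{S_{11}^2} . \]
Hence $S_{1\tau} = S_{11}$ and $d(\tau) = 1$, i.e., $\tau$ is an automorphism.

Ad (4): By [49] or [4] there is a unit vector $\psi$ in the vector space spanned by the irreducible components of $\alpha_{\dot\lambda_2}$, $\forall \dot\lambda_2$, such that
\[ \alpha_{\dot\lambda}\psi = \frac{S_{\dot\lambda\dot\lambda_1}}{S_{\dot 1\dot\lambda_1}}\,\psi, \qquad \mu\psi = \frac{S_{\mu\mu_1}}{S_{1\mu_1}}\,\psi, \qquad \alpha_{\dot\delta}\psi = \frac{S_{\dot\delta\dot\lambda_1}}{S_{\dot 1\dot\lambda_1}}\,\psi , \]
and (4) follows immediately.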
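As a numerical illustration of the Verlinde formula used in the proof of (3) and of the relation $S_{\tau(\lambda)\mu} = G(\tau,\mu)^* S_{\lambda\mu}$ in (1), the following minimal sketch may help. It is not taken from the text: it assumes the standard S matrix of the $SU(2)$ level $k$ model, $S_{ab} = \sqrt{2/(k+2)}\,\sin\frac{\pi(a+1)(b+1)}{k+2}$, with labels $a = 0,\ldots,k$ and the automorphism $\tau(a) = k - a$, for which the phase is $(-1)^b$. The function names are hypothetical.

```python
import math

def su2_s_matrix(k):
    """S matrix of SU(2) at level k (assumed example); label 0 is the vacuum."""
    norm = math.sqrt(2.0 / (k + 2))
    return [[norm * math.sin(math.pi * (a + 1) * (b + 1) / (k + 2))
             for b in range(k + 1)] for a in range(k + 1)]

def verlinde(S):
    """N[a][b][c] = sum_m S[a][m] S[b][m] conj(S[c][m]) / S[0][m].

    S is real here, so the complex conjugate is omitted.
    """
    n = len(S)
    return [[[round(sum(S[a][m] * S[b][m] * S[c][m] / S[0][m]
                        for m in range(n)))
              for c in range(n)] for b in range(n)] for a in range(n)]

def simple_current_defect(S):
    """Largest deviation from S[k-a][b] = (-1)^b S[a][b]."""
    k = len(S) - 1
    return max(abs(S[k - a][b] - (-1) ** b * S[a][b])
               for a in range(k + 1) for b in range(k + 1))

if __name__ == "__main__":
    S = su2_s_matrix(4)
    N = verlinde(S)
    print(N[1][1])                    # fusion of the label 1 with itself
    print(simple_current_defect(S))   # numerically zero
```

In this example the fusion coefficients come out as non-negative integers, and the automorphism $\tau$ has $S_{\tau 0} = S_{00}$, i.e. $d(\tau) = 1$, in line with the argument in Ad (3).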
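9.2. Fusions of solitons in cyclic orbifolds. Let $\mathcal{B} \subset \mathcal{D}$ be as in §6.1. Set $i = 0$ in Th. 8.4. In this case $\psi$ is a constant function, and we denote it by its value $\lambda$. For simplicity we will label the representation $\pi_{\lambda,g^j,i}$ ($g = (0 1 \ldots n-1)$) by $(\lambda, g^j, i)$. Define $(\lambda i) := (\lambda, 1, i)$, where $i \in \mathbb{Z}_n \simeq \hat{\mathbb{Z}}_n$.

Lemma 9.2. If $(k, n) = 1$, then
\[ \sum_{0 \le j \le n-1} e^{\frac{2\pi i k j}{n}}\, N^{(\delta j)}_{(\lambda 0)(\mu 0)} = N^{\delta}_{\lambda\mu} . \]

Proof. Let $V := \mathrm{Hom}(\delta, \lambda\mu) \subset \mathcal{A}(J_0)$. Note that $\mathbb{Z}_n$ acts on $W := V \otimes V \otimes \cdots \otimes V$ ($n$ tensor factors) by permutations. Let $W_j := \{w \in W \mid \beta_g(w) = e^{-\frac{2\pi i j}{n}} w\}$. Note that if $w \in W_j$, then $w v^j \in \mathrm{Hom}(v^{-j}\delta^{\otimes n} v^j, \lambda^{\otimes n} \cdot \mu^{\otimes n}) \cap \mathcal{D}^{\mathbb{Z}_n}(J_0)$, where $v$ is defined as before Lemma 6.1. Hence we have an injective map $w \in W_j \to w v^j \in \mathrm{Hom}((\delta j), (\lambda 0)(\mu 0))$. By definition the map is also surjective. So we have
\[ \sum_{0 \le j \le n-1} e^{-\frac{2\pi i k j}{n}}\, N^{(\delta j)}_{(\lambda 0)(\mu 0)} = \sum_{0 \le j \le n-1} e^{-\frac{2\pi i k j}{n}} \dim W_j = \mathrm{Tr}_W(\beta_{g^k}) . \]
When $(k, n) = 1$, $g^k$ is one cycle, and it follows that $\mathrm{Tr}_W(\beta_{g^k}) = \dim V = N^{\delta}_{\lambda\mu}$. Taking the complex conjugate of both sides, we have proved the lemma.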
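The trace identity at the end of this proof can be checked by brute force in a toy setting. The sketch below is only an illustration under simplifying assumptions: $V$ is replaced by $\mathbb{C}^d$ with a fixed basis, and $\beta_{g^k}$ by the cyclic shift of tensor factors by $k$ positions, so that its trace is the number of basis tuples fixed by the shift. The function names are hypothetical.

```python
import cmath
import math
from itertools import product

def trace_of_shift(d, n, k):
    """Trace of the cyclic shift by k positions on (C^d)^{tensor n}.

    The shift permutes the product basis, so its trace is the number
    of basis tuples that it leaves fixed.
    """
    count = 0
    for t in product(range(d), repeat=n):
        if all(t[(i + k) % n] == t[i] for i in range(n)):
            count += 1
    return count

def eigenspace_dims(d, n):
    """dim W_j, where the generator acts on W_j by exp(-2 pi i j / n).

    Uses the character projection
    dim W_j = (1/n) sum_k exp(2 pi i j k / n) Tr(shift by k).
    """
    dims = []
    for j in range(n):
        total = sum(cmath.exp(2 * math.pi * 1j * j * k / n) * trace_of_shift(d, n, k)
                    for k in range(n))
        dims.append(round((total / n).real))
    return dims

def twisted_sum(d, n, k):
    """sum_j exp(-2 pi i k j / n) dim W_j, which should equal Tr(shift by k)."""
    dims = eigenspace_dims(d, n)
    return sum(cmath.exp(-2 * math.pi * 1j * k * j / n) * dims[j] for j in range(n))

if __name__ == "__main__":
    print(trace_of_shift(3, 4, 1), trace_of_shift(3, 4, 2))
    print(eigenspace_dims(2, 2))
    print(twisted_sum(3, 5, 2))
```

In this toy model the trace equals $d^{(k,n)}$, so it reduces to $d = \dim V$ exactly when $(k,n) = 1$, which is the case used in the lemma.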
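Lemma 9.3. Let $f_\mu := (\mu, g, 0)$. Then:
(1) $G(\sigma, f_\mu) = e^{\frac{2\pi l_1 i}{n}}$ for some integer $l_1$ with $(l_1, n) = 1$;
(2) $\lambda \to \frac{S_{(\lambda 0) f_\mu}}{S_{(10) f_\mu}}$ is a representation of the fusion algebra of $\mathcal{A}$;
(3) There exists an automorphism $\tau$, $[\tau^2] = [1]$, such that
\[ \frac{S_{(\lambda 0) f_\mu}}{S_{(10) f_\mu}} = \frac{S_{\lambda\tau(\mu)}}{S_{1\tau(\mu)}} . \]

Proof. Ad (1): By the paragraph after (47) in [33] we have $G(\sigma^{k(1)}, f_\mu) = e^{\frac{2\pi i}{n}}$, where $(k(1), n) = 1$. By (1) of Lemma 6.1 we have $G(\sigma, f_\mu)^{k(1)} = e^{\frac{2\pi i}{n}}$. Choosing $l_1$ such that $l_1 k(1) \equiv 1 \bmod n$, we have $G(\sigma, f_\mu) = e^{\frac{2\pi l_1 i}{n}}$ for some integer $l_1$ with $(l_1, n) = 1$.

As for (2) and (3), first we note that by Lemma 6.1, if $\delta \prec (\lambda 0)(\mu 0)$, then $\delta$ is untwisted. Suppose that $\delta$ is an irreducible component of the restriction of $(\delta_1, \ldots, \delta_n)$ to $\mathcal{D}^{\mathbb{Z}_n}$. We claim that $S_{\delta f_\mu} = 0$ if $\delta_i \neq \delta_j$ for some $i \neq j$. In fact if $\delta_i \neq \delta_j$ for some $i \neq j$, then the stabilizer of $(\delta_1, \ldots, \delta_n)$ under the action of $\mathbb{Z}_n$ is a proper subgroup of $\mathbb{Z}_n$, and by Th. 4.5 $\alpha_\delta$ is reducible, and $[\sigma^k \delta] = [\delta]$ for some $1 \le k \le n-1$. By (1) of Lemma 9.1 we have
\[ S_{\delta f_\mu} = S_{\sigma^k(\delta) f_\mu} = G(\sigma^k, f_\mu)^* S_{\delta f_\mu} . \]
Since $G(\sigma^k, f_\mu) = e^{\frac{2\pi i k l_1}{n}}$ with $(l_1, n) = 1$ by (1), $G(\sigma^k, f_\mu)^* \neq 1$, hence $S_{\delta f_\mu} = 0$. So we have
\begin{align}
\frac{S_{(\lambda_1 0) f_\mu}}{S_{(10) f_\mu}}\, \frac{S_{(\lambda_2 0) f_\mu}}{S_{(10) f_\mu}} &= \sum_{\lambda_3,\, 0 \le j \le n-1} N^{(\lambda_3 j)}_{(\lambda_1 0)(\lambda_2 0)}\, \frac{S_{(\lambda_3 j) f_\mu}}{S_{(10) f_\mu}} \tag{25} \\
&= \sum_{\lambda_3,\, 0 \le j \le n-1} N^{(\lambda_3 j)}_{(\lambda_1 0)(\lambda_2 0)}\, e^{\frac{2\pi i j l_1}{n}}\, \frac{S_{(\lambda_3 0) f_\mu}}{S_{(10) f_\mu}} \tag{26} \\
&= \sum_{\lambda_3} N^{\lambda_3}_{\lambda_1\lambda_2}\, \frac{S_{(\lambda_3 0) f_\mu}}{S_{(10) f_\mu}} , \tag{27}
\end{align}
where we have used (1) of Lemma 9.1 and Lemma 9.2 in the second = and third = respectively.

Ad (2): Since $\alpha_{f_\mu} = (\mu, 1, \ldots, 1)\alpha_{f_1}$ by (48) of [33], by (4) of Lemma 9.1 we have
\[ \frac{S_{f_\mu (\lambda 0)}}{S_{(10)(\lambda 0)}} = \frac{S_{\mu\lambda}}{S_{1\lambda}}\, \frac{S_{f_1 (\lambda 0)}}{S_{(10)(\lambda 0)}} . \]
Combined with (1) it follows that there exists $\tau$ such that the map
\[ \lambda \to \frac{S_{\lambda\tau}\, S_{\lambda\mu}}{d(\lambda)\, S_{1\tau}\, S_{1\mu}} \]
gives a representation of the fusion algebra of $\mathcal{A}$. By (3) of Lemma 9.1 we have that $\tau$ is an automorphism and
\[ \frac{S_{(\lambda 0) f_1}}{S_{(10) f_1}} = \frac{S_{\lambda\tau}}{S_{1\tau}} . \]
Let $h \in P_n$ be such that $h g h^{-1} = g^{-1}$. By definition $h((\lambda 0)) = (\lambda 0)$. By Prop. 6.2 $[h(f_1)] = [\sigma^j(\bar f_1)]$ for some $1 \le j \le n$, and it follows by Lemma 9.1 that
\[ S_{(\lambda 0) f_1} = S_{h((\lambda 0)) h(f_1)} = S_{(\lambda 0)\sigma^j(\bar f_1)} = S_{(\lambda 0)\bar f_1} = S^*_{(\lambda 0) f_1} , \]
hence
\[ \frac{S_{\lambda\tau}}{S_{1\tau}} = \frac{S_{\lambda\bar\tau}}{S_{1\bar\tau}}, \quad \forall\lambda , \]
and so $[\tau] = [\bar\tau]$.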
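We conjecture that $[\tau] = [1]$ in the above lemma. Let $f_1 := (1, g, 0)$, where 0 stands for the trivial representation of $\mathbb{Z}_n$. In [33] questions about the nature of $[\alpha^k_{f_1}] = [\pi^k_{1,g}]$ (cf. (44) of [33]), where $k$ is an integer, are raised.

Proposition 9.4. When $n$ is even we have
\[ [\pi^n_{1,g}] = \sum_{\lambda_1,\ldots,\lambda_n} M_{\lambda_1,\ldots,\lambda_n}\, [(\lambda_1, \ldots, \lambda_n)] , \]
where
\[ M_{\lambda_1,\ldots,\lambda_n} := \sum_\lambda S_{1\lambda}^{\,2-2g} \prod_{1 \le i \le n} \frac{S_{\lambda_i\lambda}}{S_{1\lambda}} \quad\text{with } g = \frac{(n-1)(n-2)}{2} . \]

Proof. We note that by Lemma 6.1 $\pi^{n-1}_{1,g,0}$ is untwisted, and must be a sum of irreducible untwisted representations. It follows by Cor. 8.4 of [33] that
\[ [\alpha^n_f] = \sum_{\lambda_1,\ldots,\lambda_n} M_{\lambda_1,\ldots,\lambda_n}\, [(\lambda_1, \ldots, \lambda_n)] \]
with $M_{\lambda_1,\ldots,\lambda_n}$ non-negative integers. Let $\mu$ be any irreducible subsector of $\alpha^{n-1}_f$. By the equation above $\mu\alpha_f \prec (\lambda_1, \ldots, \lambda_n)$ for some $(\lambda_1, \ldots, \lambda_n)$, and by Frobenius duality $\mu \prec (\lambda_1, \ldots, \lambda_n)\bar\alpha_f$. By (46) of [33], $[(\lambda_1, \ldots, \lambda_n)\bar\alpha_f] = \sum_\lambda \langle\lambda_1 \cdots \lambda_n, \lambda\rangle\, [(\lambda, 1, 1, \ldots, 1)\bar\alpha_f]$, and by (48) of [33] each $(\lambda, 1, 1, \ldots, 1)\bar\alpha_f$ is irreducible. Hence $[\mu] = [(\lambda, 1, 1, \ldots, 1)\bar\alpha_f]$ for some $\lambda$. Hence
\[ [\alpha^{n-1}_{f_1}] = [\pi^{n-1}_{1,g}] = \sum_\lambda m_\lambda\, [(\lambda, 1, \ldots, 1)\bar\pi_{1,g}] \]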
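with $m_\lambda$ non-negative integers. By (4) of Lemma 9.1 we have
\[ \left(\frac{S_{f_1(\mu 0)}}{S_{(10)(\mu 0)}}\right)^{n-1} = \sum_\lambda m_\lambda\, \frac{S_{\lambda\mu}}{S_{1\mu}} \left(\frac{S_{f_1(\mu 0)}}{S_{(10)(\mu 0)}}\right)^{*} . \]
Note that $\frac{1}{S^2_{(10)(10)}} = \mu_{\mathcal{D}^{\mathbb{Z}_n}} = n^2 \mu_{\mathcal{D}} = n^2 \frac{1}{S_{11}^{2n}}$, hence $S_{(10)(10)} = \frac{S_{11}^n}{n}$. From $\frac{S_{(\lambda 0)(10)}}{S_{(10)(10)}} = d((\lambda 0)) = d(\lambda)^n$ we have $S_{(\lambda 0)(10)} = \frac{S_{\lambda 1}^n}{n}$. By (2) of Lemma 9.3 and our assumption that $n$ is even, and hence $[\tau^n] = [1]$, we have
\[ \frac{1}{S_{1\mu}^{(n-1)(n-2)}} = \sum_\lambda m_\lambda\, \frac{S^*_{\lambda\mu}}{S_{1\mu}} . \]
By the orthogonal property of the S matrix in Lemma 3.3 we have
\[ m_\lambda = \sum_\mu \frac{S_{\lambda\mu}}{S_{1\mu}^{\,n^2-3n+1}} . \]
Combine this with (46) of [33] and (8); the proposition follows.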
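The coefficients $M_{\lambda_1,\ldots,\lambda_n}$ have the form of the Verlinde formula for genus $g$ conformal blocks with $n$ insertions, and can be evaluated numerically once an S matrix is given. The following is a minimal sketch under stated assumptions rather than a computation from the paper: it uses the standard S matrix of the $SU(2)$ level $k$ model (labels $0,\ldots,k$, vacuum label 0) as input, and the function names are hypothetical.

```python
import math

def su2_s_matrix(k):
    """S matrix of SU(2) at level k (assumed example); label 0 is the vacuum."""
    norm = math.sqrt(2.0 / (k + 2))
    return [[norm * math.sin(math.pi * (a + 1) * (b + 1) / (k + 2))
             for b in range(k + 1)] for a in range(k + 1)]

def block_dimension(S, labels, genus):
    """sum_l S[0][l]^(2 - 2*genus) * prod_i S[labels[i]][l] / S[0][l]."""
    total = 0.0
    for l in range(len(S)):
        term = S[0][l] ** (2 - 2 * genus)
        for a in labels:
            term *= S[a][l] / S[0][l]
        total += term
    return round(total)

def m_coefficient(S, labels):
    """M for n = len(labels) insertions, with g = (n-1)(n-2)/2."""
    n = len(labels)
    return block_dimension(S, labels, (n - 1) * (n - 2) // 2)

if __name__ == "__main__":
    S = su2_s_matrix(1)
    # n = 4 gives genus g = 3
    print(m_coefficient(S, [0, 0, 0, 0]))
    print(m_coefficient(S, [1, 0, 0, 0]))
```

For $n = 2$ the genus is 0 and the formula reduces to $\sum_\lambda S_{\lambda_1\lambda} S_{\lambda_2\lambda}$, which for this self-conjugate example is $\delta_{\lambda_1\lambda_2}$.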
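We remark that $M_{\lambda_1,\ldots,\lambda_n}$ is the dimension of the space of genus $\frac{(n-1)(n-2)}{2}$ conformal blocks with the insertion of representations $\lambda_1, \ldots, \lambda_n$. Also note that $\frac{(n-1)(n-2)}{2}$ is the genus of an algebraic curve of degree $n$. It may be interesting to give a geometric interpretation of Prop. 9.4. Note that if the conjecture $[\tau] = [1]$ after Lemma 9.3 is true, then the above proposition is also true for odd $n$.

9.3. n = 2 case. In this section we consider the fusion rules for the simplest non-trivial case $n = 2$. Partial results have been obtained in §8 of [33]. We will confirm the results in §4.6 of [5]. Let us first simplify our notations by introducing notations similar to those in [5]. Let $\widehat{(\lambda 0)} := (\lambda, -1, 0)$, $\widehat{(\lambda 1)} := (\lambda, -1, 1)$. Note that by §2 of [33] we can choose
\[ \omega_{\widehat{(\lambda 0)}} = e^{\frac{2\pi i\left(\Delta_\lambda + \frac{c}{8}\right)}{2}}, \qquad \omega_{\widehat{(\lambda 1)}} = e^{\frac{2\pi i\left(\Delta_\lambda + 1 + \frac{c}{8}\right)}{2}}, \tag{28} \]
where $c$ is the central charge. We also note that by definition $\omega_{(\lambda 0)} = e^{4\pi i \Delta_\lambda} = \omega_{(\lambda 1)}$, and $\omega_{(\lambda_1\lambda_2)} = e^{2\pi i(\Delta_{\lambda_1} + \Delta_{\lambda_2})}$ for $\lambda_1 \neq \lambda_2$.

Lemma 9.5. (1)
\[ N^{(\lambda_3 0)}_{\widehat{(\lambda_1\epsilon_1)}\widehat{(\lambda_2\epsilon_2)}} + N^{(\lambda_3 1)}_{\widehat{(\lambda_1\epsilon_1)}\widehat{(\lambda_2\epsilon_2)}} = \sum_\mu \frac{S^2_{\bar\lambda_3\mu}\, S_{\lambda_1\mu}\, S_{\lambda_2\mu}}{S^2_{1\mu}} , \]
where $\epsilon_1, \epsilon_2 = 0$ or $1$;
(2)
\[ N^{(\lambda_4\lambda_5)}_{\widehat{(\lambda_1\epsilon_1)}\widehat{(\lambda_2\epsilon_2)}} = \sum_\mu \frac{S_{\bar\lambda_4\mu}\, S_{\bar\lambda_5\mu}\, S_{\lambda_1\mu}\, S_{\lambda_2\mu}}{S^2_{1\mu}} ; \]
(3)
\[ \sum_{\mu,\lambda_3} \frac{S^2_{\lambda_3\mu}\, S_{\lambda_1\mu}\, S_{\lambda_2\mu}}{S^2_{1\mu}}\, \frac{1}{\omega^2_{\lambda_3}}\, d((\lambda_3 0)) + \sum_{\mu,\,\lambda_4 \neq \lambda_5} \frac{S_{\lambda_4\mu}\, S_{\lambda_5\mu}\, S_{\lambda_1\mu}\, S_{\lambda_2\mu}}{S^2_{1\mu}}\, \frac{1}{\omega_{\lambda_4}\omega_{\lambda_5}}\, d(\lambda_4) d(\lambda_5) = \sum_\mu \frac{S_{\lambda_1\mu}\, T^2_\mu\, S_{\lambda_2\mu}}{S^2_{11}}\, e^{\frac{-2\pi i c_0}{6}} , \tag{29} \]
where $c_0$ is defined as in (3.2).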
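Proof. Ad (1): By (48) and (3) of Prop. 8.8 in [33] we have
\[ \langle \alpha_{\widehat{(\lambda_1\epsilon_1)}}\, \alpha_{\widehat{(\lambda_2\epsilon_2)}},\ (\lambda_3, \lambda_3) \rangle = \langle \lambda_1\lambda_2\bar\lambda_3\bar\lambda_3,\ 1 \rangle . \]
Note that
\[ \langle \alpha_{\widehat{(\lambda_1\epsilon_1)}}\, \alpha_{\widehat{(\lambda_2\epsilon_2)}},\ (\lambda_3, \lambda_3) \rangle = N^{(\lambda_3 0)}_{\widehat{(\lambda_1\epsilon_1)}\widehat{(\lambda_2\epsilon_2)}} + N^{(\lambda_3 1)}_{\widehat{(\lambda_1\epsilon_1)}\widehat{(\lambda_2\epsilon_2)}} , \]
and by (8) (1) is proved. Part (2) is proved in a similar way.

Ad (3): Since $d((\lambda_3 0)) = d(\lambda_3)^2$ and $d(\lambda) = \frac{S_{\lambda 1}}{S_{11}}$, we have
\begin{align}
&\sum_{\mu,\lambda_3} \frac{S^2_{\lambda_3\mu}\, S_{\lambda_1\mu}\, S_{\lambda_2\mu}}{S^2_{1\mu}}\, \frac{1}{\omega^2_{\lambda_3}}\, d((\lambda_3 0)) + \sum_{\mu,\,\lambda_4 \neq \lambda_5} \frac{S_{\lambda_4\mu}\, S_{\lambda_5\mu}\, S_{\lambda_1\mu}\, S_{\lambda_2\mu}}{S^2_{1\mu}}\, \frac{1}{\omega_{\lambda_4}\omega_{\lambda_5}}\, d(\lambda_4) d(\lambda_5) \tag{30} \\
&\qquad = \frac{1}{S^2_{11}} \sum_\mu \frac{S_{\lambda_1\mu}\, S_{\lambda_2\mu}}{S^2_{1\mu}} \left( \sum_{\lambda_3} \frac{S_{\bar\lambda_3\mu}\, S_{\lambda_3 1}}{\omega_{\lambda_3}} \right)^{2} . \tag{31}
\end{align}
From Lemma 3.3 we have $S^* T^{-1} S^* = T S^* T$ and so
\[ \sum_{\lambda_3} S_{\bar\lambda_3\mu}\, S_{\lambda_3 1}\, \frac{1}{\omega_{\lambda_3}} = S_{1\mu}\, e^{\frac{-2\pi i c_0}{12}}\, T_\mu . \]
Substituting into the equations above we have proved (3).

Define the matrix $T^{\frac12}$ such that $T^{\frac12}_{\lambda\mu} = \delta_{\lambda\mu}\, e^{\pi i\left(\Delta_\lambda - \frac{c_0}{24}\right)}$.

Definition 9.6. $P := T^{\frac12} S T^2 S T^{\frac12}$, and $\tilde P := e^{\frac{2\pi i(c - c_0)}{8}} P$.

It follows by (6) that
\begin{align}
Y_{\widehat{(\lambda_1\epsilon_1)}\widehat{(\lambda_2\epsilon_2)}} &= \omega_{\widehat{(\lambda_1\epsilon_1)}}\, \omega_{\widehat{(\lambda_2\epsilon_2)}} \times \Bigg( \sum_{\lambda_3} \Big( N^{(\lambda_3 0)}_{\widehat{(\lambda_1\epsilon_1)}\widehat{(\lambda_2\epsilon_2)}} + N^{(\lambda_3 1)}_{\widehat{(\lambda_1\epsilon_1)}\widehat{(\lambda_2\epsilon_2)}} \Big) \frac{1}{\omega^2_{\lambda_3}}\, d((\lambda_3 0)) \tag{32} \\
&\qquad + \frac12 \sum_{\lambda_4 \neq \lambda_5} N^{(\lambda_4\lambda_5)}_{\widehat{(\lambda_1\epsilon_1)}\widehat{(\lambda_2\epsilon_2)}}\, \frac{1}{\omega_{\lambda_4}\omega_{\lambda_5}}\, d((\lambda_4\lambda_5)) \Bigg) \tag{33} \\
&= \omega_{\widehat{(\lambda_1\epsilon_1)}}\, \omega_{\widehat{(\lambda_2\epsilon_2)}} \times \Bigg( \sum_{\mu,\lambda_3} \frac{S^2_{\lambda_3\mu}\, S_{\lambda_1\mu}\, S_{\lambda_2\mu}}{S^2_{1\mu}}\, \frac{1}{\omega^2_{\lambda_3}}\, d((\lambda_3 0)) \tag{34} \\
&\qquad + \sum_{\mu,\,\lambda_4 \neq \lambda_5} \frac{S_{\lambda_4\mu}\, S_{\lambda_5\mu}\, S_{\lambda_1\mu}\, S_{\lambda_2\mu}}{S^2_{1\mu}}\, \frac{1}{\omega_{\lambda_4}\omega_{\lambda_5}}\, d(\lambda_4) d(\lambda_5) \Bigg) = e^{\pi i(\epsilon_1 + \epsilon_2)}\, \frac{\tilde P_{\lambda_1\lambda_2}}{S^2_{11}} , \tag{35}
\end{align}
where in the last = we have used (3) of Lemma 9.5. Note that $S^2_{(10)(10)} = \frac{1}{4\mu_{\mathcal{D}}} = \frac{1}{4\mu^2_{\mathcal{A}}} = \frac14 S^4_{11}$, and so $S_{(10)(10)} = \frac12 S^2_{11}$. It follows by (7) that
\[ S_{\widehat{(\lambda_1\epsilon_1)}\widehat{(\lambda_2\epsilon_2)}} = \frac12\, e^{\pi i(\epsilon_1 + \epsilon_2)}\, \tilde P_{\lambda_1\lambda_2} . \]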
Note that by Lemma 9.3 we have S(λ0)(µ0) = S(λ0)(µ0)
Sλµ Sτ λ × . S1µ S1λ
2
Since [τ 2 ] = [1], SSτ1λλ = 1, and so SSτ1λλ = ±1. By (1) of Lemma 9.1 we can choose our (λ1) such that as a set {(λ0), (λ1)} is the same as {(λ0), (λ1)} and labeling (λ0), πi S(λ) S(10)(10) (µ0) =e
From
Sλµ . S1µ
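\[ [\alpha_{(\lambda_1\lambda_2)}] = [(\lambda_1, \lambda_2)] + [(\lambda_2, \lambda_1)] \]
and (4) of Lemma 9.1 we have
\[ \frac{S_{(\lambda_1\lambda_2)(\lambda\epsilon)}}{S_{(10)(\lambda\epsilon)}} = \frac{S_{\lambda_1\lambda}\, S_{\lambda_2\lambda}}{S_{1\lambda}\, S_{1\lambda}} + \frac{S_{\lambda_2\lambda}\, S_{\lambda_1\lambda}}{S_{1\lambda}\, S_{1\lambda}} , \qquad \frac{S_{(\lambda_1\lambda_2)(\lambda\mu)}}{S_{(10)(\lambda\mu)}} = \frac{S_{\lambda_1\lambda}\, S_{\lambda_2\mu}}{S_{1\lambda}\, S_{1\mu}} + \frac{S_{\lambda_2\lambda}\, S_{\lambda_1\mu}}{S_{1\lambda}\, S_{1\mu}} . \]
Since $S_{(10)(10)} = \frac12 S^2_{11}$, we get the following entries of the S-matrix of $\mathcal{D}^{\mathbb{Z}_2}$:
\begin{align}
S_{(\lambda\mu)(\lambda_1\mu_1)} &= S_{\lambda\lambda_1} S_{\mu\mu_1} + S_{\lambda\mu_1} S_{\mu\lambda_1}, & S_{(\lambda\mu)(\lambda_1\epsilon_1)} &= S_{\lambda\lambda_1} S_{\mu\lambda_1}, \tag{36} \\
S_{(\lambda\epsilon)(\lambda_1\epsilon_1)} &= \tfrac12 S^2_{\lambda\lambda_1}, & S_{(\lambda\mu)\widehat{(\lambda_1\epsilon_1)}} &= 0, \tag{37} \\
S_{(\lambda\epsilon)\widehat{(\lambda_1\epsilon_1)}} &= \tfrac12\, e^{\pi i\epsilon}\, S_{\lambda\lambda_1}, & S_{\widehat{(\lambda\epsilon)}\widehat{(\lambda_1\epsilon_1)}} &= \tfrac12\, e^{\pi i(\epsilon + \epsilon_1)}\, \tilde P_{\lambda\lambda_1} . \tag{38}
\end{align}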
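Denote by $\dot c_0$ the number $c_0$ (well defined mod $8\mathbb{Z}$) of $\mathcal{D}^{\mathbb{Z}_2}$ (cf. (3.2)).

Lemma 9.7. (1) $\dot c_0 - 2c_0 \in 8\mathbb{Z}$; (2) $c_0 - c \in 4\mathbb{Z}$.

Proof. By Lemma 3.3 we have
\[ S T S = T^{-1} S T^{-1} . \tag{39} \]
First let us compare the $(10)(10)$ entry of both sides for the S, T matrices of $\mathcal{D}^{\mathbb{Z}_2}$. By using the formulas before the lemma we have:
\[ \left( \sum_\lambda S^2_{1\lambda}\, e^{2\pi i \Delta_\lambda} \right)^{2} = e^{\frac{2\pi i \dot c_0}{8}}\, S^2_{11} . \]
On the other hand, comparing the entry $11$ of (39) for the S-matrix of $\mathcal{D}$ we have
\[ \sum_\lambda S^2_{1\lambda}\, e^{2\pi i \Delta_\lambda} = e^{\frac{2\pi i c_0}{8}}\, S_{11} , \]
and (1) follows by combining the two equations.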
As for (2), we compare the entry (λ0)(10) of both sides of (39). By using the equa tions before the lemma the (λ0)(10) entry of the left-hand side of (39) is given by 2π i(c−c0 )
π i(c0 )
1
e 8 e 24 multiplied by the λ1 entry of the matrix P T 2 S. By applying (39) to S, T matrix of D we have 1
1
P T 2 S = T 2 ST 2 ST S = T
−1 2
ST −2 .
Using these equations to compare with the (λ0)(10) entry of right hand side of (39) we have e
2π i(c−c0 ) 4
= 1 and (2) is proved.
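The relation (39) and the matrix $P$ of Def. 9.6 can be checked numerically in a concrete example. The sketch below is an illustration under explicit assumptions, none of which come from the text: it uses the $SU(2)$ level $k$ model with $S_{ab} = \sqrt{2/(k+2)}\sin\frac{\pi(a+1)(b+1)}{k+2}$, $\Delta_a = \frac{a(a+2)}{4(k+2)}$ and $c = \frac{3k}{k+2}$, and it takes $c_0 = c$ when forming $T$. The function names are hypothetical.

```python
import cmath
import math

def su2_data(k):
    """S matrix, conformal weights and central charge of SU(2) level k (assumed example)."""
    norm = math.sqrt(2.0 / (k + 2))
    S = [[norm * math.sin(math.pi * (a + 1) * (b + 1) / (k + 2))
          for b in range(k + 1)] for a in range(k + 1)]
    delta = [a * (a + 2) / (4.0 * (k + 2)) for a in range(k + 1)]
    c = 3.0 * k / (k + 2)
    return S, delta, c

def t_power(delta, c, p):
    """Diagonal of T^p, where T_a = exp(2 pi i (Delta_a - c/24)); c0 = c is assumed."""
    return [cmath.exp(2j * math.pi * p * (d - c / 24.0)) for d in delta]

def sts_defect(k):
    """Largest entry of |S T S - T^{-1} S T^{-1}|, i.e. the failure of (39)."""
    S, delta, c = su2_data(k)
    T = t_power(delta, c, 1.0)
    n = len(S)
    worst = 0.0
    for a in range(n):
        for b in range(n):
            lhs = sum(S[a][m] * T[m] * S[m][b] for m in range(n))
            rhs = S[a][b] / (T[a] * T[b])
            worst = max(worst, abs(lhs - rhs))
    return worst

def p_matrix(k):
    """P = T^{1/2} S T^2 S T^{1/2}, computed entry by entry."""
    S, delta, c = su2_data(k)
    half = t_power(delta, c, 0.5)
    square = t_power(delta, c, 2.0)
    n = len(S)
    return [[half[a] * sum(S[a][m] * square[m] * S[m][b] for m in range(n)) * half[b]
             for b in range(n)] for a in range(n)]

if __name__ == "__main__":
    print(sts_defect(3))   # numerically zero
    print(p_matrix(1))     # close to [[0, 1], [1, 0]]
```

With these conventions $P$ is symmetric by construction, since each factor is symmetric and the product is palindromic.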
By (8) and (2) of Lemma 9.7 we immediately obtain the following fusion rules: (λ µ )
µ
µ
λ2 λ2 λ2 λ2 2 2 = Nλλ N µ2 + Nλλ21 Nµµ + Nλµ21 Nµλ + Nλµ N λ1 , N(λµ)(λ 1 1 µµ1 1 1 µµ2 1 µ1 )
(40)
( λ2 ) N(λµ)(λ 1 µ1 )
(41)
( λ2 ) ( (λ) λ1 1 ) λ2 µ2 N (λ)(λ1 1 )
N
( λ ) (λ)(λ1 1 )
λ2 λ2 = Nλλ N λ2 + Nλµ N λ2 , 1 µµ1 1 µλ1
1 λ2 N (N λ2 + eπi(+1 +2 ) ), 2 λλ1 λλ1 µ µ = Nλλ1 Nµλ¯2 , =
2 N 2 =
2
µ
Sλµ Pλ µ Pλ µ 1 Sλµ Sλ1 µ Sλ2 µ 1 1 2 + eπi(+1 +2 ) , 2 2 2 µ 2 S1µ S 1µ µ
(42) (43)
2
(44)
where $\epsilon, \epsilon_1, \epsilon_2 = 0$ or $1$. Let us summarize the above equations in the following:

Theorem 9.8. The fusion rules of $\mathcal{D}^{\mathbb{Z}_2}$ are given by the above equations.

From the theorem we immediately have:

Corollary 9.9. For any completely rational $\mathcal{A}$,
\[ \frac12 \sum_\mu \frac{S^2_{\lambda_1\mu}\, S_{\lambda_2\mu}\, S_{\lambda_3\mu}}{S^2_{1\mu}} \pm \frac12 \sum_\mu \frac{S_{\lambda_1\mu}\, P_{\lambda_2\mu}\, P_{\lambda_3\mu}}{S_{1\mu}} \]
is a non-negative integer, where $P$ is defined in (9.6).

Corollary 9.9 confirms a conjecture in §4.6 of [5]. We note that even for known examples the direct confirmation of Cor. 9.9 seems to be very tedious. It will be an interesting question to generalize our results to the cases $n > 2$.

References

1. Bantay, P.: Permutation orbifolds. Nucl. Phys. B 633(3), 365–378 (2002)
2. Barron, K., Dong, C., Mason, G.: Twisted sectors for tensor product vertex operator algebras associated to permutation groups. Commun. Math. Phys. 227(2), 349–384 (2002)
3. Böckenhauer, J., Evans, D.E.: Modular invariants, graphs and α-induction for nets of subfactors I. Commun. Math. Phys. 197, 361–386 (1998)
4. Böckenhauer, J., Evans, D.E., Kawahigashi, Y.: On α-induction, chiral projectors and modular invariants for subfactors. Commun. Math. Phys. 208, 429–487 (1999)
5. Borisov, L., Halpern, M.B., Schweigert, C.: Systematic approach to cyclic orbifold. Internat. J. Modern Phys. A 13(1), 125–168 (1998)
6. Brunetti, R., Guido, D., Longo, R.: Modular structure and duality in conformal quantum field theory. Commun. Math. Phys. 156, 201–219 (1993)
7. Doplicher, S., Haag, R., Roberts, J.E.: Local observables and particle statistics. I. Commun. Math. Phys. 23, 199–230 (1971); II. 35, 49–85 (1974)
8. Fredenhagen, K., Rehren, K.-H., Schroer, B.: Superselection sectors with braid group statistics and exchange algebras. I. Commun. Math. Phys. 125, 201–226 (1989); II. Rev. Math. Phys. Special issue, 113–157 (1992)
9. Fröhlich, J., Gabbiani, F.: Operator algebras and conformal field theory. Commun. Math. Phys. 155, 569–640 (1993)
10. Guido, D., Longo, R.: Relativistic invariance and charge conjugation in quantum field theory. Commun. Math. Phys. 148, 521–551 (1992)
11. Guido, D., Longo, R.: The conformal spin and statistics theorem. Commun. Math. Phys. 181, 11–35 (1996)
12. Guido, D., Longo, R., Wiesbrock, H.-W.: Extensions of conformal nets and superselection structures. Commun. Math. Phys. 192, 217–244 (1998)
13. Haag, R.: “Local Quantum Physics”. 2nd ed., Berlin, Heidelberg, New York: Springer-Verlag, 1996
14. Hiai, F.: Minimizing index of conditional expectations onto subfactors. Publ. Res. Inst. Math. Sci. 24, 673–678 (1988)
15. Izumi, M., Longo, R., Popa, S.: A Galois correspondence for compact groups of automorphisms of von Neumann Algebras with a generalization to Kac algebras. J. Funct. Anal. 155, 25–63 (1998)
16. Jones, V.F.R.: Index for subfactors. Invent. Math. 72, 1–25 (1983)
17. Kac, V.G.: “Infinite Dimensional Lie Algebras”. 3rd Edition, Cambridge: Cambridge University Press, 1990
18. Kac, V.G., Todorov, I.: Affine orbifolds and rational conformal field theory extensions of W1+∞. Commun. Math. Phys. 190, 57–111 (1997)
19. Kac, V.G., Wakimoto, M.: Modular and conformal invariance constraints in representation theory of affine algebras. Adv. in Math. 70, 156–234 (1988)
20. Karpilovsky, G.: Projective representations of finite groups. Monographs and Textbooks in Pure and Applied Mathematics, Vol. 94, New York: Marcel Dekker, Inc., 1985
21. Kawahigashi, Y., Longo, R., Müger, M.: Multi-interval subfactors and modularity of representations in conformal field theory. Commun. Math. Phys. 219, 631–669 (2001)
22. Kosaki, H.: “Type III Factors and Index Theory”. Res. Inst. Math. Lect. Notes 43, Seoul: Seoul Nat. Univ. Global Analysis Research Center, 1998
23. Kosaki, H., Longo, R.: A remark on the minimal index of subfactors. J. Funct. Anal. 107, 458–470 (1992)
24. Longo, R.: Simple injective subfactors. Adv. Math. 63, 152–171 (1987)
25. Longo, R.: Index of subfactors and statistics of quantum fields. I. Commun. Math. Phys. 126, 217–247 (1989)
26. Longo, R.: Index of subfactors and statistics of quantum fields. II. Commun. Math. Phys. 130, 285–309 (1990)
27. Longo, R.: An analogue of the Kac-Wakimoto formula and black hole conditional entropy. Commun. Math. Phys. 186, 451–479 (1997)
28. Longo, R.: Minimal index and braided subfactors. J. Funct. Anal. 109, 98–112 (1992)
29. Longo, R.: A duality for Hopf algebras and for subfactors. I. Commun. Math. Phys. 159, 133–150 (1994)
30. Longo, R.: Conformal subnets and intermediate subfactors. Commun. Math. Phys. 237(1–2), 7–30 (2003)
31. Longo, R., Rehren, K.-H.: Nets of subfactors. Rev. Math. Phys. 7, 567–597 (1995)
32. Longo, R., Roberts, J.E.: A theory of dimension. K-theory 11, 103–159 (1997)
33. Longo, R., Xu, F.: Topological sectors and a dichotomy in conformal field theory. To appear in Commun. Math. Phys., DOI 10.1007/s00220-004-1063-1, 2004
34. Müger, M.: On soliton automorphisms in massive and conformal theories. Rev. Math. Phys. 11(3), 337–359 (1999)
35. Pimsner, M., Popa, S.: Entropy and index for subfactors. Ann. Scient. Ec. Norm. Sup. 19, 57–106 (1986)
36. Pressley, A., Segal, G.: “Loop Groups”. Oxford: Oxford University Press, 1986
37. Rehren, K.-H.: Braid group statistics and their superselection rules. In: “The Algebraic Theory of Superselection Sectors”, D. Kastler (ed.), Singapore: World Scientific, 1990
38. Roberts, J.E.: Local cohomology and superselection structure. Commun. Math. Phys. 51, 107–119 (1976)
39. Turaev, V.G.: Quantum invariants of knots and 3-manifolds. Berlin, New York: Walter de Gruyter, 1994
40. Takesaki, M.: “Theory of Operator Algebras”. Vol. I, II, III, Springer Encyclopaedia of Mathematical Sciences 124 (2002); 125, 127 (2003)
41. Toledano Laredo, V.: “Fusion of Positive Energy Representations of LSpin2n”. Ph.D. dissertation, University of Cambridge, 1997
42. Xu, F.: New braided endomorphisms from conformal inclusions. Commun. Math. Phys. 192, 347–403 (1998)
43. Xu, F.: Applications of braided endomorphisms from conformal inclusions. Int. Math. Res. Notices 1, 5–23 (1998)
44. Xu, F.: Algebraic orbifold conformal field theories. Proc. Nat. Acad. Sci. USA 97(26), 14069–14073 (2000)
45. Xu, F.: Algebraic coset conformal field theories II. Publ. RIMS, Kyoto Univ. 35, 795–824 (1999)
46. Xu, F.: Jones-Wassermann subfactors for disconnected intervals. Commun. Contemp. Math. 2, 307–347 (2000)
47. Xu, F.: On a conjecture of Kac-Wakimoto. Publ. RIMS, Kyoto Univ. 37, 165–190 (2001)
48. Xu, F.: 3-manifold invariants from cosets. math.GT/9907077
49. Xu, F.: Strong additivity and conformal nets. Preprint 2003
50. Xu, F.: Algebraic coset conformal field theories. Commun. Math. Phys. 211, 1–43 (2000)

Communicated by Y. Kawahigashi