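$b \cdot S^{-1}(\Theta^3)\,\tilde q^2_\rho\,\Theta^2_{\langle 1\rangle} u_{\langle 1\rangle}$,
(6.3)
for all $a \in A$, $b \in B$ and $u \in \mathbb{A}$, where $\tilde q_\rho = \tilde q^1_\rho \otimes \tilde q^2_\rho$ is the element defined in (2.34). We will prove that $\mu$ is an algebra isomorphism. First, observe that the multiplication of $(A \otimes B) \bowtie \mathbb{A}$ is defined by
$$((a \otimes b) \bowtie u)((a' \otimes b') \bowtie u') = [(\Omega^1 \cdot a)(\Omega^2 u_{\langle 0\rangle_{[-1]}} \cdot a') \otimes (b \cdot \Omega^5)(b' \cdot S^{-1}(u_{\langle 1\rangle})\Omega^4)] \bowtie \Omega^3 u_{\langle 0\rangle_{[0]}} u',$$
(6.4)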
for all a, a ∈ A, b, b ∈ B and u, u ∈ A. By using (4.1), (2.40), (2.44) and several times (2.36) and (2.26), we obtain that 11 1 ⊗ 12 2 ⊗ q˜ρ1 ( 2 3 ) 0 ⊗ 5 S −1 ( 3 )1 (q˜ρ2 )1 ( 2 3 ) 11 ⊗ 4 S −1 ( 3 )2 ×(q˜ρ2 )2 ( 2 3 ) 12 = x˜ 1λ 1 ⊗ x˜ 2λ 1 2[−1] ⊗ x˜ 3λ q˜ρ1 ( 2 2[0] Q˜ 1ρ 0 ) 0 x˜ 1ρ 1
2
⊗S −1 ( 3 3 )q˜ρ2 ( 2 2[0] Q˜ 1ρ 0 ) 1 x˜ 2ρ ⊗ S −1 ( ) Q˜ 2ρ 1 x˜ 3ρ , 2
3
2
(6.5)
D. Bulacu, F. Panaite, F. Van Oystaeyen
where we denote by Q˜ 1ρ ⊗ Q˜ 2ρ another copy of q˜ρ . On the other hand, by (2.26), (2.43) and (2.36) it follows that 1 2 2 u 0[−1] ⊗ ( Q˜ 1ρ 0 ) 0 x˜ 1ρ u 0[0] 0 u 0 ⊗ ( Q˜ 1ρ 0 ) 1 x˜ 2ρ u 0[0] 1 u 11 1
3 ⊗S −1 ( u
˜ 2ρ 2 1 x˜ 3ρ u 0[0] u 12 = u [−1] 1 ⊗ (u [0] Q˜ 1ρ ) 0 ( 2 u ) 0,0 x˜ 1ρ
1 ) Q
1 2
2 ⊗(u [0] Q˜ 1ρ ) 1 ( u ) 0,1 x˜ 2ρ
⊗ S −1 ( ) Q˜ 2ρ ( u ) 1 x˜ 3ρ , 3
2
(6.6)
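for all $u, u' \in \mathbb{A}$. Finally, using (2.34), (2.45), (2.6) and (2.46), one checks that
$$\Theta^1 \otimes \tilde q^1_\rho \Theta^2_{\langle 0\rangle} \otimes S^{-1}(\Theta^3)\,\tilde q^2_\rho \Theta^2_{\langle 1\rangle} = (\tilde q^1_\rho)_{[-1]}\theta^1 \otimes (\tilde q^1_\rho)_{[0]}\theta^2 \otimes \tilde q^2_\rho\theta^3 .$$
(6.7)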
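As a consistency check, relation (6.7) can be specialized to the case where the bicomodule algebra is $H$ itself. The sketch below assumes the usual conventions for that case (coactions given by $\Delta$, reassociator $\Phi = X^1 \otimes X^2 \otimes X^3$ with inverse $x^1 \otimes x^2 \otimes x^3$, and $\tilde q_\rho = q_R = q^1 \otimes q^2$); these conventions are not restated in this section and are assumptions of the sketch.

```latex
% Assumptions: \mathbb{A} = H, \rho = \lambda = \Delta,
%   \Phi_{\lambda,\rho} = \Phi = X^1 \otimes X^2 \otimes X^3,
%   \Phi^{-1} = x^1 \otimes x^2 \otimes x^3, \tilde q_\rho = q_R = q^1 \otimes q^2.
\begin{align*}
X^1 \otimes q^1 X^2_1 \otimes S^{-1}(X^3)\, q^2 X^2_2
  &= q^1_1 x^1 \otimes q^1_2 x^2 \otimes q^2 x^3
  && \text{(relation (6.7) for } \mathbb{A} = H\text{)} \\
X^1 \otimes q^1 X^2_1 \otimes S(q^2 X^2_2)\, X^3
  &= q^1_1 x^1 \otimes q^1_2 x^2 \otimes S(q^2 x^3)
  && \text{(apply } \mathrm{id} \otimes \mathrm{id} \otimes S\text{)}
\end{align*}
% The second line uses only that S is an anti-algebra map:
%   S(S^{-1}(X^3) q^2 X^2_2) = S(X^2_2) S(q^2) X^3 = S(q^2 X^2_2) X^3.
```

Up to renaming the copy of $\Phi^{-1}$, the second line has the same shape as relation (9.17) used in Section 9.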
Now, for all a, a ∈ A, u, u ∈ A and b, b ∈ B we compute: μ(((a ⊗ b) u)((a ⊗ b ) u )) (6.4,6.3) =
( 11 1 · a)( 12 2 u 0[−1] · a )
> (b · 5 S −1 ( 3 )1 (q˜ρ2 )1 ( 2 3 ) 11 u 0[0] 1 u 11 ) 1
(b · S −1 (u 1 )4 S −1 ( 3 )2 (q˜ρ2 )2 ( 2 3 ) 12 u 0[0] 1 u 12 ) 2
(6.5) =
(x˜ 1λ 1
1 · a)(x˜ 2λ 1 2[−1] u 0[−1]
·a
2 )<x˜ 3λ q˜ρ1 2 0 2[0] 0 ( Q˜ 1ρ 0 ) 0
2 x˜ 1ρ u 0[0] 0 u 0 > (b · S −1 ( 3 3 )q˜ρ2 2 1 2[0] 1 ( Q˜ 1ρ 0 ) 1 x˜ 2ρ 3 2 u 0[0] 1 u 11 )(b · S −1 ( u 1 ) Q˜ 2ρ 1 x˜ 3ρ u 0[0] 1 u 12 ) 1
(6.6) =
2
(x˜ 1λ 1
1 · a)(x˜ 2λ 1 2[−1] u [−1]
·a
)<x˜ 3λ q˜ρ1 2 0 2[0] 0 (u [0] Q˜ 1ρ ) 0
( u ) 0,0 x˜ 1ρ > (b · S −1 ( 3 3 )q˜ρ2 2 1 2[0] 1 (u [0] Q˜ 1ρ ) 1 ( u ) 0,1 x˜ 2ρ ) 2
2
(b · S −1 ( ) Q˜ 2ρ ( u ) 1 x˜ 3ρ ) 3
(6.7,5.5,2.43) ( 1 = 1
2
· ab · S −1 ( 3 )q˜ρ2 2 1 u 1 )
( · a < Q˜ 1ρ 0 u 0 > b · S −1 ( ) Q˜ 2ρ 1 u 1 )
(6.3) =
2
3
2
μ((a ⊗ b) u)μ((a ⊗ b ) u ),
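as claimed. The (co)unit axioms imply $\mu((1_A \otimes 1_B) \bowtie 1_{\mathbb{A}}) = 1_A \blacktriangleright\!\!< 1_{\mathbb{A}} >\!\!\blacktriangleleft 1_B$, so it remains to show that $\mu$ is bijective. To this end, define
$$\mu^{-1} : A \blacktriangleright\!\!< \mathbb{A} >\!\!\blacktriangleleft B \to (A \otimes B) \bowtie \mathbb{A}, \qquad \mu^{-1}(a \blacktriangleright\!\!< u >\!\!\blacktriangleleft b) = (\theta^1 \cdot a \otimes b \cdot S^{-1}(\theta^3 u_{\langle 1\rangle} \tilde p^2_\rho)) \bowtie \theta^2 u_{\langle 0\rangle} \tilde p^1_\rho,$$
(6.8)
for all $a \in A$, $u \in \mathbb{A}$ and $b \in B$, where $\tilde p_\rho = \tilde p^1_\rho \otimes \tilde p^2_\rho$ is the element defined in (2.34). We show that $\mu$ and $\mu^{-1}$ are inverses. Indeed,
$$\mu\mu^{-1}(a \blacktriangleright\!\!< u >\!\!\blacktriangleleft b) \overset{(6.8,6.3)}{=} a \blacktriangleright\!\!< \tilde q^1_\rho u_{\langle 0,0\rangle} (\tilde p^1_\rho)_{\langle 0\rangle} >\!\!\blacktriangleleft b \cdot S^{-1}(u_{\langle 1\rangle} \tilde p^2_\rho)\,\tilde q^2_\rho u_{\langle 0,1\rangle} (\tilde p^1_\rho)_{\langle 1\rangle} = a \blacktriangleright\!\!< u >\!\!\blacktriangleleft b,$$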
Generalized Diagonal Crossed Products
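for all $a \in A$, $u \in \mathbb{A}$, $b \in B$, and similarly
$$\mu^{-1}\mu((a \otimes b) \bowtie u) \overset{(6.3,6.8)}{=} [\theta^1\Theta^1 \cdot a \otimes b \cdot S^{-1}(\Theta^3)\,\tilde q^2_\rho (\Theta^2 u)_{\langle 1\rangle} S^{-1}(\theta^3 (\tilde q^1_\rho)_{\langle 1\rangle} (\Theta^2 u)_{\langle 0,1\rangle} \tilde p^2_\rho)] \bowtie \theta^2 (\tilde q^1_\rho)_{\langle 0\rangle} (\Theta^2 u)_{\langle 0,0\rangle} \tilde p^1_\rho \overset{(2.35,2.37)}{=} (a \otimes b) \bowtie u,$$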
and this finishes our proof.

As a consequence of the two propositions, we obtain the following result:

Corollary 6.4. Let $H$ be a quasi-Hopf algebra, $A$ a left $H$-module algebra, $B$ a right $H$-module algebra, $\mathfrak{A}$ a right $H$-comodule algebra and $\mathfrak{B}$ a left $H$-comodule algebra. Then we have algebra isomorphisms
$$A \blacktriangleright\!\!< (\mathfrak{A} \otimes \mathfrak{B}) >\!\!\blacktriangleleft B \simeq (A \otimes B) \bowtie (\mathfrak{A} \otimes \mathfrak{B}) \simeq \mathfrak{A} >\!\!\triangleleft (A \otimes B) \triangleright\!\!< \mathfrak{B}.$$

7. Invariance Under Twisting

In this section we prove that the generalized diagonal crossed products and the two-sided smash products are, in certain senses, invariant under twisting (such a result has also been proved by Hausser and Nill in [12] for their diagonal crossed products, with a different method, and by the authors in [5] for smash products).

Let $H$ be a quasi-bialgebra, $A$ a left $H$-module algebra, $\mathcal{A}$ an $H$-bimodule algebra and $F \in H \otimes H$ a gauge transformation. If we introduce on $A$ another multiplication, by $a \bullet a' = (G^1 \cdot a)(G^2 \cdot a')$ for all $a, a' \in A$, where $F^{-1} = G^1 \otimes G^2$, and denote by $A_{F^{-1}}$ this structure, then, as in [5], one can prove that $A_{F^{-1}}$ becomes a left $H_F$-module algebra, with the same unit and $H$-action as for $A$. If we introduce on $\mathcal{A}$ another multiplication, by $\varphi \circ \varphi' = (G^1 \cdot \varphi \cdot F^1)(G^2 \cdot \varphi' \cdot F^2)$ for all $\varphi, \varphi' \in \mathcal{A}$, and denote this by ${}_F\mathcal{A}_{F^{-1}}$, then ${}_F\mathcal{A}_{F^{-1}}$ is an $H_F$-bimodule algebra (for instance, if $\mathcal{A} = H^*$, then ${}_F\mathcal{A}_{F^{-1}}$ is just $(H_F)^*$). Moreover, if we regard $\mathcal{A}$ as a left $H \otimes H^{op}$-module algebra and ${}_F\mathcal{A}_{F^{-1}}$ as a left $H_F \otimes H_F^{op}$-module algebra, then ${}_F\mathcal{A}_{F^{-1}}$ coincides with $\mathcal{A}_{T^{-1}}$, where $T$ is the gauge transformation on $H \otimes H^{op}$ given by $T = (F^1 \otimes G^1) \otimes (F^2 \otimes G^2)$, and using the identification $H_F \otimes (H_F)^{op} \equiv (H \otimes H^{op})_T$.

Suppose that we have also a left $H$-comodule algebra $\mathfrak{B}$; then, by [12], on the algebra structure of $\mathfrak{B}$ one can introduce a left $H_F$-comodule algebra structure (denoted in what follows by $\mathfrak{B}^{F^{-1}}$) by putting $\lambda^{F^{-1}} = \lambda$ and $\Phi_\lambda^{F^{-1}} = \Phi_\lambda (F^{-1} \otimes 1_{\mathfrak{B}})$.

Proposition 7.1. With notation as above, we have an algebra isomorphism
A
−1
coincides, via the trivial
Similarly, if $\mathfrak{A}$ is a right $H$-comodule algebra, by [12] one can introduce on the algebra structure of $\mathfrak{A}$ a right $H_F$-comodule algebra structure (denoted by ${}^F\mathfrak{A}$) by putting ${}^F\rho = \rho$ and ${}^F\Phi_\rho = (1_{\mathfrak{A}} \otimes F)\Phi_\rho$. Also, one can check that if $\mathbb{A}$ is an $H$-bicomodule algebra, the left and right $H_F$-comodule algebras $\mathbb{A}^{F^{-1}}$ and ${}^F\mathbb{A}$ actually define the structure of an $H_F$-bicomodule algebra on $\mathbb{A}$, denoted by ${}^F\mathbb{A}^{F^{-1}}$, which has the same $\Phi_{\lambda,\rho}$ as $\mathbb{A}$.

Suppose now that $H$ is a quasi-Hopf algebra. Transforming this $H_F$-bicomodule algebra ${}^F\mathbb{A}^{F^{-1}}$, as in a previous section, into the two left $H_F \otimes H_F^{op}$-comodule algebras $({}^F\mathbb{A}^{F^{-1}})_1$ and $({}^F\mathbb{A}^{F^{-1}})_2$, by using the identification $H_F \otimes H_F^{op} \equiv (H \otimes H^{op})_T$ as before and the fact, observed in [12], that the Drinfeld twist $f_F$ on $H_F$ depends on the one on $H$ by the formula $f_F = (S \otimes S)(F_{21}^{-1})\, f F^{-1}$, we may obtain algebra isomorphisms
$$({}^F\mathbb{A}^{F^{-1}})_1 \equiv (\mathbb{A}_1)^{T^{-1}}, \qquad ({}^F\mathbb{A}^{F^{-1}})_2 \equiv (\mathbb{A}_2)^{T^{-1}},$$
defined by the trivial identifications. As a consequence, using the expressions of the generalized left diagonal crossed products as generalized smash products, we obtain the following result:

Proposition 7.2. With notation as before, the algebra isomorphisms
$$\mathcal{A} \bowtie \mathbb{A} \equiv {}_F\mathcal{A}_{F^{-1}} \bowtie {}^F\mathbb{A}^{F^{-1}}, \qquad \mathcal{A} \blacktriangleright\!\!\blacktriangleleft \mathbb{A} \equiv {}_F\mathcal{A}_{F^{-1}} \blacktriangleright\!\!\blacktriangleleft {}^F\mathbb{A}^{F^{-1}},$$
are defined by the trivial identifications.

Suppose again that $H$ is a quasi-bialgebra, $A$ is a left $H$-module algebra and $F \in H \otimes H$ is a gauge transformation. Suppose now that we also have a right $H$-module algebra $B$. If we introduce on $B$ another multiplication, by $b \bullet b' = (b \cdot F^1)(b' \cdot F^2)$ for all $b, b' \in B$, denoting this structure by ${}_FB$, then ${}_FB$ becomes a right $H_F$-module algebra with the same unit and right $H$-action as for $B$. So, we have the following type of invariance under twisting for two-sided smash products:

Proposition 7.3. With notation as before, we have an algebra isomorphism
$$\phi : A \# H \# B \simeq A_{F^{-1}} \# H_F \# {}_FB, \qquad \phi(a \# h \# b) = F^1 \cdot a \,\#\, F^2 h G^1 \,\#\, b \cdot G^2, \quad \forall\, a \in A,\ h \in H,\ b \in B.$$
In particular, by taking $B = k$ or respectively $A = k$, we have algebra isomorphisms
$$A \# H \simeq A_{F^{-1}} \# H_F, \qquad H \# B \simeq H_F \# {}_FB.$$

Proof. Follows by a direct computation, similar to the one in [5].

8. Iterated Products

It was proved in [5] that, if $H$ is a quasi-bialgebra and $A$ is a left $H$-module algebra, then $A \# H$ becomes a right $H$-comodule algebra, with structure:
$$\rho : A \# H \to (A \# H) \otimes H, \qquad \rho(a \# h) = (x^1 \cdot a \,\#\, x^2 h_1) \otimes x^3 h_2, \quad \forall\, a \in A,\ h \in H,$$
$$\Phi_\rho = (1_A \# X^1) \otimes X^2 \otimes X^3 \in (A \# H) \otimes H \otimes H.$$
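The counit axiom for the coaction $\rho$ on $A \# H$ just stated can be checked in one line. The sketch below assumes that $\Phi^{-1} = x^1 \otimes x^2 \otimes x^3$ is normalized with respect to $\varepsilon$ in its last tensor factor, which is a standard consequence of the quasi-bialgebra axioms but is not restated in this section.

```latex
% Assumptions: (\mathrm{id} \otimes \mathrm{id} \otimes \varepsilon)(\Phi^{-1}) = 1 \otimes 1,
%   \varepsilon is an algebra map, and h_1 \varepsilon(h_2) = h.
\begin{align*}
(\mathrm{id} \otimes \varepsilon)\rho(a \# h)
  &= \varepsilon(x^3 h_2)\,(x^1 \cdot a \,\#\, x^2 h_1) \\
  &= \varepsilon(x^3)\,(x^1 \cdot a \,\#\, x^2 h) \\
  &= a \,\#\, h .
\end{align*}
```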
Similarly, one can prove that if $B$ is a right $H$-module algebra, then $H \# B$ becomes a left $H$-comodule algebra, with structure:
$$\lambda : H \# B \to H \otimes (H \# B), \qquad \lambda(h \# b) = h_1 x^1 \otimes (h_2 x^2 \,\#\, b \cdot x^3), \quad \forall\, h \in H,\ b \in B,$$
$$\Phi_\lambda = X^1 \otimes X^2 \otimes (X^3 \# 1_B) \in H \otimes H \otimes (H \# B).$$
In the sequel we need some more general results, which we state now (the proofs are similar to the one in [5]). Let $H$ be a quasi-bialgebra, $A$ a left $H$-module algebra and $\mathbb{A}$ an $H$-bicomodule algebra. Then A
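Similarly, let $H$ be a quasi-bialgebra, $B$ a right $H$-module algebra and $\mathbb{A}$ an $H$-bicomodule algebra. Then $\mathbb{A} >\!\!\blacktriangleleft B$ becomes a left $H$-comodule algebra, with structure defined for all $u \in \mathbb{A}$ and $b \in B$ by:
$$\lambda : \mathbb{A} >\!\!\blacktriangleleft B \to H \otimes (\mathbb{A} >\!\!\blacktriangleleft B), \qquad \lambda(u >\!\!\blacktriangleleft b) = u_{[-1]}\theta^1 \otimes (u_{[0]}\theta^2 >\!\!\blacktriangleleft b \cdot \theta^3),$$
$$\Phi_\lambda = \tilde X^1_\lambda \otimes \tilde X^2_\lambda \otimes (\tilde X^3_\lambda >\!\!\blacktriangleleft 1_B) \in H \otimes H \otimes (\mathbb{A} >\!\!\blacktriangleleft B).$$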
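To see that this statement generalizes the one for $H \# B$, one can specialize the bicomodule algebra to $H$. The sketch below assumes the standard identifications for that case, which are not restated here: the left coaction of $H$ on itself is $\Delta$, the inverse of $\Phi_{\lambda,\rho}$ is $\Phi^{-1}$, $\Phi_\lambda = \Phi$, and the generalized smash product of $H$ with $B$ is the smash product $H \# B$.

```latex
% Assumptions for \mathbb{A} = H:
%   u_{[-1]} \otimes u_{[0]} = h_1 \otimes h_2,
%   \theta^1 \otimes \theta^2 \otimes \theta^3 = x^1 \otimes x^2 \otimes x^3,
%   \tilde X^1_\lambda \otimes \tilde X^2_\lambda \otimes \tilde X^3_\lambda = X^1 \otimes X^2 \otimes X^3.
\begin{align*}
\lambda(h \# b) &= h_1 x^1 \otimes (h_2 x^2 \,\#\, b \cdot x^3), \\
\Phi_\lambda &= X^1 \otimes X^2 \otimes (X^3 \,\#\, 1_B).
\end{align*}
```

Under these assumptions the general formulas reduce exactly to the left $H$-comodule algebra structure of $H \# B$ recalled earlier in this section.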
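We are now ready to prove that the two-sided generalized smash product can be written (in two ways) as an iterated generalized smash product.

Proposition 8.1. Let $H$ be a quasi-bialgebra, $A$ a left $H$-module algebra, $B$ a right $H$-module algebra and $\mathbb{A}$ an $H$-bicomodule algebra. Consider the right and left $H$-comodule algebras $A \blacktriangleright\!\!< \mathbb{A}$ and $\mathbb{A} >\!\!\blacktriangleleft B$ as above. Then we have algebra isomorphisms
$$A \blacktriangleright\!\!< \mathbb{A} >\!\!\blacktriangleleft B \equiv (A \blacktriangleright\!\!< \mathbb{A}) >\!\!\blacktriangleleft B, \qquad A \blacktriangleright\!\!< \mathbb{A} >\!\!\blacktriangleleft B \equiv A \blacktriangleright\!\!< (\mathbb{A} >\!\!\blacktriangleleft B),$$
given by the trivial identifications. In particular, we have
$$A \# H \# B \equiv (A \# H) >\!\!\blacktriangleleft B, \qquad A \# H \# B \equiv A \blacktriangleright\!\!< (H \# B).$$

Proof. We will prove the first isomorphism; the second is similar. We compute the multiplication in $(A \blacktriangleright\!\!< \mathbb{A}) >\!\!\blacktriangleleft B$. For $a, a' \in A$, $b, b' \in B$ and $u, u' \in \mathbb{A}$ we have:
$$((a \blacktriangleright\!\!< u) >\!\!\blacktriangleleft b)((a' \blacktriangleright\!\!< u') >\!\!\blacktriangleleft b') = (a \blacktriangleright\!\!< u)(\theta^1 \cdot a' \blacktriangleright\!\!< \theta^2 u'_{\langle 0\rangle} \tilde x^1_\rho) >\!\!\blacktriangleleft (b \cdot \theta^3 u'_{\langle 1\rangle} \tilde x^2_\rho)(b' \cdot \tilde x^3_\rho)$$
$$= ((\tilde x^1_\lambda \cdot a)(\tilde x^2_\lambda u_{[-1]} \theta^1 \cdot a') \blacktriangleright\!\!< \tilde x^3_\lambda u_{[0]} \theta^2 u'_{\langle 0\rangle} \tilde x^1_\rho) >\!\!\blacktriangleleft (b \cdot \theta^3 u'_{\langle 1\rangle} \tilde x^2_\rho)(b' \cdot \tilde x^3_\rho).$$
Via the trivial identification, this is exactly the multiplication of $A \blacktriangleright\!\!< \mathbb{A} >\!\!\blacktriangleleft B$.

Recall from [7] the definition and properties of the so-called quasi-smash product, but in a more general form. Let $H$ be a quasi-bialgebra, $\mathfrak{A}$ a right $H$-comodule algebra and $\mathcal{A}$ an $H$-bimodule algebra. Define a multiplication on $\mathfrak{A} \otimes \mathcal{A}$ by
$$(\mathfrak{a} \,\overline{\#}\, \varphi)(\mathfrak{a}' \,\overline{\#}\, \varphi') = \mathfrak{a}\mathfrak{a}'_{\langle 0\rangle} \tilde x^1_\rho \,\overline{\#}\, (\varphi \cdot \mathfrak{a}'_{\langle 1\rangle} \tilde x^2_\rho)(\varphi' \cdot \tilde x^3_\rho), \quad \forall\, \mathfrak{a}, \mathfrak{a}' \in \mathfrak{A},\ \varphi, \varphi' \in \mathcal{A},$$
(8.1)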
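where we write $\mathfrak{a} \,\overline{\#}\, \varphi$ for $\mathfrak{a} \otimes \varphi$, and denote this structure by $\mathfrak{A} \,\overline{\#}\, \mathcal{A}$. Then $\mathfrak{A} \,\overline{\#}\, \mathcal{A}$ becomes a left $H$-module algebra with unit $1_{\mathfrak{A}} \,\overline{\#}\, 1_{\mathcal{A}}$ and with left $H$-action
$$h \cdot (\mathfrak{a} \,\overline{\#}\, \varphi) = \mathfrak{a} \,\overline{\#}\, h \cdot \varphi, \quad \forall\, \mathfrak{a} \in \mathfrak{A},\ h \in H,\ \varphi \in \mathcal{A}.$$
Note that for $\mathcal{A} = H^*$ we obtain the quasi-smash product $\mathfrak{A} \,\overline{\#}\, H^*$ from [7]. Also, by taking $B$ a right $H$-module algebra and $\mathcal{A} = B$ as an $H$-bimodule algebra with trivial left $H$-action, $\mathfrak{A} \,\overline{\#}\, \mathcal{A}$ is exactly the generalized smash product $\mathfrak{A} >\!\!\blacktriangleleft B$.

We need the left-handed version of the above construction too. Namely, if $H$ is a quasi-bialgebra, $\mathfrak{B}$ a left $H$-comodule algebra and $\mathcal{A}$ an $H$-bimodule algebra, define a multiplication on $\mathcal{A} \otimes \mathfrak{B}$ by
$$(\varphi \,\overline{\#}\, \mathfrak{b})(\varphi' \,\overline{\#}\, \mathfrak{b}') = (\tilde x^1_\lambda \cdot \varphi)(\tilde x^2_\lambda \mathfrak{b}_{[-1]} \cdot \varphi') \,\overline{\#}\, \tilde x^3_\lambda \mathfrak{b}_{[0]} \mathfrak{b}', \quad \forall\, \varphi, \varphi' \in \mathcal{A},\ \mathfrak{b}, \mathfrak{b}' \in \mathfrak{B},$$
(8.2)
where we write $\varphi \,\overline{\#}\, \mathfrak{b}$ for $\varphi \otimes \mathfrak{b}$, and denote this structure by $\mathcal{A} \,\overline{\#}\, \mathfrak{B}$. Then $\mathcal{A} \,\overline{\#}\, \mathfrak{B}$ becomes a right $H$-module algebra with unit $1_{\mathcal{A}} \,\overline{\#}\, 1_{\mathfrak{B}}$ and with right $H$-action
$$(\varphi \,\overline{\#}\, \mathfrak{b}) \cdot h = \varphi \cdot h \,\overline{\#}\, \mathfrak{b}, \quad \forall\, \varphi \in \mathcal{A},\ h \in H,\ \mathfrak{b} \in \mathfrak{B}.$$
By taking $A$ a left $H$-module algebra and $\mathcal{A} = A$ as an $H$-bimodule algebra with trivial right $H$-action, $\mathcal{A} \,\overline{\#}\, \mathfrak{B}$ is exactly the generalized smash product $A \blacktriangleright\!\!< \mathfrak{B}$.

Proposition 8.2. With notation as above, we have algebra isomorphisms
$$\mathfrak{A} >\!\!\triangleleft \mathcal{A} \triangleright\!\!< \mathfrak{B} \equiv (\mathfrak{A} \,\overline{\#}\, \mathcal{A}) \blacktriangleright\!\!< \mathfrak{B}, \qquad \mathfrak{A} >\!\!\triangleleft \mathcal{A} \triangleright\!\!< \mathfrak{B} \equiv \mathfrak{A} >\!\!\blacktriangleleft (\mathcal{A} \,\overline{\#}\, \mathfrak{B}),$$
obtained from the trivial identifications.

Proof. Follows by direct computations.

We now apply the above results. In [12], Hausser and Nill generalized to the setting of quasi-Hopf algebras some models of Hopf spin chains and lattice current algebras. The key result for this was the next theorem, concerning iterated two-sided crossed products (with $H$ finite dimensional and $\mathcal{A} = H^*$). The original proof of this theorem is quite difficult to read, being written in the formalism of universal intertwiners. Using our results, we obtain a conceptual proof of the theorem, together with the explicit form of the structures that appear in (i) and (ii).

Theorem 8.3. (Hausser and Nill). Let $H$ be a quasi-bialgebra, $\mathcal{A}$ an $H$-bimodule algebra, $\mathfrak{A}$ a right $H$-comodule algebra, $\mathbb{B}$ an $H$-bicomodule algebra and $\mathfrak{C}$ a left $H$-comodule algebra. Then:
(i) $\mathfrak{A} >\!\!\triangleleft \mathcal{A} \triangleright\!\!< \mathbb{B}$ admits a right $H$-comodule algebra structure;
(ii) $\mathbb{B} >\!\!\triangleleft \mathcal{A} \triangleright\!\!< \mathfrak{C}$ admits a left $H$-comodule algebra structure;
(iii) there is an algebra isomorphism (given by the trivial identification)
$$(\mathfrak{A} >\!\!\triangleleft \mathcal{A} \triangleright\!\!< \mathbb{B}) >\!\!\triangleleft \mathcal{A} \triangleright\!\!< \mathfrak{C} \equiv \mathfrak{A} >\!\!\triangleleft \mathcal{A} \triangleright\!\!< (\mathbb{B} >\!\!\triangleleft \mathcal{A} \triangleright\!\!< \mathfrak{C}).$$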
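Proof. Writing $\mathfrak{A} >\!\!\triangleleft \mathcal{A} \triangleright\!\!< \mathbb{B}$ as $(\mathfrak{A} \,\overline{\#}\, \mathcal{A}) \blacktriangleright\!\!< \mathbb{B}$, we obtain that this is a right $H$-comodule algebra, with structure:
$$\rho : \mathfrak{A} >\!\!\triangleleft \mathcal{A} \triangleright\!\!< \mathbb{B} \equiv (\mathfrak{A} \,\overline{\#}\, \mathcal{A}) \blacktriangleright\!\!< \mathbb{B} \to ((\mathfrak{A} \,\overline{\#}\, \mathcal{A}) \blacktriangleright\!\!< \mathbb{B}) \otimes H \equiv (\mathfrak{A} >\!\!\triangleleft \mathcal{A} \triangleright\!\!< \mathbb{B}) \otimes H,$$
$$\rho(\mathfrak{a} >\!\!\triangleleft \varphi \triangleright\!\!< b) = (\mathfrak{a} >\!\!\triangleleft \theta^1 \cdot \varphi \triangleright\!\!< \theta^2 b_{\langle 0\rangle}) \otimes \theta^3 b_{\langle 1\rangle}, \quad \forall\, \mathfrak{a} \in \mathfrak{A},\ \varphi \in \mathcal{A},\ b \in \mathbb{B},$$
$$\Phi_\rho = (1_{\mathfrak{A}} >\!\!\triangleleft 1_{\mathcal{A}} \triangleright\!\!< \tilde X^1_\rho) \otimes \tilde X^2_\rho \otimes \tilde X^3_\rho \in (\mathfrak{A} >\!\!\triangleleft \mathcal{A} \triangleright\!\!< \mathbb{B}) \otimes H \otimes H.$$
Similarly, writing $\mathbb{B} >\!\!\triangleleft \mathcal{A} \triangleright\!\!< \mathfrak{C}$ as $\mathbb{B} >\!\!\blacktriangleleft (\mathcal{A} \,\overline{\#}\, \mathfrak{C})$, we obtain that this is a left $H$-comodule algebra, with structure:
$$\lambda : \mathbb{B} >\!\!\triangleleft \mathcal{A} \triangleright\!\!< \mathfrak{C} \equiv \mathbb{B} >\!\!\blacktriangleleft (\mathcal{A} \,\overline{\#}\, \mathfrak{C}) \to H \otimes (\mathbb{B} >\!\!\blacktriangleleft (\mathcal{A} \,\overline{\#}\, \mathfrak{C})) \equiv H \otimes (\mathbb{B} >\!\!\triangleleft \mathcal{A} \triangleright\!\!< \mathfrak{C}),$$
$$\lambda(b >\!\!\triangleleft \varphi \triangleright\!\!< \mathfrak{c}) = b_{[-1]}\theta^1 \otimes (b_{[0]}\theta^2 >\!\!\triangleleft \varphi \cdot \theta^3 \triangleright\!\!< \mathfrak{c}), \quad \forall\, b \in \mathbb{B},\ \varphi \in \mathcal{A},\ \mathfrak{c} \in \mathfrak{C},$$
$$\Phi_\lambda = \tilde X^1_\lambda \otimes \tilde X^2_\lambda \otimes (\tilde X^3_\lambda >\!\!\triangleleft 1_{\mathcal{A}} \triangleright\!\!< 1_{\mathfrak{C}}) \in H \otimes H \otimes (\mathbb{B} >\!\!\triangleleft \mathcal{A} \triangleright\!\!< \mathfrak{C}).$$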
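The formula for $\rho$ in part (i) can be traced back to the coaction on a generalized smash product. The sketch below is a reading aid under a stated assumption: it takes the right coaction on a generalized smash product of a left $H$-module algebra $A'$ with an $H$-bicomodule algebra $\mathbb{B}$ to be the mirror image of the left coaction $\lambda(u >\!\!\blacktriangleleft b) = u_{[-1]}\theta^1 \otimes (u_{[0]}\theta^2 >\!\!\blacktriangleleft b \cdot \theta^3)$ given earlier in this section. The name $A'$ is introduced only for this illustration.

```latex
% Assumption (mirror of the left coaction stated earlier in Section 8):
%   \rho(a' \blacktriangleright\!\!< b)
%     = (\theta^1 \cdot a' \blacktriangleright\!\!< \theta^2 b_{\langle 0\rangle}) \otimes \theta^3 b_{\langle 1\rangle}.
% Take A' = \mathfrak{A} \overline{\#} \mathcal{A}, whose left H-action is
%   h \cdot (\mathfrak{a} \overline{\#} \varphi) = \mathfrak{a} \overline{\#} h \cdot \varphi.
\begin{align*}
\rho((\mathfrak{a} \,\overline{\#}\, \varphi) \blacktriangleright\!\!< b)
  &= (\theta^1 \cdot (\mathfrak{a} \,\overline{\#}\, \varphi) \blacktriangleright\!\!< \theta^2 b_{\langle 0\rangle})
     \otimes \theta^3 b_{\langle 1\rangle} \\
  &= ((\mathfrak{a} \,\overline{\#}\, \theta^1 \cdot \varphi) \blacktriangleright\!\!< \theta^2 b_{\langle 0\rangle})
     \otimes \theta^3 b_{\langle 1\rangle}.
\end{align*}
```

Under the trivial identification, the last expression is the formula for $\rho$ displayed in the proof: the element $\theta^1$ acts only on the $\mathcal{A}$-component because $H$ acts on the quasi-smash product through $\mathcal{A}$ alone.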
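To prove (iii), we will use the identifications appearing in our results:
$$(\mathfrak{A} >\!\!\triangleleft \mathcal{A} \triangleright\!\!< \mathbb{B}) >\!\!\triangleleft \mathcal{A} \triangleright\!\!< \mathfrak{C} \equiv ((\mathfrak{A} \,\overline{\#}\, \mathcal{A}) \blacktriangleright\!\!< \mathbb{B}) >\!\!\triangleleft \mathcal{A} \triangleright\!\!< \mathfrak{C} \equiv ((\mathfrak{A} \,\overline{\#}\, \mathcal{A}) \blacktriangleright\!\!< \mathbb{B}) >\!\!\blacktriangleleft (\mathcal{A} \,\overline{\#}\, \mathfrak{C}) \equiv (\mathfrak{A} \,\overline{\#}\, \mathcal{A}) \blacktriangleright\!\!< \mathbb{B} >\!\!\blacktriangleleft (\mathcal{A} \,\overline{\#}\, \mathfrak{C}),$$
and
$$\mathfrak{A} >\!\!\triangleleft \mathcal{A} \triangleright\!\!< (\mathbb{B} >\!\!\triangleleft \mathcal{A} \triangleright\!\!< \mathfrak{C}) \equiv \mathfrak{A} >\!\!\triangleleft \mathcal{A} \triangleright\!\!< (\mathbb{B} >\!\!\blacktriangleleft (\mathcal{A} \,\overline{\#}\, \mathfrak{C})) \equiv (\mathfrak{A} \,\overline{\#}\, \mathcal{A}) \blacktriangleright\!\!< (\mathbb{B} >\!\!\blacktriangleleft (\mathcal{A} \,\overline{\#}\, \mathfrak{C})) \equiv (\mathfrak{A} \,\overline{\#}\, \mathcal{A}) \blacktriangleright\!\!< \mathbb{B} >\!\!\blacktriangleleft (\mathcal{A} \,\overline{\#}\, \mathfrak{C}).$$
So, we have proved that the two iterated generalized two-sided crossed products that appear in (iii) are both isomorphic as algebras (via the trivial identifications) to the two-sided generalized smash product $(\mathfrak{A} \,\overline{\#}\, \mathcal{A}) \blacktriangleright\!\!< \mathbb{B} >\!\!\blacktriangleleft (\mathcal{A} \,\overline{\#}\, \mathfrak{C})$.

Using the same results, we obtain another relation between the generalized two-sided crossed product and the two-sided generalized smash product. More exactly, let $H$ be a quasi-bialgebra, $\mathcal{A}$ an $H$-bimodule algebra, $A$ a left $H$-module algebra, $B$ a right $H$-module algebra and $\mathbb{A}$ and $\mathbb{B}$ two $H$-bicomodule algebras. As we have seen before, $A \blacktriangleright\!\!< \mathbb{A}$ (respectively $\mathbb{B} >\!\!\blacktriangleleft B$) becomes a right (respectively left) $H$-comodule algebra, so we may consider the generalized two-sided crossed product $(A \blacktriangleright\!\!< \mathbb{A}) >\!\!\triangleleft \mathcal{A} \triangleright\!\!< (\mathbb{B} >\!\!\blacktriangleleft B)$. On the other hand, by the above Theorem of Hausser and Nill, $\mathbb{A} >\!\!\triangleleft \mathcal{A} \triangleright\!\!< \mathbb{B}$ becomes a right $H$-comodule algebra and a left $H$-comodule algebra, but actually, using the explicit formulae for its structures that we gave, one can prove that it is even an $H$-bicomodule algebra, with $\Phi_{\lambda,\rho} = 1_H \otimes (1_{\mathbb{A}} >\!\!\triangleleft 1_{\mathcal{A}} \triangleright\!\!< 1_{\mathbb{B}}) \otimes 1_H$, so we may consider the two-sided generalized smash product $A \blacktriangleright\!\!< (\mathbb{A} >\!\!\triangleleft \mathcal{A} \triangleright\!\!< \mathbb{B}) >\!\!\blacktriangleleft B$.

Proposition 8.4. We have an algebra isomorphism
$$(A \blacktriangleright\!\!< \mathbb{A}) >\!\!\triangleleft \mathcal{A} \triangleright\!\!< (\mathbb{B} >\!\!\blacktriangleleft B) \equiv A \blacktriangleright\!\!< (\mathbb{A} >\!\!\triangleleft \mathcal{A} \triangleright\!\!< \mathbb{B}) >\!\!\blacktriangleleft B$$
obtained from the trivial identification. In particular, we have
$$(A \# H) >\!\!\triangleleft H^* \triangleright\!\!< (H \# B) \equiv A \blacktriangleright\!\!< (H >\!\!\triangleleft H^* \triangleright\!\!< H) >\!\!\blacktriangleleft B.$$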
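Proof. This may be proved by computing explicitly the multiplication rules in the two algebras and noting that they coincide. Alternatively, we provide a conceptual proof, by a sequence of identifications using the above results. We compute:
$$\begin{aligned}
A \blacktriangleright\!\!< (\mathbb{A} >\!\!\triangleleft \mathcal{A} \triangleright\!\!< \mathbb{B}) >\!\!\blacktriangleleft B
&\equiv A \blacktriangleright\!\!< ((\mathbb{A} >\!\!\triangleleft \mathcal{A} \triangleright\!\!< \mathbb{B}) >\!\!\blacktriangleleft B) \\
&\equiv A \blacktriangleright\!\!< (((\mathbb{A} \,\overline{\#}\, \mathcal{A}) \blacktriangleright\!\!< \mathbb{B}) >\!\!\blacktriangleleft B) \\
&\equiv A \blacktriangleright\!\!< ((\mathbb{A} \,\overline{\#}\, \mathcal{A}) \blacktriangleright\!\!< (\mathbb{B} >\!\!\blacktriangleleft B)) \\
&\equiv A \blacktriangleright\!\!< (\mathbb{A} >\!\!\triangleleft \mathcal{A} \triangleright\!\!< (\mathbb{B} >\!\!\blacktriangleleft B)) \\
&\equiv A \blacktriangleright\!\!< (\mathbb{A} >\!\!\blacktriangleleft (\mathcal{A} \,\overline{\#}\, (\mathbb{B} >\!\!\blacktriangleleft B))) \\
&\equiv (A \blacktriangleright\!\!< \mathbb{A}) >\!\!\blacktriangleleft (\mathcal{A} \,\overline{\#}\, (\mathbb{B} >\!\!\blacktriangleleft B)) \\
&\equiv (A \blacktriangleright\!\!< \mathbb{A}) >\!\!\triangleleft \mathcal{A} \triangleright\!\!< (\mathbb{B} >\!\!\blacktriangleleft B),
\end{aligned}$$
where the fourth and the fifth identities hold because the left $H$-comodule algebra structures on $(\mathbb{A} >\!\!\triangleleft \mathcal{A} \triangleright\!\!< \mathbb{B}) >\!\!\blacktriangleleft B$, $\mathbb{A} >\!\!\triangleleft \mathcal{A} \triangleright\!\!< (\mathbb{B} >\!\!\blacktriangleleft B)$ and $\mathbb{A} >\!\!\blacktriangleleft (\mathcal{A} \,\overline{\#}\, (\mathbb{B} >\!\!\blacktriangleleft B))$ coincide (via the trivial identifications).

9. $H^*$-Hopf Bimodules

Let $H$ be a finite dimensional quasi-bialgebra and $A$ a left $H$-module algebra. Recall from [6] the category $\mathcal{M}^{H^*}_A$, whose objects are vector spaces $M$, such that $M$ is a right $H^*$-comodule (i.e. $M$ is a left $H$-module, with action denoted by $h \otimes m \mapsto h \succ m$) and $A$ acts on $M$ to the right (denote by $m \otimes a \mapsto m \cdot a$ this action) such that $m \cdot 1_A = m$ for all $m \in M$ and the following relations hold:
$$(m \cdot a) \cdot a' = (X^1 \succ m) \cdot [(X^2 \cdot a)(X^3 \cdot a')],$$
(9.1)
$$h \succ (m \cdot a) = (h_1 \succ m) \cdot (h_2 \cdot a),$$
(9.2)
for all $a, a' \in A$, $m \in M$, $h \in H$. Similarly, the category ${}_A\mathcal{M}^{H^*}$ consists of vector spaces $M$, such that $M$ is a right $H^*$-comodule and $A$ acts on $M$ to the left (denote by $a \otimes m \mapsto a \cdot m$ this action) such that $1_A \cdot m = m$ for all $m \in M$ and the following relations hold:
$$a \cdot (a' \cdot m) = [(x^1 \cdot a)(x^2 \cdot a')] \cdot (x^3 \succ m),$$
(9.3)
$$h \succ (a \cdot m) = (h_1 \cdot a) \cdot (h_2 \succ m),$$
(9.4)
for all $a, a' \in A$, $m \in M$, $h \in H$. From the description of left modules over $A \# H$ in [5], it is clear that ${}_A\mathcal{M}^{H^*} \simeq {}_{A \# H}\mathcal{M}$. If $H$ is a quasi-Hopf algebra, by [6] we have an isomorphism of categories $\mathcal{M}^{H^*}_A \simeq \mathcal{M}_{A \# H}$. In what follows we need a description of $\mathcal{M}^{H^*}_A$ as a category of left modules over a right smash product.

Proposition 9.1. Let $H$ be a quasi-Hopf algebra and $A$ a left $H$-module algebra. Define on $A$ a new multiplication, by putting
$$a \circ a' = (g^1 \cdot a')(g^2 \cdot a), \quad \forall\, a, a' \in A,$$
(9.5)
where $f^{-1} = g^1 \otimes g^2$ is given by (2.14), and denote this new structure by $\overline{A}$. Then $\overline{A}$ becomes a right $H$-module algebra, with the same unit as $A$ and right $H$-action given by $a \cdot h = S(h) \cdot a$, for all $a \in A$, $h \in H$.

Proof. A straightforward computation, using (2.11) and (2.16).

Definition 9.2. Let $H$ be a quasi-bialgebra and $B$ a right $H$-module algebra. We say that $M$, a $k$-linear space, is a left $H, B$-module if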
(i) $M$ is a left $H$-module with action denoted by $h \otimes m \mapsto h \succ m$;
(ii) $B$ acts weakly on $M$ from the left, i.e. there exists a $k$-linear map $B \otimes M \to M$, denoted by $b \otimes m \mapsto b \cdot m$, such that $1_B \cdot m = m$ for all $m \in M$;
(iii) the following compatibility conditions hold:
$$b \cdot (b' \cdot m) = x^1 \succ ([(b \cdot x^2)(b' \cdot x^3)] \cdot m),$$
(9.6)
$$b \cdot (h \succ m) = h_1 \succ [(b \cdot h_2) \cdot m],$$
(9.7)
for all $b, b' \in B$, $h \in H$, $m \in M$. The category of all left $H, B$-modules, morphisms being the $H$-linear maps that preserve the $B$-action, will be denoted by ${}_{H,B}\mathcal{M}$.

Proposition 9.3. If $H$, $B$ are as above, then the categories ${}_{H,B}\mathcal{M}$ and ${}_{H \# B}\mathcal{M}$ are isomorphic. The isomorphism is given as follows. If $M \in {}_{H \# B}\mathcal{M}$, define $h \succ m = (h \# 1) \cdot m$ and $b \cdot m = (1 \# b) \cdot m$. Conversely, if $M \in {}_{H,B}\mathcal{M}$, define $(h \# b) \cdot m = h \succ (b \cdot m)$.

Proof. Straightforward computation.

Proposition 9.4. If $H$ is a finite dimensional quasi-Hopf algebra and $A$ is a left $H$-module algebra, then $\mathcal{M}^{H^*}_A$ is isomorphic to ${}_{H \# \overline{A}}\mathcal{M}$, where $\overline{A}$ is the right $H$-module algebra constructed in Proposition 9.1. The correspondence is given as follows (we fix $\{e_i\}$ a basis in $H$ with $\{e^i\}$ a dual basis in $H^*$):
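• If $M \in {}_{H \# \overline{A}}\mathcal{M}$, then $M$ becomes an object in $\mathcal{M}^{H^*}_A$ with the following structures (we denote by $h \otimes m \mapsto h \succ m$ the left $H$-module structure of $M$ and by $a \otimes m \mapsto a \triangleright m$ the weak left $\overline{A}$-action on $M$ arising from Proposition 9.3):
$$M \to M \otimes H^*, \quad m \mapsto \sum_{i=1}^{n} e_i \succ m \otimes e^i, \quad \forall\, m \in M,$$
$$M \otimes A \to M, \quad m \otimes a \mapsto m \cdot a = q^1 \succ ((S(q^2) \cdot a) \triangleright m),$$
where $q_R = q^1 \otimes q^2 = X^1 \otimes S^{-1}(\alpha X^3) X^2 \in H \otimes H$ (it is the element $\tilde q_\rho$ given by (2.34) corresponding to $\mathfrak{A} = H$).
• Conversely, if $M \in \mathcal{M}^{H^*}_A$, denoting the $H^*$-comodule structure of $M$ by $M \to M \otimes H^*$, $m \mapsto m_{(0)} \otimes m_{(1)}$, and the weak right $A$-action on $M$ by $m \otimes a \mapsto ma$, then $M$ becomes an object in ${}_{H \# \overline{A}}\mathcal{M}$ with the following structures (again via Proposition 9.3): $M$ is a left $H$-module with action $h \succ m = m_{(1)}(h)\, m_{(0)}$, and the weak left $\overline{A}$-action on $M$ is given by
$$a \rightarrow m = (p^1 \succ m)(p^2 \cdot a), \quad \forall\, a \in \overline{A},\ m \in M,$$
where $p_R = p^1 \otimes p^2 = x^1 \otimes x^2 \beta S(x^3) \in H \otimes H$ (it is the element $\tilde p_\rho$ given by (2.34) corresponding to $\mathfrak{A} = H$).

Proof. Assume first that $M \in {}_{H \# \overline{A}}\mathcal{M}$; then we have, by Propositions 9.3 and 9.1:
$$a \triangleright (a' \triangleright m) = x^1 \succ ([(g^1 S(x^3) \cdot a')(g^2 S(x^2) \cdot a)] \triangleright m),$$
(9.8)
$$a \triangleright (h \succ m) = h_1 \succ [(S(h_2) \cdot a) \triangleright m],$$
(9.9)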
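for all $a, a' \in A$, $h \in H$, $m \in M$. We have to prove that $M \in \mathcal{M}^{H^*}_A$. To prove (9.1), we compute (denoting by $Q^1 \otimes Q^2$ another copy of $q_R$):
$$\begin{aligned}
(m \cdot a) \cdot a' &= Q^1 \succ [(S(Q^2) \cdot a') \triangleright (q^1 \succ [(S(q^2) \cdot a) \triangleright m])] \\
&\overset{(9.9)}{=} Q^1 q^1_1 \succ [(S(q^1_2) S(Q^2) \cdot a') \triangleright ((S(q^2) \cdot a) \triangleright m)] \\
&\overset{(9.8)}{=} Q^1 q^1_1 x^1 \succ [((g^1 S(x^3) S(q^2) \cdot a)(g^2 S(x^2) S(Q^2 q^1_2) \cdot a')) \triangleright m] \\
&\overset{(2.40)}{=} q^1 X^1_1 \succ [((g^1 S(X^1_{(2,2)}) S(q^2_2) f^1 X^2 \cdot a)(g^2 S(X^1_{(2,1)}) S(q^2_1) f^2 X^3 \cdot a')) \triangleright m] \\
&\overset{(2.11)}{=} q^1 X^1_1 \succ [((S(q^2 X^1_2)_1 X^2 \cdot a)(S(q^2 X^1_2)_2 X^3 \cdot a')) \triangleright m] \\
&= q^1 X^1_1 \succ [(S(q^2 X^1_2) \cdot ((X^2 \cdot a)(X^3 \cdot a'))) \triangleright m] \\
&= q^1 \succ [X^1_1 \succ [(S(X^1_2) \cdot (S(q^2) \cdot ((X^2 \cdot a)(X^3 \cdot a')))) \triangleright m]] \\
&\overset{(9.9)}{=} q^1 \succ [(S(q^2) \cdot ((X^2 \cdot a)(X^3 \cdot a'))) \triangleright (X^1 \succ m)] \\
&= (X^1 \succ m) \cdot ((X^2 \cdot a)(X^3 \cdot a')), \quad \text{q.e.d.}
\end{aligned}$$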
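To prove (9.2), we compute:
$$\begin{aligned}
(h_1 \succ m) \cdot (h_2 \cdot a) &= q^1 \succ ((S(q^2) h_2 \cdot a) \triangleright (h_1 \succ m)) \\
&\overset{(9.9)}{=} q^1 h_{(1,1)} \succ ((S(h_{(1,2)}) S(q^2) h_2 \cdot a) \triangleright m) \\
&\overset{(2.36)}{=} h q^1 \succ ((S(q^2) \cdot a) \triangleright m) = h \succ (m \cdot a), \quad \text{q.e.d.}
\end{aligned}$$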
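The steps labelled (2.35) to (2.38) in this proof rely on relations from Section 2 that are not restated here. As a reading aid, the sketch below records the forms these relations are assumed to take for $\mathfrak{A} = H$. The forms are inferred from the way the relations are applied in the computations of this proof and are not quoted from Section 2. The last two lines are the variants obtained by applying $\mathrm{id} \otimes S$.

```latex
% Assumed forms for \mathfrak{A} = H, with p_R = p^1 \otimes p^2 and q_R = q^1 \otimes q^2:
\begin{align*}
h_{(1,1)}\, p^1 \otimes h_{(1,2)}\, p^2 S(h_2) &= p^1 h \otimes p^2, && (2.35) \\
q^1 h_{(1,1)} \otimes S^{-1}(h_2)\, q^2 h_{(1,2)} &= h q^1 \otimes q^2, && (2.36) \\
q^1_1 p^1 \otimes q^1_2\, p^2 S(q^2) &= 1 \otimes 1, && (2.37) \\
q^1 p^1_1 \otimes S^{-1}(p^2)\, q^2 p^1_2 &= 1 \otimes 1. && (2.38)
\end{align*}
% Applying id \otimes S to (2.36) and (2.38), using that S is an anti-algebra map:
\begin{align*}
q^1 h_{(1,1)} \otimes S(h_{(1,2)}) S(q^2)\, h_2 &= h q^1 \otimes S(q^2), \\
q^1 p^1_1 \otimes S(p^1_2) S(q^2)\, p^2 &= 1 \otimes 1 .
\end{align*}
```

The first variant is what turns $q^1 h_{(1,1)} \succ ((S(h_{(1,2)}) S(q^2) h_2 \cdot a) \triangleright m)$ into $h q^1 \succ ((S(q^2) \cdot a) \triangleright m)$ in the proof of (9.2); the second is used in the same way in the step labelled (2.38).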
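Obviously $m \cdot 1_A = m$, for all $m \in M$, hence indeed $M \in \mathcal{M}^{H^*}_A$.

Conversely, assume that $M \in \mathcal{M}^{H^*}_A$, that is
$$(ma)a' = (X^1 \succ m)[(X^2 \cdot a)(X^3 \cdot a')],$$
(9.10)
$$h \succ (ma) = (h_1 \succ m)(h_2 \cdot a),$$
(9.11)
for all $m \in M$, $a, a' \in A$, $h \in H$, and we have to prove that
$$a \rightarrow (a' \rightarrow m) = x^1 \succ ([(g^1 S(x^3) \cdot a')(g^2 S(x^2) \cdot a)] \rightarrow m),$$
(9.12)
$$a \rightarrow (h \succ m) = h_1 \succ [(S(h_2) \cdot a) \rightarrow m],$$
(9.13)
for all $a, a' \in A$, $h \in H$, $m \in M$. To prove (9.12), we compute (denoting by $P^1 \otimes P^2$ another copy of $p_R$):
$$\begin{aligned}
a \rightarrow (a' \rightarrow m) &= (p^1 \succ [(P^1 \succ m)(P^2 \cdot a')])(p^2 \cdot a) \\
&\overset{(9.11)}{=} [(p^1_1 P^1 \succ m)(p^1_2 P^2 \cdot a')](p^2 \cdot a) \\
&\overset{(9.10)}{=} (X^1 p^1_1 P^1 \succ m)[(X^2 p^1_2 P^2 \cdot a')(X^3 p^2 \cdot a)] \\
&\overset{(2.39)}{=} (x^1_1 p^1 \succ m)[(x^1_{(2,1)} p^2_1 g^1 S(x^3) \cdot a')(x^1_{(2,2)} p^2_2 g^2 S(x^2) \cdot a)] \\
&\overset{(9.11)}{=} x^1 \succ [(p^1 \succ m)[(p^2_1 g^1 S(x^3) \cdot a')(p^2_2 g^2 S(x^2) \cdot a)]] \\
&= x^1 \succ [((g^1 S(x^3) \cdot a')(g^2 S(x^2) \cdot a)) \rightarrow m], \quad \text{q.e.d.}
\end{aligned}$$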
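To prove (9.13), we compute:
$$\begin{aligned}
h_1 \succ [(S(h_2) \cdot a) \rightarrow m] &= h_1 \succ [(p^1 \succ m)(p^2 S(h_2) \cdot a)] \\
&\overset{(9.11)}{=} (h_{(1,1)} p^1 \succ m)(h_{(1,2)} p^2 S(h_2) \cdot a) \\
&\overset{(2.35)}{=} (p^1 h \succ m)(p^2 \cdot a) = a \rightarrow (h \succ m), \quad \text{q.e.d.}
\end{aligned}$$
Obviously $1_A \rightarrow m = m$, for all $m \in M$, hence indeed $M \in {}_{H \# \overline{A}}\mathcal{M}$.

In order to prove that $\mathcal{M}^{H^*}_A \simeq {}_{H \# \overline{A}}\mathcal{M}$, the only things left to prove are the following:
(1) If $M \in {}_{H \# \overline{A}}\mathcal{M}$, then $a \rightarrow m = a \triangleright m$, for all $a \in A$, $m \in M$;
(2) If $M \in \mathcal{M}^{H^*}_A$, then $m \cdot a = ma$, for all $a \in A$, $m \in M$.

To prove (1), we compute:
$$\begin{aligned}
a \rightarrow m &= (p^1 \succ m) \cdot (p^2 \cdot a) = q^1 \succ [(S(q^2) p^2 \cdot a) \triangleright (p^1 \succ m)] \\
&\overset{(9.9)}{=} q^1 p^1_1 \succ [(S(p^1_2) S(q^2) p^2 \cdot a) \triangleright m] \\
&\overset{(2.38)}{=} a \triangleright m, \quad \text{q.e.d.}
\end{aligned}$$
To prove (2), we compute:
$$\begin{aligned}
m \cdot a &= q^1 \succ [(S(q^2) \cdot a) \rightarrow m] = q^1 \succ [(p^1 \succ m)(p^2 S(q^2) \cdot a)] \\
&\overset{(9.11)}{=} (q^1_1 p^1 \succ m)(q^1_2 p^2 S(q^2) \cdot a) \\
&\overset{(2.37)}{=} ma,
\end{aligned}$$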
and the proof is finished.

We will need the description of left modules over a two-sided smash product.

Definition 9.5. Let $H$ be a quasi-bialgebra, $A$ a left $H$-module algebra and $B$ a right $H$-module algebra. Define the category ${}_{A,H,B}\mathcal{M}$ as follows: an object in this category is a left $H$-module $M$, with action denoted by $h \otimes m \mapsto h \succ m$, and we have left weak actions of $A$ and $B$ on $M$, denoted by $a \otimes m \mapsto a \cdot m$ and $b \otimes m \mapsto b \cdot m$, such that:
(i) $M \in {}_{A \# H}\mathcal{M}$, that is the relations (9.3) and (9.4) hold;
(ii) $M \in {}_{H \# B}\mathcal{M}$, that is the relations (9.6) and (9.7) hold;
(iii) the following compatibility condition holds:
$$b \cdot (a \cdot m) = (y^1 \cdot a) \cdot [y^2 \succ ((b \cdot y^3) \cdot m)],$$
(9.14)
for all $a \in A$, $b \in B$, $m \in M$. The morphisms in this category are the $H$-linear maps compatible with the two weak actions.

Proposition 9.6. If $H$, $A$, $B$ are as above, then ${}_{A \# H \# B}\mathcal{M} \simeq {}_{A,H,B}\mathcal{M}$, the isomorphism being given as follows:
• If $M \in {}_{A \# H \# B}\mathcal{M}$, define $a \cdot m = (a \# 1 \# 1) \cdot m$, $h \succ m = (1 \# h \# 1) \cdot m$, $b \cdot m = (1 \# 1 \# b) \cdot m$.
• Conversely, if $M \in {}_{A,H,B}\mathcal{M}$, define $(a \# h \# b) \cdot m = a \cdot (h \succ (b \cdot m))$.
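The formula $(a \# h \# b) \cdot m = a \cdot (h \succ (b \cdot m))$ reflects a factorization of $a \# h \# b$ inside the two-sided smash product. The sketch below checks this factorization under an assumed multiplication rule, namely the case $\mathbb{A} = H$ of the two-sided generalized smash product multiplication from Section 8; the rule and the counit normalizations it uses are assumptions of the sketch and are not restated in this section.

```latex
% Assumed multiplication in A \# H \# B, with x, y, z three copies of \Phi^{-1}:
%   (a \# h \# b)(a' \# h' \# b')
%     = (x^1 \cdot a)(x^2 h_1 y^1 \cdot a') \# x^3 h_2 y^2 h'_1 z^1 \# (b \cdot y^3 h'_2 z^2)(b' \cdot z^3).
% Also assumed: h \cdot 1_A = \varepsilon(h) 1_A, 1_B \cdot h = \varepsilon(h) 1_B,
%   and applying \varepsilon to any tensor factor of \Phi^{-1} gives 1 \otimes 1.
\begin{align*}
(a \# 1 \# 1)(1 \# h \# 1)
  &= (x^1 \cdot a)(x^2 y^1 \cdot 1_A) \,\#\, x^3 y^2 h_1 z^1 \,\#\, (1_B \cdot y^3 h_2 z^2)(1_B \cdot z^3) \\
  &= a \,\#\, h \,\#\, 1, \\
(a \# h \# 1)(1 \# 1 \# b)
  &= (x^1 \cdot a)(x^2 h_1 y^1 \cdot 1_A) \,\#\, x^3 h_2 y^2 z^1 \,\#\, (1_B \cdot y^3 z^2)(b \cdot z^3) \\
  &= a \,\#\, h \,\#\, b .
\end{align*}
```

Hence $a \# h \# b = (a \# 1 \# 1)(1 \# h \# 1)(1 \# 1 \# b)$, so a module action of $A \# H \# B$ is determined by the three actions listed in the first bullet, in the bracketing used in the second.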
Proof. Straightforward computation, using the formula for the multiplication in $A \# H \# B$. Let us point out how the condition (9.14) occurs:
$$\begin{aligned}
b \cdot (a \cdot m) &= (1 \# 1 \# b) \cdot ((a \# 1 \# 1) \cdot m) \\
&= [(1 \# 1 \# b)(a \# 1 \# 1)] \cdot m \\
&= (y^1 \cdot a \,\#\, y^2 \,\#\, b \cdot y^3) \cdot m \\
&= (y^1 \cdot a) \cdot (y^2 \succ ((b \cdot y^3) \cdot m)),
\end{aligned}$$
which is exactly (9.14).

Let $H$ be a finite dimensional quasi-bialgebra and $A$, $D$ two left $H$-module algebras. It is obvious that ${}_A\mathcal{M}^{H^*}$ coincides with the category of left $A$-modules within the monoidal category ${}_H\mathcal{M}$, and similarly $\mathcal{M}^{H^*}_D$ coincides with the category of right $D$-modules within ${}_H\mathcal{M}$. Hence, we can introduce the following new category:

Definition 9.7. If $H$, $A$, $D$ are as above, define ${}_A\mathcal{M}^{H^*}_D$ as the category of $A$-$D$-bimodules within the monoidal category ${}_H\mathcal{M}$, that is $M \in {}_A\mathcal{M}^{H^*}_D$ if and only if $M \in {}_A\mathcal{M}^{H^*}$, $M \in \mathcal{M}^{H^*}_D$ and the following relation holds:
$$(a \cdot m) \cdot d = (X^1 \cdot a) \cdot [(X^2 \succ m) \cdot (X^3 \cdot d)],$$
(9.15)
for all $a \in A$, $m \in M$, $d \in D$, where $a \otimes m \mapsto a \cdot m$ and $m \otimes d \mapsto m \cdot d$ are the weak actions.

Proposition 9.8. Let $H$ be a finite dimensional quasi-Hopf algebra and $A$, $D$ two left $H$-module algebras. Then we have an isomorphism of categories ${}_A\mathcal{M}^{H^*}_D \simeq {}_{A \# H \# \overline{D}}\mathcal{M}$, where $\overline{D}$ is the right $H$-module algebra as in Proposition 9.1.
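Proof. Since ${}_A\mathcal{M}^{H^*} \simeq {}_{A \# H}\mathcal{M}$ and $\mathcal{M}^{H^*}_D \simeq {}_{H \# \overline{D}}\mathcal{M}$, the only thing left to prove is that the compatibility (9.14) in ${}_{A,H,\overline{D}}\mathcal{M}$ is equivalent to the compatibility (9.15) in ${}_A\mathcal{M}^{H^*}_D$. Let us first note the following easy consequences of (2.3), (2.6):
$$X^1 p^1_1 \otimes X^2 p^1_2 \otimes X^3 p^2 = y^1 \otimes y^2_1 p^1 \otimes y^2_2 p^2 S(y^3),$$
(9.16)
$$q^1_1 y^1 \otimes q^1_2 y^2 \otimes S(q^2 y^3) = X^1 \otimes q^1 X^2_1 \otimes S(q^2 X^2_2) X^3,$$
(9.17)
where $p_R = p^1 \otimes p^2$ and $q_R = q^1 \otimes q^2$ are the elements given by (2.34) for $\mathfrak{A} = H$.

Let now $M \in {}_A\mathcal{M}^{H^*}_D$, with right $D$-action on $M$ denoted by $m \otimes d \mapsto m \cdot d$. Then, by Proposition 9.4, the weak left $\overline{D}$-action on $M$ is given by $d \rightarrow m = (p^1 \succ m) \cdot (p^2 \cdot d)$. We check (9.14); we compute:
$$\begin{aligned}
d \rightarrow (a \cdot m) &= (p^1 \succ (a \cdot m)) \cdot (p^2 \cdot d) \\
&\overset{(9.4)}{=} [(p^1_1 \cdot a) \cdot (p^1_2 \succ m)] \cdot (p^2 \cdot d) \\
&\overset{(9.15)}{=} (X^1 p^1_1 \cdot a) \cdot [(X^2 p^1_2 \succ m) \cdot (X^3 p^2 \cdot d)] \\
&\overset{(9.16)}{=} (y^1 \cdot a) \cdot [(y^2_1 p^1 \succ m) \cdot (y^2_2 p^2 S(y^3) \cdot d)] \\
&\overset{(9.2)}{=} (y^1 \cdot a) \cdot [y^2 \succ ((p^1 \succ m) \cdot (p^2 S(y^3) \cdot d))] \\
&= (y^1 \cdot a) \cdot [y^2 \succ ((S(y^3) \cdot d) \rightarrow m)] \\
&= (y^1 \cdot a) \cdot [y^2 \succ ((d \cdot y^3) \rightarrow m)], \quad \text{q.e.d.}
\end{aligned}$$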
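Conversely, assume that $M \in {}_{A \# H \# \overline{D}}\mathcal{M}$, and denote the actions of $A$, $H$, $\overline{D}$ on $M$ by $a \cdot m$, $h \succ m$, $d \cdot m$ respectively. Then, by Proposition 9.4, the right $D$-action on $M$ is given by $m \cdot d = q^1 \succ ((S(q^2) \cdot d) \cdot m)$. To check (9.15), we compute:
$$\begin{aligned}
(a \cdot m) \cdot d &= q^1 \succ [(S(q^2) \cdot d) \cdot (a \cdot m)] \\
&\overset{(9.14)}{=} q^1 \succ [(y^1 \cdot a) \cdot (y^2 \succ ((S(q^2) \cdot d \cdot y^3) \cdot m))] \\
&= q^1 \succ [(y^1 \cdot a) \cdot (y^2 \succ ((S(q^2 y^3) \cdot d) \cdot m))] \\
&\overset{(9.4)}{=} (q^1_1 y^1 \cdot a) \cdot [q^1_2 y^2 \succ ((S(q^2 y^3) \cdot d) \cdot m)] \\
&\overset{(9.17)}{=} (X^1 \cdot a) \cdot [q^1 X^2_1 \succ ((S(q^2 X^2_2) X^3 \cdot d) \cdot m)] \\
&= (X^1 \cdot a) \cdot [q^1 X^2_1 \succ ((S(q^2) X^3 \cdot d \cdot X^2_2) \cdot m)] \\
&\overset{(9.7)}{=} (X^1 \cdot a) \cdot [q^1 \succ ((S(q^2) X^3 \cdot d) \cdot (X^2 \succ m))] \\
&= (X^1 \cdot a) \cdot [(X^2 \succ m) \cdot (X^3 \cdot d)], \quad \text{q.e.d.}
\end{aligned}$$
and the proof is finished.

Let $H$ be a finite dimensional quasi-bialgebra and $\mathcal{A}$, $\mathcal{D}$ two $H$-bimodule algebras. Define the category ${}^{H^*}_{\mathcal{A}}\mathcal{M}^{H^*}_{\mathcal{D}}$ as the category of $\mathcal{A}$-$\mathcal{D}$-bimodules within the monoidal category ${}_H\mathcal{M}_H$. By regarding $\mathcal{A}$ and $\mathcal{D}$ as left module algebras over $H \otimes H^{op}$, it is easy to see that ${}^{H^*}_{\mathcal{A}}\mathcal{M}^{H^*}_{\mathcal{D}} \cong {}_{\mathcal{A}}\mathcal{M}^{(H \otimes H^{op})^*}_{\mathcal{D}}$. Hence, as a consequence of Proposition 9.8, we finally obtain:

Theorem 9.9. If $H$ is a finite dimensional quasi-Hopf algebra and $\mathcal{A}$, $\mathcal{D}$ are two $H$-bimodule algebras, then we have an isomorphism of categories ${}^{H^*}_{\mathcal{A}}\mathcal{M}^{H^*}_{\mathcal{D}} \simeq {}_{\mathcal{A} \# (H \otimes H^{op}) \# \overline{\mathcal{D}}}\mathcal{M}$. In particular, we have
$${}^{H^*}_{H^*}\mathcal{M}^{H^*}_{H^*} \simeq {}_{H^* \# (H \otimes H^{op}) \# \overline{H^*}}\mathcal{M}.$$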
10. Yetter-Drinfeld Modules as Modules Over a Generalized Diagonal Crossed Product

If $H$ is a quasi-bialgebra, then the category of $(H, H)$-bimodules, ${}_H\mathcal{M}_H$, is monoidal. The associativity constraints are given by (2.48). A coalgebra in the category of $(H, H)$-bimodules will be called an $H$-bimodule coalgebra. More precisely, an $H$-bimodule coalgebra $C$ is an $(H, H)$-bimodule (denote the actions by $h \cdot c$ and $c \cdot h$) with a comultiplication $\underline{\Delta} : C \to C \otimes C$ and a counit $\underline{\varepsilon} : C \to k$ satisfying the following relations, for all $c \in C$ and $h \in H$:
$$\Phi \cdot (\underline{\Delta} \otimes \mathrm{id})(\underline{\Delta}(c)) \cdot \Phi^{-1} = (\mathrm{id} \otimes \underline{\Delta})(\underline{\Delta}(c)),$$
(10.1)
$$\underline{\Delta}(h \cdot c) = h_1 \cdot c_1 \otimes h_2 \cdot c_2, \qquad \underline{\Delta}(c \cdot h) = c_1 \cdot h_1 \otimes c_2 \cdot h_2,$$
(10.2)
$$\underline{\varepsilon}(h \cdot c) = \varepsilon(h)\underline{\varepsilon}(c), \qquad \underline{\varepsilon}(c \cdot h) = \underline{\varepsilon}(c)\varepsilon(h),$$
(10.3)
where we used the Sweedler-type notation $\underline{\Delta}(c) = c_1 \otimes c_2$. An example of an $H$-bimodule coalgebra is $H$ itself. Our next definition extends the definition of Yetter-Drinfeld modules from [18].

Definition 10.1. Let $H$ be a quasi-bialgebra, $C$ an $H$-bimodule coalgebra and $\mathbb{A}$ an $H$-bicomodule algebra. A left-right Yetter-Drinfeld module is a $k$-vector space $M$ with the following additional structure:
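- $M$ is a left $\mathbb{A}$-module; we write $\cdot$ for the left $\mathbb{A}$-action;
- we have a $k$-linear map $\rho_M : M \to M \otimes C$, $\rho_M(m) = m_{(0)} \otimes m_{(1)}$, called the right $C$-coaction on $M$, such that for all $m \in M$, $\underline{\varepsilon}(m_{(1)})\, m_{(0)} = m$ and
$$(\theta^2 \cdot m_{(0)})_{(0)} \otimes (\theta^2 \cdot m_{(0)})_{(1)} \cdot \theta^1 \otimes \theta^3 \cdot m_{(1)} = \tilde x^1_\rho \cdot (\tilde x^3_\lambda \cdot m)_{(0)} \otimes \tilde x^2_\rho \cdot (\tilde x^3_\lambda \cdot m)_{(1)_1} \cdot \tilde x^1_\lambda \otimes \tilde x^3_\rho \cdot (\tilde x^3_\lambda \cdot m)_{(1)_2} \cdot \tilde x^2_\lambda,$$
(10.4)
- the following compatibility relation holds:
$$u_{\langle 0\rangle} \cdot m_{(0)} \otimes u_{\langle 1\rangle} \cdot m_{(1)} = (u_{[0]} \cdot m)_{(0)} \otimes (u_{[0]} \cdot m)_{(1)} \cdot u_{[-1]},$$
(10.5)
for all $u \in \mathbb{A}$, $m \in M$. ${}_{\mathbb{A}}\mathcal{YD}(H)^C$ will be the category of left-right Yetter-Drinfeld modules and maps preserving the actions by $\mathbb{A}$ and the coactions by $C$.

Let $H$ be a quasi-bialgebra, $\mathbb{A}$ an $H$-bicomodule algebra and $C$ an $H$-bimodule coalgebra. Let us call the triple $(H, \mathbb{A}, C)$ a Yetter-Drinfeld datum. We note that, for an arbitrary $H$-bimodule coalgebra $C$, the linear dual space of $C$, $C^*$, is an $H$-bimodule algebra. The multiplication of $C^*$ is the convolution, that is $(c^* d^*)(c) = c^*(c_1)\, d^*(c_2)$, the unit is $\underline{\varepsilon}$ and the left and right $H$-module structures are given by $(h \rightharpoonup c^* \leftharpoonup h')(c) = c^*(h' \cdot c \cdot h)$, for all $h, h' \in H$, $c^*, d^* \in C^*$, $c \in C$.

In the rest of this section we establish that if $H$ is a quasi-Hopf algebra and $C$ is finite dimensional then the category ${}_{\mathbb{A}}\mathcal{YD}(H)^C$ is isomorphic to the category of left $C^* \bowtie \mathbb{A}$-modules, ${}_{C^* \bowtie \mathbb{A}}\mathcal{M}$. First some lemmas.

Lemma 10.2. Let $H$ be a quasi-Hopf algebra and $(H, \mathbb{A}, C)$ a Yetter-Drinfeld datum. We have a functor $F : {}_{\mathbb{A}}\mathcal{YD}(H)^C \to {}_{C^* \bowtie \mathbb{A}}\mathcal{M}$, given by $F(M) = M$ as $k$-module, with the $C^* \bowtie \mathbb{A}$-module structure defined by
$$(c^* \bowtie u)m := \langle c^*, \tilde q^2_\rho \cdot (u \cdot m)_{(1)} \rangle\, \tilde q^1_\rho \cdot (u \cdot m)_{(0)},$$
(10.6)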
for all c∗ ∈ C ∗ , u ∈ A and m ∈ M, where q˜ρ = q˜ρ1 ⊗ q˜ρ2 is the element defined in (2.34). F transforms a morphism to itself. Proof. Let Q˜ 1ρ ⊗ Q˜ 2ρ be another copy of q˜ρ . For all c∗ , d ∗ ∈ C ∗ , u, u ∈ A and m ∈ M we compute: [(c∗ u)(d ∗ u )]m (3.21) =
=
[(1 c∗ 5 )(2 u 0[−1] d ∗ S −1 (u 1 )4 ) 3 u 0[0] u ]m
d ∗ , S −1 (u 1 )4 (q˜ρ2 )2 · (3 u 0[0] u · m)(1)2 · 2 u 0[−1]
c∗ , 5 (q˜ρ2 )1 · (3 u 0[0] u · m)(1)1 · 1 q˜ρ1 · (3 u 0[0] u · m)(0)
(3.15) =
2
1
1
2
d ∗ , S −1 ( f 1 X˜ ρ θ 3 u 1 )(q˜ρ2 )2 · (( X˜ ρ )[0] x˜ 3λ θ[0] u 0[0] u · m)(1)2 · ( X˜ ρ )[−1]2 3
1
2 2 ×x˜ 2λ θ[−1] u 0[−1] c∗ , S −1 ( f 2 X˜ ρ )(q˜ρ )1 · (( X˜ ρ )[0] x˜ 3λ θ[0] u 0[0] u · m)(1)1 1
2 ·x˜ 1λ θ 1 q˜ρ1 · (( X˜ ρ )[0] x˜ 3λ θ[0] u 0[0] u · m)(0)
(10.5,2.40) =
2 2
d ∗ , S −1 (θ 3 u 1 ) Q˜ 2ρ x˜ 3ρ · (x˜ 3λ θ[0] u 0[0] u · m)(1)2 · x˜ 2λ θ[−1] u 0[−1] 2
c∗ , q˜ρ2 ( Q˜ 1ρ ) 1 x˜ 2ρ · (x˜ 3λ θ[0] u 0[0] u · m)(1)1 · x˜ 1λ θ 1 2 q˜ρ1 ( Q˜ 1ρ ) 0 x˜ 1ρ · (x˜ 3λ θ[0] u 0[0] u · m)(0)
(10.4) =
3 2 2
d ∗ , S −1 (θ 3 u 1 ) Q˜ 2ρ θ · (θ[0] u 0[0] u · m)(1) · θ[−1] u 0[−1] 2
c∗ , q˜ρ2 ( Q˜ 1ρ ) 1 · [θ · (θ[0] u 0[0] u · m)(0) ](1) · θ θ 1 2
1
2 q˜ρ1 ( Q˜ 1ρ ) 0 · [θ · (θ[0] u 0[0] u · m)(0) ](0) 2
(10.5,2.45) =
3 2 3 2
d ∗ , S −1 (α X˜ ρ θ 3 u 1 ) X˜ ρ θ θ 1 u 0,1 · (u · m)(1) 1 1 2 2 1
c∗ , q˜ρ2 · [( X˜ ρ )[0] θ θ 0 u 0,0 · (u · m)(0) ](1) · ( X˜ ρ )[−1] θ θ 1 1 2 2 q˜ρ1 · [( X˜ ρ )[0] θ θ 0 u 0,0 · (u · m)(0) ](0)
(2.45,2.26) =
3
2
d ∗ , S −1 (αθ23 u 12 X˜ ρ )θ13 u 11 X˜ ρ · (u · m)(1) 1 1
c∗ , q˜ρ2 · [θ 2 u 0 X˜ ρ · (u · m)(0) ](1) · θ 1 q˜ρ1 · [θ 2 u 0 X˜ ρ · (u · m)(0) ](0)
(2.6,2.34) = (10.6) =
c∗ , q˜ρ2 · [u Q˜ 1ρ · (u · m)(0) ](1) d ∗ , Q˜ 2ρ · (u · m)(1) q˜ρ1 · [u Q˜ 1ρ · (u · m)(0) ](0)
d ∗ , Q˜ 2ρ · (u · m)(1) (c∗ u)[ Q˜ 1ρ · (u · m)(0) ] = (c∗ u)[(d ∗ u )m],
as needed. It is not hard to see that $(\underline{\varepsilon} \bowtie 1_{\mathbb{A}})m = m$ for all $m \in M$, so $M$ is a left $C^* \bowtie \mathbb{A}$-module. The fact that a morphism in ${}_{\mathbb{A}}\mathcal{YD}(H)^C$ becomes a morphism in ${}_{C^* \bowtie \mathbb{A}}\mathcal{M}$ can be proved more easily; we leave the details to the reader.

Lemma 10.3. Let $H$ be a quasi-Hopf algebra and $(H, \mathbb{A}, C)$ a Yetter-Drinfeld datum and assume $C$ is finite dimensional. We have a functor $G : {}_{C^* \bowtie \mathbb{A}}\mathcal{M} \to {}_{\mathbb{A}}\mathcal{YD}(H)^C$, given by $G(M) = M$ as $k$-module, with structure maps defined by
$$u \cdot m = (\underline{\varepsilon} \bowtie u)m,$$
(10.7)
$$\rho_M : M \to M \otimes C, \qquad \rho_M(m) = \sum_{i=1}^{n} (c^i \bowtie (\tilde p^1_\rho)_{[0]})m \otimes S^{-1}(\tilde p^2_\rho) \cdot c_i \cdot (\tilde p^1_\rho)_{[-1]},$$
(10.8)
for $m \in M$ and $u \in \mathbb{A}$. Here $\tilde p_\rho = \tilde p^1_\rho \otimes \tilde p^2_\rho$ is the element defined in (2.34), $\{c_i\}_{i=1,n}$ is a basis of $C$ and $\{c^i\}_{i=1,n}$ is the corresponding dual basis of $C^*$. $G$ transforms a morphism to itself.

Proof. The most difficult part of the proof is to show that $G(M)$ satisfies the relations (10.4) and (10.5). It is then straightforward to show that a map in ${}_{C^* \bowtie \mathbb{A}}\mathcal{M}$ is also a map in ${}_{\mathbb{A}}\mathcal{YD}(H)^C$, and that $G$ is a functor. It is not hard to see that (2.45), (2.6) and (2.46) imply
2
3
2 2 θ θ 1 ⊗ θ θ 0 p˜ ρ1 ⊗ θ θ 1 p˜ ρ2 S(θ 3 ) = ( p˜ ρ1 )[−1] ⊗ ( p˜ ρ1 )[0] ⊗ p˜ ρ2 .
(10.9)
Write p˜ ρ = p˜ ρ1 ⊗ p˜ ρ2 = P˜ρ1 ⊗ P˜ρ2 . For all m ∈ M we compute: (θ 2 · m (0) )(0) ⊗ (θ 2 · m (0) )(1) · θ 1 ⊗ θ 3 · m (1) n = ((ε θ 2 )(ci ( p˜ ρ1 )[0] )m)(0) ⊗ ((ε θ 2 )(ci ( p˜ ρ1 )[0] )m)(1) · θ 1 i=1
⊗θ 3 S −1 ( p˜ ρ2 ) · ci · ( p˜ ρ1 )[−1] (3.21,10.8) =
n
2 (c j ( P˜ρ1 )[0] )(ci (θ 0 p˜ ρ1 )[0] )m ⊗ S −1 ( P˜ρ2 ) · c j · ( P˜ρ1 )[−1] θ 1
i, j=1 2 2 ⊗θ 3 S −1 (θ 1 p˜ ρ2 ) · ci · (θ 0 p˜ ρ1 )[−1] n 1 2 (3.21,3.15) 2 [c j ci ( X˜ ρ )[0] x˜ 3λ (θ ( P˜ρ1 )[0] 0 θ 0 p˜ ρ1 )[0] ]m = i, j=1 1
3 ⊗ S −1 ( f 2 X˜ ρ P˜ρ2 )
2 3
2 p˜ ρ2 ) · ci ·c j · ( X˜ ρ )[−1]1 x˜ 1λ θ ( P˜ρ1 )[−1] θ 1 ⊗ θ 3 S −1 ( f 1 X˜ ρ θ ( P˜ρ1 )[0] 1 θ 1 1
1
2 ·( X˜ ρ )[−1]2 x˜ 2λ (θ ( P˜ρ1 )[0] 0 θ 0 p˜ ρ1 )[−1] (2.43,10.9,2.30) =
2
n
1 3 [c j ci ( X˜ ρ ( P˜ρ1 ) 0 p˜ ρ1 )[0] x˜ 3λ ]m ⊗ S −1 ( f 2 X˜ ρ P˜ρ2 ) · c j
i, j=1 1 2 1 ·( X˜ ρ ( P˜ρ1 ) 0 p˜ ρ1 )[−1]1 x˜ 1λ ⊗ S −1 ( f 1 X˜ ρ ( P˜ρ1 ) 1 p˜ ρ2 ) · ci · ( X˜ ρ ( P˜ρ1 ) 0 p˜ ρ1 )[−1]2 x˜ 2λ n (2.39) [c j ci ((x˜ 1ρ ) 0 p˜ ρ1 )[0] x˜ 3λ ]m ⊗ x˜ 2ρ S −1 ( f 2 ((x˜ 1ρ ) 1 p˜ ρ2 )2 g 2 ) · c j = i, j=1
·((x˜ 1ρ ) 0 p˜ ρ1 )[−1]1 x˜ 1λ ⊗ x˜ 3ρ S −1 ( f 1 ((x˜ 1ρ ) 1 p˜ ρ2 )1 g 1 ) · ci · ((x˜ 1ρ ) 0 p˜ ρ1 )[−1]2 x˜ 2λ (2.11,10.2) =
n
[ci ((x˜ 1ρ ) 0 p˜ ρ1 )[0] x˜ 3λ ]m ⊗ x˜ 2ρ · (S −1 ((x˜ 1ρ ) 1 p˜ ρ2 ) · ci
i=1 1 ·((x˜ ρ ) 0 p˜ ρ1 )[−1] )1 · x˜ 1λ ⊗ x˜ 3ρ · (S −1 ((x˜ 1ρ ) 1 p˜ ρ2 ) · ci · ((x˜ 1ρ ) 0 p˜ ρ1 )[−1] )2 n = [(x˜ 1ρ ) 0[−1] ci S −1 ((x˜ 1ρ ) 1 ) ((x˜ 1ρ ) 0 p˜ ρ1 )[0] x˜ 3λ ]m i=1 2 ⊗x˜ ρ · (S −1 ( p˜ ρ2 ) · ci · ( p˜ ρ1 )[−1] )1 · x˜ 1λ ⊗ x˜ 3ρ · (S −1 ( p˜ ρ2 ) · ci · ( p˜ ρ1 )[−1] )2 n (3.21) [(ε x˜ 1ρ )(ci ( p˜ ρ1 )[0] )(ε x˜ 3λ )]m = i=1 2 ⊗x˜ ρ · (S −1 ( p˜ ρ2 ) · ci · ( p˜ ρ1 )[−1] )1 · x˜ 1λ ⊗ x˜ 3ρ · (S −1 ( p˜ ρ2 ) · ci · ( p˜ ρ1 )[−1] )2 (10.7,10.8) =
θρ1 · (x˜ 3λ · m)(0) ⊗ x˜ 2ρ · (x˜ 3λ · m)(1)1 · x˜ 1λ ⊗ x˜ 3ρ · (x˜ 3λ · m)(1)2 · x˜ 2λ .
Similarly, we compute: u 0 · m (0) ⊗ u 1 · m (1) n = (ε u 0 )(ci ( p˜ ρ1 )[0] )m ⊗ u 1 S −1 ( p˜ ρ2 ) · ci ( p˜ ρ1 )[−1] i=1
· x˜ 2λ
· x˜ 2λ
· x˜ 2λ
n
(3.21) =
i=1
(u 0,0[−1] ci S −1 (u 0,1 ) u 0,0[0] ( p˜ ρ1 )[0] )m
⊗u 1 S −1 ( p˜ ρ2 ) · ci · ( p˜ ρ1 )[−1] =
n i=1
(ci (u 0,0 p˜ ρ1 )[0] )m ⊗ u 1 S −1 (u 0,1 p˜ ρ2 ) · ci · (u 0,0 p˜ ρ1 )[−1] n
(2.35) =
i=1
(3.21) =
(ci ( p˜ ρ1 u)[0] )m ⊗ S −1 ( p˜ ρ2 ) · ci · ( p˜ ρ1 u)[−1]
n i=1
(ci ( p˜ ρ1 )[0] )(ε u [0] )m ⊗ S −1 ( p˜ ρ2 ) · ci · ( p˜ ρ1 )[−1] u [−1]
(10.8) = (u [0]
· m)(0) ⊗ (u [0] · m)(1) · u [−1] ,
for all u ∈ A and m ∈ M, and this finishes the proof. The next result generalizes [13, Prop. 3.12], which is recovered by taking C = A = H . Theorem 10.4. Let H be a quasi-Hopf algebra and (H, A, C) a Yetter-Drinfeld datum, assuming C to be finite dimensional. Then the categories A Y D(H )C and C ∗ A M are isomorphic. Proof. We have to verify that the functors F and G defined in Lemmas 10.2 and 10.3 are inverse to each other. Let M ∈ A Y D(H )C . The structures on G(F(M)) (using first Lemma 10.2 and then Lemma 10.3) are denoted by · and ρ M . For any u ∈ A and m ∈ M we have that u · m = (ε u)m = ε, q˜ρ2 · (u · m)(1) q˜ρ1 · (u · m)(0) = u · m because ε(h · c) = ε(h)ε(c) and ε(m (1) )m (0) = m for all h ∈ H , c ∈ C, m ∈ M. We now compute for m ∈ M that ρ M (m) n = (ci ( p˜ ρ1 )[0] )m ⊗ S −1 ( p˜ ρ2 ) · ci · ( p˜ ρ1 )[−1] i=1
(10.6) =
n i=1
ci , q˜ρ2 · (( p˜ ρ1 )[0] · m)(1) q˜ρ1 · (( p˜ ρ1 )[0] · m)(0) ⊗ S −1 ( p˜ ρ2 ) · ci · ( p˜ ρ1 )[−1]
(10.5) 1 1 = q˜ρ ( p˜ ρ ) 0 (2.38) = m (0)
· m (0) ⊗ S −1 ( p˜ ρ2 )q˜ρ2 ( p˜ ρ1 ) 1 · m (1)
⊗ m (1) = ρ M (m).
Conversely, take M ∈ C ∗ A M. We want to show that F(G(M)) = M. If we denote the left C ∗ A-action on F(G(M)) by →, then, using Lemmas 10.2 and 10.3 we find,
for all c∗ ∈ C ∗ , u ∈ A and m ∈ M: (c∗ u) → m = c∗ , q˜ρ2 · (u · m)(1) q˜ρ1 · (u · m)(0) n =
c∗ , q˜ρ2 S −1 ( p˜ ρ2 ) · ci · ( p˜ ρ1 )[−1] (ε q˜ρ1 )(ci ( p˜ ρ1 )[0] )(ε u)m i=1
(3.21) =
n i=1
c∗ , q˜ρ2 S −1 ((q˜ρ1 ) 1 p˜ ρ2 ) · ci · ((q˜ρ1 ) 0 p˜ ρ1 )[−1]
(ci ((q˜ρ1 ) 0 p˜ ρ1 )[0] )(ε u)m (2.37,3.21) ∗ (c =
1A )(ε u)m = (c∗ u)m,
and this finishes our proof. There is a relation between the functor F from Lemma 10.2 and the map as in Proposition 3.8. Proposition 10.5. Let H be a quasi-Hopf algebra, (H, A, C) a Yetter-Drinfeld datum and M an object in A Y D(H )C ; consider the map : C ∗ → C ∗ A as in Proposition 3.8. Then the left C ∗ A-module structure on M given in Lemma 10.2 and the map are related by the formula: (c∗ )m = c∗ , m (1) m (0) , for all c∗ ∈ C ∗ and m ∈ M. Proof. We compute: (c∗ )m = (( p˜ ρ1 )[−1] c∗ S −1 ( p˜ ρ2 ) ( p˜ ρ1 )[0] )m = ( p˜ ρ1 )[−1] c∗ S −1 ( p˜ ρ2 ), q˜ρ2 · (( p˜ ρ1 )[0] · m)(1) q˜ρ1 · (( p˜ ρ1 )[0] · m)(0) = c∗ , S −1 ( p˜ ρ2 )q˜ρ2 · (( p˜ ρ1 )[0] · m)(1) · ( p˜ ρ1 )[−1] q˜ρ1 · (( p˜ ρ1 )[0] · m)(0) (10.5) = (2.38) =
c∗ , S −1 ( p˜ ρ2 )q˜ρ2 ( p˜ ρ1 ) 1 · m (1) q˜ρ1 ( p˜ ρ1 ) 0 · m (0)
c∗ , m (1) m (0) ,
finishing the proof. References 1. Akrami, S. E., Majid, S.: Braided cyclic cocycles and nonassociative geometry. J. Math. Phys. 45, 3883– 3911 (2004) 2. Albuquerque, H., Majid, S.: Quasialgebra structure of the octonions. J. Algebra 220, 188–224 (1999) 3. Altschuler, D., Coste, A.: Quasi-quantum groups, knots, three-manifolds and topological field theory. Commun. Math. Phys. 150, 83–107 (1992) 4. Beggs, E. J., Majid, S.: Quantization by cochain twists and nonassociative differentials. http://arxiv.org/listmath.QA/0506450, 2005 5. Bulacu, D., Panaite, F., Van Oystaeyen, F.: Quasi-Hopf algebra actions and smash products. Comm. Algebra 28, 631–651 (2000) 6. Bulacu, D., Nauwelaerts, E.: Relative Hopf modules for (dual) quasi-Hopf algebras. J. Algebra 229, 632–659 (2000)
7. Bulacu, D., Caenepeel, S.: Two-sided two-cosided Hopf modules and Doi-Hopf modules for quasi-Hopf algebras. J. Algebra 270, 55–95 (2003) 8. Caenepeel, S., Militaru, G., Zhu, S.: Crossed modules and Doi-Hopf modules. Israel J. Math. 100, 221–247 (1997) 9. Connes A., Dubois-Violette, M.: Noncommutative finite dimensional manifolds I. Spherical manifolds and related examples. Commun. Math. Phys. 230, 539–579 (2002) 10. Dijkgraaf, R., Pasquier, V., Roche, P.: Quasi-Hopf algebras, group cohomology and orbifold models. Nucl. Phys. B Proc. Suppl. 18 B, 60–72 (1990) 11. Drinfeld, V. G.: Quasi-Hopf algebras. Leningrad Math. J. 1, 1419–1457 (1990) 12. Hausser, F., Nill, F.: Diagonal crossed products by duals of quasi-quantum groups. Rev. Math. Phys. 11, 553–629 (1999) 13. Hausser, F., Nill, F.: Doubles of quasi-quantum groups. Commun. Math. Phys. 199, 547–589 (1999) 14. Jara Martínez, P., López Peña, J., Panaite F., Van Oystaeyen, F.: On iterated twisted tensor products of algebras. http://arxiv.org/list/math.QA/0511280, 2005 15. Jimbo, M., Konno, H., Odake S., Shiraishi, J.: Quasi-Hopf twistors for elliptic quantum groups. Transform. Groups 4, 303–327 (1999) 16. Kassel, C.: Quantum Groups, Graduate Texts in Mathematics 155, Berlin: Springer Verlag, 1995 17. Mack, G., Schomerus, V.: Action of truncated quantum groups on quasi-quantum planes and a quasiassociative differential geometry and calculus. Commun. Math. Phys. 149, 513–548 (1992) 18. Majid, S.: Quantum double for quasi-Hopf algebras. Lett. Math. Phys. 45, 1–9 (1998) 19. Majid, S.: Foundations of quantum group theory, Cambridge: Cambridge Univ. Press, 1995 20. Majid, S.: Gauge theory on nonassociative spaces. J. Math. Phys. 46, 103519, (2005) 23 pp 21. Panaite, F.: Hopf bimodules are modules over a diagonal crossed product algebra. Comm. Algebra 30, 4049–4058, (2002) 22. Schauenburg, P.: Hopf modules and the double of a quasi-Hopf algebra. Trans. Amer. Math. Soc. 354, 3349–3378 (2002) 23. 
Schauenburg, P.: Actions of monoidal categories and generalized Hopf smash products. J. Algebra 270, 521-563 (2003) 24. Sweedler, M. E.: Hopf algebras. New York: Benjamin, 1969 25. Wang, S.-H., Li, J.: On twisted smash products for bimodule algebras and the Drinfeld double. Comm. Algebra 26, 2435–2444 (1998) 26. Wang, S.-H.: Doi-Koppinen Hopf bimodules are modules. Comm. Algebra 29, 4671–4682 (2001) Communicated by A. Connes
Commun. Math. Phys. 266, 401–430 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0017-1
Communications in
Mathematical Physics
Stability of Planar Stationary Solutions to the Compressible Navier-Stokes Equation on the Half Space Yoshiyuki Kagei, Shuichi Kawashima Faculty of Mathematics, Kyushu University, Fukuoka 812-8581, Japan. E-mail: [email protected]; [email protected] Received: 1 August 2005 / Accepted: 9 December 2005 Published online: 29 April 2006 – © Springer-Verlag 2006
Abstract: Stability of planar stationary solutions to the compressible Navier-Stokes equation on the half space $\mathbf{R}^n_+$ ($n \ge 2$) under outflow boundary condition is investigated. It is shown that the planar stationary solution is stable with respect to small perturbations in $H^s(\mathbf{R}^n_+)$ with $s \ge [n/2] + 1$ and the perturbations decay in $L^\infty$ norm as $t \to \infty$, provided that the magnitude of the stationary solution is sufficiently small. The stability result is proved by the energy method. In the proof an energy functional based on the total energy of the system plays an important role.

1. Introduction

This paper studies large time behavior of solutions to the compressible Navier-Stokes equation on the half space $\mathbf{R}^n_+$ ($n \ge 2$):
$$\begin{aligned}
&\partial_t \rho + \mathrm{div}(\rho u) = 0, \\
&\partial_t(\rho u) + \mathrm{div}(\rho u \otimes u) + \nabla p(\rho) = \mu \Delta u + (\mu + \mu')\nabla \mathrm{div}\, u, \\
&p(\rho) = K\rho^{\gamma}.
\end{aligned} \tag{1.1}$$
Here $\mathbf{R}^n_+ = \{x = (x_1, x');\ x' = (x_2, \dots, x_n) \in \mathbf{R}^{n-1},\ x_1 > 0\}$; $\rho = \rho(x, t)$ and $u = (u^1(x, t), \dots, u^n(x, t))$ denote the unknown density and velocity, respectively; $\mu$, $\mu'$, $K$ and $\gamma$ are constants satisfying $\mu > 0$, $\frac{2}{n}\mu + \mu' \ge 0$, $K > 0$ and $\gamma > 1$. We consider (1.1) under the initial condition
$$(\rho, u)|_{t=0} = (\rho_0, u_0) \tag{1.2}$$
and the outflow boundary condition on $x_1 = 0$,
$$u|_{x_1=0} = (u_b^1, 0, \dots, 0), \tag{1.3}$$
where $u_b^1$ is a constant satisfying $u_b^1 < 0$, together with the boundary condition at infinity $x_1 = \infty$,
$$\rho \to \rho_+, \qquad u \to (u_+^1, 0, \dots, 0) \qquad (x_1 \to \infty), \tag{1.4}$$
where $\rho_+$ and $u_+^1$ are constants satisfying $\rho_+ > 0$. As is easily imagined, the large time behavior of solutions of (1.1)–(1.4) depends heavily on the values of the boundary data $u_b$, $\rho_+$ and $u_+$. In this paper we are interested in the situation where (1.1)–(1.4) admits a planar stationary solution, i.e., a stationary solution $(\tilde\rho, \tilde u)$ which depends only on $x_1$ and for which $\tilde u$ has the form $\tilde u = (\tilde u^1(x_1), 0, \dots, 0)$. Kawashima, Nishibata and Zhu [5] investigated the conditions for $\rho_+$, $u_+^1$ and $u_b^1$ under which planar stationary motions occur. They proved that there exists a planar stationary solution $(\tilde\rho, \tilde u)$ if and only if $u_+^1 < 0$ and the Mach number at infinity $x_1 = \infty$ is greater than or equal to 1. Furthermore, it was shown in [5] that $(\tilde\rho, \tilde u)$ is asymptotically stable with respect to small one-dimensional perturbations, i.e., perturbations in the form $\rho - \tilde\rho = \rho(x_1, t) - \tilde\rho(x_1)$, $u - \tilde u = (u^1(x_1, t) - \tilde u^1(x_1), 0, \dots, 0)$, provided that $|u_+^1 - u_b^1|$ is sufficiently small. In this paper we show that $(\tilde\rho, \tilde u)$ is stable under multi-dimensional perturbations small in $H^s(\mathbf{R}^n_+)$ and that perturbations decay in $L^\infty$ norm as $t \to \infty$, provided that $|u_+^1 - u_b^1|$ is sufficiently small. Here $s$ is an integer satisfying $s \ge [n/2] + 1$. Our stability theorem is proved by showing the local existence of solutions and deriving a suitable a priori estimate. The local existence is proved by applying the local $H^s$-solvability theorem in [4]. We derive our $H^s$-a priori estimate by the energy method. The point in deriving the a priori estimate is to obtain a suitable $L^2$-energy bound.
In order to do so we will employ the energy functional based on the total energy which is the same as in the one-dimensional case in [5]; and in fact, the energy functional works well also in the multi-dimensional problem exactly in the same way as in [5] to obtain the L 2 -energy bound. This is due to the fact that the stationary solutions do not have any shear components. Once we get the L 2 -energy bound, we proceed to obtain the estimates for derivatives by the energy method as in [3, 7]. This part is entirely different from the computation in the one-dimensional case; we derive the estimates for tangential and normal derivatives respectively, for which certain hyperbolic-parabolic aspects of the system are used and, also, the estimate for the inhomogeneous stationary Stokes problem is applied; and, then, the bootstrap argument yields the desired H s -energy bound. But in contrast to [3, 7] we do not regard the system as a perturbation from the linearization at infinity x1 = ∞. Instead we keep the principal part of the perturbation equation in its own quasilinear form and apply some commutator estimates. This will greatly simplify the argument. In this paper we consider large time behavior of solutions of (1.1)–(1.4) only under the conditions for ρ+ , u 1b and u 1+ where planar stationary solutions exist. If one of such conditions would be disturbed, then complicated phenomena might occur. In fact, Matsumura [6] proposed a classification of all possible time asymptotic states in terms of boundary data for one-dimensional problem. Some parts of this classification were already proved rigorously. See [6] and references therein. This paper is organized as follows. In Sect. 2 we review some properties of planar stationary solutions obtained in [5]. We then state our stability theorem. The proof of the theorem will be given in Sects. 3–5. In Sect. 3 we transform problem (1.1)–(1.4) into the initial boundary value problem for the perturbation. 
We then discuss the local existence and present our H s a priori estimate. Section 4 is devoted to the proof of the
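a priori estimate. We finally prove decay of perturbations in $L^\infty$ norm in Sect. 5 based on the a priori estimate.

2. Stability Result

We first consider the one-dimensional stationary problem whose solutions represent planar stationary motions in $\mathbf{R}^n_+$. We look for a smooth stationary solution $(\tilde\rho, \tilde u)$ of (1.1)–(1.4) of the form $\tilde\rho = \tilde\rho(x_1) > 0$ and $\tilde u = (\tilde u^1(x_1), 0, \dots, 0)$. Then the problem for $(\tilde\rho, \tilde u^1)$ is written as
$$\begin{aligned}
&(\tilde\rho \tilde u^1)_{x_1} = 0 \quad (x_1 > 0), \\
&\bigl(\tilde\rho (\tilde u^1)^2\bigr)_{x_1} + p(\tilde\rho)_{x_1} = (2\mu + \mu')\,\tilde u^1_{x_1 x_1} \quad (x_1 > 0), \\
&\tilde u^1|_{x_1=0} = u_b^1, \\
&\tilde\rho \to \rho_+, \quad \tilde u^1 \to u_+^1 \quad (x_1 \to \infty),
\end{aligned} \tag{2.1}$$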
where subscript $x_1$ stands for differentiation in $x_1$. Kawashima, Nishibata and Zhu [5] investigated problem (2.1) and gave a necessary and sufficient condition for the existence of solutions. Following [5], we introduce the Mach number at infinity defined by
$$M_+ \equiv \frac{|u_+^1|}{\sqrt{p'(\rho_+)}}.$$
We also set
$$\delta \equiv |u_+^1 - u_b^1|,$$
which measures the strength of the stationary solution.

Proposition 2.1 ([5]). Let $u_+^1 < 0$. Then problem (2.1) has a smooth solution $(\tilde\rho, \tilde u^1)$ if and only if $M_+ \ge 1$ and $w_c u_+^1 > u_b^1$, where $w_c$ is a certain positive number. The solution $(\tilde\rho, \tilde u^1)$ is monotonic; in particular, $\tilde u^1(x_1)$ is monotonically increasing when $M_+ = 1$. Furthermore, $(\tilde\rho, \tilde u^1)$ has the following decay properties as $x_1 \to \infty$.
(i) If $M_+ > 1$, then for any nonnegative integer $k$ there exists a constant $C > 0$ such that
$$\bigl|\partial_{x_1}^k (\tilde\rho - \rho_+,\ \tilde u^1 - u_+^1)\bigr| \le C\delta e^{-\sigma x_1}$$
for some positive constant $\sigma$.
(ii) If $M_+ = 1$, then for any nonnegative integer $k$ there exists a constant $C > 0$ such that
$$\bigl|\partial_{x_1}^k (\tilde\rho - \rho_+,\ \tilde u^1 - u_+^1)\bigr| \le C\,\frac{\delta^{k+1}}{(1 + \delta x_1)^{k+1}}.$$
Remark. The constant $w_c$ in Proposition 2.1 is determined in the following way. From (2.1) one can see that $\rho_+/\tilde\rho = \tilde u^1/u_+^1$, and hence, by introducing the new unknown variable $w = \rho_+/\tilde\rho$, problem (2.1) is reduced to
$$(2\mu + \mu')\,|u_+^1|\, w_{x_1} = H(w) \quad (x_1 > 0), \qquad w(x_1) \to 1 \quad (x_1 \to \infty),$$
where
$$H(w) = K\rho_+^{\gamma}\bigl(1 - w^{-\gamma}\bigr) - \rho_+ (u_+^1)^2 (w - 1).$$
Clearly, $H(1) = 0$, and also, one can see that $H(w)$ has another zero, which we denote by $w_c$. Furthermore, it holds that $M_+ > 1$ if and only if $w_c < 1$, and $M_+ = 1$ if and only if $w_c = 1$. See [5] for the details.

Our concern in this paper is to investigate the stability properties of the stationary solution $(\tilde\rho, \tilde u)$ with respect to multi-dimensional perturbations. To state our stability result, we introduce function spaces. We denote by $L^p$ the usual Lebesgue space on $\mathbf{R}^n_+$. The norm of $L^p$ is denoted by $\|\cdot\|_p$ and the inner product of $L^2$ is defined by
$$(f, g) = \int_{\mathbf{R}^n_+} f g \, dx, \qquad f, g \in L^2.$$
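The Remark on $w_c$ above can be checked numerically. The following is a minimal sketch, not taken from the paper: the function names `mach_number`, `H` and `second_zero` are hypothetical, the bisection is a swapped-in elementary root finder, and $H$ is read as $H(w) = K\rho_+^\gamma(1 - w^{-\gamma}) - \rho_+(u_+^1)^2(w-1)$. Since $H$ is concave with $H'(1) = \rho_+\bigl(p'(\rho_+) - (u_+^1)^2\bigr)$, the second zero should lie below 1 exactly when $M_+ > 1$.

```python
import math


def mach_number(u_plus, rho_plus, K, gamma):
    """M_+ = |u_+^1| / sqrt(p'(rho_+)) for the pressure law p(rho) = K * rho**gamma."""
    sound_speed_sq = K * gamma * rho_plus ** (gamma - 1.0)  # p'(rho_+)
    return abs(u_plus) / math.sqrt(sound_speed_sq)


def H(w, u_plus, rho_plus, K, gamma):
    """H(w) = K rho_+^gamma (1 - w^-gamma) - rho_+ (u_+^1)^2 (w - 1) (assumed reading)."""
    return K * rho_plus**gamma * (1.0 - w ** (-gamma)) - rho_plus * u_plus**2 * (w - 1.0)


def second_zero(u_plus, rho_plus, K, gamma, tol=1e-12):
    """Bisection for the zero w_c != 1 of H; assumes M_+ is not extremely close to 1."""
    h = lambda w: H(w, u_plus, rho_plus, K, gamma)
    M = mach_number(u_plus, rho_plus, K, gamma)
    if abs(M - 1.0) < 1e-6:
        return 1.0  # w = 1 is a double zero when M_+ = 1
    eps = 1e-3
    if M > 1.0:
        # H'(1) < 0 and H is concave: H > 0 just below 1, H -> -inf as w -> 0+.
        while h(1.0 - eps) <= 0.0:
            eps /= 2.0
        pos = 1.0 - eps
        neg = pos
        while h(neg) >= 0.0:
            neg /= 2.0
    else:
        # H'(1) > 0 and H is concave: H > 0 just above 1, H -> -inf as w -> inf.
        while h(1.0 + eps) <= 0.0:
            eps /= 2.0
        pos = 1.0 + eps
        neg = pos
        while h(neg) >= 0.0:
            neg *= 2.0
    # Invariant: h(pos) > 0 >= h(neg); halve the bracket until it is shorter than tol.
    while abs(pos - neg) > tol:
        mid = 0.5 * (pos + neg)
        if h(mid) > 0.0:
            pos = mid
        else:
            neg = mid
    return 0.5 * (pos + neg)


if __name__ == "__main__":
    # Illustrative data only: K = 1, gamma = 2, rho_+ = 1, u_+^1 = -2.
    print(mach_number(-2.0, 1.0, 1.0, 2.0), second_zero(-2.0, 1.0, 1.0, 2.0))
```

For $\gamma = 2$, $K = \rho_+ = 1$ the second zero has the closed form $w_c = \bigl(1 + \sqrt{1 + 4(u_+^1)^2}\bigr)/\bigl(2(u_+^1)^2\bigr)$, which gives a convenient cross-check of the bisection.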
For a nonnegative integer $m$ we denote by $H^m$ the usual $m$-th order $L^2$ Sobolev space on $\mathbf{R}^n_+$ with norm $\|\cdot\|_{H^m}$. The symbol $C_0^m$ stands for the set of all $C^m$ functions which have compact support in $\mathbf{R}^n_+$. We denote by $H_0^1$ the completion of $C_0^1$ in $H^1$, and the dual space of $H_0^1$ is denoted by $H^{-1}$. For $0 < T \le \infty$ and a nonnegative integer $\sigma$, we define the Banach space $Z^\sigma(T) = X^\sigma(T) \times Y^\sigma(T)^n$, where
$$X^\sigma(T) = \bigcap_{j=0}^{[\sigma/2]} C^j\bigl([0, T]; H^{\sigma - 2j}\bigr)$$
and
$$Y^\sigma(T) = X^\sigma(T) \cap \bigcap_{j=0}^{[(\sigma+1)/2]} H^j\bigl(0, T; \tilde H^{\sigma + 1 - 2j}\bigr).$$
Here $\tilde H^m = H^m \cap H_0^1$ when $m \ge 1$ and $\tilde H^m = L^2$ when $m = 0$. The norm of $Z^\sigma(T)$ is defined by
$$\|(\phi, \psi)\|_{Z^\sigma(T)} = \|\phi\|_{X^\sigma(T)} + \|\psi\|_{Y^\sigma(T)},$$
where
$$\|\phi\|_{X^\sigma(T)} = \sup_{0 \le t \le T} |[\phi(t)]|_\sigma, \qquad \|\psi\|_{Y^\sigma(T)} = \left( \|\psi\|_{X^\sigma(T)}^2 + \int_0^T |[\psi(t)]|_{\sigma+1}^2 \, dt \right)^{1/2}$$
with
$$|[\phi(t)]|_\sigma = \left( \sum_{j=0}^{[\sigma/2]} \bigl\|\partial_t^j \phi(t)\bigr\|_{H^{\sigma - 2j}}^2 \right)^{1/2}.$$
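As a small worked instance of these definitions, and assuming the bracket norm is read as the sum over $0 \le j \le [\sigma/2]$ of $\|\partial_t^j\phi(t)\|^2_{H^{\sigma-2j}}$, the case $\sigma = 2$ unpacks as follows. One time derivative is traded for two space derivatives, which is the parabolic scaling used throughout.

```latex
% Case \sigma = 2, so [\sigma/2] = 1 and j ranges over {0, 1}:
\[
  X^{2}(T) = C\bigl([0,T]; H^{2}\bigr) \cap C^{1}\bigl([0,T]; L^{2}\bigr),
  \qquad
  |[\phi(t)]|_{2}^{2} = \|\phi(t)\|_{H^{2}}^{2} + \|\partial_t \phi(t)\|_{2}^{2}.
\]
% Case \sigma = 3, again [\sigma/2] = 1:
\[
  |[\phi(t)]|_{3}^{2} = \|\phi(t)\|_{H^{3}}^{2} + \|\partial_t \phi(t)\|_{H^{1}}^{2}.
\]
```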
We simply write $Z^\sigma$, $X^\sigma$ and $Y^\sigma$ when $T = \infty$.

Theorem 2.2. Let $s$ be an integer satisfying $s \ge s_0 \equiv [n/2] + 1$ and let $(\tilde\rho, \tilde u)$ be the solution of (2.1). Then there exists a positive number $\delta_0$ such that if $|u_+^1 - u_b^1| < \delta_0$, then $(\tilde\rho, \tilde u)$ is stable with respect to perturbations small in $H^s(\mathbf{R}^n_+)$ in the following sense: there exist $\varepsilon_0 > 0$ and $C > 0$ such that if the initial perturbation $(\rho(0) - \tilde\rho, u(0) - \tilde u)$ satisfies $\|(\rho(0) - \tilde\rho, u(0) - \tilde u)\|_{H^s} \le \varepsilon_0$ and a suitable compatibility condition, then the perturbation $(\rho(t) - \tilde\rho, u(t) - \tilde u)$ exists in $Z^s$, and it satisfies
$$\|(\rho(t) - \tilde\rho, u(t) - \tilde u)\|_{H^s} \le C \|(\rho(0) - \tilde\rho, u(0) - \tilde u)\|_{H^s}$$
for all $t \ge 0$ and
$$\lim_{t \to \infty} \|\partial_x(\rho(t) - \tilde\rho, u(t) - \tilde u)\|_{H^{s-1}} = 0.$$
In particular,
$$\lim_{t \to \infty} \|(\rho(t) - \tilde\rho, u(t) - \tilde u)\|_{\infty} = 0.$$
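The stability result in Theorem 2.2 can be proved by combining Proposition 3.1 (local existence) and Proposition 3.2 (a priori estimate). The decay property in $L^\infty$ norm will be proved in Sect. 5.

3. Reformulation of the Problem

Let us rewrite the problem into the one for perturbations. We set $(\phi, \psi) = (\rho - \tilde\rho, u - \tilde u)$. Then problem (1.1)–(1.4) is transformed into
$$\begin{aligned}
&\partial_t \phi + u \cdot \nabla \phi + \rho\, \mathrm{div}\, \psi = f, \\
&\rho(\partial_t \psi + u \cdot \nabla \psi) + L\psi + p'(\rho)\nabla \phi = g, \\
&\psi|_{x_1=0} = 0; \quad (\phi, \psi) \to (0, 0) \quad (x_1 \to \infty), \\
&(\phi, \psi)|_{t=0} = (\phi_0, \psi_0),
\end{aligned} \tag{3.1}$$
where
$$\begin{aligned}
L\psi &= -\mu \Delta \psi - (\mu + \mu')\nabla \mathrm{div}\, \psi, \\
f &= f(\phi, \psi) = -\psi \cdot \nabla \tilde\rho - \phi\, \mathrm{div}\, \tilde u, \\
g &= g(\phi, \psi) = -(\rho\psi + \phi \tilde u) \cdot \nabla \tilde u - \bigl(p'(\rho) - p'(\tilde\rho)\bigr)\nabla \tilde\rho.
\end{aligned}$$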
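For orientation, the source terms $f$ and $g$ can be traced back to (1.1) by subtracting the stationary equations. The sketch below is a reconstruction under the assumption that the stationary solution is written $(\tilde\rho, \tilde u)$ and that the momentum equation is first put in non-conservative form using the continuity equation; it is not quoted from the paper.

```latex
% Continuity: subtract \tilde u\cdot\nabla\tilde\rho + \tilde\rho\,\mathrm{div}\,\tilde u = 0
% from \partial_t\rho + u\cdot\nabla\rho + \rho\,\mathrm{div}\,u = 0, with \rho=\tilde\rho+\phi, u=\tilde u+\psi:
\begin{align*}
0 &= \partial_t\phi + u\cdot\nabla\phi + \rho\,\mathrm{div}\,\psi
     + \underbrace{\psi\cdot\nabla\tilde\rho + \phi\,\mathrm{div}\,\tilde u}_{=\,-f}.
\end{align*}
% Momentum: \rho(\partial_t u + u\cdot\nabla u) + Lu + \nabla p(\rho) = 0 and
% \tilde\rho\,\tilde u\cdot\nabla\tilde u + L\tilde u + \nabla p(\tilde\rho) = 0. Using
% \rho u - \tilde\rho\tilde u = \rho\psi + \phi\tilde u and
% \nabla p(\rho) - \nabla p(\tilde\rho) = p'(\rho)\nabla\phi + (p'(\rho)-p'(\tilde\rho))\nabla\tilde\rho:
\begin{align*}
0 &= \rho(\partial_t\psi + u\cdot\nabla\psi) + L\psi + p'(\rho)\nabla\phi
     + \underbrace{(\rho\psi + \phi\tilde u)\cdot\nabla\tilde u
     + \bigl(p'(\rho)-p'(\tilde\rho)\bigr)\nabla\tilde\rho}_{=\,-g}.
\end{align*}
```

Every term of $f$ and $g$ carries a derivative of the stationary solution, which is why both are of size $O(\delta)$ in the later estimates.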
The proof of Theorem 2.2 is thus reduced to showing the global existence of a solution $(\phi, \psi)$ of (3.1) in the class $Z^s$, where $s$ is an integer satisfying $s \ge [n/2] + 1$. In this section we will first show the local existence and then give a suitable a priori estimate.
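3.1. Local existence. Let us first consider the local existence of solutions. The local existence can be proved by applying the result in [4]. In fact, one can easily see that problem (3.1) is a hyperbolic-parabolic system satisfying the assumptions of the local solvability theorem in [4]. To state the local existence of solutions precisely, let us mention the compatibility condition for the initial value $(\phi_0, \psi_0)$. Let $(\phi, \psi)$ be a smooth solution of (3.1). Then $(\partial_t^j \phi, \partial_t^j \psi)$ ($j \ge 1$) is inductively determined by
$$\partial_t^j \phi = -u \cdot \nabla \partial_t^{j-1}\phi - \rho\, \mathrm{div}\, \partial_t^{j-1}\psi - \bigl\{ [\partial_t^{j-1}, u \cdot \nabla]\phi + [\partial_t^{j-1}, \rho\, \mathrm{div}]\psi \bigr\} + \partial_t^{j-1}\bigl(f(\phi, \psi)\bigr)$$
and
$$\begin{aligned}
\partial_t^j \psi = {} & -\rho^{-1}\bigl\{ L\partial_t^{j-1}\psi + p'(\rho)\nabla \partial_t^{j-1}\phi \bigr\} - \rho^{-1}\bigl\{ [\partial_t^{j-1}, \rho]\partial_t \psi + [\partial_t^{j-1}, p'(\rho)\nabla]\phi \bigr\} \\
& - \rho^{-1}\partial_t^{j-1}(\rho u \cdot \nabla \psi) + \rho^{-1}\partial_t^{j-1}\bigl(g(\phi, \psi)\bigr).
\end{aligned}$$
Here $[C, D] = CD - DC$ is the commutator of $C$ and $D$. From these relations we see that $(\partial_t^j \phi, \partial_t^j \psi)|_{t=0}$ is inductively given by $(\phi_0, \psi_0)$ in the following way:
$$(\partial_t^j \phi, \partial_t^j \psi)|_{t=0} = (\phi_j, \psi_j),$$
where
$$\begin{aligned}
\phi_j = {} & -u_0 \cdot \nabla \phi_{j-1} - \rho_0\, \mathrm{div}\, \psi_{j-1} - \sum_{\ell=1}^{j-1} \binom{j-1}{\ell} \bigl\{ \psi_\ell \cdot \nabla \phi_{j-1-\ell} + \phi_\ell\, \mathrm{div}\, \psi_{j-1-\ell} \bigr\} \\
& + F_{j-1}(\phi_0, \psi_0; \phi_1, \dots, \phi_{j-1}, \psi_1, \dots, \psi_{j-1}), \\
\psi_j = {} & -\rho_0^{-1}\bigl\{ L\psi_{j-1} + p'(\rho_0)\nabla \phi_{j-1} \bigr\} - \rho_0^{-1} \sum_{\ell=1}^{j-1} \binom{j-1}{\ell} \bigl\{ \phi_\ell \psi_{j-\ell} + a_\ell(\phi_0; \phi_1, \dots, \phi_\ell)\nabla \phi_{j-1-\ell} \bigr\} \\
& + \rho_0^{-1} G_{j-1}(\phi_0, \psi_0, \partial_x \psi_0; \phi_1, \dots, \phi_{j-1}, \psi_1, \dots, \psi_{j-1}, \partial_x \psi_1, \dots, \partial_x \psi_{j-1}).
\end{aligned}$$
Here $\rho_0 = \tilde\rho + \phi_0$; $u_0 = \tilde u + \psi_0$; $a_\ell(\phi_0; \phi_1, \dots, \phi_\ell)$ is a certain polynomial in $\phi_1, \dots, \phi_\ell$; and so on. The boundary condition $\psi|_{x_1=0} = 0$ in (3.1) implies that $\partial_t^j \psi|_{x_1=0} = 0$, and therefore, we have $\psi_j|_{x_1=0} = 0$. Assume that $(\phi, \psi)$ is a solution of (3.1) in $Z^s(T)$. Then, from the above observation, we need $(\phi_j, \psi_j) \in H^{s-2j}$ for $j = 0, \dots, [s/2]$, which can be verified by Lemmas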
3.3, 3.4 and 3.10 below, provided that $(\phi_0, \psi_0) \in H^s$ with $s \ge s_0$. Furthermore, it is necessary to require that $(\phi_0, \psi_0)$ satisfies the $\hat s$-th order compatibility condition:
$$\psi_j \in H_0^1 \quad \text{for } j = 0, 1, \dots, \hat s = \left[\frac{s-1}{2}\right].$$

Proposition 3.1. Let $s$ be an integer satisfying $s \ge s_0$. Assume that the initial value $(\phi_0, \psi_0)$ satisfies the following conditions:
(a) $(\phi_0, \psi_0) \in H^s$ and $(\phi_0, \psi_0)$ satisfies the $\hat s$-th order compatibility condition, where $\hat s = \left[\frac{s-1}{2}\right]$.
(b) $\inf_x \rho_0(x) \ge -\frac{1}{2}\inf_{x_1} \tilde\rho(x_1)$.
Then there exists a positive number $T_0$ depending on $\|(\phi_0, \psi_0)\|_{H^s}$ and $\inf_{x_1}\tilde\rho(x_1)$ such that problem (3.1) has a unique solution $(\phi, \psi) \in Z^s(T_0)$ satisfying $\phi(x, t) \ge -\frac{3}{4}\inf_{x_1}\tilde\rho(x_1)$ for all $(x, t) \in \mathbf{R}^n_+ \times [0, T_0]$. Furthermore, the inequality
$$\|(\phi, \psi)\|_{Z^s(T_0)}^2 \le C\bigl(1 + \|(\phi_0, \psi_0)\|_{H^s}^2\bigr)^{a}\, \|(\phi_0, \psi_0)\|_{H^s}^2$$
holds for some constants $C > 0$ and $a > 0$ depending only on $s$, $\|(\phi_0, \psi_0)\|_{H^s}$ and $\inf_{x_1}\tilde\rho(x_1)$.

3.2. A priori estimates. The global existence of solutions of (3.1) follows from Proposition 3.1 and the a priori estimate given in Proposition 3.2 below. To state our a priori estimate we introduce some notation. We define $E_\sigma(t)$ and $D_\sigma(t)$ by
$$E_\sigma(t) = \sup_{0 \le \tau \le t} \bigl( |[\phi(\tau)]|_\sigma^2 + |[\psi(\tau)]|_\sigma^2 \bigr)^{1/2}$$
and
$$D_\sigma(t) = \begin{cases}
\left( \displaystyle\int_0^t |||D\psi|||_0^2 + \bigl\|\phi|_{x_1=0}\bigr\|_{L^2(\mathbf{R}^{n-1})}^2 \, d\tau \right)^{1/2} & \text{for } \sigma = 0, \\[2ex]
\left( \displaystyle\int_0^t |||D\phi|||_{\sigma-1}^2 + |||D\psi|||_\sigma^2 + \bigl\|\phi|_{x_1=0}\bigr\|_{L^2(\mathbf{R}^{n-1})}^2 \, d\tau \right)^{1/2} & \text{for } \sigma \ge 1.
\end{cases}$$
Here and in what follows
$$|||Dv(t)|||_\sigma = \begin{cases}
\|\partial_x v(t)\|_2 & \text{for } \sigma = 0, \\
\bigl( |[\partial_x v(t)]|_\sigma^2 + |[\partial_t v(t)]|_{\sigma-1}^2 \bigr)^{1/2} & \text{for } \sigma \ge 1.
\end{cases}$$
From now on we fix a $T > 0$, and $(\phi, \psi)$ will denote the solution of (3.1) belonging to $Z^s(T)$. We also write $(\rho, u) = (\tilde\rho + \phi, \tilde u + \psi)$.

Proposition 3.2. There exist constants $K > 0$, $\varepsilon_0 > 0$ and $C > 0$, which are independent of $T > 0$, such that if $E_s(t) < K$ for all $t \in [0, T]$, then
$$E_s(t)^2 + D_s(t)^2 \le C \|(\phi_0, \psi_0)\|_{H^s}^2$$
and
$$\inf_x \phi(x, t) \ge -\frac{1}{2}\inf_{x_1}\tilde\rho(x_1)$$
for all $t \in [0, T]$, provided that $\|(\phi_0, \psi_0)\|_{H^s} < \varepsilon_0$.

The proof of Proposition 3.2 will be given in the next section.
4. Proof of A Priori Estimate In this section we prove the a priori estimate given in Proposition 3.2. For this purpose we first prepare some auxiliary lemmas.
4.1. Auxiliary lemmas. In the proof of Proposition 3.2 we will frequently use the following lemmas.

Lemma 4.1. Let $2 \le p \le \infty$ and let $j$ and $k$ be integers satisfying
$$0 \le j < k, \qquad k > j + n\left(\frac{1}{2} - \frac{1}{p}\right).$$
Then there exists a constant $C > 0$ such that
$$\|\partial_x^j f\|_p \le C \|f\|_2^{1-a}\, \|\partial_x^k f\|_2^{a},$$
where
$$a = \frac{1}{k}\left( j + \frac{n}{2} - \frac{n}{p} \right).$$
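A worked special case may help to see why the threshold $s \ge [n/2] + 1$ appears throughout. The following is an illustration under the reading of Lemma 4.1 as $\|\partial_x^j f\|_p \le C\|f\|_2^{1-a}\|\partial_x^k f\|_2^a$ with $a = \frac{1}{k}(j + \frac{n}{2} - \frac{n}{p})$; the choice of $n = 3$ is only an example.

```latex
% Take p = \infty and j = 0. The hypothesis k > j + n(1/2 - 1/p) becomes k > n/2,
% which is satisfied by k = s_0 = [n/2] + 1, and the exponent is a = n/(2k):
\[
  \|f\|_{\infty} \le C\, \|f\|_{2}^{\,1 - \frac{n}{2k}}\, \|\partial_x^{k} f\|_{2}^{\,\frac{n}{2k}}
  \qquad \bigl(k > \tfrac{n}{2}\bigr).
\]
% Example: n = 3, k = 2 gives a = 3/4, i.e.
\[
  \|f\|_{\infty} \le C\, \|f\|_{2}^{1/4}\, \|\partial_x^{2} f\|_{2}^{3/4}.
\]
```

In particular, a bound on the $H^{s_0}$ norm controls the $L^\infty$ norm, and decay of $\|\partial_x f\|_{H^{s-1}}$ together with boundedness of $\|f\|_2$ forces decay in $L^\infty$.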
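Lemma 4.1 can be proved by using the Fourier transform and an extension operator in a standard way. We omit the proof (cf. [1]). To control the energy supplied by the stationary flow, we will use the following lemma, which is essentially the same as [5, Lemma 3.3].

Lemma 4.2. Let $\tilde w$ denote $\tilde\rho - \rho_+$ or $\tilde u - u_+$.
(i) If $M_+ \ge 1$, then for any integers $k \ge 1$ and $\ell \ge 0$ there exists a positive constant $C$ such that
$$\bigl\| f\, \partial_{x_1}^k \tilde w \bigr\|_{H^\ell} \le C\delta \Bigl\{ \|\partial_{x_1} f\|_{H^{(\ell-1)_+}} + \bigl\| f|_{x_1=0} \bigr\|_{L^2(\mathbf{R}^{n-1})} \Bigr\} \qquad \bigl(f \in H^1 \cap H^\ell\bigr).$$
(ii) If $M_+ > 1$, then for any integer $k \ge 0$ there exists a positive constant $C$ such that
$$\bigl| \bigl( \partial_{x_1}^k \tilde w\, f, g \bigr) \bigr| \le C\delta \Bigl\{ \|\partial_{x_1} f\|_2 + \bigl\| f|_{x_1=0} \bigr\|_{L^2(\mathbf{R}^{n-1})} \Bigr\} \times \Bigl\{ \|\partial_{x_1} g\|_2 + \bigl\| g|_{x_1=0} \bigr\|_{L^2(\mathbf{R}^{n-1})} \Bigr\} \qquad \bigl(f, g \in H^1\bigr).$$
The same estimate also holds for $k \ge 2$ if $M_+ = 1$.

The proof of Lemma 4.2 is similar to that of [5, Lemma 3.3]. We omit the proof.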
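Lemma 4.3. (i) Let $1 \le \sigma \le s$. Suppose that $F(x, t, y)$ is a smooth function on $\mathbf{R}^n_+ \times [0, T] \times I$, where $I$ is a compact interval in $\mathbf{R}$. Then for $|\alpha| + 2j = \sigma$, there hold
$$\bigl\| [\partial_x^\alpha \partial_t^j, F(x, t, f_1)] f_2 \bigr\|_2 \le \begin{cases}
C_0 |[f_2]|_{\sigma-1} + C_1 \bigl(1 + |||Df_1|||_{s-1}^{|\alpha| + j - 1}\bigr) |||Df_1|||_{s-1}\, |[f_2]|_\sigma, \\[1ex]
C_0 |[f_2]|_{\sigma-1} + C_1 \bigl(1 + |||Df_1|||_{s-1}^{|\alpha| + j - 1}\bigr) |||Df_1|||_{s}\, |[f_2]|_{\sigma-1},
\end{cases}$$
and
$$\bigl\| [\partial_x^\alpha \partial_t^j, F(x, t, f_1)] f_2 \bigr\|_{H^{-1}} \le C_0 |[f_2]|_{\sigma-1} + C_1 \bigl(1 + |||Df_1|||_{s-1}^{|\alpha| + j - 1}\bigr) |||Df_1|||_{s-1}\, |[f_2]|_{\sigma-1}.$$
Here
$$C_0 = \sum_{\substack{(\beta, k) \le (\alpha, j) \\ (\beta, k) \ne (0, 0)}} \sup_{x, t, y} \bigl| \partial_x^\beta \partial_t^k F(x, t, y) \bigr|$$
and
$$C_1 = \sum_{\substack{(\beta, k) \le (\alpha, j) \\ 1 \le \ell \le j + |\alpha|}} \sup_{x, t, y} \bigl| \partial_x^\beta \partial_t^k \partial_y^\ell F(x, t, y) \bigr|.$$
(ii) Let $1 \le \sigma \le s$ and let $\tilde w$ denote $\tilde\rho$ or $\tilde u$. Then for $|\alpha| + 2j = \sigma$ there holds
$$\bigl| \bigl( [\partial_x^\alpha \partial_t^j, \tilde w] f_1, f_2 \bigr) \bigr| \le C\delta\, |[f_1]|_{\sigma-1} \Bigl\{ \|\partial_x f_2\|_2 + \bigl\| f_2|_{x_1=0} \bigr\|_{L^2(\mathbf{R}^{n-1})} \Bigr\}.$$

The proof of Lemma 4.3 will be given in the Appendix. A straightforward application of Lemmas 4.2 and 4.3 yields the following estimates for $f$ and $g$ appearing on the right-hand side of (3.1).

Lemma 4.4. Let $0 \le \sigma \le s$. Suppose that (3.2) is satisfied. Then for $|\alpha| + 2j = \sigma$ there hold
$$\bigl\| \partial_x^\alpha \partial_t^j f \bigr\|_2 \le C\delta \Bigl\{ |||D\psi|||_{(\sigma-1)_+} + |||D\phi|||_{(\sigma-1)_+} + \bigl\| \phi|_{x_1=0} \bigr\|_{L^2(\mathbf{R}^{n-1})} \Bigr\}$$
and
$$\bigl\| \partial_x^\alpha \partial_t^j g \bigr\|_2 \le C\delta \Bigl\{ |||D\psi|||_{(\sigma-1)_+} + |||D\phi|||_{(\sigma-1)_+} + \bigl\| \phi|_{x_1=0} \bigr\|_{L^2(\mathbf{R}^{n-1})} \Bigr\}.$$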
We will also use the following standard interpolation inequality.

Lemma 4.5. Let $k \ge 1$ and let $1 \le |\alpha| \le k$. Then
$$\|\partial_x^\alpha v(t_2)\|_2^2 \le C \left\{ \|\partial_x^\alpha v(t_1)\|_2^2 + \int_{t_1}^{t_2} \|\partial_\tau v\|_{H^{|\alpha|-1}}\, \|\partial_x v\|_{H^{|\alpha|}} \, d\tau \right\}$$
for all $v \in Y^k(T)$ and $0 \le t_1 \le t_2 \le T$.

Proof. The inequality can be easily proved in the case of the whole space. In the case of the half space one can prove it by using the extension operator, based on the inequality in the whole space case. This completes the proof.
4.2. Basic energy estimates. In order to prove the a priori estimate in Proposition 3.2 we will use the following energy estimates for weak solutions of hyperbolic and parabolic equations. ∈ X 0 (T ) satisfies Let f ∈ L 2 0, T ; L 2 . We say that a function φ + u · ∇φ = f, ∂t φ in the weak sense if
T
−
, ∂t ϕ + div (uϕ) dt = φ
0
(4.1)
T
f , ϕ dt
0
holds for all ϕ ∈ × (0, T )). Similarly, for g ∈ L 2 0, T ; H −1 we say that a ∈ Y 0 (T ) satisfies function ψ + u · ∇ψ + Lψ = ρ ∂t ψ g (4.2) C01 (
in the weak sense if
T
− ψ , ρ (∂t ϕ + u · ∇ϕ) dt + 0
T
L
1/2
ψ, L
1/2
T
ϕ dτ =
0
g , ϕ dt
0
holds for all ϕ ∈ C01 ( × (0, T )), where L 1/2 ψ, L 1/2 ϕ = μ(∇ψ, ∇ϕ) + μ + μ (div ψ, div ϕ); and ·, · denotes the duality pairing of H −1 and H01 . We here use the fact that ∂t ρ + div (ρu) = 0. satisfy (4.1) in the weak sense. Then φ Lemma 4.6. (i) Let f ∈ L 2 (0, T ; L 2 ) and let φ satisfies
t2 φ (t1 )2 + (t2 )2 ≤ φ |2 + 2 dτ div u, | φ f , φ 2 2 t1
for all 0 ≤ t1 ≤ t2 ≤ T . satisfy (4.2) in the (ii) Let g = g (1) + ∇ g (2) with g (1) , g (2) ∈ L 2 (0, T ; L 2 ) and let ψ satisfies weak sense. Then ψ
t2
t2 2 2 ' ( 22 dτ = (t2 ) (t1 ) , dτ L 1/2 ψ g, ψ ρ(t2 ) ψ +2 ρ(t1 ) ψ +2 2
t1
2
t1
' ( (1) (2) = − . for all 0 ≤ t1 ≤ t2 ≤ T . Here g, ψ g ,ψ g , div ψ satisfy (4.2) in the weak sense. Assume that ψ (iii) Let g ∈ L 2 0, T ; L 2 and let ψ 1 belongs to Y (T ). Then ψ satisfies
t2
t2 2 √ 1/2 2 1/2 2 g , ∂τ ψ ρ∂τ ψ 2 dτ = L ψ (t1 ) + 2 L ψ (t2 ) + 2 2 2 t1 t 1 dτ , ∂τ ψ − ρu · ∇ ψ for all 0 ≤ t1 ≤ t2 ≤ T . Proof. Formally the inequality in (i) is obtained by taking the L 2 -inner product of (4.1) 1 with φ and integrating by parts since u x =0 = u 1b < 0. Similarly, the identities in (ii) 1 , and ∂t ψ and (iii) are formally obtained by taking the L 2 -inner product of (4.2) with ψ respectively, and integrating by parts. A rigorous proof can be found in [4]. We omit the details.
4.3. Proof of Proposition 3.2. Proposition 3.2 is a consequence of the propositions below. As in the one-dimensional problem studied in [5], the point in the proof of Proposition 3.2 is to derive a suitable $L^2$-energy bound. Due to the fact that the stationary solution has no shear components, one can obtain the $L^2$ bound in the same way as in the one-dimensional case in [5]. We define $M_s(t) \ge 0$ by
$$M_s(t)^2 = (\delta + E_s(t))\bigl(E_s(t)^2 + D_s(t)^2\bigr).$$

Proposition 4.7. There exists a constant $\tilde K > 0$ such that if
$$E_s(t) \le \tilde K \tag{4.3}$$
for all $t \in [0, T]$, then
$$E_0(t)^2 + D_0(t)^2 \le C \bigl\{ \|(\phi_0, \psi_0)\|_2^2 + M_s(t)^2 \bigr\}$$
uniformly in $t \in [0, T]$, where $C > 0$ is independent of $T$.

Proof. As in [5] we introduce an energy functional based on the total energy
$$\rho E = \rho \left( \frac{1}{2}|u|^2 + \Phi(\rho) \right), \qquad \Phi(\rho) = \int^{\rho} \frac{p(\zeta)}{\zeta^2} \, d\zeta.$$
Note that $\Phi(\rho)$ is a strictly convex function of $\frac{1}{\rho}$. We then define
$$\rho \tilde E = \rho \left( \frac{1}{2}|\psi|^2 + \tilde\Phi(\rho, \tilde\rho) \right),$$
where
$$\tilde\Phi(\rho, \tilde\rho) = \Phi(\rho) - \Phi(\tilde\rho) - \partial_{1/\rho}\Phi(\tilde\rho)\left( \frac{1}{\rho} - \frac{1}{\tilde\rho} \right) = \int_{\tilde\rho}^{\rho} \frac{p(\zeta) - p(\tilde\rho)}{\zeta^2} \, d\zeta.$$
As shown in [5], $\rho\tilde\Phi(\rho, \tilde\rho)$ is equivalent to $|\rho - \tilde\rho|^2$ for suitably small $|\rho - \tilde\rho|$, and hence, there are positive constants $c_0$ and $c_1$ such that
$$c_0^{-1} |(\phi, \psi)|^2 \le \rho \tilde E \le c_0 |(\phi, \psi)|^2, \tag{4.4}$$
where $\phi = \rho - \tilde\rho$ with $|\phi| \le c_1$.
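The two expressions for the relative potential, and the quadratic equivalence behind (4.4), can be verified by a short computation. The sketch below writes $\Phi$ for the potential and $\tilde\Phi$ for its relative version; these symbol names are an assumption, since the original typesetting of the symbol is not recoverable here.

```latex
% With v = 1/\rho one has d\Phi/dv = \Phi'(\rho)\, d\rho/dv = (p/\rho^2)(-\rho^2) = -p(\rho), hence
\begin{align*}
\tilde\Phi(\rho,\tilde\rho)
  &= \Phi(\rho) - \Phi(\tilde\rho) + p(\tilde\rho)\Bigl(\frac{1}{\rho} - \frac{1}{\tilde\rho}\Bigr) \\
  &= \int_{\tilde\rho}^{\rho} \frac{p(\zeta)}{\zeta^{2}}\,d\zeta
     - p(\tilde\rho)\int_{\tilde\rho}^{\rho} \frac{d\zeta}{\zeta^{2}}
   = \int_{\tilde\rho}^{\rho} \frac{p(\zeta) - p(\tilde\rho)}{\zeta^{2}}\,d\zeta .
\end{align*}
% Expanding the integrand to first order in \zeta - \tilde\rho gives, for small \phi = \rho - \tilde\rho,
\[
  \rho\,\tilde\Phi(\rho,\tilde\rho) = \frac{p'(\tilde\rho)}{2\tilde\rho}\,\phi^{2} + O(|\phi|^{3}),
\]
% which is positive definite because p'(\tilde\rho) = K\gamma\tilde\rho^{\gamma-1} > 0.
```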
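Since $H^s \hookrightarrow L^\infty$ we can find a number $\tilde K > 0$ such that if $E_s(t) \le \tilde K$, then $\|\phi(t)\|_\infty \le c_1$ and $\inf_x \phi(x, t) \ge -\frac{1}{2}\inf_{x_1}\tilde\rho(x_1)$ for all $t \in [0, T]$. A direct calculation shows
$$\begin{aligned}
\partial_t(\rho \tilde E) + \mathrm{div}\bigl( \rho u \tilde E + (p(\rho) - p(\tilde\rho))\psi \bigr) = {} & \mu\, \mathrm{div}(\nabla \psi \cdot \psi) + (\mu + \mu')\, \mathrm{div}(\psi\, \mathrm{div}\, \psi) \\
& - \mu |\nabla \psi|^2 - (\mu + \mu')(\mathrm{div}\, \psi)^2 + R_0,
\end{aligned} \tag{4.5}$$
where $R_0 = R_0(x, t)$ is the function defined by
$$R_0 = -\rho(\psi \cdot \nabla \tilde u) \cdot \psi - \bigl( p(\rho) - p(\tilde\rho) - p'(\tilde\rho)\phi \bigr)\, \mathrm{div}\, \tilde u - \frac{1}{\tilde\rho}\, \phi\, \psi \cdot L\tilde u.$$
Since $(\phi, \psi) \in Z^s(T)$ and $s \ge s_0$, we deduce from (4.5), after integrating by parts, that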
t t) d x − ρ E(x, ρu 1 E d x dτ n n−1 R+ 0 R x1 =0
t
t 2 1/2 =− R0 (x, τ ) d xdτ, (4.6) L ψ dτ + 2
0
where
0
1/2 2 L ψ = μ ∇ψ22 + μ + μ div ψ22 . 2
By Lemmas 4.1 and 4.2, if M+ > 1, we have
t R0 (x, τ ) d xdτ ≤ C δ D0 (t)2 + E s (t)D0 (t)2 .
(4.7)
0
In case M+ = 1, since ∂x1 u 1 > 0 by Proposition 2.1, we have ρ )φ div u −ρ (ψ · ∇ u ) · ψ − p(ρ) − p( ρ ) − p ( = − ρ|ψ|2 + p(ρ) − p( ρ ) − p ( ρ )φ ∂x1 u1 ≤ 0, and hence, by Lemma 4.2,
t
R0 (x, τ ) d xdτ ≤ Cδ D0 (t)2 .
(4.8)
0
The desired inequality in Proposition 4.7 now follows from (4.4), (4.6)–(4.8). This completes the proof.

The estimates for derivatives are obtained in principle by the energy method as in [3, 7]. In contrast to [3, 7], we do not regard problem (3.1) as a perturbation from the linearized problem at infinity. We estimate derivatives by differentiating the equations in (3.1) and using the commutator estimates given in Lemma 4.3. We begin with the estimates for tangential derivatives of $(\phi, \psi)$. The estimates are based on the fact that (3.1) has the structure of a symmetric hyperbolic-parabolic system. In what follows we will denote the tangential derivative $\partial_t^j \partial_{x'}^\alpha$ by $T_{j,\alpha}$:
$$T_{j,\alpha} v = \partial_t^j \partial_{x'}^\alpha v.$$
Proposition 4.8. Let σ be an integer satisfying 1 ≤ σ ≤ s and let j and α satisfy 2 j + α = σ . Suppose that (4.3) is satisfied. Then
t 1/2 2 2 2 T j,α φ(t)2 + T j,α ψ(t)2 + ψ dτ ≤ C E σ (0) + Ms (t) L . T j,α 2 2 2
0
Proof. Applying T j,α to (3.1) we have ∂t T j,α φ + u · ∇ T j,α φ + ρdiv T j,α ψ = f j,α
(4.9)
and
ρ ∂t T j,α ψ + u · ∇ T j,α ψ + L T j,α ψ + p (ρ)∇ T j,α φ = g j,α ,
where
and
(4.10)
f j,α = T j,α f − T j,α , u · ∇ φ − T j,α , ρ div ψ g j,α = T j,α g − T j,α , ρ ∂t ψ − T j,α , ρu · ∇ ψ − T j,α , p (ρ) ∇φ.
The estimate in Proposition 4.8 is based on the fact that system (4.9)–(4.10) can be put into a symmetric form. So we transform them into the system ∂t a(ρ)T j,α φ + u · ∇ a(ρ)T j,α φ + ρ a(ρ)div T j,α ψ = f j,α , (4.11) g j,α , ρ ∂t T j,α ψ + u · ∇ T j,α ψ + L T j,α ψ + ∇ p (ρ)T j,α φ = where a(ρ) = p (ρ)/ρ,
(4.12)
f j,α = −ρ a (ρ)(div u)T j,α φ + a(ρ) f j,α and g j,α = p (ρ)(∇ρ)T j,α φ + g j,α . We now apply Lemma 4.6 (i) to (4.11) and Lemma 4.6 (ii) to (4.12) respectively, and obtain
t 2 p (ρ)T j,α φ, div T j,α ψ dτ p (ρ)/ρ T j,α φ(t) + 2 2
0
2 ≤ p (ρ0 )/ρ0 ∂xα φ j + R1 (t)
(4.13)
2
and √ ρ T j,α ψ(t)2 + 2
−2
2
t
t 1/2 2 T j,α ψ dτ L 0
2
p (ρ)T j,α φ, div T j,α ψ
dτ
0
√ 2 ≤ ρ0 ∂xα ψ j + R2 (t). 2
(4.14)
Here R1 (t) =
t
2 p (ρ)/ρ div u, T j,α φ dτ + 2
t
0
p (ρ)/ρ f j,α , T j,α φ dτ
0
and
t
R2 (t) = 2
g j,α , T j,α ψ dτ.
0
It then follows from (4.13) and (4.14) that
t 2 √ 2 1/2 2 T j,α ψ dτ L p (ρ)/ρ T j,α φ(t) + ρ T j,α ψ(t)2 + 2 2 2 0 2 √ 2 ≤ p (ρ0 )/ρ0 ∂xα φ j + ρ0 ∂xα ψ j + R1 (t) + R2 (t). 2
2
Applying now Lemmas 4.1–4.4 to $R_1(t)$ and $R_2(t)$ we obtain the desired inequality in Proposition 4.8. This completes the proof.

We next derive the $H^1$-parabolic estimates for $\psi$.

Proposition 4.9. Let $\sigma$ be an integer satisfying $1 \le \sigma \le s$ and let $j$ and $\alpha'$ satisfy $2j + |\alpha'| = \sigma - 1$. Suppose that (4.3) is satisfied. Then
\[
\|L^{1/2}T_{j,\alpha'}\psi(t)\|_2^2 + \int_0^t \|T_{j+1,\alpha'}\psi\|_2^2\,d\tau
\le C\{E_\sigma(0)^2 + \|T_{j,\alpha'}\phi(t)\|_2^2 + M_s(t)^2\}
+ \eta\int_0^t \|\partial_x T_{j,\alpha'}\phi\|_2^2\,d\tau + C_\eta\int_0^t \|L^{1/2}T_{j,\alpha'}\psi\|_2^2\,d\tau
\]
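for any $\eta > 0$ with some constant $C_\eta > 0$.

Proof. Let $2j + |\alpha'| = \sigma - 1$. Then we can apply Lemma 4.6 (iii) to (4.10) and obtain
\[
\|L^{1/2}T_{j,\alpha'}\psi(t)\|_2^2 + 2\int_0^t \|\sqrt{\rho}\,\partial_\tau T_{j,\alpha'}\psi\|_2^2\,d\tau + 2\int_0^t \big(p'(\rho)\nabla T_{j,\alpha'}\phi,\ \partial_\tau T_{j,\alpha'}\psi\big)\,d\tau
= \|L^{1/2}\partial_{x'}^{\alpha'}\psi_j\|_2^2 - 2\int_0^t \big(\rho u\cdot\nabla T_{j,\alpha'}\psi,\ \partial_\tau T_{j,\alpha'}\psi\big)\,d\tau + R_3(t), \tag{4.15}
\]
where
\[
R_3(t) = 2\int_0^t \big(g_{j,\alpha'},\ \partial_\tau T_{j,\alpha'}\psi\big)\,d\tau.
\]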
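The second term on the right-hand side of (4.15) is estimated as
\[
\Big|2\int_0^t \big(\rho u\cdot\nabla T_{j,\alpha'}\psi,\ \partial_\tau T_{j,\alpha'}\psi\big)\,d\tau\Big|
\le \int_0^t \|\sqrt{\rho}\,\partial_\tau T_{j,\alpha'}\psi\|_2^2\,d\tau + C\int_0^t \|L^{1/2}T_{j,\alpha'}\psi\|_2^2\,d\tau,
\]
and, by Lemma 4.4, the third term is estimated as $|R_3(t)| \le C M_s(t)^2$.

Let us next consider the third term on the left-hand side of (4.15). By integration by parts we have
\[
2\int_0^t \big(p'(\rho)\nabla T_{j,\alpha'}\phi,\ \partial_\tau T_{j,\alpha'}\psi\big)\,d\tau
= 2\Big[\big(p'(\rho)\nabla T_{j,\alpha'}\phi,\ T_{j,\alpha'}\psi\big)\Big]_{\tau=0}^{\tau=t}
- 2\int_0^t \big(\partial_\tau(p'(\rho)\nabla T_{j,\alpha'}\phi),\ T_{j,\alpha'}\psi\big)\,d\tau. \tag{4.16}
\]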
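The first term on the right-hand side of (4.16) is equal to
\[
-2\Big[\big(T_{j,\alpha'}\phi,\ \mathrm{div}(p'(\rho)T_{j,\alpha'}\psi)\big)\Big]_{\tau=0}^{\tau=t}.
\]
To estimate the second term on the right-hand side of (4.16), we first observe that $p'(\rho)\nabla T_{j,\alpha'}\phi$ satisfies
\[
\partial_t\big(p'(\rho)\nabla T_{j,\alpha'}\phi\big) + u\cdot\nabla\big(p'(\rho)\nabla T_{j,\alpha'}\phi\big) + \rho\,p'(\rho)\nabla\mathrm{div}\,T_{j,\alpha'}\psi = h_{j,\alpha'}
\]
in the weak sense, where
\[
h_{j,\alpha'} = -\rho\,p''(\rho)(\mathrm{div}\,u)\nabla T_{j,\alpha'}\phi + p'(\rho)\big\{\nabla T_{j,\alpha'}f - [\nabla T_{j,\alpha'}, u\cdot\nabla]\phi - [\nabla T_{j,\alpha'}, \rho\,\mathrm{div}]\psi\big\}.
\]
It then follows that the second term on the right-hand side of (4.16) is written as
\begin{align*}
-2\int_0^t \big(\partial_\tau(p'(\rho)\nabla T_{j,\alpha'}\phi),\ T_{j,\alpha'}\psi\big)\,d\tau
&= 2\int_0^t \big(u\cdot\nabla(p'(\rho)\nabla T_{j,\alpha'}\phi) + \rho\,p'(\rho)\nabla\mathrm{div}\,T_{j,\alpha'}\psi - h_{j,\alpha'},\ T_{j,\alpha'}\psi\big)\,d\tau \\
&= -2\int_0^t \big(p'(\rho)\nabla T_{j,\alpha'}\phi,\ \mathrm{div}(u\otimes T_{j,\alpha'}\psi)\big)\,d\tau
 - 2\int_0^t \big(\mathrm{div}\,T_{j,\alpha'}\psi,\ \mathrm{div}(\rho\,p'(\rho)T_{j,\alpha'}\psi)\big)\,d\tau \\
&\quad - 2\int_0^t \big(h_{j,\alpha'},\ T_{j,\alpha'}\psi\big)\,d\tau.
\end{align*}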
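We thus obtain
\begin{align*}
2\int_0^t \big(p'(\rho)\nabla T_{j,\alpha'}\phi,\ \partial_\tau T_{j,\alpha'}\psi\big)\,d\tau
&= -2\Big[\big(T_{j,\alpha'}\phi,\ p'(\rho)\,\mathrm{div}\,T_{j,\alpha'}\psi\big)\Big]_{\tau=0}^{\tau=t}
 - 2\int_0^t \big(p'(\rho)\nabla T_{j,\alpha'}\phi,\ u\,\mathrm{div}\,T_{j,\alpha'}\psi\big)\,d\tau \\
&\quad - 2\int_0^t \big(\mathrm{div}\,T_{j,\alpha'}\psi,\ \rho\,p'(\rho)\,\mathrm{div}\,T_{j,\alpha'}\psi\big)\,d\tau + R_4(t) \\
&\equiv I_1(t) + I_2(t) + I_3(t) + R_4(t), \tag{4.17}
\end{align*}
where
\begin{align*}
R_4(t) &= -2\Big[\big(T_{j,\alpha'}\phi,\ \nabla(p'(\rho))\cdot T_{j,\alpha'}\psi\big)\Big]_{\tau=0}^{\tau=t}
 - 2\int_0^t \big(p'(\rho)\nabla T_{j,\alpha'}\phi,\ (\nabla u)T_{j,\alpha'}\psi\big)\,d\tau \\
&\quad - 2\int_0^t \big(\mathrm{div}\,T_{j,\alpha'}\psi,\ \nabla(\rho\,p'(\rho))\cdot T_{j,\alpha'}\psi\big)\,d\tau
 - 2\int_0^t \big(h_{j,\alpha'},\ T_{j,\alpha'}\psi\big)\,d\tau.
\end{align*}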
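The right-hand side of (4.17) is estimated as
\[
|I_1(t)| \le \tfrac12\|L^{1/2}T_{j,\alpha'}\psi(t)\|_2^2 + C\|T_{j,\alpha'}\phi(t)\|_2^2 + C E_\sigma(0)^2,
\]
\[
|I_2(t)| \le C\int_0^t \|\nabla T_{j,\alpha'}\phi\|_2\,\|\mathrm{div}\,T_{j,\alpha'}\psi\|_2\,d\tau
\le \eta\int_0^t \|\partial_x T_{j,\alpha'}\phi\|_2^2\,d\tau + C_\eta\int_0^t \|L^{1/2}T_{j,\alpha'}\psi\|_2^2\,d\tau
\]
for any $\eta > 0$ with some constant $C_\eta > 0$,
\[
|I_3(t)| \le C\int_0^t \|L^{1/2}T_{j,\alpha'}\psi\|_2^2\,d\tau,
\]
and, by Lemmas 4.1–4.4, $|R_4(t)| \le C M_s(t)^2$. We thus obtain the desired inequality in Proposition 4.9. This completes the proof.

We next derive the dissipative estimates for $x_1$-derivatives of $\phi$, which follow from a hyperbolic-parabolic aspect of (3.1).

Proposition 4.10. Let $\sigma$ be an integer satisfying $1 \le \sigma \le s$ and let $j$, $\alpha'$ and $\ell$ satisfy $2j + |\alpha'| + \ell = \sigma - 1$. Suppose that (4.3) is satisfied. Then
\begin{align*}
&\|T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi(t)\|_2^2 + \int_0^t \|T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\|_2^2\,d\tau \\
&\quad\le C\Big\{E_\sigma(0)^2 + M_s(t)^2 + \int_0^t \big(\|T_{j+1,\alpha'}\partial_{x_1}^{\ell}\psi\|_2^2 + \|\partial_x T_{j,\alpha'}\partial_{x_1}^{\ell}\psi\|_2^2 + \|\partial_{x'}\partial_x T_{j,\alpha'}\partial_{x_1}^{\ell}\psi\|_2^2\big)\,d\tau\Big\}.
\end{align*}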
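Proof. The first equation of (3.1) is written as
\[
\partial_t\phi + u\cdot\nabla\phi + \rho\big(\partial_{x_1}\psi^1 + \nabla'\cdot\psi'\big) = f.
\]
Here and in what follows we use the notation $\psi' = (\psi^2,\dots,\psi^n)$ and $\nabla' = (\partial_{x_2},\dots,\partial_{x_n})$. It then follows that
\[
\partial_t T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi + u\cdot\nabla T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi + \rho\big(T_{j,\alpha'}\partial_{x_1}^{\ell+2}\psi^1 + \partial_{x_1}\nabla'\cdot T_{j,\alpha'}\partial_{x_1}^{\ell}\psi'\big) = f_{j,\alpha',\ell+1}, \tag{4.18}
\]
where
\[
f_{j,\alpha',\ell+1} = T_{j,\alpha'}\partial_{x_1}^{\ell+1}f - [T_{j,\alpha'}\partial_{x_1}^{\ell+1}, u\cdot\nabla]\phi - [T_{j,\alpha'}\partial_{x_1}^{\ell+1}, \rho\,\mathrm{div}]\psi.
\]
Furthermore, multiplying (4.18) by $1/\rho$ we obtain
\[
\partial_t\Big(\frac1\rho T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\Big) + u\cdot\nabla\Big(\frac1\rho T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\Big) + T_{j,\alpha'}\partial_{x_1}^{\ell+2}\psi^1 + \partial_{x_1}\nabla'\cdot T_{j,\alpha'}\partial_{x_1}^{\ell}\psi' = \tilde f_{j,\alpha',\ell+1}, \tag{4.19}
\]
where
\[
\tilde f_{j,\alpha',\ell+1} = \frac1\rho(\mathrm{div}\,u)\,T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi + \frac1\rho f_{j,\alpha',\ell+1}.
\]
We next observe the first component of the equation for $\psi$ in (3.1):
\[
\rho\big(\partial_t\psi^1 + u\cdot\nabla\psi^1\big) + p'(\rho)\partial_{x_1}\phi - (2\mu+\mu')\partial_{x_1}^2\psi^1 - \mu\Delta'\psi^1 - (\mu+\mu')\partial_{x_1}\nabla'\cdot\psi' = g^1.
\]
Hereafter we denote $\Delta' = \partial_{x_2}^2 + \cdots + \partial_{x_n}^2$. Applying $T_{j,\alpha'}\partial_{x_1}^{\ell}$ to this equation we have
\begin{align*}
&\rho\big(\partial_t T_{j,\alpha'}\partial_{x_1}^{\ell}\psi^1 + u\cdot\nabla T_{j,\alpha'}\partial_{x_1}^{\ell}\psi^1\big) + p'(\rho)T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi - (2\mu+\mu')T_{j,\alpha'}\partial_{x_1}^{\ell+2}\psi^1 \\
&\quad - \mu\Delta' T_{j,\alpha'}\partial_{x_1}^{\ell}\psi^1 - (\mu+\mu')\partial_{x_1}\nabla'\cdot T_{j,\alpha'}\partial_{x_1}^{\ell}\psi' = g^1_{j,\alpha',\ell}, \tag{4.20}
\end{align*}
where
\[
g^1_{j,\alpha',\ell} = T_{j,\alpha'}\partial_{x_1}^{\ell}g^1 - [T_{j,\alpha'}\partial_{x_1}^{\ell}, \rho\,\partial_t]\psi^1 - [T_{j,\alpha'}\partial_{x_1}^{\ell}, \rho u\cdot\nabla]\psi^1 - [T_{j,\alpha'}\partial_{x_1}^{\ell}, p'(\rho)]\partial_{x_1}\phi.
\]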
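By adding (4.19) and (4.20) $\times\ \frac{1}{2\mu+\mu'}$ we arrive at
\begin{align*}
&\partial_t\Big(\frac1\rho T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\Big) + u\cdot\nabla\Big(\frac1\rho T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\Big) + \frac{\rho\,p'(\rho)}{2\mu+\mu'}\,\frac1\rho T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi \\
&\quad = -\frac{1}{2\mu+\mu'}M\big(T_{j,\alpha'}\partial_{x_1}^{\ell}\psi\big) + \tilde f_{j,\alpha',\ell+1} + \frac{1}{2\mu+\mu'}g^1_{j,\alpha',\ell},
\end{align*}
where
\[
M(\psi) = \rho\big(\partial_t\psi^1 + u\cdot\nabla\psi^1\big) - \mu\Delta'\psi^1 + \mu\,\partial_{x_1}\nabla'\cdot\psi'.
\]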
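The point of this combination is that the highest normal derivative of $\psi^1$ drops out and a damping term for $\partial_{x_1}^{\ell+1}\phi$ appears. A sketch of the bookkeeping, assuming $2\mu+\mu' > 0$ so that the division is legitimate (this condition is not restated in this section), collects the coefficients term by term:

```latex
% Sketch: coefficients in (4.19) + (2\mu+\mu')^{-1} (4.20), assuming 2\mu+\mu' > 0.
\begin{align*}
T_{j,\alpha'}\partial_{x_1}^{\ell+2}\psi^1 :&\quad
   1 - \frac{2\mu+\mu'}{2\mu+\mu'} = 0, \\
\partial_{x_1}\nabla'\cdot T_{j,\alpha'}\partial_{x_1}^{\ell}\psi' :&\quad
   1 - \frac{\mu+\mu'}{2\mu+\mu'} = \frac{\mu}{2\mu+\mu'}, \\
\Delta' T_{j,\alpha'}\partial_{x_1}^{\ell}\psi^1 :&\quad
   -\frac{\mu}{2\mu+\mu'}, \\
T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi :&\quad
   \frac{p'(\rho)}{2\mu+\mu'}
   = \frac{\rho\,p'(\rho)}{2\mu+\mu'}\cdot\frac{1}{\rho}.
\end{align*}
% The surviving \psi-terms, together with
% (2\mu+\mu')^{-1}\rho(\partial_t + u\cdot\nabla)T_{j,\alpha'}\partial_{x_1}^{\ell}\psi^1,
% are exactly (2\mu+\mu')^{-1} M(T_{j,\alpha'}\partial_{x_1}^{\ell}\psi).
```

Only first-order-in-$x_1$ and tangential derivatives of $\psi$ remain in $M$, which is why the right-hand side of Proposition 4.10 involves $T_{j+1,\alpha'}\partial_{x_1}^{\ell}\psi$ and tangential derivatives only.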
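Lemma 4.6 (i) then yields
\begin{align*}
&\Big\|\frac1\rho T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi(t)\Big\|_2^2 + 2\int_0^t \Big(\frac{\rho\,p'(\rho)}{2\mu+\mu'}\,\frac1\rho T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi,\ \frac1\rho T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\Big)\,d\tau \\
&\quad\le \Big\|\frac{1}{\rho_0}\partial_{x'}^{\alpha'}\partial_{x_1}^{\ell+1}\phi_j\Big\|_2^2 - \frac{2}{2\mu+\mu'}\int_0^t \Big(M\big(T_{j,\alpha'}\partial_{x_1}^{\ell}\psi\big),\ \frac1\rho T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\Big)\,d\tau + R_5(t), \tag{4.21}
\end{align*}
where
\[
R_5(t) = \int_0^t \Big(\mathrm{div}\,u,\ \Big|\frac1\rho T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\Big|^2\Big)\,d\tau
+ 2\int_0^t \Big(\tilde f_{j,\alpha',\ell+1} + \frac{1}{2\mu+\mu'}g^1_{j,\alpha',\ell},\ \frac1\rho T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\Big)\,d\tau.
\]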
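The second term on the left-hand side of (4.21) is estimated as
\[
2\int_0^t \Big(\frac{\rho\,p'(\rho)}{2\mu+\mu'}\,\frac1\rho T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi,\ \frac1\rho T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\Big)\,d\tau \ge c\int_0^t \|T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\|_2^2\,d\tau,
\]
and the second term on the right-hand side of (4.21) is estimated as
\begin{align*}
&\Big|\frac{2}{2\mu+\mu'}\int_0^t \Big(M\big(T_{j,\alpha'}\partial_{x_1}^{\ell}\psi\big),\ \frac1\rho T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\Big)\,d\tau\Big| \\
&\quad\le C\int_0^t \big\{\|\partial_\tau T_{j,\alpha'}\partial_{x_1}^{\ell}\psi\|_2 + \|\partial_x T_{j,\alpha'}\partial_{x_1}^{\ell}\psi\|_2 + \|\partial_{x'}\partial_x T_{j,\alpha'}\partial_{x_1}^{\ell}\psi\|_2\big\}\,\|T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\|_2\,d\tau \\
&\quad\le \frac c2\int_0^t \|T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\|_2^2\,d\tau + C\int_0^t \big\{\|\partial_\tau T_{j,\alpha'}\partial_{x_1}^{\ell}\psi\|_2^2 + \|\partial_x T_{j,\alpha'}\partial_{x_1}^{\ell}\psi\|_2^2 + \|\partial_{x'}\partial_x T_{j,\alpha'}\partial_{x_1}^{\ell}\psi\|_2^2\big\}\,d\tau,
\end{align*}
where $c$ and $C$ are some positive constants. Moreover, using Lemmas 4.2–4.4 we obtain $|R_5(t)| \le C M_s(t)^2$, and hence Proposition 4.10 is proved.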
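In order to obtain the dissipative estimates for higher order $x_1$-derivatives of $\psi$ and tangential derivatives of $\phi$ we prepare estimates for the material derivative of $\phi$. In the following we denote the material derivative of $\phi$ by $\dot\phi$:
\[
\dot\phi \equiv \partial_t\phi + u\cdot\nabla\phi.
\]
We also introduce the semi-norm $|\cdot|_k$ defined by
\[
|v|_k = \Big(\sum_{|\alpha|=k}\|\partial_x^{\alpha}v\|_2^2\Big)^{1/2}.
\]

Proposition 4.11. Let $\sigma$ be an integer satisfying $1 \le \sigma \le s$ and let $j$, $\alpha'$ and $\ell$ satisfy $2j + |\alpha'| + \ell = \sigma - 1$. Suppose that (4.3) is satisfied. Then
\[
\int_0^t |T_{j,\alpha'}\dot\phi|_{\ell+1}^2\,d\tau
\le C\Big\{M_s(t)^2 + \int_0^t \big(|T_{j,\alpha'}\partial_{x_1}\phi|_{\ell}^2 + |T_{j+1,\alpha'}\psi|_{\ell}^2 + |\partial_x T_{j,\alpha'}\psi|_{\ell}^2 + |\partial_{x'}\partial_x T_{j,\alpha'}\psi|_{\ell}^2\big)\,d\tau\Big\}.
\]

Proof. We see from the first equation of (3.1) that $\dot\phi = f - \rho\,\mathrm{div}\,\psi$, and hence,
\[
T_{j,\alpha'+\beta'}\dot\phi = -\rho\,\mathrm{div}\,T_{j,\alpha'+\beta'}\psi + f_{j,\alpha'+\beta'}, \tag{4.22}
\]
where $|\beta'| = \ell+1$ and
\[
f_{j,\alpha'+\beta'} = T_{j,\alpha'+\beta'}f - [T_{j,\alpha'+\beta'}, \rho\,\mathrm{div}]\psi.
\]
We also see from the first equation of (3.1) that $\dot\phi + \rho\big(\partial_{x_1}\psi^1 + \nabla'\cdot\psi'\big) = f$, and hence,
\[
T_{j,\alpha'+\beta'}\partial_{x_1}^{k+1}\dot\phi + \rho\big(T_{j,\alpha'+\beta'}\partial_{x_1}^{k+2}\psi^1 + \partial_{x_1}\nabla'\cdot T_{j,\alpha'+\beta'}\partial_{x_1}^{k}\psi'\big) = f_{j,\alpha'+\beta',k+1}, \tag{4.23}
\]
where
\[
f_{j,\alpha'+\beta',k+1} = T_{j,\alpha'+\beta'}\partial_{x_1}^{k+1}f - [T_{j,\alpha'+\beta'}\partial_{x_1}^{k+1}, \rho\,\mathrm{div}]\psi.
\]
Furthermore, Eq. (4.20) with $T_{j,\alpha'}\partial_{x_1}^{\ell}$ replaced by $T_{j,\alpha'+\beta'}\partial_{x_1}^{k}$ gives
\begin{align*}
&\rho\big(\partial_t T_{j,\alpha'+\beta'}\partial_{x_1}^{k}\psi^1 + u\cdot\nabla T_{j,\alpha'+\beta'}\partial_{x_1}^{k}\psi^1\big) + p'(\rho)T_{j,\alpha'+\beta'}\partial_{x_1}^{k+1}\phi - (2\mu+\mu')T_{j,\alpha'+\beta'}\partial_{x_1}^{k+2}\psi^1 \\
&\quad - \mu\Delta' T_{j,\alpha'+\beta'}\partial_{x_1}^{k}\psi^1 - (\mu+\mu')\partial_{x_1}\nabla'\cdot T_{j,\alpha'+\beta'}\partial_{x_1}^{k}\psi' = g^1_{j,\alpha'+\beta',k}. \tag{4.24}
\end{align*}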
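By adding (4.23) $\times\ \frac1\rho$ and (4.24) $\times\ \frac{1}{2\mu+\mu'}$ we arrive at
\[
\frac1\rho T_{j,\alpha'+\beta'}\partial_{x_1}^{k+1}\dot\phi
= -\frac{p'(\rho)}{2\mu+\mu'}T_{j,\alpha'+\beta'}\partial_{x_1}^{k+1}\phi - \frac{1}{2\mu+\mu'}M\big(T_{j,\alpha'+\beta'}\partial_{x_1}^{k}\psi\big) + \frac1\rho f_{j,\alpha'+\beta',k+1} + \frac{1}{2\mu+\mu'}g^1_{j,\alpha'+\beta',k} \tag{4.25}
\]
with $|\beta'| + k = \ell$. The first term on the right-hand side of (4.22) is estimated as
\[
\|\rho\,\mathrm{div}\,T_{j,\alpha'+\beta'}\psi\|_2 \le C\,|\partial_{x'}\partial_x T_{j,\alpha'}\psi|_{\ell},
\]
while the first two terms on the right-hand side of (4.25) are estimated as
\[
\Big\|\frac{p'(\rho)}{2\mu+\mu'}T_{j,\alpha'+\beta'}\partial_{x_1}^{k+1}\phi\Big\|_2 \le C\,|\partial_{x_1}T_{j,\alpha'}\phi|_{\ell}
\]
and
\[
\Big\|\frac{1}{2\mu+\mu'}M\big(T_{j,\alpha'+\beta'}\partial_{x_1}^{k}\psi\big)\Big\|_2 \le C\big\{|\partial_t T_{j,\alpha'}\psi|_{\ell} + |\partial_x T_{j,\alpha'}\psi|_{\ell} + |\partial_{x'}\partial_x T_{j,\alpha'}\psi|_{\ell}\big\}.
\]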
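The remaining terms in (4.22) and (4.25) are bounded by $C M_s(t)$ by Lemmas 4.2–4.4, and the desired inequality in Proposition 4.11 is obtained. This completes the proof.

We next apply estimates for the Stokes system to obtain the estimates for higher order derivatives.

Proposition 4.12. Let $\sigma$ be an integer satisfying $1 \le \sigma \le s$ and let $j$, $\alpha'$ and $\ell$ satisfy $2j + |\alpha'| + \ell = \sigma - 1$. Suppose that (4.3) is satisfied. Then
\[
\int_0^t \big(|T_{j,\alpha'}\psi|_{\ell+2}^2 + |T_{j,\alpha'}\phi|_{\ell+1}^2\big)\,d\tau
\le C\Big\{M_s(t)^2 + \int_0^t \big(|T_{j,\alpha'}\dot\phi|_{\ell+1}^2 + |T_{j+1,\alpha'}\psi|_{\ell}^2 + |T_{j,\alpha'}\psi|_{\ell+1}^2\big)\,d\tau\Big\}.
\]

Proof. We apply the following estimate for the Stokes system. Let $(\tilde\phi, \tilde\psi)$ be the solution of the Stokes system
\[
\mathrm{div}\,\tilde\psi = \tilde f \ \text{ in } \mathbf{R}^n_+, \qquad
-\mu\Delta\tilde\psi + p'(\rho_+)\nabla\tilde\phi = \tilde g \ \text{ in } \mathbf{R}^n_+, \qquad
\tilde\psi\big|_{x_1=0} = 0.
\]
Then for any $k \in \mathbf{Z}$, $k \ge 0$, there exists a constant $C > 0$ such that
\[
|\tilde\psi|_{k+2} + |\tilde\phi|_{k+1} \le C\big\{|\tilde f|_{k+1} + |\tilde g|_{k}\big\}. \tag{4.26}
\]
See, e.g., [2].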
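By (4.22) we have
\[
\mathrm{div}\,T_{j,\alpha'}\psi = -\frac1\rho T_{j,\alpha'}\dot\phi + \frac1\rho f_{j,\alpha'} \equiv \tilde F_{j,\alpha'}. \tag{4.27}
\]
Moreover, we see from (4.10)
\[
\rho\big(\partial_t T_{j,\alpha'}\psi + u\cdot\nabla T_{j,\alpha'}\psi\big) - \mu\Delta T_{j,\alpha'}\psi - (\mu+\mu')\nabla\mathrm{div}\,T_{j,\alpha'}\psi + p'(\rho)\nabla T_{j,\alpha'}\phi = g_{j,\alpha'}.
\]
By (4.27) we have
\[
\nabla\mathrm{div}\,T_{j,\alpha'}\psi = -\nabla\Big(\frac1\rho T_{j,\alpha'}\dot\phi\Big) + \nabla\Big(\frac1\rho f_{j,\alpha'}\Big),
\]
and hence,
\begin{align*}
-\mu\Delta T_{j,\alpha'}\psi + p'(\rho_+)\nabla T_{j,\alpha'}\phi
&= -\rho\big(\partial_t T_{j,\alpha'}\psi + u\cdot\nabla T_{j,\alpha'}\psi\big)
 + (\mu+\mu')\Big\{-\nabla\Big(\frac1\rho T_{j,\alpha'}\dot\phi\Big) + \nabla\Big(\frac1\rho f_{j,\alpha'}\Big)\Big\} \\
&\quad - \big(p'(\rho) - p'(\rho_+)\big)\nabla T_{j,\alpha'}\phi + g_{j,\alpha'} \equiv \tilde G_{j,\alpha'}.
\end{align*}
We thus conclude that $(T_{j,\alpha'}\phi, T_{j,\alpha'}\psi)$ satisfies the Stokes system with $\tilde f = \tilde F_{j,\alpha'}$ and $\tilde g = \tilde G_{j,\alpha'}$. Since
\[
\partial_x^{\ell+1}\tilde F_{j,\alpha'} = -\frac1\rho\,\partial_x^{\ell+1}T_{j,\alpha'}\dot\phi - \Big[\partial_x^{\ell+1}, \frac1\rho\Big]T_{j,\alpha'}\dot\phi + \partial_x^{\ell+1}\Big(\frac1\rho f_{j,\alpha'}\Big) \tag{4.28}
\]
and
\begin{align*}
\partial_x^{\ell}\tilde G_{j,\alpha'}
&= -\rho\big(\partial_x^{\ell}\partial_t T_{j,\alpha'}\psi + u\cdot\partial_x^{\ell}\nabla T_{j,\alpha'}\psi\big) - \frac{\mu+\mu'}{\rho}\,\partial_x^{\ell}\nabla T_{j,\alpha'}\dot\phi \\
&\quad - \big\{[\partial_x^{\ell}, \rho]\,\partial_t T_{j,\alpha'}\psi + [\partial_x^{\ell}, \rho u]\cdot\nabla T_{j,\alpha'}\psi\big\} - (\mu+\mu')\Big[\partial_x^{\ell}\nabla, \frac1\rho\Big]T_{j,\alpha'}\dot\phi \\
&\quad + \partial_x^{\ell}\Big\{(\mu+\mu')\nabla\Big(\frac1\rho f_{j,\alpha'}\Big) - \big(p'(\rho) - p'(\rho_+)\big)\nabla T_{j,\alpha'}\phi + g_{j,\alpha'}\Big\},
\end{align*}
the desired inequality follows from (4.26) and Lemmas 4.2–4.4. This completes the proof.

The following estimates immediately follow from the first equation of (3.1) and Lemmas 4.2–4.4.

Proposition 4.13. Let $2 \le \sigma \le s$ and let $j$ satisfy $2j \le \sigma - 2$. Suppose that (4.3) is satisfied. Then
\[
\int_0^t \|\partial_\tau^{j+1}\phi\|_{H^{\sigma-2-2j}}^2\,d\tau \le C\Big\{M_s(t)^2 + \int_0^t \big(|[\partial_x\phi]|_{\sigma-2}^2 + |[\partial_x\psi]|_{\sigma-1}^2\big)\,d\tau\Big\}.
\]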
We are now in a position to prove Proposition 3.2.

Proof of Proposition 3.2. We assume that (4.3) holds. We will show
\[
E_\sigma(t)^2 + D_\sigma(t)^2 \le C\{E_\sigma(0)^2 + M_s(t)^2\} \tag{4.29}
\]
for all $0 \le \sigma \le s$, which leads to the conclusion of Proposition 3.2. In fact, from (4.29) with $\sigma = s$ one can easily see that $E_s(t)^2 + D_s(t)^2 \le C\|(\phi_0,\psi_0)\|_{H^s}^2$, provided that $\|(\phi_0,\psi_0)\|_{H^s}^2$ and $\delta > 0$ are sufficiently small, and thus Proposition 3.2 is proved.

Let us prove (4.29) by induction on $\sigma$. By Proposition 4.7, (4.29) clearly holds for $\sigma = 0$. Let $1 \le r \le s$ and suppose that (4.29) holds for all $\sigma \le r-1$. We shall prove (4.29) for $\sigma = r$. By (4.29) with $\sigma = 0$ and Proposition 4.8 we have
\[
\sum_{2j+|\alpha'|\le r}\Big\{\|T_{j,\alpha'}\phi(t)\|_2^2 + \|T_{j,\alpha'}\psi(t)\|_2^2 + \int_0^t \|L^{1/2}T_{j,\alpha'}\psi\|_2^2\,d\tau\Big\} \le C\{E_r(0)^2 + M_s(t)^2\}. \tag{4.30}
\]
This, together with Proposition 4.9, implies that
\[
\sum_{2j+|\alpha'|=r-1}\Big\{\|L^{1/2}T_{j,\alpha'}\psi(t)\|_2^2 + \int_0^t \|T_{j+1,\alpha'}\psi\|_2^2\,d\tau\Big\} \le \eta D_r(t)^2 + C_\eta\{E_r(0)^2 + M_s(t)^2\} \tag{4.31}
\]
for any $\eta > 0$ with some constant $C_\eta > 0$. By (4.30) and (4.31) we have
\[
\sum_{2j+|\alpha'|=r-1}\int_0^t \big\{\|T_{j+1,\alpha'}\psi\|_2^2 + \|\partial_x T_{j,\alpha'}\psi\|_2^2 + \|\partial_{x'}\partial_x T_{j,\alpha'}\psi\|_2^2\big\}\,d\tau \le \eta D_r(t)^2 + C_\eta\{E_r(0)^2 + M_s(t)^2\}. \tag{4.32}
\]
Using Proposition 4.10 with $\sigma = r$ and $\ell = 0$ we see from (4.32) that
\[
\sum_{2j+|\alpha'|=r-1}\Big\{\|T_{j,\alpha'}\partial_{x_1}\phi(t)\|_2^2 + \int_0^t \|T_{j,\alpha'}\partial_{x_1}\phi\|_2^2\,d\tau\Big\} \le \eta D_r(t)^2 + C_\eta\{E_r(0)^2 + M_s(t)^2\},
\]
and hence, by Proposition 4.11 with $\sigma = r$ and $\ell = 0$,
\[
\sum_{2j+|\alpha'|=r-1}\int_0^t |T_{j,\alpha'}\dot\phi|_1^2\,d\tau \le \eta D_r(t)^2 + C_\eta\{E_r(0)^2 + M_s(t)^2\}. \tag{4.33}
\]
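It then follows from (4.32), (4.33) and Proposition 4.12 with $\sigma = r$ and $\ell = 0$ that
\[
\sum_{2j+|\alpha'|=r-1}\int_0^t \big\{|T_{j,\alpha'}\psi|_2^2 + |T_{j,\alpha'}\phi|_1^2\big\}\,d\tau \le \eta D_r(t)^2 + C_\eta\{E_r(0)^2 + M_s(t)^2\}.
\]
We thus arrive at
\begin{align*}
&\sum_{2j+|\alpha'|=r-1}\Big\{\|T_{j,\alpha'}\phi(t)\|_{H^1}^2 + \|T_{j,\alpha'}\psi(t)\|_{H^1}^2 + \int_0^t \big(|T_{j,\alpha'}\phi|_1^2 + \|T_{j+1,\alpha'}\psi\|_2^2 + |T_{j,\alpha'}\psi|_2^2\big)\,d\tau\Big\} \\
&\quad\le \eta D_r(t)^2 + C_\eta\{E_r(0)^2 + M_s(t)^2\}.
\end{align*}
In particular, we obtain (4.29) for $\sigma = r = 1$ by taking $\eta > 0$ suitably small.

To complete the induction argument for $r \ge 2$, we now show the following inequalities:
\[
\sum_{2j+|\alpha'|=r-1-\ell}\Big\{\|T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi(t)\|_2^2 + \int_0^t \|T_{j,\alpha'}\partial_{x_1}^{\ell+1}\phi\|_2^2\,d\tau\Big\} \le \eta D_r(t)^2 + C_\eta\{E_r(0)^2 + M_s(t)^2\}, \tag{4.34}
\]
\[
\sum_{2j+|\alpha'|=r-1-\ell}\int_0^t |T_{j,\alpha'}\dot\phi|_{\ell+1}^2\,d\tau \le \eta D_r(t)^2 + C_\eta\{E_r(0)^2 + M_s(t)^2\} \tag{4.35}
\]
and
\[
\sum_{2j+|\alpha'|=r-1-\ell}\int_0^t \big\{|T_{j,\alpha'}\psi|_{\ell+2}^2 + |T_{j,\alpha'}\phi|_{\ell+1}^2\big\}\,d\tau \le \eta D_r(t)^2 + C_\eta\{E_r(0)^2 + M_s(t)^2\} \tag{4.36}
\]
for all $0 \le \ell \le r-1$. We will prove (4.34)–(4.36) by induction on $\ell$. We have already proved (4.34)–(4.36) for $\ell = 0$. Let $1 \le k \le r-1$. Assuming that (4.34)–(4.36) hold for all $\ell \le k-1$, we shall prove (4.34)–(4.36) for $\ell = k$. By the inductive assumption on $\sigma$ we have
\[
\sum_{2j+|\alpha'|=r-1-k}\int_0^t |T_{j,\alpha'}\psi|_{k+1}^2\,d\tau \le C\{E_r(0)^2 + M_s(t)^2\}, \tag{4.37}
\]
and, by the inductive assumption on $\ell$ and (4.30), we have
\[
\sum_{2j+|\alpha'|=r-1-k}\int_0^t \big\{|T_{j+1,\alpha'}\psi|_{k}^2 + |\partial_{x'}\partial_x T_{j,\alpha'}\psi|_{k}^2\big\}\,d\tau \le \eta D_r(t)^2 + C_\eta\{E_r(0)^2 + M_s(t)^2\}. \tag{4.38}
\]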
It then follows from Proposition 4.10 with $\sigma = r$ and $\ell = k$ that
\[
\sum_{2j+|\alpha'|=r-1-k}\Big\{\|T_{j,\alpha'}\partial_{x_1}^{k+1}\phi(t)\|_2^2 + \int_0^t \|T_{j,\alpha'}\partial_{x_1}^{k+1}\phi\|_2^2\,d\tau\Big\} \le \eta D_r(t)^2 + C_\eta\{E_r(0)^2 + M_s(t)^2\}.
\]
This proves (4.34) for $\ell = k$, and moreover, we have
\[
\sum_{2j+|\alpha'|=r-1-k}\int_0^t |T_{j,\alpha'}\partial_{x_1}\phi|_{k}^2\,d\tau \le \eta D_r(t)^2 + C_\eta\{E_r(0)^2 + M_s(t)^2\}. \tag{4.39}
\]
Applying now Proposition 4.11 with $\sigma = r$ and $\ell = k$ we deduce from (4.37)–(4.39) that
\[
\sum_{2j+|\alpha'|=r-1-k}\int_0^t |T_{j,\alpha'}\dot\phi|_{k+1}^2\,d\tau \le \eta D_r(t)^2 + C_\eta\{E_r(0)^2 + M_s(t)^2\}. \tag{4.40}
\]
This proves (4.35) for $\ell = k$. In view of (4.37), (4.38) and (4.40) we see from Proposition 4.12 with $\sigma = r$ and $\ell = k$ that (4.36) holds for $\ell = k$. This completes the proof of (4.34)–(4.36).

From the above argument and the inductive assumption on $\sigma$, we conclude that
\[
|[\psi(t)]|_{r-1}^2 + |[\phi(t)]|_{r}^2 + \int_0^t \Big\{|||D\psi|||_{r}^2 + |[\partial_x\phi]|_{r-1}^2 + \big\|\phi|_{x_1=0}\big\|_{L^2(\mathbf{R}^{n-1})}^2\Big\}\,d\tau \le \eta D_r(t)^2 + C_\eta\{E_r(0)^2 + M_s(t)^2\}. \tag{4.41}
\]
This, together with Proposition 4.13 with $\sigma = r$, gives
\[
\int_0^t |[\partial_\tau\phi]|_{r-2}^2\,d\tau \le \eta D_r(t)^2 + C_\eta\{E_r(0)^2 + M_s(t)^2\}.
\]
Furthermore, by (4.41) and Lemma 4.5 we have
\[
|[\psi(t)]|_{r}^2 \le \eta D_r(t)^2 + C_\eta\{E_r(0)^2 + M_s(t)^2\},
\]
and hence, we arrive at
\[
E_r(t)^2 + D_r(t)^2 \le \eta D_r(t)^2 + C_\eta\{E_r(0)^2 + M_s(t)^2\}.
\]
By taking $\eta > 0$ suitably small we obtain (4.29) for $\sigma = r$, and the induction argument on $\sigma$ is complete. This completes the proof of Proposition 3.2.
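The last step, absorbing $\eta D_r(t)^2$ into the left-hand side, can be made explicit. The following sketch fixes $\eta = 1/2$ purely for illustration (any $\eta < 1$ works) and assumes that $D_r(t)$ is finite, which is implicit in the a priori setting but is not restated here.

```latex
% Sketch of the absorption step with the illustrative choice \eta = 1/2,
% assuming D_r(t) < \infty so that subtraction is legitimate.
\begin{align*}
E_r(t)^2 + D_r(t)^2
  &\le \tfrac12 D_r(t)^2 + C_{1/2}\{E_r(0)^2 + M_s(t)^2\} \\
\Longrightarrow\quad
E_r(t)^2 + \tfrac12 D_r(t)^2
  &\le C_{1/2}\{E_r(0)^2 + M_s(t)^2\} \\
\Longrightarrow\quad
E_r(t)^2 + D_r(t)^2
  &\le 2\big(E_r(t)^2 + \tfrac12 D_r(t)^2\big)
   \le 2\,C_{1/2}\{E_r(0)^2 + M_s(t)^2\}.
\end{align*}
% Hence (4.29) holds for \sigma = r with C = 2 C_{1/2}.
```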
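5. Decay of Perturbations as t → ∞

We finally prove the decay properties of the perturbation as $t \to \infty$.

Proposition 5.1. Under the assumption of Proposition 3.2,
\[
\|\partial_x\phi(t)\|_{H^{s-1}} + \|\partial_x\psi(t)\|_{H^{s-1}} \to 0 \tag{5.1}
\]
and
\[
\|\phi(t)\|_\infty + \|\psi(t)\|_\infty \to 0 \tag{5.2}
\]
as $t \to \infty$.

Proof. Since
\[
\int_0^\infty \big(\|\partial_x\phi\|_{H^{s-1}}^2 + \|\partial_x\psi\|_{H^{s}}^2\big)\,d\tau < \infty, \tag{5.3}
\]
we find a sequence $\{t_k\}_{k=1}^\infty$ with $t_k \to \infty$ as $k \to \infty$ such that
\[
\|\partial_x\phi(t_k)\|_{H^{s-1}}^2 + \|\partial_x\psi(t_k)\|_{H^{s}}^2 \to 0 \tag{5.4}
\]
as $k \to \infty$. Applying Lemma 4.5 we have
\begin{align*}
\|\partial_x\phi(t)\|_{H^{s-2}}^2 + \|\partial_x\psi(t)\|_{H^{s-1}}^2
&\le C\Big\{\|\partial_x\phi(t_k)\|_{H^{s-2}}^2 + \|\partial_x\psi(t_k)\|_{H^{s-1}}^2 \\
&\qquad + \int_{t_k}^t \big(\|\partial_\tau\phi\|_{H^{s-2}}\|\partial_x\phi\|_{H^{s-1}} + \|\partial_\tau\psi\|_{H^{s-1}}\|\partial_x\psi\|_{H^{s}}\big)\,d\tau\Big\} \tag{5.5}
\end{align*}
for all $t \ge t_k$ and $k \in \mathbf{N}$. We see from (5.3) and (5.4) that for any $\varepsilon > 0$ there exists a $k$ such that if $t \ge t_k$ then the right-hand side of (5.5) is less than $\varepsilon$, which means
\[
\|\partial_x\phi(t)\|_{H^{s-2}}^2 + \|\partial_x\psi(t)\|_{H^{s-1}}^2 \to 0 \tag{5.6}
\]
as $t \to \infty$.

To prove the decay of $\|\partial_x^{\alpha}\phi(t)\|_2$ for $|\alpha| = s$ we first note that $\partial_x^{\alpha}\phi \in X^0$ satisfies (4.1) in the weak sense with
\[
\tilde f = -\rho\,\mathrm{div}\,\partial_x^{\alpha}\psi + \partial_x^{\alpha}f - [\partial_x^{\alpha}, u\cdot\nabla]\phi - [\partial_x^{\alpha}, \rho\,\mathrm{div}]\psi.
\]
In view of the proof of Proposition 3.2 we already know that
\[
\int_0^\infty \|\tilde f\|_2^2\,d\tau < \infty. \tag{5.7}
\]
By Lemma 4.6 (i) we have
\[
\|\partial_x^{\alpha}\phi(t)\|_2^2 \le \|\partial_x^{\alpha}\phi(t_k)\|_2^2 + C\int_{t_k}^t \big(\|\partial_x^{\alpha}\phi\|_2 + \|\partial_x\psi\|_{H^{s}} + \|\tilde f\|_2\big)\,\|\partial_x^{\alpha}\phi\|_2\,d\tau
\]
for all $t \ge t_k$ and $k \in \mathbf{N}$. This, together with (5.3) and (5.7), implies that for any $\varepsilon > 0$ there exists a $k$ such that if $t \ge t_k$ then $\|\partial_x^{\alpha}\phi(t)\|_2^2 < \varepsilon$, i.e., $\|\partial_x^{\alpha}\phi(t)\|_2 \to 0$ as $t \to \infty$ for $|\alpha| = s$. Combining this with (5.6) we conclude the proof of (5.1). Decay property (5.2) now follows from Proposition 3.2, Lemma 4.1 and (5.1). This completes the proof.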
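Two elementary facts are used in the proof above: an integrable nonnegative function is small along some sequence of times tending to infinity, and the tail integrals in (5.5) are small by the Cauchy–Schwarz inequality. The following is a sketch under the assumption, not restated in this section, that $\int_0^\infty \big(\|\partial_\tau\phi\|_{H^{s-2}}^2 + \|\partial_\tau\psi\|_{H^{s-1}}^2\big)\,d\tau < \infty$ (as one would expect from the bound on $D_s$). The function $G$ is a shorthand introduced here for illustration.

```latex
% Sketch. Hypothetical shorthand:
%   G(\tau) := \|\partial_x\phi(\tau)\|_{H^{s-1}}^2 + \|\partial_x\psi(\tau)\|_{H^{s}}^2 \ge 0,
% with \int_0^\infty G\,d\tau < \infty by (5.3).
\begin{align*}
&\text{If } G(\tau) > \tfrac1k \text{ for a.e. } \tau \ge k, \text{ then }
  \int_k^\infty G\,d\tau = \infty, \text{ a contradiction;} \\
&\text{hence for each } k \in \mathbf{N} \text{ there is } t_k \ge k
  \text{ with } G(t_k) \le \tfrac1k, \text{ which gives (5.4).} \\[4pt]
&\int_{t_k}^{t} \|\partial_\tau\phi\|_{H^{s-2}}\,\|\partial_x\phi\|_{H^{s-1}}\,d\tau
  \le \Big(\int_{t_k}^{\infty} \|\partial_\tau\phi\|_{H^{s-2}}^2\,d\tau\Big)^{1/2}
      \Big(\int_{t_k}^{\infty} \|\partial_x\phi\|_{H^{s-1}}^2\,d\tau\Big)^{1/2}
  \longrightarrow 0 \quad (k \to \infty),
\end{align*}
% and the same bound holds for the \psi-term in (5.5); both tails vanish because
% the integrands are integrable on (0,\infty) (the \partial_\tau-part by assumption).
```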
Appendix

In the Appendix we prove Lemma 4.3. For this purpose we first prepare the following lemma ([3, Lemma A.2]):

Lemma A.1. Let $s$ and $s_k$ $(k = 1,\dots,\ell)$ be nonnegative integers and let $\alpha_k$ $(k = 1,\dots,\ell)$ be multi-indices. Suppose that $s \ge s_0$, $0 \le |\alpha_k| \le s_k \le s + |\alpha_k|$ $(k = 1,\dots,\ell)$ and
\[
s_1 + \cdots + s_\ell \ge (\ell-1)s + |\alpha_1| + \cdots + |\alpha_\ell|.
\]
Then there exists a constant $C > 0$ such that
\[
\big\|\partial_x^{\alpha_1}f_1 \cdots \partial_x^{\alpha_\ell}f_\ell\big\|_2 \le C\prod_{1\le k\le \ell}\|f_k\|_{H^{s_k}}.
\]
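The proof of Lemma A.1 can be found in the appendix of [3].

Proof of Lemma 4.3. The inequality in (ii) is an immediate consequence of Lemma 4.2. Let us prove the third inequality in (i). We set $z = (x,t)$ and $\nu = (\alpha, j)$. Then
\[
[\partial_x^{\alpha}\partial_t^{j}, F(x,t,f_1)]f_2 = [\partial_z^{\nu}, F(z,f_1)]f_2 = \sum_{0<\gamma\le\nu}\binom{\nu}{\gamma}\,\partial_z^{\gamma}\big(F(z,f_1)\big)\,\partial_z^{\nu-\gamma}f_2,
\]
and $\partial_z^{\gamma}(F(z,f_1))$ is bounded by a linear combination of the following terms:
\[
(\partial_z^{\gamma}F)(z,f_1) \quad\text{and}\quad \big(\partial_z^{\gamma_0}\partial_y^{\ell}F\big)(z,f_1)\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1
\]
with $0 \le \gamma_0 < \gamma$, $1 \le \ell \le |\gamma| - |\gamma_0|$, $\sum_{m=0}^{\ell}\gamma_m = \gamma$ and $|\gamma_m| \ge 1$ $(m = 1,\dots,\ell)$. Therefore, it suffices to estimate
\[
\big\|(\partial_z^{\gamma}F)(z,f_1)\,\partial_z^{\nu-\gamma}f_2\big\|_{H^{-1}} \tag{A.1}
\]
and
\[
\Big\|\big(\partial_z^{\gamma_0}\partial_y^{\ell}F\big)(z,f_1)\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1\;\partial_z^{\nu-\gamma}f_2\Big\|_{H^{-1}} \tag{A.2}
\]
with $0 \le \gamma_0 < \gamma$, $1 \le \ell \le |\gamma| - |\gamma_0|$, $\sum_{m=0}^{\ell}\gamma_m = \gamma$ and $|\gamma_m| \ge 1$ $(m = 1,\dots,\ell)$. As for (A.1) we have
\[
\big\|(\partial_z^{\gamma}F)(z,f_1)\,\partial_z^{\nu-\gamma}f_2\big\|_{H^{-1}} \le \big\|(\partial_z^{\gamma}F)(z,f_1)\,\partial_z^{\nu-\gamma}f_2\big\|_2 \le C_0\big\|\partial_z^{\nu-\gamma}f_2\big\|_2 \le C_0\,|[f_2]|_{\sigma-1}. \tag{A.3}
\]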
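We next consider (A.2). We will show
\[
\Big\|\big(\partial_z^{\gamma_0}\partial_y^{\ell}F\big)(z,f_1)\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1\;\partial_z^{\nu-\gamma}f_2\Big\|_{H^{-1}} \le C C_1\,|||Df_1|||_{s-1}\,|[f_2]|_{\sigma-1}, \tag{A.4}
\]
which, together with (A.3), gives the third inequality in (i). We set $\gamma = (\beta,k)$ and $\gamma_m = (\beta_m,k_m)$. For simplicity we assume $|\beta_m| \ge 1$ for $m = 1,\dots,\ell$. The other case can be treated similarly.

We first show inequality (A.4) for $\gamma$ with $1 < |\beta| + 2k < s$. We set $s' = s - |\beta| - 2k$ and $\sigma' = \sigma - 1 - |\alpha-\beta| - 2(j-k)$. Then $s' > 0$ and $\sigma' = |\beta| + 2k - 1 > 0$. Furthermore,
\[
\Big(\frac12 - \frac{s'}{n}\Big) + \Big(\frac12 - \frac{\sigma'}{n}\Big) + \Big(\frac12 - \frac1n\Big) = 1 + \frac12 - \frac{s}{n} < 1.
\]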
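The arithmetic behind this exponent count is short enough to spell out. The sketch below uses only the definitions of $s'$ and $\sigma'$ given above; the final strict inequality relies on the assumption $s > n/2$, which is presumably guaranteed by the standing condition on $s$ but is not restated in this section.

```latex
% Sketch of the exponent count, assuming s > n/2.
\begin{align*}
s' + \sigma' &= (s - |\beta| - 2k) + (|\beta| + 2k - 1) = s - 1, \\
\Big(\frac12 - \frac{s'}{n}\Big) + \Big(\frac12 - \frac{\sigma'}{n}\Big) + \Big(\frac12 - \frac1n\Big)
  &= \frac32 - \frac{s' + \sigma' + 1}{n}
   = \frac32 - \frac{s}{n}
   = 1 + \Big(\frac12 - \frac{s}{n}\Big) < 1 .
\end{align*}
% Since the three lower bounds sum to less than 1 while the three upper bounds
% (each equal to 1/2) sum to 3/2, exponents with 1/p_1 + 1/p_2 + 1/p_3 = 1 exist.
```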
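Therefore, we can find $p_1$, $p_2$ and $p_3$ satisfying
\[
\frac12 - \frac{s'}{n} < \frac{1}{p_1} \le \frac12, \qquad
\frac12 - \frac{\sigma'}{n} \le \frac{1}{p_2} \le \frac12, \qquad
\frac12 - \frac1n < \frac{1}{p_3} \le \frac12
\]
and
\[
\frac{1}{p_1} + \frac{1}{p_2} + \frac{1}{p_3} = 1.
\]
Applying Lemma 4.1 we see that for any $\varphi \in C_0^\infty$,
\begin{align*}
\Big|\Big(\big(\partial_z^{\gamma_0}\partial_y^{\ell}F\big)(z,f_1)\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1\;\partial_z^{\nu-\gamma}f_2,\ \varphi\Big)\Big|
&\le C_1\Big\|\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1\Big\|_{p_1}\big\|\partial_z^{\nu-\gamma}f_2\big\|_{p_2}\|\varphi\|_{p_3} \\
&\le C C_1\Big\|\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1\Big\|_{H^{s'}}\big\|\partial_z^{\nu-\gamma}f_2\big\|_{H^{\sigma'}}\|\varphi\|_{H^1} \\
&\le C C_1\Big\|\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1\Big\|_{H^{s'}}|[f_2]|_{\sigma-1}\,\|\varphi\|_{H^1}.
\end{align*}
The first factor on the right-hand side of this inequality is estimated as
\[
\Big\|\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1\Big\|_{H^{s'}}
\le C\sum_{|\lambda_1|+\cdots+|\lambda_\ell|\le s'}\Big\|\prod_{m=1}^{\ell}\partial_x^{\beta_m+\lambda_m}\partial_t^{k_m}f_1\Big\|_2
= C\sum_{|\lambda_1|+\cdots+|\lambda_\ell|\le s'}\Big\|\prod_{m=1}^{\ell}\partial_x^{\beta_m+\lambda_m-1}\partial_t^{k_m}\partial_x f_1\Big\|_2.
\]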
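Since $\sum_{m=1}^{\ell}(2k_m + |\beta_m|) \le 2k + |\beta|$, we have
\begin{align*}
\sum_{m=1}^{\ell}\{s - 1 - 2k_m - |\beta_m + \lambda_m - 1|\}
&= \ell s - \sum_{m=1}^{\ell}(2k_m + |\beta_m|) - \sum_{m=1}^{\ell}|\lambda_m| \\
&\ge \ell s - 2k - |\beta| - s' = (\ell-1)s.
\end{align*}
Lemma A.1 then implies
\[
\Big\|\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1\Big\|_{H^{s'}}
\le C\sum_{|\lambda_1|+\cdots+|\lambda_\ell|\le s'}\ \prod_{m=1}^{\ell}\big\|\partial_t^{k_m}\partial_x f_1\big\|_{H^{s-1-2k_m}}
\le C\,|||Df_1|||_{s-1},
\]
and, consequently, we obtain inequality (A.4).

We next show inequality (A.4) for $\gamma$ with $|\beta| + 2k = s$. In this case $\sigma = s$ and $\gamma = \nu$. Since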
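\[
\Big(\frac12 - \frac{s-1}{n}\Big) + \Big(\frac12 - \frac1n\Big) = \frac12 + \Big(\frac12 - \frac{s}{n}\Big) < \frac12,
\]
there exist $p_1$ and $p_2$ satisfying
\[
\frac12 - \frac{s-1}{n} < \frac{1}{p_1} \le \frac12, \qquad
\frac12 - \frac1n < \frac{1}{p_2} \le \frac12
\]
and
\[
\frac{1}{p_1} + \frac{1}{p_2} = \frac12.
\]
By Lemma 4.1 we have
\begin{align*}
\Big|\Big(\big(\partial_z^{\gamma_0}\partial_y^{\ell}F\big)(z,f_1)\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1\; f_2,\ \varphi\Big)\Big|
&\le C_1\Big\|\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1\Big\|_2\,\|f_2\|_{p_1}\,\|\varphi\|_{p_2} \\
&\le C C_1\Big\|\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1\Big\|_2\,\|f_2\|_{H^{s-1}}\,\|\varphi\|_{H^1}.
\end{align*}
Since
\[
\sum_{m=1}^{\ell}\{s - 1 - 2k_m - |\beta_m - 1|\} = \ell s - \sum_{m=1}^{\ell}(2k_m + |\beta_m|) \ge \ell s - 2k - |\beta| - s' = (\ell-1)s,
\]
we deduce from Lemma A.1 that
\[
\Big\|\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1\Big\|_2
\le C\sum_{|\lambda_1|+\cdots+|\lambda_\ell|\le s'}\ \prod_{m=1}^{\ell}\big\|\partial_t^{k_m}\partial_x f_1\big\|_{H^{s-1-2k_m}}
\le C\,|||Df_1|||_{s-1},
\]
and hence, inequality (A.4) also holds in this case.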
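Let us prove inequality (A.4) for $\gamma$ with $|\beta| + 2k = 1$. In this case $|\beta| = 1$ and $k = 0$, and hence, $\gamma_0 = (0,0)$ and $\ell = 1$. Therefore, with the same $p_1$ and $p_2$ as in the previous case, we have
\begin{align*}
\Big|\Big((\partial_y F)(z,f_1)\,\partial_x^{\beta}f_1\,\partial_z^{\nu-\gamma}f_2,\ \varphi\Big)\Big|
&\le C_1\big\|\partial_x^{\beta}f_1\big\|_{p_1}\big\|\partial_z^{\nu-\gamma}f_2\big\|_2\,\|\varphi\|_{p_2} \\
&\le C C_1\,\|\partial_x f_1\|_{H^{s-1}}\,\big\|\partial_t^{j}f_2\big\|_{H^{\sigma-1-2j}}\,\|\varphi\|_{H^1} \\
&\le C C_1\,|||Df_1|||_{s-1}\,|[f_2]|_{\sigma-1}\,\|\varphi\|_{H^1},
\end{align*}
from which inequality (A.4) is obtained.

We next prove the second inequality in (i). In view of (A.3), it suffices to estimate
\[
\Big\|\big(\partial_z^{\gamma_0}\partial_y^{\ell}F\big)(z,f_1)\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1\;\partial_z^{\nu-\gamma}f_2\Big\|_2
\]
with $0 \le \gamma_0 < \gamma$, $1 \le \ell \le |\gamma| - |\gamma_0|$, $\sum_{m=0}^{\ell}\gamma_m = \gamma$ and $|\gamma_m| \ge 1$ $(m = 1,\dots,\ell)$. For simplicity we assume $k_m \ge 1$ for $m = 1,\dots,\ell$. The other case can be treated similarly. Since
\begin{align*}
&\{s - 1 - 2(k_1-1) - |\beta_1|\} + \sum_{m=2}^{\ell}\{s - 2 - 2(k_m-1) - |\beta_m|\} + \sigma' \\
&\quad= \{s + 1 - 2k_1 - |\beta_1|\} + \sum_{m=2}^{\ell}\{s - 2k_m - |\beta_m|\} + (2k + |\beta| - 1) \\
&\quad= \ell s - \sum_{m=1}^{\ell}(2k_m + |\beta_m|) + 2k + |\beta| \ \ge\ \ell s,
\end{align*}
we apply Lemma A.1 to obtain
\begin{align*}
\Big\|\big(\partial_z^{\gamma_0}\partial_y^{\ell}F\big)(z,f_1)\prod_{m=1}^{\ell}\partial_z^{\gamma_m}f_1\;\partial_z^{\nu-\gamma}f_2\Big\|_2
&\le C_1\Big\|\partial_x^{\beta_1}\partial_t^{k_1-1}\partial_t f_1\prod_{m=2}^{\ell}\partial_x^{\beta_m}\partial_t^{k_m-1}\partial_t f_1\;\partial_z^{\nu-\gamma}f_2\Big\|_2 \\
&\le C C_1\big\|\partial_t^{k_1-1}\partial_t f_1\big\|_{H^{s-1-2(k_1-1)}}\prod_{m=2}^{\ell}\big\|\partial_t^{k_m-1}\partial_t f_1\big\|_{H^{s-2-2(k_m-1)}}\big\|\partial_z^{\nu-\gamma}f_2\big\|_{H^{\sigma'}} \\
&\le C C_1\,|||Df_1|||_{s}\,|||Df_1|||_{s-1}^{\ell-1}\,|[f_2]|_{\sigma-1}.
\end{align*}
This, together with (A.3), gives the second inequality in (i). The first inequality in (i) can be proved similarly by applying Lemma A.1. We omit the details. This completes the proof.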
References
1. Friedman, A.: Partial Differential Equations. New York: Holt, Rinehart and Winston, 1969
2. Galdi, G. P.: An Introduction to the Mathematical Theory of the Navier-Stokes Equations. Vol. 1, New York: Springer-Verlag, 1994
3. Kagei, Y., Kobayashi, T.: Asymptotic behavior of solutions to the compressible Navier-Stokes equations on the half space. Arch. Rat. Mech. Anal. 177, 231–330 (2005)
4. Kagei, Y., Kawashima, S.: Local solvability of initial boundary value problem for a quasilinear hyperbolic-parabolic system. To appear in J. Hyperbolic Differential Equations
5. Kawashima, S., Nishibata, S., Zhu, P.: Asymptotic Stability of the Stationary Solution to the Compressible Navier-Stokes Equations in the Half Space. Commun. Math. Phys. 240, 483–500 (2003)
6. Matsumura, A.: Inflow and outflow problems in the half space for a one-dimensional isentropic model system of compressible viscous gas. Nonlinear Analysis 47, 4269–4282 (2001)
7. Matsumura, A., Nishida, T.: Initial boundary value problems for the equations of motion of compressible viscous and heat-conductive fluids. Commun. Math. Phys. 89, 445–464 (1983)

Communicated by P. Constantin
Commun. Math. Phys. 266, 431–454 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0032-2
Communications in
Mathematical Physics
The Restricted Kirillov–Reshetikhin Modules for the Current and Twisted Current Algebras
Vyjayanthi Chari (1), Adriano Moura (2)
(1) Department of Mathematics, University of California, Riverside, CA 92521, USA. E-mail: [email protected]
(2) UNICAMP - IMECC, Campinas SP - Brazil, 13083-859. E-mail: [email protected]
Received: 3 August 2005 / Accepted: 16 January 2006 Published online: 25 April 2006 – © Springer-Verlag 2006
Abstract: We define a family of graded restricted modules for the polynomial current algebra associated to a simple Lie algebra. We study the graded character of these modules and show that they are the same as the graded characters of certain Demazure modules. In particular, we see that the specialized characters are the same as those of the Kirillov–Reshetikhin modules for quantum affine algebras.

VC was partially supported by the NSF grant DMS-0500751.

0. Introduction

In this paper we define and study a family of $\mathbf{Z}_+$-graded modules for the polynomial valued current algebra $\mathfrak{g}[t]$ and the twisted current algebra $\mathfrak{g}[t]^\sigma$ associated to a finite-dimensional classical simple Lie algebra $\mathfrak{g}$ and a non-trivial diagram automorphism of $\mathfrak{g}$. The modules, which we denote by $KR(m\omega_i)$ and $KR^\sigma(m\omega_i)$ respectively, are indexed by pairs $(i,m)$, where $i$ is a node of the Dynkin diagram and $m$ is a non-negative integer, and are given by generators and relations. These modules are indecomposable, but usually reducible, and we describe their Jordan–Hölder series by giving the corresponding graded decomposition as a direct sum of irreducible modules for the underlying finite-dimensional simple Lie algebra. Moreover, we prove that the modules are finite-dimensional and hence restricted, i.e., there exists an integer $n \in \mathbf{Z}_+$ depending only on $\mathfrak{g}$ and $\sigma$ such that $(\mathfrak{g}\otimes t^n)KR(m\omega_i) = 0$. It turns out that this graded decomposition is exactly the one predicted in [9, Appendix A], [10, Sect. 6] coming from the study of the Bethe Ansatz in solvable lattice models.

Our interest in these modules and the motivation for calling them the Kirillov–Reshetikhin modules arises from the fact that when we specialize the grading by setting $t = 1$ (or equivalently putting $q = 1$ in the formulae in [9, 10]), the character of the module is exactly the one predicted in [13, 14] for a family of irreducible finite-dimensional modules for the Yangian of $\mathfrak{g}$. Analogous modules for the untwisted quantum affine algebra associated to $\mathfrak{g}$ are also known to exist with this decomposition [2] and these have been studied from a combinatorial viewpoint in [9, 15–17]. One of the methods used in [2] involves passing to the $q = 1$ limit of the modules for the quantum affine algebra, although the resulting modules are not graded or restricted. The methods used in [2] require a number of complicated results from the representation theory of the untwisted quantum affine algebras which have not been proved in the twisted case.

In this paper we show that the Kirillov–Reshetikhin modules can be studied in the non-quantum case. Thus we define graded restricted analogues and compute their graded characters without resorting to the quantum situation. As a consequence we have a mathematical interpretation of the parameter $q$ which appears in the fermionic formulae in [9, 10]. Then we prove that the abstractly defined modules $KR(m\omega_i)$ have a concrete construction as follows: the $\mathfrak{g}[t]$-module structure of the "fundamental" Kirillov–Reshetikhin modules (in most cases these modules correspond to taking $m = 1$) is described explicitly and then the module $KR(m\omega_i)$ is realized as a canonical submodule of a tensor product of the fundamental modules.

Another motivation for our interest is the connection with Demazure modules, [3, 5–7]. Thus we are able to prove that the Kirillov–Reshetikhin modules for the current algebras are isomorphic as representations of the current algebra to the Demazure modules in multiples of the basic representation of the current algebras. Our methods also work for some nodes of the other exceptional algebras; we explain this together with the reasons for the difficulties for the exceptional algebras in the concluding section of this paper.

1. Preliminaries

1.1. Notation. Let $\mathbf{Z}_+$ (resp. $\mathbf{N}$) be the set of non-negative (resp. positive) integers.
Given a Lie algebra $\mathfrak{a}$, let $U(\mathfrak{a})$ denote the universal enveloping algebra of $\mathfrak{a}$ and $\mathfrak{a}[t] = \mathfrak{a}\otimes\mathbf{C}[t]$ the polynomial valued current Lie algebra of $\mathfrak{a}$. The Lie algebra $\mathfrak{a}[t]$ and its universal enveloping algebra are $\mathbf{Z}_+$-graded where the grading is given by the powers of $t$. We shall identify $\mathfrak{a}$ with the subalgebra $\mathfrak{a}\otimes 1$ of $\mathfrak{a}[t]$.

1.2. Classical simple Lie algebras. For the rest of the paper $\mathfrak{g}$ denotes a complex finite-dimensional simple Lie algebra of type $A_n$, $B_n$, $C_n$ or $D_n$, $n \ge 1$, and $\mathfrak{h}$ a Cartan subalgebra of $\mathfrak{g}$. Let $I = \{1,\dots,n\}$ and $\{\alpha_i : i \in I\}$ (resp. $\{\omega_i : i \in I\}$) be a set of simple roots (resp. fundamental weights) of $\mathfrak{g}$ with respect to $\mathfrak{h}$, $R^+$ (resp. $Q$, $P$) be the corresponding set of positive roots (resp. root lattice, weight lattice) and let $Q^+$, $P^+$ be the $\mathbf{Z}_+$-span of the simple roots and fundamental weights respectively. It is convenient to set $\omega_0 = 0$. Let $\theta \in R^+$ be the highest root. For $i \in I$, let $\varepsilon_i : Q \to \mathbf{Z}$ be defined by requiring $\eta = \sum_{i\in I}\varepsilon_i(\eta)\alpha_i$, $\eta \in Q$. We assume throughout the paper that the simple roots are indexed as in [1]. Given $\alpha \in R$, let $\mathfrak{g}_\alpha$ be the corresponding root space. Fix non-zero elements $x_\alpha^{\pm} \in \mathfrak{g}_{\pm\alpha}$, $h_\alpha \in \mathfrak{h}$, such that
\[
[h_\alpha, x_\alpha^{\pm}] = \pm 2x_\alpha^{\pm}, \qquad [x_\alpha^{+}, x_\alpha^{-}] = h_\alpha.
\]
Set $\mathfrak{n}^{\pm} = \bigoplus_{\alpha\in R^+}\mathfrak{g}_{\pm\alpha}$. Given any subset $J \subset I$ let $\mathfrak{g}(J)$ be the subalgebra of $\mathfrak{g}$ generated by the elements $x_{\alpha_j}^{\pm}$, $j \in J$. The sets $R(J)$, $P(J)$ etc. are defined in the obvious way. Let $(\ ,\ )$ be the form on $\mathfrak{h}^*$ induced by the restriction of the Killing form of $\mathfrak{g}$ to $\mathfrak{h}$ normalized so that $(\theta,\theta) = 2$ and set $\check d_j = 2/(\alpha_j,\alpha_j)$.
Restricted K-R Modules for Current and Twisted Current Algebras
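1.3. Finite-dimensional $\mathfrak{g}$-modules. Given $\lambda \in P^+$, let $V(\lambda)$ be the irreducible finite-dimensional $\mathfrak{g}$-module with highest weight vector $v_\lambda$, i.e., the cyclic module generated by $v_\lambda$ with defining relations:
\[
\mathfrak{n}^+ v_\lambda = 0, \qquad h v_\lambda = \lambda(h)v_\lambda, \qquad \big(x_{\alpha_i}^-\big)^{\lambda(h_{\alpha_i})+1}v_\lambda = 0.
\]
For $\mu \in P$, let $V(\lambda)_\mu = \{v \in V(\lambda) : hv = \mu(h)v,\ h \in \mathfrak{h}\}$. Any finite-dimensional $\mathfrak{g}$-module $M$ is isomorphic to a direct sum $\bigoplus_{\lambda\in P^+}V(\lambda)^{\oplus m_\lambda(M)}$, $m_\lambda(M) \in \mathbf{Z}_+$.

Proposition ([18]). Let $\lambda, \mu \in P^+$. Then $V(\lambda)\otimes V(\mu)$ is generated as a $\mathfrak{g}$-module by the element $v_\lambda\otimes v_{\mu^*}$ and relations:
\[
h(v_\lambda\otimes v_{\mu^*}) = (\lambda+\mu^*)(h)(v_\lambda\otimes v_{\mu^*}),
\]
and
\[
\big(x_\alpha^+\big)^{-\mu^*(h_\alpha)+1}(v_\lambda\otimes v_{\mu^*}) = \big(x_\alpha^-\big)^{\lambda(h_\alpha)+1}(v_\lambda\otimes v_{\mu^*}) = 0,
\]
for all $\alpha \in R^+$. Here $\mu^*$ is the lowest weight of $V(\mu)$ and $0 \ne v_{\mu^*} \in V(\mu)_{\mu^*}$.

1.4. Graded modules. Given a $\mathfrak{g}$-module $M$, we regard it as a $\mathfrak{g}[t]$-module by setting $(x\otimes t^r)m = 0$ for all $m \in M$, $r \in \mathbf{N}$, and denote the resulting graded $\mathfrak{g}[t]$-module by $ev_0(M)$. Let $V = \bigoplus_{s\in\mathbf{Z}_+}V[s]$ be a graded representation of $\mathfrak{g}[t]$ with $\dim(V[s]) < \infty$. Note that each $V[s]$ is a $\mathfrak{g}$-module. For any $s \in \mathbf{Z}_+$, let $V(s)$ be the $\mathfrak{g}[t]$-quotient of $V$ by the submodule $\bigoplus_{s'\ge s}V[s']$. Clearly the kernel of the natural surjection $V(s+1) \to V(s)$ is isomorphic to $ev_0(V[s])$, and the irreducible constituents of $V$ are just the irreducible constituents of $ev_0(V[s])$, $s \in \mathbf{Z}_+$.

2. The Kirillov–Reshetikhin Modules for $\mathfrak{g}[t]$

2.1. The modules $KR(m\omega_i)$.

Definition. Given $i \in I$ and $m \in \mathbf{Z}_+$, let $KR(m\omega_i)$ be the $\mathfrak{g}[t]$-module generated by an element $v_{i,m}$ with relations
\[
\mathfrak{n}^+[t]v_{i,m} = 0, \qquad hv_{i,m} = m\omega_i(h)v_{i,m}, \qquad (h\otimes t^r)v_{i,m} = 0, \quad h \in \mathfrak{h},\ r \in \mathbf{N}, \tag{2.1}
\]
and
\[
\big(x_{\alpha_i}^-\big)^{m+1}v_{i,m} = \big(x_{\alpha_i}^-\otimes t\big)v_{i,m} = x_{\alpha_j}^- v_{i,m} = 0, \qquad j \in I\setminus\{i\}. \tag{2.2}
\]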
Note that the modules K R(mωi ) are graded modules since the defining relations are graded. The following is trivially checked. Lemma. For all i ∈ I , m ∈ Z+ , the module ev0 (V (mωi )) is a quotient of K R(mωi ). Remark. These modules were defined and studied initially in [2, Sect. 1] (where they were denoted as W (i, m)) as quotients of the finite–dimensional Weyl modules defined in [4]. In this paper however, we shall show directly that the modules K R(mωi ) are finite–dimensional as a consequence of the analysis of their g–module structure.
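The "trivially checked" lemma amounts to verifying that the highest weight vector of $ev_0(V(m\omega_i))$ satisfies the defining relations (2.1) and (2.2). A sketch of that check follows; it uses the defining relations of $V(\lambda)$ from Sect. 1.3 with $\lambda = m\omega_i$ and the standard pairing $\omega_i(h_{\alpha_j}) = \delta_{ij}$, which is assumed here rather than quoted from the text.

```latex
% Sketch: v := v_{m\omega_i} \in ev_0(V(m\omega_i)) satisfies (2.1) and (2.2).
% Assumption (standard, not restated in the text): \omega_i(h_{\alpha_j}) = \delta_{ij}.
\begin{align*}
(x\otimes t^r)v &= 0 \quad (r \ge 1) && \text{by the definition of } ev_0, \\
\mathfrak{n}^+[t]\,v &= \mathfrak{n}^+ v = 0, \qquad (h\otimes t^r)v = 0 \ (r\in\mathbf{N}), \qquad
  (x_{\alpha_i}^-\otimes t)v = 0 && \text{by the previous line}, \\
h v &= m\omega_i(h)\,v && \text{highest weight } m\omega_i, \\
(x_{\alpha_i}^-)^{m+1}v &= (x_{\alpha_i}^-)^{m\omega_i(h_{\alpha_i})+1}v = 0 && \text{since } m\omega_i(h_{\alpha_i}) = m, \\
x_{\alpha_j}^- v &= (x_{\alpha_j}^-)^{m\omega_i(h_{\alpha_j})+1}v = 0 \quad (j \ne i) && \text{since } m\omega_i(h_{\alpha_j}) = 0 .
\end{align*}
% Hence v_{i,m} \mapsto v extends to a surjective \mathfrak{g}[t]-module map
% KR(m\omega_i) \to ev_0(V(m\omega_i)), because v generates V(m\omega_i).
```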
2.2. The graded character of $KR(m\omega_i)$. We now state the main result of this section. We need some additional notation. For $i \in I$, $m \in \mathbf{Z}_+$, let $P^+(i,m) \subset P^+$ be defined by:
\[
P^+(i,m) = P^+(i,\check d_i) + P^+(i, m - \check d_i), \qquad m > \check d_i,
\]
where $P^+(i,1) = \{\omega_i\}$ if $\varepsilon_i(\theta) = \check d_i$ and in the other cases,
\begin{align*}
P^+(i,1) &= \{\omega_i, \omega_{i-2}, \dots, \omega_{\bar i}\}, && \mathfrak{g} = D_n, \\
P^+(i,1) &= \{\omega_i, \omega_{i-2}, \dots, \omega_{\bar i}\}, && \mathfrak{g} = B_n, \\
P^+(n,2) &= \{2\omega_n, \omega_{n-2}, \dots, \omega_{\bar n}\}, && \mathfrak{g} = B_n, \\
P^+(i,2) &= \{2\omega_i, 2\omega_{i-1}, \dots, 2\omega_1, 0\}, && \mathfrak{g} = C_n,
\end{align*}
where $\bar i \in \{0,1\}$ and $\bar i = i \bmod 2$. Let $\mu_0, \dots, \mu_k$ be the unique enumeration of the sets $P^+(i,\check d_i)$ chosen so that
\[
\mu_j - \mu_{j+1} \in R^+, \qquad \mu_j - \mu_{j+2} \in Q^+\setminus R^+, \qquad 0 \le j \le k-1.
\]
Given $m = \check d_i m_0 + m_1$ with $0 \le m_1 < \check d_i$ and $\mu \in P^+(i,m)$, we can clearly write
\[
\mu = m_1\omega_i + \mu_{j_1} + \cdots + \mu_{j_{m_0}},
\]
where $j_r \in \{0,1,\dots,k\}$ for $1 \le r \le m_0$. We say that the expression is reduced if each $j_r$ is minimal with the property that $\mu - \mu_{j_1} - \cdots - \mu_{j_r} \in P^+(i, m-r)$. Such an expression is clearly unique and we set
\[
|\mu| = \sum_{r=1}^{m_0} j_r.
\]

Theorem. (i) Let $i \in I$, $m, s \in \mathbf{Z}_+$. We have
\[
KR(m\omega_i)[s] \cong \bigoplus_{\{\mu\in P^+(i,m)\,:\,|\mu|=s\}} V(\mu).
\]
(ii) Write $m = \check d_i m_0 + m_1$, where $0 \le m_1 < \check d_i$. The canonical homomorphism of $\mathfrak{g}[t]$-modules
\[
KR(m\omega_i) \to KR(m_1\omega_i)\otimes KR(\check d_i\omega_i)^{\otimes m_0}
\]
mapping $v_{i,m} \mapsto v_{i,m_1}\otimes v_{i,\check d_i}^{\otimes m_0}$ is injective.
The rest of the section is devoted to the proof of this result.

2.3. Elementary properties of $KR(m\omega_i)$.

Proposition. (i) We have
\[
KR(m\omega_i) = \bigoplus_{\mu\in\mathfrak{h}^*} KR(m\omega_i)_\mu
\]
and $KR(m\omega_i)_\mu \ne 0$ only if $\mu \in m\omega_i - Q^+$.
(ii) Regarded as a $\mathfrak g$-module, $KR(m\omega_i)$ and $KR(m\omega_i)[s]$, $s \in \mathbb Z_+$, are isomorphic to a direct sum of irreducible finite-dimensional representations of $\mathfrak g$. (iii) For all $0 \le r \le m$, there exists a canonical homomorphism $KR(m\omega_i) \to KR(r\omega_i) \otimes KR((m-r)\omega_i)$ of graded $\mathfrak g[t]$-modules such that $v_{i,m} \mapsto v_{i,r} \otimes v_{i,m-r}$.

Proof. Part (i) follows by a standard application of the PBW theorem. For (ii) it suffices, by standard results, to show that $KR(m\omega_i)$ is a sum of finite-dimensional $\mathfrak g$-modules. Note that the defining relations of $KR(m\omega_i)$ imply that $U(\mathfrak g)v_{i,m} \cong V(m\omega_i)$ as $\mathfrak g$-modules. Hence
$$(x_\alpha^-)^{m\omega_i(h_\alpha)+1}\, v_{i,m} = 0 \qquad (2.3)$$
for all $\alpha \in R^+$. For $v \in KR(m\omega_i)_\mu$ we have $U(\mathfrak g)v = U(\mathfrak n^-)U(\mathfrak n^+)v$. Part (i) implies that $U(\mathfrak n^+)v$ is a finite-dimensional vector space. Further, since the action of $\mathfrak n^-$ on $\mathfrak n^-[t]$ given by the Lie bracket is locally nilpotent and since $v \in U(\mathfrak n^-[t])v_{i,m}$, it follows from (2.3) that $x_\alpha^-$ acts nilpotently on $v$. This proves that $U(\mathfrak g)v$ is finite-dimensional and part (ii) is established. Part (iii) is clear from the defining relations of the modules.

Corollary. For $i \in I$, $m \in \mathbb Z_+$, we have
$$KR(m\omega_i) \cong \bigoplus_{\mu \in P^+} V(\mu)^{\oplus m_\mu(i,m)}.$$
In particular, $KR(0) \cong \mathbb C$.

2.4. An upper bound for $m_\mu(i,m)$. The following result was proved in [2, Theorem 1] under the assumption that $KR(m\omega_i)$ is finite-dimensional. An inspection of the proof shows, however, that the only place this is used is to write $KR(m\omega_i)$ as a direct sum of irreducible $\mathfrak g$-modules. But (as we shall see in the twisted case, where we do give a proof of the analogous proposition) this only requires the weaker result proved in Proposition 2.3.

Proposition. As a $\mathfrak g$-module we have
$$KR(m\omega_i) \cong \bigoplus_{\mu \in P^+(i,m)} V(\mu)^{\oplus m_\mu},$$
where $m_\mu \in \{0,1\}$.

The next corollary is immediate since $P^+(i,m)$ is a finite set by definition.

Corollary. For all $i \in I$, the modules $KR(m\omega_i)$ are finite-dimensional.

2.5. Proof of Theorem 2.2: the cases $\varepsilon_i(\theta) = 1$, $m \in \mathbb Z_+$ and $\varepsilon_i(\theta) = \check d_i$, $m = 1$. In these cases, it follows from the definition that $P^+(i,m) = \{m\omega_i\}$. Since $\mathrm{ev}_0(V(m\omega_i))$ is a $\mathfrak g[t]$-module quotient of $KR(m\omega_i)$, part (i) is immediate from Proposition 2.4. Part (ii) of the theorem now follows, since the canonical inclusion $V(m\omega_i) \to V(\omega_i)^{\otimes m}$ of $\mathfrak g$-modules is also an inclusion of the $\mathfrak g[t]$-modules $\mathrm{ev}_0(V(m\omega_i)) \to \mathrm{ev}_0(V(\omega_i))^{\otimes m}$. Since $\varepsilon_i(\theta) = 1$ for all $1 \le i \le n$ if $\mathfrak g$ is of type $A_n$, the theorem is proved in this case and we assume for the rest of this section that $\mathfrak g$ is not of type $A_n$.
V. Chari, A. Moura
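2.6. An explicit construction in the case $\varepsilon_i(\theta) = 2$ and $m = \check d_i$. Let $V_s$, $0 \le s \le k$, be $\mathfrak g$-modules such that
$$\mathrm{Hom}_{\mathfrak g}(\mathfrak g \otimes V_s, V_{s+1}) \ne 0, \qquad \mathrm{Hom}_{\mathfrak g}(\wedge^2(\mathfrak g) \otimes V_s, V_{s+2}) = 0, \qquad 0 \le s \le k-1, \qquad (2.4)$$
where we assume that $V_{k+1} = 0$. Fix non-zero elements $p_s \in \mathrm{Hom}_{\mathfrak g}(\mathfrak g \otimes V_s, V_{s+1})$, $0 \le s \le k-1$, and set $p_k = 0$. It is easily checked that the following formulas extend the canonical $\mathfrak g$-module structure to a graded $\mathfrak g[t]$-module structure on $V = \bigoplus_{s=0}^{k} V_s$:
$$(x \otimes t)v = p_s(x \otimes v), \qquad (x \otimes t^r)v = 0, \quad r \ge 2,$$
for all $x \in \mathfrak g$, $v \in V_s$, $0 \le s \le k$. Clearly, $V[s] \cong_{\mathfrak g} V_s$, $0 \le s \le k$. Moreover, if the maps $p_s$, $0 \le s \le k-1$, are all surjective and if $V_0 = U(\mathfrak g)v_0$, then $V = U(\mathfrak g[t])v_0$.

Proposition. Let $i \in I$ be such that $\varepsilon_i(\theta) = 2$ and let $\mu_s \in P^+(i,\check d_i)$, $0 \le s \le k$. The modules $V(\mu_s)$, $0 \le s \le k$, satisfy (2.4) and the resulting $\mathfrak g[t]$-module is isomorphic to $KR(\check d_i\omega_i)$. In particular, $KR(\check d_i\omega_i)[j] \cong_{\mathfrak g} V(\mu_j)$, $0 \le j \le k$.

Proof. Using Proposition 1.3 it is easy to see that
$$\mathrm{Hom}_{\mathfrak g}(\mathfrak g \otimes V(\mu_s), V(\mu_{s+1})) \cong \mathrm{Hom}_{\mathfrak g}(\mathfrak g, V(\mu_s) \otimes V(\mu_{s+1})) \ne 0$$
for all $0 \le s \le k-1$. It is not hard to check (see [8, 19]) that as $\mathfrak g$-modules, $\wedge^2(\mathfrak g) \cong \mathfrak g \oplus V(\nu)$, where $\nu = 2\omega_1 + \omega_2$ if $\mathfrak g$ is of type $C_n$, $\nu = \omega_1 + 2\omega_3$ if $\mathfrak g$ is of type $B_3$, $\nu = \omega_1 + \omega_3 + \omega_4$ if $\mathfrak g$ is of type $D_4$, and $\nu = \omega_1 + \omega_3$ otherwise. Since $\mu_s - \mu_{s+2} \notin R$, it follows that $\mathrm{Hom}_{\mathfrak g}(\mathfrak g \otimes V(\mu_s), V(\mu_{s+2})) = 0$. To prove that $\mathrm{Hom}_{\mathfrak g}(V(\nu) \otimes V(\mu_s), V(\mu_{s+2})) = 0$, it suffices to prove that $\mathrm{Hom}_{\mathfrak g}(V(\mu_s), V(\nu) \otimes V(\mu_{s+2})) = 0$. Suppose that $0 \ne p \in \mathrm{Hom}_{\mathfrak g}(V(\mu_s), V(\nu) \otimes V(\mu_{s+2}))$. A simple computation using the explicit formulas for the fundamental weights in terms of the simple roots (see [11] for instance) shows that
$$\mu_{s+2} + \nu - \mu_s \in Q_J^+, \qquad J = \{1, \dots, s-1\}.$$
Hence $0 \ne p(v_{\mu_s}) \in U(\mathfrak g_J)v_\nu \otimes U(\mathfrak g_J)v_{\mu_{s+2}}$, which implies that
$$p(U(\mathfrak g_J)v_{\mu_s}) \subset U(\mathfrak g_J)v_\nu \otimes U(\mathfrak g_J)v_{\mu_{s+2}}.$$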
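The formulas given for $\mu_s$ show that $\mu_s(h_J) = 0$. This means that $U(\mathfrak g_J)v_{\mu_s} \cong \mathbb C$, which implies that $p(\mathbb C) \cong \mathbb C \subset U(\mathfrak g_J)v_\nu \otimes U(\mathfrak g_J)v_{\mu_{s+2}}$. But this is impossible since by standard results $U(\mathfrak g_J)v_{\mu_{s+2}}$ is not isomorphic to the $\mathfrak g_J$-dual of $U(\mathfrak g_J)v_\nu$. Hence $p = 0$ and we have proved that the modules $V(\mu_s)$, $0 \le s \le k-1$, satisfy the conditions of (2.4). Let $\widetilde{KR}(\check d_i\omega_i)$ be the resulting $\mathfrak g[t]$-module. Since the modules $V(\mu_s)$ are all irreducible, it follows moreover that $\widetilde{KR}(\check d_i\omega_i) = U(\mathfrak g[t])v_{\check d_i\omega_i}$. To see that $\widetilde{KR}(\check d_i\omega_i) \cong KR(\check d_i\omega_i)$ it suffices, by Proposition 2.4, to prove that $\widetilde{KR}(\check d_i\omega_i)$ is a quotient of $KR(\check d_i\omega_i)$. Since $\mu_0 = m\omega_i$ and $\mu_0 - \mu_1 \in R^+ \setminus \{\alpha_i\}$, we see that we must have
$$p_0\big((\mathfrak n^+ \oplus \mathfrak h \oplus \mathbb C x_{\alpha_i}^-) \otimes v_{m\omega_i}\big) = 0.$$
This proves that
$$\mathfrak n^+[t]v_{m\omega_i} = 0, \quad h v_{m\omega_i} = (m\omega_i)(h)v_{m\omega_i}, \quad (h \otimes t^r)v_{m\omega_i} = (x_{\alpha_i}^- \otimes t)v_{m\omega_i} = 0, \quad r \ge 1.$$
Finally, since $\widetilde{KR}(m\omega_i)$ is obviously finite-dimensional, it follows that
$$(x_{\alpha_i}^-)^{m\omega_i(h_{\alpha_i})+1}\, v_{\mu_0} = x_{\alpha_j}^- v_{\mu_0} = 0, \qquad j \in I \setminus \{i\},$$
and the proof of the proposition is complete.

Given $\mu \in P^+(i,m)$, $m = \check d_i m_0 + m_1$, $0 \le m_1 < \check d_i$, with reduced expression $\mu = m_1\omega_i + \mu_{j_1} + \cdots + \mu_{j_{m_0}}$, set $s_j = \#\{r : j_r \ge j\}$, $0 \le j \le k$, and
$$x_\mu = (x^-_{\mu_{k-1}-\mu_k} \otimes t)^{s_k} \cdots (x^-_{\mu_0-\mu_1} \otimes t)^{s_1} \in U(\mathfrak n^-[t]).$$
It is easily seen that $x_\mu v_{i,m} \in KR(m\omega_i)[|\mu|] \cap KR(m\omega_i)_\mu$.

Corollary. Let $i \in I$ be such that $\varepsilon_i(\theta) = 2$. For $0 \le s \le k$ we have
$$x_{\mu_s}v_{i,\check d_i} \ne 0, \qquad h(x_{\mu_s}v_{i,\check d_i}) = \mu_s(h)(x_{\mu_s}v_{i,\check d_i}), \qquad \mathfrak n^+ x_{\mu_s}v_{i,\check d_i} = 0, \qquad h \in \mathfrak h,$$
or, equivalently, $x_{\mu_s}v_{\mu_0} = v_{\mu_s}$.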
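Proof. Suppose that $\mu = \mu_s$ for some $1 \le s \le k$. We first prove that there exists $x_s \in \mathfrak n^-$ such that
$$p_s(x_s \otimes v_{\mu_s}) = v_{\mu_{s+1}}. \qquad (2.5)$$
For this, note that by Proposition 1.3, $v = p_s(x_\theta^- \otimes v_{\mu_s}) \ne 0$. Since $V(\mu_{s+1})$ is irreducible there exists $x_s' \in U(\mathfrak n^+)$ such that $x_s' v = v_{\mu_{s+1}}$, which gives
$$x_s'\, p_s(x_\theta^- \otimes v_{\mu_s}) = p_s\big(\mathrm{ad}(x_s')x_\theta^- \otimes v_{\mu_s}\big) = v_{\mu_{s+1}},$$
proving (2.5). Setting $x_s = \mathrm{ad}(x_s')x_\theta^- \in \mathfrak n^-_{\mu_s - \mu_{s+1}}$, it follows that $x_s$ is a non-zero scalar multiple of $x^-_{\mu_s-\mu_{s+1}}$ for some $0 \le s \le k$ and we get $(x^-_{\mu_s-\mu_{s+1}} \otimes t)v_{\mu_s} \ne 0$. An obvious induction on $s$ now proves that $x_{\mu_s}v_{i,\check d_i} \ne 0$, $1 \le s \le k-1$. The other two statements of the corollary are now immediate.

Recall that
$$KR(\check d_i\omega_i)(r) \cong KR(\check d_i\omega_i) \Big/ \bigoplus_{r' \ge r} KR(m\omega_i)[r'].$$
Let $v_r$ be the image of $v_{\check d_i\omega_i}$ in $KR(\check d_i\omega_i)(r)$. By Corollary 2.6 we see that
$$x_{\mu_{r-1}}v_r \ne 0, \qquad x_{\mu_{r'}}v_r = 0, \qquad 0 \le r \le r' \le k. \qquad (2.6)$$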
2.7. Proof of Theorem 2.2: the case $\varepsilon_i(\theta) = 2$, $m \ge \check d_i$. If $m = \check d_i$, then part (i) is the statement of Proposition 2.6 and part (ii) is trivially true. Assume that $m > \check d_i$ and write $m = \check d_i m_0 + m_1$, $0 \le m_1 < \check d_i$. Note that since $m_1 \in \{0,1\}$ we know by the earlier case that $KR(m_1\omega_i) \cong \mathrm{ev}_0(V(m_1\omega_i))$. Set
$$\widetilde{KR}(m\omega_i) = U(\mathfrak g[t])\big(v_{m_1\omega_i} \otimes v_{\check d_i\omega_i}^{\otimes m_0}\big) \subset KR(m_1\omega_i) \otimes KR(\check d_i\omega_i)^{\otimes m_0}.$$
It is easily checked that $\widetilde{KR}(m\omega_i)$ is a graded quotient of $KR(m\omega_i)$. Using Proposition 2.4 we see that part (i) of the theorem follows if we prove that for all $s \in \mathbb Z_+$,
$$\widetilde{KR}(m\omega_i)[s] \cong \bigoplus_{\{\mu \in P^+(i,m)\,:\,|\mu| = s\}} V(\mu). \qquad (2.7)$$
In particular this proves that $\widetilde{KR}(m\omega_i) \cong KR(m\omega_i)$ and so proves part (ii) of the theorem as well.
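Let $\mu = m_1\omega_i + \mu_{j_1} + \cdots + \mu_{j_{m_0}}$ be a reduced expression for $\mu$, set
$$K = KR(m_1\omega_i) \otimes KR(\check d_i\omega_i)(j_1 + 1) \otimes \cdots \otimes KR(\check d_i\omega_i)(j_{m_0} + 1),$$
and let $\tilde K = U(\mathfrak g[t])(v_{m_1\omega_i} \otimes v_{j_1} \otimes \cdots \otimes v_{j_{m_0}})$. Clearly $K$ and $\tilde K$ are graded quotients of $KR(m_1\omega_i) \otimes KR(\check d_i\omega_i)^{\otimes m_0}$ and $\widetilde{KR}(m\omega_i)$ respectively. Let
$$\pi : KR(m_1\omega_i) \otimes KR(\check d_i\omega_i)^{\otimes m_0} \to K, \qquad \pi\big(\widetilde{KR}(m\omega_i)\big) = \tilde K,$$
be the canonical surjective morphism of $\mathfrak g[t]$-modules. Using (2.6) and the comultiplication of $U(\mathfrak g[t])$ one computes easily that
$$x_\mu\, \pi\big(v_{m_1\omega_i} \otimes v_{\check d_i\omega_i}^{\otimes m_0}\big) = v_{m_1\omega_i} \otimes x_{\mu_{j_1}}v_{j_1} \otimes \cdots \otimes x_{\mu_{j_{m_0}}}v_{j_{m_0}} \ne 0. \qquad (2.8)$$
Corollary 2.6 implies that
$$\mathfrak n^+\big(v_{m_1\omega_i} \otimes x_{\mu_{j_1}}v_{j_1} \otimes \cdots \otimes x_{\mu_{j_{m_0}}}v_{j_{m_0}}\big) = 0,$$
and
$$v_{m_1\omega_i} \otimes x_{\mu_{j_1}}v_{j_1} \otimes \cdots \otimes x_{\mu_{j_{m_0}}}v_{j_{m_0}} \in K[|\mu|] \cap K_\mu.$$
It follows immediately that $\tilde K[|\mu|] \cong_{\mathfrak g} V(\mu) \oplus N$ for some $\mathfrak g$-submodule $N \subset \tilde K$, which proves (2.7).

3. The Twisted Algebras

We use the notation of the previous section freely.

3.1. Preliminaries. Throughout this section we let $\epsilon \in \{0,1\}$. From now on $\mathfrak g$ denotes a Lie algebra of type $A_n$ or $D_n$ and $\sigma : \mathfrak g \to \mathfrak g$ the non-trivial diagram automorphism of order two. The statements in this subsection can be found in [12]. Thus we have
$$\mathfrak g = \mathfrak g_0 \oplus \mathfrak g_1, \qquad \mathfrak h = \mathfrak h_0 \oplus \mathfrak h_1, \qquad \mathfrak n^\pm = \mathfrak n_0^\pm \oplus \mathfrak n_1^\pm,$$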
where $\mathfrak g_0 = \{x \in \mathfrak g : \sigma(x) = x\}$, $\mathfrak g_1 = \{x \in \mathfrak g : \sigma(x) = -x\}$. For any subalgebra $\mathfrak a$ of $\mathfrak g$ with $\sigma(\mathfrak a) \subset \mathfrak a$ we set $\mathfrak a_\epsilon = \mathfrak a \cap \mathfrak g_\epsilon$ and we have $\mathfrak a = \mathfrak a_0 \oplus \mathfrak a_1$. The subalgebra $\mathfrak g_0$ is a simple Lie algebra with Cartan subalgebra $\mathfrak h_0$ and we let $I_0$ be the index set for the corresponding set of simple roots, numbered as in [1]. Although in this and the following sections it is convenient to use the ambient Lie algebra $\mathfrak g$, we do not need any other data associated with it. Thus, all representations, roots, weights, the maps $\varepsilon_i$ and so on will always be those associated with the fixed point algebra $\mathfrak g_0$.
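The decomposition $\mathfrak g = \mathfrak g_0 \oplus \mathfrak g_1$ can be checked by hand in the smallest case. The sketch below is an illustration only and rests on an assumed matrix model that does not appear in the text: $\mathfrak g = \mathfrak{sl}_3$ (type $A_2$) with $\sigma(X) = -A X^{T} A^{-1}$ for the antidiagonal matrix $A$ in the code, which exchanges the two simple root vectors. It verifies that $\sigma$ is an involutive Lie algebra automorphism and computes the dimensions of the two eigenspaces as the ranks of $1 + \sigma$ and $1 - \sigma$.

```python
from fractions import Fraction

N = 3
# Assumed model: sigma(X) = -A X^T A^{-1}; A is symmetric with A*A = identity.
A = [[0, 0, 1], [0, -1, 0], [1, 0, 0]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def transpose(X):
    return [[X[j][i] for j in range(N)] for i in range(N)]

def add(X, Y, c=1):
    """Return X + c*Y."""
    return [[X[i][j] + c * Y[i][j] for j in range(N)] for i in range(N)]

def sigma(X):
    Y = matmul(matmul(A, transpose(X)), A)   # uses A^{-1} = A
    return [[-y for y in row] for row in Y]

def bracket(X, Y):
    return add(matmul(X, Y), matmul(Y, X), -1)

def unit(i, j):
    E = [[0] * N for _ in range(N)]
    E[i][j] = 1
    return E

# A basis of sl_3: off-diagonal matrix units and two diagonal elements.
basis = [unit(i, j) for i in range(N) for j in range(N) if i != j]
basis += [add(unit(0, 0), unit(1, 1), -1), add(unit(1, 1), unit(2, 2), -1)]

def rank(mats):
    """Rank of the span of the given matrices (exact Gaussian elimination)."""
    rows = [[Fraction(x) for r in M for x in r] for M in mats]
    rk = 0
    for col in range(N * N):
        piv = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if piv is None:
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][col] != 0:
                f = rows[r][col] / rows[rk][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk

is_involution = all(sigma(sigma(X)) == X for X in basis)
is_automorphism = all(sigma(bracket(X, Y)) == bracket(sigma(X), sigma(Y))
                      for X in basis for Y in basis)
dim_g0 = rank([add(X, sigma(X)) for X in basis])       # image of 1 + sigma
dim_g1 = rank([add(X, sigma(X), -1) for X in basis])   # image of 1 - sigma
print(is_involution, is_automorphism, dim_g0, dim_g1)
```

In this model the fixed point algebra is three-dimensional and the $-1$ eigenspace is five-dimensional, so the two dimensions add up to $\dim \mathfrak{sl}_3 = 8$.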
We have g0 is of type Cn if g is of type A2n−1 and of type Bn if g is of type A2n or Dn+1 . Let (R0 )s be the set of short roots of g0 and (R0 ) the set of long roots. The adjoint action of g0 on g makes g1 into an irreducible representation of g0 and we have g1 = h 1
μ ∈ h∗0
(g1 )μ , g1 ∼ = V (φ),
where φ ∈ Q +0 is the highest short root of g0 if g is of type A2n−1 or Dn+1 and twice the highest short root if g is of type A2n . Further, if we set
R1 = μ ∈ h∗0 : (g1 )μ = 0 \{0}, R1+ = R1 ∩ Q +0 , then R1 = (R0 )s if g is of type A2n−1 or Dn+1 and R1 = R0 ∪ {2α : α ∈ (R0 )s } if g is of type A2n . In all cases, dim(g1 )α = 1 for α ∈ R1 . Given α ∈ R1+ , we denote by yα± any non–zero element in (g1 )±α . Note that n± 1 = ⊕α∈R ± (g1 )α . In addition, if α ∈ R0+ \R1+ we set yα± = 0, and similarly we set xα± = 0 if 1
α ∈ R1+ \R0+ . Lemma.
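(i) The maps $\mathfrak g_1 \otimes \mathfrak g_1 \to \mathfrak g_0$ and $\mathfrak g_0 \otimes \mathfrak g_1 \to \mathfrak g_1$ defined by $x \otimes y \mapsto [x,y]$ are surjective homomorphisms of $\mathfrak g_0$-modules. (ii) If $\alpha \in R_0 \cap R_1$, there exists $h \in \mathfrak h_1$ such that $[h, y_\alpha^-] = x_\alpha^-$.

3.2. The twisted current algebra. Extend $\sigma$ to an automorphism $\sigma_t : \mathfrak g[t] \to \mathfrak g[t]$ by extending linearly the assignment
$$\sigma_t(x \otimes t^r) = \sigma(x) \otimes (-t)^r, \qquad x \in \mathfrak g, \ r \in \mathbb Z_+.$$
If $\mathfrak a \subset \mathfrak g$ is such that $\sigma(\mathfrak a) \subset \mathfrak a$, let $\mathfrak a[t]^\sigma$ be the set of fixed points of $\sigma_t$. Clearly
$$\mathfrak a[t]^\sigma = \mathfrak a_0 \otimes \mathbb C[t^2] \oplus \mathfrak a_1 \otimes t\,\mathbb C[t^2]$$
and $\mathfrak g[t]^\sigma = \mathfrak n^-[t]^\sigma \oplus \mathfrak h[t]^\sigma \oplus \mathfrak n^+[t]^\sigma$.

3.3. The modules $KR^\sigma(m\omega_i)$.

Definition. For $i \in I_0$, $m \in \mathbb Z_+$, let $KR^\sigma(m\omega_i)$ be the $\mathfrak g[t]^\sigma$-module generated by an element $v^\sigma_{i,m}$ with relations
$$\mathfrak n^+[t]^\sigma v^\sigma_{i,m} = 0, \qquad h_0 v^\sigma_{i,m} = m\omega_i(h_0)v^\sigma_{i,m}, \qquad (h \otimes t^{2r-\epsilon})v^\sigma_{i,m} = x^-_{\alpha_j}v^\sigma_{i,m} = 0, \qquad (3.1)$$
for all $h \in \mathfrak h_\epsilon$, $r \in \mathbb Z_+$, $j \ne i$, and
$$(x^-_{\alpha_i})^{m+1}v^\sigma_{i,m} = 0, \qquad (3.2)$$
$$(x^-_{\alpha_i} \otimes t^2)v^\sigma_{i,m} = (y^-_{\alpha_i} \otimes t)v^\sigma_{i,m} = 0. \qquad (3.3)$$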
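Note that when $\mathfrak g$ is of type $A_{2n}$ or if $\alpha_i \in (R_0)_s$ it can be seen, by using Lemma 3.1(ii), that the relation $(x^-_{\alpha_i} \otimes t^2)v^\sigma_{i,m} = 0$ is actually a consequence of $(y^-_{\alpha_i} \otimes t)v^\sigma_{i,m} = 0$. The modules $KR^\sigma(m\omega_i)$ are clearly graded modules for the graded algebra $\mathfrak g[t]^\sigma$. Any $\mathfrak g_0$-module $V$ can be regarded as a module for $\mathfrak g[t]^\sigma$ by setting $(x \otimes t^{2r+\epsilon})v = 0$, $v \in V$, $x \in \mathfrak g_\epsilon$, $r \in \mathbb Z_+$, and we denote the corresponding module by $\mathrm{ev}_0^\sigma(V)$. The next lemma is easily checked.

Lemma. For all $i \in I_0$ and $m \in \mathbb Z_+$ the module $\mathrm{ev}_0^\sigma(V(m\omega_i))$ is a quotient of $KR^\sigma(m\omega_i)$.

3.4. The graded character of $KR^\sigma(m\omega_i)$. For $i \in I_0$ and $m \in \mathbb N$, let $P_0^+(i,m)^\sigma$ be the subset of $P_0^+$ defined by
$$P_0^+(i,m)^\sigma = P_0^+(i,d_i^\sigma)^\sigma + P_0^+(i,m-d_i^\sigma)^\sigma,$$
where $d_i^\sigma = 1$ if $\mathfrak g$ is not of type $A_{2n}$, and $d_i^\sigma = 2$ for $i \ne n$, $d_n^\sigma = 4$ if $\mathfrak g$ is of type $A_{2n}$, and, for $1 \le m \le d_i^\sigma$, $P_0^+(i,m)^\sigma$ is given as follows. If $\mathfrak g$ is of type $A_{2n}$, then

$P_0^+(i,1)^\sigma = \{\omega_i\}$,
$P_0^+(i,2)^\sigma = \{2\omega_i, 2\omega_{i-1}, \dots, 2\omega_1, 0\}$, $i \ne n$,
$P_0^+(n,2)^\sigma = \{2\omega_n\}$,
$P_0^+(n,3)^\sigma = \{3\omega_n\}$,
$P_0^+(n,4)^\sigma = \{4\omega_n, 2\omega_{n-1}, \dots, 2\omega_1, 0\}$.

If $\mathfrak g$ is of type $A_{2n-1}$, then
$$P_0^+(i,1)^\sigma = \{\omega_i, \omega_{i-2}, \dots, \omega_{\bar i}\},$$
where $\bar i \in \{0,1\}$ and $\bar i = i \bmod 2$. Finally, if $\mathfrak g$ is of type $D_{n+1}$, then
$$P_0^+(i,1)^\sigma = \{\omega_i, \omega_{i-1}, \dots, 0\}, \quad i \ne n, \qquad P_0^+(n,1)^\sigma = \{\omega_n\}.$$
In particular $P_0^+(i,m)^\sigma$ is finite. Fix an enumeration $\mu_s$, $0 \le s \le k$, of the sets $P_0^+(i,m)^\sigma$, $1 \le m \le d_i^\sigma$, by requiring
$$\mu_s - \mu_{s+1} \in R_1^+, \qquad \mu_s - \mu_{s+2} \notin (R_0^+ \cup R_1^+).$$
For $\mu \in P_0^+(i,m)^\sigma$ define a reduced expression and $|\mu| \in \mathbb Z_+$ analogously to the untwisted case. The main result is the following.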
Theorem. Let $i \in I_0$, $m \in \mathbb Z_+$. (i) For all $s \in \mathbb Z_+$,
$$KR^\sigma(m\omega_i)[s] \cong_{\mathfrak g_0} \bigoplus_{\{\mu \in P_0^+(i,m)^\sigma\,:\,|\mu| = s\}} V(\mu).$$
(ii) Write $m = d_i^\sigma m_0 + m_1$, where $0 \le m_1 < d_i^\sigma$. The canonical homomorphism of $\mathfrak g[t]^\sigma$-modules $KR^\sigma(m\omega_i) \to KR^\sigma(m_1\omega_i) \otimes KR^\sigma(d_i^\sigma\omega_i)^{\otimes m_0}$ is injective.

The proof of the theorem proceeds as in the untwisted case.

3.5. Elementary properties of $KR^\sigma(m\omega_i)$. The next proposition is the twisted version of Proposition 2.3 and is proved in the same way.

Proposition. (i) We have
$$KR^\sigma(m\omega_i) = \bigoplus_{\mu \in \mathfrak h_0^*} KR^\sigma(m\omega_i)_\mu$$
and $KR^\sigma(m\omega_i)_\mu \ne 0$ only if $\mu \in m\omega_i - Q_0^+$. (ii) Regarded as a $\mathfrak g_0$-module, $KR^\sigma(m\omega_i)$ and $KR^\sigma(m\omega_i)[s]$, $s \in \mathbb Z_+$, are isomorphic to a direct sum of irreducible finite-dimensional representations of $\mathfrak g_0$. In particular, if $W_0$ is the Weyl group of $\mathfrak g_0$, then $KR^\sigma(m\omega_i)_\mu \ne 0$ iff $KR^\sigma(m\omega_i)_{w\mu} \ne 0$ for all $w \in W_0$. (iii) Given $0 \le r \le m$, there exists a canonical homomorphism $KR^\sigma(m\omega_i) \to KR^\sigma(r\omega_i) \otimes KR^\sigma((m-r)\omega_i)$ of graded $\mathfrak g[t]^\sigma$-modules such that $v^\sigma_{i,m} \mapsto v^\sigma_{i,r} \otimes v^\sigma_{i,m-r}$.

Corollary. We have
$$KR^\sigma(m\omega_i) \cong \bigoplus_{\mu \in P_0^+} V(\mu)^{m_\mu(i,m)}.$$
In particular $KR^\sigma(0) \cong \mathbb C$.

3.6. An upper bound for $m_\mu(i,m)$.

Proposition. For all $i \in I_0$, $m \in \mathbb Z_+$, we have
$$KR^\sigma(m\omega_i) \cong_{\mathfrak g_0} \bigoplus_{\mu \in P_0^+(i,m)^\sigma} V(\mu)^{m_\mu},$$
where $m_\mu \in \{0,1\}$.

Corollary. For all $i \in I_0$ and $m \in \mathbb Z_+$, the modules $KR^\sigma(m\omega_i)$ are finite-dimensional.

We postpone the proof of this proposition to the next section.
3.7. An explicit construction of the modules K R σ (mωi ), i ∈ I0 , 1 ≤ m ≤ diσ . The next lemma is standard and can be found in [8, 19]. Let ι : g0 → ∧2 (g1 ) be the g0 –module map such that [, ] · ι = id. Lemma. (a) If g is of type Dn+1 , n ≥ 2, then ∧2 (g1 ) ∼ =g0 ι(g0 ). (b) If g is of type An , n = 3, we have ∧2 (g1 ) ∼ = ι(g) ⊕ V (ν) , where (i) ν = ω1 + ω3 if g is of type A2n−1 , n ≥ 3, (ii) ν = 2ω1 + ω2 if g is of type A2n , n ≥ 3, (iii) ν = 2ω1 + 2ω2 if g is of type A4 , (iv) ν = 6ω1 if g is of type A2 . Assume that Vs , 0 ≤ s ≤ k, are g0 –modules, let ps : g1 ⊗ Vs → Vs+1 , 0 ≤ s ≤ k − 1, be g0 –module maps, and set pk = 0. Set also qs = ps+1 (1 ⊗ ps ) ∈ Homg0 (g1 ⊗ g1 ⊗ Vs , Vs+2 ). Suppose that one of the following two conditions hold: (a) qs ∧2 (g1 ) ⊗ Vs = 0, ∀ 0 ≤ s ≤ k − 2,
(3.4)
(b) for all 0 ≤ s ≤ k − 2, y, z, w ∈ g1 , v ∈ Vs , we have qs (V (ν) ⊗ Vs ) = 0, ps+2 (1 ⊗ ps+1 )(1 ⊗ ps ) (((z ∧ w) ⊗ y − y ⊗ (z ∧ w)) ⊗ v) = 0.
(3.5)
Let x ∈ g0 , y ∈ g1 , v ∈ Vs . The following formulas define a graded g[t]σ –module structure on V = ⊕ks=1 Vs : (x ⊗ t 2 )v = qs (ι(x) ⊗ v), (y ⊗ t)v = ps (y ⊗ v), and
x ⊗ t 2r v = y ⊗ t 2r −1 v = 0,
for all r ≥ 2 and 0 ≤ s ≤ k. Furthermore, V [s] ∼ =g0 Vs . Moreover, if the maps ps , 0 ≤ s ≤ k − 1 are all surjective and V0 = U(g[t]σ )v0 , then the resulting g[t]σ –module is cyclic on v0 . Proposition. Let i ∈ I0 , 1 ≤ m ≤ diσ . The modules V (μs ), 0 ≤ s ≤ k, satisfy (3.4) (resp. (3.5)) if g is of type An (resp. Dn ). The resulting g[t]σ – module is isomorphic to K R σ (mωi ) and K R σ (mωi )[s] ∼ =g0 V (μs ). Proof. If k = 0 there is nothing to prove since it follows from Proposition 3.6 and Lemma 3.3 that K R σ (mωi ) ∼ = evσ0 (V (mωi )). Assume that k > 0. Using Proposition 1.3 it is easy to see that Homg0 (V (μs ) ⊗ V (μs+1 ), g1 ) = 0 and hence Homg0 (g1 ⊗ V (μs ), V (μs+1 )) = 0.
Suppose first that g is of type An . Equation (3.4) holds if we prove that Homg0 ∧2 (g1 ) ⊗ Vs , Vs+2 = 0, ∀0 ≤ s ≤ k − 2. Since μs − μs+2 ∈ / R0 , it is immediate that Homg0 (g0 ⊗ V (μs ), V (μs+2 )) = 0. To prove that Homg0 (V (ν) ⊗ V (μs ), V (μs+2 )) = 0, it suffices to prove that Homg0 (V (μs ), V (ν) ⊗ V (μs+2 )) = 0. This is done exactly as in the untwisted case by noting that μs+2 + ν − μs = s−1 j=1 k j α j for some k j ∈ Z+ . If g is of type Dn+1 , n ≥ 2, then Lemma 3.7 implies that the first condition in (3.5) is trivially satisfied. If n = 3 the second condition is also trivially true since k ≤ 3. If n > 3, let N be the g0 –submodule of T 3 (g1 ) spanned by elements of the form (x ⊗ y − y ⊗ x) ⊗ z − z ⊗ (x ⊗ y − y ⊗ x), with x, y, z ∈ g1 . Note that N ∩ ∧3 (g1 ) = 0, N ⊂ g1 ⊗ ∧2 (g1 ) + ∧2 (g1 ) ⊗ g1 . Now, it is not hard to see that, g1 ⊗ ∧2 (g1 ) ∼ =g0 V (ω1 + ω2 ) ⊕ V (ω3 ) ⊕ V (ω1 ), n > 3, and that the g0 –submodule V (ω3 ) occurs in T 3 (g1 ) with mulitplicity one. It follows that N∼ = V (ω1 + ω2 )⊕r1 ⊕ V (ω1 )⊕r2 , 0 ≤ r1 , r2 ≤ 2. We now prove that Homg0 (N ⊗ V (μs ), V (μs+3 )) = 0, 0 ≤ s ≤ k − 3, which establishes the second condition in (3.5). Since (V (μs ) ⊗ V (μs+3 ))ω1 = 0, it follows that Homg0 (V (ω1 ) ⊗ V (μs ), V (μs+3 )) = 0 and we are left to show that Homg0 (V (ω1 + ω2 ) ⊗ V (μs ), V (μs+3 )) = 0. For this it suffices to prove that Homg0 (V (μs ), V (ω1 + ω2 ) ⊗ V (μs+3 )) = 0. This is done as usual by noting that μs+3 + ωi + ω2 − μs is in the span of the elements αr , 1 ≤ r ≤ s − 1 and by observing that the (g0 ) J –module U((g0 ) J )vω1 +ω2 ⊂ V (ω1 + ω2 ) is not dual to the (g0 ) J –module U((g0 ) J )vμs+3 ⊂ V (μs+3 ), where J = {1, · · · , s − 1}. This proves that if we fix non–zero maps ps ∈ Homg0 (g1 ⊗ V (μs ), V (μs+1 )), then we can construct a graded cyclic g[t]σ –module V = U(g[t])vμ0 = ⊕ks=0 V (μs ), V [s] ∼ = V (μs ), 0 ≤ s ≤ k. 
As in the untwisted case, to complete the proof it suffices, in view of Proposition 3.6, to prove that V is a quotient of K R σ (mωi ). Since μ0 = mωi and μ0 − μ1 ∈ R0+ \{αi } we get p0 n+1 ⊕ h1 ⊕ Cyα−i ⊗ vmωi = 0,
i.e.,
n+1 ⊗ t 2r −1 vmωi = h1 ⊗ t 2r −1 vmωi = yα−i ⊗ t vmωi = 0, r ≥ 1.
For the same reasons, q0
+ n0 ⊕ h0 ⊕ Cxα−i ⊗ vmωi = 0.
Hence we get σ n+0 ⊗ t 2r vmωi = h0 ⊗ t 2r vmωi = xα−i ⊗ t 2r vi,m = 0, r ≥ 1. The relations (3.2) follow since V is obviously finite–dimensional which completes the proof that V is a quotient of K R σ (mωi ). Given μ ∈ P0+ (i, m)σ , m = diσ m 0 + m 1 , 0 ≤ m 1 < diσ , with reduced expression μ = m 1 ωi + μ j1 + · · · + μ jm 0 , set s j = #{r : jr ≥ j}, 0 ≤ j ≤ k, and sk s yμ = yμ−k−1 −μk ⊗ t · · · yμ−0 −μ1 ⊗ t 1 ∈ U n− [t]σ . It is easily seen that σ yμ vi,m ∈ K R(mωi )[|μ|] ∩ K R(mωi )μ .
The next corollary is proved in the same way as Corollary 2.6 and we omit the details. Corollary. For 0 ≤ s ≤ k we have σ σ σ + σ = μs (h) yμs vi,d yμ vi,d = 0, h ∈ h0 , yμs vi,d σ = 0, h yμs vi,d σ σ , n σ i
i
i
i
or, equivalently, yμs vμ0 = vμs . σ 3.8. The modules K˜R (mωi ) and the completion of the proof of Theorem 3.4. For m = diσ m 0 + m 1 ∈ Z+ , i ∈ I0 , set ⊗m 0 σ σ σ ⊗m 0 ⊂ K R σ (m 1 ωi ) ⊗ K R σ diσ ωi K˜R (mωi ) = U(g[t]σ ) vm ⊗ (v ) . σ d ωi 1 ωi i
The next proposition is proved in exactly the same way as the corresponding result in Sect. 2.7 for the untwisted case, by using Proposition 3.7 and Corollary 3.7, and clearly completes the proof of Theorem 3.4.

Proposition. Let $i \in I_0$, $m \in \mathbb N$. Then
$$\widetilde{KR}^\sigma(m\omega_i)[s] \cong_{\mathfrak g_0} \bigoplus_{\{\mu \in P_0^+(i,m)^\sigma\,:\,|\mu| = s\}} V(\mu).$$
In particular, $\widetilde{KR}^\sigma(m\omega_i) \cong KR^\sigma(m\omega_i)$ as $\mathfrak g[t]^\sigma$-modules.
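4. Proof of Proposition 3.6

We will use the following remark repeatedly without further comment. Let $x \in (\mathfrak g_\epsilon)_\alpha$, $\alpha \in R_\epsilon$, and $r \in \mathbb Z_+$ be such that $(x \otimes t^{2r+\epsilon})v^\sigma_{i,m} = 0$. Then $(x \otimes t^{2r+2s+\epsilon})v^\sigma_{i,m} = 0$ for all $s \in \mathbb Z_+$. To see this, observe that if $h \in \mathfrak h_\epsilon$, then by definition we have $(h \otimes t^{2s+\epsilon})v^\sigma_{i,m} = 0$. If $x \in (\mathfrak g_\epsilon)_\alpha$ for some $\alpha \in R_\epsilon$, choose $h \in \mathfrak h_0$ such that $[h,x] = x$. This gives
$$0 = [h \otimes t^2, x \otimes t^r]v^\sigma_{i,m} = (x \otimes t^{r+2})v^\sigma_{i,m},$$
thus proving the remark.

4.1. The case $\alpha_i \in R_1$ with $\varepsilon_i(\phi) = 1$. The following lemma establishes the proposition in this case.

Lemma. Suppose that $i \in I_0$ is such that $\alpha_i \in R_1$ and $\varepsilon_i(\phi) = 1$ (in other words $i = n$ if $\mathfrak g$ is of type $D_{n+1}$ and $i = 1$ if $\mathfrak g$ is of type $A_{2n-1}$). Then $KR^\sigma(m\omega_i) \cong \mathrm{ev}_0^\sigma(V(m\omega_i))$.

Proof. It is straightforward to check that we can write
$$y_\phi^- = [x_\beta^-, y_{\alpha_i}^-], \qquad x_\theta^- = [y_\phi^-, y_\gamma^-], \qquad (4.1)$$
for some $\beta \in R_0^+$, $\gamma \in R_1^+$ with $\varepsilon_i(\beta) = 0$, $\varepsilon_i(\gamma) = 1$. It follows from (3.3) and Proposition 3.5 that $(y_\phi^- \otimes t)v^\sigma_{i,m} = 0$. Since for all $\alpha \in R_1^+$ we have $y_\alpha^- \otimes t \in U(\mathfrak n^+)(y_\phi^- \otimes t)$, it follows from (3.1) that $(y_\alpha^- \otimes t)v^\sigma_{i,m} = 0$. It follows now from (4.1) that $(x_\theta^- \otimes t^2)v^\sigma_{i,m} = 0$ and hence that $(x_\alpha^- \otimes t^2)v^\sigma_{i,m} = 0$ for all $\alpha \in R_0^+$. This proves that $KR^\sigma(m\omega_i) = U(\mathfrak n_0^-)v^\sigma_{i,m}$ and the lemma is proved.

4.2. The subalgebras $\mathfrak n^-[t]^\sigma_{\max(i)}$. From now on we assume that $i \in I_0$ is such that either $\alpha_i \notin R_1$ or $\varepsilon_i(\phi) = 2$. Let $\max(i) \in \{1,2\}$ be the maximum value of the restriction of $\varepsilon_i$ to $R_1^+$, and let $\mathfrak n^-[t]^\sigma_{\max(i)}$ be the subalgebra of $\mathfrak n^-[t]^\sigma$ generated by
$$\{y_\alpha^- \otimes t : \alpha \in R_1^+,\ \varepsilon_i(\alpha) = \max(i)\}.$$
If $\mathfrak g$ is of type $A_n$, then it is easy to see that
$$\mathfrak n^-[t]^\sigma_{\max(i)} = \bigoplus_{\alpha \in R_1 : \varepsilon_i(\alpha) = \max(i)} \mathbb C(y_\alpha^- \otimes t), \qquad (4.2)$$
while if $\mathfrak g$ is of type $D_{n+1}$, we have
$$\mathfrak n^-[t]^\sigma_{\max(i)} = \Big(\bigoplus_{\alpha \in R_1 : \varepsilon_i(\alpha) = 1} \mathbb C(y_\alpha^- \otimes t)\Big) \oplus \Big(\bigoplus_{\alpha \in R_0^+ : \varepsilon_i(\alpha) = 2} \mathbb C(x_\alpha^- \otimes t^2)\Big). \qquad (4.3)$$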
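4.3. Further relations satisfied by $v^\sigma_{i,m}$.

Proposition. (i) Let $\alpha \in R_1^+$. If $\varepsilon_i(\alpha) < \max(i)$ we have $(y_\alpha^- \otimes t^{2r+1})v^\sigma_{i,m} = 0$ for all $r \in \mathbb Z_+$, while if $\varepsilon_i(\alpha) = \max(i)$ we have $(y_\alpha^- \otimes t^{2r+3})v^\sigma_{i,m} = 0$ for all $r \in \mathbb Z_+$. (ii) If $\mathfrak g$ is of type $A_n$, then for all $\alpha \in R_0^+$, $r \in \mathbb Z_+$, we have $(x_\alpha^- \otimes t^{2r+2})v^\sigma_{i,m} = 0$. (iii) If $\mathfrak g$ is of type $D_{n+1}$, then for all $\alpha \in R_0^+$ and $r \in \mathbb Z_+$, we have $(x_\alpha^- \otimes t^{2r+2\varepsilon_i(\alpha)})v^\sigma_{i,m} = 0$, $i < n$.

Proof. Let $\alpha \in R_1^+$. Suppose that $\varepsilon_i(\alpha) = 0$. Then $s_\alpha(m\omega_i - \alpha) = m\omega_i + \alpha$ and hence $(KR^\sigma(m\omega_i))_{m\omega_i - \alpha} = 0$. This shows that
$$x_\alpha^- v^\sigma_{i,m} = (y_\alpha^- \otimes t)v^\sigma_{i,m} = 0.$$
If $\varepsilon_i(\alpha) = 1 < \max(i)$, then we can choose $j \in I_0$ such that $\beta = \alpha - \alpha_j \in R_1^+ \cap R_0^+$. If $j = i$, then $\varepsilon_i(\beta) = 0$ and we get by using (3.3) that
$$(y_\alpha^- \otimes t)v^\sigma_{i,m} = [x_\beta^-, y_{\alpha_i}^- \otimes t]v^\sigma_{i,m} = 0,$$
and if $j \ne i$, then $\varepsilon_i(\alpha_j) = 0$ and we get
$$(y_\alpha^- \otimes t)v^\sigma_{i,m} = [x_{\alpha_j}^-, y_\beta^- \otimes t]v^\sigma_{i,m} = x_{\alpha_j}^-(y_\beta^- \otimes t)v^\sigma_{i,m}.$$
Repeating the argument with $\beta$ we eventually get $(y_\alpha^- \otimes t)v^\sigma_{i,m} = 0$, thus proving (i). The proofs of (ii) and (iii) are similar and we omit the details.

Corollary. (i) As vector spaces we have
$$KR^\sigma(m\omega_i) = U(\mathfrak n_0^-)\,U(\mathfrak n^-[t]^\sigma_{\max(i)})\,v^\sigma_{i,m}. \qquad (4.4)$$
In particular $KR^\sigma(m\omega_i)$ is finite-dimensional. (ii) If $\mathfrak g$ is of type $A_2$ and $1 \le m \le 3$, then we have $KR^\sigma(m\omega_i) \cong \mathrm{ev}_0^\sigma(V(m\omega_i))$ and Proposition 3.6 is proved in these cases.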
Proof. A straightforward application of the Poincaré-Birkhoff-Witt theorem gives (4.4). − σ Since n− 0 ⊕ n [t]max(i) is a finite–dimensional Lie algebra, it follows that for all μ ∈ P, dim(K R σ (mωi )μ ) < ∞. Proposition 3.5 now implies that if K R σ (mωi ) is infinite–dimensional, there must be infinitely many elements νr ∈ P + with K R σ (mωi )ν = 0 and hence νk ∈ P + ∩(mωi − Q + ). But this is a contradiction since it is well-known that the set P + ∩ (mωi − Q + ) is finite. If g is of type A2 and m ≤ 3, then i = 1 and (mω1 − φ)(h α1 ) < 0 since ε1 (φ) = 2. On the other hand, since ε1 (φ − α1 ) = 1, we see that − σ σ = yφ−α ⊗ t vi,m = 0. xα+1 yφ− ⊗ t vi,m 1 σ Since K R (mω i ) is isomorphic to a direct sum of finite–dimensional g0 –modules this − σ = 0 which proves the corollary. forces yφ ⊗ t vi,m
σ 4.4. Proof of Proposition 3.6. Recall the enumeration μ0 , · · · , μk of the sets P0+ i, diσ and set φs = μk−s − μk−s+1 ∈ R1+ for 1 ≤ s ≤ k. Given r = (r1 , · · · , rk ) ∈ Zk+ , set ηr = ks=1 rs φs ∈ Q + and r r yr = yφ1 ⊗ t 1 · · · yφk ⊗ t k . Note that ηr + ηr = ηr+r for all r, r ∈ Zk+ . For 1 ≤ s ≤ k, let es ∈ Zk+ be the element with one in the s th place and zero elsewhere. As usual ≤ is the partial order on P0 defined by μ ≤ μ iff μ − μ ∈ Q +0 . Proposition. We have σ n+0 yr vi,m ∈
ηr <ηr
K R σ (mωi ) =
σ U n− 0 yr vi,m ,
σ U(g0 )yr vi,m .
(4.5)
(4.6)
r∈Zk+
Assuming this proposition we complete the proof of Proposition 3.6 as follows. By Proposition 3.5 we can pick a g0 –module W 0 such that σ ⊕ W 0. K R σ (mωi ) ∼ =g0 U(g0 )vi,m
If W 0 = 0 there is nothing to prove. Otherwise, let prW 0 be the g0 -projection of k K R σ (mωi ) onto W 0 . Using (4.6) we see that there exists r ∈ Z+ such that prW 0 (yr ) = 0.
σ = 0 and such that ηr1 is minimal: i.e., Choose r1 ∈ Zk+ such that vr1 = prW 0 yr1 vi,m σ = 0 for some r ∈ Zk+ , then ηr1 − ηr ∈ if prW 0 yr vi,m / Q +0 . Using (4.5) we get
σ σ n+0 prW 0 yr1 vi,m = prW 0 n+0 yr1 vi,m = 0,
i.e., n+0 vrσ1 = 0. Thus we can write σ ⊕ U(g0 )vrσ1 ⊕ W 1 , K R σ (mωi ) ∼ =g0 U(g0 )vi,m
for some g0 –submodule W 1 . Repeating the argument we find that there exists rs , s ∈ Z+ , such that σ ⊕ K R σ (mωi ) ∼ = U(g0 )vi,m
s≥1
U(g0 )vrσs ,
with n+0 vrσs = 0. Since vrσs ∈ K R σ (mωi )mωi −k
j=1 s j φ j
,
for some s j ∈ Z+ we get mωi −
k
s j φ j = m − sk diσ ωi + (sk − sk−1 )μ1 + · · · + (s2 − s1 )μk−1 + s1 μk ∈ P0+ .
j=1
An inspection of the sets P0+ (i, diσ )σ shows that this implies m ≥ diσ sk , s j ≤ s j+1 , 1 ≤ j ≤ k − 1, which in turn implies that the weight of vrσs is in P0+ (i, m)σ . Since rs = rs if s = s , Proposition 3.6 is now proved. It remains to prove Proposition 4.4. 4.4.1. Proof of (4.5). It is useful to recall our convention that if α ∈ R1+ , then yα− denotes an arbitrary non–zero element of the space (g1 )−α . The case when g is of type Dn+1 or A2n . We proceed by induction on n. To see that induc tion begins for the series Dn+1 at n = 2 note that then, n+ [t]σmax(1) = C yα−1 +α2 ⊗ t and hence the result follows from Corollary 4.3(i). For A2n it begins at n = 1 by noting that σ n− 0 [t]max(1) = C(yφ ⊗ t) and ε1 (φ − α1 ) = 1 < max(1) and again using Corollary 4.3. For the inductive step set J = {2, · · · , n} and let j ∈ J . We have σ σ x +j yr vi,m = (yφ1 ⊗ t)r1 x +j yr−r1 e1 vi,m ∈ (yφ1 ⊗ t)r1 r1 × U n− U((n− 0 J yr ⊂ 0 ) J )(yφ1 ⊗ t) yr , r ∈S
r ∈S
where
S = r ∈ Z+k : r1 = 0, ηr < ηr−r1 e1 . The last inclusion is a consequence of the fact that yφ−1 , n− 0 = 0. It is now simple to check that ηr > ηr +r1 e1 .
Suppose now that j = 1 and that g is of type A2n . We have σ = yα− +2 n xα+1 yr vi,m
s=2 αs
1
σ σ ⊗ t yr−e1 vi,m = δi,1 xα−1 yr−e1 +e2 vi,m ,
where δi,1 is one if i = 1 and zero otherwise. If g is of type Dn+1 , we have σ σ σ x1+ yr vi,m = yr−e1 +e2 vi,m + xθ− ⊗ t 2 yr−2e1 vi,m , where we recall that θ = α1 + 2
n
j=2 α j .
(4.7)
Since ηr−e1 +e2 < ηr it remains to prove that
σ σ xθ− ⊗ t 2 yr−2e1 vi,m ∈ U n− 0 yr vi,m . ηr <ηr
Now σ σ σ = xθ− ⊗ t 2 yr−2e1 vi,m + yr−e1 +e2 vi,m . xα−1 yr−2e1 +2e2 vi,m
(4.8)
Since ηr > ηr−e1 +e2 and ηr > ηr−2e1 +2e2 the inductive step is proved. The case when g is of type A2n−1 . We note that induction begins at n = 2 by using Corollary 4.3, since in this case we have when i = 2 that n+ [t]σmax(2) = C(yα−1 +α2 ⊗ t). Set J = {2, , · · · , n}. For the inductive step, note that if i is odd, then φs ∈ R +J and by σ . Since the induction hyopthesis it suffices to consider only the case xα+1 yr vi,m
xα+1 , yr = 0,
the result is immediate. Assume then that i is even and let j = 2. We have r r σ σ x +j yr vi,m = yφ1 ⊗ t 1 x +j yr−r1 e1 vi,m ∈ yφ1 ⊗ t 1 r1 × U((n− U((n− 0 ) J )yr ⊂ 0 ) J )(yφ1 ⊗ t) yr , r ∈S
r ∈S
where
S = r ∈ Z+k : r1 = 0, ηr < ηr−r1 e1 . If j = 2, we get σ xα+2 yr vi,m = yα− +α 1
2 +2
n
s=3 αs
σ σ ⊗ t yr−e1 vi,m = δi,2 xα−1 +α2 +α3 yr−e1 +e2 vi,m .
This completes the proof of the inductive step.
4.4.2. Proof of (4.6). Let ki+ ⊂ n+0 be the subalgebra spanned by elements {xα+ : εi (α) = 0}. It is easily seen that the subalgebra n− [t]σmax(i) is a module for ki via the adjoint action, i.e., [ki , n− [t]σmax(i) ] ⊂ n− [t]σmax(i) . This action defines a ki –module structure on U(n− [t]σmax(i) ) and let ρ: ki → End U n− [t]σmax(i) denote the corresponding homomorphism of Lie algebras. Lemma. We have
ρ U ki+ yr . U n− [t]σmax(i) = r∈Zk+
Assuming the lemma we complete the proof of (4.6) as follows. Let g ∈ U(n− [t]σmax(i) ) and write g = r∈Z+ ρ(xr )yr for some xr ∈ U ki+ . Then, k
σ = gvi,m
σ ρ(xr )yr vi,m =
r∈Zk+
σ xr yr vi,m ∈
r∈Zk+
σ U(g0 )yr vi,m ,
r∈Zk+
σ = 0. An application of Corollary where the second equality uses the fact that xr vi,m 4.3(i) now gives (4.6). The proof of Lemma 4.4.2 is straightforward and we give a proof when g is of type Dn+1 . The other cases are similar. If i = 1, there is nothing to prove and we assume from now on that 1 < i < n. We proceed by induction on n, with induction beginning at n = 2 by the preceding comment. Let J = {2, · · · , n}, by the induction hypothesis we have U n− [t]σmax(i)∩n− [t]σ = ρ U ki+ ∩ n+J yr , J
r∈Zi+ :r1 =0
and since yφ−1 , n− [t]σ = yφ−1 , n+J = 0 we get
yφ1 ⊗ t
r1 ∈Z+
r1
U n− [t]σmax(i)∩n− [t]σ = ρ U ki+ ∩ n+J yr . J
Thus the lemma is established if we prove that r yφ1 ⊗ t 1 U n− [t]σmax(i)∩n− [t]σ = U n− [t]σmax(i) . ρ U ki+ r1 ∈Z+
J
For 1 ≤ j ≤ i − 1, set
θ j = α1 + · · · + α j + 2 α j+1 + · · · + αn ,
and
(4.9)
r∈Zi+
s1 si−1 , s ∈ Zi−1 xs = xθ−1 ⊗ t 2 · · · xθ−i−1 ⊗ t 2 + .
(4.10)
By the PBW theorem we have, U n− [t]σmax(i) = s∈Zi−1 + ,r ∈Z+
r xs yφ−1 ⊗ t U n− [t]σmax(i)∩n− [t]σ , J
and hence (4.10) follows if we prove that for all s ∈ Zi−1 + and r ∈ Z+ we have r xs yφ−1 ⊗ t U n− [t]σmax(i)∩n− [t]σ ∈ ρ U ki+ J r1 − σ yφ1 ⊗ t U n [t]max(i)∩n− [t]σ . ×
(4.11)
J
r1 ∈Z+
To prove (4.11) let r1 ∈ Z+ , g ∈ U n− [t]σmax(i)∩n− [t]σ . For 2 ≤ 2s1 ≤ r1 we get J
s1 p s ρ xα+1 1 (yφ1 ⊗ t)r1 g = xθ−1 ⊗ t 2 (yφ1 ⊗ t)r1 −s1 − p (yφ2 ⊗ t)s1 − p g, p=0
which proves (4.11) for all s with s j = 0 for j > 1. Assume that (4.11) is established for all s such that s j = 0 for all j ≥ r . To prove that it holds for s with s j = 0 for j > r > 1, suppose first that sr = 1. Then, ρ xα+r xs−er +er −1 (yφ1 ⊗ t)r1 g = xs (yφ1 ⊗ t)r1 g + xs−er +er −1 (yφ1 ⊗ t)r1 ρ xα+r g, and hence we get by using (4.9) that (4.11) holds for xs (yφ1 ⊗ t)r1 g when sr = 1. An obvious induction on sr gives the result in general. 5. Demazure Modules and Concluding Remarks 5.1. The relation between the modules K R(mωi ) and Demazure modules. 5.1.1. The affine Kac–Moody algebras. Let C[t, t −1 ] be the ring of Laurent polynomials in an indeterminate t. Let g be the affine Lie algebra defined by g = g ⊗ C t, t −1 ⊕ Cc ⊕ Cd, where c is central and x ⊗ t s , y ⊗ t k = [x, y] ⊗ t s+k + sδs+k,0 x, y c, d, x ⊗ t s = sx ⊗ t s , for all x, y ∈ g, s, k ∈ Z and where , is the Killing form of g. Set h = h ⊕ Cc ⊕ Cd, h∗ by setting λ(c) = λ(d) = 0 for λ ∈ h∗ . For 0 ≤ i, j ≤ n and regard h∗ as a subspace of ∗ + ⊂ let i ∈ h be defined by i (h j ) = δi, j , and let P h∗ be the non–negative integer ∗ linear span of i , 0 ≤ i ≤ n. Define δ ∈ (h) by δ|h = 0, δ(d) = 1, δ(c) = 0. h)∗ , 0 ≤ i ≤ n, where α0 = δ − θ are a set of simple roots for g with The elements αi ∈ ( + + spanned by the elements αi , 0 ≤ i ≤ n. Let W respect to h. Let Q be the subset of P –invariant form on be the extended affine Weyl group and ( , ) be the W h∗ obtained by requiring (i , α j ) = δi j for all 0 ≤ i, j ≤ n and (δ, αi ) = 0 for all 0 ≤ i ≤ n.
n + , let L() 5.1.2. The highest weight integrable modules. Given = i=0 m i i ∈ P be the g–module generated by an element v with relations:
xα−i
gt[t]v = 0, (n+ ⊗ 1)v = 0, (h ⊗ 1)v = (h)v , m 0 +1 m +1 ⊗ 1 i v = 0, xθ+ ⊗ t −1 v = 0,
for 1 ≤ i ≤ n. This module is known to be irreducible and integrable (see [12]). The next proposition also can be found in [12]. + . Proposition. Let ∈ P (i) We have
L() = L() , where L() = v ∈ L() : hv = (h)v, h ∈ h . ∈ P
+ . / Q Moreover, dim(L() ) < ∞, dim(L()) = 1, and L() = 0 if − ∈ (ii) The set
: L() = 0 ⊂ − Q + wt(L()) = ∈ P and is preserved by W , ∈ P. dim(L()w ) = dim(L() ) ∀ w ∈ W . In particular, dim(L()w ) = 1 for all w ∈ W 5.1.3. The Demazure modules. From now on we fix = m0 for some m ∈ Z+ and , let vw be a we also assume for simplicity that g is of type An or Dn . Given w ∈ W non–zero element in L()w and let D(w) = U(g[t])vw ⊂ L(). . It is clear from Proposition 5.1.2 that D(w) is finite–dimensional for all w ∈ W + Proposition. Let = m0 ∈ P for some m > 0 and w ∈ W be such that w|h = mωi for some 1 ≤ i ≤ n. Then, D(w) is isomorphic to K R(mωi ). Proof. To prove the proposition we first show that vw satisfies the defining relations of K R(mωi ). It was shown in [3] that the element vw satisfied the relations n+ [t]vw = 0, (h ⊗ t r )vw = 0, h ∈ h, r ∈ N. The proof given in [3, Prop. 1.4.3] was for the special case when g is of type An , but works for any simple Lie algebra. Since D(w) is finite–dimensional the only relation that remains to be checked is that − xαi ⊗ t vw = 0. If not, then L()w−αi +δ = 0 and hence we find that L()−w−1 αi +δ = 0. Now (w, αi ) = m = m0 , w −1 αi implies that w −1 αi = α + δ for some α ∈ R and hence L()−α = 0. This forces α ∈ R + , L()sα (−α) = 0. Since = m0 , we find that sα ( − α) = + α and hence L()+α = 0 which is a contradiction. This proves that D(w) is a quotient of K R(mωi ). The fact that it as an isomorphism of g[t]–modules is immediate from [7, Theorem 1].
5.2. The case of exceptional Lie algebras. For the exceptional Lie algebras the definition and elementary properties of the modules $KR(m\omega_i)$ and their twisted analogs are exactly the same as for the classical Lie algebras. Moreover, the $\mathfrak g$-module decompositions of the modules $KR(m\omega_i)$ (resp. $KR^\sigma(m\omega_i)$) can be shown to coincide with the conjectural decompositions given in [9, 10, 14] as long as $i$ is such that the maximum value of $\epsilon_i(\alpha)\le\check d_i$ for all $\alpha\in R^+$. In the other cases, the main difficulty lies in proving Proposition 2.4 (resp. Proposition 3.6). One reason for this is that the multiplicity of the irreducible representations occurring in a given graded component can be bigger than one. The construction of the fundamental Kirillov–Reshetikhin modules is also much more complicated since the number of irreducible components is large: in the case of $E_8$ and the trivalent node, for instance, the total number of components is conjectured to be three hundred and sixty eight, with twenty four non-isomorphic isotypical components. In fact, in this case a general conjecture for the structure of the non-fundamental modules is not available and the combinatorics seems formidable.

References
1. Bourbaki, N.: Groupes et algèbres de Lie. Chapitres 4, 5, 6. Paris: Hermann, 1968
2. Chari, V.: On the fermionic formula and the Kirillov-Reshetikhin conjecture. Int. Math. Res. Not. 12, 629–654 (2001)
3. Chari, V., Loktev, S.: Weyl, Fusion and Demazure modules for the current algebra of $sl_{r+1}$. http://arxiv.org/list/math.QA/0502165, 2005
4. Chari, V., Pressley, A.: Weyl modules for classical and quantum affine algebras. Represent. Theory 5, 191–223 (2001)
5. Feigin, B., Loktev, S.: On Generalized Kostka Polynomials and the Quantum Verlinde Rule. In: Differential topology, infinite-dimensional Lie algebras, and applications, Amer. Math. Soc. Transl. Ser. 2, Vol. 194, Providence, RI: Amer. Math. Soc., 1999, pp. 61–79
6. Feigin, B., Loktev, S., Kirillov, A.N.: Combinatorics and Geometry of Higher Level Weyl Modules. http://arxiv.org/list/math.QA/0503315, 2005
7. Fourier, G., Littelmann, P.: Tensor product structure of affine Demazure modules and limit constructions. http://arxiv.org/list/math.RT/0412432, 2004
8. Fulton, W., Harris, J.: Representation Theory - A first course. GTM 129, Berlin-Heidelberg-New York: Springer-Verlag, 1991
9. Hatayama, G., Kuniba, A., Okado, M., Takagi, T., Yamada, Y.: Remarks on the Fermionic Formula. Contemp. Math. 248, 243–291 (1999)
10. Hatayama, G., Kuniba, A., Okado, M., Takagi, T., Tsuboi, Z.: Paths, Crystals and Fermionic Formulae. Prog. Math. Phys. 23, 205–272 (2002)
11. Humphreys, J.: Introduction to Lie Algebras and Representation Theory. GTM 9, Berlin-Heidelberg-New York: Springer-Verlag, 1972
12. Kac, V.: Infinite dimensional Lie algebras. Cambridge: Cambridge Univ. Press, 1985
13. Kirillov, A.N., Reshetikhin, N.: Representations of Yangians and multiplicities of occurrence of the irreducible components of the tensor product of simple Lie algebras. J. Sov. Math. 52, no. 5, 393–403 (1981)
14. Kleber, M.: Combinatorial structure of finite-dimensional representations of Yangians: the simply-laced case. Int. Math. Res. Not. 7, no. 4, 187–201 (1997)
15. Naito, S., Sagaki, D.: Path Model for a Level Zero Extremal Weight Module over a Quantum Affine Algebra. http://arxiv.org/list/math.QA/0210450, 2002
16. Naito, S., Sagaki, D.: Construction of perfect crystals conjecturally corresponding to Kirillov-Reshetikhin modules over twisted quantum affine algebras. http://arxiv.org/list/math.QA/0503287, 2005
17. Schilling, A., Sternberg, P.: Finite-Dimensional Crystals $B^{2,s}$ for Quantum Affine Algebras of type $D_n^{(1)}$. http://arxiv.org/list/math.QA/0408113, 2004, to appear in J. Alg. Combin.
18. Parthasarathy, K., Rao, R., Varadarajan, V.: Representations of complex semi-simple Lie groups and Lie Algebras. Ann. Math. 85, 383–429 (1967)
19. Varadarajan, V.: Lie Groups, Lie Algebras, and Their Representations. GTM 102, Berlin-Heidelberg-New York: Springer-Verlag, 1974

Communicated by L. Takhtajan
Commun. Math. Phys. 266, 455–470 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0031-3
Communications in
Mathematical Physics
Multifractal Structure of Two-Dimensional Horseshoes

Luis Barreira, Claudia Valls
Departamento de Matemática, Instituto Superior Técnico, 1049-001 Lisboa, Portugal. E-mail: [email protected]; [email protected]

Received: 24 August 2005 / Accepted: 7 February 2006
Published online: 25 April 2006 – © Springer-Verlag 2006
Supported by the Center for Mathematical Analysis, Geometry, and Dynamical Systems, and through Fundação para a Ciência e a Tecnologia by Programs POCTI/FEDER, POSI, and POCI 2010/Fundo Social Europeu, and the grant SFRH/BPD/14404/2003.

Abstract: We give a complete description of the dimension spectra of Birkhoff averages on a hyperbolic set of a surface diffeomorphism. The main novelty is that we are able to consider simultaneously Birkhoff averages into the future and into the past, i.e., both for positive and negative time. We emphasize that the description of these spectra is not a consequence of the available results in the case of Birkhoff averages simply into the future (or into the past). The main difficulty is that although the local product structure provided by the intersection of stable and unstable manifolds is bi-Lipschitz equivalent to a product, the level sets of the Birkhoff averages are never compact (this causes their box dimension to be strictly larger than their Hausdorff dimension), and thus the product of level sets may have a dimension that need not be the sum of the dimensions of these sets. Instead we construct explicitly noninvariant measures concentrated on each product of level sets with the appropriate pointwise dimension. We also consider the higher-dimensional case of more than one Birkhoff average, as well as the case of ratios of Birkhoff averages.

1. Introduction

1.1. Motivation. Our main concern in this paper is the multifractal analysis of dynamical systems. This theory can be considered a subfield of the dimension theory of dynamical systems, and it essentially studies the complexity of the level sets of invariant local quantities obtained from a dynamical system. In particular one can consider Birkhoff averages, Lyapunov exponents, pointwise dimensions, and local entropies. We emphasize that these functions are usually only measurable and thus their level sets are rarely manifolds. Hence, in order to measure the complexity of these sets it is appropriate to use quantities such as the topological entropy or the Hausdorff dimension. We refer to the book [6] for the state of the art of the theory of multifractal analysis in 1997, and to the survey [1] for later developments.

Our main objective is to give a complete description of the dimension spectra of Birkhoff averages on a locally maximal hyperbolic set of a surface diffeomorphism, taking into account simultaneously Birkhoff averages into the future and into the past. More precisely, the spectra that we consider are obtained by computing the Hausdorff dimension of the level sets of Birkhoff averages of a given function both for positive and negative time.
1.2. A model case: the Smale horseshoe. In order to briefly describe our results and to explain why they are nontrivial, we consider here the very particular case of the (linear) Smale horseshoe $\Lambda = C\times C$, given by the product of two standard middle-third Cantor sets. We emphasize that our results are new even in this very special case. Let $f\colon\Lambda\to\Lambda$ be the dynamics on the horseshoe, here assumed to be expanding in the vertical direction and contracting in the horizontal direction. Given continuous functions $\varphi,\psi\colon\Lambda\to\mathbb R$ we consider the level sets of Birkhoff averages given for each $\alpha,\beta\in\mathbb R$ by

\[ K_{\alpha\beta} = \Bigl\{x\in\Lambda : \lim_{n\to\infty}\frac1n\sum_{k=0}^{n-1}\varphi(f^kx) = \alpha \text{ and } \lim_{n\to\infty}\frac1n\sum_{k=0}^{n-1}\psi(f^{-k}x) = \beta\Bigr\}. \]

The associated dimension spectrum is defined by $D(\alpha,\beta) = \dim_H K_{\alpha\beta}$, where $\dim_H$ denotes the Hausdorff dimension. Again, the novelty with respect to many other spectra studied in the multifractal analysis of dynamical systems is that we consider simultaneously Birkhoff averages into the future and into the past.

We now explain why we cannot obtain a description of the spectrum $D$ from the known results in multifractal analysis. Let $P$ and $Q$ be the orthogonal projections onto the horizontal and vertical axes (note that in the present case of the Smale horseshoe these are respectively the unstable and stable holonomies). From the exponential behavior of $f$ along the stable and unstable manifolds we can show that (see Sect. 3.2)

\[ P(K_{\alpha\beta})\times C = \Bigl\{x\in\Lambda : \lim_{n\to\infty}\frac1n\sum_{k=0}^{n-1}\psi(f^{-k}x) = \beta\Bigr\}, \]
\[ C\times Q(K_{\alpha\beta}) = \Bigl\{x\in\Lambda : \lim_{n\to\infty}\frac1n\sum_{k=0}^{n-1}\varphi(f^kx) = \alpha\Bigr\}. \]

Incidentally, notice that the projection $P(K_{\alpha\beta})$ does not depend on $\alpha$ (and thus on the function $\varphi$), and that the projection $Q(K_{\alpha\beta})$ does not depend on $\beta$ (and thus on the function $\psi$). Therefore

\[ K_{\alpha\beta} = [P(K_{\alpha\beta})\times C]\cap[C\times Q(K_{\alpha\beta})] = P(K_{\alpha\beta})\times Q(K_{\alpha\beta}). \tag{1} \]
This shows that each level set $K_{\alpha\beta}$ is a product of level sets of Birkhoff averages either only into the future or only into the past. A priori it could seem that the identity in (1) would allow us to obtain a description of the spectrum $D$ from the known results for $P(K_{\alpha\beta})$ and $Q(K_{\alpha\beta})$. The problem is that in general the Hausdorff dimension of a product $A\times B$ need not be the sum of the Hausdorff dimensions of $A$ and $B$, unless for example $\dim_H A = \dim_B A$ or $\dim_H B = \dim_B B$, where $\dim_B$ denotes the upper box dimension. It happens that as a consequence of the theory of multifractal analysis (see the discussion at the end of Sect. 3.4), if the functions $\varphi$ and $\psi$ are not cohomologous to constants, then

\[ \dim_H P(K_{\alpha\beta}) < \dim_B P(K_{\alpha\beta}) \quad\text{and}\quad \dim_H Q(K_{\alpha\beta}) < \dim_B Q(K_{\alpha\beta}) \]

for all except one value of $\alpha$ and one value of $\beta$. Thus, even though it follows immediately from (1) that

\[ D(\alpha,\beta)\ge\dim_H P(K_{\alpha\beta}) + \dim_H Q(K_{\alpha\beta}), \tag{2} \]

a priori this inequality could be strict. Our main objective is to show that indeed (2) becomes an identity for every $\alpha$ and $\beta$ (see Sect. 3.4). A consequence of the theory of multifractal analysis is then the analyticity of the spectrum (see Sect. 3.5). We also consider the higher-dimensional case of more than one function, as well as the case of ratios of Birkhoff averages.
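As an informal illustration of the product structure in (1), the following Python sketch works in the symbolic model of the linear horseshoe, where a point is coded by a two-sided sequence of symbols 0 and 1 and f acts as the shift. It is only a sketch under assumptions that are not in the text: both functions are taken to read the symbol at position zero, and the helper names `periodic`, `forward_average` and `backward_average` are hypothetical. With these choices the forward average sees only the future symbols and the backward average only the past symbols (apart from the shared symbol at position zero), so the two averages can be prescribed independently.

```python
from fractions import Fraction
from itertools import cycle, islice


def periodic(block, length):
    """Return the first `length` symbols of the periodic sequence block, block, ..."""
    return list(islice(cycle(block), length))


def forward_average(future, n):
    """(1/n) * sum_{k=0}^{n-1} phi(f^k x), with phi = symbol at position 0.

    `future[k]` is the symbol at position k >= 0, so phi(f^k x) = future[k].
    """
    return Fraction(sum(future[:n]), n)


def backward_average(past, zeroth, n):
    """(1/n) * sum_{k=0}^{n-1} psi(f^{-k} x), with psi = symbol at position 0.

    `past[k]` is the symbol at position -(k + 1); psi(f^{-k} x) is the symbol
    at position -k, i.e. `zeroth` for k = 0 and past[k - 1] for k >= 1.
    """
    symbols = [zeroth] + past[: n - 1]
    return Fraction(sum(symbols), n)


n = 6000
future = periodic([1, 0, 0], n)   # frequency of the symbol 1 in the future: 1/3
past = periodic([1, 0], n)        # frequency of the symbol 1 in the past: 1/2

alpha_n = forward_average(future, n)
beta_n = backward_average(past, future[0], n)
print(float(alpha_n), float(beta_n))
```

Changing the past block changes only the backward average and changing the future block changes only the forward average (up to the single shared symbol), which is the symbolic counterpart of the identity in (1).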
1.3. Method of proof. We now briefly explain how we overcome the above difficulty. Our approach to establish the equality in (2) is to construct explicitly measures that are concentrated on each level set K αβ and have the right pointwise dimension, although they are always noninvariant (see Sect. 3.3). These measures are nevertheless constructed by combining invariant measures obtained from the theory of multifractal analysis. We note that we also require the so-called diametric regularity of the involved measures (see (16) below for the definition). This property is crucial in some approaches to the multifractal analysis of a dynamical system: it is used to ensure that the multifractal analysis at the level of symbolic dynamics can be transferred to the multifractal analysis on the manifold (see [6] for details). For example, all equilibrium measures of a Hölder continuous function on a locally maximal hyperbolic set are diametrically regular. Even though the measures that we construct are never equilibrium measures (we recall that they are never invariant), as already mentioned their “building blocks” come from multifractal analysis (although not always from the same dynamics) and this allows us to use the above property.
2. Hyperbolic Sets and Hausdorff Dimension

Let $f\colon M\to M$ be a $C^{1+\varepsilon}$ diffeomorphism on a surface $M$, for some $\varepsilon\in(0,1]$, and let $\Lambda\subset M$ be a compact smooth hyperbolic set for $f$. This means that $\Lambda$ is an $f$-invariant set (i.e., $f(\Lambda) = \Lambda$), and that there exist a continuous splitting of the tangent bundle $T_\Lambda M = E^s\oplus E^u$, and constants $c>0$ and $\lambda\in(0,1)$ such that for every $x\in\Lambda$ we have:
1. $d_xf(E^s(x)) = E^s(fx)$ and $d_xf(E^u(x)) = E^u(fx)$;
2. $\|d_xf^nv\|\le c\lambda^n\|v\|$ whenever $v\in E^s(x)$ and $n\in\mathbb N$;
3. $\|d_xf^{-n}v\|\le c\lambda^n\|v\|$ whenever $v\in E^u(x)$ and $n\in\mathbb N$.

We will always assume that $\Lambda$ is locally maximal, i.e., that there exists an open neighborhood $U$ of $\Lambda$ such that $\Lambda = \bigcap_{n\in\mathbb Z}f^nU$, that the stable and unstable distributions $E^s$ and $E^u$ have dimension one, and that $f$ is topologically mixing on $\Lambda$.

Let $\Lambda$ be a hyperbolic set for $f$, and denote by $d$ the distance on $M$. For each $x\in\Lambda$ there exist local stable and unstable manifolds $V^s(x)$ and $V^u(x)$ containing $x$ such that:
1. $T_xV^s(x) = E^s(x)$ and $T_xV^u(x) = E^u(x)$;
2. $f(V^s(x))\subset V^s(fx)$ and $f^{-1}(V^u(x))\subset V^u(f^{-1}x)$;
3. there exist constants $c>0$ and $\lambda\in(0,1)$ (independent of $x$) such that for each $n\in\mathbb N$ we have

\[ \begin{aligned} d(f^ny,f^nx) &\le c\lambda^nd(y,x) \quad\text{whenever } y\in V^s(x),\\ d(f^{-n}y,f^{-n}x) &\le c\lambda^nd(y,x) \quad\text{whenever } y\in V^u(x). \end{aligned} \tag{3} \]

Given $x\in\Lambda$ we also consider the global stable and unstable manifolds of $x$, defined by

\[ W^s(x) = \bigcup_{n\in\mathbb N}f^{-n}V^s(f^nx) \quad\text{and}\quad W^u(x) = \bigcup_{n\in\mathbb N}f^nV^u(f^{-n}x). \]

Let now $t_s$ and $t_u$ be the unique real numbers such that

\[ P(t_s\log\|df|E^s\|) = P(t_u\log\|df^{-1}|E^u\|) = 0, \]

where $P$ denotes the topological pressure with respect to $f$ on $\Lambda$ (see for example [3] for the definition). It was shown in [4] that

\[ \dim_H(\Lambda\cap V^s(x)) = t_s \quad\text{and}\quad \dim_H(\Lambda\cap V^u(x)) = t_u \tag{4} \]

for every $x\in\Lambda$, and it was shown in [5] that

\[ \dim_H(\Lambda\cap V^s(x)) = \dim_B(\Lambda\cap V^s(x)), \quad \dim_H(\Lambda\cap V^u(x)) = \dim_B(\Lambda\cap V^u(x)) \tag{5} \]

for every $x\in\Lambda$. Since in the present situation the stable and unstable holonomies are Lipschitz, we have

\[ \dim_H\Lambda = \dim_H[(\Lambda\cap V^s(x))\times(\Lambda\cap V^u(x))] = \dim_H(\Lambda\cap V^s(x)) + \dim_H(\Lambda\cap V^u(x)) = t_s+t_u, \tag{6} \]

using (5) in the second identity (we recall that if $\dim_H A = \dim_B A$, then $\dim_H(A\times B) = \dim_H A+\dim_H B$ for any set $B$).
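For the linear horseshoe of Sect. 1.2 the two roots of the pressure equation can be computed directly, which gives a quick sanity check of (4) and (6). The sketch below assumes a piecewise linear model on a full shift, where the pressure of a function that is constant on each first-level cylinder is the logarithm of the sum of the exponentials of its values, so that the equation "pressure equals zero" reduces to requiring that the t-th powers of the contraction ratios sum to one. The function name `bowen_root` and the lists of ratios are illustrative choices, not notation from the paper.

```python
import math


def bowen_root(ratios, tol=1e-12):
    """Solve sum(r**t for r in ratios) == 1 for t >= 0 by bisection.

    Each ratio must lie in (0, 1); the left-hand side is then strictly
    decreasing in t, equals len(ratios) at t = 0 and tends to 0 as t grows.
    """
    if not ratios or any(not (0.0 < r < 1.0) for r in ratios):
        raise ValueError("ratios must be a non-empty list of numbers in (0, 1)")
    lo, hi = 0.0, 1.0
    while sum(r ** hi for r in ratios) > 1.0:   # enlarge the bracket if needed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sum(r ** mid for r in ratios) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Linear Smale horseshoe: two branches, contraction 1/3 along the stable
# direction and, for the inverse map, 1/3 along the unstable direction.
t_s = bowen_root([1 / 3, 1 / 3])
t_u = bowen_root([1 / 3, 1 / 3])
dim_horseshoe = t_s + t_u          # identity (6) in this model
print(t_s, dim_horseshoe)
```

In this model both roots equal log 2 / log 3, the dimension of the middle-third Cantor set, and their sum is the dimension of the product of two such Cantor sets.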
3. Dimension Spectra

We consider in this section the dimension spectra of Birkhoff averages on a locally maximal hyperbolic set $\Lambda$ of a surface diffeomorphism $f\colon M\to M$.

3.1. Dimension spectra. We denote by $C^\delta(\Lambda)$ the space of Hölder continuous functions $\varphi\colon\Lambda\to\mathbb R$ with a given Hölder exponent $\delta\in(0,1]$. We fix $\kappa\in\mathbb N$, and we consider two pairs of functions $(\Phi^+,\Psi^+)$ and $(\Phi^-,\Psi^-)$ in the space $H(\Lambda) := C^\delta(\Lambda)^\kappa\times C^\delta(\Lambda)^\kappa$. The symbols $+$ and $-$ correspond respectively to the future and to the past: more precisely, we shall consider the Birkhoff averages of the functions $(\Phi^+,\Psi^+)$ into the future and the Birkhoff averages of the functions $(\Phi^-,\Psi^-)$ into the past. A similar notation will be used throughout the remaining sections in other quantities. Write

\[ \Phi^\pm = (\varphi_1^\pm,\dots,\varphi_\kappa^\pm) \quad\text{and}\quad \Psi^\pm = (\psi_1^\pm,\dots,\psi_\kappa^\pm). \]

We will always assume that $\psi_i^\pm>0$ for $i=1,\dots,\kappa$ (and for simplicity we shall simply write $\Psi^\pm>0$). Given a vector $\alpha = (\alpha_1,\dots,\alpha_\kappa)\in\mathbb R^\kappa$ we consider the level sets

\[ K_\alpha^+ = \bigcap_{i=1}^\kappa\Bigl\{x\in\Lambda : \lim_{n\to\infty}\frac{\sum_{k=0}^n\varphi_i^+(f^kx)}{\sum_{k=0}^n\psi_i^+(f^kx)} = \alpha_i\Bigr\}, \]
\[ K_\alpha^- = \bigcap_{i=1}^\kappa\Bigl\{x\in\Lambda : \lim_{n\to\infty}\frac{\sum_{k=0}^n\varphi_i^-(f^{-k}x)}{\sum_{k=0}^n\psi_i^-(f^{-k}x)} = \alpha_i\Bigr\}. \]

We define the associated dimension spectrum $D\colon\mathbb R^\kappa\times\mathbb R^\kappa\to\mathbb R$ by

\[ D(\alpha,\beta) = \dim_H(K_\alpha^+\cap K_\beta^-). \]

3.2. Formulas for the dimension of the level sets. We first consider separately each of the level sets $K_\alpha^+$ and $K_\alpha^-$.

Theorem 1. Let $\Lambda$ be a compact locally maximal hyperbolic set for a $C^{1+\varepsilon}$ diffeomorphism on a smooth surface. Given pairs of functions $(\Phi^\pm,\Psi^\pm)\in H(\Lambda)$ with $\Psi^\pm>0$, for each $\alpha\in\mathbb R^\kappa$ and $x^\pm\in K_\alpha^\pm$ we have

\[ \Lambda\cap W^s(x^+)\subset K_\alpha^+, \quad \Lambda\cap W^u(x^-)\subset K_\alpha^-, \tag{7} \]

and

\[ \dim_H K_\alpha^+ = \dim_H(K_\alpha^+\cap V^u(x^+)) + t_s, \]
\[ \dim_H K_\alpha^- = \dim_H(K_\alpha^-\cap V^s(x^-)) + t_u. \]

Proof. Let $a,b\colon\Lambda\to\mathbb R$ be continuous functions with $b>0$. It follows from (3) and the uniform continuity of $a$ and $b$ on $\Lambda$ that for each $x\in\Lambda$ and $\delta>0$, given $n\in\mathbb N$ sufficiently large we have

\[ |a(f^my)-a(f^mx)|<\delta \quad\text{and}\quad |b(f^my)-b(f^mx)|<\delta \]

for every $y\in V^s(x)$ and $m>n$. Therefore,

\[ \begin{aligned} &\left|\frac{\sum_{k=0}^ma(f^ky)}{\sum_{k=0}^mb(f^ky)} - \frac{\sum_{k=0}^ma(f^kx)}{\sum_{k=0}^mb(f^kx)}\right|\\ &\quad\le \frac{\sum_{k=0}^m|a(f^ky)-a(f^kx)|}{\sum_{k=0}^mb(f^ky)} + \left|\frac{\sum_{k=0}^ma(f^kx)}{\sum_{k=0}^mb(f^ky)} - \frac{\sum_{k=0}^ma(f^kx)}{\sum_{k=0}^mb(f^kx)}\right|\\ &\quad\le \frac{n\|a\|_\infty+(m-n+1)\delta}{(m+1)\inf b} + (m+1)\|a\|_\infty\frac{n\|b\|_\infty+(m-n+1)\delta}{(m+1)^2(\inf b)^2}\\ &\quad\to \frac{\delta}{\inf b} + \frac{\|a\|_\infty\delta}{(\inf b)^2} \end{aligned} \tag{8} \]

as $m\to\infty$. Assume now that there exists $\beta\in\mathbb R$ such that

\[ \lim_{m\to\infty}\frac{\sum_{k=0}^ma(f^kx)}{\sum_{k=0}^mb(f^kx)} = \beta. \]

Then, taking $\delta\to0$ in (8) we conclude that

\[ \lim_{m\to\infty}\frac{\sum_{k=0}^ma(f^ky)}{\sum_{k=0}^mb(f^ky)} = \beta \quad\text{for every } y\in V^s(x). \]

This implies that $\Lambda\cap V^s(x)\subset K_\alpha^+$ for every $x\in K_\alpha^+$. Furthermore, since the set $K_\alpha^+$ is $f$-invariant we conclude that $\Lambda\cap f^{-n}V^s(f^nx)\subset K_\alpha^+$ whenever $x\in K_\alpha^+$ for every $n\in\mathbb N$, and thus $\Lambda\cap W^s(x)\subset K_\alpha^+$. Similar arguments establish the analogous statement in (7) between $K_\alpha^-$ and the global unstable manifolds.

Let now $V_\varepsilon^s(x)\subset W^s(x)$ and $V_\varepsilon^u(x)\subset W^u(x)$ be the segments of size $\varepsilon$ of the stable and unstable manifolds, with respect to the distances induced by $d$ respectively on $W^s(x)$ and $W^u(x)$. Since $E^s$ and $E^u$ have codimension one, the stable and unstable holonomies are Lipschitz. Hence, by the uniform transversality of the stable and unstable manifolds, given $x\in\Lambda$ and a sufficiently small $\varepsilon>0$, the map

\[ (\Lambda\cap V_\varepsilon^s(x))\times(\Lambda\cap V_\varepsilon^u(x))\ni(y,z)\mapsto[y,z] := V_\varepsilon^u(y)\cap V_\varepsilon^s(z) \]

is a Lipschitz homeomorphism with Lipschitz inverse. This ensures that in the set $K_\alpha^+$ the open neighborhood

\[ \Lambda\cap\bigcup_{y\in K_\alpha^+\cap V_\varepsilon^u(x)}V_\varepsilon^s(y) \]

of a point $x\in K_\alpha^+$ (with respect to the induced topology on $\Lambda$, in view of (7)) is taken onto the product $(K_\alpha^+\cap V_\varepsilon^u(x))\times(\Lambda\cap V_\varepsilon^s(x))$ by a Lipschitz map with Lipschitz inverse. Therefore,

\[ \dim_H K_\alpha^+ = \dim_H(K_\alpha^+\cap V_\varepsilon^u(x)) + t_s = \dim_H(K_\alpha^+\cap V^u(x)) + t_s, \]

in view of (4) and (5). We obtain the corresponding identity for $K_\alpha^-$ in an analogous manner. This completes the proof of the theorem.
3.3. Existence of full measures. We now consider the dimension spectrum. We denote by $\mathcal M$ the family of $f$-invariant Borel probability measures on $\Lambda$, and we define the functions $P^\pm\colon\mathcal M\to\mathbb R^\kappa$ by

\[ P^\pm(\mu) = \left(\frac{\int_\Lambda\varphi_1^\pm\,d\mu}{\int_\Lambda\psi_1^\pm\,d\mu},\dots,\frac{\int_\Lambda\varphi_\kappa^\pm\,d\mu}{\int_\Lambda\psi_\kappa^\pm\,d\mu}\right). \]

We note that $\mathcal M$ is compact and connected, and since $P^\pm$ is continuous, the set $P^\pm(\mathcal M)$ is also compact and connected. We denote by $B(x,r)$ the ball of radius $r$ centered at $x$.

Theorem 2. Let $\Lambda$ be a compact locally maximal hyperbolic set for a $C^{1+\varepsilon}$ diffeomorphism on a smooth surface. Given pairs of functions $(\Phi^\pm,\Psi^\pm)\in H(\Lambda)$ with $\Psi^\pm>0$, if $\alpha\in\operatorname{int}P^+(\mathcal M)$ and $\beta\in\operatorname{int}P^-(\mathcal M)$, then there exists a probability measure $\nu$ on $\Lambda$ such that $\nu(K_\alpha^+\cap K_\beta^-) = 1$, with

\[ \lim_{r\to0}\frac{\log\nu(B(x,r))}{\log r} = \dim_H K_\alpha^+ + \dim_H K_\beta^- - \dim_H\Lambda \tag{9} \]

for $\nu$-almost every $x\in\Lambda$, and

\[ \limsup_{r\to0}\frac{\log\nu(B(x,r))}{\log r}\le\dim_H K_\alpha^+ + \dim_H K_\beta^- - \dim_H\Lambda \tag{10} \]

for every $x\in K_\alpha^+\cap K_\beta^-$.

Proof. Consider a Markov partition of $\Lambda$, and the associated two-sided topological Markov chain $\sigma\colon\Sigma_A\to\Sigma_A$ with transfer matrix $A$. We also consider the coding map $\chi\colon\Sigma_A\to\Lambda$ obtained from the Markov partition. We denote by $\Sigma_A^+$ and $\Sigma_A^-$ respectively the sets of right-sided and left-sided infinite sequences obtained from $\Sigma_A$. We consider the topological Markov chains $\sigma^+\colon\Sigma_A^+\to\Sigma_A^+$ and $\sigma^-\colon\Sigma_A^-\to\Sigma_A^-$ defined by

\[ \sigma^+(i_0i_1\cdots) = (i_1i_2\cdots) \quad\text{and}\quad \sigma^-(\cdots i_{-1}i_0) = (\cdots i_{-2}i_{-1}). \]

Take now $x\in\Lambda$, and choose $\omega\in\Sigma_A$ such that $\chi(\omega) = x$. Let $R(x)$ be a rectangle of the Markov partition that contains $x$, and let $\pi^+\colon\Sigma_A\to\Sigma_A^+$ and $\pi^-\colon\Sigma_A\to\Sigma_A^-$ be the projections defined respectively by

\[ \pi^+(\cdots i_{-1}i_0i_1\cdots) = (i_0i_1\cdots) \quad\text{and}\quad \pi^-(\cdots i_{-1}i_0i_1\cdots) = (\cdots i_{-1}i_0). \]

For each sequence $\omega'\in\Sigma_A$ we have

\[ \chi(\omega')\in V^u(x)\cap R(x) \text{ whenever } \pi^-\omega' = \pi^-\omega, \qquad \chi(\omega')\in V^s(x)\cap R(x) \text{ whenever } \pi^+\omega' = \pi^+\omega. \]

Thus, if $\omega = (\cdots i_{-1}i_0i_1\cdots)$, then via the coding map $\chi$ the set $V^u(x)\cap R(x)$ can be identified with the cylinder

\[ C_{i_0}^+ = \{(j_0j_1\cdots)\in\Sigma_A^+ : j_0 = i_0\}\subset\Sigma_A^+, \tag{11} \]

and the set $V^s(x)\cap R(x)$ can be identified with the cylinder

\[ C_{i_0}^- = \{(\cdots j_{-1}j_0)\in\Sigma_A^- : j_0 = i_0\}\subset\Sigma_A^-. \tag{12} \]
We use a construction described in [3, Lemma 1.6] that, given a Hölder continuous function $\varphi\colon\Sigma_A\to\mathbb R$, yields Hölder continuous functions $\varphi^u$ and $\varphi^s$ cohomologous to $\varphi$, depending respectively only on the symbolic future and on the symbolic past. We formulate this statement explicitly for the Hölder continuous functions $\varphi_i^\pm$, $\psi_i^\pm$, $\log\|df|E^u\|$, and $\log\|df^{-1}|E^s\|$ (recall that $f$ is of class $C^{1+\varepsilon}$, and that the stable and unstable distributions are always Hölder continuous on the base point).

Lemma 1 (see [3, Lemma 1.6]). For each $i = 1,\dots,\kappa$ there exist Hölder continuous functions $\varphi_i^u,\psi_i^u,d^u\colon\Sigma_A^+\to\mathbb R$ and $\varphi_i^s,\psi_i^s,d^s\colon\Sigma_A^-\to\mathbb R$, and continuous functions $g_i^\pm,h_i^\pm,\rho^\pm\colon\Sigma_A\to\mathbb R$ such that

\[ \begin{aligned} \varphi_i^+\circ\chi &= \varphi_i^u\circ\pi^+ + g_i^+ - g_i^+\circ\sigma,\\ \psi_i^+\circ\chi &= \psi_i^u\circ\pi^+ + h_i^+ - h_i^+\circ\sigma,\\ \log\|df|E^u\|\circ\chi &= d^u\circ\pi^+ + \rho^+ - \rho^+\circ\sigma, \end{aligned} \]

and

\[ \begin{aligned} \varphi_i^-\circ\chi &= \varphi_i^s\circ\pi^- + g_i^- - g_i^-\circ\sigma,\\ \psi_i^-\circ\chi &= \psi_i^s\circ\pi^- + h_i^- - h_i^-\circ\sigma,\\ \log\|df^{-1}|E^s\|\circ\chi &= d^s\circ\pi^- + \rho^- - \rho^-\circ\sigma. \end{aligned} \]

We now initiate the process of construction of the measure $\nu$. Set

\[ d^+ = \dim_H K_\alpha^+ - t_s \quad\text{and}\quad d^- = \dim_H K_\beta^- - t_u. \]

Note that by (6) we have

\[ d^+ + d^- = \dim_H K_\alpha^+ + \dim_H K_\beta^- - \dim_H\Lambda. \tag{13} \]

Set now

\[ \Phi^u = (\varphi_1^u,\dots,\varphi_\kappa^u),\quad \Psi^u = (\psi_1^u,\dots,\psi_\kappa^u),\quad \Phi^s = (\varphi_1^s,\dots,\varphi_\kappa^s),\quad \Psi^s = (\psi_1^s,\dots,\psi_\kappa^s). \]

Given $q^\pm\in\mathbb R^\kappa$, we define (Hölder continuous) functions $a^u\colon\Sigma_A^+\to\mathbb R$ and $b^s\colon\Sigma_A^-\to\mathbb R$ by

\[ a^u = \langle q^+,\Phi^u-\alpha*\Psi^u\rangle - d^+d^u, \qquad b^s = \langle q^-,\Phi^s-\beta*\Psi^s\rangle - d^-d^s, \tag{14} \]

where $\langle\cdot,\cdot\rangle$ denotes the standard inner product in $\mathbb R^\kappa$ and

\[ \alpha*(\varphi_1,\dots,\varphi_\kappa) = (\alpha_1\varphi_1,\dots,\alpha_\kappa\varphi_\kappa). \]

Since $f$ is topologically mixing on $\Lambda$ (and hence the same happens with $f^{-1}$), there exist a unique equilibrium measure $\mu^u$ of $a^u$ on $\Sigma_A^+$ (with respect to $\sigma^+$), and a unique equilibrium measure $\mu^s$ of $b^s$ on $\Sigma_A^-$ (with respect to $\sigma^-$). Note that both $\mu^u$ and $\mu^s$ are Gibbs measures. Since $\alpha\in\operatorname{int}P^+(\mathcal M)$ and $\beta\in\operatorname{int}P^-(\mathcal M)$, the following statement is an immediate consequence of Theorem 8 in [2] (see also the discussion after that theorem).
Lemma 2. There exist vectors $q^\pm\in\mathbb R^\kappa$ such that the corresponding measures $\mu^u$ and $\mu^s$ satisfy $P_{\sigma^+}(a^u) = P_{\sigma^-}(b^s) = 0$ and

\[ \int_{\Sigma_A^+}\Phi^u\,d\mu^u = \alpha*\int_{\Sigma_A^+}\Psi^u\,d\mu^u, \qquad \int_{\Sigma_A^-}\Phi^s\,d\mu^s = \beta*\int_{\Sigma_A^-}\Psi^s\,d\mu^s. \tag{15} \]

We also define measures $\nu^u$ and $\nu^s$ on the rectangle $R(x)\subset\Lambda$ by

\[ \nu^u = \mu^u\circ\pi^+\circ\chi^{-1} \quad\text{and}\quad \nu^s = \mu^s\circ\pi^-\circ\chi^{-1}, \]

using the vectors $q^\pm$ in Lemma 2. We finally define the measure $\nu$ on $R(x)$ by $\nu = \nu^u\times\nu^s$. Note that since $\mu^u$ and $\mu^s$ are Gibbs measures, we have (see (11) and (12))

\[ \nu(R(x)) = \mu^u(C_{i_0}^+)\mu^s(C_{i_0}^-) > 0. \]

Lemma 3. There exist constants $\gamma>1$ and $C>0$ such that for every $x\in\Lambda$ and $r>0$ we have

\[ \nu(B(x,\gamma r))\le C\nu(B(x,r)). \tag{16} \]
Proof of the lemma. Consider the Hölder continuous functions $a,b\colon\Lambda\to\mathbb R$ defined by

\[ a = \langle q^+,\Phi^+-\alpha*\Psi^+\rangle - d^+\log\|df|E^u\|, \qquad b = \langle q^-,\Phi^--\beta*\Psi^-\rangle + d^-\log\|df|E^s\|. \]

Note that $\nu^u$ is the unique equilibrium measure of $a$ with respect to $f$, and that $\nu^s$ is the unique equilibrium measure of $b$ with respect to $f^{-1}$. Being equilibrium measures of Hölder continuous functions, each of them has the property in (16) (this is the so-called diametric regularity property; see Propositions 21.4 and 24.1 in [6]), and thus the same happens with their product $\nu$.

We now start to establish the statements in the theorem.

Lemma 4. We have

\[ \liminf_{r\to0}\frac{\log\nu(B(x,r))}{\log r}\ge\dim_H K_\alpha^+ + \dim_H K_\beta^- - \dim_H\Lambda \]

for $\nu$-almost every $x\in\Lambda$.

Proof of the lemma. Using the variational principle for the topological pressure of the functions in (14) it follows from Lemma 2 that

\[ \frac{h_{\mu^u}(\sigma^+)}{\int_{\Sigma_A^+}d^u\,d\mu^u} = d^+ \quad\text{and}\quad \frac{h_{\mu^s}(\sigma^-)}{\int_{\Sigma_A^-}d^s\,d\mu^s} = d^-, \tag{17} \]

where $h_\mu$ denotes the Kolmogorov–Sinai entropy with respect to the measure $\mu$. By the Shannon–McMillan–Breiman theorem and the Birkhoff ergodic theorem, it follows from (17) that for every $\varepsilon>0$, and for $\mu^u$-almost every $\omega^+\in C_{i_0}^+$ and $\mu^s$-almost every $\omega^-\in C_{i_0}^-$, there exists $s(\omega)\in\mathbb N$ such that for each $n,m>s(\omega)$ we have

\[ d^+-\varepsilon < -\frac{\log\mu^u(C^+_{i_0\cdots i_n})}{\sum_{k=0}^nd^u((\sigma^+)^k\omega^+)} < d^++\varepsilon, \]

and

\[ d^--\varepsilon < -\frac{\log\mu^s(C^-_{i_{-m}\cdots i_0})}{\sum_{k=0}^md^s((\sigma^-)^k\omega^-)} < d^-+\varepsilon. \]

Given $r>0$ sufficiently small we now choose $n = n(\omega,r)$ and $m = m(\omega,r)$ such that

\[ -\sum_{k=0}^nd^u((\sigma^+)^k\omega^+) > \log r, \qquad -\sum_{k=0}^{n+1}d^u((\sigma^+)^k\omega^+)\le\log r, \tag{18} \]

and

\[ -\sum_{k=0}^md^s((\sigma^-)^k\omega^-) > \log r, \qquad -\sum_{k=0}^{m+1}d^s((\sigma^-)^k\omega^-)\le\log r. \tag{19} \]

It follows from the construction of the Markov partitions (recall that the stable and unstable distributions have dimension one; see [6] for details) that there exists a constant $\rho>1$ independent of $x = \chi(\omega)$ and $r$ such that

\[ B(y,r/\rho)\cap\Lambda\subset\chi(C_{i_{-m}\cdots i_n})\subset B(x,\rho r) \tag{20} \]

for some point $y\in\chi(C_{i_{-m}\cdots i_n})$, with $\omega = (\cdots i_{-1}i_0i_1\cdots)$. Furthermore, by Lemma 3 there exists a constant $c>0$ independent of $x$ and $r$ such that

\[ \nu(B(y,2\rho r))\le c\,\nu(B(y,r/\rho)). \]

It follows from (20) that

\[ \begin{aligned} \nu(B(x,r)) &\le \nu(B(y,2\rho r)) \le c\,\nu(B(y,r/\rho)) \le c\,\nu(\chi(C_{i_{-m}\cdots i_n})) = c\,\mu^u(C^+_{i_0\cdots i_n})\mu^s(C^-_{i_{-m}\cdots i_0})\\ &< c\exp\Bigl[(-d^++\varepsilon)\sum_{k=0}^nd^u((\sigma^+)^k\omega^+)\Bigr]\times\exp\Bigl[(-d^-+\varepsilon)\sum_{k=0}^md^s((\sigma^-)^k\omega^-)\Bigr]\\ &\le c\exp[(\log r+\|d^u\|_\infty)(d^+-\varepsilon) + (\log r+\|d^s\|_\infty)(d^--\varepsilon)], \end{aligned} \]

and hence

\[ \liminf_{r\to0}\frac{\log\nu(B(x,r))}{\log r}\ge d^++d^--2\varepsilon, \]

for $\nu$-almost every point $x\in\Lambda$. In view of (13), the arbitrariness of $\varepsilon$ implies the desired result.
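Let now $\Sigma_{\alpha\beta}\subset\Sigma_A$ be the set of points $\omega\in\Sigma_A$ such that for $i = 1,\dots,\kappa$ we have

\[ \lim_{n\to\infty}\frac{\sum_{k=0}^n\varphi_i^u((\sigma^+)^k\omega^+)}{\sum_{k=0}^n\psi_i^u((\sigma^+)^k\omega^+)} = \alpha_i, \qquad \lim_{n\to\infty}\frac{\sum_{k=0}^n\varphi_i^s((\sigma^-)^k\omega^-)}{\sum_{k=0}^n\psi_i^s((\sigma^-)^k\omega^-)} = \beta_i. \]

Lemma 5. The inequality in (10) holds for every $x\in\chi(\Sigma_{\alpha\beta})$.

Proof of the lemma. Given $\varepsilon>0$ and $\omega\in\Sigma_{\alpha\beta}$ there exists $r(\omega)\in\mathbb N$ such that for every $n>r(\omega)$ we have

\[ \Bigl|\sum_{k=0}^n\langle q^+,(\Phi^u-\alpha*\Psi^u)((\sigma^+)^k\omega^+)\rangle\Bigr| < \varepsilon n\|\langle q^+,\Psi^u\rangle\|_\infty, \tag{21} \]

and

\[ \Bigl|\sum_{k=0}^n\langle q^-,(\Phi^s-\beta*\Psi^s)((\sigma^-)^k\omega^-)\rangle\Bigr| < \varepsilon n\|\langle q^-,\Psi^s\rangle\|_\infty. \tag{22} \]

Since $\mu^u$ and $\mu^s$ are Gibbs measures, in view of (15) there exists a constant $D>0$ such that for every $\omega^+\in C_{i_0}^+$, $\omega^-\in C_{i_0}^-$, and $n,m\in\mathbb N$ we have

\[ D^{-1} < \frac{\mu^u(C^+_{i_0\cdots i_n})}{\exp\sum_{k=0}^na^u((\sigma^+)^k\omega^+)} < D, \]

and

\[ D^{-1} < \frac{\mu^s(C^-_{i_{-m}\cdots i_0})}{\exp\sum_{k=0}^mb^s((\sigma^-)^k\omega^-)} < D. \]

Combining these inequalities with (21)–(22) we find that

\[ \mu^u(C^+_{i_0\cdots i_n}) > D^{-1}\exp\Bigl[-d^+\sum_{k=0}^nd^u((\sigma^+)^k\omega^+) - \varepsilon n\|\langle q^+,\Psi^u\rangle\|_\infty\Bigr], \tag{23} \]

and

\[ \mu^s(C^-_{i_{-m}\cdots i_0}) > D^{-1}\exp\Bigl[-d^-\sum_{k=0}^md^s((\sigma^-)^k\omega^-) - \varepsilon m\|\langle q^-,\Psi^s\rangle\|_\infty\Bigr]. \tag{24} \]

As in the proof of Lemma 4, it follows from the construction of the Markov partitions that there exists a constant $\rho>0$ independent of $x = \chi(\omega)$ and $r$ such that $B(x,\rho r)\supset\chi(C_{i_{-m}\cdots i_n})$, with $n = n(\omega,r)$ and $m = m(\omega,r)$ as in (18)–(19). Given $\varepsilon>0$ and $\omega\in\Sigma_{\alpha\beta}$ we now take $r>0$ sufficiently small so that $n(\omega,r)>r(\omega)$ and $m(\omega,r)>r(\omega)$ (the uniform hyperbolicity of $f$ on $\Lambda$ ensures that this is always possible). Then, combining (23)–(24) with (18)–(19) we obtain

\[ \begin{aligned} \nu(B(x,\rho r)) &\ge \nu(\chi(C_{i_{-m}\cdots i_n})) = \mu^u(C^+_{i_0\cdots i_n})\mu^s(C^-_{i_{-m}\cdots i_0})\\ &\ge D^{-2}r^{d^++d^-}\exp(-\varepsilon n\|\langle q^+,\Psi^u\rangle\|_\infty - \varepsilon m\|\langle q^-,\Psi^s\rangle\|_\infty), \end{aligned} \]

for all sufficiently small $r>0$. Note that by (18)–(19) we have

\[ -n\inf d^u > \log r, \qquad -m\inf d^s > \log r. \]

Therefore, for every $x = \chi(\omega)$ with $\omega\in\Sigma_{\alpha\beta}$ we obtain

\[ \limsup_{r\to0}\frac{\log\nu(B(x,r))}{\log r}\le d^++d^-+\varepsilon\left(\frac{\|\langle q^+,\Psi^u\rangle\|_\infty}{\inf d^u} + \frac{\|\langle q^-,\Psi^s\rangle\|_\infty}{\inf d^s}\right). \]

Since $\varepsilon$ is arbitrary, for every $x\in\chi(\Sigma_{\alpha\beta})$ we have

\[ \limsup_{r\to0}\frac{\log\nu(B(x,r))}{\log r}\le d^++d^-. \]

In view of (13) this gives the desired statement.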
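The cylinder estimates in Lemmas 4 and 5 can be checked numerically in the simplest situation. The sketch below assumes the linear horseshoe of Sect. 1.2 coded by two symbols, where the image of a cylinder fixed by n future and n past symbols is, up to bounded factors, a square of side 3^{-n}. It also assumes, purely for illustration, that the two one-sided measures are Bernoulli measures with weights (1 - alpha, alpha) and (1 - beta, beta). Along a sequence whose future and past have symbol frequencies alpha and beta, the ratio log nu(C) / log r should then equal (H(alpha) + H(beta)) / log 3, where H is the binary entropy. The helper names `binary_entropy` and `log_bernoulli` are hypothetical.

```python
import math
from itertools import cycle, islice


def binary_entropy(p):
    """H(p) = -p log p - (1 - p) log(1 - p), natural logarithm."""
    return -sum(q * math.log(q) for q in (p, 1.0 - p) if q > 0.0)


def log_bernoulli(symbols, p):
    """log of the Bernoulli(1 - p, p) measure of the cylinder given by `symbols`."""
    ones = sum(symbols)
    return ones * math.log(p) + (len(symbols) - ones) * math.log(1.0 - p)


alpha, beta = 1 / 3, 1 / 2
n = 3000
future = list(islice(cycle([1, 0, 0]), n))   # frequency of the symbol 1 equals alpha
past = list(islice(cycle([1, 0]), n))        # frequency of the symbol 1 equals beta

# Product measure of the cylinder fixed by n future and n past symbols,
# whose image has diameter comparable to r = 3**(-n).
log_nu = log_bernoulli(future, alpha) + log_bernoulli(past, beta)
log_r = -n * math.log(3.0)

local_dim = log_nu / log_r
predicted = (binary_entropy(alpha) + binary_entropy(beta)) / math.log(3.0)
print(local_dim, predicted)
```

In this toy model the two printed numbers agree, matching the value of the sum of the two exponents in (13) when the expansion and contraction rates are both log 3.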
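Lemma 6. We have $\chi(\Sigma_{\alpha\beta}) = K_\alpha^+\cap K_\beta^-$.

Proof of the lemma. It follows from Lemma 1 that

\[ \begin{aligned} \varphi_i^+(f^k(\chi(\omega))) = \varphi_i^+(\chi(\sigma^k\omega)) &= \varphi_i^u(\pi^+(\sigma^k\omega)) + g_i^+(\sigma^k\omega) - g_i^+(\sigma^{k+1}\omega)\\ &= \varphi_i^u((\sigma^+)^k\omega^+) + g_i^+(\sigma^k\omega) - g_i^+(\sigma^{k+1}\omega), \end{aligned} \]

with analogous identities for the functions $\psi_i^+$, $\varphi_i^-$, and $\psi_i^-$. Therefore,

\[ \frac{\sum_{k=0}^{n-1}\varphi_i^+(f^k(\chi(\omega)))}{\sum_{k=0}^{n-1}\psi_i^+(f^k(\chi(\omega)))} = \frac{\sum_{k=0}^{n-1}\varphi_i^u((\sigma^+)^k\omega^+) + g_i^+(\omega) - g_i^+(\sigma^n\omega)}{\sum_{k=0}^{n-1}\psi_i^u((\sigma^+)^k\omega^+) + h_i^+(\omega) - h_i^+(\sigma^n\omega)}, \tag{25} \]

and

\[ \frac{\sum_{k=0}^{n-1}\varphi_i^-(f^{-k}(\chi(\omega)))}{\sum_{k=0}^{n-1}\psi_i^-(f^{-k}(\chi(\omega)))} = \frac{\sum_{k=0}^{n-1}\varphi_i^s((\sigma^-)^k\omega^-) + g_i^-(\omega) - g_i^-(\sigma^n\omega)}{\sum_{k=0}^{n-1}\psi_i^s((\sigma^-)^k\omega^-) + h_i^-(\omega) - h_i^-(\sigma^n\omega)}. \tag{26} \]

We now observe that

\[ \sum_{k=0}^{n-1}\psi_i^u((\sigma^+)^k\omega^+)\ge n\inf\psi_i^+ - 2\|h_i^+\|_\infty, \qquad \sum_{k=0}^{n-1}\psi_i^s((\sigma^-)^k\omega^-)\ge n\inf\psi_i^- - 2\|h_i^-\|_\infty. \]

Since $\psi_i^\pm>0$ for $i = 1,\dots,\kappa$, these inequalities ensure that the limits

\[ \lim_{n\to\infty}\frac{\sum_{k=0}^{n-1}\varphi_i^+(f^k(\chi(\omega)))}{\sum_{k=0}^{n-1}\psi_i^+(f^k(\chi(\omega)))}, \qquad \lim_{n\to\infty}\frac{\sum_{k=0}^{n-1}\varphi_i^-(f^{-k}(\chi(\omega)))}{\sum_{k=0}^{n-1}\psi_i^-(f^{-k}(\chi(\omega)))} \]

exist if and only if the limits

\[ \lim_{n\to\infty}\frac{\sum_{k=0}^{n-1}\varphi_i^u((\sigma^+)^k\omega^+)}{\sum_{k=0}^{n-1}\psi_i^u((\sigma^+)^k\omega^+)}, \qquad \lim_{n\to\infty}\frac{\sum_{k=0}^{n-1}\varphi_i^s((\sigma^-)^k\omega^-)}{\sum_{k=0}^{n-1}\psi_i^s((\sigma^-)^k\omega^-)} \]

exist, in which case we have

\[ \lim_{n\to\infty}\frac{\sum_{k=0}^{n-1}\varphi_i^+(f^k(\chi(\omega)))}{\sum_{k=0}^{n-1}\psi_i^+(f^k(\chi(\omega)))} = \lim_{n\to\infty}\frac{\sum_{k=0}^{n-1}\varphi_i^u((\sigma^+)^k\omega^+)}{\sum_{k=0}^{n-1}\psi_i^u((\sigma^+)^k\omega^+)} \]

and

\[ \lim_{n\to\infty}\frac{\sum_{k=0}^{n-1}\varphi_i^-(f^{-k}(\chi(\omega)))}{\sum_{k=0}^{n-1}\psi_i^-(f^{-k}(\chi(\omega)))} = \lim_{n\to\infty}\frac{\sum_{k=0}^{n-1}\varphi_i^s((\sigma^-)^k\omega^-)}{\sum_{k=0}^{n-1}\psi_i^s((\sigma^-)^k\omega^-)}. \]

In particular $\omega\in\Sigma_{\alpha\beta}$ if and only if $\chi(\omega)\in K_\alpha^+\cap K_\beta^-$. This shows that $\chi(\Sigma_{\alpha\beta}) = K_\alpha^+\cap K_\beta^-$.

Combining the above lemmas we readily obtain the statement in the theorem.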
We call each measure $\nu$ with the properties in Theorem 2 a full measure for the set $K_\alpha^+\cap K_\beta^-$. We note that the particular measure $\nu$ constructed in the proof of Theorem 2 is never $f$-invariant (since $\mu^u$ is $\sigma^+$-invariant while $\mu^s$ is $\sigma^-$-invariant).

3.4. Formula for the spectrum. We now use the former results to obtain a formula for the spectrum $D$.

Theorem 3. Let $\Lambda$ be a compact locally maximal hyperbolic set for a $C^{1+\varepsilon}$ diffeomorphism on a smooth surface. Given pairs of functions $(\Phi^\pm,\Psi^\pm)\in H(\Lambda)$ with $\Psi^\pm>0$, if $\alpha\in\operatorname{int}P^+(\mathcal M)$ and $\beta\in\operatorname{int}P^-(\mathcal M)$, then the set $K_\alpha^+\cap K_\beta^-$ is dense in $\Lambda$ and

\[ D(\alpha,\beta) = \dim_H K_\alpha^+ + \dim_H K_\beta^- - \dim_H\Lambda. \tag{27} \]

Proof. It follows easily from the construction of the functions $\Phi^u$, $\Psi^u$, $\Phi^s$, and $\Psi^s$ that the sets $K_\alpha^+$ and $K_\beta^-$ are dense in $\Lambda$ (we note that by Theorem 2 they are nonempty). Namely, by Lemma 1 (see also (25) and (26)) we know that the ratios of Birkhoff averages of these functions only depend on the symbolic future (in the case of $K_\alpha^+$) or on the symbolic past (in the case of $K_\beta^-$). The density follows immediately from the fact that

\[ \overline{\bigcup_{k\in\mathbb N}(\sigma^+)^{-k}\omega^+} = \Sigma_A^+ \quad\text{and}\quad \overline{\bigcup_{k\in\mathbb N}(\sigma^-)^{-k}\omega^-} = \Sigma_A^- \]

for any points $\omega^+\in\Sigma_A^+$ and $\omega^-\in\Sigma_A^-$.

Let now $\nu$ be the measure constructed in Theorem 2. By (9) we have (see for example Theorem 7.1 in [6])

\[ \dim_H\nu = \dim_H K_\alpha^+ + \dim_H K_\beta^- - \dim_H\Lambda, \]

where $\dim_H\nu := \inf\{\dim_H Z : \nu(Z) = 1\}$ is the Hausdorff dimension of the measure $\nu$. Since $\nu(K_\alpha^+\cap K_\beta^-) = 1$ we obtain

\[ \dim_H(K_\alpha^+\cap K_\beta^-)\ge\dim_H\nu = \dim_H K_\alpha^+ + \dim_H K_\beta^- - \dim_H\Lambda. \tag{28} \]

For the reverse inequality, we note that it follows readily from (10) that

\[ \dim_H(K_\alpha^+\cap K_\beta^-)\le\dim_H K_\alpha^+ + \dim_H K_\beta^- - \dim_H\Lambda \]

(see for example Theorem 7.2 in [6]).
An alternative proof of the inequality in (28) and which does not use the measure ν, is the following. We first note that given x ∈ K α+ ∩ K β− and a sufficiently small open neighborhood U of x, we have K α+ ∩ U = (V s (y) ∩ U ) and K β− ∩ U = (V u (z) ∩ U ). z∈K β− ∩U
y∈K α+ ∩U
Therefore, K α+
∩
K β−
∩U =
y∈K α+ ∩U
V (y) ∩ s
V (z) ∩ U. u
z∈K β− ∩U
In a similar manner to that in the proof of Theorem 1, since the stable and unstable holonomies are Lipschitz, this identity ensures that in the neighborhood U the set K α+ ∩ K β− is taken by a Lipschitz map with Lipschitz inverse onto the product (K α+ ∩ V u (x)) × (K β− ∩ V s (x)). Therefore, dim H (K α+ ∩ K β− ) ≥ dim H (K α+ ∩ V u (x)) + dim H (K β− ∩ V s (x)) = dim H K α+ + dim H K β− − dim H .
(29)
Unfortunately this approach does not provide an upper bound for $D(\alpha, \beta)$. In order to explain the difficulty, we first observe that
\[ \overline{K_\alpha^+} = \Lambda \quad\text{and}\quad \overline{K_\beta^-} = \Lambda, \tag{30} \]
whenever the level sets $K_\alpha^+$ and $K_\beta^-$ are nonempty. This follows from the corresponding statement at the level of symbolic dynamics. Simply note that $W^s(x)$ (respectively $W^u(x)$) is the set of points having eventually the same symbolic future (respectively the same symbolic past) as the point $x$, and thus these points have an arbitrary symbolic past (respectively symbolic future). The identities in (30) are an immediate consequence of this observation. It is the noncompactness of the sets $K_\alpha^+$ and $K_\beta^-$ that causes the difficulty. More precisely, it follows from (30) that $\overline{K_\alpha^+ \cap V^u(x)} = \Lambda \cap V^u(x)$ and $\overline{K_\beta^- \cap V^s(x)} = \Lambda \cap V^s(x)$, and thus
\[ \dim_B (K_\alpha^+ \cap V^u(x)) = \dim_B (\Lambda \cap V^u(x)) = \dim_H (\Lambda \cap V^u(x)), \]
\[ \dim_B (K_\beta^- \cap V^s(x)) = \dim_B (\Lambda \cap V^s(x)) = \dim_H (\Lambda \cap V^s(x)). \]
Multifractal Structure of Two-Dimensional Horseshoes
It follows from the general inequality $\dim_H (X \cap Y) \le \dim_H X + \dim_B Y$ that the best simple-minded upper estimate for $D(\alpha, \beta)$ is
\[ D(\alpha, \beta) \le \min\bigl\{ \dim_H (K_\alpha^+ \cap V^u(x)) + \dim_H (\Lambda \cap V^s(x)),\; \dim_H (\Lambda \cap V^u(x)) + \dim_H (K_\beta^- \cap V^s(x)) \bigr\} = \min\{\dim_H K_\alpha^+, \dim_H K_\beta^-\}. \tag{31} \]
But this inequality also follows trivially from the inclusions $K_\alpha^+ \cap K_\beta^- \subset K_\alpha^+, K_\beta^-$. On the other hand, it follows from the theory of multifractal analysis (see [2]), provided that some cohomology relations are excluded, that there exist uncountably many values of $\alpha$ and $\beta$ such that $\dim_H K_\alpha^+ < \dim_H \Lambda$ and $\dim_H K_\beta^- < \dim_H \Lambda$. For these values of $\alpha$ and $\beta$ the upper bound in (31) is strictly larger than the lower bound in (29).

3.5. Conditional variational principle. We describe here a conditional variational principle for the spectrum $D$. We recall that $h_\mu(f)$ denotes the Kolmogorov–Sinai entropy of an $f$-invariant measure $\mu$ on $\Lambda$.

Theorem 4. Let $\Lambda$ be a compact locally maximal hyperbolic set for a $C^{1+\varepsilon}$ diffeomorphism on a smooth surface. Given pairs of functions $(\Phi^\pm, \Psi^\pm) \in H(\Lambda)$ with $\Psi^\pm > 0$, the following properties hold:
1. if $\alpha \in \operatorname{int} P^+(\mathcal{M})$ and $\beta \in \operatorname{int} P^-(\mathcal{M})$, then
\[ D(\alpha, \beta) = \max\left\{ \frac{h_\mu(f)}{-\int_\Lambda \log \|df|E^s\| \, d\mu} : \mu \in \mathcal{M} \text{ and } P^+(\mu) = \alpha \right\} + \max\left\{ \frac{h_\mu(f)}{\int_\Lambda \log \|df|E^u\| \, d\mu} : \mu \in \mathcal{M} \text{ and } P^-(\mu) = \beta \right\}; \]
2. the spectrum $D$ is analytic on $\operatorname{int} P^+(\mathcal{M}) \times \operatorname{int} P^-(\mathcal{M})$.

Proof. In view of Theorem 3 (see (27)), this is an immediate consequence of Theorems 8 and 13 in [2].

Let us consider as an illustration the particular case when the Birkhoff averages are obtained from the Lyapunov exponents. Let $\mu \in \mathcal{M}$. By Birkhoff's ergodic theorem, for $\mu$-almost every $x \in \Lambda$ there exist the limits
\[ \lambda_s^+(x) = \lim_{n\to+\infty} \frac{1}{n} \log \|d_x f^n|E^s\|, \qquad \lambda_u^+(x) = \lim_{n\to+\infty} \frac{1}{n} \log \|d_x f^n|E^u\|, \]
and
\[ \lambda_s^-(x) = \lim_{n\to-\infty} \frac{1}{|n|} \log \|d_x f^n|E^s\|, \qquad \lambda_u^-(x) = \lim_{n\to-\infty} \frac{1}{|n|} \log \|d_x f^n|E^u\|. \]
These are respectively the values of the forward and backward Lyapunov exponents of $f$ at the point $x$. Set now
\[ P^\pm(\mu) = \left( \int_\Lambda \lambda_s^\pm \, d\mu, \; \int_\Lambda \lambda_u^\pm \, d\mu \right). \]
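It also follows from Birkhoff's ergodic theorem that $P(\mu) := P^+(\mu) = P^-(\mu)$ for every measure $\mu \in \mathcal{M}$. By Theorem 4, for each $\alpha = (\alpha_1, \alpha_2)$, $\beta = (\beta_1, \beta_2) \in \operatorname{int} P(\mathcal{M})$ we have
\[ D(\alpha, \beta) = \max\left\{ \frac{h_\mu(f)}{-\int_\Lambda \lambda_s^+ \, d\mu} : P(\mu) = \alpha \right\} + \max\left\{ \frac{h_\mu(f)}{-\int_\Lambda \lambda_u^- \, d\mu} : P(\mu) = \beta \right\} = -\frac{1}{\alpha_1} \max\{h_\mu(f) : P(\mu) = \alpha\} - \frac{1}{\beta_2} \max\{h_\mu(f) : P(\mu) = \beta\}. \]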
References
1. Barreira, L.: Hyperbolicity and recurrence in dynamical systems: a survey of recent results. Resenhas IME-USP 5, 171–230 (2002)
2. Barreira, L., Saussol, B., Schmeling, J.: Higher-dimensional multifractal analysis. J. Math. Pures Appl. 81, 67–91 (2002)
3. Bowen, R.: Equilibrium States and the Ergodic Theory of Anosov Diffeomorphisms. Lect. Notes in Math. 470, Berlin Heidelberg-New York: Springer, 1975
4. McCluskey, H., Manning, A.: Hausdorff dimension for horseshoes. Ergodic Theory Dynam. Systems 3, 251–260 (1983)
5. Palis, J., Viana, M.: On the continuity of Hausdorff dimension and limit capacity for horseshoes. In: Bamón, R., Labarca, R., Palis, J. (eds.), Dynamical Systems (Valparaiso, 1986), Lect. Notes in Math. 1331, Berlin Heidelberg-New York: Springer, 1988, pp. 150–160
6. Pesin, Ya.: Dimension Theory in Dynamical Systems. Contemporary Views and Applications. Chicago Lectures in Mathematics, University of Chicago Press, 1997

Communicated by P. Sarnak
Commun. Math. Phys. 266, 471–497 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0047-8
Communications in
Mathematical Physics
Conservative Solutions to a Nonlinear Variational Wave Equation
Alberto Bressan, Yuxi Zheng
Department of Mathematics, The Pennsylvania State University, University Park, PA 16802, USA. E-mail: [email protected]; [email protected]
Received: 30 August 2005 / Accepted: 13 February 2006
Published online: 25 May 2006 – © Springer-Verlag 2006
Abstract: We establish the existence of a conservative weak solution to the Cauchy problem for the nonlinear variational wave equation $u_{tt} - c(u)(c(u)u_x)_x = 0$, for initial data of finite energy. Here $c(\cdot)$ is any smooth function with uniformly positive bounded values.

1. Introduction

We are interested in the Cauchy problem
\[ u_{tt} - c(u)\bigl(c(u)u_x\bigr)_x = 0, \tag{1.1} \]
with initial data
\[ u(0, x) = u_0(x), \qquad u_t(0, x) = u_1(x). \tag{1.2} \]
Throughout the following, we assume that $c : \mathbb{R} \to \mathbb{R}_+$ is a smooth, bounded, uniformly positive function. Even for smooth initial data, it is well known that the solution can lose regularity in finite time ([12]). It is thus of interest to study whether the solution can be extended beyond the time when a singularity appears. This is indeed the main concern of the present paper. In ([5]) we considered the related equation
\[ u_t + f(u)_x = \frac{1}{2} \int_0^x f''(u)\, u_x^2 \, dx \tag{1.3} \]
and constructed a semigroup of solutions, depending continuously on the initial data. Here we establish similar results for the nonlinear wave equation (1.1). By introducing new sets of dependent and independent variables, we show that the solution to the Cauchy problem can be obtained as the fixed point of a contractive transformation. Our main result can be stated as follows.
Theorem 1. Let $c : \mathbb{R} \to [\kappa^{-1}, \kappa]$ be a smooth function, for some $\kappa > 1$. Assume that the initial data $u_0$ in (1.2) is absolutely continuous, and that $(u_0)_x \in L^2$, $u_1 \in L^2$. Then the Cauchy problem (1.1)–(1.2) admits a weak solution $u = u(t, x)$, defined for all $(t, x) \in \mathbb{R} \times \mathbb{R}$. In the $t$-$x$ plane, the function $u$ is locally Hölder continuous with exponent 1/2. This solution $t \mapsto u(t, \cdot)$ is continuously differentiable as a map with values in $L^p_{\rm loc}$, for all $1 \le p < 2$. Moreover, it is Lipschitz continuous w.r.t. the $L^2$ distance, i.e.
\[ \bigl\| u(t, \cdot) - u(s, \cdot) \bigr\|_{L^2} \le L\, |t - s| \tag{1.4} \]
for all $t, s \in \mathbb{R}$. Equation (1.1) is satisfied in integral sense, i.e.
\[ \iint \Bigl[ \phi_t\, u_t - \bigl(c(u)\phi\bigr)_x\, c(u)\, u_x \Bigr] \, dx\, dt = 0 \tag{1.5} \]
for all test functions $\phi \in C^1_c$. Concerning the initial conditions, the first equality in (1.2) is satisfied pointwise, while the second holds in $L^p_{\rm loc}$ for $p \in [1, 2[$.

Our constructive procedure yields solutions which depend continuously on the initial data. Moreover, the "energy"
\[ \mathcal{E}(t) \doteq \frac{1}{2} \int \Bigl[ u_t^2(t, x) + c^2\bigl(u(t, x)\bigr)\, u_x^2(t, x) \Bigr] \, dx \tag{1.6} \]
remains uniformly bounded. More precisely, one has

Theorem 2. A family of weak solutions to the Cauchy problem (1.1)–(1.2) can be constructed with the following additional properties. For every $t \in \mathbb{R}$ one has
\[ \mathcal{E}(t) \le \mathcal{E}_0 \doteq \frac{1}{2} \int \Bigl[ u_1^2(x) + c^2\bigl(u_0(x)\bigr)\, (u_0)_x^2(x) \Bigr] \, dx. \tag{1.7} \]
Moreover, let a sequence of initial conditions satisfy
\[ \bigl\| (u_0^n)_x - (u_0)_x \bigr\|_{L^2} \to 0, \qquad \bigl\| u_1^n - u_1 \bigr\|_{L^2} \to 0, \]
and $u_0^n \to u_0$ uniformly on compact sets, as $n \to \infty$. Then one has the convergence of the corresponding solutions $u^n \to u$, uniformly on bounded subsets of the $t$-$x$ plane.

It appears in (1.7) that the total energy of our solutions may decrease in time. Yet, we emphasize that our solutions are conservative, in the following sense.

Theorem 3. There exists a continuous family $\{\mu_t;\ t \in \mathbb{R}\}$ of positive Radon measures on the real line with the following properties.
(i) At every time $t$, one has $\mu_t(\mathbb{R}) = \mathcal{E}_0$.
(ii) For each $t$, the absolutely continuous part of $\mu_t$ has density $\frac{1}{2}(u_t^2 + c^2 u_x^2)$ w.r.t. the Lebesgue measure.
(iii) For almost every $t \in \mathbb{R}$, the singular part of $\mu_t$ is concentrated on the set where $c'(u) = 0$.
In other words, the total energy represented by the measure $\mu$ is conserved in time. Occasionally, some of this energy is concentrated on a set of measure zero. At the times $\tau$ when this happens, $\mu_\tau$ has a non-trivial singular part and $\mathcal{E}(\tau) < \mathcal{E}_0$. The condition (iii) puts some restrictions on the set of such times $\tau$. In particular, if $c'(u) \neq 0$ for all $u$, then this set has measure zero. We point out that what we construct is a continuous semigroup of solutions. Uniqueness within a class of conservative solutions at this point is only a conjecture.

The paper is organized as follows. In the next two subsections we briefly discuss the physical motivations for the equation and recall some known results on its solutions. In Sect. 2 we introduce a new set of independent and dependent variables, and derive some identities valid for smooth solutions. We formulate a set of equations in the new variables which is equivalent to (1.1). In the new variables all singularities disappear: smooth initial data lead to globally smooth solutions. In Sect. 3 we use a contractive transformation in a Banach space with a suitable weighted norm to show that there is a unique solution to the set of equations in the new variables, depending continuously on the data $u_0$, $u_1$. Going back to the original variables $u$, $t$, $x$, in Sect. 4 we establish the Hölder continuity of these solutions $u = u(t, x)$, and show that the integral equation (1.5) is satisfied. Moreover, in Sect. 5, we study the conservativeness of the solutions, and establish the energy inequality and the Lipschitz continuity of the map $t \mapsto u(t, \cdot)$. This already yields a proof of Theorem 2. In Sect. 6 we study the continuity of the maps $t \mapsto u_x(t, \cdot)$, $t \mapsto u_t(t, \cdot)$, completing the proof of Theorem 1. The proof of Theorem 3 is given in Sect. 7.

1.1. Physical background of the equation. Equation (1.1) has several physical origins. In the context of nematic liquid crystals, it arises as follows.
The mean orientation of the long molecules in a nematic liquid crystal is described by a director field of unit vectors, $\mathbf{n} \in S^2$, the unit sphere. Associated with the director field $\mathbf{n}$, there is the well-known Oseen-Franck potential energy density $W$ given by
\[ W(\mathbf{n}, \nabla \mathbf{n}) = \alpha\, |\mathbf{n} \times (\nabla \times \mathbf{n})|^2 + \beta\, (\nabla \cdot \mathbf{n})^2 + \gamma\, (\mathbf{n} \cdot \nabla \times \mathbf{n})^2. \tag{1.8} \]
The positive constants $\alpha$, $\beta$, and $\gamma$ are elastic constants of the liquid crystal. For the special case $\alpha = \beta = \gamma$, the potential energy density reduces to $W(\mathbf{n}, \nabla \mathbf{n}) = \alpha\, |\nabla \mathbf{n}|^2$, which is the potential energy density used in harmonic maps into the sphere $S^2$. There are many studies on the constrained elliptic system of equations for $\mathbf{n}$ derived through variational principles from the potential (1.8), and on the parabolic flow associated with it, see [3, 9, 10, 16, 22, 36] and references therein. In the regime in which inertia effects dominate viscosity, however, the propagation of the orientation waves in the director field may then be modeled by the least action principle (Saxton [29])
\[ \frac{\delta}{\delta u} \iint \bigl\{ \partial_t \mathbf{n} \cdot \partial_t \mathbf{n} - W(\mathbf{n}, \nabla \mathbf{n}) \bigr\} \, d\mathbf{x}\, dt = 0, \qquad \mathbf{n} \cdot \mathbf{n} = 1. \tag{1.9} \]
In the special case $\alpha = \beta = \gamma$, this variational principle (1.9) yields the equation for harmonic wave maps from (1+3)-dimensional Minkowski space into the two sphere, see [8, 31, 32] for example. For planar deformations depending on a single space variable $x$, the director field has the special form
\[ \mathbf{n} = \cos u(x, t)\, \mathbf{e}_x + \sin u(x, t)\, \mathbf{e}_y, \]
where the dependent variable $u \in \mathbb{R}^1$ measures the angle of the director field to the $x$-direction, and $\mathbf{e}_x$ and $\mathbf{e}_y$ are the coordinate vectors in the $x$ and $y$ directions, respectively. In this case, the variational principle (1.9) reduces to (1.1) with the wave speed $c$ given specifically by
\[ c^2(u) = \alpha \cos^2 u + \beta \sin^2 u. \tag{1.10} \]
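As a small numerical illustration of (1.10), the following sketch evaluates the wave speed and its derivative, using $2cc' = (\beta - \alpha)\sin 2u$, which follows by differentiating (1.10). The function names and the sample values $\alpha = 1$, $\beta = 4$ are assumptions chosen for illustration; they do not come from the paper.

```python
import math

def wave_speed(u, alpha, beta):
    """c(u) from (1.10): c^2(u) = alpha*cos^2(u) + beta*sin^2(u)."""
    return math.sqrt(alpha * math.cos(u) ** 2 + beta * math.sin(u) ** 2)

def wave_speed_prime(u, alpha, beta):
    """c'(u), from differentiating (1.10): 2 c c' = (beta - alpha) sin(2u)."""
    return (beta - alpha) * math.sin(2.0 * u) / (2.0 * wave_speed(u, alpha, beta))

# Illustrative elastic constants (hypothetical values).
alpha, beta = 1.0, 4.0
samples = [k * math.pi / 200 for k in range(201)]  # u in [0, pi]
c_min = min(wave_speed(u, alpha, beta) for u in samples)
c_max = max(wave_speed(u, alpha, beta) for u in samples)
```

For positive $\alpha$, $\beta$ the speed stays between $\min(\sqrt\alpha, \sqrt\beta)$ and $\max(\sqrt\alpha, \sqrt\beta)$, so this $c$ is smooth, bounded and uniformly positive, and $c'$ changes sign, which is the situation the paper emphasizes.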
Equation (1.1) has interesting connections with long waves on a dipole chain in the continuum limit ([13], Zorski and Infeld [45], and Grundland and Infeld [14]), and in classical field theories and general relativity ([13]). We refer the interested reader to the article [13] for these connections. Equation (1.1) compares interestingly with other well-known equations, e.g.
\[ \partial_t^2 u - \partial_x \bigl[ p(\partial_x u) \bigr] = 0, \tag{1.11} \]
where $p(\cdot)$ is a given function, considered by Lax [25], Klainerman and Majda [24], and Liu [28]. A second related equation is
\[ \partial_t^2 u - c^2(u)\, \Delta u = 0 \tag{1.12} \]
considered by Lindblad [27], who established the global existence of smooth solutions of (1.12) with smooth, small, and spherically symmetric initial data in $\mathbb{R}^3$, where the large-time decay of solutions in high space dimensions is crucial. The multi-dimensional generalization of Eq. (1.1),
\[ \partial_t^2 u - c(u)\, \nabla \cdot \bigl( c(u) \nabla u \bigr) = 0, \tag{1.13} \]
contains a lower order term proportional to $c\,c'\,|\nabla u|^2$, which (1.12) lacks. This lower order term is responsible for the blow-up in the derivatives of $u$. Finally, we note that Eq. (1.1) also looks related to the perturbed wave equation
\[ \partial_t^2 u - \Delta u + f(u, \nabla u, \nabla\nabla u) = 0, \tag{1.14} \]
where $f(u, \nabla u, \nabla\nabla u)$ satisfies an appropriate convexity condition (for example, $f = u^p$ or $f = a(\partial_t u)^2 + b|\nabla u|^2$) or some nullity condition. Blow-up for (1.14) with a convexity condition has been studied extensively, see [2, 11, 15, 20, 21, 26, 30, 33, 34] and Strauss [35] for more references. Global existence and uniqueness of solutions to (1.14) with a nullity condition depend on the nullity structure and large time decay of solutions of the linear wave equation in higher dimensions (see Klainerman and Machedon [23] and references therein). Therefore (1.1), with the dependence of $c(u)$ on $u$ and the possibility of sign changes in $c'(u)$, is familiar yet truly different.

Equation (1.1) has interesting asymptotic uni-directional wave equations. Hunter and Saxton ([17]) derived the asymptotic equations
\[ (u_t + u^n u_x)_x = \frac{1}{2}\, n\, u^{n-1} (u_x)^2 \tag{1.15} \]
for (1.1) via weakly nonlinear geometric optics. We mention that the $x$-derivative of Eq. (1.15) appears in the high-frequency limit of the variational principle for the Camassa-Holm equation ([1, 6, 7]), which arises in the theory of shallow water waves. A construction of global solutions to the Camassa-Holm equations, based on a similar variable transformation as in the present paper, will appear in [4].
1.2. Known results. In [18], Hunter and Zheng established the global existence of weak solutions to (1.15) ($n = 1$) with initial data of bounded variation. It has also been shown that the dissipative solutions are limits of vanishing viscosity. Equation (1.15) ($n = 1$) is also shown to be completely integrable ([19]). In [37–44], Ping Zhang and Zheng study the global existence, uniqueness, and regularity of the weak solutions to (1.15) ($n = 1, 2$) with $L^2$ initial data, and special cases of (1.1). The study of the asymptotic equation has been very beneficial for both the blow-up result [12] and the current global existence result for the wave equation (1.1).

2. Variable Transformations

We start by deriving some identities valid for smooth solutions. Consider the variables
\[ \begin{cases} R \doteq u_t + c(u) u_x, \\ S \doteq u_t - c(u) u_x, \end{cases} \tag{2.1} \]
so that
\[ u_t = \frac{R + S}{2}, \qquad u_x = \frac{R - S}{2c}. \tag{2.2} \]
By (1.1), the variables $R$, $S$ satisfy
\[ \begin{cases} R_t - c R_x = \dfrac{c'}{4c}\, (R^2 - S^2), \\[2mm] S_t + c S_x = \dfrac{c'}{4c}\, (S^2 - R^2). \end{cases} \tag{2.3} \]
Multiplying the first equation in (2.3) by $2R$ and the second one by $2S$, we obtain balance laws for $R^2$ and $S^2$, namely
\[ \begin{cases} (R^2)_t - (c R^2)_x = \dfrac{c'}{2c}\, (R^2 S - R S^2), \\[2mm] (S^2)_t + (c S^2)_x = -\dfrac{c'}{2c}\, (R^2 S - R S^2). \end{cases} \tag{2.4} \]
As a consequence, the following quantities are conserved:
\[ E \doteq \frac{1}{2} \bigl( u_t^2 + c^2 u_x^2 \bigr) = \frac{R^2 + S^2}{4}, \qquad M \doteq -u_t u_x = \frac{S^2 - R^2}{4c}. \tag{2.5} \]
Indeed we have
\[ E_t + (c^2 M)_x = 0, \qquad M_t + E_x = 0. \tag{2.6} \]
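The algebraic identities in (2.5) can be checked pointwise. The sketch below is only a sanity check with hypothetical helper names: it computes $R$, $S$ from given values of $u_t$, $u_x$, $c$ and returns the densities $E$ and $M$ written in terms of $R$ and $S$.

```python
def riemann_variables(u_t, u_x, c):
    """R and S as in (2.1), evaluated at a single point."""
    return u_t + c * u_x, u_t - c * u_x

def energy_and_momentum(u_t, u_x, c):
    """E and M from (2.5), expressed through R and S."""
    R, S = riemann_variables(u_t, u_x, c)
    E = (R * R + S * S) / 4.0
    M = (S * S - R * R) / (4.0 * c)
    return E, M
```

For instance, with `u_t = 0.3`, `u_x = -1.2`, `c = 1.7` the returned pair agrees, up to rounding, with $\frac12(u_t^2 + c^2u_x^2)$ and $-u_tu_x$.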
One can think of $R^2/4$ as the energy density of backward moving waves, and $S^2/4$ as the energy density of forward moving waves. We observe that, if $R$, $S$ satisfy (2.3) and $u$ satisfies (2.1b), then the quantity
\[ F \doteq R - S - 2c u_x \tag{2.7} \]
provides solutions to the linear homogeneous equation
\[ F_t - c F_x = \frac{c'}{2c}\, (R + S + 2c u_x)\, F. \tag{2.8} \]
In particular, if $F \equiv 0$ at time $t = 0$, the same holds for all $t > 0$. Similarly, if $R$, $S$ satisfy (2.3) and $u$ satisfies (2.1a), then the quantity
\[ G \doteq R + S - 2u_t \]
provides solutions to the linear homogeneous equation
\[ G_t + c G_x = \frac{c'}{2c}\, (R + S - 2c u_x)\, G. \]
In particular, if $G \equiv 0$ at time $t = 0$, the same holds for all $t > 0$. We thus have

Proposition 1. Any smooth solution of (1.1) provides a solution to (2.1)–(2.3). Conversely, any smooth solution of (2.1b) and (2.3) (or (2.1a) and (2.3)) which satisfies (2.2b) (or (2.2a)) at time $t = 0$ provides a solution to (1.1).

The main difficulty in the analysis of (1.1) is the possible breakdown of regularity of solutions. Indeed, even for smooth initial data, the quantities $u_x$, $u_t$ can blow up in finite time. This is clear from Eqs. (2.3), where the right hand side grows quadratically; see ([12]) for handling changes of sign of $c'$ and the interaction between $R$ and $S$. To deal with possibly unbounded values of $R$, $S$, it is convenient to introduce a new set of dependent variables:
\[ w \doteq 2 \arctan R, \qquad z \doteq 2 \arctan S, \]
so that
\[ R = \tan \frac{w}{2}, \qquad S = \tan \frac{z}{2}. \tag{2.9} \]
Using (2.3), we obtain the equations
\[ w_t - c\, w_x = \frac{2}{1 + R^2}\, (R_t - c R_x) = \frac{c'}{2c}\, \frac{R^2 - S^2}{1 + R^2}, \tag{2.10} \]
\[ z_t + c\, z_x = \frac{2}{1 + S^2}\, (S_t + c S_x) = \frac{c'}{2c}\, \frac{S^2 - R^2}{1 + S^2}. \tag{2.11} \]
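The point of the substitution (2.9) is that $w$ stays in a bounded range even when $R$ is very large, while rational expressions in $R$ turn into bounded trigonometric ones. The following sketch, with hypothetical function names, evaluates the three elementary identities $1/(1+R^2) = \cos^2\frac w2$, $R/(1+R^2) = \frac12 \sin w$ and $R^2/(1+R^2) = \sin^2\frac w2$ so that they can be compared numerically.

```python
import math

def to_angle(R):
    """w = 2*arctan(R), as in (2.9); the result lies in (-pi, pi)."""
    return 2.0 * math.atan(R)

def rational_forms(R):
    """The three rational expressions in R that appear in Sect. 2."""
    d = 1.0 + R * R
    return 1.0 / d, R / d, R * R / d

def trigonometric_forms(w):
    """The same three quantities written in terms of w."""
    return math.cos(w / 2) ** 2, 0.5 * math.sin(w), math.sin(w / 2) ** 2
```

Comparing `rational_forms(R)` with `trigonometric_forms(to_angle(R))` for a few values of `R` shows agreement up to rounding, including for large `|R|`.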
To reduce the equation to a semilinear one, it is convenient to perform a further change of independent variables (Fig. 1). Consider the equations for the forward and backward characteristics:
\[ \dot{x}^+ = c(u), \qquad \dot{x}^- = -c(u). \tag{2.12} \]
The characteristics passing through the point $(t, x)$ will be denoted by
\[ s \mapsto x^+(s, t, x), \qquad s \mapsto x^-(s, t, x), \]
respectively. As coordinates $(X, Y)$ of a point $(t, x)$ we shall use the quantities
\[ X \doteq \int_0^{x^-(0, t, x)} \bigl( 1 + R^2(0, x) \bigr) \, dx, \qquad Y \doteq \int_{x^+(0, t, x)}^0 \bigl( 1 + S^2(0, x) \bigr) \, dx. \tag{2.13} \]
Of course this implies
\[ X_t - c(u) X_x = 0, \qquad Y_t + c(u) Y_x = 0, \tag{2.14} \]
Fig. 1. Characteristic curves
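\[ (X_x)_t - (c X_x)_x = 0, \qquad (Y_x)_t + (c Y_x)_x = 0. \tag{2.15} \]
Notice that
\[ X_x(t, x) = \lim_{h \to 0} \frac{1}{h} \int_{x^-(0, t, x)}^{x^-(0, t, x+h)} \bigl( 1 + R^2(0, x) \bigr) \, dx, \]
\[ Y_x(t, x) = \lim_{h \to 0} \frac{1}{h} \int_{x^+(0, t, x+h)}^{x^+(0, t, x)} \bigl( 1 + S^2(0, x) \bigr) \, dx. \]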
For any smooth function $f$, using (2.14) one finds
\[ \begin{aligned} f_t + c f_x &= f_X X_t + f_Y Y_t + c f_X X_x + c f_Y Y_x = (X_t + c X_x) f_X = 2c X_x f_X, \\ f_t - c f_x &= f_X X_t + f_Y Y_t - c f_X X_x - c f_Y Y_x = (Y_t - c Y_x) f_Y = -2c Y_x f_Y. \end{aligned} \tag{2.16} \]
We now introduce the further variables
\[ p \doteq \frac{1 + R^2}{X_x}, \qquad q \doteq \frac{1 + S^2}{-Y_x}. \tag{2.17} \]
Notice that the above definitions imply
\[ (X_x)^{-1} = \frac{p}{1 + R^2} = p \cos^2 \frac{w}{2}, \qquad (-Y_x)^{-1} = \frac{q}{1 + S^2} = q \cos^2 \frac{z}{2}. \tag{2.18} \]
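From (2.10)–(2.11), using (2.16)–(2.18), we obtain
\[ 2c\, \frac{1 + S^2}{q}\, w_Y = \frac{c'}{2c}\, \frac{R^2 - S^2}{1 + R^2}, \qquad 2c\, \frac{1 + R^2}{p}\, z_X = \frac{c'}{2c}\, \frac{S^2 - R^2}{1 + S^2}. \]
Therefore
\[ \begin{cases} w_Y = \dfrac{c'}{4c^2}\, \dfrac{R^2 - S^2}{1 + R^2}\, \dfrac{q}{1 + S^2} = \dfrac{c'}{4c^2} \Bigl( \sin^2 \dfrac{w}{2} \cos^2 \dfrac{z}{2} - \sin^2 \dfrac{z}{2} \cos^2 \dfrac{w}{2} \Bigr)\, q, \\[3mm] z_X = \dfrac{c'}{4c^2}\, \dfrac{S^2 - R^2}{1 + S^2}\, \dfrac{p}{1 + R^2} = \dfrac{c'}{4c^2} \Bigl( \sin^2 \dfrac{z}{2} \cos^2 \dfrac{w}{2} - \sin^2 \dfrac{w}{2} \cos^2 \dfrac{z}{2} \Bigr)\, p. \end{cases} \tag{2.19} \]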
Using trigonometric formulas, the above expressions can be further simplified as
\[ w_Y = \frac{c'}{8c^2}\, (\cos z - \cos w)\, q, \qquad z_X = \frac{c'}{8c^2}\, (\cos w - \cos z)\, p. \]
Concerning the quantities $p$, $q$, we observe that
\[ c_x = c'\, u_x = c'\, \frac{R - S}{2c}. \tag{2.20} \]
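Using again (2.18) and (2.15) we compute
\[ \begin{aligned} p_t - c\, p_x &= (X_x)^{-1}\, 2R\, (R_t - c R_x) - (X_x)^{-2} \bigl[ (X_x)_t - c (X_x)_x \bigr] (1 + R^2) \\ &= (X_x)^{-1}\, 2R\, \frac{c'}{4c}\, (R^2 - S^2) - (X_x)^{-2}\, [c_x X_x]\, (1 + R^2) \\ &= \frac{p}{1 + R^2}\, 2R\, (R^2 - S^2)\, \frac{c'}{4c} - \frac{p}{1 + R^2}\, \frac{c'}{2c}\, (R - S)(1 + R^2) \\ &= \frac{c'}{2c}\, \frac{p}{1 + R^2}\, \bigl[ S(1 + R^2) - R(1 + S^2) \bigr], \end{aligned} \]
\[ \begin{aligned} q_t + c\, q_x &= (-Y_x)^{-1}\, 2S\, (S_t + c S_x) - (-Y_x)^{-2} \bigl[ (-Y_x)_t + c (-Y_x)_x \bigr] (1 + S^2) \\ &= (-Y_x)^{-1}\, 2S\, \frac{c'}{4c}\, (S^2 - R^2) - (-Y_x)^{-2}\, [c_x (Y_x)]\, (1 + S^2) \\ &= \frac{q}{1 + S^2}\, 2S\, (S^2 - R^2)\, \frac{c'}{4c} - \frac{q}{1 + S^2}\, \frac{c'}{2c}\, (S - R)(1 + S^2) \\ &= \frac{c'}{2c}\, \frac{q}{1 + S^2}\, \bigl[ R(1 + S^2) - S(1 + R^2) \bigr]. \end{aligned} \]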
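In turn, this yields
\[ \begin{aligned} p_Y &= (p_t - c\, p_x)\, \frac{1}{2c\, (-Y_x)} = (p_t - c\, p_x)\, \frac{1}{2c}\, \frac{q}{1 + S^2} \\ &= \frac{c'}{4c^2}\, \frac{S(1 + R^2) - R(1 + S^2)}{(1 + R^2)(1 + S^2)}\, pq = \frac{c'}{4c^2} \left[ \frac{S}{1 + S^2} - \frac{R}{1 + R^2} \right] pq \\ &= \frac{c'}{4c^2} \left[ \tan \frac{z}{2} \cos^2 \frac{z}{2} - \tan \frac{w}{2} \cos^2 \frac{w}{2} \right] pq = \frac{c'}{4c^2}\, \frac{\sin z - \sin w}{2}\, pq, \end{aligned} \tag{2.21} \]
\[ \begin{aligned} q_X &= (q_t + c\, q_x)\, \frac{1}{2c\, (X_x)} = (q_t + c\, q_x)\, \frac{1}{2c}\, \frac{p}{1 + R^2} \\ &= \frac{c'}{4c^2}\, \frac{R(1 + S^2) - S(1 + R^2)}{(1 + S^2)(1 + R^2)}\, pq = \frac{c'}{4c^2} \left[ \frac{R}{1 + R^2} - \frac{S}{1 + S^2} \right] pq \\ &= \frac{c'}{4c^2} \left[ \tan \frac{w}{2} \cos^2 \frac{w}{2} - \tan \frac{z}{2} \cos^2 \frac{z}{2} \right] pq = \frac{c'}{4c^2}\, \frac{\sin w - \sin z}{2}\, pq. \end{aligned} \tag{2.22} \]
Finally, by (2.16) we have
\[ \begin{aligned} u_X &= (u_t + c u_x)\, \frac{1}{2c}\, \frac{p}{1 + R^2} = \frac{1}{2c}\, \frac{R}{1 + R^2}\, p = \frac{1}{2c}\, \tan \frac{w}{2} \cos^2 \frac{w}{2}\; p, \\ u_Y &= (u_t - c u_x)\, \frac{1}{2c}\, \frac{q}{1 + S^2} = \frac{1}{2c}\, \frac{S}{1 + S^2}\, q = \frac{1}{2c}\, \tan \frac{z}{2} \cos^2 \frac{z}{2}\; q. \end{aligned} \tag{2.23} \]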
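Starting with the nonlinear equation (1.1), using $X$, $Y$ as independent variables we thus obtain a semilinear hyperbolic system with smooth coefficients for the variables $u$, $w$, $z$, $p$, $q$. Using some trigonometric identities, the set of Eqs. (2.19), (2.21)–(2.22) and (2.23) can be rewritten as
\[ \begin{cases} w_Y = \dfrac{c'}{8c^2}\, (\cos z - \cos w)\, q, \\[2mm] z_X = \dfrac{c'}{8c^2}\, (\cos w - \cos z)\, p, \end{cases} \tag{2.24} \]
\[ \begin{cases} p_Y = \dfrac{c'}{8c^2}\, \bigl[ \sin z - \sin w \bigr]\, pq, \\[2mm] q_X = \dfrac{c'}{8c^2}\, \bigl[ \sin w - \sin z \bigr]\, pq, \end{cases} \tag{2.25} \]
\[ \begin{cases} u_X = \dfrac{\sin w}{4c}\, p, \\[2mm] u_Y = \dfrac{\sin z}{4c}\, q. \end{cases} \tag{2.26} \]

Remark 1. The function $u$ can be determined by using either one of the equations in (2.26). One can easily check that the two equations are compatible, namely
\[ \begin{aligned} u_{XY} &= -\frac{\sin w}{4c^2}\, c'\, u_Y\, p + \frac{\cos w}{4c}\, w_Y\, p + \frac{\sin w}{4c}\, p_Y \\ &= \frac{c'}{32\, c^3} \bigl[ -2 \sin w \sin z + \cos w \cos z - \cos^2 w - \sin^2 w + \sin w \sin z \bigr]\, pq \\ &= \frac{c'}{32\, c^3} \bigl[ \cos(w + z) - 1 \bigr]\, pq \\ &= u_{YX}, \end{aligned} \tag{2.27} \]
where the last equality holds because the expression in the third line is symmetric in $w$ and $z$.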
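The compatibility stated in Remark 1 can also be checked numerically. The sketch below is only a pointwise sanity check with hypothetical names: it takes values of $w$, $z$, $p$, $q$, $c$ and $c'$ (written `dc`), evaluates the right-hand sides of (2.24)–(2.26), forms $u_{XY}$ and $u_{YX}$ by the product and chain rules, and also returns the closed form $\frac{c'}{32c^3}[\cos(w+z) - 1]\,pq$ that results from adding the three terms.

```python
import math

def mixed_derivatives(w, z, p, q, c, dc):
    """Return (u_XY, u_YX, closed_form) at one point, using (2.24)-(2.26)."""
    k = dc / (8.0 * c * c)
    w_Y = k * (math.cos(z) - math.cos(w)) * q
    z_X = k * (math.cos(w) - math.cos(z)) * p
    p_Y = k * (math.sin(z) - math.sin(w)) * p * q
    q_X = k * (math.sin(w) - math.sin(z)) * p * q
    u_X = math.sin(w) * p / (4.0 * c)
    u_Y = math.sin(z) * q / (4.0 * c)
    # Differentiate u_X = sin(w) p / (4c) in Y; note (1/c)_Y = -c' u_Y / c^2.
    u_XY = (-math.sin(w) * dc * u_Y * p / (4.0 * c * c)
            + math.cos(w) * w_Y * p / (4.0 * c)
            + math.sin(w) * p_Y / (4.0 * c))
    # Differentiate u_Y = sin(z) q / (4c) in X in the same way.
    u_YX = (-math.sin(z) * dc * u_X * q / (4.0 * c * c)
            + math.cos(z) * z_X * q / (4.0 * c)
            + math.sin(z) * q_X / (4.0 * c))
    closed_form = dc / (32.0 * c ** 3) * (math.cos(w + z) - 1.0) * p * q
    return u_XY, u_YX, closed_form
```

At any sample point the three returned numbers coincide up to rounding, which is what the symmetry of the closed form in $w$ and $z$ predicts.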
Remark 2. We observe that the new system is invariant under translation by $2\pi$ in $w$ and $z$. Actually, it would be more precise to work with the variables $w^\dagger \doteq e^{iw}$ and $z^\dagger \doteq e^{iz}$. However, for simplicity we shall use the variables $w$, $z$, keeping in mind that they range on the unit circle $[-\pi, \pi]$ with endpoints identified.

The system (2.24)–(2.26) must now be supplemented by non-characteristic boundary conditions, corresponding to (1.2). For this purpose, we observe that $u_0$, $u_1$ determine the initial values of the functions $R$, $S$ at time $t = 0$. The line $t = 0$ corresponds to a curve $\gamma$ in the $(X, Y)$ plane, say
\[ Y = \varphi(X), \qquad X \in \mathbb{R}, \]
where $Y \doteq \varphi(X)$ if and only if
\[ X = \int_0^x \bigl( 1 + R^2(0, x) \bigr) \, dx, \qquad Y = -\int_0^x \bigl( 1 + S^2(0, x) \bigr) \, dx \qquad \text{for some } x \in \mathbb{R}. \]
We can use the variable $x$ as a parameter along the curve $\gamma$. The assumptions $u_0 \in H^1$, $u_1 \in L^2$ imply $R, S \in L^2$; to fix the ideas, let
\[ \mathcal{E}_0 \doteq \frac{1}{4} \int \bigl[ R^2(0, x) + S^2(0, x) \bigr] \, dx < \infty. \tag{2.28} \]
The two functions
\[ X(x) \doteq \int_0^x \bigl( 1 + R^2(0, x) \bigr) \, dx, \qquad Y(x) \doteq \int_x^0 \bigl( 1 + S^2(0, x) \bigr) \, dx \]
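are well defined and absolutely continuous. Clearly, $X$ is strictly increasing while $Y$ is strictly decreasing. Therefore, the map $X \mapsto \varphi(X)$ is continuous and strictly decreasing. From (2.28) it follows
\[ \bigl| X + \varphi(X) \bigr| \le 4 \mathcal{E}_0. \tag{2.29} \]
As $(t, x)$ ranges over the domain $[0, \infty[ \times \mathbb{R}$, the corresponding variables $(X, Y)$ range over the set
\[ \Omega^+ \doteq \bigl\{ (X, Y);\ Y \ge \varphi(X) \bigr\}. \tag{2.30} \]
Along the curve
\[ \gamma \doteq \bigl\{ (X, Y);\ Y = \varphi(X) \bigr\} \subset \mathbb{R}^2 \]
parametrized by $x \mapsto \bigl( X(x), Y(x) \bigr)$, we can thus assign the boundary data $(\bar{w}, \bar{z}, \bar{p}, \bar{q}, \bar{u}) \in L^\infty$ defined by
\[ \begin{cases} \bar{w} = 2 \arctan R(0, x), \\ \bar{z} = 2 \arctan S(0, x), \end{cases} \qquad \begin{cases} \bar{p} \equiv 1, \\ \bar{q} \equiv 1, \end{cases} \qquad \bar{u} = u_0(x). \tag{2.31} \]
We observe that the identity
\[ F = \tan \frac{\bar{w}}{2} - \tan \frac{\bar{z}}{2} - 2c(\bar{u})\, \bar{u}_x = 0 \tag{2.32} \]
is identically satisfied along $\gamma$. A similar identity holds for $G$.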
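To make the boundary curve $\gamma$ concrete, the following sketch builds $X(x)$, $Y(x)$ and $\mathcal{E}_0$ by trapezoidal quadrature from sampled initial data. Everything specific here is an assumption made for illustration: the helper names, the Gaussian-type data, the wave speed $c^2(u) = \cos^2 u + 4\sin^2 u$, and the restriction to a finite grid on $[0, 8]$, so that $\mathcal{E}_0$ is only the energy on that interval.

```python
import math

def cumulative_trapezoid(values, h):
    """Running trapezoidal integral of samples on a uniform grid, starting at 0."""
    out = [0.0]
    for a, b in zip(values, values[1:]):
        out.append(out[-1] + 0.5 * h * (a + b))
    return out

def boundary_curve(u0_x, u1, c0, h):
    """Sampled X(x), Y(x) along gamma and the energy E0 of (2.28) on the grid."""
    R = [v + c * d for d, v, c in zip(u0_x, u1, c0)]  # R(0, x) = u1 + c(u0) u0'
    S = [v - c * d for d, v, c in zip(u0_x, u1, c0)]  # S(0, x) = u1 - c(u0) u0'
    X = cumulative_trapezoid([1.0 + r * r for r in R], h)
    Y = [-y for y in cumulative_trapezoid([1.0 + s * s for s in S], h)]
    E0 = 0.25 * cumulative_trapezoid([r * r + s * s for r, s in zip(R, S)], h)[-1]
    return X, Y, E0

# Hypothetical smooth initial data on [0, 8].
n, h = 400, 0.02
xs = [i * h for i in range(n + 1)]
u0 = [math.exp(-(x - 4.0) ** 2) for x in xs]
u0_x = [-2.0 * (x - 4.0) * v for x, v in zip(xs, u0)]
u1 = [0.5 * math.exp(-(x - 3.0) ** 2) for x in xs]
c0 = [math.sqrt(math.cos(v) ** 2 + 4.0 * math.sin(v) ** 2) for v in u0]
X, Y, E0 = boundary_curve(u0_x, u1, c0, h)
```

On this grid `X` is strictly increasing, `Y` is strictly decreasing, and `abs(X[i] + Y[i])` never exceeds `4 * E0`, in line with (2.29).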
3. Construction of Integral Solutions

The aim of this section is to prove a global existence theorem for the system (2.24)–(2.26), describing the nonlinear wave equation in our transformed variables.

Theorem 4. Let the assumptions in Theorem 1 hold. Then the corresponding problem (2.24)–(2.26) with boundary data (2.31) has a unique solution, defined for all $(X, Y) \in \mathbb{R}^2$.

In the following, we shall construct the solution on the domain $\Omega^+$ where $Y \ge \varphi(X)$. On the complementary set $\Omega^-$ where $Y < \varphi(X)$, the solution can be constructed in an entirely similar way. Observing that all Eqs. (2.24)–(2.26) have a locally Lipschitz continuous right hand side, the construction of a local solution as fixed point of a suitable integral transformation is straightforward. To make sure that this solution is actually defined on the whole domain $\Omega^+$, one must establish a priori bounds, showing that $p$, $q$ remain bounded on bounded sets. This is not immediately obvious from Eqs. (2.25), because the right hand sides have quadratic growth. The basic estimate can be derived as follows. Assume
\[ C_0 \doteq \sup_{u \in \mathbb{R}} \left| \frac{c'(u)}{4c^2(u)} \right| < \infty. \tag{3.1} \]
From (2.25) it follows the identity
\[ q_X + p_Y = 0. \]
In turn, this implies that the differential form $p\, dX - q\, dY$ has zero integral along every closed curve contained in $\Omega^+$. In particular, for every $(X, Y) \in \Omega^+$, consider the closed curve (see Fig. 2) consisting of:
Fig. 2. The closed curve
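– the vertical segment joining $\bigl( X, \varphi(X) \bigr)$ with $(X, Y)$,
– the horizontal segment joining $(X, Y)$ with $\bigl( \varphi^{-1}(Y), Y \bigr)$,
– the portion of boundary $\gamma = \bigl\{ Y = \varphi(X) \bigr\}$ joining $\bigl( \varphi^{-1}(Y), Y \bigr)$ with $\bigl( X, \varphi(X) \bigr)$.

Integrating along this closed curve, recalling that $p = q = 1$ along $\gamma$ and then using (2.29), we obtain
\[ \int_{\varphi^{-1}(Y)}^{X} p(X', Y)\, dX' + \int_{\varphi(X)}^{Y} q(X, Y')\, dY' = X - \varphi^{-1}(Y) + Y - \varphi(X) \le 2 \bigl( |X| + |Y| + 4\mathcal{E}_0 \bigr). \tag{3.2} \]
Using (3.1)–(3.2) in (2.25), since $p, q > 0$ we obtain the a priori bounds
\[ \begin{aligned} p(X, Y) &= \exp \left\{ \int_{\varphi(X)}^{Y} \frac{c'(u)}{8c^2(u)} \bigl[ \sin z - \sin w \bigr]\, q(X, Y')\, dY' \right\} \\ &\le \exp \left\{ C_0 \int_{\varphi(X)}^{Y} q(X, Y')\, dY' \right\} \\ &\le \exp \bigl\{ 2C_0 \bigl( |X| + |Y| + 4\mathcal{E}_0 \bigr) \bigr\}. \end{aligned} \tag{3.3} \]
Similarly,
\[ q(X, Y) \le \exp \bigl\{ 2C_0 \bigl( |X| + |Y| + 4\mathcal{E}_0 \bigr) \bigr\}. \tag{3.4} \]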
Relying on (3.3)–(3.4), we now show that, on bounded sets in the $X$-$Y$ plane, the solution of (2.24)–(2.26) with boundary conditions (2.31) can be obtained as the fixed point of a contractive transformation. For any given $r > 0$, consider the bounded domain
\[ \Omega_r \doteq \bigl\{ (X, Y);\ Y \ge \varphi(X),\ X \le r,\ Y \le r \bigr\}. \]
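Introduce the space of functions
\[ \Lambda_r \doteq \Bigl\{ f : \Omega_r \to \mathbb{R};\ \| f \|_* \doteq \operatorname*{ess\,sup}_{(X, Y) \in \Omega_r} e^{-\kappa (X + Y)} \bigl| f(X, Y) \bigr| < \infty \Bigr\}, \]
where $\kappa$ is a suitably large constant, to be determined later. For $w, z, p, q, u \in \Lambda_r$, consider the transformation $T(w, z, p, q, u) = (\tilde{w}, \tilde{z}, \tilde{p}, \tilde{q}, \tilde{u})$ defined by
\[ \begin{cases} \tilde{w}(X, Y) = \bar{w}(X, \varphi(X)) + \displaystyle\int_{\varphi(X)}^{Y} \frac{c'(u)}{8c^2(u)}\, (\cos z - \cos w)\, q \, dY, \\[3mm] \tilde{z}(X, Y) = \bar{z}(\varphi^{-1}(Y), Y) + \displaystyle\int_{\varphi^{-1}(Y)}^{X} \frac{c'(u)}{8c^2(u)}\, (\cos w - \cos z)\, p \, dX, \end{cases} \tag{3.5} \]
\[ \begin{cases} \tilde{p}(X, Y) = 1 + \displaystyle\int_{\varphi(X)}^{Y} \frac{c'(u)}{8c^2(u)} \bigl[ \sin z - \sin w \bigr]\, \hat{p}\, \hat{q} \, dY, \\[3mm] \tilde{q}(X, Y) = 1 + \displaystyle\int_{\varphi^{-1}(Y)}^{X} \frac{c'(u)}{8c^2(u)} \bigl[ \sin w - \sin z \bigr]\, \hat{p}\, \hat{q} \, dX, \end{cases} \tag{3.6} \]
\[ \tilde{u}(X, Y) = \bar{u}(X, \varphi(X)) + \int_{\varphi(X)}^{Y} \frac{\sin z}{4c}\, q \, dY. \tag{3.7} \]
In (3.6), the quantities $\hat{p}$, $\hat{q}$ are defined as
\[ \hat{p} \doteq \min \bigl\{ p,\ 2 e^{2C_0 (|X| + |Y| + 4\mathcal{E}_0)} \bigr\}, \qquad \hat{q} \doteq \min \bigl\{ q,\ 2 e^{2C_0 (|X| + |Y| + 4\mathcal{E}_0)} \bigr\}. \tag{3.8} \]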
Notice that $\hat{p} = p$, $\hat{q} = q$ as long as the a priori estimates (3.3)–(3.4) are satisfied. Moreover, if in Eqs. (2.24)–(2.26) the variables $p$, $q$ are replaced with $\hat{p}$, $\hat{q}$, then the right hand sides become uniformly Lipschitz continuous on bounded sets in the $X$-$Y$ plane. A straightforward computation now shows that the map $T$ is a strict contraction on the space $\Lambda_r$, provided that the constant $\kappa$ is chosen sufficiently big (depending on the function $c$ and on $r$). Obviously, if $r' > r$, then the solution of (3.5)–(3.7) on $\Omega_{r'}$ also provides the solution to the same equations on $\Omega_r$, when restricted to this smaller domain. Letting $r \to \infty$, in the limit we thus obtain a unique solution $(w, z, p, q, u)$ of (3.5)–(3.7), defined on the whole domain $\Omega^+$.

To prove that these functions satisfy (2.24)–(2.26), we claim that $\hat{p} = p$, $\hat{q} = q$ at every point $(X, Y) \in \Omega^+$. The proof is by contradiction. If our claim does not hold, since the maps $Y \mapsto p(X, Y)$, $X \mapsto q(X, Y)$ are continuous, we can find some point $(X^*, Y^*) \in \Omega^+$ such that
\[ p(X, Y) \le 2 e^{2C_0 (|X| + |Y| + 4\mathcal{E}_0)}, \qquad q(X, Y) \le 2 e^{2C_0 (|X| + |Y| + 4\mathcal{E}_0)} \tag{3.9} \]
for all $(X, Y) \in \Omega^* \doteq \Omega^+ \cap \{ X \le X^*,\ Y \le Y^* \}$, but either $p(X^*, Y^*) \ge \frac{3}{2} e^{2C_0 (|X^*| + |Y^*| + 4\mathcal{E}_0)}$ or $q(X^*, Y^*) \ge \frac{3}{2} e^{2C_0 (|X^*| + |Y^*| + 4\mathcal{E}_0)}$. By (3.9), we still have $\hat{p} = p$, $\hat{q} = q$ restricted to $\Omega^*$, hence the Eqs. (2.24)–(2.26) and the a priori bounds (3.3)–(3.4) remain valid. In particular, these imply
\[ p(X^*, Y^*) \le e^{2C_0 (|X^*| + |Y^*| + 4\mathcal{E}_0)}, \qquad q(X^*, Y^*) \le e^{2C_0 (|X^*| + |Y^*| + 4\mathcal{E}_0)}, \]
reaching a contradiction.
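The role of the exponential weight in the norm $\|\cdot\|_*$ can be seen in a one-dimensional toy model, which is not the transformation $T$ of the paper. The sketch below assumes a scalar affine map $(Tf)(t) = 1 + L\int_0^t f(s)\,ds$ on $[0, r]$, discretized by left Riemann sums. In the unweighted sup norm its Lipschitz constant can be as large as $Lr$, whereas in the weighted norm $\sup_t e^{-\kappa t}|f(t)|$ it is at most $L/\kappa$, so a large $\kappa$ gives a strict contraction. All names and parameter values are hypothetical.

```python
import math

def apply_T(f, L, h):
    """(T f)(t_i) = 1 + L * sum_{j<i} f(t_j) * h, a left Riemann sum."""
    out, acc = [], 0.0
    for v in f:
        out.append(1.0 + L * acc)
        acc += v * h
    return out

def weighted_norm(f, kappa, h):
    """Discrete analogue of sup_t exp(-kappa t) |f(t)| on the grid t_i = i h."""
    return max(math.exp(-kappa * i * h) * abs(v) for i, v in enumerate(f))

L, r, n = 5.0, 2.0, 2000      # L * r = 10, so no contraction in the plain sup norm
h = r / n
kappa = 2.0 * L               # predicted contraction factor L / kappa = 0.5
f = [math.sin(3.0 * i * h) for i in range(n + 1)]
g = [math.cos(i * h) ** 2 for i in range(n + 1)]
diff = [a - b for a, b in zip(f, g)]
T_diff = [a - b for a, b in zip(apply_T(f, L, h), apply_T(g, L, h))]
ratio = weighted_norm(T_diff, kappa, h) / weighted_norm(diff, kappa, h)
```

Here `ratio` stays below `L / kappa`, since the left sums of $e^{\kappa s}$ are bounded by $e^{\kappa t}/\kappa$; the same mechanism, applied in both variables $X$ and $Y$, is what makes a large $\kappa$ work in the weighted norm above.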
Remark 3. In the solution constructed above, the variables $w$, $z$ may well grow outside the initial range $]-\pi, \pi[$. This happens precisely when the quantities $R$, $S$ become unbounded, i.e. when singularities arise.

For future reference, we state a useful consequence of the above construction.

Corollary 1. If the initial data $u_0$, $u_1$ are smooth, then the solution $(u, p, q, w, z)$ of (2.24)–(2.26), (2.31) is a smooth function of the variables $(X, Y)$. Moreover, assume that a sequence of smooth functions $(u_0^m, u_1^m)_{m \ge 1}$ satisfies
\[ u_0^m \to u_0, \qquad (u_0^m)_x \to (u_0)_x, \qquad u_1^m \to u_1 \]
uniformly on compact subsets of $\mathbb{R}$. Then one has the convergence of the corresponding solutions:
\[ (u^m, p^m, q^m, w^m, z^m) \to (u, p, q, w, z) \]
uniformly on bounded subsets of the $X$-$Y$ plane.

We also remark that Eqs. (2.24)–(2.26) imply the conservation laws
\[ q_X + p_Y = 0, \qquad \Bigl( \frac{q}{c} \Bigr)_X - \Bigl( \frac{p}{c} \Bigr)_Y = 0. \tag{3.10} \]
4. Weak Solutions, in the Original Variables

By expressing the solution $u(X, Y)$ in terms of the original variables $(t, x)$, we shall recover a solution of the Cauchy problem (1.1)–(1.2). This will provide a proof of Theorem 1. As a preliminary, we examine the regularity of the solution $(u, w, z, p, q)$ constructed in the previous section. Since the initial data $(u_0)_x$ and $u_1$ are only assumed to be in $L^2$, the functions $w$, $z$, $p$, $q$ may well be discontinuous. More precisely, on bounded subsets of the $X$-$Y$ plane, Eqs. (2.24)–(2.26) imply the following:
– The functions $w$, $p$ are Lipschitz continuous w.r.t. $Y$, measurable w.r.t. $X$.
– The functions $z$, $q$ are Lipschitz continuous w.r.t. $X$, measurable w.r.t. $Y$.
– The function $u$ is Lipschitz continuous w.r.t. both $X$ and $Y$.

The map $(X, Y) \mapsto (t, x)$ can be constructed as follows. Setting $f = x$, then $f = t$ in the two equations at (2.16), we find
\[ c = 2c X_x\, x_X, \qquad 1 = 2c X_x\, t_X, \qquad -c = -2c Y_x\, x_Y, \qquad 1 = -2c Y_x\, t_Y, \]
respectively. Therefore, using (2.18) we obtain
\[ \begin{cases} x_X = \dfrac{1}{2X_x} = \dfrac{(1 + \cos w)\, p}{4}, \\[2mm] x_Y = \dfrac{1}{2Y_x} = -\dfrac{(1 + \cos z)\, q}{4}, \end{cases} \tag{4.1} \]
\[ \begin{cases} t_X = \dfrac{1}{2c X_x} = \dfrac{(1 + \cos w)\, p}{4c}, \\[2mm] t_Y = \dfrac{1}{-2c Y_x} = \dfrac{(1 + \cos z)\, q}{4c}. \end{cases} \tag{4.2} \]
For future reference, we write here the partial derivatives of the inverse mapping, valid at points where $w, z \neq -\pi$,
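\[ \begin{cases} X_x = \dfrac{2}{(1 + \cos w)\, p}, \\[2mm] Y_x = -\dfrac{2}{(1 + \cos z)\, q}, \end{cases} \qquad \begin{cases} X_t = \dfrac{2c}{(1 + \cos w)\, p}, \\[2mm] Y_t = \dfrac{2c}{(1 + \cos z)\, q}. \end{cases} \tag{4.3} \]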
We can now recover the function $x = x(X, Y)$ by integrating one of the equations in (4.1). Moreover, we can compute $t = t(X, Y)$ by integrating one of the equations in (4.2). A straightforward calculation shows that the two equations in (4.1) are equivalent: differentiating the first w.r.t. $Y$ or the second w.r.t. $X$ one obtains the same expression,
\[ x_{XY} = -\frac{p \sin w\; w_Y}{4} + \frac{(1 + \cos w)\, p_Y}{4} = \frac{c'\, pq}{32 c^2} \bigl[ \sin z - \sin w + \sin(z - w) \bigr] = x_{YX}. \]
Similarly, the equivalence of the two equations in (4.2) is checked by
\[ \begin{aligned} t_{XY} - t_{YX} &= \Bigl( \frac{x_X}{c} \Bigr)_Y + \Bigl( \frac{x_Y}{c} \Bigr)_X = \frac{2 x_{XY}}{c} - \frac{c'}{c^2}\, x_X\, u_Y - \frac{c'}{c^2}\, x_Y\, u_X \\ &= \frac{c'\, pq}{16\, c^3} \bigl[ \sin z - \sin w + \sin(z - w) \bigr] - \frac{c'\, pq}{16\, c^3} \bigl[ (1 + \cos w) \sin z - (1 + \cos z) \sin w \bigr] = 0. \end{aligned} \]
In order to define $u$ as a function of the original variables $t$, $x$, we should formally invert the map $(X, Y) \mapsto (t, x)$ and write $u(t, x) = u\bigl( X(t, x), Y(t, x) \bigr)$. The fact that the above map may not be one-to-one does not cause any real difficulty. Indeed, given $(t^*, x^*)$, we can choose an arbitrary point $(X^*, Y^*)$ such that $t(X^*, Y^*) = t^*$, $x(X^*, Y^*) = x^*$, and define $u(t^*, x^*) = u(X^*, Y^*)$. To prove that the values of $u$ do not depend on the choice of $(X^*, Y^*)$, we proceed as follows. Assume that there are two distinct points such that $t(X_1, Y_1) = t(X_2, Y_2) = t^*$, $x(X_1, Y_1) = x(X_2, Y_2) = x^*$. We consider two cases:
Case 1. $X_1 \le X_2$, $Y_1 \le Y_2$. Consider the set
\[ \Gamma_{x^*} \doteq \bigl\{ (X, Y);\ x(X, Y) \le x^* \bigr\} \]
and call $\partial \Gamma_{x^*}$ its boundary. By (4.1), $x$ is increasing with $X$ and decreasing with $Y$. Hence, this boundary can be represented as the graph of a Lipschitz continuous function: $X - Y = \phi(X + Y)$. We now construct the Lipschitz continuous curve $\gamma$ (Fig. 3a) consisting of
Fig. 3. Paths of integration
– a horizontal segment joining $(X_1, Y_1)$ with a point $A = (X_A, Y_A)$ on $\partial\Gamma_{x^*}$, with $Y_A = Y_1$,
– a portion of the boundary $\partial\Gamma_{x^*}$,
– a vertical segment joining $(X_2, Y_2)$ to a point $B = (X_B, Y_B)$ on $\partial\Gamma_{x^*}$, with $X_B = X_2$.
We can obtain a Lipschitz continuous parametrization of the curve $\gamma : [\xi_1, \xi_2] \to \mathbb{R}^2$ in terms of the parameter $\xi = X + Y$. Observe that the map $(X, Y) \mapsto (t, x)$ is constant along $\gamma$. By (4.1)–(4.2) this implies $(1 + \cos w)X_\xi = (1 + \cos z)Y_\xi = 0$, hence $\sin w \cdot X_\xi = \sin z \cdot Y_\xi = 0$. We now compute
\[
u(X_2, Y_2) - u(X_1, Y_1) = \int_\gamma \bigl(u_X\,dX + u_Y\,dY\bigr) = \int_{\xi_1}^{\xi_2} \Bigl(\frac{p \sin w}{4c}\,X_\xi + \frac{q \sin z}{4c}\,Y_\xi\Bigr)\,d\xi = 0,
\]
proving our claim.
Case 2. $X_1 \le X_2$, $Y_1 \ge Y_2$. In this case, we consider the set $\Gamma_{t^*} \doteq \{(X, Y)\,;\ t(X, Y) \le t^*\}$, and construct a curve $\gamma$ connecting $(X_1, Y_1)$ with $(X_2, Y_2)$ as in Fig. 3b. Details are entirely similar to Case 1.
We now prove that the function $u(t, x) = u\bigl(X(t, x), Y(t, x)\bigr)$ thus obtained is Hölder continuous on bounded sets. Toward this goal, consider any characteristic curve, say $t \mapsto x^+(t)$, with $\dot x^+ = c(u)$. By construction, this is parametrized by the function $X \mapsto \bigl(t(X, Y), x(X, Y)\bigr)$, for some fixed $Y$. Recalling (2.16), (2.14), (2.18) and (2.26), we compute
\[
\begin{aligned}
\int_0^\tau \bigl[u_t + c(u)u_x\bigr]^2\,dt &= \int_{X_0}^{X_\tau} (2cX_x u_X)^2\,(2X_t)^{-1}\,dX \\
&= \int_{X_0}^{X_\tau} \Bigl(\frac{p}{4c}\,2\sin\frac{w}{2}\cos\frac{w}{2}\Bigr)^2\, 2c\,\Bigl(p\cos^2\frac{w}{2}\Bigr)^{-1}\,dX \\
&\le \int_{X_0}^{X_\tau} \frac{p}{2c}\,dX \le C_\tau,
\end{aligned}\tag{4.4}
\]
for some constant $C_\tau$ depending only on $\tau$. Similarly, integrating along any backward characteristics $t \mapsto x^-(t)$ we obtain
\[
\int_0^\tau \bigl[u_t - c(u)u_x\bigr]^2\,dt \le C_\tau. \tag{4.5}
\]
Since the speed of characteristics is $\pm c(u)$, and $c(u)$ is uniformly positive and bounded, the bounds (4.4)–(4.5) imply that the function $u = u(t, x)$ is Hölder continuous with exponent 1/2. In turn, this implies that all characteristic curves are $C^1$ with Hölder continuous derivative. Still from (4.4)–(4.5) it follows that the functions $R$, $S$ at (2.1) are square integrable on bounded subsets of the $t$-$x$ plane. Finally, we
A. Bressan, Y. Zheng
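prove that the function $u$ provides a weak solution to the nonlinear wave equation (1.1). According to (1.5), we need to show that
\[
\begin{aligned}
0 &= \iint \Bigl\{\phi_t\bigl[(u_t + cu_x) + (u_t - cu_x)\bigr] - \bigl(c(u)\phi\bigr)_x\bigl[(u_t + cu_x) - (u_t - cu_x)\bigr]\Bigr\}\,dx\,dt \\
&= \iint \bigl[\phi_t - (c\phi)_x\bigr](u_t + cu_x)\,dx\,dt + \iint \bigl[\phi_t + (c\phi)_x\bigr](u_t - cu_x)\,dx\,dt \\
&= \iint \bigl[\phi_t - (c\phi)_x\bigr]R\,dx\,dt + \iint \bigl[\phi_t + (c\phi)_x\bigr]S\,dx\,dt.
\end{aligned}\tag{4.6}
\]
By (2.16), this is equivalent to
\[
\iint \Bigl\{-2cY_x\,\phi_Y R + 2cX_x\,\phi_X S + c'\,(u_X X_x + u_Y Y_x)\,\phi\,(S - R)\Bigr\}\,dx\,dt = 0. \tag{4.7}
\]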
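It will be convenient to express the double integral in (4.7) in terms of the variables $X, Y$. We notice that, by (2.18) and (2.14),
\[
dx\,dt = \frac{pq}{2c\,(1 + R^2)(1 + S^2)}\,dX\,dY.
\]
Using (2.26) and the identities
\[
\begin{cases}
\dfrac{1}{1+R^2} = \cos^2\dfrac{w}{2} = \dfrac{1 + \cos w}{2}, \\[2mm]
\dfrac{1}{1+S^2} = \cos^2\dfrac{z}{2} = \dfrac{1 + \cos z}{2},
\end{cases}
\qquad
\begin{cases}
\dfrac{R}{1+R^2} = \dfrac{\sin w}{2}, \\[2mm]
\dfrac{S}{1+S^2} = \dfrac{\sin z}{2},
\end{cases}
\tag{4.8}
\]
the double integral in (4.6) can thus be written as
\[
\begin{aligned}
&\iint \Bigl\{2c\,\frac{1+S^2}{q}\,\phi_Y R + 2c\,\frac{1+R^2}{p}\,\phi_X S + c'\Bigl(\frac{\sin w}{4c}\,p\,\frac{1+R^2}{p} - \frac{\sin z}{4c}\,q\,\frac{1+S^2}{q}\Bigr)\phi\,(S - R)\Bigr\}\,\frac{pq}{2c\,(1+R^2)(1+S^2)}\,dX\,dY \\
&\quad= \iint \Bigl\{\frac{R}{1+R^2}\,p\,\phi_Y + \frac{S}{1+S^2}\,q\,\phi_X + \frac{c'\,pq}{8c^2}\Bigl(\frac{\sin w}{1+S^2} - \frac{\sin z}{1+R^2}\Bigr)\phi\,(S - R)\Bigr\}\,dX\,dY \\
&\quad= \iint \Bigl\{\frac{p\sin w}{2}\,\phi_Y + \frac{q\sin z}{2}\,\phi_X + \frac{c'\,pq}{8c^2}\Bigl[\sin w \sin z - \sin w\cos^2\frac{z}{2}\tan\frac{w}{2} - \sin z\cos^2\frac{w}{2}\tan\frac{z}{2}\Bigr]\phi\Bigr\}\,dX\,dY \\
&\quad= \iint \Bigl\{\frac{p\sin w}{2}\,\phi_Y + \frac{q\sin z}{2}\,\phi_X + \frac{c'\,pq}{8c^2}\bigl[\cos(w - z) - 1\bigr]\phi\Bigr\}\,dX\,dY.
\end{aligned}\tag{4.9}
\]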
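As a sanity check on the trigonometric step that closes (4.9), the following minimal Python sketch evaluates the bracketed factor of the third line at random angles and compares it with $\cos(w - z) - 1$. It also checks the identities (4.8) under the assumption $R = \tan(w/2)$, which is the relation that (4.8) itself encodes. The function names are illustrative and do not come from the paper.

```python
import math
import random


def bracket_4_9(w, z):
    """Bracketed trigonometric factor in the third line of (4.9)."""
    return (math.sin(w) * math.sin(z)
            - math.sin(w) * math.cos(z / 2) ** 2 * math.tan(w / 2)
            - math.sin(z) * math.cos(w / 2) ** 2 * math.tan(z / 2))


def max_bracket_error(samples=1000, seed=0):
    """Largest deviation of the bracket from cos(w - z) - 1 over random angles."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        # Stay away from +-pi, where tan(w/2) and tan(z/2) blow up.
        w = rng.uniform(-3.0, 3.0)
        z = rng.uniform(-3.0, 3.0)
        worst = max(worst, abs(bracket_4_9(w, z) - (math.cos(w - z) - 1.0)))
    return worst


def max_error_4_8(samples=1000, seed=1):
    """Largest deviation in the identities (4.8), assuming R = tan(w/2)."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        w = rng.uniform(-3.0, 3.0)
        R = math.tan(w / 2)
        worst = max(worst,
                    abs(1.0 / (1.0 + R * R) - (1.0 + math.cos(w)) / 2.0),
                    abs(R / (1.0 + R * R) - math.sin(w) / 2.0))
    return worst


if __name__ == "__main__":
    print("bracket vs cos(w - z) - 1:", max_bracket_error())
    print("identities (4.8):", max_error_4_8())
```

Both reported deviations should be at the level of floating-point round-off.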
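Recalling (2.30), one finds
\[
\begin{aligned}
\Bigl(\frac{p\sin w}{2}\Bigr)_Y + \Bigl(\frac{q\sin z}{2}\Bigr)_X &= (2c\,u_X)_Y + (2c\,u_Y)_X = 4c'\,u_X u_Y + 4c\,u_{XY} \\
&= \frac{c'\,pq}{4c^2}\,\sin w\sin z + \frac{c'\,pq}{8c^2}\,\bigl[\cos(w + z) - 1\bigr] \\
&= \frac{c'\,pq}{8c^2}\,\bigl[\cos(w - z) - 1\bigr].
\end{aligned}\tag{4.10}
\]
Together, (4.9) and (4.10) imply (4.7) and hence (4.6). This establishes the integral equation (1.5) for every test function $\phi \in C_c^1$.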
5. Conserved Quantities

From the conservation laws (3.10) it follows that the 1-forms $p\,dX - q\,dY$ and $\frac{p}{c}\,dX + \frac{q}{c}\,dY$ are closed, hence their integrals along any closed curve in the $X$-$Y$ plane vanish. From the conservation laws at (2.6), it follows that the 1-forms
\[
E\,dx - (c^2 M)\,dt, \qquad M\,dx - E\,dt \tag{5.1}
\]
are also closed. There is a simple correspondence. In fact
\[
E\,dx - (c^2 M)\,dt = \frac{p}{4}\,dX - \frac{q}{4}\,dY - \frac{1}{2}\,dx, \qquad -M\,dx + E\,dt = \frac{p}{4c}\,dX + \frac{q}{4c}\,dY - \frac{1}{2}\,dt.
\]
Recalling (4.1)–(4.2), these can be written in terms of the $X$-$Y$ coordinates as
\[
\frac{(1 - \cos w)\,p}{8}\,dX - \frac{(1 - \cos z)\,q}{8}\,dY, \tag{5.2}
\]
\[
\frac{(1 - \cos w)\,p}{8c}\,dX + \frac{(1 - \cos z)\,q}{8c}\,dY, \tag{5.3}
\]
respectively. Using (2.24)–(2.26), one easily checks that these forms are indeed closed:
\[
\Bigl(\frac{(1 - \cos w)\,p}{8}\Bigr)_Y = \frac{c'\,pq}{64c^2}\,\bigl[\sin z\,(1 - \cos w) - \sin w\,(1 - \cos z)\bigr] = -\Bigl(\frac{(1 - \cos z)\,q}{8}\Bigr)_X, \tag{5.4}
\]
\[
\Bigl(\frac{(1 - \cos w)\,p}{8c}\Bigr)_Y = \frac{c'\,pq}{64c^3}\,\bigl[\sin(w + z) - (\sin w + \sin z)\bigr] = \Bigl(\frac{(1 - \cos z)\,q}{8c}\Bigr)_X.
\]
In addition, we have the 1-forms
\[
dx = \frac{(1 + \cos w)\,p}{4}\,dX - \frac{(1 + \cos z)\,q}{4}\,dY, \tag{5.5}
\]
\[
dt = \frac{(1 + \cos w)\,p}{4c}\,dX + \frac{(1 + \cos z)\,q}{4c}\,dY, \tag{5.6}
\]
which are obviously closed.
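The signs in the two closedness relations at (5.4) can be read off from the symmetry of the trigonometric brackets: the first bracket changes sign when $w$ and $z$ are exchanged, while the second does not. The Python sketch below checks this numerically. It assumes, as a working hypothesis not restated in this section, that the equations for $(z, q)$ are obtained from those for $(w, p)$ by exchanging the roles of the two families of variables. The function names are hypothetical.

```python
import math
import random


def first_bracket(w, z):
    """Bracket in the first relation of (5.4)."""
    return math.sin(z) * (1 - math.cos(w)) - math.sin(w) * (1 - math.cos(z))


def second_bracket(w, z):
    """Bracket in the second closedness relation, with 8c in the denominators."""
    return math.sin(w + z) - (math.sin(w) + math.sin(z))


def symmetry_defects(samples=1000, seed=2):
    """Return (antisymmetry defect of first bracket, symmetry defect of second)."""
    rng = random.Random(seed)
    anti = 0.0
    sym = 0.0
    for _ in range(samples):
        w = rng.uniform(-math.pi, math.pi)
        z = rng.uniform(-math.pi, math.pi)
        anti = max(anti, abs(first_bracket(w, z) + first_bracket(z, w)))
        sym = max(sym, abs(second_bracket(w, z) - second_bracket(z, w)))
    return anti, sym


if __name__ == "__main__":
    print(symmetry_defects())
```

An antisymmetric bracket matches the minus sign in front of the $X$-derivative in the first relation, and a symmetric one matches the plus sign in the second.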
The solutions $u = u(X, Y)$ constructed in Sect. 3 are conservative, in the sense that the integral of the form (5.3) along every Lipschitz continuous, closed curve in the $X$-$Y$ plane is zero. To prove the inequality (1.7), fix any $\tau > 0$. The case $\tau < 0$ is identical. For a given $r > 0$ arbitrarily large, define the set (Fig. 4)
\[
\Gamma \doteq \bigl\{(X, Y)\,;\ 0 \le t(X, Y) \le \tau,\ X \le r,\ Y \le r\bigr\}. \tag{5.7}
\]
By construction, the map $(X, Y) \mapsto (t, x)$ will act as follows:
\[
A \mapsto (\tau, a), \qquad B \mapsto (\tau, b), \qquad C \mapsto (0, c), \qquad D \mapsto (0, d),
\]
for some $a < b$ and $d < c$. Integrating the 1-form (5.3) along the boundary of $\Gamma$ we obtain
\[
\begin{aligned}
\int_{AB} \frac{(1 - \cos w)\,p}{8}\,dX - \frac{(1 - \cos z)\,q}{8}\,dY
&= \int_{DC} \frac{(1 - \cos w)\,p}{8}\,dX - \frac{(1 - \cos z)\,q}{8}\,dY \\
&\quad - \int_{DA} \frac{(1 - \cos w)\,p}{8}\,dX - \int_{CB} \frac{(1 - \cos z)\,q}{8}\,dY \\
&\le \int_{DC} \frac{(1 - \cos w)\,p}{8}\,dX - \frac{(1 - \cos z)\,q}{8}\,dY \\
&= \int_d^c \frac{1}{2}\Bigl[u_t^2(0, x) + c^2\bigl(u(0, x)\bigr)\,u_x^2(0, x)\Bigr]\,dx.
\end{aligned}\tag{5.8}
\]
Fig. 4.
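On the other hand, using (5.5) we compute
\[
\begin{aligned}
\int_a^b \frac{1}{2}\Bigl[u_t^2(\tau, x) + c^2\bigl(u(\tau, x)\bigr)\,u_x^2(\tau, x)\Bigr]\,dx
&= \int_{AB \cap \{\cos w \ne -1\}} \frac{(1 - \cos w)\,p}{8}\,dX - \int_{AB \cap \{\cos z \ne -1\}} \frac{(1 - \cos z)\,q}{8}\,dY \\
&\le E_0.
\end{aligned}\tag{5.9}
\]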
Notice that the last relation in (5.8) is satisfied as an equality, because at time $t = 0$, along the curve $\gamma_0$ the variables $w, z$ never assume the value $-\pi$. Letting $r \to +\infty$ in (5.7), one has $a \to -\infty$, $b \to +\infty$. Therefore (5.8) and (5.9) together imply $E(t) \le E_0$, proving (1.7).
We now prove the Lipschitz continuity of the map $t \mapsto u(t, \cdot)$ in the $L^2$ distance. For this purpose, for any fixed time $\tau$, we let $\mu_\tau = \mu_\tau^- + \mu_\tau^+$ be the positive measure on the real line defined as follows. In the smooth case,
\[
\mu_\tau^-\bigl(\,]a, b[\,\bigr) = \frac{1}{4}\int_a^b R^2(\tau, x)\,dx, \qquad \mu_\tau^+\bigl(\,]a, b[\,\bigr) = \frac{1}{4}\int_a^b S^2(\tau, x)\,dx. \tag{5.10}
\]
To define $\mu_\tau^\pm$ in the general case, let $\gamma_\tau$ be the boundary of the set
\[
\Gamma_\tau \doteq \bigl\{(X, Y)\,;\ t(X, Y) \le \tau\bigr\}. \tag{5.11}
\]
Given any open interval $]a, b[$, let $A = (X_A, Y_A)$ and $B = (X_B, Y_B)$ be the points on $\gamma_\tau$ such that
$x(A) = a$, $X_P - Y_P \le X_A - Y_A$ for every point $P \in \gamma_\tau$ with $x(P) \le a$,
$x(B) = b$, $X_P - Y_P \ge X_B - Y_B$ for every point $P \in \gamma_\tau$ with $x(P) \ge b$.
Then
\[
\mu_\tau\bigl(\,]a, b[\,\bigr) = \mu_\tau^-\bigl(\,]a, b[\,\bigr) + \mu_\tau^+\bigl(\,]a, b[\,\bigr), \tag{5.12}
\]
where
\[
\mu_\tau^-\bigl(\,]a, b[\,\bigr) \doteq \int_{AB} \frac{(1 - \cos w)\,p}{8}\,dX, \qquad \mu_\tau^+\bigl(\,]a, b[\,\bigr) \doteq -\int_{AB} \frac{(1 - \cos z)\,q}{8}\,dY. \tag{5.13}
\]
Recalling the discussion at (5.1)–(5.3), it is clear that $\mu_\tau^-$, $\mu_\tau^+$ are bounded, positive measures, and $\mu_\tau(\mathbb{R}) = E_0$, for all $\tau$. Moreover, by (5.10) and (2.5),
\[
\int_a^b c^2 u_x^2\,dx \le \int_a^b \frac{1}{2}\,(R^2 + S^2)\,dx \le 2\mu_\tau\bigl(\,]a, b[\,\bigr).
\]
For any $a < b$, this yields the estimate
\[
|u(\tau, b) - u(\tau, a)|^2 \le |b - a| \int_a^b u_x^2(\tau, y)\,dy \le 2\kappa^2\,|b - a|\,\mu_\tau\bigl(\,]a, b[\,\bigr). \tag{5.14}
\]
Next, for a given $h > 0$, $y \in \mathbb{R}$, we seek an estimate on the distance $|u(\tau + h, y) - u(\tau, y)|$. As in Fig. 5, let $\gamma_{\tau+h}$ be the boundary of the set $\Gamma_{\tau+h}$, as in (5.11). Let $P = (P_X, P_Y)$ be the point on $\gamma_\tau$ such that
$x(P) = y$, $X_{P'} - Y_{P'} \le X_P - Y_P$ for every point $P' \in \gamma_\tau$ with $x(P') \le y$.
Similarly, let $Q = (Q_X, Q_Y)$ be the point on $\gamma_{\tau+h}$ such that
$x(Q) = y$, $X_{Q'} - Y_{Q'} \le X_Q - Y_Q$ for every point $Q' \in \gamma_{\tau+h}$ with $x(Q') \le y$.
Notice that $X_P \le X_Q$ and $Y_P \le Y_Q$. Let $P^+ = (X^+, Y^+)$ be a point on $\gamma_\tau$ with $X^+ = X_Q$, and let $P^- = (X^-, Y^-)$ be a point on $\gamma_\tau$ with $Y^- = Y_Q$. Notice that $x(P^+) \in\, ]y, y + \kappa h[$, because the point $\bigl(\tau, x(P^+)\bigr)$ lies on some characteristic curve with speed $-c(u) > -\kappa$, passing through the point $(\tau + h, y)$. Similarly, $x(P^-) \in\, ]y - \kappa h, y[$. Recalling that the forms in (5.3) and (5.6) are closed, we obtain the estimate
\[
\begin{aligned}
|u(Q) - u(P^+)| &\le \int_{Y^+}^{Y_Q} |u_Y(X_Q, Y)|\,dY = \int_{Y^+}^{Y_Q} \Bigl|\frac{\sin z}{4c}\,q\Bigr|\,dY \\
&= \int_{Y^+}^{Y_Q} \Bigl(\frac{(1 + \cos z)\,q}{4c}\Bigr)^{1/2} \Bigl(\frac{(1 - \cos z)\,q}{4}\Bigr)^{1/2}\,dY \\
&\le \Bigl(\int_{Y^+}^{Y_Q} \frac{(1 + \cos z)\,q}{4c}\,dY\Bigr)^{1/2} \cdot \Bigl(\int_{Y^+}^{Y_Q} \frac{(1 - \cos z)\,q}{4}\,dY\Bigr)^{1/2} \\
&\le \Bigl(\int_{P^-}^{P^+} \frac{(1 - \cos w)\,p}{4}\,dX - \frac{(1 - \cos z)\,q}{4}\,dY\Bigr)^{1/2} \cdot h^{1/2}.
\end{aligned}\tag{5.15}
\]
The last term in (5.15) contains the integral of the 1-form at (5.3), along the curve $\gamma_\tau$, between $P^-$ and $P^+$. Recalling the definition (5.12)–(5.13) and the estimate (5.14), we obtain the bound
\[
\begin{aligned}
|u(\tau + h, x) - u(\tau, x)|^2 &\le 2\,|u(Q) - u(P^+)|^2 + 2\,|u(P^+) - u(P)|^2 \\
&\le 4h \cdot \mu_\tau\bigl(\,]x - \kappa h, x + \kappa h[\,\bigr) + 4\kappa^2 \cdot (\kappa h) \cdot \mu_\tau\bigl(\,]x, x + h[\,\bigr).
\end{aligned}\tag{5.16}
\]
Fig. 5. Proving Lipschitz continuity
Therefore, for any $h > 0$,
\[
\begin{aligned}
\|u(\tau + h, \cdot) - u(\tau, \cdot)\|_{L^2} &= \Bigl(\int |u(\tau + h, x) - u(\tau, x)|^2\,dx\Bigr)^{1/2} \\
&\le \Bigl(\int 4(1 + \kappa^3)\,h \cdot \mu_\tau\bigl(\,]x - \kappa h, x + \kappa h[\,\bigr)\,dx\Bigr)^{1/2} \\
&= \bigl(4(\kappa^3 + 1)\,h^2\,\mu_\tau(\mathbb{R})\bigr)^{1/2} = h \cdot \bigl[4(\kappa^3 + 1)\,E_0\bigr]^{1/2}.
\end{aligned}\tag{5.17}
\]
This proves the uniform Lipschitz continuity of the map $t \mapsto u(t, \cdot)$, stated at (1.4).

6. Regularity of Trajectories

In this section we prove the continuity of the functions $t \mapsto u_t(t, \cdot)$ and $t \mapsto u_x(t, \cdot)$, as functions with values in $L^p$. This will complete the proof of Theorem 1. We first consider the case where the initial data $(u_0)_x$, $u_1$ are smooth with compact support. In this case, the solution $u = u(X, Y)$ remains smooth on the entire $X$-$Y$ plane. Fix a time $\tau$ and let $\gamma_\tau$ be the boundary of the set $\Gamma_\tau$, as in (5.11). We claim that
\[
\frac{d}{dt}\,u(t, \cdot)\Big|_{t = \tau} = u_t(\tau, \cdot), \tag{6.1}
\]
where, by (2.14), (2.18) and (2.26),
\[
\begin{aligned}
u_t(\tau, x) \doteq u_X X_t + u_Y Y_t &= \frac{p\sin w}{4c}\cdot\frac{2c}{p\,(1 + \cos w)} + \frac{q\sin z}{4c}\cdot\frac{2c}{q\,(1 + \cos z)} \\
&= \frac{\sin w}{2(1 + \cos w)} + \frac{\sin z}{2(1 + \cos z)}.
\end{aligned}\tag{6.2}
\]
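A short numerical illustration of (6.2): since $R = u_t + c\,u_x$ and $S = u_t - c\,u_x$, one expects $u_t = (R + S)/2$, and the right-hand side of (6.2) should reproduce this when $w = 2\arctan R$ and $z = 2\arctan S$. That change of variables is assumed here (it is the relation encoded by the identities (4.8)); the helper names are illustrative only.

```python
import math


def u_t_from_angles(w, z):
    """Right-hand side of (6.2); undefined where cos w = -1 or cos z = -1."""
    return (math.sin(w) / (2.0 * (1.0 + math.cos(w)))
            + math.sin(z) / (2.0 * (1.0 + math.cos(z))))


def u_t_from_RS(R, S):
    """u_t recovered from R = u_t + c u_x and S = u_t - c u_x."""
    return 0.5 * (R + S)


def max_mismatch(values=(-5.0, -1.0, -0.2, 0.0, 0.3, 2.0, 7.5)):
    """Compare both expressions on a small grid of (R, S) values."""
    worst = 0.0
    for R in values:
        for S in values:
            # Assumed change of variables: w = 2 arctan R, z = 2 arctan S.
            w = 2.0 * math.atan(R)
            z = 2.0 * math.atan(S)
            worst = max(worst, abs(u_t_from_angles(w, z) - u_t_from_RS(R, S)))
    return worst


if __name__ == "__main__":
    print("max mismatch:", max_mismatch())
```

The formula degenerates only where $\cos w = -1$ or $\cos z = -1$, i.e. where $R$ or $S$ is infinite.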
Notice that (6.2) defines the values of $u_t(\tau, \cdot)$ at almost every point $x \in \mathbb{R}$, i.e. at all points outside the support of the singular part of the measure $\mu_\tau$ defined at (5.12). By the inequality (1.7), recalling that $c(u) \ge \kappa^{-1}$, we obtain
\[
\int_{\mathbb{R}} |u_t(\tau, x)|^2\,dx \le \kappa^2\,E(\tau) \le \kappa^2\,E_0. \tag{6.3}
\]
To prove (6.1), let any $\varepsilon > 0$ be given. There exist finitely many disjoint intervals $[a_i, b_i] \subset \mathbb{R}$, $i = 1, \ldots, N$, with the following property. Call $A_i$, $B_i$ the points on $\gamma_\tau$ such that $x(A_i) = a_i$, $x(B_i) = b_i$. Then one has
\[
\min\bigl\{1 + \cos w(P),\ 1 + \cos z(P)\bigr\} < 2\varepsilon \tag{6.4}
\]
at every point $P$ on $\gamma_\tau$ contained in one of the arcs $A_iB_i$, while
\[
1 + \cos w(P) > \varepsilon, \qquad 1 + \cos z(P) > \varepsilon, \tag{6.5}
\]
for every point $P$ along $\gamma_\tau$, not contained in any of the arcs $A_iB_i$. Call $J \doteq \cup_{1 \le i \le N}[a_i, b_i]$, $J' = \mathbb{R} \setminus J$, and notice that, as a function of the original variables, $u = u(t, x)$ is smooth
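in a neighborhood of the set $\{\tau\} \times J'$. Using Minkowski's inequality and the differentiability of $u$ on $J'$, we can write
\[
\begin{aligned}
&\lim_{h \to 0} \frac{1}{h}\Bigl(\int_{\mathbb{R}} \bigl|u(\tau + h, x) - u(\tau, x) - h\,u_t(\tau, x)\bigr|^p\,dx\Bigr)^{1/p} \\
&\qquad\le \lim_{h \to 0} \frac{1}{h}\Bigl(\int_J \bigl|u(\tau + h, x) - u(\tau, x)\bigr|^p\,dx\Bigr)^{1/p} + \Bigl(\int_J |u_t(\tau, x)|^p\,dx\Bigr)^{1/p}.
\end{aligned}\tag{6.6}
\]
We now provide an estimate on the measure of the "bad" set $J$:
\[
\begin{aligned}
\mathrm{meas}(J) = \int_J dx &= \sum_i \int_{A_iB_i} \frac{(1 + \cos w)\,p}{4}\,dX - \frac{(1 + \cos z)\,q}{4}\,dY \\
&\le 2\varepsilon \sum_i \int_{A_iB_i} \frac{(1 - \cos w)\,p}{4}\,dX - \frac{(1 - \cos z)\,q}{4}\,dY \\
&\le 2\varepsilon \int_{\gamma_\tau} \frac{(1 - \cos w)\,p}{4}\,dX - \frac{(1 - \cos z)\,q}{4}\,dY \le 2\varepsilon\,E_0.
\end{aligned}\tag{6.7}
\]
Now choose $q = 2/(2 - p)$ so that $\frac{p}{2} + \frac{1}{q} = 1$. Using Hölder's inequality with conjugate exponents $2/p$ and $q$, and recalling (5.17), we obtain
\[
\begin{aligned}
\int_J \bigl|u(\tau + h, x) - u(\tau, x)\bigr|^p\,dx &\le \mathrm{meas}(J)^{1/q} \cdot \Bigl(\int_J \bigl|u(\tau + h, x) - u(\tau, x)\bigr|^2\,dx\Bigr)^{p/2} \\
&\le [2\varepsilon E_0]^{1/q} \cdot \Bigl(\|u(\tau + h, \cdot) - u(\tau, \cdot)\|_{L^2}^2\Bigr)^{p/2} \\
&\le [2\varepsilon E_0]^{1/q} \cdot \Bigl(h^2\,\bigl[4(\kappa^3 + 1)\,E_0\bigr]\Bigr)^{p/2}.
\end{aligned}
\]
Therefore,
\[
\limsup_{h \to 0} \frac{1}{h}\Bigl(\int_J \bigl|u(\tau + h, x) - u(\tau, x)\bigr|^p\,dx\Bigr)^{1/p} \le [2\varepsilon E_0]^{1/pq} \cdot \bigl[4(\kappa^3 + 1)\,E_0\bigr]^{1/2}. \tag{6.8}
\]
In a similar way we estimate
\[
\int_J |u_t(\tau, x)|^p\,dx \le \bigl[\mathrm{meas}(J)\bigr]^{1/q} \cdot \Bigl(\int_J |u_t(\tau, x)|^2\,dx\Bigr)^{p/2},
\]
\[
\Bigl(\int_J |u_t(\tau, x)|^p\,dx\Bigr)^{1/p} \le \mathrm{meas}(J)^{1/pq} \cdot \bigl[\kappa^2 E_0\bigr]^{1/2}. \tag{6.9}
\]
Since $\varepsilon > 0$ is arbitrary, from (6.6), (6.8) and (6.9) we conclude
\[
\lim_{h \to 0} \frac{1}{h}\Bigl(\int_{\mathbb{R}} \bigl|u(\tau + h, x) - u(\tau, x) - h\,u_t(\tau, x)\bigr|^p\,dx\Bigr)^{1/p} = 0. \tag{6.10}
\]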
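The Hölder step behind (6.8) and (6.9) bounds an $L^p$ norm on the small set $J$ by its measure and an $L^2$ norm, with exponents $2/p$ and $q = 2/(2 - p)$. The Python sketch below checks that inequality on step functions, as a discrete stand-in for the integrals over $J$. The discretization and the function name are assumptions made for illustration.

```python
import random


def holder_sides(values, cell, p):
    """Both sides of  int_J |f|^p dx <= meas(J)^(1/q) * (int_J |f|^2 dx)^(p/2).

    `values` are the values of a step function f on consecutive cells of
    width `cell`; J is the union of those cells and q = 2 / (2 - p).
    """
    q = 2.0 / (2.0 - p)
    meas = cell * len(values)
    lhs = sum(abs(v) ** p for v in values) * cell
    l2_squared = sum(v * v for v in values) * cell
    rhs = meas ** (1.0 / q) * l2_squared ** (p / 2.0)
    return lhs, rhs


def holds_on_random_data(trials=200, seed=4):
    """Check the inequality for random step functions and exponents 1 <= p < 2."""
    rng = random.Random(seed)
    for _ in range(trials):
        p = rng.uniform(1.0, 1.99)
        values = [rng.gauss(0.0, 3.0) for _ in range(rng.randint(1, 50))]
        lhs, rhs = holder_sides(values, cell=0.01, p=p)
        if lhs > rhs * (1.0 + 1e-12):
            return False
    return True


if __name__ == "__main__":
    print(holds_on_random_data())
```

For a constant function the two sides coincide, so the factor $\mathrm{meas}(J)^{1/q}$ cannot be improved in general; the smallness in (6.8) and (6.9) comes entirely from $\mathrm{meas}(J) \le 2\varepsilon E_0$.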
The proof of continuity of the map $t \mapsto u_t$ is similar. Fix $\varepsilon > 0$. Consider the intervals $[a_i, b_i]$ as before. Since $u$ is smooth on a neighborhood of $\{\tau\} \times J'$, it suffices to estimate
\[
\begin{aligned}
\limsup_{h \to 0} \int \bigl|u_t(\tau + h, x) - u_t(\tau, x)\bigr|^p\,dx
&\le \limsup_{h \to 0} \int_J \bigl|u_t(\tau + h, x) - u_t(\tau, x)\bigr|^p\,dx \\
&\le \limsup_{h \to 0}\ \mathrm{meas}(J)^{1/q} \cdot \Bigl(\int_J \bigl|u_t(\tau + h, x) - u_t(\tau, x)\bigr|^2\,dx\Bigr)^{p/2} \\
&\le \limsup_{h \to 0}\ [2\varepsilon E_0]^{1/q} \cdot \Bigl(\|u_t(\tau + h, \cdot)\|_{L^2} + \|u_t(\tau, \cdot)\|_{L^2}\Bigr)^p \\
&\le [2\varepsilon E_0]^{1/q}\,[4E_0]^p.
\end{aligned}
\]
Since $\varepsilon > 0$ is arbitrary, this proves continuity. To extend the result to general initial data, such that $(u_0)_x, u_1 \in L^2$, we consider a sequence of smooth initial data, with $(u_0^\nu)_x, u_1^\nu \in C_c^\infty$, with $u_0^\nu \to u_0$ uniformly, $(u_0^\nu)_x \to (u_0)_x$ almost everywhere and in $L^2$, $u_1^\nu \to u_1$ almost everywhere and in $L^2$. The continuity of the function $t \mapsto u_x(t, \cdot)$ as a map with values in $L^p$, $1 \le p < 2$, is proved in an entirely similar way.

7. Energy Conservation

This section is devoted to the proof of Theorem 3, stating that, in some sense, the total energy of the solution remains constant in time. A key tool in our analysis is the wave interaction potential, defined as
\[
\Lambda(t) \doteq (\mu_t^- \otimes \mu_t^+)\bigl\{(x, y)\,;\ x > y\bigr\}. \tag{7.1}
\]
We recall that $\mu_t^\pm$ are the positive measures defined at (5.13). Notice that, if $\mu_t^-$, $\mu_t^+$ are absolutely continuous w.r.t. Lebesgue measure, so that (5.10) holds, then (7.1) is equivalent to
\[
\Lambda(t) \doteq \frac{1}{4}\iint_{x > y} R^2(t, x)\,S^2(t, y)\,dx\,dy.
\]
Lemma 1. The map $t \mapsto \Lambda(t)$ has locally bounded variation. Indeed, there exists a one-sided Lipschitz constant $L_0$ such that
\[
\Lambda(t) - \Lambda(s) \le L_0 \cdot (t - s), \qquad t > s > 0. \tag{7.2}
\]
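To prove the lemma, we first give a formal argument, valid when the solution $u = u(t, x)$ remains smooth. We first notice that (2.4) implies
\[
\begin{aligned}
\frac{d}{dt}\bigl(4\Lambda(t)\bigr) &\le -\int 2c\,R^2 S^2\,dx + \int (R^2 + S^2)\,dx \cdot \int \Bigl|\frac{c'}{2c}\Bigr|\,\bigl|R^2 S - R S^2\bigr|\,dx \\
&\le -2\kappa^{-1}\int R^2 S^2\,dx + 4E_0\,\Bigl\|\frac{c'}{2c}\Bigr\|_{L^\infty} \int \bigl|R^2 S - R S^2\bigr|\,dx,
\end{aligned}
\]
where $\kappa^{-1}$ is a lower bound for $c(u)$. For each $\varepsilon > 0$ we have $|R| \le \varepsilon^{-1/2} + \varepsilon^{1/2} R^2$. Choosing $\varepsilon > 0$ such that
\[
\kappa^{-1} > 4E_0\,\Bigl\|\frac{c'}{2c}\Bigr\|_{L^\infty} \cdot 2\sqrt{\varepsilon},
\]
we thus obtain
\[
\frac{d}{dt}\bigl(4\Lambda(t)\bigr) \le -\kappa^{-1}\int R^2 S^2\,dx + \frac{16\,E_0^2}{\sqrt{\varepsilon}}\,\Bigl\|\frac{c'}{2c}\Bigr\|_{L^\infty}.
\]
This yields the $L^1$ estimate
\[
\int_0^\tau \!\!\int \bigl(|R^2 S| + |R S^2|\bigr)\,dx\,dt = O(1) \cdot \bigl[\Lambda(0) + E_0^2\,\tau\bigr] = O(1) \cdot (1 + \tau)\,E_0^2,
\]
where $O(1)$ denotes a quantity whose absolute value admits a uniform bound, depending only on the function $c = c(u)$ and not on the particular solution under consideration. In particular, the map $t \mapsto \Lambda(t)$ has bounded variation on any bounded interval. It can be discontinuous, with downward jumps.
To achieve a rigorous proof of Lemma 1, we need to reproduce the above argument in terms of the variables $X, Y$. As a preliminary, we observe that for every $\varepsilon > 0$ there exists a constant $\kappa_\varepsilon$ such that
\[
\bigl|\sin z\,(1 - \cos w) - \sin w\,(1 - \cos z)\bigr| \le \kappa_\varepsilon \cdot \Bigl[\tan^2\frac{w}{2} + \tan^2\frac{z}{2}\Bigr](1 + \cos w)(1 + \cos z) + \varepsilon\,(1 - \cos w)(1 - \cos z) \tag{7.3}
\]
for every pair of angles $w, z$.
Now fix $0 \le s < t$. Consider the sets $\Gamma_s$, $\Gamma_t$ as in (5.11) and define $\Gamma_{st} \doteq \Gamma_t \setminus \Gamma_s$. Observing that
\[
dx\,dt = \frac{pq}{8c}\,(1 + \cos w)(1 + \cos z)\,dX\,dY,
\]
we can now write
\[
\int_s^t \!\!\int_{-\infty}^{\infty} \frac{R^2 + S^2}{4}\,dx\,dt = (t - s)\,E_0 = \iint_{\Gamma_{st}} \frac{1}{4}\Bigl[\tan^2\frac{w}{2} + \tan^2\frac{z}{2}\Bigr] \cdot \frac{pq}{8c}\,(1 + \cos w)(1 + \cos z)\,dX\,dY. \tag{7.4}
\]
The first identity holds only for smooth solutions, but the second one is always valid. Recalling (5.4) and (5.13), and then using (7.3)–(7.4), we obtain
\[
\begin{aligned}
\Lambda(t) - \Lambda(s) &\le -\iint_{\Gamma_{st}} \frac{1 - \cos w}{8}\,p \cdot \frac{1 - \cos z}{8}\,q\,dX\,dY \\
&\quad + E_0 \cdot \iint_{\Gamma_{st}} \frac{c'}{64c^2}\,pq\,\bigl[\sin z\,(1 - \cos w) - \sin w\,(1 - \cos z)\bigr]\,dX\,dY \\
&\le -\frac{1}{64}\iint_{\Gamma_{st}} (1 - \cos w)(1 - \cos z)\,pq\,dX\,dY \\
&\quad + E_0 \cdot \iint_{\Gamma_{st}} \frac{c'}{64c^2}\,pq\,\Bigl\{\kappa_\varepsilon \cdot \Bigl[\tan^2\frac{w}{2} + \tan^2\frac{z}{2}\Bigr](1 + \cos w)(1 + \cos z) + \varepsilon\,(1 - \cos w)(1 - \cos z)\Bigr\}\,dX\,dY \\
&\le \kappa\,(t - s),
\end{aligned}
\]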
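for a suitable constant $\kappa$. This proves the lemma.
To prove Theorem 3, consider the three sets
\[
\begin{aligned}
\Omega_1 &\doteq \bigl\{(X, Y)\,;\ w(X, Y) = -\pi,\ z(X, Y) \ne -\pi,\ c'\bigl(u(X, Y)\bigr) \ne 0\bigr\}, \\
\Omega_2 &\doteq \bigl\{(X, Y)\,;\ z(X, Y) = -\pi,\ w(X, Y) \ne -\pi,\ c'\bigl(u(X, Y)\bigr) \ne 0\bigr\}, \\
\Omega_3 &\doteq \bigl\{(X, Y)\,;\ z(X, Y) = -\pi,\ w(X, Y) = -\pi,\ c'\bigl(u(X, Y)\bigr) \ne 0\bigr\}.
\end{aligned}
\]
From Eqs. (2.24), it follows that
\[
\mathrm{meas}(\Omega_1) = \mathrm{meas}(\Omega_2) = 0. \tag{7.5}
\]
Indeed, $w_Y \ne 0$ on $\Omega_1$ and $z_X \ne 0$ on $\Omega_2$. Let $\Omega_3^*$ be the set of Lebesgue points of $\Omega_3$. We now show that
\[
\mathrm{meas}\bigl\{t(X, Y)\,;\ (X, Y) \in \Omega_3^*\bigr\} = 0. \tag{7.6}
\]
To prove (7.6), fix any $P^* = (X^*, Y^*) \in \Omega_3^*$ and let $\tau = t(P^*)$. We claim that
\[
\limsup_{h, k \to 0+} \frac{\Lambda(\tau - h) - \Lambda(\tau + k)}{h + k} = +\infty. \tag{7.7}
\]
By assumption, for any $\varepsilon > 0$ arbitrarily small we can find $\delta > 0$ with the following property. For any square $Q$ centered at $P^*$ with side of length $\ell < \delta$, there exists a vertical segment $\sigma$ and a horizontal segment $\sigma'$, as in Fig. 6, such that
\[
\mathrm{meas}(\Omega_3 \cap \sigma) \ge (1 - \varepsilon)\,\ell, \qquad \mathrm{meas}(\Omega_3 \cap \sigma') \ge (1 - \varepsilon)\,\ell. \tag{7.8}
\]
Call
\[
t^+ \doteq \max\bigl\{t(X, Y)\,;\ (X, Y) \in \sigma \cup \sigma'\bigr\}, \qquad t^- \doteq \min\bigl\{t(X, Y)\,;\ (X, Y) \in \sigma \cup \sigma'\bigr\}.
\]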
Fig. 6.
Notice that, by (4.2),
\[
t^+ - t^- \le \int_{\sigma'} \frac{(1 + \cos w)\,p}{4c}\,dX + \int_{\sigma} \frac{(1 + \cos z)\,q}{4c}\,dY \le c_0 \cdot (\varepsilon\ell)^2. \tag{7.9}
\]
Indeed, the integrand functions are Lipschitz continuous. Moreover, they vanish outside a set of measure $\varepsilon\ell$. On the other hand,
\[
\Lambda(t^-) - \Lambda(t^+) \ge c_1\,(1 - \varepsilon)^2\,\ell^2 - c_2\,(t^+ - t^-) \tag{7.10}
\]
for some constant $c_1 > 0$. Since $\varepsilon > 0$ was arbitrary, this implies (7.7). Recalling that the map $t \mapsto \Lambda$ has bounded variation, from (7.7) it follows (7.6).
We now observe that the singular part of $\mu_\tau$ is nontrivial only if the set $\bigl\{P \in \gamma_\tau\,;\ w(P) = -\pi \text{ or } z(P) = -\pi\bigr\}$ has positive 1-dimensional measure. By the previous analysis, restricted to the region where $c' \ne 0$, this can happen only for a set of times having zero measure.

Acknowledgement. Alberto Bressan was supported by the Italian M.I.U.R., within the research project #2002017219, while Yuxi Zheng has been partially supported by grants NSF DMS 0305497 and 0305114.
References
1. Albers, M., Camassa, R., Holm, D., Marsden, J.: The geometry of peaked solitons and billiard solutions of a class of integrable PDE's. Lett. Math. Phys. 32, 137–151 (1994)
2. Balabane, M.: Non-existence of global solutions for some nonlinear wave equations with small Cauchy data. C. R. Acad. Sc. Paris 301, 569–572 (1985)
3. Berestycki, H., Coron, J.M., Ekeland, I. (eds.): Variational Methods. Progress in Nonlinear Differential Equations and Their Applications, Vol. 4, Boston: Birkhäuser, 1990
4. Bressan, A., Constantin, A.: Global solutions to the Camassa-Holm equations. To appear
5. Bressan, A., Zhang, P., Zheng, Y.: On asymptotic variational wave equations. Arch. Rat. Mech. Anal. To appear
6. Camassa, R., Holm, D.: An integrable shallow water equation with peaked solitons. Phys. Rev. Lett. 71, 1661–1664 (1993)
7. Camassa, R., Holm, D., Hyman, J.: A new integrable shallow water equation. To appear in Adv. Appl. Mech.
8. Christodoulou, D., Tahvildar-Zadeh, A.: On the regularity of spherically symmetric wave maps. Comm. Pure Appl. Math. 46, 1041–1091 (1993)
9. Coron, J., Ghidaglia, J., Hélein, F. (eds.): Nematics, Dordrecht: Kluwer Academic Publishers, 1991
10. Ericksen, J.L., Kinderlehrer, D. (eds.): Theory and Application of Liquid Crystals. IMA Volumes in Mathematics and its Applications, Vol. 5, New York: Springer-Verlag, 1987
11. Glassey, R.T.: Finite-time blow-up for solutions of nonlinear wave equations. Math. Z. 177, 323–340 (1981)
12. Glassey, R.T., Hunter, J.K., Zheng, Y.: Singularities in a nonlinear variational wave equation. J. Diff. Eqs. 129, 49–78 (1996)
13. Glassey, R.T., Hunter, J.K., Zheng, Y.: Singularities and oscillations in a nonlinear variational wave equation. In: Singularities and Oscillations, J. Rauch, M.E. Taylor (eds.), IMA, Vol. 91, Springer, 1997
14. Grundland, A., Infeld, E.: A family of nonlinear Klein-Gordon equations and their solutions. J. Math. Phys. 33, 2498–2503 (1992)
15. Hanouzet, B., Joly, J.L.: Explosion pour des problèmes hyperboliques semi-linéaires avec second membre non compatible. C. R. Acad. Sc. Paris 301, 581–584 (1985)
16. Hardt, R., Kinderlehrer, D., Lin, F.: Existence and partial regularity of static liquid crystal configurations. Commun. Math. Phys. 105, 547–570 (1986)
17. Hunter, J.K., Saxton, R.A.: Dynamics of director fields. SIAM J. Appl. Math. 51, 1498–1521 (1991)
18. Hunter, J.K., Zheng, Y.: On a nonlinear hyperbolic variational equation I and II. Arch. Rat. Mech. Anal. 129, 305–353, 355–383 (1995)
19. Hunter, J.K., Zheng, Y.: On a completely integrable nonlinear hyperbolic variational equation. Physica D 79, 361–386 (1994)
20. John, F.: Blow-up of solutions of nonlinear wave equations in three space dimensions. Manuscripta Math. 28, 235–268 (1979)
21. Kato, T.: Blow-up of solutions of some nonlinear hyperbolic equations. Comm. Pure Appl. Math. 33, 501–505 (1980)
22. Kinderlehrer, D.: Recent developments in liquid crystal theory. In: Frontiers in pure and applied mathematics: a collection of papers dedicated to Jacques-Louis Lions on the occasion of his sixtieth birthday, ed. R. Dautray, New York: Elsevier, 1991, pp. 151–178
23. Klainerman, S., Machedon, M.: Estimates for the null forms and the spaces $H_{s,\delta}$. Internat. Math. Res. Notices no. 17, 853–865 (1996)
24. Klainerman, S., Majda, A.: Formation of singularities for wave equations including the nonlinear vibrating string. Comm. Pure Appl. Math. 33, 241–263 (1980)
25. Lax, P.: Development of singularities of solutions of nonlinear hyperbolic partial differential equations. J. Math. Phys. 5, 611–613 (1964)
26. Levine, H.: Instability and non-existence of global solutions to nonlinear wave equations. Trans. Amer. Math. Soc. 192, 1–21 (1974)
27. Lindblad, H.: Global solutions of nonlinear wave equations. Comm. Pure Appl. Math. 45, 1063–1096 (1992)
28. Liu, T.-P.: Development of singularities in the nonlinear waves for quasi-linear hyperbolic partial differential equations. J. Differential Equations 33, 92–111 (1979)
29. Saxton, R.A.: Dynamic instability of the liquid crystal director. In: Contemporary Mathematics, Vol. 100: Current Progress in Hyperbolic Systems, ed. W.B. Lindquist, Providence RI: AMS, 1989, pp. 325–330
30. Schaeffer, J.: The equation $u_{tt} - \Delta u = |u|^p$ for the critical value of $p$. Proc. Roy. Soc. Edinburgh Sect. A 101A, 31–44 (1985)
31. Shatah, J.: Weak solutions and development of singularities in the $SU(2)$ $\sigma$-model. Comm. Pure Appl. Math. 41, 459–469 (1988)
32. Shatah, J., Tahvildar-Zadeh, A.: Regularity of harmonic maps from Minkowski space into rotationally symmetric manifolds. Comm. Pure Appl. Math. 45, 947–971 (1992)
33. Sideris, T.: Global behavior of solutions to nonlinear wave equations in three dimensions. Comm. Partial Diff. Eq. 8, 1291–1323 (1983)
34. Sideris, T.: Nonexistence of global solutions to semilinear wave equations in high dimensions. J. Diff. Eq. 52, 378–406 (1984)
35. Strauss, W.: Nonlinear wave equations. CBMS Lectures 73, Providence RI: AMS, 1989
36. Virga, E.: Variational Theories for Liquid Crystals. Chapman & Hall, New York (1994)
37. Zhang, P., Zheng, Y.: On oscillations of an asymptotic equation of a nonlinear variational wave equation. Asymptotic Analysis 18, 307–327 (1998)
38. Zhang, P., Zheng, Y.: On the existence and uniqueness of solutions to an asymptotic equation of a variational wave equation. Acta Mathematica Sinica 15, 115–130 (1999)
39. Zhang, P., Zheng, Y.: On the existence and uniqueness to an asymptotic equation of a variational wave equation with general data. Arch. Rat. Mech. Anal. 155, 49–83 (2000)
40. Zhang, P., Zheng, Y.: Rarefactive solutions to a nonlinear variational wave equation. Comm. Partial Differential Equations 26, 381–419 (2001)
41. Zhang, P., Zheng, Y.: Singular and rarefactive solutions to a nonlinear variational wave equation. Chinese Annals of Mathematics 22B, 2, 159–170 (2001)
42. Zhang, P., Zheng, Y.: Weak solutions to a nonlinear variational wave equation. Arch. Rat. Mech. Anal. 166, 303–319 (2003)
43. Zhang, P., Zheng, Y.: On the second-order asymptotic equation of a variational wave equation. Proc A of the Royal Soc. Edinburgh, A. Mathematics 132A, 483–509 (2002)
44. Zhang, P., Zheng, Y.: Weak solutions to a nonlinear variational wave equation with general data. Annals of Inst. H. Poincaré (C) Non Linear Anal. 22, 207–226 (2005)
45. Zorski, H., Infeld, E.: New soliton equations for dipole chains. Phys. Rev. Lett. 68, 1180–1183 (1992)

Communicated by P. Constantin
Commun. Math. Phys. 266, 499–545 (2006)
Digital Object Identifier (DOI) 10.1007/s00220-006-0036-y
Communications in Mathematical Physics

The Random Average Process and Random Walk in a Space-Time Random Environment in One Dimension

Márton Balázs 1, Firas Rassoul-Agha 2, Timo Seppäläinen 1
1 Mathematics Department, University of Wisconsin-Madison, Van Vleck Hall, Madison, WI 53706, USA. E-mail: [email protected]; [email protected]
2 Mathematical Biosciences Institute, Ohio State University, 231 West 18th Avenue, Columbus, OH 43210, USA. E-mail: [email protected]
Received: 7 September 2005 / Accepted: 12 January 2006
Published online: 9 May 2006 – © Springer-Verlag 2006
Abstract: We study space-time fluctuations around a characteristic line for a one-dimensional interacting system known as the random average process. The state of this system is a real-valued function on the integers. New values of the function are created by averaging previous values with random weights. The fluctuations analyzed occur on the scale $n^{1/4}$, where $n$ is the ratio of macroscopic and microscopic scales in the system. The limits of the fluctuations are described by a family of Gaussian processes. In cases of known product-form invariant distributions, this limit is a two-parameter process whose time marginals are fractional Brownian motions with Hurst parameter 1/4. Along the way we study the limits of quenched mean processes for a random walk in a space-time random environment. These limits also happen at scale $n^{1/4}$ and are described by certain Gaussian processes that we identify. In particular, when we look at a backward quenched mean process, the limit process is the solution of a stochastic heat equation.

M. Balázs was partially supported by Hungarian Scientific Research Fund (OTKA) grant T037685.
T. Seppäläinen was partially supported by National Science Foundation grant DMS-0402231.

1. Introduction

Fluctuations for asymmetric interacting systems. An asymmetric interacting system is a random process $\sigma_\tau = \{\sigma_\tau(k) : k \in K\}$ of many components $\sigma_\tau(k)$ that influence each others' evolution. Asymmetry means here that the components have an average drift in some spatial direction. Such processes are called interacting particle systems because often these components can be thought of as particles. To orient the reader, let us first think of a single random walk $\{X_\tau : \tau = 0, 1, 2, \ldots\}$ that evolves by itself. For random walk we scale both space and time by $n$ because on this scale we see the long-term velocity: $n^{-1} X_{nt} \to tv$ as $n \to \infty$, where $v = E X_1$. The random walk is diffusive which means that its fluctuations occur on the scale $n^{1/2}$, as
M. Balázs, F. Rassoul-Agha, T. Seppäläinen
revealed by the classical central limit theorem: $n^{-1/2}(X_{nt} - ntv)$ converges weakly to a Gaussian distribution. The Gaussian limit is universal here because it arises regardless of the choice of step distribution for the random walk, as long as a square-integrability hypothesis is satisfied. For asymmetric interacting systems we typically also scale time and space by the same factor $n$, and this is known as Euler scaling. However, in certain classes of one-dimensional asymmetric interacting systems the random evolution produces fluctuations of smaller order than the natural diffusive scale. Two types of such phenomena have been discovered.
(i) In Hammersley's process, in asymmetric exclusion, and in some other closely related systems, dynamical fluctuations occur on the scale $n^{1/3}$. Currently known rigorous results suggest that the Tracy-Widom distributions from random matrix theory are the universal limits of these $n^{1/3}$ fluctuations. The seminal works in this context are by Baik, Deift and Johansson [3] on Hammersley's process and by Johansson [19] on the exclusion process. We should point out though that [3] does not explicitly discuss Hammersley's process, but instead the maximal number of planar Poisson points on an increasing path in a rectangle. One can interpret the results in [3] as fluctuation results for Hammersley's process with a special initial configuration. The connection between the increasing path model and Hammersley's process goes back to Hammersley's paper [18]. It was first utilized by Aldous and Diaconis [1] (who also named the process), and then further in the papers [26, 28].
(ii) The second type has fluctuations of the order $n^{1/4}$ and limits described by a family of self-similar Gaussian processes that includes fractional Brownian motion with Hurst parameter 1/4. This result was first proved for a system of independent random walks [30].
One of the main results of the current paper shows that the $n^{1/4}$ fluctuations also appear in a family of interacting systems called random average processes in one dimension. The same family of limiting Gaussian processes appears here too, suggesting that these limits are universal for some class of interacting systems. The random average processes (RAP) studied in the present paper describe a random real-valued function on the integers whose values evolve by jumping to random convex combinations of values in a finite neighborhood. It could be thought of as a caricature model for an interface between two phases on the plane, hence we call the state a height function. RAP is related to the so-called linear systems discussed in Chapter IX of Liggett's monograph [22]. RAP was introduced by Ferrari and Fontes [14] who studied the fluctuations from initial linear slopes. In particular, they discovered that the height over the origin satisfies a central limit theorem in the time scale $t^{1/4}$. The Ferrari-Fontes results suggested RAP to us as a fruitful place to investigate whether the $n^{1/4}$ fluctuation picture discovered in [30] for independent walks had any claim to universality.
There are two ways to see the lower order dynamical fluctuations. (1) One can take deterministic initial conditions so that only dynamical randomness is present. (2) Even if the initial state is random with central limit scale fluctuations, one can find the lower order fluctuations by looking at the evolution of the process along a characteristic curve. Articles [3] and [19] studied the evolutions of special deterministic initial states of Hammersley's process and the exclusion process. Recently Ferrari and Spohn [15] have extended this analysis to the fluctuations across a characteristic in a stationary exclusion process. The general nonequilibrium hydrodynamic limit situation is still out of reach for these models. [30] contains a tail bound for Hammersley's process that suggests
Random Average Process
$n^{1/3}$ scaling also in the nonequilibrium situation, including along a shock which can be regarded as a "generalized" characteristic. Our results for the random average process are for the general hydrodynamic limit setting. The initial increments of the random height function are assumed independent and subject to some moment bounds. Their means and variances must vary sufficiently regularly to satisfy a Hölder condition. Deterministic initial increments qualify here as a special case of independent.
The classification of the systems mentioned above (Hammersley, exclusion, independent walks, RAP) into $n^{1/3}$ and $n^{1/4}$ fluctuations coincides with their classification according to type of macroscopic equation. Independent particles and RAP are macroscopically governed by linear first-order partial differential equations $u_t + b u_x = 0$. In contrast, macroscopic evolutions of Hammersley's process and the exclusion process obey genuinely nonlinear Hamilton-Jacobi equations $u_t + f(u_x) = 0$ that create shocks. Suppose we start off one of these systems so that the initial state fluctuates on the $n^{1/2}$ spatial scale, for example in a stationary distribution. Then the fluctuations of the entire system on the $n^{1/2}$ scale simply consist of initial fluctuations transported along the deterministic characteristics of the macroscopic equation. This is a consequence of the lower order of dynamical fluctuations. When the macroscopic equation is linear this is the whole picture of diffusive fluctuations. In the nonlinear case the behavior at the shocks (where characteristics merge) also needs to be resolved. This has been done for the exclusion process [25] and for Hammersley's process [29].
Random walk in a space-time random environment. Analysis of the random average process utilizes a dual description in terms of backward random walks in a space-time random environment.
Investigation of the fluctuations of RAP leads to a study of fluctuations of these random walks, both quenched invariance principles for the walk itself and limits for the quenched mean process. The quenched invariance principles have been reported elsewhere [24]. The results for the quenched mean process are included in the present paper because they are intimately connected to the random average process results. We look at two types of processes of quenched means. We call them forward and backward. In the forward case the initial point of the walk is fixed, and the walk runs for a specified amount of time on the space-time lattice. In the backward case the initial point moves along a characteristic, and the walk runs until it reaches the horizontal axis. Furthermore, in both cases we let the starting point vary horizontally (spatially), and so we have a space-time process. In both cases we describe a limiting Gaussian process, when space is scaled by $n^{1/2}$, time by $n$, and the magnitude of the fluctuations by $n^{1/4}$. In particular, in the backward case we find a limit process that solves the stochastic heat equation.
There are two earlier papers on the quenched mean of this random walk in a space-time random environment. These previous results were proved under assumptions of small enough noise and finitely many possible values for the random probabilities. Bernabei [5] showed that the centered quenched mean, normalized by its own standard deviation, converges to a normal variable. Then separately he showed that this standard deviation is bounded above and below on the order $n^{1/4}$. Bernabei has results also in dimension 2, and also for the quenched covariance of the walk. Boldrighini and Pellegrinotti [6] also proved a normal limit in the scale $n^{1/4}$ for what they term the "correction" caused by the random environment on the mean of a test function.
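To make the notion of a forward quenched mean concrete, here is a minimal Python sketch under simplifying assumptions that are not taken from the paper: the walk is nearest-neighbor, runs forward in time, and at each space-time site $(x, s)$ steps to $x + 1$ with an i.i.d. uniform probability and to $x - 1$ otherwise. For a fixed environment the quenched law after $n$ steps is computed exactly by dynamic programming, and the quenched mean is its first moment. All names are hypothetical.

```python
import random


def sample_environment(n_steps, rng):
    """omega[s][x + n_steps] = probability of a +1 step at time s from site x."""
    width = 2 * n_steps + 1
    return [[rng.random() for _ in range(width)] for _ in range(n_steps)]


def quenched_distribution(omega, n_steps):
    """Exact law of the walk after n_steps in the fixed environment omega."""
    width = 2 * n_steps + 1
    dist = [0.0] * width
    dist[n_steps] = 1.0  # the walk starts at the origin
    for s in range(n_steps):
        new = [0.0] * width
        for i, mass in enumerate(dist):
            if mass == 0.0:
                continue
            up = omega[s][i]
            new[i + 1] += mass * up          # step to the right
            new[i - 1] += mass * (1.0 - up)  # step to the left
        dist = new
    return dist


def quenched_mean(dist, n_steps):
    """Mean position under the quenched law."""
    return sum((i - n_steps) * mass for i, mass in enumerate(dist))


if __name__ == "__main__":
    n = 200
    rng = random.Random(0)
    means = [quenched_mean(quenched_distribution(sample_environment(n, rng), n), n)
             for _ in range(20)]
    print(means[:5])
```

In this toy model the averaged step has mean zero, so the quenched mean fluctuates around zero from one environment to the next; the results described above concern the size and limit law of such fluctuations, which the sketch does not attempt to verify.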
M. Balázs, F. Rassoul-Agha, T. Seppäläinen
Finite-dimensional versus process-level convergence. Our main results all state that the finite-dimensional distributions of a process of interest converge to the finite-dimensional distributions of a certain Gaussian process specified by its covariance function. We have not proved process-level tightness, except in the case of forward quenched means for the random walks where we compute a bound on the sixth moment of the process increment.
Further relevant literature. It is not clear what exactly are the systems “closely related” to Hammersley’s process or exclusion process, alluded to in the beginning of the Introduction, that share the $n^{1/3}$ fluctuations and Tracy-Widom limits. The processes for which rigorous proofs exist all have an underlying representation in terms of a last-passage percolation model. Another such example is “oriented digital boiling” studied by Gravner, Tracy and Widom [16]. (This model was studied earlier in [27] and [20] under different names.) Fluctuations of the current were initially studied from the perspective of a moving observer traveling with a general speed. The fluctuations are diffusive, and the limiting variance is a function of the speed of the observer. The special nature of the characteristic speed manifests itself in the vanishing of the limiting variance on this diffusive scale. The early paper of Ferrari and Fontes [13] treated the asymmetric exclusion process. Their work was extended by Balázs [4] to a class of deposition models that includes the much-studied zero range process and a generalization called the bricklayers’ process. Work on the fluctuations of Hammersley’s process and the exclusion process has connections to several parts of mathematics. Overviews of some of these links appear in papers [2, 10, 17]. General treatments of large scale behavior of interacting random systems can be found in [9, 21–23, 32, 33].
Organization of the paper.
We begin with the description of the random average process and the limit theorem for it in Sect. 2. Section 3 describes the random walk in a space-time random environment and the limit theorems for quenched mean processes. The proofs begin with Sect. 4 that lays out some preliminary facts on random walks. Sections 5 and 6 prove the fluctuation results for random walk, and the final Sect. 7 proves the limit theorem for RAP. The reader only interested in the random walk can read Sect. 3 and the proofs for the random walk limits independently of the rest of the paper, except for certain definitions and a hypothesis which have been labeled. The RAP results can be read independently of the random walk, but their proofs depend on the random walk results.
Notation. We summarize here some notation and conventions for quick reference. The set of natural numbers is $\mathbb{N} = \{1, 2, 3, \dots\}$, while $\mathbb{Z}_+ = \{0, 1, 2, 3, \dots\}$ and $\mathbb{R}_+ = [0, \infty)$. On the two-dimensional integer lattice $\mathbb{Z}^2$ standard basis vectors are $e_1 = (1, 0)$ and $e_2 = (0, 1)$. The $e_2$-direction represents time. We need several different probability measures and corresponding expectation operators. $\mathbb{P}$ (with expectation $\mathbb{E}$) is the probability measure on the space $\Omega$ of environments $\omega$. $\mathbb{P}$ is an i.i.d. product measure across the coordinates indexed by the space-time lattice $\mathbb{Z}^2$. P (with expectation E) is the probability measure of the initial state of the random average process. $E^\omega$ is used to emphasize that an expectation over initial states is taken with a fixed environment $\omega$. Jointly the environment and initial state are independent, so the joint measure is the product $\mathbb{P} \otimes P$. $P^\omega$ (with expectation $E^\omega$) is the quenched path measure of the random walks in environment $\omega$. The annealed measure for the walks
Random Average Process
is $P = \int P^\omega \, \mathbb{P}(d\omega)$, the quenched measure averaged over the environment measure $\mathbb{P}$. Additionally, we use P and E for generic probability measures and expectations for processes that are not part of this specific set-up, such as Brownian motions and limiting Gaussian processes. The environments $\omega \in \Omega$ are configurations $\omega = (\omega_{x,\tau} : (x, \tau) \in \mathbb{Z}^2)$ of vectors indexed by the space-time lattice $\mathbb{Z}^2$. Each element $\omega_{x,\tau}$ is a probability vector of length $2M + 1$, denoted also by $u_\tau(x) = \omega_{x,\tau}$, and in terms of coordinates $u_\tau(x) = (u_\tau(x, y) : -M \le y \le M)$. The environment at a fixed time value $\tau$ is $\bar\omega_\tau = (\omega_{x,\tau} : x \in \mathbb{Z})$. Translations on $\Omega$ are defined by $(T_{x,\tau}\omega)_{y,s} = \omega_{x+y,\tau+s}$. $\lfloor x \rfloor = \max\{n \in \mathbb{Z} : n \le x\}$ is the lower integer part of a real $x$. Throughout, $C$ denotes a constant whose exact value is immaterial and can change from line to line. The density and cumulative distribution function of the centered Gaussian distribution with variance $\sigma^2$ are denoted by $\varphi_{\sigma^2}(x)$ and $\Phi_{\sigma^2}(x)$. $\{B(t) : t \ge 0\}$ is one-dimensional standard Brownian motion, in other words the Gaussian process with covariance $E[B(s)B(t)] = s \wedge t$.
2. The Random Average Process
The state of the random average process (RAP) is a height function $\sigma : \mathbb{Z} \to \mathbb{R}$. It can also be thought of as a sequence $\sigma = (\sigma(i) : i \in \mathbb{Z}) \in \mathbb{R}^{\mathbb{Z}}$, where $\sigma(i)$ is the height of an interface above site $i$. The state evolves in discrete time according to the following rule. At each time point $\tau = 1, 2, 3, \dots$ and at each site $k \in \mathbb{Z}$, a random probability vector $u_\tau(k) = (u_\tau(k, j) : -M \le j \le M)$ of length $2M + 1$ is drawn. Given the state $\sigma_{\tau-1} = (\sigma_{\tau-1}(i) : i \in \mathbb{Z})$ at time $\tau - 1$, the height value at site $k$ is then updated to
\[ \sigma_\tau(k) = \sum_{j : |j| \le M} u_\tau(k, j)\, \sigma_{\tau-1}(k + j). \tag{2.1} \]
This update is performed independently at each site $k$ to form the state $\sigma_\tau = (\sigma_\tau(k) : k \in \mathbb{Z})$ at time $\tau$. The same step is repeated at the next time $\tau + 1$ with new independent draws of the probability vectors. So, given an initial state $\sigma_0$, the process $\sigma_\tau$ is constructed with a collection $\{u_\tau(k) : \tau \in \mathbb{N}, k \in \mathbb{Z}\}$ of independent and identically distributed random vectors. These random vectors are defined on a probability space $(\Omega, \mathfrak{S}, \mathbb{P})$. If $\sigma_0$ is also random with distribution P, then $\sigma_0$ and the vectors $\{u_\tau(k)\}$ are independent, in other words the joint distribution is $\mathbb{P} \otimes P$. We write $u^\omega_\tau(k)$ to make explicit the dependence on $\omega \in \Omega$. $\mathbb{E}$ will denote expectation under the measure $\mathbb{P}$. $M$ is the range and is a fixed finite parameter of the model. $\mathbb{P}$-almost surely each random vector $u_\tau(k)$ satisfies $0 \le u_\tau(k, j) \le 1$ for all $-M \le j \le M$, and
\[ \sum_{j=-M}^{M} u_\tau(k, j) = 1. \]
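The update rule (2.1) can be sketched in a few lines of Python. This is only an illustration under assumptions that are not part of the model above: the lattice is replaced by a ring of finitely many sites (periodic boundary), and the weight vectors are drawn by normalizing independent exponential variables. The names `random_prob_vector` and `rap_step` are hypothetical.

```python
import random

def random_prob_vector(M, rng):
    """Draw a probability vector of length 2M + 1 (normalized exponentials; an assumed law)."""
    w = [rng.expovariate(1.0) for _ in range(2 * M + 1)]
    total = sum(w)
    return [x / total for x in w]

def rap_step(sigma, M, rng):
    """One update (2.1) on a ring of len(sigma) sites (periodic boundary is an assumption)."""
    N = len(sigma)
    new = []
    for k in range(N):
        u = random_prob_vector(M, rng)  # u[j + M] plays the role of u_tau(k, j)
        new.append(sum(u[j + M] * sigma[(k + j) % N] for j in range(-M, M + 1)))
    return new

rng = random.Random(0)
sigma = [float(i % 7) for i in range(50)]   # an arbitrary initial height profile
nxt = rap_step(sigma, M=2, rng=rng)
```

Since each new height is a convex combination of old heights, the update never leaves the range of the previous profile, and a constant profile is left unchanged.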
It is often convenient to allow values $u_\tau(k, j)$ for all $j$. Then automatically $u_\tau(k, j) = 0$ for $|j| > M$. Let $p(0, j) = \mathbb{E}u_0(0, j)$ denote the averaged probabilities. Throughout the paper we make two fundamental assumptions.
(i) First, there is no integer $h > 1$ such that, for some $x \in \mathbb{Z}$,
\[ \sum_{k \in \mathbb{Z}} p(0, x + kh) = 1. \]
This is also expressed by saying that the span of the random walk with jump probabilities $p(0, j)$ is 1 [11, p. 129]. It follows that the group generated by $\{x \in \mathbb{Z} : p(0, x) > 0\}$ is all of $\mathbb{Z}$, in other words this walk is aperiodic in Spitzer’s terminology [31].
(ii) Second, we assume that
\[ \mathbb{P}\{\max_j u_0(0, j) < 1\} > 0. \tag{2.2} \]
If this assumption fails, then $\mathbb{P}$-almost surely for each $(k, \tau)$ there exists $j = j(k, \tau)$ such that $u_\tau(k, j) = 1$. No averaging happens, but instead $\sigma_\tau(k)$ adopts the value $\sigma_{\tau-1}(k + j)$. The behavior is then different from that described by our results. No further hypotheses are required of the distribution $\mathbb{P}$ on the probability vectors. Deterministic weights $u^\omega_\tau(k, j) \equiv p(0, j)$ are also admissible, in which case (2.2) requires $\max_j p(0, j) < 1$. In addition to the height process $\sigma_\tau$ we also consider the increment process $\eta_\tau = (\eta_\tau(i) : i \in \mathbb{Z})$ defined by $\eta_\tau(i) = \sigma_\tau(i) - \sigma_\tau(i - 1)$. From (2.1) one can deduce a similar linear equation for the evolution of the increment process. However, the weights are not necessarily nonnegative, and even if they are, they do not necessarily sum to one. Next we define several constants that appear in the results.
\[ D(\omega) = \sum_{x \in \mathbb{Z}} x\, u^\omega_0(0, x) \tag{2.3} \]
is the drift at the origin. Its mean is $V = \mathbb{E}(D)$ and variance
\[ \sigma_D^2 = \mathbb{E}[(D - V)^2]. \tag{2.4} \]
A variance under averaged probabilities is computed by
\[ \sigma_a^2 = \sum_{x \in \mathbb{Z}} (x - V)^2\, p(0, x). \tag{2.5} \]
Define random and averaged characteristic functions by
\[ \phi^\omega(t) = \sum_{x \in \mathbb{Z}} u^\omega_0(0, x)\, e^{itx} \quad\text{and}\quad \phi_a(t) = \mathbb{E}\phi^\omega(t) = \sum_{x \in \mathbb{Z}} p(0, x)\, e^{itx}, \tag{2.6} \]
and then further
\[ \lambda(t) = \mathbb{E}[\,|\phi^\omega(t)|^2\,] \quad\text{and}\quad \bar\lambda(t) = |\phi_a(t)|^2. \tag{2.7} \]
Finally, define a positive constant $\beta$ by
\[ \beta = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{1 - \lambda(t)}{1 - \bar\lambda(t)}\, dt. \tag{2.8} \]
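As a sanity check on definitions (2.3) through (2.8), the constants can be evaluated numerically when the law of the weight vector is supported on finitely many vectors. The sketch below is an illustration only: the function name `rap_constants`, the representation of the law as a list of (probability, vector) pairs, the midpoint quadrature, and the two-point example law are all assumptions made here.

```python
import cmath
import math

def rap_constants(law, M, panels=2000):
    """Return (V, var_D, var_a, beta) for a finite law of weight vectors.

    law: list of (probability, u) pairs; u[x + M] stands for u_0(0, x), -M <= x <= M.
    """
    xs = range(-M, M + 1)
    p = [sum(w * u[x + M] for w, u in law) for x in xs]        # averaged p(0, x)

    def drift(u):                                              # D(omega), (2.3)
        return sum(x * u[x + M] for x in xs)

    def char_fn(u, t):                                         # (2.6)
        return sum(u[x + M] * cmath.exp(1j * t * x) for x in xs)

    V = sum(w * drift(u) for w, u in law)
    var_D = sum(w * (drift(u) - V) ** 2 for w, u in law)       # (2.4)
    var_a = sum((x - V) ** 2 * p[x + M] for x in xs)           # (2.5)

    def integrand(t):                                          # ratio in (2.8)
        lam = sum(w * abs(char_fn(u, t)) ** 2 for w, u in law)
        lam_bar = abs(char_fn(p, t)) ** 2
        return (1.0 - lam) / (1.0 - lam_bar)

    # Midpoint rule on [-pi, pi]; an even panel count keeps the 0/0 point t = 0 off the grid.
    h = 2.0 * math.pi / panels
    beta = sum(integrand(-math.pi + (i + 0.5) * h) for i in range(panels)) * h / (2.0 * math.pi)
    return V, var_D, var_a, beta

# Hypothetical two-point law: the weight on the jump -1 is 0.2 or 0.8, each with probability 1/2.
law = [(0.5, [0.2, 0.8, 0.0]), (0.5, [0.8, 0.2, 0.0])]
V, var_D, var_a, beta = rap_constants(law, M=1)
```

For this particular law one can compute by hand that $|\phi^\omega(t)|^2 = 1 - 2u(1-u)(1 - \cos t)$ with $u \in \{0.2, 0.8\}$ and $\bar\lambda(t) = 1 - \tfrac12(1 - \cos t)$, so the integrand is the constant $0.32/0.5$ and $\beta = 0.64$, with $V = -0.5$, $\sigma_D^2 = 0.09$ and $\sigma_a^2 = 0.25$.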
The assumption of span 1 implies that $|\phi_a(t)| = 1$ only at multiples of $2\pi$. Hence the integrand above is positive at $t \ne 0$. Separately one can check that the integrand has a finite limit as $t \to 0$. Thus $\beta$ is well-defined and finite. In Sect. 4 we can give these constants, especially $\beta$, more probabilistic meaning from the perspective of the underlying random walk in random environment. For the limit theorems we consider a sequence $\sigma^n_\tau$ of the random average processes, indexed by $n \in \mathbb{N} = \{1, 2, 3, \dots\}$. Initially we set $\sigma^n_0(0) = 0$. For each $n$ we assume that the initial increments $\{\eta^n_0(i) : i \in \mathbb{Z}\}$ are independent random variables, with
\[ E[\eta^n_0(i)] = \varrho(i/n) \quad\text{and}\quad \operatorname{Var}[\eta^n_0(i)] = v(i/n). \tag{2.9} \]
The functions $\varrho$ and $v$ that appear above are assumed to be uniformly bounded functions on $\mathbb{R}$ and to satisfy this local Hölder continuity: For each compact interval $[a, b] \subseteq \mathbb{R}$ there exist $C = C(a, b) < \infty$ and $\gamma = \gamma(a, b) > 1/2$ such that
\[ |\varrho(x) - \varrho(y)| + |v(x) - v(y)| \le C\, |x - y|^\gamma \quad\text{for } x, y \in [a, b]. \tag{2.10} \]
The function $v$ must be nonnegative, but the sign of $\varrho$ is not restricted. Both functions are allowed to vanish. In particular, our hypotheses permit deterministic initial heights which implies that $v$ vanishes identically. The distribution on initial heights and increments described above is denoted by P. We make this uniform moment hypothesis on the increments: there exists $\alpha > 0$ such that
\[ \sup_{n \in \mathbb{N},\, i \in \mathbb{Z}} E\big[\,|\eta^n_0(i)|^{2+\alpha}\,\big] < \infty. \tag{2.11} \]
We assume that the processes $\sigma^n_\tau$ are all defined on the same probability space. The environments $\omega$ that drive the dynamics are independent of the initial states $\{\sigma^n_0\}$, so the joint distribution of $(\omega, \{\sigma^n_0\})$ is $\mathbb{P} \otimes P$. When computing an expectation under a fixed $\omega$ we write $E^\omega$. On the larger space and time scale the height function is simply rigidly translated at speed $b = -V$, and the same is also true of the central limit fluctuations of the initial height function. Precisely speaking, define a function $U$ on $\mathbb{R}$ by $U(0) = 0$ and $U'(x) = \varrho(x)$. Let $(x, t) \in \mathbb{R} \times \mathbb{R}_+$. The assumptions made thus far imply that both
\[ n^{-1} \sigma^n_{\lfloor nt \rfloor}(\lfloor nx \rfloor) \longrightarrow U(x - bt) \tag{2.12} \]
and
\[ \frac{\sigma^n_{\lfloor nt \rfloor}(\lfloor nx \rfloor) - nU(x - bt)}{\sqrt{n}} - \frac{\sigma^n_0(\lfloor nx \rfloor - \lfloor nbt \rfloor) - nU(x - bt)}{\sqrt{n}} \longrightarrow 0 \tag{2.13} \]
in probability, as $n \to \infty$. (We will not give a proof. This follows from easier versions of the estimates in the paper.) Limit (2.12) is the “hydrodynamic limit” of the process. The large scale evolution of the height process is thus governed by the linear transport equation $w_t + b w_x = 0$. This equation is uniquely solved by $w(x, t) = U(x - bt)$ given the initial function $w(x, 0) = U(x)$. The lines $x(t) = x + bt$ are the characteristics of this equation, the
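curves along which the equation carries information. Limit (2.13) says that fluctuations on the diffusive scale do not include any randomness from the evolution, only a translation of initial fluctuations along characteristics. We find interesting height fluctuations along a macroscopic characteristic line $x(t) = \bar y + bt$, and around such a line on the microscopic spatial scale $\sqrt{n}$. The magnitude of these fluctuations is of the order $n^{1/4}$, so we study the process
\[ z_n(t, r) = n^{-1/4} \big\{ \sigma^n_{\lfloor nt \rfloor}(\lfloor n\bar y \rfloor + \lfloor r\sqrt{n} \rfloor + \lfloor ntb \rfloor) - \sigma^n_0(\lfloor n\bar y \rfloor + \lfloor r\sqrt{n} \rfloor) \big\}, \]
indexed by $(t, r) \in \mathbb{R}_+ \times \mathbb{R}$, for a fixed $\bar y \in \mathbb{R}$. In terms of the increment process $\eta^n_\tau$, $z_n(t, 0)$ is the net flow from right to left across the discrete characteristic $\lfloor n\bar y \rfloor + \lfloor nsb \rfloor$, during the time interval $0 \le s \le t$. Next we describe the limit of $z_n$. Recall the constants defined in (2.4), (2.5), and (2.8). Combine them into a new constant
\[ \kappa = \frac{\sigma_D^2}{\beta \sigma_a^2}. \tag{2.14} \]
Let $\{B(t) : t \ge 0\}$ be one-dimensional standard Brownian motion. Define two functions $\Gamma_q$ and $\Gamma_0$ on $(\mathbb{R}_+ \times \mathbb{R}) \times (\mathbb{R}_+ \times \mathbb{R})$:
\[ \Gamma_q((s, q), (t, r)) = \frac{\kappa}{2} \int_{\sigma_a^2 |t-s|}^{\sigma_a^2 (t+s)} \frac{1}{\sqrt{2\pi v}} \exp\Big\{ -\frac{1}{2v}(q - r)^2 \Big\}\, dv \tag{2.15} \]
and
\[ \begin{aligned} \Gamma_0((s, q), (t, r)) ={}& \int_{q \vee r}^{\infty} P[\sigma_a B(s) > x - q]\, P[\sigma_a B(t) > x - r]\, dx \\ &- \Big\{ \mathbf{1}_{\{r > q\}} \int_q^r P[\sigma_a B(s) > x - q]\, P[\sigma_a B(t) \le x - r]\, dx \\ &\qquad + \mathbf{1}_{\{q > r\}} \int_r^q P[\sigma_a B(s) \le x - q]\, P[\sigma_a B(t) > x - r]\, dx \Big\} \\ &+ \int_{-\infty}^{q \wedge r} P[\sigma_a B(s) \le x - q]\, P[\sigma_a B(t) \le x - r]\, dx. \end{aligned} \tag{2.16} \]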
The boundary values are such that $\Gamma_q((s, q), (t, r)) = \Gamma_0((s, q), (t, r)) = 0$ if either $s = 0$ or $t = 0$. We will see later that $\Gamma_q$ is the limiting covariance of the backward quenched mean process of a related random walk in random environment. $\Gamma_0$ is the covariance for fluctuations contributed by the initial increments of the random average process. (Hence the subscripts $q$ for quenched and 0 for initial time. The subscript on $\Gamma_q$ has nothing to do with the argument $(s, q)$.) The integral expressions above are the form in which $\Gamma_q$ and $\Gamma_0$ appear in the proofs. For $\Gamma_q$ the key point is the limit (5.19) which is evaluated earlier in (4.5). $\Gamma_0$ arises in Proposition 7.1. Here are alternative succinct representations for $\Gamma_q$ and $\Gamma_0$. Denote the centered Gaussian density with variance $\sigma^2$ by
\[ \varphi_{\sigma^2}(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\Big\{ -\frac{1}{2\sigma^2} x^2 \Big\} \tag{2.17} \]
and its distribution function by $\Phi_{\sigma^2}(x) = \int_{-\infty}^{x} \varphi_{\sigma^2}(y)\, dy$. Then define
\[ \Psi_{\sigma^2}(x) = \sigma^2 \varphi_{\sigma^2}(x) - x\,(1 - \Phi_{\sigma^2}(x)), \]
which is an antiderivative of $\Phi_{\sigma^2}(x) - 1$. In these terms,
\[ \Gamma_q((s, q), (t, r)) = \kappa\, \Psi_{\sigma_a^2 (t+s)}(|q - r|) - \kappa\, \Psi_{\sigma_a^2 |t-s|}(|q - r|) \]
and
\[ \Gamma_0((s, q), (t, r)) = \Psi_{\sigma_a^2 s}(|q - r|) + \Psi_{\sigma_a^2 t}(|q - r|) - \Psi_{\sigma_a^2 (t+s)}(|q - r|). \]
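The agreement between the integral form (2.15) of the quenched covariance and its succinct representation can be checked numerically. In the sketch below, $\Gamma_q$ and $\Gamma_0$ denote the two covariance functions of (2.15) and (2.16), and $\Psi_{\sigma^2}(x) = \sigma^2\varphi_{\sigma^2}(x) - x(1 - \Phi_{\sigma^2}(x))$ is the auxiliary function built from the Gaussian density and distribution function. The Python names, the midpoint quadrature, the convention for zero variance, and the parameter values are assumptions made for illustration.

```python
import math

def phi(var, x):
    """Centered Gaussian density with variance var, as in (2.17)."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def Phi(var, x):
    """Centered Gaussian distribution function with variance var."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0 * var)))

def Psi(var, x):
    """Psi_var(x) = var * phi_var(x) - x * (1 - Phi_var(x)); var = 0 is taken as the limit."""
    if var == 0.0:
        return max(-x, 0.0)
    return var * phi(var, x) - x * (1.0 - Phi(var, x))

def gamma_q(s, q, t, r, kappa, var_a):
    d = abs(q - r)
    return kappa * (Psi(var_a * (t + s), d) - Psi(var_a * abs(t - s), d))

def gamma_0(s, q, t, r, var_a):
    d = abs(q - r)
    return Psi(var_a * s, d) + Psi(var_a * t, d) - Psi(var_a * (t + s), d)

def gamma_q_integral(s, q, t, r, kappa, var_a, panels=20000):
    """Midpoint-rule evaluation of the integral in (2.15); intended for s != t."""
    lo, hi = var_a * abs(t - s), var_a * (t + s)
    h = (hi - lo) / panels
    total = sum(phi(lo + (i + 0.5) * h, q - r) for i in range(panels))
    return 0.5 * kappa * h * total

# Hypothetical parameter values, chosen only for illustration.
kappa, var_a = 0.5625, 0.25
s, t, q, r = 1.0, 2.0, 0.3, -0.4
closed_form = gamma_q(s, q, t, r, kappa, var_a)
by_quadrature = gamma_q_integral(s, q, t, r, kappa, var_a)
```

The two evaluations agree because $\partial_v \Psi_v(x) = \tfrac12 \varphi_v(x)$. Setting $q = r$ uses $\Psi_{\sigma^2}(0) = \sigma/\sqrt{2\pi}$, which gives the square-root expressions that appear when only temporal correlations are considered.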
Theorem 2.1. Assume (2.2) and that the averaged probabilities $p(0, j) = \mathbb{E}u^\omega_0(0, j)$ have lattice span 1. Let $\varrho$ and $v$ be two uniformly bounded functions on $\mathbb{R}$ that satisfy the local Hölder condition (2.10). For each $n$, let $\sigma^n_\tau$ be a random average process normalized by $\sigma^n_0(0) = 0$ and whose initial increments $\{\eta^n_0(i) : i \in \mathbb{Z}\}$ are independent and satisfy (2.9) and (2.11). Assume the environments $\omega$ independent of the initial heights $\{\sigma^n_0 : n \in \mathbb{N}\}$. Fix $\bar y \in \mathbb{R}$. Under the above assumptions the finite-dimensional distributions of the process $\{z_n(t, r) : (t, r) \in \mathbb{R}_+ \times \mathbb{R}\}$ converge weakly as $n \to \infty$ to the finite-dimensional distributions of the mean zero Gaussian process $\{z(t, r) : (t, r) \in \mathbb{R}_+ \times \mathbb{R}\}$ specified by the covariance
\[ E\, z(s, q) z(t, r) = \varrho(\bar y)^2\, \Gamma_q((s, q), (t, r)) + v(\bar y)\, \Gamma_0((s, q), (t, r)). \tag{2.18} \]
The statement means that, given space-time points $(t_1, r_1), \dots, (t_k, r_k)$, the $\mathbb{R}^k$-valued random vector $(z_n(t_1, r_1), \dots, z_n(t_k, r_k))$ converges in distribution to the random vector $(z(t_1, r_1), \dots, z(t_k, r_k))$ as $n \to \infty$. The theorem is also valid in cases where one source of randomness has been turned off: if initial increments around $\lfloor n\bar y \rfloor$ are deterministic then $v(\bar y) = 0$, while if $D(\omega) \equiv V$ then $\sigma_D^2 = 0$. The case $\sigma_D^2 = 0$ contains as a special case the one with deterministic weights $u^\omega_\tau(k, j) \equiv p(0, j)$. If we consider only temporal correlations with a fixed $r$, the formula for the covariance is as follows:
\[ E\, z(s, r) z(t, r) = \frac{\kappa \sigma_a}{\sqrt{2\pi}}\, \varrho(\bar y)^2 \big( \sqrt{s + t} - \sqrt{t - s} \big) + \frac{\sigma_a}{\sqrt{2\pi}}\, v(\bar y) \big( \sqrt{s} + \sqrt{t} - \sqrt{s + t} \big) \quad\text{for } s < t. \tag{2.19} \]
Remark 2.1. The covariances are central to our proofs but they do not illuminate the behavior of the process $z$. Here is a stochastic integral representation of the Gaussian process with covariance (2.18):
\[ z(t, r) = \varrho(\bar y)\, \sigma_a \sqrt{\kappa} \iint_{[0,t] \times \mathbb{R}} \varphi_{\sigma_a^2 (t-s)}(r - x)\, dW(s, x) + \sqrt{v(\bar y)} \int_{\mathbb{R}} \operatorname{sign}(x - r)\, \Phi_{\sigma_a^2 t}\big( -|x - r| \big)\, dB(x). \tag{2.20} \]
Above W is a two-parameter Brownian motion defined on R+ × R, B is a one-parameter Brownian motion defined on R, and W and B are independent of each other. The first integral represents the space-time noise created by the dynamics, and the second
integral represents the initial noise propagated by the evolution. The equality in (2.20) is equality in distribution of processes. It can be verified by checking that the Gaussian process defined by the sum of the integrals has the covariance (2.18). One can readily see the second integral in (2.20) arise as a sum in the proof. It is the limit of $Y^n(t, r)$ defined below Eq. (7.1). One can also check that the right-hand side of (2.20) is a weak solution of a stochastic heat equation with two independent sources of noise:
\[ z_t = \tfrac12 \sigma_a^2 z_{rr} + \varrho(\bar y)\, \sigma_a \sqrt{\kappa}\, \dot W + \tfrac12 \sqrt{v(\bar y)}\, \sigma_a^2 B'', \qquad z(0, r) \equiv 0. \tag{2.21} \]
$\dot W$ is space-time white noise generated by the dynamics and $B''$ the second derivative of the one-dimensional Brownian motion that represents initial noise. This equation has to be interpreted in a weak sense through integration against smooth compactly supported test functions. We make a related remark below in Sect. 3.2 for limit processes of quenched means of space-time RWRE.
The simplest RAP dynamics averages only two neighboring height values. By translating the indices, we can assume that $p(0, -1) + p(0, 0) = 1$. In this case the evolution of increments is given by the equation
\[ \eta_\tau(k) = u_\tau(k, 0)\, \eta_{\tau-1}(k) + u_\tau(k - 1, -1)\, \eta_{\tau-1}(k - 1). \tag{2.22} \]
There is a queueing interpretation of sorts for this evolution. Suppose $\eta_{\tau-1}(k)$ denotes the amount of work that remains at station $k$ at the end of cycle $\tau - 1$. Then during cycle $\tau$, the fraction $u_\tau(k, -1)$ of this work is completed and moves on to station $k + 1$, while the remaining fraction $u_\tau(k, 0)$ stays at station $k$ for further processing. In this case we can explicitly evaluate the constant $\beta$ in terms of the other quantities. In a particular stationary situation we can also identify the temporal marginal of $z$ in (2.19) as a familiar process. (A probability distribution $\mu$ on the space $\mathbb{R}^{\mathbb{Z}}$ is an invariant distribution for the increment process if it is the case that when $\eta_0$ has $\mu$ distribution, so does $\eta_\tau$ for all times $\tau \in \mathbb{Z}_+$.)
Proposition 2.2. Assume $p(0, -1) + p(0, 0) = 1$.
(a) Then
\[ \beta = \frac{1}{\sigma_a^2}\, \mathbb{E}[u_0(0, 0)\, u_0(0, -1)]. \tag{2.23} \]
(b) Suppose further that the increment process $\eta_\tau$ possesses an invariant distribution $\mu$ in which the variables $\{\eta(i) : i \in \mathbb{Z}\}$ are i.i.d. with common mean $\varrho = E^\mu[\eta(i)]$ and variance $v = E^\mu[\eta(i)^2] - \varrho^2$. Then $v = \kappa \varrho^2$. Suppose that in Theorem 2.1 each $\eta^n_\tau = \eta_\tau$ is a stationary process with marginal $\mu$. Then the limit process $z$ has covariance
\[ E\, z(s, q) z(t, r) = \kappa \varrho^2 \big[ \Psi_{\sigma_a^2 s}(|q - r|) + \Psi_{\sigma_a^2 t}(|q - r|) - \Psi_{\sigma_a^2 |t-s|}(|q - r|) \big]. \tag{2.24} \]
In particular, for a fixed $r$ the process $\{z(t, r) : t \in \mathbb{R}_+\}$ has covariance
\[ E\, z(s, r) z(t, r) = \frac{\sigma_a \kappa \varrho^2}{\sqrt{2\pi}} \big( \sqrt{s} + \sqrt{t} - \sqrt{|t - s|} \big). \tag{2.25} \]
In other words, process $z(\cdot, r)$ is fractional Brownian motion with Hurst parameter 1/4.
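The identity $v = \kappa\varrho^2$ of part (b) can be probed numerically in a concrete two-neighbor model. The sketch below assumes beta-distributed weights $u_\tau(k,-1) \sim \mathrm{Beta}(j, m-j)$ and i.i.d. gamma increments with shape $m$ and rate $\lambda$; these distributional choices, the ring of finitely many sites, and the names `kappa_beta` and `step` are assumptions made for illustration. With these choices the standard beta and gamma moment formulas together with (2.23) give $\kappa = 1/m$, which matches $v/\varrho^2 = (m/\lambda^2)/(m/\lambda)^2$.

```python
import random
from fractions import Fraction

def kappa_beta(m, j):
    """kappa = var_D / (beta * var_a) when u(k, -1) ~ Beta(j, m - j), using (2.23)."""
    mean_u = Fraction(j, m)
    mean_u2 = Fraction(j * (j + 1), m * (m + 1))
    var_D = mean_u2 - mean_u ** 2        # D = -u(k, -1), so Var D = Var u
    beta_var_a = mean_u - mean_u2        # E[u(0,0) u(0,-1)] = E[u (1 - u)]
    return var_D / beta_var_a

def step(eta, m, j, rng):
    """One step of (2.22) on a ring (periodic boundary is an assumption of this sketch)."""
    u = [rng.betavariate(j, m - j) for _ in eta]     # u[k] stands for u_tau(k, -1)
    # Index k - 1 = -1 wraps around to the last site, which closes the ring.
    return [(1.0 - u[k]) * eta[k] + u[k - 1] * eta[k - 1] for k in range(len(eta))]

m, j, lam, N = 5, 2, 2.0, 20000
rng = random.Random(0)
eta = [rng.gammavariate(m, 1.0 / lam) for _ in range(N)]   # shape m, scale 1 / lam
for _ in range(5):
    eta = step(eta, m, j, rng)

mean = sum(eta) / N
var = sum((x - mean) ** 2 for x in eta) / N
```

After several steps the empirical mean and variance should remain close to $m/\lambda$ and $m/\lambda^2$, as expected if the gamma product measure is invariant. This is a statistical check, not a proof.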
To rephrase the connection (2.24)–(2.25), the process $\{z(t, r)\}$ in (2.24) is a certain two-parameter process whose marginals along the first parameter direction are fractional Brownian motions. Ferrari and Fontes [14] showed that given any slope $\rho$, the process $\eta_\tau$ started from deterministic increments $\eta_0(x) = \rho x$ converges weakly to an invariant distribution. But as is typical for interacting systems, there is little information about the invariant distributions in the general case. The next example gives a family of processes and i.i.d. invariant distributions to show that part (b) of Proposition 2.2 is not vacuous. Presently we are not aware of other explicitly known invariant distributions for RAP.
Example 2.1. Fix integer parameters $m > j > 0$. Let $\{u_\tau(k, -1) : \tau \in \mathbb{N}, k \in \mathbb{Z}\}$ be i.i.d. beta-distributed random variables with density
\[ h(u) = \frac{(m - 1)!}{(j - 1)!\,(m - j - 1)!}\, u^{j-1} (1 - u)^{m-j-1} \]
on $(0, 1)$. Set $u_\tau(k, 0) = 1 - u_\tau(k, -1)$. Consider the evolution defined by (2.22) with these weights. Then a family of invariant distributions for the increment process $\eta_\tau = (\eta_\tau(k) : k \in \mathbb{Z})$ is obtained by letting the variables $\{\eta(k)\}$ be i.i.d. gamma distributed with common density
\[ f(x) = \frac{1}{(m - 1)!}\, \lambda e^{-\lambda x} (\lambda x)^{m-1} \tag{2.26} \]
on $\mathbb{R}_+$. The family of invariant distributions is parametrized by $0 < \lambda < \infty$. Under this distribution $E[\eta(k)] = m/\lambda$ and $\operatorname{Var}[\eta(k)] = m/\lambda^2$.
One motivation for the present work was to investigate whether the limits found in [30] for fluctuations along a characteristic for independent walks are instances of some universal behavior. The present results are in agreement with those obtained for independent walks. The common scaling is $n^{1/4}$. In that paper only the case $r = 0$ of Theorem 2.1 was studied. For both independent walks and RAP the limit $z(\cdot, 0)$ is a mean-zero Gaussian process with covariance of the type
\[ E\, z(s, 0) z(t, 0) = c_1 \big( \sqrt{s + t} - \sqrt{t - s} \big) + c_2 \big( \sqrt{s} + \sqrt{t} - \sqrt{s + t} \big), \]
where $c_1$ is determined by the mean increment and $c_2$ by the variance of the increment locally around the initial point of the characteristic. Furthermore, as in Proposition 2.2(b), for independent walks the limit process specializes to fractional Brownian motion if the increment process is stationary.
These and other related results suggest several avenues of inquiry. In the introduction we contrasted this picture of $n^{1/4}$ fluctuations and fractional Brownian motion limits with the $n^{1/3}$ fluctuations and Tracy-Widom limits found in exclusion and Hammersley processes. Obviously more classes of processes should be investigated to understand better the demarcation between these two types. Also, there might be further classes with different limits. Above we assumed independent increments at time zero. It would be of interest to see if relaxing this assumption leads to a change in the second part of the covariance (2.18). [The first part comes from the random walks in the dual description and would not be affected by the initial conditions.] However, without knowledge of some explicit invariant distributions it is not clear what types of initial increment processes $\{\eta_0(k)\}$ are
worth considering. Unfortunately finding explicit invariant distributions for interacting systems seems often a matter of good fortune. We conclude this section with the dual description of RAP which leads us to study random walks in a space-time random environment. Given $\omega$, let $\{X^{i,\tau}_s : s \in \mathbb{Z}_+\}$ denote a random walk on $\mathbb{Z}$ that starts at $X^{i,\tau}_0 = i$, and whose transition probabilities are given by
\[ P^\omega(X^{i,\tau}_{s+1} = y \mid X^{i,\tau}_s = x) = u^\omega_{\tau-s}(x, y - x). \tag{2.27} \]
$P^\omega$ is the path measure of the walk $X^{i,\tau}_s$, with expectation denoted by $E^\omega$. Comparison of (2.1) and (2.27) gives
\[ \sigma_\tau(i) = \sum_j P^\omega(X^{i,\tau}_1 = j \mid X^{i,\tau}_0 = i)\, \sigma_{\tau-1}(j) = E^\omega\big[ \sigma_{\tau-1}(X^{i,\tau}_1) \big]. \tag{2.28} \]
Iteration and the Markov property of the walks $X^{i,\tau}_s$ then lead to
\[ \sigma_\tau(i) = E^\omega\big[ \sigma_0(X^{i,\tau}_\tau) \big]. \tag{2.29} \]
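The duality (2.29) lends itself to a direct numerical check: run the averaging dynamics forward in a fixed environment, and separately propagate the law of the backward walk through the same environment. The following sketch does this on a ring of finitely many sites. The ring, the way the weight vectors are drawn, and the function names are assumptions made for illustration.

```python
import random

def draw_env(N, T, M, rng):
    """env[tau][k][j + M] stands for u_tau(k, j), tau = 1..T (index 0 is unused)."""
    env = [None]
    for _ in range(T):
        layer = []
        for _ in range(N):
            w = [rng.random() for _ in range(2 * M + 1)]
            total = sum(w)
            layer.append([x / total for x in w])
        env.append(layer)
    return env

def rap(sigma0, env, M):
    """Iterate the averaging rule (2.1) through all time layers of env."""
    sigma = list(sigma0)
    N = len(sigma)
    for tau in range(1, len(env)):
        sigma = [sum(env[tau][k][j + M] * sigma[(k + j) % N] for j in range(-M, M + 1))
                 for k in range(N)]
    return sigma

def backward_walk_law(i, env, M, N):
    """Quenched law of X_T^{i,T}: the walk started at (i, T) and run down to time 0."""
    T = len(env) - 1
    dist = [0.0] * N
    dist[i] = 1.0
    for s in range(T):
        tau = T - s                      # the step out of time level tau uses u_tau, as in (2.27)
        new = [0.0] * N
        for x in range(N):
            if dist[x] > 0.0:
                for y in range(-M, M + 1):
                    new[(x + y) % N] += dist[x] * env[tau][x][y + M]
        dist = new
    return dist

N, T, M = 12, 6, 1
rng = random.Random(3)
env = draw_env(N, T, M, rng)
sigma0 = [rng.random() for _ in range(N)]
sigma_T = rap(sigma0, env, M)
max_err = max(
    abs(sum(pr * h for pr, h in zip(backward_walk_law(i, env, M, N), sigma0)) - sigma_T[i])
    for i in range(N)
)
```

The two computations agree up to floating-point rounding, since both are the same finite sum organized in a different order.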
Note that the initial height function $\sigma_0$ is a constant under the expectation $E^\omega$. Let us add another coordinate to keep track of time and write $\bar X^{i,\tau}_s = (X^{i,\tau}_s, \tau - s)$ for $s \ge 0$. Then $\bar X^{i,\tau}_s$ is a random walk on the planar lattice $\mathbb{Z}^2$ that always moves down one step in the $e_2$-direction, and if its current position is $(x, n)$, the $e_1$-coordinate of its next position is $x + y$ with probability $u_n(x, y)$. We shall call it the backward random walk in a random environment. In the next section we discuss this walk and its forward counterpart.
3. Random Walk in a Space-Time Random Environment
3.1. Definition of the model. We consider here a particular random walk in random environment (RWRE). The walk evolves on the planar integer lattice $\mathbb{Z}^2$, which we think of as space-time: the first component represents one-dimensional discrete space, and the second represents discrete time. We denote by $e_2$ the unit vector in the time-direction. The walks will not be random in the $e_2$-direction, but only in the spatial $e_1$-direction. We consider forward walks $\bar Z^{i,\tau}_m$ and backward walks $\bar X^{i,\tau}_m$. The subscript $m \in \mathbb{Z}_+$ is the time parameter of the walk and superscripts are initial points:
\[ \bar Z^{i,\tau}_0 = \bar X^{i,\tau}_0 = (i, \tau) \in \mathbb{Z}^2. \tag{3.1} \]
The forward walks move deterministically up in time, while the backward walks move deterministically down in time:
\[ \bar Z^{i,\tau}_m = (Z^{i,\tau}_m, \tau + m) \quad\text{and}\quad \bar X^{i,\tau}_m = (X^{i,\tau}_m, \tau - m) \quad\text{for } m \ge 0. \]
Since the time components of the walks are deterministic, only the spatial components $Z^{i,\tau}_m$ and $X^{i,\tau}_m$ are really relevant. We impose a finite range on the steps of the walks: there is a fixed constant $M$ such that
\[ \big| Z^{i,\tau}_{m+1} - Z^{i,\tau}_m \big| \le M \quad\text{and}\quad \big| X^{i,\tau}_{m+1} - X^{i,\tau}_m \big| \le M. \tag{3.2} \]
A note of advance justification for the setting: The backward walks are the ones relevant to the random average process. Distributions of forward and backward walks are obvious mappings of each other. However, we will be interested in the quenched mean processes of the walks as we vary the final time for the forward walk or the initial spacetime point for the backward walk. The results for the forward walk form an interesting point of comparison to the backward walk, even though they will not be used to analyze the random average process.
An environment is a configuration of probability vectors $\omega = (u_\tau(x) : (x, \tau) \in \mathbb{Z}^2)$, where each vector $u_\tau(x) = (u_\tau(x, y) : -M \le y \le M)$ satisfies $0 \le u_\tau(x, y) \le 1$ for all $-M \le y \le M$, and
\[ \sum_{y=-M}^{M} u_\tau(x, y) = 1. \]
An environment $\omega$ is a sample point of the probability space $(\Omega, \mathfrak{S}, \mathbb{P})$. The sample space is the product space $\Omega = \mathcal{P}^{\mathbb{Z}^2}$, where $\mathcal{P}$ is the space of probability vectors of length $2M + 1$, and $\mathfrak{S}$ is the product $\sigma$-field on $\Omega$ induced by the Borel sets on $\mathcal{P}$. Throughout, we assume that $\mathbb{P}$ is a product probability measure on $\Omega$ such that the vectors $\{u_\tau(x) : (x, \tau) \in \mathbb{Z}^2\}$ are independent and identically distributed. Expectation under $\mathbb{P}$ is denoted by $\mathbb{E}$. When for notational convenience we wish to think of $u_\tau(x)$ as an infinite vector, then $u_\tau(x, y) = 0$ for $|y| > M$. We write $u^\omega_\tau(x, y)$ to make explicit the environment $\omega$, and also $\omega_{x,\tau} = u_\tau(x)$ for the environment at space-time point $(x, \tau)$. Fix an environment $\omega$ and an initial point $(i, \tau)$. The forward and backward walks $\bar Z^{i,\tau}_m$ and $\bar X^{i,\tau}_m$ ($m \ge 0$) are defined as canonical $\mathbb{Z}^2$-valued Markov chains on their path spaces under the measure $P^\omega$ determined by the conditions
\[ P^\omega\{\bar Z^{i,\tau}_0 = (i, \tau)\} = 1, \qquad P^\omega\{\bar Z^{i,\tau}_{s+1} = (y, \tau + s + 1) \mid \bar Z^{i,\tau}_s = (x, \tau + s)\} = u_{\tau+s}(x, y - x) \]
for the forward walk, and by
\[ P^\omega\{\bar X^{i,\tau}_0 = (i, \tau)\} = 1, \qquad P^\omega\{\bar X^{i,\tau}_{s+1} = (y, \tau - s - 1) \mid \bar X^{i,\tau}_s = (x, \tau - s)\} = u_{\tau-s}(x, y - x) \]
for the backward walk. By dropping the time components $\tau$, $\tau \pm s$ and $\tau \pm s \pm 1$ from the equations we get the corresponding properties for the spatial walks $Z^{i,\tau}_s$ and $X^{i,\tau}_s$. When we consider many walks under a common environment $\omega$, it will be notationally convenient to attach the initial point $(i, \tau)$ to the walk and only the environment $\omega$ to the measure $P^\omega$. $P^\omega$ is called the quenched distribution, and expectation under $P^\omega$ is denoted by $E^\omega$. The annealed distribution and expectation are $P(\cdot) = \mathbb{E}P^\omega(\cdot)$ and $E(\cdot) = \mathbb{E}E^\omega(\cdot)$. Under $P$ both $X^{i,\tau}_m$ and $Z^{i,\tau}_m$ are ordinary homogeneous random walks on $\mathbb{Z}$ with jump probabilities $p(i, i + j) = p(0, j) = \mathbb{E}u_0(0, j)$. These walks satisfy the law of large numbers with velocity
\[ V = \sum_{j \in \mathbb{Z}} p(0, j)\, j. \tag{3.3} \]
As for RAP, we also use the notation $b = -V$.
3.2. Limits for quenched mean processes. We start by stating the quenched invariance principle for the space-time RWRE. $\{B(t) : t \ge 0\}$ denotes standard one-dimensional Brownian motion. $D_{\mathbb{R}}[0, \infty)$ is the space of real-valued cadlag functions on $[0, \infty)$ with the standard Skorohod metric [12]. Recall the definition (2.5) of the variance $\sigma_a^2$ of the annealed walk, and assumption (2.2) that guarantees that the quenched walk has stochastic noise.
Theorem 3.1 [24]. Assume (2.2). We have these bounds on the variance of the quenched mean: there exist constants $C_1$, $C_2$ such that for all $n$,
\[ C_1 n^{1/2} \le \mathbb{E}\big[ (E^\omega(X^{0,0}_n) - nV)^2 \big] \le C_2 n^{1/2}. \tag{3.4} \]
For $\mathbb{P}$-almost every $\omega$, under $P^\omega$ the process $n^{-1/2}(X^{0,0}_{\lfloor nt \rfloor} - ntV)$ converges weakly to the process $B(\sigma_a^2 t)$ on the path space $D_{\mathbb{R}}[0, \infty)$ as $n \to \infty$.
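The quenched mean in (3.4) can be computed exactly for a sampled environment by a backward recursion over time levels, and its second moment can then be estimated by averaging over independent environments. The sketch below is only an illustration: the names `quenched_mean`, `second_moment` and `two_point`, the two-point weight law, and the Monte Carlo estimator are assumptions made here, and no claim is made about the constants $C_1$, $C_2$.

```python
import random

def quenched_mean(n, M, draw_vector, rng):
    """E^omega(X_n^{0,0}) via g_s(x) = sum_y u(x, y) g_{s-1}(x + y), with g_0(x) = x.

    The window of sites shrinks by M per level, so only needed values are computed.
    """
    R = n * M
    g = [float(x) for x in range(-R, R + 1)]      # g_0 on the window -R..R
    for _ in range(n):
        R_new = R - M
        new = []
        for x in range(-R_new, R_new + 1):
            u = draw_vector(rng)                  # fresh i.i.d. vector; u[y + M] = u(x, y)
            new.append(sum(u[y + M] * g[x + y + R] for y in range(-M, M + 1)))
        g, R = new, R_new
    return g[0]                                   # the window has shrunk to the origin

def second_moment(n, samples, M, draw_vector, V, rng):
    """Monte Carlo estimate of the middle term of (3.4)."""
    vals = [quenched_mean(n, M, draw_vector, rng) - n * V for _ in range(samples)]
    return sum(v * v for v in vals) / samples

def two_point(rng):
    """Hypothetical law: weight on the jump -1 is 0.2 or 0.8, each with probability 1/2."""
    return [0.2, 0.8, 0.0] if rng.random() < 0.5 else [0.8, 0.2, 0.0]
```

A call such as `second_moment(n, 200, 1, two_point, -0.5, random.Random(0))` for a few increasing values of `n` gives a rough empirical view of the growth rate. With deterministic weights the quenched mean equals $nV$ exactly and the estimate vanishes.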
Quite obviously, $X^{0,0}_n$ and $Z^{0,0}_n$ are interchangeable in the above theorem. Bounds (3.4) suggest the possibility of a weak limit for the quenched mean on the scale $n^{1/4}$. Such results are the main point of this section. For $t \ge 0$, $r \in \mathbb{R}$ we define scaled, centered quenched mean processes
\[ a_n(t, r) = n^{-1/4} \big\{ E^\omega\big( Z^{\lfloor r\sqrt{n} \rfloor, 0}_{\lfloor nt \rfloor} \big) - \lfloor r\sqrt{n} \rfloor - ntV \big\} \tag{3.5} \]
for the forward walks, and
\[ y_n(t, r) = n^{-1/4} \big\{ E^\omega\big( X^{\lfloor ntb \rfloor + \lfloor r\sqrt{n} \rfloor,\, \lfloor nt \rfloor}_{\lfloor nt \rfloor} \big) - \lfloor r\sqrt{n} \rfloor \big\} \tag{3.6} \]
for the backward walks. In words, the process $a_n$ follows forward walks from level 0 to level $\lfloor nt \rfloor$ and records centered quenched means. Process $y_n$ follows backward walks from level $\lfloor nt \rfloor$ down to level 0 and records the centered quenched mean of the point it hits at level 0. The initial points of the backward walks are translated by the negative of the mean drift $\lfloor ntb \rfloor$. This way the temporal processes $a_n(\cdot, r)$ and $y_n(\cdot, r)$ obtained by fixing $r$ are meaningful processes. Random variable $y_n(t, r)$ is not exactly centered, for
\[ \mathbb{E}\, y_n(t, r) = n^{-1/4} \big( \lfloor ntb \rfloor - \lfloor nt \rfloor b \big). \tag{3.7} \]
Of course this makes no difference to the limit. Next we describe the Gaussian limiting processes. Recall the constant $\kappa$ defined in (2.14) and the function $\Gamma_q$ defined in (2.15). Let $\{a(t, r) : (t, r) \in \mathbb{R}_+ \times \mathbb{R}\}$ and $\{y(t, r) : (t, r) \in \mathbb{R}_+ \times \mathbb{R}\}$ be the mean zero Gaussian processes with covariances
\[ E\, a(s, q) a(t, r) = \Gamma_q\big( (s \wedge t, q), (s \wedge t, r) \big) \]
and
\[ E\, y(s, q) y(t, r) = \Gamma_q\big( (s, q), (t, r) \big) \]
for $s, t \ge 0$ and $q, r \in \mathbb{R}$. When one argument is fixed, the random function $r \mapsto y(t, r)$ is denoted by $y(t, \cdot)$ and $t \mapsto y(t, r)$ by $y(\cdot, r)$. It follows from the covariances that at a fixed time level $t$ the spatial processes $a(t, \cdot)$ and $y(t, \cdot)$ are equal in distribution. We record basic properties of these processes.
Lemma 3.1. The process $\{y(t, r)\}$ has a version with continuous paths as functions of $(t, r)$. Furthermore, it has the following Markovian structure in time. Given $0 = t_0 < t_1 < \cdots < t_n$, let $\{\tilde y(t_i - t_{i-1}, \cdot) : 1 \le i \le n\}$ be independent random functions such that $\tilde y(t_i - t_{i-1}, \cdot)$ has the distribution of $y(t_i - t_{i-1}, \cdot)$ for $i = 1, \dots, n$. Define $y^*(t_1, r) = \tilde y(t_1, r)$ for $r \in \mathbb{R}$, and then inductively for $i = 2, \dots, n$ and $r \in \mathbb{R}$,
\[ y^*(t_i, r) = \int_{\mathbb{R}} \varphi_{\sigma_a^2 (t_i - t_{i-1})}(u)\, y^*(t_{i-1}, r + u)\, du + \tilde y(t_i - t_{i-1}, r). \tag{3.8} \]
Then the joint distribution of the random functions $\{y^*(t_i, \cdot) : 1 \le i \le n\}$ is the same as that of $\{y(t_i, \cdot) : 1 \le i \le n\}$ from the original process.
Sketch of proof. Consider $(s, q)$ and $(t, r)$ varying in a compact set. From the covariance comes the estimate
\[ E\big[ (y(s, q) - y(t, r))^2 \big] \le C \big( |s - t|^{1/2} + |q - r| \big) \tag{3.9} \]
from which, since the integrand is Gaussian,
\[ E\big[ (y(s, q) - y(t, r))^{10} \big] \le C \big( |s - t|^{1/2} + |q - r| \big)^5 \le C\, \| (s, q) - (t, r) \|^{5/2}. \tag{3.10} \]
Kolmogorov’s criterion implies the existence of a continuous version. For the second statement use (3.8) to express a linear combination $\sum_{i=1}^{n} \theta_i y^*(t_i, r_i)$ in the form
\[ \sum_{i=1}^{n} \theta_i\, y^*(t_i, r_i) = \sum_{i=1}^{n} \int_{\mathbb{R}} \tilde y(t_i - t_{i-1}, x)\, \lambda_i(dx), \]
where the signed measures $\lambda_i$ are linear combinations of Gaussian distributions. Use this representation to compute the variance of the linear combination on the left-hand side (it is mean zero Gaussian). Observe that this variance equals
\[ \sum_{i,j} \theta_i \theta_j\, \Gamma_q\big( (t_i, r_i), (t_j, r_j) \big). \]
Lemma 3.2. The process $\{a(t, r)\}$ has a version with continuous paths as functions of $(t, r)$. Furthermore, it has independent increments in time. A more precise statement follows. Given $0 = t_0 < t_1 < \cdots < t_n$, let $\{\tilde a(t_i - t_{i-1}, \cdot) : 1 \le i \le n\}$ be independent random functions such that $\tilde a(t_i - t_{i-1}, \cdot)$ has the distribution of $a(t_i - t_{i-1}, \cdot)$ for $i = 1, \dots, n$. Define $a^*(t_1, r) = \tilde a(t_1, r)$ for $r \in \mathbb{R}$, and then inductively for $i = 2, \dots, n$ and $r \in \mathbb{R}$,
\[ a^*(t_i, r) = a^*(t_{i-1}, r) + \int_{\mathbb{R}} \varphi_{\sigma_a^2 t_{i-1}}(u)\, \tilde a(t_i - t_{i-1}, r + u)\, du. \tag{3.11} \]
Then the joint distribution of the random functions $\{a^*(t_i, \cdot) : 1 \le i \le n\}$ is the same as that of $\{a(t_i, \cdot) : 1 \le i \le n\}$ from the original process. The proof of the lemma above is similar to the previous one so we omit it.
Remark 3.1. Processes $y$ and $a$ have representations in terms of stochastic integrals. As in Remark 2.1 let $W$ be a two-parameter Brownian motion on $\mathbb R_+\times\mathbb R$. In more technical terms, $W$ is the orthogonal Gaussian martingale measure on $\mathbb R_+\times\mathbb R$ with covariance $E[W([0,s]\times A)\,W([0,t]\times B)] = (s\wedge t)\,\mathrm{Leb}(A\cap B)$ for $s,t\in\mathbb R_+$ and bounded Borel sets $A,B\subseteq\mathbb R$. Then
\[ y(t,r) = \sigma_a\sqrt{\kappa}\iint_{[0,t]\times\mathbb R}\varphi_{\sigma_a^2(t-s)}(r-z)\,dW(s,z) \tag{3.12} \]
while
\[ a(t,r) = \sigma_a\sqrt{\kappa}\iint_{[0,t]\times\mathbb R}\varphi_{\sigma_a^2 s}(r-z)\,dW(s,z). \tag{3.13} \]
By the equations above we mean equality in distribution of processes. They can be verified by a comparison of covariances, as the integrals on the right are also Gaussian processes. Formula (3.12) implies that process $\{y(t,r)\}$ is a weak solution of the stochastic heat equation
\[ y_t = \tfrac12\sigma_a^2\,y_{rr} + \sigma_a\sqrt{\kappa}\,\dot W, \qquad y(0,r)\equiv 0, \tag{3.14} \]
where $\dot W$ is white noise. (See [34].) These observations are not used elsewhere in the paper.

Next we record the limits for the quenched mean processes. The four theorems that follow require assumption (2.2) of stochastic noise and the assumption that the annealed probabilities $p(0,j)=E\,u^\omega_0(0,j)$ have span 1. This next theorem is the one needed for Theorem 2.1 for RAP.

Theorem 3.2. The finite dimensional distributions of processes $y_n(t,r)$ converge to those of $y(t,r)$ as $n\to\infty$. More precisely, for any finite set of points $\{(t_j,r_j) : 1\le j\le k\}$ in $\mathbb R_+\times\mathbb R$, the vector $(y_n(t_j,r_j) : 1\le j\le k)$ converges weakly in $\mathbb R^k$ to the vector $(y(t_j,r_j) : 1\le j\le k)$.

Observe that property (3.8) is easy to understand from the limit. It reflects the Markovian property
\[ E^\omega(X^{x,\tau}_{\tau}) = \sum_{y} P^\omega(X^{x,\tau}_{\tau-s}=y)\,E^\omega(X^{y,s}_{s}) \quad\text{for } s<\tau, \]
and the “homogenization” of the coefficients which converge to Gaussian probabilities by the quenched central limit theorem.

Let us restrict the backward quenched mean process to a single characteristic to observe the outcome. This is the source of the first term in the temporal correlations (2.19) for RAP. The next statement needs no proof, for it is just a particular case of the limit in Theorem 3.2.

Corollary 3.3. Fix $r\in\mathbb R$. As $n\to\infty$, the finite dimensional distributions of the process $\{y_n(t,r) : t\ge 0\}$ converge to those of the mean zero Gaussian process $\{y(t) : t\ge 0\}$ with covariance
\[ E\,y(s)y(t) = \frac{\kappa\sigma_a}{\sqrt{2\pi}}\bigl(\sqrt{t+s}-\sqrt{t-s}\bigr) \qquad (s<t). \]
Then the same for the forward processes.
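The covariance in Corollary 3.3 can be cross-checked against the stochastic-integral representation (3.12). Assuming the usual isometry for integrals against $W$ and the fact that the product of two centered Gaussian densities integrates to a Gaussian density with the summed variance, the covariance at a common point $r$ is $\sigma_a^2\kappa\int_0^s\varphi_{\sigma_a^2(s+t-2u)}(0)\,du$. The sketch below evaluates this integral by a midpoint rule and compares it with the closed form; the parameter values and function names are illustrative choices, not taken from the paper.

```python
import math

def gaussian_density_at_zero(variance):
    """Density at 0 of a centered Gaussian with the given variance."""
    return 1.0 / math.sqrt(2.0 * math.pi * variance)

def covariance_by_quadrature(s, t, sigma_a, kappa, steps=20_000):
    """Midpoint rule for sigma_a^2 * kappa * int_0^s phi_{sigma_a^2 (s+t-2u)}(0) du, s < t."""
    h = s / steps
    total = sum(
        gaussian_density_at_zero(sigma_a**2 * (s + t - 2.0 * (m + 0.5) * h))
        for m in range(steps)
    )
    return sigma_a**2 * kappa * total * h

def covariance_closed_form(s, t, sigma_a, kappa):
    """Right-hand side of the covariance formula in Corollary 3.3."""
    return kappa * sigma_a / math.sqrt(2.0 * math.pi) * (math.sqrt(t + s) - math.sqrt(t - s))

# Illustrative parameter values (assumptions, not from the paper).
s, t, sigma_a, kappa = 1.0, 1.5, 0.7, 0.4
numeric = covariance_by_quadrature(s, t, sigma_a, kappa)
exact = covariance_closed_form(s, t, sigma_a, kappa)
print(numeric, exact)
```

The two numbers should agree up to quadrature error, which is small here because the integrand is smooth on $[0,s]$ when $s<t$.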
Random Average Process
Theorem 3.4. The finite dimensional distributions of processes $a_n$ converge to those of $a$ as $n\to\infty$. More precisely, for any finite set of points $\{(t_j,r_j) : 1\le j\le k\}$ in $\mathbb R_+\times\mathbb R$, the vector $(a_n(t_j,r_j) : 1\le j\le k)$ converges weakly in $\mathbb R^k$ to the vector $(a(t_j,r_j) : 1\le j\le k)$.

When we specialize to a temporal process we also verify path-level tightness and hence get weak convergence of the entire process. When $r=q$ in (2.16) we get
\[ \Gamma_q\bigl((s\wedge t,r),(s\wedge t,r)\bigr) = c_a\sqrt{s\wedge t} \]
with $c_a=\sigma_D^2/(\beta\sqrt{\pi\sigma_a^2})$. Since $s\wedge t$ is the covariance of standard Brownian motion $B(\cdot)$, we get the following limit.

Corollary 3.5. Fix $r\in\mathbb R$. As $n\to\infty$, the process $\{a_n(t,r) : t\ge 0\}$ converges weakly to $\{B(c_a\sqrt{t}) : t\ge 0\}$ on the path space $D_{\mathbb R}[0,\infty)$.

4. Random Walk Preliminaries

In this section we collect some auxiliary results for random walks. The basic assumptions, (2.2) and span 1 for the $p(0,j)=E\,u_0(0,j)$ walk, are in force throughout the remainder of the paper. Recall the drift in the $e_1$ direction at the origin defined by
\[ D(\omega) = \sum_{x\in\mathbb Z} x\,u^\omega_0(0,x), \]
with mean $V=-b=E(D)$. Define the centered drift by $g(\omega)=D(\omega)-V=E^\omega(X_1^{0,0}-V)$. The variance is $\sigma_D^2=E[g^2]$. The variance of the i.i.d. annealed walk in the $e_1$ direction is
\[ \sigma_a^2 = \sum_{x\in\mathbb Z}(x-V)^2\,E\,u^\omega_0(0,x). \]
These variances are connected by $\sigma_a^2=\sigma_D^2+E[(X_1^{0,0}-D)^2]$. Let $X_n$ and $\tilde X_n$ be two independent walks in a common environment $\omega$, and $Y_n=X_n-\tilde X_n$. In the annealed sense $Y_n$ is a Markov chain on $\mathbb Z$ with transition probabilities
\[ q(0,y) = \sum_{z\in\mathbb Z} E[u_0(0,z)\,u_0(0,z+y)] \qquad (y\in\mathbb Z), \]
\[ q(x,y) = \sum_{z\in\mathbb Z} p(0,z)\,p(0,z+y-x) \qquad (x\ne 0,\ y\in\mathbb Z). \]
$Y_n$ can be thought of as a symmetric random walk on $\mathbb Z$ whose transition has been perturbed at the origin. The corresponding homogeneous, unperturbed transition probabilities are
\[ \bar q(x,y) = \bar q(0,y-x) = \sum_{z\in\mathbb Z} p(0,z)\,p(0,z+y-x) \qquad (x,y\in\mathbb Z). \]
The $\bar q$-walk has variance $2\sigma_a^2$ and span 1 as can be deduced from the definition and the hypothesis that the $p$-walk has span 1. Since the $\bar q$-walk is symmetric, its range must be a subgroup of $\mathbb Z$. Then span 1 implies that it is irreducible. The $\bar q$-walk is recurrent by the Chung-Fuchs theorem. Elementary arguments extend irreducibility and recurrence from $\bar q$ to the $q$-chain because away from the origin the two walks are the same. Note that assumption (2.2) is required here because the $q$-walk is absorbed at the origin iff (2.2) fails.

Note that the functions defined in (2.7) are the characteristic functions of these transitions:
\[ \lambda(t) = \sum_{x} q(0,x)e^{itx} \quad\text{and}\quad \bar\lambda(t) = \sum_{x}\bar q(0,x)e^{itx}. \]
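To make the two kernels concrete, here is a small sketch that builds $p(0,\cdot)$, $q(0,\cdot)$ and $\bar q(0,\cdot)$ from a finitely supported environment law. The two-atom law used below (jump probabilities $u_0(0,0)=U$, $u_0(0,1)=1-U$ with $U\in\{0.2,0.8\}$ equally likely) is a hypothetical example chosen for illustration, and the function names are not from the paper.

```python
from collections import defaultdict

def annealed_kernel(atoms):
    """p(0, z) = E u_0(0, z) for an environment law given as (weight, {step: prob}) atoms."""
    p = defaultdict(float)
    for weight, u in atoms:
        for z, prob in u.items():
            p[z] += weight * prob
    return dict(p)

def q_at_origin(atoms):
    """q(0, y) = sum_z E[u_0(0, z) u_0(0, z + y)]: both walks see the same random jump law."""
    q = defaultdict(float)
    for weight, u in atoms:
        for z1, a in u.items():
            for z2, b in u.items():
                q[z2 - z1] += weight * a * b
    return dict(q)

def q_bar_kernel(p):
    """q_bar(0, y) = sum_z p(0, z) p(0, z + y): the two jumps use independent averaged laws."""
    q = defaultdict(float)
    for z1, a in p.items():
        for z2, b in p.items():
            q[z2 - z1] += a * b
    return dict(q)

def variance(kernel):
    mean = sum(x * w for x, w in kernel.items())
    return sum((x - mean) ** 2 * w for x, w in kernel.items())

# Hypothetical two-atom environment law (an assumption for illustration).
atoms = [(0.5, {0: 0.2, 1: 0.8}), (0.5, {0: 0.8, 1: 0.2})]
p = annealed_kernel(atoms)
q0 = q_at_origin(atoms)
qb = q_bar_kernel(p)
print(p, q0, qb)
```

In this example the perturbed kernel $q(0,\cdot)$ puts more mass on $0$ than $\bar q(0,\cdot)$ does, because two walks at the same site see the same jump law, and the variance of $\bar q$ is twice that of $p$.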
Multistep transitions are denoted by $q^k(x,y)$ and $\bar q^k(x,y)$, defined as usual by $\bar q^0(x,y)=1\{x=y\}$, $\bar q^1(x,y)=\bar q(x,y)$,
\[ \bar q^k(x,y) = \sum_{x_1,\dots,x_{k-1}\in\mathbb Z}\bar q(x,x_1)\,\bar q(x_1,x_2)\cdots\bar q(x_{k-1},y) \qquad (k\ge 2). \]
Green functions for the $\bar q$- and $q$-walks are
\[ \bar G_n(x,y) = \sum_{k=0}^{n}\bar q^k(x,y) \quad\text{and}\quad G_n(x,y) = \sum_{k=0}^{n} q^k(x,y). \]
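$\bar G_n$ is symmetric but $G_n$ not necessarily. The potential kernel $\bar a$ of the $\bar q$-walk is defined by
\[ \bar a(x) = \lim_{n\to\infty}\bigl\{\bar G_n(0,0)-\bar G_n(x,0)\bigr\}. \tag{4.1} \]
It satisfies $\bar a(0)=0$, the equations
\[ \bar a(x) = \sum_{y\in\mathbb Z}\bar q(x,y)\,\bar a(y) \ \text{ for } x\ne 0, \quad\text{and}\quad \sum_{y\in\mathbb Z}\bar q(0,y)\,\bar a(y) = 1, \tag{4.2} \]
and the limit
\[ \lim_{x\to\pm\infty}\frac{\bar a(x)}{|x|} = \frac{1}{2\sigma_a^2}. \tag{4.3} \]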
These facts can be found in Sects. 28 and 29 of Spitzer’s monograph [31].

Example 4.1. If for some $k\in\mathbb Z$, $p(0,k)+p(0,k+1)=1$, so that $\bar q(0,x)=0$ for $x\notin\{-1,0,1\}$, then $\bar a(x)=|x|/(2\sigma_a^2)$.
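Example 4.1 can be checked numerically. The sketch below does not use definition (4.1) directly; instead it assumes the standard Fourier representation of the potential kernel of a symmetric recurrent walk, $\bar a(x)=\frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{1-\cos(xt)}{1-\bar\lambda(t)}\,dt$, which follows from (4.1) by Fourier inversion of $\bar q^k$. The values $p(0,0)=0.3$, $p(0,1)=0.7$ and the function names are illustrative assumptions.

```python
import math

def char_function(kernel, t):
    """Characteristic function of a symmetric kernel on Z (real-valued by symmetry)."""
    return sum(w * math.cos(x * t) for x, w in kernel.items())

def potential_kernel(kernel, x, nodes=2000):
    """Midpoint rule for (1/2pi) * int_{-pi}^{pi} (1 - cos(x t)) / (1 - lambda_bar(t)) dt.

    With an even number of nodes the grid never hits t = 0, where the
    integrand has a removable singularity.
    """
    h = 2.0 * math.pi / nodes
    total = 0.0
    for m in range(nodes):
        t = -math.pi + (m + 0.5) * h
        total += (1.0 - math.cos(x * t)) / (1.0 - char_function(kernel, t))
    return total * h / (2.0 * math.pi)

# Setting of Example 4.1 with k = 0 and illustrative values p(0,0) = 0.3, p(0,1) = 0.7.
p0, p1 = 0.3, 0.7
sigma_a2 = p0 * p1                      # variance of the p-walk
q_bar = {-1: p0 * p1, 0: p0**2 + p1**2, 1: p0 * p1}

for x in range(0, 5):
    print(x, potential_kernel(q_bar, x), abs(x) / (2.0 * sigma_a2))
```

For a nearest-neighbor $\bar q$ the integrand is a trigonometric polynomial, so the midpoint rule reproduces $|x|/(2\sigma_a^2)$ up to rounding.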
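Define the constant
\[ \beta = \sum_{x\in\mathbb Z} q(0,x)\,\bar a(x). \tag{4.4} \]
To see that this definition agrees with (2.8), observe that the above equality leads to
\[ \beta = \lim_{n\to\infty}\Bigl\{\sum_{k=0}^{n}\bar q^k(0,0) - \sum_{k=0}^{n}\sum_{x} q(0,x)\,\bar q^k(x,0)\Bigr\}. \]
Think of the last sum over $x$ as $P[Y_1+\bar Y_k=0]$, where $\bar Y_k$ is the $\bar q$-walk, and $Y_1$ and $\bar Y_k$ are independent. Since $Y_1+\bar Y_k$ has characteristic function $\lambda(t)\bar\lambda^k(t)$, we get
\[ \beta = \lim_{n\to\infty}\frac{1}{2\pi}\int_{-\pi}^{\pi}(1-\lambda(t))\sum_{k=0}^{n}\bar\lambda^k(t)\,dt = \frac{1}{2\pi}\int_{-\pi}^{\pi}\frac{1-\lambda(t)}{1-\bar\lambda(t)}\,dt. \]
Ferrari and Fontes [14] begin their development by showing that
\[ \beta = \lim_{s\nearrow 1}\frac{\bar\zeta(s)}{\zeta(s)}, \]
where $\zeta$ and $\bar\zeta$ are the generating functions
\[ \zeta(s) = \sum_{k=0}^{\infty} q^k(0,0)\,s^k \quad\text{and}\quad \bar\zeta(s) = \sum_{k=0}^{\infty}\bar q^k(0,0)\,s^k. \]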
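The agreement between definition (4.4) and the integral formula for $\beta$ can be illustrated on a nearest-neighbor example. The sketch below is a minimal check under stated assumptions: the environment law ($u_0(0,0)=U$, $u_0(0,1)=1-U$ with $U\in\{0.1,0.7\}$ equally likely) is hypothetical, and the closed form $\bar a(x)=|x|/(2\sigma_a^2)$ from Example 4.1 is used for the sum in (4.4), so that route is valid only for nearest-neighbor $\bar q$.

```python
import math

def char_function(kernel, t):
    """Characteristic function of a symmetric kernel on Z (real-valued by symmetry)."""
    return sum(w * math.cos(x * t) for x, w in kernel.items())

def beta_by_integral(q0, q_bar, nodes=2000):
    """Midpoint rule for (1/2pi) * int_{-pi}^{pi} (1 - lambda(t)) / (1 - lambda_bar(t)) dt."""
    h = 2.0 * math.pi / nodes
    total = 0.0
    for m in range(nodes):
        t = -math.pi + (m + 0.5) * h      # grid avoids t = 0
        total += (1.0 - char_function(q0, t)) / (1.0 - char_function(q_bar, t))
    return total * h / (2.0 * math.pi)

def beta_by_potential_kernel(q0, sigma_a2):
    """Definition (4.4) with a_bar(x) = |x| / (2 sigma_a^2); nearest-neighbor case only."""
    return sum(w * abs(x) / (2.0 * sigma_a2) for x, w in q0.items())

# Hypothetical environment law: U takes the values below with equal probability.
u_values = [0.1, 0.7]
mean_u = sum(u_values) / len(u_values)
mixed = sum(u * (1.0 - u) for u in u_values) / len(u_values)   # E[U(1 - U)]
sigma_a2 = mean_u * (1.0 - mean_u)                              # variance of the p-walk
q0 = {-1: mixed, 0: 1.0 - 2.0 * mixed, 1: mixed}
q_bar = {-1: sigma_a2, 0: 1.0 - 2.0 * sigma_a2, 1: sigma_a2}

print(beta_by_integral(q0, q_bar), beta_by_potential_kernel(q0, sigma_a2))
```

In this example both routes give $\beta=E[U(1-U)]/\sigma_a^2$, which is strictly less than 1 whenever $U$ is genuinely random.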
Our development bypasses the generating functions. We begin with the asymptotics of the Green functions. This is the key to all our results, both for RWRE and RAP. As already pointed out, without assumption (2.2) the result would be completely wrong because the $q$-walk absorbs at 0, while a span $h>1$ would appear in this limit as an extra factor.

Lemma 4.1. Let $x\in\mathbb R$, and let $x_n$ be any sequence of integers such that $x_n-n^{1/2}x$ stays bounded. Then
\[ \lim_{n\to\infty} n^{-1/2}G_n(x_n,0) = \frac{1}{2\beta\sigma_a^2}\int_0^{2\sigma_a^2}\frac{1}{\sqrt{2\pi v}}\exp\Bigl\{-\frac{x^2}{2v}\Bigr\}\,dv. \tag{4.5} \]

Proof. For the homogeneous $\bar q$-walk the local limit theorem [11, Sect. 2.5] implies that
\[ \lim_{n\to\infty} n^{-1/2}\bar G_n(0,x_n) = \frac{1}{2\sigma_a^2}\int_0^{2\sigma_a^2}\frac{1}{\sqrt{2\pi v}}\exp\Bigl\{-\frac{x^2}{2v}\Bigr\}\,dv \tag{4.6} \]
and by symmetry the same limit is true for $n^{-1/2}\bar G_n(x_n,0)$. In particular,
\[ \lim_{n\to\infty} n^{-1/2}\bar G_n(0,0) = \frac{1}{\sqrt{\pi\sigma_a^2}}. \tag{4.7} \]
Next we show
\[ \lim_{n\to\infty} n^{-1/2}G_n(0,0) = \frac{1}{\beta\sqrt{\pi\sigma_a^2}}. \tag{4.8} \]
Using (4.2), $\bar a(0)=0$, and $\bar q(x,y)=q(x,y)$ for $x\ne 0$ we develop
\[
\begin{aligned}
\sum_{x\in\mathbb Z} q^m(0,x)\,\bar a(x) &= \sum_{x\ne 0} q^m(0,x)\,\bar a(x) = \sum_{x\ne 0,\,y\in\mathbb Z} q^m(0,x)\,\bar q(x,y)\,\bar a(y) \\
&= \sum_{x\ne 0,\,y\in\mathbb Z} q^m(0,x)\,q(x,y)\,\bar a(y) \\
&= \sum_{y\in\mathbb Z} q^{m+1}(0,y)\,\bar a(y) - q^m(0,0)\sum_{y\in\mathbb Z} q(0,y)\,\bar a(y).
\end{aligned}
\]
Identify $\beta$ in the last sum above and sum over $m=0,1,\dots,n-1$ to get
\[ \bigl(1+q(0,0)+\cdots+q^{n-1}(0,0)\bigr)\beta = \sum_{x\in\mathbb Z} q^n(0,x)\,\bar a(x). \]
Write this in the form
\[ n^{-1/2}G_{n-1}(0,0)\,\beta = n^{-1/2}E_0\bigl[\bar a(Y_n)\bigr]. \]
Recall that $Y_n=X_n-\tilde X_n$, where $X_n$ and $\tilde X_n$ are two independent walks in the same environment. Thus by Theorem 3.1 $n^{-1/2}Y_n$ converges weakly to a centered Gaussian with variance $2\sigma_a^2$. Under the annealed measure the walks $X_n$ and $\tilde X_n$ are ordinary i.i.d. walks with bounded steps, hence there is enough uniform integrability to conclude that $n^{-1/2}E_0|Y_n|\to 2\sqrt{\sigma_a^2/\pi}$. By (4.3) and straightforward estimation,
\[ n^{-1/2}E_0\bigl[\bar a(Y_n)\bigr] \to \frac{1}{\sqrt{\sigma_a^2\pi}}. \]
This proves (4.8). From (4.7)–(4.8) we take the conclusion
\[ \lim_{n\to\infty}\frac{1}{\sqrt n}\,\bigl|\beta G_n(0,0)-\bar G_n(0,0)\bigr| = 0. \tag{4.9} \]
Let $f^0(z,0)=1\{z=0\}$ and for $k\ge 1$ let
\[ f^k(z,0) = 1\{z\ne 0\}\sum_{z_1\ne 0,\dots,z_{k-1}\ne 0} q(z,z_1)\,q(z_1,z_2)\cdots q(z_{k-1},0). \]
This is the probability that the first visit to the origin occurs at time $k$, including a possible first visit at time 0. Note that this quantity is the same for the $q$ and $\bar q$ walks. Now bound
\[ \sup_{z\in\mathbb Z}\Bigl|\frac{\beta}{\sqrt n}G_n(z,0)-\frac{1}{\sqrt n}\bar G_n(z,0)\Bigr| \le \sup_{z\in\mathbb Z}\frac{1}{\sqrt n}\sum_{k=0}^{n} f^k(z,0)\,\bigl|\beta G_{n-k}(0,0)-\bar G_{n-k}(0,0)\bigr|. \]
To see that the last line vanishes as $n\to\infty$, by (4.9) choose $n_0$ so that $|\beta G_{n-k}(0,0)-\bar G_{n-k}(0,0)|\le\varepsilon\sqrt{n-k}$ for $k\le n-n_0$, while trivially $|\beta G_{n-k}(0,0)-\bar G_{n-k}(0,0)|\le Cn_0$ for $n-n_0<k\le n$. The conclusion (4.5) now follows from this and (4.6).
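The exact identity behind (4.8), $\beta\,G_{n-1}(0,0)=E_0[\bar a(Y_n)]$, can be verified numerically by evolving the law of the $q$-chain. The sketch below is illustrative only: the kernels come from a hypothetical two-atom environment ($u_0(0,0)=U$, $u_0(0,1)=1-U$, $U\in\{0.2,0.8\}$ equally likely, so $\sigma_a^2=0.25$), and $\bar a(x)=|x|/(2\sigma_a^2)$ is taken from Example 4.1.

```python
def step(mu, q0, q_bar):
    """Push the law mu of Y_m one step forward under the q-chain."""
    nxt = {}
    for x, mass in mu.items():
        kernel = q0 if x == 0 else q_bar   # the kernel is perturbed only at the origin
        for jump, w in kernel.items():
            nxt[x + jump] = nxt.get(x + jump, 0.0) + mass * w
    return nxt

def identity_gap(q0, q_bar, a_bar, beta, n):
    """Largest |beta * G_{m-1}(0,0) - E_0 a_bar(Y_m)| over 1 <= m <= n, and G_{n-1}(0,0)."""
    mu = {0: 1.0}      # law of Y_0 under P_0
    green = 0.0        # running value of G_{m-1}(0,0)
    worst = 0.0
    for _ in range(n):
        green += mu.get(0, 0.0)            # add q^{m-1}(0,0)
        mu = step(mu, q0, q_bar)           # law of Y_m
        expected_a = sum(mass * a_bar(x) for x, mass in mu.items())
        worst = max(worst, abs(beta * green - expected_a))
    return worst, green

# Hypothetical nearest-neighbor example (assumption for illustration).
sigma_a2 = 0.25
q0 = {-1: 0.16, 0: 0.68, 1: 0.16}
q_bar = {-1: 0.25, 0: 0.50, 1: 0.25}
a_bar = lambda x: abs(x) / (2.0 * sigma_a2)
beta = sum(w * a_bar(x) for x, w in q0.items())

worst, green = identity_gap(q0, q_bar, a_bar, beta, n=200)
print(beta, worst, green / 200 ** 0.5)
```

The gap should be at rounding level for every $m$, since the identity is exact. The last printed number is $n^{-1/2}G_{n-1}(0,0)$, which by (4.8) should approach $1/(\beta\sqrt{\pi\sigma_a^2})$ as $n$ grows.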
Lemma 4.2.
\[ \sup_{n\ge 1}\,\sup_{x\in\mathbb Z}\bigl|G_n(x,0)-G_n(x+1,0)\bigr| < \infty. \]

Proof. Let $T_y=\inf\{n\ge 1 : Y_n=y\}$ denote the first hitting time of the point $y$. Then
\[
\begin{aligned}
G_n(x,0) &= E_x\Bigl[\sum_{k=0}^{n}1\{Y_k=0\}\Bigr] = E_x\Bigl[\sum_{k=0}^{T_y\wedge n}1\{Y_k=0\}\Bigr] + E_x\Bigl[\sum_{k=T_y\wedge n+1}^{n}1\{Y_k=0\}\Bigr] \\
&\le E_x\Bigl[\sum_{k=0}^{T_y}1\{Y_k=0\}\Bigr] + G_n(y,0).
\end{aligned}
\]
In an irreducible Markov chain the expectation $E_x\bigl[\sum_{k=0}^{T_y}1\{Y_k=0\}\bigr]$ is finite for any given states $x,y$ [8, Theorem 3 in Sect. I.9]. Since this is independent of $n$, the inequalities above show that
\[ \sup_{n}\,\sup_{-a\le x\le a}\bigl|G_n(x,0)-G_n(x+1,0)\bigr| < \infty \tag{4.10} \]
for any fixed $a$.

Fix a positive integer $a$ larger than the range of the jump kernels $q(x,y)$ and $\bar q(x,y)$. Consider $x>a$. Let $\sigma=\inf\{n\ge 1 : Y_n\le a-1\}$ and $\tau=\inf\{n\ge 1 : Y_n\le a\}$. Since the $q$-walks starting at $x$ and $x+1$ obey the translation-invariant kernel $\bar q$ until they hit the origin,
\[ P_x[Y_\sigma=y,\ \sigma=n] = P_{x+1}[Y_\tau=y+1,\ \tau=n]. \]
(Any path that starts at $x$ and enters $[0,a-1]$ at $y$ can be translated by 1 to a path that starts at $x+1$ and enters $[0,a]$ at $y+1$, without changing its probability.) Consequently
\[ G_n(x,0)-G_n(x+1,0) = \sum_{k=1}^{n}\sum_{y=0}^{a-1}P_x[Y_\sigma=y,\ \sigma=k]\,\bigl\{G_{n-k}(y,0)-G_{n-k}(y+1,0)\bigr\}. \]
Together with (4.10) this shows that the quantity in the statement of the lemma is uniformly bounded over $x\ge 0$. The same argument works for $x\le 0$.

One can also derive the limit
\[ \lim_{n\to\infty}\bigl\{G_n(0,0)-G_n(x,0)\bigr\} = \beta^{-1}\bar a(x) \]
but we have no need for this.

Lastly, a moderate deviation bound for the space-time RWRE with bounded steps. Let $X_s^{i,\tau}$ be the spatial backward walk defined in Sect. 3 with the bound (3.2) on the steps. Let $\widetilde X_s^{i,\tau}=X_s^{i,\tau}-i-Vs$ be the centered walk.

Lemma 4.3. For $m,n\in\mathbb N$, let $(i(m,n),\tau(m,n))\in\mathbb Z^2$, $v(n)\ge 1$, and let $s(n)\to\infty$ be a sequence of positive integers. Let $\alpha$, $\gamma$ and $c$ be positive reals. Assume
\[ \sum_{n=1}^{\infty} v(n)\,s(n)^{\alpha}\exp\{-c\,s(n)^{\gamma}\} < \infty. \]
Then for P-almost every $\omega$,
\[ \lim_{n\to\infty}\ \max_{1\le m\le v(n)} s(n)^{\alpha}\,P^\omega\Bigl\{\max_{1\le k\le s(n)}\widetilde X_k^{\,i(m,n),\tau(m,n)} \ge c\,s(n)^{\frac12+\gamma}\Bigr\} = 0. \tag{4.11} \]

Proof. Fix $\varepsilon>0$. By Markov’s inequality and translation-invariance,
\[
\begin{aligned}
&P\Bigl\{\omega : \max_{1\le m\le v(n)} s(n)^{\alpha}\,P^\omega\Bigl\{\max_{1\le k\le s(n)}\widetilde X_k^{\,i(m,n),\tau(m,n)} \ge c\,s(n)^{\frac12+\gamma}\Bigr\}\ge\varepsilon\Bigr\} \\
&\qquad\le \varepsilon^{-1}s(n)^{\alpha}v(n)\,P\Bigl\{\max_{1\le k\le s(n)}\widetilde X_k^{\,0,0} \ge c\,s(n)^{\frac12+\gamma}\Bigr\}.
\end{aligned}
\]
Under the annealed measure $P$, $\widetilde X_k^{\,0,0}$ is an ordinary homogeneous mean zero random walk with bounded steps. It has a finite logarithmic moment generating function $\phi(\lambda)=\log E(\exp\{\lambda\widetilde X_1^{\,0,0}\})$ that satisfies $\phi(\lambda)=O(\lambda^2)$ for small $\lambda$. Apply Doob’s inequality to the martingale $M_k=\exp(\lambda\widetilde X_k^{\,0,0}-k\phi(\lambda))$, note that $\phi(\lambda)\ge 0$, and choose a constant $a_1$ such that $\phi(\lambda)\le a_1\lambda^2$ for small $\lambda$. This gives
\[
\begin{aligned}
P\Bigl\{\max_{1\le k\le s(n)}\widetilde X_k^{\,0,0} \ge c\,s(n)^{\frac12+\gamma}\Bigr\}
&\le P\Bigl\{\max_{1\le k\le s(n)} M_k \ge \exp\bigl(c\lambda s(n)^{\frac12+\gamma}-s(n)\phi(\lambda)\bigr)\Bigr\} \\
&\le \exp\bigl\{-c\lambda s(n)^{\frac12+\gamma}+a_1 s(n)\lambda^2\bigr\} = e^{a_1}\cdot\exp\{-c\,s(n)^{\gamma}\},
\end{aligned}
\]
where we took $\lambda=s(n)^{-1/2}$. The conclusion of the lemma now follows from the hypothesis and Borel-Cantelli.

5. Proofs for Backward Walks in a Random Environment

Here are two further notational conventions used in the proofs. The environment configuration at a fixed time level is denoted by $\bar\omega_n=\{\omega_{x,n} : x\in\mathbb Z\}$. Translations on $\Omega$ are defined by $(T_{x,n}\omega)_{y,k}=\omega_{x+y,n+k}$.

5.1. Proof of Theorem 3.2. This proof proceeds in two stages. First in Lemma 5.1 convergence is proved for finite-dimensional distributions at a fixed $t$-level. In the second stage the convergence is extended to multiple $t$-levels via the natural Markovian property that we express in terms of $y_n$ next. Abbreviate $X_k^{n,t,r}=X_k^{\lfloor ntb\rfloor+\lfloor r\sqrt n\rfloor,\,\lfloor nt\rfloor}$. Then for $0\le s<t$,
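\[
\begin{aligned}
y_n(t,r) &= n^{-1/4}\bigl\{E^\omega(X^{n,t,r}_{\lfloor nt\rfloor})-r\sqrt n\bigr\} \\
&= \sum_{z\in\mathbb Z}P^\omega\bigl\{X^{n,t,r}_{\lfloor nt\rfloor-\lfloor ns\rfloor}=\lfloor nsb\rfloor+z\bigr\}\,n^{-1/4}\bigl\{E^\omega(X^{\lfloor nsb\rfloor+z,\lfloor ns\rfloor}_{\lfloor ns\rfloor})-z\bigr\} \\
&\qquad+\sum_{z\in\mathbb Z}P^\omega\bigl\{X^{n,t,r}_{\lfloor nt\rfloor-\lfloor ns\rfloor}=\lfloor nsb\rfloor+z\bigr\}\,n^{-1/4}\bigl(z-r\sqrt n\bigr) \\
&= \sum_{z\in\mathbb Z}P^\omega\bigl\{X^{n,t,r}_{\lfloor nt\rfloor-\lfloor ns\rfloor}=\lfloor nsb\rfloor+z\bigr\}\,n^{-1/4}\bigl\{E^\omega(X^{\lfloor nsb\rfloor+z,\lfloor ns\rfloor}_{\lfloor ns\rfloor})-z\bigr\} \\
&\qquad+n^{-1/4}\bigl\{E^\omega(X^{n,t,r}_{\lfloor nt\rfloor-\lfloor ns\rfloor})-\lfloor nsb\rfloor-r\sqrt n\bigr\} \\
&= \sum_{z\in\mathbb Z}P^\omega\bigl\{X^{n,t,r}_{\lfloor nt\rfloor-\lfloor ns\rfloor}=\lfloor nsb\rfloor+z\bigr\}\,n^{-1/4}\bigl\{E^\omega(X^{\lfloor nsb\rfloor+z,\lfloor ns\rfloor}_{\lfloor ns\rfloor})-z\bigr\} &&(5.1)\\
&\qquad+y_n(u_n,r)\circ T_{\lfloor ntb\rfloor-\lfloor nbu_n\rfloor,\,\lfloor nt\rfloor-\lfloor nu_n\rfloor}+n^{-1/4}\bigl(\lfloor ntb\rfloor-\lfloor nsb\rfloor-\lfloor nbu_n\rfloor\bigr), &&(5.2)
\end{aligned}
\]
where we defined $u_n=n^{-1}(\lfloor nt\rfloor-\lfloor ns\rfloor)$ so that $nu_n=\lfloor nt\rfloor-\lfloor ns\rfloor$. $T_{x,m}$ denotes the translation of the random environment that makes $(x,m)$ the new space-time origin, in other words $(T_{x,m}\omega)_{y,n}=\omega_{x+y,m+n}$.

The key to making use of the decomposition of $y_n(t,r)$ given on lines (5.1) and (5.2) is that the quenched expectations
\[ E^\omega\bigl(X^{\lfloor nsb\rfloor+z,\lfloor ns\rfloor}_{\lfloor ns\rfloor}\bigr) \quad\text{and}\quad y_n(u_n,r)\circ T_{\lfloor ntb\rfloor-\lfloor nbu_n\rfloor,\,\lfloor nt\rfloor-\lfloor nu_n\rfloor} \]
are independent because they are functions of environments $\bar\omega_m$ on disjoint sets of levels $m$, while the coefficients $P^\omega\{X^{n,t,r}_{\lfloor nt\rfloor-\lfloor ns\rfloor}=\lfloor nsb\rfloor+z\}$ on line (5.1) converge (in probability) to Gaussian probabilities by the quenched CLT as $n\to\infty$. In the limit this decomposition becomes (3.8).

Because of the little technicality of matching $\lfloor nt\rfloor-\lfloor ns\rfloor$ with $n(t-s)$ we state the next lemma for a sequence $t_n\to t$ instead of a fixed $t$.

Lemma 5.1. Fix $t>0$, and finitely many reals $r_1<r_2<\dots<r_N$. Let $t_n$ be a sequence of positive reals such that $t_n\to t$. Then as $n\to\infty$ the $\mathbb R^N$-valued vector $(y_n(t_n,r_1),\dots,y_n(t_n,r_N))$ converges weakly to a mean zero Gaussian vector with covariance matrix $\{\Gamma_q((t,r_i),(t,r_j)) : 1\le i,j\le N\}$ with $\Gamma_q$ as defined in (2.15).

The proof of Lemma 5.1 is technical (martingale CLT and random walk estimates), so we postpone it and proceed with the main development.

Proof of Theorem 3.2. The argument is inductive on the number $M$ of time points in the finite-dimensional distribution. The induction assumption is that
\[ [\,y_n(t_i,r_j) : 1\le i\le M,\ 1\le j\le N\,] \to [\,y(t_i,r_j) : 1\le i\le M,\ 1\le j\le N\,] \ \text{ weakly on } \mathbb R^{MN} \tag{5.3} \]
for any $M$ time points $0\le t_1<t_2<\dots<t_M$ and for any reals $r_1,\dots,r_N$ for any finite $N$.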
The case $M=1$ comes from Lemma 5.1. To handle the case $M+1$, let $0\le t_1<t_2<\dots<t_{M+1}$, and fix an arbitrary $(M+1)N$-vector $[\theta_{i,j}]$. By the Cramér-Wold device, it suffices to show the weak convergence of the linear combination
\[ \sum_{\substack{1\le i\le M+1\\ 1\le j\le N}}\theta_{i,j}\,y_n(t_i,r_j) = \sum_{\substack{1\le i\le M\\ 1\le j\le N}}\theta_{i,j}\,y_n(t_i,r_j) + \sum_{1\le j\le N}\theta_{M+1,j}\,y_n(t_{M+1},r_j), \tag{5.4} \]
where we separated out the $(M+1)$-term to be manipulated. The argument will use (5.1)–(5.2) to replace the values at $t_{M+1}$ with values at $t_M$ plus terms independent of the rest. For Borel sets $B\subseteq\mathbb R$ define the probability measure
\[ p^\omega_{n,j}(B) = P^\omega\Bigl\{X^{\lfloor nt_{M+1}b\rfloor+\lfloor r_j\sqrt n\rfloor,\,\lfloor nt_{M+1}\rfloor}_{\lfloor nt_{M+1}\rfloor-\lfloor nt_M\rfloor}-\lfloor nt_Mb\rfloor\in B\Bigr\}. \]
Apply the decomposition (5.1)–(5.2), with $s_n=n^{-1}(\lfloor nt_{M+1}\rfloor-\lfloor nt_M\rfloor)$ and
\[ \tilde y_n(s_n,r_j) = y_n(s_n,r_j)\circ T_{\lfloor nt_{M+1}b\rfloor-\lfloor ns_nb\rfloor,\,\lfloor nt_{M+1}\rfloor-\lfloor ns_n\rfloor} \]
to get
\[ y_n(t_{M+1},r_j) = \sum_{z\in\mathbb Z}p^\omega_{n,j}(z)\,n^{-1/4}\bigl\{E^\omega(X^{\lfloor nt_Mb\rfloor+z,\lfloor nt_M\rfloor}_{\lfloor nt_M\rfloor})-z\bigr\} + \tilde y_n(s_n,r_j) + O(n^{-1/4}). \tag{5.5} \]
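The $O(n^{-1/4})$ term above is $n^{-1/4}(\lfloor nt_{M+1}b\rfloor-\lfloor nt_Mb\rfloor-\lfloor ns_nb\rfloor)$, a deterministic quantity. Next we reorganize the sum in (5.5) to take advantage of Lemma 5.1. Given $a>0$, define a partition of $[-a,a]$ by
\[ -a=u_0<u_1<\dots<u_L=a \]
with mesh $\Delta=\max_\ell\{u_{\ell+1}-u_\ell\}$. For integers $z$ such that $-a\sqrt n<z\le a\sqrt n$, let $u(z)$ denote the value $u_\ell$ such that $u_\ell\sqrt n<z\le u_{\ell+1}\sqrt n$. For $1\le j\le N$ define an error term by
\[
\begin{aligned}
R_{n,j}(a) &= n^{-1/4}\sum_{-a\sqrt n<z\le a\sqrt n}p^\omega_{n,j}(z)\Bigl(\bigl\{E^\omega(X^{\lfloor nt_Mb\rfloor+z,\lfloor nt_M\rfloor}_{\lfloor nt_M\rfloor})-z\bigr\} \\
&\qquad\qquad-\bigl\{E^\omega(X^{\lfloor nt_Mb\rfloor+\lfloor u(z)\sqrt n\rfloor,\lfloor nt_M\rfloor}_{\lfloor nt_M\rfloor})-u(z)\sqrt n\bigr\}\Bigr) &&(5.6)\\
&\quad+n^{-1/4}\sum_{z\le -a\sqrt n,\ z>a\sqrt n}p^\omega_{n,j}(z)\bigl\{E^\omega(X^{\lfloor nt_Mb\rfloor+z,\lfloor nt_M\rfloor}_{\lfloor nt_M\rfloor})-z\bigr\}. &&(5.7)
\end{aligned}
\]
With this we can rewrite (5.5) as
\[ y_n(t_{M+1},r_j) = \sum_{\ell=0}^{L-1}p^\omega_{n,j}(u_\ell n^{1/2},u_{\ell+1}n^{1/2}]\;y_n(t_M,u_\ell) + \tilde y_n(s_n,r_j) + R_{n,j}(a) + O(n^{-1/4}). \tag{5.8} \]
Let $\gamma$ denote a normal distribution on $\mathbb R$ with mean zero and variance $\sigma_a^2(t_{M+1}-t_M)$. According to the quenched CLT Theorem 3.1,
\[ p^\omega_{n,j}(u_\ell n^{1/2},u_{\ell+1}n^{1/2}] \to \gamma(u_\ell-r_j,u_{\ell+1}-r_j] \ \text{ in P-probability as } n\to\infty. \tag{5.9} \]
In view of (5.4) and (5.8), we can write
\[ \sum_{\substack{1\le i\le M+1\\ 1\le j\le N}}\theta_{i,j}\,y_n(t_i,r_j) = \sum_{\substack{1\le i\le M\\ 1\le k\le K}}\rho^\omega_{n,i,k}\,y_n(t_i,v_k) + \sum_{1\le j\le N}\theta_{M+1,j}\,\tilde y_n(s_n,r_j) + R_n(a) + O(n^{-1/4}). \tag{5.10} \]
Above the spatial points $\{v_k\}$ are a relabeling of $\{r_j,u_\ell\}$, the $\omega$-dependent coefficients $\rho^\omega_{n,i,k}$ contain constants $\theta_{i,j}$, probabilities $p^\omega_{n,j}(u_\ell n^{1/2},u_{\ell+1}n^{1/2}]$, and zeroes. The constant limits $\rho^\omega_{n,i,k}\to\rho_{i,k}$ exist in P-probability as $n\to\infty$. The error in (5.10) is $R_n(a)=\sum_j\theta_{M+1,j}R_{n,j}(a)$.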
The variables $\tilde y_n(s_n,r_j)$ are functions of the environments $\{\bar\omega_m : [nt_{M+1}]\ge m>[nt_M]\}$ and hence independent of $y_n(t_i,v_k)$ for $1\le i\le M$ which are functions of $\{\bar\omega_m : [nt_M]\ge m>0\}$.

On a probability space on which the limit process $\{y(t,r)\}$ has been defined, let $\tilde y(t_{M+1}-t_M,\cdot)$ be a random function distributed like $y(t_{M+1}-t_M,\cdot)$ but independent of $\{y(t,r)\}$. Let $f$ be a bounded Lipschitz continuous function on $\mathbb R$, with Lipschitz constant $C_f$. The goal is to show that the top line (5.11) below vanishes as $n\to\infty$. Add and subtract terms to decompose (5.11) into three differences:
\[
\begin{aligned}
&Ef\Bigl(\sum_{\substack{1\le i\le M+1\\ 1\le j\le N}}\theta_{i,j}\,y_n(t_i,r_j)\Bigr) - Ef\Bigl(\sum_{\substack{1\le i\le M+1\\ 1\le j\le N}}\theta_{i,j}\,y(t_i,r_j)\Bigr) &&(5.11)\\
&= \Bigl\{Ef\Bigl(\sum_{\substack{1\le i\le M+1\\ 1\le j\le N}}\theta_{i,j}\,y_n(t_i,r_j)\Bigr) - Ef\Bigl(\sum_{\substack{1\le i\le M\\ 1\le k\le K}}\rho^\omega_{n,i,k}\,y_n(t_i,v_k)+\sum_{1\le j\le N}\theta_{M+1,j}\,\tilde y_n(s_n,r_j)\Bigr)\Bigr\} &&(5.12)\\
&\quad+\Bigl\{Ef\Bigl(\sum_{\substack{1\le i\le M\\ 1\le k\le K}}\rho^\omega_{n,i,k}\,y_n(t_i,v_k)+\sum_{1\le j\le N}\theta_{M+1,j}\,\tilde y_n(s_n,r_j)\Bigr) \\
&\qquad\quad- Ef\Bigl(\sum_{\substack{1\le i\le M\\ 1\le k\le K}}\rho_{i,k}\,y(t_i,v_k)+\sum_{1\le j\le N}\theta_{M+1,j}\,\tilde y(t_{M+1}-t_M,r_j)\Bigr)\Bigr\} &&(5.13)\\
&\quad+\Bigl\{Ef\Bigl(\sum_{\substack{1\le i\le M\\ 1\le k\le K}}\rho_{i,k}\,y(t_i,v_k)+\sum_{1\le j\le N}\theta_{M+1,j}\,\tilde y(t_{M+1}-t_M,r_j)\Bigr) - Ef\Bigl(\sum_{\substack{1\le i\le M+1\\ 1\le j\le N}}\theta_{i,j}\,y(t_i,r_j)\Bigr)\Bigr\}. &&(5.14)
\end{aligned}
\]
The remainder of the proof consists in treating the three differences of expectations (5.12)–(5.14). By the Lipschitz assumption and (5.10), the difference (5.12) is bounded by $C_fE|R_n(a)|+O(n^{-1/4})$. We need to bound $R_n(a)$. Recall that $\gamma$ is an $N(0,\sigma_a^2(t_{M+1}-t_M))$-distribution.
Lemma 5.2. There exist constants $C_1$ and $a_0$ such that, if $a>a_0$, then for any partition $\{u_\ell\}$ of $[-a,a]$ with mesh $\Delta$, and for any $1\le j\le N$,
\[ \limsup_{n\to\infty}E|R_{n,j}(a)| \le C_1\bigl(\sqrt{\Delta}+\gamma(-\infty,-a/2)+\gamma(a/2,\infty)\bigr). \]

We postpone the proof of Lemma 5.2. From this lemma, given $\varepsilon>0$, we can choose first $a$ large enough and then $\Delta$ small enough so that
\[ \limsup_{n\to\infty}\,[\text{ difference (5.12) }] \le \varepsilon/2. \]
Difference (5.13) vanishes as $n\to\infty$, due to the induction assumption (5.3), the limits $\rho^\omega_{n,i,k}\to\rho_{i,k}$ in probability, and the next lemma. Notice that we are not trying to invoke the induction assumption (5.3) for $M+1$ time points $\{t_1,\dots,t_M,s_n\}$. Instead, the induction assumption is applied to the first sum inside $f$ in (5.13). To the second sum apply Lemma 5.1, noting that $s_n\to t_{M+1}-t_M$. The two sums are independent of each other, as already observed after (5.10), so they converge jointly. This point is made precise in the next lemma.

Lemma 5.3. Fix a positive integer $k$. For each $n$, let $V_n=(V_n^1,\dots,V_n^k)$, $X_n=(X_n^1,\dots,X_n^k)$, and $\zeta_n$ be random variables on a common probability space. Assume that $X_n$ and $\zeta_n$ are independent of each other for each $n$. Let $v$ be a constant $k$-vector, $X$ another random $k$-vector, and $\zeta$ a random variable. Assume the weak limits $V_n\to v$, $X_n\to X$, and $\zeta_n\to\zeta$ hold marginally. Then we have the weak limit
\[ V_n\cdot X_n+\zeta_n \to v\cdot X+\zeta, \]
where the $X$ and $\zeta$ on the right are independent.

To prove this lemma, write
\[ V_n\cdot X_n+\zeta_n = (V_n-v)\cdot X_n + v\cdot X_n + \zeta_n \]
and note that since $V_n\to v$ in probability, tightness of $\{X_n\}$ implies that $(V_n-v)\cdot X_n\to 0$ in probability. As mentioned, it applies to show that
\[ \lim_{n\to\infty}\,[\text{ difference (5.13) }] = 0. \]
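It remains to examine the difference (5.14). From a consideration of how the coefficients $\rho^\omega_{n,i,k}$ in (5.10) arise and from the limit (5.9),
\[
\begin{aligned}
&\sum_{\substack{1\le i\le M\\ 1\le k\le K}}\rho_{i,k}\,y(t_i,v_k) + \sum_{1\le j\le N}\theta_{M+1,j}\,\tilde y(t_{M+1}-t_M,r_j) = \sum_{\substack{1\le i\le M\\ 1\le j\le N}}\theta_{i,j}\,y(t_i,r_j) \\
&\qquad+\sum_{1\le j\le N}\theta_{M+1,j}\Bigl(\sum_{\ell=0}^{L-1}\gamma(u_\ell-r_j,u_{\ell+1}-r_j]\;y(t_M,u_\ell) + \tilde y(t_{M+1}-t_M,r_j)\Bigr).
\end{aligned}
\]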
The first sum after the equality sign matches all but the $(i=M+1)$-terms in the last sum in (5.14). By virtue of the Markov property in (3.8) we can represent the variables $y(t_{M+1},r_j)$ in the last sum in (5.14) by
\[ y(t_{M+1},r_j) = \int_{\mathbb R}\varphi_{\sigma_a^2(t_{M+1}-t_M)}(u-r_j)\,y(t_M,u)\,du + \tilde y(t_{M+1}-t_M,r_j). \]
Then by the Lipschitz property of $f$ it suffices to show that, for each $1\le j\le N$, the expectation
\[ E\Bigl|\int_{\mathbb R}\varphi_{\sigma_a^2(t_{M+1}-t_M)}(u-r_j)\,y(t_M,u)\,du - \sum_{\ell=0}^{L-1}\gamma(u_\ell-r_j,u_{\ell+1}-r_j]\;y(t_M,u_\ell)\Bigr| \]
can be made small by choice of $a>0$ and the partition $\{u_\ell\}$. This follows from the moment bounds (3.9) on the increments of the $y$-process and we omit the details. We have shown that if $a$ is large enough and then $\Delta$ small enough,
\[ \limsup_{n\to\infty}\,[\text{ difference (5.14) }] \le \varepsilon/2. \]
To summarize, given bounded Lipschitz $f$ and $\varepsilon>0$, by choosing $a>0$ large enough and the partition $\{u_\ell\}$ of $[-a,a]$ fine enough,
\[ \limsup_{n\to\infty}\Bigl|Ef\Bigl(\sum_{\substack{1\le i\le M+1\\ 1\le j\le N}}\theta_{i,j}\,y_n(t_i,r_j)\Bigr) - Ef\Bigl(\sum_{\substack{1\le i\le M+1\\ 1\le j\le N}}\theta_{i,j}\,y(t_i,r_j)\Bigr)\Bigr| \le \varepsilon. \]
This completes the proof of the induction step and thereby the proof of Theorem 3.2.

It remains to verify the lemmas that were used along the way.

Proof of Lemma 5.2. We begin with a calculation. Here it is convenient to use the space-time walk $\bar X_k^{x,m}=(X_k^{x,m},m-k)$. First observe that
\[
\begin{aligned}
E^\omega(X_n^{x,m})-x-nV &= \sum_{k=0}^{n-1}E^\omega\bigl[X_{k+1}^{x,m}-X_k^{x,m}-V\bigr] \\
&= \sum_{k=0}^{n-1}E^\omega\Bigl[E^{T_{\bar X_k^{x,m}}\omega}\bigl(X_1^{0,0}-V\bigr)\Bigr] \\
&= \sum_{k=0}^{n-1}E^\omega g\bigl(T_{\bar X_k^{x,m}}\omega\bigr).
\end{aligned}
\tag{5.15}
\]
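From this, for $x,y\in\mathbb Z$,
\[
\begin{aligned}
&E\Bigl[\bigl(\{E^\omega(X_n^{x,n})-x\}-\{E^\omega(X_n^{y,n})-y\}\bigr)^2\Bigr] \\
&= E\sum_{k=0}^{n-1}\Bigl(E^\omega\bigl\{g(T_{\bar X_k^{x,n}}\omega)-g(T_{\bar X_k^{y,n}}\omega)\bigr\}\Bigr)^2 \\
&\quad+2\sum_{0\le k<\ell<n}E\Bigl[E^\omega\bigl\{g(T_{\bar X_k^{x,n}}\omega)-g(T_{\bar X_k^{y,n}}\omega)\bigr\}\,E^\omega\bigl\{g(T_{\bar X_\ell^{x,n}}\omega)-g(T_{\bar X_\ell^{y,n}}\omega)\bigr\}\Bigr] \\
&\qquad\text{(the cross terms for $k<\ell$ vanish)} \\
&= E\sum_{k=0}^{n-1}\Bigl(\sum_{z,w\in\mathbb Z^2}P^\omega\{\bar X_k^{x,n}=z\}\,P^\omega\{\bar X_k^{y,n}=w\}\bigl(g(T_z\omega)-g(T_w\omega)\bigr)\Bigr)^2 \\
&= E\sum_{k=0}^{n-1}\sum_{z,w,u,v\in\mathbb Z^2}P^\omega\{\bar X_k^{x,n}=z\}\,P^\omega\{\bar X_k^{y,n}=w\}\,P^\omega\{\bar X_k^{x,n}=u\}\,P^\omega\{\bar X_k^{y,n}=v\} \\
&\qquad\times\bigl(g(T_z\omega)g(T_u\omega)-g(T_w\omega)g(T_u\omega)-g(T_z\omega)g(T_v\omega)+g(T_w\omega)g(T_v\omega)\bigr) \\
&\qquad\text{(by independence $E\,g(T_z\omega)g(T_u\omega)=\sigma_D^2\,1\{z=u\}$)} \\
&= \sigma_D^2\sum_{k=0}^{n-1}\Bigl(P\{X_k^{x,n}=\tilde X_k^{x,n}\}-2P\{X_k^{x,n}=\tilde X_k^{y,n}\}+P\{X_k^{y,n}=\tilde X_k^{y,n}\}\Bigr) \\
&= 2\sigma_D^2\sum_{k=0}^{n-1}\bigl(P_0\{Y_k=0\}-P_{x-y}\{Y_k=0\}\bigr) \\
&= 2\sigma_D^2\bigl(G_{n-1}(0,0)-G_{n-1}(x-y,0)\bigr).
\end{aligned}
\]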
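On the last three lines above, as elsewhere in the paper, we used these conventions: $X_k$ and $\tilde X_k$ denote walks that are independent in a common environment $\omega$, $Y_k=X_k-\tilde X_k$ is the difference walk, and $G_n(x,y)$ the Green function of $Y_k$. By Lemma 4.2 we get the inequality
\[ E\Bigl[\bigl(\{E^\omega(X_n^{x,n})-x\}-\{E^\omega(X_n^{y,n})-y\}\bigr)^2\Bigr] \le C\,|x-y| \tag{5.16} \]
valid for all $n$ and all $x,y\in\mathbb Z$.

Turning to $R_{n,j}(a)$ defined in (5.6)–(5.7), and utilizing independence,
\[
\begin{aligned}
E|R_{n,j}(a)| &\le n^{-1/4}\sum_{-a\sqrt n<z\le a\sqrt n}E[p^\omega_{n,j}(z)]\Bigl(E\Bigl[\bigl(\{E^\omega(X^{\lfloor nt_Mb\rfloor+z,\lfloor nt_M\rfloor}_{\lfloor nt_M\rfloor})-z\} \\
&\qquad\qquad-\{E^\omega(X^{\lfloor nt_Mb\rfloor+\lfloor u(z)\sqrt n\rfloor,\lfloor nt_M\rfloor}_{\lfloor nt_M\rfloor})-u(z)\sqrt n\}\bigr)^2\Bigr]\Bigr)^{1/2} \\
&\quad+n^{-1/4}\sum_{z\le -a\sqrt n,\ z>a\sqrt n}E[p^\omega_{n,j}(z)]\Bigl(E\Bigl[\bigl(E^\omega(X^{\lfloor nt_Mb\rfloor+z,\lfloor nt_M\rfloor}_{\lfloor nt_M\rfloor})-E(X^{\lfloor nt_Mb\rfloor+z,\lfloor nt_M\rfloor}_{\lfloor nt_M\rfloor})\bigr)^2\Bigr]\Bigr)^{1/2} \\
&\quad+n^{-1/4}\sum_{z\le -a\sqrt n,\ z>a\sqrt n}E[p^\omega_{n,j}(z)]\cdot\bigl|E(X^{\lfloor nt_Mb\rfloor+z,\lfloor nt_M\rfloor}_{\lfloor nt_M\rfloor})-z\bigr| \\
&\le Cn^{-1/4}\max_{-a\sqrt n<z\le a\sqrt n}\bigl|z-u(z)\sqrt n\bigr|^{1/2} + C\,P\Bigl\{\bigl|X^{\lfloor nt_{M+1}b\rfloor+\lfloor r_j\sqrt n\rfloor,\,\lfloor nt_{M+1}\rfloor}_{\lfloor nt_{M+1}\rfloor-\lfloor nt_M\rfloor}-\lfloor nt_Mb\rfloor\bigr|\ge a\sqrt n\Bigr\} + Cn^{-1/4}.
\end{aligned}
\]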
For the last inequality above we used (5.16), bound (3.4) on the variance of the quenched mean, and then
\[ E\bigl(X^{\lfloor nt_Mb\rfloor+z,\lfloor nt_M\rfloor}_{\lfloor nt_M\rfloor}\bigr)-z = \lfloor nt_Mb\rfloor+\lfloor nt_M\rfloor V = \lfloor nt_Mb\rfloor-\lfloor nt_M\rfloor b = O(1). \]
By the choice of $u(z)$, and by the central limit theorem if $a>2|r_j|$, the limit of the bound on $E|R_{n,j}(a)|$ as $n\to\infty$ is $C(\sqrt{\Delta}+\gamma(-\infty,-a/2)+\gamma(a/2,\infty))$. This completes the proof of Lemma 5.2.

Proof of Lemma 5.1. We drop the subscript from $t_n$ and write simply $t$. For the main part of the proof the only relevant property is that $nt_n=O(n)$. We point this out after the preliminaries. We show convergence of the linear combination $\sum_{i=1}^N\theta_i\,y_n(t,r_i)$ for an arbitrary but fixed $N$-vector $\theta=(\theta_1,\dots,\theta_N)$. This in turn will come from a martingale central limit theorem. For this proof abbreviate $X_k^i=X_k^{\lfloor ntb\rfloor+\lfloor r_i\sqrt n\rfloor,\,\lfloor nt\rfloor}$. For $1\le k\le\lfloor nt\rfloor$ define
\[ z_{n,k} = n^{-1/4}\sum_{i=1}^{N}\theta_i\,E^\omega g\bigl(T_{\bar X^i_{k-1}}\omega\bigr) \]
so that by (5.15)
\[ \sum_{k=1}^{\lfloor nt\rfloor}z_{n,k} = \sum_{i=1}^{N}\theta_i\,y_n(t,r_i)+O(n^{-1/4}). \]
The error is deterministic and comes from the discrepancy (3.7) in the centering. It vanishes in the limit and so can be ignored.

A probability of the type $P^\omega(X^i_{k-1}=y)$ is a function of the environments $\{\bar\omega_j : \lfloor nt\rfloor-k+2\le j\le\lfloor nt\rfloor\}$ while $g(T_{y,s}\omega)$ is a function of $\bar\omega_s$. For a fixed $n$, $\{z_{n,k} : 1\le k\le\lfloor nt\rfloor\}$ are martingale differences with respect to the filtration
\[ U_{n,k} = \sigma\bigl\{\bar\omega_j : \lfloor nt\rfloor-k+1\le j\le\lfloor nt\rfloor\bigr\} \qquad (1\le k\le\lfloor nt\rfloor) \]
with $U_{n,0}$ equal to the trivial $\sigma$-algebra. The goal is to show that $\sum_{k=1}^{\lfloor nt\rfloor}z_{n,k}$ converges to a centered Gaussian with variance $\sum_{1\le i,j\le N}\theta_i\theta_j\,\Gamma_q((t,r_i),(t,r_j))$. By the Lindeberg-Feller Theorem for martingales, it suffices to check that
\[ \sum_{k=1}^{\lfloor nt\rfloor}E\bigl[z_{n,k}^2\,\big|\,U_{n,k-1}\bigr] \longrightarrow \sum_{1\le i,j\le N}\theta_i\theta_j\,\Gamma_q((t,r_i),(t,r_j)) \tag{5.17} \]
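and
\[ \sum_{k=1}^{\lfloor nt\rfloor}E\bigl[z_{n,k}^2\,1\{z_{n,k}\ge\varepsilon\}\,\big|\,U_{n,k-1}\bigr] \longrightarrow 0 \tag{5.18} \]
in probability, as $n\to\infty$, for every $\varepsilon>0$. Condition (5.18) is trivially satisfied because $|z_{n,k}|\le Cn^{-1/4}$ by the boundedness of $g$.

The main part of the proof consists of checking (5.17). This argument is a generalization of the proof of [14, Theorem 4.1] where it was done for a nearest-neighbor walk. We follow their reasoning for the first part of the proof. Since $\sigma_D^2=E[g^2]$ and since conditioning $z_{n,k}^2$ on $U_{n,k-1}$ entails integrating out the environments $\bar\omega_{\lfloor nt\rfloor-k+1}$, one can derive
\[ \sum_{k=1}^{\lfloor nt\rfloor}E[z_{n,k}^2\,|\,U_{n,k-1}] = \sigma_D^2\sum_{1\le i,j\le N}\theta_i\theta_j\,n^{-1/2}\sum_{k=0}^{\lfloor nt\rfloor-1}P^\omega(X_k^i=\tilde X_k^j), \]
where $X_k^i$ and $\tilde X_k^j$ are two walks independent under the common environment $\omega$, started at $(\lfloor ntb\rfloor+\lfloor r_i\sqrt n\rfloor,\lfloor nt\rfloor)$ and $(\lfloor ntb\rfloor+\lfloor r_j\sqrt n\rfloor,\lfloor nt\rfloor)$. By (4.5),
\[ \sigma_D^2\,n^{-1/2}\sum_{k=0}^{\lfloor nt\rfloor-1}P(X_k^i=\tilde X_k^j) \longrightarrow \Gamma_q((t,r_i),(t,r_j)). \tag{5.19} \]
This limit holds if instead of a fixed $t$ on the left we have a sequence $t_n\to t$. Consequently we will have proved (5.17) if we show, for each fixed pair $(i,j)$, that
\[ n^{-1/2}\sum_{k=0}^{\lfloor nt\rfloor-1}\bigl(P^\omega\{X_k^i=\tilde X_k^j\}-P\{X_k^i=\tilde X_k^j\}\bigr) \longrightarrow 0 \tag{5.20} \]
in P-probability. For the above statement the behavior of $t$ is immaterial as long as it stays bounded as $n\to\infty$. Rewrite the expression in (5.20) as
\[
\begin{aligned}
&n^{-1/2}\sum_{k=0}^{\lfloor nt\rfloor-1}\bigl(P\{X_k^i=\tilde X_k^j\,|\,U_{n,k}\}-P\{X_k^i=\tilde X_k^j\,|\,U_{n,0}\}\bigr) \\
&= n^{-1/2}\sum_{k=0}^{\lfloor nt\rfloor-1}\sum_{\ell=0}^{k-1}\bigl(P\{X_k^i=\tilde X_k^j\,|\,U_{n,\ell+1}\}-P\{X_k^i=\tilde X_k^j\,|\,U_{n,\ell}\}\bigr) \\
&= n^{-1/2}\sum_{\ell=0}^{\lfloor nt\rfloor-1}\sum_{k=\ell+1}^{\lfloor nt\rfloor-1}\bigl(P\{X_k^i=\tilde X_k^j\,|\,U_{n,\ell+1}\}-P\{X_k^i=\tilde X_k^j\,|\,U_{n,\ell}\}\bigr) \\
&\equiv n^{-1/2}\sum_{\ell=0}^{\lfloor nt\rfloor-1}R_\ell,
\end{aligned}
\]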
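where the last line defines $R_\ell$. Check that $E\,R_\ell R_m=0$ for $\ell\ne m$. Thus it is convenient to verify our goal (5.20) by checking $L^2$ convergence, in other words by showing
\[ n^{-1}\sum_{\ell=0}^{\lfloor nt\rfloor-1}E[R_\ell^2] = n^{-1}\sum_{\ell=0}^{\lfloor nt\rfloor-1}E\Bigl[\Bigl\{\sum_{k=\ell+1}^{\lfloor nt\rfloor-1}\bigl(P\{X_k^i=\tilde X_k^j\,|\,U_{n,\ell+1}\}-P\{X_k^i=\tilde X_k^j\,|\,U_{n,\ell}\}\bigr)\Bigr\}^2\Bigr] \longrightarrow 0. \tag{5.21} \]
For the moment we work on a single term inside the braces in (5.21), for a fixed pair $k>\ell$. Write $Y_m=X_m^i-\tilde X_m^j$ for the difference walk. By the Markov property of the walks [recall (2.27)] we can write
\[ P\{X_k^i=\tilde X_k^j\,|\,U_{n,\ell+1}\} = \sum_{x,\tilde x,y,\tilde y\in\mathbb Z}P^\omega\{X_\ell^i=x,\ \tilde X_\ell^j=\tilde x\}\,u^\omega_{\lfloor nt\rfloor-\ell}(x,y-x)\,u^\omega_{\lfloor nt\rfloor-\ell}(\tilde x,\tilde y-\tilde x)\,P(Y_k=0\,|\,Y_{\ell+1}=y-\tilde y) \]
and similarly for the other conditional probability
\[ P\{X_k^i=\tilde X_k^j\,|\,U_{n,\ell}\} = \sum_{x,\tilde x,y,\tilde y\in\mathbb Z}P^\omega\{X_\ell^i=x,\ \tilde X_\ell^j=\tilde x\}\,E\bigl[u^\omega_{\lfloor nt\rfloor-\ell}(x,y-x)\,u^\omega_{\lfloor nt\rfloor-\ell}(\tilde x,\tilde y-\tilde x)\bigr]\,P(Y_k=0\,|\,Y_{\ell+1}=y-\tilde y). \]
Introduce the transition probability $q(x,y)$ of the $Y$-walk. Combine the above decompositions to express the $(k,\ell)$ term inside the braces in (5.21) as
\[
\begin{aligned}
&P\{X_k^i=\tilde X_k^j\,|\,U_{n,\ell+1}\}-P\{X_k^i=\tilde X_k^j\,|\,U_{n,\ell}\} \\
&= \sum_{x,\tilde x,y,\tilde y\in\mathbb Z}P^\omega\{X_\ell^i=x,\ \tilde X_\ell^j=\tilde x\}\,q^{k-\ell-1}(y-\tilde y,0) \\
&\qquad\times\Bigl(u^\omega_{\lfloor nt\rfloor-\ell}(x,y-x)\,u^\omega_{\lfloor nt\rfloor-\ell}(\tilde x,\tilde y-\tilde x)-E\bigl[u^\omega_{\lfloor nt\rfloor-\ell}(x,y-x)\,u^\omega_{\lfloor nt\rfloor-\ell}(\tilde x,\tilde y-\tilde x)\bigr]\Bigr) \\
&= \sum_{x,\tilde x}P^\omega\{X_\ell^i=x,\ \tilde X_\ell^j=\tilde x\}\sum_{\substack{z,w:\ -M\le w\le M\\ -M\le w-z\le M}}q^{k-\ell-1}(x-\tilde x+z,0) \\
&\qquad\times\Bigl(u^\omega_{\lfloor nt\rfloor-\ell}(x,w)\,u^\omega_{\lfloor nt\rfloor-\ell}(\tilde x,w-z)-E\bigl[u^\omega_{\lfloor nt\rfloor-\ell}(x,w)\,u^\omega_{\lfloor nt\rfloor-\ell}(\tilde x,w-z)\bigr]\Bigr).
\end{aligned}
\tag{5.22}
\]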
The last sum above uses the finite range $M$ of the jump probabilities. Introduce the quantities
\[ \rho^\omega(x,x+m) = \sum_{y:\,y\le m}u^\omega_{\lfloor nt\rfloor-\ell}(x,y) = \sum_{y=-M}^{m}u^\omega_{\lfloor nt\rfloor-\ell}(x,y) \]
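and
\[ \zeta^\omega(x,\tilde x,z,w) = \rho^\omega(x,x+w)\,u^\omega_{\lfloor nt\rfloor-\ell}(\tilde x,w-z) - E\bigl[\rho^\omega(x,x+w)\,u^\omega_{\lfloor nt\rfloor-\ell}(\tilde x,w-z)\bigr]. \]
Fix $(x,\tilde x)$, consider the sum over $z$ and $w$ on line (5.22), and continue with a “summation by parts” step:
\[
\begin{aligned}
&\sum_{\substack{z,w:\ -M\le w\le M\\ -M\le w-z\le M}}q^{k-\ell-1}(x-\tilde x+z,0)\Bigl(u^\omega_{\lfloor nt\rfloor-\ell}(x,w)\,u^\omega_{\lfloor nt\rfloor-\ell}(\tilde x,w-z)-E\bigl[u^\omega_{\lfloor nt\rfloor-\ell}(x,w)\,u^\omega_{\lfloor nt\rfloor-\ell}(\tilde x,w-z)\bigr]\Bigr) \\
&= \sum_{\substack{z,w:\ -M\le w\le M\\ -M\le w-z\le M}}q^{k-\ell-1}(x-\tilde x+z,0)\bigl(\zeta^\omega(x,\tilde x,z,w)-\zeta^\omega(x,\tilde x,z-1,w-1)\bigr) \\
&= \sum_{\substack{z,w:\ -M\le w\le M\\ -M\le w-z\le M}}\bigl(q^{k-\ell-1}(x-\tilde x+z,0)-q^{k-\ell-1}(x-\tilde x+z+1,0)\bigr)\zeta^\omega(x,\tilde x,z,w) \\
&\quad+\sum_{z=0}^{2M}q^{k-\ell-1}(x-\tilde x+z+1,0)\,\zeta^\omega(x,\tilde x,z,M) \\
&\quad-\sum_{z=-2M-1}^{-1}q^{k-\ell-1}(x-\tilde x+z+1,0)\,\zeta^\omega(x,\tilde x,z,-M-1).
\end{aligned}
\]
By definition of the range $M$, the last sum above vanishes because $\zeta^\omega(x,\tilde x,z,-M-1)=0$. Take this into consideration, substitute the last form above into (5.22) and sum over $k=\ell+1,\dots,\lfloor nt\rfloor-1$. Define the quantity
\[ A_{\ell,n}(x) = \sum_{k=\ell+1}^{\lfloor nt\rfloor-1}\bigl(q^{k-\ell-1}(x,0)-q^{k-\ell-1}(x+1,0)\bigr). \tag{5.23} \]
Then the expression in braces in (5.21) is represented as
\[
\begin{aligned}
R_\ell &= \sum_{k=\ell+1}^{\lfloor nt\rfloor-1}\bigl(P\{X_k^i=\tilde X_k^j\,|\,U_{n,\ell+1}\}-P\{X_k^i=\tilde X_k^j\,|\,U_{n,\ell}\}\bigr) \\
&= \sum_{x,\tilde x}P^\omega\{X_\ell^i=x,\ \tilde X_\ell^j=\tilde x\}\sum_{\substack{z,w:\ -M\le w\le M\\ -M\le w-z\le M}}A_{\ell,n}(x-\tilde x+z)\,\zeta^\omega(x,\tilde x,z,w) &&(5.24)\\
&\quad+\sum_{x,\tilde x}P^\omega\{X_\ell^i=x,\ \tilde X_\ell^j=\tilde x\}\sum_{z=0}^{2M}\sum_{k=\ell+1}^{\lfloor nt\rfloor-1}q^{k-\ell-1}(x-\tilde x+z+1,0)\,\zeta^\omega(x,\tilde x,z,M) &&(5.25)\\
&\equiv R_{\ell,1}+R_{\ell,2},
\end{aligned}
\]
where $R_{\ell,1}$ and $R_{\ell,2}$ denote the sums on lines (5.24) and (5.25).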
nt−1 Recall from (5.21) that our goal was to show that n −1 =0 ER2 → 0 as n → ∞. We show this separately for R,1 and R,2 . As a function of ω, ζω (· · · ) is a function of ω¯ nt− and hence independent of the probabilities on line (5.24). Thus we get j j 2 E[R,1 ]= E P ω {X i = x, X = x}P ˜ ω {X i = x , X = x˜ } x,x,x ˜ ,x˜
×
A,n (x − x˜ + z)A,n (x − x˜ + z )
−M≤w ≤M
−M≤w≤M −M≤w−z≤M −M≤w −z ≤M ˜ z, w)ζω (x , x˜ , z , w ) . × E ζω (x, x,
(5.26)
Lemma 4.2 implies that A,n (x) is uniformly bounded over (, n, x). Random variable ζω (x, x, ˜ z, w) is mean zero and a function of the environments {ωx,nt− , ωx,nt− }. ˜ Consequently the last expectation on line (5.26) vanishes unless {x, x} ˜ ∩ {x , x˜ } = ∅. The sums over z, w, z , w contribute a constant because of their bounded range. Taking all these into consideration, we obtain the bound j j j 2 E[R,1 ] ≤ C P{X i = X i } + P{X i = X } + P{X = X } . (5.27) By (4.5) we get the bound n −1
nt−1 =0
2 E[R,1 ] ≤ Cn −1/2
(5.28)
which vanishes as n → ∞. For the remaining sum R,2 observe first that ˜ z, M) = u ωnt− (x, ˜ M − z) − Eu ωnt− (x, ˜ M − z). ζω (x, x,
(5.29)
Summed over 0 ≤ z ≤ 2M this vanishes, so we can start by rewriting as follows: j R,2 = P ω {X i = x, X = x} ˜ x,x˜
×
2M nt−1 q k−−1 (x − x˜ + z + 1, 0) − q k−−1 (x − x, ˜ 0) ζω (x, x, ˜ z, M) z=0 k=+1
=−
x,x˜
=−
x,x˜
P ω {X i = x, X = x} ˜ j
z 2M
A,n (x − x˜ + m, 0)ζω (x, x, ˜ z, M)
z=0 m=0
P ω {X i = x, X = x} ˜ j
2M
A,n (x − x˜ + m, 0)ρ¯ω (x, ˜ x˜ + M − m),
m=0
where we abbreviated on the last line ρ¯ω (x, ˜ x˜ + M − m) = ρω (x, ˜ x˜ + M − m) − Eρω (x, ˜ x˜ + M − m).
Square the last representation for R,2 , take E-expectation, and note that ˜ x˜ + M − m)ρ¯ω (x˜ , x˜ + M − m ) = 0 E ρ¯ω (x, unless x˜ = x˜ . Thus the reasoning applied to R,1 can be repeated, and we conclude that nt−1 2 → 0. also n −1 =0 ER,2 To summarize, we have verified (5.21), thereby (5.20) and condition (5.17) for the martingale CLT. This completes the proof of Lemma 5.1. 6. Proofs for Forward Walks in a Random Environment 6.1. Proof of Theorem 3.4. The proof of Theorem 3.4 is organized in the same way as the proof of Theorem 3.2 so we restrict ourselves to a few remarks. The Markov property reads now (0 ≤ s < t, r ∈ R): √
√ r n,0 an (t, r ) = an (s, r ) + P ω Z ns = r n + nsV + y
y∈Z
√
√ r n+nsV +y,ns E ω Z nt−ns − r n − y − ntV
+n −1/4 nsV − nsV . ×n
−1/4
This serves as the basis for the inductive proof along time levels, exactly as done in the argument following (5.3). Lemma 5.1 about the convergence at a fixed $t$-level applies to $a_n(t,\cdot)$ exactly as worded. This follows from noting that, up to a trivial difference from integer parts, the processes $a_n(t,\cdot)$ and $y_n(t,\cdot)$ are the same. Precisely, if $S$ denotes the $\mathbb{P}$-preserving transformation on $\Omega$ defined by $(S\omega)_{x,\tau} = \omega_{-ntb+x,\,nt-\tau}$, then
\[ E^{S\omega}\bigl(X_{nt}^{ntb+r\sqrt n,\,nt}\bigr) - r\sqrt n = E^{\omega}\bigl(Z_{nt}^{r\sqrt n,\,0}\bigr) - r\sqrt n + ntb. \]
The errors in the inductive argument are treated with the same arguments as used in Lemma 5.2 to treat $R_{n,j}(a)$.

6.2. Proof of Corollary 3.5. We start with a moment bound that will give tightness of the processes.

Lemma 6.1. There exists a constant $0 < C < \infty$ such that, for all $n \in \mathbb{N}$,
\[ \mathbb{E}\bigl[(E^\omega(Z_n^{0,0}) - nV)^6\bigr] \le Cn^{3/2}. \]

Proof. From
\[ \mathbb{E}\Bigl[\bigl(E^\omega g(T_{\bar Z_n^{x,0}}\omega) - E^\omega g(T_{\bar Z_n^{0,0}}\omega)\bigr)^2\Bigr] = 2\sigma_D^2\bigl(P[Y_n^0 = 0] - P[Y_n^x = 0]\bigr) \]
we get
\[ P[Y_n^x = 0] \le P[Y_n^0 = 0] \quad\text{for all } n \ge 0 \text{ and } x \in \mathbb{Z}. \tag{6.1} \]
Abbreviate $\bar Z_n = \bar Z_n^{0,0}$ for this proof. $E^\omega(Z_n) - nV$ is a mean-zero martingale with increments $E^\omega g(T_{\bar Z_k}\omega)$ relative to the filtration $\mathcal{H}_n = \sigma\{\bar\omega_k : 0 \le k < n\}$. By the Burkholder-Davis-Gundy inequality [7],
\[ \mathbb{E}\bigl[(E^\omega(Z_n) - nV)^6\bigr] \le C\,\mathbb{E}\Bigl[\Bigl(\sum_{k=0}^{n-1}\bigl(E^\omega g(T_{\bar Z_k}\omega)\bigr)^2\Bigr)^3\Bigr]. \]
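Expanding the cube yields four sums
\begin{align*}
& C\sum_{0\le k<n}\mathbb{E}\bigl[(E^\omega g(T_{\bar Z_k}\omega))^6\bigr]
 + C\sum_{0\le k_1<k_2<n}\mathbb{E}\bigl[(E^\omega g(T_{\bar Z_{k_1}}\omega))^2(E^\omega g(T_{\bar Z_{k_2}}\omega))^4\bigr] \\
& + C\sum_{0\le k_1<k_2<n}\mathbb{E}\bigl[(E^\omega g(T_{\bar Z_{k_1}}\omega))^4(E^\omega g(T_{\bar Z_{k_2}}\omega))^2\bigr] \\
& + C\sum_{0\le k_1<k_2<k_3<n}\mathbb{E}\bigl[(E^\omega g(T_{\bar Z_{k_1}}\omega))^2(E^\omega g(T_{\bar Z_{k_2}}\omega))^2(E^\omega g(T_{\bar Z_{k_3}}\omega))^2\bigr]
\end{align*}
with a constant $C$ that bounds the number of arrangements of each type. Replacing some $g$-factors with constant upper bounds simplifies the quantity to this:
\begin{align*}
& C\sum_{0\le k<n}\mathbb{E}\bigl[(E^\omega g(T_{\bar Z_k}\omega))^2\bigr]
 + C\sum_{0\le k_1<k_2<n}\mathbb{E}\bigl[(E^\omega g(T_{\bar Z_{k_1}}\omega))^2(E^\omega g(T_{\bar Z_{k_2}}\omega))^2\bigr] \\
& + C\sum_{0\le k_1<k_2<k_3<n}\mathbb{E}\bigl[(E^\omega g(T_{\bar Z_{k_1}}\omega))^2(E^\omega g(T_{\bar Z_{k_2}}\omega))^2(E^\omega g(T_{\bar Z_{k_3}}\omega))^2\bigr].
\end{align*}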
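The expression above is bounded by $C(n^{1/2} + n + n^{3/2})$. We show the argument for the last sum of triple products. (Same reasoning applies to the first two sums.) It utilizes repeatedly independence, $\mathbb{E}\,g(T_u\omega)g(T_v\omega) = \sigma_D^2\,\mathbf{1}\{u=v\}$ for $u, v \in \mathbb{Z}^2$, and (6.1). Fix $0 \le k_1 < k_2 < k_3 < n$. Let $\bar Z'_k$ denote an independent copy of the walk $\bar Z_k$ in the same environment $\omega$:
\begin{align*}
&\mathbb{E}\bigl[(E^\omega g(T_{\bar Z_{k_1}}\omega))^2(E^\omega g(T_{\bar Z_{k_2}}\omega))^2(E^\omega g(T_{\bar Z_{k_3}}\omega))^2\bigr] \\
&\quad= \mathbb{E}\Bigl[(E^\omega g(T_{\bar Z_{k_1}}\omega))^2(E^\omega g(T_{\bar Z_{k_2}}\omega))^2 \sum_{u,v\in\mathbb{Z}^2} P^\omega\{\bar Z_{k_3} = u\}P^\omega\{\bar Z'_{k_3} = v\}\,\mathbb{E}\,g(T_u\omega)g(T_v\omega)\Bigr] \\
&\quad= C\,\mathbb{E}\Bigl[(E^\omega g(T_{\bar Z_{k_1}}\omega))^2(E^\omega g(T_{\bar Z_{k_2}}\omega))^2\,P^\omega\{\bar Z_{k_3} = \bar Z'_{k_3}\}\Bigr] \\
&\quad= C\,\mathbb{E}\Bigl[(E^\omega g(T_{\bar Z_{k_1}}\omega))^2(E^\omega g(T_{\bar Z_{k_2}}\omega))^2 \sum_{u,v\in\mathbb{Z}^2} P^\omega\{\bar Z_{k_2+1} = u,\ \bar Z'_{k_2+1} = v\}\,\mathbb{E}P^\omega\{\bar Z^u_{k_3-k_2-1} = \bar Z^v_{k_3-k_2-1}\}\Bigr] \\
&\qquad\text{(walks $\bar Z^u_k$ and $\bar Z^v_k$ are independent under a common $\omega$)} \\
&\quad\le C\,\mathbb{E}\Bigl[(E^\omega g(T_{\bar Z_{k_1}}\omega))^2(E^\omega g(T_{\bar Z_{k_2}}\omega))^2 \sum_{u,v\in\mathbb{Z}^2} P^\omega\{\bar Z_{k_2+1} = u,\ \bar Z'_{k_2+1} = v\}\Bigr]\,P(Y^0_{k_3-k_2-1} = 0) \\
&\quad= C\,\mathbb{E}\bigl[(E^\omega g(T_{\bar Z_{k_1}}\omega))^2(E^\omega g(T_{\bar Z_{k_2}}\omega))^2\bigr]\,P(Y^0_{k_3-k_2-1} = 0).
\end{align*}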
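Now repeat the same step, and ultimately arrive at
\begin{align*}
&\sum_{0\le k_1<k_2<k_3<n}\mathbb{E}\bigl[(E^\omega g(T_{\bar Z_{k_1}}\omega))^2(E^\omega g(T_{\bar Z_{k_2}}\omega))^2(E^\omega g(T_{\bar Z_{k_3}}\omega))^2\bigr] \\
&\quad\le C\sum_{0\le k_1<k_2<k_3<n} P(Y^0_{k_1} = 0)\,P(Y^0_{k_2-k_1-1} = 0)\,P(Y^0_{k_3-k_2-1} = 0) \le Cn^{3/2}.
\end{align*}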
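The last inequality can be checked in a few lines. The sketch below assumes the local-limit bound $P(Y^0_j = 0) \le C_0 (j+1)^{-1/2}$ for the walk $Y$, where $C_0$ is a hypothetical constant; that bound is not restated in this section, so it is an assumption here. It also uses that the map $(k_1,k_2,k_3) \mapsto (k_1,\ k_2-k_1-1,\ k_3-k_2-1)$ is injective into $\{0,\dots,n-1\}^3$.

```latex
% Assumption (not restated in this section): P(Y_j^0 = 0) <= C_0 (j+1)^{-1/2}.
\begin{align*}
\sum_{0\le k_1<k_2<k_3<n} P(Y^0_{k_1}=0)\,P(Y^0_{k_2-k_1-1}=0)\,P(Y^0_{k_3-k_2-1}=0)
 &\le \Bigl(\sum_{j=0}^{n-1} P(Y^0_j = 0)\Bigr)^{3} \\
 &\le \Bigl(C_0\sum_{j=1}^{n} j^{-1/2}\Bigr)^{3}
  \le \bigl(2C_0\sqrt{n}\bigr)^{3} = 8C_0^{3}\,n^{3/2}.
\end{align*}
```

The same counting gives the orders $n^{1/2}$ and $n$ for the single and double sums, which is where the bound $C(n^{1/2} + n + n^{3/2})$ comes from under this assumption.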
By Theorem 8.8 in [12, Chap. 3],
\[ \mathbb{E}\bigl[\,|a_n(t+h,r) - a_n(t,r)|^3\,|a_n(t,r) - a_n(t-h,r)|^3\,\bigr] \le Ch^{3/2} \]
is sufficient for tightness of the processes $\{a_n(t,r) : t \ge 0\}$. The left-hand side above is bounded by
\[ \mathbb{E}\bigl[(a_n(t+h,r) - a_n(t,r))^6\bigr] + \mathbb{E}\bigl[(a_n(t,r) - a_n(t-h,r))^6\bigr]. \]
Note that if $h < 1/(2n)$ then $\bigl(a_n(t+h,r) - a_n(t,r)\bigr)\bigl(a_n(t,r) - a_n(t-h,r)\bigr) = 0$ due to the discrete time of the unscaled walks, while if $h \ge 1/(2n)$ then $n(t+h) - nt \le 3nh$. Putting these points together shows that tightness will follow from the next moment bound.

Lemma 6.2. There exists a constant $0 < C < \infty$ such that, for all $0 \le m < n \in \mathbb{N}$,
\[ \mathbb{E}\Bigl[\bigl(\{E^\omega(Z_n^{0,0}) - nV\} - \{E^\omega(Z_m^{0,0}) - mV\}\bigr)^6\Bigr] \le C(n-m)^{3/2}. \]

Proof. The claim reduces to Lemma 6.1 by restarting the walks at time $m$.

Convergence of finite-dimensional distributions in Corollary 3.5 follows from Theorem 3.4. The limiting process $\bar a(\cdot) = \lim a_n(\cdot, r)$ is identified by its covariance $E\,\bar a(s)\bar a(t) = \Gamma_q\bigl((s\wedge t, r), (s\wedge t, r)\bigr)$. This completes the proof of Corollary 3.5.

7. Proofs for the Random Average Process

This section requires Theorem 3.2 from the space-time RWRE section.

7.1. Separation of effects. As the form of the limiting process in Theorem 2.1 suggests, we can separate the fluctuations that come from the initial configuration from those created by the dynamics. The quenched means of the RWRE represent the latter. We start with the appropriate decomposition. Abbreviate
\[ x_{n,r} = x(n,r) = n\bar y + r\sqrt n. \]
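Recall that we are considering $\bar y \in \mathbb{R}$ fixed, while $(t,r) \in \mathbb{R}_+\times\mathbb{R}$ is variable and serves as the index for the process:
\begin{align*}
\sigma^n_{nt}(x_{n,r} + ntb) - \sigma^n_0(x_{n,r})
&= E^\omega\bigl[\sigma^n_0\bigl(X_{nt}^{x(n,r)+ntb,\,nt}\bigr) - \sigma^n_0(x_{n,r})\bigr] \\
&= E^\omega\Bigl[\mathbf{1}\bigl\{X_{nt}^{x(n,r)+ntb,\,nt} > x(n,r)\bigr\}\sum_{i=x(n,r)+1}^{X_{nt}^{x(n,r)+ntb,\,nt}}\eta^n_0(i)
 - \mathbf{1}\bigl\{X_{nt}^{x(n,r)+ntb,\,nt} < x(n,r)\bigr\}\sum_{i=X_{nt}^{x(n,r)+ntb,\,nt}+1}^{x(n,r)}\eta^n_0(i)\Bigr] \\
&= \sum_{i > x(n,r)} P^\omega\bigl\{i \le X_{nt}^{x(n,r)+ntb,\,nt}\bigr\}\cdot\eta^n_0(i)
 - \sum_{i \le x(n,r)} P^\omega\bigl\{i > X_{nt}^{x(n,r)+ntb,\,nt}\bigr\}\cdot\eta^n_0(i).
\end{align*}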
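Recalling the means $\varrho(i/n) = E\eta^n_0(i)$ we write this as
\[ \sigma^n_{nt}(x_{n,r} + ntb) - \sigma^n_0(x_{n,r}) = Y^n(t,r) + H^n(t,r), \tag{7.1} \]
where
\[ Y^n(t,r) = \sum_{i\in\mathbb{Z}}\bigl(\eta^n_0(i) - \varrho(i/n)\bigr)\Bigl[\mathbf{1}\{i > x_{n,r}\}P^\omega\bigl\{i \le X_{nt}^{x(n,r)+ntb,\,nt}\bigr\} - \mathbf{1}\{i \le x_{n,r}\}P^\omega\bigl\{i > X_{nt}^{x(n,r)+ntb,\,nt}\bigr\}\Bigr] \]
and
\[ H^n(t,r) = \sum_{i\in\mathbb{Z}}\varrho(i/n)\Bigl[\mathbf{1}\{i > x_{n,r}\}P^\omega\bigl\{i \le X_{nt}^{x(n,r)+ntb,\,nt}\bigr\} - \mathbf{1}\{i \le x_{n,r}\}P^\omega\bigl\{i > X_{nt}^{x(n,r)+ntb,\,nt}\bigr\}\Bigr]. \]

The plan of the proof of Theorem 2.1 is summarized in the next lemma. In the pages that follow we then show the finite-dimensional weak convergence $n^{-1/4}H^n \to H$, and the finite-dimensional weak convergence $n^{-1/4}Y^n \to Y$ for a fixed $\omega$. This last statement is actually not proved quite in the strength just stated, but the spirit is correct. The distributional limit $n^{-1/4}Y^n \to Y$ comes from the centered initial increments $\eta^n_0(i) - \varrho(i/n)$, while a homogenization effect takes place for the coefficients $P^\omega\{i \le X_{nt}^{x(n,r)+ntb,\,nt}\}$ which converge to limiting deterministic Gaussian probabilities. Since the initial height functions $\sigma^n_0$ and the random environments $\omega$ that drive the dynamics are independent, we also get convergence $n^{-1/4}(Y^n + H^n) \to Y + H$ with independent terms $Y$ and $H$. This is exactly the statement of Theorem 2.1.

Lemma 7.1. Let $(\Omega_0, \mathcal{F}_0, P_0)$ be a probability space on which are defined independent random variables $\eta$ and $\omega$ with values in some abstract measurable spaces. The marginal laws are $\mathbb{P}$ for $\omega$ and $\mathbf{P}$ for $\eta$, and $\mathbf{P}^\omega = \delta_\omega\otimes\mathbf{P}$ is the conditional probability distribution of $(\omega,\eta)$, given $\omega$. Let $\mathbf{H}^n(\omega)$ and $\mathbf{Y}^n(\omega,\eta)$ be $\mathbb{R}^N$-valued measurable functions of $(\omega,\eta)$. Make assumptions (i)–(ii) below.

(i) There exists an $\mathbb{R}^N$-valued random vector $\mathbf{H}$ such that $\mathbf{H}^n(\omega)$ converges weakly to $\mathbf{H}$.
(ii) There exists an $\mathbb{R}^N$-valued random vector $\mathbf{Y}$ such that, for all $\theta \in \mathbb{R}^N$, $\mathbf{E}^\omega[e^{i\theta\cdot\mathbf{Y}^n}] \to E(e^{i\theta\cdot\mathbf{Y}})$ in $\mathbb{P}$-probability as $n \to \infty$.

Then $\mathbf{H}^n + \mathbf{Y}^n$ converges weakly to $\mathbf{H} + \mathbf{Y}$, where $\mathbf{H}$ and $\mathbf{Y}$ are independent.

Proof. Let $\theta, \lambda$ be arbitrary vectors in $\mathbb{R}^N$. Then
\begin{align*}
\Bigl|\mathbb{E}\mathbf{E}^\omega\bigl[e^{i\lambda\cdot\mathbf{H}^n + i\theta\cdot\mathbf{Y}^n}\bigr] - E\bigl[e^{i\lambda\cdot\mathbf{H}}\bigr]E\bigl[e^{i\theta\cdot\mathbf{Y}}\bigr]\Bigr|
&\le \Bigl|\mathbb{E}\Bigl[e^{i\lambda\cdot\mathbf{H}^n}\bigl(\mathbf{E}^\omega e^{i\theta\cdot\mathbf{Y}^n} - Ee^{i\theta\cdot\mathbf{Y}}\bigr)\Bigr]\Bigr| + \Bigl|\bigl(\mathbb{E}e^{i\lambda\cdot\mathbf{H}^n} - Ee^{i\lambda\cdot\mathbf{H}}\bigr)Ee^{i\theta\cdot\mathbf{Y}}\Bigr| \\
&\le \mathbb{E}\Bigl[\bigl|e^{i\lambda\cdot\mathbf{H}^n}\bigr|\,\bigl|\mathbf{E}^\omega e^{i\theta\cdot\mathbf{Y}^n} - Ee^{i\theta\cdot\mathbf{Y}}\bigr|\Bigr] + \bigl|\mathbb{E}e^{i\lambda\cdot\mathbf{H}^n} - Ee^{i\lambda\cdot\mathbf{H}}\bigr|.
\end{align*}
By assumption (i), the second term above goes to 0. By assumption (ii), the integrand in the first term goes to 0 in $\mathbb{P}$-probability. Therefore by bounded convergence the first term goes to 0 as $n \to \infty$.

Turning to the work itself, we check first that $H^n(t,r)$ can be replaced with a quenched RWRE mean. Then the convergence $H^n \to H$ follows from the RWRE results.

Lemma 7.2. For any $S, T < \infty$ and for $\mathbb{P}$-almost every $\omega$,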
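\[ \lim_{n\to\infty}\ \sup_{\substack{0\le t\le T\\ -S\le r\le S}} n^{-1/4}\Bigl|H^n(t,r) - \varrho(\bar y)\cdot E^\omega\bigl(X_{nt}^{x(n,r)+ntb,\,nt} - x_{n,r}\bigr)\Bigr| = 0. \]

Proof. Decompose $H^n(t,r) = H_1^n(t,r) - H_2^n(t,r)$, where
\begin{align*}
H_1^n(t,r) &= \sum_{i > x(n,r)} P^\omega\bigl\{i \le X_{nt}^{x(n,r)+ntb,\,nt}\bigr\}\cdot\varrho(i/n), \\
H_2^n(t,r) &= \sum_{i \le x(n,r)} P^\omega\bigl\{i > X_{nt}^{x(n,r)+ntb,\,nt}\bigr\}\cdot\varrho(i/n).
\end{align*}
Working with $H_1^n(t,r)$, we separate out the negligible error,
\begin{align*}
H_1^n(t,r) &= \varrho(\bar y)\sum_{i > x(n,r)} P^\omega\bigl\{i \le X_{nt}^{x(n,r)+ntb,\,nt}\bigr\}
 + \sum_{i > x(n,r)} P^\omega\bigl\{i \le X_{nt}^{x(n,r)+ntb,\,nt}\bigr\}\cdot\bigl[\varrho(i/n) - \varrho(\bar y)\bigr] \\
&= \varrho(\bar y)\cdot E^\omega\Bigl[\bigl(X_{nt}^{x(n,r)+ntb,\,nt} - x_{n,r}\bigr)^+\Bigr] + R_1(t,r)
\end{align*}
with
\[ R_1(t,r) = \sum_{m=1}^{\infty} P^\omega\bigl\{x_{n,r} + m \le X_{nt}^{x(n,r)+ntb,\,nt}\bigr\}\cdot\Bigl[\varrho\Bigl(\frac{x_{n,r}}{n} + \frac{m}{n}\Bigr) - \varrho(\bar y)\Bigr]. \]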
Fix a small positive number $\delta < \frac12$, and use the boundedness of probabilities and the function $\varrho$:
\[ |R_1(t,r)| \le \sum_{m=1}^{n^{1/2+\delta}}\Bigl|\varrho\Bigl(\frac{x_{n,r}}{n} + \frac{m}{n}\Bigr) - \varrho(\bar y)\Bigr| + C\cdot\sum_{m=n^{1/2+\delta}+1}^{\infty} P^\omega\bigl\{x_{n,r} + m \le X_{nt}^{x(n,r)+ntb,\,nt}\bigr\}. \tag{7.2} \]
By the local Hölder-continuity of $\varrho$ with exponent $\gamma > \frac12$, the first sum is $o(n^{1/4})$ if $\delta > 0$ is small enough. Since $X_0^{x(n,r)+ntb,\,nt} = x_{n,r} + ntb$ and by time $nt$ the walk has displaced by at most $Mnt$, there are at most $O(n)$ nonzero terms in the second sum in (7.2). Consequently this sum is at most
\[ Cn\cdot P^\omega\bigl\{X_{nt}^{x(n,r)+ntb,\,nt} - x_{n,r} \ge n^{1/2+\delta}\bigr\}. \]
By Lemma 4.3 the last line vanishes uniformly over $t \in [0,T]$ and $r \in [-S,S]$ as $n \to \infty$, for $\mathbb{P}$-almost every $\omega$. We have shown
\[ \lim_{n\to\infty}\ \sup_{\substack{0\le t\le T\\ -S\le r\le S}} n^{-1/4}\Bigl|H_1^n(t,r) - \varrho(\bar y)\cdot E^\omega\Bigl[\bigl(X_{nt}^{x(n,r)+ntb,\,nt} - x_{n,r}\bigr)^+\Bigr]\Bigr| = 0 \quad \mathbb{P}\text{-a.s.} \]
Similarly one shows
\[ \lim_{n\to\infty}\ \sup_{\substack{0\le t\le T\\ -S\le r\le S}} n^{-1/4}\Bigl|H_2^n(t,r) - \varrho(\bar y)\cdot E^\omega\Bigl[\bigl(X_{nt}^{x(n,r)+ntb,\,nt} - x_{n,r}\bigr)^-\Bigr]\Bigr| = 0 \quad \mathbb{P}\text{-a.s.} \]
The conclusion follows from the combination of these two.

For a fixed $n$ and $\bar y$, the process $E^\omega\bigl(X_{nt}^{x(n,r)+ntb,\,nt}\bigr) - x_{n,r}$ has the same distribution as the process $y_n(t,r)$ defined in (3.6). A combination of Lemma 7.2 and Theorem 3.2 implies that the finite-dimensional distributions of the processes $n^{-1/4}H^n$ converge weakly, as $n \to \infty$, to the finite-dimensional distributions of the mean-zero Gaussian process $H$ with covariance
\[ E\,H(s,q)H(t,r) = \varrho(\bar y)^2\,\Gamma_q\bigl((s,q),(t,r)\bigr). \tag{7.3} \]
7.2. Finite-dimensional convergence of $Y^n$. Next we turn to convergence of the finite-dimensional distributions of process $Y^n$ in (7.1). Recall that $B(t)$ is standard Brownian motion, and $\sigma_a^2 = E[(X_1^{0,0} - V)^2]$ is the variance of the annealed walk. Recall the definition
\begin{align*}
\Gamma_0\bigl((s,q),(t,r)\bigr) &= \int_{q\vee r}^{\infty} P[\sigma_a B(s) > x - q]\,P[\sigma_a B(t) > x - r]\,dx \\
&\quad - \Bigl\{\mathbf{1}\{r > q\}\int_q^r P[\sigma_a B(s) > x - q]\,P[\sigma_a B(t) \le x - r]\,dx \\
&\qquad\ + \mathbf{1}\{q > r\}\int_r^q P[\sigma_a B(s) \le x - q]\,P[\sigma_a B(t) > x - r]\,dx\Bigr\} \\
&\quad + \int_{-\infty}^{q\wedge r} P[\sigma_a B(s) \le x - q]\,P[\sigma_a B(t) \le x - r]\,dx.
\end{align*}
Recall from (2.9) that $v(\bar y)$ is the variance of the increments around $n\bar y$. Let $\{Y(t,r) : t \ge 0,\ r \in \mathbb{R}\}$ be a real-valued mean-zero Gaussian process with covariance
\[ E\,Y(s,q)Y(t,r) = v(\bar y)\,\Gamma_0\bigl((s,q),(t,r)\bigr). \tag{7.4} \]
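The covariance kernel can be evaluated numerically as a sanity check. The sketch below assumes the kernel is the integral over $x$ of the product of the two signed profiles $\mathbf{1}\{x > q\}P[\sigma_a B(s) > x - q] - \mathbf{1}\{x \le q\}P[\sigma_a B(s) \le x - q]$ and the same with $(t,r)$, which is how the four integrals above combine. The names `upper_tail`, `gamma0` and `gamma0_diagonal`, the truncation width and the midpoint rule are illustrative choices, not part of the paper. On the diagonal $(s,q) = (t,r)$ the integral has the closed form $\sigma_a\sqrt{s}\,(\sqrt2 - 1)/\sqrt\pi$, which the numerical value can be compared against.

```python
import math


def upper_tail(x, sd):
    """P[N(0, sd^2) > x], computed with erfc for accuracy in the tail."""
    return 0.5 * math.erfc(x / (sd * math.sqrt(2.0)))


def gamma0(s, q, t, r, sigma_a=1.0, width=12.0, steps=20000):
    """Midpoint-rule approximation of Gamma_0((s,q),(t,r)) for s, t > 0.

    Hypothetical helper: integrates the product of the two signed
    Gaussian profiles over a window that extends `width` standard
    deviations beyond min(q, r) and max(q, r).
    """
    sd_s = sigma_a * math.sqrt(s)
    sd_t = sigma_a * math.sqrt(t)
    pad = width * max(sd_s, sd_t)
    lo, hi = min(q, r) - pad, max(q, r) + pad
    h = (hi - lo) / steps
    total = 0.0
    for j in range(steps):
        x = lo + (j + 0.5) * h
        tail_s = upper_tail(x - q, sd_s)  # P[sigma_a B(s) > x - q]
        tail_t = upper_tail(x - r, sd_t)  # P[sigma_a B(t) > x - r]
        # Signed profile: upper tail to the right of the centre,
        # minus the lower tail to the left of it.
        zeta_s = tail_s if x > q else tail_s - 1.0
        zeta_t = tail_t if x > r else tail_t - 1.0
        total += zeta_s * zeta_t * h
    return total


def gamma0_diagonal(s, sigma_a=1.0):
    """Closed form of Gamma_0((s,q),(s,q)); it does not depend on q."""
    return sigma_a * math.sqrt(s) * (math.sqrt(2.0) - 1.0) / math.sqrt(math.pi)
```

For example, `gamma0(2.0, 0.3, 2.0, 0.3, sigma_a=1.5)` should agree with `gamma0_diagonal(2.0, sigma_a=1.5)` up to quadrature error, and `gamma0` is symmetric under swapping `(s, q)` with `(t, r)`.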
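Fix $N$ and space-time points $(t_1,r_1),\dots,(t_N,r_N) \in \mathbb{R}_+\times\mathbb{R}$. Define vectors
\[ \mathbf{Y}^n = n^{-1/4}\bigl(Y^n(t_1,r_1),\dots,Y^n(t_N,r_N)\bigr) \quad\text{and}\quad \mathbf{Y} = \bigl(Y(t_1,r_1),\dots,Y(t_N,r_N)\bigr). \]
This section is devoted to the proof of the next proposition, after which we finish the proof of Theorem 2.1.

Proposition 7.1. For any vector $\theta = (\theta_1,\dots,\theta_N) \in \mathbb{R}^N$, $\mathbf{E}^\omega(e^{i\theta\cdot\mathbf{Y}^n}) \to E(e^{i\theta\cdot\mathbf{Y}})$ in $\mathbb{P}$-probability as $n \to \infty$.

Proof. Let $G$ be a centered Gaussian variable with variance
\[ S = v(\bar y)\sum_{k,l=1}^{N}\theta_k\theta_l\,\Gamma_0\bigl((t_k,r_k),(t_l,r_l)\bigr) \]
and so $\theta\cdot\mathbf{Y}$ is distributed like $G$. We will show that
\[ \mathbf{E}^\omega(e^{i\theta\cdot\mathbf{Y}^n}) \to E(e^{iG}) \quad\text{in } \mathbb{P}\text{-probability.} \]
Recalling the definition of $Y^n(t,r)$, introduce some notation:
\[ \zeta_n^\omega(i,t,r) = \mathbf{1}\{i > x_{n,r}\}P^\omega\bigl\{i \le X_{nt}^{x(n,r)+ntb,\,nt}\bigr\} - \mathbf{1}\{i \le x_{n,r}\}P^\omega\bigl\{i > X_{nt}^{x(n,r)+ntb,\,nt}\bigr\} \]
so that
\[ Y^n(t,r) = \sum_{i\in\mathbb{Z}}\bigl(\eta^n_0(i) - \varrho(i/n)\bigr)\zeta_n^\omega(i,t,r). \]
Then put
\[ \nu_n^\omega(i) = \sum_{k=1}^{N}\theta_k\,\zeta_n^\omega(i,t_k,r_k) \quad\text{and}\quad U_n(i) = n^{-1/4}\bigl(\eta^n_0(i) - \varrho(i/n)\bigr)\nu_n^\omega(i). \]
Consequently
\[ \theta\cdot\mathbf{Y}^n = \sum_{i\in\mathbb{Z}} U_n(i). \]
To separate out the relevant terms let $\delta > 0$ be small and define
\[ W_n = \sum_{i=n\bar y - n^{1/2+\delta}}^{n\bar y + n^{1/2+\delta}} U_n(i). \tag{7.5} \]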
For fixed $\omega$ and $n$, under the measure $\mathbf{P}^\omega$ the variables $\{U_n(i)\}$ are constant multiples of centered increments $\eta^n_0(i) - \varrho(i/n)$ and hence independent and mean zero. Recall also that second moments of centered increments $\eta^n_0(i) - \varrho(i/n)$ are uniformly bounded. Thus the terms left out of $W_n$ satisfy
\[ \mathbf{E}^\omega\bigl[(W_n - \theta\cdot\mathbf{Y}^n)^2\bigr] \le Cn^{-1/2}\sum_{i:\,|i - n\bar y| > n^{1/2+\delta}} \nu_n^\omega(i)^2, \]
and we wish to show that this upper bound vanishes for $\mathbb{P}$-almost every $\omega$ as $n \to \infty$. Using the definition of $\nu_n^\omega(i)$, bounding the sum on the right reduces to bounding sums of the two types
\[ n^{-1/2}\sum_{i:\,|i - n\bar y| > n^{1/2+\delta}} \mathbf{1}\{i > x(n,r_k)\}\Bigl(P^\omega\bigl\{i \le X_{nt_k}^{x(n,r_k)+nt_kb,\,nt_k}\bigr\}\Bigr)^2 \]
and
\[ n^{-1/2}\sum_{i:\,|i - n\bar y| > n^{1/2+\delta}} \mathbf{1}\{i \le x(n,r_k)\}\Bigl(P^\omega\bigl\{i > X_{nt_k}^{x(n,r_k)+nt_kb,\,nt_k}\bigr\}\Bigr)^2. \]
For large enough $n$ the points $x(n,r_k)$ lie within $\frac12 n^{1/2+\delta}$ of $n\bar y$, and then the previous sums are bounded by the sums
\[ n^{-1/2}\sum_{i \ge x(n,r_k) + (1/2)n^{1/2+\delta}} \Bigl(P^\omega\bigl\{i \le X_{nt_k}^{x(n,r_k)+nt_kb,\,nt_k}\bigr\}\Bigr)^2 \]
and
\[ n^{-1/2}\sum_{i \le x(n,r_k) - (1/2)n^{1/2+\delta}} \Bigl(P^\omega\bigl\{i > X_{nt_k}^{x(n,r_k)+nt_kb,\,nt_k}\bigr\}\Bigr)^2. \]
These vanish for $\mathbb{P}$-almost every $\omega$ as $n \to \infty$ by Lemma 4.3, in a manner similar to the second sum in (7.2). Thus $\mathbf{E}^\omega(W_n - \theta\cdot\mathbf{Y}^n)^2 \to 0$ and our goal (7.5) has simplified to
\[ \mathbf{E}^\omega(e^{iW_n}) \to E(e^{iG}) \quad\text{in } \mathbb{P}\text{-probability.} \tag{7.6} \]
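We use the Lindeberg-Feller theorem to formulate conditions for a central limit theorem for $W_n$ under a fixed $\omega$. For Lindeberg-Feller we need to check two conditions:
\begin{align*}
\text{(LF-i)}\quad & S^n(\omega) \equiv \sum_{i=n\bar y - n^{1/2+\delta}}^{n\bar y + n^{1/2+\delta}} \mathbf{E}^\omega\bigl[U_n(i)^2\bigr] \xrightarrow[n\to\infty]{} S, \\
\text{(LF-ii)}\quad & \sum_{i=n\bar y - n^{1/2+\delta}}^{n\bar y + n^{1/2+\delta}} \mathbf{E}^\omega\bigl[U_n(i)^2\cdot\mathbf{1}\{|U_n(i)| > \varepsilon\}\bigr] \xrightarrow[n\to\infty]{} 0 \quad\text{for all } \varepsilon > 0.
\end{align*}
To see that (LF-ii) holds, pick conjugate exponents $p, q > 1$ ($1/p + 1/q = 1$):
\begin{align*}
\mathbf{E}^\omega\bigl[U_n(i)^2\cdot\mathbf{1}\{U_n(i)^2 > \varepsilon^2\}\bigr]
&\le \bigl(\mathbf{E}^\omega|U_n(i)|^{2p}\bigr)^{1/p}\bigl(\mathbf{P}^\omega\{U_n(i)^2 > \varepsilon^2\}\bigr)^{1/q} \\
&\le \varepsilon^{-2/q}\bigl(\mathbf{E}^\omega|U_n(i)|^{2p}\bigr)^{1/p}\bigl(\mathbf{E}^\omega\bigl[U_n(i)^2\bigr]\bigr)^{1/q} \\
&\le Cn^{-1/2 - 1/(2q)}.
\end{align*}
In the last step we used the bound $|U_n(i)| \le Cn^{-1/4}\,|\eta^n_0(i) - \varrho(i/n)|$, boundedness of $\varrho$, and we took $p$ close enough to 1 to apply assumption (2.11). Condition (LF-ii) follows if $\delta < 1/(2q)$. We turn to condition (LF-i):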
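\begin{align*}
S^n(\omega) &= \sum_{i=n\bar y - n^{1/2+\delta}}^{n\bar y + n^{1/2+\delta}} \mathbf{E}^\omega\bigl[U_n(i)^2\bigr]
 = \sum_{i=n\bar y - n^{1/2+\delta}}^{n\bar y + n^{1/2+\delta}} n^{-1/2}\,v(i/n)\,[\nu_n^\omega(i)]^2 \\
&= \sum_{i=n\bar y - n^{1/2+\delta}}^{n\bar y + n^{1/2+\delta}} n^{-1/2}\,[v(i/n) - v(\bar y)]\,[\nu_n^\omega(i)]^2
 + \sum_{i=n\bar y - n^{1/2+\delta}}^{n\bar y + n^{1/2+\delta}} n^{-1/2}\,v(\bar y)\,[\nu_n^\omega(i)]^2.
\end{align*}
Due to the local Hölder-property (2.10) of $v$, the first sum on the last line is bounded above by
\[ C(\bar y)\,n^{1/2+\delta}\,n^{-1/2}\,\bigl(n^{-1/2+\delta}\bigr)^{\gamma} = C(\bar y)\,n^{\delta(1+\gamma) - \gamma/2} \to 0 \]
for sufficiently small $\delta$. Denote the remaining relevant part by $\tilde S^n(\omega)$, given by
\begin{align*}
\tilde S^n(\omega) &= \sum_{i=n\bar y - n^{1/2+\delta}}^{n\bar y + n^{1/2+\delta}} n^{-1/2}\,v(\bar y)\,[\nu_n^\omega(i)]^2
 = v(\bar y)\,n^{-1/2}\sum_{m=-n^{1/2+\delta}}^{n^{1/2+\delta}} \bigl[\nu_n^\omega(m + n\bar y)\bigr]^2 \\
&= v(\bar y)\sum_{k,l=1}^{N}\theta_k\theta_l\,n^{-1/2}\sum_{m=-n^{1/2+\delta}}^{n^{1/2+\delta}} \zeta_n^\omega(n\bar y + m, t_k, r_k)\,\zeta_n^\omega(n\bar y + m, t_l, r_l). \tag{7.7}
\end{align*}
Consider for the moment a particular $(k,l)$ term in the first sum on line (7.7). Rename $(s,q) = (t_k,r_k)$ and $(t,r) = (t_l,r_l)$. Expanding the product of the $\zeta_n^\omega$-factors gives three sums: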
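\begin{align*}
& n^{-1/2}\sum_{m=-n^{1/2+\delta}}^{n^{1/2+\delta}} \zeta_n^\omega(n\bar y + m, s, q)\,\zeta_n^\omega(n\bar y + m, t, r) \\
&= n^{-1/2}\sum_{m=-n^{1/2+\delta}}^{n^{1/2+\delta}} \mathbf{1}\{m > q\sqrt n\}\mathbf{1}\{m > r\sqrt n\}\,P^\omega\bigl\{X_{ns}^{x(n,q)+nsb,\,ns} \ge n\bar y + m\bigr\}\,P^\omega\bigl\{X_{nt}^{x(n,r)+ntb,\,nt} \ge n\bar y + m\bigr\} \tag{7.8} \\
&\quad - n^{-1/2}\sum_{m=-n^{1/2+\delta}}^{n^{1/2+\delta}} \Bigl\{\mathbf{1}\{m > q\sqrt n\}\mathbf{1}\{m \le r\sqrt n\}\,P^\omega\bigl\{X_{ns}^{x(n,q)+nsb,\,ns} \ge n\bar y + m\bigr\}\,P^\omega\bigl\{X_{nt}^{x(n,r)+ntb,\,nt} < n\bar y + m\bigr\} \\
&\qquad\qquad + \mathbf{1}\{m \le q\sqrt n\}\mathbf{1}\{m > r\sqrt n\}\,P^\omega\bigl\{X_{ns}^{x(n,q)+nsb,\,ns} < n\bar y + m\bigr\}\,P^\omega\bigl\{X_{nt}^{x(n,r)+ntb,\,nt} \ge n\bar y + m\bigr\}\Bigr\} \tag{7.9} \\
&\quad + n^{-1/2}\sum_{m=-n^{1/2+\delta}}^{n^{1/2+\delta}} \mathbf{1}\{m \le q\sqrt n\}\mathbf{1}\{m \le r\sqrt n\}\,P^\omega\bigl\{X_{ns}^{x(n,q)+nsb,\,ns} < n\bar y + m\bigr\}\,P^\omega\bigl\{X_{nt}^{x(n,r)+ntb,\,nt} < n\bar y + m\bigr\}. \tag{7.10}
\end{align*}
Each of these three sums (7.8)–(7.10) converges to a corresponding integral in $\mathbb{P}$-probability, due to the quenched CLT Theorem 3.1. To see the correct limit, just note that
\[ P^\omega\bigl\{X_{nt}^{x(n,r)+ntb,\,nt} < n\bar y + m\bigr\} = P^\omega\bigl\{X_{nt}^{x(n,r)+ntb,\,nt} - X_0^{x(n,r)+ntb,\,nt} < -ntb + m - r\sqrt n\bigr\}, \]
and recall that $-b = V$ is the average speed of the walks. We give technical details of the argument for the first sum in the next lemma.

Lemma 7.3. As $n \to \infty$, the sum in (7.8) converges in $\mathbb{P}$-probability to
\[ \int_{q\vee r}^{\infty} P[\sigma_a B(s) > x - q]\,P[\sigma_a B(t) > x - r]\,dx. \]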
Proof of Lemma 7.3. With
\[ f_n^\omega(x) = P^\omega\bigl\{X_{ns}^{x(n,q)+nsb,\,ns} \ge n\bar y + x\sqrt n\bigr\}\times P^\omega\bigl\{X_{nt}^{x(n,r)+ntb,\,nt} \ge n\bar y + x\sqrt n\bigr\} \]
and
\[ I_n^\omega = \int_{q\vee r}^{n^\delta} f_n^\omega(x)\,dx, \]
the sum in (7.8) equals $I_n^\omega + O(n^{-1/2})$. By the quenched invariance principle Theorem 3.1, for any fixed $x$, $f_n^\omega(x)$ converges in $\mathbb{P}$-probability to
\[ f(x) = P[\sigma_a B(s) \ge x - q]\,P[\sigma_a B(t) \ge x - r]. \]
We cannot claim this convergence $\mathbb{P}$-almost surely because the walks $X_{nt}^{x(n,r)+ntb,\,nt}$ change as $n$ changes. But by a textbook characterization of convergence in probability, for a fixed $x$ each subsequence $n(j)$ has a further subsequence $n(j_\ell)$ such that
\[ \mathbb{P}\bigl\{\omega : f^\omega_{n(j_\ell)}(x) \xrightarrow[\ell\to\infty]{} f(x)\bigr\} = 1. \]
By the diagonal trick, one can find one subsequence for all $x \in \mathbb{Q}$ and thus
\[ \forall\{n(j)\},\ \exists\{j_\ell\} :\ \mathbb{P}\bigl\{\omega : \forall x \in \mathbb{Q} : f^\omega_{n(j_\ell)}(x) \to f(x)\bigr\} = 1. \]
Since $f_n^\omega$ and $f$ are nonnegative and nonincreasing, and $f$ is continuous and decreases to 0, the convergence works for all $x$ and is uniform on $[q\vee r, \infty)$. That is,
\[ \forall\{n(j)\},\ \exists\{j_\ell\} :\ \mathbb{P}\bigl\{\omega : \bigl\|f^\omega_{n(j_\ell)} - f\bigr\|_{L^\infty[q\vee r,\,\infty)} \to 0\bigr\} = 1. \]
It remains to make the step to the convergence of the integral $I_n^\omega$ to $\int_{q\vee r}^{\infty} f(x)\,dx$. Define now
\[ J_n^\omega(A) = \int_{q\vee r}^{A} f_n^\omega(x)\,dx. \]
Then, for any $A < \infty$,
\[ \forall\{n(j)\},\ \exists\{j_\ell\} :\ \mathbb{P}\Bigl\{\omega : J^\omega_{n(j_\ell)}(A) \to \int_{q\vee r}^{A} f(x)\,dx\Bigr\} = 1. \]
In other words, $J_n^\omega(A)$ converges to $\int_{q\vee r}^{A} f(x)\,dx$ in $\mathbb{P}$-probability. Thus, for each $0 < A < \infty$, there is an integer $m(A)$ such that for all $n \ge m(A)$,
\[ \mathbb{P}\Bigl\{\omega : \Bigl|J_n^\omega(A) - \int_{q\vee r}^{A} f(x)\,dx\Bigr| > A^{-1}\Bigr\} < A^{-1}. \]
Pick $A_n \nearrow \infty$ such that $m(A_n) \le n$. Under the annealed measure $P$, $X_n^{0,0}$ is a homogeneous mean zero random walk with variance $O(n)$. Consequently
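\begin{align*}
\mathbb{E}\bigl[\,|I_n^\omega - J_n^\omega(A_n)|\,\bigr] &\le \int_{A_n\wedge n^\delta}^{\infty} \mathbb{E}[f_n^\omega(x)]\,dx \\
&\le \int_{A_n\wedge n^\delta}^{\infty} P\bigl\{X_{nt}^{x(n,r)+ntb,\,nt} \ge x(n,r) - r\sqrt n + x\sqrt n\bigr\}\,dx \xrightarrow[n\to\infty]{} 0.
\end{align*}
Combine this with
\[ \mathbb{P}\Bigl\{\omega : \Bigl|J_n^\omega(A_n) - \int_{q\vee r}^{A_n} f(x)\,dx\Bigr| > A_n^{-1}\Bigr\} < A_n^{-1}. \]
Since $\int_{q\vee r}^{A_n} f(x)\,dx$ converges to $\int_{q\vee r}^{\infty} f(x)\,dx$, we have shown that $I_n^\omega$ converges to this same integral in $\mathbb{P}$-probability. This completes the proof of Lemma 7.3.

We return to the main development, the proof of Proposition 7.1. Apply the argument of the lemma to the three sums (7.8)–(7.10) to conclude the following limit in $\mathbb{P}$-probability:
\begin{align*}
&\lim_{n\to\infty} n^{-1/2}\sum_{m=-n^{1/2+\delta}}^{n^{1/2+\delta}} \zeta_n^\omega(n\bar y + m, s, q)\,\zeta_n^\omega(n\bar y + m, t, r) \\
&\quad= \int_{q\vee r}^{\infty} P[\sigma_a B(s) > x - q]\,P[\sigma_a B(t) > x - r]\,dx \\
&\qquad - \Bigl\{\mathbf{1}\{r > q\}\int_q^r P[\sigma_a B(s) > x - q]\,P[\sigma_a B(t) \le x - r]\,dx
 + \mathbf{1}\{q > r\}\int_r^q P[\sigma_a B(s) \le x - q]\,P[\sigma_a B(t) > x - r]\,dx\Bigr\} \\
&\qquad + \int_{-\infty}^{q\wedge r} P[\sigma_a B(s) \le x - q]\,P[\sigma_a B(t) \le x - r]\,dx \\
&\quad= \Gamma_0\bigl((s,q),(t,r)\bigr).
\end{align*}
Return to condition (LF-i) of the Lindeberg-Feller theorem and the definition (7.7) of $\tilde S^n(\omega)$. Since $S^n(\omega) - \tilde S^n(\omega) \to 0$ as pointed out above (7.7), we have shown that $S^n \to S$ in $\mathbb{P}$-probability. Consequently
\[ \forall\{n(j)\},\ \exists\{j_\ell\} :\ \mathbb{P}\bigl\{\omega : S^{n(j_\ell)}(\omega) \to S\bigr\} = 1. \]
This can be rephrased as: given any subsequence $\{n(j)\}$, there exists a further subsequence $\{n(j_\ell)\}$ along which conditions (LF-i) and (LF-ii) of the Lindeberg-Feller theorem are satisfied for the array
\[ \bigl\{U_{n(j_\ell)}(i) : n(j_\ell)\bar y - n(j_\ell)^{1/2+\delta} \le i \le n(j_\ell)\bar y + n(j_\ell)^{1/2+\delta},\ \ell \ge 1\bigr\} \]
under the measure $\mathbf{P}^\omega$ for $\mathbb{P}$-a.e. $\omega$. This implies that
\[ \forall\{n(j)\},\ \exists\{j_\ell\} :\ \mathbb{P}\bigl\{\omega : \mathbf{E}^\omega(e^{iW_{n(j_\ell)}}) \to E(e^{iG})\bigr\} = 1. \]
But the last statement characterizes convergence $\mathbf{E}^\omega(e^{iW_n}) \to E(e^{iG})$ in $\mathbb{P}$-probability. As we already showed above that $W_n - \theta\cdot\mathbf{Y}^n \to 0$ in $\mathbf{P}^\omega$-probability $\mathbb{P}$-almost surely, this completes the proof of Proposition 7.1.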
7.3. Proofs of Theorem 2.1 and Proposition 2.2.

Proof of Theorem 2.1. The decomposition (7.1) gives $z_n = n^{-1/4}(Y^n + H^n)$. The paragraph that follows Lemma 7.2 and Proposition 7.1 verify the hypotheses of Lemma 7.1 for $H^n$ and $Y^n$. Thus we have the limit $z_n \to z \equiv Y + H$ in the sense of convergence of finite-dimensional distributions. Since $Y$ and $H$ are mutually independent mean-zero Gaussian processes, their covariances in (7.3) and (7.4) can be added to give (2.18).

Proof of Proposition 2.2. The value (2.23) for $\beta$ can be computed from (2.8), or from the probabilistic characterization (4.4) of $\beta$ via Example 4.1. If we let $u$ denote a random variable distributed like $u_0(0,-1)$, then we get
\[ \beta = \frac{Eu - E(u^2)}{Eu - (Eu)^2} \quad\text{and}\quad \kappa = \frac{E(u^2) - (Eu)^2}{Eu - E(u^2)}. \]
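As a quick consistency check, the two constants can be evaluated exactly for a concrete weight law. The sketch below assumes the fractions read $\beta = (Eu - E(u^2))/(Eu - (Eu)^2)$ and $\kappa = (E(u^2) - (Eu)^2)/(Eu - E(u^2))$; the helper names `beta_kappa` and `stationarity_gap` are hypothetical. It also tests that $v = \kappa\rho^2$ solves the variance identity $v = v(1 - 2Eu + 2E(u^2)) + 2\rho^2(E(u^2) - (Eu)^2)$ obtained in the remainder of this proof.

```python
from fractions import Fraction


def beta_kappa(mean_u, mean_u2):
    """Return (beta, kappa) from the first two moments of the weight u.

    Assumes 0 < E(u^2) < E(u) < 1, so that both denominators are positive.
    """
    var_u = mean_u2 - mean_u ** 2
    beta = (mean_u - mean_u2) / (mean_u - mean_u ** 2)
    kappa = var_u / (mean_u - mean_u2)
    return beta, kappa


def stationarity_gap(v, rho, mean_u, mean_u2):
    """Right-hand side minus left-hand side of the variance identity."""
    rhs = v * (1 - 2 * mean_u + 2 * mean_u2) + 2 * rho ** 2 * (mean_u2 - mean_u ** 2)
    return rhs - v


# Illustration: u uniform on (0, 1), so E(u) = 1/2 and E(u^2) = 1/3.
mean_u, mean_u2 = Fraction(1, 2), Fraction(1, 3)
beta, kappa = beta_kappa(mean_u, mean_u2)
rho = Fraction(3)
gap = stationarity_gap(kappa * rho ** 2, rho, mean_u, mean_u2)
```

Exact rational arithmetic is used so that the identity can be tested for equality rather than up to rounding. Under the assumed formulas the two constants are linked by $\kappa = (1 - \beta)/\beta$.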
With obvious notational simplifications, the evolution step (2.22) can be rewritten as
\[ \eta'(k) - \rho = (1 - u_k)(\eta(k) - \rho) + u_{k-1}(\eta(k-1) - \rho) + (u_{k-1} - u_k)\rho. \]
Square both sides, take expectations, use the independence of all variables $\{\eta(k-1), \eta(k), u_k, u_{k-1}\}$ on the right, and use the requirement that $\eta'(k)$ have the same variance $v$ as $\eta(k)$ and $\eta(k-1)$. The result is the identity
\[ v = v\bigl(1 - 2Eu + 2E(u^2)\bigr) + 2\rho^2\bigl(E(u^2) - (Eu)^2\bigr) \]
from which follows $v = \kappa\rho^2$. The rest of part (b) is a straightforward specialization of (2.18).

Acknowledgements. The authors thank P. Ferrari and L. R. Fontes for comments on article [14] and J. Swanson for helpful discussions.
References

1. Aldous, D., Diaconis, P.: Hammersley's interacting particle process and longest increasing subsequences. Probab. Theory Related Fields 103(2), 199–213 (1995)
2. Aldous, D., Diaconis, P.: Longest increasing subsequences: from patience sorting to the Baik-Deift-Johansson theorem. Bull. Amer. Math. Soc. (N.S.) 36(4), 413–432 (1999)
3. Baik, J., Deift, P., Johansson, K.: On the distribution of the length of the longest increasing subsequence of random permutations. J. Amer. Math. Soc. 12(4), 1119–1178 (1999)
4. Balázs, M.: Growth fluctuations in a class of deposition models. Ann. Inst. H. Poincaré Probab. Statist. 39(4), 639–685 (2003)
5. Bernabei, M.S.: Anomalous behaviour for the random corrections to the cumulants of random walks in fluctuating random media. Probab. Theory Related Fields 119(3), 410–432 (2001)
6. Boldrighini, C., Pellegrinotti, A.: $T^{-1/4}$-noise for random walks in dynamic environment on Z. Mosc. Math. J. 1(3), 365–380, 470–471 (2001)
7. Burkholder, D.L.: Distribution function inequalities for martingales. Ann. Probab. 1, 19–42 (1973)
8. Chung, K.L.: Markov chains with stationary transition probabilities. In: Die Grundlehren der mathematischen Wissenschaften, Band 104. New York: Springer-Verlag, Second edition, 1967
9. De Masi, A., Presutti, E.: Mathematical methods for hydrodynamic limits, Volume 1501 of Lecture Notes in Mathematics. Berlin: Springer-Verlag, 1991
10. Deift, P.: Integrable systems and combinatorial theory. Notices Amer. Math. Soc. 47(6), 631–640 (2000)
11. Durrett, R.: Probability: theory and examples. Duxbury Advanced Series. Belmont, CA: Brooks/Cole–Thomson, Third edition, 2004
12. Ethier, S.N., Kurtz, T.G.: Markov processes. Wiley Series in Probability and Mathematical Statistics: Probability and Mathematical Statistics. New York: John Wiley & Sons Inc., 1986
13. Ferrari, P.A., Fontes, L.R.G.: Current fluctuations for the asymmetric simple exclusion process. Ann. Probab. 22(2), 820–832 (1994)
14. Ferrari, P.A., Fontes, L.R.G.: Fluctuations of a surface submitted to a random average process. Electron. J. Probab. 3, no. 6, 34 pp. (1998) (electronic)
15. Ferrari, P.L., Spohn, H.: Scaling Limit for the Space-Time Covariance of the Stationary Totally Asymmetric Simple Exclusion Process. Commun. Math. Phys. (2006) (in press)
16. Gravner, J., Tracy, C.A., Widom, H.: Limit theorems for height fluctuations in a class of discrete space and time growth models. J. Stat. Phys. 102(5–6), 1085–1132 (2001)
17. Groeneboom, P.: Hydrodynamical methods for analyzing longest increasing subsequences. J. Comput. Appl. Math. 142(1), 83–105 (2002)
18. Hammersley, J.M.: A few seedlings of research. In: Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability (Univ. California, Berkeley, Calif., 1970/1971), Vol. I: Theory of statistics, Berkeley, Univ. California Press, 1972, pp. 345–394
19. Johansson, K.: Shape fluctuations and random matrices. Commun. Math. Phys. 209(2), 437–476 (2000)
20. Johansson, K.: Discrete orthogonal polynomial ensembles and the Plancherel measure. Ann. Math. (2) 153(1), 259–296 (2001)
21. Kipnis, C., Landim, C.: Scaling limits of interacting particle systems. Volume 320 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], Berlin: Springer-Verlag, 1999
22. Liggett, T.M.: Interacting particle systems. Volume 276 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], New York: Springer-Verlag, 1985
23. Liggett, T.M.: Stochastic interacting systems: contact, voter and exclusion processes. Volume 324 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], Berlin: Springer-Verlag, 1999
24. Rassoul-Agha, F., Seppäläinen, T.: An almost sure invariance principle for random walks in a space-time random environment. Probab. Th. Rel. Fields 133, no. 3, 299–314 (2005)
25. Rezakhanlou, F.: A central limit theorem for the asymmetric simple exclusion process. Ann. Inst. H. Poincaré Probab. Statist. 38(4), 437–464 (2002)
26. Seppäläinen, T.: A microscopic model for the Burgers equation and longest increasing subsequences. Electron. J. Probab. 1, no. 5, approx. 51 pp. (1996) (electronic)
27. Seppäläinen, T.: Exact limiting shape for a simplified model of first-passage percolation on the plane. Ann. Probab. 26(3), 1232–1250 (1998)
28. Seppäläinen, T.: Large deviations for increasing sequences on the plane. Probab. Th. Rel. Fields 112(2), 221–244 (1998)
29. Seppäläinen, T.: Diffusive fluctuations for one-dimensional totally asymmetric interacting random dynamics. Commun. Math. Phys. 229(1), 141–182 (2002)
30. Seppäläinen, T.: Second-order fluctuations and current across characteristic for a one-dimensional growth model of independent random walks. Ann. Probab. 33(2), 759–797 (2005)
31. Spitzer, F.: Principles of random walk. New York: Springer-Verlag, 1976
32. Spohn, H.: Large scale dynamics of interacting particles. Berlin: Springer-Verlag, 1991
33. Varadhan, S.R.S.: Lectures on hydrodynamic scaling. In: Hydrodynamic limits and related topics (Toronto, ON, 1998), Volume 27 of Fields Inst. Commun., Providence, RI: Amer. Math. Soc., 2000, pp. 3–40
34. Walsh, J.B.: An introduction to stochastic partial differential equations. In: École d'été de probabilités de Saint-Flour, XIV—1984, Volume 1180 of Lecture Notes in Math., Berlin: Springer, 1986, pp. 265–439

Communicated by H. Spohn
Commun. Math. Phys. 266, 547–569 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0061-x
Incompressible and Compressible Limits of Coupled Systems of Nonlinear Schrödinger Equations

Tai-Chia Lin1,2, Ping Zhang3

1 National Taiwan University, Department of Mathematics, No. 1, Sec. 4, Roosevelt Road, Taipei, Taiwan 106. E-mail: [email protected]
2 National Center of Theoretical Sciences, National Tsing Hua University, Hsinchu, Taiwan
3 Academy of Mathematics & Systems Science, CAS Beijing 100080, P.R. China. E-mail: [email protected]

Received: 12 September 2005 / Accepted: 12 February 2006
Published online: 29 June 2006 – © Springer-Verlag 2006
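Abstract: Recently, coupled systems of nonlinear Schrödinger equations have been used extensively to describe a double condensate, i.e. a binary mixture of Bose-Einstein condensates. In a double condensate, an interface and shock waves may occur due to large intraspecies and interspecies scattering lengths. To know the dynamics of an interface and assure the existence of shock waves in a double condensate, we study the incompressible and the compressible limits respectively of two coupled systems of nonlinear Schrödinger equations. The main idea of our arguments is to define an "H-functional" like a Lyapunov functional which can control the propagation of densities and linear momenta. Such an idea is different from the one using the standard Wigner transform to investigate the incompressible and the compressible limits of a single nonlinear Schrödinger equation.

1. Introduction

Here we study the incompressible and the compressible (semiclassical) limits respectively of two coupled systems of nonlinear Schrödinger equations called Gross-Pitaevskii equations given by
\[ \begin{cases}
i\varepsilon^{\frac12-\alpha}\,\partial_t\psi_1^\varepsilon = -\frac{\varepsilon}{2}\Delta\psi_1^\varepsilon + \frac{1}{\varepsilon}\bigl(|\psi_1^\varepsilon|^2 - 1\bigr)\psi_1^\varepsilon + \frac{\beta}{\varepsilon}|\psi_2^\varepsilon|^2\psi_1^\varepsilon, \\[2pt]
i\varepsilon^{\frac12-\alpha}\,\partial_t\psi_2^\varepsilon = -\frac{\varepsilon}{2}\Delta\psi_2^\varepsilon + \frac{1}{\varepsilon}\bigl(|\psi_2^\varepsilon|^2 - 1\bigr)\psi_2^\varepsilon + \frac{\beta}{\varepsilon}|\psi_1^\varepsilon|^2\psi_2^\varepsilon, \quad x \in \Omega,\ t > 0,
\end{cases} \tag{1.1} \]
and
\[ \begin{cases}
i\varepsilon^{\frac12}\,\partial_t\psi_1^\varepsilon = -\frac{\varepsilon}{2}\Delta\psi_1^\varepsilon + |\psi_1^\varepsilon|^2\psi_1^\varepsilon + \gamma|\psi_2^\varepsilon|^2\psi_1^\varepsilon, \\[2pt]
i\varepsilon^{\frac12}\,\partial_t\psi_2^\varepsilon = -\frac{\varepsilon}{2}\Delta\psi_2^\varepsilon + |\psi_2^\varepsilon|^2\psi_2^\varepsilon + \gamma|\psi_1^\varepsilon|^2\psi_2^\varepsilon, \quad x \in \Omega,\ t > 0,
\end{cases} \tag{1.2} \]
where $\psi_j^\varepsilon = \psi_j^\varepsilon(x,t) \in \mathbb{C}$, $j = 1,2$, $0 < \varepsilon \ll 1$ is a small parameter, and $0 < \alpha < \frac16$, $\beta > 1$ and $\gamma \ge 1$ are constants. Hereafter, the domain $\Omega$ is a bounded smooth domain in $\mathbb{R}^2$, and the associated boundary conditions are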
(B) Neumann boundary conditions, i.e. $\dfrac{\partial\psi_j^\varepsilon}{\partial\vec n}\Big|_{\partial\Omega} = 0$ for $t > 0$, $j = 1,2$.

The systems (1.1) and (1.2) may come from the following system:
\[ \begin{cases}
i\hbar\,\partial_t\psi_1 = -\frac{\hbar^2}{2m}\Delta\psi_1 + U_{11}|\psi_1|^2\psi_1 + U_{12}|\psi_2|^2\psi_1, \\[2pt]
i\hbar\,\partial_t\psi_2 = -\frac{\hbar^2}{2m}\Delta\psi_2 + U_{22}|\psi_2|^2\psi_2 + U_{12}|\psi_1|^2\psi_2, \quad x \in \Omega,\ t > 0,
\end{cases} \tag{1.3} \]
which models a binary mixture of Bose-Einstein condensates called a double condensate without the effect of trap potentials. To investigate the superfluidity of a double condensate, trap potentials should be switched off so that the condensates may expand freely. Physically, $\hbar$ is the Planck constant divided by $2\pi$, $m$ is atom mass, $U_{jj} \sim N a_{jj}$, $j = 1,2$, and $U_{12} \sim N a_{12}$, where $a_{jj}$ is the intraspecies scattering length of the $j$th hyperfine state, $j = 1,2$, $a_{12}$ is the interspecies scattering length, and $N$ is the number of condensate atoms. The system (1.3) can be transformed into the system (1.2) by setting $U_{11} = U_{22} = \varepsilon^{-1}\frac{\hbar^2}{m}$, $U_{12} = \gamma\varepsilon^{-1}\frac{\hbar^2}{m}$, and a suitable time scaling. On the other hand, let $\Psi_j = e^{i\mu_j t/\hbar}\psi_j$, $j = 1,2$, where $\mu_j \sim U_{jj}$ is the chemical potential of the corresponding component. Then the system (1.3) becomes
\[ \begin{cases}
i\hbar\,\partial_t\Psi_1 = -\frac{\hbar^2}{2m}\Delta\Psi_1 - \mu_1\Psi_1 + U_{11}|\Psi_1|^2\Psi_1 + U_{12}|\Psi_2|^2\Psi_1, \\[2pt]
i\hbar\,\partial_t\Psi_2 = -\frac{\hbar^2}{2m}\Delta\Psi_2 - \mu_2\Psi_2 + U_{22}|\Psi_2|^2\Psi_2 + U_{12}|\Psi_1|^2\Psi_2, \quad x \in \Omega,\ t > 0,
\end{cases} \tag{1.4} \]
which can be transformed into the system (1.1) by setting $U_{jj} = \mu_j = \varepsilon^{-2}\frac{\hbar^2}{m}$, $j = 1,2$, $U_{12} = \beta\varepsilon^{-2}\frac{\hbar^2}{m}$, and another proper time scaling.

Bose-Einstein condensates are composed of ultracold dilute Bose gases which may be influenced by boundary conditions. The conventional boundary conditions of the condensates are zero Dirichlet boundary conditions (cf. [8]) if the domain $\Omega$ is regarded as a region where a double condensate dwells. However, by [6], a zero Dirichlet boundary condition may enhance specific heat more than a Neumann boundary condition does. This may result in more thermal fluctuations which may hinder the expansion of the condensates. To reduce the thermal effect of boundary conditions, we may use Neumann boundary conditions, i.e. (B), instead of zero Dirichlet boundary conditions.

Shock waves may occur in a single Bose-Einstein condensate having big initial inhomogeneity of density and strongly repulsive interaction between atoms. Since the sound velocity in Bose-Einstein condensates is proportional to the square root of the density (cf. [18]), the inhomogeneity of density may result in severe compression and wave breaking. Such wave breaking may cause shock waves as in classical compressible gas dynamics if the quantum pressure is negligible at the initial stages of evolution and the hydrodynamical approach holds. By rapidly increasing the s-wave scattering length using Feshbach resonance, one may ignore the quantum pressure and obtain shock waves in a single Bose-Einstein condensate (cf. [17]). For a double condensate, it would be natural to believe that the quantum pressure of each component can be ignored, the hydrodynamical approach holds, and shock waves in a double condensate may occur if the scattering lengths $a_{ij}$'s (i.e. $U_{ij}$'s) go to infinity. Consequently, we may set $U_{11} = U_{22} = \varepsilon^{-1}\frac{\hbar^2}{m}$, $U_{12} = \gamma\varepsilon^{-1}\frac{\hbar^2}{m}$ and $\varepsilon \to 0+$ in the system (1.3) to get the system (1.2), and study the compressible limit of the system (1.2) to find the compressibility of a double condensate and assure the existence of shock waves in a double condensate.
An interface called domain wall may be formed between two components of a double condensate when they have strongly repulsive interactions on each other. As $U_{12} > \sqrt{U_{11}U_{22}}$ and $U_{jj} > 0$, $j = 1,2$, spontaneous symmetry breaking occurs, and the two components are immiscible and separated in space by an interface, a phenomenon called phase separation ([1, 5, 9 and 24]). For the existence of an interface in a double condensate, we may set $U_{jj} = \mu_j = \varepsilon^{-2}\frac{\hbar^2}{m}$, $j = 1,2$, $U_{12} = \beta\varepsilon^{-2}\frac{\hbar^2}{m}$ and $\varepsilon \to 0+$ in the system (1.4), and transform it into the system (1.1). Actually, such an interface may have motion affected by the superfluid current of a double condensate. To understand the dynamics of an interface in a double condensate, we study the incompressible limit of the system (1.1) and set the constant $\beta > 1$ for spontaneous symmetry breaking.

The time scale of the system (1.2) is $\varepsilon^{\frac12}$ which is standard for the compressible (semiclassical) limit of nonlinear Schrödinger equations (cf. [7, 13, 25]). However, the time scale of the system (1.1) is $\varepsilon^{\frac12-\alpha}$ which is unconventional for the incompressible limit problem (cf. [19]). Since the interface energy is not negligible, we need such a time scale to deal with the effect of interface energy. One may decompose such a time scale as $\varepsilon^{\frac12}$ and $\varepsilon^{-\alpha}$, where $\varepsilon^{\frac12}$ is for the compressible limit to get compressible fluid equations, and $\varepsilon^{-\alpha}$ is for the incompressible limit of these compressible fluid equations. To explain this, we may set $(\psi_1^\varepsilon, \psi_2^\varepsilon)$, $\psi_k^\varepsilon = \sqrt{\rho_k^\varepsilon}\,e^{iS_k^\varepsilon/\varepsilon^{1/2}}$, $k = 1,2$, as a solution of the system (1.1) and formally derive compressible fluid equations with a large parameter $\varepsilon^{-\alpha}$ for the time scale. This limit process corresponds to the zero Mach limit of compressible Euler equations to incompressible Euler equations (see [15], (2.69) on p. 52).

For initial conditions of (1.1), we set $\psi_k^\varepsilon|_{t=0} = \Psi_k^\varepsilon \in H^4(\Omega;\mathbb{C})$, $k = 1,2$, satisfying as $\varepsilon \downarrow 0$,
\[ E^\varepsilon(\Psi_1^\varepsilon, \Psi_2^\varepsilon) \overset{\text{def}}{=} \frac{\varepsilon}{2}\sum_{k=1}^{2}\int_\Omega |\nabla\Psi_k^\varepsilon|^2\,dx + \frac{1}{2\varepsilon}\int_\Omega \Bigl[\Bigl(\sum_{k=1}^{2}\phi_k^\varepsilon\Bigr) - 1\Bigr]^2 dx + \frac{\beta - 1}{\varepsilon}\int_\Omega \phi_1^\varepsilon\phi_2^\varepsilon\,dx = O(\varepsilon^{-2\alpha}), \tag{1.5} \]
\[ H_0^\varepsilon \overset{\text{def}}{=} \frac{\varepsilon}{2}\sum_{k=1}^{2}\int_\Omega \bigl|(\nabla - i\varepsilon^{-\frac12-\alpha}v_0)\Psi_k^\varepsilon\bigr|^2\,dx + \frac{1}{2\varepsilon}\int_\Omega \Bigl[\Bigl(\sum_{k=1}^{2}\phi_k^\varepsilon\Bigr) - 1\Bigr]^2 dx + \frac{\beta - 1}{\varepsilon}\int_\Omega \phi_1^\varepsilon\phi_2^\varepsilon\,dx = O(1), \tag{1.6} \]
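where $\phi_k^\varepsilon \overset{\mathrm{def}}{=} |\Phi_k^\varepsilon|^2$, $k = 1, 2$, and $O(1)$ denotes a bounded quantity. Here $v_0 \in H^3(\Omega; \mathbb{R}^2)$ satisfies $\mathrm{div}\,v_0 = 0$ and $v_0\cdot\vec{n}|_{\partial\Omega} = 0$. To match the Neumann boundary conditions (B), we assume that the $\Phi_k^\varepsilon$ are compatible with the Neumann boundary conditions (B) in some appropriate sense.

We may give an example of such $\Phi_k^\varepsilon$'s as follows: Let $\Phi_k^\varepsilon = \sqrt{\phi_k^\varepsilon}\,e^{iS_k/\varepsilon^{\frac12+\alpha}}$, $k = 1, 2$, where $\nabla S_k = v_0$ in $\Omega_k$; $\phi_k^\varepsilon(x) = 1$ if $x \in \Omega_k$ and $\mathrm{dist}(x, \Gamma_0) \ge \varepsilon$; $\phi_k^\varepsilon(x) = 0$ if $x \in \Gamma_0$; and $\phi_k^\varepsilon(x) \in (0, 1)$ with $|\nabla\phi_k^\varepsilon(x)| = O(\frac{1}{\varepsilon})$ if $x \in \Omega_k$ and $\mathrm{dist}(x, \Gamma_0) < \varepsilon$. Here $\Omega_1$ and $\Omega_2$ are two segregated smooth domains with a common boundary $\Gamma_0$, a bounded smooth curve that acts as an interface separating the whole domain $\Omega$ into the two components $\Omega_1$ and $\Omega_2$. Then (1.5) and (1.6) hold.

Now we state the main theorem for incompressible limits of the $\psi_k^\varepsilon$'s as follows:

Theorem 1. Let $(\psi_1^\varepsilon, \psi_2^\varepsilon)$ be the solution of the system (1.1) with the Neumann boundary conditions (B) and initial data $(\Phi_1^\varepsilon, \Phi_2^\varepsilon) \in H^4(\Omega; \mathbb{C}^2)$ satisfying (1.5) and (1.6), where $\Omega$ is a bounded smooth domain in $\mathbb{R}^2$. Let $\rho_k^\varepsilon \overset{\mathrm{def}}{=} |\psi_k^\varepsilon|^2$ and $J_k^\varepsilon \overset{\mathrm{def}}{=} \varepsilon^{\frac12+\alpha}\,\mathrm{Im}(\bar\psi_k^\varepsilon\nabla\psi_k^\varepsilon)$ for $k = 1, 2$. Then for any $T > 0$, there exists $C = C(T, \|v_0\|_{H^3(\Omega)}) > 0$ such that
\[
\int_\Omega \frac{\varepsilon}{2}\sum_{k=1}^{2}\big|\nabla\sqrt{\rho_k^\varepsilon}\big|^2\,dx + \int_\Omega\Big\{\frac{1}{2\varepsilon}\Big[\Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)-1\Big]^2 + \frac{\beta-1}{\varepsilon}\,\rho_1^\varepsilon\rho_2^\varepsilon\Big\}\,dx \le C, \tag{1.7}
\]
and
\[
\sum_{k=1}^{2}\int_\Omega\Big|\frac{1}{\sqrt{\rho_k^\varepsilon}}J_k^\varepsilon - \sqrt{\rho_k^\varepsilon}\,v\Big|^2\,dx \le C\varepsilon^{2\alpha} \quad\text{as } \varepsilon\downarrow 0, \tag{1.8}
\]
for $t \in [0, T]$, where $v \in C([0, \infty); H^3(\Omega))$ is the unique solution of the incompressible Euler equations given by
\[
\begin{cases}
\partial_t v + (v\cdot\nabla)v + \nabla p = 0 & \text{for } x\in\Omega,\ t>0,\\
\mathrm{div}\,v = 0 & \text{for } x\in\Omega,\ t>0,\\
v|_{t=0} = v_0 & \text{for } x\in\Omega,
\end{cases} \tag{1.9}
\]
together with the slip boundary condition $v\cdot\vec{n}|_{\partial\Omega} = 0$.

Remark 1.1. (1) If $v_0 \in H^3(\Omega)$ satisfies $\mathrm{div}\,v_0 = 0$ and some appropriate compatibility conditions with the slip boundary conditions, one can prove the global well-posedness of (1.9) in the class $C([0, \infty); H^3(\Omega))$ (cf. [14] for more details). One may check [13] to see why the Neumann boundary condition for (1.1) corresponds to the slip boundary condition for (1.9).

(2) Since $\beta > 1$, (1.7) gives $\lim_{\varepsilon\downarrow 0}\int_\Omega\rho_1^\varepsilon\rho_2^\varepsilon\,dx = 0$, which signifies the strongly repulsive interaction of the $\psi_k^\varepsilon$'s and results in an interface separating the supports of the $\rho_k^\varepsilon$'s as $\varepsilon$ goes to zero.

(3) From energy conservation (see Sect. 2) and (1.5), $E^\varepsilon(\psi_1^\varepsilon, \psi_2^\varepsilon) = E^\varepsilon(\Phi_1^\varepsilon, \Phi_2^\varepsilon) = O(\varepsilon^{-2\alpha})$ for $t \ge 0$. However, this is not enough to derive the inequality (1.7), which controls the part of the energy coming from the interface. Hence we need to define a function called the "H-functional", given by
\[
H^\varepsilon(t) \overset{\mathrm{def}}{=} \int_\Omega \frac{\varepsilon}{2}\sum_{k=1}^{2}\big|(\nabla - i\varepsilon^{-\frac12-\alpha}v)\psi_k^\varepsilon\big|^2\,dx + \int_\Omega\Big\{\frac{1}{2\varepsilon}\Big[\Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)-1\Big]^2 + \frac{\beta-1}{\varepsilon}\,\rho_1^\varepsilon\rho_2^\varepsilon\Big\}\,dx \tag{1.10}
\]
for $t \ge 0$, where $v$ and the $\rho_k^\varepsilon$'s are defined in Theorem 1. Such a functional, like a Lyapunov function, may control the propagation of densities and linear momenta of the $\psi_k^\varepsilon$'s. Similar approaches are also used in [19] for the incompressible limit of a single Schrödinger-Poisson system in the periodic case, and in [13] for the semiclassical limit of Gross-Pitaevskii equations in the exterior domain. The new difficulties here lie in the coupling of two nonlinear Schrödinger equations, and in the new scaling in (1.1) needed to deal with the interface problem, which differs from the previously known scalings. Notice that here we do not use the standard Wigner transform approach (cf. [25]) either, as it might lead to more complicated situations in the study of the system (1.1).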
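In Sect. 3, we shall prove
\[
H^\varepsilon(t) \le e^{C\|\nabla v\|_{L^\infty([0,T]\times\Omega)}T}\big(H_0^\varepsilon + C\varepsilon^{\frac12-3\alpha}\big) \quad\text{for } T>0 \text{ and } t\in[0,T]. \tag{1.11}
\]
Furthermore, Theorem 1 implies

Corollary 1. Under the same assumptions as Theorem 1, for any $T > 0$ and $k = 1, 2$,
\[
\|J_k^\varepsilon - \rho_k^\varepsilon v\|_{L^{4/3}(\Omega)} \le C_0\,\varepsilon^\alpha, \tag{1.12}
\]
where $C_0$ is a positive constant depending on $T$, $v_0$ and $|\Omega|$.

Generally, the interface dynamics of a double condensate may depend on the associated superfluid currents. By Corollary 1, the superfluid currents of a double condensate are governed by the incompressible Euler equation. Thus it would be expected that the motion of an interface can be described by a particle trajectory equation given by
\[
\begin{cases}
\dfrac{dX}{dt} = v(X(t), t), & t > 0,\\[4pt]
X(0) = w \in \Gamma_0,
\end{cases} \tag{1.13}
\]
where $v$ is the solution of (1.9) and $\Gamma_0$ is the interface of the initial data $\Phi_k^\varepsilon$'s.

For initial data of (1.2), we set $\psi_k^\varepsilon|_{t=0} = F_k^\varepsilon \in H^4(\Omega; \mathbb{C})$, $k = 1, 2$, satisfying as $\varepsilon \downarrow 0$,
\[
\tilde E^\varepsilon(F_1^\varepsilon, F_2^\varepsilon) \overset{\mathrm{def}}{=} \int_\Omega \frac{\varepsilon}{2}\sum_{k=1}^{2}|\nabla F_k^\varepsilon|^2\,dx + \int_\Omega\Big\{\frac12\Big(\sum_{k=1}^{2}|f_k^\varepsilon|^2\Big) + \gamma f_1^\varepsilon f_2^\varepsilon\Big\}\,dx = O(1), \tag{1.14}
\]
\[
\tilde H_0^\varepsilon \overset{\mathrm{def}}{=} \int_\Omega \frac{\varepsilon}{2}\sum_{k=1}^{2}\big|(\nabla - i\varepsilon^{-\frac12}u_0)F_k^\varepsilon\big|^2\,dx + \int_\Omega\Big[\frac12\Big|\Big(\sum_{k=1}^{2}f_k^\varepsilon\Big)-\rho_0\Big|^2 + (\gamma-1)f_1^\varepsilon f_2^\varepsilon\Big]\,dx = o_\varepsilon(1), \tag{1.15}
\]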
where $f_k^\varepsilon = |F_k^\varepsilon|^2$, $k = 1, 2$, and $o_\varepsilon(1)$ is a small quantity tending to zero as $\varepsilon$ goes to zero. Here $0 < \rho_0 \in H^3(\Omega; \mathbb{R})$ and $u_0 \in H^3(\Omega; \mathbb{R}^2)$ satisfies $u_0\cdot\vec{n}|_{\partial\Omega} = 0$. We also assume that the $F_k^\varepsilon$ are compatible with the Neumann boundary conditions (B).

We may give an example of such $F_k^\varepsilon$ as follows: Let $F_k^\varepsilon = \sqrt{f_k}\,e^{i\theta_k/\varepsilon^{1/2}}$, $k = 1, 2$, where $f_k \in C_0^\infty(\Omega; \mathbb{R})$ with support in $\Omega_k$, $\nabla\theta_k = u_0$ in $\Omega_k$, and the $\Omega_k$'s are two disjoint bounded smooth domains. Then (1.14) and (1.15) hold.

A theorem for compressible limits of the $\psi_k^\varepsilon$'s is given by

Theorem 2. Let $(\psi_1^\varepsilon, \psi_2^\varepsilon)$ be the solution of the system (1.2) with the Neumann boundary conditions (B) and initial data $(F_1^\varepsilon, F_2^\varepsilon) \in H^4(\Omega; \mathbb{C}^2)$ satisfying (1.14) and (1.15), where $\Omega$ is a bounded smooth domain in $\mathbb{R}^2$. Let $\rho_k^\varepsilon \overset{\mathrm{def}}{=} |\psi_k^\varepsilon|^2$ and $J_k^\varepsilon \overset{\mathrm{def}}{=} \varepsilon^{\frac12}\,\mathrm{Im}(\bar\psi_k^\varepsilon\nabla\psi_k^\varepsilon)$ for $k = 1, 2$. Then there exists $T_* > 0$ such that
\[
\|(\rho_1^\varepsilon + \rho_2^\varepsilon - \rho)(\cdot, t)\|_{L^2(\Omega)} \to 0, \tag{1.16}
\]
and
\[
\|(J_1^\varepsilon + J_2^\varepsilon - \rho u)(\cdot, t)\|_{L^{4/3}(\Omega)} \to 0 \quad\text{for } t\in[0, T_*) \text{ as } \varepsilon\downarrow 0, \tag{1.17}
\]
where $(\rho, u) \in C([0, T_*); H^3(\Omega))$ is the unique solution of the compressible Euler equations given by
\[
\begin{cases}
\partial_t\rho + \mathrm{div}(\rho u) = 0,\\
\partial_t u + (u\cdot\nabla)u + \nabla\rho = 0 & \text{for } x\in\Omega,\ t\in(0, T_*),\\
\rho|_{t=0} = \rho_0(x),\quad u|_{t=0} = u_0(x) & \text{for } x\in\Omega,
\end{cases} \tag{1.18}
\]
together with the slip boundary condition $u\cdot\vec{n}|_{\partial\Omega} = 0$. Here $T_*$ is the maximal time of existence of the regular solution of (1.18).

Remark 1.2. (1) Under the assumptions that $0 < \rho_0 \in H^3(\Omega; \mathbb{R})$ and $u_0 \in H^3(\Omega; \mathbb{R}^2)$, Beirão da Veiga [2, 3] proved the local well-posedness of (1.18) under appropriate compatibility conditions for $(\rho_0(x), u_0(x))$.

(2) The assumption of space dimension two guarantees that (1.2) has a unique global solution for smooth enough initial data (cf. [4]). A similar comment applies to Theorem 1 as well. When the spatial dimension is greater than two, the global well-posedness of the initial-boundary value problem (1.2) is still open. However, when $\Omega = \mathbb{R}^d$ and $d = 2, 3$, to guarantee the local well-posedness of (1.18), one may need assumptions like $F_k^\varepsilon \to e^{i\theta/\varepsilon}$ at infinity so that $\rho_0(x) \ge c > 0$ in the whole space. Then, as in [13], one can prove the global well-posedness of (1.2), and therefore one may obtain a result similar to Theorem 2.

(3) It is remarkable that the solution of (1.18) is also a solution of the isentropic compressible Euler equations given by
\[
\begin{cases}
\partial_t\rho + \mathrm{div}(\rho u) = 0,\\
\partial_t(\rho u) + \mathrm{div}(\rho u\otimes u) + \frac12\nabla\rho^2 = 0 & \text{for } x\in\Omega,\ t\in(0, T_*),
\end{cases} \tag{1.19}
\]
which may generically form a shock wave in finite time (cf. [21]). Hence Theorem 2 shows the compressibility of a double condensate, which may result in the appearance of shock waves.

The main difficulty of Theorem 2 is the strong competition between $\rho_1^\varepsilon$ and $\rho_2^\varepsilon$, so the hydrodynamical approach may not hold for $\rho_1^\varepsilon$ and $\rho_2^\varepsilon$ individually. To overcome this difficulty, we add $\rho_1^\varepsilon$ and $\rho_2^\varepsilon$ together and regard $\rho_1^\varepsilon + \rho_2^\varepsilon$ as the total density of the classical compressible gas when $\varepsilon$ tends to zero.

The rest of this paper is organized as follows: In Sect. 2, we introduce conservation laws of mass, energy and linear momentum as our basic tools to prove Theorems 1 and 2. Then we give proofs of Theorems 1 and 2 in Sects. 3 and 4, respectively. For the global existence of solutions of the systems (1.1) and (1.2), one may refer to Sect. 5, the Appendix, for details.

2. Conservation Laws

For a single nonlinear Schrödinger equation (NLS), conservation laws, especially the modified Madelung fluid dynamic equations (e.g. (2.6), (2.7), (2.29), (2.30)), are very useful for investigating vortex dynamics (cf. [11, 12]), blow-up and wave collapse (cf. [23]) and semiclassical limits (cf. [13, 19, 25]). Due to the special form of a single NLS, one may derive three conservation laws: conservation of energy, mass and linear momentum. However, for general coupled systems of NLS, conservation laws would be more complicated and difficult to use. Since the systems (1.1) and (1.2) have specific coupling terms which provide some symmetry properties, the associated conservation laws have forms suitable for studying such systems. In this section, we prove conservation of energy, mass and linear momentum for the systems (1.1) and (1.2), respectively.

Let us first notice that, given $\psi_k^\varepsilon(0, x) \in H^4(\Omega)$ for $k = 1, 2$, it follows from the result in the Appendix that both (1.1) and (1.2) have a unique solution in the class $C([0, \infty); H^2(\Omega))$. Then it is standard (cf. [13] for the exterior domain case) to improve $\psi_k^\varepsilon$, $k = 1, 2$, to the class $C([0, \infty); H^4(\Omega))$. Now for the system (1.1), we define its energy, mass density and linear momentum density as follows:
\[
e_\varepsilon \overset{\mathrm{def}}{=} E^\varepsilon(\psi_1^\varepsilon, \psi_2^\varepsilon), \tag{2.1}
\]
\[
\rho_k^\varepsilon \overset{\mathrm{def}}{=} |\psi_k^\varepsilon|^2, \quad k = 1, 2, \tag{2.2}
\]
\[
J_k^\varepsilon \overset{\mathrm{def}}{=} (J_{k,1}^\varepsilon, J_{k,2}^\varepsilon), \tag{2.3}
\]
\[
J_{k,j}^\varepsilon \overset{\mathrm{def}}{=} \varepsilon^{\frac12+\alpha}\,\mathrm{Im}(\bar\psi_k^\varepsilon\,\partial_j\psi_k^\varepsilon) \quad\text{for } j, k = 1, 2, \tag{2.4}
\]
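where $\partial_j \equiv \partial_{x_j}$, $\bar\psi_k^\varepsilon$ is the complex conjugate of $\psi_k^\varepsilon$, and $E^\varepsilon(\cdot, \cdot)$ is defined in (1.5). Then we have

Lemma 1. (i) Conservation of energy:
\[
\frac{d}{dt}e_\varepsilon(t) = 0, \quad \forall t > 0, \tag{2.5}
\]
(ii) Conservation of mass:
\[
\partial_t\rho_k^\varepsilon + \mathrm{div}\,J_k^\varepsilon = 0 \quad\text{for } x\in\Omega,\ t>0,\ k = 1, 2, \tag{2.6}
\]
(iii) Conservation of linear momentum:
\[
\varepsilon^{-2\alpha}\partial_t\sum_{k=1}^{2}J_{k,j}^\varepsilon + \frac{\varepsilon}{4}\sum_{k,l=1}^{2}\partial_l\big[4\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\bar\psi_k^\varepsilon) - \partial_l\partial_j(|\psi_k^\varepsilon|^2)\big] + \frac{1}{2\varepsilon}\partial_j\sum_{k=1}^{2}(\rho_k^\varepsilon)^2 + \frac{\beta}{\varepsilon}\partial_j(\rho_1^\varepsilon\rho_2^\varepsilon) = 0 \tag{2.7}
\]
for $x \in \Omega$, $t > 0$, $j = 1, 2$.

Proof. We multiply the equation of $\psi_1^\varepsilon$ in (1.1) by $\partial_t\bar\psi_1^\varepsilon$ (the complex conjugate of $\partial_t\psi_1^\varepsilon$) and integrate over $\Omega$. Then it is easy to check that
\[
i\varepsilon^{\frac12-\alpha}\int_\Omega|\partial_t\psi_1^\varepsilon|^2\,dx = \frac{\varepsilon}{2}\sum_{j=1}^{2}\int_\Omega\partial_j\psi_1^\varepsilon\,\partial_t\partial_j\bar\psi_1^\varepsilon\,dx + \frac{1}{\varepsilon}\int_\Omega(|\psi_1^\varepsilon|^2-1)\psi_1^\varepsilon\,\partial_t\bar\psi_1^\varepsilon\,dx + \frac{\beta}{\varepsilon}\int_\Omega|\psi_2^\varepsilon|^2\psi_1^\varepsilon\,\partial_t\bar\psi_1^\varepsilon\,dx. \tag{2.8}
\]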
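Here we have used the Neumann boundary conditions (B) and integration by parts. Taking the complex conjugate of (2.8), we have
\[
-i\varepsilon^{\frac12-\alpha}\int_\Omega|\partial_t\psi_1^\varepsilon|^2\,dx = \frac{\varepsilon}{2}\sum_{j=1}^{2}\int_\Omega\partial_j\bar\psi_1^\varepsilon\,\partial_t\partial_j\psi_1^\varepsilon\,dx + \frac{1}{\varepsilon}\int_\Omega(|\psi_1^\varepsilon|^2-1)\bar\psi_1^\varepsilon\,\partial_t\psi_1^\varepsilon\,dx + \frac{\beta}{\varepsilon}\int_\Omega|\psi_2^\varepsilon|^2\bar\psi_1^\varepsilon\,\partial_t\psi_1^\varepsilon\,dx. \tag{2.9}
\]
Hence by adding (2.8) and (2.9) together,
\[
\frac{\varepsilon}{2}\frac{d}{dt}\int_\Omega|\nabla\psi_1^\varepsilon|^2\,dx + \frac{1}{\varepsilon}\int_\Omega(|\psi_1^\varepsilon|^2+|\psi_2^\varepsilon|^2-1)\,\partial_t|\psi_1^\varepsilon|^2\,dx + \frac{\beta-1}{\varepsilon}\int_\Omega|\psi_2^\varepsilon|^2\,\partial_t|\psi_1^\varepsilon|^2\,dx = 0. \tag{2.10}
\]
Similarly, we may use the equation of $\psi_2^\varepsilon$ in (1.1) to obtain
\[
\frac{\varepsilon}{2}\frac{d}{dt}\int_\Omega|\nabla\psi_2^\varepsilon|^2\,dx + \frac{1}{\varepsilon}\int_\Omega(|\psi_1^\varepsilon|^2+|\psi_2^\varepsilon|^2-1)\,\partial_t|\psi_2^\varepsilon|^2\,dx + \frac{\beta-1}{\varepsilon}\int_\Omega|\psi_1^\varepsilon|^2\,\partial_t|\psi_2^\varepsilon|^2\,dx = 0. \tag{2.11}
\]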
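Therefore by adding (2.10) and (2.11), we complete the proof of (2.5). For the proof of (2.6), we multiply the equation of $\psi_1^\varepsilon$ in (1.1) by $\bar\psi_1^\varepsilon$ (the complex conjugate of $\psi_1^\varepsilon$). Then
\[
i\varepsilon^{\frac12-\alpha}(\partial_t\psi_1^\varepsilon)\bar\psi_1^\varepsilon = -\frac{\varepsilon}{2}(\Delta\psi_1^\varepsilon)\bar\psi_1^\varepsilon + \frac{1}{\varepsilon}(|\psi_1^\varepsilon|^2-1)|\psi_1^\varepsilon|^2 + \frac{\beta}{\varepsilon}|\psi_2^\varepsilon|^2|\psi_1^\varepsilon|^2. \tag{2.12}
\]
Taking the complex conjugate of (2.12) and multiplying by $-1$, we have
\[
i\varepsilon^{\frac12-\alpha}(\partial_t\bar\psi_1^\varepsilon)\psi_1^\varepsilon = \frac{\varepsilon}{2}(\Delta\bar\psi_1^\varepsilon)\psi_1^\varepsilon - \frac{1}{\varepsilon}(|\psi_1^\varepsilon|^2-1)|\psi_1^\varepsilon|^2 - \frac{\beta}{\varepsilon}|\psi_1^\varepsilon|^2|\psi_2^\varepsilon|^2. \tag{2.13}
\]
Adding (2.12) and (2.13), we complete the proof of (2.6) for $k = 1$; a similar proof gives (2.6) for $k = 2$. Now we prove (2.7) as follows: Multiply the conjugate equation of $\psi_1^\varepsilon$ in (1.1) by $\partial_j\psi_1^\varepsilon$, and then
\[
i\varepsilon^{\frac12-\alpha}\,\partial_t\bar\psi_1^\varepsilon\,\partial_j\psi_1^\varepsilon = \frac{\varepsilon}{2}(\Delta\bar\psi_1^\varepsilon)\partial_j\psi_1^\varepsilon - \frac{1}{\varepsilon}(|\psi_1^\varepsilon|^2-1)\bar\psi_1^\varepsilon\,\partial_j\psi_1^\varepsilon - \frac{\beta}{\varepsilon}|\psi_2^\varepsilon|^2\bar\psi_1^\varepsilon\,\partial_j\psi_1^\varepsilon. \tag{2.14}
\]
On the other hand, apply $\partial_j$ to the equation of $\psi_1^\varepsilon$ in (1.1), and multiply the resulting equation by $\bar\psi_1^\varepsilon$. Then we have
\[
i\varepsilon^{\frac12-\alpha}(\partial_t\partial_j\psi_1^\varepsilon)\bar\psi_1^\varepsilon = -\frac{\varepsilon}{2}(\Delta\partial_j\psi_1^\varepsilon)\bar\psi_1^\varepsilon + \frac{1}{\varepsilon}|\psi_1^\varepsilon|^2\partial_j(|\psi_1^\varepsilon|^2-1) + \frac{1}{\varepsilon}(|\psi_1^\varepsilon|^2-1)\bar\psi_1^\varepsilon\,\partial_j\psi_1^\varepsilon + \frac{\beta}{\varepsilon}|\psi_1^\varepsilon|^2\partial_j(|\psi_2^\varepsilon|^2) + \frac{\beta}{\varepsilon}|\psi_2^\varepsilon|^2\bar\psi_1^\varepsilon\,\partial_j\psi_1^\varepsilon. \tag{2.15}
\]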
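For the reader's convenience, the step from (2.12) and (2.13) to the mass conservation law (2.6) can be written out explicitly. The lines below are a sketch of this routine computation; they use only the definitions (2.2) and (2.4) and the elementary identity $\mathrm{Im}(\bar\psi\,\Delta\psi) = \mathrm{div}\,\mathrm{Im}(\bar\psi\,\nabla\psi)$.

```latex
% Adding (2.12) and (2.13): the real nonlinear terms cancel.
\[
i\varepsilon^{\frac12-\alpha}\,\partial_t|\psi_1^\varepsilon|^2
= -\frac{\varepsilon}{2}\big[(\Delta\psi_1^\varepsilon)\bar\psi_1^\varepsilon
   - (\Delta\bar\psi_1^\varepsilon)\psi_1^\varepsilon\big]
= -i\varepsilon\,\mathrm{Im}\big(\bar\psi_1^\varepsilon\,\Delta\psi_1^\varepsilon\big)
= -i\varepsilon\,\mathrm{div}\,\mathrm{Im}\big(\bar\psi_1^\varepsilon\,\nabla\psi_1^\varepsilon\big).
\]
% Dividing by i\varepsilon^{1/2-\alpha} and using (2.2) and (2.4):
\[
\partial_t\rho_1^\varepsilon
+ \mathrm{div}\Big(\varepsilon^{\frac12+\alpha}\,
  \mathrm{Im}\big(\bar\psi_1^\varepsilon\,\nabla\psi_1^\varepsilon\big)\Big)
= \partial_t\rho_1^\varepsilon + \mathrm{div}\,J_1^\varepsilon = 0 .
\]
```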
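One may add (2.14) and (2.15) together, and take the real part of the resulting equation to get
\[
\varepsilon^{-2\alpha}\partial_t J_{1,j}^\varepsilon + \frac{\varepsilon}{2}\mathrm{Re}\big[(\Delta\psi_1^\varepsilon)\partial_j\bar\psi_1^\varepsilon - (\Delta\partial_j\psi_1^\varepsilon)\bar\psi_1^\varepsilon\big] + \frac{1}{2\varepsilon}\partial_j(\rho_1^\varepsilon)^2 + \frac{\beta}{\varepsilon}\rho_1^\varepsilon\,\partial_j\rho_2^\varepsilon = 0. \tag{2.16}
\]
To obtain (2.7), we need to prove

Claim 1.
\[
\mathrm{Re}\big[(\Delta\psi_1^\varepsilon)\partial_j\bar\psi_1^\varepsilon - (\Delta\partial_j\psi_1^\varepsilon)\bar\psi_1^\varepsilon\big] = \frac12\sum_{l=1}^{2}\partial_l\big[4\mathrm{Re}(\partial_l\psi_1^\varepsilon\,\partial_j\bar\psi_1^\varepsilon) - \partial_l\partial_j(|\psi_1^\varepsilon|^2)\big].
\]
Proof. It is obvious that
\[
\mathrm{Re}\big[(\Delta\psi_1^\varepsilon)\partial_j\bar\psi_1^\varepsilon - (\Delta\partial_j\psi_1^\varepsilon)\bar\psi_1^\varepsilon\big] = \frac12\Big[\Big(\sum_{l=1}^{2}\partial_l^2\psi_1^\varepsilon\Big)\partial_j\bar\psi_1^\varepsilon + \Big(\sum_{l=1}^{2}\partial_l^2\bar\psi_1^\varepsilon\Big)\partial_j\psi_1^\varepsilon - \Big(\sum_{l=1}^{2}\partial_l^2\partial_j\psi_1^\varepsilon\Big)\bar\psi_1^\varepsilon - \Big(\sum_{l=1}^{2}\partial_l^2\partial_j\bar\psi_1^\varepsilon\Big)\psi_1^\varepsilon\Big]. \tag{2.17}
\]
Besides,
\[
\Big(\sum_{l=1}^{2}\partial_l^2\psi_1^\varepsilon\Big)\partial_j\bar\psi_1^\varepsilon = \sum_{l=1}^{2}\partial_l(\partial_l\psi_1^\varepsilon\,\partial_j\bar\psi_1^\varepsilon) - \sum_{l=1}^{2}(\partial_l\psi_1^\varepsilon)\,\partial_l\partial_j\bar\psi_1^\varepsilon \tag{2.18}
\]
and
\[
\Big(\sum_{l=1}^{2}\partial_l^2\partial_j\psi_1^\varepsilon\Big)\bar\psi_1^\varepsilon = \sum_{l=1}^{2}\partial_l(\bar\psi_1^\varepsilon\,\partial_l\partial_j\psi_1^\varepsilon) - \sum_{l=1}^{2}(\partial_l\bar\psi_1^\varepsilon)\,\partial_l\partial_j\psi_1^\varepsilon. \tag{2.19}
\]
Put (2.18) and (2.19), together with their complex conjugates, into (2.17), and then
\[
\mathrm{Re}\big[(\Delta\psi_1^\varepsilon)\partial_j\bar\psi_1^\varepsilon - (\Delta\partial_j\psi_1^\varepsilon)\bar\psi_1^\varepsilon\big] = \frac12\sum_{l=1}^{2}\partial_l\big(\partial_l\psi_1^\varepsilon\,\partial_j\bar\psi_1^\varepsilon + \partial_l\bar\psi_1^\varepsilon\,\partial_j\psi_1^\varepsilon - \bar\psi_1^\varepsilon\,\partial_l\partial_j\psi_1^\varepsilon - \psi_1^\varepsilon\,\partial_l\partial_j\bar\psi_1^\varepsilon\big). \tag{2.20}
\]
On the other hand,
\[
\partial_l\partial_j(|\psi_1^\varepsilon|^2) = \partial_l\partial_j(\psi_1^\varepsilon\bar\psi_1^\varepsilon) = (\bar\psi_1^\varepsilon\,\partial_l\partial_j\psi_1^\varepsilon + \psi_1^\varepsilon\,\partial_l\partial_j\bar\psi_1^\varepsilon) + (\partial_j\psi_1^\varepsilon\,\partial_l\bar\psi_1^\varepsilon + \partial_l\psi_1^\varepsilon\,\partial_j\bar\psi_1^\varepsilon). \tag{2.21}
\]
Consequently, by (2.20) and (2.21), we complete the proof of Claim 1.

By (2.16) and Claim 1, we obtain
\[
\varepsilon^{-2\alpha}\partial_t J_{1,j}^\varepsilon + \frac{\varepsilon}{4}\sum_{l=1}^{2}\partial_l\big[4\mathrm{Re}(\partial_l\psi_1^\varepsilon\,\partial_j\bar\psi_1^\varepsilon) - \partial_l\partial_j(|\psi_1^\varepsilon|^2)\big] + \frac{1}{2\varepsilon}\partial_j(\rho_1^\varepsilon)^2 + \frac{\beta}{\varepsilon}\rho_1^\varepsilon\,\partial_j\rho_2^\varepsilon = 0. \tag{2.22}
\]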
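Similarly, we may use the equation of $\psi_2^\varepsilon$ in (1.1) to derive
\[
\varepsilon^{-2\alpha}\partial_t J_{2,j}^\varepsilon + \frac{\varepsilon}{4}\sum_{l=1}^{2}\partial_l\big[4\mathrm{Re}(\partial_l\psi_2^\varepsilon\,\partial_j\bar\psi_2^\varepsilon) - \partial_l\partial_j(|\psi_2^\varepsilon|^2)\big] + \frac{1}{2\varepsilon}\partial_j(\rho_2^\varepsilon)^2 + \frac{\beta}{\varepsilon}\rho_2^\varepsilon\,\partial_j\rho_1^\varepsilon = 0. \tag{2.23}
\]
Therefore by adding (2.22) and (2.23), we obtain (2.7) and complete the proof of Lemma 1.

Due to the difference between (1.1) and (1.2), we define another energy, mass density and linear momentum density for the system (1.2) as follows:
\[
\tilde e_\varepsilon \overset{\mathrm{def}}{=} \tilde E^\varepsilon(\psi_1^\varepsilon, \psi_2^\varepsilon), \tag{2.24}
\]
\[
\rho_k^\varepsilon \overset{\mathrm{def}}{=} |\psi_k^\varepsilon|^2, \quad k = 1, 2, \tag{2.25}
\]
\[
J_k^\varepsilon \overset{\mathrm{def}}{=} (J_{k,1}^\varepsilon, J_{k,2}^\varepsilon), \tag{2.26}
\]
\[
J_{k,j}^\varepsilon \overset{\mathrm{def}}{=} \varepsilon^{\frac12}\,\mathrm{Im}(\bar\psi_k^\varepsilon\,\partial_j\psi_k^\varepsilon) \quad\text{for } j, k = 1, 2, \tag{2.27}
\]
where $\tilde E^\varepsilon(\cdot, \cdot)$ is defined in (1.14), and $(\psi_1^\varepsilon, \psi_2^\varepsilon)$ is the solution of (1.2) with the Neumann boundary conditions (B). Then, corresponding to Lemma 1, we may show

Lemma 2. Under the assumptions of Theorem 2, the following hold.
(i) Conservation of energy:
\[
\frac{d}{dt}\tilde e_\varepsilon(t) = 0, \quad \forall t > 0, \tag{2.28}
\]
(ii) Conservation of mass:
\[
\partial_t\rho_k^\varepsilon + \mathrm{div}\,J_k^\varepsilon = 0 \quad\text{for } x\in\Omega,\ t>0,\ k = 1, 2, \tag{2.29}
\]
(iii) Conservation of linear momentum:
\[
\partial_t\sum_{k=1}^{2}J_{k,j}^\varepsilon + \frac{\varepsilon}{4}\sum_{k,l=1}^{2}\partial_l\big[4\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\bar\psi_k^\varepsilon) - \partial_l\partial_j(|\psi_k^\varepsilon|^2)\big] + \frac12\partial_j\sum_{k=1}^{2}(\rho_k^\varepsilon)^2 + \gamma\,\partial_j(\rho_1^\varepsilon\rho_2^\varepsilon) = 0 \tag{2.30}
\]
for $x \in \Omega$, $t > 0$, $j = 1, 2$.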
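3. Proof of Theorem 1

In this section, we prove both Theorem 1 and Corollary 1. For the proof of Theorem 1, we need to define a crucial function by
\[
H^\varepsilon(t) \overset{\mathrm{def}}{=} \int_\Omega \frac{\varepsilon}{2}\sum_{k=1}^{2}\big|(\nabla - i\varepsilon^{-\frac12-\alpha}v)\psi_k^\varepsilon\big|^2\,dx + \int_\Omega\Big\{\frac{1}{2\varepsilon}\Big[\Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)-1\Big]^2 + \frac{\beta-1}{\varepsilon}\,\rho_1^\varepsilon\rho_2^\varepsilon\Big\}\,dx \tag{3.1}
\]
for $t \ge 0$, where $v$ and the $\rho_k^\varepsilon$'s are defined in (1.9) and (2.2), respectively. Then it is easy to check that
\[
H^\varepsilon(t) = \frac{\varepsilon}{2}\sum_{k=1}^{2}\int_\Omega\big|\nabla\sqrt{\rho_k^\varepsilon}\big|^2\,dx + \frac{\varepsilon^{-2\alpha}}{2}\sum_{k=1}^{2}\int_\Omega\Big|\frac{1}{\sqrt{\rho_k^\varepsilon}}J_k^\varepsilon - \sqrt{\rho_k^\varepsilon}\,v\Big|^2\,dx + \int_\Omega\Big\{\frac{1}{2\varepsilon}\Big[\Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)-1\Big]^2 + \frac{\beta-1}{\varepsilon}\,\rho_1^\varepsilon\rho_2^\varepsilon\Big\}\,dx, \tag{3.2}
\]
where the $J_k^\varepsilon$ are defined in (2.3) and (2.4). Moreover, by (1.5) and (2.1), we may rewrite (3.1) as
\[
H^\varepsilon(t) = e_\varepsilon(t) - \varepsilon^{-2\alpha}\int_\Omega v\cdot\sum_{k=1}^{2}J_k^\varepsilon\,dx + \frac{\varepsilon^{-2\alpha}}{2}\int_\Omega|v|^2\sum_{k=1}^{2}\rho_k^\varepsilon\,dx. \tag{3.3}
\]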
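The passage from (3.1) to (3.3) rests on expanding the square in the kinetic term. The following sketch spells out that expansion for real-valued $v$; the abbreviation $a = \varepsilon^{-\frac12-\alpha}$ is introduced here only for readability and is not notation from the paper.

```latex
% With a = \varepsilon^{-1/2-\alpha}, definition (2.4) gives
% a\,\mathrm{Im}(\bar\psi_k^\varepsilon\nabla\psi_k^\varepsilon) = \varepsilon^{-1-2\alpha} J_k^\varepsilon.
\[
\big|(\nabla - i a v)\psi_k^\varepsilon\big|^2
= |\nabla\psi_k^\varepsilon|^2
  - 2a\,v\cdot\mathrm{Im}\big(\bar\psi_k^\varepsilon\,\nabla\psi_k^\varepsilon\big)
  + a^2|v|^2|\psi_k^\varepsilon|^2
= |\nabla\psi_k^\varepsilon|^2
  - 2\varepsilon^{-1-2\alpha}\,v\cdot J_k^\varepsilon
  + \varepsilon^{-1-2\alpha}|v|^2\rho_k^\varepsilon .
\]
% Multiplying by \varepsilon/2:
\[
\frac{\varepsilon}{2}\big|(\nabla - i a v)\psi_k^\varepsilon\big|^2
= \frac{\varepsilon}{2}|\nabla\psi_k^\varepsilon|^2
  - \varepsilon^{-2\alpha}\,v\cdot J_k^\varepsilon
  + \frac{\varepsilon^{-2\alpha}}{2}|v|^2\rho_k^\varepsilon .
\]
% Summing over k, integrating over \Omega, and adding the potential terms,
% which coincide with those of the energy (1.5), yields (3.3).
```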
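Now we compute $\frac{d}{dt}H^\varepsilon(t)$ by using the conservation laws, i.e. Lemma 1. By conservation of energy (2.5) and (3.3),
\[
\frac{d}{dt}H^\varepsilon(t) = -\varepsilon^{-2\alpha}\frac{d}{dt}\int_\Omega v\cdot\sum_{k=1}^{2}J_k^\varepsilon\,dx + \frac{\varepsilon^{-2\alpha}}{2}\frac{d}{dt}\int_\Omega|v|^2\sum_{k=1}^{2}\rho_k^\varepsilon\,dx. \tag{3.4}
\]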
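By conservation of mass (2.6),
\[
\begin{aligned}
\frac12\frac{d}{dt}\int_\Omega|v|^2\sum_{k=1}^{2}\rho_k^\varepsilon\,dx
&= \int_\Omega(v\cdot\partial_t v)\sum_{k=1}^{2}\rho_k^\varepsilon\,dx + \frac12\int_\Omega|v|^2\sum_{k=1}^{2}\partial_t\rho_k^\varepsilon\,dx\\
&= \int_\Omega(v\cdot\partial_t v)\sum_{k=1}^{2}\rho_k^\varepsilon\,dx - \frac12\int_\Omega|v|^2\sum_{k=1}^{2}\mathrm{div}\,J_k^\varepsilon\,dx \quad(\text{by } (2.6))\\
&= \int_\Omega(v\cdot\partial_t v)\sum_{k=1}^{2}\rho_k^\varepsilon\,dx + \frac12\int_\Omega\nabla|v|^2\cdot\sum_{k=1}^{2}J_k^\varepsilon\,dx,
\end{aligned}
\]
i.e.
\[
\frac12\frac{d}{dt}\int_\Omega|v|^2\sum_{k=1}^{2}\rho_k^\varepsilon\,dx = \int_\Omega(v\cdot\partial_t v)\sum_{k=1}^{2}\rho_k^\varepsilon\,dx + \frac12\int_\Omega\nabla|v|^2\cdot\sum_{k=1}^{2}J_k^\varepsilon\,dx. \tag{3.5}
\]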
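Here we have used integration by parts and the Neumann boundary conditions (B) to eliminate the integral along the boundary $\partial\Omega$. By conservation of linear momentum (2.7),
\[
-\varepsilon^{-2\alpha}\int_\Omega v\cdot\sum_{k=1}^{2}\partial_t J_k^\varepsilon\,dx = \sum_{j,k=1}^{2}\int_\Omega v_j\Big\{\frac{\varepsilon}{4}\sum_{l=1}^{2}\partial_l\big[4\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\bar\psi_k^\varepsilon) - \partial_l\partial_j(|\psi_k^\varepsilon|^2)\big] + \frac{1}{2\varepsilon}\partial_j(\rho_k^\varepsilon)^2\Big\}\,dx + \sum_{j=1}^{2}\int_\Omega\frac{\beta}{\varepsilon}v_j\,\partial_j(\rho_1^\varepsilon\rho_2^\varepsilon)\,dx. \tag{3.6}
\]
Using integration by parts and the fact that $v$ is divergence free,
\[
\sum_{j,k=1}^{2}\int_\Omega v_j\sum_{l=1}^{2}\partial_l\big[\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\bar\psi_k^\varepsilon)\big]\,dx = -\sum_{j,k,l=1}^{2}\int_\Omega(\partial_l v_j)\,\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\bar\psi_k^\varepsilon)\,dx, \tag{3.7}
\]
\[
-\sum_{j,k,l=1}^{2}\int_\Omega v_j\,\partial_l^2\partial_j(|\psi_k^\varepsilon|^2)\,dx = \sum_{k,l=1}^{2}\int_\Omega(\mathrm{div}\,v)\,\partial_l^2(|\psi_k^\varepsilon|^2)\,dx = 0, \tag{3.8}
\]
and
\[
\sum_{j=1}^{2}\int_\Omega v_j\Big[\frac{1}{2\varepsilon}\partial_j\Big(\sum_{k=1}^{2}(\rho_k^\varepsilon)^2\Big) + \frac{\beta}{\varepsilon}\partial_j(\rho_1^\varepsilon\rho_2^\varepsilon)\Big]\,dx = -\int_\Omega\Big[\frac{1}{2\varepsilon}\sum_{k=1}^{2}(\rho_k^\varepsilon)^2 + \frac{\beta}{\varepsilon}\rho_1^\varepsilon\rho_2^\varepsilon\Big]\mathrm{div}\,v\,dx = 0. \tag{3.9}
\]
Consequently, by (3.6)-(3.9),
\[
-\varepsilon^{-2\alpha}\int_\Omega v\cdot\sum_{k=1}^{2}\partial_t J_k^\varepsilon\,dx = -\varepsilon\sum_{j,k,l=1}^{2}\int_\Omega(\partial_l v_j)\,\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\bar\psi_k^\varepsilon)\,dx. \tag{3.10}
\]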
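On the other hand, (3.4), (3.5) and (3.10) imply
\[
\frac{d}{dt}H^\varepsilon(t) = -\varepsilon\sum_{j,k,l=1}^{2}\int_\Omega(\partial_l v_j)\,\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\bar\psi_k^\varepsilon)\,dx + \varepsilon^{-2\alpha}\sum_{k=1}^{2}\int_\Omega\Big[-(J_k^\varepsilon\cdot\partial_t v) + (v\cdot\partial_t v)\rho_k^\varepsilon + \frac12 J_k^\varepsilon\cdot\nabla|v|^2\Big]\,dx. \tag{3.11}
\]
Put the equation of $v$ (see (1.9)), i.e. $\partial_t v = -(v\cdot\nabla)v - \nabla p$, into (3.11). Then
\[
\begin{aligned}
\frac{d}{dt}H^\varepsilon(t) = {}& -\varepsilon\sum_{j,k,l=1}^{2}\int_\Omega(\partial_l v_j)\,\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\bar\psi_k^\varepsilon)\,dx\\
& + \varepsilon^{-2\alpha}\sum_{k=1}^{2}\int_\Omega\Big\{(J_k^\varepsilon - \rho_k^\varepsilon v)\cdot[(v\cdot\nabla)v] + \frac12 J_k^\varepsilon\cdot\nabla|v|^2\Big\}\,dx\\
& + \varepsilon^{-2\alpha}\sum_{k=1}^{2}\int_\Omega(J_k^\varepsilon - \rho_k^\varepsilon v)\cdot\nabla p\,dx.
\end{aligned} \tag{3.12}
\]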
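To deal with the right side of (3.12), we need

Claim 2.
\[
\begin{aligned}
&-\sum_{j,l=1}^{2}(\partial_l v_j)\,\mathrm{Re}\Big[(\partial_j - i\varepsilon^{-\frac12-\alpha}v_j)\psi_k^\varepsilon\;\overline{(\partial_l - i\varepsilon^{-\frac12-\alpha}v_l)\psi_k^\varepsilon}\Big]\\
&\quad = -\sum_{j,l=1}^{2}(\partial_l v_j)\,\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\bar\psi_k^\varepsilon) + \varepsilon^{-1-2\alpha}\Big\{(J_k^\varepsilon - \rho_k^\varepsilon v)\cdot[(v\cdot\nabla)v] + \frac12 J_k^\varepsilon\cdot\nabla|v|^2\Big\}
\end{aligned}
\]
for $k = 1, 2$.

Proof. The proof of Claim 2 is a simple algebraic calculation:
\[
\begin{aligned}
&\mathrm{Re}\Big[(\partial_j - i\varepsilon^{-\frac12-\alpha}v_j)\psi_k^\varepsilon\;\overline{(\partial_l - i\varepsilon^{-\frac12-\alpha}v_l)\psi_k^\varepsilon}\Big]\\
&\quad = \frac12\Big\{\big[\partial_j\psi_k^\varepsilon\,\partial_l\bar\psi_k^\varepsilon - i\varepsilon^{-\frac12-\alpha}v_j\psi_k^\varepsilon\,\partial_l\bar\psi_k^\varepsilon + i\varepsilon^{-\frac12-\alpha}v_l\bar\psi_k^\varepsilon\,\partial_j\psi_k^\varepsilon + \varepsilon^{-1-2\alpha}v_j v_l|\psi_k^\varepsilon|^2\big]\\
&\qquad\quad + \big[\partial_j\bar\psi_k^\varepsilon\,\partial_l\psi_k^\varepsilon + i\varepsilon^{-\frac12-\alpha}v_j\bar\psi_k^\varepsilon\,\partial_l\psi_k^\varepsilon - i\varepsilon^{-\frac12-\alpha}v_l\psi_k^\varepsilon\,\partial_j\bar\psi_k^\varepsilon + \varepsilon^{-1-2\alpha}v_j v_l|\psi_k^\varepsilon|^2\big]\Big\}\\
&\quad = \mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\bar\psi_k^\varepsilon) - \frac{i}{2}\varepsilon^{-\frac12-\alpha}(\psi_k^\varepsilon\,\partial_l\bar\psi_k^\varepsilon - \bar\psi_k^\varepsilon\,\partial_l\psi_k^\varepsilon)v_j + \frac{i}{2}\varepsilon^{-\frac12-\alpha}(\bar\psi_k^\varepsilon\,\partial_j\psi_k^\varepsilon - \psi_k^\varepsilon\,\partial_j\bar\psi_k^\varepsilon)v_l + \varepsilon^{-1-2\alpha}v_j v_l|\psi_k^\varepsilon|^2\\
&\quad = \mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\bar\psi_k^\varepsilon) - \varepsilon^{-1-2\alpha}\big(v_j J_{k,l}^\varepsilon + v_l J_{k,j}^\varepsilon - v_j v_l\rho_k^\varepsilon\big) \quad(\text{by } (2.2), (2.4))
\end{aligned}
\]
for $j, k, l = 1, 2$. Then taking the sum over $j, l = 1, 2$, we obtain
\[
\begin{aligned}
&-\sum_{j,l=1}^{2}(\partial_l v_j)\,\mathrm{Re}\Big[(\partial_j - i\varepsilon^{-\frac12-\alpha}v_j)\psi_k^\varepsilon\;\overline{(\partial_l - i\varepsilon^{-\frac12-\alpha}v_l)\psi_k^\varepsilon}\Big]\\
&\quad = -\sum_{j,l=1}^{2}(\partial_l v_j)\,\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\bar\psi_k^\varepsilon) + \varepsilon^{-1-2\alpha}\sum_{j,l=1}^{2}(\partial_l v_j)\big(v_j J_{k,l}^\varepsilon + v_l(J_{k,j}^\varepsilon - v_j\rho_k^\varepsilon)\big)\\
&\quad = -\sum_{j,l=1}^{2}(\partial_l v_j)\,\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\bar\psi_k^\varepsilon) + \varepsilon^{-1-2\alpha}\Big\{(J_k^\varepsilon - \rho_k^\varepsilon v)\cdot[(v\cdot\nabla)v] + \frac12 J_k^\varepsilon\cdot\nabla|v|^2\Big\},
\end{aligned}
\]
which completes the proof of Claim 2.

From (3.12) and Claim 2, we obtain
\[
\frac{d}{dt}H^\varepsilon(t) = -\varepsilon\sum_{j,k,l=1}^{2}\int_\Omega(\partial_l v_j)\,\mathrm{Re}\Big[(\partial_j - i\varepsilon^{-\frac12-\alpha}v_j)\psi_k^\varepsilon\;\overline{(\partial_l - i\varepsilon^{-\frac12-\alpha}v_l)\psi_k^\varepsilon}\Big]\,dx + \varepsilon^{-2\alpha}\sum_{k=1}^{2}\int_\Omega(J_k^\varepsilon - \rho_k^\varepsilon v)\cdot\nabla p\,dx. \tag{3.13}
\]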
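Now we want to estimate the second integral on the right side of (3.13). By (1.5) and energy conservation (2.5), $e_\varepsilon(t) = e_\varepsilon(0) = E^\varepsilon(\Phi_1^\varepsilon, \Phi_2^\varepsilon) = O(\varepsilon^{-2\alpha})$, $\forall t > 0$. This implies
\[
\int_\Omega\Big[\Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)-1\Big]^2\,dx = O(\varepsilon^{1-2\alpha}), \quad \forall t > 0. \tag{3.14}
\]
Hence by (3.14),
\[
\sum_{k=1}^{2}\int_\Omega(v\cdot\nabla p)\rho_k^\varepsilon\,dx = \int_\Omega\Big[\Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)-1\Big](v\cdot\nabla p)\,dx + \int_\Omega(v\cdot\nabla p)\,dx = \int_\Omega\Big[\Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)-1\Big](v\cdot\nabla p)\,dx = O(\varepsilon^{\frac12-\alpha})\,\|v\cdot\nabla p\|_{L^2(\Omega)},
\]
i.e.
\[
\sum_{k=1}^{2}\int_\Omega(v\cdot\nabla p)\rho_k^\varepsilon\,dx = O(\varepsilon^{\frac12-\alpha})\,\|v\cdot\nabla p\|_{L^2(\Omega)}. \tag{3.15}
\]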
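Here we have used integration by parts, $v\cdot\vec{n}|_{\partial\Omega} = 0$, the fact that $v$ is divergence free, and the Hölder inequality. Furthermore, by (2.6), (3.14) and the Hölder inequality,
\[
\begin{aligned}
\sum_{k=1}^{2}\int_\Omega J_k^\varepsilon\cdot\nabla p\,dx
&= -\sum_{k=1}^{2}\int_\Omega p\,\mathrm{div}\,J_k^\varepsilon\,dx \quad(\text{integration by parts and using (B)})\\
&= \sum_{k=1}^{2}\int_\Omega p\,\partial_t\rho_k^\varepsilon\,dx \quad(\text{by } (2.6))\\
&= \frac{d}{dt}\int_\Omega\Big[\Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)-1\Big]p\,dx - \int_\Omega\Big[\Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)-1\Big]\partial_t p\,dx\\
&= \frac{d}{dt}\int_\Omega\Big[\Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)-1\Big]p\,dx + O(\varepsilon^{\frac12-\alpha})\,\|\partial_t p\|_{L^2(\Omega)} \quad(\text{by (3.14) and the Hölder inequality}),
\end{aligned}
\]
i.e.
\[
\sum_{k=1}^{2}\int_\Omega J_k^\varepsilon\cdot\nabla p\,dx = \frac{d}{dt}\int_\Omega\Big[\Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)-1\Big]p\,dx + O(\varepsilon^{\frac12-\alpha})\,\|\partial_t p\|_{L^2(\Omega)}. \tag{3.16}
\]
Thus by (3.13), (3.15) and (3.16), we obtain
\[
\frac{d}{dt}\Big\{H^\varepsilon(t) - \varepsilon^{-2\alpha}\int_\Omega\Big[\Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)-1\Big]p\,dx\Big\} \le C\|\nabla v\|_{L^\infty(\Omega)}H^\varepsilon(t) + O(\varepsilon^{\frac12-3\alpha})\big(\|v\cdot\nabla p\|_{L^2(\Omega)} + \|\partial_t p\|_{L^2(\Omega)}\big). \tag{3.17}
\]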
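Again by (3.14) and the Hölder inequality,
\[
\varepsilon^{-2\alpha}\int_\Omega\Big[\Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)-1\Big]p\,dx = O(\varepsilon^{\frac12-3\alpha})\,\|p\|_{L^2(\Omega)}. \tag{3.18}
\]
Therefore by (3.17), (3.18) and the Gronwall inequality,
\[
H^\varepsilon(t) \le e^{C\|\nabla v\|_{L^\infty([0,T]\times\Omega)}T}\Big\{H^\varepsilon(0) + O(\varepsilon^{\frac12-3\alpha})\big(\|p\|_{L^\infty([0,T];L^2(\Omega))} + \|v\cdot\nabla p\|_{L^\infty([0,T];L^2(\Omega))} + \|\partial_t p\|_{L^\infty([0,T];L^2(\Omega))}\big)\Big\} \quad\text{for } 0 < t < T. \tag{3.19}
\]
On the other hand, taking the divergence of the first equation of (1.9) gives
\[
p = \sum_{i,j=1}^{2}\partial_i\partial_j(-\Delta)^{-1}(v_i v_j),
\]
but from (1) of Remark 1.1, $v(t, x) \in L^\infty([0, T]; H^3(\Omega))$; therefore, by the properties of the Riesz transform [22],
\[
\|p\|_{L^\infty([0,T];L^2(\Omega))} \le C\|v\otimes v\|_{L^\infty([0,T];L^2(\Omega))} \le C\|v\|^2_{L^\infty([0,T];H^2(\Omega))}.
\]
Similarly,
\[
\|\nabla p\|_{L^\infty([0,T];L^2(\Omega))} \le C\|v\|^2_{L^\infty([0,T];H^3(\Omega))},
\]
and using the first equation of (1.9), we have a similar estimate for $\|\partial_t v\|_{L^\infty([0,T];L^2(\Omega))}$. As a consequence,
\[
\|v\cdot\nabla p\|_{L^\infty([0,T];L^2(\Omega))} + \|\partial_t p\|_{L^\infty([0,T];L^2(\Omega))} \le C\|v\|^3_{L^\infty([0,T];H^3(\Omega))}.
\]
By combining the above inequalities with (3.19) and using the fact that $0 < \alpha < \frac16$, we complete the proof of (1.11) and Theorem 1.

Now we prove Corollary 1 as follows: Using (1.8) and the Hölder inequality, we have
\[
\begin{aligned}
\|J_k^\varepsilon - \rho_k^\varepsilon v\|_{L^{4/3}(\Omega)} &= \Big\|\sqrt{\rho_k^\varepsilon}\Big(\frac{1}{\sqrt{\rho_k^\varepsilon}}J_k^\varepsilon - \sqrt{\rho_k^\varepsilon}\,v\Big)\Big\|_{L^{4/3}(\Omega)}\\
&\le \big\|\sqrt{\rho_k^\varepsilon}\big\|_{L^4(\Omega)}\Big\|\frac{1}{\sqrt{\rho_k^\varepsilon}}J_k^\varepsilon - \sqrt{\rho_k^\varepsilon}\,v\Big\|_{L^2(\Omega)} \quad(\text{by the Hölder inequality})\\
&= O(\varepsilon^\alpha)\,\big\|\sqrt{\rho_k^\varepsilon}\big\|_{L^4(\Omega)} \quad(\text{by } (1.8)).
\end{aligned} \tag{3.20}
\]
By (1.7), it is obvious that
\[
\begin{aligned}
\big\|\sqrt{\rho_k^\varepsilon}\big\|_{L^4(\Omega)} = \|\rho_k^\varepsilon\|^{1/2}_{L^2(\Omega)} &\le \Big\|\sum_{k=1}^{2}\rho_k^\varepsilon\Big\|^{1/2}_{L^2(\Omega)} \quad(\because\ \rho_k^\varepsilon \ge 0)\\
&\le \Big[\Big\|\Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)-1\Big\|_{L^2(\Omega)} + |\Omega|^{1/2}\Big]^{1/2} \quad(\text{by the triangle inequality})\\
&= \big[|\Omega|^{1/2} + O(\sqrt{\varepsilon})\big]^{1/2} \quad(\text{by } (1.7)),
\end{aligned}
\]
i.e.
\[
\big\|\sqrt{\rho_k^\varepsilon}\big\|_{L^4(\Omega)} \le 2|\Omega|^{1/4} \quad\text{for } k = 1, 2, \tag{3.21}
\]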
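for $\varepsilon$ sufficiently small. Combining (3.20) and (3.21), we obtain (1.12), which completes the proof of Corollary 1.

4. Proof of Theorem 2

In this section, we prove Theorem 2. Basically, the idea of the proof of Theorem 2 is similar to that of Theorem 1. Corresponding to (3.1) in the proof of Theorem 1, we define another functional by
\[
\tilde H^\varepsilon(t) \overset{\mathrm{def}}{=} \int_\Omega \frac{\varepsilon}{2}\sum_{k=1}^{2}\big|(\nabla - i\varepsilon^{-\frac12}u)\psi_k^\varepsilon\big|^2\,dx + \int_\Omega\Big\{\frac12\Big|\Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)-\rho\Big|^2 + (\gamma-1)\rho_1^\varepsilon\rho_2^\varepsilon\Big\}\,dx \tag{4.1}
\]
for $t \ge 0$, where $(\rho, u)$ and the $\rho_k^\varepsilon$'s are defined in (1.18) and (2.25), respectively. Then it is obvious that
\[
\tilde H^\varepsilon(t) = \frac{\varepsilon}{2}\sum_{k=1}^{2}\int_\Omega\big|\nabla\sqrt{\rho_k^\varepsilon}\big|^2\,dx + \frac12\sum_{k=1}^{2}\int_\Omega\Big|\frac{1}{\sqrt{\rho_k^\varepsilon}}J_k^\varepsilon - \sqrt{\rho_k^\varepsilon}\,u\Big|^2\,dx + \int_\Omega\Big\{\frac12\Big|\Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)-\rho\Big|^2 + (\gamma-1)\rho_1^\varepsilon\rho_2^\varepsilon\Big\}\,dx, \tag{4.2}
\]
where the $J_k^\varepsilon$'s are defined in (2.26) and (2.27). Furthermore, by (1.14) and (2.24), we may transform (4.1) into
\[
\tilde H^\varepsilon(t) = \tilde e_\varepsilon(t) - \int_\Omega u\cdot\sum_{k=1}^{2}J_k^\varepsilon\,dx + \frac12\int_\Omega|u|^2\sum_{k=1}^{2}\rho_k^\varepsilon\,dx + \int_\Omega\Big[-\Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)\rho + \frac12\rho^2\Big]\,dx. \tag{4.3}
\]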
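We may use the conservation laws, i.e. Lemma 2, to calculate $\frac{d}{dt}\tilde H^\varepsilon(t)$. By conservation of energy (2.28) and (4.3), we obtain
\[
\frac{d}{dt}\tilde H^\varepsilon(t) = -\frac{d}{dt}\int_\Omega u\cdot\sum_{k=1}^{2}J_k^\varepsilon\,dx + \frac12\frac{d}{dt}\int_\Omega|u|^2\sum_{k=1}^{2}\rho_k^\varepsilon\,dx + \frac{d}{dt}\int_\Omega\Big[-\Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)\rho + \frac12\rho^2\Big]\,dx. \tag{4.4}
\]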
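By conservation of mass (2.29), as for (3.5), we have
\[
\frac12\frac{d}{dt}\int_\Omega|u|^2\sum_{k=1}^{2}\rho_k^\varepsilon\,dx = \int_\Omega(u\cdot\partial_t u)\sum_{k=1}^{2}\rho_k^\varepsilon\,dx + \frac12\int_\Omega\nabla|u|^2\cdot\sum_{k=1}^{2}J_k^\varepsilon\,dx. \tag{4.5}
\]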
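Besides,
\[
\begin{aligned}
\frac{d}{dt}\int_\Omega\Big[-\Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)\rho + \frac12\rho^2\Big]\,dx
&= \int_\Omega\Big[-\Big(\sum_{k=1}^{2}\partial_t\rho_k^\varepsilon\Big)\rho - \Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)\partial_t\rho + \rho\,\partial_t\rho\Big]\,dx\\
&= \int_\Omega\Big[\Big(\sum_{k=1}^{2}\mathrm{div}\,J_k^\varepsilon\Big)\rho - \Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)\partial_t\rho + \rho\,\partial_t\rho\Big]\,dx\\
&= \int_\Omega\Big[-\sum_{k=1}^{2}(J_k^\varepsilon\cdot\nabla\rho) - \Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)\partial_t\rho + \rho\,\partial_t\rho\Big]\,dx,
\end{aligned}
\]
i.e.
\[
\frac{d}{dt}\int_\Omega\Big[-\Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)\rho + \frac12\rho^2\Big]\,dx = \int_\Omega\Big[-\sum_{k=1}^{2}(J_k^\varepsilon\cdot\nabla\rho) - \Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)\partial_t\rho + \rho\,\partial_t\rho\Big]\,dx. \tag{4.6}
\]
By conservation of linear momentum (2.30), using the same argument as in (3.6)-(3.9), we obtain
\[
-\int_\Omega u\cdot\sum_{k=1}^{2}\partial_t J_k^\varepsilon\,dx = -\frac{\varepsilon}{4}\sum_{k=1}^{2}\int_\Omega\Big[\sum_{j,l=1}^{2}4\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\bar\psi_k^\varepsilon)\,\partial_l u_j + (\nabla\rho_k^\varepsilon\cdot\nabla\mathrm{div}\,u)\Big]\,dx - \int_\Omega\Big[\frac12\sum_{k=1}^{2}(\rho_k^\varepsilon)^2 + \gamma\rho_1^\varepsilon\rho_2^\varepsilon\Big]\mathrm{div}\,u\,dx. \tag{4.7}
\]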
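Notice that, due to compressibility, the velocity $u$ is no longer divergence free. This accounts for the difference between (3.10) and (4.7). Hence by summing up (4.4)-(4.7), we find
\[
\begin{aligned}
\frac{d}{dt}\tilde H^\varepsilon(t) = {}& -\int_\Omega\partial_t u\cdot\sum_{k=1}^{2}J_k^\varepsilon\,dx - \frac{\varepsilon}{4}\sum_{k=1}^{2}\int_\Omega\Big[\sum_{j,l=1}^{2}4\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\bar\psi_k^\varepsilon)\,\partial_l u_j + (\nabla\rho_k^\varepsilon\cdot\nabla\mathrm{div}\,u)\Big]\,dx\\
& - \int_\Omega\Big[\frac12\sum_{k=1}^{2}(\rho_k^\varepsilon)^2 + \gamma\rho_1^\varepsilon\rho_2^\varepsilon\Big]\mathrm{div}\,u\,dx + \int_\Omega(u\cdot\partial_t u)\sum_{k=1}^{2}\rho_k^\varepsilon\,dx + \frac12\int_\Omega\nabla|u|^2\cdot\sum_{k=1}^{2}J_k^\varepsilon\,dx\\
& + \int_\Omega\Big[-\sum_{k=1}^{2}(J_k^\varepsilon\cdot\nabla\rho) - \Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)\partial_t\rho + \rho\,\partial_t\rho\Big]\,dx.
\end{aligned} \tag{4.8}
\]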
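Plug the system (1.18) into (4.8), and then
\[
\begin{aligned}
\frac{d}{dt}\tilde H^\varepsilon(t) = {}& -\frac{\varepsilon}{4}\sum_{k=1}^{2}\int_\Omega\Big[\sum_{j,l=1}^{2}4\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\bar\psi_k^\varepsilon)\,\partial_l u_j + (\nabla\rho_k^\varepsilon\cdot\nabla\mathrm{div}\,u)\Big]\,dx\\
& + \sum_{k=1}^{2}\int_\Omega\Big\{(J_k^\varepsilon - \rho_k^\varepsilon u)\cdot[(u\cdot\nabla)u] + \frac12 J_k^\varepsilon\cdot\nabla|u|^2\Big\}\,dx\\
& + \sum_{k=1}^{2}\int_\Omega(J_k^\varepsilon - \rho_k^\varepsilon u)\cdot\nabla\rho\,dx - \int_\Omega\Big[\frac12\sum_{k=1}^{2}(\rho_k^\varepsilon)^2 + \gamma\rho_1^\varepsilon\rho_2^\varepsilon\Big]\mathrm{div}\,u\,dx\\
& + \int_\Omega\Big\{-\sum_{k=1}^{2}(J_k^\varepsilon\cdot\nabla\rho) + \Big[\Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)-\rho\Big]\mathrm{div}(\rho u)\Big\}\,dx.
\end{aligned} \tag{4.9}
\]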
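By (2.25)-(2.27) and the same method as for Claim 2 in Sect. 3, we obtain
\[
\begin{aligned}
&-\sum_{j,l=1}^{2}(\partial_l u_j)\,\mathrm{Re}\Big[(\partial_j - i\varepsilon^{-\frac12}u_j)\psi_k^\varepsilon\;\overline{(\partial_l - i\varepsilon^{-\frac12}u_l)\psi_k^\varepsilon}\Big]\\
&\quad = -\sum_{j,l=1}^{2}(\partial_l u_j)\,\mathrm{Re}(\partial_l\psi_k^\varepsilon\,\partial_j\bar\psi_k^\varepsilon) + \varepsilon^{-1}\Big\{(J_k^\varepsilon - \rho_k^\varepsilon u)\cdot[(u\cdot\nabla)u] + \frac12 J_k^\varepsilon\cdot\nabla|u|^2\Big\}.
\end{aligned} \tag{4.10}
\]
Thus (4.9) and (4.10) imply
\[
\begin{aligned}
\frac{d}{dt}\tilde H^\varepsilon(t) = {}& -\varepsilon\sum_{j,k,l=1}^{2}\int_\Omega(\partial_l u_j)\,\mathrm{Re}\Big[(\partial_j - i\varepsilon^{-\frac12}u_j)\psi_k^\varepsilon\;\overline{(\partial_l - i\varepsilon^{-\frac12}u_l)\psi_k^\varepsilon}\Big]\,dx - \frac{\varepsilon}{4}\sum_{k=1}^{2}\int_\Omega(\nabla\rho_k^\varepsilon\cdot\nabla\mathrm{div}\,u)\,dx\\
& - \int_\Omega\Big[\frac12\sum_{k=1}^{2}(\rho_k^\varepsilon)^2 - \rho\sum_{k=1}^{2}\rho_k^\varepsilon + \gamma\rho_1^\varepsilon\rho_2^\varepsilon\Big]\mathrm{div}\,u\,dx - \int_\Omega\rho\,\mathrm{div}(\rho u)\,dx.
\end{aligned} \tag{4.11}
\]
Since $\rho\,\mathrm{div}(\rho u) = (\rho\nabla\rho)\cdot u + \rho^2\,\mathrm{div}\,u = \frac12(\nabla\rho^2)\cdot u + \rho^2\,\mathrm{div}\,u$, using integration by parts and the slip boundary condition $u\cdot\vec{n}|_{\partial\Omega} = 0$, we find
\[
-\int_\Omega\rho\,\mathrm{div}(\rho u)\,dx = -\frac12\int_\Omega(\nabla\rho^2)\cdot u\,dx - \int_\Omega\rho^2\,\mathrm{div}\,u\,dx = -\frac12\int_\Omega\rho^2\,\mathrm{div}\,u\,dx. \tag{4.12}
\]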
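Combining (4.11) with (4.12), we arrive at
\[
\begin{aligned}
\frac{d}{dt}\tilde H^\varepsilon(t) = {}& -\varepsilon\sum_{j,k,l=1}^{2}\int_\Omega(\partial_l u_j)\,\mathrm{Re}\Big[(\partial_j - i\varepsilon^{-\frac12}u_j)\psi_k^\varepsilon\;\overline{(\partial_l - i\varepsilon^{-\frac12}u_l)\psi_k^\varepsilon}\Big]\,dx - \frac{\varepsilon}{4}\sum_{k=1}^{2}\int_\Omega(\nabla\rho_k^\varepsilon\cdot\nabla\mathrm{div}\,u)\,dx\\
& - \frac12\int_\Omega\Big[\Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)-\rho\Big]^2\mathrm{div}\,u\,dx - (\gamma-1)\int_\Omega\rho_1^\varepsilon\rho_2^\varepsilon\,\mathrm{div}\,u\,dx.
\end{aligned} \tag{4.13}
\]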
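The last two terms of (4.13) come from completing a square in the integrand multiplying $\mathrm{div}\,u$. As a sketch of this purely algebraic step, collect the bracket in (4.11) together with the contribution $\frac12\rho^2$ from (4.12):

```latex
% Integrand multiplying -div u after inserting (4.12) into (4.11):
\[
\frac12\sum_{k=1}^{2}(\rho_k^\varepsilon)^2 - \rho\sum_{k=1}^{2}\rho_k^\varepsilon
 + \gamma\rho_1^\varepsilon\rho_2^\varepsilon + \frac12\rho^2 .
\]
% Use (\rho_1+\rho_2)^2 = \rho_1^2 + \rho_2^2 + 2\rho_1\rho_2:
\[
\frac12\sum_{k=1}^{2}(\rho_k^\varepsilon)^2
= \frac12\Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)^2 - \rho_1^\varepsilon\rho_2^\varepsilon ,
\]
% so that the integrand becomes
\[
\frac12\Big[\Big(\sum_{k=1}^{2}\rho_k^\varepsilon\Big)-\rho\Big]^2
 + (\gamma-1)\rho_1^\varepsilon\rho_2^\varepsilon ,
\]
% which is exactly the potential part of the functional (4.1).
```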
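By (1.14) and (2.28), $\tilde e_\varepsilon(t) = \tilde e_\varepsilon(0) = O(1)$ for $t \in [0, T_*)$. Consequently,
\[
\|\nabla\psi_k^\varepsilon\|_{L^2(\Omega)} = O\Big(\frac{1}{\sqrt{\varepsilon}}\Big), \quad \|\psi_k^\varepsilon\|_{L^4(\Omega)} = O(1) \quad\text{for } t\in[0, T_*),\ k = 1, 2. \tag{4.14}
\]
Then by (2.25), (4.14) and the Hölder inequality,
\[
\Big|\frac{\varepsilon}{4}\sum_{k=1}^{2}\int_\Omega(\nabla\rho_k^\varepsilon\cdot\nabla\mathrm{div}\,u)\,dx\Big| \le \frac{\varepsilon}{2}\sum_{k=1}^{2}\|\nabla\psi_k^\varepsilon\|_{L^2(\Omega)}\|\psi_k^\varepsilon\|_{L^4(\Omega)}\|\nabla\mathrm{div}\,u\|_{L^4(\Omega)} \le C\sqrt{\varepsilon}\,\|u\|_{H^3(\Omega)}. \tag{4.15}
\]
Hence by (4.1), (4.13) and (4.15), we obtain
\[
\frac{d}{dt}\tilde H^\varepsilon(t) \le C\|\nabla u(t)\|_{L^\infty(\Omega)}\tilde H^\varepsilon(t) + C\sqrt{\varepsilon}\,\|u(t)\|_{H^3(\Omega)} \quad\text{for } t\in[0, T_*). \tag{4.16}
\]
From (1.15), $\tilde H^\varepsilon(0) \to 0$ as $\varepsilon \downarrow 0$. Thus by (1) of Remark 1.2, (4.16) together with the Gronwall inequality implies
\[
\tilde H^\varepsilon(t) \to 0 \quad\text{for } t\in[0, T_*) \quad\text{as } \varepsilon\downarrow 0. \tag{4.17}
\]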
Therefore by (4.2), (4.14), (4.17) and the Hölder inequality, we obtain (1.16) and (1.17), which completes the proof of Theorem 2.

5. Appendix

In this section, we study global existence for the systems (1.1) and (1.2) with the Neumann boundary conditions (B). Basically, we follow the ideas of Brezis and Gallouet (cf. [4]). We first study the local existence theorem and then use conservation laws and a crucial inequality (see (5.12)), which holds only when the domain has two spatial dimensions, to prove the global existence theorem. For the local existence theorem, we need a semigroup theorem (cf. [20]) as follows:

Theorem 3. Let $H$ be a Hilbert space and $A : D(A) \subset H \to H$ be an m-accretive linear operator. Assume $F$ is a mapping from $D(A)$ into itself which is Lipschitz on every bounded set of $D(A)$. Then for every $u_0 \in D(A)$, there exists a unique solution $u$ of the equation
\[
\frac{du}{dt} + Au = Fu, \quad u(0) = u_0,
\]
defined for $t \in [0, T_{\max})$ such that $u \in C^1([0, T_{\max}); H) \cap C([0, T_{\max}); D(A))$, with the additional property that either $T_{\max} = \infty$, or $T_{\max} < \infty$ and $\lim_{t\uparrow T_{\max}}\big(\|u(t)\| + \|Au(t)\|\big) = \infty$.

For simplicity, we may set $\varepsilon = 1$. Then both (1.1) and (1.2) have the following form:
\[
\begin{cases}
i\partial_t\psi_1 = -\frac12\Delta\psi_1 - a\psi_1 + b|\psi_1|^2\psi_1 + c|\psi_2|^2\psi_1,\\
i\partial_t\psi_2 = -\frac12\Delta\psi_2 - a\psi_2 + b|\psi_2|^2\psi_2 + c|\psi_1|^2\psi_2,
\end{cases} \tag{5.1}
\]
where $a \ge 0$, $b, c > 0$ are constants. To apply Theorem 3 to (5.1) with the Neumann boundary conditions (B), we define
\[
H = \{(\psi_1, \psi_2)^T : \psi_j \in L^2(\Omega; \mathbb{C}),\ j = 1, 2\},
\qquad
A\begin{pmatrix}\psi_1\\ \psi_2\end{pmatrix} = \begin{pmatrix}\frac{i}{2}(\Delta - 2I)\psi_1\\[2pt] \frac{i}{2}(\Delta - 2I)\psi_2\end{pmatrix} \quad\text{for } (\psi_1, \psi_2)^T \in D(A),
\]
where $D(A) = \{(\psi_1, \psi_2)^T : \psi_j \in H^2(\Omega; \mathbb{C}),\ \frac{\partial\psi_j}{\partial n}\big|_{\partial\Omega} = 0,\ j = 1, 2\}$. Besides, we set the initial data of (5.1) as $(\psi_1, \psi_2)^T|_{t=0} = (\psi_{1,0}, \psi_{2,0})^T \in D(A)$, and
\[
F\begin{pmatrix}\psi_1\\ \psi_2\end{pmatrix} = \begin{pmatrix} i(a+1)\psi_1 - ib|\psi_1|^2\psi_1 - ic|\psi_2|^2\psi_1\\ i(a+1)\psi_2 - ib|\psi_2|^2\psi_2 - ic|\psi_1|^2\psi_2\end{pmatrix} \quad\text{for } \begin{pmatrix}\psi_1\\ \psi_2\end{pmatrix} \in D(A).
\]
Then it is easy to check that $F$ maps $D(A)$ into itself and is Lipschitz on every bounded set of $D(A)$. Moreover, the operator $A : D(A) \subset H \to H$ is an m-accretive linear operator (cf. [10]). Hence by Theorem 3, we obtain the local existence of solutions of (5.1) with the Neumann boundary conditions (B).

Now we want to show the global existence of solutions of (5.1) with the Neumann boundary conditions (B), i.e. $T_{\max} = \infty$ and $\|\psi_j(t)\|_{H^2(\Omega)}$, $j = 1, 2$, remain bounded on every finite time interval. Hereafter, $(\psi_1, \psi_2)$ denotes the solution of (5.1) with the Neumann boundary conditions (B). By energy conservation for (5.1), we have
\[
\sum_{j=1}^{2}\int_\Omega|\nabla\psi_j|^2\,dx \le C_0 \quad\text{for } t\in[0, T_{\max}), \tag{5.2}
\]
where $C_0$ is a positive constant independent of $t$. Moreover, by conservation of mass for (5.1),
\[
\sum_{j=1}^{2}\int_\Omega|\psi_j|^2\,dx \le C_1 \quad\text{for } t\in[0, T_{\max}), \tag{5.3}
\]
where $C_1$ is a positive constant independent of $t$. Hence (5.2) and (5.3) imply
\[
\sum_{j=1}^{2}\|\psi_j(t)\|^2_{H^1(\Omega)} \le C_2 \quad\text{for } t\in[0, T_{\max}), \tag{5.4}
\]
where $C_2 = C_0 + C_1 > 0$ is independent of $t$. We now denote by $S(t)$ the $L^2$ isometry group generated by $-\frac{i}{2}(\Delta - 2I)$. Then by (5.1), we obtain
\[
\begin{cases}
\psi_1(t) = S(t)\psi_{1,0} + i\int_0^t S(t-s)\big[(a+1)\psi_1 + b|\psi_1|^2\psi_1 + c|\psi_2|^2\psi_1\big](s)\,ds,\\[4pt]
\psi_2(t) = S(t)\psi_{2,0} + i\int_0^t S(t-s)\big[(a+1)\psi_2 + b|\psi_2|^2\psi_2 + c|\psi_1|^2\psi_2\big](s)\,ds,
\end{cases} \tag{5.5}
\]
where the $\psi_{j,0}$'s are the initial data of the $\psi_j$'s. Hence by (5.5),
\[
A\begin{pmatrix}\psi_1\\ \psi_2\end{pmatrix}(t) = S(t)A\begin{pmatrix}\psi_{1,0}\\ \psi_{2,0}\end{pmatrix} + i\int_0^t S(t-s)A\begin{pmatrix}(a+1)\psi_1 + b|\psi_1|^2\psi_1 + c|\psi_2|^2\psi_1\\ (a+1)\psi_2 + b|\psi_2|^2\psi_2 + c|\psi_1|^2\psi_2\end{pmatrix}(s)\,ds
\]
and
\[
\Big\|A\begin{pmatrix}\psi_1\\ \psi_2\end{pmatrix}(t)\Big\|_{L^2(\Omega)} \le \Big\|A\begin{pmatrix}\psi_{1,0}\\ \psi_{2,0}\end{pmatrix}\Big\|_{L^2(\Omega)} + C\int_0^t\Big\|\begin{pmatrix}(a+1)\psi_1 + b|\psi_1|^2\psi_1 + c|\psi_2|^2\psi_1\\ (a+1)\psi_2 + b|\psi_2|^2\psi_2 + c|\psi_1|^2\psi_2\end{pmatrix}(s)\Big\|_{H^2(\Omega)}\,ds. \tag{5.6}
\]
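To estimate the integral on the right side of (5.6), we need the following lemma:

Lemma 3.
\[
\||v|^2u\|_{H^2(\Omega)} \le C\|v\|_{L^\infty(\Omega)}\max\{\|u\|_{L^\infty}, \|v\|_{L^\infty}\}\max\{\|u\|_{H^2(\Omega)}, \|v\|_{H^2(\Omega)}\},
\]
\[
\||u|^2v\|_{H^2(\Omega)} \le C\|u\|_{L^\infty(\Omega)}\max\{\|u\|_{L^\infty}, \|v\|_{L^\infty}\}\max\{\|u\|_{H^2(\Omega)}, \|v\|_{H^2(\Omega)}\}
\]
for $u, v \in H^2(\Omega)$, where $C$ is a positive constant independent of $u$ and $v$.

Proof of Lemma 3. Let $D$ be any first order differential operator. For $u, v \in H^2(\Omega)$, we have
\[
|D^2(|v|^2u)| \le C\big(|v|^2|D^2u| + |v\,Du\,Dv| + |Dv|^2|u| + |u\,v\,D^2v|\big)
\]
and so
\[
\||v|^2u\|_{H^2(\Omega)} \le C\Big[\|v\|^2_{L^\infty(\Omega)}\|u\|_{H^2(\Omega)} + \|v\|_{L^\infty(\Omega)}\|u\|_{W^{1,4}(\Omega)}\|v\|_{W^{1,4}(\Omega)} + \|u\|_{L^\infty(\Omega)}\|v\|^2_{W^{1,4}(\Omega)} + \|u\|_{L^\infty(\Omega)}\|v\|_{L^\infty(\Omega)}\|v\|_{H^2(\Omega)}\Big]. \tag{5.7}
\]
Hereafter, for notational convenience, we denote by $C$ the associated constants, which may differ from line to line. By the Gagliardo-Nirenberg inequality (cf. [16], p. 125),
\[
\|u\|_{W^{1,4}(\Omega)} \le C\|u\|^{1/2}_{L^\infty(\Omega)}\|u\|^{1/2}_{H^2(\Omega)}, \tag{5.8}
\]
\[
\|v\|_{W^{1,4}(\Omega)} \le C\|v\|^{1/2}_{L^\infty(\Omega)}\|v\|^{1/2}_{H^2(\Omega)}, \tag{5.9}
\]
for $u, v \in H^2(\Omega)$. Combining (5.7)-(5.9), we obtain
\[
\begin{aligned}
\||v|^2u\|_{H^2(\Omega)} &\le C\Big[\|v\|^2_{L^\infty(\Omega)}\|u\|_{H^2(\Omega)} + \|v\|^{3/2}_{L^\infty(\Omega)}\|u\|^{1/2}_{L^\infty(\Omega)}\|v\|^{1/2}_{H^2(\Omega)}\|u\|^{1/2}_{H^2(\Omega)} + \|u\|_{L^\infty(\Omega)}\|v\|_{L^\infty(\Omega)}\|v\|_{H^2(\Omega)}\Big]\\
&\le C\|v\|_{L^\infty(\Omega)}\Big(\|v\|^{1/2}_{L^\infty(\Omega)}\|u\|^{1/2}_{H^2(\Omega)} + \|u\|^{1/2}_{L^\infty(\Omega)}\|v\|^{1/2}_{H^2(\Omega)}\Big)^2\\
&\le C\|v\|_{L^\infty(\Omega)}\max\{\|u\|_{L^\infty(\Omega)}, \|v\|_{L^\infty(\Omega)}\}\max\{\|u\|_{H^2(\Omega)}, \|v\|_{H^2(\Omega)}\}.
\end{aligned} \tag{5.10}
\]
Similarly, we may interchange $u$ and $v$ to get
\[
\||u|^2v\|_{H^2(\Omega)} \le C\|u\|_{L^\infty(\Omega)}\max\{\|u\|_{L^\infty(\Omega)}, \|v\|_{L^\infty(\Omega)}\}\max\{\|u\|_{H^2(\Omega)}, \|v\|_{H^2(\Omega)}\}.
\]
This completes the proof of Lemma 3.

By (5.6) and Lemma 3,
\[
\sum_{j=1}^{2}\|\psi_j(t)\|_{H^2(\Omega)} \le C + C\int_0^t\Big(\sum_{j=1}^{2}\|\psi_j(s)\|_{H^2(\Omega)}\Big)\Big(\sum_{j=1}^{2}\|\psi_j(s)\|_{L^\infty(\Omega)}\Big)^2\,ds \tag{5.11}
\]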
for $t \in [0, T_{\max})$, where $C$ may depend on $a$, $b$, $c$ and the $\psi_{j,0}$'s. From [4], a crucial inequality is given by
\[
\|u\|_{L^\infty(\Omega)} \le C\Big(1 + \sqrt{\log(1 + \|u\|_{H^2(\Omega)})}\Big) \quad\text{for } u\in H^2(\Omega) \text{ with } \|u\|_{H^1(\Omega)} \le 1, \tag{5.12}
\]
provided that the domain $\Omega$ has two spatial dimensions. Hence by (5.4), (5.11) and (5.12), we obtain
\[
\sum_{j=1}^{2}\|\psi_j(t)\|_{H^2(\Omega)} \le C + C\int_0^t\Big(\sum_{j=1}^{2}\|\psi_j(s)\|_{H^2(\Omega)}\Big)\Big[1 + \log\Big(1 + \sum_{j=1}^{2}\|\psi_j(s)\|_{H^2(\Omega)}\Big)\Big]\,ds. \tag{5.13}
\]
We denote by $G(t)$ the right hand side of (5.13); thus
\[
G'(t) = C\Big(\sum_{j=1}^{2}\|\psi_j(t)\|_{H^2(\Omega)}\Big)\Big[1 + \log\Big(1 + \sum_{j=1}^{2}\|\psi_j(t)\|_{H^2(\Omega)}\Big)\Big] \le C\,G(t)\big[1 + \log(1 + G(t))\big] \quad(\text{by } (5.13)).
\]
Consequently,
\[
\frac{d}{dt}\log\big[1 + \log(1 + G(t))\big] \le C,
\]
and we find an estimate for $\sum_{j=1}^{2}\|\psi_j(t)\|_{H^2(\Omega)}$ of the form
\[
\sum_{j=1}^{2}\|\psi_j(t)\|_{H^2(\Omega)} \le e^{\alpha e^{\beta t}}
\]
for some constants $\alpha$ and $\beta$. Therefore $\sum_{j=1}^{2}\|\psi_j(t)\|_{H^2(\Omega)}$ remains bounded on every finite time interval and $T_{\max} = \infty$.
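The double-exponential bound can be made explicit. The lines below are a sketch of the integration step, assuming only $G \ge 0$ and the differential inequality for $G$ stated above; the identification of $\alpha$ and $\beta$ at the end is one admissible choice, not necessarily the authors'.

```latex
% Since G \ge 0, the inequality G' \le C G [1+\log(1+G)] implies
% G' \le C (1+G) [1+\log(1+G)], hence
\[
\frac{d}{dt}\log\big[1+\log(1+G(t))\big]
= \frac{G'(t)}{(1+G(t))\,\big[1+\log(1+G(t))\big]} \le C .
\]
% Integrating from 0 to t and exponentiating twice:
\[
1+\log(1+G(t)) \le \big[1+\log(1+G(0))\big]\,e^{Ct}
\quad\Longrightarrow\quad
G(t) \le \exp\!\Big(\big[1+\log(1+G(0))\big]\,e^{Ct}\Big).
\]
% This has the form e^{\alpha e^{\beta t}} with
% \alpha = 1+\log(1+G(0)) and \beta = C,
% and \sum_j \|\psi_j(t)\|_{H^2(\Omega)} \le G(t) by (5.13).
```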
Acknowledgements. T. C. Lin is partially supported by NCTS and NSC of Taiwan under Grant NSC94-2115M-002-019. P. Zhang is partially supported by NSF of China under Grant 10525101 and 10421101, and the innovation grant from Chinese Academy of Sciences.
References
1. Ao, P., Chui, S.T.: Binary Bose-Einstein condensate mixtures in weakly and strongly segregated phases. Phys. Rev. A 58, 4836–4840 (1998)
2. Beirão da Veiga, H.: On the barotropic motion of compressible perfect fluids. Ann. Sc. Norm. Sup. Pisa 8, 417–451 (1981)
3. Beirão da Veiga, H.: Data dependence in the mathematical theory of compressible inviscid fluids. Arch. Rat. Mech. Anal. 119, 109–127 (1992)
4. Brezis, H., Gallouet, T.: Nonlinear Schrödinger evolution equations. Nonlinear Analysis, TMA 4, 677–681 (1980)
5. Esry, B.D., Greene, C.H.: Spontaneous spatial symmetry breaking in two-component Bose-Einstein condensates. Phys. Rev. A 59, 1457–1460 (1999)
6. Hasan, Z.R., Goble, D.F.: Effect of boundary conditions on finite Bose-Einstein assemblies. Phys. Rev. A 10(2), 618–624 (1974)
7. Grenier, E.: Semiclassical limit of the nonlinear Schrödinger equation in small time. Proc. Amer. Math. Soc. 126, 523–530 (1998)
8. Ginzburg, V.L., Pitaevskii, L.P.: Zh. Eksperim. Theor. Fys. 34, 1240 (1958) [Sov. Phys. JETP 7, 858 (1958)]
9. Hall, D.S., Matthews, M.R., Ensher, J.R., Wieman, C.E., Cornell, E.A.: Dynamics of component separation in a binary mixture of Bose-Einstein condensates. Phys. Rev. Lett. 81, 1539–1542 (1998)
10. Kato, T.: Perturbation theory for linear operators. Berlin: Springer, 1980
11. Lin, F.H., Lin, T.C.: Vortices in two-dimensional Bose-Einstein condensates. In: Geometry and nonlinear partial differential equations, AMS/IP Stud. Adv. Math. 29, Amer. Math. Soc., Providence, RI, 2002, pp. 87–114
12. Lin, F.H., Xin, J.X.: On the incompressible fluid limit and the vortex motion law of the nonlinear Schrödinger equation. Commun. Math. Phys. 200(2), 249–274 (1999)
13. Lin, F.H., Zhang, P.: Semiclassical limit of the Gross-Pitaevskii equation in an exterior domain. Arch. Rat. Mech. Anal. 179, 79–107 (2005)
14. Lions, P.L.: Mathematical Topics in Fluid Mechanics, Vol. 1, Incompressible Models, Lecture series in mathematics and its applications, V. 3, Oxford: Clarendon Press, 1996
15. Majda, A.: Compressible fluid flow and systems of conservation laws in several space variables. New York: Springer, 1984
16. Nirenberg, L.: On elliptic partial differential equations. Ann. Sci. Norm. Sup. Pisa 13, 115–162 (1959)
17. Pérez-García, V.M., Konotop, V.V., Brazhnyi, V.A.: Feshbach resonance induced shock waves in Bose-Einstein condensates. Phys. Rev. Lett. 92, 220403(1–4) (2004)
18. Pitaevskii, L., Stringari, S.: Bose-Einstein condensation. Oxford: Oxford Univ. Press, 2003
19. Puel, M.: Convergence of the Schrödinger-Poisson system to the incompressible Euler equations. Comm. Partial Diff. Eqs. 27, 2311–2331 (2002)
20. Segal, I.: Nonlinear semi-groups. Ann. Math. 78, 339–364 (1963)
21. Sideris, T.C.: Formation of singularities in three-dimensional compressible fluids. Commun. Math. Phys. 101, 475–485 (1985)
22. Stein, E.M.: Singular Integrals and Differentiability Properties of Functions. Princeton, NJ: Princeton University Press, 1970
23. Sulem, C., Sulem, P.L.: The nonlinear Schrödinger equation: self-focusing and wave collapse. New York: Springer, 1999
24. Timmermans, E.: Phase separation of Bose-Einstein condensates. Phys. Rev. Lett. 81, 5718–5721 (1998)
25. Zhang, P.: Wigner measure and the semiclassical limit of Schrödinger-Poisson equations. SIAM J. Math. Anal. 34, 700–718 (2003)

Communicated by A. Kupiainen
Commun. Math. Phys. 266, 571–576 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0019-z
A Generalization of Hawking’s Black Hole Topology Theorem to Higher Dimensions

Gregory J. Galloway$^1$, Richard Schoen$^2$

$^1$ Department of Mathematics, University of Miami, Coral Gables, FL 33124, USA
$^2$ Department of Mathematics, Stanford University, Stanford, CA 94305, USA
Received: 28 September 2005 / Accepted: 10 November 2005 Published online: 9 June 2006 – © Springer-Verlag 2006
Abstract: Hawking’s theorem on the topology of black holes asserts that cross sections of the event horizon in 4-dimensional asymptotically flat stationary black hole spacetimes obeying the dominant energy condition are topologically 2-spheres. This conclusion extends to outer apparent horizons in spacetimes that are not necessarily stationary. In this paper we obtain a natural generalization of Hawking’s results to higher dimensions by showing that cross sections of the event horizon (in the stationary case) and outer apparent horizons (in the general case) are of positive Yamabe type, i.e., admit metrics of positive scalar curvature. This implies many well-known restrictions on the topology, and is consistent with recent examples of five dimensional stationary black hole spacetimes with horizon topology $S^2 \times S^1$. The proof is inspired by previous work of Schoen and Yau on the existence of solutions to the Jang equation (but does not make direct use of that equation).

1. Introduction

A basic result in the theory of black holes is Hawking’s theorem [11, 13] on the topology of black holes, which asserts that cross sections of the event horizon in 4-dimensional asymptotically flat stationary black hole spacetimes obeying the dominant energy condition are spherical (i.e., topologically $S^2$). The proof is a beautiful variational argument, showing that if a cross section has genus $\ge 1$ then it can be deformed along a null hypersurface to an outer trapped surface outside of the event horizon, which is forbidden by standard results on black holes [13].$^1$ In [12], Hawking showed that his black hole topology result extends, by a similar argument, to outer apparent horizons in black hole spacetimes that are not necessarily stationary. (A related result had been shown by Gibbons [8] in the time-symmetric case.) Since Hawking’s arguments rely on the Gauss-Bonnet theorem, these results do not directly extend to higher dimensions.

$^1$ Actually the torus $T^2$ arises as a borderline case in Hawking’s argument, but can occur only under special circumstances.
Given the current interest in higher dimensional black holes, it is of interest to determine which properties of black holes in four spacetime dimensions extend to higher dimensions. In this note we obtain a natural generalization of Hawking’s theorem on the topology of black holes to higher dimensions. The conclusion in higher dimensions is not that the horizon topology is spherical; that would be too strong, as evidenced by the striking example of Emparan and Reall [7] of a stationary vacuum black hole spacetime in five dimensions with horizon topology $S^2 \times S^1$. The natural conclusion in higher dimensions is that cross sections of the event horizon (in the stationary case), and outer apparent horizons (in the general case) are of positive Yamabe type, i.e. admit metrics of positive scalar curvature. As noted in [6], in the time symmetric case this follows from the minimal surface methodology of Schoen and Yau [18] in their treatment of manifolds of positive scalar curvature. The main point of the present paper is to show that this conclusion remains valid without any condition on the extrinsic curvature of space. That such a result might be expected to hold is suggested by work in [19, Sect. 4], which implies that the apparent horizons corresponding to the blow-up of solutions of the Jang equation, as described in [19], are of positive Yamabe type. We emphasize, however, that we do not need to make use of the Jang equation here.$^2$

Much is now known about the topological obstructions to the existence of metrics of positive scalar curvature in higher dimensions. While the first major result along these lines is the famous theorem of Lichnerowicz [16] concerning the vanishing of the $\hat{A}$-genus, a key advance in our understanding was made in the late 70’s and early 80’s by Schoen and Yau [17, 18], and Gromov and Lawson [9, 10]. A brief review of these results, relevant to the topology of black holes, was given in [6].
We shall recall the situation in five spacetime dimensions in the next section, after the statement of our main result.

2. The Main Result

Let $V^n$ be an $n$-dimensional, $n \ge 3$, spacelike hypersurface in a spacetime $(M^{n+1}, g)$. Let $\Sigma^{n-1}$ be a closed hypersurface in $V^n$, and assume that $\Sigma^{n-1}$ separates $V^n$ into an “inside” and an “outside”. Let $N$ be the outward unit normal to $\Sigma^{n-1}$ in $V^n$, and let $U$ be the future directed unit normal to $V^n$ in $M^{n+1}$. Then $K = U + N$ is an outward null normal field to $\Sigma^{n-1}$, unique up to scaling. The null second fundamental form of $\Sigma$ with respect to $K$ is, for each $p \in \Sigma$, the bilinear form defined by

$$\chi : T_p\Sigma \times T_p\Sigma \to \mathbb{R}, \qquad \chi(X, Y) = \langle \nabla_X K, Y \rangle, \tag{2.1}$$

where $\langle\,,\rangle = g$ and $\nabla$ is the Levi-Civita connection of $M^{n+1}$. Then the null expansion of $\Sigma$ is defined as $\theta = \operatorname{tr}\chi = h^{AB}\chi_{AB} = \operatorname{div}_\Sigma K$, where $h$ is the induced metric on $\Sigma$. We shall say $\Sigma^{n-1}$ is an outer apparent horizon in $V^n$ provided, (i) $\Sigma$ is marginally outer trapped, i.e., $\theta = 0$, and (ii) there are no outer trapped surfaces outside of $\Sigma$. The latter means that there is no $(n-1)$-surface $\Sigma'$ contained in the region of $V^n$ outside of $\Sigma$ which is homologous to $\Sigma$ and which has negative expansion $\theta < 0$ with respect to its outer null normal (relative to $\Sigma$). Heuristically, $\Sigma$ is the “outer limit” of outer trapped surfaces in $V$.

$^2$ In any case, the parametric estimates of [19] which are used to construct solutions of the Jang equation asymptotic to vertical cylinders over apparent horizons are generally true only in low dimensions.
Finally, a spacetime $(M^{n+1}, g)$ satisfying the Einstein equations

$$R_{ab} - \frac{1}{2} R g_{ab} = T_{ab} \tag{2.2}$$

is said to obey the dominant energy condition provided the energy-momentum tensor $T$ satisfies $T(X, Y) = T_{ab} X^a Y^b \ge 0$ for all future pointing causal vectors $X$, $Y$. We are now ready to state the main theorem.

Theorem 2.1. Let $(M^{n+1}, g)$, $n \ge 3$, be a spacetime satisfying the dominant energy condition. If $\Sigma^{n-1}$ is an outer apparent horizon in $V^n$ then $\Sigma^{n-1}$ is of positive Yamabe type, unless $\Sigma^{n-1}$ is Ricci flat (flat if $n = 3, 4$) in the induced metric, and both $\chi$ and $T(U, K) = T_{ab} U^a K^b$ vanish on $\Sigma$.

Thus, except under special circumstances, $\Sigma^{n-1}$ is of positive Yamabe type. As noted in the introduction, this implies various restrictions on the topology of $\Sigma$. Let us focus on the case $\dim M = 5$, and hence $\dim \Sigma = 3$, and assume, by taking a double cover if necessary, that $\Sigma$ is orientable. Then by well-known results of Schoen-Yau [18] and Gromov-Lawson [10], topologically, $\Sigma$ must be a finite connected sum of spherical spaces (homotopy 3-spheres, perhaps with identifications) and $S^2 \times S^1$’s. Indeed, by the prime decomposition theorem, $\Sigma$ can be expressed as a connected sum of spherical spaces, $S^2 \times S^1$’s, and $K(\pi, 1)$ manifolds (manifolds whose universal covers are contractible). But as $\Sigma$ admits a metric of positive scalar curvature, it cannot have any $K(\pi, 1)$’s in its prime decomposition. Thus, the basic horizon topologies in $\dim M = 5$ are $S^3$ and $S^2 \times S^1$, both of which are realized by nontrivial black hole spacetimes. Under stringent geometric assumptions on the horizon, a related conclusion is arrived at in [14].

Proof of the theorem. We consider normal variations of $\Sigma$ in $V$, i.e., variations $t \to \Sigma_t$ of $\Sigma = \Sigma_0$, $-\epsilon < t < \epsilon$, with variation vector field $\mathcal{V} = \frac{\partial}{\partial t}\big|_{t=0} = \phi N$, $\phi \in C^\infty(\Sigma)$. Let $\theta(t)$ denote the null expansion of $\Sigma_t$ with respect to $K_t = U + N_t$, where $N_t$ is the outer unit normal field to $\Sigma_t$ in $V$. A computation shows [6, 3],

$$\frac{\partial\theta}{\partial t}\Big|_{t=0} = -\Delta\phi + 2\langle X, \nabla\phi\rangle + \left(Q + \operatorname{div} X - |X|^2\right)\phi, \tag{2.3}$$

where,

$$Q = \frac{1}{2} S - T(U, K) - \frac{1}{2}|\chi|^2, \tag{2.4}$$

$S$ is the scalar curvature of $\Sigma$, $X$ is the vector field on $\Sigma$ defined by $X = \tan(\nabla_N U)$, and $\langle\,,\rangle$ now denotes the induced metric $h$ on $\Sigma$. Introducing as in [3] the operator $L = -\Delta + 2\langle X, \nabla(\cdot)\rangle + \left(Q + \operatorname{div} X - |X|^2\right)$, Eq. (2.3) may be expressed as,

$$\frac{\partial\theta}{\partial t}\Big|_{t=0} = L(\phi). \tag{2.5}$$

$L$ is the stability operator associated with variations in the null expansion $\theta$. In the time symmetric case the vector field $X$ vanishes, and $L$ reduces to the classical stability operator of minimal surface theory, as expected [6].
As discussed in [3], although $L$ is not in general self adjoint, its principal eigenvalue $\lambda_1$ is real, and one can choose a principal eigenfunction $\phi$ which is strictly positive, $\phi > 0$. Using the eigenfunction $\phi$ to define our variation, we have from (2.5),

$$\frac{\partial\theta}{\partial t}\Big|_{t=0} = \lambda_1 \phi. \tag{2.6}$$

The eigenvalue $\lambda_1$ cannot be negative, for otherwise (2.6) would imply that $\frac{\partial\theta}{\partial t} < 0$ on $\Sigma$. Since $\theta = 0$ on $\Sigma$, this would mean that for $t > 0$ sufficiently small, $\Sigma_t$ would be outer trapped, contrary to our assumptions. Hence, $\lambda_1 \ge 0$, and we conclude for the variation determined by the positive eigenfunction $\phi$ that $\frac{\partial\theta}{\partial t}\big|_{t=0} \ge 0$. By completing the square on the right hand side of Eq. (2.3), this implies that the following inequality holds:

$$-\Delta\phi + (Q + \operatorname{div} X)\,\phi + \phi|\nabla \ln\phi|^2 - \phi|X - \nabla\ln\phi|^2 \ge 0. \tag{2.7}$$

Setting $u = \ln\phi$, we obtain,

$$-\Delta u + Q + \operatorname{div} X - |X - \nabla u|^2 \ge 0. \tag{2.8}$$
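The algebra leading from (2.3) to (2.7) and (2.8) can be spelled out as follows. This is only a sketch of the completing-the-square step, using $\phi > 0$ and the elementary identities $\nabla\phi = \phi\,\nabla\ln\phi$ and $\Delta\ln\phi = \Delta\phi/\phi - |\nabla\ln\phi|^2$; no further input is assumed.

```latex
% Step 1: complete the square in the first-order and |X|^2 terms of (2.3).
\begin{align*}
2\langle X, \nabla\phi\rangle - |X|^2\phi
  &= \phi\left(2\langle X, \nabla\ln\phi\rangle - |X|^2\right) \\
  &= \phi\left(|\nabla\ln\phi|^2 - |X - \nabla\ln\phi|^2\right).
\end{align*}
% Inserting this into (2.3) and using \partial\theta/\partial t|_{t=0} \ge 0 gives (2.7).

% Step 2: divide (2.7) by \phi > 0 and set u = \ln\phi.
\begin{align*}
-\frac{\Delta\phi}{\phi} + |\nabla u|^2 &= -\Delta u, \\
-\Delta u + Q + \operatorname{div} X - |X - \nabla u|^2 &\ge 0 ,
\end{align*}
% which is (2.8).
```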
As a side remark, note that substituting the expression for $Q$ into (2.8) and integrating gives that the total scalar curvature of $\Sigma$ is nonnegative, and in fact is positive, except under special circumstances. In four spacetime dimensions one may then apply the Gauss-Bonnet theorem to recover Hawking’s theorem; in fact this is essentially Hawking’s original argument. However, in higher dimensions the positivity of the total scalar curvature, in and of itself, does not provide any topological information.

To proceed with the higher dimensional case, we first absorb the Laplacian term $\Delta u = \operatorname{div}(\nabla u)$ in (2.8) into the divergence term to obtain,

$$Q + \operatorname{div}(X - \nabla u) - |X - \nabla u|^2 \ge 0. \tag{2.9}$$

Setting $Y = X - \nabla u$, we arrive at the inequality,

$$-Q + |Y|^2 \le \operatorname{div} Y. \tag{2.10}$$

Given any $\psi \in C^\infty(\Sigma)$, we multiply through by $\psi^2$ and derive,

$$\begin{aligned}
-\psi^2 Q + \psi^2|Y|^2 &\le \psi^2 \operatorname{div} Y = \operatorname{div}(\psi^2 Y) - 2\psi\langle\nabla\psi, Y\rangle \\
&\le \operatorname{div}(\psi^2 Y) + 2|\psi||\nabla\psi||Y| \\
&\le \operatorname{div}(\psi^2 Y) + |\nabla\psi|^2 + \psi^2|Y|^2.
\end{aligned} \tag{2.11}$$

Integrating the above inequality yields,

$$\int_\Sigma |\nabla\psi|^2 + Q\psi^2 \ge 0 \quad \text{for all } \psi \in C^\infty(\Sigma), \tag{2.12}$$

where $Q$ is given in (2.4).
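The integration step relies only on $\Sigma$ being closed, so that the divergence term integrates to zero. The lines below sketch it, together with the elementary inequality behind the last step of (2.11); the symbol $dA$ for the volume element of the induced metric is a notation assumed here, not taken from the text.

```latex
\begin{align*}
2|\psi||\nabla\psi||Y| &\le |\nabla\psi|^2 + \psi^2|Y|^2
  && \text{since } 2ab \le a^2 + b^2, \\
\int_\Sigma \operatorname{div}(\psi^2 Y)\, dA &= 0
  && \text{(divergence theorem, $\Sigma$ closed)}, \\
\int_\Sigma \left(-\psi^2 Q + \psi^2|Y|^2\right) dA
  &\le \int_\Sigma \left(|\nabla\psi|^2 + \psi^2|Y|^2\right) dA .
\end{align*}
% Cancelling \int_\Sigma \psi^2 |Y|^2 dA from both sides leaves
% \int_\Sigma ( |\nabla\psi|^2 + Q\psi^2 ) dA \ge 0 .
```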
At this point rather standard arguments become applicable [19, 6]. Consider the eigenvalue problem,

$$-\Delta\psi + Q\psi = \mu\psi. \tag{2.13}$$
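For orientation, the first eigenvalue of a self-adjoint problem of the form (2.13) on a closed manifold has the usual variational characterisation. The display below is a standard fact, stated as a sketch; $dA$ again denotes the volume element of $h$, a notation assumed here.

```latex
\[
\mu_1 \;=\; \inf_{\substack{\psi \in C^\infty(\Sigma) \\ \psi \not\equiv 0}}
  \frac{\displaystyle\int_\Sigma \left(|\nabla\psi|^2 + Q\psi^2\right) dA}
       {\displaystyle\int_\Sigma \psi^2\, dA} .
\]
% A lower bound on the numerator valid for every test function \psi
% therefore bounds \mu_1 from below.
```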
Inequality (2.12) implies that the first eigenvalue $\mu_1$ of (2.13) is nonnegative, $\mu_1 \ge 0$. Let $f \in C^\infty(\Sigma)$ be an associated eigenfunction; $f$ can be chosen to be strictly positive. Now consider $\Sigma$ in the conformally related metric $\tilde{h} = f^{2/(n-2)} h$. The scalar curvature $\tilde{S}$ of $\Sigma$ in the metric $\tilde{h}$ is given by,

$$\begin{aligned}
\tilde{S} &= f^{-n/(n-2)}\left(-2\Delta f + S f + \frac{n-1}{n-2}\frac{|\nabla f|^2}{f}\right) \\
&= f^{-2/(n-2)}\left(2\mu_1 + 2T(U, K) + |\chi|^2 + \frac{n-1}{n-2}\frac{|\nabla f|^2}{f^2}\right),
\end{aligned} \tag{2.14}$$

where, for the second equation, we have used (2.13), with $\psi = f$, and (2.4). Since, by the dominant energy condition, $T(U, K) \ge 0$, Eq. (2.14) implies that $\tilde{S} \ge 0$. If $\tilde{S} > 0$ at some point, then by well known results [15] one can conformally change $\tilde{h}$ to a metric of strictly positive scalar curvature, and the theorem follows. If $\tilde{S}$ vanishes identically then, by Eq. (2.14), $\mu_1 = 0$, $T(U, K) \equiv 0$, $\chi \equiv 0$ and $f$ is constant. Eq. (2.13), with $\psi = f$, and Eq. (2.4) then imply that $S \equiv 0$. By a result of Bourguignon (see [15]), it follows that $\Sigma$ carries a metric of positive scalar curvature unless it is Ricci flat. The theorem now follows.

Concluding Remarks

1. Let $\Sigma^{n-1}$ be a closed 2-sided hypersurface in the spacelike hypersurface $V^n \subset M^{n+1}$. Then there exists a neighborhood $W$ of $\Sigma^{n-1}$ in $V^n$ such that $\Sigma$ separates $W$ into an “inside” and an “outside”. Suppose $\Sigma$ is marginally outer trapped, i.e., $\theta = 0$ with respect to the outer null normal to $\Sigma$. Following the terminology introduced in [3], we say that $\Sigma$ is stably outermost (respectively, strictly stably outermost) provided the principal eigenvalue $\lambda_1$ of the stability operator $L$ introduced in (2.5) satisfies $\lambda_1 \ge 0$ (resp., $\lambda_1 > 0$). It is clear from the proof that the conclusion of Theorem 2.1 remains valid for marginally outer trapped surfaces that are stably outermost. Moreover the conclusion that $\Sigma$ is positive Yamabe holds without any caveat if $\Sigma$ is strictly stably outermost. To see this, note that Eq. (2.6) then implies that there exists $\epsilon > 0$ such that $\frac{\partial\theta}{\partial t}\big|_{t=0} \ge \epsilon$. Tracing through the proof using this inequality shows that (2.12) holds with $Q$ replaced by $Q - \epsilon$. Then the parenthetical expression in Eq. (2.14) will include a positive term proportional to $\epsilon$, and so $\tilde{S}$ will be strictly positive.

2. Theorem 2.1 applies, in particular, to the marginally trapped surfaces $S_R$ of a dynamical horizon $H$ (see [2] for definitions). Indeed, by the maximum principle for marginally trapped surfaces [1], there can be no outer trapped surfaces in $H$ outside of any $S_R$. Alternatively, it is easily checked that each $S_R$ is stably outermost in the sense described in the previous paragraph.

3. As discussed in [6], the exceptional case in Theorem 2.1 can in effect be eliminated in the time symmetric case. In this case $V^n$ becomes a manifold of nonnegative scalar curvature, and $\Sigma^{n-1}$ is minimal. By the results in [5, 4], if $\Sigma$ is locally outer area minimizing and does not carry a metric of positive scalar curvature then an outer neighborhood of $\Sigma$ in $V$ splits isometrically as a product $[0, \epsilon) \times \Sigma$. In physical terms, this means that there would be marginally outer trapped surfaces outside of $\Sigma$,
which, by a slight strengthening of our definition of ‘outer apparent horizon’, could not occur. (In fact, marginally outer trapped surfaces cannot occur outside the event horizon.) Under mild physical assumptions, but with $\dim M \le 8$, one can show that $\Sigma$ is locally outer area minimizing; see [6, Theorem 3] for further discussion. Finally, in the asymptotically flat, but not necessarily time symmetric case, it is possible to perturb the initial data to make the dominant energy inequality strict, see [19, p. 240]. Hence, the exceptional case is unstable in this sense.

Note added in proof: We are now able to eliminate the exceptional case in the general non-time symmetric setting under conditions analogous to those referred to in remark 3 above, e.g. assuming a mild asymptotic condition and assuming there are no outer trapped or marginally outer trapped surfaces outside of $\Sigma$. Details will appear in a forthcoming paper.

Acknowledgements. This work was supported in part by NSF grants DMS-0405906 (GJG) and DMS-0104163 (RS). The work was initiated at the Isaac Newton Institute in Cambridge, England during the Fall 2005 Program on Global Problems in Mathematical Relativity, organized by P. Chruściel, H. Friedrich and P. Tod. The authors would like to thank the Newton Institute for its support.
References

1. Ashtekar, A., Galloway, G.J.: Uniqueness theorems for dynamical horizons. Adv. Theor. Math. Phys. 8, 1–30 (2005)
2. Ashtekar, A., Krishnan, B.: Dynamical horizons and their properties. Phys. Rev. D 68, 261101 (2003)
3. Andersson, L., Mars, M., Simon, W.: Local existence of dynamical and trapping horizons. Phys. Rev. Lett. 95, 111102 (2005)
4. Cai, M.: Volume minimizing hypersurfaces in manifolds of nonnegative scalar curvature. In: Minimal Surfaces, Geometric Analysis, and Symplectic Geometry, Advanced Studies in Pure Mathematics, eds. Fukaya, K., Nishikawa, S., Spruck, J., 34, 1–7 (2002)
5. Cai, M., Galloway, G.J.: Rigidity of area minimizing tori in 3-manifolds of nonnegative scalar curvature. Commun. Anal. Geom. 8, 565–573 (2000)
6. Cai, M., Galloway, G.J.: On the topology and area of higher dimensional black holes. Class. Quant. Grav. 18, 2707–2718 (2001)
7. Emparan, R., Reall, H.S.: A rotating black ring in five dimensions. Phys. Rev. Lett. 88, 101101 (2002)
8. Gibbons, G.W.: The time symmetric initial value problem for black holes. Commun. Math. Phys. 27, 87–102 (1972)
9. Gromov, M., Lawson, B.: Spin and scalar curvature in the presence of the fundamental group. Ann. of Math. 111, 209–230 (1980)
10. Gromov, M., Lawson, B.: Positive scalar curvature and the Dirac operator on complete Riemannian manifolds. Publ. Math. IHES 58, 83–196 (1983)
11. Hawking, S.W.: Black holes in general relativity. Commun. Math. Phys. 25, 152–166 (1972)
12. Hawking, S.W.: The event horizon. In: Black Holes, Les Houches lectures (1972), edited by C. DeWitt, B. S. DeWitt. Amsterdam: North Holland, 1972
13. Hawking, S.W., Ellis, G.F.R.: The large scale structure of space-time. Cambridge: Cambridge University Press, 1973
14. Helfgott, C., Oz, Y., Yanay, Y.: On the topology of black hole event horizons in higher dimensions. JHEP 02, 024 (2006)
15. Kazdan, J., Warner, F.: Prescribing curvatures. Proc. Symp. in Pure Math. 27, 309–319 (1975)
16. Lichnerowicz, A.: Spineurs harmoniques. C. R. Acad. Sci. Paris, Sér. A-B 257, 7–9 (1963)
17. Schoen, R., Yau, S.T.: Existence of incompressible minimal surfaces and the topology of three dimensional manifolds of non-negative scalar curvature. Ann. of Math. 110, 127–142 (1979)
18. Schoen, R., Yau, S.T.: On the structure of manifolds with positive scalar curvature. Manuscripta Math. 28, 159–183 (1979)
19. Schoen, R., Yau, S.T.: Proof of the positive mass theorem. II. Commun. Math. Phys. 79, 231–260 (1981)

Communicated by G.W. Gibbons
Commun. Math. Phys. 266, 577–594 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0043-z
Geometric Quantization, Parallel Transport and the Fourier Transform

William D. Kirwin$^\star$, Siye Wu

Department of Mathematics, University of Colorado, Boulder, CO 80309-0395, USA

$^\star$ Current address: Department of Mathematics, University of Notre Dame, Notre Dame, IN 46556, USA

Received: 4 October 2004 / Accepted: 27 February 2006 Published online: 7 July 2006 – © Springer-Verlag 2006

Abstract: In quantum mechanics, the momentum space and position space wave functions are related by the Fourier transform. We investigate how the Fourier transform arises in the context of geometric quantization. We consider a Hilbert space bundle $\mathcal{H}$ over the space $\mathcal{J}$ of compatible complex structures on a symplectic vector space. This bundle is equipped with a projectively flat connection. We show that parallel transport along a geodesic in the bundle $\mathcal{H} \to \mathcal{J}$ is a rescaled orthogonal projection or Bogoliubov transformation. We then construct the kernel for the integral parallel transport operator. Finally, by extending geodesics to the boundary (for which the metaplectic correction is essential), we obtain the Segal-Bargmann and Fourier transforms as parallel transport in suitable limits.

1. Introduction

In quantum mechanics, the position and momentum space wave functions are related by the Fourier transform. In this paper we examine how this relationship arises in the context of geometric quantization. We provide a geometric interpretation of the Fourier transform as parallel transport in a vector bundle of infinite rank. In fact, this consideration leads to a smoothly parametrized family of transforms which includes the Fourier transform, the Segal-Bargmann transform, and the Bogoliubov transform.

Quantization of a symplectic manifold $(M, \omega)$ requires an Hermitian line bundle $\ell \to M$ with a compatible connection such that the curvature is $\frac{\omega}{\sqrt{-1}}$. $\ell$ is called a pre-quantum line bundle and it exists if and only if the de Rham class $[\frac{\omega}{2\pi}]$ is integral. The pre-quantum Hilbert space $\mathcal{H}_0$ consists of sections of $\ell$ which are square-integrable with respect to the Liouville volume form on $M$. As is well-known, $\mathcal{H}_0$ is too large for the purpose of quantization. The additional structure we need is an almost complex structure compatible with $\omega$. The space $\mathcal{J}$ of such $J$ is connected and contractible. Each $J \in \mathcal{J}$ defines a quantum Hilbert space $\mathcal{H}_J$ of (square-integrable) $J$-holomorphic sections of $\ell$. They form a vector bundle $\mathcal{H}$ of Hilbert spaces over $\mathcal{J}$, provided there is no jump of $\dim \mathcal{H}_J$ as $J$ varies. To compare $\mathcal{H}_J$ for different $J$, we need a connection on $\mathcal{H} \to \mathcal{J}$. Given $J, J' \in \mathcal{J}$ and a path connecting them, parallel transport in $\mathcal{H}$ is a unitary operator from $\mathcal{H}_J$ to $\mathcal{H}_{J'}$. If the connection is projectively flat, the holonomy is $U(1)$, and parallel transports along different paths from $J$ to $J'$ differ by at most a phase. Since a quantum state is actually represented by a ray in the Hilbert space, the “physics” obtained is thus independent of the choice of $J$. Unfortunately such a connection does not exist in general [5].

We will henceforth restrict our attention to a symplectic vector space $(V, \omega)$. We also restrict $\mathcal{J}$ to the space of linear complex structures on $V$ compatible with $\omega$. In this case, a projectively flat connection on $\mathcal{H} \to \mathcal{J}$ is constructed in [1], as a finite dimensional model for studying Chern-Simons gauge theory.

In this paper, we study parallel transport in the bundle $\mathcal{H}$ along the geodesics in $\mathcal{J}$, when the symplectic manifold is a vector space. The space $\mathcal{J}$ can be identified with the Siegel upper-half space, which has a natural Kähler metric. We show that parallel transport along a geodesic in the Hilbert space bundle is a rescaled orthogonal projection. Hence parallel transport agrees with the Bogoliubov transformation in [14, 15] and the intertwining operators in [10] and [8]. Part of the boundary of $\mathcal{J}$ (as a bounded domain) consists of real Lagrangian subspaces $L$ of $(V, \omega)$. Each $L$ is a real polarization and also defines a quantum Hilbert space. By extending geodesics to the boundary (for which the metaplectic correction is essential), we obtain the Segal-Bargmann and Fourier transforms as parallel transport in suitable limits.

The rest of the paper is organized as follows. In Sect.
2, we recall the identification of $\mathcal{J}$ with the Siegel upper-half space and describe the connection and the resulting geometry of the bundle of quantum Hilbert spaces over $\mathcal{J}$. We also incorporate the metaplectic correction. In Sect. 3, we study parallel transport in $\mathcal{H}$ along geodesics in $\mathcal{J}$. The condition of parallel transport is a partial differential equation. By the $Sp(V, \omega)$ symmetry, it suffices to consider a special class of geodesics so that the equation can be solved explicitly. We then show that the parallel transport is actually a rescaled orthogonal projection known as the Bogoliubov transformation. Hence the integral kernel for the equation of parallel transport is the Bergman reproducing kernel, up to a positive factor. We then show that by extending a geodesic in one direction to infinity, the parallel transport becomes the Segal-Bargmann transform. Extending both ends of a geodesic to infinity, the parallel transport converges to the Fourier transform. Since a real Lagrangian space is on the boundary of $\mathcal{J}$, the quantum Hilbert space associated to it is not inside the bundle $\mathcal{H} \to \mathcal{J}$. We show that with the metaplectic correction, the limit holds in the sense of almost everywhere convergence as sections over $V$. Other ways to formulate the limit are also established.

Finally, we would like to mention some recent related work. Let $K$ be a Lie group of compact type, that is, $K$ is locally isomorphic to a compact Lie group. The cotangent bundle $T^*K$ is naturally symplectic and, being diffeomorphic to the complexification $K_{\mathbb{C}}$, has a compatible complex structure. In [6], Hall constructed a generalized Segal-Bargmann transform between the vertically polarized and Kähler polarized quantum Hilbert spaces. The pairing is a unitary operator and a rescaled projection, as in [14, 15] for the flat case. In [4], the authors study parallel transport in the quantum Hilbert space bundle over a 1-parameter family of Kähler polarizations on $T^*K$. As in the flat case [1], the parallel transport equation is given by a holomorphic version of the heat operator,
which also appeared in [6]. It would be interesting to explore the projective flatness of the quantum Hilbert space bundle over a larger class of Kähler polarizations on $T^*K$.

2. Geometry of the Hilbert Space Bundle

2.1. Complex polarizations and the metaplectic correction. Let $V$ be a real vector space of dimension $2n$ equipped with a constant symplectic form $\omega$ (i.e., a nondegenerate, closed 2-form). There exist linear coordinates $\{x^i, y^i\}_{i=1,\dots,n}$ or ${}^t x = (x^1 \cdots x^n)$, ${}^t y = (y^1 \cdots y^n)$ on $V$ such that

$$\omega = \sum_{i=1}^n dx^i \wedge dy^i = {}^t dx \wedge dy.$$

A complex structure $J \in \operatorname{End}(V)$ is compatible with the symplectic form $\omega$ if $\omega(J\,\cdot\,, J\,\cdot\,) = \omega(\,\cdot\,,\,\cdot\,)$ and $\omega(\,\cdot\,, J\,\cdot\,) > 0$. Given such a $J$, the complexification of $V$ decomposes as $V^{\mathbb{C}} = V_J^{(1,0)} \oplus V_J^{(0,1)}$, where $V_J^{(1,0)} = \{X \in V^{\mathbb{C}} \mid JX = \sqrt{-1}\,X\}$ and $V_J^{(0,1)} = \overline{V_J^{(1,0)}}$. Let $\mathcal{J}$ be the set of all compatible complex structures on $V$. $\mathcal{J}$ can be identified, as follows, with the Siegel upper half-space

$$\mathbb{H}_n = \{\Omega \in M_n(\mathbb{C}) \mid {}^t\Omega = \Omega,\ \operatorname{Im}\Omega > 0\} \subset \mathbb{C}^{\frac{1}{2}n(n+1)}.$$

We associate a compatible complex structure $J_\Omega \in \mathcal{J}$ to a point $\Omega \in \mathbb{H}_n$ so that $V_{J_\Omega}^{(1,0)}$ is spanned by $\begin{pmatrix}\Omega \\ I_n\end{pmatrix}$. Equivalently, the complex structure can be written in terms of $\Omega = \Omega_1 + \sqrt{-1}\,\Omega_2$ as

$$J_\Omega = \begin{pmatrix} \Omega_1\Omega_2^{-1} & -\Omega_2 - \Omega_1\Omega_2^{-1}\Omega_1 \\ \Omega_2^{-1} & -\Omega_2^{-1}\Omega_1 \end{pmatrix}.$$

Thus $\mathcal{J}$ is identified with the positive Lagrangian Grassmannian. Real Lagrangian subspaces correspond to certain points on the boundary of $\mathbb{H}_n$. For any $\Omega \in \mathbb{H}_n$, we choose the corresponding holomorphic coordinates on $V$ as

$$z_\Omega = (2\Omega_2)^{-\frac{1}{2}}(x - \bar{\Omega} y). \tag{2.1}$$
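As a consistency check on the conventions just fixed, one can verify that the block form of $J_\Omega$ squares to minus the identity, has $\binom{\Omega}{I_n}$ as a $\sqrt{-1}$-eigenvector, and that $z_\Omega$ vanishes on the conjugate subspace. The sketch below assumes only that $\Omega_2$ is invertible and that vectors are written in the $(x, y)$ block order; the block names $a, b, c, d$ are introduced here for convenience.

```latex
\[
J_\Omega = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \qquad
a = \Omega_1\Omega_2^{-1},\quad
b = -\Omega_2 - \Omega_1\Omega_2^{-1}\Omega_1,\quad
c = \Omega_2^{-1},\quad
d = -\Omega_2^{-1}\Omega_1 .
\]
% J_\Omega^2 = -1, block by block:
\begin{align*}
a^2 + bc &= \Omega_1\Omega_2^{-1}\Omega_1\Omega_2^{-1} - I_n
            - \Omega_1\Omega_2^{-1}\Omega_1\Omega_2^{-1} = -I_n, \\
ab + bd  &= -\Omega_1 - \Omega_1\Omega_2^{-1}\Omega_1\Omega_2^{-1}\Omega_1
            + \Omega_1 + \Omega_1\Omega_2^{-1}\Omega_1\Omega_2^{-1}\Omega_1 = 0, \\
ca + dc  &= \Omega_2^{-1}\Omega_1\Omega_2^{-1} - \Omega_2^{-1}\Omega_1\Omega_2^{-1} = 0, \\
cb + d^2 &= -I_n - \Omega_2^{-1}\Omega_1\Omega_2^{-1}\Omega_1
            + \Omega_2^{-1}\Omega_1\Omega_2^{-1}\Omega_1 = -I_n .
\end{align*}
% Eigenvector, using \Omega = \Omega_1 + \sqrt{-1}\,\Omega_2:
\[
J_\Omega \begin{pmatrix} \Omega \\ I_n \end{pmatrix}
 = \begin{pmatrix} \sqrt{-1}\,\Omega_1 - \Omega_2 \\ \sqrt{-1}\, I_n \end{pmatrix}
 = \sqrt{-1} \begin{pmatrix} \Omega \\ I_n \end{pmatrix}.
\]
% On the conjugate subspace (x, y) = (\bar\Omega v, v) one has x - \bar\Omega y = 0,
% so z_\Omega = (2\Omega_2)^{-1/2}(x - \bar\Omega y) vanishes on V^{(0,1)}.
```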
The matrix factor $(2\Omega_2)^{-\frac{1}{2}}$ is chosen so that the symplectic form is

$$\omega = \sqrt{-1}\;{}^t dz_\Omega \wedge d\bar{z}_\Omega.$$

We will drop the subscript $\Omega$ when there is no danger of confusion.

There is a pre-quantum line bundle $\ell \to V$ with a connection whose curvature is $\frac{\omega}{\sqrt{-1}}$. We use the symplectic potential $\tau = \frac{1}{2}\sum_{i=1}^n (x^i dy^i - y^i dx^i)$ to trivialize $\ell \to V$. That is, the covariant derivative of a section $s \in \Gamma(\ell)$ along $X$ is

$$\nabla_X s = X(s) - \sqrt{-1}\,(\iota_X \tau)\,s,$$

if $s$ is identified with a function on $V$. The pre-quantum Hilbert space $\mathcal{H}_0$ consists of square-integrable sections of $\ell$ with respect to the Liouville volume form $\varepsilon_\omega = \frac{(\omega/2\pi)^n}{n!}$. Polarized sections of $\ell$ are those which are holomorphic, i.e., $\nabla_{\bar{z}} s = 0$. Using the complex coordinates (2.1), the covariant derivatives in $\ell$ are

$$\nabla_z = \frac{\partial}{\partial z} - \frac{1}{2}\bar{z}, \qquad \nabla_{\bar{z}} = \frac{\partial}{\partial \bar{z}} + \frac{1}{2} z. \tag{2.2}$$

Hence, a polarized section $\psi \in \mathcal{H}_J$ can be written as

$$\psi = \phi(z)\, e^{-\frac{1}{2}|z|^2}$$

for some entire function $\phi$. Let $\mathcal{H}_J \subset \mathcal{H}_0$ denote the space of square integrable polarized sections with respect to the complex structure $J$. This is the quantum Hilbert space. We then have a quantum Hilbert space bundle $\mathcal{H} \to \mathcal{J}$ with fiber $\mathcal{H}_J$ over $J$. There is an Hermitian structure on this bundle given by

$$\langle \psi_1, \psi_2 \rangle = \int_V \bar{\psi}_1 \psi_2\, \varepsilon_\omega \tag{2.3}$$

for $\psi_1, \psi_2 \in \mathcal{H}_J$. Here and below, when $J$ is parameterized by $\Omega \in \mathbb{H}_n$, the subscript $J$ can be replaced by $\Omega$. For example, we write $\mathcal{H}_\Omega = \mathcal{H}_{J_\Omega}$.

Since $\mathcal{J}$ is the positive Lagrangian Grassmannian, there is a natural canonical bundle $L \to \mathcal{J}$ with fiber $V_J^{(1,0)}$ over $J$. Let $K \to \mathcal{J}$ be the dual determinant bundle with fiber $K_J = \bigwedge^n (V_J^{(1,0)})^*$. Since $\mathcal{J}$ is contractible, there is a unique (up to equivalence) square root bundle $\sqrt{K} \to \mathcal{J}$ such that $\sqrt{K} \otimes \sqrt{K} = K$. This square root bundle is known as the bundle of half-forms. We define the corrected quantum Hilbert space bundle $\hat{\mathcal{H}} \to \mathcal{J}$ as $\hat{\mathcal{H}} = \mathcal{H} \otimes \sqrt{K}$. The fiber $\hat{\mathcal{H}}_J = \mathcal{H}_J \otimes \sqrt{K_J}$ is called the corrected quantum Hilbert space. Including the bundle of half-forms is known as the metaplectic correction.

2.2. Symplectic and metaplectic group actions. Given a vector space $V$ with a symplectic form $\omega$, the symplectic group $Sp(V, \omega)$ is the set of linear transformations on $V$ preserving $\omega$. Upon choosing a set of linear symplectic coordinates $\{x^i, y^i\}_{i=1,\dots,n}$, the group $Sp(V, \omega)$ is isomorphic to
$$Sp(2n, \mathbb{R}) = \left\{ \begin{pmatrix} A & B \\ C & D \end{pmatrix} \;\middle|\; {}^t\!AC = {}^tCA,\ {}^t\!BD = {}^t\!DB,\ {}^t\!AD - {}^tCB = I_n \right\}.$$

The group $Sp(V, \omega)$ acts on the set $\mathcal{J}$ of compatible complex structures by $g : J \mapsto gJg^{-1}$. The corresponding action on positive complex Lagrangian subspaces is $g : V_J^{(1,0)} \mapsto gV_J^{(1,0)} = V_{gJg^{-1}}^{(1,0)}$. Identifying $\mathcal{J}$ with the Siegel upper half-space $\mathbb{H}_n$, the action of $Sp(V, \omega)$ on $\mathcal{J}$ becomes the fractional linear transformation on $\mathbb{H}_n$, i.e.,

$$g = \begin{pmatrix} A & B \\ C & D \end{pmatrix} : \Omega \mapsto g \cdot \Omega = (A\Omega + B)(C\Omega + D)^{-1}. \tag{2.4}$$

The following results, which will be used in the sequel, can be verified by straightforward calculations.
Lemma 2.1. Suppose g =
=
¯ − √ 2 −1
A
B C D
581
∈ Sp(2n, R) and = g · 0 , = g · 0 ∈ Hn . Put
. Then
1. ¯ = t(C + D)−1 (0 − ¯ 0 )(C0 + D)−1 . − 0
(2.5)
In particular, Im = t(C0 + D)−1 Im0 (C0 + D)−1 . 2. −1 ¯ −1 ¯ ¯ (−1 2 − )2 = ( − ) ( − ) ¯ 0 )−1 ( ¯0− ¯ 0 )(C0 + D)−1 ; = (C0 + D)(0 − −1 2 (2
− −1 )
(2.6)
¯ )−1 = ( − )( −
¯ 0 )−1t(C + D). = t(C0 + D)−1 (0 − 0 )(0 − 0
(2.7)
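The action of $Sp(V, \omega)$ on $V$ lifts to the pre-quantum line bundle $\ell$ preserving the connection. Consequently, the group $Sp(V, \omega)$ acts on the pre-quantum Hilbert space $\mathcal{H}_0$. In fact, since the symplectic potential $\tau$ is preserved by $Sp(2n, \mathbb{R})$, under the corresponding trivialization $\ell \cong V \times \mathbb{C}$, the action of $g \in Sp(2n, \mathbb{R})$ is

$$g \cdot (v, \zeta) = (gv, \zeta), \qquad v \in V \cong \mathbb{R}^{2n},\ \zeta \in \mathbb{C},$$

and that on $s \in \mathcal{H}_0 \cong L^2(V) \otimes \mathbb{C}$ is

$$(g \cdot s)(v) = s(g^{-1}v), \qquad v \in V.$$

The action of $Sp(V, \omega)$ lifts to the Hilbert space bundle $\mathcal{H} \to \mathcal{J}$ covering the action on $\mathcal{J}$. Since $Sp(V, \omega)$ preserves the connection on $\ell$, the action $g : \mathcal{H}_J \to \mathcal{H}_{gJg^{-1}}$ is a unitary isomorphism for any $g \in Sp(V, \omega)$.

The symplectic group $Sp(V, \omega)$ also acts on the vector bundle $L \to \mathcal{J}$ and hence on the line bundle $K \to \mathcal{J}$. In fact the choice of coordinates (2.1) provides a global unitary section $\Omega \mapsto d^n z_\Omega$ of $K$. The transformation of the complex coordinates

$$\begin{aligned}
g = \begin{pmatrix} A & B \\ C & D \end{pmatrix} : z_\Omega \mapsto (g^{-1})^* z_\Omega
&= \Omega_2^{-\frac{1}{2}}\; {}^t(C\bar{\Omega} + D)\,(g \cdot \Omega)_2^{\frac{1}{2}}\, z_{g\cdot\Omega} \\
&= \Omega_2^{\frac{1}{2}}\,(C\Omega + D)^{-1}\,(g \cdot \Omega)_2^{-\frac{1}{2}}\, z_{g\cdot\Omega},
\end{aligned} \tag{2.8}$$

where $(g \cdot \Omega)_2 = \operatorname{Im}(g \cdot \Omega)$, is unitary, and so is that of the section $d^n z_\Omega$,

$$g = \begin{pmatrix} A & B \\ C & D \end{pmatrix} : d^n z_\Omega \mapsto (g^{-1})^* d^n z_\Omega = \frac{\det(C\bar{\Omega} + D)}{|\det(C\bar{\Omega} + D)|}\, d^n z_{g\cdot\Omega}. \tag{2.9}$$

This action does not lift to $\sqrt{K}$, but the double covering group of $Sp(V, \omega)$ does act on $\sqrt{K}$. Since $Sp(V, \omega)$ is connected and $\pi_1(Sp(V, \omega)) \cong \mathbb{Z}$, there is a unique (up to isomorphism) connected double covering group $Mp(V, \omega)$, called the metaplectic group. The double covering group of $Sp(2n, \mathbb{R})$ is denoted by $Mp(2n, \mathbb{R})$. We have the following well-known result (see for example [9]):

Proposition 2.2. $Mp(V, \omega)$ is isomorphic to the group whose elements are pairs $(\sigma, g)$, where $\sigma$ is a bundle isomorphism of $\sqrt{K} \to \mathcal{J}$ covering the action of $g \in Sp(V, \omega)$ on $\mathcal{J}$. That is, we have a commutative diagram

$$\begin{array}{ccc}
\sqrt{K} & \xrightarrow{\ \sigma\ } & \sqrt{K} \\
\downarrow & & \downarrow \\
\mathcal{J} & \xrightarrow{\ g\ } & \mathcal{J}
\end{array}$$

Consequently, the metaplectic group $Mp(V, \omega)$ acts on $\sqrt{K}$ and hence on the corrected quantum Hilbert space bundle $\hat{\mathcal{H}} = \mathcal{H} \otimes \sqrt{K}$. Given $g = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \in Sp(2n, \mathbb{R})$, the action of a lifted element in $Mp(2n, \mathbb{R})$ is

$$\psi \otimes \sqrt{d^n z_\Omega} \in \hat{\mathcal{H}}_\Omega \;\mapsto\; \frac{\det(C\bar{\Omega} + D)^{\frac{1}{2}}}{|\det(C\bar{\Omega} + D)|^{\frac{1}{2}}}\; \psi \circ g^{-1} \otimes \sqrt{d^n z_{g\cdot\Omega}} \in \hat{\mathcal{H}}_{g\cdot\Omega}, \tag{2.10}$$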
where the square root det (C + D) 2 depends on the lift of g to M p(2n, R). 2.3. Projectively flat and flat connections. First, we describe√a projectively flat connection on H → J [1]. Combining this and the connection on K → J, we obtain a flat ˆ → J [15, §10.2]. connection on H Since H → J is a subbundle of the product bundle J×H0 → J, the trivial connection on the latter projects to a connection on H. This connection is [1] 1 ∇ H = δ + (δ J ω−1 )i j ∇z i ∇z j , 4
(2.11)
where $\delta$ is the exterior differential on $\mathcal{J}$. The second term is a 1-form on $\mathcal{J}$ valued in the set of skew-adjoint operators on $\mathcal{H}_J$. Let $P_J=\tfrac12(1-\sqrt{-1}\,J):V^{\mathbb{C}}\to V_J^{(1,0)}$ be the projection with respect to the Hermitian form on $V^{\mathbb{C}}$ defined by $\omega$ and $J$. Then the curvature of the above connection is [1]
\[
F^{\mathcal{H}}=-\tfrac18\operatorname{Tr}(P_J\,\delta J\wedge\delta J\,P_J)\,\mathrm{id}_{\mathcal{H}_J}.\tag{2.12}
\]
So the connection is projectively flat [1]. Henceforth we omit the identity operator.

The connection described above blows up at the boundary of $\mathcal{J}$. We will be interested in extending geodesics in $\mathcal{J}$ to the boundary. In order to parallel transport along the extended geodesics in the next section, we must employ the metaplectic correction. The product bundle $V^{\mathbb{C}}\times\mathcal{J}\to\mathcal{J}$ has an Hermitian structure defined by $\omega$ and $J$. So a connection on the sub-bundle $L\to\mathcal{J}$ is given by the orthogonal projection of the trivial connection. Its curvature is
\[
F^{L}=P_J\,\delta(P_J\,\delta P_J)=P_J\,\delta P_J\wedge\delta P_J\,P_J=-\tfrac14\,P_J\,\delta J\wedge\delta J\,P_J.\tag{2.13}
\]
Proposition 2.3 ([15, §10.2]). The induced connection on the corrected quantum Hilbert space bundle $\hat{\mathcal{H}}\to\mathcal{J}$ is flat.
Geometric Quantization, Parallel Transport and the Fourier Transform
Proof. The curvature of the connection on $\sqrt{K}$ is $F^{\sqrt{K}}=-\tfrac12\operatorname{Tr}F^{L}$. So by (2.12) and (2.13), $F^{\mathcal{H}}+F^{\sqrt{K}}=0$. The result was proved in [15, §10.2] using cocycles.
The identification J = Hn provides J with a convenient set of coordinates. Using the variation of J , √ √ ¯ −1 −1 −1 ¯ ¯ −1 δJ = −1 −1 δ (I , − ) − n 2 2 2 δ 2 (In , −), In In 2 2 the connection (2.11) becomes ∇
H
√ 1 −1 t −1 ¯ − 2 ∇z . =δ− ∇z 2 2 δ 2 2
(2.14)
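The curvature (2.12) is
\[
F^{\mathcal{H}}=\tfrac18\operatorname{Tr}\big(\Omega_2^{-1}\,\delta\Omega\wedge\Omega_2^{-1}\,\delta\bar\Omega\big).\tag{2.15}
\]
The latter is proportional to the standard Kähler form on $\mathbb{H}_n$. On the other hand, using the (unitary) global section $\sqrt{d^nz_\Omega}$, the connection on $\sqrt{K}$ is given by the 1-form (for any $n\ge1$)
\[
A^{\sqrt{K}}=\frac{\sqrt{-1}}{4}\operatorname{Tr}\big(\Omega_2^{-1}\,\delta\Omega_1\big).\tag{2.16}
\]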
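Its curvature is the negative of (2.15).

The Hilbert space $\mathcal{H}_J$ is the Fock space of a harmonic oscillator with Hamiltonian $H=|z|^2$. In the case $n=1$, the parameter is $\tau=\tau_1+\sqrt{-1}\,\tau_2$ in the upper half-plane. A unitary basis for $\mathcal{H}_J$ is $\{|k\rangle=\tfrac{z^k}{\sqrt{k!}}\,e^{-\frac12|z|^2}\}_{k\in\mathbb{N}}$. The vector $|0\rangle$ is the vacuum state and $|k\rangle$ ($k\ge1$) are the excited states. Such a basis provides a global unitary frame for the bundle $\mathcal{H}$. Each $|k\rangle$, regarded as a function of $\tau$ valued in $\mathcal{H}_0$, has the exterior derivative
\[
\delta|k\rangle=\frac{\sqrt{-1}}{4\tau_2}\Big[\big(k\,|k\rangle-\sqrt{(k+1)(k+2)}\,|k+2\rangle\big)\,\delta\tau
+\big(k\,|k\rangle-2\bar z\sqrt{k}\,|k-1\rangle+\bar z^{2}\,|k\rangle\big)\,\delta\bar\tau\Big].
\]
The connection is given by an infinite skew-Hermitian matrix valued 1-form
\[
A^{\mathcal{H}}_{kl}=\langle k|\delta|l\rangle=\frac{\sqrt{-1}}{4\tau_2}\Big[\big(k\,\delta_{kl}-\sqrt{k(k-1)}\,\delta_{k,l+2}\big)\,\delta\tau
+\big(l\,\delta_{kl}-\sqrt{l(l-1)}\,\delta_{k+2,l}\big)\,\delta\bar\tau\Big],\tag{2.17}
\]
while the matrix of the curvature 2-form is, as expected,
\[
\langle k|F^{\mathcal{H}}|l\rangle=\frac{\delta\tau\wedge\delta\bar\tau}{8\tau_2^{2}}\,\delta_{kl}.\tag{2.18}
\]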
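The passage from (2.17) to (2.18) can be checked numerically on a truncated Fock basis. The sketch below is only an illustration, not part of the original argument: it assumes the curvature convention $F = dA + A\wedge A$, uses $\tau_2=(\tau-\bar\tau)/(2\sqrt{-1})$ to differentiate the prefactor, and the function names and the truncation level are hypothetical. Truncation spoils the last two rows, so only rows $k < N-2$ are meaningful.

```python
import math


def connection_matrices(n_levels):
    """Bracketed matrices of (2.17): M0 multiplies dtau, N0 multiplies dtaubar."""
    M0 = [[0.0] * n_levels for _ in range(n_levels)]
    N0 = [[0.0] * n_levels for _ in range(n_levels)]
    for k in range(n_levels):
        for l in range(n_levels):
            if k == l:
                M0[k][l] += k
                N0[k][l] += l
            if k == l + 2:
                M0[k][l] -= math.sqrt(k * (k - 1))
            if k + 2 == l:
                N0[k][l] -= math.sqrt(l * (l - 1))
    return M0, N0


def matmul(A, B):
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def curvature_coefficient(n_levels, tau2):
    """Matrix c with F = dA + A^A = c * (dtau ^ dtaubar), assumed convention."""
    M0, N0 = connection_matrices(n_levels)
    MN, NM = matmul(M0, N0), matmul(N0, M0)
    c = [[0.0] * n_levels for _ in range(n_levels)]
    for k in range(n_levels):
        for l in range(n_levels):
            # dA: derivative of the prefactor sqrt(-1)/(4 tau2) in tau and taubar
            d_a = -(M0[k][l] + N0[k][l]) / (8.0 * tau2 ** 2)
            # A^A: (sqrt(-1)/(4 tau2))^2 times the commutator [M0, N0]
            a_wedge_a = -(MN[k][l] - NM[k][l]) / (16.0 * tau2 ** 2)
            c[k][l] = d_a + a_wedge_a
    return c
```

On the rows unaffected by truncation the result should be the identity matrix divided by $8\tau_2^2$, which is the content of (2.18).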
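3. Parallel Transport Along the Geodesics

3.1. Solutions to the equation of parallel transport.

The Siegel upper half-space $\mathbb{H}_n$ has a non-positively curved Kähler metric
\[
ds^2=\operatorname{Tr}\big(\Omega_2^{-1}\,\delta\Omega\;\Omega_2^{-1}\,\delta\bar\Omega\big),
\]
which is invariant under the action of $\mathrm{Sp}(2n,\mathbb{R})$. We study parallel transport in the bundles $\mathcal{H}$ and $\hat{\mathcal{H}}$ along the geodesics in $\mathbb{H}_n$. Let $\Omega,\Omega'\in\mathbb{H}_n$ represent $J,J'\in\mathcal{J}$, respectively. Parallel transport in the bundle $\mathcal{H}$ along the unique geodesic from $\Omega$ to $\Omega'$ is a unitary operator, and so is that in $\hat{\mathcal{H}}$. We denote them by $U_{J'J}=U_{\Omega'\Omega}:\mathcal{H}_J\to\mathcal{H}_{J'}$ and $\hat U_{J'J}=\hat U_{\Omega'\Omega}:\hat{\mathcal{H}}_J\to\hat{\mathcal{H}}_{J'}$, respectively. The generating function for the basis of $\mathcal{H}_\Omega$ is a coherent state or a principal vector [2]
\[
c_\alpha(z_\Omega)=\exp\big({}^t\bar\alpha\,z_\Omega-\tfrac12|z_\Omega|^2\big),\tag{3.1}
\]
where $\alpha\in\mathbb{C}^n$. We wish to find $U_{\Omega'\Omega}c_\alpha\in\mathcal{H}_{\Omega'}$ and its metaplectic correction.

For any diagonal matrix $\Lambda=\operatorname{diag}[\lambda_1,\dots,\lambda_n]\ge0$, the curve $\gamma_\Lambda:\mathbb{R}\to\mathbb{H}_n$ defined by $\gamma_\Lambda(t)=\sqrt{-1}\,e^{2t\Lambda}$ is a geodesic in $\mathbb{H}_n$.

Lemma 3.1 ([11]). For any geodesic $\gamma:\mathbb{R}\to\mathbb{H}_n$, there exist $g\in\mathrm{Sp}(2n,\mathbb{R})$ and a diagonal matrix $\Lambda\ge0$ such that $\gamma=g\cdot\gamma_\Lambda$.

We first study parallel transport along the geodesic $\gamma_\Lambda$; the latter determines a one-parameter family of complex structures $J_t$, whose complex coordinates are $z_t=\tfrac{1}{\sqrt2}\big(e^{-t\Lambda}x+\sqrt{-1}\,e^{t\Lambda}y\big)$. The equation of parallel transport of a family of polarized sections $\psi_t\in\Gamma(\gamma_\Lambda^*\mathcal{H})$ is
\[
\big(\partial_t-\tfrac12\,{}^t\nabla_{z_t}\Lambda\nabla_{z_t}\big)\psi_t=0.\tag{3.2}
\]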
Proposition 3.2. The parallel transport of $c_\alpha(z_0)$ along $\gamma_\Lambda$ from $\Omega_0=\sqrt{-1}\,I_n$ to $\Omega_t=\sqrt{-1}\,e^{2t\Lambda}$ is given by
\[
(U_{\Omega_t\Omega_0}c_\alpha)(z)=(\det\operatorname{sech}t\Lambda)^{\frac12}
\exp\Big\{\tfrac12\;{}^t\!\begin{pmatrix}\bar\alpha\\ z\end{pmatrix}
\begin{pmatrix}\tanh t\Lambda&\operatorname{sech}t\Lambda\\ \operatorname{sech}t\Lambda&-\tanh t\Lambda\end{pmatrix}
\begin{pmatrix}\bar\alpha\\ z\end{pmatrix}-\tfrac12|z|^2\Big\}.\tag{3.3}
\]

Proof. Since the connection 1-form $-\tfrac12\,{}^t\nabla_z\Lambda\nabla_z$ is a sum of diagonal terms, we can assume $n=1$; the general case is similar. We can also set $\lambda_1=1$ by a rescaling of $t$. Let $\psi_t$ be the parallel transport of $c_\alpha$. Write $z=z_t$ and $\psi_t=\phi(t,z)\,e^{-\frac12|z|^2}$. Then $\phi(t,z)$ is an entire function in $z$ (for each $t$) satisfying
\[
\Big(\frac{\partial}{\partial t}-\frac12\frac{\partial^2}{\partial z^2}+\frac12z^2\Big)\phi(t,z)=0,\qquad\phi(0,z)=e^{\bar\alpha z}.
\]
Here we have used (2.2) and $\tfrac{d}{dt}z=-\bar z$. Set $\phi(t,z)=e^{f(t,z)}$. Then $f(t,z)$ satisfies
\[
f_t-\tfrac12f_{zz}-\tfrac12f_z^2+\tfrac12z^2=0,\qquad f(0,z)=\bar\alpha z.
\]
If we look for a solution of the form $f(t,z)=\tfrac12p(t)z^2+q(t)z+\tfrac12r(t)$, then $p,q,r$ satisfy a set of ordinary differential equations
\[
p_t=p^2-1,\qquad q_t=pq,\qquad r_t=p+q^2
\]
with the initial conditions $p(0)=r(0)=0$, $q(0)=\bar\alpha$. The solutions are $p(t)=-\tanh t$, $q(t)=\bar\alpha\operatorname{sech}t$, $r(t)=\ln\operatorname{sech}t+\bar\alpha^2\tanh t$, and hence
\[
\phi(t,z)=\sqrt{\operatorname{sech}t}\,\exp\big(\bar\alpha z\operatorname{sech}t+\tfrac12(\bar\alpha^2-z^2)\tanh t\big).
\]
The result follows from the uniqueness of parallel transport.
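The closed-form solution in the proof of Proposition 3.2 can be sanity-checked numerically. The following sketch (for $n=1$, $\lambda_1=1$) uses central finite differences to test the three ordinary differential equations for $p,q,r$ and the equation $\phi_t-\tfrac12\phi_{zz}+\tfrac12z^2\phi=0$ with $\phi(0,z)=e^{\bar\alpha z}$. The function names, step sizes and sample points are illustrative choices, not taken from the paper.

```python
import cmath
import math


def pqr(t, abar):
    """Closed-form p, q, r from the proof (abar stands for the conjugate of alpha)."""
    sech = 1.0 / math.cosh(t)
    return -math.tanh(t), abar * sech, math.log(sech) + abar ** 2 * math.tanh(t)


def ode_residuals(t, abar, h=1e-5):
    """Residuals of p' = p^2 - 1, q' = p q, r' = p + q^2 (central differences)."""
    p, q, r = pqr(t, abar)
    pm, qm, rm = pqr(t - h, abar)
    pp, qp, rp = pqr(t + h, abar)
    dp, dq, dr = (pp - pm) / (2 * h), (qp - qm) / (2 * h), (rp - rm) / (2 * h)
    return dp - (p * p - 1), dq - p * q, dr - (p + q * q)


def phi(t, z, abar):
    """phi = exp(p z^2 / 2 + q z + r / 2)."""
    p, q, r = pqr(t, abar)
    return cmath.exp(0.5 * p * z * z + q * z + 0.5 * r)


def pde_residual(t, z, abar, h=1e-4):
    """Residual of phi_t - phi_zz / 2 + z^2 phi / 2 = 0 (phi is entire in z)."""
    phi_t = (phi(t + h, z, abar) - phi(t - h, z, abar)) / (2 * h)
    phi_zz = (phi(t, z + h, abar) - 2 * phi(t, z, abar) + phi(t, z - h, abar)) / (h * h)
    return phi_t - 0.5 * phi_zz + 0.5 * z * z * phi(t, z, abar)
```

Because $\phi$ is holomorphic in $z$, the second derivative may be approximated with a real step even at a complex point, which is what `pde_residual` does.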
Proposition 3.2 enables us to calculate the parallel transport of any basis vector in $\mathcal{H}_{J_0}$. In particular, the parallel transport of the vacuum is no longer the vacuum in a new polarization; it is a linear combination of states with an even number of excitations. We list the parallel transport of a few states with small excitation numbers in the case $n=1$, $\lambda_1=1$:
\[
\begin{pmatrix}1\\ z_0\\ z_0^2\\ \vdots\end{pmatrix}e^{-\frac12|z_0|^2}\ \mapsto\ \sqrt{\operatorname{sech}t}\,
\begin{pmatrix}1\\ z_t\operatorname{sech}t\\ z_t^2\operatorname{sech}^2t+\tanh t\\ \vdots\end{pmatrix}
e^{-\frac12z_t^2\tanh t-\frac12|z_t|^2}.\tag{3.4}
\]

Next we study parallel transport in the half-form bundle $\sqrt{K}$. As noted earlier, the complex coordinates corresponding to the point $\gamma_\Lambda(t)\in\mathbb{H}_n$ are $z_t=\tfrac{1}{\sqrt2}\big(e^{-t\Lambda}x+\sqrt{-1}\,e^{t\Lambda}y\big)$. As $t$ varies, the complex coordinates change by $\tfrac{d}{dt}z_t=-\Lambda\bar z_t$, whose projection to $V^{(1,0)}_{\gamma_\Lambda(t)}$ is 0. Consequently, $dz_t$ is a parallel section of $\gamma_\Lambda^*L^*$ and $\sqrt{d^nz_t}$ is a parallel section of $\gamma_\Lambda^*\sqrt{K}$. The latter is also a consequence of (2.16). Hence the parallel transport of $c_\alpha\otimes\sqrt{d^nz_0}$ is $U_{\Omega_t\Omega_0}c_\alpha\otimes\sqrt{d^nz_t}$.

We now turn to parallel transport along a general geodesic.

Theorem 3.3. Let $\Omega,\Omega'\in\mathbb{H}_n$ and let $\gamma$ be the unique geodesic such that $\gamma(0)=\Omega$ and $\gamma(1)=\Omega'$. Then

1. The parallel transport of $c_\alpha(z_\Omega)$ along $\gamma$ is
(U cα )(z ) =
1
(det 2 ) 4 (det 2 ) 4 1
| det | 2 1 1 t −1 2 2 1 −
I α ¯ n 2 2 × exp 1 1 2 z 2 2 2 −1 2
1 2 2 −1 2 α¯ − 1 1 z 2 In − 2 2 −1 2 1 2
1 2 |z | . 2 (3.5)
2. The parallel transport of
√ n d z along γ is
(det ) 2 1
1
| det | 2
| det | 2 1
dn z
3. The parallel transport of cα (z ) ⊗ (3.6).
=
1
(det ) 2
d n z .
(3.6)
√ n d z along γ is the tensor product of (3.5) and
B and be given by Lemma 3.1 such that γ = g · γ . Then Proof. Let g = CA D √ √ = g · 0 , = g · 0 ∈ Hn , where 0 = −1 In and 0 = −1 e2 In . 1. We first map cα (z ) = exp(tαz ¯ − 21 |z |2 ) in H by g −1 to cα0 (z 0 ) = exp(tα¯ 0 z 0 − 1 2 2 2 2 |z 0 | ) in H0 . By the unitarity of (2.8), we have |z 0 | = |z | and 1
−1
α0 = t(C0 + D) 22 α = (C0 + D)−1 2 2 α. The parallel transport of cα0 (z 0 ) in H0 along γ is (U0 0 cα0 )(z 0 ) in H0 given by (3.3). Since the connection is invariant under Sp(2n, R), the action of g on the latter is ) in H . Here (U cα )(z 1
−1 2 z 0 = e− t(C0 + D) 2 2 z = e (C0 + D)
− 21 z .
Using these identities and Lemma 2.1, we get 1
1
1
det(cosh t) = det 0 0 det(Im0 )− 2 = | det |(det 2 )− 2 (det 2 )− 2 , 1
1
t α¯ 0 sech t z 0 = tα ¯ 22 (C0 + D) −1 (C0 + D)2 2 z
t
0
=
1 2
0
1
2 α ¯ 2 −1 2 z ,
t
1
−1
¯ 0 )−1 ( ¯0− ¯ 0 )(C0 + D)−1 2 α¯ α¯ 0 tanh t α¯ 0 = tα ¯ 22 (C0 + D)(0 − 2
t
1
1
2 = tα(I ¯ n − 22 −1 ¯ 2 )α,
and −tz 0 tanh t z 0 = tz 2
− 21 t
1
¯ 0 )−1t(C + D)2 2 z (C0 + D)−1 (0 − 0 )(0 − 0 1
1
2 −1 2 = tz (In − 2 2 )z .
From these identities and from (3.3), the result follows. √ √ 2. Since the connection on K is invariant under M p(2n, R), we again map d n z by g −1 to 0 , parallel transport it along γ to 0 , and map the result to by g. By (2.9), the phase accumulated in these steps is 1 det(C0 + D) det (C0 + D) 2 , |det(C0 + D) det(C0 + D)| which is equal to those in (3.6) by taking the determinant of (2.5).
3.2. Projections, Bogoliubov transformations and the integral kernel of parallel transport.

Given any two compatible complex structures $J,J'\in\mathcal{J}$, there is an orthogonal projection $P_{J'J}:\mathcal{H}_J\to\mathcal{H}_{J'}$ inside the pre-quantum Hilbert space $\mathcal{H}_0$. In [14] and [15, §9.9], Woodhouse showed that up to a scalar multiplication, this projection is a unitary operator, the Bogoliubov transformation. On the other hand, parallel transport in the bundle $\mathcal{H}\to\mathcal{J}$ along the geodesic from $J$ to $J'$ defines a manifestly unitary operator from $\mathcal{H}_J$ to $\mathcal{H}_{J'}$. The rescaled projection of the vacuum state calculated in [15, §9.9] coincides with (3.3) when $\alpha=0$. We show that this is true for all states.
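Theorem 3.4. For any $J,J'\in\mathcal{J}$, the parallel transport $U_{J'J}$ in $\mathcal{H}$ along the (unique) geodesic from $J$ to $J'$ is the map $\alpha(J,J')\,P_{J'J}:\mathcal{H}_J\to\mathcal{H}_{J'}$, where
\[
\alpha(J,J')=\Big(\det\frac{J+J'}{2}\Big)^{\frac14}=\Big(\det\sqrt{-1}\,\big(P_J-\bar P_{J'}\big)\Big)^{\frac14}.\tag{3.7}
\]

Proof. The quantity $\alpha(J,J')$ defined in (3.7) is invariant under $\mathrm{Sp}(V,\omega)$. Since all the steps are equivariant under $\mathrm{Sp}(V,\omega)$, it suffices to consider the parallel transport along a geodesic of the form $\gamma_\Lambda$. Let $J_t$ be the complex structure corresponding to $\gamma_\Lambda(t)$. We want to show that there exists a function $\alpha(t)$ with $\alpha(0)=1$ such that for any $\psi'\in\mathcal{H}_{J_0}$, $\psi_t\in\mathcal{H}_{J_t}$, if $\{\psi_t\}$ is parallel along $\gamma_\Lambda$, then $\langle\psi',\psi_0\rangle=\alpha(t)\,\langle\psi',\psi_t\rangle$ for any $t\in\mathbb{R}$. This is equivalent to the condition that the right hand side has vanishing derivative with respect to $t$. Again, without loss of generality, we prove the case of $n=1$, $\lambda_1=1$. Since $z_t=\tfrac{1}{\sqrt2}\big(e^{-t}x+\sqrt{-1}\,e^{t}y\big)$, we have
\[
\nabla_{z_t}=\operatorname{sech}t\,\nabla_{z_0}+\tanh t\,\nabla_{\bar z_t}.
\]
We also note that the formal adjoint of $\nabla_{z_0}$ is $-\nabla_{\bar z_0}$. So
\[
\begin{aligned}
\frac{d}{dt}\big(\alpha(t)\langle\psi',\psi_t\rangle\big)
&=\alpha'(t)\langle\psi',\psi_t\rangle+\tfrac12\alpha(t)\langle\psi',\nabla_{z_t}^2\psi_t\rangle\\
&=\alpha'(t)\langle\psi',\psi_t\rangle+\tfrac12\alpha(t)\big\langle\psi',(\operatorname{sech}t\,\nabla_{z_0}+\tanh t\,\nabla_{\bar z_t})\nabla_{z_t}\psi_t\big\rangle\\
&=\alpha'(t)\langle\psi',\psi_t\rangle+\tfrac12\alpha(t)\operatorname{sech}t\,\langle-\nabla_{\bar z_0}\psi',\nabla_{z_t}\psi_t\rangle\\
&\qquad+\tfrac12\alpha(t)\tanh t\,\big\langle\psi',\big(-\sqrt{-1}\,\omega(\partial_{\bar z_t},\partial_{z_t})+\nabla_{z_t}\nabla_{\bar z_t}\big)\psi_t\big\rangle\\
&=\big(\alpha'(t)-\tfrac12\alpha(t)\tanh t\big)\langle\psi',\psi_t\rangle,
\end{aligned}
\]
which vanishes if we choose $\alpha(t)=\sqrt{\cosh t}$. It is easy to verify $\alpha(J_0,J_t)=\alpha(t)$ using $J_t=\begin{pmatrix}0&-e^{2t}\\ e^{-2t}&0\end{pmatrix}$. The second equality in (3.7) is because $\tfrac{J+J'}{2}=\sqrt{-1}\,(P_J-\bar P_{J'})$.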
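The two elementary facts used at the end of this proof are easy to confirm numerically for $n=1$, $\lambda_1=1$: that $\alpha(t)=\sqrt{\cosh t}$ solves $\alpha'=\tfrac12\alpha\tanh t$ with $\alpha(0)=1$, and that $\big(\det\tfrac{J_0+J_t}{2}\big)^{1/4}$ gives the same value. The sketch below assumes the matrix form $J_t=\begin{pmatrix}0&-e^{2t}\\ e^{-2t}&0\end{pmatrix}$ in the $(x,y)$ coordinates, as read off from the text; the function names are hypothetical.

```python
import math


def alpha(t):
    """Candidate scalar factor alpha(t) = sqrt(cosh t)."""
    return math.sqrt(math.cosh(t))


def alpha_ode_residual(t, h=1e-5):
    """Residual of alpha' - (1/2) alpha tanh t, using a central difference."""
    derivative = (alpha(t + h) - alpha(t - h)) / (2 * h)
    return derivative - 0.5 * alpha(t) * math.tanh(t)


def alpha_from_det(t):
    """(det((J_0 + J_t)/2))^(1/4) with the assumed 2x2 form of J_t."""
    j0 = [[0.0, -1.0], [1.0, 0.0]]
    jt = [[0.0, -math.exp(2 * t)], [math.exp(-2 * t), 0.0]]
    m = [[(j0[i][j] + jt[i][j]) / 2 for j in range(2)] for i in range(2)]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return det ** 0.25
```

The determinant works out to $\tfrac14(1+e^{2t})(1+e^{-2t})=\cosh^2t$, so both routes give $\sqrt{\cosh t}$.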
Corollary 3.5. Parallel transport in $\mathcal{H}\to\mathcal{J}$ along the geodesic in $\mathbb{H}_n$ from $J$ to $J'$ coincides with the Bogoliubov transformation from $\mathcal{H}_J$ to $\mathcal{H}_{J'}$.

Proof. The operator $\alpha(J,J')\,P_{J'J}$, including the scalar factor (3.7), coincides with the formula of the Bogoliubov transformation in [14] and [15, §9.9].

There is a more direct explanation of the above result. Given a complex structure $J$ and the corresponding holomorphic coordinates (2.1) on $V$, $\mathcal{H}_J$ is the Fock space of the creation and annihilation operators $a_J^\dagger=z$, $a_J=\nabla_z+\bar z$. As the complex structure changes along a geodesic, so do $a_J^\dagger$ and $a_J$. In fact, when $n=1$ and along $\gamma(t)=\sqrt{-1}\,e^{2t}$, the parallel transports of $a_0$ and $a_0^\dagger$ are
\[
\cosh t\,a_t+\sinh t\,a_t^\dagger\quad\text{and}\quad\sinh t\,a_t+\cosh t\,a_t^\dagger,
\]
respectively, where $a_t=a_{J_t}$ and $a_t^\dagger=a_{J_t}^\dagger$. This deformation of the creation and annihilation operators, or the concept of the vacuum and excitations, is the physics origin of the Bogoliubov transformation.

For any $J\in\mathcal{J}$ represented by $\Omega$, $\mathcal{H}_J$ can be identified with the space of analytic functions on $(V,J)$ with the measure $e^{-|z_\Omega|^2}\varepsilon_\omega(z_\Omega)$. The orthogonal projection onto $\mathcal{H}_J$ is given by the Bergman kernel. So we can express parallel transport by an integral kernel operator.

Proposition 3.6. For any $\Omega,\Omega'\in\mathbb{H}_n$, the parallel transport $U_{\Omega'\Omega}$ in $\mathcal{H}$ along the (unique) geodesic from $\Omega$ to $\Omega'$ is
1
φ(z ) e− 2 |z | →
| det | 2
2
1 4
1
e− 2 |z |
1 (det 2 ) 4
2
(det 2 ) 1 t 2 1 2 × e z z¯ − 2 |z | − 2 |z | φ(z ) εω (z ).
(3.8)
V
Proof. The projection onto H is given by the Bergman kernel t
1
e z z¯ − 2 |z | Using the facts J 1 1
1
=
2 − 1 |z |2 2
.
√ √ −1 1 and (1, −)J = − −1 (1, −), we get
¯ J +J − 2 1 −
¯ 1
√ ¯ 0 − = −1 ¯ . 0 −
Taking the determinant, we get det
J + J | det |2 , = 2 det 2 det 2
from which the scalar factor in (3.8) follows.
Up to a phase, (3.8) agrees with the unitary intertwining operator from H to H in [10, 8]. √ √ A pairing can be defined on the half forms d n z and d n z even though they come from different complex structures.1 A simple calculation yields d n z , d n z =
1
(det ) 2 1
1
(det 2 ) 4 (det 2 ) 4
.
(3.9)
Since both the scalar factor in (3.8) and the phase in (3.6) are absorbed in (3.9), we have recovered 1 We recall that the pairing of d n z n and d z is determined by (−1)
d n z , d n z εω and that of
n(n−1) d z¯ ∧dz √ 2 (2π −1 )n
√ n √ d z and d n z is d n z , d n z = d n z , d n z .
=
Corollary 3.7√([15, §10.2]). Parallel transport from to under the flat connection ˆ = H ⊗ K is given by in H Uˆ : ψ ⊗
ˆ → d n z , d n z P ψ ⊗ d n z ∈ H ˆ . d n z ∈ H
(3.10)
ˆ and H ˆ , Alternatively, this map can be described by a pairing between H
ψ ⊗
d n z , ψ ⊗
d n z = ψ , ψ d n z , d n z ,
(3.11)
where ψ , ψ is the inner product of ψ ∈ H and ψ ∈ H in H0 . ˆ . When = , the above pairing is the inner product (2.3)in H − 12 |z |2 in (3.8) is cα (z ), the integration yields the same We remark that if φ(z ) e result as (3.5). This gives another integral kernel of parallel transport. The existence of two different kernels is because φ(z ) is restricted to be holomorphic. Theorem 3.8. Under the assumptions of Proposition 3.6, the map U sends φ(z ) 1 2 e− 2 |z | to 1
1
(det 2 ) 4 (det 2 ) 4
1
e− 2 |z |
1 2
2
|det | 1 1 t 2 1 In − 22 −1 z ¯ 2 × φ(z ) exp 1 1 2 z 2 V 2 2 −1 2
1
1
2 22 −1 2 1
1
2 In − 2 2 −1 2
− |z |2 εω (z ). 1
z¯ z
(3.12)
1
Proof. Since φ(z )e− 2 |z | is in H , we have an estimate |φ(w)| ≤ C e 2 |w| , where C is its norm [2]. By the reproducing property of the Bergman kernel, 2
1
φ(z ) e− 2 |z | = 2
2
φ(w) e−|w| cw (z ) εω (w), 2
V
which we substitute in (3.8). The integrand satisfies 1 2
1 1 1 2 2 t¯ 2 t 2
2 1 − 2 |z | + z z¯ − 2 |z |
≤ C e− 2 |w−z | − 2 |z −z | . e− 2 |z | φ(w) e−|w| − wz
Hence the double integral in w and z is absolutely convergent. Exchanging the order of the integration and performing the integral in z , we get
φ(w) e−|w| (U cw )(z ) εω (w), 2
V
which is (3.12) after relabeling the integration variable w as z .
3.3. Segal-Bargmann and Fourier transforms as parallel transport.

The set $\mathcal{S}$ of real Lagrangian subspaces in $(V,\omega)$ can be identified with the Shilov boundary of $\mathcal{J}$. For any $L\in\mathcal{S}$, there is an Hermitian form on the space of sections of the pre-quantum line bundle that are covariantly constant along $L$ by choosing a translation invariant measure on $V/L$. The subspace $\mathcal{H}_L$ of such sections that are $L^2$-integrable on $V/L$ is independent of the choice of the measure. The bundles $K$ and $\sqrt{K}$ extend to $\mathcal{S}$; the fiber of $K$ over $L\in\mathcal{S}$ is $K_L=(\wedge^n(V/L)^*)^{\mathbb{C}}$, where $(V/L)^*$ is identified as the subspace of $V^*$ that annihilates $L$. Let $\hat{\mathcal{H}}_L=\mathcal{H}_L\otimes\sqrt{K_L}$ for any $L\in\mathcal{S}$; this is the quantum Hilbert space (with the metaplectic correction) associated to the real polarization $L$. The action of $\mathrm{Sp}(V,\omega)$ on $\mathcal{S}$ lifts to that of $\mathrm{Mp}(V,\omega)$ on the bundles $\sqrt{K}$ and $\hat{\mathcal{H}}$. There is a canonical Hermitian form on $\hat{\mathcal{H}}$. Given $\psi_1,\psi_2\in\mathcal{H}_L$ and $\sqrt{\nu}\in\sqrt{K_L}$ ($L\in\mathcal{S}$), we have
\[
\big\langle\psi_1\otimes\sqrt{\nu},\,\psi_2\otimes\sqrt{\nu}\big\rangle=\int_{V/L}\bar\psi_1\psi_2\,\frac{|\nu|}{(2\pi)^{\frac n2}},\tag{3.13}
\]
√ where |ν| is a density on V /L ∼ by ν. = Rn determined √ The M p(V, ω)-invariant pairing √on K √ → J between different √ √fibers also extends. Pairings are defined between K J , K L and between K L , K L for J ∈ J and L , L ∈ S such that L and L are transverse. For example, if L − = {x = 0}, L + = {y = 0} ∈ S in the symplectic coordinates (x, y) and ∈ Hn , we have √ √ √ 1 n , d n y, d n x = −1 2 . (3.14) d n z , d n x = det (22 )− 2 √ −1 For any J ∈ J corresponding to ∈ Hn , let R J be the subspace of ψ ∈ H J such that |ψ(z )| ≤
C (1 + |z |2 )n+α
ˆ J = RJ ⊗ for some C ≥ 0 and α > 0; such a ψ is L 1 on V . Let R pairing √ √ √ √ ¯ εω ψψ ψ ⊗ ν, ψ ⊗ ν = ν, ν
√ K J . There is a
(3.15)
V
√ √ ˆ J and ψ ⊗ ν ∈ H ˆ L ; the integral in (3.15) is absolutely between any ψ ⊗ ν ∈ R ˆL → H ˆ J is unitary and intertwines convergent. The corresponding operator Bˆ J L : H with the M p(V, ω)-action [2, 8, 15]. If L = L − and if J is parameterized by ∈ Hn , the operator and its inverse are, respectively, Bˆ L :φ(x) e
√
−1 t 2 xy
⊗
√
1
d n x −→
(det 22 ) 4 (det
√ −1
1
)2
1
2
e− 2 |z |
V /L −
φ(x)
1 −1 (2 ) 21 (2 ) 21 √ −1 In − (22 ) 2 √ 2 2 −1 −1 z 1 z × exp x 2 x 1 −1 (2 ) 2 −1 √ √ − 2 −1 −1 # n |d x| , × d n z (3.16) n ⊗ (2π ) 2
t
and, if φ(z ) e− 2 |z | is in R , 2
1 √ −1 t (det 22 ) 4 2 x y n 2 d z −→ e φ(z ) e−|z | 1 V (det √−1 ) 2 −1 −1 1 1 1 t 2 2 2 √ √ − (2 ) (2 ) (2 ) I 2 2 2 z¯ 1 z¯ n −1 × exp −1−1 −1 1 x 2 x √ (22 ) 2 − √−1 −1 √ (3.17) × εω (z ) ⊗ d n x . √ When = −1 In , they are the usual Segal-Bargmann transform and its inverse [2]. For any pair of Lagrangian subspaces L , L ∈ S that are transverse, there exists a FouˆL →H ˆ L that intertwines with the action of M p(V, ω) rier transform operator Fˆ L L : H [7]. In particular, we have 2 −1 : φ(z ) e− 2 |z | ⊗ Bˆ L 1
√ √ −1 t Fˆ L + L − : φ(x) e 2 x y ⊗ d n x √ t √ n φ(x) e −1 x y
−→ −1 2
V /L −
|d n x|
n (2π ) 2
e−
√
−1 t 2 x y
⊗
d n y,
(3.18)
where the integral in the parentheses is the usual Fourier transform $\tilde\phi(y)$. Strictly speaking, (3.18) is valid only on the dense subspace of $\psi\otimes\sqrt{\nu}\in\hat{\mathcal{H}}_L$ such that $|\psi|$ is $L^1$ on $V/L$; the operator then extends continuously to $\hat{\mathcal{H}}_L$.

Proposition 3.9.
1. Let $J\in\mathcal{J}$ and let $L,L'\in\mathcal{S}$ be transverse to each other. Then for any $\hat\psi\in\hat{\mathcal{R}}_J$ and $\hat\psi'\in\hat{\mathcal{R}}_L$,
\[
\lim_{J'\to L}\hat U_{J'J}\hat\psi=\hat B_{JL}^{-1}\hat\psi,\qquad\lim_{J\to L'}\hat B_{JL}\hat\psi'=\hat F_{L'L}\hat\psi';\tag{3.19}
\]
here the limit is pointwise in $V$ as $J'\to L$ or $J\to L'$ from inside $\mathcal{J}$.
2. For any $J,J'\in\mathcal{J}$ and $L,L',L''\in\mathcal{S}$ that are mutually transverse, we have
\[
\hat B_{J'L}=\hat U_{J'J}\circ\hat B_{JL},\qquad\hat F_{L'L}=\hat B_{JL'}^{-1}\circ\hat B_{JL},\qquad\hat F_{L''L}=\hat F_{L''L'}\circ\hat F_{L'L}.\tag{3.20}
\]
Proof. 1. Let $J,J'$ be parameterized by $\Omega,\Omega'\in\mathbb{H}_n$. Without loss of generality, assume $L=L_-$. Then the limit $J'\to L$ is $\Omega'\to0$ with $\Omega'_2>0$. If $\hat\psi(z_\Omega)=\phi(z_\Omega)\,e^{-\frac12|z_\Omega|^2}\otimes\sqrt{d^nz_\Omega}$, then $(\hat U_{\Omega'\Omega}\hat\psi)(z_{\Omega'})$ is the tensor product of (3.12) and (3.6). As $\Omega'\to0$, $(\det2\Omega'_2)^{\frac14}\sqrt{d^nz_{\Omega'}}\to\sqrt{d^nx}$ and the integrand in (3.12) goes to that in (3.17). Since the latter is absolutely integrable, the limit commutes with the integration and thus the first limit in (3.19) follows. We remark here that the scalar factor $(\det2\Omega'_2)^{\frac14}$ that goes to zero in the limit is absorbed by the half-form $\sqrt{d^nz_{\Omega'}}$. The proof of the second limit is similar.

2. Since the connection on the bundle $\hat{\mathcal{H}}\to\mathcal{J}$ is flat and since $\hat U_{J'J}$ is the parallel transport from $J$ to $J'$, we have $\hat U_{J''J'}\circ\hat U_{J'J}=\hat U_{J''J}$. Using $\hat U_{J'J}:\hat{\mathcal{R}}_J\to\hat{\mathcal{R}}_{J'}$ and taking $J''\to L$, we get $\hat B_{J'L}^{-1}\circ\hat U_{J'J}=\hat B_{JL}^{-1}$ on $\hat{\mathcal{R}}_J$, and hence on $\hat{\mathcal{H}}_J$. The proofs of the other two identities are similar.
We thus proved that, as $J'\to L\in\mathcal{S}$, parallel transport of $\hat\psi\in\hat{\mathcal{R}}_J$ from $J$ to $J'$ goes to $\hat B_{JL}^{-1}\hat\psi$. Since the latter is not $L^2$ on $V$ and its norm is defined instead by (3.13), it is not obvious why the "operator" $\lim_{J'\to L}\hat U_{J'J}$ is continuous on $\hat{\mathcal{H}}_J$ or why its image is contained in $\hat{\mathcal{H}}_L$. We now take the limit $J'\to L$ as $J'$ follows the path of a geodesic.

Lemma 3.10.
1. Let $\Lambda\ge0$ be a diagonal matrix and $\gamma=g\cdot\gamma_\Lambda$ a geodesic in $\mathcal{J}$. Then $\lim_{t\to\pm\infty}\gamma(t)$ are real Lagrangian subspaces if and only if $\Lambda>0$.
2. For any $J\in\mathcal{J}$ and $L\in\mathcal{S}$, there is a geodesic $\gamma$ in $\mathcal{J}$ such that $\gamma(0)=J$, $\lim_{t\to-\infty}\gamma(t)=L$.
3. A pair of real Lagrangian subspaces $L,L'$ are transverse if and only if there is a geodesic $\gamma$ in $\mathcal{J}$ such that $\lim_{t\to-\infty}\gamma(t)=L$, $\lim_{t\to+\infty}\gamma(t)=L'$.

Proof. 1. Using the identification of $\mathcal{J}$ and $\mathbb{H}_n$, $\gamma_\Lambda(-\infty)=0$ and $\gamma_\Lambda(+\infty)=+\sqrt{-1}\,\infty\,I_n$ if and only if $\Lambda>0$, in which case they are real Lagrangian subspaces $L_-$ and $L_+$, respectively. The result follows from the transitivity of the $\mathrm{Sp}(V,\omega)$ action on $\mathcal{S}$.
2. Without loss of generality, assume $J$ is represented by $\Omega=\sqrt{-1}\,I_n$. Then for any diagonal $\Lambda>0$, $\gamma_\Lambda(0)=J$ and $\lim_{t\to-\infty}\gamma_\Lambda(t)=L_-$. The isotropy subgroup of $J$ in $\mathrm{Sp}(V,\omega)$ is isomorphic to $U(n)$ and acts transitively on $\mathcal{S}$. Hence the result.
3. Let $\gamma=g\cdot\gamma_\Lambda$ ($\Lambda>0$) be the geodesic such that the limits hold. Then $L=gL_-$ and $L'=gL_+$. $L,L'$ are transverse since $L_-,L_+$ are. Conversely, if $L,L'$ are transverse, then there exists $g\in\mathrm{Sp}(V,\omega)$ such that $L=gL_-$, $L'=gL_+$. The geodesic $\gamma=g\cdot\gamma_\Lambda$ for any $\Lambda>0$ satisfies the requirement.

Proposition 3.11. Let $\gamma$ be a geodesic in $\mathcal{J}$ such that $\gamma(0)=J$ and $\gamma(-\infty)=L$, $\gamma(+\infty)=L'\in\mathcal{S}$. Then for any $\hat\psi\in\hat{\mathcal{H}}_J$, we have
\[
\lim_{t\to-\infty}\hat U_{\gamma(t)J}\hat\psi=\hat B_{JL}^{-1}\hat\psi,\qquad
\lim_{t\to+\infty}\hat U_{\gamma(t)J}\hat\psi=\hat F_{L'L}\lim_{t\to-\infty}\hat U_{\gamma(t)J}\hat\psi\tag{3.21}
\]
almost everywhere on $V$.

Proof. Without loss of generality, we assume $\gamma=\gamma_\Lambda$ ($\Lambda>0$). Then $J$ is given by $\Omega=\sqrt{-1}\,I_n$ and $L=L_-$, $L'=L_+$, while at $\gamma(t)$, $z_t=\tfrac{1}{\sqrt2}\big(e^{-t\Lambda}x+\sqrt{-1}\,e^{t\Lambda}y\big)$. Let $\pi_\pm:V\to V/L_\pm$ be the projections. Let $(\hat B_{JL}^{-1}\hat\psi)(x,y)=\phi(x)\,e^{\frac{\sqrt{-1}}{2}\,{}^txy}\otimes\sqrt{d^nx}$. Using (3.19) and (3.16), we get
\[
\begin{aligned}
(\hat U_{\gamma(t)J}\hat\psi)(x',y')&=(\hat B_{\gamma(t)L_-}\hat B_{JL_-}^{-1}\hat\psi)(x',y')\\
&=\int_{V/L_-}\det e^{-t\Lambda}\,\phi(x)\,e^{-\frac12\,{}^t(x-x')e^{-2t\Lambda}(x-x')+\sqrt{-1}\,{}^t(x-x')y'}\,\frac{|d^nx|}{(2\pi)^{\frac n2}}
\times e^{\frac{\sqrt{-1}}{2}\,{}^tx'y'}\otimes(\det2e^{t\Lambda})^{\frac12}\sqrt{d^nz_t}\qquad(3.22)\\
&=\int_{V/L_-}\phi(x)\,e^{-\frac12\,{}^t(x-x')e^{-2t\Lambda}(x-x')+\sqrt{-1}\,{}^txy'}\,\frac{|d^nx|}{(2\pi)^{\frac n2}}
\times e^{-\frac{\sqrt{-1}}{2}\,{}^tx'y'}\otimes(\det2e^{-t\Lambda})^{\frac12}\sqrt{d^nz_t}.\qquad(3.23)
\end{aligned}
\]
As $t\to-\infty$, (3.22) goes to $\phi(x')\,e^{\frac{\sqrt{-1}}{2}\,{}^tx'y'}\otimes\sqrt{d^nx}$ pointwise on $\pi_-^{-1}(E_\phi)$, where $E_\phi$ is the Lebesgue set of $\phi$ (see for example [12, Theorem I.1.25] or [3, Theorem 8.62]).
√ 1 Again, the scalar factor (det 2et ) 2 that vanishes in the limit is absorbed by d n z t . As √ n −1 n ˜ 2 t → +∞, (3.23) goes to −1 φ(y ) ⊗ d y pointwise on π+ (E φ˜ ) (see for example [3, Theorem 8.31(c)]); this also follows from the t → −∞ limit by making an Sp(V, ω) transformation that fixes J and exchanges L + and L − . It is well known that the Lebesgue set of an L 2 function is the complement of a measure-zero subset (see for example [3, Theorem 3.20] or [12, pp. 12-13]). ˆ L (L ∈ S) are defined up to a set of meaWe remark that since the elements in H sure-zero, the limits in (3.21) are the best possible results for pointwise convergence. The integral in (3.22), being the convolution of φ and the heat kernel, goes to φ(x ) when t → −∞ as tempered distributions on V /L − (see for example [13, Prop. 3.5.1] or ˜ ) when [3, Cor. 8.46]). In the same sense, the integral in (3.23) goes to φ(y t → +∞. Hence the limits in (3.21) hold as tempered distributions on V , with the given trivialization of . Finally, we consider the limit in L 2 -spaces. S is part of the topological boundary of ˆ L (L ∈ S) J as a bounded domain. We define a topology on the disjoint union E of all H ˆ ˆ and the total space of H → J. There is a bijection from E to (J S) × H J0 if we fix ˆ J (J ∈ J) and H ˆ L (L ∈ S) to H ˆ J are Uˆ J J and Bˆ −1 , any J0 ∈ J. The maps from H 0 0 J0 L ˆJ. respectively. The space E thus inherits the product topology on (J S) × H 0
ˆ J, Corollary 3.12. Let J ∈ J and let L , L ∈ S be a transverse pair. Then for any ψˆ ∈ H in the above topology on E, we have ˆ lim Uˆ J J ψˆ = Bˆ −1 J L ψ,
J →L
ˆ lim Uˆ J J ψˆ = Fˆ L L lim Uˆ J J ψ.
J →L
Proof. The limits follow directly from (3.20).
J →L
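The $t\to-\infty$ limit of the integral in (3.22) is the statement that convolution with a normalized Gaussian of width $e^{t}$ recovers the function. A minimal numerical sketch for $n=1$, $\lambda_1=1$ and $y'=0$ follows; it assumes the kernel $e^{-t}e^{-\frac12(x-x')^2e^{-2t}}/\sqrt{2\pi}$ as read from (3.22), and the test function, integration window and names are illustrative choices.

```python
import math


def smoothed(phi, x_prime, t, half_width=12.0, steps=4000):
    """Convolve phi with the Gaussian kernel of width e^t, evaluated at x_prime.

    Substituting x = x_prime + e^t * u turns the integral into
    (1/sqrt(2 pi)) * integral of phi(x_prime + e^t u) exp(-u^2/2) du,
    which is then evaluated with the trapezoidal rule.
    """
    s = math.exp(t)
    du = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        u = -half_width + i * du
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * phi(x_prime + s * u) * math.exp(-0.5 * u * u)
    return total * du / math.sqrt(2.0 * math.pi)


def gaussian(x):
    """Sample function phi(x) = exp(-x^2 / 2)."""
    return math.exp(-0.5 * x * x)


def smoothed_gaussian_exact(x_prime, t):
    """Closed form of the convolution for the sample Gaussian."""
    var = 1.0 + math.exp(2.0 * t)
    return math.exp(-0.5 * x_prime ** 2 / var) / math.sqrt(var)
```

For the sample Gaussian the smoothed value approaches $\phi(x')$ monotonically as $t$ decreases, in line with the pointwise limit on the Lebesgue set.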
Acknowledgements. We are grateful to Arlan Ramsay for many helpful conversations and suggestions – in particular regarding the limits of the parallel transport operator. We would also like to thank Wicharn Lewkeeratiyutkul for bringing to our attention the paper [4].
References
1. Axelrod, S., Della Pietra, S., Witten, E.: Geometric quantization of Chern-Simons gauge theory. J. Diff. Geom. 33, 787–902 (1991)
2. Bargmann, V.: On a Hilbert space of analytic functions and an associated integral transform. Comm. Pure Appl. Math. 14, 187–214 (1961)
3. Folland, G.B.: Real analysis. Modern techniques and their applications. New York: John Wiley & Sons, 1984
4. Florentino, C., Matias, P., Mourao, J., Nunes, J.P.: Geometric quantization, complex structures and the coherent state transform. J. Funct. Anal. 221, 303–322 (2005)
5. Ginzburg, V.L., Montgomery, R.: Geometric quantization and no-go theorems. In: Grabowski, J., Urbański, P. (eds.) Poisson geometry, Warsaw, 1998, Banach Center Publ. 51, Warsaw: Polish Acad. Sci., 2000, 69–77
6. Hall, B.C.: Geometric quantization and the generalized Segal-Bargmann transform for Lie groups of compact type. Commun. Math. Phys. 226, 233–268 (2002)
7. Lion, G., Vergne, M.: The Weil representation, Maslov index and theta series. Prog. in Math. 6, Birkhäuser Boston, MA 1980, Part I
8. Magneron, B.: Spineurs symplectiques purs et indice de Maslov de plan Lagrangiens positifs. J. Funct. Anal. 59, 90–122 (1984)
9. Robinson, P.L., Rawnsley, J.H.: The metaplectic representation, Mpc structures and geometric quantization. Mem. Amer. Math. Soc. Vol. 81, No. 410, Providence, RI: Amer. Math. Soc., 1989
10. Satake, I.: On unitary representations of a certain group extension (in Japanese). Sugaku 21, 241–253 (1969); Fock representations and theta-functions. In: Ahlfors, L.V. et al. (eds.) Advances in the theory of Riemann surfaces. Princeton, NJ: Princeton Univ. Press, 1971, pp. 393–405
11. Siegel, C.L.: Symplectic geometry. Amer. J. Math. 65, 1–86 (1943)
12. Stein, E.M., Weiss, G.: Introduction to Fourier analysis on Euclidean spaces. Princeton, NJ: Princeton Univ. Press, 1971
13. Taylor, M.E.: Partial differential equations I. Basic theory. New York: Springer-Verlag 1996
14. Woodhouse, N.M.J.: Geometric quantization and the Bogoliubov transformation. Proc. Royal Soc. London A 378, 119–139 (1981)
15. Woodhouse, N.M.J.: Geometric Quantization (2nd ed.) New York: Oxford Univ. Press 1992

Communicated by A. Connes
Commun. Math. Phys. 266, 595–629 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0052-y
Communications in
Mathematical Physics
The Equations of Magnetohydrodynamics: On the Interaction Between Matter and Radiation in the Evolution of Gaseous Stars

Bernard Ducomet (1), Eduard Feireisl (2,3)

1 Département de Physique Théorique et Appliquée, CEA/DAM Ile de France, BP 12, 91680 Bruyères-le-Châtel, France
2 Department of Global Analysis, Technical University of Muenchen, Boltzmannstr. 3, 85747 Garching b. Muenchen, Germany
3 Mathematical Institute AS CR, Žitná 25, 115 67 Praha 1, Czech Republic. E-mail: feireisl@math.cas.cz
Received: 18 February 2005 / Accepted: 28 March 2006 Published online: 6 July 2006 – © Springer-Verlag 2006
Abstract: We prove existence of global-in-time weak solutions to the equations of magnetohydrodynamics, specifically, the Navier-Stokes-Fourier system describing the evolution of a compressible, viscous, and heat conducting fluid coupled with the Maxwell equations governing the behaviour of the magnetic field. The result applies to any finite energy data posed on a bounded spatial domain in R 3 , supplemented with conservative boundary conditions.
1. Physical Background-Basic Equations and Main Assumptions

1.1. Introduction.

In a number of situations, stars may be considered as compressible fluids (see [36, 40]), in which the matter behaves either as a perfect or completely degenerate gas. Moreover, their dynamics is very often shaped and controlled by intense magnetic fields coupled to self-gravitation and high temperature radiation effects (see, for instance, [8, 9, 15], among others). A striking example of such a coupling between magnetic and thermomechanical degrees of freedom is observed in the so-called solar flares [41] (eruption phenomena in the coronal region of the Sun). During this spectacular event, a violent brightening is produced in the solar atmosphere where a huge amount of energy ($\sim10^{25}$ joules) is released in a matter of a few minutes, and associated to a large coronal mass ejection. Magnetic reconnection is thought to be the mechanism responsible for this conversion of magnetic energy into heat and fluid motion.

In our treatment, we consider a mathematical model derived from the classical principles of continuum mechanics and electrodynamics, where the field equations, the constitutive relations, as well as other physically grounded hypotheses are chosen on the basis of the following assumptions:

(Footnote to the author list: The work supported by Grant A1019302 of GA AV CR.)
[A1] The material in question is a compressible, viscous, thermally and electrically conducting fluid, occupying a bounded domain $\Omega$ in the physical space $R^3$. The physical system is energetically isolated.

[A2] The motion of the fluid is driven by two dominating body forces, namely, the self-gravitation and the Lorentz force imposed on the fluid by the magnetic field (see Chap. 2 in [8], Chap. 9 in [9], [26]).

[A3] The fluid is a perfect mixture of a finite number of species (gases), among which at least one constituent (the electron gas, for instance) behaves as a Fermi gas in the degenerate area of high densities and/or low temperatures (see Chap. 16 in [15], [34]).

[A4] The motion is an entropy producing (dissipative) process, where the viscous stress, the heat flux, and the induced electric current are linear functions of the affinities: the fluid velocity gradient, the temperature gradient, and the magnetic field respectively. The transport coefficients in these fluxes depend effectively on the temperature (see [33]).

[A5] The pressure as well as the heat conductivity are augmented through the effect of high temperature radiation, assumed to be at thermal equilibrium with the fluid (see Chap. 2 in [8], Chap. 16 in [15]).

Many of the recent theoretical studies in continuum fluid mechanics addressing the problem of global-in-time solutions are concerned with isothermal or isentropic fluid flows (see [27, 39]). The present paper can be considered as a part of the research programme originated in [18], the aim of which is to develop a rigorous and, at the same time, physically relevant theory of general viscous fluids in the full thermodynamical setting. The central idea behind Hypotheses [A1]–[A5] is to identify a class of constitutive assumptions on the fluid in extreme regimes, therefore providing suitable a priori estimates relevant to the corresponding system of partial differential equations.
When dealing with dissipative, or entropy producing processes, it is common to impose the second law of thermodynamics as the central principle. The central idea advocated in many recent studies postulates that the state variables change in a way that maximizes the rate of entropy production. From the mathematical viewpoint, the latter could be a non-negative measure, singular or absolutely continuous with respect to the standard Lebesgue measure, depending on the smoothness of the flow. Introducing the total entropy balance as one of the main field equations represents a crucial aspect of the mathematical theory to be developed below.

Another important feature employed in the present study is the regularizing effect of radiation already observed in [13]. Indeed one of the main theoretical problems to be dealt with is represented by a possible presence of cavities, inducing uncontrollable large amplitude time oscillations of extensive quantities as the internal energy or entropy. In the present setting, however, these quantities are being supplemented with a radiation component yielding the necessary a priori estimates on the time-derivative. Last but not least, the present theory leans on the commutator estimates developed for the case of parameter-dependent transport coefficients in [19].

1.2. Radiation theory.

In the following section, a purely phenomenological, or macroscopic, description is adopted, where the radiation is treated as a continuous field, and both the wave (classical) and photonic (quantum) aspects are taken into account. In the quantum picture, in agreement with Assumption [A5], the total pressure $p$ in the fluid is augmented, due to the presence of the photon gas, by a radiation component $p_R$ related to the absolute temperature $\vartheta$ through the Stefan-Boltzmann law
\[
p_R=\frac a3\,\vartheta^4,\quad\text{with a constant }a>0\tag{1.1}
\]
(see, for instance, Chap. 15 in [15]). Furthermore, the specific internal energy of the fluid must be supplemented with a term
\[
e_R=e_R(\varrho,\vartheta)=\frac{a}{\varrho}\,\vartheta^4,\quad\text{equivalently, }p_R=\frac13\,\varrho\,e_R,\tag{1.2}
\]
where $\varrho$ is the fluid density. Similarly, the heat conductivity of the fluid is enhanced by a radiation component
\[
\mathbf{q}_R=-\kappa_R\,\vartheta^3\,\nabla_x\vartheta,\quad\text{with a constant }\kappa_R>0\tag{1.3}
\]
(see [5, 29], among others). A different kind of interaction, described by the theory of magnetohydrodynamics, produces the so-called “collective effects” resulting from the macroscopic interaction between the motion of a conducting fluid and the electromagnetic field governed by Faraday’s law, ∂t B + curlx E = 0, divx B = 0.
(1.4)
Here, the magnetic induction vector $\mathbf{B}$ is related to the electric field $\mathbf{E}$ and the macroscopic fluid velocity $\mathbf{u}$ via Ohm’s law

$$\mathbf{J} = \sigma(\mathbf{E} + \mathbf{u} \times \mathbf{B}), \tag{1.5}$$

where $\mathbf{J}$ is the electric current, and $\sigma$ the electrical conductivity of the fluid. Furthermore, neglecting the displacement current in the Ampère-Maxwell equation governing the electric field, we obtain Ampère’s law,

$$\mu \mathbf{J} = \operatorname{curl}_x \mathbf{B}, \quad \mu > 0, \tag{1.6}$$

where the constant $\mu$ stands for the permeability of free space. Accordingly, Eq. (1.4) can be written in the form

$$\partial_t \mathbf{B} + \operatorname{curl}_x(\mathbf{B} \times \mathbf{u}) + \operatorname{curl}_x(\lambda \operatorname{curl}_x \mathbf{B}) = 0, \tag{1.7}$$

where $\lambda = (\mu\sigma)^{-1}$ is termed the magnetic diffusivity of the fluid (cf. Chap. 9 in [9], Chap. 4 in [26], among others). Equation (1.7) must be supplemented with suitable boundary conditions in order to obtain, at least formally, a mathematically well-posed problem. Here, in accordance with Hypothesis [A1], we suppose that $\partial\Omega$ is a perfect conductor, giving rise to the boundary conditions

$$\mathbf{B} \cdot \mathbf{n}|_{\partial\Omega} = 0, \quad \mathbf{E} \times \mathbf{n}|_{\partial\Omega} = 0, \tag{1.8}$$

where $\mathbf{n}$ stands for the (outer) normal vector. It is worth noting that both mechanisms, even though of the same origin, act simultaneously but rather independently, as the radiation pressure is attributed to photons of very high energy while the collective effects of the electromagnetic field may be important under conditions of high density and/or extremely low temperature (energy) (see Chap. 2 in [8]).
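The constitutive relations (1.1), (1.2) and the definition of the magnetic diffusivity in (1.7) can be sketched numerically. The snippet below is only an illustration: the SI values chosen for the radiation constant `A_RAD` and the permeability `MU_0`, as well as the function names and the sample density, temperature and conductivity, are assumptions made here; the text itself requires only $a > 0$ and $\mu > 0$.

```python
import math

# Assumed SI values, used only for illustration; the text requires only a > 0, mu > 0.
A_RAD = 7.5657e-16        # radiation constant a, J m^-3 K^-4
MU_0 = 4.0e-7 * math.pi   # permeability of free space, H/m


def radiation_pressure(theta, a=A_RAD):
    """Radiation pressure p_R = (a / 3) * theta^4, cf. (1.1)."""
    return a / 3.0 * theta ** 4


def radiation_energy(rho, theta, a=A_RAD):
    """Specific radiation energy e_R = a * theta^4 / rho, cf. (1.2)."""
    return a * theta ** 4 / rho


def magnetic_diffusivity(sigma, mu=MU_0):
    """Magnetic diffusivity lambda = 1 / (mu * sigma), cf. (1.7)."""
    return 1.0 / (mu * sigma)


if __name__ == "__main__":
    rho, theta = 1.0e3, 1.0e7   # hypothetical density (kg/m^3) and temperature (K)
    p_r = radiation_pressure(theta)
    e_r = radiation_energy(rho, theta)
    # The two printed numbers agree because p_R = (1/3) * rho * e_R.
    print("p_R =", p_r, " rho * e_R / 3 =", rho * e_r / 3.0)
    print("lambda =", magnetic_diffusivity(sigma=1.0e6))
```

The quartic growth of `radiation_pressure` in the temperature is what later supplies the $L^4$ bound on $\vartheta$.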
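1.3. Motion. In accordance with the basic principles of continuum mechanics, the fluid motion is described by the Navier-Stokes system of equations, a mathematical formulation of the mass conservation

$$\partial_t \varrho + \operatorname{div}_x(\varrho \mathbf{u}) = 0, \tag{1.9}$$

and the momentum balance

$$\partial_t(\varrho \mathbf{u}) + \operatorname{div}_x(\varrho \mathbf{u} \otimes \mathbf{u}) = \operatorname{div}_x \mathbb{T} + \varrho \nabla_x \Psi + \mathbf{J} \times \mathbf{B}, \tag{1.10}$$

with the Cauchy stress tensor $\mathbb{T}$. Here, in agreement with Assumption [A2], the body forces are represented by the gravitational force $\varrho \nabla_x \Psi$, where the potential $\Psi$ obeys Poisson’s equation,

$$-\Delta \Psi = G\varrho, \quad \text{with a constant } G > 0, \tag{1.11}$$

and the Lorentz force imposed by the magnetic field,

$$\mathbf{J} \times \mathbf{B} = \operatorname{div}_x\Big( \frac{1}{\mu} \mathbf{B} \otimes \mathbf{B} - \frac{1}{2\mu} |\mathbf{B}|^2 \mathbb{I} \Big). \tag{1.12}$$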
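Identity (1.12) relies on Ampère's law (1.6) and on $\operatorname{div}_x \mathbf{B} = 0$. A minimal numerical sanity check, assuming a hypothetical solenoidal test field $\mathbf{B} = (\sin y, \sin z, \sin x)$, the normalization $\mu = 1$, and central finite differences (none of which come from the paper), might look as follows.

```python
import math

MU = 1.0    # permeability, normalized to 1 for this illustration (assumption)
H = 1.0e-4  # finite-difference step


def B(x, y, z):
    """Hypothetical smooth test field with div B = 0."""
    return (math.sin(y), math.sin(z), math.sin(x))


def partial(f, p, j, h=H):
    """Central difference of a tuple-valued function f at point p along axis j."""
    hi, lo = list(p), list(p)
    hi[j] += h
    lo[j] -= h
    return tuple((a - b) / (2.0 * h) for a, b in zip(f(*hi), f(*lo)))


def lorentz_force(p):
    """J x B with mu * J = curl B, cf. (1.6)."""
    d = [partial(B, p, j) for j in range(3)]  # d[j][i] = dB_i / dx_j
    curl = (d[1][2] - d[2][1], d[2][0] - d[0][2], d[0][1] - d[1][0])
    J = tuple(c / MU for c in curl)
    b = B(*p)
    return (J[1] * b[2] - J[2] * b[1],
            J[2] * b[0] - J[0] * b[2],
            J[0] * b[1] - J[1] * b[0])


def maxwell_stress(x, y, z):
    """Entries T_ij of (1/mu) B (x) B - (1/(2 mu)) |B|^2 I, flattened row by row."""
    b = B(x, y, z)
    b2 = sum(c * c for c in b)
    return tuple(b[i] * b[j] / MU - (b2 / (2.0 * MU) if i == j else 0.0)
                 for i in range(3) for j in range(3))


def stress_divergence(p):
    """(div T)_i = sum_j dT_ij / dx_j, the right-hand side of (1.12)."""
    d = [partial(maxwell_stress, p, j) for j in range(3)]
    return tuple(sum(d[j][3 * i + j] for j in range(3)) for i in range(3))


if __name__ == "__main__":
    point = (0.3, -1.1, 2.0)
    print("J x B :", lorentz_force(point))
    print("div T :", stress_divergence(point))
```

Both tuples agree up to the discretization error of the difference quotients; for a field that is not divergence-free the two sides differ by $\frac{1}{\mu}(\operatorname{div}_x \mathbf{B})\,\mathbf{B}$.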
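Furthermore, as stated in Assumption [A4], the Cauchy stress tensor $\mathbb{T}$ is given by Stokes’ law $\mathbb{T} = \mathbb{S} - p\mathbb{I}$, where $p$ is the pressure, and the symbol $\mathbb{S}$ stands for the viscous stress tensor,

$$\mathbb{S} = \nu\Big( \nabla_x \mathbf{u} + \nabla_x \mathbf{u}^t - \frac{2}{3} \operatorname{div}_x \mathbf{u}\, \mathbb{I} \Big) + \eta \operatorname{div}_x \mathbf{u}\, \mathbb{I}, \tag{1.13}$$

with the shear viscosity coefficient $\nu$, and the bulk viscosity coefficient $\eta$. Finally, in accordance with [A1], we impose the no-slip boundary conditions for the velocity field:

$$\mathbf{u}|_{\partial\Omega} = 0. \tag{1.14}$$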
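1.4. Thermodynamics, entropy production. By virtue of the first law of thermodynamics, the energy of the system must be a conserved quantity, more specifically,

$$\partial_t\Big( \frac{1}{2}\varrho|\mathbf{u}|^2 + \varrho e + \frac{1}{2\mu}|\mathbf{B}|^2 \Big) + \operatorname{div}_x\Big( \big(\tfrac{1}{2}\varrho|\mathbf{u}|^2 + \varrho e + p\big)\mathbf{u} + \mathbf{E} \times \mathbf{B} - \mathbb{S}\mathbf{u} \Big) + \operatorname{div}_x \mathbf{q} = \varrho \nabla_x \Psi \cdot \mathbf{u}, \tag{1.15}$$

where $e$ is the specific internal energy, and $\mathbf{q}$ denotes the heat flux. In particular, assuming

$$\mathbf{q} \cdot \mathbf{n}|_{\partial\Omega} = 0 \tag{1.16}$$

(cf. Hypothesis [A1]), we deduce that the total energy of the system is a constant of motion:

$$\frac{d}{dt} \int_\Omega \Big( \frac{1}{2}\varrho|\mathbf{u}|^2 + \varrho e + \frac{1}{2\mu}|\mathbf{B}|^2 - \frac{1}{2}\varrho\Psi \Big)\, dx = 0. \tag{1.17}$$

Note that the gravitational potential $\Psi$ is determined by Eq. (1.11) considered on the whole space $\mathbb{R}^3$, the density $\varrho$ being extended to be zero outside $\Omega$. Consequently,

$$\int_\Omega \varrho \nabla_x \Psi \cdot \mathbf{u}\, dx = \frac{d}{dt}\, \frac{1}{2} \int_\Omega \varrho \Psi\, dx.$$

However, in the variational formulation used in the present paper, it is more convenient to replace (1.15) by the entropy balance

$$\partial_t(\varrho s) + \operatorname{div}_x(\varrho s \mathbf{u}) + \operatorname{div}_x\Big( \frac{\mathbf{q}}{\vartheta} \Big) = r \tag{1.18}$$

with the specific entropy $s$, and the entropy production rate $r$, for which the second law of thermodynamics requires $r \geq 0$. The specific entropy $s$, the specific internal energy $e$, and the pressure $p$ are interrelated through the thermodynamics equation

$$\vartheta\, Ds = De + p\, D\Big( \frac{1}{\varrho} \Big), \tag{1.19}$$

in particular, the quantity $(1/\vartheta)\big(De + p\,D(1/\varrho)\big)$ must be a perfect gradient, which amounts to certain compatibility conditions imposed on $e$ and $p$ called Maxwell’s relation.

If the motion is smooth, it is easy to see that the sum of the kinetic and magnetic energy satisfies

$$\partial_t\Big( \frac{1}{2}\varrho|\mathbf{u}|^2 + \frac{1}{2\mu}|\mathbf{B}|^2 \Big) + \operatorname{div}_x\Big( \big(\tfrac{1}{2}\varrho|\mathbf{u}|^2 + p\big)\mathbf{u} + \mathbf{E} \times \mathbf{B} - \mathbb{S}\mathbf{u} \Big) = \varrho \nabla_x \Psi \cdot \mathbf{u} + p \operatorname{div}_x \mathbf{u} - \mathbb{S} : \nabla_x \mathbf{u} - \frac{\lambda}{\mu} |\operatorname{curl}_x \mathbf{B}|^2, \tag{1.20}$$

where the last two terms are responsible for the irreversible transfer of the mechanical and magnetic energy into heat. Indeed, with (1.19) at hand, it is a routine matter to check that

$$r = \frac{1}{\vartheta}\Big( \mathbb{S} : \nabla_x \mathbf{u} + \frac{\lambda}{\mu} |\operatorname{curl}_x \mathbf{B}|^2 - \frac{\mathbf{q} \cdot \nabla_x \vartheta}{\vartheta} \Big). \tag{1.21}$$

If the motion is not (known to be) smooth, however, the validity of (1.20) is no longer guaranteed; in particular, the dissipation rate of the mechanical energy may exceed the value $\mathbb{S} : \nabla_x \mathbf{u}$ (see, for instance, [12, 16] for a discussion of this interesting and still largely open question). Accordingly, we replace Eq. (1.18) by the inequality

$$\partial_t(\varrho s) + \operatorname{div}_x(\varrho s \mathbf{u}) + \operatorname{div}_x\Big( \frac{\mathbf{q}}{\vartheta} \Big) \geq \frac{1}{\vartheta}\Big( \mathbb{S} : \nabla_x \mathbf{u} + \frac{\lambda}{\mu} |\operatorname{curl}_x \mathbf{B}|^2 - \frac{\mathbf{q} \cdot \nabla_x \vartheta}{\vartheta} \Big) \tag{1.22}$$

to be satisfied together with the total energy balance (1.17). Note that such a formulation is (i) consistent with the second law of thermodynamics, and (ii) equivalent to (1.15) provided the motion is smooth. Indeed, using (1.19) one can show that the internal energy production rate equals $\vartheta r + p \operatorname{div}_x \mathbf{u}$, where $r$ is the entropy production evaluated through (1.21). On the other hand, if the motion is smooth, the mechanical and magnetic energy dissipation is given by (1.20), which is compatible with (1.17) only if $r$ satisfies (1.21).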
1.5. State equation. In accordance with [A5], the pressure $p$ takes the form

$$p(\varrho, \vartheta) = p_F(\varrho, \vartheta) + p_R(\vartheta), \tag{1.23}$$

where the radiation component is given by (1.1). Similarly, the specific internal energy reads

$$e(\varrho, \vartheta) = e_F(\varrho, \vartheta) + e_R(\varrho, \vartheta), \tag{1.24}$$

with $e_R$ determined by (1.2). Furthermore, in accordance with Assumption [A3], we suppose

$$p_F(\varrho, \vartheta) = \frac{2}{3} \varrho\, e_F(\varrho, \vartheta), \tag{1.25}$$

a universal relation deduced by the methods of statistical physics and valid, in the continuum limit, for any system of non-interacting (non-relativistic) particles (see Chap. 10 in [8], Chap. 15 in [15], or [34]). The following hypothesis expresses the physical principle of convexity of the free energy:

$$\frac{\partial p_F(\varrho, \vartheta)}{\partial \varrho} > 0, \quad \frac{\partial e_F(\varrho, \vartheta)}{\partial \vartheta} > 0 \quad \text{for all } \varrho, \vartheta > 0 \tag{1.26}$$

(cf. [23]). Finally, we suppose

$$\lim_{\varrho \to 0+} p_F(\varrho, \vartheta) = 0, \quad \lim_{\varrho \to 0+} \frac{\partial p_F(\varrho, \vartheta)}{\partial \varrho} > 0, \quad \text{for any fixed } \vartheta > 0, \tag{1.27}$$

and

$$\liminf_{\vartheta \to 0+} e_F(\varrho, \vartheta) > 0, \quad \limsup_{\vartheta \to 0+} \frac{\partial e_F(\varrho, \vartheta)}{\partial \vartheta} < \infty \quad \text{for any } \varrho > 0. \tag{1.28}$$

Note that (1.27) is in full agreement with the classical Boyle’s law that applies in the nondegenerate area of high temperatures and low densities. On the other hand, the quantity $\partial e_F / \partial \vartheta = c_v$ is the specific heat at constant volume, and condition (1.28) characterizes a Fermi gas in the (degenerate) regime of large densities and/or extremely low temperatures (cf. Chap. 9 in [30], and Chap. 15 in [15]). Note that these conditions also agree with the asymptotic models derived in [3]. The specific entropy $s$ is determined, up to an additive constant, by the thermodynamics relation (1.19).
As shown below, the only admissible form of $p_F$ reads $p_F(\varrho, \vartheta) = \vartheta^{5/2} P_F(\varrho / \vartheta^{3/2})$, where the function $P_F$ behaves as $P_F(z) \approx z^{5/3}$ for large values of $z$. Thus, at least asymptotically, the pressure $p_F$ coincides with a “$\gamma$-law” $p_F \approx \varrho^\gamma$, $\gamma = 5/3$, reminiscent of the isentropic state equation for a perfect monoatomic gas. From the mathematical viewpoint, the value of $\gamma$ plays the role of a “critical” parameter (see [27] or [18] for relevant discussion).

1.6. Fluxes and transport coefficients. In accordance with the second law of thermodynamics, the viscosity coefficients $\nu$ and $\eta$ are non-negative quantities. In addition, we shall suppose

$$\nu = \nu(\vartheta, \mathbf{B}) > 0, \quad \eta = \eta(\vartheta, \mathbf{B}) > 0, \tag{1.29}$$

where both the shear viscosity $\nu$ and the bulk viscosity $\eta$ satisfy some technical but physically relevant coercivity conditions specified below. Similarly, the heat flux $\mathbf{q}$ obeys Fourier’s law:

$$\mathbf{q} = \mathbf{q}_R + \mathbf{q}_F, \tag{1.30}$$

with the “radiation” heat flux $\mathbf{q}_R$ given by (1.3), and

$$\mathbf{q}_F = -\kappa_F(\varrho, \vartheta, \mathbf{B}) \nabla_x \vartheta, \quad \kappa_F > 0. \tag{1.31}$$

Finally, the magnetic diffusivity $\lambda$ satisfies

$$\lambda = \lambda(\varrho, \vartheta, \mathbf{B}) > 0. \tag{1.32}$$

For the sake of simplicity, but not without certain physical background (see, for instance, Chap. 7 in [24]), we shall assume that all the transport coefficients $\nu$, $\eta$, $\kappa_F$, and $\lambda$ admit a common temperature scaling, namely

$$c_1(1 + \vartheta^\alpha) \leq \kappa_F(\varrho, \vartheta, \mathbf{B}),\ \nu(\vartheta, \mathbf{B}),\ \eta(\vartheta, \mathbf{B}),\ \lambda(\varrho, \vartheta, \mathbf{B}) \leq c_2(1 + \vartheta^\alpha), \quad c_1 > 0, \tag{1.33}$$

with $\alpha \geq 1$ to be specified below. Note that, in order to comply with the principle of material frame indifference, the transport coefficients depend in fact only on $|\mathbf{B}|$.

The reader will have noticed that we have deliberately avoided the case when the viscosity coefficients depend effectively on the density. Although physically relevant, such a situation presents, up to now, insurmountable mathematical difficulties. Note that, in the absence of the magnetic field, transport coefficients depending only on the temperature are physically relevant at least in the case of gases. In particular, for the so-called hard-sphere model, the shear viscosity behaves like $\sqrt{\vartheta}$ (see, for instance, [2]). The effect of the magnetic field is much more complicated than in the “toy” situation considered in this paper, namely the viscous stress becomes anisotropic, depending effectively on the direction of $\mathbf{B}$ (see Sect. 19.44 in [6]).

2. Variational Formulation

The approach pursued in this paper is based on the concept of variational solutions, for which the underlying field equations are expressed in terms of integral identities rather than the partial differential equations discussed in Sect. 1.
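2.1. Mass conservation. The principle of mass conservation is expressed through a family of integral identities

$$\int_0^T \!\! \int_\Omega \big( \varrho\, \partial_t \varphi + \varrho \mathbf{u} \cdot \nabla_x \varphi \big)\, dx\, dt + \int_\Omega \varrho_0\, \varphi(0, \cdot)\, dx = 0 \tag{2.1}$$

to be satisfied for any test function $\varphi \in \mathcal{D}([0, T) \times \mathbb{R}^3)$. Here, the function $\varrho_0$ characterizes the prescribed initial distribution of the density. Note that our choice of the test functions already reflects the boundary conditions (1.14). Moreover, we tacitly assume that both $\varrho$ and the flux $\varrho \mathbf{u}$ are locally integrable on $[0, T) \times \overline{\Omega}$ so that (2.1) makes sense. If, in addition, $\varrho$ belongs to the space $L^\infty(0, T; L^\gamma(\Omega))$ for a certain $\gamma > 1$, then $\varrho \in C([0, T]; L^\gamma_{\rm weak}(\Omega))$, in particular, $\varrho(t) \to \varrho_0$ weakly in $L^\gamma(\Omega)$ as $t \to 0$, and the total mass

$$M_0 = \int_\Omega \varrho(t)\, dx = \int_\Omega \varrho_0\, dx \tag{2.2}$$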
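is a constant of motion. Finally, if $\gamma \geq 2$, one can use the regularization technique developed by DiPerna and P.-L. Lions [11] in order to show that $\varrho$, $\mathbf{u}$ satisfy the renormalized equation

$$\int_0^T \!\! \int_\Omega \Big( b(\varrho)\, \partial_t \varphi + b(\varrho) \mathbf{u} \cdot \nabla_x \varphi + \big( b(\varrho) - b'(\varrho)\varrho \big) \operatorname{div}_x \mathbf{u}\, \varphi \Big)\, dx\, dt + \int_\Omega b(\varrho_0)\, \varphi(0, \cdot)\, dx = 0 \tag{2.3}$$

for any $\varphi \in \mathcal{D}([0, T) \times \mathbb{R}^3)$, and for any continuously differentiable function $b$ whose derivative vanishes for large arguments. At the same time, it can be shown that any renormalized solution $\varrho$ belongs to the class $C([0, T]; L^1(\Omega))$ (see [11]).

2.2. Momentum balance. The variational formulation of (1.10) reads

$$\int_0^T \!\! \int_\Omega \Big( (\varrho \mathbf{u}) \cdot \partial_t \boldsymbol{\varphi} + (\varrho \mathbf{u} \otimes \mathbf{u}) : \nabla_x \boldsymbol{\varphi} + p \operatorname{div}_x \boldsymbol{\varphi} \Big)\, dx\, dt = \int_0^T \!\! \int_\Omega \Big( \mathbb{S} : \nabla_x \boldsymbol{\varphi} - \varrho \nabla_x \Psi \cdot \boldsymbol{\varphi} - (\mathbf{J} \times \mathbf{B}) \cdot \boldsymbol{\varphi} \Big)\, dx\, dt - \int_\Omega (\varrho \mathbf{u})_0 \cdot \boldsymbol{\varphi}(0, \cdot)\, dx \tag{2.4}$$

to be satisfied for any vector field $\boldsymbol{\varphi} \in \mathcal{D}([0, T) \times \Omega; \mathbb{R}^3)$. Similarly to Sect. 2.1, we suppose the quantities $\varrho \mathbf{u}$, $\varrho \mathbf{u} \otimes \mathbf{u}$, $p$, $\mathbb{S}$, $\varrho \nabla_x \Psi$, and $\mathbf{J} \times \mathbf{B}$ to be at least locally integrable on the set $[0, T) \times \Omega$. Here, the gravitational potential $\Psi$ is given by Eq. (1.11) considered on the whole physical space $\mathbb{R}^3$, where $\varrho$ was extended to be zero outside $\Omega$. Equivalently,

$$\Psi = G(-\Delta)^{-1}[1_\Omega \varrho], \tag{2.5}$$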
with

$$(-\Delta)^{-1}[v](x) = \mathcal{F}^{-1}_{\xi \to x}\Big[ |\xi|^{-2} \mathcal{F}_{x \to \xi}[v] \Big],$$

where $\mathcal{F}$ stands for the Fourier transform. The initial value of the momentum $(\varrho \mathbf{u})_0$ should be prescribed in such a way that

$$(\varrho \mathbf{u})_0 = 0 \ \text{ a.a. on the set } \{\varrho_0 = 0\}. \tag{2.6}$$

Finally, the satisfaction of the no-slip boundary conditions (1.14) can be rephrased as

$$\mathbf{u} \in L^2(0, T; W^{1,2}_0(\Omega; \mathbb{R}^3)). \tag{2.7}$$
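Here, of course, our choice of the function space has been motivated by the anticipated a priori bounds resulting from the entropy balance specified below.

2.3. Entropy production. In accordance with (1.22), the weak form of the entropy production reads

$$\int_0^T \!\! \int_\Omega \Big( \varrho s\, \partial_t \varphi + \varrho s \mathbf{u} \cdot \nabla_x \varphi + \frac{\mathbf{q}}{\vartheta} \cdot \nabla_x \varphi \Big)\, dx\, dt \leq \int_0^T \!\! \int_\Omega \frac{1}{\vartheta}\Big( \frac{\mathbf{q} \cdot \nabla_x \vartheta}{\vartheta} - \mathbb{S} : \nabla_x \mathbf{u} - \frac{\lambda}{\mu} |\operatorname{curl}_x \mathbf{B}|^2 \Big) \varphi\, dx\, dt - \int_\Omega (\varrho s)_0\, \varphi(0, \cdot)\, dx, \tag{2.8}$$

for any test function $\varphi \in \mathcal{D}([0, T) \times \mathbb{R}^3)$, $\varphi \geq 0$. Note that our choice of the test function space agrees with the boundary conditions (1.14), (1.16). The presence of $\vartheta$ in the denominator indicates that this quantity must be positive on a set of full measure for (2.8) to make sense. In particular, terms like $\nabla_x \vartheta / \vartheta$ will be interpreted as $\nabla_x \log(\vartheta)$ in the spirit of the following result (Lemma 5.3 in [13]).

Lemma 2.1. Let $\Omega \subset \mathbb{R}^N$ be a bounded Lipschitz domain. Furthermore, let $\varrho$ be a non-negative function such that

$$0 < M \leq \int_\Omega \varrho\, dx, \quad \int_\Omega \varrho^\gamma\, dx \leq K, \quad \text{with } \gamma > \frac{2N}{N+2},$$

and let $\vartheta \in W^{1,2}(\Omega)$. Then the following statements are equivalent:
• The function $\vartheta$ is strictly positive a.a. on $\Omega$, $\varrho|\log(\vartheta)| \in L^1(\Omega)$, and $\nabla_x \vartheta / \vartheta \in L^2(\Omega; \mathbb{R}^N)$.
• The function $\log(\vartheta)$ belongs to the Sobolev space $W^{1,2}(\Omega)$.
Moreover, if this is the case, then

$$\nabla_x \log(\vartheta) = \frac{\nabla_x \vartheta}{\vartheta} \ \text{ a.a. on } \Omega,$$

and there is a constant $c = c(M, K)$ such that

$$\|\log(\vartheta)\|_{L^2(\Omega)} \leq c(M, K)\Big( \|\varrho \log(\vartheta)\|_{L^1(\Omega)} + \Big\| \frac{\nabla_x \vartheta}{\vartheta} \Big\|_{L^2(\Omega; \mathbb{R}^N)} \Big).$$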
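2.4. Maxwell’s equations. Equation (1.7) is replaced by

$$\int_0^T \!\! \int_\Omega \Big( \mathbf{B} \cdot \partial_t \boldsymbol{\varphi} - (\mathbf{B} \times \mathbf{u} + \lambda \operatorname{curl}_x \mathbf{B}) \cdot \operatorname{curl}_x \boldsymbol{\varphi} \Big)\, dx\, dt + \int_\Omega \mathbf{B}_0 \cdot \boldsymbol{\varphi}(0, \cdot)\, dx = 0, \tag{2.9}$$

to be satisfied for any vector field $\boldsymbol{\varphi} \in \mathcal{D}([0, T) \times \mathbb{R}^3; \mathbb{R}^3)$. Here, in accordance with the boundary conditions (1.8), (1.14), one has to take

$$\mathbf{B}_0 \in L^2(\Omega; \mathbb{R}^3), \quad \operatorname{div}_x \mathbf{B}_0 = 0 \text{ in } \mathcal{D}'(\Omega), \quad \mathbf{B}_0 \cdot \mathbf{n}|_{\partial\Omega} = 0. \tag{2.10}$$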
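By virtue of Theorem 1.4 in [38], $\mathbf{B}_0$ belongs to the closure of all solenoidal functions from $\mathcal{D}(\Omega)$ with respect to the $L^2$-norm. Anticipating, in view of (1.17), (1.22),

$$\mathbf{B} \in L^\infty(0, T; L^2(\Omega; \mathbb{R}^3)), \quad \operatorname{curl}_x \mathbf{B} \in L^2(0, T; L^2(\Omega; \mathbb{R}^3)),$$

we can deduce from (2.9) that

$$\operatorname{div}_x \mathbf{B}(t) = 0 \text{ in } \mathcal{D}'(\Omega), \quad \mathbf{B}(t) \cdot \mathbf{n}|_{\partial\Omega} = 0 \quad \text{for a.a. } t \in (0, T),$$

in full agreement with (1.4), (1.8). In particular, using Theorem 6.1, Chap. VII in [14], we conclude

$$\mathbf{B} \in L^2(0, T; W^{1,2}(\Omega; \mathbb{R}^3)), \quad \operatorname{div}_x \mathbf{B}(t) = 0, \quad \mathbf{B} \cdot \mathbf{n}|_{\partial\Omega} = 0 \quad \text{for a.a. } t \in (0, T). \tag{2.11}$$

2.5. Total energy conservation. In accordance with (1.17), we shall assume the total energy to be a constant of motion, specifically,

$$E(t) = \int_\Omega \Big( \frac{1}{2}\varrho(t)|\mathbf{u}(t)|^2 + (\varrho e)(t) + \frac{1}{2\mu}|\mathbf{B}(t)|^2 - \frac{1}{2}\varrho(t)\Psi(t) \Big)\, dx = E_0 \quad \text{for a.a. } t \in (0, T), \tag{2.12}$$

where

$$E_0 = \int_\Omega \Big( \frac{1}{2\varrho_0}|(\varrho \mathbf{u})_0|^2 + (\varrho e)_0 + \frac{1}{2\mu}|\mathbf{B}_0|^2 - \frac{1}{2}\varrho_0 \Psi_0 \Big)\, dx. \tag{2.13}$$

Exactly as in (2.5), we have set $\Psi_0 = G(-\Delta)^{-1}[1_\Omega \varrho_0]$, and, in addition to (2.6), we require

$$\frac{1}{\sqrt{\varrho_0}} (\varrho \mathbf{u})_0 \in L^2(\Omega; \mathbb{R}^3). \tag{2.14}$$

Finally, it is clear that the initial values of the entropy $(\varrho s)_0$ and the internal energy $(\varrho e)_0$ should be chosen consistently with the constitutive relations determined through (1.19). A rather obvious possibility consists in fixing the initial temperature $\vartheta_0$ and setting

$$(\varrho s)_0 = \varrho_0 s(\varrho_0, \vartheta_0), \quad (\varrho e)_0 = \varrho_0 e(\varrho_0, \vartheta_0). \tag{2.15}$$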
3. Main Results

3.1. The main existence theorem. Having introduced the concept of variational solutions we are in a position to state the main result of this paper.

Theorem 3.1. Let $\Omega \subset \mathbb{R}^3$ be a bounded domain with boundary of class $C^{2+\delta}$, $\delta > 0$. Assume that the thermodynamic functions $p = p(\varrho, \vartheta)$, $e = e(\varrho, \vartheta)$, $s = s(\varrho, \vartheta)$ are interrelated through (1.19), where $p$, $e$ can be decomposed as in (1.23), (1.24), with the components $p_F$, $e_F$ satisfying (1.25). Moreover, let $p_F(\varrho, \vartheta)$, $e_F(\varrho, \vartheta)$ be continuously differentiable functions of positive arguments $\varrho$, $\vartheta$ satisfying (1.26)–(1.28). Furthermore, suppose that the transport coefficients $\nu = \nu(\vartheta, \mathbf{B})$, $\eta = \eta(\vartheta, \mathbf{B})$, $\kappa_F = \kappa_F(\varrho, \vartheta, \mathbf{B})$, and $\lambda = \lambda(\varrho, \vartheta, \mathbf{B})$ are continuously differentiable functions of their arguments obeying (1.29)–(1.33), with

$$1 \leq \alpha < \frac{65}{27}. \tag{3.1}$$

Finally, let the initial data $\varrho_0$, $(\varrho \mathbf{u})_0$, $\vartheta_0$, $\mathbf{B}_0$ be given so that

$$\varrho_0 \in L^{5/3}(\Omega), \quad (\varrho \mathbf{u})_0 \in L^1(\Omega; \mathbb{R}^3), \quad \vartheta_0 \in L^\infty(\mathbb{R}^3), \quad \mathbf{B}_0 \in L^2(\Omega; \mathbb{R}^3), \tag{3.2}$$

$$\varrho_0 \geq 0, \quad \vartheta_0 > 0, \tag{3.3}$$

$$(\varrho s)_0 = \varrho_0 s(\varrho_0, \vartheta_0), \quad \frac{1}{\varrho_0}|(\varrho \mathbf{u})_0|^2, \ (\varrho e)_0 = \varrho_0 e(\varrho_0, \vartheta_0) \in L^1(\Omega), \tag{3.4}$$

and

$$\operatorname{div}_x \mathbf{B}_0 = 0 \text{ in } \mathcal{D}'(\Omega), \quad \mathbf{B}_0 \cdot \mathbf{n}|_{\partial\Omega} = 0. \tag{3.5}$$

Then problem (2.1–2.10) possesses at least one variational solution $\varrho$, $\mathbf{u}$, $\vartheta$, $\mathbf{B}$ in the sense of Sect. 2 on an arbitrary time interval $(0, T)$.

Remarks.
(i) Note that, given the level of regularity of the variational solutions, the boundary condition (1.14) as well as the first one in (1.8) are satisfied in the sense of traces (cf. (2.7), (2.11)), while the relation $\mathbf{E} \times \mathbf{n}|_{\partial\Omega} = 0$, expressed in terms of $\mathbf{J}$ via (1.5), holds in a weak sense through the choice of test functions in the integral identity (2.9) (cf. [14, Chap. VII, Part 4]).
(ii) Hypothesis (3.1) is the only technical restriction required by the mathematical theory. Note, however, that this kind of functional dependence on $\vartheta$ was rigorously justified in [10] by means of the asymptotic analysis of certain kinetic-fluid models. The idea to impose the same temperature scaling on all transport coefficients was inspired by Chap. 7 in [24].
3.2. Principal difficulties. There seem to be only a few rigorous results available concerning the existence of global-in-time solutions for the full Navier-Stokes-Fourier system with arbitrarily large data (see [18, 25]). The principal difficulties of the present problem may be characterized as follows:
• The transport coefficients are effective functions of $\vartheta$, $\mathbf{B}$ that admit the uniform “temperature scaling” expressed through (1.33), with the common exponent $\alpha$ satisfying hypothesis (3.1). In particular, only very poor a priori estimates based on boundedness of the entropy production are available, at least in comparison with the previous theory developed for the Navier-Stokes-Fourier system in [20]. Moreover, there are no growth restrictions imposed on the derivatives of the transport coefficients.
• The method introduced in [18] that is based on boundedness of the so-called oscillations defect measure is not applicable in a direct manner due to the lack of suitable a priori estimates.
• The estimates based on the entropy production balance become rather delicate as $s = s(\varrho, \vartheta)$ may be singular for vanishing arguments.

3.3. Methods. As already mentioned above, the main stumbling block of the mathematical theory to be developed in the present paper is the lack of a priori estimates. Accordingly, the methods of weak convergence based on the theory of the parametrized (Young) measures represent the main tool (see the monograph [32], among others). If not stated otherwise, “weak” in the present context means “in the sense of integral averages”, that is, in the sense of the weak topology on the Lebesgue space $L^1$. Accordingly, we shall denote by $\overline{b(z)}$ a weak limit of any sequence of composed functions $\{b(z_n)\}_{n \geq 1}$. More precisely,

$$\int_{\mathbb{R}^M} \overline{b(y, z)}\, \varphi(y)\, dy = \int_{\mathbb{R}^M} \varphi(y) \int_{\mathbb{R}^K} b(y, z)\, d\Lambda_y(z)\, dy,$$

where $\Lambda_y(z)$ is a parametrized measure associated to a sequence $\{\mathbf{z}_n\}_{n \geq 1}$ of vector-valued functions, $\mathbf{z}_n : \mathbb{R}^M \to \mathbb{R}^K$ (see Chap. 1 in [32]).

Assume there is a sequence of approximate solutions resulting from a suitable regularization process. Our starting point will be to establish the relation

$$\overline{\Big( p(\varrho, \vartheta) - \big( \tfrac{4}{3}\nu + \eta \big) \operatorname{div}_x \mathbf{u} \Big)\, b(\varrho)} = \overline{p(\varrho, \vartheta)}\ \overline{b(\varrho)} - \Big( \tfrac{4}{3}\nu(\vartheta, |\mathbf{B}|) + \eta(\vartheta, |\mathbf{B}|) \Big) \operatorname{div}_x \mathbf{u}\ \overline{b(\varrho)} \tag{3.6}$$
for any bounded function $b$. The quantity $p - (\tfrac{4}{3}\nu + \eta) \operatorname{div}_x \mathbf{u}$ is usually termed the effective viscous pressure, and relation (3.6) was first proved by P.-L. Lions [27] for the barotropic Navier-Stokes system, where $p = p(\varrho)$, and the viscosity coefficients $\nu$ and $\eta$ are constant. The same result for general temperature dependent viscosity coefficients was obtained in [19] with the help of certain commutator estimates in the spirit of [7]. In the present setting, the approach developed in [19] has to be further modified in order to accommodate the dependence of $\nu$ and $\eta$ on the magnetic field $\mathbf{B}$.

Similarly to [27, 35], the propagation of density oscillations is suitably described by the renormalized continuity equation (see [11])

$$\partial_t b(\varrho) + \operatorname{div}_x(b(\varrho)\mathbf{u}) + \big( b'(\varrho)\varrho - b(\varrho) \big) \operatorname{div}_x \mathbf{u} = 0, \tag{3.7}$$

and its “weak” counterpart

$$\partial_t \overline{b(\varrho)} + \operatorname{div}_x\big( \overline{b(\varrho)}\, \mathbf{u} \big) + \overline{\big( b'(\varrho)\varrho - b(\varrho) \big) \operatorname{div}_x \mathbf{u}} = 0. \tag{3.8}$$
In order to establish (3.7), however, one has to show first boundedness of the oscillations defect measure

$$\operatorname{osc}_{\gamma+1}[\varrho_n \to \varrho](Q) = \sup_{k \geq 1}\, \limsup_{n \to \infty} \int_Q |T_k(\varrho_n) - T_k(\varrho)|^{\gamma+1}\, dx\, dt < \infty, \tag{3.9}$$

with $T_k(\varrho) = \min\{\varrho, k\}$, for any bounded $Q \subset \mathbb{R}^4$ and a certain $\gamma > 1$ (see Chap. 6 in [18]). However, because of rather poor estimates resulting from (1.33), (2.8), relation (3.9) has to be replaced by a weaker “weighted” estimate

$$\sup_{k \geq 1}\, \limsup_{n \to \infty} \int_Q (1 + \vartheta_n)^{-\beta} |T_k(\varrho_n) - T_k(\varrho)|^{\gamma+1}\, dx\, dt < \infty, \tag{3.10}$$

for suitable $\beta > 0$, $\gamma > 1$; whence the theory developed in [18] must be modified.

Finally, similarly to [20], one has to recover strong convergence of the sequence $\{\vartheta_n\}_{n \geq 1}$ of (approximate) temperatures knowing that the spatial gradients of $\vartheta_n$ are uniformly square integrable, and, in addition,

$$\overline{\varrho s(\varrho, \vartheta)\vartheta} = \overline{\varrho s(\varrho, \vartheta)}\, \vartheta, \tag{3.11}$$

where (3.11) can be deduced from the entropy inequality (2.8) by means of a variant of the celebrated Aubin-Lions lemma. To this end, we use a result which is essentially due to Ball [1] (see also Theorem 6.2 in [32]), namely the possibility to characterize the weak limits of compositions with Carathéodory functions in terms of the associated parametrized (Young) measure.

3.4. Structure of the paper. The arrangement of the paper is as follows. After a preliminary section devoted to the basic structural properties of the thermodynamic functions, we introduce a three level approximation scheme adapted from [18]. After a short discussion, the proof of Theorem 3.1 is then reduced to a weak stability problem to be dealt with in the rest of the paper (see Sect. 5). Section 6 is devoted to uniform bounds on the sequence of approximate solutions. Here, the most delicate part is the proof of positivity of the absolute temperature presented in Part 6.3. The “easy” part of the limit passage is carried out in Sect. 7. With the help of the uniform bounds established in Sect. 6, one can handle the convective terms in the field equations by means of a simple compactness argument. Section 8 is devoted to the proof of strong convergence of the sequence of approximate temperatures. To this end, the entropy inequality is used together with the theory of parametrized (Young) measures discussed in Sect. 3.3. In Sect. 9 it is shown that the sequence of approximate densities converges strongly in the Lebesgue space $L^1((0, T) \times \Omega)$. Obviously, this is the most delicate point of the proof as no uniform estimates are available on the derivatives. Here, the main novelty is introducing weighted estimates of the so-called oscillations defect measure in order to show that the limit densities satisfy the renormalized continuity equation. The proof of Theorem 3.1 is completed in Sect. 10.
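The notion of a weak limit of compositions used in Sect. 3.3 can be illustrated on a standard toy example that is not taken from the paper: for the hypothetical oscillating sequence $z_n(y) = \sin(ny)$ on $[0, 2\pi]$ and $b(z) = z^2$, the sequence $z_n$ converges weakly to $0$, whereas $b(z_n)$ converges weakly to $1/2 \neq b(0)$. The sketch below approximates the integral averages against a test function by the midpoint rule; the function names and the choice of test function are assumptions made for the illustration.

```python
import math


def pairing(g, phi, n_points=20000):
    """Midpoint-rule approximation of the integral of g(y) * phi(y) over [0, 2*pi]."""
    h = 2.0 * math.pi / n_points
    return h * sum(g((k + 0.5) * h) * phi((k + 0.5) * h) for k in range(n_points))


def z(n):
    """Oscillating sequence z_n(y) = sin(n * y)."""
    return lambda y: math.sin(n * y)


def b(v):
    """Nonlinear composition b(z) = z^2."""
    return v * v


def weak_limit_demo(n, phi):
    """Return the pairings of z_n and of b(z_n) with the test function phi."""
    zn = z(n)
    linear = pairing(zn, phi)                       # approaches the pairing of 0 with phi
    composed = pairing(lambda y: b(zn(y)), phi)     # approaches the pairing of 1/2 with phi
    return linear, composed


if __name__ == "__main__":
    phi = lambda y: 1.0 + math.cos(y)               # smooth test function
    for n in (5, 50, 500):
        print(n, weak_limit_demo(n, phi))
    print("pairing of the constant 1/2 with phi:", pairing(lambda y: 0.5, phi))
```

The gap between the weak limit of $b(z_n)$ and $b$ applied to the weak limit of $z_n$ is exactly what the parametrized (Young) measure records, and it is the reason why strong convergence of the densities and temperatures has to be proved separately.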
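4. Preliminaries

As we can check by direct computation, relations (1.19), (1.25) are compatible if and only if there exists a function $P_F \in C^1(0, \infty)$ such that

$$p_F(\varrho, \vartheta) = \vartheta^{5/2} P_F\Big( \frac{\varrho}{\vartheta^{3/2}} \Big). \tag{4.1}$$

Consequently,

$$\frac{\partial e_F(\varrho, \vartheta)}{\partial \vartheta} = \frac{3}{2} \frac{1}{Y} \Big( \frac{5}{3} P_F(Y) - P_F'(Y) Y \Big), \quad Y = \frac{\varrho}{\vartheta^{3/2}};$$

whence, by virtue of hypothesis (1.26),

$$P_F'(z) > 0, \quad \frac{5}{3} P_F(z) - P_F'(z) z > 0 \quad \text{for any } z > 0, \tag{4.2}$$

where the latter inequality yields

$$\Big( \frac{P_F(z)}{z^{5/3}} \Big)' < 0 \quad \text{for all } z > 0, \tag{4.3}$$

in particular,

$$\frac{P_F(z)}{z^{5/3}} \to p_\infty > 0 \quad \text{for } z \to \infty. \tag{4.4}$$

Note that positivity of the limit $p_\infty$ follows from (1.25), (1.28). Moreover, in accordance with (1.27), (1.28), we have $P_F \in C^1[0, \infty)$,

$$0 < \liminf_{z \to 0+} \frac{1}{z}\Big( \frac{5}{3} P_F(z) - P_F'(z) z \Big) \leq \limsup_{z \to 0+} \frac{1}{z}\Big( \frac{5}{3} P_F(z) - P_F'(z) z \Big) < \infty, \tag{4.5}$$

$$\limsup_{z \to \infty} \frac{1}{z}\Big( \frac{5}{3} P_F(z) - P_F'(z) z \Big) < \infty, \tag{4.6}$$

and

$$\lim_{z \to 0+} P_F(z) = 0, \quad \lim_{z \to 0+} P_F'(z) > 0, \quad \lim_{z \to \infty} \frac{P_F'(z)}{z^{2/3}} = \frac{5}{3} p_\infty > 0. \tag{4.7}$$

Now it follows from (4.7) that $P_F'(z) \geq c z^{2/3}$ for all $z > 0$ and a certain $c > 0$; therefore there exists $p_c > 0$ such that the mapping

$$\varrho \mapsto p_F(\varrho, \vartheta) - p_c \varrho^{5/3} \ \text{ is a non-decreasing function of } \varrho \text{ for any fixed } \vartheta > 0. \tag{4.8}$$

In addition, with the other thermodynamic quantities determined through Eq. (1.19), we have

$$s(\varrho, \vartheta) = s_F(\varrho, \vartheta) + s_R(\varrho, \vartheta), \tag{4.9}$$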
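where

$$s_F(\varrho, \vartheta) = S_F\Big( \frac{\varrho}{\vartheta^{3/2}} \Big), \quad \text{with } S_F'(z) = -\frac{3}{2}\, \frac{\frac{5}{3} P_F(z) - P_F'(z) z}{z^2}, \tag{4.10}$$

and

$$s_R(\varrho, \vartheta) = \frac{4}{3} a \frac{\vartheta^3}{\varrho}. \tag{4.11}$$

Note that, by virtue of (4.2), (4.5), $S_F$ is a decreasing function such that

$$\lim_{z \to 0+} z S_F'(z) = -\frac{2}{5} P_F'(0) < 0;$$

whence, normalizing by $S_F(1) = 0$, we get

$$-c_1 \log(z) \leq S_F(z) \leq -c_2 \log(z), \quad c_1 > 0, \quad \text{for all } 0 < z \leq 1, \tag{4.12}$$

and

$$0 \geq S_F(z) \geq -c_3 \log(z) \quad \text{for all } z \geq 1. \tag{4.13}$$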
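The structure described by (4.1), (1.25) and (4.10) can be made concrete on a hypothetical model that is not taken from the paper: $P_F(z) = z + z^{5/3}$, which combines a Boyle-type term for small $z$ with the degenerate $z^{5/3}$ behaviour for large $z$ (so that $p_\infty = 1$). For this choice (4.10) with the normalization $S_F(1) = 0$ gives $S_F(z) = -\log z$. The sketch below checks numerically, by central differences, that the resulting $p_F$, $e_F$, $s_F$ satisfy the thermodynamics relation (1.19) componentwise; all function names are assumptions made for the illustration.

```python
import math


def P_F(z):
    """Hypothetical model function: Boyle-type for small z, z^(5/3) for large z."""
    return z + z ** (5.0 / 3.0)


def p_F(rho, theta):
    """Pressure from the scaling (4.1): theta^(5/2) * P_F(rho / theta^(3/2))."""
    return theta ** 2.5 * P_F(rho / theta ** 1.5)


def e_F(rho, theta):
    """Specific internal energy from (1.25): p_F = (2/3) * rho * e_F."""
    return 1.5 * p_F(rho, theta) / rho


def s_F(rho, theta):
    """Specific entropy S_F(z) = -log(z), obtained from (4.10) for this model."""
    return -math.log(rho / theta ** 1.5)


def d_drho(f, rho, theta, h=1.0e-6):
    """Central difference in the density variable."""
    return (f(rho + h, theta) - f(rho - h, theta)) / (2.0 * h)


def d_dtheta(f, rho, theta, h=1.0e-6):
    """Central difference in the temperature variable."""
    return (f(rho, theta + h) - f(rho, theta - h)) / (2.0 * h)


def gibbs_residuals(rho, theta):
    """Residuals of theta * Ds = De + p * D(1/rho), split into theta- and rho-parts."""
    r_theta = theta * d_dtheta(s_F, rho, theta) - d_dtheta(e_F, rho, theta)
    r_rho = (theta * d_drho(s_F, rho, theta) - d_drho(e_F, rho, theta)
             + p_F(rho, theta) / rho ** 2)
    return r_theta, r_rho


if __name__ == "__main__":
    for rho, theta in [(2.0, 3.0), (0.1, 5.0)]:
        print(rho, theta, gibbs_residuals(rho, theta))
```

For this model $p_F(\varrho, \vartheta) = \varrho\vartheta + \varrho^{5/3}$, so the monotonicity statement (4.8) holds with $p_c = 1$; the residuals printed above vanish up to the finite-difference error.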
5. Approximation Scheme

5.1. A regularized problem. The approximation scheme used in the present paper is essentially that of Chap. 7 in [18], supplemented with the necessary modifications introduced in [13]. For the reader’s convenience, the additional terms are put into $\{\,\}$. The continuity equation (1.9) is replaced by its “artificial viscosity” approximation

$$\partial_t \varrho + \operatorname{div}_x(\varrho \mathbf{u}) = \{\varepsilon \Delta \varrho\}, \quad \varepsilon > 0, \tag{5.1}$$

to be satisfied on $(0, T) \times \Omega$, and supplemented by the homogeneous Neumann boundary conditions

$$\nabla_x \varrho \cdot \mathbf{n}|_{\partial\Omega} = 0. \tag{5.2}$$

The initial distribution of the approximate densities is given through

$$\varrho(0, \cdot) = \varrho_{0,\delta}, \tag{5.3}$$

where

$$\varrho_{0,\delta} \in C^1(\overline{\Omega}), \quad \nabla_x \varrho_{0,\delta} \cdot \mathbf{n}|_{\partial\Omega} = 0, \quad \inf_{x \in \Omega} \varrho_{0,\delta}(x) > 0, \tag{5.4}$$

with a positive parameter $\delta > 0$. The functions $\varrho_{0,\delta}$ are chosen in such a way that

$$\varrho_{0,\delta} \to \varrho_0 \text{ in } L^{5/3}(\Omega), \quad |\{\varrho_{0,\delta} < \varrho_0\}| \to 0 \quad \text{for } \delta \to 0 \tag{5.5}$$

(cf. Sect. 4 in [13]). Here, of course, the choice of the “critical” exponent $\gamma = 5/3$ is intimately related to estimate (4.7) established above.
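A regularized momentum equation reads

$$\partial_t(\varrho \mathbf{u}) + \operatorname{div}_x(\varrho \mathbf{u} \otimes \mathbf{u}) + \nabla_x p + \{\delta \nabla_x \varrho^\Gamma + \varepsilon \nabla_x \mathbf{u} \nabla_x \varrho\} = \operatorname{div}_x \mathbb{S} + \varrho \nabla_x \Psi + \mathbf{J} \times \mathbf{B} \tag{5.6}$$

in $(0, T) \times \Omega$, with the quantities $\mathbf{J} \times \mathbf{B}$, $\mathbb{S}$, and $\Psi$ determined by (1.6), (1.13), and (2.5), respectively. Furthermore, in accordance with (1.14), the approximate velocity field satisfies the homogeneous Dirichlet boundary conditions

$$\mathbf{u}|_{\partial\Omega} = 0. \tag{5.7}$$

Similarly to Sect. 4 in [13], we prescribe the initial conditions

$$(\varrho \mathbf{u})(0, \cdot) = (\varrho \mathbf{u})_{0,\delta}, \tag{5.8}$$

where

$$(\varrho \mathbf{u})_{0,\delta} = \begin{cases} (\varrho \mathbf{u})_0 & \text{provided } \varrho_{0,\delta} \geq \varrho_0, \\ 0 & \text{otherwise.} \end{cases} \tag{5.9}$$

The role of the “artificial pressure” term $\delta \varrho^\Gamma$ in (5.6) is to provide additional estimates on the approximate densities in order to facilitate the limit passage $\varepsilon \to 0$ (cf. Sect. 7 in [18]). To this end, one has to take $\Gamma$ large enough, say, $\Gamma > 8$, and to re-parametrize the initial distribution of the approximate densities so that

$$\delta \int_\Omega \varrho_{0,\delta}^\Gamma\, dx \to 0 \quad \text{for } \delta \to 0. \tag{5.10}$$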
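As a next step, pursuing the strategy of [13], we replace the entropy equation (1.18) by the (modified) internal energy balance

$$\partial_t(\varrho e + \{\delta \varrho \vartheta\}) + \operatorname{div}_x\big( (\varrho e + \{\delta \varrho \vartheta\}) \mathbf{u} \big) - \operatorname{div}_x\big( (\kappa_F + \kappa_R \vartheta^3 + \{\delta \vartheta^\Gamma\}) \nabla_x \vartheta \big) = \mathbb{S} : \nabla_x \mathbf{u} - p \operatorname{div}_x \mathbf{u} + \frac{\lambda}{\mu} |\operatorname{curl}_x \mathbf{B}|^2 + \{\varepsilon \delta \Gamma |\nabla_x \varrho|^2 \varrho^{\Gamma - 2}\} \tag{5.11}$$

to be satisfied in $(0, T) \times \Omega$, together with no-flux boundary conditions

$$\nabla_x \vartheta \cdot \mathbf{n}|_{\partial\Omega} = 0. \tag{5.12}$$

The initial conditions read

$$(\varrho e + \delta \varrho \vartheta)(0, \cdot) = \varrho_{0,\delta}\big( e(\varrho_{0,\delta}, \vartheta_{0,\delta}) + \delta \vartheta_{0,\delta} \big), \tag{5.13}$$

where the (approximate) temperature distribution satisfies

$$\vartheta_{0,\delta} \in C^1(\overline{\Omega}), \quad \nabla_x \vartheta_{0,\delta} \cdot \mathbf{n}|_{\partial\Omega} = 0, \quad \inf_{x \in \Omega} \vartheta_{0,\delta}(x) > 0, \tag{5.14}$$

and

$$\vartheta_{0,\delta} \to \vartheta_0 \text{ in } L^p(\Omega) \text{ for any } p \geq 1, \quad \delta \int_\Omega \varrho_{0,\delta} \log(\vartheta_{0,\delta})\, dx \to 0, \quad \int_\Omega \varrho_{0,\delta}\, s(\varrho_{0,\delta}, \vartheta_{0,\delta})\, dx \to \int_\Omega \varrho_0\, s(\varrho_0, \vartheta_0)\, dx \quad \text{as } \delta \to 0, \tag{5.15}$$

$$\int_\Omega \varrho_{0,\delta}\, e(\varrho_{0,\delta}, \vartheta_{0,\delta})\, dx < c \quad \text{uniformly for } \delta > 0. \tag{5.16}$$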
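Finally, the magnetic induction vector $\mathbf{B}$ obeys the (unperturbed) equations

$$\partial_t \mathbf{B} + \operatorname{curl}_x(\mathbf{B} \times \mathbf{u}) + \operatorname{curl}_x(\lambda \operatorname{curl}_x \mathbf{B}) = 0, \quad \operatorname{div}_x \mathbf{B} = 0 \tag{5.17}$$

in $(0, T) \times \Omega$, supplemented with the initial condition

$$\mathbf{B}(0, \cdot) = \mathbf{B}_{0,\delta}, \tag{5.18}$$

where, by virtue of Theorem 1.4 in [38], one can take

$$\mathbf{B}_{0,\delta} \in \mathcal{D}(\Omega; \mathbb{R}^3), \quad \operatorname{div}_x \mathbf{B}_{0,\delta} = 0, \tag{5.19}$$

$$\mathbf{B}_{0,\delta} \to \mathbf{B}_0 \text{ in } L^2(\Omega; \mathbb{R}^3) \quad \text{for } \delta \to 0. \tag{5.20}$$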
5.2. The overall strategy of the proof of Theorem 3.1. For given positive parameters $\varepsilon$, $\delta$, and $\Gamma > 8$, the proof of Theorem 3.1 consists of the following steps:
Step 1. Solving problem (5.1–5.20) for fixed $\varepsilon > 0$, $\delta > 0$.
Step 2. Passing to the limit for $\varepsilon \to 0$.
Step 3. Letting $\delta \to 0$.

To begin with, the goal proposed in Step 1 can be achieved by means of a simple fixed point argument exactly as in Sect. 5 in [13] (see also Chap. 7 in [18]). More specifically, Eq. (5.6) is solved in terms of the velocity $\mathbf{u}$ with the help of the Faedo-Galerkin method, where $\varrho$, $\Psi$, $\vartheta$, and $\mathbf{B}$ are computed successively from (5.1), (2.5), (5.11), and (5.17) as functions of $\mathbf{u}$. In addition, it is easy to check that the corresponding approximate solutions satisfy the energy balance

$$\frac{d}{dt} \int_\Omega \Big( \frac{1}{2}\varrho|\mathbf{u}|^2 + \varrho e + \frac{1}{2\mu}|\mathbf{B}|^2 + \frac{\delta}{\Gamma - 1}\varrho^\Gamma + \delta \varrho \vartheta \Big)\, dx = \int_\Omega \varrho \nabla_x \Psi \cdot \mathbf{u}\, dx \tag{5.21}$$

in $\mathcal{D}'(0, T)$,

$$\lim_{t \to 0+} \int_\Omega \Big( \frac{1}{2}\varrho|\mathbf{u}|^2 + \varrho e + \frac{1}{2\mu}|\mathbf{B}|^2 + \frac{\delta}{\Gamma - 1}\varrho^\Gamma + \delta \varrho \vartheta \Big)\, dx = \int_\Omega \Big( \frac{1}{2\varrho_{0,\delta}}|(\varrho \mathbf{u})_{0,\delta}|^2 + \varrho_{0,\delta}\, e(\varrho_{0,\delta}, \vartheta_{0,\delta}) + \frac{1}{2\mu}|\mathbf{B}_{0,\delta}|^2 + \frac{\delta}{\Gamma - 1}\varrho_{0,\delta}^\Gamma + \delta \varrho_{0,\delta} \vartheta_{0,\delta} \Big)\, dx \tag{5.22}$$

(cf. formula (5.3) in [13]).

The technical parts of Steps 2, 3 are rather similar. As explained in Chap. 7 in [18], the only reason for splitting this step into the $\varepsilon$- and $\delta$-parts are the refined density estimates based on the multipliers $\nabla_x(-\Delta)^{-1}[\varrho^\beta]$, where one has to take $\beta = 1$ when the artificial viscosity term $\varepsilon \Delta \varrho$ is present, while uniform (independent of $\delta$) estimates require $\beta$ to be a small positive number (see also Sect. 6 in [13]). For this reason, we focus in this paper only on Step 3; in other words, our task will be to establish the weak sequential stability (compactness) property for the solution set of the approximate problem specified below.
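5.3. The weak sequential stability problem. The density $\varrho_\delta \geq 0$ and the velocity $\mathbf{u}_\delta$ satisfy the integral identity

$$\int_0^T \!\! \int_{\mathbb{R}^3} \big( \varrho_\delta\, \partial_t \varphi + \varrho_\delta \mathbf{u}_\delta \cdot \nabla_x \varphi \big)\, dx\, dt + \int_\Omega \varrho_{0,\delta}\, \varphi(0, \cdot)\, dx = 0 \tag{5.23}$$

for any test function $\varphi \in \mathcal{D}([0, T) \times \mathbb{R}^3)$. In addition,

$$\mathbf{u}_\delta(t) \in W^{1,2}_0(\Omega; \mathbb{R}^3) \quad \text{for a.a. } t \in (0, T). \tag{5.24}$$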
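The momentum equation

$$\int_0^T \!\! \int_\Omega \Big( \varrho_\delta \mathbf{u}_\delta \cdot \partial_t \boldsymbol{\varphi} + (\varrho_\delta \mathbf{u}_\delta \otimes \mathbf{u}_\delta) : \nabla_x \boldsymbol{\varphi} + p_\delta \operatorname{div}_x \boldsymbol{\varphi} + \{\delta \varrho_\delta^\Gamma\} \operatorname{div}_x \boldsymbol{\varphi} \Big)\, dx\, dt = \int_0^T \!\! \int_\Omega \Big( \mathbb{S}_\delta : \nabla_x \boldsymbol{\varphi} - \varrho_\delta \nabla_x \Psi_\delta \cdot \boldsymbol{\varphi} - (\mathbf{J}_\delta \times \mathbf{B}_\delta) \cdot \boldsymbol{\varphi} \Big)\, dx\, dt - \int_\Omega (\varrho \mathbf{u})_{0,\delta} \cdot \boldsymbol{\varphi}(0, \cdot)\, dx \tag{5.25}$$

holds for any $\boldsymbol{\varphi} \in \mathcal{D}([0, T) \times \Omega; \mathbb{R}^3)$, with

$$\Psi_\delta = G(-\Delta)^{-1}[1_\Omega \varrho_\delta], \tag{5.26}$$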
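where $p_\delta = p(\varrho_\delta, \vartheta_\delta)$, and $\mathbb{S}_\delta$, $\mathbf{J}_\delta$ are determined in terms of $\mathbf{u}_\delta$, $\vartheta_\delta$, and $\mathbf{B}_\delta$ through the constitutive relations (1.6), (1.13). The entropy production inequality reads

$$\int_0^T \!\! \int_\Omega \Big( (\varrho_\delta s_\delta + \{\delta \varrho_\delta \log(\vartheta_\delta)\})\, \partial_t \varphi + (\varrho_\delta s_\delta \mathbf{u}_\delta + \{\delta \varrho_\delta \log(\vartheta_\delta) \mathbf{u}_\delta\}) \cdot \nabla_x \varphi \Big)\, dx\, dt + \int_0^T \!\! \int_\Omega \Big( \frac{\mathbf{q}_\delta}{\vartheta_\delta} \cdot \nabla_x \varphi - \{\delta \vartheta_\delta^{\Gamma - 1} \nabla_x \vartheta_\delta\} \cdot \nabla_x \varphi \Big)\, dx\, dt$$
$$\leq \int_0^T \!\! \int_\Omega \frac{1}{\vartheta_\delta}\Big( \frac{\mathbf{q}_\delta \cdot \nabla_x \vartheta_\delta}{\vartheta_\delta} - \mathbb{S}_\delta : \nabla_x \mathbf{u}_\delta - \frac{\lambda_\delta}{\mu} |\operatorname{curl}_x \mathbf{B}_\delta|^2 - \{\delta \vartheta_\delta^{\Gamma - 1} |\nabla_x \vartheta_\delta|^2\} \Big) \varphi\, dx\, dt - \int_\Omega \big( \varrho_{0,\delta}\, s(\varrho_{0,\delta}, \vartheta_{0,\delta}) + \{\delta \varrho_{0,\delta} \log(\vartheta_{0,\delta})\} \big) \varphi(0, \cdot)\, dx \tag{5.27}$$

for any $\varphi \in \mathcal{D}([0, T) \times \mathbb{R}^3)$, $\varphi \geq 0$. Here, $\vartheta_\delta$ is assumed to be positive a.a. on the set $(0, T) \times \Omega$, $s_\delta = s(\varrho_\delta, \vartheta_\delta)$, $\lambda_\delta = \lambda(\varrho_\delta, \vartheta_\delta)$, and $\mathbf{q}_\delta$ is a function of $\varrho_\delta$, $\vartheta_\delta$ given by (1.30). The magnetic induction vector $\mathbf{B}_\delta$ satisfies

$$\int_0^T \!\! \int_\Omega \Big( \mathbf{B}_\delta \cdot \partial_t \boldsymbol{\varphi} - (\mathbf{B}_\delta \times \mathbf{u}_\delta + \lambda_\delta \operatorname{curl}_x \mathbf{B}_\delta) \cdot \operatorname{curl}_x \boldsymbol{\varphi} \Big)\, dx\, dt + \int_\Omega \mathbf{B}_{0,\delta} \cdot \boldsymbol{\varphi}(0, \cdot)\, dx = 0 \tag{5.28}$$

for any vector field $\boldsymbol{\varphi} \in \mathcal{D}([0, T) \times \mathbb{R}^3; \mathbb{R}^3)$.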
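Finally, the (total) energy equality

$$\int_\Omega \Big( \frac{1}{2}\varrho_\delta|\mathbf{u}_\delta|^2 + \varrho_\delta\, e(\varrho_\delta, \vartheta_\delta) + \frac{1}{2\mu}|\mathbf{B}_\delta|^2 - \frac{1}{2}\varrho_\delta \Psi_\delta + \frac{\delta}{\Gamma - 1}\varrho_\delta^\Gamma + \delta \varrho_\delta \vartheta_\delta \Big)(t)\, dx$$
$$= \int_\Omega \Big( \frac{1}{2\varrho_{0,\delta}}|(\varrho \mathbf{u})_{0,\delta}|^2 + \varrho_{0,\delta}\, e(\varrho_{0,\delta}, \vartheta_{0,\delta}) + \frac{1}{2\mu}|\mathbf{B}_{0,\delta}|^2 - \frac{G}{2}\varrho_{0,\delta}(-\Delta)^{-1}[1_\Omega \varrho_{0,\delta}] \Big)\, dx + \int_\Omega \Big( \frac{\delta}{\Gamma - 1}\varrho_{0,\delta}^\Gamma + \delta \varrho_{0,\delta} \vartheta_{0,\delta} \Big)\, dx \tag{5.29}$$

holds for a.a. $t \in (0, T)$.

The weak sequential stability problem to be addressed in the remaining part of this paper consists in showing that one can pass to the limit

$$\varrho_\delta \to \varrho, \quad \mathbf{u}_\delta \to \mathbf{u}, \quad \vartheta_\delta \to \vartheta, \quad \mathbf{B}_\delta \to \mathbf{B} \quad \text{as } \delta \to 0$$

in a suitable topology, where the limit quantity $\{\varrho, \mathbf{u}, \vartheta, \mathbf{B}\}$ is a variational solution of problem (2.1–2.15), the existence of which is claimed in Theorem 3.1.

6. Uniform Bounds

Our first goal is to identify the uniform bounds imposed on the sequences $\{\varrho_\delta\}_{\delta > 0}$, $\{\mathbf{u}_\delta\}_{\delta > 0}$, $\{\vartheta_\delta\}_{\delta > 0}$, and $\{\mathbf{B}_\delta\}_{\delta > 0}$ through the total energy balance (5.29), the dissipation inequality (5.27), as well as other relations resulting from (5.23–5.29).

6.1. Total mass conservation. As $\varrho_\delta$, $\varrho_\delta \mathbf{u}_\delta$ are locally integrable in $[0, T) \times \overline{\Omega}$, it follows easily from (5.23) that the total mass is a constant of motion, specifically,

$$\int_\Omega \varrho_\delta(t)\, dx = \int_\Omega \varrho_{0,\delta}\, dx = M_0 \quad \text{for a.a. } t \in (0, T). \tag{6.1}$$

In particular, as $\varrho_\delta \geq 0$ and (5.5) holds, we get

$$\{\varrho_\delta\}_{\delta > 0} \text{ bounded in } L^\infty(0, T; L^1(\Omega)). \tag{6.2}$$

6.2. The gravitational potential. The classical elliptic estimates applied to (2.5), together with (6.1), give rise to

$$\int_\Omega \varrho_\delta \Psi_\delta\, dx \leq \|\varrho_\delta\|_{L^{5/3}(\Omega)} \|\Psi_\delta\|_{L^{5/2}(\Omega)} \leq c \|\varrho_\delta\|_{L^{5/3}(\Omega)} \|\varrho_\delta\|_{L^1(\Omega)} = c M_0 \|\varrho_\delta\|_{L^{5/3}(\Omega)}. \tag{6.3}$$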
B. Ducomet, E. Feireisl
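6.3. Energy estimates. On the other hand, by virtue of (1.25), (4.7),
\[
\int_\Omega \varrho_\delta e(\varrho_\delta, \vartheta_\delta)\, dx \ge \frac{3}{2} \int_\Omega p_F(\varrho_\delta, \vartheta_\delta)\, dx \ge \frac{3}{2}\, p_\infty \int_\Omega \varrho_\delta^{5/3}\, dx;
\]
whence the total energy balance (5.29), estimate (6.3), together with the bounds on the initial data (5.5), (5.9), (5.10), (5.16), and (5.20), yield a family of energy estimates:
\[
\{\varrho_\delta\}_{\delta>0} \text{ bounded in } L^\infty(0,T; L^{5/3}(\Omega)), \tag{6.4}
\]
\[
\{\vartheta_\delta\}_{\delta>0} \text{ bounded in } L^\infty(0,T; L^4(\Omega)), \tag{6.5}
\]
\[
\{B_\delta\}_{\delta>0} \text{ bounded in } L^\infty(0,T; L^2(\Omega; \mathbb{R}^3)), \tag{6.6}
\]
\[
\{\varrho_\delta |u_\delta|^2\}_{\delta>0},\ \{\varrho_\delta e(\varrho_\delta, \vartheta_\delta)\}_{\delta>0} \text{ bounded in } L^\infty(0,T; L^1(\Omega)), \tag{6.7}
\]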
and δ
δ dx ≤ c uniformly with respect to δ > 0.
(6.8)
In particular, by virtue of Hölder's inequality, (6.4) together with (6.7) imply
\[
\{(\varrho u)_\delta\}_{\delta>0} \text{ bounded in } L^\infty(0,T; L^{5/4}(\Omega; \mathbb{R}^3)). \tag{6.9}
\]
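The exponent 5/4 in (6.9) can be traced as follows. This is only a short sketch: it identifies $(\varrho u)_\delta$ with $\varrho_\delta u_\delta$ and splits the product as $\sqrt{\varrho_\delta} \cdot \sqrt{\varrho_\delta}\, u_\delta$. By Hölder's inequality with $\tfrac{3}{10} + \tfrac{1}{2} = \tfrac{4}{5}$,
\[
\|\varrho_\delta u_\delta\|_{L^{5/4}(\Omega)} \le \|\sqrt{\varrho_\delta}\|_{L^{10/3}(\Omega)}\, \|\sqrt{\varrho_\delta}\, u_\delta\|_{L^2(\Omega)} = \|\varrho_\delta\|_{L^{5/3}(\Omega)}^{1/2}\, \big\|\varrho_\delta |u_\delta|^2\big\|_{L^1(\Omega)}^{1/2},
\]
and both factors on the right are bounded uniformly in $t$ and $\delta$ by (6.4) and (6.7).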
6.4. Dissipation estimates. The following, relatively strong, estimates result from the entropy production inequality (5.27), where one is allowed to take a spatially homogeneous test function ϕ such that ϕ(0, ·) = 1. Taking (5.15), (5.16) into account we obtain τ 0
1 qδ · ∇x ϑδ λδ Sδ : ∇x uδ − + |curlx Bδ |2 + δϑδ−1 |∇x ϑδ |2 dx dt ϑδ µ ϑδ ≤ c1 + δ (τ )s(δ (τ ), ϑδ (τ )) + δδ (τ ) log(ϑδ (τ )) dx
≤ c2 +
δ (τ )s(δ (τ ), ϑδ (τ )) dx for a.a. τ ∈ (0, T ),
where the last inequality follows from (6.4), (6.5).
(6.10)
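Furthermore, the rightmost integral can be estimated with the help of (4.12), (4.13):
\[
\int_\Omega \varrho_\delta s(\varrho_\delta, \vartheta_\delta)\, dx = \int_\Omega \Big( \frac{4}{3} a \vartheta_\delta^3 + \varrho_\delta S_F\Big( \frac{\varrho_\delta}{\vartheta_\delta^{3/2}} \Big) \Big)\, dx \le c_1 - c_2 \int_{\{\varrho_\delta \le \vartheta_\delta^{3/2}\}} \varrho_\delta \Big( \log(\varrho_\delta) - \frac{3}{2} \log(\vartheta_\delta) \Big)\, dx
\]
\[
\le c_3 + c_4 \int_{\{\varrho_\delta \le \vartheta_\delta^{3/2}\}} \varrho_\delta \log(\vartheta_\delta)\, dx \le c_5 + c_6 \int_\Omega \varrho_\delta \vartheta_\delta\, dx \le c_7, \tag{6.11}
\]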
where we have used (6.4), (6.5). Thus the far left integral in (6.10) is bounded independently of δ, and we conclude, making use of hypotheses (1.29–1.33), that
α−1
{(1 + ϑδ ) 2 < ∇x uδ >}δ>0 is bounded in L 2 (0, T ; L 2 (; R 3×3 )), α−1 {(1 + ϑδ ) 2 divx uδ }δ>0 is bounded in L 2 (0, T ; L 2 ()), 3 {∇x log(ϑδ )}δ>0 , ∇x ϑδ2 is bounded in L 2 (0, T ; L 2 (; R 3 )), {(1 + ϑδ )
and
α−1 2
√
curlx Bδ }δ>0 is bounded in L 2 (0, T ; L 2 (; R 3 )), 2
δ∇x ϑδ
(6.12)
(6.13) (6.14)
is bounded in L 2 (0, T ; L 2 (; R 3 )),
(6.15)
where we have denoted
\[
\langle Q \rangle = \frac{1}{2} (Q + Q^T) - \frac{1}{3} \operatorname{trace}(Q)\, I
\]
the traceless component of the symmetric part of a tensor $Q$. Since the velocity field $u_\delta$ vanishes on the boundary in the sense of (5.24), estimate (6.12) yields, in particular,
\[
\{u_\delta\}_{\delta>0} \text{ bounded in } L^2(0,T; W_0^{1,2}(\Omega; \mathbb{R}^3)). \tag{6.16}
\]
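The operation $\langle Q \rangle$ used in the dissipation estimates can be made concrete with a few lines of code. The following is a minimal sketch, assuming a $3 \times 3$ tensor stored as nested lists; the function name and the sample matrix are hypothetical and serve only as an illustration.

```python
def traceless_symmetric_part(q):
    """Return <Q> = (Q + Q^T)/2 - (trace(Q)/3) I for a 3x3 matrix (nested lists)."""
    trace = q[0][0] + q[1][1] + q[2][2]
    result = []
    for i in range(3):
        row = []
        for j in range(3):
            sym = 0.5 * (q[i][j] + q[j][i])  # symmetric part
            # subtract one third of the trace on the diagonal only
            row.append(sym - trace / 3.0 if i == j else sym)
        result.append(row)
    return result


# Hypothetical sample "velocity gradient" (values chosen only for illustration).
grad_u = [[1.0, 2.0, 0.0],
          [4.0, 5.0, 6.0],
          [7.0, 8.0, 9.0]]
D = traceless_symmetric_part(grad_u)
trace_D = D[0][0] + D[1][1] + D[2][2]
```

By construction the result is symmetric and has zero trace, which is why the divergence of the velocity has to be controlled separately in (6.12).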
Similarly, (6.13) together with (6.5) give rise to
\[
\{\vartheta_\delta^{3/2}\}_{\delta>0} \text{ bounded in } L^2(0,T; W^{1,2}(\Omega)). \tag{6.17}
\]
6.5. Positivity of the absolute temperature. In agreement with the physical background and as required in the variational formulation introduced in Sect. 2, the absolute temperature must be positive a.a. on $(0,T) \times \Omega$. To this end, we make use of the uniform $L^2$-estimates of $\nabla_x \log(\vartheta_\delta)$ established in (6.13) along with the following version of Poincaré's inequality:
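Lemma 6.1. Let $\Omega \subset \mathbb{R}^N$ be a bounded Lipschitz domain, and $\omega \ge 1$ be a given constant. Furthermore, assume $O \subset \Omega$ is a measurable set such that $|O| \ge o > 0$. Then
\[
\|v\|_{W^{1,2}(\Omega)} \le c(o, \Omega, \omega) \Big( \|\nabla_x v\|_{L^2(\Omega; \mathbb{R}^3)} + \Big( \int_O |v|^{1/\omega}\, dx \Big)^{\omega} \Big).
\]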
In order to apply Lemma 6.1, we first show a uniform bound
\[
0 < \underline{T} \le \int_\Omega \vartheta_\delta(t)\, dx \quad \text{for a.a. } t \in (0,T). \tag{6.18}
\]
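Indeed, as an immediate consequence of (5.27), we get
\[
\int_\Omega \Big( \varrho_\delta s(\varrho_\delta, \vartheta_\delta)(t) + \delta \varrho_\delta \log(\vartheta_\delta)(t) \Big)\, dx \ge \int_\Omega \Big( \varrho_{0,\delta} s(\varrho_{0,\delta}, \vartheta_{0,\delta}) + \delta \varrho_{0,\delta} \log(\vartheta_{0,\delta}) \Big)\, dx,
\]
where, by virtue of (5.5), (5.15),
\[
\delta \int_\Omega \varrho_{0,\delta} \log(\vartheta_{0,\delta})\, dx \to 0, \qquad \int_\Omega \varrho_{0,\delta} s(\varrho_{0,\delta}, \vartheta_{0,\delta})\, dx \to \int_\Omega \varrho_0 s(\varrho_0, \vartheta_0)\, dx.
\]
On the other hand, (6.4), (6.5) yield
\[
\delta \int_\Omega \varrho_\delta \log(\vartheta_\delta)(t)\, dx \le \delta \int_\Omega \varrho_\delta(t) \vartheta_\delta(t)\, dx \to 0 \quad \text{for a.a. } t \in (0,T),
\]
while, in accordance with (4.9),
\[
\int_\Omega \varrho_\delta s(\varrho_\delta, \vartheta_\delta)(t)\, dx = \int_\Omega \Big( \frac{4}{3} a \vartheta_\delta^3(t) + \varrho_\delta S_F\Big( \frac{\varrho_\delta}{\vartheta_\delta^{3/2}} \Big)(t) \Big)\, dx.
\]
Assuming, by contradiction, the opposite of (6.18), we could extract a sequence $\{\vartheta_\delta(t_\delta)\}_{\delta>0}$ such that
\[
\vartheta_\delta(t_\delta) \to 0 \text{ weakly in } L^4(\Omega), \text{ and strongly in } L^p(\Omega) \text{ for any } 1 \le p < 4, \tag{6.19}
\]
\[
\int_\Omega \Big( \frac{4}{3} a \vartheta_0^3 + \varrho_0 S_F\Big( \frac{\varrho_0}{\vartheta_0^{3/2}} \Big) \Big)\, dx \le \liminf_{\delta \to 0+} \int_\Omega \varrho_\delta(t_\delta) S_F\Big( \frac{\varrho_\delta(t_\delta)}{\vartheta_\delta^{3/2}(t_\delta)} \Big)\, dx,
\]
where
\[
\varrho_\delta(t_\delta) \to \varrho(t) \text{ weakly in } L^{5/3}(\Omega), \qquad \int_\Omega \varrho_\delta(t_\delta)\, dx = \int_\Omega \varrho_{0,\delta}\, dx. \tag{6.20}
\]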
Now, for any fixed K > 1, we can write
δ δ (tδ ) dx ≤ (tδ ) dx δ (tδ )S F δ (tδ )S F 3 3 3 2 {δ (tδ )≤ϑδ (tδ )} ϑδ2 ϑδ2 δ + (tδ ) dx δ (tδ )S F 3 3 {δ (tδ )≥K ϑδ2 (tδ )} ϑδ2 3 ϑδ2 ≤c δ (tδ ) 1 + (tδ ) dx + S F (K ) 0,δ dx 3 δ {δ (tδ )≤ϑδ2 (tδ )} 3 ≤ 2c ϑδ2 (tδ ) dx + S F (K ) 0,δ dx.
(6.21)
Thus combining (6.19 – 6.21) we conclude
4 3 0 aϑ0 + 0 S F dx ≤ S F (K ) 3 3 2 ϑ0
0 dx for any K > 1,
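which is clearly impossible as $S_F$ is strictly decreasing with $\lim_{K \to \infty} S_F(K) < 0$, and $\vartheta_0$ is positive (non-zero) on $\Omega$. Thus we have established (6.18). Finally, seeing that
\[
\underline{T} - \varepsilon |\Omega| \le \int_{\{\vartheta_\delta(t) > \varepsilon\}} \vartheta_\delta(t)\, dx \le |\{\vartheta_\delta(t) > \varepsilon\}|^{3/4}\, \|\vartheta_\delta(t)\|_{L^4(\Omega)} \quad \text{for any } \varepsilon > 0,
\]
we infer, making use of (6.5), that there exist $\varepsilon > 0$ and $o$ such that $|\{\vartheta_\delta(t) > \varepsilon\}| > o > 0$ for a.a. $t \in (0,T)$ uniformly for $\delta > 0$. In other words, taking (6.13) into account, one can apply Lemma 6.1 to $\log(\vartheta_\delta)$ in order to obtain the desired estimate
\[
\{\log(\vartheta_\delta)\}_{\delta>0} \text{ bounded in } L^2(0,T; W^{1,2}(\Omega)). \tag{6.22}
\]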
6.6. Estimates of the magnetic field. As already pointed out in Sect. 2.4, satisfaction of the integral identity (5.28) gives rise to
\[
\operatorname{div}_x B_\delta(t) = 0 \text{ in } \mathcal{D}'(\Omega), \qquad B_\delta \cdot n|_{\partial\Omega} = 0, \tag{6.23}
\]
which, together with estimates (6.6), (6.14), and Theorem 6.1 in [14], yields
\[
\{B_\delta\}_{\delta>0} \text{ bounded in } L^2(0,T; W^{1,2}(\Omega; \mathbb{R}^3)). \tag{6.24}
\]
6.7. Pressure estimates. More refined density estimates can be obtained through “computing” the pressure in the momentum equation (5.25). In order to do this, we start with a simple observation that estimates (6.4), (6.16) imply that the sequences {δ uδ }δ>0 , {δ uδ ⊗ uδ }δ>0 , and {δ ∇x δ }δ>0 , where is determined by (5.26), are bounded in the Lebesgue space L p ((0, T ) × ) for a certain p > 1. Furthermore, one gets Sδ =
ν(ϑδ ) η(ϑδ ) ϑδ ν(ϑδ ) < ∇x uδ > + ϑδ η(ϑδ ) divx uδ , ϑδ ϑδ
(6.25)
where, by virtue of (6.5), (6.12), (6.17), and hypothesis (3.1), the expression on the right-hand side is bounded in L p ((0, T ) × ) for a certain p > 1. Finally, we have the Lorentz force Jδ × Bδ determined through (1.6); whence (6.6), (6.24) combined with a simple interpolation argument yield {J × Bδ }δ>0 bounded in L p ((0, T ) × ) for a certain p > 1.
(6.26)
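At this stage, repeating step by step the proof of the main result in [21], we are allowed to use the quantities
\[
\varphi(t,x) = \psi(t)\, \mathcal{B}[\varrho_\delta^{\omega}], \quad \psi \in \mathcal{D}(0,T),
\]
for a sufficiently small parameter $\omega > 0$, as test functions in (5.25), where $\mathcal{B}[v]$ is a suitable branch of solutions to the boundary value problem
\[
\operatorname{div}_x \mathcal{B}[v] = v - \frac{1}{|\Omega|} \int_\Omega v\, dx, \qquad \mathcal{B}|_{\partial\Omega} = 0. \tag{6.27}
\]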
The construction of the operator B, described in detail in [22] (see also Lemma 3.17 in [31]) is based on an integral representation formula due to Bogovskii [4]. The resulting estimate reads T 0
p(δ , ϑδ )δω + δδ+ω dx dt < c, with c independent of δ,
(6.28)
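in particular,
\[
\{p(\varrho_\delta, \vartheta_\delta)\}_{\delta>0} \text{ is bounded in } L^p((0,T) \times \Omega) \text{ for a certain } p > 1, \tag{6.29}
\]
and
\[
\{\varrho_\delta^{5/3 + \omega}\}_{\delta>0} \text{ is bounded in } L^1((0,T) \times \Omega). \tag{6.30}
\]
Note that an alternative way to obtain these estimates was proposed in [28].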
7. Sequential Stability of the Field Equations
7.1. The continuity equation. With the estimates established in the preceding section, it is quite easy to pass to the limit for $\delta \to 0$ in (5.23). Indeed, passing to subsequences if necessary, we deduce from (6.4), (6.9), (6.16), and the fact that $\varrho_\delta$, $u_\delta$ satisfy the integral identity (5.23):
\[
\varrho_\delta \to \varrho \text{ in } C([0,T]; L^{5/3}_{\mathrm{weak}}(\Omega)), \tag{7.1}
\]
\[
u_\delta \to u \text{ weakly in } L^2(0,T; W_0^{1,2}(\Omega; \mathbb{R}^3)), \tag{7.2}
\]
\[
\varrho_\delta u_\delta \to \varrho u \text{ weakly-(*) in } L^\infty(0,T; L^{5/4}(\Omega; \mathbb{R}^3)), \tag{7.3}
\]
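where the limit quantities satisfy (2.1).
7.2. The momentum equation. Using the estimates obtained in Sect. 6.7 together with (5.25) we have
\[
\varrho_\delta u_\delta \to \varrho u \text{ in } C([0,T]; L^{5/4}(\Omega; \mathbb{R}^3)), \tag{7.4}
\]
\[
\varrho_\delta u_\delta \otimes u_\delta \to \varrho u \otimes u \text{ weakly in } L^2(0,T; L^{30/29}(\Omega; \mathbb{R}^{3\times 3}_{\mathrm{sym}})), \tag{7.5}
\]
where we have used the embedding $W_0^{1,2}(\Omega) \hookrightarrow L^6(\Omega)$. Furthermore, in accordance with (6.28), (6.29),
\[
p(\varrho_\delta, \vartheta_\delta) \to \overline{p(\varrho, \vartheta)} \text{ weakly in } L^p((0,T) \times \Omega) \tag{7.6}
\]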
δδ → 0 in L p ((0, T ) × )
(7.7)
and
for a certain p > 1. Here, in agreement with Sect. 3.3, T p(, ϑ)ϕ dx dt 0
=
T 0
ϕ
R2
p(, ϑ) dt,x (, ϑ) dx dt, ϕ ∈ D((0, T ) × ),
where t,x (, ϑ) is a parametrized (Young) measure associated to the (vector valued) sequence {δ , ϑδ }δ>0 . Similarly, one can use (5.28) together with estimates (6.6), (6.24) to deduce Bδ → B weakly in L 2 (0, T ; W 1,2 (; R 3 )) and strongly in L 2 ((0, T ) × ; R 3 ), (7.8) and, consequently, Jδ × Bδ =
1 1 curlx Bδ × Bδ → curlx B × B = J × B weakly in L p ((0, T ) × ; R 3 ) µ µ
for a certain p > 1.
Thus the limit quantities satisfy an “averaged” momentum equation T (u) · ∂t ϕ + (u ⊗ u) : ∇x ϕ + p(, ϑ) divx ϕ dx dt 0
=
T
0
S : ∇x ϕ − ∇x · ϕ − (J × B) · ϕ dx dt − (u)0 · ϕ(0, ·) dx
(7.9) for any vector field ϕ ∈ D([0, T ) × ; R 3 ), where the gravitational potential is given by (2.5). Note that, as a direct consequence of (7.1) and the standard elliptic theory,
\[
\Delta^{-1}[1_\Omega \varrho_\delta] \to \Delta^{-1}[1_\Omega \varrho] \text{ in } C([0,T] \times \overline{\Omega}). \tag{7.10}
\]
The symbol $\overline{S}$ denotes a weak limit in $L^p(0,T; L^p(\Omega; \mathbb{R}^{3\times 3}_{\mathrm{sym}}))$, $p > 1$, of the approximate viscosity tensors $S_\delta$ specified in (6.25). Clearly, relation (7.9) will coincide with the (variational) momentum equation (2.4) as soon as we show strong (pointwise) convergence of the sequences $\{\varrho_\delta\}_{\delta>0}$ and $\{\vartheta_\delta\}_{\delta>0}$. This issue will be addressed in the subsequent two sections.
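8. Entropy Inequality and Strong Convergence of the Temperature
8.1. Entropy inequality and time oscillations. In order to extract a piece of information on the time oscillations of the sequence $\{\vartheta_\delta\}_{\delta>0}$, we shall use the (rather poor) estimates on $\partial_t(\varrho_\delta s(\varrho_\delta, \vartheta_\delta))$ provided by the approximate entropy balance (5.27). To begin with, it follows from (4.12), (4.13) that
\[
|\varrho_\delta s(\varrho_\delta, \vartheta_\delta)| \le c \big( \vartheta_\delta^3 + \varrho_\delta |\log(\varrho_\delta)| + \varrho_\delta |\log(\vartheta_\delta)| \big);
\]
therefore, by virtue of the uniform estimates (6.4), (6.5), (6.9), and (6.22), we can assume
\[
\varrho_\delta s(\varrho_\delta, \vartheta_\delta) \to \overline{\varrho s(\varrho, \vartheta)} \text{ weakly in } L^p((0,T) \times \Omega), \tag{8.1}
\]
\[
\varrho_\delta s(\varrho_\delta, \vartheta_\delta) u_\delta \to \overline{\varrho s(\varrho, \vartheta)}\, u \text{ weakly in } L^p((0,T) \times \Omega; \mathbb{R}^3) \tag{8.2}
\]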
for a certain p > 1. Similarly, one can estimate the entropy flux q + δϑδ−1 ∇x ϑδ ≤ c |∇x log(ϑδ )| + ϑδ2 |∇x ϑδ | + δϑδ−1 |∇x ϑδ | , ϑδ where ϑδ2 |∇x ϑδ | =
3 √ 2 23 2√ s (1−s) 2 ϑδ |∇x ϑδ2 |, δϑδ−1 ∇x ϑδ = δ |∇x ϑδ2 | δϑδ 2 ϑδ . 3
Choosing the parameter 0 < s < 1 small enough so that (1 − s)
= 1, 2
we have, by virtue of Hölder’s inequality,
δϑδ−1 ∇x ϑδ L p (;R 3 ) ≤ c
√
δ∇x ϑδ2 L 2 (;R 3 ) ϑδ L 4 ()
√
s
δϑδ 2 L 6 () .
(8.3)
Thus we can use estimates (6.5), (6.15) together with the imbedding $W^{1,2}(\Omega) \hookrightarrow L^6(\Omega)$ in order to infer
δϑδ−1 ∇x ϑδ → 0 in L p ((0, T ) × ; R 3 ) for a certain p > 1.
(8.4)
Moreover, using similar arguments one can also show that
\[
\Big\{ \frac{q_\delta}{\vartheta_\delta} \Big\}_{\delta>0} \text{ is bounded in } L^p((0,T) \times \Omega; \mathbb{R}^3), \text{ for a certain } p > 1. \tag{8.5}
\]
On the other hand, in accordance with (6.13) b(ϑδ ) → b(ϑ) weakly in L q ((0, T ) × )), and weakly in L 2 (0, T ; W 1,2 ()) (8.6) for any finite q ≥ 1 provided both b and b are uniformly bounded. Now, as a straightforward consequence of the entropy balance (5.27), we have q Divt,x δ s(δ , ϑδ ) + δδ log(ϑδ ), (δ s(δ , ϑδ ) + δδ log(ϑδ ))uδ + − δϑδ−1 ∇x ϑδ ϑδ ≥ 0 in D ((0, T ) × ),
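while (8.6) yields
\[
\operatorname{Curl}_{t,x}\, [b(\vartheta_\delta), 0, 0, 0] \text{ bounded in } L^2((0,T) \times \Omega; \mathbb{R}^{4\times 4}).
\]
Thus a direct application of the celebrated Div-Curl lemma (see, for instance, [37]) gives rise to the relation
\[
\overline{\varrho s(\varrho, \vartheta) b(\vartheta)} = \overline{\varrho s(\varrho, \vartheta)}\ \overline{b(\vartheta)} \tag{8.7}
\]
for any $b$ as in (8.6). Moreover, seeing that the sequence $\{\varrho_\delta s(\varrho_\delta, \vartheta_\delta) \vartheta_\delta\}_{\delta>0}$ is bounded in $L^p((0,T) \times \Omega)$, we deduce from (8.7) a more concise statement
\[
\overline{\varrho s(\varrho, \vartheta) \vartheta} = \overline{\varrho s(\varrho, \vartheta)}\, \vartheta. \tag{8.8}
\]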
8.2. Parametrized measures, monotony, and pointwise convergence of the temperature. Our goal is to show that (8.8) necessarily implies strong (pointwise) convergence of the sequence $\{\vartheta_\delta\}_{\delta>0}$. First of all, let us remark that the approximate solutions solve the renormalized continuity equation
\[
\partial_t b(\varrho_\delta) + \operatorname{div}_x(b(\varrho_\delta) u_\delta) + \big( b'(\varrho_\delta) \varrho_\delta - b(\varrho_\delta) \big) \operatorname{div}_x u_\delta = 0 \text{ in } \mathcal{D}'((0,T) \times \mathbb{R}^3), \tag{8.9}
\]
provided $\varrho_\delta$, $u_\delta$ are extended to be zero outside $\Omega$, and for any continuously differentiable function $b$ whose derivative vanishes for large arguments. The functions $\varrho_\delta$ being square-integrable because of the presence of the artificial pressure in the energy equality, Eq. (8.9) follows from (5.23) via the regularization technique of DiPerna and P.-L. Lions [11].
Now it follows from (8.9) that p
b(δ ) → b() in C([0, T ]; L weak ()) for any finite p > 1 and bounded b. (8.10) Relation (8.10) combined with (8.6) yields g()h(ϑ) = g() h(ϑ), or, in terms of the corresponding parametrized measures t,x (, ϑ) = t,x () ⊗ t,x (ϑ)
(8.11)
(cf. Sect. 3.3). Relation (8.11) says that oscillations (if any) in the sequences $\{\varrho_\delta\}_{\delta>0}$ and $\{\vartheta_\delta\}_{\delta>0}$ are "orthogonal", that is, the parametrized measure associated to $\{\varrho_\delta, \vartheta_\delta\}_{\delta>0}$ can be written as a tensor product of the parametrized measures generated by $\{\varrho_\delta\}_{\delta>0}$ and $\{\vartheta_\delta\}_{\delta>0}$. Consider a function $H = H(t,x,r,z)$ defined for $t \in (0,T)$, $x \in \Omega$, and $(r,z) \in \mathbb{R}^2$ through the formula
\[
H(t,x,r,z) = r \big( s(r,z) - s(r, \vartheta(t,x)) \big) \big( z - \vartheta(t,x) \big).
\]
Clearly, $H$ is a Carathéodory function; more specifically, $H(t,x,\cdot,\cdot)$ is continuous for a.a. $(t,x) \in (0,T) \times \Omega$, and $H(\cdot,\cdot,r,z)$ is measurable for any $(r,z) \in \mathbb{R}^2$. Moreover, as both $s_R$ and $s_F$ are increasing functions of the absolute temperature, we have
\[
H(t,x,r,z) \ge \frac{4}{3} a \big( z^3 - \vartheta^3(t,x) \big) \big( z - \vartheta(t,x) \big) \ge 0. \tag{8.12}
\]
At this stage, we use a crucial observation proved rigorously in Theorem 6.2 in [32], namely that weak limits of Carathéodory functions can be characterized in terms of the associated parametrized measure. Accordingly, we obtain
\[
\lim_{\delta \to 0} \int_0^T \int_\Omega \varphi(t,x) H(t,x,\varrho_\delta,\vartheta_\delta)\, dx\, dt
\]
=
T
T
T
0
= 0
ϕ(t, x)
− 0
=
T
T
0
= 0
ϕ(t, x)
R2
ϕ(t, x)
ϕ(t, x)
R2
H (t, x, , ϑ) dt,x (, ϑ) dx dt s(, ϑ)(ϑ − ϑ(t, x))dt,x (, ϑ) dx dt
R2
R2
s(, ϑ(t, x))(ϑ − ϑ(t, x))dt,x (, ϑ) dx dt
s(, ϑ)(ϑ − ϑ(t, x))dt,x (, ϑ) dx dt
ϕ s(, ϑ)ϑ − s(, ϑ)ϑ dx dt = 0 for any ϕ ∈ D((0, T ) × ,
where we have used (8.8) to get the last equality together with (8.11) in order to observe that s(, ϑ(t, x))(ϑ − ϑ(t, x)) dt,x (, ϑ) R2 = s(, ϑ(t, x)) dt,x () (ϑ − ϑ(t, x)) dt,x (ϑ) = 0. R
R
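In particular, we deduce from (8.12) that $\overline{\vartheta^3 \vartheta} = \overline{\vartheta^3}\, \vartheta$, which is equivalent to the desired result
\[
\vartheta_\delta \to \vartheta \text{ in } L^4((0,T) \times \Omega). \tag{8.13}
\]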
9. Pointwise Convergence of Densities
9.1. The effective viscous pressure. The problem of pointwise (strong) convergence of the sequence $\{\varrho_\delta\}_{\delta>0}$ represents one of the most delicate points of the present theory. Let us start with the celebrated and nowadays well-established result of P.-L. Lions [27] on the effective viscous pressure. In the present setting, it can be concisely stated in terms of the parametrized measures as follows:
\[
\psi\, \overline{p(\varrho, \vartheta) b(\varrho)} - \psi\, \overline{p(\varrho, \vartheta)}\ \overline{b(\varrho)} = \overline{\mathcal{R} : [\psi S]\, b(\varrho)} - \overline{\mathcal{R} : [\psi S]}\ \overline{b(\varrho)} \tag{9.1}
\]
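for any $\psi \in \mathcal{D}(\Omega)$, and any bounded continuous function $b$, where $\mathcal{R} = \mathcal{R}_{i,j}$ is a pseudodifferential operator defined by means of the Fourier transform:
\[
\mathcal{R}_{i,j}[v] = \partial_{x_i} \Delta^{-1} \partial_{x_j} v = \mathcal{F}^{-1}_{\xi \to x} \Big[ \frac{\xi_i \xi_j}{|\xi|^2}\, \mathcal{F}_{x \to \xi}[v] \Big]. \tag{9.2}
\]
Note that (9.1) is independent of the specific form of the constitutive relations for $p$, $S$ and requires only satisfaction of the momentum equation (5.25) together with the renormalized continuity equation (8.9) for $\varrho_\delta$, $u_\delta$. For a detailed proof of (9.1), the reader may consult Chap. 6, Formula (6.17) in [18].
9.2. Commutator estimates. If the viscosity coefficients $\nu$ and $\eta$ were constant, we would have
\[
\mathcal{R} : [S] = \Big( \frac{4}{3} \nu + \eta \Big) \operatorname{div}_x u,
\]
and (9.1) would become immediately
\[
\overline{p(\varrho, \vartheta) b(\varrho)} - \overline{p(\varrho, \vartheta)}\ \overline{b(\varrho)} = \Big( \frac{4}{3} \nu + \eta \Big) \overline{\operatorname{div}_x u\, b(\varrho)} - \Big( \frac{4}{3} \nu + \eta \Big) \operatorname{div}_x u\ \overline{b(\varrho)}. \tag{9.3}
\]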
The quantity $p - (\frac{4}{3} \nu + \eta) \operatorname{div}_x u$ is usually called the effective viscous pressure, and (9.3) coincides with the original equation discovered in [27]. In order to establish the same relation for variable viscosity coefficients, we use a slightly modified version of the approach proposed in [19]. To this end, we shall write
\[
\mathcal{R} : [\psi S] = \psi \Big( \frac{4}{3} \nu + \eta \Big) \operatorname{div}_x u + \Big\{ \mathcal{R} : [\psi S] - \psi \Big( \frac{4}{3} \nu + \eta \Big) \operatorname{div}_x u \Big\},
\]
where the quantity in the curly brackets is a commutator fitting in the framework of the theory developed by Coifman and Meyer [7].
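In particular, by virtue of Lemma 4.2 in [19], the quantity
\[
\mathcal{R} : \Big[ \psi \big( \nu(\vartheta_\delta, B_\delta) \langle \nabla_x u_\delta \rangle + \eta(\vartheta_\delta, B_\delta) \operatorname{div}_x u_\delta\, I \big) \Big] - \psi \Big( \frac{4}{3} \nu(\vartheta_\delta, B_\delta) + \eta(\vartheta_\delta, B_\delta) \Big) \operatorname{div}_x u_\delta
\]
is bounded in the space $L^2(0,T; W^{\omega,p}(\Omega))$ for suitable $0 < \omega < 1$, $p > 1$ in terms of the bounds established in (6.16), (6.17), and (6.24) provided the viscosity coefficients are globally Lipschitz functions of their arguments. If this is the case, one can use (8.10) in order to deduce the desired relation
\[
\overline{\mathcal{R} : [\psi S]\, b(\varrho)} - \overline{\mathcal{R} : [\psi S]}\ \overline{b(\varrho)} = \psi\, \overline{\Big( \frac{4}{3} \nu + \eta \Big) \operatorname{div}_x u\, b(\varrho)} - \psi\, \overline{\Big( \frac{4}{3} \nu + \eta \Big) \operatorname{div}_x u}\ \overline{b(\varrho)}
\]
\[
= \psi \Big( \frac{4}{3} \nu + \eta \Big) \overline{\operatorname{div}_x u\, b(\varrho)} - \psi \Big( \frac{4}{3} \nu + \eta \Big) \operatorname{div}_x u\ \overline{b(\varrho)},
\]
where the last equality is a direct consequence of the pointwise convergence proved in (7.8), (8.13). If $\nu$ and $\eta$ are only continuously differentiable as required by the hypotheses of Theorem 3.1, one can write
\[
\nu(\vartheta, B) = Y(\vartheta, B) \nu(\vartheta, B) + (1 - Y(\vartheta, B)) \nu(\vartheta, B), \qquad \eta(\vartheta, B) = Y(\vartheta, B) \eta(\vartheta, B) + (1 - Y(\vartheta, B)) \eta(\vartheta, B), \tag{9.4}
\]
where $Y \in \mathcal{D}(\mathbb{R}^2)$ is a suitable function. Now we have
\[
\nu(\vartheta_\delta, B_\delta) \langle \nabla_x u_\delta \rangle + \eta(\vartheta_\delta, B_\delta) \operatorname{div}_x u_\delta\, I = Y(\vartheta_\delta, B_\delta) \nu(\vartheta_\delta, B_\delta) \langle \nabla_x u_\delta \rangle + Y(\vartheta_\delta, B_\delta) \eta(\vartheta_\delta, B_\delta) \operatorname{div}_x u_\delta\, I
\]
\[
+ \Big\{ (1 - Y(\vartheta_\delta, B_\delta)) \nu(\vartheta_\delta, B_\delta) \langle \nabla_x u_\delta \rangle + (1 - Y(\vartheta_\delta, B_\delta)) \eta(\vartheta_\delta, B_\delta) \operatorname{div}_x u_\delta\, I \Big\},
\]
where the expression in the curly brackets can be made arbitrarily small in the norm of $L^p((0,T) \times \Omega)$, with a certain $p > 1$, by a suitable choice of $Y$ (see estimates (6.5), (6.12), (6.17), and formula (6.25)). Thus relation (9.3) holds for any $\nu$ and $\eta$ satisfying the hypotheses of Theorem 3.1.
9.3. The oscillations defect measure. The most suitable tool for describing possible oscillations in the sequence $\{\varrho_\delta\}_{\delta>0}$ is the renormalized continuity equation (2.3) together with its counterpart resulting from letting $\delta \to 0$ in (8.9). Although we have already shown in Sect. 7.1 that the limit quantities $\varrho$, $u$ satisfy the continuity equation (2.1), the validity of its renormalization (2.3) is not obvious because the sequence $\{\varrho_\delta\}_{\delta>0}$ is not known to be uniformly square integrable and the machinery of [11] does not work.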
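In order to solve this problem, a concept of oscillations defect measure was introduced in [17]. To be more specific we set
\[
\operatorname{osc}_q[\varrho_\delta \to \varrho](Q) = \sup_{k \ge 1} \Big( \limsup_{\delta \to 0} \int_Q |T_k(\varrho_\delta) - T_k(\varrho)|^q\, dx\, dt \Big), \tag{9.5}
\]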
where $T_k$ are the cut-off functions, $T_k(z) = \operatorname{sgn}(z) \min\{|z|, k\}$. As shown in [17], the limit functions $\varrho$, $u$ solve the renormalized equation (2.3) provided
• $\varrho_\delta$, $u_\delta$ satisfy (8.9);
• $\{u_\delta\}_{\delta>0}$ is bounded in $L^2(0,T; W^{1,2}(\Omega))$;
• $\operatorname{osc}_q[\varrho_\delta \to \varrho]((0,T) \times \Omega) < \infty$ for a certain $q > 2$. (9.6)
Accordingly, in order to establish (2.3), it is enough to show (9.6).
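To make the cut-off functions and the quantity inside (9.5) more tangible, here is a small numerical sketch. It rests on several assumptions that are not part of the text: the integral over $Q$ is replaced by an average over a uniform grid on $[0,1)$, the limsup is replaced by a single fixed index, and the oscillating "densities" $1 + \sin(2\pi n x)$ with weak limit $1$ are a hypothetical example.

```python
import math


def cutoff(k, z):
    """T_k(z) = sgn(z) * min(|z|, k)."""
    return math.copysign(min(abs(z), k), z)


def oscillation_term(rho_delta, rho_limit, k, q):
    """Grid average of |T_k(rho_delta) - T_k(rho_limit)|^q (stand-in for the integral over Q)."""
    total = sum(abs(cutoff(k, a) - cutoff(k, b)) ** q
                for a, b in zip(rho_delta, rho_limit))
    return total / len(rho_delta)


# Hypothetical oscillating densities on [0, 1): rho_n(x) = 1 + sin(2*pi*n*x), weak limit 1.
m, n = 1000, 10
xs = [(j + 0.5) / m for j in range(m)]
rho_n = [1.0 + math.sin(2.0 * math.pi * n * x) for x in xs]
rho = [1.0] * m

osc_k1 = oscillation_term(rho_n, rho, k=1, q=2)
osc_k2 = oscillation_term(rho_n, rho, k=2, q=2)
```

In this toy example the values stay bounded as $k$ grows because the sample densities are bounded; the point of condition (9.6) is that such a bound is required uniformly in $k$ even when the densities themselves are not bounded.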
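9.4. Weighted estimates of the oscillations defect measure. In order to show (9.6), we use a new method based on weighted estimates, where the corresponding weight function depends on the absolute temperature. Taking $b = T_k$ in (9.3) we get
\[
\overline{p(\varrho, \vartheta) T_k(\varrho)} - \overline{p(\varrho, \vartheta)}\ \overline{T_k(\varrho)} = \Big( \frac{4}{3} \nu(\vartheta, B) + \eta(\vartheta, B) \Big) \overline{\operatorname{div}_x u\, T_k(\varrho)} - \Big( \frac{4}{3} \nu(\vartheta, B) + \eta(\vartheta, B) \Big) \operatorname{div}_x u\ \overline{T_k(\varrho)}. \tag{9.7}
\]
As observed in (4.8), there is a positive constant $p_c$ such that $p_F(\varrho, \vartheta) - p_c \varrho^{5/3}$ is a non-decreasing function of $\varrho$ for any $\vartheta$. Accordingly, we have
\[
\overline{p(\varrho, \vartheta) T_k(\varrho)} - \overline{p(\varrho, \vartheta)}\ \overline{T_k(\varrho)} \ge p_c \Big( \overline{\varrho^{5/3} T_k(\varrho)} - \overline{\varrho^{5/3}}\ \overline{T_k(\varrho)} \Big). \tag{9.8}
\]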
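Now, let us choose a weight function
\[
w \in C^1[0, \infty), \quad w(\vartheta) > 0 \text{ for } \vartheta \ge 0, \quad w(\vartheta) = \vartheta^{-\frac{1+\alpha}{2}} \text{ for } \vartheta \ge 1, \tag{9.9}
\]
where $\alpha$ is the exponent appearing in hypothesis (1.33). Multiplying (9.7) by $w(\vartheta)$ and using (9.8) we obtain
\[
w(\vartheta) \Big( \overline{\varrho^{5/3} T_k(\varrho)} - \overline{\varrho^{5/3}}\ \overline{T_k(\varrho)} \Big) \le \frac{w(\vartheta)}{p_c} \Big( \frac{4}{3} \nu(\vartheta, B) + \eta(\vartheta, B) \Big) \overline{\operatorname{div}_x u\, T_k(\varrho)} - \frac{w(\vartheta)}{p_c} \Big( \frac{4}{3} \nu(\vartheta, B) + \eta(\vartheta, B) \Big) \operatorname{div}_x u\ \overline{T_k(\varrho)}, \tag{9.10}
\]
where the right-hand side is a weak limit of
\[
\frac{w(\vartheta_\delta)}{p_c} \Big( \frac{4}{3} \nu(\vartheta_\delta, B_\delta) + \eta(\vartheta_\delta, B_\delta) \Big) \operatorname{div}_x u_\delta \big( T_k(\varrho_\delta) - T_k(\varrho) \big) + \frac{w(\vartheta)}{p_c} \Big( \frac{4}{3} \nu(\vartheta, B) + \eta(\vartheta, B) \Big) \operatorname{div}_x u \big( T_k(\varrho) - \overline{T_k(\varrho)} \big).
\]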
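Consequently, employing the uniform weighted estimates (6.12), together with hypothesis (1.13) and the growth restriction on $w$ specified in (9.9), we conclude
\[
\int_0^T \int_\Omega \Big[ \frac{w(\vartheta)}{p_c} \Big( \frac{4}{3} \nu(\vartheta, B) + \eta(\vartheta, B) \Big) \overline{\operatorname{div}_x u\, T_k(\varrho)} - \frac{w(\vartheta)}{p_c} \Big( \frac{4}{3} \nu(\vartheta, B) + \eta(\vartheta, B) \Big) \operatorname{div}_x u\ \overline{T_k(\varrho)} \Big]\, dx\, dt
\]
\[
\le c \liminf_{\delta \to 0} \|T_k(\varrho_\delta) - T_k(\varrho)\|_{L^2((0,T) \times \Omega)}, \text{ with } c \text{ independent of } k. \tag{9.11}
\]
On the other hand, it was shown in Chap. 6 in [18] that the left-hand side of (9.10) can be bounded below as
\[
\int_0^T \int_\Omega w(\vartheta) \Big( \overline{\varrho^{5/3} T_k(\varrho)} - \overline{\varrho^{5/3}}\ \overline{T_k(\varrho)} \Big)\, dx\, dt \ge \limsup_{\delta \to 0} \int_0^T \int_\Omega w(\vartheta) |T_k(\varrho_\delta) - T_k(\varrho)|^{8/3}\, dx\, dt; \tag{9.12}
\]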
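whence (9.10), together with (9.11), (9.12), give rise to the weighted estimate
\[
\limsup_{\delta \to 0} \int_0^T \int_\Omega w(\vartheta) |T_k(\varrho_\delta) - T_k(\varrho)|^{8/3}\, dx\, dt \le c \liminf_{\delta \to 0} \|T_k(\varrho_\delta) - T_k(\varrho)\|_{L^2((0,T) \times \Omega)}. \tag{9.13}
\]
Now, taking
\[
q > 2, \qquad \omega = \frac{3q}{8} \cdot \frac{1+\alpha}{2},
\]
we can use Hölder's inequality to obtain
\[
\int_0^T \int_\Omega |T_k(\varrho_\delta) - T_k(\varrho)|^q\, dx\, dt = \int_0^T \int_\Omega |T_k(\varrho_\delta) - T_k(\varrho)|^q (1 + \vartheta)^{-\omega} (1 + \vartheta)^{\omega}\, dx\, dt
\]
\[
\le c \Big( \int_0^T \int_\Omega |T_k(\varrho_\delta) - T_k(\varrho)|^{8/3} (1 + \vartheta)^{-\frac{1+\alpha}{2}}\, dx\, dt + \int_0^T \int_\Omega (1 + \vartheta)^{\frac{3q(1+\alpha)}{2(8-3q)}}\, dx\, dt \Big). \tag{9.14}
\]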
On the other hand, in accordance with (6.5), (6.17), $\vartheta \in L^\infty(0,T; L^4(\Omega)) \cap L^3(0,T; L^9(\Omega))$, and a simple interpolation argument yields $\vartheta \in L^r((0,T) \times \Omega)$, with $r = \frac{46}{9}$.
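Consequently, one can find $q > 2$ such that
\[
\frac{3q(1+\alpha)}{2(8-3q)} = r = \frac{46}{9}
\]
provided $\alpha$ complies with hypothesis (3.1). Thus relations (9.13), (9.14) give rise to the desired estimate
\[
\limsup_{\delta \to 0} \int_0^T \int_\Omega w(\vartheta) |T_k(\varrho_\delta) - T_k(\varrho)|^q\, dx\, dt \le c \Big( 1 + \liminf_{\delta \to 0} \|T_k(\varrho_\delta) - T_k(\varrho)\|_{L^2((0,T) \times \Omega)} \Big)
\]
yielding (9.6).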
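The exponent bookkeeping in this step is easy to verify with exact rational arithmetic. The sketch below is an illustration only: the helper names are hypothetical, the value $\alpha = 1$ is a sample choice (the precise content of hypothesis (3.1) is not restated in this section), and the threshold computed at the end is simply what the displayed identity implies if one asks for $q > 2$.

```python
from fractions import Fraction


def interpolated_exponent(p0, q0, p1, q1):
    """Diagonal exponent r with L^{p0}(L^{q0}) and L^{p1}(L^{q1}) interpolating to L^r(L^r).

    Time exponents p0, p1 and space exponents q0, q1; use None for infinity.
    """
    def inv(e):
        return Fraction(0) if e is None else 1 / Fraction(e)

    a0, b0, a1, b1 = inv(p0), inv(q0), inv(p1), inv(q1)
    # theta solves theta*a0 + (1-theta)*a1 == theta*b0 + (1-theta)*b1
    theta = (b1 - a1) / (a0 - a1 - b0 + b1)
    return 1 / (theta * a0 + (1 - theta) * a1)


def q_for(alpha, r):
    """Solve 3q(1+alpha) / (2(8-3q)) = r for q."""
    return 16 * r / (3 * (1 + alpha) + 6 * r)


r_interp = interpolated_exponent(None, 4, 3, 9)   # L^inf(L^4) and L^3(L^9)
r_paper = Fraction(46, 9)                          # exponent quoted in the text

alpha = Fraction(1)                                # sample value, an assumption
q = q_for(alpha, r_paper)
exponent = 3 * q * (1 + alpha) / (2 * (8 - 3 * q))

alpha_max = 2 * r_paper / 3 - 1                    # q_for(alpha, r) > 2 iff alpha < alpha_max
```

Under these assumptions the interpolation between the two spaces reaches at least the exponent quoted in the text (the domain being bounded, any smaller exponent is admissible as well), and a value $q$ strictly between $2$ and $\tfrac{8}{3}$ exists as long as $\alpha$ stays below `alpha_max`.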
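9.5. Propagation of oscillations and strong convergence. Having proved (9.6) we have the renormalized continuity equation (2.3) satisfied by the limit functions $\varrho$, $u$; in particular,
\[
\partial_t L_k(\varrho) + \operatorname{div}_x(L_k(\varrho) u) + T_k(\varrho) \operatorname{div}_x u = 0 \text{ in } \mathcal{D}'((0,T) \times \mathbb{R}^3), \tag{9.15}
\]
provided $\varrho$, $u$ have been extended to be zero outside $\Omega$, where
\[
L_k(\varrho) = \varrho \int_1^{\varrho} \frac{T_k(z)}{z^2}\, dz.
\]
On the other hand, one can let $\delta \to 0$ in (8.9) to obtain
\[
\partial_t \overline{L_k(\varrho)} + \operatorname{div}_x(\overline{L_k(\varrho)}\, u) + \overline{T_k(\varrho) \operatorname{div}_x u} = 0 \text{ in } \mathcal{D}'((0,T) \times \mathbb{R}^3). \tag{9.16}
\]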
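The link between $L_k$ and $T_k$ in (9.15) is the renormalization identity $b'(z) z - b(z) = T_k(z)$ for $b = L_k$ (compare the general form (8.9)). The following sketch checks it numerically, assuming $L_k(\varrho) = \varrho \int_1^{\varrho} T_k(z)/z^2\, dz$ with $k \ge 1$ and positive arguments; the closed form and the function names are worked out here for illustration and are not taken from the text.

```python
import math


def T(k, z):
    """Cut-off T_k(z) = min(z, k) for z >= 0."""
    return min(z, k)


def L(k, z):
    """Closed form of L_k(z) = z * integral_1^z T_k(s)/s^2 ds for z > 0 and k >= 1."""
    if z <= k:
        return z * math.log(z)                 # integrand is 1/s on this range
    return z * (math.log(k) + 1.0) - k         # integrand is k/s^2 beyond s = k


def renormalization_defect(k, z, h=1e-6):
    """Return L_k'(z) * z - L_k(z) - T_k(z), using a central difference for L_k'."""
    dL = (L(k, z + h) - L(k, z - h)) / (2.0 * h)
    return dL * z - L(k, z) - T(k, z)


# Sample points away from the kink at z = k (values chosen only for illustration).
defects = [renormalization_defect(3, z) for z in (0.5, 2.0, 7.0)]
```

Below the cut-off level $L_k(z) = z \log z$, which is consistent with the limit relation between $\overline{\varrho \log(\varrho)}$ and $\varrho \log(\varrho)$ obtained by letting $k \to \infty$.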
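Now, following step by step Chap. 6 in [18] we take the difference of (9.15), (9.16) and use (9.7), (9.8) to deduce
\[
\int_\Omega \big( \overline{L_k(\varrho)} - L_k(\varrho) \big)(\tau)\, dx \le \int_0^\tau \int_\Omega \operatorname{div}_x u\, \big( T_k(\varrho) - \overline{T_k(\varrho)} \big)\, dx\, dt \quad \text{for any } \tau \in [0,T],
\]
where the right-hand side vanishes for $k \to \infty$ due to (9.6). Consequently, $\overline{\varrho \log(\varrho)} = \varrho \log(\varrho)$ in $(0,T) \times \Omega$, a relation equivalent to strong convergence of $\{\varrho_\delta\}_{\delta>0}$, that is,
\[
\varrho_\delta \to \varrho \text{ in } L^1((0,T) \times \Omega). \tag{9.17}
\]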
10. Conclusion
Our ultimate goal is to complete the proof of Theorem 3.1. Note that we have already shown that $\varrho$, $u$ satisfy the continuity equation (2.1) as well as its renormalized version (2.3). Moreover, it is easy to see that the "averaged" momentum equation (7.9) coincides in fact with (2.4) in view of the strong convergence results established in (7.8), (8.13), and (9.17). By the same token, one can pass to the limit in the energy equality (5.29) in order to obtain (2.12). Note that the regularizing $\delta$-dependent terms on the left-hand side disappear by virtue of the estimates (6.4), (6.5), and (6.28).
Similarly, one can handle Maxwell's equation (5.28). Here, the only thing to observe is that the terms $\lambda(\varrho_\delta, \vartheta_\delta, B_\delta) \operatorname{curl}_x B_\delta$ are bounded in $L^p((0,T) \times \Omega)$, for a certain $p > 1$, uniformly with respect to $\delta$. Indeed such a bound can be obtained exactly as in (6.25).
To conclude, we have to deal with the entropy inequality (5.27). First of all, the extra terms on the left-hand side tend to zero because of (6.4), (6.5), (6.22), and (8.4). Furthermore, it is standard to pass to the limit in the production rate keeping the correct sense of the inequality as all terms are convex with respect to the spatial gradients of $u$, $\vartheta$, and $B$. Finally, the limit in the "logarithmic" terms can be carried over thanks to estimate (6.22) and the following result (see Lemma 5.4 in [13]):
Lemma 10.1. Let $\Omega \subset \mathbb{R}^N$ be a bounded Lipschitz domain. Assume that $\vartheta_\delta \to \vartheta$ in $L^2((0,T) \times \Omega)$ and $\log(\vartheta_\delta) \to \overline{\log(\vartheta)}$ weakly in $L^2((0,T) \times \Omega)$. Then $\vartheta$ is positive a.a. on $(0,T) \times \Omega$ and $\overline{\log(\vartheta)} = \log(\vartheta)$.
References
1. Ball, J.M.: A version of the fundamental theorem for Young measures. In Lect. Notes in Physics 344, Berlin-Heidelberg-New York: Springer-Verlag, 1989, pp. 207–215
2. Becker, E.: Gasdynamik. Stuttgart: Teubner-Verlag, 1966
3. Besse, C., Claudel, J., Degond, P., Deluzet, F., Gallice, G., Tesserias, C.: A model hierarchy for ionospheric plasma modeling. Math. Models Meth. Appl. Sci. 14, 393–415 (2004)
4. Bogovskii, M. E.: Solution of some vector analysis problems connected with operators div and grad (in Russian). Trudy Sem. S.L. Sobolev 80(1), 5–40 (1980)
5. Buet, C., Després, B.: Asymptotic analysis of fluid models for the coupling of radiation and hydrodynamics. J. Quant. Spect. and Rad. Trans. 85(3–4), 385–418 (2004)
6. Chapman, S., Cowling, T. G.: Mathematical theory of non-uniform gases. Cambridge: Cambridge Univ. Press, 1990
7. Coifman, R., Meyer, Y.: On commutators of singular integrals and bilinear singular integrals. Trans. Amer. Math. Soc. 212, 315–331 (1975)
8. Cox, J.P., Giuli, R.T.: Principles of stellar structure, I.,II. New York: Gordon and Breach, 1968
9. Davidson, P. A.: Turbulence: An introduction for scientists and engineers. Oxford: Oxford University Press, 2004
10. Degond, P., Lemou, M.: On the viscosity and thermal conduction of fluids with multivalued internal energy. Euro. J. Mech. B-Fluids 20, 303–327 (2001)
11. DiPerna, R.J., Lions, P.-L.: Ordinary differential equations, transport theory and Sobolev spaces. Invent. Math. 98, 511–547 (1989)
12. Duchon, J., Robert, R.: Inertial energy dissipation for weak solutions of incompressible Euler and Navier-Stokes equations. Nonlinearity 13, 249–255 (2000)
13. Ducomet, B., Feireisl, E.: On the dynamics of gaseous stars. Arch. Rational Mech. Anal. 174, 221–266 (2004)
14. Duvaut, G., Lions, J.-L.: Inequalities in mechanics and physics. Heidelberg: Springer-Verlag, 1976
15. Eliezer, S., Ghatak, A., Hora, H.: An introduction to equations of states, theory and applications. Cambridge: Cambridge University Press, 1986
16. Eyink, G. L.: Local 4/5 law and energy dissipation anomaly in turbulence. Nonlinearity 16, 137–145 (2003)
17. Feireisl, E.: On compactness of solutions to the compressible isentropic Navier-Stokes equations when the density is not square integrable. Comment. Math. Univ. Carolinae 42(1), 83–98 (2001)
18. Feireisl, E.: Dynamics of viscous compressible fluids. Oxford: Oxford University Press, 2003
19. Feireisl, E.: On the motion of a viscous, compressible, and heat conducting fluid. Indiana Univ. Math. J. 53, 1707–1740 (2004)
20. Feireisl, E.: Stability of flows of real monoatomic gases. Commun. Partial Differ. Eqs. 31, 325–348 (2006)
21. Feireisl, E., Petzeltová, H.: On integrability up to the boundary of the weak solutions of the Navier-Stokes equations of compressible flow. Commun. Partial Differ. Eqs. 25(3–4), 755–767 (2000)
22. Galdi, G. P.: An introduction to the mathematical theory of the Navier-Stokes equations, I. New York: Springer-Verlag, 1994
23. Gallavotti, G.: Statistical mechanics: A short treatise. Heidelberg: Springer-Verlag, 1999
24. Giovangigli, V.: Multicomponent flow modeling. Basel: Birkhäuser, 1999
25. Hoff, D., Jenssen, H. K.: Symmetric nonbarotropic flows with large data and forces. Arch. Rational Mech. Anal. 173, 297–343 (2004)
26. Jeffrey, A.N., Taniuti, T.: Non-linear wave propagation. New York: Academic Press, 1964
27. Lions, P.-L.: Mathematical topics in fluid dynamics, Vol.2, Compressible models. Oxford: Oxford Science Publication, 1998
28. Lions, P.-L.: Bornes sur la densité pour les équations de Navier-Stokes compressible isentropiques avec conditions aux limites de Dirichlet. C.R. Acad. Sci. Paris, Sér I. 328, 659–662 (1999)
29. Mihalas, B., Weibel-Mihalas, B.: Foundations of radiation hydrodynamics. Dover: Dover Publications 1984
30. Müller, I., Ruggeri, T.: Rational extended thermodynamics. Springer Tracts in Natural Philosophy 37, Heidelberg: Springer-Verlag, 1998
31. Novotný, A., Straškraba, I.: Introduction to the theory of compressible flow. Oxford: Oxford University Press, 2004
32. Pedregal, P.: Parametrized measures and variational principles. Basel: Birkhäuser, 1997
33. Rajagopal, K. R., Srinivasa, A. R.: On thermodynamical restrictions of continua. Proc. Royal Soc. London A 460, 631–651 (2004)
34. Ruggeri, T., Trovato, M.: Hyperbolicity in extended thermodynamics of Fermi and Bose gases. Continuum Mech. Thermodyn. 16(6), 551–576 (2004)
35. Serre, D.: Variation de grande amplitude pour la densité d'un fluide visqueux compressible. Physica D 48, 113–128 (1991)
36. Shore, S. N.: An introduction to astrophysical hydrodynamics. New York: Academic Press, 1992
37. Tartar, L.: Compensated compactness and applications to partial differential equations. Nonlinear Anal. and Mech., Heriot-Watt Sympos., L.J. Knopps editor, Research Notes in Math. 39, Boston: Pitman, 1975, pp. 136–211
38. Temam, R.: Navier-Stokes equations. Amsterdam: North-Holland, 1977
39. Vaigant, V. A., Kazhikhov, A. V.: On the existence of global solutions to two-dimensional Navier-Stokes equations of a compressible viscous fluid (in Russian). Sibirskij Mat. Z. 36(6), 1283–1316 (1995)
40. Zahn, J.P., Zinn-Justin, J.: Astrophysical fluid dynamics, Les Houches, XLVII. Amsterdam: Elsevier, 1993
41. Zirin, H.: Astrophysics of the sun. Cambridge: Cambridge University Press, 1988
Communicated by P. Constantin
Commun. Math. Phys. 266, 631–645 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0058-5
Communications in
Mathematical Physics
A Stochastic Perturbation of Inviscid Flows
Gautam Iyer
Department of Mathematics, University of Chicago, Chicago, Illinois 60637, USA. E-mail: [email protected]
Received: 4 May 2005 / Accepted: 12 February 2006
Published online: 18 July 2006 – © Springer-Verlag 2006
Abstract: We prove existence and regularity of the stochastic flows used in the stochastic Lagrangian formulation of the incompressible Navier-Stokes equations (with periodic boundary conditions), and consequently obtain a $C^{k,\alpha}$ local existence result for the Navier-Stokes equations. Our estimates are independent of viscosity, allowing us to consider the inviscid limit. We show that as $\nu \to 0$, solutions of the stochastic Lagrangian formulation (with periodic boundary conditions) converge to solutions of the Euler equations at the rate of $O(\sqrt{\nu t})$.
1. Introduction
Consider an incompressible inviscid fluid with velocity field $u$ in the absence of external forcing. The evolution of the velocity field is governed by the Euler [3] equations
\[
\partial_t u + (u \cdot \nabla) u + \nabla p = 0, \tag{1.1}
\]
\[
\nabla \cdot u = 0. \tag{1.2}
\]
Viscosity introduces a diffusive term in the Euler equations and Eq. (1.1) becomes
\[
\partial_t u + (u \cdot \nabla) u - \nu \Delta u + \nabla p = 0. \tag{1.3}
\]
The Kolmogorov backward and Feynman-Kac formulae [11] show that any linear, diffusive, second order PDE can be obtained by averaging out a stochastic perturbation of an ODE. The theory for non-linear PDE’s is not as well developed. We are interested in interpreting the Navier-Stokes equations as the average of a suitable stochastic perturbation of the Euler equations. Many interesting non-linear PDE’s have been interpreted as averaging of stochastic processes, the simplest example being the Kolmogorov reaction diffusion equation [14]. In two dimensions the same is possible for the Navier-Stokes equations as the vorticity satisfies a standard Fokker-Plank equation. This combined with the Biot-Savart law
632
G. Iyer
led to the random vortex method [16] and has been used and studied extensively. In three dimensions the problem is a little harder, as the vorticity equation is no longer of Fokker-Planck type, and the non-linearity causes trouble. In [15] Le Jan and Sznitman used a backward-in-time branching process in Fourier space to express the Navier-Stokes equations as the expected value of a stochastic process. This approach led to a new existence theorem; it was later generalized [1], and physical space analogues were developed. An approach more along the lines of this paper was developed by Busnello, Flandoli and Romito [2], who considered ‘noisy’ flow paths and used Girsanov transformations to recover the velocity field. They obtained the 3-dimensional Navier-Stokes equations in this form, and generalized their method to work for a general class of second order parabolic equations. A different technique was used by Gomes [9] to express the diffusive Lagrangian [5] as the expected value minimizer of a suitable functional. Finally we mention that similar systems have been considered by Jourdain et al. in [10].
Our approach$^1$ is to introduce a Brownian drift into the active vector formulation [4] of the Euler equations. Peter Constantin and the author showed [7] that this provides a physically meaningful, explicit stochastic representation of the Navier-Stokes equations. While long time dynamics of the system we consider are presently unknown, we hope that techniques used here will lead to control of the growth of certain quantities with non-zero probability. For example, we would like to find an exponential bound for $\nabla X$ which holds with non-zero probability. Finding an almost sure bound of this form will lead to global existence.
In this paper, we consider the flow given by the stochastic differential equation
$$dX = u\,dt + \sqrt{2\nu}\,dB \tag{1.4}$$
with initial data
$$X(a, 0) = a. \tag{1.5}$$
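The flow (1.4)–(1.5) can be simulated for a prescribed drift. The following is a minimal sketch, assuming the velocity field $u$ is given in advance (in the paper $u$ is itself recovered from $X$, which this sketch does not attempt) and using the standard Euler-Maruyama time stepping; the function names and the zero-drift sanity check are illustrative choices, not part of the paper.

```python
import math
import random

def euler_maruyama(u, a, nu, T, n_steps, rng):
    """Integrate dX = u(X, t) dt + sqrt(2 nu) dB from X(0) = a (Euler-Maruyama)."""
    dt = T / n_steps
    x = list(a)
    for step in range(n_steps):
        t = step * dt
        drift = u(x, t)
        # Each coordinate receives an independent Brownian increment ~ N(0, dt),
        # scaled by sqrt(2 nu) as in Eq. (1.4).
        x = [xi + di * dt + math.sqrt(2.0 * nu * dt) * rng.gauss(0.0, 1.0)
             for xi, di in zip(x, drift)]
    return x

def zero_drift(x, t):
    """Hypothetical test drift u = 0: X(t) - a is then sqrt(2 nu) B_t."""
    return [0.0, 0.0, 0.0]

def sample_variance_first_coordinate(nu, T, n_paths=2000, n_steps=20, seed=0):
    """Sample variance of the first coordinate of X(T) for zero drift."""
    rng = random.Random(seed)
    a = [0.0, 0.0, 0.0]
    xs = [euler_maruyama(zero_drift, a, nu, T, n_steps, rng)[0]
          for _ in range(n_paths)]
    mean = sum(xs) / n_paths
    return sum((x - mean) ** 2 for x in xs) / (n_paths - 1)
```

With zero drift each coordinate of $X(T) - a$ has variance $2\nu T$, which gives a quick consistency check on the noise scaling; with $\nu = 0$ the scheme reduces to the forward Euler method for the deterministic flow.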
Here $\nu > 0$ represents the viscosity, and $B$ represents a 3-dimensional Wiener process (we use the letter $B$ to avoid confusion with the Weber operator). We recover the velocity field from $X$ by
$$A = X^{-1}, \tag{1.6}$$
$$u = \mathbf{E}\,\mathbf{P}\left[(\nabla^t A)\,(u_0 \circ A)\right], \tag{1.7}$$
where $\mathbf{E}$ denotes the expected value with respect to the Wiener measure, $\mathbf{P}$ denotes the Leray-Hodge projection [3] on divergence free vector fields, and $u_0$ is the deterministic initial data. We clarify that by $X^{-1}$ in Eq. (1.6) we mean the spatial inverse of $X$. We impose periodic boundary conditions, though all theorems proved here will also work if we work with the domain $\mathbb{R}^3$ and impose a decay at infinity condition instead. The motivation for considering the above system arises from the fact that in the absence of viscosity, the system (1.4)–(1.7) reduces to
$$\partial_t A + (u \cdot \nabla)A = 0, \tag{1.8}$$
$$A(x, 0) = x, \tag{1.9}$$
$$u = \mathbf{P}\left[(\nabla^t A)\,(u_0 \circ A)\right]. \tag{1.10}$$
$^1$ In the original version of this paper, our intention was to propose this as a physically meaningful model for the Navier-Stokes equations. We presented a proof that the solution of the system considered here differs from the solution of the Navier-Stokes equations by $O(t^{3/2})$. Six months after submission of the original version of this paper, Peter Constantin and the author [7] discovered that the equations considered here are exactly equivalent to the Navier-Stokes equations.
Peter Constantin proved [4] that $u$ is a solution of the (deterministic) system (1.8)–(1.10) if and only if $u$ is a solution of the incompressible Euler equations (1.1)–(1.2) with initial data $u_0$. Thus the system (1.4)–(1.7) can be thought of as superimposing the Wiener process on the flow map, intuitively representing Brownian motion of fluid particles. Physically, the Brownian particle interaction is regarded as the source of viscosity, and the equivalence of (1.4)–(1.7) and the Navier-Stokes equations proved in [7] confirms this. We remark that Eq. (1.7) provides an explicit formula for $u$ in terms of the map $X$.
In this paper, we provide a self-contained proof of a $C^{k,\alpha}$ local existence theorem for the stochastic system (1.4)–(1.7). The proof in [7] showing equivalence between Navier-Stokes and (1.4)–(1.7) relies crucially on spatial regularity of solutions as stated in Theorem 2.1. We remark that the stochastic representation of Busnello, Flandoli and Romito does not admit a self-contained existence proof as we have here. The estimates and the existence time can be chosen independent of the viscosity, thus enabling us to consider the vanishing viscosity limit. We show that as $\nu \to 0$, the solution of (1.4)–(1.7) converges to the solution of the Euler equations at the rate of $O(\sqrt{\nu t})$. We remark that the limit $\nu \to 0$ is not well understood in bounded domains using classical methods. We hope that this stochastic formulation (when extended to bounded domains) will give us a better handle on computing this limit.
In the next section, we establish our notational convention, and describe precisely the results we prove in this paper. In Sect. 3 we prove bounds on the Weber operator, which are essential to all proofs presented in this paper. In Sect. 4 we prove local existence for (1.4)–(1.7) and the vanishing viscosity limit. Finally, in Sect.
5, we digress and present an alternate proof of local existence for the Navier-Stokes equations using the diffusive Lagrangian formulation [5].
2. Notational Convention and Description of Results
In this section we describe the main results we prove. We begin by establishing our notational convention. We let $\mathcal{I}$ denote the cube $[0, L]^3$ with side of length $L$. We define the Hölder norms and semi-norms on $\mathcal{I}$ by
$$|u|_\alpha = \sup_{x,y\in\mathcal{I}} L^\alpha \frac{|u(x) - u(y)|}{|x - y|^\alpha},$$
$$\|u\|_{C^k} = \sum_{|m|\le k} L^{|m|} \sup_{\mathcal{I}} |D^m u|,$$
$$\|u\|_{k,\alpha} = \|u\|_{C^k} + \sum_{|m|=k} L^k\, |D^m u|_\alpha,$$
where $D^m$ denotes the derivative with respect to the multi-index $m$. We let $C^k$ denote the space of all $k$-times continuously differentiable spatially periodic functions on $\mathcal{I}$, and $C^{k,\alpha}$ denote the space of all spatially periodic $k + \alpha$ Hölder continuous functions. The spaces $C^k$ and $C^{k,\alpha}$ are endowed with the norms $\|\cdot\|_{C^k}$ and $\|\cdot\|_{k,\alpha}$ respectively. We use $I$ to denote the identity function on $\mathbb{R}^3$ or $\mathcal{I}$ (depending on the context), and use $\mathbb{I}$ to denote the identity matrix.
The first theorem we prove addresses local (in time) existence for the system (1.4)–(1.7):
Theorem 2.1. Let $k \ge 1$ and $u_0 \in C^{k+1,\alpha}$ be divergence free. There exists a time $T = T(k, \alpha, L, \|u_0\|_{k+1,\alpha})$, but independent of viscosity, and a pair of functions $\lambda$,
$u \in C([0, T], C^{k+1,\alpha})$ such that $u$ and $X = I + \lambda$ satisfy the system (1.4)–(1.7). Further, there exists $U = U(k, \alpha, L, \|u_0\|_{k+1,\alpha})$ such that $t \in [0, T] \implies \|u(t)\|_{k+1,\alpha} \le U$.
We prove this theorem in Sect. 4. Our proof will also give a local existence result for the Euler equations, or any stochastic perturbation similar to the one considered here. We remark that the estimates required for this theorem, along with Constantin’s diffusive Lagrangian formulation [5], also give us local existence for the Navier-Stokes equations. In Sect. 5, we digress and present this proof. We remark that Theorem 2.1 is still true when $k = 0$. The only modification we need to make to our proof is to the inequalities in Lemma 4.1, which we do not carry out here. Since our estimates and local existence time are independent of viscosity, we can address the question of convergence in the limit $\nu \to 0$.
Proposition 2.1. Let $u_0 \in C^{k+1,\alpha}$ be divergence free, and $U$, $T$ be as in Theorem 2.1. For each $\nu > 0$ we let $u_\nu$ be the solution of the system (1.4)–(1.7) on the time interval $[0, T]$. Making $T$ smaller if necessary, let $u$ be the solution to the Euler equations (1.1)–(1.2) with initial data $u_0$ defined on the time interval $[0, T]$. Then there exists a constant $c = c(k, \alpha, U, L)$ such that for all $t \in [0, T]$ we have
$$\|u(t) - u_\nu(t)\|_{k,\alpha} \le \frac{cU}{L}\sqrt{\nu t}.$$
At present we are unable to extend the above proposition to domains with boundaries. In this case, possible detachment of the boundary layer creates analytical obstructions to understanding the inviscid limit. We present a proof of Proposition 2.1 at the end of Sect. 4, and are presently working on extending it to work for domains with boundaries.
3. The Weber Operator and Bounds
In this section we define and obtain estimates for the Weber operator, which will be central to all subsequent results.
Definition 3.1. We define the Weber operator $\mathbf{W} : C^{k,\alpha} \times C^{k+1,\alpha} \to C^{k,\alpha}$ by
$$\mathbf{W}(v, \ell) = \mathbf{P}\left[(\mathbb{I} + \nabla^t \ell)\, v\right],$$
where $\mathbf{P}$ is the Leray-Hodge projection [3] onto divergence free vector fields.
Remark 3.1. The range of $\mathbf{W}$ is $C^{k,\alpha}$ because multiplication by a $C^{k,\alpha}$ function is a bounded operation on $C^{k,\alpha}$, and $\mathbf{P}$ is a classical Calderon-Zygmund singular integral operator [17] which is bounded on Hölder spaces.
Remark. In the whole space, or with periodic boundary conditions, the Leray-Hodge projection commutes with derivatives. This is not true for arbitrary domains [6].
Formally it seems that $\mathbf{W}(v, \ell)$ should have one less derivative than $\ell$. However we prove below that $\mathbf{W}(v, \ell)$ has as many derivatives as $\ell$. The reason is that when we differentiate $\mathbf{W}(v, \ell)$, we can use ‘integration by parts’ to express the right hand side only in terms of first order derivatives.
Lemma 3.1 (Integration by parts). If $u, v \in C^{1,\alpha}$ then
$$\mathbf{P}\left[(\nabla^t u)\, v\right] = -\mathbf{P}\left[(\nabla^t v)\, u\right].$$
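Proof. This follows immediately from the identity
$$(\nabla^t u)\, v + (\nabla^t v)\, u = \nabla(u \cdot v)$$
and the fact that $\mathbf{P}$ vanishes on gradients.
Corollary 3.1. If $k \ge 1$ and $v, \ell \in C^{k,\alpha}$ then $\mathbf{W}(v, \ell) \in C^{k,\alpha}$ and
$$\|\mathbf{W}(v, \ell)\|_{k,\alpha} \le c\left(1 + \|\nabla\ell\|_{k-1,\alpha}\right)\|v\|_{k,\alpha}.$$
Proof. Notice first that $\mathbf{W}(v, \ell) \in C^{k-1,\alpha}$ by Remark 3.1. Now
$$\partial_i \mathbf{W}(v, \ell) = \mathbf{P}\left[(\nabla^t \partial_i \ell)\, v + (\mathbb{I} + \nabla^t \ell)\,\partial_i v\right] = \mathbf{P}\left[-(\nabla^t v)\,\partial_i \ell + (\mathbb{I} + \nabla^t \ell)\,\partial_i v\right].$$
Now the right hand side has only first order derivatives of $\ell$ and $v$, hence $\nabla \mathbf{W}(v, \ell) \in C^{k-1,\alpha}$ and the corollary follows.
Proposition 3.1. If $k \ge 1$ and $\ell_1, \ell_2 \in C^{k,\alpha}$ and $v_1, v_2 \in C^{k,\alpha}$ are such that
$$\|\nabla \ell_i\|_{k-1,\alpha} \le d \quad\text{and}\quad \|v_i\|_{k,\alpha} \le U$$
for $i = 1, 2$, then there exists $c = c(k, d, \alpha)$ such that
$$\|\mathbf{W}(v_1, \ell_1) - \mathbf{W}(v_2, \ell_2)\|_{k,\alpha} \le c\left[\frac{U}{L}\|\ell_1 - \ell_2\|_{k,\alpha} + \|v_1 - v_2\|_{k,\alpha}\right]. \tag{3.1}$$
If $k = 0$, the inequality (3.1) still holds provided we assume
$$\|\nabla \ell_i\|_{\alpha} \le d \quad\text{and}\quad \|v_i\|_{1,\alpha} \le U$$
for $i = 1, 2$.
Proof of Proposition 3.1. The main idea in the proof is to use ‘integration by parts’ to avoid the loss of derivative. By definition of $\mathbf{W}$ we have
$$\begin{aligned}
\mathbf{W}(v_1, \ell_1) - \mathbf{W}(v_2, \ell_2) &= \mathbf{P}\left[(\mathbb{I} + \nabla^t \ell_1)\, v_1 - (\mathbb{I} + \nabla^t \ell_2)\, v_2\right] \\
&= \mathbf{P}\left[(\mathbb{I} + \nabla^t \ell_1)(v_1 - v_2) + \left(\nabla^t(\ell_1 - \ell_2)\right) v_2\right] \\
&= \mathbf{P}\left[(\mathbb{I} + \nabla^t \ell_1)(v_1 - v_2) - (\nabla^t v_2)(\ell_1 - \ell_2)\right].
\end{aligned} \tag{3.2}$$
Further, differentiating we have
$$\begin{aligned}
\partial_i\left[\mathbf{W}(v_1, \ell_1) - \mathbf{W}(v_2, \ell_2)\right] &= \partial_i \mathbf{P}\left[(\mathbb{I} + \nabla^t \ell_1)(v_1 - v_2) - (\nabla^t v_2)(\ell_1 - \ell_2)\right] \\
&= \mathbf{P}\left[(\nabla^t \partial_i \ell_1)(v_1 - v_2) + (\mathbb{I} + \nabla^t \ell_1)\,\partial_i(v_1 - v_2) - (\nabla^t \partial_i v_2)(\ell_1 - \ell_2) - (\nabla^t v_2)\,\partial_i(\ell_1 - \ell_2)\right] \\
&= \mathbf{P}\left[-\left(\nabla^t(v_1 - v_2)\right)\partial_i \ell_1 + (\mathbb{I} + \nabla^t \ell_1)\,\partial_i(v_1 - v_2) + \left(\nabla^t(\ell_1 - \ell_2)\right)\partial_i v_2 - (\nabla^t v_2)\,\partial_i(\ell_1 - \ell_2)\right].
\end{aligned} \tag{3.3}$$
Note that we used Lemma 3.1 to ensure that the right hand sides of (3.2) and (3.3) have only first order derivatives of $\ell$ and $v$. Thus taking the $C^{k-1,\alpha}$ norms of Eqs. (3.2) and (3.3), and using the fact that multiplication by a $C^{k,\alpha}$ function and $\mathbf{P}$ are bounded on $C^{k,\alpha}$, the proposition follows.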
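The identity behind Lemma 3.1 can be checked in components. The sketch below assumes the index convention $(\nabla^t u)_{ij} = \partial_i u_j$ with summation over repeated indices; this convention is an assumption made here for illustration, chosen so that it agrees with the identity quoted in the proof of Lemma 3.1.

```latex
% Assumed convention: (\nabla^t u)_{ij} = \partial_i u_j, repeated index j summed.
\begin{align*}
\big[(\nabla^t u)\,v\big]_i + \big[(\nabla^t v)\,u\big]_i
  &= (\partial_i u_j)\,v_j + (\partial_i v_j)\,u_j
   = \partial_i (u_j v_j)
   = \big[\nabla (u \cdot v)\big]_i , \\
\mathbf{P}\big[(\nabla^t u)\,v\big] + \mathbf{P}\big[(\nabla^t v)\,u\big]
  &= \mathbf{P}\,\nabla (u \cdot v) = 0 .
\end{align*}
% With u = \ell_1 - \ell_2 and v = v_2 this is the step used in (3.2):
%   P[(\nabla^t(\ell_1 - \ell_2)) v_2] = -P[(\nabla^t v_2)(\ell_1 - \ell_2)].
```

The point of the exchange is that the derivative moves from the displacement onto the velocity, so no second derivatives of $\ell$ appear after one differentiation.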
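4. Local Existence for the Stochastic Formulation
In this section we prove local in time $C^{k,\alpha}$ existence for the stochastic system (1.4)–(1.7) as stated in Theorem 2.1. We conclude by proving Proposition 2.1, showing how the stochastic system (1.4)–(1.7) behaves as $\nu \to 0$. We begin with a few preliminary results.
Lemma 4.1. If $k \ge 1$, then there exists a constant $c = c(k, \alpha)$ such that
$$\|f \circ g\|_{k,\alpha} \le c\,\|f\|_{k,\alpha}\left(1 + \|\nabla g\|_{k-1,\alpha}\right)^{k+\alpha},$$
$$\|f \circ g_1 - f \circ g_2\|_{k,\alpha} \le c\left(1 + \|\nabla g_1\|_{k-1,\alpha} + \|\nabla g_2\|_{k-1,\alpha}\right)^{k+1} \|\nabla f\|_{k,\alpha}\, \|g_1 - g_2\|_{k,\alpha}$$
and
$$\|f_1 \circ g_1 - f_2 \circ g_2\|_{k,\alpha} \le c\left(1 + \|\nabla g_1\|_{k-1,\alpha} + \|\nabla g_2\|_{k-1,\alpha}\right)^{k+1} \left[\|f_1 - f_2\|_{k,\alpha} + \min\left\{\|\nabla f_1\|_{k,\alpha}, \|\nabla f_2\|_{k,\alpha}\right\} \|g_1 - g_2\|_{k,\alpha}\right].$$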
The proof of Lemma 4.1 is elementary and not presented here. We subsequently use the above lemma repeatedly without reference or proof.
Lemma 4.2. Let $X_1, X_2 \in C^{k+1,\alpha}$ be such that
$$\|\nabla X_1 - \mathbb{I}\|_{k,\alpha} \le d < 1 \quad\text{and}\quad \|\nabla X_2 - \mathbb{I}\|_{k,\alpha} \le d < 1.$$
Let $A_1$ and $A_2$ be the inverse of $X_1$ and $X_2$ respectively. Then there exists a constant $c = c(k, \alpha, d)$ such that
$$\|A_1 - A_2\|_{k,\alpha} \le c\,\|X_1 - X_2\|_{k,\alpha}.$$
Proof. Let $c = c(k, \alpha, d)$ be a constant that changes from line to line (we use this convention implicitly throughout this paper). Note first $\nabla A = (\nabla X)^{-1} \circ A$, and hence by Lemma 5.1,
$$\|\nabla A\|_{C^0} \le \|(\nabla X)^{-1}\|_{C^0} \le c.$$
Now using Lemma 5.1 to bound $\|(\nabla X)^{-1}\|_{\alpha}$ we have
$$\|\nabla A\|_{\alpha} = \|(\nabla X)^{-1} \circ A\|_{\alpha} \le \|(\nabla X)^{-1}\|_{\alpha}\left(1 + \|\nabla A\|_{C^0}\right) \le c.$$
When $k \ge 1$, we again bound $\|(\nabla X)^{-1}\|_{k,\alpha}$ by Lemma 5.1. Taking the $C^{k,\alpha}$ norm of $(\nabla X)^{-1} \circ A$ we have
$$\|\nabla A\|_{k,\alpha} \le \|(\nabla X)^{-1}\|_{k,\alpha}\left(1 + \|\nabla A\|_{k-1,\alpha}\right)^{k}.$$
So by induction we can bound $\|\nabla A\|_{k,\alpha}$ by a constant $c = c(k, \alpha, d)$. The lemma now follows immediately from the identity
$$A_1 - A_2 = (A_1 \circ X_2 - I) \circ A_2 = (A_1 \circ X_2 - A_1 \circ X_1) \circ A_2$$
and Lemma 4.1.
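The proof above leans on Lemma 5.1 (stated in Sect. 5), the Neumann-series bound for inverses in a Banach algebra. As a hedged numerical illustration, the sketch below checks both bounds of that lemma for $2 \times 2$ matrices under the maximum row-sum norm. The matrices, the choice of norm, and all function names are assumptions made for this example; the paper applies the lemma to $C^{k,\alpha}$ periodic matrices instead.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_sub(a, b):
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def norm_inf(a):
    """Maximum absolute row sum: submultiplicative, with norm_inf(identity) == 1."""
    return max(sum(abs(v) for v in row) for row in a)

def neumann_inverse(x, n_terms=200):
    """Approximate (1 + x)^{-1} by a partial sum of the series sum_n (-x)^n."""
    n = len(x)
    neg_x = [[-v for v in row] for row in x]
    total = identity(n)
    power = identity(n)
    for _ in range(n_terms):
        power = mat_mul(power, neg_x)
        total = mat_add(total, power)
    return total
```

For matrices `x`, `y` with `norm_inf` at most $\rho < 1$, one expects `norm_inf(neumann_inverse(x))` to be at most $1/(1-\rho)$ and the difference of the two inverses to be at most `norm_inf(x - y)` divided by $(1-\rho)^2$, up to rounding.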
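Lemma 4.3. Let $u \in C([0, T], C^{k+1,\alpha})$ and $X$ satisfy the SDE (1.4) with initial data (1.5). Let $\lambda = X - I$, let $\ell = A - I$ with $A = X^{-1}$, and let $U = \sup_t \|u(t)\|_{k+1,\alpha}$. Then there exists a constant $c = c(k, \alpha, \|u\|_{k+1,\alpha})$ such that for short time
$$\|\nabla\lambda(t)\|_{k,\alpha} \le \frac{cUt}{L}\, e^{cUt/L} \quad\text{and}\quad \|\nabla\ell(t)\|_{k,\alpha} \le \frac{cUt}{L}\, e^{cUt/L}.$$
Proof. From Eq. (1.4) we have
$$X(x, t) = x + \int_0^t u(X(x, s), s)\, ds + \sqrt{2\nu}\, B_t$$
$$\implies \nabla X(t) = \mathbb{I} + \int_0^t (\nabla u) \circ X \cdot \nabla X. \tag{4.1}$$
Taking the $C^0$ norm of Eq. (4.1) and using Gronwall’s Lemma we have
$$\|\nabla\lambda(t)\|_{C^0} = \|\nabla X(t) - \mathbb{I}\|_{C^0} \le e^{Ut/L} - 1.$$
Now taking the $C^{k,\alpha}$ norm in Eq. (4.1) we have
$$\|\nabla\lambda(t)\|_{k,\alpha} \le c \int_0^t \|\nabla u\|_{k,\alpha}\left(1 + \|\nabla\lambda\|_{k-1,\alpha}\right)^{k}\left(1 + \|\nabla\lambda\|_{k,\alpha}\right).$$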
The bound for $\|\nabla\lambda\|_{k,\alpha}$ now follows from the previous two inequalities, induction and Gronwall’s Lemma. The bound for $\|\nabla\ell\|_{k,\alpha}$ then follows from Lemma 4.2.
We draw attention to the fact that the above argument can only bound $\nabla\lambda$, and not $\lambda$. Fortunately, our results only rely on a bound of $\nabla\lambda$.
Lemma 4.4. Let $u, \bar u \in C([0, T], C^{k+1,\alpha})$ be such that
$$\sup_{0\le t\le T} \|u(t)\|_{k+1,\alpha} \le U \quad\text{and}\quad \sup_{0\le t\le T} \|\bar u(t)\|_{k+1,\alpha} \le U.$$
Let $X, \bar X$ be solutions of the SDE (1.4)–(1.5) with drift $u$ and $\bar u$ respectively, and let $A$ and $\bar A$ be the spatial inverse of $X$ and $\bar X$ respectively. Then there exists $c = c(k, \alpha, U)$ and a time $T = T(k, \alpha, U)$ such that
$$\|X(t) - \bar X(t)\|_{k,\alpha} \le c\, e^{cUt/L} \int_0^t \|u - \bar u\|_{k,\alpha}, \tag{4.2}$$
$$\|A(t) - \bar A(t)\|_{k,\alpha} \le c\, e^{cUt/L} \int_0^t \|u - \bar u\|_{k,\alpha} \tag{4.3}$$
for all $0 \le t \le T$.
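Proof. We first use Lemma 4.3 to bound $\|\nabla X - \mathbb{I}\|_{k,\alpha}$ and $\|\nabla \bar X - \mathbb{I}\|_{k,\alpha}$ for short time $T$. Now
$$X(t) - \bar X(t) = \int_0^t \left(u \circ X - \bar u \circ \bar X\right)$$
$$\implies \|X(t) - \bar X(t)\|_{k,\alpha} \le \int_0^t \|u \circ X - \bar u \circ \bar X\|_{k,\alpha} \le c \int_0^t \left(\|u - \bar u\|_{k,\alpha} + \frac{U}{L}\|X - \bar X\|_{k,\alpha}\right),$$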
and inequality (4.2) follows by applying Gronwall’s Lemma. Inequality (4.3) follows immediately from (4.2) and Lemma 4.2.
We now provide the proof of Theorem 2.1. We reproduce the statement here for convenience.
Theorem 2.1. Let $k \ge 1$ and $u_0 \in C^{k+1,\alpha}$ be divergence free. There exists a time $T = T(k, \alpha, L, \|u_0\|_{k+1,\alpha})$, but independent of viscosity, and a pair of functions $\lambda, u \in C([0, T], C^{k+1,\alpha})$ such that $u$ and $X = I + \lambda$ satisfy the system (1.4)–(1.7). Further, there exists $U = U(k, \alpha, L, \|u_0\|_{k+1,\alpha})$ such that $t \in [0, T] \implies \|u(t)\|_{k+1,\alpha} \le U$.
Proof. Let $U$ be a large constant, and $T$ a small time, both of which will be specified later. Define $\mathcal{U}$ and $\mathcal{L}$ by
$$\mathcal{U} = \left\{u \in C([0, T], C^{k+1,\alpha}) \;\middle|\; \|u(t)\|_{k+1,\alpha} \le U,\ \nabla \cdot u = 0 \text{ and } u(0) = u_0\right\}$$
and
$$\mathcal{L} = \left\{\ell \in C([0, T], C^{k+1,\alpha}) \;\middle|\; \|\nabla\ell(t)\|_{k,\alpha} \le \tfrac12\ \ \forall t \in [0, T] \text{ and } \ell(\cdot, 0) = 0\right\}.$$
We clarify that the functions $u$ and $\ell$ are required to be spatially $C^{k+1,\alpha}$, and need only be continuous in time. Now given $u \in \mathcal{U}$ we define $X_u$ to be the solution of Eq. (1.4) with initial data (1.5) and let $\lambda_u = X_u - I$ be the Eulerian displacement. We define $A_u$ by Eq. (1.6) and let $\ell_u = A_u - I$ be the Lagrangian displacement. Finally we define $W : \mathcal{U} \to \mathcal{U}$ by
$$W(u) = \mathbf{E}\,\mathbf{W}(u_0 \circ A_u, \ell_u).$$
We aim to show that $W : \mathcal{U} \to \mathcal{U}$ is Lipschitz in the weaker norm
$$\|u\|_{\mathcal{U}} = \sup_{0\le t\le T} \|u(t)\|_{k,\alpha},$$
and when $T$ is small enough, we will show that $W$ is a contraction mapping. Let $c$ be a constant that changes from line to line. By Corollary 3.1 we have
$$\|W(u)\|_{k+1,\alpha} \le c\,\mathbf{E}\left[\left(1 + \|\nabla\ell_u\|_{k,\alpha}\right)\|u_0 \circ A_u\|_{k+1,\alpha}\right] \le c\,\|u_0\|_{k+1,\alpha} \sup_{\Omega}\left(1 + \|\nabla\ell_u\|_{k,\alpha}\right)^{k+2}. \tag{4.4}$$
Here $\Omega$ is the probability space on which our processes are defined. We remark that Lemma 4.3 gives us a bound on $\|\nabla\ell_u\|_{k,\alpha}$. A bound on $\mathbf{E}\|\nabla\ell_u\|_{k,\alpha}$ instead would not have been enough.
Now we choose $U = c\left(\tfrac32\right)^{k+2}\|u_0\|_{k+1,\alpha}$, and then apply Lemma 4.3 to choose $T$ small enough to ensure $\ell_u, \lambda_u \in \mathcal{L}$. Now inequality (4.4) ensures that $W(u) \in \mathcal{U}$. Now if $u, \bar u \in \mathcal{U}$, Lemma 4.4 guarantees
$$\|\ell_u(t) - \ell_{\bar u}(t)\|_{k,\alpha} \le c\, e^{cUt/L} \int_0^t \|u - \bar u\|_{k,\alpha}.$$
Thus applying Proposition 3.1 we have
$$\begin{aligned}
\|W(u)(t) - W(\bar u)(t)\|_{k,\alpha} &\le c\left[\frac{U}{L}\|\ell_u(t) - \ell_{\bar u}(t)\|_{k,\alpha} + \|u_0 \circ A_u(t) - u_0 \circ A_{\bar u}(t)\|_{k,\alpha}\right] \\
&\le \frac{cU}{L}\|\ell_u(t) - \ell_{\bar u}(t)\|_{k,\alpha} \\
&\le \frac{cU}{L}\, e^{cUt/L} \int_0^t \|u - \bar u\|_{k,\alpha}.
\end{aligned}$$
So choosing $T = T(k, \alpha, L, U)$ small enough we can ensure $W$ is a contraction. The existence of a fixed point of $W$ now follows by successive iteration. We define $u_{n+1} = W(u_n)$. The sequence $(u_n)$ converges strongly with respect to the $C^{k,\alpha}$ norm. Since $\mathcal{U}$ is closed and convex, and the sequence $(u_n)$ is uniformly bounded in the $C^{k+1,\alpha}$ norm, it must have a weak limit $u \in \mathcal{U}$. Finally since $W$ is continuous with respect to the weaker $C^{k,\alpha}$ norm, the limit must be a fixed point of $W$, and hence a solution to the system (1.4)–(1.7).
We conclude by proving the vanishing viscosity behavior stated in Proposition 2.1. We reproduce the statement here for convenience.
Proposition 2.1. Let $u_0 \in C^{k+1,\alpha}$ be divergence free, and $U$, $T$ be as in Theorem 2.1. For each $\nu > 0$ we let $u_\nu$ be the solution of the system (1.4)–(1.7) on the time interval $[0, T]$. Making $T$ smaller if necessary, let $u$ be the solution to the Euler equations (1.1)–(1.2) with initial data $u_0$ defined on the time interval $[0, T]$. Then there exists a constant $c = c(k, \alpha, U, L)$ such that for all $t \in [0, T]$ we have
$$\|u(t) - u_\nu(t)\|_{k,\alpha} \le \frac{cU}{L}\sqrt{\nu t}.$$
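The $\sqrt{\nu t}$ rate in Proposition 2.1 matches the size of the Brownian term $\sqrt{2\nu}\,B_t$ in the flow. As a hedged illustration (not part of the paper's argument), the sketch below estimates $\mathbf{E}|\sqrt{2\nu}\,B_t|$ for a one-dimensional Wiener process by Monte Carlo and compares it with the closed form $2\sqrt{\nu t/\pi}$, which follows from $\mathbf{E}|Z| = \sqrt{2/\pi}$ for a standard normal $Z$. The function names and sample sizes are illustrative assumptions.

```python
import math
import random

def mean_abs_noise(nu, t, n_samples=20000, seed=0):
    """Monte Carlo estimate of E|sqrt(2 nu) B_t| for a one-dimensional Wiener process."""
    rng = random.Random(seed)
    scale = math.sqrt(2.0 * nu * t)  # sqrt(2 nu) B_t is distributed as N(0, 2 nu t)
    total = sum(abs(scale * rng.gauss(0.0, 1.0)) for _ in range(n_samples))
    return total / n_samples

def exact_mean_abs_noise(nu, t):
    """E|N(0, s^2)| = s * sqrt(2 / pi) with s^2 = 2 nu t, i.e. 2 * sqrt(nu t / pi)."""
    return 2.0 * math.sqrt(nu * t / math.pi)
```

Quadrupling $\nu$ (or $t$) doubles the estimate, which is the square-root scaling that appears in the bound.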
Proof. We use a subscript of $\nu$ to denote quantities associated to the solution of the viscous problem (1.4)–(1.7), and unsubscripted letters to denote the corresponding quantities associated to the solution of the Eulerian-Lagrangian formulation of the Euler equations (1.8)–(1.10). We use the same notation as in the proof of Theorem 2.1. Now from the proof of Theorem 2.1 we know that for short time $\ell_\nu, \ell \in \mathcal{L}$. Using Lemma 4.2 and making $T$ smaller if necessary, we can ensure $\lambda_\nu, \lambda \in \mathcal{L}$. We begin by
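estimating $\mathbf{E}\|\lambda_\nu - \lambda\|_{k,\alpha}$:
$$\lambda_\nu(t) - \lambda(t) = \int_0^t \left[u_\nu \circ X_\nu - u \circ X\right] + \sqrt{2\nu}\, B_t$$
$$\implies \|\lambda_\nu(t) - \lambda(t)\|_{k,\alpha} \le c\left[\int_0^t \left(\|u_\nu - u\|_{k,\alpha} + \frac{U}{L}\|\lambda_\nu - \lambda\|_{k,\alpha}\right) + \sqrt{\nu}\,|B_t|\right],$$
and so by Gronwall’s lemma
$$\|\lambda_\nu(t) - \lambda(t)\|_{k,\alpha} \le c\left[\sqrt{\nu}\,|B_t| + \int_0^t \|u_\nu - u\|_{k,\alpha}\right] e^{cUt/L}.$$
Using Lemma 4.2 and taking expected values gives
$$\mathbf{E}\|\ell_\nu(t) - \ell(t)\|_{k,\alpha} \le c\left[\sqrt{\nu t} + \int_0^t \|u_\nu - u\|_{k,\alpha}\right] e^{cUt/L}. \tag{4.5}$$
To estimate the difference $u_\nu - u$, we use (1.7) and (1.10) to obtain
$$u_\nu - u = \mathbf{E}\left[\mathbf{W}(u_0 \circ A_\nu, \ell_\nu) - \mathbf{W}(u_0 \circ A, \ell)\right]$$
$$\implies \|u_\nu - u\|_{k,\alpha} \le c\,\mathbf{E}\left[\frac{U}{L}\|\ell_\nu - \ell\|_{k,\alpha} + \|u_0 \circ A_\nu - u_0 \circ A\|_{k,\alpha}\right]$$
$$\implies \|u_\nu(t) - u(t)\|_{k,\alpha} \le \frac{cU}{L}\,\mathbf{E}\|\ell_\nu - \ell\|_{k,\alpha} \le \frac{cU}{L}\, e^{cUt/L}\left[\sqrt{\nu t} + \int_0^t \|u_\nu - u\|_{k,\alpha}\right],$$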
and the proposition follows from Gronwall’s lemma.
5. Local Existence for the Navier-Stokes Equations
Proposition 3.1, along with Peter Constantin’s diffusive Lagrangian formulation [5], immediately gives us a local in time $C^{k,\alpha}$ existence and uniqueness result for the Navier-Stokes equations using classical PDE methods. We conclude this paper by presenting the proof in this section.
Definition 5.1. Let $k \ge 2$ and $T > 0$. We define $\mathcal{L}^{k,\alpha}_T$ by
$$\mathcal{L}^{k,\alpha}_T = \left\{\ell \in C^{k,\alpha}(\mathcal{I} \times [0, T], \mathcal{I}) \;\middle|\; \|\nabla\ell(t)\|_{k-1,\alpha} \le \tfrac12\ \ \forall t \in [0, T] \text{ and } \ell(\cdot, 0) = 0\right\}.$$
Given $\ell \in \mathcal{L}^{k,\alpha}_T$, and $u \in C^{k,\alpha}(\mathcal{I} \times [0, T], \mathcal{I})$ divergence free, we define the virtual velocity $v = v_{u,\ell}$ to be the unique solution of the linear parabolic equation
$$\partial_t v_\beta + (u \cdot \nabla)\, v_\beta - \nu \Delta v_\beta = 2\nu\, C^i_{j,\beta}\, \partial_j v_i \tag{5.1}$$
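with initial data
$$v(x, 0) = u(x, 0), \tag{5.2}$$
where
$$C^p_{j,i} = \left((\mathbb{I} + \nabla\ell)^{-1}\right)_{ki}\, \partial_k \partial_j \ell^p \tag{5.3}$$
are the commutator coefficients. Finally we define the operator $\mathcal{W} : C^{k,\alpha}(\mathcal{I} \times [0, T]) \times \mathcal{L}^{k,\alpha}_T \to C^{k,\alpha}(\mathcal{I} \times [0, T])$ by
$$\mathcal{W}(u, \ell) = \mathbf{W}(v_{u,\ell}, \ell). \tag{5.4}$$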
Remark. We clarify that by $\ell \in C^{k,\alpha}(\mathcal{I} \times [0, T], \mathcal{I})$ we only impose a $C^{k,\alpha}$ spatial regularity restriction. We do not assume anything about time regularity. This will be the case for the remainder of this section.
Remark. Observe that $\|\nabla\ell\|_{k-1,\alpha} \le \tfrac12$ guarantees that the matrix $\mathbb{I} + \nabla^t\ell$ in Eq. (5.3) is invertible. Further note that all coefficients in Eq. (5.1) are of class $C^{k,\alpha}$ and hence by parabolic regularity [12], $v \in C^{k,\alpha}$.
Lemma 5.1. Let $X$ be a Banach algebra. If $x \in X$ is such that $\|x\| \le \rho < 1$ then $1 + x$ is invertible and $\|(1 + x)^{-1}\| \le \frac{1}{1-\rho}$. Further if in addition $\|y\| \le \rho$ then
$$\left\|(1 + x)^{-1} - (1 + y)^{-1}\right\| \le \frac{1}{(1 - \rho)^2}\,\|x - y\|.$$
Proof. The first part of the lemma follows immediately from the identity $(1 + x)^{-1} = \sum_n (-x)^n$. The second part follows from the first part and the identity
$$(1 + x)^{-1} - (1 + y)^{-1} = (1 + x)^{-1}(y - x)(1 + y)^{-1}.$$
We generally use Lemma 5.1 when $X$ is the space of $C^{k,\alpha}$ periodic matrices. We finally prove that the Weber operator $\mathcal{W}$ is Lipschitz, which will quickly give us the existence theorem.
Proposition 5.1. If $\ell, \bar\ell \in \mathcal{L}^{k,\alpha}_T$ and $u, \bar u \in C^{k,\alpha}$ are such that
$$\sup_{0\le t\le T} \|u(t)\|_{k,\alpha} \le U \quad\text{and}\quad \sup_{0\le t\le T} \|\bar u(t)\|_{k,\alpha} \le U,$$
then there exists $c = c(k, \alpha, L, \nu, U)$ and $T = T(k, \alpha, L, \nu, U)$ such that
$$\|\mathcal{W}(u, \ell)(t) - \mathcal{W}(\bar u, \bar\ell)(t)\|_{k,\alpha} \le c\,\|u(0) - \bar u(0)\|_{k,\alpha} + \frac{cU}{L}\left[\left(1 + \frac{\nu t}{L^2}\right)\|\ell(t) - \bar\ell(t)\|_{k,\alpha} + t\,\|u(t) - \bar u(t)\|_{k-2,\alpha}\right]$$
for all $0 \le t \le T$.
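Proof. Let $v$ and $\bar v$ be the virtual velocities associated to $u, \ell$ and $\bar u, \bar\ell$ respectively. Let $C$ and $\bar C$ be the commutator coefficients associated to $\ell$ and $\bar\ell$ respectively. Since Eq. (5.1) is a linear parabolic equation with $C^{k,\alpha}$ coefficients, standard regularity theory [12] ensures that there exists $T = T(\nu, U)$ such that
$$\sup_{0\le t\le T} \|v(t)\|_{k,\alpha} \le 2U \quad\text{and}\quad \sup_{0\le t\le T} \|\bar v(t)\|_{k,\alpha} \le 2U.$$
Hence by Proposition 3.1 we have
$$\|\mathcal{W}(u, \ell) - \mathcal{W}(\bar u, \bar\ell)\|_{k,\alpha} = \|\mathbf{W}(v, \ell) - \mathbf{W}(\bar v, \bar\ell)\|_{k,\alpha} \le c\left[\frac{U}{L}\|\ell - \bar\ell\|_{k,\alpha} + \|v - \bar v\|_{k,\alpha}\right]. \tag{5.5}$$
Now let $\tilde v = v - \bar v$. Subtracting Eq. (5.1) for $\bar v$ from Eq. (5.1) for $v$, the evolution equation of $\tilde v$ is given by
$$\partial_t \tilde v_\beta + (u \cdot \nabla)\,\tilde v_\beta - \nu \Delta \tilde v_\beta - 2\nu\, C^i_{j,\beta}\, \partial_j \tilde v_i = 2\nu\left(C^i_{j,\beta} - \bar C^i_{j,\beta}\right)\partial_j \bar v_i + \left((\bar u - u) \cdot \nabla\right)\bar v_\beta$$
with initial data $\tilde v(x, 0) = u(x, 0) - \bar u(x, 0)$. We estimate the $C^{k-2,\alpha}$ norm of the right hand side. Let $c$ be some constant which changes from line to line. By definition,
$$\begin{aligned}
\bar C^k_j - C^k_j &= (\mathbb{I} + \nabla^t\bar\ell)^{-1}\,\nabla\partial_j \bar\ell_k - (\mathbb{I} + \nabla^t\ell)^{-1}\,\nabla\partial_j \ell_k \\
&= \left[(\mathbb{I} + \nabla^t\bar\ell)^{-1} - (\mathbb{I} + \nabla^t\ell)^{-1}\right]\nabla\partial_j \bar\ell_k + (\mathbb{I} + \nabla^t\ell)^{-1}\left[\nabla\partial_j \bar\ell_k - \nabla\partial_j \ell_k\right].
\end{aligned}$$
Note that by Lemma 5.1 we can bound $\|(\mathbb{I} + \nabla^t\ell)^{-1}\|_{k-1,\alpha}$ and $\|(\mathbb{I} + \nabla^t\bar\ell)^{-1}\|_{k-1,\alpha}$. Further, by Lemma 5.1 again we have
$$\left\|(\mathbb{I} + \nabla^t\bar\ell)^{-1} - (\mathbb{I} + \nabla^t\ell)^{-1}\right\|_{k-1,\alpha} \le c\,\|\nabla\ell - \nabla\bar\ell\|_{k-1,\alpha}.$$
Combining these estimates we have
$$\|C - \bar C\|_{k-2,\alpha} \le \frac{c}{L}\,\|\nabla\ell - \nabla\bar\ell\|_{k-1,\alpha}.$$
Finally note
$$\left\|\left((\bar u - u) \cdot \nabla\right)\bar v\right\|_{k-2,\alpha} \le \frac{cU}{L}\,\|u - \bar u\|_{k-2,\alpha}.$$
Thus by parabolic regularity [12],
$$\|\tilde v(t)\|_{k,\alpha} \le \frac{cUt}{L}\left[\frac{\nu}{L}\,\|\nabla\ell(t) - \nabla\bar\ell(t)\|_{k-1,\alpha} + \|u(t) - \bar u(t)\|_{k-2,\alpha}\right] + \|u(0) - \bar u(0)\|_{k,\alpha}, \tag{5.6}$$
and substituting Eq. (5.6) in (5.5), the proposition follows.
Theorem 5.1. Let $k \ge 2$ and $u_0 \in C^{k,\alpha}(\mathcal{I}, \mathcal{I})$ be divergence free. Then there exists $T = T(k, \alpha, L, \nu, \|u_0\|_{k,\alpha})$ and $u \in C^{k,\alpha}(\mathcal{I} \times [0, T], \mathcal{I})$ which is a solution of the Navier-Stokes equations with initial data $u_0$.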
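Proof. Let $U > \|u_0\|_{k,\alpha}$. We define the set $\mathcal{U}$ by
$$\mathcal{U} = \left\{u \in C([0, T], C^{k,\alpha}) \;\middle|\; \|u(t)\|_{k,\alpha} \le U,\ \nabla \cdot u = 0, \text{ and } u(0) = u_0\right\}.$$
Given $u \in \mathcal{U}$, let $\ell_u$ be the unique solution of the equation
$$\partial_t \ell_u + (u \cdot \nabla)\,\ell_u - \nu \Delta \ell_u + u = 0$$
with initial data $\ell_u(x, 0) = 0$. Our aim is to produce $u \in \mathcal{U}$ such that $u = \mathcal{W}(u, \ell_u)$, which from [5] we know must be a solution to the Navier-Stokes equations.
We define the map $W$ by $W(u) = \mathcal{W}(u, \ell_u)$. If $\mathcal{U}$ is endowed with the strong norm
$$\|u\|_{\mathcal{U}} = \sup_{0\le t\le T} \|u(t)\|_{k,\alpha},$$
we will show as before that for sufficiently small $T$, $W$ maps $\mathcal{U}$ into itself. Finally we will show that $W$ is a contraction under a weaker norm, producing the desired fixed point.
First note that by parabolic regularity [12], we have
$$\|\ell_u(t)\|_{k,\alpha} \le cUt.$$
The constant $c$ of course depends on $U$, but we retain the $U$ on the right for dimensional correctness. Thus choosing $T$ small will guarantee $\ell_u \in \mathcal{L}^{k,\alpha}_T$. Let $v = v_{u,\ell_u}$ be the virtual velocity defined by Eq. (5.1), with initial data $u_0$. Standard parabolic estimates [12] (and the fact that $\ell_u \in \mathcal{L}^{k,\alpha}_T$) show
$$\|v(t) - u_0\|_{k,\alpha} \le \frac{cU^2}{L}\, t.$$
Now by definition,
$$\begin{aligned}
W(u) &= \mathbf{P}\left[(\mathbb{I} + \nabla^t \ell_u)\, v\right] \\
&= \mathbf{P}\left[(\mathbb{I} + \nabla^t \ell_u)(v - u_0) + (\mathbb{I} + \nabla^t \ell_u)\, u_0\right] \\
&= \mathbf{P}[u_0] + \mathbf{P}\left[(\mathbb{I} + \nabla^t \ell_u)(v - u_0)\right] + \mathbf{P}\left[(\nabla^t \ell_u)\, u_0\right].
\end{aligned}$$
Since $u_0$ is divergence free, $\mathbf{P}(u_0) = u_0$. Using Corollary 3.1 and the preceding two estimates for $\ell_u$ and $v - u_0$, we obtain
$$\|W(u)(t)\|_{k,\alpha} \le \|u_0\|_{k,\alpha} + c\,\frac{U^2}{L}\, t.$$
Thus choosing $T < \frac{L}{cU^2}\left(U - \|u_0\|_{k,\alpha}\right)$, we can ensure that $W$ maps $\mathcal{U}$ into itself.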
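To see that $W$ has a fixed point, let $u, \bar u \in \mathcal{U}$ and define $\tilde\ell = \ell_u - \ell_{\bar u}$. The evolution of $\tilde\ell$ is governed by
$$\partial_t \tilde\ell + (u \cdot \nabla)\,\tilde\ell - \nu \Delta \tilde\ell = \left((\bar u - u) \cdot \nabla\right)\ell_{\bar u} + \bar u - u,$$
and parabolic regularity [12] immediately gives
$$\|\tilde\ell(t)\|_{k,\alpha} \le ct\,\|u(t) - \bar u(t)\|_{k-2,\alpha}.$$
Combining this with Proposition 5.1, we have
$$\sup_{0\le t\le T} \|W(u)(t) - W(\bar u)(t)\|_{k,\alpha} \le \frac{cUT}{L} \sup_{0\le t\le T} \|u(t) - \bar u(t)\|_{k-2,\alpha}.$$
Thus if $T$ is chosen to be smaller than $\frac{L}{cU}$, then $W : \mathcal{U} \to \mathcal{U}$ is a contraction mapping and has a unique fixed point, concluding the proof.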
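The fixed points in Theorems 2.1 and 5.1 are obtained by successive iteration of a map that contracts once $T$ is small. The sketch below illustrates that mechanism on a much simpler toy problem, the integral form of the scalar ODE $x' = a x$: the map $(\Phi x)(t) = x_0 + a\int_0^t x(s)\,ds$ contracts in the sup norm on $[0, T]$ with constant $aT$ when $aT < 1$. The toy equation, the trapezoid discretization and all names are assumptions made for illustration; this is not the map $W$ of the paper.

```python
import math

def picard_map(x, x0, a, h):
    """Apply (Phi x)(t_j) = x0 + a * integral_0^{t_j} x(s) ds via the trapezoid rule."""
    out = [x0]
    integral = 0.0
    for j in range(1, len(x)):
        integral += 0.5 * h * (x[j - 1] + x[j])
        out.append(x0 + a * integral)
    return out

def sup_dist(x, y):
    """Sup-norm distance between two grid functions."""
    return max(abs(p - q) for p, q in zip(x, y))

def iterate_to_fixed_point(x0=1.0, a=1.0, T=0.5, n=200, n_iter=25):
    """Successive iteration x_{m+1} = Phi(x_m), starting from the constant x0."""
    h = T / n
    x = [x0] * (n + 1)
    gaps = []  # sup-norm distance between consecutive iterates
    for _ in range(n_iter):
        new = picard_map(x, x0, a, h)
        gaps.append(sup_dist(new, x))
        x = new
    return x, gaps
```

Consecutive gaps shrink by at least the factor $aT$, and the limit approximates $x_0 e^{at}$ on the grid. Shrinking $T$ to force the contraction is the same device used to choose the existence time in the proofs above.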
Remark. The above estimates along with the active vector formulation of the Euler equations [4] can be used to prove a $C^{k,\alpha}$ local existence and uniqueness theorem for the Euler equations. Since a similar proof of this result can be found in the original paper [4] by P. Constantin, we do not present it here.
Acknowledgement. I would like to thank Peter Constantin for his encouragement, support and many helpful discussions. I would also like to thank Hongjie Dong and Tu Nguyen for carefully reading this paper, and pointing out an error in the original proof of Theorem 2.1.
References
1. Bhattacharya, R.N., Chen, L., Dobson, S., Guenther, R.B., Orum, C., Ossiander, M., Thomann, E., Waymire, E.C.: Majorizing kernels and stochastic cascades with applications to incompressible Navier-Stokes equations. Trans. Amer. Math. Soc. 355(12), 5003–5040 (2003) (electronic)
2. Busnello, B., Flandoli, F., Romito, M.: A probabilistic representation for the vorticity of a 3D viscous fluid and for general systems of parabolic equations. http://arxiv.org/list/math.PR/0306075, 2003
3. Chorin, A., Marsden, J.: A Mathematical Introduction to Fluid Mechanics. Berlin Heidelberg New York: Springer, 2000
4. Constantin, P.: An Eulerian-Lagrangian approach for incompressible fluids: local theory. J. Amer. Math. Soc. 14(2), 263–278 (2001) (electronic)
5. Constantin, P.: An Eulerian-Lagrangian Approach to the Navier-Stokes equations. Commun. Math. Phys. 216(3), 663–686 (2001)
6. Constantin, P., Foias, C.: Navier-Stokes Equations. Chicago, IL: University of Chicago Press, 1988
7. Constantin, P., Iyer, G.: A stochastic Lagrangian representation of the 3-dimensional incompressible Navier-Stokes equations. http://arxiv.org/list/math.PR/0511067, 2005
8. Friedman, A.: Stochastic Differential Equations and Applications, Volume 1. London-New York-San Diego: Academic Press, 1975
9. Gomes, D.A.: A variational formulation for the Navier-Stokes equation. Commun. Math. Phys. 257, 227–234 (2005)
10. Jourdain, B., Le Bris, C., Lelièvre, T.: Coupling PDEs and SDEs: the Illustrative Example of the Multiscale Simulation of Viscoelastic Flows. Lecture Notes in Computational Science and Engineering 44, Berlin Heidelberg New York: Springer, 2005, pp. 151–170
11. Karatzas, I., Shreve, S.: Brownian Motion and Stochastic Calculus. Graduate Texts in Mathematics 113, New York: Springer, 1991
12. Krylov, N.V.: Lectures on Elliptic and Parabolic Equations in Hölder Spaces. Graduate Studies in Mathematics 12, Providence, RI: Amer. Math. Soc., 1996
13. Kunita, H.: Stochastic flows and stochastic differential equations. Cambridge Studies in Advanced Mathematics 24, Cambridge: Cambridge University Press, 1997
14. Le Gall, J.: Spatial Branching Processes, Random Snakes and Partial Differential Equations. Lectures in Mathematics, Basel-Boston: Birkhäuser, 1999
15. Le Jan, Y., Sznitman, A.S.: Stochastic cascades and 3-dimensional Navier-Stokes equations. Probab. Theory Related Fields 109(3), 343–366 (1997)
16. Majda, A., Bertozzi, A.: Vorticity and Incompressible Flow. Cambridge: Cambridge University Press, 2002
17. Stein, E.: Singular Integrals and Differentiability Properties of Functions. Princeton, NJ: Princeton University Press, 1970
Communicated by A. Kupiainen
Commun. Math. Phys. 266, 647–663 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0056-7
On Monopoles and Domain Walls
Amihay Hanany$^1$, David Tong$^2$
$^1$ Center for Theoretical Physics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA. E-mail: [email protected]
$^2$ Department of Applied Mathematics and Theoretical Physics, University of Cambridge, CB3 0WA, UK. E-mail: [email protected]
Received: 4 August 2005 / Accepted: 20 March 2006
Published online: 18 July 2006 – © Springer-Verlag 2006
Abstract: The purpose of this paper is to describe a relationship between maximally supersymmetric domain walls and magnetic monopoles. We show that the moduli space of domain walls in non-abelian gauge theories with $N$ flavors is isomorphic to a complex, middle dimensional, submanifold of the moduli space of $U(N)$ magnetic monopoles. This submanifold is defined by the fixed point set of a circle action rotating the monopoles in the plane. To derive this result we present a D-brane construction of domain walls, yielding a description of their dynamics in terms of truncated Nahm equations. The physical explanation for the relationship lies in the fact that domain walls, in the guise of kinks on a vortex string, correspond to magnetic monopoles confined by the Meissner effect.
Contents
1. Introduction
2. Domain Walls
2.1 Classification of Domain Walls
2.2 The moduli space of domain walls: Some examples
2.3 The ordering of domain walls
3. Monopoles
3.1 The relationship between $\mathcal{M}_{\vec g}$ and $\mathcal{W}_{\vec g}$
3.2 D-branes and Nahm’s equations
4. D-Branes and Domain Walls
4.1 Domain wall dynamics
4.2 The ordering of domain walls revisited
1. Introduction
Domain walls in gauge theories with eight supercharges have rather special properties. These walls were first studied by Abraham and Townsend [1] who showed that in
two-dimensions, where domain walls are known as kinks, they exhibit dyonic behaviour reminiscent of magnetic monopoles. Further similarities between kinks and magnetic monopoles, at both the classical and quantum level, were uncovered in [2]. The physical explanation for this relationship was presented in [3], where new BPS solutions were described corresponding to magnetic monopoles in a phase with fully broken gauge symmetry. The Meissner effect ensures that monopoles are confined: the magnetic flux is unable to propagate through the vacuum and leaves the monopole in two collimated tubes. From the perspective of the flux tube, the monopole appears as a kink. The idea of describing confined monopoles as kinks in $\mathbb{Z}_N$ strings occurred previously in [4]. The relationship between the confined magnetic monopoles and the kink was further explored in [5–7] and related systems were studied in [8–13].
In this paper we use D-brane techniques to study the moduli space of multiple domain walls. This allows us to develop a description of the domain wall dynamics in terms of a linearized Nahm equation, providing a direct relationship to the dynamics of monopoles. Specifically, we show that the moduli space of domain walls, which we denote as $\mathcal{W}_{\vec g}$, is isomorphic to a middle dimensional submanifold of the moduli space of magnetic monopoles $\mathcal{M}_{\vec g}$. This submanifold describes magnetic monopoles lying along a line, and can be described as the fixed point of an $S^1$ action $\hat k$, rotating the monopoles in a plane,
$$\mathcal{W}_{\vec g} \cong \mathcal{M}_{\vec g}\Big|_{\hat k = 0}. \tag{1.1}$$
The correspondence captures the topology and asymptotic metric of the domain wall moduli space $\mathcal{W}_{\vec g}$. It does not extend to the full metric on $\mathcal{W}_{\vec g}$. Nevertheless, as we shall explain, it does correctly capture the most important feature of domain walls: their ordering along the line. The relationship (1.1) plays companion to the result of [14], where the moduli space of vortices was shown to be a middle dimensional submanifold of the moduli space of instantons.
Indeed, upon dimensional reduction, the self-dual instanton equations become the monopole equations, while the vortex equations descend to the domain wall equations.
We start in the following section by describing the domain walls in question, together with a review of their moduli spaces. We pay particular attention to the crudest physical feature of domain walls, namely the rules governing their spatial ordering along the line. Section 3 contains a brief review of magnetic monopoles in higher rank gauge groups, primarily in order to fix notation, allowing us to elaborate on the relationship (1.1). We also describe the Nahm construction of the monopole moduli space as it arises from D-branes. The meat of the paper is in Sect. 4. We present a D-brane embedding of domain wall solitons which gives a description of their dynamics in terms of a linear Nahm equation. This equation is somewhat trivial, with the content hidden in various boundary conditions. We show how these boundary conditions capture the prescribed ordering of domain walls.
2. Domain Walls
In this paper we will study a class of BPS domain wall solutions occurring in maximally supersymmetric theories with multiple, isolated vacua. The Lagrangian for these models includes a $U(k)$ gauge field $A_\mu$, a real adjoint scalar $\sigma$ and $N$ fundamental scalars $q_i$, each with real mass $m_i$,
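$$L = \mathrm{Tr}\left[\frac{1}{4e^2} F_{\mu\nu}F^{\mu\nu} + \frac{1}{2e^2}|D_\mu \sigma|^2 + \frac{e^2}{2}\left(q_i \otimes q_i^\dagger - v^2\right)^2\right] + \sum_{i=1}^{N}\left(|D_\mu q_i|^2 + q_i^\dagger(\sigma - m_i)^2 q_i\right),$$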
where there is an implicit sum over the flavor index $i$ in the adjoint valued term $q_i \otimes q_i^\dagger$. This Lagrangian can be embedded in a theory with 8 supercharges in any spacetime dimension $1 \le d \le 5$ (e.g. $\mathcal{N} = 2$ SQCD in $d = 3+1$). Such theories include further scalar fields which can be shown to vanish on the domain wall solutions¹. The fermions do contribute zero modes but will not be important here. When the Higgs expectation value $v^2$ is non-vanishing, and the masses $m_i$ are distinct ($m_i \neq m_j$ for $i \neq j$), the theory has a set of isolated vacua. Each vacuum is labelled by a set of $k$ distinct elements, chosen from a possible $N$,

$$
\Xi = \{ \xi(a) : \xi(a) \neq \xi(b) \ \text{for}\ a \neq b \}. \tag{2.1}
$$
Here $a = 1, \dots, k$ runs over the color index, while $\xi(a) \in \{1, \dots, N\}$. Up to a gauge transformation, the vacuum associated to this set is given by

$$
\sigma = \mathrm{diag}\left(m_{\xi(1)}, \dots, m_{\xi(k)}\right), \qquad q^{a}{}_{i} = v\, \delta^{a}{}_{i=\xi(a)}. \tag{2.2}
$$
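The labelling of vacua by sets of $k$ distinct flavors can be made concrete with a few lines of code. The following is a minimal sketch, not part of the original construction: the function names `vacua` and `vacuum_fields` are hypothetical, and flavors and colors are simply numbered from 1 as in the text. It lists the admissible sets and reads off the field values of (2.2) for one of them.

```python
from itertools import combinations


def vacua(N, k):
    """All vacua of the U(k) theory with N flavors: k-element subsets of {1, ..., N}."""
    return list(combinations(range(1, N + 1), k))


def vacuum_fields(xi, masses, v):
    """Field values in the vacuum labelled by the tuple xi, following (2.2).

    Returns the diagonal entries of sigma and a dict mapping (color a, flavor i)
    to the non-zero entries of q; every other entry of q vanishes.
    """
    sigma = [masses[flavor - 1] for flavor in xi]
    q = {(a + 1, flavor): v for a, flavor in enumerate(xi)}
    return sigma, q


if __name__ == "__main__":
    for xi in vacua(4, 2):
        print(xi, vacuum_fields(xi, masses=[4.0, 3.0, 2.0, 1.0], v=1.0))
```

For $N < k$ the list is empty, and otherwise its length is the binomial coefficient $N!/(k!(N-k)!)$, for instance 6 for $k = 2$, $N = 4$.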
For N < k there are no supersymmetric vacua; for N ≥ k, the number of vacua is
$$
N_{\rm vac} = \binom{N}{k} = \frac{N!}{k!\,(N-k)!}. \tag{2.3}
$$

Each of these vacua is isolated and exhibits a mass gap. There are $k^2$ non-BPS massive gauge bosons and quarks with masses $m_\gamma^2 \sim e^2 v^2 + |m_{\xi(a)} - m_{\xi(b)}|^2$ and $k(N-k)$ BPS massive quark fields with masses $m_q \sim |m_{\xi(a)} - m_i|$ (with $i \notin \Xi$). For vanishing masses $m_i = 0$ the theory enjoys an $SU(N)$ flavor symmetry, rotating the $q_i$. When distinct masses are turned on this is broken explicitly to the Cartan subalgebra $U(1)^{N-1}$. Meanwhile, the $U(k)$ gauge group is broken spontaneously in the vacuum by the expectation values (2.2).

¹ If we promote the scalar field $\sigma$ and the masses $m_i$ to complex variables, then the theories admit an interesting array of domain wall junctions [15] and dyonic walls [16].

A. Hanany, D. Tong

The existence of multiple, gapped, isolated vacua is sufficient to guarantee the existence of co-dimension one domain walls (otherwise known as kinks). These walls are BPS objects, satisfying first order Bogomol'nyi equations which can be derived in the usual manner by completing the square. We first choose a flat connection $F_{\mu\nu} = 0$, and allow the fields to depend only on a single coordinate, say $x^3$. Then the Hamiltonian is given by

$$
\begin{aligned}
H &= \mathrm{Tr}\left[ \frac{1}{2e^2} |D_3 \sigma|^2 + \frac{e^2}{2}\left( q_i \otimes q_i^\dagger - v^2 \right)^2 \right] + \sum_{i=1}^{N} \left( |D_3 q_i|^2 + q_i^\dagger (\sigma - m_i)^2 q_i \right) \\
&= \frac{1}{2e^2}\, \mathrm{Tr}\left( D_3 \sigma - e^2 \left( q_i \otimes q_i^\dagger - v^2 \right) \right)^2 + \sum_{i=1}^{N} \left| D_3 q_i - (\sigma - m_i) q_i \right|^2 \\
&\quad + \mathrm{Tr}\left[ D_3 \sigma \left( q_i \otimes q_i^\dagger - v^2 \right) \right] + \sum_{i=1}^{N} \left( q_i^\dagger (\sigma - m_i) D_3 q_i + D_3 q_i^\dagger (\sigma - m_i) q_i \right) \\
&\geq -\partial_3 \left( v^2\, \mathrm{Tr}\, \sigma \right).
\end{aligned} \tag{2.4}
$$
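The last step of (2.4) relies on the cross terms assembling into total derivatives. A short worked version follows; the grouping of terms is our own bookkeeping, using only the covariant Leibniz rule and $\mathrm{Tr}\, D_3 \sigma = \partial_3 \mathrm{Tr}\, \sigma$.

```latex
\begin{aligned}
&\mathrm{Tr}\left[ D_3\sigma \left( q_i \otimes q_i^\dagger - v^2 \right) \right]
 + \sum_{i=1}^{N} \left( q_i^\dagger (\sigma - m_i) D_3 q_i + D_3 q_i^\dagger (\sigma - m_i) q_i \right) \\
&\qquad = \sum_{i=1}^{N} \left( q_i^\dagger (D_3 \sigma) q_i
 + q_i^\dagger (\sigma - m_i) D_3 q_i + D_3 q_i^\dagger (\sigma - m_i) q_i \right)
 - v^2\, \mathrm{Tr}\, D_3 \sigma \\
&\qquad = \partial_3 \sum_{i=1}^{N} q_i^\dagger (\sigma - m_i) q_i \; - \; v^2\, \partial_3\, \mathrm{Tr}\, \sigma .
\end{aligned}
```

In any vacuum of the form (2.2) one has $(\sigma - m_i) q_i = 0$ for every $i$, so the first total derivative integrates to zero between two vacua. Only the $-v^2 \partial_3 \mathrm{Tr}\, \sigma$ term contributes to the energy, and the bound is saturated exactly when both squares in the second line of (2.4) vanish.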
Our domain wall interpolates between a vacuum $\Xi_-$ at $x^3 = -\infty$, as determined by a set (2.1), and a distinct vacuum $\Xi_+$ at $x^3 = +\infty$. The minus signs above have been chosen under the assumption that $\Delta m > 0$, where $\Delta m = \sum_{i \in \Xi_-} m_i - \sum_{i \in \Xi_+} m_i$, so that a BPS domain wall satisfies the Bogomol'nyi equations,

$$
D_3 \sigma = e^2 \left( q_i \otimes q_i^\dagger - v^2 \right), \qquad D_3 q_i = (\sigma - m_i) q_i \tag{2.5}
$$

and has tension given by $T = v^2 \Delta m$. Analytic solutions to these equations can be found in the $e^2 \to \infty$ limit [17–19], which give smooth approximations to the solution at large, but finite $e^2$ [20].

2.1. Classification of Domain Walls. Domain walls in field theories are classified by the choice of vacuum $\Xi_-$ and $\Xi_+$ at left and right infinity. However, our theory contains an exponentially large number of vacua (2.3) and one may hope that there is a coarser, less unwieldy, classification which captures certain relevant properties of a given domain wall. Such a classification was offered in [21]. Firstly define the $N$-vector $\vec m = (m_1, \dots, m_N)$. The tension of the BPS domain wall can then be written as

$$
T_{\vec g} = v^2 \Delta m \equiv v^2\, \vec m \cdot \vec g, \tag{2.6}
$$
where the $N$-vector $\vec g \in \Lambda_R(su(N))$, the root lattice of $su(N)$. Note that there do not exist domain wall solutions for all $\vec g \in \Lambda_R(su(N))$; the only admissible vectors are of the form $\vec g = (p_1, \dots, p_N)$ with $p_i = 0$ or $\pm 1$. Note also that a choice of $\vec g$ does not specify a unique choice of vacua $\Xi_-$ and $\Xi_+$ at left and right infinity. Nor, in fact, does it specify a unique domain wall moduli space $\mathcal{W}_{\vec g}$. Nevertheless, domain walls in sectors with the same $\vec g$ share common traits. The dimension of the moduli space of domain wall solutions was computed in [21] using an index theorem, following earlier results in [22, 19]. To describe the dimension of the moduli space, it is useful to decompose $\vec g$ in terms of simple roots² $\vec\alpha_i$,

$$
\vec g = \sum_{i=1}^{N-1} n_i\, \vec\alpha_i, \qquad n_i \in \mathbb{Z}. \tag{2.7}
$$
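The passage from a pair of vacua to the charge $\vec g$ and then to the integers $n_i$ of (2.7) is simple arithmetic. A minimal sketch follows, under two assumptions: the masses are ordered $m_1 > m_2 > \cdots > m_N$, so that the simple roots are $\vec\alpha_i = e_i - e_{i+1}$, and $\vec g$ is fixed by requiring $\vec m \cdot \vec g = \Delta m$ for all masses. The function names are hypothetical.

```python
def charge_vector(xi_minus, xi_plus, N):
    """g with entries p_i = [i in Xi_-] - [i in Xi_+], so that m . g = Delta m."""
    g = [0] * N
    for i in xi_minus:
        g[i - 1] += 1
    for i in xi_plus:
        g[i - 1] -= 1
    return g


def simple_root_coefficients(g):
    """Solve g = sum_i n_i alpha_i with alpha_i = e_i - e_{i+1}.

    Comparing components gives p_1 = n_1 and p_i = n_i - n_{i-1}, hence
    n_i = p_1 + ... + p_i.
    """
    if sum(g) != 0:
        raise ValueError("g must lie in the root lattice (entries sum to zero)")
    coefficients, running = [], 0
    for p in g[:-1]:
        running += p
        coefficients.append(running)
    return coefficients


if __name__ == "__main__":
    g = charge_vector((1, 2), (3, 4), N=4)
    print(g, simple_root_coefficients(g))
```

For the vacua $(1,2)$ and $(3,4)$ of the $U(2)$ theory with $N = 4$ this gives $\vec g = (1, 1, -1, -1)$ and $(n_1, n_2, n_3) = (1, 2, 1)$.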
The index theorem of [21] reveals that domain wall solutions to (2.5) exist only if $n_i \ge 0$ for all $i$. If this holds, the number of zero modes of a solution is given by

$$
\dim \mathcal{W}_{\vec g} = 2 \sum_{i=1}^{N-1} n_i, \tag{2.8}
$$

where $\mathcal{W}_{\vec g}$ denotes the moduli space of any set of domain walls with charge $\vec g$. This result has a simple physical interpretation. There exist $N - 1$ types of "elementary" domain walls corresponding to a $\vec g = \vec\alpha_i$, the simple roots. Each of these has just two collective coordinates corresponding to a position in the $x^3$ direction and a phase. As first explained in [1], the phase coordinate is a Goldstone mode arising because the domain wall configuration breaks the $U(1)^{N-1}$ flavor symmetry as we review below. In general, a domain wall labelled by $\vec g$ can be thought to be constructed from $\sum_i n_i$ elementary domain walls, each with its own position and phase collective coordinate. Let us now turn to some examples.

² The basis of simple roots is fixed by the requirement that $\vec m \cdot \vec\alpha_i > 0$ for each $i$. A unique basis is defined in this way if $\vec m$ lies in a positive Weyl chamber, which occurs whenever the masses are distinct so that $SU(N) \to U(1)^{N-1}$. If we choose the ordering $m_1 > m_2 > \cdots > m_N$ we have simple roots $\vec\alpha_1 = (1, -1, 0, \dots, 0)$ and $\vec\alpha_2 = (0, 1, -1, 0, \dots, 0)$ through to $\vec\alpha_{N-1} = (0, \dots, 1, -1)$.

2.2. The moduli space of domain walls: Some examples.

Example 1. $\vec g = \vec\alpha_1$. The simplest system admitting a domain wall is the abelian $k = 1$ theory with $N = 2$ charged scalars $q_1$ and $q_2$. The $N_{\rm vac} = 2$ vacua of the theory are given by $\sigma = m_i$ and $|q_j|^2 = v^2 \delta_{ij}$ for $i = 1, 2$. There is a single domain wall in this theory with $\vec g = \vec\alpha_1$ interpolating between these two vacua. Under the $U(1)_F$ flavor symmetry, $q_1$ has charge $+1$ and $q_2$ has charge $-1$. In each of the vacua, the $U(1)_F$ symmetry coincides with the $U(1)$ gauge action but, in the core of the domain wall, both $q_1$ and $q_2$ are non-vanishing, and $U(1)_F$ acts non-trivially. The resulting Goldstone mode is the phase collective coordinate. The moduli space of the domain wall is simply

$$
\mathcal{W}_{\vec g = \vec\alpha_1} \cong \mathbb{R} \times S^1, \tag{2.9}
$$
where the $\mathbb{R}$ factor describes the center of mass of the domain wall and the $S^1$ the phase. One can show that the $S^1$ has radius $2\pi v^2 / T_{\vec g} = 2\pi/(m_1 - m_2)$.

Example 2. $\vec g = \vec\alpha_1 + \vec\alpha_2$. The simplest system admitting multiple domain walls is the abelian $k = 1$ theory with $N = 3$ charged scalars. There are now three vacua, given by $\sigma = m_i$ and $|q_j|^2 = v^2 \delta_{ij}$. In each a $U(1)_{F_1} \times U(1)_{F_2}$ flavor symmetry is unbroken, under which the scalars have charge

$$
U(1)_{F_1} \times U(1)_{F_2}: \qquad \begin{array}{ll} q_1: & (+1, 0) \\ q_2: & (-1, 1) \\ q_3: & (0, -1) \end{array} \tag{2.10}
$$

The first elementary domain wall $\vec g = \vec\alpha_1$ interpolates between vacuum 1 and vacuum 2, breaking $U(1)_{F_1}$ along the way. The second elementary domain wall interpolates between vacuum 2 and vacuum 3, breaking $U(1)_{F_2}$ along the way. Of interest here is the domain wall $\vec g = \vec\alpha_1 + \vec\alpha_2$ interpolating between vacuum 1 and vacuum 3. It can be thought of as a composite of two domain walls. The moduli space for these two domain walls was studied in [17, 23] and is of the form

$$
\mathcal{W}_{\vec g = \vec\alpha_1 + \vec\alpha_2} \cong \mathbb{R} \times \frac{\mathbb{R} \times \tilde{\mathcal{W}}_{\vec\alpha_1 + \vec\alpha_2}}{\mathbb{Z}}. \tag{2.11}
$$
The first factor of $\mathbb{R}$ corresponds to the center of mass of the two domain walls; the second factor corresponds to the combined phase associated to the two domain walls. When the ratio of tensions of the two elementary domain walls $T_{\vec\alpha_1}/T_{\vec\alpha_2}$ is rational, the ratio of the periods of the two phases is similarly rational and the second $\mathbb{R}$ factor collapses to $S^1$, while the quotient $\mathbb{Z}$ reduces to a finite group. The relative moduli space $\tilde{\mathcal{W}}_{\vec\alpha_1 + \vec\alpha_2}$ corresponds to the separation and relative phases of the two domain walls. Importantly, and unlike other solitons of higher co-dimension, the domain walls must obey a strict ordering on the $x^3$ line: the $\vec g = \vec\alpha_1$ domain wall must always be to the left of the $\vec g = \vec\alpha_2$ domain wall. The separation between the walls is therefore the half-line $\mathbb{R}^+$. It was shown in [17] that the relative phase is fibered over $\mathbb{R}^+$ to give rise to a smooth cylinder, with the tip corresponding to coincident domain walls. The resulting moduli space is shown in Fig. 1. Note that the moduli space (2.11) is toric, inheriting two isometries from the $U(1)_{F_1} \times U(1)_{F_2}$ symmetry. In an abelian gauge theory with an arbitrary number of flavors $N$, the domain wall charge is always of the form $\vec g = \sum_i n_i \vec\alpha_i$ with $n_i = 0, 1$, and the moduli space is always toric, meaning that half of the dimensions correspond to $U(1)$ isometries.

Example 3. $\vec g = \vec\alpha_1 + 2\vec\alpha_2 + \vec\alpha_3$. In non-abelian theories, the domain wall moduli spaces are no longer toric. The simplest such theory has a $U(2)$ gauge group with $N = 4$ fundamental scalars. The 6 vacua, and 15 different domain walls, of this theory were detailed in [21]. Under the $U(1)_F^3$ flavor symmetry, the fundamental scalars transform as

$$
U(1)_{F_1} \times U(1)_{F_2} \times U(1)_{F_3}: \qquad \begin{array}{ll} q_1: & (+1, 0, 0) \\ q_2: & (-1, 1, 0) \\ q_3: & (0, -1, 1) \\ q_4: & (0, 0, -1) \end{array} \tag{2.12}
$$

With this convention, the elementary domain wall $\vec g = \vec\alpha_i$ picks up its phase from the action of the $U(1)_{F_i}$ flavor symmetry. Here we concentrate on the domain wall system with the maximal number of zero modes which arises from the choice of vacua $\Xi_- = (1, 2)$ and $\Xi_+ = (3, 4)$ so that $\vec g = \vec\alpha_1 + 2\vec\alpha_2 + \vec\alpha_3$. This system can be separated into four constituent domain walls. As explained in [19, 21], the ordering of domain walls is no longer strictly fixed in this example. The two outer elementary domain walls, on the far left and far right, are each of the type $\vec g = \vec\alpha_2$.
However, the relative positions of the middle two domain walls, $\vec g = \vec\alpha_1$ and $\vec\alpha_3$, are not ordered and they may pass through each other. Unlike the situation for abelian gauge theories, the 8 dimensional domain wall moduli space for this example is no longer toric; $\mathcal{W}_{\vec g}$ inherits only three $U(1)$ isometries from (2.12). Physically this means that the two phases associated to the $\vec\alpha_2$ domain walls are not both Goldstone modes and they may interact as the domain walls approach. This behaviour is familiar from the study of the Atiyah-Hitchin metric describing the dynamics of two monopoles in $SU(2)$ gauge theory; we shall make the analogy more precise in the following.
Fig. 1. The relative moduli space $\tilde{\mathcal{W}}_{\vec\alpha_1 + \vec\alpha_2}$ of two domain walls is a cigar

2.3. The ordering of domain walls. As we stressed in the above examples, in contrast to other solitons domain walls must obey some ordering on the line. This will be an important ingredient when we come to extract domain wall data from the linearized Nahm's equations in Sect. 4. Here we linger to review this ordering. The ordering of the elementary domain walls in non-abelian theories was studied in detail in [19]. One can derive the result by considering the possible sequences of vacua as we move over each domain wall. For example, we could consider the "maximal domain wall", interpolating between $\Xi_- = \{1, 2, \dots, k\}$ and $\Xi_+ = \{N - k + 1, \dots, N\}$. From the left, the first elementary domain wall that we come across must be $\vec g = \vec\alpha_k$, corresponding to $\Xi_- = (1, 2, \dots, k - 1, k) \to (1, 2, \dots, k - 1, k + 1)$. The next elementary domain wall may be either $\vec\alpha_{k-1}$ or $\vec\alpha_{k+1}$. These two walls are free to pass through each other, but cannot move further to the left than the $\vec\alpha_k$ wall. And so on. Iterating this procedure, one finds that two neighbouring elementary domain walls $\vec\alpha_i$ and $\vec\alpha_j$ may pass through each other whenever $\vec\alpha_i \cdot \vec\alpha_j = 0$, but otherwise have a fixed ordering on the line. The net result of this analysis is summarized in Fig. 2. The $x^3$ positions of the domain walls are shown on the vertical axis; the position on the horizontal axis denotes the type of elementary domain wall, starting on the left with $\vec\alpha_1$ and ending on the right with $\vec\alpha_{N-1}$. In summary, we see that for $n_i = n_{i+1} - 1$, the $\vec\alpha_i$ domain walls are trapped between the $\vec\alpha_{i+1}$ domain walls. The reverse holds when $n_i = n_{i+1} + 1$. Finally, when $n_i = n_{i+1}$ the positions of the $\vec\alpha_i$ domain walls are interlaced with those of the $\vec\alpha_{i+1}$ domain walls. The last $\vec\alpha_{i+1}$ domain wall is unconstrained by the $\vec\alpha_i$ walls in its travel in the positive $x^3$ direction, although it may be trapped in turn by an $\vec\alpha_{i+2}$ wall. While we have discussed the maximal domain wall above, other sectors can be reached either by removing some of the outer domain walls to infinity, or by taking non-interacting products of such subsets.
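The sequence-of-vacua argument above lends itself to a brute-force enumeration. The sketch below is illustrative only and is not the method of [19]: it assumes that crossing an elementary wall $\vec\alpha_i$ from left to right replaces flavor $i$ by flavor $i+1$ in the vacuum set, which is allowed only when $i$ is occupied and $i+1$ is not. The function names are hypothetical.

```python
def elementary_moves(vacuum, N):
    """Yield (i, next_vacuum) for each elementary wall alpha_i that can follow `vacuum`."""
    occupied = set(vacuum)
    for i in sorted(occupied):
        if i < N and (i + 1) not in occupied:
            yield i, tuple(sorted((occupied - {i}) | {i + 1}))


def wall_orderings(xi_minus, xi_plus, N):
    """All left-to-right sequences of elementary wall types between two vacua.

    Each move raises one flavor label by one, so the recursion terminates.
    Vacua that cannot reach the target contribute nothing.
    """
    start, target = tuple(sorted(xi_minus)), tuple(sorted(xi_plus))
    if start == target:
        return [()]
    orderings = []
    for i, next_vacuum in elementary_moves(start, N):
        for rest in wall_orderings(next_vacuum, target, N):
            orderings.append((i,) + rest)
    return orderings


if __name__ == "__main__":
    # Maximal wall of the U(2) theory with N = 4 flavors.
    print(wall_orderings((1, 2), (3, 4), N=4))
    # Abelian theory with N = 4: a single, strictly ordered sequence.
    print(wall_orderings((1,), (4,), N=4))
```

For the maximal wall with $k = 2$, $N = 4$ the two sequences found both start and end with $\vec\alpha_2$ and differ only in the order of the commuting pair $\vec\alpha_1$, $\vec\alpha_3$, in line with the rule $\vec\alpha_i \cdot \vec\alpha_j = 0$ quoted above.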
It's important to note that labelling a topological sector $\vec g$ does not necessarily determine the ordering of domain walls³. We shall show that domain walls with the same $\vec g$, but different orderings, descend from different submanifolds of the same monopole moduli space.

3. Monopoles

The main goal of this paper is to show how the moduli space of domain walls introduced in the previous section is isomorphic to a submanifold of a related monopole moduli space. In this section we review several relevant aspects of these monopole moduli spaces. It will turn out that the domain walls of Sect. 2 are related to monopoles in an $SU(N)$ gauge theory. Note that the flavor group from Sect. 2 has been promoted to a gauge group; we shall see the reason behind this in Sect. 5. The Bogomol'nyi monopole equations are

$$
B_\mu = D_\mu \phi, \tag{3.1}
$$
where $B_\mu$, $\mu = 1, 2, 3$ is the $SU(N)$ magnetic field and $\phi$ is an adjoint valued real scalar field. The monopoles exist only if $\phi$ takes a vacuum expectation value,

$$
\langle \phi \rangle = \mathrm{diag}(m_1, \dots, m_N), \tag{3.2}
$$
where we take $m_i \neq m_j$ for $i \neq j$, ensuring breaking to the maximal torus, $SU(N) \to U(1)^{N-1}$. It is no coincidence that we've denoted the vacuum expectation values by $m_i$, the same notation used for the masses in Sect. 2; it is for this choice of vacuum that the correspondence holds. (Specifically, the masses of the kinks will coincide with the masses of monopoles, ensuring that the asymptotic metrics on $\mathcal{W}_{\vec g}$ and $\mathcal{M}_{\vec g}$ also coincide).

³ An example: the $\vec g = \vec\alpha_1 + \vec\alpha_2 + \vec\alpha_3$ domain wall. In the abelian theory with $k = 1$ and $N = 4$, the ordering is $\vec\alpha_1 < \vec\alpha_2 < \vec\alpha_3$. However, in the non-abelian theory with $k = 2$ and $N = 4$, the ordering is $\vec\alpha_1, \vec\alpha_3 < \vec\alpha_2$.

Fig. 2. The ordering for the maximal domain wall. The $x^3$ spatial direction is shown horizontally. The position in the vertical direction denotes the type of domain wall. Domain walls of neighbouring types have their positions interlaced

As described long ago by Goddard, Nuyts and Olive [24], the allowed magnetic charges under each unbroken $U(1)^{N-1}$ are specified by a root vector⁴ of $su(N)$, $\vec g = (p_1, \dots, p_N)$. It is customary to decompose this in terms of simple roots $\vec\alpha_i$,

$$
\vec g = \sum_{i=1}^{N-1} n_i\, \vec\alpha_i, \qquad n_i \in \mathbb{Z}. \tag{3.3}
$$
Once again, the notation is identical to that used for domain walls (2.7) for good reason. Solutions to the monopole equations (3.1) exist for all values of $n_i \ge 0$. This is in contrast to domain walls where, as we have seen, configurations only exist in a finite number of sectors defined by $p_i = 0$ or $p_i = \pm 1$. The mass of the magnetic monopole is $M_{\rm mono} = (2\pi/e^2)\, \vec m \cdot \vec g$. The monopole moduli space $\mathcal{M}_{\vec g}$ is the space of solutions to (3.1) in a fixed topological sector $\vec g$. The dimension of this space, equal to the number of zero modes of a given solution, was computed by E. Weinberg in [25] using Callias' version of the index theorem. The result is:

$$
\dim \left( \mathcal{M}_{\vec g} \right) = 4 \sum_{i=1}^{N-1} n_i, \tag{3.4}
$$
which is to be compared with (2.8).

⁴ We ignore the factor of 2 difference between roots and co-roots. For simply laced groups, such as $SU(N)$, it can be absorbed into convention.

3.1. The relationship between $\mathcal{M}_{\vec g}$ and $\mathcal{W}_{\vec g}$. We are now in a position to describe the relationship between the moduli space of domain walls $\mathcal{W}_{\vec g}$ and the moduli space of magnetic monopoles $\mathcal{M}_{\vec g}$. We will show that $\mathcal{W}_{\vec g}$ is a complex, middle dimensional, submanifold of $\mathcal{M}_{\vec g}$, defined by the fixed point set of the action rotating the monopoles in a plane, together with a suitable gauge action. To do this, we first need to describe the symmetries of $\mathcal{M}_{\vec g}$. The monopole moduli space $\mathcal{M}_{\vec g}$ admits a natural, smooth, hyperKähler metric [26, 27]. For generic $\vec g$ this metric enjoys $(N-1)$ tri-holomorphic isometries arising from the action of the $U(1)^{N-1}$ abelian gauge group. Further the metric has an $SU(2)_R$ symmetry, arising from rotations of the monopoles in $\mathbb{R}^3$, which acts on the three complex structures of $\mathcal{M}_{\vec g}$. In other words, any $U(1)_R \subset SU(2)_R$ is a holomorphic isometry, preserving a single complex structure while revolving the remaining two. Let us choose $U(1)_R$ to rotate the monopoles in the $(x^2 - x^3)$ plane. In what follows we will be interested in a specific holomorphic $\hat U(1)$ action which acts simultaneously by a $U(1)_R$ rotation and a linear combination of the gauge rotations $U(1)^{N-1}$ (to be specified presently). We denote the Killing vector on $\mathcal{M}_{\vec g}$ associated to $\hat U(1)$ as $\hat k$. We claim

$$
\mathcal{W}_{\vec g} = \mathcal{M}_{\vec g} \big|_{\hat k = 0}. \tag{3.5}
$$

This result holds at the level of topology and asymptotic metric of the spaces. The manifold $\mathcal{W}_{\vec g}$ inherits a metric from $\mathcal{M}_{\vec g}$ by this reduction: it does not coincide with the domain wall metric in the interior of $\mathcal{W}_{\vec g}$. (For example, corrections to the asymptotic metric on $\mathcal{W}_{\vec g}$ are exponentially suppressed while those of $\mathcal{M}_{\vec g}$ have power law behaviour). It would be interesting to examine if $\mathcal{W}_{\vec g}$ inherits the correct Kähler class and/or complex structure from $\mathcal{M}_{\vec g}$. We defer a derivation of (3.5) to the following section, but first present some simple examples.

Example 1. $\vec g = \vec\alpha_1$. Monopoles in $SU(2)$ gauge theories are labelled by a single topological charge $\vec g = n_1 \vec\alpha_1$. For a single monopole ($n_1 = 1$) the moduli space is simply

$$
\mathcal{M}_{\vec g = \vec\alpha_1} \cong \mathbb{R}^3 \times S^1, \tag{3.6}
$$
where the $\mathbb{R}^3$ factor denotes the position of the monopole, while the $S^1$ arises from global gauge transformations under the surviving $U(1)$. The radius of the $S^1$ is $2\pi/(m_1 - m_2)$. In this case the $\hat U(1)$ action coincides with the rotation $U(1)_R$ in the $(x^2 - x^3)$ plane and we have trivially

$$
\mathcal{W}_{\vec\alpha_1} \cong \mathbb{R} \times S^1 \cong \mathcal{M}_{\vec\alpha_1} \big|_{\hat k = 0}. \tag{3.7}
$$

The similarity between the domain wall and monopole moduli spaces for a single soliton was noted by Abraham and Townsend [1]. In both cases, motion in the $S^1$ factor gives rise to dyonic solitons. Note that monopole moduli spaces for charges $\vec g = n_1 \vec\alpha_1$ exist for all $n_1 \in \mathbb{Z}^+$. For example, the $n_1 = 2$ monopole moduli space is home to the famous Atiyah-Hitchin metric [27]. However, there is no domain wall moduli space with this charge in the class of theories we discuss in Sect. 2.

Example 2. $\vec g = \vec\alpha_1 + \vec\alpha_2$. Our second example is the $\vec g = \vec\alpha_1 + \vec\alpha_2$ monopole in $SU(3)$ gauge theories (sometimes referred to as the $(1, 1)$ monopole). The moduli space was determined in [28–30] to be of the form

$$
\mathcal{M}_{\vec g = \vec\alpha_1 + \vec\alpha_2} \cong \mathbb{R}^3 \times \frac{\mathbb{R} \times \tilde{\mathcal{M}}_{\vec\alpha_1 + \vec\alpha_2}}{\mathbb{Z}}, \tag{3.8}
$$

where the relative moduli space $\tilde{\mathcal{M}}_{\vec\alpha_1 + \vec\alpha_2}$ is the Euclidean Taub-NUT space, endowed with the metric

$$
ds^2 = \left( 1 + \frac{1}{r} \right) \left( dr^2 + r^2 d\theta^2 + r^2 \sin^2\theta\, d\phi^2 \right) + \left( 1 + \frac{1}{r} \right)^{-1} \left( d\psi + \cos\theta\, d\phi \right)^2. \tag{3.9}
$$
Here $r$, $\theta$ and $\phi$ are the familiar spherical polar coordinates. The coordinate $\psi$ arises from $U(1)$ gauge transformations. The manifold has an $SU(2)_R \times U(1)$ isometry, of which only $U(1)_R \times U(1)$ is manifest in the above coordinates. The holomorphic $U(1)_R$ isometry acts by rotating the two monopoles: $\phi \to \phi + c$. The tri-holomorphic $U(1)$ isometry changes the relative phase of the monopoles: $\psi \to \psi + c$. Both of these actions have a unique fixed point at $r = 0$, the "nut" of Taub-NUT. However, the combined action with Killing vector $\partial_\psi + \partial_\phi$ has a fixed point along the half-line $\theta = \pi$, with $\psi$ fibered over this line to produce the cigar shown in Fig. 1. This is the relative moduli space $\tilde{\mathcal{W}}_{\vec\alpha_1 + \vec\alpha_2}$. Similar calculations hold for monopoles of charge $\vec g = \sum_{i=1}^{N-1} \vec\alpha_i$, whose dynamics is described by a class of toric hyperKähler metrics, known as the Lee-Weinberg-Yi metrics [31]. Once again, a suitable $S^1$ action on these spaces can be identified such that the fixed points localize on $\mathcal{W}_{\vec g}$, the moduli space of domain walls in $U(1)$ gauge theories with $N$ charged scalars.

Example 3. $\vec g = \vec\alpha_1 + 2\vec\alpha_2 + \vec\alpha_3$. As described in the previous section, the simplest domain wall charge $\vec g = \sum_i n_i \vec\alpha_i$ with some $n_i > 1$ occurs for $\vec g = \vec\alpha_1 + 2\vec\alpha_2 + \vec\alpha_3$, and corresponds to a monopole in an $SU(4)$ gauge theory. No explicit expression for the metric on this monopole moduli space is known although, given the results of [32], such a computation may be feasible. Without an explicit expression for the metric in this, and more complicated examples, we need a more powerful method to describe the moduli space. This is provided by the Nahm construction, which we now review.

3.2. D-branes and Nahm's equations. The moduli space of magnetic monopoles is isomorphic to the moduli space of Nahm data. Here we review the Nahm construction of the monopole moduli space [33] and, in particular, the embedding within the framework of D-branes due to Diaconescu [34].
This will be useful to compare to the domain walls of the next section. In the D-brane construction, Nahm's equations arise as the low-energy description of D-strings suspended between D3-branes [34]. The $SU(N)$ Yang-Mills theory lives on the worldvolume of $N$ D3-branes separated in, say, the $x_6$ direction, with the $i$th D3-brane placed at position $(x_6)_i = m_i$ in accord with the adjoint expectation value (3.2). The monopole of charge $\vec g = \sum_i n_i \vec\alpha_i$ corresponds to suspending $n_i$ D-strings between the $i$th and $(i+1)$th D3-brane. This configuration is shown in Fig. 3.

Fig. 3. The $\vec g = 3\vec\alpha_1 + 2\vec\alpha_2 + 3\vec\alpha_3$ monopole as D-strings stretched between D3-branes

The motion of the D-strings in each segment $m_i \le x_6 \le m_{i+1}$ is governed by four hermitian $n_i \times n_i$ matrices, $X_1$, $X_2$, $X_3$ and $A_6$ subject to the covariant version of Nahm's equations,

$$
\frac{dX_\mu}{dx_6} - i[A_6, X_\mu] - \frac{i}{2} \epsilon_{\mu\nu\rho} [X_\nu, X_\rho] = 0, \qquad m_i \le x_6 \le m_{i+1}, \tag{3.10}
$$

modulo $U(n_i)$ gauge transformations acting on the interval $m_i \le x_6 \le m_{i+1}$, and vanishing at the boundaries. The $X_\mu$ form the triplet representation of the $SU(2)_R$ symmetry which rotates monopoles in $\mathbb{R}^3$. The $U(1)^{N-1}$ surviving gauge transformations act on the Nahm data by constant shifts of the $(N-1)$ "Wilson lines" $A_6 \to A_6 + c\, 1_{n_i}$. The interactions between neighbouring segments depend on the relative size of the matrices. There are three possibilities, given by [35]:

i) $n_i = n_{i+1}$: In this case the $U(n_i)$ gauge symmetry is extended to the interval $m_i \le x_6 \le m_{i+2}$ and an impurity is added to the right-hand-side of Nahm's equations, which now read

$$
\frac{dX_\mu}{dx_6} - i[A_6, X_\mu] - \frac{i}{2} \epsilon_{\mu\nu\rho} [X_\nu, X_\rho] = \omega_\alpha\, \sigma_\mu^{\alpha\beta}\, \omega_\beta^\dagger\, \delta(x_6 - m_{i+1}). \tag{3.11}
$$

Here $\sigma_\mu$ are the Pauli matrices. The impurity degrees of freedom lie in the complex 2-vector, $\omega_\alpha = (\psi, \tilde\psi^\dagger)$ which is a doublet under the $SU(2)_R$ symmetry. Both $\psi$ and $\tilde\psi^\dagger$ are themselves complex $n_i$ vectors, transforming in the fundamental representation of the $U(n_i)$ gauge group. The combination $\omega_\alpha \sigma_\mu^{\alpha\beta} \omega_\beta^\dagger$ is thus an $n_i \times n_i$ matrix, transforming in the adjoint representation of the gauge group. The $\omega_\alpha$ fields can be thought of as a hypermultiplet arising from D1-D3 strings [36–38].

ii) $n_i = n_{i+1} - 1$: In this case $X_\mu \to (X_\mu)_-$, a set of three $n_i \times n_i$ matrices, as $x_6 \to (m_i)_-$ from the left. To the right of $m_i$, the $X_\mu$ are $(n_i + 1) \times (n_i + 1)$ matrices which must obey
$$
X_\mu \to \begin{pmatrix} y_\mu & a_\mu^\dagger \\ a_\mu & (X_\mu)_- \end{pmatrix} \quad \text{as } x_6 \to (m_i)_+, \tag{3.12}
$$

where $y_\mu \in \mathbb{R}$ and each $a_\mu$ is a complex $n_i$-vector.

iii) $n_i \le n_{i+1} - 2$: Once again we take $X_\mu \to (X_\mu)_-$ as $x_6 \to (m_i)_-$ but, from the other side, the matrices $X_\mu$ now have a simple pole at the boundary,

$$
X_\mu \to \begin{pmatrix} J_\mu/(x_6 - m_i) + Y_\mu & 0 \\ 0 & (X_\mu)_- \end{pmatrix} \quad \text{as } x_6 \to (m_i)_+, \tag{3.13}
$$

where $J_\mu$ is the irreducible $(n_{i+1} - n_i) \times (n_{i+1} - n_i)$ representation of $su(2)$, and $Y_\mu$ are now constant $(n_{i+1} - n_i) \times (n_{i+1} - n_i)$ matrices.

Case 2 above is usually described as a subset of Case 3 (with the one-dimensional irreducible $su(2)$ representation given by $J_\mu = 0$). Here we have listed Case 2 separately since when we come to describe a similar construction for domain walls, only Case 1 and 2 above will appear. The conditions for $n_i < n_{i+1}$ were derived in [39] by starting with the impurity data (3.11) and taking several monopoles to infinity. Obviously, for $n_i > n_{i+1}$, one imposes the same boundary conditions described above, only flipped in the $x_6$ direction. The space of solutions to Nahm's equations, subject to the boundary conditions detailed above, is isomorphic to the monopole moduli space $\mathcal{M}_{\vec g}$. Moreover, there exists a natural hyperKähler metric on the solutions to Nahm's equations which can be shown to coincide with the Manton metric on the monopole moduli space. For the $\vec g = \vec\alpha_1 + \vec\alpha_2$ monopole, the metric on the associated Nahm data was computed in [28] and shown to give rise to the Euclidean Taub-NUT metric (3.9). For the $\vec g = \sum_i \vec\alpha_i$ monopoles, the corresponding computation was performed in [40], resulting in the Lee-Weinberg-Yi metrics [31].
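The case distinction i)-iii) depends only on the numbers of D-strings on either side of a D3-brane, so it can be tabulated mechanically. The following is a small sketch with hypothetical names (`nahm_boundary_type`, `boundary_types`); it encodes nothing beyond the three cases listed above and the remark that $n_i > n_{i+1}$ is the mirror image of $n_i < n_{i+1}$.

```python
def nahm_boundary_type(n_left, n_right):
    """Classify the matching condition between two neighbouring segments.

    Returns (kind, size), where size = |n_left - n_right| is the dimension of the
    extra block on the larger side:
      "impurity": case i), delta-function source as in (3.11)
      "jump":     case ii), finite block form (3.12)
      "pole":     case iii), su(2) pole of dimension `size` as in (3.13)
    """
    if n_left < 0 or n_right < 0:
        raise ValueError("numbers of D-strings must be non-negative")
    size = abs(n_left - n_right)
    if size == 0:
        return ("impurity", 0)
    if size == 1:
        return ("jump", 1)
    return ("pole", size)


def boundary_types(n):
    """Matching conditions at the interior D3-branes for charges n = (n_1, ..., n_{N-1})."""
    return [nahm_boundary_type(a, b) for a, b in zip(n, n[1:])]


if __name__ == "__main__":
    print(boundary_types([3, 2, 3]))   # the configuration of Fig. 3
    print(boundary_types([1, 3, 3]))
```

Since the admissible domain wall charges have $p_i = 0$ or $\pm 1$, neighbouring $n_i$ differ by at most one there, which is why only the first two kinds can occur for domain walls.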
4. D-Branes and Domain Walls

In this section we would like to realize the domain walls that we described in Sect. 2 on the worldvolume of D-branes, mimicking Diaconescu's construction for monopoles. From the resulting D-brane set-up we shall read off the world-volume dynamics of the domain walls to find that they are described by a truncated version of Nahm's equations (3.10). Nahm's equations have also arisen as a description of domain walls in $\mathcal{N} = 1$ theories [41], although the relationship, if any, with the current work is unclear. Domain walls of the type described in Sect. 2 were previously embedded in D-branes in [42, 43] and several properties of the solitons were extracted (see in particular the latter reference). However, the worldvolume dynamics of the walls is difficult to determine in these set-ups and the relationship to magnetic monopoles obscured.

We start by constructing the theory with eight supercharges on the worldvolume of D-branes [36]. For definiteness we choose to build the $\mathcal{N} = 2$, $d = 3 + 1$ theory in IIA string theory although, by T-duality, we could equivalently work with any spacetime dimension⁵. The construction is well known and is drawn in Fig. 4. We suspend $k$ D4-branes between two NS5-branes, and insert a further $N$ D6-branes to play the role of the fundamental hypermultiplets. The worldvolume dimensions of the branes are

NS5 : 012345,
D4 : 01236,
D6 : 0123789.

The gauge coupling $e^2$ and the Higgs vev $v^2$ are encoded in the separation of the NS5-branes in the $x_6$ and $x_9$ directions respectively, while the masses $m_i$ are determined by the positions of the D6-branes in the $x_4$ direction (we choose the D6-branes to be coincident in the $x_5$ direction, corresponding to choosing all masses to be real),

$$
\frac{1}{e^2} \sim \left. \frac{\Delta x_6}{l_s g_s} \right|_{NS5}, \qquad v^2 \sim \left. \frac{\Delta x_9}{l_s^3 g_s} \right|_{NS5}, \qquad m_i \sim - \left. \frac{x_4}{l_s^2} \right|_{D6_i}. \tag{4.1}
$$

After turning on the Higgs vev $v^2$, the D4-branes must split on the D6-branes in order to preserve supersymmetry.
The S-rule [36] ensures that each D6-brane may play host to only a single D4-brane. In this manner a vacuum of the theory is chosen by picking $k$ out of the $N$ D6-branes on which the D4-branes end, in agreement with Eq. (2.1). The domain walls correspond to a configuration of D4-branes which start life at $x^3 = -\infty$ in a vacuum configuration $\Xi_-$, and end up at $x^3 = +\infty$ in a distinct vacuum $\Xi_+$. As is clear from Fig. 4, as D4-branes walls interpolate in $x^1$, they must also move in both the $x^4$ direction and the $x^9$ direction [45]. The NS-branes and D6-branes are linked, meaning that a D4-brane is either created or destroyed as they pass the NS5-branes in the $x^6$ direction [36]. In the domain wall background, which of these possibilities occurs differs if we move the D6-branes to the left or right since the D4-brane charge varies from one end of the domain wall to the other.

Fig. 4. The D-brane set-up for the $U(1)$ gauge theory with $N = 3$ flavors. The vacuum is shown on the left; the elementary domain wall $\vec g = \vec\alpha_1$ on the right

As it stands, it is difficult to read off the dynamics of the D4-branes in Fig. 4. However, we can make progress by taking the $e^2 \to \infty$ limit, in which the two NS5-branes become coincident in the $x^6$ direction. After rotating our viewpoint, the system of branes now looks like the ladder configuration shown in Fig. 5 (note that we have also rotated the branes relative to Fig. 4, so the horizontal is the $x^4$ direction). We are left with a series of D4-branes, now with worldvolume 02349, stretched between $N$ D6-branes, while simultaneously sandwiched between two NS5-branes. Following these manoeuvres, one finds that the domain wall $\vec g = \sum_i n_i \vec\alpha_i$ results in $n_i$ D4-branes stretched between the $i$th and $(i+1)$th D6-branes (counting from the top, since we have chosen the ordering $m_i > m_{i+1}$).

⁵ In fact, as explained in [44], the overall $U(1) \subset U(k)$ is decoupled in the IIA brane set-up after lifting to M-theory. This effect will not concern us here.

Fig. 5. The D-brane set-up for the $U(2)$ gauge theory with $N = 6$ flavors. The maximal $\vec g = \vec\alpha_1 + \vec\alpha_5 + 2(\vec\alpha_2 + \vec\alpha_3 + \vec\alpha_4)$ domain wall is shown

It may be worth describing how the domain wall charges arise directly in the set-up of Fig. 5. We start in a chosen vacuum $\Xi_-$, denoted by placing $k$ pairs of white dots on $N$ distinct D6-branes, as shown in the figure. A domain wall arises every time a pair of dots proceeds to another D6-brane, dragging a D4-brane behind it like clingwrap. The S-rule translates to the fact that two pairs of dots may not simultaneously lie on the same D6-brane. The final vacuum $\Xi_+$ is denoted by the black dots in the figure and the domain wall charges $n_i$ are given by the number of times a D4-brane has been pulled between the $i$th and $(i+1)$th D6-branes.

4.1. Domain wall dynamics. We are now in a position to read off the dynamics of the domain walls. In the absence of the NS5-branes, the D4-branes would stretch to infinity in the $x_9$ direction, and the resulting D-brane set-up in Fig. 5 is T-dual to the monopoles in Fig. 3. The presence of the NS5-branes projects out half the degrees of freedom of the monopoles, leaving a simple linear set of equations. In each segment $m_i \le x_4 \le m_{i+1}$ the domain walls are described by two $n_i \times n_i$ matrices $X_3$ and $A_4$ satisfying

$$
\frac{dX_3}{dx_4} - i[A_4, X_3] = 0 \tag{4.2}
$$
modulo $U(n_i)$ gauge transformations acting on the interval $m_i \le x_4 \le m_{i+1}$, and vanishing at the boundaries. As in the case of monopoles, the interactions between neighbouring segments depend on the relative size of the matrices. The two possibilities are:

i) $n_i = n_{i+1}$: Again, the $U(n_i)$ gauge symmetry is extended to the interval $m_i \le x_4 \le m_{i+2}$ and an impurity is added to the right-hand-side of Nahm's equations, which now read

$$
\frac{dX_3}{dx_4} - i[A_4, X_3] = \pm \psi \psi^\dagger\, \delta(x_4 - m_{i+1}), \tag{4.3}
$$
where the impurity degree of freedom $\psi$ transforms in the fundamental representation of the $U(n_i)$ gauge group, ensuring the combination $\psi \psi^\dagger$ is an $n_i \times n_i$ matrix transforming, like $X_3$, in the adjoint representation. These $\psi$ degrees of freedom are chiral multiplets which survive the NS5-brane projection. We shall see shortly that the choice of $\pm$ sign will dictate the relative ordering of the domain walls along the $x_3$ direction.

ii) $n_i = n_{i+1} - 1$: In this case $X_3 \to (X_3)_-$, an $n_i \times n_i$ matrix, as $x_4 \to (m_i)_-$ from the left. To the right of $m_i$, $X_3$ is a $(n_i + 1) \times (n_i + 1)$ matrix obeying
y a† as x4 → (m i )+ , X3 → (4.4) a (X )− where yµ ∈ R and each aµ is a complex n i -vector. The obvious analog of this boundary condition holds when n i = n i+1 + 1. These boundary conditions obviously descend from the original Nahm boundary conditions for monopoles. Just as the space of Nahm data is isomorphic to the moduli space of magnetic monopoles, we conjecture that the moduli space of linearized Nahm data described above is isomorphic to the moduli space of domain walls. We shall shortly show that it indeed captures the most relevant aspect of domain walls: their ordering. In fact, the linearized Nahm equations (4.2) are rather trivial to solve. We first employ the i U (n i ) gauge transformations to make A4 (x4 ) a constant in each interval m i ≤ x4 ≤ m i+1 . This can be achieved by first diagonalizing A4 , and subsequently acting with the U (1)n i transformation A4 → A4 − ∂4 α where, in each segment m i ≤ x4 ≤ m i+1 , α is given by m x4 i+1 m i − x4 α(x4 ) = A4 (x4 ) d x4 − A4 (x4 ) d x4 , (4.5) m i − m i+1 mi
mi
which has the property that $\alpha(m_i) = \alpha(m_{i+1}) = 0$. Further gauge transformations with non-zero winding on the interval ensure that $A_4$ is periodic, with each eigenvalue lying in $A_4 \in [0, 2\pi/(m_i - m_{i+1}))$. These $N - 1$ "Wilson lines" will play the role of the phases associated with the domain wall system. Note that when $n_i = n_{i+1}$, the above choice of gauge leaves a residual $U(n_i)$ gauge symmetry acting only on the chiral impurity $\psi$. In this gauge we can now easily integrate (4.2) in each interval,

$$X_3(x_4) = e^{iA_4 x_4}\,\hat{X}_3\,e^{-iA_4 x_4}, \tag{4.6}$$

where the eigenvalues of $X_3$ are independent of $x_4$ in each interval. We identify these $n_i$ eigenvalues with the positions of the $n_i$ $\alpha_i$ elementary domain walls.
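The solution (4.6) can be checked numerically. The following is a minimal sketch, assuming that inside one interval (4.2) reads $dX_3/dx_4 - i[A_4, X_3] = 0$ (the left-hand side of (4.3) with no impurity) and that $A_4$ has already been gauged to a constant diagonal matrix. The matrix size, the random data and all names are illustrative assumptions, not part of the original text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4                                   # illustrative matrix size n_i (assumption)
a = rng.uniform(0.0, 1.0, size=n)       # eigenvalues of the constant, diagonal A_4
A4 = np.diag(a)

# Random hermitian matrix playing the role of \hat{X}_3 in (4.6).
M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
X_hat = (M + M.conj().T) / 2


def X3(x4):
    """X_3(x_4) = exp(i A_4 x_4) X_hat exp(-i A_4 x_4) for diagonal A_4."""
    U = np.diag(np.exp(1j * a * x4))
    return U @ X_hat @ U.conj().T


def residual(x4, h=1e-5):
    """Central-difference estimate of dX_3/dx_4 - i[A_4, X_3]."""
    dX = (X3(x4 + h) - X3(x4 - h)) / (2 * h)
    X = X3(x4)
    return dX - 1j * (A4 @ X - X @ A4)


# The residual should vanish up to discretization error ...
max_residual = max(np.abs(residual(x)).max() for x in (0.0, 0.3, 1.7))
# ... and the spectrum should not depend on x_4.
eig_drift = np.abs(np.linalg.eigvalsh(X3(1.7)) - np.linalg.eigvalsh(X_hat)).max()
```

Because $X_3(x_4)$ is a unitary conjugation of $\hat{X}_3$, the eigenvalue drift is zero up to rounding, which is the statement that the wall positions are constant within each interval.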
We are now in a position to derive the linearized Nahm equations (4.2) from the original Nahm equations (3.10) in terms of a fixed point set of a $\hat{U}(1)$ action. Consider first the action of the $U(1)_R \subset SU(2)_R$ isometry on the Nahm data, which rotates $X_1$ and $X_2$ while leaving $X_3$ fixed. This rotation also acts on the impurity $\omega = (\psi, \tilde\psi^\dagger)$ by $(\psi, \tilde\psi) \to e^{i\alpha}(\psi, \tilde\psi)$. To retain half of the impurities for the domain wall equations (4.3), we need to compensate for this transformation with the residual $U(1) \subset U(n_i)$ transformation acting on the appropriate impurity $\omega$ by $\omega \to e^{i\beta}\omega$. By choosing $\beta = \pm\alpha$ we can pick a $\hat{U}(1)$ action which leaves either the $\psi$ or the $\tilde\psi$ impurity invariant. Which we choose to save is correlated with the choice of sign in (4.3) which, in turn, dictates the ordering of neighbouring domain walls as we shall now demonstrate.

To summarize, we have shown that the description of domain wall dynamics (4.2) arises from the fixed point of a $\hat{U}(1)$ action on the original Nahm equations (3.10). This action descends to a $\hat{U}(1)$ isometry on the monopole moduli space $\mathcal{M}_g$, the fixed points of which coincide with the domain wall moduli space $\mathcal{W}_g$. A physical explanation for this correspondence follows along the lines of [3]: in theories in the Higgs phase, confined magnetic monopoles with charge $g$ exist, emitting $k$ multiple vortex strings. When these vortex strings coincide, the worldvolume theory is of the form described in Sect. 2 [14] and the monopoles appear as charge $g$ kinks.

4.2. The ordering of domain walls revisited.

As explained in Sect. 2, in contrast to monopoles, domain walls must satisfy a specific ordering on the $x^3$ line. We will now show that this ordering is encoded in the boundary conditions described above. Suppose first that $n_i = n_{i+1}$. The positions of the $\alpha_i$ domain walls are given by the eigenvalues of $X_3$ restricted to the interval $m_i \le x_4 \le m_{i+1}$. Let us denote this matrix as $X_3^{(i)}$ and the eigenvalues as $\lambda_m^{(i)}$, where $m = 1, \ldots, n_i$. The impurity (4.3) relates the two sets of eigenvalues by the jumping condition

$$X_3^{(i+1)} = X_3^{(i)} + \psi\psi^\dagger, \tag{4.7}$$
where we have chosen the positive sign for definiteness. However, from the discussion in Sect. 2 (see, in particular, Fig. 3) we know that the domain walls cannot have arbitrary position but must be interlaced,

$$\lambda_1^{(i)} \le \lambda_1^{(i+1)} \le \lambda_2^{(i)} \le \cdots \le \lambda_{n_i-1}^{(i+1)} \le \lambda_{n_i}^{(i)} \le \lambda_{n_i}^{(i+1)}. \tag{4.8}$$
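As a quick numerical illustration of the interlacing pattern (4.8) for the jumping condition (4.7), one can add a rank-one positive matrix $\psi\psi^\dagger$ to a random hermitian matrix and compare the sorted spectra. This is only a sketch: the matrix size, the random seed and the helper name `interlaces_upward` are assumptions made for the example.

```python
import numpy as np


def interlaces_upward(lam_old, lam_new, tol=1e-10):
    """True if lam_old[m] <= lam_new[m] <= lam_old[m+1] for ascending arrays."""
    lower_ok = np.all(lam_old <= lam_new + tol)
    upper_ok = np.all(lam_new[:-1] <= lam_old[1:] + tol)
    return bool(lower_ok and upper_ok)


rng = np.random.default_rng(1)
n = 5                                           # illustrative n_i (assumption)

# Random hermitian X_3^{(i)} and a random impurity vector psi.
M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
X_i = (M + M.conj().T) / 2
psi = rng.normal(size=n) + 1j * rng.normal(size=n)

# Jumping condition (4.7) with the positive sign.
X_next = X_i + np.outer(psi, psi.conj())

lam_i = np.linalg.eigvalsh(X_i)                 # ascending eigenvalues
lam_next = np.linalg.eigvalsh(X_next)
ordered = interlaces_upward(lam_i, lam_next)
```

Choosing the negative sign in (4.7) instead reverses the roles of the two spectra in the same check.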
We will now show that the ordering of domain walls (4.8) follows from the impurity jumping condition (4.7). To see this, consider firstly the situation in which $\psi^\dagger\psi \ll \lambda_m^{(i)}$ so that the matrix $\psi\psi^\dagger$ may be treated as a small perturbation of $X_3^{(i)}$. The positivity of $\psi\psi^\dagger$ ensures that each $\lambda_m^{(i+1)} \ge \lambda_m^{(i)}$. Moreover, it is simple to show that the $\lambda_m^{(i+1)}$ increase monotonically with $\psi^\dagger\psi$. This leaves us to consider the other extreme, in which $\psi^\dagger\psi \to \infty$. In this limit $\psi$ becomes one of the eigenvectors of $X_3^{(i+1)}$ with eigenvalue $\lambda_{n_i}^{(i+1)} = \psi^\dagger\psi \to \infty$, which reflects the fact that this limit corresponds to the situation in which the last domain wall is taken to infinity. What we want to show is that the remaining $(n_i - 1)$ $\alpha_{i+1}$ domain walls are trapped between the $n_i$ $\alpha_i$ domain walls as depicted in Fig. 3. Define the $n_i \times n_i$ projection operator

$$P = 1 - \hat\psi\hat\psi^\dagger, \tag{4.9}$$
where $\hat\psi = \psi/\sqrt{\psi^\dagger\psi}$. The positions of the remaining $(n_i - 1)$ $\alpha_{i+1}$ domain walls are given by the (non-zero) eigenvalues of $P X_3^{(i)} P$. We must show that, given a rank $n$ hermitian matrix $X$, the eigenvalues of $PXP$ are trapped between the eigenvalues of $X$. This elementary property of hermitian matrices can be seen as follows:

$$\det(PXP - \mu) = \det(XP - \mu) = \det(X - \mu - X\hat\psi\hat\psi^\dagger) = \det(X - \mu)\,\det\!\left(1 - (X - \mu)^{-1}X\hat\psi\hat\psi^\dagger\right).$$

Since $\hat\psi\hat\psi^\dagger$ is rank one, we can write this as

$$\det(PXP - \mu) = \det(X - \mu)\left[1 - \mathrm{Tr}\!\left((X - \mu)^{-1}X\hat\psi\hat\psi^\dagger\right)\right] = -\mu\,\det(X - \mu)\,\mathrm{Tr}\!\left((X - \mu)^{-1}\hat\psi\hat\psi^\dagger\right) = -\mu\left[\prod_{m=1}^{n}(\lambda_m - \mu)\right]\left[\sum_{m=1}^{n}\frac{|\hat\psi_m|^2}{\lambda_m - \mu}\right], \tag{4.10}$$

where $\hat\psi_m$ is the $m$th component of the vector $\hat\psi$ in the eigenbasis of $X$. We learn that $PXP$ has one zero eigenvalue while, if the eigenvalues $\lambda_m$ of $X$ are distinct, then the eigenvalues of $PXP$ lie at the roots of the function

$$R(\mu) = \sum_{m=1}^{n}\frac{|\hat\psi_m|^2}{\lambda_m - \mu}. \tag{4.11}$$
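The location of the roots can be made explicit: when all $|\hat\psi_m|^2 > 0$, the derivative $R'(\mu) = \sum_m |\hat\psi_m|^2/(\lambda_m - \mu)^2$ is positive, and $R$ runs from $-\infty$ just above $\lambda_m$ to $+\infty$ just below $\lambda_{m+1}$, so each gap contains exactly one root. A small numerical sketch of this statement follows. It assumes a random hermitian $X$ and a random $\psi$, and it obtains the non-zero eigenvalues of $PXP$ by compressing $X$ to the range of $P$; these choices and all names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5                                            # illustrative size (assumption)

M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
X = (M + M.conj().T) / 2                         # random hermitian X
psi = rng.normal(size=n) + 1j * rng.normal(size=n)
psi_hat = psi / np.sqrt(np.vdot(psi, psi).real)  # normalized vector
P = np.eye(n) - np.outer(psi_hat, psi_hat.conj())  # projector (4.9)

lam, V = np.linalg.eigh(X)                       # ascending eigenvalues of X
c2 = np.abs(V.conj().T @ psi_hat) ** 2           # |psi_hat_m|^2 in the eigenbasis of X


def R(mu):
    """The function R(mu) of (4.11)."""
    return np.sum(c2 / (lam - mu))


# Orthonormal basis of the range of P: left singular vectors with singular value 1.
U, s, _ = np.linalg.svd(P)
Q = U[:, : n - 1]
mu = np.linalg.eigvalsh(Q.conj().T @ X @ Q)      # the n-1 non-trivial eigenvalues of PXP

trapped = bool(np.all(lam[:-1] <= mu + 1e-10) and np.all(mu <= lam[1:] + 1e-10))
max_R = max(abs(R(m)) for m in mu)               # should vanish at each eigenvalue
```

Working with the compression $Q^\dagger X Q$ rather than discarding the eigenvalue of $PXP$ closest to zero avoids ambiguity when a genuine eigenvalue happens to be small.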
The roots of $R(\mu)$ indeed lie between the eigenvalues $\lambda_m$. This completes the proof that the impurities (4.3) capture the correct ordering of the domain walls. The same argument shows that the boundary condition (4.4) gives rise to the correct ordering of domain walls when $n_{i+1} = n_i + 1$, with the $\alpha_i$ domain walls interlaced between the $\alpha_{i+1}$ domain walls. Indeed, it is not hard to show that (4.4) arises from (4.3) in the limit that one of the domain walls is taken to infinity.

Acknowledgements. AH is supported in part by the CTP and LNS of MIT, DOE contract #DE-FC02-94ER40818, NSF grant PHY-00-96515, the BSF American-Israeli Bi-national Science Foundation and a DOE OJI Award. DT is supported by the Royal Society.
References

1. Abraham, E.R.C., Townsend, P.K.: Q kinks. Phys. Lett. B 291, 85 (1992); More on Q kinks: A (1+1)-dimensional analog of dyons. Phys. Lett. B 295, 225 (1992)
2. Dorey, N.: The BPS spectra of two-dimensional supersymmetric gauge theories with twisted mass terms. JHEP 9811, 005 (1998); Dorey, N., Hollowood, T.J., Tong, D.: The BPS spectra of gauge theories in two and four dimensions. JHEP 9905, 006 (1999)
3. Tong, D.: Monopoles in the Higgs phase. Phys. Rev. D 69, 065003 (2004)
4. Hindmarsh, M., Kibble, T.W.B.: Beads On Strings. Phys. Rev. Lett. 55, 2398 (1985)
5. Shifman, M., Yung, A.: Non-Abelian string junctions as confined monopoles. Phys. Rev. D 70, 045004 (2004)
6. Hanany, A., Tong, D.: Vortex strings and four-dimensional gauge dynamics. JHEP 0404, 066 (2004)
7. Auzzi, R., Bolognesi, S., Evslin, J.: Monopoles can be confined by 0, 1 or 2 vortices. JHEP 0502, 046 (2005)
8. Kneipp, M.A.C.: Color superconductivity, Z(N) flux tubes and monopole confinement in deformed N = 2* super Yang-Mills theories. Phys. Rev. D 69, 045007 (2004)
9. Auzzi, R., Bolognesi, S., Evslin, J., Konishi, K.: Nonabelian monopoles and the vortices that confine them. Nucl. Phys. B 686, 119 (2004)
10. Markov, V., Marshakov, A., Yung, A.: Non-Abelian vortices in N = 1* gauge theory. Nucl. Phys. B 709, 267 (2005)
11. Gorsky, A., Shifman, M., Yung, A.: Non-Abelian Meissner effect in Yang-Mills theories at weak coupling. Phys. Rev. D 71, 045010 (2005)
12. Mironov, A., Morozov, A., Tomaras, T.N.: On the need for phenomenological theory of P-vortices or does spaghetti confinement pattern admit condensed-matter analogies? J. Exp. Theor. Phys. 101, 331–340 (2005)
13. Bolognesi, S., Evslin, J.: Stable vs unstable vortices in SQCD. JHEP 0603, 023 (2006)
14. Hanany, A., Tong, D.: Vortices, Instantons and Branes. JHEP 0307, 037 (2003)
15. Eto, M., Isozumi, Y., Nitta, M., Ohashi, K., Sakai, N.: Webs of walls. Phys. Rev. D 72, 085004 (2005)
16. Lee, K., Yee, H.U.: New BPS Objects in N = 2 Supersymmetric Gauge Theories. Phys. Rev. D 72, 065623 (2005); Eto, M., Isozumi, Y., Nitta, M., Ohashi, K.: 1/2, 1/4 and 1/8 BPS Equations in SUSY Yang-Mills-Higgs Systems: Field Theoretical Brane Configurations. http://arxiv.org/list/hep-th/0506257, 2005
17. Tong, D.: The moduli space of BPS domain walls. Phys. Rev. D 66, 025013 (2002)
18. Isozumi, Y., Nitta, M., Ohashi, K., Sakai, N.: Construction of non-Abelian walls and their complete moduli space. Phys. Rev. Lett. 93, 161601 (2004); All exact solutions of a 1/4 BPS equation. Phys. Rev. D 71, 065018 (2005)
19. Isozumi, Y., Nitta, M., Ohashi, K., Sakai, N.: Non-Abelian walls in supersymmetric gauge theories. Phys. Rev. D 70, 125014 (2004)
20. Isozumi, Y., Ohashi, K., Sakai, N.: Exact wall solutions in 5-dimensional SUSY QED at finite coupling. JHEP 0311, 060 (2003); Sakai, N., Yang, Y.: Moduli space of BPS walls in supersymmetric gauge theories. http://arxiv.org/list/hep-th/0505136, 2005
21. Sakai, N., Tong, D.: Monopoles, Vortices, Domain Walls and D-Branes: The Rules of Interaction. JHEP 0503, 019 (2005)
22. Lee, K.S.M.: An index theorem for domain walls in supersymmetric gauge theories. Phys. Rev. D 67, 045009 (2003)
23. Tong, D.: Mirror mirror on the wall: On two-dimensional black holes and Liouville theory. JHEP 0304, 031 (2003)
24. Goddard, P., Nuyts, J., Olive, D.I.: Gauge Theories And Magnetic Charge. Nucl. Phys. B 125, 1 (1977)
25. Weinberg, E.J.: Fundamental Monopoles And Multi-Monopole Solutions For Arbitrary Simple Gauge Groups. Nucl. Phys. B 167, 500 (1980)
26. Manton, N.S.: A Remark On The Scattering Of BPS Monopoles. Phys. Lett. B 110, 54 (1982)
27. Atiyah, M., Hitchin, N.: The Geometry and Dynamics of Magnetic Monopoles. Princeton, NJ: Princeton University Press, 1988
28. Connell, S.A.: The Dynamics of the SU(3) Charge (1, 1) Magnetic Monopole. University of South Australia Preprint, 1995
29. Gauntlett, J.P., Lowe, D.A.: Dyons and S-Duality in N=4 Supersymmetric Gauge Theory. Nucl. Phys. B 472, 194 (1996)
30. Lee, K., Weinberg, E., Yi, P.: Electromagnetic Duality and SU(3) Monopoles. Phys. Lett. B 376, 97 (1996)
31. Lee, K., Weinberg, E., Yi, P.: The Moduli Space of Many BPS Monopoles for Arbitrary Gauge Groups. Phys. Rev. D 54, 1633 (1996)
32. Houghton, C., Irwin, P.W., Mountain, A.J.: Two monopoles of one type and one of another. JHEP 9904, 029 (1999)
33. Nahm, W.: A Simple Formalism For The BPS Monopole. Phys. Lett. B 90, 413 (1980)
34. Diaconescu, D.E.: D-branes, monopoles and Nahm equations. Nucl. Phys. B 503, 220 (1997)
35. Hurtubise, J., Murray, M.K.: On The Construction Of Monopoles For The Classical Groups. Commun. Math. Phys. 122, 35 (1989)
36. Hanany, A., Witten, E.: Type IIB superstrings, BPS monopoles, and three-dimensional gauge dynamics. Nucl. Phys. B 492, 152 (1997)
37. Kapustin, A., Sethi, S.: The Higgs branch of impurity theories. Adv. Theor. Math. Phys. 2, 571 (1998)
38. Tsimpis, D.: Nahm equations and boundary conditions. Phys. Lett. B 433, 287 (1998)
39. Chen, X.G., Weinberg, E.J.: ADHMN boundary conditions from removing monopoles. Phys. Rev. D 67, 065020 (2003)
40. Murray, M.K.: A note on the (1, 1,…, 1) monopole metric. J. Geom. Phys. 23, 31 (1997)
41. Bachas, C., Hoppe, J., Pioline, B.: Nahm equations, N = 1* domain walls, and D-strings in AdS_5 × S^5. JHEP 0107, 041 (2001)
42. Lambert, N.D., Tong, D.: Kinky D-strings. Nucl. Phys. B 569, 606 (2000)
43. Eto, M., Isozumi, Y., Nitta, M., Ohashi, K., Ohta, K., Sakai, N.: D-brane construction for non-Abelian walls. Phys. Rev. D 71, 125006 (2005)
44. Witten, E.: Solutions of four-dimensional field theories via M-theory. Nucl. Phys. B 500, 3 (1997)
45. Hanany, A., Hori, K.: Branes and N = 2 theories in two dimensions. Nucl. Phys. B 513, 119 (1998)

Communicated by N.A. Nekrasov
Commun. Math. Phys. 266, 665–697 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0053-x
Communications in
Mathematical Physics
A Microscopic Derivation of the Quantum Mechanical Formal Scattering Cross Section

D. Dürr¹, S. Goldstein², T. Moser¹, N. Zanghì³

¹ Mathematisches Institut der Universität München, Theresienstr. 39, 80333 München, Germany. E-mail: [email protected]; [email protected]
² Department of Mathematics, Rutgers University, New Brunswick, NJ 08903, USA. E-mail: [email protected]
³ Dipartimento di Fisica, Università di Genova, Sezione INFN Genova, Via Dodecaneso 33, 16146 Genova, Italy. E-mail: [email protected]

Received: 1 September 2005 / Accepted: 14 March 2006
Published online: 11 July 2006 – © Springer-Verlag 2006
Abstract: We prove that the empirical distribution of crossings of a "detector" surface by scattered particles converges in appropriate limits to the scattering cross section computed by stationary scattering theory. Our result, which is based on Bohmian mechanics and the flux-across-surfaces theorem, is the first derivation of the cross section starting from first microscopic principles.

1. Introduction

The central quantity in a scattering experiment is the empirical cross section, which reflects the number of particles that are scattered in a given solid angle per unit time. In this paper we shall derive the theoretical prediction for the cross section starting from a microscopic model describing a realistic scattering situation. We confine ourselves to the case of potential scattering of a nonrelativistic, (spinless) quantum particle and leave the many-particle case for future research. This paper is in fact a technical elaboration and continuation of our article "Scattering theory from microscopic first principles" [9].

The common approaches to the foundations of scattering theory take for granted that "an experimentalist generally prepares a state … at t → −∞, and then measures what this state looks like at t → +∞" (cf. [25], p. 113), meaning that the asymptotic expressions are "all there is," as if they are not the asymptotic expressions of some other formula, however complicated, describing the scattering situation as it really is, namely happening at finite distances and at finite times. Thus a truly microscopic derivation starting from first principles must provide firstly a formula for the empirical cross section, which by the law of large numbers approximates its expectation value, and which is computed from the underlying theory. Secondly, that formula should apply to the realistic finite-times and finite-distances situation, from which eventually the usual Born formula should emerge by taking appropriate limits.¹

¹ For a detailed discussion of the scattering regime see [8].
We shall present a Bohmian analysis of the scattering cross section. With a particle trajectory we can ask for example whether or not that trajectory eventually crosses a distant spherical surface and, if it does, when and where it first crosses that surface. Similarly, for a beam of particles we can ask for the number of particles in the beam that first cross the surface in a given solid angle $\Sigma$. From a Bohmian perspective it appears reasonable to identify this number with detection events in a scattering experiment. We thus model in this paper the measured cross section using the number $N(\Sigma)$ of first crossings of $\Sigma$. This will of course depend on many parameters encoding the experimental setup, e.g. the distances $R$ and $L$ of the detector and the particle source from the scattering center, the details of the beam including its profile $A$ and the wave functions of the particles in the beam, as well as on the length of the time interval $\tau$ during which the particles are emitted. We shall show in this paper that when these parameters are suitably scaled, $N(\Sigma)/\tau$ is well approximated by the usual Born formula for the scattering cross section in terms of the $T$-matrix, i.e.,

$$\lim \frac{N(\Sigma)}{\tau} = 16\pi^4 \int_\Sigma |T(k_0\boldsymbol{\omega}, \mathbf{k}_0)|^2\,d\Omega, \tag{1}$$

where $\mathbf{k}_0$ is the initial momentum of the particles.

The paper is organized as follows: We collect first some mathematical notions and facts as well as recent results of scattering theory. In Sect. 3 we define the relevant random variables associated with the surface-crossings of a single particle and relate their distribution to the quantum probability current density. In Sect. 4 we model the beam by a suitable point process and in Sect. 5 we define $N(\Sigma)$ in terms of this point process. A precise description of the limit procedure will be presented in Sect. 6. Our main results, Theorems 1 and 2, are stated in Sect. 7 and are proven in Sect. 8.
2. The Mathematical Framework of Potential Scattering

We list those results of scattering theory (e.g. [2, 7, 11, 14, 16, 18–20, 22]) which are essential for the proof of Theorem 1 and Theorem 2 in Sect. 8. We use the usual description of a nonrelativistic spinless one-particle system by the Hamiltonian $H$ (we use natural units $\hbar = m = 1$),

$$H := -\tfrac{1}{2}\Delta + V(\mathbf{x}) =: H_0 + V(\mathbf{x}),$$

with the real-valued potential $V \in (V)_n$, defined as follows:

Definition 1. $V$ is in $(V)_n$, $n = 2, 3, 4, \ldots$, if
(i) $V \in L^2(\mathbb{R}^3)$,
(ii) $V$ is locally Hölder continuous except, perhaps, at a finite number of singularities,
(iii) there exist positive numbers $\delta$, $C$, $R_0$ such that $|V(\mathbf{x})| \le C\langle x\rangle^{-n-\delta}$ for $|\mathbf{x}| \ge R_0$,
where $\langle\cdot\rangle := (1 + (\cdot)^2)^{1/2}$.
Under these conditions (see e.g. [16]) $H$ is self-adjoint on the domain $D(H) = D(H_0) = \{f \in L^2(\mathbb{R}^3) : \int |k^2\hat f(\mathbf{k})|^2\,d^3k < \infty\}$ ($k = |\mathbf{k}|$), where $\hat f := \mathcal{F}f$ is the Fourier transform

$$\hat f(\mathbf{k}) := (2\pi)^{-3/2}\int e^{-i\mathbf{k}\cdot\mathbf{x}} f(\mathbf{x})\,d^3x. \tag{2}$$

Let $U(t) = e^{-iHt}$. Since $H$ is self-adjoint on the domain $D(H)$, $U(t)$ is a strongly continuous one-parameter unitary group on $L^2(\mathbb{R}^3)$. Let $\phi \in D(H)$. Then $\phi_t \equiv U(t)\phi \in D(H)$ and satisfies the Schrödinger equation

$$i\frac{\partial}{\partial t}\phi_t(\mathbf{x}) = H\phi_t.$$

In a typical scattering experiment the scattered particles move almost freely far away from the scattering center. "Far away" in position space can also be phrased as "long before" and "long after" the scattering event takes place. So for the "scattering states" $\psi$ there are asymptotes $\psi_{\mathrm{in}}$, $\psi_{\mathrm{out}}$ defined by

$$\lim_{t\to-\infty}\left\|e^{-iH_0t}\psi_{\mathrm{in}}(\mathbf{x}) - e^{-iHt}\psi(\mathbf{x})\right\| = 0, \qquad \lim_{t\to\infty}\left\|e^{-iH_0t}\psi_{\mathrm{out}}(\mathbf{x}) - e^{-iHt}\psi(\mathbf{x})\right\| = 0. \tag{3}$$

From this it is natural to define the wave operators $\Omega_\pm : L^2(\mathbb{R}^3) \to \mathrm{Ran}(\Omega_\pm)$ by the strong limits

$$\Omega_\pm := \operatorname*{s-lim}_{t\to\pm\infty} e^{iHt}e^{-iH_0t}. \tag{4}$$
These wave operators map the incoming and outgoing asymptotes to their corresponding scattering states. Ikebe [14] proved that for a potential $V \in (V)_n$ the wave operators exist and have the range $\mathrm{Ran}(\Omega_\pm) = \mathcal{H}_{\mathrm{cont}}(H) = \mathcal{H}_{\mathrm{a.c.}}(H)$. (This property is called asymptotic completeness.) Hence, the scattering states consist of states with absolutely continuous spectrum and the singular continuous spectrum of $H$ is empty. In addition Ikebe [14] showed that the Hamiltonian has no positive eigenvalues. Then we have for every $\psi \in \mathcal{H}_{\mathrm{a.c.}}(H)$ asymptotes $\psi_{\mathrm{in}}, \psi_{\mathrm{out}} \in L^2(\mathbb{R}^3)$ with

$$\Omega_-\psi_{\mathrm{in}} = \psi = \Omega_+\psi_{\mathrm{out}}. \tag{5}$$

On $D(H_0)$ the wave operators satisfy the so-called intertwining property $H\Omega_\pm = \Omega_\pm H_0$, while on $\mathcal{H}_{\mathrm{a.c.}}(H) \cap D(H)$ we have that $H_0\Omega_\pm^{-1} = \Omega_\pm^{-1}H$.

The scattering operator $S : L^2(\mathbb{R}^3) \to L^2(\mathbb{R}^3)$ is given by

$$S := \Omega_+^{-1}\Omega_-, \tag{6}$$

while using the identity $I$, the $T$-operator is given by

$$T := S - I. \tag{7}$$
If the system is asymptotically complete, the ranges of the wave operators are equal and thus $S$ is unitary. Since the wave operator maps a scattering state onto its asymptotic state, the scattering operator maps the incoming asymptote $\psi_{\mathrm{in}}$ onto the corresponding out state $\psi_{\mathrm{out}}$. The formula for the $T$-matrix, which holds in the $L^2$-sense, is given by (see e.g., Theorem XI.42 in [19])

$$\widehat{Tg}(\mathbf{k}) = -2\pi i\int_{k'=k} T(\mathbf{k}, \mathbf{k}')\,\hat g(\mathbf{k}')\,k'\,d\Omega', \tag{8}$$

for $g \in \mathcal{S}(\mathbb{R}^3)$ (Schwartz space) such that $\hat g$ has support in a spherical shell.² $T(\mathbf{k}, \mathbf{k}')$ is given by (see e.g., [19]):

$$T(\mathbf{k}, \mathbf{k}') = (2\pi)^{-3}\int e^{-i\mathbf{k}\cdot\mathbf{x}}\,V(\mathbf{x})\,\varphi_-(\mathbf{x}, \mathbf{k}')\,d^3x, \tag{9}$$

where $\varphi_-$ (as well as $\varphi_+$) are eigenfunctions of $H$ defined by Proposition 1 below. Since the eigenfunctions $\varphi_\pm$ are bounded and continuous (cf. Proposition 2), we can conclude that $T(\mathbf{k}, \mathbf{k}')$ is bounded and continuous on $\mathbb{R}^3 \times \mathbb{R}^3$, if the potential is in $(V)_3$. Then the formula (8) can be proved for $g \in \mathcal{S}(\mathbb{R}^3)$ without any restriction on the momentum support by the same method as in [19].

² In [19] Equation (8) was proven outside an "exceptional set". For our class of potentials the "exceptional set" is empty. The additional factor $\tfrac{1}{2}$ in [19] comes from the different definition of $H_0$.

We will need the time evolution of a state $\psi \in \mathcal{H}_{\mathrm{a.c.}}(H)$ with the Hamiltonian $H$. Its diagonalization on $\mathcal{H}_{\mathrm{a.c.}}(H)$ is given by the eigenfunctions $\varphi_\pm$:

$$\left(-\tfrac{1}{2}\Delta + V(\mathbf{x})\right)\varphi_\pm(\mathbf{x}, \mathbf{k}) = \frac{k^2}{2}\,\varphi_\pm(\mathbf{x}, \mathbf{k}). \tag{10}$$

Inverting $\left(-\tfrac{1}{2}\Delta - \tfrac{k^2}{2}\right)$ one obtains the Lippmann-Schwinger equation. We recall the main parts of a result on this due to Ikebe in [14] which is collected in the present form in [22].

Proposition 1. Let $V \in (V)_2$. Then for any $\mathbf{k} \in \mathbb{R}^3\setminus\{0\}$ there are unique solutions $\varphi_\pm(\cdot, \mathbf{k}) : \mathbb{R}^3 \to \mathbb{C}$ of the Lippmann-Schwinger equations

$$\varphi_\pm(\mathbf{x}, \mathbf{k}) = e^{i\mathbf{k}\cdot\mathbf{x}} - \frac{1}{2\pi}\int \frac{e^{\mp ik|\mathbf{x}-\mathbf{x}'|}}{|\mathbf{x}-\mathbf{x}'|}\,V(\mathbf{x}')\,\varphi_\pm(\mathbf{x}', \mathbf{k})\,d^3x', \tag{11}$$

which satisfy the boundary conditions $\lim_{|\mathbf{x}|\to\infty}(\varphi_\pm(\mathbf{x}, \mathbf{k}) - e^{i\mathbf{k}\cdot\mathbf{x}}) = 0$, which are also classical solutions of the stationary Schrödinger equation (10), and are such that:

(i) For any $f \in L^2(\mathbb{R}^3)$ the generalized Fourier transforms³

$$(\mathcal{F}_\pm f)(\mathbf{k}) = \frac{1}{(2\pi)^{3/2}}\ \mathrm{l.i.m.}\int \varphi_\pm^*(\mathbf{x}, \mathbf{k})\,f(\mathbf{x})\,d^3x$$

exist in $L^2(\mathbb{R}^3)$.

³ l.i.m. $\int$ is a shorthand notation for $\operatorname{s-lim}_{R\to\infty}\int_{B_R}$, where s-lim denotes the limit in the $L^2$-norm and $B_R$ a ball with radius $R$ around the origin.
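(ii) $\mathrm{Ran}(\mathcal{F}_\pm) = L^2(\mathbb{R}^3)$. Moreover $\mathcal{F}_\pm : \mathcal{H}_{\mathrm{a.c.}}(H) \to L^2(\mathbb{R}^3)$ are unitary and the inverses of these unitaries are given by

$$(\mathcal{F}_\pm^{-1}f)(\mathbf{x}) = \frac{1}{(2\pi)^{3/2}}\ \mathrm{l.i.m.}\int \varphi_\pm(\mathbf{x}, \mathbf{k})\,f(\mathbf{k})\,d^3k.$$

(iii) For any $f \in L^2(\mathbb{R}^3)$ the relations $\Omega_\pm f = \mathcal{F}_\pm^{-1}\mathcal{F}f$ hold, where $\mathcal{F}$ is the ordinary Fourier transform given by (2).

(iv) For any $f \in D(H) \cap \mathcal{H}_{\mathrm{a.c.}}(H)$ we have:

$$Hf(\mathbf{x}) = \left(\mathcal{F}_\pm^{-1}\,\frac{k^2}{2}\,\mathcal{F}_\pm f\right)(\mathbf{x}),$$

and therefore for any $f \in \mathcal{H}_{\mathrm{a.c.}}(H)$,

$$e^{-iHt}f(\mathbf{x}) = \left(\mathcal{F}_\pm^{-1}\,e^{-i\frac{k^2}{2}t}\,\mathcal{F}_\pm f\right)(\mathbf{x}).$$

In order to apply stationary phase methods we will need estimates on the derivatives of the generalized eigenfunctions:

Proposition 2. Let $V \in (V)_n$ for some $n \ge 3$. Then:

(i) $\varphi_\pm(\mathbf{x}, \cdot) \in C^{n-2}(\mathbb{R}^3\setminus\{0\})$ for all $\mathbf{x} \in \mathbb{R}^3$ and the partial derivatives⁴ $\partial_{\mathbf{k}}^\alpha\varphi_\pm(\mathbf{x}, \mathbf{k})$, $|\alpha| \le n - 2$, are continuous with respect to $\mathbf{x}$ and $\mathbf{k}$.

If, in addition, zero is neither an eigenvalue nor a resonance of $H$, then

(ii) $\sup_{\mathbf{x}\in\mathbb{R}^3,\ \mathbf{k}\in\mathbb{R}^3} |\varphi_\pm(\mathbf{x}, \mathbf{k})| < \infty$,

for any $\alpha$ with $|\alpha| \le n - 2$ there is a $c_\alpha < \infty$ such that

(iii) $\sup_{\mathbf{k}\in\mathbb{R}^3\setminus\{0\}} |\kappa^{|\alpha|-1}\,\partial_{\mathbf{k}}^\alpha\varphi_\pm(\mathbf{x}, \mathbf{k})| < c_\alpha\langle x\rangle^{|\alpha|}$, with $\kappa := \frac{k}{\langle k\rangle}$,

and for any $l \in \{1, \ldots, n - 2\}$ there is a $c_l < \infty$ such that

(iv) $\sup_{\mathbf{k}\in\mathbb{R}^3\setminus\{0\}} \left|\frac{\partial^l}{\partial k^l}\varphi_\pm(\mathbf{x}, \mathbf{k})\right| < c_l\langle x\rangle^l$, where $\frac{\partial}{\partial k}$ is the radial partial derivative in $\mathbf{k}$-space.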
Remark 1. This proposition, except the assertion (iii), was proved in [22], Theorem 3.1. Assertion (iii) repairs a false statement in Theorem 3.1 which did not include the necessary $\kappa^{|\alpha|-1}$ factor, which we have in (iii). For $|\alpha| = 1$, which was the important case in that paper, there is however no difference. We have commented on the proof of this corrected version in [11].

Remark 2. Zero is a resonance of $H$ if there exists a solution $f$ of $Hf = 0$ such that $\langle x\rangle^{-\gamma}f \in L^2(\mathbb{R}^3)$ for any $\gamma > \tfrac{1}{2}$ but not for $\gamma = 0$.⁵ The appearance of a zero eigenvalue or resonance can be regarded as an exceptional event: For a Hamiltonian $H = H_0 + cV$, $c \in \mathbb{R}$, this can only happen for $c$ in a discrete subset of $\mathbb{R}$, see [1], p. 20 and [15], p. 589.

As a simple consequence of Proposition 2 we obtain

⁴ We use the usual multi-index notation: $\alpha = (\alpha_1, \alpha_2, \alpha_3)$, $\alpha_i \in \mathbb{N}_0$, $\partial_{\mathbf{k}}^\alpha f(\mathbf{k}) := \partial_{k_1}^{\alpha_1}\partial_{k_2}^{\alpha_2}\partial_{k_3}^{\alpha_3}f(\mathbf{k})$ and $|\alpha| := \alpha_1 + \alpha_2 + \alpha_3$.

⁵ There are various definitions, see e.g. [26], p. 552, [1], p. 20 and [15], p. 584.
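Corollary 1. Let $V \in (V)_3$ and let zero be neither an eigenvalue nor a resonance of $H$. Then the $T$-matrix defined by (9) is a bounded and continuous function on $\mathbb{R}^3 \times \mathbb{R}^3$. Moreover, if $V \in (V)_n$ for some $n \ge 3$, we have for all multi-indices $\alpha$ with $|\alpha| \le n - 3$ a constant $c_\alpha > 0$ such that

$$\sup_{\mathbf{k}'\in\mathbb{R}^3,\ \mathbf{k}\in\mathbb{R}^3\setminus\{0\}} \kappa^{|\alpha|-1}\,|\partial_{\mathbf{k}}^\alpha T(\mathbf{k}', \mathbf{k})| \le c_\alpha. \tag{12}$$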
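With the regularity of the generalized eigenfunctions one can prove the flux-across-surfaces theorem. The quantum probability current density (= quantum flux density) is given by

$$\mathbf{j}^{\psi_t}(\mathbf{x}) := -\frac{i}{2}\left(\psi_t^*(\mathbf{x})\nabla\psi_t(\mathbf{x}) - \psi_t(\mathbf{x})\nabla\psi_t^*(\mathbf{x})\right). \tag{13}$$

For $\psi_t(\mathbf{x})$ a solution of the Schrödinger equation we have the identity

$$\frac{\partial|\psi_t(\mathbf{x})|^2}{\partial t} + \operatorname{div}\mathbf{j}^{\psi_t}(\mathbf{x}) = 0,$$

which has the form of a continuity equation. The flux-across-surfaces theorem can be naturally proven for the following class of wave functions (in the following definition we have the Fourier transform of $\psi_{\mathrm{out}}$, $\hat\psi_{\mathrm{out}}(\mathbf{k}) = (\mathcal{F}_+\psi)(\mathbf{k}) = (2\pi)^{-3/2}\int\varphi_+^*(\mathbf{x}, \mathbf{k})\psi(\mathbf{x})\,d^3x$ (cf. Proposition 1), in mind):

Definition 2. A function $f : \mathbb{R}^3\setminus\{0\} \to \mathbb{C}$ is in $\mathcal{G}^+$ if there is a constant $C \in \mathbb{R}^+$ with:

$|f(\mathbf{k})| \le C\langle k\rangle^{-15}$,
$|\partial_{\mathbf{k}}^\alpha f(\mathbf{k})| \le C\langle k\rangle^{-6}$, $|\alpha| = 1$,
$|\kappa\,\partial_{\mathbf{k}}^\alpha f(\mathbf{k})| \le C\langle k\rangle^{-5}$, $|\alpha| = 2$, $\kappa = \frac{k}{\langle k\rangle}$,
$\left|\frac{\partial^2}{\partial k^2}f(\mathbf{k})\right| \le C\langle k\rangle^{-3}$.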
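With this definition we have

Proposition 3. (Flux-across-surfaces theorem (FAST)). Suppose $V \in (V)_4$ and that zero is neither a resonance nor an eigenvalue of $H$. Suppose $\hat\psi_{\mathrm{out}}(\mathbf{k}) \in \mathcal{G}^+$ and let $\psi = \Omega_+\psi_{\mathrm{out}}$. Then $\psi_t(\mathbf{x}) = e^{-iHt}\psi(\mathbf{x})$ is continuously differentiable except at the singularities of $V$, for any measurable set $\Sigma \subseteq S^2$ and any $T \in \mathbb{R}$, $\mathbf{j}^{\psi_t}(\mathbf{x})\cdot d\boldsymbol{\sigma}\,dt$ is absolutely integrable on $R\Sigma \times [T, \infty)$ for $R$ sufficiently large and

$$\lim_{R\to\infty}\int_T^\infty\!\!\int_{R\Sigma}\mathbf{j}^{\psi_t}(\mathbf{x})\cdot d\boldsymbol{\sigma}\,dt = \lim_{R\to\infty}\int_T^\infty\!\!\int_{R\Sigma}\left|\mathbf{j}^{\psi_t}(\mathbf{x})\cdot d\boldsymbol{\sigma}\right|dt = \int_{C_\Sigma}|\hat\psi_{\mathrm{out}}(\mathbf{k})|^2\,d^3k, \tag{14}$$

where $R\Sigma := \{\mathbf{x} \in \mathbb{R}^3 : \mathbf{x} = R\boldsymbol{\omega},\ \boldsymbol{\omega} \in \Sigma\}$, $C_\Sigma := \{\mathbf{k} \in \mathbb{R}^3 : \frac{\mathbf{k}}{k} \in \Sigma\}$ is the cone given by $\Sigma$ and $d\boldsymbol{\sigma}$ is the outward-directed surface element on $RS^2$.

The proof can be found in [11]. The FAST plays a crucial role in the proof of our main results, Theorem 1 and Theorem 2. Its importance for scattering theory was first pointed out in [6].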
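3. The Quantum Flux, Crossing Statistics and Bohmian Mechanics

In Bohmian mechanics, see [5], the particle has a position $\mathbf{Q}_t$ that evolves via the equations

$$\frac{d}{dt}\mathbf{Q}_t = \mathbf{v}^{\psi_t}(\mathbf{Q}_t) = \operatorname{Im}\frac{\nabla\psi_t}{\psi_t}(\mathbf{Q}_t), \qquad i\frac{\partial}{\partial t}\psi_t(\mathbf{x}) = H\psi_t(\mathbf{x}). \tag{15}$$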
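To make the guidance equation in (15) concrete, here is a minimal one-dimensional sketch. It assumes a free particle ($V = 0$, which is not the scattering setting of this paper) with a Gaussian initial wave function of width `SIGMA`, for which $\psi_t(x) \propto \exp\!\big(-x^2/[4\sigma^2(1 + it/(2\sigma^2))]\big)$ in units $\hbar = m = 1$. The trajectory is then known in closed form, $Q_t = Q_0\sqrt{1 + t^2/(4\sigma^4)}$, which gives a check on the numerical integration. All names and parameter values are illustrative assumptions.

```python
import math

SIGMA = 1.0  # initial width of the Gaussian packet (assumption)


def velocity(x, t, sigma=SIGMA):
    """v = Im(psi'/psi) for the free 1D Gaussian packet, with hbar = m = 1."""
    z = complex(1.0, t / (2.0 * sigma**2))        # 1 + i t / (2 sigma^2)
    log_derivative = -x / (2.0 * sigma**2 * z)    # psi'(x, t) / psi(x, t)
    return log_derivative.imag


def bohmian_trajectory(q0, t_final, steps=2000):
    """Integrate dQ/dt = v(Q, t) from t = 0 to t_final with classical RK4."""
    h = t_final / steps
    q, t = q0, 0.0
    for _ in range(steps):
        k1 = velocity(q, t)
        k2 = velocity(q + 0.5 * h * k1, t + 0.5 * h)
        k3 = velocity(q + 0.5 * h * k2, t + 0.5 * h)
        k4 = velocity(q + h * k3, t + h)
        q += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
    return q


def exact_trajectory(q0, t, sigma=SIGMA):
    """Closed-form Bohmian trajectory for the spreading Gaussian packet."""
    return q0 * math.sqrt(1.0 + t**2 / (4.0 * sigma**4))


q_numeric = bohmian_trajectory(0.7, 5.0)
q_exact = exact_trajectory(0.7, 5.0)
```

The trajectories simply scale with the width of the packet, so an initial $|\psi_0|^2$ distribution of positions is carried into the $|\psi_t|^2$ distribution at later times in this example.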
According to the quantum equilibrium hypothesis ([10], Born's law), the positions of particles in an ensemble of particles each having wave function $\psi$ are always $|\psi|^2$-distributed. Note that if $\mathbf{Q}_0$ is $|\psi_0|^2$-distributed then $\mathbf{Q}_t$ is $|\psi_t|^2$-distributed. Under two assumptions we have the $|\psi_0|^2$ almost-sure existence and uniqueness of the Bohmian dynamics:

A 1. The initial wave function $\psi_0$ is normalized, $\|\psi_0\| = 1$, and $\psi_0 \in C^\infty(H) = \bigcap_{n=1}^\infty D(H^n)$.

A 2. The potential $V$ is in $(V)_2$ and $C^\infty$ except, perhaps, at a finite number of singularities.

(See Berndl et al. [4], Theorem 3.1 and Corollary 3.2 for the proof, as well as Theorem 3 and Corollary 4 in [23]. The conditions in [4, 23] are much more general. In our context, however, we have to restrict to the case where $V \in (V)_2$.) Hence, depending on the initial position $\mathbf{q}_0 \in \Omega_0$, where $\Omega_0$ is the set of "good" points, the particle has the trajectory $\mathbf{Q}_t^\psi(\mathbf{q}_0)$. On the set of "good" points, $\psi_0(\mathbf{x})$ is different from zero and is differentiable. The complement $\mathbb{R}^3\setminus\Omega_0$ of $\Omega_0$ has measure 0 (with respect to $|\psi_0|^2$).

Given a trajectory $\mathbf{Q}_t^\psi(\mathbf{q}_0)$, $\mathbf{q}_0 \in \Omega_0$, we can define the number of crossings in a natural way. For the surface $R\Sigma \subset RS^2$ with unit normal vector $\mathbf{n}(\mathbf{x}) = \frac{\mathbf{x}}{|\mathbf{x}|}$, $\mathbf{x} \in R\Sigma$, we define $N_+^\psi(R\Sigma)$ on $\Omega_0$ by:

$$N_+^\psi(R\Sigma)(\mathbf{q}_0) := \left|\left\{t \ge 0 \;\middle|\; \mathbf{Q}_t^\psi(\mathbf{q}_0) \in R\Sigma \text{ and } \dot{\mathbf{Q}}_t^\psi(\mathbf{q}_0)\cdot\mathbf{n}\big(\mathbf{Q}_t^\psi(\mathbf{q}_0)\big) > 0\right\}\right|, \tag{16}$$

the number of crossings of the trajectory $\mathbf{Q}_t^\psi(\mathbf{q}_0)$ through $R\Sigma$ in the direction of the orientation in the time interval $[0, \infty)$ ("problematical crossings" where the velocity is "orthogonal" to the orientation of $R\Sigma$ have measure zero and need not concern us, see [3], p. 28-34). If $N_+^\psi(R\Sigma)(\mathbf{q}_0) \ge 1$, we can define $t_{\mathrm{exit}}^{R\Sigma}$ as the time when the particle crosses the surface $R\Sigma$ in the positive direction for the first time:

$$t_{\mathrm{exit}}^{R\Sigma}(\mathbf{q}_0) := \min\left\{t \ge 0 \;\middle|\; \mathbf{Q}_t^\psi(\mathbf{q}_0) \in R\Sigma \text{ and } \dot{\mathbf{Q}}_t^\psi(\mathbf{q}_0)\cdot\mathbf{n}\big(\mathbf{Q}_t^\psi(\mathbf{q}_0)\big) > 0\right\}. \tag{17}$$

In the case that the particle does not cross the surface in the positive direction, we set

$$t_{\mathrm{exit}}^{R\Sigma}(\mathbf{q}_0) := \infty, \quad \text{if } N_+^\psi(R\Sigma)(\mathbf{q}_0) = 0. \tag{18}$$
Analogously to (16) we have $N_-^\psi(R\Sigma)$, the number of crossings in the opposite direction. For convenience we define $N_+^\psi(R\Sigma)$ and $N_-^\psi(R\Sigma)$ on the whole of $\mathbb{R}^3$ by setting $N_+^\psi(R\Sigma) = N_-^\psi(R\Sigma) = 0$ for all $\mathbf{q}_0 \in \mathbb{R}^3\setminus\Omega_0$. Then we can define the number of signed crossings on $\mathbb{R}^3$ by

$$N_{\mathrm{sig}}^\psi(R\Sigma) := N_+^\psi(R\Sigma) - N_-^\psi(R\Sigma). \tag{19}$$

The total number of crossings defined on $\mathbb{R}^3$ is then

$$N_{\mathrm{tot}}^\psi(R\Sigma) := N_+^\psi(R\Sigma) + N_-^\psi(R\Sigma). \tag{20}$$
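The counting in (16)–(20) can be sketched for a trajectory that is only known at discrete sample times. The helper below is an illustration under simplifying assumptions: it treats the full sphere $RS^2$ rather than a part $R\Sigma$, it takes the radial distances $|\mathbf{Q}_t|$ as input, and it approximates the first exit time (17) by the first sample time at or beyond $R$. The function name and the sample data are hypothetical.

```python
def crossing_statistics(times, radii, R):
    """Count crossings of the sphere of radius R by a sampled trajectory.

    times, radii: equally long sequences with radii[k] = |Q(times[k])|.
    Returns (n_plus, n_minus, n_sig, n_tot, t_exit), where t_exit is the
    first sample time after an outward crossing (inf if there is none).
    """
    n_plus = 0
    n_minus = 0
    t_exit = float("inf")
    for k in range(1, len(radii)):
        before, after = radii[k - 1], radii[k]
        if before < R <= after:          # outward crossing, counted in N_+
            n_plus += 1
            if t_exit == float("inf"):
                t_exit = times[k]        # first exit, cf. (17)
        elif before >= R > after:        # inward crossing, counted in N_-
            n_minus += 1
    n_sig = n_plus - n_minus             # signed crossings, cf. (19)
    n_tot = n_plus + n_minus             # total crossings, cf. (20)
    return n_plus, n_minus, n_sig, n_tot, t_exit


# Hypothetical samples: the trajectory leaves, re-enters and leaves again.
sample_times = [0.0, 1.0, 2.0, 3.0, 4.0]
sample_radii = [0.5, 0.9, 1.2, 0.8, 1.5]
stats = crossing_statistics(sample_times, sample_radii, R=1.0)
```

For a trajectory that starts inside the sphere, the signed count on the full sphere is 1 if it ends up outside and 0 otherwise, whereas the total count also registers every re-entry.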
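These quantities are random variables on the space $\mathbb{R}^3$ of initial conditions, see [3], Lemma 4.2. The expectation values of $N_{\mathrm{sig}}^\psi(R\Sigma)$ and $N_{\mathrm{tot}}^\psi(R\Sigma)$ are given by flux integrals and are finite, see Proposition 4 below. This means that $N_{\mathrm{sig}}^\psi(R\Sigma)$ and $N_{\mathrm{tot}}^\psi(R\Sigma)$ are almost surely finite. Before we give a precise statement we argue heuristically for the connection between the quantum flux and the expectation values. For a particle to cross an infinitesimal surface $d\boldsymbol{\sigma} := \mathbf{n}\,d\sigma$ in a time interval $[t, t + dt)$, it must be at time $t$ in the appropriate cylinder of size $|\mathbf{v}^{\psi_t}(\mathbf{x})\cdot d\boldsymbol{\sigma}\,dt|$. The probability is therefore

$$|\psi_t(\mathbf{x})|^2\,|\mathbf{v}^{\psi_t}(\mathbf{x})\cdot d\boldsymbol{\sigma}\,dt| = |\mathbf{j}^{\psi_t}(\mathbf{x})\cdot d\boldsymbol{\sigma}|\,dt.$$

Because the intervals are infinitesimal, we have $N_{\mathrm{sig}}^\psi(dt, d\boldsymbol{\sigma}) \in \{-1, 0, 1\}$,⁶ where the sign will be the same as that of $\mathbf{j}\cdot d\boldsymbol{\sigma}$. Therefore $E(N_{\mathrm{sig}}^\psi(dt, d\boldsymbol{\sigma})) = \mathbf{j}^{\psi_t}(\mathbf{x})\cdot d\boldsymbol{\sigma}\,dt$ and integration over $R\Sigma$ and $[0, \infty)$ yields (21). The precise statement is:

Proposition 4. Let A1 and A2 be satisfied. In addition suppose that the conditions of Proposition 3 are satisfied. Then for sufficiently large $R$ the expectation values of $N_{\mathrm{sig}}^\psi(R\Sigma)$ and $N_{\mathrm{tot}}^\psi(R\Sigma)$ are finite and

$$E(N_{\mathrm{sig}}^\psi(R\Sigma)) = \int_0^\infty\!\!\int_{R\Sigma}\mathbf{j}^{\psi_t}(\mathbf{x})\cdot d\boldsymbol{\sigma}\,dt, \tag{21}$$

$$E(N_{\mathrm{tot}}^\psi(R\Sigma)) = \int_0^\infty\!\!\int_{R\Sigma}|\mathbf{j}^{\psi_t}(\mathbf{x})\cdot d\boldsymbol{\sigma}|\,dt. \tag{22}$$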
The proof of Proposition 4 can be found in [3], pp. 34–37, and under slightly different conditions in [24]. The results in the references hold under more general conditions on the surfaces.

Consider now a scattering situation where we want to calculate the number of first crossings. The detector corresponds to the surface $R\Sigma := \{\mathbf{x} \in \mathbb{R}^3 : \mathbf{x} = R\boldsymbol{\omega},\ \boldsymbol{\omega} \in \Sigma \subset S^2\} \subset RS^2$. Then we define $N_{\mathrm{det}}^\psi([0, \infty), R, \Sigma)$ to be equal to one if the particle with the wave function $\psi_0 = \psi$ is "detected" in $[0, \infty)$ and zero otherwise. More precisely, $N_{\mathrm{det}}^\psi(R, \Sigma) : \mathbb{R}^3 \to \{0, 1\}$,

$$N_{\mathrm{det}}^\psi(R, \Sigma)(\mathbf{q}_0) := \begin{cases} 1, & \text{if } |\mathbf{q}_0| \le R,\ t_{\mathrm{exit}}^{RS^2} < \infty \text{ and } \mathbf{Q}^\psi_{t_{\mathrm{exit}}^{RS^2}}(\mathbf{q}_0) \in R\Sigma, \\ 0, & \text{otherwise.} \end{cases} \tag{23}$$

⁶ $N_{\mathrm{sig}}^\psi(dt, d\boldsymbol{\sigma})$ is the number of signed crossings in the time interval $[t, t + dt)$ through the surface $d\boldsymbol{\sigma}$.
The definition is motivated by the idea that particles are detected when they cross the boundary $RS^2$ for the first time. Using the fact that $RS^2$ is closed we can estimate $|N_{\mathrm{det}}^\psi(R, \Sigma) - N_{\mathrm{sig}}^\psi(R\Sigma)| \le N_-^\psi(RS^2)$ so that by the triangle inequality

$$\left|E(N_{\mathrm{det}}^\psi(R, \Sigma)) - E(N_{\mathrm{sig}}^\psi(R\Sigma))\right| \le E(N_-^\psi(RS^2)). \tag{24}$$

With (19), (20) and Proposition 4 we obtain for the right-hand side of (24),

$$E(N_-^\psi(RS^2)) = \frac{1}{2}E\!\left(N_{\mathrm{tot}}^\psi(RS^2) - N_{\mathrm{sig}}^\psi(RS^2)\right) = \frac{1}{2}\int_0^\infty\!\!\int_{RS^2}\left(|\mathbf{j}^{\psi_t}(\mathbf{x})\cdot d\boldsymbol{\sigma}| - \mathbf{j}^{\psi_t}(\mathbf{x})\cdot d\boldsymbol{\sigma}\right)dt. \tag{25}$$

If $\mathbf{j}^{\psi_t}(\mathbf{x})\cdot d\boldsymbol{\sigma} \ge 0$ for all $d\boldsymbol{\sigma} \in RS^2$ and $t > 0$ then we have by (24) and (25) that $E(N_{\mathrm{sig}}^\psi(R\Sigma)) = E(N_{\mathrm{det}}^\psi(R, \Sigma))$. In general $\mathbf{j}^{\psi_t}(\mathbf{x})\cdot d\boldsymbol{\sigma}$ does not have to be positive, but the flux-across-surfaces theorem (Proposition 3) ensures that the flux is asymptotically outwards. Thus we can estimate the difference between $E(N_{\mathrm{sig}}^\psi(R\Sigma))$ and $E(N_{\mathrm{det}}^\psi(R, \Sigma))$ for all $\psi$ which satisfy the flux-across-surfaces theorem using (24) and (25),

$$\left|E(N_{\mathrm{sig}}^\psi(R\Sigma)) - E(N_{\mathrm{det}}^\psi(R, \Sigma))\right| \le \frac{1}{2}\int_0^\infty\!\!\int_{RS^2}\left(|\mathbf{j}^{\psi_t}(\mathbf{x})\cdot d\boldsymbol{\sigma}| - \mathbf{j}^{\psi_t}(\mathbf{x})\cdot d\boldsymbol{\sigma}\right)dt \xrightarrow[R\to\infty]{} 0. \tag{26}$$

In particular under the hypotheses of Proposition 3 and the general assumptions A1 and A2 we obtain asymptotic equality between the expectation values $E(N_{\mathrm{det}}^\psi(R, \Sigma))$ and $E(N_{\mathrm{sig}}^\psi(R\Sigma))$.

4. A Model for the Beam

In a scattering situation a beam of particles is scattered off a target. We now wish to focus on the beam. We take the beam to be produced by a particle source located in the plane $Y_L$ perpendicular to the $x_3$-axis: $Y_L := \{-L\mathbf{e}_3 + \mathbf{a} \mid \mathbf{a} \perp \mathbf{e}_3\}$, $L > 0$. The particles are created with wave functions $\psi \in \mathcal{H}_{\mathrm{a.c.}}$ translated to the plane $Y_L$. Calling $\psi_{\mathbf{y}}$ the translation of $\psi$ by $\mathbf{y}$, the "centers" of the translated wave functions, with which we are concerned, are located at $\mathbf{y} = y_1\mathbf{e}_1 + y_2\mathbf{e}_2 - L\mathbf{e}_3 \in Y_L$ and are uniformly distributed in a bounded region $A \subset Y_L$ with area $|A|$. We call $A$ the beam profile. The momentum distribution of the wave function is concentrated around the momentum $k_0\mathbf{e}_3$.
D. Dürr, S. Goldstein, T. Moser, N. Zanghì
Remark 3. This model of a beam, in which the particles have random impact parameters and are scattered off a single target “particle,” is equivalent to the more realistic description of the scattering situation, in which all the target particles are randomly distributed (e.g., in a foil) and the incoming particles have the very same impact parameter, provided coherent and multiple-scattering effects are neglected (see e.g. [17], p. 214).

The translated wave function $\psi_y$ of a wave function $\psi \in \mathcal{H}_{\mathrm{a.c.}}$ will not in general be in $\mathcal{H}_{\mathrm{a.c.}}$, but can have a part in $\mathcal{H}_{\mathrm{p.p.}}$. This is problematical for the application of our general results (see Sect. 9). To avoid this difficulty, we assume:

A3. The Hamiltonian $H = -\frac12\Delta + V$ has no bound states, i.e. $\mathcal{H}_{\mathrm{p.p.}} = \{0\}$.

Then $\psi_y \in \mathcal{H}_{\mathrm{a.c.}}$, $\forall\, y \in \mathbb{R}^3$. We specify now more precisely the model for the beam, which has been already mentioned in [9]. The particles are created with wave functions $\psi$ at random times $t \in \mathbb{R}_+$, and the wave function of each particle is shifted randomly by the uniformly distributed “impact parameter” $y \in A$, the “center” of the wave function at the moment of emission. In Bohmian mechanics the initial position $q \in \mathbb{R}^3$ of the particle determines its trajectory. The initial position is $|\psi_y|^2$-distributed. We shall not need many stochastic details about the beam. The reader may think of a Poisson point process with points in $\Lambda = \mathbb{R}_+ \times A \times \mathbb{R}^3$, with a point $\lambda = (t, y, q) \in \Lambda$ representing a particle with wave function
\[
\psi_y(x) \equiv \psi(x - y), \quad y \in A, \tag{27}
\]
emitted at the time $t \in \mathbb{R}_+$ and with initial position $q \in \mathbb{R}^3$. We shall consider a general point process $(\Lambda', \mathcal{F}, P)$ built on $(\Lambda, \mathcal{B}(\Lambda), \mu)$, where $\lambda' \in \Lambda'$ represents a configuration of countably many points in $\Lambda$, i.e. $\lambda' = \{\lambda\}$, $\lambda \in \Lambda$, $\lambda'$ countable. For the number of points
\[
\chi_B(\lambda') \equiv \sum_{\lambda \in \lambda'} \chi_B(\lambda)
\]
in a set $B \in \mathcal{B}(\Lambda)$, where $\chi_B$ is the indicator function of the set $B$, we have that
\[
E\,\chi_B = \mu(B), \tag{28}
\]
where the intensity measure $\mu$ on $\mathcal{B}(\Lambda)$ is given by
\[
d\mu = |\psi(x - y)|^2\, \chi_A(y)\, dt\, d^2y\, d^3x. \tag{29}
\]
Remark 4. For a Poisson process we would have, in addition to (28), that
\[
P(\chi_B = k) = \exp(-\mu(B))\,\frac{\mu(B)^k}{k!} \tag{30}
\]
as well as the independence of $\chi_A$ and $\chi_B$, for $A \cap B = \emptyset$, $A, B \in \mathcal{B}(\Lambda)$.
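As a rough numerical illustration of (28) and (30), the following sketch samples the count $\chi_B$ for a Poisson process and compares the empirical frequencies with the right-hand side of (30). The helper names (`sample_poisson`, `poisson_pmf`, `empirical_counts`), the value $\mu(B) = 3$ and the number of trials are illustrative assumptions and do not come from the paper.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw a Poisson(mean) variate by counting unit-rate exponential gaps below `mean`."""
    count = 0
    total = rng.expovariate(1.0)
    while total < mean:
        count += 1
        total += rng.expovariate(1.0)
    return count


def poisson_pmf(k, mu):
    """Right-hand side of (30): exp(-mu) * mu**k / k!."""
    return math.exp(-mu) * mu ** k / math.factorial(k)


def empirical_counts(mu, trials, rng):
    """Histogram of the count chi_B over independent realisations with mu(B) = mu."""
    histogram = {}
    for _ in range(trials):
        k = sample_poisson(mu, rng)
        histogram[k] = histogram.get(k, 0) + 1
    return histogram


if __name__ == "__main__":
    rng = random.Random(0)
    mu, trials = 3.0, 20000          # hypothetical intensity mu(B) and sample size
    hist = empirical_counts(mu, trials, rng)
    mean = sum(k * n for k, n in hist.items()) / trials
    print("empirical mean of chi_B:", mean, "  mu(B):", mu)
    for k in range(6):
        print(k, hist.get(k, 0) / trials, poisson_pmf(k, mu))
```

The empirical mean should be close to $\mu(B)$, as in (28), and the frequencies close to (30); for a general ergodic point process only (28) is assumed.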
A Microscopic Derivation of the Quantum Mechanical Formal Scattering Cross Section
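We shall assume that the point process is ergodic in the following sense: For any $B \in \mathcal{B}(\Lambda)$ let
\[
B(\tau) := \{(t, y, q) \in B \mid t \in [0, \tau)\}. \tag{31}
\]
Then for any $\epsilon > 0$,
\[
\lim_{\tau\to\infty} P\left(\left|\frac{\chi_{B(\tau)}}{\tau} - E\left(\frac{\chi_{B(\tau)}}{\tau}\right)\right| \ge \epsilon\right) = 0, \tag{32}
\]
with $E\,\chi_{B(\tau)}$ given by (28).

Remark 5. Because of the independence property (cf. Remark 4), (32) holds for the case of a Poisson process.

Remark 6. The point process has unit density in the following sense: Let $C \subset A$, $\tau > 0$ and $B := [0, \tau) \times C \times \mathbb{R}^3$ be given. Then with (32) for any $\epsilon > 0$,
\[
\lim_{\tau\to\infty} P\left(\left|\frac{\chi_{B}}{|C|\tau} - E\left(\frac{\chi_{B}}{|C|\tau}\right)\right| \ge \epsilon\right) = 0, \tag{33}
\]
and
\[
E\left(\frac{\chi_{B(\tau)}}{|C|\tau}\right) = \frac{1}{|C|\tau}\,\mu(B) = 1. \tag{34}
\]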
5. The Definition of the Scattering Cross Section

We shall now start to define $N(\tau, R, A, L, \psi, \Sigma)$, the number of detected particles. To simplify the notation we do not always indicate the dependence of $N$ on $A$, $L$ and $\psi$. Sometimes we will also suppress the dependence on $R$ and $\Sigma$. We define first $N_{\det}(\tau, R, \Sigma)$ for a single particle corresponding to $\lambda = (t, y, q)$ by
\[
N_{\det}(\tau, R, \psi, \Sigma) : \Lambda \to \{0, 1\}, \qquad N_{\det}(\tau, R, \psi, \Sigma)(\lambda) := \chi_{[0,\tau)}(t)\, N^{\psi_y}_{\det}(R, \Sigma)(q), \tag{35}
\]
where $N^{\psi_y}_{\det}(R, \Sigma)(q)$ is defined by (23). The characteristic function ensures that no particle is counted which is emitted after the time $\tau$. Note that $\psi_y$ must satisfy condition A1 (Sect. 3) to ensure that $N^{\psi_y}_{\det}(R, \Sigma)(q)$ is well defined. Then
\[
N(\tau, R, A, L, \psi, \Sigma) : \Lambda' \to \mathbb{N}_0, \qquad N(\tau, R, A, L, \psi, \Sigma)(\lambda') = \sum_{\lambda \in \lambda'} N_{\det}(\tau, R, \psi, \Sigma)(\lambda). \tag{36}
\]
The empirical scattering cross section $\sigma_{\mathrm{emp}}(\Sigma)$ for the solid angle $\Sigma$ is the random variable$^7$
\[
\sigma_{\mathrm{emp}}(\Sigma) := \frac{N(\tau, R, A, L, \psi, \Sigma)}{\tau}, \tag{37}
\]
which by the law of large numbers (for the Poisson case and by the ergodicity assumption (32) for the general case) should approximate for large $\tau$ in $P$-probability its corresponding $P$-expectation value. The expected value of (37) is then the theoretically predicted cross section. This theoretically predicted cross section involves a very complicated formula which is not very explicit, cf. (47) and Remark 7. It depends of course on the detection directions $\Sigma$, the potential $V$ and the approximate momentum $k_0$ of the particles in the beam, but depends also on the other details of the experimental setup such as $R$, $A$, $L$ and the detailed specification of $\psi$. By taking the scaling limit described in the next section, we shall arrive at (1), which does not depend on these additional details.

6. The Scaling of the Parameters

According to the usual asymptotic picture of scattering theory where the particles are prepared long before and are detected long after the scattering event has occurred, the preparation and detection should be far away from the scattering center. That means the limits $R \to \infty$ and $L \to \infty$ have to be taken. However, increasing $L$ has the (undesirable) effect of an increased spreading of the beam, which reduces the beam intensity in the scattering region. To maintain the beam intensity in the scattering region we must widen the beam profile $A$ as $L \to \infty$. The idealization of an incoming plane wave corresponds to particles with a narrow distribution in momentum space, i.e., to a limit in which the Fourier transform of the initial wave function becomes more and more concentrated around a fixed initial wave vector $k_0$. For a detailed discussion of the scattering regime see [8].

The limits for the parameters $L$, $A$, and $\psi$ will be combined by simultaneously scaling them using a small parameter $\varepsilon$: We introduce $L_\varepsilon$, $A_\varepsilon$ and $\psi_\varepsilon$, whose precise dependence on $\varepsilon$ will be given below, and consider the cross section corresponding to (37), depending on $\varepsilon$, $R$, $\tau$,
\[
\sigma^{\varepsilon}_{\mathrm{emp}}(\Sigma) = \frac{N(\tau, R, A_\varepsilon, L_\varepsilon, \psi_\varepsilon, \Sigma)}{\tau}, \tag{38}
\]
to which the limit $\varepsilon \to 0$ is to be applied. However, the limit $R \to \infty$ is taken before we take $\varepsilon \to 0$; this is because we must have that the diameter of the beam profile $A_\varepsilon$ is much smaller than $R$, since otherwise unscattered particles will often contribute to what should be the cross section for scattered particles. For convenience, we first take the limit $\tau \to \infty$, required for the stabilization of the empirical cross section produced by the law of large numbers. We are thus led to consider a limit for the cross section of the form
\[
\sigma(\Sigma) = \lim_{\varepsilon\to0}\,\lim_{R\to\infty}\,\lim_{\tau\to\infty} \sigma^{\varepsilon}_{\mathrm{emp}}(\Sigma). \tag{39}
\]
7 We shall ignore the dimension factor [unit area · unit time] which comes from the normalization of (37) by the unit density 1/[unit area · unit time] of the underlying point process, cf. Remark 6. One can also normalize by the beam density, i.e. with the number of detected particles (by a detector in the beam with a surface perpendicular to the beam axis) per unit time and unit area, in front of the target. In the scattering regime, i.e. if the parameters are suitably scaled (cf. Section 6), the beam will have unit density in front of the target. We shall not elaborate on this further in this paper, see however [8].
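The precise definition of $L_\varepsilon$, $A_\varepsilon$ and $\psi_\varepsilon$, used in our main results, is the following:
\[
\psi_\varepsilon(x) = \varepsilon^{3/2}\, e^{i k_0\cdot x}\, \psi(\varepsilon x), \tag{40}
\]
with the Fourier transform
\[
\hat\psi_\varepsilon(k) = \varepsilon^{-3/2}\, \hat\psi\!\left(\frac{k - k_0}{\varepsilon}\right). \tag{41}
\]
The particle source is located on $Y_{L_\varepsilon}$, with
\[
L_\varepsilon = \frac{L}{\varepsilon^{l}}, \quad l > 2. \tag{42}
\]
For the beam profile $A_\varepsilon \subset Y_{L_\varepsilon}$ we take the circular region
\[
A_\varepsilon = \Big\{x \in \mathbb{R}^3 \;\Big|\; \sqrt{x_1^2 + x_2^2} < \frac{D_\varepsilon}{2} \text{ and } x_3 = -L_\varepsilon\Big\} \tag{43}
\]
with the beam diameter $D_\varepsilon$ given by
\[
D_\varepsilon = \frac{D}{\varepsilon^{d}}, \quad d > 2l - 3. \tag{44}
\]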
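The scaling prescriptions (42) and (44) can be tabulated numerically. The sketch below is only an illustration: the function name `scaled_parameters` and the sample values of $L$, $D$, $l$ and $d$ are assumptions, chosen so that $2 < l < 3$ and $2l - 3 < d < l$. It prints $L_\varepsilon$, $D_\varepsilon$ and the ratio $D_\varepsilon / L_\varepsilon = (D/L)\,\varepsilon^{\,l-d}$, which tends to zero in this regime.

```python
def scaled_parameters(eps, L, D, l, d):
    """Return (L_eps, D_eps) from (42) and (44); raise if the exponent constraints fail."""
    if not (l > 2 and d > 2 * l - 3):
        raise ValueError("need l > 2 and d > 2l - 3")
    return L / eps ** l, D / eps ** d


if __name__ == "__main__":
    # Illustrative values only: 2 < l < 3 and 2l - 3 < d < l.
    L, D, l, d = 1.0, 1.0, 2.5, 2.2
    for eps in (0.5, 0.1, 0.01):
        L_eps, D_eps = scaled_parameters(eps, L, D, l, d)
        print(f"eps={eps}: L_eps={L_eps:.4g}, D_eps={D_eps:.4g}, ratio={D_eps / L_eps:.4g}")
```

With $d < l$ the beam diameter grows more slowly than the source distance, so the beam stays narrow relative to its distance from the target while both diverge.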
(One might be inclined to consider a scattering experiment in which the diameter of the beam is much smaller than the distance of the particle source from the scattering center. Indeed, if $2 < l < 3$, $d < l$ is consistent with (44). Hence, such a scenario is covered by our results.)

7. The Scattering Cross Section Theorem

We can now formulate our main results. Our basic assumptions are that $V \in (V)_5$ (Definition 1), A2 (Sect. 3), A3 (no bound states, Sect. 4) and that for all $\varepsilon$ small enough $\psi_y$ is “good” for all $y \in A_\varepsilon$ in the sense that it satisfies A1 (Sect. 3) as well as the condition for the FAST (Prop. 3). Moreover, we need to assume that the potential has no zero energy resonances. However, instead of invoking the implicit condition on $\psi$ that the $\psi_y$ are “good,” we impose stronger but more explicit conditions on $\psi$, namely that $\psi \in C_0^\infty(\mathbb{R}^3)$ (Theorem 2) or $\psi \in \mathcal{S}$ (Theorem 1), with corresponding additional conditions on the potential (Definitions 4 and 3, respectively).

Definition 3. $V$ is in $\mathcal{V}$ if
(i) the Hamiltonian $H = -\frac12\Delta + V$ has no bound states, i.e. $\mathcal{H}_{\mathrm{p.p.}} = \{0\}$,
(ii) the Hamiltonian $H = -\frac12\Delta + V$ has no zero energy resonances,
(iii) $V$ is a $C^\infty$-function on $\mathbb{R}^3$,
(iv) $V$ and its derivatives of all orders are uniformly bounded in $x$: For all multi-indices $\alpha$ there exists an $M_\alpha < \infty$ such that $|\partial_x^\alpha V(x)| < M_\alpha$ for all $x \in \mathbb{R}^3$,
(v) there exist positive numbers $\delta$ and $C$ such that $|V(x)| \le C\langle x\rangle^{-5-\delta}$ for all $x \in \mathbb{R}^3$.
Theorem 1. Let $\psi$ be a normalized vector in $\mathcal{S}(\mathbb{R}^3)$ and suppose that $V$ is in $\mathcal{V}$. Furthermore, suppose that the point process $(\Lambda', \mathcal{F}, P)$ satisfies (28), (29) and the ergodic assumption (32). Let $k_0 \parallel e_3$ with $k_0 > 0$ and suppose that $k_0 \notin C_\Sigma$. Then $\sigma^{\varepsilon}_{\mathrm{emp}}$ is well defined and (recalling (1))
\[
\sigma^{\varepsilon}_{\mathrm{emp}}(\Sigma) = \frac{N(\tau, R, A_\varepsilon, L_\varepsilon, \psi_\varepsilon, \Sigma)}{\tau} \;\xrightarrow[\varepsilon\to0,\,R\to\infty,\,\tau\to\infty]{P}\; \sigma(\Sigma) = \int_\Sigma \sigma^{\mathrm{diff}}(\omega)\,d\Omega, \tag{45}
\]
where $\sigma^{\mathrm{diff}}(\omega) = 16\pi^4 |T(k_0\omega, k_0)|^2$ and $\xrightarrow{P}$ denotes convergence in probability.

Definition 4. $V$ is in $\mathcal{V}'$ if
(i) the Hamiltonian $H = -\frac12\Delta + V$ has no bound states, i.e. $\mathcal{H}_{\mathrm{p.p.}} = \{0\}$,
(ii) the Hamiltonian $H = -\frac12\Delta + V$ has no zero energy resonances,
(iii) $V$ is in $(V)_5$,
(iv) $V$ is $C^\infty$ except, perhaps, at a finite number of singularities.

Under these conditions we obtain

Theorem 2. Let $\psi$ be a normalized vector in $C_0^\infty(\mathbb{R}^3)$ and let $V$ be in $\mathcal{V}'$. Furthermore, suppose that the point process $(\Lambda', \mathcal{F}, P)$ satisfies (28), (29) and the ergodic assumption (32). Let $k_0 \parallel e_3$ with $k_0 > 0$ and suppose that $k_0 \notin C_\Sigma$. Then $\sigma^{\varepsilon}_{\mathrm{emp}}$ is well defined and (45) of Theorem 1 holds.

8. Proof of Theorem 1 and Theorem 2

During the proof in this section and in the appendix, $0 < c < \infty$ will denote a constant whose value can change during a calculation, even within the same equation or inequality. If either $V \in \mathcal{V}$ and $\psi \in \mathcal{S}(\mathbb{R}^3)$ or $V \in \mathcal{V}'$ and $\psi \in C_0^\infty$, then (if $\psi$ is normalized) the $\psi_y$ are “good” for all $y \in A_\varepsilon$ for all $\varepsilon$ small enough. That the $\psi_y$ satisfy the conditions for the FAST follows from Lemma 1 below, and that they satisfy A1 is easily seen: For the case $V \in \mathcal{V}$ and $\psi \in \mathcal{S}(\mathbb{R}^3)$ the conclusion follows from a simple computation, and if $V \in \mathcal{V}'$ and $\psi \in C_0^\infty$ it suffices to observe that by choosing $\varepsilon$ small enough the wave function $\psi_y$ has, for all $y \in A_\varepsilon$, no overlap with the singularities of the potential. $N$ is thus well defined by (36), and we can take the first limit in (45) using the following

Proposition 5. Suppose that $\psi_y$ satisfies A1 for all $y \in A$ and that the potential satisfies A2. Furthermore, suppose that the point process $(\Lambda', \mathcal{F}, P)$ satisfies (28), (29) and the ergodic assumption (32). Then the number of detected particles $N(\tau)$ obeys the law of large numbers, i.e. for all $\delta > 0$,
\[
\lim_{\tau\to\infty} P\left(\left|\frac{N(\tau, \Sigma)}{\tau} - \gamma\right| \ge \delta\right) = 0, \tag{46}
\]
where
\[
\gamma = \int_A E\big(N^{\psi_y}_{\det}(\Sigma)\big)\,d^2y. \tag{47}
\]
Remark 7. $\gamma = \gamma(\Sigma)$ is in fact the cross section which would be measured in an experiment. The remaining limits in (45) applied to $\gamma$ yield the cross section $\sigma(\Sigma)$. If the basic point process is a Poisson process, then with $[0, \tau) = \mathbb{R}_+$ the times of detection in $\Sigma$ form a Poisson process with intensity $\gamma$. Moreover, in the scattering regime, the detailed detection events, involving times and directions, form a Poisson process on $\mathbb{R}_+ \times S^2$ with intensity $\sigma^{\mathrm{diff}}(\omega)$.

Proof. By the definition (36) of $N$ we have that
\[
N(\tau)(\lambda') = \chi_{B(\tau)}(\lambda') = \sum_{\lambda\in\lambda'} \chi_{B(\tau)}(\lambda), \tag{48}
\]
with $B(\tau)$ given by
\[
B(\tau) = \{(t, y, q) \in \Lambda \mid N_{\det}(\tau, \Sigma)(t, y, q) = 1\}. \tag{49}
\]
It thus follows from (28) and (29) that
\[
E\,N(\tau) = \mu(B(\tau)) = \int_\Lambda \chi_{[0,\tau)}(t)\, N^{\psi_y}_{\det}(\Sigma)(q)\,d\mu = \tau \int_A E\big(N^{\psi_y}_{\det}(\Sigma)\big)\,d^2y = \tau\gamma. \tag{50}
\]
The proposition follows from the ergodicity assumption (32).
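The law of large numbers in Proposition 5 can be illustrated with a small simulation for the Poisson case. This is a minimal sketch under stated assumptions: the beam profile is taken to be a disc of radius 1, and `detection_probability` is a hypothetical stand-in for $E(N^{\psi_y}_{\det}(\Sigma))$, chosen as $e^{-|y|^2}$ only because its integral over the disc is known in closed form. None of these choices come from the paper.

```python
import math
import random


def detection_probability(y1, y2):
    """Hypothetical stand-in for E(N_det^{psi_y}(Sigma)); any function with values in [0, 1] works."""
    return math.exp(-(y1 * y1 + y2 * y2))


def simulate_detections(tau, radius, rng):
    """Count detections up to time tau for a unit-density Poisson beam on a disc of given radius."""
    area = math.pi * radius ** 2
    detected = 0
    t = rng.expovariate(area)                    # emission times: Poisson process of rate |A|
    while t < tau:
        r = radius * math.sqrt(rng.random())     # impact parameter uniform in the disc
        phi = 2.0 * math.pi * rng.random()
        if rng.random() < detection_probability(r * math.cos(phi), r * math.sin(phi)):
            detected += 1
        t += rng.expovariate(area)
    return detected


def gamma_exact(radius):
    """Integral of exp(-|y|^2) over the disc, i.e. gamma in (47) for the stand-in above."""
    return math.pi * (1.0 - math.exp(-radius ** 2))


if __name__ == "__main__":
    rng = random.Random(0)
    for tau in (100.0, 1000.0, 10000.0):
        n = simulate_detections(tau, 1.0, rng)
        print(f"tau={tau:g}: N/tau={n / tau:.4f}, gamma={gamma_exact(1.0):.4f}")
```

As $\tau$ grows, the ratio $N(\tau)/\tau$ should settle near $\gamma$, in line with (46) and (50).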
It is not easy to calculate the expectation value $\gamma$ (cf. (47)) directly. However, as we shall show below, using the FAST we can approximate (47) by
\[
\int_{A_\varepsilon} E\big(N^{\psi_y}_{\mathrm{sig}}(R\Sigma)\big)\,d^2y, \tag{51}
\]
where the integrand of (51) is given by an integral over the flux (cf. (21)), an expression that we can more easily handle. We will show in Lemma 2 below that $E\,N^{\psi_y}_{\mathrm{sig}}(R\Sigma)$ is absolutely integrable over $A_\varepsilon$. We introduce now a class of scattering states $\mathcal{G}$ for which we can show that the corresponding asymptotes are in the set $\mathcal{G}^+$, i.e. that they satisfy the FAST.

Definition 5. A function $f : \mathbb{R}^3 \to \mathbb{C}$ is in $\mathcal{G}^0$ if$^8$
\[
f \in \mathcal{H}_{\mathrm{a.c.}}(H) \cap C^8(H), \qquad \langle x\rangle^2 H^n f(x) \in L^2(\mathbb{R}^3),\ n \in \{0, 1, 2, \ldots, 8\}, \qquad \langle x\rangle^4 H^n f(x) \in L^2(\mathbb{R}^3),\ n \in \{0, 1, 2, 3\}.
\]
Then
\[
\mathcal{G} := \bigcup_{t\in\mathbb{R}} e^{-iHt}\,\mathcal{G}^0.
\]
We state now the important lemma that ensures that the $\psi_y$ satisfy the FAST.

8 $C^8(H) := \bigcap_{n=1}^{8} D(H^n)$
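Lemma 1. Suppose $V \in (V)_4$ and that zero is neither a resonance nor an eigenvalue of $H$. Then
\[
\psi(x) \in \mathcal{G} \;\Rightarrow\; \hat\psi_{\mathrm{out}}(k) = \mathcal{F}\big(\Omega_+^{-1}\psi\big)(k) \in \mathcal{G}^+.
\]
The proof is adapted from [12] and can be found in the appendix.

Remark 8. For other mapping properties between $\psi$ and $\psi_{\mathrm{out}}$, which are not applicable in our case, see [26].

For $\psi \in \mathcal{S}$ and $V \in \mathcal{V}$ or $\psi \in C_0^\infty(\mathbb{R}^3)$ and $V \in \mathcal{V}'$ we have that $\psi_y \in C^\infty(H)$ for all $y \in A_\varepsilon$ and $\varepsilon$ small enough. By (i) in the definition of $\mathcal{V}$ or $\mathcal{V}'$ (Definition 3 or 4) there are no bound states. Hence $\psi_y \in \mathcal{H}_{\mathrm{a.c.}}(H) \cap C^8(H)$, and one easily sees that $\psi_y \in \mathcal{G}$. Thus by Lemma 1 and Proposition 3 the $\psi_y$ satisfy the FAST for all $y \in A_\varepsilon$ and $\varepsilon$ small enough.

We now show that $E\,N^{\psi_y}_{\mathrm{sig}}(R\Sigma)$ is absolutely integrable over $A_\varepsilon$.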
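Lemma 2. Suppose that $\psi \in \mathcal{S}$ and $V \in \mathcal{V}$ or that $\psi \in C_0^\infty(\mathbb{R}^3)$ and $V \in \mathcal{V}'$. Then there exist $M$ and $R_0 > 0$ such that for $\varepsilon$ small enough
\[
\int_0^\infty\!\!\int_{RS^2} |j^{\psi_{y,t}}(x)\cdot d\sigma|\,dt < M, \quad \forall\, y \in A_\varepsilon,\ \forall R > R_0. \tag{52}
\]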
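For the proof see the Appendix. From now on we assume that $R > R_0$. By Lemma 1, Proposition 3, Proposition 4 and Lemma 2 we see that (51) is a well defined expression. Moreover, by (26) the difference between $E\,N^{\psi_y}_{\det}(R, \Sigma)$ and $E\,N^{\psi_y}_{\mathrm{sig}}(R\Sigma)$ vanishes in the limit $R \to \infty$, and using Lemma 2 we easily see by the dominated convergence theorem that the same conclusion holds for the integrals themselves. Thus, by Proposition 5, the limit $\sigma(\Sigma)$ in Theorem 1 is given by
\[
\begin{aligned}
\sigma(\Sigma) &= \lim_{\varepsilon\to0}\lim_{R\to\infty}\gamma = \lim_{\varepsilon\to0}\lim_{R\to\infty}\int_{A_\varepsilon} E\big(N^{\psi_y}_{\det}(R,\Sigma)\big)\,d^2y \\
&= \lim_{\varepsilon\to0}\int_{A_\varepsilon}\lim_{R\to\infty} E\big(N^{\psi_y}_{\mathrm{sig}}(R\Sigma)\big)\,d^2y \\
&= \lim_{\varepsilon\to0}\int_{A_\varepsilon}\lim_{R\to\infty}\int_0^\infty\!\!\int_{R\Sigma} j^{\psi_{y,t}}(x)\cdot d\sigma\,dt\,d^2y.
\end{aligned} \tag{53}
\]
Using Lemma 1 and Proposition 3 we get instead of (53),
\[
\sigma(\Sigma) = \lim_{\varepsilon\to0}\int_{C_\Sigma}\!\int_{A_\varepsilon} \big|\widehat{\Omega_+^{-1}\psi_y}(k)\big|^2\,d^2y\,d^3k = \lim_{\varepsilon\to0}\int_{C_\Sigma}\!\int_{A_\varepsilon} \big|\widehat{S\,\Omega_-^{-1}\psi_y}(k)\big|^2\,d^2y\,d^3k. \tag{54}
\]
The formula for $S = T + I$ is given by (8) and (9). To exploit this formula we write instead of (54):
\[
\sigma(\Sigma) = \lim_{\varepsilon\to0}\int_{C_\Sigma}\!\int_{A_\varepsilon} \big|\mathcal{F}\big(S(\Omega_-^{-1}\psi_y - \psi_y) + T\psi_y + \psi_y\big)(k)\big|^2\,d^2y\,d^3k. \tag{55}
\]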
By the triangle inequality we see that (55) yields
\[
\sigma(\Sigma) = \lim_{\varepsilon\to0}\int_{C_\Sigma}\!\int_{A_\varepsilon} |\mathcal{F}(T\psi_y)(k)|^2\,d^2y\,d^3k, \tag{56}
\]
provided
\[
\lim_{\varepsilon\to0}\int_{A_\varepsilon} \big\|\Omega_-^{-1}\psi_y - \psi_y\big\|^2\,d^2y = 0 \tag{57}
\]
and
\[
\lim_{\varepsilon\to0}\int_{C_\Sigma}\!\int_{A_\varepsilon} |\hat\psi_y(k)|^2\,d^2y\,d^3k = 0. \tag{58}
\]
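Remark 9. In [9] the “sufficient condition” for proceeding from (54) to (56) was insufficient.

We will establish now (57) and (58). We start with (58). Suppose that $\Sigma$ is such that $k_0 \notin C_\Sigma$. With (41) we have then that
\[
\int_{C_\Sigma}\!\int_{A_\varepsilon} |\hat\psi_y(k)|^2\,d^2y\,d^3k = \varepsilon^{-3}\int_{C_\Sigma}\!\int_{A_\varepsilon} \Big|\hat\psi\Big(\frac{k - k_0}{\varepsilon}\Big)\Big|^2\,d^2y\,d^3k = \int_{\frac1\varepsilon(C_\Sigma - k_0)}\int_{A_\varepsilon} |\hat\psi(k)|^2\,d^2y\,d^3k. \tag{59}
\]
Since $k_0 \notin C_\Sigma$ there exists a $\delta > 0$ such that
\[
|k - k_0| > \delta \quad \text{for all } k \in C_\Sigma. \tag{60}
\]
Using that $\hat\psi \in \mathcal{S}(\mathbb{R}^3)$ (we will use that $|\hat\psi| \le c\,k^{-(d+2)}$), the last integral in (59) can be estimated by
\[
\int_{\frac1\varepsilon(C_\Sigma - k_0)}\int_{A_\varepsilon} |\hat\psi(k)|^2\,d^2y\,d^3k \le \int_{k > \delta/\varepsilon}\int_{A_\varepsilon} |\hat\psi(k)|^2\,d^2y\,d^3k \le \frac{c}{\varepsilon^{2d}}\int_{k > \delta/\varepsilon} \frac{1}{k^{2d+4}}\,d^3k \le c\,\varepsilon, \tag{61}
\]
from which (58) follows. Since $\Omega_-$ is a partial isometry, (57) is equivalent to
\[
\lim_{\varepsilon\to0}\int_{A_\varepsilon} \|\Omega_-\psi_y - \psi_y\|^2\,d^2y = 0, \tag{62}
\]
which is the content of the following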
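Lemma 3. Let zero be neither an eigenvalue nor a resonance of $H$ and suppose that $V \in (V)_5$. Let $\psi \in \mathcal{S}(\mathbb{R}^3)$ and let $k_0 > 0$. Then
\[
\lim_{\varepsilon\to0}\int_{A_\varepsilon} \|\Omega_-\psi_y - \psi_y\|^2\,d^2y = 0. \tag{63}
\]
Remark 10. Under the additional condition that $\operatorname{supp}\hat\psi \subset P^{\alpha}_{e_3}$ for some $\alpha \in (0, \frac{\pi}{2})$, where $P^{\alpha}_{e_3} := \{k \in \mathbb{R}^3 : k\cdot e_3 > k\cos\alpha\}$, $0 < \alpha < \frac{\pi}{2}$ (this is a convenient condition, see e.g. [2], Lemma 7.17), one can prove in a manner similar to the way we prove Lemma 3 that the following holds:
\[
\lim_{L\to\infty}\int_{Y_L} \|\Omega_-\psi_y - \psi_y\|^2\,d^2y = 0. \tag{64}
\]
It is well known that the integrand in (64) tends to zero for large $y$ (see e.g. [2], Corollary 8.17, [19], Theorem XI.33, and [21], Theorem 2.20).

Proof of Lemma 3. We have that
\[
\|\Omega_-\psi_y - \psi_y\|^2 = 1 - (\psi_y, \Omega_-\psi_y) + \text{c.c.} \tag{65}
\]
Since $\Omega_-\psi = \mathcal{F}_-^{-1}\hat\psi(k)$ for any $\psi \in L^2(\mathbb{R}^3)$ (Proposition 1, (iii)) we obtain for the r.h.s. of (65):
\[
1 - \int\!\!\int (\psi_y)^*(x)\,(2\pi)^{-3/2}\,\hat\psi_y(k)\,\varphi_-(x, k)\,d^3k\,d^3x + \text{c.c.} \tag{66}
\]
Writing
\[
\varphi_-(x, k) = e^{ik\cdot x} - \eta_-(x, k), \tag{67}
\]
and since $\|\psi_y\|^2 = 1$, we then find that
\[
\|\Omega_-\psi_y - \psi_y\|^2 = \int\!\!\int (\psi_y)^*(x)\,(2\pi)^{-3/2}\,\hat\psi_y(k)\,\eta_-(x, k)\,d^3k\,d^3x + \text{c.c.} \tag{68}
\]
We shall divide the $k$-integration into two parts with the help of smooth ($C^\infty$) mollifiers $0 \le f_1(k) \le 1$ and $0 \le f_2(k) \le 1$ satisfying
\[
f_1(k) = \begin{cases} 1, & \text{for } |k - k_0| < \frac{k_0}{3}, \\ 0, & \text{for } |k - k_0| \ge \frac{k_0}{2}, \end{cases} \qquad f_2(k) := 1 - f_1(k). \tag{69}
\]
Using (69) we obtain for (68)
\[
\begin{aligned}
\|\Omega_-\psi_y - \psi_y\|^2 &= \int\!\!\int (\psi_y)^*(x)\,(2\pi)^{-3/2}\,\hat\psi_y(k)\,(f_1 + f_2)(k)\,\eta_-(x, k)\,d^3k\,d^3x + \text{c.c.} \\
&= \int\!\!\int (\psi_y)^*(x)\,(2\pi)^{-3/2}\,\hat\psi_y(k)\,f_1(k)\,\eta_-(x, k)\,d^3k\,d^3x \\
&\quad + \int\!\!\int (\psi_y)^*(x)\,(2\pi)^{-3/2}\,\hat\psi_y(k)\,f_2(k)\,\eta_-(x, k)\,d^3k\,d^3x + \text{c.c.} \\
&=: I_1 + I_2 + \text{c.c.} \le 2|I_1| + 2|I_2|.
\end{aligned} \tag{70}
\]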
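The paper only requires that smooth cutoffs with the properties (69) exist. One standard way to build such a pair, shown here as a sketch and not as the authors' construction, uses the $C^\infty$ function $e^{-1/t}$. The names `smooth_step`, `f1` and `f2` are illustrative, and the cutoffs are written as functions of the distance $|k - k_0|$.

```python
import math


def _h(t):
    """exp(-1/t) for t > 0 and 0 otherwise; smooth on the whole real line."""
    return math.exp(-1.0 / t) if t > 0 else 0.0


def smooth_step(t):
    """C-infinity step: 0 for t <= 0, 1 for t >= 1, strictly between 0 and 1 otherwise."""
    if t <= 0.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    return _h(t) / (_h(t) + _h(1.0 - t))


def f1(dist, k0):
    """Cutoff of (69) as a function of dist = |k - k0|: 1 below k0/3, 0 from k0/2 on."""
    inner, outer = k0 / 3.0, k0 / 2.0
    return 1.0 - smooth_step((dist - inner) / (outer - inner))


def f2(dist, k0):
    """Complementary cutoff, so that f1 + f2 = 1 everywhere."""
    return 1.0 - f1(dist, k0)


if __name__ == "__main__":
    k0 = 3.0
    for dist in (0.0, 0.9, 1.1, 1.25, 1.4, 1.5, 2.0):
        print(dist, f1(dist, k0), f2(dist, k0))
```

Any other smooth partition of unity with the same plateau and support would serve equally well in the estimates that follow.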
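Observing that $\psi \in \mathcal{S}(\mathbb{R}^3)$ we estimate $|I_2|$ by using that for any $n > 0$, $|\hat\psi(k)| \le \frac{c_n}{k^n}$ and that $|\eta_-(x, k)| \le 1 + |\varphi_-(x, k)| \le c$ (Proposition 2 (ii)) as well as (40), (41) and (69):
\[
\begin{aligned}
|I_2| &\le c\int_{\mathbb{R}^3}\int_{|k - k_0| \ge \frac{k_0}{3}} |\psi(\varepsilon(x - y))|\,(2\pi)^{-3/2}\,\Big|\hat\psi\Big(\frac{k - k_0}{\varepsilon}\Big)\Big|\,d^3k\,d^3x \\
&\le \frac{c}{\varepsilon^3}\int_{|k - k_0| \ge \frac{k_0}{3}} \Big|\hat\psi\Big(\frac{k - k_0}{\varepsilon}\Big)\Big|\,d^3k \le c\int_{|k| \ge \frac{k_0}{3\varepsilon}} \frac{1}{k^n}\,d^3k = c\,\varepsilon^{n-3},
\end{aligned} \tag{71}
\]
if $n \ge 4$. Lemma 3 concerns the integration of $I_1$ and $I_2$ over $A_\varepsilon$. With (71) we obtain that
\[
\int_{A_\varepsilon} |I_2|\,d^2y \le c\,\varepsilon^{n-3-2d}, \tag{72}
\]
which tends to zero if we choose $n$ large enough. We are left with showing that
\[
\lim_{\varepsilon\to0}\int_{A_\varepsilon} |I_1|\,d^2y = 0, \tag{73}
\]
and for this it suffices to prove that
\[
\lim_{\varepsilon\to0}\int_{Y_{L_\varepsilon}} |I_1|\,d^2y = 0. \tag{74}
\]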
Recalling the Lippmann-Schwinger equation (11), i.e. that
\[
\eta_-(x, k) = \frac{1}{2\pi}\int \frac{e^{ik|x - x'|}}{|x - x'|}\,V(x')\,\varphi_-(x', k)\,d^3x',
\]
we find that
\[
I_1 = \frac{1}{(2\pi)^{5/2}}\int (\psi_y)^*(x)\int \hat\psi_y(k)\,f_1(k)\int \frac{e^{ik|x - x'|}}{|x - x'|}\,V(x')\,\varphi_-(x', k)\,d^3x'\,d^3k\,d^3x. \tag{75}
\]
Since the integrand in (75) is absolutely integrable over $x$, $x'$, $k$ (because $\psi \in \mathcal{S}(\mathbb{R}^3)$, $V \in (V)_5$; cf. Lemma 2, (ii)) we are free to interchange these integrations and more generally change integration variables as convenient. Using $(\psi_y)^*(x) = (\psi_\varepsilon)^*(x - y)$, $\hat\psi_y(k) = \hat\psi_\varepsilon(k)\,e^{-ik\cdot y}$ we obtain that
\[
I_1 = \frac{1}{(2\pi)^{5/2}}\int_{\mathbb{R}^3} (\psi_\varepsilon)^*(x - y)\int_{\mathbb{R}^3} \hat\psi_\varepsilon(k)\,f_1(k)\int_{\mathbb{R}^3} \frac{e^{ik|x - x'| - ik\cdot y}}{|x - x'|}\,V(x')\,\varphi_-(x', k)\,d^3x'\,d^3k\,d^3x. \tag{76}
\]
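Making the change of variables $x \to x - y$ and using $y = (y_1, y_2, -L_\varepsilon)$ we obtain
\[
I_1 = \frac{1}{(2\pi)^{5/2}}\int_{\mathbb{R}^3} (\psi_\varepsilon)^*(x)\int_{\mathbb{R}^3} \hat\psi_\varepsilon(k)\,f_1(k)\int_{\mathbb{R}^3} \frac{e^{ik|y + x - x'| - ik_1y_1 - ik_2y_2 + ik_3L_\varepsilon}}{|y + x - x'|}\,V(x')\,\varphi_-(x', k)\,d^3x'\,d^3k\,d^3x. \tag{77}
\]
Introducing as shorthand notation (no change of variables) $\tilde y = y + x - x'$, $a := x - x'$, $b_3 := -L_\varepsilon + a_3$ and letting $(r, \theta)$ be the polar coordinates for $(\tilde y_1, \tilde y_2)$, with $e_r$ the corresponding radial unit vector ($\perp e_3$), this becomes
\[
\begin{aligned}
I_1 &= \frac{1}{(2\pi)^{5/2}}\int_{\mathbb{R}^3} (\psi_\varepsilon)^*(x)\int_{\mathbb{R}^3} \hat\psi_\varepsilon(k)\,f_1(k)\int_{\mathbb{R}^3} \frac{e^{ik\sqrt{\tilde y_1^2 + \tilde y_2^2 + (-L_\varepsilon + a_3)^2} - ik_1\tilde y_1 - ik_2\tilde y_2 + ik_3L_\varepsilon}}{|\tilde y|} \\
&\qquad\times e^{ik_1a_1 + ik_2a_2}\,V(x')\,\varphi_-(x', k)\,d^3x'\,d^3k\,d^3x \\
&= \frac{1}{(2\pi)^{5/2}}\int_{\mathbb{R}^3} (\psi_\varepsilon)^*(x)\int_{\mathbb{R}^3} \hat\psi_\varepsilon(k)\,f_1(k)\int_{\mathbb{R}^3} \frac{e^{ik\sqrt{r^2 + b_3^2} - ik\sin\vartheta\, r\cos\beta + ik\cos\vartheta\, L_\varepsilon}}{\sqrt{r^2 + b_3^2}} \\
&\qquad\times e^{ik_1a_1 + ik_2a_2}\,V(x')\,\varphi_-(x', k)\,d^3x'\,d^3k\,d^3x,
\end{aligned} \tag{78}
\]
with $k\sin\vartheta = |k_p| = \sqrt{k_1^2 + k_2^2}$, $k_3 = k\cos\vartheta$, where $\vartheta$ ($0 \le \vartheta \le \pi$) is the angle between $k$ and $e_3$ and $\beta$ is the angle between $k_p = (k_1, k_2, 0)$ and $e_r$. Moreover, there is an angle $0 < \alpha < \frac{\pi}{2}$ such that
\[
\vartheta \le \alpha, \quad \text{i.e.}\quad \cos\alpha \le \cos\vartheta \le 1,\quad 0 \le \sin\vartheta \le \sin\alpha,\quad 0 < \alpha < \frac{\pi}{2} \tag{79}
\]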
for all $k$'s in the support of $f_1$ (cf. (69)). We introduce now spherical coordinates $(k, \omega)$ for $k$ as integration variables and do first the integration over $k$ (note that $\beta$ is not $k$-dependent). Since $\hat\psi \in \mathcal{S}(\mathbb{R}^3)$, $f_1$ is smooth and $\frac{\partial}{\partial k}\varphi_-(x', k)$ is uniformly bounded in $k$ (Proposition 2 (iv)), we can do two integrations by parts with respect to $k$ and obtain that
\[
I_1 = -\frac{1}{(2\pi)^{5/2}}\int_{\mathbb{R}^3} (\psi_\varepsilon)^*(x)\int_{\mathbb{R}^3} V(x')\int_{S^2}\!\int_0^\infty \frac{e^{ik\lambda}}{\sqrt{r^2 + b_3^2}\,\lambda^2}\,\frac{\partial^2}{\partial k^2}\Big(\hat\psi_\varepsilon(k)\,f_1(k)\,\varphi_-(x', k)\,e^{ik_1a_1 + ik_2a_2}\,k^2\Big)\,dk\,d\Omega\,d^3x'\,d^3x, \tag{80}
\]
where
\[
\lambda := r\left(\sqrt{1 + \frac{b_3^2}{r^2}} - \sin\vartheta\cos\beta\right) + \cos\vartheta\,L_\varepsilon. \tag{81}
\]
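To estimate the derivatives of the functions $f_1(k)\varphi_-(x', k)$ we use Proposition 2, (iv) and the smoothness of $f_1(k)$. We introduce a multi-index notation $i := (i_1, i_2, i_3, i_4)$, $i_m \in \mathbb{N}_0$, $|i| := i_1 + i_2 + i_3 + i_4$, $j := (j_1, j_2, j_3)$ analogously. With $k_l = \kappa_l k$, $\kappa_l \in [-1, 1]$, $l = 1, 2$ we obtain that
\[
\begin{aligned}
\Big|\frac{\partial^2}{\partial k^2}\Big(f_1(k)\varphi_-(x', k)\,\hat\psi_\varepsilon(k)\,k^2\,e^{ik_1a_1 + ik_2a_2}\Big)\Big|
&\le 2\sum_{|i| = 2} \Big|\frac{\partial^{i_1}}{\partial k^{i_1}}\big(f_1(k)\varphi_-(x', k)\big)\Big|\,\Big|\frac{\partial^{i_2}}{\partial k^{i_2}}\big(\hat\psi_\varepsilon(k)k^2\big)\Big|\,\Big|\frac{\partial^{i_3}}{\partial k^{i_3}}e^{i\kappa_1ka_1}\Big|\,\Big|\frac{\partial^{i_4}}{\partial k^{i_4}}e^{i\kappa_2ka_2}\Big| \\
&\le c\sum_{|i| = 2} (1 + x')^{i_1}\,\Big|\frac{\partial^{i_2}}{\partial k^{i_2}}\big(\hat\psi_\varepsilon(k)k^2\big)\Big|\,|\kappa_1a_1|^{i_3}\,|\kappa_2a_2|^{i_4} \\
&\le c\sum_{|i| = 2} (1 + x')^{i_1}\,\Big|\frac{\partial^{i_2}}{\partial k^{i_2}}\big(\hat\psi_\varepsilon(k)k^2\big)\Big|\,a^{i_3}\,a^{i_4} \\
&\le c\sum_{|i| = 2} (1 + x')^{i_1}\,\Big|\frac{\partial^{i_2}}{\partial k^{i_2}}\big(\hat\psi_\varepsilon(k)k^2\big)\Big|\,|x - x'|^{i_3 + i_4} \\
&\le c\sum_{|j| = 2} (1 + x')^{j_1}\,\Big|\frac{\partial^{j_2}}{\partial k^{j_2}}\big(\hat\psi_\varepsilon(k)k^2\big)\Big|\,|x - x'|^{j_3}.
\end{aligned} \tag{82}
\]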
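With (79) we may assume that $\lambda$ in (81) is bounded below,
\[
\lambda \ge r(1 - \sin\alpha) + L_\varepsilon\cos\alpha \ge \lambda_{\min} := \eta\,(r + L_\varepsilon), \tag{83}
\]
with $\eta := \min((1 - \sin\alpha), \cos\alpha) > 0$. Using (83) and (82) in (80) we obtain that
\[
\begin{aligned}
M := \int_{Y_{L_\varepsilon}} |I_1|\,d^2y \le c\sum_{|j| = 2}\int_{\mathbb{R}^2}\int_{\mathbb{R}^3} |\psi_\varepsilon(x)|\int_{\mathbb{R}^3} |V(x')|\int_{S^2}\!\int_0^\infty &\frac{1}{\sqrt{r^2 + b_3^2}\,\lambda_{\min}^2}\,\Big|\partial_k^{j_2}\big(\hat\psi_\varepsilon(k)k^2\big)\Big|\,|x - x'|^{j_3}\,(1 + x')^{j_1} \\
&\times dk\,d\Omega\,d^3x'\,d^3x\,d^2y.
\end{aligned} \tag{84}
\]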
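Since the integrand of the right-hand side of (84) is positive, we may perform the change of integration variables $(y_1, y_2) \to (\tilde y_1, \tilde y_2) \to (r, \theta)$, as well as freely interchange the order of integrations. With (83) we then obtain that
\[
\begin{aligned}
M &\le c\sum_{|j| = 2}\int_{\mathbb{R}^3} |\psi_\varepsilon(x)|\int_{\mathbb{R}^3} |V(x')|\int_{S^2}\!\int_0^\infty\!\!\int_0^\infty\!\!\int_0^{2\pi} \frac{1}{\sqrt{r^2 + b_3^2}\,\lambda_{\min}^2}\,\Big|\partial_k^{j_2}\big(\hat\psi_\varepsilon(k)k^2\big)\Big|\,|x - x'|^{j_3}\,(1 + x')^{j_1}\,r\,d\theta\,dr\,dk\,d\Omega\,d^3x'\,d^3x \\
&\le c\sum_{|j| = 2}\int_{\mathbb{R}^3} |\psi_\varepsilon(x)|\int_{\mathbb{R}^3} |V(x')|\int_{S^2}\!\int_0^\infty\!\!\int_0^\infty \frac{1}{\eta^2(r + L_\varepsilon)^2}\,\Big|\partial_k^{j_2}\big(\hat\psi_\varepsilon(k)k^2\big)\Big|\,|x - x'|^{j_3}\,(1 + x')^{j_1}\,dr\,dk\,d\Omega\,d^3x'\,d^3x \\
&= \frac{c}{\eta^2L_\varepsilon}\sum_{|j| = 2}\int_{\mathbb{R}^3} |\psi_\varepsilon(x)|\int_{\mathbb{R}^3} |V(x')|\int_{S^2}\!\int_0^\infty \Big|\partial_k^{j_2}\big(\hat\psi_\varepsilon(k)k^2\big)\Big|\,|x - x'|^{j_3}\,(1 + x')^{j_1}\,dk\,d\Omega\,d^3x'\,d^3x.
\end{aligned} \tag{85}
\]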
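Using that $|x - x'|^{j_3} \le 2(x^{j_3} + x'^{j_3})$ for $j_3 = 1, 2$ we obtain that
\[
M \le \frac{c}{L_\varepsilon}\sum_{|j| = 2}\int_{\mathbb{R}^3} |\psi_\varepsilon(x)|\,(1 + x)^{j_3}\int_{\mathbb{R}^3} \Big|\partial_k^{j_2}\big(\hat\psi_\varepsilon(k)k^2\big)\Big|\int_{\mathbb{R}^3} |V(x')|\,(1 + x')^{j_1 + j_3}\,d^3x'\,dk\,d\Omega\,d^3x. \tag{86}
\]
Since $V \in (V)_5$ (so that $V \in L^2(\mathbb{R}^3)$ and $|V(x)| \le C\langle x\rangle^{-5-\delta}$, $\delta > 0$, for $x > R_0$) and $j_1 + j_3 \le 2$ the $x'$ integration is finite and we obtain (by dividing the integration region for $x'$ into two parts, $x' > R_0$ and $x' \le R_0$)
\[
M \le \frac{c}{L_\varepsilon}\sum_{j_2 + j_3 \le 2}\int_{\mathbb{R}^3} |\psi_\varepsilon(x)|\,(1 + x)^{j_3}\int_{\mathbb{R}^3} \Big|\partial_k^{j_2}\big(\hat\psi_\varepsilon(k)k^2\big)\Big|\,dk\,d\Omega\,d^3x. \tag{87}
\]
Using (40), (41) and that $\psi \in \mathcal{S}(\mathbb{R}^3)$ one finds by simple calculation that
\[
\int_{\mathbb{R}^3} |\psi_\varepsilon(x)|\,x^{j_3}\,d^3x \le \frac{c}{\varepsilon^{3/2}}\,\frac{1}{\varepsilon^{j_3}} \tag{88}
\]
and
\[
\int_{\mathbb{R}^3} \Big|\partial_k^{j_2}\big(\hat\psi_\varepsilon(k)k^2\big)\Big|\,dk\,d\Omega \le c\,\varepsilon^{3/2}\,\frac{1}{\varepsilon^{j_2}}. \tag{89}
\]
Since $j_2 + j_3 \le 2$ we see with (88), (89) and (42) that for $M$ in (87) we have for small $\varepsilon$ the bound
\[
M \le \frac{c}{L_\varepsilon\,\varepsilon^2} = c\,\varepsilon^{l-2}. \tag{90}
\]
Since $l > 2$, this completes the proof of (63).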
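The power counting behind (90) is easy to get wrong, so a small bookkeeping sketch may help. It only adds up the exponents of $\varepsilon$ contributed by $1/L_\varepsilon$ from (42) and by the bounds (88) and (89); the function names are illustrative and not part of the paper.

```python
from fractions import Fraction


def epsilon_exponent(l, j2, j3):
    """Power of eps in the bound on M for one (j2, j3) term, from (42), (88) and (89)."""
    from_source_distance = Fraction(l)                 # 1 / L_eps = eps**l / L
    from_position_integral = -Fraction(3, 2) - j3      # bound (88)
    from_momentum_integral = Fraction(3, 2) - j2       # bound (89)
    return from_source_distance + from_position_integral + from_momentum_integral


def worst_exponent(l):
    """Smallest exponent over all j2, j3 >= 0 with j2 + j3 <= 2; (90) states it is l - 2."""
    return min(epsilon_exponent(l, j2, j3)
               for j2 in range(3) for j3 in range(3) if j2 + j3 <= 2)


if __name__ == "__main__":
    for l in (Fraction(5, 2), 3, 4):
        print("l =", l, " worst exponent =", worst_exponent(l))
```

The worst case is $j_2 + j_3 = 2$, which gives the exponent $l - 2$; this is positive exactly when $l > 2$, the condition imposed in (42).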
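We can now proceed with the evaluation of (56). With (8) we obtain for (56)
\[
\begin{aligned}
\sigma(\Sigma) &= \lim_{\varepsilon\to0}\int_{C_\Sigma}\!\int_{A_\varepsilon} |\widehat{T\psi_y}(k)|^2\,d^2y\,d^3k \\
&= \lim_{\varepsilon\to0} 4\pi^2\int_{C_\Sigma}\!\int_{A_\varepsilon} \Big|\int_{k' = k} e^{-ik'\cdot y}\,T(k, k')\,\hat\psi_\varepsilon(k')\,k'\,d\Omega(k')\Big|^2\,d^2y\,d^3k \\
&= \lim_{\varepsilon\to0} 4\pi^2\int_{C_\Sigma}\!\int_{y_p < \frac{D_\varepsilon}{2}} \Big|\int_{k' = k} e^{-i(k_1'y_1 + k_2'y_2 - k_3'L_\varepsilon)}\,T(k, k')\,\hat\psi_\varepsilon(k')\,k'\,d\Omega(k')\Big|^2\,dy_1\,dy_2\,d^3k,
\end{aligned} \tag{91}
\]
where $y_p := (y_1, y_2)$. We insert again the identity $f_1 + f_2 \equiv 1$ and obtain for $\sigma(\Sigma)$
\[
\lim_{\varepsilon\to0} 4\pi^2\int_{C_\Sigma}\!\int_{y_p < \frac{D_\varepsilon}{2}} \Big|\int_{k' = k} e^{-i(k_1'y_1 + k_2'y_2 - k_3'L_\varepsilon)}\,T(k, k')\,\hat\psi_\varepsilon(k')\,\big(f_1(k') + f_2(k')\big)\,k'\,d\Omega(k')\Big|^2\,dy_1\,dy_2\,d^3k. \tag{92}
\]
Multiplying out we get four terms. The main term is
\[
\lim_{\varepsilon\to0} 4\pi^2\int_{C_\Sigma}\!\int_{y_p < \frac{D_\varepsilon}{2}} \Big|\int_{k' = k} e^{-i(k_1'y_1 + k_2'y_2 - k_3'L_\varepsilon)}\,T(k, k')\,\hat\psi_\varepsilon(k')\,f_1(k')\,k'\,d\Omega(k')\Big|^2\,dy_1\,dy_2\,d^3k. \tag{93}
\]
Before we evaluate (93) we show that the three other terms are zero. Noting that $T(k, k')$ is bounded (Corollary 1) and that $\psi \in \mathcal{S}(\mathbb{R}^3)$ we obtain that
\[
\begin{aligned}
\Big|\int_{k' = k} e^{-i(k_1'y_1 + k_2'y_2 - k_3'L_\varepsilon)}\,T(k, k')\,\hat\psi_\varepsilon(k')\,f_i(k')\,k'\,d\Omega(k')\Big| &\le \frac{c}{\varepsilon^{3/2}}\,k, \quad i = 1, 2, \\
\Big|\int_{k' = k} e^{-i(k_1'y_1 + k_2'y_2 - k_3'L_\varepsilon)}\,T(k, k')\,\hat\psi_\varepsilon(k')\,f_2(k')\,k'\,d\Omega(k')\Big| &\le \frac{c}{\varepsilon^{3/2}}\,k\int_{k' = k} \Big|\hat\psi\Big(\frac{k' - k_0}{\varepsilon}\Big)\Big|\,f_2(k')\,d\Omega(k').
\end{aligned} \tag{94}
\]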
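Using (94), the difference between (93) and (92) is no greater than
\[
\begin{aligned}
\frac{c}{\varepsilon^3}\int_{C_\Sigma}\!\int_{y_p < \frac{D_\varepsilon}{2}}\int_{k' = k} \Big|\hat\psi\Big(\frac{k' - k_0}{\varepsilon}\Big)\Big|\,f_2(k')\,k^2\,d\Omega(k')\,d^2y\,d^3k
&\le \frac{c}{\varepsilon^{3 + 2d}}\int_{\mathbb{R}^3} \Big|\hat\psi\Big(\frac{k' - k_0}{\varepsilon}\Big)\Big|\,f_2(k')\,k'^2\,d^3k' \\
&\le \frac{c}{\varepsilon^{3 + 2d}}\int_{|k' - k_0| \ge \frac{k_0}{3}} \Big|\hat\psi\Big(\frac{k' - k_0}{\varepsilon}\Big)\Big|\,k'^2\,d^3k'.
\end{aligned} \tag{95}
\]
Using that $|\hat\psi(k)| \le \frac{c_n}{k^n}$ for any $6 \le n \in \mathbb{N}$, we see that the right-hand side in (95) is bounded by $c\,\varepsilon^{n-3-2d}$, which tends to zero for sufficiently large $n$. Thus the three other terms are zero. Since, as we shall show,
\[
\lim_{\varepsilon\to0} 4\pi^2\int_{C_\Sigma}\!\int_{y_p \ge \frac{D_\varepsilon}{2}} \Big|\int_{k' = k} e^{-i(k_1'y_1 + k_2'y_2 - k_3'L_\varepsilon)}\,T(k, k')\,\hat\psi_\varepsilon(k')\,f_1(k')\,k'\,d\Omega(k')\Big|^2\,dy_1\,dy_2\,d^3k = 0, \tag{96}
\]
we may extend the $y$-integration in (93) to all of $\mathbb{R}^2$, so that
\[
\sigma(\Sigma) = \lim_{\varepsilon\to0} 4\pi^2\int_{C_\Sigma}\!\int_{\mathbb{R}^2} \Big|\int_{k' = k} e^{-i(k_1'y_1 + k_2'y_2 - k_3'L_\varepsilon)}\,T(k, k')\,\hat\psi_\varepsilon(k')\,f_1(k')\,k'\,d\Omega(k')\Big|^2\,dy_1\,dy_2\,d^3k. \tag{97}
\]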
Before establishing (96) we compute (97) with the help of the following

Lemma 4. Let $0 < \alpha < \frac{\pi}{2}$ and $\delta > 0$ be given. Suppose that $\phi : \mathbb{R}^3 \to \mathbb{C}$ is a function with support in the sector $P^{\alpha}_{e_3} := \{k \in \mathbb{R}^3 : k\cdot e_3 > k\cos\alpha\}$ such that $\int_{k = \delta} |\phi(k)|^2\,d\Omega(k) < \infty$. Then
\[
\Big(\frac{1}{2\pi}\Big)^2\int_{\mathbb{R}^2} \Big|\int_{k = \delta} e^{-ik\cdot y}\,\phi(k)\,d\Omega(k)\Big|^2\,d^2y = \int_{k = \delta} \frac{1}{k\,k_3}\,|\phi(k)|^2\,d\Omega(k). \tag{98}
\]
Remark 11. This lemma is proved in [2], Lemma 7.17. The integration over the impact parameter is crucial for the derivation and is a standard ingredient in the derivation of the scattering cross section.
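The factor $1/(k\,k_3)$ in (98) is the Jacobian relating the solid-angle element on the sphere of radius $k$ to the planar element, $d\Omega = dk_1\,dk_2/(k\,k_3)$; the same factor appears in the change of variables leading to (106). As a sanity check, and not as part of the proof, the sketch below integrates $1/(k\,k_3)$ over the disc $k_p \le k\sin\alpha$ and compares the result with the solid angle $2\pi(1 - \cos\alpha)$ of the corresponding spherical cap. The function names and the midpoint quadrature are illustrative choices.

```python
import math


def cap_solid_angle_via_projection(k, alpha, steps=20000):
    """Integrate 1/(k*k3) over the disc k_p <= k sin(alpha), with k3 = sqrt(k^2 - k_p^2)."""
    rho_max = k * math.sin(alpha)
    h = rho_max / steps
    total = 0.0
    for n in range(steps):
        rho = (n + 0.5) * h                        # midpoint rule in the radial variable k_p
        k3 = math.sqrt(k * k - rho * rho)
        total += 2.0 * math.pi * rho / (k * k3) * h
    return total


def cap_solid_angle_exact(alpha):
    """Solid angle of the cap with polar angle at most alpha."""
    return 2.0 * math.pi * (1.0 - math.cos(alpha))


if __name__ == "__main__":
    for alpha in (math.pi / 6, math.pi / 4, math.pi / 3):
        print(alpha, cap_solid_angle_via_projection(2.0, alpha), cap_solid_angle_exact(alpha))
```

The restriction $\alpha < \pi/2$ keeps $k_3$ bounded away from zero, which is why the support condition on $\phi$ appears in Lemma 4.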
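Because of Corollary 1, $T(k, k')$ is bounded on $\mathbb{R}^3 \times \mathbb{R}^3$ and continuous on $\mathbb{R}^3 \times \mathbb{R}^3 \setminus \{0\}$. Moreover, $\hat\psi(k) \in \mathcal{S}(\mathbb{R}^3)$ and $\hat\psi_\varepsilon(k)f_1(k)$ has support in $P^{\vartheta_2}_{e_3}$ with $0 < \vartheta_2 < \frac{\pi}{2}$. Hence, by Lemma 4, (97) becomes
\[
\begin{aligned}
\sigma(\Sigma) &= \lim_{\varepsilon\to0} 16\pi^4\int_{C_\Sigma}\!\int_{k' = k} |T(k, k')|^2\,|\hat\psi_\varepsilon(k')|^2\,f_1(k')^2\,\frac{1}{\cos\vartheta'}\,d\Omega(k')\,d^3k \\
&= \lim_{\varepsilon\to0} 16\pi^4\int_\Sigma\int_{\mathbb{R}^3} |T(k'\omega, k')|^2\,|\hat\psi_\varepsilon(k')|^2\,f_1(k')^2\,\frac{1}{\cos\vartheta'}\,d^3k'\,d\Omega,
\end{aligned} \tag{99}
\]
where $k_3' = k'\cos\vartheta'$. Because $\operatorname{supp} f_1(k') \subset P^{\vartheta_2}_{e_3}$ with $0 < \vartheta_2 < \frac{\pi}{2}$, there exists a $\delta > 0$ such that $\delta < \cos\vartheta'$. Hence the integral in (99) is finite (it is $\le c\,\|\hat\psi_\varepsilon\|^2$). Thus, since clearly $|\hat\psi_\varepsilon(k)|^2 \to \delta(k - k_0)$ (in the sense that $\lim_{\varepsilon\to0}\int |\hat\psi_\varepsilon(k)|^2\,g(k)\,d^3k = g(k_0)$ for any bounded continuous function $g$), and since $T(k'\omega, k')$, $f_1(k')$ and $\frac{1}{\cos\vartheta'}$ are bounded and continuous as functions of $k'$, we may conclude that
\[
\sigma(\Sigma) = 16\pi^4\int_\Sigma |T(k_0\omega, k_0)|^2\,d\Omega. \tag{100}
\]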
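The concentration $|\hat\psi_\varepsilon|^2 \to \delta(k - k_0)$ used in the last step can be observed numerically. The sketch below works in one dimension with a Gaussian $\hat\psi$; both are simplifying assumptions made for illustration and are not taken from the paper. It evaluates $\int |\hat\psi_\varepsilon(k)|^2 g(k)\,dk$ for $g = \cos$ and decreasing $\varepsilon$.

```python
import math


def psi_hat(k):
    """Normalised Gaussian in one dimension (an illustrative choice, not from the paper)."""
    return math.pi ** -0.25 * math.exp(-0.5 * k * k)


def psi_hat_eps(k, k0, eps):
    """One-dimensional analogue of (41): eps**(-1/2) * psi_hat((k - k0) / eps)."""
    return psi_hat((k - k0) / eps) / math.sqrt(eps)


def smeared_value(g, k0, eps, half_width=12.0, steps=4000):
    """Midpoint approximation of the integral of |psi_hat_eps|^2 * g over k0 +/- half_width*eps."""
    a = k0 - half_width * eps
    h = 2.0 * half_width * eps / steps
    total = 0.0
    for n in range(steps):
        k = a + (n + 0.5) * h
        total += psi_hat_eps(k, k0, eps) ** 2 * g(k) * h
    return total


if __name__ == "__main__":
    k0 = 2.0
    for eps in (0.5, 0.1, 0.01):
        print(eps, smeared_value(math.cos, k0, eps), math.cos(k0))
```

For this Gaussian the integral equals $\cos(k_0)\,e^{-\varepsilon^2/4}$ exactly, so the printed values approach $g(k_0) = \cos(k_0)$ as $\varepsilon \to 0$.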
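The proof of Theorem 1 and Theorem 2 will thus be complete once we establish (96). Changing variables, (96) follows from
\[
\lim_{\varepsilon\to0}\frac{1}{\varepsilon^{2d}}\int_{\mathbb{R}^3}\int_{y_p \ge \frac{D}{2}} \Big|\int_{k' = k} e^{-i\left(k_1'\frac{y_1}{\varepsilon^d} + k_2'\frac{y_2}{\varepsilon^d} - k_3'L_\varepsilon\right)}\,T(k, k')\,\hat\psi_\varepsilon(k')\,f_1(k')\,k'\,d\Omega(k')\Big|^2\,dy_1\,dy_2\,d^3k = 0. \tag{101}
\]
Equation (101) is the content of

Lemma 5. Let $V \in (V)_5$, $\psi \in \mathcal{S}(\mathbb{R}^3)$ and suppose that $k_0 > 0$. Let $l > 2$, $d > 2l - 3$ and let $M$ be given by (to simplify the notation we interchange $k$ and $k'$)
\[
M = M(y_1, y_2, k', \varepsilon) := \int_{k = k'} e^{-i\left(k_1\frac{y_1}{\varepsilon^d} + k_2\frac{y_2}{\varepsilon^d} - k_3L_\varepsilon\right)}\,T(k', k)\,\hat\psi_\varepsilon(k)\,f_1(k)\,k\,d\Omega(k). \tag{102}
\]
Then for any $D > 0$,
\[
\lim_{\varepsilon\to0}\int_{\mathbb{R}^3}\int_{y_p \ge D} \frac{1}{\varepsilon^{2d}}\,|M|^2\,dy_1\,dy_2\,d^3k' = 0. \tag{103}
\]
Proof. We will establish the following inequality (104) giving a bound on $M$: There exists a $c < \infty$ such that
\[
|M|^2 \le c\,\chi_{\left[\frac{k_0}{2}, \frac{3}{2}k_0\right]}(k')\,\frac{\varepsilon^{4d + 5 - 4l}}{y_p^4}\,\frac{1}{\left(1 + \frac{|k' - k_0|}{\varepsilon}\right)^2}. \tag{104}
\]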
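Assuming (104) we show now that (103) follows. Using (104), the integral in (103) is dominated by
\[
\begin{aligned}
c\,\varepsilon^{2d + 5 - 4l}\int_{\frac{k_0}{2} \le k' \le \frac{3}{2}k_0}\int_{y_p \ge D} \frac{1}{y_p^4}\,\frac{1}{\left(1 + \frac{|k' - k_0|}{\varepsilon}\right)^2}\,d^2y\,d^3k'
&\le c\,\varepsilon^{2d + 5 - 4l}\int_{-\infty}^{\infty} \frac{dk'}{\left(1 + \frac{|k' - k_0|}{\varepsilon}\right)^2} \\
&= c\,\varepsilon^{2d + 6 - 4l}\int_{-\infty}^{\infty} \frac{dk'}{(1 + |k'|)^2} = c\,\varepsilon^{2d + 6 - 4l}.
\end{aligned} \tag{105}
\]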
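Since $d > 2l - 3$ there is a $\delta > 0$ such that $d = 2l - 3 + \delta$. Then (105) is of order $\varepsilon^{2\delta}$ and (103) follows. It thus remains to establish (104). Changing variables in (102) from $\omega$ to $k_1, k_2$ we obtain, with the Jacobian determinant $\frac{1}{k'k_3}$, with $k_3 = k_3(k_1, k_2) = \sqrt{k'^2 - k_1^2 - k_2^2}$ and $k_+ = (k_1, k_2, k_3(k_1, k_2))$,
\[
\begin{aligned}
M &= \int_{k_1^2 + k_2^2 \le k'^2} e^{-i\left(k_1\frac{y_1}{\varepsilon^d} + k_2\frac{y_2}{\varepsilon^d} - k_3L_\varepsilon\right)}\,T(k', k_+)\,\hat\psi_\varepsilon(k_+)\,f_1(k_+)\,k'\,\frac{1}{k'k_3}\,dk_1\,dk_2 \\
&= \frac{1}{\varepsilon^{3/2}}\int_{k_1^2 + k_2^2 \le k'^2} e^{-i\left(k_1\frac{y_1}{\varepsilon^d} + k_2\frac{y_2}{\varepsilon^d}\right)}\,T(k', k_+)\,\hat\psi\Big(\frac{k_+ - k_0}{\varepsilon}\Big)\,e^{ik_3L_\varepsilon}\,\frac{f_1(k_+)}{k_3}\,dk_1\,dk_2 \\
&=: \frac{1}{\varepsilon^{3/2}}\int_{k_1^2 + k_2^2 \le k'^2} e^{-i\left(k_1\frac{y_1}{\varepsilon^d} + k_2\frac{y_2}{\varepsilon^d}\right)}\,g(k_1, k_2, k', \varepsilon)\,dk_1\,dk_2.
\end{aligned} \tag{106}
\]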
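Performing two integrations by parts with respect to $k_p := (k_1, k_2)$, we obtain (using the fact that $f_1(k_+)$ and its derivatives vanish on the boundary of the region of integration) that
\[
\begin{aligned}
|M| &= \frac{1}{\varepsilon^{3/2}}\,\Big|\int_{k_p \le k'} \varepsilon^d\,\Big(\nabla_{k_p}e^{-i\left(k_1\frac{y_1}{\varepsilon^d} + k_2\frac{y_2}{\varepsilon^d}\right)}\Big)\cdot\frac{y_p}{y_p^2}\;g(k_1, k_2, k', \varepsilon)\,dk_1\,dk_2\Big| \\
&= \frac{1}{\varepsilon^{3/2}}\,\Big|\int_{k_p \le k'} \varepsilon^d\,e^{-i\left(k_1\frac{y_1}{\varepsilon^d} + k_2\frac{y_2}{\varepsilon^d}\right)}\,\frac{y_p}{y_p^2}\cdot\nabla_{k_p}g(k_1, k_2, k', \varepsilon)\,dk_1\,dk_2\Big| \\
&= \frac{1}{\varepsilon^{3/2}}\,\Big|\int_{k_p \le k'} \varepsilon^{2d}\,\Big(\nabla_{k_p}e^{-i\left(k_1\frac{y_1}{\varepsilon^d} + k_2\frac{y_2}{\varepsilon^d}\right)}\cdot\frac{y_p}{y_p^2}\Big)\,\Big(\frac{y_p}{y_p^2}\cdot\nabla_{k_p}g(k_1, k_2, k', \varepsilon)\Big)\,dk_1\,dk_2\Big| \\
&= \frac{1}{\varepsilon^{3/2}}\,\Big|\int_{k_p \le k'} \varepsilon^{2d}\,e^{-i\left(k_1\frac{y_1}{\varepsilon^d} + k_2\frac{y_2}{\varepsilon^d}\right)}\,\Big(\frac{y_p}{y_p^2}\cdot\nabla_{k_p}\Big)\Big(\frac{y_p}{y_p^2}\cdot\nabla_{k_p}\Big)g(k_1, k_2, k', \varepsilon)\,dk_1\,dk_2\Big| \\
&\le \frac{1}{\varepsilon^{3/2}}\,\frac{\varepsilon^{2d}}{y_p^2}\int_{k_p \le k'}\sum_{i,j = 1}^{2} \big|\partial_{k_i}\partial_{k_j}g(k_1, k_2, k', \varepsilon)\big|\,dk_1\,dk_2.
\end{aligned} \tag{107}
\]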
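We estimate now the derivatives of $g$ on the support of $f_1$. Note first that on $\operatorname{supp} f_1$, $k_3 > k_0/2$. Using Corollary 1 we have for $i, j = 1, 2$ that
\[
\sup_{k' \in \mathbb{R}^3,\ k_+ \in \operatorname{supp} f_1} |T(k', k_+)| \le c, \qquad \sup_{k' \in \mathbb{R}^3,\ k_+ \in \operatorname{supp} f_1} |\partial_{k_i}T(k', k_+)| \le c, \qquad \sup_{k' \in \mathbb{R}^3,\ k_+ \in \operatorname{supp} f_1} |\partial_{k_i}\partial_{k_j}T(k', k_+)| \le c. \tag{108}
\]
To estimate the wave function $\hat\psi\big(\frac{k_+ - k_0}{\varepsilon}\big)$ and its derivatives we introduce the following notation, in which $k = |\mathbf{k}|$ and $k_0 = |\mathbf{k}_0|$ denote the moduli of the vectors $\mathbf{k}$ and $\mathbf{k}_0$:
\[
P_k := \frac{1}{1 + \frac{|k - k_0|}{\varepsilon}}, \qquad P_{\mathbf{k}} := \frac{1}{1 + \frac{|\mathbf{k} - \mathbf{k}_0|}{\varepsilon}}. \tag{109}
\]
Clearly
\[
P_{\mathbf{k}} \le P_k. \tag{110}
\]
Since $\psi \in \mathcal{S}(\mathbb{R}^3)$, $\hat\psi$ and its derivatives decay faster than the reciprocal of any polynomial, so we can find for $k_+ \in \operatorname{supp} f_1$ and for $n \in \mathbb{N}$ suitable constants such that
\[
\Big|\hat\psi\Big(\frac{k_+ - k_0}{\varepsilon}\Big)\Big| \le c\,P^n_{\mathbf{k}_+}, \qquad \Big|\partial_{k_i}\hat\psi\Big(\frac{k_+ - k_0}{\varepsilon}\Big)\Big| \le \frac{c}{\varepsilon}\,P^n_{\mathbf{k}_+}, \qquad \Big|\partial_{k_i}\partial_{k_j}\hat\psi\Big(\frac{k_+ - k_0}{\varepsilon}\Big)\Big| \le \frac{c}{\varepsilon^2}\,P^n_{\mathbf{k}_+}. \tag{111}
\]
The derivatives of the third factor $e^{ik_3L_\varepsilon}$ of $g$ can be estimated on $\operatorname{supp} f_1$ as follows:
\[
\big|e^{ik_3L_\varepsilon}\big| \le 1, \qquad \big|\partial_{k_i}e^{ik_3L_\varepsilon}\big| \le L_\varepsilon\,\frac{|k_i|}{|k_3|} \le c\,L_\varepsilon\,|k_i|. \tag{112}
\]
Since $|k_i|\,P_{\mathbf{k}_+} \le \varepsilon$, we obtain using (111) with $n = j + 1$ and (42) that
\[
\Big|\big(\partial_{k_i}e^{ik_3L_\varepsilon}\big)\,\hat\psi\Big(\frac{k_+ - k_0}{\varepsilon}\Big)\Big| \le c\,L_\varepsilon\,|k_i|\,P_{\mathbf{k}_+}\,P^j_{\mathbf{k}_+} \le c\,L_\varepsilon\,\varepsilon\,P^j_{\mathbf{k}_+} = \frac{c}{\varepsilon^{l-1}}\,P^j_{\mathbf{k}_+}, \quad j \text{ arbitrary}. \tag{113}
\]
With a similar calculation we find that
\[
\Big|\big(\partial_{k_i}\partial_{k_j}e^{ik_3L_\varepsilon}\big)\,\hat\psi\Big(\frac{k_+ - k_0}{\varepsilon}\Big)\Big| \le \frac{c}{\varepsilon^{2l-2}}\,P^j_{\mathbf{k}_+}, \quad j \text{ arbitrary}, \tag{114}
\]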
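and analogous estimates for terms which contain derivatives of $\hat\psi\big(\frac{k_+ - k_0}{\varepsilon}\big)$. Clearly we have that
\[
\sup_{k_+ \in \operatorname{supp} f_1}\Big|\frac{f_1(k_+)}{k_3}\Big| \le c, \qquad \sup_{k_+ \in \operatorname{supp} f_1}\Big|\partial_{k_i}\frac{f_1(k_+)}{k_3}\Big| \le c, \qquad \sup_{k_+ \in \operatorname{supp} f_1}\Big|\partial_{k_i}\partial_{k_j}\frac{f_1(k_+)}{k_3}\Big| \le c, \quad i, j = 1, 2. \tag{115}
\]
Combining (108), (111)–(115) and using that $2l - 2 > 2$ since $l > 2$ we obtain for all $k' \in \mathbb{R}^3$ and any $n \in \mathbb{N}$ that
\[
\big|\partial_{k_i}\partial_{k_j}g(k_1, k_2, k', \varepsilon)\big| \le \frac{c}{\varepsilon^{2l-2}}\,P^n_{\mathbf{k}_+}, \tag{116}
\]
for all $(k_1, k_2)$ such that $k_+ \in \operatorname{supp} f_1$. Reintroducing the original integration variable $\omega$ we then have that
\[
|M| \le \frac{c\,\varepsilon^{2d - 2l + \frac12}}{y_p^2}\int_{k = k'} \chi_{\{f_1 > 0\}}\,P^n_{\mathbf{k}}\,k'\,k_3\,d\Omega(k) \le \frac{c\,\varepsilon^{2d - 2l + \frac12}}{y_p^2}\,\chi_{\left[\frac{k_0}{2}, \frac{3}{2}k_0\right]}(k')\int_{k = k',\ |\mathbf{k} - \mathbf{k}_0| < \frac{k_0}{2}} P^n_{\mathbf{k}}\,d\Omega(k). \tag{117}
\]
Choosing $n = 4$ in (117) and splitting $P^4_{\mathbf{k}}$ into
\[
P^4_{\mathbf{k}} = P^1_{\mathbf{k}}\,P^3_{\mathbf{k}} \le P^1_{k}\,P^3_{\mathbf{k}}, \tag{118}
\]
we obtain, since $k = k'$ on the domain of integration, that
\[
|M| \le \frac{c\,\varepsilon^{2d + \frac12 - 2l}}{y_p^2}\,\chi_{\left[\frac{k_0}{2}, \frac{3}{2}k_0\right]}(k')\,P^1_{k'}\int_{k = k',\ |\mathbf{k} - \mathbf{k}_0| < \frac{k_0}{2}} P^3_{\mathbf{k}}\,d\Omega(k). \tag{119}
\]
Moreover, it is easy to see that
\[
\int_{k = k',\ |\mathbf{k} - \mathbf{k}_0| < \frac{k_0}{2}} P^3_{\mathbf{k}}\,d\Omega(k) \le c\int_{\mathbb{R}^2} \frac{1}{\left(1 + \frac{k_p}{\varepsilon}\right)^3}\,dk_1\,dk_2 \le c\,\varepsilon^2. \tag{120}
\]
Thus
\[
|M| \le \frac{c\,\varepsilon^{2d + \frac52 - 2l}}{y_p^2}\,\chi_{\left[\frac{k_0}{2}, \frac{3}{2}k_0\right]}(k')\,P^1_{k'}, \tag{121}
\]
and (104) follows. This completes the proof of Lemma 5.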
9. Summary and Outlook

The purpose of this paper has been to rigorously derive the standard formula for the scattering cross section starting from a microscopic model of a scattering experiment. While the use of Bohmian mechanics is crucial for our result, we would like to stress that major parts of our proof are vital even from an orthodox point of view. These parts concern in particular the replacement of the incoming asymptote by its scattering state (cf. Lemma 3 and Remark 10) and the flux-across-surfaces theorem in a formulation which depends only on the smoothness of the scattering state (cf. Proposition 3, Lemma 1 and [11]). Several problems have been left for future work, which we shall mention here.

• Bound states: Our assumption A3 arises from the problem that in general the translation of the initial wave function by the impact parameter $y$ (which is needed for the averaging over the beam profile) will produce wave functions which have a component in the bound states. One would then have to show that asymptotically the crossing statistics are induced by the “relevant part” $P\psi$ of the wave function, where $P$ is the projection onto the absolutely continuous subspace $\mathcal{H}_{\mathrm{a.c.}}(H)$ and is given by $P := \Omega_-\Omega_-^*$. Note that by using Lemma 3 one can also show that
\[
\lim_{L\to\infty}\int_{Y_L} \|P\psi_y - \psi_y\|^2\,d^2y = 0, \tag{122}
\]
i.e., that the bound state component is small in an $L^2$-sense. This is however not directly applicable.

• It would of course be desirable to derive the crossing statistics for many particles guided in general by an entangled wave function both for the noninteracting case and eventually even for interacting particles [13].

• We are currently working [8] on a detailed formulation of the conditions characterizing the scattering regime, which turns out to be surprisingly intricate. What we have shown here is that the simplest limiting procedure that brings the experimental arrangement into the scattering regime yields the standard formula of formal scattering theory. This formula should of course hold much more generally (more or less for all limits corresponding to the scattering regime), but establishing that this is so remains a formidable challenge.

Acknowledgements. The work of S. Goldstein was supported in part by NSF Grant DMS-0504504. The work of T. Moser was supported by the DFG (DU 120/10). The work of N. Zanghì was supported by INFN.
D. Dürr, S. Goldstein, T. Moser, N. Zanghì
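10. Appendix

Proof of Lemma 1. Let $\psi \in \mathcal{G}$. Then there is a $\chi \in \mathcal{G}^0$ and a $t \in \mathbb{R}$ such that $\psi = e^{-iHt}\chi$. Using the intertwining property (6) we obtain
\[ \psi_{\mathrm{out}} = \Omega_+^{-1}\psi = \Omega_+^{-1} e^{-iHt}\chi = e^{-iH_0 t}\,\Omega_+^{-1}\chi = e^{-iH_0 t}\chi_{\mathrm{out}}. \tag{123} \]
Since $\mathcal{G}^+$ is invariant under time shifts it suffices to show that $\widehat{\chi}_{\mathrm{out}}(k)$ is in $\mathcal{G}^+$. Since $x^2 H^n\chi(x) \in L^2(\mathbb{R}^3)$, $0 \le n \le 8$, and $x^4 H^n\chi(x) \in L^2(\mathbb{R}^3)$, $0 \le n \le 3$, we have
\[ H^n\chi(x) \in L^1(\mathbb{R}^3)\cap L^2(\mathbb{R}^3),\ 0 \le n \le 8, \qquad x^j H^n\chi(x) \in L^1(\mathbb{R}^3)\cap L^2(\mathbb{R}^3),\ 0 \le n \le 3,\ j \in \{1,2\}. \tag{124} \]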
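Using Proposition 1 (ii), (iii) we have for $f \in L^2(\mathbb{R}^3)$:
\[ \mathcal{F}_+\Omega_+ f = \mathcal{F}f, \tag{125} \]
and hence for $\chi = \Omega_+\chi_{\mathrm{out}}$ we have that
\[ \widehat{\chi}_{\mathrm{out}}(k) = \mathcal{F}_+\chi(k) = (2\pi)^{-3/2}\int \varphi_+^*(x,k)\,\chi(x)\,d^3x. \tag{126} \]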
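Using the intertwining property (6) we thus have:
\[ \frac{k^2}{2}\,\widehat{\chi}_{\mathrm{out}}(k) = \widehat{H_0\chi_{\mathrm{out}}}(k) = \mathcal{F}(H_0\Omega_+^{-1}\chi)(k) = \mathcal{F}(\Omega_+^{-1}H\chi)(k) = \mathcal{F}_+(H\chi)(k) = (2\pi)^{-3/2}\int \varphi_+^*(x,k)\,(H\chi)(x)\,d^3x. \tag{127} \]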
Similarly, applying $H_0^n$ to $\chi_{\mathrm{out}}$ ($0 \le n \le 8$) we obtain
\[ \frac{k^{2n}}{2^n}\,\widehat{\chi}_{\mathrm{out}}(k) = (2\pi)^{-3/2}\int \varphi_+^*(x,k)\,(H^n\chi)(x)\,d^3x. \tag{128} \]
Since the generalized eigenfunctions are bounded (Proposition 2 (ii)) and $H^n\chi \in L^1(\mathbb{R}^3)$, $0 \le n \le 8$, we obtain
\[ |\widehat{\chi}_{\mathrm{out}}(k)| \le c(1+k)^{-16} \le c(1+k)^{-15}. \tag{129} \]
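The step from (128) to (129) can be spelled out as follows. This is only a sketch: the constant $C_\varphi := \sup_{x,k}|\varphi_+(x,k)|$ is a name introduced here for the bound provided by Proposition 2 (ii), and only the cases $n = 0$ and $n = 8$ of (128) are used.

\[
\begin{aligned}
|\widehat{\chi}_{\mathrm{out}}(k)| &\le (2\pi)^{-3/2}\, C_\varphi\, \|\chi\|_1 && \text{((128) with } n = 0\text{)},\\
k^{16}\,|\widehat{\chi}_{\mathrm{out}}(k)| &\le 2^{8}\,(2\pi)^{-3/2}\, C_\varphi\, \|H^8\chi\|_1 && \text{((128) with } n = 8\text{)},\\
(1+k)^{16} &\le 2^{15}\,(1+k^{16}) && \text{(convexity of } t \mapsto t^{16}\text{)},\\
|\widehat{\chi}_{\mathrm{out}}(k)| &\le \frac{2^{15}\,(2\pi)^{-3/2}\, C_\varphi\,\bigl(\|\chi\|_1 + 2^{8}\|H^8\chi\|_1\bigr)}{(1+k)^{16}}.
\end{aligned}
\]

The last line follows by adding the first two and applying the third; it is of the form $c(1+k)^{-16}$ with $c$ depending only on $C_\varphi$, $\|\chi\|_1$ and $\|H^8\chi\|_1$.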
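Because of Proposition 2 (iii) and (124) we can differentiate (126) with respect to $k_i$ and get
\[ \bigl|\partial_{k_i}\widehat{\chi}_{\mathrm{out}}(k)\bigr| = \Bigl|(2\pi)^{-3/2}\int \bigl(\partial_{k_i}\varphi_+^*(x,k)\bigr)\,\chi(x)\,d^3x\Bigr| \le c, \quad \forall k \in \mathbb{R}^3\setminus\{0\}. \tag{130} \]
Differentiating (128) with $n = 3$ with respect to $k_i$ we obtain
\[ k^6\,\partial_{k_i}\widehat{\chi}_{\mathrm{out}}(k) = 8(2\pi)^{-3/2}\int \bigl(\partial_{k_i}\varphi_+^*(x,k)\bigr)\,(H^3\chi)(x)\,d^3x - 6k^5\,\frac{k_i}{k}\,\widehat{\chi}_{\mathrm{out}}(k). \tag{131} \]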
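Again the right-hand side is bounded because of Lemma 2 (iii), (124) and (129). Hence, we obtain with (130):
\[ \bigl|\partial_{k_i}\widehat{\chi}_{\mathrm{out}}(k)\bigr| \le c(1+k)^{-6}, \quad \forall k \in \mathbb{R}^3\setminus\{0\}. \tag{132} \]
Using Proposition 2 (iii) and (126) we may control $\kappa$ times a second derivative of $\widehat{\chi}_{\mathrm{out}}(k)$, obtaining
\[ \bigl|\kappa\,\partial_{k_j}\partial_{k_i}\widehat{\chi}_{\mathrm{out}}(k)\bigr| = \Bigl|(2\pi)^{-3/2}\int \bigl(\kappa\,\partial_{k_j}\partial_{k_i}\varphi_+^*(x,k)\bigr)\,\chi(x)\,d^3x\Bigr| \le c, \quad \forall k \in \mathbb{R}^3\setminus\{0\}. \tag{133} \]
For the last inequality we have also used (124) with $j = 2$ and $n = 0$. Similarly, using (131) we obtain
\[
\begin{aligned}
k^6\,\kappa\,\partial_{k_j}\partial_{k_i}\widehat{\chi}_{\mathrm{out}}(k) ={}& 8(2\pi)^{-3/2}\int \bigl(\kappa\,\partial_{k_j}\partial_{k_i}\varphi_+^*(x,k)\bigr)\,(H^3\chi)(x)\,d^3x - 30k^4\,\frac{k_j}{k}\frac{k_i}{k}\,\kappa\,\widehat{\chi}_{\mathrm{out}}(k) - 6k^5\,\frac{k_i}{k}\,\kappa\,\partial_{k_j}\widehat{\chi}_{\mathrm{out}}(k) \\
& - 6k^5\,\widehat{\chi}_{\mathrm{out}}(k)\,\kappa\,\frac{k\,\delta_{ij}\,k - k_i k_j}{k^3} - 6k^5\,\frac{k_j}{k}\,\kappa\,\partial_{k_i}\widehat{\chi}_{\mathrm{out}}(k),
\end{aligned}
\tag{134}
\]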
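with right-hand side that is bounded because of Proposition 2 (iii), (124), (129) and (132). Hence, using (133),
\[ \bigl|\kappa\,\partial_k^{\alpha}\widehat{\chi}_{\mathrm{out}}(k)\bigr| \le c(1+k)^{-6} \le c(1+k)^{-5}, \quad |\alpha| = 2,\ \forall k \in \mathbb{R}^3\setminus\{0\}. \tag{135} \]
Equation (132) implies also that
\[ \bigl|\partial_k\widehat{\chi}_{\mathrm{out}}(k)\bigr| \le c(1+k)^{-6}, \quad \forall k \in \mathbb{R}^3\setminus\{0\}. \tag{136} \]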
Similarly, twice differentiating (126) with respect to $k$ we obtain that
\[ \bigl|\partial_k^2\widehat{\chi}_{\mathrm{out}}(k)\bigr| \le c, \quad \forall k \in \mathbb{R}^3\setminus\{0\}, \tag{137} \]
and then twice differentiating (128) for $n = 2$ with respect to $k$ we obtain
\[ \bigl|\partial_k^2\widehat{\chi}_{\mathrm{out}}(k)\bigr| \le c(1+k)^{-4} \le c(1+k)^{-3}, \quad \forall k \in \mathbb{R}^3\setminus\{0\}, \tag{138} \]
using Proposition 2 (iv), (124), (129), (136) and (137). With (129), (132), (135) and (138) we see that $\widehat{\chi}_{\mathrm{out}}(k) \in \mathcal{G}^+$.

Proof of Lemma 2. In the proof of Proposition 3 in [11] the absolute value of the flux integrated over time and the surface $RS^2$ with $R > R_0$ (with some $R_0 > 0$ depending on the potential) is shown to be bounded (uniformly in $R$) by linear combinations of integrals involving $\widehat{\psi}_{\mathrm{out}}(k)$ and its derivatives, namely integrals over expressions corresponding to the left hand side of the inequalities in Definition 2. Thus these bounds are finite if $\widehat{\psi}_{\mathrm{out}}(k) \in \mathcal{G}^+$. To bound the integrated flux uniformly for all $\psi_y$, $y \in A$ (and
small enough and fixed), F ψ y,out (k) = F −1 + ψ y (k) (note that ψ y ∈ Ha.c. (H ), for all y ∈ A , cf. (i) in Definition 3 or 4) must be bounded as in Definition 2 with
constants uniform in $y \in A$. These constants depend, according to the proof of Lemma 1, on the norms
\[ \|H^n\psi_y\|_1,\ 0 \le n \le 8, \quad\text{and}\quad \|x^j H^n\psi_y\|_1,\ 0 \le n \le 3,\ j \in \{1,2\}. \tag{139} \]
We will show that for small enough there exists a constant C > 0 such that |H n ψ y (x)| ≤ C(1 + x)−6 , 0 ≤ n ≤ 8, ∀ y ∈ A .
(140)
Thus the norms in (139) are bounded uniformly in y ∈ A and Lemma 2 follows. It remains to establish (140). We start with n = 0. Since ψ ∈ S(R3 ) and y ∈ A , A compact, we obtain 3
|ψ y (x)| = 2 |ψ((x − y))| ≤ c(1 + |x − y|)−6 ≤ c(1 + x)−6 , ∀ y ∈ A . (141) For n = 1 we have with ψ y ≡ T y ψ (T y is the translation operator) and [T y , H0 ]− = 0, 3
|H ψ y (x)| = |(H0 + V )T y ψ (x)| = |T y H0 ψ (x)| + 2 |V (x)ψ((x − y))|. Using now |V (x)| < M < ∞ for V ∈ V or
sup
x ∈supp ψ y
(142)
|V (x)| < M < ∞ for ψ ∈
C0∞ (R3 ), V ∈ V , y ∈ A and small enough, we obtain together with (141), |H ψ y (x)| ≤ |T y H0 ψ (x)| + c(1 + x)−6 .
(143)
Since ψ ∈ S(R3 ) we have that also H0 ψ ∈ S(R3 ) so that analogously to (141), there is the bound |T y H0 ψ (x)| ≤ c(1 + x)−6 , ∀ y ∈ A .
(144)
Equations (143) and (144) yield (140) for n = 1. Analogously, we obtain (140) for 2 ≤ n ≤ 8 by using the fact that ψ ∈ S(R3 ) and |∂ xα V (x)| < M < ∞, ∀ |α| ≤ 14, if V ∈ V or sup |∂ xα V (x)| < M < ∞, ∀ |α| ≤ 14, for all y ∈ A and small enough if ψ
x ∈supp ψ y ∈ C0∞ (R3 )
and V ∈ V .
References

1. Albeverio, S., Gesztesy, F., Høegh-Krohn, R., Holden, H.: Solvable Models in Quantum Mechanics. Berlin Heidelberg New York: Springer, 1988
2. Amrein, W.O., Jauch, J.M., Sinha, K.B.: Scattering Theory in Quantum Mechanics. London: W. A. Benjamin, Inc., 1977
3. Berndl, K.: Zur Existenz der Dynamik in Bohmschen Systemen. Ph.D. thesis, Ludwig-Maximilians-Universität München, 1994
4. Berndl, K., Dürr, D., Goldstein, S., Peruzzi, G., Zanghì, N.: On the global existence of Bohmian mechanics. Commun. Math. Phys. 173(3), 647–673 (1995)
5. Bohm, D.: A suggested interpretation of the quantum theory in terms of “hidden” variables I, II. Phys. Rev. 85, 166–179, 180–193 (1952)
6. Combes, J.-M., Newton, R.G., Shtokhamer, R.: Scattering into cones and flux across surfaces. Phys. Rev. D 11(2), 366–372 (1975)
7. Dürr, D.: Bohmsche Mechanik als Grundlage der Quantenmechanik. Berlin Heidelberg New York: Springer, 2001
8. Dürr, D., Goldstein, S., Moser, T., Zanghì, N.: What does quantum scattering theory physically describe? In preparation
9. Dürr, D., Goldstein, S., Teufel, S., Zanghì, N.: Scattering theory from microscopic first principles. Physica A 279, 416–431 (2000)
10. Dürr, D., Goldstein, S., Zanghì, N.: Quantum Equilibrium and the Origin of Absolute Uncertainty. J. Stat. Phys. 67, 843–907 (1992)
11. Dürr, D., Moser, T., Pickl, P.: The Flux-Across-Surfaces Theorem under conditions on the scattering state. J. Phys. A: Math. Gen. 39, 163–183 (2006)
12. Dürr, D., Pickl, P.: Flux-across-surfaces theorem for a Dirac particle. J. Math. Phys. 44(2), 423–465 (2003)
13. Dürr, D., Teufel, S.: On the exit statistics theorem of many particle quantum scattering. In: Blanchard, P., Dell’Antonio, G.F. eds. Multiscale Methods in Quantum Mechanics: Theory and experiment, Boston: Birkhäuser, 2003
14. Ikebe, T.: Eigenfunction expansion associated with the Schrödinger operators and their applications to scattering theory. Arch. Rat. Mech. Anal. 5, 1–34 (1960)
15. Jensen, A., Kato, T.: Spectral properties of Schrödinger operators and time-decay of the wave functions. Duke Math. J. 46(3), 583–611 (1979)
16. Kato, T.: Fundamental Properties Of Hamiltonian Operators Of Schrödinger Type. Trans. Amer. Math. Soc. 70(1), 195–211 (1951)
17. Newton, R.G.: Scattering Theory of Waves and Particles, Second Edition. Berlin Heidelberg New York: Springer, 1982
18. Pearson, D.B.: Quantum Scattering and Spectral Theory. San Diego: Academic Press, 1988
19. Reed, M., Simon, B.: Methods Of Modern Mathematical Physics III: Scattering Theory. San Diego: Academic Press, 1979
20. Reed, M., Simon, B.: Methods Of Modern Mathematical Physics I: Functional Analysis. Revised and enlarged ed., San Diego: Academic Press, 1980
21. Teufel, S.: The flux-across-surfaces theorem and its implications for scattering theory. Ph.D. thesis, Ludwig-Maximilians-Universität München, 1999
22. Teufel, S., Dürr, D., Münch-Berndl, K.: The flux-across-surfaces theorem for short range potentials and wave functions without energy cutoffs. J. Math. Phys. 40(4), 1901–1922 (1999)
23. Teufel, S., Tumulka, R.: A Simple Proof for Global Existence of Bohmian Trajectories. Commun. Math. Phys. 258(2), 349–365 (2005)
24. Tumulka, R.: Closed 3-Forms and Random Worldlines. Ph.D. thesis, Ludwig-Maximilians-Universität München, 2001
25. Weinberg, S.: Quantum Theory of Fields. Volume I: Foundations. Cambridge: Cambridge University Press, 1996
26. Yajima, K.: The $W^{k,p}$-continuity of wave operators for Schrödinger operators. J. Math. Soc. Japan 47(3), 551–581 (1995)

Communicated by A. Kupiainen
Commun. Math. Phys. 266, 699–714 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0046-9
Communications in Mathematical Physics

Homogenization of Ornstein-Uhlenbeck Process in Random Environment

Gaël Benabou
Ceremade, UMR CNRS 7534, Université Paris IX - Dauphine, Place du Maréchal de Lattre de Tassigny, 75775 Paris Cedex 16, France. E-mail: [email protected]

Received: 1 September 2005 / Accepted: 20 March 2006
Published online: 13 July 2006 – © Springer-Verlag 2006
Abstract: We consider a tracer particle moving in a random environment. The velocity of the tracer is modelled by an Ornstein-Uhlenbeck process which takes into account inertia and friction. The medium results in a possibly unbounded random potential. We prove an invariance principle for this kind of motion. The method used is generalized in order to obtain a central limit theorem for a large class of processes, the most interesting application being a tagged particle in a medium of infinitely many Ornstein-Uhlenbeck particles.
1. Introduction

An aqueous suspension of particles in equilibrium at temperature $T = \beta^{-1}$ can be modeled by a system of interacting Ornstein-Uhlenbeck particles, i.e. by the following system of stochastic differential equations:
\[ dx_i(t) = v_i(t)\,dt, \qquad m\,dv_i(t) = -\gamma v_i(t)\,dt - \sum_{j\neq i}\nabla U(x_i - x_j)\,dt + \sqrt{2\gamma\beta}\,dw_i(t), \tag{1} \]
where the $w_i$s are independent Brownian motions. The mass of each particle is $m$, $U$ is a two-body interaction potential between particles, and $\gamma$ is the strength of the friction resulting from the fluid. In the overdamped case ($m\gamma^{-1}$ small), (1) is approximated by the model of interacting Brownian motions,
\[ dx_i(t) = -\frac{1}{\gamma}\sum_{j\neq i}\nabla U(x_i - x_j)\,dt + \sqrt{\frac{2\beta}{\gamma}}\,dw_i(t). \tag{2} \]
Both models are of diffusive type, and one issue is the macroscopic behaviour on a diffusive space-time scale. It has been proved (see [7, 12]) that both these models have the same macroscopic bulk diffusion. In this article, we consider the problem of the self-diffusion in equilibrium, i.e. the diffusion of one tagged particle. In the overdamped case (2), an invariance principle for its motion has been proved in [6]. But, in the inertial case (1), the question remained an open problem. The purpose of this paper is the proof of a central limit theorem for the tagged particle motion (1). The comparison with interacting Brownian motions is discussed in a recent article by the author ([14]), in which it is proved that the self-diffusion matrix is strictly smaller in the inertial case. Moreover, the self-diffusion matrix converges to the one for the non-inertial case when the damping goes to infinity. We tag one of the particles of the system (1) and denote by $x(t)$ its position at time $t$. We prove the following central limit theorem for the trajectory of this particle:

Theorem 1. In equilibrium, the process $\varepsilon x(\varepsilon^{-2}t)$ converges in the limit $\varepsilon \to 0$ weakly to a Brownian motion with deterministic diffusion matrix which is defined in Proposition 4.1.

First we prove the central limit theorem in the case where the tagged particle is moving in a frozen random environment. Then, we extend this result to a larger class of dynamics, so that Theorem 1 can be seen as a particular case.

Let us introduce formally the frozen environment process. Let $(X, \mathcal{F}, \mu)$ be a probability space on which a group of measure preserving transformations $G = \{\tau_x, x \in \mathbb{R}^d\}$ acts ergodically. The action is also assumed to be stochastically continuous. Under this hypothesis, we can define the infinitesimal generator $D$ of $G$, which is a closed unbounded operator over $L^2(\mu)$. Let $\tilde V(x, \eta)$ be a real-valued random potential given by a stationary random field on $\mathbb{R}^d \times X$, i.e. there exists $V$ such that $\tilde V(x, \eta) = V(\tau_{-x}\eta)$. We suppose $V \in L^2(\mu)$ and $DV \in L^2(\mu)$. $\tilde\gamma(x, \eta)$ is also a stationary random field, which satisfies $\tilde\gamma(x, \eta) = \gamma(\tau_{-x}\eta)$, $\inf_X \gamma = \gamma_* > 0$ and $\sup_X \gamma < \infty$. Throughout the article, the canonical Euclidean norm on $\mathbb{R}^d$ is denoted by $|\cdot|$, and $\cdot$ denotes the associated scalar product. For every $\eta \in X$, we consider on $\mathbb{R}^d$ the following system of stochastic differential equations
\[
\begin{aligned}
dx(t) &= v(t)\,dt,\\
dv(t) &= -\tilde\gamma(x(t), \eta)\,v(t)\,dt - \nabla_x\tilde V(x(t), \eta)\,dt + \sqrt{2\tilde\gamma(x(t), \eta)\,\beta}\;dw(t),
\end{aligned}
\tag{3}
\]
with initial conditions
\[ x(0) = 0, \qquad v(0) \sim G(dv) = \frac{e^{-|v|^2/(2\beta)}}{(2\pi\beta)^{d/2}}\,dv, \]
where $w$ is a standard Brownian motion on $\mathbb{R}^d$, and $\beta$ is the inverse temperature of the medium. We assume here the existence of the dynamics (in the situation described in [2], this existence can be proved, due to the boundedness of $\nabla_x\tilde V$), and we want to prove that it satisfies an invariance principle.

The usual way to treat this kind of problem is to introduce the environment process over $X \times \mathbb{R}^d$, say $(\eta(t), v(t)) = (\tau_{-x(t)}\eta, v(t))$. This is an autonomous Markov process with formal stationary measure $d\pi(\eta, v) = Z^{-1} e^{-V(\eta)/\beta}\, e^{-|v|^2/(2\beta)}\, d\mu(\eta)\,dv$, where $Z$ is a normalizing constant.
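To make the dynamics (3) concrete, the following is a minimal simulation sketch. It rests on several assumptions that are not in the text: one space dimension, a constant friction `gamma`, a fixed deterministic potential gradient passed in as `grad_V` (standing in for one frozen realization of the environment), and a plain Euler-Maruyama discretization. The function and parameter names are hypothetical.

```python
import math
import random


def simulate_langevin(grad_V, gamma, beta, dt, n_steps, n_paths, seed=0):
    """Euler-Maruyama paths of
        dx = v dt,
        dv = -gamma * v dt - grad_V(x) dt + sqrt(2 * gamma * beta) dw,
    started from x(0) = 0 and v(0) ~ N(0, beta).
    Returns the lists of final positions and final velocities.
    """
    rng = random.Random(seed)
    noise_scale = math.sqrt(2.0 * gamma * beta * dt)
    final_x, final_v = [], []
    for _ in range(n_paths):
        x = 0.0
        v = rng.gauss(0.0, math.sqrt(beta))  # Gaussian initial velocity G(dv)
        for _ in range(n_steps):
            force = -grad_V(x)
            # Both updates use the values from the previous step.
            x, v = (x + v * dt,
                    v - gamma * v * dt + force * dt
                    + noise_scale * rng.gauss(0.0, 1.0))
        final_x.append(x)
        final_v.append(v)
    return final_x, final_v


def sample_variance(samples):
    """Unbiased sample variance."""
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / (len(samples) - 1)


if __name__ == "__main__":
    # Illustrative periodic potential V(x) = cos(x), so grad V(x) = -sin(x).
    xs, vs = simulate_langevin(lambda x: -math.sin(x), gamma=1.0, beta=0.5,
                               dt=0.02, n_steps=250, n_paths=500)
    print("velocity variance:", sample_variance(vs))
    print("position variance:", sample_variance(xs))
```

With `grad_V` identically zero the velocity is a stationary Ornstein-Uhlenbeck process with variance `beta`, and the position variance at time $t$ is $2(\beta/\gamma)\,\bigl(t - (1 - e^{-\gamma t})/\gamma\bigr)$, which gives a simple check of the discretization.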
In [5], Kipnis and Varadhan develop a scheme for proving central limit theorems for any stochastic dynamics with a reversible invariant measure. The problem of the tagged particle in a system of interacting particles with a reversible measure can also be treated with this method, cf. [5] for the symmetric simple exclusion, and [6] for interacting Brownian particles (2). Moreover, an extension of these results can be proved for nonreversible dynamics with sector condition, or weak sector condition (cf. [9, 10]; see also [11] for a review). This method is based on a martingale approximation and uses certain estimates on the resolvent.

The problems we investigate in this paper do not satisfy any sector condition. The generator of $(\eta(t), v(t))$ is not symmetric with respect to the invariant measure $d\pi$ and is also degenerate in the positional variables. However, the system (3) has already been studied in [2] in the case of constant $\gamma$. In their article, Papanicolaou and Varadhan assume $V$ to be bounded along with its derivatives. This assumption makes it impossible to use their method for the problem of a tagged particle in a system of interacting Ornstein-Uhlenbeck particles (1), since there is no bound on the number of particles interacting with the tracer particle.

The method of Papanicolaou and Varadhan is inspired by the methods used in [5]. The authors study the resolvent equation corresponding to the generator of the environment process. Since this generator is degenerate, the solution of the resolvent equation lacks regularity in the space variable. This causes problems while trying to prove the usual Kipnis and Varadhan conditions for central limit theorems, see [5] and Proposition 2.4 in the present article. Papanicolaou and Varadhan managed to gain enough regularity in the space variable for the solution of the resolvent equation by integrating it over the velocities. But this regularity is only obtained under the condition of a bounded force.
Clearly, this method does not fit in Kipnis and Varadhan’s scheme, which only relies on the time reversal symmetry of the system. We propose here a way to avoid the hypothesis of boundedness of the force, motivated by the physical interpretation of the model. The main idea of our proof is the use of another kind of symmetry of the problem, which, in some sense, “replaces” the time reversibility. If one considers the time-reversed process $(x(T - t), v(T - t))$, it is very easy to see that, in the stationary measure $\pi$, it has the same distribution as the process $(-x(t), v(t))$. We exploit this symmetry by considering that the involution $(\eta, v) \to (\eta, -v)$ only affects the antisymmetric part of the generator. We try to generalize this approach by presenting a more abstract and general version of the result which extends to a larger class of dynamics which have the same kind of symmetry. This extension allows us to treat the problem of the tagged particle in the system of interacting Ornstein-Uhlenbeck particles (1).

1.1. Results. The first result of this article is the following

Theorem 2. In equilibrium, under the hypotheses $V \in L^2(\mu)$, $e^{-V/\beta} \in L^1(\mu)$, $DV \in L^2(\pi)$, and hypotheses H1 to H3 stated below, the process $\varepsilon x(\varepsilon^{-2}t)$ defined by (3) converges weakly in $\pi$-probability as $\varepsilon \to 0$ to a Brownian motion whose diffusion matrix is the only symmetric matrix $\Sigma_{OU}$ defined for all $l \in \mathbb{R}^d$ by $l \cdot \Sigma_{OU}\, l = \beta\langle|\xi_l|^2\rangle$, where $\xi_l$ depends linearly on $l$ and is defined in Sect. 2, Proposition 2.4. If we suppose moreover $e^{V/\beta} \in L^1(\mu)$ then this diffusion is non-degenerate, i.e. there exists $\alpha > 0$ such that $\Sigma_{OU} > \alpha\,\mathrm{Id}_{\mathbb{R}^d}$.

The second result presented in this paper is an extension of the previous one to a larger set of dynamics. See Theorems 3 and 4 for the exact statements. Theorem 1 is a direct consequence of these.
2. Homogenization in a Frozen Environment

2.1. Preliminaries. The group of transformations $G = \{\tau_x, x \in \mathbb{R}^d\}$ acting on the probability space $(X, \mathcal{F}, \mu)$ is supposed to be commutative, measure preserving and ergodic, i.e.
1. $\forall (x, y) \in (\mathbb{R}^d)^2$, $\tau_{x+y} = \tau_x\tau_y = \tau_y\tau_x$;
2. $\forall x \in \mathbb{R}^d$, $\forall A \in \mathcal{F}$, $\mu(\tau_x A) = \mu(A)$;
3. for $A \in \mathcal{F}$, if $\forall x \in \mathbb{R}^d$, $A = \tau_x A$, then $\mu(A) = 0$ or $\mu(A) = 1$.

We assume that the associated group of operators $\{T_x, x \in \mathbb{R}^d\}$, defined on $L^2(\mu)$ by $T_x f : \eta \mapsto f(\tau_{-x}\eta)$ for all $f \in L^2(\mu)$, is stochastically continuous:
\[ \forall \delta > 0,\ f \in L^2(\mu), \qquad \lim_{h \to 0} \mu\bigl(|T_h f(\eta) - f(\eta)| \ge \delta\bigr) = 0, \]
which implies that $\{T_x\}$ is a strongly continuous unitary group of operators on $L^2(\mu)$. The infinitesimal generator of $\{T_x\}$ is defined for a suitable $f$ by
\[ Df(\eta) = \nabla_x (T_x f)(\eta)\big|_{x=0}. \]
$D$ is a closed unbounded operator, whose domain $\mathcal{D}(D)$ is dense in $L^2(\mu)$. Let $\tilde V$ be a random stationary potential defined on $\mathbb{R}^d \times X$, i.e. $\tilde V(x, \eta) = T_x V(\eta)$. $V$ is taken in $L^2(\mu)$ and we suppose that
\[ V \in \mathcal{D}(D), \qquad \int_X |DV(\eta)|^2\,\mu(d\eta) < \infty, \tag{4} \]
and that there is a positive $\beta_0$ such that
\[ \int_X e^{-V(\eta)/\beta_0}\,\mu(d\eta) < \infty. \tag{5} \]
Finally, notice that $\nabla_x\tilde V(x, \eta) = DV(\tau_{-x}\eta)$. $\tilde\gamma$ is also a random stationary field with representation $\gamma$. We suppose
\[ \inf_X \gamma = \gamma_* > 0. \tag{6} \]
We also suppose for convenience $\|\gamma\|_\infty < \infty$, so that the existence of solutions of (3) is ensured under good hypotheses on $V$, see below. This condition could certainly be weakened.

2.2. The Ornstein-Uhlenbeck Process. With the same notations, we now consider on $(\mathbb{R}^d)^2$ the diffusion $(x(t), v(t))$ solution of Eq. (3) with $0 < \beta \le \beta_0$. We assume good enough hypotheses on $V$ such that the existence of the solution $(x(t), v(t))$ of (3) is almost surely ensured for all positive $t$. For instance, one can suppose the boundedness of $DV$, or the usual global Lipschitz conditions in $x$ for $\nabla_x V(x, \eta)$. In fact, the main purpose of this article being the study of (30), the existence of the dynamics for this model is proved under very large conditions in [12]. Then, the generator of $(x(t), v(t))$ is given by
\[ L^\eta = \tilde\gamma(x, \eta)\,(\beta\Delta_v - v \cdot \nabla_v) + v \cdot \nabla_x - \nabla_x\tilde V(x, \eta) \cdot \nabla_v. \]
We associate to $(x(t), v(t))$ the Markov process $(\eta(t), v(t))$ defined on $X \times \mathbb{R}^d$, where $\eta(t)$ is the environment as seen by an observer “sitting on the particle”. We define $\eta(t)$ on $X$ by
\[ \eta(t) = \tau_{-x(t)}\eta, \qquad \eta(0) = \eta. \]
We make the following hypotheses on the semi-group of $(\eta(t), v(t))$. Notice that the following assertions can be proved under certain hypotheses on $V$, including boundedness of $DV$, the semigroup having enough regularity in this case. The main issue in general is about the existence of a core of regular enough functions for $L$.

H 1. $(\eta(t), v(t))$ is a Markov process on $X \times \mathbb{R}^d$ whose semi-group is given by
\[ P^t f(\eta, v) = E_{\eta, v}[f(\eta(t), v(t))] = \int_{\mathbb{R}^d} P^\eta(t, 0, dv, dy)\, f(\tau_{-y}\eta, v), \]
where $P^t$ is defined on $L^\infty(X)$ and $P^\eta(t, x, v, \cdot)$ is the transition probability of $(x(t), v(t))$.

H 2. The generator $L$ of $(\eta(t), v(t))$ is given by an extension of
\[ \gamma(\eta)(\beta\Delta_v - v \cdot \nabla_v) + v \cdot D - DV(\eta) \cdot \nabla_v \tag{7} \]
and we denote by $S$ and $A$ the corresponding extensions of
\[ \gamma(\eta)(\beta\Delta_v - v \cdot \nabla_v) \quad\text{and}\quad v \cdot D - DV(\eta) \cdot \nabla_v. \tag{8} \]

H 3. The probability measure
\[ d\pi = \frac{e^{-V(\eta)/\beta}\,\mu(d\eta)\,G(dv)}{\int_X e^{-V(\eta)/\beta}\,\mu(d\eta)} \]
is stationary and ergodic for $P^t$. Moreover $S$ and $A$ are respectively symmetric and antisymmetric with respect to $\pi$, and $S$ is self-adjoint. $P^t$ is therefore strongly continuous.

In the following, $\langle\cdot\rangle$ denotes the expectation with respect to $d\pi$, and $\langle\cdot,\cdot\rangle$ the associated scalar product on $L^2(\pi)$. The next lemma gives the key to all the computations done in this article.

Lemma 2.1. For all $\phi, \psi$ in $\mathcal{D}(D)$,
(a) $\int_X D\phi(\eta)\,\mu(d\eta) = 0$,
(b) $\int_X D\phi(\eta)\,\psi(\eta)\,\mu(d\eta) = -\int_X D\psi(\eta)\,\phi(\eta)\,\mu(d\eta)$,
(c) $\langle\phi, DV\rangle = \beta\langle D\phi\rangle$,
(d) $\langle v\phi\rangle = \beta\langle\nabla_v\phi\rangle$.

♦ The proof of this lemma is a simple computation. Let us notice that (c) is a consequence of the chain rule $D e^{V} = DV\, e^{V}$ and of (b). ♦
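2.3. The resolvent equation.

2.3.1. Introduction of useful functional spaces. We introduce the Dirichlet form $\mathcal{E}(\phi, \psi) = -\langle S\phi, \psi\rangle$ associated to $L$ as the closure of the following one given for any $\phi$ and $\psi$ smooth in $v$ by:
\[ \mathcal{E}(\phi, \psi) = -\langle S\phi, \psi\rangle = \beta\langle\gamma\,\nabla_v\phi \cdot \nabla_v\psi\rangle. \]
Let $\mathcal{D}(\mathcal{E})$ be its domain. We define $H_1$ as being the completion of $\{\phi \in \mathcal{D}(\mathcal{E}),\ \|\phi\|_1^2 = -\beta^{-1}\mathcal{E}(\phi, \phi) < \infty\}$ for the norm $\|\cdot\|_1$ defined as above. We also denote $\tilde H_1 = H_1 \cap L^2(\pi)$ endowed with the norm $\|\cdot\|_{\tilde H_1}^2 = \|\cdot\|_{L^2}^2 + \gamma_*\beta\,\|\cdot\|_1^2$. Notice that they are both Hilbert spaces. We define also the dual space $\tilde H_{-1}$ of $\tilde H_1$ by the completion of the set of all the functions $\psi$ from $\tilde H_1$ such that
\[ \|\psi\|_{\tilde H_{-1}}^2 = \sup_{\phi \in \tilde H_1}\bigl(2\langle\psi, \phi\rangle - \|\phi\|_{\tilde H_1}^2\bigr) < \infty \]
with respect to the norm $\|\cdot\|_{\tilde H_{-1}}$. The action of $\tilde H_{-1}$ is denoted by ${}_{\tilde H_{-1}}\langle\cdot,\cdot\rangle_{\tilde H_1}$. We define $H_{-1}$ in the same way.

2.3.2. Existence and uniqueness of solutions for the resolvent equation. Let us consider the $H_{-1}$-weak resolvent equation for all $\lambda > 0$,
\[ \lambda h_\lambda - L h_\lambda = l \cdot v, \tag{9} \]
where $l$ is fixed in $\mathbb{R}^d$, in the sense that we are looking for $h_\lambda$ such that for any $\phi \in H_1$,
\[ \lambda\langle h_\lambda, \phi\rangle - {}_{\tilde H_{-1}}\langle L h_\lambda, \phi\rangle_{\tilde H_1} = \langle l \cdot v\,\phi\rangle. \]
Besides the tightness of the process and the ergodicity of the stationary measure, Kipnis and Varadhan’s theory needs some information (given below by Proposition 2.4) on the solution to the resolvent equation. Our purpose is now to prove the statements of Proposition 2.4. The next proposition states the following very usual result in homogenization theory: the existence and the uniqueness of the solution $h_\lambda$ of (9) is ensured by $l \cdot v \in H_{-1}$.

Proposition 2.2. Equation (9) has a unique solution $h_\lambda$ in $\tilde H_1$, and $A h_\lambda$ and $S h_\lambda$ are in $\tilde H_{-1}$. Moreover
\[ \lambda\langle h_\lambda^2\rangle \le \frac{\beta}{\gamma_*}\,|l|^2 \quad\text{and}\quad \langle\gamma\,|\nabla_v h_\lambda|^2\rangle \le \frac{|l|^2}{\gamma_*}, \tag{10} \]
and
\[ \lambda\langle h_\lambda^2\rangle + \beta\langle\gamma\,|\nabla_v h_\lambda|^2\rangle \le \beta\langle l \cdot \nabla_v h_\lambda\rangle. \tag{11} \]

♦ A proof of these results is given in [2], Lemma 2.1. ♦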
2.3.3. Asymptotic behaviour of the solution. Let us recall here briefly the method developed in [2] by Papanicolaou and Varadhan to prove a central limit theorem for the solution of (3). They manage to write $h_\lambda(\eta, v) = b_\lambda(\eta, v) + c_\lambda(\eta)$, where the family $(b_\lambda)$ is uniformly bounded in $L^2(\pi)$ with respect to $\lambda$, along with the family $(Dc_\lambda)$. Using this decomposition, they are able to prove that for all $\lambda > 0$, $\mu > 0$, the quantity $\langle A h_\lambda, h_\mu\rangle$ goes to 0 when $\lambda$ and $\mu$ go successively to 0. This result is sufficient to prove Proposition 2.4. Unfortunately, such a decomposition for $h_\lambda$ can only be obtained using the hypothesis of the boundedness of $DV$. We propose here another very simple way to decompose $h_\lambda$, which leads to the same result without any further hypothesis. From now on, $f_\lambda$ and $g_\lambda$ denote respectively the even and the odd part of $h_\lambda$ with respect to $v$, i.e. for all $(\eta, v)$,
\[ f_\lambda(\eta, v) = \frac{h_\lambda(\eta, v) + h_\lambda(\eta, -v)}{2}, \qquad g_\lambda(\eta, v) = \frac{h_\lambda(\eta, v) - h_\lambda(\eta, -v)}{2}. \tag{12} \]
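The decomposition (12) can be illustrated on an explicit function. The sketch below is not from the paper: it assumes a scalar velocity and a scalar stand-in `eta` for the environment, and the helper names (`even_part`, `odd_part`, `gaussian_average`) are hypothetical. It checks numerically that the two parts recombine to the original function and that their product has zero average against the centred Gaussian velocity law of variance `beta`.

```python
import math


def even_part(h):
    """Return f(eta, v) = (h(eta, v) + h(eta, -v)) / 2."""
    return lambda eta, v: 0.5 * (h(eta, v) + h(eta, -v))


def odd_part(h):
    """Return g(eta, v) = (h(eta, v) - h(eta, -v)) / 2."""
    return lambda eta, v: 0.5 * (h(eta, v) - h(eta, -v))


def gaussian_average(func, beta, half_width=10.0, n_points=4001):
    """Trapezoid-rule approximation of the integral of func(v)
    against the one-dimensional N(0, beta) density."""
    step = 2.0 * half_width / (n_points - 1)
    norm = math.sqrt(2.0 * math.pi * beta)
    total = 0.0
    for i in range(n_points):
        v = -half_width + i * step
        weight = 0.5 * step if i in (0, n_points - 1) else step
        total += weight * func(v) * math.exp(-v * v / (2.0 * beta)) / norm
    return total


if __name__ == "__main__":
    # Arbitrary test function mixing even and odd terms in v.
    h = lambda eta, v: eta * v ** 3 + v ** 2 + math.sin(v) + 1.0
    f, g = even_part(h), odd_part(h)
    print("f + g - h at v = 0.7:", f(0.3, 0.7) + g(0.3, 0.7) - h(0.3, 0.7))
    print("<f g> under N(0, 0.5):",
          gaussian_average(lambda v: f(0.3, v) * g(0.3, v), beta=0.5))
```

The vanishing of the mixed average reflects the invariance of the Gaussian velocity law under $v \to -v$, which is the symmetry the argument relies on.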
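The following lemma holds.

Lemma 2.3. The family $(g_\lambda)$ is uniformly bounded in $L^2(\pi)$,
\[ \langle g_\lambda^2\rangle \le \frac{\beta}{\gamma_*^2}\,|l|^2. \tag{13} \]

♦ The operator $S$ has a spectral gap over $\mathbb{R}^d$ larger than $\gamma_*$, because the velocities have a Gaussian distribution. Then, we may apply Poincaré’s inequality, which gives
\[ \int_{\mathbb{R}^d} g_\lambda(\eta, v)^2\,G(dv) \le -\frac{1}{\gamma_*}\int_{\mathbb{R}^d} g_\lambda(\eta, v)\,S g_\lambda(\eta, v)\,G(dv) = \beta\,\frac{\gamma(\eta)}{\gamma_*}\int_{\mathbb{R}^d} |\nabla_v g_\lambda(\eta, v)|^2\,G(dv), \]
and then due to (10)
\[ \langle g_\lambda^2\rangle \le \frac{\beta}{\gamma_*}\langle\gamma\,|\nabla_v g_\lambda|^2\rangle \le \frac{\beta}{\gamma_*}\langle\gamma\,|\nabla_v h_\lambda|^2\rangle \le \frac{\beta}{\gamma_*^2}\,|l|^2. \ ♦ \]

The following proposition proves the basic hypotheses of the Kipnis and Varadhan method to obtain central limit theorems ([5]).

Proposition 2.4. We have
\[ \lim_{\lambda \to 0} \lambda\langle h_\lambda^2\rangle = 0. \tag{14} \]
Furthermore, there exists $\xi$, $|\xi| \in L^2(\pi)$, so that
\[ \lim_{\lambda \to 0} \bigl\|\sqrt{\gamma}\,\nabla_v h_\lambda - \xi\bigr\|_{L^2(\pi)} = 0. \tag{15} \]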
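As a sanity check on the two limits, one can work out a free case by hand. The following sketch assumes $V \equiv 0$ and a constant friction $\gamma$; this special case is chosen here for illustration and is not treated separately in the text. Then $DV = 0$ and the candidate solution of (9) does not depend on $\eta$:

\[
\begin{aligned}
h_\lambda(\eta, v) &= \frac{l \cdot v}{\lambda + \gamma},\\
L h_\lambda &= \gamma(\beta\Delta_v - v \cdot \nabla_v)h_\lambda + v \cdot D h_\lambda = -\frac{\gamma\, l \cdot v}{\lambda + \gamma},\\
\lambda h_\lambda - L h_\lambda &= l \cdot v,\\
\lambda\langle h_\lambda^2\rangle &= \frac{\lambda\,\beta\,|l|^2}{(\lambda + \gamma)^2} \xrightarrow[\lambda \to 0]{} 0,\\
\sqrt{\gamma}\,\nabla_v h_\lambda &= \frac{\sqrt{\gamma}\; l}{\lambda + \gamma} \xrightarrow[\lambda \to 0]{} \frac{l}{\sqrt{\gamma}} =: \xi.
\end{aligned}
\]

Here $\langle (l \cdot v)^2\rangle = \beta|l|^2$ because the velocity is Gaussian with covariance $\beta\,\mathrm{Id}$. In this special case $\beta\langle|\xi|^2\rangle = \beta|l|^2/\gamma$, which is half the long-time growth rate $2(\beta/\gamma)|l|^2$ of the variance of $l \cdot x(t)$ for a free Ornstein-Uhlenbeck particle.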
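♦ Using (12), (9) gives
\[ \lambda f_\lambda - S f_\lambda - A g_\lambda = 0 \tag{16} \]
and
\[ \lambda g_\lambda - S g_\lambda - A f_\lambda = l \cdot v. \tag{17} \]
Let $\mu$ be any positive real number. Multiplying (17) by $g_\mu$ and integrating it with respect to $\pi$, we obtain
\[ \lambda\langle g_\lambda, g_\mu\rangle + \beta\langle\gamma\,\nabla_v g_\lambda \cdot \nabla_v g_\mu\rangle - {}_{\tilde H_{-1}}\langle A f_\lambda, g_\mu\rangle_{\tilde H_1} = \beta\langle l \cdot \nabla_v g_\mu\rangle. \]
Using the same regularisation as used in the proof of Proposition 2.2 (see [2] for details), one can prove easily that
\[ {}_{\tilde H_{-1}}\langle A f_\lambda, g_\mu\rangle_{\tilde H_1} = -\,{}_{\tilde H_{-1}}\langle A g_\mu, f_\lambda\rangle_{\tilde H_1}, \]
and then using (16), we obtain
\[
\begin{aligned}
\lambda\langle g_\lambda, g_\mu\rangle + \mu\langle f_\lambda, f_\mu\rangle + \beta\langle\gamma\,\nabla_v g_\lambda \cdot \nabla_v g_\mu\rangle + \beta\langle\gamma\,\nabla_v f_\lambda \cdot \nabla_v f_\mu\rangle
&= \lambda\langle g_\lambda, g_\mu\rangle + \mu\langle f_\lambda, f_\mu\rangle + \beta\langle\gamma\,\nabla_v h_\lambda \cdot \nabla_v h_\mu\rangle\\
&= \beta\langle l \cdot \nabla_v g_\mu\rangle = \beta\langle l \cdot \nabla_v h_\mu\rangle.
\end{aligned}
\tag{18}
\]
Due to (10), the family $(\sqrt{\gamma}\,\nabla_v h_\lambda)$ is weakly compact in $L^2(\pi)$ when $\lambda$ goes to 0. Let $\xi$ be a weak limiting point of this family. We also denote by $g_0$ an $L^2$-weak limiting point of the family $(g_\lambda)$. We consider a subsequence $\lambda_n$ such that $\sqrt{\gamma}\,\nabla_v h_{\lambda_n}$ converges weakly to $\xi$ and $(g_{\lambda_n})$ converges weakly to $g_0$. Then (18) gives
\[ \lambda_n\langle g_{\lambda_n}, g_{\lambda_p}\rangle + \lambda_p\langle f_{\lambda_n}, f_{\lambda_p}\rangle + \beta\langle\gamma\,\nabla_v h_{\lambda_n} \cdot \nabla_v h_{\lambda_p}\rangle = \beta\langle l \cdot \nabla_v h_{\lambda_p}\rangle. \tag{19} \]
Letting $p$ and then $n$ go to infinity, we obtain
\[ \langle|\xi|^2\rangle = \langle l \cdot \xi\rangle. \tag{20} \]
Equation (11) gives for all $\lambda > 0$,
\[ \lambda\langle h_\lambda^2\rangle + \beta\langle\gamma\,|\nabla_v h_\lambda|^2\rangle \le \beta\langle l \cdot \nabla_v h_\lambda\rangle \tag{21} \]
and then
\[ \limsup\,\langle\gamma\,|\nabla_v h_{\lambda_n}|^2\rangle \le \langle l \cdot \xi\rangle. \]
Then, due to the weak convergence of $(\sqrt{\gamma}\,\nabla_v h_{\lambda_n})$ to $\xi$, we have
\[ \langle l \cdot \xi\rangle = \langle|\xi|^2\rangle \le \liminf\,\langle|\sqrt{\gamma}\,\nabla_v h_{\lambda_n}|^2\rangle \le \limsup\,\langle|\sqrt{\gamma}\,\nabla_v h_{\lambda_n}|^2\rangle \le \langle l \cdot \xi\rangle \tag{22} \]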
and $(\sqrt{\gamma}\,\nabla_v h_{\lambda_n})$ converges strongly to $\xi$. Now (21) implies (14) for the subsequence $\lambda_n$. The last thing that needs to be proved now is the uniqueness of this limit. Let us suppose that there exists another limiting point $\xi'$. Let $\sqrt{\gamma}\,\nabla_v h_{\lambda_n}$ and $\sqrt{\gamma}\,\nabla_v h_{\mu_p}$ be two subsequences which converge respectively to $\xi$ and $\xi'$. Writing Eq. (19) with $\lambda = \lambda_n$, $\mu = \mu_p$, and letting $n$ and $p$ go to infinity, we obtain
\[ \langle\xi \cdot \xi'\rangle = \langle l \cdot \xi'\rangle = \langle|\xi'|^2\rangle. \]
Exchanging the roles of $\lambda_n$ and $\mu_p$, it is obvious that we also have
\[ \langle\xi \cdot \xi'\rangle = \langle l \cdot \xi\rangle = \langle|\xi|^2\rangle = \langle|\xi'|^2\rangle. \]
And then
\[ \xi = \xi', \tag{23} \]
and the limit is unique. The weakly compact sequence $(\sqrt{\gamma}\,\nabla_v h_\lambda)$ has a unique possible limiting point $\xi$, then it converges weakly to this point. The same argument as above proves that this convergence is strong. The proof of the proposition is therefore complete. ♦

2.4. Tightness of the process. In order to prove the continuity of the limit process in the central limit theorem (Theorem 2), we need the following compactness result:

Proposition 2.5. The following inequality holds for all $T \ge 0$:
\[ E\Bigl[\sup_{0 \le s \le t \le T} |x(t) - x(s)|^2\Bigr] \le 8T\,\frac{\beta}{\gamma_*}. \]

♦ One can rewrite
\[ x(t) - x(s) = \int_s^t v(u)\,du. \]
The proposition is then a direct consequence of Proposition 2.1.1 of [11], which gives
\[ E\Bigl[\sup_{0 \le s \le t \le T} \Bigl|\int_s^t v(u)\,du\Bigr|^2\Bigr] \le 8T\,\|v\|_{\tilde H_{-1}}^2. \ ♦ \]
2.5. Central limit theorem. Now, the homogenization theorem (Theorem 2) can be proved in the same way as in [5]. Using the resolvent equation (9), we have
\[
\begin{aligned}
\varepsilon\, l \cdot x(\varepsilon^{-2}t) &= \varepsilon\int_0^{\varepsilon^{-2}t} l \cdot v(s)\,ds\\
&= \varepsilon M(\varepsilon^{-2}t) + \varepsilon\bigl[h_{\varepsilon^2}(\eta, v(0)) - h_{\varepsilon^2}(\eta(\varepsilon^{-2}t), v(\varepsilon^{-2}t))\bigr] + \varepsilon^3\int_0^{\varepsilon^{-2}t} h_{\varepsilon^2}(\eta(s), v(s))\,ds,
\end{aligned}
\]
where $M(t)$ is the martingale
\[ M(t) = -\int_0^t \sqrt{2\beta\,\gamma(\eta(s))}\;\nabla_v h_{\varepsilon^2}(\eta(s), v(s)) \cdot dw_s. \]
Then due to (14), we can prove that
\[ E_\pi\Bigl[\bigl(\varepsilon\, l \cdot x(\varepsilon^{-2}t) - \varepsilon M(\varepsilon^{-2}t)\bigr)^2\Bigr] \to 0 \]
when $\varepsilon$ goes to 0, and due to (15) and to the ergodicity of $\pi$, we can prove a central limit theorem for the martingale $M(t)$ whose quadratic variation is given by
\[ \langle M\rangle_t = 2\beta\int_0^t \gamma(\eta(s))\,|\nabla_v h_{\varepsilon^2}(\eta(s), v(s))|^2\,ds. \]
Notice that these computations are correct if we assume the existence of a core of regular enough functions for $L$, as we considered here a weak version of the resolvent equation. One can also refer to [8], pp. 58–59, for a proof of the non-degeneracy of the limit process.

3. Central Limit Theorems for Non-Reversible Markov Processes

We propose in this section to extend the previous result to a more general set of dynamics. Let $(\Omega, \mathcal{F}, \pi)$ be a Polish probability space. The expectation with respect to $\pi$ is denoted by $\langle\cdot\rangle$. Let $\eta(t)$ be a Markov process with state space $\Omega$.

3.1. General setup. We suppose that there exists a one-to-one, $\pi$-preserving map on $\Omega$, which we denote for any $\eta \in \Omega$ by $\eta^\perp$, satisfying $(\eta^\perp)^\perp = \eta$. We define on $L^2(\pi)$ the canonical involution $f^\perp(\eta) = f(\eta^\perp)$ for all $f \in L^2(\pi)$, $\eta \in \Omega$. Let $\mathcal{F}_0$ be a sub-$\sigma$-algebra of $\mathcal{F}$. We denote the conditional expectation with respect to $\mathcal{F}_0$ by $\langle\cdot\,|\,\mathcal{F}_0\rangle$. Let $C_0$ be the subspace of $L^2(\pi)$ consisting of all $\mathcal{F}_0$-measurable functions. We suppose that for any $f \in C_0$, $f^\perp = f$. Let $P^t$ be the semi-group of operators over $L^2(\pi)$ corresponding to $\eta(t)$. $P^t$ is supposed to have the following properties.

P 1. $\pi$ is a stationary ergodic measure for $P^t$; thus, $P^t$ is a strongly continuous contraction semi-group over $L^2(\pi)$. We can define its infinitesimal generator $L$ as a closed unbounded operator over $L^2(\pi)$ whose domain $\mathcal{D}(L)$ is dense in $L^2(\pi)$, and the adjoint $L^*$ of $L$, which is also the generator of a strongly continuous semi-group.

P 2. $\mathcal{D}(L) \cap \mathcal{D}(L^*)$ contains a core for $L$ and $L^*$. Then we can define $S$ and $A$ as being respectively the symmetric and the antisymmetric part of $L$ with respect to $\pi$, and $S$ is therefore self-adjoint. We suppose also $\mathcal{D}(L)^\perp = \mathcal{D}(L)$, $\mathcal{D}(S)^\perp = \mathcal{D}(S)$, $\mathcal{D}(A)^\perp = \mathcal{D}(A)$;

P 3. $S$ satisfies a Poincaré inequality, i.e. there exists $G > 0$ such that for all $f \in \mathcal{D}(\mathcal{E})$ satisfying $\langle f\,|\,\mathcal{F}_0\rangle = 0$, we have $\langle f^2\,|\,\mathcal{F}_0\rangle \le G\,\langle -Sf, f\,|\,\mathcal{F}_0\rangle$;

P 4. For all $f \in \mathcal{D}(S)$, $(Sf)^\perp = S(f^\perp)$, and for all $f \in \mathcal{D}(S)$, $(Af)^\perp = -A(f^\perp)$.
We introduce the space $H_1$, the completion of $\{\varphi \in D(\mathcal{E}) : \mathcal{E}(\varphi, \varphi) < \infty\}$ with respect to the norm $\|\varphi\|_1 = \sqrt{\mathcal{E}(\varphi, \varphi)}$. We also introduce $\tilde{H}_1 = H_1 \cap L^2(\pi)$, endowed with the norm $\|\cdot\|_{\tilde{1}}^2 = \|\cdot\|_1^2 + \|\cdot\|_{L^2(\pi)}^2$, and the dual spaces $H_{-1}$ and $\tilde{H}_{-1}$ of these spaces. Let $\psi \in L^1(\pi) \cap H_{-1}$. We consider the weak resolvent equation $(\lambda - L)h_\lambda = \psi$. As in the case treated in Sect. 2, the following proposition holds.

Proposition 3.1. There exist sequences $(S_n)$ and $(A_n)$ of bounded operators over $L^2(\pi)$ and a sequence $(h_\lambda^{(n)})$ in $\tilde{H}_1$ such that
a. $h_\lambda^{(n)}$ converges weakly in $\tilde{H}_1$ to $h_\lambda$;
b. The operators $S_n$ and $A_n$ are respectively self-adjoint and anti-self-adjoint with respect to $\pi$;
c. $S_n h_\lambda^{(n)}$ and $A_n h_\lambda^{(n)}$ converge weakly in $\tilde{H}_{-1}$; their limits are denoted respectively $S h_\lambda$ and $A h_\lambda$;
d. For all $n \in \mathbb{N}$, $(\lambda - S_n - A_n) h_\lambda^{(n)} = \psi$.
Moreover, $h_\lambda \in D(\sqrt{-S})$ and the following inequality is satisfied:
$$\lambda \langle h_\lambda^2 \rangle + \mathcal{E}(h_\lambda, h_\lambda) \le \|\psi\|_{H_{-1}}\, \mathcal{E}(h_\lambda, h_\lambda)^{1/2}. \tag{24}$$
♦ We introduce for all $\theta > 0$ the following approximations of the operators $S$ and $A$, inspired by the Yosida approximation ([3], p. 12, Lemma 2.4):
$$S_\theta = S(I - \theta S)^{-1},$$
$$A_\theta = A(I - \theta^2 A^2)^{-1} = \tfrac{1}{2}\left[ A(I - \theta A)^{-1} - (-A)(I + \theta A)^{-1} \right],$$
where $I$ is the identity of $L^2(\pi)$. $S_\theta$ and $A_\theta$ are bounded operators over $L^2(\pi)$, and the $S_\theta$ (resp. the $A_\theta$) are self-adjoint (resp. anti-self-adjoint) with respect to $\pi$. We can then introduce for all positive $\lambda$,
$$h_{\lambda,\theta} = (\lambda - S_\theta - A_\theta)^{-1} \psi. \tag{25}$$
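The algebra behind $A_\theta$ can be checked in finite dimensions. The following is a minimal sketch, assuming $2 \times 2$ matrices stand in for the operators $S$ and $A$ (the particular matrices, the value of $\theta$ and all helper names are illustrative choices, not taken from the paper). It verifies with exact rational arithmetic that the two expressions for $A_\theta$ agree, that $S_\theta$ is symmetric, and that $A_\theta$ is antisymmetric.

```python
from fractions import Fraction as Fr

IDENT = [[Fr(1), Fr(0)], [Fr(0), Fr(1)]]

def mat_mul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

def mat_scale(c, X):
    return [[c * X[i][j] for j in range(2)] for i in range(2)]

def mat_inv(X):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = X
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

def yosida_sym(S, theta):
    """S_theta = S (I - theta S)^{-1}."""
    return mat_mul(S, mat_inv(mat_add(IDENT, mat_scale(-theta, S))))

def yosida_antisym(A, theta):
    """A_theta = A (I - theta^2 A^2)^{-1}."""
    return mat_mul(A, mat_inv(mat_add(IDENT, mat_scale(-theta * theta, mat_mul(A, A)))))

def yosida_antisym_split(A, theta):
    """(1/2) [ A (I - theta A)^{-1} + A (I + theta A)^{-1} ]."""
    plus = mat_mul(A, mat_inv(mat_add(IDENT, mat_scale(-theta, A))))
    minus = mat_mul(A, mat_inv(mat_add(IDENT, mat_scale(theta, A))))
    return mat_scale(Fr(1, 2), mat_add(plus, minus))

# Illustrative data: S symmetric and negative definite, A antisymmetric.
S = [[Fr(-2), Fr(1)], [Fr(1), Fr(-2)]]
A = [[Fr(0), Fr(3)], [Fr(-3), Fr(0)]]
THETA = Fr(1, 2)

if __name__ == "__main__":
    print(yosida_sym(S, THETA))
    print(yosida_antisym(A, THETA))
```

Exact fractions are used so that the identities hold as equalities rather than up to rounding.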
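Due to the Hille-Yosida theorem ([3], p. 10), we have $\|(I - \theta S)^{-1}\| \le 1$, and it is then easy to check that $0 \le -S \le -S_\theta$. Using (25), we have
$$\lambda \langle h_{\lambda,\theta}^2 \rangle + \left\langle \big(\sqrt{-S_\theta}\, h_{\lambda,\theta}\big)^2 \right\rangle = \langle \psi h_{\lambda,\theta} \rangle \le \|\psi\|_{H_{-1}} \left\langle \big(\sqrt{-S}\, h_{\lambda,\theta}\big)^2 \right\rangle^{1/2}. \tag{26}$$
As $-S \le -S_\theta$, we have
$$\lambda \langle h_{\lambda,\theta}^2 \rangle + \left\langle \big(\sqrt{-S}\, h_{\lambda,\theta}\big)^2 \right\rangle \le \|\psi\|_{H_{-1}} \left\langle \big(\sqrt{-S}\, h_{\lambda,\theta}\big)^2 \right\rangle^{1/2},$$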
G. Benabou
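which implies
$$\lambda \langle h_{\lambda,\theta}^2 \rangle \le \|\psi\|_{H_{-1}}^2, \qquad \left\langle \big(\sqrt{-S}\, h_{\lambda,\theta}\big)^2 \right\rangle \le \|\psi\|_{H_{-1}}^2, \qquad \left\langle \big(\sqrt{-S_\theta}\, h_{\lambda,\theta}\big)^2 \right\rangle \le \|\psi\|_{H_{-1}}^2. \tag{27}$$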
It only remains to extract subsequences $h_\lambda^{(n)}$ and $S_n h_\lambda^{(n)}$ from $(h_{\lambda,\theta})$ and $(S_\theta h_{\lambda,\theta})$ which converge weakly respectively in $\tilde{H}_1$ and $\tilde{H}_{-1}$. As we have $A_n h_\lambda^{(n)} = (\lambda - S_n) h_\lambda^{(n)} - \psi$, the proof of the proposition is complete. ♦
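3.2. The main statements. The following theorem holds:

Theorem 3. Under the previous assumptions P1 to P4, for all $\psi \in L^1(\pi) \cap H_{-1}$ such that $\psi^\perp = -\psi$, we have
$$\lambda \langle h_\lambda^2 \rangle \to 0 \ \text{ as } \lambda \to 0, \qquad \exists\, \xi \in L^2(\pi),\ \left\langle \big(\sqrt{-S}\, h_\lambda - \xi\big)^2 \right\rangle \to 0 \ \text{ as } \lambda \to 0. \tag{28}$$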
As a direct consequence of this theorem and of the ergodicity of $\pi$, we have the following result:

Theorem 4. The process $\varepsilon \int_0^{\varepsilon^{-2} t} \psi(\eta(s))\, ds$ converges weakly as $\varepsilon$ goes to 0 to a Brownian motion with diffusion coefficient $\langle \xi^2 \rangle$. The tightness of the process is ensured by the same argument as in Proposition 2.5.
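3.3. Proofs. Let us first prove Theorem 3. Due to (24),
$$\lambda \langle h_\lambda^2 \rangle \le \|\psi\|_{\tilde{H}_{-1}}^2 \quad \text{and} \quad \mathcal{E}(h_\lambda, h_\lambda) \le \|\psi\|_{H_{-1}}^2.$$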
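Following the scheme of the proof of Proposition 2.4, we introduce the symmetric and antisymmetric parts of $h_\lambda = f_\lambda + g_\lambda$ with respect to $\perp$. As $\perp$ preserves $\pi$, we have
$$\langle g_\lambda \,|\, \mathcal{F}_0 \rangle = \tfrac{1}{2} \langle h_\lambda - h_\lambda^\perp \,|\, \mathcal{F}_0 \rangle = \tfrac{1}{2} \langle (h_\lambda - h_\lambda^\perp)^\perp \,|\, \mathcal{F}_0 \rangle = -\langle g_\lambda \,|\, \mathcal{F}_0 \rangle = 0, \tag{29}$$
and thanks to P3, $\langle g_\lambda^2 \rangle \le G\, \mathcal{E}(g_\lambda, g_\lambda)$. But the same argument as in (29) together with P4 proves that $\mathcal{E}(f_\lambda, g_\lambda) = 0$. Then $\mathcal{E}(h_\lambda, h_\lambda) = \mathcal{E}(f_\lambda, f_\lambda) + \mathcal{E}(g_\lambda, g_\lambda)$. Consequently $\langle g_\lambda^2 \rangle \le G\, \|\psi\|_{H_{-1}}^2$, so $g_\lambda$ is uniformly bounded in $L^2(\pi)$. The same argument as in the proof of Proposition 2.4 can now be applied here without any changes, and Theorem 3 is proved. Theorem 4 is now a straightforward application of Theorem 3 and of the ergodicity of $\pi$, just as Theorem 2 was in the previous section.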
4. Interacting Ornstein-Uhlenbeck Particles

Let us now consider an infinite system of particles, each of them moving according to an Ornstein-Uhlenbeck process. They interact through a two-body potential. We want to follow the motion of a special particle, which we tag. The tagged particle evolves in a random environment which is no longer frozen and depends on the motion of the tagged particle. A model of this kind has already been studied for non-massive particles ([6]). The probability space $X$ is now the space of locally finite configurations of particles in $\mathbb{R}^d$ with velocities in $\mathbb{R}^d$, i.e.
$$X = \{\eta = \{x_i, v_i\}_{i \in I_0} \subset \mathbb{R}^d : \forall B \in \mathcal{B}(\mathbb{R}^d),\ \eta_x \cap B \text{ is finite}\},$$
where $I_0$ is a countable set of indices, $\mathcal{B}(\mathbb{R}^d)$ is the set of bounded subsets of $\mathbb{R}^d$ and $\eta_x = \{x_i\}_{i \in I_0}$ for all $\eta = \{x_i, v_i\}_{i \in I_0} \subset \mathbb{R}^d$. We endow $X$ with the weakest topology such that the map $\varphi : \eta \mapsto \sum_{i \in I_0} h(x_i, v_i)$ is continuous for any continuous compactly supported $h$ mapping $(\mathbb{R}^d)^2$ to $\mathbb{R}$. $\mathcal{F}$ is the corresponding Borel $\sigma$-algebra. The transformation group $G = \{\tau_x, x \in \mathbb{R}^d\}$ is the group of space shifts $\tau_x \eta = \{x_i + x, v_i\}$. We define the gradient of a suitable $f$ with respect to $x_i \in \eta_x$,
$$\nabla_{x_i} f(\eta) \cdot l = \lim_{\delta \to 0} \delta^{-1} \big( f([\eta \setminus \{x_i\}] \cup \{x_i + \delta l\}) - f(\eta) \big)$$
for all $l \in \mathbb{R}^d$, and we similarly define its gradient $\nabla_{v_i} f$ with respect to a velocity $v_i \in \eta$. Finally, we define the formal operator
$$D = \sum_i \nabla_{x_i}.$$
Let $U$ be a twice continuously differentiable, compactly supported map on $\mathbb{R}^d$. We suppose that $U$ is superstable, i.e. for all bounded $\Lambda$ in $\mathbb{R}^d$, there exist $C_1 > 0$ and $C_2 \ge 0$ such that for all $(x_1, \ldots, x_n) \in \Lambda^n$,
$$\sum_{i \ne j} U(x_i - x_j) \ge -C_2 n + C_1 n^2 |\Lambda|^{-1},$$
where $|\Lambda|$ is the volume of $\Lambda$. We also suppose that $U$ is even. The measure $\mu$ on $\mathcal{F}$ is one of the ergodic Gibbs measures associated to the formal Hamiltonian
$$H_0(\eta) = \frac{1}{2} \sum_{i \ne j} U(x_i - x_j) + \frac{1}{2} \sum_i |v_i|^2$$
with fixed temperature $\beta$ and fugacity $z$. The existence of this measure is ensured by the stability of $U$. We now study the system defined by
$$dx_i(t) = v_i(t)\,dt, \qquad dv_i(t) = -\gamma_i v_i(t)\,dt - \sum_{j \ne i} \nabla U(x_i - x_j)\,dt + \sqrt{2\gamma_i\beta}\,dw_i(t), \tag{30}$$
which is a slight extension of (1) with $\gamma$ depending on $i$, and the mass of the particles taken equal to 1. The $\gamma_i$ are positive constants satisfying
$$\inf_i \gamma_i = \gamma_* > 0. \tag{31}$$
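To make the dynamics (30) concrete, here is a minimal Euler-Maruyama sketch for a finite number of particles in one dimension. It is only an illustration under stated assumptions: the paper treats an infinite system in $\mathbb{R}^d$, and the function name `simulate_ou_particles`, the time step, and the user-supplied force `grad_U` are hypothetical choices made for this example.

```python
import math
import random

def simulate_ou_particles(x0, v0, gammas, beta, grad_U, dt, n_steps, rng=None):
    """Euler-Maruyama discretisation of a finite, one-dimensional analogue of
        dx_i = v_i dt,
        dv_i = -gamma_i v_i dt - sum_{j != i} grad_U(x_i - x_j) dt
               + sqrt(2 gamma_i beta) dw_i.
    Returns the final positions and velocities as two lists."""
    rng = rng or random.Random(0)  # fixed seed: an arbitrary choice for reproducibility
    x, v = list(x0), list(v0)
    n = len(x)
    for _ in range(n_steps):
        # Pair forces are computed from the positions at the start of the step.
        forces = [-sum(grad_U(x[i] - x[j]) for j in range(n) if j != i)
                  for i in range(n)]
        for i in range(n):
            noise = math.sqrt(2.0 * gammas[i] * beta * dt) * rng.gauss(0.0, 1.0)
            new_v = v[i] + (-gammas[i] * v[i] + forces[i]) * dt + noise
            x[i] += v[i] * dt  # position update uses the old velocity
            v[i] = new_v
    return x, v

if __name__ == "__main__":
    # Two particles with a smooth, even toy potential U(r) = exp(-r^2).
    grad = lambda r: -2.0 * r * math.exp(-r * r)
    print(simulate_ou_particles([0.0, 1.0], [0.0, 0.0], [1.0, 2.0], 0.5, grad, 0.01, 100))
```

With $\beta = 0$ and no interaction the scheme reduces to the deterministic decay $v \mapsto (1 - \gamma\,dt)\,v$ per step, which gives a simple sanity check.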
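The existence of these dynamics is proved in [12]. The generator of this process is
$$L_{IOU} = \sum_{i \in I_0} \Big[ \gamma_i \big(\beta \Delta_{v_i} - v_i \cdot \nabla_{v_i}\big) + v_i \cdot \nabla_{x_i} - \sum_{j \ne i} \nabla U(x_i - x_j) \cdot \nabla_{v_i} \Big].$$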
Now, we consider one of these particles, of index $i_0$, whose position and velocity are denoted respectively by $x$ and $v$, and we tag it. We obtain the following system:
$$\begin{cases} dx(t) = v(t)\,dt \\ dv(t) = -\gamma v(t)\,dt - \sum_{j \in I} \nabla U(x - x_j)\,dt + \sqrt{2\gamma\beta}\,dw_0 \\ dx_i(t) = v_i(t)\,dt \\ dv_i(t) = -\gamma_i v_i(t)\,dt - \sum_{j \ne i} \nabla U(x_i - x_j)\,dt - \nabla U(x_i - x)\,dt + \sqrt{2\gamma_i\beta}\,dw_i(t), \end{cases}$$
where $I = I_0 \setminus \{i_0\}$. As in the previous section, we consider now the environment $\omega(t) = \tau_{-x(t)} \eta(t) = \{y_i(t), v_i(t)\}$. The system becomes
$$\begin{cases} dx(t) = v(t)\,dt \\ dv(t) = -\gamma v(t)\,dt + \sum_{j \in I} \nabla U(y_j)\,dt + \sqrt{2\gamma\beta}\,dw_0 \\ dy_i(t) = (v_i(t) - v(t))\,dt \\ dv_i(t) = -\gamma_i v_i(t)\,dt - \sum_{j \ne i} \nabla U(y_i - y_j)\,dt - \nabla U(y_i)\,dt + \sqrt{2\gamma_i\beta}\,dw_i(t). \end{cases} \tag{32}$$
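We introduce the Palm measure $\pi$, whose density with respect to $\mu$ is
$$\frac{d\pi}{d\mu} = Z^{-1} \exp\Big( -\sum_i \frac{U(x_i)}{\beta} - \frac{|v|^2}{2\beta} \Big).$$
Introducing the new formal Hamiltonian
$$H(\eta, v) = H_0 + \sum_i U(x_i) + \frac{|v|^2}{2},$$
the generator of $(\omega(t), v(t))$ is given by an extension of $L_{IOU} = S_{IOU} + A_{IOU}$ with
$$S_{IOU} = \gamma \big(\beta \Delta_v - v \cdot \nabla_v\big) + \sum_{i \in I} \gamma_i \big(\beta \Delta_{v_i} - v_i \cdot \nabla_{v_i}\big),$$
$$A_{IOU} = -v \cdot D + DH \cdot \nabla_v + \sum_{i \in I} \big( v_i \cdot \nabla_{y_i} - \nabla_{y_i} H \cdot \nabla_{v_i} \big), \tag{33}$$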
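where $S_{IOU}$ and $A_{IOU}$ are respectively symmetric and antisymmetric with respect to the invariant measure $\pi$ defined above. The Dirichlet form associated to this generator is given for any $\varphi \in D(L)$ by
$$\langle L_{IOU}\, \varphi, \varphi \rangle = -\gamma\beta\, \langle |\nabla_v \varphi|^2 \rangle - \beta \sum_i \gamma_i\, \langle |\nabla_{v_i} \varphi|^2 \rangle.$$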
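The Dirichlet form identity can be sanity-checked in the simplest possible setting. The sketch below assumes a single velocity coordinate with Gaussian law $N(0, \beta)$ and the one-particle symmetric part $\gamma(\beta\, d^2/dv^2 - v\, d/dv)$, acting on polynomials. It is a toy reduction of the infinite-dimensional statement, and all function names are illustrative. It checks $\langle L\varphi\, \varphi \rangle = -\gamma\beta \langle (\varphi')^2 \rangle$ exactly with rational arithmetic.

```python
from fractions import Fraction as Fr

# A polynomial in v is a list of coefficients: p[k] is the coefficient of v**k.

def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[k] if k < len(p) else 0) + (q[k] if k < len(q) else 0)
            for k in range(n)]

def poly_mul(p, q):
    if not p or not q:
        return []
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_diff(p):
    return [k * p[k] for k in range(1, len(p))]

def gaussian_moment(k, beta):
    """E[V**k] for V ~ N(0, beta): zero for odd k, beta**(k/2) * (k-1)!! for even k."""
    if k % 2:
        return Fr(0)
    moment = Fr(1)
    for j in range(1, k, 2):
        moment *= j * beta
    return moment

def gaussian_mean(p, beta):
    return sum((c * gaussian_moment(k, beta) for k, c in enumerate(p)), Fr(0))

def ou_generator(p, gamma, beta):
    """Apply gamma * (beta * d^2/dv^2 - v * d/dv) to the polynomial p."""
    dp = poly_diff(p)
    ddp = poly_diff(dp)
    drift = [0] + [-c for c in dp]        # -v * p'
    diffusion = [beta * c for c in ddp]   # beta * p''
    return [gamma * c for c in poly_add(diffusion, drift)]

def dirichlet_sides(phi, gamma, beta):
    """Return (<L phi, phi>, -gamma * beta * <(phi')^2>)."""
    lhs = gaussian_mean(poly_mul(ou_generator(phi, gamma, beta), phi), beta)
    dphi = poly_diff(phi)
    rhs = -gamma * beta * gaussian_mean(poly_mul(dphi, dphi), beta)
    return lhs, rhs

if __name__ == "__main__":
    # phi(v) = 2 v + v^3, gamma = 3, beta = 1/2 (arbitrary illustrative values)
    print(dirichlet_sides([0, 2, 0, 1], Fr(3), Fr(1, 2)))
```

The same computation with $\varphi(v) = v$ shows that $v$ is an eigenfunction with eigenvalue $-\gamma$, which is the one-particle spectral gap.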
Some problems may arise at this point. The existence of the tagged particle process and of the environment process is not obvious, due to measurability problems. The strong continuity of the semi-group corresponding to the environment process, and the stationarity of $\pi$ for this semi-group, are difficult technical results which, as far as the author knows, have not been shown either. It seems that Ferrari's scheme for proving the stationarity (see [4]) could be used here, if we assume the existence of a core of regular local functions for $L_{IOU}$, allowing us to commute the semi-group and the generator. These results, however interesting, are not the subject of the present article. We simply assume them in the following. Notice that, in the case of non-massive particles ([6]), these technical points are not treated either. We introduce the resolvent equation for all $l \in \mathbb{R}^d$,
$$(\lambda - L_{IOU})\, u_\lambda = l \cdot v.$$
Due to the study done in Sect. 3, we can prove the existence of $u_\lambda$ in $\tilde{H}_1$, and of $S_{IOU}\, u_\lambda$ and $A_{IOU}\, u_\lambda$ in $\tilde{H}_{-1}$. The next results are now direct consequences of Theorems 3 and 4.

Proposition 4.1. We have
$$\lim_{\lambda \to 0} \lambda \langle u_\lambda^2 \rangle = 0. \tag{34}$$
Furthermore, there exists $\xi$ in $L^2$ such that
$$\lim_{\lambda \to 0} \langle |\nabla_v u_\lambda - \xi|^2 \rangle = 0 \tag{35}$$
and there exists a family $(\xi_i)_{i \in I}$ in $L^2$ such that for all $i \in I$,
$$\lim_{\lambda \to 0} \langle |\nabla_{v_i} u_\lambda - \xi_i|^2 \rangle = 0 \quad \text{and} \quad \sum_{i \in I} \langle |\xi_i|^2 \rangle < \infty.$$
♦ All we have to do is to prove that the hypotheses of Theorem 3 are satisfied. Let us introduce on the configuration space $X$ the involution $\eta = (x_i, v_i)_{i \in I} \mapsto \eta^\perp = (x_i, -v_i)_{i \in I}$. $\mathcal{F}_0$ is the sub-$\sigma$-algebra of $\mathcal{F}$ generated by the functions which do not depend on the velocities. Hypotheses P1 and P4 are very easy to check, as the Palm measure is a product measure with respect to the positions and the velocities, and as it is even in the velocities. Moreover, it is an infinite-dimensional Gaussian measure in the velocities, so that $S_{IOU}$ has a positive spectral gap $\gamma_*$, where $\gamma_*$ has been defined in (31). Hypothesis P3 is then satisfied. ♦

Now, Theorem 1 is a particular case of Theorem 4. In this case, the non-degeneracy of the macroscopic diffusion cannot be proved. It certainly depends on the temperature of the system.

Acknowledgements. The author wishes to express his thanks to Professor J. Fritz for his extremely valuable advice, which helped him to greatly improve this article. He also wants to thank his PhD thesis director Stefano Olla, who made him work on this problem and helped him solve it.
References
1. Papanicolaou, G., Varadhan, S. R. S.: Boundary Value Problems with Rapidly Oscillating Random Coefficients. In: Colloquia Mathematica Societatis János Bolyai, 27. Random Fields, Esztergom (Hungary), (1979), pp. 835–873
2. Papanicolaou, G., Varadhan, S. R. S.: Ornstein-Uhlenbeck Processes in Random Potential. Commun. Pure Appl. Math. 38, 819–834 (1985)
3. Ethier, S. N., Kurtz, T. G.: Markov Processes. New York: John Wiley (1986)
4. Ferrari, P. A.: The Simple Exclusion Process as seen from a Tagged Particle. Ann. Prob. 14, 1277–1290 (1986)
5. Kipnis, C., Varadhan, S. R. S.: Central Limit Theorem for Additive Functionals of Reversible Markov Process and Applications to Simple Exclusions. Commun. Math. Phys. 104, 1–19 (1986)
6. De Masi, A., Ferrari, P., Goldstein, S., Wick, W. D.: An Invariance Principle for Reversible Markov Processes. Applications to Random Motions in Random Environments. J. Stat. Phys. 55, 3/4, 787–855 (1989)
7. Olla, S., Varadhan, S. R. S.: Scaling Limit for Interacting Ornstein-Uhlenbeck Processes. Commun. Math. Phys. 135, 335–378 (1991)
8. Olla, S.: Homogenization of Diffusion Processes in Random Fields. In: Publications de l'École Doctorale de l'École Polytechnique, Palaiseau (1994). Available at http://www.ceremade.dauphine.fr/~olla/lho.ps
9. Varadhan, S. R. S.: Self Diffusion of a Tagged Particle in Equilibrium for Asymmetric Mean Zero Random Walks with Simple Exclusion. Ann. Inst. H. Poincaré (Probabilités) 31, 273–285 (1996)
10. Sethuraman, S., Varadhan, S. R. S., Yau, H. T.: Diffusive Limit of a Tagged Particle in Asymmetric Exclusion Process. Commun. Pure Appl. Math. 53, 972–1006 (2000)
11. Olla, S.: Central Limit Theorems for Tagged Particles and for Diffusions in Random Environment. Notes of the course given at “États de la recherche : Milieux Aléatoires”, CIRM 23-25 November 2000. In: Milieux Aléatoires (F. Comets, E. Pardoux, ed.), Panorama et Synthèses 12, 2001, pp. 75–100. Available at http://www.ceremade.dauphine.fr/~olla/cirmrev.pdf
12. Olla, S., Trémoulet, C.: Equilibrium Fluctuations for Interacting Ornstein-Uhlenbeck particles. Commun. Math. Phys. 233, 463–491 (2003)
13. Freidlin, M.: Some Remarks on the Smoluchowski-Kramers Approximation. J. Stat. Phys. 117, 3/4, 617–634 (2004)
14. Benabou, G.: Comparison between the homogenization of Ornstein-Uhlenbeck and Brownian processes. Preprint, submitted to Stoch. Proc. Appl.

Communicated by H. Spohn
Commun. Math. Phys. 266, 715–733 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0048-7
Communications in
Mathematical Physics
Stability Conditions on a Non-Compact Calabi-Yau Threefold Tom Bridgeland School of Mathematics, University of Sheffield, Hicks Building, Hounsfield Road, Sheffield, S3 7RH, UK. E-mail: [email protected] Received: 2 September 2005 / Accepted: 10 February 2006 Published online: 30 May 2006 – © Springer-Verlag 2006
Abstract: We study the space of stability conditions Stab(X ) on the non-compact Calabi-Yau threefold X which is the total space of the canonical bundle of P2 . We give a combinatorial description of an open subset of Stab(X ) and state a conjecture relating Stab(X ) to the Frobenius manifold obtained from the quantum cohomology of P2 . We give some evidence from mirror symmetry for this conjecture.
1. Introduction

The space of stability conditions $\operatorname{Stab}(X)$ on a variety $X$ was introduced in [5] as a mathematical framework for understanding Douglas's notion of $\pi$-stability for D-branes in string theory [13]. This paper is concerned with the case when $X = \mathcal{O}_{\mathbb{P}^2}(-3)$ is the total space of the canonical line bundle of $\mathbb{P}^2$. This non-compact Calabi-Yau threefold provides an amenable but interesting example on which to test the general theory, and many features of the spectrum of D-branes on $X$ have already been studied in the physics literature (see for example [11, 12, 19]).

So far, we are unable to give a complete description of $\operatorname{Stab}(X)$. However, using the results of [7], we define an open subset $\operatorname{Stab}^0(X) \subset \operatorname{Stab}(X)$ which is a disjoint union of regions indexed by the elements of an affine braid group. The combinatorics of these regions leads us to conjecture a precise connection between $\operatorname{Stab}^0(X)$ and the Frobenius manifold defined by the quantum cohomology of $\mathbb{P}^2$. Our main aim is to assemble some convincing evidence for this conjecture and to discuss some of its consequences.

The existence of deep connections between quantum cohomology and derived categories has been known for some time. In particular, following observations of Cecotti and Vafa [10] and Zaslow [29], Dubrovin conjectured [15] that the derived category of a Fano variety $Y$ has a full exceptional collection $(E_0, E_1, \ldots, E_{n-1})$ if and only if the quantum cohomology of $Y$ is generically semisimple, and that in this case the Stokes matrix $S_{ij}$ of the corresponding Frobenius manifold should coincide with the
Gram matrix $\chi(E_i, E_j)$ for the Euler form of $D(Y)$. This statement has been verified for projective spaces [21, 28].

It was pointed out by Bondal and Kontsevich that a heuristic explanation for Dubrovin's conjecture can be given using mirror symmetry. The mirror of a Fano variety with a full exceptional collection is expected to be an affine variety $\check{Y}$, together with a holomorphic function $f : \check{Y} \to \mathbb{C}$ with isolated singularities. The Frobenius manifold arising from the quantum cohomology of $Y$ should coincide with the Frobenius manifold of Saito type defined on the universal unfolding space of $f$. The Stokes matrix is then the intersection form evaluated on a distinguished basis of vanishing cycles. Under Kontsevich's homological mirror proposal [25] the intersection form is identified with the Euler form on $D(Y)$, and the vanishing cycles, which are discs, correspond to exceptional objects in $D(Y)$.

The conjecture stated below suggests that it may be possible to use spaces of stability conditions to give a more direct link between derived categories and quantum cohomology. To make this work one should somehow define the structure of a Frobenius manifold on the space of stability conditions which in some small patch recovers the usual quantum cohomology picture. At present, however, the author has no clear ideas as to how this could be done.

The other general conclusion one can draw from the example studied in this paper is that the space of stability conditions $\operatorname{Stab}(X)$ is not an analogue of the stringy Kähler moduli space, but rather some extended version of it. The picture seems to be that the space $\operatorname{Stab}(X)$ is a global version of the Frobenius manifold defined by big quantum cohomology, and the stringy Kähler moduli space is a submanifold which near the large volume limit is defined by the small quantum cohomology locus. In the rest of the introduction we shall describe our results in more detail.
Missing definitions and proofs are hopefully covered in the main body of the paper.

1.1. A stability condition [5] on a triangulated category $D$ consists of a full abelian subcategory $\mathcal{A} \subset D$ called the heart, together with a group homomorphism $Z : K(D) \to \mathbb{C}$ called the central charge, with the compatibility property that for every nonzero object $E \in \mathcal{A}$ one has
$$Z(E) \in H = \{ z \in \mathbb{C} : z = r \exp(i\pi\phi) \text{ with } r > 0 \text{ and } 0 < \phi \le 1 \}.$$
One insists further that $\mathcal{A} \subset D$ is the heart of a bounded t-structure on $D$, and that the map $Z$ has the Harder–Narasimhan property. The set of all stability conditions on $D$ satisfying an extra condition called local-finiteness forms a complex manifold $\operatorname{Stab}(D)$. Forgetting the heart $\mathcal{A} \subset D$ and remembering the central charge gives a map
$$\mathcal{Z} : \operatorname{Stab}(D) \longrightarrow \operatorname{Hom}_{\mathbb{Z}}(K(D), \mathbb{C}).$$
In this paper we shall consider the case when $D$ is the subcategory of the bounded derived category of coherent sheaves on $X = \mathcal{O}_{\mathbb{P}^2}(-3)$ consisting of complexes whose cohomology sheaves are supported on the zero section $\mathbb{P}^2 \subset X$. In that case the Grothendieck group $K(D)$ is a free abelian group of rank three. Our main result is
Theorem 1.1. There is a connected open subset $\operatorname{Stab}^0(X) \subset \operatorname{Stab}(X)$ which can be written as a disjoint union of regions
$$\operatorname{Stab}^0(X) = \bigsqcup_{g \in G} D(g),$$
where $G$ is the affine braid group with presentation
$$G = \langle \tau_0, \tau_1, \tau_2 \mid \tau_i \tau_j \tau_i = \tau_j \tau_i \tau_j \text{ for all } i, j \rangle.$$
Each region $D(g)$ is mapped isomorphically by $\mathcal{Z}$ onto a locally-closed subset of the three dimensional vector space $\operatorname{Hom}_{\mathbb{Z}}(K(D), \mathbb{C})$, and the closures of two regions $D(g_1)$ and $D(g_2)$ intersect in $\operatorname{Stab}^0(X)$ precisely if $g_1 g_2^{-1} = \tau_i^{\pm 1}$ for some $i \in \{0, 1, 2\}$.

The stability conditions in a given region $D(g)$ all have the same heart $\mathcal{A}(g) \subset D$. Each of these categories $\mathcal{A}(g)$ is equivalent to a category of nilpotent representations of a quiver with relations of the form

[Figure: quiver on three vertices, with the three edges labelled a, b, c.]

where the positive integers $a, b, c$ labelling the graph represent the number of arrows in the quiver joining the corresponding vertices. In fact the triples $(a, b, c)$ which come up are precisely the positive integer solutions to the Markov equation
$$a^2 + b^2 + c^2 = abc.$$
We denote by $S_0(g), S_1(g), S_2(g)$ the three simple objects of $\mathcal{A}(g)$ corresponding to the three one-dimensional representations of the quiver. In the case when $g = e$ is the identity we simply write $S_i = S_i(e)$. The objects $S_i(g)$ are spherical objects of $D$ in the sense of Seidel and Thomas [27]. As such they define autoequivalences $\Phi_{S_i(g)} \in \operatorname{Aut} D$. These descend to give automorphisms
$$\phi_{S_i(g)} \in \operatorname{Aut} K(D), \qquad i = 0, 1, 2,$$
which with respect to the fixed basis of $K(D)$ defined by the classes of the objects $S_i$ are given by a triple of matrices $P_0(g), P_1(g), P_2(g) \in SL(3, \mathbb{Z})$. It turns out that exactly the same system of matrices comes up in the study of the quantum cohomology of $\mathbb{P}^2$.

1.2. Dubrovin showed that the semisimple Frobenius structure arising from the quantum cohomology of $\mathbb{P}^2$ can be analytically continued to give a Frobenius structure on a dense open subset $M$ of the universal cover of the configuration space
$$C_3(\mathbb{C}) = \{ (u_0, u_1, u_2) \in \mathbb{C}^3 : i \ne j \implies u_i \ne u_j \} / \operatorname{Sym}_3.$$
Note that in some small ball on $M$ the corresponding prepotential encodes the geometric data of the Gromov–Witten invariants of $\mathbb{P}^2$, but away from this patch there is no such direct interpretation. Thus, just like the space of stability conditions, $M$ is a non-perturbative object, not depending on any choice of large volume limit.

Given a point $m \in M$ we denote by $\{u_0(m), u_1(m), u_2(m)\}$ the corresponding unordered triple of points in $\mathbb{C}$, and set
$$\mathbb{C}_m = \mathbb{C} \setminus \{u_0(m), u_1(m), u_2(m)\}.$$
Let $W$ denote the space
$$W = \{ (m, z) \in M \times \mathbb{C} : z \in \mathbb{C}_m \}$$
with its projection $p : W \to M$. Using the Frobenius structure Dubrovin defined a series of flat, holomorphic connections $\check{\nabla}^{(s)}$ on the pullback of the tangent bundle $p^*(T_M)$. These connections are called the second structure connections. Connections of this type were first introduced by K. Saito in the theory of primitive forms for unfolding spaces. We shall be interested only in the case $s = \tfrac{1}{2}$.

For each $m \in M$ the connection $\check{\nabla} = \check{\nabla}^{(1/2)}$ restricts to give a holomorphic connection $\check{\nabla}_m$ on a trivial rank three bundle over $\mathbb{C}_m$. Dubrovin showed that this family of connections is isomonodromic. Define another configuration space
$$C_3(\mathbb{C}^*) = \{ (u_0, u_1, u_2) \in (\mathbb{C}^*)^3 : i \ne j \implies u_i \ne u_j \} / \operatorname{Sym}_3$$
and let $\tilde{C}_3(\mathbb{C}^*)$ be its universal cover. Define
$$M^0 = \{ m \in M : 0 \in \mathbb{C}_m \}$$
and let $\tilde{M}^0$ be its inverse image in $\tilde{C}_3(\mathbb{C}^*)$ under the natural map $\tilde{C}_3(\mathbb{C}^*) \to \tilde{C}_3(\mathbb{C})$. We can choose a base-point $m \in M^0$ such that $\{u_0(m), u_1(m), u_2(m)\}$ are the three cube roots of unity. Let $(\gamma_0, \gamma_1, \gamma_2)$ denote the following basis of $\pi_1(\mathbb{C}_m, 0)$:

[Figure: the loops γ0, γ1, γ2.]

Let $m \in U \subset M^0$ be a small simply-connected neighbourhood of $m$. For each point $m \in U$ there is a chosen basis of $\pi_1(\mathbb{C}_m, 0)$ obtained by deforming the loops $\gamma_i$, which we also denote $(\gamma_0, \gamma_1, \gamma_2)$. Let $V$ be the space of flat sections of $\check{\nabla}_m$ near the origin $0 \in \mathbb{C}$. Using the connection $\check{\nabla}$ we can identify $V$ with the space of flat sections of $\check{\nabla}_{\tilde{m}}$ near $0 \in \mathbb{C}$ for all points $\tilde{m} \in \tilde{M}^0$. As we explain in Sect. 2.3, the group $G$ is a subgroup of $\pi_1(C_3(\mathbb{C}^*))$, and
hence acts on $\tilde{C}_3(\mathbb{C}^*)$. Taking the monodromy of the connection $\check{\nabla}_{\tilde{m}}$ around the loops $\gamma_i$ for $\tilde{m} \in g(U) \cap \tilde{M}^0$ we obtain linear automorphisms $\alpha_i(g) \in \operatorname{Aut}(V)$ for $i = 0, 1, 2$. The following result relates these to the transformations $\phi_{S_i(g)}$ of the last section.

Theorem 1.2. There is a triple of flat sections $(\phi_0, \phi_1, \phi_2)$ of the second structure connection $\check{\nabla}$ such that for all $g \in G$ the monodromy transformations $\alpha_i(g)$ act by the matrices $P_i(g)$ defined in the last section. This condition fixes the triple $(\phi_0, \phi_1, \phi_2)$ uniquely up to a scalar multiple.

This theorem is a simple recasting of some results of Dubrovin, and boils down to two previously observed coincidences. The first is the fact mentioned in the introduction that the Stokes matrix $S_{ij}$ for the quantum cohomology of $\mathbb{P}^2$ coincides with the Gram matrix $\chi(E_i, E_j)$ for the Euler form on $K(\mathbb{P}^2)$ with respect to a basis coming from an exceptional triple of vector bundles $(E_0, E_1, E_2)$. The second is that this coincidence is compatible with the braid group actions on these matrices arising on the one hand from the analytic continuation of the Frobenius manifold [16, Theorem 4.6], and on the other from the action of mutations on exceptional triples discovered by Bondal, Gorodentsev and Rudakov [2, 20].

In fact the connection $\check{\nabla}$ corresponds to the Gauss–Manin connection on the universal unfolding space of the singularity mirror to the space $X$. It might perhaps be easier to understand the connection in this geometric way. But part of the point of this paper is to try to avoid passing to the mirror.

1.3. We now describe a conjecture relating the quantum cohomology of $\mathbb{P}^2$ to the space of stability conditions on $X$. The noncompactness of $X$ makes this relationship slightly more complicated than might be expected. In particular, the Euler form $\chi(-, -)$ on $K(D)$ is degenerate, with a one-dimensional kernel generated by the class of a skyscraper sheaf $[\mathcal{O}_x] \in K(D)$ for $x \in \mathbb{P}^2 \subset X$. In terms of the basis defined by the spherical objects $S_i = S_i(e)$ one has
$$[\mathcal{O}_x] = [S_0] + [S_1] + [S_2].$$
Since this class is somehow special, and in particular is preserved by all autoequivalences of $D$, it makes sense to define a space of normalised stability conditions by
$$\operatorname{Stab}^0_n(X) = \{ \sigma = (Z, \mathcal{P}) \in \operatorname{Stab}^0(X) : Z(\mathcal{O}_x) = i \}.$$
This is a connected submanifold of $\operatorname{Stab}^0(X)$. Define an affine space
$$\mathbb{A}^2 = \{ (z_0, z_1, z_2) \in \mathbb{C}^3 : z_0 + z_1 + z_2 = i \}.$$
In co-ordinate form the map $\mathcal{Z}$ gives a local isomorphism
$$\mathcal{Z} : \operatorname{Stab}^0_n(X) \longrightarrow \mathbb{A}^2$$
obtained by sending a stability condition to the triple $(Z(S_0), Z(S_1), Z(S_2))$.
On the quantum cohomology side, the flat sections $(\phi_0, \phi_1, \phi_2)$ of Theorem 1.2 do not form a basis, and in fact satisfy $\phi_0 + \phi_1 + \phi_2 = 0$. Pulling back the connection $\check{\nabla}$ via the embedding $M^0 \hookrightarrow W$ defined by $p \mapsto (p, 0)$ we obtain a flat connection on the tangent bundle $T_{M^0}$. Taking co-ordinates whose gradients are the sections $(\phi_0, \phi_1, \phi_2)$ and rescaling appropriately, one obtains a holomorphic map
$$\mathcal{W} : \tilde{M}^0 \longrightarrow \mathbb{A}^2.$$
This map is invariant under the free $\mathbb{C}$ action on $\tilde{M}^0 \subset \tilde{C}_3(\mathbb{C}^*)$ which lifts the $\mathbb{C}^*$ action on $C_3(\mathbb{C}^*)$ obtained by simultaneously rescaling the points $(u_0, u_1, u_2)$. The quotient $\tilde{C}_3(\mathbb{C}^*)/\mathbb{C}$ is the universal covering space of
$$\{ [u_0, u_1, u_2] \in \mathbb{P}^2 : u_i \ne 0 \text{ and } u_i \ne u_j \}.$$
The quotient $\tilde{M}^0/\mathbb{C}$ is therefore a dense open subset. We call the induced map
$$\mathcal{W} : \tilde{M}^0/\mathbb{C} \longrightarrow \mathbb{A}^2$$
the homogeneous twisted period map. It is a local isomorphism. We can now state our conjecture.

Conjecture 1.3. There is a commuting diagram
$$\begin{array}{ccc} \operatorname{Stab}^0_n(X) & \xrightarrow{\ F\ } & \tilde{M}^0/\mathbb{C} \\ {\scriptstyle \mathcal{Z}} \Big\downarrow & & \Big\downarrow {\scriptstyle \mathcal{W}} \\ \mathbb{A}^2 & = & \mathbb{A}^2 \end{array}$$
Moreover $F$ is an isomorphism onto a dense open subset.

Proving this conjecture would require a more detailed understanding of the geometry of the homogeneous twisted period map. In particular, it would be necessary to find an open subset of $\tilde{M}^0/\mathbb{C}$ which was mapped isomorphically by $\mathcal{W}$ onto the subset
$$\{ (z_0, z_1, z_2) \in \mathbb{A}^2 : \operatorname{Im}(z_i) > 0 \}$$
which is the image of the interior of the region $D(e)$ under the map $\mathcal{Z}$.

1.4. Here we describe two pieces of evidence for Conjecture 1.3. First consider the submanifold $D \subset C_3(\mathbb{C})$ defined parametrically by taking the unordered triple of points $u_i = -1 + z^{1/3}$ for some $z \in \mathbb{C} \setminus \{0, 1\}$. The inverse image of $D$ in the universal cover $\tilde{C}_3(\mathbb{C})$ is contained in the open subspace $M$. The submanifold $D$ (or its inverse image in $M$) is the small quantum cohomology locus; in the standard flat co-ordinates it is given by $(t_0, t_1, t_2) = (-1, e^z, 0)$. Dubrovin showed [18, Prop. 5.13] that on this locus the homogeneous twisted period map satisfies the differential equation
$$\Big[ \theta_z^3 - z \big(\theta_z + \tfrac{1}{3}\big)\big(\theta_z + \tfrac{2}{3}\big)\theta_z \Big] \mathcal{W} = 0, \qquad \theta_z \equiv z \frac{d}{dz}.$$
This is the Picard–Fuchs equation for the periods of the mirror of $X$, and is thus precisely the equation satisfied by the central charge on the stringy Kähler moduli space [1, 11].

A second piece of evidence for Conjecture 1.3 is that if we go down a dimension to the case $X = \mathcal{O}_{\mathbb{P}^1}(-2)$ the corresponding statement is known to be at least nearly true. In that case the space $M$ is the universal cover of
$$C_2(\mathbb{C}) = \{ (u_0, u_1) \in \mathbb{C}^2 : u_0 \ne u_1 \}$$
so that $\tilde{M}^0/\mathbb{C}$ is the universal cover of $\mathbb{C} \setminus \{0, 1\}$ with co-ordinate $\lambda = u_1/u_0$. In this case we must take the second structure connection with $s = 0$ (in general, for a projective space of dimension $d$ we should take $s = (d - 1)/2$). Thus the homogeneous twisted period map in this case is just the homogeneous part of the standard period map for the quantum cohomology of $\mathbb{P}^1$. This was computed by Dubrovin. Identifying the affine space $\mathbb{A}^1 = \{ (z_0, z_1) : z_0 + z_1 = i \}$ with $\mathbb{C}$ via the map $(z_0, z_1) \mapsto z_0$, the equation [14, G.20] implies that the homogeneous period map is
$$\mathcal{W}(\lambda) = \frac{1}{\pi} \cos^{-1}\Big( \frac{1 + \lambda}{1 - \lambda} \Big).$$
On the other hand the space $\operatorname{Stab}(X)$ was studied in [8]. The corresponding open subset $\operatorname{Stab}^0(X) \subset \operatorname{Stab}(X)$ is actually a connected component, and the corresponding space $\operatorname{Stab}^0_n(X)$ is a covering space of $\mathbb{C} \setminus \mathbb{Z}$. This gives the following result.

Theorem 1.4. In the case $X = \mathcal{O}_{\mathbb{P}^1}(-2)$ there is a commuting diagram
$$\begin{array}{ccc} \operatorname{Stab}^0_n(X) & \xleftarrow{\ H\ } & \widetilde{\mathbb{C} \setminus \{0, 1\}} \\ {\scriptstyle \mathcal{Z}} \Big\downarrow & & \Big\downarrow {\scriptstyle \mathcal{W}} \\ \mathbb{C} \setminus \mathbb{Z} & = & \mathbb{C} \setminus \mathbb{Z} \end{array}$$
in which all the maps are covering maps. In fact one expects $\operatorname{Stab}^0_n(X)$ to be simply-connected so that $H$ is actually an isomorphism.
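For orientation, the formula for the homogeneous period map in the $\mathcal{O}_{\mathbb{P}^1}(-2)$ case can be evaluated numerically. The sketch below is an illustration only: it uses the principal branch of the complex inverse cosine from Python's `cmath`, whereas the actual map lives on the universal cover of $\mathbb{C} \setminus \{0, 1\}$ and is multivalued in $\lambda$. The function name is a hypothetical choice.

```python
import cmath
import math

def homogeneous_period(lam):
    """Principal-branch value of (1/pi) * arccos((1 + lam) / (1 - lam)).

    Assumes lam != 1. Other branches differ by the usual ambiguities of
    arccos (sign changes and shifts by even integers after dividing by pi).
    """
    return cmath.acos((1 + lam) / (1 - lam)) / math.pi

if __name__ == "__main__":
    for lam in (-1, 0.5j, 2 + 1j):
        print(lam, homogeneous_period(lam))
```

On this branch $\lambda = 0$ is sent to the integer $0$, which is consistent with the target of the period map being $\mathbb{C} \setminus \mathbb{Z}$ once $\lambda = 0$ is removed.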
2. Stability Conditions on X In this section we justify the claims about Stab(X ) made in the introduction. In particular we prove Theorem 1.1. We start by summarising some of the necessary definitions. More details can be found in [5, 7].
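One of the claims of Theorem 1.1 is that the triples $(a, b, c)$ labelling the quivers are the positive integer solutions of the Markov equation $a^2 + b^2 + c^2 = abc$. The following sketch enumerates small solutions. It assumes the classical fact that all positive solutions are reached from $(3, 3, 3)$ by the moves $(a, b, c) \mapsto (a, b, ab - c)$ and their permutations; that generation rule is a standard result about the Markov equation, not something stated in this section, and the function names are illustrative.

```python
from collections import deque

def is_markov(a, b, c):
    """Check the Markov-type equation a^2 + b^2 + c^2 = abc."""
    return a * a + b * b + c * c == a * b * c

def markov_triples(bound):
    """Sorted solutions (a <= b <= c) with c <= bound, found by breadth-first
    search from (3, 3, 3) using the three moves that replace one entry x by
    (product of the other two) - x."""
    start = (3, 3, 3)
    seen = {start}
    queue = deque([start])
    while queue:
        a, b, c = queue.popleft()
        for nxt in ((b * c - a, b, c), (a, a * c - b, c), (a, b, a * b - c)):
            triple = tuple(sorted(nxt))
            if triple[2] <= bound and triple not in seen:
                seen.add(triple)
                queue.append(triple)
    return sorted(seen)

if __name__ == "__main__":
    for triple in markov_triples(100):
        print(triple, is_markov(*triple))
```

Each move preserves the equation because, for fixed $a$ and $b$, the two roots of $c^2 - abc + (a^2 + b^2) = 0$ sum to $ab$.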
2.1. Stability conditions and tilting. Let $D$ be a triangulated category. Recall that a bounded t-structure on $D$ determines and is determined by its heart, which is an abelian subcategory $\mathcal{A} \subset D$. One has an identification of Grothendieck groups $K(D) = K(\mathcal{A})$. A stability function on an abelian category $\mathcal{A}$ is defined to be a group homomorphism $Z : K(\mathcal{A}) \to \mathbb{C}$ such that
$$0 \ne E \in \mathcal{A} \implies Z(E) \in \mathbb{R}_{>0} \exp(i\pi\phi(E)) \text{ with } 0 < \phi(E) \le 1.$$
The real number $\phi(E) \in (0, 1]$ is called the phase of the object $E$. A nonzero object $E \in \mathcal{A}$ is said to be semistable with respect to $Z$ if every subobject $0 \ne A \subset E$ satisfies $\phi(A) \le \phi(E)$. The stability function $Z$ is said to have the Harder–Narasimhan property if every nonzero object $E \in \mathcal{A}$ has a finite filtration
$$0 = E_0 \subset E_1 \subset \cdots \subset E_{n-1} \subset E_n = E$$
whose factors $F_j = E_j / E_{j-1}$ are semistable objects of $\mathcal{A}$ with
$$\phi(F_1) > \phi(F_2) > \cdots > \phi(F_n).$$
A simple sufficient condition for the existence of Harder–Narasimhan filtrations was given in [5, Prop. 2.4]. In particular the Harder–Narasimhan property always holds when $\mathcal{A}$ has finite length. The definition of a stability condition appears in [5]. For our purposes the following equivalent definition will be more useful, see [5, Prop. 5.3].

Definition 2.1. A stability condition on $D$ consists of a bounded t-structure on $D$ and a stability function on its heart which has the Harder–Narasimhan property.

The induced map $Z : K(D) \to \mathbb{C}$ is called the central charge of the stability condition. It was shown in [5] that the set of stability conditions on $D$ satisfying an additional condition called local-finiteness forms the points of a complex manifold $\operatorname{Stab}(D)$. In general this manifold will be infinite-dimensional, but in the cases we consider in this paper $K(D)$ is of finite rank, and it follows that $\operatorname{Stab}(D)$ has finite dimension.

To construct t-structures we use the method of tilting introduced by Happel, Reiten and Smalø [22], based on earlier work of Brenner and Butler [4]. Suppose $\mathcal{A} \subset D$ is the heart of a bounded t-structure and is a finite length abelian category. Note that the t-structure is completely determined by the set of simple objects of $\mathcal{A}$; indeed $\mathcal{A}$ is the smallest extension-closed subcategory of $D$ containing this set of objects. Given a simple object $S \in \mathcal{A}$ define $\langle S \rangle \subset \mathcal{A}$ to be the full subcategory consisting of objects $E \in \mathcal{A}$ all of whose simple factors are isomorphic to $S$. One can either view $\langle S \rangle$ as the torsion part of a torsion theory on $\mathcal{A}$, in which case the torsion-free part is
$$\mathcal{F} = \{ E \in \mathcal{A} : \operatorname{Hom}_{\mathcal{A}}(S, E) = 0 \},$$
or as the torsion-free part, in which case the torsion part is
$$\mathcal{T} = \{ E \in \mathcal{A} : \operatorname{Hom}_{\mathcal{A}}(E, S) = 0 \}.$$
The corresponding tilted subcategories are defined to be
$$L_S \mathcal{A} = \{ E \in D : H^i(E) = 0 \text{ for } i \notin \{0, 1\},\ H^0(E) \in \mathcal{F} \text{ and } H^1(E) \in \langle S \rangle \},$$
$$R_S \mathcal{A} = \{ E \in D : H^i(E) = 0 \text{ for } i \notin \{-1, 0\},\ H^{-1}(E) \in \langle S \rangle \text{ and } H^0(E) \in \mathcal{T} \}.$$
They are the hearts of new bounded t-structures on $D$.
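As a small numerical illustration of the definition of a stability function, the sketch below computes phases for a finite length heart with three simple objects, where $Z$ is determined by its values on the simples. The specific central charges are invented for the example (chosen so that they sum to $i$, as in the normalisation used in the introduction), and the helper names are hypothetical.

```python
import cmath
import math

def phase(z):
    """Phase phi in (0, 1] with z = r * exp(i * pi * phi), r > 0.

    Raises ValueError if z is zero or does not lie in the half-plane
    H = {r * exp(i * pi * phi) : r > 0, 0 < phi <= 1}.
    """
    if z == 0:
        raise ValueError("central charge of a nonzero object must be nonzero")
    phi = cmath.phase(z) / math.pi  # cmath.phase returns a value in [-pi, pi]
    if not (0 < phi <= 1):
        raise ValueError("value does not lie in H")
    return phi

def central_charge(dim_vector, charges):
    """Z of an object whose class is sum_i d_i [S_i], by additivity of Z."""
    return sum(d * z for d, z in zip(dim_vector, charges))

# Illustrative central charges of three simples; they sum to i.
CHARGES = (-1 + 0.5j, 0.25j, 1 + 0.25j)

if __name__ == "__main__":
    for d in ((1, 0, 0), (0, 1, 0), (0, 0, 1), (2, 0, 1), (1, 1, 1)):
        print(d, phase(central_charge(d, CHARGES)))
```

Since $H$ is closed under positive linear combinations, every nonzero object of such a heart automatically has a well-defined phase lying between the smallest and largest phases of the simples.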
2.2. Quivery subcategories. Let $X = \mathcal{O}_{\mathbb{P}^2}(-3)$ with its projection $\pi : X \to \mathbb{P}^2$. Let $D$ denote the full subcategory of the bounded derived category of coherent sheaves on $X$ consisting of complexes supported on the zero-section $\mathbb{P}^2 \subset X$. Let $\operatorname{Stab}(X)$ denote the space of locally-finite stability conditions on $D$. Let $(E_0, E_1, E_2)$ be an exceptional collection of vector bundles on $\mathbb{P}^2$. Any exceptional collection in $D^b \operatorname{Coh}(\mathbb{P}^2)$ is of this form up to shifts. It was proved in [7] that there is an equivalence of categories
$$\operatorname{Hom}^\bullet\Big( \bigoplus_{i=0}^{2} \pi^* E_i, - \Big) : D^b \operatorname{Coh}(X) \longrightarrow D^b \operatorname{Mod}(B),$$
where $\operatorname{Mod}(B)$ is the category of finite-dimensional right modules for the algebra
$$B = \operatorname{End}_X\Big( \bigoplus_{i=0}^{2} \pi^* E_i \Big).$$
The algebra $B$ can be described as the path algebra of a quiver with relations taking the form

[Figure: quiver on three vertices, with the three edges labelled a, b, c.]

Pulling back the standard t-structure on $D^b \operatorname{Mod}(B)$ gives a bounded t-structure on $D$ whose heart is equivalent to the category of nilpotent modules of $B$. The abelian subcategories $\mathcal{A} \subset D$ obtained in this way are called exceptional. An abelian subcategory of $D$ is called quivery if it is of the form $\Phi(\mathcal{A})$ for some exceptional subcategory $\mathcal{A} \subset D$ and some autoequivalence $\Phi \in \operatorname{Aut}(D)$.

Any quivery subcategory $\mathcal{A} \subset D$ is equivalent to a category of nilpotent modules of an algebra of the above form. As such it has three simple objects $\{S_0, S_1, S_2\}$ corresponding to the three one-dimensional representations of the quiver. These objects $S_i$ are spherical in the sense of Seidel and Thomas and thus give rise to autoequivalences $\Phi_{S_i} \in \operatorname{Aut}(D)$. Note that the three simples $S_i$ completely determine the corresponding quivery subcategory $\mathcal{A} \subset D$. The Ext groups between them can be read off from the quiver
$$\operatorname{Hom}^1_D(S_0, S_1) = \mathbb{C}^a, \quad \operatorname{Hom}^1_D(S_1, S_2) = \mathbb{C}^b, \quad \operatorname{Hom}^1_D(S_2, S_0) = \mathbb{C}^c$$
with the other $\operatorname{Hom}^1$ groups being zero. Serre duality then determines the other groups.

Take $\mathcal{A}$ to be the exceptional subcategory of $D$ corresponding to the exceptional collection $(\mathcal{O}, \mathcal{O}(1), \mathcal{O}(2))$ on $\mathbb{P}^2$. Its simples are
$$S_0 = i_* \mathcal{O}, \quad S_1 = i_* \Omega^1(1)[1], \quad S_2 = i_* \mathcal{O}(-1)[2],$$
T. Bridgeland
where $i : \mathbb{P}^2 \hookrightarrow X$ is the inclusion of the zero-section, and $\Omega^1$ denotes the cotangent bundle of $\mathbb{P}^2$. We have $(a, b, c) = (3, 3, 3)$.
Let us compute the automorphisms $\phi_{S_i}$ of $K(\mathcal{D})$ induced by the autoequivalences $\Phi_{S_i}$. The twist functor $\Phi_S$ is defined by the triangle
\[
\operatorname{Hom}^{\bullet}_{\mathcal{D}}(S, E) \otimes S \longrightarrow E \longrightarrow \Phi_S(E),
\]
so that, at the level of K-theory,
\[
\phi_S([E]) = [E] - \chi(S, E)[S].
\]
If we write $P_i$ for the matrix representing the transformation $\phi_{S_i}$ with respect to the basis $([S_0], [S_1], [S_2])$ of $K(\mathcal{D})$ then
\[
P_0 = \begin{pmatrix} 1 & 3 & -3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad
P_1 = \begin{pmatrix} 1 & 0 & 0 \\ -3 & 1 & 3 \\ 0 & 0 & 1 \end{pmatrix}, \qquad
P_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 3 & -3 & 1 \end{pmatrix}.
\]
2.3. Braid group action. It was shown in [7] that if one tilts a quivery subcategory $\mathcal{A} \subset \mathcal{D}$ at one of its simples one obtains another quivery subcategory. To describe this process in more detail we need to define a certain braid group which acts on triples of spherical objects. The three-string annular braid group $CB_3$ is the fundamental group of the configuration space of three unordered points in $\mathbb{C}^*$. It is generated by three elements $\tau_i$ indexed by the cyclic group $i \in \mathbb{Z}_3$ together with a single element $r$, subject to the relations
\[
r \tau_i r^{-1} = \tau_{i+1} \quad \text{for all } i \in \mathbb{Z}_3, \qquad
\tau_i \tau_j \tau_i = \tau_j \tau_i \tau_j \quad \text{for all } i, j \in \mathbb{Z}_3.
\]
For a proof of the validity of this presentation see [24]. If we take the base point to be defined by the three roots of unity, then the elements $\tau_1$ and $r$ correspond to the loops obtained by moving the points as follows:
[Figure: the motions of the three points representing the braids $\tau_1$ and $r$.]
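The matrices $P_i$ of Sect. 2.2 can be recomputed mechanically from the formula $\phi_S([E]) = [E] - \chi(S, E)[S]$. The following Python sketch is an illustration only and is not taken from the paper. It assumes that, for the spherical simples on a Calabi-Yau threefold, $\chi(S_i, S_j) = -\dim\operatorname{Hom}^1(S_i, S_j) + \dim\operatorname{Hom}^2(S_i, S_j)$ with $\operatorname{Hom}^2(S_i, S_j)$ dual to $\operatorname{Hom}^1(S_j, S_i)$, so that $\chi(S_i, S_i) = 0$. The helper names `euler_form` and `twist_matrix` are hypothetical.

```python
def euler_form(a, b, c):
    """Return chi[i][j] = chi(S_i, S_j) for the three simples.

    Assumption: Hom^1(S0,S1) = C^a, Hom^1(S1,S2) = C^b, Hom^1(S2,S0) = C^c,
    all other Hom^1 vanish, and Hom^2(S_i,S_j) is dual to Hom^1(S_j,S_i).
    """
    hom1 = [[0, a, 0],
            [0, 0, b],
            [c, 0, 0]]
    # chi(S_i, S_j) = -dim Hom^1(S_i, S_j) + dim Hom^1(S_j, S_i)
    return [[hom1[j][i] - hom1[i][j] for j in range(3)] for i in range(3)]


def twist_matrix(chi, k):
    """Matrix of [E] -> [E] - chi(S_k, E)[S_k] in the basis ([S0], [S1], [S2]).

    Column j holds the image of [S_j], so the (r, j) entry is
    delta_{rj} - delta_{rk} * chi[k][j].
    """
    return [[(1 if r == j else 0) - (chi[k][j] if r == k else 0)
             for j in range(3)]
            for r in range(3)]


chi = euler_form(3, 3, 3)
P0, P1, P2 = (twist_matrix(chi, k) for k in range(3))
for name, P in (("P0", P0), ("P1", P1), ("P2", P2)):
    print(name, P)
```

With $(a, b, c) = (3, 3, 3)$ the printed matrices agree with the three matrices displayed in Sect. 2.2.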
We write $G \subset CB_3$ for the subgroup generated by the three braids $\tau_0, \tau_1, \tau_2$. Define a spherical triple in $\mathcal{D}$ to be a triple of spherical objects $(S_0, S_1, S_2)$ of $\mathcal{D}$. The group $CB_3$ acts on the set of spherical triples in $\mathcal{D}$ by the formulae
\[
\tau_1(S_0, S_1, S_2) = (S_1[-1], \Phi_{S_1}(S_0), S_2), \qquad r(S_0, S_1, S_2) = (S_2, S_0, S_1).
\]
The following result allows one to completely understand the process of tilting for quivery subcategories of $\mathcal{D}$.
Stability Conditions on a Non-Compact Calabi-Yau Threefold
Proposition 2.2. Let $\mathcal{A} \subset \mathcal{D}$ be a quivery subcategory with simples $(S_0, S_1, S_2)$. Then for each $i = 0, 1, 2$ the three simples of the tilted quivery subcategory $L_{S_i}(\mathcal{A})$ are given by the spherical triple $\tau_i(S_0, S_1, S_2)$.
For each $g \in G$ we then have a quivery subcategory $\mathcal{A}(g) \subset \mathcal{D}$ obtained by repeatedly tilting starting at $\mathcal{A}$. Its three simples are given by the spherical triple $(S_0(g), S_1(g), S_2(g)) = g(S_0, S_1, S_2)$. Note that the three simples of an arbitrary quivery subcategory have no well-defined ordering, but the above definition gives a chosen order for the simples of the quivery subcategories $\mathcal{A}(g)$. Let $P_i(g) \in SL(3, \mathbb{Z})$ be the matrix representing the automorphism of $K(\mathcal{D})$ induced by the twist functor $\Phi_{S_i(g)}$ with respect to the fixed basis $([S_0], [S_1], [S_2])$. The formulae defining the action of the braid group on spherical triples show that this system of matrices has the following transformation laws:
\[
\begin{aligned}
P_0(\tau_1 g) &= P_1(g), & P_1(\tau_1 g) &= P_1(g) P_0(g) P_1(g)^{-1}, & P_2(\tau_1 g) &= P_2(g), \\
P_0(r g) &= P_2(g), & P_1(r g) &= P_0(g), & P_2(r g) &= P_1(g).
\end{aligned}
\]
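The transformation laws can be applied mechanically to the matrices $P_i(e) = P_i$ of Sect. 2.2. The Python sketch below is an illustration under stated assumptions, not the author's computation: the laws for $r$ are read directly off the formula $r(S_0, S_1, S_2) = (S_2, S_0, S_1)$, the inverse is computed as the adjugate (valid because the matrices have determinant 1), and the helper names are hypothetical.

```python
P0_E = [[1, 3, -3], [0, 1, 0], [0, 0, 1]]
P1_E = [[1, 0, 0], [-3, 1, 3], [0, 0, 1]]
P2_E = [[1, 0, 0], [0, 1, 0], [3, -3, 1]]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def minor(A, r, c):
    """2x2 minor of A obtained by deleting row r and column c."""
    rows = [i for i in range(3) if i != r]
    cols = [j for j in range(3) if j != c]
    return (A[rows[0]][cols[0]] * A[rows[1]][cols[1]]
            - A[rows[0]][cols[1]] * A[rows[1]][cols[0]])


def det3(A):
    return sum((-1) ** j * A[0][j] * minor(A, 0, j) for j in range(3))


def inverse_unimodular(A):
    """Inverse of an integer matrix of determinant 1, i.e. its adjugate."""
    assert det3(A) == 1
    return [[(-1) ** (i + j) * minor(A, j, i) for j in range(3)]
            for i in range(3)]


def act_tau1(P):
    """(P0, P1, P2)(g) -> (P0, P1, P2)(tau_1 g)."""
    P0, P1, P2 = P
    return (P1, matmul(matmul(P1, P0), inverse_unimodular(P1)), P2)


def act_r(P):
    """(P0, P1, P2)(g) -> (P0, P1, P2)(r g), from r(S0,S1,S2) = (S2,S0,S1)."""
    P0, P1, P2 = P
    return (P2, P0, P1)


BASE = (P0_E, P1_E, P2_E)
after_tau1 = act_tau1(BASE)
after_r = act_r(BASE)
print("P1(tau_1) =", after_tau1[1])
```

Every matrix produced in this way again has determinant 1 and fixes the column vector $(1, 1, 1)$, as the checks one can run on `after_tau1` confirm.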
Introduce a graph $\Gamma(\mathcal{D})$ whose vertices are the quivery subcategories of $\mathcal{D}$, and in which two subcategories are joined by an edge if they differ by a tilt at a simple object. It was shown in [7] that distinct elements $g \in G$ define distinct subcategories $\mathcal{A}(g) \subset \mathcal{D}$. It follows that each connected component of $\Gamma(\mathcal{D})$ is just the Cayley graph of $G$ with respect to the generators $\tau_0, \tau_1, \tau_2$.
2.4. Stability conditions on $X$. Given an element $g \in G$ let $\mathcal{A}(g) \subset \mathcal{D}$ be the corresponding quivery subcategory. The class of any nonzero object $E \in \mathcal{A}(g)$ is a strictly positive linear combination:
\[
[E] = \sum_{i} n_i [S_i(g)] \quad \text{with } n_0, n_1, n_2 \geq 0 \text{ not all zero.}
\]
It follows that to define a stability condition on $\mathcal{D}$ we can just choose three complex numbers $z_i$ in the strict upper half-plane
\[
H = \{ z \in \mathbb{C} : z = r \exp(i\pi\phi) \text{ with } r > 0 \text{ and } 0 < \phi \leq 1 \}
\]
and set $Z(S_i(g)) = z_i$. The Harder–Narasimhan property is automatically satisfied because $\mathcal{A}(g)$ has finite length. We shall denote the corresponding stability condition by $\sigma(g, z_0, z_1, z_2)$.
Lemma 2.3. If $\sigma = \sigma(g, z_0, z_1, z_2)$ is a stability condition on $\mathcal{D}$ of the sort defined above, and $E \in \mathcal{D}$ is stable in $\sigma$, then there is an open subset $U \subset \operatorname{Stab}(\mathcal{D})$ containing $\sigma$ such that $E$ is stable for all stability conditions in $U$.
Proof. This follows from the arguments of [6, Sect. 8]. It is enough to check that the set of classes $\gamma \in K(\mathcal{D})$ such that there is an object $F \in \mathcal{D}$ with class $[F] = \gamma$ such that $m_\sigma(F) \leq m_\sigma(E)$ is finite. This is easy to see because the heart of $\sigma$ has finite length.
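The construction of $\sigma(g, z_0, z_1, z_2)$ rests on the observation that a non-negative, non-zero combination of points of $H$ again lies in $H$, so every nonzero object of $\mathcal{A}(g)$ has a well-defined phase in $(0, 1]$. A minimal Python sketch of the central charge and phase, assuming classes are given by their coefficient vectors $(n_0, n_1, n_2)$; the function names and the sample values of $z_i$ are hypothetical.

```python
import math


def central_charge(n, z):
    """Z(E) = sum_i n_i z_i for a class [E] = sum_i n_i [S_i(g)]."""
    return sum(ni * zi for ni, zi in zip(n, z))


def phase(w):
    """Phase phi in (0, 1] with w = r exp(i pi phi), r > 0.

    Raises ValueError if w does not lie in the half-plane H.
    """
    if w.imag > 0:
        return math.atan2(w.imag, w.real) / math.pi
    if w.imag == 0 and w.real < 0:
        return 1.0  # the negative real axis belongs to H
    raise ValueError("central charge does not lie in H")


# Sample values (an assumption made for the example only).
z = (complex(-1, 1), complex(0, 1), complex(1, 1))
for n in ((1, 0, 0), (0, 0, 1), (1, 1, 1)):
    Z = central_charge(n, z)
    print(n, Z, phase(Z))
```

For the sample values, the class $(1, 1, 1)$ has central charge $3i$ and therefore phase $1/2$.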
To each element $g \in G$ there is an associated set of stability conditions
\[
D(g) = \{ \sigma(g, z_0, z_1, z_2) : (z_0, z_1, z_2) \in H^3 \text{ with at most one } z_i \in \mathbb{R} \} \subset \operatorname{Stab}(X).
\]
By definition these subsets of $\operatorname{Stab}(X)$ are disjoint since they correspond to stability conditions with different hearts.
Proposition 2.4. There is an open subset
\[
\operatorname{Stab}^0(X) = \bigcup_{g \in G} D(g) \subset \operatorname{Stab}(X).
\]
If $g_1, g_2 \in G$ then the closures of the regions $D(g_i)$ intersect in $\operatorname{Stab}^0(X)$ precisely if $g_1 = \tau_i^{\pm 1} g_2$ for some $i \in \{0, 1, 2\}$.
Proof. Suppose a point $\sigma = \sigma(g, z_0, z_1, z_2)$ lies in $D(g)$. We must show that there is an open neighbourhood of $\sigma$ contained in the subset $\operatorname{Stab}^0(X)$. The simple objects $S_i = S_i(g) \in \mathcal{A}(g)$ are stable in $\sigma$. They remain stable in a small open neighbourhood $U$ of $\sigma$ in $\operatorname{Stab}(X)$. We repeatedly use the easily proved fact that if $\mathcal{A}, \mathcal{A}' \subset \mathcal{D}$ are hearts of bounded t-structures and $\mathcal{A} \subset \mathcal{A}'$ then $\mathcal{A} = \mathcal{A}'$.
Suppose first that $\operatorname{Im}(z_i) > 0$ for each $i$. Shrinking $U$ we can assume each $S_i$ has phase in the interval $(0, 1)$ for all stability conditions $(Z, \mathcal{P})$ of $U$. Since $\mathcal{A}(g)$ is the smallest extension-closed subcategory of $\mathcal{D}$ containing the $S_i$ it follows that $\mathcal{A}(g)$ is contained in the heart $\mathcal{P}((0, 1])$ of all stability conditions in $U$. This implies that $\mathcal{P}((0, 1]) = \mathcal{A}(g)$ and so $U$ is contained in $D(g)$.
Suppose now that one of the $z_i$, without loss of generality $z_0$, lies on the real axis, so that $\sigma$ lies on the boundary of $D(g)$. Thus $z_0 \in \mathbb{R}_{<0}$, and $\operatorname{Im}(z_i) > 0$ for $i = 1, 2$. Shrinking $U$ we can assume that $\operatorname{Re} Z(S_0) < 0$ and $\operatorname{Im} Z(S_i) > 0$ for $i = 1, 2$ for all stability conditions $(Z, \mathcal{P})$ of $U$. The object $S' = \Phi_{S_0}(S_2) \in \mathcal{D}$ lies in $\mathcal{A}(g)$, and is in fact a universal extension
\[
0 \longrightarrow S_2 \longrightarrow S' \longrightarrow S_0^{\oplus a} \longrightarrow 0,
\]
where $a = \dim \operatorname{Hom}^1_{\mathcal{D}}(S_0, S_2)$. Since $\operatorname{Hom}_{\mathcal{D}}(S_0, S') = 0$ the object $S'$ lies in $\mathcal{P}((0, 1))$ and shrinking $U$ we can assume that this is the case for all stability conditions $(Z, \mathcal{P})$ of $U$. We split $U$ into the two pieces $U_+ = \{\operatorname{Im} Z(S_0) \geq 0\}$ and $U_- = \{\operatorname{Im} Z(S_0) < 0\}$. The argument above shows that $U_+ \subset D(g)$. On the other hand, for any stability condition $(Z, \mathcal{P})$ in $U_-$ the object $S_0$ is stable with phase in the interval $(1, 3/2)$. Thus the heart $\mathcal{P}((0, 1])$ contains the objects $S_0[-1]$, $S'$ and $S_1$. Since these are the simples of the finite length category $\mathcal{A}(\tau_0 g)$ it follows that $U_- \subset D(\tau_0 g)$.
3. Quantum Cohomology and the Period Map
In this section we describe some of Dubrovin’s results concerning the twisted period map of the quantum cohomology of $\mathbb{P}^2$.
3.1. Frobenius manifolds. The notion of a Frobenius manifold was first introduced by Dubrovin, although similar structures arising in singularity theory were studied earlier by K. Saito. A Frobenius manifold is a complex manifold $M$ with a flat metric $g$ and a commutative multiplication $\circ : T_M \otimes T_M \longrightarrow T_M$ on its tangent bundle, satisfying the compatibility condition $g(X \circ Y, Z) = g(X, Y \circ Z)$. One requires that locally on $M$ there exists a holomorphic function $\Phi$ called the prepotential such that $g(X \circ Y, Z) = XYZ(\Phi)$ for all flat vector fields $X, Y, Z$. Finally, one also assumes the existence of a flat identity vector field $e$, and an Euler vector field $E$ satisfying
\[
\operatorname{Lie}_E(\circ) = \circ, \qquad \operatorname{Lie}_E(g) = (2 - d) \cdot g
\]
for some constant $d$ called the charge of the Frobenius manifold.
Given a smooth projective variety $Y$ of dimension $d$ one can put the structure of a Frobenius manifold of charge $d$ on an open subset of the vector space $H^*(Y, \mathbb{C})$. The metric is the constant metric given by the Poincaré pairing, and the prepotential is defined by an infinite series whose coefficients are the genus zero Gromov–Witten invariants, which naively speaking count rational curves in $Y$. The condition that the resulting algebras be associative translates into the statement that $\Phi$ satisfies the WDVV equations. In turn, these equations boil down to certain relations between Gromov–Witten classes arising from the structure of the cohomology ring of the moduli space of pointed rational curves. The Gromov–Witten invariants are only non-vanishing in certain degrees, which gives the existence of the Euler vector field. The resulting Frobenius manifold is called the quantum cohomology of $Y$.
Actually, for a general projective variety $Y$ (even a Calabi-Yau threefold) it is not known whether the series defining the prepotential has nonzero radius of convergence, so one has to work over a formal coefficient ring. But we shall only be interested in the case when $Y$ is a projective space and here it is known that there are no convergence problems. For example, in the case $Y = \mathbb{P}^2$ we can take co-ordinates $t = t_0 + t_1 \omega + t_2 \omega^2$, where $\omega \in H^2(\mathbb{P}^2, \mathbb{C})$ is the class of a line, and the function $\Phi$ is then defined by the series
\[
\Phi(t) = \frac{1}{2}\left(t_0^2 t_2 + t_0 t_1^2\right) + \sum_{k \geq 1} \frac{n_k}{(3k-1)!}\, t_2^{3k-1} e^{k t_1},
\]
where $n_k$ is the number of rational curves of degree $k$ on $\mathbb{P}^2$ passing through $3k - 1$ generic points. The Euler vector field is
\[
E = t_0 \frac{\partial}{\partial t_0} + 3 \frac{\partial}{\partial t_1} - t_2 \frac{\partial}{\partial t_2}.
\]
See [16, Lect. 1] for more details.
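For $\mathbb{P}^2$ the associativity (WDVV) condition on the prepotential determines all the numbers $n_k$ from $n_1 = 1$. The Python sketch below uses Kontsevich's recursion, which is not written out in the text above and is quoted here as a standard consequence of WDVV; the function name `gw_numbers` is hypothetical.

```python
from math import comb


def gw_numbers(kmax):
    """Return [n_1, ..., n_kmax] from Kontsevich's recursion with n_1 = 1.

    n_k = sum over a + b = k (a, b >= 1) of
          n_a n_b a^2 b [ b C(3k-4, 3a-2) - a C(3k-4, 3a-1) ].
    """
    n = {1: 1}
    for k in range(2, kmax + 1):
        total = 0
        for a in range(1, k):
            b = k - a
            total += n[a] * n[b] * a * a * b * (
                b * comb(3 * k - 4, 3 * a - 2)
                - a * comb(3 * k - 4, 3 * a - 1)
            )
        n[k] = total
    return [n[k] for k in range(1, kmax + 1)]


print(gw_numbers(5))  # [1, 1, 12, 620, 87304]
```

The first values are the familiar counts: one line through two points, one conic through five points, and twelve rational cubics through eight points.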
3.2. Tame Frobenius manifolds. Let $M$ be a Frobenius manifold. Multiplication by the Euler vector field $E$ defines a section $U \in \operatorname{End}(T_M)$. A point $m \in M$ is called tame if the endomorphism $U$ has distinct eigenvalues. The set of tame points of $M$ forms an open (possibly empty) subset of $M$. A Frobenius manifold will be called tame if all its points are tame. Let
\[
C_n(\mathbb{C}) = \{ (u_0, \ldots, u_{n-1}) \in \mathbb{C}^n : i \neq j \implies u_i \neq u_j \} / \operatorname{Sym}_n
\]
be the configuration space of $n$ unordered points in $\mathbb{C}$. Dubrovin showed that if $M$ is a tame Frobenius manifold the map $M \to C_n(\mathbb{C})$ defined by the eigenvalues of $U$ is a regular covering of an open subset of $C_n(\mathbb{C})$. This means that locally one can use the functions $u_i$ as co-ordinates on $M$. In terms of these canonical co-ordinates the product structure is
\[
\frac{\partial}{\partial u_i} \circ \frac{\partial}{\partial u_j} = \delta_{ij} \frac{\partial}{\partial u_i},
\]
and the Euler field takes the simple form
\[
E = \sum_i u_i \frac{\partial}{\partial u_i}.
\]
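Tameness can be seen concretely in the quantum cohomology of $\mathbb{P}^2$ from Sect. 3.1 along the locus $t_0 = t_2 = 0$. The Python sketch below is an illustration under assumptions that are not written out in the text: it uses the standard small quantum product $\omega \circ \omega = \omega^2$, $\omega \circ \omega^2 = q \cdot 1$ with $q = e^{t_1}$, and the fact that $E = 3\,\partial/\partial t_1$ at such points, so that $U$ is three times multiplication by $\omega$. The function names are hypothetical.

```python
import cmath
import math


def euler_multiplication(t1):
    """Matrix of U = E o (-) in the basis (1, omega, omega^2) at t = (0, t1, 0).

    Assumes omega o 1 = omega, omega o omega = omega^2, omega o omega^2 = q,
    with q = exp(t1); column j holds the image of the j-th basis vector.
    """
    q = cmath.exp(t1)
    return [[0, 0, 3 * q],
            [3, 0, 0],
            [0, 3, 0]]


def canonical_coordinates(t1):
    """Eigenvalues of U: the three numbers u with (u / 3)^3 = exp(t1)."""
    root = cmath.exp(t1 / 3)             # one cube root of q
    zeta = cmath.exp(2j * cmath.pi / 3)  # primitive cube root of unity
    return [3 * root * zeta ** k for k in range(3)]


def is_eigenpair(U, u, tol=1e-9):
    """Check U v = u v for v = (lam^2, lam, 1) with lam = u / 3."""
    lam = u / 3
    v = [lam * lam, lam, 1]
    Uv = [sum(U[r][c] * v[c] for c in range(3)) for r in range(3)]
    return all(abs(Uv[r] - u * v[r]) < tol for r in range(3))


t1 = -3 * math.log(3)  # chosen so that the eigenvalues have modulus 1
U = euler_multiplication(t1)
u = canonical_coordinates(t1)
print(u, [is_eigenpair(U, uk) for uk in u])
```

For this choice of $t_1$ the three eigenvalues are the cube roots of unity, which are distinct, so the point is tame.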
The non-trivial data of the Frobenius structure on $M$ is entirely contained in the dependence of the metric on the canonical co-ordinates. Given a Frobenius manifold $M$ it is natural to ask whether it is possible to analytically continue the prepotential to obtain a larger Frobenius manifold $M'$ such that $M$ can be identified with an open subset of $M'$. Dubrovin showed how to do this for tame Frobenius manifolds.
Theorem 3.1 [16, Theorem 4.7]. Given a tame Frobenius manifold $M$ of dimension $n$, there is a regular covering space $\tilde{C}_n(\mathbb{C}) \to C_n(\mathbb{C})$, a divisor $B \subset \tilde{C}_n(\mathbb{C})$, and a tame Frobenius structure on $M' = \tilde{C}_n(\mathbb{C}) \setminus B$ such that there is an open inclusion of Frobenius manifolds $M \hookrightarrow M'$.
Let $M$ be a Frobenius manifold. Define a subset of $M \times \mathbb{C}$,
\[
W = \{ (p, z) \in M \times \mathbb{C} : \det(U - z \mathbf{1}) \neq 0 \},
\]
and let $p : W \to M$ be the projection map. For each parameter $s \in \mathbb{C}$ one can define a flat, holomorphic connection $\check{\nabla} = \check{\nabla}^{(s)}$ on the bundle $p^*(T_M)$, by the following formulae:
\[
\begin{aligned}
\check{\nabla}_X Y &= \nabla_X Y - (\nabla E + c \mathbf{1})(U - z \mathbf{1})^{-1}(X \circ Y), \\
\check{\nabla}_{\partial/\partial z} Y &= \nabla_{\partial/\partial z} Y + (\nabla E + c \mathbf{1})(U - z \mathbf{1})^{-1}(Y).
\end{aligned}
\]
Here $\nabla$ is the Levi–Civita connection corresponding to the flat metric $g$ on $M$, $\nabla E$ is the endomorphism of $T_M$ defined by $X \mapsto \nabla_X E$, and
\[
c = s + \frac{d - 1}{2}.
\]
We shall be interested in the case when $c = d - 1$.
Assume now that $M$ is a tame Frobenius manifold with its canonical co-ordinates $u_0, \ldots, u_{n-1}$. The space $W$ takes the form
\[
W = \{ (m, z) \in M \times \mathbb{C} : z \neq u_i \text{ for } 0 \leq i \leq n - 1 \}.
\]
For each $m \in M$ the connection $\check{\nabla}$ restricts to give a holomorphic connection $\check{\nabla}_m$ on a trivial bundle over the space $\mathbb{C}_m = \mathbb{C} \setminus \{u_0, \ldots, u_{n-1}\}$. Dubrovin showed that these connections $\check{\nabla}_m$ vary isomonodromically. We briefly explain this condition. Choose a point $m \in M$ and a loop $\gamma_m$ in $\mathbb{C}_m$ based at some point $z \in \mathbb{C}_m$. Let $H$ be the space of flat sections of $\check{\nabla}_m$ near $z \in \mathbb{C}_m$. Monodromy around the loop $\gamma_m$ defines a linear transformation $\alpha_m \in \operatorname{Aut}(H)$. For points $m' \in M$ in a small neighbourhood of $m$ the connection $\check{\nabla}$ allows us to identify $H$ with the space of flat sections of the connection $\check{\nabla}_{m'}$ near $z \in \mathbb{C}_{m'}$. Moreover we can continuously deform the loop $\gamma_m$ to give a loop $\gamma_{m'}$ in $\mathbb{C}_{m'}$ based at $z$, and hence obtain a transformation $\alpha_{m'} \in \operatorname{Aut}(H)$. The isomonodromy condition is the statement that the transformations $\alpha_{m'}$ of $H$ obtained in this way are constant.
3.3. Quantum cohomology of $\mathbb{P}^2$. Let us now consider the Frobenius manifold defined by the quantum cohomology of $\mathbb{P}^2$. It is known that the subset of tame points of the resulting Frobenius manifold is non-empty, so we can apply Dubrovin’s result Theorem 3.1 to obtain a tame Frobenius manifold structure on a dense open subset $M = \tilde{C}_3(\mathbb{C}) \setminus B$, where $\tilde{C}_3(\mathbb{C})$ is the universal cover. Let $\check{\nabla} = \check{\nabla}^{(1/2)}$ be the second structure connection with parameter $s = 1/2$. The following result of Dubrovin’s computes its monodromy.
Theorem 3.2 (Dubrovin). There is a point $m \in M$ with canonical co-ordinates $(u_0, u_1, u_2)$ the three roots of unity. Let $\gamma_0, \gamma_1, \gamma_2$ be the following loops $\gamma_i$ in $\mathbb{C}_m$ based at $0 \in \mathbb{C}$:
[Figure: the loops $\gamma_0$, $\gamma_1$, $\gamma_2$ in $\mathbb{C}_m$ based at $0$.]
There is a triple $(\phi_0, \phi_1, \phi_2)$ of flat sections of the connection $\check{\nabla}_m$ in a neighbourhood of $0 \in \mathbb{C}$, such that the monodromy transformations $P_i$ corresponding to the loops $\gamma_i$ act on the triple $(\phi_0, \phi_1, \phi_2)$ by the matrices
\[
\begin{pmatrix} 1 & 3 & -3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad
\begin{pmatrix} 1 & 0 & 0 \\ -3 & 1 & 3 \\ 0 & 0 & 1 \end{pmatrix}, \qquad
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 3 & -3 & 1 \end{pmatrix}.
\]
Moreover this triple $(\phi_0, \phi_1, \phi_2)$ is unique up to multiplication by an overall scalar factor.
Proof. The existence of a triple of flat sections with the above monodromy properties follows from general work of Dubrovin on monodromy of twisted period maps [18, Lemma 4.10, 4.12], together with Dubrovin’s computation of the Stokes matrix of the quantum cohomology of $\mathbb{P}^2$ [16, Example 4.4]. Uniqueness is easily checked.
The discriminant of the Frobenius manifold $M$ is the submanifold
\[
\Delta = \{ m \in M : u_i(m) = 0 \text{ for some } i \}.
\]
Write $M^0 = M \setminus \Delta$ for its complement and let $\tilde{M}^0$ be its inverse image under the natural map $\tilde{C}_3(\mathbb{C}^*) \to \tilde{C}_3(\mathbb{C})$. Let us take the point $m \in M^0$ of Theorem 3.2 as a base-point, and choose a small simply-connected neighbourhood $m \in U \subset M^0$. For each point $m' \in U$ we have a well-defined choice of loops $\gamma_i$ in $\mathbb{C}_{m'}$ based at $0$ obtained by deforming the loops $\gamma_i$ of Theorem 3.2. The group $G$ is a subgroup of the fundamental group $\pi_1(C_3(\mathbb{C}^*))$ and therefore acts by covering transformations on $\tilde{C}_3(\mathbb{C}^*)$. Thus to each element $g \in G$ we can associate a corresponding open subset $U_g = g(U) \cap \tilde{M}^0 \subset \tilde{M}^0$.
Using the connection $\check{\nabla}$ we can continue the triple of sections of Theorem 3.2 to obtain a standard triple of flat sections of $\check{\nabla}_m$ in a neighbourhood of $0 \in \mathbb{C}$ for all $m \in \tilde{M}^0$. This is well-defined despite the fact that $\tilde{M}^0$ may not be simply-connected because of the isomonodromy property and the uniqueness statement in Theorem 3.2. In particular, for each $g \in G$ we obtain a standard triple of flat sections of $\check{\nabla}_m$ near $0 \in \mathbb{C}$ for all points $m \in U_g$. Taking monodromy of the connection $\check{\nabla}_m$ around the loops $\gamma_i$ with respect to this standard triple gives three matrices $P_0(g), P_1(g), P_2(g)$.
To calculate these matrices we use the isomonodromy property. For example, consider a path in $\tilde{M}^0$ from $m$ to $\tau_1(m)$. If we move the loops $\gamma_i$ continuously with the $u_i$ then at the point $\tau_1(m)$ we will obtain the following basis $(\gamma_0', \gamma_1', \gamma_2')$ of $\pi_1(\mathbb{C}_m, 0)$:
[Figure: the deformed loops $\gamma_0'$, $\gamma_1'$, $\gamma_2'$.]
Clearly $\gamma_0' = \gamma_0^{-1} \gamma_1 \gamma_0$, $\gamma_1' = \gamma_0$, $\gamma_2' = \gamma_2$. By the isomonodromy property, the monodromy of the standard triple of sections at $\tau_1(m)$ around the loops $\gamma_i'$ will be the same as the monodromy of the triple of sections at $m$ around the loops $\gamma_i$ and is therefore described by the matrices $P_i(e)$ of Theorem 3.2. But the matrices $P_i(\tau_1)$ describe the monodromy of the same sections around the
loops $\gamma_i$. Arguing in this way we see that the matrices $P_i(g)$ have the transformation properties
\[
\begin{aligned}
P_0(\tau_1 g) &= P_1(g), & P_1(\tau_1 g) &= P_1(g) P_0(g) P_1(g)^{-1}, & P_2(\tau_1 g) &= P_2(g), \\
P_0(r g) &= P_2(g), & P_1(r g) &= P_0(g), & P_2(r g) &= P_1(g).
\end{aligned}
\]
These are exactly the same transformation properties satisfied by the linear maps $\phi_{S_i(g)}$. Since the matrices $P_i(e)$ of Theorem 3.2 coincide with the matrices of $\phi_{S_i(e)}$ with respect to the basis $\{S_0(e), S_1(e), S_2(e)\}$ we obtain Theorem 1.2.
3.4. Twisted period map. There is an embedding $M^0 \hookrightarrow W$ obtained by sending a point $m$ to $(m, 0)$. Pulling back the flat connection $\check{\nabla}$ we obtain a flat connection on the tangent bundle $T_{M^0}$ which we also denote $\check{\nabla}$. We define flat co-ordinates $W_i$ whose gradients with respect to the flat metric on $M$ are the flat sections $(\phi_0, \phi_1, \phi_2)$ of Theorem 3.2. Putting them together gives a holomorphic map
\[
W : \tilde{M}^0 \longrightarrow \mathbb{C}^3
\]
uniquely defined up to scalar multiples. There is a free action of $\mathbb{C}$ on $\tilde{C}_3(\mathbb{C}^*)$ lifting the $\mathbb{C}^*$ action which simultaneously rescales the co-ordinates $(u_0, u_1, u_2)$ on $C_3(\mathbb{C}^*)$. Let
\[
\mathbb{A}^2 = \{ (z_0, z_1, z_2) \in \mathbb{C}^3 : z_0 + z_1 + z_2 = i \}
\]
be the affine space defined in the introduction. Then
Proposition 3.3. There is a unique scalar multiple of the map $W$ which descends to give a local isomorphism
\[
W : \tilde{M}^0 / \mathbb{C} \longrightarrow \mathbb{A}^2.
\]
Proof. First we show that the only possible linear relation between the solutions $\phi_i$ of Theorem 3.2 is $\phi_0 + \phi_1 + \phi_2 = 0$. Indeed, any such relation must be monodromy invariant, and $(1, 1, 1)$ is the unique vector (up to multiples) preserved by the three given matrices. Secondly we show that this relation does indeed hold. Otherwise $(\phi_0, \phi_1, \phi_2)$ define a basis of solutions and the map $W$ is a local isomorphism. Let
\[
E = \sum_i u_i \frac{\partial}{\partial u_i}
\]
be the Euler vector field. Dubrovin showed that all components $W_i$ of $W$ satisfy $\operatorname{Lie}_E(W_i) = \text{constant}$. We cannot have $\operatorname{Lie}_E(W) = 0$ since this would contradict the statement that $W$ is a local isomorphism. Thus there is a two-dimensional subspace of solutions satisfying $\operatorname{Lie}_E(W_i) = 0$. But this subspace would have to be monodromy invariant, and there are
no such two-dimensional subspaces. This gives a contradiction, so the relation holds, and rescaling we can assume that $W_0 + W_1 + W_2 = i$. Now it follows that the only two-dimensional, monodromy invariant subspace of solutions is that generated by the $\phi_i$, so that $\operatorname{Lie}_E(W) = 0$ and the result follows.
Acknowledgements. The problem of describing $\operatorname{Stab}(\mathcal{O}_{\mathbb{P}^2}(-3))$ was originally conceived as a joint project with Alastair King, and the basic picture described in Theorem 1.1 was worked out jointly with him. It’s a pleasure to thank Phil Boalch who first got me interested in the connections with Stokes matrices and quantum cohomology. Several other people have been extremely helpful in explaining various things about Frobenius manifolds; let me thank here B. Dubrovin, C. Hertling and M. Mazzocco.
References
1. Aspinwall, P., Greene, B., Morrison, D.: Measuring small distances in N = 2 sigma models. Nucl. Phys. B 420, no. 1–2, 184–242 (1994)
2. Bondal, A.: Representations of associative algebras and coherent sheaves. (Russian) Izv. Akad. Nauk SSSR Ser. Mat. 53, no. 1, 25–44 (1989); translation in Math. USSR-Izv. 34, no. 1, 23–42 (1990)
3. Bondal, A., Polishchuk, A.: Homological properties of associative algebras: the method of helices. (Russian) Izv. Ross. Akad. Nauk Ser. Mat. 57, no. 2, 3–50 (1993); translation in Russian Acad. Sci. Izv. Math. 42, no. 2, 219–260 (1994)
4. Brenner, S., Butler, M.: Generalizations of the Bernstein-Gelfand-Ponomarev reflection functors. Representation theory, II (Proc. Second Internat. Conf., Carleton Univ., Ottawa, Ont., 1979), Lecture Notes in Math. 832, Berlin-New York: Springer, 1980, pp. 103–169
5. Bridgeland, T.: Stability conditions on triangulated categories. http://arxiv.org/list/math.AG/0212237, 2002, to appear in Ann. of Maths
6. Bridgeland, T.: Stability conditions on K3 surfaces. http://arxiv.org/list/math.AG/0307164, 2003
7. Bridgeland, T.: T-structures on some local Calabi-Yau varieties. J. Alg. 289, 453–483 (2005)
8. Bridgeland, T.: Stability conditions and Kleinian singularities. http://arxiv.org/list/math.AG/0508257, 2005
9. Bridgeland, T., King, A., Reid, M.: The McKay correspondence as an equivalence of derived categories. J. Amer. Math. Soc. 14, no. 3, 535–554 (2001)
10. Cecotti, S., Vafa, C.: On classification of N = 2 supersymmetric theories. Commun. Math. Phys. 158, no. 3, 569–644 (1993)
11. Diaconescu, D.-E., Gomis, J.: Fractional branes and boundary states in orbifold theories. J. High Energy Phys. 2000, no. 10, Paper 1, 44 pp.
12. Douglas, M., Fiol, B., Römelsberger, C.: The spectrum of BPS branes on a noncompact Calabi-Yau. JHEP 0509, 057 (2005)
13. Douglas, M.: Dirichlet branes, homological mirror symmetry, and stability. In: Proceedings of the International Congress of Mathematicians, Vol. III (Beijing, 2002), Beijing: Higher Ed. Press, 2002, pp. 395–408
14. Dubrovin, B.: Geometry of 2D topological field theories. In: Integrable systems and quantum groups (Montecatini Terme, 1993), Lecture Notes in Math. 1620, Berlin: Springer, 1996, pp. 120–348
15. Dubrovin, B.: Geometry and analytic theory of Frobenius manifolds. In: Proceedings of the International Congress of Mathematicians, Vol. II (Berlin, 1998) Doc. Math. 1998
16. Dubrovin, B.: Painlevé transcendents in two-dimensional topological field theory. The Painlevé property, CRM Ser. Math. Phys., New York: Springer, 1999, pp. 287–412
17. Dubrovin, B., Mazzocco, M.: Monodromy of certain Painlevé-VI transcendents and reflection groups. Invent. Math. 141, no. 1, 55–147 (2000)
18. Dubrovin, B.: On almost duality for Frobenius manifolds. In: Geometry, topology, and mathematical physics, Amer. Math. Soc. Transl. Ser. 2, 212, Providence, RI: Amer. Math. Soc., 2004, pp. 75–132
19. Feng, B., Hanany, A., He, Y., Iqbal, A.: Quiver theories, soliton spectra and Picard-Lefschetz transformations. J. High Energy Phys. 2003, no. 2, 056, 33 pp.
20. Gorodentsev, A., Rudakov, A.: Exceptional vector bundles on projective spaces. Duke Math. J. 54, no. 1, 115–130 (1987)
21. Guzzetti, D.: Stokes matrices and monodromy of the quantum cohomology of projective spaces. Commun. Math. Phys. 207, no. 2, 341–383 (1999)
22. Happel, D., Reiten, I., Smalø, S.: Tilting in abelian categories and quasitilted algebras. Mem. Amer. Math. Soc. 120, no. 575 (1996)
23. Hertling, C.: Frobenius manifolds and moduli spaces for singularities. Cambridge Tracts in Mathematics 151, Cambridge: Cambridge University Press, 2002
24. Kent IV, R., Peifer, D.: A geometric and algebraic description of annular braid groups. Internat. J. Algebra Comput. 12, 85–97 (2002)
25. Kontsevich, M.: Homological algebra of mirror symmetry. Proceedings of the International Congress of Mathematicians. Vol. 1, 2, (Zürich, 1994), Basel: Birkhäuser, 1995, pp. 120–139
26. Manin, Yu.: Frobenius manifolds, quantum cohomology, and moduli spaces. American Mathematical Society Colloquium Publications, 47. Providence, RI: Amer. Math. Soc., 1999
27. Seidel, P., Thomas, R.: Braid group actions on derived categories of coherent sheaves. Duke Math. J. 108, no. 1, 37–108 (2001)
28. Tanabé, S.: Invariant of the hypergeometric group associated to the quantum cohomology of the projective space. Bull. Sci. Math. 128, no. 10, 811–827 (2004)
29. Zaslow, E.: Solitons and helices: the search for a math-physics bridge. Commun. Math. Phys. 175, no. 2, 337–375 (1996)
Communicated by M.R. Douglas
Commun. Math. Phys. 266, 735–775 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0037-x
Communications in
Mathematical Physics
Grafting and Poisson Structure in (2+1)-Gravity with Vanishing Cosmological Constant
C. Meusburger
Perimeter Institute for Theoretical Physics, 31 Caroline Street North, Waterloo, Ontario N2L 2Y5, Canada. E-mail: [email protected]
Received: 19 September 2005 / Accepted: 20 January 2006
Published online: 4 May 2006 – © Springer-Verlag 2006
Abstract: We relate the geometrical construction of (2+1)-spacetimes via grafting to phase space and Poisson structure in the Chern-Simons formulation of (2+1)-dimensional gravity with vanishing cosmological constant on manifolds of topology $\mathbb{R} \times S_g$, where $S_g$ is an orientable two-surface of genus $g > 1$. We show how grafting along simple closed geodesics $\lambda$ is implemented in the Chern-Simons formalism and derive explicit expressions for its action on the holonomies of general closed curves on $S_g$. We prove that this action is generated via the Poisson bracket by a gauge invariant observable associated to the holonomy of $\lambda$. We deduce a symmetry relation between the Poisson brackets of observables associated to the Lorentz and translational components of the holonomies of general closed curves on $S_g$ and discuss its physical interpretation. Finally, we relate the action of grafting on the phase space to the action of Dehn twists and show that grafting can be viewed as a Dehn twist with a formal parameter $\theta$ satisfying $\theta^2 = 0$.
1. Introduction
(2+1)-dimensional gravity is of physical interest as a toy model for the (3+1)-dimensional case. It is used as a testing ground which allows one to investigate conceptual questions arising in the quantisation of gravity without being hindered by the technical complexity in higher dimensions. One of these questions is the problem of "quantising geometry" or, more concretely, the problem of recovering geometrical objects with a clear physical interpretation from the gauge theory-like formulations used as a starting point for quantisation. In (2+1)-dimensions, the relation between Einstein’s theory of gravity and gauge theory is more direct than in higher dimensional cases, since the theory takes the form of a Chern-Simons gauge theory. Depending on the value of the cosmological constant, vacuum solutions of Einstein’s equations of motion are flat or of constant curvature.
The theory has only a finite number of physical degrees of freedom arising from the matter content and topology of the spacetime. This absence of local gravitational
degrees of freedom manifests itself mathematically in the possibility to formulate the theory as a Chern-Simons gauge theory [1, 2] where the gauge group is the (2+1)-dimensional Poincaré group $P_3^\uparrow$, the group $SO(3, 1) \cong SL(2, \mathbb{C})/\mathbb{Z}_2$ or $SO(2, 2) \cong (SL(2, \mathbb{R}) \times SL(2, \mathbb{R}))/\mathbb{Z}_2$, respectively, for cosmological constant $\Lambda = 0$, $\Lambda > 0$ and $\Lambda < 0$. The main advantage of the Chern-Simons formulation of (2+1)-dimensional gravity is that it allows one to apply gauge theoretical concepts and methods which give rise to an efficient description of phase space and Poisson structure. As Einstein’s equations of motion take the form of a flatness condition on the gauge field, physical states can be characterised by holonomies, and conjugation invariant functions of the holonomies form a complete set of physical observables. Starting with the work of Nelson and Regge [3–7], Martin [8] and of Ashtekar, Husain, Rovelli, Samuel and Smolin [9], the description of (2+1)-dimensional gravity in terms of holonomies and the associated gauge invariant observables has proven useful in clarifying the structure of its classical phase space as well as in quantisation. An overview of different approaches and results is given in [10]. The disadvantage of this approach is that it makes it difficult to recover the geometrical picture of a spacetime manifold and thereby complicates the physical interpretation of the theory. Except for cases where the holonomies take a particularly simple form such as static spacetimes and the torus universe, it is in general not obvious how the description of the phase space in terms of holonomies and associated gauge invariant observables gives rise to a Lorentz metric on a spacetime manifold. The first to address this problem for general spacetimes was Mess [11], who showed how the geometry of (2+1)-dimensional spacetimes can be reconstructed from a set of holonomies.
More recent results on this problem are obtained in the papers by Benedetti and Guadagnini [12] and by Benedetti and Bonsante [13], which are going to be our main references. They describe the construction of evolving spacetimes from static ones via the geometrical procedure of grafting, which, essentially, consists of inserting small annuli along certain geodesics of the spacetime. As they establish a unified picture for all values of the cosmological constant and show how this change of geometry affects the holonomies, they clarify the relation between holonomies and spacetime geometry considerably. However, despite these results, the problem of relating spacetime geometry and the description of phase space and Poisson structure in terms of holonomies has not yet been fully solved. The missing link is the role of the Poisson structure. A complete understanding of the gauge invariant observables must include a physical interpretation of the transformations on phase space they generate via the Poisson bracket. Conversely, to interpret the geometrical construction of evolving (2+1)-spacetimes via grafting as a physical transformation, one needs to determine how it affects phase space and Poisson structure. This paper addresses these questions for (2+1)-gravity with vanishing cosmological constant on manifolds of topology $\mathbb{R} \times S_g$, where $S_g$ is an orientable two-surface of genus $g > 1$. It relates the construction of evolving (2+1)-spacetimes via grafting along simple, closed curves to the description of the phase space in terms of holonomies and the associated gauge invariant observables. The main results can be stated as follows.
1. We show how grafting along a closed, simple geodesic is implemented in the Chern-Simons formulation of (2+1)-dimensional gravity. Using the parametrisation of the phase space in terms of holonomies given in [14, 15], we deduce explicit expressions for the action of grafting on the holonomies of general curves on $S_g$ and investigate its properties as a transformation on phase space.
2. We derive the Hamiltonian that generates this grafting transformation via the Poisson bracket. This Hamiltonian is one of the two basic gauge invariant observables associated to a closed curve on $S_g$ and obtained from the Lorentz component of its holonomy.
3. We demonstrate that there is a symmetry relation between the transformation of the observables associated to a curve $\eta$ under grafting along $\lambda$ and the transformation of the corresponding observables for $\lambda$ under grafting along $\eta$. Infinitesimally, this relation takes the form of a general identity for the Poisson brackets of certain observables associated to the two curves.
4. We show that the action of grafting in our description of the phase space is closely related to the action of (infinitesimal) Dehn twists investigated in an earlier paper [16]. Essentially, grafting can be viewed as a Dehn twist with a formal parameter $\theta$ satisfying $\theta^2 = 0$.
The paper is structured as follows. In Sect. 2 we introduce the relevant definitions and notation, present some background on the (2+1)-dimensional Poincaré group and on hyperbolic geometry and summarise the description of grafted (2+1)-spacetimes in [12, 13] for the case of grafting along multicurves. In Sect. 3, we briefly review the Hamiltonian version of the Chern-Simons formulation of (2+1)-dimensional gravity. We discuss the role of holonomies and summarise the relevant results of [14, 15], in which phase space and Poisson structure are characterised by a symplectic potential on the manifold $(P_3^\uparrow)^{2g}$ with different copies of $P_3^\uparrow$ standing for the holonomies of a set of generators of the fundamental group $\pi_1(S_g)$. Section 4 discusses the implementation of grafting along closed, simple geodesics in the Chern-Simons formalism. We show how the geometrical procedure of grafting in [12, 13] gives rise to a transformation on the extended phase space $(P_3^\uparrow)^{2g}$ and derive formulas for its action on the holonomies of general elements of the fundamental group $\pi_1(S_g)$.
Section 5 establishes the relation of grafting and Poisson structure. After express↑ ing the symplectic potential on the extended phase space (P3 )2g in terms of variables adapted to the grafting transformations, we show that these transformations are generated by gauge invariant Hamiltonians and therefore act as Poisson isomorphisms. We deduce a general symmetry relation between the Poisson brackets of certain observables associated to general closed curves on Sg . In Sect. 6, we explore the link between grafting and Dehn twists. We review the results concerning Dehn twists derived in [16] and introduce a graphical procedure which allows one to determine the action of grafting on the holonomies of general closed curves on Sg . By means of this procedure, we then demonstrate that there is a close relation between the action of grafting and (infinitesimal) Dehn twists. In Sect. 7 we illustrate the general results from Sect. 4 to 6 by applying them to a concrete example. Section 8 contains a summary of our results and concluding remarks.
2. Grafted (2+1) Spacetimes with Vanishing Cosmological Constant: The Geometrical Viewpoint

2.1. The (2+1)-dimensional Poincaré group. Throughout the paper we use Einstein's summation convention. Indices are raised and lowered with the three-dimensional Minkowski metric $\eta = \mathrm{diag}(1,-1,-1)$, and $x \cdot y$ stands for $\eta(x, y)$.
C. Meusburger

In the following $L_3^\uparrow$ and $P_3^\uparrow = L_3^\uparrow \ltimes \mathbb{R}^3$ denote, respectively, the (2+1)-dimensional proper orthochronous Lorentz and Poincaré group. We identify $\mathbb{R}^3$ and the Lie algebra $so(2,1) = \mathrm{Lie}\, L_3^\uparrow$ as vector spaces. The action of $L_3^\uparrow$ on $\mathbb{R}^3$ in its matrix representation then agrees with its action on $so(2,1)$ via the adjoint action
\[ p = (p^0, p^1, p^2) \cong p^a J_a, \qquad \mathrm{Ad}(u)\, p = p^a\, u J_a u^{-1} = u^b{}_a\, p^a J_b, \tag{2.1} \]
where $J_a$, $a = 0, 1, 2$, are the generators of $so(2,1)$. For notational consistency with earlier papers [14–16] considering more general gauge groups we will use the notation $\mathrm{Ad}(u) p$ throughout the paper and often do not distinguish notationally between elements of $so(2,1)$ and associated vectors in $\mathbb{R}^3$. With the parametrisation
\[ (u, a) = (u, -\mathrm{Ad}(u) j) \in P_3^\uparrow, \qquad u \in L_3^\uparrow,\ a, j \in \mathbb{R}^3, \tag{2.2} \]
the group multiplication in $P_3^\uparrow$ is then given by
\[ (u_1, a_1) \cdot (u_2, a_2) = (u_1 \cdot u_2,\ a_1 + \mathrm{Ad}(u_1) a_2) = \left( u_1 \cdot u_2,\ -\mathrm{Ad}(u_1 u_2)\left( j_2 + \mathrm{Ad}(u_2^{-1}) j_1 \right) \right). \tag{2.3} \]
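The multiplication law (2.3) can be checked numerically. The following is a minimal sketch, assuming Lorentz transformations are stored as 3×3 nested lists acting on component vectors; all function names (`boost`, `rotation`, `from_j`, `to_j`, ...) and the sample values are illustrative choices and do not appear in the paper.

```python
import math

ETA = (1.0, -1.0, -1.0)  # diagonal of the Minkowski metric diag(1, -1, -1)

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(a, x):
    return [sum(a[i][k] * x[k] for k in range(3)) for i in range(3)]

def boost(chi):
    """Boost in the (x^0, x^1) plane (illustrative sample element)."""
    c, s = math.cosh(chi), math.sinh(chi)
    return [[c, s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rotation(phi):
    """Rotation in the (x^1, x^2) plane (illustrative sample element)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def lorentz_inverse(u):
    """u^{-1} = eta u^T eta for a Lorentz matrix u."""
    return [[ETA[i] * u[j][i] * ETA[j] for j in range(3)] for i in range(3)]

def multiply(g1, g2):
    """(u1, a1) . (u2, a2) = (u1 u2, a1 + Ad(u1) a2), first form of (2.3)."""
    (u1, a1), (u2, a2) = g1, g2
    shifted = matvec(u1, a2)
    return matmul(u1, u2), [a1[i] + shifted[i] for i in range(3)]

def from_j(u, j):
    """Parametrisation (2.2): (u, a) = (u, -Ad(u) j)."""
    return u, [-x for x in matvec(u, j)]

def to_j(g):
    """Recover j from (u, a) = (u, -Ad(u) j)."""
    u, a = g
    return [-x for x in matvec(lorentz_inverse(u), a)]

def close(x, y, tol=1e-9):
    return all(abs(s - t) < tol for s, t in zip(x, y))

# Arbitrary sample data.
u1, j1 = matmul(boost(0.7), rotation(1.1)), [0.3, -1.0, 2.0]
u2, j2 = matmul(rotation(-0.4), boost(0.2)), [1.5, 0.5, -0.25]

product = multiply(from_j(u1, j1), from_j(u2, j2))
# Second form of (2.3): the product has j = j2 + Ad(u2^{-1}) j1.
rotated_j1 = matvec(lorentz_inverse(u2), j1)
expected_j = [j2[i] + rotated_j1[i] for i in range(3)]
```

Comparing `to_j(product)` with `expected_j` confirms that the two expressions in (2.3) agree for this sample.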
The Lie algebra of $P_3^\uparrow$ is $\mathrm{Lie}\, P_3^\uparrow = iso(2,1)$. Denoting by $J_a$, $a = 0, 1, 2$, the generators of $so(2,1)$, by $P_a$, $a = 0, 1, 2$, the generators of the translations, and choosing the convention $\epsilon_{012} = 1$ for the epsilon tensor, we have the Lie bracket
\[ [P_a, P_b] = 0, \qquad [J_a, J_b] = \epsilon_{abc} J^c, \qquad [J_a, P_b] = \epsilon_{abc} P^c, \tag{2.4} \]
and a non-degenerate, Ad-invariant bilinear form $\langle\,,\rangle$ on $iso(2,1)$ is given by
\[ \langle J_a, P^b \rangle = \delta_a^b, \qquad \langle J_a, J_b \rangle = \langle P^a, P^b \rangle = 0. \tag{2.5} \]
We represent the generators of $so(2,1)$ by the matrices
\[ (J_a)_{bc} = -\epsilon_{abc} \tag{2.6} \]
and denote by $\exp : so(2,1) \to L_3^\uparrow$, $p^a J_a \mapsto e^{p^a J_a}$ the exponential map for $L_3^\uparrow$. As this map is surjective, see for example [17, 18], elements of $L_3^\uparrow$ can be parametrised in terms of a vector $p \in \mathbb{R}^3$ with $p^0 \geq 0$ as $u = e^{-p^a J_a}$. Using expression (2.6) for the generators of $so(2,1)$ and setting
\[ \hat p = \frac{1}{m}\, p \qquad \text{for } m^2 := |p^2| \neq 0, \tag{2.7} \]
we find
\[ u_{ab} = \begin{cases} \hat p_a \hat p_b + \cos m\, (\eta_{ab} - \hat p_a \hat p_b) + \sin m\, \epsilon_{abc}\, \hat p^c & p^a p_a = m^2 > 0 \\ \eta_{ab} + \epsilon_{abc}\, p^c + \tfrac{1}{2}\, p_a p_b & p^a p_a = 0 \\ -\hat p_a \hat p_b + \cosh m\, (\eta_{ab} + \hat p_a \hat p_b) + \sinh m\, \epsilon_{abc}\, \hat p^c & p^a p_a = -m^2 < 0. \end{cases} \tag{2.8} \]
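As a sanity check of (2.8), one can compare the closed form with a truncated power series for $e^{-p^a J_a}$. The sketch below makes two assumptions that are not spelled out in the text: matrices are stored with both indices lowered, and the matrix product contracts with $\eta^{-1}$. The helper names and the truncation order are illustrative.

```python
import math

ETA = (1.0, -1.0, -1.0)  # diagonal of the Minkowski metric

def eps(a, b, c):
    """Totally antisymmetric symbol with eps(0, 1, 2) = 1."""
    return (a - b) * (b - c) * (c - a) / 2

def eta_matrix():
    return [[ETA[a] if a == b else 0.0 for b in range(3)] for a in range(3)]

def generator(p):
    """X_{ab} = -(p^c J_c)_{ab} = eps_{abc} p^c, using (2.6)."""
    return [[sum(eps(a, b, c) * p[c] for c in range(3)) for b in range(3)]
            for a in range(3)]

def compose(x, y):
    """Product of lower-index matrices: (x y)_{ab} = x_{ac} eta^{cd} y_{db}."""
    return [[sum(x[a][c] * ETA[c] * y[c][b] for c in range(3))
             for b in range(3)] for a in range(3)]

def exp_series(p, terms=40):
    """Truncated power series for u = exp(-p^a J_a) with lowered indices."""
    x = generator(p)
    result, term = eta_matrix(), eta_matrix()  # eta plays the role of 1
    for n in range(1, terms):
        term = [[v / n for v in row] for row in compose(term, x)]
        result = [[result[a][b] + term[a][b] for b in range(3)]
                  for a in range(3)]
    return result

def closed_form(p):
    """The three cases of (2.8)."""
    low = [ETA[a] * p[a] for a in range(3)]
    p2 = sum(p[a] * low[a] for a in range(3))
    eta, x = eta_matrix(), generator(p)
    if abs(p2) < 1e-12:  # parabolic case
        return [[eta[a][b] + x[a][b] + 0.5 * low[a] * low[b]
                 for b in range(3)] for a in range(3)]
    m = math.sqrt(abs(p2))
    hat = [[low[a] * low[b] / m ** 2 for b in range(3)] for a in range(3)]
    if p2 > 0:  # elliptic case
        return [[hat[a][b] + math.cos(m) * (eta[a][b] - hat[a][b])
                 + math.sin(m) * x[a][b] / m for b in range(3)]
                for a in range(3)]
    return [[-hat[a][b] + math.cosh(m) * (eta[a][b] + hat[a][b])
             + math.sinh(m) * x[a][b] / m for b in range(3)]
            for a in range(3)]  # hyperbolic case

def max_diff(x, y):
    return max(abs(x[a][b] - y[a][b]) for a in range(3) for b in range(3))

samples = {
    "elliptic": (1.5, 0.4, 0.2),
    "parabolic": (1.0, 1.0, 0.0),
    "hyperbolic": (0.3, 1.2, -0.5),
}
errors = {name: max_diff(exp_series(p), closed_form(p))
          for name, p in samples.items()}
```

The dictionary `errors` holds the largest entrywise deviation between series and closed form for one sample vector of each type.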
Elements $u = e^{-p^a J_a} \in L_3^\uparrow$ are called elliptic, parabolic and hyperbolic, respectively, if $p^2 > 0$, $p^2 = 0$ and $p^2 < 0$. Note that the exponential map is not injective, since $e^{2\pi n J_0} = 1$ for $n \in \mathbb{Z}$. However, in this paper we will be concerned with
Grafting and Poisson Structure in (2+1)-Gravity
hyperbolic elements, for which the parametrisation (2.8) in terms of a spacelike vector $p = (p^0, p^1, p^2)$ with $p^0 \geq 0$ is unique.

A convenient way of describing the Lie algebra $iso(2,1)$ and the group $P_3^\uparrow$ has been introduced in [8]. It relies on a formal parameter $\theta$ with $\theta^2 = 0$ analogous to the one occurring in supersymmetry. With the definition
\[ (P_a)_{bc} = \theta\, (J_a)_{bc} = -\theta\, \epsilon_{abc}, \tag{2.9} \]
it follows that the commutator of the matrices $P_a$, $J_a$, $a = 0, 1, 2$, is the Lie bracket (2.4) of the (2+1)-dimensional Poincaré algebra. Definition (2.9) also allows one to parametrise elements of the group $P_3^\uparrow$. Identifying
\[ (u, a) \cong \left( 1 + \theta\, a^b J_b \right) u, \tag{2.10} \]
one obtains the multiplication law
\[ \begin{aligned} \left( 1 + \theta a_1^b J_b \right) u_1 \cdot \left( 1 + \theta a_2^c J_c \right) u_2 &= u_1 u_2 + \theta\, a_1^b J_b\, u_1 u_2 + \theta\, u_1 a_2^b J_b\, u_2 + \theta^2\, a_1^b J_b\, u_1\, a_2^c J_c\, u_2 \\ &= \left( 1 + \theta \left( a_1^b J_b + u_1\, a_2^b J_b\, u_1^{-1} \right) \right) u_1 u_2, \end{aligned} \tag{2.11} \]
and with the identification (2.1) of $so(2,1)$ and $\mathbb{R}^3$ one recovers the group multiplication law (2.3). Furthermore, the introduction of the parameter $\theta$ makes it possible to express the exponential map $\exp : iso(2,1) \to P_3^\uparrow$ in terms of the exponential map $\exp : so(2,1) \to L_3^\uparrow$ for the (2+1)-dimensional Lorentz group by setting
\[ e^{p^a J_a + k^a P_a} = e^{(p^a + \theta k^a) J_a} = \sum_{n=0}^{\infty} \frac{(p^a J_a)^n}{n!} + \theta \sum_{n=0}^{\infty} \sum_{m=0}^{n-1} \frac{(p^a J_a)^m\, k^b J_b\, (p^c J_c)^{n-m-1}}{n!}. \tag{2.12} \]
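The identification (2.10) and the resulting product (2.11) can be illustrated with a small dual-number computation. This is a sketch under stated assumptions: a Poincaré element is stored as a pair (real part, $\theta$-part) of 3×3 matrices with mixed indices, and `hat` (an illustrative name) builds the matrix of $a^b J_b$ from (2.6) with its first index raised.

```python
import math

ETA = (1.0, -1.0, -1.0)  # diagonal of the Minkowski metric

def eps(a, b, c):
    """Totally antisymmetric symbol with eps(0, 1, 2) = 1."""
    return (a - b) * (b - c) * (c - a) / 2

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matadd(a, b):
    return [[a[i][j] + b[i][j] for j in range(3)] for i in range(3)]

def matvec(a, x):
    return [sum(a[i][k] * x[k] for k in range(3)) for i in range(3)]

def boost(chi):
    c, s = math.cosh(chi), math.sinh(chi)
    return [[c, s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rotation(phi):
    c, s = math.cos(phi), math.sin(phi)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def lorentz_inverse(u):
    """u^{-1} = eta u^T eta for a Lorentz matrix u."""
    return [[ETA[i] * u[j][i] * ETA[j] for j in range(3)] for i in range(3)]

def hat(a):
    """Mixed-index matrix of a^b J_b, with (J_a)_{bc} = -eps_{abc}."""
    return [[ETA[b] * sum(eps(b, d, c) * a[d] for d in range(3))
             for c in range(3)] for b in range(3)]

def unhat(m):
    """Inverse of hat: read the vector a off the matrix a^b J_b."""
    return [m[1][2], m[2][0], -m[0][1]]

def embed(u, a):
    """(u, a) -> (1 + theta a^b J_b) u, stored as (real part, theta part)."""
    return u, matmul(hat(a), u)

def dual_mul(x, y):
    """(x0 + theta x1)(y0 + theta y1) with theta^2 = 0."""
    (x0, x1), (y0, y1) = x, y
    return matmul(x0, y0), matadd(matmul(x0, y1), matmul(x1, y0))

def extract(d):
    """Invert embed: (u, theta part) -> (u, a)."""
    u, t = d
    return u, unhat(matmul(t, lorentz_inverse(u)))

# Arbitrary sample data.
u1, a1 = matmul(boost(0.7), rotation(1.1)), [0.3, -1.0, 2.0]
u2, a2 = matmul(rotation(-0.4), boost(0.2)), [1.5, 0.5, -0.25]

u12, a12 = extract(dual_mul(embed(u1, a1), embed(u2, a2)))
shifted = matvec(u1, a2)
expected_a = [a1[i] + shifted[i] for i in range(3)]  # a1 + Ad(u1) a2
```

The translation `a12` read off from the dual-number product should match `expected_a`, which is the translational component in (2.3).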
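To link the parametrisation of elements of $P_3^\uparrow$ in terms of the exponential map with the parametrisation (2.2),
\[ (u, -\mathrm{Ad}(u) j) = e^{-(p^a + \theta k^a) J_a}, \qquad u \in L_3^\uparrow,\ j \in \mathbb{R}^3,\ (p^a + \theta k^a) J_a \in iso(2,1), \tag{2.13} \]
one uses the identity
\[ \left[ (p^a J_a)^n, k^b J_b \right] = \sum_{m=1}^{n} \binom{n}{m}\, \mathrm{ad}^m_{p^a J_a}(k^b J_b) \cdot (p^c J_c)^{n-m} \tag{2.14} \]
in (2.12) and finds that the elements $u \in L_3^\uparrow$, $j \in \mathbb{R}^3$ are given by
\[ u = e^{-p^a J_a}, \qquad j = T(p) k \tag{2.15} \]
with $T(p) : \mathbb{R}^3 \to \mathbb{R}^3$,
\[ T(p)^{ab} k_b J_a = \sum_{n=0}^{\infty} \frac{\mathrm{ad}^n_{p^b J_b}(k^a J_a)}{(n+1)!} = k^a J_a + \tfrac{1}{2} \left[ p^b J_b, k^a J_a \right] + \tfrac{1}{6} \left[ p^c J_c, \left[ p^b J_b, k^a J_a \right] \right] + \cdots. \tag{2.16} \]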
Note that the linear map $T(p)$ is the same as the one considered in [14, 15], where its properties are discussed in more detail. In particular, it is shown that $T(p)$ is bijective, maps $p$ to itself and satisfies $\mathrm{Ad}(e^{-p^a J_a})\, T(p) = T(-p)$. Its inverse $T^{-1}(p) : \mathbb{R}^3 \to \mathbb{R}^3$ plays an important role in the parametrisation of the right- and left-invariant vector fields $J_a^L$, $J_a^R$ on $L_3^\uparrow$. For any $F \in C^\infty(L_3^\uparrow)$, we have
\[ \begin{aligned} J_a^L F\left( e^{-p^b J_b} \right) &= \frac{d}{dt}\Big|_{t=0} F\left( e^{-t J_a} e^{-p^b J_b} \right) = T^{-1}(p)_a{}^b\, \frac{\partial F}{\partial p^b}, \\ J_a^R F\left( e^{-p^b J_b} \right) &= \frac{d}{dt}\Big|_{t=0} F\left( e^{-p^b J_b} e^{t J_a} \right) = -\mathrm{Ad}\left( e^{p^b J_b} \right)_a{}^c\, T^{-1}(p)_c{}^b\, \frac{\partial F}{\partial p^b} = -T^{-1}(-p)_a{}^b\, \frac{\partial F}{\partial p^b}. \end{aligned} \tag{2.17} \]

2.2. Hyperbolic geometry. In this subsection we summarise some facts from hyperbolic geometry, mostly following the presentation in [19]. For a more specialised treatment focusing on Fuchsian groups see also [20]. In the following we denote by $\mathbb{H}_T \subset \mathbb{R}^3$ the hyperboloids of curvature $-\frac{1}{T}$ with the metric induced by the (2+1)-dimensional Minkowski metric,
\[ \mathbb{H}_T = \left\{ x \in \mathbb{R}^3 \mid x^2 = T^2,\ x^0 > 0 \right\}, \tag{2.18} \]
and realise hyperbolic space $\mathbb{H}^2$ as the hyperboloid $\mathbb{H}^2 = \mathbb{H}_1$. The tangent plane in a point $p \in \mathbb{H}_T$ is given by
\[ T_p \mathbb{H}_T = p^\perp = \left\{ x \in \mathbb{R}^3 \mid x \cdot p = 0 \right\}, \tag{2.19} \]
and geodesics on $\mathbb{H}_T$ are of the form
\[ c_{p,q}(t) = p \cosh t + q \sinh t \qquad \text{with } p^2 = T^2,\ q^2 = -T^2,\ p \cdot q = 0. \tag{2.20} \]
They are given as the intersection of $\mathbb{H}_T$ with planes through the origin, which can be characterised in terms of their unit (Minkowski) normal vectors
\[ c_{p,q} = \mathbb{H}_T \cap n_{p,q}^\perp \qquad \text{with} \qquad n_{p,q} = \frac{1}{T^2}\, p \times q \in T_{c_{p,q}(t)} \mathbb{H}_T \quad \forall t \in \mathbb{R}. \tag{2.21} \]
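The statements (2.20) and (2.21) are easy to verify numerically for a sample geodesic. The sketch below assumes the Minkowski cross product $(p \times q)^a = \epsilon^{abc} p_b q_c$ with $\epsilon^{012} = 1$; this convention, the function names and the sample values are assumptions made for illustration.

```python
import math

ETA = (1.0, -1.0, -1.0)  # diagonal of the Minkowski metric

def dot(x, y):
    """Minkowski product x . y = eta(x, y)."""
    return sum(ETA[a] * x[a] * y[a] for a in range(3))

def eps(a, b, c):
    """Totally antisymmetric symbol with eps(0, 1, 2) = 1."""
    return (a - b) * (b - c) * (c - a) / 2

def cross(p, q):
    """(p x q)^a = eps^{abc} p_b q_c, indices lowered with eta (assumed)."""
    pl = [ETA[a] * p[a] for a in range(3)]
    ql = [ETA[a] * q[a] for a in range(3)]
    return [sum(eps(a, b, c) * pl[b] * ql[c]
                for b in range(3) for c in range(3)) for a in range(3)]

def geodesic(p, q, t):
    """c_{p,q}(t) = p cosh t + q sinh t, cf. (2.20)."""
    return [math.cosh(t) * p[a] + math.sinh(t) * q[a] for a in range(3)]

# Sample data with p^2 = T^2, q^2 = -T^2, p . q = 0.
T, chi = 2.0, 0.6
p = [T * math.cosh(chi), T * math.sinh(chi), 0.0]
q = [0.0, 0.0, T]
n = [v / T ** 2 for v in cross(p, q)]  # unit normal n_{p,q} of (2.21)

points = [geodesic(p, q, t) for t in (-1.5, 0.0, 0.8)]
on_hyperboloid = [dot(c, c) for c in points]  # each should equal T^2
tangent = [dot(n, c) for c in points]         # each should vanish
```

The values in `on_hyperboloid` should equal $T^2$, the values in `tangent` should vanish, and `dot(n, n)` should be $-1$, so that $n_{p,q}$ is a unit spacelike vector tangent to $\mathbb{H}_T$ along the geodesic.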
The isometry group of the hyperboloids $\mathbb{H}_T$ is the (2+1)-dimensional proper orthochronous Lorentz group $L_3^\uparrow$. The subgroup stabilising a given geodesic maps the associated plane to itself and is generated by the plane's normal vector. More precisely, for a geodesic $c_{p,q}$ parametrised as in (2.20) and with associated normal vector $n_{p,q}$ as in (2.21), one has
\[ \mathrm{Ad}\left( e^{\alpha\, n_{p,q}^a J_a} \right) c_{p,q}(t) = \cosh(t + \alpha)\, p + \sinh(t + \alpha)\, q, \qquad \alpha \in \mathbb{R}. \tag{2.22} \]
The uniformization theorem implies that any compact, oriented two-manifold of genus $g > 1$ with a metric of constant negative curvature is given as a quotient $S_\Gamma = \mathbb{H}_T / \Gamma$ of a hyperboloid $\mathbb{H}_T$ by the action of a cocompact Fuchsian group with $2g$ hyperbolic generators
\[ \Gamma = \left\langle v_{A_1}, v_{B_1}, \ldots, v_{A_g}, v_{B_g} ;\ [v_{B_g}, v_{A_g}^{-1}] \cdots [v_{B_1}, v_{A_1}^{-1}] = 1 \right\rangle \subset L_3^\uparrow. \tag{2.23} \]
The group $\Gamma$ is isomorphic to the fundamental group $\pi_1(S_\Gamma)$, and its action on the hyperboloid $\mathbb{H}_T$ agrees with the action of $\pi_1(S_\Gamma)$ via deck transformations. Via its action on $\mathbb{H}_T$, it induces a tesselation of $\mathbb{H}_T$ by its fundamental regions, which are geodesic arc $4g$-gons. In particular, there exists a geodesic arc $4g$-gon $P_\Gamma^T$ in the tesselation of $\mathbb{H}_T$, in the following referred to as a fundamental polygon, such that each of the generators of $\Gamma$ and their inverses map the polygon to one of its $4g$ neighbours. If one labels the sides of the polygon by $a_1, a_1', \ldots, b_1, b_1', \ldots, a_g, a_g', b_g, b_g'$ as in Fig. 5, it follows that the generators of $\Gamma$ map side $x \in \{a_1, \ldots, b_g\}$ of the polygon $P_\Gamma^T$ into $x' \in \{a_1', \ldots, b_g'\}$,
\[ \mathrm{Ad}(v_{A_i}) : a_i \to a_i', \qquad \mathrm{Ad}(v_{B_i}) : b_i \to b_i'. \tag{2.24} \]
For a general polygon $P$ in the tesselation related to $P_\Gamma^T$ via $P = \mathrm{Ad}(v) P_\Gamma^T$, $v \in \Gamma$, the elements of $\Gamma$ mapping this polygon into its $4g$ neighbours are given by $v v_{A_1}^{\pm 1} v^{-1}, \ldots, v v_{B_g}^{\pm 1} v^{-1}$. Geodesics on $S_\Gamma$ are the images of geodesics on $\mathbb{H}_T$ under the projection $\Pi_T : \mathbb{H}_T \to S_\Gamma$. In particular, a geodesic $c_{p,q}$ on $\mathbb{H}_T$ gives rise to a closed geodesic on $S_\Gamma$ if and only if there exists a nontrivial element $\tilde v \in \Gamma$, the geodesic's holonomy, which maps $c_{p,q}$ to itself. From (2.22) it then follows that the group element $\tilde v \in \Gamma$ is obtained by exponentiating a multiple of the geodesic's normal vector,
\[ \exists\, \alpha \in \mathbb{R}^+ : \quad \tilde v = e^{\alpha\, n_{p,q}^a J_a}. \tag{2.25} \]
Closed geodesics on $S_\Gamma$ are therefore in one-to-one correspondence with elements of the group $\Gamma$ and hence with elements of the fundamental group $\pi_1(S_\Gamma)$. In the following we will often not distinguish notationally between an element of the fundamental group $\pi_1(S_\Gamma)$ and a closed geodesic or a general closed curve on $S_\Gamma$ representing this element.

2.3. Grafting. Grafting along simple geodesics was first investigated in the context of complex projective structures and Teichmüller theory [21–23]. Following the work of Thurston [24, 25], who considered general geodesic laminations, the topic has attracted much interest in mathematics; for historical remarks see for instance [26]. The role of geodesic laminations in (2+1)-dimensional gravity was first explored by Mess [11], who investigated the construction of (2+1)-dimensional spacetimes from a set of holonomies. More recent works on grafting in the context of (2+1)-dimensional gravity are the papers by Benedetti and Guadagnini [12] and by Benedetti and Bonsante [13]. As we investigate a rather specific situation, namely grafting along closed, simple geodesics in (2+1)-dimensional gravity with vanishing cosmological constant $\Lambda$, we limit our presentation to a summary of the grafting procedure described in [12, 13] for the case of $\Lambda = 0$ and multicurves. For a more general treatment and a discussion of the relation between this grafting procedure and grafting on the space of complex projective structures, we refer the reader to [12, 13].

Given a cocompact Fuchsian group $\Gamma$ with $2g$ generators, there is a well-known procedure for constructing a flat (2+1)-dimensional spacetime of genus $g$ associated to this group, see for example [10]. One foliates the interior of the forward lightcone with the tip at the origin by hyperboloids $\mathbb{H}_T$. The cocompact Fuchsian group $\Gamma$ acts on each hyperboloid $\mathbb{H}_T$ and induces a tesselation of $\mathbb{H}_T$ by geodesic arc $4g$-gons which are mapped into each other by the elements of $\Gamma$.
The associated spacetime of genus $g$ is then obtained by identifying on each hyperboloid the points related by the action of $\Gamma$. It
is shown in [10] that the $P_3^\uparrow$-valued holonomies of all curves in the resulting spacetime have vanishing translational components and lie in the subgroup $L_3^\uparrow \subset P_3^\uparrow$. However, these spacetimes are of limited physical interest because they are static [10]. Grafting along measured geodesic laminations is a procedure which allows one to construct non-static or genuinely evolving (2+1)-spacetimes associated to a Fuchsian group $\Gamma$. In the following we consider measured geodesic laminations which are weighted multicurves, i.e. countable or finite sets
\[ G_I = \left\{ (c_i, w_i) \mid i \in I \right\} \tag{2.26} \]
of closed, simple non-intersecting geodesics $c_i$ on the associated two-surface $S_\Gamma = \mathbb{H}^2 / \Gamma$, each equipped with a positive number, the weight $w_i > 0$. Geometrically, grafting amounts to cutting the surface $S_\Gamma$ along each geodesic $c_i$ and inserting a strip of width $w_i$ as indicated in Fig. 1. Equivalently, the construction can be described in the universal cover $\mathbb{H}^2$. By lifting each geodesic $c_i$ to a geodesic $\tilde c_i$ on $\mathbb{H}^2$ and acting on it with the Fuchsian group $\Gamma$, one obtains a $\Gamma$-invariant multicurve on $\mathbb{H}^2$,
\[ G_I = \left\{ (\mathrm{Ad}(v) \tilde c_i, w_i) \mid i \in I,\ v \in \Gamma \right\}. \tag{2.27} \]
One then cuts the hyperboloid $\mathbb{H}^2$ along each geodesic in $G_I$, shifts the resulting pieces in the direction of the geodesics' normal vectors and glues in a strip of width $w_i$ as shown in Fig. 2. The cocompact Fuchsian group $\Gamma$ acts on the resulting surface in such a way that it identifies the images of the points related by the canonical action of $\Gamma$ on $\mathbb{H}^2$, and the associated grafted genus $g$ surface is obtained by taking the quotient with respect to this action of $\Gamma$.

In the construction of flat (2+1)-spacetimes of topology $\mathbb{R} \times S_g$ via grafting, the grafting procedure is performed for each value of the time parametrising $\mathbb{R}$. As in the construction of static spacetimes, one foliates the interior of the forward lightcone by hyperboloids $\mathbb{H}_T$. By cutting and inserting strips along the lifted geodesics on each hyperboloid $\mathbb{H}_T$, one assigns to each cocompact Fuchsian group $\Gamma$ and each multicurve on $S_\Gamma = \mathbb{H}^2 / \Gamma$ a regular domain $U \subset \mathbb{R}^3$. The cocompact Fuchsian group $\Gamma$ acts on the domain $U$, and the grafted spacetime of topology $\mathbb{R} \times S_g$ is obtained by identifying the points in $U$ related by this action of $\Gamma$.

To give a mathematically precise definition, we follow the presentation in [13]. We consider a multicurve on $S_\Gamma = \mathbb{H}^2 / \Gamma$ as in (2.26) together with its lift to a $\Gamma$-invariant weighted multicurve on $\mathbb{H}^2$ as in (2.27) and parametrise its geodesics as in (2.20),
\[ G_I = \left\{ (c_{p_i,q_i}, w_i) \mid i \in I \right\}. \tag{2.28} \]
Fig. 1. Grafting along a closed simple geodesic c with weight w on a genus 2 surface
Fig. 2. Grafting along a geodesic with weight w in hyperbolic space
Furthermore, we choose a basepoint $x_0 \in \mathbb{H}^2 - \bigcup_{i \in I} c_{p_i,q_i}$ that does not lie on any of the geodesics. For each point $x \in \mathbb{H}^2 - \bigcup_{i \in I} c_{p_i,q_i}$ outside the geodesics, we choose an arc $a_x$ connecting $x_0$ and $x$, pointing towards $x$ and transverse to each of the geodesics it intersects. We then define a map $\rho : \mathbb{H}^2 - \bigcup_{i \in I} c_{p_i,q_i} \to \mathbb{R}^3$ by associating to each intersection point of $a_x$ with one of the geodesics the unit normal vector of the geodesic pointing towards $x$ and multiplied with the weight,
\[ \rho(x) = \sum_{i \in I :\ a_x \cap c_{p_i,q_i} \neq \emptyset} w_i\, \epsilon_{i,x}\, n_{p_i,q_i} \qquad \text{for } x \notin \bigcup_{i \in I} c_{p_i,q_i}, \tag{2.29} \]
where $\epsilon_{i,x} \in \{\pm 1\}$ is the oriented intersection number of $a_x$ and $c_{p_i,q_i}$ with the convention $\epsilon_{i,x} = 1$ if $c_{p_i,q_i}$ crosses $a_x$ from the left to the right in the direction of $a_x$ and ensures that $\epsilon_{i,x} w_i n_{p_i,q_i}$ points towards $x$. Similarly, for each point $x \in c_{p_j,q_j}$ that lies on one of the geodesics, we consider a geodesic ray $r_x$ starting in $x_0$ and through $x$, transversal to the geodesics at each intersection point, and set
\[ \begin{aligned} \rho_-(x) &= \sum_{i \in I - \{j\} :\ r_x \cap c_{p_i,q_i} \neq \emptyset} w_i\, \epsilon_{i,x}\, n_{p_i,q_i}, \\ \rho_+(x) &= w_j\, \epsilon_{j,x}\, n_{p_j,q_j} + \sum_{i \in I - \{j\} :\ r_x \cap c_{p_i,q_i} \neq \emptyset} w_i\, \epsilon_{i,x}\, n_{p_i,q_i}. \end{aligned} \tag{2.30} \]
On each hyperboloid $\mathbb{H}_T$, we now shift the points outside of the geodesics according to
\[ T x \mapsto T x + \rho(x), \qquad x \in \mathbb{H}^2 - \bigcup_{i \in I} c_{p_i,q_i}, \tag{2.31} \]
and replace each geodesic by a strip
\[ T x \mapsto T x + t \rho_+(x) + (1 - t) \rho_-(x), \qquad x \in \bigcup_{i \in I} c_{p_i,q_i} \subset \mathbb{H}^2,\ t \in [0, 1]. \tag{2.32} \]
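The cut, shift and glue prescription (2.29)–(2.32) can be illustrated for a single geodesic. The sketch below is a toy setup, not the general construction: it assumes one geodesic $c(t) = (\cosh t, \sinh t, 0)$ in $\mathbb{H}^2$ with unit normal `N`, an arbitrarily chosen weight `W` and base point `X0`, and it determines the sign $\epsilon_{i,x}$ by asking which side of the geodesic a point lies on instead of tracking oriented intersection numbers.

```python
import math

ETA = (1.0, -1.0, -1.0)            # diagonal of the Minkowski metric
N = (0.0, 0.0, 1.0)                # unit spacelike normal of the geodesic
W = 0.5                            # weight of the geodesic (sample value)
X0 = (math.sqrt(2.0), 0.0, 1.0)    # base point on H^2, off the geodesic
ZERO = (0.0, 0.0, 0.0)

def dot(x, y):
    return sum(ETA[a] * x[a] * y[a] for a in range(3))

def side(x):
    """Positive on the side of the geodesic that N points to, zero on it."""
    return -dot(N, x)

def rho(x):
    """(2.29) for a point off the geodesic."""
    if side(x) * side(X0) > 0:     # the arc from X0 to x does not cross
        return ZERO
    sign = 1.0 if side(x) > 0 else -1.0   # normal pointing towards x
    return tuple(W * sign * v for v in N)

def rho_minus(x):
    """(2.30): no other geodesic is crossed in this toy setup."""
    return ZERO

def rho_plus(x):
    """(2.30): the weighted normal pointing away from the base point."""
    sign = -1.0 if side(X0) > 0 else 1.0
    return tuple(W * sign * v for v in N)

def shift(T, x):
    """(2.31): image of T x for x off the geodesic."""
    r = rho(x)
    return tuple(T * x[a] + r[a] for a in range(3))

def strip(T, x, t):
    """(2.32): image of T x for x on the geodesic, t in [0, 1]."""
    rp, rm = rho_plus(x), rho_minus(x)
    return tuple(T * x[a] + t * rp[a] + (1 - t) * rm[a] for a in range(3))

def strip_width(T, x):
    d = [strip(T, x, 1.0)[a] - strip(T, x, 0.0)[a] for a in range(3)]
    return math.sqrt(-dot(d, d))   # Minkowski length of a spacelike vector

def distance(x, y):
    return max(abs(x[a] - y[a]) for a in range(3))

T, delta = 3.0, 1e-7
on_geodesic = (1.0, 0.0, 0.0)
near = (math.sqrt(1 + delta ** 2), 0.0, delta)    # same side as X0
far = (math.sqrt(1 + delta ** 2), 0.0, -delta)    # opposite side
```

Points approaching the geodesic from the base point's side land on the $t = 0$ edge of the strip, points approaching from the other side land on the $t = 1$ edge, and the strip has width `W`.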
From the definitions (2.29), (2.30) of the maps $\rho$, $\rho_\pm$ we see that for each value of $T$, this corresponds to the grafting procedure for hyperbolic space described above. The regular domain $U \subset \mathbb{R}^3$ associated to the multicurve $G_I$ is the image of the forward lightcone under this procedure [13]:
\[ U = \bigcup_{T \in \mathbb{R}^+} U_T, \qquad U_T = \underbrace{\Big\{ T x + \rho(x) \ \Big|\ x \notin \bigcup_{i \in I} c_{p_i,q_i} \Big\}}_{=:\, U_T^0} \cup \underbrace{\Big\{ T x + t \rho_+(x) + (1 - t) \rho_-(x) \ \Big|\ x \in \bigcup_{i \in I} c_{p_i,q_i},\ t \in [0, 1] \Big\}}_{=:\, U_T^{G_I}}, \tag{2.33} \]
where the two-dimensional surfaces $U_T$ are the images of the hyperboloids $\mathbb{H}_T$, given as a union of shifted pieces $U_T^0$ of hyperboloids and of strips $U_T^{G_I}$. In particular, the tip of the lightcone is mapped to the initial singularity $U_0$ of the regular domain $U$,
\[ U_0 = \Big\{ \rho(x) \ \Big|\ x \notin \bigcup_{i \in I} c_{p_i,q_i} \Big\} \cup \Big\{ t \rho_+(x) + (1 - t) \rho_-(x) \ \Big|\ x \in \bigcup_{i \in I} c_{p_i,q_i},\ t \in [0, 1] \Big\}, \tag{2.34} \]
which is a graph (more precisely, a real simplicial tree) with each vertex corresponding to the area between two geodesics or between a geodesic and infinity and edges given by $w_i \epsilon_{i,x} n_{p_i,q_i}$. It is shown in [12] that the parameter $T$ defines a cosmological time function $T_c : U \to \mathbb{R}^+$,
\[ T_c(T x + \rho(x)) = T, \qquad T_c(T x + t \rho_+(x) + (1 - t) \rho_-(x)) = T, \tag{2.35} \]
and that the surfaces $U_T$ in (2.33) are surfaces of constant geodesic distance to the initial singularity $U_0$. The genus $g$ spacetime associated to the cocompact Fuchsian group $\Gamma$ and the $\Gamma$-invariant multicurve $G_I$ is then obtained by identifying on each surface $U_T$ the images of the points on $\mathbb{H}_T$ which are related by the canonical action of $\Gamma$. This is implemented by defining another action of the group $\Gamma$ on $U$. It is shown in [13] that for $\Gamma$-invariant multicurves $G_I$ on $\mathbb{H}^2$ the map
\[ f_{G_I} : \Gamma \to P_3^\uparrow, \qquad f_{G_I}(v) = (v, \rho(\mathrm{Ad}(v) x_0)) \tag{2.36} \]
defines a group homomorphism which leaves each surface $U_T$ invariant, acts on $U$ freely and properly discontinuously and satisfies
\[ N(f_{G_I}(v)\, y) = \mathrm{Ad}(v) N(y), \tag{2.37} \]
where $N : U \to \mathbb{H}^2$ is the map that associates to each point in $U$ the corresponding point in $\mathbb{H}^2$,
\[ N(T x + \rho(x)) = x, \qquad N(T x + t \rho_+(x) + (1 - t) \rho_-(x)) = x. \tag{2.38} \]
The flat (2+1)-spacetime of genus $g$ associated to the group $\Gamma$ and the multicurve $G_I$ is defined as the quotient of $U$ by the action of $\Gamma$ via $f_{G_I}$. Using the identity (2.37), we find that this amounts to identifying points $y, y' \in \bigcup_{T \in \mathbb{R}^+} U_T^0$ according to
\[ y \sim y' \quad \Leftrightarrow \quad \exists\, v \in \Gamma : N(y) = \mathrm{Ad}(v) N(y'),\ T_c(y) = T_c(y'), \tag{2.39} \]
where $T_c : U \to \mathbb{R}^+_0$ is the cosmological time (2.35), and for points $y, y' \in \bigcup_{T \in \mathbb{R}^+} U_T^{G_I}$ parametrised as in (2.33), we have the additional condition $t = t'$. Hence, two points $y, y' \in \bigcup_{T \in \mathbb{R}^+} U_T$ are identified if and only if they lie on the same surface $U_T$ and the corresponding points on $\mathbb{H}^2$ are identified by the canonical action of $\Gamma$ on $\mathbb{H}^2$. The function $f_{G_I}$ defines the $P_3^\uparrow$-valued holonomies of the resulting spacetime. Via the identification $\Gamma \cong \pi_1(S_g)$ it assigns to each element of the fundamental group $\pi_1(S_g)$ an element of the group $P_3^\uparrow$, whose Lorentz component is the associated element of the Fuchsian group $\Gamma$. However, in contrast to the static spacetimes considered above, it is clear from (2.36) that in grafted (2+1)-spacetimes there exist elements of the fundamental group whose holonomies have a nontrivial translational component.

3. Phase Space and Poisson Structure in the Chern-Simons Formulation of (2+1)-Dimensional Gravity

3.1. The Chern-Simons formulation of (2+1)-dimensional gravity. The formulation of (2+1)-dimensional gravity as a Chern-Simons gauge theory is derived from Cartan's description, in which Einstein's theory of gravity is formulated in terms of a dreibein of one-forms $e_a$, $a = 0, 1, 2$, and spin connection one-forms $\omega^a$, $a = 0, 1, 2$, on a spacetime manifold $M$. The dreibein defines a Lorentz metric $g$ on $M$ via
\[ g = \eta^{ab}\, e_a \otimes e_b, \tag{3.1} \]
and the one-forms $\omega^a$ are the coefficients of the spin connection
\[ \omega = \omega^a J_a. \tag{3.2} \]
To formulate the theory as a Chern-Simons gauge theory, one combines dreibein and spin connection into the Cartan connection [27] or Chern-Simons gauge field
\[ A = \omega^a J_a + e_a P^a, \tag{3.3} \]
an $iso(2,1)$-valued one-form whose curvature
\[ F = T_a P^a + F_\omega^a J_a \tag{3.4} \]
combines the curvature and the torsion of the spin connection
\[ F_\omega^a = d\omega^a + \tfrac{1}{2}\, \epsilon^a{}_{bc}\, \omega^b \wedge \omega^c, \qquad T_a = de_a + \epsilon_{abc}\, \omega^b e^c. \tag{3.5} \]
This allows one to express Einstein's equations of motion, the requirements of flatness and vanishing torsion, as a flatness condition on the Chern-Simons gauge field
\[ F = 0. \tag{3.6} \]
Note, however, that in order to define a metric $g$ of signature $(1, -1, -1)$ via (3.1), the dreibein $e$ has to be non-degenerate, while no such condition is imposed in the corresponding Chern-Simons gauge theory. It is argued in [28] for the case of spacetimes containing particles that this leads to differences in the global structure of the phase spaces of the two theories. A further subtlety concerning the phase space in Einstein's formulation and the Chern-Simons formulation of (2+1)-dimensional gravity arises from the presence of large gauge transformations. It has been shown by Witten [2] that infinitesimal gauge transformations are on-shell equivalent to infinitesimal diffeomorphisms in Einstein's formulation of the theory. This equivalence does not hold for large gauge transformations which are not infinitesimally generated and arise in Chern-Simons theory with non-simply connected gauge groups such as the group $P_3^\uparrow$. Nevertheless, configurations related by such large gauge transformations are identified in the Chern-Simons formulation of (2+1)-dimensional gravity, potentially causing further differences in the global structure of the two phase spaces. However, as we are mainly concerned with the local properties of the phase space, we will not address these issues any further in this paper.

In the following we consider spacetimes of topology $M \approx \mathbb{R} \times S_g$, where $S_g$ is an orientable two-surface of genus $g > 1$. On such spacetimes, it is possible to give a Hamiltonian formulation of the theory. One introduces coordinates $x^0, x^1, x^2$ on $\mathbb{R} \times S_g$ such that $x^0$ parametrises $\mathbb{R}$ and splits the gauge field according to
\[ A = A_0\, dx^0 + A_S, \tag{3.7} \]
where $A_S$ is a gauge field on $S_g$ and $A_0 : \mathbb{R} \times S_g \to P_3^\uparrow$ a function with values in the (2+1)-dimensional Poincaré group. The Chern-Simons action on $M$ then takes the form
\[ S[A_S, A_0] = \int_{\mathbb{R}} dx^0 \int_{S_g} \tfrac{1}{2} \langle \partial_0 A_S \wedge A_S \rangle + \langle A_0, F_S \rangle, \]
where $\langle\,,\rangle$ denotes the bilinear form (2.5) on $iso(2,1)$, $F_S$ is the curvature of the spatial gauge field $A_S$,
\[ F_S = d_S A_S + A_S \wedge A_S, \tag{3.8} \]
and $d_S$ denotes differentiation on the surface $S_g$. The function $A_0$ plays the role of a Lagrange multiplier, and varying it leads to the flatness constraint
\[ F_S = 0, \tag{3.9} \]
while variation of $A_S$ yields the evolution equation
\[ \partial_0 A_S = d_S A_0 + [A_S, A_0]. \tag{3.10} \]
The action (3.8) is invariant under gauge transformations
\[ A_0 \to \gamma A_0 \gamma^{-1} + \gamma\, \partial_0 \gamma^{-1}, \qquad A_S \to \gamma A_S \gamma^{-1} + \gamma\, d_S \gamma^{-1} \qquad \text{with } \gamma : \mathbb{R} \times S_g \to P_3^\uparrow, \tag{3.11} \]
and the phase space of the theory is the moduli space $\mathcal{M}_g$ of flat $P_3^\uparrow$-connections $A_S$ modulo gauge transformations on the spatial surface $S_g$.
3.2. Holonomies in the Chern-Simons formalism. Although the moduli space $\mathcal{M}_g$ of flat $H$-connections is defined as a quotient of the infinite dimensional space of flat $H$-connections on $S_g$, it is of finite dimension $\dim \mathcal{M}_g = 2 \dim H\, (g - 1)$. In the Chern-Simons formulation of (2+1)-dimensional gravity, we have $\dim \mathcal{M}_g = 12(g - 1)$, and the finite dimensionality of $\mathcal{M}_g$ reflects the fact that the theory has no local gravitational degrees of freedom. From the geometrical viewpoint this fact can be summarised in the statement that every flat (2+1)-spacetime is locally Minkowski space. The corresponding statement in the Chern-Simons formalism is that, due to its flatness, a gauge field solving the equations of motion can be trivialised or written as pure gauge on any simply connected region $R \subset \mathbb{R} \times S_g$,
\[ A = \gamma\, d\gamma^{-1} = \left( v^{-1} dv,\ \mathrm{Ad}(v^{-1})\, dx \right) \qquad \text{with } \gamma^{-1} = (v, x) : R \to P_3^\uparrow. \tag{3.12} \]
The dreibein on $R$ is then given by $e^a = \mathrm{Ad}(v^{-1})^{ab}\, dx_b$, and from (3.1) it follows that the restriction of the metric $g$ to $R$ takes the form
\[ g_{ab}\, dx^a dx^b = (dx^0)^2 - (dx^1)^2 - (dx^2)^2. \tag{3.13} \]
Hence, the translational part of the trivialising function $\gamma^{-1}$ defines an embedding of the region $R$ into Minkowski space, and the function $x(x^0, \cdot)$ gives the embedding of the surfaces of constant time parameter $x^0$. A maximal simply connected region is obtained by cutting the spatial surface $S_g$ along a set of generators of the fundamental group $\pi_1(S_g)$ as in Fig. 3, which is the approach pursued by Alekseev and Malkin [29]. The fundamental group of a genus $g$ surface $S_g$ is generated by two loops $a_i$, $b_i$, $i = 1, \ldots, g$, around each handle, subject to a single defining relation
Fig. 3. Cutting the surface Sg along the generators of π1 (Sg )
\[ \pi_1(S_g) = \left\langle a_1, b_1, \ldots, a_g, b_g ;\ [b_g, a_g^{-1}] \cdots [b_1, a_1^{-1}] = 1 \right\rangle, \tag{3.14} \]
where $[b_i, a_i^{-1}] = b_i \circ a_i^{-1} \circ b_i^{-1} \circ a_i$. In the following we will work with a fixed set of generators and with a fixed basepoint $p_0$ as shown in Fig. 4. Cutting the surface along each generator of the fundamental group results in a $4g$-gon $P_g$ as pictured in Fig. 5. In order to define a gauge field $A_S$ on $S_g$, the function $\gamma^{-1} : P_g \to P_3^\uparrow$ must satisfy an overlap condition relating its values on the two sides corresponding to each generator of the fundamental group. For any $y \in \{a_1, b_1, \ldots, a_g, b_g\}$, one must have [29]
\[ A_S|_{y'} = \gamma\, d_S \gamma^{-1}|_{y'} = \gamma\, d_S \gamma^{-1}|_{y} = A_S|_{y}, \tag{3.15} \]
which is equivalent to the existence of a constant Poincaré element $N_Y = (v_Y, x_Y)$ such that $\gamma^{-1}|_{y'} = N_Y\, \gamma^{-1}|_{y}$ or, equivalently,
\[ v|_{y'} = v_Y\, v|_{y}, \qquad x|_{y'} = \mathrm{Ad}(v_Y)\, x|_{y} + x_Y. \tag{3.16} \]
Note that the information about the physical state is encoded entirely in the Poincaré elements $N_X$, $X \in \{A_1, \ldots, B_g\}$, since transformations of the form $\gamma \to \tilde\gamma \gamma$ with $\tilde\gamma : S_g \to P_3^\uparrow$ are gauge. Conversely, to determine the Poincaré elements $N_X$ for a given gauge field, it is not necessary to know the trivialising function $\gamma$ but only the embedding of the sides of the polygon $P_g$, which defines them uniquely via (3.16). We will now relate these Poincaré elements to the holonomies of our set of generators of the fundamental group $\pi_1(S_g)$. In the Chern-Simons formalism, the holonomy of a curve $c : [0, 1] \to S_g$ is given by
\[ H_c = \gamma(c(1))\, \gamma^{-1}(c(0)), \tag{3.17} \]

Fig. 4. Generators and dual generators of the fundamental group $\pi_1(S_g)$
Fig. 5. The polygon Pg
where $\gamma$ is the trivialising function for the spatial gauge field $A_S$ on a simply connected region in $S_g$ containing $c$. By taking the polygon $P_g$ as our simply connected region and labelling its sides as in Fig. 5, we find that the holonomies $A_i$, $B_i$ associated to the curves $a_i$, $b_i$, $i = 1, \ldots, g$, are given by [29]
\[ \begin{aligned} A_i &= \gamma(p_{4i-3})\, \gamma(p_{4i-4})^{-1} = \gamma(p_{4i-2})\, \gamma(p_{4i-1})^{-1}, \\ B_i &= \gamma(p_{4i-3})\, \gamma(p_{4i-2})^{-1} = \gamma(p_{4i})\, \gamma(p_{4i-1})^{-1}. \end{aligned} \tag{3.18} \]
From the defining relation of the fundamental group, it follows that they satisfy the relation
\[ (u_\infty, -\mathrm{Ad}(u_\infty) j_\infty) := [B_g, A_g^{-1}] \cdots [B_1, A_1^{-1}] \approx (1, 0). \tag{3.19} \]
Using the overlap condition (3.15), we can express the value of the trivialising function $\gamma$ at the corners of the polygon $P_g$ in terms of its value at $p_0$ and find
\[ \begin{aligned} \gamma^{-1}(p_{4i}) &= N_{H_i} N_{H_{i-1}} \cdots N_{H_1}\, \gamma^{-1}(p_0) = \gamma^{-1}(p_0)\, H_1^{-1} \cdots H_{i-1}^{-1} H_i^{-1}, \\ \gamma^{-1}(p_{4i+1}) &= N_{A_{i+1}}^{-1} N_{B_{i+1}}^{-1} N_{A_{i+1}} N_{H_i} \cdots N_{H_1}\, \gamma^{-1}(p_0) = \gamma^{-1}(p_0)\, H_1^{-1} \cdots H_{i-1}^{-1} H_i^{-1} A_{i+1}^{-1}, \\ \gamma^{-1}(p_{4i+2}) &= N_{B_{i+1}}^{-1} N_{A_{i+1}} N_{H_i} \cdots N_{H_1}\, \gamma^{-1}(p_0) = \gamma^{-1}(p_0)\, H_1^{-1} \cdots H_{i-1}^{-1} H_i^{-1} A_{i+1}^{-1} B_{i+1}, \\ \gamma^{-1}(p_{4i+3}) &= N_{A_{i+1}} N_{H_i} \cdots N_{H_1}\, \gamma^{-1}(p_0) = \gamma^{-1}(p_0)\, H_1^{-1} \cdots H_{i-1}^{-1} H_i^{-1} A_{i+1}^{-1} B_{i+1} A_{i+1}, \end{aligned} \tag{3.20} \]
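where
\[ H_i = \left( u_{H_i}, -\mathrm{Ad}(u_{H_i}) j_{H_i} \right) = [B_i, A_i^{-1}], \qquad N_{H_i} = \left( v_{H_i}, x_{H_i} \right) = [N_{B_i}, N_{A_i}^{-1}]. \tag{3.21} \]
Equation (3.20) allows us to express the holonomies $A_i$, $B_i$ in terms of $N_{A_i}$, $N_{B_i}$,
\[ \begin{aligned} A_i &= \gamma(p_0)\, N_{H_1}^{-1} \cdots N_{H_{i-1}}^{-1} N_{H_i}^{-1} \cdot N_{B_i} \cdot N_{H_{i-1}} N_{H_{i-2}} \cdots N_{H_1}\, \gamma^{-1}(p_0), \\ B_i &= \gamma(p_0)\, N_{H_1}^{-1} \cdots N_{H_{i-1}}^{-1} N_{H_i}^{-1} \cdot N_{A_i} \cdot N_{H_{i-1}} N_{H_{i-2}} \cdots N_{H_1}\, \gamma^{-1}(p_0), \end{aligned} \tag{3.22} \]
and by inverting these expressions we obtain
\[ \begin{aligned} N_{A_i} &= \gamma^{-1}(p_0)\, H_1^{-1} \cdots H_{i-1}^{-1} H_i^{-1} \cdot B_i \cdot H_{i-1} \cdots H_1\, \gamma(p_0), \\ N_{B_i} &= \gamma^{-1}(p_0)\, H_1^{-1} \cdots H_{i-1}^{-1} H_i^{-1} \cdot A_i \cdot H_{i-1} \cdots H_1\, \gamma(p_0). \end{aligned} \tag{3.23} \]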
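The step from (3.20) to (3.22) rests on a few word identities that follow from the commutator convention $[b, a^{-1}] = b\, a^{-1} b^{-1} a$. They can be checked mechanically with free-group word reduction. The sketch below is illustrative: words are lists of (letter, exponent) pairs, and the letter names `"NA"`, `"NB"` stand for $N_{A_i}$, $N_{B_i}$ for one fixed $i$.

```python
def reduce_word(word):
    """Freely reduce a word given as a list of (letter, +1 or -1) pairs."""
    out = []
    for letter, exp in word:
        if out and out[-1] == (letter, -exp):
            out.pop()            # cancel x x^{-1} or x^{-1} x
        else:
            out.append((letter, exp))
    return out

def inv(word):
    """Inverse word: reverse the order and flip every exponent."""
    return [(letter, -exp) for letter, exp in reversed(word)]

def mul(*words):
    """Concatenate words and reduce the result."""
    return reduce_word([pair for w in words for pair in w])

def commutator(x, y):
    """[x, y] = x y x^{-1} y^{-1}, so [b, a^{-1}] = b a^{-1} b^{-1} a."""
    return mul(x, y, inv(x), inv(y))

NA, NB = [("NA", 1)], [("NB", 1)]
NH = commutator(NB, inv(NA))     # N_H = [N_B, N_A^{-1}], cf. (3.21)

# Corner words of (3.20) for one handle, relative to the preceding corner.
corner1 = mul(inv(NA), inv(NB), NA)   # gamma^{-1}(p_{4i+1})
corner2 = mul(inv(NB), NA)            # gamma^{-1}(p_{4i+2})
corner3 = NA                          # gamma^{-1}(p_{4i+3})
corner4 = NH                          # gamma^{-1}(p_{4i+4})

# Words entering (3.22): holonomy = gamma(end corner) gamma^{-1}(start corner).
a_word = mul(inv(corner1))            # A = gamma(p_{4i+1}) gamma^{-1}(p_{4i})
b_word = mul(inv(corner1), corner2)   # B = gamma(p_{4i+1}) gamma^{-1}(p_{4i+2})
```

For this handle, `a_word` should reduce to the same word as $N_H^{-1} N_B$ and `b_word` to the same word as $N_H^{-1} N_A$, which is the structure of (3.22).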
Note that expression (3.23) agrees exactly with (3.22) if we exchange $A_i \leftrightarrow N_{A_i}$, $B_i \leftrightarrow N_{B_i}$ and $\gamma^{-1}(p_0) \leftrightarrow \gamma(p_0)$. In particular, up to simultaneous conjugation with $\gamma^{-1}(p_0)$, the Poincaré elements $N_{A_i}$, $N_{B_i}$ are the holonomies along another set of generators of $\pi_1(S_g)$ pictured in Fig. 4 and given in terms of the generators $a_i$, $b_i$ by
\[ n_{a_i} = h_1^{-1} \circ \ldots \circ h_i^{-1} \circ b_i \circ h_{i-1} \circ \ldots \circ h_1, \qquad n_{b_i} = h_1^{-1} \circ \ldots \circ h_i^{-1} \circ a_i \circ h_{i-1} \circ \ldots \circ h_1, \tag{3.24} \]
where $h_i := [b_i, a_i^{-1}]$. From Fig. 4, we see that this set of generators is dual to the set of generators $a_1, b_1, \ldots, a_g, b_g$ in the sense that $n_{a_i}$ and $n_{b_i}$, respectively, intersect only $a_i$ and $b_i$, in a single point.
3.3. Phase space and Poisson structure. The description of Chern-Simons theory with gauge group $H$ on manifolds $\mathbb{R} \times S_g$ in terms of the holonomies along a set of generators of the fundamental group $\pi_1(S_g)$ provides an efficient parametrisation of its phase space $\mathcal{M}_g$. While the formulation in terms of Chern-Simons gauge fields exhibits an infinite number of redundant or gauge degrees of freedom, the characterisation in terms of the holonomies allows one to describe the moduli space $\mathcal{M}_g$ as a quotient of a finite dimensional space. It is given as
\[ \mathcal{M}_g = \left\{ (A_1, B_1, \ldots, A_g, B_g) \in H^{2g} \ \middle|\ [B_g, A_g^{-1}] \cdots [B_1, A_1^{-1}] = 1 \right\} / H, \tag{3.25} \]
where the quotient stands for simultaneous conjugation of all group elements $A_i, B_i \in H$ by elements of the gauge group $H$. Hence, the physical observables of the theory are functions on $H^{2g}$ that are invariant under simultaneous conjugation with $H$ or conjugation invariant functions of the holonomies associated to elements of $\pi_1(S_g)$. In the case of the gauge group $P_3^\uparrow$, these observables were first investigated in [3, 9]; for the case of a disc with punctures representing massive, spinning particles, see also the work of Martin [8], who identifies a complete set of generating observables and determines their
Poisson brackets. In our notation the two basic observables associated to a general curve $\eta \in \pi_1(S_g)$ with holonomy $H_\eta = \left( u_\eta, -\mathrm{Ad}(u_\eta) j_\eta \right)$, $u_\eta = e^{-p_\eta^a J_a}$, are given by
\[ m_\eta^2 := -p_\eta^2, \qquad m_\eta s_\eta := p_\eta \cdot j_\eta, \tag{3.26} \]
and it follows directly from the group multiplication law (2.3) that they are invariant under conjugation of the holonomies. Furthermore, for a loop $\eta$ around a puncture representing a massive, spinning particle, $m_\eta$ and $s_\eta$ have the physical interpretation of, respectively, mass and spin of the particle. In the following we will therefore refer to these observables as mass and spin of the curve $\eta$. Although it is possible to determine the canonical Poisson brackets of these observables [3, 9], the resulting expressions are nonlinear and rather complicated.

The main advantage of the description of the phase space $\mathcal{M}_g$ as the quotient (3.25) is that it results in a much simpler description of the Poisson structure on $\mathcal{M}_g$. Although the canonical Poisson structure on the space of Chern-Simons gauge fields does not induce a Poisson structure on the space of holonomies, it is possible to describe the symplectic structure on $\mathcal{M}_g$ in terms of an auxiliary Poisson structure on the manifold $H^{2g}$. The construction is due to Fock and Rosly [30] and was developed further by Alekseev, Grosse and Schomerus [31, 32] for the case of Chern-Simons theory with compact, semisimple gauge groups. A formulation from the symplectic viewpoint has been derived independently in [29]. In [14], this description is adapted to the universal cover of the (2+1)-dimensional Poincaré group and in [15] to gauge groups of the form $G \ltimes \mathfrak{g}^*$, where $G$ is a finite dimensional, connected, simply connected, unimodular Lie group, $\mathfrak{g}^*$ the dual of its Lie algebra and $G$ acts on $\mathfrak{g}^*$ in the coadjoint representation. It is shown in [14, 15] that in this case, the Poisson structure can be formulated in terms of a symplectic potential. Although the gauge group $P_3^\uparrow = L_3^\uparrow \ltimes so(2,1)^*$ is not simply connected, the results of [14, 15] can nevertheless be applied to this case$^1$ and are summarised in the following theorem.
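The conjugation invariance of the mass and spin observables (3.26) can be checked numerically for a sample holonomy. The sketch below assumes mixed-index 3×3 matrices, a truncated power series for $e^{-p^a J_a}$, and arbitrary sample values; it also uses that conjugating $e^{-p^a J_a}$ with a Lorentz transformation $g$ gives the exponential of $\mathrm{Ad}(g) p$, which the first test confirms for the sample. All names are illustrative.

```python
ETA = (1.0, -1.0, -1.0)  # diagonal of the Minkowski metric

def dot(x, y):
    return sum(ETA[a] * x[a] * y[a] for a in range(3))

def eps(a, b, c):
    """Totally antisymmetric symbol with eps(0, 1, 2) = 1."""
    return (a - b) * (b - c) * (c - a) / 2

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(a, x):
    return [sum(a[i][k] * x[k] for k in range(3)) for i in range(3)]

def hat(p):
    """Mixed-index matrix of p^a J_a, with (J_a)_{bc} = -eps_{abc}."""
    return [[ETA[b] * sum(eps(b, d, c) * p[d] for d in range(3))
             for c in range(3)] for b in range(3)]

def exp_minus(p, terms=40):
    """Truncated power series for exp(-p^a J_a)."""
    x = [[-v for v in row] for row in hat(p)]
    result = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    term = [row[:] for row in result]
    for n in range(1, terms):
        term = [[v / n for v in row] for row in matmul(term, x)]
        result = [[result[i][j] + term[i][j] for j in range(3)]
                  for i in range(3)]
    return result

def lorentz_inverse(u):
    return [[ETA[i] * u[j][i] * ETA[j] for j in range(3)] for i in range(3)]

def multiply(g1, g2):
    """Group law (2.3): (u1, a1)(u2, a2) = (u1 u2, a1 + Ad(u1) a2)."""
    (u1, a1), (u2, a2) = g1, g2
    s = matvec(u1, a2)
    return matmul(u1, u2), [a1[i] + s[i] for i in range(3)]

def inverse(g):
    u, a = g
    ui = lorentz_inverse(u)
    return ui, [-v for v in matvec(ui, a)]

def observables(p, j):
    """(3.26): returns (m^2, m s) = (-p^2, p . j)."""
    return -dot(p, p), dot(p, j)

# Sample hyperbolic holonomy (u, -Ad(u) j) and sample conjugating element.
p, j = [0.3, 1.2, -0.5], [0.7, -0.2, 1.1]
u = exp_minus(p)
holonomy = (u, [-v for v in matvec(u, j)])
g = (exp_minus([0.2, -0.6, 0.4]), [1.0, 2.0, -0.5])

u_c, a_c = multiply(multiply(g, holonomy), inverse(g))
p_c = matvec(g[0], p)                                  # Ad(g) p
j_c = [-v for v in matvec(lorentz_inverse(u_c), a_c)]  # from a = -Ad(u) j
before, after = observables(p, j), observables(p_c, j_c)
```

The pairs `before` and `after` should coincide up to rounding, although `j_c` itself differs from $\mathrm{Ad}(g) j$ by a term depending on the translational part of `g`.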
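Theorem 3.1 ([15]). Consider the Poisson manifold $\left( (P_3^\uparrow)^{2g}, \Theta \right)$ with group elements $(A_1, B_1, \ldots, A_g, B_g) \in (P_3^\uparrow)^{2g}$ parametrised according to
\[ A_i = \left( u_{A_i}, -\mathrm{Ad}(u_{A_i}) j_{A_i} \right), \qquad B_i = \left( u_{B_i}, -\mathrm{Ad}(u_{B_i}) j_{B_i} \right), \qquad i = 1, \ldots, g, \tag{3.27} \]
and the Poisson structure given by the symplectic form $\Omega = \delta\Theta$, where
\[ \begin{aligned} \Theta = {} & \sum_{i=1}^{g} \left\langle j_{A_i},\ \delta\left( u_{H_{i-1}} \cdots u_{H_1} \right) \left( u_{H_{i-1}} \cdots u_{H_1} \right)^{-1} \right\rangle \\ & - \sum_{i=1}^{g} \left\langle j_{A_i},\ \delta\left( u_{A_i}^{-1} u_{B_i}^{-1} u_{A_i} u_{H_{i-1}} \cdots u_{H_1} \right) \left( u_{A_i}^{-1} u_{B_i}^{-1} u_{A_i} u_{H_{i-1}} \cdots u_{H_1} \right)^{-1} \right\rangle \\ & + \sum_{i=1}^{g} \left\langle j_{B_i},\ \delta\left( u_{A_i}^{-1} u_{B_i}^{-1} u_{A_i} u_{H_{i-1}} \cdots u_{H_1} \right) \left( u_{A_i}^{-1} u_{B_i}^{-1} u_{A_i} u_{H_{i-1}} \cdots u_{H_1} \right)^{-1} \right\rangle \\ & - \sum_{i=1}^{g} \left\langle j_{B_i},\ \delta\left( u_{B_i}^{-1} u_{A_i} u_{H_{i-1}} \cdots u_{H_1} \right) \left( u_{B_i}^{-1} u_{A_i} u_{H_{i-1}} \cdots u_{H_1} \right)^{-1} \right\rangle, \qquad u_{H_i} = [u_{B_i}, u_{A_i}^{-1}], \end{aligned} \tag{3.28} \]

$^1$ The assumptions of simply-connectedness and unimodularity in [14, 15] are motivated by the absence of large gauge transformations and by technical simplifications in the quantisation of the theory but play no role in the classical results needed in this paper.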
C. Meusburger ↑
and $\delta$ denotes the exterior derivative on $(P_3^\uparrow)^{2g}$. Then, the symplectic structure on the moduli space
\[ \mathcal{M}_g = \big\{ (A_1, B_1, \ldots, A_g, B_g) \in (P_3^\uparrow)^{2g} \;\big|\; [B_g, A_g^{-1}] \cdots [B_1, A_1^{-1}] = 1 \big\} / P_3^\uparrow \tag{3.29} \]
is obtained from the symplectic form $\Omega = \delta\Theta$ on $(P_3^\uparrow)^{2g}$ by imposing the constraint (3.19) and dividing by the associated gauge transformations, which act on the group elements $A_i, B_i$ by simultaneous conjugation with $P_3^\uparrow$.

4. Grafting in the Chern-Simons Formalism: The Transformation of the Holonomies

In this section we relate the geometrical description of grafted (2+1)-spacetimes to their description in the Chern-Simons formalism. We derive explicit expressions for the transformation of the holonomies $A_i, B_i, N_{A_i}, N_{B_i}$ of our set of generators $a_i, b_i \in \pi_1(S_g)$ and their duals $n_{A_i}, n_{B_i} \in \pi_1(S_g)$ under the grafting operation. We start by considering the static spacetime associated to the cocompact Fuchsian group $\Gamma$. In this case, we identify the time parameter $x^0$ in the splitting (3.7) of the gauge field with the parameter $T$ characterising the hyperboloids $H_T$. After cutting the spatial surface $S_g$ along our set of generators $a_i, b_i \in \pi_1(S_g)$, we obtain the $4g$-gon $P_g$ in Fig. 5 on which the gauge field can be trivialised by a function
\[ \gamma_{st}^{-1} = (v_{st}, x_{st}) : \mathbb{R}_0^+ \times P_g \to P_3^\uparrow \tag{4.1} \]
as in (3.12). For fixed $T$, the translational part $x_{st}(T, \cdot) : P_g \to P_\Gamma^T$ of $\gamma_{st}^{-1}$ maps the polygon $P_g$ to the polygon $P_\Gamma^T \subset H_T$ defined by the Fuchsian group $\Gamma$, such that the images of sides and corners of $P_g$ are the corresponding sides and corners of $P_\Gamma^T$. By choosing coordinates on $P_g$, it is in principle possible to give an explicit expression for the trivialising function $\gamma_{st}^{-1} : \mathbb{R}_0^+ \times P_g \to P_3^\uparrow$. However, in order to determine the holonomies $A_i, B_i$ and $N_{A_i}, N_{B_i}$, it is sufficient to know the embedding of the sides and corners of $P_g$. As the two sides of the polygon $P_\Gamma^T$ corresponding to each generator $a_i, b_i \in \pi_1(S_g)$ are mapped into each other by the generators of $\Gamma$ according to (2.24), the overlap condition (3.16) for the trivialising function $\gamma_{st}^{-1}$ becomes
\[ v_{st}(T, \cdot)|_{y'} = v_Y\, v_{st}(T, \cdot)|_{y}, \qquad x_{st}(T, \cdot)|_{y'} = \mathrm{Ad}(v_Y)\, x_{st}(T, \cdot)|_{y}, \tag{4.2} \]
where $y \in \{a_1, \ldots, b_g\}$, $Y \in \{A_1, \ldots, B_g\}$ and $v_Y$ denotes the associated generator of $\Gamma$. The holonomies $N_{A_i}, N_{B_i}$ are therefore given by
\[ N_X = (v_X, 0), \qquad X \in \{A_1, \ldots, B_g\}. \tag{4.3} \]
Their translational components vanish, and the same holds for the holonomies $A_i, B_i$ up to conjugation with the Poincaré element $\gamma(p_0)$ associated to the basepoint. We now consider the (2+1)-spacetimes obtained from the static spacetime associated to $\Gamma$ via grafting along a closed, simple geodesic $\lambda \in \pi_1(S_\Gamma)$ on $S_\Gamma$ with weight $w$. As discussed in Sect. 2.2, this geodesic lifts to a $\Gamma$-invariant multicurve on $\mathbb{H}^2$,
\[ G = \big\{ (\mathrm{Ad}(v)\, c_{p,q},\, w) \;\big|\; v \in \Gamma \big\}, \tag{4.4} \]
Grafting and Poisson Structure in (2+1)-Gravity
where $c_{p,q}$ is the lift of $\lambda$, parametrised as in (2.20) with $p \in P_\Gamma^1$. As the geodesic $c_{p,q}$ is the lift of a simple closed geodesic on $S_\Gamma$, there exists a nontrivial element $\tilde v = e^{\alpha n_{p,q}^a J_a} \in \Gamma$ with $\alpha \in \mathbb{R}^+$, the holonomy of $\lambda$ defined up to conjugation, that maps the geodesic $c_{p,q}$ to itself. More precisely, the geodesic $c_{p,q}$ traverses a sequence of polygons
\[ P_1 = P_\Gamma^1, \quad P_2 = \mathrm{Ad}(v_r) P_\Gamma^1, \quad P_3 = \mathrm{Ad}(v_{r-1} v_r) P_\Gamma^1, \quad \ldots, \quad P_{r+1} = \mathrm{Ad}(v_1 \cdots v_r) P_\Gamma^1 = \mathrm{Ad}(\tilde v) P_\Gamma^1 \tag{4.5} \]
mapped into each other by group elements $v_i \in \Gamma$, until it reaches a point $p' = \mathrm{Ad}(v_1 \cdots v_r)\, p = \mathrm{Ad}(\tilde v)\, p \in P_{r+1}$ identified with $p$. In particular, this implies that the geodesics in (4.4) do not have intersection points with the corners of the polygons in the tessellation of $H_T$. In the following we therefore take the corner $x_{st}(T, p_0)$ as our basepoint $x_0$ and parametrise
\[ \gamma_{st}^{-1}(T, p_0) = (v_0, x_0). \tag{4.6} \]
As each generator of the Fuchsian group $\Gamma$ maps the polygon $P_\Gamma^T$ into one of its neighbours, we can express the group elements $v_i \in \Gamma$ in terms of the generators and their inverses as
\[ \tilde v = e^{\alpha n_{p,q}^a J_a} = v_1 \cdots v_r = v_{X_r}^{\alpha_r} \cdots v_{X_1}^{\alpha_1}, \qquad v_k = v_{X_r}^{\alpha_r} \cdots v_{X_{k+1}}^{\alpha_{k+1}}\, v_{X_k}^{\alpha_k}\, v_{X_{k+1}}^{-\alpha_{k+1}} \cdots v_{X_r}^{-\alpha_r}, \tag{4.7} \]
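with $v_{X_i} \in \{v_{A_1}, \ldots, v_{B_g}\}$, $\alpha_i = \pm 1$. To determine the map $\rho$ in (2.29) for the grafting along the multicurve (4.4), we note that a general geodesic $\mathrm{Ad}(v) c_{p,q} = c_{\mathrm{Ad}(v)p, \mathrm{Ad}(v)q}$, $v \in \Gamma$, is mapped to itself by the element $v \tilde v v^{-1} \in \Gamma$ and has the unit normal vector $n_{\mathrm{Ad}(v)p, \mathrm{Ad}(v)q} = \mathrm{Ad}(v)\, n_{p,q}$. The map $\rho$ in (2.29) is therefore given by
\[ \rho(x) = w \sum_{v \in \Gamma:\; a_x \cap c_{\mathrm{Ad}(v)p, \mathrm{Ad}(v)q} \neq \emptyset} \epsilon_{v,x}\, \mathrm{Ad}(v)\, n_{p,q} \qquad \text{for } x \notin \bigcup_{v \in \Gamma} c_{\mathrm{Ad}(v)p, \mathrm{Ad}(v)q}, \tag{4.8} \]
and (2.30) implies for points $x \in c_{\mathrm{Ad}(v_x)p, \mathrm{Ad}(v_x)q}$, $v_x \in \Gamma$, on one of the geodesics
\[ \rho_-(x) = w \sum_{v \in \Gamma - \{v_x\}:\; a_x \cap c_{\mathrm{Ad}(v)p, \mathrm{Ad}(v)q} \neq \emptyset} \epsilon_{v,x}\, \mathrm{Ad}(v)\, n_{p,q}, \qquad \rho_+(x) = \rho_-(x) + w\, \epsilon_{v_x,x}\, \mathrm{Ad}(v_x)\, n_{p,q}. \tag{4.9} \]
We are now ready to determine the transformation of the holonomies $A_i, B_i$ and $N_{A_i}, N_{B_i}$ under grafting along $\lambda$. We identify the time parameter $x^0$ in (3.7) with the cosmological time $T$ of the regular domain (2.33) associated to the multicurve (4.4). For fixed $T$, the translational part of the trivialising function $\gamma^{-1} = (v, x) : \mathbb{R}_0^+ \times P_g \to P_3^\uparrow$ maps the polygon $P_g$ to the image $P_{\Gamma,G}^T \subset U_T$ of $P_\Gamma^T$ under the grafting operation,
\[ \begin{aligned}
x(T, \cdot) &: P_g \to P_{\Gamma,G}^T, \\
P_{\Gamma,G}^T &= \Big\{ T x + \rho(x) \;\Big|\; x \in P_\Gamma^1 - \bigcup_{v \in \Gamma} c_{\mathrm{Ad}(v)p, \mathrm{Ad}(v)q} \Big\} \\
&\quad \cup \Big\{ T x + t \rho_+(x) + (1 - t) \rho_-(x) \;\Big|\; x \in P_\Gamma^1 \cap \bigcup_{v \in \Gamma} c_{\mathrm{Ad}(v)p, \mathrm{Ad}(v)q},\; t \in [0, 1] \Big\} \subset U_T.
\end{aligned} \tag{4.10} \]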
Again, we do not need an explicit expression for the embedding function $\gamma^{-1}$ but can determine the holonomies $A_i, B_i$ and $N_{A_i}, N_{B_i}$ from the embedding of the sides of the polygon $P_g$. For this, we consider a side $y \in \{a_1, b_1, \ldots, a_g, b_g\}$ of the polygon $P_g$ and denote by $q_i^Y$, $q_f^Y$, respectively, its starting and endpoint. In the case of the static spacetime associated to $\Gamma$, the holonomy $Y_{st}$ along $y$ with respect to the basepoint $p_0$ is given by
\[ Y_{st} = \gamma_{st}(T, q_f^Y)\, \gamma_{st}^{-1}(T, q_i^Y). \tag{4.11} \]
Since the geodesics in (4.4) do not intersect the corners of the polygon, the embedding of starting and endpoint of $y$ in the resulting regular domain is
\[ x(T, q_i^Y) = x_{st}(T, q_i^Y) + \rho(q_i^Y), \qquad x(T, q_f^Y) = x_{st}(T, q_f^Y) + \rho(q_f^Y), \tag{4.12} \]
where here and in the following $\rho(q)$, $q \in P_g$, stands for $\rho(x_{st}(1, q))$. This implies
\[ \gamma^{-1}(T, q_{i,f}^Y) = \big( v_{st}(T, q_{i,f}^Y),\; x_{st}(T, q_{i,f}^Y) + \rho(q_{i,f}^Y) \big) = \big( 1, \rho(q_{i,f}^Y) \big) \cdot \gamma_{st}^{-1}(T, q_{i,f}^Y), \tag{4.13} \]
and the holonomy $Y$ becomes
\[ Y = Y_{st} \cdot \Big( 1,\; -\mathrm{Ad}\big( v_{st}^{-1}(T, q_i^Y) \big) \big( \rho(q_f^Y) - \rho(q_i^Y) \big) \Big). \]
From expression (4.8) for the map $\rho$ we deduce
\[ \rho(q_f^Y) - \rho(q_i^Y) = w \sum_{v \in \Gamma:\; y \cap c_{\mathrm{Ad}(v)p, \mathrm{Ad}(v)q} \neq \emptyset} \epsilon_{v,y}\, \mathrm{Ad}(v)\, n_{p,q}, \tag{4.14} \]
where $\epsilon_{v,y}$ is the oriented intersection number of $c_{\mathrm{Ad}(v)p, \mathrm{Ad}(v)q}$ and $y$, taken to be positive if $c_{\mathrm{Ad}(v)p, \mathrm{Ad}(v)q}$ crosses $y$ from the left to the right in the direction of $y$. In order to determine the transformations of the holonomies $A_i, B_i$, we therefore need to determine the intersection points of the multicurve (4.4) with the sides of the polygon $P_\Gamma^1$ and the oriented intersection numbers $\epsilon_{v,y}$. As the geodesic $c_{p,q}$ intersects the side $\mathrm{Ad}(v_{r-k+2} \cdots v_r)\, x \subset P_k$ of the polygon $P_k$ if and only if the geodesic $\mathrm{Ad}(v_r^{-1} \cdots v_{r-k+2}^{-1})\, c_{p,q} = \mathrm{Ad}(v_{X_{r-k+1}}^{\alpha_{r-k+1}} \cdots v_{X_1}^{\alpha_1})\, c_{p,q}$ intersects the side $x \subset P_\Gamma^1$, the geodesics in (4.4) which have intersection points with the sides of $P_\Gamma^1$ are
\[ c_1 = c_{p,q}, \quad c_2 = \mathrm{Ad}(v_{X_1}^{\alpha_1})\, c_{p,q}, \quad c_3 = \mathrm{Ad}(v_{X_2}^{\alpha_2} v_{X_1}^{\alpha_1})\, c_{p,q}, \quad \ldots, \quad c_r = \mathrm{Ad}(v_{X_{r-1}}^{\alpha_{r-1}} \cdots v_{X_1}^{\alpha_1})\, c_{p,q}. \tag{4.15} \]
The intersections of the multicurve (4.4) with a given side $y \subset P_\Gamma^1$ are in one-to-one correspondence with factors $v_{X_k}^{\alpha_k}$, $X_k = Y$, in (4.7), and the geodesic in (4.15) intersecting $y$ is $c_k$ if $\alpha_k = 1$ and $c_{k+1}$ if $\alpha_k = -1$. Similarly, intersections with the side $y'$ are also in one-to-one correspondence with factors $v_{X_k}^{\alpha_k}$, $X_k = Y$, but the intersection takes place with $c_k$ for $\alpha_k = -1$ and with $c_{k+1}$ for $\alpha_k = 1$. Taking into account the orientation of the sides $a_i, b_i, a_i', b_i'$ in the polygon $P_g$, see Fig. 5, we find that intersections with
sides $a_i$ and $a_i'$ have positive intersection number for $\alpha_k = 1$ and negative intersection number for $\alpha_k = -1$, while the intersection numbers for sides $b_i$, $b_i'$ are positive and negative, respectively, for $\alpha_k = -1$ and $\alpha_k = 1$. Hence, we find that the transformation of the holonomy $Y = (u_Y, -\mathrm{Ad}(u_Y) j_Y)$ under grafting along $\lambda$ is given by
\[ u_Y \to u_Y, \qquad j_Y \to j_Y + \epsilon_Y\, w\, \mathrm{Ad}\big( v_{st}^{-1}(q_i^Y) \big) \Big[ \sum_{i:\, X_i = Y,\, \alpha_i = 1} \mathrm{Ad}(v_{X_{i-1}}^{\alpha_{i-1}} \cdots v_{X_1}^{\alpha_1})\, n_{p,q} - \sum_{i:\, X_i = Y,\, \alpha_i = -1} \mathrm{Ad}(v_{X_i}^{\alpha_i} \cdots v_{X_1}^{\alpha_1})\, n_{p,q} \Big], \tag{4.16} \]
where the overall sign $\epsilon_Y$ is positive for $Y = A_i$ and negative for $Y = B_i$. Note that (4.16) is invariant under conjugation of the group element $\tilde v = v_{X_r}^{\alpha_r} \cdots v_{X_1}^{\alpha_1} \in \Gamma$ associated to the geodesic $c_{p,q}$ with elements of $\Gamma$. Although such a conjugation can give rise to additional intersection points, the identity $\mathrm{Ad}(v_{X_r}^{\alpha_r} \cdots v_{X_1}^{\alpha_1})\, n_{p,q} = n_{p,q}$ implies that their contributions to the transformation (4.20) cancel. Hence, the transformation (4.16) depends only on the geodesic $\lambda \in \pi_1(S_\Gamma)$ and not on the choice of the lift $c_{p,q}$. To deduce the transformation of the holonomies $A_i, B_i$, we determine the corresponding starting and endpoints from Fig. 5. For $Y = A_i$, starting and endpoint are given by $q_i^{A_i} = p_{4(i-1)}$, $q_f^{A_i} = p_{4i-3}$, for $Y = B_i$ by $q_i^{B_i} = p_{4i-2}$, $q_f^{B_i} = p_{4i-3}$, and (3.20) implies
\[ v_{st}^{-1}(q_i^{A_i}) = v_0^{-1} v_{H_1}^{-1} \cdots v_{H_{i-1}}^{-1}, \qquad v_{st}^{-1}(q_i^{B_i}) = v_0^{-1} v_{H_1}^{-1} \cdots v_{H_{i-1}}^{-1} v_{A_i}^{-1} v_{B_i}. \tag{4.17} \]
Taking into account the oriented intersection numbers, we find that the holonomies $A_i, B_i$ transform under grafting along $\lambda$ according to
\[ \begin{aligned}
j_{A_i} &\to j_{A_i} + w\, \mathrm{Ad}\big( v_0^{-1} v_{H_1}^{-1} \cdots v_{H_{i-1}}^{-1} \big) \Big[ \sum_{k:\, X_k = A_i,\, \alpha_k = 1} \mathrm{Ad}(v_{X_{k-1}}^{\alpha_{k-1}} \cdots v_{X_1}^{\alpha_1})\, n_{p,q} - \sum_{k:\, X_k = A_i,\, \alpha_k = -1} \mathrm{Ad}(v_{X_k}^{\alpha_k} \cdots v_{X_1}^{\alpha_1})\, n_{p,q} \Big], \\
j_{B_i} &\to j_{B_i} - w\, \mathrm{Ad}\big( v_0^{-1} v_{H_1}^{-1} \cdots v_{H_{i-1}}^{-1} v_{A_i}^{-1} v_{B_i} \big) \Big[ \sum_{k:\, X_k = B_i,\, \alpha_k = 1} \mathrm{Ad}(v_{X_{k-1}}^{\alpha_{k-1}} \cdots v_{X_1}^{\alpha_1})\, n_{p,q} - \sum_{k:\, X_k = B_i,\, \alpha_k = -1} \mathrm{Ad}(v_{X_k}^{\alpha_k} \cdots v_{X_1}^{\alpha_1})\, n_{p,q} \Big].
\end{aligned} \tag{4.18} \]
Equivalently, we could have determined the transformation of the holonomies from the sides $a_i', b_i'$. In this case, $q_i^Y = p_{4i-1}$ for both $y = a_i', b_i'$ and therefore $v_{st}^{-1}(q_i^Y) = v_0^{-1} v_{H_1}^{-1} \cdots v_{H_{i-1}}^{-1} v_{A_i}^{-1} = v_{st}^{-1}(q_i^{B_i})\, v_{B_i}^{-1} = v_{st}^{-1}(q_i^{A_i})\, v_{A_i}^{-1}$, which together with the remark before (4.16) yields the same result.

With the interpretation of the holonomies $A_i, B_i$ as the different factors in the product $(P_3^\uparrow)^{2g}$, (4.18) defines a map $\mathrm{Gr}_{w\lambda} : (P_3^\uparrow)^{2g} \to (P_3^\uparrow)^{2g}$ which leaves the submanifold $(L_3^\uparrow)^{2g} \subset (P_3^\uparrow)^{2g}$ invariant. The transformation of the holonomy of a general curve $\eta = y_s^{\beta_s} \circ \ldots \circ y_1^{\beta_1} \in \pi_1(p_0, S_g)$ under $\mathrm{Gr}_{w\lambda}$ is then obtained by writing the curve as a product in the generators $A_i, B_i$. Parametrising the associated holonomy as $H_\eta = (u_\eta, -\mathrm{Ad}(u_\eta) j_\eta)$ as in (2.2), we find that the vector $j_\eta$ is given by
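\[ j_\eta = \sum_{i=1,\, \beta_i = 1}^{s} \mathrm{Ad}\big( u_{Y_1}^{-\beta_1} \cdots u_{Y_{i-1}}^{-\beta_{i-1}} \big)\, j_{Y_i} - \sum_{i=1,\, \beta_i = -1}^{s} \mathrm{Ad}\big( u_{Y_1}^{-\beta_1} \cdots u_{Y_i}^{-\beta_i} \big)\, j_{Y_i}, \tag{4.19} \]
and using (4.18), we obtain the following theorem.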
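Theorem 4.1. For $\eta = y_s^{\beta_s} \circ \ldots \circ y_1^{\beta_1} \in \pi_1(S_g)$ with $y_i \in \{a_1, \ldots, b_g\}$, $\beta_i \in \{\pm 1\}$, the transformation of the associated holonomy under grafting along $\lambda$ is given by
\[ \begin{aligned}
\mathrm{Gr}_{w\lambda} :\; u_\eta &\to u_\eta, \\
j_\eta &\to j_\eta + w \sum_{i=1}^{g} \Big[ \sum_{Y_j = A_i,\, \beta_j = 1} \mathrm{Ad}\big( u_{Y_1}^{-\beta_1} \cdots u_{Y_{j-1}}^{-\beta_{j-1}} \big) - \sum_{Y_j = A_i,\, \beta_j = -1} \mathrm{Ad}\big( u_{Y_1}^{-\beta_1} \cdots u_{Y_j}^{-\beta_j} \big) \Big] \\
&\qquad\quad \cdot\, \mathrm{Ad}\big( v_0^{-1} v_{H_1}^{-1} \cdots v_{H_{i-1}}^{-1} \big) \Big[ \sum_{k:\, X_k = A_i,\, \alpha_k = 1} \mathrm{Ad}(v_{X_{k-1}}^{\alpha_{k-1}} \cdots v_{X_1}^{\alpha_1})\, n_{p,q} - \sum_{k:\, X_k = A_i,\, \alpha_k = -1} \mathrm{Ad}(v_{X_k}^{\alpha_k} \cdots v_{X_1}^{\alpha_1})\, n_{p,q} \Big] \\
&\quad - w \sum_{i=1}^{g} \Big[ \sum_{Y_j = B_i,\, \beta_j = 1} \mathrm{Ad}\big( u_{Y_1}^{-\beta_1} \cdots u_{Y_{j-1}}^{-\beta_{j-1}} \big) - \sum_{Y_j = B_i,\, \beta_j = -1} \mathrm{Ad}\big( u_{Y_1}^{-\beta_1} \cdots u_{Y_j}^{-\beta_j} \big) \Big] \\
&\qquad\quad \cdot\, \mathrm{Ad}\big( v_0^{-1} v_{H_1}^{-1} \cdots v_{H_{i-1}}^{-1} v_{A_i}^{-1} v_{B_i} \big) \Big[ \sum_{k:\, X_k = B_i,\, \alpha_k = 1} \mathrm{Ad}(v_{X_{k-1}}^{\alpha_{k-1}} \cdots v_{X_1}^{\alpha_1})\, n_{p,q} - \sum_{k:\, X_k = B_i,\, \alpha_k = -1} \mathrm{Ad}(v_{X_k}^{\alpha_k} \cdots v_{X_1}^{\alpha_1})\, n_{p,q} \Big].
\end{aligned} \tag{4.20} \]
Although the formula for the transformation of $j_\eta$ appears rather complicated, one can give a heuristic interpretation of the various factors in (4.20). For this recall that, up to conjugation, the group element $\tilde v \in \Gamma$ gives the holonomy of the geodesic $\lambda$ and consider the associated element $\lambda = n_{X_r}^{\alpha_r} \circ \ldots \circ n_{X_1}^{\alpha_1} \in \pi_1(p_0, S_g)$ of the fundamental group based at $p_0$. The holonomy along this element is
\[ H_\lambda = (u_\lambda, -\mathrm{Ad}(u_\lambda) j_\lambda) = \gamma(p_0)\, N_{X_r}^{\alpha_r} \cdots N_{X_1}^{\alpha_1}\, \gamma^{-1}(p_0), \qquad u_\lambda = e^{-p_\lambda^a J_a}, \tag{4.21} \]
and from (4.7) it follows that the unit vector $n_{p,q}$ is given by
\[ n_{p,q} = -\mathrm{Ad}(v_0)\, \hat p_\lambda = -\tfrac{1}{m_\lambda}\, \mathrm{Ad}(v_0)\, p_\lambda. \tag{4.22} \]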
Hence, the terms $\mathrm{Ad}(v_{X_{k-1}}^{\alpha_{k-1}} \cdots v_{X_1}^{\alpha_1})\, n_{p,q}$ and $\mathrm{Ad}(v_{X_k}^{\alpha_k} \cdots v_{X_1}^{\alpha_1})\, n_{p,q}$ in (4.20) can be viewed as the parallel transport along $\lambda$ of the vector $\hat p_\lambda$ from the starting point of $\lambda$ to the intersection point with the sides $a_i, b_i$ of the polygon $P_g$. The terms $\mathrm{Ad}(v_0^{-1} v_{H_1}^{-1} \cdots v_{H_{i-1}}^{-1})$ and $\mathrm{Ad}(v_0^{-1} v_{H_1}^{-1} \cdots v_{H_{i-1}}^{-1} v_{A_i}^{-1} v_{B_i})$ transport the vector from the point $p_0$ to the starting point of, respectively, sides $a_i$ and $b_i$ of $P_g$. Finally, the terms $\mathrm{Ad}(u_{Y_1}^{-\beta_1} \cdots u_{Y_{j-1}}^{-\beta_{j-1}})$ and $\mathrm{Ad}(u_{Y_1}^{-\beta_1} \cdots u_{Y_j}^{-\beta_j})$ describe the parallel transport along the curve $\eta$ from its intersection point with $\lambda$ to its starting point $p_0$. We will give a more detailed and precise interpretation of this formula in Sect. 6, where we discuss the link between grafting and Dehn twists.
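The composition rule (4.19) for the vector $j_\eta$ can be checked mechanically. The following is a minimal sketch under assumptions that are not taken from the text: the group law (2.3) is assumed to be the standard semidirect product $(u, a)(v, b) = (uv, a + \mathrm{Ad}(u) b)$, Lorentz elements are modelled by integer $2 \times 2$ matrices of determinant one, and vectors by traceless $2 \times 2$ matrices on which Ad acts by conjugation. The sample generators, the sample word and all function names are hypothetical.

```python
# Sketch: compare formula (4.19) with direct multiplication of holonomies.
# Assumptions (not from the paper): group law (u,a)(v,b) = (uv, a + Ad(u)b);
# Lorentz parts are integer 2x2 matrices with det 1; vectors are traceless
# 2x2 matrices with Ad(u)x = u x u^{-1}. Integer entries keep the check exact.

I2 = ((1, 0), (0, 1))
ZERO = ((0, 0), (0, 0))

def mul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def inv(a):
    # inverse of a 2x2 matrix with determinant 1
    return ((a[1][1], -a[0][1]), (-a[1][0], a[0][0]))

def add(a, b):
    return tuple(tuple(a[i][j] + b[i][j] for j in range(2)) for i in range(2))

def neg(a):
    return tuple(tuple(-a[i][j] for j in range(2)) for i in range(2))

def ad(u, x):
    return mul(mul(u, x), inv(u))

def pmul(g, h):
    # assumed semidirect product law
    (u, a), (v, b) = g, h
    return (mul(u, v), add(a, ad(u, b)))

def pinv(g):
    u, a = g
    return (inv(u), neg(ad(inv(u), a)))

def holonomy(u, j):
    # parametrisation (u, -Ad(u) j) used throughout the text
    return (u, neg(ad(u, j)))

def j_of(h):
    # recover j from a holonomy (u, -Ad(u) j)
    u, a = h
    return neg(ad(inv(u), a))

def j_eta_direct(word, gens):
    # word lists (name, beta) for y_1, ..., y_s; builds Y_s^{b_s} ... Y_1^{b_1}
    h = (I2, ZERO)
    for name, beta in word:
        y = holonomy(*gens[name])
        h = pmul(y if beta == 1 else pinv(y), h)
    return j_of(h)

def j_eta_formula(word, gens):
    # right-hand side of (4.19)
    total, prefix = ZERO, I2      # prefix = u_{Y_1}^{-b_1} ... u_{Y_i}^{-b_i}
    for name, beta in word:
        u, j = gens[name]
        if beta == 1:
            total = add(total, ad(prefix, j))
            prefix = mul(prefix, inv(u))
        else:
            prefix = mul(prefix, u)
            total = add(total, neg(ad(prefix, j)))
    return total

# Hypothetical sample data: (Lorentz part, vector j) for two generators.
GENS = {"A1": (((1, 1), (0, 1)), ((1, 2), (3, -1))),
        "B1": (((2, 1), (1, 1)), ((0, 1), (-2, 0)))}
WORD = [("A1", 1), ("B1", -1), ("A1", -1), ("B1", 1), ("B1", 1)]

print(j_eta_direct(WORD, GENS) == j_eta_formula(WORD, GENS))
```

Under these assumptions the two computations agree for any word, since each additional letter contributes exactly one term of (4.19); the matrix model is only a convenient stand-in for $L_3^\uparrow$ acting on vectors.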
5. Grafting and Poisson Structure

In this section, we give explicit expressions for Hamiltonians on the Poisson manifold $((P_3^\uparrow)^{2g}, \Theta)$ which generate the transformation (4.20) of the holonomies under grafting via the Poisson bracket. As we have seen in Sect. 4 that the grafting operation is most easily described by parametrising one of the holonomies in question in terms of $A_i, B_i$ and the other one in terms of $N_{A_i}, N_{B_i}$, the first step is to derive an expression for the symplectic potential (3.28) involving the components of both $A_i, B_i$ and $N_{A_i}, N_{B_i}$. We then prove that the transformation (4.20) of the holonomies under grafting along a geodesic $\lambda \in \pi_1(S_g)$ with weight $w$ is generated by $w m_\lambda$, where $m_\lambda$ is the mass of $\lambda$ defined as in (3.26). Finally, we use this result to investigate the properties of the grafting transformation $\mathrm{Gr}_{w\lambda} : (P_3^\uparrow)^{2g} \to (P_3^\uparrow)^{2g}$ and prove a relation between the Poisson brackets of mass and spin for general elements $\lambda, \eta \in \pi_1(S_g)$.

5.1. The Poisson structure in terms of the dual generators. In order to derive an expression for the symplectic potential (3.28) in terms of both $A_i, B_i$ and $N_{A_i}, N_{B_i}$, we need to express the Lorentz and translational components of the holonomies $A_i, B_i$ and $N_{A_i}, N_{B_i}$ in terms of each other via (3.22) and (3.23). For the Lorentz components, we can simply replace $A_i, B_i$ with $u_{A_i}, u_{B_i}$, $N_{A_i}, N_{B_i}$ with $v_{A_i}, v_{B_i}$ and $\gamma^{-1}(p_0)$ with $v_0$ in (3.22), (3.23) and obtain
\[ \begin{aligned}
v_{A_i} &= v_0\, u_{H_1}^{-1} \cdots u_{H_i}^{-1} \cdot u_{B_i} \cdot u_{H_{i-1}} \cdots u_{H_1}\, v_0^{-1}, \\
v_{B_i} &= v_0\, u_{H_1}^{-1} \cdots u_{H_i}^{-1} \cdot u_{A_i} \cdot u_{H_{i-1}} \cdots u_{H_1}\, v_0^{-1}, \\
u_{A_i} &= v_0^{-1} v_{H_1}^{-1} \cdots v_{H_i}^{-1} \cdot v_{B_i} \cdot v_{H_{i-1}} \cdots v_{H_1}\, v_0, \\
u_{B_i} &= v_0^{-1} v_{H_1}^{-1} \cdots v_{H_i}^{-1} \cdot v_{A_i} \cdot v_{H_{i-1}} \cdots v_{H_1}\, v_0,
\end{aligned} \tag{5.1} \]
where $u_{H_i} = [u_{B_i}, u_{A_i}^{-1}]$, $v_{H_i} = [v_{B_i}, v_{A_i}^{-1}]$. The corresponding expressions for the translational components require some computation. Inserting the parametrisation of the holonomies $A_i, B_i$ into (3.23) and using (5.1), we find
\[ \begin{aligned}
x_{A_i} &= (1 - \mathrm{Ad}(v_{A_i})) \Big( x_0 + \sum_{k=1}^{i-1} \big[ (1 - \mathrm{Ad}(v_{A_k}))\, l_{A_k} + (1 - \mathrm{Ad}(v_{B_k}))\, l_{B_k} \big] \Big) \\
&\quad + \mathrm{Ad}(v_{B_i})\, l_{B_i} + (1 - \mathrm{Ad}(v_{A_i}))\, l_{A_i} + (1 - \mathrm{Ad}(v_{B_i}))\, l_{B_i}, \\
x_{B_i} &= (1 - \mathrm{Ad}(v_{B_i})) \Big( x_0 + \sum_{k=1}^{i-1} \big[ (1 - \mathrm{Ad}(v_{A_k}))\, l_{A_k} + (1 - \mathrm{Ad}(v_{B_k}))\, l_{B_k} \big] \Big) \\
&\quad - \mathrm{Ad}(v_{B_i})\, l_{A_i} + (1 - \mathrm{Ad}(v_{A_i}))\, l_{A_i} + (1 - \mathrm{Ad}(v_{B_i}))\, l_{B_i},
\end{aligned} \tag{5.2} \]
where
\[ \begin{aligned}
l_{A_i} &= \mathrm{Ad}(v_{H_{i-1}} \cdots v_{H_1} v_0)\, j_{A_i} = \mathrm{Ad}(v_0\, u_{H_1}^{-1} \cdots u_{H_{i-1}}^{-1} u_{A_i}^{-1}) \cdot \mathrm{Ad}(u_{A_i})\, j_{A_i}, \\
l_{B_i} &= -\mathrm{Ad}(v_{B_i}^{-1} v_{A_i} v_{H_{i-1}} \cdots v_{H_1} v_0)\, j_{B_i} = -\mathrm{Ad}(v_0\, u_{H_1}^{-1} \cdots u_{H_{i-1}}^{-1} u_{A_i}^{-1}) \cdot \mathrm{Ad}(u_{B_i})\, j_{B_i},
\end{aligned} \tag{5.3} \]
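and an analogous calculation for (3.22) yields
\[ \begin{aligned}
j_{A_i} &= -\big( 1 - \mathrm{Ad}(u_{A_i}^{-1}) \big) \Big( \mathrm{Ad}(v_0^{-1})\, x_0 + \sum_{k=1}^{i-1} \big[ (1 - \mathrm{Ad}(u_{A_k}))\, f_{A_k} + (1 - \mathrm{Ad}(u_{B_k}))\, f_{B_k} \big] \Big) \\
&\quad + \mathrm{Ad}(u_{A_i}^{-1}) \big[ (1 - \mathrm{Ad}(u_{A_i}))\, f_{A_i} + (1 - \mathrm{Ad}(u_{B_i}))\, f_{B_i} \big] + \mathrm{Ad}(u_{B_i})\, f_{B_i}, \\
j_{B_i} &= -\big( 1 - \mathrm{Ad}(u_{B_i}^{-1}) \big) \Big( \mathrm{Ad}(v_0^{-1})\, x_0 + \sum_{k=1}^{i-1} \big[ (1 - \mathrm{Ad}(u_{A_k}))\, f_{A_k} + (1 - \mathrm{Ad}(u_{B_k}))\, f_{B_k} \big] \Big) \\
&\quad + \mathrm{Ad}(u_{B_i}^{-1}) \big[ (1 - \mathrm{Ad}(u_{A_i}))\, f_{A_i} + (1 - \mathrm{Ad}(u_{B_i}))\, f_{B_i} \big] - \mathrm{Ad}(u_{B_i})\, f_{A_i},
\end{aligned} \tag{5.4} \]
where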
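\[ \begin{aligned}
f_{A_i} &= \mathrm{Ad}\big( u_{A_i}^{-1} u_{B_i}^{-1} u_{A_i} u_{H_{i-1}} \cdots u_{H_1} v_0^{-1} \big)\, x_{A_i} = \mathrm{Ad}\big( v_0^{-1} v_{H_1}^{-1} \cdots v_{H_{i-1}}^{-1} v_{A_i}^{-1} \big)\, x_{A_i}, \\
f_{B_i} &= -\mathrm{Ad}\big( u_{A_i}^{-1} u_{B_i}^{-1} u_{A_i} u_{H_{i-1}} \cdots u_{H_1} v_0^{-1} \big)\, x_{B_i} = -\mathrm{Ad}\big( v_0^{-1} v_{H_1}^{-1} \cdots v_{H_{i-1}}^{-1} v_{A_i}^{-1} \big)\, x_{B_i}.
\end{aligned} \tag{5.5} \]
Note that the variables $f_{A_i}, f_{B_i}, l_{A_i}, l_{B_i}$ have a clear geometrical interpretation. From Fig. 5 and Eq. (3.20) we see that $f_{A_i}, f_{B_i}$ can be viewed as the parallel transport of $x_{A_i}, x_{B_i}$ from $p_0$ to the point $p_{4i-1}$, which is the starting point of the sides $a_i'$ and $b_i'$ in the polygon $P_g$. Equivalently, we can interpret them as the parallel transport of $\mathrm{Ad}(v_{A_i}^{-1})\, x_{A_i}$ to $p_{4i-4}$ and of $\mathrm{Ad}(v_{B_i}^{-1})\, x_{B_i}$ to $p_{4i-2}$, the starting points of sides $a_i$ and $b_i$. Similarly, the variables $l_{A_i}$ represent the parallel transport of $j_{A_i}$ from the starting point $p_{4i-4}$ of side $a_i$, or, equivalently, of $\mathrm{Ad}(u_{A_i})\, j_{A_i}$ from its endpoint $p_{4i-3}$ to $p_0$, while $l_{B_i}$ corresponds to the parallel transport of $j_{B_i}$ from $p_{4i-2}$ to $p_0$ or of $\mathrm{Ad}(u_{B_i})\, j_{B_i}$ from $p_{4i-3}$ to $p_0$. Using expressions (5.1) to (5.5), we can now express the symplectic potential (3.28) on $(P_3^\uparrow)^{2g}$ in various combinations of the Lorentz and translational components of holonomies and dual holonomies.

Theorem 5.1. 1. In terms of the variables introduced in (5.1) to (5.5), the symplectic potential (3.28) is given by
\[ \Theta = \sum_{i=1}^{g} \Big( \langle l_{A_i}, v_{A_i}^{-1} \delta v_{A_i} \rangle + \langle l_{B_i}, v_{B_i}^{-1} \delta v_{B_i} \rangle \Big) \tag{5.6} \]
\[ \phantom{\Theta} = \sum_{i=1}^{g} \Big( \langle f_{A_i}, u_{A_i}^{-1} \delta u_{A_i} \rangle + \langle f_{B_i}, u_{B_i}^{-1} \delta u_{B_i} \rangle \Big) + \langle j_\infty + \mathrm{Ad}(v_0^{-1})\, x_0,\; u_\infty^{-1} \delta u_\infty \rangle. \tag{5.7} \]
2. After a gauge transformation which acts on the holonomies $N_{A_i}, N_{B_i}$ by simultaneous conjugation with the Poincaré element $(1, -\mathrm{Ad}(v_0) j_\infty - x_0)$,
\[ \begin{aligned}
N_Y \to \tilde N_Y &= (v_Y, \tilde x_Y) = \big( v_Y,\; x_Y - (1 - \mathrm{Ad}(v_Y)) (\mathrm{Ad}(v_0)\, j_\infty + x_0) \big), \qquad Y \in \{A_1, \ldots, B_g\}, \\
f_{A_i} \to \tilde f_{A_i} &= \mathrm{Ad}\big( u_{A_i}^{-1} u_{B_i}^{-1} u_{A_i} u_{H_{i-1}} \cdots u_{H_1} v_0^{-1} \big)\, \tilde x_{A_i}, \\
f_{B_i} \to \tilde f_{B_i} &= -\mathrm{Ad}\big( u_{A_i}^{-1} u_{B_i}^{-1} u_{A_i} u_{H_{i-1}} \cdots u_{H_1} v_0^{-1} \big)\, \tilde x_{B_i},
\end{aligned} \tag{5.8} \]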
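the symplectic potential (3.28) takes the form
\[ \Theta = \sum_{i=1}^{g} \Big( \langle \tilde f_{A_i}, u_{A_i}^{-1} \delta u_{A_i} \rangle + \langle \tilde f_{B_i}, u_{B_i}^{-1} \delta u_{B_i} \rangle \Big) \tag{5.9} \]
\[ \begin{aligned}
\phantom{\Theta} = {} & \sum_{i=1}^{g} \Big( \big\langle \mathrm{Ad}(v_{A_i}^{-1})\, \tilde x_{A_i},\; \delta(v_{H_{i-1}} \cdots v_{H_1})\,(v_{H_{i-1}} \cdots v_{H_1})^{-1} \big\rangle \\
& \qquad - \big\langle \mathrm{Ad}(v_{A_i}^{-1})\, \tilde x_{A_i},\; \delta(v_{A_i}^{-1} v_{B_i}^{-1} v_{A_i} v_{H_{i-1}} \cdots v_{H_1})\,(v_{A_i}^{-1} v_{B_i}^{-1} v_{A_i} v_{H_{i-1}} \cdots v_{H_1})^{-1} \big\rangle \Big) \\
& + \sum_{i=1}^{g} \Big( \big\langle \mathrm{Ad}(v_{B_i}^{-1})\, \tilde x_{B_i},\; \delta(v_{A_i}^{-1} v_{B_i}^{-1} v_{A_i} v_{H_{i-1}} \cdots v_{H_1})\,(v_{A_i}^{-1} v_{B_i}^{-1} v_{A_i} v_{H_{i-1}} \cdots v_{H_1})^{-1} \big\rangle \\
& \qquad - \big\langle \mathrm{Ad}(v_{B_i}^{-1})\, \tilde x_{B_i},\; \delta(v_{B_i}^{-1} v_{A_i} v_{H_{i-1}} \cdots v_{H_1})\,(v_{B_i}^{-1} v_{A_i} v_{H_{i-1}} \cdots v_{H_1})^{-1} \big\rangle \Big).
\end{aligned} \tag{5.10} \]
Proof. 1. The proof is a straightforward but rather lengthy computation. To prove (5.6) we express the products of the Lorentz components $u_{A_i}, u_{B_i}$ in (3.28) as products of $v_{A_i}, v_{B_i}$,
\[ \begin{aligned}
u_{H_{i-1}} \cdots u_{H_1} &= v_0^{-1} v_{H_1}^{-1} \cdots v_{H_{i-1}}^{-1}\, v_0, \\
u_{A_i}^{-1} u_{B_i}^{-1} u_{A_i} u_{H_{i-1}} \cdots u_{H_1} &= v_0^{-1} v_{H_1}^{-1} \cdots v_{H_{i-1}}^{-1} v_{A_i}^{-1}\, v_0, \\
u_{B_i}^{-1} u_{A_i} u_{H_{i-1}} \cdots u_{H_1} &= v_0^{-1} v_{H_1}^{-1} \cdots v_{H_{i-1}}^{-1} v_{A_i}^{-1} v_{B_i}\, v_0,
\end{aligned} \tag{5.11} \]
and simplify the resulting products via the identity
\[ \delta(ab)\,(ab)^{-1} = \delta a\, a^{-1} + \mathrm{Ad}(a)\, \delta b\, b^{-1}. \tag{5.12} \]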
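The products (5.11) follow from the definitions (5.1) by repeated cancellation, and both they and the identity (5.12) can be confirmed mechanically. The sketch below does this for genus two with exact integer arithmetic. It assumes the commutator convention $[a, b] = a b a^{-1} b^{-1}$, which is not stated in this section; the sample matrices only need determinant one and are not meant to generate a Fuchsian group, and all helper names are hypothetical. For (5.12) the variations are taken of the form $\delta a = X a$, $\delta b = Y b$.

```python
# Sketch: verify the products (5.11) from the definitions (5.1), and the
# variation identity (5.12), with exact integer 2x2 matrices.
# Assumptions (not from the paper): [a, b] = a b a^{-1} b^{-1}; the sample
# matrices have determinant 1 but do not generate a Fuchsian group.

I2 = ((1, 0), (0, 1))

def mul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def inv(a):
    # inverse of a 2x2 matrix with determinant 1
    return ((a[1][1], -a[0][1]), (-a[1][0], a[0][0]))

def madd(a, b):
    return tuple(tuple(a[i][j] + b[i][j] for j in range(2)) for i in range(2))

def prod(*ms):
    out = I2
    for m in ms:
        out = mul(out, m)
    return out

def comm(a, b):
    return prod(a, b, inv(a), inv(b))

def chain(hs, k):
    """Return h_k ... h_1 for the first k entries of hs (identity if k == 0)."""
    out = I2
    for h in hs[:k]:
        out = mul(h, out)
    return out

def check_5_11(v0, vA, vB):
    g = len(vA)
    vH = [comm(vB[i], inv(vA[i])) for i in range(g)]
    # (5.1): u_{A_i}, u_{B_i} in terms of the dual Lorentz components
    uA = [prod(inv(v0), inv(chain(vH, i + 1)), vB[i], chain(vH, i), v0)
          for i in range(g)]
    uB = [prod(inv(v0), inv(chain(vH, i + 1)), vA[i], chain(vH, i), v0)
          for i in range(g)]
    uH = [comm(uB[i], inv(uA[i])) for i in range(g)]
    ok = True
    for i in range(g):
        U = chain(uH, i)                       # u_{H_{i-1}} ... u_{H_1}
        W = prod(inv(v0), inv(chain(vH, i)))   # v_0^{-1} v_{H_1}^{-1} ... v_{H_{i-1}}^{-1}
        ok = ok and U == prod(W, v0)
        ok = ok and prod(inv(uA[i]), inv(uB[i]), uA[i], U) == prod(W, inv(vA[i]), v0)
        ok = ok and prod(inv(uB[i]), uA[i], U) == prod(W, inv(vA[i]), vB[i], v0)
    return ok

def check_5_12(a, b, X, Y):
    # With delta a = X a and delta b = Y b the Leibniz rule gives
    # delta(ab) = X a b + a Y b; compare both sides of (5.12).
    lhs = mul(madd(prod(X, a, b), prod(a, Y, b)), inv(mul(a, b)))
    rhs = madd(X, prod(a, Y, inv(a)))
    return lhs == rhs

# Hypothetical sample data for genus two.
V0 = ((2, 1), (1, 1))
VA = [((1, 1), (0, 1)), ((3, 2), (1, 1))]
VB = [((1, 0), (1, 1)), ((2, 3), (1, 2))]

print(check_5_11(V0, VA, VB),
      check_5_12(VA[0], VB[1], ((0, 1), (2, 0)), ((1, 3), (0, -1))))
```

The check of (5.11) is purely group-theoretic, so any matrices of determinant one serve as test data; only the ordering of the factors and the commutator convention matter.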
Taking into account that the embedding of the basepoint is not varied, $\delta v_0 = 0$, and using the Ad-invariance of the pairing $\langle\,,\rangle$ together with (5.3), we then obtain (5.6). To prove (5.7), we insert expression (5.4) for the variables $j_{A_i}, j_{B_i}$ in terms of $f_{A_i}, f_{B_i}$ into (3.28) and isolate the terms containing $f_{A_i}, f_{B_i}$. We then express the components of the constraint (3.19) in terms of Lorentz and translational components of the holonomies $A_i, B_i$ and $N_{A_i}, N_{B_i}$ according to
\[ u_\infty = u_{H_g} \cdots u_{H_1} = v_0^{-1} v_{H_1}^{-1} \cdots v_{H_g}^{-1}\, v_0, \tag{5.13} \]
\[ \begin{aligned}
j_\infty &= \mathrm{Ad}(v_0^{-1}) \sum_{i=1}^{g} \big[ (1 - \mathrm{Ad}(v_{A_i}))\, l_{A_i} + (1 - \mathrm{Ad}(v_{B_i}))\, l_{B_i} \big] \\
&= -\big( 1 - \mathrm{Ad}(u_\infty^{-1}) \big)\, \mathrm{Ad}(v_0^{-1})\, x_0 + \mathrm{Ad}(u_\infty^{-1}) \sum_{i=1}^{g} \big[ (1 - \mathrm{Ad}(u_{A_i}))\, f_{A_i} + (1 - \mathrm{Ad}(u_{B_i}))\, f_{B_i} \big].
\end{aligned} \tag{5.14} \]
Making use repeatedly of the identity (5.12) and of the second identity in (5.14) we obtain (5.7).

2. Equation (5.7) can be transformed into (5.9), (5.10) as follows. We first derive an
expression for the term $u_\infty^{-1} \delta u_\infty$ in terms of the Lorentz components $u_{A_i}, u_{B_i}$ from (5.13),
\[ u_\infty^{-1} \delta u_\infty = \sum_{i=1}^{g} \mathrm{Ad}\big( u_{H_1}^{-1} \cdots u_{H_{i-1}}^{-1} \big) \Big[ \big( 1 - \mathrm{Ad}(u_{A_i}^{-1} u_{B_i} u_{A_i}) \big)\, u_{A_i}^{-1} \delta u_{A_i} + \big( \mathrm{Ad}(u_{A_i}^{-1} u_{B_i} u_{A_i}) - \mathrm{Ad}(u_{A_i}^{-1} u_{B_i}) \big)\, u_{B_i}^{-1} \delta u_{B_i} \Big]. \tag{5.15} \]
Expressing $f_{A_i}, f_{B_i}$ in (5.7) in terms of $\tilde f_{A_i}, \tilde f_{B_i}$ and isolating the terms containing $j_\infty + \mathrm{Ad}(v_0^{-1})\, x_0$ yields (5.9). Finally, we express the Lorentz components $u_{A_i}, u_{B_i}$ in (5.9) as products in $v_{A_i}, v_{B_i}$ via (5.1). After applying (5.8) and again making use of (5.12) we obtain (5.10).

Thus, we find that the symplectic potential takes a particularly simple form when the components of the holonomies $A_i, B_i$ are paired with those of $N_{A_i}, N_{B_i}$. Note also that up to the term $\langle j_\infty + \mathrm{Ad}(v_0^{-1})\, x_0,\; u_\infty^{-1} \delta u_\infty \rangle$, which involves the components of the constraint (3.19) and can be eliminated by performing the gauge transformation to the variables $\tilde f_{A_i}, \tilde f_{B_i}$, the resulting expressions for the symplectic potential are symmetric under the exchange $l_{A_i}, l_{B_i} \leftrightarrow f_{A_i}, f_{B_i}$, $v_{A_i}, v_{B_i} \leftrightarrow u_{A_i}, u_{B_i}$, which corresponds to exchanging $A_i, B_i \leftrightarrow N_{A_i}, N_{B_i}$ and $\gamma^{-1}(p_0) \leftrightarrow \gamma(p_0)$. Similarly, expression (5.10) for the symplectic potential agrees with (3.28), if we take into account the difference in the parametrisation of the group elements $A_i, B_i$ and $N_{A_i}, N_{B_i}$ and exchange $j_{A_i} \leftrightarrow \mathrm{Ad}(v_{A_i})\, \tilde x_{A_i}$, $j_{B_i} \leftrightarrow \mathrm{Ad}(v_{B_i})\, \tilde x_{B_i}$. Hence, up to the gauge transformation (5.8), the symplectic potential takes the same form when expressed in terms of the holonomies $A_i, B_i$ and in terms of $N_{A_i}, N_{B_i}$, as could be anticipated from the symmetry in expressions (3.22), (3.23). It follows from formula (5.6) for the symplectic potential that the only nontrivial Poisson brackets of the variables $l_{A_i}, l_{B_i}$ and $v_{A_i}, v_{B_i}$ are given by
\[ \{ l_X^a, l_X^b \} = -\epsilon^{ab}{}_{c}\, l_X^c, \qquad \{ l_X^a, v_X \} = -v_X J_a, \qquad X \in \{A_1, \ldots, B_g\}. \tag{5.16} \]
We can therefore identify the variables $l_X^a$ with the left-invariant vector fields defined as in (2.17) and acting on the copy of $L_3^\uparrow$ associated to $v_X$,
\[ \{ l_X^a, F \}(v_{A_1}, \ldots, v_{B_g}) = -J_{R,X}^{a} F(v_{A_1}, \ldots, v_{B_g}) = -\frac{d}{dt}\Big|_{t=0} F(v_{A_1}, \ldots, v_X e^{t J_a}, \ldots, v_{B_g}) \tag{5.17} \]
for $F \in C^\infty((L_3^\uparrow)^{2g})$, $X \in \{A_1, \ldots, B_g\}$. The same holds for the Poisson brackets of $\tilde f_{A_i}, \tilde f_{B_i}$ with $u_{A_i}, u_{B_i}$,
\[ \{ \tilde f_X^a, F \}(u_{A_1}, \ldots, u_{B_g}) = -J_{R,X}^{a} F(u_{A_1}, \ldots, u_{B_g}) = -\frac{d}{dt}\Big|_{t=0} F(u_{A_1}, \ldots, u_X e^{t J_a}, \ldots, u_{B_g}). \tag{5.18} \]
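5.2. Hamiltonians for grafting. We can now use the results from Sect. 5.1 to show that the mass $m_\lambda$ of a closed, simple curve $\lambda \in \pi_1(S_g)$ generates the transformation of the holonomies under grafting along $\lambda$.

Theorem 5.2. Consider a simple, closed curve $\lambda = n_{X_r}^{\alpha_r} \circ \ldots \circ n_{X_1}^{\alpha_1} \in \pi_1(S_g)$ and a general closed curve $\eta = y_s^{\beta_s} \circ \ldots \circ y_1^{\beta_1} \in \pi_1(S_g)$ with holonomies $H_\lambda$ and $H_\eta$, parametrised in terms of $A_i, B_i$ and $N_{A_i}, N_{B_i}$ as
\[ H_\lambda = (u_\lambda, -\mathrm{Ad}(u_\lambda) j_\lambda) = \gamma(p_0)\, N_{X_r}^{\alpha_r} \cdots N_{X_1}^{\alpha_1}\, \gamma(p_0)^{-1}, \qquad H_\eta = (u_\eta, -\mathrm{Ad}(u_\eta) j_\eta) = Y_s^{\beta_s} \cdots Y_1^{\beta_1}, \tag{5.19} \]
where $X_i, Y_j \in \{A_1, \ldots, B_g\}$ and $\alpha_i, \beta_j \in \{\pm 1\}$. Then, the transformation (4.20) of the holonomy $H_\eta$ under grafting along $\lambda$ is generated by the mass $m_\lambda$,
\[ \{ w m_\lambda, F \} = -\frac{d}{dt}\Big|_{t=0} F \circ \mathrm{Gr}_{tw\lambda}, \qquad F \in C^\infty((P_3^\uparrow)^{2g}), \tag{5.20} \]
where $\mathrm{Gr}_{tw\lambda} : (P_3^\uparrow)^{2g} \to (P_3^\uparrow)^{2g}$ is given by (4.18), (4.20).

Proof. To prove the theorem, we calculate the Poisson bracket of $p_\lambda^2 = -m_\lambda^2$ with $j_{A_i}, j_{B_i}$. From expression (5.16) for the Poisson bracket we have
\[ \{ l_{A_i}^a, u_\lambda \} = -\sum_{X_k = A_i,\, \alpha_k = 1} u_\lambda \cdot \mathrm{Ad}\big( v_{X_{k-1}}^{\alpha_{k-1}} \cdots v_{X_1}^{\alpha_1} v_0 \big)^{ba} J_b + \sum_{X_k = A_i,\, \alpha_k = -1} u_\lambda \cdot \mathrm{Ad}\big( v_{X_k}^{\alpha_k} \cdots v_{X_1}^{\alpha_1} v_0 \big)^{ba} J_b, \tag{5.21} \]
\[ \{ l_{B_i}^a, u_\lambda \} = -\sum_{X_k = B_i,\, \alpha_k = 1} u_\lambda \cdot \mathrm{Ad}\big( v_{X_{k-1}}^{\alpha_{k-1}} \cdots v_{X_1}^{\alpha_1} v_0 \big)^{ba} J_b + \sum_{X_k = B_i,\, \alpha_k = -1} u_\lambda \cdot \mathrm{Ad}\big( v_{X_k}^{\alpha_k} \cdots v_{X_1}^{\alpha_1} v_0 \big)^{ba} J_b. \tag{5.22} \]
Applying the formula (2.17) for the left-invariant vector fields on $L_3^\uparrow$ to $F = p_\lambda^2$ yields
\[ \begin{aligned}
\{ l_{A_i}, p_\lambda^2 \} &= 2 \sum_{X_k = A_i,\, \alpha_k = 1} \mathrm{Ad}\big( v_{X_{k-1}}^{\alpha_{k-1}} \cdots v_{X_1}^{\alpha_1} v_0 \big)\, p_\lambda - 2 \sum_{X_k = A_i,\, \alpha_k = -1} \mathrm{Ad}\big( v_{X_k}^{\alpha_k} \cdots v_{X_1}^{\alpha_1} v_0 \big)\, p_\lambda, \\
\{ l_{B_i}, p_\lambda^2 \} &= 2 \sum_{X_k = B_i,\, \alpha_k = 1} \mathrm{Ad}\big( v_{X_{k-1}}^{\alpha_{k-1}} \cdots v_{X_1}^{\alpha_1} v_0 \big)\, p_\lambda - 2 \sum_{X_k = B_i,\, \alpha_k = -1} \mathrm{Ad}\big( v_{X_k}^{\alpha_k} \cdots v_{X_1}^{\alpha_1} v_0 \big)\, p_\lambda,
\end{aligned} \tag{5.23} \]
where the expressions involving vectors are to be understood componentwise. With expression (5.3) relating $l_{A_i}, l_{B_i}$ to $j_{A_i}, j_{B_i}$ and setting $\hat p_\lambda = \tfrac{1}{m_\lambda} p_\lambda$, $p_\lambda^2 = -m_\lambda^2$, we obtain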
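\[ \begin{aligned}
\{ m_\lambda, j_{A_i} \} &= \mathrm{Ad}\big( v_0^{-1} v_{H_1}^{-1} \cdots v_{H_{i-1}}^{-1} \big) \Big[ \sum_{X_k = A_i,\, \alpha_k = 1} \mathrm{Ad}\big( v_{X_{k-1}}^{\alpha_{k-1}} \cdots v_{X_1}^{\alpha_1} v_0 \big)\, \hat p_\lambda - \sum_{X_k = A_i,\, \alpha_k = -1} \mathrm{Ad}\big( v_{X_k}^{\alpha_k} \cdots v_{X_1}^{\alpha_1} v_0 \big)\, \hat p_\lambda \Big], \\
\{ m_\lambda, j_{B_i} \} &= -\mathrm{Ad}\big( v_0^{-1} v_{H_1}^{-1} \cdots v_{H_{i-1}}^{-1} v_{A_i}^{-1} v_{B_i} \big) \Big[ \sum_{X_k = B_i,\, \alpha_k = 1} \mathrm{Ad}\big( v_{X_{k-1}}^{\alpha_{k-1}} \cdots v_{X_1}^{\alpha_1} v_0 \big)\, \hat p_\lambda - \sum_{X_k = B_i,\, \alpha_k = -1} \mathrm{Ad}\big( v_{X_k}^{\alpha_k} \cdots v_{X_1}^{\alpha_1} v_0 \big)\, \hat p_\lambda \Big].
\end{aligned} \tag{5.24} \]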
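Since the mass $m_\lambda$ is invariant under conjugation, the bracketed sums in (5.24) should not depend on which cyclic representative of the word $v_{X_r}^{\alpha_r} \cdots v_{X_1}^{\alpha_1}$ is used, in line with the remark after (4.16). The sketch below checks this for a sample word. It is an illustration under stated assumptions, not the computation of the proof: Lorentz elements are modelled by integer $2 \times 2$ matrices of determinant one, the vector $\mathrm{Ad}(v_0) \hat p_\lambda$ is replaced by the traceless part of $\tilde v$ (which is fixed by $\mathrm{Ad}(\tilde v)$ but not normalised), and the common prefactors in (5.24) are omitted. All names and sample data are hypothetical.

```python
# Sketch: the bracketed sums of (5.24) for a word and for its cyclic rotation.
# Assumptions (not from the paper): integer 2x2 matrices with det 1 model the
# Lorentz elements; the invariant vector is the (unnormalised) traceless part
# of the word product; prefactors common to both words are left out.

I2 = ((1, 0), (0, 1))
ZERO = ((0, 0), (0, 0))

def mul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def inv(a):
    # inverse of a 2x2 matrix with determinant 1
    return ((a[1][1], -a[0][1]), (-a[1][0], a[0][0]))

def add(a, b):
    return tuple(tuple(a[i][j] + b[i][j] for j in range(2)) for i in range(2))

def neg(a):
    return tuple(tuple(-a[i][j] for j in range(2)) for i in range(2))

def ad(u, x):
    return mul(mul(u, x), inv(u))

def letter(gens, name, alpha):
    return gens[name] if alpha == 1 else inv(gens[name])

def word_product(word, gens):
    # word lists (name, alpha) for X_1, ..., X_r; returns v_{X_r}^{a_r} ... v_{X_1}^{a_1}
    out = I2
    for name, alpha in word:
        out = mul(letter(gens, name, alpha), out)
    return out

def invariant_vector(vt):
    # 2*vt - tr(vt)*1 is traceless and commutes with vt, so Ad(vt) fixes it
    tr = vt[0][0] + vt[1][1]
    return ((2 * vt[0][0] - tr, 2 * vt[0][1]), (2 * vt[1][0], 2 * vt[1][1] - tr))

def bracket_sum(word, gens, target):
    n = invariant_vector(word_product(word, gens))
    total, w = ZERO, I2           # w = v_{X_{k-1}}^{a_{k-1}} ... v_{X_1}^{a_1}
    for name, alpha in word:
        if name == target and alpha == 1:
            total = add(total, ad(w, n))
        w = mul(letter(gens, name, alpha), w)
        if name == target and alpha == -1:
            total = add(total, neg(ad(w, n)))
    return total

# Hypothetical sample data.
GENS = {"A1": ((1, 1), (0, 1)), "B1": ((2, 1), (1, 1))}
WORD = [("A1", 1), ("B1", 1), ("A1", -1), ("B1", -1), ("A1", 1)]
ROTATED = WORD[1:] + WORD[:1]

print(all(bracket_sum(WORD, GENS, y) == bracket_sum(ROTATED, GENS, y) for y in GENS))
```

Rotating the word conjugates $\tilde v$ by its first letter, and the invariant vector is conjugated along with it; the agreement of the two sums then rests only on $\mathrm{Ad}(\tilde v)$ fixing that vector.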
Using expression (4.19) for the variable $j_\eta$ as a linear combination of $j_{A_i}, j_{B_i}$ and taking into account the relation (4.22) between the vector $\hat p_\lambda$ and the vector $n_{p,q}$ in (4.20), we find agreement with (4.20) up to a sign, which proves (5.20).

Hence, we find that the transformation of the holonomies under grafting along a closed, simple geodesic $\lambda$ on $S_\Gamma$ is generated by the mass $m_\lambda$. Note, however, that the transformation generated by the mass $m_\lambda$ is defined for general closed curves $\lambda \in \pi_1(S_g)$ and as a map $(P_3^\uparrow)^{2g} \to (P_3^\uparrow)^{2g}$. In contrast, the grafting procedure defined in [12, 13] whose action on the holonomies is given in Sect. 4 is defined for simple, closed curves and acts on static spacetimes for which the translational components of the dual holonomies $N_{A_i}, N_{B_i}$ vanish and their Lorentz components are the generators of a cocompact Fuchsian group $\Gamma$. In this sense, the transformation $\mathrm{Gr}_\lambda : (P_3^\uparrow)^{2g} \to (P_3^\uparrow)^{2g}$ generated by the mass $m_\lambda$ can be viewed as an extension of the grafting procedure in [12, 13] to non-simple curves and to the whole Poisson manifold $((P_3^\uparrow)^{2g}, \Theta)$. The fact that the transformation of the holonomies under grafting is generated via the Poisson bracket allows us to deduce some properties of this transformation which would be much less apparent from the general formula (4.20).

Corollary 5.3. 1. The action of grafting leaves the constraint (3.19) invariant and commutes with the associated gauge transformation by simultaneous conjugation,
\[ \{ u_\infty, m_\lambda \} = \{ j_\infty^a, m_\lambda \} = 0. \tag{5.25} \]
2. The grafting transformations $\mathrm{Gr}_{w_i \lambda_i}$ for different closed curves $\lambda_i \in \pi_1(S_g)$ with weights $w_i \in \mathbb{R}^+$ commute and satisfy
\[ \Big\{ \sum_{i=1}^{n} w_i m_{\lambda_i}, F \Big\} = -\frac{d}{dt}\Big|_{t=0} F \circ \mathrm{Gr}_{t w_n \lambda_n} \circ \ldots \circ \mathrm{Gr}_{t w_1 \lambda_1}, \qquad F \in C^\infty((P_3^\uparrow)^{2g}). \tag{5.26} \]
3. The grafting maps $\mathrm{Gr}_{w\lambda}$ act on the Poisson manifold $((P_3^\uparrow)^{2g}, \Theta)$ via Poisson isomorphisms,
\[ \{ F \circ \mathrm{Gr}_{w\lambda}, G \circ \mathrm{Gr}_{w\lambda} \} = \{ F, G \} \circ \mathrm{Gr}_{w\lambda}, \qquad F, G \in C^\infty((P_3^\uparrow)^{2g}). \tag{5.27} \]
Proof. That the components of the constraint (3.19) Poisson commute with the mass $m_\lambda$ follows from the fact that $m_\lambda$ is an observable of the theory, but can also be checked by direct calculation. It is shown in [14] that the components $j_\infty^a$ act on the Lorentz components $u_{A_i}, u_{B_i}$ by simultaneous conjugation with $L_3^\uparrow$, which leaves all masses $m_\lambda$ invariant. To prove the second statement, we recall that all Lorentz components $u_{A_i}, u_{B_i}$ Poisson commute, which together with (4.20) and (5.20) implies the commutativity of grafting. Differentiating then yields (5.26). The third statement follows directly from the fact that the grafting transformation is generated via the Poisson bracket by a standard argument making use of the Jacobi identity. In our case, the fact that the Lorentz components $u_{A_i}, u_{B_i}$ Poisson commute allows one to write
\[ \{ F \circ \mathrm{Gr}_{w\lambda}, G \circ \mathrm{Gr}_{w\lambda} \} = \sum_{X, Y \in \{A_1, \ldots, B_g\}} \frac{\partial F}{\partial j_X^a} \frac{\partial G}{\partial j_Y^b} \Big( \{ j_X^a, j_Y^b \} - w \{ \{ m_\lambda, j_X^a \}, j_Y^b \} - w \{ j_X^a, \{ m_\lambda, j_Y^b \} \} \Big), \]
and, using the Jacobi identity for the last two brackets, one obtains (5.27),
\[ \{ F \circ \mathrm{Gr}_{w\lambda}, G \circ \mathrm{Gr}_{w\lambda} \} = \sum_{X, Y \in \{A_1, \ldots, B_g\}} \frac{\partial F}{\partial j_X^a} \frac{\partial G}{\partial j_Y^b} \Big( \{ j_X^a, j_Y^b \} - w \{ m_\lambda, \{ j_X^a, j_Y^b \} \} \Big) = \{ F, G \} \circ \mathrm{Gr}_{w\lambda}. \]
After deriving the Hamiltonians that generate the transformation of the holonomies under grafting along a closed, simple curve $\lambda \in \pi_1(S_g)$, we will now demonstrate that Theorem 5.2 gives rise to a general symmetry relation between the Poisson brackets of mass and spin associated to general closed curves $\lambda, \eta \in \pi_1(S_g)$.

Theorem 5.4. The Poisson brackets of mass and spin for $\lambda, \eta \in \pi_1(S_g)$ satisfy the relation
\[ \{ p_\eta \cdot j_\eta, p_\lambda^2 \} = \{ p_\eta^2, p_\lambda \cdot j_\lambda \}, \qquad \{ m_\eta, s_\lambda \} = \{ s_\eta, m_\lambda \}. \tag{5.28} \]
Proof. To prove (5.28), we consider curves $\lambda, \eta \in \pi_1(S_g)$ with holonomies $H_\lambda, H_\eta$ parametrised as in (5.19). From (5.23) it follows that the Poisson bracket of $p_\eta \cdot j_\eta$ and $p_\lambda^2$ is given by
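\[ \begin{aligned}
\{ p_\eta \cdot j_\eta, p_\lambda^2 \} = {} & 2 \sum_{i=1}^{g} \Big[ \sum_{Y_k = A_i,\, \beta_k = 1} \mathrm{Ad}\big( u_{Y_{k-1}}^{\beta_{k-1}} \cdots u_{Y_1}^{\beta_1} \big)\, p_\eta - \sum_{Y_k = A_i,\, \beta_k = -1} \mathrm{Ad}\big( u_{Y_k}^{\beta_k} \cdots u_{Y_1}^{\beta_1} \big)\, p_\eta \Big] \\
& \quad \cdot \Big[ \sum_{X_k = A_i,\, \alpha_k = 1} \mathrm{Ad}\big( v_0^{-1} v_{H_1}^{-1} \cdots v_{H_{i-1}}^{-1}\, v_{X_{k-1}}^{\alpha_{k-1}} \cdots v_{X_1}^{\alpha_1} v_0 \big)\, p_\lambda - \sum_{X_k = A_i,\, \alpha_k = -1} \mathrm{Ad}\big( v_0^{-1} v_{H_1}^{-1} \cdots v_{H_{i-1}}^{-1}\, v_{X_k}^{\alpha_k} \cdots v_{X_1}^{\alpha_1} v_0 \big)\, p_\lambda \Big] \\
& - 2 \sum_{i=1}^{g} \Big[ \sum_{Y_k = B_i,\, \beta_k = 1} \mathrm{Ad}\big( u_{Y_{k-1}}^{\beta_{k-1}} \cdots u_{Y_1}^{\beta_1} \big)\, p_\eta - \sum_{Y_k = B_i,\, \beta_k = -1} \mathrm{Ad}\big( u_{Y_k}^{\beta_k} \cdots u_{Y_1}^{\beta_1} \big)\, p_\eta \Big] \\
& \quad \cdot \Big[ \sum_{X_k = B_i,\, \alpha_k = 1} \mathrm{Ad}\big( v_0^{-1} v_{H_1}^{-1} \cdots v_{H_{i-1}}^{-1} v_{A_i}^{-1} v_{B_i}\, v_{X_{k-1}}^{\alpha_{k-1}} \cdots v_{X_1}^{\alpha_1} v_0 \big)\, p_\lambda - \sum_{X_k = B_i,\, \alpha_k = -1} \mathrm{Ad}\big( v_0^{-1} v_{H_1}^{-1} \cdots v_{H_{i-1}}^{-1} v_{A_i}^{-1} v_{B_i}\, v_{X_k}^{\alpha_k} \cdots v_{X_1}^{\alpha_1} v_0 \big)\, p_\lambda \Big].
\end{aligned} \tag{5.29} \]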
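To compute the Poisson bracket $\{ p_\eta^2, p_\lambda \cdot j_\lambda \}$, we express the translational component of the holonomy $H_\lambda$ in terms of the holonomies $N_{A_i}, N_{B_i}$,
\[ j_\lambda = -\sum_{k:\, \alpha_k = 1} \mathrm{Ad}\big( v_0^{-1} v_{X_1}^{-\alpha_1} \cdots v_{X_{k-1}}^{-\alpha_{k-1}} v_{X_k}^{-1} \big)\, x_{X_k} + \sum_{k:\, \alpha_k = -1} \mathrm{Ad}\big( v_0^{-1} v_{X_1}^{-\alpha_1} \cdots v_{X_k}^{-\alpha_k} v_{X_k}^{-1} \big)\, x_{X_k}. \tag{5.30} \]
As simultaneous conjugation of all holonomies with a general Poincaré valued function on $(P_3^\uparrow)^{2g}$ leaves $p_\lambda \cdot j_\lambda$ invariant, we can replace $x_{A_i}, x_{B_i}$ by $\tilde x_{A_i}, \tilde x_{B_i}$ in expression (5.30). Using expression (5.18) for the Poisson bracket of $\tilde f_{A_i}, \tilde f_{B_i}$ with $u_{A_i}, u_{B_i}$, Eq. (5.8) relating $\tilde f_{A_i}, \tilde f_{B_i}$ and $\tilde x_{A_i}, \tilde x_{B_i}$ and expression (2.17) for the action of the left-invariant vector fields on $L_3^\uparrow$, we find that the Poisson bracket of $\tilde x_{A_i}, \tilde x_{B_i}$ with $p_\eta^2$ is given by
\[ \begin{aligned}
\{ \tilde x_{A_i}, p_\eta^2 \} &= 2\, \mathrm{Ad}\big( v_{A_i} v_{H_{i-1}} \cdots v_{H_1} v_0 \big) \Big[ \sum_{Y_k = A_i,\, \beta_k = 1} \mathrm{Ad}\big( u_{Y_{k-1}}^{\beta_{k-1}} \cdots u_{Y_1}^{\beta_1} \big)\, p_\eta - \sum_{Y_k = A_i,\, \beta_k = -1} \mathrm{Ad}\big( u_{Y_k}^{\beta_k} \cdots u_{Y_1}^{\beta_1} \big)\, p_\eta \Big], \\
\{ \tilde x_{B_i}, p_\eta^2 \} &= -2\, \mathrm{Ad}\big( v_{A_i} v_{H_{i-1}} \cdots v_{H_1} v_0 \big) \Big[ \sum_{Y_k = B_i,\, \beta_k = 1} \mathrm{Ad}\big( u_{Y_{k-1}}^{\beta_{k-1}} \cdots u_{Y_1}^{\beta_1} \big)\, p_\eta - \sum_{Y_k = B_i,\, \beta_k = -1} \mathrm{Ad}\big( u_{Y_k}^{\beta_k} \cdots u_{Y_1}^{\beta_1} \big)\, p_\eta \Big].
\end{aligned} \]
Replacing $x_{A_i} \to \tilde x_{A_i}$, $x_{B_i} \to \tilde x_{B_i}$ in expression (5.30) for $j_\lambda$ then yields
\[ \begin{aligned}
\{ p_\eta^2, j_\lambda \} = {} & 2 \sum_{i=1}^{g} \Big[ \sum_{X_k = A_i,\, \alpha_k = 1} \mathrm{Ad}\big( v_0^{-1} v_{X_1}^{-\alpha_1} \cdots v_{X_{k-1}}^{-\alpha_{k-1}} \big) - \sum_{X_k = A_i,\, \alpha_k = -1} \mathrm{Ad}\big( v_0^{-1} v_{X_1}^{-\alpha_1} \cdots v_{X_k}^{-\alpha_k} \big) \Big] \\
& \quad \cdot\, \mathrm{Ad}\big( v_{H_{i-1}} \cdots v_{H_1} v_0 \big) \Big[ \sum_{Y_k = A_i,\, \beta_k = 1} \mathrm{Ad}\big( u_{Y_{k-1}}^{\beta_{k-1}} \cdots u_{Y_1}^{\beta_1} \big)\, p_\eta - \sum_{Y_k = A_i,\, \beta_k = -1} \mathrm{Ad}\big( u_{Y_k}^{\beta_k} \cdots u_{Y_1}^{\beta_1} \big)\, p_\eta \Big] \\
& - 2 \sum_{i=1}^{g} \Big[ \sum_{X_k = B_i,\, \alpha_k = 1} \mathrm{Ad}\big( v_0^{-1} v_{X_1}^{-\alpha_1} \cdots v_{X_{k-1}}^{-\alpha_{k-1}} \big) - \sum_{X_k = B_i,\, \alpha_k = -1} \mathrm{Ad}\big( v_0^{-1} v_{X_1}^{-\alpha_1} \cdots v_{X_k}^{-\alpha_k} \big) \Big] \\
& \quad \cdot\, \mathrm{Ad}\big( v_{B_i}^{-1} v_{A_i} v_{H_{i-1}} \cdots v_{H_1} v_0 \big) \Big[ \sum_{Y_k = B_i,\, \beta_k = 1} \mathrm{Ad}\big( u_{Y_{k-1}}^{\beta_{k-1}} \cdots u_{Y_1}^{\beta_1} \big)\, p_\eta - \sum_{Y_k = B_i,\, \beta_k = -1} \mathrm{Ad}\big( u_{Y_k}^{\beta_k} \cdots u_{Y_1}^{\beta_1} \big)\, p_\eta \Big],
\end{aligned} \tag{5.31} \]
and multiplication with $p_\lambda$ gives (5.29).

The geometrical implications of Theorem 5.4 are that the change of the spin $s_\eta$ of a closed, simple curve $\eta \in \pi_1(S_g)$ under grafting along another closed, simple curve $\lambda \in \pi_1(S_g)$ is the same as the change of the spin $s_\lambda$ under grafting along $\eta$. Furthermore, it is shown in [16] (for a summary of the results see Sect. 6) that the product $m_\lambda s_\lambda$ of mass and spin of a closed, simple curve $\lambda \in \pi_1(S_g)$ is the Hamiltonian which generates an infinitesimal Dehn twist around $\lambda$. Thus, Theorem 5.4 implies that the transformation of the mass $m_\eta$ under an infinitesimal Dehn twist around $\lambda \in \pi_1(S_g)$ agrees with the transformation of the spin $s_\eta$ under infinitesimal grafting along $\lambda$. We will clarify this connection further in the next section, where we discuss the relation between grafting and Dehn twists.

6. Grafting and Dehn Twists

In this section, we show that there is a link between the transformation of the holonomies under grafting and under Dehn twists along a general closed, simple curve $\lambda \in \pi_1(S_g)$. The transformation of the holonomies under Dehn twists is investigated in [16] for Chern-Simons theory on a manifold of topology $\mathbb{R} \times S_{g,n}$, where $S_{g,n}$ is a general orientable two-surface of genus $g$ with $n$ punctures. The gauge groups considered in [16] are of the form $G \ltimes \mathfrak{g}^*$, where $G$ is a finite dimensional, connected, simply connected and unimodular Lie group, $\mathfrak{g}^*$ the dual of its Lie algebra and $G$ acts on $\mathfrak{g}^*$ in the coadjoint representation. The assumption of simply-connectedness in [16] gives rise to technical simplifications in the quantised theory but does not affect the classical results. Hence, reasoning and results in [16] apply to the case of gauge group $P_3^\uparrow$ and can be summarised as follows.

Theorem 6.1 ([16]).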
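For any simple, closed curve $\lambda \in \pi_1(S_g)$ with holonomy $H_\lambda = (u_\lambda, -\mathrm{Ad}(u_\lambda) j_\lambda)$, $u_\lambda = e^{-p_\lambda^a J_a}$, the product of the associated mass and spin $p_\lambda \cdot j_\lambda = m_\lambda s_\lambda$ generates an infinitesimal Dehn twist around $\lambda$ via the Poisson bracket defined by (3.28),

$$\{m_\lambda s_\lambda, F\} = \frac{d}{dt}\Big|_{t=0} F \circ D_{t\lambda}, \qquad F \in C^\infty\big((P_3^\uparrow)^{2g}\big), \tag{6.1}$$

where $D_{t\lambda}: (P_3^\uparrow)^{2g} \to (P_3^\uparrow)^{2g}$ agrees with the action $D_\lambda: (P_3^\uparrow)^{2g} \to (P_3^\uparrow)^{2g}$ of the Dehn twist around $\lambda$ for $t = 1$. The transformation $D_{t\lambda}$ acts on the Poisson manifold $(P_3^\uparrow)^{2g}$ via Poisson isomorphisms,

$$\{F \circ D_{t\lambda}, G \circ D_{t\lambda}\} = \{F, G\} \circ D_{t\lambda}, \qquad F, G \in C^\infty\big((P_3^\uparrow)^{2g}\big). \tag{6.2}$$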
As in the definition of the grafting map $Gr_{w\lambda}: (P_3^\uparrow)^{2g} \to (P_3^\uparrow)^{2g}$, the different copies of $P_3^\uparrow$ in Theorem 6.1 stand for the holonomies $A_i, B_i$. However, unlike our derivation of the grafting map, the derivation in [16] does not make use of the dual generators $n_{a_i}, n_{b_i}$ but is formulated entirely in terms of the holonomies $A_i, B_i$. The action $D_{t\lambda}: (P_3^\uparrow)^{2g} \to (P_3^\uparrow)^{2g}$ of (infinitesimal) Dehn twists on the holonomies is determined graphically. As this graphical procedure will play an important role in relating Dehn twists and grafting, we present it here in a slightly different and more detailed version than in [16].

We consider simple curves $\lambda, \eta \in \pi_1(S_g)$ parametrised in terms of the generators $a_i, b_i \in \pi_1(S_g)$ as $\lambda = z_t^{\delta_t} \circ \dots \circ z_1^{\delta_1}$, $\eta = y_s^{\beta_s} \circ \dots \circ y_1^{\beta_1}$ with $z_i, y_j \in \{a_1, \dots, b_g\}$, $\beta_j, \delta_k \in \{\pm 1\}$ and associated holonomies

$$H_\lambda = Z_t^{\delta_t} \cdots Z_1^{\delta_1} = e^{-(p_\lambda^a + \theta k_\lambda^a) J_a} = (u_\lambda, -\mathrm{Ad}(u_\lambda) j_\lambda),$$
$$H_\eta = Y_s^{\beta_s} \cdots Y_1^{\beta_1} = e^{-(p_\eta^a + \theta k_\eta^a) J_a} = (u_\eta, -\mathrm{Ad}(u_\eta) j_\eta). \tag{6.3}$$
To determine the action of the transformation generated by $m_\lambda s_\lambda$ on the holonomy $H_\eta$, we consider the surface $S_g - D$ obtained from $S_g$ by removing a disc $D$ (see footnote 2). We represent the generators $a_i, b_i \in \pi_1(S_g)$ by curves as in Fig. 4, but instead of a basepoint, we draw a line on which the starting points $s_{a_i}, s_{b_i}$ and endpoints $t_{a_i}, t_{b_i}$ are ordered (from right to left; see footnote 3) according to

$$s_{a_1} < s_{b_1} < t_{a_1} < t_{b_1} < s_{a_2} < s_{b_2} < t_{a_2} < t_{b_2} < \dots < s_{a_g} < s_{b_g} < t_{a_g} < t_{b_g}. \tag{6.4}$$
The curves representing the generators $a_i, b_i \in \pi_1(S_g)$ start and end in, respectively, $s_{a_i}, s_{b_i}$ and $t_{a_i}, t_{b_i}$, and their inverses in $s_{a_i^{-1}} = t_{a_i}$, $s_{b_i^{-1}} = t_{b_i}$ and $t_{a_i^{-1}} = s_{a_i}$, $t_{b_i^{-1}} = s_{b_i}$. To derive the transformation of the holonomy $H_\eta$ under an (infinitesimal) Dehn twist along $\lambda$, we draw two such lines, one corresponding to $\eta$, one to $\lambda$, such that the line for $\eta$ is tangent to the disc, while the one for $\lambda$ is displaced slightly away from the disc. We then decompose the curves representing $\eta$ and $\lambda$ graphically into the curves representing the generators $a_i, b_i$ and their inverses, with ordered starting and end points on the corresponding lines, and into segments parallel to the lines which connect the starting and endpoints of different factors, see Figs. 6, 7, 9, and 10. The curves representing $a_i^{\pm 1}, b_i^{\pm 1}$ and the segments connecting their starting and endpoints are drawn in such a way that there is a minimal number of intersection points and such that all intersection points occur on the lines connecting different starting and endpoints of generators $a_i, b_i$ in the decomposition of $\lambda$, as shown in Figs. 6, 7, 9, and 10.

An intersection point $q_i$ is said to occur between the factors $z_i^{\delta_i}$ and $z_{i+1}^{\delta_{i+1}}$ on $\lambda$ if it lies on the straight line connecting $t_{z_i^{\delta_i}}$ and $s_{z_{i+1}^{\delta_{i+1}}}$, where $z_{t+1}^{\delta_{t+1}} = z_1^{\delta_1}$. Similarly, an intersection point occurs between the factors $y_i^{\beta_i}$ and $y_{i+1}^{\beta_{i+1}}$ on $\eta$ if it lies on $y_{i+1}^{\beta_{i+1}}$ near the starting point $s_{y_{i+1}^{\beta_{i+1}}}$ or on $y_i^{\beta_i}$ near the endpoint $t_{y_i^{\beta_i}}$.
2 The reason for the removal of the disc is that we work on an extended phase space where the constraint (3.19) arising from the defining relation of the fundamental group is not imposed. It is shown in [16] that this implies that instead of the mapping class group $\mathrm{Map}(S_g)$, it is the mapping class group $\mathrm{Map}(S_g - D)$ that acts on the Poisson manifold $(P_3^\uparrow)^{2g}$.
3 This ordering corresponds to an ordering of the edges at each vertex needed to define the Poisson structure in the formalism developed by Fock and Rosly [30].
Fig. 6. The decomposition of $n_{a_i} = h_1^{-1} \circ \dots \circ h_{i-1}^{-1} \circ a_i^{-1} \circ b_i \circ a_i \circ h_{i-1} \circ \dots \circ h_1$ (full line) and its intersection with $a_i$ (dashed line); segments in the decomposition of $n_{a_i}$ that do not intersect any generator $a_j, b_j \in \pi_1(S_g)$ are omitted
Fig. 7. The decomposition of $n_{b_i} = h_1^{-1} \circ \dots \circ h_{i-1}^{-1} \circ a_i^{-1} \circ b_i \circ a_i \circ b_i^{-1} \circ a_i \circ h_{i-1} \circ \dots \circ h_1$ (full line) and its intersection with $b_i$ (dashed line); segments in the decomposition of $n_{b_i}$ that do not intersect any generator $a_j, b_j \in \pi_1(S_g)$ are omitted
Fig. 8. The intersection of the geodesics ci with the polygon P 1
Fig. 9. The decomposition of h i (full line) and its intersection points with ai , bi (dashed lines)
Fig. 10. The intersection points of h i (full line) and its intersection with ai , bi (dashed lines), simplified representation without horizontal segments that do not contain intersection points
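Let now $\lambda, \eta \in \pi_1(S_g)$ have intersection points $q_1, \dots, q_n$ such that $q_i$ occurs between $z_{k_i}^{\delta_{k_i}}$ and $z_{k_i+1}^{\delta_{k_i+1}}$ on $\lambda$ and between $y_{j_i}^{\beta_{j_i}}$ and $y_{j_i+1}^{\beta_{j_i+1}}$ on $\eta$ with $j_1 \le j_2 \le \dots \le j_n$. We denote by $\epsilon_i = \epsilon_i(\lambda, \eta)$ the oriented intersection number in $q_i$ with the convention $\epsilon_i = 1$ if $\lambda$ crosses $\eta$ from the left to the right in the direction of $\eta$. It is shown in [16] that, with these conventions, the action of an infinitesimal Dehn twist $D_{t\lambda}: (P_3^\uparrow)^{2g} \to (P_3^\uparrow)^{2g}$ is given by inserting the Poincaré element

$$\big(Z_{k_i}^{\delta_{k_i}} Z_{k_i-1}^{\delta_{k_i-1}} \cdots Z_1^{\delta_1}\big)\, H_\lambda^{t\epsilon_i}\, \big(Z_{k_i}^{\delta_{k_i}} Z_{k_i-1}^{\delta_{k_i-1}} \cdots Z_1^{\delta_1}\big)^{-1} = \big(Z_{k_i}^{\delta_{k_i}} \cdots Z_1^{\delta_1}\big)\, e^{-t\epsilon_i (p_\lambda^a + \theta k_\lambda^a) J_a}\, \big(Z_{k_i}^{\delta_{k_i}} \cdots Z_1^{\delta_1}\big)^{-1} \tag{6.5}$$

between the factors $Y_{j_i}^{\beta_{j_i}}$ and $Y_{j_i+1}^{\beta_{j_i+1}}$,

$$\begin{aligned}
D_{t\lambda}: H_\eta \mapsto\; & Y_s^{\beta_s} \cdots Y_{j_n+1}^{\beta_{j_n+1}} \cdot \big(Z_{k_n}^{\delta_{k_n}} \cdots Z_1^{\delta_1}\big) e^{-t\epsilon_n (p_\lambda^a + \theta k_\lambda^a) J_a} \big(Z_{k_n}^{\delta_{k_n}} \cdots Z_1^{\delta_1}\big)^{-1} \cdot Y_{j_n}^{\beta_{j_n}} \cdots Y_{j_{n-1}+1}^{\beta_{j_{n-1}+1}} \\
& \cdot \big(Z_{k_{n-1}}^{\delta_{k_{n-1}}} \cdots Z_1^{\delta_1}\big) e^{-t\epsilon_{n-1} (p_\lambda^a + \theta k_\lambda^a) J_a} \big(Z_{k_{n-1}}^{\delta_{k_{n-1}}} \cdots Z_1^{\delta_1}\big)^{-1} \cdot Y_{j_{n-1}}^{\beta_{j_{n-1}}} \cdots Y_{j_{n-2}+1}^{\beta_{j_{n-2}+1}} \cdot (\dots) \cdots (\dots) \\
& \cdot Y_{j_2}^{\beta_{j_2}} \cdots Y_{j_1+1}^{\beta_{j_1+1}} \cdot \big(Z_{k_1}^{\delta_{k_1}} \cdots Z_1^{\delta_1}\big) e^{-t\epsilon_1 (p_\lambda^a + \theta k_\lambda^a) J_a} \big(Z_{k_1}^{\delta_{k_1}} \cdots Z_1^{\delta_1}\big)^{-1} \cdot Y_{j_1}^{\beta_{j_1}} \cdots Y_1^{\beta_1}.
\end{aligned} \tag{6.6}$$

We now define an analogous transformation $\widetilde{Gr}_{t\lambda}$ that acts on the holonomy $H_\eta$ by inserting at each intersection point the vector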
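$$\big(Z_{k_i}^{\delta_{k_i}} \cdots Z_1^{\delta_1}\big)\, H_\lambda^{\theta t\epsilon_i}\, \big(Z_{k_i}^{\delta_{k_i}} \cdots Z_1^{\delta_1}\big)^{-1} = \big(Z_{k_i}^{\delta_{k_i}} \cdots Z_1^{\delta_1}\big)\, e^{-\theta t\epsilon_i p_\lambda^a J_a}\, \big(Z_{k_i}^{\delta_{k_i}} \cdots Z_1^{\delta_1}\big)^{-1} \in \mathbb{R}^3 \subset P_3^\uparrow \tag{6.7}$$

instead of the Poincaré element (6.5) in the definition of the Dehn twist,

$$\begin{aligned}
\widetilde{Gr}_{t\lambda}: H_\eta \mapsto\; & Y_s^{\beta_s} \cdots Y_{j_n+1}^{\beta_{j_n+1}} \cdot \big(Z_{k_n}^{\delta_{k_n}} \cdots Z_1^{\delta_1}\big) e^{-\theta t\epsilon_n (p_\lambda^a + \theta k_\lambda^a) J_a} \big(Z_{k_n}^{\delta_{k_n}} \cdots Z_1^{\delta_1}\big)^{-1} \cdot Y_{j_n}^{\beta_{j_n}} \cdots Y_{j_{n-1}+1}^{\beta_{j_{n-1}+1}} \\
& \cdot \big(Z_{k_{n-1}}^{\delta_{k_{n-1}}} \cdots Z_1^{\delta_1}\big) e^{-\theta t\epsilon_{n-1} (p_\lambda^a + \theta k_\lambda^a) J_a} \big(Z_{k_{n-1}}^{\delta_{k_{n-1}}} \cdots Z_1^{\delta_1}\big)^{-1} \cdot Y_{j_{n-1}}^{\beta_{j_{n-1}}} \cdots Y_{j_{n-2}+1}^{\beta_{j_{n-2}+1}} \cdot (\dots) \cdots (\dots) \\
& \cdot Y_{j_2}^{\beta_{j_2}} \cdots Y_{j_1+1}^{\beta_{j_1+1}} \cdot \big(Z_{k_1}^{\delta_{k_1}} \cdots Z_1^{\delta_1}\big) e^{-\theta t\epsilon_1 (p_\lambda^a + \theta k_\lambda^a) J_a} \big(Z_{k_1}^{\delta_{k_1}} \cdots Z_1^{\delta_1}\big)^{-1} \cdot Y_{j_1}^{\beta_{j_1}} \cdots Y_1^{\beta_1}.
\end{aligned} \tag{6.8}$$

From the parametrisation (6.3) we see directly that this transformation leaves the Lorentz component of $H_\eta$ invariant and acts on the vector $j_\eta$ according to

$$\widetilde{Gr}_{t\lambda}: j_\eta \mapsto j_\eta + t \sum_{i=1}^{n} \epsilon_i\, \mathrm{Ad}\big(u_{Y_1}^{-\beta_1} \cdots u_{Y_{j_i}}^{-\beta_{j_i}}\big)\, \mathrm{Ad}\big(u_{Z_{k_i}}^{\delta_{k_i}} \cdots u_{Z_1}^{\delta_1}\big)\, p_\lambda. \tag{6.9}$$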
We will now demonstrate that, up to a factor $m_\lambda$, the transformation (6.8) is the same as the transformation (4.20) of $H_\eta$ under grafting along $\lambda$. For this, we express $\lambda$ as a product in the dual generators $n_{a_i}, n_{b_i}$,

$$\lambda = z_t^{\delta_t} \circ \dots \circ z_1^{\delta_1} = n_{x_r}^{\alpha_r} \cdot \ldots \cdot n_{x_1}^{\alpha_1}, \qquad x_i \in \{a_1, \dots, b_g\},\ \alpha_i \in \{\pm 1\}. \tag{6.10}$$
From expression (3.24) of $n_{a_i}, n_{b_i}$ in terms of $a_i, b_i$, it follows that the curves on $S_g$ representing $n_{a_i}, n_{b_i}$ both start and end in $s_{a_1}$. Hence, by representing the curve $\lambda$ as a product of $n_{a_i}, n_{b_i}$, we find that in contrast to the graphical representation in terms of $a_i$ and $b_i$, there are no intersection points on straight segments connecting the starting and endpoints of different factors. All intersection points of $\lambda$ and $\eta$ occur within the curves representing the factors $n_{a_i}^{\pm 1}$ and $n_{b_i}^{\pm 1}$ in (6.10), which reflects the fact that the generators $n_{a_i}, n_{b_i}$ are dual to the generators $a_i, b_i$. To show that transformation (6.8) agrees with the transformation (4.20) of the holonomy $H_\eta$ under grafting along $\lambda$, it is therefore sufficient to examine the intersection points of $n_{a_i}$ with $a_i$ and of $n_{b_i}$ with $b_i$.

Expressing the generators $n_{a_i}, n_{b_i}$ as products in the generators $a_i, b_i$ via (3.24) and applying the graphical prescription defined above, we find that the intersection of $a_i$ and $n_{a_i}$ occurs between $a_i \circ h_{i-1} \circ \dots \circ h_1$ and $h_1^{-1} \circ \dots \circ h_{i-1}^{-1} \circ a_i^{-1} \circ b_i$ on $n_{a_i}$ and after $a_i$, and has negative intersection number, see Fig. 6. Figure 7 shows that the intersection of $b_i$ and $n_{b_i}$ occurs before $b_i$ and between $b_i^{-1} \circ a_i \circ h_{i-1} \circ \dots \circ h_1$ and $h_1^{-1} \circ \dots \circ h_{i-1}^{-1} \circ a_i^{-1} \circ b_i \circ a_i$ on $n_{b_i}$, also with negative intersection number. The intersections of $a_i$ and $n_{a_i}^{-1}$ therefore lie between $b_i^{-1} \circ a_i \circ h_{i-1} \circ \dots \circ h_1$ and $h_1^{-1} \circ \dots \circ h_{i-1}^{-1} \circ a_i^{-1}$, and those of $b_i$ with $n_{b_i}^{-1}$ between $a_i^{-1} \circ b_i^{-1} \circ a_i \circ h_{i-1} \circ \dots \circ h_1$ and $h_1^{-1} \circ \dots \circ h_{i-1}^{-1} \circ a_i^{-1} \circ b_i$, both with positive intersection number. By evaluating the general expression (6.9) for the curves $\eta = a_i$, $\eta = b_i$, we find
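$$\begin{aligned}
j_{A_i} \mapsto j_{A_i} &- \mathrm{Ad}(u_{A_i}^{-1}) \sum_{X_k = A_i,\ \alpha_k = 1} \mathrm{Ad}\big(u_{A_i} u_{H_{i-1}} \cdots u_{H_1} v_0^{-1} v_{X_{k-1}}^{\alpha_{k-1}} \cdots v_{X_1}^{\alpha_1} v_0\big)\, p_\lambda \\
&+ \mathrm{Ad}(u_{A_i}^{-1}) \sum_{X_k = A_i,\ \alpha_k = -1} \mathrm{Ad}\big(u_{B_i}^{-1} u_{A_i} u_{H_{i-1}} \cdots u_{H_1} v_0^{-1} v_{X_{k-1}}^{\alpha_{k-1}} \cdots v_{X_1}^{\alpha_1} v_0\big)\, p_\lambda,
\end{aligned} \tag{6.11}$$

$$\begin{aligned}
j_{B_i} \mapsto j_{B_i} &+ \sum_{X_k = B_i,\ \alpha_k = 1} \mathrm{Ad}\big(u_{B_i}^{-1} u_{A_i} u_{H_{i-1}} \cdots u_{H_1} v_0^{-1} v_{X_{k-1}}^{\alpha_{k-1}} \cdots v_{X_1}^{\alpha_1} v_0\big)\, p_\lambda \\
&- \sum_{X_k = B_i,\ \alpha_k = -1} \mathrm{Ad}\big(u_{A_i}^{-1} u_{B_i}^{-1} u_{A_i} u_{H_{i-1}} \cdots u_{H_1} v_0^{-1} v_{X_{k-1}}^{\alpha_{k-1}} \cdots v_{X_1}^{\alpha_1} v_0\big)\, p_\lambda,
\end{aligned} \tag{6.12}$$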
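and with identities (4.22), (5.11) we recover (4.18), up to a factor $m_\lambda$. The transformation of a general curve $\eta \in \pi_1(S_g)$ is then given by decomposing it into the generators $a_i, b_i$, and we obtain the following theorem.

Theorem 6.2. Formulated in terms of the holonomies $A_i, B_i$, the grafting map $Gr_{w\lambda}: (P_3^\uparrow)^{2g} \to (P_3^\uparrow)^{2g}$ defined by (4.18) takes the form

$$Gr_{w m_\lambda \lambda} = \widetilde{Gr}_{w\lambda} = D_{\theta w \lambda}, \tag{6.13}$$

with $D_{w\lambda}$, $\widetilde{Gr}_{w\lambda}$ given by (6.6), (6.8). In particular, the Poisson bracket between $m_\lambda$ and $s_\eta$ or, equivalently, $s_\lambda$ and $m_\eta$ is given by

$$\{m_\lambda, s_\eta\} = \{s_\lambda, m_\eta\} = -\sum_{i=1}^{n} \epsilon_i(\lambda, \eta)\, \mathrm{Ad}\big(u_{Z_{k_i}}^{\alpha_{k_i}} \cdots u_{Z_1}^{\alpha_1}\big) \hat p_\lambda \cdot \mathrm{Ad}\big(u_{Y_{l_i}}^{\beta_{l_i}} \cdots u_{Y_1}^{\beta_1}\big) \hat p_\eta. \tag{6.14}$$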
Hence, we have found a rather close relation between the action of infinitesimal Dehn twists and grafting along a closed, simple curve $\lambda \in \pi_1(S_g)$. The infinitesimal Dehn twist along $\lambda$ is generated by the observable $m_\lambda s_\lambda$ and acts on the holonomy of another curve $\eta$ by inserting at each intersection point $q_i$ the Poincaré element $\big(Z_{k_i}^{\alpha_{k_i}} \cdots Z_1^{\alpha_1}\big) H_\lambda^{\epsilon_i t} \big(Z_{k_i}^{\alpha_{k_i}} \cdots Z_1^{\alpha_1}\big)^{-1}$. Grafting along $\lambda$ is generated by the observable $m_\lambda^2$ and inserts at each intersection point the element $\big(Z_{k_i}^{\alpha_{k_i}} \cdots Z_1^{\alpha_1}\big) H_\lambda^{\theta \epsilon_i t} \big(Z_{k_i}^{\alpha_{k_i}} \cdots Z_1^{\alpha_1}\big)^{-1} \in \mathbb{R}^3 \subset P_3^\uparrow$. The formal parameter $\theta$ satisfying $\theta^2 = 0$ therefore allows us to view grafting along $\lambda$ with weight $w$ as an infinitesimal Dehn twist with parameter $\theta w$.

7. Example: Grafting and Dehn twists along $\lambda = h_i = [b_i, a_i^{-1}]$

To illustrate the general results of this paper with a concrete example, we consider grafting and Dehn twists along the curve $\lambda = h_i = [b_i, a_i^{-1}] \in \pi_1(p_0, S_g)$. We start by determining the transformation of the holonomies under grafting along $\lambda$ as described in Sect. 4. From (3.23) it follows that the associated element of the cocompact Fuchsian group $\Gamma$ is given by

$$v = v_0 u_{H_i} v_0^{-1} = v_{H_1}^{-1} \cdots v_{H_{i-1}}^{-1} v_{H_i}^{-1} v_{H_{i-1}} \cdots v_{H_1}. \tag{7.1}$$
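As we have shown in Sect. 4 that conjugation with elements of $\Gamma$ does not affect the grafting, we can instead consider the curve

$$\tilde\lambda = n_{h_i}^{-1} = [n_{a_i}^{-1}, n_{b_i}]$$

with associated group element

$$\tilde v = v_{H_i}^{-1} = [v_{A_i}^{-1}, v_{B_i}].$$

We denote by $\tilde c_{p,q}$ the lift of the closed, simple geodesic $\tilde\lambda$ to a geodesic in $\mathbb{H}^2$ with p ∈ P 1 and with unit normal vector

$$\tilde n_{p,q} = -\mathrm{Ad}\big(v_{H_{i-1}} \cdots v_{H_1} v_0\big)\, \hat p_\lambda, \qquad e^{-p_\lambda^a J_a} = [u_{B_i}, u_{A_i}^{-1}]. \tag{7.2}$$

From Fig. 8, we find that the geodesics in the associated $\Gamma$-invariant multicurve on $\mathbb{H}^2$ that intersect the polygon P 1 ⊂ $\mathbb{H}^2$ are given by

$$c_1 = \tilde c_{p,q}, \quad c_2 = \mathrm{Ad}(v_{B_i}^{-1})\, \tilde c_{p,q}, \quad c_3 = \mathrm{Ad}(v_{A_i} v_{B_i}^{-1})\, \tilde c_{p,q}, \quad c_4 = \mathrm{Ad}(v_{B_i} v_{A_i}^{-1} v_{B_i}^{-1})\, \tilde c_{p,q}, \quad c_5 = \mathrm{Ad}([v_{A_i}^{-1}, v_{B_i}])\, \tilde c_{p,q} = \tilde c_{p,q}, \tag{7.3}$$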
and all intersection points lie on sides ai , ai , bi , bi . The side ai of the polygon intersects c2 , c5 = c1 with, respectively, positive and negative intersection number, while bi intersects c2 , c3 , also with, respectively, positive and negative intersection number. Hence, using formula (4.18) and expression (7.2), we find that the transformation of the holonomies Ai , Bi along the generators ai , bi ∈ π1 (Sg ) is given by −1 · · ·v j Ai → j Ai −tAd v0−1 v −1 H1 Hi−1 (n5 − n2 ) −1 1−Ad v −1 n˜ p,q = j Ai −tAd v0−1 v −1 H1· · ·v Hi−1 Bi pˆ λ , = j Ai + t 1 − Ad u −1 Ai −1 −1 j Bi → j Bi +tAd v0−1 v −1 H1· · ·v Hi−1 v Ai v Bi (n3−n2 ) −1 −1 −1 1−Ad v n˜ p,q · · ·v v = j Bi −tAd v0−1 v −1 H1 Hi−1 Ai Hi pˆ λ , = j Bi + t 1 − Ad u −1 (7.4) Bi while all other holonomies transform trivially. The transformation of the holonomy along a general curve η ∈ π1 (Sg ) is obtained by writing the associated vector j η as a linear combination of j Ai , j Bi as in (4.19). Expression (5.24) implies that the mass m λ has non-trivial Poisson brackets only with the variables j Ai , j Bi , −1 ˜ p,q−Ad(v Hi )n˜ cp,q m λ , j Ai = Ad v0−1 v −1 Ad(v −1 H1 · · · v Hi−1 Bi )n = − 1−Ad(u −1 ) pˆ λ , Ai −1 −1 ˜ p,q−Ad(v −1 ˜ p,q m λ , j Bi = −Ad v0−1 v −1 Ad(v Ai v −1 H1 · · · v Hi−1 v Ai v Bi Bi )n Bi )n = − 1−Ad(u −1 ) pˆ λ . Bi
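Grafting along $\lambda$ therefore acts on the holonomies $A_i, B_i$ according to

$$\begin{aligned}
Gr_{t m_\lambda \lambda}:\ & A_i \mapsto (1, t p_\lambda)\, A_i\, (1, -t p_\lambda) = e^{t\theta p_\lambda^a J_a} A_i\, e^{-t\theta p_\lambda^a J_a} = H_\lambda^{-\theta t} A_i H_\lambda^{\theta t}, \\
& B_i \mapsto (1, t p_\lambda)\, B_i\, (1, -t p_\lambda) = e^{t\theta p_\lambda^a J_a} B_i\, e^{-t\theta p_\lambda^a J_a} = H_\lambda^{-\theta t} B_i H_\lambda^{\theta t}.
\end{aligned} \tag{7.5}$$

To determine the action of an (infinitesimal) Dehn twist along $\lambda$, we apply the graphical procedure of Sect. 6 as depicted in Figs. 9 and 10. We find that both $a_i$ and $b_i$ intersect $\lambda$ twice, once at their starting points with positive intersection number and once at their endpoints with negative intersection number. All intersections take place on the segment linking $t_{b_i}$ with $s_{a_i}$ on $\lambda$. Hence, the action of an infinitesimal Dehn twist along $\lambda$ on the holonomies $A_i, B_i$ is given by

$$\begin{aligned}
D_{t\lambda}:\ & A_i \mapsto e^{t(p_\lambda^a + \theta k_\lambda^a) J_a} A_i\, e^{-t(p_\lambda^a + \theta k_\lambda^a) J_a} = H_\lambda^{-t} A_i H_\lambda^{t}, \\
& B_i \mapsto e^{t(p_\lambda^a + \theta k_\lambda^a) J_a} B_i\, e^{-t(p_\lambda^a + \theta k_\lambda^a) J_a} = H_\lambda^{-t} B_i H_\lambda^{t},
\end{aligned}$$

where $H_\lambda = [B_i, A_i^{-1}] = e^{-(p_\lambda^a + \theta k_\lambda^a) J_a}$, and we obtain the relation between grafting and Dehn twists in Theorem 6.2: $Gr_{t m_\lambda \lambda} = D_{\theta t \lambda}$.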
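The role of the formal parameter $\theta$ in the relation $Gr_{t m_\lambda \lambda} = D_{\theta t \lambda}$ can be illustrated with dual-number arithmetic. The following is a minimal sketch, not part of the original derivation: the class `Dual` and the function `twist_exponent` are hypothetical helpers, and only a single component $a$ of the exponent $-s\,(p_\lambda^a + \theta k_\lambda^a)$ of $H_\lambda^s$ is tracked. It shows that replacing the twist parameter $t$ by $\theta t$ removes the Lorentz part and leaves a pure $\theta$-term proportional to $p_\lambda$, independent of $k_\lambda$.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """Number a + theta*b with theta**2 = 0 (illustrative helper, an assumption)."""
    re: float  # coefficient of 1
    du: float  # coefficient of theta

    def __mul__(self, other: "Dual") -> "Dual":
        # (a + theta b)(c + theta d) = ac + theta (ad + bc); the theta^2 term vanishes
        return Dual(self.re * other.re, self.re * other.du + self.du * other.re)


def twist_exponent(p: float, k: float, s: Dual) -> Dual:
    """Return s * (p + theta k), the coefficient of -J_a in the exponent of H_lambda^s."""
    return s * Dual(p, k)


p, k, t = 3.0, 5.0, 2.0
dehn = twist_exponent(p, k, Dual(t, 0.0))   # ordinary Dehn twist parameter t
graft = twist_exponent(p, k, Dual(0.0, t))  # grafting: parameter theta * t

print(dehn)   # both the Lorentz part t*p and the theta part t*k are present
print(graft)  # only the theta part t*p survives; k has dropped out
```

In this toy picture the grafting exponent is a pure translation, in line with the statement that the inserted element (6.7) lies in $\mathbb{R}^3 \subset P_3^\uparrow$.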
8. Concluding Remarks

In this paper we related the geometrical construction of evolving (2+1)-spacetimes via grafting to phase space and Poisson structure in the Chern-Simons formulation of (2+1)-dimensional gravity. We demonstrated how grafting along closed, simple geodesics $\lambda$ is implemented in the Chern-Simons formalism and showed how it gives rise to a transformation on an extended phase space realised as the Poisson manifold $(P_3^\uparrow)^{2g}$. We derived explicit expressions for the action of this transformation on the holonomies of general elements of the fundamental group and proved that it leaves Poisson structure and constraints invariant. Furthermore, we showed that this transformation is generated via the Poisson bracket by a gauge invariant Hamiltonian, the mass $m_\lambda$, and deduced the symmetry relation $\{m_\lambda, s_\eta\} = \{s_\lambda, m_\eta\}$ between the Poisson brackets of mass and spin of general closed curves $\lambda, \eta$. We related the action of grafting on the extended phase space to the action of Dehn twists investigated in [16] and showed that grafting can essentially be viewed as a Dehn twist with a formal parameter $\theta$ satisfying $\theta^2 = 0$.

Together with the results concerning Dehn twists in [16], the results of this paper give rise to a rather concrete understanding of the relation between spacetime geometry and the description of the phase space in terms of holonomies. There are two basic transformations associated to a simple, closed curve $\lambda$ that alter the geometry of (2+1)-spacetimes: grafting and infinitesimal Dehn twists. These transformations are generated via the Poisson bracket by the two basic gauge invariant observables associated to this curve, its mass $m_\lambda$ and the product $m_\lambda s_\lambda$ of its mass and spin, and act on the phase space via Poisson isomorphisms or canonical transformations. This sheds some light on the physical interpretation of these observables.
In analogy to the situation in classical mechanics where momenta generate translations and angular momenta rotations, the two basic observables associated to a simple, closed curve in a (2+1)-dimensional spacetime generate infinitesimal changes in geometry. The grafting operation, generated by its mass, cuts the surface along the curve and translates the two sides of this cut against each other. The infinitesimal Dehn twist, generated by the product of its mass and spin, cuts the surface along the curve and infinitesimally rotates the two sides of the cut with respect to each other.
It would be interesting to investigate the relation between grafting and Poisson structure for other values of the cosmological constant and to see if similar results hold in these cases. In particular, it would be desirable to understand if and how the Wick rotation derived in [13], which relates the grafting procedure for different values of the cosmological constant, manifests itself on the phase space. Although the semidirect product structure of the (2+1)-dimensional Poincaré group gives rise to many simplifications, Fock and Rosly's description of the phase space [30] can also be applied to the Chern-Simons formulation of (2+1)-dimensional gravity with cosmological constant $\Lambda > 0$ and $\Lambda < 0$. For the case of the gauge group $SL(2, \mathbb{C})$ this has been achieved in [33, 34]. Although the resulting description of the Poisson structure is technically more involved than the one for the group $P_3^\uparrow$, it seems in principle possible to investigate transformations generated by the physical observables and to relate them to the corresponding grafting transformations in [13].

Acknowledgements. I thank Laurent Freidel, who showed interest in the transformation generated by the mass observables and suggested that it might be related to grafting. Some of my knowledge on grafting was acquired in discussions with him. Furthermore, I thank Bernd Schroers for useful discussions, answering many of my questions and for proofreading this paper.
References

1. Achucarro, A., Townsend, P.: A Chern–Simons action for three-dimensional anti-de Sitter supergravity theories. Phys. Lett. B 180, 85–100 (1986)
2. Witten, E.: 2+1 dimensional gravity as an exactly soluble system. Nucl. Phys. B 311, 46–78 (1988), Nucl. Phys. B 339, 516–32 (1988)
3. Nelson, J.E., Regge, T.: Homotopy groups and (2+1)-dimensional quantum gravity. Nucl. Phys. B 328, 190–202 (1989)
4. Nelson, J.E., Regge, T.: (2+1) Gravity for genus > 1. Commun. Math. Phys. 141, 211–23 (1991)
5. Nelson, J.E., Regge, T.: (2+1) Gravity for higher genus. Class. Quant. Grav. 9, 187–96 (1992)
6. Nelson, J.E., Regge, T.: The mapping class group for genus 2. Int. J. Mod. Phys. B 6, 1847–1856 (1992)
7. Nelson, J.E., Regge, T.: Invariants of 2+1 quantum gravity. Commun. Math. Phys. 155, 561–568 (1993)
8. Martin, S.P.: Observables in 2+1 dimensional gravity. Nucl. Phys. B 327, 178–204 (1989)
9. Ashtekar, A., Husain, V., Rovelli, C., Samuel, J., Smolin, L.: (2+1) quantum gravity as a toy model for the (3+1) theory. Class. Quant. Grav. 6, L185–L193 (1989)
10. Carlip, S.: Quantum gravity in 2+1 dimensions. Cambridge: Cambridge University Press, 1998
11. Mess, G.: Lorentz spacetimes of constant curvature. Preprint IHES/M/90/28, Avril 1990
12. Benedetti, R., Guadagnini, E.: Cosmological time in (2+1)-gravity. Nucl. Phys. B 613, 330–352 (2001)
13. Benedetti, R., Bonsante, F.: Wick rotations in 3D gravity: ML(H2) spacetimes. http://arxiv.org/list/math.DG/0412470, 2004
14. Meusburger, C., Schroers, B.J.: Poisson structure and symmetry in the Chern-Simons formulation of (2+1)-dimensional gravity. Class. Quant. Grav. 20, 2193–2234 (2003)
15. Meusburger, C., Schroers, B.J.: The quantisation of Poisson structures arising in Chern-Simons theory with gauge group $G \ltimes \mathfrak{g}^*$. Adv. Theor. Math. Phys. 7, 1003–1043 (2004)
16. Meusburger, C., Schroers, B.J.: Mapping class group actions in Chern-Simons theory with gauge group $G \ltimes \mathfrak{g}^*$. Nucl. Phys. B 706, 569–597 (2005)
17. Grigore, D.R.: The projective unitary irreducible representations of the Poincaré group in 1+2 dimensions. J. Math. Phys. 37, 460–473 (1996)
18. Mund, J., Schrader, R.: Hilbert spaces for Nonrelativistic and Relativistic "Free" Plektons (Particles with Braid Group Statistics). In: Albeverio, S., Figari, R., Orlandi, E., Teta, A. (eds.) Proceeding of the Conference "Advances in Dynamical Systems and Quantum Physics", Capri, Italy, 19-22 May, 1993. Singapore: World Scientific, 1995
19. Benedetti, R., Petronio, C.: Lectures on Hyperbolic Geometry. Berlin-Heidelberg: Springer Verlag, 1992
20. Katok, S.: Fuchsian Groups. Chicago: The University of Chicago Press, 1992
21. Goldman, W.M.: Projective structures with Fuchsian holonomy. J. Diff. Geom. 25, 297–326 (1987)
22. Hejhal, D.A.: Monodromy groups and linearly polymorphic functions. Acta Math. 135, 1–55 (1975)
23. Maskit, B.: On a class of Kleinian groups. Ann. Acad. Sci. Fenn. Ser. A 442, 1–8 (1969)
24. Thurston, W.P.: Geometry and Topology of Three-Manifolds. Lecture notes, Princeton University, 1979
25. Thurston, W.P.: Earthquakes in two-dimensional hyperbolic geometry. In: Epstein, D.B. (ed.), Low dimensional topology and Kleinian groups. Cambridge: Cambridge University Press, 1987, pp. 91–112
26. McMullen, C.: Complex Earthquakes and Teichmüller theory. J. Amer. Math. Soc. 11, 283–320 (1998)
27. Sharpe, R.W.: Differential Geometry. New York: Springer Verlag, 1996
28. Matschull, H.-J.: On the relation between (2+1) Einstein gravity and Chern-Simons Theory. Class. Quant. Grav. 16, 2599–609 (1999)
29. Alekseev, A.Y., Malkin, A.Z.: Symplectic structure of the moduli space of flat connections on a Riemann surface. Commun. Math. Phys. 169, 99–119 (1995)
30. Fock, V.V., Rosly, A.A.: Poisson structures on moduli of flat connections on Riemann surfaces and r-matrices. Am. Math. Soc. Transl. 191, 67–86 (1999)
31. Alekseev, A.Y., Grosse, H., Schomerus, V.: Combinatorial quantization of the Hamiltonian Chern-Simons Theory. Commun. Math. Phys. 172, 317–58 (1995)
32. Alekseev, A.Y., Grosse, H., Schomerus, V.: Combinatorial quantization of the Hamiltonian Chern-Simons Theory II. Commun. Math. Phys. 174, 561–604 (1995)
33. Buffenoir, E., Roche, P.: Harmonic analysis on the quantum Lorentz group. Commun. Math. Phys. 207, 499–555 (1999)
34. Buffenoir, E., Noui, K., Roche, P.: Hamiltonian Quantization of Chern-Simons theory with SL(2, C) Group. Class. Quant. Grav. 19, 4953–5016 (2002)

Communicated by G.W. Gibbons
Commun. Math. Phys. 266, 777–795 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0038-9
Communications in
Mathematical Physics
Mott Transition in Lattice Boson Models

R. Fernández (1), J. Fröhlich (2), D. Ueltschi (3)

1 Laboratoire de Mathématiques Raphaël Salem, UMR 6085 CNRS-Université de Rouen, Avenue de l'Université, BP.12, 76821 Saint Etienne du Rouvray, France. E-mail: [email protected]
2 Institut für Theoretische Physik, Eidgenössische Technische Hochschule, 8093 Zürich, Switzerland. E-mail: [email protected]
3 Department of Mathematics, University of Arizona, Tucson, AZ 85721, USA. E-mail: [email protected]

Received: 29 September 2005 / Accepted: 24 January 2006
Published online: 9 May 2006 – © Springer-Verlag 2006
Abstract: We use mathematically rigorous perturbation theory to study the transition between the Mott insulator and the conjectured Bose-Einstein condensate in a hard-core Bose-Hubbard model. The critical line is established to lowest order in the tunneling amplitude.

1. Introduction

Initially introduced in 1989 [9], the Bose-Hubbard model has been the object of much recent work. It represents a simple lattice model of itinerant bosons which interact locally. This model turns out to describe fairly well recent experiments with bosonic atoms in optical lattices [12, 15]. Its low-temperature phase diagram has been uncovered in several studies, both analytical (see e.g. [9, 10, 8]) and numerical [2, 19]. When parameters such as the chemical potential or the tunneling amplitude are varied, the Bose-Hubbard model exhibits a phase transition from a Mott insulating phase to a Bose-Einstein condensate. Figure 2, below, depicts its ground state phase diagram.

In this paper, we investigate the phase diagram of this model in a mathematically rigorous way. We focus on the situation with a small tunneling amplitude, $t$, and a small chemical potential, $\mu$. We construct the critical line between Mott and non-Mott behavior to lowest order in the ratio $t/\mu$. More precisely, we prove the existence of domains with and without Mott insulator. These domains are separated by a comparatively thin stretch; the domain without Mott insulator is widely believed to be a Bose condensate. Our results establish in particular the occurrence of a "quantum phase transition" in the ground state.

Over the years several analytical methods have been developed that are useful for the study of models such as the Bose-Hubbard model. They include a general theory of classical lattice systems with quantum perturbations [3, 5, 6, 16]. These methods can be used to establish the existence of Mott phases for small $t$; but they only apply to domains of parameters far from the transition lines. The Bose-Hubbard model on the complete graph can be studied rather explicitly and its phase diagram is similar to the one of the finite-dimensional model [4]. Results using reflection positivity are mentioned below and only apply to the hard-core model. A related model with an extra chessboard potential was studied in [1] (see also [17]).

Collaboration supported in part by the Swiss National Science Foundation under grant 2-77344-03.

The Bose-Hubbard model is defined as follows. Let $\Lambda \subset \mathbb{Z}^d$ be a finite cube of volume $|\Lambda|$. We introduce the bosonic Fock space

$$\mathcal{F}_\Lambda = \bigoplus_{N \ge 0} \mathcal{H}_{\Lambda, N}, \tag{1.1}$$

where $\mathcal{H}_{\Lambda, N}$ is the Hilbert space of symmetric complex functions on $\Lambda^N$. Creation and annihilation operators for a boson at site $x \in \Lambda$ are denoted by $c_x^\dagger$ and $c_x$, respectively. The Hamiltonian of the Bose-Hubbard model is given by

$$H = -t \sum_{\substack{x, y \in \Lambda \\ |x - y| = 1}} c_x^\dagger c_y + \tfrac{1}{2} U \sum_{x \in \Lambda} c_x^\dagger c_x \big(c_x^\dagger c_x - 1\big). \tag{1.2}$$
The first term in the Hamiltonian represents the kinetic energy; the hopping parameter $t$ is chosen to be positive. The second term is an on-site interaction potential (assuming each particle interacts with all other particles at the same site). The interaction is proportional to the number of pairs of particles; the interaction parameter $U$ is positive, and this corresponds to repulsive interactions. In our construction of the equilibrium state, we work in the grand-canonical ensemble. This amounts to adding a term $-\mu N$ to the Hamiltonian, where $N = \sum_x c_x^\dagger c_x$ is the number operator, and $\mu$ is the chemical potential.

The limit $U \to \infty$ describes the hard-core Bose gas where each site can be occupied by at most one particle. This model is equivalent to the $xy$ model with spin $\frac{1}{2}$ in a magnetic field proportional to $\mu$. Spontaneous magnetization in the spin model corresponds to Bose-Einstein condensation in the boson model. The presence of a Bose condensate has been rigorously established for $\mu = 0$ (the line of hole-particle symmetry). See [7] for a proof valid at low temperatures in three dimensions, and [14] for an analysis of the ground state in two dimensions. The proofs exploit reflection positivity and infrared bounds, a method that was originally introduced for the classical Heisenberg model in [11]. At present, there are no rigorous results about the presence of a condensate for $\mu \ne 0$, or for finite $U$.

The ground state phase diagram of the hard-core Bose gas is depicted in Fig. 1 and reveals three regions: a phase with empty sites, a phase with Bose-Einstein condensation in dimension greater than or equal to two, and a phase with full occupation. Particle-hole symmetry implies that the phase diagram is symmetric around the axis $\mu = 0$. The critical value of the hopping parameter in the ground state of the hard-core (hc) Bose gas is

$$t_c^{\mathrm{hc}}(\mu) = \frac{|\mu|}{2d}. \tag{1.3}$$

This follows by observing that the cost of adding one particle in a state of vanishing density (where interactions are negligible) is $-\mu - 2dt$. For $\mu < -2dt$ the empty configuration minimizes the energy, while for $\mu > -2dt$ a state with sufficiently low, but positive density has negative energy.

[Figure 1: phase diagram in the $(\mu, t)$ plane with regions $\rho = 0$, BEC and $\rho = 1$.]
Fig. 1. Zero temperature phase diagram for the hard-core Bose gas. Bose-Einstein condensation is proved on the line $\mu = 0$, for any $t > 0$. Our perturbation methods provide a quantitative description of the Mott insulator phases with density 0 and 1

The Mott phases of the hard-core Bose gas at zero temperature are stable because of the absence of 'quantum fluctuations': the ground state is just the empty or the full configuration. The hard-core model is an excellent approximation to the general Bose-Hubbard model when $t$ is small and $\mu$ is sufficiently small.

A first insight into the ground state phase diagram of the general Bose-Hubbard model is obtained by restricting the Hamiltonian to low energy configurations. Namely, for $-\infty < \mu \lesssim \frac{1}{2} U$, low energy states have 0 or 1 particle per site. The restricted model is the hard-core Bose gas. Next, for $\frac{1}{2} U \lesssim \mu \lesssim \frac{3}{2} U$, states of lowest energy have 1 or 2 particles per site. The restricted model is again a hard-core Bose gas, but with effective hopping equal to $2t$. We can define projections onto subspaces of low energy states for all $\mu$; corresponding restricted models yield the following approximation for the critical hopping parameter:

$$t_c^{\mathrm{approx}}(\mu) = \begin{cases} \dfrac{|\mu|}{2d} & \text{if } -\infty < \mu < \frac{1}{2} U, \\[2mm] \dfrac{|\mu - kU|}{2d(k+1)} & \text{if } (k - \frac{1}{2}) U < \mu < (k + \frac{1}{2}) U,\ k \ge 1, \end{cases} \tag{1.4}$$

(thin lines in Fig. 2). The true critical line $t_c(\mu)$ agrees with $t_c^{\mathrm{approx}}(\mu)$ up to corrections due to quantum fluctuations. We expect that

$$t_c(\mu) = t_c^{\mathrm{approx}}(\mu) \Big(1 + O\big(\tfrac{t}{U}\big)\Big). \tag{1.5}$$
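The restriction to low energy configurations behind (1.4) can be sketched numerically. The snippet below is an illustration under stated assumptions rather than part of the paper's argument: the function names `onsite_energy` and `t_c_approx` are hypothetical, $U > 0$ is assumed, and at the excluded boundary values $\mu = (k + \frac{1}{2})U$ the code simply picks the upper branch. The on-site energy at $t = 0$ is $\frac{1}{2} U n(n-1) - \mu n$, read off from (1.2) with the term $-\mu N$ added.

```python
import math


def onsite_energy(n: int, mu: float, U: float) -> float:
    """Energy of n bosons on one site at t = 0: (U/2) n (n - 1) - mu n."""
    return 0.5 * U * n * (n - 1) - mu * n


def t_c_approx(mu: float, U: float, d: int) -> float:
    """Approximate critical hopping (1.4); assumes U > 0 and dimension d >= 1."""
    # k = 0 for mu < U/2, otherwise the integer k with (k - 1/2) U < mu < (k + 1/2) U
    k = max(0, math.floor(mu / U + 0.5))
    return abs(mu - k * U) / (2 * d * (k + 1))


U, d = 4.0, 2
# Occupations k and k + 1 are degenerate at mu = k U, where the approximate line touches t = 0
print(onsite_energy(1, U, U), onsite_energy(2, U, U))
print(t_c_approx(-1.0, U, d))  # hard-core branch |mu| / 2d
print(t_c_approx(5.0, U, d))   # k = 1 branch |mu - U| / 4d
```

The factor $k + 1$ in the denominator reflects the effective hopping $(k+1)t$ between occupations $k$ and $k + 1$, which the text states explicitly for $k = 1$.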
In order to state our first result, we recall that the pressure $p(\beta, \mu)$ is defined by

$$p(\beta, \mu) = \lim_{\Lambda \nearrow \mathbb{Z}^d} \frac{1}{|\Lambda|} \log \mathrm{Tr}_{\mathcal{F}_\Lambda}\, e^{-\beta (H - \mu N)}. \tag{1.6}$$

Here the limit is taken over a sequence of boxes of increasing size; standard arguments ensure its existence. Its derivative with respect to the chemical potential is the density; i.e.,

$$\rho(\beta, \mu) = \frac{1}{\beta} \frac{\partial}{\partial \mu}\, p(\beta, \mu). \tag{1.7}$$
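As a sanity check of (1.6) and (1.7), consider the hard-core gas at $t = 0$, where the sites decouple and the trace factorises, so that $p(\beta, \mu) = \log(1 + e^{\beta\mu})$. This special case is an assumption made here for illustration only, and the function names below are hypothetical. The sketch compares a finite-difference evaluation of (1.7) with the closed form $\rho = e^{\beta\mu} / (1 + e^{\beta\mu})$, which for $\mu < 0$ is bounded by $e^{-|\mu|\beta}$.

```python
import math


def pressure_t0(beta: float, mu: float) -> float:
    """Pressure (1.6) of the hard-core gas at t = 0: log(1 + exp(beta * mu))."""
    x = beta * mu
    # Numerically stable evaluation of log(1 + e^x)
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def density_t0(beta: float, mu: float, h: float = 1e-6) -> float:
    """Density (1.7) via a central finite difference in mu."""
    return (pressure_t0(beta, mu + h) - pressure_t0(beta, mu - h)) / (2 * h * beta)


def density_exact(beta: float, mu: float) -> float:
    """Closed form e^{beta mu} / (1 + e^{beta mu}) for the decoupled sites."""
    return 1.0 / (1.0 + math.exp(-beta * mu))


beta, mu = 2.0, -1.0
print(density_t0(beta, mu), density_exact(beta, mu))
```

For negative chemical potential the density is exponentially small in $\beta$, which is the $t = 0$ shadow of the behaviour described for the zero density phase.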
The zero density phase is simpler to analyze because of the absence of quantum fluctuations. The following theorem holds uniformly in U , and therefore also applies to the hard-core model.
[Figure 2: phase diagram in the $(\mu, t)$ plane with lobes $\rho = 0, 1, 2, 3$ and a BEC region; the approximate critical lines are labelled $\mu/2d$, $(\mu - U)/4d$ and $(\mu - 2U)/6d$.]
Fig. 2. Zero temperature phase diagram for the Bose-Hubbard model. Lobes are incompressible phases with integer densities. Thin lines represent the approximate critical line defined in (1.4)
Theorem 1.1 (Zero density phase). For µ < −2dt, there exists β0 such that if β > β0 , we have that (a) the pressure is real analytic in β, µ; (b) ρ(β, µ) < e−aβ . Here, a > 0 depends on t, µ, d, but it is uniform in β and U . This theorem is proven in Sect. 2. The transition lines between the Mott phases of density ρ 1 and the Bose-Einstein condensate are much harder to study because of the presence of quantum fluctuations. We consider a simplified model with a generalized hard-core condition that prevents more than two bosons from occupying a given site. The Hamiltonian is still given by (1.2), but it acts on the Hilbert space spanned by the configurations {0, 1, 2} . The phase diagram of this model is depicted in Fig. 3. This model is the simplest one exhibiting a phase with quantum fluctuations. Notice that, in the limit U → ∞, this model coincides with the usual hard-core model. The zero-density phase and the ρ = 2 phase are characterized by Theorem 1.1. The transition line of the ρ = 1 phase is more complicated. The following theorem shows that it is equal to µ/2d to first order in t/U , as in the hard-core model. Theorem 1.2 (Mott phase ρ = 1 in generalized hard-core model). Assume that µ t2 0 < µ < U4 and t < 2d − const U (with const 211 d). Then there exist β0 and a > 0 (depending on d, t, µ, U ) such that if β > β0 , we have that (a) the pressure is real analytic in β, µ; (b) |ρ(β, µ) − 1| < e−aβ . µ for small t, µ, so that our condition The critical line is expected to be close to t = 2d agrees to first order in t. While we do not state and prove it explicitly, a similar claim −µ holds around 4dt = µ − U . Indeed, the ρ = 1 phase prevails for t U4t for small t. The “quantum Pirogov-Sinai theory” of references [3, 5] applies here and allows to establish the existence of a Mott insulator for low t. Proving that the domain extends µ almost to the line t = 2d requires additional arguments, however; Theorem 1.2 is proved in Sect. 3. 
The generalized hard-core condition considerably simplifies the proof. Indeed, it allows for a cute and convenient representation of the grand-canonical partition function in terms of a gas of non-overlapping oriented space-time loops, see Fig. 4 in Sect. 3. The result is nevertheless expected to hold for the regular Bose-Hubbard model as well.

[Figure 3: phase diagram in the (µ, t) plane, showing the phases ρ = 0, ρ = 1, ρ = 2 and the BEC region; the boundary lines carry the labels µ/2d and (µ − U)/4d.]
Fig. 3. Zero temperature phase diagram for the Bose-Hubbard model with the generalized hard-core condition

While we cannot establish the presence of a Bose-Einstein condensate, we can prove the absence of Mott insulating phases away from the critical lines, by establishing bounds on the density of the system.

Theorem 1.3 (Absence of Mott phases).
(a) For t > −µ/2d and for any Λ large enough, the density of the ground state is bounded below by a strictly positive constant that depends on t, µ but not on Λ. This applies to the model with or without hard-core condition.
(b) Consider the model with generalized hard cores. For t > $\frac{\mu}{2d} + C t \left(\frac{t}{U}\right)^{\frac{2}{d+2}}$ and for any Λ large enough, the density of the ground state is less than a constant that is strictly less than 1; it depends on t, µ but not on Λ.

This theorem is proved in Sect. 4. It is shown that one can take $C = \frac{d+2}{2d}\,(2^{10} d \pi^d)^{\frac{2}{d+2}}$.

Quantum fluctuations have some influence on the phase diagram, and a detailed discussion is necessary. "Quantum fluctuations" are fluctuations in the ground state around the constant configuration with k bosons at each site, for some k depending on µ and t. They are present in Mott phases for k ≥ 1, while the ground state for µ < −2dt is simply the empty configuration. Quantum fluctuations are not present in effective hard-core models where each site is allowed either k or k + 1 bosons. Their presence lowers the energy of both Mott and Bose condensate states. The key question is which phase benefits most from them. In other words, writing the critical hopping parameter as
\[ t_c(\mu) = t_c^{\mathrm{approx}}(\mu)\,\bigl[1 + a(\tfrac{t}{U})\bigr], \tag{1.8} \]
the question is about the sign of a(t/U), for small t/U. The study in [10], based on expansion methods (no attempt at a rigorous control of convergence is made), suggests a rather surprising answer: the sign of a(t/U) depends on the dimension! Namely, the quantum fluctuations favor Mott phases for d = 1, and they favor the Bose condensate for d ≥ 2. We expect that this question can be rigorously settled by combining the partial diagonalization method of [6] with our expansions in Sect. 3.
2. Low-Density Expansions

In this section, we present a Feynman-Kac expansion of the partition function adapted to the study of quantum states that are perturbations of the zero density phase. In this situation, quantum effects are reduced to a minimum, amounting basically to the combinatorics related to particle indistinguishability. Nevertheless, the resulting cluster expansion must deal with two difficult points: arbitrarily large numbers of bosons and closeness to the transition line. Both difficulties are resolved by estimating the entropy of space-time trajectories in a way inspired by Kennedy's study of the Heisenberg model [13]; the present situation is actually simpler.

The grand-canonical partition function of the Bose-Hubbard model is given by
\[ Z(\beta, \Lambda, \mu) = \operatorname{Tr}\, e^{-\beta(H - \mu N)}, \tag{2.1} \]
where the trace is taken over the bosonic Fock space. A standard Feynman-Kac expansion yields an expression for Z in terms of “space-time trajectories”, i.e. continuous-time nearest-neighbor random walks. More precisely,
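\[ Z(\beta, \Lambda, \mu) = \sum_{N \ge 0} \frac{e^{\beta\mu N}}{N!} \sum_{x_1,\dots,x_N \in \Lambda} \sum_{\pi \in S_N} \int d\nu^{\beta}_{x_1 x_{\pi(1)}}(\theta_1) \dots d\nu^{\beta}_{x_N x_{\pi(N)}}(\theta_N)\, \exp\Bigl\{ -U \sum_{1 \le i < j \le N} \int_0^{\beta} \delta_{\theta_i(\tau),\theta_j(\tau)}\, d\tau \Bigr\}. \tag{2.2} \]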
Here, θ denotes a space-time trajectory, i.e. θ is a map [0, β] → Λ that is constant except for finitely many "jumps" at times 0 < τ1 < · · · < τm < β, and |θ(τj−) − θ(τj+)| = 1. The "measure" $\nu^{\beta}_{xy}$ on trajectories starting at x and ending at y introduced in Eq. (2.2) is a shortcut for the following operation. If f is a function on trajectories, then
\[ \int d\nu^{\beta}_{xy}(\theta)\, f(\theta) = \sum_{m \ge 0} t^m \sum_{\substack{x_1,\dots,x_{m-1} \\ |x_j - x_{j-1}| = 1}} \int_{0<\tau_1<\dots<\tau_m<\beta} d\tau_1 \dots d\tau_m\, f(\theta). \tag{2.3} \]
The second sum is over nearest-neighbor sites such that |x1 − x| = |x_{m−1} − y| = 1. The trajectory θ on the right side of (2.3) is given by
\[ \theta(\tau) = x_j \quad \text{for } \tau \in [\tau_j, \tau_{j+1}), \]
where (x0, τ0) = (x, 0) and (xm, τ_{m+1}) = (y, β).

The underlying trace operation constrains the ensemble of trajectories to satisfy a periodicity condition in the "β-direction". The initial and final particle configurations must be identical, modulo particle indistinguishability. This explains the sum over permutations of N elements, π ∈ S_N, on the right side of (2.2).

We shall rewrite the expression (2.2) for the partition function in a form that fits into the framework of cluster expansions. The main result of cluster expansions is summarized in the appendix, and it is enough for our purpose. Trajectories are correlated because of (i) the interactions in the exponential factors of (2.2) which penalize intersections, and (ii) the permutations linking initial and final sites
of different trajectories. The cluster expansion is designed to handle the former factors, but we need to deal first with the latter issue so as to fall into the required framework. To this end, we concatenate each original trajectory with the one starting at its final site, so as to obtain a single closed trajectory that wraps several times around the β axis. Hence, instead of open trajectories [0, β] → Λ, we consider ensembles of closed trajectories $\theta_i : [0, \ell_i\beta] \to \Lambda$, with $\ell_i$ being their winding number. Each such closed trajectory corresponds to a cycle of length $\ell_i$ of the permutation π determined by the endpoints of the component open trajectories. For each cycle, the sum over $\ell_i$ sites in Λ and the integrals over the $\ell_i$ enchained open trajectories can be written as a sum over a single site $x_i$, followed by an integral over closed trajectories with $\theta_i(0) = \theta_i(\ell_i\beta) = x_i$. Recalling that there are $\frac{N!}{k!\,\prod_{i=1}^{k} \ell_i}$ permutations with k cycles of lengths $\ell_1, \dots, \ell_k$, we obtain the following expansion of the partition function in terms of closed trajectories instead of particles:
\[ Z(\beta, \Lambda, \mu) = \sum_{k \ge 0} \frac{1}{k!} \sum_{x_1,\dots,x_k \in \Lambda} \sum_{\ell_1,\dots,\ell_k \ge 1} \int d\nu^{\ell_1\beta}_{x_1 x_1}(\theta_1) \dots d\nu^{\ell_k\beta}_{x_k x_k}(\theta_k) \prod_{i=1}^{k} w(\theta_i) \prod_{1 \le i < j \le k} \bigl[ 1 - \zeta_U(\theta_i, \theta_j) \bigr]. \tag{2.4} \]
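The counting of permutations used to pass from (2.2) to (2.4) can be checked by brute force for small N. The sketch below is only an illustration (the function names are ours, not part of the paper): it sums N!/(k! ∏ ℓ_i) over all ordered tuples (ℓ_1, ..., ℓ_k) with ℓ_1 + ... + ℓ_k = N and compares the result with the number of permutations of N elements that have exactly k cycles.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial


def cycle_count(perm):
    """Number of cycles of a permutation given as a tuple, perm[i] = image of i."""
    seen, cycles = set(), 0
    for start in range(len(perm)):
        if start in seen:
            continue
        cycles += 1
        i = start
        while i not in seen:
            seen.add(i)
            i = perm[i]
    return cycles


def count_by_enumeration(N, k):
    """Count permutations of N elements with exactly k cycles, by listing them all."""
    return sum(1 for p in permutations(range(N)) if cycle_count(p) == k)


def count_by_formula(N, k):
    """Sum of N! / (k! * l_1 * ... * l_k) over ordered cycle lengths adding up to N."""
    total = Fraction(0)
    for lengths in product(range(1, N + 1), repeat=k):
        if sum(lengths) != N:
            continue
        denom = factorial(k)
        for length in lengths:
            denom *= length
        total += Fraction(factorial(N), denom)
    return total


for N in range(1, 6):
    for k in range(1, N + 1):
        assert count_by_formula(N, k) == count_by_enumeration(N, k)
```

Exact rational arithmetic is used because individual terms such as 3!/(2!·1·2) are not integers; only the sum over ordered tuples is.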
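Let ℓ(θ) denote the winding number of the trajectory θ : [0, ℓ(θ)β] → Λ. Its weight is defined by
\[ w(\theta) = \frac{1}{\ell(\theta)}\, e^{\beta\mu\ell(\theta)} \exp\bigl\{ -U\, W(\theta) \bigr\}. \tag{2.5} \]
Here, W(θ) measures the self-intersection of θ, that is,
\[ W(\theta) = \sum_{0 \le i < j \le \ell(\theta)-1} \int_0^{\beta} \delta_{\theta(i\beta+\tau),\,\theta(j\beta+\tau)}\, d\tau. \tag{2.6} \]
It will suffice to use the bound $w(\theta) \le \frac{1}{\ell(\theta)} e^{\beta\mu\ell(\theta)}$. Finally, interactions between trajectories θ and θ′ are given by
\[ \zeta_U(\theta, \theta') = 1 - \exp\bigl\{ -U\, W(\theta, \theta') \bigr\}. \tag{2.7} \]
Here, W(θ, θ′) measures the overlap between trajectories θ and θ′,
\[ W(\theta, \theta') = \sum_{i=0}^{\ell(\theta)-1} \sum_{j=0}^{\ell(\theta')-1} \int_0^{\beta} \delta_{\theta(i\beta+\tau),\,\theta'(j\beta+\tau)}\, d\tau. \tag{2.8} \]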
Expression (2.4) is suited for an application of Theorem A.1. We show that the weights w(θ ) are small in the sense that they satisfy the “Kotecký-Preiss criterion” (A.4).
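Proposition 2.1. For each closed trajectory θ let j(θ) denote the number of jumps of θ. Then, there exist constants a, b > 0 such that
\[ \sum_{x \in \Lambda} \sum_{\ell \ge 1} \frac{e^{\beta\mu\ell}}{\ell} \int d\nu^{\ell\beta}_{xx}(\theta')\, e^{a j(\theta') + \beta b \ell}\, \zeta_U(\theta, \theta') \le a j(\theta) + \beta b\, \ell(\theta). \]

Proof. Since ζ_U(θ, θ′) is increasing in U, it is enough to prove that, for any trajectory θ,
\[ \sum_{x} \sum_{\ell \ge 1} \frac{e^{\beta\ell(\mu+b)}}{\ell} \int d\nu^{\ell\beta}_{xx}(\theta')\, e^{a j(\theta')}\, \zeta_\infty(\theta, \theta') \le a j(\theta) + \beta b\, \ell(\theta), \tag{2.9} \]
with
\[ \zeta_\infty(\theta, \theta') = \chi\bigl[ \theta(i\beta + \tau) = \theta'(k\beta + \tau) \text{ for some } 0 \le i \le \ell(\theta) - 1,\ 0 \le k \le \ell(\theta') - 1,\ 0 < \tau < \beta \bigr]. \]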
Here, χ[·] denotes the characteristic function of the event in brackets. A trajectory θ′ intersects θ if a jump of θ intersects a vertical line of θ′, or if a jump of θ′ intersects a vertical line of θ (or both). Let $\nu^{\beta}_0$ denote the measure on trajectories [0, β] → Z^d, starting at x = 0 and with a jump at τ = 0. Integration with respect to $\nu^{\beta}_0$ can be defined similarly as in (2.3); formally, we can also write
\[ \int d\nu^{\beta}_0(\theta)\, f(\theta) = \int d\nu^{\beta}_{00}(\theta)\, \delta(\tau_1)\, f(\theta), \tag{2.10} \]
where $\nu^{\beta}_{00}$ is as in (2.3) (with x = y = 0), and where the Dirac function δ(τ1) forces the first jump to occur at τ1 = 0. We get an upper bound by neglecting the restriction that trajectories need to remain in Λ. The left side of (2.9) is then bounded by
\[ j(\theta) \sum_{\ell \ge 1} \frac{e^{\beta\ell(\mu+b)}}{\ell} \int d\nu^{\ell\beta}_{00}(\theta')\, e^{a j(\theta')} + \beta\ell(\theta) \sum_{\ell \ge 1} \frac{e^{\beta\ell(\mu+b)}}{\ell} \int d\nu^{\ell\beta}_{0}(\theta')\, e^{a j(\theta')}. \tag{2.11} \]
The first term accounts for trajectories θ′ intersecting jumps of θ; the second term accounts for trajectories θ′ involving a jump that intersects a vertical line of θ. We integrate over all trajectories [0, ℓβ] → Z^d that start at x = 0, without requiring them to stay in Λ. Each trajectory θ′ in the last integral of (2.11) can be decomposed into the jump from a neighbor z into 0, which contributes a factor $t e^{a}$ (see definition (2.3)), plus a trajectory from z to 0. As 0 has 2d neighbors we see that
\[ \int d\nu^{\ell\beta}_{0}(\theta')\, e^{a j(\theta')} \le 2dt\, e^{a} \int d\nu^{\ell\beta}_{z0}(\theta')\, e^{a j(\theta')}. \tag{2.12} \]
Furthermore, the definition (2.3) implies that for every x, y,
\[ \int d\nu^{\beta}_{xy}(\theta)\, e^{a j(\theta)} = \sum_{m \ge 0} (t e^{a})^m \sum_{\substack{x_1,\dots,x_{m-1} \\ |x_j - x_{j-1}| = 1 \\ |x_1 - x| = |y - x_{m-1}| = 1}} \int_{0<\tau_1<\dots<\tau_m<\beta} d\tau_1 \dots d\tau_m \le e^{t e^{a}\, 2d\, \beta}. \tag{2.13} \]
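From (2.9), (2.11), (2.12) and (2.13) we conclude that
\[ \sum_{x} \sum_{\ell \ge 1} \frac{e^{\beta\ell(\mu+b)}}{\ell} \int d\nu^{\ell\beta}_{xx}(\theta')\, e^{a j(\theta')}\, \zeta_\infty(\theta, \theta') \le \bigl[ j(\theta) + 2dt\, e^{a} \beta\ell(\theta) \bigr] \sum_{\ell \ge 1} \exp\bigl\{ \beta\ell\, [\mu + 2dt\, e^{a} + b] \bigr\}. \tag{2.14} \]
As µ + 2dt < 0, we can choose a and b such that µ + 2dt e^a + b < 0. Then (2.9) holds for β large enough.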
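The choice of a and b in the last step can be made explicit. The following sketch is one possible choice among many, not the one used in the paper: it splits the gap −(µ + 2dt) into three equal parts, one absorbed by the factor e^a in the hopping term, one by b, and one kept as margin. The function name and the equal split are assumptions made for illustration.

```python
import math


def choose_a_b(mu, t, d):
    """Return (a, b) with a, b > 0 and mu + 2*d*t*exp(a) + b < 0.

    Requires t > 0 and mu + 2*d*t < 0 (the zero density regime).
    """
    gap = -(mu + 2 * d * t)
    if t <= 0 or gap <= 0:
        raise ValueError("need t > 0 and mu + 2dt < 0")
    # Choose a so that 2dt(e^a - 1) = gap/3, and b = gap/3.
    a = math.log(1.0 + gap / (6 * d * t))
    b = gap / 3.0
    return a, b


mu, t, d = -3.0, 1.0, 1
a, b = choose_a_b(mu, t, d)
margin = mu + 2 * d * t * math.exp(a) + b  # equals -gap/3 up to rounding
```

With this choice the exponent in (2.14) equals −β times one third of the gap per winding, so the geometric sum over the winding number tends to zero as β → ∞.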
Proof of Theorem 1.1. Recall expression (1.6) for the pressure. Proposition 2.1 establishes the convergence of cluster expansions, as stated in Theorem A.1. With ϕ denoting the usual combinatorial function of cluster expansions, see (A.2), the partition function has the absolutely convergent expression
\[ Z(\beta, \Lambda, \mu) = \exp\Bigl\{ \sum_{m \ge 1} \sum_{x_1,\dots,x_m \in \Lambda} \sum_{\ell_1,\dots,\ell_m \ge 1} \int d\nu^{\ell_1\beta}_{x_1 x_1}(\theta_1) \dots \int d\nu^{\ell_m\beta}_{x_m x_m}(\theta_m)\, \varphi(\theta_1, \dots, \theta_m) \prod_{j=1}^{m} w(\theta_j) \Bigr\}. \tag{2.15} \]
Taking the logarithm and dividing by the volume, standard arguments show that boundary terms vanish in the thermodynamic limit, and we obtain
\[ p(\beta, \mu) = \sum_{m \ge 1} \sum_{x_2,\dots,x_m \in \mathbb{Z}^d} \sum_{\ell_1,\dots,\ell_m \ge 1} \int d\nu^{\ell_1\beta}_{00}(\theta_1) \int d\nu^{\ell_2\beta}_{x_2 x_2}(\theta_2) \dots \int d\nu^{\ell_m\beta}_{x_m x_m}(\theta_m)\, \varphi(\theta_1, \dots, \theta_m) \prod_{j=1}^{m} w(\theta_j). \tag{2.16} \]
Integrals can be viewed as functions of β, µ, indexed by m, (x_i), and (ℓ_i). They are real analytic in the domain {(β, µ) : β > β0(µ)}. Their sum is absolutely convergent and Vitali's convergence theorem implies that p(β, µ) is analytic.

Recall that the density is given by the derivative of the pressure with respect to the chemical potential; see (1.7) for the precise definition. The analyticity implied by the expansion allows for term-by-term differentiation. We can check that
\[ \rho(\beta, \mu) = \sum_{m \ge 1} m \sum_{x_2,\dots,x_m \in \mathbb{Z}^d} \sum_{\ell_1,\dots,\ell_m \ge 1} \int d\nu^{\ell_1\beta}_{00}(\theta_1)\, \frac{\partial w(\theta_1)}{\partial\mu} \int d\nu^{\ell_2\beta}_{x_2 x_2}(\theta_2) \dots \int d\nu^{\ell_m\beta}_{x_m x_m}(\theta_m)\, \varphi(\theta_1, \dots, \theta_m) \prod_{j=2}^{m} w(\theta_j). \tag{2.17} \]
Note that $\frac{\partial}{\partial\mu} w(\theta) = \beta\ell(\theta)\, w(\theta)$, as follows from definition (2.5) of the weight of trajectories. By (A.5), we have the bound
\[ \rho(\beta, \mu) \le \beta \sum_{\ell_1 \ge 1} \ell_1 \int d\nu^{\ell_1\beta}_{00}(\theta)\, w(\theta)\, e^{a(\theta)}. \tag{2.18} \]
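There exists ε > 0 such that µ + 2dt e^a + b + ε < 0. Using (2.13), we get
\[ \rho(\beta, \mu) \le \beta\, e^{-\varepsilon\beta} \sum_{\ell_1 \ge 1} \ell_1\, e^{\beta\ell_1 [\mu + 2dt\, e^{a} + b + \varepsilon]}. \tag{2.19} \]
Then ρ(β, µ) ≤ $e^{-\varepsilon\beta}$ for β large enough, and this completes the proof of Theorem 1.1.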
3. Space-Time Loop Representation

The study of the transition line for the Mott phase with unit density requires the analysis of perturbations of the "vacuum" formed by one particle at each site. This involves the control of full-fledged quantum fluctuations. We turn, then, to a more general expansion setting previously employed to study spin and fermionic systems [3, 5]. This setting shares some similarities with that of Sect. 2, but it also differs from it in significant ways. We use the same symbols ν, w, ζ, ℓ, θ, but we caution the reader that they are defined in slightly different ways.

Besides the quantum-fluctuation issue, bosonic systems present the additional complication of the unboundedness of occupation numbers. In the present paper we wish to leave this second issue aside. We consider, thus, the model with a generalized hard-core condition that ensures that configurations have at most two bosons at each site.

Recall definition (2.1) of the grand-canonical partition function. It is convenient to write
\[ H - \mu N = V + T, \tag{3.1} \]
where V denotes the diagonal terms (i.e., interactions and chemical potential terms) in the basis of occupation numbers in position space, and T denotes the hopping terms. We will consider T to be a perturbation of V. Our expansion is based on Duhamel's formula,
\[ e^{-\beta(V+T)} = e^{-\beta V} + \int_0^{\beta} d\tau\, e^{-\tau V} (-T)\, e^{-(\beta-\tau)(V+T)}, \tag{3.2} \]
which we can iterate to obtain
\[ e^{-\beta(V+T)} = \sum_{m \ge 0} \int_{0<\tau_1<\dots<\tau_m<\beta} d\tau_1 \dots d\tau_m\, e^{-\tau_1 V} (-T)\, e^{-(\tau_2-\tau_1)V} \dots (-T)\, e^{-(\beta-\tau_m)V}. \tag{3.3} \]
Then
\[ Z(\beta, \Lambda, \mu) = \operatorname{Tr}\, e^{-\beta(V+T)} = \sum_{m \ge 0} t^m \int_{0<\tau_1<\dots<\tau_m<\beta} d\tau_1 \dots d\tau_m \sum_{(x_1,y_1),\dots,(x_m,y_m)} \operatorname{Tr}\, e^{-\tau_1 V} c^{\dagger}_{x_1} c_{y_1} e^{-(\tau_2-\tau_1)V} \dots c^{\dagger}_{x_m} c_{y_m} e^{-(\beta-\tau_m)V}. \tag{3.4} \]
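Duhamel's formula (3.2) is an exact identity between matrices and can be tested numerically. The sketch below is an illustration only: it uses a hypothetical two-level example (a diagonal V and an off-diagonal T chosen by us), a truncated Taylor series for the matrix exponential, and Simpson's rule for the τ-integral. None of these numerical choices come from the paper.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scaled(A, c):
    return [[c * x for x in row] for row in A]


def added(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def expm(A, terms=60):
    """Matrix exponential by truncated Taylor series (fine for small norms)."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = scaled(matmul(term, A), 1.0 / k)
        result = added(result, term)
    return result


def duhamel_rhs(V, T, beta, steps=400):
    """Right side of (3.2), with the integral evaluated by Simpson's rule."""
    H = added(V, T)

    def integrand(tau):
        left = expm(scaled(V, -tau))
        right = expm(scaled(H, -(beta - tau)))
        return matmul(matmul(left, scaled(T, -1.0)), right)

    h = beta / steps  # steps must be even for Simpson's rule
    total = added(integrand(0.0), integrand(beta))
    for i in range(1, steps):
        total = added(total, scaled(integrand(i * h), 4.0 if i % 2 else 2.0))
    return added(expm(scaled(V, -beta)), scaled(total, h / 3.0))


def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


V = [[0.0, 0.0], [0.0, 1.5]]    # diagonal part (illustrative values)
T = [[0.0, -0.3], [-0.3, 0.0]]  # off-diagonal "hopping" part (illustrative values)
beta = 2.0
lhs = expm(scaled(added(V, T), -beta))
rhs = duhamel_rhs(V, T, beta)
```

Iterating the same identity inside the integrand produces the time-ordered series (3.3).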
We denote by $n = (n_x)_{x \in \Lambda}$, $n_x \in \mathbb{N}$, a "classical configuration" that represents the state where $n_x$ bosons are located at site x, and by $|n\rangle$ the corresponding normalized vector.
Inserting projector decompositions $1 = \sum_{n_i} |n_i\rangle\langle n_i|$ the trace can be written as
\[ \operatorname{Tr}\, e^{-\tau_1 V} c^{\dagger}_{x_1} c_{y_1} e^{-(\tau_2-\tau_1)V} \dots c^{\dagger}_{x_m} c_{y_m} e^{-(\beta-\tau_m)V} = \sum_{n_0,n_1,\dots,n_m} \langle n_0|\, e^{-\tau_1 V} c^{\dagger}_{x_1} c_{y_1} |n_1\rangle \langle n_1|\, e^{-(\tau_2-\tau_1)V} \dots c^{\dagger}_{x_m} c_{y_m} |n_m\rangle \langle n_m|\, e^{-(\beta-\tau_m)V} |n_0\rangle. \tag{3.5} \]
As the operator V is diagonal in the basis $\{|n\rangle\}$, this decomposition allows us to rewrite the expansion (3.4) in the form
\[ Z(\beta, \Lambda, \mu) = \int d\nu(\mathbf{n})\, w(\mathbf{n}), \tag{3.6} \]
where
(i) $\mathbf{n}$ is a space-time quantum configuration, namely an assignment of a configuration n(τ), for each 0 < τ < β, such that
– $\mathbf{n}$ is constant in τ, except at finitely many times τ1 < · · · < τm, with m even.
– At each τ_i, a "jump" occurs, i.e. there are nearest-neighbor sites (x_i, y_i) such that
\[ n_x(\tau_i+) = \begin{cases} n_x(\tau_i-) + 1 & \text{if } x = x_i, \\ n_x(\tau_i-) - 1 & \text{if } x = y_i, \\ n_x(\tau_i-) & \text{otherwise.} \end{cases} \tag{3.7} \]
– $\mathbf{n}$ is periodic in the τ direction: n(β) = n(0).
(ii) $w(\mathbf{n})$ are positive weights defined by
\[ w(\mathbf{n}) = \exp\Bigl\{ -\int_0^{\beta} V(n(\tau))\, d\tau \Bigr\} \prod_{i=1}^{m} t \sqrt{n_{x_i}(\tau_i+)\, n_{y_i}(\tau_i-)}, \tag{3.8} \]
with the short-hand notation $V(n) \equiv \langle n|\, V\, |n\rangle$.
(iii) Integration with respect to the "measure" ν on quantum configurations stands for a sum over configurations at time 0, a sum over m, integrals over jumping times, and sums over locations of jumps.

The expansion just obtained is rather general. It is convenient to interpret it in terms of random geometrical objects in a model-dependent fashion. For the case of interest here, we follow the "excitations", namely the sites where the occupation number is different from the vacuum value 1. We therefore embed the "space time" Λ × [0, β] in the cylinder $\mathbb{R}^d \times S^1$ (with periodic boundary conditions in the time direction) and decompose the trajectories of the excitations in connected components. In this way, a quantum configuration $\mathbf{n}$ can be represented as a set of non-intersecting loops (with winding numbers n = 0, ±1, ±2, . . . ) in this cylinder. The representation is defined by the following rules:
• The constant configuration $\mathbf{n}$ with $n_x(\tau) = 1$, for all x ∈ Λ and 0 ≤ τ ≤ β, has no loops.
• A jump of a boson from y_i to x_i at time τ_i (see (3.7)) is represented by a horizontal arrow from (y_i, τ_i) to (x_i, τ_i).
• The points (x, τ) with $n_x(\tau) \ne 1$ are represented by vertical segments. These segments point upwards if $n_x(\tau) = 2$, and downwards if $n_x(\tau) = 0$.

Loops are illustrated in Fig. 4. Similar representations have been used in various contexts, e.g. in a study of the Falicov-Kimball model [18]. Given a loop γ, we introduce the number of jumps j(γ) (always an even number, possibly zero); the length ℓ0(γ) of all vertical segments pointing downwards; the length ℓ2(γ) of all vertical segments pointing upwards; ℓ(γ) = ℓ0(γ) + ℓ2(γ); and the winding number z(γ). Notice that ℓ2(γ) − ℓ0(γ) = βz(γ). A loop γ defines a unique quantum configuration $\mathbf{n}^{\gamma}$. We define the weight of a loop as
\[ w(\gamma) = t^{j(\gamma)} \prod_{i=1}^{j(\gamma)} \sqrt{n^{\gamma}_{y_i}(\tau_i-)\, n^{\gamma}_{x_i}(\tau_i+)}\;\, e^{-\ell_0(\gamma)\mu}\, e^{-\ell_2(\gamma)(U-\mu)}. \tag{3.9} \]
Note that we have subtracted the classical energy of the background configuration with one boson at each site. The weight w(γ) thus only depends on excitation energies. These definitions allow us to rewrite the partition function (3.6) in terms of loops and their weights instead of space-time configurations. Unlike the trajectories of Sect. 2, the loops here have only a hard-core interaction due to the requirement of non-intersection. Furthermore, if Γ = {γ1, . . . , γm} is a set of disjoint loops, we have the important property that the weight of the corresponding quantum configuration $\mathbf{n}_{\Gamma}$ factorizes,
\[ w(\mathbf{n}_{\Gamma}) = e^{\beta\mu|\Lambda|} \prod_{i=1}^{m} w(\gamma_i). \tag{3.10} \]
We define a measure on loops, also denoted ν, and we rewrite the partition function as
\[ Z(\beta, \Lambda, \mu) = e^{\beta\mu|\Lambda|} \sum_{m \ge 0} \frac{1}{m!} \int d\nu(\gamma_1) \dots d\nu(\gamma_m) \prod_{i=1}^{m} w(\gamma_i) \prod_{1 \le i < j \le m} \bigl[ 1 - \zeta(\gamma_i, \gamma_j) \bigr]. \tag{3.11} \]
Here, the term corresponding to m = 0 is set to $e^{\beta\mu|\Lambda|}$, and the function ζ(γ, γ′) equals 1 if the loops γ and γ′ intersect (more precisely, if some of their vertical segments intersect), and equals 0 if the loops have disjoint support.

[Figure 4: space-time loops drawn in Λ × [0, β]; vertical segments are labeled by the occupation numbers 0 and 2.]
Fig. 4. Illustration for the gas of space-time loops. There are three loops with respective winding numbers 1, 0, and −1
The expression (3.11) for the partition function is an adequate starting point for the method of cluster expansions. We prove that the weights are small so as to satisfy the "Kotecký-Preiss criterion", Eq. (A.4). We can then appeal to Theorem A.1 to conclude that the cluster expansion converges.

Proposition 3.1. Under the hypotheses of Theorem 1.2, we have that, for any loop γ,
\[ \int d\nu(\gamma')\, w(\gamma')\, \zeta(\gamma, \gamma')\, e^{a(\gamma')} \le a(\gamma) \]
with
\[ a(\gamma) = 2^{14} d\, \frac{t^2}{U^2}\, j(\gamma) + 2^{12} d^2\, \frac{t^2}{U}\, \ell(\gamma). \tag{3.12} \]
Its proof relies on the bounds stated in the following lemma. Let us partition the set of loops into $\mathcal{L} = \mathcal{L}^{(0+)} \cup \mathcal{L}^{(-)}$, where $\mathcal{L}^{(0+)}$ (resp. $\mathcal{L}^{(-)}$) is the set of loops with nonnegative (resp. negative) winding numbers. For each site z we introduce the measures $\nu_z$ on loops that make a jump at time τ = 0 involving z. Further, we let $\mathcal{L}_z$, $\mathcal{L}^{(0+)}_z$, and $\mathcal{L}^{(-)}_z$ denote the sets of loops that contain (z, 0).

Lemma 3.2. Under the hypotheses of Theorem 1.2, for any site z,
(a) $\int_{\mathcal{L}^{(0+)}} d\nu_z(\gamma)\, w(\gamma)\, e^{a(\gamma)} \le 2^{11} d\, \frac{t^2}{U}$,
(b) $\int_{\mathcal{L}^{(0+)}_z} d\nu(\gamma)\, w(\gamma)\, e^{a(\gamma)} \le e^{-\beta(U - \mu - 2^{12} d^2 \frac{t^2}{U})} + 2^{13} d\, \frac{t^2}{U^2}$,
(c) $\int_{\mathcal{L}^{(-)}} d\nu_z(\gamma)\, w(\gamma)\, e^{a(\gamma)} \le 4dt\, e^{-\beta(\mu - 2dt - 2^{12} d^2 \frac{t^2}{U})}$,
(d) $\int_{\mathcal{L}^{(-)}_z} d\nu(\gamma)\, w(\gamma)\, e^{a(\gamma)} \le e^{-\beta(\mu - 2dt - 2^{12} d^2 \frac{t^2}{U})}$.

Proof of Proposition 3.1. Suppose that the loops γ and γ′ intersect, i.e. ζ(γ, γ′) = 1. Then either a jump of γ′ intersects a vertical line of γ, or a jump of γ intersects a vertical line of γ′ (both may happen at the same time). The first situation is analyzed using the measures $\nu_z$, and the second situation involves the sets $\mathcal{L}^{(0+)}_z$ and $\mathcal{L}^{(-)}_z$. More precisely, we have that
\[ \int d\nu(\gamma')\, w(\gamma')\, \zeta(\gamma, \gamma')\, e^{a(\gamma')} \le \ell(\gamma) \sup_z \int_{\mathcal{L}} d\nu_z(\gamma')\, w(\gamma')\, e^{a(\gamma')} + j(\gamma) \sup_z \int_{\mathcal{L}_z} d\nu(\gamma')\, w(\gamma')\, e^{a(\gamma')}. \tag{3.13} \]
Using the estimates in Lemma 3.2, the right side is seen to be smaller than a(γ), provided β is large enough.

Proof of Lemma 3.2, (a) and (b). Loops of $\mathcal{L}^{(0+)}$ have large energy cost, so crude entropy estimates are enough. Since ℓ2(γ) ≥ ℓ0(γ) for any loop γ ∈ $\mathcal{L}^{(0+)}$, we have that µℓ0(γ) + (U − µ)ℓ2(γ) ≥ ½ Uℓ(γ). Then
\[ \mu\ell_0(\gamma) + (U - \mu)\ell_2(\gamma) - 2^{12} d^2\, \frac{t^2}{U}\, \ell(\gamma) \ge \tfrac{1}{4} U \ell(\gamma). \tag{3.14} \]
Further, we can check that
\[ e^{2^{14} d\, t^2/U^2} < 2. \tag{3.15} \]
From these observations and (3.9), we obtain that
\[ w(\gamma)\, e^{a(\gamma)} \le \begin{cases} e^{-\beta(U - \mu - 2^{12} d^2 \frac{t^2}{U})} & \text{if } j = 0, \\ (4t)^{j(\gamma)}\, e^{-\frac{1}{4} U \ell(\gamma)} & \text{if } j \ge 2. \end{cases} \tag{3.16} \]
A loop with j(γ) = 2n is characterized by a sequence of jump times 0 ≤ τ1 < τ2 < · · · < τ2n. At each such time the trajectory can choose among at most 2d neighbors to jump to and 2 directions of time to proceed after the jump. The last jump is determined by the fact that γ must be a loop, so there is no factor 2d (but both time directions are possible). The measure $\nu_z$ involves only loops with two jumps or more. From the last bound in (3.16) we obtain
\[ \int_{\mathcal{L}^{(0+)}} d\nu_z(\gamma)\, w(\gamma)\, e^{a(\gamma)} \le \sum_{n \ge 1} 2 \cdot 2^{2n} (2d)^{2n-1} (4t)^{2n} \Bigl[ \int_0^{\infty} d\tau\, e^{-\frac{1}{4} U \tau} \Bigr]^{2n-1} = \frac{2^{10} d\, t^2/U}{1 - (2^{6} d t / U)^2} \le 2^{11} d\, \frac{t^2}{U}. \tag{3.17} \]
Part (b) of the lemma follows from (3.16) and from considerations similar to (3.17). Namely,
\[ \int_{\mathcal{L}^{(0+)}_z} d\nu(\gamma)\, w(\gamma)\, e^{a(\gamma)} \le e^{-\beta(U - \mu - 2^{12} d^2 \frac{t^2}{U})} + \sum_{n \ge 1} 2 \cdot 2^{2n} (2d)^{2n-1} (4t)^{2n} \Bigl[ \int_0^{\infty} d\tau\, e^{-\frac{1}{4} U \tau} \Bigr]^{2n}. \tag{3.18} \]
The first term in the right side represents loops without jumps. The right side is less than the upper bound in Lemma 3.2 (b).
Proof of Lemma 3.2, (c) and (d). Loops of $\mathcal{L}^{(-)}$ have small energy cost when parameters are close to the transition line. Estimates are needed that are more subtle than for loops of $\mathcal{L}^{(0+)}$. The situation is similar to that of Sect. 2, but a problem needs to be solved: loops, unlike trajectories, can backtrack in time. Our strategy is to first "renormalize" a loop γ ∈ $\mathcal{L}^{(-)}$ by identifying a trajectory θ = θ(γ) that moves only downwards, but with arbitrarily long jumps. Contributions of backtracking can be controlled by similar estimates as above. The entropy of these trajectories can be expressed using an appropriate hopping matrix and we obtain sharp enough bounds.

We start with (d). Given a loop γ ∈ $\mathcal{L}^{(-)}_0$, we start at (x, τ) = (z, β) and move downwards along γ. When reaching the end of a vertical segment (because of the presence of a nearest-neighbor jump), we ignore possible backtracking and directly jump to the next downwards vertical segment in the loop, at constant time. See the dotted lines in Fig. 4. We obtain a trajectory, since the motion is downwards only, punctuated by long-range hoppings with which we must cope. Behind a hop from x to y there is a backtracking excursion between these sites. Its contribution to the total weight of the original loop (times $e^{a(\gamma)}$) is given by the "hopping matrix" component
\[ \sigma_{xy} = \int_{x \to y} d\nu_x(\gamma)\, w(\gamma)\, e^{a(\gamma)}, \tag{3.19} \]
where the integral is over loops that are open, have nonnegative winding number, start with a jump at (x, 0), and end at (y, 0). Each trajectory so constructed is characterized by a sequence of hopping times 0 = τ1 < · · · < τ2m ≤ β and a sequence of not-necessarily neighboring sites x = x0, x1, . . . , xm = x which are the successive hopping endpoints. Its weights are determined by factors exponentially decreasing with ℓ0 for each vertical segment and hopping matrix entries for each jump. In this way we obtain
\[ \int_{\mathcal{L}^{(-)}_z} d\nu(\gamma)\, w(\gamma)\, e^{a(\gamma)} \le e^{-\beta(\mu - 2^{12} d^2 \frac{t^2}{U})} \sum_{m \ge 0} \frac{\beta^m}{m!} \sum_{\substack{x_1,\dots,x_{m-1} \\ x_0 = x_m = z}} \prod_{i=1}^{m} \sigma_{x_i x_{i-1}} \le e^{-\beta(\mu - 2^{12} d^2 \frac{t^2}{U})}\, e^{\beta \sum_{x \ne 0} \sigma_{0x}}. \tag{3.20} \]
The overall exponential factor comes from the fact that ℓ0 ≥ β because the winding number of the loops is not zero. The factor $\beta^m/m!$ follows by integrating all choices of hopping times.

To conclude, we must bound the sum of the matrix elements of σ. The contribution of open loops that consist in just one jump is $2dt\, e^{2^{14} d\, t^2/U^2}$. Other open loops involve two jumps or more. Each jump has 2d possible directions. There are two possible time directions after each jump, except for the first and last ones. We need to integrate over time occurrence for each jump except the first one. We obtain
\[ \sum_{x \ne 0} \sigma_{0x} \le 2dt\, e^{2^{14} d\, t^2/U^2} + \sum_{m \ge 2} (2d)^m\, 2^{m-2}\, (4t)^m \Bigl[ \int_0^{\infty} e^{-\frac{1}{4} U \tau}\, d\tau \Bigr]^{m-1} \le 2dt\, e^{2^{14} d\, t^2/U^2} + 2^{9} d^2\, \frac{t^2}{U}. \tag{3.21} \]
We used (3.15). Inserting into (3.20) we obtain Lemma 3.2 (d). The bound of part (c) is similar, with an extra factor $2dt\, e^{2^{14} d\, t^2/U^2} \le 4dt$ for the additional first jump.
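Proof of Theorem 1.2. This proof is similar to the one of Theorem 1.1. We use cluster expansions, in order to get a convergent expansion for the pressure, and prove analyticity by using Vitali's theorem. The density has an expansion reminiscent of (2.17), namely
\[ \rho(\beta, \mu) = 1 + \sum_{m \ge 1} m \int_{\mathcal{L}_0} d\nu(\gamma_1)\, \frac{\partial w(\gamma_1)}{\partial\mu} \int d\nu(\gamma_2) \dots d\nu(\gamma_m)\, \varphi(\gamma_1, \dots, \gamma_m) \prod_{j=2}^{m} w(\gamma_j). \tag{3.22} \]
The combinatorial function ϕ is given by (A.2). From (3.9)
\[ \frac{\partial w(\gamma)}{\partial\mu} = \bigl[ \ell_2(\gamma) - \ell_0(\gamma) \bigr]\, w(\gamma). \tag{3.23} \]
Again using Eq. (A.5), we find the bound
\[ |\rho(\beta, \mu) - 1| \le \int_{\mathcal{L}_0} d\nu(\gamma)\, \bigl| \ell_2(\gamma) - \ell_0(\gamma) \bigr|\, w(\gamma)\, e^{a(\gamma)}. \tag{3.24} \]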
Only loops with nonzero winding number contribute. Going over the proof of Lemma 3.2 (b) and (d) with a(γ) → a(γ) + log ℓ(γ), we can check that the right side of the equation above is less than $e^{-\varepsilon\beta}$ whenever $\mu - 2dt - 2^{12} d^2 \frac{t^2}{U} - \varepsilon > 0$ and β is large enough.
4. Density Bounds

Proof of Theorem 1.3, (a). The Bose-Hubbard Hamiltonian preserves the total number of particles, so that the density can be fixed. We denote by e0(ρ) the ground state energy per site in the subspace of density ρ. Neglecting repulsive interactions can only decrease the ground state energy; the minimum kinetic energy of a single boson is −2dt. It follows that e0(ρ) ≥ (−µ − 2dt)ρ for all U ≥ 0.

We find an upper bound for e0(ρ) by using a variational argument. It is well-known that the symmetric ground state is also the absolute ground state, so that we can consider a non-symmetric trial function. We decompose Λ into boxes of size ℓ = $\rho^{-1/d}$. We consider the trial function $\otimes_{j=1}^{N} \varphi_j$, where $\varphi_j$ is supported in the j-th box only and minimizes the kinetic energy. As is well-known, $\varphi_j$ is the ground state of the Dirichlet problem in the box, and the corresponding eigenvalue is $-2dt \cos\frac{\pi}{\ell+1}$. Since ℓ + 1 ≥ $\rho^{-1/d}$ and cos x ≥ 1 − x²/2, this eigenvalue is less than $-2dt + dt\,(\pi\rho^{1/d})^2$. This implies that
\[ e_0(\rho) \le b(\rho) \equiv (-\mu - 2dt)\rho + \pi^2 d t\, \rho^{1 + \frac{2}{d}}. \tag{4.1} \]
The minimum of b(ρ) is reached for $\rho^{2/d} = \frac{\mu + 2dt}{\pi^2 t (d+2)}$. The minimum value is
\[ c = -\frac{2}{(\pi^2 t)^{d/2}} \Bigl( \frac{\mu + 2dt}{d+2} \Bigr)^{1 + \frac{d}{2}}. \tag{4.2} \]
By inspecting Fig. 5 we find that the ground state density is necessarily larger than
\[ a = \frac{2}{d+2} \Bigl( \frac{\mu + 2dt}{\pi^2 t (d+2)} \Bigr)^{d/2}. \tag{4.3} \]
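The minimization behind (4.1)–(4.3) can be verified numerically. The sketch below is an illustration with sample parameters chosen by us (µ = −1, t = 1, d = 3, so that µ + 2dt > 0): it evaluates b(ρ) at the claimed minimizer, compares the value with c, and checks that the straight line (−µ − 2dt)ρ reaches the level c exactly at ρ = a.

```python
import math


def b(rho, mu, t, d):
    """Upper bound (4.1) for the ground state energy per site."""
    return (-mu - 2 * d * t) * rho + math.pi ** 2 * d * t * rho ** (1 + 2 / d)


def rho_star(mu, t, d):
    """Minimizer of b: rho^(2/d) = (mu + 2dt) / (pi^2 t (d + 2))."""
    return ((mu + 2 * d * t) / (math.pi ** 2 * t * (d + 2))) ** (d / 2)


def c_min(mu, t, d):
    """Minimum value (4.2)."""
    return -2.0 / (math.pi ** 2 * t) ** (d / 2) * ((mu + 2 * d * t) / (d + 2)) ** (1 + d / 2)


def a_bound(mu, t, d):
    """Lower bound (4.3) on the ground state density."""
    return 2.0 / (d + 2) * rho_star(mu, t, d)


mu, t, d = -1.0, 1.0, 3
r = rho_star(mu, t, d)
c = c_min(mu, t, d)
a = a_bound(mu, t, d)
```

Since e0(ρ) lies between the line (−µ − 2dt)ρ and the curve b(ρ), any density with energy at most c must lie to the right of the point where the line crosses the level c, which is ρ = a.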
Proof of Theorem 1.3, (b). The strategy is the same as for part (a), although quantum fluctuations bring extra complications. The variational argument leading to the upper bound for e0(ρ) can be modified by replacing particles with holes, so as to yield
\[ e_0(\rho) \le \tilde{b}(\rho) \equiv -\mu + (\mu - 2dt)(1 - \rho) + \pi^2 d t\, (1 - \rho)^{1 + \frac{2}{d}}. \tag{4.4} \]
The lower bound is trickier. We fix the density ρ and work in the Hilbert space $\mathcal{H}_{\Lambda,N}$ with N = ρ|Λ|. We have that
\[ e_0(\rho) = -\lim_{\Lambda \nearrow \mathbb{Z}^d} \lim_{\beta \to \infty} \frac{1}{\beta|\Lambda|} \log \operatorname{Tr}_{\mathcal{H}_{\Lambda,N}} e^{-\beta(H - \mu N)}. \tag{4.5} \]
[Figure 5: plot of the upper bound b(ρ) and the lower bound (−µ − 2dt)ρ for e(ρ) as functions of ρ, with the values a and c marked.]
Fig. 5. Upper and lower bounds for the ground state energy per site, e0(ρ). The density that minimizes e(ρ) necessarily satisfies ρ ≥ a

We can use the loop representation of Sect. 3 for the trace to obtain an expression similar to (3.11); the difference is that we require the sum of winding numbers of all loops to be equal to the negative of the number of holes M = |Λ| − N. The weights of loops with strictly positive winding numbers decay exponentially as $e^{-\beta(U-\mu)}$, so they do not contribute in the limit β → ∞. We obtain an upper bound for Z(β, Λ, µ) (and therefore a lower bound for e0(ρ)) by neglecting the non-intersecting conditions between loops. Further, we replace the loops γ with negative winding numbers by trajectories θ as in the proof of Lemma 3.2 (c), (d). We then obtain the lower bound
\[ e_0(\rho) \ge -\mu\rho - \lim_{\Lambda \nearrow \mathbb{Z}^d} \lim_{\beta \to \infty} \frac{1}{\beta|\Lambda|} \log \operatorname{Tr}_{\mathcal{H}_{\Lambda,M}} e^{-\beta\tilde{T}} - \lim_{\Lambda \nearrow \mathbb{Z}^d} \lim_{\beta \to \infty} \frac{1}{\beta|\Lambda|} \int_{\mathcal{L}^{(0)}} d\nu(\gamma)\, w(\gamma). \tag{4.6} \]
Here, $\tilde{T}$ denotes the multibody kinetic operator
\[ \tilde{T} = \sum_{x,y} \sigma(x - y)\, c^{\dagger}_x c_y, \tag{4.7} \]
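and σ(x) is given in (3.19). Then, by (3.21),
\[ \lim_{\beta \to \infty} \frac{1}{\beta} \log \operatorname{Tr}_{\mathcal{H}_{\Lambda,M}} e^{-\beta\tilde{T}} \le \Bigl( 2dt + 2^{10} d^2\, \frac{t^2}{U} \Bigr) M. \tag{4.8} \]
The contribution of nonwinding loops is bounded using Lemma 3.2 (a),
\[ \frac{1}{\beta|\Lambda|} \int_{\mathcal{L}^{(0)}} d\nu(\gamma)\, w(\gamma) \le \int_{\mathcal{L}^{(0)}} d\nu_0(\gamma)\, w(\gamma) \le 2^{11} d\, \frac{t^2}{U}. \tag{4.9} \]
We have shown that
\[ e_0(\rho) \ge -\mu - \Bigl( 2dt - \mu + 2^{10} d^2\, \frac{t^2}{U} \Bigr)(1 - \rho) - 2^{11} d\, \frac{t^2}{U}. \tag{4.10} \]
From here on we proceed as before. The minimum of $\tilde{b}(\rho)$ is $-\mu - \frac{2}{(\pi^2 t)^{d/2}} \bigl( \frac{2dt - \mu}{d+2} \bigr)^{1 + d/2}$. The ground state density then satisfies
\[ 1 - \rho \ge \frac{ \frac{2}{(\pi^2 t)^{d/2}} \bigl( \frac{2dt - \mu}{d+2} \bigr)^{1 + d/2} - 2^{11} d\, \frac{t^2}{U} }{ 2dt - \mu + 2^{10} d^2\, \frac{t^2}{U} }. \tag{4.11} \]
One finds the condition of Theorem 1.3 by requiring that the numerator be strictly positive.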
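The last step can be made explicit. Solving "numerator of (4.11) > 0" for t gives t > µ/2d + C t (t/U)^{2/(d+2)} with C = ((d+2)/2d)(2¹⁰ d π^d)^{2/(d+2)}. The sketch below is an illustration with a parameter grid chosen by us: it checks numerically that the two conditions agree whenever 2dt > µ.

```python
import math


def numerator_411(t, mu, U, d):
    """Numerator of the right side of (4.11); assumes 2dt > mu."""
    x = (2 * d * t - mu) / (d + 2)
    return (2.0 / (math.pi ** 2 * t) ** (d / 2) * x ** (1 + d / 2)
            - 2 ** 11 * d * t ** 2 / U)


def constant_C(d):
    """C = (d+2)/(2d) * (2^10 d pi^d)^(2/(d+2))."""
    return (d + 2) / (2 * d) * (2 ** 10 * d * math.pi ** d) ** (2 / (d + 2))


def condition_13b(t, mu, U, d):
    """Hypothesis of Theorem 1.3 (b) with the constant C above."""
    return t > mu / (2 * d) + constant_C(d) * t * (t / U) ** (2 / (d + 2))


def conditions_agree(d):
    """Compare the two conditions on a small grid of sample parameters."""
    for mu in (-0.5, 0.05, 0.1, 0.5):
        for t in (0.01, 0.03, 0.1, 0.3, 1.0):
            for U in (1e3, 1e6, 1e9):
                if 2 * d * t <= mu:
                    continue  # the fractional power in (4.11) needs 2dt > mu
                if (numerator_411(t, mu, U, d) > 0) != condition_13b(t, mu, U, d):
                    return False
    return True
```

For d = 3 the constant is roughly 82, so the condition can only hold with µ ≥ 0 when t/U is very small, in line with the perturbative regime considered here.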
Appendix A. Cluster Expansions

This appendix contains the main theorem of [20] for the convergence of cluster expansions. It allows for an uncountable set of "polymers", so that it applies here.

Let $(\mathbb{A}, \mathcal{A}, \mu)$ be a measure space with µ a complex measure. We suppose that $|\mu|(\mathbb{A}) < \infty$, where |µ| is the total variation (absolute value) of µ. Let ζ be a complex measurable symmetric function on $\mathbb{A} \times \mathbb{A}$. Let Z be the partition function:
\[ Z = \sum_{n \ge 0} \frac{1}{n!} \int d\mu(A_1) \dots d\mu(A_n) \prod_{1 \le i < j \le n} \bigl[ 1 - \zeta(A_i, A_j) \bigr]. \tag{A.1} \]
The term n = 0 of the sum is understood to be 1.

We denote by $\mathcal{G}_n$ the set of all (unoriented) graphs with n vertices, and by $\mathcal{C}_n \subset \mathcal{G}_n$ the set of connected graphs of n vertices. We introduce the following combinatorial function on finite sequences (A1, . . . , An) of $\mathbb{A}$:
\[ \varphi(A_1, \dots, A_n) = \begin{cases} 1 & \text{if } n = 1, \\ \frac{1}{n!} \sum_{G \in \mathcal{C}_n} \prod_{(i,j) \in G} \bigl[ -\zeta(A_i, A_j) \bigr] & \text{if } n \ge 2. \end{cases} \tag{A.2} \]
The product is over edges of G. A sequence (A1, . . . , An) is a cluster if the graph with n vertices and an edge between i and j whenever ζ(A_i, A_j) ≠ 0, is connected.

Convergence of the cluster expansion is guaranteed provided the terms in (A.1) are small in a suitable sense. First, we assume that
\[ |1 - \zeta(A, A')| \le 1 \tag{A.3} \]
for all A, A′ ∈ $\mathbb{A}$. Second, we need that the "Kotecký-Preiss criterion" holds true. Namely, we suppose that there exists a nonnegative function a on $\mathbb{A}$ such that for all A ∈ $\mathbb{A}$,
\[ \int d|\mu|(A')\, |\zeta(A, A')|\, e^{a(A')} \le a(A). \tag{A.4} \]
The cluster expansion allows one to express the logarithm of the partition function as a sum (or an integral) over clusters.

Theorem A.1 (Cluster expansion). Assume that $\int d|\mu|(A)\, e^{a(A)} < \infty$, and that (A.3) and (A.4) hold true. Then we have
\[ Z = \exp\Bigl\{ \sum_{n \ge 1} \int d\mu(A_1) \dots d\mu(A_n)\, \varphi(A_1, \dots, A_n) \Bigr\}. \]
Combined sum and integrals converge absolutely. Furthermore, we have for all A1 ∈ $\mathbb{A}$,
\[ 1 + \sum_{n \ge 2} n \int d|\mu|(A_2) \dots d|\mu|(A_n)\, |\varphi(A_1, \dots, A_n)| \le e^{a(A_1)}. \tag{A.5} \]
We refer to [20] for the proof of this theorem, and for further statements about correlation functions.
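Theorem A.1 can be illustrated on a toy example where everything is computable by hand. The sketch below is not taken from [20]; the three "polymers", their weights, and the hard-core function ζ are hypothetical choices of ours. The measure µ is a sum of point masses, Z is evaluated directly from (A.1), and log Z is compared with the cluster sum built from (A.2), truncated at order 5.

```python
import itertools
import math


def connected_graphs(n):
    """All connected graphs on vertices 0..n-1, each as a list of edges."""
    pairs = list(itertools.combinations(range(n), 2))
    graphs = []
    for mask in range(1, 2 ** len(pairs)):
        edges = [pairs[k] for k in range(len(pairs)) if (mask >> k) & 1]
        reached, stack = {0}, [0]
        while stack:  # depth-first search from vertex 0
            v = stack.pop()
            for i, j in edges:
                for u, w in ((i, j), (j, i)):
                    if u == v and w not in reached:
                        reached.add(w)
                        stack.append(w)
        if len(reached) == n:
            graphs.append(edges)
    return graphs


def phi(seq, zeta, graphs):
    """Combinatorial function (A.2) for a sequence of polymer indices."""
    if len(seq) == 1:
        return 1.0
    total = 0.0
    for edges in graphs:
        term = 1.0
        for i, j in edges:
            term *= -zeta[seq[i]][seq[j]]
        total += term
    return total / math.factorial(len(seq))


def log_z_clusters(weights, zeta, order):
    """Cluster sum of Theorem A.1, truncated at sequences of length `order`."""
    total = 0.0
    for n in range(1, order + 1):
        graphs = connected_graphs(n) if n > 1 else []
        for seq in itertools.product(range(len(weights)), repeat=n):
            total += phi(seq, zeta, graphs) * math.prod(weights[a] for a in seq)
    return total


def z_direct(weights, zeta, order):
    """Partition function (A.1), summed up to sequences of length `order`."""
    total = 0.0
    for n in range(order + 1):
        for seq in itertools.product(range(len(weights)), repeat=n):
            term = math.prod(weights[a] for a in seq)
            for i, j in itertools.combinations(range(n), 2):
                term *= 1.0 - zeta[seq[i]][seq[j]]
            total += term / math.factorial(n)
    return total


# Polymers 0 and 1 overlap each other; polymer 2 is disjoint from both.
weights = [0.05, 0.03, 0.02]
zeta = [[1, 1, 0], [1, 1, 0], [0, 0, 1]]
Z = z_direct(weights, zeta, order=3)  # terms with n > 3 vanish (hard core)
log_Z = log_z_clusters(weights, zeta, order=5)
```

In this example Z = (1 + w0 + w1)(1 + w2), and the cluster sum reproduces the Taylor series of log(1 + w0 + w1) + log(1 + w2); for instance ϕ(A, A) = −1/2 gives the term −w²/2.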
References
1. Aizenman, M., Lieb, E.H., Seiringer, R., Solovej, J.P., Yngvason, J.: Bose-Einstein quantum phase transition in an optical lattice model. Phys. Rev. A 70, 023612 (2004); see also cond-mat/0412034; see also http://arxiv.org/list/cond-mat/0412034, 2004
2. Batrouni, G.G., Assaad, F.F., Scalettar, R.T., Denteneer, P.J.H.: Dynamic response of trapped ultracold bosons on optical lattices. Phys. Rev. A 72, 031601(R) (2005)
3. Borgs, C., Kotecký, R., Ueltschi, D.: Low temperature phase diagrams for quantum perturbations of classical spin systems. Commun. Math. Phys. 181, 409–446 (1996)
4. Bru, J.-B., Dorlas, T.C.: Exact solution of the infinite-range-hopping Bose-Hubbard model. J. Stat. Phys. 113, 177–196 (2003)
5. Datta, N., Fernández, R., Fröhlich, J.: Low-temperature phase diagrams of quantum lattice systems. I. Stability for quantum perturbations of classical systems with finitely-many ground states. J. Stat. Phys. 84, 455–534 (1996)
6. Datta, N., Fernández, R., Fröhlich, J., Rey-Bellet, L.: Low-temperature phase diagrams of quantum lattice systems. II. Convergent perturbation expansions and stability in systems with infinite degeneracy. Helv. Phys. Acta 69, 752–820 (1996)
7. Dyson, F.J., Lieb, E.H., Simon, B.: Phase transitions in quantum spin systems with isotropic and nonisotropic interactions. J. Stat. Phys. 18, 335–383 (1978)
8. Elstner, N., Monien, H.: Dynamics and thermodynamics of the Bose-Hubbard model. Phys. Rev. B 59, 12184–12187 (1999)
9. Fisher, M.P.A., Weichman, P.B., Grinstein, G., Fisher, D.S.: Boson localization and the superfluid-insulator transition. Phys. Rev. B 40, 546–570 (1989)
10. Freericks, J.K., Monien, H.: Strong-coupling expansions for the pure and disordered Bose-Hubbard model. Phys. Rev. B 53, 2691–2700 (1996)
11. Fröhlich, J., Simon, B., Spencer, T.: Infrared bounds, phase transitions and continuous symmetry breaking. Commun. Math. Phys. 50, 79–95 (1976)
12. Greiner, M., Mandel, O., Esslinger, T., Hänsch, T.W., Bloch, I.: Quantum phase transition from a superfluid to a Mott insulator in a gas of ultracold atoms. Nature 415, 39–44 (2002)
13. Kennedy, T.: Long range order in the anisotropic quantum ferromagnetic Heisenberg model. Commun. Math. Phys. 100, 447–462 (1985)
14. Kennedy, T., Lieb, E.H., Shastry, B.S.: The X-Y model has long-range order for all spins and all dimensions greater than one. Phys. Rev. Lett. 61, 2582–2584 (1988)
15. Köhl, M., Moritz, H., Stöferle, T., Schori, C., Esslinger, T.: Superfluid to Mott insulator transition in one, two, and three dimensions. J. Low Temp. Phys. 138, 635 (2005)
16. Kotecký, R., Ueltschi, D.: Effective interactions due to quantum fluctuations. Commun. Math. Phys. 206, 289–335 (1999)
17. Lieb, E.H., Seiringer, R., Solovej, J.P., Yngvason, J.: The mathematics of the Bose gas and its condensation. Oberwolfach Seminars, Basel: Birkhäuser, 2005
18. Messager, A., Miracle-Solé, S.: Low temperature states in the Falicov-Kimball model. Rev. Math. Phys. 8, 271–99 (1996)
19. Schmid, G., Todo, S., Troyer, M., Dorneich, A.: Finite-temperature phase diagram of hard-core bosons in two dimensions. Phys. Rev. Lett. 88, 167208 (2002)
20. Ueltschi, D.: Cluster expansions and correlation functions. Moscow Math. J. 4, 511–522 (2004)

Communicated by M. Aizenman
Commun. Math. Phys. 266, 797–818 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0020-6
Communications in Mathematical Physics
Upper Bounds to the Ground State Energies of the One- and Two-Component Charged Bose Gases
Jan Philip Solovej
Institute for Mathematical Sciences, University of Copenhagen, Universitetsparken 5, 2100 Copenhagen, Denmark. E-mail: [email protected]
Received: 30 September 2005 / Accepted: 2 December 2005
Published online: 5 May 2006 – © by J.P. Solovej 2006
Abstract: We prove upper bounds on the ground state energies of the one- and two-component charged Bose gases. The upper bound for the one-component gas agrees with the high density asymptotic formula proposed by L. Foldy in 1961. The upper bound for the two-component gas agrees in the large particle number limit with the asymptotic formula conjectured by F. Dyson in 1967. Matching asymptotic lower bounds for these systems were proved in references [10] and [11]. The formulas of Foldy and Dyson, which are based on Bogolubov’s pairing theory, have thus been validated.
1. Introduction and Main Results

In 1961 L. Foldy [7] used Bogolubov’s 1947 pairing theory [4] for Bose systems to give a heuristic calculation of the ground state energy of a one-component charged Bose gas in the high density limit. The one-component Bose gas is a system of Bose particles all of the same charge moving in the presence of a fixed uniform background of the opposite charge. In 1967 F. Dyson [6] considered the two-component Bose gas with two species of bosons with opposite charges. Motivated by Foldy’s calculation, Dyson was able to prove a rigorous upper bound on the ground state energy. A famous consequence of Dyson’s upper bound is that charged bosonic matter is not stable: the ground state energy is superlinear in the number of particles. Dyson, moreover, conjectured an exact asymptotic form of the ground state energy in the limit of a large number of particles.

© 2006 by the author. This article may be reproduced in its entirety for non-commercial purposes.
Work partially supported by NSF grant DMS-0111298, by EU grant HPRN-CT-2002-00277, by
MaPhySto – A Network in Mathematical Physics and Stochastics, funded by The Danish National Research Foundation, and by grants from the Danish research council. Most of this work was done while the author was visiting the School of Mathematics, Institute for Advanced Study, Princeton.
In [10] it was proved that Foldy’s calculation is indeed correct as a leading asymptotic lower bound for the ground state energy of the one-component charged Bose gas in the high density limit. In [11] it was similarly proved that Dyson’s conjectured expression is correct as an asymptotic lower bound for the ground state energy of the two-component charged Bose gas in the limit of a large number of particles. The aim of the present paper is to prove the corresponding upper bounds, thus validating both Foldy’s one-component and Dyson’s two-component formulas.

It should be mentioned that Foldy’s calculation may be viewed as a trial state calculation and may thus be turned into a rigorous upper bound. Foldy, however, uses periodic boundary conditions and a periodic version of the Coulomb potential. It is not known whether this formulation has the same thermodynamic limit as the formulation given below.

The one-component Bose gas is a system of $N$ particles all of the same charge $+1$, say, constrained to a box $\Lambda = [0, L]^3 \subset \mathbb{R}^3$, in which there is a uniform background charge of density $\rho$. The Hamiltonian for the one-component charged Bose gas is thus
\[
H_N^{(1)} = \sum_{i=1}^{N} \left( -\tfrac{1}{2}\Delta_i - V(x_i) \right) + \sum_{1\le i<j\le N} |x_i - x_j|^{-1} + C, \tag{1}
\]
where
\[
V(x) = \rho \int_{\Lambda} |x - y|^{-1}\,dy, \qquad C = \frac{\rho^2}{2} \iint_{\Lambda\times\Lambda} |x - y|^{-1}\,dx\,dy.
\]
We use Dirichlet boundary conditions. It is known from the work of Lieb and Narnhofer [9] that the ground state energy $E^{(1)}(N)$ of $H_N^{(1)}$ has a thermodynamic limit if we restrict to a neutral system,
\[
e(\rho) = \lim_{\substack{N\to\infty \\ L^3 = N/\rho}} \frac{E^{(1)}(N)}{L^3}.
\]
It is however also shown in [9] that one will get the same thermodynamic energy by minimizing over all particle numbers, i.e.,
\[
e(\rho) = \lim_{L\to\infty} \inf_{N} \frac{E^{(1)}(N)}{L^3}.
\]
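Theorem 1.1 (Foldy’s formula). The ground state energy $e(\rho)$ of the one-component charged Bose gas satisfies the asymptotics
\[
\lim_{\rho\to\infty} \rho^{-5/4} e(\rho) = -I_0, \tag{2}
\]
where
\[
I_0 = (2/\pi)^{3/4} \int_0^\infty \left( 1 + x^4 - x^2 \left(x^4 + 2\right)^{1/2} \right) dx = \frac{4^{5/4}\,\Gamma(3/4)}{5\pi^{1/4}\,\Gamma(5/4)}. \tag{3}
\]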
The two-component Bose gas is described by the Hamiltonian
\[
H_N^{(2)} = \sum_{i=1}^{N} -\tfrac{1}{2}\Delta_i + \sum_{1\le i<j\le N} \frac{e_i e_j}{|x_i - x_j|}
\]
acting on the Hilbert space $L^2(\mathbb{R}^3\times\{1,-1\})$, where the variable $(x_i, e_i) \in \mathbb{R}^3\times\{1,-1\}$ gives the position and charge of particle $i$. The term two-component refers to the fact that the charge of each particle can be either positive or negative. Thus the gas has a positive and a negative component.

One would not normally consider the charges as variables, but rather fix them to have given values. If we did that, the Hamiltonian would not be fully symmetric in all $N$ variables, but only in the variables for the positively charged particles and negatively charged particles separately. Clearly, the charge variables commute with the Hamiltonian, and the bottom of the spectrum (the ground state energy) $E^{(2)}(N)$ of $H_N^{(2)}$ will therefore be achieved for a fixed combination of charges (rather than a superposition).

Theorem 1.2 (Dyson’s formula). The ground state energy $E^{(2)}(N)$ of the two-component charged Bose gas satisfies the asymptotics
\[
\lim_{N\to\infty} N^{-7/5} E^{(2)}(N) = -A,
\]
where $A$ is the positive constant determined by the variational principle
\[
-A = \inf\left\{ \tfrac{1}{2}\int_{\mathbb{R}^3} |\nabla\Phi|^2 - I_0 \int_{\mathbb{R}^3} \Phi^{5/2} \;\Big|\; 0 \le \Phi,\ \int_{\mathbb{R}^3} \Phi^2 = 1 \right\} \tag{4}
\]
with $I_0$ again given by (3).

In [6] Dyson proves that $E^{(2)}(N) \le -C N^{7/5}$, but with a constant different from $A$. He conjectures that the correct value is given as above. That the exponent 7/5 is, indeed, correct was first proved in 1988 by Conlon, Lieb, and Yau in [5], where they show a lower bound $-C N^{7/5}$, but still not with the correct constant. They also proved that 5/4 is the correct exponent in Foldy’s formula. The asymptotic lower bounds in Theorems 1.1 and 1.2 were proved in [10] and [11] respectively. The main results of the present paper are the asymptotic upper bounds.

In Sect. 2 we give a general construction of bosonic trial states on the bosonic Fock space over a general Hilbert space. The trial states will be built from coherent states and squeezed states. The trial states are essentially the ones dictated by Bogolubov theory. These trial states are the bosonic equivalent of the fermionic states in Hartree-Fock theory, or rather of their extension including the Bardeen-Cooper-Schrieffer states (see [1]). In the same way as fermionic systems may be approximated by the semi-classical Thomas-Fermi theory, we will also use a semi-classical type approximation to the Bogolubov trial states.

In Sect. 3 we use the general trial state method to give an upper bound on the ground state energy for the two-component gas, but in a grand canonical setting where we do not fix the total number of particles. In Sect. 3.1 we show how to get an upper bound for fixed particle number and thus prove Theorem 1.2. In Sect. 4 we use the general trial state method to give an upper bound on the ground state energy for the one-component gas and prove Theorem 1.1.
A key ingredient in the proofs is a semiclassical construction where we represent operators as phase-space integrals with coherent state symbols and use the Berezin-Lieb inequalities. We need an operator version of the inequality. This is discussed in Appendix A.

2. The Abstract Trial State Construction

Our goal in this section is to construct trial states on the bosonic Fock space $\mathcal{F} = \mathcal{F}(\mathcal{H}_1) = \bigoplus_{N=0}^{\infty} \mathcal{H}_N$ over some Hilbert space $\mathcal{H}_1$, i.e., $\mathcal{H}_N = \bigotimes^N_{\mathrm{Sym}} \mathcal{H}_1$ and $\mathcal{H}_0 = \mathbb{C}$. We will be using the language of bosonic creation and annihilation operators as a convenient tool for the bookkeeping. We denote by $|0\rangle$ the vacuum vector in $\mathcal{F}$.

If $T$ is an operator on $\mathcal{H}_1$ and $W$ is an operator on $\mathcal{H}_1\otimes\mathcal{H}_1$ which is symmetric under interchange of the tensor factors, we may lift (also referred to as second quantize) these operators to $\mathcal{F}$ as
\[
\bigoplus_{N=1}^{\infty} \sum_{i=1}^{N} T_i \quad\text{and}\quad \bigoplus_{N=2}^{\infty} \sum_{1\le i<j\le N} W_{ij}.
\]
Here $T_i$ refers to the operator $T$ acting on the $i$-th factor in the tensor product and $W_{ij}$ refers to $W$ acting on the $i$-th and $j$-th factors. If $u_\alpha$, $\alpha = 1, \ldots$ is an orthonormal basis for $\mathcal{H}_1$ we can express these operators using creation and annihilation operators as
\[
\bigoplus_{N=1}^{\infty} \sum_{i=1}^{N} T_i = \sum_{\alpha,\beta} (u_\alpha, T u_\beta)\, a(u_\alpha)^* a(u_\beta) \tag{5}
\]
and
\[
\bigoplus_{N=0}^{\infty} \sum_{1\le i<j\le N} W_{ij} = \frac{1}{2} \sum_{\alpha\beta\mu\nu} (u_\alpha\otimes u_\beta, W u_\mu\otimes u_\nu)\, a(u_\alpha)^* a(u_\beta)^* a(u_\nu) a(u_\mu). \tag{6}
\]
Of special interest is the number operator (the second quantization of the identity)
\[
\mathcal{N} = \bigoplus_{N=0}^{\infty} N.
\]
If $\phi \in \mathcal{H}_1$ is a not necessarily normalized vector we define the corresponding coherent state as the normalized vector in Fock space
\[
|\phi\rangle_C = \exp\left(-\|\phi\|^2/2 + a(\phi)^*\right)|0\rangle = e^{-\|\phi\|^2/2} \sum_{n=0}^{\infty} \frac{(a(\phi)^*)^n}{n!}\,|0\rangle, \tag{7}
\]
and for a normalized $\psi \in \mathcal{H}_1$ we define the squeezed state depending on $\lambda \in \mathbb{C}$ with $|\lambda| < 1$,
\[
|\lambda;\psi\rangle_S = (1-|\lambda|^2)^{1/4} \exp\left(-(\lambda/2)\, a(\psi)^* a(\psi)^*\right)|0\rangle = (1-|\lambda|^2)^{1/4} \sum_{n=0}^{\infty} \frac{(-\lambda/2)^n}{n!}\, (a(\psi)^*)^{2n}\,|0\rangle. \tag{8}
\]
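It is straightforward to check that these states are normalized. Up to an overall phase, $|\phi\rangle_C$ and $|\lambda;\psi\rangle_S$ are characterized by
\[
\left(a(\phi) - \|\phi\|^2\right)|\phi\rangle_C = 0 \quad\text{and}\quad \left(a(\psi) + \lambda\, a(\psi)^*\right)|\lambda;\psi\rangle_S = 0. \tag{9}
\]
We immediately see that
\[
{}_C\langle\phi|(a(\phi)^*)^m a(\phi)^k|\phi\rangle_C = \|\phi\|^{2(m+k)}. \tag{10}
\]
For the squeezed state we get
\[
\begin{aligned}
{}_S\langle\lambda;\psi|(a(\psi)^*)^j a(\psi)^{j+2k}|\lambda;\psi\rangle_S
&= (1-|\lambda|^2)^{1/2} \sum_{n=0}^{\infty} \frac{(2n+2k)!}{((n+k)!)^2}\,(2n-j+1)(2n-j+2)\cdots(2n) \\
&\qquad\times (n+k)(n+k-1)\cdots(n+1)\,(|\lambda|/2)^{2n}(-\lambda/2)^k \\
&= (1-|\lambda|^2)^{1/2}\,|\lambda|^j(-\lambda)^k\, \frac{d^j}{d|\lambda|^j} \left(|\lambda|^{-1}\frac{d}{d|\lambda|}\right)^{k} (1-|\lambda|^2)^{-1/2}.
\end{aligned} \tag{11}
\]
Moreover, the expectation in the state $|\lambda;\psi\rangle_S$ of a product of an odd number of the operators $a(\psi)^*$ or $a(\psi)$ vanishes. For the expectation of the particle number we find
\[
{}_C\langle\phi|a(\phi/\|\phi\|)^* a(\phi/\|\phi\|)|\phi\rangle_C = \|\phi\|^2 \quad\text{and}\quad {}_S\langle\lambda;\psi|a(\psi)^* a(\psi)|\lambda;\psi\rangle_S = \frac{|\lambda|^2}{1-|\lambda|^2}.
\]
We point out that the variation in the particle number is very different in the coherent state and in the squeezed state:
\[
{}_C\langle\phi|\left(a(\phi/\|\phi\|)^* a(\phi/\|\phi\|)\right)^2|\phi\rangle_C - {}_C\langle\phi|a(\phi/\|\phi\|)^* a(\phi/\|\phi\|)|\phi\rangle_C^2 = \|\phi\|^2, \tag{12}
\]
\[
{}_S\langle\lambda;\psi|\left(a(\psi)^* a(\psi)\right)^2|\lambda;\psi\rangle_S - {}_S\langle\lambda;\psi|a(\psi)^* a(\psi)|\lambda;\psi\rangle_S^2 = \frac{2|\lambda|^2}{(1-|\lambda|^2)^2}. \tag{13}
\]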
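The particle number statistics in (12) and (13) can be checked numerically from the series expansions (7) and (8). The following is a minimal sketch, not part of the original argument: it assumes a real $0 < \lambda < 1$, uses that the coherent state has Poisson weights with mean $\|\phi\|^2$, and that the squeezed state has weight $(1-\lambda^2)^{1/2}\binom{2n}{n}(\lambda/2)^{2n}$ on particle number $2n$. The function names and the truncation parameter are illustrative choices.

```python
from math import exp

def coherent_number_stats(norm_sq, n_terms=400):
    """Mean and variance of the particle number in the coherent state (7).

    The weight on particle number k is Poisson: exp(-norm_sq) * norm_sq^k / k!.
    """
    p = exp(-norm_sq)              # weight of the vacuum component
    total = mean = second = 0.0
    for k in range(n_terms):
        total += p
        mean += k * p
        second += k * k * p
        p *= norm_sq / (k + 1)     # Poisson recursion p_{k+1} = p_k * mu / (k+1)
    return total, mean, second - mean ** 2

def squeezed_number_stats(lam, n_terms=2000):
    """Mean and variance of the particle number in the squeezed state (8), real lam.

    The weight on particle number 2n is sqrt(1 - lam^2) * C(2n, n) * (lam/2)^(2n).
    """
    p = (1.0 - lam * lam) ** 0.5   # weight of the vacuum component
    total = mean = second = 0.0
    for n in range(n_terms):
        k = 2 * n
        total += p
        mean += k * p
        second += k * k * p
        # ratio of consecutive weights: lam^2 * (2n + 1) / (2n + 2)
        p *= lam * lam * (2 * n + 1) / (2 * (n + 1))
    return total, mean, second - mean ** 2
```

For $\lambda = 1/2$ the squeezed state has mean $1/3$ and variance $8/9$, so the standard deviation already exceeds the mean, whereas the coherent state with $\|\phi\|^2 = 4$ has mean and variance both equal to 4.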
Thus in the coherent state the standard deviation of the particle number is the square root of the expectation itself, whereas for the squeezed state the standard deviation of the particle number is, in fact, greater than the expectation itself. For this reason the squeezed states are not appropriate for describing Bose condensates with a macroscopic and sharply defined occupation number in a specific one-particle state. To describe condensates we will use coherent states. We will here define a variational principle corresponding to the Bogolubov theory of Bose gases. We shall do this by characterizing the set of variational trial states (see also Robinson [12]). The Bogolubov variational theory is very similar to the Hartree-Fock theory for Fermi gases. More precisely, it is similar to the generalized Hartree-Fock theory which includes the Bardeen-Cooper-Schrieffer (BCS) trial states. In generalized Hartree-Fock theory (see [1]) the class of trial states is defined to be the quasi-free states on a fermionic Fock space. For the ground state (zero temperature) theory we may restrict to pure quasi-free states.
To describe the variational states of Bogolubov theory we will again start from (normalized) quasi-free pure states. Such a state may be characterized as follows. If $\Psi \in \mathcal{F}(\mathcal{H}_1)$ is a normalized quasi-free pure state there exists an orthonormal family $\psi_1, \ldots$ of $\mathcal{H}_1$ and a sequence of numbers $0 < \lambda_1, \ldots < 1$ with $\sum_{\alpha=1}^{\infty} \lambda_\alpha^2 < \infty$ such that
\[
\Psi = \prod_{\alpha=1}^{\infty} (1-\lambda_\alpha^2)^{1/4} \exp\left(-\frac{\lambda_\alpha}{2}\, a(\psi_\alpha)^* a(\psi_\alpha)^*\right)|0\rangle. \tag{14}
\]
A straightforward but lengthy calculation from (11) shows that the quasi-free state $\Psi$ satisfies
\[
(\Psi, a_1 a_2 a_3 a_4 \Psi) = (\Psi, a_1 a_2 \Psi)(\Psi, a_3 a_4 \Psi) + (\Psi, a_1 a_4 \Psi)(\Psi, a_2 a_3 \Psi) + (\Psi, a_1 a_3 \Psi)(\Psi, a_2 a_4 \Psi), \tag{15}
\]
and from the definition of the state we have for all integers $m \ge 1$,
\[
(\Psi, a_1 \cdots a_{2m-1} \Psi) = 0. \tag{16}
\]
In (15) and (16), $a_j$, $j = 1, 2, \ldots$ refer to any creation or annihilation operators. The relation (15) is the case $m = 2$ of the more general rule
\[
(\Psi, a_1 \cdots a_{2m} \Psi) = \sum_{\pi\in P_{2m}} (\Psi, a_{\pi(1)} a_{\pi(2)} \Psi) \cdots (\Psi, a_{\pi(2m-1)} a_{\pi(2m)} \Psi), \tag{17}
\]
where $P_{2m}$ is the set of pairing permutations
\[
P_{2m} = \left\{ \pi \in S_{2m} \;\middle|\; \pi(2j-1) < \pi(2j+1),\ j = 1, \ldots, m-1,\quad \pi(2j-1) < \pi(2j),\ j = 1, \ldots, m \right\}. \tag{18}
\]
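To make the index set in (18) concrete, the pairing permutations can be enumerated by brute force for small $m$. The sketch below is only an illustration (the function name is ours): it filters all permutations of $\{1, \ldots, 2m\}$ by the two conditions in (18), and for $m = 2$ it recovers exactly the three pairings that appear on the right side of (15).

```python
from itertools import permutations

def pairing_permutations(m):
    """Enumerate P_{2m} from (18) as tuples (pi(1), ..., pi(2m))."""
    result = []
    for pi in permutations(range(1, 2 * m + 1)):
        # pi(2j-1) < pi(2j) for j = 1..m: each pair is internally ordered
        pairs_ordered = all(pi[2 * j] < pi[2 * j + 1] for j in range(m))
        # pi(2j-1) < pi(2j+1) for j = 1..m-1: pairs are ordered by first element
        firsts_increasing = all(pi[2 * j] < pi[2 * j + 2] for j in range(m - 1))
        if pairs_ordered and firsts_increasing:
            result.append(pi)
    return result
```

The number of elements is $(2m-1)!! = 1\cdot 3\cdots(2m-1)$, i.e., 3 for $m = 2$ and 15 for $m = 3$.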
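We shall here use this only in the case (15) when $m = 2$. The one-particle density matrix of the quasi-free state $\Psi$ is the operator $\gamma_1$ defined on the one-body space $\mathcal{H}_1$ by $(g, \gamma_1 f)_{\mathcal{H}_1} = (\Psi, a(f)^* a(g) \Psi)_{\mathcal{F}}$, where $f, g \in \mathcal{H}_1$. From (11),
\[
\gamma_1 = \sum_{\alpha=1}^{\infty} \frac{\lambda_\alpha^2}{1-\lambda_\alpha^2}\, |\psi_\alpha\rangle\langle\psi_\alpha|. \tag{19}
\]
Note, in particular, that the one-particle density matrix is a positive semi-definite trace class operator with
\[
\operatorname{Tr} \gamma_1 = (\Psi, \mathcal{N} \Psi) = \sum_{\alpha=1}^{\infty} \frac{\lambda_\alpha^2}{1-\lambda_\alpha^2} < \infty.
\]
Connected to the quasi-free pure state $\Psi$ we also have the symmetric bilinear form $\xi_1$ on $\mathcal{H}_1$ given by $\xi_1(f, g) = (\Psi, a(f)^* a(g)^* \Psi)_{\mathcal{F}}$. We find, again from (11), that
\[
\xi_1(f, g) = \sum_{\alpha=1}^{\infty} \frac{-\lambda_\alpha}{1-\lambda_\alpha^2}\, (\psi_\alpha, f)(\psi_\alpha, g). \tag{20}
\]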
We may identify $\xi_1$ with a linear map $\xi_1 : \mathcal{H}_1 \to \mathcal{H}_1^*$, from the one-body space $\mathcal{H}_1$ to its dual space $\mathcal{H}_1^*$. We then have the relations
\[
\xi_1^* \xi_1 = \gamma_1(\gamma_1 + 1), \qquad \xi_1 \gamma_1 = \gamma_1 \xi_1, \tag{21}
\]
where we have also identified $\gamma_1$ in the natural way with a map from $\mathcal{H}_1^*$ to itself. If we introduce the operator $\Gamma : \mathcal{H}_1 \oplus \mathcal{H}_1^* \to \mathcal{H}_1 \oplus \mathcal{H}_1^*$ defined using matrix notation as
\[
\Gamma = \begin{pmatrix} \gamma_1 & \xi_1^* \\ \xi_1 & 1 + \gamma_1 \end{pmatrix},
\]
we may rewrite the condition (21) as
\[
\Gamma \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \Gamma = \Gamma.
\]
We may refer to an operator satisfying this condition as a symplectic projection. In the fermionic case the corresponding operator is simply a projection. Note that the operator $\Gamma$ may also be described by
\[
\left( |f_1\rangle \oplus \langle g_1|,\ \Gamma\, |f_2\rangle \oplus \langle g_2| \right)_{\mathcal{H}_1\oplus\mathcal{H}_1^*} = \left( \Psi, \left(a(f_2)^* + a(g_2)\right)\left(a(f_1) + a(g_1)^*\right) \Psi \right)_{\mathcal{F}(\mathcal{H}_1)},
\]
where we have used the Dirac bra and ket notation to denote elements of $\mathcal{H}_1$ and $\mathcal{H}_1^*$ respectively.

Given a positive definite trace class operator $\gamma_1$ and a symmetric bilinear form $\xi_1$ satisfying (21) we may find a unique quasi-free pure state $\Psi$ such that $\gamma_1$ is the corresponding one-particle density matrix and $\xi_1$ the corresponding bilinear form. To see this one simply has to show that there exists an orthonormal family $\psi_1, \ldots$ and a sequence of positive numbers $\lambda_1, \ldots$ such that (19) and (20) hold. This is a fairly simple exercise in linear algebra.

The choice of $\xi_1$ is equivalent to a particular choice of eigenbasis for $\gamma_1$. If $\gamma_1$ has real eigenfunctions (in some representation) there is a particular $\xi_1$ corresponding to this choice of basis. We shall use this in our construction of states in the next sections. Consider as an example $\gamma_1$ being a real translation invariant operator on the Hilbert space $L^2(\mathbb{R}^n/2\pi\mathbb{Z}^n)$ of square integrable functions on the torus. The real eigenfunctions come in degenerate pairs of the form $\cos(px)$ and $\sin(px)$, $p \in \mathbb{Z}^n$. The associated quasi-free state will in the exponent have terms of the form
\[
a(\cos(px))^* a(\cos(px))^* + a(\sin(px))^* a(\sin(px))^* = a(e^{ipx})^* a(e^{-ipx})^*.
\]
This corresponds to a pairing of states with opposite momenta, as is the usual case in the Bogolubov pair theory.

The Bogolubov variational states are not just quasi-free states as defined above. In fact, quasi-free states, being built out of squeezed states, are not well suited for describing condensates (see the discussion after (12) and (13)). We introduce condensates by appropriate unitary transformations of quasi-free states, as we shall now describe. Given $\phi \in \mathcal{H}_1$ we have a unitary map $U_\phi$ on the Fock space $\mathcal{F}(\mathcal{H}_1)$ which satisfies
\[
U_\phi^*\, a(f)\, U_\phi = a(f) + (f, \phi).
\]
This unitary is unique up to an overall complex phase, which we may fix by adding the requirement that the unitary maps the vacuum state to a coherent state, $U_\phi|0\rangle = |\phi\rangle_C$.
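Returning to the relations (21) and the matrix $\Gamma$ introduced above: for a single pair state with real $0 < \lambda < 1$, the formulas (19) and (20) reduce $\gamma_1$ and $\xi_1$ to the numbers $\lambda^2/(1-\lambda^2)$ and $-\lambda/(1-\lambda^2)$, and the symplectic projection condition becomes an identity between $2\times 2$ matrices. The sketch below checks this numerically; the restriction to one mode and the function names are simplifying assumptions made for illustration only.

```python
def matmul2(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def symplectic_projection_defect(lam):
    """Largest entry of |Gamma S Gamma - Gamma| for a single mode with parameter lam."""
    gamma = lam ** 2 / (1.0 - lam ** 2)    # eigenvalue of gamma_1, see (19)
    xi = -lam / (1.0 - lam ** 2)           # coefficient of xi_1, see (20)
    big_gamma = [[gamma, xi], [xi, 1.0 + gamma]]
    s = [[-1.0, 0.0], [0.0, 1.0]]
    product = matmul2(matmul2(big_gamma, s), big_gamma)
    return max(abs(product[i][j] - big_gamma[i][j])
               for i in range(2) for j in range(2))
```

The defect vanishes up to rounding because $\xi^2 = \gamma(\gamma+1)$ holds exactly for these coefficients, which is the single-mode form of the first relation in (21).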
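From the first identity in (9) it is clear that $U_\phi$ satisfies this up to a phase. The Bogolubov variational states are constructed from a quasi-free state $\Psi$ and a vector $\phi \in \mathcal{H}_1$ as $\Psi_\phi = U_\phi \Psi$. From the above discussion we see that a Bogolubov state may be described as follows.

Definition 2.1 (Bogolubov variational states). A Bogolubov state on the bosonic Fock space $\mathcal{F}(\mathcal{H}_1)$ is given by
\[
\Psi_{\phi,\gamma_1,\xi_1} = \prod_{\alpha=1}^{\infty} (1-\lambda_\alpha^2)^{1/4} \exp\left(-\frac{\lambda_\alpha}{2} \left(a(\psi_\alpha)^* - (\phi, \psi_\alpha)\right)\left(a(\psi_\alpha)^* - (\phi, \psi_\alpha)\right)\right)|\phi\rangle_C, \tag{22}
\]
where $\phi \in \mathcal{H}_1$, $\psi_1, \psi_2, \ldots$ is an orthonormal family in $\mathcal{H}_1$, and $0 < \lambda_1, \lambda_2, \ldots < 1$ satisfy $\sum_{\alpha=1}^{\infty} \lambda_\alpha^2 < \infty$. We call $\phi$ the condensate vector and $\psi_1, \psi_2, \ldots$ the pair states.

There is a one-to-one correspondence between Bogolubov states and triples $(\phi, \gamma_1, \xi_1)$ consisting of a vector $\phi \in \mathcal{H}_1$, a positive trace class operator $\gamma_1$ on $\mathcal{H}_1$, and a bilinear form $\xi_1$ on $\mathcal{H}_1 \times \mathcal{H}_1$ satisfying (21). The correspondence is given by (19) and (20). We find for the one-particle density matrix of the Bogolubov state $\Psi_{\phi,\gamma_1,\xi_1}$ that
\[
\begin{aligned}
\left(\Psi_{\phi,\gamma_1,\xi_1}, a(u)^* a(v) \Psi_{\phi,\gamma_1,\xi_1}\right)_{\mathcal{F}(\mathcal{H}_1)}
&= \left(\Psi_{0,\gamma_1,\xi_1}, \left(a(u)^* + (\phi, u)\right)\left(a(v) + (v, \phi)\right) \Psi_{0,\gamma_1,\xi_1}\right)_{\mathcal{F}(\mathcal{H}_1)} \\
&= (v, \gamma_1 u) + (v, \phi)(\phi, u)
\end{aligned} \tag{23}
\]
and likewise for the two-particle density matrix using (15),
\[
\begin{aligned}
&\left(\Psi_{\phi,\gamma_1,\xi_1}, a(u_1)^* a(u_2)^* a(v_2) a(v_1) \Psi_{\phi,\gamma_1,\xi_1}\right)_{\mathcal{F}(\mathcal{H}_1)} \\
&\quad= (v_1, \phi)(v_2, \phi)(\phi, u_1)(\phi, u_2) \\
&\qquad+ \xi_1(u_1, u_2)(v_1, \phi)(v_2, \phi) + \overline{\xi_1(v_1, v_2)}\,(\phi, u_1)(\phi, u_2) \\
&\qquad+ (v_2, \gamma_1 u_1)(v_1, \phi)(\phi, u_2) + (v_1, \gamma_1 u_2)(v_2, \phi)(\phi, u_1) \\
&\qquad+ (v_2, \gamma_1 u_2)(v_1, \phi)(\phi, u_1) + (v_1, \gamma_1 u_1)(v_2, \phi)(\phi, u_2) \\
&\qquad+ (v_1, \gamma_1 u_1)(v_2, \gamma_1 u_2) + (v_1, \gamma_1 u_2)(v_2, \gamma_1 u_1) + \overline{\xi_1(v_1, v_2)}\,\xi_1(u_1, u_2).
\end{aligned} \tag{24}
\]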
The above trial states are motivated by the Bogolubov approximation for Bose condensed systems. The vector $\phi$ represents the condensate, whereas the states $\psi_\alpha$, $\alpha = 1, \ldots$ represent the pair states. A key ingredient in the Bogolubov approximation is the c-number substitution, i.e., the replacement of the operator $a(\phi)$ by the number $\|\phi\|^2$. This replacement will give the correct value for expectations of normal ordered products in the Bogolubov states if we have the additional assumption that $\gamma_1 \phi = 0$ (see (10)). In Sect. 3 we will choose a Bogolubov state satisfying this assumption, but in Sect. 4 the Bogolubov state that we choose will not satisfy the assumption.

It is not the aim here to study the general properties of the Bogolubov variational problem, i.e., the minimization of the expectation of many-body Hamiltonians restricted to Bogolubov states. We will instead proceed to the specific examples of the one-component and two-component charged Bose gas. Here we shall not characterize the exact Bogolubov minimizer, but instead give the semiclassical approximations to these states which give the leading order asymptotics in Theorems 1.1 and 1.2.

The Hamiltonians that we are interested in are particle number conserving, i.e., commute with particle number, and the reader may wonder why we do not define a class
of particle conserving, i.e., canonical trial states rather than the grand canonical states above. As in the fermionic BCS theory it is very complicated to write a canonical trial state. The calculations are greatly simplified in the grand canonical setting. Simple minded trial states with a fixed number of particles in the condensate will not give the correct approximation, since the important virtual pair creation will be lost.

3. The Two-Component Charged Bose Gas

We consider the two-component Bose gas described by the Hamiltonian
\[
H^{(2)} = \bigoplus_{N=0}^{\infty} H_N^{(2)}, \qquad H_N^{(2)} = \sum_{i=1}^{N} -\tfrac{1}{2}\Delta_i + \sum_{1\le i<j\le N} \frac{e_i e_j}{|x_i - x_j|}
\]
acting on the Fock space $\mathcal{F}(L^2(\mathbb{R}^3\times\{1,-1\}))$, where the variable $(x_i, e_i) \in \mathbb{R}^3\times\{1,-1\}$ gives the position and charge of particle $i$. Our goal here is first to construct a grand canonical normalized trial function $\Psi \in \mathcal{F}(L^2(\mathbb{R}^3\times\{1,-1\}))$ with particle numbers concentrated sharply around the average value $\langle\mathcal{N}\rangle = (\Psi, \mathcal{N}\Psi)$ and such that
\[
\langle H^{(2)}\rangle = (\Psi, H^{(2)}\Psi) \le -A\langle\mathcal{N}\rangle^{7/5} + o\left(\langle\mathcal{N}\rangle^{7/5}\right) \tag{25}
\]
for large $\langle\mathcal{N}\rangle$. We have denoted the expectation in the state $\Psi$ by $\langle A\rangle = (\Psi, A\Psi)$. From this the proof of Dyson’s formula, Theorem 1.2 (i.e., the fact that we can achieve this estimate with a trial function of fixed particle number), will follow fairly easily (see Sect. 3.1).

To construct the trial state we use the method from the previous section. We begin with a normalized minimizer $\Phi$ for the variational problem (4). Using spherically symmetric decreasing rearrangements it is not difficult to see that a minimizer exists and that it may be chosen positive and spherically symmetric decreasing. Moreover, from the Euler-Lagrange equation it is exponentially decreasing and smooth. It is, however, not essential that we can find an exact minimizer with these properties. As we shall see, we could as well have chosen an approximate minimizer which is smooth and compactly supported. Let $n > 0$ and define the normalized function
\[
\phi_0(x) = n^{3/10}\, \Phi(n^{1/5} x). \tag{26}
\]
We define a normalized state $\Psi_n \in \mathcal{F}$ as in (22) with the condensate vector on $L^2(\mathbb{R}^3\times\{-1,1\})$ given by
\[
\phi(x, e) = \sqrt{\frac{n}{2}}\, \phi_0(x)
\]
and the operator $\gamma_1$ on $L^2(\mathbb{R}^3\times\{-1,1\})$ defined by the integral kernel
\[
\gamma_1(x, e; y, e') = \tfrac{1}{2}\, \gamma(x, y)\, e e',
\]
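where $\gamma$ is a positive semi-definite trace class operator having real eigenfunctions. We shall make an explicit choice for $\gamma$ below (see (39)). We write the spectral decomposition of $\gamma$ as
\[
\gamma = \sum_{\alpha=1}^{\infty} \frac{\lambda_\alpha^2}{1-\lambda_\alpha^2}\, |\psi_\alpha\rangle\langle\psi_\alpha|, \tag{27}
\]
where $\psi_\alpha$, $\alpha = 1, \ldots$ is a real orthonormal basis and $0 \le \lambda_\alpha < 1$ for $\alpha = 1, \ldots$. Observe that on the space $L^2(\mathbb{R}^3\times\{1,-1\})$ we have $\|\phi\|^2 = n$ and $\gamma_1 \phi = 0$. Denoting $\psi_{\alpha\pm}(x, e) = \psi_\alpha(x)\,\delta_{\pm 1, e}$, $\alpha = 1, \ldots$ we may write the trial state $\Psi_n$ as
\[
\Psi_n = \prod_{\alpha=1}^{\infty} (1-\lambda_\alpha^2)^{1/4} \exp\left( -\sum_{e,e'=\pm} \sum_{\alpha=1}^{\infty} \frac{\lambda_\alpha}{4}\, e e'\, a^*_{\alpha e} a^*_{\alpha e'} + a^*(\phi) - \frac{n}{2} \right)|0\rangle, \tag{28}
\]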
where $a^*_{\alpha e} = a(\psi_{\alpha e})^*$, for $\alpha = 1, \ldots$.

As discussed in the previous section, choosing $n$ and any $\gamma$ with real eigenfunctions uniquely specifies a state $\Psi_n$ of the form above (possible degenerate eigenvalues will not cause ambiguities). Instead of specifying the individual eigenfunctions $\psi_\alpha$ and parameters $\lambda_\alpha$, $\alpha = 1, \ldots$ we will simply choose the operator $\gamma$.

The state $\Psi_n$ should be compared to Dyson’s trial state in [6]. The main difference is that whereas we use a coherent state construction for the condensate, Dyson used squeezed states for this as well. Put differently, Dyson’s trial state corresponds to an exponential of a purely quadratic expression in creation operators without any linear terms. As we explained in the previous section, the consequence of using the linear term in the exponent is that the variation in the number of particles occupying the state $\phi_0$ is much smaller than for a quadratic term.

From (23) we find for the expected number of particles in the state $\Psi_n$,
\[
\langle\mathcal{N}\rangle = \left\langle \sum_{\alpha=1}^{\infty} \sum_{e=\pm} a^*_{\alpha e} a_{\alpha e} \right\rangle = n + \operatorname{Tr} \gamma, \tag{29}
\]
and for the kinetic energy expectation
\[
\left( \Psi_n, \bigoplus_{N=0}^{\infty} \sum_{i=1}^{N} -\tfrac{1}{2}\Delta_i\, \Psi_n \right) = \frac{n}{2} \int |\nabla\phi_0|^2 + \operatorname{Tr}\left(-\tfrac{1}{2}\Delta\gamma\right) = \frac{n^{7/5}}{2} \int |\nabla\Phi|^2 + \operatorname{Tr}\left(-\tfrac{1}{2}\Delta\gamma\right). \tag{30}
\]
From (6) we get that
\[
\left( \Psi_n, \bigoplus_{N=0}^{\infty} \sum_{1\le i<j\le N} \frac{e_i e_j}{|x_i - x_j|}\, \Psi_n \right) = \frac{1}{2} \sum_{\alpha,\beta,\mu,\nu=1}^{\infty} \sum_{e,e'=\pm} e e'\, w_{\alpha\beta\nu\mu} \left\langle a^*_{\alpha e} a^*_{\beta e'} a_{\mu e'} a_{\nu e} \right\rangle, \tag{31}
\]
where
\[
w_{\alpha\beta\nu\mu} = \iint \psi_\alpha(x)\psi_\beta(y)\, |x - y|^{-1}\, \psi_\nu(x)\psi_\mu(y)\, dx\, dy. \tag{32}
\]
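(Since the Coulomb energy is an unbounded operator one may worry about the convergence of the expansion in (31). This problem is easily circumvented by introducing a convergence factor into $|x|^{-1}$, e.g., $|x|^{-1}(1 - \exp(-t|x|))$. The expectation on the left of (31) converges as $t \to \infty$ by the Monotone Convergence Theorem, since for fixed values of the charges each term is monotone in $t$. We may do all calculations and estimates for finite $t$ and at the end let $t \to \infty$. We will here ignore this slight complication.)

Using the notation of Sect. 2 we have
\[
\left(\psi_{\beta e'}, \gamma_1 \psi_{\alpha e}\right) = \frac{e e'}{2} \frac{\lambda_\alpha^2}{1-\lambda_\alpha^2}\, \delta_{\alpha\beta}, \qquad \xi_1(\psi_{\beta e'}, \psi_{\alpha e}) = -\frac{e e'}{2} \frac{\lambda_\alpha}{1-\lambda_\alpha^2}\, \delta_{\alpha\beta}, \tag{33}
\]
and thus from (24),
\[
\begin{aligned}
\left\langle a^*_{\alpha e} a^*_{\beta e'} a_{\mu e'} a_{\nu e} \right\rangle
&= \frac{n^2}{4} (\phi_0, \psi_\alpha)(\phi_0, \psi_\beta)(\psi_\mu, \phi_0)(\psi_\nu, \phi_0) \\
&\quad- n\frac{e e'}{4} \left( \delta_{\alpha\beta} \frac{\lambda_\alpha}{1-\lambda_\alpha^2} (\psi_\mu, \phi_0)(\psi_\nu, \phi_0) + \delta_{\mu\nu} \frac{\lambda_\mu}{1-\lambda_\mu^2} (\phi_0, \psi_\alpha)(\phi_0, \psi_\beta) \right) \\
&\quad+ n\frac{e e'}{4} \left( \delta_{\alpha\mu} \frac{\lambda_\alpha^2}{1-\lambda_\alpha^2} (\phi_0, \psi_\beta)(\psi_\nu, \phi_0) + \delta_{\beta\nu} \frac{\lambda_\beta^2}{1-\lambda_\beta^2} (\phi_0, \psi_\alpha)(\psi_\mu, \phi_0) \right) \\
&\quad+ \frac{n}{4}\, \delta_{\alpha\nu} \frac{\lambda_\alpha^2}{1-\lambda_\alpha^2} (\phi_0, \psi_\beta)(\psi_\mu, \phi_0) + \frac{n}{4}\, \delta_{\beta\mu} \frac{\lambda_\beta^2}{1-\lambda_\beta^2} (\phi_0, \psi_\alpha)(\psi_\nu, \phi_0) \\
&\quad+ \frac{\delta_{\alpha\nu}\delta_{\beta\mu}}{4} \frac{\lambda_\alpha^2}{1-\lambda_\alpha^2} \frac{\lambda_\beta^2}{1-\lambda_\beta^2} + \frac{\delta_{\alpha\mu}\delta_{\beta\nu}}{4} \frac{\lambda_\alpha^2}{1-\lambda_\alpha^2} \frac{\lambda_\beta^2}{1-\lambda_\beta^2} \\
&\quad+ \frac{\delta_{\alpha\beta}\delta_{\mu\nu}}{4} \frac{\lambda_\alpha}{1-\lambda_\alpha^2} \frac{\lambda_\mu}{1-\lambda_\mu^2}.
\end{aligned} \tag{34}
\]
We therefore arrive at
\[
\left( \Psi_n, \bigoplus_{N=0}^{\infty} \sum_{1\le i<j\le N} \frac{e_i e_j}{|x_i - x_j|}\, \Psi_n \right) = \sum_{\alpha=1}^{\infty} \sum_{\mu,\nu=1}^{\infty} w_{\alpha\alpha\mu\nu}\, (\psi_\nu, \phi_0)(\psi_\mu, \phi_0)\, n \left( \frac{\lambda_\alpha^2}{1-\lambda_\alpha^2} - \frac{\lambda_\alpha}{1-\lambda_\alpha^2} \right),
\]
where we have used that $\phi_0$ and $\psi_\alpha$, $\alpha = 1, \ldots$ are real. From the expression for $w_{\alpha\alpha\mu\nu}$ we see that we may write this as
\[
\left( \Psi_n, \bigoplus_{N=0}^{\infty} \sum_{1\le i<j\le N} \frac{e_i e_j}{|x_i - x_j|}\, \Psi_n \right) = n \operatorname{Tr}\left[ \mathcal{K}\left( \gamma - \sqrt{\gamma(\gamma+1)} \right) \right], \tag{35}
\]
where $\mathcal{K}$ is the operator on $L^2(\mathbb{R}^3)$ with integral kernel
\[
\mathcal{K}(x, y) = \phi_0(x)\, |x - y|^{-1}\, \phi_0(y). \tag{36}
\]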
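Putting together (30) and (35) we arrive at
\[
\langle H^{(2)}\rangle = \frac{n^{7/5}}{2} \int |\nabla\Phi|^2 + \operatorname{Tr}\left(-\tfrac{1}{2}\Delta\gamma\right) + n \operatorname{Tr}\left[ \mathcal{K}\left( \gamma - \sqrt{\gamma(\gamma+1)} \right) \right]. \tag{37}
\]
Our next goal is to construct the operator $\gamma$. Here we shall use the method of coherent state symbols. Let $\chi(x) = \pi^{-3/2}\exp(-x^2)$ such that $\int \chi(x)^2\,dx = 1$. Let $0 < \ell$ be a parameter which we shall specify below as a function of $n$ such that $n^{-2/5} \ll \ell \ll n^{-1/5}$. Denote $\chi_\ell(x) = \ell^{-3/2}\chi(x/\ell)$ and let $\theta_{u,p}(x) = \exp(ipx)\,\chi_\ell(x - u)$. We then define $\gamma$ to be the operator
\[
\gamma = (2\pi)^{-3} \iint_{\mathbb{R}^3\times\mathbb{R}^3} f(u, p)\, |\theta_{u,p}\rangle\langle\theta_{u,p}|\, du\, dp, \tag{38}
\]
where
\[
f(u, p) = g\left( \frac{p}{(8\pi n \phi_0(u)^2)^{1/4}} \right), \tag{39}
\]
where
\[
g(p) = \frac{1}{2} \left( \frac{p^4 + 1}{p^2 (p^4 + 2)^{1/2}} - 1 \right). \tag{40}
\]
We see that $f(u, p) \ge 0$ and hence $\gamma$ is a positive semi-definite operator, and since $f(u, p) = f(u, -p)$ all eigenfunctions of $\gamma$ may be chosen real. That this is an appropriate choice for the function $f$ will be seen at the end of our calculation (see (48)). Moreover,
\[
\begin{aligned}
\operatorname{Tr} \gamma = (2\pi)^{-3} \iint f(u, p)\, du\, dp
&= \pi^{-9/4} \left(\frac{n}{2}\right)^{3/4} \int_{\mathbb{R}^3} \phi_0(u)^{3/2}\, du \int_{\mathbb{R}^3} g(p)\, dp \\
&= 2^{-3/4}\pi^{-9/4}\, n^{3/5} \int_{\mathbb{R}^3} \Phi(u)^{3/2}\, du \int_{\mathbb{R}^3} g(p)\, dp.
\end{aligned} \tag{41}
\]
Thus $\gamma$ is a trace class operator. Hence we have all the requirements needed in order for $\gamma$ to define a state $\Psi_n$. Moreover, we see from (29) that for large $n$,
\[
\langle\mathcal{N}\rangle = n + O(n^{3/5}). \tag{42}
\]
We turn now to the calculation of the expectation of the kinetic energy,
\[
\begin{aligned}
\operatorname{Tr}(-\Delta\gamma) &= (2\pi)^{-3} \iint \left( \int |\nabla\theta_{u,p}|^2 \right) f(u, p)\, du\, dp \\
&= (2\pi)^{-3} \iint p^2 f(u, p)\, du\, dp + (2\pi)^{-3} \iint \left( \int (\nabla\chi_\ell)^2 \right) f(u, p)\, du\, dp \\
&\le (2\pi)^{-3} \iint p^2 f(u, p)\, du\, dp + C (n^{2/5}\ell)^{-2}\, n^{7/5} \\
&= 2^{3/4}\pi^{-7/4}\, n^{7/5} \int_{\mathbb{R}^3} \Phi(u)^{5/2}\, du \int_{\mathbb{R}^3} p^2 g(p)\, dp + C (n^{2/5}\ell)^{-2}\, n^{7/5},
\end{aligned} \tag{43}
\]
where in the last step we have used the definition (26) of $\phi_0$.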
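The numerical prefactors and powers of $n$ in (41) and (43) come from rescaling $p$ by $(8\pi n\phi_0(u)^2)^{1/4}$ in (39) and then inserting the scaling (26) of $\phi_0$. The short sketch below merely redoes this bookkeeping; the helper name `n_exponent` is hypothetical, and the only inputs are the change of variables in $p$ and the identity $\int \phi_0^s = n^{3s/10 - 3/5}\int \Phi^s$ that follows from (26).

```python
from fractions import Fraction
from math import isclose, pi

def n_exponent(k):
    """Power of n in (2*pi)^-3 * iint |p|^k f(u, p) du dp, for f as in (39).

    Rescaling p gives the factor (8*pi*n*phi0(u)^2)^((3 + k)/4); the remaining
    u-integral of phi0^s, with s = (3 + k)/2, scales like n^(3s/10 - 3/5) by (26).
    """
    power = Fraction(3 + k, 4)
    s = 2 * power
    return power + Fraction(3, 10) * s - Fraction(3, 5)

# Numerical prefactors: (2*pi)^-3 * (8*pi)^((3 + k)/4) for k = 0 and k = 2.
prefactor_41 = (2 * pi) ** -3 * (8 * pi) ** 0.75
prefactor_43 = (2 * pi) ** -3 * (8 * pi) ** 1.25
```

With these conventions `n_exponent(0)` is $3/5$ and `n_exponent(2)` is $7/5$, and the two prefactors agree with $2^{-3/4}\pi^{-9/4}$ and $2^{3/4}\pi^{-7/4}$ respectively.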
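The next step in calculating the energy expectation in the state $\Psi_n$ is to calculate (or rather estimate) $\operatorname{Tr}\left(\mathcal{K}\left(\sqrt{\gamma(\gamma+1)} - \gamma\right)\right)$. In order to do this we shall use the operator version of the Berezin-Lieb inequality given in (76) in Theorem A.1 in Appendix A. We will use it for the operator concave function $\xi(t) = \sqrt{t(t+1)} - t$ (see the discussion at the end of Appendix A) and the map $\omega \mapsto |\omega\rangle$ being $(u, p) \mapsto |\theta_{u,p}\rangle$. We have
\[
(2\pi)^{-3} \iint |\theta_{u,p}\rangle\langle\theta_{u,p}|\, du\, dp = I.
\]
Since $\mathcal{K}$ is a positive operator we conclude from Theorem A.1 that
\[
\operatorname{Tr}\left(\mathcal{K}\left(\sqrt{\gamma(\gamma+1)} - \gamma\right)\right) \ge (2\pi)^{-3} \iint \left( \sqrt{f(u, p)(f(u, p)+1)} - f(u, p) \right) \langle\theta_{u,p}|\mathcal{K}|\theta_{u,p}\rangle\, du\, dp. \tag{44}
\]
Since $|x - y|^{-1}$ is a positive definite kernel we have for $0 \le \delta$,
\[
\begin{aligned}
\langle\theta_{u,p}|\mathcal{K}|\theta_{u,p}\rangle
&= \iint e^{ipx}\chi_\ell(x - u)\phi_0(x)\, |x - y|^{-1}\, e^{-ipy}\chi_\ell(y - u)\phi_0(y)\, dx\, dy \\
&\ge (1 - C\delta)\, \phi_0(u)^2 \iint e^{ipx}\chi_\ell(x - u)\, |x - y|^{-1}\, e^{-ipy}\chi_\ell(y - u)\, dx\, dy - C\delta^{-1} (n^{2/5}\ell)^4 n^{-3/5} \\
&\ge \phi_0(u)^2 \iint e^{ipx}\chi_\ell(x)\, |x - y|^{-1}\, e^{-ipy}\chi_\ell(y)\, dx\, dy - C\delta (n^{2/5}\ell)^2 n^{-1/5} - C\delta^{-1} (n^{2/5}\ell)^4 n^{-3/5} \\
&\ge \phi_0(u)^2 \int j(q)\, \frac{4\pi}{|p - q|^2}\, dq - C (n^{2/5}\ell)^3 n^{-2/5},
\end{aligned} \tag{45}
\]
where $j(q) = (2\pi)^{-3} |\hat{\chi}_\ell(q)|^2 = \ell^3 \pi^{-3} e^{-2\ell^2 q^2}$ (with the convention $\hat{f}(p) = \int e^{ipx} f(x)\, dx$ for the Fourier transform). In the last inequality we have chosen $\delta = (n^{2/5}\ell)\, n^{-1/5}$, and in the first inequality we have used that $|\phi_0(x) - \phi_0(u)| \le C n^{1/2} |x - u|$ and hence
\[
\iint \chi_\ell(x - u)\, |\phi_0(x) - \phi_0(u)|\, |x - y|^{-1}\, \chi_\ell(y - u)\, |\phi_0(y) - \phi_0(u)|\, dx\, dy \le C (n^{2/5}\ell)^4 n^{-3/5}.
\]
We have that $\int j(q)\, dq = 1$. We will use the estimate
\[
\begin{aligned}
\left| |p|^{-2} - j * |p|^{-2} \right|
&\le |p|^{-2} \int j(q) \frac{|q|}{|p - q|}\, dq + |p|^{-1} \int j(q) \frac{|q|}{|p - q|^2}\, dq \\
&\le \sup_q\, j(q)|q|^{7/2} \left( |p|^{-2} \int |q|^{-5/2} |p - q|^{-1}\, dq + |p|^{-1} \int |q|^{-5/2} |p - q|^{-2}\, dq \right) \\
&\le C |p|^{-5/2} \sup_q\, j(q)|q|^{7/2}.
\end{aligned} \tag{46}
\]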
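For our explicit choice of $j$ we get $\left| |p|^{-2} - j * |p|^{-2} \right| \le C \ell^{-1/2} |p|^{-5/2}$. From (44), (45) and estimate (41) we find that
\[
\begin{aligned}
\operatorname{Tr}\left(\mathcal{K}\left(\sqrt{\gamma(\gamma+1)} - \gamma\right)\right)
&\ge 2(2\pi)^{-2} \iint \left( \sqrt{f(u, p)(f(u, p)+1)} - f(u, p) \right) \phi_0(u)^2\, j * |p|^{-2}\, du\, dp - C (n^{2/5}\ell)^3 n^{1/5} \\
&\ge 2^{-1/4}\pi^{-7/4}\, n^{2/5} \iint \left( \sqrt{g(p)(g(p)+1)} - g(p) \right) \Phi(u)^{5/2}\, |p|^{-2}\, du\, dp \\
&\qquad- C (n^{2/5}\ell)^{-1/2} n^{2/5} - C (n^{2/5}\ell)^3 n^{1/5},
\end{aligned} \tag{47}
\]
where we have also used that $\iint \left( \sqrt{f(u, p)(f(u, p)+1)} - f(u, p) \right) du\, dp \le C n^{3/5}$ (as in (41)). If we now insert the above estimate and (43) into (37) we arrive at
\[
\begin{aligned}
\langle H^{(2)}\rangle \le{}& n^{7/5} \left( \frac{1}{2} \int_{\mathbb{R}^3} |\nabla\Phi(u)|^2\, du + 2^{-1/4}\pi^{-7/4} \int_{\mathbb{R}^3} \Phi(u)^{5/2}\, du \right. \\
&\qquad\left. \times \int_{\mathbb{R}^3} \left[ p^2 g(p) - |p|^{-2} \left( \sqrt{g(p)(g(p)+1)} - g(p) \right) \right] dp \right) \\
&+ C n^{7/5} \left( (n^{2/5}\ell)^3 n^{-1/5} + (n^{2/5}\ell)^{-1/2} \right).
\end{aligned} \tag{48}
\]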
The function $g$ in (40) was chosen precisely so as to optimize the above expression. If we insert the expression for $g$ it is easily seen that the term in the large parenthesis above is
\[
\frac{1}{2} \int_{\mathbb{R}^3} |\nabla\Phi(u)|^2\, du - I_0 \int_{\mathbb{R}^3} \Phi(u)^{5/2}\, du.
\]
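The algebra behind the last claim can be spot-checked numerically. With $g$ as in (40) one has $g(g+1) = 1/(4p^4(p^4+2))$, and in radial coordinates the integrand of the $p$-integral in (48) satisfies $2p^2\left[p^2 g - p^{-2}\left(\sqrt{g(g+1)} - g\right)\right] = -\left(1 + p^4 - p^2(p^4+2)^{1/2}\right)$, which is minus the integrand in (3). The sketch below checks these pointwise identities at a few sample values of $|p|$; the function names are illustrative, and the check is only a sanity test of the algebra, not a substitute for the calculation.

```python
from math import sqrt, pi

def g(p):
    """The function g from (40), for p = |p| > 0."""
    return 0.5 * ((p ** 4 + 1) / (p ** 2 * sqrt(p ** 4 + 2)) - 1)

def bracket_integrand(p):
    """Integrand of the p-integral in (48) as a function of |p|."""
    gp = g(p)
    return p ** 2 * gp - (sqrt(gp * (gp + 1)) - gp) / p ** 2

def foldy_integrand(x):
    """Integrand of the one-dimensional integral in (3)."""
    return 1 + x ** 4 - x ** 2 * sqrt(x ** 4 + 2)

def radial_defect(p):
    """Should vanish: 2 p^2 * bracket_integrand(p) + foldy_integrand(p)."""
    return 2 * p ** 2 * bracket_integrand(p) + foldy_integrand(p)
```

Integrating over $\mathbb{R}^3$ contributes the factor $4\pi p^2$, so the $p$-integral in (48) equals $-2\pi$ times the integral in (3); combined with $2^{-1/4}\pi^{-7/4}\cdot 2\pi = (2/\pi)^{3/4}$ this reproduces the prefactor in (3).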
If we choose $\Phi$ to be an exact minimizer then the expression in the large parenthesis of (48) is $-A$ (recall that $A$ and $I_0$ were defined in Theorem 1.2). From the estimate in (48) we see that if we choose $\ell$ as a function of $n$ such that $n^{2/5}\ell = n^{2/35}$ then
\[
\langle H^{(2)}\rangle \le -A n^{7/5} \left( 1 - C n^{-1/35} \right). \tag{49}
\]
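The choice $n^{2/5}\ell = n^{2/35}$ is the one that balances the two error terms in (48). Writing $n^{2/5}\ell = n^s$, the terms scale as $n^{3s - 1/5}$ and $n^{-s/2}$, and equating the exponents gives $s = 2/35$ and a relative error of order $n^{-1/35}$. A minimal sketch of this arithmetic, with a hypothetical helper name:

```python
from fractions import Fraction

def balance_error_exponents():
    """Solve 3*s - 1/5 = -s/2 for s, where n^(2/5) * l = n^s in (48)."""
    s = Fraction(1, 5) / (3 + Fraction(1, 2))
    relative_error = -s / 2                  # common exponent of both error terms
    ell_exponent = s - Fraction(2, 5)        # l = n^(s - 2/5)
    return s, relative_error, ell_exponent
```

The resulting exponent of $\ell$ is $-12/35$, which lies strictly between $-2/5$ and $-1/5$ as required of $\ell$.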
Because of the estimate (42) this means that we have found a state satisfying (25). We could instead have chosen $\Phi$ to be a smooth compactly supported approximate minimizer to the variational problem (4). We would then for any $\varepsilon > 0$ have proved that $\lim_{n\to\infty} n^{-7/5} \langle H^{(2)}\rangle \le -A + \varepsilon$, which of course implies (25).

3.1. An upper bound for fixed particle number.

In this section we shall prove the upper bound in Theorem 1.2 on the energy $E^{(2)}(N)$ corresponding to a fixed particle number $N$. Let $\Psi_{\varepsilon,n}$ for $n, \varepsilon > 0$ denote the state constructed in the previous section, but with the function $g$ in (40) replaced by the function $g_\varepsilon$, which is equal to $g$ for $|p| > \varepsilon$ and is zero otherwise. We will again denote the expectation of any operator $A$ in the state $\Psi_{\varepsilon,n}$ by $\langle A\rangle$. It then follows from the construction in the previous section that
\[
\lim_{n\to\infty} n^{-7/5} \langle H^{(2)}\rangle \le -A_\varepsilon, \tag{50}
\]
where $A_\varepsilon \to A$ as $\varepsilon \to 0$.
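Let $\Psi^{(m)}_{\varepsilon,n}$ denote the projection of the state $\Psi_{\varepsilon,n}$ onto the subspace corresponding to particle number $m = 0, 1, \ldots$. We then have
\[
\langle\mathcal{N}^2\rangle = \sum_{m=0}^{\infty} m^2 \left\| \Psi^{(m)}_{\varepsilon,n} \right\|^2 = \left\langle \left( \sum_{e=\pm} \sum_{\alpha=1}^{\infty} a^*_{\alpha e} a_{\alpha e} \right)^2 \right\rangle.
\]
Hence from (29) and (34),
\[
\langle\mathcal{N}^2\rangle - \langle\mathcal{N}\rangle^2 = \sum_{\alpha=1}^{\infty} \sum_{e,e'=\pm} \left( \left\langle a^*_{\alpha e} a_{\alpha e} a^*_{\alpha e'} a_{\alpha e'} \right\rangle - \left\langle a^*_{\alpha e} a_{\alpha e} \right\rangle \left\langle a^*_{\alpha e'} a_{\alpha e'} \right\rangle \right) = n + 2\operatorname{Tr} \gamma_\varepsilon(\gamma_\varepsilon + 1),
\]
where $\gamma_\varepsilon$ is given as in (39), but with $f$ replaced by $f_\varepsilon$, which is expressed in terms of $g_\varepsilon$ instead of $g$. Thus using (75) in Theorem A.1 (or (76) for that matter) in the convex case, we see that
\[
\langle\mathcal{N}^2\rangle - \langle\mathcal{N}\rangle^2 \le n + 2(2\pi)^{-3} \iint f_\varepsilon(u, p)\left( f_\varepsilon(u, p) + 1 \right) du\, dp \le n + C_\varepsilon n^{3/5}.
\]
Here $C_\varepsilon > 0$ is a constant depending on $\varepsilon$ and such that $C_\varepsilon \to \infty$ as $\varepsilon \to 0$. It is at this point that it is necessary to replace $g$ with $g_\varepsilon$, since otherwise the above integral is not convergent. For any $M > 0$ we have
\[
\begin{aligned}
\sum_{m - \langle\mathcal{N}\rangle > M} m^{7/5} \left\| \Psi^{(m)}_{\varepsilon,n} \right\|^2
&\le M^{-3/5} \sum_{m=0}^{\infty} m^{7/5}\, |m - \langle\mathcal{N}\rangle|^{3/5} \left\| \Psi^{(m)}_{\varepsilon,n} \right\|^2 \\
&\le M^{-3/5} \langle\mathcal{N}^2\rangle^{7/10} \left\langle (\mathcal{N} - \langle\mathcal{N}\rangle)^2 \right\rangle^{3/10} \\
&= M^{-3/5} \langle\mathcal{N}^2\rangle^{7/10} \left( \langle\mathcal{N}^2\rangle - \langle\mathcal{N}\rangle^2 \right)^{3/10} \le C_\varepsilon M^{-3/5} n^{17/10}.
\end{aligned} \tag{51}
\]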
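The exponents in (51) follow from Hölder's inequality with weights $7/10$ and $3/10$, together with $\langle\mathcal{N}^2\rangle \le Cn^2$ and a variance of order $n$. The sketch below redoes this bookkeeping with exact fractions; the assumption that $M$ is later taken proportional to $N^{3/5}$ with $n$ comparable to $N$ anticipates the choice made in the continuation of the argument, and the function name is ours.

```python
from fractions import Fraction

def tail_bound_exponents():
    """Exponent bookkeeping for the tail estimate (51)."""
    w_second_moment = Fraction(7, 10)    # Hölder weight on <N^2>, which is O(n^2)
    w_variance = Fraction(3, 10)         # Hölder weight on the variance, which is O(n)
    assert w_second_moment + w_variance == 1
    n_power = 2 * w_second_moment + 1 * w_variance
    # Assumption: M proportional to N^(3/5) and n comparable to N.
    total_power = n_power - Fraction(3, 5) * Fraction(3, 5)
    return n_power, total_power
```

The first value is $17/10$, as in (51); the second is $67/50 = 7/5 - 3/50$, strictly below the leading order $7/5$.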
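Given a positive integer $N$, we choose $n = N - C_0 N^{3/5}$. Then if $C_0 > 0$ is chosen appropriately we have according to (29) and (42) that the expected particle number satisfies
\[
N - C_1 N^{3/5} \le \langle\mathcal{N}\rangle \le N - C_2 N^{3/5},
\]
for some $C_1, C_2 > 0$. Since $M \mapsto E(M)$ is a non-increasing and non-positive function (adding particles will always lower the energy, since one may construct a trial state with the extra particles placed arbitrarily far away from the original particles) we have that
\[
\begin{aligned}
E^{(2)}(N) &\le \sum_{m\le N} E^{(2)}(m) \left\| \Psi^{(m)}_{\varepsilon,n} \right\|^2 \\
&\le \sum_{m=0}^{\infty} E^{(2)}(m) \left\| \Psi^{(m)}_{\varepsilon,n} \right\|^2 - \sum_{m > \langle\mathcal{N}\rangle + C_2 N^{3/5}} E^{(2)}(m) \left\| \Psi^{(m)}_{\varepsilon,n} \right\|^2 \\
&\le \langle H^{(2)}\rangle + \sum_{m > \langle\mathcal{N}\rangle + C_2 N^{3/5}} C m^{7/5} \left\| \Psi^{(m)}_{\varepsilon,n} \right\|^2 \\
&\le \langle H^{(2)}\rangle + C_\varepsilon N^{7/5 - 3/50},
\end{aligned}
\]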
where we have used the lower bound $E^{(2)}(m) \ge -Cm^{7/5}$ (see [5] or [11]) and the estimate (51). Thus we finally get the upper bound in Theorem 1.2,
\[
\limsup_{N\to\infty} N^{-7/5}E^{(2)}(N) \le \lim_{\varepsilon\to 0}\,\limsup_{n\to\infty}\, n^{-7/5}\big(\langle H^{(2)}\rangle + C_\varepsilon N^{7/5-3/50}\big) = -A,
\]
according to (50).

4. The One-Component Charged Bose Gas

Since the thermodynamic ground state energy $e(\rho)$ of the one-component charged Bose gas may be calculated by minimizing over all particle numbers we may again consider the grand canonical ensemble. Thus we are looking for an upper bound to the ground state energy of the Hamiltonian $H^{(1)} = \bigoplus_{N=0}^{\infty} H^{(1)}_N$ acting on the Bosonic Fock space $\mathcal{F}(L^2(\Lambda))$. To construct a grand canonical trial function we begin by choosing a real normalized function $\phi_0 \in L^2(\Lambda)$. Let $\eta \in C_0^1(0,L)$ be a non-negative function compactly supported in $(0,L)$ and such that $\int_0^\infty \eta(t)^2\,dt = 1$. Moreover, assume that $\eta(t)$ is a constant for $t\in[r, L-r]$ for some $0 < r < L/4$ to be chosen below. We will write this constant as $(\rho/n)^{1/6}$, for some $n > 0$. In fact, we shall choose $r$ independently of $L$ (for large $L$). We also assume that $\eta(t) \le (\rho/n)^{1/6}$. We then define
\[
\phi_0(x,y,z) = \eta(x)\eta(y)\eta(z). \tag{52}
\]
Thus $\phi_0$ is equal to a constant $\sqrt{\rho/n}$ on the cube $[r, L-r]^3$ and $0 \le \phi_0(x) \le \sqrt{\rho/n}$ for all $x\in\Lambda$. Since $\eta$ is normalized so is $\phi_0$ and $\rho(L-2r)^3 \le n \le \rho L^3$. Thus the constant $n$ is almost the number of particles required to have a neutral system. We have
\[
|\eta(t)| \le CL^{-1/2} \quad\text{and}\quad |\phi_0(x)| \le CL^{-3/2} \tag{53}
\]
and we may assume that the derivatives satisfy $|\eta'(t)| \le Cr^{-1}L^{-1/2}$ and hence
\[
|\nabla\phi_0(x)| \le Cr^{-1}L^{-3/2}. \tag{54}
\]
In particular, we have
\[
\int |\nabla\phi_0(x)|^2\,dx \le C(rL)^{-1}. \tag{55}
\]
Observe that we also have that
\[
\iint \big(n\phi_0(x)^2-\rho\big)\,|x-y|^{-1}\,\big(n\phi_0(y)^2-\rho\big)\,dx\,dy \le C\rho^2L^3r^2. \tag{56}
\]
We choose our grand canonical trial function $\Psi_n$ as in (22). The condensate vector is
\[
\phi = z_0\phi_0, \tag{57}
\]
where the parameter $z_0 > 0$ will be chosen below. The operator $\gamma_1 = \gamma$ (we omit the subscript 1 because we shall use a subscript $\varepsilon$ below with a different meaning) will be chosen to be a positive semi-definite trace class operator with real eigenfunctions. The eigenfunctions (corresponding to non-zero eigenvalues) should satisfy Dirichlet
Upper Bounds to Ground State Energies of Charged Bose Gases
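boundary conditions on the boundary of $\Lambda$. Let $\psi_\alpha$, $\alpha = 1,\ldots$ be an orthonormal basis of real eigenfunctions for $\gamma$. We use the notation $a^*_\alpha = a^*(\psi_\alpha)$. As usual we denote the expectation of an operator $A$ in the state $\Psi_n$ by $\langle A\rangle$. As in (30) we see from (5) and (23),
\begin{align}
\Big\langle \Psi_n, \bigoplus_{N=0}^{\infty}\sum_{i=1}^{N} -\tfrac12\Delta_i\,\Psi_n\Big\rangle
&= \frac{z_0^2}{2}\int|\nabla\phi_0|^2 + \mathrm{Tr}\big(-\tfrac12\Delta\gamma\big) \notag\\
&\le Cz_0^2(rL)^{-1} + \mathrm{Tr}\big(-\tfrac12\Delta\gamma\big), \tag{58}
\end{align}
where in the last inequality we have used (55). We likewise get
\begin{align}
\Big\langle \Psi_n, \bigoplus_{N=0}^{\infty}\sum_{i=1}^{N} V(x_i)\,\Psi_n\Big\rangle
&= \int V(x)\phi(x)^2\,dx + \int V(x)\rho_\gamma(x)\,dx \notag\\
&= \rho\iint_{\Lambda\times\Lambda}\frac{z_0^2\phi_0(y)^2+\rho_\gamma(y)}{|x-y|}\,dx\,dy, \tag{59}
\end{align}
where $\rho_\gamma(x) = \gamma(x,x)$ is the density of the operator $\gamma$. From (6) we have (as in (31)) with $w_{\alpha\beta\nu\mu}$ given exactly as in (32)
\[
\Big\langle \Psi_n, \bigoplus_{N=0}^{\infty}\sum_{1\le i < j\le N}|x_i-x_j|^{-1}\,\Psi_n\Big\rangle = \tfrac12\sum_{\alpha,\beta,\mu,\nu=1}^{\infty} w_{\alpha\beta\nu\mu}\,\big\langle a^*_\alpha a^*_\beta a_\mu a_\nu\big\rangle.
\]
We then obtain from (24) that
\begin{align}
\Big\langle \Psi_n, \bigoplus_{N=0}^{\infty}\sum_{1\le i < j\le N}|x_i-x_j|^{-1}\,\Psi_n\Big\rangle
&= \frac{z_0^4}{2}\iint_{\Lambda\times\Lambda}\phi_0(x)^2|x-y|^{-1}\phi_0(y)^2\,dx\,dy
+ z_0^2\,\mathrm{Tr}\Big[K\Big(\gamma-\sqrt{\gamma(\gamma+1)}\Big)\Big] \notag\\
&\quad + z_0^2\iint_{\Lambda\times\Lambda}\phi_0(x)^2|x-y|^{-1}\rho_\gamma(y)\,dx\,dy
+ \tfrac12\iint_{\Lambda\times\Lambda}\frac{|\gamma(x,y)|^2}{|x-y|}\,dx\,dy \notag\\
&\quad + \tfrac12\iint_{\Lambda\times\Lambda}\frac{\big|\sqrt{\gamma(\gamma+1)}(x,y)\big|^2}{|x-y|}\,dx\,dy
+ \tfrac12\iint_{\Lambda\times\Lambda}\rho_\gamma(x)|x-y|^{-1}\rho_\gamma(y)\,dx\,dy, \tag{60}
\end{align}
where the operator $K$ is given as in (36). Putting together (58), (59), and (60) we arrive at
\begin{align}
\langle H^{(1)}\rangle &\le Cz_0^2(rL)^{-1}
+ \tfrac12\iint\frac{|\gamma(x,y)|^2}{|x-y|}\,dx\,dy
+ \tfrac12\iint\frac{\big|\sqrt{\gamma(\gamma+1)}(x,y)\big|^2}{|x-y|}\,dx\,dy \notag\\
&\quad + \tfrac12\iint_{\Lambda\times\Lambda}\big(\rho-\rho_\gamma(x)-z_0^2\phi_0(x)^2\big)\,|x-y|^{-1}\,\big(\rho-\rho_\gamma(y)-z_0^2\phi_0(y)^2\big)\,dx\,dy \notag\\
&\quad + \mathrm{Tr}\big(-\tfrac12\Delta\gamma\big) + z_0^2\,\mathrm{Tr}\Big[K\Big(\gamma-\sqrt{\gamma(\gamma+1)}\Big)\Big]. \tag{61}
\end{align}
We now choose
\[
\gamma = \gamma_\varepsilon = (2\pi)^{-3}\int_{\mathbb{R}^3} g_\varepsilon\!\Big(\frac{p}{(8\pi\rho)^{1/4}}\Big)\,|\theta_p\rangle\langle\theta_p|\,dp, \tag{62}
\]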
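where the function $g_\varepsilon(p) = 0$ for $|p| \le \varepsilon$ and $g_\varepsilon(p) = g(p)$ for $|p| > \varepsilon$, where $g$ is defined in (40), and
\[
\theta_p(x) = \sqrt{n\rho^{-1}}\,\exp(ipx)\,\phi_0(x). \tag{63}
\]
Recall that $n\rho^{-1}\phi_0(x)^2 \le 1$ and is equal to 1 on most of $\Lambda$. We see that the map $p \mapsto |\theta_p\rangle$ satisfies the requirements of the map $\omega \mapsto |\omega\rangle$ in Theorem A.1 with measure $d\mu(\omega) = (2\pi)^{-3}dp$. That $\gamma_\varepsilon$ satisfies the necessary requirements follows as before. It is clear that the eigenfunctions of $\gamma_\varepsilon$ with non-zero eigenvalues have compact support in $(0,L)^3$. We calculate the density of $\gamma_\varepsilon$
\begin{align}
\rho_{\gamma_\varepsilon}(x) &= (2\pi)^{-3}\int_{\mathbb{R}^3} g_\varepsilon\!\Big(\frac{p}{(8\pi\rho)^{1/4}}\Big)\,|\theta_p(x)|^2\,dp \notag\\
&= (2\pi)^{-3}\,n\rho^{-1}\phi_0(x)^2\int_{\mathbb{R}^3} g_\varepsilon\!\Big(\frac{p}{(8\pi\rho)^{1/4}}\Big)\,dp \notag\\
&= n\rho^{-1/4}\,2^{-3/4}\pi^{-9/4}\,\phi_0(x)^2\int g_\varepsilon(p)\,dp. \tag{64}
\end{align}
We finally choose $z_0 > 0$ such that
\[
z_0^2 = n\Big(1 - 2^{-3/4}\rho^{-1/4}\pi^{-9/4}\int g_\varepsilon(p)\,dp\Big) \tag{65}
\]
(for $\rho$ large enough). Then $z_0^2\phi_0(x)^2 + \rho_{\gamma_\varepsilon}(x) = n\phi_0(x)^2$. It follows from (56) and the fact that $\phi_0(x)^2 \le \rho/n$ that
\[
\iint_{\Lambda\times\Lambda}\big(\rho-\rho_{\gamma_\varepsilon}(x)-z_0^2\phi_0(x)^2\big)\,|x-y|^{-1}\,\big(\rho-\rho_{\gamma_\varepsilon}(y)-z_0^2\phi_0(y)^2\big)\,dx\,dy \le C\rho^2L^3r^2. \tag{66}
\]
To estimate the second term in (61) we will use Hardy's inequality $\int|\nabla u(x)|^2\,dx \ge \tfrac14\int\frac{|u(x)|^2}{|x|^2}\,dx$ as follows:
\begin{align*}
\iint\frac{|\gamma_\varepsilon(x,y)|^2}{|x-y|}\,dx\,dy
&\le \Big(\iint|\gamma_\varepsilon(x,y)|^2\,dx\,dy\Big)^{1/2}\Big(\iint\frac{|\gamma_\varepsilon(x,y)|^2}{|x-y|^2}\,dx\,dy\Big)^{1/2} \\
&\le 2\Big(\iint|\gamma_\varepsilon(x,y)|^2\,dx\,dy\Big)^{1/2}\Big(\iint|\nabla_x\gamma_\varepsilon(x,y)|^2\,dx\,dy\Big)^{1/2} \\
&= 2\big(\mathrm{Tr}\,\gamma_\varepsilon^2\big)^{1/2}\big(\mathrm{Tr}(-\Delta\gamma_\varepsilon^2)\big)^{1/2}.
\end{align*}
Since $x \mapsto x^2$ is operator convex we may estimate these terms using the Berezin-Lieb inequality (76) in the convex case, but we may alternatively simply use the norm bound $\|\gamma_\varepsilon\| \le C\varepsilon^{-2}$. Hence
\[
\iint\frac{|\gamma_\varepsilon(x,y)|^2}{|x-y|}\,dx\,dy \le C\varepsilon^{-2}\Big(\int\rho_{\gamma_\varepsilon}(x)\,dx\Big)^{1/2}\big(\mathrm{Tr}(-\Delta\gamma_\varepsilon)\big)^{1/2} \le C\varepsilon^{-2}\rho L^3, \tag{67}
\]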
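where we have used (64), $n \le \rho L^3$ and the fact which we shall prove below in (68), that $\mathrm{Tr}(-\Delta\gamma_\varepsilon) \le C\rho^{5/4}L^3$ (recall that we will choose $r$ independently of $L$). The third term in (61), which compared to the second term has $\gamma_\varepsilon$ replaced by $\sqrt{\gamma_\varepsilon(\gamma_\varepsilon+1)}$, is estimated in exactly the same way and with the same bound as the second term. We are now left with calculating the last two terms in (61). For the kinetic energy of $\gamma_\varepsilon$ we have as in (43),
\begin{align}
\mathrm{Tr}(-\Delta\gamma_\varepsilon) &\le (2\pi)^{-3}\,\frac{n}{\rho}\int_{\mathbb{R}^3} g_\varepsilon\!\Big(\frac{p}{(8\pi\rho)^{1/4}}\Big)\Big(p^2 + \int|\nabla\phi_0(x)|^2\,dx\Big)\,dp \notag\\
&\le 2^{3/4}\pi^{-7/4}\rho^{5/4}L^3\int_{\mathbb{R}^3} p^2 g_\varepsilon(p)\,dp + C\rho^{3/4}L^3(rL)^{-1/2}, \tag{68}
\end{align}
where we have used (55) and $n \le \rho L^3$. For the last term in (61) we again, as in (44), appeal to the operator version (76) of the Berezin-Lieb inequalities. We arrive at
\[
\mathrm{Tr}\Big[K\Big(\gamma_\varepsilon - \sqrt{\gamma_\varepsilon(\gamma_\varepsilon+1)}\Big)\Big] \le (2\pi)^{-3}\int_{\mathbb{R}^3}\Big(f_\varepsilon(p) - \sqrt{f_\varepsilon(p)(f_\varepsilon(p)+1)}\Big)\,\langle\theta_p|K|\theta_p\rangle\,dp, \tag{69}
\]
where $f_\varepsilon(p) = g_\varepsilon\big(p(8\pi\rho)^{-1/4}\big)$. We have as in (45)
\[
\langle\theta_p|K|\theta_p\rangle = 4\pi\,J * |p|^{-2}, \tag{70}
\]
where $J(p) = (2\pi)^{-3}n\rho^{-1}\big|\widehat{\phi_0^2}(p)\big|^2$. The special form (52) implies that $J(p_1,p_2,p_3) = j(p_1)j(p_2)j(p_3)$, where $j(\tau) = (2\pi)^{-1}n^{1/3}\rho^{-1/3}\big|\widehat{\eta^2}(\tau)\big|^2$. Since $\int j(\tau)\,d\tau = n^{1/3}\rho^{-1/3}\int\eta(t)^4\,dt$, $\int\eta^2 = 1$, and $0 \le \eta(t)^2 \le n^{-1/3}\rho^{1/3}$ and equal to this constant on $[r, L-r]$ we have that $1 - 2r/L \le \int j(\tau)\,d\tau \le 1$. This implies in particular that
\[
(1-2r/L)^3 \le \int J(p)\,dp \le 1. \tag{71}
\]
By (53) and (54) and the support property of $\eta$ we have $\big|\widehat{\eta^2}(\tau)\big| \le |\tau|^{-1}\int|(\eta^2)'(t)|\,dt \le C(|\tau|L)^{-1}$. Thus $j(\tau) \le CL(|\tau|L)^{-2}$. Hence
\[
\int_{|q|>L^{-1/2}} J(q)\,dq \le 3\int_{|\tau|>(3L)^{-1/2}} j(\tau)\,d\tau \le CL^{-1/2}. \tag{72}
\]
For $|p| > \varepsilon(8\pi\rho)^{1/4}$ and $|q| \le L^{-1/2}$ we have $|p-q| \le (1 + C\rho^{-1/4}\varepsilon^{-1}L^{-1/2})|p|$ and hence from (71) and (72),
\begin{align}
J * |p|^{-2} &\ge (1 + C\rho^{-1/4}\varepsilon^{-1}L^{-1/2})^{-2}\,|p|^{-2}\int_{|q| \le L^{-1/2}} J(q)\,dq \notag\\
&\ge (1 + C\rho^{-1/4}\varepsilon^{-1}L^{-1/2})^{-2}\big((1-2r/L)^3 - CL^{-1/2}\big)\,|p|^{-2} \notag\\
&\ge \big(1 - C(\rho^{-1/4}\varepsilon^{-1}L^{-1/2} + rL^{-1} + L^{-1/2})\big)\,|p|^{-2}. \tag{73}
\end{align}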
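Inserting this into (70) and then into (69) we arrive at
\begin{align}
\mathrm{Tr}\Big[K\Big(\gamma_\varepsilon - \sqrt{\gamma_\varepsilon(\gamma_\varepsilon+1)}\Big)\Big]
&\le 2^{-1/4}\rho^{1/4}\pi^{-7/4}\int\Big(g_\varepsilon(p) - \sqrt{g_\varepsilon(p)(g_\varepsilon(p)+1)}\Big)|p|^{-2}\,dp \notag\\
&\quad + C\big(\varepsilon^{-1}L^{-1/2} + \rho^{1/4}rL^{-1} + \rho^{1/4}L^{-1/2}\big). \tag{74}
\end{align}
If we now insert the above estimate, (65), (66), (67), (68), and the same estimate for $\gamma_\varepsilon$ replaced by $\sqrt{\gamma_\varepsilon(\gamma_\varepsilon+1)}$ into (61) we see that
\[
\limsup_{L\to\infty} L^{-3}\langle H^{(1)}\rangle \le \rho^{5/4}\,2^{-1/4}\pi^{-7/4}\int\Big(|p|^2g_\varepsilon(p) + g_\varepsilon(p)|p|^{-2} - \sqrt{g_\varepsilon(p)(g_\varepsilon(p)+1)}\,|p|^{-2}\Big)\,dp + C\rho\,(1 + \rho r^2 + \varepsilon^{-2}).
\]
Here we may actually let $r \to 0$ (which really means that we could have chosen $r$ as a negative power of $L$). If we recall the behavior of $g(p)$ for small $|p|$ from (40) we find that the error in replacing $g_\varepsilon$ by $g$ is of order $\rho^{5/4}\varepsilon$. Thus by choosing $\varepsilon = \rho^{-1/12}$ we obtain the final result
\[
e(\rho) \le \limsup_{L\to\infty} L^{-3}\langle H^{(1)}\rangle \le -I_0\rho^{5/4}\,(1 - C\rho^{-1/12}).
\]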
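The numerical prefactors in (64), (68) and (74) all come from the change of variables $p \to (8\pi\rho)^{1/4}p$ in three dimensions, and the choice $\varepsilon = \rho^{-1/12}$ balances the two $\varepsilon$-dependent errors. The short check below is a sketch of that bookkeeping only; the variable names are hypothetical and the density $\rho$ is scaled out.

```python
from fractions import Fraction
from math import pi

# (64): (2 pi)^-3 (8 pi)^(3/4), expected to equal 2^(-3/4) pi^(-9/4)
density_const = (2 * pi) ** -3 * (8 * pi) ** 0.75

# (68): (2 pi)^-3 (8 pi)^(5/4), expected to equal 2^(3/4) pi^(-7/4)
kinetic_const = (2 * pi) ** -3 * (8 * pi) ** 1.25

# (74): (2 pi)^-3 * 4 pi * (8 pi)^(1/4), expected to equal 2^(-1/4) pi^(-7/4)
coulomb_const = (2 * pi) ** -3 * 4 * pi * (8 * pi) ** 0.25

# Powers of rho of the two errors when eps = rho^(-1/12):
eps_power = Fraction(-1, 12)
replace_g_error = Fraction(5, 4) + eps_power     # rho^(5/4) * eps
norm_bound_error = Fraction(1) - 2 * eps_power   # rho * eps^(-2)
```

Both errors scale as $\rho^{7/6}$, that is, as $\rho^{5/4}\rho^{-1/12}$, and half of the kinetic constant coincides with the Coulomb constant, which is why a single prefactor $2^{-1/4}\pi^{-7/4}$ appears in the final bound.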
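A. The Berezin-Lieb Inequality

In this appendix we shall prove variants of the Berezin-Lieb inequalities [2, 8].

Theorem A.1 (Berezin-Lieb inequalities). Let $\mathcal{H}$ be a Hilbert space and $\Omega$ a measure space with a (positive) measure $\mu$ such that there exists a map $\Omega \ni \omega \mapsto |\omega\rangle \in \mathcal{H}$, satisfying $\int_\Omega |\omega\rangle\langle\omega|\,d\mu(\omega) \le I$ as operators. Assume $\xi : \mathbb{R}_+\cup\{0\} \to \mathbb{R}$ is a concave function with $\xi(0) \ge 0$. Then for any non-negative function $f$ on $\Omega$ satisfying $\int_\Omega f(\omega)\langle\omega|\omega\rangle\,d\mu(\omega) < \infty$ we have the Berezin-Lieb inequality
\[
\mathrm{Tr}_{\mathcal{H}}\,\xi\Big(\int_\Omega f(\omega)\,|\omega\rangle\langle\omega|\,d\mu(\omega)\Big) \ge \int_\Omega \xi(f(\omega))\,\langle\omega|\omega\rangle\,d\mu(\omega). \tag{75}
\]
If moreover $\xi$ is operator concave (still satisfying $\xi(0) \ge 0$) the inequality holds as an operator inequality
\[
\xi\Big(\int_\Omega f(\omega)\,|\omega\rangle\langle\omega|\,d\mu(\omega)\Big) \ge \int_\Omega \xi(f(\omega))\,|\omega\rangle\langle\omega|\,d\mu(\omega). \tag{76}
\]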
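Proof. We first note that $\int_\Omega f(\omega)|\omega\rangle\langle\omega|\,d\mu(\omega)$ is a positive semi-definite trace class operator. Let $u_1, u_2, \ldots$ be an orthonormal basis of eigenvectors for this operator. Then
\begin{align*}
\mathrm{Tr}_{\mathcal{H}}\,\xi\Big(\int_\Omega f(\omega)\,|\omega\rangle\langle\omega|\,d\mu(\omega)\Big)
&= \sum_{i=1}^{\infty}\xi\Big(\int_\Omega f(\omega)\,|\langle\omega|u_i\rangle|^2\,d\mu(\omega)\Big) \\
&\ge \sum_{i=1}^{\infty}\int_\Omega|\langle\omega|u_i\rangle|^2\,d\mu(\omega)\;\xi\bigg(\Big(\int_\Omega|\langle\omega|u_i\rangle|^2\,d\mu(\omega)\Big)^{-1}\int_\Omega f(\omega)\,|\langle\omega|u_i\rangle|^2\,d\mu(\omega)\bigg),
\end{align*}
where we have used that $\int_\Omega|\langle\omega|u_i\rangle|^2\,d\mu(\omega) \le 1$ and that since $\xi$ is concave with $\xi(0) \ge 0$ we have $\xi(at) \ge a\xi(t)$ for all $t \ge 0$ and $0 < a < 1$. If we now use Jensen's inequality we arrive at
\begin{align*}
\mathrm{Tr}_{\mathcal{H}}\,\xi\Big(\int_\Omega f(\omega)\,|\omega\rangle\langle\omega|\,d\mu(\omega)\Big)
&\ge \sum_{i=1}^{\infty}\int_\Omega \xi(f(\omega))\,|\langle\omega|u_i\rangle|^2\,d\mu(\omega) \\
&= \int_\Omega \xi(f(\omega))\,\langle\omega|\omega\rangle\,d\mu(\omega).
\end{align*}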
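As a sanity check of the trace inequality (75), one can test it in a toy setting. The sketch below assumes a two-dimensional real Hilbert space, counting measure on three points, and three vectors of squared norm 2/3 at angles 0, 60 and 120 degrees, which resolve the identity exactly; the values of $f$ and the choice $\xi(t) = \sqrt{t}$ are arbitrary illustrative assumptions, not taken from the text.

```python
from math import cos, sin, pi, sqrt


def weighted_frame_operator(weights, angles, norm_sq):
    """Entries (a, b, d) of the symmetric matrix [[a, b], [b, d]] equal to
    sum_k w_k |omega_k><omega_k| with |omega_k> = sqrt(norm_sq) (cos t_k, sin t_k)."""
    a = sum(w * norm_sq * cos(t) ** 2 for w, t in zip(weights, angles))
    b = sum(w * norm_sq * cos(t) * sin(t) for w, t in zip(weights, angles))
    d = sum(w * norm_sq * sin(t) ** 2 for w, t in zip(weights, angles))
    return a, b, d


def eigenvalues_2x2(a, b, d):
    """Eigenvalues of the real symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2
    radius = sqrt(((a - d) / 2) ** 2 + b ** 2)
    return mean - radius, mean + radius


angles = [0.0, pi / 3, 2 * pi / 3]
norm_sq = 2.0 / 3.0            # makes sum_k |omega_k><omega_k| the identity
f_values = [0.2, 1.5, 4.0]     # arbitrary non-negative function f on three points

resolution = weighted_frame_operator([1.0, 1.0, 1.0], angles, norm_sq)
low, high = eigenvalues_2x2(*weighted_frame_operator(f_values, angles, norm_sq))

# Left and right sides of (75) for the concave function xi(t) = sqrt(t), xi(0) = 0.
trace_side = sqrt(max(low, 0.0)) + sqrt(high)
integral_side = sum(sqrt(f) * norm_sq for f in f_values)
```

In this example the trace side exceeds the integral side, as (75) predicts for a concave $\xi$.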
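We turn to the case when $\xi$ is operator concave. Define the operator $U : \mathcal{H} \to L^2(\Omega, d\mu)$ by $(U\phi)(\omega) = \langle\omega|\phi\rangle$. Then $U^*h = \int_\Omega h(\omega)|\omega\rangle\,d\mu(\omega)$. Thus if $B$ is the multiplication operator on $L^2(\Omega, d\mu)$ given by $Bh(\omega) = f(\omega)h(\omega)$ we have $U^*BU = \int_\Omega f(\omega)|\omega\rangle\langle\omega|\,d\mu(\omega)$. In particular, we have the operator inequalities $0 \le U^*U \le I$. Using that $(1-UU^*)^{1/2}U = U(1-U^*U)^{1/2}$ it is straightforward to check that the following operators on $\mathcal{H}\oplus L^2(\Omega, d\mu)$ (written in matrix notation) are unitary:
\[
\mathcal{U} = \begin{pmatrix} (I-U^*U)^{1/2} & -U^* \\ U & (I-UU^*)^{1/2} \end{pmatrix},
\qquad
\mathcal{V} = \begin{pmatrix} (I-U^*U)^{1/2} & U^* \\ U & -(I-UU^*)^{1/2} \end{pmatrix}.
\]
Moreover we have that
\[
\tfrac12\,\mathcal{U}^*\begin{pmatrix} 0 & 0 \\ 0 & B \end{pmatrix}\mathcal{U} + \tfrac12\,\mathcal{V}^*\begin{pmatrix} 0 & 0 \\ 0 & B \end{pmatrix}\mathcal{V}
= \begin{pmatrix} U^*BU & 0 \\ 0 & (1-UU^*)^{1/2}B(1-UU^*)^{1/2} \end{pmatrix}.
\]
Since $\xi$ is operator concave and $\mathcal{U}$ and $\mathcal{V}$ are unitary we find that
\begin{align*}
\begin{pmatrix} \xi(U^*BU) & 0 \\ 0 & \xi\big((1-UU^*)^{1/2}B(1-UU^*)^{1/2}\big) \end{pmatrix}
&\ge \tfrac12\,\mathcal{U}^*\begin{pmatrix} 0 & 0 \\ 0 & \xi(B) \end{pmatrix}\mathcal{U} + \tfrac12\,\mathcal{V}^*\begin{pmatrix} 0 & 0 \\ 0 & \xi(B) \end{pmatrix}\mathcal{V} \\
&= \begin{pmatrix} U^*\xi(B)U & 0 \\ 0 & (1-UU^*)^{1/2}\xi(B)(1-UU^*)^{1/2} \end{pmatrix}.
\end{align*}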
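The averaging identity used above, in which the off-diagonal blocks of the two conjugations cancel, can be illustrated in the simplest possible case. The sketch below assumes both Hilbert spaces are one-dimensional, so that $U$ and $B$ become real scalars `u` and `b` (hypothetical values) and the two block operators become real 2x2 matrices; it is a toy illustration of the mechanism, not a proof.

```python
from math import sqrt


def matmul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(x):
    """Transpose of a 2x2 matrix (the adjoint, since all entries are real)."""
    return [[x[j][i] for j in range(2)] for i in range(2)]


def conjugate(w, m):
    """Return w^T m w."""
    return matmul(transpose(w), matmul(m, w))


u = 0.6                    # scalar stand-in for U, with 0 <= u*u <= 1
b = 2.5                    # scalar stand-in for the multiplication operator B
s = sqrt(1.0 - u * u)      # (1 - U*U)^(1/2) = (1 - UU*)^(1/2) in the scalar case

script_u = [[s, -u], [u, s]]
script_v = [[s, u], [u, -s]]
zero_plus_b = [[0.0, 0.0], [0.0, b]]

term_u = conjugate(script_u, zero_plus_b)
term_v = conjugate(script_v, zero_plus_b)
average = [[0.5 * (term_u[i][j] + term_v[i][j]) for j in range(2)]
           for i in range(2)]

expected = [[u * b * u, 0.0], [0.0, s * b * s]]
gram_u = matmul(transpose(script_u), script_u)   # should be the identity
gram_v = matmul(transpose(script_v), script_v)   # should be the identity
```

The off-diagonal entries of `term_u` and `term_v` have opposite signs, so only the diagonal blocks survive in the average.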
In particular, this gives $\xi(U^*BU) \ge U^*\xi(B)U$, which is precisely the operator Berezin-Lieb inequality (76).

In order to determine whether a given function is operator concave we may use Nevanlinna's Theorem (see [3] Theorems V.4.11 and V.4.14 and Eq. (V.49)). According to this a real function $\xi$ defined on the positive real axis with an analytic extension to $\mathbb{C}\setminus\{x\in\mathbb{R} \mid x \le 0\}$, which maps the upper half plane into itself, has a representation of the form
\[
\xi(t) = \alpha + \beta t + \int_0^\infty\Big(\frac{\lambda}{\lambda^2+1} - \frac{1}{\lambda+t}\Big)\,d\nu(\lambda),
\]
where $\beta \ge 0$ and where $\nu$ is a positive measure satisfying $\int_0^\infty\frac{1}{1+\lambda^2}\,d\nu(\lambda) < \infty$. Since $t \mapsto -(t+\lambda)^{-1}$ is operator concave the same is true for functions with the above integral representation. As a special case we see that the function $\xi(t) = \sqrt{t(t+1)}$, which is analytic away from the segment $[-1,0]$, is operator concave.

Acknowledgements. I would like to thank Elliott Lieb, Kumar Raman, and Robert Seiringer for valuable discussions.
References
1. Bach, V., Lieb, E.H., Solovej, J.P.: Generalized Hartree-Fock theory and the Hubbard model. J. Stat. Phys. 76, 3–90 (1994)
2. Berezin, F.A.: Izv. Akad. Nauk, ser. mat. 36(No. 5) (1972); English translation: USSR Izv. 6(No. 5) (1972); Berezin, F.A.: General concept of quantization. Commun. Math. Phys. 40, 153–174 (1975)
3. Bhatia, R.: Matrix Analysis, Graduate Texts in Mathematics, Vol. 169. New York: Springer-Verlag, 1997
4. Bogolubov, N.N.: J. Phys. (U.S.S.R.) 11, 23 (1947); Bogolubov, N.N., Zubarev, D.N.: Sov. Phys.-JETP 1, 83 (1955)
5. Conlon, J., Lieb, E.H., Yau, H.-T.: The N^{7/5} Law for Charged Bosons. Commun. Math. Phys. 116, 417–448 (1988)
6. Dyson, F.J.: Ground State Energy of a Finite System of Charged Particles. J. Math. Phys. 8, 1538–1545 (1967)
7. Foldy, L.L.: Charged Boson Gas. Phys. Rev. 124, 649–651 (1961); Errata ibid 125, 2208 (1962)
8. Lieb, E.H.: The classical limit of quantum spin systems. Commun. Math. Phys. 31, 327–340 (1973)
9. Lieb, E.H., Narnhofer, H.: The Thermodynamic Limit for Jellium. J. Stat. Phys. 12, 291–310 (1975); Errata 14, 465 (1976)
10. Lieb, E.H., Solovej, J.P.: Ground State Energy of the One-Component Charged Bose Gas. Commun. Math. Phys. 217, 127–163 (2001). Errata 225, 219–221 (2002)
11. Lieb, E.H., Solovej, J.P.: Ground State Energy of the Two-Component Charged Bose Gas. Commun. Math. Phys. 252, 485–534 (2004)
12. Robinson, D.W.: The ground state of the Bose gas. Commun. Math. Phys. 1, 159–174 (1965)

Communicated by M. Aizenman
Commun. Math. Phys. 266, 819–862 (2006) Digital Object Identifier (DOI) 10.1007/s00220-006-0045-x
Communications in
Mathematical Physics
Continuum Limit of the Volterra Model, Separation of Variables and Non-Standard Realizations of the Virasoro Poisson Bracket O. Babelon Laboratoire de Physique Théorique et Hautes Energies (LPTHE), Unité Mixte de Recherche (UMR 7589), Université Pierre et Marie Curie-Paris6; CNRS; Université Denis Diderot-Paris7, Tour 24-25, 5ème étage, Boite 126, 4 place Jussieu, 75252 Paris Cedex 05, France. E-mail: [email protected] Received: 24 October 2005 / Accepted: 7 March 2006 Published online: 15 June 2006 – © Springer-Verlag 2006
Abstract: The classical Volterra model, equipped with the Faddeev-Takhtajan Poisson bracket, provides a lattice version of the Virasoro algebra. The Volterra model being integrable, we can express the dynamical variables in terms of the so-called separated variables. Taking the continuum limit of these formulae, we obtain the Virasoro generators written as determinants of infinite matrices, the elements of which are constructed with a set of points lying on an infinite genus Riemann surface. The coordinates of these points are separated variables for an infinite set of Poisson commuting quantities including $L_0$. The scaling limit of the eigenvector can also be calculated explicitly, so that the associated Schroedinger equation is in fact exactly solvable.

1. Introduction

The relation between integrable systems and conformal field theory has long been recognized [1, 2]. Although the emphasis has been put rightfully on Baxter Q operator and therefore on Sklyanin's separated variables [3], to the best of our knowledge there are no explicit expressions of the Virasoro generators in terms of these variables. We make here a first step in this direction by considering the classical version of this problem. Our strategy will be to start with the Volterra model on the lattice [4, 6] equipped with the Faddeev-Takhtajan [7, 8] Poisson bracket. Since the Volterra model is integrable, we can rewrite everything in terms of separated variables. Now, the Faddeev-Takhtajan bracket goes directly to the Virasoro Poisson bracket in the continuum limit, and therefore by taking this limit in the separated variables formulae we will obtain the Virasoro generators expressed in terms of separated variables. This leads to the following rather new type of formula for the Virasoro generators:
\[
u(x) = \sum_{n} L_n e^{2in\pi x} = p_0^2 + 2\partial_x^2\log\det\Theta(x) + (L_0 - p_0^2)\,\delta(x).
\]
Here $p_0$ is the zero mode and Poisson commutes with everything, the term $(L_0 - p_0^2)\delta(x)$
will be explained later and the formula for $L_0$ is given in Eq. (66). The infinite matrix $\Theta(x)$ reads ($k, m \in \{1,\cdots,\infty\}$):
\[
\Theta_{km}(x) = \frac{W_k(x)\,\partial_xE_m(x) - \partial_xW_k(x)\,E_m(x)}{Z_k^2 - m^2\pi^2}, \qquad 0 \le x \le 1 \tag{1}
\]
with
\[
W_k(x) = \frac{\sin Z_k(1-x)}{Z_k} + \mu_k\frac{\sin Z_kx}{Z_k}, \qquad E_m(x) = 2m\pi\sin m\pi x. \tag{2}
\]
The above formula for $u(x)$ is valid on the interval $0 \le x \le 1$, and should be extended outside this interval by periodicity (in particular the $\delta(x)$ term is a Dirac comb). The result of this paper is that if the variables $Z_k$, $\mu_k$ have Poisson bracket$^1$
\[
\{Z_k, Z_{k'}\} = 0, \qquad \{Z_k, \mu_{k'}\} = 2\big(Z_k - p_0^2Z_k^{-1}\big)\mu_k\,\delta_{kk'}, \qquad \{\mu_k, \mu_{k'}\} = 0 \tag{3}
\]
then $u(x)$ does satisfy the Virasoro Poisson bracket:
\[
\{u(x), u(y)\} = 4\big(u(x)+u(y)\big)\,\delta'(x-y) + 2\,\delta'''(x-y). \tag{4}
\]
Moreover, the variables $Z_k$, $\mu_k$ are separated variables for an infinite set of higher commuting quantities, including $L_0$. Since the separated variables are also the ones which solve the classical inverse problem, the Schroedinger equation with the potential $u(x)$,
\[
\big(-\partial_x^2 - u(x)\big)\psi(x,\Lambda) = \Lambda^2\psi(x,\Lambda),
\]
is exactly solvable, meaning that we have explicit formulae for both the potential $u(x)$ and a basis of solutions $\psi(x,\Lambda)$. Constructing the linear combination which is quasi periodic (the so called Bloch waves) introduces an infinite genus Riemann surface. The coefficients in the expression of this curve define a complete set of Poisson commuting Hamiltonians including $L_0$. The separated variables are points on this curve.

The paper is organized as follows. In the first three sections we recall some known facts about the Volterra model on the lattice. In particular we recall the formulae expressing the dynamical degrees of freedom in terms of the separated variables. In Sect. 5 we compute the continuum (scaling) limit of the spectral curve. The result is Eq. (41). We then show that the Hamiltonians $H_m$ in this formula are in involution. Moreover we show that the scaling limit of the dynamical divisor still belongs to that curve, and hence defines separated variables for these Hamiltonians. In Sects. 6, 7 and 8, we compute the scaling limit of the eigenvector of the Lax matrix at each point of the spectral curve. The result is rather simple and is given in Eq. (50). We then show that the obtained expression does satisfy a second order Schroedinger equation and we compute its potential $u(x)$. Finally, we construct the two quasi periodic solutions of that equation, the Bloch waves, and recover in this way exactly the same spectral curve as the one obtained in Sect. 5. In Sect. 9, we give conditions under which the determinants of the infinite matrices that appeared in the previous sections exist. We then perform a few checks in a certain perturbative scheme. In Sect. 10 we prepare the ground for the serious calculations coming next.

1 Notice that if we redefine $\Lambda_k = \sqrt{Z_k^2 - p_0^2}$, the Poisson bracket becomes a standard quadratic bracket $\{\Lambda_k, \mu_k\} = 2\Lambda_k\mu_k$. However $p_0$ will then enter the formula for $\Theta(x)$ and, in this work, we prefer to keep that formula simple at the expense of a slightly more complicated Poisson bracket.
Classical Volterra Model and a Lattice Version of Virasoro Algebra
In Sects. 11 and 12 we prove that the potential $u(x)$ does satisfy the Virasoro Poisson bracket. An essential use is made of certain quartic relations, proven very much like the Hirota–Sato bilinear identities. These identities should be considered as generalizations for $\tau$-functions of the quartic relations on Riemann's Theta functions.

2. The Volterra Model

In this and the following two sections we recall some well known facts about the Volterra model. The Volterra model, as an integrable system, was introduced in [4]. It is a restricted version of the Toda lattice. We consider a periodic lattice with $N+1$ sites, and on each lattice site we attach a dynamical variable $a_i$ on which we impose the Faddeev-Takhtajan Poisson bracket [7]:
\[
\{a_i, a_j\} = a_ia_j\Big[(4 - a_i - a_j)(\delta_{i,j+1} - \delta_{j,i+1}) - a_{j+1}\delta_{i,j+2} + a_{i+1}\delta_{j,i+2}\Big]. \tag{5}
\]
This bracket$^2$ is interesting because taking the continuum limit as
\[
a_i \simeq 1 + \Delta^2u(x), \qquad \Delta = \frac{1}{N+1},
\]
it becomes the Virasoro Poisson bracket Eq. (4). For precisely this reason, and in this perspective, the lattice model has been extensively studied both at the classical level [7, 8] and at the quantum level [9–11, 13, 12]. The present paper is one more contribution to this series of works. The Lax matrix for the Volterra model is defined by:
\[
L(\mu) = \begin{pmatrix}
0 & \sqrt{a_1} & 0 & \cdots & \cdots & 0 & \mu^{-1}\sqrt{a_{N+1}} \\
\sqrt{a_1} & 0 & \sqrt{a_2} & \cdots & \cdots & \cdots & 0 \\
\vdots & \ddots & \ddots & \ddots & & & \vdots \\
0 & \cdots & \sqrt{a_{i-1}} & 0 & \sqrt{a_i} & \cdots & 0 \\
\vdots & & & \ddots & \ddots & \ddots & \vdots \\
0 & \cdots & \cdots & \cdots & \sqrt{a_{N-1}} & 0 & \sqrt{a_N} \\
\mu\sqrt{a_{N+1}} & 0 & \cdots & \cdots & 0 & \sqrt{a_N} & 0
\end{pmatrix}. \tag{6}
\]
It is well known that $\mathrm{Tr}\,L^n(\mu)$ are in involution with respect to the Poisson bracket Eq. (5). Hence we have an integrable system on the lattice whose continuum limit is directly related to conformal field theory. The spectral curve is defined as usual:
\[
\Gamma :\quad \det(L(\mu) - \lambda) = 0. \tag{7}
\]
Expanding the determinant we see that it is of the form:
\[
\mu + \mu^{-1} - t(\lambda) = 0, \tag{8}
\]
where $t(\lambda)$ is a polynomial of degree $N+1$,
\[
t(\lambda) = A^{-1}\lambda^{N+1} - A^{-1}\sum_i a_i\,\lambda^{N-1} + \cdots, \tag{9}
\]

2 In terms of Toda Hamiltonian structures, it is a linear combination of restrictions of the second and fourth Poisson brackets.
where $A = \sqrt{a_1a_2\cdots a_{N+1}}$. Assuming $N = 2n$ even, $t(\lambda)$ is an odd polynomial, $t(-\lambda) = -t(\lambda)$, and has exactly $n+1$ independent coefficients. However, in that case, there is one Casimir function $K = t(2)$: $\{t(2), a_i\} = 0$, $\forall i$. The dimension of phase space is $N = 2n$ and we have exactly $n$ commuting quantities. The genus of the curve is $g = N$. At each point $(\lambda,\mu)$ of the spectral curve, we can attach an eigenvector $\Psi(\lambda,\mu) = (\psi_i(\lambda,\mu))$, $i = 1,\ldots,N+1$, corresponding to the eigenvalue $\lambda$ of $L(\mu)$. Explicitly, the equation $(L(\mu) - \lambda)\Psi = 0$ reads
\begin{align}
\sqrt{a_1}\,\psi_2 + \mu^{-1}\sqrt{a_{N+1}}\,\psi_{N+1} &= \lambda\psi_1, \notag\\
\sqrt{a_{i-1}}\,\psi_{i-1} + \sqrt{a_i}\,\psi_{i+1} &= \lambda\psi_i, \tag{10}\\
\mu\sqrt{a_{N+1}}\,\psi_1 + \sqrt{a_N}\,\psi_N &= \lambda\psi_{N+1}. \notag
\end{align}
We extend the definition of the coefficients $a_i$ by periodicity $a_{i+N+1} = a_i$, and introduce a second order difference operator $D$:
\[
(D\Psi)_i \equiv \sqrt{a_{i-1}}\,\psi_{i-1} + \sqrt{a_i}\,\psi_{i+1}.
\]
This operator is a discrete version of a Schroedinger operator with periodic potential. Equations (10) are then equivalent to:
\[
(D\Psi)_i = \lambda\psi_i, \quad\text{with}\quad \psi_{i+N+1} = \mu\psi_i. \tag{11}
\]
Therefore, the eigenvector is a Bloch wave for the difference operator $D$ with a Bloch momentum $\mu$. In the continuum limit, Eq. (11) becomes the Schroedinger equation
\[
\big(-\partial_x^2 - u(x)\big)\psi(x) = \Lambda^2\psi(x), \qquad \lambda \simeq 2 - \Delta^2\Lambda^2.
\]

3. The Free Case

Since in the continuum limit $a_i \to 1$, it is useful to first recall some formulae in the trivial case $a_i = 1$. They will be generalized to the full case in the next section. To introduce the zero mode from the start, we consider the slightly more general case $a_i = a$:
\begin{align*}
\sqrt{a}\,(\psi_2 + \mu^{-1}\psi_{N+1}) &= \lambda\psi_1, \\
\sqrt{a}\,(\psi_{i-1} + \psi_{i+1}) &= \lambda\psi_i, \\
\sqrt{a}\,(\mu\psi_1 + \psi_N) &= \lambda\psi_{N+1}.
\end{align*}
The solution of the bulk equations is $\psi_i = \alpha x_+^i + \beta x_-^i$, where $x_\pm$ are solutions of the equation
\[
x^2 - zx + 1 = 0, \qquad x_\pm(\lambda) = \tfrac12\big(z \pm \sqrt{z^2-4}\big), \qquad z = \frac{\lambda}{\sqrt{a}}.
\]
Imposing the two boundary equations, we get
\begin{align*}
(x_+^{N+1} - \mu)\,\alpha + (x_-^{N+1} - \mu)\,\beta &= 0, \\
(x_+^{N+1} - \mu)\,x_+\alpha + (x_-^{N+1} - \mu)\,x_-\beta &= 0.
\end{align*}
The compatibility of this system yields the spectral curve
\[
\mu + \mu^{-1} = x_+^{N+1} + x_-^{N+1} \equiv t(\lambda). \tag{12}
\]
We now impose that the curve passes through the point $(\lambda = 2, \mu = \mu_0)$, where $\mu_0$ is related to the value of the Casimir function by
\[
K = \mu_0 + \mu_0^{-1}. \tag{13}
\]
Setting
\[
x_\pm(2) = \frac{1}{\sqrt{a}} \pm \sqrt{\frac1a - 1} = e^{\pm\alpha}, \qquad \mu_0 = e^{ip_0},
\]
Eq. (12) gives $\alpha = i\frac{p_0}{N+1}$. Hence the constant $a$ is related to the value of the zero mode $p_0$ by:
\[
\sqrt{a} = \frac{1}{\cos\frac{p_0}{N+1}}.
\]
The components of the eigenvector, properly normalized, are meromorphic functions on the spectral curve Eq. (12). Choosing the normalization $\psi_{N+1} = \mu$, they read
\[
\psi_i(\lambda,\mu) = \frac{P_{N+1-i}(z) + \mu P_i(z)}{P_{N+1}(z)}, \qquad z = \frac{\lambda}{\sqrt{a}}. \tag{14}
\]
We have introduced the polynomials of degree $j-1$:
\[
P_j(z) = \frac{x_+^j - x_-^j}{x_+ - x_-}, \qquad P_j(z) = z^{j-1} + O(z^{j-3}). \tag{15}
\]
The first few polynomials are
\[
P_0 = 0, \quad P_1 = 1, \quad P_2 = z, \quad P_3 = z^2 - 1, \quad P_4 = z^3 - 2z.
\]
They are essentially the Tchebitchev polynomials of the second kind. As we will see, Eq. (14) is the general form of the meromorphic function $\psi_i(\lambda,\mu)$ even when $a_i \ne a$. In particular, in order to take the continuum limit, the poles of the eigenvector$^3$ will have to be close to the roots of the equation $P_{N+1}(\lambda/\sqrt{a}) = 0$, that is
\[
\lambda_k^{(0)} = 2\sqrt{a}\,\cos\frac{Z_k^{(0)}}{N+1}, \qquad Z_k^{(0)} = k\pi, \quad k = 1,\ldots,N. \tag{16}
\]
3 In fact in this simple case the eigenvector has no poles at finite distance because they are compensated by zeroes in the numerator. This degeneracy is lifted as soon as the ai are not all equal.
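The polynomials $P_j$ of Eq. (15) obey the three-term recurrence $P_{j+1}(z) = zP_j(z) - P_{j-1}(z)$, which follows from $x_+ + x_- = z$ and $x_+x_- = 1$. A minimal sketch of this recurrence is given below (the function name and the sample values are illustrative assumptions); it reproduces the listed polynomials and the zeros of $P_{N+1}$ at $z = 2\cos\frac{k\pi}{N+1}$ quoted in Eq. (16).

```python
from math import cos, pi


def p_polynomials(z, count):
    """Values P_0(z), ..., P_count(z) from P_{j+1} = z * P_j - P_{j-1},
    with P_0 = 0 and P_1 = 1."""
    values = [0.0, 1.0]
    for _ in range(count - 1):
        values.append(z * values[-1] - values[-2])
    return values[:count + 1]


z_sample = 1.7
sample = p_polynomials(z_sample, 4)

# Zeros of P_{N+1} for a small illustrative lattice size N.
lattice_n = 6
residuals = [
    p_polynomials(2 * cos(k * pi / (lattice_n + 1)), lattice_n + 1)[-1]
    for k in range(1, lattice_n + 1)
]
```

With $z = 2\cos\theta$ one has $P_j(z) = \sin(j\theta)/\sin\theta$, which is why every entry of `residuals` vanishes up to rounding.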
For these special values $\lambda_k^{(0)}$, we have $x_\pm = e^{\pm i\frac{k\pi}{N+1}}$ and Eq. (12) gives $\mu_k^{(0)} = (-1)^k$. The set of points $(\lambda_k^{(0)}, \mu_k^{(0)})$, $k = 1,\cdots,N$, will be called the free configuration and will play an important role below. It is simple to take the continuum limit in this free case. We set
\[
\lambda \simeq 2 - \Delta^2\Lambda^2, \qquad z \simeq 2 - \Delta^2Z^2, \qquad \psi_{i\pm1} = \psi(x\pm\Delta), \qquad \Delta = \frac{1}{N+1},
\]
where we have introduced the variable
\[
Z = \sqrt{\Lambda^2 + p_0^2}. \tag{17}
\]
The eigenvector equation becomes the Schroedinger equation
\[
-\psi''(x) - p_0^2\psi(x) = \Lambda^2\psi(x), \qquad x = j\Delta. \tag{18}
\]
We also have $x_\pm = 1 \pm i\Delta Z$ and the equation of the spectral curve reads: $\mu + \mu^{-1} = (1 + i\Delta Z)^{1/\Delta} + (1 - i\Delta Z)^{1/\Delta}$. In the limit $\Delta \to 0$, it becomes
\[
\mu + \mu^{-1} = 2\cos Z. \tag{19}
\]
Similarly, the eigenvector becomes a Baker-Akhiezer function:
\[
\psi(x) = \frac{\sin Z(1-x) + \mu\sin Zx}{\sin Z}. \tag{20}
\]
When $\mu = e^{\pm iZ}$ this reduces to $\psi(x) = e^{\pm iZx}$ as it should be. Notice that when $\mu$ is kept as a free parameter, the above formula gives two independent solutions of Eq. (18), but when $\mu$ belongs to the spectral curve Eq. (19) one has $\psi(x+1) = \mu\psi(x)$. Equation (20) presents the two Bloch waves as a single function on the hyperelliptic spectral curve Eq. (19).

Another example, important to us, will be the Dirac comb,
\[
\big[-\partial^2 - H_0\delta(x)\big]\psi(x) = \Lambda^2\psi(x), \qquad \delta(x+1) = \delta(x).
\]
On each interval $x_j = j < x < x_{j+1} = j+1$, one has
\[
\psi(x) = \alpha_je^{i\Lambda x} + \beta_je^{-i\Lambda x}, \qquad x_j < x < x_{j+1}.
\]
The Bloch condition $\psi(x + x_j) = \mu^j\psi(x)$, $0 < x < 1$, gives $\alpha_j = (\mu e^{-i\Lambda})^j\alpha_0$, $\beta_j = (\mu e^{i\Lambda})^j\beta_0$. The continuity of $\psi(x)$ gives $(\mu - e^{i\Lambda})\,\alpha_0 = -(\mu - e^{-i\Lambda})\,\beta_0$, while the gap equation on the first derivative, $-\psi'(x_j+0) + \psi'(x_j-0) - H_0\psi(x_j) = 0$, gives the spectral curve
\[
\mu + \mu^{-1} = 2\cos\Lambda - H_0\frac{\sin\Lambda}{\Lambda}. \tag{21}
\]
The Bloch wave itself is $\psi(x) = \mu^j\psi_{\mathrm{vac}}(x - x_j)$, $x_j < x < x_{j+1}$, where $\psi_{\mathrm{vac}}(x)$ is given by Eq. (20) (with $p_0 = 0$) but $\Lambda$, $\mu$ now belonging to the curve Eq. (21).
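The three statements of this section lend themselves to a quick numerical check: the lattice curve $x_+^{N+1} + x_-^{N+1}$ equals $2\cos Z$ when $z = 2\cos\frac{Z}{N+1}$, its linearized form with $x_\pm = 1 \pm iZ/(N+1)$ approaches $2\cos Z$ for large $N$, the function (20) is a Bloch wave when $\mu = e^{iZ}$, and the Dirac comb curve (21) is the trace of the one-period transfer matrix. The sketch below assumes a transfer-matrix formulation in the basis $(\psi, \psi')$, which is a standard reformulation rather than the derivation used in the text; all function names and sample values are hypothetical.

```python
import cmath
from math import cos, sin


def discrete_free_curve(z_scaled, sites):
    """x_+^(N+1) + x_-^(N+1) for z = 2 cos(Z / (N+1)), with N + 1 = sites."""
    x_plus = cmath.exp(1j * z_scaled / sites)
    return (x_plus ** sites + x_plus ** -sites).real


def linearized_free_curve(z_scaled, sites):
    """(1 + i Z/sites)^sites + (1 - i Z/sites)^sites, the small-spacing form."""
    step = 1.0 / sites
    return ((1 + 1j * step * z_scaled) ** sites
            + (1 - 1j * step * z_scaled) ** sites).real


def bloch_wave(x, z_scaled, mu):
    """The function of Eq. (20)."""
    return (cmath.sin(z_scaled * (1 - x)) + mu * cmath.sin(z_scaled * x)) / cmath.sin(z_scaled)


def dirac_comb_trace(lam, h0):
    """Trace of the one-period transfer matrix for -psi'' - h0 delta psi = lam^2 psi."""
    free = [[cos(lam), sin(lam) / lam], [-lam * sin(lam), cos(lam)]]
    jump = [[1.0, 0.0], [-h0, 1.0]]   # psi' jumps by -h0 * psi across the delta
    product = [[sum(jump[i][k] * free[k][j] for k in range(2)) for j in range(2)]
               for i in range(2)]
    return product[0][0] + product[1][1]


z_value = 1.1
mu_value = cmath.exp(1j * z_value)
exact_lattice = discrete_free_curve(z_value, 50)
approx_lattice = linearized_free_curve(z_value, 10000)
bloch_shift = bloch_wave(1.3, z_value, mu_value) / bloch_wave(0.3, z_value, mu_value)
comb_trace = dirac_comb_trace(1.3, 0.7)
```

Since the transfer matrix has unit determinant, its eigenvalues are $\mu$ and $\mu^{-1}$, so its trace is the right-hand side of (21).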
4. Separated Variables

In this section, we generalize the previous analysis to the case $a_i \ne a$ and express the dynamical variables of the Volterra model in terms of the separated variables. Equivalent formulae were already obtained a long time ago in [4, 5]. A quantum version of this construction for the closed Toda chain can be found in [19]. We have to reconstruct the eigenvectors of $L(\mu)$. Let us set $\Psi = (\psi_i)$, $i = 1,\ldots,N+1$. We normalize the last component $\psi_{N+1} = \mu$. Notice that due to Eq. (8), $\mu$ does not vanish for finite $\lambda$. The components $\psi_i$ are meromorphic functions on the spectral curve and are uniquely characterized by their poles and behavior at infinity which we now describe. We will call $P^+(\lambda = \infty, \mu = \infty)$ and $P^-(\lambda = \infty, \mu = 0)$ the two points above $\lambda = \infty$. In the neighbourhood of $P^\pm$, the local parameter is $\lambda^{-1}$ and we have by direct expansion of Eq. (8):
\begin{align}
P^+ :\quad \mu &= A^{-1}\lambda^{N+1}\big(1 + O(\lambda^{-2})\big), \tag{22}\\
P^- :\quad \mu &= A\,\lambda^{-N-1}\big(1 + O(\lambda^{-2})\big). \tag{23}
\end{align}
At the points $P^+$ and $P^-$, the eigenvector $\Psi(P)$ behaves as:
\begin{align}
\psi_i(P) &= \frac{1}{\sqrt{a_{N+1}a_1a_2\cdots a_{i-1}}}\,\lambda^i\big(1 + O(\lambda^{-2})\big), \qquad P \sim P^+, \tag{24}\\
\psi_i(P) &= \sqrt{a_{N+1}a_1a_2\cdots a_{i-1}}\;\lambda^{-i}\big(1 + O(\lambda^{-2})\big), \qquad P \sim P^-. \tag{25}
\end{align}
This is easily deduced by inspection of Eq. (10). From the general results of the classical inverse scattering theory, we expect $g + (N+1) - 1 = 2N$ poles for the eigenvector (see e.g. [6, 15]). From Eq. (24), we see that we have a fixed pole of order $N$ at $P^+$ (on the component $\psi_N$), and there remain $g = N$ poles at finite distance, the so called dynamical poles. But we notice the symmetry property $\psi_i(-\lambda,-\mu) = (-1)^i\psi_i(\lambda,\mu)$ so that the dynamical poles come in pairs $\lambda_{N+1-k} = -\lambda_k$, $\mu_{N+1-k} = -\mu_k$ and only $(\lambda_k,\mu_k)$, $k = 1\ldots n$, are independent parameters. Everything can be expressed in terms of these $2n = N$ quantities $(\lambda_k,\mu_k)$, $k = 1\ldots n$. In fact, they can be viewed as coordinates on (an open set of) phase space. First, the commuting Hamiltonians are easy to reconstruct. Indeed the spectral curve is determined by requiring that it passes through the points $(\lambda_k,\mu_k)$, $k = 1\cdots n$, and through the point $(2,\mu_0)$, where $\mu_0$ is related to the Casimir function as in Eq. (13). The equation of the curve itself can be written as a determinant
\[
\det\begin{pmatrix}
\lambda & \lambda^3 & \cdots & \lambda^{N+1} & \mu + \mu^{-1} \\
2 & 2^3 & \cdots & 2^{N+1} & \mu_0 + \mu_0^{-1} \\
\lambda_1 & \lambda_1^3 & \cdots & \lambda_1^{N+1} & \mu_1 + \mu_1^{-1} \\
\vdots & \vdots & & \vdots & \vdots \\
\lambda_n & \lambda_n^3 & \cdots & \lambda_n^{N+1} & \mu_n + \mu_n^{-1}
\end{pmatrix} = 0. \tag{26}
\]
Expanding over the first row, we obtain a curve of the form Eq. (8), and we can read directly the Hamiltonians as the coefficients of $t(\lambda)$. They appear as functions of the $(\lambda_k,\mu_k)$ and can be shown to Poisson commute (see [16–18] for a proof and for the quantum generalization of this fact). Equations (24,25) and the data of the $N$ dynamical poles also determine the functions $\psi_i$ uniquely. Being meromorphic functions on a hyperelliptic curve, we can write quite generally
\[
\psi_i = \frac{Q^{(i)}(\lambda) + \mu R^{(i)}(\lambda)}{\prod_{k=1}^{n}(\lambda^2 - \lambda_k^2)}, \tag{27}
\]
where $Q^{(i)}$ and $R^{(i)}$ are polynomials such that
\[
Q^{(i)}(-\lambda) = (-1)^iQ^{(i)}(\lambda), \qquad R^{(i)}(-\lambda) = (-1)^{i+1}R^{(i)}(\lambda).
\]
Above $\lambda_k$, we have two points on the curve: $(\lambda_k,\mu_k)$ and $(\lambda_k,\mu_k^{-1})$. We want the poles to be at $(\lambda_k,\mu_k)$ only so that the numerator in Eq. (27) should vanish at the points $(\lambda_k,\mu_k^{-1})$. This gives $n$ conditions
\[
Q^{(i)}(\lambda_k) + \mu_k^{-1}R^{(i)}(\lambda_k) = 0, \qquad k = 1\ldots n. \tag{28}
\]
To have a pole of order $i$ at $P^+$ and a zero of order $i$ at $P^-$ we must choose
\[
\text{degree } Q^{(i)} = N - i, \qquad \text{degree } R^{(i)} = i - 1.
\]
Hence, these two polynomials depend altogether on $n+1$ coefficients which are determined by imposing the $n$ conditions Eq. (28) and requiring that the normalization coefficients are inverse to each other at $P^\pm$ as in Eqs. (24,25). It is convenient to use the basis of polynomials $P_j(\lambda)$ given by Eq. (15). We will write the formulae for $\psi_i$ in the case $i$ odd, the case $i$ even is similar. The polynomial $Q^{(i)}(\lambda)$ can be expanded over
\[
Q^{(i)}(\lambda) :\quad P_2(\lambda),\ P_4(\lambda),\ \cdots,\ P_{N+1-i}(\lambda)
\]
and the polynomial $R^{(i)}(\lambda)$ can be expanded over
\[
R^{(i)}(\lambda) :\quad P_1(\lambda),\ P_3(\lambda),\ \cdots,\ P_i(\lambda).
\]
Solving the linear system Eq. (28), the eigenvector can be written as
\[
\psi_i = \frac{K_i}{\prod_k(\lambda^2 - \lambda_k^2)}\det\begin{pmatrix}
\mu P_1(\lambda) & \cdots & \mu P_i(\lambda) & -P_{N+1-i}(\lambda) & \cdots & -P_2(\lambda) \\
P_1(\lambda_1) & \cdots & P_i(\lambda_1) & -\mu_1P_{N+1-i}(\lambda_1) & \cdots & -\mu_1P_2(\lambda_1) \\
\vdots & & \vdots & \vdots & & \vdots \\
P_1(\lambda_k) & \cdots & P_i(\lambda_k) & -\mu_kP_{N+1-i}(\lambda_k) & \cdots & -\mu_kP_2(\lambda_k) \\
\vdots & & \vdots & \vdots & & \vdots \\
P_1(\lambda_n) & \cdots & P_i(\lambda_n) & -\mu_nP_{N+1-i}(\lambda_n) & \cdots & -\mu_nP_2(\lambda_n)
\end{pmatrix}, \tag{29}
\]
where $K_i$ are constants independent of $\lambda$, $\mu$. Defining
\[
\Theta_i = \begin{pmatrix}
P_1(\lambda_1) & \cdots & P_{i-2}(\lambda_1) & -\mu_1P_{N+1-i}(\lambda_1) & \cdots & -\mu_1P_2(\lambda_1) \\
\vdots & & \vdots & \vdots & & \vdots \\
P_1(\lambda_k) & \cdots & P_{i-2}(\lambda_k) & -\mu_kP_{N+1-i}(\lambda_k) & \cdots & -\mu_kP_2(\lambda_k) \\
\vdots & & \vdots & \vdots & & \vdots \\
P_1(\lambda_n) & \cdots & P_{i-2}(\lambda_n) & -\mu_nP_{N+1-i}(\lambda_n) & \cdots & -\mu_nP_2(\lambda_n)
\end{pmatrix} \tag{30}
\]
we can compute the leading terms in Eq. (29) when $\lambda \to \infty$. At $P^{(+)}$ the leading term comes from $\mu P_i(\lambda)$, while at $P^{(-)}$ it comes from $P_{N+1-i}(\lambda)$,
\begin{align*}
\psi_i &\simeq (-1)^{\frac{i-1}{2}}A^{-1}K_i\,\det\Theta_i\;\lambda^i, & & P^+, \\
\psi_i &\simeq (-1)^{\frac{i-1}{2}}K_i\,\det\Theta_{i+2}\;\lambda^{-i}, & & P^-.
\end{align*}
Imposing that the two coefficients of $\lambda^i$ and $\lambda^{-i}$ are inverse to each other, we get
\[
K_i^2 = \frac{A}{\det\Theta_i\,\det\Theta_{i+2}}.
\]
Comparing with Eqs. (24, 25), we finally obtain
\[
a_i = \frac{\det\Theta_i\,\det\Theta_{i+3}}{\det\Theta_{i+1}\,\det\Theta_{i+2}}, \qquad a_N = \frac{\det\Theta_N}{\det\Theta_{N+2}}\,A, \qquad a_{N+1} = \frac{\det\Theta_3}{\det\Theta_1}\,A. \tag{31}
\]
Here $A^{-1}$ is the coefficient of $\lambda^{N+1}$ in $t(\lambda)$, Eq. (9), computed from Eq. (26). We impose the Poisson bracket on the variables $\lambda_k$, $\mu_k$,
\[
\{\lambda_k, \lambda_{k'}\} = 0, \qquad \{\lambda_k, \mu_{k'}\} = -\tfrac12\,\delta_{kk'}\big(4\lambda_k - \lambda_k^3\big)\mu_k, \qquad \{\mu_k, \mu_{k'}\} = 0. \tag{32}
\]
One can then check that the Hamiltonians defined by Eq. (26) are all in involution (this is a general result), and that the $a_i$ defined above do satisfy the Faddeev-Takhtajan Poisson bracket. The fact that the expressions for $a_N$, $a_{N+1}$ are different from the ones in the bulk is due to the choice of normalization of the eigenvector. However, the Poisson bracket of the $a_i$ is periodic. All this can be proved using techniques similar to the ones in [19].

5. Continuum Limit of the Spectral Curve

We now take the continuum limit of the spectral curve Eq. (26). The result is Eq. (41). We set
\[
\lambda = \sqrt{a}\,z, \qquad \lambda_k = \sqrt{a}\,z_k, \qquad \frac{2}{\sqrt{a}} = 2\cos\frac{p_0}{N+1} = z_0.
\]
From these, the scaled variables $\Lambda$, $Z$, $Z_k$ are defined like this:
\[
\lambda = 2\cos\frac{\Lambda}{N+1}, \qquad z = 2\cos\frac{Z}{N+1}, \qquad z_k = 2\cos\frac{Z_k}{N+1}. \tag{33}
\]
Notice that we have $Z^2 = \Lambda^2 + p_0^2$. In the following, we will use the term “perturbation theory” when the points $(Z_k, \mu_k)$ are small deviations from the free configuration Eq. (16). The formulae we will write make sense in this perturbative setting. This however does not exclude the possibility of having a finite number of points which are large deviations. We will also be interested in the deviation from the zero mode configuration. That is, we make the substitution $a_i \to \sqrt{a}\,\tilde{a}_i$ everywhere on the lattice. Alternatively, this amounts to using the variable $z = \lambda/\sqrt{a}$. Using the basis of polynomials $P_j(z)$ defined in Eq. (15) instead of the $z^j$, we can write the spectral curve as (it has the right form and passes through the right points)
\[
\det\begin{pmatrix}
\mu+\mu^{-1} & P_{N+2}(z) & \cdots & P_4(z) & P_2(z)\\
\mu_0+\mu_0^{-1} & P_{N+2}(z_0) & \cdots & P_4(z_0) & P_2(z_0)\\
\mu_1+\mu_1^{-1} & P_{N+2}(z_1) & \cdots & P_4(z_1) & P_2(z_1)\\
\vdots & \vdots & & \vdots & \vdots\\
\mu_n+\mu_n^{-1} & P_{N+2}(z_n) & \cdots & P_4(z_n) & P_2(z_n)
\end{pmatrix} = 0.
\]
Without changing the determinant, we can subtract from the first column the linear combination of the next two columns: $P_{N+2}(z_k) - P_N(z_k) = 2\cos Z_k$. The first column becomes
\[
\begin{pmatrix}
\mu+\mu^{-1}-2\cos Z\\ \gamma_0\\ \gamma_1\\ \vdots\\ \gamma_n
\end{pmatrix},
\]
where we have set
\[
\gamma_k = \mu_k + \mu_k^{-1} - 2\cos Z_k. \tag{34}
\]
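The subtraction identity $P_{N+2}(z_k) - P_N(z_k) = 2\cos Z_k$ can be illustrated with a short numerical sketch. Eq. (15) is not reproduced in this section, so we assume here that the $P_j$ obey the three-term recurrence $P_{j+1}(z) = zP_j(z) - P_{j-1}(z)$ with $P_0 = 0$, $P_1 = 1$ (the same recurrence as $P_{N+2} = zP_{N+1} - P_N$ used later in the text). The helper names are ours.

```python
import math

def P(j, z):
    """P_j(z) from P_{j+1} = z P_j - P_{j-1}, with P_0 = 0, P_1 = 1 (assumed normalization)."""
    if j == 0:
        return 0.0
    prev, cur = 0.0, 1.0
    for _ in range(j - 1):
        prev, cur = cur, z * cur - prev
    return cur

def subtraction_defect(N, Z):
    """P_{N+2}(z) - P_N(z) - 2 cos Z at z = 2 cos(Z/(N+1)); expected to vanish."""
    z = 2.0 * math.cos(Z / (N + 1))
    return P(N + 2, z) - P(N, z) - 2.0 * math.cos(Z)

if __name__ == "__main__":
    for N, Z in ((20, 2.3), (50, 0.4), (100, 7.1)):
        print(N, Z, subtraction_defect(N, Z))
```

With this normalization one has $P_j(2\cos\theta) = \sin j\theta/\sin\theta$, so the identity holds exactly at finite $N$ and not only in the continuum limit.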
Notice that $\gamma_0 = 0$. The reason for this subtraction is that for the free configuration we also have $\gamma_k^{(0)} = 0$, so that the spectral curve becomes simply $\mu + \mu^{-1} = 2\cos Z$, as it should. The subtraction gives sense to the spectral curve in perturbation theory. Expanding the determinant over the first row, we can write
\[
\mu + \mu^{-1} - 2\cos Z = \sum_{j=1}^{n+1} H_{2j}\, P_{2j}(z).
\]
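The $H_{2j}$ are given by $H = \mathcal{N}^{-1} V$, where we have defined
\[
H = \begin{pmatrix} H_{N+2}\\ H_N\\ \vdots\\ H_2 \end{pmatrix}, \qquad
\mathcal{N} = \begin{pmatrix}
P_{N+2}(z_0) & \cdots & P_4(z_0) & P_2(z_0)\\
P_{N+2}(z_1) & \cdots & P_4(z_1) & P_2(z_1)\\
\vdots & & \vdots & \vdots\\
P_{N+2}(z_n) & \cdots & P_4(z_n) & P_2(z_n)
\end{pmatrix}, \qquad
V = \begin{pmatrix} \gamma_0\\ \gamma_1\\ \vdots\\ \gamma_n \end{pmatrix}.
\]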
Classical Volterra Model and a Lattice Version of Virasoro Algebra
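We will need to treat separately the first row and column in the matrix $\mathcal{N}$. Let us write it as
\[
\mathcal{N} = \begin{pmatrix} A & B\\ C & D \end{pmatrix},
\]
where
\[
A = P_{N+2}(z_0), \qquad B = \bigl( P_N(z_0) \;\cdots\; P_4(z_0) \;\; P_2(z_0) \bigr),
\]
\[
C = \begin{pmatrix} P_{N+2}(z_1)\\ \vdots\\ P_{N+2}(z_n) \end{pmatrix}, \qquad
D = \begin{pmatrix}
P_N(z_1) & \cdots & P_4(z_1) & P_2(z_1)\\
\vdots & & \vdots & \vdots\\
P_N(z_n) & \cdots & P_4(z_n) & P_2(z_n)
\end{pmatrix}.
\]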
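To zeroth order in perturbation theory, we denote $\mathcal{N} = \mathcal{N}^{(0)}$ and similarly $A^{(0)}$, $B^{(0)}$, $C^{(0)}$, $D^{(0)}$. To take the continuum limit we have to consider the matrix $\mathcal{N}\mathcal{N}^{(0)-1}$.

Lemma 1. In the continuum limit, we have
\[
\mathcal{N}\mathcal{N}^{(0)-1} = \begin{pmatrix}
1 & 0\\[4pt]
\dfrac{\sin Z_k}{Z_k}\,\dfrac{p_0}{\sin p_0} &
\dfrac{\sin Z_k}{Z_k}\left(\dfrac{1}{Z_k^2 - m^2\pi^2} - \dfrac{1}{p_0^2 - m^2\pi^2}\right) 2(-1)^m m^2\pi^2
\end{pmatrix}, \qquad k, m = 1, \dots, \infty. \tag{35}
\]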
Proof. Since $P_{N+2}(z) = zP_{N+1}(z) - P_N(z)$ and $P_{N+1}(z_k^{(0)}) = 0$, we have
\[
C_k^{(0)} = -D_{k1}^{(0)} \implies \bigl(D^{(0)-1} C^{(0)}\bigr)_k = -\delta_{k1},
\]
so that
\[
\mathcal{N}^{(0)-1} = \frac{1}{A + B_1}\begin{pmatrix}
1 & -B D^{(0)-1}\\
F & (A + B_1) D^{(0)-1} - F \otimes B D^{(0)-1}
\end{pmatrix},
\]
where $F$ is the column vector with components $F_k = \delta_{k,1}$, $k = 1, \dots, n$. It follows that
\[
\mathcal{N}\mathcal{N}^{(0)-1} = \begin{pmatrix}
1 & 0\\[4pt]
\dfrac{1}{A+B_1}(C + DF) & D D^{(0)-1} - \dfrac{1}{A+B_1}(C + DF) \otimes B D^{(0)-1}
\end{pmatrix}.
\]
Noticing that $A + B_1 = P_{N+2}(z_0) + P_N(z_0)$ and $(C + DF)_k = P_{N+2}(z_k) + P_N(z_k)$, we get in the continuum limit
\[
\frac{1}{A + B_1}(C + DF)_k \to \frac{\sin Z_k}{Z_k}\,\frac{p_0}{\sin p_0}.
\]
The main trick to proceed is an explicit formula for the inverse of the matrix $D^{(0)}$. It is not difficult to check that
\[
\bigl(D^{(0)-1}\bigr)_{jk} = \frac{4}{N+1}\,\sin^2\frac{k\pi}{N+1}\; P_{2j}(z_k^{(0)}).
\]
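With this, we find
\[
\bigl(B D^{(0)-1}\bigr)_m = \frac{4}{N+1}\sin^2\frac{m\pi}{N+1}\sum_{j=1}^{n} P_{2j}(z_0)\,P_{2j}(z_m^{(0)})
= \frac{4}{N+1}\,\frac{\sin^2\frac{m\pi}{N+1}}{\sin\frac{p_0}{N+1}\,\sin\frac{m\pi}{N+1}}\sum_{j=1}^{n}\sin\frac{2jp_0}{N+1}\,\sin\frac{2jm\pi}{N+1}
\to 2\,\frac{m\pi}{p_0}\int_0^1 dx\,\sin p_0 x\,\sin m\pi x,
\]
and the last integral is easily evaluated with the result
\[
\bigl(B D^{(0)-1}\bigr)_m \to (-1)^m\,\frac{2m^2\pi^2\,\sin p_0}{p_0\,(p_0^2 - m^2\pi^2)}.
\]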
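The passage from the lattice sum to the closed form can be checked numerically. The sketch below is only illustrative. It assumes the trigonometric form $P_j(2\cos\theta) = \sin j\theta/\sin\theta$ for the polynomials of Eq. (15) (not shown in this section), takes $N$ even with $n = N/2$ so that the columns $P_N, \dots, P_2$ of $D$ are $n$ in number, and uses function names of our own choosing.

```python
import math

def P(j, theta):
    """Assumed form of Eq. (15): P_j(2 cos theta) = sin(j theta) / sin(theta)."""
    return math.sin(j * theta) / math.sin(theta)

def BD_lattice(m, p0, N):
    """Finite-N value of (B D^(0)-1)_m, with z_0 <-> p0 and z_m^(0) <-> m*pi."""
    n = N // 2                                  # N assumed even
    t0 = p0 / (N + 1)
    tm = m * math.pi / (N + 1)
    s = sum(P(2 * j, t0) * P(2 * j, tm) for j in range(1, n + 1))
    return 4.0 / (N + 1) * math.sin(tm) ** 2 * s

def BD_continuum(m, p0):
    """Closed form quoted in the text for the N -> infinity limit."""
    mp = m * math.pi
    return (-1) ** m * 2.0 * mp**2 * math.sin(p0) / (p0 * (p0**2 - mp**2))

if __name__ == "__main__":
    p0 = 0.7
    for m in (1, 2, 3):
        print(m, BD_lattice(m, p0, 2000), BD_continuum(m, p0))
```

The two columns should agree up to corrections that vanish as $N \to \infty$. The integrand vanishes at both ends of $[0, 1]$, which is why the Riemann sum converges quickly.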
Similarly we compute
\[
\bigl(D D^{(0)-1}\bigr)_{km} = \frac{\sin Z_k}{Z_k}\,\frac{(-1)^m\, 2m^2\pi^2}{Z_k^2 - m^2\pi^2}.
\]
Gathering all this we get Eq. (35).
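We now introduce the important infinite matrix
\[
M_{km} = \frac{1}{Z_k^2 - m^2\pi^2}, \qquad k, m = 1, \dots, \infty, \tag{36}
\]
and the important vector $|\eta\rangle$,
\[
|\eta\rangle = M^{-1}\begin{pmatrix} 1\\ 1\\ \vdots\\ 1 \end{pmatrix}. \tag{37}
\]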
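To get a feeling for the vector $|\eta\rangle$, one can truncate $M$ to a finite block and solve Eq. (37) numerically. This is a sketch under stated assumptions: the truncation size `K` and the sample deviations $\delta Z_k = 0.1/k^2$ are arbitrary choices of ours, not values from the text. The check is that $\sum_m \eta_m/(Z_k^2 - m^2\pi^2) = 1$ for every $k$, which is just Eq. (37) written in components.

```python
import numpy as np

K = 40                                       # truncation size (our choice)
k = np.arange(1, K + 1)
Z = k * np.pi + 0.1 / k**2                   # sample Z_k close to the free values k*pi

# M_km = 1 / (Z_k^2 - m^2 pi^2), Eq. (36), truncated to K x K
M = 1.0 / (Z[:, None] ** 2 - (k[None, :] * np.pi) ** 2)

# |eta> = M^{-1} |1>, Eq. (37)
eta = np.linalg.solve(M, np.ones(K))

def eta_fn(z):
    """Truncated version of 1 - sum_m eta_m / (z^2 - m^2 pi^2)."""
    return 1.0 - np.sum(eta / (z**2 - (k * np.pi) ** 2))

# By construction this combination vanishes at every z = Z_k.
residual = max(abs(eta_fn(z)) for z in Z)

if __name__ == "__main__":
    print("max residual at the Z_k:", residual)
    print("first components of eta:", eta[:4])
```

For small deviations the diagonal entries of $M$ dominate, so the truncated system is well conditioned and the components $\eta_m$ are themselves small.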
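With these notations we can compute the inverse of the matrix $\mathcal{N}\mathcal{N}^{(0)-1}$:

Lemma 2.
\[
\bigl(\mathcal{N}\mathcal{N}^{(0)-1}\bigr)^{-1} = \begin{pmatrix}
1 & 0\\[6pt]
\dfrac{p_0}{\sin p_0}\,\dfrac{1}{1 - \langle\chi(p_0)|\eta\rangle}\,\dfrac{(-1)^{m+1}}{2m^2\pi^2}\,\eta_m &
\dfrac{(-1)^m}{2m^2\pi^2}\left[\left(1 + \dfrac{|\eta\rangle\langle\chi(p_0)|}{1 - \langle\chi(p_0)|\eta\rangle}\right) M^{-1}\right]_{mk} \dfrac{Z_k}{\sin Z_k}
\end{pmatrix},
\]
where we have defined the vector
\[
\langle\chi(p_0)|_m = \frac{1}{p_0^2 - \pi^2 m^2}.
\]

Proof. With the above notations we can write
\[
\mathcal{N}\mathcal{N}^{(0)-1} = \begin{pmatrix}
1 & 0\\[4pt]
\dfrac{\sin Z_k}{Z_k}\,\dfrac{p_0}{\sin p_0} &
\dfrac{\sin Z_k}{Z_k}\, M_{kl}\bigl(\delta_{lm} - \eta_l\,\chi_m(p_0)\bigr)\, 2(-1)^m m^2\pi^2
\end{pmatrix}.
\]
Letting
\[
\mathcal{N}\mathcal{N}^{(0)-1} = \begin{pmatrix} 1 & 0\\ Y & X \end{pmatrix}
\implies
\bigl(\mathcal{N}\mathcal{N}^{(0)-1}\bigr)^{-1} = \begin{pmatrix} 1 & 0\\ -X^{-1}Y & X^{-1} \end{pmatrix},
\]
we find
\[
(X^{-1})_{mk} = \frac{(-1)^m}{2m^2\pi^2}\left[\left(1 + \frac{|\eta\rangle\langle\chi(p_0)|}{1 - \langle\chi(p_0)|\eta\rangle}\right) M^{-1}\right]_{mk} \frac{Z_k}{\sin Z_k} \tag{38}
\]
and
\[
(-X^{-1}Y)_m = \frac{p_0}{\sin p_0}\,\frac{1}{1 - \langle\chi(p_0)|\eta\rangle}\,\frac{(-1)^{m+1}}{2m^2\pi^2}\,\eta_m.
\]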
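The inversion in Lemma 2 rests on the rank-one identity $(1 - |\eta\rangle\langle\chi|)^{-1} = 1 + |\eta\rangle\langle\chi|/(1 - \langle\chi|\eta\rangle)$. The following sketch checks the two formulas for $X^{-1}$ and $-X^{-1}Y$ on a finite truncation. The truncation size, the value of $p_0$ and the sample $Z_k$ are assumptions of ours chosen only for illustration.

```python
import numpy as np

K, p0 = 12, 0.7                              # truncation size and zero mode (our choices)
k = np.arange(1, K + 1)
Z = k * np.pi + 0.1 / k**2                   # sample Z_k near the free configuration

M = 1.0 / (Z[:, None] ** 2 - (k[None, :] * np.pi) ** 2)   # Eq. (36), truncated
eta = np.linalg.solve(M, np.ones(K))                      # Eq. (37)
chi = 1.0 / (p0**2 - (k * np.pi) ** 2)                    # <chi(p0)|_m
d = 2.0 * (-1.0) ** k * (k * np.pi) ** 2                  # 2 (-1)^m m^2 pi^2
s = np.sin(Z) / Z                                         # sin Z_k / Z_k

# Blocks X and Y of N N^(0)-1 as written in the proof of Lemma 2
X = s[:, None] * (M @ (np.eye(K) - np.outer(eta, chi))) * d[None, :]
Y = s * p0 / np.sin(p0)

# Closed forms: Eq. (38) and the formula for -X^{-1} Y
c = 1.0 - chi @ eta
Xinv = (1.0 / d)[:, None] * ((np.eye(K) + np.outer(eta, chi) / c) @ np.linalg.inv(M)) * (1.0 / s)[None, :]
minus_Xinv_Y = (p0 / np.sin(p0)) / c * (-1.0) ** (k + 1) / (2.0 * (k * np.pi) ** 2) * eta

if __name__ == "__main__":
    print("max |Xinv X - 1|:", np.abs(Xinv @ X - np.eye(K)).max())
    print("max |(-Xinv Y) - closed form|:", np.abs(-Xinv @ Y - minus_Xinv_Y).max())
```

Both printed numbers should be at the level of rounding errors. The identity is algebraic, so it holds for any truncation size and does not rely on the continuum limit.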
Let us return to the formula for the spectral curve. We assume that the conditions explained in Sect. 9 are satisfied so that the infinite sums we will manipulate are convergent. Denote
\[
\eta(Z) = 1 - \sum_{m=1}^{\infty}\frac{\eta_m}{Z^2 - m^2\pi^2} \tag{39}
\]
Zk γk . sin Z k
(40)
and | m =
k
−1 Mmk
These quantities enter the expression of the continuum limit of the spectral curve. Proposition 1. In the continuum limit, the equation of the spectral curve becomes: Hm sin Z −1 , (41) µ + µ = 2 cos Z + −H0 + Z Z 2 − m2π 2 m where the conserved quantities Hm can be taken as Hm m 1 1 χ ( p0 )| ηm , H0 = Hm = m + = . 2 2 2 2 η( p0 ) η( p0 ) m p0 − π 2 m 2 m p0 − π m (42) Proof. We have µ + µ−1 − 2 cos Z =
n
PN +2−2i (z)H N +2−2i =
i=0
n
PN +2−2i (z)(N −1 V )i .
i=0
We insert 1 = N (0)−1 N (0) into the above expression: µ + µ−1 − 2 cos Z = PN +2−2i (z)(N (0)−1 )ik (N (0) )k j (N −1 V ) j .
(43)
i, j,k
Hence, we need to compute PN +2−2i (z)(N (0)−1 )im i
PN +2 (z) + PN (z) PN +2 (z) + PN (z) (0)−1 (0)−1 ,− BD = + PN +2−2i (z)(D )im PN +2 (z 0 ) + PN (z 0 ) PN +2 (z 0 ) + PN (z 0 ) i
whose limit N → ∞ is easy to take i
PN +2−2i (z)(N (0)−1 )im →
sin Z Z
p0 , 2(−1)m m 2 π 2 × sin p0
1 1 − 2 2 2 2 Z −m π p0 − m 2 π 2
.
We can now take the limit N → ∞ in Eq. (43) sin Z µ + µ−1 − 2 cos Z = Z ∞ p0 1 1 m 2 2 × 2(−1) m π − 2 H0 + Hm , sin p0 Z 2 − m2π 2 p0 − m 2 π 2 m=1 where
0 γ0 γ1 γ1 m = (N N (0)−1 )−1 . = H , −1 .. X ... γn
γn
where X −1 is given in Eq. (38) and we remembered that γ0 = 0. Since 1 − χ ( p0 )|η = η( p0 ) the equation of the spectral curve finally becomes −1
µ+µ
1 sin Z 1 1 − 2 cos Z = − 2 . m + × ηm χ( p0 )| Z Z 2 − m2π 2 η( p0 ) p0 − m 2 π 2 m
Another useful expression of this result is: sin Z η(Z ) m m µ + µ−1 = 2 cos Z + − . Z Z 2 − m2π 2 η( p0 ) p02 − m 2 π 2 m
(44)
The next proposition performs a few consistency checks. ±1 Proposition 2. The points (Z = p0 , µ±1 0 ), and (Z = Z k , µk ), all belong to the curve Eq. (44).
Proof. When Z = p0 , we find µ + µ−1 = 2 cos p0 , hence the curve passes through the point = 0, µ±1 0 , as it should be. When Z = Z k , recalling that η(Z k ) = 0, we find 1 Zl sin Z k M −1 (µl + µl−1 − 2 cos Z l ) Z k m Z k2 − m 2 π 2 ml sin Z l sin Z k −1 Z l Mkm Mml (µl + µl−1 − 2 cos Z l ) = µk + µ−1 = 2 cos Z k + k . Zk m sin Z l
µ + µ−1 = 2 cos Z k +
Hence the curve passes through the points Z k , µ±1 k .
We now show that the Hm in Eq. (41) all Poisson commute. We need the following result: Lemma 3. One has { n , m } = 0, { n , ηm } = { m , ηn }, {ηn , ηm } = 0.
(45) (46) (47)
Proof. Recall the definitions Eqs. (37,40) of ηm and m , −1 −1 ηm = Mmk |1k , m = Mmk |γ˜ k , γ˜k =
Zk γk , sin Z k
−1 where Mmk is the inverse of the matrix defined in Eq. (36). The relation Eq. (47) is obvious because the ηm depend only on the Z k . Consider the second relation Eq. (46): −1 −1 −1 −1 1k , Mml γ˜l } = Mm,l {Mnk , γ˜l }1k {ηn , m } = {Mnk −1 −1 −1 −1 = −Mm,l Mn,k {Mk p , γ˜l }M −1 pk |1k = −Mm,l Mn,l {Mlp , γ˜l }η p ,
where in the last step we used that {Mk p , γ˜l } = 0 if k = l. The result is obviously symmetric in m and n. Finally the first statement, Eq. (45), is simple. One has −1 −1 −1 −1 −1 γ˜k , Mnl γ˜l } = −Mmr Mnl {Mr s , γ˜l } − {Mls , γ˜r } Msk γ˜k { m , n } = {Mmk but because of the structure of M, we have {Mr s , γ˜l } = 0 if r = l and for r = l the term in the square bracket obviously vanishes. This is a special case of a general theorem [17, 18]. We are now ready to prove Proposition 3. The quantities H0 , Hm , Poisson commute {H0 , Hn } = 0, {Hn , Hm } = 0. Proof. Using Eq. (42), one has {Hn , Hm } = ηn ({C, m } + C{C, ηm }) − ηm ({C, n } + C{C, ηn }), where we denoted C= One has { m , C} =
l 1 1 χ ( p0 )| = . 2 η( p0 ) η( p0 ) p0 − π 2 l 2 l
{ m , ηl } 1 1 {ηm , l } C , {ηm , C} = , 2 2 2 η( p0 ) η( p0 ) p0 − π l p02 − π 2 l 2 l l
hence {C, m } + C{C, ηm } = −
{ m , ηl } + {ηm , l } 1 C = 0. η( p0 ) p02 − π 2 l 2 l
All this means that (Z k , µk ) are separated coordinates for the Hamiltonians Hm .
6. Continuum Limit of the Eigenvector Having found the continuum limit of the spectral curve, we now consider the limit of the eigenvector. Again, the continuum limit can be computed, the result being Eq. (50). As seen from Eq. (29), the eigenvector can be written as (for i odd) √ A det Ni ψi = 2 , √ (z − z 2j ) det i det i+2 where
µPi (z) + PN +1−i (z) Pi (z 1 ) + µ1 PN +1−i (z 1 ) .. . Ni = Pi (z k ) + µ j PN +1−i (z k ) .. . Pi (z n ) + µn PN +1−i (z n )
µP1 (z) P1 (z 1 ) .. . P1 (z k ) .. .
· · · −P2 (z) · · · −µ1 P2 (z 1 ) .. .. . . . −µk PN +1−i (z k ) · · · −µk P2 (z k ) .. .. .. . . .
· · · −PN +1−i (z) · · · −µ1 PN +1−i (z 1 ) .. .. . . ··· .. .
P1 (z n ) · · · −µn PN +1−i (z n ) · · · −µn P2 (z n )
(48) Compared to Eq. (29), we have subtracted the i th column from the first one for the same reason as in the previous section. Also we have used the variable z, z k instead of λ, λk . Let us decompose the matrix Ni in blocs particularizing the first row and first column: Ui Vi , Ni = Wi i where Ui ≡ µPi (z) + PN +1−i (z), (Wi )k ≡ µk Pi (z k ) + PN +1−i (z k ), (Vi ) j = µP j (z)θ (i − j) − PN +1− j (z)θ ( j − i), i, j odd. To order zero in perturbation, we have −(−1)k P j (z k(0) ) = PN +1− j (z k(0) ) so that Vi Ui (0) , Ni = 0 i(0) (0)
where i is the matrix Eq. (30) evaluated on the free configuration. It is in fact independent of i and we will denote it by (0) . The appearance of zero in the lower left corner was the reason for the subtraction in Eq. (48) and makes things better behaved in perturbation. (0) The matrix Ni being bloc triangular we can compute its inverse: −1 −Ui−1 Vi (0)−1 Ui (0)−1 = Ni 0 (0)−1 so that (0)−1
Ni Ni
=
1
Ui−1 Wi
0
i (0)−1 − Ui−1 Wi ⊗ Vi (0)−1
.
Returning to the formula for ψi , we multiply all the matrices by (0)−1 . The factors det (0)−1 cancel between the numerator and denominator. We arrive at
√ (0)−1 − U −1 W ⊗ V (0)−1 det i i i i A Ui ψi = 2
. 2 k z − zk (0)−1 (0)−1 det i+2 det i We want to take the scaling limit of this expression. Again, the main trick is an explicit formula for the inverse of (0) . It is not difficult to check that kπ 4 (0) sin2 P j (z k ). ((0)−1 ) jk = N +1 N +1 Let us compute i (0)−1 . Using the parametrization Eq. (33) we find (recall that i is assumed to be odd) i−2 mπ sin jmπ 4 j Zk N +1 sin sin (i (0)−1 )km = Z k N + 1 sin N +1 N +1 N +1 j=1,odd N −1 jmπ (N + 1 − j)Z k sin −µk sin . N +1 N + 1 j=i,odd
Defining (x) as the scaling limit of i (0)−1 , we find (there is a factor 1/2 because the sum is over j odd only) x 1 mπ ((x))km = 2 dy sin Z k y sin mπ y − µk dy sin Z k (1 − y) sin mπ y . Zk 0 x Similarly, we define U (x, , µ) and Wk (x) by Ui = (N + 1)U (x, , µ), (Wi )k = (N + 1)Wk (x). We find U (x, , µ) = µ
sin x Z sin(1 − x)Z + , Z Z
Z=
2 + p02
and Wk (x) =
sin Z k (1 − x) sin Z k x + µk . Zk Zk
Finally, we have (again there is a factor 1/2 because the sum is over j odd only) (Vi (0)−1 )m = Vm (x, Z , µ), where
x
Vm (x, , µ) = 2mπ 0
sin y Z sin mπ y − dy µ Z
1 x
sin(1 − y)Z sin mπ y . Z
(49)
Putting everything together, we arrive at (up to a factor4 independent of x) 4 In this factor we will include in particular
1 2 2 k (1−Z /Z k )
which produces the poles at Z 2 = Z k2 . This is
important for the analyticity properties of ψ(x) but plays little role for the considerations of this paper.
Proposition 4. ψ(x, , µ) = U (x, , µ) − V (x, , µ)|−1 (x)|W (x),
(50)
where we denoted by V (x, , µ)| the row vector with components Vm (x, , µ) and by |W (x) the column vector with components Wk (x). It is easy to show that the infinite sums involved in this formula converge under the conditions Eq. (65) of Sect. 9. Equation (50) is the generalization of Eq. (20). Here, the and µ dependence is entirely contained in the function U (x, , µ) and the vector V (x, , µ). For the moment they are free complex parameters. We now want to specialize to = 0, µ0 = e±i p0 ., We have U (x, , µ)|0,e±i p0 =
sin p0 ±i p0 x e , p0
m(±) (x, p0 ), Vm (x, , µ)|0,e±i p0 = U (x, , µ)|0,e±i p0 V where m(±) (x, p0 ) = −mπ V
$
% eimπ x e−imπ x . + mπ ± p0 mπ ∓ p0
Hence, up to a constant (±) (x, p0 )|−1 (x)|W (x) . ψ (±) (x, p0 ) = e±i p0 x 1 − V These are the primary fields of CFT. Their logarithmic derivatives are the free fields of the Coulomb gas representation. Notice that we have two such fields playing a completely symmetrical role: we go from one to the other by changing p0 → − p0 . This circumstance was recognized and used with great profit in [14]. The separated variables make this symmetry explicit and built in. 7. Schroedinger Equation Having found a formula for the wave function ψ(x, , µ), the next question is to find the potential in the Schroedinger equation that ψ(x, , µ) is expected to satisfy. At this point it is simpler to forget the lattice model and work directly with Eq. (50). Let us denote by |E(x) the vector with components E m (x) = 2mπ sin mπ x. Calculating explicitly the integrals in Eq. (49), we find (’ denotes the derivative with respect to x) Vm (x, , µ) =
U (x, , µ)E m (x) − U (x, , µ)E m (x) , Z 2 − m2π 2
Z 2 = 2 + p02 , (51)
and similarly for (x), we find km (x) =
Wk (x)E m (x) − Wk (x)E m (x) Z k2 − m 2 π 2
.
(52)
Notice the important formulae V (x, , µ)| = U (x, , µ)E(x)|, (x) = |W (x)E(x)|. The derivative of the matrix (x) is a rank one projector. The matrix (x) has a form familiar in the theory of integrable systems, and we know that it leads to non linear differential equations. Indeed, let us define the vector |(x) by |(x) = −1 (x)|W (x)
(53)
and let K be the diagonal matrix K mm = mπ δmm . Proposition 5. The vector |(x) satisfies the set of coupled non linear second order differential equations | + 2(|E) | + K 2 | = 0, where |E(x) =
(54)
m (x)E m (x).
m
Proof. By derivation, and using the formula for , we get |W = |, |W = | + | = |E|W + | , |W = |E|W + | + (|E + |E)|W . But we also have Z 2 − K 2 = |W E | − |W E|, where we defined the diagonal matrix Z Zkk = Z k δkk . Applying this identity to |, we get Z 2 | − K 2 | = E ||W − E||W . But Z 2 | = Z 2 |W = −|W , and therefore |W = E||W − E ||W − K 2 |. Comparing these two expressions of |W , we find | + (|E + |E)|W = −E ||W − K 2 |. Multiplying by −1 yields Eq. (54).
We are now ready to find the Schroedinger equation satisfied by ψ.
(55)
Proposition 6. The function ψ(x, , µ) defined by Eq. (50) satisfies the linear second order differential equation −ψ (x, , µ) − [ p02 + 2(E|) ] ψ(x, , µ) = 2 ψ(x, , µ).
(56)
Proof. We have ψ(x, , µ) = U (x, , µ) − V (x, , µ)|−1 |W (x) = U (x, , µ) − V (x, , µ)|(x). Using the formula for V (x, , µ)|, we get 1 1 ψ(x, , µ) = U 1 − E | 2 | + U E| 2 |. Z − K2 Z − K2 Next, remembering that U (x, , µ) = −Z 2 U (x, , µ), |E (x) = −K 2 |E(x), we obtain 1 1 ψ (x, , µ) = −U E| + E | 2 | + U 1 + E| 2 | Z − K2 Z − K2 and
1 | ψ (x, , µ) = −U Z + E| + (E|) + E | 2 Z − K2 1 | . +U −E| + E| 2 Z − K2
2
Using now the equation for | , we get Eq. (56).
The potential T (x) = 2(E|) can also be written directly in terms of . In fact, we have ∂x2 log det = ∂x Tr −1 = ∂x E|−1 W = ∂x E|, hence T (x) = 2∂x E| = 2∂x2 log det (x). The Schroedinger equation therefore also reads ψ (x, , µ) + [ p02 + 2 ∂x2 log det ] ψ(x, , µ) = −2 ψ(x, , µ).
(57)
In this formula both the potential and the function ψ(x, , µ) are known. The potential therefore belongs to the class of exactly solvable potentials. It is strongly reminiscent of the formula for finite zones potentials [20–23]. It can probably also be obtained by an infinite sequence of Darboux transformations [24]. The parameter µ which enters the function U (x, , µ) and the vector V (x, , µ)| was, up to now, a free parameter. Eq. (50) therefore provides two linearly independent solutions of Eq. (56). We now introduce the spectral curve by imposing the quasiperiodicity of ψ(x, , µ).
8. Bloch Waves and Spectral Curve So far, ψ(x, , µ) was defined on the interval [0, 1]. We extend its definition by imposing ψ(x + 1, , µ) = µψ(x, , µ). This extension is continuous as we now show. Proposition 7. ψ(1, , µ) − µψ(0, , µ) = 0. Proof. This follows immediately from −1 m Wk (1) = µ−1 U (1, , µ) = µU (0, , µ). k Wk (0), km (1) = µk km (0)(−1)
It is worth computing explicitly ψ(0, , µ). In terms of the matrix M introduced in Eq. (36), we have km (0) = Wk (0)Mkm E m (0), U (0, , µ) = Vm (0, , µ) = U (0, , µ) It follows that sin Z ψ(0, , µ) = Z
1−
sin Z , Z
1 E (0). Z 2 − m2π 2 m
ηm 2 Z − m2π 2
m
=
sin Z η(Z ), Z
(58)
where ηm and η(Z ) are defined in Eqs. (37, 39). Notice that when Z 2 = Z k2 , we have ψ(0, , µ) = 0, by definition of η(Z ). We now turn to the derivative of ψ(x, , µ). Lemma 4. One has ψ (1, , µ) − µψ (0, , µ) = µ (, µ), (, µ) = µ + µ−1 − 2 cos Z −
∞ sin Z E m (0)m (0) − E m (1)m (1) . Z Z 2 − m2π 2 m=1
Proof. We have U (1, , µ) = µ
sin Z sin Z , U (0, , µ) = , Z Z
and U (1, , µ) = µ cos Z − 1, U (0, , µ) = µ − cos Z . Using E k (1) = E k (0) = 0, we get 1 | (1) + U (1, , µ), Z2 − K 2 1 ψ (0, , µ) = −U (0, , µ)E (0)| 2 | (0) + U (0, , µ). Z − K2 From this the result follows. ψ (1, , µ) = −U (1, , µ)E (1)|
(59)
At this point it is tempting to identify the spectral curve as (, µ) = 0. However, this cannot be correct because the point = 0, µ = µ0 does not belong to it. We have to change the Schroedinger equation. The only possible modification is at the edges. We consider therefore the equation ψ (x, , µ) + p02 + 2E + H0 δ(x) ψ(x, , µ) = −2 ψ(x, , µ). (60) The bulk formula for ψ(x, , µ) does not change. The continuity of ψ(x, , µ) at x = 1 still holds, but the derivative now has a discontinuity
1+
d x ψ + H0 ψ(1) = 0.
1−
Using ψ(1, , µ) = µ η(Z )
sin Z Z
the Bloch condition becomes ψ (1, , µ) − µψ (0, , µ) − H0 µ η(Z ) = 0, that is −1
µ+µ
sin Z = 2 cos Z + Z
m
m , −H0 η(Z ) , Z 2 − m2π 2
where we have set m = E m (0)m (0) − E m (1)m (1).
(61)
We now determine the coefficient H0 by requiring that the curve passes through the points p0 , µ±1 0 . We find H0 =
m 1 . η( p0 ) m p02 − m 2 π 2
Hence the curve takes the form −1
µ+µ
sin Z = 2 cos Z + Z
m
m m η(Z ) − Z 2 − m2π 2 η(Z 0 ) p02 − m 2 π 2
.
In order to compare with Eq. (44), we must compute m in Eq. (61). We have Wk (0) = µk
sin Z k sin Z k , Wk (1) = , Wk (0) = 1 − µk cos Z k , Zk Zk Wk (1) = cos Z k − µk
and km (0) = km (1) =
Wk (0)E m (0) = Wk (0)Mkm E m (0), Z k2 − π 2 m 2 Wk (1)E m (1) = Wk (1)Mkm E m (1), Z k2 − π 2 m 2
where M is the matrix introduced in Eq. (36). Since | (0) = −1 (0)|W (0) and | (1) = −1 (1)|W (1), one has E m (0)m (0) =
−1 Mmk (µ−1 k − cos Z k )
k
E m (1)m (1) =
−1 Mmk (cos Z k − µk )
k
Zk , sin Z k
Zk sin Z k
hence m = E m (0)m (0) − E m (1)m (1) =
−1 Mmk
k
Zk γk , sin Z k
where we recall that γk = µk + µ−1 k − 2 cos Z k . This is exactly Eq. (40) and shows that we have recovered precisely the spectral curve Eq. (44). Finally, let us close this section by proving the Proposition 8. The function ψ(x, , µ) has no pole at the point Z = Z k , µ−1 k . Proof. Here we restore the factor
1 k (1−Z
2 /Z 2 ) k
,
−1 U (x, , µ)| Z =Z k ,µ−1 = µ−1 k Wk (x), =⇒ Vk (x, , µ)| Z =Z k ,µ−1 = µk kk (x), k
k
ψ(x, , µ)| Z =Z k ,µ−1 k
1 −1 −1 (x), (x)Wk (x) = regular. × W (x) − µ k kk k k k 2 2 k (1 − Z /Z k )
This shows that the same property for the eigenvector on the lattice has been preserved when taking the continuum limit.

9. Perturbation Theory

In the previous sections, we have manipulated determinants of infinite matrices quite freely. It is necessary now to investigate the conditions for the existence of the determinant det (x). We recall the free configuration Eq. (16). In the scaled variables it reads
\[
Z_k^{(0)} = k\pi, \qquad \mu_k^{(0)} = (-1)^k.
\]
By construction, when $(Z_k, \mu_k) = (Z_k^{(0)}, \mu_k^{(0)})$, we have (x) = Id, so that det (x) = 1. Clearly for det (x) to exist, we have to assume $(Z_k, \mu_k) \to (Z_k^{(0)}, \mu_k^{(0)})$ when $k \to \infty$. Hence we set
\[
Z_k = k\pi + \delta Z_k, \qquad \mu_k = (-1)^k (1 + \delta\mu_k). \tag{62}
\]
It is not difficult to see that to leading order in $(\delta Z_k, \delta\mu_k)$, we have
\[
W_k(x) = \frac{1}{k\pi}\bigl(\delta Z_k \cos k\pi x - \delta\mu_k \sin k\pi x\bigr),
\]
and this implies k,m (x) =
2mπ δ Z k (m cos kπ x cos mπ x + k sin kπ x sin mπ x) k(Z k2 − π 2 m 2 ) (63) +δµk (−m sin kπ x cos mπ x + k cos kπ x sin mπ x) .
Notice that when m = k this formula gives that to leading order k,k (x) = 1, as it should be. A first consequence of these formulae is that if δ Z k = 0, δµk = 0, beyond a certain index k = kmax , then for k > kmax we have Wk (x) = 0, k,k (x) = 1, and k,m (x) = 0, ∀m = k. As a result only the first block of size kmax × kmax of the matrix (x) plays a role and all the constructions of the previous sections reduce to finite size matrices and vectors. If however we want to retain an infinite number of modes in order to keep the field theoretical character of the model, one has to say something about the rate at which δ Z k and δµk tend to zero when k → ∞. Disregarding a finite number of possibly large δ Z k , δµk which play no role in these convergence questions, we may assume that (x) is given by Eq. (63). As we have seen, it is of the form (x), (x) = Id + (x) is small. In fact, bounding the trigonometric functions by 1, we have where m k,m (x)| ≤ c (|δ Z k | + |δµk |), m = k. | k|k − m| Since |k − m| ≥ 1 when k = m, we may write as well m k,m (x)| ≤ c (|δ Z k | + |δµk |), m = k. | k It is not difficult to see that this formula is valid also for m = k (we have to adapt the constant c). It follows that (x)) log det (x) = Tr log(1 + ∞ ∞ n n 1 c n (x)| ≤ Tr| ≤ |δ Z k | + |δµk | . n n n=1
n=1
(64)
k
Hence a sufficient condition for the existence of the determinant is that the series $\sum_k \bigl(|\delta Z_k| + |\delta\mu_k|\bigr)$ converges. This is achieved if
\[
|\delta Z_k| + |\delta\mu_k| < \frac{c}{k^{1+\epsilon}}, \qquad \epsilon > 0. \tag{65}
\]
One can then adjust the constant c such that the series in Eq. (64) converges. The condition Eq. (65) ensures that log det (x) exists. To build the potential u(x) however, this function has to be twice differentiable in x and this may require stronger conditions on . Now that we have found an expression for the potential u(x) in terms of a countable set of variables Z k , µk , we would like to check the Virasoro Poisson bracket directly. Recall that Tn e2iπ x . u(x) = p02 + T (x) + H0 δ(x), T (x) = 2∂x E| = n
Notice first that T (x) has no Fourier component T0 : 1 1 T0 = d x T (x) = 2 d x(E|) = 2( E(1)|(1) − E(0)|(0) ) = 0, 0
0
where we used that E m (0) = E m (1) = 0. The Fourier expansion of the potential u(x) reads L n e2iπ nx = p02 + (Tn + H0 )e2iπ nx . u(x) = n
n
We must therefore identify L 0 = p02 + H0 = p02 +
m 1 2 η(Z 0 ) m p0 − m 2 π 2
(66)
and L n = Tn + H0 , =⇒ Tn = L n − L 0 + p02 , n = 0 If u(x) has Poisson bracket Eq. (4), the algebra of the L n reads {L n , L m } = 8iπ(n − m)L n+m − 16iπ 3 n 3 δn+m,0 . The Poisson algebra for the Tn is then closed: {Tn , Tm } = 8iπ(n − m)Tn+m − 8iπ nTn + 8iπ mTm − 16i(π 3 n 3 − π p02 n)δn+m,0 or, in a form that will be useful later, {T (x), T (y)}=2δ (x − y)+4(2 p02 +T (x) + T (y))δ (x − y)−4T (x)δ(y)+4T (y)δ(x). (67) In this section, we consider the situation where all the variables (Z k , µk ) are close to the free configuration as in Eq. (62), and we perform a perturbation theory in δ Z k , δµk . We have seen that to lowest order (0) (x) = Id by construction. So, we can write the expansion (x) = Id + (1) (x) + (2) (x) + · · · , |W (x) = |W (1) (x) + |W (2) (x) + · · · , where we have taken into account that |W (0) (x) = 0. It follows that |(x) = |(1) (x) + |(2) (x) + · · · ,
where |(1) = |W (1) , |(2) = |W (2) − (1) |W (1) , . . . . To lowest order, we find easily T (1) (x) = 2E|(1) = 4
kπ(δ Z k cos 2kπ x − δµk sin 2kπ x).
k
This shows in particular that $\delta Z_k$ and $\delta\mu_k$ are just the Fourier components of the potential in this first approximation. We see here clearly that for $T^{(1)}(x)$ to exist as a function (and not just as a distribution), we need $\epsilon > 1$ in Eq. (65). The Poisson bracket Eq. (3) becomes to leading order
\[
\{\delta Z_k, \delta\mu_k\} = 2k\pi\left(1 - \frac{p_0^2}{k^2\pi^2}\right).
\]
To define modes independent of the zero mode $p_0$ we introduce
\[
a_k = \alpha_k(\delta\mu_k - i\,\delta Z_k), \qquad a_k^\dagger = \bar\alpha_k(\delta\mu_k + i\,\delta Z_k),
\]
where the coefficients $\alpha_k$, $\bar\alpha_k$ satisfy$^5$
\[
\alpha_k\bar\alpha_k = \frac{1}{4\pi}\,\frac{k^2\pi^2}{p_0^2 - k^2\pi^2}.
\]
With this choice, one has $\{a_k, a_k^\dagger\} = ik$. We can then rewrite
\[
T^{(1)}(x) = 2i\sum_k k\pi\left(\frac{a_k}{\alpha_k}\,e^{2ik\pi x} - \frac{a_k^\dagger}{\bar\alpha_k}\,e^{-2ik\pi x}\right).
\]
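The normalization of the oscillators can be verified by a short computation. The sketch below assumes only the leading-order bracket $\{\delta Z_k, \delta\mu_k\} = 2k\pi(1 - p_0^2/k^2\pi^2)$ and the value of the product $\alpha_k\bar\alpha_k$ quoted above. The function name and the sample values are ours. Note that only the product $\alpha_k\bar\alpha_k$ enters, so `alpha` may be any nonzero complex number and `alpha_bar` is then fixed by the product.

```python
import math

def bracket_a_adag(k, p0, alpha):
    """Compute {a_k, a_k^dagger} by bilinearity of the Poisson bracket.

    a = alpha (dmu - i dZ), a^dagger = alpha_bar (dmu + i dZ), so that
    {a, a^dagger} = alpha * alpha_bar * (i {dmu, dZ} - i {dZ, dmu})
                  = -2i * alpha * alpha_bar * {dZ, dmu}.
    """
    kp = k * math.pi
    br_Z_mu = 2.0 * kp * (1.0 - p0**2 / kp**2)               # {dZ_k, dmu_k}
    product = (kp**2 / (p0**2 - kp**2)) / (4.0 * math.pi)    # alpha_k * alpha_bar_k
    alpha_bar = product / alpha                              # fixed by the product
    return alpha * alpha_bar * (-2j) * br_Z_mu

if __name__ == "__main__":
    for k in (1, 2, 5):
        print(k, bracket_a_adag(k, p0=0.7, alpha=0.4 + 0.2j))
```

The $p_0$ dependence cancels between the bracket and the product $\alpha_k\bar\alpha_k$, which is the sense in which these modes are independent of the zero mode.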
It is now straightforward to compute the Poisson bracket {T (1) (x), T (1) (y)} = 2δ (x − y) + 8 p02 δ (x − y). This is the correct result for the Virasoro Poisson bracket in this approximation. Notice that the term δ (x − y) is exact already at this level. Higher order terms cannot contribute to it. Next, we look at the conserved quantities. The leading terms in the expansions of ηm and Hm are easy to find: 2 ). ηm 2mπ δ Z m , m 2m 2 π 2 (δµ2m + δ Z m
To see it, consider the defining relations of ηm , ηm m
Z k2 − π 2 m 2
= 1, ∀k.
5 It is known [14] that the poles at p 2 = k 2 π 2 are classical remnants of the zeroes of the Kac determinant. 0
When $Z_k$ is given by Eq. (62), the dominant term in the above sum is $m = k$. The equation becomes $\eta_k/(2k\pi\,\delta Z_k) = 1$. The same argument starting with the equation of the spectral curve, Eq. (41), taken at the point $(Z_k, \mu_k)$ which belongs to it, yields the formula for $H_m$. Note that the condition Eq. (65) ensures that the sums in the definition of the function $\eta(Z)$, Eq. (39), or in the definition of the spectral curve, Eq. (41), are convergent. Written with the oscillators $a_m$, $a_m^\dagger$, we find
\[
H_m = 8\pi\,(p_0^2 - \pi^2 m^2)\,a_m^\dagger a_m, \qquad H_0 = 8\pi\sum_m a_m^\dagger a_m.
\]
It is clear that the $H_m$ are in involution at this order. As we see, in first approximation, the dynamical system reduces to a set of decoupled harmonic oscillators. The generator $L_0$ is given by
\[
L_0 = p_0^2 + 8\pi\sum_m a_m^\dagger a_m.
\]
It is easy to verify that {L 0 , T (1) (x)}0 = −4∂x T (1) (x). These perturbative arguments are good indications that u(x) indeed satisfies the Virasoro Poisson bracket. Clearly we will not go very far in perturbation and we now look for a more formal proof of this fact. For that purpose, we need some preparation. 10. Some Identities Before computing Poisson brackets to check the Virasoro algebra, we collect a number of useful identities. We start with a formula for the inverse matrix −1 (x). It has the same form as (x). Proposition 9. Let us define F| = E|−1 .
(68)
Then, we can write −1 mk =
m Fk − m Fk Z k2 − π 2 m 2
.
(69)
Proof. Multiplying Eq. (55) on both sides by −1 , we get −1 Z 2 − K 2 −1 = −1 |W E |−1 − −1 |W E|−1 and so −1 mk =
(−1 |W )m (E|−1 )k − (−1 |W )m (E |−1 )k . π 2 m 2 − Z k2
But −1 |W = | + E||, E |−1 = F | + E|F|. Plugging into the above formula, we obtain Eq. (69).
Proposition 10. The vector F| satisfies a set of differential equations, F | + 2(E|) F| + F|Z 2 = 0.
(70)
Proof. The proof is the same as for |. From this we easily deduce (−1 ) = −|F|, which can also be proved using the similar property of . Let us define Ak (x) =
E m (x)m (x) , Z k2 − m 2 π 2 m
Ck (x) =
E m (x)m (x) , 2 2 2 2 m (Z k − m π )
Bk (x) =
E (x)m (x) m , 2 2 2 Z k −m π m
Dk (x) =
E (x)m (x) m . 2 2 2 2 (Z k −m π ) m
Proposition 11. We have the identity (1 − Bk )Wk + Ak Wk = 0.
(71)
Proof. This is just a rewriting of |W = | using Eq. (52) for (x).
Proposition 12. The following two identities hold (1 − Bk + Ak )Fk − Ak Fk = 0, (Bk
+
Z k2 Ak )Fk
+ (1 −
Bk )Fk
= 0.
(72) (73)
Proof. The first identity is a rewriting of F| = E|−1 using Eq. (69) for −1 (x). The second identity is just a rewriting of F | = E |−1 − E|F|. The above two identities form a linear system for Fk and Fk . Its compatibility implies the following: Proposition 13. (1 − Bk )2 + Ak Bk − Ak Bk + Ak + Z k2 A2k = 0.
(74)
k = Wk (1 − Bk ) + Wk Ak . F
(75)
Let us define
An important consequence of Eqs. (72,73) is k (x) are proportional, Proposition 14. The functions Fk (x) and F Fk (x) = −
ζk Fk (x). µk γk
(76)
The proportionality coefficient is written in this specific way for later convenience. The quantity γk is the one defined in Eq. (34).
Proof. Let us compute the Wronskian
k ) = Fk Wk (1 − Bk ) − Wk Bk + Wk Ak + Wk Ak W r (Fk , F
−Fk Wk (1 − Bk ) + Wk Ak
= −2k Wk (1 − Bk + Ak )Fk − Ak Fk
−Wk (Bk + Z k2 Ak )Fk + (1 − Bk )Fk = 0. Proposition 15. The following quantities are constants independent of x: ηm = m E m − m E m − < E > m E m 1 (m p − m p )(E m E p − E m E p ). − π 2 ( p2 − m 2 )
(77)
p =m
The quantities ηm defined in Eq. (77) are in fact the same as the ones introduced in Eq. (37). Proof. To prove the first statement, just take the derivative with respect to x and use Eq. (54). To prove the second statement, use the ηm defined in Eq. (77) to rewrite Eq. (74) as ηm = 1, ∀ k, (78) 2 − m2π 2) (Z k m which is exactly the same as Eq. (37).
A straightforward consequence is the “trace” formula that will be useful later: E − E + E2 = − ηm .
(79)
m
We now compute the coefficients ζk appearing in Eq. (76). Proposition 16. The coefficients ζk in Eq. (76) are determined by the set of equations ζk = 1, ∀ m. (80) 2 − m2π 2 Z k k These equations are dual to Eq. (78). Proof. Start with F|W = E|−1 |W = E|. Using Eq. (71,75,76 ), we have Fk Wk = − hence F|W =
k
ζk A k =
ζk (−Wk2 + Wk Wk )Ak = ζk Ak , µk γk
m
k
ζk Z k2 − m 2 π 2
E m m = E| =
Since this has to hold for all x, the only possibility is Eq. (80).
(81)
m
E m m .
We collect below a few more identities of the type of Eq. (81) that will be important later. Proposition 17. Fk Wk = ζk Ak , Fk Wk = −ζk (1 − Bk ), Fk Wk = ζk (1 − Bk + Ak ),
(82) (83) (84)
Fk Wk = ζk (Bk + Z k2 Ak ).
(85)
Next, we relate the ηm and ζk Proposition 18. The following relation holds: 2 m (Z k
ηm 1 = . 2 2 2 ζ −m π ) k
Proof. We start from
km −1 mk = δkk .
m
When k = k this gives Wk Fk Dk − Wk Fk (Dk − Ak + Z k2 Ck ) − Wk Fk Ck + Wk Fk (Ck − Dk ) = 1, or using Eqs. (82–85), ζk 2Dk (1 − Bk ) + Dk Ak − Dk Ak + A2k − 2Z k2 Ck Ak − Ck Bk − Ck (1 − Bk ) = 1. Expanding this formula using ( p = m) 1 (Z k2
− π 2 m 2 )2 (Z k2
− π 2 p2 )
1 1 2 2 − m ) (Z k − π 2 m 2 )2 1 1 1 − 4 2 − , π ( p − m 2 )2 (Z k2 − π 2 m 2 ) (Z k2 − π 2 p 2 )
=−
π 2 ( p2
we get the result with ηm represented by Eq. (77).
An immediate and important consequence is an expansion of the function η(Z ) near Z 2 = Z k2 , η(Z ) = (Z 2 − Z k2 )
1 + ··· . ζk
Returning to the formula for ψ(x, , µ), we can write it as ψ(x, , µ) =
µ − ei Z ∗ µ − e−i Z w(x, Z ) − w (x, Z ), 2i Z 2i Z
(86)
where we define 1 iZ w(x, Z ) = ei Z x 1 − E | 2 | + E| | , Z − K2 Z2 − K 2 1 iZ | − E| | . w ∗ (x, Z ) = e−i Z x 1 − E | 2 Z − K2 Z2 − K 2
(87) (88)
The functions w(x, Z ) and w∗ (x, Z ) are defined on the Riemann sphere with a puncture at ∞. One can compute the Wronskian w w ∗ − w ∗ w = 2i Z η(Z ). This Wronskian vanishes precisely when Z 2 = Z k2 . Hence, when Z = Z k , w(x, Z k ) becomes proportional to w∗ (x, Z k ). Indeed we have w(x, Z k ) = ei Z k x (1 − Bk + i Z k Ak ), w ∗ (x, Z k ) = e−i Z k x (1 − Bk − i Z k Ak ). Then Eq. (71) can be rewritten as w(x, Z k ) =
αk(+)
(−)
αk
w ∗ (x, Z k ),
(89)
where (±)
αk
(+) (−)
= 1 − µk e±i Z k , αk αk
= µk γk .
(90)
k (x) introduced in Eq. (75) is solution of Eq. (70). It is not difficult to see The function F that the other solution of this equation is k . k = Ak Wk + 2Z k (Wk Dk − Wk Ck ) + x F G In terms of w(x, Z ), w∗ (x, Z ) defined in Eqs. (87,88), we have k = 1 (α (−) w| Z k + α (+) w ∗ | Z k ) = α (−) w| Z k = α (+) w ∗ | Z k , F k k k 2 k i k = − (α (−) ∂ Z w| Z k − α (+) ∂ Z w ∗ | Z k ). G k 2 k
(91) (92)
This will play an important role below.
11. Virasoro Algebra We are now ready to compute the Poisson bracket {T (x), T (y)}. The result is precisely the algebra of the Tn = L n − L 0 , Eq. (67).
Proposition 19. Let T (x) = 2∂x E(x)|(x), and let $X(x)$, $Y(x)$ be two arbitrary test functions. Then we have
\[
\begin{aligned}
\Big\{ \int_0^1 dx\, X(x) T(x), \int_0^1 dy\, Y(y) T(y) \Big\}
={}& -\int_0^1 dx\, (X''' Y - X Y''')
 - 4 \int_0^1 dx\, (X' Y - X Y')(p_0^2 + T) \\
& + 4 \int_0^1 dx\, X(x)\delta(x) \int_0^1 dy\, Y(y) T'(y)
 - 4 \int_0^1 dx\, X(x) T'(x) \int_0^1 dy\, Y(y)\delta(y).
\end{aligned}
\]
(93) The proof is rather long and we will split it into several lemmas. Since | = −1 |W , we have {|1 , |2 } = −1 −1 1 2 × {1 , 2 }|1 |2 −{1 , W2 }|1 −{W1 , 2 }|2 + {W1 , W2 } , where the index 1, 2 refers to the customary tensor notation. In this expression, all Poisson brackets can be computed explicitly. Using the fact that rows with different indices in and W Poisson commute and using only Eq. (71), we arrive at p02 sin Z k −1 −1 {m (x), n (y)} = −2 mk (x)nk (y) 1− 2 Z k γk Zk k k (x)G k (y) − F k (y)G k (x) , × F (94) where γk are defined in Eq. (34). Multiplying by E m (x)E n (y) and remembering that E|−1 = F| we get ζ 2 sin Z k p02 k {E(x)|(x), E(y)|(y)} = 2 1− 2 , Z k γk3 µ2k Zk k
\[
\times \big( A_k(y) B_k(x) - A_k(x) B_k(y) \big), \tag{95}
\]
where
\[
A_k(x) = \widehat{F}_k^2(x), \qquad B_k(x) = \widehat{F}_k(x) \widehat{G}_k(x).
\]
Using Eqs. (87,88), we can write
\[
A_k(x) = \mu_k \gamma_k\, w(x, Z) w^*(x, Z) \big|_{Z = Z_k}, \qquad
B_k(x) = \frac{\mu_k \gamma_k}{2i} \big( w^*(x, Z) \partial_Z w(x, Z) - w(x, Z) \partial_Z w^*(x, Z) \big) \big|_{Z = Z_k}.
\]
The strategy to evaluate the right-hand side of Eq. (95) is to rewrite it as a sum over the residues of certain poles of a function on the Riemann Z-sphere. This sum can then be transformed into a sum over the residues of the other poles (there will be none in our case)
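plus an integral over a small circle at infinity surrounding an essential singularity. This last integral can then be evaluated using the known asymptotics of the function. Let us define
\[
a(Z) = \frac{Z}{2i \sin Z}\, e^{-iZ}, \qquad
b(Z) = -\frac{Z}{2i \sin Z} \cos Z, \qquad
c(Z) = \frac{Z}{2i \sin Z}\, e^{iZ}, \tag{96}
\]
and introduce the functions
\[
\Omega_1 = w(x, Z) w^*(y, Z) - w^*(x, Z) w(y, Z),
\]
\[
\Omega_2 = a(Z)\, w(x, Z) w(y, Z) + b(Z) \big( w(x, Z) w^*(y, Z) + w^*(x, Z) w(y, Z) \big) + c(Z)\, w^*(x, Z) w^*(y, Z).
\]
These functions defined on the Riemann Z-sphere have poles at the points $\pm m\pi$ and have an essential singularity at infinity. Let $\Omega = \Omega_1 \Omega_2$ and recall the definition of $\eta(Z)$, Eq. (39).

Lemma 5.
\[
\sum_k \operatorname{Res}_{\pm Z_k} \frac{1}{\eta^2(Z)} \Big( 1 - \frac{p_0^2}{Z^2} \Big) \Omega
= -2 \sum_k \frac{\zeta_k^2 \sin Z_k}{Z_k \gamma_k^3 \mu_k^2} \Big( 1 - \frac{p_0^2}{Z_k^2} \Big)
\big( A_k(y) B_k(x) - A_k(x) B_k(y) \big). \tag{97}
\]

Proof. The factor $1/\eta^2(Z)$ introduces double poles at $\pm Z_k$ because $\eta(Z_k) = 0$. However using Eq. (89), we see immediately that $\Omega_1|_{\pm Z_k} = 0$, so that the poles are in fact simple. Remembering Eq. (86), we have
\[
\sum_k \operatorname{Res}_{\pm Z_k} \frac{1}{\eta^2(Z)} \Big( 1 - \frac{p_0^2}{Z^2} \Big) \Omega
= \sum_k \frac{\zeta_k^2}{4 Z_k^2} \Big( 1 - \frac{p_0^2}{Z_k^2} \Big) \big( \partial_Z \Omega|_{Z_k} + \partial_Z \Omega|_{-Z_k} \big).
\]
We need to compute $\partial_Z \Omega|_{\pm Z_k} = \partial_Z \Omega_1|_{\pm Z_k}\, \Omega_2|_{\pm Z_k}$. Evaluating at $Z_k$ gives
\[
\partial_Z \Omega|_{Z_k} = \frac{2i}{\mu_k^2 \gamma_k^2} \Big( a(Z_k) \frac{\alpha_k^+}{\alpha_k^-} + 2 b(Z_k) + c(Z_k) \frac{\alpha_k^-}{\alpha_k^+} \Big) \big( A_k(y) B_k(x) - A_k(x) B_k(y) \big),
\]
while using $w(-Z_k) = w^*(Z_k)$, $\partial_Z w|_{-Z_k} = -\partial_Z w^*|_{Z_k}$, we also have
\[
\partial_Z \Omega|_{-Z_k} = \frac{2i}{\mu_k^2 \gamma_k^2} \Big( c(-Z_k) \frac{\alpha_k^+}{\alpha_k^-} + 2 b(-Z_k) + a(-Z_k) \frac{\alpha_k^-}{\alpha_k^+} \Big) \big( A_k(y) B_k(x) - A_k(x) B_k(y) \big).
\]
The result follows from the identities:
\[
a(Z_k) \frac{\alpha_k^+}{\alpha_k^-} + 2 b(Z_k) + c(Z_k) \frac{\alpha_k^-}{\alpha_k^+} = 2i\, \frac{Z_k \sin Z_k}{\gamma_k},
\]
\[
c(-Z_k) \frac{\alpha_k^+}{\alpha_k^-} + 2 b(-Z_k) + a(-Z_k) \frac{\alpha_k^-}{\alpha_k^+} = 2i\, \frac{Z_k \sin Z_k}{\gamma_k}.
\]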
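The two identities closing the proof of Lemma 5 only involve the functions $a$, $b$, $c$ of Eq. (96) and the relations of Eq. (90), so they can be checked numerically at arbitrary complex points. The sketch below is an illustration under these assumptions: $\gamma$ is computed from Eq. (90) as $\alpha^{(+)}\alpha^{(-)}/\mu$, and the sample values of $Z$ and $\mu$ are arbitrary rather than actual spectral data of the model.

```python
import cmath

def abc(Z):
    """Return a(Z), b(Z), c(Z) as defined in Eq. (96)."""
    pref = Z / (2j * cmath.sin(Z))
    return pref * cmath.exp(-1j * Z), -pref * cmath.cos(Z), pref * cmath.exp(1j * Z)

def lemma5_identity(Z, mu):
    """Return both sides of a*alpha+/alpha- + 2b + c*alpha-/alpha+ = 2i Z sin Z / gamma."""
    a, b, c = abc(Z)
    alpha_p = 1 - mu * cmath.exp(1j * Z)
    alpha_m = 1 - mu * cmath.exp(-1j * Z)
    gamma = alpha_p * alpha_m / mu  # Eq. (90), used here as the definition of gamma
    lhs = a * alpha_p / alpha_m + 2 * b + c * alpha_m / alpha_p
    rhs = 2j * Z * cmath.sin(Z) / gamma
    return lhs, rhs

# Arbitrary illustrative values for Z and mu.
Z0, MU0 = 0.7 + 0.2j, 0.3 - 0.1j
print(lemma5_identity(Z0, MU0))
```

Since $a(-Z) = c(Z)$ and $b(-Z) = b(Z)$, the second identity is the first one with the roles of $a$ and $c$ exchanged, so the same check covers both.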
Next we have to examine the poles at $\pm m\pi$ in the expression
\[
\frac{1}{\eta^2(Z)} \Big( 1 - \frac{p_0^2}{Z^2} \Big) \Omega .
\]
We rewrite $\Omega_1$ and $\Omega_2$ as
\[
\Omega_1 = (w(x) - w^*(x))\, w^*(y) - (w(y) - w^*(y))\, w^*(x),
\]
\[
\Omega_2 = \frac{Z}{4i \sin Z} \Big[ (w^*(x) - w(x)) \big( e^{iZ} w^*(y) - e^{-iZ} w(y) \big) + (w^*(y) - w(y)) \big( e^{iZ} w^*(x) - e^{-iZ} w(x) \big) \Big].
\]
Recalling the formula Eqs. (87, 88) for $w(x, Z)$ and $w^*(x, Z)$, we see that when $Z = 0$, we have $w = w^*$ so that $\Omega_1 = O(Z)$ and $\Omega_2 = O(Z^2)$. Hence we have no pole at $Z = 0$. When $Z = \pm \pi m + \epsilon$,
\[
w = \frac{1}{\epsilon}\, w^{(-1)}_{\pm m} + w^{(0)}_{\pm m} + \cdots, \qquad
w^* = \frac{1}{\epsilon}\, w^{*(-1)}_{\pm m} + w^{*(0)}_{\pm m} + \cdots
\]
with $w^{(-1)}_{\pm m}(x) = w^{*(-1)}_{\pm m}(x)$ = ∓π mm (x). Because the two leading terms are the same, both $w^*(x, Z) - w(x, Z)$ and $e^{iZ} w^*(x, Z) - e^{-iZ} w(x, Z)$ are regular. So $\Omega_1$ and $\Omega_2$ both behave like $1/\epsilon$. Since $1/\eta^2(Z)$ behaves like $\epsilon^2$, the whole thing is in fact regular. We come to the conclusion that everything happens at infinity. We want to compute
\[
\Big\{ \int_0^1 dx\, X(x) T(x), \int_0^1 dy\, Y(y) T(y) \Big\}
= 4 \int_0^1 dx\, X(x) \int_0^1 dy\, Y(y) \oint_{C_\infty} dZ\, \Big( 1 - \frac{p_0^2}{Z^2} \Big) \frac{1}{\eta^2(Z)}\, \partial_x \partial_y \Omega . \tag{98}
\]
Let
\[
\widehat{w}(x, Z) = \eta^{-1/2}(Z)\, w(x, Z), \qquad \widehat{w}^*(x, Z) = \eta^{-1/2}(Z)\, w^*(x, Z).
\]
The Wronskian of $\widehat{w}(x, Z)$ and $\widehat{w}^*(x, Z)$ is $2iZ$ and therefore these functions coincide with the Baker–Akhiezer functions which are usually introduced in the pseudo-differential approach to the KdV hierarchy (see e.g. [15]). At $Z \to \infty$, we have
\[
\widehat{w}(x, Z) = e^{iZx} \Big( 1 - \frac{\omega(x)}{iZ} + \frac{\omega'(x) + \omega^2(x)}{2 (iZ)^2} + \cdots \Big),
\]
\[
\widehat{w}^*(x, Z) = e^{-iZx} \Big( 1 + \frac{\omega(x)}{iZ} + \frac{\omega'(x) + \omega^2(x)}{2 (iZ)^2} + \cdots \Big),
\]
where we have set ω(x) = E(x).
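Equation (79) is needed to verify this formula. Using these asymptotic forms, we find
\[
\partial_x \widehat{w}^2(x, Z) = \sum_{n=-\infty}^{1} A_n(x) (iZ)^n e^{2iZx},
\]
\[
\partial_x \widehat{w}^{*2}(x, Z) = \sum_{n=-\infty}^{1} (-1)^n A_n(x) (iZ)^n e^{-2iZx},
\]
\[
\partial_x \big( \widehat{w}(x, Z)\, \widehat{w}^*(x, Z) \big) = \sum_{n=-\infty}^{-1} C_{2n}(x) (iZ)^{2n},
\]
where
\[
A_1 = 2, \quad A_0 = -4\omega(x), \quad A_{-1} = 4\omega^2(x), \quad
A_{-2} = -\frac{8}{3} \omega^3 - 2 \int^x \omega'^2, \quad
A_{-3} = \frac{4}{3} \omega^4 + 4\omega \int^x \omega'^2, \quad
C_{-2}(x) = \omega''(x).
\]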
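Consider the term proportional to $b(Z)$ in Eq. (98):
\[
4 \int\!\!\int dx\, dy\, \big( X(x) Y(y) - X(y) Y(x) \big) \oint_{C_\infty} \frac{dZ}{2i\pi}\, b(Z) \Big( 1 - \frac{p_0^2}{Z^2} \Big) \partial_x \widehat{w}^2(x, Z)\, \partial_y \widehat{w}^{*2}(y, Z).
\]

Lemma 6. Let us define
\[
I_p(z) = \oint_{C_\infty} \frac{dZ}{2i\pi}\, b(Z) (iZ)^p e^{2iZz}.
\]
We have
\[
I_{-1}(z) = -\tfrac{1}{2} \delta(z), \quad I_0(z) = -\tfrac{1}{4} \delta'(z), \quad I_1(z) = -\tfrac{1}{8} \delta''(z), \quad I_2(z) = -\tfrac{1}{16} \delta'''(z),
\]
\[
I_{-n}(z) = -\frac{2^{n-3}}{(n-2)!}\, \epsilon(z)\, z^{n-2}, \qquad n \geq 2,
\]
where $\epsilon(z)$ denotes the sign function.

Proof. It is clear that $\partial_z I_p(z) = 2 I_{p+1}(z)$, hence we can determine all the $I_p(z)$ recursively. For $p \geq 0$ this is done by successively differentiating $I_{-1}(z)$, which is easy to calculate,
\[
I_{-1}(z) = \oint_{C_\infty} \frac{dZ}{2i\pi}\, b(Z) \frac{1}{iZ} e^{2iZz}
= \frac{1}{2} \oint_{C_\infty} \frac{dZ}{2i\pi} \frac{\cos Z}{\sin Z} e^{2iZz}
= -\frac{1}{2} \sum_{n \in \mathbb{Z}} e^{2in\pi z}
= -\frac{1}{2} \delta(z).
\]
For $p \leq -2$ we have to successively integrate $I_{-1}(z)$. For this we need boundary conditions which are provided by
\[
\oint_{C_\infty} \frac{dZ}{2i\pi}\, b(Z) Z^{-p} = 0, \qquad p \geq 2. \tag{99}
\]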
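This is because
\[
\begin{aligned}
\sum_{p=2}^{\infty} \zeta^p \oint_{C_\infty} \frac{dZ}{2i\pi}\, b(Z) Z^{-p}
&= -\frac{\zeta}{2i} \sum_{p=2}^{\infty} \oint_{C_\infty} \frac{dZ}{2i\pi} \frac{\cos Z}{\sin Z}\, \zeta^{p-1} Z^{-p+1}
= -\frac{1}{2i} \oint_{C_\infty} \frac{dZ}{2i\pi} \frac{\cos Z}{\sin Z} \frac{\zeta^2}{Z - \zeta} \\
&= \frac{\zeta^2}{2i} \Big[ \cot \zeta - \frac{1}{\zeta} - \sum_{n=1}^{\infty} \frac{2\zeta}{\zeta^2 - n^2 \pi^2} \Big] = 0.
\end{aligned}
\]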
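The vanishing of the last bracket is the classical partial-fraction expansion of the cotangent, $\cot\zeta = 1/\zeta + \sum_{n \geq 1} 2\zeta/(\zeta^2 - n^2\pi^2)$. As a rough numerical illustration (a sketch only; the truncation order and the sample point are arbitrary choices), one can compare a truncated sum with the exact value:

```python
import math

def cot_partial_fractions(zeta, terms=200000):
    """Truncated expansion 1/zeta + sum_{n=1}^{terms} 2*zeta / (zeta^2 - n^2 pi^2)."""
    total = 1.0 / zeta
    for n in range(1, terms + 1):
        total += 2.0 * zeta / (zeta * zeta - (n * math.pi) ** 2)
    return total

def cot(zeta):
    """Exact cotangent for comparison."""
    return math.cos(zeta) / math.sin(zeta)

ZETA = 0.9  # arbitrary sample point away from the poles n*pi
print(cot(ZETA), cot_partial_fractions(ZETA))
```

The series converges slowly (the neglected tail is of order $2\zeta/(\pi^2 N)$ for $N$ terms), which is why a large truncation order is used here.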
Denoting
\[
F_p(x, y) = \sum_{m+n=p} (-1)^m A_n(x) A_m(y),
\]
we get
\[
4 \int_0^1 dx \int_0^1 dy\, \big( X(x) Y(y) - X(y) Y(x) \big) \sum_{p=-\infty}^{2} F_p(x, y) \big( I_p(x - y) + p_0^2 I_{p-2}(x - y) \big).
\]
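In this expression, we separate the terms with a $\delta(x - y)$ function or its derivatives, which will lead to local terms ($L_b$), and the non-local terms ($NL_b$), which are proportional to the sign function $\epsilon(x - y)$:
\[
L_b = 4 \int\!\!\int dx\, dy\, \big( X(x) Y(y) - X(y) Y(x) \big) \Big[ F_2 I_2 + F_1 I_1 + (F_0 + p_0^2 F_2) I_0 + (F_{-1} + p_0^2 F_1) I_{-1} \Big],
\]
\[
NL_b = 4 \int\!\!\int dx\, dy\, \big( X(x) Y(y) - X(y) Y(x) \big) \sum_{n=0}^{\infty} (F_{-2-n} + p_0^2 F_{-n}) I_{-n-2}. \tag{100}
\]
We have
\[
F_2(x, y) = -4, \qquad F_1(x, y) = 8 (\omega(x) - \omega(y)), \qquad F_0(x, y) = -8 (\omega(x) - \omega(y))^2,
\]
\[
F_{-1}(x, y) = \frac{16}{3} (\omega(x) - \omega(y))^3 + 4 \int_y^x \omega'^2 .
\]
The local terms are
\[
\begin{aligned}
L_b = 4 \int_0^1 dx \int_0^1 dy\, \big( X(x) Y(y) - X(y) Y(x) \big) \Big[ & \tfrac{1}{4} \delta'''(x - y) + p_0^2\, \delta'(x - y) - (\omega(x) - \omega(y))\, \delta''(x - y) \\
& + 2 (\omega(x) - \omega(y))^2\, \delta'(x - y) - \tfrac{1}{2} F_{-1}(x, y)\, \delta(x - y) \Big].
\end{aligned}
\]
The last two terms obviously vanish, because $(\omega(x) - \omega(y))^2 \delta'(x - y)$ and $F_{-1}(x, y) \delta(x - y)$ are both zero as distributions, and what remains is
\[
L_b = \int_0^1 dx\, \Big[ -(X''' Y - X Y''') - 4 (X' Y - X Y')(p_0^2 + 2\omega') \Big].
\]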
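The closed forms quoted for $F_2$, $F_1$, $F_0$ and $F_{-1}$ follow from the definition $F_p = \sum_{m+n=p} (-1)^m A_n(x) A_m(y)$ and the coefficients $A_1, \dots, A_{-2}$. The sketch below checks this at one arbitrary point; it treats $\omega$ and the integral $\int^x \omega'^2$ at each point as plain numbers, which is an illustrative simplification, and all names and sample values are hypothetical.

```python
def A(n, w, I):
    """Coefficient A_n (for -2 <= n <= 1) at a point where omega = w and int^x omega'^2 = I."""
    table = {
        1: 2.0,
        0: -4.0 * w,
        -1: 4.0 * w ** 2,
        -2: -(8.0 / 3.0) * w ** 3 - 2.0 * I,
    }
    return table[n]

def F(p, wx, Ix, wy, Iy):
    """F_p(x, y) = sum over m + n = p of (-1)^m A_n(x) A_m(y); needs only n, m >= -2 when p >= -1."""
    total = 0.0
    for n in range(-2, 2):
        m = p - n
        if -2 <= m <= 1:
            total += (-1) ** m * A(n, wx, Ix) * A(m, wy, Iy)
    return total

# Arbitrary illustrative values of omega and of the integral at the points x and y.
WX, IX, WY, IY = 0.7, 0.3, -0.2, 0.1
for p in (2, 1, 0, -1):
    print(p, F(p, WX, IX, WY, IY))
```

For $p \geq -1$ no coefficient below $A_{-2}$ enters the sum, so the four values can be compared directly with the formulas above.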
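Lemma 7. The non-local term Eq. (100) is identically zero.

Proof. The non-local term reads
\[
NL_b = -2 \int_0^1 dx \int_0^1 dy\, \big( X(x) Y(y) - X(y) Y(x) \big)\, \epsilon(x - y) \sum_{n=0}^{\infty} \frac{2^n}{n!} \big( F_{-2-n}(x, y) + p_0^2 F_{-n}(x, y) \big) (x - y)^n .
\]
The first sum is just the coefficient of $(iZ)^{-2}$ in the formal expansion of $\partial \widehat{w}^2(x, Z)\, \partial \widehat{w}^{*2}(y, Z)$ while the second sum is the coefficient of $(iZ)^0$. Hence we have
\[
NL_b = 2 \int_0^1 dx \int_0^1 dy\, \big( X(x) Y(y) - X(y) Y(x) \big)\, \epsilon(x - y) \oint_{C_\infty} \frac{dZ}{2i\pi} (Z - p_0^2 Z^{-1})\, \partial \widehat{w}^2(x, Z)\, \partial \widehat{w}^{*2}(y, Z).
\]
The above expression is zero in the following sense. Let us write
\[
\oint_{C_\infty} \frac{dZ}{2i\pi} (Z - p_0^2 Z^{-1})\, \partial \widehat{w}^2(x, Z)\, \partial \widehat{w}^{*2}(y, Z)
= \sum_{i=0}^{\infty} \frac{(y - x)^i}{i!} \oint_{C_\infty} \frac{dZ}{2i\pi} (Z - p_0^2 Z^{-1})\, \partial \widehat{w}^2(x, Z)\, \partial^{i+1} \widehat{w}^{*2}(x, Z).
\]
We will show that all the integrals around $C_\infty$ on the right-hand side are identically zero. Since the function $\widehat{w}(x, Z)$ satisfies the Schroedinger equation, its square $\widehat{w}^2(x, Z)$ satisfies a third order differential equation
\[
D \widehat{w}^2(x, Z) = -4 Z^2 \partial \widehat{w}^2(x, Z), \qquad D = \partial^3 + 8\omega' \partial + 4\omega''.
\]
Let us introduce a pseudo-differential operator $\Phi$ such that $\widehat{w}^2(x, Z) = \Phi e^{2iZx}$. Then $D = \partial \Phi \partial^2 \Phi^{-1}$. Since $D$ is anti self-adjoint, we also have $D = -D^* = \Phi^{*-1} \partial^2 \Phi^* \partial$. It follows that $\widehat{w}^{*2}(x, Z)$, which is a solution of $(-D^*) \widehat{w}^{*2}(x, Z) = -4 Z^2 \partial \widehat{w}^{*2}(x, Z)$, can be written as
\[
\widehat{w}^{*2}(x, Z) = \partial^{-1} \Phi^{*-1} \partial e^{-2iZx}.
\]
Hence
\[
\partial \widehat{w}^2 = \partial \Phi e^{2iZx}, \qquad \partial^{i+1} \widehat{w}^{*2} = \partial^i \Phi^{*-1} \partial e^{-2iZx}.
\]
Finally, we have to compute
\[
\begin{aligned}
\oint_{C_\infty} \frac{dZ}{2i\pi} (Z - p_0^2 Z^{-1}) \big( \partial \widehat{w}^2(x, Z) \big)\, \partial^{i+1} \widehat{w}^{*2}(x, Z)
&= \oint_{C_\infty} \frac{dZ}{2i\pi} (Z - p_0^2 Z^{-1}) \big( \partial \Phi e^{2iZx} \big) \times \partial^i \Phi^{*-1} \partial e^{-2iZx} \\
&= \frac{-1}{2i} \oint_{C_\infty} \frac{dZ}{2i\pi} \big( \partial \Phi e^{2iZx} \big) \times \partial^i \Phi^{*-1} (\partial^2 + 4 p_0^2) e^{-2iZx}.
\end{aligned}
\]
We recall the formula (see e.g. [15])
\[
\oint_{C_\infty} \frac{dZ}{2i\pi} (D e^{iZx}) (F e^{-iZx}) = \operatorname{Res}_\partial (D F^*),
\]
where $\operatorname{Res}_\partial$ is Adler's residue [25]. So our expression is equal to
\[
\operatorname{Res}_\partial \big( \partial \Phi (\partial^2 + 4 p_0^2) \Phi^{-1} \partial^i \big) = \operatorname{Res}_\partial \big( (D + 4 p_0^2 \partial) \partial^i \big) = 0,
\]
because $(D + 4 p_0^2 \partial) \partial^i$ is a differential operator.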
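The step "its square satisfies a third order differential equation" in the proof above can be illustrated on a simple explicit example. If $w'' + q\,w = 0$, then $y = w^2$ satisfies $y''' + 4 q\, y' + 2 q'\, y = 0$; with $q = Z^2 + 2\omega'$ this is the equation $D y = -4Z^2 \partial y$ with $D = \partial^3 + 8\omega'\partial + 4\omega''$. The identification of the potential as $2\omega'$ is inferred here from the form of $D$ and is an assumption of this sketch. The test function $w = e^{-x^2/2}$ with $q = 1 - x^2$ is a toy choice unrelated to the Volterra model, and the derivatives are taken by finite differences.

```python
import math

def y(x):
    """Square of w(x) = exp(-x^2/2), which solves w'' + q w = 0 with q = 1 - x^2."""
    return math.exp(-x * x)

def q(x):
    return 1.0 - x * x

def dq(x):
    return -2.0 * x

def d1(f, x, h=1e-3):
    """Central finite-difference approximation of the first derivative."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def d3(f, x, h=1e-3):
    """Central finite-difference approximation of the third derivative."""
    return (f(x + 2 * h) - 2.0 * f(x + h) + 2.0 * f(x - h) - f(x - 2 * h)) / (2.0 * h ** 3)

def residual(x):
    """y''' + 4 q y' + 2 q' y, which should vanish up to discretisation error."""
    return d3(y, x) + 4.0 * q(x) * d1(y, x) + 2.0 * dq(x) * y(x)

for x in (-0.8, 0.3, 1.1):
    print(x, residual(x))
```

The printed residuals should be small compared with the individual terms, which are of order one at these points.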
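Consider next the term proportional to $a(Z)$ in Eq. (98),
\[
4 \int_0^1 dx \int_0^1 dy\, \big( X(x) Y(y) - X(y) Y(x) \big) \oint_{C_\infty} \frac{dZ}{2i\pi}\, a(Z) \Big( 1 - \frac{p_0^2}{Z^2} \Big) \partial_x \widehat{w}^2(x, Z)\, \partial_y \big( \widehat{w}(y, Z)\, \widehat{w}^*(y, Z) \big).
\]

Lemma 8. Let us define
\[
J_p(x) = \oint_{C_\infty} \frac{dZ}{2i\pi}\, a(Z) (iZ)^p e^{2iZx}, \qquad p = -1, -2, \ldots .
\]
One has
\[
J_{-1}(x) = \frac{1}{2} \delta(x), \qquad J_{-p}(x) = 2^{p-2} \frac{x^{p-2}}{(p-2)!} \big( \theta(x) - 1 \big), \qquad p \geq 2,
\]
where $\theta(x)$ denotes the step function.

Proof. One has $\partial_x J_p(x) = 2 J_{p+1}(x)$. The calculation of $J_{-1}(x)$ is easy. Next, we need boundary conditions to determine the other $J_p$ by integration:
\[
\begin{aligned}
\sum_{n=2}^{\infty} (i\zeta)^n J_{-n}(0)
&= \frac{1}{2i} \oint_{C_\infty} \frac{dZ}{2i\pi} \frac{e^{-iZ}}{\sin Z} \frac{\zeta^2}{Z - \zeta}
= \frac{\zeta^2}{2i} \Big[ -\frac{e^{-i\zeta}}{\sin \zeta} + \sum_{n \in \mathbb{Z}} \frac{1}{\zeta - n\pi} \Big] \\
&= \frac{\zeta^2}{2i} \Big[ -\frac{e^{-i\zeta}}{\sin \zeta} + \cot \zeta \Big] = \frac{\zeta^2}{2}.
\end{aligned}
\]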
It follows that all the non-local terms containing $J_p(x)$ for $p \leq -2$ vanish when $0 < x < 1$. The $a(Z)$ term is
\[
L_a = 4 \int_0^1 dx \int_0^1 dy\, \big( X(x) Y(y) - X(y) Y(x) \big) A_1(x)\, C_{-2}(y)\, J_{-1}(x)
\]
or
\[
L_a = 4 \int_0^1 dx\, X(x) \delta(x) \int_0^1 dy\, Y(y)\, \omega''(y) - 4 \int_0^1 dx\, Y(x) \delta(x) \int_0^1 dy\, X(y)\, \omega''(y).
\]
Finally, the term in $c(Z)$ is just equal to the $a(Z)$ one and doubles it. Putting everything together, we arrive at Eq. (93). In the course of this proof, we have shown the identities
\[
\oint_{C_\infty} \frac{dZ}{2i\pi} (Z - p_0^2 Z^{-1}) \big( \partial \widehat{w}^2(x, Z) \big)\, \partial^{i+1} \widehat{w}^{*2}(x, Z) = 0, \qquad \forall i \geq 0.
\]
These are quartic identities on the coefficients of $\widehat{w}(x, Z)$. Of course, we also have the quadratic relations of Hirota and Sato that were interpreted by Sato as Plücker relations defining the infinite Grassmannian, which allows a precise definition of τ-functions. The above relations are quartic relations on τ-functions analogous to the quartic relations on Riemann Theta functions.

12. Poisson Bracket {L 0 , u(x)}

In the previous section, we have obtained the Poisson bracket for the generators $T_n = L_n - L_0$. We now have to reintroduce $L_0$ and check that it has the correct Poisson brackets. The candidate for $L_0$ was given in Eq. (66). Let us recall it: L 0 = p02 +
k 1 , 2 η( p0 ) p0 − k 2 π 2 k
where k = E k (0)k (0) − E k (1)k (1)) and η(Z ) = 1 −
m
E (0)m (0) ηm m = 1 − , 2 − m2π 2 Z 2 − m2π 2 Z m
where we have used Eq. (77), evaluated at $x = 0$, to express $\eta_m$.

Proposition 20. We have the following Poisson bracket:
\[
\Big\{ L_0, \int_0^1 dy\, Y(y) u(y) \Big\} = -4 \int_0^1 dy\, Y(y) \big( (L_0 - p_0^2)\, \delta'(y) + T'(y) \big). \tag{101}
\]
This shows that $\{L_0, \cdot\}$ acts on $u(y)$ as $\partial_y$, as it should.
Again, the proof is long and we will split it into several lemmas. We need to compute {m (x), T (y)}. and {m (x), T (y)} for x = 0 and x = 1. Multiplying Eq. (94) by E n (y) and remembering that E|−1 = F| and k (x), we get Fk (x) = − γkζµk k F {m (x), E(y)(y)} = −2m (x)
ζk2 (1 − p02 Z k−2 )
sin Z k Zk
k
µ2k γk3 (Z k2 − π 2 m 2 )
k (x) F k (x) F k (y)G k (y) − F k2 (y) F k (x)G k (x) × F +2m (x)
ζk2 (1 − p02 Z k−2 )
sin Z k
µ2k γk3 (Z k2 − π 2 m 2 )
Zk
k
k2 (x) F k (y)G k (y) − F k2 (y) F k (x)G k (x) . × F
(102)
As before, this can be expressed in terms of 1 and 2 . We find {m (x), E(y)(y)} = − Res±Z k k
(1 − p02 Z −2 ) 1 (m (x)∂x 2−m (x)2 ). η2 (Z )(Z 2 − π 2 m 2 )
This formula is the starting point to begin the computation of {η( p0 ), E(y)(y)}. Setting x = 0 in Eq. (102), the term proportional to m (x) vanishes because Fk (0) = 0. We are left with (x = 0, but we keep it for a while) {m (x), E(y)(y)} = −m (x)
Res±Z k
k
Multiplying by −
2 m ( p0
(0) Em 2 ( p0 −m 2 π 2 )
(1 − p02 Z −2 ) 1 ∂x 2 . (103) η2 (Z )(Z 2 − π 2 m 2 )
and summing over m, in the right hand side appears the sum
E m (0)m (0) ηm =− 2 2 2 2 2 2 2 2 2 2 2 − m π )(Z k − π m ) m ( p0 − π m )(Z k − π m ) ηm ηm −1 − 2 = 2 Z k − p02 m p02 − m 2 π 2 Zk − π 2m2 =
where we used that
&
ηm m Z 2 −π 2 m 2 k
1 η( p0 ), Z k2 − p02
= 1, ∀k. The factor 1/(Z k2 − p02 ) cancels with the
factor (1 − p02 Z k−2 ) in Eq. (103) and we are left with {η( p0 ), E(y)} = −η( p0 )
Res±Z k
k
Finally η( p0 ),
1 0
dyY (y)T (y) = 2η( p0 ) 0
1
1 1 ∂x 2 . η2 (Z )Z 2
dyY (y) C∞
dZ 1 2 ), (104) 1 ∂x ∂ y ( 2iπ Z 2
where we used the asymptotic expressions for $\widehat{w}$ and $\widehat{w}^*$ inside $\Omega_1$ and $\Omega_2$, hence removing the $\eta^2(Z)$ factor. The last step consists in evaluating the integral around $C_\infty$. The result is:

Lemma 9.
\[
\Big\{ \eta(p_0), \int_0^1 dy\, Y(y) T(y) \Big\} = 4 \eta(p_0) \int_0^1 dy\, Y(y)\, \delta'(y). \tag{105}
\]
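Proof. Let us consider the integral over $C_\infty$ in Eq. (104). The term containing $b(Z)$ can be written as
\[
\begin{aligned}
L_b = \oint_{C_\infty} \frac{dZ}{2i\pi} \frac{b(Z)}{Z^2} \Big[ & \big( \widehat{w}(x, Z)\, \widehat{w}^{*\prime}(x, Z) - \widehat{w}'(x, Z)\, \widehat{w}^*(x, Z) \big)\, \partial_y \big( \widehat{w}(y, Z)\, \widehat{w}^*(y, Z) \big) \\
& + \widehat{w}(x, Z)\, \widehat{w}'(x, Z)\, \partial_y \widehat{w}^{*2}(y, Z) - \widehat{w}^*(x, Z)\, \widehat{w}^{*\prime}(x, Z)\, \partial_y \widehat{w}^2(y, Z) \Big].
\end{aligned}
\]
On the first line, we recognize the Wronskian of $\widehat{w}(x, Z)$ and $\widehat{w}^*(x, Z)$, which is just equal to $-2iZ$. Since $\partial_y ( \widehat{w}(y, Z)\, \widehat{w}^*(y, Z) ) = Z^{-2} S_2(y) + \cdots$ this term vanishes by Eq. (99). Hence
\[
L_b = \frac{1}{2} \oint_{C_\infty} \frac{dZ}{2i\pi} \frac{b(Z)}{Z^2} \Big( \partial_x \widehat{w}^2(x, Z)\, \partial_y \widehat{w}^{*2}(y, Z) - \partial_x \widehat{w}^{*2}(x, Z)\, \partial_y \widehat{w}^2(y, Z) \Big) \tag{106}
\]
\[
= -\sum_{p=-2}^{\infty} F_{-p}(x, y)\, I_{-p-2}(x - y)
\]
\[
= \delta'(y - x) + \frac{1}{2}\, \epsilon(x - y) \oint_{C_\infty} \frac{dZ}{2i\pi} \frac{1}{Z}\, \partial_x \widehat{w}^2(x, Z)\, \partial_y \widehat{w}^{*2}(y, Z). \tag{107}
\]
The $\epsilon(x - y)$ term is zero because, with $\Phi$ the pseudo-differential operator of the proof of Lemma 7 ($\widehat{w}^2 = \Phi e^{2iZx}$),
\[
\begin{aligned}
\oint_{C_\infty} \frac{dZ}{2i\pi} \frac{1}{Z}\, \partial^{i+1} \widehat{w}^2(x, Z)\, \partial \widehat{w}^{*2}(x, Z)
&= \oint_{C_\infty} \frac{dZ}{2i\pi} \frac{1}{Z} \big( \partial^{i+1} \Phi e^{2iZx} \big) \big( \Phi^{*-1} \partial e^{-2iZx} \big) \\
&= -2i \oint_{C_\infty} \frac{dZ}{2i\pi} \big( \partial^{i+1} \Phi e^{2iZx} \big) \big( \Phi^{*-1} e^{-2iZx} \big)
= -2i \operatorname{Res}_\partial \big( \partial^{i+1} \Phi \cdot \Phi^{-1} \big) = 0.
\end{aligned}
\]
Next, we consider the $a(Z)$ term. It reads
\[
\begin{aligned}
L_a &= \oint_{C_\infty} \frac{dZ}{2i\pi} \frac{a(Z)}{Z^2}\, \partial_y \Big[ \big( \widehat{w}(x, Z)\, \widehat{w}^*(y, Z) - \widehat{w}^*(x, Z)\, \widehat{w}(y, Z) \big)\, \widehat{w}'(x, Z)\, \widehat{w}(y, Z) \Big] \\
&= \oint_{C_\infty} \frac{dZ}{2i\pi} \frac{a(Z)}{Z^2} \Big[ \frac{1}{2} \partial_x \widehat{w}^2(x, Z)\, \partial_y \big( \widehat{w}(y, Z)\, \widehat{w}^*(y, Z) \big)
 - \frac{1}{2} \big( \widehat{w}^{*\prime}(x, Z)\, \widehat{w}(x, Z) + \widehat{w}^*(x, Z)\, \widehat{w}'(x, Z) \big)\, \partial_y \widehat{w}^2(y, Z) \\
&\qquad\qquad - \frac{1}{2} \big( \widehat{w}^*(x, Z)\, \widehat{w}'(x, Z) - \widehat{w}^{*\prime}(x, Z)\, \widehat{w}(x, Z) \big)\, \partial_y \widehat{w}^2(y, Z) \Big].
\end{aligned}
\]
Again, the last term is the Wronskian and so
\[
L_a = \oint_{C_\infty} \frac{dZ}{2i\pi} \frac{a(Z)}{Z^2} \Big[ -iZ\, \partial_y \big( \widehat{w}^2(y, Z) \big) + \frac{1}{2} \partial_x \widehat{w}^2(x, Z)\, \partial_y \big( \widehat{w}(y, Z)\, \widehat{w}^*(y, Z) \big) - \frac{1}{2} \partial_x \big( \widehat{w}^*(x, Z)\, \widehat{w}(x, Z) \big)\, \partial_y \big( \widehat{w}^2(y, Z) \big) \Big].
\]
The first term is
\[
\sum_{p=-1}^{\infty} \oint_{C_\infty} \frac{dZ}{2i\pi}\, a(Z) (iZ)^{-p-1} A_{-p}(y) e^{2iZy}
= \sum_{p=-1}^{\infty} A_{-p}(y)\, J_{-p-1}(y)
= \frac{1}{2} \delta'(y) - 2\omega(y) \delta(y) + \big( \theta(y) - 1 \big) \sum_{p=1}^{\infty} A_{-p}(y) \frac{(2y)^{p-1}}{(p-1)!} .
\]
The last sum is zero because $0 < y < 1$. The $\delta(y)$ term vanishes because $\omega(0) = 0$. The second term is
\[
\frac{1}{2} \oint_{C_\infty} \frac{dZ}{2i\pi} \frac{a(Z)}{Z^2} \sum_{p, q} A_{-p}(x)\, C_{-q}(y) (iZ)^{-p-q} e^{2iZx}
= -\frac{1}{2} \big( \theta(x) - 1 \big) A_1(x)\, C_{-2}(y)\, (2x) + \cdots . \tag{108}
\]
This vanishes when $x = 0$. The third term is
\[
-\frac{1}{2} \oint_{C_\infty} \frac{dZ}{2i\pi} \frac{a(Z)}{Z^2} \sum_{p, q} A_{-p}(y)\, C_{-q}(x) (iZ)^{-p-q} e^{2iZy}
= \frac{1}{2} \big( \theta(y) - 1 \big) A_1(y)\, C_{-2}(x)\, (2y) + \cdots
\]
and this vanishes when $0 < y < 1$. Finally, it is easy to see that the $c(Z)$ term is equal to the $a(Z)$ one. Putting everything together, we get Eq. (105).

The last result we need is the following: Lemma 10. 1 E (0) (0) − E (1) (1) 1 m m m m , dyY (y)T (y) = −4η( p ) dyY (y)T (y). 0 2 − π 2m2 p 0 0 0 m (109) Proof. Taking the derivative with respect to x of Eq. (102) and remembering that Fk (0) = Fk (1) = 0, the remaining terms are (there is a cancellation in the m (x) term) {m (x), E(y)(y)} = −2m (x)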
sin Z k k
Zk
ζk2 (1 − p02 Z k−2 ) µ2k γk3 (Z k2 − π 2 m 2 )
k (x) F k (y)G k (y) − F k2 (y) F k (x)G k (x) , k (x) F × ∂x F
where it is understood that x = 0 or x = 1. By exactly the same argument as before 1 E (x) (x) 1 m m , dyY (y)T (y) = 2η( p ) dyY (y)∂ y ∂x 0 p02 − π 2 m 2 0 0 m 1 dZ 1 ∂x 2 . × 2 C∞ 2iπ Z η(Z ) Hence, we just have to take the derivative with respect to x of the previous result, before setting x = 0 or x = 1. At x = 0 we get 1 2η( p0 ) dyY (y) − δ (y) + 4ω (y) . 0
The $\delta''(y)$ term comes from Eq. (107) while the second term comes from Eq. (108), doubled by the $c(Z)$ contribution. At $x = 1$ only the periodic $\delta''(y)$ remains. Taking the difference we obtain Eq. (109). Putting everything together, we arrive at Eq. (101), and this finishes our proof that $u(x)$ does satisfy the Virasoro Poisson bracket.

13. Conclusion

We have succeeded in taking the continuum limit in the formulae expressing the dynamical variables of the Volterra model in terms of the separated variables. This yields exactly solvable potentials and formulae for the Virasoro generators of a rather unusual type. Still, we were able to check that they have the correct Poisson brackets. Of course the most interesting thing now is to try to quantize this approach. As a first step, a semiclassical analysis along the lines of [26] should be very enlightening. The full quantum theory, however, may hold some surprises. The bracket Eq. (3) being in fact an ordinary quadratic bracket, it is natural to quantize it with Weyl type commutation relations. This opens up the possibility of phenomena such as those advocated in [9, 27].

Acknowledgements. This work was supported in part by the European Network ENIGMA, Contract number: MRTN-CT-2004-5652.
References
1. Gervais, J.L.: Transport matrices associated with the Virasoro algebra. Phys. Lett. B160, 279 (1985)
2. Bazhanov, V., Lukyanov, S., Zamolodchikov, A.: Integrable Structure of Conformal Field Theory, Quantum KdV Theory and Thermodynamic Bethe Ansatz. Commun. Math. Phys. 177, 381–398 (1996); Integrable Structure of Conformal Field Theory II. Q-operator and DDV equation. Commun. Math. Phys. 190, 247–278 (1997); Integrable Structure of Conformal Field Theory III. The Yang-Baxter Relation. Commun. Math. Phys. 200, 297–324 (1999)
3. Sklyanin, E.K.: The quantum Toda chain. Lect. Notes in Phys. 226, Berlin-Heidelberg-New York: Springer, 1985, pp. 196–233; Separation of variables. Prog. Theor. Phys. (suppl.) 185, 35 (1995)
4. Kac, M., van Moerbeke, P.: Some probabilistic aspect of scattering theory. Proceedings of the Conference on functional integration and its applications (Cumberland Lodge London, 1974), Oxford: Clarendon Press, 1975, pp. 87–96; On some periodic Toda lattices. Proc. Nat. Acad. Sci. USA 72 (4), 1627–1629 (1975); A complete solution of the periodic Toda problem. Proc. Nat. Acad. Sci. USA 72 (8), 2879–2880 (1975)
5. van Moerbeke, P.: The spectrum of Jacobi matrices. Invent. Math. 37, 45–81 (1976)
6. Dubrovin, B.A., Krichever, I.M., Novikov, S.P.: Integrable Systems I. Encyclopedia of Mathematical Sciences, Dynamical systems IV. Berlin-Heidelberg-New York: Springer, 1990, pp. 173–281
7. Faddeev, L.D., Takhtajan, L.: Liouville model on the lattice. Springer Lectures Notes in Physics 246, Berlin-Heidelberg-New York: Springer, 1986, p. 66
8. Volkov, A.: A Hamiltonian interpretation of the Volterra model. Zapiski Nauch. Semin. LOMI 150, 17 (1986); Liouville theory and sh-Gordon model on the lattice. Zapiski Nauch. Semin. LOMI 151, 24 (1987); Miura transformation on the lattice. Theor. Math. Phys. 74, 96 (1988)
9. Babelon, O.: Exchange formula and lattice deformation of the Virasoro algebra. Physics Letters 238B, 234 (1990)
10. Volkov, A.Yu.: Quantum Volterra Model. Phys. Lett. A167, 345 (1992); Noncommutative Hypergeometry. http://arxiv.org/list/math.QA/0312084, 2003
11. Faddeev, L.D., Volkov, A.Yu.: Abelian current algebra and the Virasoro algebra on the lattice. Phys. Lett. B315, 311–318 (1993)
12. Faddeev, L., Volkov, A.Yu.: Shift Operator for Nonabelian Lattice Current Algebra. Publ. Res. Inst. Math. Sci. Kyoto 40, 1113–1125 (2004)
13. Faddeev, L.D., Kashaev, R.M., Volkov, A.Yu.: Strongly coupled quantum discrete Liouville theory. I: Algebraic approach and duality. Commun. Math. Phys. 219, 199–219 (2001)
14. Gervais, J.L., Neveu, A.: Novel triangle relation and absence of tachyon in Liouville string field theory. Nucl. Phys. B238, 125 (1984); Oscillator representations of the two-dimensional conformal algebra. Commun. Math. Phys. 100, 15 (1985)
15. Babelon, O., Bernard, D., Talon, M.: Introduction to Classical Integrable Systems. Cambridge: Cambridge University Press, 2003
16. Atkinson, F.V.: Multiparameter spectral theory. Bull. Amer. Math. Soc. 74, 1–27 (1968); Multiparameter Eigenvalue Problems. New York: Academic, 1972
17. Enriquez, B., Rubtsov, V.: Commuting families in skew fields and quantization of Beauville’s fibration. http://arxiv.org/list/math.AG/0112276, 2001
18. Babelon, O., Talon, M.: Riemann surfaces, separation of variables and classical and quantum integrability. Phys. Lett. A 312, 71–77 (2003)
19. Babelon, O.: On the Quantum Inverse Problem for the Closed Toda Chain. J. Phys. A: Math. Gen. 37, 303–316 (2004)
20. Novikov, S.P.: The periodic problem for the Korteweg-de Vries equation. Funkt. Anal. i Ego Pril. 8, 54–66 (1974); Translation in Funct. Anal. Jan. 1975, pp. 236–246
21. Dubrovin, B., Novikov, S.P.: Periodic and conditionally periodic analogues of multisoliton solutions of the Korteweg-de Vries equation. Dokl. Akad. Nauk. USSR 6, 2131–2144 (1974)
22. Its, A., Matveev, V.: On Hill operators with finitely many lacunae. Funkt. Anal. i ego Pril. 9 (1975)
23. McKean, H.P., van Moerbeke, P.: The Spectrum of Hill’s Equation. Invent. Math. 30, 217–274 (1975)
24. Matveev, V.B., Salle, M.A.: Darboux Transformations and Solitons. Berlin-Heidelberg-New York: Springer-Verlag, 1990
25. Adler, M.: On a trace functional for formal pseudo-differential operators and the symplectic structure of the Korteweg-de-Vries equations. Invent. Math. 50, 219 (1979)
26. Smirnov, F.A.: Quasi-classical Study of Form Factors in Finite Volume. http://arxiv.org/list/hep-th/9802132, 1998; Dual Baxter equations and quantization of Affine Jacobian. http://arxiv.org/list/math-ph/0001032, 2000
27. Faddeev, L.D.: Discrete Heisenberg-Weyl Group and Modular Group. Lett. Math. Phys. 34, 249–254 (1995); Modular Double of Quantum Group. http://arxiv.org/list/math.QA/9912078, 1999

Communicated by L. Takhtajan